PacBio Blog

Wednesday, October 29, 2014

‘Revolutionizing HLA Typing’: Uppsala’s Ulf Gyllensten on How Long Reads Give Access to New Areas of the Human Genome

In a recent interview with Theral Timpson — part of Mendelspod’s series on long-read sequencing — Ulf Gyllensten, a scientist at Uppsala University, spoke about using PacBio® technology for HLA typing, human genome studies, transcriptomics, and more.

Based in the medical genetics and genomics department, Gyllensten focuses on two areas: using systems biology to study biological variation in human physiology and studying the epidemiology of human papilloma virus and its genetic link to cervical cancer. He also works with the National Genomics Infrastructure, a national core facility in Sweden for genotyping and DNA sequencing, where he has access to all commercially available sequencing platforms.

In the podcast, Gyllensten spoke about advances in screening for HPV, his predictions for the widespread use of genome sequencing in the clinic, and applications using Single Molecule, Real-Time (SMRT®) Sequencing for human genome studies.

Unambiguous HLA typing 

“PacBio is really revolutionizing HLA typing,” Gyllensten said, noting that long-read sequencing addresses the ongoing challenge of linking polymorphisms in distant parts of the HLA genes and distinguishing alleles. “I have been in that field for quite a while. … Finally, we have a technology that will resolve all the ambiguities in HLA typing, which will have a huge impact.”

Gyllensten said the major advantage of SMRT Sequencing for the HLA region is its ability to completely sequence all HLA genes (both class 1 and class 2), getting all the introns and exons for each in a single long read. He believes PacBio sequencing, with its rapid turnaround time, will ultimately become “the key technology” for matching donors and recipients in organ transplantation.

Asked by Timpson whether it’s really possible to achieve 100 percent accuracy for these complicated regions using SMRT Sequencing, Gyllensten replied that it was. “The fact that you can sequence a single allele — that is, a single chromosome by itself and then the other chromosome in the individual — and separate them down to the single base is really the most accurate way you can ever do HLA typing,” he said.

Applications in human genomics

Gyllensten told Timpson that his team expected the primary use of PacBio sequencing to be for smaller genomes, such as getting complete de novo assemblies for pathogens. While they do routinely handle those projects, he was surprised to find robust demand for using the sequencer to analyze larger genomes — including human — as well. “Before having the PacBio instrument installed and running we hadn’t thought about some of these things,” he said. “But it’s opening a lot of opportunities.”

He noted that clinical research, in particular, is a good fit for SMRT Sequencing. “Focusing in on particular regions actually suits clinical genetics and clinical immunology because they don’t want the whole genome. They have their favorite genes, favorite targets,” Gyllensten said. “Those can then be accessed through the PacBio [system], and the information that is coming out is really information that could not come out of any other sequencing technology at this point.”

Researching treatment resistance and cancer biology in individuals with leukemia is one example of where the PacBio platform can make a difference. SMRT Sequencing can more accurately cover the fusion gene that is responsible for the nature of the leukemia and its development, Gyllensten said. In addition, he believes PacBio’s technology offers the potential for early detection of new mutations linked to treatment resistance. Reliable early detection could one day make a difference in clinicians’ ability to change a patient’s therapy at the earliest sign of resistance, he noted.

“It all has to do with the long read because you need to sequence maybe 2 or 3 kb around the particular breakpoint in this patient to figure out whether they have a resistance mutation or not,” Gyllensten said, “and there is no other technology that can do that.”

A view into genomic dark matter

Gyllensten told Timpson that as people begin to figure out how much important information is being missed in genome sequences, they will move to a platform that offers more complete views of biology.

Transcriptomics is one place where SMRT Sequencing makes a real difference. “Very few studies have been done on complete transcriptome data,” Gyllensten said. “I think when people start to see that, they will eventually move into … long-read [sequencing].”

A comprehensive view of the human genome will also motivate people to move away from short-read sequencing. At some point, he said, scientists will look at all the short-read data that has been amassed for human genome studies and “realize that a lot of the questions will still not have been answered. They will ask, is the answer hidden in that 15-20 percent of the genome we still haven’t covered with the present technology?” Gyllensten said. “Then there will be a rush to understand the remaining [portion] of the genome.”

Validating other technologies

According to Gyllensten, whose core facility still runs a number of Sanger sequencers, PacBio sequencing has been gaining ground as the preferred technology for validating results found by short-read platforms. “We are seeing more and more requests to do it not with the Sanger, but with the PacBio [sequencer],” he said. “You need to validate [with] different technology and PacBio is really well suited for that.”

Tuesday, October 21, 2014

Data Release: Whole Human Transcriptome from Brain, Heart, and Liver

In higher eukaryotic organisms, like humans, RNA transcripts from the vast majority of genes are alternatively spliced. Alternative splicing dramatically increases the protein-coding potential of eukaryotic genomes and its regulation is often specific to a given tissue or developmental stage.

Using our updated Iso-Seq™ sample preparation protocol, we have generated a dataset containing the full-length whole transcriptome from three diverse human tissues (brain, heart, and liver). The updated version of the Iso-Seq method incorporates the use of a new PCR polymerase that improves the representation of larger transcripts, enabling sequencing of cDNAs of nearly 10 kb in length. The inclusion of multiple sample types makes this dataset ideal for exploring differential alternative splicing events. Download the polished, full-length transcript sequences and the raw data files.

Monday, October 20, 2014

SMRT Sequencing for the HLA Complex: PacBio Goes to ASHI

This week marks the 40th annual meeting of the American Society for Histocompatibility and Immunogenetics, better known in the community as ASHI. The PacBio® team is looking forward to attending; after all, several organizations are now using Single Molecule, Real-Time (SMRT®) Sequencing specifically for resolving the incredibly complex genetic regions related to histocompatibility.

Earlier this year, we announced that two leading HLA typing institutions had adopted SMRT Sequencing to untangle this highly polymorphic set of genes: Anthony Nolan, a UK-based blood cancer charity that started the world’s first bone marrow registry, and HistoGenetics, a pioneer that has used sequence-based typing to characterize HLA regions in more than 14 million samples. We’re pleased to report that scientists from both organizations will be giving presentations at our ASHI workshop, Advances in Fully Phased HLA & KIR Typing Using SMRT® Sequencing.

Wednesday, October 15, 2014

New Chemistry Boosts Average Read Length to 10 kb – 15 kb for PacBio® RS II

We are pleased to announce the launch of our new reagent kit, P6-C4, which represents the next generation of our polymerase as well as our chemistry. This kit replaces the P5-C3 chemistry and is recommended for all SMRT® Sequencing applications, including de novo assembly, targeted sequencing, isoform sequencing, minor variant detection, scaffolding, long-repeat spanning, SNP phasing, and structural variant analysis.

P6-C4 continues the steady read length improvement our users have seen since the instrument first launched. With this new chemistry, average read lengths increase to 10 kb - 15 kb, with half of all data in reads 14 kb or longer. The throughput is expected to be between 500 million and 1 billion bases per SMRT Cell, depending on the sample being sequenced. By providing more throughput per instrument run, the chemistry enables users to sequence larger genomes and observe previously undetected structural variants, highly repetitive regions, and distant genetic elements.
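The statement that half of all data sits in reads of a given length or longer describes what is commonly called the read length N50, which is a different statistic from the average read length. The sketch below shows one way to compute both from a list of read lengths. The function name `read_length_n50` and the toy lengths are made up for illustration; they are not PacBio data or part of any PacBio software.

```python
def read_length_n50(lengths):
    """Return the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("no read lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest read down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy read lengths in bases (hypothetical values, for illustration only).
toy_lengths = [2_000, 5_000, 8_000, 14_000, 16_000, 20_000]

mean_length = sum(toy_lengths) / len(toy_lengths)
n50 = read_length_n50(toy_lengths)

print(f"mean read length: {mean_length:.0f} bases")
print(f"read length N50:  {n50} bases")
```

Because long reads contribute more bases than short ones, the N50 is usually higher than the mean, which is why the two figures are reported separately.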

Friday, October 10, 2014

ASHG 2014: A New Look at the Human Genome with Long-Read Sequencing

Scientists around the world are getting ready for the annual meeting of the American Society of Human Genetics taking place October 18-22 at the San Diego Convention Center. We’re looking forward to a number of excellent presentations and posters, and are delighted to see that many of them will focus on applying Single Molecule, Real-Time (SMRT®) Sequencing to human studies.

If you’ll be among those attending ASHG, be sure to attend our workshop, A New Look at the Human Genome – Novel Insights with Long-Read PacBio Sequencing, taking place 12:30 – 2:00 p.m. on Tuesday, October 21. Register in advance to reserve your seat or to receive the recording following the event. Our CSO, Jonas Korlach, will host the workshop, which includes:

* Increased Complexity of the Human Genome Revealed by Single-Molecule Sequencing
Evan Eichler, University of Washington 

* Defining a Personal, Allele-Specific, and Single-Molecule Long-Read Transcriptome
Hagen Tilgner, Stanford University

* Long-Read Multiplexed Amplicon Sequencing: Applications for Epigenetics and Pharmacogenetics
Stuart Scott, Icahn School of Medicine at Mount Sinai

Thursday, October 9, 2014

New Brain Study Reveals Higher Molecular Diversity from Alternative Splicing

A new paper from scientists in Switzerland and the US adds to recent findings about diversity of neuronal transcripts in the mammalian brain. The authors report that this study was only possible using long reads from Single Molecule, Real-Time (SMRT®) Sequencing.

“Targeted Combinatorial Alternative Splicing Generates Brain Region-Specific Repertoires of Neurexins,” from lead author Dietmar Schreiner, senior author Peter Scheiffele, and collaborators, was published this month in the journal Neuron. The researchers are from the University of Basel, ETH Zurich, and North Carolina State University. This is the second study on neurexin mRNA diversity using PacBio® sequencing.

Monday, October 6, 2014

‘The Quality of PacBio Data Is Beyond Compare’: Eric Schadt on Applications of SMRT Sequencing to Human Genetics

As part of its continuing series on long-read sequencing, last week Mendelspod aired an engaging interview with Eric Schadt, Professor & Chair of Genetics and Genomic Sciences, and Director of the Icahn Institute for Genomics and Multiscale Biology at Mount Sinai.

Having now spent three years in his role at the groundbreaking institute, he reports that they are making great progress in the quest to build better data-driven health profiles around individuals that may better guide healthcare choices.

On short-read versus long-read sequencing
Short-read sequencing technologies still maintain the advantage in terms of throughput, says Schadt, but there are a variety of important genomic features that cannot be characterized without long-read sequencing, such as long tandem repeats, bigger structural variations, and focal variants important in cancer.

Thursday, October 2, 2014

‘We’re Going to Find the Keys’: Dan Geraghty Discusses an Approach to Understanding Causal Genetic Variation

Dan Geraghty, a researcher at Fred Hutchinson Cancer Research Center and CEO of Scisco Genetics, has spent much of his career focused on the genetics of immune response. Recently he talked to Mendelspod host Theral Timpson as part of a continuing series of podcasts on the rise of long-read sequencing.

Geraghty explained that despite decades’ worth of studies associating the major histocompatibility complex (MHC) and its highly polymorphic HLA class 1 and 2 genes with disease, we still haven’t found the key mutations for a variety of autoimmune diseases such as type 1 diabetes, rheumatoid arthritis, and multiple sclerosis.

Two factors make this difficult: the enormous amount of linkage disequilibrium in these regions and the challenge of getting information in phase, both of which call for larger stretches of sequence. Recently Geraghty has begun using Single Molecule, Real-Time (SMRT®) Technology with hopes of drilling down to the causal genetics.

Tuesday, September 30, 2014

New Papers Detail Complexity of Methylome-Related Virulence in Human Pathogens

In two new publications, one published today, scientists from Australia, Italy, the UK, and the US report critical and surprising new findings about DNA methylation-related complexity of bacteria. Adding to the list of advances from genome-wide epigenetic analysis, these projects enhance our understanding of how methylation systems work in human pathogens — and offer important clues for future investigations into how to treat them.

Today’s paper, “A random six-phase switch regulates pneumococcal virulence via global epigenetic changes,” was published in Nature Communications by scientists at the University of Leicester, University of Siena, University of Adelaide, and Griffith University. Senior authors Marco Oggioni and Michael Jennings and their collaborators studied Streptococcus pneumoniae, a bacterium responsible for serious infectious diseases including pneumonia, to figure out how the organism shifts between relatively benign and highly pathogenic phases.

Tuesday, September 23, 2014

Science Perspective: “Tracking Antibiotic Resistance”

The current issue of Science features an interesting Perspective by Scott Beatson and Mark Walker of the University of Queensland. It discusses research published this week in Science Translational Medicine by Conlan et al., who used SMRT® Sequencing to track plasmid diversity of hospital-associated infectious bacteria at the NIH Clinical Center.

The article provides a nice overview of the paper, including an explanation of the important role that plasmids play in spreading antibiotic resistance. The authors illustrate why short-read DNA sequencing technologies are insufficient for resolving plasmids and why long reads are necessary for this work.

“Plasmids may be viewed as the ‘dark matter’ of short-read bacterial genome assemblies, with many large-scale genomic studies conspicuously avoiding the complexities of plasmid structure. Genomic comparisons such as that described by Conlan et al. reveal how the dynamism in the structure and arrangement of resistance elements can only be realized by ‘closing’ plasmid genomes with long-read sequencing,” they write.