PacBio Blog

Wednesday, May 6, 2015

Tutorial on the Iso-Seq Method: Applications, Protocol, and Experimental Design

If you missed our recent webinar on isoform sequencing with the PacBio® platform, we’ve made the full recording available for on-demand access. “Iso-Seq™ Method: Sample Prep and Experimental Design for Full-Length cDNA Sequencing” offers an overview of the application, along with specific sample prep tips, factors to consider when designing an experiment, and suggestions about what kinds of projects can take advantage of this method.

Hosted by our own Tyson Clark, the webinar begins with a look at why it’s important to capture full-length transcripts. There are known human genes that have very different functions depending on which splice variant is expressed. With alternative splicing so critical to genome function — Clark noted one Drosophila gene that can make more than 38,000 isoforms — scientists who miss the full transcript aren’t seeing the full picture of gene activity. Single Molecule, Real-Time (SMRT®) Sequencing is the only technology that enables complete views of these isoforms, from poly-A tails to 5’ ends, without assembly.

The Iso-Seq method can be used for a number of projects. Transcript identification and annotation allows users to detect alternatively spliced isoforms. Targeted sequencing can be used to show, for example, which isoforms are enriched in which tissues. Another application is normalization; the approach can be used to reduce the representation of highly expressed genes to increase the diversity of transcripts observed per SMRT Cell.

Clark walks viewers through the steps of an Iso-Seq experiment: converting RNA to cDNA, amplification, size selection, cleanup, and sequencing. While it is possible to perform Iso-Seq analysis with as little as 5 ng of RNA input, 50 ng is the recommended minimum, and input can go as high as 1 µg. Clark recommends using RNA with a RIN score of at least 6.

He highlighted two size-selection instruments from Sage Science — the BluePippin™ and SageELF™ systems — that fit well in the PacBio workflow for isoform sequencing and help users generate reads from even the longest transcripts in their samples. Size selection allows for more even representation across cDNA of different size ranges, since smaller fragments may load preferentially on the sequencer. Clark suggests running a Bioanalyzer trace of your sample up front to determine the range of sizes if you don’t have a good sense of this already; this can inform whether to perform size selection or which sizing approach to use. (With sizing, the sample prep and sequencing process generally takes about three days.)

When designing your experiment, Clark recommends considering how many SMRT Cells you’ll need; assume that each will generate 20,000 to 25,000 full-length transcript sequences (slightly fewer for especially large cDNA fragments). For gene-specific isoform characterization or very targeted interrogations, a single SMRT Cell may suffice. For a comprehensive survey of full-length isoforms across a transcriptome with several size fractions, you might need 12 to 16 SMRT Cells.
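As a rough planning aid, the per-cell yield above can be turned into a quick estimate. The sketch below is only an illustration: the function name and the target read count are hypothetical, and the yield range of 20,000 to 25,000 full-length sequences per SMRT Cell is the webinar's rule of thumb, not a guarantee for any particular sample.

```python
def smrt_cells_needed(target_reads, low_yield=20_000, high_yield=25_000):
    """Estimate how many SMRT Cells are needed for a target number of
    full-length transcript sequences.

    Returns (optimistic, conservative): the cell count if every cell hits
    the high end of the assumed yield range, and the count if every cell
    only reaches the low end. Yields are assumptions taken from the
    webinar's rule of thumb.
    """
    if target_reads <= 0:
        raise ValueError("target_reads must be positive")
    if not 0 < low_yield <= high_yield:
        raise ValueError("expected 0 < low_yield <= high_yield")

    # Integer ceiling division avoids floating-point rounding surprises.
    optimistic = -(-target_reads // high_yield)
    conservative = -(-target_reads // low_yield)
    return optimistic, conservative


if __name__ == "__main__":
    # Hypothetical target for a multi-fraction transcriptome survey.
    best, worst = smrt_cells_needed(300_000)
    print(f"Plan for between {best} and {worst} SMRT Cells")
```

Because large cDNA fractions tend to yield slightly fewer sequences per cell, it is probably safer to budget toward the conservative figure when several size fractions are involved.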

For more information on the Iso-Seq method, please check out these resources:
Iso-Seq website
Iso-Seq analysis information
Available data sets (MCF-7 and human tissues)

Monday, May 4, 2015

PAG Grant Winner: Rainforest Tree Homalanthus nutans to get the SMRT Treatment

We’re pleased to announce the winner of our recent “Most Interesting Genome in the World” grant competition. Congratulations to Jay Keasling and Jeff Wong at the University of California, Berkeley! The grant program, which was supported by co-sponsors Sage Science, Computomics, and the Arizona Genomics Institute, was very competitive with more than 250 submitted proposals.

Keasling and Wong will be awarded SMRT® Sequencing — using up to 40 SMRT Cells with BluePippin™ DNA size selection — for Homalanthus nutans, a small rainforest tree that grows in Samoa. The plant is critical as the source of a natural product called prostratin, which is under development as an anti-HIV therapy. The genome size is estimated at 400 Mb.

Thursday, April 30, 2015

In Study, Continuous Long Reads Outperform Synthetic Long Reads for Resolving Tandem Repeats

Scientists from Argentina and Brazil published the results of a study comparing long-read approaches to characterize the genome structure of a highly complex region of the Y chromosome in Drosophila melanogaster. They found that Single Molecule, Real-Time (SMRT®) Sequencing outperformed synthetic long reads in accurately representing tandem repeats.

The study aimed to resolve the structure of the autosomal gene Mst77F, which had previously been found to have multiple tandem copies; the region, however, was known to be grossly misassembled in the reference. The scientists, from Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas and Universidade Federal do Rio de Janeiro, used Illumina TruSeq Synthetic Long-Reads technology with Celera Assembler as well as PacBio® long-read sequence data assembled with MHAP to interrogate the genomic region. Results were published in the journal G3: Genes, Genomes, Genetics in a paper entitled “Long-read single molecule sequencing to resolve tandem gene copies: The Mst77Y region on the Drosophila melanogaster Y chromosome.”

Monday, April 27, 2015

New Solutions for Comprehensive and Efficient Targeted Sequencing and Multiplexing of Samples

We are proud to announce the introduction of several new solutions for targeted sequencing and sample multiplexing on the PacBio® Sequencing System.

New Targeted Sequencing Workflow through Collaboration with Roche NimbleGen

Today we announced a new workflow that combines Roche NimbleGen’s SeqCap EZ enrichment technology with large DNA fragments (up to 6 kb) and our Single Molecule, Real-Time (SMRT®) Sequencing to provide a more comprehensive view of variants, transgene integration sites, and haplotype information over multi-kilobase contiguous regions. The laboratory workflow is described in a shared protocol. For each targeted region, SAMtools are used to phase and bin reads by haplotype, and then Quiver is applied to polish each haplotype to high consensus accuracy. This entire bioinformatics workflow is summarized on GitHub.

Friday, April 24, 2015

On DNA Day, Celebrating New Firsts

Hooray for DNA! We’re excited to celebrate this day as it honors two major accomplishments in the field: the 1953 publication of the structure of DNA, and the 2003 completion of the Human Genome Project.

With the amount of attention DNA has received in the past century, it is hard to believe that in some ways we are still just getting acquainted with the molecule. Here at PacBio, we are proud to be helping life sciences researchers achieve new firsts with DNA. Because it does not use amplification, our Single Molecule, Real-Time (SMRT®) Sequencing platform provides the purest view of individual DNA fragments.

Friday, April 17, 2015

AACR 2015: A Novel Look at Cancer, and a New SMRT Sequencing Grant Program

We’re looking forward to the annual meeting of the American Association for Cancer Research, which kicks off this weekend in Philadelphia. From directly phasing variants to sequencing full-length gene isoforms and other complex events, many scientists are already using SMRT® Sequencing to make exciting discoveries in cancer research. We hear from customers that the single-molecule approach opens the door for experiments they could not have done any other way.

If you’ll be at AACR, we encourage you to attend the talk from UCSF’s Catherine Smith on Monday at 10:40 a.m. in room 201. Her presentation, “Polyclonal and heterogeneous resistance to targeted therapy in leukemia,” will report on studies of patients with acute myeloid leukemia or chronic myeloid leukemia using the PacBio® system. The team focused on compound mutations, or multiple mutations, often distant from each other, on the same allele. These mutations appear to be indicative of resistance to tyrosine kinase inhibitors, something Smith and her colleagues have explored extensively since their finding that FLT3 is a valid therapeutic target for some patients with leukemia.

Wednesday, April 15, 2015

In Genome-wide Study, Long Reads Prove Critical for Structural Variant Discovery

In a paper just published in BMC Genomics, a team of scientists led by Baylor’s Human Genome Sequencing Center reports a thorough analysis of structural variation in a personal genome. What makes this study special is the large number of different technologies applied and the sheer volume of data gathered and analyzed for this single genome. The paper also includes the first known analysis of structural variation in a diploid human genome using SMRT® Sequencing, with 10x coverage from PacBio® long reads.

Lead authors Adam English and William Salerno and their collaborators at a number of institutions describe the results obtained from a structural variant calling pipeline they have developed called Parliament. (Check out the full paper: “Assessing structural variation in a personal genome—towards a human reference diploid genome.”)

Thursday, March 26, 2015

In Chronic Myeloid Leukemia Study, SMRT Sequencing Detects Resistance Mutations Early, New Splice Isoforms and More

Scientists from Uppsala University report in a recent paper that using the Iso-Seq™ method with SMRT® Sequencing allowed them to detect and monitor mutations in the BCR-ABL1 fusion gene for patients with chronic myeloid leukemia (CML). Screening mutations in this region is important for determining the point at which these patients become resistant to tyrosine kinase inhibitor (TKI) therapies, and is currently performed in the clinic using Sanger sequencing, quantitative RT-PCR, and other assays.

The paper, “Clonal distribution of BCR-ABL1 mutations and splice isoforms by single-molecule long-read RNA sequencing,” was published last month in BMC Cancer from lead author Lucia Cavelier and collaborators. In it, the scientists describe sequencing samples from six patients who experienced poor response to cancer treatment; samples were collected at diagnosis and at subsequent follow-up periods and sequenced on the PacBio® system.

Tuesday, March 3, 2015

AGBT Highlights, Day Three: Genomic Medicine, Population Specific Genomes, Goats & Influenza

Day 3 of the AGBT conference was packed with interesting talks, and we've covered a few highlights below. Admittedly, it took a little more caffeine than usual to power through the day.

In the clinical session, Euan Ashley from Stanford told attendees that genomic medicine is no longer something that we’re aiming for; it’s already here and being used routinely. He expressed concerns about accurate mapping of short-read sequence data for clinical utility, adding that the community needs to make progress in understanding complex genomic regions. Ashley noted that we still don’t have a gold-quality human genome with every single base known, and that achieving that remains an important goal for the field.

Friday, February 27, 2015

AGBT 2015: PacBio Workshop Review & Recording

Our AGBT workshop attracted more than 500 attendees thanks to the high-profile speakers who shared their perspectives on human genomic research. Because of the exclusivity of AGBT, we decided to live-stream our workshop to reach the broader scientific community. Thanks to the hundreds of people who tuned in to our live webcast from afar! Here are some highlights from the presentations; the recording of the workshop is at the bottom of this post.