PacBio Blog

Monday, August 18, 2014

Genome-Wide Methylation in Human Microbiome Samples

Scientists in Florida and Finland recently published a report of their work studying methylation patterns in two human microbiome samples. While microbiome studies have become quite popular, the authors note there have been no prior papers detailing genome-wide methylation of bacteria found in those studies. Their goal was to ascertain how much added functional variation might occur based on methylation patterns.

“The methylome of the gut microbiome: disparate Dam methylation patterns in intestinal Bacteroides dorei,” published in Frontiers in Microbiology, comes from lead author Michael Leonard and senior author Eric Triplett at the University of Florida, plus a team of collaborators from hospitals and universities across Finland.

The scientists used Single Molecule, Real-Time (SMRT®) Sequencing for its ability not just to sequence bacterial genomes to closure, but also to read methylation patterns across those genomes. They studied two stool samples from children at high risk for developing type 1 diabetes; both stool samples were dominated by Bacteroides dorei. In both strains, after sequencing to closure using the PacBio® sequencer, the team looked at GATC motifs for Dam methylation, which is believed to change gene expression in bacteria.
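For readers who want a feel for what scanning a genome for GATC motifs involves, here is a minimal Python sketch. It is an illustration only, not the authors' pipeline: the function name `find_gatc_sites` and the toy sequence are made up for this example, and the code only locates candidate sites in a sequence string. Deciding whether a site is actually methylated requires the kinetic data from SMRT Sequencing, which this sketch does not model.

```python
def find_gatc_sites(sequence, motif="GATC"):
    """Return the 0-based start positions of every motif occurrence.

    GATC is its own reverse complement, so each position found on the
    forward strand also marks a site on the reverse strand.
    """
    seq = sequence.upper()  # tolerate lowercase (soft-masked) bases
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Resume the search one base past the previous hit.
        start = seq.find(motif, start + 1)
    return positions


# Toy sequence (hypothetical), not taken from either B. dorei genome.
toy_sequence = "ttGATCaaGATCGATCgg"
sites = find_gatc_sites(toy_sequence)
print(sites)       # [2, 8, 12]
print(len(sites))  # 3 candidate Dam sites
```

On a closed bacterial genome, the same scan would return the full list of candidate Dam sites, which could then be compared against the positions the sequencer reports as modified.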

A marked difference between the genomes was discovered during methylation analysis: the first strain lacked Dam methylation entirely, while the second contained more than 20,000 methylated GATC sites. (Indeed, that strain only had three GATC sites that were not found to be methylated.) Scientists determined that the first genome lacked the DamMT gene, though both strains had other methylation patterns. “Another interesting observation is that of all of the methylation motifs observed in these two genomes, none is methylated in both genomes,” the authors report. “This suggests that the primary source of methyltransferases in these genomes is through lateral transfer, often from phage.”

Based on these remarkable differences, the scientists conclude that DNA sequence alone is not enough to understand the function of bacterial strains in a microbiome sample. “This work suggests that future microbiome studies should consider the methylome when describing the bacterial diversity in the gut,” the authors write. “Such analyses are no longer difficult given the latest sequencing technologies.”

Wednesday, August 6, 2014

Plant and Animal Genomes: New Web Resource Available

After so many compelling customer projects for microbial genomes, it’s been rewarding to see more scientists turning to Single Molecule, Real-Time (SMRT®) Sequencing for larger genomes, such as those of plants and animals. Many PacBio users are performing de novo sequencing and assembly or upgrading draft genomes initially generated by short-read technologies. Extraordinarily long reads and throughput improvements have allowed scientists to affordably assemble and close genomes such as those of the Atlantic cod, spinach, and Orpinomyces, an anaerobic fungus found in the rumen of cows, to name a few.

As reported by several customers at the 2014 Plant & Animal Genome conference in San Diego, new features of SMRT Sequencing, including the ability to identify full-length isoforms and automate haplotyping, are making it possible for researchers to generate high-quality, contiguous assemblies with improved genome annotations. A more holistic view offers scientists better insights into individual gene functions and their coordination within networks.


Tuesday, July 29, 2014

Novel Study of Genome-wide PT Modifications in Bacteria Performed with SMRT Sequencing

A recent paper from scientists in China and the United States demonstrates a novel view of phosphorothioate (PT) DNA modifications in two bacterial genomes. Scientists from Shanghai Jiao Tong University, Massachusetts Institute of Technology, Wuhan University, and Pacific Biosciences teamed up to deploy Single Molecule, Real-Time (SMRT®) Sequencing to generate the first genome-wide view of PT modifications and to better understand their function. “Genomic mapping of phosphorothioates reveals partial modification of short consensus sequences” by Cao et al. was published in Nature Communications.

The authors note that PT modifications, which replace a non-bridging phosphate oxygen with sulphur, were only recently discovered to occur naturally in bacteria. (PT modifications are used by scientists to stabilize synthetic DNA molecules against nuclease degradation.) Today, these modifications have been seen in more than 200 bacteria and archaea, but the detailed genome-wide distribution and biological functions have not been clear.


Tuesday, July 22, 2014

At ISMB, Gene Myers’ Keynote Offers History, Future of Genome Assembly

At ISMB 2014 in Boston earlier this month, Gene Myers of the Max-Planck Institute for Molecular Cell Biology and Genetics presented a keynote address entitled “DNA Assembly: Past, Present, and Future.” Myers received the prestigious Senior Scientist Accomplishment Award from the International Society for Computational Biology (ISCB) at the event.

The ISCB Senior Scientist Accomplishment Award honors respected leaders in computational biology and bioinformatics for their significant contributions to these fields through research, education, and service. Myers is being honored as the 2014 winner for his outstanding contributions to the bioinformatics community, particularly his work on sequence comparison algorithms and whole-genome shotgun sequencing methods, and his recent endeavors in developing software and microscopic devices for bioimage informatics.


Friday, July 11, 2014

ISMB 2014: The World Cup of Bioinformatics

We’re eager for the #ISMB conference — it’s the 22nd annual Intelligent Systems for Molecular Biology event — kicking off this weekend in Boston. As we continue to push our technology to deliver longer read lengths, we have been honored to work with many leading bioinformaticians to optimize the processing and analysis of our data.

Several of those experts will be speaking at ISMB this year. On Sunday, attendees will hear from Adam Phillippy of the National Biodefense Analysis and Countermeasures Center. He’ll be presenting at noon on producing complete genome assemblies using Single Molecule, Real-Time (SMRT®) Sequencing data. Adam’s team recently developed a new assembler called MHAP that dramatically reduces the computing power needed for building assemblies, so we are eager to hear more.


Wednesday, July 9, 2014

Optimizing Eukaryotic De Novo Genome Assembly: Webinar Recording Available

 http://programs.pacificbiosciences.com/l/1652/2014-07-09/2wbhjt
Our webinar on eukaryotic genome assembly attracted a great crowd, and now we’re making the full recording available to the community. The session featured great hands-on information and best practices for working with Single Molecule, Real-Time (SMRT®) Sequencing data. “Optimizing Eukaryotic Genome Assembly with Long-Read Sequencing” featured three excellent speakers — Michael Schatz and James Gurtowski from Cold Spring Harbor Laboratory and Sergey Koren from the National Biodefense Analysis and Countermeasures Center — and was hosted by our own CSO Jonas Korlach.

Schatz kicked off the session with an overview of assemblers for PacBio® data (as well as recommendations for when to use each one) and a look at the challenges of short-read assemblies. He also set expectations around long-read data, noting that for genomes smaller than 100 Mb, users should expect a nearly perfect assembly from the automated workflow. Genomes up to 1 Gb should be represented in a high-quality assembly with a contig N50 of at least 1 Mb. Genomes larger than that will have shorter contig N50 stats and will require more computational power, he added.
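Since contig N50 is the yardstick used in these expectations, a short worked example may help. The contig N50 is the length of the contig at which, counting from the longest contig down, the running total first reaches half of the total assembly length. The Python sketch below uses that standard definition; the function name `contig_n50` and the contig lengths are hypothetical and were not presented in the webinar.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that carries the running total to half of
        # the assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical 5 Mb assembly split into five contigs.
example_contigs = [2_000_000, 1_500_000, 800_000, 400_000, 300_000]
print(contig_n50(example_contigs))  # 1500000
```

In this made-up assembly, the two longest contigs together cover 3.5 Mb of the 5 Mb total, so the N50 is 1.5 Mb, which would clear the 1 Mb bar mentioned above.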


Tuesday, July 1, 2014

Scientists Generate the First Personal Transcriptome Using SMRT Sequencing

A new paper from scientists at Stanford University and Yale University describes the use of Single Molecule, Real-Time (SMRT®) Sequencing to generate transcriptomes for three individuals. The work is believed to be the first personal transcriptome analysis using long-read sequencing.

The paper, entitled “Defining a personal, allele-specific, and single-molecule long-read transcriptome,” was published in PNAS by Hagen Tilgner, Fabian Grubert, Donald Sharon, and Michael Snyder. Last year, the same authors published a study using SMRT Sequencing to analyze transcriptomes across tissue samples from human organs. In the PNAS publication, they compare metrics from the new data set to those from the previous study.


Friday, June 27, 2014

At SFAF 2014, Great Science and High-Quality Genomes

It’s been a busy start to the summer, but we’re still basking in the top-notch presentations and posters from the Sequencing, Finishing, and Analysis in the Future meeting last month. Hosted by Los Alamos National Laboratory in Santa Fe, this has become a premier event for scientists working on sequencing protocols, analysis, and assembly methods.

Many speakers presented data that included reads from Single Molecule, Real-Time (SMRT®) Sequencing. Jeff Rogers from Baylor College of Medicine used long PacBio® reads with the PBJelly algorithm to fill gaps in many mammalian genomes, including sheep, rat, baboon, sooty mangabey, and mouse lemur. Tina Graves-Lindsay from Washington University reported work on improving the reference human genome through BAC sequencing and the use of a haploid human data set, which included PacBio’s CHM1TERT data release. James Gurtowski from Cold Spring Harbor Laboratory detailed improvements to genome assemblies of yeast, Arabidopsis, and various rice strains using his new algorithms, ECTools.


Monday, June 23, 2014

Unprecedented Read Length at the Icahn Institute: Precise Sizing + SMRT Sequencing

At the Icahn Institute for Genomics and Multiscale Biology at Mount Sinai in New York City, technology development expert Robert Sebra, Ph.D., sees tremendous need for long-read, high-accuracy sequencing for use in microbial surveillance, detection of repeat expansions, and other research applications. To meet that demand, he relies on Single Molecule, Real-Time (SMRT®) Sequencing from Pacific Biosciences with BluePippin™ automated DNA size selection from Sage Science. Together, these tools offer a powerful solution and industry-leading read lengths that allow Sebra and other researchers to resolve repeat elements and structural variants, rapidly close microbial genomes, and measure epigenetic marks.

Sebra, an assistant professor of genetic and genomic sciences, is no stranger to the SMRT Sequencing platform: he spent five years working at PacBio helping to develop that technology. Ultimately, his belief in the system led him to join the Icahn Institute, where he would get to use the PacBio® sequencer as a customer. Sebra, who came to Mount Sinai in 2012, says, “I had experienced firsthand the value of long-read sequencing and wanted to apply it to human and infectious disease research.”


Monday, June 2, 2014

Intro to the Iso-Seq Method: Full-length transcript sequencing

With the recent launch of SMRT Analysis v2.2, we’re excited to introduce analysis software support for the new Iso-Seq™ method for sequencing full-length transcripts and gene isoforms, with no assembly required! Today we’ll take a deeper look at the Iso-Seq method to explain its unique scientific value and review publications from those already applying Single Molecule, Real-Time (SMRT®) Sequencing to this exciting area of research.

In plants and animals, as in all higher eukaryotes, the majority of genes are alternatively spliced to produce multiple transcript isoforms. In humans, for example, there is evidence for alternative splicing of more than 95% of genes [1], with an average of more than five isoforms per gene. Gene regulation through alternative splicing can dramatically increase the protein-coding potential of a genome that contains a limited number of protein-coding genes. Somewhat surprisingly, alternatively spliced isoforms from a single gene can also have very different, even antagonistic, functions [2]. Therefore, understanding the functional biology of a genome requires knowing the full complement of isoforms. Microarrays and high-throughput cDNA sequencing have become incredibly useful tools for studying transcriptomes, yet these technologies provide only small snippets of transcripts, and building complete transcripts to study gene isoforms has been challenging.