Fast Algorithms for Improved Transcriptome Analysis

  • Rob Patro of Stony Brook University
  • A Genomics@JHU Seminar
  • When: April 5, 2016, 10:30 am
  • Where: Bloomberg School of Public Health
    9th Floor Room E9519 (use south elevators)
    615 Wolfe St.
    Baltimore MD 21205
  • Light refreshments served at 10:00 am

Abstract

Short-read alignments are the lingua franca of much of computational genomics. Most analyses “begin with a bam”, which requires that the reads first be aligned to the reference (genome or transcriptome) of interest. Given how quickly sequencing data are now acquired, alignment can pose a significant computational burden. Crucially, this burden is not always necessary. In this talk, I will advocate for analysis-efficient computing, which centers on the design of algorithms and tools that compute only the information necessary to perform the required analysis.

As an example of this idea, I will discuss the concept of quasi-mapping, which provides a subset of the information present in traditional read alignments but can be computed much more efficiently. I will discuss our software, RapMap, which provides an efficient implementation of quasi-mapping when the target reference consists of a collection of transcript sequences. Finally, I will show how this quasi-mapping information can replace traditional alignments in two particular tasks: transcript-level abundance estimation from RNA-seq data (implemented in the Salmon and updated Sailfish software), and the sequence-based clustering of contigs in de novo transcriptomics (implemented in the RapClust software). In both of these applications, we obtain speed improvements of more than an order of magnitude over existing alignment-based approaches, with no sacrifice in accuracy (and sometimes an improvement). We expect this type of analysis-efficient computing to extend to a wide variety of genomic, transcriptomic, and metagenomic analyses, and anticipate that it will become increasingly important as sequencing assays continue to drop in cost and produce ever-greater volumes of data.

Biography

Rob Patro is an Assistant Professor of Computer Science at Stony Brook University, where he heads the Computational Biology and Network Evolution (COMBINE) lab. Prior to joining Stony Brook, Rob obtained his Ph.D. in Computer Science from the University of Maryland and was a postdoctoral research associate in the Kingsford Group in the Computational Biology Department at Carnegie Mellon University.

Genomics @ JHU Seminar Series
