Getting started with recount2 and accessing it via R

Leonardo Collado-Torres (1, 2), Abhinav Nellore (3, 4, 5), Andrew E. Jaffe (6, 7)

1. Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205
2. Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21205
3. Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR, 97239
4. Department of Surgery, Oregon Health and Science University, Portland, OR, 97239
5. Computational Biology Program, Oregon Health and Science University, Portland, OR, 97239
6. Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205
7. Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205

Poster

The recount2 resource is composed of over 70,000 uniformly processed human RNA-seq samples spanning TCGA and SRA, including GTEx. The processed data can be accessed via the recount2 website and the recount Bioconductor package. In this poster we describe the recount2 resource, starting with how the coverage count matrices were computed and the different ways of obtaining public metadata, which can facilitate downstream analyses. We showcase how to use the recount package and how to integrate it with other Bioconductor packages. We give step-by-step directions for performing a gene-level differential expression analysis, visualizing base-level genome coverage data, and performing analyses at multiple feature levels. The associated workflow at https://f1000research.com/articles/6-1558/v1 provides further information for understanding the data in recount2 and a compendium of R code for using the data.
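As a rough illustration of the access pattern described above, the following minimal sketch downloads the gene-level data for one study with the recount package and converts the coverage counts into read-like counts. The study accession is an arbitrary choice made here for illustration (it is the example used in the recount package vignette, not a study named in this abstract), and the code assumes the recount package is installed and that internet access is available.

```r
## Minimal sketch, assuming recount is installed from Bioconductor,
## e.g. with BiocManager::install("recount")
library("recount")

## Assumption: an arbitrary SRA study accession chosen for illustration
project <- "SRP009615"

## Download the gene-level RangedSummarizedExperiment for the study.
## By default the file is saved in a directory named after the project.
download_study(project, type = "rse-gene")

## Loading the file creates an object called rse_gene
load(file.path(project, "rse_gene.Rdata"))

## recount2 stores base-level coverage sums; scale them so they can be
## treated as read counts by downstream count-based methods
rse <- scale_counts(rse_gene)

## Public sample metadata is available as the column data
colData(rse)
```

The scaled object can then be passed to other Bioconductor packages for a gene-level differential expression analysis; the associated workflow covers those steps in detail.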