Poster Abstracts

Poster #1: Big Data Approaches to Cancer Immunotherapy

Daphne E. Schlesinger1,3,4, Tricia Cottrell5, Peter Nguyen6,7, Sneha Berry6,7, Benjamin Green6,7, Nicolas Giraldo-Castillo5, Janice M. Taube5,6,7, and Alex Szalay1,2,3, 1Johns Hopkins Department of Physics & Astronomy; 2Johns Hopkins Department of Computer Science; 3The Institute for Data Intensive Engineering and Science; 4Johns Hopkins Department of Biomedical Engineering; 5Johns Hopkins Department of Pathology; 6Johns Hopkins Department of Dermatology; 7Bloomberg-Kimmel Institute for Cancer Immunotherapy
Targeted immunotherapy is an increasingly promising area of innovation for cancer treatment. In particular, researchers have identified a correlation between patient survival and the spatial relations between T-cells and the tumor margin [1]. Furthermore, the expression of transmembrane proteins on tumor cells, most notably PD-L1, reflects the immunogenicity of the cells of a given cancer, … Continued

Poster #2: Characteristics and Causes of Denmark Strait Overflow Transport Variability

Mattia Almansi1, Thomas W. N. Haine1, Robert S. Pickart2, Marcello G. Magaldi3,1, Renske Gelderloos1, and Dana Mastropole2, 1Department of Earth and Planetary Sciences, Johns Hopkins University, Baltimore, Maryland. 2Woods Hole Oceanographic Institution, Woods Hole, Massachusetts. 3CNR-Consiglio Nazionale delle Ricerche, ISMAR-Istituto di Scienze Marine, Lerici, Italy.
We present initial results from a year-long, high-resolution (~2 km) numerical simulation covering the east Greenland shelf and the Iceland and Irminger Seas. The numerical model was run at the Maryland Advanced Research Computing Center (MARCC), and the post-processing was performed on the Johns Hopkins Data-Scope. Our datasets and user-friendly post-processing scripts are … Continued

Poster #3: FaRSA for minimizing convex l_1-regularized functions

Tianyi Chen and Daniel P. Robinson, Applied Mathematics and Statistics, Johns Hopkins University
We present our work on minimizing objective functions that may be written as the sum of a convex function and a sparsity-inducing L1 regularizer. By using curvature information from subspaces that evolve during the solution process, we have designed an algorithm that is generally better than the state-of-the-art in terms of both robustness and … Continued
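For readers unfamiliar with this problem class, the sketch below minimizes a smooth convex function plus an L1 penalty using a plain proximal-gradient (ISTA) iteration. This is not FaRSA, which the abstract describes as using subspace curvature information; it is only a baseline illustration of the objective. The function names, the toy objective 0.5*||x - b||^2, and the values of `lam` and `step` are all assumptions chosen for the example.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(grad_f, x0, lam, step, iterations=100):
    """Proximal-gradient iteration for min f(x) + lam * ||x||_1.

    grad_f returns the gradient of the smooth part f as a list.
    """
    x = list(x0)
    for _ in range(iterations):
        g = grad_f(x)
        # Gradient step on f, then shrinkage for the L1 term.
        x = [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, g)]
    return x


# Toy smooth term (hypothetical): f(x) = 0.5 * ||x - b||^2, so grad f = x - b.
b = [3.0, 0.5, -2.0]


def grad(x):
    return [xi - bi for xi, bi in zip(x, b)]


solution = ista(grad, [0.0, 0.0, 0.0], lam=1.0, step=0.5)
print(solution)
```

For this toy objective the minimizer is the soft-thresholded vector b, so the middle component is driven exactly to zero, which is the sparsity effect the regularizer is meant to produce.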

Poster #4: Getting started with recount2 and accessing it via R

Leonardo Collado-Torres1, 2, Abhinav Nellore3, 4, 5, Andrew E. Jaffe6, 7, 1Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205; 2Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21205; 3Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR, 97239; 4Department of Surgery, Oregon Health and Science University, Portland, OR, 97239; 5Computational Biology Program, Oregon Health and Science University, Portland, OR, 97239; 6Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205; 7Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205
The recount2 resource is composed of over 70,000 uniformly processed human RNA-seq samples spanning TCGA and SRA, including GTEx. The processed data can be accessed via the recount2 website and the recount Bioconductor package. In the poster we will describe the recount2 resource starting from how the coverage count matrices were computed in recount2 as … Continued

Poster #5: Leveraging linked reads for single-sample somatic variant calling

Charlotte Darby, Ben Langmead, Michael Schatz, Computer Science, Johns Hopkins University
In contrast to germline (inherited) variants, DNA mutations occurring during development are only present in some cells of the developed individual. A healthy human is thought to harbor many benign “somatic mutations” throughout their body, but certain additional ones can be disease-causing. Somatic mutations have been implicated in autism, rare diseases, including those where the … Continued

Poster #6: MEDE-DSC Integration Beyond MEDE

David C. Elbert*, 1, Nicholas S. Carey2, and Tamás Budavári3, 2, 4 **, 1. Department of Earth and Planetary Sciences, Johns Hopkins University, Baltimore, MD 21218. 2. Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218. 3. Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD 21218. 4. Hopkins Extreme Materials Institute Johns Hopkins University, Baltimore, MD 21218.
Over the last six years, the interagency Materials Genome Initiative (MGI) has driven data-centric integration in materials sciences and engineering. This focus includes development of diverse data repositories and computational hubs that significantly advance opportunities for collaboration and federated data sharing. Our participation in the Materials Data Infrastructure Consortium has helped coordinate development efforts and … Continued

Poster #7: Materials In Extreme Environments (MEDE) Data Science Integration: MEDE Data Science Cloud, version 1

David C. Elbert*,1, Nicholas S. Carey2, Aavik Pakrasi and Tamás Budavári3, 2, 4 ** , 1. Department of Earth and Planetary Sciences, 2. Department of Computer Science, 3. Department of Applied Mathematics and Statistics, 4. Hopkins Extreme Materials Institute. Johns Hopkins University, Baltimore, MD 21218
The first five years of the Materials Genome Initiative (MGI) have been marked by significant advances in access to research data across the materials domain. In addition, there is broad recognition that materials sciences and engineering is confronting issues of rapidly expanding data scale and scope. Such Big Data come from advances on many fronts … Continued

Poster #8: Novel Cell Type-Specific Alternative Splicing Across the Nervous System

Jonathan Ling1, Christopher Wilks2,3, Abhinav Nellore4,5, Ben Langmead2,3, 1: Johns Hopkins University, Neuroscience, Baltimore, MD; 2: Johns Hopkins University, Computer Science, Baltimore, MD; 3: Johns Hopkins University, Center for Computational Biology, Baltimore, MD; 4: Oregon Health & Science University, Biomedical Engineering, Portland, OR; 5: Oregon Health & Science University, Surgery, Portland, OR
De novo identification of novel transcripts is an exceptionally challenging task and researchers commonly rely on annotated transcript databases to quantify expression or alternative splicing. However, unannotated splicing events can be crucial to understanding disease and discovering new therapies. As an example, we recently developed a method for identifying novel and unannotated cryptic exons that … Continued

Poster #9: Probabilistic cross-identification of multiple catalogs in crowded fields

Xiaochen Shi, Tamas Budavari, and Amitabh Basu, Applied Mathematics and Statistics, Johns Hopkins University
Matching astronomical catalogs in crowded regions of the sky is challenging both statistically and computationally due to the many possible alternative associations. Budavári and Basu (2016) modeled the two-catalog situation as an Assignment Problem and used the well-known Hungarian algorithm to solve it. Here we treat cross-identification of multiple catalogs by introducing a different approach … Continued
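As a rough illustration of the two-catalog assignment formulation mentioned above, the sketch below pairs sources from two tiny catalogs so that the total squared separation is minimal. It enumerates all pairings by brute force rather than using the Hungarian algorithm, so it is only practical for a handful of sources. The coordinates, the cost function, and the function names are hypothetical, and the probabilistic model used in the poster is not reproduced here.

```python
from itertools import permutations


def squared_separation(a, b):
    """Squared Euclidean distance between two (x, y) positions."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def best_assignment(catalog_a, catalog_b):
    """Return the one-to-one matching with the smallest total cost.

    matches[i] is the index in catalog_b assigned to source i of catalog_a.
    Brute force over all permutations: O(n!), for illustration only.
    """
    n = len(catalog_a)
    if n != len(catalog_b):
        raise ValueError("this sketch assumes equally sized catalogs")
    best_cost = float("inf")
    best_perm = None
    for perm in permutations(range(n)):
        cost = sum(
            squared_separation(catalog_a[i], catalog_b[j])
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(best_perm), best_cost


# Hypothetical positions in arbitrary units.
catalog_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
catalog_b = [(0.1, 0.9), (0.05, 0.0), (1.0, 0.1)]

matches, cost = best_assignment(catalog_a, catalog_b)
print(matches, cost)
```

The number of candidate pairings grows factorially with catalog size, which is one reason crowded fields are computationally demanding and why polynomial-time assignment solvers are preferred in practice.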

Poster #10: Snaptron: a tool and service for studying splicing in tens of thousands of individuals

Christopher Wilks1,2, Jonathan Ling6, Phani Gaddipati4, Abhinav Nellore5,6, Ben Langmead1,2, 1. Department of Computer Science, Johns Hopkins University 2. Center for Computational Biology, Johns Hopkins University 3. Department of Neuroscience, Johns Hopkins University 4. Department of Biomedical Engineering, Johns Hopkins University 5. Department of Biomedical Engineering, Oregon Health & Science University 6. Department of Surgery, Oregon Health & Science University
As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be too difficult to obtain or too computationally unwieldy to analyze from scratch. We present Snaptron [1], a search engine for summarized RNA sequencing data. It serves … Continued

Poster #11: Take me out to Big Data: Analyzing professional baseball data online with SciServer

Jordan Raddick, IDIES
“You can observe a lot by just watching.” -Yogi Berra
For millions of people, following professional and amateur sports provides their first exposure to statistics. Even sports fans with no formal education at all routinely engage in sophisticated statistical thinking. That fact means that sports information provides an excellent opportunity to teach data science to … Continued

Poster #12: The New SciServer: Collaborative Tools for Data-Driven Engineering and Science

Jordan Raddick, Evelin Bányai, Joseph Booker, Tamás Budavári, Camy Chhetri, László Dobos, Lance Joseph, Jai Won Kim, Gerard Lemson, Dmitry Medvedev, Victor Paul, Mike Rippin, Bonnie Souter, Alex Szalay, Manuchehr Taghizadeh-Popp, Ani Thakar, Jan Vandenberg, Sue Werner, Alainna White, IDIES
SciServer is an online environment for research and education with big data, now being developed at IDIES with funding from the National Science Foundation. The system currently offers free access to big datasets online, with browser-based visualization and analysis tools. It is now being used by researchers and educators around the world to understand and … Continued

Poster #13: The role of topoisomerase II beta in the formation of transcriptional hubs in prostate cancer cells

Heather C Wick1,*, Michael Haffner2,3,*, David Esopi2, William Nelson2,3, Srinivasan Yegnasubramanian2,3,&, Sarah Wheelan1,2,&, 1: Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA, 2: Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA, 3: Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA. *: Contributed equally; &: Co-mentors
Cellular transcriptional programs requiring changes in expression of multiple genes may be more efficient when the target genes are brought into physical proximity by chromatin conformational changes. We hypothesize that topoisomerases, which induce transient single- and double-stranded breaks in DNA to relieve topological constraints, are required to facilitate the formation of such transcriptional hubs … Continued