Data-driven re-annotation of patient tumors and derived model systems

  • Benjamin Haibe-Kains
  • A Genomics@JHU Seminar
  • When: May 01, 2017, 13:30
  • Where: BSPH E3609 (Genome Café)
    Bloomberg School of Public Health Building
    615 N Wolfe St,
    Baltimore, MD 21205


The success of precision medicine largely relies on comprehensive characterization of patient tumors and their derived model systems to select the best therapy for each individual patient. Recent initiatives have generated massive amounts of molecular data for healthy (GTEx) and tumor (TCGA) tissues from patients, as well as for patient-derived cancer cell lines (Cancer Cell Line Encyclopedia) and xenografts (PDX Encyclopedia). High-throughput molecular profiling at the (epi)genomic, transcriptomic and proteomic levels is complex and presents many challenges that are being actively investigated by the scientific community. Paradoxically, the clinical and histopathological data, although crucial for any phenotypic association study, are not studied as intensively as the molecular data. Our results indicate that the quality of such metadata may be questionable and can have a tremendous impact on the results of omics studies. In this talk, I will present our recent work on data-driven re-annotation of patient-derived models that are commonly used to screen anticancer therapeutics, namely immortalized cancer cell lines and mouse xenografts. By transforming gene expression profiles into binary features based on the concept of binary gene pairs, we developed a robust classifier that predicts tissue type, histology and sex. Our classifier allowed us to identify a set of misidentified cell lines, as well as metastatic models whose gene expression patterns deviate strongly from those of their primary tumor site. Our computational framework therefore enables data-driven correction of sample annotations and holds the potential to detect model systems lacking relevance for their annotated cancer type.
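
The binary gene-pair idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the speaker's implementation: the function name `pair_features`, the gene names and the expression values are all hypothetical, and the sketch assumes that each feature simply records whether one gene is expressed above another within the same sample.

```python
from itertools import combinations


def pair_features(expression, genes):
    """Map one sample's expression values to binary gene-pair features.

    Feature (a, b) is 1 when gene a is expressed above gene b, else 0.
    """
    return {
        (a, b): int(expression[a] > expression[b])
        for a, b in combinations(genes, 2)
    }


# Hypothetical gene names and values, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C"]
sample_1 = {"GENE_A": 8.2, "GENE_B": 3.1, "GENE_C": 5.0}

# The same sample on a different scale (for example, another platform).
# The within-sample ordering of the genes is unchanged.
sample_2 = {g: 2.0 * v + 1.0 for g, v in sample_1.items()}

features_1 = pair_features(sample_1, genes)
features_2 = pair_features(sample_2, genes)
```

Because each feature depends only on the relative order of two genes within a sample, any strictly increasing rescaling of the expression values leaves the features unchanged. That property is one plausible reason a pair-based classifier could be robust across datasets and profiling platforms.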

Genomics @ JHU Seminar Series
