- William Noble of the University of Washington
- A Genomics@JHU Seminar
- When: February 23, 2016, 10:00 am
- Where: Welsh Library West Reading Room
1900 E. Monument St
Baltimore MD 21205
- Light refreshments served at 10:00 am

Genomic sequencing assays such as ChIP-seq and DNase-seq can measure many types of genomic
activity, but the high cost of sequencing limits the number of these assays that are typically
performed in a given experimental condition. I will discuss a principled method for selecting which
genomics assays to perform, given a limited budget. The method relies upon optimization over
submodular functions, which are discrete set functions that have properties analogous to certain
continuous convex functions. I will also show how a similar submodular optimization approach can be
applied to the problem of selecting a representative subset of protein sequences from a large database.
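
As a rough illustration of the kind of submodular selection described above, the sketch below greedily picks a representative subset under a fixed budget. It is not the speaker's method: the facility-location objective, the function names `facility_location` and `greedy_select`, and the toy similarity matrix are all assumptions chosen for illustration. With non-negative similarities, facility location is a monotone submodular function, which is the setting in which greedy selection is commonly used.

```python
from typing import List, Sequence


def facility_location(similarity: Sequence[Sequence[float]], chosen: List[int]) -> float:
    """Score a subset: each item contributes its similarity to the closest chosen item.

    Assumes non-negative similarities; this objective is an illustrative choice.
    """
    if not chosen:
        return 0.0
    return sum(max(row[j] for j in chosen) for row in similarity)


def greedy_select(similarity: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Repeatedly add the item with the largest marginal gain until the budget is spent."""
    chosen: List[int] = []
    remaining = set(range(len(similarity)))
    for _ in range(min(budget, len(similarity))):
        current = facility_location(similarity, chosen)
        # Marginal gain of adding j to the current selection; ties go to the lowest index.
        best = max(
            sorted(remaining),
            key=lambda j: facility_location(similarity, chosen + [j]) - current,
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical toy data: items 0-1 form one cluster, items 2-4 form another.
SIMILARITY = [
    [1.0, 0.9, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8, 0.6],
    [0.1, 0.1, 0.8, 1.0, 0.8],
    [0.1, 0.1, 0.6, 0.8, 1.0],
]

if __name__ == "__main__":
    print(greedy_select(SIMILARITY, budget=2))
```

On this toy matrix, a budget of two yields one item from each cluster: after the first pick, a second item from the same cluster adds little, so the diminishing-returns property steers the selection toward the uncovered cluster.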

I will then describe our work on methods that use unsupervised machine learning to
interpret large, heterogeneous collections of genomic data. Semi-automated genome annotation (SAGA)
algorithms facilitate human interpretation of heterogeneous collections of genomic data by
simultaneously partitioning the human genome and assigning labels to the resulting genomic segments.
However, existing SAGA methods cannot integrate inherently pairwise chromatin conformation data. We
developed a new computational method, called graph-based regularization (GBR), for expressing a
pairwise prior that encourages certain pairs of genomic loci to receive the same label in a genome
annotation. We used GBR to exploit chromatin conformation information during genome annotation by
encouraging positions that are close in 3D to occupy the same type of domain.

William Noble is the Director of the University of Washington Computational Molecular Biology Program
and Co-Director of the University of Washington Center for Nuclear Organization and Function.