IDIES is pleased to announce its inaugural Summer Student Fellowship program! This program will offer awards of $6,000 to support summer research projects led by undergraduate students with the guidance of an IDIES faculty mentor. These projects give students the opportunity to participate in a 10-week (June – August), full-time, data-science-focused research project in collaboration with an IDIES faculty member.

Spring 2020

Using Machine Learning to Design Highly Stable, Biologically Active Proteins

Gina El Nesr (WSE & KSAS)
Mentor: Doug Barrick (Biophysics, KSAS)

Researchers have long sought methods to design proteins that are highly stable and retain their biological activities. The recent dramatic increase in genome sequencing data provides scientists with sufficient data for sequence-based protein design. One method that has successfully stabilized proteins uses consensus sequences. Although consensus sequences have been found to be more stable and biologically active, the approach implicitly assumes that residue frequencies at different positions are independent. In protein structures, however, residues are coupled to one another in a large interconnected network of interactions. The goal of this project is to employ a robust method to design proteins that incorporates these residue interactions. By using Restricted Boltzmann Machines to learn residue sequence-structure encodings, we can potentially discover protein sequences with improved stabilities, solubilities, and shelf lives. Developing such a methodology has applications in the pharmaceutical, biotechnology, and chemical industries.
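As a rough illustration of the independence assumption behind consensus design, the sketch below picks the most frequent residue at each position of a small alignment. The function name `consensus_sequence` and the toy alignment are hypothetical and are not taken from the project; real alignments are far larger and usually contain gaps.

```python
from collections import Counter


def consensus_sequence(alignment):
    """Return the most frequent residue at each column of an alignment.

    Assumes every sequence has the same length (already aligned).
    Each column is treated independently of all the others.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")

    residues = []
    for column in zip(*alignment):
        # Most common residue in this column; other columns are ignored.
        residue, _count = Counter(column).most_common(1)[0]
        residues.append(residue)
    return "".join(residues)


# Hypothetical toy alignment: position 2 and position 4 co-vary
# (K pairs with E; R and Q pair with D).
toy_alignment = [
    "AKLE", "AKLE", "AKLE",
    "ARLD", "ARLD",
    "AQLD", "AQLD",
]

consensus = consensus_sequence(toy_alignment)
print(consensus)                    # AKLD
print(consensus in toy_alignment)   # False
```

In this toy case the consensus combines K at position 2 with D at position 4, a pairing that never occurs in any of the input sequences. That is the kind of coupling a per-position frequency count cannot capture.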

Data-Driven Differential Diagnosis of Common Pulmonary Diseases in the ICU

Zherui Xuan
Mentor: Stuart Ray (Health Sciences Informatics, SOM)

The goal of this project is to create an algorithm that matches a physician’s judgement in determining an ICU patient’s pulmonary differential diagnoses. The project will focus on ICU pulmonary patients and the very first stage of designing an AI-driven CP: identification of differential diagnoses. During the summer, with the guidance of data scientists and physicians, I will research, create, and test a computer algorithm that assigns pulmonary disease patients in the ICU to one of five common pulmonary differential diagnoses (pneumothorax, bronchitis, COPD, pneumonia, and lung cancer) or to an "other" category. The algorithm will use vital signs (heart rate, blood pressure, respiratory rate, temperature, etc.), common diagnostic labs (blood chemistry, hematology, urine analysis, microbiology tests, etc.), and basic imaging reports (X-ray, CT scans, etc.).

Hunting for Metal-Poor Main Sequence Stars in Spectroscopic Surveys

Vedant Chandra
Mentor: Kevin Schlaufman (Physics and Astronomy, KSAS)

Metal-poor stars are 10-billion-year-old local relics of the early Universe. Their characteristics can therefore be used to infer the properties of the first stellar generation and the earliest evolution of the Milky Way. These metal-poor stars are, however, rare and hard to find: only a small fraction of the Milky Way’s metal-poor stellar population has been characterized. One significant challenge in this field is the spectroscopic similarity between rare metal-poor main sequence stars and common cool white dwarfs. Our project develops machine learning algorithms for large spectroscopic surveys, using Bayesian convolutional neural networks tuned to break this degeneracy. We will publish our metal-poor star discoveries and distribute our software tools for use by the broader astronomical community.