Category: publications

Machine Learning Analysis of Motor Evoked Potential Time Series to Predict Disability Progression in Multiple Sclerosis

Background: Evoked potentials (EPs) are a measure of the conductivity of the central nervous system. They are used to monitor disease progression in multiple sclerosis patients. Previous studies extracted only a few variables from the EPs, which are often further condensed into a single variable: the EP score. We perform a machine learning analysis of…

Kinetic profiling of metabolic specialists demonstrates stability and consistency of in vivo enzyme turnover numbers

Enzyme turnover numbers (kcats) are essential for a quantitative understanding of cells. Because kcats are traditionally measured in low-throughput assays, they are often noisy, non-physiological, inconsistent, and labor-intensive to obtain. We use a data-driven approach to estimate in vivo kcats using metabolic specialist E. coli strains that resulted from gene knockouts in central metabolism followed…

Connecting Histopathology Imaging and Proteomics in Kidney Cancer through Machine Learning

Proteomics data encode molecular features of diagnostic value and accurately reflect key underlying biological mechanisms in cancers. Histopathology imaging is a well-established clinical approach to cancer diagnosis. The predictive relationship between large-scale proteomics and H&E-stained histopathology images remains largely uncharacterized. Here we investigate such associations through the application of machine learning, including deep neural networks,…

Biological network topology features predict gene dependencies in cancer cell lines

In this paper we explore computational approaches that enable us to identify genes that have become essential in individual cancer cell lines. Using recently published experimental cancer cell line gene essentiality data, human protein-protein interaction (PPI) network data, and individual cell-line genomic alteration data, we have built a range of machine learning classification models to…

Interpretable Machine Learning for Perturbation Biology

Systematic perturbation of cells followed by comprehensive measurements of molecular and phenotypic responses provides an informative data resource for constructing computational models of cell biology. Models that generalize well beyond training data can be used to identify combinatorial perturbations of potential therapeutic interest. Major challenges for machine learning on large biological datasets are to find…

Facetto: Combining Unsupervised and Supervised Learning for Hierarchical Phenotype Analysis in Multi-Channel Image Data

Facetto is a scalable visual analytics application that is used to discover single-cell phenotypes in high-dimensional multi-channel microscopy images of human tumors and tissues. Such images represent the cutting edge of digital histology and promise to revolutionize how diseases such as cancer are studied, diagnosed, and treated. Highly multiplexed tissue images are complex, comprising 10^9…

DeLTA: Automated cell segmentation, tracking, and lineage reconstruction using deep learning

Microscopy image analysis is a major bottleneck in quantification of single-cell microscopy data, typically requiring human supervision and curation, which limit both accuracy and throughput. To address this, we developed a deep learning-based image analysis pipeline that performs segmentation, tracking, and lineage reconstruction. Our analysis focuses on time-lapse movies of Escherichia coli cells trapped in…

Benchmarking predictions of MHC class I restricted T cell epitopes

T cell epitope candidates are commonly identified using computational prediction tools in order to enable applications such as vaccine design, cancer neoantigen identification, development of diagnostics and removal of unwanted immune responses against protein therapeutics. Most T cell epitope prediction tools are based on machine learning algorithms trained on MHC binding or naturally processed MHC…

Antibody Complementarity Determining Region Design Using High-Capacity Machine Learning

The precise targeting of antibodies and other protein therapeutics is required for their proper function and the elimination of deleterious off-target effects. Often the molecular structure of a therapeutic target is unknown and randomized methods are used to design antibodies without a model that relates antibody sequence to desired properties. Here we present a machine…

Fully Interpretable Deep Learning Model of Transcriptional Control

The universal expressibility assumption of Deep Neural Networks (DNNs) is the key motivation behind recent work in the systems biology community to employ DNNs to solve important problems in functional genomics and molecular genetics. Because of the black-box nature of DNNs, such assumptions, while useful in practice, are unsatisfactory for scientific analysis. In this paper,…