SIMONS SEMINAR: ROBUST REGRESSION FOR MICROBIOME DATA ANALYSIS
Flatiron Institute, Simons Foundation
Date: Thursday, Jan 10, 2019
Time: 02:30 PM
Venue: Simons Centre, Ground Floor Hall, NCBS
Recent advances in low-cost metagenomic and amplicon sequencing techniques enable routine sampling of environmental and host-associated microbial communities across different habitats. The data produced by these large-scale surveys typically comprise relative abundances (or compositions) of microbial taxa at different taxonomic levels. To investigate the dependency of additional covariate measurements such as metabolites or host phenotypes on the microbial compositions, we introduce a general robust regression framework for compositional data. We propose a novel log-contrast regression model with mean shift parameters; the model allows the identification of sample outliers and maintains sub-compositional coherence with respect to the associated phylogenetic tree. The model is estimated using a sparse penalized regression approach that simultaneously enforces sparsity in the mean shift and covariate parameters. We demonstrate the superiority of our approach using a wide range of synthetic simulation scenarios and infer novel associations between body mass index measurements and human gut microbes in a large public collection of human gut microbiome data.
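For readers unfamiliar with the terminology, a generic mean-shift log-contrast model can be sketched as follows. This is an illustrative formulation under stated assumptions, not necessarily the exact model presented in the talk: the symbols y_i (response for sample i), x_{ij} (relative abundance of taxon j in sample i), \beta (covariate coefficients), \gamma (mean shift parameters), and the tuning parameters \lambda_1, \lambda_2 are all notation introduced here, and the single zero-sum constraint stands in for the tree-based sub-compositional constraints mentioned in the abstract.

```latex
% Log-contrast regression with a per-sample mean shift (illustrative notation)
\begin{align}
  y_i &= \beta_0 + \sum_{j=1}^{p} \beta_j \log x_{ij} + \gamma_i + \varepsilon_i,
  \qquad \sum_{j=1}^{p} \beta_j = 0, \qquad i = 1, \dots, n, \\
  % Sparse penalized estimation: l1 penalties on both beta and gamma
  (\hat{\beta}_0, \hat{\beta}, \hat{\gamma}) &= \operatorname*{arg\,min}_{\beta_0,\, \beta,\, \gamma :\; \sum_j \beta_j = 0}
  \; \frac{1}{2} \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j \log x_{ij} - \gamma_i \Bigr)^{2}
  + \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert
  + \lambda_2 \sum_{i=1}^{n} \lvert \gamma_i \rvert .
\end{align}
```

In this sketch, a nonzero estimate of \gamma_i flags sample i as a potential outlier, while the zero-sum constraint on \beta makes the fit invariant to rescaling each sample's abundances, which is what makes the regression meaningful for compositional data.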
At the end of the talk, I will give an overview of a statistical method for low-rank and sparse factor regression with an application in yeast cell cycle analysis, to identify transcription factors regulating the RNA transcript levels of yeast genes within the eukaryotic cell cycle.