Mayetri Gupta, Department of Biostatistics, School of Public Health, UNC Chapel Hill

Friday April 20th, 4pm, Phillips 332 (refreshments served in Phillips 330 starting at 3:30)

Variable selection in high dimensional regression models with applications to gene regulatory networks

Abstract: The development of modern scientific techniques, including large-scale genomic technologies, has led to the generation of enormous amounts of data, often characterized by high dimensions and complex dependence structures. In many cases, the number of variables measured (d) exceeds the number of observations (n), leading to model non-identifiability and difficulties in parameter estimation. Variable selection procedures also fail, both because massive collections of models cannot be enumerated and tested and because the larger models cannot be estimated by standard procedures.

We develop a procedure for Bayesian variable selection in a high-dimensional regression framework, focusing on generalized linear models (GLMs). We first generalize the g-prior (Zellner 1986) to a class of priors that is based on the information matrix and has a "ridge" parameter, which leads to posterior propriety of the regression coefficients even if d > n. This generalized version of the prior is found to have several attractive properties, including being semiautomatic in nature and requiring very little hyperparameter specification. We then investigate the operating characteristics of this prior in the context of discovering gene regulatory networks from genomic sequence and gene expression microarray data from a yeast cell-cycle experiment. By formulating this problem as variable selection in a high-dimensional regression mixture framework, we devise a unified and efficient Markov chain Monte Carlo procedure to simultaneously determine the latent groupings of genes and the unknown sets of motifs involved in their regulation, from a large number of potential regulators.
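
The role of the "ridge" parameter can be illustrated with a minimal sketch. The abstract does not give the exact form of the prior, so the code below assumes the simplest case: a Gaussian linear model, where the information matrix is proportional to X'X, and a prior covariance of the form g (X'X + lam I)^{-1}. The function name `ridge_g_prior_covariance` and the parameters `g` and `lam` are hypothetical and chosen only for illustration. When d > n, X'X has rank at most n and cannot be inverted, whereas adding lam I makes the matrix positive definite, so the prior covariance is well defined.

```python
import numpy as np


def ridge_g_prior_covariance(X, g, lam):
    """Covariance g * (X'X + lam * I)^{-1} of an ASSUMED ridge-type g-prior.

    This form is an illustrative assumption for a Gaussian linear model;
    it is not taken from the talk itself.
    """
    d = X.shape[1]
    # X'X alone is singular when d > n; lam * I lifts every eigenvalue by lam.
    regularized_information = X.T @ X + lam * np.eye(d)
    return g * np.linalg.inv(regularized_information)


rng = np.random.default_rng(0)
n, d = 5, 20                      # more variables than observations
X = rng.standard_normal((n, d))

gram = X.T @ X                    # d x d matrix, rank at most n
rank_gram = np.linalg.matrix_rank(gram)

cov = ridge_g_prior_covariance(X, g=float(n), lam=1.0)
min_eig = np.linalg.eigvalsh(cov).min()

print("rank of X'X:", rank_gram, "out of", d)
print("prior covariance is positive definite:", min_eig > 0)
```

Without the ridge term (lam = 0), the inverse in this sketch does not exist for d > n, which is the situation that the classical g-prior cannot handle.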

This is joint work with Joseph Ibrahim of the Department of Biostatistics at UNC.