BiRC seminar: Hugues Aschard, Harvard University
Department of Epidemiology, Cambridge, MA
Location: BiRC, Building 1110, C.F. Møllers Allé 8, 8000 Aarhus C
Title: Playing musical chairs in multi-phenotype studies improves power and identifies novel associations
Abstract:
Variability in complex human traits is associated with many factors, including exposures, biomarkers and genetic variants, all of which are increasingly collected in large-scale cohorts. Identifying the genetic variants that are causally associated with human phenotypes among the tens of millions of variants that are typically tested remains a challenge. Current strategies to improve power to identify modest genetic associations mostly consist of applying univariate statistical approaches, such as linear or logistic regression (LR), and increasing study sample sizes. While successful, these approaches do not leverage the environmental and genetic factors shared between the multiple phenotypes typically collected in contemporary cohorts. Here we develop a method called Musical Chairs (MC) that improves identification of small effects in studies where a large number of correlated variables have been measured on the same samples. When testing a specific variant for association with a phenotype of interest, including additional correlated phenotypes as covariates can increase or decrease power, depending on the underlying causal relationships among the genetic variant, the phenotype, and the covariates. MC is a data-driven approach that leverages our previous work (Zaitlen et al., PG 2012; Aschard et al., AJHG 2015) to select covariates that will increase power for each SNP-phenotype pair considered. MC is computationally efficient and maintains a controlled false positive rate even in the presence of thousands of phenotypes. Simulations based on phenotypic correlation structures from real cohorts show that large sets of correlated variables can be leveraged to achieve increases in statistical power equivalent to a two-, three-, or even fourfold increase in sample size.
To demonstrate the power of our approach, we performed large-scale genetic association screening of multiple real datasets, including metabolite, gene expression, and microbiome data. In each dataset we identified new associated variants, many of which were replicated in independent studies.
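The abstract's point that adjusting for a correlated phenotype can increase power can be illustrated with a small simulation. The sketch below is not the MC method itself and does not reproduce its covariate selection; it only covers the favorable case, under the assumption that the covariate affects the phenotype but is independent of the variant. All names and parameter values (`snp_t_stat`, `simulate_power`, the effect sizes, the sample size) are hypothetical choices made for illustration.

```python
import numpy as np


def snp_t_stat(y, g, covariates=None):
    """t-statistic of the SNP effect in an ordinary least squares fit.

    The design matrix holds an intercept, the genotype, and any covariates.
    """
    n = len(y)
    columns = [np.ones(n), g]
    if covariates is not None:
        columns.append(covariates)
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = n - X.shape[1]
    sigma2 = resid @ resid / dof
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov_beta[1, 1])


def simulate_power(n=500, n_rep=500, beta=0.15, gamma=0.8,
                   maf=0.3, threshold=1.96, seed=0):
    """Empirical power of the SNP test with and without covariate adjustment.

    Assumed (hypothetical) model: y = beta * g + gamma * c + noise,
    where the covariate c is independent of the genotype g.
    """
    rng = np.random.default_rng(seed)
    hits_unadjusted = 0
    hits_adjusted = 0
    for _ in range(n_rep):
        g = rng.binomial(2, maf, size=n).astype(float)  # additive genotype coding
        c = rng.normal(size=n)                          # correlated phenotype
        y = beta * g + gamma * c + rng.normal(size=n)
        hits_unadjusted += int(abs(snp_t_stat(y, g)) > threshold)
        hits_adjusted += int(abs(snp_t_stat(y, g, c)) > threshold)
    return hits_unadjusted / n_rep, hits_adjusted / n_rep


power_unadjusted, power_adjusted = simulate_power()
print(f"power without covariate: {power_unadjusted:.2f}")
print(f"power with covariate:    {power_adjusted:.2f}")
```

In this setting the covariate absorbs part of the residual variance, so the adjusted test detects the variant more often. The sketch deliberately leaves out the case in which the variant also influences the covariate, which is where, as the abstract notes, adjustment can decrease power.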