About this Publication
Title
Genetic background comparison using distance-based regression, with applications in population stratification evaluation and adjustment.
Pubmed ID
19140130
Publication
Genet. Epidemiol. 2009 Jul; Volume 33 (Issue 5): Pages 432-41
Authors
Li Q, Wacholder S, Hunter DJ, Hoover RN, Chanock S, Thomas G, Yu K
Affiliations
  • Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA.
Abstract

Population stratification (PS) can lead to an inflated rate of false-positive findings in genome-wide association studies (GWAS). The commonly used approach of adjusting for a fixed number of principal components (PCs) can reduce power when the selected PCs are equally distributed in cases and controls, or when covariates already included in the association analyses, such as self-identified ethnicity or recruitment center, already capture the major axes of genetic heterogeneity. We propose a computationally efficient procedure, PC-Finder, to identify a minimal set of PCs while still permitting an effective correction for PS. A general pseudo-F statistic, derived from a non-parametric multivariate regression model, can be used to assess whether PS exists or has been adequately corrected by a set of selected PCs. Empirical data from two GWAS conducted as part of the Cancer Genetic Markers of Susceptibility (CGEMS) project demonstrate the application of the procedure. Furthermore, simulation studies show the power advantage of the proposed procedure in GWAS over currently used PS correction strategies, particularly when the PCs with substantial genetic variation are distributed similarly in cases and controls and therefore do not induce PS.
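The abstract does not give the form of the pseudo-F statistic, so the following is only an illustrative sketch of a generic distance-based regression test, not the PC-Finder procedure itself. It assumes the widely used construction in which a pairwise distance matrix is converted to a Gower-centered matrix and then partitioned by the hat matrix of a design (here, case/control status), with significance assessed by permutation. The function names `pseudo_f` and `permutation_pvalue`, the argument names, and the choice of permutation test are all assumptions made for illustration.

```python
import numpy as np


def pseudo_f(dist, design):
    """Pseudo-F for a distance matrix regressed on a design (illustrative sketch).

    dist   : (n, n) symmetric matrix of pairwise distances between subjects
    design : (n,) or (n, k) predictors, e.g. case/control status (assumed input)
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]

    # Gower-centered matrix G = C A C, with A = -0.5 * dist**2
    a = -0.5 * dist ** 2
    centering = np.eye(n) - np.ones((n, n)) / n
    g = centering @ a @ centering

    # Hat matrix of the design, including an intercept column
    x = np.column_stack([np.ones(n), np.asarray(design, dtype=float)])
    hat = x @ np.linalg.pinv(x)
    m = np.linalg.matrix_rank(x)
    resid = np.eye(n) - hat

    # Ratio of explained to residual "variation" in the distances
    explained = np.trace(hat @ g @ hat) / (m - 1)
    residual = np.trace(resid @ g @ resid) / (n - m)
    return explained / residual


def permutation_pvalue(dist, design, n_perm=999, seed=0):
    """Permutation p-value: shuffle the design rows, keep the distances fixed."""
    rng = np.random.default_rng(seed)
    design = np.asarray(design, dtype=float)
    observed = pseudo_f(dist, design)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(dist.shape[0])
        if pseudo_f(dist, design[perm]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data (fabricated for illustration only): 40 subjects, 200 markers,
    # with a small allele-frequency shift between "cases" and "controls".
    rng = np.random.default_rng(42)
    status = np.repeat([0, 1], 20)
    freqs = 0.3 + 0.1 * status[:, None]
    genotypes = rng.binomial(2, freqs, size=(40, 200)).astype(float)
    diffs = genotypes[:, None, :] - genotypes[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=2))
    f_stat, p_value = permutation_pvalue(distances, status, n_perm=199)
    print(f"pseudo-F = {f_stat:.2f}, permutation p = {p_value:.3f}")
```

With Euclidean distances on a single variable, this statistic reduces to the classical ANOVA F statistic, which gives a simple sanity check. In a PS setting, a large pseudo-F for case/control status would suggest that cases and controls differ in genetic background; how the selected PCs enter the model is not described in the abstract and is not attempted here.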
