About this Publication
Title
Insights into colon cancer etiology via a regularized approach to gene set analysis of GWAS data.
Pubmed ID
20560206
Publication
Am. J. Hum. Genet. 2010 Jun; Volume 86 (Issue 6): Pages 860-71
Authors
Chen LS, Hutter CM, Potter JD, Liu Y, Prentice RL, Peters U, Hsu L
Affiliations
  • Biostatistics and Biomathematics Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109-1024, USA.
Abstract

Genome-wide association studies (GWAS) have successfully identified susceptibility loci from marginal association analysis of SNPs. Valuable insight into genetic variation underlying complex diseases will likely be gained by considering functionally related sets of genes simultaneously. One approach is to further develop gene set enrichment analysis methods, which originated in gene expression studies, to account for the distinctive features of GWAS data. These features include the large number of SNPs per gene, the modest and sparse SNP associations, and the additional information provided by linkage disequilibrium (LD) patterns within genes. We propose a "gene set ridge regression in association studies (GRASS)" algorithm. GRASS summarizes the genetic structure for each gene as eigenSNPs and uses a novel regularized regression technique, termed group ridge regression, to select representative eigenSNPs for each gene and assess their joint association with disease risk. Compared with existing methods, the proposed algorithm greatly reduces the high dimensionality of GWAS data while still accounting for multiple hits and/or LD in the same gene. We show by simulation that this algorithm performs well in situations in which the number of predictors is large relative to the sample size. We applied the GRASS algorithm to a genome-wide association study of colon cancer and identified nicotinate and nicotinamide metabolism and transforming growth factor beta signaling as the top two significantly enriched pathways. Elucidating the role of variation in these pathways may enhance our understanding of colon cancer etiology.
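
As a rough illustration of the eigenSNP idea mentioned in the abstract, the sketch below summarizes one gene's SNP genotypes by their leading principal components. It is not the authors' implementation: the function name eigen_snps, the 0/1/2 minor-allele coding, and the 95% variance cutoff are all assumptions made for this example, and the group ridge regression step that GRASS applies afterwards is not shown.

```python
import numpy as np


def eigen_snps(genotypes, var_explained=0.95):
    """Return principal-component scores ("eigenSNPs") for one gene.

    genotypes: (n_samples, n_snps) array of minor-allele counts (0, 1, 2).
    var_explained: hypothetical cutoff on cumulative variance retained;
        the abstract does not state how many components are kept.
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each SNP column

    # SVD of the centered matrix; the columns of U * S are the PC scores.
    U, S, _ = np.linalg.svd(X, full_matrices=False)

    total = (S ** 2).sum()
    if total == 0:
        # No variation in any SNP: nothing to summarize.
        return np.empty((X.shape[0], 0))

    # Smallest number of components whose cumulative variance
    # reaches the cutoff, capped at the number available.
    ratio = np.cumsum(S ** 2) / total
    k = min(int(np.searchsorted(ratio, var_explained)) + 1, len(S))
    return U[:, :k] * S[:k]
```

When several SNPs in a gene are in strong LD, their genotype columns are nearly collinear, so a few components capture most of the variation. That is the sense in which this kind of summary reduces dimensionality while still reflecting LD within the gene.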
