Winner's Curse Correction and Variable Thresholding Improve Performance of Polygenic Risk Modeling Based on Genome-Wide Association Study Summary-Level Data.
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland, United States of America.
- Department of Statistics, Dongguk University, Seoul, Korea.
- Center for Psychiatric Genetics, Department of Psychiatry and Behavioral Sciences, North Shore University Health System Research Institute, University of Chicago Pritzker School of Medicine, Evanston, Illinois, United States of America.
- Department of Statistics, Northern Illinois University, DeKalb, Illinois, United States of America.
- Information Management Services, Inc., Rockville, Maryland, United States of America.
- Institute of Population Health Sciences, National Health Research Institutes, Miaoli, Taiwan.
- Department of Preventive Medicine and Department of Obstetrics and Gynecology, USC Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America.
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
- Geisel School of Medicine, Dartmouth College, Hanover, New Hampshire, United States of America.
- Human Genetics Foundation, Turin, Italy.
- National Institute of Cancer Research, National Health Research Institutes, Zhunan, Taiwan.
- Department of Etiology & Carcinogenesis, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
- Department of Epidemiology, School of Public Health, China Medical University, Shenyang, China.
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore.
- Division of Molecular Medicine, Aichi Cancer Center Research Institute, Chikusa-ku, Nagoya, Japan.
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea.
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America.
- Epidemiology Research Program, American Cancer Society, Atlanta, Georgia, United States of America.
- Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America.
- Department of Oncology, the Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America.
- Department of Gastrointestinal Medical Oncology, University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America.
- Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, Connecticut, United States of America.
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America.
- Department of Medicine and Epidemiology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, United States of America.
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany.
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, California, United States of America.
Recent heritability analyses have indicated that genome-wide association studies (GWAS) have the potential to improve genetic risk prediction for complex diseases based on the polygenic risk score (PRS), a simple modelling technique that can be implemented using summary-level data from the discovery samples. We herein propose modifications to improve the performance of PRS. We introduce threshold-dependent winner's-curse adjustments for the marginal association coefficients that are used to weight the single-nucleotide polymorphisms (SNPs) in PRS. Further, as a way to incorporate external functional/annotation knowledge that could identify subsets of SNPs highly enriched for associations, we propose variable thresholds for SNP selection. We applied our methods to GWAS summary-level data for 14 complex diseases. Across all diseases, a simple winner's-curse correction uniformly enhanced model performance, whereas incorporation of functional SNPs was beneficial only for selected diseases. Compared to the standard PRS algorithm, the proposed methods in combination led to notable gains in efficiency (25-50% increase in the prediction R²) for 5 of 14 diseases. As an example, for GWAS of type 2 diabetes, winner's-curse correction improved prediction R² from 2.29% based on the standard PRS to 3.10% (P = 0.0017), and incorporating functional annotation data further improved R² to 3.53% (P = 2×10⁻⁵). Our simulation studies illustrate why differential treatment of certain categories of functional SNPs, even when they are shown to be highly enriched for GWAS heritability, does not lead to proportionate improvement in genetic risk prediction: the linkage disequilibrium structure is non-uniform.
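The two ideas in the abstract, a threshold-dependent winner's-curse adjustment and separate selection thresholds for functionally annotated SNPs, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the estimator, so the soft-threshold shrinkage below is one plausible form of such an adjustment, and the names `SnpStat`, `shrunk_weight`, `build_prs_weights`, and `polygenic_score`, the boolean `functional` flag, and all numeric inputs are hypothetical. Linkage disequilibrium pruning, which a real pipeline would need, is omitted.

```python
import math
from dataclasses import dataclass
from statistics import NormalDist
from typing import Dict, List


@dataclass
class SnpStat:
    snp_id: str
    beta: float       # marginal effect estimate from the discovery GWAS
    se: float         # standard error of beta
    functional: bool  # hypothetical annotation flag (assumption)


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided normal p-value for the Wald statistic beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def shrunk_weight(beta: float, se: float, p_threshold: float) -> float:
    """Soft-threshold shrinkage (illustrative assumption).

    The estimate is pulled toward zero by the smallest |beta| that would
    have passed the selection threshold, so the size of the adjustment
    depends on the threshold used to select the SNP.
    """
    z_cut = NormalDist().inv_cdf(1.0 - p_threshold / 2.0)
    magnitude = max(abs(beta) - z_cut * se, 0.0)
    return math.copysign(magnitude, beta)


def build_prs_weights(stats: List[SnpStat],
                      p_functional: float,
                      p_other: float) -> Dict[str, float]:
    """Select SNPs with a category-specific threshold and shrink their weights."""
    weights: Dict[str, float] = {}
    for s in stats:
        threshold = p_functional if s.functional else p_other
        if two_sided_p(s.beta, s.se) < threshold:
            weights[s.snp_id] = shrunk_weight(s.beta, s.se, threshold)
    return weights


def polygenic_score(dosages: Dict[str, float],
                    weights: Dict[str, float]) -> float:
    """PRS for one individual: sum of weight times risk-allele dosage."""
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


# Made-up summary statistics, for illustration only.
stats = [
    SnpStat("rs_a", beta=0.20, se=0.04, functional=False),
    SnpStat("rs_b", beta=-0.09, se=0.04, functional=True),
    SnpStat("rs_c", beta=0.05, se=0.04, functional=False),
]
# A more liberal threshold for functional SNPs, a stricter one for the rest.
weights = build_prs_weights(stats, p_functional=0.05, p_other=1e-4)
score = polygenic_score({"rs_a": 2, "rs_b": 1, "rs_c": 2}, weights)
print(weights, score)
```

In this toy input, `rs_b` enters the score only because functional SNPs are given a more liberal threshold, and every selected weight is smaller in magnitude than its raw estimate. In practice the two thresholds would be tuned on a validation sample rather than fixed in advance.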
- 2007-0004: A Whole Genome Association Study (WGAS) of Lung Cancer and Smoking (Neil Caporaso - 2007)
- 2006-0306: Whole Genome Scan of Incident Pancreatic Cancer in the Cohort Consortium (PanScan) (Rachael Stolzenberg-Solomon - 2006)
- 2006-0285: Genome-wide Association Study for Colon Cancer (Ulrike Peters - 2006)
- 2005-0003: C-GEMS: Cancer Genetic Markers of Susceptibility A Strategic Initiative to Identify Novel Genetic Determinants of Cancer (Mitchell Machiela - 2005)