Developing Contextualized Cancer Risk Models: Integrating Polygenic Risk Scores with Environmental, Geospatial, and Social Determinants of Health Data

Principal Investigator

Name
Phillip Kraft

Degrees
Ph.D.

Institution
National Cancer Institute

Position Title
Senior Investigator

Email
phillip.kraft@nih.gov

About this CDAS Project

Study
PLCO

Project ID
PLCO-2003

Initial CDAS Request Approval
Dec 16, 2025

Title
Developing Contextualized Cancer Risk Models: Integrating Polygenic Risk Scores with Environmental, Geospatial, and Social Determinants of Health Data

Summary
Polygenic risk scores (PRSs) have emerged as a powerful tool in precision health, with utility for estimating and stratifying cancer risk at both the individual and population levels. However, the predictive performance of established PRSs is compromised by several methodological limitations, including Eurocentric bias, selection bias and, critically, confounding by measured and unmeasured sociodemographic and environmental variables. Specifically, genetic differences are often confounded with social, economic, and geographic factors that contribute to differential disease risk. This project will examine the extent to which such biases affect the performance of established PRSs for prostate, lung, colorectal, ovarian, and breast cancer. We will treat available geospatial, environmental, and social determinants of health data as outcomes and test their associations with cancer PRSs sourced from the publicly available Polygenic Score Catalog. After characterizing the extent to which cancer PRSs are associated with environmental variables in PLCO, we will use this information to build integrative risk models and assess the utility of such models in predicting cancer risk over time.
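
As a rough illustration of the association test described above, the sketch below regresses an environmental variable (the outcome) on a PRS and reports the slope, its standard error, and the t statistic. It is only a minimal sketch under stated assumptions: the data are simulated, the name `deprivation` is a hypothetical stand-in for a geospatial or SDoH measure, and the model is unadjusted. An actual analysis would likely adjust for covariates such as age, sex, and genetic ancestry, which the summary does not specify.

```python
import math
import random


def ols_slope(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (slope, standard error of the slope, t statistic).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used to estimate the error variance.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se


# Simulated (not real) data: a standardized PRS and a hypothetical
# area-level deprivation score that is partly correlated with it.
rng = random.Random(0)
n = 2000
prs = [rng.gauss(0, 1) for _ in range(n)]
deprivation = [0.3 * p + rng.gauss(0, 1) for p in prs]

# The environmental variable is the outcome and the PRS is the predictor.
slope, se, t_stat = ols_slope(prs, deprivation)
print(f"slope={slope:.3f}  se={se:.3f}  t={t_stat:.1f}")
```

A slope clearly different from zero would flag the environmental variable as a candidate confounder of the PRS.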

The current reliance on uncontextualized PRSs can miscalibrate risk estimates and may inadvertently perpetuate health disparities by wrongly attributing risk differences to genetic signals rather than socio-environmental exposures. By rigorously quantifying environmental confounding (Aim 1) and developing robust, integrative risk models (Aim 2), this project will generate a methodological blueprint for the next generation of precision health tools. These models will give researchers, clinicians, and public health officials a more accurate and equitable framework (Aim 3) for cancer risk stratification. Such a framework would enable targeted surveillance and intervention strategies that address both genetic predisposition and modifiable social determinants of health, with the goal of reducing health disparities.

Aims

• Specific Aim 1: Characterize the association between established cancer PRSs and environmental variables.
o We will analyze publicly available prostate, lung, colorectal, breast, and ovarian cancer PRSs (sourced from the Polygenic Score Catalog) in the PLCO dataset to quantify their associations with key lifestyle and environmental factors, including geospatial data, questionnaire responses, and social determinants of health (SDoH). This step will establish the magnitude and nature of confounding bias in current cancer PRS models.
• Specific Aim 2: Develop integrative risk models by incorporating environmental confounders.
o We will use the environmental variables found to be significant in Aim 1 to construct novel integrative risk models that combine genetic risk (PRS) with environmental variables (geospatial/questionnaire/SDoH data). The goal is to improve the predictive accuracy of the models while accounting for known associations of cancer PRSs with sociodemographic and environmental factors.
• Specific Aim 3: Evaluate the utility of integrative risk models in predicting cancer risk over time.
o We will validate the performance of the integrative risk models (from Aim 2) against the original established PRSs for predicting cancer risk longitudinally within the PLCO cohort. This evaluation will specifically assess the gain in predictive utility (ΔR² or ΔAUC) over basic clinical risk models and the accuracy of the integrative models in predicting cancer outcomes. We will also estimate odds ratios and hazard ratios using these models. As available, we will also explore the transportability of these models by applying them to external cohorts and measuring their predictive power in other settings.
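
The gain in discrimination mentioned in Aim 3 can be illustrated by comparing the AUC of a PRS-only score with that of a score that also includes an environmental variable. This is a hedged sketch, not the project's method: the data are simulated, and the weights 0.5 and 0.8 are hypothetical values reused from the simulation rather than fitted. In practice the combined model would be estimated (for example by logistic or Cox regression) and evaluated on held-out data.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Simulated (not real) cohort: cancer status depends on both a PRS and a
# hypothetical environmental exposure through a logistic model.
rng = random.Random(1)
n = 2000
prs = [rng.gauss(0, 1) for _ in range(n)]
env = [rng.gauss(0, 1) for _ in range(n)]
status = []
for p, e in zip(prs, env):
    risk = 1 / (1 + math.exp(-(-2.0 + 0.5 * p + 0.8 * e)))
    status.append(1 if rng.random() < risk else 0)

# PRS-only score versus a combined score with pre-specified weights.
integrative = [0.5 * p + 0.8 * e for p, e in zip(prs, env)]
auc_prs = auc(prs, status)
auc_full = auc(integrative, status)
delta_auc = auc_full - auc_prs
print(f"AUC PRS only={auc_prs:.3f}  AUC integrative={auc_full:.3f}  "
      f"delta={delta_auc:.3f}")
```

The pairwise comparison is quadratic in cohort size, which is fine for a toy example; a rank-based implementation would scale better to a cohort the size of PLCO.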

Collaborators

Phillip Kraft, National Cancer Institute
Jayati Sharma, National Cancer Institute
Rena Jones, National Cancer Institute