Comprehensive Analysis of Risk Factors and Predictive Modeling for Different Cancer types in the PLCO Dataset

Principal Investigator

Name
Cheng He

Degrees
Ph.D.

Institution
Mirxes Labs Pte Ltd

Position Title
Vice President

Email
hecheng@mirxes.com

About this CDAS Project

Study
PLCO

Project ID
PLCO-1768

Initial CDAS Request Approval
Dec 16, 2024

Title
Comprehensive Analysis of Risk Factors and Predictive Modeling for Different Cancer types in the PLCO Dataset

Summary
This project seeks to analyze clinical and epidemiological data from the PLCO Cancer Screening Trial to develop predictive models for all four cancer types. The study will identify key predictors of cancer occurrence and cancer type, construct and evaluate multivariate risk prediction models, and analyze screening effectiveness metrics such as lead time, sojourn time, and over-diagnosis rates.

Through advanced statistical and machine learning methods, the research aims to uncover patterns and correlations that provide deeper insights into cancer risk and progression. By evaluating sensitivity, specificity, and receiver operating characteristic curves adjusted for baseline covariates, this work will support the design of targeted cancer prevention strategies and the optimization of screening protocols.

Aims

1. Identify Key Predictors of Cancer Risk
Assess correlations between clinical/epidemiological variables and the occurrence of cancer.
Determine the most influential predictors for cancers at different diagnostic time horizons (e.g., within 1 year, 2 years).
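
One way the diagnostic time horizons above could be turned into modeling outcomes is sketched below. This is an illustrative assumption, not the project's stated method: the function name and its arguments (days from baseline to diagnosis, days of follow-up) are hypothetical and do not refer to actual PLCO variable names. Participants whose follow-up ends before the horizon without a diagnosis are left unlabeled rather than counted as cancer-free.

```python
from typing import Optional


def horizon_label(days_to_dx: Optional[float],
                  followup_days: float,
                  horizon_days: float) -> Optional[int]:
    """Binary outcome for a fixed diagnostic horizon (hypothetical helper).

    Returns 1 if cancer was diagnosed within the horizon, 0 if the
    participant was followed for the whole horizon without a diagnosis
    in that window, and None if follow-up ended too early to tell.
    """
    if days_to_dx is not None and days_to_dx <= horizon_days:
        return 1
    if followup_days >= horizon_days:
        return 0
    return None  # censored before the horizon: exclude from this analysis


# Example with a 1-year (365-day) horizon
labels = [
    horizon_label(200, 200, 365),    # diagnosed within the year
    horizon_label(None, 400, 365),   # cancer-free through the year
    horizon_label(None, 100, 365),   # lost to follow-up early
    horizon_label(500, 500, 365),    # diagnosed, but after the horizon
]
```

Repeating the labeling with a 2-year horizon would give a second outcome on which the same candidate predictors can be compared.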

2. Develop Multivariate Risk Prediction Models
Construct predictive models using logistic regression, random forests, or other machine learning methods.
Quantify model performance using metrics such as AUC, sensitivity, and specificity, ensuring robustness via cross-validation.
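
The performance metrics named above can be illustrated with a minimal sketch in plain Python. The function names and the toy labels and scores are assumptions made for illustration; they are not taken from the project or from PLCO data. AUC is computed here as the fraction of case/non-case pairs in which the case receives the higher score, with ties counted as one half. Cross-validation is not shown.

```python
def sensitivity_specificity(y_true, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random case outscores a random non-case."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: two non-cases followed by two cases
y = [0, 0, 1, 1]
risk = [0.1, 0.4, 0.35, 0.8]

print(auc(y, risk))                           # 0.75
print(sensitivity_specificity(y, risk, 0.5))  # (0.5, 1.0)
```

In a cross-validated setting, these functions would be applied to the held-out predictions of each fold and the results summarized across folds.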

3. Evaluate the Performance of Screening Strategies
Estimate metrics such as lead time, sojourn time, and over-diagnosis rates for specific cancer types.
Analyze screening effectiveness across demographic subgroups, providing insights into disparities in outcomes.

4. Inform Future Cancer Prevention and Screening Design
Provide actionable recommendations for optimizing screening and prevention strategies based on study findings.

Collaborators

Dr. Piyush Samant, Data Scientist
Dr. Cheng He, VP R&D