Principal Investigator
Name
Cheng He
Degrees
Ph.D.
Institution
Mirxes Labs Pte Ltd
Position Title
Vice President
About this CDAS Project
Study
PLCO
Project ID
PLCO-1768
Initial CDAS Request Approval
Dec 16, 2024
Title
Comprehensive Analysis of Risk Factors and Predictive Modeling for Different Cancer Types in the PLCO Dataset
Summary
This project will analyze clinical and epidemiological data from the PLCO Cancer Screening Trial to develop predictive models for all four cancer types. The study will identify key predictors of cancer (including its type), construct and evaluate multivariate risk prediction models, and analyze screening effectiveness metrics such as lead time, sojourn time, and over-diagnosis rates.

Using statistical and machine learning methods, the research aims to uncover patterns and correlations that provide deeper insight into cancer risk and progression. By evaluating sensitivity, specificity, and receiver operating characteristic curves adjusted for baseline covariates, this work will inform the design of targeted cancer prevention strategies and help optimize screening protocols.
Aims

1. Identify Key Predictors of Cancer Risk
Assess correlations between clinical/epidemiological variables and the occurrence of cancer.
Determine the most influential predictors of cancer at different diagnostic time horizons (e.g., within 1 year or 2 years).
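
As an illustration of how an outcome might be defined at different diagnostic time horizons and then correlated with a candidate predictor, here is a minimal sketch. The field names (`ages`, `days_to_dx`), the toy values, and the use of a plain Pearson correlation are assumptions made for this example and are not taken from the PLCO dataset or the project's actual methods. The sketch also ignores censoring, which a real analysis would need to handle.

```python
from math import sqrt

def outcome_within(days_to_dx, horizon_days):
    """Return 1 if a diagnosis occurred within the horizon, else 0.

    days_to_dx is None for participants with no diagnosis (hypothetical encoding).
    """
    return 1 if days_to_dx is not None and days_to_dx <= horizon_days else 0

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy, made-up records: age at baseline and days from baseline to diagnosis.
ages = [50, 55, 60, 65, 70, 75]
days_to_dx = [None, None, 700, None, 300, 200]

# Binary outcomes at a 1-year and a 2-year horizon.
labels_1y = [outcome_within(d, 365) for d in days_to_dx]
labels_2y = [outcome_within(d, 730) for d in days_to_dx]

# Correlation of the predictor with each horizon-specific outcome.
corr_1y = pearson(ages, labels_1y)
corr_2y = pearson(ages, labels_2y)
print(f"1-year correlation: {corr_1y:.3f}")
print(f"2-year correlation: {corr_2y:.3f}")
```

Comparing the correlations across horizons gives a first, rough indication of whether a predictor matters more for near-term or longer-term diagnosis.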

2. Develop Multivariate Risk Prediction Models
Construct predictive models using logistic regression, random forests, or other machine learning methods.
Quantify model performance using metrics such as AUC, sensitivity, and specificity, ensuring robustness via cross-validation.
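
The performance metrics named above can be sketched in a few lines. The example below is illustrative only: the labels and risk scores are made up, the 0.5 threshold is an arbitrary assumption, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than any method confirmed by the project. Cross-validation would repeat this calculation on held-out folds.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up outcomes (1 = cancer, 0 = no cancer) and model risk scores.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(labels, scores):.4f}")
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.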

3. Evaluate the Performance of Screening Strategies
Estimate metrics such as lead time, sojourn time, and over-diagnosis rates for specific cancer types.
Analyze screening effectiveness across demographic subgroups, providing insights into disparities in outcomes.

4. Inform Future Cancer Prevention and Screening Design
Provide actionable recommendations for optimizing screening and prevention strategies based on study findings.

Collaborators

Dr. Piyush Samant, Data Scientist
Dr. Cheng He, VP R&D