Principal Investigator
Name
Yuyang Liu
Degrees
Ph.D.
Institution
Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College
Position Title
Professor
Email
About this CDAS Project
Study
PLCO
Project ID
PLCO-1743
Initial CDAS Request Approval
Nov 19, 2024
Title
Development and Validation of Predictive Models for Thyroid Cancer Risk Using PLCO Data
Summary
Thyroid cancer has become one of the fastest-growing cancers globally, and its rising incidence has brought increased attention to early detection and risk stratification. This project aims to develop and validate predictive models for thyroid cancer using the rich data available in the PLCO Cancer Screening Trial. By leveraging the extensive demographic, lifestyle, clinical, and biomarker data within PLCO, we will identify key risk factors associated with thyroid cancer and build robust predictive models to improve early detection.

The project will employ machine learning and statistical modeling techniques to analyze data from individuals who developed thyroid cancer and those who did not, comparing a range of features, including genetic markers, family history, lifestyle factors, and biochemical measurements. The resulting predictive model will then be validated on external datasets to ensure generalizability and clinical utility. The long-term goal is to create a validated tool that can identify individuals at high risk for thyroid cancer, facilitating early interventions and personalized monitoring strategies.
Aims

Aim 1: Identify and Quantify Risk Factors Associated with Thyroid Cancer
Analyze the PLCO dataset to identify demographic, lifestyle, clinical, and genetic risk factors significantly associated with thyroid cancer development. Statistical and machine learning methods, such as logistic regression and feature selection algorithms, will be applied to pinpoint the most influential variables.
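
As one illustration of how a single candidate risk factor might be quantified before any multivariable modeling, the sketch below computes an unadjusted odds ratio with a Woolf (log-based) 95% confidence interval from a 2x2 table. The function name and the counts in the usage lines are hypothetical and are not drawn from PLCO; the project itself names logistic regression and feature selection, for which this is only a simplified stand-in.

```python
import math


def odds_ratio(exposed_cases, exposed_controls,
               unexposed_cases, unexposed_controls, z=1.96):
    """Return (odds_ratio, ci_lower, ci_upper) for a 2x2 table.

    Uses Woolf's method: the interval is built on the log odds ratio
    with standard error sqrt(1/a + 1/b + 1/c + 1/d).
    """
    a, b, c, d = (exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls)
    if min(a, b, c, d) <= 0:
        # The log-based interval is undefined when any cell is empty.
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only (not PLCO data).
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that includes 1 would mean the association is not distinguishable from no effect at that confidence level. An adjusted analysis would estimate the same quantity from a logistic regression coefficient instead.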

Aim 2: Develop a Predictive Model for Thyroid Cancer Risk
Using the identified risk factors, develop a model for assessing an individual's thyroid cancer risk. Machine learning techniques, including random forests, support vector machines, and neural networks, will be used to integrate multiple risk factors into a robust, personalized risk assessment.

Aim 3: Validate and Optimize the Predictive Model
Validate the model using internal cross-validation within the PLCO dataset and, if possible, external validation on other cancer datasets. Performance metrics such as accuracy, sensitivity, specificity, and the area under the ROC curve will be used to assess and optimize model performance.
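
The metrics named above can be sketched in plain Python as follows. This is a minimal illustration, not the project's evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example. The area under the ROC curve is computed through its rank interpretation, the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one half.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed score threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true, y_score):
    """AUC as the fraction of (case, non-case) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = case) and model scores, for illustration only.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.1]
print(classification_metrics(labels, scores))
print(roc_auc(labels, scores))
```

For a rare outcome, accuracy alone can look high even for a model that flags no cases, which is one reason to report sensitivity, specificity, and AUC alongside it.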

Aim 4: Assess Potential Clinical Applications of the Predictive Model
Evaluate the clinical utility of the predictive model by examining its potential integration into risk assessment protocols and screening guidelines. Conduct preliminary tests to determine how the model could inform early detection strategies and preventive measures for individuals at high risk for thyroid cancer.

Collaborators

Ganxun Wu, The Fourth Hospital of Hebei Medical University and Hebei Tumor Hospital