Implication of new models for predicting lung cancer screening selection based on CT scan and pathology image data

Principal Investigator

Name
Yuan Luo

Degrees
Ph.D.

Institution
Northwestern University

Position Title
Associate Professor

Email
yuan.luo@northwestern.edu

About this CDAS Project

Study
PLCO

Project ID
PLCO-710

Initial CDAS Request Approval
Dec 31, 2020

Title
Implication of new models for predicting lung cancer screening selection based on CT scan and pathology image data

Summary
Lung cancer is the leading cause of cancer death in the United States, accounting for approximately 25% of all cancer deaths. Smoking is the main cause of both small cell and non-small cell lung cancer and contributes to over 80% of lung cancer deaths in both men and women. It is therefore important to screen ever-smokers in order to reduce lung cancer occurrence and mortality.

Different risk models select different populations for screening, and model performance varies with the selection criteria. Building on existing models, we propose a new machine learning-based screening model for ever-smokers to better support health workers and reduce the lung cancer mortality rate. Our model takes CT scan and pathology image data from the NLST and PLCO databases as input. With enhanced feature selection and feature engineering, we expect it to outperform existing models.

Aims

1. Data exploration and existing model validation
We will explore the PLCO database across features such as gender, age, and ethnicity. Initial data visualization will be performed in R and Python. The performance of existing models on this dataset will also be evaluated.

2. Implication of new models based on machine learning
We will propose new models based on machine learning and feature engineering. The aim is to improve prediction accuracy and thereby lower lung cancer mortality after screening.

3. Model evaluation and comparison
Our model will be evaluated using common evaluation metrics. We will plot its performance against that of existing models.
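As an illustration of the comparison described in Aim 3, the sketch below scores two risk models on the same labeled cohort. The project does not name its metrics, so ROC AUC and sensitivity/specificity at a fixed threshold are assumptions here. The function names, the threshold, and the toy labels and scores are all hypothetical and are not drawn from PLCO or NLST. The comparison is printed rather than plotted to keep the example free of dependencies.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random
    non-case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are selected
    for screening. Assumes both classes are present in labels."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cohort: 1 = developed lung cancer, 0 = did not.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical risk scores from an existing model and a proposed model.
existing_scores = [0.9, 0.4, 0.3, 0.5, 0.2, 0.1, 0.35, 0.05]
new_scores = [0.9, 0.7, 0.6, 0.5, 0.2, 0.1, 0.35, 0.05]

for name, scores in [("existing", existing_scores), ("proposed", new_scores)]:
    auc = roc_auc(labels, scores)
    sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
    print(f"{name:9s} AUC={auc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

On real data, the same two functions could be applied to each model's predicted risks for a held-out set of participants, with the threshold chosen to match the screening eligibility rate under study.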

Collaborators

Ning Zhang, Northwestern University
Chengsheng Mao, Northwestern University
Yiming Li, Northwestern University
Yawei Li, Northwestern University
Saya Dennis, Northwestern University
Garrett Eickelberg, Northwestern University
Meghan Hutch, Northwestern University
Yikuan Li, Northwestern University
Hanyin Wang, Northwestern University