Principal Investigator
Name
Jon Steingrimsson
Degrees
PhD
Institution
Brown University
Position Title
Assistant Professor of Biostatistics
About this CDAS Project
Study
NLST
Project ID
NLST-556
Initial CDAS Request Approval
Aug 22, 2019
Title
Generalizing prediction of lung cancer from the NLST cohort using deep learning
Summary
The National Lung Screening Trial (NLST) was designed to determine whether screening with low-dose CT reduces lung cancer mortality in high-risk populations compared with chest radiography. The NLST was conducted at 33 medical institutions by the Lung Screening Study (LSS) and the American College of Radiology Imaging Network (ACRIN): 10 LSS centers and 23 ACRIN centers, with up to three annual screenings (T0, T1, T2). The study found that low-dose CT screening resulted in a 20% decrease in lung cancer mortality compared with radiography. One question raised by the NLST analysis was whether populations with different risk profiles would see results from low-dose CT screening similar to those observed in the NLST. This question was partially addressed by comparing the baseline characteristics of the NLST data with characteristics recorded in the Tobacco Use Supplement portion of the US Census. The aim of this project is to directly assess the generalizability of deep-learning-based models for predicting the development of lung cancer, built using low-dose CT screenings and baseline covariates, to populations with different risk profiles. We will generalize the predictions of developing cancer from the ACRIN sites to the LSS sites, which allows us to evaluate how well CT-based prediction of lung cancer generalizes between differing populations. We will develop and evaluate statistical methods for generalizing the deep learning predictions to a population whose risk factors differ from those of the population used to build the model.
Aims

Aim 1: Develop a deep-learning-based predictive model using CT scans and baseline questionnaire covariates for patients at the ACRIN sites. This requires low-dose CT scans, questionnaire data, and screening and cancer information from the ACRIN site patients. The model will predict the probability of developing lung cancer within one year of the low-dose CT scan.

Aim 2: Generalize the deep learning predictions from the ACRIN site patients to the LSS site patients. The model will be fitted on the low-dose CT scans and relevant baseline questionnaire covariates. This will require low-dose CT scans, baseline questionnaire information, and screening and cancer data from both the ACRIN and LSS site patients. The cancer data for the LSS patients, the population we are generalizing to, will be held out as the ground truth and used to assess our model. We will consider different methods to generalize the predictions, such as deep learning algorithms that use the g-formula and inverse probability weighting.
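
The weighting idea named in Aim 2 can be illustrated with a small sketch. This is not the project's method or code: it is a toy example, assuming a single binary risk covariate, made-up counts, and a constant-output stand-in for a trained model. It reweights source (ACRIN-like) records by the inverse odds of source membership so that a performance measure, here the Brier score, reflects the covariate mix of the target (LSS-like) population. It also assumes that outcome risk given the covariate is the same in both populations. The function names are hypothetical.

```python
from collections import Counter

# Toy illustration only: all counts, rates, and names below are hypothetical.

def make_population(n_low, cases_low, n_high, cases_high):
    """Build (covariate, outcome) lists for a two-stratum population.

    x = 0 marks a lower-risk stratum, x = 1 a higher-risk stratum;
    y = 1 marks a cancer diagnosis within the prediction window.
    """
    x = [0] * n_low + [1] * n_high
    y = ([1] * cases_low + [0] * (n_low - cases_low)
         + [1] * cases_high + [0] * (n_high - cases_high))
    return x, y

def predict(x):
    """Stand-in for a trained model: returns the same risk for everyone."""
    return 0.1

def brier(preds, outcomes, weights=None):
    """Weighted mean squared error between predicted risk and outcome."""
    if weights is None:
        weights = [1.0] * len(preds)
    num = sum(w * (p - y) ** 2 for p, y, w in zip(preds, outcomes, weights))
    return num / sum(weights)

def inverse_odds_weights(source_x, target_x):
    """Weight each source record by n_target(x) / n_source(x).

    In the pooled sample this ratio equals the estimated odds
    P(target | x) / P(source | x) for a discrete covariate.
    """
    n_src = Counter(source_x)
    n_tgt = Counter(target_x)
    return [n_tgt[x] / n_src[x] for x in source_x]

# Source: 30% higher-risk. Target: 70% higher-risk.
# Outcome rates per stratum (5% and 20%) are identical by construction.
source_x, source_y = make_population(700, 35, 300, 60)
target_x, target_y = make_population(300, 15, 700, 140)

source_preds = [predict(x) for x in source_x]
target_preds = [predict(x) for x in target_x]
weights = inverse_odds_weights(source_x, target_x)

naive = brier(source_preds, source_y)                 # ignores covariate shift
transported = brier(source_preds, source_y, weights)  # uses source outcomes only
target_truth = brier(target_preds, target_y)          # uses held-out target outcomes

print(f"naive={naive:.3f} transported={transported:.3f} target={target_truth:.3f}")
```

In this constructed example the unweighted source Brier score is 0.086, while the weighted estimate and the held-out target value are both 0.134. The exact agreement is an artifact of the toy data, in which the per-stratum outcome rates match across populations. With CT images and many covariates, the membership odds would have to be estimated by a model rather than by stratum counts.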

Collaborators

Constantine Gatsonis, Brown University, Co-Investigator
Samantha Morrison, Brown University, Co-Investigator