NLST-560: Predicting Progression of Lung Lesions on CT scans - Approved Projects

Principal Investigator

Name

Sendhil Mullainathan

Degrees

PhD

Institution

University of Chicago

Position Title

Roman Family University Professor of Computation and Behavioral Science

sendhil.mullainathan@chicagobooth.edu

About this CDAS Project

Study

NLST

Project ID

NLST-560

Initial CDAS Request Approval

Sep 6, 2019

Title

Predicting Progression of Lung Lesions on CT scans

Summary

We are interested in the progression of lung lesions on CT scans in the National Lung Screening Trial (NLST).
Using time series data and machine learning techniques, we will try to predict from prior images how a lesion will appear on subsequent images. We are particularly focused on lesions that might be cancer and those that might be false positives for cancer. Our goal in studying this progression is to understand how these lesions evolve over time: which ones are dangerous and which are not. Applying machine learning to these data is an interesting way to understand such trajectories, and may help accelerate detection of the patterns and characteristics that lead to dangerous lesions.
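
As a rough illustration of the time series framing above, the sketch below groups scans by patient and emits (prior, subsequent) pairs that a progression model could be trained on. The `Scan` record, its field names, and the integer screening-year encoding are hypothetical and are not taken from the NLST data dictionary.

```python
from collections import defaultdict
from typing import NamedTuple


class Scan(NamedTuple):
    # All fields are hypothetical placeholders, not NLST variable names.
    patient_id: str
    screen_year: int   # assumed encoding: 0, 1, 2 for successive screening rounds
    image_path: str    # assumed location of the CT volume on disk


def consecutive_pairs(scans):
    """Return (prior, subsequent) scan pairs per patient, ordered by screening year."""
    by_patient = defaultdict(list)
    for scan in scans:
        by_patient[scan.patient_id].append(scan)

    pairs = []
    for patient_id in sorted(by_patient):
        ordered = sorted(by_patient[patient_id], key=lambda s: s.screen_year)
        # zip(ordered, ordered[1:]) yields each scan with the one that follows it;
        # patients with a single scan contribute no pairs.
        pairs.extend(zip(ordered, ordered[1:]))
    return pairs
```

Each pair gives a model the earlier image as input and the later image as the target it should predict.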

Specifically, our team has experience in unsupervised and generative learning models. We plan to train an algorithm to encode CT images: the model’s neural network translates those images into a set of variables, and then translates those variables into a compressed image. Because we will be using a generative model, it is important for us to have as many positive cancer instances as possible, as positive examples are effective at training such models. Given the need for a large number of positive examples, we will be requesting 15,000 patient images. While we recognize this is the limit of data available per project, we believe this amount of data will add maximum value to our project. An abundance of samples will also help us eventually develop a supervised model that can be trained on other medical datasets beyond cancer.
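
The encode-then-decode idea described above can be sketched with a tiny linear autoencoder. This is a toy illustration under stated assumptions, not the project's model: the "images" are synthetic 4-pixel vectors, the network has no nonlinearity, and every function name, the learning rate, and the latent size are choices made for this example only.

```python
import random


def encode(enc, x):
    """Map an image vector to a short list of latent variables."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in enc]


def decode(dec, z):
    """Map latent variables back to an image-sized vector."""
    return [sum(w * zk for w, zk in zip(row, z)) for row in dec]


def mean_squared_error(enc, dec, images):
    """Average squared reconstruction error per image."""
    total = 0.0
    for x in images:
        x_hat = decode(dec, encode(enc, x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(images)


def toy_images():
    """Synthetic 4-pixel 'images' built from two repeating patterns."""
    levels = [0.0, 0.5, 1.0]
    return [[a, b, a, b] for a in levels for b in levels]


def train_linear_autoencoder(images, n_latent=2, lr=0.02, epochs=1000, seed=0):
    """Fit encoder and decoder weights by stochastic gradient descent."""
    rng = random.Random(seed)
    d = len(images[0])
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_latent)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_latent)] for _ in range(d)]
    for _ in range(epochs):
        for x in images:
            z = encode(enc, x)
            x_hat = decode(dec, z)
            err = [xh - xi for xh, xi in zip(x_hat, x)]
            # Gradient of 0.5 * ||x_hat - x||^2 with respect to z,
            # computed before the decoder weights are changed.
            grad_z = [sum(dec[i][k] * err[i] for i in range(d))
                      for k in range(n_latent)]
            for i in range(d):
                for k in range(n_latent):
                    dec[i][k] -= lr * err[i] * z[k]
            for k in range(n_latent):
                for j in range(d):
                    enc[k][j] -= lr * grad_z[k] * x[j]
    return enc, dec
```

Calling `train_linear_autoencoder(toy_images())` returns the two weight matrices, and `mean_squared_error` can then be used to compare reconstruction quality before and after training. A model for real CT volumes would replace these linear maps with a deep network.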

For all of the data we receive, we would like all available associated outcome data.

Aims

-Using 15,000 patient images, train a generative model to predict how lung lesions will progress
-Understand evolution, patterns, and characteristics of dangerous and benign lung lesions
-Using these patient samples, begin developing a supervised model that can be trained on a broader range of medical datasets

Collaborators

Dr. Aytek Oto, University of Chicago Department of Radiology