Extended Cancer Prediction from Lung CT Images
Principal Investigator
Name
Atilla Kiraly
Degrees
Ph.D.
Institution
Google
Position Title
Staff Software Engineer
Email
About this CDAS Project
Study
NLST
Project ID
NLST-594
Initial CDAS Request Approval
Nov 1, 2019
Title
Extended Cancer Prediction from Lung CT Images
Summary
Project Summary
Since 2013, the USPSTF has recommended annual screening for lung cancer with low-dose computed tomography (LDCT) in adults aged 55 to 80 years who have a 30 pack-year smoking history and currently smoke or have quit within the past 15 years. However, access to health care and affordability still appear to be barriers to lung cancer screening (Delmerico et al, 2014). Automating detection has the potential to increase efficiency and reduce costs. Rapid advances in computer vision and large-scale machine learning have made it possible to train computer algorithms to identify high-level concepts with an accuracy exceeding that of humans (Ioffe et al, 2015; Szegedy et al, 2015).
In our previous project (NLST-204) we developed an algorithm that could detect lung cancer with state-of-the-art performance, outperforming radiologists in a reader study (Ardila et al, 2019). Those predictions were for cancer diagnosed within 1-2 years of the image. In its recent publication in the Journal of Thoracic Oncology, the National Lung Screening Trial Research Team reviewed extended follow-up data beyond NLST’s 2009 cutoff to the end of 2014 for a portion of patients. This proposed project investigates whether it is possible to predict lung cancer diagnosis beyond 5 years for these patients using this follow-up data.
We propose to receive two sets of data:
1. A training set of patients with known follow-up until 2014. Our team will receive the follow-up results, i.e., whether the patient developed cancer and how many years after the last scan this occurred. The team will use these data to train and tune the existing model for this task.
2. A test set of patients with known follow-up until 2014. Google will send its model's predictions for these patients to IMS but will not receive their actual follow-up status. IMS will evaluate the algorithm's performance based on the submitted predictions and share the results with the Google team.
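The blinded evaluation in item 2 could look roughly like the sketch below, run on the IMS side. It is only an illustration under stated assumptions: the proposal does not name an evaluation metric, so area under the ROC curve (AUC) is assumed here, and the function name, the patient-ID keys, and the binary label encoding are all hypothetical.

```python
from typing import Dict


def auc_from_predictions(scores: Dict[str, float], labels: Dict[str, int]) -> float:
    """Compute AUC from submitted scores and held-out labels.

    scores: hypothetical mapping of patient ID -> predicted cancer risk
            (the only thing the modeling team submits).
    labels: hypothetical mapping of patient ID -> 1 if cancer was diagnosed
            in the follow-up window, else 0 (kept by the evaluator only).
    AUC is an assumed metric; the proposal does not specify one.
    """
    missing = set(labels) - set(scores)
    if missing:
        raise ValueError(f"no prediction submitted for: {sorted(missing)}")

    pos = [scores[pid] for pid, y in labels.items() if y == 1]
    neg = [scores[pid] for pid, y in labels.items() if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    # AUC equals the probability that a random positive case is scored
    # higher than a random negative case; ties count as half.
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy, made-up values purely to show the call; not study data.
    submitted = {"p1": 0.9, "p2": 0.8, "p3": 0.3, "p4": 0.1}
    held_out = {"p1": 1, "p2": 0, "p3": 1, "p4": 0}
    print(auc_from_predictions(submitted, held_out))
```

Only the aggregate score would need to be returned to the modeling team, which keeps the test-set follow-up status hidden as the item describes.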
Aims
Specific Aim 1: Evaluate the accuracy of Google's lung cancer model in predicting cancer diagnosed between 5 and 10 years after an initial LDCT scan.
Specific Aim 2: Investigate how the algorithm may be used in conjunction with human readers to provide an assisted read.
Specific Aim 3: Use deep learning techniques to augment the performance of the existing lung screening algorithm with the additional data.
Specific Aim 4: Model additional targets, such as mortality outcomes or nodule growth.
Collaborators
Daniel Tse, Google
Diego Ardila, Google
Atilla Kiraly, Google
Wenxing Ye, Google
Shravya Shetty, Google
Jie Yang, Google