Principal Investigator
Elliot English
Position Title
Senior Data Scientist
About this CDAS Project
NLST (Learn more about this study)
Project ID
Initial CDAS Request Approval
May 28, 2015
Utilization of deep learning methodology for automated lung nodule detection
Lung cancer is the second most commonly diagnosed cancer in the United States, with 221,200 Americans expected to be diagnosed and 158,040 expected to die from the disease by the end of 2015 (NCI Cancer Statistics). Computer-assisted detection systems have attempted to improve radiologists' detection of lung nodules since at least the 1990s (Kobayashi et al., 1996). Because earlier detection of lung nodules is critical to improving treatment outcomes, developing more accurate systems that can assist the radiologist in detecting nodules on screening CT scans is of great importance in the global fight to reduce the number of deaths from lung cancer.

Deep learning is a branch of machine learning that uses deep architectures to learn layers of abstraction that represent the underlying data, which in this case is the specific architecture of the lung and its pathological state. When applied to healthcare images, deep learning can extract image features and make observations that may be difficult for the human eye to detect. Recent work by Shin et al. at the National Institutes of Health on the automated analysis of large sets of radiologic images reflects a growing belief among research organizations that algorithms based on large image datasets could have a very positive impact on cancer screening (Shin et al., 2015).

Our goal is to use deep learning methods pioneered by our collaborator, Dr. Richard Socher, together with a library of CT scans with known diagnoses from the NLST, to develop a reliable method for the automatic detection of lung cancer on CT imaging. Previous attempts at automatic detection of lung nodules (not based on deep learning) often suffer from high false-positive and false-negative rates, which make them cumbersome and time-consuming for radiologists to use. We believe that applying deep learning to CT scans for the detection of lung nodules has the potential to significantly decrease the false-positive and false-negative rates of computer-assisted detection. The NLST database provides access to a number of CT scans that is critical to the development of a deep learning-based model.

Aim #1: We aim to develop an algorithm that will detect lung nodules on CT imaging with high accuracy utilizing novel deep learning techniques.
Aim #2: We aim to develop predictions of malignancy for a given nodule on CT imaging.


Richard Socher, PhD; MetaMind
Brian Pierce, MS; MetaMind