Principal Investigator
Name
Kevin Lyman
Degrees
BS
Institution
Enlitic
Position Title
COO & Lead Scientist
About this CDAS Project
Study
NLST
Project ID
NLST-98
Initial CDAS Request Approval
Dec 2, 2014
Title
Automatic identification and classification of lung abnormalities via deep learning and machine learning
Summary
Algorithms that “read” biomedical images in search of abnormalities hold the promise of more accurate, faster and more accessible diagnoses, and better patient outcomes. However, their success to date has been limited by the need to laboriously “hand engineer” computational image features that characterize the steps in a radiologist’s process. Some of these features take years to create, and some algorithms use thousands of features. In 2012, an approach called “deep learning” transformed the field of computer vision by using large neural networks, run on GPUs, that automatically learn the relevant features. We propose the creation of a deep neural network for the lung. Such networks would learn the appearance of lung tissue and structure, the variation across people, and the variation of abnormalities. We would also build algorithms that detect lung nodules and estimate their likelihood of malignancy.

Past research has shown that machine learning techniques applied to large numbers of features can be effective in identifying clinically relevant features of lung tumours (Aerts et al, 2014). Other research has shown that deep learning can be effective in generating features for analysing bone lesions (Roth et al, 2014). We plan to test a combination of these approaches: using machine learning to analyse features built with deep learning, in order to identify clinically relevant factors in lung tumour scans.
Aims

Aim 1: Create a deep neural network of human lungs
One limitation of deep learning is that it generally requires a large number of images. NLST provides the largest lung image database and thus, we believe, will be an essential component of a deep learning transformation of the field. We plan to create one convolutional neural network (CNN) for lung CT and another for lung pathology. The networks will serve multiple goals (see Aims 2 and 3), and can also be cross-linked with secondary data sources to serve those goals. We also plan to visualize the networks to learn their hierarchy of features and compare it with the features used by human radiologists and pathologists.
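
As a rough illustration of what a single CNN layer computes, the sketch below applies one filter to a toy image, followed by a ReLU and 2x2 max pooling. It is a minimal sketch and not the proposed network: the 6x6 image, the centre-surround kernel and all function names are assumptions made for illustration. In particular, the kernel here is picked by hand, whereas in a trained CNN the kernel weights would be learned from the images.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, the core CNN layer operation."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Set negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Keep the strongest response in each non-overlapping 2x2 block."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical toy "scan": a bright 2x2 blob in the upper-left region.
image = [[0.0] * 6 for _ in range(6)]
for r, c in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    image[r][c] = 1.0

# Hand-picked centre-surround kernel (an assumption; a CNN would learn this).
kernel = [[-1, -1, -1],
          [-1,  8, -1],
          [-1, -1, -1]]

pooled = max_pool2x2(relu(conv2d(image, kernel)))
print(pooled)  # the only non-zero response is in the upper-left cell
```

Inspecting `kernel` and the intermediate feature maps is the simplest form of the visualization mentioned above: it shows which image pattern a given filter responds to and where.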

Aim 2: Automatic identification of suspect regions and estimation of likelihood of malignancy
In order for our algorithm to learn, we require annotations on a subset of the images. If these annotations are unavailable from NLST, we may create our own for the purposes of this study. In addition to generating our own annotations, we also plan to use unannotated images to create deep learning features using unsupervised training. We will then analyse these features using machine learning.
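
The two-stage idea in this aim (learn features from images without annotations, then apply supervised machine learning to those features on a small annotated subset) can be sketched as follows. This is a minimal sketch under loud assumptions: k-means stands in for the unsupervised deep learning step, logistic regression stands in for the machine learning step, and the 4-value "patches" are synthetic numbers, not NLST data. All function and variable names are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Unsupervised stage: learn k centroids from vectors without labels."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode(point, centroids):
    """Learned features: negative distance from the point to each centroid."""
    return [-math.dist(point, c) for c in centroids]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(features, labels, lr=0.5, epochs=200):
    """Supervised stage: logistic regression fitted by stochastic gradient descent."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(x, w, b):
    """Estimated probability that feature vector x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Synthetic stand-in data (assumption): dim patches ~0.2, bright patches ~0.8.
rng = random.Random(42)


def make_patch(mean):
    return [rng.gauss(mean, 0.05) for _ in range(4)]


unlabeled = [make_patch(rng.choice([0.2, 0.8])) for _ in range(200)]
train = [(make_patch(0.8), 1) for _ in range(10)] + \
        [(make_patch(0.2), 0) for _ in range(10)]
test = [(make_patch(0.8), 1) for _ in range(25)] + \
       [(make_patch(0.2), 0) for _ in range(25)]

# Stage 1 uses only unlabeled patches; stage 2 uses the small labeled set.
centroids = kmeans(unlabeled, k=2)
w, b = train_logistic([encode(p, centroids) for p, _ in train],
                      [y for _, y in train])

correct = sum((predict(encode(p, centroids), w, b) >= 0.5) == bool(y)
              for p, y in test)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the split is that the feature learner sees all 200 unlabeled patches while the classifier needs only 20 labeled ones, which mirrors the situation in which annotations exist for only a subset of the images.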

Aim 3: Automatic estimation of prognosis
We will use the patient diagnoses and outcome data to automatically estimate patient prognosis. We will use a technique similar to that in Beck et al (2011), but with deep learning features in addition to domain-specific features, and applied to both radiology and pathology.

Collaborators

Jeremy Howard, Enlitic