Principal Investigator
Name
Rachel Jennings
Degrees
PhD Mathematics
Institution
UHG R&D
Position Title
Director of Data Science
Email
About this CDAS Project
Study
NLST
Project ID
NLST-529
Initial CDAS Request Approval
Jun 21, 2019
Title
Image Classification of LDCT for Computer-aided Lung Cancer Diagnosis
Summary
Lung cancer accounts for one-fourth of cancer-related deaths in the United States, whereas pancreatic, breast, and colorectal cancers each account for about one-tenth. Fortunately, chest screening can detect pre-cancerous lung nodules, both squamous dysplasia (SD) and atypical adenomatous hyperplasia (AAH), early enough for removal, which drastically reduces lung cancer mortality. Low-dose computed tomography (LDCT) has become the recommended procedure for screening high-risk patients (current or past smokers aged 55-74 with >30 pack-years and less than 15 years since quitting) annually or more frequently, because it detects nodules better than chest x-rays while exposing patients to less radiation than a full chest CT scan. However, the high false positive rate (95%) of LDCT interpretation leads to unnecessary biopsies and other tests.

Machine learning has recently advanced in image classification, in medicine, in cancer detection, and in work that combines all three (e.g., breast cancer), so the field seems ripe for classifying LDCT scans as either cancerous or non-cancerous. Following standard computer-aided diagnosis (CADx) practice, such classification results could aid medical professionals in determining a diagnosis and next steps. Using this dataset, we will train and optimize a classification model with the goal of maintaining high sensitivity while reducing the false positive rate, as measured on a hold-out set.
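
As a minimal sketch of how the hold-out evaluation described above could be scored, the following computes sensitivity and two false-positive measures from binary labels and predictions. The function name `screening_metrics` and the toy labels are assumptions for illustration and are not NLST data. Both the false positive rate, FP / (FP + TN), and the share of positive calls that are false, FP / (FP + TP), are reported because the two are easy to conflate; the summary does not state which definition its 95% figure uses.

```python
from typing import Dict, Sequence


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Score a binary classifier on a hold-out set (1 = cancerous, 0 = non-cancerous)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancers correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancers missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign cases flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign cases cleared

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a class is absent from the hold-out set.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_discovery_proportion": ratio(fp, fp + tp),
    }


# Toy hold-out labels and predictions (hypothetical values for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

On these toy values the classifier finds two of three cancers and flags two of five benign cases, so the two false-positive measures differ (0.4 versus 0.5).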

Literature and past work in this space suggest convolution-based approaches as a natural direction. We will apply neural networks designed from a partial differential equations (PDE) perspective, a subfield that has recently shown promise. We will build a PDE-based machine learning model, along with other approaches for comparison, to classify LDCT images of pulmonary nodules as cancerous or non-cancerous and so assist in the early diagnosis of lung cancer.
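
The summary does not specify the network architecture. To illustrate what a PDE perspective can mean, the sketch below implements one form that appears in the PDE-motivated network literature: a "parabolic" residual layer, y_next = y - h * K^T relu(K y), which is a forward Euler step of a diffusion-like equation. Everything here is an assumption for illustration, including the 1D signal, the fixed stencil (1, -2, 1), the step size h = 0.1, and all function names. A real model would use learned 2D or 3D convolution weights in a deep learning framework.

```python
import math
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def stencil_matrix(n: int, stencil: Sequence[float] = (1.0, -2.0, 1.0)) -> Matrix:
    """Dense n x n matrix applying a 3-tap 1D convolution with zero padding."""
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for offset, weight in zip((-1, 0, 1), stencil):
            j = i + offset
            if 0 <= j < n:
                K[i][j] = weight
    return K


def matvec(A: Matrix, x: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A: Matrix) -> Matrix:
    return [list(column) for column in zip(*A)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def parabolic_layer(y: Vector, K: Matrix, h: float) -> Vector:
    """One forward Euler step of dy/dt = -K^T relu(K y)."""
    update = matvec(transpose(K), relu(matvec(K, y)))
    return [yi - h * ui for yi, ui in zip(y, update)]


def norm(x: Vector) -> float:
    return math.sqrt(sum(v * v for v in x))


# Hypothetical 1D "feature" signal propagated through five layers.
y0 = [0.0, 1.0, 4.0, 1.0, 0.0, -2.0, 3.0, 0.0]
K = stencil_matrix(len(y0))
states = [y0]
for _ in range(5):
    states.append(parabolic_layer(states[-1], K, h=0.1))

norms = [norm(s) for s in states]
print(norms)
```

With this stencil the spectral norm of K is below 4, so any step size h <= 0.125 keeps the feature norm from growing between layers. That kind of stability argument is one motivation for designing layers from a PDE viewpoint.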
Aims

• Can a machine learning model reasonably classify LDCT images, providing a more accurate classification of cancerous pulmonary nodules to assist in the early diagnosis of lung cancer?
• How does the false positive rate of the constructed machine learning model compare to the current false positive rate of 95%?
• What factors, if any, seem capable of improving the predictions made by the image classifier (e.g., can preprocessing via deblurring help, are there other more interpretable approaches)? When incorporated into the classifier, how do these factors affect the false positive rate?
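
One way to make the comparisons in these aims concrete is to fix a target sensitivity and read off the false positive rate at the matching decision threshold. This lets two classifiers, or one classifier with and without a preprocessing step, be compared at the same operating point. The sketch below is illustrative only: the function name `threshold_for_sensitivity`, the 0.75 target, and the toy scores are assumptions and are not drawn from the study.

```python
from typing import Sequence, Tuple


def threshold_for_sensitivity(
    scores: Sequence[float],
    labels: Sequence[int],
    target_sensitivity: float,
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, false_positive_rate) for the highest
    threshold whose sensitivity reaches the target.

    A case is called positive when its score is >= threshold.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0.0 < target_sensitivity <= 1.0:
        raise ValueError("target_sensitivity must be in (0, 1]")

    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    # Scan candidate thresholds from strictest to most lenient.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        sensitivity = tp / positives
        if sensitivity >= target_sensitivity:
            return threshold, sensitivity, fp / negatives

    # Not reachable: the lowest threshold calls every case positive.
    raise RuntimeError("no threshold reached the target sensitivity")


# Hypothetical classifier scores and true labels (1 = cancerous).
scores = [0.95, 0.80, 0.70, 0.60, 0.40, 0.35, 0.20, 0.10]
labels = [1, 0, 1, 1, 0, 0, 1, 0]
threshold, sensitivity, fpr = threshold_for_sensitivity(scores, labels, 0.75)
print(threshold, sensitivity, fpr)
```

Running the same function on scores from a second model, with the same labels and target, gives a like-for-like false positive rate.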

Collaborators

1. Stephen Garth, UnitedHealth Group R&D, sgarth@savvysherpa.com
2. Derek Onken, Emory University, donken@emory.edu
3. Lars Ruthotto, Emory University, lruthotto@emory.edu
4. Wesley Carter, UnitedHealth Group R&D, WesleyCarter@savvysherpa.com
5. Hunter McCawley, UnitedHealth Group R&D, HunterMcCawley@savvysherpa.com
6. Jonathan Rolfs, UnitedHealth Group R&D, JonathanRolfs@savvysherpa.com
7. Alex Bacon, UnitedHealth Group R&D, SamBacon@savvysherpa.com
8. Laura Hebzynski, UnitedHealth Group R&D, LauraHebzynski@savvysherpa.com
9. Jessica Gronski, UnitedHealth Group R&D, JessicaGronski@savvysherpa.com
10. Ramira Victoria San Juan, UnitedHealth Group R&D, RamiraSanJuan@savvysherpa.com
11. Prajakta Patil, UnitedHealth Group R&D, PrajaktaPatil@savvysherpa.com