High Performance Computing Enabled Deep Learning for Lung Cancer Classification
Principal Investigator
Name
Derek Ni
Degrees
B.Sc., M.Sc., Ph.D.
Institution
F. Hoffmann-La Roche AG
Position Title
Advanced Analytics
Email
About this CDAS Project
Study
NLST
Project ID
NLST-312
Initial CDAS Request Approval
Jun 8, 2017
Title
High Performance Computing Enabled Deep Learning for Lung Cancer Classification
Summary
Lung cancer has been the most common cancer in the world for several decades, accounting for 1 in 5 of all cancer deaths. Worldwide, three people die from lung cancer every minute.
If diagnosed at an early rather than an advanced stage, patients with lung cancer are 13 times more likely to survive for five years. Screening aims to find the disease at that early stage. Screening tests may include laboratory tests to check blood and other fluids, genetic tests that look for inherited genetic markers linked to disease, and imaging tests that produce pictures of the inside of the body. These tests are typically available to the general population.
In lung cancer screening, individuals who have a high risk of developing lung cancer but no signs or symptoms of the disease undergo low-dose computed tomography (CT) scanning of the chest. When a low-dose CT scan of the chest is done for lung cancer screening, it's common to find small, abnormal areas (called nodules or masses) in the lungs, especially in current or former smokers. Most lung nodules seen on CT scans are not cancer.
The image assessments in use today flag many lung lesions as potentially cancerous that later turn out not to be cancer. A high false positive rate may lead to unnecessary anxiety, additional follow-up imaging, and interventional treatments. On the other hand, a high true positive rate in early-stage detection significantly reduces the lung cancer death rate, as stated above.
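To make the two rates concrete, the following minimal sketch computes the true positive rate and false positive rate from confusion counts. The function name `screening_rates` and the counts are hypothetical and chosen only for illustration; they are not NLST results or part of the project's described method.

```python
def screening_rates(tp, fp, tn, fn):
    """Return (true positive rate, false positive rate) from confusion counts.

    tp/fn: cancer cases flagged / missed by the assessment.
    fp/tn: non-cancer cases flagged / correctly cleared.
    """
    positives = tp + fn  # all cases that really are cancer
    negatives = fp + tn  # all cases that really are not cancer
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one cancer and one non-cancer case")
    return tp / positives, fp / negatives


# Hypothetical counts, for illustration only (not study data).
tpr, fpr = screening_rates(tp=45, fp=200, tn=800, fn=5)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

A classifier for this task would ideally raise the first number while lowering the second.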
Combining knowledge from statistics, mathematics, and computer science makes it possible to perform reliable lung cancer classification based on image data analysis. Image data analysis consists of segmentation, 3D visualization, feature extraction, and normalization. In this project we will also introduce deep learning methods, including convolutional neural networks, to address this problem.
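As a rough illustration of three of the image analysis steps named above (normalization, segmentation, and feature extraction), here is a minimal pure-Python sketch on a toy 2D slice. Everything in it is an assumption made for illustration: the intensity window of -1000 to 400 Hounsfield units, the fixed threshold, the helper names, and the toy values. It does not describe the project's actual pipeline, which would operate on full 3D CT volumes and feed a convolutional neural network.

```python
HU_MIN, HU_MAX = -1000.0, 400.0  # assumed intensity window, not from the source


def normalize_hu(slice_hu, lo=HU_MIN, hi=HU_MAX):
    """Normalization: clip each value to [lo, hi] and rescale to [0, 1]."""
    return [[(min(max(v, lo), hi) - lo) / (hi - lo) for v in row]
            for row in slice_hu]


def threshold_mask(slice_norm, threshold):
    """Segmentation (crude): mark pixels at or above a fixed threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in slice_norm]


def mean_intensity(slice_norm, mask):
    """Feature extraction: mean normalized intensity inside the mask."""
    values = [v for row, mrow in zip(slice_norm, mask)
              for v, m in zip(row, mrow) if m]
    return sum(values) / len(values) if values else 0.0


# Toy 3x3 slice: air-like background with a denser 2x2 region.
toy_slice = [[-1000, -900, -950],
             [-920, 40, 60],
             [-980, 50, 2000]]

norm = normalize_hu(toy_slice)
mask = threshold_mask(norm, threshold=0.6)
feature = mean_intensity(norm, mask)
print(mask, round(feature, 3))
```

In practice a learned model replaces the fixed threshold and hand-written feature, but the normalization step is typically kept so that inputs share a common intensity scale.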
Beyond the algorithm itself, a second goal of this project is to investigate a solution that performs better under large data volumes and high algorithmic complexity. With emerging high performance computing (HPC) technologies in hardware and software, the benefit they can bring to health care can be foreseen.
Aims
1. Train a deep learning model to automatically classify lung cancer from image data
2. Accelerate the training and classification processes with high performance computing technologies, to handle large data volumes and high algorithmic complexity
Collaborators
F. Hoffmann-La Roche AG
Poznan University of Technology