Principal Investigator
Name
Liang Ge
Degrees
B.Sc., MBA
Institution
Diannai (Shanghai) Biotech Co., Ltd.
Position Title
Director
About this CDAS Project
Study
NLST
Project ID
NLST-269
Initial CDAS Request Approval
Dec 27, 2016
Title
Application of machine learning in lung cancer prediction with the NLST and PLCO data
Summary
Lung cancer is a life-threatening disease. It is the most common cause of cancer-related death in men and the second most common in women. According to the SEER program, the five-year survival rate for lung cancer patients is 17.7% in the United States, and outcomes are on average worse in the developing world. However, early detection of lung cancer using low-dose computed tomography (LDCT) screening has been shown to reduce lung cancer mortality.
Efforts have been made to assess individual lung cancer risk and to predict the probability that a solitary pulmonary nodule is malignant. To date, over ten risk assessment models and over five nodule malignancy prediction models have been developed. However, most were developed with classical statistical methods. Artificial intelligence has recently been applied to aid diagnosis, but mainly in image analysis.
This study attempts to use machine learning methods, such as support vector machines and neural networks, to develop new models that assess individual cancer risk and predict nodule malignancy with the data from NLST and PLCO. The machine learning methods will use demographic, exposure, clinical and descriptive CT imaging information, and will not analyze CT images directly. Thus, the models are expected to be usable in the future without strict hardware requirements. We will also evaluate the performance of the machine learning-based models by comparing them with reported models. Finally, this study will assess the developed models in Chinese participants, to see whether the models fit the Chinese population and to explore the reasons for any bias.
Aims

1) Develop a machine learning-based risk estimator to identify the high-risk population.
2) Develop a machine learning-based model to predict whether a solitary nodule is lung cancer using a combination of demographic, clinical and descriptive imaging parameters.
3) Compare the performance of the machine learning-based models with other reported prediction models.
4) Assess the developed models in a cohort of Chinese participants.
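
Aim 3 compares the machine learning-based models with previously reported models. The summary does not name a comparison metric; one common choice for binary risk models is the area under the ROC curve (AUC). The following is a minimal sketch under that assumption. The function name `auc` and all labels and scores are hypothetical illustrations, not NLST or PLCO data and not the project's actual method.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    This equals the probability that a randomly chosen positive case
    (label 1) scores higher than a randomly chosen negative case
    (label 0), with ties counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = lung cancer, 0 = no lung cancer.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
# Assumed predicted risks from a machine learning-based model.
ml_scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.45, 0.3, 0.5]
# Assumed predicted risks from a previously reported model.
reference_scores = [0.8, 0.3, 0.4, 0.5, 0.2, 0.6, 0.1, 0.7]

if __name__ == "__main__":
    print("ML model AUC:       ", round(auc(labels, ml_scores), 3))
    print("Reference model AUC:", round(auc(labels, reference_scores), 3))
```

Scoring both models on the same participants keeps the comparison paired. A real evaluation would also need held-out data and confidence intervals, which this sketch omits.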

Collaborators

Jiaqi Qian MD, Diannai (Shanghai) Biotech Co., Ltd.
Ying Ding, Diannai (Shanghai) Biotech Co., Ltd.
Lijun Shang, Diannai (Shanghai) Biotech Co., Ltd.