Principal Investigator
Brian Huang
University of Pennsylvania
Position Title
About this CDAS Project
NLST
Project ID
Initial CDAS Request Approval
Aug 18, 2020
Deep Learning for Prediction of Progression and Overall Survival of Lung Cancer using CT Examinations
Estimating the likelihood of progression and overall survival time is an important part of oncologic care for lung cancer. These estimates are crucial for making clinical treatment decisions and informing patient expectations. Currently, prognosis is estimated with TNM staging, which takes into account a combination of clinical, pathologic, and radiographic features. However, TNM staging can only classify patients into broad risk strata and does not capture the immense variability in progression and survival among patients with the same TNM stage. Additional work has examined other prognostic factors, such as performance status, weight loss, and comorbidities, but better prognostic methods are needed to capture inter-patient heterogeneity and provide more accurate, personalized predictions.

We propose a two-stage deep learning methodology to predict progression and overall survival of patients with malignant lung lesions using CT imaging examinations. The first stage is a convolutional neural network (CNN) that will be used to make binary predictions of progression and to extract deeper radiographic features from the CT images. We will use the recently developed CNN architecture EfficientNet, which was constructed by systematically scaling up existing CNNs in width, depth, and resolution. The second stage is a Random Survival Forest (RSF) model that uses the extracted radiographic features along with clinical features to predict overall survival and calculate a risk score for each patient. Both of these approaches have been individually validated in other settings: CNNs have been used to successfully classify lung lesions as malignant or benign, and RSFs have been used to model overall survival times in severe coronary artery disease. This is the first study to examine progression prediction for malignant lung lesions using a CNN on CT images and to subsequently use an RSF with the CNN-extracted features to predict overall survival. This work would provide a novel method for precisely assessing progression risk and overall survival in patients with lung cancer. In addition, results obtained with this methodology will provide a basis for incorporating similar machine learning approaches into other non-classification tasks that involve continuous outcomes.
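
The scaling of width, depth, and resolution mentioned above can be illustrated with a short sketch of EfficientNet-style compound scaling. The coefficients below are the ones reported in the EfficientNet paper for its baseline network. The summary does not state which EfficientNet variant or scaling setting this project will use, so the function name `compound_scale` and the choice of `phi` are illustrative assumptions only.

```python
# Compound-scaling coefficients reported in the EfficientNet paper,
# chosen there so that ALPHA * BETA**2 * GAMMA**2 is roughly 2.
ALPHA = 1.2   # depth multiplier base
BETA = 1.1    # width multiplier base
GAMMA = 1.15  # input-resolution multiplier base


def compound_scale(phi):
    """Return (depth, width, resolution, flops) multipliers for a given phi.

    phi is a single user-chosen coefficient: all three dimensions grow
    together instead of being tuned one at a time.
    """
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    # Compute cost grows linearly with depth and quadratically with
    # width and resolution, so it scales by about 2 ** phi.
    flops = (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi
    return depth, width, resolution, flops


# Hypothetical usage: multipliers for phi = 0 (baseline) through phi = 3.
for phi in range(4):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{f:.2f}")
```

Each increment of `phi` roughly doubles the compute budget, which is why a single coefficient is enough to trade accuracy against cost.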

Aim 1: Develop a Convolutional Neural Network (CNN) to accurately generate binary progression predictions from CT imaging for patients with detected malignant lung lesions.

Aim 2: Extract deeper features from CT imaging using the CNN and use them to develop a Random Survival Forest model of overall survival for patients.
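
To make the survival-modeling step in Aim 2 more concrete, the sketch below shows one building block commonly used inside Random Survival Forests: the Nelson-Aalen cumulative hazard estimate for the patients that fall into a single terminal node, and a mortality-style risk score obtained by summing that cumulative hazard over the observed event times. This is a simplified illustration under stated assumptions, not the project's implementation: it omits tree growing, bootstrapping, and averaging across trees, and the function names and patient data are hypothetical.

```python
from collections import Counter


def nelson_aalen(times, events):
    """Cumulative hazard at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if the patient was censored
    Returns a list of (time, cumulative_hazard) pairs in time order.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    chf = []
    total = 0.0
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        total += deaths[t] / at_risk
        chf.append((t, total))
    return chf


def mortality_score(chf):
    """Sum the cumulative hazard over event times; higher means higher risk."""
    return sum(h for _, h in chf)


# Hypothetical terminal node: five patients, follow-up in months,
# two of whom were censored (event flag 0).
leaf_times = [5, 8, 8, 12, 20]
leaf_events = [1, 1, 0, 1, 0]

leaf_chf = nelson_aalen(leaf_times, leaf_events)
leaf_risk = mortality_score(leaf_chf)
print(leaf_chf)
print(leaf_risk)
```

In a full forest, a new patient's feature vector (here, CNN-extracted features combined with clinical features) would be dropped down every tree, and the node-level estimates would be averaged to give that patient's ensemble risk score.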


Gang Cheng, M.D., University of Pennsylvania
Joseph Mammarappallil, M.D., Ph.D., Duke University
Yi Li, M.D., M.Sc, Fox Chase Cancer Center