Comparison of Recurrent Neural Network (RNN) versus Survival Analysis for Lung Cancer Risk Prediction
In this project, we plan to:
1. Conduct simulation studies to examine the statistical properties of RNN-based prediction. To make these simulations realistic, we will generate datasets whose covariate and event-time distributions resemble those of the real National Lung Screening Trial (NLST) data.
2. Using 5-fold cross-validation in the NLST X-ray arm, we will fit an RNN model as well as traditional survival models (such as the Cox model described in Katki et al., JAMA 2016) to the training subsets, using only baseline characteristics as predictors. We will then validate the models on the validation subsets by AUC and by calibration plots of observed vs. expected ratios (O/E).
3. Using 5-fold cross-validation in the NLST low-dose CT arm, we will fit an RNN model as well as traditional survival models to the training subsets using baseline characteristics and CT results as predictors. We will then validate by AUC and by calibration plots of observed vs. expected ratios (O/E) using the validation subsets.
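The validation scheme in aims 2 and 3 can be illustrated with a minimal sketch. It is not the project's implementation: it assumes the outcome has been reduced to a binary indicator of lung cancer within a fixed follow-up window (so censoring is ignored), and all names (`k_fold_indices`, `auc`, `observed_expected_ratio`, `cross_validate`, and the `fit`/`predict` callables standing in for an RNN or Cox model) are hypothetical.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def auc(y_true, y_score):
    """AUC as the probability that a random case scores higher than a
    random non-case, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def observed_expected_ratio(y_true, y_prob):
    """O/E: observed number of events divided by the sum of predicted risks."""
    return sum(y_true) / sum(y_prob)


def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Fit on k-1 folds, then report AUC and O/E on the held-out fold.

    `fit(X_train, y_train)` returns a model and `predict(model, X_val)`
    returns predicted risks; both are placeholders for any risk model.
    """
    results = []
    for val in k_fold_indices(len(y), k, seed):
        held_out = set(val)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        risks = predict(model, [X[i] for i in val])
        y_val = [y[i] for i in val]
        results.append({
            "auc": auc(y_val, risks),
            "o_e": observed_expected_ratio(y_val, risks),
        })
    return results
```

An O/E ratio near 1 in each fold indicates that the predicted risks match the observed event count on average, while AUC measures how well the model separates cases from non-cases. A real analysis of time-to-event data would additionally need to account for censoring, which this sketch omits.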
Li Cheung National Cancer Institute
Qing Pan George Washington University
Guannan Chen George Washington University