Statistical models to predict a future subject’s lung cancer risk: application to NLST and PLCO data
We fit a set of regression models to the training data, relating the outcome to baseline covariates. For each fitted model, we create a scoring system for predicting potential outcomes and obtain the corresponding optimal stratification rule. All the resulting stratification rules are then evaluated on a test set to select a final stratification scheme. Lastly, we draw inferences about the selected scheme using an independent holdout dataset.
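As a rough illustration only, the four steps above can be sketched in Python on simulated data. Several pieces are assumptions made for this sketch and are not specified in the abstract: the candidate models are logistic regressions on different covariate subsets, the stratification rule is a simple quantile split of training scores (a stand-in for the optimal rule), and the test-set criterion is a Brier-type squared error. No NLST or PLCO data are used, and all function names are hypothetical.

```python
import math
import random


def predict_risk(beta, x):
    """Logistic risk score for one covariate vector x."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, steps=300, lr=0.5):
    """Gradient-descent logistic regression; returns [intercept, slopes...]."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            err = predict_risk(beta, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


def columns(X, cols):
    """Keep only the covariates used by one candidate model."""
    return [[x[j] for j in cols] for x in X]


def cutoffs(scores, k):
    """Assumed rule: k strata split at training-score quantiles."""
    s = sorted(scores)
    return [s[len(s) * i // k] for i in range(1, k)]


def stratum(score, cuts):
    """Stratum index 0..k-1: number of cutoffs at or below the score."""
    return sum(score >= c for c in cuts)


def event_rates(strata, y, k):
    """Observed event rate in each stratum (NaN if a stratum is empty)."""
    rates = []
    for g in range(k):
        ys = [yi for s, yi in zip(strata, y) if s == g]
        rates.append(sum(ys) / len(ys) if ys else float("nan"))
    return rates


def simulate(n, rng):
    """Simulated stand-in data: 3 covariates, the third is pure noise."""
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(3)]
        p = predict_risk([-2.0, 1.0, 0.5, 0.0], x)
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def run_pipeline(seed=0, k=3):
    rng = random.Random(seed)
    (Xtr, ytr), (Xte, yte), (Xho, yho) = (simulate(400, rng) for _ in range(3))
    candidates = {"x0": [0], "x0+x1": [0, 1], "x2": [2]}  # hypothetical models
    best = None
    for name, cols in candidates.items():
        # Steps 1-2: fit on training data, score, derive the stratification rule.
        beta = fit_logistic(columns(Xtr, cols), ytr)
        tr_scores = [predict_risk(beta, x) for x in columns(Xtr, cols)]
        cuts = cutoffs(tr_scores, k)
        tr_rates = event_rates([stratum(s, cuts) for s in tr_scores], ytr, k)
        # Step 3: evaluate on the test set (assumed squared-error criterion).
        te_strata = [stratum(predict_risk(beta, x), cuts) for x in columns(Xte, cols)]
        brier = sum((tr_rates[s] - yi) ** 2 for s, yi in zip(te_strata, yte)) / len(yte)
        if best is None or brier < best["brier"]:
            best = {"name": name, "cols": cols, "beta": beta, "cuts": cuts, "brier": brier}
    # Step 4: summarize the selected scheme on the independent holdout set.
    ho_strata = [stratum(predict_risk(best["beta"], x), best["cuts"])
                 for x in columns(Xho, best["cols"])]
    best["holdout_rates"] = event_rates(ho_strata, yho, k)
    return best
```

Calling `run_pipeline()` returns the name of the selected candidate, its cutoffs, its test-set criterion, and the observed event rate in each stratum of the holdout set. Because selection uses only the training and test sets, the holdout summaries are not affected by the model search.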
We illustrate the proposed methods using the NLST chest X-ray group as the training and test sets, and the PLCO lung component as the independent validation set.
The aim of this study is to develop a quantitative stratification procedure for predicting potential lung cancer risk for individual subjects.
SuChun Cheng, Dana-Farber Cancer Institute
L.J. Wei, PhD, Harvard University