Statistical models to predict future subject’s lung cancer risk: application to NLST and PLCO data
Principal Investigator
Name
Ping Hu
Degrees
ScD, SM
Institution
National Cancer Institute
Position Title
Mathematical Statistician
Email
About this CDAS Project
Study
PLCO
Project ID
PLCO-200
Initial CDAS Request Approval
Mar 29, 2016
Title
Statistical models to predict future subject’s lung cancer risk: application to NLST and PLCO data
Summary
The National Lung Screening Trial (NLST) compared two ways of detecting lung cancer: low-dose helical computed tomography (CT) and standard chest X-ray. The lung component of the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial was undertaken to determine whether screening with chest X-ray reduces lung cancer mortality. Using data from these two large randomized screening trials, each with well-defined groups of healthy people, we apply methods developed by Yong, Wei, et al. (2014) to create an optimal stratified prediction procedure for estimating the potential lung cancer risk of an individual subject.
We fit the training data with a set of regression models relating the outcome to its baseline covariates. For each fitted model, we create a scoring system for predicting potential outcomes and obtain the corresponding optimal stratification rule. Then, all the resulting stratification strategies are evaluated with a test set to select a final stratification system. Lastly, we obtain the inferential results of this selected stratification scheme with an independent holdout dataset.
We illustrate the proposed methods using the NLST chest X-ray group as the training and test sets, and the PLCO lung component as the independent validation set.
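The train / test / holdout workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure of Yong, Wei, et al. (2014): the data are simulated rather than drawn from NLST or PLCO, the candidate models are single-covariate least-squares fits, strata are formed by tertiles of the training scores instead of an optimal stratification rule, and the selection criterion is a simple squared prediction error. All names (`simulate`, `run_pipeline`, `pack_years`, `noise`) are hypothetical.

```python
import random

def fit_simple_regression(xs, ys):
    """Least-squares fit of y = a + b*x (a linear probability model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def make_cuts(scores, n_strata):
    """Cut points at equally spaced quantiles of the training scores."""
    s = sorted(scores)
    return [s[len(s) * k // n_strata] for k in range(1, n_strata)]

def assign_stratum(score, cuts):
    """Stratum index = number of cut points at or below the score."""
    return sum(score >= c for c in cuts)

def stratum_rates(strata, events, n_strata):
    """Observed event rate within each stratum."""
    rates = []
    for k in range(n_strata):
        members = [y for s, y in zip(strata, events) if s == k]
        rates.append(sum(members) / len(members) if members else 0.0)
    return rates

def simulate(n, rng):
    """Simulated subjects (assumption): risk rises with pack_years only."""
    rows = []
    for _ in range(n):
        pack_years = rng.uniform(0, 80)
        p = 0.02 + 0.004 * pack_years
        rows.append({"pack_years": pack_years,
                     "noise": rng.gauss(0, 1),
                     "event": int(rng.random() < p)})
    return rows

def run_pipeline(train, test, holdout, covariates, n_strata=3):
    best = None
    for cov in covariates:
        # Step 1: fit a candidate model on the training set.
        a, b = fit_simple_regression([r[cov] for r in train],
                                     [r["event"] for r in train])
        # Step 2: score the training subjects and derive a stratification rule.
        cuts = make_cuts([a + b * r[cov] for r in train], n_strata)
        strata = [assign_stratum(a + b * r[cov], cuts) for r in train]
        rates = stratum_rates(strata, [r["event"] for r in train], n_strata)
        # Step 3: evaluate the rule on the test set (mean squared error of
        # the stratum-level predicted risk against the observed outcome).
        err = sum((rates[assign_stratum(a + b * r[cov], cuts)] - r["event"]) ** 2
                  for r in test) / len(test)
        if best is None or err < best["test_error"]:
            best = {"covariate": cov, "a": a, "b": b,
                    "cuts": cuts, "test_error": err}
    # Step 4: report the selected rule on the independent holdout set.
    cov, a, b = best["covariate"], best["a"], best["b"]
    strata = [assign_stratum(a + b * r[cov], best["cuts"]) for r in holdout]
    best["holdout_rates"] = stratum_rates(
        strata, [r["event"] for r in holdout], n_strata)
    return best

rng = random.Random(0)
train, test, holdout = (simulate(4000, rng) for _ in range(3))
result = run_pipeline(train, test, holdout, ["pack_years", "noise"])
print(result["covariate"], result["holdout_rates"])
```

In this sketch the holdout set plays no part in fitting or selection, which mirrors the role assigned to the PLCO lung component in the text: it is used only to report results for the rule already chosen on the test set.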
Aims
The aim of this study is to develop a quantitative stratification procedure for predicting the potential lung cancer risk of individual subjects.
Collaborators
SuChun Cheng, Dana-Farber Cancer Institute
L.J. Wei, PhD, Harvard University