Predict false positive rate for lung cancer screening tests in the NLST data
Our primary goal is to use NLST 3D voxel-wise image data (in DICOM files) and the machine learning models for image texture analysis developed by Peng Huang et al. (2013) to build a prediction model that identifies subjects at high risk of developing lung cancer. More specifically, we draw image motifs from selected regions of interest and apply random forest and support vector machine methods to build classifiers. We will further develop risk scores using both image and non-image covariates. The false positive rate of this image-based prediction will be estimated and compared with that of the conventional method of manually reading image films.
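As an illustration of the planned comparison, the sketch below computes a false positive rate from binary predictions. It assumes the usual definition, false positives divided by all cancer-free subjects; the summary does not state which definition the study will use. The function name `false_positive_rate` and the eight example labels are hypothetical and are not NLST data.

```python
def false_positive_rate(y_true, y_pred):
    """Return FP / (FP + TN), with 1 = lung cancer and 0 = no lung cancer."""
    pairs = list(zip(y_true, y_pred))
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # cancer-free, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # cancer-free, not flagged
    if fp + tn == 0:
        raise ValueError("no cancer-free subjects; the rate is undefined")
    return fp / (fp + tn)


# Hypothetical labels for eight subjects (illustration only, not study data).
truth = [0, 0, 0, 0, 1, 1, 1, 1]
classifier_calls = [0, 1, 0, 0, 1, 1, 0, 1]  # assumed image-based predictions
manual_calls = [1, 1, 0, 0, 1, 1, 1, 1]      # assumed manual film readings

fpr_classifier = false_positive_rate(truth, classifier_calls)
fpr_manual = false_positive_rate(truth, manual_calls)
print(fpr_classifier, fpr_manual)
```

Applying the same function to both sets of calls on the same subjects keeps the two rates directly comparable.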
We propose to study 200 lung cancer cases and 200 control participants. Of these, 100 lung cancer cases and 100 control participants will be used as the training set, and the remaining 100 lung cancer cases and 100 control participants will be used as the test set.
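One way to form the two sets is a random split carried out separately within cases and controls, sketched below. Random allocation is an assumption, since the summary does not say how participants will be assigned. The function `split_train_test` and the participant identifiers are hypothetical placeholders.

```python
import random


def split_train_test(case_ids, control_ids, n_per_group=100, seed=0):
    """Randomly assign n_per_group cases and controls to training, the rest to test."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    cases, controls = list(case_ids), list(control_ids)
    rng.shuffle(cases)
    rng.shuffle(controls)
    train = {"cases": cases[:n_per_group], "controls": controls[:n_per_group]}
    test = {"cases": cases[n_per_group:], "controls": controls[n_per_group:]}
    return train, test


# Placeholder identifiers standing in for 200 cases and 200 controls.
case_ids = [f"case_{i:03d}" for i in range(200)]
control_ids = [f"control_{i:03d}" for i in range(200)]

train, test = split_train_test(case_ids, control_ids)
print(len(train["cases"]), len(train["controls"]),
      len(test["cases"]), len(test["controls"]))
```

Splitting within each group keeps the 1:1 ratio of cases to controls in both sets, which matches the design stated above.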
CT texture analysis (CTTA) is a method of quantifying lesion heterogeneity based on the distribution of pixel intensities within a region of interest. The aims of this study are to investigate the ability of CTTA to distinguish different lung lesions, and to develop a predictive model that uses CTTA parameters to estimate the false positive rates of chest X-ray and low-dose CT in the NLST.
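To make the idea of intensity-distribution features concrete, the sketch below computes a few common first-order statistics (mean, standard deviation, skewness, excess kurtosis, and histogram entropy) from a flat list of pixel intensities. These are generic examples of texture parameters and are not necessarily the CTTA parameters or image motifs this study will use. The function `first_order_texture` and the sample values are hypothetical, and the entropy calculation assumes integer-valued intensities.

```python
import math
from collections import Counter


def first_order_texture(intensities):
    """Summarize the intensity distribution of a region of interest."""
    n = len(intensities)
    if n == 0:
        raise ValueError("region of interest is empty")
    mean = sum(intensities) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in intensities) / n)
    if sd > 0:
        skewness = sum((x - mean) ** 3 for x in intensities) / (n * sd ** 3)
        kurtosis = sum((x - mean) ** 4 for x in intensities) / (n * sd ** 4) - 3.0
    else:
        skewness = kurtosis = 0.0  # a constant region has no shape to describe
    # Shannon entropy (in bits) of the empirical intensity histogram.
    counts = Counter(intensities)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "sd": sd, "skewness": skewness,
            "kurtosis": kurtosis, "entropy": entropy}


# Hypothetical four-pixel region of interest (illustration only).
features = first_order_texture([10, 10, 20, 20])
print(features)
```

A more heterogeneous region spreads its intensities over more distinct values, which raises the standard deviation and entropy relative to a uniform region.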
Dr. Peng Huang, Associate Professor of Oncology Biostatistics, Johns Hopkins University
Yifei Sun, Doctoral student, Johns Hopkins University