Subgroup Identification based on Lung Cancer Risk Prediction Model
Principal Investigator
Name
Jon Steingrimsson
Degrees
Ph.D.
Institution
Brown University
Position Title
Assistant Professor of Biostatistics
About this CDAS Project
Study
NLST
Project ID
NLST-701
Initial CDAS Request Approval
Aug 5, 2020
Title
Subgroup Identification based on Lung Cancer Risk Prediction Model
Summary
Machine learning models are often assumed to apply to the whole population they were trained on. However, it is natural to ask whether the overall prediction performance is driven up by a sub-population for which the model predicts extremely well. Equivalently, we wish to know whether there is a subgroup that is poorly represented by the model. For example, one might ask whether a model built primarily on data from young people is representative of the older population. The goal of this project is to discover and analyze potential subgroups of the overall population for risk prediction models built using lung cancer screening data.
Aims
1. Develop a data-driven algorithm that identifies subgroups with differential prediction accuracy.
2. Apply the algorithm to discover potential sub-populations of the NLST data that are not well represented by risk prediction models built using previous lung cancer screening data. Perform a detailed analysis of any subgroups identified.
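To make the idea of "subgroups with differential prediction accuracy" concrete, the sketch below shows one simple way such a search could work. It is not the project's algorithm, which is not described here. It assumes a per-person squared-error (Brier) loss and a single threshold split on one covariate. The data are simulated, and the covariate names (age, pack_years), the risk values, and the constant-risk model are hypothetical choices made only for illustration.

```python
import random

def brier_losses(y_true, p_hat):
    """Per-person squared error between the 0/1 outcome and the predicted risk."""
    return [(y - p) ** 2 for y, p in zip(y_true, p_hat)]

def mean(xs):
    return sum(xs) / len(xs)

def best_split(X, losses, min_size=20):
    """Search every covariate and threshold for the split whose two sides
    differ most in mean loss. Returns (gap, column, threshold) or None."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [l for row, l in zip(X, losses) if row[j] <= t]
            right = [l for row, l in zip(X, losses) if row[j] > t]
            if len(left) < min_size or len(right) < min_size:
                continue  # skip splits that leave a very small subgroup
            gap = abs(mean(left) - mean(right))
            if best is None or gap > best[0]:
                best = (gap, j, t)
    return best

# Simulated illustration (all values are assumptions, not NLST data).
random.seed(0)
X, y, p = [], [], []
for _ in range(1000):
    age = random.randint(55, 74)
    pack_years = random.randint(30, 100)
    true_risk = 0.1 if age < 65 else 0.5  # risk depends on age only
    X.append((age, pack_years))
    y.append(1 if random.random() < true_risk else 0)
    p.append(0.1)  # hypothetical model that ignores age

losses = brier_losses(y, p)
gap, col, threshold = best_split(X, losses)
print("split column:", col, "threshold:", threshold, "loss gap:", round(gap, 3))
```

In this simulation the hypothetical model is well calibrated for the younger group and badly miscalibrated for the older group, so the search should pick out an age threshold. A practical method would also need to handle repeated splits, multiple testing, and honest (held-out) estimation of subgroup performance, none of which this sketch attempts.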
Collaborators
Constantine Gatsonis, Brown University
Ruotao Zhang, Brown University