Principal Investigator
Name
Li Cheung
Degrees
Ph.D.
Institution
Li Chien Cheung
Position Title
Earl Stadtman Investigator
Email
About this CDAS Project
Study
PLCO
Project ID
PLCO-1238
Initial CDAS Request Approval
May 30, 2023
Title
Two-phase designs with failure time processes subject to non-susceptibility
Summary
We have developed a statistical approach for efficiently selecting the individuals to include in biomarker studies, so that inference is optimized under sampling constraints. If biospecimens are scarce or the biomarker is expensive to measure, a biomarker study may be able to test only a subset of the cohort. Our approach can tell investigators which individuals in a cohort to include in that subset. A unique aspect of this work is that we consider this second-phase selection in the context of time-to-event data where susceptibility to disease may be affected by a genetic component.

As proof of principle of the new approach, we plan to examine the efficiency of these second-phase selections in the context of developing a lung cancer risk model in the PLCO. I have previously published lung cancer risk models, developed in the PLCO, that use demographic variables, family history of lung cancer, and smoking exposure; the next step is to incorporate polygenic risk scores (PRS) into these models. We will first fit a preliminary risk model to the entire PLCO cohort using demographic variables, family history of lung cancer, smoking exposure, and PRS (considering both PRS based on European samples and a PRS based on a recent multiple-ancestry meta-analysis). We will then compare the performance of risk models fitted to subsets of the PLCO cohort (e.g., a randomly chosen subset versus a subset chosen by our optimized approach) and examine the bias and variance of the resulting risk estimates. While demonstrating proof of principle of the new statistical approach, we will also take the first steps toward developing a new lung cancer risk model, which may require the new approach to select samples from other cohorts for genetic testing.
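
The subset-versus-full-cohort comparison described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the project's method: the cohort is simulated rather than drawn from PLCO, the susceptible fraction, exponential failure rate, and follow-up length are arbitrary illustrative values, the "risk estimate" is a crude event fraction rather than a fitted risk model, and only simple random second-phase sampling is shown (the optimized selection rule is not reproduced here). All function names are hypothetical.

```python
import math
import random
import statistics


def simulate_cohort(n, rng, p_susceptible=0.3, rate=0.05, follow_up=10.0):
    """Simulate (observed_time, event) pairs with a non-susceptible fraction.

    Assumed setup: non-susceptible individuals never fail and are censored
    at follow_up; susceptible individuals have exponential failure times.
    """
    cohort = []
    for _ in range(n):
        susceptible = rng.random() < p_susceptible
        t = rng.expovariate(rate) if susceptible else math.inf
        cohort.append((min(t, follow_up), t <= follow_up))
    return cohort


def event_fraction(sample):
    """Crude estimate: fraction with an observed event by end of follow-up."""
    return sum(event for _, event in sample) / len(sample)


def compare_random_subsets(cohort, subset_size, n_reps, rng):
    """Bias and variance of random-subset estimates relative to the full cohort."""
    full = event_fraction(cohort)
    estimates = [
        event_fraction(rng.sample(cohort, subset_size)) for _ in range(n_reps)
    ]
    bias = statistics.fmean(estimates) - full
    variance = statistics.variance(estimates)
    return full, bias, variance


rng = random.Random(2023)  # fixed seed so the sketch is reproducible
cohort = simulate_cohort(5000, rng)
full, bias, variance = compare_random_subsets(cohort, 500, 200, rng)
print(f"full-cohort estimate: {full:.4f}")
print(f"random-subset bias: {bias:+.4f}, variance: {variance:.6f}")
```

A comparison against an optimized design would replace `rng.sample` with the design's selection rule and an appropriately weighted estimator, then report the same bias and variance summaries.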
Aims

1. Develop an efficient two-phase sampling design for failure time processes subject to non-susceptibility. This portion of the work is completed.
2. Intensively examine the proposed design via simulation studies. This portion of the work is completed.
3. Develop "cure" mixture models for the risk of lung cancer using the PLCO cohort. These models will include demographic variables, family history of lung cancer, smoking exposure, and polygenic risk scores.
4. Examine how well models fitted to subsets of the PLCO cohort (e.g., randomly chosen versus selected by our optimized approach) perform relative to models fitted to the full PLCO cohort.
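
For readers unfamiliar with the "cure" mixture models named in Aim 3, one common textbook formulation is sketched below. The project description does not specify the model form, so the logistic susceptibility component, the proportional hazards latency component, and all symbols here are assumptions for illustration only.

```latex
\begin{align*}
S(t \mid x, z) &= 1 - \pi(z) + \pi(z)\, S_u(t \mid x), \\
\pi(z) &= \frac{\exp(\gamma^\top z)}{1 + \exp(\gamma^\top z)}, \\
S_u(t \mid x) &= S_0(t)^{\exp(\beta^\top x)}.
\end{align*}
```

Here $\pi(z)$ is the assumed probability of being susceptible given covariates $z$, $S_u(t \mid x)$ is the survival function among susceptible individuals given covariates $x$, and $S_0(t)$ is a baseline survival function. Because $S_u(t \mid x) \to 0$ as $t \to \infty$, the population survival function plateaus at $1 - \pi(z)$, the non-susceptible fraction.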

Collaborators

Fangya Mao, Biostatistics Branch, Division of Cancer Epidemiology and Genetics, NCI, NIH
Jianxin Shi, Biostatistics Branch, Division of Cancer Epidemiology and Genetics, NCI, NIH
Richard Cook, University of Waterloo