Inference in ROC curves for biomarkers measured in two phase nested case control studies

Principal Investigator

Name
Ziding Feng

Degrees
Ph.D.

Institution
Dept. of Biostatistics, The University of Texas MD Anderson Cancer Center

Position Title
Professor

Email
zfeng3@mdanderson.org

About this CDAS Project

Study
PLCO

Project ID
PLCO-282

Initial CDAS Request Approval
Jun 30, 2017

Title
Inference in ROC curves for biomarkers measured in two phase nested case control studies

Summary
The two-phase case–control sampling design is common in biomarker evaluation studies. In the first phase, a large sample that is representative of the target population is available and is used mainly for the clinical characteristics of the participants. In the second phase, because resources are limited, biomarker measurements are taken for only a subsample of the phase-one participants. Because this subsample is usually selected using matching criteria, it is a biased sample of the phase-one cohort. For example, in a lung cancer study, matching cases and controls on smoking status may be crucial. This biased sampling needs to be taken into account when the clinical questions of interest refer to the target population. While many researchers focus on the area under the ROC curve to assess the discriminatory ability of a biomarker, that measure does not have an appealing clinical interpretation. Clinicians are most often interested in the performance of a biomarker at high levels of sensitivity or specificity, to avoid under-diagnosis or over-diagnosis, respectively. This is driven by the seriousness of a false negative or by the invasiveness of the work-up that follows a false positive. We are currently developing statistical methodology to estimate ROC(t) at a given t, with corresponding confidence intervals, while accounting for the biased sampling scheme described above.
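For reference, the quantity ROC(t) mentioned in the summary can be written as follows. This is the standard textbook definition, not the project's own formulation: the symbols Y (marker value), D (disease indicator), and the survival functions S_D and S_{\bar D} are notation assumed here, as is the convention that larger marker values indicate disease.

```latex
% Assumed notation: Y = marker value, D = 1 for cases, D = 0 for controls.
\[
  S_{D}(c) = P(Y > c \mid D = 1), \qquad
  S_{\bar D}(c) = P(Y > c \mid D = 0),
\]
\[
  \mathrm{ROC}(t) = S_{D}\!\left( S_{\bar D}^{-1}(t) \right), \qquad t \in (0, 1).
\]
% ROC(t) is the sensitivity at the threshold whose false positive rate is t,
% that is, the sensitivity at specificity 1 - t.
```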

Aims

Our aim is to develop statistical methodology for estimating, and constructing confidence intervals for, the sensitivity at a given specificity (and vice versa) when biomarker measurements are taken within a biased sample (commonly due to matching) of a larger cohort (i.e., a two-phase nested case-control study). We require the data for the 11 statistically significant inflammation biomarkers for lung cancer presented in the paper "Circulating Inflammation Markers and Prospective Risk for Lung Cancer" by Meredith S. Shiels et al. (2013, JNCI, Vol. 105, Issue 24, pages 1871-1879). We want to illustrate our statistical approaches using this data set because the authors used a matched design. Based on these data and the clinical characteristics of the patients used by Meredith S. Shiels et al., we can obtain an ROC curve estimator that reflects the performance of these markers in the general population. This requires the clinical information from the full PLCO data, which is already available to us.
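As a rough illustration of the kind of estimate described in the aims, the sketch below computes sensitivity at a given specificity from a phase-two subsample using inverse probability weighting. This is a common generic approach to biased sampling, assumed here for illustration; it is not necessarily the methodology the investigators are developing. The function names, the `sampling_prob` argument (each subject's probability of selection into phase two, taken as known), and the convention that larger marker values indicate disease are all assumptions. Confidence intervals are not covered.

```python
import numpy as np


def weighted_quantile(values, weights, q):
    """Return the smallest value whose weighted empirical CDF is >= q."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    sorted_values = values[order]
    cdf = np.cumsum(weights[order]) / np.sum(weights)
    idx = np.searchsorted(cdf, q, side="left")
    return sorted_values[min(idx, len(sorted_values) - 1)]


def ipw_sensitivity_at_specificity(marker, is_case, sampling_prob, specificity):
    """Weighted estimate of sensitivity at a fixed specificity.

    Hypothetical helper: each phase-two subject is weighted by the inverse
    of an assumed-known probability of being sampled from phase one.
    """
    marker = np.asarray(marker, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    weights = 1.0 / np.asarray(sampling_prob, dtype=float)

    # Threshold: weighted quantile of the control marker distribution.
    threshold = weighted_quantile(marker[~is_case], weights[~is_case], specificity)

    # Sensitivity: weighted fraction of cases strictly above the threshold.
    case_weights = weights[is_case]
    above = marker[is_case] > threshold
    sensitivity = np.sum(case_weights * above) / np.sum(case_weights)
    return sensitivity, threshold


# Toy data (made up for illustration only): four controls, two cases.
marker = [1.0, 2.0, 3.0, 4.0, 2.0, 5.0]
is_case = [0, 0, 0, 0, 1, 1]
sampling_prob = [0.5, 0.5, 0.5, 0.5, 0.5, 1.0]
sens, thr = ipw_sensitivity_at_specificity(marker, is_case, sampling_prob, 0.75)
```

In terms of ROC(t), the value returned for a given specificity corresponds to ROC(1 - specificity). In practice the selection probabilities would have to be derived from the matching scheme and the phase-one cohort data.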

Collaborators

Professor Ziding Feng, Dept. of Biostatistics, The University of Texas MD Anderson Cancer Center.

Related Publications

Estimation and inference of predictive discrimination for survival outcome risk prediction models.
Li R, Ning J, Feng Z
Lifetime Data Anal. 2022 Jan 21
Semiparametric isotonic regression analysis for risk assessment under nested case-control and case-cohort designs.
Li W, Li R, Feng Z, Ning J
Stat Methods Med Res. 2020 Aug; Volume 29 (Issue 8): Pages 2328-2343