Development and validation of an oral cancer risk prediction model
Principal Investigator
Name
Anil Chaturvedi
Degrees
PhD
Institution
National Cancer Institute
Position Title
Investigator
Email
About this CDAS Project
Study
PLCO
Project ID
PLCO-133
Initial CDAS Request Approval
Mar 9, 2015
Title
Development and validation of an oral cancer risk prediction model
Summary
In 2014, an estimated 28,030 new cases of oral cancer (5,850 deaths) occurred in the United States. Although oral cavity cancers (OCCs) and oropharyngeal cancers (OPCs) have traditionally been grouped together, a much smaller proportion of OCCs are associated with human papillomavirus (HPV) infection. As a result, HPV vaccination would not be expected to have a major impact on the burden of OCCs. The majority of OCC cases (~75%) are attributable to tobacco and heavy alcohol use. OCCs are ideal candidates for screening, secondary prevention, and early detection, given the amenability of the oral cavity to visual inspection and the availability of recognized precursor lesions, such as oral leukoplakia. However, evidence to assess the benefits and harms of OCC screening is currently lacking, as reflected in the most recent US Preventive Services Task Force recommendation statement on oral cancer screening. This lack of evidence arises from several gaps, including the identification of high-risk populations (risk stratification tools), methods for screening and intervention, and the natural history of OCCs. Ongoing studies by our group are addressing several of these questions. However, there are currently no risk stratification tools for OCCs.
Using information from the PLCO Cancer Screening Trial, we plan to validate an oral cancer risk prediction model, which we will create using information from the NIH-AARP study. Incidence of OCCs in the PLCO and AARP studies was relatively high and, given the availability of data on a wide range of risk factors, we consider these the ideal studies in which to create and validate our model. Eventually, answers to whether OCC screening is effective and who should be targeted will come from high-quality randomized controlled trials (RCTs) and risk prediction models. When investigators plan RCTs, results from our model will be useful to maximize efficiency (i.e., inform power/sample size calculations) and, assuming that OCC screening is found to be effective, to maximize benefits, minimize harms, and improve the cost-effectiveness of screening programs.
Cox regression models (with a competing risk of mortality component) will be used, and the best-fitting models will be selected based on standard measures (AIC and -2 log-likelihood). Two models will be constructed (one focusing on OCC risk and the other on mortality risk) and combined non-parametrically. Models will be built using the range of predictors available in the AARP and PLCO study data sets (e.g., age, gender, race, tobacco/alcohol use, diet, marital status, physical activity, education, and others). In our validation data set (screening and control arms of the PLCO Cancer Screening Trial), the ability of our model to discriminate between oral cancer cases and noncases will be evaluated according to the AUC. Model calibration will be assessed by comparing the expected (E) and observed (O) frequency of oral cancer cases (E/O ratios), both overall and across important subgroups (e.g., gender, levels of exposure).
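The two validation measures named above, AUC for discrimination and the E/O ratio for calibration, can be illustrated with a minimal Python sketch. It is not the project's analysis code. It assumes a simplified setting in which each participant has a model-predicted absolute risk over a fixed horizon and a binary outcome with complete follow-up (no censoring). The function names and the toy data are hypothetical.

```python
from typing import Sequence


def auc(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random case has a higher predicted risk than a
    random non-case; tied pairs count as one half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    noncases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                score += 1.0
            elif c == n:
                score += 0.5
    return score / (len(cases) * len(noncases))


def expected_over_observed(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """E/O ratio: sum of predicted risks (expected cases) divided by the
    number of observed cases. Values near 1 suggest good calibration."""
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed cases; E/O is undefined")
    return sum(risks) / observed


# Hypothetical toy data: predicted risks and observed outcomes (1 = case).
toy_risks = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]

print(f"AUC = {auc(toy_risks, toy_outcomes):.3f}")
print(f"E/O = {expected_over_observed(toy_risks, toy_outcomes):.3f}")
```

Applying `expected_over_observed` to the rows of one subgroup at a time (for example, by gender or exposure level) gives the subgroup-specific E/O ratios. With censored follow-up, as in an actual cohort analysis, time-dependent versions of both measures would be needed.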
Aims
1. To develop and validate an oral cavity cancer risk prediction model.
Collaborators
Joseph Tota (NCI/DCEG)
Hormuzd Katki (NCI/DCEG)
Noorie Hyun (NCI/DCEG)