Population-Based Screening for Endometrial Cancer: Human vs. Machine Intelligence.

Authors

Hart GR, Yan V, Huang GS, Liang Y, Nartowt BJ, Muhammad W, Deng J

Affiliations

Department of Therapeutic Radiology, Yale University, New Haven, CT, U.S.A.
Department of Statistics and Data Science, Yale University, New Haven, CT, U.S.A.
Department of Obstetrics, Gynecology and Reproductive Sciences, Yale University, New Haven, CT, U.S.A.

Abstract

Incidence and mortality rates of endometrial cancer are rising, prompting growing interest in endometrial cancer risk prediction and stratification to aid screening and prevention. Previous risk models have had moderate success, with areas under the curve (AUC) ranging from 0.68 to 0.77. Here we demonstrate a population-based machine learning model for endometrial cancer screening that achieves a testing AUC of 0.96. We train seven machine learning algorithms based solely on personal health data, without any genomic, imaging, or biomarker data or invasive procedures. The data come from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO). We find a random forest model that achieves a testing AUC of 0.96 and a neural network model that achieves a testing AUC of 0.91. We further compare both models with 15 practicing gynecologic oncologists and primary care physicians in the stratification of endometrial cancer risk for 100 women. Compared with the physicians, our random forest model is 2.5 times better at identifying above-average-risk women, with a 2-fold reduction in the false positive rate. Our neural network model is 2 times better at identifying above-average-risk women, with a 3-fold reduction in the false positive rate. Our machine learning models provide a non-invasive and cost-effective way to identify high-risk sub-populations who may benefit from early screening for endometrial cancer, prior to disease onset. Through statistical biopsy of personal health data, we have identified a new and effective approach to early cancer detection and prevention for individual patients.
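The two kinds of metric quoted in the abstract, AUC and the true/false positive rates used in risk stratification, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `auc` and `rates`, the 0.5 threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from PLCO.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rates(labels, scores, threshold):
    """True and false positive rates when every score >= threshold
    is flagged as above-average risk."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


# Toy data invented for illustration only (hypothetical, not PLCO).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.2, 0.1]

print("AUC:", auc(labels, scores))
print("TPR, FPR at 0.5:", rates(labels, scores, 0.5))
```

Under this reading, a statement such as "2.5 times better at identifying above-average-risk women" would compare true positive rates between model and physicians, and a "2-fold reduction in the false positive rate" would compare false positive rates; the exact definitions used in the paper are not given in the abstract.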

Publication Details

PubMed ID
33733200

Digital Object Identifier
10.3389/frai.2020.539879

Publication
Front Artif Intell. 2020; Volume 3: Article 539879

Related CDAS Studies

PLCO

Related CDAS Projects

PLCO-392: Stratifying Cancer Risks for Individuals Based on Deep Learning of PLCO Data (Jun Deng - 2018)