About this Publication
Title
Population-Based Screening for Endometrial Cancer: Human vs. Machine Intelligence.
Pubmed ID
33733200
Digital Object Identifier
Publication
Front Artif Intell. 2020; Volume 3: Pages 539879
Authors
Hart GR, Yan V, Huang GS, Liang Y, Nartowt BJ, Muhammad W, Deng J
Affiliations
  • Department of Therapeutic Radiology, Yale University, New Haven, CT, U.S.A.
  • Department of Statistics and Data Science, Yale University, New Haven, CT, U.S.A.
  • Department of Obstetrics, Gynecology and Reproductive Sciences, Yale University, New Haven, CT, U.S.A.
Abstract

Incidence and mortality rates of endometrial cancer are increasing, which has raised interest in endometrial cancer risk prediction and stratification to support screening and prevention. Previous risk models have had moderate success, with areas under the curve (AUC) ranging from 0.68 to 0.77. Here we demonstrate a population-based machine learning model for endometrial cancer screening that achieves a testing AUC of 0.96. We train seven machine learning algorithms solely on personal health data, without any genomic data, imaging, biomarkers, or invasive procedures. The data come from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO). Among these algorithms, we find a random forest model that achieves a testing AUC of 0.96 and a neural network model that achieves a testing AUC of 0.91. We then compare both models with 15 practicing physicians (gynecologic oncologists and primary care physicians) in stratifying endometrial cancer risk for 100 women. Our random forest model is 2.5 times better at identifying women at above-average risk, with a 2-fold reduction in the false positive rate. Our neural network model is 2 times better at identifying women at above-average risk, with a 3-fold reduction in the false positive rate. Our machine learning models provide a non-invasive and cost-effective way to identify high-risk sub-populations who may benefit from early screening for endometrial cancer, prior to disease onset. Through statistical biopsy of personal health data, we have identified a new and effective approach to early cancer detection and prevention for individual patients.
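
The abstract reports model quality as a testing AUC. As a rough illustration of what that number measures, the minimal sketch below computes AUC as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as half. The function name `auc` and the toy labels and scores are assumptions made up for this example; they are not PLCO data and this is not the authors' code.

```python
def auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for a case, 0 for a non-case.
    scores: model risk scores, higher meaning higher predicted risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 cases and 5 non-cases with invented risk scores.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]

toy_auc = auc(toy_labels, toy_scores)
print(round(toy_auc, 3))  # 14 of 15 case/non-case pairs are ordered correctly -> 0.933
```

Under this reading, an AUC of 0.96 means that in roughly 96 of 100 randomly drawn case/non-case pairs the model scores the case higher, while 0.5 corresponds to chance-level ranking.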
