About this Publication
Title
Survival Stacking Ensemble Model for Lung Cancer Risk Prediction.
Pubmed ID
39575799
Digital Object Identifier
Publication
Stud Health Technol Inform. 2024 Nov 22; Volume 321: Pages 155-159
Authors
Alonso E, Calle X, Gurrutxaga I, Beristain A
Affiliations
  • Vicomtech Foundation, Basque Research and Technology Alliance (BRTA), Donostia - San Sebastián, Spain.
  • Department of Computer Architecture and Technology, University of the Basque Country (UPV/EHU), Donostia - San Sebastián, Spain.
Abstract

Smoking is the most well-established risk factor for lung cancer (LC), responsible for approximately 85% of cases. The Lung Cancer Risk Assessment Tool (LCRAT), a key advancement in this field, predicts individual risk based on factors such as smoking habits, demographic details, personal and family medical history, and environmental exposures. This paper proposes a model that uses fewer features and a simplified stacking ensemble to improve on state-of-the-art performance, making it more accessible and easier to implement in routine healthcare practice. The data used in this work were derived from two cohorts in the United States: the National Lung Screening Trial (NLST) and the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. On the test set, our model and LCRAT achieve AUCs of 0.799 and 0.782, respectively. In terms of the proportion of positives captured within 50% of the population, they detect 0.766 and 0.754 of the cases, respectively. Combining different survival models in an ensemble enhances robustness by mitigating the weaknesses of individual models, which in turn increases the efficiency and generalizability of the resulting model.
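The two metrics quoted in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `auc` and `capture_fraction` and the toy scores and labels are hypothetical, and the sketch assumes that "50% of the population" means the half of individuals with the highest predicted risk, which the abstract does not state explicitly.

```python
def auc(scores, labels):
    """Rank-based AUC: probability that a random case scores higher
    than a random non-case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def capture_fraction(scores, labels, top_fraction=0.5):
    """Share of all cases found among the top_fraction of individuals,
    ranked by predicted risk (assumed reading of the abstract's metric)."""
    total_cases = sum(labels)
    if total_cases == 0:
        raise ValueError("need at least one case")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    k = int(len(ranked) * top_fraction)
    return sum(y for _, y in ranked[:k]) / total_cases


# Toy data, invented for illustration only (not from NLST or PLCO).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 1, 0, 0, 0]

print(auc(scores, labels))               # AUC on the toy data
print(capture_fraction(scores, labels))  # cases captured in the top half
```

Under this reading, a capture fraction of 0.766 would mean that screening the higher-risk half of the population reaches about 77% of eventual cases.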
