Principal Investigator
Name
Gorkem Turgut Ozer
Degrees
Ph.D.
Institution
University of New Hampshire
Position Title
Assistant Professor
Email
About this CDAS Project
Study
PLCO (Learn more about this study)
Project ID
PLCO-1530
Initial CDAS Request Approval
Apr 12, 2024
Title
Using Explainability Methods to Unpack Disease Predictions for Individual-level Inference: An Empirical Study of the Lung Cancer Prediction Problem
Summary
Explainable Artificial Intelligence (XAI) has become an important field of study as machine learning models are increasingly deployed in high-stakes domains such as healthcare. Two of the most prominent methods in XAI are Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), both of which aim to unpack black-box ensemble methods. Our project aims to demonstrate the value of XAI in generating insights at the patient level, while retaining the accuracy and error-minimization benefits of predictive methods, using data from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. The PLCO Cancer datasets are widely studied, hence they provide an opportunity to build on prior work and to extend and enrich its findings.
More specifically, our study will focus on the PLCO Lung data. Using the PLCO lung dataset, we plan first to employ a Cox Proportional Hazards model as our main method to establish a baseline inference on the data. Then, we will develop Survival Random Forest (SRF) and Survival Gradient Boosting (SGB) models to enhance predictive power. However, this increase in predictive ability comes at the cost of reduced interpretability, as these methods offer only feature importance scores. To address this, we will extend our analysis with interpretable methods such as LIME and SHAP, thereby generating insights into the predictions at both the population and individual levels. Through this exploration, we seek to balance interpretability with predictive accuracy, demonstrating the value of XAI methods in improving model transparency, especially in critical decision-making scenarios.
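As a minimal sketch of how the predictive accuracy of the Cox, SRF, and SGB models might be compared, the standard metric in survival analysis is Harrell's concordance index (C-index). The data and risk scores below are hypothetical illustrations, not PLCO values; in practice a library such as scikit-survival provides this computation, but the pure-NumPy version shows the idea:

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C-index: among comparable pairs (one subject is
    observed to fail before the other's follow-up time), the fraction
    where the model assigns higher risk to the earlier failure."""
    n = len(time)
    concordant = 0.0
    comparable = 0
    for i in range(n):
        for j in range(n):
            # pair (i, j) is comparable only if subject i's event was
            # observed (not censored) and occurred strictly earlier
            if event[i] == 1 and time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties count half
    return concordant / comparable

# hypothetical toy data: follow-up times, event indicator
# (1 = event observed, 0 = censored), and model risk scores
time  = np.array([5.0, 10.0, 12.0, 3.0])
event = np.array([1, 0, 1, 1])
risk  = np.array([0.9, 0.1, 0.3, 0.8])
print(concordance_index(time, event, risk))  # -> 0.8 (4 of 5 pairs concordant)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is what makes it a natural yardstick for the accuracy half of the interpretability/accuracy trade-off described above.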
Aims

1. Demonstrate the value of XAI in generating patient level insights while preserving predictive accuracy using PLCO lung data.
2. Provide a foundational example of employing ensemble models alongside XAI techniques to conduct survival analysis on PLCO lung data.
3. Advocate for the application of XAI methods, such as LIME and SHAP, to achieve a balance between interpretability and accuracy in high-stakes domains, including healthcare.
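To illustrate the individual-level explanations the aims refer to, SHAP attributes a model's prediction for one patient additively across features via Shapley values. The toy linear risk model and baseline below are hypothetical (a linear model is used only because its exact Shapley values are easy to verify by hand); the SHAP library approximates this for large models, but for a handful of features the values can be computed exactly:

```python
import itertools
import math

def exact_shapley(f, x, baseline):
    """Exact Shapley values for model f at point x: a feature 'absent'
    from a coalition is replaced by its baseline value. Feasible only
    for small feature counts (cost grows as 2^n)."""
    n = len(x)
    phi = [0.0] * n
    features = list(range(n))
    for i in features:
        others = [j for j in features if j != i]
        for k in range(n):  # coalition sizes among the other features
            for S in itertools.combinations(others, k):
                # Shapley kernel weight |S|! (n - |S| - 1)! / n!
                w = (math.factorial(len(S)) * math.factorial(n - len(S) - 1)
                     / math.factorial(n))
                with_i = [x[j] if (j in S or j == i) else baseline[j]
                          for j in features]
                without_i = [x[j] if j in S else baseline[j]
                             for j in features]
                phi[i] += w * (f(with_i) - f(without_i))
    return phi

# hypothetical linear risk model; for a linear model each Shapley
# value is exactly coefficient * (x - baseline)
def toy_risk(v):
    return 2.0 * v[0] + 0.5 * v[1] - 1.0 * v[2]

phi = exact_shapley(toy_risk, [1.0, 4.0, 2.0], [0.0, 0.0, 0.0])
print([round(p, 6) for p in phi])  # -> [2.0, 2.0, -2.0]
```

The additivity property (the attributions sum to the prediction minus the baseline prediction) is what lets a clinician read off how much each risk factor contributed to one patient's score, which is the patient-level inference the project targets.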

Collaborators

Di Hu, Ph.D. student at UC Irvine