Principal Investigator
Name
Erica Rutter
Degrees
Ph.D.
Institution
University of California Merced
Position Title
Assistant Professor
Email
About this CDAS Project
Study
PLCO (Learn more about this study)
Project ID
PLCOI-1943
Initial CDAS Request Approval
Sep 9, 2025
Title
Interpretable Deep Learning for Early Detection of Lung Disease Using the PLCO Dataset: Integrating Image Features with Large Language Models
Summary
Early detection of lung-associated diseases such as cancer and pulmonary infections is critical for improving patient outcomes. Deep learning techniques, particularly convolutional neural networks (CNNs), have shown promising performance in medical image classification tasks, but their black-box nature limits clinical trust and adoption. This project proposes to develop and evaluate an interpretable deep learning framework for the early detection of lung diseases using the imaging and associated clinical data from the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. The primary objective is to train CNN-based classifiers on PLCO chest radiographs to detect disease-relevant features. Special attention will be given to model interpretability and decision-making transparency. This will include the generation of class activation maps (CAMs) and other visualization techniques to highlight which features in the image are influencing predictions. To improve robustness and reduce spurious correlations, image preprocessing steps such as anatomical registration and segmentation will be incorporated to ensure the model focuses on biologically relevant structures rather than artifacts or background cues.
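To illustrate where a masking step of this kind would sit in the pipeline, the sketch below applies a crude Otsu-threshold lung-field mask to a radiograph stored as a 2-D NumPy array. It is a minimal placeholder, assuming grayscale input arrays, and is not the segmentation or registration method the project will actually use.

```python
# Minimal sketch of a lung-field masking step, assuming radiographs are available
# as 2-D grayscale numpy arrays. A crude Otsu threshold stands in for a proper
# segmentation model; it only shows where masking would sit in the pipeline.
import numpy as np
from scipy.ndimage import binary_fill_holes
from skimage.filters import threshold_otsu
from skimage.morphology import binary_closing, disk, remove_small_objects

def lung_field_mask(img: np.ndarray, min_region_px: int = 5000) -> np.ndarray:
    """Return a boolean mask of dark (air-filled) regions in a chest radiograph."""
    thresh = threshold_otsu(img)
    mask = img < thresh                                 # lungs appear dark on radiographs
    mask = binary_closing(mask, disk(5))                # smooth ragged boundaries
    mask = binary_fill_holes(mask)                      # close vascular markings inside lungs
    mask = remove_small_objects(mask, min_region_px)    # drop background speckle
    return mask

def apply_mask(img: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out everything outside the candidate lung fields."""
    return np.where(mask, img, 0.0)
```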

In a novel extension, this project will also examine the role of textual context by incorporating structured clinical annotations or large language model (LLM)-derived interpretations. The aim is to compare the image-based CNN outputs with natural language summaries and investigate whether textual and visual cues align, support, or contradict each other. While the primary emphasis is on lung pathology, data from additional pathology sites will be used to test how well the interpretability framework generalizes to other organ systems. This comparative analysis will provide insight into the robustness and transferability of the methods beyond the lung domain, an important consideration for future applications of interpretable AI in healthcare.

This work will contribute to the growing field of explainable AI in healthcare, providing tools and frameworks to ensure that models not only perform well but also offer interpretable, clinically meaningful insights. The broader impact includes increased clinician trust in AI systems, improved diagnostic accuracy, and the potential for earlier intervention in lung-related diseases. This research is conducted as part of a doctoral dissertation project and will contribute to peer-reviewed publications and open-source tools for the medical AI community.
Aims

Aim 1: Develop a deep learning model for multi-class classification of lung-associated diseases using PLCO imaging data. We will train CNN models on chest imaging data to distinguish between the disease classes present in the PLCO dataset. Training will be accompanied by rigorous evaluation (sensitivity, specificity, and AUC) across the different disease types. Preprocessing steps such as lung field segmentation and anatomical registration will be employed to minimize artifact-driven learning and focus the models on biologically meaningful regions.
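A minimal sketch of the per-class evaluation described in this aim, assuming a trained multi-class classifier that already produces softmax probabilities; the class names and array shapes are placeholders rather than the actual PLCO label set.

```python
# Hedged sketch of one-vs-rest evaluation for a multi-class chest-image classifier.
# y_true: integer class labels, y_prob: (n_samples, n_classes) softmax outputs.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def per_class_metrics(y_true: np.ndarray, y_prob: np.ndarray, class_names):
    """One-vs-rest sensitivity, specificity, and AUC for each disease class."""
    results = {}
    y_pred = y_prob.argmax(axis=1)
    for k, name in enumerate(class_names):
        true_k = (y_true == k).astype(int)
        pred_k = (y_pred == k).astype(int)
        tn, fp, fn, tp = confusion_matrix(true_k, pred_k, labels=[0, 1]).ravel()
        results[name] = {
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
            "auc": roc_auc_score(true_k, y_prob[:, k]),
        }
    return results
```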

Aim 2: Generate and evaluate interpretable outputs to visualize model decision-making. To improve clinical relevance, we will generate class activation maps (CAMs), saliency maps, and other visualization outputs that indicate the regions of an image that contribute most to model decisions. These maps will be compared across correctly and incorrectly classified images to understand when and why the model fails. We will also assess the consistency of highlighted features across different model architectures and training conditions. In addition, we will apply the same analysis to other pathology sites in the PLCO dataset, assessing whether the interpretability framework consistently identifies biologically relevant regions across organ systems and testing the generalizability of the methods beyond the lungs.
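The sketch below shows one common way such maps can be produced (Grad-CAM on a torchvision ResNet); the architecture, layer choice, and image preprocessing are assumptions for illustration, not the project's final design.

```python
# Minimal Grad-CAM sketch for a torchvision ResNet. The untrained resnet18 is a
# stand-in for a classifier trained on PLCO chest images.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None)
model.eval()

activations, gradients = {}, {}
def fwd_hook(_, __, output): activations["feat"] = output
def bwd_hook(_, grad_in, grad_out): gradients["feat"] = grad_out[0]

layer = model.layer4  # last convolutional block
layer.register_forward_hook(fwd_hook)
layer.register_full_backward_hook(bwd_hook)

def grad_cam(img: torch.Tensor, target_class: int) -> torch.Tensor:
    """img: (1, 3, H, W) tensor; returns an (H, W) heat map scaled to [0, 1]."""
    logits = model(img)
    model.zero_grad()
    logits[0, target_class].backward()
    # Weight each feature map by its mean gradient, then ReLU and normalize.
    weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations["feat"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=img.shape[-2:], mode="bilinear", align_corners=False)
    cam = cam.squeeze()
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```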

Aim 3: Integrate clinical text or LLM-generated interpretations to contextualize model predictions. We will explore how structured text data (radiology reports) and large language model (LLM)-generated summaries can support, augment, or contradict image-based AI predictions. We hypothesize that combining visual and textual modalities will enhance diagnostic performance and provide a mechanism to cross-validate model decisions. This aim will involve both quantitative comparison and qualitative analysis of how radiologists and models interpret similar features.
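As a rough illustration of the cross-validation idea, the sketch below tallies agreement between a CNN-predicted class and a label recovered from report text. The keyword lookup is a deliberately simple stand-in for an LLM-generated interpretation, and the class/keyword mapping is hypothetical rather than drawn from PLCO.

```python
# Illustrative comparison of image-based predictions against report-derived labels.
# CLASS_KEYWORDS is a hypothetical mapping; a real pipeline would substitute an
# LLM-generated or structured annotation for label_from_report().
from collections import Counter

CLASS_KEYWORDS = {
    "nodule": ["nodule", "mass"],
    "infiltrate": ["infiltrate", "consolidation", "opacity"],
    "normal": ["no acute", "unremarkable"],
}

def label_from_report(report: str):
    """Return the first class whose keywords appear in the report, else None."""
    text = report.lower()
    for label, words in CLASS_KEYWORDS.items():
        if any(w in text for w in words):
            return label
    return None

def agreement_summary(cnn_labels, reports):
    """Tally agree / disagree / no-text-label across paired predictions and reports."""
    counts = Counter()
    for cnn_label, report in zip(cnn_labels, reports):
        text_label = label_from_report(report)
        if text_label is None:
            counts["no_text_label"] += 1
        elif text_label == cnn_label:
            counts["agree"] += 1
        else:
            counts["disagree"] += 1
    return counts
```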

Collaborators

N/A