Leveraging Self-Supervised Vision Transformers for Non-Invasive Lung Cancer Biomarker Prediction from Full-Resolution Chest X-Rays
Lung cancer remains the leading cause of cancer-related mortality in the United States, with early detection and biomarker profiling critical for improving patient outcomes. While targeted therapies and immunotherapies have improved survival, determining eligibility for these treatments often requires invasive biopsies, which can be costly and impractical for certain populations. This project aims to develop and validate an artificial intelligence (AI)-based foundation model for non-invasive lung cancer biomarker prediction using full-resolution chest X-ray (CXR) imaging data. By leveraging large-scale repositories, including the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial and the Penn Medicine Biobank (PMBB), we will train a deep learning model to identify imaging signatures associated with lung cancer risk and molecular biomarkers, ultimately advancing early detection and personalized treatment strategies.
Current lung cancer risk models rely on demographic data, clinical risk factors, and radiologic annotations, but we hypothesize that CXRs contain additional predictive information beyond recognized imaging markers. While AI-based radiomics and deep learning (DL) methods have shown promise in analyzing medical imaging for tumor malignancy assessment, they often require extensive annotations. To overcome these challenges, we propose leveraging a self-supervised learning (SSL)-based vision transformer (ViT) trained on large-scale, unlabeled CXR datasets to capture robust, generalizable imaging representations. SSL has shown success in other medical imaging applications, such as cardiovascular disease prediction from retinal images and computational pathology for genetic mutation identification, yet its application in high-resolution CXR analysis remains largely unexplored.
Extracting relevant clinical features from full-resolution CXRs presents computational challenges due to large image sizes and redundant data. To address these issues, we will enhance ViTs with softmax-free or dilated attention mechanisms, allowing efficient processing of high-resolution CXRs while maintaining predictive accuracy. Our self-supervised framework will employ masked autoencoder pretraining to learn representations that transfer effectively to predicting lung cancer risk and survival outcomes.
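To make the attention idea concrete, the sketch below shows a toy single-head self-attention in which each query attends only to a dilated subset of keys, so the attended set shrinks from N tokens to roughly N/dilation. This is a minimal NumPy illustration of the general principle, not the specific mechanism we will implement; the identity projections and function name are simplifications.

```python
import numpy as np

def dilated_self_attention(x, dilation=4):
    """Toy single-head self-attention: each query attends only to keys
    at a fixed dilation stride, reducing the attended set from N tokens
    to roughly N / dilation (illustrative sketch, not the full method)."""
    n, d = x.shape
    q, k, v = x, x, x                         # identity projections for simplicity
    idx = np.arange(0, n, dilation)           # dilated key/value subset
    scores = q @ k[idx].T / np.sqrt(d)        # shape (n, n // dilation)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v[idx]

tokens = np.random.default_rng(0).normal(size=(16, 8))
out = dilated_self_attention(tokens, dilation=4)
print(out.shape)  # (16, 8)
```

In a full model, each head could use a different dilation rate so that together they cover both local and long-range context while each head's cost stays sub-quadratic.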
Specific Aim 1: Construct imaging signatures of lung cancer risk and patient survival.
We will develop a high-resolution ViT-based model that performs end-to-end analysis of full-resolution CXRs, utilizing over 1,000,000 CXR images from the PLCO trial and PMBB as training data. By incorporating dilated or softmax-free attention, which expands the attentive field without increasing computational complexity, the model will effectively capture risk-relevant imaging features. Masked autoencoding pretraining on these large datasets will allow the model to learn a generalized representation for predicting lung cancer risk and patient survival.
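The masking step at the heart of masked autoencoder pretraining can be sketched as follows: the image is split into non-overlapping patches, a random fraction is hidden, the encoder processes only the visible patches, and the decoder is trained to reconstruct the hidden ones. The toy NumPy snippet below illustrates only the patchify-and-mask step; the function name and parameters are illustrative, not drawn from our codebase.

```python
import numpy as np

def mae_mask_patches(image, patch=4, mask_ratio=0.75, seed=0):
    """Split a square grayscale image into non-overlapping patches and
    randomly hide a fraction of them, as in masked autoencoder
    pretraining: the encoder sees only the visible patches, and the
    decoder reconstructs the hidden ones (masking step only)."""
    h, w = image.shape
    patches = image.reshape(h // patch, patch, w // patch, patch)
    patches = patches.transpose(0, 2, 1, 3).reshape(-1, patch * patch)
    n = patches.shape[0]
    perm = np.random.default_rng(seed).permutation(n)
    n_masked = int(n * mask_ratio)
    masked_idx, visible_idx = perm[:n_masked], perm[n_masked:]
    return patches[visible_idx], patches[masked_idx], visible_idx, masked_idx

img = np.arange(64.0).reshape(8, 8)   # stand-in for a CXR crop
vis, hid, vi, mi = mae_mask_patches(img, patch=4, mask_ratio=0.75)
print(vis.shape, hid.shape)  # (1, 16) (3, 16)
```

Because the encoder processes only the visible ~25% of patches, this style of pretraining is especially attractive at full CXR resolution, where the total patch count is large.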
Specific Aim 2: Predict molecular biomarkers in non-small cell lung cancer (NSCLC) using radiological imaging features.
Prior studies have demonstrated that non-invasive radiomic features can predict gene expression, mutations, and PD-L1 status in lung cancer. Building on these findings, we will train deep learning models that jointly encode genetic biomarkers and CXR-derived imaging features, using an aggregation transformer to fuse the learned representations.
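As a toy illustration of attention-based aggregation, the sketch below pools a variable-length set of imaging feature tokens with a single learned query and maps the pooled vector to a biomarker probability via a linear head. All weights here are random stand-ins and the function name is hypothetical; a full aggregation transformer would stack several such attention layers and be trained end-to-end on labeled biomarker data.

```python
import numpy as np

def aggregate_and_score(tokens, query, head_w, head_b):
    """Toy attention pooling: one learned query attends over a
    variable-length set of imaging feature tokens, and a linear head
    maps the pooled vector to a biomarker probability (illustrative
    sketch; weights are untrained stand-ins)."""
    d = tokens.shape[1]
    scores = tokens @ query / np.sqrt(d)      # one attention score per token
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over tokens
    pooled = weights @ tokens                 # weighted average, shape (d,)
    logit = pooled @ head_w + head_b
    return 1.0 / (1.0 + np.exp(-logit))       # probability in (0, 1)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(10, 32))            # e.g. 10 region-level CXR features
p = aggregate_and_score(tokens, rng.normal(size=32), rng.normal(size=32), 0.0)
print(float(p))
```

An attention-pooling head of this kind accepts any number of input tokens, which matters when the number of extracted image regions varies across patients.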
All methods developed in this pilot grant will be implemented as open-source Python packages and validated on PLCO, PMBB, and additional datasets. Our results will establish a statistically rigorous foundation for leveraging AI-driven imaging features in non-invasive lung cancer biomarker profiling, advancing early detection and precision oncology strategies.
Prof. Daniel Truhn (dtruhn@ukaachen.de)
Prof. Christos Davatzikos (Christos.Davatzikos@pennmedicine.upenn.edu)
Prof. Sven Nebelung (snebelung@ukaachen.de)