Foundation LDCT model for lung cancer prediction

Principal Investigator

Name
Daan van den Broek

Degrees
Ph.D.

Institution
Netherlands Cancer Institute

Position Title
Head of the department for Laboratory Medicine

Email
da.vd.broek@nki.nl

About this CDAS Project

Study
NLST

Project ID
NLST-1512

Initial CDAS Request Approval
May 4, 2026

Title
Foundation LDCT model for lung cancer prediction

Summary
We propose to develop and evaluate an LDCT foundation model for lung cancer screening. The model will support binary detection (cancer present or absent per scan), risk prediction (e.g., near-term to medium-term cancer risk), or both, to assist in screening and follow-up decisions. We will use NLST low-dose CT and, where appropriate, other compatible LDCT datasets. A central focus is early-stage disease and performance on indeterminate and positive findings, so that the model can help clinicians validate and triage findings rather than serving only as a first-line triage tool. We will follow data selection practices used in prior NLST screening studies so that training and evaluation are comparable to the literature. The foundation model will be benchmarked against published LDCT screening models using the same or closely aligned outcomes and metrics, with the goal of achieving competitive performance through improvements in how the model is developed.

Aims

* Aim 1: Curate NLST LDCT data for foundation model development.

We will assemble a curated LDCT cohort from NLST using selection criteria consistent with prior lung cancer screening work (e.g., series and quality filters commonly used in the literature). We will link imaging to the outcomes and basic covariates needed for binary cancer detection per scan and risk prediction, while complying with all NLST data-use and privacy requirements.
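As an illustration of the linkage step, the sketch below derives a scan-level label from a scan date and a diagnosis date. The function name, the day-offset inputs, and the 365-day horizon are hypothetical choices made for this example; they are not NLST field names or the project's settled label definition. Participants whose follow-up ends before the horizon would need separate handling, which is omitted here.

```python
from typing import Optional


def scan_label(scan_day: int,
               diagnosis_day: Optional[int],
               horizon_days: int = 365) -> Optional[int]:
    """Label one scan for cancer diagnosed within `horizon_days` after it.

    Both day arguments are offsets from a common reference date
    (an assumption for this sketch). Returns 1 or 0, or None when the
    diagnosis precedes the scan and the scan should be excluded.
    """
    if diagnosis_day is None:
        return 0  # no cancer diagnosis recorded for this participant
    gap = diagnosis_day - scan_day
    if gap < 0:
        return None  # scan taken after diagnosis: not a screening example
    return 1 if gap <= horizon_days else 0
```

Setting `horizon_days` to a larger value turns the same function into a label for medium-term risk, so detection and risk targets can share one code path.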

* Aim 2: Train an LDCT foundation model for lung cancer screening.

We will develop and train a foundation model on the curated NLST LDCT set (and compatible external LDCT data if available). The model will be trained to support binary detection and risk stratification at the scan level. We will emphasize robustness across scanners and protocols, as well as efficiency, so that the approach can scale to large screening populations.
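Because NLST participants could receive up to three annual screens, scans from one participant are best kept on the same side of any train/held-out split. The following is a minimal sketch of one way to do that, assuming a string participant identifier; the function name, the salt, and the 20% held-out fraction are hypothetical and not part of the proposal.

```python
import hashlib


def assign_split(participant_id: str,
                 test_fraction: float = 0.2,
                 salt: str = "ldct-split-v1") -> str:
    """Deterministically assign a participant to 'train' or 'test'.

    Hashing the participant ID (not the scan ID) means every scan from
    the same participant lands in the same split.
    """
    digest = hashlib.sha256(f"{salt}:{participant_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"
```

The assignment depends only on the identifier and the salt, so it stays stable when new scans are added to the cohort.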

* Aim 3: Evaluate performance and clinical relevance.

We will evaluate the model on held-out NLST data using evaluation protocols aligned with published LDCT screening studies. We will report detection and risk prediction performance, with particular attention to early-stage disease and to indeterminate and positive screening findings. We will assess calibration and performance across key subgroups to understand suitability for use as a screening-assist tool.
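To make the evaluation concrete, the sketch below computes one discrimination metric (ROC AUC) and one calibration-related metric (Brier score) from scan-level labels and predicted probabilities. The choice of these two metrics is an assumption for illustration; the proposal does not name specific metrics, and the function names are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of scans,
    which is fine for a sketch but slow for large cohorts.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)
```

Applying both functions to a subgroup's rows only (for example, scans with indeterminate findings) gives the per-subgroup view described above.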

Collaborators

Kevin Groot Lipman, Netherlands Cancer Institute
Daan van den Broek, Netherlands Cancer Institute
Apostolos-Kosmas Galanis, Netherlands Cancer Institute