Principal Investigator
Name
Aristotelis Tsirigos
Degrees
Ph.D.
Institution
NYU Langone Health
Position Title
Full Professor
Email
About this CDAS Project
Study
NLST
Project ID
NLST-1481
Initial CDAS Request Approval
Nov 13, 2025
Title
Replication and Understanding of the Sybil Deep Learning Model for Lung Cancer Risk Prediction Using NLST Low-Dose CT Scans
Summary
We are replicating and studying Sybil, a deep learning model that predicts a person’s future risk of developing lung cancer from a single low-dose CT scan. The model was originally developed at MIT, and we aim to reproduce it using the publicly available National Lung Screening Trial (NLST) dataset.
Our goal is not just to re-run the original code but to understand how the model works: how the data is prepared, what the network learns, and how it interprets CT scans to estimate risk years in advance. We are especially interested in the components that handle 3D imaging data and in the attention mechanisms that highlight which regions of the lungs the model focuses on.
This project is part of ongoing research training within the Biomedical Engineering program at NYU Tandon, conducted in collaboration with the Tsirigos Lab at NYU Langone Health. By reproducing the Sybil model’s results, we aim to gain hands-on experience with large-scale medical imaging data and to better understand how deep learning models for screening and early disease detection are trained and evaluated. Once the workflow is fully understood, we will explore similar approaches for potential application to NYU Langone’s own lung screening datasets in future research.
All analyses involving the NLST data will be performed on secure institutional servers, and all results will be used solely for educational and academic research purposes.
Aims

Aim 1: Recreate the Sybil model pipeline, from preprocessing through inference, using the NLST Lung Cancer Selection subset.
Aim 2: Evaluate how closely the replicated model’s predictions match those reported in the original paper.
Aim 3: Build a detailed, working understanding of the model’s internal logic, including how attention maps highlight image areas linked to higher risk.
Aim 4: Use what we learn from this process to help adapt similar approaches to NYU Langone’s lung cancer screening datasets.
Hypothesis: Deep learning can detect subtle patterns in low-dose CT scans that correlate with future lung cancer risk, even when no visible tumor is present. Understanding how Sybil achieves this can help improve both screening tools and model interpretability.

Collaborators

Aristotelis Tsirigos New York University (Affiliated with NYU Langone)
Elektra Manolakos New York University (Affiliated with NYU Langone)