Principal Investigator
Name
William Fisher
Degrees
MD
Institution
Baylor College of Medicine
Position Title
Professor
Email
About this CDAS Project
Study
PLCO
Project ID
2025-0013
Initial CDAS Request Approval
Jun 12, 2025
Title
Machine Learning Multi-omics Spectroscopy Analysis of Serum for Early Detection of Pancreatic Cancer
Summary
Pancreatic ductal adenocarcinoma (PDAC) is the third leading cause of cancer-related deaths in the US. It has a 5-year survival of 12.8% because 80% of patients are diagnosed in advanced stages. PDAC incidence is increasing with our aging population, and at-risk groups, such as subjects over the age of 50 with pancreatic cysts, chronic pancreatitis (CP), or new-onset diabetes (NOD), have been identified for screening. However, current screening methods still depend on CT/MRI or endoscopic ultrasound, which are too expensive and invasive to serve as primary tools. CA19-9 is the only FDA-approved non-invasive tumor biomarker for PDAC; however, it is not useful as a prescreening tool. Similarly, genetic sequencing approaches that target nucleic acid-based omics (i.e., genomics, epigenomics, transcriptomics) by isolating ctDNA, exosomes, and/or microRNA are ineffective at early stages because their signals are undetectable.

A broader approach utilizing molecular and functional omics has emerged to address this gap. Our novel platform, Multi-omics Spectral Analysis (MOSA-DX), combines proteomics and metabolomics detection of tumor and non-tumor-derived signals from serum at early cancer stages. Infrared spectroscopy detects proteins, lipids, carbohydrates, phosphates, and other metabolites from a single 9 µL serum drop, generating a unique spectral profile compatible with machine learning (ML) algorithms with tunable sensitivity and specificity for disease prediction. Our team previously demonstrated 93% sensitivity in distinguishing early-stage (I/II) PDAC (n=73) from normal non-PDAC patients (n=459). In a subsequent study utilizing our center’s tissue bank, we tested MOSA-DX’s current ML algorithm in an independent set of patients with early-stage (I/II) PDAC (n=100) and pancreatic cysts (n=80), achieving 83.3% sensitivity and outperforming CA19-9’s sensitivity (79% in advanced-stage PDAC). To be clinically useful, our test must distinguish early-stage (I/II) PDAC from at-risk populations exhibiting similar omics changes, and must also detect PDAC in samples collected from subjects prior to diagnosis.

We propose leveraging our unique access to serum samples from different sources. Our team has built a tissue resource bank with samples from patients with stage I/II PDAC and pancreatic cysts. Our leadership in the Consortium for the study of Chronic Pancreatitis, Diabetes, and Pancreatic cancer (CPDPC) provides access to additional samples, obtained from 10 different academic institutions within the US, from subjects with PDAC, from at-risk populations such as those with CP or NOD, and from subjects without pancreatic disease. To clinically validate our test based on PRoBE design studies, we require two distinct sample sets: (1) independent external validation sets to evaluate the test's performance across diverse populations and settings, and (2) pre-diagnostic samples collected prospectively from participants in PLCO cohorts.

With this application, we will complement our existing resources to ensure comprehensive clinical validation of our test, aiming for >90% sensitivity. We aim to design a high-sensitivity test that can tolerate a lower specificity, supporting a clinically relevant, economically feasible early diagnostic strategy. We envision this as a first-sieve rule-out test that enriches at-risk populations for prompt imaging workup.
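The trade-off described above (fix a high sensitivity, accept whatever specificity results) can be illustrated with a minimal sketch. It assumes the classifier emits a continuous score per sample where higher means more PDAC-like; the function name, scores, and labels below are hypothetical and are not taken from MOSA-DX.

```python
import math
from typing import List, Tuple


def threshold_for_sensitivity(
    scores: List[float], labels: List[int], target: float
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity).

    Picks the highest cutoff at which at least `target` of the cases
    (label 1) score at or above the cutoff, then reports the
    specificity that this cutoff yields among controls (label 0).
    """
    if not 0 < target <= 1:
        raise ValueError("target sensitivity must be in (0, 1]")
    cases = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Minimum number of cases that must be called positive; the small
    # epsilon guards against floating-point error in target * n.
    needed = max(1, math.ceil(target * len(cases) - 1e-9))
    threshold = cases[needed - 1]
    sensitivity = sum(s >= threshold for s in cases) / len(cases)
    specificity = sum(s < threshold for s in controls) / len(controls)
    return threshold, sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical scores: 1 = early-stage PDAC, 0 = at-risk control.
    demo_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]
    demo_labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
    print(threshold_for_sensitivity(demo_scores, demo_labels, target=0.8))
```

In practice the cutoff would be chosen on training data only and then frozen before it is applied to any validation set.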
Aims

Specific Aim 1. Analytical validation of ML algorithm using spectral data from early-stage I/II PDAC and at-risk matched (age, sex, race) controls without PDAC (pancreatic cysts, CP, NOD, disease-free). We will assess assay parameters: quality control, accuracy, and precision. Spectral data and patient metadata will be used to refine the ML algorithm, using nested cross-validation to evaluate thresholds for each clinical setting and combining them into a unified model tuned for high sensitivity (rule-out test). Samples will be obtained from our tissue bank and CPDPC repositories.
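As a rough illustration of nested cross-validation, the sketch below uses a toy one-feature model in place of the spectral ML algorithm, which this summary does not describe. The inner loop selects a cutoff offset using only the outer-training data, and the outer loop scores that choice on held-out samples. All function names, the midpoint model, and the use of balanced accuracy as the selection metric are assumptions made for the example.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, label: 1 = case, 0 = control)


def stratified_folds(data: Sequence[Sample], k: int) -> List[List[int]]:
    """Assign indices to k folds, spreading each class evenly."""
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i, (_, y) in enumerate(data) if y == label]
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


def fit_midpoint(train: Sequence[Sample]) -> float:
    """Toy 'training': midpoint between the two class means."""
    cases = [x for x, y in train if y == 1]
    controls = [x for x, y in train if y == 0]
    return (mean(cases) + mean(controls)) / 2


def balanced_accuracy(data: Sequence[Sample], cutoff: float) -> float:
    """Mean of sensitivity and specificity at the given cutoff."""
    cases = [x for x, y in data if y == 1]
    controls = [x for x, y in data if y == 0]
    sens = sum(x >= cutoff for x in cases) / len(cases)
    spec = sum(x < cutoff for x in controls) / len(controls)
    return (sens + spec) / 2


def split(data: Sequence[Sample], fold: List[int]):
    """Return (train, test) lists for one held-out fold."""
    held = set(fold)
    train = [s for i, s in enumerate(data) if i not in held]
    test = [data[i] for i in fold]
    return train, test


def cv_score(data: Sequence[Sample], offset: float, k: int) -> float:
    """Plain k-fold score for one candidate offset."""
    scores = []
    for fold in stratified_folds(data, k):
        train, test = split(data, fold)
        scores.append(balanced_accuracy(test, fit_midpoint(train) - offset))
    return mean(scores)


def nested_cv(
    data: Sequence[Sample], offsets: Sequence[float], outer_k: int, inner_k: int
) -> List[float]:
    """Return one held-out score per outer fold."""
    outer_scores = []
    for fold in stratified_folds(data, outer_k):
        train, test = split(data, fold)
        # Inner loop: tune the offset without touching the outer test fold.
        best = max(offsets, key=lambda o: cv_score(train, o, inner_k))
        outer_scores.append(balanced_accuracy(test, fit_midpoint(train) - best))
    return outer_scores


if __name__ == "__main__":
    # Hypothetical, partly overlapping feature values for 12 cases and 12 controls.
    toy = [(1.0 + 0.1 * i, 1) for i in range(12)] + [(0.1 * i, 0) for i in range(12)]
    print(nested_cv(toy, offsets=[-0.2, 0.0, 0.2], outer_k=3, inner_k=2))
```

Because tuning happens strictly inside each outer-training split, the outer scores are not inflated by the hyperparameter search. Each class needs at least as many samples as there are folds at both levels.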

Specific Aim 2. Clinical validation of optimized ML algorithm using a series of blinded (to diagnosis) samples independent from Aim 1. These will include two independent sets: 1) Stage I/II PDAC patients compared to matched at-risk controls (pancreatic cysts, CP, NOD, disease-free); and 2) A second independent sample set with the same case and control groups, but completely different samples. Performance (diagnostic sensitivity and specificity) will be assessed for each validation set and benchmarked against CA19-9. Metadata will be used to stratify the results. Samples will be from our center's tissue bank and CPDPC repository.
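Diagnostic sensitivity and specificity for a validation set reduce to simple ratios once the blind is broken. The sketch below computes both from confusion-matrix counts and attaches a 95% Wilson score interval to each; the choice of the Wilson interval, the function names, and the counts are assumptions for illustration and are not results or methods stated in this project.

```python
import math
from typing import Dict, Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def sensitivity_specificity(
    tp: int, fn: int, tn: int, fp: int
) -> Dict[str, Tuple[float, Tuple[float, float]]]:
    """Return point estimates with Wilson intervals.

    tp/fn count PDAC cases called positive/negative;
    tn/fp count controls called negative/positive.
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


if __name__ == "__main__":
    # Hypothetical counts for one blinded validation set.
    for name, (estimate, (low, high)) in sensitivity_specificity(90, 10, 60, 40).items():
        print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

The same function can be applied per stratum (for example, cysts, CP, NOD, disease-free) and to CA19-9 calls on the same samples so the two tests are compared on identical denominators.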

Specific Aim 3. Pre-clinical evaluation of our optimized platform in detecting PDAC prior to clinical diagnosis using blinded pre-diagnostic samples independent from Aims 1 and 2. This will include PLCO samples previously collected under surveillance from at-risk groups 1-5 years before clinical PDAC diagnosis matched with at-risk controls who did not develop PDAC.

Collaborators

William E Fisher (Baylor College of Medicine)
Liang Li (MD Anderson Cancer Center)
David Palmer (Dxcover Ltd)