Principal Investigator
Uday Kumbhar
Innotech Precision Medicine
Position Title
About this CDAS Project
PLCO
Project ID
Initial CDAS Request Approval
Jul 8, 2024
Biomarkers for Head and Neck Cancer
The primary objective of this project is to develop and apply machine learning algorithms that identify genetic and protein biomarkers for the early detection of head and neck cancer. Using data from the PLCO (Prostate, Lung, Colorectal, and Ovarian) Cancer Screening Trial, the research aims to characterize molecular changes that occur at different stages of the disease. By analyzing genomic, proteomic, and epigenomic data, the project seeks to build predictive models that identify early-stage cancer, with the goal of improving early diagnosis and patient outcomes. The work builds on my experience developing ETL pipelines and predictive models for cancer detection at Innotech Precision Medicine. The findings are intended to contribute to cancer genomics research and may improve early cancer detection methods.

1) Develop a Comprehensive Data Pipeline:
Design and implement an ETL (Extract, Transform, Load) pipeline to preprocess and integrate genomic, proteomic, and epigenomic data from the PLCO dataset.
Ensure data quality and consistency to facilitate accurate downstream analyses.
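
As an illustration of the first aim, the following is a minimal sketch of an extract, transform, load flow that joins three data layers on a shared participant identifier and reports missing values. It is not the project's actual pipeline. The field names (participant_id, variant_a, protein_x, cpg_1), the sample rows, and the function names are hypothetical and do not correspond to real PLCO variables.

```python
import csv
import io


def extract(rows):
    """Index raw rows by participant ID, dropping rows that lack one."""
    return {row["participant_id"]: row for row in rows if row.get("participant_id")}


def transform(genomic, proteomic, epigenomic):
    """Inner-join the three layers on participant ID and prefix feature names."""
    shared_ids = sorted(set(genomic) & set(proteomic) & set(epigenomic))
    merged = []
    for pid in shared_ids:
        record = {"participant_id": pid}
        for prefix, layer in (("gen", genomic), ("prot", proteomic), ("epi", epigenomic)):
            for key, value in layer[pid].items():
                if key != "participant_id":
                    record[f"{prefix}_{key}"] = value
        merged.append(record)
    return merged


def quality_report(records):
    """Count missing (None) values per feature as a basic consistency check."""
    missing = {}
    for record in records:
        for key, value in record.items():
            if value is None:
                missing[key] = missing.get(key, 0) + 1
    return missing


def load(records, handle):
    """Write the integrated records as CSV to any file-like object."""
    if not records:
        return
    writer = csv.DictWriter(handle, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)


# Hypothetical input rows; real data would be read from the approved data files.
genomic_rows = [
    {"participant_id": "P1", "variant_a": 1},
    {"participant_id": "P2", "variant_a": 0},
    {"participant_id": None, "variant_a": 1},  # dropped: no identifier
]
proteomic_rows = [
    {"participant_id": "P1", "protein_x": 2.5},
    {"participant_id": "P2", "protein_x": None},  # flagged as missing
]
epigenomic_rows = [
    {"participant_id": "P1", "cpg_1": 0.8},
    {"participant_id": "P2", "cpg_1": 0.4},
    {"participant_id": "P3", "cpg_1": 0.1},  # dropped: not in every layer
]

integrated = transform(
    extract(genomic_rows), extract(proteomic_rows), extract(epigenomic_rows)
)
report = quality_report(integrated)
buffer = io.StringIO()
load(integrated, buffer)
```

The inner join keeps only participants present in all three layers, which is one possible design choice. A real pipeline might instead keep partial records and impute missing values.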

2) Identify Biomarkers for Early Detection:
Utilize machine learning algorithms to analyze the integrated dataset and identify genetic and protein biomarkers associated with early stages of head and neck cancer.
Perform feature selection to pinpoint the most significant biomarkers that can serve as predictive indicators.
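
The feature selection step in the second aim could take many forms, and the project description does not name a method. As one simple possibility, the sketch below ranks features by the absolute difference in group means between cases and controls, scaled by the standard deviation of all samples. The feature names and values are synthetic, and the functions effect_size and rank_features are hypothetical helpers written for this example.

```python
from statistics import mean, pstdev


def effect_size(case_values, control_values):
    """Absolute mean difference between groups, scaled by the overall spread."""
    spread = pstdev(case_values + control_values)
    if spread == 0:
        return 0.0  # a constant feature carries no signal
    return abs(mean(case_values) - mean(control_values)) / spread


def rank_features(samples, labels, top_k):
    """Score each feature independently and return the top_k (name, score) pairs."""
    scores = {}
    for name in sorted(samples[0]):
        cases = [s[name] for s, y in zip(samples, labels) if y == 1]
        controls = [s[name] for s, y in zip(samples, labels) if y == 0]
        scores[name] = effect_size(cases, controls)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Synthetic example: only protein_x separates cases (1) from controls (0).
samples = [
    {"protein_x": 5.0, "protein_y": 1.0, "cpg_1": 0.5},
    {"protein_x": 6.0, "protein_y": 2.0, "cpg_1": 0.5},
    {"protein_x": 1.0, "protein_y": 1.0, "cpg_1": 0.5},
    {"protein_x": 2.0, "protein_y": 2.0, "cpg_1": 0.5},
]
labels = [1, 1, 0, 0]

top = rank_features(samples, labels, top_k=1)
```

A univariate filter like this ignores interactions between features. With high-dimensional molecular data, any selection step should also be run inside cross-validation so that the held-out samples do not influence which features are chosen.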

3) Construct Predictive Models:
Develop and validate machine learning models to predict the presence of head and neck cancer based on identified biomarkers.
Evaluate model performance using metrics such as accuracy, sensitivity, specificity, and area under the ROC curve (AUC).
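
The metrics named in the third aim can be computed directly from true labels and predicted scores. The sketch below assumes binary labels (1 for cancer, 0 for control), a decision threshold of 0.5, and at least one sample in each class. The labels and scores are made up for illustration, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding the scores."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true, y_score):
    """AUC as the fraction of case/control pairs where the case scores higher.

    Ties count as half a win. This pairwise form equals the area under the
    ROC curve but costs O(n_cases * n_controls), so it suits small examples.
    """
    cases = [s for t, s in zip(y_true, y_score) if t == 1]
    controls = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


# Made-up labels and model scores for eight participants.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = classification_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
```

For these inputs, one case falls below the threshold and one control falls above it, so accuracy, sensitivity, and specificity are each 0.75, and 15 of the 16 case/control pairs are ordered correctly, giving an AUC of 0.9375. Sensitivity and specificity depend on the chosen threshold, while AUC does not.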

4) Validate Findings with External Datasets:
Validate the predictive models and identified biomarkers using external datasets to ensure generalizability and robustness of the findings.
Compare performance with existing early detection methods to assess improvements.

5) Disseminate Results:
Publish findings in peer-reviewed journals and present at scientific conferences to share insights with the broader research community.
Collaborate with clinical researchers to explore the practical application of the predictive models in clinical settings.


1) Innotech Precision Medicine
2) Dr. Roya Khosravi-Far, Ph.D., PLD, InnoTech, co-founder, President and Chief Executive Officer
3) Uday Kumbhar