Principal Investigator
Name
Jacob Levman
Degrees
Ph.D., M.A.Sc., B.A.Sc.
Institution
St. Francis Xavier University
Position Title
Associate Professor of Computer Science
Email
About this CDAS Project
Study
PLCO
Project ID
PLCO-1990
Initial CDAS Request Approval
Sep 22, 2025
Title
Predictive analytics with machine learning for ovarian cancer
Summary
In this research project, we propose to apply automated machine learning (AutoML) technology to predictive analytics of ovarian cancer.
AutoML is an emerging research domain concerned with the automated search for high-quality, reliable machine learning models. The AutoML package df-analyze is a tool for performing automated machine learning on small- to medium-sized datasets [1]. The package supports various machine learning algorithms and can perform several forms of feature selection [2] and two forms of validation [1]. It produces summary tables of the performance of all combinations of learning machines and feature selection approaches, as well as markdown files with readable reports and statistical assessments [1]. It also supports a novel redundancy-aware, wrapper-based step-up feature selection method, a technique that helps find small feature sets that may exhibit predictive potential [1]. df-analyze has previously been used for diagnostics of schizophrenia [3] and chronic kidney disease [4]; for ethical artificial intelligence [5]; for predicting thyroid cancer recurrence [6]; for diagnosis, treatment prediction, and staging in pediatric appendicitis [7]; and for studying proteins potentially linked with learning in the cerebral cortex [8].
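To illustrate the general idea of wrapper-based step-up (forward) feature selection mentioned above, the following is a minimal sketch using scikit-learn on synthetic data. It is not df-analyze's implementation or API, and it does not reproduce the package's redundancy-aware behaviour; the function name `step_up_select`, the choice of logistic regression, and the synthetic dataset are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def step_up_select(X, y, max_features=3, cv=5):
    """Hypothetical helper: greedy forward (step-up) wrapper selection.

    At each step, add the single feature that gives the highest mean
    cross-validated accuracy; stop when no candidate improves the score
    or when max_features have been chosen.
    """
    model = LogisticRegression(max_iter=1000)
    selected, history = [], []
    remaining = list(range(X.shape[1]))
    best_score = -np.inf
    while remaining and len(selected) < max_features:
        # Score every candidate feature when added to the current subset.
        scores = {
            j: cross_val_score(model, X[:, selected + [j]], y, cv=cv).mean()
            for j in remaining
        }
        best_j = max(scores, key=scores.get)
        if scores[best_j] <= best_score:
            break  # no candidate improves on the current subset
        best_score = scores[best_j]
        selected.append(best_j)
        remaining.remove(best_j)
        history.append(best_score)
    return selected, history


# Synthetic stand-in data (an assumption; not PLCO data).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, n_redundant=2, random_state=0
)
selected, history = step_up_select(X, y)
print("selected feature indices:", selected)
print("cross-validated accuracy after each step:", [round(s, 3) for s in history])
```

Because the wrapper re-fits the model for every candidate at every step, the cost grows with the number of features, which is one reason such methods suit small- to medium-sized datasets.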
In this study, we hypothesize that applying df-analyze AutoML technology to ovarian cancer data may: 1) create technologies with diagnostic, staging, and/or prognostic value, and 2) improve our understanding of factors predictive of important aspects of ovarian cancer and its management, including potential insights into factors predictive of severity and issues associated with detection methods (screen-detected vs. interval-detected tumours).

References:
1. stfxecutables. (2024). df-analyze documentation. Retrieved from https://github.com/stfxecutables/df-analyze
2. Train in Data. (n.d.). Feature selection with wrapper methods. Retrieved October 21, 2024, from https://www.blog.trainindata.com/feature-selection-with-wrapper-methods/
3. Levman, J.; Jennings, M.; Rouse, E.; Berger, D.; Kabaria, P.; Nangaku, M.; Gondra, I.; Takahashi, E. A Morphological Study of Schizophrenia with Magnetic Resonance Imaging, Advanced Analytics, and Machine Learning. Front. Neurosci. 2022, 16, doi:10.3389/fnins.2022.926426.
4. Figueroa, J.; Etim, P.; Shibu, A.; Berger, D.; Levman, J. Diagnosing and Characterizing Chronic Kidney Disease with Machine Learning: The Value of Clinical Patient Characteristics as Evidenced from an Open Dataset. Electronics 2024, 13, 4326.
5. Saville, K.; Berger, D.; Levman, J. Mitigating Bias Due to Race and Gender in Machine Learning Predictions of Traffic Stop Outcomes. Information. Accepted for Publication Oct. 28th, 2024.
6. Penner, M.; Berger, D.; Guo, X.; Levman, J. Machine Learning in Differentiated Thyroid Cancer Recurrence and Risk Prediction. Applied Sciences 2025, 15(17), 9397.
7. Kendall, J.; Gaspar, G.; Berger, D.; Levman, J. Machine Learning and Feature Selection in Pediatric Appendicitis. Tomography 2025, 11(8), 90. https://doi.org/10.3390/tomography11080090
8. Huang, X.; Gauthier, C.; Berger, D.; Cai, H.; Levman, J. Identifying Cortical Molecular Biomarkers Potentially Associated with Learning in Mice Using Artificial Intelligence. Int. J. Mol. Sci. 2025, 26, 6878. https://doi.org/10.3390/ijms26146878.
Aims

The specific aims are to: 1) develop machine learning models for predicting important aspects of ovarian cancer, such as tumor stage/severity and detection method (screen vs. interval cancers); and
2) uncover patterns in the data by identifying highly predictive subsets of features for each predictive task considered. We thus aim not only to develop novel technologies that could help improve the standard of patient care, but also to inform the medical research community about feature subsets that are highly predictive of important aspects of ovarian cancer. By investigating not only methods for predicting disease severity but also tumor characteristics that are predictive of screen vs. interval cancers, our analyses may identify factors that improve our understanding of the shortcomings of current screening methods.

Collaborators

Jacob Levman St. Francis Xavier University
Keely Ralf St. Francis Xavier University
Xuchen Guo St. Francis Xavier University