Using machine learning to understand demographic differences when predicting ovarian cancer
Principal Investigator
Name
Devanshi Kothari
Institution
Independent
Position Title
Student
About this CDAS Project
Study
PLCO
Project ID
PLCO-806
Initial CDAS Request Approval
Jul 15, 2021
Title
Using machine learning to understand demographic differences when predicting ovarian cancer
Summary
This study will use machine learning to develop a predictive model for ovarian cancer. It will also examine how the algorithmic fairness of machine learning models for CA-125 differs across demographic groups, such as those defined by age or race. The study will first assess the accuracy of CA-125 across racial and age groups. Next, a machine learning model will be trained using all demographic factors. This model will be assessed with metrics of algorithmic fairness, such as statistical parity, equalized odds, and equality of positive predictive value (PPV) and negative predictive value (NPV). Finally, new models will be trained on subsets of the data defined by race, age, and/or other relevant factors to see whether they are more accurate.
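The fairness metrics named in the summary can all be read off a per-group confusion matrix. The following is a minimal sketch, not the study's actual code: the function names `group_fairness_metrics` and `max_gap` are hypothetical, and the labels, predictions, and groups are invented toy values rather than PLCO data. Statistical parity compares the selection rate across groups, equalized odds compares the true positive and false positive rates, and the PPV/NPV criteria compare the predictive values.

```python
from collections import defaultdict


def group_fairness_metrics(y_true, y_pred, groups):
    """Per-group confusion-matrix rates for binary labels and predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        # "t"/"f": prediction correct or not; "p"/"n": predicted positive or negative
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[g][key] += 1

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as NaN
        return num / den if den else float("nan")

    metrics = {}
    for g, c in counts.items():
        n = c["tp"] + c["fp"] + c["tn"] + c["fn"]
        metrics[g] = {
            "selection_rate": ratio(c["tp"] + c["fp"], n),  # statistical parity
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),       # equalized odds (1/2)
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),       # equalized odds (2/2)
            "ppv": ratio(c["tp"], c["tp"] + c["fp"]),
            "npv": ratio(c["tn"], c["tn"] + c["fn"]),
        }
    return metrics


def max_gap(metrics, name):
    """Largest between-group difference for one metric (0 means parity)."""
    values = [m[name] for m in metrics.values()]
    return max(values) - min(values)


# Invented toy data: two hypothetical groups of four people each
y_true = [1, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
groups = ["A"] * 4 + ["B"] * 4

metrics = group_fairness_metrics(y_true, y_pred, groups)
for name in ("selection_rate", "tpr", "fpr", "ppv", "npv"):
    print(name, max_gap(metrics, name))
```

A gap of zero on a metric means the groups are treated identically under that criterion. In general these criteria cannot all be satisfied at once when disease prevalence differs between groups, so reporting each gap separately is usually more informative than a single score.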
Aims
- The purpose of this project is to determine how the algorithmic fairness of machine learning models for CA-125 differs across demographic groups, such as those defined by age or race.
- In addition, the study aims to determine if the effectiveness of these models can be improved by training models separately for different subsets of the population (e.g. different models for different racial groups).
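The second aim, comparing one pooled model against separate per-group models, can be illustrated with a deliberately simple stand-in. This sketch assumes a single-cutoff rule on a biomarker value in place of a real machine learning model; `fit_threshold` and `accuracy` are hypothetical helpers, and the rows are invented toy numbers, not PLCO measurements. Accuracy here is measured on the training rows only, whereas a real analysis would evaluate on held-out data.

```python
def accuracy(values, labels, cutoff):
    """Fraction of cases where the rule `value >= cutoff` matches the label."""
    hits = sum((v >= cutoff) == bool(y) for v, y in zip(values, labels))
    return hits / len(labels)


def fit_threshold(values, labels):
    """Return the lowest cutoff that maximizes accuracy on the given data."""
    best_cutoff, best_acc = None, -1.0
    for cutoff in sorted(set(values)):
        acc = accuracy(values, labels, cutoff)
        if acc > best_acc:
            best_cutoff, best_acc = cutoff, acc
    return best_cutoff


# Invented toy rows: (group, biomarker value, has_cancer)
rows = [
    ("A", 10, 0), ("A", 20, 0), ("A", 30, 1), ("A", 40, 1),
    ("B", 30, 0), ("B", 40, 0), ("B", 50, 1), ("B", 60, 1),
]

# One pooled rule fitted on everyone
pooled_cutoff = fit_threshold([v for _, v, _ in rows], [y for _, _, y in rows])

# One rule per group, and accuracy under both the pooled and the group rule
per_group_cutoffs = {}
comparison = {}
for g in sorted({g for g, _, _ in rows}):
    values = [v for grp, v, _ in rows if grp == g]
    labels = [y for grp, _, y in rows if grp == g]
    per_group_cutoffs[g] = fit_threshold(values, labels)
    comparison[g] = {
        "pooled": accuracy(values, labels, pooled_cutoff),
        "separate": accuracy(values, labels, per_group_cutoffs[g]),
    }

print(pooled_cutoff, per_group_cutoffs, comparison)
```

In this toy setup the two groups have different value ranges, so a single cutoff fits one group well and the other poorly. Whether the same pattern appears in real screening data is exactly the question the aim sets out to test.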
Collaborators
Dhamanpreet Kaur, Summer STEM Institute