“Prostate cancer data analysis using machine learning methods”
We aim to develop a prediction model for prostate cancer based on variables in clinical data that can be obtained without a biopsy, since, as stated in the response to question 365, biopsies are invasive and potentially harmful to patients who do not have prostate cancer. The most commonly used variable is the prostate-specific antigen (PSA) level. However, PSA testing has been criticised for producing many false-positive diagnoses, meaning that it might not actually reduce mortality from prostate cancer but only increase the number of people undergoing unnecessary treatment, which has negative side effects such as incontinence [1]. We therefore aim to build a prediction model that is useful in a clinical setting and, if possible, to investigate whether variables such as age and the free:total PSA ratio give better predictions than total PSA alone, where free PSA is PSA that is not bound to other proteins [2].
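As a rough illustration of the comparison described above, the following minimal sketch computes the free:total PSA ratio and scores a single predictor by the area under the ROC curve (AUC). It is not the project's method: the function names `free_to_total_ratio` and `auc`, the choice of AUC as the metric, and every number in the toy lists are assumptions made up for illustration, not patient data. The ratio is negated before scoring on the assumption that a lower free:total ratio indicates higher risk.

```python
from typing import Sequence


def free_to_total_ratio(free_psa: float, total_psa: float) -> float:
    """Return free PSA as a fraction of total PSA (same units for both)."""
    if total_psa <= 0:
        raise ValueError("total PSA must be positive")
    return free_psa / total_psa


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed pairwise.

    This is the probability that a randomly chosen positive case has a
    higher score than a randomly chosen negative case; ties count half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values invented purely for illustration (NOT real clinical data).
total_psa = [4.2, 6.1, 5.0, 8.3, 4.8, 7.5]
free_psa = [1.3, 0.7, 1.4, 0.9, 1.2, 0.8]
has_cancer = [0, 1, 0, 1, 0, 1]

ratios = [free_to_total_ratio(f, t) for f, t in zip(free_psa, total_psa)]

# Higher score should mean higher risk, so the ratio is negated.
print("AUC, total PSA alone:", auc(total_psa, has_cancer))
print("AUC, free:total ratio:", auc([-r for r in ratios], has_cancer))
```

In practice the two predictors would be compared on held-out clinical data, and a multivariable model (for example, one that also includes age) would replace the single-variable scores used here.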
[1] Martin Roland, David Neal, Richard Buckley, “What should doctors say to men asking for a PSA test?,” BMJ 2018;362:k3702.
[2] Nancy Ferrari, “What is the difference between PSA and free PSA?,” Harvard Medical School, accessed April 3, 2019, https://www.health.harvard.edu/blog/what-is-the-difference-between-psa-and-free-psa-20091001114.
Dr. Matloob Khushi (Director, Master of Data Science School of Computer Science, J12 - Computer Science Building, The University of Sydney)
Predicting High-Risk Prostate Cancer Using Machine Learning Methods
Henry Barlow, Shunqi Mao, Matloob Khushi
Data 2019; 4(3): 129