Using machine learning to predict risk from clinical factors and medical image
Principal Investigator
Name
Qiang CAO
Degrees
Bachelor
Institution
Independent
Position Title
PhD student
Email
About this CDAS Project
Study
PLCO
Project ID
PLCO-390
Initial CDAS Request Approval
Aug 9, 2018
Title
Using machine learning to predict risk from clinical factors and medical image
Summary
Globally, more than 1 million people are diagnosed with colorectal cancer every year. It caused about 715,000 deaths in 2010, up from 490,000 in 1990. As of 2012, it is the second most common cancer in women (9.2% of diagnoses) and the third most common in men (10.0%), and it is the fourth most common cause of cancer death after lung, stomach, and liver cancer. Colorectal cancer is the most common cancer in Hong Kong, where it accounted for 16.6% of all new cancer cases in 2015.
Prior studies have examined the effects of red meat, fibre, and other foods on the development or prevention of colorectal cancer. There is also some evidence linking intake of vitamin D, calcium, and folate with a reduced risk of colorectal cancer.
Some researchers have argued that colonoscopy is very useful for the early diagnosis of colorectal cancer.
Because early detection can save many patients, I will design a novel machine learning method to predict a patient's colorectal cancer risk. It can help identify the risk factors of colorectal cancer and alert the patient to seek treatment as soon as possible. To understand the risk factors of colorectal cancer comprehensively, I need the Colorectal comprehensive dataset and the following sub-study data (Supplemental Questionnaire, Dietary Questionnaire, Diet History Questionnaire, Vitamin D, Colonoscopy Utilization, Contamination Survey) to train and test the machine learning model.
After completing the prediction model, I want to test the utility and generality of the method on different datasets. I will first use the new machine learning method to predict the risk of colorectal cancer and to explore its key risk factors, and then test whether the method can be applied to risk prediction for other cancers. Because lung cancer and liver cancer are highly prevalent worldwide, I have chosen those two datasets for the validation.
Aims
1. Build a novel machine learning method to assist colorectal cancer diagnosis based on clinical factors and dietary data.
2. Identify the key factors that lead to colorectal cancer.
3. Test whether dietary factors significantly affect colorectal cancer risk.
4. Compare our method to prior research.
5. Test the utility and generality of my novel machine learning method with lung cancer and liver cancer datasets.
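The workflow behind aims 1 to 3 (fit a risk model, evaluate it on held-out participants, rank the factors) can be illustrated with a minimal sketch. This is not the proposed novel method, which the summary does not specify. It substitutes plain logistic regression as a stand-in baseline, uses synthetic data rather than PLCO data, and the feature names (age, red_meat, fibre, noise) and their effect sizes are hypothetical.

```python
import math
import random

# Hypothetical, standardised risk factors. These are NOT PLCO variable names.
FEATURES = ["age", "red_meat", "fibre", "noise"]
# Assumed effect sizes used only to generate synthetic labels.
TRUE_W = [1.0, 0.8, -0.8, 0.0]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic(n, seed):
    """Draw n synthetic participants with a binary outcome."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in FEATURES]
        p = sigmoid(sum(t * v for t, v in zip(TRUE_W, x)) - 1.0)
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, x):
    """Predicted probability of the outcome for one participant."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def auc(scores, labels):
    """Probability that a random case scores higher than a random non-case."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Train on one synthetic cohort, evaluate on a separate one.
X_train, y_train = make_synthetic(400, seed=0)
X_test, y_test = make_synthetic(200, seed=1)
w, b = fit_logistic(X_train, y_train)
test_auc = auc([predict_risk(w, b, x) for x in X_test], y_test)

# Rank factors by absolute coefficient (comparable because inputs are standardised).
ranking = sorted(zip(FEATURES, w), key=lambda t: abs(t[1]), reverse=True)
print("held-out AUC:", round(test_auc, 3))
for name, coef in ranking:
    print(name, round(coef, 3))
```

A baseline of this kind would also give aim 4 a reference point, and the same fit-and-evaluate loop could be rerun on lung and liver cancer data for aim 5, assuming comparable features are available in those datasets.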
Collaborators
To be determined.