Data Science Project: To develop prediction model for Lung Cancer Risk
Principal Investigator
Name
Ella Buzhor
Degrees
Ph.D, B.Sc.Pharm
Institution
Holon Institute of Technology (HIT)
Position Title
Precision Oncology Medical Manager @ Roche
Email
About this CDAS Project
Study
PLCO
Project ID
PLCO-1121
Initial CDAS Request Approval
Dec 5, 2022
Title
Data Science Project: To develop prediction model for Lung Cancer Risk
Summary
As part of my data science course at Bar Ilan University, I need to develop an artificial intelligence-based model.
My project will focus on developing an algorithm that can predict the risk of developing lung cancer.
Early detection of lung cancer (LC) can be challenging because cancer-related symptoms appear late; as a result, the majority of LC cases are discovered at advanced or metastatic stages.
Early detection can significantly improve survival. With novel neoadjuvant and adjuvant treatments now available, early LC detection is imperative for improving patient survival. Until LC screening by low-dose CT is fully incorporated into practice, AI models can enrich the at-risk population selected for screening.
I wish to analyze the data to identify parameters that can be incorporated into machine learning (ML) models, and to use ML and deep learning (DL) models to predict which population is at higher risk of developing LC, so that screening can be recommended to them.
Aims
My aims:
1. Analyze the data through exploratory data analysis.
2. Prepare the data for modeling: handle missing values and outliers.
3. Divide the dataset into training and test sets.
4. Test several ML and DL models to find the most accurate one for identifying high-risk patients for lung cancer screening.
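The data preparation and splitting aims above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual method: the median imputation, the IQR-based clipping rule, the 80/20 split, the helper names, and the sample values are all assumptions made for the example, and none of the numbers come from PLCO data.

```python
import random
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lo), hi) for v in values]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and split them into train and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up illustrative values (not PLCO data): a hypothetical "pack-years"
# column with two missing entries and one implausibly extreme entry.
pack_years = [10, 20, None, 30, 25, 15, None, 400, 35, 5]

cleaned = clip_outliers_iqr(impute_median(pack_years))
train, test = train_test_split(cleaned, test_fraction=0.2, seed=42)
```

In a real analysis, the imputation and clipping thresholds would normally be computed on the training set only and then applied to the test set, to avoid leaking information. For a rare outcome such as lung cancer, a split stratified by outcome may also be preferable to a plain shuffle.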
Collaborators
This is a personal course graduation project with no collaborators.
Course supervisor: Dr. Tomas Karpati (MD). I will seek his advice on any problems that may arise during model development.