Principal Investigator
Name
Mengjie Shen
Degrees
Master of Information Technology
Institution
The University of Auckland
Position Title
Graduate Teaching Assistant
About this CDAS Project
Study
NLST
Project ID
NLST-431
Initial CDAS Request Approval
Jul 31, 2018
Title
Computerized Early Diagnosis of Lung Cancer
Summary
Use the NLST participant dataset to build a model that combines screening results, smoking history, disease history, alcohol use, and family history. Given an individual's information, the model should provide a reliable diagnosis.

This study will be conducted using four solutions:
1. IBM Software Analytics Solution (SPSS Modeler).
2. Microsoft Software Analytics Solution (Microsoft SQL Server, Azure Machine Learning, and Power BI).
3. Open Source Analytics Solution (Python, Jupyter, MySQL, MySQL Workbench, Kettle/Spoon, Tableau, and Weka).
4. Big Data Analytics Solution (AWS, Jupyter, PySpark, Spark).

The study will use four classification methods:
1. IF-THEN Rule
2. Decision tree
3. Bayesian classifiers
4. Neural networks
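
As a rough illustration of two of the listed methods (an IF-THEN rule and a Bayesian classifier), a minimal Python sketch follows. The records, the three binary fields (heavy_smoker, family_history, positive_screen), the rule itself, and the function names are all invented for illustration; they are not taken from the NLST dataset or from the project's actual models.

```python
from collections import Counter, defaultdict

# Toy, made-up records (NOT NLST data); field names are hypothetical.
# Each record: ((heavy_smoker, family_history, positive_screen), label)
TRAIN = [
    ((True,  True,  True),  "cancer"),
    ((True,  False, True),  "cancer"),
    ((True,  True,  False), "no_cancer"),
    ((False, False, False), "no_cancer"),
    ((False, True,  False), "no_cancer"),
    ((False, False, True),  "no_cancer"),
]

def rule_classifier(features):
    """Hand-written IF-THEN rule (an assumption, not a clinical rule):
    IF heavy smoker AND positive screen THEN cancer ELSE no_cancer."""
    heavy_smoker, _family_history, positive_screen = features
    if heavy_smoker and positive_screen:
        return "cancer"
    return "no_cancer"

def train_naive_bayes(records):
    """Count class frequencies and per-feature value frequencies per class."""
    class_counts = Counter(label for _, label in records)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    for features, label in records:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts

def predict_naive_bayes(model, features):
    """Return the label with the highest naive Bayes score."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # class prior
        for i, value in enumerate(features):
            # Laplace smoothing; the +2 reflects two possible values per feature
            score *= (value_counts[(i, label)][value] + 1) / (count + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
print(rule_classifier((True, False, True)))
print(predict_naive_bayes(model, (True, True, True)))
```

The rule classifier encodes a fixed decision by hand, while the Bayesian classifier estimates its parameters from the training records; a real model would need many more features and proper validation.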
Aims

- Diagnose lung cancer at screening, before severe symptoms appear.
- Potentially increase the proportion of lung cancer patients who are diagnosed early.

Collaborators

Project supervisor: David Sundaram