Medical data mining on small datasets

Principal Investigator

Name
Ron Wolfslast

Degrees
B.Sc.

Institution
University Hamburg

Position Title
Student of Business Informatics

Email
ron.wolfslast@outlook.com

About this CDAS Project

Study
NLST

Project ID
NLST-266

Initial CDAS Request Approval
Dec 16, 2016

Title
Medical data mining on small datasets

Summary
The goal of my master's thesis is to determine which data mining methods are needed to classify whether a patient is affected by a specific disease.

Aims

The aim is to find out which data preparation and modeling techniques are needed to build a reliable prediction model from small datasets with a proportionally large number of features, as is common in medical applications. The unusually large size of the NLST lung cancer patient dataset makes it possible to experiment with different amounts of training data and to evaluate the fit of each trained model reliably on the data held back from training.
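The experiment described above can be sketched as a learning curve: train on random subsets of increasing size and score every model on one large held-out set. The following is only a minimal illustration under stated assumptions. It uses synthetic data rather than NLST records, a nearest-centroid classifier as a stand-in for whichever models the project evaluates, and hypothetical names throughout (`make_synthetic`, `fit_centroids`, `learning_curve`); none of these come from the project itself.

```python
import random
import statistics


def make_synthetic(n, n_features, rng):
    """Hypothetical stand-in data: two classes with slightly shifted feature means."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = [rng.gauss(0.5 * label, 1.0) for _ in range(n_features)]
        rows.append((features, label))
    return rows


def fit_centroids(train):
    """Nearest-centroid model: one mean vector per class present in the sample."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in train if y == label]
        if members:
            centroids[label] = [statistics.fmean(col) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, data):
    return sum(predict(centroids, x) == y for x, y in data) / len(data)


def learning_curve(pool, test, sizes, repeats, rng):
    """Mean and standard deviation of held-out accuracy per training-set size."""
    curve = {}
    for size in sizes:
        scores = []
        for _ in range(repeats):
            train = rng.sample(pool, size)  # random small training subset
            scores.append(accuracy(fit_centroids(train), test))
        curve[size] = (statistics.fmean(scores), statistics.pstdev(scores))
    return curve


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_synthetic(3000, 20, rng)
pool, test = data[:2000], data[2000:]  # large held-out set for stable scores
curve = learning_curve(pool, test, sizes=[20, 100, 500], repeats=10, rng=rng)
for size, (mean_acc, sd) in curve.items():
    print(f"n_train={size:4d}  accuracy={mean_acc:.3f} +/- {sd:.3f}")
```

Repeating each training size several times matters here: with very few samples, the spread across repeats indicates how much a single small-sample result can be trusted, while the large held-out set keeps the accuracy estimate itself stable.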

Collaborators

University Hamburg