Exploratory Data Analysis to Derive Research Questions on Lung Cancer Screening and Diagnosis

Principal Investigator

Name
Wolfgang Frimmel

Degrees
Ph.D.

Institution
Johannes Kepler University Linz

Position Title
Associate Professor

Email
wolfgang.frimmel@jku.at

About this CDAS Project

Study
NLST

Project ID
NLST-1416

Initial CDAS Request Approval
Apr 28, 2025

Title
Exploratory Data Analysis to Derive Research Questions on Lung Cancer Screening and Diagnosis

Summary
This project aims to conduct a comprehensive exploratory data analysis (EDA) of the National Lung Screening Trial (NLST) datasets. The purpose is to understand the structure, contents, and relationships across datasets in order to derive a focused and meaningful research question for my Master’s thesis in Economics and Business Analytics at Johannes Kepler University Linz.
I intend to explore associations between participant characteristics, imaging findings (CT and X-ray), diagnostic procedures, and lung cancer outcomes. The analysis will also consider differences across sex, age, smoking history, and ethnicity, and will examine the sequence and effectiveness of screenings and follow-ups.
The goal is to identify a research question that could support evidence-based insights or hypothesis generation on lung cancer detection, treatment, or disparities in outcomes. For this, access to all structured NLST datasets is requested.

Aims

1. To explore all relevant NLST datasets to gain a comprehensive understanding of lung cancer screening, diagnosis, and treatment pathways.
2. To examine associations between demographic/risk variables and lung cancer outcomes.
3. To identify differences in screening results and procedures across sex, ethnicity, and other factors.
4. To generate a specific research question based on findings from exploratory analysis.
5. To support a Master's thesis project through data-driven discovery and modeling.

Collaborators

Only me.