Investigation into the Effect of Dimensionality Reduction Techniques in Machine Learning Algorithms on PLCO's Lung data

Principal Investigator

Name
Olawale Omotosho

Degrees
M.Sc.

Institution
University of Hull

Position Title
AI and Data Science Master's Student

Email
o.omotosho-2022@hull.ac.uk

About this CDAS Project

Study
PLCO

Project ID
PLCO-1323

Initial CDAS Request Approval
Sep 11, 2023

Title
Investigation into the Effect of Dimensionality Reduction Techniques in Machine Learning Algorithms on PLCO's Lung data

Summary
Our research addresses a central challenge in machine learning: the complexity of high-dimensional data, often referred to as the "curse of dimensionality." This challenge is particularly pertinent in medical research, where the rapid growth of healthcare datasets calls for methods that improve model performance while preserving vital information.
Our research builds on recent work in dimensionality reduction (Reddy et al., 2020; Abdul Salam et al., 2021). These studies underscore the potential of dimensionality reduction techniques, such as Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Autoencoders, to simplify the complex, high-dimensional data typical of cancer research. Our main objective is to show how integrating dimensionality reduction into machine learning workflows can improve cancer prediction.
Going beyond performance evaluation alone, our research quantifies the effect of dimensionality reduction on cancer prediction. We focus on PCA, t-SNE, and LDA, examining their influence on key model performance metrics: accuracy, precision, recall, F1-score, and computation time. Through this quantitative analysis, we aim to clarify how dimensionality reduction techniques affect the quality and efficiency of machine learning models for cancer prediction.
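The performance metrics named in the summary can be made concrete with a small sketch. The function name `classification_metrics`, the binary 0/1 label encoding, and the toy label lists are assumptions for illustration only; they are not taken from the project or from PLCO data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; assumes y_true and y_pred are
    equal-length, non-empty sequences of class labels.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up): 1 = cancer, 0 = no cancer.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, recall (the share of true cancer cases that are detected) is often weighed separately from precision, which is one reason to report all four values rather than accuracy alone.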

Aims

• Evaluate machine learning algorithm performance for cancer prediction before and after applying dimensionality reduction techniques.
• Quantify the impact of dimensionality reduction methods (PCA, t-SNE, LDA) on key model metrics.
• Investigate the benefits of a hybrid approach that combines multiple dimensionality reduction techniques to optimize model accuracy and computational efficiency.
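One way the before-and-after comparison in these aims could be set up is sketched below with scikit-learn. Everything specific here is an assumption rather than the project's method: the data is synthetic (generated with `make_classification`, not the PLCO lung dataset), the classifier is logistic regression, LDA is taken to mean Linear Discriminant Analysis, the component counts are arbitrary, and the "hybrid" is interpreted as PCA followed by LDA. t-SNE is left out because scikit-learn's `TSNE` has no `transform` method for unseen data, so it cannot be fitted on a training split and applied to a test split in this kind of pipeline.

```python
import time

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate(reducer, X_train, X_test, y_train, y_test):
    """Fit scaler -> optional reducer -> classifier and score on the test split."""
    steps = [StandardScaler()]
    if reducer is not None:
        steps.append(reducer)
    steps.append(LogisticRegression(max_iter=1000))
    model = make_pipeline(*steps)

    start = time.perf_counter()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    elapsed = time.perf_counter() - start  # fit + predict wall-clock time

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "seconds": elapsed,
    }


# Synthetic stand-in data (assumption): 600 samples, 50 features, binary target.
X, y = make_classification(n_samples=600, n_features=50, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Reducers to compare; component counts are illustrative, not tuned.
reducers = {
    "baseline": None,
    "pca": PCA(n_components=10, random_state=0),
    # With two classes, LDA can produce at most one component.
    "lda": LinearDiscriminantAnalysis(n_components=1),
    # Hypothetical hybrid: PCA to 20 components, then LDA.
    "pca+lda": make_pipeline(PCA(n_components=20, random_state=0),
                             LinearDiscriminantAnalysis(n_components=1)),
}

results = {name: evaluate(reducer, X_train, X_test, y_train, y_test)
           for name, reducer in reducers.items()}
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Placing the reducer inside the pipeline means it is fitted on the training split only, which avoids leaking test information into the reduced features. Timings from a single run are noisy, so repeated runs or cross-validation would be needed before drawing conclusions about computational efficiency.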

Collaborators

Olawale Omotosho - (University of Hull)
Dr Lawrence Bilton - (University of Hull)