Towards Inherent Explainability in Breast Cancer Diagnosis -Unveiling Intrinsic Feature Relevance of Black Box Deep Learning Models using Tabular Data

Principal Investigator

Name
Mahreen Marghoob

Degrees
Masters programme Health Informatics

Institution
Stockholm University

Position Title
Student

Email
de_dentist@hotmail.com

About this CDAS Project

Study
PLCO

Project ID
PLCO-1470

Initial CDAS Request Approval
Feb 5, 2024

Title
Towards Inherent Explainability in Breast Cancer Diagnosis -Unveiling Intrinsic Feature Relevance of Black Box Deep Learning Models using Tabular Data

Summary
This master's thesis project addresses the critical need for enhanced interpretability of deep learning
models for their effective utilization as supportive tools in breast cancer diagnosis. Traditional
diagnostic approaches are prone to human errors, necessitating a paradigm shift towards the
implementation of machine learning (ML) and deep learning (DL) techniques. Despite the frequently
reported superior diagnostic capabilities of deep learning models, the inherent complexity and black-box
nature of these models pose challenges to their interpretability and hinder their clinical use.
The core objective of this research is to introduce inherent explainability into breast cancer diagnosis
using tabular data. The proposed methodology involves unwrapping deep neural networks (DNNs)
into subsequent local linear models (LLMs) based on their activation patterns and activation regions.
By unraveling the decision-making processes within these sophisticated models, we aim to enhance
transparency and understanding. A key focus is on providing intrinsic interpretability through
LLM-based local linear profiles, evaluating the joint importance of individual features. To benchmark the
effectiveness of our approach, we will compare the outcomes with intrinsically interpretable classifiers
of simpler structure, including decision trees, and with an attention-based neural network, TabNet. This comparative
analysis aims to underscore the strengths of our proposed method in improving feature relevance and
interpretability. Moreover, if time and resource constraints allow, the research will seek to validate
the identified features by engaging oncologists' expertise, ensuring alignment with clinical insights.
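
The unwrapping idea described in the summary can be illustrated with a minimal sketch. It assumes a fully connected network with ReLU hidden units and a linear output, which is the setting in which an activation pattern fixes a single local linear model; the project's actual architecture, data, and tooling are not specified here. The weights, the input, and the helper names (matvec, matmul, forward, unwrap) are hypothetical toy choices, and the code uses only the Python standard library so that it stays self-contained.

```python
# Minimal sketch (toy values, not the project's model): unwrap a small ReLU
# network into the local linear model that is active at one input.

def matvec(W, x):
    """Matrix-vector product for list-of-lists W and list x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Matrix-matrix product for list-of-lists A and B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def forward(layers, x):
    """Ordinary forward pass: ReLU on hidden layers, linear output layer."""
    h = x
    for i, (W, b) in enumerate(layers):
        z = [zi + bi for zi, bi in zip(matvec(W, h), b)]
        h = z if i == len(layers) - 1 else [max(0.0, zi) for zi in z]
    return h

def unwrap(layers, x):
    """Return (pattern, W_eff, b_eff) with network(x') = W_eff x' + b_eff
    for every x' in the activation region that contains x."""
    n = len(x)
    W_eff = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b_eff = [0.0] * n
    h, pattern = x, []
    for i, (W, b) in enumerate(layers):
        # Compose the affine map of this layer with the map built so far.
        W_eff = matmul(W, W_eff)
        b_eff = [zi + bi for zi, bi in zip(matvec(W, b_eff), b)]
        z = [zi + bi for zi, bi in zip(matvec(W, h), b)]
        if i == len(layers) - 1:
            break  # output layer is linear: nothing to mask
        # Activation pattern: which hidden units are on (1) or off (0) at x.
        mask = [1.0 if zi > 0 else 0.0 for zi in z]
        pattern.append(tuple(int(m) for m in mask))
        # Inactive units contribute nothing, so zero their rows.
        W_eff = [[m * w for w in row] for m, row in zip(mask, W_eff)]
        b_eff = [m * bi for m, bi in zip(mask, b_eff)]
        h = [m * zi for m, zi in zip(mask, z)]
    return pattern, W_eff, b_eff

# Hypothetical network: 3 input features -> 2 hidden ReLU units -> 1 output.
layers = [
    ([[1.0, -2.0, 0.5], [-1.0, 1.0, 1.0]], [0.0, -0.5]),
    ([[2.0, -1.0]], [0.25]),
]
x = [1.0, 0.2, 0.4]

pattern, W_eff, b_eff = unwrap(layers, x)
print("activation pattern:", pattern)
print("local coefficients:", W_eff[0], "intercept:", b_eff[0])
print("network output:", forward(layers, x)[0])
```

In this sketch the row of W_eff plays the role of a local linear profile: within the activation region of x, each coefficient states how strongly the corresponding feature moves the output. How such coefficients are aggregated across regions into feature relevance scores is a design choice that the sketch leaves open.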

Aims

1. The primary aim of this research is to explore inherent explainability in breast cancer diagnosis using tabular data, with a specific focus on novel techniques such as unwrapping DNNs into local linear models based on activation patterns and activation regions.
2. Compare the diagnostic performance of well-established traditional classifiers, notably Random Forests, with that of DNNs unwrapped into local linear models.
3. Compare the feature relevance obtained from an inherently explainable classifier (Decision Tree) with that of the attention-mechanism-based TabNet and the unwrapped DNN.
4. If time and resources allow, have these feature relevances validated by clinicians.

Collaborators

Supervisor: Alejandro Kuratomi
PhD student in Machine Learning Interpretability,
DSV – Department of Computer and Systems Sciences,
Stockholm University, Sweden.