Multimodal Machine Learning for Precision Lung Cancer Management
Aim 1: Develop, optimize, and evaluate a multimodal machine learning pipeline that predicts the prognosis of patients undergoing treatment by integrating information from different clinical data sources. In Aim 1, we propose to build unimodal machine learning models on genomics data and pathology images and then combine them into a multimodal machine learning model through multimodal fusion layers. Patient data, including pathology images, gene mutations, demographic information, findings in radiology reports and oncology notes, and other clinical information, will be used to train and validate this pipeline. We will apply an existing image analysis method to the pathology images and integrate it into the pipeline. In addition, we plan to improve on existing methods by adjusting the pipeline based on evaluation metrics such as the concordance index (c-index). The initial dataset will consist of lung cancer patients who underwent targeted treatment with tyrosine kinase inhibitors (TKIs). We will extend these approaches to cohorts of lung cancer patients receiving other treatments, such as immunotherapy, depending on data availability.
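As an illustration of the c-index named in Aim 1, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is not the project's evaluation code: the function name `concordance_index`, its arguments, and the toy values are assumptions made for this example, and pairs with tied survival times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored data (simplified sketch).

    times       -- observed follow-up time per patient
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output; a higher score means a higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified version
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event was given the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy values, for illustration only.
example_times = [5, 10, 15, 20]
example_events = [1, 0, 1, 1]
example_risks = [0.9, 0.8, 0.3, 0.1]
print(concordance_index(example_times, example_events, example_risks))
```

A value of 1.0 means the predicted risks rank every comparable pair of patients correctly, and 0.5 corresponds to random ranking.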
Aim 2: Test for statistical associations between patterns in patients' clinical profiles and their prognosis under selected cancer treatments. In this aim, we propose to use statistical tests and regression methods to determine which factors are associated with patient outcomes (e.g., resistance and response to treatment, overall survival) while accounting for possible confounders. As in Aim 1, the initial cohort will be lung cancer patients treated with TKI medications, with plans to extend the analysis to other treatments.
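One simple way to examine an association while accounting for a confounder, in the spirit of Aim 2, is to stratify patients by the confounder and pool the per-stratum odds ratios. The sketch below computes the Mantel-Haenszel pooled odds ratio. The aim does not specify which tests or regression models will be used, so this is only an example; the function name, the table layout, and the counts are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Pooled odds ratio across strata of a confounder (illustrative sketch).

    strata -- iterable of 2x2 tables (a, b, c, d), one per confounder level:
        a = factor present, outcome observed
        b = factor present, outcome not observed
        c = factor absent,  outcome observed
        d = factor absent,  outcome not observed
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these tables")
    return numerator / denominator


# Hypothetical counts: a factor versus treatment resistance,
# stratified by a single two-level confounder.
example_strata = [
    (10, 20, 5, 40),  # confounder level 1
    (8, 12, 6, 24),   # confounder level 2
]
print(round(mantel_haenszel_odds_ratio(example_strata), 3))
```

A pooled odds ratio near 1 suggests no association after stratification. For time-to-event outcomes such as overall survival, a regression model that includes the confounders as covariates would be the more usual choice.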
Aim 3: Build and validate a multimodal machine learning pipeline for treatments that show similar variability in drug resistance. In Aim 3, the pipeline developed for the treatment of interest in Aim 2 will be adapted to other cancer treatments to demonstrate the generalizability of the machine learning framework developed in this project.
As expected outcomes, possible associations between patient prognosis and genetic mutation patterns, pathological and radiological findings, and other clinical information will be tested and identified for each treatment. This project will also provide a method for integrating information from the different modalities and data sources in a patient's clinical profile to help tailor treatment plans. These outcomes are expected to assist decision making in cancer management by providing recommendations on treatment choices, and thus to have a positive impact on precision care for cancer patients and on their outcomes. Finally, the results may generalize to other cancer types, other diseases, and their treatments, which will be explored in future work.
Saeed Hassanpour, Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth
Shuai Jiang, Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth
Arief A. Suriawinata, Department of Pathology and Laboratory Medicine, Dartmouth-Hitchcock Medical Center
Liesbeth Hondelink, Leiden University Medical Center
Faraz Farhadi, Geisel School of Medicine at Dartmouth