Machine Learning algorithm development for lung cancer subtype identification and survival prediction
Principal Investigator
Name
JUNZHOU HUANG
Degrees
Ph.D.
Institution
University of Texas at Arlington
Position Title
Professor
Email
jzhuang@uta.edu
About this CDAS Project
Study
NLST
Project ID
NLST-141
Initial CDAS Request Approval
Jun 30, 2015
Title
Machine Learning algorithm development for lung cancer subtype identification and survival prediction
Summary
Lung cancer is one of the most serious diseases and a major cause of death in both men and women. In this project, we propose computer-aided lung cancer subtype diagnosis and survival prediction on NLST data. We start from a challenging and clinically important case: differentiating two subtypes of non-small cell lung cancer (NSCLC), the most common type of lung cancer. The process comprises feature extraction and subtype classification. For feature extraction, we plan to extract local and holistic features from NLST histopathology images. To extract local features, we adopt a robust cell detection and segmentation method to segment each individual cell in the images. Based on the cell detection results, an extensive set of local features is then extracted using efficient geometry and texture descriptors. To investigate the effectiveness of holistic features in lung cancer images, we extract architecture features from labeled nuclei centroids. Each lung cancer sample will be described by one high-dimensional feature vector. To reduce the dimensionality, we will use machine learning methods to identify the important features (markers) among all extracted features. After feature extraction, several classification techniques that can handle high-dimensional data, such as Support Vector Machine (SVM) and Random Forest, will be evaluated.
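As an illustration of what architecture features computed from nuclei centroids might look like, the following is a minimal sketch that summarizes nearest-neighbor distances between centroids. The summary does not state which architecture features the project uses, so the choice of nearest-neighbor statistics, the function names, and the example coordinates are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def nearest_neighbor_distances(centroids):
    """For each centroid, return the Euclidean distance to its closest neighbor."""
    if len(centroids) < 2:
        raise ValueError("need at least two centroids")
    distances = []
    for i, (xi, yi) in enumerate(centroids):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(centroids)
            if j != i
        )
        distances.append(nearest)
    return distances


def architecture_features(centroids):
    """Summarize the spatial arrangement of nuclei as a small feature dictionary.

    Assumption: these four statistics are illustrative stand-ins, not the
    project's actual holistic feature set.
    """
    d = nearest_neighbor_distances(centroids)
    return {
        "nn_mean": mean(d),
        "nn_std": pstdev(d),
        "nn_min": min(d),
        "nn_max": max(d),
    }


# Hypothetical nuclei centroids in pixel coordinates, for illustration only.
example_centroids = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0), (10.0, 0.0)]
features = architecture_features(example_centroids)
```

In a fuller pipeline, values like these would be concatenated with the local geometry and texture descriptors to form the per-sample feature vector described above.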
Survival analysis studies the time until an event occurs, such as death in biological organisms or failure in mechanical systems. The Cox proportional hazards model is one of the most commonly used multivariate approaches for analyzing survival time data in medical research. It is a semi-parametric method that does not require the baseline hazard function to be specified and can handle censored data effectively. In our project, a Cox proportional hazards model based on the important features is fitted by component-wise likelihood-based boosting. Significant image markers can be discovered using bootstrap analysis, and the survival prediction performance of the model will be evaluated.
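To make the role of censoring in the Cox model concrete, here is a minimal sketch of the Cox partial log-likelihood, the objective that likelihood-based fitting procedures maximize. It is not the component-wise boosting procedure named above; it only evaluates the objective for a given coefficient vector. The function name, the toy data, and the simple treatment of tied event times are assumptions.

```python
import math


def cox_partial_log_likelihood(beta, features, times, events):
    """Cox partial log-likelihood for coefficients `beta`.

    features: list of per-subject feature rows
    times:    observed follow-up time per subject
    events:   1 if the event (death) was observed, 0 if censored
    Assumption: tied event times get no special handling.
    """
    # Linear predictor beta . x for every subject.
    scores = [sum(b * x for b, x in zip(beta, row)) for row in features]
    total = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            # Censored subjects add no term of their own; they still
            # appear in the risk sets of earlier events.
            continue
        # Risk set: everyone still under observation at time t_i.
        risk = sum(
            math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        total += scores[i] - math.log(risk)
    return total


# Hypothetical toy data: one feature, three subjects, one censored.
toy_features = [[1.0], [0.0], [0.5]]
toy_times = [2.0, 5.0, 3.0]
toy_events = [1, 0, 1]

ll_null = cox_partial_log_likelihood([0.0], toy_features, toy_times, toy_events)
ll_pos = cox_partial_log_likelihood([1.0], toy_features, toy_times, toy_events)
```

The baseline hazard never appears in this expression, which is why the model does not require it to be specified. In the toy data, subjects with larger feature values fail earlier, so a positive coefficient yields a higher partial likelihood than the null model.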
Aims
In this project, we first aim to investigate important and novel image features for both computer-aided diagnosis and prognosis of lung cancer. The planned framework includes cell detection, segmentation, feature extraction, classification, and survival analysis for NLST NSCLC histopathology images. A complete set of cellular features is extracted, and several advanced machine learning classification approaches are compared using the image features extracted in the previous steps. If successful, this will identify representative feature variables for NSCLC subtype classification.
We conduct survival analysis based on a Cox model and also apply several other survival analysis approaches to evaluate the discovered image features. Through these evaluations, we expect to find a set of prognostic image markers that are highly correlated with NSCLC patients’ survival. Using these image markers, we aim to predict NSCLC patients’ survival accurately. Together with clinical information, these markers could provide significant clinical value for patient prognosis.
In summary, our project aims to design a system, based on NLST data, that assists doctors in making more objective and accurate diagnoses and prognoses of lung cancer.
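One common way to evaluate survival prediction performance is the concordance index, which measures how often a model ranks pairs of patients in the correct order of survival. The aims do not name a specific metric, so treating the concordance index as the evaluation measure is an assumption, as are the function name and the toy data in this sketch.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event and
    failed strictly before subject j's observed time. The pair is concordant
    when subject i was also assigned the higher risk score. Ties in risk
    score count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times, event indicators, and predicted risks.
toy_times = [2.0, 5.0, 3.0, 8.0]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.2, 0.6, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 means every pair is ranked in reverse.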
Related Publications
- Whole slide images based cancer survival prediction using attention guided deep multiple instance learning networks
Yao J, Zhu X, Jonnagaddala J, Hawkins N, Huang J
Med Image Anal. 2020 Oct; Volume 65: Pages 101789
- Graph CNN for Survival Analysis on Whole Slide Pathological Images
Ruoyu Li, Jiawen Yao, Xinliang Zhu, Yeqing Li, Junzhou Huang
MICCAI 2018. 2018 Sep 26; Volume 11071: Pages 174-182
- WSISA: Making Survival Prediction from Whole Slide Histopathological Images
Xinliang Zhu, Jiawen Yao, Feiyun Zhu, Junzhou Huang (University of Texas at Arlington; Tencent AI Lab)
IEEE. 2017; Pages 6855-6863
- Deep convolutional neural network for survival analysis with pathological images
Xinliang Zhu, Jiawen Yao, Junzhou Huang (Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX, USA)
IEEE. 2016; Pages 544-547
- Detecting 10,000 Cells in One Second
Zheng Xu, Junzhou Huang
MICCAI 2016. 2016 Oct 2; Volume 9901: Pages 676-684
- Imaging Biomarker Discovery for Lung Cancer Survival Prediction
Jiawen Yao, Sheng Wang, Xinliang Zhu, Junzhou Huang
MICCAI 2016. 2016 Oct 2; Volume 9901: Pages 649-657
- An effective approach for robust lung cancer cell detection
Hao Pan, Zheng Xu, Junzhou Huang
Patch-MI 2015. 2016 Jan 8; Volume 9467: Pages 87-94
- Efficient lung cancer cell detection with deep convolution neural network
Zheng Xu, Junzhou Huang
Patch-MI 2015. 2016 Jan 8; Volume 9467: Pages 79-86
- Fast Regions-of-Interest Detection in Whole Slide Histopathology Images
Ruoyu Li, Junzhou Huang
Patch-MI 2015. 2016 Jan 8; Volume 9467
- Computer-Assisted Diagnosis of Lung Cancer Using Quantitative Topology Features
Jiawen Yao, Dheeraj Ganti, Xin Luo, Guanghua Xiao, Yang Xie, Shirley Yan, Junzhou Huang
MLMI 2015. 2015 Oct 2; Volume 9352: Pages 288-295