Longitudinal Deep Radiomic Signatures for Lung Cancer Prognosis and Treatment Response Prediction
It is well known that DCNNs are data-hungry architectures, and an assessment of the best architecture for a given task may depend strongly on whether one trains on 1,000 or 50,000 patient records. Many recent studies of computer-aided diagnosis (CAD) for lung nodule detection rely on the LIDC/IDRI dataset (Armato et al., 2011); however, with about 1,000 patients, this dataset may be too small to fully characterize algorithm performance, particularly on rare nodule types. A much larger dataset would allow a finer understanding of the tradeoffs between algorithm complexity and accuracy across training dataset sizes.
Moreover, recent progress in radiomics has shown promise for improving diagnosis and patient stratification in lung cancer (Scrivener et al., 2016). Radiomics aims to characterize tumor phenotypes through the analysis of quantitative imaging features, such as textures and tumor shape; such features have been associated with specific mutations in lung cancer (Liu et al., 2016) and in a number of other cancers (Aerts et al., 2014). So far, most work on radiomics has centered on feature extraction from single images, with little consideration for the sequence of features that may be extracted from a multi-year longitudinal patient screening and follow-up effort. In addition, published work on radiomics continues to rely on hand-engineered features, rather than features trained to be maximally predictive of a clinical outcome.
This project aims at bridging this gap by evaluating whether DCNNs can serve to extract predictive radiomic features, either from individual scans or given a (longitudinal) sequence thereof. To this end, we will correlate features extracted from DCNNs trained on the task of detecting lung nodules with available patient outcome results, including patient survival, diagnostic procedures undertaken, medical complications, lung cancer stage, and treatment administered. We intend to evaluate a number of DCNN architectures for this task, including multitask and transfer learning models that have proven successful in a medical imaging context (Tajbakhsh et al., 2016).
If successful, this work will demonstrate that DCNNs can provide a general modeling framework for integrating imaging with other clinical patient data into a predictive system that could support clinical decisions and ultimately improve patient care.
The specific aims of the project can be outlined as follows.
Aim 1: Nodule Detection and Transfer Learning
First, we want to assess the ability of DCNNs to support transfer learning for the task of nodule detection. We will start from an initial model trained on existing public datasets, including LIDC and RIDER, to account for different scanning resolutions (Aerts et al., 2014); we will then evaluate the model's detection performance on a held-out test set from the NLST database. This step aims to ensure that DCNNs can achieve useful baseline performance on the low-dose CT scans collected during the NLST effort.
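The transfer-learning setup above can be sketched as follows. This is a minimal illustration in NumPy, not the proposed pipeline: a frozen random projection stands in for a backbone pretrained on LIDC/RIDER patches, and only a logistic-regression head is fine-tuned for nodule classification on new (here, synthetic) data. All names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained DCNN backbone: a frozen random projection + ReLU.
# In practice this would be a network pretrained on LIDC/RIDER image patches.
W_frozen = rng.normal(size=(64, 16))

def extract_features(patches):
    """Frozen 'pretrained' feature extractor (weights are never updated)."""
    return np.maximum(patches @ W_frozen, 0.0)

def fine_tune_head(X, y, lr=0.1, epochs=200):
    """Train only a logistic-regression head on frozen features,
    mimicking fine-tuning the last layer of a pretrained network."""
    feats = extract_features(X)
    w = np.zeros(feats.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))  # sigmoid output
        grad_w = feats.T @ (p - y) / len(y)         # logistic-loss gradient
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic 'nodule vs. non-nodule' patches: class-dependent mean shift.
y = rng.integers(0, 2, size=200).astype(float)
X = rng.normal(size=(200, 64)) + 0.5 * y[:, None]

w, b = fine_tune_head(X, y)
p = 1.0 / (1.0 + np.exp(-(extract_features(X) @ w + b)))
acc = np.mean((p > 0.5) == y)
```

In the actual project, the head (and possibly later backbone layers) would be fine-tuned with a deep learning framework; the point here is only the division between frozen pretrained weights and the small trainable portion.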
Aim 2: Extraction of Radiomic Features from Single Images
Next, we want to use DCNNs to enhance traditional radiomic signatures for predicting the patient's progression along the continuum of care, including tumor staging, follow-on diagnostic procedures, and treatment resistance. These features would be extracted from individual images only. We will experiment with architectures that emphasize the extraction of general-purpose features that are simultaneously helpful for predicting a variety of outcome types, building on our group's recent work extending DCNNs to support heteromodal medical image analysis (Havaei et al., 2016).
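One way to read this aim is: treat the activations of a trained detector's penultimate layer as candidate radiomic features, then screen them against clinical outcomes. The sketch below, assuming a hypothetical frozen network and synthetic cohort data, ranks such features by absolute Pearson correlation with a continuous outcome (e.g., survival time); the real project would use richer survival models rather than simple correlation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a trained detector's penultimate layer: frozen weights.
# In practice, activations would come from a DCNN trained for nodule detection.
W1 = rng.normal(size=(32, 8))

def deep_radiomic_features(images):
    """Read out penultimate-layer activations as candidate radiomic features."""
    return np.maximum(images @ W1, 0.0)

def rank_features_by_outcome(features, outcome):
    """Rank candidate features by absolute Pearson correlation with a
    clinical outcome, as a simple univariate screening step."""
    f = features - features.mean(axis=0)
    o = outcome - outcome.mean()
    denom = np.sqrt((f ** 2).sum(axis=0) * (o ** 2).sum())
    r = f.T @ o / np.where(denom == 0, 1.0, denom)
    order = np.argsort(-np.abs(r))  # most-correlated feature first
    return order, r

# Synthetic cohort: the outcome depends on one latent image direction.
images = rng.normal(size=(100, 32))
outcome = images[:, 0] + 0.1 * rng.normal(size=100)

feats = deep_radiomic_features(images)
order, r = rank_features_by_outcome(feats, outcome)
```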
Aim 3: Extraction of Longitudinal Radiomic Features
Finally, we want to extend DCNNs to account for all images taken for a given patient up to a given point in time. We posit that characterizing the dynamics of radiomic feature evolution can deepen our understanding of tumor biology and the patient's condition, potentially enabling early triggers for diagnostic or therapeutic options. The available NLST data provide a unique opportunity to test this hypothesis.
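As a sketch of what a longitudinal extension could look like, the snippet below folds a time-ordered sequence of per-scan feature vectors into a single fixed-size summary with a simple tanh recurrent cell. The weights here are untrained and illustrative; in the proposed work, the recurrent component would be learned end-to-end with the image feature extractor, and one summary would be produced per patient from their NLST screening history.

```python
import numpy as np

rng = np.random.default_rng(2)

FEAT, HID = 8, 4
# Untrained, illustrative recurrent weights; in practice these would be
# learned jointly with the per-image feature extractor.
W_in = rng.normal(size=(FEAT, HID)) * 0.3
W_rec = rng.normal(size=(HID, HID)) * 0.3

def longitudinal_signature(scan_features):
    """Fold a time-ordered sequence of per-scan radiomic feature vectors
    into one fixed-size vector using a minimal tanh RNN cell."""
    h = np.zeros(HID)
    for x in scan_features:            # one feature vector per screening round
        h = np.tanh(x @ W_in + h @ W_rec)
    return h

# Example: three annual screens for one patient (NLST-like timeline).
scans = [rng.normal(size=FEAT) for _ in range(3)]
sig = longitudinal_signature(scans)
```

A recurrent summary of this kind naturally handles patients with different numbers of scans, which matters for screening cohorts where follow-up length varies.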
References
Aerts, H. J., et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nature Communications, 5:4006, 2014.
Anthimopoulos, M., et al. Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neural Network. IEEE Transactions on Medical Imaging, 35(5):1207–1216, 2016.
Armato, S. G., et al. The lung image database consortium (LIDC) and image database resource initiative (IDRI): A completed reference database of lung nodules on CT scans. Medical Physics, 38(2):915–931, 2011.
Havaei, M., et al. HeMIS: Hetero-Modal Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, Lecture Notes in Computer Science, vol. 9901, pp. 469–477. Springer, 2016.
Liu, Y., et al. Radiomic Features Are Associated With EGFR Mutation Status in Lung Adenocarcinomas. Clinical Lung Cancer, 17(5):441–448, 2016.
Scrivener, M., et al. Radiomics applied to lung cancer: a review. Translational Cancer Research, 5(4):398–409, 2016.
Setio, A., et al. Pulmonary Nodule Detection in CT Images: False Positive Reduction Using Multi-View Convolutional Networks. IEEE Transactions on Medical Imaging, 35(5):1160–1169, 2016.
Tajbakhsh, N., et al. Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning? IEEE Transactions on Medical Imaging, 35(5):1299–1312, 2016.
Yoshua Bengio, Ph.D., University of Montreal
Florent Chandelier, Ph.D., Cadens Medical Imaging Inc.
Aaron Courville, Ph.D., University of Montreal
Mohammad Havaei, Ph.D., Imagia Inc.