Principal Investigator
I-Fang Chung
Institute of Biomedical Informatics, National Yang-Ming University, Taiwan
Position Title
About this CDAS Project
NLST
Project ID
Initial CDAS Request Approval
Dec 2, 2020
Deep learning for identification of spread through air spaces (STAS) and prediction of survival of lung adenocarcinoma using pathology images
Pathology examination plays an important role in the management of lung adenocarcinoma. It provides valuable information such as histologic subtype, cancer stage, tumor grade, and histologic features predictive of patient prognosis. Recently, an important novel pathologic feature, named "spread through air spaces (STAS)", was described in lung adenocarcinoma. STAS is defined as microscopic spread of lung cancer tumor cells into air spaces in the lung parenchyma adjacent to the main tumor. Tumors with STAS carry a higher risk of disease recurrence after surgical resection, especially those treated with sublobar resection. The importance of STAS for patient outcome has been observed in multiple independent studies and validated in recent international, multi-institutional studies. However, STAS foci are very small, and their morphology can closely resemble that of macrophages, which are widely distributed throughout the lung. The evaluation of STAS therefore requires thorough high-power microscopic examination by experienced pathologists, which can be time-consuming. Hence, an effective way to identify STAS on lung pathology images is needed.

In this project we plan to use lung pathology images from three different datasets to build deep learning models for identification of STAS: the NLST pathology images, images from The Cancer Genome Atlas (TCGA) lung adenocarcinoma cohort, and images from our collaborating hospital (Taipei Veterans General Hospital). We will first label STAS and tumor areas on the images to provide the annotations needed for building deep learning models. We then plan to adopt two kinds of deep learning strategies for effective identification of STAS on lung pathology images: (1) object detection-based models, such as Fast R-CNN, YOLO, RetinaNet, CornerNet, and CenterNet; and (2) segmentation-based approaches, such as Mask R-CNN, YOLACT, U-Net, and DeepLab. Note that the segmentation models can also be used to identify tumor areas. Although these models are trained on regions of interest (ROIs) sampled from whole slide images (WSIs), we will apply the trained models to entire WSIs. Furthermore, since STAS is a well-established prognostic marker in lung adenocarcinoma patients, we also plan to perform quantitative analysis of STAS and investigate its correlation with patient prognosis, using clinical outcome data from NLST, TCGA, and our collaborating hospital (Taipei Veterans General Hospital).
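As an illustration of how a model trained on sampled regions might be applied to a whole slide image, the sketch below tiles a slide into overlapping fixed-size windows and maps a box predicted inside a tile back to slide coordinates. It is only a minimal sketch under stated assumptions: the tile size, the stride, and the function names `tile_origins` and `to_slide_coords` are hypothetical and are not taken from the project description, and no particular detector is implied.

```python
from typing import Iterator, List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; a hypothetical box format.
Box = Tuple[int, int, int, int]


def _starts(extent: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis so that tiles cover [0, extent)."""
    if extent <= tile:
        return [0]
    starts = list(range(0, extent - tile + 1, stride))
    if starts[-1] != extent - tile:
        # Add a final tile flush with the far edge so no pixels are skipped.
        starts.append(extent - tile)
    return starts


def tile_origins(width: int, height: int, tile: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left corner (x, y) of each tile, row by row."""
    for y in _starts(height, tile, stride):
        for x in _starts(width, tile, stride):
            yield (x, y)


def to_slide_coords(box: Box, origin: Tuple[int, int]) -> Box:
    """Shift a box predicted in tile coordinates into slide coordinates."""
    ox, oy = origin
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)


if __name__ == "__main__":
    # Toy dimensions; real whole slide images are far larger.
    for origin in tile_origins(1000, 600, tile=512, stride=256):
        print(origin)
```

Because the windows overlap, the same object can be reported from more than one tile, so a real pipeline would typically merge duplicate detections (for example with non-maximum suppression) after shifting them into slide coordinates.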

Aim #1: Review pathology whole slide images from NLST, TCGA and our collaborating hospital and perform image labeling to annotate STAS and tumor areas

Aim #2: Use the annotated images and adopt various image object detection and segmentation approaches to train deep learning models for identification of STAS

Aim #3: Explore the relationship between STAS on pathology images and survival data
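
One common way to relate a marker such as STAS to survival is to compare Kaplan-Meier survival curves between groups. The project description does not state which statistical method will be used, so the following is only a minimal sketch of the Kaplan-Meier product-limit estimator, assuming right-censored follow-up data. The function name `kaplan_meier` and the toy values are hypothetical and are not study data.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    durations: follow-up time for each patient.
    events: True if the event (e.g. death) was observed, False if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Made-up follow-up times (months) for illustration only.
    toy_durations = [5, 8, 12, 12, 20]
    toy_events = [True, False, True, True, False]
    for time, prob in kaplan_meier(toy_durations, toy_events):
        print(time, round(prob, 3))
```

In practice the estimator would be computed separately for STAS-positive and STAS-negative patients, and the resulting curves compared with a formal test such as the log-rank test.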


Yi-Chen Yeh, M.D., Taipei Veterans General Hospital, National Yang-Ming University