Principal Investigator
Name
Yang Xie
Degrees
PhD, MD
Institution
UT Southwestern Medical Center
Position Title
Associate Professor
About this CDAS Project
Study
NLST
Project ID
NLST-62
Initial CDAS Request Approval
Mar 6, 2014
Title
Predicting lung cancer patient outcomes using pathological imaging
Summary
Lung cancer is the most common human cancer and the deadliest in the United States and globally, with a five-year survival rate of only 16%. Non-small-cell lung cancer (NSCLC) accounts for approximately 85% of known lung cancer cases. Current guidelines for treating NSCLC are largely based on clinical and pathological staging systems, as well as additional factors such as smoking history and gender. More precise diagnosis and classification of NSCLC patients could help “tailor” specific treatment plans for individual patients. In the last decade, large-scale genomic profiling technology has been used to obtain genome-wide mRNA expression levels in lung cancer, and many mRNA expression signatures have been developed for NSCLC prognosis. However, microarray experiments require frozen tissue, which limits the clinical use of genome-wide expression profiling data. The goal of this study is to test the feasibility of developing computerized algorithms for lung cancer diagnosis and prognosis using pathological images, which are widely available and provide a rich source of information for studying lung cancer pathology and associated clinical outcomes. We plan to develop and validate clinical assays for diagnosing lung cancer and predicting disease prognosis using pathological image data and clinical data from the National Lung Screening Trial (NLST). The main evaluation criteria are: 1) the ability of the assay to determine patient histology using automated image analysis algorithms; 2) the ability of the assay to predict the survival outcome of lung cancer patients; 3) the predictive power of the pathological imaging after adjusting for clinical variables such as age, gender, and stage.
Aims

Although a large number of cancer biomarkers have been reported, few have been translated into real clinical tools. The major bottleneck in translating biomarker discovery into improved patient outcomes is the availability of accurate clinical tests (assays) that allow treatments to be optimized and tailored to an individual’s needs. If we can develop reliable computerized algorithms for the diagnosis and prognosis of lung cancer from pathological imaging, this will have an immediate impact on lung cancer patient care. If implemented successfully, this study will have immense clinical benefit for planning courses of treatment for individual patients.
Aim 1. To develop diagnostic and prognostic signatures from pathological imaging data. Pathological imaging data contain thousands or even millions of features, so we will use state-of-the-art dimension reduction and machine learning techniques to derive classification and prediction models. Cross-validation will be used for parameter optimization and internal validation. The prediction performance will be characterized using Receiver Operating Characteristic (ROC) curves.
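The evaluation steps named in Aim 1 can be illustrated with a minimal sketch. It is not the project's implementation: the function names (`roc_curve`, `auc`, `k_fold_indices`), the toy labels and scores, and the choice of three folds are all assumptions made for illustration, and the dimension reduction and model fitting steps are not shown.

```python
import random


def roc_curve(labels, scores):
    """Sweep a threshold over the scores and return (fpr, tpr) lists.

    labels are 0/1; a higher score means "more likely positive".
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Tied scores are processed together so they yield one ROC point.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for k in range(1, len(fpr)):
        area += (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
    return area


def k_fold_indices(n, k, seed=0):
    """Return k (train_indices, test_indices) pairs covering range(n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for fold in folds:
        held_out = set(fold)
        train = [j for j in idx if j not in held_out]
        splits.append((sorted(train), sorted(fold)))
    return splits


# Toy data (hypothetical): 1 = event, 0 = no event.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

fpr, tpr = roc_curve(labels, scores)
print(round(auc(fpr, tpr), 3))  # 0.889

splits = k_fold_indices(len(labels), k=3)
```

In a cross-validation loop, a model would be fit on each `train` index set and scored on the matching `test` set, and tuning parameters would be chosen by comparing the held-out AUC across folds.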
Aim 2. To build a comprehensive prediction model by integrating pathological imaging and clinical information. We will develop a novel computational approach to integrate different data sources using an Area Under the Curve (AUC) weighted average method. Through this approach, we will make the best use of the information from different types of data and optimize the prediction performance. Lasso-based penalization will be used to tackle possible collinearity problems.
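The proposal does not spell out the AUC weighted average method, so the following is only one plausible reading, sketched under stated assumptions: each data source produces its own risk score, each source receives a weight proportional to its AUC, and the combined score is the weighted average. The function names, the source names `imaging` and `clinical`, and the toy numbers are hypothetical. The lasso step is not shown.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_weights(labels, source_scores):
    """Weight each source in proportion to its AUC (assumed scheme)."""
    aucs = {name: auc_score(labels, s) for name, s in source_scores.items()}
    total = sum(aucs.values())
    return {name: a / total for name, a in aucs.items()}


def combine(source_scores, weights):
    """Weighted average of the per-source scores, one value per patient."""
    names = list(source_scores)
    n = len(source_scores[names[0]])
    return [
        sum(weights[name] * source_scores[name][i] for name in names)
        for i in range(n)
    ]


# Toy data (hypothetical): 1 = event, 0 = no event.
labels = [1, 1, 0, 1, 0, 0]
source_scores = {
    "imaging": [0.9, 0.8, 0.7, 0.6, 0.4, 0.2],
    "clinical": [0.6, 0.7, 0.5, 0.4, 0.3, 0.8],
}

weights = auc_weights(labels, source_scores)
combined = combine(source_scores, weights)
```

To avoid optimistic estimates, the weights would normally be computed on data held out from the evaluation set, for example within the cross-validation folds.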
Aim 3. To validate and characterize the comprehensive prediction model using an independent patient cohort. We will randomly select 400 NSCLC samples from the UT Lung SPORE Tissue Bank. For this patient cohort, we will collect clinical and epidemiological data, as well as pathological image data from both normal and tumor tissues. We will test the prediction model on this cohort and use a blinding approach to ensure the validity of the testing results.
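The random selection and blinding described in Aim 3 could be organized along the following lines. This is a sketch under assumptions, not the study protocol: the record fields (`id`, `outcome`, `feature`), the masked ID format, the seed, and the function names are all hypothetical.

```python
import random


def select_cohort(sample_ids, n=400, seed=2014):
    """Draw n sample IDs at random without replacement, reproducibly."""
    if n > len(sample_ids):
        raise ValueError("not enough samples to draw from")
    # Sorting first makes the draw independent of the input order.
    return random.Random(seed).sample(sorted(sample_ids), n)


def blind(records, outcome_field="outcome", seed=2014):
    """Split records into blinded inputs and a sealed unblinding key.

    The analyst scoring the samples sees only `blinded`; `key` is held
    by a third party until all predictions have been submitted.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    blinded, key = [], {}
    for k, rec in enumerate(shuffled):
        masked_id = f"S{k:04d}"
        key[masked_id] = (rec["id"], rec[outcome_field])
        visible = {
            field: value
            for field, value in rec.items()
            if field not in ("id", outcome_field)
        }
        blinded.append({"masked_id": masked_id, **visible})
    return blinded, key


# Toy tissue bank (hypothetical IDs, outcomes, and feature values).
bank = [
    {"id": f"P{i:04d}", "outcome": i % 2, "feature": i / 1000}
    for i in range(1000)
]

cohort_ids = set(select_cohort([rec["id"] for rec in bank]))
cohort = [rec for rec in bank if rec["id"] in cohort_ids]
blinded, key = blind(cohort)
```

Once predictions keyed by `masked_id` are returned, they can be joined to `key` to compute the validation metrics.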