Predicting lung cancer patient outcomes using pathological imaging
Although a large number of cancer biomarkers have been reported, few have been translated into clinical tools. The major bottleneck in translating biomarker discovery into improved patient outcomes is the limited availability of accurate clinical tests (assays) that allow treatments to be optimized and tailored to an individual’s needs. Reliable computerized algorithms for the diagnosis and prognosis of lung cancer from pathological imaging would have a direct impact on patient care. If implemented successfully, this study will provide substantial clinical benefit in planning courses of treatment for individual patients.
Aim 1. To develop diagnostic and prognostic signatures from pathological imaging data. Pathological imaging data contain thousands or even millions of features, so we will use state-of-the-art dimension reduction and machine learning techniques to derive classification and prediction models. Cross-validation will be used for parameter optimization and internal validation. The prediction performance will be characterized using Receiver Operating Characteristic (ROC) curves.
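As an illustration of the evaluation steps named in Aim 1, the following is a minimal sketch, not the study's actual pipeline. It shows how an ROC curve and its area can be computed from predicted scores, and how samples can be split into folds for cross-validation. The function names (`roc_curve`, `auc`, `k_fold_indices`) and the toy labels and scores are assumptions made for this example.

```python
import random


def roc_curve(labels, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over the scores.

    labels: 0/1 outcomes; scores: higher means more likely positive.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC needs at least one positive and one negative sample")
    # Sort samples from highest to lowest score.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


# Toy example (made-up values, not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_fpr, toy_tpr = roc_curve(toy_labels, toy_scores)
toy_auc = auc(toy_fpr, toy_tpr)
folds = k_fold_indices(10, k=5)
```

In a cross-validation loop, each fold would in turn be held out for scoring while the model and its tuning parameters are fitted on the remaining folds.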
Aim 2. To build a comprehensive prediction model by integrating pathological imaging and clinical information. We will develop a novel computational approach to integrate different data sources using an Area Under the Curve (AUC) weighted average method. This approach is intended to make the best use of the information in each data type and to optimize prediction performance. Lasso-based penalization will be used to address possible collinearity problems.
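The proposal does not specify how the AUC weighted average is formed. One plausible reading, sketched below under that assumption, is to weight each data source's predicted score by its (cross-validated) AUC, normalized so the weights sum to one. The function name `auc_weighted_average`, the source names, and the numbers are hypothetical.

```python
def auc_weighted_average(scores_by_source, auc_by_source):
    """Combine per-source risk scores using weights proportional to each source's AUC.

    scores_by_source: dict mapping source name -> list of scores (same patient order).
    auc_by_source: dict mapping source name -> AUC estimated for that source.
    Assumption: weight_i = AUC_i / sum of all AUCs.
    """
    total_auc = sum(auc_by_source[name] for name in scores_by_source)
    weights = {name: auc_by_source[name] / total_auc for name in scores_by_source}
    n_patients = len(next(iter(scores_by_source.values())))
    combined = [
        sum(weights[name] * scores_by_source[name][i] for name in scores_by_source)
        for i in range(n_patients)
    ]
    return combined, weights


# Toy example with two sources and two patients (made-up values).
toy_scores = {"imaging": [0.2, 0.8], "clinical": [0.4, 0.6]}
toy_aucs = {"imaging": 0.8, "clinical": 0.6}
combined_scores, source_weights = auc_weighted_average(toy_scores, toy_aucs)
```

Other weightings are equally conceivable, for example weights proportional to AUC minus 0.5 so that a source performing at chance level contributes nothing.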
Aim 3. To validate and characterize the comprehensive prediction model using an independent patient cohort. We will randomly select 400 non-small cell lung cancer (NSCLC) samples from the UT Lung SPORE Tissue Bank. For this patient cohort, we will collect clinical and epidemiological data, as well as pathological image data from both normal and tumor tissues. We will test the prediction model on this cohort and use a blinded design to protect the validity of the test results.
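The random selection and blinding described in Aim 3 could be organized along the following lines. This is only a sketch under stated assumptions: the sample identifiers, the coding scheme (`S0001`, `S0002`, ...), the seed, and the function name `select_blinded_cohort` are all hypothetical. The idea is that analysts running the prediction model see only the opaque codes, while the key linking codes to tissue-bank samples and outcomes is held separately until predictions are locked.

```python
import random


def select_blinded_cohort(sample_ids, n_select=400, seed=2024):
    """Randomly select n_select samples and assign each an opaque study code.

    Returns (codes, key): `codes` is what the blinded analyst sees;
    `key` maps each code back to the original sample ID and is kept apart.
    """
    rng = random.Random(seed)  # fixed seed makes the selection reproducible
    # Sort first so the draw does not depend on the input order.
    chosen = rng.sample(sorted(sample_ids), n_select)
    key = {f"S{i:04d}": sample_id for i, sample_id in enumerate(chosen, start=1)}
    codes = sorted(key)
    return codes, key


# Toy example with 1000 made-up tissue-bank identifiers.
bank_ids = [f"bank-{i}" for i in range(1000)]
blinded_codes, unblinding_key = select_blinded_cohort(bank_ids)
```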