Image Synthesis for Data Augmentation in Medical CT using Deep Learning
A. Krishna, K. Bartake, C. Niu, G. Wang, Y. Lai, X. Jia, K. Mueller, "Image synthesis for data augmentation in medical CT using deep reinforcement learning," to appear in International Meeting on Fully Three-Dimensional Image Reconstruction, 2021.
A. Krishna, K. Mueller, "Medical (CT) Image Generation with Style," International Meeting on Fully Three-Dimensional Image Reconstruction, Philadelphia, PA, June 2019.
Thus far we have worked with very limited data, and access to the NLST data would help us tremendously in moving this work forward.
Motivating our work is the fact that over the past several years, artificial intelligence and machine learning, especially deep learning, have been the most prominent direction of tomographic research. It is widely recognized that these technologies represent a paradigm shift, with great promise for tomographic reconstruction, image processing, and analysis. However, it is also well known that deep neural networks often suffer from generalizability issues. This critical challenge must be addressed to optimize the performance of deep neural networks in medical applications.
Methods that build on artificial intelligence require massive training data. However, access to clinical CT raw data is usually limited to researchers under restrictive agreements, due to patient privacy, clinical overhead, and company proprietary considerations. Even when local repositories are available, they are usually not varied enough for robust AI model development. Hence, there is a great need to obtain such data via synthesis. The approach we have been developing bears great promise in achieving this goal. It will derive complete knowledge of real-world sample distributions even in the absence of complete real data, and it will generate massive amounts of realistic data and images at high variety, which can then be used to evaluate the generalizability and robustness of proposed AI-based methods and systems.
Having the NLST data available will be instrumental for devising a system that can generate elaborate and imaginative image samples matching the quality and realism of real-world samples. We believe that this innovative learning approach will elevate CT simulation to a new level.
Specific Aims:
Aim 1: Decompose the CT images into anatomical features such as the lungs, heart, esophagus, spinal cord, and outer torso. Use the NLST data to learn parameterized boundary representations for these features, which can then be modulated in the synthesis process using a Generative Adversarial Network (GAN).
Aim 2: Use the NLST data to learn the textures of these features and other image regions, to be used by a Style GAN framework to fill in the boundary representations generated in Aim 1.
Aim 3: Use the NLST data to learn realistic pathological features and then add them as a special layer on top of the images generated in Aims 1 and 2 (a minimal code sketch of this three-stage pipeline follows these aims).
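To make the three aims concrete, below is a minimal PyTorch sketch of the intended pipeline: a boundary generator (Aim 1), a style-conditioned texture generator (Aim 2), and an additive pathology layer (Aim 3). All module names, layer sizes, the crude rasterization stand-in, and the number of anatomical features are illustrative assumptions made only for this sketch; they do not represent the architecture that would actually be trained on the NLST data.

```python
# Minimal sketch of the three-stage synthesis pipeline (Aims 1-3).
# All sizes and module designs here are illustrative assumptions.

import torch
import torch.nn as nn

N_FEATURES = 5        # assumed: lungs, heart, esophagus, spinal cord, outer torso
N_CTRL_POINTS = 32    # assumed control points per parameterized boundary contour
LATENT_DIM = 128
IMG_SIZE = 64         # reduced resolution for the sketch


class BoundaryGenerator(nn.Module):
    """Aim 1: map a latent code to parameterized boundary contours
    (one set of control points per anatomical feature)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, N_FEATURES * N_CTRL_POINTS * 2), nn.Tanh(),
        )

    def forward(self, z):
        # (batch, features, control points, xy) in [-1, 1] image coordinates
        return self.net(z).view(-1, N_FEATURES, N_CTRL_POINTS, 2)


class TextureGenerator(nn.Module):
    """Aim 2: fill the regions defined by the boundary masks with tissue
    texture, conditioned on a style code (stand-in for a Style GAN)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(N_FEATURES + LATENT_DIM, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, masks, style):
        # Broadcast the style code over the spatial grid and concatenate
        # it with the per-feature boundary masks.
        style_map = style[:, :, None, None].expand(-1, -1, IMG_SIZE, IMG_SIZE)
        return self.net(torch.cat([masks, style_map], dim=1))


class PathologyLayer(nn.Module):
    """Aim 3: generate a residual pathology map added on top of the
    healthy synthetic image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image):
        return torch.clamp(image + 0.1 * self.net(image), 0.0, 1.0)


def rasterize(contours):
    """Crude stand-in for boundary rasterization: one soft blob per feature,
    centered on the mean of its control points."""
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, IMG_SIZE), torch.linspace(-1, 1, IMG_SIZE), indexing="ij"
    )
    grid = torch.stack([xs, ys], dim=-1)                     # (H, W, 2)
    centers = contours.mean(dim=2)                           # (batch, features, 2)
    d2 = ((grid[None, None] - centers[:, :, None, None]) ** 2).sum(-1)
    return torch.exp(-10.0 * d2)                             # (batch, features, H, W)


if __name__ == "__main__":
    z = torch.randn(2, LATENT_DIM)
    style = torch.randn(2, LATENT_DIM)
    contours = BoundaryGenerator()(z)           # Aim 1: boundary parameters
    masks = rasterize(contours)
    healthy = TextureGenerator()(masks, style)  # Aim 2: texture fill
    synthetic = PathologyLayer()(healthy)       # Aim 3: pathology overlay
    print(synthetic.shape)                      # torch.Size([2, 1, 64, 64])
```

The separation into boundary, texture, and pathology modules mirrors the aims themselves: each stage can be modulated independently, so that anatomical shape, tissue appearance, and pathology can be varied separately in the synthesized images.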
Ge Wang, PhD, Rensselaer Polytechnic Institute
Xun Jia, PhD, University of Texas Southwestern Medical Center