Autoencoder for CT images and automatic clustering of patients
Principal Investigator
Name
Michael Roberts
Degrees
Ph.D., MMath
Institution
AstraZeneca
Position Title
Postdoctoral fellow
Email
About this CDAS Project
Study
NLST
Project ID
NLST-647
Initial CDAS Request Approval
Mar 10, 2020
Title
Autoencoder for CT images and automatic clustering of patients
Summary
We have developed an autoencoder that encodes a CT image into a feature vector. The feature vector has far lower dimension (length ~4000) than a CT image (~200 million voxels). We would like to utilise the National Lung Screening Trial (NLST) data to better train the autoencoder, as this resource is the largest available and covers a wide variety of CT scanners. We believe the resulting model will generalise well and be of great use to the research community, allowing entire datasets of CT images to be encoded into small feature vectors for (1) machine learning analysis to classify patients by outcome and (2) automated clustering of patients by features. Clustering based on the encoded features should give outputs of clinical utility, for example adverse event prediction. Nothing like this currently exists, and we have the opportunity to create it and release the model publicly for researchers.
Aims
> Train an autoencoder on the NLST data that generalises well to new CT images.
> Apply the trained autoencoder to new datasets to show that the encoded features can be used to classify patients into outcome groups and to predict adverse events.
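As an illustration of the clustering aim, the following is a minimal sketch of grouping encoded feature vectors. The project does not state which clustering method is used, so plain k-means is assumed here. The function name `kmeans`, its parameters, and the synthetic 8-dimensional vectors (standing in for the ~4000-length encodings) are all hypothetical.

```python
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length feature vectors.

    Assumption: k-means is used purely for illustration; the project
    summary does not name a clustering algorithm.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic stand-in for encoded CT features: two well-separated groups.
# Real encodings would come from the trained autoencoder (length ~4000).
rng = random.Random(42)
DIM = 8
group_a = [[rng.gauss(0.0, 0.1) for _ in range(DIM)] for _ in range(10)]
group_b = [[rng.gauss(5.0, 0.1) for _ in range(DIM)] for _ in range(10)]
features = group_a + group_b

labels, centroids = kmeans(features, k=2)
```

In practice the cluster labels would then be compared against recorded patient outcomes; that step is omitted because it depends on data not described here.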
Collaborators
Andrew Reynolds, AstraZeneca
Mishal Patel, AstraZeneca
Tom White, AstraZeneca