Minibatch Gradient Descent Method for Deep Survival Analysis
As more medical images become available for diagnosis and treatment, researchers have developed deep learning methods for survival analysis that use these imaging data. These methods, however, do not scale to massive datasets because of the unique challenges of working with survival data. We aim to develop a deep convolutional neural network that predicts survival from medical images for a large number of patients.
Gradient descent is a commonly used method for fitting the parameters of a neural network by minimizing its loss function. To find a local minimum of the loss function, it moves along the direction of steepest descent in each iteration until it converges. However, gradient descent is computationally inefficient on a large dataset, because every sample in the training set must be evaluated before each parameter update. A typical way to overcome this limitation is minibatch gradient descent, which splits the training set into small batches and updates the parameters iteratively, one batch at a time. Unfortunately, standard minibatch gradient descent cannot be used for the Cox proportional hazards model without substantial modification: the Cox partial likelihood term for each observed event depends on the risk set of all patients still under observation at that time, so the loss does not decompose into a sum of independent per-sample terms. We have developed a novel algorithm that adapts gradient descent for survival data so that it is amenable to minibatch training.
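The coupling between samples in the Cox loss can be made concrete with a small sketch. The code below is illustrative only and is not the algorithm proposed here: it computes the negative log partial likelihood with Breslow-style handling of ties, averaged over observed events, and provides a helper that shuffles indices into minibatches. The function names `cox_neg_log_partial_likelihood` and `minibatches`, the averaging convention, and the use of NumPy are all assumptions made for this example.

```python
import numpy as np

def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative log Cox partial likelihood, averaged over observed events.

    risk  : (n,) predicted log-risk scores (e.g. network outputs)
    time  : (n,) observed follow-up times
    event : (n,) 1 if the event was observed, 0 if censored
    Hypothetical helper for illustration; ties are handled Breslow-style.
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    n_events = event.sum()
    if n_events == 0:
        return 0.0  # no observed events, so no partial likelihood terms
    # at_risk[i, j] is True when subject j is still at risk at time[i]
    at_risk = time[None, :] >= time[:, None]
    # log of the sum of exp(risk_j) over each risk set, shifted for stability
    shift = risk.max()
    log_denom = shift + np.log((at_risk * np.exp(risk - shift)[None, :]).sum(axis=1))
    # only subjects with an observed event contribute a term
    return float(-(event * (risk - log_denom)).sum() / n_events)

def minibatches(n, batch_size, rng):
    """Yield shuffled index arrays that together cover range(n) exactly once."""
    idx = rng.permutation(n)
    for start in range(0, n, batch_size):
        yield idx[start:start + batch_size]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8
    risk = rng.normal(size=n)
    time = rng.exponential(size=n)
    event = rng.integers(0, 2, size=n)
    print("full-data loss:", cox_neg_log_partial_likelihood(risk, time, event))
    for batch in minibatches(n, 4, rng):
        # Risk sets here contain only subjects in the batch (a naive restriction)
        print("batch loss:", cox_neg_log_partial_likelihood(risk[batch], time[batch], event[batch]))
```

Evaluating the loss on a batch, as in the loop above, silently restricts every risk set to the subjects in that batch. The batch losses therefore do not in general average to the full-data loss, which is the difficulty that a minibatch method for survival data has to address.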
The National Lung Screening Trial (NLST) dataset would be an excellent benchmark for our methodology because of its large number of samples. We will conduct experiments on the NLST dataset to evaluate the performance of the minibatch gradient descent method and compare it with existing methods. We plan to publish our findings in major machine learning and bioinformatics outlets, and to cite and acknowledge the NLST for its critical role in developing this new methodology.
1. Develop an algorithm that adapts standard gradient descent for survival data so that it is amenable to minibatch gradient descent.
2. Perform experiments on the NLST dataset with the algorithm we developed.
3. Compare the performance of the proposed method with that of existing methods.
Xiawei Wang, Ph.D. student, Graduate Group in Biostatistics, University of California-Davis, email@example.com
Xiaoyue Li, Ph.D. candidate, Dept. of Statistics, University of California-Davis, firstname.lastname@example.org
Thomas Lee, Professor, Dept. of Statistics, University of California-Davis, email@example.com