Principal Investigator
Name
Mingzhe Wu
Degrees
Ph.D.
Institution
Fudan University
Position Title
Student
Email
About this CDAS Project
Study
NLST
Project ID
NLST-387
Initial CDAS Request Approval
Jan 31, 2018
Title
Evaluating lung cancer risks with time-to-event data: state-of-the-art model with certified performance
Summary
The classic Cox proportional hazards model (Cox, 1972) makes the fundamental assumption that features act linearly on the log hazard. While this is useful and easy to explain in clinical trials, where covariates are measurements of medical indicators, the linear assumption is too stringent when covariates are observed only up to an implicit (inverse) feature map. This becomes especially evident when some covariates, such as images or language, are not naturally Euclidean vectors and therefore require an intermediate representation learning step (Bengio et al, 2013), for which deep neural networks are the state-of-the-art solution.
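
For reference, the linear assumption and the kind of relaxation described above can be written as follows. The notation is an assumption made here for illustration: h_0 is the baseline hazard, \beta the coefficient vector, and f_\theta a hypothetical learned function (for example a neural network) that replaces the linear predictor.

```latex
% Cox proportional hazards model: linear effect on the log hazard
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Relaxation with a learned (nonlinear) multiplicative factor
h(t \mid x) = h_0(t)\,\exp\!\left(f_{\theta}(x)\right)
```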

To use the proportional hazards model for representation learning in the image or language domain, the basic assumptions need to be clarified, because time-to-event data does not have a trivial formulation of learning-theoretic risk. The most common are parametric assumptions on the baseline survival function, which generalize fairly easily from a log-linear multiplicative factor to a neural network factor. The more useful semiparametric approach, which uses the partial likelihood as its optimization objective to obtain parameter estimates, is much harder to fit with a clean SGD scheme, since the computational cost of a gradient component varies from one sample to another. A doubly stochastic proximal gradient descent algorithm (Achab et al, 2015) has been proposed with theoretical performance guarantees for specific objective functions, and it gives useful insight into how prevailing deep neural network architectures (e.g. multi-layer convolutional neural networks such as AlexNet or ResNet) can be used as feature extractors for the image domain. The algorithm, which proceeds in phases, combines SGD and MCMC to handle both the very large sample size and the inner complexity of each gradient computation. Within a phase, each inner iteration samples an index and obtains an approximation of the gradient by applying an MCMC algorithm.
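
To make the varying per-sample cost concrete, here is a minimal sketch of the negative log partial likelihood in plain Python. It is not the doubly stochastic algorithm of Achab et al. and does no MCMC sampling; it only evaluates the exact objective. The function name, the argument names, and the absence of any tie correction are assumptions made for illustration.

```python
import math


def neg_log_partial_likelihood(scores, times, events):
    """Average negative log partial likelihood (no tie correction).

    scores: log relative risks f(x_i), e.g. the outputs of a network
    times:  observed follow-up times
    events: 1 if the event was observed, 0 if the subject was censored
    """
    total = 0.0
    n_events = 0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            # Censored subjects contribute only through others' risk sets.
            continue
        # Risk set of subject i: everyone still under observation at t_i.
        # Its size differs from subject to subject, so the cost of this
        # term (and of its gradient) differs too.
        risk = [s for s, t in zip(scores, times) if t >= t_i]
        # Numerically stable log-sum-exp over the risk set.
        m = max(risk)
        log_sum = m + math.log(sum(math.exp(s - m) for s in risk))
        total += scores[i] - log_sum
        n_events += 1
    return -total / max(n_events, 1)
```

Each event term sums over its own risk set, so a naive stochastic gradient step on one sampled index can cost anywhere from one to n exponentials. That is the difficulty the sampled inner approximation described above is meant to address.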

Considering the high resolution of pathological images and the heterogeneity of patients, the WSISA framework (Zhu et al, 2017) is needed. It consists of four main stages: sampling, clustering, cluster selection, and aggregation. With the WSISA framework, histopathological images can be used directly to predict patients’ clinical outcomes, which is expected to greatly benefit cancer treatment.

References

[1] Achab, Massil, et al. "SGD with Variance Reduction beyond Empirical Risk Minimization." arXiv preprint arXiv:1510.04822 (2015).

[2] Bengio, Yoshua, Aaron Courville, and Pascal Vincent. "Representation learning: A review and new perspectives." IEEE transactions on pattern analysis and machine intelligence 35.8 (2013): 1798-1828.

[3] Cox, David R. "Regression models and life-tables." Journal of the Royal Statistical Society: Series B (Methodological) 34.2 (1972): 187-220.

[4] Zhu, Xinliang, et al. "WSISA: Making survival prediction from whole slide histopathological images." IEEE Conference on Computer Vision and Pattern Recognition. 2017.
Aims

Aim 1: We expect that an algorithm that scales better with sample size will make more precise predictions possible (the prediction metric from (93 Article) will be adopted).
Aim 2: The project may produce empirical evidence supporting certain theoretical results on the statistical consistency of the Breslow estimator of the nonparametric baseline function in the presence of a nonparametric multiplicative factor.
Aim 3: A deep survival analysis package using the doubly stochastic proximal gradient descent algorithm will be developed.
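
As an illustration of the quantity in Aim 2, the sketch below computes the Breslow estimate of the cumulative baseline hazard from fitted log relative risks. It is a minimal example under stated assumptions, not the project's implementation: the function name and arguments are hypothetical, and the scores may come from any fitted multiplicative factor, linear or not.

```python
import math


def breslow_cumulative_hazard(scores, times, events):
    """Breslow estimate of the cumulative baseline hazard H0(t).

    Returns a list of (event_time, H0) pairs in increasing time order.
    scores: fitted log relative risks f(x_i)
    times:  observed follow-up times
    events: 1 if the event was observed, 0 if censored
    """
    event_times = sorted({t for t, d in zip(times, events) if d})
    cumulative = 0.0
    curve = []
    for t_k in event_times:
        # Number of observed events at exactly t_k.
        deaths = sum(1 for t, d in zip(times, events) if d and t == t_k)
        # Total relative risk of the subjects still at risk at t_k.
        risk_sum = sum(math.exp(s) for s, t in zip(scores, times) if t >= t_k)
        cumulative += deaths / risk_sum
        curve.append((t_k, cumulative))
    return curve
```

When all scores are zero the estimate reduces to the Nelson-Aalen estimator, which gives a simple sanity check for an implementation.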

Collaborators

Ruofan Wu
Chanjuan Lin