Precision medicine via survival analysis
1) To address the first challenge mentioned above, we can cluster the patients into several subgroups and build survival prediction and/or diagnosis models for each subgroup whose members share a common susceptibility to a particular disease, rather than targeting each patient individually. Specifically, a latent class survival regression model can be designed to simultaneously cluster the patients into subgroups and train one accurate survival prediction model per subgroup. This model can be trained via the expectation-maximization (EM) algorithm or tensor-decomposition-based methods.
2) Integrating medical image information with other clinical information is a real challenge, not only because the data sources are heterogeneous but also because medical images usually contribute far more features than the other clinical data. The integration method should therefore prevent the clinical information from being overwhelmed. To achieve this, we can use representation learning methods to map the raw data from each source into intermediate representations that preserve the properties of that source and can be easily combined. Alternatively, we can build a survival prediction model for each data source and then integrate the learned models to obtain the final prediction.
3) In healthcare analysis, data are typically longitudinal, with both time-dependent features (e.g., blood pressure and blood glucose) and static features (e.g., race and sex). To encode both kinds of features in a survival prediction model, we can structure the time-dependent features as a third-order tensor with modes sample × feature × time. The temporal smoothness and the concept drift of the time-dependent features can then be encoded efficiently by adding regularization terms, such as the fused lasso and the trace norm, on the slices of the third-order tensor.
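As an illustration of the EM training mentioned in item 1, the sketch below fits a deliberately simplified latent class survival model. It assumes that each subgroup has a constant hazard (an exponential survival model with no covariates) and that censoring is on the right; a full latent class survival regression would replace the closed-form M-step with a weighted regression fit per subgroup. The function name `em_exponential_mixture` and its parameters are hypothetical and chosen only for this example.

```python
import numpy as np

def em_exponential_mixture(t, delta, n_classes=2, n_iter=200):
    """EM for a mixture of exponential survival models with right-censoring.

    t     : observed time (event or censoring time) per patient
    delta : 1 if the event was observed, 0 if the patient was censored
    """
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    # Spread the initial hazards around the pooled estimate (events / total time).
    pooled = delta.sum() / t.sum()
    rates = pooled * np.logspace(-1, 1, n_classes)
    weights = np.full(n_classes, 1.0 / n_classes)
    for _ in range(n_iter):
        # E-step: responsibility of class k for patient i, computed from
        # log pi_k + delta_i * log(lambda_k) - lambda_k * t_i.
        log_p = np.log(weights) + delta[:, None] * np.log(rates) - t[:, None] * rates
        log_p -= log_p.max(axis=1, keepdims=True)  # stabilize the exponentials
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: closed-form weighted maximum-likelihood updates.
        weights = np.clip(resp.mean(axis=0), 1e-12, None)
        events = (resp * delta[:, None]).sum(axis=0)
        exposure = (resp * t[:, None]).sum(axis=0)
        rates = np.clip(events / np.maximum(exposure, 1e-12), 1e-12, None)
    return weights, rates, resp
```

Under these assumptions, `resp.argmax(axis=1)` gives a hard subgroup assignment for each patient, and `rates[k]` is the estimated hazard of subgroup `k`.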
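For the representation-based integration described in item 2, one minimal way to keep a high-dimensional image source from overwhelming a small clinical source is to embed each source separately and rescale the embeddings before combining them. The sketch below uses PCA as a simple stand-in for a representation learning method and gives every source the same overall weight; the helper names `pca_embed` and `fuse_sources` are hypothetical, and a learned (e.g., neural) encoder could take the place of PCA.

```python
import numpy as np

def pca_embed(x, n_components):
    """Project the centered rows of x onto the top principal components."""
    x = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def fuse_sources(sources, n_components):
    """Embed each data source separately, then concatenate the embeddings.

    Each embedding is scaled to unit Frobenius norm so that a source with
    thousands of features carries the same total weight as a small one.
    """
    blocks = []
    for x in sources:
        x = np.asarray(x, dtype=float)
        z = pca_embed(x, min(n_components, x.shape[1]))
        norm = np.linalg.norm(z)
        blocks.append(z / norm if norm > 0 else z)
    return np.hstack(blocks)
```

The fused matrix can then be passed to any survival model; the alternative named in item 2, fitting one model per source and combining their predictions, would skip the concatenation step.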
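The two regularizers named in item 3 can be sketched as follows for a tensor with modes sample × feature × time. This is only an illustration under stated assumptions: the fused lasso is taken along the time mode, the trace norm is applied to each per-sample feature × time slice, and the function names are hypothetical. In a full model these penalties would be added to the survival loss and minimized with a suitable (e.g., proximal) solver, which is not shown here.

```python
import numpy as np

def fused_lasso_penalty(tensor):
    """Sum of absolute differences between consecutive time points.

    Small values mean the entries change smoothly (or stay constant) over time.
    """
    return np.abs(np.diff(tensor, axis=2)).sum()

def trace_norm_penalty(tensor):
    """Sum of the trace (nuclear) norms of the feature x time slices.

    np.linalg.svd operates on the last two axes, so one call returns the
    singular values of every sample's slice.
    """
    singular_values = np.linalg.svd(tensor, compute_uv=False)
    return singular_values.sum()

def temporal_regularizer(tensor, lam_fused=1.0, lam_trace=1.0):
    """Weighted sum of the two penalties; the weights are tuning parameters."""
    return lam_fused * fused_lasso_penalty(tensor) + lam_trace * trace_norm_penalty(tensor)
```

A tensor that is constant over time incurs no fused lasso penalty, while low-rank slices keep the trace norm term small, which is how the two terms favor temporal smoothness and shared low-dimensional structure respectively.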
Jieping Ye, University of Michigan
Jiayu Zhou, Michigan State University
Lu Wang, Wayne State University