Principal Investigator
Name
Mihaela van der Schaar
Degrees
Ph.D.
Institution
University of California, Los Angeles
Position Title
Professor
About this CDAS Project
Study
PLCO
Project ID
PLCO-453
Initial CDAS Request Approval
Feb 11, 2019
Title
Feature Selection for Survival Analysis with Competing Risks using Deep Learning
Summary
Deep learning models for survival analysis have gained significant attention in the literature, but their performance degrades severely when the dataset contains many irrelevant features. We will give empirical evidence of this problem in real-world medical settings using the state-of-the-art model DeepHit. We will then develop novel feature selection methods for survival analysis to improve the deep learning model: filter methods for hard feature selection, and a neural network architecture that weights features for soft feature selection. Our experiments on a real-world medical dataset will demonstrate that substantial performance improvements over the original models are achievable.
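
To make the distinction between hard and soft feature selection concrete, here is a minimal Python sketch. It is not the project's method. As an assumption for illustration, the filter scores each feature by how far its cause-specific concordance index lies from 0.5, treating events from other causes as censored, and keeps the top k features. The soft variant scales each feature by a sigmoid gate, which in a real network would be learned jointly with the survival model. All function names and the event coding (0 = censored, 1, 2, ... = causes) are hypothetical.

```python
import numpy as np

def cause_specific_cindex(x, time, event, cause):
    """Concordance of one feature with time-to-event for a single cause.

    Assumption: events from other causes are treated as censored.
    A value near 0.5 means the feature carries little ranking information.
    """
    x = np.asarray(x, dtype=float)
    time = np.asarray(time, dtype=float)
    is_event = np.asarray(event) == cause
    concordant, comparable = 0.0, 0
    for i in np.flatnonzero(is_event):
        later = time > time[i]  # subjects observed longer than subject i
        comparable += int(later.sum())
        # Higher feature value with earlier event counts as concordant; ties count half.
        concordant += np.sum(x[i] > x[later]) + 0.5 * np.sum(x[i] == x[later])
    return concordant / comparable if comparable else 0.5

def filter_select(X, time, event, cause, k):
    """Hard selection: keep the k features whose concordance is farthest from 0.5."""
    X = np.asarray(X, dtype=float)
    scores = np.array([
        abs(cause_specific_cindex(X[:, j], time, event, cause) - 0.5)
        for j in range(X.shape[1])
    ])
    keep = np.sort(np.argsort(scores)[::-1][:k])
    return keep, scores

def soft_feature_weights(X, logits):
    """Soft selection: scale each feature by a gate in (0, 1).

    In a network the logits would be trainable parameters; here they are given.
    """
    gates = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return np.asarray(X, dtype=float) * gates

# Toy data: column 0 orders subjects by risk, column 1 is constant (irrelevant).
X = np.array([[3.0, 1.0], [2.0, 1.0], [1.0, 1.0], [0.0, 1.0]])
time = np.array([1.0, 2.0, 3.0, 4.0])
event = np.array([1, 1, 2, 0])  # 0 = censored, 1 and 2 = competing causes
keep, scores = filter_select(X, time, event, cause=1, k=1)
print(keep, scores)
```

The hard filter discards features before training, while the gates keep every feature and only shrink its influence, so the two can also be combined.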
Aims

Recent research has produced a variety of successful new deep learning models for survival analysis. Whilst some methods make strong parametric assumptions, more general models have also been developed. However, deep learning approaches suffer from performance deficits when there are many irrelevant features. This is often the case in medical datasets, where numerous features may be recorded about a patient. In this paper, we give evidence for this problem using DeepHit on a large real-world medical dataset, and propose feature selection techniques to achieve substantial performance improvements.

Collaborators

Carl Rietschel, University of Oxford, United Kingdom
Jinsung Yoon, University of California, Los Angeles, USA
Mihaela van der Schaar, University of California, Los Angeles, USA