IEEE Access (Jan 2019)

HitBoost: Survival Analysis via a Multi-Output Gradient Boosting Decision Tree Method

  • Pei Liu,
  • Bo Fu,
  • Simon X. Yang

DOI
https://doi.org/10.1109/ACCESS.2019.2913428
Journal volume & issue
Vol. 7
pp. 56785 – 56795

Abstract

Survival analysis, used in areas such as healthcare and finance, studies the probability distribution of the time until an event of interest occurs. Among the methods for building survival predictive models, one class combines machine learning techniques with assumptions about the hazard function, while another class uses complex neural networks to learn a latent representation of the hazard function directly. For traditional survival predictive models, the assumptions about the hazard function restrict their performance to some extent. The more advanced models built on complex neural networks, in turn, suffer from poor interpretability in real applications. To address these problems, this paper proposes a novel survival analysis method named HitBoost, which predicts the probability distribution of the first hitting time (FHT). Instead of making any assumptions about the underlying stochastic process, HitBoost adopts a multi-output gradient boosting decision tree to implicitly capture the connections between the static covariates and the underlying stochastic process. Furthermore, the statistics collected during tree boosting can be used to measure feature importance. Evaluations and case studies on benchmarks show that, in comparison to classical methods, HitBoost is superior in prediction performance and risk discrimination. HitBoost can therefore be used as an effective method to build survival predictive models or to identify important factors for cause-specific failure.

Keywords