Archives of Trauma Research (Nov 2023)
A comprehensive evaluation of ensemble learning methods and decision trees for predicting trauma patient discharge status using real-world data
Abstract
Background: Trauma registries collect and document data about acute injury care in hospitals. The goal of trauma care systems is to reduce injury occurrence and enhance trauma patient survival rates.

Objectives: In this study, the Kashan trauma registry was used to predict trauma patient discharge status with machine learning, comparing ensemble learning techniques (bagging, boosting, and stacking) against decision trees configured with various parameter settings.

Methods: This study used 3930 records from the Kashan Trauma Centre Registry after preprocessing. Decision trees of varying depths were built and evaluated using three splitting criteria: information gain, Gini index, and gain ratio. Bagging, boosting, and stacking ensemble models were then developed on top of these decision trees. Predictive performance was evaluated using accuracy, precision, recall, and the area under the receiver operating characteristic curve (AUC).

Results: The stacking model, which combined decision trees (depth = 5) built with information gain, gain ratio, and Gini index, together with KNN (k = 12, Euclidean distance) as base learners, and logistic regression as the meta-classifier, demonstrated superior predictive performance compared with individual decision trees, bagging, and boosting.

Conclusion: Although decision trees are straightforward algorithms and ensemble methods are more time-consuming and computationally complex, this study indicates that stacking outperforms single decision trees across a variety of parameter settings, as well as bagging and boosting.
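To make the stacking configuration concrete, the following is a minimal sketch written with scikit-learn; it illustrates the technique described in the abstract and is not the authors' original code. Two caveats: scikit-learn's DecisionTreeClassifier supports the Gini index ("gini") and information gain ("entropy") but offers no gain-ratio criterion, so only two of the three tree variants are reproduced, and a synthetic dataset stands in for the registry data.

    # Illustrative sketch of the stacking setup from the abstract (not the authors' code).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import StackingClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Synthetic binary-outcome data standing in for the preprocessed registry records.
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    base_learners = [
        # Depth-5 decision trees with different splitting criteria (abstract: depth = 5).
        # Gain ratio is unavailable in scikit-learn, so that variant is omitted here.
        ("dt_gini", DecisionTreeClassifier(criterion="gini", max_depth=5, random_state=0)),
        ("dt_info_gain", DecisionTreeClassifier(criterion="entropy", max_depth=5, random_state=0)),
        # KNN with k = 12 and Euclidean distance (Minkowski metric with p = 2).
        ("knn", KNeighborsClassifier(n_neighbors=12, metric="minkowski", p=2)),
    ]

    # Logistic regression combines the base learners' predictions at the meta level.
    stacking_model = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )

    # Evaluate by AUC, one of the metrics named in the abstract
    # (accuracy, precision, and recall follow the same pattern).
    auc_scores = cross_val_score(stacking_model, X, y, cv=5, scoring="roc_auc")
    print(auc_scores.mean())

In this design, each base learner's out-of-fold predictions become input features for the logistic regression meta-classifier, which is what lets stacking exploit the complementary strengths of trees grown under different splitting criteria and the distance-based KNN.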
Keywords