Scientific Reports (Jun 2024)
An interpretable ensemble structure with a non-iterative training algorithm to improve the predictive accuracy of healthcare data analysis
Abstract
Abstract The modern development of healthcare is characterized by a set of large volumes of tabular data for monitoring and diagnosing the patient's condition. In addition, modern methods of data engineering allow the synthesizing of a large number of features from an image or signals, which are presented in tabular form. The possibility of high-precision and high-speed processing of such large volumes of medical data requires the use of artificial intelligence tools. A linear machine learning model cannot accurately analyze such data, and traditional bagging, boosting, or stacking ensembles typically require significant computing power and time to implement. In this paper, the authors proposed a method for the analysis of large sets of medical data, based on a designed linear ensemble method with a non-iterative learning algorithm. The basic node of the new ensemble is an extended-input SGTM neural-like structure, which provides high-speed data processing at each level of the ensemble. Increasing prediction accuracy is ensured by dividing the large dataset into parts, the analysis of which is carried out in each node of the ensemble structure and taking into account the output signal from the previous level of the ensemble as an additional attribute on the next one. Such a design of a new ensemble structure provides both a significant increase in the prediction accuracy for large sets of medical data analysis and a significant reduction in the duration of the training procedure. Experimental studies on a large medical dataset, as well as a comparison with existing machine learning methods, confirmed the high efficiency of using the developed ensemble structure when solving the prediction task.