Applied Sciences (Aug 2024)
Integrating Structured and Unstructured Data with BERTopic and Machine Learning: A Comprehensive Predictive Model for Mortality in ICU Heart Failure Patients
Abstract
Heart failure remains a leading cause of mortality worldwide, particularly among patients in the intensive care unit (ICU). This study presents an approach to predicting ICU mortality that integrates structured and unstructured electronic health record (EHR) data through a hybrid methodology combining BERTopic with machine learning. The MIMIC-III database serves as the primary data source, providing structured and unstructured data for 6606 heart-failure patients admitted to the ICU. The unstructured data are processed with BERTopic, and machine-learning algorithms are then used for prediction and performance evaluation. The results indicate that including unstructured data significantly improves the model's accuracy in predicting patient mortality. Combining structured and unstructured data also helps identify key variables, further improving the model's precision. The developed model shows potential for improving healthcare decision-making, patient outcomes, and resource allocation in the ICU. By making use of the clinical narrative records written by healthcare professionals, this work extends beyond traditional predictive tools that rely on structured data alone. This study contributes to the ongoing discourse in critical care and predictive modeling, offering insights into the potential of integrating unstructured data into healthcare analytics.
Keywords