BMC Medical Informatics and Decision Making (Dec 2022)
A machine learning approach for predicting high risk hospitalized patients with COVID-19 SARS-Cov-2
Abstract
Background: This study aimed to explore whether explainable Artificial Intelligence methods can be used effectively to improve the medical management of patients suffering from complex diseases, and in particular to predict the risk of death in hospitalized patients with SARS-CoV-2 based on admission data.
Methods: This work is based on an observational ambispective study of patients older than 18 years with a positive SARS-CoV-2 diagnosis who were admitted to the hospital Azienda Ospedaliera “SS Antonio e Biagio e Cesare Arrigo”, Alessandria, Italy, from February 24, 2020 to May 31, 2021, and who completed their treatment within this facility. The patients’ medical history and demographic, epidemiologic and clinical data were collected from the electronic medical records system and paper-based medical records, then entered and managed by the Clinical Study Coordinators using the REDCap electronic data capture tool patient chart. The dataset was used to train and evaluate predictive ML models.
Results: In total, we trained, analysed and evaluated 19 predictive models (both supervised and unsupervised) on data from 824 patients described by 43 features. We focused on models that provide an explanation that is understandable and directly usable by domain experts, and compared their results against other classical machine learning approaches. Among the former, JRIP showed the best performance in 10-fold cross-validation, and the best average performance in a further validation test using a different patient dataset from the beginning of the third COVID-19 wave. Moreover, JRIP performed comparably to other approaches that do not provide a clear and/or understandable explanation.
Conclusions: The supervised ML models correctly discerned between low-risk and high-risk patients, even though the clinical context is complex and the list of features is limited to information available at admission time. Furthermore, the models performed reasonably well on a dataset from the third COVID-19 wave that was not used in the training phase. Overall, these results are remarkable: (i) from a medical point of view, the models produce good predictions despite the possible differences entailed by different care protocols and the possible influence of other viral variants (i.e., the delta variant); (ii) from an organizational point of view, they could be used to optimize the management of the health-care pathway at admission time.
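The evaluation protocol described above (10-fold cross-validation on the training cohort, followed by a check on a later cohort that was never used for training) can be illustrated with a minimal sketch. This is not the authors' pipeline. JRIP is Weka's implementation of the RIPPER rule learner; here a much simpler single-threshold rule learner stands in for it, and the data are randomly generated. Only the sizes 824 and 43 come from the abstract. The later-cohort size (200), the 25% high-risk rate, the injected signal on feature 0, the use of accuracy as the metric, and all function names are assumptions made for illustration.

```python
import random

def make_synthetic(n, n_features=43, seed=0):
    """Random stand-in for admission data (hypothetical; not patient data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.25 else 0      # assumed 25% high-risk rate
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        x[0] += 1.5 * y                          # inject signal so a rule can be learned
        rows.append(x)
        labels.append(y)
    return rows, labels

def fit_rule(rows, labels):
    """Learn one rule 'IF feature j >= t THEN high-risk' by training accuracy."""
    best_acc, best_j, best_t = -1.0, 0, 0.0
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(r[j] for r in rows)
        for q in range(1, 10):                   # decile cut points as candidate thresholds
            t = values[q * n // 10]
            acc = sum((r[j] >= t) == bool(y) for r, y in zip(rows, labels)) / n
            if acc > best_acc:
                best_acc, best_j, best_t = acc, j, t
    return best_j, best_t

def predict(rule, row):
    j, t = rule
    return 1 if row[j] >= t else 0

def accuracy(rule, rows, labels):
    return sum(predict(rule, r) == y for r, y in zip(rows, labels)) / len(rows)

def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds, keeping the class ratio similar in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def cross_validate(rows, labels, k=10, seed=0):
    """Return one held-out accuracy per fold; the rule is refit on each training split."""
    scores = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        rule = fit_rule([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        scores.append(accuracy(rule, [rows[i] for i in held_out],
                               [labels[i] for i in held_out]))
    return scores

rows, labels = make_synthetic(824, seed=0)               # training cohort size from the abstract
later_rows, later_labels = make_synthetic(200, seed=1)   # hypothetical later cohort

cv_scores = cross_validate(rows, labels)
final_rule = fit_rule(rows, labels)                      # refit on all training data
later_score = accuracy(final_rule, later_rows, later_labels)

print("mean 10-fold accuracy:", sum(cv_scores) / len(cv_scores))
print("rule: IF feature %d >= %.2f THEN high-risk" % final_rule)
print("accuracy on later cohort:", later_score)
```

The learned rule prints as a single readable condition, which is the property that makes rule-based models attractive to domain experts. A real analysis would use a full rule learner and metrics suited to imbalanced outcomes rather than plain accuracy.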
Keywords