IEEE Access (Jan 2023)

ML-ECG-COVID: A Machine Learning-Electrocardiogram Signal Processing Technique for COVID-19 Predictive Modeling

  • John Irungu,
  • Timothy Oladunni,
  • Andrew C. Grizzle,
  • Max Denis,
  • Marzieh Savadkoohi,
  • Esther Ososanya

DOI
https://doi.org/10.1109/ACCESS.2023.3335384
Journal volume & issue
Vol. 11
pp. 135994 – 136014

Abstract

Since the outbreak of the coronavirus disease known as COVID-19, there have been several studies on the disease. This study investigates patients’ electrocardiography (ECG) properties for an accurate prediction of this infectious disease. Our findings will be useful to medical practitioners in the accurate prognosis of COVID-19. We analyzed ECG datasets of patients who had tested positive for COVID-19 and of normal persons who had tested negative. Using the analyzed dataset, we designed, developed, and evaluated twelve machine-learning models to predict COVID-19 with different combinations of learning algorithms and feature engineering techniques. Time, frequency, and time-frequency domain features were extracted using the Time Series Feature Extraction Library (TSFEL). A combined feature set comprising time, frequency, and time-frequency attributes was also investigated for prediction. We deployed a t-test to determine the necessary and sufficient features for our predictive modeling. The ECG morphological feature analysis was based on peak-to-peak detection and on the onset and offset of the QRS complex. K-nearest neighbors (KNN), Random Forest, and Support Vector Machine (SVM) learning algorithms were deployed to classify the ECG signals for the prediction of COVID-19. For each feature domain, the Shapley Additive Explanations (SHAP) visualization tool was used to ‘open’ the black box for explainability and interpretability of feature importance and weights. The SHAP results suggest that the importance of the extracted features within each domain is consistent across algorithms, although the weights and their proportions are not the same. Performance evaluation of our classifiers was based on accuracy, sensitivity, precision, and F1 score. Our experimental results show that classification using the features of all domains (temporal, spectral, and wavelet) with Random Forest has a slight edge over the other models. The proposed model has an accuracy, sensitivity, precision, and F1 score of 97.09%, 98%, 97%, and 97%, respectively. As COVID-19 continues to pose a threat to our world with its constant mutation, we believe that a study on its accurate prediction is critical for an effective universal mitigation strategy.
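
Two steps named in the abstract, screening features with a t-test and scoring a classifier by accuracy, sensitivity, precision, and F1 score, can be illustrated with a minimal sketch. This is not the authors' code. The feature names, the toy values, and the cutoff of 2.0 on the absolute t statistic are all assumptions made for illustration, and the sketch uses Welch's form of the t statistic with a fixed cutoff in place of a p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def select_features(pos, neg, threshold=2.0):
    """Keep features whose |t| exceeds the threshold.

    pos and neg map a feature name to that feature's values in each class.
    The threshold is a hypothetical cutoff, not a value from the paper.
    """
    return [name for name in pos if abs(welch_t(pos[name], neg[name])) > threshold]


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall), precision, and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Toy per-class feature values (hypothetical names and numbers).
covid = {"mean_abs_diff": [0.9, 1.1, 1.0, 1.2], "zero_cross": [5, 6, 5, 6]}
normal = {"mean_abs_diff": [0.4, 0.5, 0.6, 0.5], "zero_cross": [6, 5, 6, 5]}
selected = select_features(covid, normal)

# Toy labels: 1 = COVID-19 positive, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)

print(selected)
print(metrics)
```

In this toy data only the first feature separates the two classes, so it is the only one that survives screening. With real data, a p-value from a statistics library would be a more usual selection criterion than a fixed cutoff on the statistic.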

Keywords