Informatics in Medicine Unlocked (Jan 2022)

Prediction of COVID-19 using long short-term memory by integrating principal component analysis and clustering techniques

  • Saratu Yusuf Ilu,
  • Prasad Rajesh,
  • Hassan Mohammed

Journal volume & issue
Vol. 31
p. 100990

Abstract

Coronaviruses are a large family of viruses that cause infections in both animals and humans, ranging from the common cold to coronavirus disease (COVID-19), severe acute respiratory syndrome (SARS), and Middle East respiratory syndrome (MERS). This study aims to predict the number of COVID-19 positive cases in the 36 states of Nigeria using long short-term memory (LSTM), a deep learning algorithm. The proposed approach employs K-means clustering to detect outliers and principal component analysis (PCA) to select important features from the dataset. LSTM was chosen because COVID-19 case counts follow non-linear patterns, which it is well suited to model. For comparison, several machine learning algorithms, including naive Bayes, XGBoost, and support vector machine (SVM), were also employed. In this comparison, LSTM outperformed all of the other algorithms.
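
The abstract does not say how the K-means and PCA steps are configured, so the following is only a minimal sketch of that kind of preprocessing, not the authors' implementation. It assumes NumPy only, synthetic data in place of the Nigerian case records, two clusters, and a hypothetical outlier rule (distance to the assigned centroid above the mean plus three standard deviations). All function names are illustrative, and the LSTM stage is omitted.

```python
import numpy as np


def kmeans(X, k, n_init=5, n_iter=100, seed=0):
    """Lloyd's algorithm with a few random restarts; keeps the lowest-inertia run."""
    rng = np.random.default_rng(seed)
    best = None
    for _restart in range(n_init):
        centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
        for _step in range(n_iter):
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            updated = centroids.copy()
            for j in range(k):
                members = X[labels == j]
                if len(members):  # leave an empty cluster's centroid unchanged
                    updated[j] = members.mean(axis=0)
            if np.allclose(updated, centroids):
                break
            centroids = updated
        # Recompute assignments against the final centroids of this run.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        inertia = float((dists.min(axis=1) ** 2).sum())
        if best is None or inertia < best[0]:
            best = (inertia, centroids, labels)
    return best[1], best[2]


def flag_outliers(X, centroids, labels, z=3.0):
    """Assumed rule: a point is an outlier if it lies unusually far from its centroid."""
    d = np.linalg.norm(X - centroids[labels], axis=1)
    return d > d.mean() + z * d.std()


def pca(X, n_components):
    """Project centered data onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)  # explained-variance ratio per component
    return Xc @ Vt[:n_components].T, explained[:n_components]


# Synthetic stand-in data: two groups of 100 records with 5 features each,
# followed by 3 deliberately extreme rows (indices 200 to 202).
rng = np.random.default_rng(42)
X = rng.normal(size=(203, 5))
X[100:200, 0] += 20.0
X[200] = [0.0, 12.0, 0.0, 0.0, 0.0]
X[201] = [0.0, 0.0, -12.0, 0.0, 0.0]
X[202] = [20.0, 0.0, 0.0, 12.0, 0.0]

centroids, labels = kmeans(X, k=2)
flagged = flag_outliers(X, centroids, labels)
reduced, explained = pca(X[~flagged], n_components=2)
print(int(flagged.sum()), reduced.shape, explained.round(3))
```

In a pipeline like the one the abstract describes, the reduced feature matrix would then be arranged into time-ordered windows and passed to the LSTM. The threshold, the number of clusters, and the number of components kept are tuning choices that the abstract does not specify.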

Keywords