Alexandria Engineering Journal (Jan 2023)
Genomic image representation of human coronavirus sequences for COVID-19 detection
Abstract
Coronavirus (CoV) disease 2019 (COVID-19) is a severe pandemic affecting millions worldwide. Because the virus evolves rapidly, researchers have been developing diagnostic approaches to curb its spread. This study presents an automated approach based on genomic image processing (GIP) techniques to rapidly detect COVID-19, among other human CoV diseases, with high accuracy. The GIP technique was applied as follows. First, genomic graphical mapping techniques were used to convert the genome sequences into genomic grayscale images; the frequency chaos game representation (FCGR) and single gray-level representation (SGLR) techniques were used in this investigation. Then, several statistical features were extracted from the images to train and test several classifiers, including k-nearest neighbors (KNN). The study aimed to determine the efficacy of FCGR images (of different orders) and SGLR images for accurately detecting COVID-19, using a dataset containing both partial and complete genome sequences. The results indicated that the fourth-order FCGR image is a suitable genomic image for extracting statistical features and achieving accurate classification. Furthermore, KNN achieved an overall accuracy of 99.39% in detecting COVID-19, among other human CoV diseases, with 99.48% precision, 99.31% sensitivity, 99.47% specificity, an F1-score of 0.99, and a Matthews correlation coefficient of 0.99.
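To make the sequence-to-image step concrete, the following is a minimal sketch of a k-th order FCGR, not the authors' implementation. It counts every k-mer of a sequence into a 2^k x 2^k matrix and rescales the counts to 8-bit gray levels. The corner assignment of the four bases, the skipping of ambiguous bases, the max-based gray scaling, and the function names `fcgr` and `to_grayscale` are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

# Assumed corner assignment as (x_bit, y_bit); the paper's convention may differ.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(sequence, k=4):
    """Count the k-mers of `sequence` into a 2^k x 2^k frequency matrix."""
    size = 2 ** k
    counts = np.zeros((size, size), dtype=np.int64)
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # assumption: skip k-mers with ambiguous bases such as N
        x = y = 0
        # The last base of the k-mer becomes the most significant bit,
        # so it selects the coarsest quadrant, as in a chaos game walk.
        for base in reversed(kmer):
            bx, by = CORNERS[base]
            x = (x << 1) | bx
            y = (y << 1) | by
        counts[y, x] += 1
    return counts


def to_grayscale(counts):
    """Rescale k-mer counts to 0..255 gray levels (assumed max-based scaling)."""
    peak = counts.max()
    if peak == 0:
        return np.zeros(counts.shape, dtype=np.uint8)
    return np.round(255.0 * counts / peak).astype(np.uint8)
```

With `k=4`, as the results favor, the image is 16 x 16 pixels regardless of sequence length, which is what allows partial and complete genomes to yield feature vectors of the same size. Statistical features would then be computed from the array returned by `to_grayscale(fcgr(seq, k=4))` and passed to a classifier.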