Heliyon (Aug 2024)
Pixel embedding for grayscale medical image classification
Abstract
In this paper, we present an extension of text embedding architectures to grayscale medical image classification. We introduce a mechanism that combines n-gram features with an efficient pixel flattening technique to preserve spatial information when generating feature representations. Our approach flattens all pixels of a grayscale medical image using a combination of column-wise, row-wise, diagonal-wise, and anti-diagonal-wise orderings, so that spatial dependencies are captured effectively in the feature representations. To evaluate the effectiveness of our method, we conducted a benchmark on five grayscale medical image datasets of varying sizes and complexities. Under 10-fold cross-validation, our approach achieved test accuracy scores of 99.92 % on the Medical MNIST dataset, 90.06 % on the Chest X-ray Pneumonia dataset, 96.94 % on the Curated Covid CT dataset, 79.11 % on the MIAS dataset, and 93.17 % on the Ultrasound dataset. The framework and reproducible code are available on GitHub at https://github.com/xizhou/pixel_embedding.
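
To make the flattening step concrete, the following is a minimal illustrative sketch (not the authors' released code, which is at the GitHub URL above) of how the four pixel orderings named in the abstract could be produced from a 2D NumPy array. The function name `flatten_orders` and the use of NumPy are assumptions for illustration only.

```python
# Minimal sketch: produce the four flattening orders (row-wise, column-wise,
# diagonal-wise, anti-diagonal-wise) of a 2D grayscale image as 1D sequences.
# Assumes the image is a 2D numpy array; this is illustrative, not the paper's code.
import numpy as np


def flatten_orders(img: np.ndarray) -> dict:
    """Return the pixels of a 2D grayscale image in four 1D orderings."""
    h, w = img.shape
    row_wise = img.flatten()            # left-to-right, top-to-bottom
    col_wise = img.flatten(order="F")   # top-to-bottom, left-to-right
    # Diagonal-wise: concatenate every diagonal, from bottom-left to top-right offsets.
    diag_wise = np.concatenate(
        [np.diagonal(img, offset=k) for k in range(-(h - 1), w)]
    )
    # Anti-diagonal-wise: diagonals of the horizontally flipped image.
    anti_diag_wise = np.concatenate(
        [np.diagonal(np.fliplr(img), offset=k) for k in range(-(h - 1), w)]
    )
    return {
        "row": row_wise,
        "column": col_wise,
        "diagonal": diag_wise,
        "anti_diagonal": anti_diag_wise,
    }


# Example usage on a small 3x3 image.
img = np.arange(9).reshape(3, 3)
for name, seq in flatten_orders(img).items():
    print(name, seq)
```

Each resulting 1D pixel sequence can then be treated like a token sequence, over which n-gram features are computed, which is how the text-embedding machinery is applied to image data.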