Informatics (Feb 2025)

Anemia Classification System Using Machine Learning

  • Jorge Gómez Gómez,
  • Camilo Parra Urueta,
  • Daniel Salas Álvarez,
  • Velssy Hernández Riaño,
  • Gustavo Ramirez-Gonzalez

DOI
https://doi.org/10.3390/informatics12010019
Journal volume & issue
Vol. 12, no. 1
p. 19

Abstract

In this study, we developed a system that predicts anemia from blood count data using supervised learning algorithms. Anemia, a common condition characterized by low levels of red blood cells or hemoglobin, impairs oxygenation and often causes symptoms such as fatigue and shortness of breath. Diagnosing anemia usually requires laboratory tests, which can be difficult to obtain in low-resource areas where anemia is common. We trained three supervised models (Linear Discriminant Analysis, Decision Tree, and Random Forest) on an anemia dataset from a previous study by Sabatini (2022). The Random Forest model achieved an accuracy of 99.82%, showing that it can subclassify anemia types (microcytic, normocytic, and macrocytic) with high precision. This is a novel advance over prior studies of the same dataset, which were limited to binary classification (presence or absence of anemia).
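
The three-model comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn and NumPy, and it replaces the Sabatini dataset with synthetic data. The two features (hemoglobin and mean corpuscular volume), the 12 g/dL hemoglobin cutoff, the 80 and 100 fL MCV cutoffs, and the names make_synthetic_cbc and compare_models are all assumptions introduced here for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Class indices used below (hypothetical encoding, not from the paper).
LABELS = ["no anemia", "microcytic", "normocytic", "macrocytic"]


def make_synthetic_cbc(n_samples=2000, seed=0):
    """Generate synthetic blood count data (assumption: NOT the paper's dataset)."""
    rng = np.random.default_rng(seed)
    hgb = rng.uniform(6.0, 18.0, n_samples)    # hemoglobin, g/dL
    mcv = rng.uniform(60.0, 120.0, n_samples)  # mean corpuscular volume, fL
    X = np.column_stack([hgb, mcv])

    # Rule-based labels with assumed cutoffs: anemia if hemoglobin < 12 g/dL,
    # then subclass by MCV (< 80 microcytic, 80-100 normocytic, > 100 macrocytic).
    y = np.zeros(n_samples, dtype=int)
    anemic = hgb < 12.0
    y[anemic & (mcv < 80.0)] = 1
    y[anemic & (mcv >= 80.0) & (mcv <= 100.0)] = 2
    y[anemic & (mcv > 100.0)] = 3
    return X, y


def compare_models(X, y, seed=0):
    """Fit the three classifier families on a train split; return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


X, y = make_synthetic_cbc()
scores = compare_models(X, y)
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Because the synthetic labels follow simple threshold rules, the accuracies printed here say nothing about the 99.82% reported for the real dataset. The sketch only shows how a multiclass comparison of the three model families might be set up.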

Keywords