BMC Medical Informatics and Decision Making (Nov 2019)

ThalPred: a web-based prediction tool for discriminating thalassemia trait and iron deficiency anemia

  • V. Laengsri,
  • W. Shoombuatong,
  • W. Adirojananon,
  • C. Nantasenamat,
  • V. Prachayasittikul,
  • P. Nuchnoi

DOI
https://doi.org/10.1186/s12911-019-0929-2
Journal volume & issue
Vol. 19, no. 1
pp. 1 – 14

Abstract

Background: The hypochromic microcytic anemias (HMA) most commonly found in Thailand are iron deficiency anemia (IDA) and thalassemia trait (TT). Accurate discrimination between IDA and TT is important, and better methods are urgently needed. Although numerous RBC formulas and indices with various optimal cut-off values have been developed, distinguishing IDA from TT remains challenging because anemic populations are diverse. To address this problem, it is desirable to develop an improved, automated prediction model for discriminating IDA from TT.

Methods: We retrospectively collected laboratory data on HMA in Thai adults. Five machine learning techniques, namely k-nearest neighbor (k-NN), decision tree, random forest (RF), artificial neural network (ANN) and support vector machine (SVM), were applied to construct a discriminant model. Performance was assessed and compared with thirteen existing discriminant formulas and indices.

Results: Data from 186 patients (146 with TT and 40 with IDA) were included. Interpretable rules derived from the RF model were proposed to show how combinations of RBC indices discriminate IDA from TT. A web-based tool, 'ThalPred', was implemented using an SVM model based on seven RBC parameters. ThalPred achieved an external accuracy, MCC and AUC of 95.59%, 0.87 and 0.98, respectively.

Conclusion: ThalPred and an interpretable rule were provided for distinguishing IDA from TT. For the convenience of health care teams and experimental scientists, a web-based tool has been established at http://codes.bio/thalpred/, with which users can easily obtain a screening result without working through the underlying mathematical and computational details.
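The accuracy and MCC (Matthews correlation coefficient) reported above are standard summaries of a binary confusion matrix. The following is a minimal sketch of how such values can be computed, not the authors' code. The function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def accuracy_percent(tp: int, tn: int, fp: int, fn: int) -> float:
    """Percentage of correctly classified samples."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return 100.0 * (tp + tn) / total


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to 1.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for illustration only (TT treated as the
    # positive class); these are NOT the study's confusion matrix.
    tp, tn, fp, fn = 40, 10, 2, 3
    print(f"accuracy = {accuracy_percent(tp, tn, fp, fn):.2f}%")
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.2f}")
```

MCC is a useful companion to accuracy for a class split like the 146 TT versus 40 IDA reported here, because a classifier that always predicts the majority class can reach high accuracy while its MCC stays at zero.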

Keywords