Scientific Reports (Oct 2022)

Evaluation of two semi-supervised learning methods and their combination for automatic classification of bone marrow cells

  • Iori Nakamura,
  • Haruhi Ida,
  • Mayu Yabuta,
  • Wataru Kashiwa,
  • Maho Tsukamoto,
  • Shigeki Sato,
  • Syuichi Ota,
  • Naoki Kobayashi,
  • Hiromi Masauzi,
  • Kazunori Okada,
  • Sanae Kaga,
  • Keiko Miwa,
  • Hiroshi Kanai,
  • Nobuo Masauzi

DOI
https://doi.org/10.1038/s41598-022-20651-4
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 15

Abstract

Differential bone marrow (BM) cell counting is an important test for the diagnosis of various hematological diseases. However, BM cells are difficult to classify accurately because of the non-uniformity and lack of reproducibility of differential counting. Automatic classification systems based on deep learning have therefore been developed. These systems require large, accurately labeled datasets for training. To reduce this labeling burden, we used semi-supervised learning (SSL), in which learning proceeds while labeling. We applied three methods, self-training (ST), active learning (AL), and a combination of the two, to the automatic classification of 16 types of BM cell images. Our ST includes data verification, as in AL, before data are added to the training dataset (confirmed self-training: CST). After 25 rounds of CST, AL, and CST + AL, the number of training data increased from an initial 425 to 40,518, 3,682, and 47,843, respectively. Accuracies on the test data, 50 images for each cell type, were 0.944, 0.941, and 0.976, respectively. Data added with CST or AL showed some imbalances between classes, while CST + AL exhibited fewer imbalances. We suggest that CST + AL, which combines the two SSL methods, is efficient in increasing training data for the development of automatic BM cell classification systems.
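
As a rough illustration of how one round of CST + AL might be organized, the sketch below routes confidently predicted images to a verification step (CST) and uncertain images to manual labeling (AL). The abstract does not state the selection rule, so the confidence thresholds `high` and `low`, the function names, and the choice to discard pseudo-labels that fail verification are all assumptions made for this example and are not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

Probabilities = Dict[str, float]  # class name -> predicted probability


def split_round(
    predictions: Dict[str, Probabilities],
    high: float = 0.9,  # hypothetical CST threshold
    low: float = 0.5,   # hypothetical AL threshold
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split unlabeled images into CST candidates and AL queries."""
    cst_candidates: List[Tuple[str, str]] = []
    al_queries: List[str] = []
    for image_id, probs in predictions.items():
        label, confidence = max(probs.items(), key=lambda kv: kv[1])
        if confidence >= high:
            # Confident prediction: propose the pseudo-label for verification.
            cst_candidates.append((image_id, label))
        elif confidence < low:
            # Uncertain prediction: ask the annotator for a label.
            al_queries.append(image_id)
    return cst_candidates, al_queries


def run_round(
    predictions: Dict[str, Probabilities],
    oracle: Callable[[str], str],
    high: float = 0.9,
    low: float = 0.5,
) -> List[Tuple[str, str]]:
    """Return the (image_id, label) pairs added in one CST + AL round."""
    cst_candidates, al_queries = split_round(predictions, high, low)
    added: List[Tuple[str, str]] = []
    for image_id, pseudo_label in cst_candidates:
        # CST: keep the pseudo-label only if the human check confirms it
        # (assumption: unconfirmed pseudo-labels are simply discarded).
        if oracle(image_id) == pseudo_label:
            added.append((image_id, pseudo_label))
    for image_id in al_queries:
        # AL: the annotator supplies the label directly.
        added.append((image_id, oracle(image_id)))
    return added
```

In a full loop, the pairs returned by `run_round` would be appended to the training dataset and the classifier retrained before the next round. The abstract reports 25 such rounds.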