Scientific Reports (Jan 2022)

A large dataset of white blood cells containing cell locations and types, along with segmented nuclei and cytoplasm

  • Zahra Mousavi Kouzehkanan,
  • Sepehr Saghari,
  • Sajad Tavakoli,
  • Peyman Rostami,
  • Mohammadjavad Abaszadeh,
  • Farzaneh Mirzadeh,
  • Esmaeil Shahabi Satlsar,
  • Maryam Gheidishahran,
  • Fatemeh Gorgi,
  • Saeed Mohammadi,
  • Reshad Hosseini

DOI
https://doi.org/10.1038/s41598-021-04426-x
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 14

Abstract

Accurate and early detection of anomalies in peripheral white blood cells plays a crucial role in evaluating an individual's well-being and in the diagnosis and prognosis of hematologic diseases. For example, some blood disorders and immune system-related diseases are diagnosed by the differential count of white blood cells, which is a common laboratory test. Data is one of the most important ingredients in developing and testing automatic or semi-automatic systems, including many successful commercial ones. To this end, this study introduces a freely accessible dataset of normal peripheral white blood cells called Raabin-WBC, containing about 40,000 images of white blood cells and color spots. To ensure the validity of the data, a significant number of cells were labeled by two experts. In addition, ground truths of the nuclei and cytoplasm were extracted for 1145 selected cells. To provide the necessary diversity, various smears were imaged, and two different cameras and two different microscopes were used. We ran preliminary deep learning experiments on Raabin-WBC to demonstrate how the generalization power of machine learning methods, especially deep neural networks, can be affected by this diversity. As a public dataset in the health domain, Raabin-WBC can be used for model development and testing in machine learning tasks including classification, detection, segmentation, and localization.
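
As an illustration of the differential count mentioned in the abstract, the following minimal Python sketch computes the percentage of each white blood cell type from a list of per-cell labels. The function name, the label strings, and the example data are assumptions made here for illustration; they are not taken from the Raabin-WBC release or from the authors' code.

```python
from collections import Counter


def differential_count(labels):
    """Return the percentage of each cell type among the labeled cells.

    `labels` is any iterable of cell-type names, one per cell.
    This helper is hypothetical and not part of the dataset's tooling.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        # No cells were labeled, so no percentages can be reported.
        return {}
    return {cell_type: 100.0 * n / total for cell_type, n in counts.items()}


# Made-up labels for ten cells (illustrative only, not real data).
example_labels = ["neutrophil"] * 6 + ["lymphocyte"] * 3 + ["monocyte"]
example_result = differential_count(example_labels)
```

In practice the labels would come from expert annotation or from a classifier's predictions, and the resulting percentages would be compared against laboratory reference ranges, which this sketch does not include.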