IEEE Access (Jan 2025)

Unsupervised Classification for Circulating Tumor Cells

  • Ling An,
  • Haibo Hu,
  • Mengke Song,
  • Lin Cheng,
  • Shuo Ba,
  • Zhaocong Liu,
  • Zhuohang Yu,
  • Zhenyu Zhang,
  • Yi Liu,
  • Chichun Zhou

DOI
https://doi.org/10.1109/access.2025.3564012
Journal volume & issue
Vol. 13
pp. 85669 – 85681

Abstract

Circulating tumor cell (CTC) detection is crucial for reducing cancer mortality and disease burden. Traditional methods rely on the physical or biological properties of CTCs and involve complex enrichment and separation processes. These methods are often inefficient and are limited by cellular heterogeneity and detection accuracy, making them unsuitable for rapid screening. In contrast, intelligent cell recognition and classification using low-resolution microscopy images allows faster detection. Supervised learning has shown excellent performance in CTC detection. However, it depends heavily on large quantities of labeled immunofluorescence image data, which consumes significant expert time and resources, increases costs, and limits its widespread application. The scarcity of public datasets and the high cost of proprietary datasets further hinder effective detection. Developing personalized and cost-effective datasets is therefore essential for practical application. Unsupervised methods, which perform classification without manual labeling, offer a promising way to overcome these challenges. This study adopts an unsupervised classification method that has shown excellent performance on fungal datasets, integrating two-step dimensionality reduction and hybrid clustering voting for automated CTC classification. The main advantage of this method is its domain independence: it does not rely on domain-specific features and is robust to imaging differences, making it applicable to a variety of image classification tasks in specialized domains. This allows the method to be transferred successfully to CTC detection. Without any manual labeling, the method achieves an accuracy of 98.2% with a discard rate of 25.23%, effectively distinguishing CTCs from white blood cells (WBCs). It reduces the reliance on manually labeled data, lowering the cost and complexity of creating personalized CTC datasets while improving detection accuracy. It also enhances the efficiency of deep learning in medical diagnostics, particularly in CTC detection.
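
To make the relationship between clustering voting, discard rate, and accuracy concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the voting rule, the clustering algorithms, or how cluster labels are aligned, so several things here are assumptions: a unanimous-agreement rule by default, three unnamed clusterers whose labels are already mapped to a common label set, and accuracy measured only on the retained (non-discarded) samples. The function names `vote`, `discard_rate`, and `retained_accuracy` and the toy labels are hypothetical.

```python
from collections import Counter


def vote(labelings, min_agree=None):
    """Combine per-sample labels from several clusterers.

    labelings: list of equal-length label lists, one per clusterer,
    assumed to be already aligned to a common label set.
    min_agree: votes required to keep a sample; defaults to all voters
    (unanimous agreement, which is an assumption of this sketch).
    Returns one label per sample, or None where the sample is discarded.
    """
    if min_agree is None:
        min_agree = len(labelings)
    voted = []
    for votes in zip(*labelings):
        label, count = Counter(votes).most_common(1)[0]
        voted.append(label if count >= min_agree else None)
    return voted


def discard_rate(voted):
    """Fraction of samples on which the clusterers did not agree."""
    return sum(v is None for v in voted) / len(voted)


def retained_accuracy(voted, truth):
    """Accuracy computed over the retained samples only."""
    kept = [(v, t) for v, t in zip(voted, truth) if v is not None]
    if not kept:
        return float("nan")
    return sum(v == t for v, t in kept) / len(kept)


# Toy labels (hypothetical, for illustration only).
clusterer_a = ["CTC", "CTC", "WBC", "WBC", "CTC", "WBC", "WBC", "CTC"]
clusterer_b = ["CTC", "CTC", "WBC", "CTC", "CTC", "WBC", "WBC", "WBC"]
clusterer_c = ["CTC", "CTC", "WBC", "WBC", "CTC", "WBC", "CTC", "CTC"]
truth = ["CTC", "CTC", "WBC", "WBC", "CTC", "WBC", "WBC", "WBC"]

voted = vote([clusterer_a, clusterer_b, clusterer_c])
print("discard rate:", discard_rate(voted))
print("accuracy on retained samples:", retained_accuracy(voted, truth))
```

In this toy example the clusterers disagree on three of the eight samples, so those samples are discarded and accuracy is computed on the remaining five. Relaxing the rule to a simple majority (`min_agree=2`) keeps every sample but admits one error, which illustrates the trade-off a stricter vote makes between coverage and accuracy.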

Keywords