IEEE Access (Jan 2025)

Tongue-LiteSAM: A Lightweight Model for Tongue Image Segmentation With Zero-Shot

  • Daiqing Tan,
  • Hao Zang,
  • Xinyue Zhang,
  • Han Gao,
  • Ji Wang,
  • Zaijian Wang,
  • Xing Zhai,
  • Huixia Li,
  • Yan Tang,
  • Aiqing Han

DOI
https://doi.org/10.1109/ACCESS.2025.3528658
Journal volume & issue
Vol. 13
pp. 11689 – 11703

Abstract

Objective: Tongue image segmentation is a crucial step in the intelligent recognition of tongue diagnosis in Traditional Chinese Medicine (TCM). Existing deep learning-based tongue image segmentation models suffer from poor versatility and insufficient expressiveness on zero-shot tasks. This study aims to construct an efficient model with zero-shot capability that is suited to tongue image segmentation in TCM.

Methods: We developed the Tongue-LiteSAM model by adapting the SAM (Segment Anything Model) framework to tongue segmentation. Starting from the basic SAM model, we modified the image encoder by integrating two lightweight ViT-Tiny image encoders, which reduces the model’s parameter count. In addition, data perturbation techniques were employed to enhance the model’s zero-shot segmentation capability and to ensure robust performance across different data sources.

Results: Experiments on six distinct tongue image datasets showed that Tongue-LiteSAM outperformed traditional convolutional neural network-based models, transformer-based models, the original SAM model, and other related improved models on tongue image segmentation tasks.

Conclusion: The Tongue-LiteSAM model provides a more objective and consistent solution for tongue diagnosis and has better zero-shot segmentation capabilities. Optimizing the model structure and data processing strategies improves the accuracy and practicality of tongue diagnosis models, offering new technical support for the modernization and precision of TCM tongue diagnosis.
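
The abstract does not say which data perturbations were used, so the following is only an illustrative sketch of the general idea of perturbing training images so that a segmentation model becomes less sensitive to differences between data sources. The function name `perturb_pair`, the chosen perturbations (horizontal flip, contrast scaling, brightness shift, Gaussian noise), and all parameter values are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def perturb_pair(image, mask, rng, brightness=0.2, contrast=0.2, noise_std=0.02):
    """Randomly perturb an RGB image in [0, 1] together with its binary mask.

    Hypothetical helper: the perturbation types and ranges are assumptions,
    not taken from the Tongue-LiteSAM paper.
    """
    img = image.astype(np.float32)
    msk = mask.copy()

    # Geometric perturbation: flip image and mask together so they stay aligned.
    if rng.random() < 0.5:
        img = img[:, ::-1, :]
        msk = msk[:, ::-1]

    # Photometric perturbations change only the image, never the mask.
    scale = 1.0 + rng.uniform(-contrast, contrast)
    mean = img.mean()
    img = (img - mean) * scale + mean                  # contrast around the mean
    img = img + rng.uniform(-brightness, brightness)   # global brightness shift
    img = img + rng.normal(0.0, noise_std, size=img.shape)  # sensor-like noise

    # Keep pixel values in the valid range and return contiguous arrays.
    img = np.clip(img, 0.0, 1.0).astype(np.float32)
    return np.ascontiguousarray(img), np.ascontiguousarray(msk)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64, 3))          # stand-in for a tongue photograph
    mask = np.zeros((64, 64), dtype=np.uint8)
    mask[16:48, 20:40] = 1                   # stand-in for the tongue region
    new_image, new_mask = perturb_pair(image, mask, rng)
    print(new_image.shape, new_mask.shape, int(new_mask.sum()))
```

Applying a fresh random perturbation each time a training sample is drawn exposes the model to a wider range of lighting and acquisition conditions than the raw dataset contains, which is one common way to target robustness across data sources.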

Keywords