Frontiers in Physiology (Apr 2025)

Cross-modal attention model integrating tongue images and descriptions: a novel intelligent TCM approach for pathological organ diagnosis

  • Quan Gan,
  • Chen Wang,
  • Zhaoman Zhong,
  • Jiaying Wu,
  • Qiwei Ge,
  • Lei Shi,
  • Jiaqing Shang,
  • Chuanxia Liu

DOI
https://doi.org/10.3389/fphys.2025.1580985
Journal volume & issue
Vol. 16

Abstract

Introduction: Tongue diagnosis is a fundamental technique in traditional Chinese medicine (TCM), in which clinicians evaluate the tongue’s appearance to infer the condition of pathological organs. However, most existing research on intelligent tongue diagnosis focuses on analyzing tongue images and often neglects the descriptive text that accompanies them, even though this text is an essential component of clinical diagnosis. To address this gap, we propose a novel Cross-Modal Pathological Organ Diagnosis Model that integrates tongue images and textual descriptions for more accurate pathological classification.

Methods: Our model extracts features from both the tongue images and the corresponding textual descriptions. These features are then fused using a cross-modal attention mechanism to enhance the classification of pathological organs. The cross-modal attention mechanism enables the model to combine visual and textual information effectively, addressing the limitations of using either modality alone.

Results: We conducted experiments on a self-constructed dataset to evaluate our model’s performance. The results demonstrate that our model outperforms commonly used models in overall accuracy. Additionally, ablation studies in which either tongue images or textual descriptions were used alone confirmed the significant benefit of multimodal fusion in improving diagnostic accuracy.

Discussion: This study introduces a new perspective on intelligent tongue diagnosis in TCM by incorporating both visual and textual data. The experimental findings highlight the importance of cross-modal feature fusion for improving the accuracy of pathological diagnosis. Our approach not only contributes to the development of more effective diagnostic systems but also paves the way for future advances in the automation of TCM diagnosis.
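
The abstract does not specify how the cross-modal attention is implemented, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes that text-token features act as queries over image-region features (keys and values), uses plain scaled dot-product attention with no learned projections, and fuses by concatenating mean-pooled vectors. All function names and the toy feature values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    d = len(keys[0])
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return attended


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def fuse(text_feats, image_feats):
    """Assumed fusion: text tokens query image regions, then concatenate pooled vectors."""
    attended = cross_attention(text_feats, image_feats, image_feats)
    return mean_pool(text_feats) + mean_pool(attended)


# Toy features (hypothetical): 2 text tokens and 3 image regions, each 2-dimensional.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = fuse(text_feats, image_feats)
```

In a full system the fused vector would feed a classifier over the pathological organ labels, and the features would come from trained image and text encoders rather than hand-written lists.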

Keywords