JMIR Medical Informatics (Sep 2022)

Leveraging Representation Learning for the Construction and Application of a Knowledge Graph for Traditional Chinese Medicine: Framework Development Study

  • Heng Weng,
  • Jielong Chen,
  • Aihua Ou,
  • Yingrong Lao

DOI
https://doi.org/10.2196/38414
Journal volume & issue
Vol. 10, no. 9
p. e38414

Abstract

Read online

BackgroundKnowledge discovery from treatment data records from Chinese physicians is a dramatic challenge in the application of artificial intelligence (AI) models to the research of traditional Chinese medicine (TCM). ObjectiveThis paper aims to construct a TCM knowledge graph (KG) from Chinese physicians and apply it to the decision-making related to diagnosis and treatment in TCM. MethodsA new framework leveraging a representation learning method for TCM KG construction and application was designed. A transformer-based Contextualized Knowledge Graph Embedding (CoKE) model was applied to KG representation learning and knowledge distillation. Automatic identification and expansion of multihop relations were integrated with the CoKE model as a pipeline. Based on the framework, a TCM KG containing 59,882 entities (eg, diseases, symptoms, examinations, drugs), 17 relations, and 604,700 triples was constructed. The framework was validated through a link predication task. ResultsExperiments showed that the framework outperforms a set of baseline models in the link prediction task using the standard metrics mean reciprocal rank (MRR) and Hits@N. The knowledge graph embedding (KGE) multitagged TCM discriminative diagnosis metrics also indicated the improvement of our framework compared with the baseline models. ConclusionsExperiments showed that the clinical KG representation learning and application framework is effective for knowledge discovery and decision-making assistance in diagnosis and treatment. Our framework shows superiority of application prospects in tasks such as KG-fused multimodal information diagnosis, KGE-based text classification, and knowledge inference–based medical question answering.