World Journal of Traditional Chinese Medicine (Jan 2021)

Intelligent prescription-generating models of traditional Chinese medicine based on deep learning

  • Qing-Yang Shi,
  • Li-Zi Tan,
  • Lim Lian Seng,
  • Hui-Jun Wang

DOI
https://doi.org/10.4103/wjtcm.wjtcm_54_21
Journal volume & issue
Vol. 7, no. 3
pp. 361 – 369

Abstract

Objective: This study aimed to construct an intelligent prescription-generating (IPG) model based on deep-learning natural language processing (NLP) technology for multiple prescriptions in Chinese medicine.
Materials and Methods: We selected the Treatise on Febrile Diseases and the Synopsis of Golden Chamber as basic datasets, expanded with easy data augmentation (EDA), and the Yellow Emperor's Canon of Internal Medicine, the Classic of the Miraculous Pivot, and the Classic on Medical Problems as supplementary datasets for fine-tuning. As pretrained models, we selected a word-embedding model based on the Imperial Collection of Four, a bidirectional encoder representations from transformers (BERT) model based on the Chinese Wikipedia, and a robustly optimized BERT approach (RoBERTa) model based on the Chinese Wikipedia and a general database. In addition, the BERT model was fine-tuned on the supplementary datasets to produce a Traditional Chinese Medicine-BERT model. Multiple IPG models were constructed from these pretraining strategies and compared experimentally. Model performance was assessed using precision, recall, and F1-score. Using the trained models, we extracted and visualized the semantic features of some typical texts from the Treatise on Febrile Diseases and investigated their patterns.
Results: Among all the trained models, the RoBERTa-large model performed best, with a test-set precision of 92.22%, recall of 86.71%, and F1-score of 89.38%, and a 10-fold cross-validation precision of 94.5% ± 2.5%, recall of 90.47% ± 4.1%, and F1-score of 92.38% ± 2.8%. Semantic feature extraction based on this model showed that the model was stratified according to different meanings: within-layer patterns showed symptom–symptom, disease–symptom, and symptom–punctuation associations, whereas between-layer patterns showed a progressive or dynamic transformation of symptoms and diseases.
Conclusions: Deep-learning-based NLP technology significantly improves the performance of the IPG model. In addition, NLP-based semantic feature extraction may be vital for further investigation of ancient Chinese medicine texts.
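
The abstract reports precision, recall, and F1-score but does not say how they were aggregated over prescriptions. The sketch below is one plausible reading, not the authors' procedure: it treats each prescription as a set of herbs and micro-averages the counts over all samples. The function names `micro_prf` and `f1_from`, and the toy herb sets, are illustrative assumptions. The last lines check that the reported test-set F1-score is the harmonic mean of the reported precision and recall.

```python
from typing import List, Set, Tuple


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over predicted herb sets.

    Assumption: micro-averaging; the abstract does not state the scheme.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # herbs predicted and present in the reference
        fp += len(p - g)   # herbs predicted but absent from the reference
        fn += len(g - p)   # reference herbs the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


def f1_from(precision: float, recall: float) -> float:
    """F1-score as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical, not taken from the paper's datasets).
gold = [{"guizhi", "shaoyao", "gancao"}, {"mahuang", "xingren"}]
pred = [{"guizhi", "shaoyao"}, {"mahuang", "xingren", "gancao"}]

p, r, f = micro_prf(gold, pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")

# Consistency check on the reported test-set figures (92.22%, 86.71%).
reported_f1 = f1_from(0.9222, 0.8671)
print(f"F1 from reported precision and recall: {reported_f1:.4f}")  # ≈ 0.8938
```

The cross-validation F1-score (92.38%) need not equal the harmonic mean of the mean precision and mean recall, since it is presumably averaged over per-fold F1-scores.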

Keywords