Applied Sciences (Jul 2017)

Chinese Medical Question Answer Matching Using End-to-End Character-Level Multi-Scale CNNs

  • Sheng Zhang,
  • Xin Zhang,
  • Hui Wang,
  • Jiajun Cheng,
  • Pei Li,
  • Zhaoyun Ding

DOI
https://doi.org/10.3390/app7080767
Journal volume & issue
Vol. 7, no. 8
p. 767

Abstract

This paper addresses Chinese medical question answer matching, which is arguably more challenging than open-domain question answer matching in English because it combines a restricted domain with the language-specific features of Chinese. We present an end-to-end character-level multi-scale convolutional neural framework. It uses character embeddings instead of word embeddings, which avoids Chinese word segmentation during text preprocessing, and applies multi-scale convolutional neural networks (CNNs) to extract contextual information from question and answer sentences at different scales. The framework can be trained with minimal human supervision and requires no handcrafted features, rule-based patterns, or external resources. To validate it, we create a new text corpus, cMedQA, by harvesting questions and answers from an online Chinese health and wellness community. Experimental results on cMedQA show that the framework significantly outperforms several strong baselines, improving top-1 accuracy by up to 19%.
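
The general idea of character-level, multi-scale matching can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the weights are random and untrained, and the window widths (1, 2, 3), the tanh activation, max-over-time pooling, cosine-similarity scoring, and all function names are assumptions chosen for illustration, since the abstract does not specify them. The example sentences are made up and are not drawn from cMedQA.

```python
import math
import random


def char_tokens(sentence):
    """Split a sentence into single characters; no word segmenter is involved."""
    return [ch for ch in sentence if not ch.isspace()]


def build_embeddings(sentences, dim=8, seed=0):
    """Give each distinct character a random vector (untrained stand-in)."""
    rng = random.Random(seed)
    table = {}
    for sentence in sentences:
        for ch in char_tokens(sentence):
            if ch not in table:
                table[ch] = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    return table


def make_filters(widths=(1, 2, 3), n_filters=4, dim=8, seed=1):
    """Create one bank of random filters per window width (one per scale)."""
    rng = random.Random(seed)
    return {
        w: [[rng.uniform(-0.5, 0.5) for _ in range(w * dim)]
            for _ in range(n_filters)]
        for w in widths
    }


def encode(sentence, table, filters):
    """Convolve at every scale, max-pool over positions, concatenate."""
    vectors = [table[ch] for ch in char_tokens(sentence) if ch in table]
    features = []
    for width, bank in filters.items():
        dim = len(bank[0]) // width
        # Zero-pad sentences shorter than the window (assumed policy).
        padded = vectors + [[0.0] * dim] * max(0, width - len(vectors))
        windows = [
            [x for vec in padded[i:i + width] for x in vec]
            for i in range(len(padded) - width + 1)
        ]
        for filt in bank:
            activations = [
                math.tanh(sum(a * b for a, b in zip(win, filt)))
                for win in windows
            ]
            features.append(max(activations))
    return features


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def rank_answers(question, answers, table, filters):
    """Sort candidate answers by similarity to the question, best first."""
    q_vec = encode(question, table, filters)
    return sorted(
        answers,
        key=lambda ans: cosine(q_vec, encode(ans, table, filters)),
        reverse=True,
    )


# Hypothetical usage with made-up sentences.
question = "感冒发烧怎么办"
answers = ["多喝水注意休息", "建议去医院检查血常规"]
table = build_embeddings([question] + answers)
filters = make_filters()
print(rank_answers(question, answers, table, filters)[0])
```

With random weights the ranking carries no meaning; in a trained system the embeddings and filters would be learned from question-answer pairs, and top-1 accuracy would be the fraction of questions whose highest-ranked candidate is a correct answer.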

Keywords