International Journal of Crowd Science (Jun 2017)

Mining medical related temporal information from patients’ self-description

  • Lichao Zhu,
  • Hangzhou Yang,
  • Zhijun Yan

DOI
https://doi.org/10.1108/IJCS-08-2017-0018
Journal volume & issue
Vol. 1, no. 2
pp. 110 – 120

Abstract

Purpose – The purpose of this paper is to develop a new method to extract medical temporal information from online health communities.

Design/methodology/approach – The authors trained a conditional random field model to extract temporal expressions (TIMEX). Temporal relation identification is treated as a classification task, for which several support vector machine classifiers are built. For model training, the authors extracted high-level semantic features, including co-reference relationships among medical concepts and semantic similarity among words.

Findings – For the extraction of TIMEX, the authors find that well-formatted expressions are easy to recognize, and that the main challenge is relative TIMEX such as “three days after onset”. Normalization of absolute dates and well-formatted durations shows the same difficulty, whereas frequency is easier to normalize. For the identification of DocTimeRel, the results are fairly good, although the relation is difficult to identify when it involves a relative TIMEX or a hypothetical concept.

Originality/value – The authors proposed a new method to extract temporal information from online clinical data and evaluated the usefulness of different levels of syntactic features in this task.

Keywords