BMC Medical Informatics and Decision Making (Jan 2019)

Integrating shortest dependency path and sentence sequence into a deep learning framework for relation extraction in clinical text

  • Zhiheng Li,
  • Zhihao Yang,
  • Chen Shen,
  • Jun Xu,
  • Yaoyun Zhang,
  • Hua Xu

DOI
https://doi.org/10.1186/s12911-019-0736-9
Journal volume & issue
Vol. 19, no. S1
pp. 1 – 8

Abstract

Background: Extracting relations between important clinical entities is critical but very challenging for natural language processing (NLP) in the medical domain. Researchers have applied deep learning-based approaches to clinical relation extraction, but most of them consider the sentence sequence only, without modeling syntactic structures. The aim of this study was to use a deep neural network to capture syntactic features and further improve the performance of relation extraction in clinical notes.

Methods: We propose a novel neural approach that models the shortest dependency path (SDP) between target entities together with the sentence sequence for clinical relation extraction. Our neural network architecture consists of three modules: (1) a sentence sequence representation module that uses a bidirectional long short-term memory network (Bi-LSTM) to capture the features in the sentence sequence; (2) an SDP representation module that uses a convolutional neural network (CNN) and a Bi-LSTM network to capture the syntactic context of the target entities from SDP information; and (3) a classification module that uses a fully connected layer with a Softmax function to classify the relation type between target entities.

Results: Using the 2010 i2b2/VA relation extraction dataset, we compared our approach with other baseline methods. Our experimental results show that the proposed approach achieved significant improvements over comparable existing methods, demonstrating the effectiveness of utilizing syntactic structures in deep learning-based relation extraction. The F-measure of our method reaches 74.34%, which is 2.5% higher than that of the method without syntactic features.

Conclusions: We propose a new neural network architecture that models the SDP along with the sentence sequence to extract multiple relations from clinical text. Our experimental results show that the proposed approach significantly improves performance on clinical notes, demonstrating the effectiveness of syntactic structures in deep learning-based relation extraction.
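
To make the notion of a shortest dependency path concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a dependency parse is already available as (head, dependent) index pairs and finds the path between two entity tokens with a breadth-first search over the undirected tree. The function name `shortest_dependency_path`, the example sentence, and its hand-written parse are all hypothetical illustrations; the abstract does not state which parser or path-finding routine was used.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return token indices on the shortest path from source to target.

    edges: iterable of (head_index, dependent_index) pairs from a
    dependency parse. Edge direction is ignored (assumption: the SDP
    is taken over the undirected dependency tree).
    Returns an empty list if the two tokens are not connected.
    """
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)

    parent = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return []


# Hypothetical sentence with a hand-written (not parser-produced) parse.
tokens = ["The", "patient", "was", "given", "aspirin", "for", "chest", "pain"]
edges = [(3, 1), (1, 0), (3, 2), (3, 4), (3, 7), (7, 5), (7, 6)]

# Path between the treatment "aspirin" (index 4) and the problem "pain" (index 7).
path = shortest_dependency_path(edges, 4, 7)
sdp_tokens = [tokens[i] for i in path]
print(sdp_tokens)
```

In an architecture like the one described, a token sequence such as `sdp_tokens` would be embedded and passed to the SDP representation module, while the full `tokens` list would go to the sentence sequence module.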

Keywords