IEEE Access (Jan 2019)

A Deep Learning Approach With Deep Contextualized Word Representations for Chemical–Protein Interaction Extraction From Biomedical Literature

  • Cong Sun,
  • Zhihao Yang,
  • Ling Luo,
  • Lei Wang,
  • Yin Zhang,
  • Hongfei Lin,
  • Jian Wang

DOI
https://doi.org/10.1109/ACCESS.2019.2948155
Journal volume & issue
Vol. 7
pp. 151034 – 151046

Abstract

Mining interactions between chemicals and proteins/genes is crucial for clinical medicine, the study of adverse drug effects, and pharmacological research. Although chemical–protein interactions (CPIs) can be extracted manually, this process is expensive and time-consuming. It is therefore of considerable value to extract CPIs automatically from biomedical literature. Currently, the popular methods for CPI extraction are based on deep learning, which avoids sophisticated handcrafted features derived from linguistic analyses. However, the performance of existing methods is usually unsatisfactory. The reasons may be that (1) traditional word-embedding methods cannot adequately model context information, and (2) it is difficult to effectively distinguish which words play critical roles in long biomedical sentences. In this study, we propose a novel Deep-contextualized Stacked Bi-LSTM model (DS-LSTM) to address these drawbacks. Specifically, our model consists mainly of three components: deep contextualized word representations, an entity attention mechanism, and stacked bidirectional long short-term memory networks (Bi-LSTMs). The deep contextualized word representations are introduced to model both the complex characteristics of word use (e.g., syntax and semantics) and how these uses vary across contexts (i.e., polysemy), thereby capturing context information. The entity attention mechanism assigns higher weights to words associated with the target entities, which helps distinguish the words that play critical roles in long biomedical sentences. We evaluate our model on the CHEMPROT corpus. Our approach achieves a micro-averaged F-score of 69.44%, which is significantly higher than that of existing state-of-the-art methods. Experimental results show that our approach can adequately model context information and effectively distinguish which words play critical roles in long biomedical sentences, and therefore improves overall performance.
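
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-averaged F-score is commonly computed for relation extraction: true positives, false positives, and false negatives are pooled over all positive relation classes before precision and recall are taken. It is not the authors' evaluation script. The function name `micro_f_score`, the `"NONE"` label for "no relation", and the toy class labels are assumptions chosen for illustration.

```python
def micro_f_score(gold, pred, negative_label="NONE"):
    """Micro-averaged precision, recall, and F-score over positive classes.

    `gold` and `pred` are equal-length lists of labels, one per candidate
    entity pair. `negative_label` (an assumed name) marks "no relation"
    and is excluded from the counts, as is usual in relation extraction.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative_label:
            if p == g:
                tp += 1   # predicted a relation and the class is right
            else:
                fp += 1   # predicted a relation that is wrong or absent
        if g != negative_label and p != g:
            fn += 1       # a gold relation was missed or misclassified

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy labels, purely illustrative (not taken from the paper's results).
gold = ["CPR:3", "CPR:4", "NONE", "CPR:4", "NONE"]
pred = ["CPR:3", "CPR:3", "NONE", "CPR:4", "CPR:9"]
precision, recall, f_score = micro_f_score(gold, pred)
print(f"P={precision:.4f} R={recall:.4f} F={f_score:.4f}")
```

Because the counts are pooled before averaging, frequent relation classes influence the score more than rare ones, in contrast to a macro-average, which weights every class equally.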

Keywords