BMC Bioinformatics (Jul 2019)

Chemical-induced disease relation extraction via attention-based distant supervision

  • Jinghang Gu,
  • Fuqing Sun,
  • Longhua Qian,
  • Guodong Zhou

DOI
https://doi.org/10.1186/s12859-019-2884-4
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 14

Abstract

Background: Automatically understanding chemical-disease relations (CDRs) is crucial in various areas of biomedical research and health care. Supervised machine learning provides a feasible way to extract relations between biomedical entities from scientific literature automatically. Its success, however, depends heavily on large-scale biomedical corpora whose manual annotation requires intensive labor and substantial investment.

Results: We present an attention-based distant supervision paradigm for the BioCreative-V CDR extraction task. Training examples at both the intra- and inter-sentence levels are generated automatically from the Comparative Toxicogenomics Database (CTD) without any human intervention. An attention-based neural network and a stacked auto-encoder network are applied, respectively, to induce learning models and extract relations at the two levels. The results of both levels are then merged to obtain the document-level CDRs. Without using any annotated corpus, the approach achieves a precision/recall/F1-score of 60.3%/73.8%/66.4%, outperforming state-of-the-art supervised learning systems.

Conclusion: Our experiments demonstrate that distant supervision is promising for extracting chemical-disease relations from biomedical literature, and that capturing local and global attention features simultaneously is effective in attention-based distantly supervised learning.
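As a quick consistency check on the reported scores, the F1-score can be recomputed from the stated precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function and variable names are illustrative and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
reported_precision = 0.603
reported_recall = 0.738

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.1%}")  # prints "F1 = 66.4%"
```

The recomputed value agrees with the reported 66.4% after rounding to one decimal place.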

Keywords