Journal of Biomedical Semantics (Oct 2011)

Event extraction for DNA methylation

  • Ohta Tomoko,
  • Pyysalo Sampo,
  • Miwa Makoto,
  • Tsujii Jun’ichi

DOI
https://doi.org/10.1186/2041-1480-2-S5-S2
Journal volume & issue
Vol. 2, no. Suppl 5
p. S2

Abstract

Background
We consider the task of automatically extracting DNA methylation events from the biomedical domain literature. DNA methylation is a key mechanism of epigenetic control of gene expression and implicated in many cancers, but there has been little study of automatic information extraction for DNA methylation.

Results
We present an annotation scheme for DNA methylation following the representation of the BioNLP shared task on event extraction, select a set of 200 abstracts including a representative sample of all PubMed citations relevant to DNA methylation, and introduce manual annotation for this corpus marking nearly 3000 gene/protein mentions and 1500 DNA methylation and demethylation events. We retrain a state-of-the-art event extraction system on the corpus and find that automatic extraction of DNA methylation events, the methylated genes, and their methylation sites can be performed at 78% precision and 76% recall.

Conclusions
Our results demonstrate that reliable extraction methods for DNA methylation events can be created through corpus annotation and straightforward retraining of a general event extraction system. The introduced resources are freely available for use in research from the GENIA project homepage http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA.
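
To make the reported figures concrete, the following is a minimal Python sketch of how an extracted event (type, methylated gene, optional site) might be represented and scored. It is illustrative only and is not the authors' system or the official shared task evaluation: the `Event` class, the type labels, the function names, and the toy gene and site strings are all assumptions introduced here, and exact matching of whole events is a simplification of how event extraction is usually evaluated.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Event:
    """One event: its type, the gene/protein theme, and an optional site.

    Field names and type labels are hypothetical, chosen for illustration.
    """
    event_type: str              # e.g. "DNA_methylation" or "DNA_demethylation"
    theme: str                   # the methylated gene/protein mention
    site: Optional[str] = None   # the methylation site, when one is stated


def f1_from(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def score(gold: Set[Event], predicted: Set[Event]) -> Tuple[float, float, float]:
    """Precision, recall and F-score under exact event matching (a simplification)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_from(precision, recall)


# Invented toy data, not taken from the corpus.
gold = {
    Event("DNA_methylation", "p16", "promoter"),
    Event("DNA_methylation", "MGMT", "CpG island"),
    Event("DNA_methylation", "RASSF1A"),
    Event("DNA_demethylation", "hMLH1", "promoter"),
}
predicted = {
    Event("DNA_methylation", "p16", "promoter"),
    Event("DNA_methylation", "MGMT", "CpG island"),
    Event("DNA_methylation", "RASSF1A"),
    Event("DNA_methylation", "hMLH1", "promoter"),  # wrong event type: no match
}

toy_precision, toy_recall, toy_f1 = score(gold, predicted)

# Harmonic mean of the rounded figures quoted in the abstract.
abstract_f1 = f1_from(0.78, 0.76)
```

On the toy data, three of the four predicted events match, so precision, recall and F-score are all 0.75. Applying `f1_from` to the rounded 78% precision and 76% recall quoted above gives an F-score of roughly 77%; a value computed from unrounded counts could differ slightly.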