Cybernetics and Information Technologies (Jun 2023)

Joint Reference and Relation Extraction from Legal Documents with Enhanced Decoder Input

  • Thuy Nguyen Thi Thanh,
  • Diep Nguyen Ngoc,
  • Bach Ngo Xuan,
  • Phuong Tu Minh

DOI
https://doi.org/10.2478/cait-2023-0014
Journal volume & issue
Vol. 23, no. 2
pp. 72 – 86

Abstract

This paper addresses an important task in legal text processing, namely reference and relation extraction from legal documents, which comprises two subtasks: 1) reference extraction; 2) relation determination. Motivated by the fact that the two subtasks are related and share common information, we propose a joint learning model that solves both subtasks simultaneously. Our model employs a Transformer-based encoder-decoder architecture with non-autoregressive decoding, which relaxes the sequential constraint of traditional seq2seq models and extracts references and relations in a single inference step. We also propose a method that enriches the decoder input with learnable, meaningful information, thereby improving model accuracy. Experimental results on a dataset of 5,031 Vietnamese legal documents containing 61,446 references show that the proposed model outperforms several strong baselines and achieves an F1 score of 99.4% on the joint reference and relation extraction task.

Keywords