IEEE Access (Jan 2022)

An ELECTRA-Based Model for Neural Coreference Resolution

  • Francesco Gargiulo,
  • Aniello Minutolo,
  • Raffaele Guarasci,
  • Emanuele Damiano,
  • Giuseppe De Pietro,
  • Hamido Fujita,
  • Massimo Esposito

DOI
https://doi.org/10.1109/ACCESS.2022.3189956
Journal volume & issue
Vol. 10
pp. 75144 – 75157

Abstract


In recent years, coreference resolution has received a considerable performance boost from pre-trained neural language models, from BERT to SpanBERT and Longformer. This work assesses, for the first time, the impact of the ELECTRA model on this task, motivated by experimental evidence of its improved contextual representations and better performance on several downstream tasks. In particular, ELECTRA is employed as the representation layer in an established neural coreference architecture that identifies entity mentions among spans of text and clusters them. The architecture itself has been optimized: i) by simplifying the representation of text spans while still considering both the context in which they appear and their entire content, ii) by maximizing both the number and length of input textual segments to better exploit ELECTRA's improved contextual representation power, and iii) by maximizing the number of text spans processed as potential mentions while preserving computational efficiency. Experimental results on the OntoNotes dataset show the effectiveness of this solution from both a quantitative and a qualitative perspective, also with respect to other state-of-the-art models, thanks to more proficient token and span representations. The results also suggest that this solution could be applied to low-resource languages, since it only requires a pre-trained version of ELECTRA rather than language-specific models trained to handle spans of text or long documents.
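To make the idea of using ELECTRA as the representation layer concrete, the sketch below builds a span representation from ELECTRA token embeddings with the Hugging Face transformers library. It is a minimal illustration, not the authors' code: it assumes the common scheme of concatenating the span's boundary embeddings with a summary of its content (here a simple mean), whereas the paper's exact simplification of the span encoding may differ.

import torch
from transformers import ElectraTokenizerFast, ElectraModel

# Pre-trained ELECTRA discriminator used as the contextual encoder.
tokenizer = ElectraTokenizerFast.from_pretrained("google/electra-base-discriminator")
encoder = ElectraModel.from_pretrained("google/electra-base-discriminator")

text = "Alice said she would present the results."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    token_embs = encoder(**inputs).last_hidden_state[0]  # (seq_len, hidden_size)

def span_representation(start: int, end: int) -> torch.Tensor:
    """Concatenate the span's boundary token embeddings with a mean over its
    content, so the vector reflects both the surrounding context and the
    entire span (illustrative choice, not necessarily the paper's)."""
    content = token_embs[start:end + 1].mean(dim=0)
    return torch.cat([token_embs[start], token_embs[end], content])

# In a full coreference pipeline, candidate spans would be scored, pruned,
# and clustered; here we only build a representation for one candidate span.
candidate = span_representation(1, 1)  # e.g., the token span covering "Alice"
print(candidate.shape)                 # 3 * hidden_size

In practice, a coreference model would apply such a representation to every candidate span in each (maximally long) input segment, then prune low-scoring candidates before pairwise clustering, which is where the efficiency considerations mentioned in the abstract come in.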

Keywords