IEEE Access (Jan 2019)

Toward Universal Word Sense Disambiguation Using Deep Neural Networks

  • Hiram Calvo,
  • Arturo P. Rocha-Ramirez,
  • Marco A. Moreno-Armendariz,
  • Carlos A. Duchanoy

DOI
https://doi.org/10.1109/ACCESS.2019.2914921
Journal volume & issue
Vol. 7
pp. 60264 – 60275

Abstract

Traditionally, neural-network approaches to word sense disambiguation (WSD) rely on a set of classifiers at the output, which specializes the model to a single set of words: those for which it was trained. This makes it impossible to apply the learned models to words not seen in the training corpus. This paper addresses a generalized formulation of the WSD problem so that it can be solved with deep neural networks without restricting the method to a fixed set of words, while keeping performance close to the state of the art at an acceptable computational cost. We explore different architectures based on multilayer perceptrons, recurrent cells (Long Short-Term Memory, LSTM, and Gated Recurrent Units, GRU), and a classifier model, and we test embeddings from different sources and of different dimensions. The main evaluation was performed on the Senseval 3 English Lexical Sample task. To assess the application to an unseen set of words, the learned models were also evaluated on the completely unseen words of a different corpus (the Senseval 2 English Lexical Sample), outperforming the random baseline.
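The abstract describes replacing per-word output classifiers with architectures that can generalize to unseen words. The following is a minimal, hypothetical sketch, not the authors' implementation, of one such setup: a bidirectional LSTM reads pre-trained embeddings for the context of a target word and predicts a vector in the same embedding space, and the chosen sense is the candidate whose sense embedding lies closest to that vector. The embedding dimension, hidden size, and all names below are assumptions made for illustration only.

# Hypothetical sketch of a word-independent WSD model (not the paper's code).
import torch
import torch.nn as nn

EMB_DIM = 300   # assumed dimensionality of pre-trained word embeddings
HIDDEN = 256    # assumed recurrent hidden size

class ContextToSense(nn.Module):
    """Map the embedded context of a target word to a vector in embedding space."""
    def __init__(self, emb_dim=EMB_DIM, hidden=HIDDEN):
        super().__init__()
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.project = nn.Linear(2 * hidden, emb_dim)  # back to embedding space

    def forward(self, context_embs):
        # context_embs: (batch, seq_len, emb_dim) pre-trained embeddings of context words
        _, (h_n, _) = self.rnn(context_embs)
        h = torch.cat([h_n[0], h_n[1]], dim=-1)        # final states of both directions
        return self.project(h)                         # (batch, emb_dim)

def predict_sense(model, context_embs, sense_embs):
    """Return the index of the candidate sense whose embedding is most
    similar (cosine) to the vector predicted from the context."""
    with torch.no_grad():
        pred = model(context_embs)                                  # (1, emb_dim)
        sims = torch.cosine_similarity(pred, sense_embs, dim=-1)    # (n_senses,)
        return int(sims.argmax())

if __name__ == "__main__":
    model = ContextToSense()
    context = torch.randn(1, 10, EMB_DIM)   # 10 context words around the target (dummy data)
    senses = torch.randn(4, EMB_DIM)        # 4 candidate sense embeddings (dummy data)
    print("predicted sense index:", predict_sense(model, context, senses))

Because the output is a point in embedding space rather than a softmax over a fixed sense inventory, the same trained model can in principle score candidate senses for words never seen during training, which is the kind of generalization the abstract targets.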

Keywords