IEEE Open Journal of the Computer Society (Jan 2024)

A Word Sense Disambiguation Method Applied to Natural Language Processing for the Portuguese Language

  • Clovis Holanda do Nascimento,
  • Vinicius Cardoso Garcia,
  • Ricardo de Andrade Araujo

DOI
https://doi.org/10.1109/OJCS.2024.3396518
Journal volume & issue
Vol. 5
pp. 268 – 277

Abstract

Natural language processing (NLP) and artificial intelligence (AI) have advanced significantly in recent years, enabling progress on tasks such as machine translation, text summarization, sentiment analysis, and speech analysis. However, challenges remain, among them the ambiguity of natural language. One problem caused by ambiguity is the difficulty of determining the proper meaning of a word in a specific context. For example, the word “mouse” can mean a computer peripheral or an animal, depending on the context. Failing to resolve this ambiguity can lead to an incorrect semantic interpretation of the processed sentence. In recent years, language models (LMs) have given new impetus to NLP and AI, including in the task of word sense disambiguation (WSD). Because LMs are trained on large amounts of data, they are capable of learning from and generating text. However, there are still few studies on WSD using LMs for the Portuguese language. Given this scenario, this article presents a WSD method for the Portuguese language. To do this, it uses BERTimbau, a language model specific to Portuguese. The results will be evaluated using the metrics established in the literature.

Keywords