IEEE Access (Jan 2024)

Naïve Bayes Approach for Word Sense Disambiguation System With a Focus on Parts-of-Speech Ambiguity Resolution

  • Ajith Abraham,
  • Bineet Kumar Gupta,
  • Archana Sachindeo Maurya,
  • Satya Bhushan Verma,
  • Mohammad Husain,
  • Arshad Ali,
  • Sami Alshmrany,
  • Sanjay Gupta

DOI
https://doi.org/10.1109/ACCESS.2024.3453912
Journal volume & issue
Vol. 12
pp. 126668 – 126678

Abstract

Natural languages are the languages people write and speak, and Natural Language Processing (NLP) is the ability of a computer program to process them in both written and spoken form. Word Sense Disambiguation (WSD) is a challenging research area in Artificial Intelligence (AI) and Machine Translation (MT). WSD is the task of selecting the intended meaning of a word that has more than one meaning. It is an essential component of all natural language processing applications. There are various knowledge-based, supervised, and unsupervised approaches to WSD. Among the supervised and unsupervised approaches, the Naïve Bayes classifier is the most important method. In this paper, we emphasize the use of the Naïve Bayes approach for text classification in WSD techniques. Bayes' theorem provides a probabilistic model and a reliable approach for text classification. The Naïve Bayes classifier assumes that the presence of a particular feature in a class is independent of the presence of any other feature. The algorithm can be used to solve multi-class prediction problems. This classifier performs better than methods drawn from the other approaches. This paper gives a detailed study of Naïve Bayes algorithms and describes their underlying ideas, covering Naïve Bayes, text classification, traditional Naïve Bayes, and machine learning. We use the collocation method of feature extraction for the WSD of English sentences. Using this model, we disambiguate ambiguous English words by predicting their part of speech, namely "noun," "verb," "adverb," or "adjective." This disambiguation module serves as an enhancement for machine translation. The system achieves an F1-measure of 78%.
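
The approach summarized in the abstract (collocation features fed to a Naïve Bayes classifier that predicts a part-of-speech label) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `NaiveBayesPOS`, the helper `collocation_features`, the choice of one neighboring word on each side as the collocation window, the add-one (Laplace) smoothing, and the toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def collocation_features(tokens, i):
    """Collocation features (assumed window): the words just left and right of position i."""
    prev_word = tokens[i - 1].lower() if i > 0 else "<s>"
    next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
    return [f"prev={prev_word}", f"next={next_word}"]


class NaiveBayesPOS:
    """Hypothetical multinomial Naive Bayes over collocation features."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing constant (assumption)
        self.class_counts = Counter()               # number of training examples per tag
        self.feature_counts = defaultdict(Counter)  # tag -> feature -> count
        self.vocab = set()                          # all features seen in training

    def fit(self, examples):
        # Each example is (tokens, index of the ambiguous word, part-of-speech tag).
        for tokens, i, tag in examples:
            self.class_counts[tag] += 1
            for feat in collocation_features(tokens, i):
                self.feature_counts[tag][feat] += 1
                self.vocab.add(feat)

    def predict(self, tokens, i):
        total = sum(self.class_counts.values())
        best_tag, best_score = None, float("-inf")
        for tag, n in self.class_counts.items():
            # log P(tag) + sum of log P(feature | tag), using the independence assumption
            score = math.log(n / total)
            denom = sum(self.feature_counts[tag].values()) + self.alpha * len(self.vocab)
            for feat in collocation_features(tokens, i):
                score += math.log((self.feature_counts[tag][feat] + self.alpha) / denom)
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


# Toy data (invented for this sketch): the ambiguous word "book" as noun or verb.
TRAIN = [
    ("I read a book yesterday".split(), 3, "noun"),
    ("she bought a book online".split(), 3, "noun"),
    ("the book was long".split(), 1, "noun"),
    ("please book a table".split(), 1, "verb"),
    ("we will book the flight".split(), 2, "verb"),
    ("they book a room".split(), 1, "verb"),
]

clf = NaiveBayesPOS()
clf.fit(TRAIN)
print(clf.predict("he wrote a book today".split(), 3))
print(clf.predict("please book the tickets".split(), 1))
```

Scores are summed in log space to avoid numerical underflow when many features are multiplied, and smoothing keeps unseen collocations from driving a class probability to zero. A realistic system would train on a tagged corpus and cover all four labels named in the abstract.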

Keywords