Applied Sciences (Feb 2022)

Word Sense Disambiguation Using Clustered Sense Labels

  • Jeong Yeon Park,
  • Hyeong Jin Shin,
  • Jae Sung Lee

DOI
https://doi.org/10.3390/app12041857
Journal volume & issue
Vol. 12, no. 4
p. 1857

Abstract

Sequence labeling models for word sense disambiguation have proven highly effective when the sense vocabulary is compressed based on a thesaurus hierarchy. In this paper, we propose a method for compressing the sense vocabulary without using a thesaurus. Sense definitions in a dictionary are converted into sentence vectors and clustered to form the compressed senses. The very large set of sense vectors is first partitioned to reduce computational complexity and is then clustered hierarchically with awareness of homographs. Experiments were conducted on the English Senseval and SemEval datasets and the Korean Sejong sense-annotated corpus. The results show that performance increased greatly compared to that of the uncompressed sense model and is comparable to that of the thesaurus-based model.
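
The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "awareness of homographs" can be approximated by a cannot-link constraint (two senses of the same word never share a cluster, since merging them would make them indistinguishable), it uses a bag-of-words count vector as a stand-in for a real sentence encoder, and it omits the partitioning step because the toy inventory is tiny. The sense inventory, the function names, and the similarity threshold are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy sense inventory: (lemma, sense_id, dictionary definition).
SENSES = [
    ("bank", "bank.1", "financial institution that accepts deposits of money"),
    ("bank", "bank.2", "sloping land beside a river"),
    ("shore", "shore.1", "land beside a sea or a river"),
    ("fund", "fund.1", "financial institution that manages money"),
]


def embed(text):
    """Stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_senses(senses, threshold=0.3):
    """Average-link agglomerative clustering with a homograph constraint.

    Assumption: clusters that share a lemma are never merged, so senses of
    the same word always end up with different compressed labels.
    """
    vecs = {sid: embed(definition) for _, sid, definition in senses}
    lemma = {sid: lem for lem, sid, _ in senses}
    clusters = [[sid] for _, sid, _ in senses]
    while True:
        best, pair = threshold, None
        for i, j in combinations(range(len(clusters)), 2):
            a, b = clusters[i], clusters[j]
            # Homograph check: skip merges that would join two senses
            # of the same lemma.
            if {lemma[s] for s in a} & {lemma[s] for s in b}:
                continue
            sim = sum(cosine(vecs[x], vecs[y]) for x in a for y in b)
            sim /= len(a) * len(b)
            if sim > best:
                best, pair = sim, (i, j)
        if pair is None:  # no admissible merge above the threshold
            return clusters
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]


if __name__ == "__main__":
    for label, members in enumerate(cluster_senses(SENSES)):
        print(label, members)
```

In this toy run the two senses of "bank" stay in separate clusters, each grouped with a near-synonymous sense of another word. For a real dictionary, the count vectors would be replaced by sentence embeddings and the inventory would first be partitioned, as the abstract describes.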

Keywords