Iranian Journal of Information Processing & Management (Dec 2022)
Investigating the Effectiveness of Semantic Tagging in Sense Disambiguation of Specialized Homographs from the perspective of F-Measure in Retrieving scientific texts
Abstract
The aim of this study was to explain the application of text corpus tagging method in Sense disambiguation from specialized homographs and increasing the retrieval F-Measure of scientific texts containing such homographs. This is an experimental study. Specialized homographs were identified by direct observation and morphological analysis of the word. The research sample consisted of 442 scientific articles of two groups of experimental group and control group. The control group had 221 full-text articles without tags and the experimental group had same 221 tagged articles, which were tested in the information retrieval system to measure the effectiveness of tagging in word sense disambiguation from specialized homographs. The level of significance of the Wilcoxon signed-rank test showed that the F-Measure of retrieval results of specialized homographs after using the tagged specialized text corpus in the information retrieval system is significantly different than before. Examination of negative and positive rankings showed that the F-Measure of the results after using the tagged specialized text corpus has increased significantly and has reached its maximum level of 1. The findings of the present study showed that there is not necessarily an inverse relationship between recall and precision, and the two can reach their maximum level of 1. The better efficiency of the retrieval system using this approach is due to the empowerment of the retrieval system in distinguishing between specialized homographs and identifying their semantic roles by using semantic tags as training data that were considered in the test and training set. Embedding the training set in the structure of the retrieval system provides additional information to serve the retrieval system to distinguish between the various meanings of specialized homographs. This tool is one of the elements that causes the optimal quality of retrieval and leads the information retrieval system from word-driven retrieval to content-driven retrieval when retrieving texts containing specialized homographs.