Computers (Sep 2024)
Cross-Lingual Short-Text Semantic Similarity for Kannada–English Language Pair
Abstract
Measuring the semantic similarity of cross-lingual texts is a core task in natural language processing (NLP). It underpins a variety of applications, such as evaluating machine translation systems, checking the quality of human translation, information retrieval, and plagiarism detection. In this paper, we propose a method for measuring the semantic similarity of Kannada–English sentence pairs that combines embedding space alignment, lexical decomposition, word order, and a convolutional neural network. The proposed method achieves a maximum correlation of 83% with human annotations. Experiments on semantic matching and retrieval tasks yielded promising precision and recall.
Keywords