Applied Sciences (Sep 2023)

A Purely Entity-Based Semantic Search Approach for Document Retrieval

  • Mohamed Lemine Sidi,
  • Serkan Gunal

DOI
https://doi.org/10.3390/app131810285
Journal volume & issue
Vol. 13, no. 18
p. 10285

Abstract

Over the past decade, knowledge bases (KBs) have been increasingly used to complete and enrich the representation of queries and documents in order to improve document retrieval. Although many approaches have used KBs for this purpose, the problem of how to leverage entity-based representation effectively remains open. This paper proposes a Purely Entity-based Semantic Search Approach for Information Retrieval (PESS4IR) as a novel solution. The approach comprises (i) its own entity linking method, (ii) an inverted indexing method, and (iii) a ranking method designed to take advantage of all the strengths of the approach for document retrieval and ranking. We report the performance of our approach, tested with queries annotated by two well-known entity linking tools, REL and DBpedia-Spotlight. The experiments are performed on the standard TREC 2004 Robust and MSMARCO collections. Using the REL method on the Robust collection, for the queries whose terms are all annotated and whose average annotation scores are greater than or equal to 0.75, our approach achieves the maximum nDCG@5 score (1.00). It is also shown that using PESS4IR alongside another document retrieval method improves performance, unless that method alone already achieves the maximum nDCG@5 score for those highly annotated queries.
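
The nDCG@5 figures quoted in the abstract can be read against the metric's usual definition. The following is a minimal sketch of one common formulation (linear gain, log2 rank discount), not the authors' evaluation code; the function names `dcg_at_k` and `ndcg_at_k` and the example relevance grades are assumptions made for illustration, and other tools may use a different gain function.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(
    ranked_relevances: Sequence[float],
    judged_relevances: Sequence[float],
    k: int = 5,
) -> float:
    """nDCG@k: DCG of the system ranking divided by the DCG of an ideal ranking.

    ranked_relevances: relevance grades of the returned documents, in rank order.
    judged_relevances: grades of all judged documents for the query (any order).
    """
    ideal = sorted(judged_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        # No relevant documents are judged for this query (assumed convention).
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Hypothetical grades: the top five results match the ideal ordering.
perfect = ndcg_at_k([2, 2, 1, 1, 0], [2, 2, 1, 1, 0, 0], k=5)

# Hypothetical grades: the only relevant document is ranked second.
imperfect = ndcg_at_k([0, 2], [2, 0], k=5)
```

Under this formulation, a score of 1.00 means the top five returned documents are ordered as well as any ordering of the judged documents could be, which is why a method that already reaches 1.00 on a query cannot be improved by combining it with another one.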

Keywords