IEEE Access (Jan 2021)

Comparative Analysis of Information Retrieval Models on Quran Dataset in Cross-Language Information Retrieval Systems

  • Ayman A. Taan,
  • Shafiq Ur Rehman Khan,
  • Ali Raza,
  • Ayaz Muhammad Hanif,
  • Hira Anwar

DOI
https://doi.org/10.1109/ACCESS.2021.3126168
Journal volume & issue
Vol. 9
pp. 169056 – 169067

Abstract

English is an international language used for communication worldwide, yet many people cannot read, write, understand, or communicate in it. Conversely, the World Wide Web holds vast resources of information in other languages, which native English speakers find challenging to understand. To overcome such barriers, Cross-Language Information Retrieval (CLIR) systems, which retrieve documents across different languages, have been proposed. This work evaluates the performance of different Information Retrieval (IR) models in a CLIR system using a Quran dataset. It also investigates the effect of query length and of query expansion models on retrieval effectiveness. The results show that query length affects the effectiveness of the retrieval methods. Based on comprehensive experiments, an appropriate query length for an Arabic CLIR system is suggested, along with the best query expansion and retrieval models.

Keywords