IEEE Access (Jan 2024)

Advancements in Arabic Named Entity Recognition: A Comprehensive Review

  • Taoufiq El Moussaoui
  • Chakir Loqman

DOI
https://doi.org/10.1109/ACCESS.2024.3491897
Journal volume & issue
Vol. 12
pp. 180238 – 180266

Abstract

With the growing prevalence of Arabic texts, extracting relevant information from these sources has become increasingly vital. Consequently, there is a pressing demand for technologies and tools capable of processing this data. Named Entity Recognition (NER) is a fundamental technique in information extraction, serving as the basis for many natural language applications such as question answering (QA), machine translation, and text summarization. This paper presents a detailed review of the evolution of Arabic NER. We begin with an in-depth exploration of the foundational aspects of Arabic NER, encompassing its domains, types, applications, annotation schemes, and the challenges inherent in its development. We then present an overview of the existing resources applicable to Arabic NER, including datasets, gazetteers, and prevalent processing tools. Next, we classify current approaches into three main paradigms: Rule-Based, Machine Learning (ML), and Deep Learning (DL) methods. Within each paradigm, we survey notable methods and explain their architectures. Finally, we synthesize the practical applications of these NER methods and propose potential directions for future research on Arabic NER. This literature review provides a valuable resource for researchers aiming to advance Arabic NER.

Keywords