IEEE Access (Jan 2019)

Arabic Natural Language Processing and Machine Learning-Based Systems

  • Souad Larabi Marie-Sainte,
  • Nada Alalyani,
  • Sihaam Alotaibi,
  • Sanaa Ghouzali,
  • Ibrahim Abunadi

DOI
https://doi.org/10.1109/ACCESS.2018.2890076
Journal volume & issue
Vol. 7
pp. 7011 – 7020

Abstract

Arabic natural language processing (ANLP) consists of developing techniques and tools that can utilize and analyze the Arabic language in both written and spoken contexts. ANLP makes an important contribution to many existing systems. It provides Arabic and non-Arabic speakers with helpful and convenient tools that can be used in different domains. Modern ANLP tools are developed using machine learning (ML) techniques. ML algorithms are widely used in NLP because of their high accuracy, regardless of the robustness of the data used, and because of the ease with which they can be implemented. However, the methodology of ML-based ANLP applications involves several distinct phases. It is therefore crucial to recognize and understand these phases in detail, as well as the most widely used ML algorithms. This survey discusses these phases in detail, shows the involvement of ML techniques in developing such tools, and identifies well-known techniques used in ANLP. Moreover, this survey discusses the characteristics and complexity of the Arabic language, in addition to the importance and needs of ANLP.

Keywords