E3S Web of Conferences (Jan 2025)

Prediction of case types from non-searchable pdf documents in arabic: Comparison of machine learning and deep learning with image processing

  • El Arrasse Mouad,
  • Khourdifi Youness,
  • Mounir Soufyane,
  • El Alami Alae

DOI
https://doi.org/10.1051/e3sconf/202560100110
Journal volume & issue
Vol. 601
p. 00110

Abstract

This study focuses on predicting the types of judicial cases brought before Moroccan administrative courts, using court decisions available only as non-searchable PDF documents in Arabic. To this end, we used image processing, text cleaning techniques, and machine learning algorithms, and carried out a comparative study of machine learning and deep learning techniques. The experiment was conducted in two phases: first on 697 court decisions, and then on 14,207 decisions from the Administrative Court of Appeal in Marrakech. Despite the challenges associated with the Arabic language, our methods extracted text efficiently, leading to accurate predictions. On the 697 decisions, machine learning achieved an accuracy of 91%, while deep learning reached 100%. On the 14,207 decisions, machine learning achieved an accuracy of 97% and deep learning 96%. This study thus contributes to the existing literature on the digitization and processing of unstructured documents in Arabic, and on the prediction of judicial case types using machine learning and deep learning algorithms.
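
The abstract does not specify which text cleaning steps were applied, so the following is only a minimal sketch of what cleaning OCR-extracted Arabic text might look like before classification. The function name `clean_arabic` and the particular normalization choices (stripping diacritics and tatweel, unifying alef variants, dropping non-Arabic characters) are assumptions made for illustration, not the authors' method.

```python
import re

# Assumed cleaning rules (hypothetical, not taken from the paper):
# Arabic diacritics (U+064B..U+0652) and tatweel (U+0640) are removed.
_DIACRITICS_AND_TATWEEL = re.compile(r"[\u064B-\u0652\u0640]")
# Alef with madda / hamza above / hamza below are mapped to bare alef.
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")
# Anything that is not an Arabic letter or whitespace becomes a space.
_NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")


def clean_arabic(text: str) -> str:
    """Normalize raw OCR output into space-separated Arabic tokens."""
    # Tatweel must be removed first: it lies inside the letter range above.
    text = _DIACRITICS_AND_TATWEEL.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)
    text = _NON_ARABIC.sub(" ", text)
    # Collapse runs of whitespace left behind by the substitutions.
    return " ".join(text.split())
```

Text cleaned this way could then be passed to whichever vectorizer and classifier are being compared; those stages are outside the scope of this sketch.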

Keywords