Journal of King Saud University: Computer and Information Sciences (Apr 2017)

Morphological disambiguation of Tunisian dialect

  • Inès Zribi,
  • Mariem Ellouze,
  • Lamia Hadrich Belguith,
  • Philippe Blache

DOI
https://doi.org/10.1016/j.jksuci.2017.01.004
Journal volume & issue
Vol. 29, no. 2
pp. 147 – 155

Abstract

In this paper, we propose a method for disambiguating the output of a morphological analyzer for the Tunisian dialect. We test three machine-learning techniques that classify each morphological analysis of a word token into one of two classes, true or false. The class label is assigned to each analysis according to the context of the corresponding word in the sentence. In failure cases, we combine the results of the proposed techniques with a bigram classifier to select a single analysis for a given word. We apply the method to the output of Al-Khalil-TUN, the morphological analyzer for the Tunisian dialect (Zribi et al., 2013b), and use the Spoken Tunisian Arabic Corpus, STAC (Zribi et al., 2015), to train and test it. The evaluation shows that the proposed method achieves an accuracy of 87.32%.
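
The two-stage decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a "failure case" means the binary classifier accepts zero or several analyses for a token, that an analysis can be reduced to a single tag string, and that the bigram model is a plain count table over tag pairs. The names train_bigrams, disambiguate, and toy_classifier are hypothetical, and the toy classifier merely stands in for a trained model.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence


def train_bigrams(tagged_sentences: Iterable[Sequence[str]]) -> Counter:
    """Count (previous_tag, tag) pairs; "<s>" marks the sentence start."""
    counts: Counter = Counter()
    for tags in tagged_sentences:
        prev = "<s>"
        for tag in tags:
            counts[(prev, tag)] += 1
            prev = tag
    return counts


def disambiguate(
    candidates: Sequence[str],
    prev_tag: str,
    is_correct: Callable[[str, str], bool],
    bigrams: Counter,
) -> str:
    """Pick one analysis for a token.

    Stage 1: a binary classifier labels each candidate true or false.
    Stage 2 (assumed failure case): if zero or several candidates are
    accepted, fall back to the most frequent bigram with prev_tag.
    """
    accepted = [c for c in candidates if is_correct(c, prev_tag)]
    if len(accepted) == 1:
        return accepted[0]
    # Fallback pool: the accepted candidates, or all of them if none passed.
    pool = accepted or list(candidates)
    return max(pool, key=lambda c: bigrams[(prev_tag, c)])


def toy_classifier(analysis: str, prev_tag: str) -> bool:
    """Hypothetical stand-in for a trained true/false classifier."""
    return analysis == "NOUN" and prev_tag == "DET"


# Toy tag sequences, invented for illustration only.
bigrams = train_bigrams([["DET", "NOUN", "VERB"], ["PRON", "VERB"], ["DET", "NOUN"]])

after_det = disambiguate(["NOUN", "VERB"], "DET", toy_classifier, bigrams)
after_pron = disambiguate(["NOUN", "VERB"], "PRON", toy_classifier, bigrams)
```

In the first call the classifier accepts exactly one candidate, so no fallback is needed. In the second call it accepts none, so the bigram counts decide between the two candidates.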

Keywords