Journal of Information Systems and Informatics (Dec 2023)

Detection of Hate Speech Code Mix Involving English and Other Nigerian Languages

  • Joseph Nda Ndabula,
  • Oyenike Mary Olanrewaju,
  • Faith O Echobu

DOI
https://doi.org/10.51519/journalisi.v5i4.595
Journal volume & issue
Vol. 5, no. 4
pp. 1416 – 1431

Abstract

Hate speech is a recurring problem and a cause for global concern. Its recent proliferation has created a breeding ground for violence and discrimination against specific individuals or groups. In Nigeria, message masking (the use of mixed languages) has become the norm, especially for disseminating hateful and inciting comments, so there is a need to curb its spread on social media. This research therefore focuses on detecting hate speech on social media written in a code-mix of English, Pidgin, and any of the three major Nigerian languages (Hausa, Igbo, and Yoruba). Two machine learning algorithms were used: Support Vector Machine (SVM) and Random Forest (RF). Data were collected from tweets on the EndSARS protest and the 2023 Nigerian elections. The major features were extracted, and the text was converted into vectors using TF-IDF and Bag-of-Words (BoW), which were used to train and test the models. The results showed that SVM classified hate speech better than RF on both TF-IDF and BoW features, averaging 93.43% accuracy, 93.70% precision, 93.43% recall, and 93.57% F1-score.
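
The two text representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the function names, and the three benign code-mixed sentences are all hypothetical, and the weighting is the textbook form tf × log(N/df). Library implementations often use smoothed variants, and the abstract does not say which one was used.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the abstract does not
    # describe the actual preprocessing.
    return text.lower().split()


def build_vocabulary(docs):
    # Sorted list of every distinct token in the corpus.
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def bow_vector(doc, vocab):
    # Bag-of-Words: raw count of each vocabulary term in the document.
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]


def tfidf_vectors(docs, vocab):
    # Textbook TF-IDF: (term count / document length) * log(N / document frequency).
    n_docs = len(docs)
    tokenized = [tokenize(doc) for doc in docs]
    doc_freq = {term: sum(1 for toks in tokenized if term in toks) for term in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        length = len(toks) or 1  # avoid division by zero for an empty document
        vectors.append(
            [(counts[term] / length) * math.log(n_docs / doc_freq[term]) for term in vocab]
        )
    return vectors


# Hypothetical, benign code-mixed examples (not taken from the study's dataset).
docs = [
    "abeg make we vote in peace",
    "this election na wahala",
    "we go vote this election",
]

vocab = build_vocabulary(docs)
bow = [bow_vector(doc, vocab) for doc in docs]
tfidf = tfidf_vectors(docs, vocab)
```

Each tweet becomes a fixed-length numeric vector over the shared vocabulary. BoW keeps raw counts, whereas TF-IDF down-weights terms that occur in many documents. Vectors of either kind could then be passed to a classifier such as SVM or RF.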

Keywords