Algorithms (Aug 2022)
Social Media Hate Speech Detection Using Explainable Artificial Intelligence (XAI)
Abstract
Explainable artificial intelligence (XAI) has flexible and multifaceted potential in hate speech detection by deep learning models. The aim of this research was to interpret and explain the decisions made by complex artificial intelligence (AI) models in order to understand their decision-making process. Two datasets were used to demonstrate hate speech detection using XAI. Data preprocessing was performed to remove inconsistencies from the data, clean the text of the tweets, and tokenize and lemmatize the text, among other steps. Categorical variables were also simplified to produce a clean dataset for training. Exploratory data analysis was performed on the datasets to uncover patterns and insights. Several existing models were applied to the Google Jigsaw dataset, including decision trees, k-nearest neighbors, multinomial naïve Bayes, random forest, logistic regression, and long short-term memory (LSTM), among which LSTM achieved an accuracy of 97.6%. Explainable methods such as LIME (local interpretable model-agnostic explanations) were applied to the HateXplain dataset. Variants of the BERT (bidirectional encoder representations from transformers) model, such as BERT + ANN (artificial neural network) with an accuracy of 93.55% and BERT + MLP (multilayer perceptron) with an accuracy of 93.67%, were created to achieve good performance in terms of explainability on the ERASER (evaluating rationales and simple English reasoning) benchmark.
Keywords