Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) (Aug 2024)

An Optimized Hyperparameter Tuning for Improved Hate Speech Detection with Multilayer Perceptron

  • Muhamad Ridwan,
  • Ema Utami

DOI
https://doi.org/10.29207/resti.v8i4.5949
Journal volume & issue
Vol. 8, no. 4
pp. 525 – 534

Abstract

Hate speech classification is a critical task in natural language processing, aiming to mitigate the negative impact of harmful content on digital platforms. This study explores the application of a Multilayer Perceptron (MLP) model to hate speech classification, using Bag of Words (BoW) for feature extraction. The hypothesis is that hyperparameter tuning through sophisticated optimization techniques will significantly improve model performance. To validate this hypothesis, we employed two distinct hyperparameter tuning approaches: Random Search and Optuna. Random Search provides a straightforward yet effective means of exploring the hyperparameter space, while Optuna offers a more sophisticated, optimization-based approach to hyperparameter selection. The MLP model was trained on a labeled dataset of 13,169 hate speech and non-hate speech entries obtained by crawling the Twitter platform, and was then evaluated using standard metrics. Our experimental results show the comparative effectiveness of the two tuning methods. Notably, the MLP model tuned with Optuna achieved an F1-score of 81.49%, compared to 79.70% with Random Search, indicating that Optuna was the more effective method for optimizing the hyperparameters. These results were obtained through extensive cross-validation to ensure robustness and generalizability. The findings underscore the importance of optimized hyperparameters in developing robust hate speech classification systems. The superior performance of Optuna highlights its potential for broader application in other machine learning tasks that require hyperparameter optimization. This improvement enables more reliable and efficient automated moderation, which is crucial for the integrity and security of digital communication platforms such as Twitter.
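To make the Random Search idea concrete, the following is a minimal sketch in plain Python and not the authors' implementation. The search space, the names `SEARCH_SPACE`, `toy_objective`, and `random_search`, and the number of trials are all assumptions chosen for illustration; the abstract does not state the actual hyperparameter ranges. The placeholder objective stands in for what would, in a real experiment, be training the MLP on BoW features and returning its cross-validated F1-score.

```python
import math
import random

# Hypothetical search space for an MLP. The ranges actually used in the
# study are not given in the abstract; these values are illustrative only.
SEARCH_SPACE = {
    "hidden_units": [32, 64, 128, 256],
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "alpha": [1e-5, 1e-4, 1e-3],
}


def toy_objective(params):
    """Placeholder score (higher is better).

    A real objective would train the MLP with `params` and return its
    cross-validated F1-score. This stand-in simply prefers
    hidden_units near 128 and learning_rate near 1e-3.
    """
    lr_penalty = abs(math.log10(params["learning_rate"]) + 3)
    size_penalty = abs(params["hidden_units"] - 128) / 128
    return -(lr_penalty + size_penalty)


def random_search(objective, space, n_trials, seed=0):
    """Sample `n_trials` configurations uniformly at random; keep the best."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter, independently of past trials.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(toy_objective, SEARCH_SPACE, n_trials=20)
print(best_params, best_score)
```

Each trial here is drawn independently of all earlier trials. An optimization-based tuner such as Optuna differs in that it uses the scores of past trials to decide which configurations to try next.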

Keywords