Sara Detection on Social Media Using Deep Learning Algorithm Development

M. Khairul Anam; Lucky Lhaura Van FC; Hamdani Hamdani; Rahmaddeni Rahmaddeni; Junadhi Junadhi; Muhammad Bambang Firdaus; Irwanda Syahputra; Yuda Irawan

doi:10.37385/jaets.v6i1.5390

Journal of Applied Engineering and Technological Science (Dec 2024)

Affiliations

M. Khairul Anam: Universitas Samudra
Lucky Lhaura Van FC: Universitas Lancang Kuning
Hamdani Hamdani: Universitas Mulawarman
Rahmaddeni Rahmaddeni: Universitas Sains dan Teknologi Indonesia
Junadhi Junadhi: Universitas Sains dan Teknologi Indonesia
Muhammad Bambang Firdaus: Universitas Mulawarman
Irwanda Syahputra: Universitas Samudra
Yuda Irawan: Universitas Hang Tuah Pekanbaru

DOI: https://doi.org/10.37385/jaets.v6i1.5390
Journal volume & issue: Vol. 6, no. 1

Abstract

Social media has become a key platform for disseminating information and opinions, particularly in Indonesia, where SARA (Ethnicity, Religion, Race, and Intergroup) issues can fuel social tensions. To address this, developing an automated system to detect and classify harmful content is essential. This study develops a deep learning model using Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) to detect SARA-related comments on Twitter. The method involves data collection through web scraping, followed by cleaning, manual labeling, and text preprocessing. To address data imbalance, SMOTE (Synthetic Minority Over-sampling Technique) is applied, while early stopping prevents overfitting. Model performance is evaluated using precision, recall, and F1-score. The results demonstrate that SMOTE significantly improves model performance, particularly in detecting minority-class SARA comments. CNN+SMOTE achieves an accuracy of 93%, and BiLSTM+SMOTE records a recall of 88%, effectively capturing patterns in SARA and non-SARA data. With SMOTE and early stopping, the model successfully manages class imbalance and reduces overfitting. This research supports efforts to curtail hate speech on social media, especially in the Indonesian context, where SARA-related issues often dominate public discourse.
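
As an illustration of two steps named in the abstract, the sketch below shows the core idea behind SMOTE (interpolating between a minority-class sample and one of its nearest minority neighbours) and how precision, recall, and F1-score are computed for one class. It is a minimal standard-library sketch, not the authors' implementation: the function names, the toy feature vectors, and the choice of `k` are assumptions made for illustration, and the abstract does not state which text representation SMOTE was applied to.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea).

    `minority` is a list of numeric feature vectors (hypothetical here);
    it must contain at least two samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1-score for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For example, `smote_like_oversample([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], 5)` returns five new vectors that lie on line segments between the given minority points, and `precision_recall_f1(y_true, y_pred)` can then be called on held-out labels and predictions. In practice, oversampling would be applied only to the training split so that synthetic samples do not leak into evaluation.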

Published in Journal of Applied Engineering and Technological Science

ISSN: 2715-6087 (Print); 2715-6079 (Online)
Publisher: Yayasan Pendidikan Riset dan Pengembangan Intelektual (YRPI)
Country of publisher: Indonesia
LCC subjects: Technology: Engineering (General). Civil engineering (General); Technology: Technology (General)
Website: https://journal.yrpipku.com/index.php/jaets/index
