Jurnal Teknologi Informasi dan Ilmu Komputer (Aug 2024)
Optimizing Sentiment Classification of Mobile Game User Comments Using SVM, Grid Search, and N-Gram Combinations
Abstract
Online gaming has become a significant cultural phenomenon within a rapidly expanding industry. Game users and developers use sentiment analysis to understand player opinions and reviews, which in turn guide game development and improvement. In this study, sentiment classification was performed with the Support Vector Machine (SVM) algorithm, using N-Gram techniques for feature selection. Grid Search (GS) was used for hyperparameter optimization to achieve the highest possible accuracy. To evaluate the impact of these methods, experiments were conducted across several scenarios that varied the amount of data, the hyperparameter settings, the ratio of training to testing data, and the N-Gram configuration. The performance of the classification model was assessed using metrics such as Accuracy, Precision, Recall, and the Area Under the ROC Curve (AUC).
The results indicate that the model classifies user reviews effectively when using 3600 rows from a combined dataset (Allgame) together with combined Unigram, Bigram, and Trigram (UniBiTri) N-Gram features, the Radial Basis Function (RBF) kernel, and k-fold cross-validation (k=10). In this configuration, the model achieved an accuracy of 87.3%, a precision of 88.5%, a recall of 85.5%, and an AUC of 0.9081.
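The configuration described in the abstract (combined unigram, bigram, and trigram features, an RBF-kernel SVM, Grid Search, and k-fold cross-validation) can be illustrated with a minimal sketch. The abstract does not name a library, a feature weighting scheme, or the hyperparameter values searched, so the use of scikit-learn, the TF-IDF weighting, the `C` and `gamma` grids, the helper name `build_search`, and the toy reviews below are all assumptions made for illustration; none of them come from the study, and the sketch is not expected to reproduce its reported figures.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Made-up toy reviews for illustration only (NOT the Allgame dataset).
# Labels: 1 = positive sentiment, 0 = negative sentiment.
TOY_REVIEWS = [
    "great game love it",
    "fun and smooth gameplay",
    "awesome graphics great fun",
    "love the new update",
    "bad lag and crashes",
    "terrible update hate it",
    "too many ads bad game",
    "crashes every time awful",
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def build_search(n_splits=10):
    """Return a grid search over an n-gram + RBF-SVM pipeline (hypothetical helper)."""
    pipeline = Pipeline([
        # ngram_range=(1, 3) extracts unigrams, bigrams, and trigrams together,
        # which corresponds to the "UniBiTri" combination. TF-IDF weighting
        # is an assumption; the abstract does not specify the weighting.
        ("ngrams", TfidfVectorizer(ngram_range=(1, 3))),
        ("svm", SVC(kernel="rbf")),
    ])
    # Assumed grid values; the abstract does not list the values searched.
    param_grid = {
        "svm__C": [0.1, 1, 10],
        "svm__gamma": ["scale", 0.01, 0.1],
    }
    # Stratified folds keep both classes present in every test fold.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    return GridSearchCV(
        pipeline,
        param_grid,
        cv=cv,
        # Report the four metrics named in the abstract; select by accuracy.
        scoring=["accuracy", "precision", "recall", "roc_auc"],
        refit="accuracy",
    )


if __name__ == "__main__":
    # Only 2 folds here because the toy set is tiny; the study used k=10.
    search = build_search(n_splits=2)
    search.fit(TOY_REVIEWS, TOY_LABELS)
    print(search.best_params_)
```

With a real labeled review corpus, calling `build_search()` with its default of 10 folds would match the k=10 setting from the abstract, and `search.cv_results_` would hold the mean accuracy, precision, recall, and AUC for each hyperparameter combination.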