Protek: Jurnal Ilmiah Teknik Elektro (Sep 2024)
Comparison of Feature Extraction Methods for Conducting Sentiment Classification in Ternate Malay Language using Machine Learning Approaches
Abstract
Residents of Ternate, North Maluku, often communicate on social media in local languages. This makes it difficult for newcomers to understand the implied meaning and emotions of those messages. This research aims to develop a natural language processing (NLP)-based emotion classification method that can be applied to Ternate Malay text datasets. Applying NLP is expected to improve the accuracy of emotion detection and classification in such text. The research applies and compares the performance of several classification models trained on Ternate Malay text datasets: Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest, Decision Tree, and Logistic Regression. Each model is applied with both Bag-of-Words (BoW) and Word2Vec vector representations. The evaluation results show that the BoW+SVM model achieves the highest performance with 77% accuracy, followed by BoW+Random Forest (75%) and BoW+Logistic Regression (73%). It can therefore be concluded that NLP can be applied to Ternate Malay language datasets to classify emotions from text.
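To illustrate the kind of pipeline the abstract describes, the following is a minimal sketch of Bag-of-Words vectorization followed by a K-Nearest Neighbors classifier, written in plain Python. It is not the authors' implementation: the function names, the value of k, the whitespace tokenizer, and the tiny English example sentences and labels are all hypothetical stand-ins for the Ternate Malay dataset, which is not reproduced here.

```python
import math
from collections import Counter


def build_vocabulary(texts):
    """Map every distinct lowercase token in the corpus to a column index."""
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bow_vector(text, vocab):
    """Count how often each vocabulary token occurs; unknown tokens are ignored."""
    vec = [0] * len(vocab)
    for tok, count in Counter(text.lower().split()).items():
        if tok in vocab:
            vec[vocab[tok]] = count
    return vec


def knn_predict(train_vecs, train_labels, query_vec, k=3):
    """Majority vote among the k training vectors closest in Euclidean distance."""
    ranked = sorted(
        (math.dist(vec, query_vec), label)
        for vec, label in zip(train_vecs, train_labels)
    )
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy corpus (English placeholders, not the paper's data).
train_texts = [
    "i am so happy today",
    "what a happy joyful day",
    "i feel sad and alone",
    "such a sad gloomy day",
    "this makes me angry",
    "i am angry and furious",
]
train_labels = ["joy", "joy", "sadness", "sadness", "anger", "anger"]

vocab = build_vocabulary(train_texts)
train_vecs = [bow_vector(text, vocab) for text in train_texts]

prediction = knn_predict(
    train_vecs, train_labels, bow_vector("so happy and joyful", vocab), k=3
)
print(prediction)
```

In practice, the other models named in the abstract (SVM, Random Forest, Decision Tree, Logistic Regression) would take the same BoW vectors as input, and a Word2Vec representation would replace the count vector with dense embeddings learned from the corpus.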
Keywords