Jurnal ELTIKOM: Jurnal Teknik Elektro, Teknologi Informasi dan Komputer (Jun 2025)

Imbalanced Text Classification on Tourism Reviews using Ada-boost Naïve Bayes

  • Ika Oktavia Suzanti,
  • Fajrul Ihsan Kamil,
  • Eka Mala Sari Rochman,
  • Huzain Azis,
  • Alfa Faridh Suni,
  • Fika Hastarita Rachman,
  • Firdaus Solihin

DOI
https://doi.org/10.31961/eltikom.v9i1.1496
Journal volume & issue
Vol. 9, no. 1
pp. 91 – 97

Abstract

Hidden paradise is a term that aptly describes the island of Madura, which offers diverse tourism potential. Through the Google Maps application, tourists can access sentiment-based information about various attractions in Madura, which serves both as a reference before visiting and as evaluation material for the local government. The Multinomial Naïve Bayes method was used for text classification because of its simplicity and effectiveness in text mining tasks. Sentiment was classified into three categories: positive, negative, and mixed. Initial analysis revealed an imbalance in the sentiment data, with most reviews being positive. To address this, two sampling techniques, oversampling and undersampling, were applied to achieve a more balanced class distribution. In addition, the Adaptive Boosting ensemble method was used to improve the accuracy of the Multinomial Naïve Bayes model. The dataset was split into training and testing sets at ratios of 60:40, 70:30, and 80:20 to evaluate the model's stability and reliability. The highest F1-score, 84.1%, was achieved by Multinomial Naïve Bayes with Adaptive Boosting, outperforming the model without boosting, which reached an accuracy of 76%.
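
The class-balancing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which oversampling or undersampling variant was used, so the code below assumes plain random resampling. The function name `resample`, its parameters, and the toy reviews are hypothetical and are not taken from the paper or its dataset.

```python
import random
from collections import Counter, defaultdict


def resample(texts, labels, mode="over", seed=0):
    """Balance classes by random resampling (an assumed variant, not the paper's).

    mode="over":  duplicate minority-class items until every class
                  matches the size of the largest class.
    mode="under": drop items until every class matches the size of
                  the smallest class.
    """
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")
    rng = random.Random(seed)

    # Group the reviews by their sentiment label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Enough items: pick `target` of them without replacement.
            chosen = rng.sample(items, target)
        else:
            # Too few items: keep all and add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        out_texts.extend(chosen)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Toy, made-up reviews with a positive-heavy distribution.
texts = [
    "beautiful beach", "clean and quiet", "great sunset", "friendly staff",
    "lovely view", "worth the trip", "dirty toilets", "too crowded",
    "nice place but expensive parking",
]
labels = ["positive"] * 6 + ["negative"] * 2 + ["mixed"] * 1

over_texts, over_labels = resample(texts, labels, mode="over")
under_texts, under_labels = resample(texts, labels, mode="under")
print(Counter(labels), Counter(over_labels), Counter(under_labels))
```

In a pipeline like the one the abstract outlines, resampling of this kind would normally be applied to the training split only, so that the test split keeps the original class distribution.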

Keywords