Journal of Applied Informatics and Computing (Mar 2025)

Effectiveness of AdaBoost and XGBoost Algorithms in Sentiment Analysis of Movie Reviews

  • I Gusti Ayu Nandia Lestari,
  • Ni Made Rai Masita Dewi,
  • Komang Gita Meiliana,
  • I Komang Agus Ady Aryanto

DOI
https://doi.org/10.30871/jaic.v9i2.9077
Journal volume & issue
Vol. 9, no. 2
pp. 258 – 264

Abstract

Many entertainment platforms now provide movies, TV shows, games, and other content. These platforms usually offer a variety of features, including user reviews. Reviews written by viewers play an important role in shaping public interest in a film. However, the growing volume of reviews makes it difficult to assess the overall sentiment toward a film quickly and accurately. This highlights the need for a system that can analyze reviews by sentiment, making it easier for viewers to evaluate a film and helping the entertainment industry understand audience needs. This study therefore develops a sentiment analysis model that uses machine learning algorithms to identify whether a review expresses positive or negative sentiment. The data used to build the model consists of user film reviews from the IMDb platform; the dataset is available on Kaggle and contains 50,000 movie reviews in text format, with two columns: review_text and sentiment. The classification models are built with AdaBoost and XGBoost. Preprocessing consists of text cleaning, tokenization, stopword removal, lemmatization, and TF-IDF vectorization to convert the review text into numeric form, and the positive and negative labels are encoded as 1 and 0, respectively. In model training with cross-validation, XGBoost achieves an accuracy of 85% and AdaBoost 77%. Feature selection improves the accuracy of XGBoost from 85% to 86%, while the accuracy of AdaBoost remains at 77%. It can therefore be concluded that XGBoost outperforms AdaBoost in sentiment classification of these reviews.
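
The preprocessing stages named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stopword list, the two sample reviews, and the function names `preprocess` and `tfidf` are assumptions made for illustration, lemmatization is omitted because it normally relies on an external library, and the IDF formula is one common smoothed variant rather than the one confirmed by the paper.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the paper does not list its stopwords).
STOPWORDS = {"a", "an", "the", "is", "was", "this", "it", "and", "of", "to"}

# Label encoding described in the abstract: positive -> 1, negative -> 0.
LABELS = {"positive": 1, "negative": 0}


def preprocess(text):
    """Clean, tokenize, and drop stopwords (lemmatization is omitted in this sketch)."""
    text = re.sub(r"<[^>]+>", " ", text.lower())  # strip HTML tags such as <br />
    tokens = re.findall(r"[a-z]+", text)          # keep alphabetic tokens only
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return the sorted vocabulary and one L2-normalized TF-IDF vector per document."""
    vocab = sorted({t for doc in docs for t in doc})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in docs for t in set(doc))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        vectors.append([v / norm for v in row])
    return vocab, vectors


# Hypothetical sample rows shaped like the (review_text, sentiment) columns.
reviews = [
    ("This movie was great, a great story!", "positive"),
    ("The plot was boring<br />and the acting was bad.", "negative"),
]
docs = [preprocess(text) for text, _ in reviews]
y = [LABELS[label] for _, label in reviews]
vocab, X = tfidf(docs)
```

In a full pipeline, the rows of `X` and the labels in `y` would be passed to the AdaBoost and XGBoost classifiers and scored with cross-validation; that step is left out here because it depends on third-party libraries.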

Keywords