Sistemasi: Jurnal Sistem Informasi (May 2024)

Air Quality Index Classification for Imbalanced Data using Machine Learning Approach

  • Bryan Valentino Jayadi,
  • Manatap Dolok Lauro,
  • Zyad Rusdi,
  • Teny Handhayani

DOI
https://doi.org/10.32520/stmsi.v13i3.3503
Journal volume & issue
Vol. 13, no. 3
pp. 951 – 958

Abstract

Air pollution is a societal problem that affects human health and the environment. In Indonesia, the air quality index is measured by the levels of particulate matter 10 (PM10), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3), and nitrogen dioxide (NO2). This research evaluates the performance of five machine learning algorithms, namely Support Vector Machine (SVM), Naïve Bayes, Logistic Regression, Decision Tree, and AdaBoost, in classifying the air quality index from the levels of PM10, CO, SO2, O3, and NO2 when the class samples are imbalanced. The air quality index is classified as Good, Moderate, or Unhealthy. The dataset was downloaded from Open Data Jakarta and covers 2010 to 2021. It contains 4383 samples: 1155 Good, 3087 Moderate, and 141 Unhealthy. The experimental results show that Decision Tree outperforms the other methods, achieving accuracy, precision, recall, and F1-score of 99%, 98%, 99%, and 98%, respectively.
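
To illustrate why the class imbalance matters when reading these metrics, the following is a minimal sketch in plain Python that uses only the class counts reported in the abstract. It scores a hypothetical baseline that always predicts the majority class. The function names (`class_shares`, `macro_metrics`) and the choice of macro averaging are assumptions made for illustration; the abstract does not state how the authors averaged precision, recall, and F1-score, and this is not their evaluation code.

```python
from collections import Counter

# Class counts as reported in the abstract (4383 samples in total).
CLASS_COUNTS = {"Good": 1155, "Moderate": 3087, "Unhealthy": 141}


def class_shares(counts):
    """Return each class's fraction of the total sample count."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def macro_metrics(y_true, y_pred, labels):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    Macro averaging (an assumption here) weights every class equally,
    so a rare class such as Unhealthy counts as much as Moderate.
    """
    pairs = Counter(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(pairs[(t, label)] for t in labels if t != label)
        fn = sum(pairs[(label, p)] for p in labels if p != label)
        # Define a metric as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(pairs[(l, l)] for l in labels) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


labels = list(CLASS_COUNTS)
shares = class_shares(CLASS_COUNTS)

# Rebuild a label list with the reported class distribution.
y_true = [label for label, n in CLASS_COUNTS.items() for _ in range(n)]

# Hypothetical baseline: always predict the most frequent class.
majority = max(CLASS_COUNTS, key=CLASS_COUNTS.get)
y_pred = [majority] * len(y_true)
baseline = macro_metrics(y_true, y_pred, labels)

for label in labels:
    print(f"{label}: {shares[label]:.1%} of samples")
print({name: round(value, 3) for name, value in baseline.items()})
```

Under these assumptions, the baseline reaches roughly 70% accuracy simply because Moderate makes up about 70% of the samples, while its macro-averaged recall is only one third, since it never predicts Good or Unhealthy. This is why precision, recall, and F1-score are reported alongside accuracy for an imbalanced dataset like this one.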