Performance Analysis of Random Forest Using Attribute Normalization

Arie Nugroho; Abdullah Husin

doi:10.32520/stmsi.v11i1.1681

Sistemasi: Jurnal Sistem Informasi (Jan 2022)

Affiliations

Arie Nugroho: Universitas Nusantara PGRI Kediri
Abdullah Husin: Universitas Islam Indragiri

Journal volume & issue: Vol. 11, no. 1, pp. 186–196

Abstract

Data mining turns historical data into patterns that can support future decisions. Its methods are commonly grouped into classification, clustering, association, and forecasting. This study uses classification to learn a pattern from a dataset so that it can predict decisions for new data. A dataset used for classification must have a label, or class, for each record. Datasets with an unbalanced number of records per label (imbalanced datasets) can distort the resulting model and its predictions for new data. To address this problem, this research combines an ensemble method with pre-processing. The ensemble learning algorithm used is random forest, and the pre-processing step is attribute normalization, which converts nominal data to numeric values. Random forest is a development of the decision tree, which produces a tree-shaped pattern showing the flow of the classification process. The random forest is trained on the data after attribute normalization has been applied. The study aims to apply attribute normalization and the random forest algorithm to an imbalanced dataset and to measure the resulting accuracy. It uses the public car evaluation dataset from the UCI Repository. The accuracy of this method is approximately 99% with 90% training data and 10% testing data, and approximately 95.95% with 8-fold cross-validation. The number of trees is 100.
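The attribute normalization step described in the abstract, converting nominal values to numeric ones, could look roughly like the following sketch. The attribute names and nominal values are those of the UCI car evaluation dataset. The integer codes, the `ATTRIBUTE_CODES` table, and the `encode_row` helper are assumptions made for illustration, since the abstract does not state the mapping the authors used.

```python
# Hypothetical ordinal codes for the six car evaluation attributes.
# The nominal values come from the UCI dataset description; the integer
# codes are an assumption, not the mapping used in the paper.
ATTRIBUTE_CODES = {
    "buying":   {"low": 0, "med": 1, "high": 2, "vhigh": 3},
    "maint":    {"low": 0, "med": 1, "high": 2, "vhigh": 3},
    "doors":    {"2": 0, "3": 1, "4": 2, "5more": 3},
    "persons":  {"2": 0, "4": 1, "more": 2},
    "lug_boot": {"small": 0, "med": 1, "big": 2},
    "safety":   {"low": 0, "med": 1, "high": 2},
}

# Column order as it appears in the dataset file.
ATTRIBUTE_ORDER = list(ATTRIBUTE_CODES)


def encode_row(row):
    """Convert one record of nominal values into a list of integers."""
    if len(row) != len(ATTRIBUTE_ORDER):
        raise ValueError(
            f"expected {len(ATTRIBUTE_ORDER)} attributes, got {len(row)}"
        )
    # Look up each value in the code table of its own attribute.
    return [
        ATTRIBUTE_CODES[name][value]
        for name, value in zip(ATTRIBUTE_ORDER, row)
    ]


if __name__ == "__main__":
    sample = ["vhigh", "med", "5more", "4", "big", "high"]
    print(encode_row(sample))  # prints [3, 1, 3, 1, 2, 2]
```

Rows encoded this way could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier` with `n_estimators=100`. That library choice is also an assumption, as the abstract does not name the tool the authors used.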

Published in Sistemasi: Jurnal Sistem Informasi

ISSN: 2302-8149 (Print); 2540-9719 (Online)
Publisher: Islamic University of Indragiri
Country of publisher: Indonesia
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://sistemasi.ftik.unisi.ac.id/index.php/stmsi
