E-Jurnal Matematika (Nov 2021)
SMOTE: POTENSI DAN KEKURANGANNYA PADA SURVEI
Abstract
Imbalanced data is a problem that is often found in real-world cases of classification. Imbalanced data causes misclassification will tend to occur in the minority class. This can lead to errors in decision-making if the minority class has important information and it’s the focus of attention in research. Generally, there are two approaches that can be taken to deal with the problem of imbalanced data, the data level approach and the algorithm level approach. The data level approach has proven to be very effective in dealing with imbalanced data and more flexible. The oversampling method is one of the data level approaches that generally gives better results than the undersampling method. SMOTE is the most popular oversampling method used in more applications. In this study, we will discuss in more detail the SMOTE method, potential, and disadvantages of this method. In general, this method is intended to avoid overfitting and improve classification performance in the minority class. However, this method also causes overgeneralization which tends to be overlapping.