Pilar Nusa Mandiri (Mar 2024)
CLASSIFICATION OF CUSTOMERS’ REPEAT ORDER PROBABILITY USING DECISION TREE, NAÏVE BAYES AND RANDOM FOREST
Abstract
Limited customer information in e-commerce sales data in Indonesia hinders companies in designing targeted marketing strategies, especially in identifying customers who are likely to make repeat purchases. E-commerce platforms hide customers' names and cellphone numbers, so the only data available are the products purchased, the number of purchases, and customer addresses. Existing methods for identifying potential customers mostly rely on more complete data features, and research that uses limited e-commerce data for this purpose is scarce. Several algorithms have been widely used to predict repeat purchases in e-commerce. However, their performance has not yet been compared in the context of Indonesian e-commerce with limited data. In this research, the Decision Tree, Naïve Bayes, and Random Forest methods were compared in classifying potential customers using Maschere brand sales data from two e-commerce sites, Tokopedia and Shopee. The results show that Decision Tree achieved an accuracy of 90.91%, Naïve Bayes achieved 37.50%, and Random Forest achieved the highest accuracy, 93.94%. These results indicate that, of the three methods compared, Random Forest performed best at classifying customers' probability of repeat purchases. In the future, these results can be developed further into a decision-making system for identifying potential customers.
Keywords