Jurnal Sisfokom (Jul 2023)

Automatic Categorization of Multi Marketplace FMCGs Products using TF-IDF and PCA Features

  • Sri Suci Indasari,
  • Aris Tjahyanto

DOI
https://doi.org/10.32736/sisfokom.v12i2.1621
Journal volume & issue
Vol. 12, no. 2
pp. 198 – 204

Abstract

The growing use of technology, in line with the increasing number of internet users, has shifted the product sales ecosystem toward electronic commerce (e-commerce). A total of 73.23% of customers made purchase transactions using e-commerce, and the most purchased products were those classified as Fast Moving Consumer Goods (FMCGs). As FMCGs data become more varied and the number of marketplaces grows, the data need to be broken down into specific groups. This is done by analyzing e-commerce product information, especially product names and descriptions. In this study, we propose an automatic categorization of FMCGs products using data from multiple marketplaces. Text data are converted into structured data through a series of preprocessing steps, and comprehensive experiments are carried out to compare the performance of three feature extraction methods: TF-IDF, BOW, and N-Gram. All three methods are evaluated on the text data sets using K-Means grouping, with PCA used to reduce the data dimensions. The results show that TF-IDF with a dimension reduction value of 70, implemented in Python, gives the best results for the percentage of grouped data.
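The pipeline described in the abstract (feature extraction, PCA reduction, K-Means grouping) can be illustrated with a minimal sketch. The abstract does not name the libraries used, so scikit-learn is an assumption here, and the product names, the `cluster_products` helper, and its parameter values are hypothetical. The toy data is far too small for the reduction value of 70 reported above, so the sketch reduces to 2 components instead.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical product names; the study's marketplace data is not reproduced here.
products = [
    "instant noodle chicken flavor",
    "instant noodle beef flavor",
    "instant noodle spicy flavor",
    "laundry detergent liquid refill",
    "laundry detergent powder refill",
    "laundry detergent softener refill",
]


def cluster_products(texts, vectorizer, n_components=2, n_clusters=2, seed=0):
    """Extract text features, reduce them with PCA, and group them with K-Means."""
    # PCA expects a dense array, so the sparse feature matrix is converted.
    features = vectorizer.fit_transform(texts).toarray()
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(features)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(reduced)


# The three feature extraction methods compared in the abstract
# (the bigram setting is an assumed example of an N-Gram configuration).
extractors = {
    "bow": CountVectorizer(),
    "tfidf": TfidfVectorizer(),
    "bigram": CountVectorizer(ngram_range=(2, 2)),
}

labels = {name: cluster_products(products, vec) for name, vec in extractors.items()}
for name, assignment in labels.items():
    print(name, assignment.tolist())
```

Cluster ids are arbitrary, so only the grouping matters: products in the same category should share a label. On real data, `n_components` and `n_clusters` would need to be tuned, for example by comparing grouping quality across several reduction values.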

Keywords