Big Data and Cognitive Computing (Jun 2024)
Application of Natural Language Processing and Genetic Algorithm to Fine-Tune Hyperparameters of Classifiers for Economic Activities Analysis
Abstract
This study proposes a method for classifying economic activity descriptors to match Nomenclature of Economic Activities (NACE) codes, employing a blend of machine learning techniques and expert evaluation. By leveraging natural language processing (NLP) methods to vectorize activity descriptors and utilizing genetic algorithm (GA) optimization to fine-tune hyperparameters in multi-class classifiers like Naive Bayes, Decision Trees, Random Forests, and Multilayer Perceptrons, our aim is to boost the accuracy and reliability of an economic classification system. This system faces challenges due to the absence of precise target labels in the dataset. Hence, it is essential to initially check the accuracy of utilized methods based on expert evaluations using a small dataset before generalizing to a larger one.
Keywords