π-Economy (Oct 2023)
Reinforcement Learning as an Artificial Intelligence Technology to Solve Socio-Economic Problems: Algorithms Performance Assessment
Abstract
Reinforcement learning is at once a class of machine learning and artificial intelligence methods, an applied problem, and the field that studies this problem and the methods for solving it. One such problem is management in social and economic systems: designing optimal control that takes into account the systems' properties, such as differing scales of characteristics, heterogeneity of data samples, incompleteness and gaps in the data, stochasticity of the data, and their multicollinearity and heteroscedasticity. Reinforcement learning methods are not sensitive to these features and can be used with higher efficiency in various applications in economics, finance and business. Reinforcement learning is the approach closest to the way humans learn, and solutions to emerging problems can be found in the field of biological self-learning systems based on the principle of trial and error. Reinforcement learning methods are a computational approach to learning in which the control subject (agent) learns by interacting with a complex, dynamic, often stochastic control object (environment), such as a socio-economic system, in order to maximize the total reward. In modeling, it is very important to choose learning algorithms that adequately reflect the stochastic dynamics of the modeled object and that have high performance. Business and quality metrics that are appropriate for assessing supervised and unsupervised learning methods are not entirely suitable for evaluating the effectiveness of reinforcement learning methods, since there is no empirical data for evaluation. The paper proposes a number of indicators of training quality for managerial decisions generated by reinforcement learning methods. Corporate human resources management is used as an example. The learning algorithms DQN, DDQN, SARSA and PRO are compared on the task of designing optimal trajectories for the proficiency training of personnel. The proposed quality indicators are assessed for the entire group of learning methods, and the algorithm with the highest performance is selected.
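As a rough illustration of the agent and environment interaction summarized in the abstract, the following sketch trains a tabular SARSA agent on a toy proficiency-training task. It is a minimal example under stated assumptions, not the model used in the paper: the environment, the number of skill levels, the 0.8 success probability, the reward values and all function names (`step`, `epsilon_greedy`, `sarsa`, `greedy_policy`) are hypothetical and chosen only to make the loop concrete.

```python
import random

N_LEVELS = 4      # hypothetical skill levels 0..3; reaching level 4 ends the episode
ACTIONS = (0, 1)  # 0 = wait, 1 = send the employee to training


def step(level, action, rng):
    """Toy environment (an assumption): returns (next_level, reward, done)."""
    reward = -1.0  # every period costs one unit (lost productivity or training fee)
    if action == 1 and rng.random() < 0.8:  # training succeeds with probability 0.8
        level += 1
    done = level == N_LEVELS
    if done:
        reward += 10.0  # bonus for a fully trained employee
    return level, reward, done


def epsilon_greedy(q, level, eps, rng):
    """Pick a random action with probability eps, otherwise the greedy one."""
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[level][a])


def sarsa(episodes=3000, alpha=0.1, gamma=0.95, eps=0.1, seed=0, max_steps=200):
    """On-policy SARSA: update Q(s, a) toward r + gamma * Q(s', a')."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_LEVELS)]
    for _ in range(episodes):
        level = 0
        action = epsilon_greedy(q, level, eps, rng)
        for _ in range(max_steps):
            next_level, reward, done = step(level, action, rng)
            if done:
                # Terminal transition: no bootstrapping from a next state.
                q[level][action] += alpha * (reward - q[level][action])
                break
            next_action = epsilon_greedy(q, next_level, eps, rng)
            target = reward + gamma * q[next_level][next_action]
            q[level][action] += alpha * (target - q[level][action])
            level, action = next_level, next_action
    return q


def greedy_policy(q):
    """Best action per skill level according to the learned Q-table."""
    return [max(ACTIONS, key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(sarsa())` returns one action per skill level. In this toy setting the accumulated reward per episode plays the role of the total reward that the agent maximizes; the quality indicators proposed in the paper are not reproduced here.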
Keywords