Learning rate in the reinforcement learning method for unknown location targets searching system

Y. Albrekht; A. Писаренко

doi:10.20535/1560-8956.42.2023.278916

Adaptivni Sistemi Avtomatičnogo Upravlinnâ (May 2023)

Affiliations

Y. Albrekht: Igor Sikorsky Kyiv Polytechnic Institute
A. Писаренко: Igor Sikorsky Kyiv Polytechnic Institute

Journal volume & issue: Vol. 1, no. 42
pp. 3 – 8

Abstract

The object of study is a reinforcement learning system composed of a varying number of mutually independent modules. The article reviews research on reinforcement learning and explains why it is necessary to determine how the learning rate of such a system depends on the number of mutually independent modules. The purpose of the study is to determine the number of independent modules at which the system learns fastest, and to establish whether systems of mutually independent modules can be compared with systems of the same number of interconnected modules. The experiment uses an environment with two types of objects that contribute points to the final score, and Deep Q-Learning algorithms with 36 inputs and 5 possible outcomes. The research is part of the work on a drone flock management system for locating objects in unknown terrain. The article discusses how to determine the number of objects for which reinforcement learning gives the best results, and whether results can be compared between two setups: a flock of objects that share the same input data and are controlled by a single neural network, and a group of mutually independent modules that each make decisions only from input data they receive independently. The article also gives an overview of recent breakthroughs in computer vision and speech recognition, based on efficient training of neural networks on very large data sets, that motivate the use of reinforcement learning. The results indicate that increasing the number of mutually independent modules slows down learning. Ref. 5, fig. 5.
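For orientation, the Deep Q-Learning setup named in the abstract (36 inputs, 5 possible outcomes) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the single hidden layer, its size (`HIDDEN`), the ReLU activation, the epsilon-greedy policy, the discount factor `gamma`, and all function names are assumptions, and the 5 outcomes are interpreted here as 5 discrete actions. Only the input and output sizes come from the abstract.

```python
import random

N_INPUTS = 36    # from the abstract
N_ACTIONS = 5    # from the abstract ("5 possible outcomes", read here as actions)
HIDDEN = 16      # assumption: the abstract does not describe the hidden layers


def init_q_network(rng, hidden=HIDDEN):
    """Create small random weights for a one-hidden-layer Q-network."""
    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "W1": matrix(hidden, N_INPUTS), "b1": [0.0] * hidden,
        "W2": matrix(N_ACTIONS, hidden), "b2": [0.0] * N_ACTIONS,
    }


def _affine(weights, bias, x):
    """Compute weights @ x + bias for list-based matrices."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def q_values(params, state):
    """Map a 36-element state to one estimated Q-value per action."""
    hidden = [max(0.0, h) for h in _affine(params["W1"], params["b1"], state)]  # ReLU
    return _affine(params["W2"], params["b2"], hidden)


def select_action(params, state, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    q = q_values(params, state)
    return max(range(N_ACTIONS), key=q.__getitem__)


def td_target(params, reward, next_state, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a Q(s', a), or r at episode end."""
    if done:
        return reward
    return reward + gamma * max(q_values(params, next_state))


rng = random.Random(0)
params = init_q_network(rng)
state = [rng.random() for _ in range(N_INPUTS)]
action = select_action(params, state, epsilon=0.1, rng=rng)
```

Under these assumptions, a group of mutually independent modules would correspond to several such networks, each created with its own `init_q_network` call and fed its own state, whereas the interconnected setup would use a single network for the whole flock.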

Published in Adaptivni Sistemi Avtomatičnogo Upravlinnâ

ISSN: 1560-8956 (Print); 2522-9575 (Online)
Publisher: Igor Sikorsky Kyiv Polytechnic Institute
Country of publisher: Ukraine
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Automation
Website: http://asac.kpi.ua/
