Journal of Cloud Computing: Advances, Systems and Applications (Feb 2025)
Autonomous decision-making of UAV cluster with communication constraints based on reinforcement learning
Abstract
Artificial intelligence techniques are increasingly applied to autonomous decision-making in unmanned clustered distributed systems. However, communication constraints have become a major bottleneck restricting their performance. To address the need for unmanned aerial vehicles (UAVs) to execute collaborative attack missions in complex communication-constrained environments, this paper proposes an autonomous decision-making method for UAVs based on Multi-Agent Reinforcement Learning (MARL). First, the autonomous decision-making process of a UAV cluster is modeled as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP). Next, the algorithm is enhanced within the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) framework by designing an explicit inter-agent communication mechanism to achieve information exchange among UAVs. Subsequently, the algorithm employs Long Short-Term Memory (LSTM) networks to process the local observations of the UAVs, improving the effectiveness of the transmitted information by combining historical data with current observations. Finally, multiple rounds of experiments are conducted across various communication-constrained scenarios. Simulation results indicate that the proposed method improves task completion capability by 46.0% and enhances stability by 24.9% compared with the baseline MADDPG algorithm. In addition, the algorithm demonstrates better generalization and good scalability, effectively adapting to varying numbers of UAVs. This research provides new theoretical insights and a technical framework for UAV collaboration in communication-constrained environments, which is of practical importance for expanding the capability and application scope of UAV systems.
Keywords