Communications Engineering (Feb 2024)

Stable training via elastic adaptive deep reinforcement learning for autonomous navigation of intelligent vehicles

  • Yujiao Zhao,
  • Yong Ma,
  • Guibing Zhu,
  • Songlin Hu,
  • Xinping Yan

DOI
https://doi.org/10.1038/s44172-024-00182-8
Journal volume & issue
Vol. 3, no. 1
pp. 1–8

Abstract

The uncertain stability of deep reinforcement learning training on complex tasks impedes its development and deployment, especially in intelligent vehicles such as intelligent surface vessels and self-driving cars. Complex and varied environmental states complicate the training of decision-making networks. Here we propose an elastic adaptive deep reinforcement learning algorithm to address these challenges and achieve autonomous navigation in intelligent vehicles. Our method trains the decision-making network over two stages, function learning and optimization learning, in which the state and action spaces of autonomous navigation tasks are pruned by choosing classic states and actions to reduce data similarity, facilitating more stable training. In the function learning stage, we introduce a task-adaptive observed behaviour classification technique that divides the state and action spaces into subspaces and identifies classic states and actions; these classic states and actions are accumulated into the training dataset, which enhances training efficiency. In the subsequent optimization learning stage, the decision-making network is refined through careful exploration and further accumulation of data. The proposed elastic adaptive deep reinforcement learning enables the decision-making network to learn effectively from complex state and action spaces, leading to more efficient training than traditional deep reinforcement learning approaches. Simulation results demonstrate the effectiveness of our method in training decision-making networks for intelligent vehicles, validating that it provides reliable and efficient training. Moreover, our method trains stably on other tasks characterized by continuous state and action spaces.
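
The core mechanism described in the abstract is pruning the observed data to classic, mutually dissimilar states before training. As a rough illustration of that idea only, the following minimal Python sketch selects a low-redundancy subset of buffered states via greedy farthest-point selection; this selection rule and all names here are hypothetical stand-ins, not the authors' task-adaptive observed behaviour classification technique.

```python
# Hypothetical sketch: prune a buffer of observed states to a "classic"
# (mutually dissimilar) subset, reducing data similarity before training.
import numpy as np

def select_classic(states: np.ndarray, k: int) -> np.ndarray:
    """Greedy farthest-point selection over state vectors.

    Returns k row indices of `states` that are mutually dissimilar,
    a simple stand-in for identifying classic states.
    """
    chosen = [0]  # seed with the first observed state
    dists = np.linalg.norm(states - states[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))  # state farthest from all chosen so far
        chosen.append(nxt)
        # distance of every state to its nearest chosen state
        dists = np.minimum(dists, np.linalg.norm(states - states[nxt], axis=1))
    return np.array(chosen)

# Toy usage: a buffer of 1000 four-dimensional states
# (e.g., a vessel's position, heading, and speed).
rng = np.random.default_rng(0)
buffer_states = rng.normal(size=(1000, 4))
classic_idx = select_classic(buffer_states, k=64)
classic_batch = buffer_states[classic_idx]  # subset used to train the network
print(classic_batch.shape)                  # (64, 4)
```

In this sketch the pruned batch would feed the decision-making network's updates in place of the full, highly redundant buffer; the paper's actual pipeline additionally classifies behaviours into subspaces and prunes the action space, which this illustration omits.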