Applied Artificial Intelligence (Dec 2024)
Deep Recurrent Reinforcement Learning for Intercept Guidance Law under Partial Observability
Abstract
Nowadays, the rapid development of hypersonic vehicles brings great challenges to the missile defense system. As achieving successful interception depends highly on terminal guidance laws, research on guidance laws for intercepting highly maneuvering targets has aroused increasing attention. Artificial intelligence technologies, such as deep reinforcement learning (DRL), have been widely applied to improve the performance of guidance laws. However, the existing DRL guidance laws rarely consider the partial observability problem of onboard sensors, resulting in the limitations of their engineering applications. In this paper, a deep recurrent reinforcement learning (DRRL)-based guidance method is investigated to address the intercept guidance problem against maneuvering targets under partial observability. The sequence consisting of previous state observations is utilized as the input of the policy network. A recurrent layer is introduced into the networks to extract hidden information behind the temporal sequence to support policy training. The guidance problem is formulated as a partially observable Markov decision process model, and then a range-weighted reward function that considers the line-of-sight rate and energy consumption is designed to guarantee convergence of policy training. The effectiveness of the proposed DRRL guidance law is validated by extensive numerical simulations.