Безопасность информационных технологий (Jun 2024)

ENSURING SAFETY WHILE ENHANCING PERFORMANCE: ENCOURAGING REINFORCEMENT LEARNING BY ADDRESSING CONSTRAINTS AND UNCERTAINTY

  • Mohsen Abdollahzadeh Aghbolagh

DOI
https://doi.org/10.26583/bit.2024.2.06
Journal volume & issue
Vol. 31, no. 2
pp. 90 – 110

Abstract

Despite advances in reinforcement learning, striking a balance between safety and task performance remains a critical concern. To address this issue, a versatile framework named Safety Goes Along with Performance (SGAWP) is proposed, built on off-policy algorithms grounded in value function optimization. SGAWP uses reinforcement learning to explore the data space, emphasizing high task performance while addressing risks (such as undesirable states) by incorporating safety costs into the value function. By integrating uncertainty management and task performance constraints, SGAWP aims to improve safety performance while retaining respectable task performance. Moreover, SGAWP leverages curiosity-driven exploration to expand the data space and employs task policies to enhance safety policy performance. As a result, SGAWP improves safety performance with minimal loss in task performance. Beyond its success in reinforcement learning, SGAWP holds promise for applications such as autonomous driving, where safety is paramount. In experiments across various off-policy algorithms, SGAWP generalizes robustly and achieves its objectives.

Keywords