C2RL: Convolutional-Contrastive Learning for Reinforcement Learning Based on Self-Pretraining for Strong Augmentation

Sanghoon Park; Jihun Kim; Han-You Jeong; Tae-Kyoung Kim; Jinwoo Yoo

doi:10.3390/s23104946

Sensors (May 2023)

Affiliations

Sanghoon Park: Graduate School of Automotive Engineering, Kookmin University, Seoul 02707, Republic of Korea
Jihun Kim: Graduate School of Automotive Engineering, Kookmin University, Seoul 02707, Republic of Korea
Han-You Jeong: Department of Electrical Engineering, Pusan National University, Busan 46241, Republic of Korea
Tae-Kyoung Kim: Department of Electronic Engineering, Gachon University, Seongnam 13120, Republic of Korea
Jinwoo Yoo: Department of Automobile and IT Convergence, Kookmin University, Seoul 02707, Republic of Korea

DOI: https://doi.org/10.3390/s23104946
Journal volume & issue: Vol. 23, no. 10, p. 4946

Abstract

Reinforcement learning agents must be robust in test environments that were not seen during training. However, generalization is difficult to achieve in reinforcement learning that uses high-dimensional images as input. Adding a self-supervised learning framework with data augmentation to the reinforcement learning architecture can promote generalization to a certain extent, but excessively large changes in the input images may disturb reinforcement learning. Therefore, we propose a contrastive learning method that helps manage the trade-off between reinforcement learning performance and auxiliary-task performance with respect to the data augmentation strength. In this framework, strong augmentation does not disturb reinforcement learning and instead maximizes the auxiliary effect for generalization. Experiments on the DeepMind Control suite demonstrate that the proposed method effectively uses strong data augmentation and generalizes better than existing methods.

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
