IEEE Access (Jan 2023)
Detect to Focus: Latent-Space Autofocusing System With Decentralized Hierarchical Multi-Agent Reinforcement Learning
Abstract
State-of-the-art object detection models are frequently trained offline using available datasets, such as ImageNet: large and overly diverse data that are unbalanced and hard to cluster semantically. This kind of training drops the object detection performance should the change in illumination, in the environmental conditions (e.g., rain or dust), or in the lens positioning (out-of-focus blur) occur. We propose a simple way to intelligently control the camera and the lens focusing settings in such scenarios using DASHA, a Decentralized Autofocusing System with Hierarchical Agents. Our agents learn to focus on scenes in challenging environments, significantly enhancing the pattern recognition capacity beyond the popular detection models (YOLO, Faster R-CNN, and Retina are considered). At the same time, the decentralized training allows preserving the equipment from overheating. The algorithm relies on the latent representation of the camera’s stream and, thus, it is the first method to allow a completely no-reference imaging, where the system trains itself to auto-focus itself. The paper introduces a novel method for auto-tuning imaging equipment via hierarchical reinforcement learning. The technique involves the use of two interacting agents which independently manage the camera and lens settings, enabling optimal focus across different lighting situations. The unique aspect of this approach is its dependence on the latent feature vector of the real-time image scene for autofocusing, marking it as the first method of its kind to auto-tune a camera without necessitating reference or calibration data.
Keywords