IEEE Access (Jan 2024)
Object Detection Method Using Image and Number of Objects on Image as Label
Abstract
Object detection is an essential step in many applications. Since the advent of deep learning, convolutional neural networks and transformers have significantly improved object detection compared to statistically motivated algorithms. However, they still need improvement in several respects. One is maintaining detection performance in unseen environments without retraining on images and labels from those environments. Another is relaxing the strict labeling requirements: in object detection, each object is usually labeled with a bounding box. In this paper, we propose an object detection algorithm that requires only images and the number of objects in each image as labels, not bounding boxes. We approach the problem with deep reinforcement learning, using an actor-critic algorithm that can produce continuous actions. The actor model produces multiple bounding boxes, and the critic model learns to evaluate them more accurately as training progresses. We also propose a reward model built on a model pre-trained on an object detection dataset. Experiments show that the proposed algorithm achieves results comparable to a transformer-based approach. It can also adapt to unseen environments using only images and the number of objects in each image.
Keywords