The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (May 2022)
ACTIVE REINFORCEMENT LEARNING FOR THE SEMANTIC SEGMENTATION OF IMAGES CAPTURED BY MOBILE SENSORS
Abstract
In recent years, various Convolutional Neural Networks (CNNs) have been used to achieve strong performance on semantic segmentation tasks. However, these supervised learning methods require extensive amounts of annotated training data to perform well. Additionally, a model must be trained on the same kind of dataset in order to generalize to other tasks. Furthermore, real-world datasets are usually highly imbalanced, which leads to poor performance on underrepresented classes, even though these may be the most critical ones for some applications. Annotation is time-consuming human labour and remains an obstacle to applying supervised learning methods to vision tasks. In this work, we implement a reinforced active learning method with a weighted performance metric to reduce human labour while achieving competitive results. A deep Q-network (DQN) is used to learn the optimal policy, namely selecting the most informative regions of an image from the unlabelled set to be labelled. The segmentation network is then trained with the newly labelled data and its performance is evaluated. A weighted Intersection over Union (IoU) is used to calculate the rewards for the DQN; by weighting the IoU, we aim to direct more attention to underrepresented classes.
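As an illustration of the weighted performance metric mentioned above, the following is a minimal sketch of how a class-weighted mean IoU could be computed, assuming inverse-frequency weights derived from the ground truth; the function name, the weighting scheme, and all parameters are illustrative assumptions rather than the exact formulation used in this work.

```python
import numpy as np

def weighted_iou(pred, target, num_classes, eps=1e-6):
    """Illustrative class-weighted mean IoU (sketch, not the paper's exact metric).

    Classes that cover fewer pixels in `target` receive larger
    (inverse-frequency) weights, so underrepresented classes
    contribute more to the score.
    """
    ious, weights = [], []
    total_pixels = target.size
    for c in range(num_classes):
        pred_c = (pred == c)
        target_c = (target == c)
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both prediction and ground truth
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
        # Inverse-frequency weight: rarer classes count more.
        weights.append(total_pixels / (target_c.sum() + eps))
    weights = np.asarray(weights) / np.sum(weights)
    return float(np.dot(weights, np.asarray(ious)))
```

In an active learning loop of this kind, the DQN reward would typically be the change in such a weighted IoU on a validation set after the segmentation network is retrained on the newly labelled regions.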