The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (May 2022)
ADDRESSING CLASS IMBALANCE IN MULTI-CLASS IMAGE CLASSIFICATION BY MEANS OF AUXILIARY FEATURE SPACE RESTRICTIONS
Abstract
Learning from imbalanced class distributions generally leads to a classifier that cannot distinguish classes with few training examples from the other classes. In the context of cultural heritage, this problem becomes important when existing digital online collections, consisting of images depicting artifacts and their assigned semantic annotations, are to be completed automatically: images with known annotations can be used to train a classifier that predicts missing information, but such training data are often highly imbalanced. In this paper, we propose combining a classification loss with an auxiliary clustering loss to improve classification performance, particularly for underrepresented classes; in addition, different sampling strategies are applied. The proposed auxiliary loss aims to cluster feature vectors with respect to both the semantic annotations and the visual properties of the images to be classified, and is thus intended to help the classifier distinguish individual classes. We conduct an ablation study on a dataset of images depicting silk fabrics, accompanied by annotations for different silk-related classification tasks. Experimental results show improvements of up to 10.5% in average F1-score and of up to 20.8% in the F1-score averaged over the underrepresented classes in some classification tasks.
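To illustrate the general idea of adding an auxiliary clustering term to a classification loss, the following minimal sketch may be helpful. It is not the loss formulated in this paper: the abstract does not specify its form, so the sketch assumes a simple centroid-based term that pulls feature vectors of the same class together, and it covers only clustering with respect to the annotations, not the visual properties. The function names and the weight `aux_weight` are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, computed from raw class scores."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def class_centroids(features, labels):
    """Mean feature vector of each class present in the batch."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def clustering_loss(features, labels):
    """ASSUMED auxiliary term: mean squared distance to the class centroid."""
    centroids = class_centroids(features, labels)
    total = 0.0
    for f, y in zip(features, labels):
        total += sum((a - b) ** 2 for a, b in zip(f, centroids[y]))
    return total / len(features)


def combined_loss(logits, features, labels, aux_weight=0.5):
    """Mean cross-entropy plus the weighted auxiliary clustering term."""
    ce = sum(softmax_cross_entropy(z, y) for z, y in zip(logits, labels))
    return ce / len(labels) + aux_weight * clustering_loss(features, labels)


# Toy batch: two samples of class 0, one sample of class 1.
features = [[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]]
logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
labels = [0, 0, 1]
loss = combined_loss(logits, features, labels)
```

In such a formulation, the auxiliary term is zero when all feature vectors of a class coincide and grows as a class spreads out in feature space, so minimising the sum encourages compact per-class clusters alongside correct predictions.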