Doklady Belorusskogo gosudarstvennogo universiteta informatiki i radioèlektroniki (Mar 2020)

APPROACH TO IMAGE ANALYSIS FOR COMPUTER VISION SYSTEMS

  • N. A. Iskra

DOI
https://doi.org/10.35596/1729-7648-2020-18-2-62-70
Journal volume & issue
Vol. 18, no. 2
pp. 62 – 70

Abstract


This paper proposes an approach to semantic image analysis for use in computer vision systems. The aim of the work is to develop a method for automatically constructing a semantic model that formalizes the spatial relationships between objects in an image, and to study that model. A distinctive feature of the model is the detection of salient objects, thanks to which the construction algorithm analyzes significantly fewer relations between objects; this can greatly reduce image processing time and the resources spent on processing. Attention is paid to the selection of a neural network algorithm for object detection in an image as a preliminary stage of model construction. Experiments were conducted on test datasets from the Visual Genome database, developed by researchers at Stanford University for evaluating object detection algorithms, image captioning models, and other relevant image analysis tasks. The performance of the model was assessed by the accuracy of spatial relation recognition. Experiments on interpreting the resulting model were then conducted, namely on image annotation, i.e. generating a textual description of the image content. The experimental results were compared with results obtained on the same dataset using neural-network-based algorithms, both by other researchers and by the author of this paper in earlier work. An improvement of up to 60 % in image captioning quality (according to the METEOR metric) over neural network methods is shown. In addition, the model allows partial cleansing and normalization of data for training neural network architectures, which are widely used in image analysis, among other areas. The prospects of using this technique in situational monitoring are considered. The drawbacks of the approach are certain simplifications made in constructing the model, which will be addressed in its further development.
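
The idea of restricting relation analysis to salient objects can be illustrated with a minimal sketch. It is not the author's algorithm: the `Box` type, the `saliency` score, the threshold, and the coarse geometric rules in `relation` are all assumptions made for illustration. Bounding boxes are given as a top-left corner plus width and height, in image coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    name: str
    x: float         # left edge
    y: float         # top edge (image coordinates, y grows downward)
    w: float         # width
    h: float         # height
    saliency: float  # hypothetical saliency score in [0, 1]


def relation(a: Box, b: Box) -> str:
    """Coarse spatial relation of a with respect to b (illustrative rules only)."""
    if a.x + a.w <= b.x:
        return "left of"
    if b.x + b.w <= a.x:
        return "right of"
    if a.y + a.h <= b.y:
        return "above"
    if b.y + b.h <= a.y:
        return "below"
    return "overlaps"


def build_relations(objects, threshold=0.5):
    """Return (subject, relation, object) triples with salient subjects only.

    With k salient objects out of n, this examines k * (n - 1) ordered
    pairs instead of all n * (n - 1).
    """
    salient = [o for o in objects if o.saliency >= threshold]
    return [
        (s.name, relation(s, o), o.name)
        for s in salient
        for o in objects
        if o is not s
    ]


objects = [
    Box("man", x=10, y=20, w=40, h=100, saliency=0.9),
    Box("dog", x=70, y=80, w=30, h=40, saliency=0.3),
    Box("sky", x=0, y=0, w=120, h=15, saliency=0.1),
]
triples = build_relations(objects)
```

In this toy scene only "man" passes the assumed threshold, so two ordered pairs are examined rather than six. Triples of this form could then be turned into a textual description, although that step is not sketched here.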

Keywords