IEEE Access (Jan 2022)
Scene Graph Generation Using Depth, Spatial, and Visual Cues in 2D Images
Abstract
To understand an image or a scene properly, it is necessary to identify the objects participating in the scene, their relationships, and the attributes that describe their properties. A scene graph is a high-level representation that captures all of these features in a structured manner. Scene graph generation poses multiple challenges, such as the semantics of the relationships considered and the availability of a well-balanced dataset with sufficient training examples. We attempt to mitigate these problems by extracting two subsets, VG-R10 and VG-A16, from the popular Visual Genome dataset. We also propose a framework (S2G) for generating scene graphs directly from images using depth and spatial information of object pairs. Evaluations of the scene graph generation model show that the proposed framework achieves better results on our data than state-of-the-art methods.
Keywords