IET Computer Vision (Oct 2020)

Subgraph and object context‐masked network for scene graph generation

  • Zhenxing Zheng,
  • Zhendong Li,
  • Gaoyun An,
  • Songhe Feng

DOI
https://doi.org/10.1049/iet-cvi.2019.0896
Journal volume & issue
Vol. 14, no. 7
pp. 546 – 553

Abstract

Scene graph generation aims to recognise objects and their semantic relationships in an image, and can help computers understand visual scenes. Geometry information is essential for improving relationship prediction and is usually incorporated into relationship features. Existing methods use the coordinates of objects to encode their spatial layout, but in doing so they neglect the context of the objects. In this study, to make full and efficient use of spatial knowledge, the authors propose a novel subgraph and object context‐masked network (SOCNet), consisting of spatial mask relation inference (SMRI) and hierarchical message passing (HMP) modules, to address the scene graph generation task. In particular, to take advantage of spatial knowledge, SMRI masks part of the context of object features, depending on the spatial layout of the objects and their corresponding subgraph, to facilitate relationship recognition. To refine the features of objects and subgraphs, they also propose HMP, which passes highly correlated messages from both microcosmic and macroscopic aspects through a triple‐path structure comprising subgraph–subgraph, object–object, and subgraph–object paths. Finally, statistical co‐occurrence probability is used to regularise relationship prediction. SOCNet integrates HMP and SMRI into a unified network, and comprehensive experiments on the Visual Relationship Detection and Visual Genome datasets indicate that SOCNet outperforms several state‐of‐the‐art methods on two common tasks.
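
The abstract does not spell out how the statistical co‐occurrence probability enters the prediction, so the following is only a minimal sketch of one common way such a prior can be applied, not the authors' formulation. It assumes the prior is the smoothed frequency of each predicate given a (subject class, object class) pair, and that it is added to the model's predicate scores in log space before a softmax. The function names, the smoothing constant `alpha`, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_prior(triples, predicates, alpha=1.0):
    """Build a lookup for P(predicate | subject class, object class).

    Assumption: add-alpha smoothed counts over training triples of the
    form (subject, predicate, object). Not taken from the paper.
    """
    counts = defaultdict(Counter)
    for subj, pred, obj in triples:
        counts[(subj, obj)][pred] += 1

    def prior(subj, obj):
        pair_counts = counts.get((subj, obj), Counter())
        total = sum(pair_counts.values()) + alpha * len(predicates)
        return {p: (pair_counts[p] + alpha) / total for p in predicates}

    return prior


def regularised_scores(logits, prior_probs):
    """Add the log-prior to the raw predicate scores, then apply softmax."""
    combined = {p: logits[p] + math.log(prior_probs[p]) for p in logits}
    top = max(combined.values())  # subtract the max for numerical stability
    exps = {p: math.exp(v - top) for p, v in combined.items()}
    norm = sum(exps.values())
    return {p: v / norm for p, v in exps.items()}


# Toy data (hypothetical): three predicates and four annotated triples.
predicates = ["on", "riding", "near"]
triples = [
    ("person", "riding", "horse"),
    ("person", "riding", "horse"),
    ("person", "near", "horse"),
    ("cup", "on", "table"),
]
prior = cooccurrence_prior(triples, predicates)

# Raw scores that slightly favour "on"; the prior shifts the decision.
logits = {"on": 0.2, "riding": 0.1, "near": 0.1}
scores = regularised_scores(logits, prior("person", "horse"))
best = max(scores, key=scores.get)
```

In this toy case the raw scores marginally prefer "on", but because "riding" dominates the person and horse pairs in the training triples, the regularised prediction becomes "riding". Smoothing keeps unseen predicates at a non-zero probability, so the prior biases the prediction without ruling any predicate out.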

Keywords