Subgraph and object context‐masked network for scene graph generation

Zhenxing Zheng; Zhendong Li; Gaoyun An; Songhe Feng

doi:10.1049/iet-cvi.2019.0896

IET Computer Vision (Oct 2020)

Affiliations

Zhenxing Zheng: Institute of Information Science, Beijing Jiaotong University, Beijing 100044, People's Republic of China
Zhendong Li: Institute of Information Science, Beijing Jiaotong University, Beijing 100044, People's Republic of China
Gaoyun An: Institute of Information Science, Beijing Jiaotong University, Beijing 100044, People's Republic of China
Songhe Feng: School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, People's Republic of China

DOI: https://doi.org/10.1049/iet-cvi.2019.0896
Journal volume & issue: Vol. 14, no. 7
pp. 546–553

Abstract

Scene graph generation aims to recognise objects and their semantic relationships in an image, helping computers understand visual scenes. Geometric information is essential for relationship prediction and is usually incorporated into relationship features. Existing methods encode the spatial layout of objects using their coordinates, but in doing so they neglect the context of the objects. In this study, to make full and efficient use of spatial knowledge, the authors propose a novel subgraph and object context‐masked network (SOCNet), consisting of spatial mask relation inference (SMRI) and hierarchical message passing (HMP) modules, to address the scene graph generation task. In particular, SMRI masks part of the context of object features according to the spatial layout of the objects and their corresponding subgraph, which facilitates relationship recognition. To refine the features of objects and subgraphs, the authors also propose HMP, which passes highly correlated messages at both microscopic and macroscopic levels through a triple‐path structure comprising subgraph–subgraph, object–object, and subgraph–object paths. Finally, statistical co‐occurrence probability is used to regularise relationship prediction. SOCNet integrates HMP and SMRI into a unified network, and comprehensive experiments on the Visual Relationship Detection and Visual Genome datasets indicate that SOCNet outperforms several state‐of‐the‐art methods on two common tasks.
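
The abstract does not say how the statistical co‐occurrence probability enters the prediction. One common way to use such a prior is sketched below, purely as an illustration: estimate P(predicate | subject class, object class) from training triples and add its logarithm to the predicate scores before the softmax. The function names, the Laplace smoothing, and the additive weighting are all assumptions made for this sketch and are not taken from the paper.

```python
import math
from collections import Counter

def cooccurrence_log_prior(triples, predicates, smoothing=1.0):
    """Return a function giving log P(predicate | subject, object).

    `triples` is an iterable of (subject, predicate, object) class labels.
    Laplace smoothing (an assumption of this sketch) keeps unseen
    predicates from receiving a probability of zero.
    """
    counts = {}
    for subj, pred, obj in triples:
        counts.setdefault((subj, obj), Counter())[pred] += 1

    def log_prior(subj, obj):
        pair_counts = counts.get((subj, obj), Counter())
        total = sum(pair_counts.values()) + smoothing * len(predicates)
        return [math.log((pair_counts[p] + smoothing) / total) for p in predicates]

    return log_prior

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]

def regularised_prediction(logits, log_prior_values, weight=1.0):
    """Combine network logits with the log prior (hypothetical additive form)."""
    return softmax([l + weight * p for l, p in zip(logits, log_prior_values)])

# Toy usage: the labels and counts are invented for illustration only.
predicates = ["on", "riding", "wearing"]
triples = [("person", "riding", "horse")] * 3 + [("person", "on", "horse")]
log_prior = cooccurrence_log_prior(triples, predicates)

# With uninformative (all-zero) logits, the output equals the smoothed prior.
probs = regularised_prediction([0.0, 0.0, 0.0], log_prior("person", "horse"))
```

In this toy case the network scores carry no information, so the prediction falls back to the smoothed co‐occurrence statistics and "riding" becomes the most probable predicate for the pair (person, horse). When the logits are informative, the `weight` parameter controls how strongly the prior pulls the prediction towards frequent relationships.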

Published in IET Computer Vision

ISSN: 1751-9632 (Print); 1751-9640 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519640
