IET Computer Vision (Mar 2019)

Joint optimisation convex‐negative matrix factorisation for multi‐modal image collection summarisation based on images and tags

  • Wenkai Zhang,
  • Kun Fu,
  • Xian Sun,
  • Yuhang Zhang,
  • Hao Sun,
  • Hongqi Wang

DOI
https://doi.org/10.1049/iet-cvi.2017.0568
Journal volume & issue
Vol. 13, no. 2
pp. 125 – 130

Abstract

Image collection summarisation aims to represent a large‐scale multi‐modal collection with a small subset of images and tags, helping users navigate a large image dataset. Most extant methods leverage the contribution of text to the visual summary but ignore the visual contribution to the textual topic. When the tags are weakly labelled, the textual topic cannot accurately reflect the visual summary. To solve this, the authors propose a novel model, joint optimisation of convex non‐negative matrix factorisation, which incorporates images and tags in a mutually beneficial way. The objective function contains visual and textual error functions that share the same indicator matrix, which connects the two modalities. The authors then propose an iterative algorithm to optimise the model. Finally, they explore the effects of different visual feature representations (e.g. bag‐of‐words and deep learning) on the multi‐modal collection summary. The proposed method is compared with state‐of‐the‐art algorithms on two multi‐modal datasets (i.e. MIRFlickr and NUS‐WIDE‐SCENE). Experimental results demonstrate the effectiveness of the proposed approach.
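
The idea of two reconstruction errors tied together by one shared indicator matrix can be sketched as follows. This is an illustrative sketch only, not the authors' algorithm: the abstract gives neither the exact objective nor the update rules. The sketch assumes non‐negative feature matrices with one column per image, an objective of the form ||X_v - X_v W_v G^T||^2 + lam ||X_t - X_t W_t G^T||^2, and standard convex‐NMF‐style multiplicative updates. The names `joint_convex_nmf`, `pick_representatives`, `lam`, `W_v`, `W_t` and `G` are hypothetical.

```python
import numpy as np


def joint_convex_nmf(X_v, X_t, k, lam=1.0, n_iter=200, eps=1e-9, seed=0):
    """Sketch of a joint convex-NMF with a shared indicator matrix G.

    X_v: (d_v, n) non-negative visual features, one column per image.
    X_t: (d_t, n) non-negative tag features, one column per image.
    Assumed objective (not taken from the paper):
        ||X_v - X_v W_v G^T||_F^2 + lam * ||X_t - X_t W_t G^T||_F^2
    """
    rng = np.random.default_rng(seed)
    n = X_v.shape[1]
    K_v = X_v.T @ X_v  # (n, n) Gram matrix, non-negative because X_v >= 0
    K_t = X_t.T @ X_t
    W_v = rng.random((n, k))
    W_t = rng.random((n, k))
    G = rng.random((n, k))
    losses = []
    for _ in range(n_iter):
        # Each modality keeps its own convex-combination weights W.
        for K, W in ((K_v, W_v), (K_t, W_t)):
            W *= np.sqrt((K @ G) / (K @ W @ (G.T @ G) + eps))
        # The shared G pools the terms of both modalities.
        num = K_v @ W_v + lam * (K_t @ W_t)
        den = G @ (W_v.T @ K_v @ W_v + lam * (W_t.T @ K_t @ W_t)) + eps
        G *= np.sqrt(num / den)
        err_v = np.linalg.norm(X_v - X_v @ W_v @ G.T, "fro") ** 2
        err_t = np.linalg.norm(X_t - X_t @ W_t @ G.T, "fro") ** 2
        losses.append(err_v + lam * err_t)
    return W_v, W_t, G, losses


def pick_representatives(W):
    """Assumed selection rule: per cluster, the image with the largest weight."""
    return np.argmax(W, axis=0)
```

Under these assumptions, `pick_representatives(W_v)` returns one candidate summary image per cluster, and each row of `G` indicates how strongly an image belongs to each cluster in both modalities. How the paper actually selects summary images and tags is not stated in the abstract.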

Keywords