IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2023)

Photo Semantic Understanding and Retargeting by a Noise-Robust Regularized Topic Model

  • Guifeng Wang,
  • Luming Zhang,
  • Yongbin Li,
  • Yichuan Sheng

DOI
https://doi.org/10.1109/JSTARS.2023.3247745
Journal volume & issue
Vol. 16
pp. 3495 – 3505

Abstract

Retargeting aims to display a photo at an arbitrary aspect ratio while appropriately preserving the visually/semantically prominent objects and alleviating visual distortions. Conventional retargeting models are built upon the visual perception of photos from a family of prespecified communities (e.g., “portrait”), wherein the underlying community-specific features are not learned explicitly. Thus, they cannot appropriately retarget aerial photos, which contain a rich variety of objects at different scales. In this article, a novel aerial photo retargeting framework is designed by encoding the deep features from automatically detected Google Maps (https://www.google.com/maps) communities into a regularized probabilistic model. Specifically, we first propose an enhanced matrix factorization (MF) algorithm to calculate communities based on million-scale Google Maps pictures, for each of which a deep feature is learned simultaneously. The enhanced MF incorporates label denoising, between-community correlation, and deep feature encoding collaboratively. Subsequently, a probabilistic model called the latent topic model (LTM) is designed to quantify the spatial layouts of multiple Google Maps communities in the underlying hidden space. To alleviate overfitting caused by Google Maps communities with imbalanced numbers of aerial photos, a regularizer is added to the LTM. Finally, by leveraging the regularized LTM, we shrink the test photo horizontally/vertically to maximize the posterior probability of the retargeted photo. Comprehensive subjective evaluations and visualizations have demonstrated the advantages of our method. Besides, our calculated Google Maps communities are competitively consistent with the ground truth, according to quantitative comparisons on the 2M Google Maps photos.

Keywords