IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2021)

Ground Camera Image and Large-Scale 3-D Image-Based Point Cloud Registration Based on Learning Domain Invariant Feature Descriptors

  • Weiquan Liu,
  • Baiqi Lai,
  • Cheng Wang,
  • Guorong Cai,
  • Yanfei Su,
  • Xuesheng Bian,
  • Yongchuan Li,
  • Shuting Chen,
  • Jonathan Li

DOI
https://doi.org/10.1109/JSTARS.2020.3035359
Journal volume & issue
Vol. 14
pp. 997 – 1009

Abstract

Multisource data are captured by different sensors or generated through different mechanisms. Ground camera images (images taken by ground-based cameras) and rendered images (synthesized from the position information of a 3-D image-based point cloud) are geospatial data from different sources, called cross-domain images. In outdoor environments in particular, the registration relationship between these cross-domain images can establish the spatial relationship between 2-D and 3-D space, which provides an indirect solution for the virtual-real registration required by augmented reality (AR). However, traditional handcrafted feature descriptors cannot match these cross-domain images because of the low quality of the rendered images and the domain gap between them. In this article, inspired by the success of deep learning in computer vision, we first propose an end-to-end network, DIFD-Net, to learn domain invariant feature descriptors (DIFDs) for cross-domain image patches. The DIFDs are used for cross-domain image patch retrieval, which supports the registration of ground camera images and rendered images. Second, we construct a domain-kept consistent loss function, which balances the feature descriptors to narrow the gap between the two domains, to optimize DIFD-Net. Notably, negative samples are generated from positive samples during training, and an added constraint on intermediate feature maps provides extra supervision for learning the feature descriptors. Finally, experiments show the superiority of DIFDs for cross-domain image patch retrieval, achieving state-of-the-art retrieval performance. Additionally, we use DIFDs to match ground camera images and rendered images, and verify the feasibility of the derived AR virtual-real registration in open outdoor environments.
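The retrieval step the abstract describes reduces to nearest-neighbor search in descriptor space: each camera-image patch is matched to the rendered-image patch whose learned descriptor is closest. The sketch below illustrates only that final matching stage, not DIFD-Net itself; the descriptor dimensionality (128) and the L2 metric are assumptions for illustration.

```python
import numpy as np

def retrieve_cross_domain(query_desc, gallery_desc):
    """For each query descriptor (e.g., from camera-image patches),
    return the index of the nearest gallery descriptor (e.g., from
    rendered-image patches) by squared L2 distance."""
    # Pairwise squared distances: ||q||^2 - 2 q.g + ||g||^2
    q2 = (query_desc ** 2).sum(axis=1, keepdims=True)
    g2 = (gallery_desc ** 2).sum(axis=1)
    d2 = q2 - 2.0 * query_desc @ gallery_desc.T + g2
    return d2.argmin(axis=1)

# Toy example with random 128-D descriptors (dimension is assumed):
# each query is a slightly perturbed copy of a gallery descriptor,
# standing in for the same patch seen in the other domain.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(5, 128))
queries = gallery[[2, 4]] + 0.01 * rng.normal(size=(2, 128))
print(retrieve_cross_domain(queries, gallery))  # -> [2 4]
```

A descriptor learned to be domain invariant is exactly what makes this simple metric search work: matching patches land near each other regardless of which domain they came from.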

Keywords