Hierarchical grid-constrained fusion network for image stitching

Yongqin Zhang; Baojie Ruan; Linge Du; Liangjiang Li; Zhan Li; Xiaofeng Wang; Meng Wu; Jinsheng Xiao

doi:10.1007/s44443-025-00005-6

Journal of King Saud University: Computer and Information Sciences (Apr 2025)

Affiliations

Yongqin Zhang: School of Archaeology and Cultural Heritage, Zhengzhou University
Baojie Ruan: School of Information Science and Technology, Northwest University
Linge Du: School of Information Science and Technology, Northwest University
Liangjiang Li: School of Information Science and Technology, Northwest University
Zhan Li: School of Information Science and Technology, Northwest University
Xiaofeng Wang: School of Information Science and Technology, Northwest University
Meng Wu: School of Information and Control Engineering, Xi’an University of Architecture and Technology
Jinsheng Xiao: Electronic Information School, Wuhan University

Journal volume & issue: Vol. 37, no. 3, pp. 1–24

Abstract

The digitization of ancient murals is crucial for preserving, inheriting, and utilizing cultural heritage. Because murals cover large areas, digitization typically involves capturing images in segments and then stitching them together. However, existing image stitching techniques are limited in efficiency, generalization, and robustness, which hinders their practical use. To address these problems and improve stitching performance, this paper proposes an unsupervised hierarchical grid-constrained fusion network. The model consists of two main modules: image alignment and image synthesis. The image alignment module incorporates prior knowledge, such as feature pyramids, attention mechanisms, and context dependencies, to facilitate feature extraction. It also includes a multi-scale grid homography generation part that uses region masks to create a deformation field. In addition, a stitching-domain transformation unit deforms the input reference and target images to ensure spatial consistency. In the image synthesis module, a progressive inference fusion framework reduces the complex image fusion problem to a multi-granularity image synthesis problem. This framework uses two cascaded encoder-decoder network units to merge information progressively from coarse to fine granularity, producing high-resolution stitched images. Experimental results show that the proposed model exhibits superior robustness on both public and custom image datasets and generally outperforms state-of-the-art image stitching methods, especially in subjective visual assessments.
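
The abstract mentions a grid homography that yields a deformation field. As a rough illustration of the underlying geometry only, the sketch below maps the vertices of a regular grid through a fixed 3x3 homography and records the per-vertex displacement. It is not the paper's network: in the paper the homographies are generated by the model at multiple scales with region masks, whereas here the matrix, the grid size, and every function name (`apply_homography`, `grid_vertices`, `deformation_field`) are assumptions made for the example.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Point = Tuple[float, float]


def apply_homography(h: Matrix, point: Point) -> Point:
    """Map one (x, y) point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)


def grid_vertices(width: float, height: float, cols: int, rows: int) -> List[Point]:
    """Vertices of a regular grid with cols x rows cells covering the image, row by row."""
    return [
        (width * c / cols, height * r / rows)
        for r in range(rows + 1)
        for c in range(cols + 1)
    ]


def deformation_field(
    h: Matrix, width: float, height: float, cols: int, rows: int
) -> List[Point]:
    """Per-vertex displacement (dx, dy) induced by the homography on the grid."""
    field = []
    for x, y in grid_vertices(width, height, cols, rows):
        u, v = apply_homography(h, (x, y))
        field.append((u - x, v - y))
    return field


# Hypothetical input: a pure translation by (10, -5) pixels on a 4 x 3 grid.
translation = [
    [1.0, 0.0, 10.0],
    [0.0, 1.0, -5.0],
    [0.0, 0.0, 1.0],
]
field = deformation_field(translation, width=512, height=384, cols=4, rows=3)
```

With a pure translation every vertex receives the same displacement; a general homography would give displacements that vary across the grid, which is what makes a per-vertex field useful for warping.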

Published in Journal of King Saud University: Computer and Information Sciences

ISSN: 1319-1578 (Print); 2213-1248 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.sciencedirect.com/journal/journal-of-king-saud-university-computer-and-information-sciences
