IEEE Access (Jan 2024)

Image Super-Resolution With Unified-Window Attention

  • Gunhee Cho,
  • Yong Suk Choi

DOI
https://doi.org/10.1109/ACCESS.2024.3368436
Journal volume & issue
Vol. 12
pp. 30852–30866

Abstract


Recent studies on image super-resolution (SR) have focused mainly on expanding the receptive field. However, insights from local attribution maps have highlighted the need for better exploitation of information within the receptive field. To address this, we propose a novel image super-resolution model, named Uniwin, that balances local and global interactions by unifying two types of window-based local attention mechanisms: shifted-window attention and sliding-window attention. Uniwin combines swift global context access with comprehensive local context capture. Our approach first applies sliding-window attention to collect comprehensive local pattern information, followed by non-overlapping shifted-window attention to expand the receptive field for global interactions. Empirical evaluations demonstrate Uniwin’s superiority over state-of-the-art models across five benchmark datasets. Specifically, in the $\times 2$ SR task, our model achieved 0.25dB higher PSNR on the Set14 dataset and 0.14dB higher PSNR on the Urban100 dataset than the existing state-of-the-art model with a similar number of parameters. Additionally, our model achieved performance comparable to existing large-scale models with only 58% of their parameters. Ablation studies on the sliding-window and shifted-window attention mechanisms reveal the critical importance of context harmonization for image super-resolution. In conclusion, Uniwin effectively integrates global and local attention mechanisms, enhancing image super-resolution performance.
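The ordering described in the abstract — sliding-window attention for local patterns, then Swin-style shifted-window attention for a wider receptive field — can be illustrated with a minimal NumPy sketch. All function names, window sizes, and shapes below are hypothetical illustrations of the general mechanism, not the paper's actual implementation (which omits projections, multi-head splitting, normalization, and residual connections, among other details).

```python
# Hypothetical sketch of a Uniwin-style attention ordering:
# sliding-window (overlapping neighborhood) attention followed by
# shifted-window (non-overlapping, cyclically shifted) attention.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention; q: (m, d), k/v: (n, d) -> (m, d).
    scale = 1.0 / np.sqrt(q.shape[-1])
    return softmax(q @ k.T * scale) @ v

def sliding_window_attention(x, radius):
    # x: (H, W, d); each pixel attends to its (2*radius+1)^2 neighborhood,
    # so windows overlap and local patterns are captured densely.
    H, W, d = x.shape
    pad = np.pad(x, ((radius, radius), (radius, radius), (0, 0)))
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            neigh = pad[i:i + 2 * radius + 1,
                        j:j + 2 * radius + 1].reshape(-1, d)
            out[i, j] = attention(x[i, j][None, :], neigh, neigh)[0]
    return out

def shifted_window_attention(x, win, shift):
    # Swin-style: cyclically shift the feature map, attend within
    # non-overlapping win x win windows, then shift back.
    H, W, d = x.shape
    xs = np.roll(x, (-shift, -shift), axis=(0, 1))
    out = np.empty_like(xs)
    for i in range(0, H, win):
        for j in range(0, W, win):
            blk = xs[i:i + win, j:j + win].reshape(-1, d)
            out[i:i + win, j:j + win] = attention(blk, blk, blk).reshape(win, win, d)
    return np.roll(out, (shift, shift), axis=(0, 1))

def uniwin_block(x, radius=1, win=4, shift=2):
    # The abstract's ordering: local sliding-window attention first,
    # then shifted-window attention to expand the receptive field.
    return shifted_window_attention(sliding_window_attention(x, radius), win, shift)

x = np.random.default_rng(0).standard_normal((8, 8, 16))
y = uniwin_block(x)
print(y.shape)  # (8, 8, 16)
```

The composition preserves the feature-map shape, so such a block can be stacked repeatedly, with each repetition interleaving dense local mixing and window-level global mixing.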

Keywords