Complex & Intelligent Systems (Apr 2024)

NLKFill: high-resolution image inpainting with a novel large kernel attention

  • Ting Wang,
  • Dong Xiang,
  • Chuan Yang,
  • Jiaying Liang,
  • Canghong Shi

DOI
https://doi.org/10.1007/s40747-024-01411-5
Journal volume & issue
Vol. 10, no. 4
pp. 4921 – 4938

Abstract

The integration of convolutional neural networks (CNNs) and transformers enhances a network's capacity for concurrently modeling texture details and global structures. However, training challenges with transformers limit their effectiveness to low-resolution images, leading to increased artifacts on slightly larger images. In this paper, we propose a single-stage network utilizing large kernel attention (LKA) to inpaint high-resolution damaged images. LKA captures both global and local details, akin to transformer and CNN networks, resulting in high-quality inpainting. Our method excels in: (1) reducing parameters, improving inference speed, and enabling direct training on 1024 × 1024 resolution images; (2) utilizing LKA for enhanced extraction of global high-frequency and local details; (3) demonstrating excellent generalization with irregular mask models and on common datasets such as Places2, CelebA-HQ, FFHQ, and the random irregular mask dataset Pconv from NVIDIA.
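The abstract does not detail the LKA variant used; a minimal sketch of the commonly used large-kernel-attention decomposition (depthwise conv for local detail, depthwise dilated conv for a large effective receptive field, 1×1 conv for channel mixing, then elementwise modulation), assuming PyTorch — the paper's novel variant may differ:

```python
import torch
import torch.nn as nn

class LKA(nn.Module):
    """Sketch of large kernel attention (standard decomposition);
    NOT the paper's exact module, which is not given in the abstract."""

    def __init__(self, dim: int):
        super().__init__()
        # 5x5 depthwise conv: local texture detail
        self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        # 7x7 depthwise conv with dilation 3: ~21x21 effective
        # receptive field, approximating global structure cheaply
        self.conv_spatial = nn.Conv2d(dim, dim, 7, padding=9,
                                      groups=dim, dilation=3)
        # 1x1 conv: channel mixing
        self.conv1 = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.conv1(self.conv_spatial(self.conv0(x)))
        return x * attn  # attention map modulates the input features

# Usage: shape is preserved, so the block drops into a CNN backbone.
block = LKA(dim=32)
feat = torch.randn(1, 32, 64, 64)
out = block(feat)
print(out.shape)  # torch.Size([1, 32, 64, 64])
```

The depthwise/dilated split is what keeps the parameter count low relative to a dense large kernel or full self-attention, which is consistent with the abstract's claims of fewer parameters and faster inference at high resolution.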

Keywords