Image inpainting network based on multi‐level attention mechanism

Hongyue Xiang; Weidong Min; Zitai Wei; Meng Zhu; Mengxue Liu; Ziyang Deng

doi:10.1049/ipr2.12958

IET Image Processing (Feb 2024)

Affiliations

Hongyue Xiang: School of Mathematics and Computer Science, Nanchang University, Nanchang, China
Weidong Min: School of Mathematics and Computer Science, Nanchang University, Nanchang, China
Zitai Wei: School of Mathematics and Computer Science, Nanchang University, Nanchang, China
Meng Zhu: School of Mathematics and Computer Science, Nanchang University, Nanchang, China
Mengxue Liu: School of Mathematics and Computer Science, Nanchang University, Nanchang, China
Ziyang Deng: School of Mathematics and Computer Science, Nanchang University, Nanchang, China

DOI: https://doi.org/10.1049/ipr2.12958
Journal volume & issue: Vol. 18, no. 2, pp. 428–438

Abstract

Image inpainting networks based on deep learning have been widely used in many important fields. However, most inpainting networks fail to generate desirable repaired images, possibly because they fail to extract effective features and to assign high weights accurately to the undamaged regions. To alleviate these problems, this paper proposes an image inpainting network based on gated convolution and a multi‐level attention mechanism (IIN‐GCMAM). The network follows an encoder–decoder architecture consisting of a gated convolution encoder (GC‐encoder) and a multi‐level attention mechanism decoder (MAM‐decoder). The GC‐encoder weights the extracted features with gated convolutions, which reduces the interference caused by the damaged regions. The multi‐level attention mechanism in the MAM‐decoder uses multi‐scale feature maps both spatially and channel‐wise to improve the global structural consistency and the fineness of the repaired results. Extensive experiments are conducted on two common datasets, Paris StreetView and CelebA. The results indicate that IIN‐GCMAM performs well on common evaluation metrics and in visual quality: at a mask ratio of 50%–60%, it achieves an MAE of 0.0408, an SSIM of 0.720, and a PSNR of 22.27.
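The gating idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' GC‐encoder: it assumes the commonly used gated-convolution form, in which one convolution produces features and a second produces a sigmoid gate that scales them element-wise. The single-channel layout, the tanh activation, and the names `conv2d_same` and `gated_conv2d` are all illustrative assumptions.

```python
import numpy as np


def conv2d_same(x, kernel):
    """Naive single-channel 2D cross-correlation with zero padding.

    Assumes an odd-sized kernel so the output has the same shape as x.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))  # zero padding by default
    out = np.zeros(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gated_conv2d(x, feature_kernel, gate_kernel):
    """Hypothetical gated convolution: features scaled by a soft gate.

    The gate lies in (0, 1) at every pixel, so a layer of this form can
    down-weight responses that come from damaged regions.
    tanh is an assumed activation, chosen only for illustration.
    """
    features = np.tanh(conv2d_same(x, feature_kernel))
    gate = sigmoid(conv2d_same(x, gate_kernel))
    return features * gate, gate


# Usage: a 5x5 image with arbitrary 3x3 kernels (illustrative values only).
image = np.ones((5, 5))
feature_kernel = np.full((3, 3), 0.1)
gate_kernel = np.full((3, 3), -10.0)  # strongly negative, so the gate closes
output, gate = gated_conv2d(image, feature_kernel, gate_kernel)
```

With a strongly negative gate kernel the gate is close to zero everywhere, so the output is suppressed regardless of the feature response. In a trained network both kernels would be learned, and the gate would ideally stay low over masked pixels.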

Published in IET Image Processing

ISSN: 1751-9659 (Print); 1751-9667 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Photography; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519667
