Swin Transformer Fusion Network for Image Quality Assessment

Hyeongmyeon Kim; Changhoon Yim

doi:10.1109/ACCESS.2024.3378092

IEEE Access (Jan 2024)

Affiliations

Hyeongmyeon Kim: Department of Artificial Intelligence, Konkuk University, Seoul, South Korea
Changhoon Yim: Department of Artificial Intelligence, Konkuk University, Seoul, South Korea

Journal volume & issue: Vol. 12, pp. 57741–57754

Abstract

This paper presents an efficient deep-learning model, the Swin Transformer fusion network (STFN), for full-reference image quality assessment (FR-IQA). The STFN model uses the first and second stages of the Swin Transformer for feature extraction. To unify the features from these two stages, we propose fusion operations comprising reverse patch merging (RPM) and mediator block (MB) operations. The RPM acts as a kind of reverse of the patch merging operation in the Swin Transformer stage: it reshapes the second-stage feature so that its size matches that of the first-stage feature. The MB operation efficiently combines multiple features from the RPM block and the first-stage Swin Transformer for subsequent operations. Experimental results show that the proposed STFN model performs significantly better than previous traditional and deep-learning models on various kinds of image datasets for FR-IQA. The STFN model also outperforms the state-of-the-art method for FR-IQA while requiring less training time and a smaller model size. The code and pretrained models are publicly available at https://github.com/KIIPLab/STFN.
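
The abstract gives no implementation details for RPM, so the following is only an illustrative sketch of the shape bookkeeping, not the authors' code. It assumes the standard Swin patch merging layout (2×2 neighbours concatenated along the channel axis) and assumes that RPM could be realised as a linear channel expansion followed by the inverse rearrangement. The function names, the random weight matrix, and the Swin-T sizes (56×56×96 for stage one, 28×28×192 for stage two) are hypothetical choices made for the example.

```python
import numpy as np


def patch_merging_rearrange(x):
    """Swin-style 2x2 neighbour gathering: (H, W, C) -> (H/2, W/2, 4C)."""
    x0 = x[0::2, 0::2, :]  # top-left pixel of each 2x2 block
    x1 = x[1::2, 0::2, :]  # bottom-left
    x2 = x[0::2, 1::2, :]  # top-right
    x3 = x[1::2, 1::2, :]  # bottom-right
    return np.concatenate([x0, x1, x2, x3], axis=-1)


def reverse_patch_rearrange(y):
    """Inverse of the gathering above: (H/2, W/2, 4C) -> (H, W, C)."""
    h, w, c4 = y.shape
    c = c4 // 4
    x = np.empty((2 * h, 2 * w, c), dtype=y.dtype)
    x[0::2, 0::2, :] = y[..., 0 * c:1 * c]
    x[1::2, 0::2, :] = y[..., 1 * c:2 * c]
    x[0::2, 1::2, :] = y[..., 2 * c:3 * c]
    x[1::2, 1::2, :] = y[..., 3 * c:4 * c]
    return x


def reverse_patch_merging(stage2, weight):
    """Hypothetical RPM: (H/2, W/2, 2C) -> (H, W, C).

    `weight` has shape (2C, 4C) and stands in for a learned linear layer;
    this is an assumption, not the published design.
    """
    expanded = stage2 @ weight  # channel expansion 2C -> 4C
    return reverse_patch_rearrange(expanded)


# Assumed Swin-T sizes for a 224x224 input.
rng = np.random.default_rng(0)
stage1 = rng.standard_normal((56, 56, 96))
stage2 = rng.standard_normal((28, 28, 192))
weight = rng.standard_normal((192, 384))

restored = reverse_patch_merging(stage2, weight)
# Plain concatenation is only a placeholder for the mediator block.
fused = np.concatenate([stage1, restored], axis=-1)
print(restored.shape, fused.shape)
```

Under these assumptions the second-stage feature ends up with the same spatial size and channel count as the first-stage feature, which is the property the abstract attributes to RPM. The final concatenation is only a placeholder: the abstract does not say how the mediator block combines the features.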

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
