IEEE Access (Jan 2022)

DenSE SwinHDR: SDRTV to HDRTV Conversion Using Densely Connected Swin Transformer With Squeeze and Excitation Module

  • Joon-Ki Bae
  • Subin Yang
  • Sung-Ho Bae

DOI
https://doi.org/10.1109/ACCESS.2022.3231339
Journal volume & issue
Vol. 10
pp. 133969 – 133980

Abstract

Modern displays can render video content encoded with high dynamic range (HDR) standards. HDR content delivers a more realistic visual experience than standard dynamic range (SDR) content through a wider color gamut and luminance range. However, most available content is encoded with SDR standards. Technology that converts existing SDR content to HDR content, which we call SDRTV-to-HDRTV conversion, is therefore in high demand among providers such as IPTV and broadcasting services. In this paper, we divide the SDRTV-to-HDRTV conversion problem into global and local mapping problems. Transformers have recently achieved strong performance and are known to perform global mapping effectively, while convolutional neural networks (CNNs) are specialized for extracting and converting local features. We therefore introduce a combined model that uses a transformer for global mapping and a CNN for local mapping, solving the SDRTV-to-HDRTV problem in a complementary manner. We extensively explore the best strategy for combining transformers and CNNs. Through comprehensive objective and subjective experiments, we verify that the proposed method achieves the highest performance among existing models in terms of both fidelity and visual quality. To the best of our knowledge, we are the first to apply a Vision Transformer to the SDRTV-to-HDRTV conversion problem. To boost performance, we combine the Vision Transformer with architectural strategies previously applied to convolutional neural networks, such as residual connections, dense connections, and the squeeze-and-excitation module. We denote the resulting Vision Transformer architecture DenSE-SwinHDR. Our method outperforms state-of-the-art methods in both objective scores and visual quality. Specifically, DenSE-SwinHDR achieves gains of 0.79 dB in PSNR and 0.93 dB in PU-PSNR over HDRTVNet. The proposed method also achieves the best performance in the subjective quality assessment.

Keywords