IET Computer Vision (Apr 2023)

CDF‐net: A convolutional neural network fusing frequency domain and spatial domain features

  • Aitao Yang,
  • Min Li,
  • Zhaoqing Wu,
  • Yujie He,
  • Xiaohua Qiu,
  • Yu Song,
  • Weidong Du,
  • Yao Gou

DOI
https://doi.org/10.1049/cvi2.12167
Journal volume & issue
Vol. 17, no. 3
pp. 319 – 329

Abstract

The convolutional neural network (CNN), a classic deep learning algorithm, has been applied to various computer vision tasks. However, most classic CNN models focus on extracting and utilising spatial domain features and ignore the potential of frequency domain feature extraction. In this study, the use of frequency domain features in backbone design is explored. Firstly, the traditional discrete cosine transform (DCT) formula is converted into a convolution form through mathematical derivation. On this basis, a new type of convolution, namely the DCT Convolution, is designed, which is more applicable to deep learning network architectures. Secondly, based on the DCT Convolution, a new cross‐domain fusion network named CDF‐Net is designed. The network extracts the frequency domain and spatial domain features of the input sample and fuses them. CDF‐Net is a general network framework that can be applied to most prevalent existing networks. Finally, various experiments are conducted. On the image classification task, for the ImageNet2012 dataset, the proposed method was applied to ResNet50, and the Top‐1 accuracy increased by 3.684%. On the object detection task, for the COCO2017 dataset, the proposed method was applied to ResNet50 and ResNeXt50, and the mAP improved by 0.5% and 1.2%, respectively.
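
The abstract does not reproduce the derivation, so the following is only a minimal sketch of the general idea that a DCT can be written as a convolution with fixed kernels. It assumes the standard orthonormal 2‐D DCT‐II applied to non‐overlapping n×n blocks; the DCT Convolution in the paper may be defined differently. The function names dct_basis, dct_kernels and dct_conv are hypothetical and chosen for illustration.

```python
import numpy as np


def dct_basis(n: int) -> np.ndarray:
    """Orthonormal 1-D DCT-II matrix of shape (n, n); row k holds frequency k."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # the DC row uses a different scale factor
    return c


def dct_kernels(n: int) -> np.ndarray:
    """Fixed bank of n*n kernels; kernel u*n+v is the outer product of rows u and v."""
    c = dct_basis(n)
    return np.einsum("ui,vj->uvij", c, c).reshape(n * n, n, n)


def dct_conv(image: np.ndarray, n: int) -> np.ndarray:
    """Block DCT written as a stride-n cross-correlation with fixed kernels.

    Assumption: a single-channel image whose height and width are multiples
    of n. Returns an array of shape (n*n, H//n, W//n), one output channel per
    frequency.
    """
    h, w = image.shape
    kernels = dct_kernels(n)
    out = np.zeros((n * n, h // n, w // n))
    for i in range(h // n):
        for j in range(w // n):
            patch = image[i * n:(i + 1) * n, j * n:(j + 1) * n]
            # Each output channel is the inner product of one kernel with the patch.
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return out
```

In this sketch each output channel corresponds to one frequency coefficient, which is the same layout a convolution layer with n*n fixed filters and stride n would produce. That is why such a transform can sit alongside ordinary spatial convolutions in a backbone.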

Keywords