IEEE Access (Jan 2022)

Effective Hardware Accelerator for 2D DCT/IDCT Using Improved Loeffler Architecture

  • Zhiwei Zhou
  • Zhongliang Pan

DOI
https://doi.org/10.1109/ACCESS.2022.3146162
Journal volume & issue
Vol. 10
pp. 11011 – 11020

Abstract

This paper proposes an effective hardware accelerator for the 2D $8\times 8$ discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) based on an improved Loeffler architecture. The accelerator optimizes the data flow of the Loeffler 8-point 1D DCT/IDCT according to the characteristics of image and video processing. An 8-stage pipeline raises the processing speed by dividing the computation sensibly across clock cycles and simplifying the arithmetic performed in each cycle. The DCT coefficients are approximated without multipliers, using adders and shifters together with fixed-point and canonic signed digit (CSD) coding. In addition, the proposed fast parallel matrix transposition architecture performs the row-column coefficient conversion with lower circuit complexity. The FPGA implementation on a Virtex-7 XC7VX330T device runs at 288 MHz with a throughput of 558 Mpixel/s and a Full HD real-time frame rate of up to 269 fps. A 2D DCT/IDCT of an $8\times 8$ block completes in only 33 cycles, making the design suitable as a high-performance hardware accelerator for image and video compression.
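
As a rough illustration of the multiplication-free idea mentioned in the abstract, the sketch below recodes a fixed-point constant into canonic signed digit (CSD) form and then multiplies by it using only shifts, additions, and subtractions. It is a minimal software model, not the authors' circuit: the function names `to_csd` and `shift_add_multiply`, the choice of $\cos(\pi/4)$ as the example coefficient, and the 8 fractional bits are all assumptions made for this example.

```python
import math


def to_csd(value: int) -> list[int]:
    """Return the CSD digits (-1, 0, +1) of a non-negative integer, LSB first."""
    digits = []
    while value != 0:
        if value & 1:
            # Choose +1 when value mod 4 == 1 and -1 when value mod 4 == 3.
            # Either way the remainder becomes divisible by 4, so the next
            # digit is 0 and no two nonzero digits are ever adjacent.
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def shift_add_multiply(x: int, digits: list[int]) -> int:
    """Multiply x by the constant encoded in `digits` using shifts and add/sub only."""
    acc = 0
    for shift, digit in enumerate(digits):
        if digit == 1:
            acc += x << shift
        elif digit == -1:
            acc -= x << shift
    return acc


# Hypothetical example coefficient: cos(pi/4) with 8 fractional bits (assumed width).
FRACTIONAL_BITS = 8
C4_FIXED = round(math.cos(math.pi / 4) * (1 << FRACTIONAL_BITS))
C4_CSD = to_csd(C4_FIXED)


def scale_by_c4(sample: int) -> int:
    """Approximate sample * cos(pi/4); the right shift truncates (rounds toward -inf)."""
    return shift_add_multiply(sample, C4_CSD) >> FRACTIONAL_BITS
```

Each nonzero CSD digit corresponds to one adder or subtractor in a hardware realization, and each digit position to a fixed wire shift, which is why this kind of recoding is attractive for constant multipliers on an FPGA.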

Keywords