Songklanakarin Journal of Science and Technology (SJST) (Aug 2021)

High performance 2D convolution utilizing the AVX512 on a multi-core architecture

  • Isamail Masamae,
  • Panyayot Chaikan

DOI
https://doi.org/10.14456/sjst-psu.2021.160
Journal volume & issue
Vol. 43, no. 4
pp. 1230 – 1236

Abstract

Convolution is a time-consuming operation, especially in signal and image processing. This led us to develop an efficient implementation of 2D convolution for multi-core architectures using AVX512 intrinsics and OpenMP. For single-precision convolution, our algorithm is on average 2.30, 3.88, 5.75, and 19.95 times faster than IPP, OpenCV, Baziotis's algorithm, and MKL, respectively. For double-precision convolution, it is on average 3.12, 5.10, and 16.95 times faster than OpenCV, Baziotis's algorithm, and MKL, respectively. We have also developed a hybrid 2D convolution algorithm, written in C and assembly, to further improve processing speed for small kernel sizes.
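
To make the combination of AVX512 intrinsics and OpenMP concrete, the following is a minimal sketch of a direct single-precision 2D convolution, not the algorithm from the paper. The function name conv2d_valid, the row-major layout, and the "valid" output region (no padding) are assumptions chosen for illustration. Output rows are distributed across cores with an OpenMP pragma, and, when the compiler targets AVX-512, 16 outputs are accumulated per 512-bit register; otherwise a scalar loop does all the work.

```c
#include <stddef.h>
#ifdef __AVX512F__
#include <immintrin.h>
#endif

/* Hypothetical helper: "valid" 2D convolution of a row-major H x W image
 * with a kh x kw kernel. out must hold (H-kh+1) x (W-kw+1) floats.
 * The kernel is flipped, so this is a true convolution, not a correlation. */
void conv2d_valid(const float *in, int H, int W,
                  const float *k, int kh, int kw, float *out)
{
    const int oh = H - kh + 1;
    const int ow = W - kw + 1;

    /* Output rows are independent, so they can be split across cores.
     * Without OpenMP enabled, the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for schedule(static)
    for (int y = 0; y < oh; ++y) {
        int x = 0;
#ifdef __AVX512F__
        /* Vector path: 16 single-precision outputs per 512-bit register. */
        for (; x + 16 <= ow; x += 16) {
            __m512 acc = _mm512_setzero_ps();
            for (int i = 0; i < kh; ++i) {
                for (int j = 0; j < kw; ++j) {
                    /* Broadcast one flipped kernel tap to all 16 lanes. */
                    __m512 c = _mm512_set1_ps(k[(kh - 1 - i) * kw + (kw - 1 - j)]);
                    /* Unaligned load of 16 neighbouring input pixels. */
                    __m512 v = _mm512_loadu_ps(&in[(size_t)(y + i) * W + x + j]);
                    acc = _mm512_fmadd_ps(c, v, acc); /* acc += c * v */
                }
            }
            _mm512_storeu_ps(&out[(size_t)y * ow + x], acc);
        }
#endif
        /* Scalar path: leftover columns, or every column without AVX-512. */
        for (; x < ow; ++x) {
            float acc = 0.0f;
            for (int i = 0; i < kh; ++i)
                for (int j = 0; j < kw; ++j)
                    acc += k[(kh - 1 - i) * kw + (kw - 1 - j)]
                         * in[(size_t)(y + i) * W + x + j];
            out[(size_t)y * ow + x] = acc;
        }
    }
}
```

With GCC or Clang, this sketch would typically be built with -fopenmp and -mavx512f on hardware that supports AVX-512; without those flags it still compiles and runs as plain serial C. A tuned implementation would additionally reuse loaded registers across kernel taps and block for cache, which this sketch does not attempt.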

Keywords