Tongxin xuebao (May 2024)
Efficient implementation scheme of SM4 algorithm based on FPGA
Abstract
Field-programmable gate array (FPGA)-based implementations of the SM4 algorithm suffer from low data-processing performance and excessive resource utilization. To address these problems, an implementation scheme that combined iteration and pipelining was proposed to reduce resource consumption and improve throughput. The scheme paired cyclic key expansion with a 32-bit pipelined encryption and decryption architecture: the cyclic key expansion reduced logic resource consumption, while the 32-bit pipelined encryption and decryption improved data throughput. In addition, an algebraic S-box was employed that merged linear operations and selected an optimal matrix from those generated by different irreducible polynomials. This further reduced resource usage and computation overhead, which allowed the design to reach a higher clock frequency. Experimental results demonstrate a 43% throughput improvement and a 10% reduction in resource usage compared with the current best scheme.
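The algebraic S-box idea rests on inversion in GF(2^8), where the field can be represented with any irreducible degree-8 polynomial over GF(2). The Python sketch below is an illustration only and not the authors' selection procedure or hardware design. It enumerates the candidate polynomials and computes the field inverse under a chosen one. The function names are hypothetical, and the constant 0x1F5 (x^8 + x^7 + x^6 + x^5 + x^4 + x^2 + 1) is the polynomial commonly reported in the literature for the SM4 S-box, which the abstract itself does not state.

```python
def gf2_mod(a, m):
    """Remainder of polynomial a divided by polynomial m over GF(2).
    Polynomials are encoded as integers, bit i = coefficient of x^i."""
    dm = m.bit_length()
    while a.bit_length() >= dm:
        a ^= m << (a.bit_length() - dm)
    return a


def is_irreducible_deg8(poly):
    """A degree-8 polynomial is reducible iff it has a factor of degree 1..4,
    so trial division by every polynomial encoded as 2..31 is sufficient."""
    return all(gf2_mod(poly, d) != 0 for d in range(2, 32))


def irreducible_deg8_polys():
    """All irreducible degree-8 polynomials over GF(2) (bit 8 always set)."""
    return [p for p in range(0x100, 0x200) if is_irreducible_deg8(p)]


def gf2_mulmod(a, b, poly):
    """Multiply two field elements (values below 256) modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:      # degree reached 8: reduce by the field polynomial
            a ^= poly
    return result


def gf2_inverse(x, poly):
    """Compute x^254, which is the multiplicative inverse for x != 0
    and maps 0 to 0 (the usual S-box convention)."""
    result = 1
    for _ in range(254):
        result = gf2_mulmod(result, x, poly)
    return result


SM4_POLY = 0x1F5  # assumption taken from the literature, not from the abstract

if __name__ == "__main__":
    candidates = irreducible_deg8_polys()
    print(len(candidates), "candidate field polynomials")
    print(hex(gf2_inverse(0x02, SM4_POLY)))
```

Each candidate polynomial gives an isomorphic field but a different basis-change matrix when the inversion is mapped onto hardware-friendly subfield arithmetic, which is the kind of search space from which an "optimal matrix" could be chosen. The cost metric used for that choice is not given in the abstract and is not modeled here.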