Engineering Science and Technology, an International Journal (Jun 2024)

A low-cost high-speed radix-4 Montgomery modular multiplier without carry-propagate format conversion

  • Shiann-Rong Kuang,
  • Chun-Yi Wang,
  • Yen-Jui Chen

Journal volume & issue
Vol. 54
p. 101703

Abstract

Modular multiplication is the most critical and time-consuming operation in numerous public-key cryptosystems used to establish secure networks. Especially in Internet of Things (IoT) applications, achieving low area complexity and high performance is crucial when designing modular multipliers. To meet these criteria, this paper proposes a low-cost, high-speed Montgomery modular multiplier that incorporates radix-4 Booth encoding and a simple two-level carry-save addition (CSA) architecture to reduce the execution time and area cost required to complete one modular multiplication. The proposed radix-4 multiplier operates on binary numbers as input/output operands and replaces the triple modulus with a negative value of the modulus, resulting in the minimum register requirement for modular multiplication. Furthermore, several simple circuits are designed to efficiently calculate the quotient and correction bits for negative multiples. Consequently, all the inputs required for the two-level CSA architecture can be generated on the fly, leading to a very short critical path delay. In contrast to previous designs that rely on additional carry-propagate adders to convert the modular multiplication result from carry-save form to binary representation, the proposed design reuses the two-level CSA architecture for format conversion. It also introduces a novel mechanism and a simplified zero detector to minimize the time and area overheads associated with format conversion. Theoretical analysis and FPGA implementation results demonstrate that the proposed radix-4 Montgomery multiplier achieves nearly the highest performance and the lowest area complexity when compared to existing radix-2 and radix-4 designs. As a result, it is highly suitable for deployment in IoT devices with limited resources.
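For readers unfamiliar with the underlying arithmetic, the following is a minimal sketch of textbook digit-serial radix-4 Montgomery multiplication, written with ordinary Python integers. It is not the architecture proposed in the paper: it uses the plain digit set {0, 1, 2, 3} rather than Booth encoding, keeps the running sum in binary rather than carry-save form, and finishes with a conditional subtraction. The function name `montgomery_radix4` and the parameters `k` and `n_prime` are illustrative assumptions, not names taken from the paper.

```python
def montgomery_radix4(a, b, n, k):
    """Return a * b * 4**(-k) mod n using radix-4 digit-serial reduction.

    Assumes n is odd, 0 <= a < 4**k, and 0 <= b < n.
    Requires Python 3.8+ for pow(n, -1, 4).
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    if not (0 <= a < 4 ** k and 0 <= b < n):
        raise ValueError("operands out of range")

    # n_prime = -n^(-1) mod 4, used to pick the quotient digit each iteration.
    n_prime = (-pow(n, -1, 4)) % 4

    s = 0
    for i in range(k):
        a_i = (a >> (2 * i)) & 3      # next radix-4 digit of a
        s += a_i * b                  # accumulate the partial product
        q = (s * n_prime) & 3         # quotient digit: makes s + q*n divisible by 4
        s = (s + q * n) >> 2          # exact division by the radix
    # The loop keeps s below 2*n, so one conditional subtraction suffices.
    if s >= n:
        s -= n
    return s


def to_montgomery(x, n, k):
    """Map x to its Montgomery form x * 4**k mod n (direct computation)."""
    return (x * pow(4, k, n)) % n


def from_montgomery(x_m, n, k):
    """Map a Montgomery-form value back by multiplying with 1."""
    return montgomery_radix4(x_m, 1, n, k)
```

In this sketch, an ordinary modular product is obtained by converting both operands with `to_montgomery`, multiplying them with `montgomery_radix4`, and converting the result back with `from_montgomery`. Each loop iteration consumes two bits of the multiplier, which is why a radix-4 design needs roughly half as many iterations as a radix-2 one. The hardware design described in the abstract replaces the wide integer additions here with carry-save addition and generates the multiples on the fly.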

Keywords