PLoS ONE (Jan 2019)
Speed and energy optimized quasi-delay-insensitive block carry lookahead adder.
Abstract
We present a new asynchronous quasi-delay-insensitive (QDI) block carry lookahead adder with redundant carry (BCLARC), realized using delay-insensitive dual-rail data encoding and 4-phase return-to-zero (RTZ) and 4-phase return-to-one (RTO) handshaking. The proposed QDI BCLARC is faster and more energy-efficient than existing asynchronous adders, both QDI and non-QDI (i.e., relative-timed), across architectures including the ripple carry adder (RCA), the conventional carry lookahead adder (CCLA), the carry select adder (CSLA), the BCLARC, and the hybrid BCLARC-RCA. Speed is governed by the cycle time (CT), expressed as the sum of the worst-case times taken to process the data and the spacer. Energy efficiency is quantified by the power-cycle time product (PCTP), i.e., the product of average power dissipation and CT. For a 32-bit addition, the proposed QDI BCLARC achieves the following average reductions in design metrics relative to its counterparts, considering both RTZ and RTO handshaking: i) 20.5% in CT and 19.6% in PCTP compared to an optimum QDI early output RCA, ii) 16.5% in CT and 15.8% in PCTP compared to an optimum relative-timed RCA, iii) 32.9% in CT and 35.9% in PCTP compared to an optimum uniform input-partitioned QDI early output CSLA, iv) 47.5% in CT and 47.2% in PCTP compared to an optimum QDI early output CCLA, v) 14.2% in CT and 27.3% in PCTP compared to an optimum QDI early output BCLARC, and vi) 12.2% in CT and 11.6% in PCTP compared to an optimum QDI early output hybrid BCLARC-RCA. The adders were implemented using a 32/28 nm CMOS technology.
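The two design metrics defined above can be illustrated with a minimal Python sketch. The function names, the units (ns and µW), and all numeric values below are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def cycle_time(data_delay_ns, spacer_delay_ns):
    """CT: sum of the worst-case times to process the data and the spacer."""
    return data_delay_ns + spacer_delay_ns


def pctp(avg_power_uw, ct_ns):
    """PCTP: average power times cycle time (µW * ns gives femtojoules)."""
    return avg_power_uw * ct_ns


def percent_reduction(baseline, proposed):
    """Reduction of `proposed` relative to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical delays and power figures (NOT taken from the paper).
baseline_ct = cycle_time(data_delay_ns=6.0, spacer_delay_ns=4.0)
proposed_ct = cycle_time(data_delay_ns=5.0, spacer_delay_ns=3.0)

baseline_pctp = pctp(avg_power_uw=2000.0, ct_ns=baseline_ct)
proposed_pctp = pctp(avg_power_uw=2000.0, ct_ns=proposed_ct)

print(f"CT reduction:   {percent_reduction(baseline_ct, proposed_ct):.1f}%")
print(f"PCTP reduction: {percent_reduction(baseline_pctp, proposed_pctp):.1f}%")
```

Because PCTP scales with CT, a shorter cycle time lowers PCTP at equal average power; the CT and PCTP reductions reported above differ from each other because average power also varies between designs.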