IEEE Access (Jan 2024)

High-Throughput Accelerator for Exact-MMSE Soft-Output Detection in Open RAN Systems

  • Thomas James Thomas,
  • Konstantinos Nikitopoulos

DOI
https://doi.org/10.1109/ACCESS.2024.3443536
Journal volume & issue
Vol. 12
pp. 113785 – 113798

Abstract


Open Radio Access Networks (Open RANs), realized fully in software, require excessive computing resources to support time-sensitive signal-processing algorithms in the physical layer. Among them, multiple-input-multiple-output (MIMO) processing is a key functionality used to drive higher connectivity in the uplink, but it is computationally intensive, triggering the need for hardware acceleration to overcome the processing inefficiency of software-based solutions. Additionally, energy efficiency is becoming a key focus in Open RAN to enable sustainable deployments that utilize available resources efficiently. Because channel-inversion complexity increases polynomially with the number of users in linear detectors, such as zero-forcing (ZF) and minimum-mean-square-error (MMSE), acceleration based on channel-inverse approximations has gained significant attention. However, such approximations require far more base station (BS) antennas than users to ensure accurate detection, leading to a drastic increase in power consumption owing to the additional radio frequency (RF) chains employed. In contrast, exact linear detectors achieve sufficiently good performance with only twice as many BS antennas as users. This work introduces an exact-MMSE, soft-output hardware accelerator that includes an inversion-free, highly parallel QR decomposition (QRD) architecture and a low-complexity detector stage with per-cycle soft-output generation, significantly improving the processing latency and throughput. The proposed architecture is fully scalable to support diverse MIMO configurations. Implementation evaluations on a Xilinx Virtex UltraScale+ field-programmable gate array (FPGA) demonstrate that the proposed exact solution can achieve more than $2\times$ improvement in hardware throughput over existing approximate designs. Moreover, the peak throughput can be increased around 10-fold in slowly fading channels.
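
To illustrate how exact MMSE detection can avoid an explicit matrix inverse, the following is a minimal software sketch of the textbook QRD-based formulation. It is not the hardware architecture proposed in the paper. It assumes unit-energy transmit symbols, a known per-antenna noise variance, and a hypothetical function name `mmse_qrd_detect`. The channel matrix is augmented with a scaled identity, a QR decomposition is taken, and the estimate is obtained by back substitution on the triangular factor.

```python
import numpy as np

def mmse_qrd_detect(H, y, noise_var):
    """Exact MMSE estimate via QRD of the augmented channel (illustrative sketch).

    H: (M, K) complex channel matrix, M BS antennas, K users.
    y: (M,) received vector.
    noise_var: noise variance, assuming unit-energy symbols (assumption).
    """
    M, K = H.shape
    # Augmented channel [H; sqrt(noise_var) * I] so that R^H R = H^H H + noise_var * I.
    H_aug = np.vstack([H, np.sqrt(noise_var) * np.eye(K)])
    Q, R = np.linalg.qr(H_aug)      # reduced QR: Q is (M+K, K), R is (K, K)
    z = Q[:M, :].conj().T @ y       # rotate the received vector with the top block of Q
    # Solve R x = z by back substitution instead of forming an inverse matrix.
    x = np.zeros(K, dtype=complex)
    for k in range(K - 1, -1, -1):
        x[k] = (z[k] - np.dot(R[k, k + 1:], x[k + 1:])) / R[k, k]
    return x

def mmse_direct(H, y, noise_var):
    """Reference: MMSE via the regularized normal equations."""
    K = H.shape[1]
    G = H.conj().T @ H + noise_var * np.eye(K)
    return np.linalg.solve(G, H.conj().T @ y)
```

For example, with 16 BS antennas and 8 users (twice as many antennas as users, as in the abstract), `mmse_qrd_detect(H, y, noise_var)` and `mmse_direct(H, y, noise_var)` should agree to numerical precision. Soft outputs (log-likelihood ratios) would be derived from these estimates in a separate stage that this sketch does not cover.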

Keywords