IEEE Access (Jan 2025)

Optimizing Matrix-Vector Operations With CGLA for High-Performance Approximate k-NN Search

  • Dohyun Kim,
  • Yasuhiko Nakashima

DOI
https://doi.org/10.1109/access.2025.3582825
Journal volume & issue
Vol. 13
pp. 111087–111097

Abstract

In this paper, we propose a novel approach to accelerating approximate k-nearest neighbor (k-NN) search, a crucial operation in modern vector databases such as OpenSearch, ElasticSearch, Pinecone, and Milvus. Our method specifically targets the post-clustering phase of the inverted file (IVF) technique, where the bulk of the computational effort is devoted to matrix-vector (MV) multiplication. The primary innovation lies in an optimization technique for IMAX3, a Coarse-Grained Linear Array (CGLA), which effectively leverages local memory to retain vectors between queries. This approach eliminates the need to reload vectors for subsequent queries, thereby minimizing memory movement and significantly reducing latency. Experimental evaluations demonstrate that, when excluding overhead, IMAX3 achieves up to a 54.9× speedup for repeat queries compared with the initial execution, substantially outperforming conventional methods and making our solution highly suitable for real-time vector search applications. Moreover, relative to the RTX 4090, we achieve an energy-delay product (EDP) reduction of up to 498.41× and of 185.23× on average.
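To make the IVF setting concrete, the following NumPy sketch shows why the post-clustering phase reduces to matrix-vector products and why keeping cluster vectors resident across queries avoids reloading them. It is an illustration only, not the paper's IMAX3/CGLA implementation; the names (IVFIndex, cluster_mats, n_probe) and parameter choices are assumptions for this example.

# Minimal IVF-style approximate k-NN sketch (illustrative, not the authors' code).
import numpy as np

class IVFIndex:
    def __init__(self, data, n_clusters=16, n_iters=10, seed=0):
        rng = np.random.default_rng(seed)
        self.data = data
        # Plain k-means builds the coarse quantizer (cluster centroids).
        self.centroids = data[rng.choice(len(data), n_clusters, replace=False)]
        for _ in range(n_iters):
            assign = np.argmin(
                ((data[:, None, :] - self.centroids[None, :, :]) ** 2).sum(-1), axis=1)
            for c in range(n_clusters):
                members = data[assign == c]
                if len(members):
                    self.centroids[c] = members.mean(axis=0)
        # Inverted lists: per cluster, the ids of vectors assigned to it.
        self.lists = [np.where(assign == c)[0] for c in range(n_clusters)]
        # Analogue of keeping vectors in local memory: cluster matrices stay
        # resident, so repeat queries reuse them instead of reloading vectors.
        self.cluster_mats = [data[ids] for ids in self.lists]

    def search(self, query, k=10, n_probe=4):
        # Coarse step: pick the n_probe clusters whose centroids are closest.
        d2 = ((self.centroids - query) ** 2).sum(axis=1)
        probes = np.argsort(d2)[:n_probe]
        ids, dists = [], []
        for c in probes:
            mat = self.cluster_mats[c]  # resident cluster matrix
            if len(mat) == 0:
                continue
            # Post-clustering phase: one matrix-vector product per probed
            # cluster dominates the work (squared L2 distance up to a
            # query-norm constant, which does not affect the ranking).
            scores = (mat ** 2).sum(axis=1) - 2.0 * (mat @ query)
            ids.append(self.lists[c])
            dists.append(scores)
        ids = np.concatenate(ids)
        dists = np.concatenate(dists)
        top = np.argsort(dists)[:k]
        return ids[top]

# Usage: build the index once, then issue repeated queries against it.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    base = rng.standard_normal((5000, 64)).astype(np.float32)
    index = IVFIndex(base, n_clusters=32)
    q = rng.standard_normal(64).astype(np.float32)
    print(index.search(q, k=5))

In this sketch the speedup for repeat queries comes only from keeping cluster_mats in memory; the paper's contribution is mapping this reuse onto the CGLA's local memory so the MV multiplications avoid off-chip data movement.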

Keywords