IEEE Access (Jan 2024)
LK-Index: A Learned Index for KNN Queries
Abstract
The k-Nearest Neighbor (kNN) search is a crucial problem in database and data mining, especially in high-dimensional space. However, traditional kNN algorithms based on distance metrics and brute-force search often have low search efficiency and accuracy, and high computational complexity when dealing with large-scale high-dimensional datasets. These limitations have made them a bottleneck in practical applications. Inspired by the recently learned index that replaces B-tree with machine learning models, I propose a method for kNN search based on a learned index, named LK-index. Initially, a conventional tree-based index is created to process queries. The tree index is then utilized to find the k-nearest neighbor points, and the neural network is trained to act as a learned index. Finally, the actual k-nearest neighbors are obtained by computing the potential k-nearest neighbor points. Experiments conducted on synthetic and real-world datasets demonstrate that the LK-index yields certain advantages regarding search accuracy and time consumption.
Keywords