大数据 (Jan 2024)

Approximate nearest neighbor hybrid query algorithm based on tolerance factor

  • Guangfu HE,
  • Yuanhai XUE,
  • Cuiting CHEN,
  • Xiaoming YU,
  • Xinran LIU,
  • Xueqi CHENG

Journal volume & issue
Vol. 10
pp. 17 – 34

Abstract

Read online

Approximate nearest neighbor search (ANNS) is an important technique in the field of computer science for efficient similarity search, enabling fast information retrieval in large-scale datasets.With the increasing demand for high-precision information retrieval, there is a growing use of the hybrid query method that utilizes both structured and unstructured information for querying, which has broad application prospects.However, filtered greedy algorithms based on nearest neighbor graphs may reduce the connectivity of the graph due to the impact of structural constraints in hybrid queries, ultimately damaging search accuracy.This article proposes a tolerance factor based filtered greedy algorithm, which controls the participation of vertices that do not meet structural constraints in routing through a tolerance factor, maintaining the connectivity of the original nearest neighbor graph without changing the index structure, and overcoming the negative impact of structural constraints on retrieval accuracy.The experimental results demonstrate that the new method can achieve high-precision search for ANNS under different levels of structural constraints, while maintaining retrieval efficiency.This study solves the problem of ANNS based on nearest neighbor graphs in hybrid query scenarios, providing an effective solution for fast hybrid query information retrieval in large-scale datasets.

Keywords