IEEE Access (Jan 2018)

Binary Hashing for Approximate Nearest Neighbor Search on Big Data: A Survey

  • Yuan Cao,
  • Heng Qi,
  • Wenrui Zhou,
  • Jien Kato,
  • Keqiu Li,
  • Xiulong Liu,
  • Jie Gui

DOI
https://doi.org/10.1109/ACCESS.2017.2781360
Journal volume & issue
Vol. 6
pp. 2039 – 2054

Abstract

Read online

Nearest neighbor search is a fundamental problem in various domains, such as computer vision, data mining, and machine learning. With the explosive growth of data on the Internet, many new data structures using spatial partitions and recursive hyperplane decomposition (e.g., k-d trees) are proposed to speed up the nearest neighbor search. However, these data structures are facing big data challenges. To meet these challenges, binary hashing-based approximate nearest neighbor search methods attract substantial attention due to their fast query speed and drastically reduced storage. Since the most notably locality sensitive hashing was proposed, a large number of binary hashing methods have emerged. In this paper, we first illustrate the development of binary hashing research by proposing an overall and clear classification of them. Then we conduct extensive experiments to compare the performance of these methods on five famous and public data sets. Finally, we present our view on this topic.

Keywords