Binary Hashing for Approximate Nearest Neighbor Search on Big Data: A Survey

Yuan Cao; Heng Qi; Wenrui Zhou; Jien Kato; Keqiu Li; Xiulong Liu; Jie Gui

doi:10.1109/ACCESS.2017.2781360

IEEE Access (Jan 2018)

Binary Hashing for Approximate Nearest Neighbor Search on Big Data: A Survey

Yuan Cao,
Heng Qi,
Wenrui Zhou,
Jien Kato,
Keqiu Li,
Xiulong Liu,
Jie Gui

Affiliations

Yuan Cao: School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Heng Qi: ORCiD; School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Wenrui Zhou: School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Jien Kato: Graduate School of Information Science, Nagoya University, Nagoya, Japan
Keqiu Li: School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Xiulong Liu: School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Jie Gui: Chinese Academy of Sciences, Institute of Intelligent Machines, Hefei, China

DOI: https://doi.org/10.1109/ACCESS.2017.2781360
Journal volume & issue: Vol. 6
pp. 2039 – 2054

Abstract

Read online

Nearest neighbor search is a fundamental problem in various domains, such as computer vision, data mining, and machine learning. With the explosive growth of data on the Internet, many new data structures using spatial partitions and recursive hyperplane decomposition (e.g., k-d trees) are proposed to speed up the nearest neighbor search. However, these data structures are facing big data challenges. To meet these challenges, binary hashing-based approximate nearest neighbor search methods attract substantial attention due to their fast query speed and drastically reduced storage. Since the most notably locality sensitive hashing was proposed, a large number of binary hashing methods have emerged. In this paper, we first illustrate the development of binary hashing research by proposing an overall and clear classification of them. Then we conduct extensive experiments to compare the performance of these methods on five famous and public data sets. Finally, we present our view on this topic.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639

About the journal

Abstract

Keywords