A Ranking-Based Hashing Algorithm Based on the Distributed Spark Platform

Anbang Yang; Jiangbo Qian; Huahui Chen; Yihong Dong

doi:10.3390/info11030148

Information (Mar 2020)

Affiliations

Anbang Yang: Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China
Jiangbo Qian: Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China
Huahui Chen: Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China
Yihong Dong: Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China

DOI: https://doi.org/10.3390/info11030148
Journal volume & issue: Vol. 11, no. 3, p. 148

Abstract

With the rapid development of modern society, the volume of generated data has grown exponentially, and finding the required data in this huge pool is an urgent problem. Hashing is widely used for similarity search over large-scale data. Among hashing methods, ranking-based hashing algorithms have been widely studied because of the accuracy and speed of their search results. At present, most ranking-based hashing algorithms construct loss functions by comparing the rank consistency of data in Euclidean and Hamming spaces. However, most of them have high time complexity and long training times, so they cannot meet practical requirements. To solve these problems, this paper introduces the distributed Spark framework and implements a ranking-based hashing algorithm in a parallel environment on multiple machines. The experimental results show that Spark-RLSH (Ranking Listwise Supervision Hashing) greatly reduces training time and improves training efficiency compared with other ranking-based hashing algorithms.
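
To make the notion of rank consistency between Euclidean and Hamming spaces concrete, the following is a minimal Python sketch. It is not the Spark-RLSH loss or the authors' implementation: the sign-of-projection hash, the pairwise agreement score, and every function name (`euclidean_sq`, `hash_code`, `hamming`, `rank_consistency`) are assumptions introduced here purely for illustration. It measures, for one query, the fraction of point pairs whose order by Euclidean distance is preserved by the Hamming distance between their binary codes.

```python
import random
from itertools import combinations


def euclidean_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hash_code(x, projections):
    """Illustrative hash (an assumption, not the paper's): one bit per
    projection row, set to 1 when the dot product is non-negative."""
    return [1 if sum(w * v for w, v in zip(row, x)) >= 0 else 0
            for row in projections]


def hamming(c1, c2):
    """Number of positions at which two binary codes differ."""
    return sum(b1 != b2 for b1, b2 in zip(c1, c2))


def rank_consistency(query, points, projections):
    """Fraction of point pairs ordered the same way, relative to the query,
    by Euclidean distance and by Hamming distance of their codes.
    Pairs tied in Euclidean distance are skipped; pairs tied only in
    Hamming distance count as inconsistent."""
    q_code = hash_code(query, projections)
    d_euc = [euclidean_sq(query, p) for p in points]
    d_ham = [hamming(q_code, hash_code(p, projections)) for p in points]

    agree = 0
    total = 0
    for i, j in combinations(range(len(points)), 2):
        if d_euc[i] == d_euc[j]:
            continue  # no Euclidean order to preserve
        total += 1
        if (d_euc[i] - d_euc[j]) * (d_ham[i] - d_ham[j]) > 0:
            agree += 1
    return agree / total if total else 1.0


if __name__ == "__main__":
    # Hypothetical toy data: 50 random 8-dimensional points, 16-bit codes.
    rng = random.Random(0)
    dim, bits = 8, 16
    projections = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
    points = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(50)]
    query = [rng.gauss(0, 1) for _ in range(dim)]
    print(rank_consistency(query, points, projections))
```

A score of 1.0 means the Hamming ranking reproduces the Euclidean ranking for every comparable pair. A ranking-based training objective would adjust the hash function to increase this kind of agreement, and evaluating it over many queries is the sort of cost that motivates distributing the work across machines.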

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/
