Data Science and Engineering (Sep 2023)

Deep Learning-Based Bloom Filter for Efficient Multi-key Membership Testing

  • Haitian Chen,
  • Ziwei Wang,
  • Yunchuan Li,
  • Ruixin Yang,
  • Yan Zhao,
  • Rui Zhou,
  • Kai Zheng

DOI
https://doi.org/10.1007/s41019-023-00224-9
Journal volume & issue
Vol. 8, no. 3
pp. 234 – 246

Abstract

Multi-key membership testing plays a crucial role in computing systems and networking applications, including web search, mail systems, distributed databases, firewalls, and network routing. Traditional approaches, such as the Bloom filter, face limitations in this setting. To address these challenges, we propose the Multi-key Learned Bloom Filter (MLBF), a hybrid method that combines machine learning techniques with the Bloom filter. The MLBF introduces a value-interaction-based multi-key classifier and a multi-key Bloom filter. We further introduce an Interval-based MLBF approach, which groups keys into intervals according to the data distribution in order to minimize the False Positive Rate (FPR). MLBF also incorporates an out-of-distribution (OOD) detection component to identify data shifts. Extensive experiments on three real-world datasets demonstrate the superiority of the proposed MLBF in terms of FPR and query efficiency.
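
The hybrid idea in the abstract, a classifier backed by a Bloom filter, can be illustrated with a generic learned Bloom filter. The sketch below is not the authors' MLBF: it has no value-interaction model, no interval partitioning, and no OOD detection. It assumes that a record's keys are treated as one tuple, and every name in it (`BloomFilter`, `LearnedBloomFilter`, `toy_score`, the threshold of 0.5) is hypothetical. It shows only the general pattern: members that the classifier scores below the threshold are stored in a backup Bloom filter, so the structure as a whole produces no false negatives.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over hashable items (here: tuples of keys)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting BLAKE2b with a seed.
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


class LearnedBloomFilter:
    """Classifier first, backup Bloom filter for members it misses."""

    def __init__(self, score, threshold, members, num_bits=1024, num_hashes=3):
        self.score = score          # assumed: callable returning a score in [0, 1]
        self.threshold = threshold  # assumed cut-off, not taken from the paper
        self.backup = BloomFilter(num_bits, num_hashes)
        for member in members:
            # Members the classifier would reject go into the backup filter.
            if score(member) < threshold:
                self.backup.add(member)

    def __contains__(self, item):
        return self.score(item) >= self.threshold or item in self.backup


def toy_score(key):
    # Hypothetical stand-in for a trained model: favours 10.x.x.x addresses.
    return 1.0 if key[0].startswith("10.") else 0.0


members = [("10.0.0.1", 80), ("10.0.0.2", 443), ("192.168.0.7", 443)]
lbf = LearnedBloomFilter(toy_score, threshold=0.5, members=members)

# Every stored record is reported as present (no false negatives).
print(all(m in lbf for m in members))
# A non-member that the toy classifier accepts is a false positive.
print(("10.9.9.9", 22) in lbf)
```

In this arrangement the FPR has two sources, the classifier's own false positives and those of the backup filter, which is why the choice of threshold and the size of the backup filter have to be tuned together.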

Keywords