Algorithms (May 2023)

Well-Separated Pair Decompositions for High-Dimensional Datasets

  • Domagoj Matijević

DOI
https://doi.org/10.3390/a16050254
Journal volume & issue
Vol. 16, no. 5
p. 254

Abstract

Well-separated pair decomposition (WSPD) is a well-known geometric decomposition used for encoding distances, introduced in a seminal paper by Paul B. Callahan and S. Rao Kosaraju in 1995. A WSPD compresses the O(n^2) pairwise distances of n given points from R^d into O(n) space for a fixed dimension d. However, the main problem with this remarkable decomposition is the “hidden” dependence on the dimension d, which in practice prevents the computation of a WSPD for dimensions d>2, or d>3 at best. In this work, I show how to compute a WSPD for points in R^d for any dimension d. Instead of computing a WSPD directly in R^d, I propose to learn a nonlinear mapping that transforms the data to a lower-dimensional space R^{d′}, with d′=2 or d′=3, since only in such low-dimensional spaces can a WSPD be computed efficiently. Furthermore, I estimate the quality of the computed WSPD in the original space R^d. My experiments on different synthetic and real-world datasets show that, in practice, this approach still yields a WSPD of size O(n) for points in R^d with dimensions d much larger than two or three.
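
To make the notion of a well-separated pair concrete, the following is a minimal sketch of the standard separation test from the WSPD literature: two point sets are s-well-separated if both fit into balls of a common radius r whose gap is at least s·r. The function names `enclosing_ball` and `well_separated` are illustrative assumptions and not code from the paper, and the ball is taken around the bounding-box center for simplicity, so it is not necessarily the smallest enclosing ball.

```python
import math


def enclosing_ball(points):
    """Return (center, radius) of a ball that encloses all points.

    Assumption: the center is the bounding-box center, which gives a
    valid but not necessarily minimal enclosing ball.
    """
    dim = len(points[0])
    lo = [min(p[i] for p in points) for i in range(dim)]
    hi = [max(p[i] for p in points) for i in range(dim)]
    center = [(l + h) / 2 for l, h in zip(lo, hi)]
    radius = max(math.dist(center, p) for p in points)
    return center, radius


def well_separated(a, b, s):
    """Check whether point sets a and b are s-well-separated.

    Both sets are enclosed in balls of the same radius r (the larger of
    the two radii); the pair passes if the gap between those balls is
    at least s * r.
    """
    center_a, radius_a = enclosing_ball(a)
    center_b, radius_b = enclosing_ball(b)
    r = max(radius_a, radius_b)
    gap = math.dist(center_a, center_b) - 2 * r
    return gap >= s * r


far = well_separated([(0, 0), (1, 0)], [(10, 0), (11, 0)], s=2)
near = well_separated([(0, 0), (1, 0)], [(2, 0), (3, 0)], s=4)
```

Because the test works on coordinate tuples of any length, the same check could in principle be applied to a pair of point sets using their original R^d coordinates after the pair was formed in the low-dimensional space. This is only one possible way to inspect separation in the original space and is not claimed to be the quality estimate used in the paper.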

Keywords