Algorithms (Nov 2020)
<span style="font-variant: small-caps">Lumáwig</span>: An Efficient Algorithm for Dimension Zero Bottleneck Distance Computation in Topological Data Analysis
Abstract
Stability of persistence diagrams under slight perturbations is a key characteristic behind the validity and growing popularity of topological data analysis in exploring real-world data. Central to this stability is the use of Bottleneck distance which entails matching points between diagrams. Instances of use of this metric in practical studies have, however, been few and sparingly far between because of the computational obstruction, especially in dimension zero where the computational cost explodes with the growth of data size. We present a novel efficient algorithm to compute dimension zero bottleneck distance between two persistent diagrams of a specific kind which runs significantly faster and provides significantly sharper approximates with respect to the output of the original algorithm than any other available algorithm. We bypass the overwhelming matching problem in previous implementations of the bottleneck distance, and prove that the zero dimensional bottleneck distance can be recovered from a very small number of matching cases. Partly in keeping with nomenclature traditions in this area of TDA, we name this algorithm Lumáwig as a nod to a deity in the northern Philippines, where the algorithm was developed. We show that Lumáwig generally enjoys linear complexity as shown by empirical tests. We also present an application that leverages dimension zero persistence diagrams and the bottleneck distance to produce features for classification tasks.
Keywords