Applied Sciences (Feb 2022)

An Implementation of the HDBSCAN* Clustering Algorithm

  • Geoffrey Stewart,
  • Mahmood Al-Khassaweneh

DOI
https://doi.org/10.3390/app12052405
Journal volume & issue
Vol. 12, no. 5
p. 2405

Abstract

This work presents Tribuo Hdbscan, an implementation of the HDBSCAN* clustering algorithm developed as a new feature of the Java machine learning library Tribuo. The implementation leverages concurrency and achieves better performance than the reference Java implementation. Tribuo Hdbscan also provides prediction functionality, a novel technique for making fast predictions for unseen data points using an HDBSCAN* clustering model. Its cluster results and performance measurements are compared with those of the state-of-the-art HDBSCAN* implementation, the Python module hdbscan.

Keywords