Applied Sciences (Feb 2022)
An Implementation of the HDBSCAN* Clustering Algorithm
Abstract
This work presents Tribuo Hdbscan, an implementation of the HDBSCAN* clustering algorithm developed as a new feature of the Java machine learning library Tribuo. The implementation leverages concurrency and achieves better performance than the reference Java implementation. Tribuo Hdbscan also provides prediction functionality: a novel technique for making fast predictions on unseen data points using an HDBSCAN* clustering model. Its cluster results and performance measurements are also compared with those of the state-of-the-art HDBSCAN* implementation, the Python module hdbscan.
Keywords