Data Science and Engineering (Jul 2020)

DeepECT: The Deep Embedded Cluster Tree

  • Dominik Mautz,
  • Claudia Plant,
  • Christian Böhm

DOI
https://doi.org/10.1007/s41019-020-00134-0
Journal volume & issue
Vol. 5, no. 4
pp. 419 – 432

Abstract

The idea of combining the high representational power of deep learning techniques with clustering methods has gained much attention in recent years. Optimizing a clustering objective and the dataset representation simultaneously has been shown to be advantageous over separately optimizing them. So far, however, all proposed methods have been using a flat clustering strategy, with the actual number of clusters known a priori. In this paper, we propose the Deep Embedded Cluster Tree (DeepECT), the first divisive hierarchical embedded clustering method. The cluster tree does not need to know the actual number of clusters during optimization. Instead, the level of detail to be analyzed can be chosen afterward and for each sub-tree separately. An optional data-augmentation-based extension allows DeepECT to ignore prior-known invariances of the dataset, such as affine transformations in image data. We evaluate and show the advantages of DeepECT in extensive experiments.

Keywords