Mathematics (Dec 2024)
Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
Abstract
Hierarchical clustering analysis (HCA) is a widely used unsupervised learning method. Its limitations, however, include imposing an artificial hierarchy onto non-hierarchical data and restricting every level to fixed two-way mergers. To address this, the current work describes a novel rootlets hierarchical principal component analysis (hPCA). This method extends typical hPCA using multivariate statistics to construct adaptive multiway mergers and Riemannian geometry to visualize nested dependencies. The rootlets hPCA algorithm and its projection onto the Poincaré disk are presented as examples of this extended framework. The algorithm constructs high-dimensional mergers using a single parameter, interpreted as a p-value. It decomposes a similarity matrix from GL(m, ℝ) using a sequence of rotations from SO(k), k ≤ m. Analysis shows that the rootlets algorithm limits the number of distinct eigenvalues of any merger. Nested clusters of arbitrary size but equal correlations are constructed and merged using their leading principal components. The visualization method then maps elements of SO(k) onto a low-dimensional hyperbolic manifold, the Poincaré disk. Rootlets hPCA was validated using simulated datasets with known hierarchical structure and a neuroimaging dataset with an unknown hierarchy. Experiments demonstrate that rootlets hPCA accurately reconstructs known hierarchies and, unlike HCA, does not impose a hierarchy on the data.
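The merge step summarized above, in which a cluster of equally correlated variables is replaced by its leading principal component, can be sketched as follows. This is an illustrative toy example assuming a standard PCA-based merge on a correlation matrix, not the authors' implementation; the simulated data, cluster size, and noise level are arbitrary choices.

```python
import numpy as np

# Illustrative sketch (not the authors' implementation): merge a cluster of
# approximately equicorrelated variables by replacing it with the score of
# its leading principal component, as in the merge step the abstract describes.

rng = np.random.default_rng(0)

# Simulate a cluster of k variables with roughly equal pairwise correlation
# by mixing one shared latent signal with independent noise.
n_samples, k = 500, 4
latent = rng.standard_normal(n_samples)
X = latent[:, None] + 0.5 * rng.standard_normal((n_samples, k))

# Correlation (similarity) matrix of the cluster.
R = np.corrcoef(X, rowvar=False)

# Eigendecomposition. For an exactly equicorrelated cluster with common
# correlation r, the spectrum has only two distinct eigenvalues:
# 1 + (k - 1) r (simple, leading) and 1 - r (repeated k - 1 times),
# consistent with the abstract's claim that mergers have few distinct eigenvalues.
eigvals, eigvecs = np.linalg.eigh(R)   # eigenvalues in ascending order
leading = eigvecs[:, -1]               # leading principal component direction

# Replace the whole cluster with its leading principal component score,
# a single representative variable carried up to the next level of the hierarchy.
merged = X @ leading
```

With strong shared signal, the leading eigenvalue dominates and the remaining eigenvalues are nearly equal, so the merged score preserves most of the cluster's common variance.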
Keywords