Algorithms (Mar 2024)

Exploratory Data Analysis and Searching Cliques in Graphs

  • András Hubai,
  • Sándor Szabó,
  • Bogdán Zaválnij

DOI
https://doi.org/10.3390/a17030112
Journal volume & issue
Vol. 17, no. 3
p. 112

Abstract

Principal component analysis is a well-known and widely used technique for determining the essential dimension of a data set. Broadly speaking, it aims to find a low-dimensional linear manifold that retains a large part of the information contained in the original data set. It may happen that the entire data set cannot be approximated by a single low-dimensional linear manifold, even though large subsets of it can be. For these cases, we raise the related but different problem of locating subsets of a high-dimensional data set that are approximately 1-dimensional. Naturally, we are interested in the largest such subsets. We propose a method for finding these 1-dimensional manifolds by finding cliques in a purpose-built auxiliary graph.
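
The abstract does not say how the auxiliary graph is built, so the following Python sketch is only an illustration of the general idea and should not be read as the authors' construction. It assumes a hypothetical graph in which a reference point (the anchor) is fixed, every other point is a node, and two nodes are adjacent when their directions from the anchor are nearly parallel. Under that assumption, a clique corresponds to a set of points lying approximately on one line through the anchor. The names build_graph, max_clique, and the threshold min_abs_cos are invented for this example.

```python
import math
from itertools import combinations


def unit_direction(p, q):
    """Unit vector from p to q, or None if the two points coincide."""
    d = [b - a for a, b in zip(p, q)]
    norm = math.sqrt(sum(x * x for x in d))
    return [x / norm for x in d] if norm > 0 else None


def build_graph(points, anchor, min_abs_cos=0.999):
    """Hypothetical auxiliary graph (an assumption, not the paper's definition).

    Nodes are indices of points other than the anchor. Nodes i and j are
    adjacent when the directions anchor->i and anchor->j are nearly parallel,
    i.e. the absolute cosine of the angle between them is >= min_abs_cos.
    """
    dirs = {}
    for i, p in enumerate(points):
        if i != anchor:
            u = unit_direction(points[anchor], p)
            if u is not None:
                dirs[i] = u
    adj = {i: set() for i in dirs}
    for i, j in combinations(dirs, 2):
        cos = sum(a * b for a, b in zip(dirs[i], dirs[j]))
        if abs(cos) >= min_abs_cos:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def max_clique(adj):
    """Bron-Kerbosch with pivoting; returns one maximum clique as a set."""
    best = set()

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest so far.
            if len(r) > len(best):
                best = set(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best


# Toy data in 3 dimensions: indices 0-4 lie on one line, 5-7 are scattered.
points = [
    (0, 0, 0), (1, 2, 3), (2, 4, 6), (3, 6, 9), (-1, -2, -3),
    (5, 0, 1), (0, 4, -2), (-3, 1, 1),
]
anchor = 0
line_members = sorted(max_clique(build_graph(points, anchor)) | {anchor})
print(line_members)  # [0, 1, 2, 3, 4]
```

This sketch only detects lines passing through the chosen anchor, so a fuller search would repeat it for several anchors and keep the largest result. Exact maximum clique search is exponential in the worst case, which is why the choice of graph and clique solver matters for larger data sets.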

Keywords