Austrian Journal of Statistics (Apr 2017)

Extracting Information from Interval Data Using Symbolic Principal Component Analysis

  • M. R. Oliveira,
  • M. Vilela,
  • A. Pacheco,
  • Rui Valadas,
  • Paulo Salvador

DOI
https://doi.org/10.17713/ajs.v46i3-4.673
Journal volume & issue
Vol. 46, no. 3-4

Abstract


We introduce generic definitions of symbolic variance and covariance for random interval-valued variables that lead to a unified and insightful interpretation of four known symbolic principal component estimation methods: CPCA, VPCA, CIPCA, and SymCovPCA. Moreover, we propose truncated versions of symbolic principal components, which use a strict subset of the original symbolic variables, as a way to improve the interpretation of symbolic principal components. Furthermore, the analysis of a real dataset leads to a meaningful characterization of Internet traffic applications, while highlighting similarities between the symbolic principal component estimation methods considered in the paper.
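
As a rough illustration of the simplest of the four methods named above, the centers method (CPCA) can be sketched as classical PCA applied to the interval midpoints, followed by projection of each interval onto the resulting components. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name `cpca`, the use of the ordinary sample covariance of the midpoints, and the toy data are all illustrative choices, and the paper's generic symbolic variance and covariance definitions are not reproduced here.

```python
import numpy as np


def cpca(lower, upper):
    """Centers-method PCA sketch for interval data (hypothetical helper).

    lower, upper: arrays of shape (n_observations, n_variables), n_variables >= 2,
    holding the interval end points, with lower <= upper elementwise.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    # Assumption: each interval is summarised by its midpoint, and the
    # ordinary sample covariance of the midpoints is decomposed.
    centers = (lower + upper) / 2.0
    mean = centers.mean(axis=0)
    cov = np.cov(centers - mean, rowvar=False)

    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Interval-valued scores: the minimum and maximum of each linear
    # combination over the hyper-rectangle [lower, upper]. A positive
    # loading takes the lower end point for the minimum, a negative
    # loading takes the upper end point, and vice versa for the maximum.
    pos = np.clip(eigvecs, 0.0, None)
    neg = np.clip(eigvecs, None, 0.0)
    lo_c, hi_c = lower - mean, upper - mean
    score_lo = lo_c @ pos + hi_c @ neg
    score_hi = hi_c @ pos + lo_c @ neg
    return eigvals, eigvecs, score_lo, score_hi


# Toy interval data (made up for illustration): 4 observations, 2 variables.
lower = [[1, 2], [2, 1], [4, 5], [6, 5]]
upper = [[3, 4], [3, 3], [6, 8], [9, 7]]
eigvals, eigvecs, score_lo, score_hi = cpca(lower, upper)
```

In this sketch, a truncated component in the sense of the abstract could be imitated by setting selected loadings in a column of `eigvecs` to zero before computing the scores, so that only a strict subset of the variables contributes; how the paper selects that subset is not stated in the abstract.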