Statistika: Statistics and Economy Journal (Sep 2017)

Iteratively Reweighted Least Squares Algorithm for Sparse Principal Component Analysis with Application to Voting Records

  • Tomáš Masák

Journal volume & issue
Vol. 97, no. 3
pp. 88 – 106

Abstract

Principal component analysis (PCA) is a popular dimensionality reduction and data visualization method. Sparse PCA (SPCA) is an extensively studied modification of PCA that is NP-hard to solve. In the past decade, many different algorithms have been proposed to perform SPCA. We build upon the work of Zou et al. (2006), who recast the SPCA problem into the regression framework and proposed to induce sparsity with the l1 penalty. Instead, we propose to drop the l1 penalty and promote sparsity by re-weighting the l2-norm. Our algorithm thus consists mainly of solving weighted ridge regression problems. We show that the algorithm in effect attempts to find a solution to a penalized least squares problem with a non-convex penalty that resembles the l0-norm more closely than the l1 penalty does. We also apply the algorithm to analyze the voting records of the Chamber of Deputies of the Parliament of the Czech Republic. We show not only why SPCA is more appropriate for analyzing this type of data, but also discuss whether the variable selection property can be utilized as an additional piece of information, for example to create voting calculators automatically.
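
The idea of promoting sparsity by re-weighting the l2-norm can be illustrated with a minimal sketch on an ordinary regression problem. This is not the algorithm from the paper: the function name `irls_sparse_regression`, the weight update 1 / (beta_j^2 + eps), and the parameters `lam`, `eps` and `n_iter` are assumptions chosen for illustration, and the sketch does not cover the SPCA setting itself. Each iteration solves one weighted ridge regression, and coefficients that were small in the previous iteration receive a larger penalty in the next one.

```python
import numpy as np


def irls_sparse_regression(X, y, lam=1.0, eps=1e-8, n_iter=50):
    """Illustrative iteratively reweighted ridge regression (hypothetical helper).

    Repeatedly solves  min_b ||y - X b||^2 + lam * sum_j w_j * b_j^2
    with weights w_j = 1 / (b_j^2 + eps) taken from the previous iterate.
    The weight update is an assumption for this sketch, not the paper's rule.
    """
    n_features = X.shape[1]
    weights = np.ones(n_features)      # first pass is a plain ridge regression
    gram = X.T @ X
    xty = X.T @ y
    beta = np.zeros(n_features)
    for _ in range(n_iter):
        # Weighted ridge step: (X'X + lam * diag(w)) beta = X'y
        beta = np.linalg.solve(gram + lam * np.diag(weights), xty)
        # Small coefficients get large weights and are pushed towards zero
        weights = 1.0 / (beta ** 2 + eps)
    return beta


# Synthetic usage example (made-up data, not from the paper)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 5))
beta_true = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
y_demo = X_demo @ beta_true + 0.01 * rng.normal(size=100)
beta_hat = irls_sparse_regression(X_demo, y_demo)
print(np.round(beta_hat, 4))
```

At a fixed point, each penalty term beta_j^2 / (beta_j^2 + eps) is close to 1 for a non-zero coefficient and close to 0 for a zero one, which is why a scheme of this kind behaves more like an l0-type penalty than like the l1 penalty.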

Keywords