Computation (May 2025)

An Explainable Framework Integrating Local Biplots and Gaussian Processes for Unemployment Rate Prediction in Colombia

  • Diego Armando Pérez-Rosero,
  • Diego Alejandro Manrique-Cabezas,
  • Jennifer Carolina Triana-Martinez,
  • Andrés Marino Álvarez-Meza,
  • German Castellanos-Dominguez

DOI
https://doi.org/10.3390/computation13050116
Journal volume & issue
Vol. 13, no. 5
p. 116

Abstract

Read online

Addressing unemployment is essential for formulating effective public policies. In particular, socioeconomic and monetary variables serve as essential indicators for anticipating labor market trends, given their strong influence on employment dynamics and economic stability. However, effective unemployment rate prediction requires addressing the non-stationary and non-linear characteristics of labor data. Equally important is the preservation of interpretability in both samples and features to ensure that forecasts can meaningfully inform public decision-making. Here, we provide an explainable framework integrating unsupervised and supervised machine learning to enhance unemployment rate prediction and interpretability. Our approach is threefold: (i) we gather a dataset for Colombian unemployment rate prediction including monetary and socioeconomic variables. (ii) Then, we used a Local Biplot technique from the widely recognized Uniform Manifold Approximation and Projection (UMAP) method along with local affine transformations as an unsupervised representation of non-stationary and non-linear data patterns in a simplified and comprehensible manner. (iii) A Gaussian Processes regressor with kernel-based feature relevance analysis is coupled as a supervised counterpart for both unemployment rate prediction and input feature importance analysis. We demonstrated the effectiveness of our proposed approach through a series of experiments conducted on our customized database focused on unemployment indicators in Colombia. Furthermore, we carried out a comparative analysis between traditional statistical techniques and modern machine learning methods. The results revealed that our framework significantly enhances both clustering and predictive performance, while also emphasizing the importance of input samples and feature selection in driving accurate outcomes.

Keywords