IEEE Access (Jan 2024)
A Bayesian Gaussian Process-Based Latent Discriminative Generative Decoder (LDGD) Model for High-Dimensional Data
Abstract
Extracting meaningful information from high-dimensional data poses a formidable modeling challenge, particularly when the data is obscured by noise or represented through different modalities. This research proposes a novel non-parametric modeling approach, leveraging the Gaussian process (GP), to characterize high-dimensional data by mapping it to a latent low-dimensional manifold. This model, named the latent discriminative generative decoder (LDGD), employs both the data and associated labels in the manifold discovery process. A Bayesian solution is derived to infer the latent variables, allowing LDGD to effectively capture the inherent stochasticity in the data. Applications of LDGD are demonstrated on both synthetic and benchmark datasets. Not only does LDGD infer the manifold accurately, but its accuracy in predicting data points’ labels also surpasses that of state-of-the-art approaches. In the development of LDGD, inducing points are incorporated to reduce the computational complexity of Gaussian processes for large datasets, enabling batch training for improved efficiency and scalability. Additionally, we show that LDGD can robustly infer the manifold and precisely predict labels in scenarios where the data size is limited, demonstrating its capability to characterize high-dimensional data efficiently from limited samples. These collective attributes highlight the importance of developing non-parametric modeling approaches for analyzing high-dimensional data.
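The central mechanism summarized above, a GP mapping from a low-dimensional latent space to high-dimensional observations, kept tractable with a small set of inducing points, can be sketched conceptually as follows. This is a minimal NumPy illustration under assumed names (rbf_kernel, latent matrix X, inducing locations Z) using a standard subset-of-regressors sparse-GP approximation; it is not the authors' LDGD implementation, which additionally exploits the labels and infers the latent variables in a Bayesian manner.

```python
# Conceptual sketch only: a GP maps low-dimensional latent coordinates X to
# high-dimensional observations Y, with m inducing points making the kernel
# algebra scale as O(n m^2) instead of O(n^3). Names and settings are
# illustrative assumptions, not the LDGD code.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * sq / lengthscale**2)

rng = np.random.default_rng(0)
n, q, d, m = 200, 2, 50, 15          # samples, latent dim, observed dim, inducing points
noise = 1e-2

X = rng.normal(size=(n, q))          # latent coordinates (inferred, not fixed, in LDGD)
Z = rng.normal(size=(m, q))          # inducing locations in the latent space

Knm = rbf_kernel(X, Z)               # cross-covariance between data and inducing points
Kmm = rbf_kernel(Z, Z) + 1e-6 * np.eye(m)

# Draw a toy high-dimensional dataset from the sparse (Nystrom) GP prior.
L = np.linalg.cholesky(Kmm)
W = np.linalg.solve(L, Knm.T).T                      # n x m feature map
Y = W @ rng.normal(size=(m, d)) + noise * rng.normal(size=(n, d))

# Subset-of-regressors predictive mean of the decoder f(X):
#   E[f] = Knm (Kmm + Knm^T Knm / noise^2)^{-1} Knm^T Y / noise^2
A = Kmm + Knm.T @ Knm / noise**2
mean = Knm @ np.linalg.solve(A, Knm.T @ Y) / noise**2
print("reconstruction error:", np.mean((mean - Y)**2))
```

In LDGD, the latent coordinates X themselves are treated as random variables and optimized jointly with the kernel and inducing points, and a discriminative term over the labels shapes the learned manifold; the sketch above only shows the sparse generative mapping that makes batch training on large datasets feasible.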
Keywords