Yet Another Discriminant Analysis (YADA): A Probabilistic Model for Machine Learning Applications

Richard V. Field; Michael R. Smith; Ellery J. Wuest; Joe B. Ingram

doi:10.3390/math12213392

Mathematics (Oct 2024)

Affiliations

Richard V. Field: Sandia National Laboratories, Albuquerque, NM 87185, USA
Michael R. Smith: Sandia National Laboratories, Albuquerque, NM 87185, USA
Ellery J. Wuest: Klipsch School of Electrical and Computer Engineering, New Mexico State University, Las Cruces, NM 88003, USA
Joe B. Ingram: Sandia National Laboratories, Albuquerque, NM 87185, USA

DOI: https://doi.org/10.3390/math12213392
Journal volume & issue: Vol. 12, no. 21, p. 3392

Abstract

This paper presents a probabilistic model for various machine learning (ML) applications. While deep learning (DL) has produced state-of-the-art results in many domains, DL models are complex and over-parameterized, which leads to high uncertainty about what the model has learned, as well as its decision process. Further, DL models are not probabilistic, making reasoning about their output challenging. In contrast, the proposed model, referred to as Yet Another Discriminant Analysis (YADA), is less complex than other methods, is based on a mathematically rigorous foundation, and can be utilized for a wide variety of ML tasks including classification, explainability, and uncertainty quantification. YADA is thus competitive in most cases with many state-of-the-art DL models. Ideally, a probabilistic model would represent the full joint probability distribution of its features, but doing so is often computationally expensive and intractable. Hence, many probabilistic models assume that the features are either normally distributed, mutually independent, or both, which can severely limit their performance. YADA is an intermediate model that (1) captures the marginal distribution of each variable and the pairwise correlations between variables and (2) explicitly maps features to the space of multivariate Gaussian variables. Numerous mathematical properties of the YADA model can be derived, thereby improving the theoretical underpinnings of ML. The model can be statistically validated on new or held-out data using native properties of YADA. However, some engineering and practical challenges remain; we enumerate them as steps toward making YADA more useful.
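The two ingredients named in the abstract, per-feature marginal distributions and pairwise correlations in a Gaussian space, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a generic rank-based (Gaussian-copula-style) transform, and the function names `to_gaussian_scores`, `pearson`, and `gaussian_space_correlation` are hypothetical. Each feature is passed through its empirical CDF and then the inverse standard normal CDF, after which ordinary correlations are computed on the transformed values.

```python
import math
from statistics import NormalDist


def to_gaussian_scores(column):
    """Map one feature column to approximately standard-normal scores.

    Assumption: the marginal is estimated by the rank-based empirical CDF,
    u = rank / (n + 1), and then pushed through the inverse normal CDF.
    Ties are broken by input order, which is a simplification.
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def gaussian_space_correlation(columns):
    """Pairwise correlation matrix of the features after the Gaussian mapping."""
    z = [to_gaussian_scores(c) for c in columns]
    d = len(z)
    return [[1.0 if a == b else pearson(z[a], z[b]) for b in range(d)]
            for a in range(d)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [math.exp(v) for v in x]   # monotone but strongly non-Gaussian
    w = [5.0, 3.0, 4.0, 1.0, 2.0]  # loosely decreasing in x
    for row in gaussian_space_correlation([x, y, w]):
        print([round(v, 3) for v in row])
```

Because the mapping depends only on ranks, any strictly increasing transformation of a feature leaves its Gaussian scores unchanged, so `x` and `exp(x)` are perfectly correlated in the mapped space even though their raw marginals differ greatly.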

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
