IEEE Access (Jan 2021)

Convergence and Optimality Analysis of Low-Dimensional Generative Adversarial Networks Using Error Function Integrals

  • Graham W. Pulford
  • Kirill Kondrashov

DOI
https://doi.org/10.1109/ACCESS.2021.3133762
Journal volume & issue
Vol. 9
pp. 165366–165384

Abstract

Following their success at synthesising highly realistic images, generative adversarial networks (GANs) have attracted many claims about optimality and convergence. But what of the vanishing gradients, saturation, and other numerical problems noted by AI practitioners? Attempts to explain these phenomena have so far been based on purely empirical studies or on differential equations that are valid only in the limit. We take a fresh look at these questions using explicit, low-dimensional models. We revisit the well-known optimal discriminator result and, by constructing a counterexample, show that it is not valid in the case of practical interest, where the latent variable has lower dimension than the data: $\dim(\mathbf{z}) < \dim(\mathbf{x})$. To examine convergence issues, we consider a 1-D least squares (LS) GAN with exponentially distributed data, a Rayleigh-distributed latent variable, a square-law generator and a discriminator of the form $D(x) = (1 + \operatorname{erf}(x))/2$, where erf is the error function. We obtain explicit representations of the cost (or loss) function and its derivatives; these are exact down to the evaluation of a well-behaved 1-D integral. We present analytically computed examples of 2-D and 4-D parameter trajectories under gradient-based minimax optimisation. Although the cost function has no saddle points, it generally has a minimum, a maximum and plateau regions. The gradient algorithms typically converge to a plateau, where the gradients vanish and the cost function saturates: an undesirable outcome that carries no implication of optimality for either the generator or the discriminator. The analytical method is compared with stochastic gradient optimisation and shown to be a very accurate predictor of the latter's performance. The quasi-deterministic framework we develop is a powerful analytical tool for understanding the convergence behaviour of low-dimensional GANs based on least-squares cost criteria.
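
To make the 1-D setup concrete, the following is one plausible reading of the least-squares minimax objective for the distributions named above. The generator parameterisation $G(z;\theta) = \theta z^2$, the affine argument of the discriminator and the 0/1 target labels are illustrative assumptions, not details taken from the paper.

```latex
% Hedged sketch of an LS GAN cost consistent with the abstract's setup.
% Assumed: G(z; \theta) = \theta z^2, D(x; a, b) = (1 + \mathrm{erf}(ax + b))/2,
% targets 1 (real) and 0 (fake); D descends J while G ascends it.
\max_{\theta} \, \min_{a,\, b} \; J(\theta, a, b)
  = \mathbb{E}_{x \sim \mathrm{Exp}(\lambda)}\!\left[\bigl(D(x; a, b) - 1\bigr)^{2}\right]
  + \mathbb{E}_{z \sim \mathrm{Rayleigh}(\sigma)}\!\left[D(\theta z^{2}; a, b)^{2}\right]
```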
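
A minimal, self-contained numerical sketch of the same setup follows, under the same assumed parameterisations. Expectations are evaluated by 1-D quadrature, mirroring the abstract's "well-behaved 1-D integral", and the minimax iteration is generic simultaneous descent/ascent with finite-difference gradients, not the paper's exact algorithm.

```python
# Hedged numerical sketch of the abstract's 1-D LS GAN (not the paper's code).
# Assumptions: G(z; theta) = theta * z**2, D(x; a, b) = (1 + erf(a*x + b))/2,
# cost J = E_x[(D(x) - 1)^2] + E_z[D(G(z))^2], with the discriminator
# descending J and the generator ascending it (simultaneous gradient steps).
import numpy as np
from scipy.special import erf
from scipy.integrate import quad

LAM = 1.0   # rate of the exponential data distribution (assumed value)
SIG = 1.0   # scale of the Rayleigh latent distribution (assumed value)

def D(x, a, b):
    """Error-function discriminator with an assumed affine argument."""
    return 0.5 * (1.0 + erf(a * x + b))

def cost(theta, a, b):
    """LS GAN cost; each expectation reduces to a well-behaved 1-D integral."""
    # Real term: x ~ Exp(LAM), target label 1.
    real, _ = quad(lambda x: (D(x, a, b) - 1.0) ** 2 * LAM * np.exp(-LAM * x),
                   0.0, np.inf)
    # Fake term: z ~ Rayleigh(SIG), G(z) = theta * z^2, target label 0.
    fake, _ = quad(lambda z: D(theta * z ** 2, a, b) ** 2
                   * (z / SIG ** 2) * np.exp(-z ** 2 / (2.0 * SIG ** 2)),
                   0.0, np.inf)
    return real + fake

def grad(f, p, eps=1e-5):
    """Central finite-difference gradient of f at the parameter vector p."""
    g = np.zeros_like(p)
    for i in range(len(p)):
        dp = np.zeros_like(p)
        dp[i] = eps
        g[i] = (f(*(p + dp)) - f(*(p - dp))) / (2.0 * eps)
    return g

p = np.array([0.5, 1.0, 0.0])   # [theta, a, b]: generator + discriminator
lr = 0.05                       # common step size (assumed)
for step in range(2001):
    g = grad(cost, p)
    p[0] += lr * g[0]           # generator: gradient ascent on J
    p[1:] -= lr * g[1:]         # discriminator: gradient descent on J
    if step % 500 == 0:
        print(f"step {step:4d}  theta={p[0]:+.3f}  a={p[1]:+.3f}  "
              f"b={p[2]:+.3f}  J={cost(*p):.4f}  |grad|={np.linalg.norm(g):.2e}")
```

Tracking the gradient norm along the trajectory, as in the print statement above, is one simple way to probe the plateau behaviour the abstract describes, where the gradients vanish while the cost saturates.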

Keywords