Patterns (Nov 2023)
Improving reduced-order models through nonlinear decoding of projection-dependent outputs
Abstract
Summary: A fundamental hindrance to building data-driven reduced-order models (ROMs) is the poor topological quality of a low-dimensional data projection. Such a projection can exhibit overlap, twisting, large curvature, or uneven data density, all of which can generate nonuniqueness and steep gradients in quantities of interest (QoIs). Here, we employ an encoder-decoder neural network architecture for dimensionality reduction. We find that nonlinear decoding of projection-dependent QoIs, when embedded in a dimensionality reduction technique, promotes improved low-dimensional representations of complex multiscale and multiphysics datasets. When the data projection (encoding) is shaped by forcing accurate nonlinear reconstruction of the QoIs (decoding), we minimize nonuniqueness and steep gradients in representing QoIs on the projection. This in turn leads to enhanced predictive accuracy of a ROM. Our findings are relevant to a variety of disciplines that develop data-driven ROMs of dynamical systems, such as reacting flows, plasma physics, atmospheric physics, or computational neuroscience.
The bigger picture: Large datasets are abundant in various scientific and engineering disciplines. Multiple physical variables are frequently gathered into one dataset, leading to high data dimensionality. Dimensionality reduction makes it possible to visualize and model such multivariate datasets. However, many reduction techniques to date offer no guarantee that the reduced data representation will possess the desired topological qualities. We show that the quality of reduced data representations can be significantly improved by informing data projections with target quantities of interest (QoIs), some of which are functions of the projection itself. The target QoIs can include closure terms required in modeling, important physical variables, or class labels in the case of categorical data. This work can have particular relevance in data visualization and efficient modeling of dynamical systems with many degrees of freedom, as well as in fundamental research on representation learning.
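As a rough illustration of the idea of decoding projection-dependent QoIs, the following minimal sketch computes the forward pass and reconstruction loss of a small encoder-decoder. It is not the architecture used in this work: the linear encoder, the choice of projected source terms `S @ A` as the projection-dependent QoIs, the single `tanh` hidden layer, the layer sizes, and the random synthetic data are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: samples, state variables, projection dimensions, hidden units.
n_samples, n_vars, n_dims, n_hidden = 200, 6, 2, 8

# Synthetic stand-ins (assumption): state variables X and their source terms S.
X = rng.normal(size=(n_samples, n_vars))
S = rng.normal(size=(n_samples, n_vars))


def init_params():
    """Randomly initialise the encoder matrix and the decoder weights."""
    return {
        "A": rng.normal(scale=0.1, size=(n_vars, n_dims)),     # linear encoder (assumed)
        "W1": rng.normal(scale=0.1, size=(n_dims, n_hidden)),  # decoder hidden layer
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_hidden, n_dims)),  # decoder output layer
        "b2": np.zeros(n_dims),
    }


def encode(params, X):
    """Project the data onto the low-dimensional basis (encoding)."""
    return X @ params["A"]


def decode(params, eta):
    """Nonlinear map from the projection to the predicted QoIs (decoding)."""
    hidden = np.tanh(eta @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"]


def qoi_targets(params, S):
    """Projection-dependent QoIs: source terms projected with the encoder weights."""
    return S @ params["A"]


def loss(params, X, S):
    """Mean squared error between decoded and target QoIs."""
    eta = encode(params, X)
    return float(np.mean((decode(params, eta) - qoi_targets(params, S)) ** 2))


params = init_params()
eta = encode(params, X)
print(eta.shape, loss(params, X, S))
```

In this sketch the targets returned by `qoi_targets` depend on the same matrix `A` that defines the projection, so any update to the encoder changes both the decoder inputs and the quantities it must reconstruct. A training loop built on it would therefore need to recompute the targets after every update of `A`.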