Frontiers in Artificial Intelligence (Nov 2023)

An algorithm for computing Schubert varieties of best fit with applications

  • Karim Karimov,
  • Michael Kirby,
  • Chris Peterson

DOI
https://doi.org/10.3389/frai.2023.1274830
Journal volume & issue
Vol. 6

Abstract

Read online

We propose the geometric framework of the Schubert variety as a tool for representing a collection of subspaces of a fixed vector space. Specifically, given a collection of l-dimensional subspaces V1, …, Vr of ℝn, represented as the column spaces of matrices X1, …, Xr, we seek to determine a representative matrix K∈ℝn×k such that each subspace Vi intersects (or comes close to intersecting) the span of the columns of K in at least c dimensions. We formulate a non-convex optimization problem to determine such a K along with associated sets of vectors {ai} and {bi} used to express linear combinations of the columns of the Xi that are close to linear combinations of the columns of K. Further, we present a mechanism for integrating this representation into an artificial neural network architecture as a computational unit (which we refer to as an abstract node). The representative matrix K can be learned in situ, or sequentially, as part of a learning problem. Additionally, the matrix K can be employed as a change of coordinates in the learning problem. The set of all l-dimensional subspaces of ℝn that intersects the span of the columns of K in at least c dimensions is an example of a Schubert subvariety of the Grassmannian GR(l, n). When it is not possible to find a Schubert variety passing through a collection of points on GR(l, n), the goal of the non-convex optimization problem is to find the Schubert variety of best fit, i.e., the Schubert variety that comes as close as possible to the points. This may be viewed as an analog of finding a subspace of best fit to data in a vector space. The approach we take is well-suited to the modeling of collections of sets of data either as a stand-alone Schubert variety of best fit (SVBF), or in the processing workflow of a deep neural network. We present applications to some classification problems on sets of data to illustrate the behavior of the method.

Keywords