Physchem (Mar 2022)

Advanced Machine Learning Methods for Learning from Sparse Data in High-Dimensional Spaces: A Perspective on Uses in the Upstream of Development of Novel Energy Technologies

  • Sergei Manzhos,
  • Manabu Ihara

DOI
https://doi.org/10.3390/physchem2020006
Journal volume & issue
Vol. 2, no. 2
pp. 72 – 95

Abstract

Read online

Machine learning (ML) has found increasing use in physical sciences, including research on energy conversion and storage technologies, in particular, so-called sustainable technologies. While often ML is used to directly optimize the parameters or phenomena of interest in the space of features, in this perspective, we focus on using ML to construct objects and methods that help in or enable the modeling of the underlying phenomena. We highlight the need for machine learning from very sparse and unevenly distributed numeric data in multidimensional spaces in these applications. After a brief introduction of some common regression-type machine learning techniques, we focus on more advanced ML techniques which use these known methods as building blocks of more complex schemes and thereby allow working with extremely sparse data and also allow generating insight. Specifically, we will highlight the utility of using representations with subdimensional functions by combining the high-dimensional model representation ansatz with machine learning methods such as neural networks or Gaussian process regressions in applications ranging from heterogeneous catalysis to nuclear energy.

Keywords