Toward a simple yet efficient cost function for the optimization of Gaussian process regression model hyperparameters

Bienfait K. Isamura; Paul L. A. Popelier

doi:10.1063/5.0151033

AIP Advances (Sep 2023)

Affiliations

Bienfait K. Isamura: Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
Paul L. A. Popelier: Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom

Journal volume & issue: Vol. 13, no. 9
pp. 095202 – 095202-21

Abstract

FFLUX is a novel machine-learnt force field using pre-trained Gaussian process regression (GPR) models to predict energies and multipole moments of quantum atoms in molecular dynamics simulations. At the heart of FFLUX lies the program FEREBUS, a Fortran90 and OpenMP-parallelized regression engine, which trains and validates GPR models of chemical accuracy. Training a GPR model amounts to finding an optimal set of model hyperparameters (θ). This time-consuming task is usually accomplished by maximizing the marginal/concentrated log-likelihood function L(y|x,θ), a procedure known as the type-II maximum likelihood approach. Unfortunately, this widespread approach can suffer from the propagation of numerical errors, especially in the noise-free regime, where the expected correlation between L(y|x,θ̂) [the maximized value of the L(y|x,θ) function] and the models’ performance may no longer be valid. In this scenario, the L(y|x,θ) function is no longer a reliable guide for model selection. While one could still rely on a pre-conditioner to improve the condition number of the covariance matrix, this choice is never unique and often comes with increased computational cost. Therefore, we have equipped FEREBUS with a simple, intuitive, viable, and less error-prone alternative protocol called “iterative hold-out cross-validation” for the optimization of θ values. This protocol involves (1) a stratified random sampling of both training and validation sets, followed by (2) an iterative minimization of the predictive RMSE(θ) of intermediary models over a sufficiently large validation set. Its greatest asset is the assurance that the optimization process keeps reducing the generalization error of intermediary GPR models on unseen datasets, something that maximizing L(y|x,θ) does not guarantee.
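
The two steps of the protocol described in the abstract can be illustrated with a minimal Python sketch. This is not FEREBUS code: the one-dimensional toy data, the squared-exponential kernel with a single lengthscale standing in for θ, the jitter value, the binning on the target values used for stratification, and the accept-if-better random search are all assumptions made for illustration, and every function name is hypothetical. The abstract does not say which optimizer FEREBUS uses; the search below was chosen only because it makes the non-increasing validation RMSE easy to see.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def gpr_predict(x_train, y_train, x_new, lengthscale, jitter=1e-8):
    """Posterior mean of a zero-mean GPR model (assumed jitter keeps K invertible)."""
    K = rbf_kernel(x_train, x_train, lengthscale) + jitter * np.eye(len(x_train))
    weights = np.linalg.solve(K, y_train)
    return rbf_kernel(x_new, x_train, lengthscale) @ weights

def holdout_rmse(lengthscale, x_train, y_train, x_val, y_val):
    """Predictive RMSE(theta) of an intermediary model on the validation set."""
    residual = gpr_predict(x_train, y_train, x_val, lengthscale) - y_val
    return float(np.sqrt(np.mean(residual**2)))

def stratified_split(y, n_train, n_bins, rng):
    """Step 1: draw the same number of training points from each bin of sorted y."""
    bins = np.array_split(np.argsort(y), n_bins)
    per_bin = n_train // n_bins
    train_idx = np.concatenate(
        [rng.choice(b, size=per_bin, replace=False) for b in bins]
    )
    val_idx = np.setdiff1d(np.arange(len(y)), train_idx)
    return train_idx, val_idx

def optimize_lengthscale(x_train, y_train, x_val, y_val, rng, start=0.1, n_iter=200):
    """Step 2: iterative search that accepts a candidate only if RMSE decreases."""
    best = start
    best_rmse = holdout_rmse(best, x_train, y_train, x_val, y_val)
    history = [best_rmse]
    for _ in range(n_iter):
        # Multiplicative perturbation in log space, clipped to an assumed range.
        candidate = float(np.clip(best * np.exp(rng.normal(scale=0.3)), 1e-2, 1e1))
        rmse = holdout_rmse(candidate, x_train, y_train, x_val, y_val)
        if rmse < best_rmse:
            best, best_rmse = candidate, rmse
        history.append(best_rmse)
    return best, history

# Toy noise-free data set (assumption): y = sin(x) on 80 grid points.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 80)
y = np.sin(x)

train_idx, val_idx = stratified_split(y, n_train=20, n_bins=4, rng=rng)
best_lengthscale, history = optimize_lengthscale(
    x[train_idx], y[train_idx], x[val_idx], y[val_idx], rng
)
print(f"lengthscale = {best_lengthscale:.3f}, validation RMSE = {history[-1]:.2e}")
```

Because a candidate is kept only when it lowers the hold-out RMSE, the recorded history never increases, which mirrors the guarantee the abstract attributes to the protocol. The likelihood never enters the loop, so the search does not depend on the log-determinant of the covariance matrix.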

Published in AIP Advances

ISSN: 2158-3226 (Online)
Publisher: AIP Publishing LLC
Country of publisher: United States
LCC subjects: Science: Physics
Website: http://aipadvances.aip.org/