IEEE Access (Jan 2020)
Bayesian Contextual Bandits for Hyperparameter Optimization
Abstract
Hyperparameter optimization (HPO) is a crucial step in modern machine learning systems. Bayesian optimization (BO) has shown great promise for HPO, where each hyperparameter configuration is evaluated through a black-box procedure. However, the main drawback of BO is the high computational cost of each evaluation, which limits its application to large deep models. There is therefore strong motivation to reduce the number of training epochs required by exploiting the partial information that iterative training provides. Recent approaches use probabilistic models that explicitly extrapolate learning curves in order to early-stop poorly performing configurations. However, these approaches either impose strong prior assumptions on the distribution of learning curves and incur substantial computational overhead, or rely heavily on predefined rules that do not generalize across settings. To tackle the challenge of allocating training resources over an infinite hyperparameter search space and over the time horizon, we study the HPO problem in a Bayesian contextual bandit setting and derive several fully dynamic strategies that are information-efficient and scalable. Extensive experiments demonstrate that the proposed strategies significantly speed up the HPO process.
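To make the bandit-based resource-allocation idea concrete, below is a minimal, hypothetical sketch, not the paper's algorithm: a Thompson-sampling bandit that allocates a fixed budget of training epochs across a finite pool of hyperparameter configurations. The saturating toy learning curve, the known-variance Gaussian posterior, and all names and constants are illustrative assumptions introduced here for demonstration only.

```python
# Illustrative sketch (assumptions labeled): Thompson sampling allocates
# training epochs across candidate hyperparameter configurations.
import numpy as np

rng = np.random.default_rng(0)

n_configs = 8          # assumed finite pool of candidate configurations
budget = 60            # total training epochs to allocate
noise_sd = 0.05        # assumed observation noise of validation accuracy

# Hypothetical ground-truth asymptotic accuracies, unknown to the bandit.
true_asymptote = rng.uniform(0.6, 0.95, size=n_configs)

# Per-arm Gaussian posterior over final accuracy (known-variance model):
# tracked with a running mean and an observation count.
counts = np.zeros(n_configs)
means = np.full(n_configs, 0.5)   # neutral prior mean

def observe(arm, epoch):
    """Noisy validation accuracy after `epoch` epochs: toy saturating curve."""
    progress = 1.0 - np.exp(-epoch / 10.0)
    return true_asymptote[arm] * progress + rng.normal(0.0, noise_sd)

epochs_used = np.zeros(n_configs, dtype=int)
for t in range(budget):
    # Thompson sampling: draw one plausible final accuracy per arm,
    # then spend the next epoch on the arm with the best draw.
    posterior_sd = noise_sd / np.sqrt(counts + 1.0)
    samples = rng.normal(means, posterior_sd)
    arm = int(np.argmax(samples))

    epochs_used[arm] += 1
    reward = observe(arm, epochs_used[arm])

    # Conjugate Gaussian update of the running posterior mean.
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]

print("epochs per config:", epochs_used)
print("best config found:", int(np.argmax(means)),
      "| true best:", int(np.argmax(true_asymptote)))
```

Because arms trained for more epochs climb their learning curves and report higher rewards, the posterior naturally concentrates the remaining budget on promising configurations and starves poor ones, the same dynamic the bandit formulation exploits for early stopping.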
Keywords