Applied Sciences (Dec 2024)

Bayesian Optimization for Instruction Generation

  • Antonio Sabbatella,
  • Francesco Archetti,
  • Andrea Ponti,
  • Ilaria Giordani,
  • Antonio Candelieri

DOI: https://doi.org/10.3390/app142411865
Journal volume & issue: Vol. 14, No. 24, p. 11865

Abstract

The performance of Large Language Models (LLMs) strongly depends on selecting the best instructions for different downstream tasks, especially in the case of black-box LLMs. This study introduces BOInG (Bayesian Optimization for Instruction Generation), a method that leverages Bayesian Optimization (BO) to generate instructions efficiently while addressing the combinatorial nature of instruction search. Over the last decade, BO has emerged as a highly effective optimization method in various domains due to its flexibility and sample efficiency. At its core, BOInG performs Bayesian search in a low-dimensional continuous space and projects candidate solutions into a high-dimensional token embedding space to retrieve discrete tokens. These tokens act as seeds for the generation of human-readable, task-relevant instructions. Experimental results demonstrate that BOInG achieves performance comparable or superior to state-of-the-art methods such as InstructZero and Instinct, with substantially lower resource requirements, while also enabling the use of both white-box and black-box models. The approach offers both theoretical and practical benefits without requiring specialized hardware.
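
The core loop described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it uses scikit-optimize's gp_minimize as a stand-in BO engine, and the vocabulary, token embedding matrix, random projection, and scoring function (VOCAB, token_embeddings, projection, score_instruction) are toy placeholders introduced here only to show how a low-dimensional continuous point is projected into embedding space and mapped to discrete seed tokens.

    # Minimal sketch of a BOInG-style loop (assumptions noted above; not the paper's code).
    import numpy as np
    from skopt import gp_minimize

    rng = np.random.default_rng(0)

    VOCAB = ["summarize", "translate", "answer", "step", "carefully",
             "briefly", "explain", "list", "rewrite", "classify"]   # toy vocabulary
    EMB_DIM = 64          # token embedding size (placeholder)
    D_LOW = 5             # dimension of the continuous BO search space
    N_SEED_TOKENS = 3     # discrete tokens retrieved per candidate

    token_embeddings = rng.normal(size=(len(VOCAB), EMB_DIM))  # stand-in for LLM embeddings
    projection = rng.normal(size=(D_LOW, EMB_DIM))             # low-dim -> embedding space

    def soft_prompt_to_tokens(z):
        """Project a low-dimensional point into embedding space and
        retrieve the nearest discrete tokens (cosine similarity)."""
        v = np.asarray(z) @ projection                         # shape (EMB_DIM,)
        sims = token_embeddings @ v / (
            np.linalg.norm(token_embeddings, axis=1) * np.linalg.norm(v) + 1e-9)
        top = np.argsort(-sims)[:N_SEED_TOKENS]
        return [VOCAB[i] for i in top]

    def score_instruction(seed_tokens):
        """Placeholder for downstream-task evaluation: in BOInG the seed tokens
        prompt an LLM to write an instruction, which is then scored on the task.
        Here a hypothetical 'good' token combination is rewarded instead."""
        target = {"summarize", "briefly", "carefully"}
        return len(target.intersection(seed_tokens)) / len(target)

    def objective(z):
        tokens = soft_prompt_to_tokens(z)
        return -score_instruction(tokens)                      # gp_minimize minimizes

    result = gp_minimize(objective,
                         dimensions=[(-1.0, 1.0)] * D_LOW,     # low-dim continuous box
                         n_calls=25, random_state=0)
    best_tokens = soft_prompt_to_tokens(result.x)
    print("best seed tokens:", best_tokens, "score:", -result.fun)

In the actual method, the retrieved tokens would seed an LLM prompt that generates a candidate instruction, and that instruction's downstream-task score would be fed back to the Bayesian optimizer as the objective value.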

Keywords