A general supply-inspect cost framework to regulate the reliability-usability trade-offs for few-shot inference

Fernando Martínez-Plumed; Gonzalo Jaimovitch-López; Cèsar Ferri; María José Ramírez-Quintana; José Hernández-Orallo

doi:10.1007/s40747-024-01599-6

Complex & Intelligent Systems (Aug 2024)

Affiliations

Fernando Martínez-Plumed: VRAIN, Universitat Politècnica de València
Gonzalo Jaimovitch-López: VRAIN, Universitat Politècnica de València
Cèsar Ferri: VRAIN, Universitat Politècnica de València
María José Ramírez-Quintana: VRAIN, Universitat Politècnica de València
José Hernández-Orallo: VRAIN, Universitat Politècnica de València

DOI: https://doi.org/10.1007/s40747-024-01599-6
Journal volume & issue: Vol. 10, no. 6, pp. 8287–8317

Abstract

Language models and other recent machine learning paradigms blur the distinction between generative and discriminative tasks, in a continuum that is regulated by the degree of pre- and post-supervision that is required from users, as well as the tolerated level of error. In few-shot inference, we need to find a trade-off between the number and cost of the solved examples that have to be supplied, those that have to be inspected (some of them accurate but others needing correction) and those that are wrong but pass undetected. In this paper, we define a new Supply-Inspect Cost Framework, associated graphical representations and comprehensive metrics that consider all these elements. To optimise few-shot inference under specific operating conditions, we introduce novel algorithms that go beyond the concept of rejection rules in both static and dynamic contexts. We illustrate the effectiveness of all these elements for a transformative domain, data wrangling, for which language models can have a huge impact if we are able to properly regulate the reliability-usability trade-off, as we do in this paper.
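
As a rough illustration of the trade-off the abstract describes, the Python sketch below totals the cost of supplied examples, inspected outputs, corrections and undetected errors under a simple confidence-threshold rejection rule. It is not the paper's framework or its algorithms: the `Costs` fields, the function names, the cost values and the threshold rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Costs:
    """Hypothetical unit costs; the names and values are illustrative only."""
    supply: float      # cost of supplying one solved example in the prompt
    inspect: float     # cost of a human inspecting one model output
    correct: float     # extra cost of fixing an inspected output that is wrong
    undetected: float  # cost of a wrong output that is accepted unchecked


def total_cost(n_supplied: int,
               confidences: Sequence[float],
               is_correct: Sequence[bool],
               threshold: float,
               costs: Costs) -> float:
    """Total cost when outputs with confidence below `threshold` are inspected."""
    if len(confidences) != len(is_correct):
        raise ValueError("confidences and is_correct must have the same length")
    n_inspected = n_corrected = n_undetected = 0
    for conf, ok in zip(confidences, is_correct):
        if conf < threshold:
            # Rejected by the rule: a human inspects it and fixes it if wrong.
            n_inspected += 1
            if not ok:
                n_corrected += 1
        elif not ok:
            # Accepted without inspection although wrong.
            n_undetected += 1
    return (n_supplied * costs.supply
            + n_inspected * costs.inspect
            + n_corrected * costs.correct
            + n_undetected * costs.undetected)


def best_threshold(n_supplied: int,
                   confidences: Sequence[float],
                   is_correct: Sequence[bool],
                   costs: Costs) -> Tuple[float, float]:
    """Return the (threshold, cost) pair with the lowest total cost.

    Candidates run from 0.0 (inspect nothing, assuming non-negative
    confidences) through each observed confidence to infinity (inspect all).
    """
    candidates = [0.0] + sorted(set(confidences)) + [float("inf")]
    scored = [(t, total_cost(n_supplied, confidences, is_correct, t, costs))
              for t in candidates]
    return min(scored, key=lambda pair: pair[1])


# Made-up data: four outputs, two of them wrong.
example_costs = Costs(supply=2.0, inspect=1.0, correct=3.0, undetected=10.0)
example_conf = [0.9, 0.4, 0.8, 0.3]
example_ok = [True, False, False, True]
print(best_threshold(3, example_conf, example_ok, example_costs))
```

Raising the threshold moves outputs from the unchecked group to the inspected group, trading inspection effort for fewer undetected errors. Selecting the threshold requires knowing which outputs are correct, so in practice it would be tuned on labelled validation data.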

Published in Complex & Intelligent Systems

ISSN: 2199-4536 (Print); 2198-6053 (Online)
Publisher: Springer
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science; Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://www.springer.com/journal/40747
