Complex & Intelligent Systems (Aug 2024)

A general supply-inspect cost framework to regulate the reliability-usability trade-offs for few-shot inference

  • Fernando Martínez-Plumed
  • Gonzalo Jaimovitch-López
  • Cèsar Ferri
  • María José Ramírez-Quintana
  • José Hernández-Orallo

DOI
https://doi.org/10.1007/s40747-024-01599-6
Journal volume & issue
Vol. 10, no. 6
pp. 8287 – 8317

Abstract

Language models and other recent machine learning paradigms blur the distinction between generative and discriminative tasks, in a continuum that is regulated by the degree of pre- and post-supervision that is required from users, as well as the tolerated level of error. In few-shot inference, we need to find a trade-off between the number and cost of the solved examples that have to be supplied, those that have to be inspected (some of them accurate but others needing correction) and those that are wrong but pass undetected. In this paper, we define a new Supply-Inspect Cost Framework, associated graphical representations and comprehensive metrics that consider all these elements. To optimise few-shot inference under specific operating conditions, we introduce novel algorithms that go beyond the concept of rejection rules in both static and dynamic contexts. We illustrate the effectiveness of all these elements for a transformative domain, data wrangling, for which language models can have a huge impact if we are able to properly regulate the reliability-usability trade-off, as we do in this paper.
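
The trade-off named in the abstract can be made concrete with a small sketch. This is an illustration only and not the paper's formulation: the additive cost model, the `Costs` fields, the `total_cost` function and the confidence-threshold rejection rule are all assumptions introduced here. The sketch charges for each supplied example, for each inspected output (plus a correction cost when the inspected output is wrong), and for each wrong output that is accepted without inspection.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Hypothetical per-item costs (arbitrary units, not taken from the paper)."""
    supply: float      # providing one solved example to the model
    inspect: float     # checking one model output
    correct: float     # extra cost of fixing an inspected output that was wrong
    undetected: float  # a wrong output accepted without inspection


def total_cost(n_supplied, predictions, threshold, costs):
    """Total cost under a simple static rejection rule (an assumed model).

    predictions: list of (confidence, is_correct) pairs.
    Outputs with confidence below `threshold` are sent to inspection;
    the rest are accepted as they are.
    """
    cost = n_supplied * costs.supply
    for confidence, is_correct in predictions:
        if confidence < threshold:
            cost += costs.inspect
            if not is_correct:
                cost += costs.correct
        elif not is_correct:
            cost += costs.undetected
    return cost


if __name__ == "__main__":
    # Made-up predictions, used only to exercise the function.
    preds = [(0.9, True), (0.6, False), (0.8, False), (0.4, True)]
    costs = Costs(supply=1.0, inspect=0.5, correct=1.0, undetected=5.0)
    for t in (0.0, 0.7, 1.01):
        print(t, total_cost(3, preds, t, costs))
```

With these made-up numbers, inspecting nothing (threshold 0.0) is the most expensive option because both errors pass undetected, while inspecting everything (threshold above 1) is the cheapest. Different cost ratios would reverse that ordering, which is why the operating condition matters.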

Keywords