IEEE Access (Jan 2024)

A Robust Framework for Distributional Shift Detection Under Sample-Bias

  • Birk Torpmann-Hagen,
  • Michael A. Riegler,
  • Pål Halvorsen,
  • Dag Johansen

DOI
https://doi.org/10.1109/ACCESS.2024.3393296
Journal volume & issue
Vol. 12
pp. 59598–59611

Abstract


Deep Neural Networks have been shown to perform poorly or even fail altogether when deployed in real-world settings, despite exhibiting excellent performance on initial benchmarks. This typically occurs due to changes in the nature of the production data relative to the training data, often referred to as distributional shifts. In an attempt to increase the transparency, trustworthiness, and overall utility of deep learning systems, a growing body of work has been dedicated to developing distributional shift detectors. As part of our work, we investigate distributional shift detectors that apply statistical tests to neural-network-based representations of the data. We show that these methods are prone to failure under sample-bias, which we argue is unavoidable in most practical machine learning systems. To mitigate this, we implement a novel distributional shift detection framework which explicitly accounts for sample-bias via a simple sample-selection procedure. In particular, we show that the effect of sample-bias can be significantly reduced by performing statistical tests against the most similar data in the training set, rather than against the training set as a whole. We find that this improves the stability and accuracy of a variety of distributional shift detection methods on both covariate and semantic shifts, with improvements to balanced accuracy typically ranging between 0.1 and 0.2, and false positive rates under bias often eliminated altogether.
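
The following is a minimal sketch of the general idea described in the abstract: rather than testing an incoming batch of network representations against the full training set, first select the most similar training representations and run the statistical test against that subset. The specific choices here (Euclidean distance to the batch centroid for similarity, per-dimension Kolmogorov-Smirnov tests with a Bonferroni correction, and the subset size k) are illustrative assumptions, not the paper's actual selection procedure or test battery.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_shift_with_sample_selection(train_feats, test_feats, k=500, alpha=0.05):
    """Flag a distributional shift for a test batch of feature vectors.

    train_feats: (n_train, d) array of training-set representations
                 (e.g. penultimate-layer activations).
    test_feats:  (n_test, d) array of representations for the deployed batch.

    Hypothetical sample-selection step: compare the batch against only the
    k most similar training samples instead of the whole training set.
    """
    # Rank training samples by distance to the test-batch centroid
    # (an assumed similarity criterion) and keep the k closest.
    centroid = test_feats.mean(axis=0)
    dists = np.linalg.norm(train_feats - centroid, axis=1)
    reference = train_feats[np.argsort(dists)[:k]]

    # Per-dimension two-sample KS tests between the selected reference
    # subset and the test batch, with a Bonferroni correction over the
    # d feature dimensions.
    d = test_feats.shape[1]
    p_values = np.array([
        ks_2samp(reference[:, j], test_feats[:, j]).pvalue for j in range(d)
    ])
    return bool((p_values < alpha / d).any())
```

In this sketch, a biased test batch (e.g. images drawn from only one clinic or camera) is no longer compared against the full, heterogeneous training distribution, which is the situation the abstract identifies as a source of false positives; the reference subset is narrowed to the training data most comparable to the batch before the test is applied.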

Keywords