Frontiers in Artificial Intelligence (Nov 2024)
Active learning with human heuristics: an algorithm robust to labeling bias
Abstract
Active learning enables prediction models to achieve better performance faster by adaptively querying an oracle for the labels of data points. Sometimes the oracle is a human, for example a doctor providing a medical diagnosis. The behavioral sciences suggest that, because people employ heuristics, they may exhibit biases in labeling. How does modeling the oracle as a human heuristic affect the performance of active learning algorithms? And if performance drops, can one design active learning algorithms that are robust to labeling bias? The present article provides answers. We investigate two established human heuristics (the fast-and-frugal tree and the tallying model) combined with four active learning algorithms (entropy sampling, multi-view learning, conventional information density, and our proposal, inverse information density) and three standard classifiers (logistic regression, random forests, and support vector machines), applying these combinations to 15 datasets in which people routinely provide labels, drawn from health as well as other domains such as marketing and transportation. There are two main results. First, we show that when a heuristic provides the labels, the performance of active learning algorithms drops significantly, sometimes below random. It is therefore key to design active learning algorithms that are robust to labeling bias. Our second contribution is such a robust algorithm: the proposed inverse information density algorithm, inspired by human psychology, achieves an overall improvement of 87% over the best of the other algorithms. In conclusion, the design and benchmarking of active learning algorithms can benefit from incorporating models of human heuristics.
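To make the querying step concrete, the sketch below shows entropy sampling, the first of the four query strategies named above: the learner scores each unlabeled point by the Shannon entropy of its predicted class distribution and queries the oracle for the most uncertain one. This is a minimal illustration, not the article's implementation; the toy nearest-centroid classifier here merely stands in for the logistic regression, random forest, or SVM used in the study.

```python
import numpy as np

def predict_proba(X_labeled, y_labeled, X_pool):
    """Toy probabilistic classifier: softmax over negative distances
    to per-class centroids (a stand-in for any real classifier)."""
    classes = np.unique(y_labeled)
    centroids = np.array([X_labeled[y_labeled == c].mean(axis=0) for c in classes])
    # Distance from each pool point to each class centroid.
    dists = np.linalg.norm(X_pool[:, None, :] - centroids[None, :, :], axis=2)
    logits = -dists
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def entropy_sampling(probs):
    """Return the index of the pool point with maximum predictive entropy,
    i.e. the point the active learner would send to the oracle next."""
    H = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return int(H.argmax())
```

In the paper's setting the oracle answering that query is not a ground-truth labeler but a human heuristic (a fast-and-frugal tree or a tallying model), which is what introduces the labeling bias the proposed inverse information density algorithm is designed to withstand.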
Keywords