IEEE Access (Jan 2024)
Shortcut Learning Explanations for Deep Natural Language Processing: A Survey on Dataset Biases
Abstract
The introduction of pre-trained large language models (LLMs) has transformed NLP by fine-tuning task-specific datasets, enabling notable advancements in news classification, language translation, and sentiment analysis. This has revolutionized the field, driving remarkable breakthroughs and progress. However, the growing recognition of bias in textual data has emerged as a critical focus in the NLP community, revealing the inherent limitations of models trained on specific datasets. LLMs exploit these dataset biases and artifacts as expedient shortcuts for prediction. The reliance of LLMs on dataset bias and artifacts as shortcuts for prediction has hindered their generalizability and adversarial robustness. Addressing this issue is crucial to enhance the reliability and resilience of LLMs in various contexts. This survey provides a comprehensive overview of the rapidly growing body of research on shortcut learning in language models, classifying the research into four main areas: the factors of shortcut learning, the origin of bias, the detection methods of dataset biases, and understanding mitigation strategies to address data biases. The goal of this study is to offer a contextualized, in-depth look at the state of learning models, highlighting the major areas of attention and suggesting possible directions for further research.
Keywords