IEEE Access (Jan 2024)

Debiased Learning via Composed Conceptual Sensitivity Regularization

  • Sunghwan Joo,
  • Taesup Moon

DOI
https://doi.org/10.1109/ACCESS.2024.3477454
Journal volume & issue
Vol. 12
pp. 170295 – 170308

Abstract


Deep neural networks often rely on spurious features, which are attributes correlated with class labels but irrelevant to the actual task, leading to poor generalization when these features are absent. To train classifiers that are not biased towards spurious features, recent research has leveraged explainable AI (XAI) techniques to identify and modify model behavior. Specifically, Concept Activation Vectors (CAVs), which indicate the direction toward specific concepts in the embedding space, have been used to measure and regularize the conceptual sensitivity of the classifier, thereby reducing its reliance on spurious features. However, these approaches struggle with non-linear or high-dimensional spurious correlations because they rely on linear CAVs. In this paper, we propose Composed Conceptual Sensitivity Regularization (CCSR), a novel method designed to address these limitations. CCSR utilizes concept gradients to assign an individualized CAV to each sample, enabling it to handle spurious features that are non-linearly distributed in the embedding space. Additionally, our method employs multiple CAVs for regularization, effectively mitigating spurious features both locally and globally. To the best of our knowledge, our research is the first to consider the non-linearity of spurious features in model bias regularization. Our results show that CCSR outperforms existing methods on several benchmarks, e.g., the Waterbirds, CatDogs, and CelebA-Collars datasets, both with and without group labels on the validation set, even when minority samples are absent from the training set. These findings highlight the potential of CCSR to improve model robustness and generalization.
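The core quantity the abstract describes, conceptual sensitivity, is the directional derivative of a class score along a concept direction in embedding space; regularization then penalizes its magnitude. Below is a minimal NumPy sketch of that idea with per-sample CAVs, as in CCSR's use of concept gradients. The function names and array shapes are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def conceptual_sensitivity(grad_embedding, cav):
    # Directional derivative of the class score along one concept direction:
    # a large magnitude means the prediction changes when the concept changes.
    # grad_embedding: (d,) gradient of the class score w.r.t. the embedding
    # cav: (d,) unit concept activation vector
    return grad_embedding @ cav

def ccsr_penalty(grad_embeddings, cavs):
    # Penalize sensitivity to multiple per-sample concept directions.
    # grad_embeddings: (batch, d) score gradients w.r.t. embeddings
    # cavs: (batch, k, d) k concept directions assigned to each sample
    # (per-sample CAVs are what lets the method follow non-linearly
    # distributed spurious features in the embedding space)
    sens = np.einsum('bd,bkd->bk', grad_embeddings, cavs)
    return float(np.mean(sens ** 2))
```

In training, a term like `loss = task_loss + lam * ccsr_penalty(...)` would push the classifier's score gradients to be orthogonal to the spurious-concept directions, so the prediction stops depending on those concepts.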

Keywords