GBMix: Enhancing Fairness by Group-Balanced Mixup

Sangwoo Hong; Youngseok Yoon; Hyungjun Joo; Jungwoo Lee

doi:10.1109/ACCESS.2024.3358275

IEEE Access (Jan 2024)

Affiliations

Sangwoo Hong: Department of Electrical and Computer Engineering, Communications and Machine Learning Laboratory, Seoul National University, Seoul, South Korea
Youngseok Yoon: Department of Electrical and Computer Engineering, Communications and Machine Learning Laboratory, Seoul National University, Seoul, South Korea
Hyungjun Joo: Department of Electrical and Computer Engineering, Communications and Machine Learning Laboratory, Seoul National University, Seoul, South Korea
Jungwoo Lee: Department of Electrical and Computer Engineering, Communications and Machine Learning Laboratory, Seoul National University, Seoul, South Korea

DOI: https://doi.org/10.1109/ACCESS.2024.3358275
Journal volume & issue: Vol. 12
pp. 18877 – 18887

Abstract

Mixup is a powerful data augmentation strategy that has been shown to improve the generalization and adversarial robustness of machine learning classifiers, particularly in computer vision applications. Despite its simplicity and effectiveness, the impact of Mixup on model fairness has not yet been thoroughly investigated. In this paper, we demonstrate that Mixup can perpetuate or even exacerbate bias present in the training set. We provide insight into the reasons behind this behavior and propose GBMix, a group-balanced Mixup strategy for training fair classifiers. GBMix groups the training examples by their attributes and balances the Mixup ratio between the groups. By reorganizing and balancing Mixup among groups, GBMix improves average and worst-case accuracy at the same time. We empirically show that GBMix mitigates bias in the training set and reduces the performance gap between groups. This effect is observed across a range of datasets and networks, and GBMix outperforms all state-of-the-art methods.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
