IEEE Access (Jan 2023)
Measuring Imbalance on Intersectional Protected Attributes and on Target Variable to Forecast Unfair Classifications
Abstract
Bias in software systems is a serious threat to human rights: when software makes decisions that allocate resources or opportunities, it may disparately impact people based on personal traits (e.g., gender or ethnic group), systematically (dis)advantaging certain social groups. The cause is very often imbalance in the training data, that is, an unequal distribution of data among the classes of an attribute. Previous studies showed that lower levels of balance in protected attributes are associated with higher levels of unfairness in the output. In this paper, we contribute to the current knowledge on balance measures as risk indicators of systematic discrimination by studying imbalance in two further respects: the intersectionality among the classes of protected attributes, and the combination of the target variable with protected attributes. We conduct an empirical study to verify whether: i) the balance of intersectional attributes can be inferred from the balance of the primary attributes, ii) measures of balance on intersectional attributes are helpful for detecting unfairness in the classification outcome, and iii) computing balance on the combination of a target variable with protected attributes improves the detection of unfairness. Overall, the results reveal positive answers, but not for every combination of balance measure and fairness criterion. For this reason, when applying our risk-based approach to real cases, we recommend selecting the fairness and balance measures most suitable to the application context.
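To make the notions of primary, intersectional, and target-combined balance concrete, the sketch below shows one plausible way to compute a balance measure on each of them. It is only an illustration, not the paper's actual procedure: the normalized-Shannon-entropy balance measure, the column names (gender, ethnicity, hired), and the toy data are assumptions made for the example.

```python
# Illustrative sketch (assumed, not the paper's code): balance of a single
# protected attribute vs. an intersectional attribute vs. the combination
# of the target variable with a protected attribute.
import numpy as np
import pandas as pd

def shannon_balance(series: pd.Series) -> float:
    """Normalized Shannon entropy of the class distribution:
    1.0 = perfectly balanced, values near 0 = highly imbalanced."""
    p = series.value_counts(normalize=True).to_numpy()
    k = len(p)
    if k <= 1:
        return 1.0
    return float(-(p * np.log(p)).sum() / np.log(k))

# Hypothetical dataset with two protected attributes and a binary target.
df = pd.DataFrame({
    "gender":    ["F", "F", "M", "M", "M", "M", "F", "M"],
    "ethnicity": ["A", "B", "A", "A", "B", "A", "A", "A"],
    "hired":     [1, 0, 1, 1, 0, 1, 0, 1],
})

# Balance of each primary protected attribute.
print(shannon_balance(df["gender"]))
print(shannon_balance(df["ethnicity"]))

# Balance of the intersectional attribute (gender x ethnicity),
# obtained by treating each combination of classes as one class.
intersection = df["gender"] + "_" + df["ethnicity"]
print(shannon_balance(intersection))

# Balance of the target variable combined with a protected attribute.
target_combo = df["gender"] + "_" + df["hired"].astype(str)
print(shannon_balance(target_combo))
```

Any class-distribution balance index (e.g., Gini- or entropy-based) could be plugged into the same scheme; the key point the abstract raises is that balance is evaluated not only per attribute but also on the cross-product of protected attributes and on their combination with the target.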
Keywords