Proceedings of the International Conference on Applied Innovations in IT (Nov 2023)
Demographic Bias in Medical Datasets for Clinical AI
Abstract
Numerous studies have documented demographic bias in medical data and in artificial intelligence (AI) systems used in medical settings. These studies have also shown that such biases can significantly affect access to care, quality of care, and quality of life for patients belonging to certain under-represented groups. These groups are marginalised because of stigma based on demographic characteristics such as race, gender, age, and ability. Since the performance of AI models depends heavily on the quality of the data used to train them, it is a necessary precaution to analyse the data for any bias inadvertently present in it, in order to mitigate the consequences of using biased data to build medical AI systems. For that reason, we propose a machine learning (ML) analysis that takes patient biosignals as input and analyses them for two types of demographic bias, namely gender bias and age bias. The analysis is performed using several ML algorithms (Logistic Regression, Decision Trees, Random Forest, and XGBoost). The trained models are evaluated with a holdout technique and by inspecting their confusion matrices and classification reports. The results show that the models are capable of detecting bias in data. This makes the proposed approach one way to identify bias in data, especially during the process of building AI-based medical systems. Consequently, the proposed pipeline can serve as a bias analysis step when mitigating bias in data.
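As an illustration only, the evaluation procedure named in the abstract (several classifiers, a holdout split, confusion matrices, and classification reports) could be sketched with scikit-learn as follows. The abstract does not state how bias is operationalised. This sketch assumes one possible reading: a classifier is trained to predict a demographic attribute from signal features, and holdout performance well above chance suggests that the data encode that attribute. The synthetic data, the feature count, the 70/30 split, and all variable names are hypothetical. XGBoost is omitted because it requires the separate `xgboost` package.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for biosignal features (hypothetical, not the paper's data).
rng = np.random.default_rng(0)
n_samples = 600
group = rng.integers(0, 2, size=n_samples)   # hypothetical binary demographic label
X = rng.normal(size=(n_samples, 5))
X[:, 0] += 1.5 * group                       # one feature carries demographic information

# Holdout evaluation; the 70/30 ratio is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, group, test_size=0.3, stratify=group, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)
    # Accuracy is the share of holdout samples on the confusion-matrix diagonal.
    results[name] = {"confusion_matrix": cm, "accuracy": float(np.trace(cm) / cm.sum())}
    print(name)
    print(cm)
    print(classification_report(y_test, y_pred))
```

In this toy setting, accuracy near 0.5 would indicate that the features carry little information about the demographic attribute, while clearly higher values would indicate that they do.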
Keywords