Deconstructing demographic bias in speech-based machine learning models for digital health

Michael Yang; Abd-Allah El-Attar; Theodora Chaspari

doi:10.3389/fdgth.2024.1351637

Frontiers in Digital Health (Jul 2024)

Affiliations

Michael Yang: Computer Science & Engineering, Texas A&M University, College Station, TX, United States
Abd-Allah El-Attar: Computer Science & Engineering, Texas A&M University Qatar, Al Rayyan, Qatar
Theodora Chaspari: Institute of Cognitive Science & Computer Science, University of Colorado Boulder, Boulder, CO, United States

DOI: https://doi.org/10.3389/fdgth.2024.1351637
Journal volume & issue: Vol. 6

Abstract

Introduction: Machine learning (ML) algorithms have been heralded as promising solutions to the realization of assistive systems in digital healthcare, due to their ability to detect fine-grain patterns that are not easily perceived by humans. Yet, ML algorithms have also been critiqued for treating individuals differently based on their demography, thus propagating existing disparities. This paper explores gender and race bias in speech-based ML algorithms that detect behavioral and mental health outcomes.

Methods: This paper examines potential sources of bias in the data used to train the ML, encompassing acoustic features extracted from speech signals and associated labels, as well as in the ML decisions. The paper further examines approaches to reduce existing bias via using the features that are the least informative of one’s demographic information as the ML input, and transforming the feature space in an adversarial manner to diminish the evidence of the demographic information while retaining information about the focal behavioral and mental health state.

Results: Results are presented in two domains, the first pertaining to gender and race bias when estimating levels of anxiety, and the second pertaining to gender bias in detecting depression. Findings indicate the presence of statistically significant differences in both acoustic features and labels among demographic groups, as well as differential ML performance among groups. The statistically significant differences present in the label space are partially preserved in the ML decisions. Although variations in ML performance across demographic groups were noted, results are mixed regarding the models’ ability to accurately estimate healthcare outcomes for the sensitive groups.

Discussion: These findings underscore the necessity for careful and thoughtful design in developing ML models that are capable of maintaining crucial aspects of the data and perform effectively across all populations in digital healthcare applications.

Published in Frontiers in Digital Health

ISSN: 2673-253X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Public aspects of medicine; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/digital-health#
