Privacy-Oriented Manipulation of Speaker Representations

Francisco Teixeira; Alberto Abad; Bhiksha Raj; Isabel Trancoso

doi:10.1109/ACCESS.2024.3409067

IEEE Access (Jan 2024)

Affiliations

Francisco Teixeira: INESC-ID/Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
Alberto Abad: INESC-ID/Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
Bhiksha Raj: LTI, Carnegie Mellon University, Pittsburgh, PA, USA
Isabel Trancoso: INESC-ID/Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal

DOI: https://doi.org/10.1109/ACCESS.2024.3409067
Journal volume & issue: Vol. 12
pp. 82949 – 82971

Abstract

Speaker embeddings are ubiquitous, with applications ranging from speaker recognition and diarization to speech synthesis and voice anonymization. The amount of information held by these embeddings lends them versatility but also raises privacy concerns. Speaker embeddings have been shown to contain sensitive information, including the speaker’s age, sex, health state and more – in other words, information that speakers may want to keep private, especially when it is not required for the target task. In this work, we propose a method for removing and manipulating private attribute information in speaker representations that leverages a Vector-Quantized Variational Autoencoder architecture combined with an adversarial classifier and a novel mutual information loss. We validate our model on two attributes, sex and age, and perform experiments to remove or manipulate this information using ignorant and informed attackers. The model is tested with in-domain and out-of-domain data to assess its robustness, and the resulting speaker representations are used in a speaker verification scenario to validate their utility. Our results show that our model obtains a strong trade-off between utility and privacy, achieving age and sex classification results near chance level for both attackers and yielding little impact on speaker verification performance.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
