Phantom in the opera: adversarial music attack for robot dialogue system

Sheng Li; Jiyi Li; Yang Cao

doi:10.3389/fcomp.2024.1355975

Frontiers in Computer Science (Feb 2024)

Affiliations

Sheng Li: National Institute of Information and Communications Technology, Kyoto, Japan
Jiyi Li: University of Yamanashi, Kofu, Japan
Yang Cao: Hokkaido University, Sapporo, Japan

DOI: https://doi.org/10.3389/fcomp.2024.1355975
Journal volume & issue: Vol. 6

Abstract

This study examines the vulnerability of the automatic speech recognition (ASR) module in robot dialogue systems to adversarial music attacks, using music as a natural camouflage for such attacks. We propose a novel method that hides ghost speech commands in a music clip by slightly perturbing its raw waveform. We apply the attack to an industry-popular ASR model, the time-delay neural network (TDNN), which is widely used for speech and speaker recognition. Our experiment demonstrates that adversarial music crafted by this attack can easily mislead industry-level TDNN models into picking up ghost commands with high success rates, while sounding no different from the original music to the human ear. This reveals a serious threat that adversarial music poses to robot dialogue systems and calls for effective defenses against such stealthy attacks.
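
The general idea of hiding a command by slightly perturbing a raw waveform can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the attack's details, so the example below assumes a generic targeted, signed-gradient perturbation with an L-infinity bound, and it replaces the TDNN with a hypothetical four-command linear recognizer. All names (craft_adversarial_music, weights, eps, step) and all numeric values are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def craft_adversarial_music(music, weights, target, eps=0.005, step=0.0005, iters=40):
    """Toy targeted attack: perturb a waveform within +/- eps per sample.

    `weights` is a hypothetical linear recognizer (one row per command);
    it stands in for a real ASR model purely for illustration.
    """
    delta = np.zeros_like(music)
    onehot = np.zeros(weights.shape[0])
    onehot[target] = 1.0
    for _ in range(iters):
        probs = softmax(weights @ (music + delta))
        # Gradient of the cross-entropy loss (toward the target command)
        # with respect to the waveform samples.
        grad = weights.T @ (probs - onehot)
        # Signed-gradient descent step, then projection back into the
        # eps-ball so the perturbation stays small on every sample.
        delta = np.clip(delta - step * np.sign(grad), -eps, eps)
    return music + delta

rng = np.random.default_rng(0)
sr = 16000                                     # assumed sample rate (Hz)
t = np.arange(sr) / sr
music = 0.5 * np.sin(2 * np.pi * 440.0 * t)    # one-second tone as stand-in "music"
weights = 0.01 * rng.standard_normal((4, sr))  # hypothetical 4-command recognizer
target = 2                                     # index of the "ghost command"

adv = craft_adversarial_music(music, weights, target)
p_before = softmax(weights @ music)
p_after = softmax(weights @ adv)
print("max |perturbation|:", np.abs(adv - music).max())
print("target probability before/after:", p_before[target], p_after[target])
```

In this sketch the per-sample bound eps plays the role of the "slight" perturbation: every sample of the adversarial clip stays within eps of the original, while the recognizer's probability for the target command increases. A real attack on a TDNN would need gradients through the acoustic front end and the network itself, which this toy model does not attempt.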

Published in Frontiers in Computer Science

ISSN: 2624-9898 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/computer-science#
