Decoding Envelope and Frequency-Following EEG Responses to Continuous Speech Using Deep Neural Networks

Mike D. Thornton; Danilo P. Mandic; Tobias J. Reichenbach

doi:10.1109/OJSP.2024.3378593

IEEE Open Journal of Signal Processing (Jan 2024)


Affiliations

Mike D. Thornton: Department of Computing, Imperial College London, London, U.K.
Danilo P. Mandic: Department of Electrical and Electronic Engineering, Imperial College London, London, U.K.
Tobias J. Reichenbach: Department for Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany

DOI: https://doi.org/10.1109/OJSP.2024.3378593
Journal volume & issue: Vol. 5, pp. 700–716

Abstract

The electroencephalogram (EEG) offers a non-invasive means by which a listener's auditory system may be monitored during continuous speech perception. Reliable auditory-EEG decoders could facilitate the objective diagnosis of hearing disorders, or find applications in cognitively-steered hearing aids. Previously, we developed decoders for the ICASSP Auditory EEG Signal Processing Grand Challenge (SPGC). These decoders placed first in the match-mismatch task: given a short temporal segment of EEG recordings and two candidate speech segments, the task is to identify which of the two speech segments is temporally aligned, or matched, with the EEG segment. The decoders made use of cortical responses to the speech envelope, as well as speech-related frequency-following responses, to relate the EEG recordings to the speech stimuli. Here we comprehensively document the methods by which the decoders were developed. We extend our previous analysis by exploring the association between speaker characteristics (pitch and sex) and classification accuracy, and provide a full statistical analysis of the final performance of the decoders as evaluated on a held-out portion of the dataset. Finally, the generalisation capabilities of the decoders are characterised by evaluating them on an entirely different dataset, which contains EEG recorded under a variety of speech-listening conditions. The results show that the match-mismatch decoders classify accurately and robustly, and that they can even serve as auditory attention decoders without additional training.
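
To make the match-mismatch decision concrete, the following is a minimal Python sketch of one possible decision rule. It is not the authors' deep-neural-network decoder. It assumes that some decoder has already produced a one-dimensional estimate of a speech feature (for example the envelope) from the EEG segment, and it picks the candidate speech segment whose feature correlates more strongly with that estimate. The function names `pearson` and `match_mismatch`, the synthetic data, the segment length, and the noise level are all hypothetical and chosen only for illustration.

```python
import numpy as np


def pearson(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson correlation coefficient between two equal-length 1-D signals."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def match_mismatch(decoded: np.ndarray,
                   candidate_a: np.ndarray,
                   candidate_b: np.ndarray) -> int:
    """Return 0 if candidate_a is judged to be the matched segment, else 1.

    Assumption: `decoded` is a speech-feature estimate produced from the EEG
    segment by some decoder; the rule simply prefers the candidate that
    correlates more strongly with it.
    """
    score_a = pearson(decoded, candidate_a)
    score_b = pearson(decoded, candidate_b)
    return 0 if score_a >= score_b else 1


# Synthetic stand-ins (hypothetical): a random "speech feature" and a noisy
# copy of one segment of it playing the role of the decoder output.
rng = np.random.default_rng(0)
feature = rng.standard_normal(2000)
matched = feature[:320]            # segment aligned with the "EEG"
mismatched = feature[1000:1320]    # segment taken from elsewhere
decoded = matched + rng.standard_normal(320)

choice = match_mismatch(decoded, matched, mismatched)
```

In this sketch `choice` indexes the candidate judged to be temporally aligned with the EEG segment. With two candidates, chance-level accuracy on the task is 50%.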

Published in IEEE Open Journal of Signal Processing

ISSN: 2644-1322 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8782710
