Temporal Resolution Needed for Auditory Communication: Measurement With Mosaic Speech

Yoshitaka Nakajima; Mizuki Matsuda; Kazuo Ueda; Gerard B. Remijn

doi:10.3389/fnhum.2018.00149

Frontiers in Human Neuroscience (Apr 2018)

Affiliations

Yoshitaka Nakajima: Department of Human Science, Faculty of Design/Research Center for Applied Perceptual Science, Kyushu University, Fukuoka, Japan
Mizuki Matsuda: Nihon Kohden Corporation, Tokyo, Japan
Kazuo Ueda: Department of Human Science, Faculty of Design/Research Center for Applied Perceptual Science, Kyushu University, Fukuoka, Japan
Gerard B. Remijn: Department of Human Science, Faculty of Design/Research Center for Applied Perceptual Science, Kyushu University, Fukuoka, Japan

Journal volume & issue: Vol. 12

Abstract

The temporal resolution needed for Japanese speech communication was measured. A new experimental paradigm that can reflect the spectro-temporal resolution necessary for healthy listeners to perceive speech is introduced. As a first step, we report listeners' intelligibility scores for Japanese speech with a systematically degraded temporal resolution, so-called “mosaic speech”: speech mosaicized in the coordinates of time and frequency. The results of two experiments show that mosaic speech cut into short static segments was almost perfectly intelligible with a temporal resolution of 40 ms or finer. Intelligibility dropped for a temporal resolution of 80 ms, but remained at around the 50%-correct level. The data are in line with previous results showing that speech signals separated into short temporal segments of <100 ms can be remarkably robust in terms of linguistic-content perception against drastic manipulations in each segment, such as partial signal omission or temporal reversal. The human perceptual system can thus extract meaning from unexpectedly rough temporal information in speech. The process resembles that of the visual system stringing together static movie frames of ~40 ms into vivid motion.
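
As a rough illustration of what "mosaicizing" in time can mean, the sketch below replaces each run of frames in a band-by-frame power grid with the mean power of that run, so that every segment becomes static. It is only a toy under stated assumptions, not the authors' procedure: the function name `mosaicize`, the 10 ms frame length, and the input values are hypothetical, and the abstract does not specify how the frequency bands or the resynthesized sound were obtained.

```python
def mosaicize(power, frames_per_segment):
    """Replace each run of frames in every band with that run's mean power.

    power: list of bands, each a list of power values per time frame
           (hypothetical layout, chosen for illustration only).
    """
    if frames_per_segment < 1:
        raise ValueError("frames_per_segment must be at least 1")
    mosaic = []
    for band in power:
        row = []
        for start in range(0, len(band), frames_per_segment):
            segment = band[start:start + frames_per_segment]
            mean = sum(segment) / len(segment)
            # Every frame in the segment takes the same (static) value.
            row.extend([mean] * len(segment))
        mosaic.append(row)
    return mosaic


# Hypothetical toy input: 2 frequency bands x 8 frames, assumed 10 ms each.
toy_power = [
    [1.0, 3.0, 5.0, 7.0, 2.0, 2.0, 4.0, 4.0],
    [0.0, 0.0, 8.0, 8.0, 1.0, 3.0, 1.0, 3.0],
]
# 4 frames of 10 ms give a 40 ms temporal resolution.
mosaic_40ms = mosaicize(toy_power, 4)
```

With 10 ms frames, passing 8 instead of 4 would correspond to the 80 ms resolution at which the abstract reports intelligibility dropping to around 50% correct.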

Published in Frontiers in Human Neuroscience

ISSN: 1662-5161 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Internal medicine: Neurosciences. Biological psychiatry. Neuropsychiatry
Website: http://www.frontiersin.org/human_neuroscience
