A dataset for voice-based human identity recognition

Baha’ A. Alsaify; Hadeel S. Abu Arja; Baskal Y. Maayah; Masa M. Al-Taweel

Data in Brief (Jun 2022)

Affiliations

Baha’ A. Alsaify (corresponding author): Department of Network Engineering and Security, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan
Hadeel S. Abu Arja: Department of Network Engineering and Security, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan
Baskal Y. Maayah: Department of Network Engineering and Security, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan
Masa M. Al-Taweel: Department of Network Engineering and Security, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan

Journal volume & issue: Vol. 42, p. 108070

Abstract

This paper introduces a new English speech dataset suitable for training and evaluating speaker recognition systems. Samples were collected over two months from non-native English speakers from the Arab region. The dataset is divided into two sub-datasets, with ten samples collected from each speaker for each sub-dataset. In the first sub-dataset, speakers repeat the phrase “Machine learning 1, 2, 3, 4, 5, 6, 7, 8, 9, 10”. In the second, the same speakers speak randomly for five to ten seconds per sample. The dataset comprises 150 speakers, 3,000 samples in total, and about six hours of speech.
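
The figures in the abstract can be cross-checked with a little arithmetic. The sketch below is only a consistency check on the stated numbers: the constant and function names are illustrative and do not come from the dataset or the paper, and "about six hours" is treated as exactly six hours, so the resulting mean duration is approximate.

```python
# Figures as stated in the abstract.
SPEAKERS = 150
SUB_DATASETS = 2                 # fixed-phrase and random-speech sub-datasets
SAMPLES_PER_SPEAKER = 10         # per speaker, per sub-dataset
TOTAL_HOURS = 6                  # "about six hours", so this is approximate


def total_samples(speakers: int, sub_datasets: int, samples_each: int) -> int:
    """Number of recordings implied by the collection scheme."""
    return speakers * sub_datasets * samples_each


def mean_sample_seconds(total_hours: float, n_samples: int) -> float:
    """Average recording length implied by the total duration."""
    return total_hours * 3600 / n_samples


n_samples = total_samples(SPEAKERS, SUB_DATASETS, SAMPLES_PER_SPEAKER)
print(n_samples)                                    # 3000
print(mean_sample_seconds(TOTAL_HOURS, n_samples))  # 7.2
```

The product matches the 3,000 samples reported. The implied mean of roughly 7.2 seconds per sample is an average over both sub-datasets, and it falls within the five-to-ten-second range the abstract gives for the random-speech recordings.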

Published in Data in Brief

ISSN: 2352-3409 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Science (General)
Website: http://www.journals.elsevier.com/data-in-brief/
