A Long Sequence Speech Perceptual Hashing Authentication Algorithm Based on Constant Q Transform and Tensor Decomposition

Yibo Huang; Hexiang Hou; Yong Wang; Yuan Zhang; Manhong Fan

doi:10.1109/ACCESS.2020.2974029

IEEE Access (Jan 2020)

Affiliations

Yibo Huang: College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, China
Hexiang Hou: College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, China
Yong Wang: College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, China
Yuan Zhang: College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, China
Manhong Fan: College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, China

DOI: https://doi.org/10.1109/ACCESS.2020.2974029
Journal volume & issue: Vol. 8
pp. 34140 – 34152

Abstract

Most speech authentication algorithms are over-optimized for robustness and efficiency, which results in poor discrimination. When the hash sequence is short, different speech segments are likely to produce the same hash sequence, which causes serious authentication errors. Little research has addressed how hash sequence length affects discrimination, so this paper proposes a long-sequence speech authentication algorithm based on the constant Q transform (CQT) and tensor decomposition (TD). Long hash sequences are used to address the poor collision resistance of existing algorithms, so that important speech fragments with large data volumes can be authenticated quickly and accurately. The frequency-domain sub-bands are first divided into separate matrices, then the set of sub-band variances is obtained, and finally the feature values are obtained by applying CQT and TD. The resulting feature values are strongly robust and can cope with interference from complex channel environments. A database of 51,600 speech samples, built from the Texas Instruments and Massachusetts Institute of Technology (TIMIT) speech database and Text to Speech (TTS) output, is used to verify the performance of the algorithm. Experimental results show that, compared with existing speech authentication algorithms, the proposed algorithm offers high discrimination, strong robustness, and high efficiency.
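
The link between hash length and collision resistance claimed in the abstract can be illustrated with a simple model that is an assumption here, not the paper's own analysis: treat the hashes of two different speech segments as independent, uniformly random bit strings, and accept a match when their normalized Hamming distance (bit error rate) is at or below a threshold. The function names, the threshold value of 0.25, and the hash lengths below are hypothetical and chosen only for illustration.

```python
from math import comb


def bit_error_rate(hash_a, hash_b):
    """Normalized Hamming distance between two equal-length bit sequences."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hash sequences must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b)) / len(hash_a)


def false_accept_probability(n_bits, threshold):
    """Chance that two independent, uniformly random n-bit hashes match.

    Assumption (not from the paper): every bit is an independent fair coin,
    so the number of differing bits follows Binomial(n_bits, 0.5).
    A "match" means the bit error rate is <= threshold.
    """
    # Largest number of differing bits that is still accepted.
    # Use thresholds that are exact in binary (such as 0.25) to avoid
    # floating-point rounding in this product.
    max_diff = int(threshold * n_bits)
    favorable = sum(comb(n_bits, k) for k in range(max_diff + 1))
    return favorable / 2 ** n_bits


if __name__ == "__main__":
    # Hypothetical hash lengths; longer hashes collide far less often.
    for n_bits in (64, 256, 1024):
        p = false_accept_probability(n_bits, threshold=0.25)
        print(f"{n_bits:5d} bits: false accept probability = {p:.3e}")
```

Under this idealized model the false accept probability drops steeply as the hash grows, which is the intuition behind using a long sequence. Real hash bits are correlated, so measured collision rates would differ from these values.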

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
