IEEE Access (Jan 2021)

Robust Blind Speech Watermarking via FFT-Based Perceptual Vector Norm Modulation With Frame Self-Synchronization

  • Hwai-Tsu Hu,
  • Hsien-Hsin Chou,
  • Tung-Tsun Lee

DOI
https://doi.org/10.1109/ACCESS.2021.3049525
Journal volume & issue
Vol. 9
pp. 9916 – 9925

Abstract

Watermarking is an important measure for protecting proprietary digital multimedia data. This paper presents a novel approach to robust and imperceptible blind speech watermarking on a frame-by-frame basis. The proposed method employs two modules operating in the fast Fourier transform (FFT) domain. The first module, downward progressive quantization index modulation, modulates the vector norms drawn from FFT coefficients according to a guideline deduced from human auditory masking properties. The second module, boundary-constrained iterative adjustment, provides a smooth transition across frames in the resulting speech waveform. Experimental results confirm the imperceptibility of the proposed modulation scheme in terms of the mean opinion score – listening quality objective (MOS–LQO) based on the perceptual evaluation of speech quality (PESQ) metric. The proposed watermarking method matched or exceeded the performance of five state-of-the-art methods in terms of robustness against common speech processing attacks.
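
To make the idea of modulating an FFT vector norm concrete, the following is a minimal sketch of plain binary quantization index modulation (QIM) applied to the norm of one band of FFT coefficients in a single frame. It is not the paper's downward progressive variant: it uses no auditory masking guideline, no boundary-constrained adjustment, and no frame synchronization. All function names, the band selection, and the fixed step size are assumptions made for illustration only.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for short demo frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT returning the real part of each reconstructed sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def band_norm(spectrum, band):
    """Euclidean norm of the coefficient magnitudes at the bin indices in `band`."""
    return math.sqrt(sum(abs(spectrum[k]) ** 2 for k in band))


def qim_quantize(value, bit, step):
    """Snap `value` to the nearest point of the lattice assigned to `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted by step/2.
    """
    offset = bit * step / 2.0
    return round((value - offset) / step) * step + offset


def embed_bit(frame, bit, band, step):
    """Embed one bit by rescaling the band so its norm lies on the bit's lattice.

    Assumes a real-valued frame and band indices k with 0 < k < len(frame) / 2.
    """
    n = len(frame)
    spectrum = dft(frame)
    norm = band_norm(spectrum, band)
    if norm == 0.0:
        # An all-zero band cannot be rescaled; this sketch leaves the frame as is.
        return list(frame)
    scale = qim_quantize(norm, bit, step) / norm
    for k in band:
        spectrum[k] *= scale
        # Mirror to the negative-frequency bin so the output stays real.
        spectrum[n - k] = spectrum[k].conjugate()
    return idft(spectrum)


def extract_bit(frame, band, step):
    """Blind detection: choose the bit whose lattice is closer to the band norm."""
    norm = band_norm(dft(frame), band)
    d0 = abs(norm - qim_quantize(norm, 0, step))
    d1 = abs(norm - qim_quantize(norm, 1, step))
    return 0 if d0 <= d1 else 1
```

Calling `embed_bit(frame, 1, range(2, 8), 2.0)` on a 32-sample real frame and then `extract_bit` with the same band and step recovers the bit without access to the original frame, which is what "blind" refers to. A larger step tolerates more distortion before the detector flips, at the cost of a larger change to the signal. The abstract indicates that the paper manages this trade-off with a guideline derived from auditory masking rather than a fixed step.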

Keywords