IEEE Access (Jan 2021)
Use of Machine Learning for Deception Detection From Spectral and Cepstral Features of Speech Signals
Abstract
In this research, four unique nonlinear speech features are extracted and analyzed to study how a speaker's speech differs between deceitful and truthful responses, using features based on how human speech is perceived. The speaker was under stress in a police interrogation, in which two truthful (ground-truth) and two deceitful responses were recorded at three different times of day. Cepstral features and spectral energy features are extracted from the audio recordings of all three sessions. The cepstral features are the Mel-frequency cepstral coefficients, from which the delta cepstrum and time-difference cepstrum features are derived. The spectral energy features are the Bark band energies, from which the delta energy and time-difference energy features are derived. The Levenberg-Marquardt classification method and the long short-term memory classification method are then applied to evaluate the accuracy of deception detection across the nine unique training and testing combinations of the three sessions and their extracted cepstral and spectral energy features. In addition, principal component analysis is applied to reduce the dimensionality of the extracted features for further improvement. The projected principal components of the four types of features improved the accuracy of distinguishing between truthful and deceptive speech patterns. When combined with principal component analysis, the long short-term memory classification method with the time-difference spectral energy feature achieves the highest recognition rate, exceeding that of the Levenberg-Marquardt algorithm with the other cepstral and spectral features.
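As a rough illustration of two steps named in the abstract, the sketch below derives a delta feature from a frame-by-coefficient matrix and then projects features onto their leading principal components. It is only a sketch under stated assumptions: the abstract does not give the paper's exact definitions, so the regression-based delta formula, the window width, the function names `delta` and `pca_project`, and the random placeholder matrix standing in for real cepstral coefficients are all assumptions, not the authors' method.

```python
import numpy as np


def delta(features, width=2):
    """Regression-based delta over time (assumed formula, not from the paper).

    features: array of shape (frames, coefficients).
    Edge frames are handled by repeating the first and last frame.
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]   # frame t + n
        behind = padded[width - n : width - n + n_frames]  # frame t - n
        out += n * (ahead - behind)
    return out / denom


def pca_project(features, n_components):
    """Project mean-centered features onto the top principal components."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Placeholder data only: 200 frames of 13 coefficients, not real speech features.
rng = np.random.default_rng(0)
cepstrum = rng.normal(size=(200, 13))
delta_cepstrum = delta(cepstrum)
reduced = pca_project(delta_cepstrum, n_components=5)
```

In a setup like this, the reduced matrix rather than the raw feature matrix would be passed to the classifier; the number of retained components is a tuning choice that the abstract does not specify.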
Keywords