IEEE Access (Jan 2023)

FF-BTP Model for Novel Sound-Based Community Emotion Detection

  • Arif Metehan Yildiz,
  • Masayuki Tanabe,
  • Makiko Kobayashi,
  • Ilknur Tuncer,
  • Prabal Datta Barua,
  • Sengul Dogan,
  • Turker Tuncer,
  • Ru-San Tan,
  • U. Rajendra Acharya

DOI
https://doi.org/10.1109/ACCESS.2023.3318751
Journal volume & issue
Vol. 11
pp. 108705 – 108715

Abstract

Most emotion classification schemes to date have concentrated on individual inputs rather than crowd-level signals. To address this gap, we introduce Sound-based Community Emotion Detection (SCED) as a fresh challenge in the machine learning domain. In this pursuit, we crafted the FF-BTP-based feature engineering model inspired by deep learning principles, specifically designed for discerning crowd sentiments. Our unique dataset was derived from 187 YouTube videos, yielding 2733 segments of 3 seconds each (sampled at 44.1 kHz). These segments, capturing overlapping speech, ambient sounds, and more, were meticulously categorized into negative, neutral, and positive emotional content. Our architectural design fuses the BTP, a textural feature extractor, and an innovative handcrafted feature selector inspired by Hinton’s forward-forward (FF) algorithm. This combination identifies the most salient feature vector using the calculated mean squared error. Further enhancements include the incorporation of a multilevel discrete wavelet transform for spatial and frequency domain feature extraction, and a sophisticated iterative neighborhood component analysis for feature selection, eventually employing a support vector machine for classification. On testing, our FF-BTP model showcased an impressive 97.22% classification accuracy across three categories on the SCED dataset. This handcrafted approach, although inspired by deep learning’s feature analysis depth, requires significantly lower computational resources and still delivers outstanding results. It holds promise for future SCED-centric applications.
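To illustrate the multilevel wavelet stage of the pipeline described above, here is a minimal, hedged sketch in plain NumPy: it applies a multilevel Haar DWT to a 3-second clip at 44.1 kHz and pools simple statistics from each level. This is only an illustrative stand-in — the paper's actual model uses BTP textural features, an FF-inspired selector, iterative NCA, and an SVM, none of which are reproduced here; the function names and the choice of statistics are my own assumptions.

```python
import numpy as np

def haar_dwt(x):
    # One-level Haar DWT: split a signal into approximation (low-pass)
    # and detail (high-pass) coefficient arrays.
    x = x[: len(x) // 2 * 2]          # trim to an even length
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def multilevel_features(signal, levels=4):
    # Illustrative stand-in for the paper's feature extraction: pool two
    # simple statistics from each DWT detail band plus the final
    # approximation band (the paper itself uses BTP textural features).
    feats = []
    a = signal
    for _ in range(levels):
        a, d = haar_dwt(a)
        feats.extend([np.mean(np.abs(d)), np.std(d)])
    feats.extend([np.mean(np.abs(a)), np.std(a)])
    return np.array(feats)

# Demo on a synthetic 3-second clip at 44.1 kHz (a pure 440 Hz tone,
# standing in for one of the dataset's labeled audio segments).
sr = 44100
t = np.arange(3 * sr) / sr
clip = np.sin(2 * np.pi * 440 * t)
fv = multilevel_features(clip)
print(fv.shape)  # (10,) — 2 stats x (4 detail bands + 1 approximation)
```

In the full model, vectors like `fv` (computed per segment, with far richer textural features) would then be filtered by the FF-inspired selector and iterative NCA before classification with an SVM.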

Keywords