Human–Robot Collaboration Using Sequential-Recurrent-Convolution-Network-Based Dynamic Face Emotion and Wireless Speech Command Recognitions

Chih-Lyang Hwang; Yu-Chen Deng; Shih-En Pu

doi:10.1109/ACCESS.2022.3228825

IEEE Access (Jan 2023)


Affiliations

Chih-Lyang Hwang: Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Yu-Chen Deng: Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Shih-En Pu: Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan

DOI: https://doi.org/10.1109/ACCESS.2022.3228825
Journal volume & issue: Vol. 11
pp. 37269 – 37282

Abstract

The proposed sequential recurrent convolution network (SRCN) consists of two parts: a convolutional neural network (CNN) and a sequence of long short-term memory (LSTM) models. The CNN extracts the feature vector of a face emotion or a speech command. A sequence of LSTM models with shared weights then processes the sequence of feature vectors that the (pre-trained) CNN produces from a sequence of input sub-images (for face emotion) or spectrograms (for speech command). In this way, one SRCN is developed for dynamic face emotion recognition (SRCN-DFER) and another for wireless speech command recognition (SRCN-WSCR). The proposed approach not only effectively handles the dynamic mapping of face emotion and speech command, with average generalized recognition rates of 98% and 96.7%, respectively, but also prevents overfitting in a noisy environment. Comparisons among mono and stereo vision, Deep CNN, and ResNet50 confirm the superiority of the proposed SRCN-DFER. Comparisons among SRCN-WSCR with noise-free data, SRCN-WSCR with noisy data, and a multiclass support vector machine validate its robustness. Finally, human-robot collaboration (HRC) with our developed omnidirectional service robot, including human and face detection, trajectory tracking by the previously designed adaptive stratified finite-time saturated control, face emotion and speech command recognition, and music playback, validates the effectiveness, feasibility, and robustness of the proposed method.
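
The data flow described in the abstract (a per-step feature extractor followed by one LSTM cell whose weights are reused at every step) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: `cnn_features` is a hypothetical stand-in for the pre-trained CNN (it merely averages each row of the input), the weights are random and untrained, and the sizes `FEAT`, `HIDDEN`, `CLASSES`, and `STEPS` are arbitrary values chosen for the example, since the abstract does not state them.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def cnn_features(frame):
    # Hypothetical placeholder for the pre-trained CNN: one value per input row.
    return [sum(row) / len(row) for row in frame]


def lstm_step(x, h, c, W, b):
    # One LSTM cell update; W holds the input, forget, output and candidate
    # gate weights stacked row-wise and applied to the concatenation [x, h].
    n = len(h)
    z = [a + bi for a, bi in zip(matvec(W, x + h), b)]
    i = [sigmoid(v) for v in z[0:n]]
    f = [sigmoid(v) for v in z[n:2 * n]]
    o = [sigmoid(v) for v in z[2 * n:3 * n]]
    g = [math.tanh(v) for v in z[3 * n:4 * n]]
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def srcn_forward(frames, W, b, W_out, hidden):
    # The same W and b are used at every time step (shared weights).
    h = [0.0] * hidden
    c = [0.0] * hidden
    for frame in frames:
        h, c = lstm_step(cnn_features(frame), h, c, W, b)
    return softmax(matvec(W_out, h))


# Arbitrary example sizes (assumptions, not taken from the paper).
FEAT, HIDDEN, CLASSES, STEPS = 4, 3, 5, 6
rng = random.Random(0)
W = rand_matrix(4 * HIDDEN, FEAT + HIDDEN, rng)
b = [0.0] * (4 * HIDDEN)
W_out = rand_matrix(CLASSES, HIDDEN, rng)

# A sequence of STEPS inputs, each a FEAT x 8 grid standing in for a
# face sub-image or a spectrogram slice.
frames = [[[rng.random() for _ in range(8)] for _ in range(FEAT)]
          for _ in range(STEPS)]
probs = srcn_forward(frames, W, b, W_out, HIDDEN)
print(probs)
```

Because the cell weights are shared across steps, the same `W` and `b` accept an input sequence of any length; only the final hidden state is passed to the classification layer in this sketch.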

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
