Head Pose Estimation in the Wild Assisted by Facial Landmarks Based on Convolutional Neural Networks

Jiahao Xia; Libo Cao; Guanjun Zhang; Jiacai Liao

doi:10.1109/ACCESS.2019.2909327

IEEE Access (Jan 2019)

Affiliations

Jiahao Xia: State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China
Libo Cao: State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China
Guanjun Zhang: State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China
Jiacai Liao: State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China

DOI: https://doi.org/10.1109/ACCESS.2019.2909327
Journal volume & issue: Vol. 7, pp. 48470–48483

Abstract

Convolutional neural networks (CNNs) perform well on the head pose estimation problem under controlled conditions, but their generalization ability in the wild needs to be improved. To address this issue, we propose introducing facial landmark information through a task simplifier and a landmark heatmap generator constructed before the feed-forward neural network. The task simplifier uses the landmarks to normalize the face shape into a canonical shape, and the heatmap generator produces a landmark heatmap from the transformed facial landmarks to assist feature extraction, which enhances generalization ability in the wild. Our method was trained on 300W-LP and tested on AFLW2000-3D. The results show that, for the same feed-forward neural network, using our method to introduce facial landmark information into a CNN improves accuracy from 88.5% to 99.0% and decreases the mean average error from 5.94° to 1.46° on AFLW2000-3D. Furthermore, we evaluate our method on several other datasets used for pose estimation and compare the results with those on AFLW2000-3D. We find that the features extracted by a CNN alone do not reflect head pose efficiently, which limits the performance of the CNN on head pose estimation in the wild. By introducing facial landmarks, the CNN can extract features that reflect head pose more efficiently, thereby significantly improving the accuracy of head pose estimation in the wild.
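The two preprocessing ideas named in the abstract, normalizing landmarks to a canonical shape and rendering them as a heatmap, can be illustrated with a minimal sketch. The abstract does not specify how the task simplifier or the heatmap generator are implemented, so everything here is an assumption: the alignment is a standard least-squares 2D similarity transform, the heatmap is a per-landmark Gaussian, and the names `similarity_align`, `apply_similarity`, and `landmark_heatmap`, the 64-pixel map size, and the five-point canonical shape are all hypothetical.

```python
import numpy as np

def similarity_align(landmarks, canonical):
    """Least-squares similarity transform (scale, rotation, translation)
    mapping `landmarks` (N, 2) onto `canonical` (N, 2).
    Assumption: a generic alignment, not necessarily the paper's method."""
    mu_src = landmarks.mean(axis=0)
    mu_dst = canonical.mean(axis=0)
    src = landmarks - mu_src
    dst = canonical - mu_dst
    n = len(landmarks)
    # Cross-covariance between the centered target and source points.
    cov = dst.T @ src / n
    U, S, Vt = np.linalg.svd(cov)
    # Guard against a reflection so the result is a proper rotation.
    d = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag([1.0, d]) @ Vt
    var_src = (src ** 2).sum() / n
    scale = (S * np.array([1.0, d])).sum() / var_src
    t = mu_dst - scale * R @ mu_src
    return scale, R, t

def apply_similarity(landmarks, scale, R, t):
    """Apply the transform to an (N, 2) array of (x, y) points."""
    return scale * landmarks @ R.T + t

def landmark_heatmap(landmarks, size=64, sigma=2.0):
    """Render (x, y) landmarks as one heatmap with a Gaussian at each point.
    `size` and `sigma` are illustrative values, not taken from the paper."""
    ys, xs = np.mgrid[0:size, 0:size]
    heat = np.zeros((size, size))
    for x, y in landmarks:
        g = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
        heat = np.maximum(heat, g)
    return heat

# Hypothetical 5-point canonical shape (eyes, nose tip, mouth corners).
canonical = np.array([[20, 24], [44, 24], [32, 36], [24, 46], [40, 46]], float)

# A rotated, scaled, shifted copy stands in for detected landmarks.
theta = 0.3
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
detected = 1.5 * canonical @ rot.T + np.array([5.0, -3.0])

scale, R, t = similarity_align(detected, canonical)
aligned = apply_similarity(detected, scale, R, t)
heatmap = landmark_heatmap(aligned)
```

In a pipeline of this kind, the resulting heatmap would typically be stacked with the normalized face image as an extra input channel to the CNN; whether the paper does exactly that is not stated in the abstract.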

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
