Proceedings of the XXth Conference of Open Innovations Association FRUCT (Apr 2022)
Meta-Learning, Fast Adaptation, and Latent Representation for Head Pose Estimation
Abstract
Head pose estimation is used in a variety of human-computer interface applications, such as gaze tracking, driving assistance, assistance for impaired users, and entertainment. Advances in convolutional neural networks have considerably improved the performance of head pose estimation. However, the difficulty of capturing well-labelled head pose data and the differences in facial features between individuals make these networks difficult to use in practice. This work proposes a meta-learning-based technique for the head pose estimation problem on the BIWI head pose dataset. First, a latent representation of head pose features is learned using a variational autoencoder. Then a fast, adaptable head pose estimator is trained with the model-agnostic meta-learning (MAML) algorithm in a few-shot setting. A mean absolute error (MAEavg) of 7.33 is achieved in predicting head pose angles in the one-shot setting. After meta-training, the optimized model is used to analyze fast adaptation on a test set held out from the BIWI head pose dataset. Starting from the meta-trained network's parameters, we run the inner-loop optimization for quick adaptation. The adapted model predicts accurate head poses after as few as 10 gradient descent steps on unseen tasks sampled from the test set.
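The fast-adaptation step summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a linear regression head that maps latent features to three pose angles, synthetic data in place of BIWI, and a random stand-in for the meta-trained initialization. The names `adapt`, `theta`, `task_w`, `latent_dim`, and the learning rate are all hypothetical. Only the inner loop is shown, not the meta-training outer loop.

```python
import numpy as np

def adapt(theta, x_support, y_support, lr=0.05, steps=10):
    """Inner-loop adaptation: a few gradient descent steps on one task's support set."""
    w = theta.copy()
    n = x_support.shape[0]
    for _ in range(steps):
        residual = x_support @ w - y_support     # shape (n, 3): error per angle
        grad = 2.0 * x_support.T @ residual / n  # gradient of the squared error, averaged over samples
        w -= lr * grad
    return w

def mean_abs_error(w, x, y):
    """Mean absolute error averaged over samples and the three angles."""
    return float(np.mean(np.abs(x @ w - y)))

rng = np.random.default_rng(0)
latent_dim = 8  # assumed size of the latent feature vector

# Stand-in for the meta-trained initialization (assumption, not learned here).
theta = rng.normal(scale=0.1, size=(latent_dim, 3))
# Hypothetical task-specific mapping, e.g. one unseen person.
task_w = theta + rng.normal(scale=0.5, size=(latent_dim, 3))

# Few-shot support set and a query set for the same task (synthetic).
x_support = rng.normal(size=(10, latent_dim))
y_support = x_support @ task_w
x_query = rng.normal(size=(50, latent_dim))
y_query = x_query @ task_w

adapted = adapt(theta, x_support, y_support, steps=10)
print("query MAE before adaptation:", mean_abs_error(theta, x_query, y_query))
print("query MAE after adaptation: ", mean_abs_error(adapted, x_query, y_query))
```

In this toy setting, the 10 gradient steps move the parameters from the shared initialization toward the task-specific mapping. That is the behaviour a meta-trained initialization is meant to make cheap on unseen tasks.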
Keywords