IEEE Access (Jan 2024)
Multimodal Engagement Recognition From Image Traits Using Deep Learning Techniques
Abstract
Learner engagement is a significant factor in the success of an intelligent educational network. Currently, the use of Massive Open Online Courses has increased because of the flexibility offered by such online learning systems, and the COVID-19 period encouraged practitioners to adopt new modes of online and hybrid teaching. However, monitoring student engagement and maintaining the right level of interaction in an online classroom is challenging for teachers. In this paper, we propose an engagement recognition model that combines image traits obtained from a camera, such as facial emotions, gaze tracking with head pose estimation, and eye blink rate. In the first step, a face recognition model was implemented. The next stage involved training a facial emotion recognition model using a deep convolutional neural network on the FER-2013 dataset. The classified emotions were assigned weights corresponding to academic affective states. Subsequently, using Dlib's face detector and shape-prediction algorithm, the gaze direction with head pose estimation, the eye blink rate, and the eye status (closed or open) were identified. By combining all the modalities obtained from these image traits, we propose an engagement recognition system. The experimental results of the proposed system were validated against the quiz score obtained at the end of each session. The model can be applied to real-time video processing of a student's affective state, and the teacher receives detailed engagement statistics in a spreadsheet at the end of the session, facilitating the necessary follow-up actions.
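The eye-status and blink-rate modalities mentioned in the abstract are commonly derived from the six per-eye landmarks produced by Dlib's shape predictor via the eye aspect ratio (EAR). The following is a minimal sketch of that computation; the landmark coordinates are supplied directly, and the 0.2 closed-eye threshold is a common heuristic rather than a value taken from the paper:

```python
import math

def eye_aspect_ratio(eye):
    """Compute the eye aspect ratio (EAR) from six (x, y) eye landmarks.

    Landmarks follow Dlib's contour ordering p1..p6.
    EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|); the value drops toward
    zero as the eyelids close.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    p1, p2, p3, p4, p5, p6 = eye
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

# Hypothetical threshold: a frequently used default, not specified in the paper.
EAR_CLOSED_THRESHOLD = 0.2

def eye_status(eye):
    """Classify the eye as 'closed' or 'open' based on its EAR."""
    return "closed" if eye_aspect_ratio(eye) < EAR_CLOSED_THRESHOLD else "open"
```

A blink can then be counted whenever consecutive video frames transition from "open" to "closed" and back, and the blink rate is the blink count divided by the session duration.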
Keywords