IEEE Access (Jan 2024)
RealMock: Crafting Realistic Animated Portraits via Dual-Driven Landmark Editing
Abstract
Audio-driven animated portrait generation has advanced considerably in producing lifelike visuals. Even so, current methods struggle to balance stability and naturalness. Audio-driven techniques are often inconsistent because the audio signal is a subtle driving cue. Methods that rely solely on facial landmarks are more stable, but excessive manipulation of keypoint data can make their results look artificial. This paper introduces RealMock, an approach that combines audio inputs with facial landmarks during training. Its dual-driven training regimen allows animated portraits to be generated from audio, facial landmarks, or a combination of both, resulting in a more stable and authentic animation process. In extensive comparisons with existing algorithms on several public datasets and on our proprietary dataset, RealMock outperforms them in both quantitative metrics and qualitative assessments. The framework has broad practical applications, including media production with realistic animated characters and online education with engaging avatar-based learning experiences.
Keywords