SeamsTalk: Seamless Talking Face Generation via Flow-Guided Inpainting

Yeongho Jeong; Gyeongman Kim; Doohyuk Jang; Jaeryong Hwang; Eunho Yang

doi:10.1109/ACCESS.2024.3381992

IEEE Access (Jan 2024)

Affiliations

Yeongho Jeong: Kim Jaechul Graduate School of Artificial Intelligence, KAIST, Yuseong-gu, Daejeon, Republic of Korea
Gyeongman Kim: Kim Jaechul Graduate School of Artificial Intelligence, KAIST, Yuseong-gu, Daejeon, Republic of Korea
Doohyuk Jang: Kim Jaechul Graduate School of Artificial Intelligence, KAIST, Yuseong-gu, Daejeon, Republic of Korea
Jaeryong Hwang: Department of Cyber Science, Republic of Korea Naval Academy, Jinhae-gu, Changwon, Republic of Korea
Eunho Yang: Kim Jaechul Graduate School of Artificial Intelligence, KAIST, Yuseong-gu, Daejeon, Republic of Korea

DOI: https://doi.org/10.1109/ACCESS.2024.3381992
Journal volume & issue: Vol. 12, pp. 46678–46689

Abstract

Talking face generation aims to synthesize a video of a face speaking in accordance with a given audio clip or driving video. Despite the importance of natural lower-face movement, previous approaches have focused only on animating the lips, neglecting the connection between the modified lower face and the original background. As a result, the generated face is not smoothly integrated into the original video. To address this, we propose a new method that creates a seamless talking face video by reformulating talking face generation as conditional video inpainting. Moreover, because previous methods rely solely on the original frame as a reference, the original frame’s lip shape influences the generated lip shape. We therefore devise a two-stage pipeline that first leverages the original frame to reduce the loss of scene-specific information in the lower face and then uses multiple other frames to generate the desired lip shape. Experimental results demonstrate that our method generates a seamless talking face while maintaining lip-shape accuracy comparable to that of existing methods.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
