Applied Sciences (Aug 2024)

3D Character Animation and Asset Generation Using Deep Learning

  • Vlad-Constantin Lungu-Stan,
  • Irina Georgiana Mocanu

DOI
https://doi.org/10.3390/app14167234
Journal volume & issue
Vol. 14, no. 16
p. 7234

Abstract

Besides video content, a significant part of entertainment is represented by computer games and animations such as cartoons. Creating such entertainment is based on two fundamental steps: asset generation and character animation. Both steps are difficult because they are repetitive and demand considerable concentration and skill. The latest advances in deep learning and generative techniques have provided a set of powerful tools that can alleviate these problems by facilitating the tasks of artists and engineers and providing a better workflow. In this work, we explore practical solutions for facilitating and speeding up both steps of the creative process: character animation and asset generation. In character animation, the task is either to move the joints of a subject manually or to correct the noisy data coming out of motion capture. For the animation case, we propose two decoder-only transformer-based solutions, inspired by the current success of GPT. The first, AnimGPT, targets the original animation workflow by predicting the next pose of an animation based on a set of previous poses, while the second, DenoiseAnimGPT, tackles the motion capture case by predicting the clean current pose based on all previous poses and the current noisy pose. Both models performed well on the CMU motion dataset, with the differences between the generated and ground-truth motion being imperceptible to the untrained human eye. Quantitative evaluation was performed using the mean absolute error between the ground-truth motion vectors and the predicted motion vectors. AnimGPT and DenoiseAnimGPT obtained errors of 0.345 and 0.2513, respectively (for 50 frames), which indicates better performance than other solutions. For asset generation, diffusion models were used. Using image generation and outpainting, we created a method that generates good backgrounds by combining text-conditioned generation with text-conditioned image editing. We also obtained a time-coherent algorithm that creates animated effects for characters.
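
As an illustration of the evaluation described in the abstract, the following minimal sketch shows how next-pose prediction can be rolled out frame by frame and scored with the mean absolute error against ground-truth motion vectors. It is not the authors' implementation: the function names (`rollout`, `mean_absolute_error`, `repeat_last_pose`), the flat-vector pose representation, and the toy predictor standing in for a trained model are all assumptions made for this example.

```python
import numpy as np


def mean_absolute_error(ground_truth, predicted):
    """Mean absolute error over all frames and all pose dimensions."""
    gt = np.asarray(ground_truth, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    if gt.shape != pred.shape:
        raise ValueError(f"shape mismatch: {gt.shape} vs {pred.shape}")
    return float(np.mean(np.abs(gt - pred)))


def rollout(predict_next, seed_poses, num_frames):
    """Autoregressively generate `num_frames` poses.

    `predict_next` is a hypothetical callable that maps the pose history
    (array of shape [frames, dims]) to the next pose (array of shape [dims]).
    Each predicted pose is appended to the history before the next step.
    """
    poses = [np.asarray(p, dtype=float) for p in seed_poses]
    for _ in range(num_frames):
        next_pose = predict_next(np.stack(poses))
        poses.append(np.asarray(next_pose, dtype=float))
    # Return only the generated frames, not the seed frames.
    return np.stack(poses[len(seed_poses):])


def repeat_last_pose(history):
    """Toy stand-in for a trained model: predicts no movement at all."""
    return history[-1]


if __name__ == "__main__":
    seed = [[0.0, 0.0], [1.0, 1.0]]                      # two seed frames, 2-D poses
    truth = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]         # made-up ground truth
    generated = rollout(repeat_last_pose, seed, num_frames=3)
    print(mean_absolute_error(truth, generated))
```

In this sketch the error is averaged uniformly over frames and pose dimensions; whether the reported figures use the same averaging is not stated in the abstract.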

Keywords