IEEE Access (Jan 2022)

Fonts That Fit the Music: A Multimodal Design Trend Analysis of Lyric Videos

  • Daichi Haraguchi
  • Shota Sakaguchi
  • Jun Kato
  • Masataka Goto
  • Seiichi Uchida

DOI
https://doi.org/10.1109/ACCESS.2022.3184028
Journal volume & issue
Vol. 10
pp. 65414 – 65425

Abstract

Lyric videos, or kinetic typography videos, are music videos that show lyric text in synchronization with the music. The purpose of this paper is to analyze lyric videos quantitatively and qualitatively to understand their design trends across three modalities: word motion, font style, and music style. These trends can serve as hints for designing new lyric videos and can also reveal, in quantitative terms, the thought processes of professional video designers. To this end, we developed or adopted several techniques. First, we developed a lyric word tracking method to capture the motion of individual lyric words. The proposed method uses the lyric text as guiding information for word tracking to overcome the difficulties arising from the wide variety of word appearances and motions. Second, we developed a font style estimator to quantify the appearance of each word as a feature vector. Finally, we employed a music style estimator to quantify the mood of the music, e.g., “techno” and “fast.” We then analyzed the feature vectors of these three style modalities collected at 3,494 time points in 100 lyric videos. After revealing the trend of each modality via k-means clustering, we conducted a co-occurrence analysis to understand the correlation between each pair of modalities. Our experimental results indicate that such a cluster-wise co-occurrence analysis can capture interesting trends hidden in lyric video designs.
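The cluster-wise co-occurrence analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature values, the choice of k = 2, and the hand-written k-means routine are all assumptions made for illustration, and the function names `kmeans` and `cooccurrence` are hypothetical. Each modality is clustered separately, and the table then counts how often cluster i of one modality appears at the same time point as cluster j of another.

```python
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on equal-length float tuples; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cooccurrence(labels_a, labels_b, k_a, k_b):
    """Count how often cluster i of modality A and cluster j of modality B share a time point."""
    counts = Counter(zip(labels_a, labels_b))
    return [[counts[(i, j)] for j in range(k_b)] for i in range(k_a)]


# Made-up feature vectors at six time points (assumption: not data from the paper).
font = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (5.0, 5.1), (5.1, 5.0), (5.2, 5.1)]
music = [(1.0,), (1.1,), (0.9,), (9.0,), (9.1,), (8.9,)]

font_labels = kmeans(font, k=2)
music_labels = kmeans(music, k=2)
table = cooccurrence(font_labels, music_labels, 2, 2)

for row in table:
    print(row)
```

In this toy data the two font groups line up with the two music groups, so the counts should concentrate in two cells of the table. In a real analysis, cells with unusually high counts would point to font and music styles that designers tend to pair.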

Keywords