Data in Brief (Jun 2024)

LUMINA: Linguistic unified multimodal Indonesian natural audio-visual dataset

  • Eka Rahayu Setyaningsih,
  • Anik Nur Handayani,
  • Wahyu Sakti Gunawan Irianto,
  • Yosi Kristian,
  • Christian Trisno Sen Long Chen

Journal volume & issue
Vol. 54
p. 110279

Abstract

The LUMINA (Linguistic Unified Multimodal Indonesian Natural Audio-Visual) dataset is a carefully curated, constrained audio-visual dataset designed to support research on speech perception. Recorded exclusively in Indonesian, it contains high-quality audio-visual recordings of 14 native speakers (9 male and 5 female). Each speaker contributes approximately 1,000 sentences, producing a rich and diverse collection. The videos are framed on the speaker's face, capturing the visual cues and expressions that accompany speech. The dataset is a valuable resource for studying how humans perceive and process spoken language, and it can support advances in speech recognition and synthesis technology.

Keywords