Data in Brief (Aug 2024)
SunoCaps: A novel dataset of text-prompt based AI-generated music with emotion annotations
Abstract
The SunoCaps dataset contributes a collection of AI-generated music with expert annotations to the available music data. Expert descriptions of human-made musical pieces, taken from the widely used MusicCaps dataset, are used as prompts for generating complete songs. This Automatic Music Generation is performed with Suno, a state-of-the-art audio-based music generator. A subset of 64 pieces from MusicCaps is currently included, yielding a total of 256 generated entries. This total stems from generating four variations for each human-made piece: two versions based on the original caption and two versions based on the original aspect description. As an AI-generated music dataset, SunoCaps also includes expert-based information on prompt alignment, with the main differences between the prompt and the final generation annotated. Furthermore, it includes annotations describing the main discrete emotions induced by each piece. The dataset can support an array of applications, such as creating and improving music generation validation tools, training systems for multi-layered architectures, and optimizing music emotion estimation systems.
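The composition described above (64 source pieces, two prompt types, two generations per prompt type) can be sketched as a small index. This is only an illustrative sketch: the record fields `piece`, `prompt_type` and `version`, and the labels `caption` and `aspects`, are hypothetical names chosen here and are not taken from the dataset's actual file layout.

```python
from itertools import product

N_SOURCE_PIECES = 64                    # MusicCaps pieces used (stated in the abstract)
PROMPT_TYPES = ("caption", "aspects")   # hypothetical labels for the two prompt sources
VERSIONS_PER_PROMPT = 2                 # two generations per prompt type (stated in the abstract)


def build_index(n_pieces=N_SOURCE_PIECES):
    """Enumerate one record per generated entry; field names are illustrative only."""
    return [
        {"piece": piece, "prompt_type": prompt_type, "version": version}
        for piece, prompt_type, version in product(
            range(n_pieces), PROMPT_TYPES, range(1, VERSIONS_PER_PROMPT + 1)
        )
    ]


index = build_index()
print(len(index))  # 64 pieces * 2 prompt types * 2 versions = 256
```

Each source piece therefore appears four times in the index, once per combination of prompt type and version.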