IEEE Access (Jan 2022)
PipaSet and TEAS: A Multimodal Dataset and Annotation Platform for Automatic Music Transcription and Expressive Analysis Dedicated to Chinese Traditional Plucked String Instrument Pipa
Abstract
Music information retrieval (MIR) has developed rapidly in recent years. As fundamental MIR tasks, automatic music transcription (AMT) and expressive analysis (EA) are gaining momentum in both Western and non-European music. However, annotated datasets for non-Eurogenic instruments remain scarce in both quantity and feature diversity, so general evaluations and data-driven models for various tasks cannot be explored thoroughly. The pipa, one of the most popular traditional plucked string instruments in Asia, has barely been studied in the MIR community. It has many distinctive national and local characteristics, including fake nails, intrinsic pitch deviation, rubato, and sophisticated playing techniques, that greatly enhance musical expressiveness. Our work aims to systematically lay out the complete procedure for creating a pipa solo dataset with audio, musical notation, and multiview video modalities. The use of 4-track string vibration signals captured by optical sensors paves the way for high-quality annotations. Furthermore, a transcription and expressiveness annotation system (TEAS) was transparently implemented to ensure the scalability of the dataset. Three expressive analysis approaches in this system were newly proposed and evaluated in the paper. Finally, a series of existing and emerging MIR tasks enabled by this dataset were enumerated, and two AMT models were briefly investigated for future exploration.
Keywords