Transactions of the International Society for Music Information Retrieval (Mar 2021)
Automatic Generation of Piano Score Following Videos
Abstract
This article studies the problem of generating a piano score following video from an audio recording in a fully automated manner. This problem contains two components: identifying the piece and aligning the audio with raw sheet music images. Unlike previous work, we focus primarily on working with raw, unprocessed sheet music from IMSLP, which may contain filler pages, other unrelated pieces or movements, or repeats and jumps whose locations are unknown a priori. To solve this problem, we combine state-of-the-art methods with a novel alignment algorithm called Hierarchical DTW to handle discontinuities, which are the bottleneck in system performance. Hierarchical DTW handles repeats and jumps by considering all line breaks as possible jump locations, and it applies DTW at both a feature level and a segment level. We evaluate our algorithm on 200 PDFs from IMSLP and real audio recordings from YouTube. Our experiments show that Hierarchical DTW consistently outperforms a previously proposed Jump DTW algorithm in handling various types of discontinuities. We present extensive experimental results and analysis of the proposed algorithm.
Keywords