Applied Sciences (Mar 2025)

Full-Scale Piano Score Recognition

  • Xiang-Yi Zhang
  • Jia-Lien Hsu

DOI
https://doi.org/10.3390/app15052857
Journal volume & issue
Vol. 15, no. 5
p. 2857

Abstract

Sheet music is one of the most efficient methods for storing music. However, a large amount of sheet music is stored on paper rather than in a computer-readable format. Digitizing sheet music is therefore an essential task, so that the encoded music can be used effectively for tasks such as editing or playback. A few studies have focused on recognizing sheet music images with simpler structures, such as monophonic scores or more modern scores that contain only clefs, time signatures, key signatures, and notes. In this paper, we focus instead on classical sheet music, which additionally contains dynamics symbols and articulation signs. To this end, this study augments the GrandStaff dataset by concatenating single-line scores into multi-line scores and by adding various classical music dynamics symbols not included in the original GrandStaff dataset. Given the pages of a full-scale piano score, our approach first applies three YOLOv8 models to perform three tasks: (1) converting a full page of sheet music into multiple single-line scores; (2) recognizing the classes and absolute positions of dynamics symbols in the score; and (3) finding the relative positions of dynamics symbols in the score. The identified dynamics symbols are then removed from the original score, and the remaining score serves as the input to a Convolutional Recurrent Neural Network (CRNN). The CRNN outputs KERN notation (a core pitch/duration representation for common practice music notation) without dynamics symbols. The final output is obtained by combining the CRNN output with the relative and absolute position information of the dynamics symbols. The results show that the assistance of YOLOv8 yields a significant improvement in accuracy.
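
The final combination step described above can be illustrated with a minimal sketch. The abstract does not state how the dynamics are encoded in the final output, so everything here is an assumption made for illustration: the `Dynamic` record, the `merge_dynamics` function, the use of an event index as the "relative position", and the choice to write the symbols into a parallel Humdrum `**dynam` column next to the KERN data. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dynamic:
    """A detected dynamics symbol (hypothetical structure, not from the paper)."""
    label: str        # symbol class, e.g. "p", "f", "mf"
    event_index: int  # assumed relative position: index of the note event it attaches to


def merge_dynamics(kern_lines, dynamics):
    """Append a dynamics column to KERN-style lines.

    Only data records (notes and rests) count as events; interpretation
    lines ('*'), barlines ('=') and comments ('!') get a matching filler
    token so that the added column stays aligned.
    """
    by_event = {d.event_index: d.label for d in dynamics}
    merged = []
    event = 0
    for line in kern_lines:
        if line.startswith("!!"):
            # Global comments are not split into columns; keep them as-is.
            merged.append(line)
            continue
        if line.startswith("**"):
            extra = "**dynam"             # exclusive interpretation for the new column
        elif line.startswith("*-"):
            extra = "*-"                  # column terminator
        elif line.startswith("*"):
            extra = "*"                   # null interpretation
        elif line.startswith("="):
            extra = line.split("\t")[0]   # repeat the barline token
        elif line.startswith("!"):
            extra = "!"                   # empty local comment
        else:
            extra = by_event.get(event, ".")  # "." is the null data token
            event += 1
        merged.append(f"{line}\t{extra}")
    return merged


# Stand-in for the CRNN output of one single-line score, without dynamics.
kern = ["**kern", "*clefG2", "*M4/4", "=1", "4c", "4d", "4e", "4f", "=2", "*-"]
# Stand-in for the detector output: "p" on the first note, "f" on the third.
detected = [Dynamic("p", 0), Dynamic("f", 2)]

merged = merge_dynamics(kern, detected)
print("\n".join(merged))
```

Keeping the dynamics in a separate column leaves the CRNN output untouched, so the two recognition stages can be evaluated independently under this assumed encoding.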

Keywords