IEEE Access (Jan 2024)

Polyphonic Piano Music Transcription System Exploiting Mutual Correlations of Different Musical Note States

  • Taehyeon Kim,
  • Donghyeon Lee,
  • Man-Je Kim,
  • Chang Wook Ahn

DOI
https://doi.org/10.1109/ACCESS.2024.3425167
Journal volume & issue
Vol. 12
pp. 93689 – 93700

Abstract

Polyphonic piano music transcription systems are generally designed to estimate pitch activity, together with various note states, for each audio frame. Although such systems have many uses in the Music Information Retrieval (MIR) field, precisely predicting the various note states remains a challenging task because of the complicated structure of note events. Accordingly, neural network architectures have evolved to facilitate the joint prediction of the note states. However, recent models have not efficiently exploited the mutual correlations among different note states. The key contribution of our work is that we verify the mutual correlations between different note states and reflect them in the model architecture, which enables the transcription system to recognize note events more clearly and produce high-quality real-world results. We propose a kernel-sharing feature extractor module that exploits these mutual correlations in the feature extraction step. Moreover, so that the system can recognize the shape of the pitch envelope, we add connections between the note state-specific detector modules in the note state detection step. The efficacy of our architecture is thoroughly validated in a series of experiments on the publicly available MAESTRO datasets released by Google Magenta. Furthermore, ablation studies demonstrate these mutual correlations and show the impact and significance of the proposed approach.
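
The two architectural ideas named in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract gives no layer sizes, kernel shapes, or wiring details, so everything below is an assumption made for illustration, including the names `conv1d_same`, `init_head`, and `transcribe_frame`, the single 3-tap kernel, the random weights, and the choice to feed the onset and offset outputs into the frame detector. The sketch only shows one feature extractor whose kernel is shared by all note-state branches, followed by state-specific detectors that are connected to each other.

```python
import math
import random


def conv1d_same(frame, kernel):
    """Zero-padded 1-D cross-correlation along frequency, followed by ReLU."""
    half = len(kernel) // 2
    out = []
    for i in range(len(frame)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(frame):
                acc += w * frame[j]
        out.append(max(acc, 0.0))
    return out


def linear_sigmoid(x, weights, bias):
    """One dense layer with a sigmoid, giving a probability per pitch."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
        for row, b in zip(weights, bias)
    ]


def init_head(rng, n_in, n_out):
    """Hypothetical detector head with small random weights (untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def transcribe_frame(frame, shared_kernel, heads):
    # Assumption: one kernel is applied once and its features are reused
    # by every note-state branch ("kernel sharing" in this toy sense).
    feats = conv1d_same(frame, shared_kernel)
    onset = linear_sigmoid(feats, *heads["onset"])
    offset = linear_sigmoid(feats, *heads["offset"])
    # Assumption: the frame detector also sees the onset and offset outputs,
    # standing in for the connections between state-specific detectors.
    active = linear_sigmoid(feats + onset + offset, *heads["frame"])
    return {"onset": onset, "offset": offset, "frame": active}


rng = random.Random(0)
n_bins, n_pitches = 16, 4  # toy sizes; a real system would use 88 piano keys
shared_kernel = [0.25, 0.5, 0.25]
heads = {
    "onset": init_head(rng, n_bins, n_pitches),
    "offset": init_head(rng, n_bins, n_pitches),
    "frame": init_head(rng, n_bins + 2 * n_pitches, n_pitches),
}
spectrogram_frame = [rng.random() for _ in range(n_bins)]
out = transcribe_frame(spectrogram_frame, shared_kernel, heads)
```

Here `out` holds one probability per pitch for each of the three note states. Because the weights are random and untrained, the values carry no musical meaning; the point is only the data flow from shared features to interconnected detectors.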

Keywords