IEEE Access (Jan 2024)

Real-Time Emotion-Based Piano Music Generation Using Generative Adversarial Network (GAN)

  • Lijun Zheng,
  • Chenglong Li

DOI
https://doi.org/10.1109/ACCESS.2024.3414673
Journal volume & issue
Vol. 12
pp. 87489 – 87500

Abstract

Automatic creation of real-time, emotion-based piano music remains a challenge for deep learning models. While Generative Adversarial Networks (GANs) have shown promise, existing methods can struggle to generate musically coherent pieces and often require complex manual configuration. This paper proposes a novel model, the Learning Automata-based Self-Attention Generative Adversarial Network (LA-SAGAN), to address these limitations. The proposed model combines a GAN with a Self-Attention (SA) mechanism to reach this goal. The benefits of using SA modules in the GAN architecture are twofold. First, the SA mechanism yields music pieces with a homogeneous structure, meaning that long-distance dependencies in the generated outputs are taken into account. Second, the SA mechanism exploits the emotional features of the input when producing the output, so the generated pieces follow the desired genre or theme. To control the complexity of the proposed model and optimize its structure, a set of Learning Automata (LA) is used to determine the activity state of each SA module. To this end, an iterative algorithm based on the cooperation of the LAs is introduced, which optimizes the model by deactivating unnecessary SA modules. The efficiency of the proposed model in generating piano music has been evaluated. Evaluations demonstrate LA-SAGAN's effectiveness: at least a 14.47% improvement in entropy (diversity) and improvements in precision (at least 2.47%) and recall (at least 2.13%). Moreover, human evaluation confirms superior musical coherence and adherence to emotional cues.
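To make the described architecture concrete, the sketch below shows one way a gated self-attention block and a two-action learning automaton could be combined, in the spirit of the abstract. It is not the authors' code: the module names (GatedSelfAttention1d, ModuleAutomaton), the tensor layout, and the linear reward-inaction update rule are illustrative assumptions, since the paper's exact design is not given here.

    # Minimal sketch (not the authors' implementation): a SAGAN-style
    # self-attention block whose activity can be switched off by a simple
    # learning automaton (LA). Shapes and the LA update rule are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class GatedSelfAttention1d(nn.Module):
        """Self-attention over a (batch, channels, time) feature map with an on/off gate."""

        def __init__(self, channels: int):
            super().__init__()
            self.query = nn.Conv1d(channels, channels // 8, 1)
            self.key = nn.Conv1d(channels, channels // 8, 1)
            self.value = nn.Conv1d(channels, channels, 1)
            self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight
            self.active = True  # activity state, set by the learning automaton

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            if not self.active:            # deactivated module: pass features through
                return x
            q = self.query(x).transpose(1, 2)           # (B, T, C//8)
            k = self.key(x)                             # (B, C//8, T)
            attn = F.softmax(torch.bmm(q, k), dim=-1)   # (B, T, T) long-range weights
            v = self.value(x)                           # (B, C, T)
            out = torch.bmm(v, attn.transpose(1, 2))    # attend over the whole sequence
            return self.gamma * out + x                 # residual connection


    class ModuleAutomaton:
        """Two-action learning automaton deciding whether one SA module stays active."""

        def __init__(self, lr: float = 0.1):
            self.p_active = 0.5  # probability of choosing the "active" action
            self.lr = lr

        def choose(self) -> bool:
            return torch.rand(1).item() < self.p_active

        def reward(self, chose_active: bool) -> None:
            # Linear reward-inaction: reinforce only the rewarded action.
            if chose_active:
                self.p_active += self.lr * (1.0 - self.p_active)
            else:
                self.p_active -= self.lr * self.p_active

In such a setup, each SA block in the generator would be paired with one automaton; after evaluating generated samples, the configurations that improve the quality metric are rewarded, so automata controlling unnecessary SA modules gradually drive them toward deactivation.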

Keywords