IEEE Access (Jan 2023)

Motif Transformer: Generating Music With Motifs

  • Heng Wang,
  • Sen Hao,
  • Cong Zhang,
  • Xiaohu Wang,
  • Yilin Chen

DOI
https://doi.org/10.1109/ACCESS.2023.3287271
Journal volume & issue
Vol. 11
pp. 63197–63204

Abstract


Music is composed of regular sound waves that are typically ordered and contain many repetitive structures: important notes, chords, and music fragments often appear repeatedly. Such repeated fragments (referred to as motifs) are usually the soul of a song. However, most music produced by existing music generation methods lacks the distinct motifs found in real music. This study proposes a novel multi-encoder model called Motif Transformer to generate music containing more motifs. The model is built on an encoder-decoder framework comprising an original encoder, a bidirectional long short-term memory-attention encoder (abbreviated as bilstm-attention encoder), and a gated decoder. The original encoder is taken from the Transformer's encoder, while the bilstm-attention encoder is constructed from a bidirectional long short-term memory network (BiLSTM) and an attention mechanism. Both encoders encode the motifs and pass the resulting representations to the gated decoder. The gated decoder decodes the full music input together with the information passed by the encoders and, through the gating mechanism, strengthens the model's ability to capture the motifs of the music, so that the generated music contains clearly repeated fragments. In addition, to better measure a model's ability to generate motifs, this study proposes an evaluation metric called used motifs. Experiments on multiple music-domain metrics show that the proposed model generates smoother and more pleasant music, and that the generated music contains more motifs.
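
To make the dual-encoder and gated-decoder design concrete, the following PyTorch sketch illustrates one possible wiring: the motif is encoded both by a standard Transformer encoder and by a BiLSTM-attention encoder, and each decoder layer mixes cross-attention over the two encoder memories with a learned sigmoid gate. All class names, layer sizes, and the exact gating formula below are assumptions made for illustration; they are not taken from the paper.

import torch
import torch.nn as nn


class BiLSTMAttentionEncoder(nn.Module):
    # Encodes the motif with a BiLSTM followed by attention weights over the
    # time steps (assumed form of the paper's "bilstm-attention encoder").
    def __init__(self, d_model):
        super().__init__()
        self.bilstm = nn.LSTM(d_model, d_model // 2, batch_first=True, bidirectional=True)
        self.score = nn.Linear(d_model, 1)

    def forward(self, motif_emb):                       # (B, Lm, d)
        h, _ = self.bilstm(motif_emb)                   # (B, Lm, d)
        w = torch.softmax(self.score(h), dim=1)         # attention weight per motif step
        return h * w                                    # re-weighted motif memory


class GatedDecoderLayer(nn.Module):
    # A Transformer decoder layer with two cross-attention branches, one per
    # encoder, fused by a learned sigmoid gate (assumed fusion scheme).
    def __init__(self, d_model, nhead=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.cross_a = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.cross_b = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.n1, self.n2, self.n3 = nn.LayerNorm(d_model), nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x, mem_a, mem_b, tgt_mask=None):
        x = self.n1(x + self.self_attn(x, x, x, attn_mask=tgt_mask)[0])
        ca = self.cross_a(x, mem_a, mem_a)[0]           # attend to the Transformer-encoder memory
        cb = self.cross_b(x, mem_b, mem_b)[0]           # attend to the BiLSTM-attention memory
        g = torch.sigmoid(self.gate(torch.cat([ca, cb], dim=-1)))
        x = self.n2(x + g * ca + (1 - g) * cb)          # gated mixture of the two encoders
        return self.n3(x + self.ffn(x))


class MotifTransformerSketch(nn.Module):
    # Toy end-to-end wiring: motif -> two encoders; full event sequence -> gated decoder.
    def __init__(self, vocab_size, d_model=256, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.original_encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.motif_encoder = BiLSTMAttentionEncoder(d_model)
        self.decoder = nn.ModuleList([GatedDecoderLayer(d_model) for _ in range(n_layers)])
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, motif_tokens, music_tokens):
        mem_a = self.original_encoder(self.embed(motif_tokens))
        mem_b = self.motif_encoder(self.embed(motif_tokens))
        L = music_tokens.size(1)
        causal = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
        x = self.embed(music_tokens)
        for layer in self.decoder:
            x = layer(x, mem_a, mem_b, tgt_mask=causal)
        return self.out(x)                              # next-token logits over the event vocabulary


# Example forward pass on random token ids (hypothetical event vocabulary of size 512).
model = MotifTransformerSketch(vocab_size=512)
motif = torch.randint(0, 512, (1, 16))                  # a short repeated fragment
music = torch.randint(0, 512, (1, 128))                 # the music sequence generated so far
logits = model(motif, music)                            # shape (1, 128, 512)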

Keywords