PeerJ Computer Science (Jul 2024)
Musical timbre style transfer with diffusion model
Abstract
In this work, we address the problem of timbre transfer in audio samples. The goal is to transfer the source audio’s timbre from one instrument to another while preserving the other musical elements, such as loudness, pitch, and melody, as faithfully as possible. Although image-to-image style transfer techniques have been applied to timbre and style transfer in music recordings, the results remain unsatisfactory: current timbre transfer models frequently produce samples with unrelated waveform artifacts that degrade the quality of the generated audio. Diffusion models have shown excellent performance in image generation and can produce high-quality images. Inspired by this, we propose a timbre transfer method based on the diffusion model. Specifically, we first convert the original audio waveform into a constant-Q transform (CQT) spectrogram and apply image-to-image translation to perform the timbre transfer. We then reconstruct the resulting CQT spectrogram into an audio waveform using the DiffWave model. We evaluate our model on both one-to-one and many-to-many timbre transfer tasks. The experimental results show that, compared with the baseline models, the proposed model performs well on both one-to-one and many-to-many timbre transfer tasks, representing a promising step forward for this line of work.
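As an informal illustration of the front end of the pipeline described above, the following sketch computes a CQT spectrogram from a waveform using librosa; the diffusion-based image-to-image translation and the DiffWave reconstruction are only indicated as placeholder comments, and the file name and CQT parameters are assumptions rather than the paper's settings.

```python
import librosa
import numpy as np

# Load a source recording (path and sample rate are illustrative assumptions).
y, sr = librosa.load("source_instrument.wav", sr=22050)

# Constant-Q transform: log-spaced frequency bins, well suited to musical pitch.
C = librosa.cqt(y, sr=sr, hop_length=512, n_bins=84, bins_per_octave=12)

# Magnitude in dB, treated as a single-channel "image" for the diffusion model.
cqt_spectrogram = librosa.amplitude_to_db(np.abs(C), ref=np.max)

# 1) Feed cqt_spectrogram to the diffusion-based image-to-image model to obtain
#    a target-instrument spectrogram.
# 2) Pass the transferred spectrogram to a DiffWave vocoder to reconstruct the
#    output audio waveform.
```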
Keywords