WaveTransfer: A Flexible End-to-end Multi-instrument Timbre Transfer with Diffusion
Conference paper, 2024


Abstract

As diffusion-based deep generative models gain prevalence, researchers are actively investigating their potential applications across various domains, including music synthesis and style alteration. Within this work, we are interested in timbre transfer, a process that involves seamlessly altering the instrumental characteristics of musical pieces while preserving essential musical elements. This paper introduces WaveTransfer, an end-to-end diffusion model designed for timbre transfer. We specifically employ the bilateral denoising diffusion model (BDDM) for noise scheduling search. Our model is capable of conducting timbre transfer between audio mixtures as well as individual instruments. Notably, it exhibits versatility in that it accommodates multiple types of timbre transfer between unique instrument pairs in a single model, eliminating the need for separate model training for each pairing. Furthermore, unlike recent works limited to 16 kHz, WaveTransfer can be trained at various sampling rates, including the industry-standard 44.1 kHz, a feature of particular interest to the music community.
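To make the diffusion mechanics behind the abstract concrete, here is a minimal sketch (not the authors' code) of one reverse step of a denoising diffusion model conditioned on a target-instrument embedding. All names (`predict_noise`, `cond`, the toy dummy predictor) are illustrative assumptions; WaveTransfer additionally uses BDDM to search the noise schedule rather than fixing a linear one as done below.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fixed linear noise schedule; BDDM would instead search/learn this."""
    return np.linspace(beta_start, beta_end, num_steps)

def reverse_step(x_t, t, betas, predict_noise, cond, rng):
    """One DDPM ancestral-sampling step: x_t -> x_{t-1}.

    predict_noise(x_t, t, cond) stands in for the conditional network
    that would receive the target-instrument conditioning.
    """
    alphas = 1.0 - betas
    alpha_bar_t = np.prod(alphas[: t + 1])
    eps = predict_noise(x_t, t, cond)  # network's estimate of the added noise
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # final step is noise-free
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

# Toy usage: denoise a random "waveform frame" with a dummy predictor
# that ignores its conditioning (purely to exercise the loop).
rng = np.random.default_rng(0)
betas = linear_beta_schedule(50)
x = rng.standard_normal(1024)
dummy_predict = lambda x_t, t, c: np.zeros_like(x_t)
for t in reversed(range(len(betas))):
    x = reverse_step(x, t, betas, dummy_predict, cond=None, rng=rng)
```

An end-to-end model in the paper's sense operates directly on such waveform samples, so no separate vocoder stage is needed after sampling.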
Main file: MLSP_2024 _WaveTransfer.pdf (2.47 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04685184, version 1 (05-09-2024)

Identifiers

  • HAL Id: hal-04685184, version 1

Cite

Teysir Baoueb, Xiaoyu Bie, Hicham Janati, Gael Richard. WaveTransfer: A Flexible End-to-end Multi-instrument Timbre Transfer with Diffusion. 2024 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2024), Sep 2024, London, United Kingdom. ⟨hal-04685184⟩
