Unsupervised Music Source Separation Using Differentiable Parametric Source Models

Kilian Schulze-Forster; Gaël Richard; Liam Kelley; Clement Doire; Roland Badeau

doi:10.1109/TASLP.2023.3252272

Journal article. IEEE/ACM Transactions on Audio, Speech and Language Processing. Year: 2023

Kilian Schulze-Forster

Role: Author
PersonId: 1239155
ORCID: 0000-0001-8397-7914

Gaël Richard

Role: Author
PersonId: 14146
IdHAL: gael-richard
IdRef: 094977208

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Liam Kelley

Role: Author

Clement Doire

Role: Author
PersonId: 1239156
ORCID: 0000-0002-8739-4510

Roland Badeau

Role: Author
PersonId: 1121
IdHAL: rbadeau
ORCID: 0000-0002-9630-6877
IdRef: 106938134

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Abstract

Supervised deep learning approaches to underdetermined audio source separation achieve state-of-the-art performance but require a dataset of mixtures along with their corresponding isolated source signals. Such datasets can be extremely costly to obtain for musical mixtures. This raises a need for unsupervised methods. We propose a novel unsupervised model-based deep learning approach to musical source separation. Each source is modelled with a differentiable parametric source-filter model. A neural network is trained to reconstruct the observed mixture as a sum of the sources by estimating the source models' parameters given their fundamental frequencies. At test time, soft masks are obtained from the synthesized source signals. The experimental evaluation on a vocal ensemble separation task shows that the proposed method outperforms learning-free methods based on nonnegative matrix factorization and a supervised deep learning baseline. Integrating domain knowledge in the form of source models into a data-driven method leads to high data efficiency: the proposed approach achieves good separation quality even when trained on less than three minutes of audio. This work makes powerful deep-learning-based separation usable in scenarios where training data with ground truth is expensive or nonexistent.
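
The test-time masking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that magnitude spectrograms of the synthesized sources are already available and that the masks are plain magnitude ratios; the abstract does not specify the exact mask definition, so this choice is an assumption. The function names `soft_masks` and `apply_masks`, the `eps` parameter, and the random stand-in data are hypothetical and not taken from the paper.

```python
import numpy as np


def soft_masks(source_mags, eps=1e-8):
    """Ratio masks from magnitude spectrograms of synthesized sources.

    source_mags: non-negative array of shape (n_sources, n_freq, n_frames).
    Returns masks of the same shape that sum to roughly 1 over the source axis.
    The magnitude-ratio form is an assumption, not the paper's stated formula.
    """
    source_mags = np.asarray(source_mags, dtype=float)
    total = source_mags.sum(axis=0, keepdims=True)
    # eps guards against division by zero in silent time-frequency bins.
    return source_mags / (total + eps)


def apply_masks(mixture_stft, masks):
    """Multiply the complex mixture STFT by each mask, one estimate per source."""
    return masks * mixture_stft[np.newaxis, ...]


# Random stand-in data (hypothetical): 4 sources, 5 frequency bins, 3 frames.
rng = np.random.default_rng(0)
n_sources, n_freq, n_frames = 4, 5, 3
synth_mags = rng.random((n_sources, n_freq, n_frames)) + 0.1
mixture = rng.standard_normal((n_freq, n_frames)) \
    + 1j * rng.standard_normal((n_freq, n_frames))

masks = soft_masks(synth_mags)
estimates = apply_masks(mixture, masks)
```

Because the masks sum to approximately one in every time-frequency bin, the masked estimates add back up to the observed mixture, which is the usual motivation for filtering the mixture rather than using the synthesized signals directly.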

Domains

Signal and Image Processing [eess.SP]

Main file

Unsupervised_Music_Source_Separation_Using_Differentiable_Parametric_Source_Models-3.pdf (1.84 MB)

Origin: Files produced by the author(s)


https://telecom-paris.hal.science/hal-04038023

Submitted on: Tuesday, March 28, 2023, 14:08:50

Last modified on: Wednesday, October 11, 2023, 16:05:44

Long-term archiving on: Thursday, June 29, 2023, 19:04:28

Dates and versions

hal-04038023, version 1 (28-03-2023)

Identifiers

HAL Id: hal-04038023, version 1
arXiv: 2201.09592
DOI: 10.1109/TASLP.2023.3252272

Cite

Kilian Schulze-Forster, Gaël Richard, Liam Kelley, Clement Doire, Roland Badeau. Unsupervised Music Source Separation Using Differentiable Parametric Source Models. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2023, 31, pp.1276-1289. ⟨10.1109/TASLP.2023.3252272⟩. ⟨hal-04038023⟩


Collections

INSTITUT-TELECOM LTCI IDS S2A IP_PARIS

63 views

161 downloads
