DNN-FREE LOW-LATENCY ADAPTIVE SPEECH ENHANCEMENT BASED ON FRAME-ONLINE BEAMFORMING POWERED BY BLOCK-ONLINE FASTMNMF - Télécom Paris
Conference paper. Year: 2022

Abstract

This paper describes a practical dual-process speech enhancement system that adapts environment-sensitive frame-online beamforming (front-end) with help from environment-free block-online source separation (back-end). To use minimum variance distortionless response (MVDR) beamforming, one may train a deep neural network (DNN) that estimates time-frequency masks used for computing the covariance matrices of sources (speech and noise). Backpropagation-based runtime adaptation of the DNN has been proposed to deal with mismatched training and test conditions. Alternatively, one may try to estimate the source covariance matrices directly with a state-of-the-art blind source separation method called fast multichannel non-negative matrix factorization (FastMNMF). In practice, however, neither the DNN nor FastMNMF can be updated in a frame-online manner due to their computationally expensive iterative nature. Our DNN-free system leverages the posteriors of the latest source spectrograms given by block-online FastMNMF to derive the current source covariance matrices for frame-online beamforming. The evaluation shows that our frame-online system can quickly respond to scene changes caused by interfering speaker movements and outperforms an existing block-online system with DNN-based beamforming by 5.0 points in terms of the word error rate.
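
The abstract does not state the beamforming formulas, so the following is only a minimal sketch of how MVDR weights are commonly obtained from speech and noise covariance matrices for a single frequency bin. It assumes the widely used covariance-based form w = (R_n^{-1} R_s) u / tr(R_n^{-1} R_s), where u selects a reference microphone. The function names (masked_covariance, mvdr_weights, apply_beamformer) and the use of NumPy are illustrative assumptions, not the authors' implementation, and the per-frame weights here stand in for either DNN masks or separation posteriors.

```python
import numpy as np


def masked_covariance(X, mask):
    """Weighted spatial covariance for one frequency bin.

    X    : (T, M) complex STFT coefficients, T frames by M microphones.
    mask : (T,) non-negative weights (hypothetical: a mask or posterior).
    Returns the (M, M) matrix sum_t mask[t] * x_t x_t^H / sum_t mask[t].
    """
    weighted = mask[:, None] * X
    return weighted.T @ X.conj() / mask.sum()


def mvdr_weights(R_speech, R_noise, ref_mic=0):
    """MVDR weights from speech and noise covariance matrices (assumed form).

    Solves R_noise @ G = R_speech rather than forming an explicit inverse,
    then normalises the reference-microphone column of G by trace(G).
    """
    G = np.linalg.solve(R_noise, R_speech)
    return G[:, ref_mic] / np.trace(G)


def apply_beamformer(w, x):
    """Beamformer output w^H x for one multichannel frame x of shape (M,)."""
    return np.vdot(w, x)  # np.vdot conjugates its first argument
```

In a frame-online setting, the weights would be recomputed whenever updated covariance estimates become available and then applied to each incoming frame with apply_beamformer. How the paper derives those covariances from the block-online FastMNMF posteriors is not specified in the abstract.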
Main file
2207.10934.pdf (635.16 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03821095, version 1 (19-10-2022)

Identifiers

  • HAL Id: hal-03821095, version 1

Cite

Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii. DNN-FREE LOW-LATENCY ADAPTIVE SPEECH ENHANCEMENT BASED ON FRAME-ONLINE BEAMFORMING POWERED BY BLOCK-ONLINE FASTMNMF. 17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022), 2022, Bamberg, Germany. ⟨hal-03821095⟩