Long short-term memory for speaker generalization in supervised speech separation, The Journal of the Acoustical Society of America, vol.141, issue.6, pp.4705-4714, 2017. ,
An overview of informed audio source separation, 14th International Workshop on Image Analysis for Multimedia Interactive Services, pp.1-4, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00958661
Text-informed speech enhancement with deep neural networks, Sixteenth Annual Conference of the International Speech Communication Association, 2015. ,
SailAlign: Robust long speech-text alignment, Proc. of Workshop on New Tools and Methods for Very-Large Scale Phonetics Research, 2011. ,
Probabilistic kernels for improved text-to-speech alignment in long audio tracks, IEEE Signal Processing Letters, vol.23, issue.1, pp.126-129, 2015. ,
Weakly informed audio source separation, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02280472
Multiple-target deep learning for LSTM-RNN based speech enhancement, Hands-free Speech Communications and Microphone Arrays, pp.136-140, 2017. ,
A fully convolutional neural network for speech enhancement, Proc. Interspeech, pp.1993-1997, 2017.
SEGAN: Speech enhancement generative adversarial network, pp.3642-3646, 2017. ,
Text-informed audio source separation using nonnegative matrix partial co-factorization, IEEE International Workshop on Machine Learning for Signal Processing, pp.1-6, 2013. ,
Phoneme-specific speech separation, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.146-150, 2016. ,
A phoneme-based pre-training approach for deep neural network with application to speech enhancement, IEEE International Workshop on Acoustic Signal Enhancement, pp.1-5, 2016. ,
A recursive algorithm for the forced alignment of very long audio segments, Fifth International Conference on Spoken Language Processing, 1998. ,
Montreal Forced Aligner: Trainable text-speech alignment using Kaldi, pp.498-502, 2017. ,
Attention-based models for speech recognition, Advances in Neural Information Processing Systems, pp.577-585, 2015. ,
Automatic singing transcription based on encoder-decoder recurrent neural networks with a weakly-supervised attention mechanism, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.161-165, 2019. ,
Effective approaches to attention-based neural machine translation, 2015. ,
Speech discrimination by dynamic programming, Cybernetics and Systems Analysis, vol.4, issue.1, pp.52-57, 1968. ,
The MUSDB18 corpus for music separation, 2017. ,
TIMIT acoustic-phonetic continuous speech corpus, Linguistic Data Consortium, 1993. ,
Adam: A method for stochastic optimization, 2014. ,
Performance measurement in blind audio source separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.14, pp.1462-1469, 2006. ,
URL : https://hal.archives-ouvertes.fr/inria-00544230
Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs, IEEE International Conference on Acoustics, Speech and Signal Processing, vol.2, pp.749-752, 2001. ,
An algorithm for intelligibility prediction of time-frequency weighted noisy speech, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, pp.2125-2136, 2011. ,