Convolutional neural networks for phoneme recognition

Cornelius Glackin; Julie Wall; Gérard Chollet; Nazim Dugan; Nigel Cannings

doi:10.5220/0006653001900195

Conference paper, Year: 2018

Cornelius Glackin

Role: Author

Intelligent Voice Ltd

Julie Wall

Role: Author

University of East London

Gérard Chollet

Role: Author
PersonId : 176991
IdHAL : gerard-chollet
ORCID : 0000-0003-4245-146X
IdRef : 078020824

Institut Polytechnique de Paris

Département Electronique et Physique

ARMEDIA

Intelligent Voice Ltd

Nazim Dugan

Role: Author

Intelligent Voice Ltd

Nigel Cannings

Role: Author

Intelligent Voice Ltd

Abstract

This paper presents a novel application of convolutional neural networks to phoneme recognition. The phonetic transcription of the TIMIT speech corpus is used to label spectrogram segments for training the convolutional neural network. A fixed-size window slides over the spectrogram of each TIMIT utterance, and each resulting spectrogram patch is assigned to the appropriate phone class by parsing TIMIT's phone transcription. The convolutional neural network is the standard GoogLeNet implementation, trained with mini-batch stochastic gradient descent. After training, phonetic rescoring is performed in the usual way to map the TIMIT phone set to the smaller standard set. Benchmark results are presented for comparison with other state-of-the-art approaches. Finally, conclusions and future directions for extending the approach are discussed.
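The patch-labelling step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`parse_phn`, `label_patches`, `fold`), the rule of labelling a patch by the phone at the window centre, the frame-to-sample conversion via a `hop` parameter, and the window sizes are all assumptions. The only corpus fact relied on is that a TIMIT `.phn` file lists one `start_sample end_sample phone` triple per line. `PARTIAL_FOLD` shows only a few entries of the commonly used folding to a reduced phone set; the abstract does not list the mapping it uses.

```python
import numpy as np

def parse_phn(text):
    """Parse TIMIT .phn content: one 'start_sample end_sample phone' per line."""
    segments = []
    for line in text.strip().splitlines():
        start, end, phone = line.split()
        segments.append((int(start), int(end), phone))
    return segments

def label_patches(spec, segments, hop, width, stride):
    """Slide a fixed-width window over spec (freq_bins x frames).

    Assumption: each patch takes the phone whose segment contains the
    sample at the window centre, with frame i mapped to sample i * hop.
    Patches whose centre falls outside every segment are skipped.
    """
    patches, labels = [], []
    n_frames = spec.shape[1]
    for first in range(0, n_frames - width + 1, stride):
        centre_sample = (first + width // 2) * hop
        phone = next((p for s, e, p in segments if s <= centre_sample < e), None)
        if phone is None:
            continue
        patches.append(spec[:, first:first + width])
        labels.append(phone)
    if not patches:
        return np.empty((0, spec.shape[0], width)), labels
    return np.stack(patches), labels

# Partial, illustrative folding table (hypothetical subset, not the full map).
PARTIAL_FOLD = {"ix": "ih", "ax": "ah", "h#": "sil", "pcl": "sil", "tcl": "sil"}

def fold(phone):
    """Map a TIMIT phone to its folded class; unlisted phones pass through."""
    return PARTIAL_FOLD.get(phone, phone)
```

For example, with a placeholder spectrogram `np.zeros((8, 10))`, the transcription `"0 800 h#\n800 1600 ix\n"`, and `hop=160, width=4, stride=2`, the sketch yields four patches of shape `(8, 4)`; the first two are labelled `h#` and the last two `ix`. Labelling by the window centre is only one option; a majority vote over the frames in the window would be an equally plausible rule.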

Keywords

ASR; Phonetic decoding

Domains

Computer Science [cs]; Signal and Image Processing [eess.SP]

https://telecom-paris.hal.science/hal-02287732

Submitted on: Friday 13 September 2019, 17:17:36

Last modified on: Thursday 21 December 2023, 11:26:09

Dates and versions

hal-02287732 , version 1 (13-09-2019)

Identifiers

HAL Id : hal-02287732 , version 1
DOI : 10.5220/0006653001900195

Cite

Cornelius Glackin, Julie Wall, Gérard Chollet, Nazim Dugan, Nigel Cannings. Convolutional neural networks for phoneme recognition. 7th International Conference on Pattern Recognition Applications and Methods, Jan 2018, Funchal, Portugal. pp.190-195, ⟨10.5220/0006653001900195⟩. ⟨hal-02287732⟩

Collections

INSTITUT-TELECOM TELECOM-SUDPARIS PARISTECH UNIV-PARIS-SACLAY
