Hog and Subband power distribution image features for acoustic scene classification - Télécom Paris
Conference paper, Year: 2015

Hog and Subband power distribution image features for acoustic scene classification

Abstract

Acoustic scene classification is a difficult problem, mostly because of the high density of events occurring concurrently in audio scenes. To capture the occurrences of these events, we propose to use the Subband Power Distribution (SPD) as a feature. We extract it by computing the histogram of amplitude values in each frequency band of a spectrogram image. The SPD allows us to model the density of events in each frequency band. Our method is evaluated on a large acoustic scene dataset using support vector machines. We outperform previous methods when using the SPD in conjunction with the histogram of gradients (HOG). To improve further, we also consider an approximation of the earth mover's distance kernel, which compares histograms in a more suitable way. Using this so-called Sinkhorn kernel improves the results for most feature configurations. The best configuration reaches an F1 score of 92.8%.
No file deposited
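The two ideas named in the abstract, per-band amplitude histograms and an entropy-regularized approximation of the earth mover's distance, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the number of bins, the dB range, the regularization strength, and the iteration count are all assumptions chosen for the example, and the abstract does not specify how the Sinkhorn kernel is built from the distance.

```python
import numpy as np


def subband_power_distribution(spec_db, n_bins=32, value_range=(-80.0, 0.0)):
    """Histogram the amplitude values of each frequency band.

    spec_db: array of shape (n_bands, n_frames), assumed to be a
    log-magnitude spectrogram in dB (an assumption of this sketch).
    Returns an (n_bands, n_bins) array; each row sums to 1.
    """
    lo, hi = value_range
    edges = np.linspace(lo, hi, n_bins + 1)
    clipped = np.clip(spec_db, lo, hi)  # keep every frame inside the bin range
    spd = np.empty((spec_db.shape[0], n_bins))
    for band in range(spec_db.shape[0]):
        counts, _ = np.histogram(clipped[band], bins=edges)
        spd[band] = counts / counts.sum()
    return spd


def sinkhorn_distance(p, q, cost, reg=0.1, n_iter=200):
    """Entropy-regularized transport cost between two histograms.

    p, q: 1-D histograms that each sum to 1.
    cost: (len(p), len(q)) ground-cost matrix between bins.
    reg and n_iter are illustrative values, not taken from the paper.
    """
    kernel = np.exp(-cost / reg)
    u = np.ones_like(p, dtype=float)
    for _ in range(n_iter):
        # Alternate scaling so the plan's marginals match q, then p.
        v = q / (kernel.T @ u)
        u = p / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]
    return float(np.sum(plan * cost))
```

A distance of this kind is typically turned into a similarity for a support vector machine with something like `np.exp(-gamma * distance)` passed as a precomputed kernel; the exact form used in the paper is not stated in the abstract, so treat that step as an assumption as well.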

Dates and versions

hal-02287266, version 1 (13-09-2019)

Identifiers

  • HAL Id: hal-02287266, version 1

Cite

Victor Bisot, Slim Essid, Gael Richard. Hog and Subband power distribution image features for acoustic scene classification. EUSIPCO, Sep 2015, Nice, France. pp.719-723. ⟨hal-02287266⟩
41 views
0 downloads
