Conference papers

Downbeat Detection with Conditional Random Fields and Deep Learned Features

Simon Durand 1, 2, Slim Essid 1, 2
1 S2A - Signal, Statistique et Apprentissage
LTCI - Laboratoire Traitement et Communication de l'Information
Abstract : In this paper, we introduce a novel Conditional Random Field (CRF) system that detects the downbeat sequence of musical audio signals. Feature functions are computed from four deep learned representations based on harmony, rhythm, melody and bass content to take advantage of the high-level and multi-faceted aspect of this task. Downbeats being dynamic, the powerful CRF classification system allows us to combine our features with an adapted temporal model in a fully data-driven fashion. Some meters being under-represented in our training set, we show that data augmentation enables a statistically significant improvement of the results by taking into account class imbalance. An evaluation of different configurations of our system on nine datasets shows its efficiency and potential over a heuristic-based approach and four downbeat tracking algorithms.
Contributor : TelecomParis HAL
Submitted on : Saturday, September 14, 2019 - 6:52:17 PM
Last modification on : Wednesday, November 3, 2021 - 6:20:32 AM


  • HAL Id : hal-02288480, version 1


Simon Durand, Slim Essid. Downbeat Detection with Conditional Random Fields and Deep Learned Features. International Society for Music Information Retrieval (ISMIR), Aug 2016, New York City, United States. pp.386-392. ⟨hal-02288480⟩


