Structure discovery & inference – Rethinking MIR evaluation methods
Our goal in this work is twofold: to develop an intelligent listening and prediction module for chord sequences, and to propose an adapted evaluation of the associated Music Information Retrieval (MIR) tasks, namely the real-time extraction of musical chord labels from a live audio stream and the prediction of a possible continuation of the extracted symbolic sequence.
To this end, we propose two independent modules: one extracts chords in real time, and the other predicts a possible continuation of an input chord sequence. Both modules are available online, along with tutorials. They are intended for use in co-creative contexts, for example through integration within the DYCI2 library.
Chords exhibit strong inherent hierarchical and functional relationships. However, most MIR research focuses on the performance of chord-based statistical models, without considering music-based evaluation or learning. Indeed, usual evaluations rely on a binary qualification of the classification outputs (right chord predicted versus wrong chord predicted).
Our research, detailed in the articles listed below, therefore introduces a specifically tailored chord analyzer that measures the performance of chord-based models in terms of a functional qualification of the classification outputs (taking into account the harmonic function of the chords). Then, in order to introduce musical knowledge into the learning process for the automatic chord extraction task, we also present a specific musical distance for comparing predicted and labeled chords. Finally, we investigate the impact of including high-level metadata (such as information on key or downbeat position) when learning to predict chord sequences. We show that a model can obtain better performance in terms of accuracy or perplexity while still producing biased results. Conversely, a model with a lower accuracy score can produce errors that are more musically meaningful. Performing a goal-oriented evaluation therefore allows a better understanding of the results and a better-adapted design of MIR models.
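To make the contrast between binary and functional evaluation concrete, here is a minimal Python sketch. It is not the chord analyzer described in our articles: the `root:quality` label format, the function names, and the simplified mapping from scale degrees to harmonic functions in a major key are all assumptions chosen for illustration.

```python
# Illustrative sketch only: a toy functional qualification of chord errors.
# The label format ("C:maj"), the helper names, and the degree-to-function
# table below are assumptions, not the analyzer from the articles.

NOTE_TO_PC = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
              "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

# Simplified harmonic functions in a major key, indexed by the number of
# semitones between the key tonic and the chord root. Chord quality is
# deliberately ignored here to keep the example short.
DEGREE_FUNCTION = {0: "tonic", 4: "tonic", 9: "tonic",
                   2: "subdominant", 5: "subdominant",
                   7: "dominant", 11: "dominant"}


def harmonic_function(label, key_root):
    """Return the (simplified) harmonic function of a chord in a major key."""
    root, _quality = label.split(":")
    degree = (NOTE_TO_PC[root] - NOTE_TO_PC[key_root]) % 12
    return DEGREE_FUNCTION.get(degree, "non-diatonic")


def qualify_predictions(references, predictions, key_root):
    """Count exact matches, same-function substitutions, and other errors."""
    counts = {"correct": 0, "same_function": 0, "other_error": 0}
    for ref, pred in zip(references, predictions):
        if ref == pred:
            counts["correct"] += 1
            continue
        ref_fn = harmonic_function(ref, key_root)
        pred_fn = harmonic_function(pred, key_root)
        if ref_fn != "non-diatonic" and ref_fn == pred_fn:
            counts["same_function"] += 1
        else:
            counts["other_error"] += 1
    return counts


def binary_accuracy(references, predictions):
    """Usual evaluation: fraction of exactly matching chord labels."""
    hits = sum(ref == pred for ref, pred in zip(references, predictions))
    return hits / len(references)


references = ["C:maj", "F:maj", "G:maj", "A:min"]
predictions = ["C:maj", "D:min", "G:maj", "F:maj"]
print(binary_accuracy(references, predictions))           # 0.5
print(qualify_predictions(references, predictions, "C"))
```

In this toy example, binary accuracy treats both mistakes identically, whereas the functional count separates the D:min for F:maj substitution (both subdominant in C major) from the F:maj for A:min error, which changes the harmonic function.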
Some related articles
Tristan Carsault, Jérôme Nika, Philippe Esling, and Gérard Assayag. 2021. “Combining Real-Time Extraction and Prediction of Musical Chord Progressions for Creative Applications” Electronics 10, no. 21: 2634. https://doi.org/10.3390/electronics10212634
Tristan Carsault, Andrew McLeod, Philippe Esling, Jérôme Nika, Eita Nakamura, Kazuyoshi Yoshii. Multi-Step Chord Sequence Prediction Based on Aggregated Multi-Scale Encoder-Decoder Network. IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), 2019.
Tristan Carsault, Jérôme Nika, Philippe Esling. Using musical relationships between chord labels in automatic chord extraction tasks. International Society for Music Information Retrieval Conference (ISMIR 2018), Sep 2018, Paris, France.