Transactions of the International Society for Music Information Retrieval (Nov 2021)

On Evaluation of Inter- and Intra-Rater Agreement in Music Recommendation

  • Arthur Flexer,
  • Taric Lallai,
  • Katja Rašl

DOI
https://doi.org/10.5334/tismir.107
Journal volume & issue
Vol. 4, no. 1

Abstract

Our work is concerned with the subjective perception of music similarity in the context of music recommendation. We present two user studies that explore inter- and intra-rater agreement in quantifying the general similarity between pieces of recommended music. In contrast to previous efforts, our test participants are of more uniform age and share a comparable musical background, in order to lower variation within the participant group. The first study uses carefully curated song material from five distinct genres, while the second uses songs from a single genre only; almost all songs in both studies were previously unknown to the test participants. Repeating the listening tests with a two-week lag shows that intra-rater agreement is higher than inter-rater agreement in both studies. Agreement in the single-genre study is lower, since the genre of songs seems to be a major factor in judging similarity between songs. The mood of raters at test time is found to influence intra-rater agreement. We discuss the impact of our results on the evaluation of music recommenders and question the validity of experiments on general music similarity.

Keywords