
Learning music similarity from relative user ratings

Abstract

Computational modelling of music similarity is an increasingly important part of personalisation and optimisation in music information retrieval and research in music perception and cognition. The use of relative similarity ratings is a new and promising approach to modelling similarity that avoids well-known problems with absolute ratings. In this article, we use relative ratings from the MagnaTagATune dataset with new and existing variants of state-of-the-art algorithms and provide the first comprehensive and rigorous evaluation of this approach. We compare metric learning based on support vector machines (SVMs) and metric-learning-to-rank (MLR), including a diagonal variant (DMLR) and a novel weighted variant, and relative distance learning with neural networks (RDNN). We further evaluate the effectiveness of different high- and low-level audio features and genre data, as well as dimensionality reduction methods, weighting of similarity ratings, and different sampling methods. Our results show that music similarity measures learnt on relative ratings can be significantly better than a standard Euclidean metric, depending on the choice of learning algorithm, feature sets and application scenario. MLR and SVM outperform DMLR and RDNN, while MLR with weighted ratings leads to no further performance gain. Timbral and music-structural features are most effective, and all features jointly are significantly better than any other combination of feature sets. Sharing audio clips (but not the similarity ratings) between test and training sets improves performance, in particular for the SVM-based methods, which is useful for some application scenarios. A testing framework has been implemented in Matlab and made publicly available at http://mi.soi.city.ac.uk/datasets/ir2012framework so that these results are reproducible.
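To make the notion of a relative rating concrete, the following is a minimal sketch and not the authors' Matlab framework. It assumes a rating is a triple (i, j, k) meaning "clip i is more similar to clip j than to clip k", and checks how many such triples a weighted Euclidean distance satisfies. The feature values, the weights and the function names are made up for illustration.

```python
from math import sqrt


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance; all weights equal to 1 gives the standard metric."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def fraction_satisfied(features, constraints, weights):
    """Share of triples (i, j, k) for which d(i, j) < d(i, k) holds."""
    hits = sum(
        weighted_distance(features[i], features[j], weights)
        < weighted_distance(features[i], features[k], weights)
        for i, j, k in constraints
    )
    return hits / len(constraints)


# Toy data (hypothetical): three clips described by two features each.
features = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.1],
    "c": [0.2, 0.9],
}

# One relative rating: "a is more similar to b than to c".
constraints = [("a", "b", "c")]

# The standard Euclidean metric violates the rating on this toy data.
euclidean = fraction_satisfied(features, constraints, [1.0, 1.0])

# Down-weighting the first feature (hand-picked here, not learnt) satisfies it.
reweighted = fraction_satisfied(features, constraints, [0.1, 1.0])
```

A learning algorithm of the kind compared in the article would choose the weights, or a full metric, from many such ratings rather than by hand as in this sketch.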



Notes

  1. http://www.music-ir.org/mirex/wiki/2011:Evalutron6000_Walkthrough

  2. http://www.allmusic.com/

  3. http://cseweb.ucsd.edu/~bmcfee/code/mlr/

  4. http://svmlight.joachims.org/

  5. http://mi.soi.city.ac.uk/datasets/magnatagatune

  6. http://magnatune.com/info/api.html


Acknowledgements

We thank Brian McFee for providing and maintaining the MLR code and Thorsten Joachims for providing the SVM-Light software and his support with using the solver. We would also like to thank Andrew Macfarlane and Gregory Slabaugh for their helpful comments on this work.

Author information


Corresponding author

Correspondence to Daniel Wolff.


About this article

Cite this article

Wolff, D., Weyde, T. Learning music similarity from relative user ratings. Inf Retrieval 17, 109–136 (2014). https://doi.org/10.1007/s10791-013-9229-0


  • DOI: https://doi.org/10.1007/s10791-013-9229-0

Keywords

  • Music similarity
  • Relative similarity ratings
  • Metric learning
  • Support vector machines
  • Metric learning to rank
  • Neural networks