Information Retrieval, Volume 17, Issue 2, pp 109–136

Learning music similarity from relative user ratings

Abstract

Computational modelling of music similarity is an increasingly important part of personalisation and optimisation in music information retrieval and of research in music perception and cognition. The use of relative similarity ratings is a new and promising approach to modelling similarity that avoids well-known problems with absolute ratings. In this article, we use relative ratings from the MagnaTagATune dataset with new and existing variants of state-of-the-art algorithms and provide the first comprehensive and rigorous evaluation of this approach. We compare metric learning based on support vector machines (SVMs) and metric learning to rank (MLR), including a diagonal variant (DMLR) and a novel weighted variant, and relative distance learning with neural networks (RDNN). We further evaluate the effectiveness of different high- and low-level audio features and genre data, as well as dimensionality reduction methods, weighting of similarity ratings, and different sampling methods. Our results show that music similarity measures learnt on relative ratings can be significantly better than a standard Euclidean metric, depending on the choice of learning algorithm, feature sets and application scenario. MLR and SVM outperform DMLR and RDNN, while MLR with weighted ratings leads to no further performance gain. Timbral and music-structural features are most effective, and all features jointly are significantly better than any other combination of feature sets. Sharing audio clips (but not the similarity ratings) between test and training sets improves performance, in particular for the SVM-based methods, which is useful for some application scenarios. A testing framework has been implemented in Matlab and made publicly available at http://mi.soi.city.ac.uk/datasets/ir2012framework so that these results are reproducible.
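
To make the notion of a relative rating concrete, the following is a minimal sketch in Python. It is not the article's Matlab framework and does not implement MLR, SVM or RDNN learning. It only illustrates how a relative rating of the form "clip i is more similar to clip j than to clip k" can be checked against a distance measure, comparing a plain Euclidean metric with a diagonally weighted one. The function names, the toy feature values and the weights are all hypothetical and chosen for illustration only.

```python
from math import sqrt


def weighted_distance(x, y, weights):
    """Weighted Euclidean (diagonal Mahalanobis) distance between two feature vectors."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def fraction_satisfied(features, triplets, weights):
    """Share of relative ratings (i, j, k), read as 'clip i is more similar
    to clip j than to clip k', that the distance measure reproduces."""
    satisfied = sum(
        weighted_distance(features[i], features[j], weights)
        < weighted_distance(features[i], features[k], weights)
        for i, j, k in triplets
    )
    return satisfied / len(triplets)


# Toy data (hypothetical): three clips described by two features each.
features = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 2.0]}

# Hypothetical relative ratings.
triplets = [("a", "c", "b"), ("b", "a", "c")]

euclidean = [1.0, 1.0]   # unweighted baseline metric
reweighted = [9.0, 1.0]  # hypothetical weights standing in for a learnt metric

baseline_score = fraction_satisfied(features, triplets, euclidean)
reweighted_score = fraction_satisfied(features, triplets, reweighted)
print(baseline_score, reweighted_score)
```

In this toy setting the Euclidean baseline reproduces one of the two ratings, while the reweighted distance reproduces both. A learning algorithm would search for such weights automatically instead of having them set by hand.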

Keywords

Music similarity, Relative similarity ratings, Metric learning, Support vector machines, Metric learning to rank, Neural networks

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Computing, School of Informatics, City University, London, UK
