Abstract
In this paper, two sets of evaluation experiments are conducted. First, we compare state-of-the-art automatic music genre classification algorithms to human performance on the same dataset via a listening experiment. The results show that improvements in content-based systems over recent years have narrowed the gap between automatic and human classification performance, but have not yet closed it. As an important extension to previous work in this context, we also compare automatic and human classification performance to a collaborative approach. Second, we propose two evaluation metrics, called user scores, that are based on the votes of the participants of the listening experiment. This user-centric evaluation approach makes it possible to dispense with predefined ground-truth annotations and to account for the ambiguous human perception of musical genre. Taking genre ambiguities into account is an important advantage for the evaluation of content-based systems, especially since the dataset compiled in this work (both the audio files and the collected votes) is publicly available.
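The abstract does not spell out how a vote-based user score is computed, so the following is only a hypothetical sketch of the general idea: instead of comparing a classifier's prediction to a single ground-truth label, the classifier earns, per track, the fraction of listener votes that agree with its predicted genre, so that ambiguous tracks can reward more than one answer. The function name `user_score` and the exact aggregation are assumptions for illustration, not the paper's definition.

```python
from collections import Counter

def user_score(predictions, votes):
    """Mean per-track agreement between predictions and listener votes.

    predictions: {track_id: predicted_genre}
    votes: {track_id: list of genre labels voted by participants}
    """
    scores = []
    for track, predicted in predictions.items():
        counts = Counter(votes[track])          # tally the votes per genre
        total = sum(counts.values())
        # fraction of participants whose vote matches the prediction
        scores.append(counts[predicted] / total if total else 0.0)
    return sum(scores) / len(scores)

# A track with ambiguous genre perception still gives partial credit to a
# classifier that picks any label a substantial share of listeners chose.
votes = {"t1": ["rock", "rock", "pop", "rock"], "t2": ["jazz", "blues"]}
preds = {"t1": "rock", "t2": "blues"}
print(user_score(preds, votes))  # 0.75 for t1, 0.5 for t2 -> 0.625
```

Under such a scheme, a hard ground-truth accuracy is the special case where every track has exactly one vote.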
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Seyerlehner, K., Widmer, G., Knees, P. (2011). A Comparison of Human, Automatic and Collaborative Music Genre Classification and User Centric Evaluation of Genre Classification Systems. In: Detyniecki, M., Knees, P., Nürnberger, A., Schedl, M., Stober, S. (eds) Adaptive Multimedia Retrieval. Context, Exploration, and Fusion. AMR 2010. Lecture Notes in Computer Science, vol 6817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27169-4_9
Print ISBN: 978-3-642-27168-7
Online ISBN: 978-3-642-27169-4