Abstract
The revolution in music distribution and storage brought about by digital technology has fueled tremendous interest in how information technology can be applied to this kind of content. The rapidly growing corpus of digitally available music data requires novel technologies that allow users to browse personal collections or discover new music on the World Wide Web, and that help music creators manage and protect their rights.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Müller, M. (2015). Content-Based Audio Retrieval. In: Fundamentals of Music Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-21945-5_7
DOI: https://doi.org/10.1007/978-3-319-21945-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21944-8
Online ISBN: 978-3-319-21945-5
eBook Packages: Computer Science (R0)