Content-Based Audio Retrieval

Abstract

The revolution in music distribution and storage brought about by digital technology has fueled tremendous interest in the ways that information technology can be applied to this kind of content. The rapidly growing corpus of digitally available music data requires novel technologies that allow users to browse personal collections or discover new music on the World Wide Web, and that help music creators manage and protect their rights.

Keywords

  • Dynamic Time Warping
  • Matching Function
  • Inverted List
  • Chroma Feature
  • Database Document

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Author information

Corresponding author

Correspondence to Meinard Müller.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Müller, M. (2015). Content-Based Audio Retrieval. In: Fundamentals of Music Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-21945-5_7

  • DOI: https://doi.org/10.1007/978-3-319-21945-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21944-8

  • Online ISBN: 978-3-319-21945-5

  • eBook Packages: Computer Science (R0)