
Melody-Based Approaches in Music Retrieval and Recommendation Systems

Multimedia Services in Intelligent Environments

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 25)


Abstract

Analysis and description of digital audio, one of the fundamental information sources in the modern world, becomes surprisingly complex when it comes to music information retrieval. In addition to numerous classic problems, such as pitch and rhythm extraction, tempo analysis, audio key detection, and chord estimation, considerable further difficulties arise in higher-level analysis involving, e.g., emotional content or genre categorization. These issues are of great interest to the rapidly growing music industry and to new e-commerce solutions, which require precise estimation of users’ musical preferences. Among the many proposed approaches, those that model the social context and behavior of a user via collaborative filtering techniques, although potentially effective, offer no real insight into the decision-making process. Content-based methods, on the other hand, make it possible to build refined models of users’ preferences, with adjustable weights attributed to different musical elements and varied description levels. This chapter presents an evaluation of recent efforts and developments in fields related to music information retrieval, as well as a study of melody-based approaches for use in recommendation systems. The evaluation concentrates on algorithms that extract audio features relevant to similarity analysis and recommendation proposals. Only content-based audio analysis approaches are taken into account. Tasks such as audio key detection, chord estimation, tempo analysis, and genre classification are described, the latest achievements and algorithms are presented, and the ways in which they can be used effectively in recommendation systems are assessed. The concept of audio melody extraction is explored and the authors’ own research results on this subject are presented. Possible usage scenarios of melody analysis for music recommendation services are evaluated. Application in conjunction with query by humming and query by tapping systems is analysed as an example of a complete recommender solution.


Notes

  1. Introduced in MIREX 2009.



Corresponding author

Correspondence to Bartłomiej Stasiak.


Copyright information

© 2013 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Stasiak, B., Papiernik, M. (2013). Melody-Based Approaches in Music Retrieval and Recommendation Systems. In: Tsihrintzis, G., Virvou, M., Jain, L. (eds) Multimedia Services in Intelligent Environments. Smart Innovation, Systems and Technologies, vol 25. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00375-7_9


  • DOI: https://doi.org/10.1007/978-3-319-00375-7_9

  • Publisher Name: Springer, Heidelberg

  • Print ISBN: 978-3-319-00374-0

  • Online ISBN: 978-3-319-00375-7

  • eBook Packages: Engineering (R0)
