Melody-Based Approaches in Music Retrieval and Recommendation Systems

Stasiak, Bartłomiej; Papiernik, Mateusz

doi:10.1007/978-3-319-00375-7_9

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 25)

Abstract

The analysis and description of digital audio, one of the fundamental information sources in the modern world, become surprisingly complex when it comes to music information retrieval. In addition to the many well-known classic problems, such as pitch and rhythm extraction, tempo analysis, audio key detection and chord estimation, there are further considerable difficulties in higher-level analysis involving, e.g., emotional content or genre categorization. These issues are of utmost interest to the rapidly growing music industry and to new e-commerce solutions, which require precise estimation of users’ musical preferences. Among the many proposed approaches, those that model the social context and behavior of a user via collaborative filtering techniques, although potentially effective, do not offer real insight into the decision-making process. Content-based methods, on the other hand, make it possible to build refined models of users’ preferences, with adjustable weights attributed to different musical elements and varied description levels. This chapter presents an evaluation of the latest efforts and developments in fields related to music information retrieval, as well as a study of melody-based approaches for use with recommendation systems. The evaluation concentrates on algorithms that extract audio features relevant to similarity analysis and recommendation proposals. Only content-based audio analysis approaches are taken into account. Tasks such as audio key detection, chord estimation, tempo analysis and genre classification are described, the latest achievements and algorithms are presented, and the ways in which they can be used effectively in recommendation systems are assessed. The concept of audio melody extraction is explored, and the authors’ own research results on this subject are presented. Possible usage scenarios of melody analysis for music recommendation services are evaluated. Application in conjunction with query by humming and query by tapping systems is analysed as an example of a complete recommender solution.

Notes

1. Introduced in MIREX 2009

Author information

Authors and Affiliations

Institute of Information Technology, Technical University of Łódź, ul.Wólczańska 215, 90-924, Łódź, Poland
Bartłomiej Stasiak & Mateusz Papiernik

Corresponding author

Correspondence to Bartłomiej Stasiak.

Editor information

Editors and Affiliations

Department of Informatics, University of Piraeus, Karaoli & Dimitriou St. 80, Piraeus, 18534, Greece
George A. Tsihrintzis
Department of Informatics, University of Piraeus, Karaoli & Dimitriou St. 80, Piraeus, 18534, Greece
Maria Virvou
University of South Australia, Mawson Lakes, Adelaide, 5095, South Australia, Australia
Lakhmi C. Jain

About this chapter

Cite this chapter

Stasiak, B., Papiernik, M. (2013). Melody-Based Approaches in Music Retrieval and Recommendation Systems. In: Tsihrintzis, G., Virvou, M., Jain, L. (eds) Multimedia Services in Intelligent Environments. Smart Innovation, Systems and Technologies, vol 25. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00375-7_9

DOI: https://doi.org/10.1007/978-3-319-00375-7_9
Published: 17 May 2013
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-00374-0
Online ISBN: 978-3-319-00375-7
eBook Packages: Engineering, Engineering (R0)
