Abstract
Analysis and description of digital audio, one of the fundamental information sources in the modern world, becomes surprisingly complex in the context of music information retrieval. In addition to the numerous classic problems, such as pitch and rhythm extraction, tempo analysis, audio key detection and chord estimation, there are considerable further difficulties in higher-level analysis involving, e.g., emotional content or genre categorization. These issues are of utmost interest to the rapidly growing music industry and to new e-commerce solutions, which require precise estimation of users’ musical preferences. Among the many proposed approaches, those that model the social context and behavior of a user via collaborative filtering techniques, although potentially effective, do not offer real insight into the decision-making process. Content-based methods, on the other hand, make it possible to build refined models of users’ preferences, with adjustable weights attributed to different musical elements and varied description levels. This chapter presents an evaluation of the latest efforts and developments in fields related to music information retrieval, as well as a study of melody-based approaches for use with recommendation systems. The evaluation concentrates on algorithms that extract audio features relevant to similarity analysis and recommendation proposals. Only content-based audio analysis approaches are taken into account. Tasks such as audio key detection, chord estimation, tempo analysis and genre classification are described, the latest achievements and algorithms are presented, and the ways in which they can be used effectively in recommendation systems are assessed. The concept of audio melody extraction is explored and the authors’ own research results on this subject are presented. Possible usage scenarios of melody analysis for music recommendation services are evaluated. Application in conjunction with query by humming and query by tapping systems is analysed as an example of a complete recommender solution.
Notes
1. Introduced in MIREX 2009
Copyright information
© 2013 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Stasiak, B., Papiernik, M. (2013). Melody-Based Approaches in Music Retrieval and Recommendation Systems. In: Tsihrintzis, G., Virvou, M., Jain, L. (eds) Multimedia Services in Intelligent Environments. Smart Innovation, Systems and Technologies, vol 25. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00375-7_9
DOI: https://doi.org/10.1007/978-3-319-00375-7_9
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-00374-0
Online ISBN: 978-3-319-00375-7
eBook Packages: Engineering, Engineering (R0)