
Applications in Intelligent Music Analysis

Intelligent Audio Analysis

Part of the book series: Signals and Communication Technology (SCT)

Abstract

Digitised music has dominated the market for more than a decade, and advanced techniques of Intelligent Music Analysis are accordingly gaining interest and importance. This chapter presents in detail recent application examples selected from the author's work, including current performance benchmarks. The examples comprise drum-beat separation; onset detection; determination of tempo, metre, ballroom dance style, and mood; key and chord recognition; structure analysis; and singer trait classification, which covers recognition of singer age, gender, race, and height.

Of all noises, I think music is the least disagreeable.

—Samuel Johnson.


Notes

  1. http://mtg.upf.edu/ismir2004/contest/tempoContest/node5.html

  2. http://w3.ift.ulaval.ca/~allac88/dataset.tar.gz

  3. http://www.mmk.ei.tum.de/~~sch/brd.txt

  4. http://www.openaudio.eu/chord.txt

  5. The term downbeat stems from orchestral conducting: the lowest point on the baton signals the downbeat.

  6. http://www.olga.net

  7. Allmusic (http://www.allmusic.com)

  8. MIREX 2008 (http://www.music-ir.org/mirex/2008)

  9. The annotation scheme is inspired by the TIMIT corpus as used in Sect. 10.4.3. Accordingly, the term ‘race’ is adopted from the corpus’ meta-information, although modern biology often neither classifies Homo sapiens sapiens by race nor uses sub-categories to differentiate groups collectively by physical and behavioural traits. Current molecular biological and population genetic research holds that a systematic categorisation may be insufficient to describe the enormous diversity and fluid differences between geographic populations; nevertheless, it can be argued that, for an end-user information retrieval system, a categorisation into illustrative, archetypal categories can be useful.

  10. http://www.imdb.com

  11. http://www.wikipedia.org

  12. http://www.youtube.com

  13. http://www.openaudio.eu/UltraStar_Singers.arff

  14. A complex random variable whose real and imaginary parts are independent and each follow a real Gaussian distribution with mean \(0\) and identical variance (or identical covariance matrix in the case of a multivariate distribution).

  15. http://www.durrieu.ch/phd/software.html

References

  1. Casey, M., Slaney, M.: Fast recognition of remixed music audio. In: IEEE Proceedings International Conference on Audio Speech and Signal Processing (ICASSP), vol. IV, pp. 1425–1428 (2007)

    Google Scholar 

  2. Schuller, B., Zobl, M., Rigoll, G., Lang, M.: A hybrid music retrieval system using belief networks to integrate queries and contextual knowledge. In: IEEE Proceedings 4th IEEE International Conference on Multimedia and Expo, ICME 2003, vol. I, pp. 57–60. Baltimore (2003)

    Google Scholar 

  3. Schuller, B., Rigoll, G., Lang, M.: Multimodal music retrieval for large databases. In: IEEE Proceedings 5th IEEE International Conference on Multimedia and Expo, ICME 2004, vol. 2, pp. 755–758. Taipei, Taiwan (2004)

    Google Scholar 

  4. Downie, J.: Music information retrieval. Ann. Rev. Inform. Sci. Technol. 37, 295–340 (2003)

    Article  Google Scholar 

  5. Scheirer, E.D.: Tempo and beat analysis of acoustic musical signals. Acoust. Soc. Am. 103(1), 588–601 (1998)

    Article  Google Scholar 

  6. Schuller, B., Eyben, F., Rigoll, G.: Tango or waltz?—putting ballroom dance style into tempo detection. EURASIP J. Audio, Speech, Music Process. Spec. Issue Intell. Audio, Speech, Music Process. Appl. (Article ID 846135), 12 (2008)

    Google Scholar 

  7. Tzanetakis, G., Cook, P.: Musical genre classification of audio signals. IEEE Trans. Speech Audio Process. 10(5), 293–302 (2002)

    Article  Google Scholar 

  8. Schuller, B., Hage, C., Schuller, D., Rigoll, G.: “mister d.j., cheer me up!”: Musical and textual features for automatic mood classification. J. New Music Res. 39(1), 13–34 (2010)

    Article  Google Scholar 

  9. Berenzweig, A., Ellis, D.: Locating Singing Voice Segments Within Musical Signals. In: Proceedings of International Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 119–123. Mohonk, New York (2001)

    Google Scholar 

  10. Schuller, B., Hörnler, B., Arsić, D., Rigoll, G.: Audio chord labeling by musiological modeling and beat-synchronization. In: IEEE Proceedings 10th IEEE International Conference on Multimedia and Expo, ICME 2009, pp. 526–529. New York (2009)

    Google Scholar 

  11. Bellmann, H.: About the determination of key of a musical excerpt. In: Computer Music Modeling and Retrieval, vol. 3902 LNCS, pp. 76–91. Springer, Berlin (2006)

    Google Scholar 

  12. Dannenberg, R., Goto, M.: Music structure analysis from acoustic signals. In: Havelock, D., Kuwano, S., Vorländer, M. (eds.) Handbook of Signal Processing in Acoustics, vol. 1, pp. 305–331. Springer (2009)

    Google Scholar 

  13. Dixon, S., Pampalk, E., Widmer, G.: Classification of dance music by periodicity patterns. In: Proceedings of the 4th International Conference on Music, Information Retrieval, pp. 159–165 (2003)

    Google Scholar 

  14. Foote, J., Uchihashi, S.: The beat spectrum: a new approach to rhythm analysis. In: Proceedings of International Conference on Multimedia and Expo (ICME), IEEE, Tokyo (2001)

    Google Scholar 

  15. Hu, N., Dannenberg, R.B., Tzanetakis, G.: Polyphonic audio matching and alignment for music retrieval. In: Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 185–188 (2003)

    Google Scholar 

  16. Klapuri, A.P., Eronen, A.J., Astola, J.T.: Analysis of the meter of acoustic musical signals. IEEE Trans. Speech Audio Process. 14(1), 342–355 (2006)

    Article  Google Scholar 

  17. Müller, M., Ellis, D., Klapuri, A., Richard, G.: Signal processing for music analysis. IEEE J. Sel. Top. Sig. Process. 5(6), 1088–1110 (2011)

    Article  Google Scholar 

  18. Orio, N.: Music retrieval: a tutorial and review. Found. Trends Inf. Retrieval 1, 1–90 (2006)

    Article  MATH  Google Scholar 

  19. Uhle, C., Rohden, J., Cremer, M., Herre, J.: Low complexity musical meter estimation from polyphonic music. In: Proceedings of the AES 25th international conference, pp. 63–68. London, UK (2004)

    Google Scholar 

  20. Weninger, F., Schuller, B., Liem, C., Kurth, F., Hanjalic, A.: Music information retrieval: an inspirational guide to transfer from related disciplines. In: Müller, M., Goto, M. (eds.) Multimodal Music Processing, vol. 11041, pp. 195–215. Seminar of Dagstuhl Follow-UpsSchloss Dagstuhl, Germany (2012)

    Google Scholar 

  21. Schuller, B., Lehmann, A., Weninger, F., Eyben, F., Rigoll, G.: Blind enhancement of the rhythmic and harmonic sections by nmf: Does it help? In: Proceedings International Conference on Acoustics Including the 35th German Annual Conference on Acoustics, NAG/DAGA 2009, pp. 361–364, Acoustical Society of the Netherlands, DEGA, Rotterdam, The Netherlands (2009)

    Google Scholar 

  22. Böck, S., Eyben, F., Schuller, B.: Onset detection with bidirectional long short-term memory neural networks. In: Proceedings Annual Meeting of the MIREX 2010 Community as Aart of the 11th International Conference on Music Information Retrieval, ISMIR. p. 2. Utrecht, Netherlands (2010)

    Google Scholar 

  23. Eyben, F., Böck, S., Schuller, B., Graves, A.: Universal onset detection with bidirectional long-short term memory neural networks. In: Proceedings 11th International Society for Music Information Retrieval Conference, ISMIR 2010, pp. 589–594. Utrecht, The Netherlands (2010)

    Google Scholar 

  24. Schuller, B., Eyben, F., Rigoll, G.: Fast and robust meter and tempo recognition for the automatic discrimination of ballroom dance styles. In: IEEE Proceedings 32nd IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2007, vol. I, pp. 217–220. Honolulu, HY (2007)

    Google Scholar 

  25. Eyben, F., Schuller, B., Reiter, S., Rigoll, G.: Wearable assistance for the ballroom-dance hobbyist—holistic rhythm analysis and dance-style classification. In: IEEE Proceedings 8th IEEE International Conference on Multimedia and Expo, ICME 2007, pp. 92–95. Beijing, China (2007)

    Google Scholar 

  26. Eyben, F., Schuller, B.: Tempo estimation from tatum and meter vectors. In: Proceedings Annual Meeting of the MIREX 2010 Community as Part of the 11th International Conference on Music Information Retrieval, ISMIR, p. 1. Utrecht, Netherlands (2010)

    Google Scholar 

  27. Böck, S., Eyben, F., Schuller, B.: Tempo detection with bidirectional long short-term memory neural networks. In: Proceedings Annual Meeting of the MIREX 2010 community as Part of the 11th International Conference on Music Information Retrieval, ISMIR, p. 3. Utrecht, Netherlands (2010)

    Google Scholar 

  28. Schuller, B., Gollan, B.: Music theoretic and perception-based features for audio key determination. J. New Music Res. 41(2), 175–193 (2012)

    Article  Google Scholar 

  29. Schuller, B., Eyben, F., Rigoll, G.: Beat-synchronous data-driven automatic chord labeling. In: Proceedings 34. Jahrestagung für Akustik, DAGA, DEGA, pp. 555–556. Dresden, Germany (2008)

    Google Scholar 

  30. Schuller, B., Dibiasi, F., Eyben, F., Rigoll, G.: One day in half an hour: music thumbnailing incorporating harmony- and rhythm structure. In: Proceedings 6th Workshop on Adaptive Multimedia Retrieval, AMR 2008, p. 10. Berlin, Germany (2008)

    Google Scholar 

  31. Schuller, B., Dibiasi, F., Eyben, F., Rigoll, G.: Music thumbnailing incorporating harmony- and rhythm structure. In: Detyniecki, M., Leiner, U., Nürnberger, A. (eds.) Adaptive Multimedia Retrieval: 6th International Workshop, AMR 2008, June 26–27, Berlin, Germany (2008). Revised Selected Papers, vol. 5811/2010, Lecture Notes in Computer Science (LNCS), pp. 78–88. Springer, Berlin (2010)

    Google Scholar 

  32. Schuller, B., Dorfner, J., Rigoll, G.: Determination of non-prototypical valence and arousal in popular music: Features and performances. EURASIP J. Audio, Speech, Music Process. Spec. Issue Scalable Audio-Content Anal. (Article ID 735854), 19 (2010)

    Google Scholar 

  33. Schuller, B., Weninger, F., Dorfner, J.: Multi-modal non-prototypical music mood analysis in continuous space: Reliability and performances. In: Proceedings 12th International Society for Music Information Retrieval Conference, ISMIR 2011, pp. 759–764. Miami (2011)

    Google Scholar 

  34. Schuller, B., Kozielski, C., Weninger, F., Eyben, F., Rigoll, G.: Vocalist gender recognition in recorded popular music. In: Proceedings 11th International Society for Music Information Retrieval Conference, ISMIR 2010, pp. 613–618. Utrecht, The Netherlands (2010)

    Google Scholar 

  35. Weninger, F., Durrieu, J.-L., Eyben, F., Richard, G., Schuller, B.: Combining monaural source separation with long short-term memory for increased robustness in vocalist gender recognition. In: IEEE Proceedings 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, pp. 2196–2199. Prague, Czech Republic (2011)

    Google Scholar 

  36. Weninger, F., Wöllmer, M., Schuller, B.: Automatic assessment of singer traits in popular music: Gender, age, height and race. In: Proceedings 12th International Society for Music Information Retrieval Conference, ISMIR 2011, pp. 37–42. Miami (2011)

    Google Scholar 

  37. Grosche, P., Schuller, B., Müller, M., Rigoll, G.: Automatic transcription of recorded music. Acta Acustica united with Acustica. 98(2), 199–215(17) (2012)

    Google Scholar 

  38. Schuller, B., Rigoll, G.: Self-learning acoustic feature generation and selection for the discrimination of musical signals. In: Proceedings 32. Jahrestagung für Akustik, DAGA 2006, pp. 285–286. Braunschweig, Germany (2006)

    Google Scholar 

  39. Schuller, B., Wallhoff, F., Arsić, D., Rigoll, G.: Musical signal type discrimination based on large open feature sets. In: IEEE Proceedings 7th IEEE International Conference on Multimedia and Expo, ICME 2006, pp. 1089–1092. Toronto, Canada (2006)

    Google Scholar 

  40. Schuller, B., Schmitt, B.J.B., Arsić, D., Reiter, S., Lang, M., Rigoll, G.: Feature selection and stacking for robust discrimination of speech, monophonic singing, and polyphonic music. In: Proceedings 6th IEEE International Conference on Multimedia and Expo, ICME 2005, pp. 840–843. Amsterdam, The Netherlands (2005)

    Google Scholar 

  41. Schuller, B., Rigoll, G., Lang, M.: Hmm-based music retrieval using stereophonic feature information and framelength adaptation. In: Proceedings 4th IEEE International Conference on Multimedia and Expo, ICME 2003, vol. II, pp. 713–716. Baltimore (2003)

    Google Scholar 

  42. Schuller, B., Rigoll, G., Lang, M.: Matching monophonic audio clips to polyphonic recordings. In: Proceedings 31. Jahrestagung für Akustik, DAGA, 2005, DEGA, pp. 299–300, Munich, Germany (2005)

    Google Scholar 

  43. Weninger, F., Amir, N., Amir, O., Ronen, I., Eyben, F., Schuller, B.: Robust feature extraction for automatic recognition of vibrato singing in recorded polyphonic music. In: Proceedings 37th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012, pp. 85–88. Kyoto, Japan (2012)

    Google Scholar 

  44. Helén, M., Virtanen, T.: Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine. In Proceedings of EUSIPCO, Antalya, Turkey (2005)

    Google Scholar 

  45. Paulus, J., Virtanen, T.: Drum transcription with non-negative spectrogram factorisation. In: Proceedings of EUSIPCO, p. 4, EURASIP, Antalya, Turkey (2005)

    Google Scholar 

  46. Moreau, A., Flexer, A.: Drum transcription in polyphonic music using non-negative matrix factorisation. In: Proceedings of 8th International Conference on Music Information Retrieval (ISMIR), September 23–27, pp. 353–354, Vienna, Austria (2007)

    Google Scholar 

  47. Uhle, C., Dittmar, C., Sporer, T.: Extraction of drum tracks from polyphonic music using independent subspace analysis. In: Proceedings of 4th International Symposium on Independent Component Analysis and Blind Signal Separation (ICA), April 2003, pp. 843–848, Nara, Japan (2003)

    Google Scholar 

  48. Smaragdis, P., Brown, J.C.: Non-negative matrix factorization for polyphonic music transcription. In: IEEE Proceedings of WASPAA, pp. 177–180 (2003)

    Google Scholar 

  49. Virtanen, T., Ryynänen, M.: A. Mesaros. Combining pitch-based inference and non-negative spectrogram factorization in separating vocals from polyphonic music. In: ISCA Tutorial and Research Workshop on Statistical And Perceptual Audition, SAPA 2008, pp. 17–22, ISCA, Brisbane (2008)

    Google Scholar 

  50. Sethares, W.A.: Local consonance and the relationship between timbre and scale. J. Acoust. Soc. Am. 94(3), 1218–1228 (1993)

    Article  MathSciNet  Google Scholar 

  51. Gouyon, F., Klapuri, A.P., Dixon, S., Alonso, M., Tzanetakis, G., Uhle, C., Cano, P.: An experimental comparison of audio tempo induction algorithms. IEEE Trans. Audio, Speech, Lang. Process. 14(5), 1832–1844 (2006)

    Article  Google Scholar 

  52. Dixon, S.: Onset detection revisited. In: Proceedings of DAFx-06, pp. 133–137, Montreal, Canada (2006)

    Google Scholar 

  53. Zhou, R., Reiss, J.: Music onset detection combining energy-based and pitch-based approaches. In: Proceedings of MIREX as part of the 8th International Conference on Music Information Retrieval (ISMIR). Sept 23–27. P. 4, Vienna, Austria (2007)

    Google Scholar 

  54. Röbel, A.: Onset detection by means of transient peak classification in harmonic bands. In: Proceedings of MIREX as part of the 10th International Conference on Music Information Retrieval (ISMIR), P. 2, Kobe, Japan (2009)

    Google Scholar 

  55. Klapuri, A.: Sound onset detection by applying psychoacoustic knowledge. In Proceedings of ICASSP, vol. 6, pp. 3089–3092 (1999)

    Google Scholar 

  56. Bello, J., Daudet, L., Abdallah, S., Duxbury, C., Davies, M., Sandler, M.: A tutorial on onset detection in music signals. IEEE Trans. Speech Audio Process. 13(5), 1035–1047 (2005)

    Article  Google Scholar 

  57. Duxbury, C., Bello, J.P., Davies, M.: M. Sandler. Complex domain onset detection for musical signals. In: Proceedings of Digital Audio Effects Workshop (DAFx-03) pp. 1–4, London, UK (2003)

    Google Scholar 

  58. Collins, N.: Using a pitch detector for onset detection. In Proceedings of ISMIR, pp. 100–106 (2005)

    Google Scholar 

  59. Basseville, M., Nikiforov, I.V.: Detection of Abrupt Changes: Theory and Application. Prentice-Hall, Englewood Cliffs (1993)

    Google Scholar 

  60. Lacoste, A., Eck, D.: Onset detection with artificial neural networks. In: Proceedings of MIREX as part of the 6th International Conference on Music Information Retrieval (ISMIR), P. 4, London, UK (2005)

    Google Scholar 

  61. Graves, A.: Supervised sequence labelling with recurrent neural networks. Ph.D. Thesis, Technische Universität München (2008)

    Google Scholar 

  62. Collins, N.: A comparison of sound onset detection algorithms with emphasis on psychoacoustically motivated detection functions. In: Proceedings of AES Convention 118, pp. 28–31 (2005)

    Google Scholar 

  63. Handel, S.: Listening: An Introduction to the Perception of Auditory Events. MIT Press, Cambridge (1989)

    Google Scholar 

  64. Böck, S.: Onset Detector 2011. In: Proceedings Annual Meeting of the MIREX 2011 Community as Part of the 12th International Conference on Music Information Retrieval. p. 2. ISMIR, ISMIR (2011)

    Google Scholar 

  65. Böck, S., Arzt, A., Krebs, F., Schedl, M.: Online real-time onset detection with recurrent neural networks. In Proceedings of the 15th International Conference on Digital Audio Effects (DAFx-12), p. 4. New York, UK (2012)

    Google Scholar 

  66. Klapuri, A.P.: Musical meter estimation and music transcription. In: Proceedings of Cambridge Music Processing Colloquium, Cambridge University, UK (2003)

    Google Scholar 

  67. Gouyon, F., Herrera, P.: Determination of the meter of musical audio signals: seeking recurrences in beat segment descriptors. In: AES 114th Convention, Amsterdam, The Netherlands (2003)

    Google Scholar 

  68. Gouyon, F., Dixon, S., Pampalk, E., Widmer, G.: Evaluating rhythmic descriptors for musical genre classification. In: Proceedings of the AES 25th International Conference, pp. 196–204. London, UK (2004)

    Google Scholar 

  69. Grosche, P., Müller, M., Kurth, F.: Cyclic tempogram—a mid-level tempo representation for music signals. In: Proceedings of ICASSP, pp. 5522–5525. Dallas, TX (2010)

    Google Scholar 

  70. Kirovski, D., Attias, H.: Beat-ID: identifying music with beat analysis. In: Proceedings of the International Workshop on Multimedia Signal Processing, IEEE, pp. 190–193, St. Thomas, US Virgin Islands (2002)

    Google Scholar 

  71. Kurth, F., Gehrmann, T., Muller, M.: The cyclic beat spectrum: Tempo-related audio features for time-scale invariant audio identification. In: Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR 2006), pp. 35–40, Victoria, Canada (2006)

    Google Scholar 

  72. Goto, M., Muraoka, Y.: A real-time beat tracking system for audio signals. In: Proceedings of the 1995 International Computer Music Conference, pp. 171–174 (1995)

    Google Scholar 

  73. Goto, M., Muraoka, Y.: Real-time rhythm tracking for drumless audio signals—chord change detection for musical decisions. In: Proceedings of the IJCAI-97 Workshop on Computational Auditory Scene, Analysis, pp. 135–144 (1997)

    Google Scholar 

  74. Goto, M.: An audio-based real-time beat tracking system for music with or without drum-sounds. J. New Music Res. 30(2), 159–171 (2001)

    Google Scholar 

  75. Seppänen, J.: Computational models of musical meter recognition. Master’s thesis, Tampere University of Technology (2001)

    Google Scholar 

  76. Dixon, S.: Automatic extraction of tempo and beat from expressive performances. J. New Music Res. 30, 39–58 (2001)

    Google Scholar 

  77. Hainsworth, S., Macleod, M.: Beat tracking with particle filtering algorithms. In: 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 91–94 (2003)

    Google Scholar 

  78. Alonso, M., David, B., Richard, G.: Tempo and beat estimation of musical signals. In: Proceedings of the International Conference on Music Information Retrieval, pp. 158–163 (2004)

    Google Scholar 

  79. Sethares, W.A., Staley, T.W.: Meter and periodicity in musical performance. J. New Music Res. 22(5), 1–11 (2001)

    Google Scholar 

  80. Brown, J.C.: Determination of meter of musical scores by autocorrelation. J. Acoust. Soc. Am. 94(4), 1953–1957 (1993)

    Google Scholar 

  81. van Noorden, L., Moelants, D.: Resonance in the perception of musical pulse. J. New Music Res. 28(1), 43–66 (1999)

    Article  Google Scholar 

  82. Rabiner, L., Juang, B.-H.: Fundamentals of Speech Recognition. Prentice Hall, Englewood Cliffs (1993)

    Google Scholar 

  83. Fu, Z., Lu, G., Ting, K.M., Zhang, D.: A survey of audio-based music classification and annotation. IEEE Trans. Multimedia 13(2), 303–319 (2011)

    Article  Google Scholar 

  84. Ballroomdancers.com. Preview audio examples of ballroom dance music. https://secure.ballroomdancers.com/Music/style.asp, (2006)

  85. Zwicker, E., Fastl, H.: Psychoacoustics—Facts and Models. 2nd edn. Springer, Berlin (1999)

    Google Scholar 

  86. Paulus, J., Klapuri, A.P.: Measuring the similarity of rhythmic patterns. In: Proceedings of the2002 International Conference on Music Information Retrieval (ISMIR 2002). France, Paris (2002)

    Google Scholar 

  87. Gouyon, F., Dixon, S.: Dance music classification: A tempo-based approach. In: Proceedings of Fitfth International Conference on Music Information Retrieval, ISMIR, p. 4, Barcelona, Spain (2004)

    Google Scholar 

  88. Daniel, A., Emiya, V., David, B.: Perceptual-based evaluation of the errors usually made when automatically transcribing music. In: Proceedings of 9th International Symposium on Music Information Retrieval (ISMIR), pp. 550–555. Philadelphia (2008)

    Google Scholar 

  89. Gomez, E.: Estimating the tonality of polyphonic audio files: cognitive versus machine learning modelling strategies. In: Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain (2004)

    Google Scholar 

  90. Gomez, E.: Key estimation from polyphonic audio. In: Proceedings of 1st Annual Music Information Retrieval Evaluation eXchange (MIREX’05), London, UK (2005)

    Google Scholar 

  91. Izmirli, O.: Template based keyfinding from audio. In: Proceedings of International Computer Music Conference (ICMC), pp. 211–214. Barcelona, Spain (2005)

    Google Scholar 

  92. Mardirossian, A., Chew, E.: Skefis a symbolic (midi) key-finding system. In: Proceedings of 6th International Symposium on Music Information Retrieval (ISMIR), no pagination, pp. 1–8. London, UK (2005)

    Google Scholar 

  93. Pauws, S.: Musical keyextraction from audio. In: Proceedings of 5th International Symposium on Music Information Retrieval (ISMIR), pp. 96–99. Barcelona, Spain (2004)

    Google Scholar 

  94. Chuan, C., Chew, E.: Fuzzy analysis in pitch class determination for polyphonic audio keyfinding. In: Proceedings of 6th International Symposium on Music Information Retrieval (ISMIR), pp. 296–303. London, UK (2005)

    Google Scholar 

  95. Peeters, G.: Chroma-based estimation of musical key from audio-signal analysis. In: Proceedings of 7th International Symposium on Music Information Retrieval (ISMIR). Victoria, Canada (2006)

    Google Scholar 

  96. Chuan, C.H., Chew, E.: Audio key finding: considerations in system design and case studies on chopins 24 preludes. EURASIP J. Adv. Sig. Process. 2007(056561) (2006)

    Google Scholar 

  97. Noland, K., Sandler, M.: Key estimation using a hidden markov model. In: Proceedings of 7th International Symposium on Music Information Retrieval (ISMIR), pp. 121–126. Victoria, Canada (2006)

    Google Scholar 

  98. Mandel, M.I., Ellis, D.P.W.: Song-level features and support vector machines for music classification. In: Proceedings 6th International Conference on Music Information Retrieval (ISMIR), pp. 594–599. London, UK (2005)

    Google Scholar 

  99. Mauch, M., Dixon, S.: Simultaneous estimation of chords and musical context from audio. IEEE Trans. Audio, Speech Lang. Process. 18(6), 1280–1289 (2010)

    Article  Google Scholar 

  100. Fujishima, T.: Realtime chord recognition of musical sound: a system using common lisp music. In: Proceedings of International Computer Music Conference, pp. 464–467. Bejing, China (1999)

    Google Scholar 

  101. Gomez, E.: Tonal description of polyphonic audio for music content processing. INFORMS J. Comput. 18(3), 294–304 (2006)

    Article  Google Scholar 

  102. Temperly, D.: An algorithm for harmonic analysis. Music Percept. 15, 31–68 (1997)

    Article  Google Scholar 

  103. Madsen, S.T., Widmer, G.: Key-finding with interval profiles. In: Proceedings International Computer Music Conference (ICMC), p. 4, Copenhagen, Denmark (2007)

    Google Scholar 

  104. Lee, K., Slaney, M.: A unified system for chord transcription and key extraction using hidden markov models. In: Proceedings of 8th International Symposium on Music Information Retrieval (ISMIR). Vienna, Austria (2007)

    Google Scholar 

  105. Lee, K., Slaney, M.: Acoustic chord transcription and key extraction from audio using key-dependent hmms trained on synthesized audio. IEEE Trans. Audio, Speech, Lang. Process. 16, 291–301 (2008)

    Article  Google Scholar 

  106. Purwins, H., Blankertz, B., Dornhege, G., Obermayer, K.: Scale degree from audio investigated with machine learning techniques. In: Proceedings Audio Engineering Society 116th Convention (2004)

    Google Scholar 

  107. Purwins, H.: Profiles of Pitch Classes Circularity of Relative Pitch and Key—Experiments, Models, Computational Music Analysis, and Perspectives. Ph.D. thesis, Technische Universität, Berlin (2005)

    Google Scholar 

  108. Cremer, M., Derboven, C.: A system for harmonic analysis of polyphonic music. In: Proceedings 25th International AES Conference. London, UK (2004)

    Google Scholar 

  109. Sun, J., Li, H., Li, L.: Key detection through pitch class distribution model and ANN. In: 2009 16th International Conference on Digital Signal Processing. Santorini, Hellas (2009)

    Google Scholar 

  110. Cabral, G., Briot, J.-P., Pachet, F.: Impact of distance in pitch class profile computation. In: Proceedings of 10th Brazilian Symposium on Computer Music (SBCM2005), pp. 135–144. Belo Horizonte, Brazil (2005)

    Google Scholar 

  111. Izmirli, O.: Audio key finding using low-dimensional spaces. In: Proceedings of 7th International Conference on Music Information Retrieval (ISMIR). Victoria, Canada (2006)

    Google Scholar 

  112. Izmirli, O.: An algorithm for audio key finding. In: Proceedings of Music Information Retrieval Evaluation Exchange (MIREX2005), as Part of the 6th International Symposium on Music Information Retrieval (ISMIR). London, UK (2006)

    Google Scholar 

  113. Zhu, Y.: An audio key finding algorithm. In: Proceedings of the 1st Annual Music Information Retrieval Evaluatione Xchange(MIREX’05). London, UK (2005)

    Google Scholar 

  114. Zhu, Y.: Music key detection for musical audio. In: Proceedings of 11th International Multimedia Modelling Conference (MMM’05). Melbourne, Australia (2005)

    Google Scholar 

  115. Chew, E.: An algorithm for determining key boundaries. In: Proceedings 2nd International Conference on Music and Artificial Intelligence (ICMAI). Edinburgh, Scotland (2002)

    Google Scholar 

  116. Shenoy, A., Mohapatra, R., Wang, Y.: Key determination of acoustic musical signals. In: Proceedings of International Conference on Multimedia and Expo(ICME). Singapore (2004)

    Google Scholar 

  117. Sheh, A., Ellis, D.: Chord segmentation and recognition using emtrained hidden markov models. In: Proceedings of ISMIR 2003, pp. 183–189. Baltimore, Maryland (2003)

    Google Scholar 

  118. Chai, W., Vercoe, B.: Detection of key change in classical piano music. In: Proceedings 6th International Conference on Music Information Retrieval (ISMIR), pp. 468–474. London, UK (2005)

    Google Scholar 

  119. Casey, M., Veltkamp, R., Goto, M., Leman, M., Rhodes, C., Slaney, M.: Content-based music information retrieval: current directions and future challenges. Proc. IEEE 96(4), 668–696 (2008)

    Google Scholar 

  120. Bartsch, M.A., Wakefield, G.H.: To catch a Chorus: using chroma-based representations for audio thumbnailing. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2001, pp. 15–18, New Paltz, New York (2001)

    Google Scholar 

  121. Wakefield, G.: Mathematical representation of joint time chroma distributions. In: Proceedings of SPIE, vol. 3807, pp. 637–645. Denver, Colorado (1999)

    Google Scholar 

  122. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explor. 11(1), 10–18 (2009)

    Google Scholar 

  123. Krumhansl, C.: Tonal hierarchies and rare intervals in music cognition. Music Percept. Interdisc. J. 7(3), 309–324 (1990)

    Article  Google Scholar 

  124. Gatzsche, D., Gatzsche, G., Mehnert, M., Brandenburg, K.: A symmetry based approach for musical tonality analysis. In: Proceedings of 8th International Society for Music Information Retrieval Conference (ISMIR), no pagination, Vienna, Austria (2007)

    Google Scholar 

  125. Bello, J.P., Daudet, L., Sandler, M.B.: Automatic piano transcription using frequency and time-domain information. IEEE Trans. Audio, Speech Lang. Process. 14(6), 2242–2251 (2006)

    Article  Google Scholar 

  126. Duan, Z., Lu, L., Zhang, C.: Audio tonality mode classification without tonic annotations. In: Proceedings of 8th IEEE International Conference on Multimedia and Expo (ICME), pp. 1361–1364. Hannover, Germany (2008)

    Google Scholar 

  127. Izmirli, O.: Tonal-atonal classification of music audio using diffusion maps. In: Proceedings of 10th International Society for Music Information Retrieval Conference (ISMIR 2009), pp. 687–691. Kobe, Japan (2009)

    Google Scholar 

  128. Vuvan, D., Prince, J., Schmuckler, M.: Probing the minor tonal hierarchy. Music Percept. Interdisc. J. 28(5), 461–472 (2011)

    Article  Google Scholar 

  129. Papadopoulos, H., Peeters, G.: Local key estimation based on harmonic and metric structures. In: Proceedings of 12th International Conference on Digital Audio Effects (DAFx-09), pp. 1–8. Como, Italy (2009)

    Google Scholar 

  130. Goto, M., Muraoka, Y.: An audio-based real-time beat tracking system and its applications. In: Proceedings International Computer Music Confernce, pp. 17–20. ICMA, San Francisco (1998)

  131. Rocher, T., Robine, M., Hanna, P., Oudre, L.: Concurrent estimation of chords and keys from audio. In: Proceedings of 11th International Society for Music Information Retrieval Conference (ISMIR), pp. 141–146. Utrecht, The Netherlands (2010)

  132. Bello, J.P., Pickens, J.: A robust mid-level representation for harmonic content in music signals. Proc. ISMIR 2005, 304–311 (2005)

  133. Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Liu, X., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., Woodland, P.: The HTK book (v3.4). Cambridge University Press, Cambridge (2006)

  134. Stober, S., Nürnberger, A.: Towards user-adaptive structuring and organization of music collections. In: Proceedings of 6th Workshop on Adaptive Multimedia Retrieval (AMR 2008), Berlin, Germany (2008)

  135. Burges, C.J.C., Plastina, D., Platt, J.C., Renshaw, E., Malvar, H.S.: Duplicate detection and audio thumbnails with audio fingerprinting. Technical Report MSR-TR-2004-19, Microsoft Research (MSR), March (2004)

  136. Logan, B., Chu, S.: Music summarization using key phrases. Proc. ICASSP 2, 749–752 (2000)

  137. Aucouturier, J.-J., Pachet, F., Sandler, M.: The way it sounds: timbre models for analysis and retrieval of music signals. IEEE Trans. Multimedia 7(6), 1028–1035 (2005)

  138. Aucouturier, J.-J., Sandler, M.: Segmentation of musical signals using hidden Markov models. In: Proceedings of the 110th AES Convention, AES (Audio Engineering Society), Amsterdam, The Netherlands (2001)

  139. Jehan, T.: Hierarchical multi-class self similarities. In: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 311–314 (2005)

  140. Foote, J.: Visualizing music and audio using self-similarity. In: Proceedings of 7th ACM International Conference on Multimedia (Part 1), pp. 77–80 (1999)

  141. Cooper, M., Foote, J.: Automatic music summarization via similarity analysis. In: Proceedings of 3rd ISMIR, pp. 81–85 (2002)

  142. Peeters, G., Burthe, A.L., Rodet, X.: Toward automatic music audio summary generation from signal analysis. In: Proceedings of 3rd ISMIR, pp. 94–100 (2002)

  143. Abdallah, S.A., Noland, K., Sandler, M., Casey, M., Rhodes, C.: Theory and evaluation of a Bayesian music structure extractor. In: Proceedings of 6th ISMIR, pp. 420–425 (2005)

  144. Goto, M.: A chorus section detection method for musical audio signals and its application to a music listening station. IEEE Trans. Audio, Speech, Lang. Process. 14(5), 1783–1794 (2006)

  145. Müller, M., Kurth, F.: Enhancing similarity matrices for music audio analysis. Proc. ICASSP 5, 9–12 (2006)

  146. D’Aguanno, A., Vercellesi, G.: Automatic synchronization between audio and partial music score representation. In: Proceedings of 6th Workshop on Adaptive Multimedia Retrieval (AMR 2008). Berlin, Germany (2008)

  147. Tolos, M., Tato, R., Kemp, T.: Mood-based navigation through large collections of musical data. In: Proceedings of 2nd CCNC 2005, pp. 71–75. Las Vegas, NV (2005)

  148. Feng, Y., Zhuang, Y., Pan, Y.: Popular music retrieval by detecting mood. In: Proceedings 26th International SIGIR Conference on Research and Development in Information Retrieval, pp. 375–376. ACM, Toronto, Canada (2003)

  149. Li, T., Ogihara, M.: Detecting emotion in music. In: Proceedings of ISMIR, pp. 239–240. Baltimore (2003)

  150. Liu, D.: Automatic mood detection from acoustic music data. In: Proceedings International Conference on Music Information Retrieval, pp. 13–17 (2003)

  151. Lu, L., Liu, D., Zhang, H.: Automatic mood detection and tracking of music audio signals. IEEE Trans. Audio, Speech, Lang. Process. 14(1), 5–18 (2006)

  152. Trohidis, K., Tsoumakas, G., Kalliris, G., Vlahavas, I.: Multi-label classification of music into emotions. In: Proceedings 9th International Conference on Music Information Retrieval (ISMIR), pp. 325–330. Philadelphia (2008)

  153. Logan, B.: Mel frequency cepstral coefficients for music modeling. In: Proceedings of ISMIR. Plymouth, USA (2000)

  154. Peeters, G.: A generic training and classification system for MIREX08 classification tasks: Audio music mood, audio genre, audio artist and audio tag. In: Proceedings of MIREX as part of the 9th International Conference on Music Information Retrieval (ISMIR), ISMIR, Philadelphia, PA (2008)

  155. Boersma, P.: Praat, a system for doing phonetics by computer. Glot Int. 5, 341–345 (2001)

  156. Chase, W.: How Music REALLY Works! 2nd edn. Roedy Black Publishing, Vancouver, Canada (2006)

  157. Harte, C.A., Sandler, M.: Automatic chord identification using a quantised chromagram. 118th Convention of the AES, May (2005)

  158. Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)

  159. Chuang, Z.-J., Wu, C.-H.: Emotion recognition using acoustic features and textual content. In: Proceedings of ICME, pp. 53–56. Taipei, Taiwan (2004)

  160. Schuller, B., Batliner, A., Steidl, S., Seppi, D.: Emotion recognition from speech: putting ASR in the loop. In: Proceedings 34th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009, pp. 4585–4588. Taipei, Taiwan (2009)

  161. Liu, H., Singh, P.: ConceptNet—a practical commonsense reasoning tool-kit. BT Technol. J. 22(4), 211–226 (2004)

  162. Schuller, B., Schenk, J., Rigoll, G., Knaup, T.: The godfather versus “chaos”: comparing linguistic analysis based on online knowledge sources and bags-of-n-grams for movie review valence estimation. In: IAPR, IEEE Proceedings 10th International Conference on Document Analysis and Recognition, ICDAR 2009, pp. 858–862, Barcelona, Spain (2009)

  163. Ekman, P., Sorenson, E., Friesen, W.: Pan-cultural elements in facial displays of emotions. Science 164, 86–88 (1969)

  164. Bradley, M.M., Lang, P.J.: Affective norms for English words (ANEW): stimuli, instruction manual, and affective ratings. Technical Report C-1, Center for Research in Psychophysiology, University of Florida, Gainesville, Florida (1999)

  165. Hu, X., Downie, J.S.: Exploring mood metadata: relationships with genre, artist and usage metadata. In: Proceedings 8th International Conference on Music Information Retrieval (ISMIR), Vienna, Austria (2007)

  166. Wöllmer, M., Eyben, F., Reiter, S., Schuller, B., Cox, C., Douglas-Cowie, E., Cowie, R.: Abandoning emotion classes—towards continuous emotion recognition with modelling of long-range dependencies. In: Proceedings INTERSPEECH 2008, 9th Annual Conference of the International Speech Communication Association, incorporating 12th Australasian International Conference on Speech Science and Technology, SST 2008, ISCA/ASSTA, ISCA, pp. 597–600. Brisbane, Australia (2008)

  167. Schuller, B., Müller, R., Eyben, F., Gast, J., Hörnler, B., Wöllmer, M., Rigoll, G., Höthker, A., Konosu, H.: Being bored? recognising natural interest by extensive audiovisual integration for real-life application. Image Vis. Comput. Spec. Issue Vis. Multimodal Anal. Hum. Spontaneous Behav. 27(12), 1760–1774 (2009)

  168. Mesaros, A., Virtanen, T., Klapuri, A.: Singer identification in polyphonic music using vocal separation and pattern recognition methods. In: Proceedings of ISMIR, pp. 375–378 (2007)

  169. Mesaros, A., Virtanen, T.: Automatic recognition of lyrics in singing. EURASIP J. Audio, Speech, Music Process. Article ID 546047 (2009)

  170. Durrieu, J.-L., Richard, G., David, B., Févotte, C.: Source/filter model for unsupervised main melody extraction from polyphonic audio signals. IEEE Trans. Audio, Speech, Lang. Process. 18(3), 564–575 (2010)

  171. Durrieu, J.-L., Richard, G., David, B.: An iterative approach to monaural musical mixture de-soloing. In: Proceedings of ICASSP, pp. 105–108, Taipei, Taiwan (2009)

  172. Eyben, F., Wöllmer, M., Schuller, B.: openSMILE: the Munich versatile and fast open-source audio feature extractor. In: Proceedings of the 9th ACM International Conference on Multimedia, MM 2010, pp. 1459–1462. ACM, Florence, Italy (2010)

Author information

Corresponding author

Correspondence to Björn Schuller.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Schuller, B. (2013). Applications in Intelligent Music Analysis. In: Intelligent Audio Analysis. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36806-6_11

  • DOI: https://doi.org/10.1007/978-3-642-36806-6_11

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36805-9

  • Online ISBN: 978-3-642-36806-6

  • eBook Packages: Engineering, Engineering (R0)
