Applications in Intelligent Music Analysis

Schuller, Björn

doi:10.1007/978-3-642-36806-6_11

Part of the book series: Signals and Communication Technology (SCT)

Abstract

Digitised music has dominated the market for more than a decade, and advanced techniques of Intelligent Music Analysis are gaining interest and importance accordingly. From this field, recent application examples selected from the author's work are presented in detail, including current performance benchmarks. These comprise drum-beat separation; onset detection; tempo, metre, ballroom dance style, and mood determination; key and chord recognition; and structure analysis, alongside singer trait classification. The latter covers recognition of singer age, gender, race, and height.

Of all noises, I think music is the least disagreeable.

—Samuel Johnson.

Notes

1. http://mtg.upf.edu/ismir2004/contest/tempoContest/node5.html
2. http://w3.ift.ulaval.ca/~allac88/dataset.tar.gz
3. http://www.mmk.ei.tum.de/~~sch/brd.txt
4. http://www.openaudio.eu/chord.txt
5. The term downbeat stems from orchestral conducting: the lowest point of the baton signals the downbeat.
6. http://www.olga.net
7. Allmusic (http://www.allmusic.com)
8. MIREX 2008 (http://www.music-ir.org/mirex/2008)
9. The annotation scheme is inspired by the TIMIT corpus, as used in Sect. 10.4.3. The term ‘race’ is accordingly adopted from that corpus’ meta-information, although modern biology often classifies Homo sapiens sapiens neither by race nor by sub-categories for collective differentiation in physical and behavioural traits. Current molecular biological and population genetic research holds that a systematic categorisation may be insufficient to describe the enormous diversity and fluid differences between geographic populations; against this view, it can be argued that, for an end-user information retrieval system, a categorisation into illustrative, archetypal categories can be useful.
10. http://www.imdb.com
11. http://www.wikipedia.org
12. http://www.youtube.com
13. http://www.openaudio.eu/UltraStar_Singers.arff
14. A complex random variable whose real and imaginary parts are independent and follow a real Gaussian distribution with mean \(0\) and identical variance (or covariance matrix in the case of a multivariate distribution).
15. http://www.durrieu.ch/phd/software.html
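
For the scalar case of the definition in note 14, the following is a brief sketch in formula form. The symbols \(z\), \(x\), \(y\), and \(\sigma^2\) are notation assumed here for illustration and do not come from the chapter; in particular, \(\sigma^2\) is taken as the variance of each part, whereas some conventions assign \(\sigma^2/2\) to each part.

```latex
\[
z = x + \mathrm{i}\,y, \qquad
x,\, y \sim \mathcal{N}(0, \sigma^2)\ \text{independent}, \qquad
p(z) = \frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{|z|^2}{2\sigma^2}\right)
\]
```

Under this assumption, \(\mathrm{E}[z] = 0\) and \(\mathrm{E}[|z|^2] = 2\sigma^2\).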

References

Casey, M., Slaney, M.: Fast recognition of remixed music audio. In: IEEE Proceedings International Conference on Audio Speech and Signal Processing (ICASSP), vol. IV, pp. 1425–1428 (2007)
Google Scholar
Schuller, B., Zobl, M., Rigoll, G., Lang, M.: A hybrid music retrieval system using belief networks to integrate queries and contextual knowledge. In: IEEE Proceedings 4th IEEE International Conference on Multimedia and Expo, ICME 2003, vol. I, pp. 57–60. Baltimore (2003)
Google Scholar
Schuller, B., Rigoll, G., Lang, M.: Multimodal music retrieval for large databases. In: IEEE Proceedings 5th IEEE International Conference on Multimedia and Expo, ICME 2004, vol. 2, pp. 755–758. Taipei, Taiwan (2004)
Google Scholar
Downie, J.: Music information retrieval. Ann. Rev. Inform. Sci. Technol. 37, 295–340 (2003)
Article Google Scholar
Scheirer, E.D.: Tempo and beat analysis of acoustic musical signals. Acoust. Soc. Am. 103(1), 588–601 (1998)
Article Google Scholar
Schuller, B., Eyben, F., Rigoll, G.: Tango or waltz?—putting ballroom dance style into tempo detection. EURASIP J. Audio, Speech, Music Process. Spec. Issue Intell. Audio, Speech, Music Process. Appl. (Article ID 846135), 12 (2008)
Google Scholar
Tzanetakis, G., Cook, P.: Musical genre classification of audio signals. IEEE Trans. Speech Audio Process. 10(5), 293–302 (2002)
Article Google Scholar
Schuller, B., Hage, C., Schuller, D., Rigoll, G.: “mister d.j., cheer me up!”: Musical and textual features for automatic mood classification. J. New Music Res. 39(1), 13–34 (2010)
Article Google Scholar
Berenzweig, A., Ellis, D.: Locating Singing Voice Segments Within Musical Signals. In: Proceedings of International Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 119–123. Mohonk, New York (2001)
Google Scholar
Schuller, B., Hörnler, B., Arsić, D., Rigoll, G.: Audio chord labeling by musiological modeling and beat-synchronization. In: IEEE Proceedings 10th IEEE International Conference on Multimedia and Expo, ICME 2009, pp. 526–529. New York (2009)
Google Scholar
Bellmann, H.: About the determination of key of a musical excerpt. In: Computer Music Modeling and Retrieval, vol. 3902 LNCS, pp. 76–91. Springer, Berlin (2006)
Google Scholar
Dannenberg, R., Goto, M.: Music structure analysis from acoustic signals. In: Havelock, D., Kuwano, S., Vorländer, M. (eds.) Handbook of Signal Processing in Acoustics, vol. 1, pp. 305–331. Springer (2009)
Google Scholar
Dixon, S., Pampalk, E., Widmer, G.: Classification of dance music by periodicity patterns. In: Proceedings of the 4th International Conference on Music, Information Retrieval, pp. 159–165 (2003)
Google Scholar
Foote, J., Uchihashi, S.: The beat spectrum: a new approach to rhythm analysis. In: Proceedings of International Conference on Multimedia and Expo (ICME), IEEE, Tokyo (2001)
Google Scholar
Hu, N., Dannenberg, R.B., Tzanetakis, G.: Polyphonic audio matching and alignment for music retrieval. In: Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 185–188 (2003)
Google Scholar
Klapuri, A.P., Eronen, A.J., Astola, J.T.: Analysis of the meter of acoustic musical signals. IEEE Trans. Speech Audio Process. 14(1), 342–355 (2006)
Article Google Scholar
Müller, M., Ellis, D., Klapuri, A., Richard, G.: Signal processing for music analysis. IEEE J. Sel. Top. Sig. Process. 5(6), 1088–1110 (2011)
Article Google Scholar
Orio, N.: Music retrieval: a tutorial and review. Found. Trends Inf. Retrieval 1, 1–90 (2006)
Article MATH Google Scholar
Uhle, C., Rohden, J., Cremer, M., Herre, J.: Low complexity musical meter estimation from polyphonic music. In: Proceedings of the AES 25th international conference, pp. 63–68. London, UK (2004)
Google Scholar
Weninger, F., Schuller, B., Liem, C., Kurth, F., Hanjalic, A.: Music information retrieval: an inspirational guide to transfer from related disciplines. In: Müller, M., Goto, M. (eds.) Multimodal Music Processing, vol. 11041, pp. 195–215. Seminar of Dagstuhl Follow-UpsSchloss Dagstuhl, Germany (2012)
Google Scholar
Schuller, B., Lehmann, A., Weninger, F., Eyben, F., Rigoll, G.: Blind enhancement of the rhythmic and harmonic sections by nmf: Does it help? In: Proceedings International Conference on Acoustics Including the 35th German Annual Conference on Acoustics, NAG/DAGA 2009, pp. 361–364, Acoustical Society of the Netherlands, DEGA, Rotterdam, The Netherlands (2009)
Google Scholar
Böck, S., Eyben, F., Schuller, B.: Onset detection with bidirectional long short-term memory neural networks. In: Proceedings Annual Meeting of the MIREX 2010 Community as Aart of the 11th International Conference on Music Information Retrieval, ISMIR. p. 2. Utrecht, Netherlands (2010)
Google Scholar
Eyben, F., Böck, S., Schuller, B., Graves, A.: Universal onset detection with bidirectional long-short term memory neural networks. In: Proceedings 11th International Society for Music Information Retrieval Conference, ISMIR 2010, pp. 589–594. Utrecht, The Netherlands (2010)
Google Scholar
Schuller, B., Eyben, F., Rigoll, G.: Fast and robust meter and tempo recognition for the automatic discrimination of ballroom dance styles. In: IEEE Proceedings 32nd IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2007, vol. I, pp. 217–220. Honolulu, HY (2007)
Google Scholar
Eyben, F., Schuller, B., Reiter, S., Rigoll, G.: Wearable assistance for the ballroom-dance hobbyist—holistic rhythm analysis and dance-style classification. In: IEEE Proceedings 8th IEEE International Conference on Multimedia and Expo, ICME 2007, pp. 92–95. Beijing, China (2007)
Google Scholar
Eyben, F., Schuller, B.: Tempo estimation from tatum and meter vectors. In: Proceedings Annual Meeting of the MIREX 2010 Community as Part of the 11th International Conference on Music Information Retrieval, ISMIR, p. 1. Utrecht, Netherlands (2010)
Google Scholar
Böck, S., Eyben, F., Schuller, B.: Tempo detection with bidirectional long short-term memory neural networks. In: Proceedings Annual Meeting of the MIREX 2010 community as Part of the 11th International Conference on Music Information Retrieval, ISMIR, p. 3. Utrecht, Netherlands (2010)
Google Scholar
Schuller, B., Gollan, B.: Music theoretic and perception-based features for audio key determination. J. New Music Res. 41(2), 175–193 (2012)
Article Google Scholar
Schuller, B., Eyben, F., Rigoll, G.: Beat-synchronous data-driven automatic chord labeling. In: Proceedings 34. Jahrestagung für Akustik, DAGA, DEGA, pp. 555–556. Dresden, Germany (2008)
Google Scholar
Schuller, B., Dibiasi, F., Eyben, F., Rigoll, G.: One day in half an hour: music thumbnailing incorporating harmony- and rhythm structure. In: Proceedings 6th Workshop on Adaptive Multimedia Retrieval, AMR 2008, p. 10. Berlin, Germany (2008)
Google Scholar
Schuller, B., Dibiasi, F., Eyben, F., Rigoll, G.: Music thumbnailing incorporating harmony- and rhythm structure. In: Detyniecki, M., Leiner, U., Nürnberger, A. (eds.) Adaptive Multimedia Retrieval: 6th International Workshop, AMR 2008, June 26–27, Berlin, Germany (2008). Revised Selected Papers, vol. 5811/2010, Lecture Notes in Computer Science (LNCS), pp. 78–88. Springer, Berlin (2010)
Google Scholar
Schuller, B., Dorfner, J., Rigoll, G.: Determination of non-prototypical valence and arousal in popular music: Features and performances. EURASIP J. Audio, Speech, Music Process. Spec. Issue Scalable Audio-Content Anal. (Article ID 735854), 19 (2010)
Google Scholar
Schuller, B., Weninger, F., Dorfner, J.: Multi-modal non-prototypical music mood analysis in continuous space: Reliability and performances. In: Proceedings 12th International Society for Music Information Retrieval Conference, ISMIR 2011, pp. 759–764. Miami (2011)
Google Scholar
Schuller, B., Kozielski, C., Weninger, F., Eyben, F., Rigoll, G.: Vocalist gender recognition in recorded popular music. In: Proceedings 11th International Society for Music Information Retrieval Conference, ISMIR 2010, pp. 613–618. Utrecht, The Netherlands (2010)
Google Scholar
Weninger, F., Durrieu, J.-L., Eyben, F., Richard, G., Schuller, B.: Combining monaural source separation with long short-term memory for increased robustness in vocalist gender recognition. In: IEEE Proceedings 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, pp. 2196–2199. Prague, Czech Republic (2011)
Google Scholar
Weninger, F., Wöllmer, M., Schuller, B.: Automatic assessment of singer traits in popular music: Gender, age, height and race. In: Proceedings 12th International Society for Music Information Retrieval Conference, ISMIR 2011, pp. 37–42. Miami (2011)
Google Scholar
Grosche, P., Schuller, B., Müller, M., Rigoll, G.: Automatic transcription of recorded music. Acta Acustica united with Acustica. 98(2), 199–215(17) (2012)
Google Scholar
Schuller, B., Rigoll, G.: Self-learning acoustic feature generation and selection for the discrimination of musical signals. In: Proceedings 32. Jahrestagung für Akustik, DAGA 2006, pp. 285–286. Braunschweig, Germany (2006)
Google Scholar
Schuller, B., Wallhoff, F., Arsić, D., Rigoll, G.: Musical signal type discrimination based on large open feature sets. In: IEEE Proceedings 7th IEEE International Conference on Multimedia and Expo, ICME 2006, pp. 1089–1092. Toronto, Canada (2006)
Google Scholar
Schuller, B., Schmitt, B.J.B., Arsić, D., Reiter, S., Lang, M., Rigoll, G.: Feature selection and stacking for robust discrimination of speech, monophonic singing, and polyphonic music. In: Proceedings 6th IEEE International Conference on Multimedia and Expo, ICME 2005, pp. 840–843. Amsterdam, The Netherlands (2005)
Google Scholar
Schuller, B., Rigoll, G., Lang, M.: Hmm-based music retrieval using stereophonic feature information and framelength adaptation. In: Proceedings 4th IEEE International Conference on Multimedia and Expo, ICME 2003, vol. II, pp. 713–716. Baltimore (2003)
Google Scholar
Schuller, B., Rigoll, G., Lang, M.: Matching monophonic audio clips to polyphonic recordings. In: Proceedings 31. Jahrestagung für Akustik, DAGA, 2005, DEGA, pp. 299–300, Munich, Germany (2005)
Google Scholar
Weninger, F., Amir, N., Amir, O., Ronen, I., Eyben, F., Schuller, B.: Robust feature extraction for automatic recognition of vibrato singing in recorded polyphonic music. In: Proceedings 37th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012, pp. 85–88. Kyoto, Japan (2012)
Google Scholar
Helén, M., Virtanen, T.: Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine. In Proceedings of EUSIPCO, Antalya, Turkey (2005)
Google Scholar
Paulus, J., Virtanen, T.: Drum transcription with non-negative spectrogram factorisation. In: Proceedings of EUSIPCO, p. 4, EURASIP, Antalya, Turkey (2005)
Google Scholar
Moreau, A., Flexer, A.: Drum transcription in polyphonic music using non-negative matrix factorisation. In: Proceedings of 8th International Conference on Music Information Retrieval (ISMIR), September 23–27, pp. 353–354, Vienna, Austria (2007)
Google Scholar
Uhle, C., Dittmar, C., Sporer, T.: Extraction of drum tracks from polyphonic music using independent subspace analysis. In: Proceedings of 4th International Symposium on Independent Component Analysis and Blind Signal Separation (ICA), April 2003, pp. 843–848, Nara, Japan (2003)
Google Scholar
Smaragdis, P., Brown, J.C.: Non-negative matrix factorization for polyphonic music transcription. In: IEEE Proceedings of WASPAA, pp. 177–180 (2003)
Google Scholar
Virtanen, T., Ryynänen, M.: A. Mesaros. Combining pitch-based inference and non-negative spectrogram factorization in separating vocals from polyphonic music. In: ISCA Tutorial and Research Workshop on Statistical And Perceptual Audition, SAPA 2008, pp. 17–22, ISCA, Brisbane (2008)
Google Scholar
Sethares, W.A.: Local consonance and the relationship between timbre and scale. J. Acoust. Soc. Am. 94(3), 1218–1228 (1993)
Article MathSciNet Google Scholar
Gouyon, F., Klapuri, A.P., Dixon, S., Alonso, M., Tzanetakis, G., Uhle, C., Cano, P.: An experimental comparison of audio tempo induction algorithms. IEEE Trans. Audio, Speech, Lang. Process. 14(5), 1832–1844 (2006)
Article Google Scholar
Dixon, S.: Onset detection revisited. In: Proceedings of DAFx-06, pp. 133–137, Montreal, Canada (2006)
Google Scholar
Zhou, R., Reiss, J.: Music onset detection combining energy-based and pitch-based approaches. In: Proceedings of MIREX as part of the 8th International Conference on Music Information Retrieval (ISMIR). Sept 23–27. P. 4, Vienna, Austria (2007)
Google Scholar
Röbel, A.: Onset detection by means of transient peak classification in harmonic bands. In: Proceedings of MIREX as part of the 10th International Conference on Music Information Retrieval (ISMIR), P. 2, Kobe, Japan (2009)
Google Scholar
Klapuri, A.: Sound onset detection by applying psychoacoustic knowledge. In Proceedings of ICASSP, vol. 6, pp. 3089–3092 (1999)
Google Scholar
Bello, J., Daudet, L., Abdallah, S., Duxbury, C., Davies, M., Sandler, M.: A tutorial on onset detection in music signals. IEEE Trans. Speech Audio Process. 13(5), 1035–1047 (2005)
Article Google Scholar
Duxbury, C., Bello, J.P., Davies, M.: M. Sandler. Complex domain onset detection for musical signals. In: Proceedings of Digital Audio Effects Workshop (DAFx-03) pp. 1–4, London, UK (2003)
Google Scholar
Collins, N.: Using a pitch detector for onset detection. In Proceedings of ISMIR, pp. 100–106 (2005)
Google Scholar
Basseville, M., Nikiforov, I.V.: Detection of Abrupt Changes: Theory and Application. Prentice-Hall, Englewood Cliffs (1993)
Google Scholar
Lacoste, A., Eck, D.: Onset detection with artificial neural networks. In: Proceedings of MIREX as part of the 6th International Conference on Music Information Retrieval (ISMIR), P. 4, London, UK (2005)
Google Scholar
Graves, A.: Supervised sequence labelling with recurrent neural networks. Ph.D. Thesis, Technische Universität München (2008)
Google Scholar
Collins, N.: A comparison of sound onset detection algorithms with emphasis on psychoacoustically motivated detection functions. In: Proceedings of AES Convention 118, pp. 28–31 (2005)
Google Scholar
Handel, S.: Listening: An Introduction to the Perception of Auditory Events. MIT Press, Cambridge (1989)
Google Scholar
Böck, S.: Onset Detector 2011. In: Proceedings Annual Meeting of the MIREX 2011 Community as Part of the 12th International Conference on Music Information Retrieval. p. 2. ISMIR, ISMIR (2011)
Google Scholar
Böck, S., Arzt, A., Krebs, F., Schedl, M.: Online real-time onset detection with recurrent neural networks. In Proceedings of the 15th International Conference on Digital Audio Effects (DAFx-12), p. 4. New York, UK (2012)
Google Scholar
Klapuri, A.P.: Musical meter estimation and music transcription. In: Proceedings of Cambridge Music Processing Colloquium, Cambridge University, UK (2003)
Google Scholar
Gouyon, F., Herrera, P.: Determination of the meter of musical audio signals: seeking recurrences in beat segment descriptors. In: AES 114th Convention, Amsterdam, The Netherlands (2003)
Google Scholar
Gouyon, F., Dixon, S., Pampalk, E., Widmer, G.: Evaluating rhythmic descriptors for musical genre classification. In: Proceedings of the AES 25th International Conference, pp. 196–204. London, UK (2004)
Google Scholar
Grosche, P., Müller, M., Kurth, F.: Cyclic tempogram—a mid-level tempo representation for music signals. In: Proceedings of ICASSP, pp. 5522–5525. Dallas, TX (2010)
Google Scholar
Kirovski, D., Attias, H.: Beat-ID: identifying music with beat analysis. In: Proceedings of the International Workshop on Multimedia Signal Processing, IEEE, pp. 190–193, St. Thomas, US Virgin Islands (2002)
Google Scholar
Kurth, F., Gehrmann, T., Muller, M.: The cyclic beat spectrum: Tempo-related audio features for time-scale invariant audio identification. In: Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR 2006), pp. 35–40, Victoria, Canada (2006)
Google Scholar
Goto, M., Muraoka, Y.: A real-time beat tracking system for audio signals. In: Proceedings of the 1995 International Computer Music Conference, pp. 171–174 (1995)
Google Scholar
Goto, M., Muraoka, Y.: Real-time rhythm tracking for drumless audio signals—chord change detection for musical decisions. In: Proceedings of the IJCAI-97 Workshop on Computational Auditory Scene, Analysis, pp. 135–144 (1997)
Google Scholar
Goto, M.: An audio-based real-time beat tracking system for music with or without drum-sounds. J. New Music Res. 30(2), 159–171 (2001)
Google Scholar
Seppänen, J.: Computational models of musical meter recognition. Master’s thesis, Tampere University of Technology (2001)
Google Scholar
Dixon, S.: Automatic extraction of tempo and beat from expressive performances. J. New Music Res. 30, 39–58 (2001)
Google Scholar
Hainsworth, S., Macleod, M.: Beat tracking with particle filtering algorithms. In: 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 91–94 (2003)
Google Scholar
Alonso, M., David, B., Richard, G.: Tempo and beat estimation of musical signals. In: Proceedings of the International Conference on Music Information Retrieval, pp. 158–163 (2004)
Google Scholar
Sethares, W.A., Staley, T.W.: Meter and periodicity in musical performance. J. New Music Res. 22(5), 1–11 (2001)
Google Scholar
Brown, J.C.: Determination of meter of musical scores by autocorrelation. J. Acoust. Soc. Am. 94(4), 1953–1957 (1993)
Google Scholar
van Noorden, L., Moelants, D.: Resonance in the perception of musical pulse. J. New Music Res. 28(1), 43–66 (1999)
Article Google Scholar
Rabiner, L., Juang, B.-H.: Fundamentals of Speech Recognition. Prentice Hall, Englewood Cliffs (1993)
Google Scholar
Fu, Z., Lu, G., Ting, K.M., Zhang, D.: A survey of audio-based music classification and annotation. IEEE Trans. Multimedia 13(2), 303–319 (2011)
Article Google Scholar
Ballroomdancers.com. Preview audio examples of ballroom dance music. https://secure.ballroomdancers.com/Music/style.asp, (2006)
Zwicker, E., Fastl, H.: Psychoacoustics—Facts and Models. 2nd edn. Springer, Berlin (1999)
Google Scholar
Paulus, J., Klapuri, A.P.: Measuring the similarity of rhythmic patterns. In: Proceedings of the2002 International Conference on Music Information Retrieval (ISMIR 2002). France, Paris (2002)
Google Scholar
Gouyon, F., Dixon, S.: Dance music classification: A tempo-based approach. In: Proceedings of Fitfth International Conference on Music Information Retrieval, ISMIR, p. 4, Barcelona, Spain (2004)
Google Scholar
Daniel, A., Emiya, V., David, B.: Perceptual-based evaluation of the errors usually made when automatically transcribing music. In: Proceedings of 9th International Symposium on Music Information Retrieval (ISMIR), pp. 550–555. Philadelphia (2008)
Google Scholar
Gomez, E.: Estimating the tonality of polyphonic audio files: cognitive versus machine learning modelling strategies. In: Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain (2004)
Google Scholar
Gomez, E.: Key estimation from polyphonic audio. In: Proceedings of 1st Annual Music Information Retrieval Evaluation eXchange (MIREX’05), London, UK (2005)
Google Scholar
Izmirli, O.: Template based keyfinding from audio. In: Proceedings of International Computer Music Conference (ICMC), pp. 211–214. Barcelona, Spain (2005)
Google Scholar
Mardirossian, A., Chew, E.: Skefis a symbolic (midi) key-finding system. In: Proceedings of 6th International Symposium on Music Information Retrieval (ISMIR), no pagination, pp. 1–8. London, UK (2005)
Google Scholar
Pauws, S.: Musical keyextraction from audio. In: Proceedings of 5th International Symposium on Music Information Retrieval (ISMIR), pp. 96–99. Barcelona, Spain (2004)
Google Scholar
Chuan, C., Chew, E.: Fuzzy analysis in pitch class determination for polyphonic audio keyfinding. In: Proceedings of 6th International Symposium on Music Information Retrieval (ISMIR), pp. 296–303. London, UK (2005)
Google Scholar
Peeters, G.: Chroma-based estimation of musical key from audio-signal analysis. In: Proceedings of 7th International Symposium on Music Information Retrieval (ISMIR). Victoria, Canada (2006)
Google Scholar
Chuan, C.H., Chew, E.: Audio key finding: considerations in system design and case studies on chopins 24 preludes. EURASIP J. Adv. Sig. Process. 2007(056561) (2006)
Google Scholar
Noland, K., Sandler, M.: Key estimation using a hidden markov model. In: Proceedings of 7th International Symposium on Music Information Retrieval (ISMIR), pp. 121–126. Victoria, Canada (2006)
Google Scholar
Mandel, M.I., Ellis, D.P.W.: Song-level features and support vector machines for music classification. In: Proceedings 6th International Conference on Music Information Retrieval (ISMIR), pp. 594–599. London, UK (2005)
Google Scholar
Mauch, M., Dixon, S.: Simultaneous estimation of chords and musical context from audio. IEEE Trans. Audio, Speech Lang. Process. 18(6), 1280–1289 (2010)
Article Google Scholar
Fujishima, T.: Realtime chord recognition of musical sound: a system using common lisp music. In: Proceedings of International Computer Music Conference, pp. 464–467. Bejing, China (1999)
Google Scholar
Gomez, E.: Tonal description of polyphonic audio for music content processing. INFORMS J. Comput. 18(3), 294–304 (2006)
Article Google Scholar
Temperly, D.: An algorithm for harmonic analysis. Music Percept. 15, 31–68 (1997)
Article Google Scholar
Madsen, S.T., Widmer, G.: Key-finding with interval profiles. In: Proceedings International Computer Music Conference (ICMC), p. 4, Copenhagen, Denmark (2007)
Google Scholar
Lee, K., Slaney, M.: A unified system for chord transcription and key extraction using hidden markov models. In: Proceedings of 8th International Symposium on Music Information Retrieval (ISMIR). Vienna, Austria (2007)
Google Scholar
Lee, K., Slaney, M.: Acoustic chord transcription and key extraction from audio using key-dependent hmms trained on synthesized audio. IEEE Trans. Audio, Speech, Lang. Process. 16, 291–301 (2008)
Article Google Scholar
Purwins, H., Blankertz, B., Dornhege, G., Obermayer, K.: Scale degree from audio investigated with machine learning techniques. In: Proceedings Audio Engineering Society 116th Convention (2004)
Google Scholar
Purwins, H.: Profiles of Pitch Classes Circularity of Relative Pitch and Key—Experiments, Models, Computational Music Analysis, and Perspectives. Ph.D. thesis, Technische Universität, Berlin (2005)
Google Scholar
Cremer, M., Derboven, C.: A system for harmonic analysis of polyphonic music. In: Proceedings 25th International AES Conference. London, UK (2004)
Google Scholar
Sun, J., Li, H., Li, L.: Key detection through pitch class distribution model and ANN. In: 2009 16th International Conference on Digital Signal Processing. Santorini, Hellas (2009)
Google Scholar
Cabral, G., Briot, J.-P., Pachet, F.: Impact of distance in pitch class profile computation. In: Proceedings of 10th Brazilian Symposium on Computer Music (SBCM2005), pp. 135–144. Belo Horizonte, Brazil (2005)
Google Scholar
Izmirli, O.: Audio key finding using low-dimensional spaces. In: Proceedings of 7th International Conference on Music Information Retrieval (ISMIR). Victoria, Canada (2006)
Google Scholar
Izmirli, O.: An algorithm for audio key finding. In: Proceedings of Music Information Retrieval Evaluation Exchange (MIREX2005), as Part of the 6th International Symposium on Music Information Retrieval (ISMIR). London, UK (2006)
Google Scholar
Zhu, Y.: An audio key finding algorithm. In: Proceedings of the 1st Annual Music Information Retrieval Evaluatione Xchange(MIREX’05). London, UK (2005)
Google Scholar
Zhu, Y.: Music key detection for musical audio. In: Proceedings of 11th International Multimedia Modelling Conference (MMM’05). Melbourne, Australia (2005)
Google Scholar
Chew, E.: An algorithm for determining key boundaries. In: Proceedings 2nd International Conference on Music and Artificial Intelligence (ICMAI). Edinburgh, Scotland (2002)
Google Scholar
Shenoy, A., Mohapatra, R., Wang, Y.: Key determination of acoustic musical signals. In: Proceedings of International Conference on Multimedia and Expo(ICME). Singapore (2004)
Google Scholar
Sheh, A., Ellis, D.: Chord segmentation and recognition using emtrained hidden markov models. In: Proceedings of ISMIR 2003, pp. 183–189. Baltimore, Maryland (2003)
Google Scholar
Chai, W., Vercoe, B.: Detection of key change in classical piano music. In: Proceedings 6th International Conference on Music Information Retrieval (ISMIR), pp. 468–474. London, UK (2005)
Google Scholar
Casey, M., Veltkamp, R., Goto, M., Leman, M., Rhodes, C., Slaney, M.: Content-based music information retrieval: current directions and future challenges. Proc. IEEE 96(4), 668–696 (2008)
Google Scholar
Bartsch, M.A., Wakefield, G.H.: To catch a Chorus: using chroma-based representations for audio thumbnailing. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2001, pp. 15–18, New Paltz, New York (2001)
Google Scholar
Wakefield, G.: Mathematical representation of joint time chroma distributions. In: Proceedings of SPIE, vol. 3807, pp. 637–645. Denver, Colorado (1999)
Google Scholar
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explor. 11(1), 10–18 (2009)
Google Scholar
Krumhansl, C.: Tonal hierarchies and rare intervals in music cognition. Music Percept. Interdisc. J. 7(3), 309–324 (1990)
Article Google Scholar
Gatzsche, D., Gatzsche, G., Mehnert, M., Brandenburg, K.: A symmetry based approach for musical tonality analysis. In: Proceedings of 8th International Society for Music Information Retrieval Conference (ISMIR), no pagination, Vienna, Austria (2007)
Google Scholar
Bello, J.P., Daudet, L., Sandler, M.B.: Automatic piano transcription using frequency and time-domain information. IEEE Trans. Audio, Speech Lang. Process. 14(6), 2242–2251 (2006)
Article Google Scholar
Duan, Z., Lu, L., Zhang, C.: Audio tonality mode classification without tonic annotations. In: Proceedings of 8th IEEE International Conference on Multimedia and Expo (ICME), pp. 1361–1364. Hannover, Germany (2008)
Google Scholar
Izmirli, O.: Tonal-atonal classification of music audio using diffusion maps. In: Proceedings of 10th International Society for Music Information Retrieval Conference (ISMIR 2009), pp. 687–691. Kobe, Japan (2009)
Google Scholar
Vuvan, D., Prince, J., Schmuckler, M.: Probing the minor tonal hierarchy. Music Percept. Interdisc. J. 28(5), 461–472 (2011)
Article Google Scholar
Papadopoulos, H., Peeters, G.: Local key estimation based on harmonic and metric structures. In: Proceedings of 12th International Conference on Digital Audio Effects (DAFx-09), pp. 1–8. Como, Italy (2009)
Google Scholar
Goto, M., Muraoka, Y.: An audio-based real-time beat tracking system and its applications. In: Proceedings International Computer Music Confernce, pp. 17–20. ICMA, San Francisco (1998)
Google Scholar
Rocher, T., Robine, M., Hanna, P., Oudre, L.: Concurrent estimation of chords and keys from audio. In: Proceedings of 11th International Society for Music Information Retrieval Conference (ISMIR), pp. 141–146. Utrecht, The Netherlands (2010)
Google Scholar
Bello, J.B., Pickens, J.: A robust mid-level representation for harmonic content in music signals. Proc. ISMIR 2005, 304–311 (2005)
Google Scholar
Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Liu, X., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., Woodland, P.: The HTK book (v3.4). Cambridge University Press, Cambridge (2006)
Google Scholar
Stober, S., Nürnberger, A.: Towards user-adaptive structuring and organization of music collections. In: Proceedings of 6th Workshop on Adaptive Multimedia Retrieval (AMR 2008), Berlin, Germany (2008)
Google Scholar
Burges, C.J.C., Plastina, D., Platt, J.C., Renshaw, E., Malvar, H.S.: Duplicate detection and audio thumbnails with audio fingerprinting. Technical Report MSR-TR-2004-19, Microsoft Research (MSR), March (2004)
Google Scholar
Logan, B., Chu, S.: Music summarization using key phrases. Proc. ICASSP 2, 749–752 (2000)
Google Scholar
Aucouturier, J.-J., Pachet, F., Sandler, M.: The way it sounds: timbre models for analysis and retrieval of music signals. IEEE Trans. Multimedia 7(6), 1028–1035 (2005)
Article Google Scholar
Aucouturier, J.-J., Sandler, M.: Segmentation of musical signals using hidden markov models. In: Proceedings of the 110th AES Convention, AES (Audio Engineering Society), Amsterdam, The Netherlands (2001)
Google Scholar
Jehan, T.: Hierarchical multi-class self similarities. In: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 311–314 (2005)
Google Scholar
Foote, J.: Visualizing music and audio using self-similarity. In: Proceedings of 7th ACM International Conference on Multimedia (Part 1), pp. 77–80 (1999)
Google Scholar
Cooper, M., Foote, J.: Automatic music summarization via similarity analysis. In: Proceedings of 3rd ISMIR, pp. 81–85 (2002)
Peeters, G., Burthe, A.L., Rodet, X.: Toward automatic music audio summary generation from signal analysis. In: Proceedings of 3rd ISMIR, pp. 94–100 (2002)
Abdallah, S.A., Noland, K., Sandler, M., Casey, M., Rhodes, C.: Theory and evaluation of a Bayesian music structure extractor. In: Proceedings of 6th ISMIR, pp. 420–425 (2005)
Goto, M.: A chorus section detection method for musical audio signals and its application to a music listening station. IEEE Trans. Audio, Speech, Lang. Process. 14(5), 1783–1794 (2006)
Müller, M., Kurth, F.: Enhancing similarity matrices for music audio analysis. Proc. ICASSP 5, 9–12 (2006)
D’Aguanno, A., Vercellesi, G.: Automatic synchronization between audio and partial music score representation. In: Proceedings of 6th Workshop on Adaptive Multimedia Retrieval (AMR 2008). Berlin, Germany (2008)
Tolos, M., Tato, R., Kemp, T.: Mood-based navigation through large collections of musical data. In: Proceedings of 2nd CCNC 2005, pp. 71–75. Las Vegas, NV (2005)
Feng, Y., Zhuang, Y., Pan, Y.: Popular music retrieval by detecting mood. In: Proceedings 26th International SIGIR Conference on Research and Development in Information Retrieval, pp. 375–376. ACM, Toronto, Canada (2003)
Li, T., Ogihara, M.: Detecting emotion in music. In: Proceedings of ISMIR, pp. 239–240. Baltimore (2003)
Liu, D.: Automatic mood detection from acoustic music data. In: Proceedings International Conference on Music Information Retrieval, pp. 13–17 (2003)
Lu, L., Liu, D., Zhang, H.: Automatic mood detection and tracking of music audio signals. IEEE Trans. Audio, Speech, Lang. Process. 14(1), 5–18 (2006)
Trohidis, K., Tsoumakas, G., Kalliris, G., Vlahavas, I.: Multi-label classification of music into emotions. In: Proceedings 9th International Conference on Music Information Retrieval (ISMIR), pp. 325–330. Philadelphia (2008)
Logan, B.: Mel frequency cepstral coefficients for music modeling. In: Proceedings of ISMIR. Plymouth, USA (2000)
Peeters, G.: A generic training and classification system for MIREX08 classification tasks: Audio music mood, audio genre, audio artist and audio tag. In: Proceedings of MIREX as part of the 9th International Conference on Music Information Retrieval (ISMIR), ISMIR, Philadelphia, PA (2008)
Boersma, P.: Praat, a system for doing phonetics by computer. Glot Int. 5, 341–345 (2001)
Chase, W.: How Music REALLY Works!, 2nd edn. Roedy Black Publishing, Vancouver, Canada (2006)
Harte, C.A., Sandler, M.: Automatic chord identification using a quantised chromagram. 118th Convention of the AES, May (2005)
Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)
Chuang, Z.-J., Wu, C.-H.: Emotion recognition using acoustic features and textual content. In: Proceedings of ICME, pp. 53–56. Taipei, Taiwan (2004)
Schuller, B., Batliner, A., Steidl, S., Seppi, D.: Emotion recognition from speech: putting ASR in the loop. In: Proceedings 34th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009, pp. 4585–4588. Taipei, Taiwan (2009)
Liu, H., Singh, P.: ConceptNet—a practical commonsense reasoning tool-kit. BT Technol. J. 22(4), 211–226 (2004)
Schuller, B., Schenk, J., Rigoll, G., Knaup, T.: The godfather versus “chaos”: comparing linguistic analysis based on online knowledge sources and bags-of-n-grams for movie review valence estimation. In: IAPR, IEEE Proceedings 10th International Conference on Document Analysis and Recognition, ICDAR 2009, pp. 858–862, Barcelona, Spain (2009)
Ekman, P., Sorenson, E., Friesen, W.: Pan-cultural elements in facial displays of emotions. Science 164, 86–88 (1969)
Bradley, M.M., Lang, P.J.: Affective norms for English words (ANEW): Stimuli, instruction manual, and affective ratings. Technical Report C-1, Center for Research in Psychophysiology, University of Florida, Gainesville, Florida (1999)
Hu, X., Downie, J.S.: Exploring mood metadata: relationships with genre, artist and usage metadata. In: Proceedings 8th International Conference on Music Information Retrieval (ISMIR), Vienna, Austria (2007)
Wöllmer, M., Eyben, F., Reiter, S., Schuller, B., Cox, C., Douglas-Cowie, E., Cowie, R.: Abandoning emotion classes—towards continuous emotion recognition with modelling of long-range dependencies. In: Proceedings INTERSPEECH 2008, 9th Annual Conference of the International Speech Communication Association, incorporating 12th Australasian International Conference on Speech Science and Technology, SST 2008, ISCA/ASSTA, ISCA, pp. 597–600. Brisbane, Australia (2008)
Schuller, B., Müller, R., Eyben, F., Gast, J., Hörnler, B., Wöllmer, M., Rigoll, G., Höthker, A., Konosu, H.: Being bored? recognising natural interest by extensive audiovisual integration for real-life application. Image Vis. Comput. Spec. Issue Vis. Multimodal Anal. Hum. Spontaneous Behav. 27(12), 1760–1774 (2009)
Mesaros, A., Virtanen, T., Klapuri, A.: Singer identification in polyphonic music using vocal separation and pattern recognition methods. In: Proceedings of ISMIR, pp. 375–378 (2007)
Mesaros, A., Virtanen, T.: Automatic recognition of lyrics in singing. EURASIP J. Audio, Speech, Music Process. Article ID 546047 (2009)
Durrieu, J.-L., Richard, G., David, B., Févotte, C.: Source/filter model for unsupervised main melody extraction from polyphonic audio signals. IEEE Trans. Audio, Speech, Lang. Process. 18(3), 564–575 (2010)
Durrieu, J.-L., Richard, G., David, B.: An iterative approach to monaural musical mixture de-soloing. In: Proceedings of ICASSP, pp. 105–108, Taipei, Taiwan (2009)
Eyben, F., Wöllmer, M., Schuller, B.: openSMILE—the Munich versatile and fast open-source audio feature extractor. In: Proceedings of the 18th ACM International Conference on Multimedia, MM 2010, pp. 1459–1462. ACM, Florence, Italy (2010)

Author information

Authors and Affiliations

LS für Mensch-Maschine-Kommunikation, TU München, Arcisstr. 21, 80290, München, Germany
Björn Schuller

Corresponding author

Correspondence to Björn Schuller.

About this chapter

Cite this chapter

Schuller, B. (2013). Applications in Intelligent Music Analysis. In: Intelligent Audio Analysis. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36806-6_11

DOI: https://doi.org/10.1007/978-3-642-36806-6_11
Published: 25 April 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36805-9
Online ISBN: 978-3-642-36806-6
eBook Packages: Engineering, Engineering (R0)