A Survey of Music Structure Analysis Techniques for Music Applications

  • Namunu C. Maddage
  • Haizhou Li
  • Mohan S. Kankanhalli
Part of the Studies in Computational Intelligence book series (SCI, volume 231)

Abstract

Music carries multilayer information that forms different structures. The information embedded in music can be categorized into time information, harmony/melody, music regions, music similarities, song structures, and music semantics. In this chapter, we first survey existing techniques for extracting and analyzing music structure information. We then discuss how music structure information extraction helps in developing music applications. Experimental studies indicate that the success of long-term music research depends on how well we integrate domain knowledge from relevant disciplines such as musicology, psychology, and signal processing.

Keywords

Music Information Retrieval; Music Structure; Music Signal; Music Content; Singing Voice



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Namunu C. Maddage (1)
  • Haizhou Li (2)
  • Mohan S. Kankanhalli (3)

  1. Electrical and Computer Engineering, Royal Melbourne Institute of Technology (RMIT) University, Melbourne
  2. Institute for Infocomm Research (I2R), Connexis, Singapore
  3. School of Computing, National University of Singapore, Singapore
