An HMM-Based Framework for Supporting Accurate Classification of Music Datasets

  • Alfredo Cuzzocrea
  • Enzo Mumolo
  • Gianni Vercelli
Part of the Studies in Big Data book series (SBD, volume 40)


In this paper, we use Hidden Markov Models (HMMs) and Mel-Frequency Cepstral Coefficients (MFCCs) to build statistical models of classical music composers directly from audio music datasets. Musical pieces are grouped by instrumentation (String, Piano, Chorus, Orchestra), and, for each instrumentation, statistical models of the composers are computed. We selected 19 composers spanning four centuries, using a total of 400 musical pieces. A musical piece is attributed to the composer whose HMM yields the highest likelihood for that piece. We show that the resulting models can be used to obtain useful information on the correlations between composers. Moreover, using the same maximum-likelihood approach, we also classify the instrumentation used by a composer. Beyond its use as an analysis tool, the described approach thus serves as a classifier, which overall yields an HMM-based framework for supporting accurate classification of music datasets. On a dataset of String Quartet movements, we obtained an average composer classification accuracy of more than 96%. For instrumentation classification, we obtained an average accuracy of slightly less than 100% for Piano, Orchestra, and String Quartet. The most significant results of our experimental assessment and analysis are reported and discussed in detail.
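The maximum-likelihood decision rule described in the abstract can be sketched in a few lines: train (or assume) one HMM per composer, compute the likelihood of an observation sequence under each model with the forward algorithm, and pick the composer whose model scores highest. The sketch below is a toy illustration with discrete observation symbols standing in for quantized MFCC frames; the actual model topologies, feature extraction, and parameters used in the paper are not reproduced here, and all names and numbers in the code are assumptions.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: returns log P(obs | HMM).

    pi : (K,)   initial state probabilities
    A  : (K, K) state transition matrix
    B  : (K, M) emission probabilities over M discrete symbols
    """
    alpha = pi * B[:, obs[0]]          # forward variable at t = 0
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()               # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate and absorb emission
        s = alpha.sum()
        loglik += np.log(s)
        alpha /= s
    return loglik

def classify(obs, models):
    """Attribute obs to the model (composer) with the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Two hypothetical 2-state composer models over a 2-symbol alphabet.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1],
              [0.1, 0.9]])
B_x = np.array([[0.9, 0.1],
                [0.8, 0.2]])  # "composer X" tends to emit symbol 0
B_y = np.array([[0.2, 0.8],
                [0.1, 0.9]])  # "composer Y" tends to emit symbol 1
models = {"X": (pi, A, B_x), "Y": (pi, A, B_y)}

print(classify([0, 0, 0, 0], models))  # X
print(classify([1, 1, 1, 1], models))  # Y
```

In practice each model would be trained on quantized MFCC vectors from the pieces of one composer, and continuous-emission (e.g. Gaussian-mixture) HMMs could replace the discrete emissions without changing the decision rule.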



Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  • Alfredo Cuzzocrea (1)
  • Enzo Mumolo (2)
  • Gianni Vercelli (3)
  1. DIA Department, University of Trieste and ICAR-CNR, Trieste, Italy
  2. DIA Department, University of Trieste, Trieste, Italy
  3. DIBRIS Department, University of Genova, Genova, Italy
