Audio Songs Classification Based on Music Patterns

  • Rahul Sharma
  • Y. V. Srinivasa Murthy
  • Shashidhar G. Koolagudi
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 381)


In this work, an effort has been made to classify audio songs based on their music patterns, which helps to retrieve music clips according to the listener’s taste. This task is helpful in indexing and accessing music clips based on the listener’s state. Seven main categories are considered: devotional, energetic, folk, happy, pleasant, sad, and sleepy. Forty music clips of each category are used for the training phase and fifteen clips of each category for the testing phase. Vibrato-related features such as jitter and shimmer are extracted along with the mel-frequency cepstral coefficients (MFCCs). Statistical values of pitch (minimum, maximum, mean, and standard deviation) are computed and added to the MFCCs, jitter, and shimmer, which results in a 19-dimensional feature vector. A feedforward backpropagation neural network (BPNN) is used as the classifier due to its efficiency in mapping nonlinear relations. An average accuracy of 82% is achieved on the 105 testing clips.
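The composition of the 19-dimensional feature vector and the dataset sizes can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details from the abstract: the count of 13 MFCCs is only inferred from the arithmetic (19 minus four pitch statistics, jitter, and shimmer), the jitter and shimmer formulas are the commonly used "local" definitions and may differ from the ones the authors used, and all function names and the ordering of features are hypothetical.

```python
from statistics import mean, pstdev

CATEGORIES = ["devotional", "energetic", "folk", "happy",
              "pleasant", "sad", "sleepy"]
TRAIN_PER_CATEGORY = 40   # stated in the abstract
TEST_PER_CATEGORY = 15    # stated in the abstract
N_MFCC = 13               # assumption: 19 - 4 pitch statistics - jitter - shimmer


def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean.

    Applied to pitch periods this is the common "local jitter"; applied to
    peak amplitudes it is "local shimmer". Assumed definition, not the paper's.
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def pitch_statistics(pitch_hz):
    """Return [min, max, mean, standard deviation] of a pitch contour."""
    return [min(pitch_hz), max(pitch_hz), mean(pitch_hz), pstdev(pitch_hz)]


def build_feature_vector(mfcc, periods, amplitudes, pitch_hz):
    """Concatenate MFCCs, jitter, shimmer, and pitch statistics (19 values)."""
    if len(mfcc) != N_MFCC:
        raise ValueError(f"expected {N_MFCC} MFCCs, got {len(mfcc)}")
    jitter = local_perturbation(periods)
    shimmer = local_perturbation(amplitudes)
    return list(mfcc) + [jitter, shimmer] + pitch_statistics(pitch_hz)


# Toy values only; real inputs would come from an audio analysis front end.
toy_vector = build_feature_vector(
    mfcc=[0.0] * N_MFCC,
    periods=[0.010, 0.011, 0.010],
    amplitudes=[0.80, 0.78, 0.81],
    pitch_hz=[100.0, 90.9, 100.0],
)
n_train = len(CATEGORIES) * TRAIN_PER_CATEGORY
n_test = len(CATEGORIES) * TEST_PER_CATEGORY
```

With seven categories, the stated per-category counts give 280 training clips and 105 testing clips, which matches the test-set size reported in the abstract.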


Music classification; Music indexing and retrieval; Mel-frequency cepstral coefficients; Artificial neural networks; Pattern recognition; Statistical properties; Vibrato



Copyright information

© Springer India 2016

Authors and Affiliations

  • Rahul Sharma (1)
  • Y. V. Srinivasa Murthy (1)
  • Shashidhar G. Koolagudi (1)
  1. National Institute of Technology Karnataka, Surathkal, India
