Component Characterization of Western and Indian Classical Music

  • Shivam Sharma
  • Seema Ghisingh
  • Vinay Kumar Mittal
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 678)


Conventional pitch detection algorithms are known to be highly useful for speech source analysis, but they are less reliable when processing polyphonic acoustic mixtures such as music. This is an investigative study of music components such as rhythm, accompaniment and lyrical voicing, which we see as a critical step towards targeted music component identification and processing. Two popular music forms, Western and Hindustani Classical, make up the study dataset. For the Western cases, a preliminary comparative analysis of spectral characteristics such as harmonics and energy is carried out to characterize music-only regions against lyrics-music mixture regions. \(F_{0}\) contour analysis of these regions, using autocorrelation and zero-frequency filtering, indicates the utility of the latter for identifying lyrical-voicing onsets. Short-time spectral analysis gives a distinctive picture of how the harmonic structure varies with the music polyphony. Strength of excitation is found to be insightful for characterizing sounds such as bass sounds, which are prominent in percussion instruments. For the Classical music study, \(F_{0}\) contour analysis using the raw signal and the LP residual elucidates a characteristic average-pitch effect: the average pitch is higher in the Alaap region for female artists and in the lyrics composition regions for male artists, giving cues for applications such as Raaga identification and summarization. The analysis of excitation source features for various music components presented in this work offers insightful observations and clues towards effective music component processing.
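
As a rough illustration of the autocorrelation-based \(F_{0}\) contour analysis mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The function names (`autocorr_f0`, `f0_contour`), the 60–500 Hz search range, and the 40 ms frame with 10 ms hop are all assumptions chosen for illustration. Zero-frequency filtering and LP residual analysis are not shown.

```python
import numpy as np

def autocorr_f0(frame, sr, fmin=60.0, fmax=500.0):
    """Estimate F0 (Hz) of one frame from the peak of its autocorrelation.

    fmin/fmax are assumed search bounds, not values from the paper.
    """
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep only the non-negative lags.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] <= 0:
        return 0.0  # silent frame: no pitch estimate
    lag_min = int(sr / fmax)
    lag_max = min(int(sr / fmin), len(ac) - 1)
    # The lag of the strongest peak in the search range is the pitch period.
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sr / lag

def f0_contour(signal, sr, frame_len=0.04, hop=0.01):
    """Frame-wise F0 contour; frame and hop lengths (seconds) are assumed."""
    n = int(frame_len * sr)
    h = int(hop * sr)
    starts = range(0, len(signal) - n + 1, h)
    return np.array([autocorr_f0(signal[i:i + n], sr) for i in starts])

# Usage: a synthetic 220 Hz tone stands in for a monophonic voiced region.
sr = 16000
t = np.arange(int(0.5 * sr)) / sr
contour = f0_contour(np.sin(2 * np.pi * 220.0 * t), sr)
```

On a clean monophonic tone the contour stays close to the true pitch. On polyphonic mixtures a simple peak-picking estimator like this one is expected to be less reliable, which is the limitation the abstract points to.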


Western Classical Pitch Harmonics Energy Raaga 



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Shivam Sharma (1)
  • Seema Ghisingh (1)
  • Vinay Kumar Mittal (1)
  1. Indian Institute of Information Technology Chittoor, Sri City, India
