Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets

  • Paweł Żwan
  • Piotr Szczuko
  • Bożena Kostek
  • Andrzej Czyżewski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5390)


The aim of the research study presented in this paper is the automatic recognition of a singing voice. For this purpose, a database containing sample recordings of trained and untrained singers was constructed, and voice parameters were extracted from these recordings. Two recognition categories were defined: one reflecting the skills of a singer (quality), and the other reflecting the type of the singing voice (type). The paper also presents parameters designed especially for the analysis of a singing voice and gives their physical interpretation. Decision systems based on artificial neural networks and rough sets are used for automatic voice quality/type classification. The results obtained from both decision systems are then compared and conclusions are drawn.


Singing Voice; Feature extraction; Automatic Classification; Artificial Neural Networks; Rough Sets; Music Information Retrieval





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Paweł Żwan (1)
  • Piotr Szczuko (1)
  • Bożena Kostek (1)
  • Andrzej Czyżewski (1)
  1. Multimedia Systems Department, Gdańsk University of Technology, Gdańsk, Poland
