Combining Cepstral and Prosodic Features for Classification of Disfluencies in Stuttered Speech

  • P. Mahesha
  • D. S. Vinod
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 308)

Abstract

Recognition and classification of dysfluencies is significant in the objective assessment of stuttered speech. The main focus of this study is to combine prosodic and cepstral features in order to improve the performance of dysfluency recognition. The term prosody covers several characteristics of human speech, such as speaking rate, loudness, duration, and pitch. In this study, pitch, energy, and duration are used as prosodic features, and Mel Frequency Cepstral Coefficients (MFCC), delta MFCC (DMFCC), and delta–delta MFCC (DDMFCC) form the cepstral feature set. The efficacy of these features is evaluated using a support vector machine (SVM) classifier. Experimental results demonstrate a considerable improvement in overall performance over the conventional methods reported in the literature on stuttering dysfluency recognition.
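The feature combination described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two feature sets are fused, so plain concatenation of segment-level averages is assumed here. The function names (`delta`, `short_time_energy`, `autocorr_pitch`, `combine_features`), the regression window width, the autocorrelation pitch estimator, and the non-overlapping framing are all illustrative assumptions, not the authors' method. MFCC extraction and the SVM classifier are left out; the cepstral frames are taken as given.

```python
import math

def delta(frames, width=2):
    """Regression-based delta over a list of per-frame coefficient vectors.

    Assumed formula: sum_k k*(c[t+k] - c[t-k]) / (2*sum_k k^2), with frame
    indices clamped at the sequence edges.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n - 1)][d]
                prv = frames[max(t - k, 0)][d]
                num += k * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out

def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)

def autocorr_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Crude pitch estimate in Hz: lag of the autocorrelation peak
    searched between the lags that correspond to fmax and fmin."""
    lo = int(sample_rate / fmax)
    hi = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(lo, hi + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0

def mean_vector(frames):
    """Average each coefficient over all frames of a segment."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]

def combine_features(cepstra, signal_frames, sample_rate):
    """Concatenate averaged MFCC, DMFCC, DDMFCC with pitch, energy, duration.

    cepstra: per-frame cepstral vectors (assumed precomputed MFCCs).
    signal_frames: equal-length, non-overlapping frames of raw samples
    (an assumption that makes the duration computation trivial).
    """
    d1 = delta(cepstra)      # DMFCC
    d2 = delta(d1)           # DDMFCC
    cepstral_part = mean_vector(cepstra) + mean_vector(d1) + mean_vector(d2)

    energies = [short_time_energy(f) for f in signal_frames]
    pitches = [autocorr_pitch(f, sample_rate) for f in signal_frames]
    voiced = [p for p in pitches if p > 0]
    prosodic_part = [
        sum(voiced) / len(voiced) if voiced else 0.0,               # mean pitch (Hz)
        sum(energies) / len(energies),                              # mean energy
        len(signal_frames) * len(signal_frames[0]) / sample_rate,   # duration (s)
    ]
    return cepstral_part + prosodic_part
```

The returned list is one fixed-length vector per speech segment, which is the form an SVM classifier expects as input. In practice the cepstral and prosodic parts sit on very different numeric scales, so some normalization before classification would likely be needed; the abstract does not specify one.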

Keywords

Dysfluency, MFCC, Prosodic features, Stuttering, SVM

Copyright information

© Springer India 2015

Authors and Affiliations

  1. Department of Computer Science and Engineering, S.J. College of Engineering, Mysore, India
