Combining Cepstral and Prosodic Features for Classification of Disfluencies in Stuttered Speech

  • Conference paper
Intelligent Computing, Communication and Devices

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 308)

Abstract

Recognizing and classifying dysfluencies is important for the objective assessment of stuttered speech. This study combines prosodic and cepstral features to improve dysfluency recognition. Prosody covers several characteristics of human speech, such as speaking rate, loudness, duration, and pitch. Here, pitch, energy, and duration serve as the prosodic features, and Mel-frequency cepstral coefficients (MFCC), delta MFCC (DMFCC), and delta–delta MFCC (DDMFCC) form the cepstral feature set. The features are evaluated with a support vector machine (SVM) classifier. Experimental results show a considerable improvement in overall performance over the conventional methods reported in the stuttering dysfluency recognition literature.
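
The feature combination described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two feature sets are merged, so everything below is an assumption made for illustration: the function names (delta, short_time_energy, fuse), the regression window of two frames, the mean pooling over frames, and the simple concatenation of pooled cepstral features with a prosodic vector are hypothetical choices, not the authors' method. MFCC extraction and pitch estimation are assumed to be done elsewhere.

```python
from statistics import fmean


def delta(frames, n=2):
    """Regression-based delta of per-frame coefficient vectors.

    Computes d_t = sum_k k * (c_{t+k} - c_{t-k}) / (2 * sum_k k^2) for
    k = 1..n, clamping frame indices at the edges (an assumed convention).
    """
    denom = 2 * sum(k * k for k in range(1, n + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[0])):
            acc = 0.0
            for k in range(1, n + 1):
                ahead = frames[min(t + k, last)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def short_time_energy(signal, frame_len, hop):
    """Sum of squared samples in each full analysis frame."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, hop)]


def fuse(mfcc, prosodic):
    """Build one utterance-level vector from cepstral and prosodic features.

    Stacks MFCC, delta, and delta-delta per frame, averages over frames
    (assumed pooling), then appends the prosodic values unchanged.
    """
    d1 = delta(mfcc)
    d2 = delta(d1)
    stacked = [m + a + b for m, a, b in zip(mfcc, d1, d2)]
    pooled = [fmean(col) for col in zip(*stacked)]
    return pooled + list(prosodic)


# Toy input: 5 frames of 2 "MFCC" values, plus a made-up prosodic vector
# standing in for (mean pitch in Hz, mean energy, duration in seconds).
toy_mfcc = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
toy_prosodic = [120.0, 0.5, 0.3]
feature_vector = fuse(toy_mfcc, toy_prosodic)
```

A vector built this way, one per speech segment, could then be passed to any SVM implementation, for example the SVC class in scikit-learn. The kernel and parameters are not stated in the abstract and would have to be chosen separately.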

Author information

Corresponding author

Correspondence to P. Mahesha.

Copyright information

© 2015 Springer India

About this paper

Cite this paper

Mahesha, P., Vinod, D.S. (2015). Combining Cepstral and Prosodic Features for Classification of Disfluencies in Stuttered Speech. In: Jain, L., Patnaik, S., Ichalkaranje, N. (eds) Intelligent Computing, Communication and Devices. Advances in Intelligent Systems and Computing, vol 308. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2012-1_67

  • DOI: https://doi.org/10.1007/978-81-322-2012-1_67

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2011-4

  • Online ISBN: 978-81-322-2012-1

  • eBook Packages: Engineering, Engineering (R0)
