Combining Cepstral and Prosodic Features for Classification of Disfluencies in Stuttered Speech

Mahesha, P.; Vinod, D. S.

doi:10.1007/978-81-322-2012-1_67


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 308)


Abstract

Recognition and classification of dysfluencies is significant in the objective assessment of stuttered speech. The main focus of this study is to combine prosodic and cepstral features in order to improve the performance of dysfluency recognition. The term prosody covers several characteristics of human speech, such as speaking rate, loudness, duration, and pitch. In this study, pitch, energy, and duration are used as the prosodic features, and Mel Frequency Cepstral Coefficients (MFCC), delta MFCC (DMFCC), and delta–delta MFCC (DDMFCC) are used as the cepstral feature set. The efficacy of these features is evaluated using a support vector machine (SVM) classifier. Experimental results demonstrate a considerable improvement in overall performance over the conventional methods reported in the literature on stuttering dysfluency recognition.
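The feature combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `delta` and `combine`, the regression window width of 2, the per-segment mean pooling, and the example prosodic values are all assumptions made for illustration. The MFCC matrix is assumed to be precomputed with shape (frames, coefficients).

```python
import numpy as np

def delta(features, width=2):
    """Regression-based delta over time for a (frames, coeffs) matrix.

    Assumption: the standard delta regression formula with edge padding;
    the chapter preview does not state which variant the authors used.
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    # Repeat the first and last frames so every frame has a full window.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom

def combine(mfcc, prosodic):
    """Build one segment-level vector from cepstral and prosodic features.

    Assumption: frame-level MFCC, DMFCC and DDMFCC are mean-pooled over
    the segment before the prosodic values are appended.
    """
    dmfcc = delta(mfcc)        # first-order dynamics (DMFCC)
    ddmfcc = delta(dmfcc)      # second-order dynamics (DDMFCC)
    cepstral = np.hstack([mfcc, dmfcc, ddmfcc])
    return np.concatenate([cepstral.mean(axis=0), np.asarray(prosodic, dtype=float)])

# Hypothetical input: 10 frames of 13 coefficients rising linearly in time,
# plus made-up pitch (Hz), energy, and duration (s) values for the segment.
mfcc = np.arange(10, dtype=float)[:, None] * np.ones((1, 13))
vector = combine(mfcc, [120.0, 0.5, 0.3])
```

Each segment yields one fixed-length vector (here 3 × 13 cepstral values plus 3 prosodic values), which could then be passed to an SVM classifier such as scikit-learn's `SVC`. Mean pooling is only one way to obtain a fixed-length input and discards the temporal order of frames.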



Author information

Authors and Affiliations

Department of Computer Science and Engineering, S.J. College of Engineering, Mysore, Karnataka, India
P. Mahesha & D. S. Vinod


Corresponding author

Correspondence to P. Mahesha.

Editor information

Editors and Affiliations

University of Canberra, Faculty of Education, Science, Technology and Mathematics, Canberra, Australia, and University of South Australia, Adelaide, South Australia, Australia
Lakhmi C. Jain
Department of Computer Science and Engin, SOA University, Bhubaneswar, Odisha, India
Srikanta Patnaik
Department of Premier and Cabinet, Office of the Chief Information Officer, Adelaide, South Australia, Australia
Nikhil Ichalkaranje


About this paper

Cite this paper

Mahesha, P., Vinod, D.S. (2015). Combining Cepstral and Prosodic Features for Classification of Disfluencies in Stuttered Speech. In: Jain, L., Patnaik, S., Ichalkaranje, N. (eds) Intelligent Computing, Communication and Devices. Advances in Intelligent Systems and Computing, vol 308. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2012-1_67


DOI: https://doi.org/10.1007/978-81-322-2012-1_67
Published: 26 August 2014
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2011-4
Online ISBN: 978-81-322-2012-1
eBook Packages: Engineering, Engineering (R0)
