Segment-Removal Based Stuttered Speech Remediation

  • Pierre Arbajian
  • Ayman Hajja
  • Zbigniew W. Raś
  • Alicja A. Wieczorkowska
Conference paper, part of the Lecture Notes in Computer Science book series (LNCS, volume 10785)


Abstract

Speech remediation can be performed by correctly identifying those portions of speech that can be deleted without diminishing speech quality, and whose removal in fact improves it. Such remediation is especially important when speech is disfluent, as in the case of stuttered speech. In this paper, we describe a stuttered speech remediation approach based on identifying the segments of speech whose removal would enhance understandability in terms of both speech content and speech flow. The approach we adopted consists of first identifying and extracting speech segments of weak significance, characterized by their low relative intensity, and then classifying which of those segments should be removed. We trained several classifiers on a large set of inherent and derived features extracted from the audio segments; this second filtering stage discerns the audio segments that should be eliminated from those that should not. The resulting speech is then compared to the manually-labeled "gold standard" optimal speech. The quality comparisons between the automatically enhanced speeches and their manually-labeled counterparts were favorable, and the corresponding tabulated results are presented below. To further enhance the quality of the classifiers, we adopted a voting technique that encompassed an extended set of models built from 14 algorithms, and we report the classifier performance measures at different voting-threshold values. This voting approach improved the specificity of the classification by reducing false positive classifications at the expense of additional false negatives, thereby improving the practical effectiveness of the system.
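The two-stage pipeline sketched in the abstract can be illustrated with a minimal example. The code below is not the authors' implementation; the intensity threshold, frame duration, and function names are hypothetical. Stage one marks runs of audio frames whose intensity falls well below the utterance peak as removal candidates; stage two keeps a candidate for deletion only when at least a minimum number of the ensemble's classifiers vote for removal, so raising the threshold trades false positives for false negatives, as described above.

```python
# Illustrative sketch of the two-stage segment-removal filter; all
# parameter values and names here are assumptions, not the paper's.

def low_intensity_candidates(frame_db, frame_dur=0.01, drop_db=20.0, min_frames=5):
    """Stage 1: return (start_s, end_s) spans of consecutive frames whose
    intensity sits more than drop_db below the utterance peak."""
    peak = max(frame_db)
    quiet = [db < peak - drop_db for db in frame_db]
    segments, start = [], None
    for i, is_quiet in enumerate(quiet):
        if is_quiet and start is None:
            start = i                      # open a quiet run
        elif not is_quiet and start is not None:
            if i - start >= min_frames:    # keep only sufficiently long runs
                segments.append((start * frame_dur, i * frame_dur))
            start = None
    if start is not None and len(quiet) - start >= min_frames:
        segments.append((start * frame_dur, len(quiet) * frame_dur))
    return segments

def vote_to_remove(model_votes, min_votes):
    """Stage 2: model_votes[m][s] is 1 if model m flags segment s for
    removal. A segment is removed only if >= min_votes models agree;
    a higher threshold yields fewer false positives (higher specificity)
    at the cost of more false negatives."""
    return [sum(col) >= min_votes for col in zip(*model_votes)]

# Example: 10 loud frames, 8 quiet frames, 10 loud frames.
candidates = low_intensity_candidates([60.0] * 10 + [30.0] * 8 + [60.0] * 10)
removals = vote_to_remove([[1, 0, 1], [1, 1, 0], [1, 0, 0]], min_votes=2)
```

In this sketch, only the first of the three candidate segments collects the two votes needed for removal; sweeping `min_votes` over the ensemble size reproduces the threshold trade-off the paper tabulates.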


Keywords: Stuttering detection · Speech analysis · Speech remediation · Classification



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Pierre Arbajian (1)
  • Ayman Hajja (2)
  • Zbigniew W. Raś (1, 3, 4)
  • Alicja A. Wieczorkowska (3)

  1. Department of Computer Science, University of North Carolina, Charlotte, USA
  2. Department of Computer Science, College of Charleston, Charleston, USA
  3. Multimedia Department, Polish-Japanese Academy of Information Technology, Warsaw, Poland
  4. Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
