Abstract
Speech remediation, i.e., identifying and deleting the segments that detract from the substance of an utterance without diminishing its quality, is especially important when the speech is disfluent, as in the case of stuttered speech. In this paper, we describe a stuttered-speech remediation approach based on identifying those segments whose removal enhances understandability in terms of both speech content and speech flow. Our approach first identifies and extracts candidate segments of weak significance on the basis of their low relative intensity, and then classifies which of those segments should be removed. We trained several classifiers on a large set of inherent and derived features extracted from the audio segments, providing a second-layer filtering stage that discerns the segments to be eliminated from those to be retained. The resulting speech is then compared to manually labeled "gold standard" optimal speech; the quality of the enhanced speeches compared favorably with their manually labeled counterparts, and the corresponding results are tabulated below. To further improve classifier quality, we adopted a voting technique over an extended set of models drawn from 14 algorithms and report classifier performance measures at different voting threshold values. This voting approach improved the specificity of the classification by reducing false positive classifications at the expense of additional false negatives, thereby improving the practical effectiveness of the system.
Keywords
- Stuttering detection
- Speech analysis
- Speech remediation
- Classification
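The thresholded ensemble vote described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `vote_remove` helper and the 14-classifier vote matrix are hypothetical, assuming each classifier emits a binary remove/keep decision per candidate segment.

```python
import numpy as np

def vote_remove(predictions, threshold):
    # predictions: (n_classifiers, n_segments) 0/1 matrix, where a 1 means
    # that classifier votes to delete that candidate speech segment.
    votes = np.asarray(predictions).sum(axis=0)
    # A segment is removed only when at least `threshold` classifiers agree;
    # raising the threshold trades false positives for false negatives.
    return votes >= threshold

# Hypothetical votes from 14 classifiers over 4 candidate segments,
# constructed so the per-segment vote counts are 13, 8, 3, and 11.
preds = (np.arange(14)[:, None] < np.array([13, 8, 3, 11])).astype(int)

for t in (7, 10, 13):
    print(t, vote_remove(preds, t).tolist())
# 7  -> [True, True, False, True]
# 10 -> [True, False, False, True]
# 13 -> [True, False, False, False]
```

As the printout shows, a stricter threshold removes fewer segments, which is the specificity-for-sensitivity trade-off the paper reports.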
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Arbajian, P., Hajja, A., Raś, Z.W., Wieczorkowska, A.A. (2018). Segment-Removal Based Stuttered Speech Remediation. In: Appice, A., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds.) New Frontiers in Mining Complex Patterns. NFMCP 2017. Lecture Notes in Computer Science, vol. 10785. Springer, Cham. https://doi.org/10.1007/978-3-319-78680-3_2
Print ISBN: 978-3-319-78679-7
Online ISBN: 978-3-319-78680-3