
Segment-Removal Based Stuttered Speech Remediation

  • Conference paper

New Frontiers in Mining Complex Patterns (NFMCP 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10785)

Abstract

Speech remediation by removing segments that detract from the substance of the speech can be performed by correctly identifying the portions of speech whose deletion does not diminish speech quality but rather improves it. Such remediation is especially important when the speech is disfluent, as in the case of stuttered speech. In this paper, we describe a stuttered speech remediation approach based on identifying the segments of speech whose removal would enhance understandability in terms of both speech content and speech flow. The approach we adopted first identifies and extracts speech segments of weak significance, as indicated by their low relative intensity, and then classifies which of those segments should be removed. We trained several classifiers on a large set of inherent and derived features extracted from the audio segments, providing a second-layer filtering stage that discerns the audio segments that need to be eliminated from those that do not. The resulting speech is then compared to the manually-labeled “gold standard” optimal speech. The quality comparisons between the enhanced speeches and their manually-labeled counterparts were favorable, and the corresponding tabulated results are presented below. To further enhance classifier quality, we adopted a voting technique that encompassed an extended set of models from 14 algorithms, and we present classifier performance measures for different voting threshold values. This voting approach improved the specificity of the classification by reducing false positive classifications at the expense of additional false negatives, thus improving the practical effectiveness of the system.
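The first stage described in the abstract, extracting candidate segments of weak significance based on their low relative intensity, can be sketched as follows. This is a minimal illustration and not the authors' implementation; the frame RMS values, the relative threshold, and the minimum segment length are assumed parameters for the sketch.

```python
def low_intensity_segments(frame_rms, rel_threshold=0.3, min_frames=3):
    """Return (start, end) frame-index spans whose RMS intensity stays below
    rel_threshold * max(frame_rms) for at least min_frames consecutive frames.
    These spans are candidate segments for removal, to be passed to a
    second-layer classifier."""
    if not frame_rms:
        return []
    cutoff = rel_threshold * max(frame_rms)
    segments, start = [], None
    for i, rms in enumerate(frame_rms):
        if rms < cutoff:
            if start is None:
                start = i          # open a low-intensity span
        elif start is not None:
            if i - start >= min_frames:
                segments.append((start, i))
            start = None           # span ended (too short spans are dropped)
    if start is not None and len(frame_rms) - start >= min_frames:
        segments.append((start, len(frame_rms)))
    return segments
```

In practice the frame RMS series would come from short-time analysis of the audio; here it is supplied directly to keep the sketch self-contained.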
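The second-layer voting filter can likewise be sketched. Assuming each of the 14 trained models emits a binary remove/keep vote per candidate segment, raising the vote threshold reduces false positive removals at the cost of additional false negatives, which is the specificity trade-off the abstract describes. The function name and data shapes here are hypothetical.

```python
def filter_segments(candidates, votes_per_segment, threshold):
    """Second-layer filter: keep a candidate segment for removal only when
    the number of classifiers voting 'remove' meets the threshold.
    candidates: list of (start, end) spans; votes_per_segment: one list of
    0/1 votes per candidate; threshold: minimum 'remove' votes required."""
    return [seg for seg, votes in zip(candidates, votes_per_segment)
            if sum(votes) >= threshold]
```

For example, with 14 models a threshold of 8 removes a segment only on a clear majority vote, while a threshold of 14 removes it only on unanimity.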



Author information

Correspondence to Pierre Arbajian.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Arbajian, P., Hajja, A., Raś, Z.W., Wieczorkowska, A.A. (2018). Segment-Removal Based Stuttered Speech Remediation. In: Appice, A., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2017. Lecture Notes in Computer Science (LNAI), vol 10785. Springer, Cham. https://doi.org/10.1007/978-3-319-78680-3_2


  • DOI: https://doi.org/10.1007/978-3-319-78680-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78679-7

  • Online ISBN: 978-3-319-78680-3

  • eBook Packages: Computer Science; Computer Science (R0)
