Preprocessing of Dysarthric Speech in Noise Based on CV–Dependent Wiener Filtering

Park, Ji Hun; Seong, Woo Kyeong; Kim, Hong Kook

doi:10.1007/978-1-4614-1335-6_6

Abstract

In this paper, we propose a consonant-vowel (CV)-dependent Wiener filter for dysarthric automatic speech recognition (ASR) in noisy environments. When a conventional Wiener filter is applied to dysarthric speech in noise, it distorts the initial consonants. This is because, compared with normal speech, the spectrum of dysarthric speech at a consonant-vowel onset is much more similar to that of noise, so speech at the onset is easily removed by Wiener filtering. To mitigate this problem, the transfer function of the Wiener filter is constructed differently depending on the result of CV classification, which is performed by combining voice activity detection (VAD) and vowel onset estimation. In this work, VAD is performed by a statistical model-based approach, and the vowel onset is estimated from the variation of linear prediction residual signals. To demonstrate the effectiveness of the proposed CV-dependent Wiener filter for dysarthric ASR, we compare the performance of an ASR system employing the proposed method with that of a system using a conventional Wiener filter, for groups with different degrees of disability and under different signal-to-noise ratio conditions. The ASR experiments show that, compared with the conventional Wiener filter, the proposed Wiener filter achieves relative average word error rate reductions of 10.41%, 6.03%, and 0.94% for the mild, moderate, and severe disability groups, respectively.
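The idea of building the filter differently per frame class can be illustrated with a minimal Python sketch. The abstract does not give the actual transfer function, so everything specific here is an assumption made for illustration: the per-class gain floors in `GAIN_FLOOR`, the power-subtraction estimate of the a priori SNR, the single vowel onset index per utterance, and all function names. It is not the authors' implementation; it only shows how VAD decisions and a vowel onset estimate could select a gentler filter for consonant frames.

```python
from enum import Enum
from typing import List, Sequence


class FrameClass(Enum):
    NOISE = 0      # VAD reports no speech
    CONSONANT = 1  # speech frame before the estimated vowel onset
    VOWEL = 2      # speech frame at or after the estimated vowel onset


# Hypothetical gain floors (not from the paper): a higher floor attenuates less,
# which protects weak, noise-like consonant onsets from being removed.
GAIN_FLOOR = {
    FrameClass.NOISE: 0.05,
    FrameClass.CONSONANT: 0.5,
    FrameClass.VOWEL: 0.1,
}


def classify_frames(vad: Sequence[bool], vowel_onset: int) -> List[FrameClass]:
    """Combine per-frame VAD decisions with one vowel onset frame index."""
    classes = []
    for i, is_speech in enumerate(vad):
        if not is_speech:
            classes.append(FrameClass.NOISE)
        elif i < vowel_onset:
            classes.append(FrameClass.CONSONANT)
        else:
            classes.append(FrameClass.VOWEL)
    return classes


def wiener_gain(noisy_psd: Sequence[float], noise_psd: Sequence[float],
                floor: float) -> List[float]:
    """Per-bin gain max(floor, xi / (1 + xi)), xi estimated by power subtraction."""
    gains = []
    for p_y, p_n in zip(noisy_psd, noise_psd):
        xi = max(p_y - p_n, 0.0) / max(p_n, 1e-12)  # rough a priori SNR estimate
        gains.append(max(floor, xi / (1.0 + xi)))
    return gains


def cv_dependent_gains(noisy_psd_frames: Sequence[Sequence[float]],
                       noise_psd: Sequence[float],
                       vad: Sequence[bool],
                       vowel_onset: int) -> List[List[float]]:
    """Apply a class-dependent gain floor to every frame of one utterance."""
    classes = classify_frames(vad, vowel_onset)
    return [wiener_gain(frame, noise_psd, GAIN_FLOOR[c])
            for frame, c in zip(noisy_psd_frames, classes)]
```

In this sketch a consonant frame whose spectrum is close to the noise spectrum keeps at least half of its amplitude in every bin, while a noise-only frame with the same spectrum is suppressed almost entirely. A real system would obtain `vad` and `vowel_onset` from the statistical model-based VAD and the linear prediction residual analysis mentioned in the abstract.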

Acknowledgements

This work was supported in part by the R&D Program of MKE/KEIT (10036461, Development of an embedded keyword spotting speech recognition system individually customized for disabled persons with dysarthria) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (No. 2010-0023888).

Author information

Authors and Affiliations

School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, 500-712, Korea
Ji Hun Park, Woo Kyeong Seong & Hong Kook Kim

Corresponding author

Correspondence to Ji Hun Park.

Editor information

Editors and Affiliations

Dept. of Languages and Computer Systems, University of Granada, Granada, 18071, Spain
Ramón López-Cózar Delgado
Dept. of Computer Science & Engineering, Waseda University, Okubo 3-4-1, Tokyo, 169-8555, Japan
Tetsunori Kobayashi

About this paper

Cite this paper

Park, J.H., Seong, W.K., Kim, H.K. (2011). Preprocessing of Dysarthric Speech in Noise Based on CV–Dependent Wiener Filtering. In: Delgado, RC., Kobayashi, T. (eds) Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems Workshop. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1335-6_6

DOI: https://doi.org/10.1007/978-1-4614-1335-6_6
Published: 12 August 2011
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-1334-9
Online ISBN: 978-1-4614-1335-6
eBook Packages: Engineering, Engineering (R0)
