Study of articulators’ contribution and compensation during speech by articulatory speech recognition
In this paper, we evaluate the contribution of dynamic articulatory information using an articulatory speech recognition system. Electromagnetic articulography (EMA) datasets are relatively small and difficult to record compared with the popular speech corpora used in modern speech research. We used articulatory data to study the contribution of each observation channel of the vocal tract to speech recognition within a DNN framework. We also analyzed the recognition results for each phoneme according to speech production rules. The contribution rate of each articulator can be regarded as its crucial level for each phoneme in speech production. Furthermore, the results indicate that the contribution of each observation point does not depend on a specific recognition method. The tendency of each sensor’s contribution is consistent with the rules of Japanese phonology. We also evaluated the compensation effect between different channels and found that crucial points are harder to compensate for than non-crucial points. The proposed method can help identify the crucial points of each phoneme during speech. The results can contribute to the study of speech production and articulatory-based speech recognition.
Keywords: DNN, Articulatory recognition, Articulators’ contribution, Crucial level, Compensation
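The abstract does not state how the contribution rate is computed. As a rough illustration only, one common way to quantify the contribution of an observation channel is a leave-one-out comparison: recognize a phoneme with all channels, then again with one channel withheld, and take the relative drop in accuracy. The sketch below assumes that definition; the function names, the sensor names, and all accuracy values are hypothetical and are not taken from the paper.

```python
def contribution_rates(baseline_acc, ablated_acc):
    """Relative accuracy drop per channel when that channel is withheld.

    baseline_acc: accuracy with all channels (assumed definition, not the paper's).
    ablated_acc: mapping from channel name to accuracy without that channel.
    """
    if baseline_acc <= 0:
        raise ValueError("baseline accuracy must be positive")
    return {
        channel: (baseline_acc - acc) / baseline_acc
        for channel, acc in ablated_acc.items()
    }


def rank_channels(rates):
    """Order channels from largest to smallest contribution."""
    return sorted(rates, key=rates.get, reverse=True)


# Made-up accuracies for a single phoneme, for illustration only.
baseline = 0.80
ablated = {
    "tongue_tip": 0.60,     # hypothetical sensor names
    "tongue_dorsum": 0.70,
    "lower_lip": 0.76,
    "jaw": 0.78,
}

rates = contribution_rates(baseline, ablated)
ranking = rank_channels(rates)
print(ranking)
```

Under this assumed definition, the channel whose removal costs the most accuracy would be treated as the most crucial point for that phoneme. The same comparison could in principle be repeated per phoneme to build a table of crucial levels.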
This work was supported in part by grants from the National Natural Science Foundation of China (General Program No. 61471259, and Key Program No. 61233009) and in part by NSFC of Tianjin (No. 16JCZDJC35400).