Abstract
A vocal imitation system was developed using a computational model that supports the motor theory of speech perception. A critical problem in vocal imitation is how to generate speech sounds produced by adults, whose vocal tracts have physical properties (i.e., articulatory motions) that differ from those of infants' vocal tracts. To solve this problem, a model based on the motor theory of speech perception was constructed. Applying this model enables the vocal imitation system to estimate articulatory motions for unexperienced speech sounds, that is, sounds that the system itself has never generated. The system was implemented using a Recurrent Neural Network with Parametric Bias (RNNPB) and a physical vocal tract model called the Maeda model. Experimental results demonstrated that the system was sufficiently robust with respect to individual differences in speech sounds and could imitate unexperienced vowel sounds.
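As a rough illustration of the idea in the abstract, the following minimal sketch shows how an RNNPB-style network could estimate articulatory motions for a heard sound: the weights stay fixed, and only the parametric bias (PB) vector is adjusted until the generated sound features match the target. Everything here is an assumption made for illustration and is not taken from the paper: the layer sizes, the random untrained weights, the split of the output into two "sound" and two "articulatory" dimensions, the finite-difference gradient, and all function names (`init_params`, `generate`, `sound_loss`, `recognize_pb`). A real system would train the weights with backpropagation through time and drive a vocal tract model such as the Maeda model with the articulatory outputs.

```python
import numpy as np

SOUND = slice(0, 2)   # assumed: first two outputs are sound features
ARTIC = slice(2, 4)   # assumed: last two outputs are articulatory parameters

def init_params(rng, io_dim, pb_dim, hid_dim, ctx_dim, scale=0.5):
    """Random, untrained weights (illustration only; normally learned by BPTT)."""
    return {
        "W_x": rng.normal(0, scale, (hid_dim, io_dim)),   # input -> hidden
        "W_p": rng.normal(0, scale, (hid_dim, pb_dim)),   # PB -> hidden
        "W_c": rng.normal(0, scale, (hid_dim, ctx_dim)),  # context -> hidden
        "W_o": rng.normal(0, scale, (io_dim, hid_dim)),   # hidden -> output
        "W_k": rng.normal(0, scale, (ctx_dim, hid_dim)),  # hidden -> next context
    }

def generate(params, pb, x0, steps):
    """Closed-loop rollout: each prediction is fed back as the next input."""
    x = x0
    c = np.zeros(params["W_k"].shape[0])
    outputs = []
    for _ in range(steps):
        h = np.tanh(params["W_x"] @ x + params["W_p"] @ pb + params["W_c"] @ c)
        x = np.tanh(params["W_o"] @ h)
        c = np.tanh(params["W_k"] @ h)
        outputs.append(x)
    return np.array(outputs)

def sound_loss(params, pb, x0, target_sound):
    """Mean squared error on the sound dimensions only."""
    pred = generate(params, pb, x0, len(target_sound))
    return float(np.mean((pred[:, SOUND] - target_sound) ** 2))

def recognize_pb(params, x0, target_sound, pb_dim, iters=200, lr=0.5, eps=1e-5):
    """Fit the PB vector to a heard sound while keeping all weights fixed.

    Uses a central finite-difference gradient and accepts a step only if it
    lowers the loss (otherwise the step size is halved).
    """
    pb = np.zeros(pb_dim)
    current = sound_loss(params, pb, x0, target_sound)
    for _ in range(iters):
        grad = np.zeros(pb_dim)
        for i in range(pb_dim):
            d = np.zeros(pb_dim)
            d[i] = eps
            grad[i] = (sound_loss(params, pb + d, x0, target_sound)
                       - sound_loss(params, pb - d, x0, target_sound)) / (2 * eps)
        candidate = pb - lr * grad
        candidate_loss = sound_loss(params, candidate, x0, target_sound)
        if candidate_loss < current:
            pb, current = candidate, candidate_loss
        else:
            lr *= 0.5
    return pb, current

# Usage: synthesize a "heard" sound from a hidden PB, then recover a PB for it.
rng = np.random.default_rng(0)
params = init_params(rng, io_dim=4, pb_dim=2, hid_dim=8, ctx_dim=3)
x0 = np.zeros(4)
pb_true = np.array([0.6, -0.4])                      # hypothetical hidden PB
heard = generate(params, pb_true, x0, steps=20)[:, SOUND]

start_loss = sound_loss(params, np.zeros(2), x0, heard)
pb_est, end_loss = recognize_pb(params, x0, heard, pb_dim=2)

# Articulatory trajectory the network associates with the heard sound.
artic_est = generate(params, pb_est, x0, steps=20)[:, ARTIC]
```

In this sketch the heard sound is never paired with articulatory data during recognition; the articulatory trajectory is read out from the same network once a PB vector has been found that explains the sound, which mirrors the motor-theory framing of perceiving speech through one's own production model.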
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kanda, H., Ogata, T., Komatani, K., Okuno, H.G. (2008). Vowel Imitation Using Vocal Tract Model and Recurrent Neural Network. In: Ishikawa, M., Doya, K., Miyamoto, H., Yamakawa, T. (eds) Neural Information Processing. ICONIP 2007. Lecture Notes in Computer Science, vol 4985. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69162-4_24
DOI: https://doi.org/10.1007/978-3-540-69162-4_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69159-4
Online ISBN: 978-3-540-69162-4
eBook Packages: Computer Science (R0)