A Talking Robot and Its Singing Performance by the Mimicry of Human Vocalization

  • M. Kitani
  • T. Hara
  • H. Hanada
  • H. Sawada
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 99)


A talking and singing robot that adaptively learns vocalization skills through auditory feedback learning is being developed. The fundamental frequency and the spectrum envelope determine the principal characteristics of a sound. The former is a characteristic of the source sound generated by a vibrating object, and the latter is shaped by resonance effects. In vocalization, the vibration of the vocal cords generates a source sound, and the sound wave is then led to the vocal tract, which works as a resonance filter to determine the spectrum envelope. The paper describes the construction of vocal cords and a vocal tract for the realization of a talking and singing robot, together with the control algorithm for acquiring singing performance by mimicking human vocalization and singing voices. The generated voices were evaluated in listening experiments.
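
The source and filter roles described in the abstract can be illustrated with a small digital sketch. This is not the authors' mechanical robot or their control algorithm; it is a minimal software analogy, assuming an impulse train as the source sound and a cascade of two-pole resonators as the vocal tract. The function names and the formant values (700 Hz and 1100 Hz) are hypothetical choices made only for illustration.

```python
import math


def impulse_train(f0, sample_rate, n_samples):
    """Source sound: one unit pulse per pitch period (a crude vocal-cord model)."""
    period = round(sample_rate / f0)  # pitch period in samples
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def resonator(signal, center_hz, bandwidth_hz, sample_rate):
    """Two-pole resonance filter: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / sample_rate)
    a2 = -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def magnitude_at(signal, freq_hz, sample_rate):
    """Magnitude of the signal's Fourier transform at a single frequency."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(s * math.cos(w * n) for n, s in enumerate(signal))
    im = sum(s * math.sin(w * n) for n, s in enumerate(signal))
    return math.hypot(re, im)


def synthesize(f0, formants, sample_rate=8000, n_samples=800):
    """Pass the source through one resonator per (center_hz, bandwidth_hz) pair."""
    voice = impulse_train(f0, sample_rate, n_samples)
    for center_hz, bandwidth_hz in formants:
        voice = resonator(voice, center_hz, bandwidth_hz, sample_rate)
    return voice


if __name__ == "__main__":
    # Hypothetical formant settings, chosen only for illustration.
    formants = [(700.0, 80.0), (1100.0, 90.0)]
    source = impulse_train(100.0, 8000, 800)
    voice = synthesize(100.0, formants)
    for freq in (700.0, 2000.0):
        print(freq, magnitude_at(source, freq, 8000), magnitude_at(voice, freq, 8000))
```

In this sketch, changing `f0` moves the pitch while leaving the resonances in place, and changing `formants` reshapes the spectrum envelope without touching the pitch. That separation is the one the abstract attributes to the vocal cords and the vocal tract.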


Keywords (machine-generated, not supplied by the authors): Nasal Cavity, Vocal Cord, Source Sound, Vocal Tract, Auditory Feedback





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • M. Kitani (1)
  • T. Hara (1)
  • H. Hanada (1)
  • H. Sawada (1)
  1. Department of Intelligent Mechanical Systems Engineering, Faculty of Engineering, Kagawa University, Japan
