Mandarin Voice Conversion Using Tone Codebook Mapping

  • Guoyu Zuo
  • Yao Chen
  • Xiaogang Ruan
  • Wenju Liu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3930)


A tone codebook mapping method is proposed to improve voice conversion of Mandarin speech over the conventional conversion method, which deals mainly with short-time spectral envelopes. The pitch contour of the whole Mandarin syllable is used as the unit for pitch conversion. Syllable pitch contours are first extracted from the source and target utterances, then time-normalized and smoothed with a moving average filter. The preprocessed pitch contours are classified to generate the source and target tone codebooks, and the two codebooks are associated on the basis of speech alignment to obtain a Mandarin tone mapping codebook. Experimental results show that the proposed method delivers satisfactory voice conversion performance for Mandarin speech.
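
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameters, so the number of resampled points, the filter window, the squared Euclidean distance, and the "most frequent pairing" rule used to associate codewords are all assumptions made for illustration. The function names (time_normalize, moving_average, nearest, build_mapping) are hypothetical, and the codebooks are assumed to be given already, for example as cluster centroids.

```python
def time_normalize(contour, n_points=16):
    """Resample a syllable pitch contour to n_points (>= 2) values
    by linear interpolation. n_points=16 is an assumed default."""
    if len(contour) == 1:
        return [float(contour[0])] * n_points
    last = len(contour) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)   # fractional index into contour
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def moving_average(contour, window=3):
    """Smooth a contour with a centered moving average.
    The window is truncated at the two ends of the contour."""
    half = window // 2
    out = []
    for i in range(len(contour)):
        seg = contour[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def nearest(codebook, contour):
    """Index of the codeword closest to contour (squared Euclidean
    distance, an assumed choice of metric)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: dist(codebook[k], contour))


def build_mapping(src_codebook, tgt_codebook, aligned_pairs):
    """Associate each source codeword with the target codeword it most
    often co-occurs with in aligned (source, target) syllable contours.
    A source codeword that never occurs falls back to target index 0."""
    counts = [[0] * len(tgt_codebook) for _ in src_codebook]
    for src_contour, tgt_contour in aligned_pairs:
        s = nearest(src_codebook, src_contour)
        t = nearest(tgt_codebook, tgt_contour)
        counts[s][t] += 1
    return [row.index(max(row)) for row in counts]
```

At conversion time, under the same assumptions, a preprocessed source contour would be quantized with nearest(), and the mapped target codeword tgt_codebook[mapping[index]] would serve as the converted syllable pitch contour.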


Pitch Contour; Target Tone; Target Speaker; Voice Conversion; Pitch Difference
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Guoyu Zuo (1, 3)
  • Yao Chen (2)
  • Xiaogang Ruan (1)
  • Wenju Liu (3)
  1. Institute of Artificial Intelligence and Robotics, Beijing University of Technology, Beijing, China
  2. School of Computer Sciences, Beijing University of Technology, Beijing, China
  3. Institute of Automation, Chinese Academy of Sciences, Beijing, China
