Transferring Vocal Expression of F0 Contour Using Singing Voice Synthesizer

  • Yukara Ikemiya
  • Katsutoshi Itoyama
  • Hiroshi G. Okuno
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8482)


A system is described that separately transfers vocal expressions from singing voices with accompaniment to singing voice synthesizers. The expressions appear as fluctuations in the fundamental frequency (F0) contour of the singing voice, such as vibrato, glissando, and kobushi. The F0 contour of the singing voice is estimated using subharmonic summation in a limited frequency range and is aligned temporally to a chromatic pitch sequence. Each expression is transcribed and parameterized in accordance with designed rules. Finally, the expressions are transferred to given scores on the singing voice synthesizer. Experiments demonstrated that the proposed system can transfer the vocal expressions while retaining the singer's individuality on two singing voice synthesizers: Vocaloid and CeVIO.
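To make the F0 estimation step more concrete, the following is a minimal sketch of subharmonic summation restricted to a limited frequency range, applied to a single analysis frame. It is not the authors' implementation: the function names (`shs_f0`, `magnitude_at`, `harmonic_tone`, `nearest_midi_note`), the number of harmonics, the decay factor of 0.84, and the 50-cent candidate grid are all assumptions chosen for illustration. Rounding to the nearest semitone at the end is only a simple stand-in for the temporal alignment to a chromatic pitch sequence, which the abstract does not detail.

```python
import cmath
import math


def hann_window(frame):
    """Apply a Hann window to reduce spectral leakage."""
    n = len(frame)
    return [x * (0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)))
            for k, x in enumerate(frame)]


def magnitude_at(windowed, sample_rate, freq_hz):
    """Magnitude of the frame's Fourier transform at a single frequency."""
    w = -2j * math.pi * freq_hz / sample_rate
    return abs(sum(x * cmath.exp(w * k) for k, x in enumerate(windowed)))


def shs_f0(frame, sample_rate, f_min, f_max,
           n_harmonics=5, decay=0.84, step_cents=50):
    """Pick the F0 candidate in [f_min, f_max] with the largest
    decay-weighted sum of spectral magnitudes at its harmonics.
    All default parameter values are illustrative assumptions."""
    windowed = hann_window(frame)
    n_steps = int(1200 * math.log2(f_max / f_min) / step_cents)
    best_f, best_score = None, -1.0
    for i in range(n_steps + 1):
        f = f_min * 2 ** (i * step_cents / 1200)  # log-spaced candidate
        score = 0.0
        for h in range(1, n_harmonics + 1):
            if h * f < sample_rate / 2:  # skip harmonics above Nyquist
                score += decay ** (h - 1) * magnitude_at(windowed, sample_rate, h * f)
        if score > best_score:
            best_f, best_score = f, score
    return best_f


def nearest_midi_note(freq_hz):
    """Round a frequency to the nearest chromatic pitch (MIDI note, A4 = 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def harmonic_tone(f0_hz, sample_rate, n_samples, n_harmonics=5):
    """Synthetic test signal: harmonics of f0_hz with 1/h amplitudes."""
    return [sum(math.sin(2 * math.pi * h * f0_hz * t / sample_rate) / h
                for h in range(1, n_harmonics + 1))
            for t in range(n_samples)]


# Usage: estimate the F0 of a synthetic 220 Hz tone, searching 110-440 Hz only.
frame = harmonic_tone(220.0, sample_rate=8000, n_samples=1024)
estimated_f0 = shs_f0(frame, sample_rate=8000, f_min=110.0, f_max=440.0)
print(estimated_f0, nearest_midi_note(estimated_f0))
```

Restricting the candidates to a plausible vocal range is what keeps accompaniment energy outside that range from being picked as the sung pitch. A full system would run this frame by frame and smooth the resulting contour over time.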


Keywords: Viterbi Algorithm, Musical Piece, Pitch Range, Music Information Retrieval, Vocal Expression
These keywords were added by machine and not by the authors.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Yukara Ikemiya (1)
  • Katsutoshi Itoyama (1)
  • Hiroshi G. Okuno (1)
  1. Graduate School of Informatics, Kyoto University, Sakyo, Japan
