Technical foundations of TANDEM-STRAIGHT, a speech analysis, modification and synthesis framework

Sadhana

Abstract

This article presents comprehensive technical information about STRAIGHT and TANDEM-STRAIGHT, a widely used speech modification tool and its successor. They share the same concept: the periodic excitation found in voiced sounds is an efficient mechanism for transmitting the underlying smooth time–frequency representation. The tools are also based on the perceptual equivalence of two sets of independent Gaussian random signals. This equivalence makes it possible to discard input phase information intentionally and enables flexible manipulation of parameters.
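
The statistical side of the Gaussian-signal argument can be illustrated with a small numerical sketch. It is not taken from the article and does not test perception. It only shows, under assumed settings (unit-variance white noise, 16-sample non-overlapping frames, a fixed random seed), that two independent Gaussian signals have nearly identical averaged power spectra even though their waveforms, and hence their phases, are unrelated. All function and variable names are hypothetical.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Periodogram of one frame via a direct DFT (bins 0..n/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def averaged_power_spectrum(signal, frame_len):
    """Average the periodograms of non-overlapping frames."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    spectra = [power_spectrum(f) for f in frames]
    return [sum(col) / len(spectra) for col in zip(*spectra)]


def correlation(x, y):
    """Normalized cross-correlation of two sequences at zero lag."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


rng = random.Random(0)  # assumed fixed seed so the run is repeatable
n_samples = 16 * 400    # assumed: 400 frames of 16 samples each
noise_a = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
noise_b = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

spec_a = averaged_power_spectrum(noise_a, 16)
spec_b = averaged_power_spectrum(noise_b, 16)
max_spec_diff = max(abs(p - q) for p, q in zip(spec_a, spec_b))
waveform_corr = correlation(noise_a, noise_b)

print("largest spectral difference:", round(max_spec_diff, 3))
print("waveform correlation:", round(waveform_corr, 3))
```

Both averaged spectra stay close to the flat value of 1 expected for unit-variance white noise, while the waveform correlation stays near zero. A representation that keeps only the smooth power spectrum therefore treats the two signals as the same, which is the sense in which phase can be discarded here.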

Author information

Corresponding author

Correspondence to HIDEKI KAWAHARA.

About this article

Cite this article

KAWAHARA, H., MORISE, M. Technical foundations of TANDEM-STRAIGHT, a speech analysis, modification and synthesis framework. Sadhana 36, 713–727 (2011). https://doi.org/10.1007/s12046-011-0043-3

  • DOI: https://doi.org/10.1007/s12046-011-0043-3
