Robust Voicing Detection and F0 Estimation Method

Rao, K. Sreenivasa; Narendra, N. P.

doi:10.1007/978-3-030-02759-9_3

Part of the book series: SpringerBriefs in Speech Technology (BRIEFSSPEECHTECH)

Abstract

In this chapter, a robust voicing detection and F₀ estimation method is proposed for HMM-based speech synthesis systems. The impulse-like excitation present in voiced speech is utilized to extract the fundamental frequency. The zero-frequency filter (ZFF) method is used to derive the locations of the impulse excitation. The size of the window used in the ZFF method is exploited for accurate voicing detection and F₀ estimation.
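As a rough illustration of the idea summarized above, the following Python sketch applies zero-frequency filtering in its commonly described form: difference the signal, pass it through a cascade of two zero-frequency resonators, remove the resulting trend with a local mean, and treat negative-to-positive zero crossings as excitation instants. It is a minimal sketch under stated assumptions, not the chapter's implementation. The function names `zff_epochs` and `f0_from_epochs`, the fixed 10 ms window, and the three trend-removal passes are all hypothetical choices made for illustration, and the chapter's strategy for exploiting the window size and for voicing detection is not reproduced here.

```python
import numpy as np


def zff_epochs(signal, fs, window_ms=10.0, passes=3):
    """Return (zff_signal, epoch_indices) for a 1-D signal sampled at fs Hz.

    window_ms and passes are illustrative defaults, not values from the chapter.
    """
    s = np.asarray(signal, dtype=np.float64)
    # Difference the signal to remove any slowly varying bias.
    x = np.diff(s, prepend=s[0])
    # A cascade of two zero-frequency resonators amounts to four cumulative sums.
    y = x
    for _ in range(4):
        y = np.cumsum(y)
    # Remove the polynomial trend by repeatedly subtracting a local mean.
    half = max(1, int(round(window_ms * 1e-3 * fs / 2)))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(passes):
        y = y - np.convolve(y, kernel, mode="same")
    # Negative-to-positive zero crossings are taken as epoch candidates.
    idx = np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0)) + 1
    # Discard candidates near the edges, where zero padding corrupts the local mean.
    guard = passes * half
    idx = idx[(idx >= guard) & (idx < len(y) - guard)]
    return y, idx


def f0_from_epochs(epochs, fs):
    """Instantaneous F0 in Hz from the spacing of successive epochs."""
    periods = np.diff(epochs) / float(fs)
    return 1.0 / periods


# Synthetic check: an impulse train with one impulse every 80 samples at 8 kHz.
fs = 8000
train = np.zeros(fs // 2)
train[40::80] = 1.0
_, epochs = zff_epochs(train, fs)
f0 = f0_from_epochs(epochs, fs)
print(np.median(f0))
```

The synthetic impulse train stands in for the impulse-like excitation of voiced speech. On real recordings the trend-removal window would need to be matched to the speaker's pitch range, and the cumulative sums grow quickly enough that long signals are better processed in double precision or in segments.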

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
K. Sreenivasa Rao
Aalto University, Espoo, Finland
N. P. Narendra

About this chapter

Cite this chapter

Rao, K.S., Narendra, N.P. (2019). Robust Voicing Detection and F₀ Estimation Method. In: Source Modeling Techniques for Quality Enhancement in Statistical Parametric Speech Synthesis. SpringerBriefs in Speech Technology. Springer, Cham. https://doi.org/10.1007/978-3-030-02759-9_3

DOI: https://doi.org/10.1007/978-3-030-02759-9_3
Published: 14 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02758-2
Online ISBN: 978-3-030-02759-9
eBook Packages: Engineering, Engineering (R0)
