Abstract
Epoch extraction supports speech enhancement and multispeaker separation in speech, but it is a challenging task because the characteristics of the source and the system vary with time. The epoch sequence is useful for manipulating prosody in speech synthesis applications, and accurate estimation of epochs helps in characterizing voice quality features. This chapter aims to develop an extraction algorithm that is independent of the characteristics of the vocal tract system and that improves the accuracy of the epochs extracted and the pitch detected from the speech signal. For feature detection, we propose a robust framework derived from the Hilbert–Huang transform (HHT) of the speech signal. The intrinsic mode functions (IMFs) sharply identify instantaneous frequencies as a function of time. The proposed technique achieves accurate pitch estimation because the HHT decorrelates the signal better than the DCT and the DFT. Results are simulated for an input speech signal taken from the NOISEX-92 database, and they show that the proposed algorithm outperforms the existing methods.
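As a rough illustration of the Hilbert spectral step mentioned above, the sketch below estimates the instantaneous frequency of a single oscillatory component from the phase of its analytic signal. It is a minimal sketch under stated assumptions, not the chapter's algorithm: the empirical mode decomposition that produces the IMFs is not shown, a synthetic 120 Hz cosine stands in for one IMF, and the function names `analytic_signal` and `instantaneous_frequency`, the sampling rate, and the test tone are all hypothetical choices made for this example.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real sequence, built in the frequency domain.

    Negative-frequency bins are zeroed and positive-frequency bins doubled,
    which is the usual discrete construction of x + j * Hilbert(x).
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin unchanged
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_frequency(component, fs):
    """Instantaneous frequency in Hz from the unwrapped analytic phase."""
    phase = np.unwrap(np.angle(analytic_signal(component)))
    return np.diff(phase) * fs / (2.0 * np.pi)


# Assumed stand-in for one IMF: a 120 Hz cosine sampled at 8 kHz for 0.1 s.
fs = 8000
t = np.arange(800) / fs
imf_like = np.cos(2.0 * np.pi * 120.0 * t)

f_inst = instantaneous_frequency(imf_like, fs)
median_f0 = float(np.median(f_inst))
print(f"median instantaneous frequency: {median_f0:.1f} Hz")
```

For real speech, each IMF would be passed through the same phase-derivative step, and the component whose instantaneous frequency falls in the expected pitch range would be the natural candidate for pitch and epoch analysis.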
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Samiappan, D., Jaba Deva Krupa, A., Monika, R. (2018). Epoch Extraction Using Hilbert–Huang Transform for Identification of Closed Glottis Interval. In: Saini, H., Singh, R., Reddy, K. (eds) Innovations in Electronics and Communication Engineering. Lecture Notes in Networks and Systems, vol 7. Springer, Singapore. https://doi.org/10.1007/978-981-10-3812-9_15
DOI: https://doi.org/10.1007/978-981-10-3812-9_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3811-2
Online ISBN: 978-981-10-3812-9
eBook Packages: Engineering (R0)