Abstract
A previously reported wavelet-based pitch detection scheme has been evaluated within a complete speech analysis-synthesis system. Speech quality, computation time, and memory consumption (for real-time implementation) are the parameters considered in comparing this system with analysis-synthesis systems that use pitch detection based on autocorrelation and cepstral analysis. Results for different speech signals show that the autocorrelation-based pitch detection scheme is the best in terms of speech quality and memory consumption, while the wavelet-based pitch detection scheme falls between the other two methods.
Cite this article
Kumar, S., Bhattacharya, S., Dhiman, V. et al. Performance evaluation of a wavelet-based pitch detection scheme. Int J Speech Technol 16, 431–437 (2013). https://doi.org/10.1007/s10772-013-9194-4
DOI: https://doi.org/10.1007/s10772-013-9194-4