ICISP 2008: Image and Signal Processing, pp. 544-551
Voicing Detection in Noisy Speech Signal
Abstract
An algorithm for voicing detection in noisy speech signals is proposed. The algorithm is based on the product of wavelet transforms at several scales, called the multi-scale product. The multi-scale product reinforces edges in the signal while suppressing additive noise. Because unvoiced sounds are modeled as filtered noise in most of the major speech production models, we apply the multi-scale product to the speech signal to detect voiced segments. The multi-scale product suppresses the signal frames corresponding to unvoiced sounds and silence, while it preserves the periodicity of voiced frames.
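The multi-scale product described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the wavelet or the scales, so the derivative-of-Gaussian wavelet, the dyadic scales (2, 4, 8), and the function names `gaussian_derivative_wavelet` and `multiscale_product` are all assumptions made for illustration, not the authors' implementation. The example only shows the general behavior: a sharp edge yields a large product, while additive noise alone yields a product close to zero.

```python
import numpy as np


def gaussian_derivative_wavelet(scale, half_width=4.0):
    """First derivative of a Gaussian, sampled and L1-normalized (assumed wavelet)."""
    n = int(np.ceil(half_width * scale))
    t = np.arange(-n, n + 1, dtype=float)
    psi = -t * np.exp(-t**2 / (2.0 * scale**2))
    return psi / np.sum(np.abs(psi))


def multiscale_product(x, scales=(2, 4, 8)):
    """Pointwise product of the wavelet responses of x at each scale."""
    x = np.asarray(x, dtype=float)
    product = np.ones_like(x)
    for s in scales:
        # "same" keeps the output aligned with x (zero padding at the borders)
        product *= np.convolve(x, gaussian_derivative_wavelet(s), mode="same")
    return product


# Synthetic test signals (hypothetical data, not speech):
rng = np.random.default_rng(0)
noise = 0.1 * rng.standard_normal(400)

step = np.zeros(400)
step[200:] = 1.0                      # a single sharp edge at sample 200

mp_step = multiscale_product(step + noise)   # edge plus additive noise
mp_noise = multiscale_product(noise)         # noise only

edge_location = int(np.argmax(mp_step))
print("edge located near sample", edge_location)
print("peak on edge:", mp_step.max(), "peak on noise:", np.abs(mp_noise).max())
```

With an odd number of scales the product keeps the sign of the edge, so the rising step produces a positive peak. The zero padding at the end of the signal acts as a falling edge and produces a negative value there, which is why the sketch uses `argmax` rather than the maximum of the absolute value.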
Keywords
Wavelet transform, multi-scale product, voicing decision, speech signal