Voicing Detection in Noisy Speech Signal

  • Aïcha Bouzid
  • Noureddine Ellouze
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5099)


An algorithm for voicing detection in noisy speech signals is proposed. The algorithm is based on the multi-scale product, that is, the product of the signal's wavelet transforms at several scales. The multi-scale product reinforces edges in the signal while suppressing additive noise. Because most major speech production models treat unvoiced sounds as filtered noise, we apply the multi-scale product to the speech signal to detect voiced segments. The multi-scale product cancels out the signal in frames corresponding to unvoiced sounds and silence, while preserving the speech periodicity in voiced frames.
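As a rough illustration of the multi-scale product itself, the sketch below multiplies, sample by sample, the wavelet transforms of a signal at three scales. The abstract does not specify the wavelet, the scales, or the decision rule, so everything here is an assumption made for illustration: the first-derivative-of-Gaussian wavelet, the scales (2, 4, 8), and the hypothetical helper names `gaussian_derivative_wavelet` and `multiscale_product`. It is not the authors' implementation, and it shows only edge reinforcement on a synthetic noisy step, not the voicing decision.

```python
import numpy as np

def gaussian_derivative_wavelet(scale, width=4.0):
    """Sampled first derivative of a Gaussian (an assumed wavelet choice)."""
    half = int(np.ceil(width * scale))
    t = np.arange(-half, half + 1, dtype=float)
    psi = -t * np.exp(-t**2 / (2.0 * scale**2))
    # Normalise by the absolute sum so responses are comparable across scales.
    return psi / np.sum(np.abs(psi))

def multiscale_product(x, scales=(2, 4, 8)):
    """Pointwise product of the wavelet transforms of x at the given scales.

    The signal must be longer than the largest kernel, otherwise
    np.convolve(..., mode="same") returns an array longer than x.
    """
    x = np.asarray(x, dtype=float)
    product = np.ones_like(x)
    for s in scales:
        product *= np.convolve(x, gaussian_derivative_wavelet(s), mode="same")
    return product

# Synthetic example: a step edge at sample 200 buried in white noise.
rng = np.random.default_rng(0)
n = np.arange(400)
x = np.where(n >= 200, 0.5, -0.5) + 0.05 * rng.standard_normal(n.size)

p = multiscale_product(x)
peak = int(np.argmax(p))  # location of the strongest positive response
```

At the edge, the responses at all three scales line up and their product stays large, whereas the responses to noise are weakly correlated across scales and their product shrinks. Note that zero-padding at the signal boundaries introduces artificial edges there; with an odd number of scales those boundary responses are negative in this example, which is why the signed maximum is used.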


Keywords: wavelet transform, multi-scale product, voicing decision, speech signal



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Aïcha Bouzid (1)
  • Noureddine Ellouze (2)
  1. Institut Supérieur d’Electronique et de Communication de Sfax, Sfax, Tunisia
  2. Ecole Nationale d’Ingénieurs de Tunis, Tunis, Tunisia
