Abstract
This paper investigates wavelet filtering by multistage convolution with Reverse Biorthogonal Wavelets (RBW) in the high-pass and low-pass frequency bands of a speech signal. The speech signal is decomposed into two frequency bands, high and low, and the noise is then removed from each band individually, in successive stages, via wavelet filters. This approach gives better results because it does not cut away speech information, as conventional thresholding does. We tested the proposed method with noise drawn from several probability distribution functions. Subjective evaluation is used in conjunction with objective evaluation to assess the method. The method is simple but yields surprisingly high-quality results, and it outperforms both the Donoho and Johnstone thresholding method and the Birgé-Massart thresholding strategy.
Cite this article
Daqrouq, K., Abu-Isbeih, I.N., Daoud, O. et al. An investigation of speech enhancement using wavelet filtering method. Int J Speech Technol 13, 101–115 (2010). https://doi.org/10.1007/s10772-010-9073-1
DOI: https://doi.org/10.1007/s10772-010-9073-1