Abstract
This paper presents a real-time architecture for an improved single-channel speech enhancement system based on phase-aware multi-band complex spectral subtraction. In the proposed technique, the short-time spectral magnitude of the clean speech signal is estimated by taking into account the spectral phases of the speech and noise components. The estimated spectral phase of the clean speech signal is also used to reconstruct the signal in the time domain. The proposed system consists of a basic preprocessing module followed by a short-time Fourier transform analyzer, a noise power estimator based on improved minima controlled recursive averaging, a phase estimator unit and an overlap-add synthesis unit. The architecture is implemented on a field-programmable gate array (FPGA) using the Xilinx ISE tool, and the overall resource utilization and maximum operating frequency are computed for a Virtex-6 FPGA chip. Experiments show that the proposed speech enhancement framework outperforms existing standard benchmark methods in terms of several quality and intelligibility assessment metrics.
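To illustrate why the phases of the speech and noise components matter for magnitude estimation, the following minimal sketch works through a single time-frequency bin. It relies only on the general identity |Y|^2 = |X|^2 + |D|^2 + 2|X||D|cos(Δ) for Y = X + D, where Δ is the phase difference between speech X and noise D. It is not the estimator proposed in the paper: the multi-band processing, noise tracking and phase estimation stages are omitted, the function names are hypothetical, and the numeric values are made up for illustration.

```python
import cmath
import math


def power_subtraction(noisy_mag, noise_mag):
    """Conventional estimate of |X|: drops the speech/noise cross term."""
    return math.sqrt(max(noisy_mag ** 2 - noise_mag ** 2, 0.0))


def phase_aware_magnitude(noisy_mag, noise_mag, phase_diff):
    """Solve |Y|^2 = |X|^2 + |D|^2 + 2|X||D|cos(phase_diff) for |X|.

    phase_diff is the angle between the speech and noise components.
    Assumes noisy_mag >= noise_mag, so the quadratic has one non-negative root.
    """
    b = noise_mag * math.cos(phase_diff)
    disc = b * b + noisy_mag ** 2 - noise_mag ** 2
    return max(-b + math.sqrt(max(disc, 0.0)), 0.0)


# One time-frequency bin with made-up values (assumptions, not measured data).
clean = cmath.rect(1.0, 0.3)   # X: magnitude 1.0, phase 0.3 rad
noise = cmath.rect(0.6, 2.0)   # D: magnitude 0.6, phase 2.0 rad
noisy = clean + noise          # Y = X + D

phase_diff = cmath.phase(clean) - cmath.phase(noise)
conventional = power_subtraction(abs(noisy), abs(noise))
phase_aware = phase_aware_magnitude(abs(noisy), abs(noise), phase_diff)
print(conventional, phase_aware)
```

With the true phase difference supplied, the phase-aware solution recovers the clean magnitude, while the conventional estimate deviates from it because the cross term is non-zero. In a real system the phase difference is unknown and has to be estimated, which is the role of a phase estimator unit such as the one listed in the abstract.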
Acknowledgements
The authors are thankful to the editor and the anonymous reviewers for their helpful suggestions and valuable comments throughout the review process, which have considerably helped to improve the content of the paper.
Cite this article
Samui, S., Sahu, P., Chakrabarti, I. et al. FPGA Implementation of a Phase-Aware Single-Channel Speech Enhancement System. Circuits Syst Signal Process 36, 4688–4715 (2017). https://doi.org/10.1007/s00034-017-0541-6