Abstract
In this paper, we focus on estimation-based frequency-domain speech enhancement methods under speech presence uncertainty. By minimizing an average risk function, we derive a generalization of the maximum a posteriori spectral amplitude estimator. Adjusting the cost parameters controls the error caused by noise falsely detected as speech. Our experimental results show that the proposed system can be a simple alternative to Abramson’s simultaneous detection and estimation approach for speech enhancement, since it involves only estimation under speech presence uncertainty and does not require a detector. Moreover, the proposed estimator is more straightforward to implement, since it does not require computing Bessel functions.
References
A. Abramson, I. Cohen, Simultaneous detection and estimation approach for speech enhancement. IEEE Trans. Audio Speech Lang. Process 15(4), 2348–2359 (2007)
J. Benesty, J. Chen, E. Habets, Speech Enhancement in the STFT Domain (Springer, Berlin, 2011)
J. Benesty, J. Chen, Y. Huang, I. Cohen, Noise Reduction in Speech Processing (Springer, Berlin, 2009)
I. Cohen, Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging. IEEE Trans. Speech Audio Process 11(5), 466–475 (2003)
Y. Ephraim, D. Malah, Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process 32(6), 1109–1121 (1984)
J.S. Erkelens, R.C. Hendriks, R. Heusdens, J. Jensen, Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors. IEEE Trans. Audio Speech Lang. Process 15(6), 1741–1752 (2007)
A. Fredriksen, D. Middleton, D. Vandelinde, Simultaneous signal detection and estimation under multiple hypotheses. IEEE Trans. Inf. Theory 18(5), 607–614 (1972)
R.J. McAulay, M.L. Malpass, Speech enhancement using a soft-decision noise suppression filter. IEEE Trans. Acoust. Speech Signal Process 28(2), 137–145 (1980)
D. Middleton, F. Esposito, Simultaneous optimum detection and estimation of signals in noise. IEEE Trans. Inf. Theory 14(3), 434–444 (1968)
P.J. Wolfe, S.J. Godsill, Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement. EURASIP J. Appl. Signal Process 10, 1043–1051 (2003)
C.H. You, S.N. Koh, S. Rahardja, \(\beta \)-Order MMSE spectral amplitude estimation for speech enhancement. IEEE Trans. Speech Audio Process 13(4), 475–486 (2005)
Appendix
We want to find \(\hat{A}\) that minimizes the expression,
In an optimization problem, to find the \(\hat{\theta }\) that minimizes \(J(\hat{\theta })\), \(J(\hat{\theta })\) is differentiated with respect to \(\hat{\theta }\) and the derivative is set equal to zero. The solution is \(\hat{\theta }=\theta _{opt} \). Alternatively, to solve the optimization problem, we can treat \(\hat{\theta }-\theta _{opt} \) as equivalent to \(\frac{\partial }{\partial \hat{\theta }}J\), i.e., \(\hat{\theta }-\theta _{opt} \equiv \frac{\partial }{\partial \hat{\theta }}J\). Based on this remark, we follow the derivation of Eq. (13).
Differentiating (22) with respect to \(\hat{A}\) yields
Our purpose is to find \(\hat{A}\) that minimizes the expression,
As is known from estimation theory, the optimum \(\hat{A}\) that minimizes (24) is the one that maximizes the posterior distribution \(f(A,\alpha \vert Y,H_1 )\), provided \(\Delta \) is small enough. In other words, the optimum \(\hat{A}\) is the MAP spectral amplitude estimator derived in [10].
So, based on the above-mentioned remark, we have
Similarly, in finding \(\hat{A}\) that minimizes the expression:
the optimum estimator is \(\hat{A}=G_f R\), if \(\Delta \) is small enough. Based on the above-mentioned remark, we have
By substituting Eqs. (26) and (28) into Eq. (23), we obtain
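As an illustration of the statement above that the optimum \(\hat{A}\) maximizes the posterior \(f(A,\alpha \vert Y,H_1 )\), the following sketch evaluates the closed-form joint MAP amplitude gain commonly quoted for a complex Gaussian speech and noise model, and checks it numerically against a grid search over the log-posterior. The symbols `xi` (a priori SNR), `gamma` (a posteriori SNR), `lambda_x`, and `lambda_d` (speech and noise variances) are assumed notation, not defined in this appendix, and the gain formula is the standard textbook form rather than a reproduction of the paper's equations. Note that it needs only a square root, with no Bessel functions.

```python
import numpy as np

def joint_map_gain(xi, gamma):
    """Joint MAP amplitude gain under an assumed complex Gaussian model.

    xi    : a priori SNR (lambda_x / lambda_d), assumed notation
    gamma : a posteriori SNR (R^2 / lambda_d), assumed notation
    """
    xi = np.asarray(xi, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    return (xi + np.sqrt(xi**2 + 2.0 * (1.0 + xi) * xi / gamma)) / (2.0 * (1.0 + xi))

def log_posterior(A, R, lambda_x, lambda_d):
    """Log of f(A, alpha | Y, H1) up to an additive constant.

    The phase alpha is fixed to the noisy phase, which maximizes the
    posterior over alpha, so only the amplitude A remains.
    """
    return -(R - A) ** 2 / lambda_d + np.log(A) - A**2 / lambda_x

# Hypothetical values chosen only for the check.
lambda_x, lambda_d, R = 2.0, 1.0, 1.5
xi, gamma = lambda_x / lambda_d, R**2 / lambda_d

# Closed-form estimate: gain times the noisy amplitude R.
A_closed = float(joint_map_gain(xi, gamma) * R)

# Brute-force maximization of the log-posterior over a fine amplitude grid.
grid = np.linspace(1e-3, 5.0, 200001)
A_grid = float(grid[np.argmax(log_posterior(grid, R, lambda_x, lambda_d))])

print(A_closed, A_grid)
```

The two printed values should agree to within the grid spacing, which is a quick way to sanity-check a gain function before using it in an enhancement loop.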
Cite this article
Momeni, H., Abutalebi, H.R. Generalization of Maximum A Posteriori Amplitude Estimator Under Speech Presence Uncertainty for Speech Enhancement. Circuits Syst Signal Process 33, 2565–2582 (2014). https://doi.org/10.1007/s00034-014-9762-0
DOI: https://doi.org/10.1007/s00034-014-9762-0