
Generalization of Maximum A Posteriori Amplitude Estimator Under Speech Presence Uncertainty for Speech Enhancement

Circuits, Systems, and Signal Processing

Abstract

In this paper, we focus on estimation-based frequency-domain speech enhancement methods under speech presence uncertainty. By minimizing an average risk function, we derive a generalization of the maximum a posteriori (MAP) spectral amplitude estimator. By adjusting the cost parameters, we can control the error caused by noise falsely detected as speech. Our experimental results show that the proposed system can be a simple alternative to Abramson’s simultaneous detection and estimation approach for speech enhancement, since it involves only estimation under speech presence uncertainty and does not require a detector. Moreover, the proposed estimator is more straightforward to implement, since it does not require the computation of Bessel functions.

Notes

  1. As shown in Sect. 5.3, this assumption is also compatible with the observed performance of the detector of [1].

References

  1. A. Abramson, I. Cohen, Simultaneous detection and estimation approach for speech enhancement. IEEE Trans. Audio Speech Lang. Process. 15(4), 2348–2359 (2007)

  2. J. Benesty, J. Chen, E. Habets, Speech Enhancement in the STFT Domain (Springer, Berlin, 2011)

  3. J. Benesty, J. Chen, Y. Huang, I. Cohen, Noise Reduction in Speech Processing (Springer, Berlin, 2009)

  4. I. Cohen, Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging. IEEE Trans. Speech Audio Process. 11(5), 466–475 (2003)

  5. Y. Ephraim, D. Malah, Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process. 32(6), 1109–1121 (1984)

  6. J.S. Erkelens, R.C. Hendriks, R. Heusdens, J. Jensen, Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors. IEEE Trans. Audio Speech Lang. Process. 15(6), 1741–1752 (2007)

  7. A. Fredriksen, D. Middleton, D. Vandelinde, Simultaneous signal detection and estimation under multiple hypotheses. IEEE Trans. Inf. Theory 18(5), 607–614 (1972)

  8. R.J. McAulay, M.L. Malpass, Speech enhancement using a soft-decision noise suppression filter. IEEE Trans. Acoust. Speech Signal Process. 28(2), 137–145 (1980)

  9. D. Middleton, F. Esposito, Simultaneous optimum detection and estimation of signals in noise. IEEE Trans. Inf. Theory 14(3), 434–444 (1968)

  10. P.J. Wolfe, S.J. Godsill, Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement. EURASIP J. Appl. Signal Process. 10, 1043–1051 (2003)

  11. C.H. You, S.N. Koh, S. Rahardja, \(\beta \)-Order MMSE spectral amplitude estimation for speech enhancement. IEEE Trans. Speech Audio Process. 13(4), 475–486 (2005)

Author information

Corresponding author

Correspondence to Hamid Reza Abutalebi.

Appendix

We want to find the \(\hat{A}\) that minimizes the expression

$$\begin{aligned} J&= qb_{11} f(Y\vert H_1 )(1-\int \limits _{\hat{A}-\Delta }^{\hat{A}+\Delta } {\int \limits _0^{2\pi } {f(Y\vert A,\alpha )f(A,\alpha \vert H_1 )/f(Y\vert H_1 )d\alpha dA} } ) \nonumber \\&+\,(1-q)b_{01} f(Y\vert H_0 )( {1+U(G_f R-\hat{A}-\Delta )-U(G_f R-\hat{A}+\Delta )}).\qquad \quad \end{aligned}$$
(22)

In an optimization problem, to find the \(\hat{\theta }\) that minimizes \(J(\hat{\theta })\), we differentiate \(J(\hat{\theta })\) with respect to \(\hat{\theta }\) and set the derivative equal to zero; the solution is \(\hat{\theta }=\theta _{opt} \). Since \(\hat{\theta }-\theta _{opt} \) and \(\frac{\partial }{\partial \hat{\theta }}J\) both vanish at \(\hat{\theta }=\theta _{opt} \), we can alternatively treat the two as equivalent for the purpose of locating the minimizer, i.e., \(\hat{\theta }-\theta _{opt} \equiv \frac{\partial }{\partial \hat{\theta }}J\). Based on this remark, we follow the derivation of Eq. (13).

Differentiating (22) with respect to \(\hat{A}\) yields

$$\begin{aligned} \frac{\partial }{\partial \hat{A}}J&= qb_{11} f(Y\vert H_1 )\frac{\partial }{\partial \hat{A}}\left( 1-\int \limits _{\hat{A}-\Delta }^{\hat{A}+\Delta } {\int \limits _0^{2\pi } {f(Y\vert A,\alpha )f(A,\alpha \vert H_1 )/f(Y\vert H_1 )d\alpha dA} } \right) \nonumber \\&+\,(1-q)b_{01} f(Y\vert H_0 )\frac{\partial }{\partial \hat{A}}\left( 1+U(G_f R-\hat{A}-\Delta )-U(G_f R-\hat{A}+\Delta )\right) . \end{aligned}$$
(23)

Consider first the problem of finding the \(\hat{A}\) that minimizes the expression

$$\begin{aligned} \left( {1-\int \limits _{\hat{A}-\Delta }^{\hat{A}+\Delta } {\int \limits _0^{2\pi } {f(Y\vert A,\alpha )f(A,\alpha \vert H_1 )/f(Y\vert H_1 )d\alpha dA} } }\right) . \end{aligned}$$
(24)

As is known from estimation theory, the \(\hat{A}\) that minimizes (24) is the one that maximizes the posterior distribution \(f(A,\alpha \vert Y,H_1 )\), provided \(\Delta \) is small enough. In other words, the optimum \(\hat{A}\) is the MAP spectral amplitude estimator derived in [10]:

$$\begin{aligned} \hat{A}=\frac{\xi +\sqrt{\xi ^2+2\frac{\xi (1+\xi )}{\gamma }} }{2(1+\xi )}R \triangleq G_\mathrm{MAP} R. \end{aligned}$$
(25)
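As a numerical sanity check on (25), the following Python sketch evaluates \(G_\mathrm{MAP}\) and compares \(G_\mathrm{MAP} R\) with a brute-force maximizer of a joint log-posterior. The model behind the check is an assumption that this appendix does not state: complex Gaussian speech and noise with variances `lambda_x` and `lambda_d`, a uniform phase prior, and \(\xi \) and \(\gamma \) read as `lambda_x / lambda_d` and `r**2 / lambda_d`. All function and variable names are illustrative.

```python
import math


def map_gain(xi: float, gamma: float) -> float:
    """Gain G_MAP of Eq. (25) as a function of xi and gamma."""
    if xi <= 0 or gamma <= 0:
        raise ValueError("xi and gamma must be positive")
    root = math.sqrt(xi * xi + 2.0 * xi * (1.0 + xi) / gamma)
    return (xi + root) / (2.0 * (1.0 + xi))


def log_posterior(a: float, r: float, lambda_x: float, lambda_d: float) -> float:
    """Joint log-posterior of the amplitude, up to an additive constant.

    ASSUMPTION (not from the appendix): complex Gaussian speech and noise,
    so the amplitude prior is Rayleigh and the likelihood is Gaussian.
    The phase is already fixed to the phase of Y, which maximizes the
    likelihood term for any amplitude a.
    """
    return math.log(a) - a * a / lambda_x - (r - a) ** 2 / lambda_d


def grid_argmax(r, lambda_x, lambda_d, a_max, steps=200_000):
    """Brute-force maximizer of log_posterior on a uniform grid over (0, a_max]."""
    best_a, best_val = None, -math.inf
    for i in range(1, steps + 1):
        a = a_max * i / steps
        val = log_posterior(a, r, lambda_x, lambda_d)
        if val > best_val:
            best_a, best_val = a, val
    return best_a


if __name__ == "__main__":
    r, lambda_x, lambda_d = 1.5, 2.0, 1.0  # arbitrary illustrative values
    xi, gamma = lambda_x / lambda_d, r * r / lambda_d
    closed_form = map_gain(xi, gamma) * r
    numerical = grid_argmax(r, lambda_x, lambda_d, a_max=2.0 * r)
    print(f"closed form: {closed_form:.5f}, grid search: {numerical:.5f}")
```

Under these assumptions the two values should agree to within the grid spacing. The search interval `a_max` must be chosen large enough to contain the maximizer, which can exceed \(R\) when \(\gamma \) is small.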

So, based on the above-mentioned remark, we have

$$\begin{aligned} \hat{A}-G_\mathrm{MAP} R\equiv \quad \frac{\partial }{\partial \hat{A}}\left( {1-\int \limits _{\hat{A}-\Delta }^{\hat{A}+\Delta } {\int \limits _0^{2\pi } {f(Y\vert A,\alpha )f(A,\alpha \vert H_1 )/f(Y\vert H_1 )d\alpha dA} } }\right) \!.\qquad \end{aligned}$$
(26)

Similarly, in finding the \(\hat{A}\) that minimizes the expression

$$\begin{aligned} \left( {1+U(G_f R-\hat{A}-\Delta )-U(G_f R-\hat{A}+\Delta )}\right) \!, \end{aligned}$$
(27)

the optimum estimator is \(\hat{A}=G_f R\), if \(\Delta \) is small enough. Based on the above-mentioned remark, we have

$$\begin{aligned} \hat{A}-G_f R\equiv \frac{\partial }{\partial \hat{A}}\left( {1+U(G_f R-\hat{A}-\Delta )-U(G_f R-\hat{A}+\Delta )}\right) . \end{aligned}$$
(28)
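To see why \(\hat{A}=G_f R\) minimizes (27) for small \(\Delta \), it helps to evaluate the expression directly: it is zero when \(\hat{A}\) lies within \(\Delta \) of \(G_f R\) and one otherwise. The Python sketch below illustrates this. The function names and the convention \(U(0)=1\) are assumptions made for the example, not taken from the paper.

```python
def unit_step(x: float) -> float:
    """Unit step U(x); the value at x == 0 is taken as 1 (assumed convention)."""
    return 1.0 if x >= 0 else 0.0


def window_cost(a_hat: float, target: float, delta: float) -> float:
    """Expression (27), with target standing for the product G_f * R."""
    return 1.0 + unit_step(target - a_hat - delta) - unit_step(target - a_hat + delta)


if __name__ == "__main__":
    target, delta = 0.2, 0.01  # arbitrary illustrative values
    for a_hat in (0.10, 0.195, 0.20, 0.205, 0.30):
        print(a_hat, window_cost(a_hat, target, delta))
```

Only estimates inside the window around `target` incur zero cost, so as `delta` shrinks the set of minimizers collapses onto `target`.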

By substituting Eqs. (26) and (28) into Eq. (23), we obtain

$$\begin{aligned} \frac{\partial }{\partial \hat{A}}J=qb_{11} f(Y\vert H_1 )(\hat{A}-G_\mathrm{MAP} R)+(1-q)b_{01} f(Y\vert H_0 )(\hat{A}-G_f R)\!. \end{aligned}$$
(29)
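Setting the right-hand side of (29) to zero and solving for \(\hat{A}\) gives a weighted average of \(G_\mathrm{MAP} R\) and \(G_f R\), with weights \(qb_{11} f(Y\vert H_1 )\) and \((1-q)b_{01} f(Y\vert H_0 )\). This is presumably the form taken by Eq. (13), which is not reproduced in this appendix. The Python sketch below carries out that last algebraic step. The function and argument names are illustrative, and the weights are assumed to be nonnegative with a positive sum.

```python
def risk_derivative(a_hat, q, b11, b01, f_y_h1, f_y_h0, g_map, g_f, r):
    """Right-hand side of Eq. (29)."""
    w1 = q * b11 * f_y_h1          # weight of the speech-present term
    w0 = (1.0 - q) * b01 * f_y_h0  # weight of the speech-absent term
    return w1 * (a_hat - g_map * r) + w0 * (a_hat - g_f * r)


def combined_amplitude(q, b11, b01, f_y_h1, f_y_h0, g_map, g_f, r):
    """Root of Eq. (29) in a_hat: a weighted average of g_map * r and g_f * r."""
    w1 = q * b11 * f_y_h1
    w0 = (1.0 - q) * b01 * f_y_h0
    if w1 + w0 <= 0:
        raise ValueError("weights must have a positive sum")
    return (w1 * g_map + w0 * g_f) / (w1 + w0) * r


if __name__ == "__main__":
    # Arbitrary illustrative values, not taken from the paper's experiments.
    params = dict(q=0.6, b11=1.0, b01=2.0, f_y_h1=0.3, f_y_h0=0.1,
                  g_map=0.8, g_f=0.1, r=2.0)
    a_hat = combined_amplitude(**params)
    print(a_hat, risk_derivative(a_hat, **params))
```

In this form, raising \(b_{01}\) relative to \(b_{11}\) moves the estimate toward \(G_f R\), which is consistent with the abstract's statement that the cost parameters control the error caused by noise falsely detected as speech.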

About this article

Cite this article

Momeni, H., Abutalebi, H.R. Generalization of Maximum A Posteriori Amplitude Estimator Under Speech Presence Uncertainty for Speech Enhancement. Circuits Syst Signal Process 33, 2565–2582 (2014). https://doi.org/10.1007/s00034-014-9762-0

  • DOI: https://doi.org/10.1007/s00034-014-9762-0
