Statistically Optimal Joint Multimicrophone MAP Estimators Under Super-Gaussian Assumption

Ranjbaryan, Raziyeh; Abutalebi, Hamid Reza

doi:10.1007/s00034-023-02515-y

Published: 04 November 2023

Volume 43, pages 1492–1517, (2024)

Circuits, Systems, and Signal Processing

Abstract

This paper presents two super-Gaussian-based multimicrophone maximum a posteriori (MAP) estimators that jointly estimate the amplitude and phase of the speech signal from noisy observations. Super-Gaussian distributions are known to model the statistical properties of speech signals more accurately than Gaussian distributions. Assuming independent Gaussian noise signals, which is usually valid in wireless acoustic sensor networks, two joint multimicrophone estimators are derived with the speech signal modeled by a super-Gaussian distribution. Because the microphones in these networks are distributed randomly and may belong to different devices, the assumption of independent noise signals is reasonable. The performance of the proposed estimators is compared with that of four baselines. The first is the multimicrophone minimum mean square error (MMSE) estimator, in which both amplitude and phase are derived under a Gaussian assumption for the speech signal. The second is the multimicrophone MAP-based amplitude estimator, which uses super-Gaussian statistics to estimate only the speech amplitude and leaves the phase unchanged. The third is a minimum variance distortionless response filter followed by a super-Gaussian MMSE estimator. The fourth is the centralized multichannel Wiener filter. Simulation experiments show that the proposed estimators enhance speech quality and intelligibility when the clean speech is degraded by a mixture of point-source interference and additive noise in reverberant environments.

Data Availability

The simulated datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Notes

The input full-band SNRs are computed at the first microphone, which is taken as the reference microphone.
The simulation codes are available at https://pws.yazd.ac.ir/sprl/Ranjbaryan-CSSP/Codes.rar.

References

A. Abramson, I. Cohen, Simultaneous detection and estimation approach for speech enhancement. IEEE Trans. Audio Speech Lang. Process. 15(8), 2348–2359 (2007). https://doi.org/10.1109/TASL.2007.904231
H.R. Abutalebi, M. Rashidinejad, Speech enhancement based on beta-order MMSE estimation of short time spectral amplitude and Laplacian speech modeling. Speech Commun. 67, 92–101 (2015). https://doi.org/10.1016/j.specom.2014.12.002
J.B. Allen, D.A. Berkley, Image method for efficiently simulating small-room acoustics. J. Acoust. Soc. Am. 65(4), 943–950 (1979). https://doi.org/10.1121/1.382599
I. Andrianakis, P.R. White, MMSE speech spectral amplitude estimators with Chi and Gamma speech priors. In: proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 1068–1071, (2006). https://doi.org/10.1109/ICASSP.2006.1660842
R. Balan, J. Rosca, Microphone array speech enhancement by Bayesian estimation of spectral amplitude and phase. In: proc. Sensor Array and Multichannel Signal Processing Workshop Proceedings (SAM), pp 209–213, (2002) https://doi.org/10.1109/SAM.2002.1191030
A. Bertrand, M. Moonen, Distributed adaptive node-specific signal estimation in fully connected sensor networks—part I: sequential node updating. IEEE Trans. Signal Process. 58(10), 5277–5291 (2010). https://doi.org/10.1109/TSP.2010.2052612
S.R. Chiluveru, M. Tripathy, Low SNR speech enhancement with DNN based phase estimation. Int. J. Speech Technol. 22(1), 283–292 (2019). https://doi.org/10.1007/s10772-019-09603-y
T.H. Dat, K. Takeda, F. Itakura, Generalized Gamma modeling of speech and its online estimation for speech enhancement. In: proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 181–184, (2005). https://doi.org/10.1109/ICASSP.2005.1415975
S. Doclo, M. Moonen, T. Van den Bogaert et al., Reduced-bandwidth and distributed MWF-based noise reduction algorithms for binaural hearing aids. IEEE Trans. Audio Speech Lang. Process. 17(1), 38–51 (2009). https://doi.org/10.1109/TASL.2008.2004291
Y. Ephraim, D. Malah, Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process. 32(6), 1109–1121 (1984). https://doi.org/10.1109/TASSP.1984.1164453
J.S. Erkelens, R.C. Hendriks, R. Heusdens et al., Minimum mean-square error estimation of discrete Fourier coefficients with generalized Gamma priors. IEEE Trans. Audio Speech Lang. Process. 15(6), 1741–1752 (2007). https://doi.org/10.1109/TASL.2007.899233
J.S. Garofolo, Getting started with the DARPA TIMIT CD-ROM: an acoustic phonetic continuous speech database. Tech. rep., National Institute of Standards and Technology (NIST), Gaithersburg, MD, (prototype as of) (1988)
T. Gerkmann, M. Krawczyk-Becker, J.L. Roux, Phase processing for single channel speech enhancement. IEEE Signal Process. Mag. (2015)
T. Gerkmann, Bayesian estimation of clean speech spectral coefficients given a priori knowledge of the phase. IEEE Trans. Signal Process. 62(16), 4199–4208 (2014). https://doi.org/10.1109/TSP.2014.2336615
T. Gerkmann, MMSE-optimal enhancement of complex speech coefficients with uncertain prior knowledge of the clean speech phase. In: proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 4478–4482, (2014) https://doi.org/10.1109/ICASSP.2014.6854449
T. Gerkmann, R.C. Hendriks, Unbiased MMSE-based noise power estimation with low complexity and low tracking delay. IEEE Trans. Audio Speech Lang. Process. 20(4), 1383–1393 (2012). https://doi.org/10.1109/TASL.2011.2180896
T. Gerkmann, C. Breithaupt, R. Martin, Improved a posteriori speech presence probability estimation based on a likelihood ratio with fixed priors. IEEE Trans. Audio, Speech Lang. Process. 16(5), 910–919 (2008). https://doi.org/10.1109/TASL.2008.921764
R.C. Hendriks, R. Heusdens, J. Jensen, On robustness of multi-channel minimum mean-squared error estimators under super-Gaussian priors. In: proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp 157–160, (2009a). https://doi.org/10.1109/ASPAA.2009.5346488
R.C. Hendriks, R. Heusdens, U. Kjems et al., On optimal multichannel mean-squared error estimators for speech enhancement. IEEE Signal Process. Lett. 16(10), 885–888 (2009). https://doi.org/10.1109/LSP.2009.2026205
Y.A. Huang, J. Benesty, A multi-frame approach to the frequency-domain single-channel noise reduction problem. IEEE Trans. Audio Speech Lang. Process. 20(4), 1256–1269 (2012). https://doi.org/10.1109/TASL.2011.2174226
M. Kazama, S. Gotoh, M. Tohyama et al., On the significance of phase in the short term Fourier spectrum for speech intelligibility. J. Acoust. Soc. Am. 127(3), 1432–1439 (2010)
H. Lang, J. Yang, Speech enhancement based on fusion of both magnitude/phase-aware features and targets. Electronics 9(7), 1125–1144 (2020). https://doi.org/10.3390/electronics9071125
P. Loizou, Speech Enhancement: Theory and Practice, 1st edn. (CRC Press, Boca Raton, 2007)
T. Lotter, Single- and Multi-Microphone Spectral Amplitude Estimation Using a Super-Gaussian Speech Model (Springer, Berlin, 2005)
T. Lotter, P. Vary, Speech enhancement by MAP spectral amplitude estimation using a super-Gaussian speech model. EURASIP J. Adv. Signal Process. 7, 1110–1126 (2005). https://doi.org/10.1155/ASP.2005.1110
T. Lotter, C. Benien, P. Vary, Multi channel direction independent speech enhancement using spectral amplitude estimation. EURASIP J. Appl. Signal Process. 2003, 1147–1156 (2003)
S. Markovich-Golan, S. Gannot, I. Cohen, Distributed multiple constraints generalized sidelobe canceler for fully connected wireless acoustic sensor networks. IEEE Trans. Audio Speech Lang. Process. 21(2), 343–356 (2013). https://doi.org/10.1109/TASL.2012.2224454
S. Markovich-Golan, A. Bertrand, M. Moonen et al., Optimal distributed minimum-variance beamforming approaches for speech enhancement in wireless acoustic sensor networks. Signal Process. 107, 4–20 (2015). https://doi.org/10.1016/j.sigpro.2014.07.014
R. Martin, Speech enhancement using MMSE short time spectral estimation with Gamma distributed speech priors. In: proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 253–256, (2002). https://doi.org/10.1109/ICASSP.2002.5743702
R. Martin, Speech enhancement based on minimum mean-square error estimation and super-Gaussian priors. IEEE Trans. Speech Audio Process. 13(5), 845–856 (2005). https://doi.org/10.1109/TSA.2005.851927
R. Martin, C. Breithaupt, Speech enhancement in the DFT domain using Laplacian speech priors. In: proc. International Workshop on Acoustic Echo and Noise Control (IWAENC), pp 87–90 (2003)
N. Oo, W.S. Gan, On harmonic addition theorem. Int. J. Comput. Commun. Eng. 1(3), 200–202 (2012)
K. Paliwal, K. Wójcicki, B. Shannon, The importance of phase in speech enhancement. Speech Commun. 53(4), 465–494 (2011). https://doi.org/10.1016/j.specom.2010.12.003
A. Papoulis, S.U. Pillai, Probability, Random Variables, and Stochastic Processes, 4th edn. (McGraw Hill, Boston, 2002)
P.G. Patil, T.H. Jaware, S.P. Patil et al., Marathi speech intelligibility enhancement using I-AMS based neuro-fuzzy classifier approach for hearing aid users. IEEE Access 10, 123028–123042 (2022). https://doi.org/10.1109/ACCESS.2022.3223365
P.S. Rani, S. Andhavarapu, S.R. Murty Kodukula, Significance of phase in DNN based speech enhancement algorithms. In: proc. National Conference on Communications (NCC), pp 1–5, (2020), https://doi.org/10.1109/NCC48643.2020.9056089
R. Ranjbaryan, H.R. Abutalebi, Distributed speech presence probability estimator in fully connected wireless acoustic sensor networks. Circuits Syst. Signal Process. 39, 6121–6141 (2020). https://doi.org/10.1007/s00034-020-01452-4
R. Ranjbaryan, H.R. Abutalebi, Multiframe maximum a posteriori estimators for single-microphone speech enhancement. IET Signal Proc. 15(7), 467–481 (2021). https://doi.org/10.1049/sil2.12045
R. Ranjbaryan, S. Doclo, H.R. Abutalebi, Distributed MAP estimators for noise reduction in fully connected wireless acoustic sensor networks. In: Proc. Speech Communication; 13th ITG-Symposium, pp 1–5 (2018)
S. Samui, I. Chakrabarti, S.K. Ghosh, Improved single channel phase-aware speech enhancement technique for low signal-to-noise ratio signal. IET Signal Proc. 10(6), 641–650 (2016). https://doi.org/10.1049/iet-spr.2015.0182
M. Souden, J. Chen, J. Benesty et al., An integrated solution for online multichannel noise tracking and reduction. IEEE Trans. Audio Speech Lang. Process. 19(7), 2159–2169 (2011). https://doi.org/10.1109/TASL.2011.2118205
C.H. Taal, R.C. Hendriks, R. Heusdens, et al., A short-time objective intelligibility measure for time-frequency weighted noisy speech. In: proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 4214–4217, (2010). https://doi.org/10.1109/ICASSP.2010.5495701
M. Trawicki, M.T. Johnson, Improvements of the Beta-order minimum mean-square error (MMSE) spectral amplitude estimator using Chi priors. In: proc. Thirteenth Annual Conference of the International Speech Communication Association (INTERSPEECH), pp 939–942 (2012a)
M.B. Trawicki, M.T. Johnson, Distributed multichannel speech enhancement with minimum mean-square short time spectral amplitude, log-spectral amplitude and spectral phase estimation. Signal Process., pp 345–356 (2012b)
M.B. Trawicki, M.T. Johnson, Speech enhancement using Bayesian estimators of the perceptually-motivated short-time spectral amplitude (STSA) with Chi speech priors. Speech Commun. 57, 101–113 (2014). https://doi.org/10.1016/j.specom.2013.09.009
Y. Wakabayashi, T. Fukumori, M. Nakayama et al., Single-channel speech enhancement with phase reconstruction based on phase distortion averaging. IEEE/ACM Trans. Audio Speech Lang. Process. 26(9), 1559–1569 (2018). https://doi.org/10.1109/TASLP.2018.2831632
D. Wang, J. Lim, The unimportance of phase in speech enhancement. IEEE Trans. Acoust. Speech Signal Process. 30(4), 679–681 (1982). https://doi.org/10.1109/TASSP.1982.1163920
P.J. Wolfe, S.J. Godsill, Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement. EURASIP J. Adv. Signal Process. 10, 1043–1051 (2003)
Z. Zhang, D.S. Williamson, Y. Shen, Impact of phase distortion and phase-insensitive speech enhancement on speech quality perceived by hearing-impaired listeners. J. Acoust. Soc. Am. 148(4), 2650–2650 (2020). https://doi.org/10.1121/1.5147369
N. Zheng, X.L. Zhang, Phase-aware speech enhancement based on deep neural networks. IEEE/ACM Trans. Audio Speech Lang. Process. 27(1), 63–76 (2019). https://doi.org/10.1109/TASLP.2018.2870742

Acknowledgements

We are grateful to the Department of Medical Physics and Acoustics, University of Oldenburg, for allowing access to their recorded data.

Author information

Authors and Affiliations

Electrical Engineering Department, Yazd University, Yazd, 89195-741, Iran
Raziyeh Ranjbaryan & Hamid Reza Abutalebi

Authors

Raziyeh Ranjbaryan
Hamid Reza Abutalebi

Corresponding author

Correspondence to Raziyeh Ranjbaryan.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

A generalized form of (9) is presented in Example 1 of Chapter 5 of [34]. It gives the probability distribution of a random variable Y that is a linear function of a random variable X:

$$\begin{aligned} Y = aX + b, \end{aligned}$$

(A1)

where a and b are deterministic constants. For $ b = 0 $, this equation reduces to our case. Although the ratio Y/X of two random variables X and Y is in general itself a random variable, in a special case such as the present one, $ a = \dfrac{Y}{X} $ is a deterministic value. In the problem at hand

$$\begin{aligned} {\left\{ \begin{array}{ll} Y \longleftarrow A_{m} \\ X \longleftarrow A_{1} \end{array}\right. } \end{aligned}$$

(A2)

where both random variables follow a Rayleigh distribution, and

$$\begin{aligned} {\left\{ \begin{array}{ll} a \longleftarrow C_{m} \\ b \longleftarrow 0 \end{array}\right. } \end{aligned}$$

(A3)

so $ C_{m} $ is a deterministic value (the ratio of two standard deviations), as explained in the manuscript.

Based on [34], the distribution function $ F_y(y) $ of Y is computed as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} F_y(y) = P\{X\le \dfrac{y-b}{a}\}= F_x(\dfrac{y-b}{a}), \qquad &{} a > 0, \\ F_y(y)= P\{X \ge \dfrac{y-b}{a}\}= 1- F_x(\dfrac{y-b}{a}), \qquad &{} a < 0, \end{array}\right. } \end{aligned}$$

(A4)

and the PDF is computed as

$$\begin{aligned} f_y(y) = \frac{1}{|a|} \, f_x\left( \dfrac{y-b}{a} \right) . \end{aligned}$$

(A5)
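The change-of-variable rule in (A5) can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using a unit-scale Rayleigh density for X purely as an illustration; the helper names `linear_transform_pdf`, `rayleigh_pdf`, and `check` are hypothetical and not part of the original work.

```python
import numpy as np

def linear_transform_pdf(f_x, a, b):
    """Return f_y for Y = a*X + b using (A5): f_y(y) = f_x((y - b)/a) / |a|."""
    if a == 0:
        raise ValueError("a must be nonzero")
    return lambda y: f_x((np.asarray(y, dtype=float) - b) / a) / abs(a)

def rayleigh_pdf(x, sigma=1.0):
    """Rayleigh density with scale sigma; zero for negative arguments."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, x / sigma**2 * np.exp(-x**2 / (2 * sigma**2)), 0.0)

def check(a, b, n=200_000, seed=0):
    """Largest gap between a histogram of Y = a*X + b and the density from (A5)."""
    rng = np.random.default_rng(seed)
    y = a * rng.rayleigh(scale=1.0, size=n) + b
    hist, edges = np.histogram(y, bins=60, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    f_y = linear_transform_pdf(rayleigh_pdf, a, b)
    return float(np.max(np.abs(hist - f_y(centers))))

# Both signs of a are covered; the gap should be small in each case.
print(check(a=2.0, b=0.0))
print(check(a=-0.5, b=1.0))
```

The absolute value of a in (A5) is what keeps the density nonnegative when a is negative, which is why the second call uses a negative slope.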

In our problem, the amplitude $ A_1 $ has the super-Gaussian distribution

$$\begin{aligned} p(A_1) = {\left\{ \begin{array}{ll} \dfrac{\mu ^{\nu +1} A_1^\nu }{\Gamma (\nu +1) \sigma ^{\nu +1}_x(1)} \exp \left( \dfrac{-\mu A_1}{\sigma _x(1)} \right) , \qquad &{} A_1 > 0, \\ 0, \qquad &{} \text {else}, \end{array}\right. } \end{aligned}$$

(A6)

hence, applying (A5) with $ a = C_{m} > 0 $ and $ b = 0 $, the PDF of $ A_{m} = C_{m}A_1 $ is given by

$$\begin{aligned} p(A_{m}) =\frac{1}{C_{m}} \, p(\dfrac{A_{m}}{C_{m}}), \end{aligned}$$

(A7)

where $ p(\cdot ) $ on the right-hand side denotes the density of $ A_1 $ in (A6); consequently:

$$\begin{aligned} p(A_{m}) = {\left\{ \begin{array}{ll} \dfrac{\mu ^{\nu +1}A_m^\nu }{ \Gamma (\nu +1)(C_m\sigma _x(1))^{\nu +1}} \exp \left( \dfrac{-\mu A_m}{C_m\sigma _x(1)} \right) , \qquad &{} A_m > 0, \\ 0, \qquad &{} \text {else}, \end{array}\right. } \end{aligned}$$

(A8)

which is again a super-Gaussian distribution, now with variance $ \sigma ^2_x(m) = C_m^2\sigma ^2_x(1) $, as mentioned in the manuscript.
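As a sanity check on the step from (A6) and (A7) to (A8), the sketch below evaluates both sides numerically. It assumes NumPy, and the values of $ \mu $, $ \nu $, $ \sigma _x(1) $ and $ C_m $ are arbitrary illustrative choices rather than values from the paper; the function name `amplitude_pdf` is hypothetical. The Monte Carlo part relies on the observation that the density in (A6) has the form of a gamma density with shape $ \nu + 1 $ and scale $ \sigma _x(1)/\mu $.

```python
import math
import numpy as np

def amplitude_pdf(A, sigma, mu, nu):
    """Density of the form (A6)/(A8) with scale parameter sigma; zero for A <= 0."""
    A = np.asarray(A, dtype=float)
    norm = mu ** (nu + 1) / (math.gamma(nu + 1) * sigma ** (nu + 1))
    safe_A = np.where(A > 0, A, 1.0)  # placeholder value outside the support
    return np.where(A > 0, norm * safe_A**nu * np.exp(-mu * safe_A / sigma), 0.0)

# Illustrative values only (assumptions, not taken from the paper).
mu, nu, sigma_1, C_m = 1.5, 0.5, 1.0, 0.6

A_m = np.linspace(1e-3, 10.0, 2000)
via_A7 = amplitude_pdf(A_m / C_m, sigma_1, mu, nu) / C_m  # right-hand side of (A7)
via_A8 = amplitude_pdf(A_m, C_m * sigma_1, mu, nu)        # closed form (A8)
print(np.allclose(via_A7, via_A8))

# Monte Carlo check: draw A_1 from the gamma form of (A6), scale by C_m, and
# compare the sample mean with the mean implied by (A8), (nu + 1) * C_m * sigma_1 / mu.
rng = np.random.default_rng(0)
A_1 = rng.gamma(shape=nu + 1, scale=sigma_1 / mu, size=500_000)
print(np.mean(C_m * A_1), (nu + 1) * C_m * sigma_1 / mu)
```

Scaling the amplitude by $ C_m $ only rescales the parameter $ \sigma _x(1) $ to $ C_m\sigma _x(1) $, so the same function serves for both microphones.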

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ranjbaryan, R., Abutalebi, H.R. Statistically Optimal Joint Multimicrophone MAP Estimators Under Super-Gaussian Assumption. Circuits Syst Signal Process 43, 1492–1517 (2024). https://doi.org/10.1007/s00034-023-02515-y

Received: 24 February 2023
Revised: 07 September 2023
Accepted: 08 September 2023
Published: 04 November 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s00034-023-02515-y
