Audio Zoom for Smartphones Based on Multiple Adaptive Beamformers

Duong, Ngoc Q. K.; Berthet, Pierre; Zabre, Sidkièta; Kerdranvat, Michel; Ozerov, Alexey; Chevallier, Louis

doi:10.1007/978-3-319-53547-0_12

Conference paper
First Online: 15 February 2017

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10169)

Abstract

Some recent smartphones offer a so-called audio zoom feature, which focuses sound capture on the front direction while progressively attenuating surrounding sounds as the video zooms in. This paper proposes a complete implementation of such a function, involving two major steps. First, the targeted sound source is extracted by a novel approach that combines multiple adaptive beamformers having different look directions with a post-processing algorithm. Second, the spatial zooming effect is created by leveraging the microphone signals and the enhanced target source. A subjective test with real-world audio recordings, made using a mock-up that simulates the usual shape of a smartphone, confirms the rich user experience obtained with the proposed system.
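The two steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the beamformer, the post-processing, or the mixing rule. The sketch assumes a hypothetical two-microphone far-field array, a textbook MVDR beamformer with diagonal loading as the adaptive beamformer, and a plain linear crossfade as the zooming effect. All function names, the 0.1 m spacing, and the loading value are illustrative assumptions.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vector(angle_deg, freq_hz, mic_spacing_m):
    """Far-field steering vector for a hypothetical 2-microphone array."""
    delay = mic_spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [1.0 + 0j, cmath.exp(-2j * math.pi * freq_hz * delay)]


def mvdr_weights(cov, steer, loading=1e-3):
    """Textbook MVDR weights w = R^-1 d / (d^H R^-1 d) for a 2x2 Hermitian R.

    Diagonal loading (R + loading * I) is one common robustness measure;
    it is an assumption here, not necessarily what the paper uses.
    """
    a = cov[0][0] + loading
    b = cov[0][1]
    c = cov[1][0]
    d = cov[1][1] + loading
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    r_inv_d = [inv[i][0] * steer[0] + inv[i][1] * steer[1] for i in range(2)]
    denom = sum(steer[i].conjugate() * r_inv_d[i] for i in range(2))
    return [x / denom for x in r_inv_d]


def beamform(weights, mic_bins):
    """Apply y = w^H x to one frequency bin of the two microphone signals."""
    return sum(w.conjugate() * x for w, x in zip(weights, mic_bins))


def zoom_mix(mic_sample, target_sample, zoom):
    """Linear crossfade: zoom=0 keeps the microphone signal, zoom=1 the target."""
    zoom = min(max(zoom, 0.0), 1.0)  # clamp to [0, 1]
    return (1.0 - zoom) * mic_sample + zoom * target_sample


if __name__ == "__main__":
    # Made-up noise covariance for one frequency bin (illustrative only).
    cov = [[1.0 + 0j, 0.2 + 0.1j], [0.2 - 0.1j, 1.0 + 0j]]
    # One beamformer per look direction, as in "multiple adaptive beamformers".
    for angle in (0.0, 90.0, 180.0):
        d = steering_vector(angle, 1000.0, 0.1)
        w = mvdr_weights(cov, d)
        print(angle, abs(beamform(w, d)))
    print(zoom_mix(1.0, 0.0, 0.25))
```

Each set of weights passes a signal arriving from its own look direction with unit gain, which is the MVDR distortionless constraint. How the outputs of the different look directions are combined by the post-processing step is left out, since the abstract gives no detail on it.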

This work was done while Pierre Berthet and Sidkièta Zabre were with Technicolor.

Notes

1. https://www.youtube.com/watch?v=7DEyuapmRCs
2. http://www.idownloadblog.com/2012/09/12/iphone-5-three-mics/
3. Note that the preliminary study in [8] did not show a remarkable advantage of BSS over beamforming in some specific setups, such as a single target source in a noise field.
4. The demonstration was presented at the Show and Tell session of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016).
5. Note that the output of RMVDR usually contains some artifacts due to the nonlinear processing, and that the signal distortion is more severe at high frequencies, where errors in the array geometry have more impact.
6. Matlab code is available at: http://webee.technion.ac.il/Sites/People/IsraelCohen/Download/omlsa.m

References

1. Avendano, C., Solbach, L.: Audio zoom. US Patent Submitted 20 110 129 095A1 (2011). http://www.google.com/patents/US20110129095
2. Lee, K., Song, H., Lee, Y., Son, Y., Kim, J.: Mobile terminal and audio zooming method thereof. US Patent Submitted 20 130 342 730A1 (2013). http://www.google.com/patents/US20130342730
3. Veen, B.V., Buckley, K.: Beamforming: a versatile approach to spatial filtering. IEEE ASSP Mag. 5(2), 4–24 (1988)
4. Li, J., Stoica, P.: Robust Adaptive Beamforming. Wiley, New York (2005)
5. Makino, S., Lee, T.-W., Sawada, H.: Blind Speech Separation. Springer, New York (2007)
6. Vincent, E., Araki, S., Theis, F., Nolte, G., Bofill, P., Sawada, H., Ozerov, A., Gowreesunker, V., Lutter, D., Duong, N.Q.K.: The signal separation campaign (2007–2010): achievements and remaining challenges. Sig. Process. 92, 1928–1936 (2012)
7. Duong, N.Q.K., Vincent, E., Gribonval, R.: Under-determined reverberant audio source separation using a full-rank spatial covariance model. IEEE Trans. Audio Speech Lang. Process. 18(7), 1830–1840 (2010)
8. Thiemann, J., Vincent, E.: An experimental comparison of source separation and beamforming techniques for microphone array signal enhancement. In: Proceedings of International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–5 (2013)
9. Bitzer, J., Simmer, K.U.: Superdirective microphone arrays. In: Brandstein, M., Ward, D. (eds.) Microphone Arrays. Digital Signal Processing, pp. 19–38. Springer, Heidelberg (2010)
10. Gu, Y., Leshem, A.: Robust adaptive beamforming based on interference covariance matrix reconstruction and steering vector estimation. IEEE Trans. Sig. Process. 60(7), 3881–3885 (2012)
11. Takada, S., Kanba, S., Ogawa, T., Akagiri, K., Kobayashi, T.: Sound source separation using null-beamforming and spectral subtraction for mobile devices. In: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 30–33 (2007)
12. Bianchi, L., D’Amelio, F., Antonacci, F., Sarti, A., Tubaro, S.: A plenacoustic approach to acoustic signal extraction. In: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2005)
13. Markovich, S., Gannot, S., Cohen, I.: Multichannel eigenspace beamforming in a reverberant noisy environment with multiple interfering speech signals. IEEE Trans. Audio Speech Lang. Process. 17(6), 1071–1086 (2009)
14. Loesch, B., Yang, B.: Online blind source separation based on time-frequency sparseness. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 117–120 (2009)
15. Masnadi-Shirazi, A., Rao, B.D.: Separation and tracking of multiple speakers in a reverberant environment using a multiple model particle filter glimpsing method. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 2516–2519 (2011)
16. Mestre, X., Lagunas, M.: On diagonal loading for minimum variance beamformers. In: Proceedings of IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), pp. 459–462 (2003)
17. Zelinski, R.: A microphone array with adaptive post-filtering for noise reduction in reverberant rooms. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 2578–2581 (1998)
18. McCowan, I.A., Bourlard, H.: Microphone array post-filter for diffuse noise field. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1905–1908 (2002)
19. Duong, N.Q.K., Vincent, E., Gribonval, R.: Spatial location priors for Gaussian model based reverberant audio source separation. EURASIP J. Adv. Sig. Process. 1, 1–11 (2013)

Author information

Authors and Affiliations

Technicolor, 975 avenue des Champs Blancs, 35576, Cesson Sévigné, France
Ngoc Q. K. Duong, Michel Kerdranvat, Alexey Ozerov & Louis Chevallier
3D Sound Labs, 22 rue de la Rigourdière, 35510, Cesson Sévigné, France
Pierre Berthet
Altran Technologies, 3 Rue Louis Braille, 35136, Saint-Jacques-de-la-Lande, France
Sidkièta Zabre

Corresponding author

Correspondence to Ngoc Q. K. Duong.

Editor information

Editors and Affiliations

Institute of Information Theory and Automation, Prague, Czech Republic
Petr Tichavský
Sharif University of Technology, Tehran, Iran
Massoud Babaie-Zadeh
Grenoble-Alpes University, Grenoble, France
Olivier J.J. Michel
Toulon University, Toulon, France
Nadège Thirion-Moreau

About this paper

Cite this paper

Duong, N.Q.K., Berthet, P., Zabre, S., Kerdranvat, M., Ozerov, A., Chevallier, L. (2017). Audio Zoom for Smartphones Based on Multiple Adaptive Beamformers. In: Tichavský, P., Babaie-Zadeh, M., Michel, O., Thirion-Moreau, N. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2017. Lecture Notes in Computer Science, vol 10169. Springer, Cham. https://doi.org/10.1007/978-3-319-53547-0_12

DOI: https://doi.org/10.1007/978-3-319-53547-0_12
Published: 15 February 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-53546-3
Online ISBN: 978-3-319-53547-0
eBook Packages: Computer Science; Computer Science (R0)
