An information based feedback control for audio-motor binaural localization

Autonomous Robots

Abstract

In static scenarios, binaural sound localization is fundamentally limited by front-back ambiguity and distance non-observability. Over the past few years, “active” schemes have been shown to overcome these shortcomings by combining spatial binaural cues with the motor commands of the sensor. In this context, given a Gaussian prior on the relative position to a source, this paper determines an admissible motion of a binaural head which leads, on average, to the most informative one-step-ahead audio-motor localization. To this end, a constrained optimization problem is set up, which consists of maximizing the entropy of the next predicted measurement probability density function over a cylindrical admissible set. The method is appraised through geometrical arguments and validated in simulations and in real-life robotic experiments.

Notes

  1. Consider again the dynamic equation (2) with no dynamic noise, and assume that the posterior covariance \(\overline{P}_{k|k}\) of the full state \(X_k\) (defined in \(\mathbb {R}^3\)) is \(\overline{P}_{k|k} = {{\mathrm{diag}}}(0,P_{k|k})\). As the vector \(R^T(\phi ){T}\) is constant, the next “full” predicted covariance \(\overline{P}_{k+1|k}\) writes as \(\overline{P}_{k+1|k} = R^T(\phi )\overline{P}_{k|k}R(\phi )\), with \(R(\phi ) = {{\mathrm{diag}}}(1,r(\phi ))\), and \({|R(\phi )|{}={}|r(\phi )|{}={}1}\). Consequently, \(\overline{P}_{k+1|k} = {{\mathrm{diag}}}(0,P_{k+1|k})\) with \({|P_{k+1|k}|=|r^T(\phi )P_{k|k}r(\phi )|=|P_{k|k}|}\).
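The determinant identity in this note can be checked numerically. The following is a minimal sketch, assuming a planar rotation matrix \(r(\phi )\) with the usual cosine/sine layout and an arbitrary illustrative covariance; the helper names (`rot`, `matmul`, `transpose`, `det`) and all numerical values are hypothetical and not taken from the paper.

```python
import math

def rot(phi):
    """2x2 rotation matrix r(phi) as nested lists (assumed convention)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def det(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Illustrative posterior covariance P_{k|k} (symmetric positive definite).
P = [[0.5, 0.1], [0.1, 0.2]]
r = rot(0.7)  # arbitrary rotation angle in radians

# Predicted covariance r^T(phi) P r(phi), as in the note.
P_pred = matmul(matmul(transpose(r), P), r)

# The two determinants agree up to floating-point rounding.
print(det(P), det(P_pred))
```

Since the entropy of a Gaussian depends on its covariance only through the determinant, this check illustrates why, in the noise-free case, the motion alone leaves the entropy of the predicted state unchanged.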

Acknowledgements

The authors would like to thank Matthieu Herrb, Anthony Mallet, and Xavier Dollat for their invaluable help.

Author information

Corresponding author

Correspondence to Patrick Danès.

Additional information

This work was partially supported by EU FET Grant Two!Ears, ICT-618075, www.twoears.eu.

This is one of several papers published in Autonomous Robots comprising the Special Issue on Active Perception.

Appendix

Consider the posterior state pdf \(p(x_k|z_{1:k})\) of the sensor-to-source position at time k, and \({\mathcal {N}}(x_k;{\hat{x}}_{k|k},P_{k|k})\) the approximate Gaussian belief. This pdf can be mapped into the 1D Gaussian approximation \({\mathcal {N}}(z_{k+1};\hat{z}_{k+1|k},S_{k+1|k})\) of the predicted measurement pdf \(p(z_{k+1}|z_{1:k})\), by using the unscented transform. The aim is then to maximize the variance \(S_{k+1|k}\) so as to increase the entropy \(h(z_{k+1}|z_{1:k})\). This involves the composition of several functions.
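The link between variance and entropy invoked here is the standard closed form for the differential entropy of a scalar Gaussian. Under the Gaussian approximation of the predicted measurement pdf stated above, it reads:

$$\begin{aligned} h(z_{k+1}|z_{1:k}) \approx \frac{1}{2}\log \left( 2\pi e\, S_{k+1|k}\right) , \end{aligned}$$

so that, within this approximation, the entropy is a monotonically increasing function of \(S_{k+1|k}\), and maximizing \(S_{k+1|k}\) or its logarithm amounts to maximizing the entropy.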

First the sigma-points \(\left\{ X_{i}^{-}\right\} \) corresponding to \({p(x_{k}|z_{1:k}) = {\mathcal {N}}(x_k;{\hat{x}}_{k|k},P_{k|k})}\) are computed from the posterior mean \({\hat{x}}_{k|k}\) of the state vector at time k and the Cholesky decomposition \(P_{k|k} = L_{k|k}L_{k|k}^T\) of the posterior covariance:

$$\begin{aligned} \left\{ X_{i}^{-}\right\} = {\text {Sigma}}\_{\text {points}} \left( {\hat{x}}_{k|k},L_{k|k} \right) \end{aligned}$$
(15)
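As an illustration of (15), the sketch below generates sigma-points for a two-dimensional state. It assumes the basic symmetric unscented transform with a single scaling parameter `kappa` and identical mean and covariance weights; the exact variant and tuning used by the authors are not given in this excerpt, so the function names, the value of `kappa`, and the numerical inputs are all hypothetical.

```python
import math

def cholesky_2x2(P):
    """Lower-triangular L with P = L L^T for a 2x2 SPD matrix."""
    l11 = math.sqrt(P[0][0])
    l21 = P[1][0] / l11
    l22 = math.sqrt(P[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def sigma_points(x_hat, L, kappa=1.0):
    """Symmetric sigma-points and weights (assumed basic unscented transform).

    Returns 2n+1 points: the mean, plus the mean shifted by
    +/- sqrt(n + kappa) times each column of L.
    """
    n = len(x_hat)
    scale = math.sqrt(n + kappa)
    points = [list(x_hat)]
    for j in range(n):
        col = [scale * L[i][j] for i in range(n)]
        points.append([x_hat[i] + col[i] for i in range(n)])
        points.append([x_hat[i] - col[i] for i in range(n)])
    weights = [kappa / (n + kappa)] + [1.0 / (2.0 * (n + kappa))] * (2 * n)
    return points, weights

# Illustrative posterior mean and covariance of the sensor-to-source position.
x_hat = [0.3, 2.0]
P = [[0.04, 0.01], [0.01, 0.09]]

points, weights = sigma_points(x_hat, cholesky_2x2(P))
for p, w in zip(points, weights):
    print(p, w)
```

By construction, the weighted sample mean and covariance of these points reproduce `x_hat` and `P`, which is the property the subsequent propagation steps rely on.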

The sigma-points \(\left\{ X_{i}^{+}\right\} \) of the next predicted state pdf \(p(x_{k+1}|z_{1:k}) = {\mathcal {N}}(x_{k+1};{\hat{x}}_{k+1|k},P_{k+1|k})\) are obtained by applying the translation and rotation to each sigma-point in the set \(\left\{ X_{i}^{-}\right\} \). Note that (2) is defined as a function of \((T_y,T_z,\phi )\), so that

$$\begin{aligned} \forall i,\ X_{i}^{+} = \Phi _{X_{i}^-}(T_y,T_z,\phi ). \end{aligned}$$
(16)

Then the set of sigma-points \(\left\{ Z_{i}^+\right\} \) of the predicted measurement pdf \(p(z_{k+1}|z_{1:k}) = {\mathcal {N}}(z_{k+1};\hat{z}_{k+1|k},S_{k+1|k})\) can be obtained from \(\left\{ X_{i}^+\right\} \) defined in (16) by:

$$\begin{aligned} \forall i,\ Z_{i}^+ = l\left( -\mathrm {atan2}\left( X_{i}^+(1),X_{i}^+(2) \right) \right) , \end{aligned}$$
(17)

with \(X_{i}^+(1)\) and \(X_{i}^+(2)\) the components of \(X_{i}^+\), and \(l(\cdot )\) the measurement equation used to guide the exploration. Finally the mean \(\hat{z}_{k+1|k}\) and variance \(S_{k+1|k}\) of \(p(z_{k+1}|z_{1:k})\) are computed by

$$\begin{aligned} \hat{z}_{k+1|k}= & {} \sum _i w_m^{i}Z_{i}^+\end{aligned}$$
(18)
$$\begin{aligned} S_{k+1|k}= & {} \sum _i w_c^{i}\left( Z_{i}^+ - \hat{z}_{k+1|k}\right) ^2, \end{aligned}$$
(19)

with \(\left\{ w_m^i\right\} \) and \(\left\{ w_c^i\right\} \) the classic weights of the unscented transform.
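Steps (16) to (19) can be sketched as follows. Two ingredients are not specified in this excerpt and are therefore assumptions: the motion model \(\Phi \) is taken as a translation by \((T_y,T_z)\) followed by a rotation by \(r^T(\phi )\), and the measurement function \(l\) is replaced by a plain sine as a placeholder. The function names, the equal mean and covariance weights, and the sigma-point values are likewise hypothetical.

```python
import math

def predict_sigma_point(x, T_y, T_z, phi):
    """Assumed motion model: subtract the translation, then apply r(phi)^T."""
    c, s = math.cos(phi), math.sin(phi)
    dy, dz = x[0] - T_y, x[1] - T_z
    return (c * dy + s * dz, -s * dy + c * dz)

def predicted_measurement_stats(points, weights, T_y, T_z, phi, l=math.sin):
    """Return (z_hat, S) following (16)-(19), with w_m = w_c = weights.

    `l` is a placeholder measurement function, not the one from the paper.
    """
    Z = []
    for x in points:
        y_new, z_new = predict_sigma_point(x, T_y, T_z, phi)    # (16)
        Z.append(l(-math.atan2(y_new, z_new)))                  # (17)
    z_hat = sum(w * z for w, z in zip(weights, Z))              # (18)
    S = sum(w * (z - z_hat) ** 2 for w, z in zip(weights, Z))   # (19)
    return z_hat, S

# Illustrative sigma-points for mean (0, 2), covariance diag(0.04, 0.09),
# basic unscented transform with n = 2 and kappa = 1 (scale sqrt(3)).
a, b = 0.2 * math.sqrt(3.0), 0.3 * math.sqrt(3.0)
points = [(0.0, 2.0), (a, 2.0), (-a, 2.0), (0.0, 2.0 + b), (0.0, 2.0 - b)]
weights = [1.0 / 3.0] + [1.0 / 6.0] * 4

# Predicted measurement statistics without motion and with a lateral step.
z_hat_still, S_still = predicted_measurement_stats(points, weights, 0.0, 0.0, 0.0)
z_hat_move, S_move = predicted_measurement_stats(points, weights, 0.5, 0.0, 0.0)
print(z_hat_still, S_still)
print(z_hat_move, S_move)
```

Evaluating the function for several candidate motions gives the values of \(S_{k+1|k}\) that the optimization compares; the numbers printed here depend entirely on the placeholder model and carry no experimental meaning.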

The log of the variance \(S_{k+1|k}\) is a function of the finite translation and rotation, i.e., \(\log S_{k+1|k} = {F}_{k}(T_y,T_z,\phi )\). However, the maximum of this function is not analytically tractable. Its gradient at \({U} = (T_y,T_z,\phi )\) is therefore computed as follows.

The first-order Taylor expansions of the functions \(\Phi _{X_{i}^-}\), \(\mathrm {atan2}\), l, and \(\log \) are composed around U, with infinitesimal translations and rotation \({du} = (dT_y, dT_z, d\phi )^T\):

$$\begin{aligned}&\Phi _{X_{i}^-}({U} + {du}) = \Phi _{X_i^-}({U}) + {J{\Phi _{X_i^-}}}({U}) \, {du} \nonumber \\&\mathrm {atan2}(u,v){}={}\mathrm {atan2}(u_0,v_0){}+{}{{\nabla }^T\mathrm {atan2}}(u_0,v_0) \, \begin{pmatrix} u - u_0\\ v-v_0 \end{pmatrix} \nonumber \\&l(w) = l(w_0) + l'(w_0)(w-w_0) \nonumber \\&\log (r) = \log (r_0) + \frac{1}{r_0}(r-r_0) \end{aligned}$$
(20)

with \({\nabla }\) the gradient operator and \(J{\Phi _{X_i^-}}({U})\) the Jacobian of \(\Phi _{X_i^-}\) at U. The result of the composition, denoted \(Z_{i}(dT_y,dT_z,d\phi )\), is then used to retrieve the mean and the variance with (18) and (19). Finally, the first-order Taylor expansion of \(F_k\) around U is obtained, highlighting the gradient \({\nabla }F_k\):

$$\begin{aligned} F_k\left( {{U} + {du}}\right) = F_k\left( {U}\right) + {\nabla }^TF_k ({U}) \, {du}. \end{aligned}$$
(21)
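For readers who want the composition spelled out, the chain rule applied to (16) to (19) gives the following expressions. This is a sketch derived from the equations above rather than a formula quoted from the paper; the shorthand \(\theta _i = -\mathrm {atan2}(X_{i}^+(1),X_{i}^+(2))\) is introduced here for convenience, and all gradients are taken with respect to U.

$$\begin{aligned} {\nabla }Z_{i}^+&= -\,l'(\theta _i)\, \left[ J{\Phi _{X_i^-}}({U})\right] ^T {\nabla }\mathrm {atan2}\left( X_{i}^+(1),X_{i}^+(2)\right) , \\ {\nabla }\hat{z}_{k+1|k}&= \sum _i w_m^{i}\, {\nabla }Z_{i}^+, \\ {\nabla }S_{k+1|k}&= 2\sum _i w_c^{i}\left( Z_{i}^+ - \hat{z}_{k+1|k}\right) \left( {\nabla }Z_{i}^+ - {\nabla }\hat{z}_{k+1|k}\right) , \\ {\nabla }F_k({U})&= \frac{1}{S_{k+1|k}}\, {\nabla }S_{k+1|k}, \end{aligned}$$

where \({\nabla }\mathrm {atan2}(u,v) = \frac{1}{u^2+v^2}\,(v,\,-u)^T\). The last line is the gradient appearing in (21).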

Cite this article

Bustamante, G., Danès, P., Forgue, T. et al. An information based feedback control for audio-motor binaural localization. Auton Robot 42, 477–490 (2018). https://doi.org/10.1007/s10514-017-9639-8

  • DOI: https://doi.org/10.1007/s10514-017-9639-8
