Abstract
We address the problem of controlling a small team of robots to estimate the location of a mobile target using nonlinear range-only sensors. Our control law maximizes the mutual information between the team’s estimate and future measurements over a finite time horizon. Because the computations associated with such policies scale poorly with the number of robots, the time horizon associated with the policy, and typical nonparametric representations of the belief, we design approximate representations that enable real-time operation. The main contributions of this paper include the control policy, an algorithm for approximating the belief state, and an extensive study of the performance of these algorithms using simulations and real-world experiments in complex, indoor environments.
References
Binney, J., Krause, A., & Sukhatme, G. S. (2013). Optimizing waypoints for monitoring spatiotemporal phenomena. International Journal of Robotics Research, 32(8), 873–888.
Charrow, B., Kumar, V., & Michael, N. (2013). Approximate representations for multi-robot control policies that maximize mutual information. In Proceedings of Robotics: Science and Systems, Berlin, Germany.
Charrow, B., Michael, N., & Kumar, V. (2014). Cooperative multi-robot estimation and control for radio source localization. The International Journal of Robotics Research, 33, 569–580.
Chung, T., Hollinger, G., & Isler, V. (2011). Search and pursuit-evasion in mobile robotics. Autonomous Robots, 31(4), 299–316.
Cover, T. M., & Thomas, J. A. (2004). Elements of information theory. New York: Wiley.
Dame, A., & Marchand, E. (2011). Mutual information-based visual servoing. The IEEE Transactions on Robotics, 27(5), 958–969.
Djugash, J., & Singh, S. (2012). Motion-aided network SLAM with range. International Journal of Robotics Research, 31(5), 604–625.
Fannes, M. (1973). A continuity property of the entropy density for spin lattice systems. Communications in Mathematical Physics, 31(4), 291–294.
Fox, D. (2003). Adapting the sample size in particle filters through KLD-sampling. International Journal of Robotics Research, 22(12), 985–1003.
Golovin, D., & Krause, A. (2011). Adaptive submodularity: Theory and applications in active learning and stochastic optimization. The Journal of Artificial Intelligence Research, 42(1), 427–486.
Grocholsky, B. (2002). Information-theoretic control of multiple sensor platforms. PhD thesis, Australian Centre for Field Robotics.
Hahn, T. (2013, January). Cuba. http://www.feynarts.de/cuba/.
Hoffmann, G., & Tomlin, C. (2010). Mobile sensor network control using mutual information methods and particle filters. The IEEE Transactions on Automatic Control, 55(1), 32–47.
Hollinger, G., & Sukhatme, G. (2013). Samplingbased motion planning for robotic information gathering. In Proceedings of Robotics: Science and Systems, Berlin, Germany.
Hollinger, G., Djugash, J., & Singh, S. (2011). Target tracking without line of sight using range from radio. Autonomous Robots, 32(1), 1–14.
Huber, M., & Hanebeck, U. (2008). Progressive Gaussian mixture reduction. In International Conference on Information Fusion.
Huber, M., Bailey, T., Durrant-Whyte, H., & Hanebeck, U. (2008, August). On entropy approximation for Gaussian mixture random vectors. In Conference on Multisensor Fusion and Integration for Intelligent Systems, Seoul, Korea (pp. 181–188).
Julian, B. J., Angermann, M., Schwager, M., & Rus, D. (2011, September). A scalable information theoretic approach to distributed robot coordination. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, USA (pp. 5187–5194).
Kassir, A., Fitch, R., & Sukkarieh, S. (2012, May). Decentralised information gathering with communication costs. In Proceedings of the IEEE International Conference on Robotics and Automation, Saint Paul, USA (pp. 2427–2432).
Krause, A., & Guestrin, C. (2005). Near-optimal nonmyopic value of information in graphical models. In Conference on Uncertainty in Artificial Intelligence (pp. 324–331).
nanoPAN 5375 Development Kit. (2013, September). http://nanotron.com/EN/pdf/Factsheet_nanoPAN_5375_Dev_Kit.pdf.
Owen, D. (1980). A table of normal integrals. Communications in Statistics - Simulation and Computation, 9(4), 389–419.
Park, J.-G., Charrow, B., Battat, J., Curtis, D., Minkov, E., Hicks, J., Teller, S., & Ledlie, J. (2010). Growing an organic indoor location system. In Proceedings of International Conference on Mobile Systems, Applications, and Services, San Francisco, CA.
ROS. (2013, January). http://www.ros.org/wiki/.
Runnalls, A. (2007). Kullback–Leibler approach to Gaussian mixture reduction. IEEE Transactions on Aerospace and Electronic Systems, 43(3), 989–999.
Ryan, A., & Hedrick, J. (2010). Particle filter based information-theoretic active sensing. Robotics and Autonomous Systems, 58(5), 574–584.
Silva, J. F., & Parada, P. (2011). Sufficient conditions for the convergence of the Shannon differential entropy. In IEEE Information Theory Workshop, Paraty, Brazil (pp. 608–612).
Singh, A., Krause, A., Guestrin, C., & Kaiser, W. J. (2009). Efficient informative sensing using multiple robots. The Journal of Artificial Intelligence Research, 34(1), 707–755.
Stump, E., Kumar, V., Grocholsky, B., & Shiroma, P. (2009). Control for localization of targets using range-only sensors. International Journal of Robotics Research, 28(6), 743.
Thrun, S., Burgard, W., & Fox, D. (2008). Probabilistic robotics. Cambridge: MIT Press.
Vidal, R., Shakernia, O., Jin Kim, H., Shim, D., & Sastry, S. (2002). Probabilistic pursuit-evasion games: Theory, implementation, and experimental evaluation. IEEE Transactions on Robotics and Automation, 18(5), 662–669.
Whaite, P., & Ferrie, F. P. (1997). Autonomous exploration: Driven by uncertainty. The IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(3), 193–205.
Acknowledgments
We gratefully acknowledge the support of ONR Grant N00014-07-1-0829, ARL Grant W911NF-08-2-0004, and AFOSR Grant FA9550-10-1-0567. Benjamin Charrow was supported by an NDSEG fellowship from the Department of Defense.
Appendix: Proofs
1.1 Integrating Gaussians over a halfspace
Proving Lemmas 1 and 2 requires integrating Gaussian functions over a halfspace. The following identities will be used several times. Owen (1980) gives these identities for 1-dimensional Gaussians, but we have been unable to find a reference for the multivariate case. For completeness, we prove them here.
Lemma 6
If \(f(x)=\mathcal {N}(x; \mu , \Sigma )\) is a \(k\)-dimensional Gaussian and \(A=\{x : a^Tx + b > 0\}\) is a halfspace then

$$\begin{aligned} \int _A f(x)\,dx&= \Phi \left( p\right) \end{aligned}$$ (9)

$$\begin{aligned} \int _A x f(x)\,dx&= \Phi \left( p\right) \mu + \phi \left( p\right) \frac{\Sigma a}{\sqrt{a^T\Sigma a}} \end{aligned}$$ (10)

$$\begin{aligned} \int _A xx^T f(x)\,dx&= \Phi \left( p\right) (\Sigma +\mu \mu ^T) + \phi \left( p\right) \frac{\Sigma a\mu ^T+\mu a^T\Sigma }{\sqrt{a^T\Sigma a}} \nonumber \\&\quad - p\,\phi \left( p\right) \frac{\Sigma aa^T\Sigma }{a^T\Sigma a} \end{aligned}$$ (11)

where \(p=(a^T\mu +b)/\sqrt{a^T\Sigma a}\) is a scalar, \(\phi \left( x\right) =\mathcal {N}(x;0,1)\) is the PDF of the standard 1-dimensional Gaussian and \(\Phi \left( x\right) \) is its CDF.
Proof of (9)
All of these integrals are evaluated by making the substitution \(x=Ry\), where \(R\) is a rotation matrix that makes the halfspace \(A\) axis-aligned. Specifically, define \(R\) such that \(a^TR=\Vert a\Vert e_1^T\) where \(e_1\) is a \(k\)-dimensional vector whose first entry is 1 and all others are 0. This substitution is advantageous because it makes the limits of integration for all components of \(y\) except \(y_1\) equal to \([-\infty ,\infty ]\).
Because \(R^Tx=y\), \(y\) is a \(k\)-dimensional Gaussian with density \(q(y)=\mathcal {N}(y; R^T\mu , R^T\Sigma R)\). The determinant of the Jacobian of \(y\) is the determinant of a rotation matrix: \(|\partial y / \partial x|=|R^T|=1\). Substituting:

$$\begin{aligned} \int _A f(x)\,dx = \int _{\Vert a\Vert y_1+b>0} q(y)\,dy = \int _{-b/\Vert a\Vert }^\infty q_1(y_1)\,dy_1 \end{aligned}$$
where \(q_1(y_1)=\mathcal {N}(y_1; e_1^T (R^T\mu ), e_1^T(R^T\Sigma R)e_1)\) is the density for the first component of \(y\). The final step follows as the limits of integration marginalize \(q(y)\). To simplify the remaining integral, apply the definition of \(R\), \(q_1(y_1) =\mathcal {N}(y_1;\mu ^Ta/\Vert a\Vert , a^T\Sigma a/ \Vert a\Vert ^2)\), and use \(1-\Phi \left( -x\right) =\Phi \left( x\right) \):

$$\begin{aligned} \int _{-b/\Vert a\Vert }^\infty q_1(y_1)\,dy_1 = 1-\Phi \left( \frac{-b/\Vert a\Vert -\mu ^Ta/\Vert a\Vert }{\sqrt{a^T\Sigma a}/\Vert a\Vert }\right) = \Phi \left( p\right) \end{aligned}$$
\(\square \)
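As a numerical sanity check of identity (9), the following sketch draws Monte Carlo samples from a 2-dimensional Gaussian and compares the empirical mass in a halfspace against \(\Phi \left( p\right) \). The mean, covariance, and halfspace below are arbitrary test values, not quantities from the paper:

```python
import math
import random

def Phi(x):
    """CDF of the standard 1-dimensional Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Arbitrary 2-D Gaussian: mean mu, covariance Sigma = L L^T (L a Cholesky factor)
mu = [1.0, -0.5]
L = [[1.2, 0.0], [0.4, 0.9]]
Sigma = [[1.44, 0.48], [0.48, 0.97]]
# Arbitrary halfspace A = {x : a^T x + b > 0}
a, b = [0.7, -1.1], 0.3

aSa = sum(a[i] * Sigma[i][j] * a[j] for i in range(2) for j in range(2))
p = (a[0] * mu[0] + a[1] * mu[1] + b) / math.sqrt(aSa)

# Monte Carlo estimate of the Gaussian mass inside A
random.seed(0)
n = 200000
hits = 0
for _ in range(n):
    z0, z1 = random.gauss(0, 1), random.gauss(0, 1)
    x = [mu[0] + L[0][0] * z0, mu[1] + L[1][0] * z0 + L[1][1] * z1]
    if a[0] * x[0] + a[1] * x[1] + b > 0:
        hits += 1

mc_mass = hits / n
exact = Phi(p)
print(mc_mass, exact)
```

With \(2\times 10^5\) samples the two values agree to within Monte Carlo error (on the order of \(10^{-2}\)).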
Proof of (10)
First, we perform a change of variables so that the integral is evaluated over the standard multivariate Gaussian \(g(x)=\mathcal {N}(x;0, I)\). This substitution is \(x=\Sigma ^{1/2}y+\mu \), which can be seen by noting that (1) \(\partial y / \partial x=\Sigma ^{-1/2}\) and (2) \({f(x)=|\Sigma ^{-1/2}|\,g(\Sigma ^{-1/2}(x-\mu ))}\):

$$\begin{aligned} \int _A x f(x)\,dx = \int _B (\Sigma ^{1/2}y+\mu )g(y)\,dy = \Sigma ^{1/2}\int _B y\,g(y)\,dy + \mu \int _B g(y)\,dy \end{aligned}$$ (12)
where \(B=\{y : (a^T\Sigma ^{1/2})y +(a^T\mu +b) > 0\}\) is the transformed halfspace.
To evaluate the first term in (12), we calculate \(\int _C yg(y)\,dy\), where \(C=\{y: c^Ty + d > 0\}\) is a new generic halfspace. This integral is easier than the original problem as \(g(y)\) is the density of a zero-mean Gaussian with identity covariance. To proceed, substitute \(y=Rz\) where \(R\) is a rotation matrix that makes \(C\) axis-aligned (i.e., \(c^TR=\Vert c\Vert e_1^T\)) and observe that \(g(Rz)=g(z)\):

$$\begin{aligned} \int _C yg(y)\,dy = R\int _{z_1\Vert c\Vert +d>0} z\,g(z)\,dz = Re_1\int _{-d/\Vert c\Vert }^\infty z_1\phi (z_1)\,dz_1 = \frac{c}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$ (13)
\(e_1\) appears because \(g(x)\) is zero-mean; the only nonzero component of the integral is from \(z_1\). The 1-dimensional integral follows as \(d\phi (x)/dx=-x\phi (x)\).
To finish, (12) can be evaluated using the formula in (9) and (13):

$$\begin{aligned} \int _A x f(x)\,dx = \Phi \left( p\right) \mu + \phi \left( p\right) \frac{\Sigma a}{\sqrt{a^T\Sigma a}} \end{aligned}$$
\(\square \)
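Identity (10) can also be sanity-checked by Monte Carlo. The sketch below reuses an arbitrary 2-dimensional Gaussian and halfspace (test values, not quantities from the paper) and compares the sampled vector integral against the closed form:

```python
import math
import random

def phi(x):
    """PDF of the standard 1-dimensional Gaussian."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """CDF of the standard 1-dimensional Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Arbitrary 2-D Gaussian (covariance Sigma = L L^T) and halfspace a^T x + b > 0
mu = [1.0, -0.5]
L = [[1.2, 0.0], [0.4, 0.9]]
Sigma = [[1.44, 0.48], [0.48, 0.97]]
a, b = [0.7, -1.1], 0.3

Sa = [Sigma[0][0] * a[0] + Sigma[0][1] * a[1],
      Sigma[1][0] * a[0] + Sigma[1][1] * a[1]]
aSa = a[0] * Sa[0] + a[1] * Sa[1]
p = (a[0] * mu[0] + a[1] * mu[1] + b) / math.sqrt(aSa)

# Right-hand side of identity (10), per component
exact = [Phi(p) * mu[k] + phi(p) * Sa[k] / math.sqrt(aSa) for k in range(2)]

# Monte Carlo estimate of the vector integral over the halfspace
random.seed(1)
n = 200000
acc = [0.0, 0.0]
for _ in range(n):
    z0, z1 = random.gauss(0, 1), random.gauss(0, 1)
    x = [mu[0] + L[0][0] * z0, mu[1] + L[1][0] * z0 + L[1][1] * z1]
    if a[0] * x[0] + a[1] * x[1] + b > 0:
        acc[0] += x[0]
        acc[1] += x[1]
mc = [acc[0] / n, acc[1] / n]
print(mc, exact)
```

Each component of the sampled integral matches the closed form to within Monte Carlo error.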
Proof of (11)
Similar to the last proof, make the substitution \({x=\Sigma ^{1/2}y+\mu }\) with \(g(y)\) as the standard multivariate Gaussian and expand terms:

$$\begin{aligned} \int _A xx^Tf(x)\,dx&=\Sigma ^{1/2}\left[ \int _B yy^Tg(y)\,dy\right] \Sigma ^{1/2} + \Sigma ^{1/2}\left[ \int _B yg(y)\,dy\right] \mu ^T \nonumber \\&\quad + \mu \left[ \int _B y^Tg(y)\,dy\right] \Sigma ^{1/2} + \mu \mu ^T\int _B g(y)\,dy \end{aligned}$$ (14)
where \(B=\{y : (a^T\Sigma ^{1/2})y +(a^T\mu +b) > 0\}\) is the transformed halfspace.
To evaluate (14), we only need a formula for the first integral; the previous proofs have expressions for the other 3 integrals. To evaluate the first integral, let \(C=\{y: c^Ty + d > 0\}\) be a new halfspace and use the same rotation substitution \(y=Rz\) as in the last proof:

$$\begin{aligned} \int _C yy^Tg(y)\,dy = R\left[ \int _{z_1\Vert c\Vert +d>0} zz^Tg(z)\,dz\right] R^T \end{aligned}$$ (15)
The previous integral can be evaluated by analyzing each scalar component. \(g\) is the standard multivariate Gaussian, so \(g(z)=\prod _{i=1}^k\phi (z_i)\). There are three types of terms:

\(z_1^2\): The limits of integration marginalize \(g\) and the resulting integral can be solved using integration by parts:
$$\begin{aligned} \int _{z_1\Vert c\Vert +d>0} z_1^2 g(z)\,dz&=\int _{-d/\Vert c\Vert }^\infty z_1^2\phi (z_1)\,dz_1 \nonumber \\&=\Phi \left( \frac{d}{\Vert c\Vert }\right) -\frac{d}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$ 
\(z_i^2\; (i > 1)\):
$$\begin{aligned}&\int _{z_1\Vert c\Vert +d>0} z_i^2 g(z)\,dz \nonumber \\&=\int _{-d/\Vert c\Vert }^\infty \phi (z_1)\,dz_1 \int _{-\infty }^\infty z_i^2 \phi (z_i)\,dz_i = \Phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$The integral over \(z_i\) is 1 because it is the variance of the standard normal.

\(z_iz_j\; (i\ne j,i\ne 1)\): These indices cover all off-diagonal elements.
$$\begin{aligned}&\int _{z_1\Vert c\Vert +d>0} z_i z_j g(z)\,dz \nonumber \\&\quad ={\int _{\alpha }^\beta z_j\phi (z_j)\,dz_j}{\int _{-\infty }^\infty z_i \phi (z_i)\,dz_i}=0 \end{aligned}$$Since \(i\ne 1\), the limits of integration for \(z_i\) are the real line. Because \(g\) is zero-mean, the integral over \(z_i\) is 0.
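The integration-by-parts result for the \(z_1^2\) term can be checked by direct numerical quadrature. In the sketch below the value of \(d/\Vert c\Vert \) is an arbitrary test value:

```python
import math

def phi(x):
    """PDF of the standard 1-dimensional Gaussian."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """CDF of the standard 1-dimensional Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

p = 0.8  # arbitrary test value of d / ||c||

# Trapezoidal quadrature of z^2 phi(z) over [-p, 10]; the tail beyond 10 is negligible
lo, hi, steps = -p, 10.0, 200000
h = (hi - lo) / steps
total = 0.0
for i in range(steps + 1):
    z = lo + i * h
    w = 0.5 if i in (0, steps) else 1.0
    total += w * z * z * phi(z)
numeric = h * total

closed_form = Phi(p) - p * phi(p)
print(numeric, closed_form)
```

The quadrature agrees with the closed form to well below \(10^{-6}\).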
We now have formulas for all of the terms in (14). Recall that \(B=\{y : (\Sigma ^{1/2} a)^Ty + (a^T\mu + b) > 0\}\). Using \(p=(a^T\mu +b)/\sqrt{a^T\Sigma a}\), (9), (10), and (15):

$$\begin{aligned} \int _A xx^Tf(x)\,dx&= \Phi \left( p\right) (\Sigma +\mu \mu ^T) + \phi \left( p\right) \frac{\Sigma a\mu ^T+\mu a^T\Sigma }{\sqrt{a^T\Sigma a}} \nonumber \\&\quad -p\,\phi \left( p\right) \frac{\Sigma aa^T\Sigma }{a^T\Sigma a} \end{aligned}$$
which is the formula we sought. \(\square \)
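Identity (11) can likewise be sanity-checked by Monte Carlo. The sketch below uses an arbitrary 2-dimensional Gaussian and halfspace (test values, not quantities from the paper) and compares every entry of the sampled second-moment integral against the closed form:

```python
import math
import random

def phi(x):
    """PDF of the standard 1-dimensional Gaussian."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """CDF of the standard 1-dimensional Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Arbitrary 2-D Gaussian (covariance Sigma = L L^T) and halfspace a^T x + b > 0
mu = [1.0, -0.5]
L = [[1.2, 0.0], [0.4, 0.9]]
Sigma = [[1.44, 0.48], [0.48, 0.97]]
a, b = [0.7, -1.1], 0.3

Sa = [Sigma[0][0] * a[0] + Sigma[0][1] * a[1],
      Sigma[1][0] * a[0] + Sigma[1][1] * a[1]]
aSa = a[0] * Sa[0] + a[1] * Sa[1]
p = (a[0] * mu[0] + a[1] * mu[1] + b) / math.sqrt(aSa)

def exact_entry(i, j):
    """(i, j) entry of the right-hand side of identity (11)."""
    return (Phi(p) * (Sigma[i][j] + mu[i] * mu[j])
            + phi(p) * (Sa[i] * mu[j] + mu[i] * Sa[j]) / math.sqrt(aSa)
            - p * phi(p) * Sa[i] * Sa[j] / aSa)

# Monte Carlo estimate of the matrix integral over the halfspace
random.seed(2)
n = 300000
acc = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(n):
    z0, z1 = random.gauss(0, 1), random.gauss(0, 1)
    x = [mu[0] + L[0][0] * z0, mu[1] + L[1][0] * z0 + L[1][1] * z1]
    if a[0] * x[0] + a[1] * x[1] + b > 0:
        for i in range(2):
            for j in range(2):
                acc[i][j] += x[i] * x[j]
mc = [[acc[i][j] / n for j in range(2)] for i in range(2)]
err = max(abs(mc[i][j] - exact_entry(i, j)) for i in range(2) for j in range(2))
print(err)
```

All four matrix entries agree with the closed form to within Monte Carlo error.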
1.2 Proof of Lemma 1
The \(L_1\) norm can be evaluated by splitting the integral up into regions where the absolute value disappears. Let \(A\) be the set where \(f(x) > g(x)\) and \(A^c\) be the complement of \(A\) where \(f(x) \le g(x)\). Noting that \(\int _{A} g(x)\,dx = 1 - \int _{A^c} g(x)\,dx\):

$$\begin{aligned} \Vert f-g\Vert _1&= \int _A (f(x)-g(x))\,dx + \int _{A^c} (g(x)-f(x))\,dx \nonumber \\&= 2\left( \int _A f(x)\,dx - \int _A g(x)\,dx\right) \end{aligned}$$ (16)
\(f\) and \(g\) have the same covariance, so \(f\) is bigger than \(g\) on a halfspace \(A=\{x : a^Tx + b > 0\}\) where \(a=\Sigma ^{-1}(\mu _1-\mu _2)\) and \(b=(\mu _1+\mu _2)^T\Sigma ^{-1}(\mu _2-\mu _1)/2\). This means (16) can be evaluated using Lemma 6. To do this, we need to evaluate \(p\) for \(\int _A f(x)\,dx\) and \(\int _A g(x)\,dx\). There are three relevant terms:

$$\begin{aligned} a^T\mu _1+b&= \Vert \mu _1-\mu _2\Vert _\Sigma ^2/2 \end{aligned}$$ (17)
$$\begin{aligned} a^T\mu _2+b&= -\Vert \mu _1-\mu _2\Vert _\Sigma ^2/2 \end{aligned}$$ (18)
$$\begin{aligned} a^T\Sigma a&= \Vert \mu _1-\mu _2\Vert _\Sigma ^2 \end{aligned}$$ (19)
Using \(\delta =\Vert \mu _1-\mu _2\Vert _\Sigma /2\) we get \((a^T\mu _1+b)/\sqrt{a^T\Sigma a}=\delta \) and \((a^T\mu _2+b)/\sqrt{a^T\Sigma a}=-\delta \). Plugging these values into Lemma 6 completes the proof.
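The closed form this proof yields, \(\Vert f-g\Vert _1=2(\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\), can be verified by direct numerical integration in one dimension. The means and variance below are arbitrary test values:

```python
import math

def norm_pdf(x, m, s):
    """PDF of a 1-dimensional Gaussian with mean m and standard deviation s."""
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def Phi(x):
    """CDF of the standard 1-dimensional Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mu1, mu2, sigma = 0.0, 1.0, 1.0  # arbitrary test values
delta = abs(mu1 - mu2) / (2.0 * sigma)  # Mahalanobis half-distance in 1-D

# Trapezoidal quadrature of |f - g| over a wide interval (tails are negligible)
lo, hi, steps = -10.0, 11.0, 300000
h = (hi - lo) / steps
total = 0.0
for i in range(steps + 1):
    x = lo + i * h
    w = 0.5 if i in (0, steps) else 1.0
    total += w * abs(norm_pdf(x, mu1, sigma) - norm_pdf(x, mu2, sigma))
numeric = h * total

closed_form = 2.0 * (Phi(delta) - Phi(-delta))
print(numeric, closed_form)
```

The quadrature and the closed form agree to well below \(10^{-6}\).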
1.3 Proof of Lemma 2
Let \(X\) be a random variable whose density is \(|f(x)-g(x)|/\Vert f-g\Vert _1\). Calculating the entropy of \(X\) is difficult as the expression involves the log of the absolute value of the difference of exponentials. Fortunately, the covariance of \(X\) can be calculated. This is useful because the maximum entropy of any distribution with covariance \(\Sigma \) is \((1/2)\log {((2\pi e)^k |\Sigma |)}\), the entropy of the multivariate Gaussian (Cover and Thomas 2004, Thm. 8.6.5). By explicitly calculating the determinant of the covariance of \(X\), we prove the desired bound.
To calculate \(X\)’s covariance, use the formula \(\hbox {cov}\,{X}=\mathbb {E}\left[ XX^T\right] -\mathbb {E}\left[ X\right] \mathbb {E}\left[ X\right] ^T\). Similar to the proof of Lemma 1, we evaluate the mean by breaking the integral up into a region \(A\) where \(f(x)>g(x)\) and its complement \(A^c\) where \(g(x)\ge f(x)\):

$$\begin{aligned} \mathbb {E}\left[ X\right] \Vert f-g\Vert _1&= \int _A x f(x)\,dx - \int _A x g(x)\,dx \nonumber \\&\quad + \int _{A^c} x g(x)\,dx - \int _{A^c} x f(x)\,dx \end{aligned}$$ (20)
As before, \(f\) and \(g\) have the same covariance, so \(A = \{x : a^Tx + b >0\}\) and \(A^c=\{x : (-a)^Tx + (-b) \ge 0\}\) are halfspaces with \(a=\Sigma ^{-1}(\mu _1-\mu _2)\) and \(b=(\mu _1+\mu _2)^T\Sigma ^{-1}(\mu _2-\mu _1)/2\).
Each of the terms in (20) is a Gaussian function integrated over a halfspace, which can be evaluated using Lemma 6. To simplify the algebra, we evaluate the integrals involving \(f\) and \(g\) separately. This involves calculating a few terms, three of which are repeats: (17), (18), and (19). The other term is \(\Sigma a = \mu _1-\mu _2\). As the difference of the means will arise repeatedly, define \(\Delta =\mu _1-\mu _2\). Noting \(2\delta =\Vert \Delta \Vert _\Sigma \) and \(\phi (x)=\phi (-x)\), the integrals involving \(f\) are:

$$\begin{aligned} \int _A x f(x)\,dx - \int _{A^c} x f(x)\,dx = (\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\mu _1 + \frac{\phi \left( \delta \right) }{\delta }\Delta \end{aligned}$$ (21)
Next, evaluate the integrals in (20) involving \(g\). This can be done by pattern matching from (21). The main change is that \(\mu _1\) becomes \(\mu _2\) and the sign of \(p\) in Lemma 6 flips, meaning the sign of the arguments to \(\phi \left( \cdot \right) \) and \(\Phi \left( \cdot \right) \) flips (see (17) and (18)):

$$\begin{aligned} \int _A x g(x)\,dx - \int _{A^c} x g(x)\,dx = (\Phi \left( -\delta \right) -\Phi \left( \delta \right) )\mu _2 + \frac{\phi \left( \delta \right) }{\delta }\Delta \end{aligned}$$ (22)
Subtracting (22) from (21), recognizing \(\phi (x)=\phi (-x)\), and dividing by \(\Vert f-g\Vert _1=2(\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\) simplifies (20):

$$\begin{aligned} \mathbb {E}\left[ X\right] = \frac{\mu _1+\mu _2}{2} \end{aligned}$$ (23)
We now evaluate \(X\)’s second moment. Split the integral up over \(A\) and \(A^c\):

$$\begin{aligned} \mathbb {E}\left[ XX^T\right] \Vert f-g\Vert _1&= \int _A xx^T f(x)\,dx - \int _A xx^T g(x)\,dx \nonumber \\&\quad + \int _{A^c} xx^T g(x)\,dx - \int _{A^c} xx^T f(x)\,dx \end{aligned}$$ (24)
Once again, we evaluate this expression by separately evaluating the integrals involving \(f\) and \(g\).
Starting with \(f\) and using Lemma 6 with \(\Delta \) and \(\delta \):

$$\begin{aligned} \int _A xx^T f(x)\,dx&= \Phi \left( \delta \right) (\Sigma +\mu _1\mu _1^T) + \frac{\phi \left( \delta \right) }{2\delta }(\Delta \mu _1^T+\mu _1\Delta ^T) - \frac{\phi \left( \delta \right) }{4\delta }\Delta \Delta ^T \end{aligned}$$ (25)

$$\begin{aligned} \int _{A^c} xx^T f(x)\,dx&= \Phi \left( -\delta \right) (\Sigma +\mu _1\mu _1^T) - \frac{\phi \left( \delta \right) }{2\delta }(\Delta \mu _1^T+\mu _1\Delta ^T) + \frac{\phi \left( \delta \right) }{4\delta }\Delta \Delta ^T \end{aligned}$$ (26)
Taking the difference of (25) and (26):

$$\begin{aligned} \int _A xx^T f(x)\,dx - \int _{A^c} xx^T f(x)\,dx&= (\Phi \left( \delta \right) -\Phi \left( -\delta \right) )(\Sigma +\mu _1\mu _1^T) \nonumber \\&\quad + \frac{\phi \left( \delta \right) }{\delta }(\Delta \mu _1^T+\mu _1\Delta ^T) - \frac{\phi \left( \delta \right) }{2\delta }\Delta \Delta ^T \end{aligned}$$ (27)
To evaluate the integrals in (24) involving \(g\), we can pattern match using (25) and (26). As in the derivation of (22), this involves negating the \(p\) terms and replacing \(\mu _1\) with \(\mu _2\):

$$\begin{aligned} \int _A xx^T g(x)\,dx - \int _{A^c} xx^T g(x)\,dx&= (\Phi \left( -\delta \right) -\Phi \left( \delta \right) )(\Sigma +\mu _2\mu _2^T) \nonumber \\&\quad + \frac{\phi \left( \delta \right) }{\delta }(\Delta \mu _2^T+\mu _2\Delta ^T) + \frac{\phi \left( \delta \right) }{2\delta }\Delta \Delta ^T \end{aligned}$$ (28)
Note the sign difference of \(\Delta \Delta ^T\) compared to (27).
To finish calculating the second moment, subtract (28) from (27) and divide by \(\Vert f-g\Vert _1\), simplifying (24):

$$\begin{aligned} \mathbb {E}\left[ XX^T\right] = \Sigma + \frac{\mu _1\mu _1^T+\mu _2\mu _2^T}{2} + \frac{\phi \left( \delta \right) }{2\delta (\Phi \left( \delta \right) -\Phi \left( -\delta \right) )}\Delta \Delta ^T \end{aligned}$$
We can now express the covariance of \(X\):

$$\begin{aligned} \hbox {cov}\,X = \Sigma + \left( \frac{1}{4} + \frac{\phi \left( \delta \right) }{2\delta (2\Phi \left( \delta \right) -1)}\right) \Delta \Delta ^T \end{aligned}$$ (29)
This follows as \((\mu _1\mu _1^T+\mu _2\mu _2^T)/2-(\mu _1+\mu _2)(\mu _1+\mu _2)^T/4=\Delta \Delta ^T/4\).
To calculate the determinant of the covariance, factor \(\Sigma \) out of (29) and define \(\alpha =\delta ^2 + \frac{2\phi \left( \delta \right) \delta }{2\Phi \left( \delta \right) -1}\):

$$\begin{aligned} |\hbox {cov}\,X| = |\Sigma |\left| I+\frac{\alpha }{4\delta ^2}\Sigma ^{-1}\Delta \Delta ^T\right| = |\Sigma |(1+\alpha ) \end{aligned}$$ (30)
The last step follows from the eigenvalues. \(\Sigma ^{-1}\Delta \Delta ^T\) only has one nonzero eigenvalue; it is a rank one matrix as \(\Sigma ^{-1}\) is full rank and \(\Delta \Delta ^T\) is rank one. The trace of a matrix is the sum of its eigenvalues, so \(tr(\Sigma ^{-1}\Delta \Delta ^T) =tr(\Delta ^T\Sigma ^{-1}\Delta )=\Vert \Delta \Vert _\Sigma ^2=4\delta ^2\) is the nonzero eigenvalue. Consequently, the only nonzero eigenvalue of \(\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2\) is \(\alpha \). Adding \(I\) to a matrix increases all its eigenvalues by 1, so the only eigenvalue of \(I+\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2\) that is not 1 has a value of \(1 + \alpha \). The determinant of a matrix is the product of its eigenvalues, so \(|I+\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2|=1+\alpha \). As discussed at the beginning of the proof, substituting (30) into the expression for the entropy of a multivariate Gaussian achieves the desired upper bound.
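The covariance calculation can be checked numerically in one dimension, where (30) says the variance of the density \(|f-g|/\Vert f-g\Vert _1\) is \(\sigma ^2(1+\alpha )\) and (23) says its mean is the midpoint of the two means. The parameters below are arbitrary test values:

```python
import math

def norm_pdf(x, m, s):
    """PDF of a 1-dimensional Gaussian with mean m and standard deviation s."""
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mu1, mu2, sigma = 0.0, 1.0, 1.0  # arbitrary test values
delta = abs(mu1 - mu2) / (2.0 * sigma)
alpha = delta ** 2 + 2.0 * phi(delta) * delta / (2.0 * Phi(delta) - 1.0)

# Numerically integrate the moments of the density |f - g| / ||f - g||_1
lo, hi, steps = -10.0, 11.0, 300000
h = (hi - lo) / steps
mass = first = second = 0.0
for i in range(steps + 1):
    x = lo + i * h
    w = (0.5 if i in (0, steps) else 1.0) * h
    d = abs(norm_pdf(x, mu1, sigma) - norm_pdf(x, mu2, sigma))
    mass += w * d
    first += w * x * d
    second += w * x * x * d
mean = first / mass
var_numeric = second / mass - mean ** 2

var_closed = sigma ** 2 * (1.0 + alpha)
print(mean, var_numeric, var_closed)
```

The numerically integrated mean lands on the midpoint and the variance matches \(\sigma ^2(1+\alpha )\).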
1.4 Proof of Theorem 2
To prove the theorem, we build on Theorem 1. Unfortunately, it cannot be directly applied because it is difficult to evaluate 1) the \(L_1\) norm between GMMs and 2) the entropy of the normalized difference of mixture models. However, Lemmas 1 and 2 provide a way to evaluate these quantities when the densities are individual Gaussians. To exploit this fact, we split the problem up and analyze the difference in entropies between GMMs that only differ by a single component.
To start, define \(d_j(x)=\sum _{i=1}^j w_i g_i(x) + \sum _{i=j+1}^n w_i f_i(x)\). \(d_j\) is a GMM whose first \(j\) components are the first \(j\) components in \(g\) and last \(n-j\) components are the last \(n-j\) components in \(f\). Note that \(d_0 (x) = f(x)\) and \(d_n (x) = g(x)\). Using the triangle inequality:

$$\begin{aligned} |H(f)-H(g)| = |H(d_0)-H(d_n)|&\le |H(d_0)-H(d_1)| + |H(d_1)-H(d_n)| \nonumber \\&\le \sum _{j=1}^n |H(d_{j-1})-H(d_j)| \end{aligned}$$
where the last step applied the same trick \(n-2\) more times. Because \(d_{j-1} (x) - d_{j} (x) = w_j (f_j (x) - g_j (x))\), each term in the summand can be bounded using Theorem 1.
Because \(f_j(x)\) and \(g_j(x)\) are Gaussians with the same covariance, we can apply Lemmas 1 and 2 to complete the proof.
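The component-wise decomposition underlying this argument can be illustrated numerically: since \(f-g=\sum _j w_j(f_j-g_j)\), the triangle inequality gives \(\Vert f-g\Vert _1\le \sum _j w_j\Vert f_j-g_j\Vert _1\), where each term on the right has the closed form from Lemma 1. The mixture below is an arbitrary 1-dimensional example, not one from the paper:

```python
import math

def norm_pdf(x, m, s):
    """PDF of a 1-dimensional Gaussian with mean m and standard deviation s."""
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Arbitrary two-component 1-D GMMs: g shifts each component of f (same weights, same sigma)
w = [0.6, 0.4]
f_means, g_means = [0.0, 3.0], [0.5, 2.5]
sigma = 1.0

def f(x):
    return sum(wi * norm_pdf(x, m, sigma) for wi, m in zip(w, f_means))

def g(x):
    return sum(wi * norm_pdf(x, m, sigma) for wi, m in zip(w, g_means))

# Left side: ||f - g||_1 by trapezoidal quadrature
lo, hi, steps = -12.0, 15.0, 300000
h = (hi - lo) / steps
lhs = 0.0
for i in range(steps + 1):
    x = lo + i * h
    wt = 0.5 if i in (0, steps) else 1.0
    lhs += wt * abs(f(x) - g(x))
lhs *= h

# Right side: sum_j w_j ||f_j - g_j||_1 via the closed form from Lemma 1
rhs = 0.0
for wi, mf, mg in zip(w, f_means, g_means):
    delta = abs(mf - mg) / (2.0 * sigma)
    rhs += wi * 2.0 * (Phi(delta) - Phi(-delta))
print(lhs, rhs)
```

Here the bound is strict: the two shifted components partially cancel, so the left side comes out below the component-wise sum.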
Cite this article
Charrow, B., Kumar, V. & Michael, N. Approximate representations for multi-robot control policies that maximize mutual information. Auton Robot 37, 383–400 (2014). https://doi.org/10.1007/s10514-014-9411-2