Maximal coverage problems with routing constraints using cross-entropy Monte Carlo tree search

Autonomous Robots

Abstract

Spatial search and environmental monitoring are key technologies in robotics. These problems can be reformulated as maximal coverage problems with routing constraints, which are NP-hard. The generalized cost-benefit algorithm (GCB) can solve these problems with theoretical guarantees. Evolutionary algorithms (EA) can improve on GCB by drawing more samples; however, it is hard to know the termination conditions under which an EA outperforms GCB. To solve these problems with both theoretical guarantees and termination conditions, this research proposes the cross-entropy based Monte Carlo tree search algorithm (CE-MCTS). It consists of three parts: the EA for sampling the branches, the upper confidence bound policy for selections, and the estimation of distribution algorithm for simulations. The experiments demonstrate that CE-MCTS outperforms benchmark approaches (e.g., GCB, EAMC) in spatial search problems.

Data Availability

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Notes

  1. Geometrically, the three sets form a triangle; hence, it is called triangular estimation.

  2. The \(31\%\) is the lower bound guaranteed by GCB: \(0.5\times (1-1/e) \approx 0.5\times 0.632 \approx 0.316\), i.e., roughly \(31\%\).

  3. \(P(Z=1)\) denotes the probability that the target is in the camera view. \(P_{thr}\) is the detection probability threshold.

Funding

This research was supported by Taiwan Swarm Innovation Inc., Taiwan MOST Grant 108-2221-E-008-074-MY3, 111-2221-E-008-097, and NSTC 112-2221-E-008-075.

Author information

Contributions

Conceptualization, P-TL and K-ST; methodology, P-TL and K-ST; software, P-TL and K-ST; validation, P-TL and K-ST; formal analysis, P-TL and K-ST; investigation, P-TL and K-ST; resources, P-TL and K-ST; data collection, P-TL; writing–original draft preparation, P-TL; writing–review and editing, K-ST; visualization, P-TL; supervision, K-ST; project administration, K-ST; funding acquisition, K-ST.

Corresponding author

Correspondence to Kuo-Shih Tseng.

Ethics declarations

Ethics approval

The research does not involve human participants, their data or biological material and it does not involve animals.

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 85583 KB)

Appendix A: Mathematical formulation of CEM


The mathematical formulation of CEM is as follows. First, let l denote the probability that \(f(\textbf{X})\) is greater than or equal to a threshold \(\gamma\). When l is small (e.g., \(l\le 10^{-5}\)), the event \(f(\textbf{X})\ge \gamma\) is called a rare event. Mathematically, l can be expressed as

$$\begin{aligned} l = {\mathbb {P}}_u(f(\textbf{X})\ge \gamma )= {\mathbb {E}}_uI_{{f(\textbf{X})\ge \gamma }}, \end{aligned}$$
(15)

where I is the indicator function and \(\textbf{X}\) is sampled from the density \(h(\cdot ;u)\) with parameter u. Alternatively, l can be written as

$$\begin{aligned} l = \int I_{{f(x)\ge \gamma }} \frac{h(x;u)}{g(x)}g(x) \,dx = {\mathbb {E}}_gI_{{f(\textbf{X})\ge \gamma }}\frac{h(\textbf{X};u)}{g(\textbf{X})}, \end{aligned}$$
(16)

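where g is another density (an importance sampling density). Note that, in Eq. 16, the expectation is taken with respect to g. Given samples \(x_1,\ldots ,x_N\) drawn independently from g, an unbiased estimator of l is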

$$\begin{aligned} \hat{l} = \frac{1}{N} \sum _{i=1}^{N} I_{{f(x_i)\ge \gamma }}\frac{h(x_i;u)}{g(x_i)}. \end{aligned}$$
(17)
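The estimator in Eq. 17 can be illustrated with a minimal sketch. The setting here is an assumption chosen only because l has a closed form: \(h(\cdot ;u)\) is taken to be an exponential density with mean u, \(f(x)=x\), and g is taken from the same family with mean \(v=\gamma\). The function names are hypothetical and not from the article.

```python
import math
import random


def exp_pdf(x, mean):
    """Density of an exponential distribution with the given mean (assumed family h)."""
    return math.exp(-x / mean) / mean


def importance_sampling_estimate(f, gamma, u, v, n, rng):
    """Estimate l = P_u(f(X) >= gamma) as in Eq. 17, sampling from g = h(.; v)."""
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(1.0 / v)                 # x_i ~ g = h(.; v)
        if f(x) >= gamma:                            # indicator I_{f(x_i) >= gamma}
            total += exp_pdf(x, u) / exp_pdf(x, v)   # likelihood ratio h(x_i; u) / g(x_i)
    return total / n


rng = random.Random(0)
u, gamma = 1.0, 10.0
l_exact = math.exp(-gamma / u)                       # closed form for this toy case only
l_hat = importance_sampling_estimate(lambda x: x, gamma, u, v=gamma, n=20000, rng=rng)
print(f"exact l = {l_exact:.3e}, estimate = {l_hat:.3e}")
```

Sampling directly from \(h(\cdot ;u)\) with the same N would almost never produce a sample with \(f(x)\ge \gamma\) in this toy case, which is why a different density g is used.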

Second, the density g that minimizes the variance of \(\hat{l}\) is

$$\begin{aligned} g^*(x) := \frac{I_{{f(x)\ge \gamma }}h(x;u)}{l}. \end{aligned}$$
(18)
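To see why this choice is attractive, one can substitute \(g^*\) into the summand of Eq. 17. The following worked step is an added illustration, not part of the original derivation; it uses only the definitions above and the fact that every sample drawn from \(g^*\) satisfies \(f(x_i)\ge \gamma\):

$$\begin{aligned} I_{{f(x_i)\ge \gamma }}\frac{h(x_i;u)}{g^*(x_i)} = I_{{f(x_i)\ge \gamma }}\,h(x_i;u)\,\frac{l}{I_{{f(x_i)\ge \gamma }}h(x_i;u)} = l. \end{aligned}$$

Every term of the sum then equals l, so the estimator would have zero variance.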

However, \(g^*\) depends on l, which is unknown. A practical alternative is to select g from the family of densities \(h(\cdot ;v)\).

Third, a measure of the difference between two density functions g and h is the Kullback-Leibler distance (KLD), also called the cross-entropy between g and h. The KLD is defined as:

$$\begin{aligned}&{\mathcal {D}}(g,h) \nonumber \\ {}&\quad = {\mathbb {E}}_g \ln \frac{g(\textbf{X})}{h(\textbf{X})} = \int g(x)\ln g(x) \,dx - \int g(x)\ln h(x) \,dx. \end{aligned}$$
(19)
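As a small numerical check of the decomposition in Eq. 19, the sketch below evaluates both sides for two discrete densities, with sums in place of integrals. The densities and function names are illustrative assumptions.

```python
import math


def kl_divergence(g, h):
    """D(g, h) = sum_x g(x) ln(g(x) / h(x)) for discrete densities on a common support."""
    return sum(gx * math.log(gx / hx) for gx, hx in zip(g, h) if gx > 0)


def neg_entropy(g):
    """First term of Eq. 19: sum_x g(x) ln g(x)."""
    return sum(gx * math.log(gx) for gx in g if gx > 0)


def cross_term(g, h):
    """Second term of Eq. 19 (without the minus sign): sum_x g(x) ln h(x)."""
    return sum(gx * math.log(hx) for gx, hx in zip(g, h) if gx > 0)


g = [0.7, 0.2, 0.1]   # illustrative densities, not from the article
h = [0.5, 0.3, 0.2]
d = kl_divergence(g, h)
print(d, neg_entropy(g) - cross_term(g, h))
```

Only `cross_term` depends on h, so for a fixed g, minimizing the KLD over h amounts to maximizing that term.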

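Since the first term in Eq. 19 does not depend on v, minimizing the KLD between \(g^*\) in Eq. 18 and \(h(\cdot ;v)\) is equivalent to maximizing \(\int g^*{(x)}\ln h(x;v) \,dx\). Substituting Eq. 18 for \(g^*{(x)}\) and noting that the positive constant 1/l does not change the maximizer, this is equivalent to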

$$\begin{aligned}&\max _{v} \int \frac{I_{{f(x)\ge \gamma }}h(x;u)}{l} \ln h(x;v) \,dx \nonumber \\ {}&\quad = \max _{v} {\mathbb {E}}_u I_{{f(\textbf{X})\ge \gamma }} \ln h(\textbf{X};v). \end{aligned}$$
(20)

Finally, v can be obtained by solving Eq. 20.
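The sketch below illustrates how Eq. 20 is commonly solved in practice; it is not the authors' implementation. It assumes that \(h(\cdot ;v)\) is a product of independent Bernoulli densities over a binary selection vector, that the expectation is replaced by a sample average, and that the usual optimization variant of CEM is used (samples are drawn from the current v and the likelihood ratio is omitted). Under these assumptions, the maximizer is the mean of the elite samples. The toy coverage instance (`SETS`, `K`) and all parameter values are hypothetical.

```python
import random

# Hypothetical toy instance: pick at most K of these sets to cover the most elements.
SETS = [{0, 1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {7, 8}, {8, 9}, {0, 9}]
K = 3


def coverage(x):
    """f(x): number of covered elements; selections with more than K sets score 0."""
    if sum(x) > K:
        return 0
    covered = set()
    for chosen, s in zip(x, SETS):
        if chosen:
            covered |= s
    return len(covered)


def cross_entropy_method(n_samples=500, elite_frac=0.1, alpha=0.7, iters=30, seed=0):
    rng = random.Random(seed)
    v = [0.5] * len(SETS)                       # parameters of the Bernoulli family h(.; v)
    n_elite = max(1, int(elite_frac * n_samples))
    best_x, best_val = None, -1
    for _ in range(iters):
        # Draw samples from h(.; v).
        samples = [[1 if rng.random() < p else 0 for p in v] for _ in range(n_samples)]
        samples.sort(key=coverage, reverse=True)
        elite = samples[:n_elite]               # top samples; gamma is the score of the last one
        if coverage(elite[0]) > best_val:
            best_x, best_val = elite[0], coverage(elite[0])
        # Sample-average solution of Eq. 20 for Bernoulli h: the elite mean.
        elite_mean = [sum(x[j] for x in elite) / n_elite for j in range(len(v))]
        # Smoothed update keeps v away from premature 0/1 values.
        v = [alpha * m + (1 - alpha) * p for m, p in zip(elite_mean, v)]
    return best_x, best_val, v


best_x, best_val, v = cross_entropy_method()
print(best_x, best_val)
```

The threshold \(\gamma\) is not fixed in advance here; it is implied in each iteration by the score of the worst elite sample, so it rises as v concentrates on good selections.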

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, PT., Tseng, KS. Maximal coverage problems with routing constraints using cross-entropy Monte Carlo tree search. Auton Robot 48, 3 (2024). https://doi.org/10.1007/s10514-024-10156-6

  • DOI: https://doi.org/10.1007/s10514-024-10156-6
