ARES: Adaptive Receding-Horizon Synthesis of Optimal Plans

  • Anna Lukina
  • Lukas Esterle
  • Christian Hirsch
  • Ezio Bartocci
  • Junxing Yang
  • Ashish Tiwari
  • Scott A. Smolka
  • Radu Grosu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10206)

Abstract

We introduce ARES, an efficient approximation algorithm for generating optimal plans (action sequences) that take an initial state of a Markov Decision Process (MDP) to a state whose cost is below a specified (convergence) threshold. ARES uses Particle Swarm Optimization, with adaptive sizing for both the receding horizon and the particle swarm. Inspired by Importance Splitting, the length of the horizon and the number of particles are chosen such that at least one particle reaches a next-level state, that is, a state where the cost decreases by a required delta from the previous-level state. The level relation on states and the plans constructed by ARES implicitly define a Lyapunov function and an optimal policy, respectively, both of which could be explicitly generated by applying ARES to all states of the MDP, up to some topological equivalence relation. We also assess the effectiveness of ARES by statistically evaluating its rate of success in generating optimal plans. The ARES algorithm resulted from our desire to clarify whether flying in V-formation is a flocking policy that optimizes energy conservation, clear view, and velocity alignment. That is, we wanted to know whether one could find optimal plans that bring a flock from an arbitrary initial state to a state exhibiting a single connected V-formation. For flocks with 7 birds, ARES is able to generate a plan that leads to a V-formation in 95% of the 8,000 random initial configurations within 63 s, on average. ARES can also be easily customized into a model-predictive controller (MPC) with an adaptive receding horizon and statistical guarantees of convergence. To the best of our knowledge, our adaptive-sizing approach is the first to provide convergence guarantees in receding-horizon techniques.
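The level-by-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' ARES implementation: it assumes a toy deterministic system (a 2D point moved by bounded displacements) in place of an MDP, a squared-distance cost in place of the flock fitness, and a plain global-best PSO. All names (`cost`, `step`, `rollout`, `pso`, `synthesize_plan`), the parameter values, the rule of scaling the swarm with the horizon, and the choice to apply the whole accepted action sequence are hypothetical choices made for illustration.

```python
import random

def cost(state):
    # Toy cost (assumption): squared distance to the origin.
    x, y = state
    return x * x + y * y

def clip(v):
    # Bound each action component to [-1, 1].
    return max(-1.0, min(1.0, v))

def step(state, action):
    # Toy deterministic dynamics (assumption): action is a clipped displacement.
    return (state[0] + clip(action[0]), state[1] + clip(action[1]))

def rollout(state, plan):
    # Apply a flat plan [ax0, ay0, ax1, ay1, ...]; return the final state.
    for i in range(0, len(plan), 2):
        state = step(state, (plan[i], plan[i + 1]))
    return state

def pso(state, horizon, particles, rng, iters=40):
    # Plain global-best PSO over action sequences of length `horizon`.
    dim = 2 * horizon
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best = [p[:] for p in pos]
    best_cost = [cost(rollout(state, p)) for p in pos]
    g = min(range(particles), key=best_cost.__getitem__)
    g_pos, g_cost = best[g][:], best_cost[g]
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(rollout(state, pos[i]))
            if c < best_cost[i]:
                best[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost

def synthesize_plan(state, threshold=0.05, delta=0.5, max_horizon=4,
                    base_particles=20, max_levels=50, seed=0):
    rng = random.Random(seed)
    plan, level_cost = [], cost(state)
    for _ in range(max_levels):
        if level_cost <= threshold:
            break
        # Grow the horizon (and the swarm with it) until some particle
        # reaches the next level: cost lower by delta, or below threshold.
        for h in range(1, max_horizon + 1):
            candidate, c = pso(state, h, base_particles * h, rng)
            if c <= level_cost - delta or c <= threshold:
                plan.extend(candidate)
                state, level_cost = rollout(state, candidate), c
                break
        else:
            return None  # no horizon up to max_horizon reached the next level
    return (plan, state) if level_cost <= threshold else None
```

Calling `synthesize_plan((3.0, -2.0))` returns either `None` (no next level was reached within the horizon and swarm limits) or a pair of the flat action sequence and the final state. Because the cost must drop by at least `delta` at every accepted level, the sequence of level costs is strictly decreasing, which is the property the abstract relates to a Lyapunov function.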

Keywords

Particle Swarm Optimization, Particle Swarm Optimization Algorithm, Model Predictive Control, Markov Decision Process, Optimal Plan
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

The first author and the last author would like to thank Jan Křetínský for very valuable feedback. This work was partially supported by the Doctoral Program Logical Methods in Computer Science and the Austrian National Research Network RiSE/SHiNE (S11405-N23 and S11412-N23) project funded by the Austrian Science Fund (FWF) project W1255-N23, the EU ICT COST Action IC1402 ARVI, the Fclose (Federated Cloud Security) project funded by UnivPM, and National Science Foundation grant CCF 1423296.

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  • Anna Lukina (1), corresponding author
  • Lukas Esterle (1)
  • Christian Hirsch (1)
  • Ezio Bartocci (1)
  • Junxing Yang (2)
  • Ashish Tiwari (3)
  • Scott A. Smolka (2)
  • Radu Grosu (1, 2)

  1. Cyber-Physical Systems Group, Technische Universität Wien, Vienna, Austria
  2. Department of Computer Science, Stony Brook University, New York, USA
  3. SRI International, Menlo Park, USA
