Reasoning and Learning for Awareness and Adaptation

  • Matthias Hölzl
  • Thomas Gabor
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8998)

Abstract

Reasoning and learning for awareness and adaptation are challenging endeavors because cogitation has to be tightly integrated with action execution and with reactions to unforeseen contingencies. After discussing the notion of awareness and presenting a classification scheme for awareness mechanisms, we introduce Extended Behavior Trees (XBTs), a novel modeling method for hierarchical, concurrent behaviors that allows the interleaving of reasoning, learning and actions. The semantics of XBTs are defined by a transformation to SCEL, so that sophisticated synchronization strategies are straightforward to realize and different kinds of distributed, hierarchical learning and reasoning (from centrally coordinated to fully autonomic) can easily be expressed. We propose novel hierarchical reinforcement-learning strategies, called Hierarchical (Lenient) Frequency-Adjusted Q-learning, that can be implemented using XBTs. Finally, we discuss how XBTs can be used to define a multi-layer approach to learning, called teacher-student learning, that combines centralized and distributed learning in a seamless way.
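For orientation, the following is a minimal sketch of the flat, single-state frequency-adjusted Q-learning update on which the strategies named above build, as it is commonly stated in the literature. It is not the chapter's hierarchical or lenient variant, and it does not involve XBTs or SCEL. The function names (`boltzmann`, `faq_update`), the default parameter values, and the use of a Boltzmann policy with a fixed temperature are assumptions made for illustration only.

```python
import math


def boltzmann(q_values, temperature):
    """Return Boltzmann (softmax) action probabilities for a list of Q-values."""
    # Subtract the maximum before exponentiating for numerical stability.
    highest = max(q_values)
    weights = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def faq_update(q_values, action, reward,
               alpha=0.1, beta=0.01, gamma=0.0, temperature=0.1):
    """Apply one frequency-adjusted Q-learning step in place and return q_values.

    Illustrative single-state version: the usual Q-learning step is scaled by
    min(beta / x_a, 1), where x_a is the current probability of choosing the
    action, so rarely chosen actions receive proportionally larger updates.
    """
    probs = boltzmann(q_values, temperature)
    scale = min(beta / probs[action], 1.0)
    target = reward + gamma * max(q_values)
    q_values[action] += scale * alpha * (target - q_values[action])
    return q_values
```

A call such as `faq_update([0.0, 0.0], action=0, reward=1.0)` changes only the entry for the chosen action. The cap at 1 keeps the effective learning rate from exceeding `alpha` when an action's selection probability drops below `beta`.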

Keywords

Autonomic Computing, Learning, Reasoning, Planning, Behavioral Adaptation, Self-awareness, Computational Reflection

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Matthias Hölzl (1)
  • Thomas Gabor (1)

  1. Ludwig-Maximilians-Universität München, Germany
