Evolutionary Intelligence, Volume 6, Issue 1, pp 1–26

Scalable multiagent learning through indirect encoding of policy geometry

Research Paper

Abstract

Multiagent systems pose many challenging real-world problems for artificial intelligence. Because it is difficult to engineer the behaviors of multiple cooperating agents by hand, multiagent learning has become a popular approach to their design. While there are a variety of traditional approaches to multiagent learning, many suffer from increased computational costs for large teams and from the problem of reinvention (that is, the inability to recognize that certain skills are shared by some or all team members). This paper presents an alternative approach to multiagent learning called multiagent HyperNEAT, which represents the team as a pattern of policies rather than as a set of individual agents. The main idea is that an agent’s location within a canonical team layout (which can be physical, such as positions on a sports team, or conceptual, such as an agent’s relative speed) tends to dictate its role within that team. This paper introduces the term policy geometry to describe this relationship between role and position on the team. Interestingly, such patterns effectively represent up to an infinite number of multiagent policies, which can be sampled from the policy geometry as needed, making it possible to train very large teams or, in some cases, to scale up the size of a team without additional learning. In this paper, multiagent HyperNEAT is compared to a traditional learning method, multiagent Sarsa(λ), in a predator–prey domain, where it demonstrates its ability to train large teams.
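
To make the central idea concrete, the sketch below shows one way policies could be sampled from a policy geometry: a single function maps each agent's position in the team layout, together with the coordinates of a connection's endpoints, to a connection weight, so instantiating a team of any size is just a matter of sampling more positions. This is a minimal illustration under assumed details, not the paper's implementation: in multiagent HyperNEAT the role of policy_geometry would be played by an evolved CPPN, and all names here (policy_geometry, build_team, the toy substrate) are hypothetical.

    import math

    def policy_geometry(agent_pos, x1, y1, x2, y2):
        # Stand-in for the evolved CPPN: maps an agent's normalized position
        # in the team layout, plus the coordinates of a connection's two
        # endpoints, to that connection's weight. The formula is arbitrary.
        return math.sin(3.0 * agent_pos + x1 - x2) * math.exp(-(y1 - y2) ** 2)

    def build_team(num_agents, substrate):
        # Sample one controller per agent from the same policy geometry.
        # Team size is only a sampling choice, so it can be changed
        # without evolving or learning anything new.
        team = []
        for i in range(num_agents):
            agent_pos = i / max(num_agents - 1, 1)  # position in the layout, in [0, 1]
            weights = {(src, dst): policy_geometry(agent_pos, *src, *dst)
                       for src in substrate for dst in substrate}
            team.append(weights)
        return team

    # A toy substrate: (x, y) coordinates of a controller's neurons.
    substrate = [(-1.0, -1.0), (0.0, -1.0), (1.0, -1.0), (0.0, 1.0)]
    small_team = build_team(5, substrate)
    large_team = build_team(50, substrate)  # same geometry, ten times the agents

Because team size appears only as a sampling parameter, the same pattern can in principle be queried for five agents or fifty, which is what allows scaling without additional learning.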

Keywords

Multiagent learning · Indirect encoding · HyperNEAT · Neural networks

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

1. Department of Electrical Engineering and Computer Science, University of Central Florida, Orlando, USA
