Efficient Reward Functions for Adaptive Multi-rover Systems

  • Kagan Tumer
  • Adrian Agogino
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3898)


This chapter focuses on deriving reward functions that allow multiple agents to co-evolve efficient control policies that maximize a system-level reward in noisy and dynamic environments. The solution we present is based on agent rewards satisfying two crucial properties. First, the agent reward function and the global reward function have to be aligned; that is, an agent maximizing its agent-specific reward should also maximize the global reward. Second, the agent has to receive sufficient “signal” from its reward; that is, an agent’s action should have a large influence over its agent-specific reward. Agents using rewards with these two properties will evolve the correct policies quickly. This hypothesis is tested in episodic and non-episodic, continuous-space multi-rover environments where rovers evolve to maximize a global reward function over all rovers. The environments are dynamic (i.e., they change over time) and noisy, and they restrict communication between agents. We show that a control policy evolved using agent-specific rewards satisfying the above properties outperforms policies evolved using global rewards by up to 400%. More notably, in the presence of a larger number of rovers or rovers with noisy and communication-limited sensors, the proposed method outperforms global reward by a higher percentage than in noise-free conditions with a small number of rovers.
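The two properties can be made concrete with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from this abstract: the toy global reward (each point of interest contributes its value divided by the squared distance to the closest rover), the names `global_reward`, `difference_reward`, and `alignment_fraction`, and the choice of a "difference"-style agent reward (the global reward minus the global reward with that rover removed) as one possible reward that is both aligned and sensitive.

```python
import random


def global_reward(rovers, pois, d_min=1.0):
    """Toy system reward (assumed): value / squared distance to the closest rover."""
    if not rovers:
        return 0.0
    total = 0.0
    for px, py, value in pois:
        closest = min((rx - px) ** 2 + (ry - py) ** 2 for rx, ry in rovers)
        total += value / max(closest, d_min)  # d_min avoids division by ~0
    return total


def difference_reward(i, rovers, pois):
    """Hypothetical agent reward: G with rover i minus G with rover i removed."""
    others = rovers[:i] + rovers[i + 1:]
    return global_reward(rovers, pois) - global_reward(others, pois)


def alignment_fraction(i, rovers, pois, trials=1000, seed=0):
    """Fraction of random moves of rover i for which the change in its
    agent reward never has the opposite sign of the change in G."""
    rng = random.Random(seed)
    base_g = global_reward(rovers, pois)
    base_d = difference_reward(i, rovers, pois)
    misaligned = 0
    for _ in range(trials):
        moved = list(rovers)
        x, y = moved[i]
        moved[i] = (x + rng.uniform(-1, 1), y + rng.uniform(-1, 1))
        dg = global_reward(moved, pois) - base_g
        dd = difference_reward(i, moved, pois) - base_d
        if dg * dd < 0:
            misaligned += 1
    return 1.0 - misaligned / trials


if __name__ == "__main__":
    rovers = [(0.0, 0.0), (10.0, 10.0)]
    pois = [(1.0, 0.0, 5.0), (10.0, 11.0, 5.0)]  # (x, y, value)

    # Alignment: moves of rover 0 change G and its own reward in the same direction.
    print("alignment:", alignment_fraction(0, rovers, pois))

    # Sensitivity: move the *other* rover and compare how much each reward shifts.
    moved = [(0.0, 0.0), (10.0, 8.0)]
    dg = abs(global_reward(moved, pois) - global_reward(rovers, pois))
    dd = abs(difference_reward(0, moved, pois) - difference_reward(0, rovers, pois))
    print("change in G:", dg, "change in rover 0's reward:", dd)
```

In this toy setting the subtracted term does not depend on rover 0's own position, so its reward moves with the global reward whenever only rover 0 moves, while much of the effect of the other rover's action cancels out. A rover rewarded with the global reward directly would see its own contribution mixed with that of every other rover, which is the weak "signal" the abstract warns about.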


Keywords: Control Policy, Reward Function, Multi Layer Perceptron, Congestion Game, Sensitive Reward
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kagan Tumer (1)
  • Adrian Agogino (2)
  1. NASA Ames Research Center, Moffett Field, USA
  2. UC Santa Cruz, NASA Ames Research Center, Moffett Field, USA
