Reinforcement Learning for Cooperating and Communicating Reactive Agents in Electrical Power Grids

  • Martin Riedmiller
  • Andrew Moore
  • Jeff Schneider
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2103)


Social behaviour in intelligent agent systems is often considered to be achieved by deliberative, in-depth reasoning techniques. This paper shows how a purely reactive multi-agent system can evolve cooperative behaviour by learning from previous experience. In particular, we describe a learning multi-agent approach to the problem of controlling power flow in an electrical power grid. The problem is formulated within the framework of dynamic programming. Via a global optimization goal, a set of individual agents is forced to autonomously learn to cooperate and communicate. The ability of the purely reactive distributed system to solve the global problem by establishing a communication mechanism is shown on two prototypical network configurations.


Reinforcement Learning, Markov Decision Process, Successful Policy, Incoming Message, Learning Agent
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Martin Riedmiller (1)
  • Andrew Moore (2)
  • Jeff Schneider (2)
  1. Computer Science Dept., University of Karlsruhe, Karlsruhe, Germany
  2. Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
