Distributed MPC Using Reinforcement Learning Based Negotiation: Application to Large Scale Systems

  • B. Morcego
  • V. Javalera
  • V. Puig
  • R. Vito
Part of the Intelligent Systems, Control and Automation: Science and Engineering book series (ISCA, volume 69)


This chapter describes a methodology for handling the interaction (negotiation) between MPC controllers in a distributed MPC architecture. The approach combines ideas from Distributed Artificial Intelligence (DAI) and Reinforcement Learning (RL) to provide controller interaction based on negotiation, cooperation and learning techniques. The aim is to provide a general structure for optimal control in networked distributed environments, where there are multiple dependencies between subsystems. These dependencies, or connections, often correspond to control variables; in that case, the values applied by the distributed controllers must be consistent across the subsystems that share them. One of the main new concepts of this architecture is the negotiator agent. Negotiator agents interact with MPC agents to reach an agreement on the optimal value of the shared control variables. That value must satisfy a common goal, which may conflict with the specific goals of each partition sharing the variable. Two case studies are discussed: a small water distribution network and the Barcelona water network. The results suggest that this approach is a promising strategy when centralized control is not a reasonable choice.
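The abstract does not spell out the negotiation algorithm, so the following is only a minimal illustrative sketch of the general idea, not the chapter's method. It assumes a single shared control variable, two hypothetical quadratic costs standing in for the MPC agents (`cost_a`, `cost_b`), a small discrete set of candidate values, and a single-state, tabular Q-learning update with epsilon-greedy exploration. All names and parameter values are assumptions made for illustration.

```python
import random


def negotiate(cost_a, cost_b, candidates, episodes=2000,
              alpha=0.1, epsilon=0.2, seed=0):
    """Toy negotiator: learn which shared value minimizes the common cost.

    Single-state Q-learning, so the update has no next-state term.
    This is an assumed simplification, not the chapter's formulation.
    """
    rng = random.Random(seed)
    q = [0.0] * len(candidates)  # one Q-value per candidate shared value
    for _ in range(episodes):
        if rng.random() < epsilon:
            i = rng.randrange(len(candidates))                   # explore
        else:
            i = max(range(len(candidates)), key=q.__getitem__)   # exploit
        u = candidates[i]
        # Common goal: negative sum of the two agents' local costs.
        reward = -(cost_a(u) + cost_b(u))
        q[i] += alpha * (reward - q[i])
    best = max(range(len(candidates)), key=q.__getitem__)
    return candidates[best], q


# Hypothetical local goals: agent A prefers u = 2, agent B prefers u = 6.
agreed, q_values = negotiate(lambda u: (u - 2) ** 2,
                             lambda u: (u - 6) ** 2,
                             candidates=list(range(9)))
print("agreed shared value:", agreed)
```

In this toy setting neither agent obtains its preferred value; the negotiator settles on the candidate that minimizes the summed cost, which illustrates how a common goal can conflict with each partition's local goal.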


Agent Negotiation; Negotiation Variables; State Transition Probability Function; Shared Variables; Discrete-time Stochastic Control Problem



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Advanced Control Systems Group, Terrassa, Spain
