Game-Theory-Based Consensus Learning of Double-Integrator Agents in the Presence of Worst-Case Adversaries

  • Kyriakos G. Vamvoudakis
  • João P. Hespanha


This work proposes a game-theory-based technique for guaranteeing consensus in unreliable networks by satisfying local objectives. The multi-agent problem is addressed in a distributed framework in which every agent must find its best controller against a worst-case adversary so that the agents in the networked team reach agreement. Constructing such controllers requires solving a system of coupled partial differential equations, which is typically not feasible. The proposed algorithm instead uses three approximators for each agent: one for the value function, one for the control law, and one for the worst-case adversary. The tuning laws for each controller and adversary are driven by their neighboring controllers and adversaries, respectively, and neither the controller nor the adversary knows the other’s policy. A Lyapunov stability proof ensures that all signals remain bounded and that consensus is reached asymptotically. Simulation results demonstrate the efficacy of the proposed approach.
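For readers unfamiliar with the setting, the following is a minimal sketch of what consensus among double-integrator agents means. It is not the learning-based controller summarized above: it uses the textbook second-order consensus protocol, with no adversary and no approximators, and the graph, initial conditions, gain `gamma`, step size, and the function name `simulate_consensus` are all illustrative assumptions.

```python
# Illustrative assumption: a standard second-order consensus protocol,
# NOT the game-theoretic learning algorithm of the paper (no adversary modeled).


def simulate_consensus(adjacency, positions, velocities,
                       gamma=1.0, dt=0.01, steps=6000):
    """Forward-Euler simulation of double-integrator agents.

    Each agent i obeys p_i' = v_i, v_i' = u_i with the protocol
    u_i = -sum_j a_ij * ((p_i - p_j) + gamma * (v_i - v_j)).
    """
    p = list(positions)
    v = list(velocities)
    n = len(p)
    for _ in range(steps):
        u = []
        for i in range(n):
            u_i = 0.0
            for j in range(n):
                a_ij = adjacency[i][j]
                if a_ij:
                    # Penalize position and velocity disagreement with neighbor j.
                    u_i -= a_ij * ((p[i] - p[j]) + gamma * (v[i] - v[j]))
            u.append(u_i)
        # Double-integrator update: positions use the old velocities.
        p_next = [p[i] + dt * v[i] for i in range(n)]
        v_next = [v[i] + dt * u[i] for i in range(n)]
        p, v = p_next, v_next
    return p, v


# Hypothetical example: four agents on an undirected path graph 0-1-2-3.
ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

p_final, v_final = simulate_consensus(
    ADJ, positions=[0.0, 2.0, -1.0, 5.0], velocities=[0.0, 0.0, 0.0, 0.0]
)
spread = max(p_final) - min(p_final)  # disagreement among agents at the end
```

Because the example graph is undirected and the agents start at rest, the positions settle near the average of the initial positions and the velocities decay toward zero. The approach described in the abstract differs in that each agent learns its control law online while a worst-case adversary acts on the network.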


Keywords: Game theory; Consensus; Hamilton–Jacobi equations; Optimization; Security



This material is based upon work supported in part by NATO under Grant No. SPS G5176, by ONR Minerva under Grant No. N00014-18-1-2160, by an NSF CAREER award, by NAWCAD under Grant No. N00421-16-2-0001, and by a US Office of Naval Research MURI under Grant No. N00014-16-1-2710.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Kevin T. Crofton Department of Aerospace and Ocean Engineering, Virginia Tech, Blacksburg, USA
  2. Department of Electrical and Computer Engineering, University of California, Santa Barbara, USA
