Revisited: Machine Intelligence in Heterogeneous Multi-Agent Systems

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 622)


Machine-learning techniques have been widely applied to decision-making problems, and in complex environments they tend to perform better than other algorithms. Recent developments in neural networks have enabled reinforcement-learning techniques to provide optimal policies for sophisticated and capable agents. In this paper, we explore some recently applied algorithms that are based on the interaction of multiple agents and their components, and we survey reinforcement-learning techniques for solving complex, real-world scenarios.


Machine learning, Heterogeneous systems, Multi-agent systems, Q-learning



I would like to thank my wife, Priyanka Talukdar, a research scholar in the Department of Civil Engineering at IIT-Guwahati (India), for her valuable suggestions in shaping this paper. This survey was funded by the Natural Sciences and Engineering Research Council (NSERC) of Canada and by my supervisor at Ryerson University, Canada.



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Department of Aerospace Engineering, Ryerson University, Toronto, Canada
  2. Department of Computer Science and Engineering, SMIT, Majitar, India
