Adaptive Replication of Large-Scale Multi-agent Systems – Towards a Fault-Tolerant Multi-agent Platform

  • Zahia Guessoum
  • Nora Faci
  • Jean-Pierre Briot
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3914)


To construct and deploy large-scale multi-agent systems, we must address one of the fundamental issues of distributed systems: the possibility of partial failures. Fault tolerance is therefore an unavoidable concern for large-scale multi-agent systems. In this paper, we discuss the issues involved and propose an approach for supporting fault tolerance in multi-agent systems. The starting idea is to apply replication strategies to agents, with the most critical agents being replicated to guard against failures. Because the criticality of agents may evolve during the course of computation and problem solving, and because resources are bounded, we need to adapt the number of replicas of each agent dynamically and automatically in order to maximize reliability and availability. We describe our approach and the related mechanisms for evaluating the criticality of a given agent (based on application-level semantic information, e.g., interdependences, and on system-level statistical information, e.g., communication load), for deciding which strategy to apply (e.g., active or passive replication), and for parameterizing it (e.g., the number of replicas). We also report on experiments conducted with our prototype architecture, named DimaX.
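The idea of adapting replica counts to criticality under bounded resources can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or from DimaX: the `AgentStats` fields, the weighted-sum `criticality` function with its `alpha` weight, and the proportional `allocate_replicas` rule are all hypothetical. The choice between active and passive replication is not covered.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    """Hypothetical per-agent measurements, both normalized to [0, 1]."""
    name: str
    dependence: float  # application-level: how much other agents rely on this one
    comm_load: float   # system-level: this agent's share of message traffic


def criticality(stats: AgentStats, alpha: float = 0.5) -> float:
    """Illustrative criticality: a weighted sum of the two indicators."""
    return alpha * stats.dependence + (1 - alpha) * stats.comm_load


def allocate_replicas(agents, budget: int, alpha: float = 0.5) -> dict:
    """Split a fixed replica budget in proportion to criticality.

    Fractional shares are rounded down, and any leftover replicas go to
    the agents with the largest fractional remainders.
    """
    scores = {a.name: criticality(a, alpha) for a in agents}
    total = sum(scores.values())
    if total == 0 or budget <= 0:
        return {name: 0 for name in scores}
    shares = {name: budget * s / total for name, s in scores.items()}
    alloc = {name: int(share) for name, share in shares.items()}
    leftover = budget - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - alloc[n], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


agents = [
    AgentStats("broker", dependence=1.0, comm_load=0.5),
    AgentStats("worker", dependence=0.25, comm_load=0.25),
    AgentStats("logger", dependence=0.0, comm_load=0.0),
]
print(allocate_replicas(agents, budget=5))
```

Re-running `allocate_replicas` whenever the measurements change gives a simple form of dynamic adaptation: an agent whose criticality rises receives more of the fixed budget, at the expense of less critical agents.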


Multiagent System Mobile Agent Agent Replication Agent Agent Monitoring Agent 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Zahia Guessoum (1, 2)
  • Nora Faci (2)
  • Jean-Pierre Briot (1)

  1. LIP6, Université Pierre et Marie Curie (Paris 6), Paris, France
  2. MODECO-CReSTIC, IUT de Reims, Reims, France
