Distributed storage of replicated beliefs to facilitate recovery of distributed intelligent agents
We address the problem of recovering the state of an agent after a hardware or software failure of the system. Under certain assumptions, we treat two sub-problems of agent recovery: replication and reincarnation. We provide an algorithm for distributed storage of replicated beliefs and prove its correctness formally. The algorithm allows multiple crashed agents to be reincarnated in a system of distributed autonomous intelligent agents. The scheme replicates each agent's beliefs and stores the copies in its immediately neighboring agents, and it uses distributed logical clocks to preserve causality and to terminate retransmission.
Key-words: Distributed fault tolerance, Multi-agent system, Recovery, Reliability
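The general idea named in the abstract (belief replicas held by immediate neighbors, ordered by logical clocks) can be illustrated with a minimal sketch. This is not the paper's algorithm, which is not reproduced here. It assumes Lamport-style scalar clocks, reliable in-process delivery, and that all of a crashed agent's neighbors survive. The names `Agent`, `update_belief`, `receive_replica`, and `reincarnate` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Agent:
    """Hypothetical agent that replicates its beliefs to its neighbors."""
    name: str
    clock: int = 0                                  # Lamport-style logical clock (assumption)
    beliefs: dict = field(default_factory=dict)     # key -> (value, timestamp)
    replicas: dict = field(default_factory=dict)    # owner name -> {key: (value, timestamp)}
    neighbors: list = field(default_factory=list)   # immediately neighboring agents

    def update_belief(self, key, value):
        # Tick the local clock, record the belief, and push a copy to each neighbor.
        self.clock += 1
        self.beliefs[key] = (value, self.clock)
        for neighbor in self.neighbors:
            neighbor.receive_replica(self.name, key, value, self.clock)

    def receive_replica(self, owner, key, value, ts):
        # Lamport receive rule: advance past the sender's timestamp.
        self.clock = max(self.clock, ts) + 1
        store = self.replicas.setdefault(owner, {})
        # Keep only the causally latest copy of each belief.
        if key not in store or store[key][1] < ts:
            store[key] = (value, ts)


def reincarnate(name, neighbors):
    """Rebuild a crashed agent's beliefs from the replicas its neighbors hold."""
    agent = Agent(name=name, neighbors=list(neighbors))
    for neighbor in neighbors:
        for key, (value, ts) in neighbor.replicas.get(name, {}).items():
            if key not in agent.beliefs or agent.beliefs[key][1] < ts:
                agent.beliefs[key] = (value, ts)
            # Resume the clock at least at the latest replicated timestamp.
            agent.clock = max(agent.clock, ts)
    return agent


a, b, c = Agent("a"), Agent("b"), Agent("c")
a.neighbors = [b, c]
a.update_belief("door", "open")
a.update_belief("door", "closed")
restored = reincarnate("a", [b, c])   # simulate recovery of "a" after a crash
```

In this sketch the timestamp serves two purposes: a neighbor discards a stale copy that arrives after a newer one, and the recovering agent picks the latest value when neighbors disagree. Terminating retransmission, which the abstract also attributes to the logical clocks, is not modeled here.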