Active Fault-Tolerant System for Open Distributed Computing
Computer systems are growing in complexity and sophistication as open distributed systems and new technologies are adopted to achieve higher reliability and performance. Open distributed systems are among the most successful structures designed for the computing community, and they offer clear benefits to users. However, this structure also introduces side effects, most notably unanticipated runtime events and the reconfiguration burden imposed by environmental changes. In this paper, we design a model that exploits knowledge of pre-fault behavior to predict suspected environmental faults and failures. The model can also analyse the current behavior of the underlying environment in terms of current faults and failures. It therefore provides both proactive and real-time fault-tolerant approaches to address unanticipated events and unpredictable hazards in distributed systems. Providing active fault tolerance in this way could have a major impact, given the growing need for autonomic computing to help distributed systems cope with their rapidly increasing complexity and to enable their further growth.
Keywords: Communication Cost, Replica Placement, Replication Protocol, Current Failure, Advance Information Networking