Abstract
Replication is well suited to providing scalable fault tolerance for distributed safety-critical applications. This article therefore introduces a component-based software infrastructure that optimizes availability at the cost of a well-controlled violation of integrity constraints. To demonstrate the feasibility of the proposed architecture, we present the relevant details of a successful implementation in the telecommunication domain.
Summary
Replication is excellently suited to equipping safety-critical distributed systems with scalable fault tolerance. The contribution of this article therefore lies primarily in presenting a component-based software infrastructure that optimizes availability at the cost of a precisely controlled violation of the integrity constraints. To demonstrate the usefulness of the presented architecture, relevant details from the implementation of these concepts in the area of telecommunication applications are presented.
Cite this article
Göschka, K.M., Smeikal, R. Using replication for increased availability of a distributed telecommunication management system. Elektrotech. Inftech. 121, 187–193 (2004). https://doi.org/10.1007/BF03055334
DOI: https://doi.org/10.1007/BF03055334
Keywords
- fault-tolerance
- replication
- transparent distribution and persistence
- trading consistency versus availability
- distributed systems