Fault Tolerance in Distributed Computing

Storm, Christian

doi:10.1007/978-3-8348-2381-6_2

Abstract

A distributed system consists of several independent processing components that interact with each other over an interconnection network made up of communication components. Distributed computing refers to the algorithmic control of the distributed system’s processing components by means of a distributed program in order to reach a collective goal, that is, to provide a certain service. Unfortunately, the components of practically every system are imperfect and therefore prone to failures that may render the system unable to provide the service. To tolerate the failure of some components, that is, to keep the service available despite these failures, the system must be equipped with redundancy in space and time. The former refers to redundant components that take over the role of failed components. The latter refers to the additional overhead required to manage these components. Fault-tolerant distributed computing refers to the algorithmic control of the distributed system’s components so that the desired service is provided despite the presence of certain failures in the system, by exploiting redundancy in space and time.
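
As an illustration only, the idea of redundancy in space and time can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the scheme developed in the chapter: the names `ComponentFailure`, `make_component`, and `fault_tolerant_call` are hypothetical, failures are modelled as exceptions, and the attempt counter is a crude stand-in for the management overhead the abstract calls redundancy in time.

```python
class ComponentFailure(Exception):
    """Raised by a (hypothetical) processing component that has failed."""


def make_component(name, healthy):
    """Return a hypothetical processing component as a callable."""
    def serve(request):
        if not healthy:
            raise ComponentFailure(name)
        return f"{name} served {request}"
    return serve


def fault_tolerant_call(components, request):
    """Try each redundant component in turn until one provides the service.

    Returns (result, attempts). The attempts count includes the work spent
    on failed components, a simple proxy for the overhead of managing
    redundant components.
    """
    attempts = 0
    for component in components:
        attempts += 1
        try:
            return component(request), attempts
        except ComponentFailure:
            continue  # redundancy in space: the next component takes over
    raise RuntimeError("all redundant components failed; service unavailable")


# Assumed scenario: two of three redundant components have failed.
components = [
    make_component("primary", healthy=False),
    make_component("backup-1", healthy=False),
    make_component("backup-2", healthy=True),
]
result, attempts = fault_tolerant_call(components, "read x")
print(result, attempts)
```

In this sketch the service stays available as long as at least one component is healthy, and the price of tolerating the two failures is the two extra attempts before `backup-2` answers.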

Author information

Authors and Affiliations

Oldenburg, Germany
Christian Storm

About this chapter

Cite this chapter

Storm, C. (2012). Fault Tolerance in Distributed Computing. In: Specification and Analytical Evaluation of Heterogeneous Dynamic Quorum-Based Data Replication Schemes. Vieweg+Teubner Verlag. https://doi.org/10.1007/978-3-8348-2381-6_2

DOI: https://doi.org/10.1007/978-3-8348-2381-6_2
Publisher Name: Vieweg+Teubner Verlag
Print ISBN: 978-3-8348-2380-9
Online ISBN: 978-3-8348-2381-6
eBook Packages: Computer Science (R0)
