Brief Announcement: Fault-Tolerant Object Location in Large Compute Clusters
A so-called single system image (SSI) allows threads in a distributed shared memory (DSM) system to access data and other resources in a location-transparent manner. In this brief announcement, we present our ongoing research on fault-tolerant algorithms that locate objects in such systems. In particular, we build our algorithms on a multi-version, object-based software transactional memory (STM) system, in which each object is represented as a sequence of immutable object versions. Our algorithms are fully decentralized and allow resources to be added and removed at run-time without disturbing the application.
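To make the notion of an object as a sequence of immutable versions concrete, the following is a minimal sketch only. It does not reproduce the location or fault-tolerance algorithms of this work; the names `ObjectVersion`, `commit`, and `read_at`, and all field names, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ObjectVersion:
    """One immutable version of a shared object (hypothetical layout)."""
    object_id: str
    number: int                                   # position in the version sequence
    payload: Any
    predecessor: Optional["ObjectVersion"] = None  # link to the previous version


def commit(latest: ObjectVersion, payload: Any) -> ObjectVersion:
    """Create a successor version; the existing version is never modified."""
    return ObjectVersion(latest.object_id, latest.number + 1, payload, latest)


def read_at(latest: ObjectVersion, number: int) -> Optional[ObjectVersion]:
    """Walk back from the latest version to the requested one, or return None."""
    version = latest
    while version is not None and version.number > number:
        version = version.predecessor
    if version is not None and version.number == number:
        return version
    return None
```

Because a version never changes once it has been created, a copy held on any node stays valid, which is one reason such a layout is a plausible basis for decentralized lookup; how versions are actually placed and located is outside the scope of this sketch.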
Keywords: Storage Location, Object Reference, Consensus Protocol, Object Version, Distributed Shared Memory