Abstract
Distributed systems commonly replicate data to improve dependability. In such systems, a logical update to a data item results in physical updates to several copies. The synchronization and communication required to keep the copies of replicated data consistent delay the operations being performed. In systems distributed over a bandwidth-constrained area, such delays are generally unacceptable, and passive asynchronous replication is often used to mitigate them. This paper develops a methodology for passive asynchronous replication that introduces an adaptive data replication scheduler to increase the redundancy of the most valued data objects in overloaded situations. The approach relies on the batch-processing nature of passive asynchronous replication to make update decisions according to a defined policy. The adaptive scheduling algorithm uses a trained multilayer perceptron (MLP) neural network to select, in near real time, which objects to replicate during overloaded conditions. Historical replication logs are used for the initial training of the network, and supervised training continues as new replication logs are created, providing periodic improvement through a feedback mechanism. The paper presents summary results of data replication scheduling simulations, with an emphasis on the design selection of the adaptive distributed scheduler.
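To make the selection problem concrete, the following minimal Python sketch contrasts first come first served (FCFS) with a value-driven choice when a replication batch exceeds the available bandwidth. It is an illustration only, not the paper's scheduler: it swaps in a simple greedy value-per-size heuristic for the knapsack problem rather than a trained MLP, and the names `PendingUpdate`, `size`, `value`, and the sample queue are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingUpdate:
    """Hypothetical record for one queued replication update."""
    name: str
    size: int     # assumed transfer cost in bandwidth units (must be > 0)
    value: float  # assumed importance assigned by the replication policy


def select_fcfs(queue, capacity):
    """Replicate in arrival order; stop at the first update that does not fit."""
    chosen, used = [], 0
    for update in queue:
        if used + update.size > capacity:
            break
        chosen.append(update)
        used += update.size
    return chosen


def select_by_value_density(queue, capacity):
    """Greedy knapsack heuristic: highest value per unit of size first,
    skipping any update that no longer fits in the remaining capacity."""
    chosen, used = [], 0
    ranked = sorted(queue, key=lambda u: u.value / u.size, reverse=True)
    for update in ranked:
        if used + update.size <= capacity:
            chosen.append(update)
            used += update.size
    return chosen


def total_value(updates):
    """Sum of the policy values of the selected updates."""
    return sum(u.value for u in updates)


# Illustrative overloaded batch: 14 units queued, only 8 units of capacity.
queue = [
    PendingUpdate("log_archive", 6, 2.0),
    PendingUpdate("track_data", 3, 9.0),
    PendingUpdate("config", 1, 4.0),
    PendingUpdate("telemetry", 4, 5.0),
]
capacity = 8

if __name__ == "__main__":
    print("FCFS:", [u.name for u in select_fcfs(queue, capacity)])
    print("Value density:", [u.name for u in select_by_value_density(queue, capacity)])
```

In this made-up batch, FCFS spends most of the capacity on the first, low-value update, while the value-driven selection replicates the three more valuable objects. The greedy heuristic is not guaranteed to be optimal for the knapsack problem in general; it only shows why a policy-aware selection can protect more valued data than arrival order under overload.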
Abbreviations
- FCFS: First come first served
- KP: Knapsack problem
- MLP: Multilayer perceptron
- MKP: Multiple knapsack problem
- RPO: Recovery point objective
Additional information
This research was supported in part by a research fellowship from the Naval Surface Warfare Center, Dahlgren Division. The views, opinions, and findings contained herein are those of the authors and should not be construed as an official Department of Defense position, policy, or decision.
Cite this article
Adams, K., Gračanin, D. Using adaptive scheduling for increased resiliency in passive asynchronous replication. Innovations Syst Softw Eng 3, 333–344 (2007). https://doi.org/10.1007/s11334-007-0037-9
DOI: https://doi.org/10.1007/s11334-007-0037-9