
Using adaptive scheduling for increased resiliency in passive asynchronous replication

Abstract

Distributed systems commonly replicate data to enhance system dependability. In such systems, a logical update to a data item results in a physical update to a number of copies. The synchronization and communication required to keep the copies of replicated data consistent introduce a delay when operations are performed. In systems distributed over a bandwidth-constrained area, such operational delays generally prove unacceptable, and passive asynchronous replication is often used to mitigate them. The research described in this paper aims to develop a new methodology for passive asynchronous replication that introduces an adaptive data replication scheduler to increase the data redundancy of the most valued data objects in overloaded situations. The approach relies on the batch-processing nature of passive asynchronous replication to make update decisions based on a defined policy. The methodology uses an adaptive scheduling algorithm, based on a trained multilayer perceptron neural network, that provides near real-time selection of which objects to replicate during overloaded conditions. Historical replication logs are used for the initial training of the network, and supervised training continues as new replication logs are created, providing periodic improvement through a feedback mechanism. This paper presents summary results of data replication scheduling simulations, with an emphasis on the design selection of the adaptive distributed scheduler.
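The selection problem the abstract describes, choosing which queued updates to replicate when a batch cannot carry all of them, can be pictured as a 0/1 knapsack problem. The sketch below is only an illustration of that framing and not the paper's scheduler: it substitutes a simple greedy value-density heuristic for the trained multilayer perceptron and compares it against a first-come-first-served baseline. The `PendingUpdate` record, its `value` and `size` fields, the object names, and the capacity figure are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingUpdate:
    """A queued update awaiting replication (hypothetical record layout)."""
    object_id: str
    value: float  # assumed policy-assigned importance of the object
    size: int     # assumed transfer cost in arbitrary units, must be > 0


def select_fcfs(queue, capacity):
    """Baseline: replicate in arrival order until the next update does not fit."""
    chosen, used = [], 0
    for update in queue:
        if used + update.size > capacity:
            break
        chosen.append(update)
        used += update.size
    return chosen


def select_by_value_density(queue, capacity):
    """Greedy knapsack heuristic: take the highest value per unit size first."""
    chosen, used = [], 0
    ranked = sorted(queue, key=lambda u: u.value / u.size, reverse=True)
    for update in ranked:
        if used + update.size <= capacity:
            chosen.append(update)
            used += update.size
    return chosen


def total_value(updates):
    """Sum of the policy values of the selected updates."""
    return sum(u.value for u in updates)


# Hypothetical overloaded batch: 140 units queued, room for only 80.
queue = [
    PendingUpdate("log-archive", value=1.0, size=60),
    PendingUpdate("track-data", value=9.0, size=30),
    PendingUpdate("config", value=4.0, size=10),
    PendingUpdate("telemetry", value=3.0, size=40),
]
capacity = 80

fcfs = select_fcfs(queue, capacity)
greedy = select_by_value_density(queue, capacity)
print([u.object_id for u in fcfs], total_value(fcfs))
print([u.object_id for u in greedy], total_value(greedy))
```

In this toy batch the arrival-order baseline spends most of the capacity on a low-value object, while the value-aware selection fits the three more valuable ones. A greedy heuristic is not guaranteed to be optimal for the knapsack problem; it stands in here only to make the shape of the decision concrete.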


Abbreviations

FCFS: First come first served

KP: Knapsack problem

MLP: Multilayer perceptron

MKP: Multiple knapsack problem

RPO: Recovery point objective


Author information

Correspondence to Denis Gračanin.

Additional information

This research was supported in part by a research fellowship from the Naval Surface Warfare Center, Dahlgren Division. The views, opinions, and findings contained herein are those of the authors and should not be construed as an official Department of Defense position, policy, or decision.


About this article

Cite this article

Adams, K., Gračanin, D. Using adaptive scheduling for increased resiliency in passive asynchronous replication. Innovations Syst Softw Eng 3, 333–344 (2007). https://doi.org/10.1007/s11334-007-0037-9


Keywords

  • Data Replication
  • Policy-based backup
  • Backup/recovery
  • Algorithms
  • Neural Networks