Abstract
Many time-critical applications require predictable performance in the presence of failures. This paper considers a distributed system with independent periodic tasks which can checkpoint their state on some reliable medium in order to handle failures. The problem of preemptively scheduling a set of such tasks is discussed where every occurrence of a task has to be completely executed before the next occurrence of the same task can start. Efficient scheduling algorithms are proposed which yield sub-optimal schedules when there is provision for fault-tolerance. The performance of the solutions proposed is evaluated in terms of the number of processors and the cost of the checkpoints needed. Moreover, analytical studies are used to reveal interesting trade-offs associated with the scheduling algorithms.
Additional information
This work has been supported by grants from the Italian “Ministero dell'Università e della Ricerca Scientifica e Tecnologica” and the “Consiglio Nazionale delle Ricerche-Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo”.
About this article
Cite this article
Bertossi, A.A., Mancini, L.V. Scheduling algorithms for fault-tolerance in hard-real-time systems. Real-Time Syst 7, 229–245 (1994). https://doi.org/10.1007/BF01088520
DOI: https://doi.org/10.1007/BF01088520