Reliability aware scheduling of bag of real time tasks in cloud environment

Swain, Chinmaya Kumar; Saini, Neha; Sahu, Aryabartta

doi:10.1007/s00607-019-00749-w

Published: 10 August 2019

Volume 102, pages 451–475, (2020)

Abstract

A cloud environment relies on data centers with a huge number of computational resources, and the probability that at least one resource fails increases with scale. Failures make services unavailable, which reduces the reliability of the system. Application deployment in the cloud must therefore account for resource failures. In this work, we address reliability-aware scheduling of tasks with hard deadlines in the cloud environment. We design, analyze, and provide solutions for two special cases of the problem: (a) tasks with a common deadline on machines with equal failure rates, and (b) tasks with equal execution times. For the general case, we propose two-phase heuristic approaches: the first phase orders the tasks, and the second maps the tasks to machines. We evaluate the different task-ordering and task-mapping approaches through simulation using synthetic and real traces. The simulation results show that earliest-due-date ordering of tasks, combined with mapping the current task to the most reliable machine and dropping long tasks, performs better in general settings. We also observe that task repetition and replication further improve the performance of the heuristics.
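
The two-phase idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the failure model or the exact long-task dropping rule, so the sketch assumes a constant failure rate per machine (a task of length t succeeds with probability exp(-λt)), non-preemptive execution, and a simplified rule that drops a task when no machine can finish it by its deadline. The names `Task`, `Machine`, `edd_most_reliable`, and `replicated_success` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    exec_time: float  # execution time (illustrative units)
    deadline: float   # hard deadline, same units


@dataclass
class Machine:
    name: str
    failure_rate: float      # assumed constant failure rate (lambda)
    busy_until: float = 0.0  # time at which the machine becomes free


def success_probability(machine, exec_time):
    # Assumption: failures follow a Poisson process, so a task of
    # length t runs without failure with probability exp(-lambda * t).
    return math.exp(-machine.failure_rate * exec_time)


def edd_most_reliable(tasks, machines):
    """Phase 1: order tasks by earliest due date (EDD).
    Phase 2: map each task to the most reliable machine that can
    still meet its deadline; drop the task if none can (a simplified
    stand-in for the paper's dropping rule)."""
    schedule, dropped = [], []
    for task in sorted(tasks, key=lambda t: t.deadline):
        feasible = [m for m in machines
                    if m.busy_until + task.exec_time <= task.deadline]
        if not feasible:
            dropped.append(task.name)
            continue
        best = min(feasible, key=lambda m: m.failure_rate)
        start = best.busy_until
        best.busy_until = start + task.exec_time
        schedule.append((task.name, best.name, start,
                         success_probability(best, task.exec_time)))
    return schedule, dropped


def replicated_success(probabilities):
    # Assumption: replicas fail independently, so the task succeeds
    # unless every replica fails.
    return 1.0 - math.prod(1.0 - p for p in probabilities)


if __name__ == "__main__":
    tasks = [Task("A", 2, 2), Task("B", 3, 4), Task("C", 1, 5)]
    machines = [Machine("m1", 0.01), Machine("m2", 0.05)]
    schedule, dropped = edd_most_reliable(tasks, machines)
    for name, machine, start, prob in schedule:
        print(f"{name} -> {machine} at t={start}, success={prob:.4f}")
    print("dropped:", dropped)
```

Picking the machine with the lowest failure rate among the feasible ones is the greedy choice that maximizes the per-task success probability under the assumed exponential model. `replicated_success` shows why replication can help under the independence assumption: two replicas that each succeed with probability 0.9 together succeed with probability 0.99.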

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, India
Chinmaya Kumar Swain, Neha Saini & Aryabartta Sahu

Corresponding author

Correspondence to Chinmaya Kumar Swain.

About this article

Cite this article

Swain, C.K., Saini, N. & Sahu, A. Reliability aware scheduling of bag of real time tasks in cloud environment. Computing 102, 451–475 (2020). https://doi.org/10.1007/s00607-019-00749-w

Received: 12 January 2019
Accepted: 05 August 2019
Published: 10 August 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s00607-019-00749-w
