Cluster Computing, Volume 5, Issue 3, pp 225–236

Online Prediction of the Running Time of Tasks

  • Peter A. Dinda


We describe and evaluate the Running Time Advisor (RTA), a system that can predict the running time of a compute-bound task on a typical shared, unreserved commodity host. The prediction is computed from linear time series predictions of host load and takes the form of a confidence interval that neatly expresses the error associated with the measurement and prediction processes – error that must be captured to make statistically valid decisions based on the predictions. Adaptive applications make such decisions in pursuit of consistently high performance, choosing, for example, the host where a task is most likely to meet its deadline. We begin by describing the system and summarizing the results of our previously published work on host load prediction. We then describe our algorithm for computing predictions of running time from host load predictions. We next evaluate the system using over 100,000 randomized test cases run on 39 different hosts, finding that it is indeed capable of computing correct and useful confidence intervals. Finally, we report on our experience with using the RTA in application-oriented real-time scheduling in distributed systems.

Keywords: performance prediction, performance analysis, adaptive applications
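The core idea in the abstract can be illustrated with a small sketch. This is not the paper's exact algorithm, only a hypothetical simplification: on a time-shared host, a compute-bound task running alongside an average load of L receives roughly a 1/(1+L) fraction of the CPU, so a task needing `work` seconds of CPU runs for about `work * (1 + L)` wall-clock seconds. Given a load forecast with a mean and an error standard deviation, the forecast error can be propagated into a confidence interval on the running time. The function names and the normal-error assumption below are illustrative, not taken from the paper.

```python
import math

def inverse_erf(x):
    # Winitzki's approximation to the inverse error function;
    # accurate enough for illustrative confidence levels.
    a = 0.147
    ln = math.log(1 - x * x)
    t = 2 / (math.pi * a) + ln / 2
    return math.copysign(math.sqrt(math.sqrt(t * t - ln / a) - t), x)

def running_time_ci(work, load_mean, load_std, confidence=0.95):
    """Return (lower, upper) bounds on the predicted running time.

    work      -- CPU demand of the task in seconds (time on an idle host)
    load_mean -- predicted average host load over the task's execution
    load_std  -- standard deviation of the load-prediction error
    """
    # Two-sided normal quantile for the requested confidence level,
    # assuming (hypothetically) normally distributed prediction error.
    z = math.sqrt(2) * inverse_erf(confidence)
    lo = work * (1 + max(0.0, load_mean - z * load_std))
    hi = work * (1 + load_mean + z * load_std)
    return lo, hi

# A 10-second task on a host whose load is forecast at 0.5 +/- 0.1.
lo, hi = running_time_ci(work=10.0, load_mean=0.5, load_std=0.1)
print(f"95% CI for running time: [{lo:.2f}, {hi:.2f}] seconds")
```

An adaptive application could call such a routine for each candidate host and pick the one whose upper bound falls below the task's deadline, which is the kind of statistically valid decision the abstract describes.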





Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Peter A. Dinda
    Department of Computer Science, Northwestern University, USA
