Comparing Logs and Models of Parallel Workloads Using the Co-plot Method

  • David Talby
  • Dror G. Feitelson
  • Adi Raveh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1659)

Abstract

We present a multivariate analysis technique called Co-plot that is especially suitable for samples with many variables and relatively few observations, as is often the case with workload data. Observations and variables are analyzed simultaneously. We find three stable clusters of highly correlated variables, whereas the workloads themselves are rather different from one another. Synthetic models for workload generation are also analyzed and found to be reasonable; however, each model usually covers one machine type well. This leads us to conclude that a parameterized model of parallel workloads should be built, and we describe guidelines for such a model. Another feature that the models lack is self-similarity: we demonstrate that production logs exhibit this phenomenon in several attributes of the workload, whereas none of the synthetic models do.
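As an illustration of how self-similarity in a workload attribute can be quantified, the sketch below estimates the Hurst parameter with the variance-time (aggregated variance) method. This is one common estimator and is not claimed to be the one used in the paper. The function names, the block sizes, and the synthetic input series are all assumptions made for the example. The method relies on the variance of block means scaling as m^(2H-2) for block size m, so H is recovered from the slope of a log-log fit.

```python
import math
import random


def block_means(series, m):
    """Average the series over consecutive, non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(values):
    """Sample variance with the usual n-1 denominator."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def hurst_variance_time(series, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate the Hurst parameter H from the variance-time relation.

    The variance of the aggregated series behaves like m**(2H - 2),
    so H = 1 + slope / 2, where slope is fitted on log-log axes.
    The default block sizes are an arbitrary choice for this sketch.
    """
    xs = [math.log(m) for m in block_sizes]
    ys = [math.log(variance(block_means(series, m))) for m in block_sizes]
    # Ordinary least-squares slope of ys against xs.
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0


# Hypothetical input: independent Gaussian noise standing in for a
# workload attribute (for example, per-job run times in arrival order).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
h_noise = hurst_variance_time(noise)
print(f"Estimated H for independent noise: {h_noise:.2f}")
```

For a series with no long-range dependence the estimate should fall near 0.5, while values approaching 1 indicate self-similarity. Applied to a real log, the same function would be called on a single attribute extracted in job-arrival order.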

Keywords

Synthetic Model, Hurst Parameter, Workload Model, Processor Allocation, Synthetic Workload
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • David Talby (1)
  • Dror G. Feitelson (1)
  • Adi Raveh (2)
  1. Institute of Computer Science, The Hebrew University, Jerusalem, Israel
  2. Department of Business Administration, The Hebrew University, Jerusalem
