Predicting application run times using historical information

  • Warren Smith
  • Ian Foster
  • Valerie Taylor
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1459)


We present a technique for deriving predictions for the run times of parallel applications from the run times of “similar” applications that have executed in the past. The novel aspect of our work is the use of search techniques to determine those application characteristics that yield the best definition of similarity for the purpose of making predictions. We use four workloads recorded from parallel computers at Argonne National Laboratory, the Cornell Theory Center, and the San Diego Supercomputer Center to evaluate the effectiveness of our approach. We show that on these workloads our techniques achieve predictions that are between 14 and 60 percent better than those achieved by other researchers; our approach achieves mean prediction errors that are between 40 and 59 percent of mean application run times.
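The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the job records, the characteristic names (`user`, `executable`, `nodes`), the use of a plain mean as the predictor, and the single-template greedy search are all assumptions made for illustration. A "template" here is a tuple of characteristics; two jobs count as similar when they agree on every characteristic in the template. The error measure follows the abstract's wording: mean prediction error as a fraction of mean run time.

```python
from statistics import mean

# Hypothetical job history (illustrative values only): characteristics plus
# the observed run time in minutes, listed in order of completion.
HISTORY = [
    {"user": "alice", "executable": "cfd", "nodes": 16, "run_time": 100.0},
    {"user": "bob", "executable": "md", "nodes": 8, "run_time": 30.0},
    {"user": "alice", "executable": "cfd", "nodes": 16, "run_time": 120.0},
    {"user": "bob", "executable": "md", "nodes": 8, "run_time": 34.0},
    {"user": "alice", "executable": "md", "nodes": 8, "run_time": 40.0},
    {"user": "bob", "executable": "md", "nodes": 8, "run_time": 32.0},
]


def category_key(job, template):
    """Values of the template's characteristics for this job."""
    return tuple(job[c] for c in template)


def predict(history, job, template):
    """Mean run time of past jobs matching `job` on the template, or None."""
    key = category_key(job, template)
    similar = [h["run_time"] for h in history if category_key(h, template) == key]
    return mean(similar) if similar else None


def relative_error(jobs, template):
    """Replay jobs in order, predicting each from its predecessors.

    Returns mean absolute error divided by mean actual run time, computed
    over the jobs for which a prediction could be made.
    """
    errors, actuals = [], []
    for i, job in enumerate(jobs):
        guess = predict(jobs[:i], job, template)
        if guess is None:
            continue  # no similar job seen yet
        errors.append(abs(guess - job["run_time"]))
        actuals.append(job["run_time"])
    return mean(errors) / mean(actuals) if errors else float("inf")


def greedy_search(jobs, characteristics):
    """Grow one template by adding the characteristic that most lowers the error."""
    template, best_err = (), relative_error(jobs, ())
    while True:
        scored = [
            (relative_error(jobs, template + (c,)), template + (c,))
            for c in characteristics
            if c not in template
        ]
        if not scored:
            return template, best_err
        err, candidate = min(scored)
        if err >= best_err:
            return template, best_err
        template, best_err = candidate, err


if __name__ == "__main__":
    template, err = greedy_search(HISTORY, ("user", "executable", "nodes"))
    new_job = {"user": "alice", "executable": "cfd", "nodes": 16}
    print("template:", template, "relative error:", round(err, 3))
    print("predicted run time:", predict(HISTORY, new_job, template))
```

Note that this sketch scores each template only on the jobs it can predict, so a very specific template may look better simply because it skips hard cases; a fuller treatment would need to account for that.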


Keywords: Historical Information, Argonne National Laboratory, Greedy Search, Genetic Search, Good Template
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Warren Smith (1, 2)
  • Ian Foster (1)
  • Valerie Taylor (2)
  1. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne
  2. Electrical and Computer Engineering Department, Northwestern University, Evanston
