Compile-Time Based Performance Prediction

Cascaval, Calin; DeRose, Luiz; Padua, David A.; Reed, Daniel A.

doi:10.1007/3-540-44905-1_23

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1863)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

In this paper we present results we obtained using a compiler to predict performance of scientific codes. The compiler, Polaris [3], is both the primary tool for estimating the performance of a range of codes, and the beneficiary of the results obtained from predicting the program behavior at compile time. We show that a simple compile-time model, augmented with profiling data obtained using very light instrumentation, can be accurate within 20% (on average) of the measured performance for codes using both dense and sparse computational methods.
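The "within 20% (on average)" figure can be read as a mean relative error between predicted and measured run times. The sketch below is one plausible reading, not the authors' stated metric: the function name `mean_relative_error` and the sample timings are hypothetical and invented purely for illustration.

```python
def mean_relative_error(predicted, measured):
    """Average of |predicted - measured| / measured over paired samples.

    Assumption: this is one common way to express "accurate within X% on
    average"; the abstract does not say how its figure was computed.
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)


# Hypothetical run times in seconds (not taken from the paper).
predicted = [9.0, 22.0, 30.0]
measured = [10.0, 20.0, 40.0]
error = mean_relative_error(predicted, measured)
print(f"mean relative error: {error:.1%}")
```

Under this reading, a model meets the stated bound when the returned value is at most 0.20, even if individual codes deviate by more.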

This work is supported in part by Army contract DABT63-95-C-0097; Army contract N66001-97-C-8532; NSF contract MIP-9619351; and a Partnership Award from IBM. This work is not necessarily representative of the positions or policies of the Army or Government.

References

[1] T. Ball and J. R. Larus. Branch prediction for free. In Proceedings of the ACM SIGPLAN Conference on Programming Languages Design and Implementation ’93, pages 300–313, 1993.
[2] U. Banerjee. Dependence analysis. Kluwer Academic Publishers, 1997.
[3] W. Blume, R. Doallo, R. Eigenmann, J. Grout, J. Hoeflinger, T. Lawrence, J. Lee, D. Padua, Y. Paek, W. Pottenger, L. Rauchwerger, and P. Tu. Parallel Programming with Polaris. IEEE Computer, December 1996.
[4] R. Bramley, D. Gannon, T. Stuckey, J. Villacis, J. Balasubramanian, E. Akman, F. Breg, S. Diwan, and M. Govindaraju. The Linear System Analyzer, chapter PSEs. IEEE, 1998.
[5] C. Cascaval and D. A. Padua. Compile-time cache misses estimation using stack distances. In preparation.
[6] P. P. Chang, S. A. Mahlke, and W.-M. W. Hwu. Using profile information to assist classic compiler code optimizations. Software Practice and Experience, 21(12):1301–1321, December 1991.
[7] R. P. Colwell, R. P. Nix, J. J. O’Donnell, D. B. Papworth, and P. K. Rodman. A VLIW architecture for a trace scheduling compiler. In Proceedings of ASPLOS II, pages 180–192, Palo Alto, CA, October 1987.
[8] L. DeRose, Y. Zhang, and D. A. Reed. SvPablo: A multi-language performance analysis system. In 10th International Conference on Computer Performance Evaluation-Modelling Techniques and Tools-Performance Tools ’98, pages 352–355, Palma de Mallorca, Spain, September 1998.
[9] T. Fahringer. Evaluation of benchmark performance estimation for parallel Fortran programs on massively parallel SIMD and MIMD computers. In IEEE Proceedings of the 2nd Euromicro Workshop on Parallel and Distributed Processing, Malaga, Spain, January 1994.
[10] T. Fahringer. Automatic Performance Prediction of Parallel Programs. Kluwer Academic Press, 1996.
[11] T. Fahringer. Estimating cache performance for sequential and data parallel programs. Technical Report TR 97-9, Institute for Software Technology and Parallel Systems, Univ. of Vienna, Vienna, Austria, October 1997.
[12] J. A. Fisher. Trace scheduling: A technique for global microcode compaction. IEEE Transactions on Computers, C-30(7):478–490, July 1981.
[13] J. D. Gee, M. D. Hill, and A. J. Smith. Cache performance of the SPEC92 benchmark suite. In Proceedings of the IEEE Micro, pages 17–27, August 1993.
[14] S. Ghosh, M. Martonosi, and S. Malik. Precise Miss Analysis for Program Transformations with Caches of Arbitrary Associativity. In Proceedings of ASPLOS VIII, San Jose, CA, October 1998.
[15] M. D. Hill and A. J. Smith. Evaluating associativity in CPU caches. IEEE Transactions on Computers, 38(12):1612–1630, December 1989.
[16] Y. Kang, M. Huang, S.-M. Yoo, Z. Ge, D. Keen, V. Lam, P. Pattnaik, and J. Torrellas. FlexRAM: Toward an advanced intelligent memory system. In International Conference on Computer Design (ICCD), October 1999.
[17] R. L. Mattson, J. Gecsei, D. Slutz, and I. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal, 9(2), 1970.
[18] J. Reilly. SPEC95 Products and Benchmarks. SPEC Newsletter, September 1995.
[19] R. Saavedra and A. Smith. Measuring cache and TLB performance and their effect on benchmark run times. IEEE Transactions on Computers, 44(10):1223–1235, October 1995.
[20] R. H. Saavedra-Barrera and A. J. Smith. Analysis of benchmark characteristics and benchmark performance prediction. Technical Report CSD 92-715, Computer Science Division, UC Berkeley, 1992.
[21] R. H. Saavedra-Barrera, A. J. Smith, and E. Miya. Machine characterization based on an abstract high-level language machine. IEEE Transactions on Computers, 38(12):1659–1679, December 1989.
[22] V. Sarkar. Determining average program execution times and their variance. In Proceedings of the ACM SIGPLAN Conference on Programming Languages Design and Implementation ’89, pages 298–312, Portland, Oregon, July 1989.
[23] R. A. Sugumar and S. G. Abraham. Set-associative cache simulation using generalized binomial trees. ACM Trans. Comp. Sys., 13(1), 1995.
[24] J. G. Thompson and A. J. Smith. Efficient (stack) algorithms for analysis of write-back and sector memories. ACM Transactions on Computer Systems, 7(1), 1989.
[25] W.-H. Wang and J.-L. Baer. Efficient trace-driven simulation methods for cache performance analysis. ACM Transactions on Computer Systems, 9(3), 1991.

Author information

Authors and Affiliations

Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Calin Cascaval, Luiz DeRose, David A. Padua & Daniel A. Reed

Editor information

Editors and Affiliations

Department of Computer Science, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92093-0114, USA
Larry Carter & Jeanne Ferrante

About this paper

Cite this paper

Cascaval, C., DeRose, L., Padua, D.A., Reed, D.A. (2000). Compile-Time Based Performance Prediction. In: Carter, L., Ferrante, J. (eds) Languages and Compilers for Parallel Computing. LCPC 1999. Lecture Notes in Computer Science, vol 1863. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44905-1_23

DOI: https://doi.org/10.1007/3-540-44905-1_23
Published: 12 June 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67858-8
Online ISBN: 978-3-540-44905-8
eBook Packages: Springer Book Archive
