Load Scheduling with Profile Information

Lindenmaier, Götz; McKinley, Kathryn S.; Temam, Olivier

doi:10.1007/3-540-44520-X_31

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1900)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

Within the past five years, many manufacturers have added hardware performance counters to their microprocessors to generate profile data cheaply. We show how to use Compaq’s DCPI tool to determine load latencies at the granularity of individual instructions and use them to improve instruction scheduling. We validate our heuristic, which uses DCPI latency data to classify loads as hits and misses, against simulation numbers. We map our classification into the Multiflow compiler’s intermediate representation and use a locality-sensitive Balanced scheduling algorithm. Our experiments on a Compaq Alpha show that our algorithm improves run times by 1% on average and by up to 10%.
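
As a rough illustration of the classification step, the sketch below labels a load as a hit or a miss from the average stall cycles attributed to it per execution. It is not the heuristic from the paper: the `LoadProfile` record, its field names, and the threshold value are assumptions made for this example, and the DCPI output format is not modeled.

```python
from dataclasses import dataclass

# Assumed cut-off on average stall cycles per execution (hypothetical value;
# a real threshold would depend on the target machine's cache latencies).
MISS_THRESHOLD = 8.0


@dataclass
class LoadProfile:
    """Hypothetical per-load profile record, not a DCPI data structure."""
    pc: int            # address of the load instruction
    executions: int    # estimated number of times the load executed
    stall_cycles: int  # stall cycles attributed to this load


def classify_loads(profiles, threshold=MISS_THRESHOLD):
    """Label each load 'hit' or 'miss' from its average stall per execution."""
    labels = {}
    for p in profiles:
        if p.executions == 0:
            # No samples means no evidence of a miss; default to 'hit'.
            labels[p.pc] = "hit"
            continue
        average_stall = p.stall_cycles / p.executions
        labels[p.pc] = "miss" if average_stall > threshold else "hit"
    return labels


profiles = [
    LoadProfile(pc=0x100, executions=1000, stall_cycles=500),
    LoadProfile(pc=0x104, executions=1000, stall_cycles=20000),
    LoadProfile(pc=0x108, executions=0, stall_cycles=0),
]
labels = classify_loads(profiles)
```

A scheduler could then give loads labeled as misses a longer expected latency than loads labeled as hits. How the paper maps such labels into the Multiflow intermediate representation is not shown here.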

This work is supported by EU Project 28198; NSF grants EIA-9726401, CDA-9502639, and a CAREER Award CCR-9624209; DARPA grant 5-21425; Compaq; and LTR Esprit project 24942 MHAOTEU. Any opinions, findings, or conclusions expressed are the authors’ and not necessarily the sponsors’.

Author information

Authors and Affiliations

Fakultät für Informatik, Universität Karlsruhe, Germany
Götz Lindenmaier
Department of Computer Science, University of Massachusetts, USA
Kathryn S. McKinley
Laboratoire de recherche en informatique, Université de Paris Sud, France
Olivier Temam

Editor information

Editors and Affiliations

Institut für Informatik Lehrstuhl für Rechnertechnik und Rechnerorganisation, LRR-TUM, Technische Universität München, 80290, München, Deutschland
Arndt Bode, Thomas Ludwig, Wolfgang Karl & Roland Wismüller

About this paper

Cite this paper

Lindenmaier, G., McKinley, K.S., Temam, O. (2000). Load Scheduling with Profile Information. In: Bode, A., Ludwig, T., Karl, W., Wismüller, R. (eds) Euro-Par 2000 Parallel Processing. Euro-Par 2000. Lecture Notes in Computer Science, vol 1900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44520-X_31

DOI: https://doi.org/10.1007/3-540-44520-X_31
Published: 18 August 2000
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67956-1
Online ISBN: 978-3-540-44520-3
eBook Packages: Springer Book Archive
