European Conference on Parallel Processing

Euro-Par 1997: Euro-Par'97 Parallel Processing, pp 1079–1083

Modulo scheduling with cache reuse information

Workshop 17: Instruction-Level Parallelism

  • Chen Ding (1),
  • Steve Carr (2) &
  • Phil Sweany (2)
  • Conference paper
  • First Online: 01 January 2005

Part of the Lecture Notes in Computer Science book series (LNCS, volume 1300)

Abstract

Software pipelining for instruction-level parallel computers with non-blocking caches usually assigns memory access latencies by assuming either that all accesses are cache hits or that all are cache misses. We contend that setting memory latencies by cache reuse analysis leads to better software pipelining than either an all-hit or an all-miss assumption. Using a simple cache-reuse model, our software pipelining optimization improved execution performance by 10% over an all-cache-hit assumption and used 18% fewer registers than an all-cache-miss assumption requires. We conclude that software pipelining for architectures with non-blocking caches should incorporate a memory-reuse model.
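The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the latency values, the `Load` record, the `has_reuse` flag, and the register estimate (one register per `ceil(latency / II)` overlapping iterations) are all assumptions chosen for illustration, and the paper's actual reuse model is not reproduced here.

```python
import math
from dataclasses import dataclass

# Hypothetical latencies in cycles; these numbers are not from the paper.
HIT_LATENCY = 2
MISS_LATENCY = 20


@dataclass
class Load:
    name: str
    has_reuse: bool  # assumed output of some reuse analysis: True means "expect a hit"


def assign_latency(load: Load, policy: str) -> int:
    """Latency the scheduler would assume for one load under a given policy."""
    if policy == "all-hit":
        return HIT_LATENCY
    if policy == "all-miss":
        return MISS_LATENCY
    if policy == "reuse":
        return HIT_LATENCY if load.has_reuse else MISS_LATENCY
    raise ValueError(f"unknown policy: {policy}")


def register_estimate(loads: list[Load], policy: str, ii: int) -> int:
    """Rough proxy: a value in flight for `latency` cycles overlaps
    ceil(latency / ii) iterations of a loop with initiation interval ii,
    so it needs about that many registers."""
    return sum(math.ceil(assign_latency(ld, policy) / ii) for ld in loads)


# Three hypothetical loads in a loop body; two are predicted to hit.
loads = [Load("a", True), Load("b", True), Load("c", False)]
for policy in ("all-hit", "reuse", "all-miss"):
    print(policy, register_estimate(loads, policy, ii=4))
```

Under these assumed numbers, the reuse-based policy falls between the two extremes in register demand, while still scheduling the predicted miss at its full latency so that it does not stall the pipeline the way an all-hit schedule would.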

This work was partially supported by the National Science Foundation under grants CCR-9409341 and CCR-9308348, as well as a grant from Digital Equipment.

Author information

Authors and Affiliations

  1. Dept. of Computer Science, Rice University, 77005-1892, Houston, TX, USA

    Chen Ding

  2. Dept. of Computer Science, Michigan Technological University, 49931, Houghton, MI, USA

    Steve Carr & Phil Sweany

    Copyright information

    © 1997 Springer-Verlag Berlin Heidelberg

    About this paper

    Cite this paper

    Ding, C., Carr, S., Sweany, P. (1997). Modulo scheduling with cache reuse information. In: Lengauer, C., Griebl, M., Gorlatch, S. (eds) Euro-Par'97 Parallel Processing. Euro-Par 1997. Lecture Notes in Computer Science, vol 1300. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0002856

    • DOI: https://doi.org/10.1007/BFb0002856

    • Published: 26 September 2005

    • Publisher Name: Springer, Berlin, Heidelberg

    • Print ISBN: 978-3-540-63440-9

    • Online ISBN: 978-3-540-69549-3

    • eBook Packages: Springer Book Archive
