Register Bank Assignment for Spatially Partitioned Processors

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5335)

Abstract

Demand for instruction-level parallelism calls for increasing register bandwidth without increasing the number of register ports. Emerging architectures address this need by partitioning registers into multiple distributed banks, which offers a technology-scalable substrate but a challenging compilation target. This paper introduces a register allocator for spatially partitioned architectures. The allocator performs bank assignment together with allocation. It minimizes spill code and optimizes bank selection based on a priority function. This algorithm is unique because it must reason about multiple competing resource constraints and dependencies exposed by these architectures. We demonstrate an algorithm that uses critical path estimation, delays from registers to consuming functional units, and hardware resource constraints. We evaluate the algorithm on TRIPS, a functional, partitioned, tiled processor with register banks distributed on top of a 4 × 4 grid of ALUs. The results show that the priority banking algorithm implements a number of policies that improve performance, that performance is sensitive to bank assignment, and that the compiler manages this resource well.
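The abstract names the ingredients of the priority function (critical path estimation, register-to-consumer delays, and hardware resource constraints) but does not give its form. The following is a minimal illustrative sketch of a priority-ordered bank assignment, not the paper's algorithm. Everything in it is an assumption made for illustration: the `Value` record, the `delay` hop-count model (one bank above each column of a 4 × 4 grid), the `REGS_PER_BANK` capacity, and the greedy "most critical value picks first" policy.

```python
from __future__ import annotations

from dataclasses import dataclass

NUM_BANKS = 4       # assumption: one register bank above each column of a 4x4 ALU grid
REGS_PER_BANK = 2   # assumption: tiny capacity so the example shows contention


@dataclass
class Value:
    """Hypothetical live range to be placed in a register bank."""
    name: str
    criticality: int                   # stand-in for a critical-path estimate
    consumers: list[tuple[int, int]]   # (row, col) of the ALUs that read the value


def delay(bank: int, alu: tuple[int, int]) -> int:
    """Hypothetical hop count from a bank to an ALU (Manhattan distance)."""
    row, col = alu
    return (row + 1) + abs(bank - col)


def assign_banks(values: list[Value]) -> dict[str, int | None]:
    """Greedy sketch: the most critical values choose their bank first.

    Each value takes the bank with free capacity that minimizes the summed
    delay to its consumers. A value maps to None when every bank is full,
    which stands in for a spill.
    """
    free = [REGS_PER_BANK] * NUM_BANKS
    assignment: dict[str, int | None] = {}
    for v in sorted(values, key=lambda v: -v.criticality):
        candidates = [b for b in range(NUM_BANKS) if free[b] > 0]
        if not candidates:
            assignment[v.name] = None
            continue
        # Ties on delay are broken by the lower bank index.
        best = min(
            candidates,
            key=lambda b: (sum(delay(b, a) for a in v.consumers), b),
        )
        free[best] -= 1
        assignment[v.name] = best
    return assignment


example_values = [
    Value("a", 10, [(0, 0), (1, 0)]),
    Value("b", 8, [(0, 0)]),
    Value("c", 5, [(0, 0)]),
    Value("d", 1, [(2, 3)]),
]
example_result = assign_banks(example_values)
print(example_result)
```

In this toy setup, values `a`, `b`, and `c` all prefer bank 0, but the bank holds only two registers, so the least critical of the three is pushed to a neighboring bank. That is the kind of competition between delay and capacity the abstract alludes to.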

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Robatmili, B., Coons, K., Burger, D., McKinley, K.S. (2008). Register Bank Assignment for Spatially Partitioned Processors. In: Amaral, J.N. (eds) Languages and Compilers for Parallel Computing. LCPC 2008. Lecture Notes in Computer Science, vol 5335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89740-8_5

  • DOI: https://doi.org/10.1007/978-3-540-89740-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89739-2

  • Online ISBN: 978-3-540-89740-8

  • eBook Packages: Computer Science, Computer Science (R0)
