SARA: Combining Stack Allocation and Register Allocation

  • V. Krishna Nandivada
  • Jens Palsberg
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3923)


Commonly used memory units enable a processor to load and store multiple registers in one instruction. We showed in 2003 how to extend gcc with a stack-location-allocation (SLA) phase that reduces memory traffic by rearranging the stack and replacing some load/store instructions with load/store-multiple instructions. While speeding up the target code, our technique leaves room for improvement because register allocation runs before SLA. In this paper we present SARA, which combines SLA and register allocation into a single phase. SARA creates a synergy among register assignment, spill-code generation, and SLA that makes the combined phase generate faster code than a sequence of the individual phases. We specify SARA by an integer linear program generated from the program text. We have implemented SARA in gcc, replacing gcc’s own implementation of register allocation. For our benchmarks, our results show that the target code is up to 16% faster than gcc with a separate SLA phase.
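To make the stack-rearrangement idea concrete, here is a minimal sketch in Python. It is not the authors' algorithm or their integer linear program. It is a toy model under one stated assumption: back-to-back loads can be merged into a single load-multiple instruction when they touch consecutive stack slots in ascending order. The function name, the variable names, and both layouts are hypothetical.

```python
def count_memory_instructions(loads, slot_of):
    """Count load instructions after greedily merging adjacent loads.

    loads   -- variables loaded back-to-back, in program order
    slot_of -- hypothetical stack layout: variable name -> slot index

    Toy rule (an assumption, not taken from the paper): a load joins the
    current group when its slot is exactly one above the previous slot.
    """
    count = 0
    prev_slot = None
    for var in loads:
        slot = slot_of[var]
        if prev_slot is None or slot != prev_slot + 1:
            count += 1  # start a new (possibly multiple) load instruction
        prev_slot = slot
    return count


loads = ["a", "b", "c", "d"]

# Layout chosen without regard to the order of the loads.
naive_layout = {"a": 0, "c": 1, "b": 2, "d": 3}

# Layout rearranged so the loads walk consecutive slots.
rearranged_layout = {"a": 0, "b": 1, "c": 2, "d": 3}

print(count_memory_instructions(loads, naive_layout))       # 4
print(count_memory_instructions(loads, rearranged_layout))  # 1
```

In this toy model the rearranged layout lets all four loads collapse into one instruction. Which variables end up in memory at all depends on register allocation, which is one way to see why deciding the stack layout and the register assignment together could help.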


  • Integer Linear Program
  • Basic Block
  • Integer Linear Program Formulation
  • Register Allocation
  • Memory Instruction
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • V. Krishna Nandivada (1)
  • Jens Palsberg (1)
  1. UCLA, University of California, Los Angeles, USA
