Part of the book series: Embedded Systems (EMSY)

Abstract

High-performance embedded systems can only be developed when efficiency requirements are pursued at every level of the system design. Compilers play a predominant role because they are responsible for generating efficient machine code, and to do so they must provide advanced optimizations. Source code optimizations offer several benefits over optimizations applied at lower abstraction levels of the code. The most important are portability, early application in the optimization sequence (which enables subsequent optimizations), and the availability of more detail about the program structure owing to the high level of abstraction. This chapter presents novel WCET-aware source code level optimizations, including procedure cloning, superblock optimizations, loop unrolling, and loop unswitching. It also presents a technique called invariant path, which accelerates WCET-aware optimizations.
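As a rough illustration of one transformation named in the abstract, the sketch below shows classic loop unswitching applied by hand at the C source level. It is not taken from the chapter: the function names `accumulate_original` and `accumulate_unswitched` and the parameters `saturate` and `limit` are hypothetical, and the sketch shows only the generic transformation, not the WCET-aware selection strategy the chapter describes.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: the loop-invariant flag `saturate` is tested in
 * every iteration, although its value never changes inside the loop. */
long accumulate_original(const int *data, size_t n, int saturate, int limit)
{
    assert(data != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (saturate) {
            sum += (data[i] > limit) ? limit : data[i];
        } else {
            sum += data[i];
        }
    }
    return sum;
}

/* Unswitched version: the invariant test is hoisted out of the loop
 * and the loop is duplicated, one copy per branch. Each copy has a
 * simpler body, at the price of larger code. */
long accumulate_unswitched(const int *data, size_t n, int saturate, int limit)
{
    assert(data != NULL || n == 0);
    long sum = 0;
    if (saturate) {
        for (size_t i = 0; i < n; i++) {
            sum += (data[i] > limit) ? limit : data[i];
        }
    } else {
        for (size_t i = 0; i < n; i++) {
            sum += data[i];
        }
    }
    return sum;
}
```

Both functions return the same result for the same inputs. The unswitched form trades code size for fewer branches per iteration; whether that trade actually lowers the estimated WCET depends on the target and the timing analysis, which is the kind of decision a WCET-aware optimization has to make.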


Corresponding author

Correspondence to Paul Lokuciejewski.

Copyright information

© 2011 Springer Science+Business Media B.V.

About this chapter

Cite this chapter

Lokuciejewski, P., Marwedel, P. (2011). WCET-Aware Source Code Level Optimizations. In: Worst-Case Execution Time Aware Compilation Techniques for Real-Time Systems. Embedded Systems. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9929-7_4

  • DOI: https://doi.org/10.1007/978-90-481-9929-7_4

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-9928-0

  • Online ISBN: 978-90-481-9929-7

  • eBook Packages: Engineering, Engineering (R0)
