Real-Time Systems, Volume 41, Issue 1, pp 52–85

Cache-aware timing analysis of streaming applications

  • Samarjit Chakraborty
  • Tulika Mitra
  • Abhik Roychoudhury
  • Lothar Thiele


Recently, there has been considerable interest in models, algorithms and methodologies specifically targeted at designing hardware and software for streaming applications. Such applications process potentially infinite streams of audio/video data or network packets and are found in a wide range of devices, from mobile phones to set-top boxes. Given a streaming application and an architecture, the timing analysis problem is to determine the timing properties of the processed data stream from the timing properties of the input stream. This problem arises when determining many common performance metrics related to streaming applications and to the mapping of such applications onto hardware architectures. Such metrics include the maximum delay experienced by any data item of the stream and the maximum backlog, that is, the buffer space required to store the incoming stream. Most previous work on estimating or optimizing these metrics takes a high-level view of the architecture and neglects micro-architectural features such as caches. In this paper, we show that accurate estimation of these metrics relies heavily on appropriate modeling of the processor micro-architecture. To this end, we present a novel framework for cache-aware timing analysis of stream processing applications. Our framework accurately models the evolution of the instruction cache of the underlying processor as a stream is processed, and the fact that the execution time for processing any data item depends on all the previous data items in the stream. The main contribution of our method lies in its ability to seamlessly integrate program analysis techniques for micro-architectural modeling with known analytical methods for analyzing streaming applications, which treat the arrival/service of event streams as mathematical functions. This combination is powerful because it allows us to model the code/cache behavior of the streaming application as well as the manner in which it is triggered by event arrivals. We apply our analysis method to an MPEG-2 encoder application, and our experiments indicate that detailed modeling of the cache behavior is efficient and scalable, and leads to more accurate timing and buffer size estimates.
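
To illustrate how delay and backlog follow once arrivals and service are treated as functions, the following is a minimal sketch and not the framework of the paper. It assumes an upper arrival curve `alpha` and a lower service curve `beta` that are both expressed in data items over integer time intervals; the curve shapes, the parameter values and the function names are hypothetical. In a cache-aware analysis, the service available per item would be derived from per-item execution-time bounds, a step that is not shown here.

```python
# Sketch only: delay and backlog bounds from an arrival curve and a
# service curve, both given in data items over integer time intervals.
# All curves and numbers below are illustrative assumptions.

def alpha(t):
    """Assumed upper arrival curve: burst of 4 items, then 1 item per time unit."""
    return 0 if t == 0 else 4 + t

def beta(t):
    """Assumed lower service curve: latency of 3 time units, then 2 items per unit."""
    return max(0, 2 * (t - 3))

def max_backlog(arrival, service, horizon):
    """Largest vertical distance between the curves over [0, horizon]."""
    return max(max(arrival(t) - service(t), 0) for t in range(horizon + 1))

def max_delay(arrival, service, horizon):
    """Largest horizontal distance between the curves over [0, horizon].

    For each t, find the smallest d with service(t + d) >= arrival(t).
    The service curve must eventually reach arrival(t), or the search
    would not terminate.
    """
    worst = 0
    for t in range(horizon + 1):
        d = 0
        while service(t + d) < arrival(t):
            d += 1
        worst = max(worst, d)
    return worst

print(max_backlog(alpha, beta, 20))  # 7 items of buffer space
print(max_delay(alpha, beta, 20))    # 5 time units
```

Under these assumed curves, a tighter per-item execution-time estimate would raise the service curve and thereby shrink both bounds, which is the kind of effect the abstract attributes to detailed cache modeling.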


Keywords: Timing analysis, Instruction cache, Streaming applications





Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Samarjit Chakraborty (1)
  • Tulika Mitra (1)
  • Abhik Roychoudhury (1)
  • Lothar Thiele (2)

  1. National University of Singapore, Singapore, Singapore
  2. Eidgenössische Technische Hochschule Zürich, Zürich, Switzerland
