Abstract
Caches are essential to bridge the gap between the high-latency main memory and the fast processor pipeline. Standard processor architectures implement two first-level caches to avoid a structural hazard in the pipeline: an instruction cache and a data cache. To obtain tight worst-case execution time (WCET) bounds, it is important to classify memory accesses as either cache hits or cache misses. The addresses of instruction fetches are known statically, so static cache hit/miss classification is possible for the instruction cache. Accesses to data cached in the data cache are harder to predict statically. Several different data areas, such as the stack, global data, and heap-allocated data, share the same cache. Some addresses are known statically; others are known only at runtime. With a standard cache organization, worst-case execution time analysis must consider all of these data areas together. In this paper we propose to split the data cache for the different data areas. Data cache analysis can then be performed individually for each area, and an access to an unknown address in the heap does not destroy the abstract cache state for the other data areas. Furthermore, we propose to use a small, highly associative cache for the heap area. We designed and implemented a static analysis for this cache and integrated it into a worst-case execution time analysis tool.
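The effect of splitting the cache on the abstract cache state can be illustrated with a small sketch. This is not the analysis from the paper; it is a textbook-style LRU "must" analysis for a fully associative cache, and the names `MustCache` and `classify`, the two-way cache size, and the example trace are all assumptions made for illustration. An access to an unknown address is modeled conservatively: it may evict any line, so every tracked line ages by one.

```python
from dataclasses import dataclass, field


@dataclass
class MustCache:
    """Abstract 'must' state of one fully associative LRU cache (hypothetical model).

    Maps each address to an upper bound on its LRU age. An address that is
    present in the map is guaranteed to be cached on every execution path.
    """
    ways: int
    ages: dict = field(default_factory=dict)

    def access(self, addr):
        """Access a statically known address; True means 'always hit'."""
        hit = addr in self.ages
        old_age = self.ages.get(addr, self.ways)
        updated = {}
        for other, age in self.ages.items():
            if other == addr:
                continue
            # Only lines younger than the accessed line grow older.
            if age < old_age:
                age += 1
            if age < self.ways:
                updated[other] = age
        updated[addr] = 0
        self.ages = updated
        return hit

    def access_unknown(self):
        """Access an unknown address: every tracked line may age by one."""
        self.ages = {a: age + 1 for a, age in self.ages.items()
                     if age + 1 < self.ways}


def classify(trace, split):
    """Classify each access in a trace of (area, address) pairs.

    An address of None stands for an address that is unknown statically.
    With split=True each data area gets its own cache; otherwise all
    areas share a single cache.
    """
    caches = {}
    result = []
    for area, addr in trace:
        key = area if split else "unified"
        cache = caches.setdefault(key, MustCache(ways=2))
        if addr is None:
            cache.access_unknown()
            result.append("unclassified")
        else:
            result.append("hit" if cache.access(addr) else "unclassified")
    return result


# Hypothetical trace: a stack access, two heap accesses with unknown
# addresses, then the same stack address again.
trace = [("stack", "s0"), ("heap", None), ("heap", None), ("stack", "s0")]
print(classify(trace, split=False))
print(classify(trace, split=True))
```

In the shared cache the two unknown heap accesses age the stack line out of the abstract state, so the final stack access can no longer be classified as a hit. In the split organization the heap accesses only touch the heap cache, and the final stack access is classified as a hit.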
Acknowledgements
We would like to thank the anonymous reviewers for their detailed comments, which helped to improve the paper. This research has received partial funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under grant agreement number 216682 (JEOPARD).
Cite this article
Schoeberl, M., Huber, B. & Puffitsch, W. Data cache organization for accurate timing analysis. Real-Time Syst 49, 1–28 (2013). https://doi.org/10.1007/s11241-012-9159-8
DOI: https://doi.org/10.1007/s11241-012-9159-8