Data Organization: The Processor Core/Cache Interface
Modern embedded systems increasingly rely on processors because of the advantages they offer in flexibility, reduced design time, and full-custom layout quality [MG95]. The software part of the partitioned specification of Figure 1.2 is executed on these embedded processors, which are often available as cores that can be instantiated as part of a larger system on a chip. This is feasible in current technology because processor cores occupy a relatively small area, leaving the rest of the on-chip die area available for RAM, ROM, coprocessors, and other modules. Apart from processors in the Digital Signal Processing domain (such as the TMS320 series from Texas Instruments), microprocessors with relatively general-purpose architectures are also available as embedded processors. An example of such a general-purpose embedded processor is LSI Logic’s CW4001 [ACG+95], which is based on the MIPS family of processors.
Keywords: Expense Cond Padding
- 1. Note that this clustering is for illustrative purposes only. If a → b → c → d → e was the only execution chain in the program, a better clustering is x: (a, b), y: (c, d), z: (e).
- 2. This section is the result of joint work with Professor Hiroshi Nakamura of the University of Tokyo.
- 3. The sigimag array, which has an identical access pattern to sigreal, is prevented from conflicting with sigreal by adjusting the distance between the two arrays, i.e., by inserting padding between them.
- 4. Note that ESS performs only one-dimensional tiling, and cannot incorporate two-dimensional tiling.
- 5. For small array sizes, all the data fits into the cache, obviating the necessity for tiling; for larger sizes, the simulation times on the commercial simulator, SHADE, were too long to examine every integral data size. Hence the range 35–350.
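
The padding idea in note 3 can be illustrated with a small sketch. It assumes a hypothetical direct-mapped cache of 1 KB with 32-byte lines and 8-byte array elements; these parameters, the base addresses, and the helper names are illustrative assumptions and are not taken from the chapter. Two arrays accessed with the same index collide when their base addresses differ by a multiple of the cache size, and a pad of one cache line between them removes the collisions in this model.

```python
# Hypothetical cache parameters (assumptions for illustration only).
CACHE_SIZE = 1024                      # bytes
LINE_SIZE = 32                         # bytes per cache line
ELEM_SIZE = 8                          # bytes per array element
NUM_LINES = CACHE_SIZE // LINE_SIZE    # lines in a direct-mapped cache


def cache_line(addr):
    """Cache line index that a byte address maps to (direct-mapped)."""
    return (addr // LINE_SIZE) % NUM_LINES


def second_array_base(first_base, n, pad_bytes):
    """Base address of a second array placed after the first, plus padding."""
    return first_base + n * ELEM_SIZE + pad_bytes


def count_conflicts(base_a, base_b, n):
    """Count indices i where a[i] and b[i] map to the same cache line."""
    return sum(
        1
        for i in range(n)
        if cache_line(base_a + i * ELEM_SIZE) == cache_line(base_b + i * ELEM_SIZE)
    )


N = 128  # 128 elements * 8 bytes = 1024 bytes, exactly one cache size
sigreal_base = 0

# Without padding, sigimag starts exactly one cache size after sigreal.
conflicts_unpadded = count_conflicts(
    sigreal_base, second_array_base(sigreal_base, N, 0), N
)

# With one cache line of padding, the two arrays map to different lines.
conflicts_padded = count_conflicts(
    sigreal_base, second_array_base(sigreal_base, N, LINE_SIZE), N
)

print(conflicts_unpadded, conflicts_padded)
```

In this model every index conflicts when the arrays are contiguous, and none conflict once the pad is inserted. A real layout decision would also depend on associativity and on the other arrays in the program, which this sketch does not model.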