Abstract
Having presented the challenges and requirements for system-level design of image processing applications, this chapter discusses the fundamentals of system-level design and gives an overview of related work. Section 3.1 starts with the question of how to specify the application behavior and, in this context, reviews some fundamental data flow models of computation. Section 3.2 then introduces existing approaches to behavioral hardware synthesis. Section 3.3 details some aspects of memory analysis and optimization, while communication and memory synthesis techniques are discussed separately in Section 3.4. Section 3.5 reviews several system-level design approaches before Section 3.6 concludes the chapter.
Notes
- 1.
Without break or continue statements.
- 2.
A slight relaxation is possible by copying written data to several destinations.
- 3.
Presence of deadlocks cannot be proved.
- 4.
The split actor belongs to the class of CSDF, described in Section 3.1.3.2.
- 5.
In [166], an alternative representation has been chosen in order to emphasize the behavior of a possible hardware implementation. As the latter is only able to read 1 pixel per clock cycle, the initial phase is split into several invocations. Furthermore, it has been assumed that the filter already reads the next input when processing the window at the right border.
- 6.
now Thales
- 7.
Note that this optimization differs from the JPEG2000 tiling described in Section 2.2: the latter introduces additional border processing in order to avoid the resulting multiple data accesses. Because such behavior changes the produced output, however, it cannot be applied automatically by the DEFACTO compiler.
- 8.
A vector \(\mathbf{a}\in\mathbb{R}^{n}\) is called lexicographically positive if \(\exists i:\forall1\leq j < i,\,\langle\mathbf{a},\mathbf{e_{j}}\rangle\geq0\wedge\langle\mathbf{a},\mathbf{e_{i}}\rangle>0\).
- 9.
The latter can be achieved by dividing the vector by the greatest common divisor of all vector components.
- 10.
Note that in this case the execution semantics of the SDF graph slightly change because actor 2 has to check not only for sufficient input data, but also for enough free space on each output edge. From the theoretical point of view, however, this does not cause any difficulties since the occurring behavior can be modeled by introduction of a feedback edge between actors 3 and 2 [301].
- 11.
It is assumed that the buffer has to be allocated at the beginning of the producing invocation and can be released at the end of the consuming invocation.
- 12.
(see Section 3.2.4 for a mathematical definition)
- 13.
Loop programs of this kind indicate data dependencies; they are not meant to execute all iterations sequentially.
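The definitions in notes 8 and 9 can be illustrated with a minimal Python sketch. The function names `is_lexicographically_positive` and `normalize_by_gcd` are hypothetical and chosen only for this illustration; they do not come from the chapter or from any tool discussed in it. The sketch assumes that vectors are given as plain sequences and, for the normalization in note 9, that all components are integers.

```python
from functools import reduce
from math import gcd


def is_lexicographically_positive(a):
    """Check the condition of note 8: some component is strictly positive
    and no component before it is negative."""
    for component in a:
        if component > 0:
            return True   # all earlier components were zero
        if component < 0:
            return False  # a negative component precedes any positive one
    return False          # the zero vector has no strictly positive component


def normalize_by_gcd(a):
    """Divide an integer vector by the gcd of all its components (note 9)."""
    g = reduce(gcd, (abs(c) for c in a), 0)
    if g == 0:
        return list(a)    # zero vector: there is nothing to divide by
    return [c // g for c in a]


if __name__ == "__main__":
    print(is_lexicographically_positive([0, 2, -1]))
    print(is_lexicographically_positive([0, -1, 5]))
    print(normalize_by_gcd([4, -6, 8]))
```

Checking the first non-zero component is equivalent to the quantified condition in note 8, because a strictly positive component preceded only by non-negative ones implies that the first non-zero component is positive.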
References
AutoPilot. http://www.autoesl.com/products.html
The Omega project. http://www.cs.umd.edu/projects/omega/
International technology roadmap for semiconductors – design. Tech. rep., International Technology Roadmap for Semiconductors (2007)
Absar, J., Catthoor, F.: Reuse analysis of indirectly indexed arrays. ACM Trans. Des. Autom. Electron. Syst. 11(2), 282–305 (2006)
Adé, M.: Data memory minimization for synchronous dataflow graphs emulated on DSP-FPGA targets. Ph.D. thesis, Katholieke Universiteit Leuven (1996)
Adé, M., Lauwereins, R., Peperstraete, J.: Buffer memory requirements in DSP applications. In: Proceedings of the 5th International Workshop on Rapid System Prototyping, pp. 108–123. Grenoble, France (1994)
Adé, M., Lauwereins, R., Peperstraete, J.A.: Data memory minimisation for synchronous data flow graphs emulated on DSP-FPGA targets. In: DAC ’97: Proceedings of the 34th Annual Conference on Design Automation, pp. 64–69. ACM Press, New York, NY (1997)
Agrawal, A., Bakshi, A., Davis, J., Eames, B., Ledeczi, A., Mohanty, S., Mathur, V., Neema, S., Nordstrom, G., Prasanna, V., Raghavendra, C., Singh, M.: MILAN: A model based integrated simulation framework for design of embedded systems. In: Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES 2001), pp. 82–93. Snowbird, UT (2001)
Aho, E., Vanne, J., Hamalainen, T.: Parallel memory architecture for arbitrary stride accesses. In: Proceedings of the 2006 IEEE Design and Diagnostics of Electronic Circuits and systems, pp. 63–68. Prague, Czech Republic (2006)
Ambler, S.W.: The Elements of UML(TM) 2.0 Style. Cambridge University Press, New York, NY (2005)
Ashenden, P.J.: The Designer’s Guide to VHDL. Morgan Kaufmann Publishers, San Francisco, CA (1991)
Balarin, F., Chiodo, M., Hsieh, H., Jurecska, A., Lavagno, L., Passerone, C., Sangiovanni-Vincentelli, A., Sentovich, E., Suzuki, K., Tabbara, B.: Hardware-Software Co-design of Embedded Systems: The POLIS Approach. Kluwer, Norwell, MA (1997)
Balarin, F., Watanabe, Y., Hsieh, H., Lavagno, L., Passerone, C., Sangiovanni-Vincentelli, A.: Metropolis: An integrated electronic system design environment. Computer 36(4), 45–52 (2003)
Balasa, F., Catthoor, F., Man, H.D.: Dataflow-driven memory allocation for multi-dimensional signal processing systems. In: ICCAD ’94: Proceedings of the 1994 IEEE/ACM International Conference on Computer-Aided Design, pp. 31–34. IEEE Computer Society Press, Los Alamitos, CA (1994)
Balasa, F., Kjeldsberg, P.G., Palkovic, M., Vandecappelle, A., Catthoor, F.: Loop transformation methodologies for array-oriented memory management. In: ASAP ’06: Proceedings of the IEEE 17th International Conference on Application-specific Systems, Architectures and Processors, pp. 205–212. IEEE Computer Society, Washington, DC (2006)
Balasa, F., Zhu, H., Luican, I.I.: Computation of storage requirements for multi-dimensional signal processing applications. IEEE Trans. Very Large Scale Integr. Syst. 15(4), 447–460 (2007)
Banerjee, P., Shenoy, N., Choudhary, A., Hauck, S., Bachmann, C., Haldar, M., Joisha, P., Jones, A., Kanhare, A., Nayak, A., Periyacheri, S., Walkden, M., Zaretsky, D.: A MATLAB compiler for distributed, heterogeneous, reconfigurable computing systems. In: FCCM ’00: Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines, p. 39. IEEE Computer Society, Washington, DC (2000)
Baradaran, N., Diniz, P.: Exploiting data reuse in modern FPGAs: Opportunities and challenges for compilers. In: International Workshop on Applied Reconfigurable Computing (ARC2005), pp. 1–10. Algarve, Portugal (2005). Keynote Lecture
Baradaran, N., Diniz, P.C.: A register allocation algorithm in the presence of scalar replacement for fine-grain configurable architectures. In: DATE ’05: Proceedings of the conference on Design, Automation and Test in Europe, pp. 6–11. IEEE Computer Society, Washington, DC (2005)
Baradaran, N., Diniz, P.C.: Memory parallelism using custom array mapping to heterogeneous storage structures. In: International Conference on Field Programmable Logic and Applications (FPL), pp. 1–6. Madrid, Spain (2006)
Baradaran, N., Diniz, P.C., Park, J.: Extending the applicability of scalar replacement to multiple induction variables. In: Languages and Compilers for High Performance Computing, vol. 3602, pp. 455–469. Springer, New York, NY (2005)
Baradaran, N., Park, J., Diniz, P.C.: Compiler reuse analysis for the mapping of data in FPGAs with RAM blocks. In: Proceedings of IEEE International Conference on Field-Programmable Technology (FPT), pp. 145–152 (2004)
Baumstark, L., Guler, M., Wills, L.: Extracting an explicitly data-parallel representation of image-processing programs. In: Proceedings of 10th Working Conference on Reverse Engineering (WCRE), pp. 24–34. Victoria, B.C., Canada (2003)
Baumstark, L.B., Wills, L.M.: Retargeting sequential image-processing programs for data parallel execution. IEEE Trans. Softw. Eng. 31(2), 116–136 (2005)
Beierlein, T., Fröhlich, D., Steinbach, B.: Model-driven compilation of UML-models for reconfigurable architectures. In: 2nd RTAS Workshop on Model-Driven Embedded Systems (MoDES ’04). Toronto, Canada (2004)
Bekooij, M., Wiggers, M., van Meerbergen, J.: Efficient buffer capacity and scheduler setting computation for soft real-time stream processing applications. In: SCOPES ’07: Proceedings of the 10th International Workshop on Software & Compilers for Embedded Systems, pp. 1–10. ACM Press, New York, NY (2007)
Benkrid, K., Crookes, D., Smith, J., Benkrid, A.: High level programming for FPGA based image and video processing using hardware skeletons. In: The 9th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’01), pp. 219–226. Rohnert Park, CA (2001)
Bergeron, J., Cerny, E., Hunter, A., Nightingale, A.: Verification Methodology Manual for SystemVerilog. Springer, New York, NY (2005)
Berry, G., Gonthier, G.: The Esterel synchronous programming language: Design, semantics, implementation. Sci. Comput. Program. 19, 87–152 (1992)
Beux, S.L.: Un flot de conception pour applications de traitement du signal systématique implémentées sur FPGA à base d’ingénierie dirigée par les modèles. Ph.D. thesis, Université des Sciences et Technologies de Lille (2007)
Beux, S.L., Marquet, P., Dekeyser, J.L.: A design flow to map parallel applications onto FPGAs. In: 17th IEEE International Conference on Field Programmable Logic and Applications (FPL2007), pp. 605–608. Amsterdam, The Netherlands (2007)
Beux, S.L., Marquet, P., Dekeyser, J.L.: Multiple abstraction views of FPGA to map parallel applications. In: Reconfigurable Communication-centric SoCs 2007 (ReCoSoC’07), pp. 90–97. Montpellier, France (2007)
Bhattacharya, B.: Parameterized modeling and scheduling for dataflow graphs. Master’s thesis, Department of Electrical and Computer Engineering, University of Maryland (1999)
Bhattacharya, B., Bhattacharyya, S.: Parameterized dataflow modeling of DSP systems. IEEE Trans. Signal Process. 49(10), 2408–2421 (2001)
Bhattacharya, B., Bhattacharyya, S.S.: Quasi-static scheduling of reconfigurable dataflow graphs for DSP systems. In: Proceedings of the 11th IEEE International Workshop on Rapid System Prototyping (RSP 2000), p. 84. IEEE Computer Society, Washington, DC (2000)
Bhattacharyya, S., Murthy, P., Lee, E.: APGAN and RPMC: Complementary heuristics for translating DSP block diagrams into efficient software implementations. In: Design Automation for Embedded Systems, pp. 33–60. Kluwer Academic Publishers, Boston, MA (1997)
Bijlsma, T., Bekooij, M.J.G., Smit, G.J.M., Jansen, P.G.: Efficient inter-task communication for nested loop programs on a multiprocessor system. In: Proceedings of the ProRISC 2007 Workshop, pp. 122–127. Utrecht, Technology Foundation, Veldhoven, The Netherlands (2007)
Bijlsma, T., Bekooij, M.J.G., Smit, G.J.M., Jansen, P.G.: Omphale: Streamlining the communication for jobs in a multi processor system on chip. Technical Report TR-CTIT-07-44, University of Twente, The Netherlands, Enschede (2007)
Bilsen, G., Engels, M., Lauwereins, R., Peperstraete, J.: Cyclo-static dataflow. IEEE Trans. Signal Process. 44(2), 397–408 (1996)
BINACHIP: BINACHIP. http://www.binachip.com/products.htm (2010)
Boulet, P.: Array-OL revisited, multidimensional intensive signal processing specification. Tech. Rep. 6113v2, Unité de recherche INRIA Futurs Parc Club Orsay Université, ZAC des Vignes, 4, rue Jacques Monod, 91893 ORSAY Cedex (France) (2007)
Boulet, P., Marquet, P., Piel, É., Taillard, J.: Repetitive allocation modeling with MARTE. In: Forum on Specification and Design Languages (FDL’07). Barcelona, Spain (2007). Invited paper
Buck, J., Ha, S., Lee, E.A., Messerschmitt, D.G.: Ptolemy: A framework for simulating and prototyping heterogeneous systems. Int. J. Comput. Simulat. 4(2), 155–182 (1994)
Buck, J.T.: Scheduling dynamic dataflow graphs with bounded memory using the token flow model. Ph.D. thesis, University of California at Berkeley (1993)
Buck, J.T.: Static scheduling and code generation from dynamic dataflow graphs with integer-valued control streams. In: 28th Asilomar Conference on Signals, Systems, and Computers. Pacific Grove, CA (1994)
Budiu, M.: Spatial computation. Ph.D. thesis, Carnegie Mellon School of Computer Science, Pittsburgh, PA (2003)
Budiu, M., Venkataramani, G., Chelcea, T., Goldstein, S.C.: Spatial computation. SIGOPS Oper. Syst. Rev. 38(5), 14–26 (2004)
Calvez, J.P., Perrier, V.: MPEG-2 encoder-decoder illustrative example. Tech. rep., CoFluent Design (2005)
Caspi, E.: Design automation for streaming systems. Ph.D. thesis, Electrical Engineering and Computer Sciences, University of California at Berkeley (2005)
Catthoor, F., Danckaert, K., Kulkarni, K., Brockmeyer, E., Kjeldsberg, P., Achteren, T., Omnes, T.: Data Access and Storage Management for Embedded Programmable Processors. Springer, New York, NY (2002)
Catthoor, F., Danckaert, K., Wuytack, S., Dutt, N.D.: Code transformations for data transfer and storage exploration preprocessing in multimedia processors. IEEE Des. Test 18(3), 70–82 (2001)
Catthoor, F., Wuytack, S., Greef, E.D., Balasa, F., Nachtergaele, L., Vandecappelle, A.: Custom Memory Management Methodology – Exploration of Memory Organisation for Embedded Multimedia System Design. Kluwer, Boston, MA (1998)
Celoxica: Handel-C language reference manual. http://www.celoxica.com (2010). Accessed 25 Sep 2010
Charot, F., Nyamsi, M., Quinton, P., Wagner, C.: Modeling and scheduling parallel data flow systems using structured systems of recurrence equations. In: Proceedings of the 15th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP ’04), pp. 6–16. IEEE Computer Society, Washington, DC (2004)
Chawala, N., Guizzetti, R., Meroth, Y., Deleule, A., Gupta, V., Kathail, V., Urard, P.: Multimedia application specific engine design using high level synthesis. In: DesignCon, pp. 1–24. Santa Clara, CA (2008)
Chen, M., Lee, E.: Design and implementation of a multidimensional synchronous dataflow environment. In: Conference Record of the Twenty-Eighth Asilomar Conference on Signals, Systems and Computers, vol. 1, pp. 519–524. Pacific Grove, CA (1994)
Chen, M.J.: Developing a multidimensional synchronous dataflow domain in Ptolemy. Tech. Rep. UCB/ERL M94/16, University of California, Berkeley (1994)
Cheung, E., Hsieh, H., Balarin, F.: Automatic buffer sizing for rate-constrained KPN applications on multiprocessor system-on-chip. In: IEEE International Workshop on High Level Design Validation and Test (HLDVT), pp. 37–44. Irvine, CA (2007)
Cockx, J., Denolf, K., Vanhoof, B., Stahl, R.: SPRINT: a tool to generate concurrent transaction-level models from sequential code. EURASIP J. Appl. Signal Process. 2007(1), 213–213 (2007)
Colorado State University: Cameron. http://www.cs.colostate.edu/cameron/index.html (2002). Accessed 25 Sep 2010
Cong, J., Fan, Y., Han, G., Jiang, W., Zhang, Z.: Behavior and communication co-optimization for systems with sequential communication media. In: DAC ’06: Proceedings of the 43rd Annual Conference on Design Automation, pp. 675–678. ACM Press, New York, NY (2006)
Cong, J., Fan, Y., Han, G., Jiang, W., Zhang, Z.: Platform-based behavior-level and system-level synthesis. In: IEEE International SOC Conference, pp. 199–202. Austin, TX (2006)
Cong, J., Han, G., Jiang, W.: Synthesis of an application-specific soft multiprocessor system. In: Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays (FPGA’07), pp. 99–107. ACM Press, New York, NY (2007)
Coussy, P.: Synthèse d’interface de communication pour les composants virtuels. Ph.D. thesis, Université de Bretagne Sud, Laboratoire d’Electronique des Systèmes Temps Réel (LESTER) (2003)
CoWare: CoWare platform architect. http://www.coware.com/products/platformarchitect.php (2010). Accessed 25 Sep 2010
CriticalBlue: Cascade. http://www.criticalblue.com/criticalblue_products/cascade.shtml
Cronquist, D., Gleason, C., Turean, F., Kathail, V.: Design of an H.264 encoder in five months using application engine synthesis. In: The 4th International Signal Processing Conference, pp. 1–7. Santa Clara, CA (2006)
Crookes, D., Benkrid, K., Bouridane, A., Alotaibi, K., Benkrid, A.: Design and implementation of a high level programming environment for FPGA-based image processing. IEE Proc. Vision Image Signal Process. 147(4), 377–384 (2000)
Danckaert, K., Catthoor, F., Man, H.D.: A preprocessing step for global loop transformations for data transfer optimization. In: Proceedings of the 2000 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES ’00), pp. 34–40. ACM Press, New York, NY (2000)
Darte, A., Schreiber, R., Villard, G.: Lattice-based memory allocation. Tech. Rep. RR2004-23, ENS-Lyon (2004)
Darte, A., Schreiber, R., Villard, G.: Lattice-based memory allocation. IEEE Trans. Comput. 54(10), 1242–1257 (2005)
Davare, A., Densmore, D., Meyerowitz, T., Pinto, A., Sangiovanni-Vincentelli, A., Yang, G., Zeng, H., Zhu, Q.: A next-generation design framework for platform-based design. In: Conference on Using Hardware Design and Verification Languages (DVCon). San Jose, CA (2007)
Dekeyser, J., Beux, S.L., Marquet, P.: Une approche modèle pour la conception conjointe de systèmes embarqués hautes performances dédiés au transport. In: International Workshop on Logistique & Transport (LT’ 2007). Sousse, Tunisie (2007)
Demeure, A., Gallo, Y.D.: An array approach for signal processing design. In: Sophia-Antipolis Conference on Micro-Electronics (SAME’98), System-on-Chip Session. France (1998)
Denolf, K., Bekooij, M., Cockx, J., Verkest, D., Corporaal, H.: Exploiting the expressiveness of cyclo-static dataflow to model multimedia implementations. EURASIP J. Adv. Signal Process. 2007(84078), 14 (2007)
Densmore, D., Passerone, R., Sangiovanni-Vincentelli, A.: A platform-based taxonomy for ESL design. IEEE Des. Test 23(5), 359–374 (2006)
Diet, F., D’Hollander, E., Beyls, K., Devos, H.: Embedding smart buffers for window operations in a stream-oriented C-to-VHDL compiler. In: 4th IEEE International Symposium on Electronic Design, Test and Applications (DELTA), pp. 142–147. Hong Kong (2008)
Diniz, P., Hall, M., Park, J., So, B., Ziegler, H.: Automatic mapping of C to FPGAs with the DEFACTO compilation and synthesis system. Microprocess. Microsyst. Special Issue FPGA Tools Tech. 29(2–3), 51–62 (2005)
Diniz, P.C.: Evaluation of code generation strategies for scalar replaced codes in fine-grain configurable architectures. In: FCCM ’05: Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, pp. 73–82. IEEE Computer Society, Washington, DC (2005)
Dömer, R., Gerstlauer, A., Peng, J., Shin, D., Cai, L., Yu, H., Abdi, S., Gajski, D.D.: System-on-chip environment: A SpecC-based framework for heterogeneous MPSoC design. EURASIP J. Embedded Syst. 2008(647953), 13 (2008)
Draper, B., Najjar, W., Bohm, W., Hammes, J., Rinker, B., Ross, C., Chawathe, M., Bins, J.: Compiling and optimizing image processing algorithms for FPGAs. In: Proceedings of 5th IEEE International Workshop on Computer Architectures for Machine Perception, pp. 222–231. Padova, PD (2000)
Dumont, P., Boulet, P.: Another multidimensional synchronous dataflow: Simulating Array-OL in Ptolemy II. Tech. Rep. 5516, Institut National de Recherche en Informatique et en Automatique, Cité Scientifique, 59 655 Villeneuve d’Ascq Cedex (2005)
Edwards, S.A.: SHIM: A language for hardware/software integration. In: SYNCHRON. Germany (2004)
Edwards, S.A.: The challenges of synthesizing hardware from C-like languages. IEEE Des. Test Comput. 23(5), 375–386 (2006)
Edwards, S.A., Tardieu, O.: SHIM: a deterministic model for heterogeneous embedded systems. In: Proceedings of the 5th ACM International Conference on Embedded Software (EMSOFT ’05), pp. 264–272. ACM Press, New York, NY (2005)
Edwards, S.A., Tardieu, O.: Efficient code generation from SHIM models. In: LCTES ’06: Proceedings of the 2006 ACM SIGPLAN/SIGBED Conference on Language, Compilers, and Tool Support for Embedded Systems, pp. 125–134. ACM Press, New York, NY (2006)
Edwards, S.A., Vasudevan, N., Tardieu, O.: Programming shared memory multiprocessors with deterministic message-passing concurrency: Compiling SHIM to Pthreads. In: Design, Automation and Test in Europe, 2008. DATE ’08, pp. 1498–1503. Munich, Germany (2008)
Eker, J., Janneck, J.W.: Embedded system components using the CAL actor language. University of California, Berkeley (2002)
Eker, J., Janneck, J.W.: CAL language report. language version 1.0 | document edition 1, University of California at Berkeley (2003)
Engels, M., Bilsen, G., Lauwereins, R., Peperstraete, J.: Cyclo-static data flow: Model and implementation. In: Proceedings of the 28th Asilomar Conference on Signals, Systems, and Computers, pp. 503–507. Pacific Grove, CA (1994)
Erbas, C., Pimentel, A.D., Thompson, M., Polstra, S.: A framework for system-level modeling and simulation of embedded systems architectures. EURASIP J. Embedded Syst. 2007(1), 2–2 (2007)
Feautrier, P.: Scalable and structured scheduling. Int. J. Parallel Program. 34(5), 459–487 (2006)
Fischaber, S., Woods, R., McAllister, J.: SoC memory hierarchy derivation from dataflow graphs. In: IEEE Workshop on Signal Processing Systems, pp. 469–474. Shanghai, China (2007)
Forte Design Systems: Forte cynthesizer. http://www.forteds.com (2010). Accessed 25 Sep 2010
Fowler, M., Scott, K.: UML Distilled: Applying the Standard Object Modeling Language. Addison-Wesley, Reading, MA (1997)
Gamatié, A., Beux, S.L., Piel, É., Etien, A., Ben-Atitallah, R., Marquet, P., Dekeyser, J.L.: A model driven design framework for high performance embedded systems. Tech. Rep. 6614, Institut National de Recherche en Informatique et en Automatique (2008)
Geilen, M., Basten, T., Stuijk, S.: Minimising buffer requirements of synchronous dataflow graphs with model checking. In: DAC ’05: Proceedings of the 42nd Annual Conference on Design Automation, pp. 819–824. ACM Press, New York, NY (2005)
Ghamarian, A., Geilen, M., Stuijk, S., Basten, T., Moonen, A., Bekooij, M., Theelen, B., Mousavi, M.: Throughput analysis of synchronous data flow graphs. In: Proceedings of the 6th International Conference on Application of Concurrency to System Design (ACSD’06), pp. 25–36. IEEE Computer Society, Washington, DC (2006)
Gokhale, M.B., Stone, J.M.: Automatic allocation of arrays to memories in FPGA processors with multiple memory banks. In: FCCM ’99: Proceedings of the Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines, pp. 63–69. IEEE Computer Society, Marriott at Napa Valley, Napa, CA (1999)
Gordon, M.I., Thies, W., Amarasinghe, S.: Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. SIGARCH Comput. Archit. News 34(5), 151–162 (2006)
Govindarajan, R., Gao, G.: A novel framework for multi-rate scheduling in DSP applications. In: Proceedings of the 1993 International Conference on Application Specific Array Processors, pp. 77–88. Venice, Italy (1993)
Govindarajan, R., Gao, G., Desai, P.: Minimizing buffer requirements under rate-optimal schedule in regular dataflow networks. J. VLSI Signal Process. 31, 207–229. Kluwer, Dordrecht (2002)
Greef, E.D., Catthoor, F., Man, H.D.: Memory size reduction through storage order optimization for embedded parallel multimedia applications. Parallel Comput. 23(12), 1811–1837 (1997)
Guillou, A.C.: Synthèse architecturale basée sur le modèle polyhédrique : Validation et extensions de la méthodologie mmalpha. Ph.D. thesis, L’université de Rennes (2003)
Guillou, A.C., Quinton, P., Risset, T.: Hardware synthesis for multi-dimensional time. In: Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors, pp. 40–50. The Hague, The Netherlands (2003)
Guo, Z., Najjar, W., Buyukkurt, B.: Efficient hardware code generation for FPGAs. ACM Trans. Archit. Code Optim. 5(1), 1–26 (2008)
Gupta, S., Dutt, N., Gupta, R., Nicolau, A.: SPARK: A high-level synthesis framework for applying parallelizing compiler transformations. In: Proceedings of the 16th International Conference on VLSI Design (VLSID ’03), p. 461. IEEE Computer Society, Washington, DC (2003)
Gupta, S., Gupta, R.K., Dutt, N.D., Nicolau, A.: Coordinated parallelizing compiler optimizations and high-level synthesis. ACM Trans. Des. Autom. Electron. Syst. 9(4), 441–470 (2004)
Ha, S., Kim, S., Lee, C., Yi, Y., Kwon, S., Joo, Y.P.: PeaCE: A hardware-software codesign environment for multimedia embedded systems. ACM Trans. Des. Autom. Electron. Syst. 12(3), 1–25 (2007)
Ha, S., Lee, C., Yi, Y., Kwon, S., Joo, Y.P.: Hardware-software codesign of multimedia embedded systems: the PeaCE approach. In: RTCSA ’06: Proceedings of the 12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pp. 207–214. IEEE Computer Society, Washington, DC (2006)
van Haastregt, S., Kienhuis, B.: Automated synthesis of streaming C applications to process networks in hardware. In: Proceedings of Design, Automation & Test in Europe, pp. 890–893. ACM Press, Nice, France (2009)
Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous data flow programming language Lustre. Proc. IEEE 79(9), 1305–1320 (1991)
Hammes, J., Rinker, B., Bohm, W., Najjar, W., Draper, B., Beveridge, R.: Cameron: High level language compilation for reconfigurable systems. In: PACT ’99: Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, p. 236. IEEE Computer Society, Washington, DC (1999)
Han, S.I., Guerin, X., Chae, S.I., Jerraya, A.A.: Buffer memory optimization for video codec application modeled in Simulink. In: DAC ’06: Proceedings of the 43rd Annual Conference on Design Automation, pp. 689–694. ACM Press, New York, NY (2006)
Hannig, F., Ruckdeschel, H., Dutta, H., Teich, J.: PARO: Synthesis of hardware accelerators for multi-dimensional dataflow-intensive applications. In: Proceedings of the Fourth International Workshop on Applied Reconfigurable Computing (ARC), Lecture Notes in Computer Science (LNCS), pp. 287–293. Springer, London, United Kingdom (2008)
Harel, D.: Statecharts: A visual formalism for complex systems. Sci. Comput. Program. 8(3), 231–274 (1987)
Hoare, C.A.R.: Communicating sequential processes. Commun. ACM 21(8), 666–677 (1978)
Hoare, C.A.R.: Communicating Sequential Processes. Prentice Hall, Upper Saddle River, NJ (2004)
Hsu, C.J., Keceli, F., Ko, M.Y., Shahparnia, S., Bhattacharyya, S.S.: DIF: An interchange format for dataflow-based design tools. Comput. Syst. Architect. Modeling, Simulat. 3133, 423–432. Samos, Greece (2004)
Hsu, C.J., Ko, M.Y., Bhattacharyya, S.S.: Software synthesis from the dataflow interchange format. In: Proceedings of the 2005 Workshop on Software and Compilers for Embedded Systems (SCOPES ’05), pp. 37–49. ACM Press, New York, NY (2005)
Hu, Q.: Hierarchical memory size estimation for loop transformation and data memory platform optimization. Ph.D. thesis, Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, Department of Electronics and Telecommunications (2007)
Hu, Q., Kjeldsberg, P.G., Vandecappelle, A., Palkovic, M., Catthoor, F.: Incremental hierarchical memory size estimation for steering of loop transformations. ACM Trans. Des. Autom. Electron. Syst. 12(50), 1–25 (2007)
Hu, Q., Palkovic, M., Kjeldsberg, P.G.: Memory requirement optimization with loop fusion and loop shifting. In: DSD ’04: Proceedings of the Digital System Design, EUROMICRO Systems, pp. 272–278. IEEE Computer Society, Washington, DC (2004)
Hu, Q., Vandecappelle, A., Kjeldsberg, P.G., Catthoor, F., Palkovic, M.: Fast memory footprint estimation based on maximal dependency vector calculation. In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE ’07), pp. 379–384. EDA Consortium, San Jose, CA (2007)
Hu, Q., Vandecappelle, A., Palkovic, M., Kjeldsberg, P.G., Brockmeyer, E., Catthoor, F.: Hierarchical memory size estimation for loop fusion and loop shifting in data-dominated applications. In: ASP-DAC ’06: Proceedings of the 2006 conference on Asia South Pacific design automation, pp. 606–611. IEEE Press, Piscataway, NJ (2006)
Hutchings, B., Bellows, P., Hawkins, J., Hemmert, S., Nelson, B., Rytting, M.: A CAD suite for high-performance FPGA design. In: Proceedings of Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’99), pp. 12–24. Marriott at Napa Valley, Napa, CA (1999)
IEEE: IEEE Standard VHDL Language Reference Manual. IEEE, IEEE Std. 1076–1987 edn. (1987)
IEEE: IEEE Standard VHDL Language Reference Manual. IEEE, IEEE Std. 1076–1993 edn. (1993)
IMEC: CleanC. http://www.imec.be/CleanC/ (2010). Accessed 19 Sep 2010
Impulse Accelerated Technologies: ImpulseC. http://www.impulsec.com/ (2010)
Jantsch, A., Sander, I.: Models of computation and languages for embedded system design. IEE Proc. Comput. Digital Tech. 152(2), 114–129 (2005)
Jha, P.K., Dutt, N.D.: High-level library mapping for memories. ACM Trans. Des. Autom. Electron. Syst. 5(3), 566–603 (2000)
Jung, H., Ha, S.: Hardware synthesis from coarse-grained dataflow specification for fast HW/SW cosynthesis. In: Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS ’04), pp. 24–29. IEEE Computer Society, Washington, DC (2004)
Kahn, G.: The semantics of a simple language for parallel programming. In: Proceedings of IFIP Congress 74, pp. 471–475. Stockholm, Sweden (1974)
Kangas, T., Kukkala, P., Orsila, H., Salminen, E., Hännikäinen, M., Hämäläinen, T.D., Riihimäki, J., Kuusilinna, K.: UML-based multiprocessor SoC design framework. ACM Trans. Embedded Comput. Syst. 5(2), 281–320 (2006)
Karczmarek, M., Thies, W., Amarasinghe, S.: Phased scheduling of stream programs. In: Proceedings of the 2003 ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded systems (LCTES ’03), pp. 103–112. ACM Press, New York, NY (2003)
Karp, R.M., Miller, R.E.: Properties of a model for parallel computations: Determinacy, termination and queuing. SIAM J. Appl. Math. 14(6), 1390–1411 (1966)
Karsai, G., Sztipanovits, J., Ledeczi, A., Bapty, T.: Model-integrated development of embedded software. Proc. IEEE 91(1), 145–164 (2003)
Kathail, V., Aditya, S., Schreiber, R., Rau, B.R., Cronquist, D.C., Sivaraman, M.: PICO: Automatically designing custom computers. Computer 35(9), 39–47 (2002)
Keinert, J., Haubelt, C., Teich, J.: Modeling and analysis of windowed synchronous algorithms. ICASSP2006 III, 892–895 (2006)
Kianzad, V., Bhattacharyya, S.S.: CHARMED: A multi-objective co-synthesis framework for multi-mode embedded systems. In: ASAP ’04: Proceedings of the Application-Specific Systems, Architectures and Processors, 15th IEEE International Conference, pp. 28–40. IEEE Computer Society, Washington, DC (2004)
Kienhuis, B., Deprettere, E.F.: Modeling stream-based applications using the SBF model of computation. J. VLSI Signal Process. Syst. 34(3), 291–300 (2003)
Kienhuis, B., Rijpkema, E., Deprettere, E.: Compaan: deriving process networks from MATLAB for embedded signal processing architectures. In: Proceedings of the Eighth International Workshop on Hardware/Software Codesign (CODES ’00), pp. 13–17. ACM Press, New York, NY (2000)
Kim, D.: A case study of system level specification and software synthesis of multi-mode multimedia terminal. ESTImedia 8, 231–274 (2003)
Kjeldsberg, P., Catthoor, F., Aas, E.J.: Detection of partially simultaneously alive signals in storage requirement estimation for data intensive applications. In: DAC ’01: Proceedings of the 38th conference on Design automation, pp. 365–370. ACM Press, New York, NY (2001)
Kjeldsberg, P.G., Catthoor, F., Aas, E.J.: Data dependency size estimation for use in memory optimization. IEEE Trans. CAD Integr. Circuits Syst. 22(7), 908–921 (2003)
Kjeldsberg, P.G., Catthoor, F., Aas, E.J.: Storage requirement estimation for optimized design of data intensive applications. ACM Trans. Des. Autom. Electron. Syst. 9(2), 133–158 (2004)
Ko, D.I.: System synthesis for image processing applications. Ph.D. thesis, University of Maryland (2006)
Ko, D.I., Bhattacharyya, S.S.: Modeling of block-based DSP systems. In: Proceedings of the IEEE Workshop on Signal Processing Systems, pp. 381–386. Seoul, Korea (2003)
Ko, D.I., Bhattacharyya, S.S.: Modeling of block-based DSP systems. J. VLSI Signal Process. Syst. 40(3), 289–299 (2005)
Ko, M.Y., Murthy, P.K., Bhattacharyya, S.S.: Compact procedural implementation in DSP software synthesis through recursive graph decomposition. In: Proceedings of the International Workshop on Software and Compilers for Embedded Processors, pp. 47–61. Amsterdam, The Netherlands (2004)
Ku, D., Micheli, G.D.: HardwareC: A language for hardware design. Tech. Rep. CSTL-TR-90-419, Computer Systems Lab, Stanford Univ. (1990). Version 2.0
Kuzmanov, G., Gaydadjiev, G.N., Vassiliadis, S.: Multimedia rectangularly addressable memory. IEEE Trans. Multimedia 8, 315–322 (2006)
Labbani, O.: Modélisation à haut niveau du contrôle dans des applications de traitement systématique à parallélisme massif. Ph.D. thesis, Université des Sciences et Technologies de Lille Laboratoire d’Informatique Fondamentale de Lille, 59655 Villeneuve (2006)
Labbani, O., Dekeyser, J.L., Boulet, P., Rutten, E.: Introduction of control into the Gaspard application UML metamodel: Synchronous approach. Tech. Rep. 5794, Laboratoire d’Informatique Fondamentale de Lille, Université des Sciences et Technologies de Lille 59655 Villeneuve d’Ascq Cedex, France (2005)
Lauwereins, R., Engels, M., Ade, M., Peperstraete, J.: Grape-II: a system-level prototyping environment for DSP applications. Computer 28(2), 35–43 (1995)
Lawal, N., O’Nils, M.: Embedded FPGA memory requirements for real-time video processing applications. In: 23rd NORCHIP Conference, pp. 206–209. Oulu, Finland (2005)
Lawal, N., O’Nils, M., Thörnberg, B.: C++ based system synthesis of real-time video processing systems targeting FPGA implementation. In: IPDPS, pp. 1–7 (2007)
Lawal, N., Thörnberg, B., O’Nils, M.: Address generation for FPGA RAMS for efficient implementation of real-time video processing systems. In: International Conference on Field Programmable Logic and Applications, pp. 136–141 (2005)
Lawal, N., Thörnberg, B., O’Nils, M.: Power-aware automatic constraint generation for FPGA based real-time video processing systems. In: NORCHIP, 2007, pp. 1–5. Aalborg, Denmark (2007)
Lee, E., Neuendorffer, S.: Concurrent models of computation for embedded software. IEE Proc. Comput. Digital Tech. 152(2), 239–250 (2005)
Lee, E.A., Messerschmitt, D.G.: Static scheduling of synchronous data flow programs for digital signal processing. IEEE Trans. Comput. C-36(1), 24–35 (1987)
Lefebvre, V.: Restructuration automatique des variables d’un programme en vue de sa parallélisation. Ph.D. thesis, Université de Versailles (1998)
Lefebvre, V., Feautrier, P.: Storage management in parallel programs. In: 5th Euromicro Workshop on Parallel and Distributed Processing, pp. 181–188. London, Great Britain (1997)
Lefebvre, V., Feautrier, P.: Automatic storage management for parallel programs. Parallel Comput. 24(3-4), 649–671 (1998)
Baumstark Jr., L.B., Wills, L.M.: Multidimensional dataflow-based parallelization for multimedia instruction set extensions. In: ICPPW ’06: Proceedings of the 2006 International Conference Workshops on Parallel Processing, pp. 319–326. IEEE Computer Society, Washington, DC (2006)
Liang, X., Jean, J., Tomko, K.: Data buffering and allocation in mapping generalized template matching on reconfigurable systems. J. Supercomput. 19(1), 77–91 (2001)
LLVM: LLVM. http://llvm.org/ (2010)
Manolache, S., Eles, P., Peng, Z.: Buffer space optimisation with communication synthesis and traffic shaping for NoCs. In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE ’06), pp. 718–723. European Design and Automation Association, 3001 Leuven, Belgium (2006)
Martinez, F.: Reduce FPGA design time with PICO express FPGA. Xcell J. 64, 75–77 (2008)
Massachusetts Institute of Technology (MIT): Streamit. http://groups.csail.mit.edu/cag/streamit/index.shtml (2010)
The MathWorks: Simulink. www.mathworks.com/products/simulink/ (2010)
Mencer, O.: ASC: a stream compiler for computing with FPGAs. IEEE Trans. CAD Integr. Circuits Syst. 25(9), 1603–1617 (2006)
Mentor Graphics: Catapult C. http://www.mentor.com/products/esl/high_level_synthesis/catapult_synthesis/ (2010)
Moore, M.S.: Model integrated program synthesis for real time image processing. Ph.D. thesis, Vanderbilt University (1997)
Murthy, P.K.: Scheduling techniques for synchronous and multidimensional synchronous dataflow. Ph.D. thesis, University of California at Berkeley (1996)
Murthy, P.K., Bhattacharyya, S.S.: Shared buffer implementations of signal processing systems using lifetime analysis techniques. IEEE Trans. CAD Integr. Circuits Syst. 20(2), 177–198 (2001)
Murthy, P.K., Bhattacharyya, S.S.: Buffer merging—a powerful technique for reducing memory requirements of synchronous dataflow specifications. ACM Trans. Des. Autom. Electron. Syst. 9(2), 212–237 (2004)
Murthy, P.K., Lee, E.A.: On the optimal blocking factor for blocked, non-overlapped schedules. In: 28th Asilomar Conference on Signals, Systems, and Computers, vol. 2, pp. 1052–1057. IEEE CS Press. Pacific Grove, CA (1994)
Murthy, P.K., Lee, E.A.: Multidimensional synchronous dataflow. IEEE Trans. Signal Process. 50(7), 2064–2079 (2002)
Najjar, W.A., Böhm, W., Draper, B.A., Hammes, J., Rinker, R., Beveridge, J.R., Chawathe, M., Ross, C.: High-level language abstraction for reconfigurable computing. Computer 36(8), 63–69 (2003)
Natarajan, S., Levine, B., Tan, C., Newport, D., Bouldin, D.: Automatic mapping of Khoros-based applications to adaptive computing systems. In: Proceedings of 1999 Military and Aerospace Applications of Programmable Devices and Technologies International Conference (MAPLD), pp. 101–107. Laurel, MD (1999)
National Instruments: LabVIEW FPGA. http://www.ni.com/fpga/ (2010)
Neema, S., Bapty, T., Scott, J.: Development environment for dynamically reconfigurable embedded systems. In: Proceedings of the International Conference on Signal Processing Applications and Technology. Orlando, FL (1999)
Nichols, J., Moore, M.: An adaptable, cost effective image processing system. In: The 10th JANNAF Non-destructive Evaluation Sub Committee, pp. 1–5. Salt Lake City, UT (1998)
Nichols, J., Neema, S.: Dynamically reconfigurable embedded image processing system. In: Proceedings of the International Conference on Signal Processing Applications and Technology. Orlando, FL (1999)
Nikolov, H., Stefanov, T., Deprettere, E.: Multi-processor system design with ESPAM. In: Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS ’06), pp. 211–216. ACM Press, New York, NY (2006)
Nikolov, H., Stefanov, T., Deprettere, E.: Systematic and automated multiprocessor system design, programming, and implementation. IEEE Trans. CAD Integr. Circuits Syst. 27(3), 542–555 (2008)
Ning, Q., Gao, G.R.: A novel framework of register allocation for software pipelining. In: Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 29–42. Charleston, SC (1993)
Norell, H., Lawal, N., O’Nils, M.: Automatic generation of spatial and temporal memory architectures for embedded video processing systems. EURASIP J. Embedded Syst. 2007, Article ID 75368, 10 pages (2007)
Object Management Group: Unified Modeling Language. http://www.uml.org (2010)
Oh, H., Ha, S.: Fractional rate dataflow model and efficient code synthesis for multimedia applications. In: Proceedings of the Joint Conference on Languages, Compilers and Tools for Embedded Systems (LCTES/SCOPES ’02), pp. 12–17. ACM Press, New York, NY (2002)
Oh, H., Ha, S.: Memory-optimized software synthesis from dataflow program graphs with large size data samples. EURASIP J. Appl. Signal Process. 2003(1), 514–529 (2003)
O’Nils, M., Thörnberg, B., Norell, H.: A comparison between local and global memory allocation for FPGA implementation of real-time video processing systems. In: Proceedings of the International Conference on Signals and Electronic Systems, pp. 429–432. Poznań, Poland (2004)
OSCI: Functional Specification for SystemC 2.0. Open SystemC Initiative. www.systemc.org (2002)
Ouaiss, I., Vemuri, R.: Global memory mapping for FPGA-based reconfigurable systems. In: Proceedings of 15th International Symposium on Parallel and Distributed Processing, pp. 1473–1480. San Francisco, CA (2001)
Ouaiss, I., Vemuri, R.: Hierarchical memory mapping during synthesis in FPGA-based reconfigurable computers. In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE ’01), pp. 650–657. IEEE Press, Piscataway, NJ (2001)
Ouaiss, I.E.: Hierarchical memory synthesis in reconfigurable computers. Ph.D. thesis, University of Cincinnati (2002)
Park, C., Chung, J., Ha, S.: Extended synchronous dataflow for efficient DSP system prototyping. In: IEEE International Workshop on Rapid System Prototyping, pp. 196–201. Clearwater, FL (1999)
Park, C., Ha, S.: Hardware synthesis from SPDF representation for multimedia applications. In: Proceedings of the 13th International Symposium on System Synthesis (ISSS ’00), pp. 215–220. IEEE Computer Society, Washington, DC (2000)
Park, C., Kim, S., Ha, S.: A dataflow specification for system level synthesis of 3D graphics applications. In: ASP-DAC, pp. 78–84. Yokohama, Japan (2001)
Park, J., Diniz, P.C.: Synthesis of pipelined memory access controllers for streamed data applications on FPGA-based computing engines. In: Proceedings of the 14th International Symposium on Systems Synthesis (ISSS ’01), pp. 221–226. ACM Press, New York, NY (2001)
Park, J., Diniz, P.C.: Partial data reuse for windowing computations: Performance modeling for FPGA implementations. In: Reconfigurable Computing: Architectures, Tools and Applications (ARC), Lecture Notes in Computer Science, vol. 4419, pp. 97–109. Mangaratiba, Brazil (2007)
Park, J.W.: An efficient buffer memory system for subarray access. IEEE Trans. Parallel Distrib. Syst. 12(3), 316–335 (2001)
Parks, T.M., Pino, J.L., Lee, E.A.: A comparison of synchronous and cycle-static dataflow. In: ASILOMAR ’95: Proceedings of the 29th Asilomar Conference on Signals, Systems and Computers (2-Volume Set), p. 204. IEEE Computer Society, Washington, DC (1995)
Petri, C.A.: Communication with automata. Supplement 1 to technical report RADC-TR-65-337, Rome Air Develop. Cent. (1962)
Piel, É., Atitallah, R.B., Marquet, P., Meftali, S., Niar, S., Etien, A., Dekeyser, J.L., Boulet, P.: Gaspard2: from MARTE to SystemC simulation. In: Design, Automation and Test in Europe (DATE 08). Munich, Germany (2008)
Pimentel, A.D., Thompson, M., Polstra, S., Erbas, C.: On the calibration of abstract performance models for system-level design space exploration. In: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (IC-SAMOS 2006), pp. 71–77. Samos, Greece (2006)
Reiter, R.: Scheduling parallel computations. J. ACM 15(4), 590–599 (1968)
Rijpkema, E., Deprettere, E., Kienhuis, B.: Compilation from MATLAB to process networks. In: Second International Workshop on Compiler and Architecture Support for Embedded Systems (CASES’99), pp. 388–395. Washington, DC (1999)
Rinker, R., Carter, M., Patel, A., Chawathe, M., Ross, C., Hammes, J., Najjar, W., Bohm, W.: An automated process for compiling dataflow graphs into reconfigurable hardware. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 9(1), 130–139 (2001)
Schreiber, R., Aditya, S., Mahlke, S., Kathail, V., Rau, B.R., Cronquist, D., Sivaraman, M.: PICO-NPA: High-level synthesis of nonprogrammable hardware accelerators. J. VLSI Signal Process. Syst. 31(2), 127–142 (2002)
Séméria, L., Sato, K., Micheli, G.D.: Synthesis of hardware models in C with pointers and complex data structures. IEEE Trans. Very Large Scale Integr. Syst. 9(6), 743–756 (2001)
Sen, M.: Model-based hardware design for image processing systems. Ph.D. thesis, Department of Electrical and Computer Engineering, and Institute for Advanced Computer Studies (2006)
Sen, M., Bhattacharyya, S., Lv, T., Wolf, W.: Modeling image processing systems with homogeneous parameterized dataflow graphs. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’05) 5, 133–136. Philadelphia, PA (2005)
Sen, M., Corretjer, I., Haim, F., Saha, S., Bhattacharyya, S.S., Schlessman, J., Wolf, W.: Computer vision on FPGAs: Design methodology and its application to gesture recognition. In: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05) Workshops, p. 133. IEEE Computer Society, Washington, DC (2005)
Sen, M., Corretjer, I., Haim, F., Saha, S., Schlessman, J., Lv, T., Bhattacharyya, S.S., Wolf, W.: Dataflow-based mapping of computer vision algorithms onto FPGAs. EURASIP J. Embedded Syst. 2007(49236), 1–12 (2007)
So, B., Hall, M.W.: Increasing the applicability of scalar replacement. In: Proceedings of 13th International Conference on Compiler Construction, Lecture Notes in Computer Science, vol. 2985, pp. 185–201. Barcelona, Spain (2004)
So, B., Hall, M.W., Ziegler, H.E.: Custom data layout for memory parallelism. In: Proceedings of the International Symposium on Code Generation and Optimization (CGO ’04), pp. 291–302. IEEE Computer Society, Washington, DC (2004)
Sovani, C., Edwards, S.A.: FIFO sizing for high-performance pipelines. In: Proceedings of the International Workshop on Logic and Synthesis. San Diego, CA (2007)
Stefanov, T., Kienhuis, B., Deprettere, E.: Algorithmic transformation techniques for efficient exploration of alternative application instances. In: Proceedings of the 10th International Symposium on Hardware/Software Codesign (CODES), pp. 7–12. Estes Park, CO (2002)
Stefanov, T., Zissulescu, C., Turjan, A., Kienhuis, B., Deprettere, E.F.: System design using Kahn Process Networks: The Compaan/Laura approach. In: Proceedings of Design Automation & Test in Europe (DATE), pp. 340–345. Paris, France (2004)
Stichling, D., Kleinjohann, B.: CV-SDF - a model for real-time computer vision applications. In: WACV 2002: IEEE Workshop on Applications of Computer Vision. Orlando, FL (2002)
Strehl, K., Thiele, L., Gries, M., Ziegenbein, D., Ernst, R., Teich, J.: Funstate—an internal design representation for codesign. IEEE Trans. Very Large Scale Integr. Syst. 9(4), 524–544 (2001)
Strehl, K., Thiele, L., Ziegenbein, D., Ernst, R.: Scheduling hardware/software systems using symbolic techniques. Tech. Rep. 67, Swiss Federal Institute of Technology (ETH) and Technical University of Braunschweig (1999)
Stuijk, S., Geilen, M., Basten, T.: Exploring trade-offs in buffer requirements and throughput constraints for synchronous dataflow graphs. In: Proceedings of the 43rd Annual Conference on Design Automation (DAC ’06), pp. 899–904. ACM Press, New York, NY (2006)
Sutherland, S., Davidmann, S., Flake, P., Moorby, P.: SystemVerilog for Design: A Guide to Using SystemVerilog for Hardware Design and Modeling, vol. 2. Kluwer, Norwell, MA (2006)
Synopsys: Synopsys system studio. http://www.synopsys.com/systemstudio (2010). Accessed 19 Sep 2010
Taha, S., Radermacher, A., Gerard, S., Dekeyser, J.L.: An open framework for detailed hardware modeling. In: International Symposium on Industrial Embedded Systems (SIES ’07), pp. 118–125. Lisbon, Portugal (2007)
Teich, J., Zitzler, E., Bhattacharyya, S.: Optimized software synthesis for digital signal processing algorithms: an evolutionary approach. In: IEEE Workshop on Signal Processing Systems, pp. 589–598. Boston, MA (1998)
Teich, J., Zitzler, E., Bhattacharyya, S.S.: Buffer memory optimization in DSP applications — an evolutionary approach. In: Fifth International Conference on Parallel Problem Solving from Nature (PPSN-V), pp. 885–894. Springer, Berlin, Germany (1998)
Tessier, R., Betz, V., Neto, D., Egier, A., Gopalsamy, T.: Power-efficient RAM mapping algorithms for FPGA embedded memory blocks. IEEE Trans. CAD Integr. Circuits Syst. 26(2), 278–290 (2007)
The SUIF Group: The SUIF compiler system. http://suif.stanford.edu/. Accessed 19 Sep 2010
Thies, W., Vivien, F., Sheldon, J., Amarasinghe, S.: A unified framework for schedule and storage optimization. Proc. ACM SIGPLAN 2001 Conf. Programming Lang. Des. Implementation 36(5), 232–242 (2001)
Thomas, D.R., Moorby, P.: The Verilog Hardware Description Language, vol. 5. Kluwer, Boston, MA (2002)
Thompson, M., Nikolov, H., Stefanov, T., Pimentel, A.D., Erbas, C., Polstra, S., Deprettere, E.F.: A framework for rapid system-level exploration, synthesis, and programming of multimedia MP-SoCs. In: Proceedings of the 5th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS ’07), pp. 9–14. ACM Press, New York, NY (2007)
Thörnberg, B., Hu, Q., Palkovic, M., O’Nils, M., Kjeldsberg, P.G.: Polyhedral space generation and memory estimation from interface and memory models of real-time video systems. J. Syst. Softw. 79(2), 231–245 (2006)
Thörnberg, B., Norell, H., O’Nils, M.: Conceptual interface and memory-modeling for real-time image processing systems. In: IEEE Workshop on Multimedia Signal Processing, pp. 138–141. Paris, France (2002)
Thörnberg, B., Olsson, L., O’Nils, M.: Optimization of memory allocation for real-time video processing on FPGA. In: The 16th IEEE International Workshop on Rapid System Prototyping (RSP2005), pp. 141–147. Montreal, Canada (2005)
Thörnberg, B., Palkovic, M., Hu, Q., Olsson, L., Kjeldsberg, P.G., O’Nils, M., Catthoor, F.: Bit-width constrained memory hierarchy optimization for real-time video systems. IEEE Trans. CAD Integr. Circuits Syst. 26(4), 781–800 (2007)
Tronçon, R., Bruynooghe, M., Janssens, G., Catthoor, F.: Storage size reduction by in-place mapping of arrays. In: VMCAI ’02: Revised Papers from the 3rd International Workshop on Verification, Model Checking, and Abstract Interpretation, pp. 167–181. Springer, London, UK (2002)
Turjan, A., Kienhuis, B., Deprettere, E.: Translating affine nested-loop programs to process networks. In: Proceedings of the 2004 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES ’04), pp. 220–229. ACM Press, New York, NY (2004)
Turjan, A., Kienhuis, B., Deprettere, E.: Solving out-of-order communication in Kahn Process Networks. J. VLSI Signal Process. 40, 7–18 (2005)
Turjan, A., Kienhuis, B., Deprettere, E.F.: Realizations of the extended linearization model. In: 2nd Int. Workshop on Systems, Architectures, Modeling, and Simulation (SAMOS 2002). Samos, Greece (2002)
Vasudevan, N., Edwards, S.: Static deadlock detection for the SHIM concurrent language. In: 6th ACM/IEEE International Conference on Formal Methods and Models for Co-Design (MEMOCODE 2008), pp. 49–58. Anaheim, CA (2008)
Vasudevan, N., Edwards, S.A.: A JPEG decoder in SHIM. Computer Science Tech. Rep. CUCS-048-06, Columbia University (2006)
Venkataramani, G., Najjar, W., Kurdahi, F., Bagherzadeh, N., Bohm, W., Hammes, J.: Automatic compilation to a coarse-grained reconfigurable system-on-chip. ACM Trans. Embedded Comput. Syst. (TECS) 2, 560–589 (2003)
Verdoolaege, S., Bruynooghe, M., Janssens, G., Catthoor, F.: Multi-dimensional incremental loop fusion for data locality. In: Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors, pp. 17–27. The Hague, The Netherlands (2003)
Verdoolaege, S., Nikolov, H., Stefanov, T.: PN: a tool for improved derivation of process networks. EURASIP J. Embedded Syst. 2007(1), 19–19 (2007)
Verhaegh, W.F.J., Aarts, E.H.L., van Gorp, P.C.N., Lippens, P.E.R.: A two-stage solution approach to multidimensional periodic scheduling. IEEE Trans. CAD Integr. Circuits Syst. 20(10), 1185–1199 (2001)
Vitkovski, A., Kuzmanov, G., Gaydadjiev, G.N.: Memory organization with multi-pattern parallel accesses. In: Proceedings of DATE, pp. 1420–1425. New York, NY (2008)
Wakabayashi, K., Okamoto, T.: C-based SoC design flow and EDA tools: An ASIC and system vendor perspective. IEEE Trans. CAD Integr. Circuits Syst. 19(12), 1507–1522 (2000)
Wauters, P., Engels, M., Lauwereins, R., Peperstraete, J.: Cyclo-dynamic dataflow. Internal Report ESAT/ACCA/95/2, Katholieke Universiteit Leuven, ESAT Department, Kard. Mercierlaan 94, B-3001 Heverlee, Belgium (1996)
Wauters, P., Engels, M., Lauwereins, R., Peperstraete, J.: Cyclo-dynamic dataflow. In: Proceedings of the 4th Euromicro Workshop on Parallel and Distributed Processing (PDP ’96), pp. 319–326. Braga, Portugal (1996)
Weinhardt, M., Luk, W.: Memory access optimization for reconfigurable systems. IEE Proc. Comput. Digital Tech. 148, 105–112 (2001)
Wiggers, M., Bekooij, M., Jansen, P., Smit, G.: Efficient computation of buffer capacities for multi-rate real-time systems with back-pressure. In: Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS ’06), pp. 10–15. ACM Press, New York, NY (2006)
Wiggers, M., Bekooij, M., Smit, G.: Efficient computation of buffer capacities for cyclo-static dataflow graphs. Tech. Rep., Centre for Telematics and Information Technology, University of Twente, Enschede (2006)
Thies, W., Lin, J., Amarasinghe, S.: Phased computation graphs in the polyhedral model. Tech. Rep., MIT Laboratory for Computer Science Cambridge, MA 02139 (2002)
Yang, H., Jung, H., Ha, S.: Buffer minimization in RTL synthesis from coarse-grained dataflow specification. In: SASIMI. Nagoya, Japan (2006)
Yu, H., Leeser, M.: Optimizing data intensive window-based image processing on reconfigurable hardware boards. In: IEEE Workshop on Signal Processing Systems Design and Implementation, pp. 491–496. Athens, Greece (2005). DOI 10.1109/SIPS.2005.1579918
Yu, H., Leeser, M.: Automatic sliding window operation optimization for FPGA-based computing boards. In: Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’06), pp. 76–88. IEEE Computer Society, Washington, DC (2006)
Zhao, Y., Malik, S.: Exact memory size estimation for array computations. IEEE Trans. Very Large Scale Integr. Syst. 8(5), 517–521 (2000)
Zhu, H., Luican, I.I., Balasa, F.: Exact computation of storage requirements for multi-dimensional signal processing applications. Tech. Rep., University of Illinois at Chicago (2005)
Ziegler, H., Hall, M.: Evaluating heuristics in automatically mapping multi-loop applications to FPGAs. In: Proceedings of the 2005 ACM/SIGDA 13th International Symposium on Field-Programmable Gate Arrays (FPGA ’05), pp. 184–195. ACM Press, New York, NY (2005)
Ziegler, H.E., Hall, M.W., Diniz, P.C.: Compiler-generated communication for pipelined FPGA applications. In: DAC ’03: Proceedings of the 40th Conference on Design Automation, pp. 610–615. ACM Press, New York, NY (2003)
Zissulescu, C., Kienhuis, B., Deprettere, E.: Communication synthesis in a multiprocessor environment. In: International Conference on Field Programmable Logic and Applications, pp. 360–365. Tampere, Finland (2005)
Zissulescu, C., Kienhuis, B., Deprettere, E.F.: Increasing pipelined IP core utilization in process networks using exploration. In: FPL, pp. 690–699. Belgium (2004)
Zissulescu, C., Turjan, A., Kienhuis, B., Deprettere, E.: Solving out of order communication using CAM memory; an implementation. In: 13th Annual Workshop on Circuits, Systems and Signal Processing (ProRISC 2002). Veldhoven, Netherlands (2002)
© 2011 Springer Science+Business Media, LLC
Keinert, J., Teich, J. (2011). Fundamentals and Related Work. In: Design of Image Processing Embedded Systems Using Multidimensional Data Flow. Embedded Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7182-1_3
DOI: https://doi.org/10.1007/978-1-4419-7182-1_3
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-7181-4
Online ISBN: 978-1-4419-7182-1
eBook Packages: Engineering, Engineering (R0)