Fundamentals and Related Work

Part of the Embedded Systems book series (EMSY)

Abstract

Having presented the challenges and requirements for system-level design of image processing applications, this chapter discusses the fundamentals of system-level design and gives an overview of related work. Section 3.1 starts with the question of how to specify the application behavior; in this context, some fundamental data flow models of computation are also reviewed. Section 3.2 then introduces existing approaches to behavioral hardware synthesis. Section 3.3 details aspects of memory analysis and optimization, while communication and memory synthesis techniques are discussed separately in Section 3.4. Section 3.5 reviews several system-level design approaches before Section 3.6 concludes the chapter.

Keywords

Buffer Size; Finite State Machine; Image Processing Application; Register Transfer Level; Hardware Accelerator
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

  146. 177.
    Kjeldsberg, P.G., Catthoor, F., Aas, E.J.: Data dependency size estimation for use in memory optimization. IEEE Trans. CAD Integr. Circuits Syst. 22(7), 908–921 (2003)CrossRefGoogle Scholar
  147. 178.
    Kjeldsberg, P.G., Catthoor, F., Aas, E.J.: Storage requirement estimation for optimized design of data intensive applications. ACM Trans. Des. Autom. Electron. Syst. 9(2), 133–158 (2004)CrossRefGoogle Scholar
  148. 179.
    Ko, D.I.: System synthesis for image processing applications. Ph.D. thesis, University of Maryland (2006)Google Scholar
  149. 180.
    Ko, D.I., Bhattacharyya, S.S.: Modeling of block-based DSP systems. In: Proceedings of the IEEE Workshop on Signal Processing Systems, pp. 381–386. Seoul, Korea (2003)Google Scholar
  150. 181.
    Ko, D.I., Bhattacharyya, S.S.: Modeling of block-based DSP systems. J. VLSI Signal Process. Syst. 40(3), 289–299 (2005)CrossRefGoogle Scholar
  151. 182.
    Ko, M.Y., Murthy, P.K., Bhattacharyya, S.S.: Compact procedural implementation in DSP software synthesis through recursive graph decomposition. In: Proceedings of the International Workshop on Software and Compilers for Embedded Processors, pp. 47–61. Amsterdam, The Netherlands (2004)Google Scholar
  152. 183.
    Ku, D., Micheli, G.D.: HardwareC: A language for hardware design. Tech. Rep. CSTL-TR-90-419, Computer Systems Lab, Stanford Univ. (1990). Version 2.0Google Scholar
  153. 184.
    Kuzmanov, G., Gaydadjiev, G.N., Vassiliadis, S.: Multimedia rectangularly addressable memory. IEEE Trans. Multimedia 8, 315–322 (2006)CrossRefGoogle Scholar
  154. 185.
    Labbani, O.: Modélisation à haut niveau du contrôle dans des applications de traitement systématique à parallélisme massif. Ph.D. thesis, Université des Sciences et Technologies de Lille Laboratoire d’Informatique Fondamentale de Lille, 59655 Villeneuve (2006)öGoogle Scholar
  155. 186.
    Labbani, O., Dekeyser, J.L., Boulet, P., Rutten, E.: Introduction of control into the Gaspard application UML metamodel: Synchronous approach. Tech. Rep. 5794, Laboratoire d’Informatique Fondamentale de Lille, Université des Sciences et Technologies de Lille 59655 Villeneuve d’Ascq Cedex, France (2005)Google Scholar
  156. 187.
    Lauwereins, R., Engels, M., Ade, M., Peperstraete, J.: Grape-II: a system-level prototyping environment for DSP applications. Computer 28(2), 35–43 (1995)CrossRefGoogle Scholar
  157. 188.
    Lawal, N., O’Nils, M.: Embedded FPGA memory requirements for real-time video processing applications. In: 23rd NORCHIP Conference, pp. 206–209. Oulu, Finland (2005)Google Scholar
  158. 189.
    Lawal, N., O’Nils, M., Thörnberg, B.: C++ based system synthesis of real-time video processing systems targeting FPGA implementation. In: IPDPS, pp. 1–7 (2007)Google Scholar
  159. 190.
    Lawal, N., Thörnberg, B., O’Nils, M.: Address generation for FPGA RAMS for efficient implementation of real-time video processing systems. In: International Conference on Field Programmable Logic and Applications, pp. 136–141 (2005)Google Scholar
  160. 191.
    Lawal, N., Thörnberg, B., O’Nils, M.: Power-aware automatic constraint generation for FPGA based real-time video processing systems. In: NORCHIP, 2007, pp. 1–5. Aalborg, Denmark (2007)Google Scholar
  161. 192.
    Lee, E., Neuendorffer, S.: Concurrent models of computation for embedded software. IEE Proc. Comput. Digital Tech. 152(2), 239–250 (2005)CrossRefGoogle Scholar
  162. 193.
    Lee, E.A., Messerschmitt, D.G.: Static scheduling of synchronous data flow programs for digital signal processing. IEEE Trans. Comput. C-36(1), 24–35 (1987)CrossRefGoogle Scholar
  163. 194.
    Lefebvre, V.: Restructuration automatique des variables d’un programme en vue de sa parallélilsation. Ph.D. thesis, Université de Versailles (1998)Google Scholar
  164. 195.
    Lefebvre, V., Feautrier, P.: Storage management in parallel programs. In: 5th Euromicro Workshop on Parallel and Distributed Processing, pp. 181–188. London, Great Britian (1997)Google Scholar
  165. 196.
    Lefebvre, V., Feautrier, P.: Automatic storage management for parallel programs. Parallel Comput. 24(3-4), 649–671 (1998)MATHCrossRefGoogle Scholar
  166. 197.
    Lewis B. Baumstark, J., Wills, L.M.: Multidimensional dataflow-based parallelization for multimedia instruction set extensions. In: ICPPW ’06: Proceedings of the 2006 International Conference Workshops on Parallel Processing, pp. 319–326. IEEE Computer Society, Washington, DC (2006)Google Scholar
  167. 199.
    Liang, X., Jean, J., Tomko, K.: Data buffering and allocation in mapping generalized template matching on reconfigurable systems. J. Supercomput. 19(1), 77–91 (2001)MATHCrossRefGoogle Scholar
  168. 202.
    LLVM: LLVM. http://llvm.org/ (2010)
  169. 204.
    Manolache, S., Eles, P., Peng, Z.: Buffer space optimisation with communication synthesis and traffic shaping for NoCs. In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE ’06), pp. 718–723. European Design and Automation Association, 3001 Leuven, Belgium, Belgium (2006)Google Scholar
  170. 205.
    Martinez, F.: Reduce FPGA design time with PICO express FPGA. Xcell J. 64, 75–77 (2008)MATHGoogle Scholar
  171. 206.
    Massachusetts Institute of Technology (MIT): Streamit. http://groups.csail.mit.edu/cag/streamit/index.shtml (2010)
  172. 207.
    MathWorks, T.: Simulink. www.mathworks.com/products/simulink/ (2010)
  173. 208.
    Mencer, O.: ASC: a stream compiler for computing with FPGAs. IEEE Trans. CAD Integr. Circuits Syst. 25(9), 1603–1617 (2006)CrossRefGoogle Scholar
  174. 209.
  175. 210.
    Moore, M.S.: Model integrated program synthesis for real time image processing. Ph.D. thesis, Vanderbilt University (1997)Google Scholar
  176. 211.
    Murthy, P.K.: Scheduling techniques for synchronous and multidimensional synchronous dataflow. Ph.D. thesis, University of California at Berkeley (1996)Google Scholar
  177. 212.
    Murthy, P.K., Bhattacharyya, S.S.: Shared buffer implementations of signal processing systems using lifetime analysis techniques. IEEE Trans. CAD Integr. Circuits Syst. 20(2), 177–198 (2001)CrossRefGoogle Scholar
  178. 213.
    Murthy, P.K., Bhattacharyya, S.S.: Buffer merging—a powerful technique for reducing memory requirements of synchronous dataflow specifications. ACM Trans. Des. Autom. Electron. Syst. 9(2), 212–237 (2004)CrossRefGoogle Scholar
  179. 214.
    Murthy, P.K., Lee, E.A.: On the optimal blocking factor for blocked, non-overlapped schedules. In: 28th Asilomar Conference on Signals, Systems, and Computers, vol. 2, pp. 1052–1057. IEEE CS Press. Pacific Grove, CA (1994)Google Scholar
  180. 215.
    Murthy, P.K., Lee, E.A.: Multidimensional synchronous dataflow. IEEE Trans. Signal Process. 50(7), 2064–2079 (2002)CrossRefGoogle Scholar
  181. 216.
    Najjar, W.A., Böhm, W., Draper, B.A., Hammes, J., Rinker, R., Beveridge, J.R., Chawathe, M., Ross, C.: High-level language abstraction for reconfigurable computing. Computer 36(8), 63–69 (2003)CrossRefGoogle Scholar
  182. 217.
    Natarajan, S., Levine, B., Tan, C., Newport, D., Bouldin, D.: Automatic mapping of Khoros-based applications to adaptive computing systems. In: Proceedings of 1999 Military and Aerospace Applications of Programmable Devices and Technologies International Conference (MAPLD), pp. 101–107. Laurel, MD (1999)Google Scholar
  183. 218.
    National Instruments: LabVIEW FPGA. http://www.ni.com/fpga/ (2010)
  184. 219.
    Neema, S., Bapty, T., Scott, J.: Development environment for dynamically reconfigurable embedded systems. In: Proceedings of the International Conference on Signal Processing Applications and Technology. Orlando, FL (1999)Google Scholar
  185. 220.
    Nichols, J., Moore, M.: An adaptable, cost effective image processing system. In: The 10th JANNAF Non-destructive Evaluation Sub Committee, pp. 1–5. Salt Lake City, UT (1998)Google Scholar
  186. 221.
    Nichols, J., Neema, S.: Dynamically reconfigurable embedded image processing system. In: Proceedings of the International Conference on Signal Processing Applications and Technology. Orlando, FL (1999)Google Scholar
  187. 222.
    Nikolov, H., Stefanov, T., Deprettere, E.: Multi-processor system design with ESPAM. In: Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS ’06), pp. 211–216. ACM Press, New York, NY (2006)CrossRefGoogle Scholar
  188. 223.
    Nikolov, H., Stefanov, T., Deprettere, E.: Systematic and automated multiprocessor system design, programming, and implementation. IEEE Trans. CAD Integr. Circuits Syst. 27(3), 542–555 (2008)CrossRefGoogle Scholar
  189. 224.
    Ning, Q., Gao, G.R.: A novel framework of register allocation for software pipelining. In: Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 29–42. Charleston, SC (1993)Google Scholar
  190. 225.
    Norell, H., Lawal, N., O’Nils, M.: Automatic generation of spatial and temporal memory architectures for embedded video processing systems. EURASIP J. Embedded Syst. 2007, Article ID 75,368, 10 pages (2007)Google Scholar
  191. 226.
    Object Management Group: Unified Modeling Language. http://www.uml.org (2010)
  192. 227.
    Oh, H., Ha, S.: Fractional rate dataflow model and efficient code synthesis for multimedia applications. In: Proceedings of the Joint Conference on Languages, Compilers and Tools for Embedded Systems (LCTES/SCOPES ’02), pp. 12–17. ACM Press, New York, NY (2002)Google Scholar
  193. 228.
    Oh, H., Ha, S.: Memory-optimized software synthesis from dataflow program graphs with large size data samples. EURASIP J. Appl. Signal Process. 2003(1), 514–529 (2003)MATHGoogle Scholar
  194. 229.
    O’Nils, M., Thörnberg, B., Norell, H.: A comparison between local and global memory allocation for FPGA implementation of real-time video processing systems. In: Proceedings of the International Conference on Signals and Electronic Systems, pp. 429–432. Poznán, Poland (2004)Google Scholar
  195. 230.
    OSCI: Functional Specification for SystemC 2.0. Open SystemC Initiative. www.systemc.org (2002)
  196. 231.
    Ouaiss, I., Vemuri, R.: Global memory mapping for FPGA-based reconfigurable systems. In: Proceedings of 15th International Symposium on Parallel and Distributed Processing, pp. 1473–1480. San Francisco, CA (2001)Google Scholar
  197. 232.
    Ouaiss, I., Vemuri, R.: Hierarchical memory mapping during synthesis in FPGA-based reconfigurable computers. In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE ’01), pp. 650–657. IEEE Press, Piscataway, NJ (2001)Google Scholar
  198. 233.
    Ouaiss, I.E.: Hierarchical memory synthesis in reconfigurable computers. Ph.D. thesis, University of Cincinnati (2002)Google Scholar
  199. 234.
    Park, C., Chung, J., Ha, S.: Extended synchronous dataflow for efficient DSP system prototyping. IEEE International Workshop on Rapid System Prototyping pp. 196–201. Clear water, FL (1999)Google Scholar
  200. 235.
    Park, C., Ha, S.: Hardware synthesis from SPDF representation for multimedia applications. In: Proceedings of the 13th International Symposium on System Synthesis (ISSS ’00), pp. 215–220. IEEE Computer Society, Washington, DC (2000)Google Scholar
  201. 236.
    Park, C., Kim, S., Ha, S.: A dataflow specification for system level synthesis of 3D graphics applications. In: ASP-DAC, pp. 78–84. Yokohama, Japan (2001)Google Scholar
  202. 237.
    Park, J., Diniz, P.C.: Synthesis of pipelined memory access controllers for streamed data applications on FPGA-based computing engines. In: Proceedings of the 14th International Symposium on Systems Synthesis (ISSS ’01), pp. 221–226. ACM Press, New York, NY (2001)Google Scholar
  203. 238.
    Park, J., Diniz, P.C.: Partial data reuse for windowing computations: Performance modeling for FPGA implementations. In: Reconfigurable Computing: Architectures, Tools and Applications (ARC), Lecture Notes in Computer Science, vol. 4419, pp. 97–109. Mangaratiba, Brazil (2007)Google Scholar
  204. 239.
    Park, J.W.: An efficient buffer memory system for subarray access. IEEE Trans. Parallel Distrib. Syst. 12(3), 316–335 (2001)CrossRefGoogle Scholar
  205. 240.
    Parks, T.M., Pino, J.L., Lee, E.A.: A comparison of synchronous and cycle-static dataflow. In: ASILOMAR ’95: Proceedings of the 29th Asilomar Conference on Signals, Systems and Computers (2-Volume Set), p. 204. IEEE Computer Society, Washington, DC (1995)Google Scholar
  206. 241.
    Petri, C.A.: Communication with automata. Supplement 1 to technical report RADC-TR-65-337, Rome Air Develop. Cent. (1962)Google Scholar
  207. 242.
    Piel Éric, Attitalah, R.B., Marquet, P., Meftali, S., Niar, S., Etien, A., Dekeyser, J.L., Boulet, P.: Gaspard2: from MARTE to SystemC simulation. In: Design, Automation and Test in Europe (DATE 08). Munich, Germany (2008)Google Scholar
  208. 243.
    Pimentel, A.D., Thompson, M., Polstra, S., Erbas, C.: On the calibration of abstract performance models for system-level design space exploration. In: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (IC-SAMOS 2006), pp. 71–77. Samos, Greece (2006)Google Scholar
  209. 247.
    Reiter, R.: Scheduling parallel computations. J. ACM 15(4), 590–599 (1968)MATHCrossRefGoogle Scholar
  210. 249.
    Rijpkema, E., Deprettere, E., Kienhuis, B.: Compilation from MATLAB to process networks. In: Second International Workshop on Compiler and Architecture Support for Embedded Systems (CASES’99), pp. 388–395. Washington, DC (1999)Google Scholar
  211. 250.
    Rinker, R., Carter, M., Patel, A., Chawathe, M., Ross, C., Hammes, J., Najjar, W., Bohm, W.: An automated process for compiling dataflow graphs into reconfigurable hardware. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 9(1), 130–139 (2001)CrossRefGoogle Scholar
  212. 252.
    Schreiber, R., Aditya, S., Mahlke, S., Kathail, V., Rau, B.R., Cronquist, D., Sivaraman, M.: PICO-NPA: High-level synthesis of nonprogrammable hardware accelerators. J. VLSI Signal Process. Syst. 31(2), 127–142 (2002)CrossRefMATHGoogle Scholar
  213. 253.
    Séméria, L., Sato, K., Micheli, G.D.: Synthesis of hardware models in C with pointers and complex data structures. IEEE Trans. Very Large Scale Integr. Syst. 9(6), 743–756 (2001)CrossRefGoogle Scholar
  214. 254.
    Sen, M.: Model-based hardware design for image processing systems. Ph.D. thesis, Department of Electrical and Computer Engineering, and Institute for Advanced Computer Studies (2006)Google Scholar
  215. 255.
    Sen, M., Bhattacharyya, S., Lv, T., Wolf, W.: Modeling image processing systems with homogeneous parameterized dataflow graphs. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’05) 5, 133–136. Philadelphia, PA (2005)Google Scholar
  216. 256.
    Sen, M., Corretjer, I., Haim, F., Saha, S., Bhattacharyya, S.S., Schlessman, J., Wolf, W.: Computer vision on FPGAs: Design methodology and its application to gesture recognition. In: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05) Workshops, p. 133. IEEE Computer Society, Washington, DC (2005)Google Scholar
  217. 257.
    Sen, M., Corretjer, I., Haim, F., Saha, S., Schlessman, J., Lv, T., Bhattacharyya, S.S., Wolf, W.: Dataflow-based mapping of computer vision algorithms onto FPGAs. EURASIP J. Embedded Syst. 2007(49236), 1–12 (2007)CrossRefGoogle Scholar
  218. 258.
    So, B., Hall, M.W.: Increasing the applicability of scalar replacement. In: Proceedings of 13th International Conference on Compiler Construction, Lecture Notes in Computer Science, vol. 2985, pp. 185–201. Barcelona, Spain (2004)Google Scholar
  219. 259.
    So, B., Hall, M.W., Ziegler, H.E.: Custom data layout for memory parallelism. In: Proceedings of the International Symposium on Code Generation and Optimization (CGO ’04), pp. 291–302. IEEE Computer Society, Washington, DC (2004)Google Scholar
  220. 260.
    Sovani, C., Edwards, S.A.: FIFO sizing for high-performance pipelines. In: Proceedings of the International Workshop on Logic and Synthesis. San Diego, CA (2007)Google Scholar
  221. 261.
    Stefanov, T., Kienhuis, B., Deprettere, E.: Algorithmic transformation techniques for efficient exploration of alternative application instances. In: Proceedings of the 10th International Symposium on Hardware/Software Codesign (CODES), pp. 7–12. Estes Park, CO (2002)Google Scholar
  222. 262.
    Stefanov, T., Zissulescu, C., Turjan, A., Kienhuis, B., Deprettere, E.F.: System design using Kahn Process Networks: The Compaan/Laura approach. In: Proceedings of Design Automation & Test in Europe (DATE), pp. 340–345. Paris, France (2004)Google Scholar
  223. 263.
    Stichling, D., Kleinjohann, B.: CV-SDF - a model for real-time computer vision applications. In: WACV 2002: IEEE Workshop on Applications of Computer Vision. Orlando, FL (2002)Google Scholar
  224. 264.
    Strehl, K., Thiele, L., Gries, M., Ziegenbein, D., Ernst, R., Teich, J.: Funstate—an internal design representation for codesign. IEEE Trans. Very Large Scale Integr. Syst. 9(4), 524–544 (2001)CrossRefGoogle Scholar
  225. 265.
    Strehl, K., Thiele, L., Ziegenbein, D., Ernst, R.: Scheduling hardware/software systems using symbolic techniques. Tech. Rep. 67, Swiss Federal Institute of Technology (ETH) and Technical University of Braunschweig (1999)Google Scholar
  226. 267.
    Stuijk, S., Geilen, M., Basten, T.: Exploring trade-offs in buffer requirements and throughput constraints for synchronous dataflow graphs. In: Proceedings of the 43rd Annual Conference on Design Automation (DAC ’06), pp. 899–904. ACM Press, New York, NY (2006)Google Scholar
  227. 268.
    Sutherland, S., Davidmann, S., Flake, P., Moorby, P.: SystemVerilog for Design: A Guide to Using SystemVerilog for Hardware Design and Modeling, vol. 2. Kluwer, Norwell, MA (2006)Google Scholar
  228. 269.
    Synopsys: Synopsys system studio. http://www.synopsys.com/systemstudio (2010). Accessed 19 Sep 2010
  229. 270.
    Taha, S., Radermache, A., Gerard, S., Dekeyser, J.L.: An open framework for detailed hardware modeling. In: International Symposium on Industrial Embedded Systems (SIES ’07), pp. 118–125. Lisbon, Portugal (2007)Google Scholar
  230. 272.
    Teich, J., Zitzler, E., Bhattacharyya, S.: Optimized software synthesis for digital signal processing algorithms: an evolutionary approach. In: IEEE Workshop on Signal Processing Systems, pp. 589–598. Boston, MA (1998)Google Scholar
  231. 273.
    Teich, J., Zitzler, E., Bhattacharyya, S.S.: Buffer memory optimization in DSP applications — an evolutionary approach. In: Fifth International Conference on Parallel Problem Solving from Nature (PPSN-V), pp. 885–894. Springer, Berlin, Germany (1998)Google Scholar
  232. 274.
    Tessier, R., Betz, V., Neto, D., Egier, A., Gopalsamy, T.: Power-efficient RAM mapping algorithms for FPGA embedded memory blocks. IEEE Trans. CAD Integr. Circuits Syst. 26(2), 278–290 (2007)CrossRefGoogle Scholar
  233. 275.
    The SUIF Group: The SUIF compiler system. http://suif.stanford.edu/. Accessed 19 Sep 2010
  234. 276.
    Thies, W., Vivien, F., Sheldon, J., Amarasinghe, S.: A unified framework for schedule and storage optimization. Proc. ACM SIGPLAN 2001 Conf. Programming Lang. Des. Implementation 36(5), 232–242 (2001)CrossRefGoogle Scholar
  235. 277.
    Thomas, D.R., Moorby, P.: The Verilog Hardware Description Language, vol. 5. Kluwer, Boston, MA (2002)Google Scholar
  236. 278.
    Thompson, M., Nikolov, H., Stefanov, T., Pimentel, A.D., Erbas, C., Polstra, S., Deprettere, E.F.: A framework for rapid system-level exploration, synthesis, and programming of multimedia MP-SoCs. In: Proceedings of the 5th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS ’07), pp. 9–14. ACM Press, New York, NY (2007)Google Scholar
  237. 279.
    Thörnberg, B., Hu, Q., Palkovic, M., O’Nils, M., Kjeldsberg, P.G.: Polyhedral space generation and memory estimation from interface and memory models of real-time video systems. J. Syst. Softw. 79(2), 231–245 (2006)CrossRefGoogle Scholar
  238. 280.
    Thörnberg, B., Norell, H., O’Nils, M.: Conceptual interface and memory-modeling for real-time image processing systems. In: IEEE Workshop on Multimedia Signal Processing, pp. 138–141. Paris, France (2002)Google Scholar
  239. 281.
    Thörnberg, B., Olsson, L., O’Nils, M.: Optimization of memory allocation for real-time video processing on FPGA. In: The 16th IEEE International Workshop on Rapid System Prototyping (RSP2005), pp. 141–147. Montreal, Canada (2005)Google Scholar
  240. 282.
    Thörnberg, B., Palkovic, M., Hu, Q., Olsson, L., Kjeldsberg, P.G., O’Nils, M., Catthoor, F.: Bit-width constrained memory hierarchy optimization for real-time video systems. IEEE Trans. CAD Integr. Circuits Syst. 26(4), 781–800 (2007)CrossRefGoogle Scholar
  241. 284.
    Tronçon, R., Bruynooghe, M., Janssens, G., Catthoor, F.: Storage size reduction by in-place mapping of arrays. In: VMCAI ’02: Revised Papers from the 3rd International Workshop on Verification, Model Checking, and Abstract Interpretation, pp. 167–181. Springer, London, UK (2002)Google Scholar
  242. 285.
    Turjan, A., Kienhuis, B., Deprettere, E.: Translating affine nested-loop programs to process networks. In: Proceedings of the 2004 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES ’04), pp. 220–229. ACM Press, New York, NY (2004)Google Scholar
  243. 286.
    Turjan, A., Kienhuis, B., Deprettere, E.: Solving out-of-order communication in Kahn Process Networks. J. VLSI Signal Process. 40, 7–18 (2005)CrossRefGoogle Scholar
  244. 287.
    Turjan, A., Kienhuis, B., Deprettere, E.F.: Realizations of the extended linearization model. In: 2nd Int. Workshop on Systems, Architectures, Modeling, and Simulation (SAMOS 2002). Samos, Greece (2002)Google Scholar
  245. 288.
    Vasudevan, N., Edwards, S.: Static deadlock detection for the SHIM concurrent language. In: 6th ACM/IEEE International Conference on Formal Methods and Models for Co-Design (MEMOCODE 2008), pp. 49–58. Anaheim, CA (2008)Google Scholar
  246. 289.
    Vasudevan, N., Edwards, S.A.: A JPEG decoder in SHIM. Computer Science Tech. Rep. CUCS-048-06, Columbia University (2006)Google Scholar
  247. 290.
    Venkataramani, G., Najjar, W., Kurdahi, F., Bagherzadeh, N., Bohm, W., Hammes, J.: Automatic compilation to a coarse-grained reconfigurable system-on-chip. ACM Trans. Embedded Comput. Syst. (TECS) 2, 560–589 (2003)CrossRefGoogle Scholar
  248. 291.
    Verdoolaege, S., Bruynooghe, M., Janssens, G., Catthoor, F.: Multi-dimensional incremental loop fusion for data locality. In: Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors, pp. 17–27. The Hague, The Netherlands (2003)Google Scholar
  249. 292.
    Verdoolaege, S., Nikolov, H., Stefanov, T.: PN: a tool for improved derivation of process networks. EURASIP J. Embedded Syst. 2007(1), 19–19 (2007)Google Scholar
  250. 293.
    Verhaegh, W.F.J., Aarts, E.H.L., van Gorp, P.C.N., Lippens, P.E.R.: A two-stage solution approach to multidimensional periodic scheduling. IEEE Trans. CAD Integr. Circuits Syst. 20(10), 1185–1199 (2001)MATHCrossRefGoogle Scholar
  251. 295.
    Vitkovski, A., Kuzmanov, G., Gaydadjiev, G.N.: Memory organization with multi-pattern parallel accesses. In: Proceedings of DATE, pp. 1420–1425. New York, NY (2008)Google Scholar
  252. 297.
    Wakabayashi, K., Okamoto, T.: C-based SoC design flow and EDA tools: An ASIC and system vendor perspective. IEEE Trans. CAD Integr. Circuits Syst. 19(12), 1507–1522 (2000)CrossRefGoogle Scholar
  253. 298.
    Wauters, P., Engels, M., Lauwereins, R., Peperstraete, J.: Cyclo-dynamic dataflow. Internal Report ESAT/ACCA/95/2, Katholieke Universiteit Leuven, ESAT Department, Kard. Mercierlann 94, B-3001 Heverlee, Belgium (1996)Google Scholar
  254. 299.
    Wauters, P., Engels, M., Lauwereins, R., Peperstraete, J.: Cyclo-dynamic dataflow. In: Proceedings of the 4th Euromicro Workshop on Parallel and Distributed Processing (PDP ’96), pp. 319–326. Braga, Portugal (1996)Google Scholar
  255. 300.
    Weinhardt, M., Luk, W.: Memory access optimization for reconfigurable systems. IEEE Proc. Comput. Digital Tech. 148, 105–112 (2001)CrossRefGoogle Scholar
  256. 301.
    Wiggers, M., Bekooij, M., Jansen, P., Smit, G.: Efficient computation of buffer capacities for multi-rate real-time systems with back-pressure. In: Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS ’06), pp. 10–15. ACM Press, New York, NY (2006)Google Scholar
  257. 302.
    Wiggers, M., Bekooij, M., Smit, G.: Efficient computation of buffer capacities for cyclo-static dataflow graphs. Tech. Rep., Centre for Telematics and Information Technology, University of Twente, Enschede (2006)Google Scholar
  258. 303.
    William Thies, J.L., Amarasinghe, S.: Phased computation graphs in the polyhedral model. Tech. Rep., MIT Laboratory for Computer Science Cambridge, MA 02139 (2002)Google Scholar
  259. 309.
    Yang, H., Jung, H., Ha, S.: Buffer minimization in RTL synthesis from coarse-grained dataflow specification. In: SASMI. Nagoya, Japan (2006)Google Scholar
  260. 310.
    Yu, H., Leeser, M.: Optimizing data intensive window-based image processing on reconfigurable hardware boards. In: IEEE Workshop on Signal Processing Systems Design and Implementation, pp. 491–496. Athens, Greece (2005). DOI 10.1109/SIPS.2005.1579918Google Scholar
  261. 311.
    Yu, H., Leeser, M.: Automatic sliding window operation optimization for FPGA-based computing boards. In: Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’06), pp. 76–88. IEEE Computer Society, Washington, DC (2006)Google Scholar
  262. 314.
    Zhao, Y., Malik, S.: Exact memory size estimation for array computations. IEEE Trans. Very Large Scale Integr. Syst. 8(5), 517–521 (2000)CrossRefGoogle Scholar
  263. 315.
    Zhu, H., Luican, I.I., Balasa, F.: Exact computation of storage requirements for multi-dimensional signal processing applications. Tech. Rep., University of Illinois at Chicago (2005)Google Scholar
  264. 316.
    Ziegler, H., Hall, M.: Evaluating heuristics in automatically mapping multi-loop applications to FPGAs. In: Proceedings of the 2005 ACM/SIGDA 13th International Symposium on Field-Programmable Gate Arrays (FPGA ’05), pp. 184–195. ACM Press, New York, NY (2005)Google Scholar
  265. 317.
    Ziegler, H.E., Hall, M.W., Diniz, P.C.: Compiler-generated communication for pipelined FPGA applications. In: DAC ’03: Proceedings of the 40th Conference on Design Automation, pp. 610–615. ACM Press, New York, NY (2003)Google Scholar
  266. 318.
    Zissulescu, C., Kienhuis, B., Deprettere, E.: Communication synthesis in a multiprocessor environment. In: International Conference on Field Programmable Logic and Applications, pp. 360–365. Tampere, Finland (2005)Google Scholar
  267. 319.
    Zissulescu, C., Kienhuis, B., Deprettere, E.F.: Increasing pipelined IP core utilization in process networks using exploration. In: FPL, pp. 690–699. Belgium (2004)Google Scholar
  268. 320.
    Zissulescu, C., Turjan, A., Kienhuis, B., Deprettere, E.: Solving out of order communication using CAM memory; an implementation. In: 13th Annual Workshop on Circuits, Systems and Signal Processing (ProRISC 2002). Veldhoven, Netherlands (2002)Google Scholar

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Nürnberg, Germany
  2. Department of Computer Science 12, University of Erlangen-Nuremberg, Erlangen, Germany
