Data Flow Implementation in Software and Hardware

  • Patrick R. Schaumont
Chapter

Abstract

The data flow model of computation is fully concurrent, and it is not committed to either a sequential or parallel implementation. Therefore, a single specification can be used to target software as well as hardware. This chapter describes the implementation of data flow systems in hardware and software on three different design targets. The first target is a software implementation of the data flow graph. Appropriate software scheduling techniques are needed to implement the data flow schedule. The second target is a fully parallel hardware implementation of the data flow graph. In this implementation, each actor maps into a separate hardware module, and individual modules synchronize through their token communications. The third target is hybrid: a combination of software and a hardware coprocessor. In this case, a data flow graph is partitioned over hardware and software. This chapter demonstrates the flexibility and convenience of a data flow specification under multiple targets.
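To make the first target concrete, the following is a minimal sketch of a software implementation of a data flow graph, not the chapter's own code. It assumes single-rate actors, unbounded FIFO queues, and a simple round-robin dynamic scheduler; the names `Actor`, `can_fire`, `fire`, and `run` are hypothetical and chosen only for illustration.

```python
from collections import deque


class Actor:
    """Hypothetical actor: fires when every input queue holds a token."""

    def __init__(self, name, fn, inputs, outputs):
        self.name = name
        self.fn = fn            # maps the input tokens to one output token
        self.inputs = inputs    # list of deque objects read by this actor
        self.outputs = outputs  # list of deque objects written by this actor

    def can_fire(self):
        # Firing rule (assumed): one token available on each input queue.
        return all(len(q) > 0 for q in self.inputs)

    def fire(self):
        # Consume one token per input, compute, produce one token per output.
        tokens = tuple(q.popleft() for q in self.inputs)
        result = self.fn(*tokens)
        for q in self.outputs:
            q.append(result)


def run(actors, max_rounds=100):
    """Round-robin scheduler: fire enabled actors until none can fire."""
    for _ in range(max_rounds):
        fired = False
        for actor in actors:
            if actor.can_fire():
                actor.fire()
                fired = True
        if not fired:
            break


# Example graph: src queue -> double -> mid queue -> add_one -> sink queue
src, mid, sink = deque([1, 2, 3]), deque(), deque()
double = Actor("double", lambda x: 2 * x, [src], [mid])
add_one = Actor("add_one", lambda x: x + 1, [mid], [sink])
run([double, add_one])
print(list(sink))
```

The actors never call each other directly; they interact only through token queues. That separation is what would let the same graph be mapped onto separate hardware modules, or split between software and a coprocessor, without changing the specification.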

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Patrick R. Schaumont, Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, USA
