Selected Case Studies

  • Veljko Milutinović
  • Jakob Salom
  • Nemanja Trifunovic
  • Roberto Giorgi


Following the first chapter, which presents the DataFlow computer architecture paradigms, this chapter sheds more light on DataFlow applications. Many publicly available research papers and other sources show how successful the transfer of applications from control-flow to DataFlow architectures can be in terms of speed, power consumption, and equipment size. The chapter opens with a novel classification of typical DataFlow applications, in line with the most recent proposals of the European FP7/H2020 initiative. Some of the most indicative papers are then presented, grouped according to that classification. For each available classification group, the single most representative article is described in detail, followed by brief introductions to other articles from the same group that state the topic, content, and results achieved (speedups, power reductions, other improvements, etc.).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Veljko Milutinović (1)
  • Jakob Salom (2)
  • Nemanja Trifunovic (3)
  • Roberto Giorgi (4)

  1. School of Electrical Engineering, University of Belgrade, Belgrade, Serbia
  2. MISANU, Belgrade, Serbia
  3. Maxeler Technologies Inc., Palo Alto, USA
  4. University of Siena, Siena, Italy
