DRT: A Lightweight Runtime for Developing Benchmarks for a Dataflow Execution Model

  • Conference paper
Architecture of Computing Systems (ARCS 2021)

Abstract

Future computers may adopt a dataflow program execution model (PXM) to gain both performance and energy advantages. A key element in providing a compilation tool-chain for such machines is a framework for developing initial benchmarks. DRT (Dataflow Run-Time) is a tool that enables fast prototyping of such benchmarks for the Dataflow Threads (DF-Threads) PXM. In this work, we show how to use DRT to develop dataflow-based examples that a future compiler for the dataflow PXM can target.

DRT is written in portable C (tested with the GNU C compiler) and is open source; it can therefore be used on real machines based on architectures such as x86, AArch, and RISC-V.

Here, we discuss some didactic examples and show how to study and debug the data exchange, which flows through frames that are detached from the data stack. We compare DRT against similar dataflow runtime libraries such as DARTS and OCR. Even though our environment is not yet optimized, we found that DRT outperforms these runtime frameworks in terms of execution time. We also evaluate the time and complexity of developing DF-Threads examples in DRT compared with the approach of using a full-system simulator and FPGAs for more accurate modeling.
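
To make the idea of frames detached from the data stack concrete, the following minimal C sketch shows one way a dataflow thread could receive its inputs through a heap-allocated frame and run only when its last input arrives. All names here (frame_t, frame_schedule, frame_write, demo_add) are hypothetical and chosen for illustration; they are not the DRT API. As a further simplification, the thread body runs inline on the final write rather than being handed to a scheduler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types and names for illustration only; not the DRT API. */
typedef struct frame frame_t;
typedef void (*thread_body_t)(frame_t *self);

struct frame {
    thread_body_t body;  /* code to run once all inputs have arrived */
    unsigned pending;    /* synchronization count: inputs still missing */
    uint64_t slot[4];    /* input slots; they live on the heap, not the stack */
};

static uint64_t last_result; /* where the example thread stores its output */

/* Allocate a frame for a thread that expects `inputs` values (1 to 4). */
frame_t *frame_schedule(thread_body_t body, unsigned inputs)
{
    frame_t *f = calloc(1, sizeof *f);
    assert(f != NULL && inputs >= 1 && inputs <= 4);
    f->body = body;
    f->pending = inputs;
    return f;
}

/* Store one input; when the last one arrives, run the thread and free the frame.
 * Assumption: each slot is written exactly once by the producers. */
void frame_write(frame_t *f, unsigned index, uint64_t value)
{
    assert(index < 4 && f->pending > 0);
    f->slot[index] = value;
    if (--f->pending == 0) {
        f->body(f);
        free(f);
    }
}

/* Example thread body: reads both operands from its own frame. */
static void add_body(frame_t *self)
{
    last_result = self->slot[0] + self->slot[1];
}

/* Schedule a two-input thread, feed it, and return what it computed. */
uint64_t demo_add(uint64_t a, uint64_t b)
{
    last_result = 0;
    frame_t *f = frame_schedule(add_body, 2);
    frame_write(f, 0, a); /* first input: the thread is not yet ready */
    frame_write(f, 1, b); /* second input: the thread fires here */
    return last_result;
}
```

In this sketch, calling demo_add(3, 4) leaves the thread idle after the first write and fires it on the second. Because the operands travel through the frame rather than through call arguments, the frame contents can be inspected at any point between writes, which is the kind of data exchange the abstract refers to studying and debugging.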

Notes

  1. Check out the DRT repository with this command: svn co https://svn.code.sf.net/p/drt/code/.

  2. Source lines of code.

Acknowledgements

We would like to thank the anonymous reviewers for their insightful comments and suggestions. The work of this paper is partly funded by the European Commission on AXIOM H2020 (id. 645496), TERAFLUX (id. 249013), HiPEAC (id. 871174).

Author information

Corresponding authors

Correspondence to Roberto Giorgi or Amin Sahebi.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Giorgi, R., Procaccini, M., Sahebi, A. (2021). DRT: A Lightweight Runtime for Developing Benchmarks for a Dataflow Execution Model. In: Hochberger, C., Bauer, L., Pionteck, T. (eds) Architecture of Computing Systems. ARCS 2021. Lecture Notes in Computer Science, vol 12800. Springer, Cham. https://doi.org/10.1007/978-3-030-81682-7_6

  • DOI: https://doi.org/10.1007/978-3-030-81682-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-81681-0

  • Online ISBN: 978-3-030-81682-7

  • eBook Packages: Computer Science, Computer Science (R0)
