Synthesizing Code for GPGPUs from Abstract Formal Models

Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 361)

Abstract

Today multiple frameworks exist for alleviating the task of writing programs for GPGPUs, which are massively data-parallel execution platforms. These are needed because writing correct and high-performing applications for GPGPUs is notoriously difficult due to the intricacies of the underlying architecture. However, the existing frameworks lack a formal foundation, which makes them difficult to use together with formal verification, testing, and design space exploration. In this chapter we present a novel software synthesis tool, called f2cc, which is capable of generating efficient GPGPU code from abstract formal models based on the synchronous model of computation. These models can be built using high-level modeling methodologies that hide low-level architecture details from the developer. The correctness of the tool has been experimentally validated on models derived from two applications. The experiments also demonstrate that the synthesized GPGPU code yielded a 28× speedup when executed on a graphics card with 96 cores, compared against a sequential version that uses only the CPU.

Keywords

  • Formal Modeling Methodology
  • ForSyDe
  • Thread Block
  • Thread Configuration
  • Shared Memory

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

  • DOI: 10.1007/978-3-319-24457-0_7
  • Chapter length: 20 pages

Notes

  1. Source code is available at http://forsyde.ict.kth.se/trac/wiki/ForSyDe/f2cc.

Author information

Corresponding author

Correspondence to Gabriel Hjort Blindell.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Blindell, G.H., Menne, C., Sander, I. (2016). Synthesizing Code for GPGPUs from Abstract Formal Models. In: Oppenheimer, F., Medina Pasaje, J. (eds) Languages, Design Methods, and Tools for Electronic System Design. Lecture Notes in Electrical Engineering, vol 361. Springer, Cham. https://doi.org/10.1007/978-3-319-24457-0_7

  • DOI: https://doi.org/10.1007/978-3-319-24457-0_7

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24455-6

  • Online ISBN: 978-3-319-24457-0

  • eBook Packages: Engineering (R0)