European Symposium on Programming

ESOP 2012: Programming Languages and Systems, pp 316–335

On the Correctness of the SIMT Execution Model of GPUs

  • Axel Habermaier
  • Alexander Knapp

Part of the Lecture Notes in Computer Science book series (LNPSE, volume 7211)

Abstract

GPUs are becoming a primary resource of computing power. They use a single instruction, multiple threads (SIMT) execution model that executes batches of threads in lockstep. If the control flow of threads within the same batch diverges, the different execution paths are scheduled sequentially; once the control flows reconverge, all threads are executed in lockstep again. Several thread batching mechanisms have been proposed, albeit without establishing their semantic validity or their scheduling properties. To increase the level of confidence in the correctness of GPU-accelerated programs, we formalize the SIMT execution model for a stack-based reconvergence mechanism in an operational semantics and prove its correctness by constructing a simulation between the SIMT semantics and a standard interleaved multi-thread semantics. We also demonstrate that the SIMT execution model produces unfair schedules in some cases. We discuss the problem of unfairness for different batching mechanisms like dynamic warp formation and a stack-less reconvergence strategy.
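
The stack-based reconvergence described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the operational semantics of the paper: the instruction encoding (`op`, `jump`, `branch`), the names `Entry`, `run_warp`, `run_thread`, and `PROGRAM`, and the idea of giving each thread a single register are all hypothetical choices made for illustration. A batch runs in lockstep until a branch splits it; the two paths are then scheduled one after the other, and the full batch resumes at the reconvergence point.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    pc: int          # next instruction for this group of threads
    mask: frozenset  # ids of the threads active in this group
    rpc: int         # pc at which this group rejoins the entry below it


def run_warp(program, inputs):
    """Run all threads as one batch; return (final registers, trace)."""
    regs = list(inputs)
    stack = [Entry(0, frozenset(range(len(regs))), len(program))]
    trace = []  # (pc, active thread ids) per executed instruction
    while stack:
        top = stack[-1]
        if top.pc == top.rpc:  # reconvergence point (or program end) reached
            stack.pop()
            continue
        instr = program[top.pc]
        trace.append((top.pc, sorted(top.mask)))
        if instr[0] == "op":
            for t in top.mask:
                regs[t] = instr[1](regs[t])
            top.pc += 1
        elif instr[0] == "jump":
            top.pc = instr[1]
        else:  # ("branch", condition, target pc, reconvergence pc)
            _, cond, target, reconv = instr
            taken = frozenset(t for t in top.mask if cond(regs[t]))
            fall = top.mask - taken
            if not fall:        # uniform branch: everyone jumps
                top.pc = target
            elif not taken:     # uniform branch: everyone falls through
                top.pc += 1
            else:               # divergence: serialize the two paths
                fall_pc = top.pc + 1
                top.pc = reconv  # full group resumes here after both paths
                stack.append(Entry(fall_pc, fall, reconv))
                stack.append(Entry(target, taken, reconv))
    return regs, trace


def run_thread(program, x):
    """Reference: run one thread alone with an ordinary program counter."""
    pc = 0
    while pc < len(program):
        instr = program[pc]
        if instr[0] == "op":
            x, pc = instr[1](x), pc + 1
        elif instr[0] == "jump":
            pc = instr[1]
        else:
            pc = instr[2] if instr[1](x) else pc + 1
    return x


# if x is even: x = x * 2, else: x = x + 1; afterwards x = x + 100
PROGRAM = [
    ("branch", lambda x: x % 2 == 0, 3, 4),
    ("op", lambda x: x + 1),
    ("jump", 4),
    ("op", lambda x: x * 2),
    ("op", lambda x: x + 100),
]

regs, trace = run_warp(PROGRAM, [0, 1, 2, 3])
print(regs)
print(trace)
```

Because the threads in this sketch share no memory, running each thread alone with `run_thread` stands in for any interleaved schedule, and the batch result can be compared against it. The sketch also hints at the unfairness issue: a stack entry below the top is never scheduled until the top group reaches its reconvergence point, so a group that loops forever would starve the others.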

Keywords

  • Execution Model
  • Program Counter
  • Disable State
  • Active Thread
  • Unfairness Problem

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Author information

Authors and Affiliations

  1. Institute for Software and Systems Engineering, University of Augsburg, Germany

    Axel Habermaier & Alexander Knapp

Editor information

Editors and Affiliations

  1. Technische Universität München, Boltzmannstrasse 3, 85748, Garching, Germany

    Helmut Seidl

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Habermaier, A., Knapp, A. (2012). On the Correctness of the SIMT Execution Model of GPUs. In: Seidl, H. (eds) Programming Languages and Systems. ESOP 2012. Lecture Notes in Computer Science, vol 7211. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28869-2_16

  • DOI: https://doi.org/10.1007/978-3-642-28869-2_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28868-5

  • Online ISBN: 978-3-642-28869-2

  • eBook Packages: Computer Science (R0)
