Asynchronous Parallel Logic Simulation on Modern Graphics Processors

  • Yangdong Deng
  • Yuhao Zhu
  • Wang Bo
Chapter
Part of the Lecture Notes in Earth System Sciences book series (LNESS)

Abstract

Logic simulation has become the bottleneck of today’s integrated circuit (IC) design projects. For instance, over 80% of NVIDIA’s IC design turnaround time is spent on logic simulation, even with NVIDIA’s proprietary supercomputing facility. It is thus essential to develop parallel simulation solutions to sustain the momentum of increasing IC integration capacity. Inspired by the parallel computing power of modern GPUs, in this chapter we report our recent work on using GPUs to accelerate the time-consuming IC verification process by developing a massively parallel gate-level logic simulator. To the best of the authors’ knowledge, this work is the first to leverage modern GPUs to successfully unleash the massive parallelism of a conservative discrete-event-driven algorithm, the Chandy-Misra-Bryant (CMB) algorithm. Based on a novel data-parallel algorithmic mapping strategy, both the data structures and the processing flow of the CMB protocol are redesigned to better exploit the potential of modern GPUs. A dynamic memory management mechanism is developed to use the relatively limited GPU memory efficiently. Experimental results show that our GPU-based simulator outperforms a baseline CPU event-driven simulator by a factor of 47.4 on average. This work demonstrates that the CMB algorithm can be deployed efficiently and effectively on GPUs without the performance overhead that hindered its application on previous parallel architectures.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yangdong Deng (1)
  • Yuhao Zhu (2)
  • Wang Bo (3)

  1. Institute of Microelectronics, Tsinghua University, Beijing, China
  2. Department of Computer Science, University of Texas, Austin, USA
  3. Department of Electrical Engineering, Stanford University, Stanford, USA
