HCC 2016: Human Centered Computing, pp 409-419

SSSP on GPU Without Atomic Operation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9567)

Abstract

Graphs are a general theoretical model in many large-scale, data-driven applications, and the SSSP (Single Source Shortest Path) algorithm is a foundation for many important algorithms and applications. GPUs remain a mainstream platform in high-performance computing on heterogeneous architectures. Because GPU threads run with a high degree of parallelism, vertex distances on the GPU are updated with atomic operations to avoid read-write conflicts. Most of these atomic operations are unnecessary, since read-write conflicts are rare in large graphs; without them, however, the correctness of the result cannot be guaranteed. Atomic operations also account for a large part of the program's running time. To improve the performance of SSSP on the GPU, we propose an algorithm that uses data block iterations instead of atomic operations. The algorithm not only achieves a high speedup but also guarantees the correctness of the result. Experimental results show that it achieves a threefold speedup over the serial algorithm on the CPU and a more than tenfold speedup over the parallel GPU algorithm that uses atomic operations.
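The abstract does not spell out how the data block iteration works, so the following Python sketch is not the paper's algorithm. It only illustrates, under stated assumptions, one generic way to make distance updates conflict-free without atomic operations: edges are grouped by destination vertex, each block owns a disjoint range of destinations, and every round reads from one distance array and writes to another. The function name `sssp_pull_blocks` and the `block_size` parameter are hypothetical, and the blocks are processed sequentially here to stand in for independent GPU workers.

```python
# Illustrative sketch only; NOT the algorithm proposed in the paper.
INF = float("inf")


def sssp_pull_blocks(num_vertices, edges, source, block_size=2):
    """Bellman-Ford style SSSP in which each block owns a disjoint
    range of destination vertices, so no two blocks write the same entry."""
    # Group incoming edges by destination: every vertex has exactly one writer.
    incoming = [[] for _ in range(num_vertices)]
    for u, v, w in edges:
        incoming[v].append((u, w))

    dist = [INF] * num_vertices
    dist[source] = 0

    # Without negative cycles, num_vertices - 1 rounds are sufficient.
    for _ in range(num_vertices - 1):
        new_dist = list(dist)  # double buffer: read dist, write new_dist
        # Each block stands in for one independent worker (assumption).
        for start in range(0, num_vertices, block_size):
            end = min(start + block_size, num_vertices)
            for v in range(start, end):
                for u, w in incoming[v]:
                    if dist[u] + w < new_dist[v]:
                        new_dist[v] = dist[u] + w
        if new_dist == dist:  # no distance changed: converged early
            break
        dist = new_dist
    return dist


edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1), (2, 3, 5)]
print(sssp_pull_blocks(4, edges, source=0))  # prints [0, 3, 1, 4]
```

Because each entry of `new_dist` is written by a single block and `dist` is read-only within a round, the result does not depend on the order in which blocks run. The cost is that distances propagate only one edge per round, which is the usual trade-off of such double-buffered schemes.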

Keywords

Graph, SSSP, Atomic operation, Data block iteration

Notes

Acknowledgement

This work was supported by the Natural Science Foundation of China (61379048); the Special Project of National CAS Union: The High Performance Cloud Service Platform for Enterprise Creative Computing; and the Special Project of Hebei-CAS Union (Hebei, No. 14010015): The Performance Optimization of Key Algorithms of Health Data Processing in Senile Dementia Analysis.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  2. Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Science, Beijing, China
