Asynchronous Parallel Dijkstra’s Algorithm on Intel Xeon Phi Processor

How to Accelerate Irregular Memory Access Algorithm
  • Weidong Zhang
  • Lei Zhang
  • Yifeng Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11334)

Abstract

As instruction-level parallelism (ILP) on CPUs has reached a rather advanced level, researchers are increasingly interested in whether many-core architectures are suitable for graph algorithms. However, because of irregular memory access and a low ratio of computation to memory access, graph algorithms have not yet performed well enough on many-core architectures.

To obtain a substantial speedup on a many-core architecture, we first need to answer three questions: (i) how to optimize memory access, (ii) how to minimize synchronization overhead, and (iii) how to exploit the parallelism in the algorithm. Prior work can hardly reach this goal when these questions are treated separately. In this paper, we aim to address these questions systematically and to provide a set of methods for optimizing graph algorithms on many-core architectures.

This paper mainly discusses how to accelerate the Single Source Shortest Path (SSSP) problem on the Intel Many Integrated Core (MIC) architecture, for which we propose an asynchronous parallel Dijkstra’s algorithm. The algorithm aims to maximize parallelism and minimize synchronization overhead. Experimental results show that the MIC architecture can solve the SSSP problem efficiently, achieving a speedup of 9.2x over the DIMACS benchmark.
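The abstract does not describe the algorithm in detail, so the following is only a minimal sketch of what asynchronous SSSP can look like, not the authors' implementation. It assumes non-negative edge weights and a graph given as a dictionary in which every vertex appears as a key. Each worker owns a share of the vertices and a local priority queue, and workers relax edges without any global barrier between rounds. The function name `async_sssp`, the round-robin vertex partitioning, and the lock that stands in for an atomic minimum update are all assumptions made for illustration. Python threads do not run this in parallel on multiple cores, so the sketch shows the control structure only, not the performance.

```python
import heapq
import threading
import time


def async_sssp(graph, source, num_workers=4):
    """Asynchronous label-correcting SSSP (illustrative sketch).

    graph: dict mapping each vertex to a list of (neighbor, weight) pairs,
           with non-negative weights; every vertex must be a key.
    Returns a dict mapping each vertex to its distance from source.
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0

    # Hypothetical partitioning: vertices are assigned to workers round-robin.
    owner = {v: i % num_workers for i, v in enumerate(graph)}
    heaps = [[] for _ in range(num_workers)]
    heap_locks = [threading.Lock() for _ in range(num_workers)]

    # One lock stands in for a per-vertex atomic "min" update.
    dist_lock = threading.Lock()

    # Number of queue entries pushed but not yet fully processed.
    pending = [0]
    pending_lock = threading.Lock()

    def push(d, v):
        # Count the entry before it becomes visible, so that pending == 0
        # really means that no work is left anywhere.
        with pending_lock:
            pending[0] += 1
        with heap_locks[owner[v]]:
            heapq.heappush(heaps[owner[v]], (d, v))

    def worker(wid):
        while True:
            with heap_locks[wid]:
                item = heapq.heappop(heaps[wid]) if heaps[wid] else None
            if item is None:
                with pending_lock:
                    if pending[0] == 0:
                        return  # nothing queued and nothing in flight
                time.sleep(0)  # yield to other workers and retry
                continue
            d, u = item
            if d <= dist[u]:  # skip entries made stale by a better path
                for v, weight in graph[u]:
                    nd = d + weight
                    with dist_lock:
                        improved = nd < dist[v]
                        if improved:
                            dist[v] = nd
                    if improved:
                        push(nd, v)
            with pending_lock:
                pending[0] -= 1

    push(0, source)
    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dist
```

Calling `async_sssp({0: [(1, 4), (2, 1)], 1: [(3, 1)], 2: [(1, 2), (3, 5)], 3: []}, 0)` gives the same distances as sequential Dijkstra. Unlike Dijkstra's algorithm, a vertex may be relaxed more than once, because workers do not wait for a globally minimal vertex. That is the trade of extra work for less synchronization.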

Keywords

Single source shortest path, Parallel algorithm, Intel MIC, Road network

Notes

Acknowledgments

This work is supported by the National Key R&D Program of China (under Grant 2017YFB0202001) and the National Natural Science Foundation of China (under Grants 61432018 and 61672208).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. EECS of Peking University, Beijing, People’s Republic of China
  2. HuaWei Inc., Shenzhen, China
