
Blocked All-Pairs Shortest Paths Algorithm on Intel Xeon Phi KNL Processor: A Case Study

Part of the Communications in Computer and Information Science book series (CCIS, volume 790)


Manycore processors are consolidating in the HPC community as a way of improving performance while maintaining power efficiency. Knights Landing is the recently released second generation of the Intel Xeon Phi architecture. While the optimization of applications on CPUs, GPUs and first-generation Xeon Phi devices has been studied extensively in recent years, the new features of Knights Landing processors require a revision of programming and optimization techniques for these devices. In this work, we selected the Floyd-Warshall algorithm as a representative case study of graph and memory-bound applications. Starting from the default serial version, we show how data-level, thread-level and compiler-level optimizations help the parallel implementation reach 338 GFLOPS.


  • Xeon Phi
  • Knights Landing
  • Floyd-Warshall
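For readers unfamiliar with the algorithm named in the abstract, the following is a minimal C sketch of the serial Floyd-Warshall recurrence and of a generic three-phase blocked (tiled) variant. It is not the authors' implementation: the function names `fw_serial`, `fw_tile` and `fw_blocked`, the row-major `float` layout, and the requirement that the matrix size `n` be a multiple of the block size `bs` are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical reference version: d is an n x n row-major distance matrix. */
void fw_serial(float *d, size_t n)
{
    for (size_t k = 0; k < n; k++)
        for (size_t i = 0; i < n; i++)
            for (size_t j = 0; j < n; j++) {
                float via = d[i * n + k] + d[k * n + j];
                if (via < d[i * n + j])
                    d[i * n + j] = via;
            }
}

/* Relax the bs x bs tile whose top-left corner is (i0, j0), using only
   intermediate vertices k0 .. k0 + bs - 1 (hypothetical helper). */
void fw_tile(float *d, size_t n, size_t bs,
             size_t i0, size_t j0, size_t k0)
{
    for (size_t k = k0; k < k0 + bs; k++)
        for (size_t i = i0; i < i0 + bs; i++)
            for (size_t j = j0; j < j0 + bs; j++) {
                float via = d[i * n + k] + d[k * n + j];
                if (via < d[i * n + j])
                    d[i * n + j] = via;
            }
}

/* Blocked variant; assumes n is a multiple of bs. */
void fw_blocked(float *d, size_t n, size_t bs)
{
    for (size_t k0 = 0; k0 < n; k0 += bs) {
        /* Phase 1: the diagonal tile depends only on itself. */
        fw_tile(d, n, bs, k0, k0, k0);

        /* Phase 2: tiles sharing a row or column with the diagonal tile. */
        for (size_t t = 0; t < n; t += bs) {
            if (t == k0)
                continue;
            fw_tile(d, n, bs, k0, t, k0); /* same block row    */
            fw_tile(d, n, bs, t, k0, k0); /* same block column */
        }

        /* Phase 3: all remaining tiles; these are mutually independent. */
        for (size_t i0 = 0; i0 < n; i0 += bs) {
            if (i0 == k0)
                continue;
            for (size_t j0 = 0; j0 < n; j0 += bs) {
                if (j0 == k0)
                    continue;
                fw_tile(d, n, bs, i0, j0, k0);
            }
        }
    }
}
```

Calling `fw_blocked(d, n, bs)` on a matrix whose missing edges hold a large sentinel value should give the same distances as `fw_serial(d, n)`. Because the phase 3 tiles within one round do not depend on each other, they are a natural candidate for thread-level parallelism, and the innermost `j` loop is a natural candidate for vectorization; how the paper actually applies these optimizations is not shown on this page.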





The authors thank the ArTeCS Group from Universidad Complutense de Madrid for letting them use its Xeon Phi KNL system.

Author information


Corresponding author

Correspondence to Enzo Rucci.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Rucci, E., De Giusti, A., Naiouf, M. (2018). Blocked All-Pairs Shortest Paths Algorithm on Intel Xeon Phi KNL Processor: A Case Study. In: De Giusti, A. (eds) Computer Science – CACIC 2017. CACIC 2017. Communications in Computer and Information Science, vol 790. Springer, Cham.

  • DOI: 10.1007/978-3-319-75214-3_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75213-6

  • Online ISBN: 978-3-319-75214-3

  • eBook Packages: Computer Science, Computer Science (R0)