CORRFUNC: Blazing Fast Correlation Functions with AVX512F SIMD Intrinsics

  • Conference paper
Software Challenges to Exascale Computing (SCEC 2018)

Abstract

Correlation functions are widely used in extra-galactic astrophysics to extract insights into how galaxies occupy dark matter halos, and in cosmology to place stringent constraints on cosmological parameters. A correlation function fundamentally requires computing pair-wise separations between two sets of points and then computing a histogram of those separations. Corrfunc is an existing open-source, high-performance software package for efficiently computing a multitude of correlation functions. In this paper, we discuss the SIMD AVX512F kernels within Corrfunc, which are capable of processing 16 floats or 8 doubles at a time. The latest manually implemented Corrfunc AVX512F kernels show a speedup of up to \(\sim 4\times\) relative to compiler-generated code for double-precision calculations. The AVX512F kernels show a \(\sim 1.6\times\) speedup relative to the AVX kernels, which compares favorably to a theoretical maximum of \(2\times\). In addition, by pruning pairs whose minimum possible separation is too large, we achieve a \(\sim\)5–10% speedup across all the SIMD kernels. Such speedups highlight the importance of programming explicitly with SIMD vector intrinsics for complex calculations that cannot be efficiently vectorized by compilers. Corrfunc is publicly available at https://github.com/manodeep/Corrfunc/.
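The two steps named in the abstract (pair-wise separations, then a histogram) and the idea of pruning pairs by their minimum possible separation can be illustrated with a minimal pure-Python sketch. This is not Corrfunc's API or its SIMD kernels: the function names `count_pairs` and `count_pairs_pruned` are hypothetical, and the pruning shown (sorting on z and using |dz| as a lower bound on the separation) is a simplified stand-in assumed for illustration.

```python
import math
from bisect import bisect_left, bisect_right


def _add_to_histogram(counts, bin_edges, r):
    """Increment the bin containing r; bins are half-open [edge_i, edge_i+1)."""
    if bin_edges[0] <= r < bin_edges[-1]:
        counts[bisect_right(bin_edges, r) - 1] += 1


def _separation(p, q):
    """Euclidean distance between two 3-D points given as (x, y, z) tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def count_pairs(points1, points2, bin_edges):
    """Brute force: histogram the separation of every cross pair."""
    counts = [0] * (len(bin_edges) - 1)
    for p in points1:
        for q in points2:
            _add_to_histogram(counts, bin_edges, _separation(p, q))
    return counts


def count_pairs_pruned(points1, points2, bin_edges):
    """Same histogram, but skip pairs whose minimum possible separation
    (here taken to be |dz|, an illustrative assumption) exceeds the
    largest bin edge."""
    rmax = bin_edges[-1]
    sorted2 = sorted(points2, key=lambda q: q[2])
    z2 = [q[2] for q in sorted2]
    counts = [0] * (len(bin_edges) - 1)
    for p in points1:
        # Only points with |dz| <= rmax can possibly fall inside a bin.
        lo = bisect_left(z2, p[2] - rmax)
        hi = bisect_right(z2, p[2] + rmax)
        for q in sorted2[lo:hi]:
            _add_to_histogram(counts, bin_edges, _separation(p, q))
    return counts
```

Both functions return identical histograms; the pruned version simply never computes separations for pairs that are already known to lie beyond the last bin edge.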

Notes

  1. For double precision calculations, the upper 8 bits of the mask are identically set to 0.

  2. AVX512CD is meant to allow vectorization of histogram updates but our attempts at automatic vectorization have proved futile so far.

  3. A low \(\mathcal{R}_\mathrm{max}\) is potentially a case where the bin refinement factors need to be set to (1, 1, 1) to boost the particle occupancy in the cells.
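Note 1 can be pictured with a small emulation of a per-lane comparison mask. The sketch below is plain Python, not AVX512F intrinsics, and assumes the mask is held in a 16-bit integer for both precisions; the helper `lane_mask` is hypothetical. With 16 single-precision lanes every bit can be set, whereas with 8 double-precision lanes only the lower 8 bits are ever touched.

```python
def lane_mask(r2_values, r2_max):
    """Return an integer with bit i set when lane i satisfies r2 < r2_max."""
    mask = 0
    for lane, r2 in enumerate(r2_values):
        if r2 < r2_max:
            mask |= 1 << lane
    return mask


# 16 float lanes: all 16 bits may be set.
float_mask = lane_mask([0.5] * 16, 1.0)

# 8 double lanes: bits 8..15 have no lane behind them and stay 0.
double_mask = lane_mask([0.5] * 8, 1.0)
upper_bits = double_mask >> 8
```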

Acknowledgements

MS was primarily supported by NSF Career Award (AST-1151650) during the main Corrfunc design and development. MS was also supported by the Australian Research Council Laureate Fellowship (FL110100072) awarded to Stuart Wyithe and by funds for the Theoretical Astrophysical Observatory (TAO). TAO is part of the All-Sky Virtual Observatory and is funded and supported by Astronomy Australia Limited, Swinburne University of Technology, and the Australian Government. The latter is provided through the Commonwealth’s Education Investment Fund and National Collaborative Research Infrastructure Strategy (NCRIS), particularly the National eResearch Collaboration Tools and Resources (NeCTAR) project. Parts of this research were conducted by the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE170100013.

Author information

Corresponding author

Correspondence to Manodeep Sinha.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Sinha, M., Garrison, L. (2019). CORRFUNC: Blazing Fast Correlation Functions with AVX512F SIMD Intrinsics. In: Majumdar, A., Arora, R. (eds) Software Challenges to Exascale Computing. SCEC 2018. Communications in Computer and Information Science, vol 964. Springer, Singapore. https://doi.org/10.1007/978-981-13-7729-7_1

  • DOI: https://doi.org/10.1007/978-981-13-7729-7_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-7728-0

  • Online ISBN: 978-981-13-7729-7

  • eBook Packages: Computer Science (R0)
