Efficient Neighbor Search for Particle Methods on GPUs

Conference paper
Part of the Lecture Notes in Computational Science and Engineering book series (LNCSE, volume 100)

Abstract

In this paper we present an efficient and general sorting-based approach for the neighbor search on GPUs. Finding the neighbors of a particle is a common task in particle methods and has a significant impact on the overall computational effort, especially in dynamics simulations. We extend a space-filling curve algorithm presented in Connor and Kumar (IEEE Trans Vis Comput Graph, 2009) for use on GPUs with the parallel computing model Compute Unified Device Architecture (CUDA). To evaluate our implementation, we measure the execution time of our GPU search algorithm for the most common particle arrangements: a regular grid, uniformly distributed random points, and clustered points in 2 and 3 dimensions. The measured computational time is compared with the theoretical time complexity of the extended algorithm and with the computational time of its reference single-core implementation. The results show a speedup by a factor of 4 of the GPU run times over the CPU run times.
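
The sorting-based idea can be illustrated with a minimal CPU sketch. It is not the paper's CUDA implementation. It assumes a Z-order (Morton) curve as the space-filling curve, 2D points with coordinates in [0, 1), and a fixed search window in the sorted order; the function names `interleave_bits_2d`, `morton_sort`, and `candidate_neighbors` are hypothetical and chosen only for this illustration.

```python
def interleave_bits_2d(ix: int, iy: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of ix and iy into one Morton key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)       # x bits go to even positions
        key |= ((iy >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return key


def morton_sort(points, bits: int = 16):
    """Return the point indices sorted along the Z-order curve.

    Assumption: all coordinates lie in [0, 1). They are quantized to a
    2**bits x 2**bits integer grid before the bits are interleaved.
    """
    scale = 1 << bits
    keyed = []
    for i, (x, y) in enumerate(points):
        ix = min(int(x * scale), scale - 1)
        iy = min(int(y * scale), scale - 1)
        keyed.append((interleave_bits_2d(ix, iy, bits), i))
    keyed.sort()  # on a GPU this step would be a parallel key-value sort
    return [i for _, i in keyed]


def candidate_neighbors(order, rank: int, window: int = 2):
    """Indices of points within `window` positions of `rank` in curve order.

    These are only candidates: points that are close in space can be far
    apart on the curve, so an exact search needs an additional check.
    """
    lo = max(0, rank - window)
    hi = min(len(order), rank + window + 1)
    return [order[r] for r in range(lo, hi) if r != rank]


points = [(0.1, 0.1), (0.9, 0.9), (0.12, 0.11), (0.88, 0.91)]
order = morton_sort(points)
print(order)
print(candidate_neighbors(order, rank=0, window=1))
```

Sorting by the curve key places spatially close points near each other in memory, so each particle only has to inspect a small window of the sorted array. Each key is computed independently of the others, which is what makes the approach a natural fit for a data-parallel device.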

Keywords

Neighbor search, GPU, Meshfree methods and particle methods

References

  1. S. Aluru, F.E. Sevilgen, Parallel domain decomposition and load balancing using space-filling curves, in Proceedings of the 4th IEEE Conference on High Performance Computing, Bangalore, 1997, pp. 230–235
  2. S. Arya, D.M. Mount, N.S. Netanyahu, R. Silverman, A.Y. Wu, An optimal algorithm for approximate nearest neighbor searching in fixed dimensions, in Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, Arlington, 1994, vol. 5, pp. 573–582
  3. M. Bader, Space-Filling Curves: An Introduction with Applications in Scientific Computing (Springer, Berlin/Heidelberg, 2013)
  4. C. Böhm, S. Berchtold, A.D. Keim, Searching in high-dimensional spaces: index structures for improving the performance of multimedia databases. ACM Comput. Surv. 33, 322–373 (2001)
  5. T.M. Chan, A minimalist’s implementation of an approximate nearest neighbor algorithm in fixed dimensions, https://cs.uwaterloo.ca/~tmchan/sss.ps, May 2006
  6.
  7. M. Connor, P. Kumar, Fast construction of k-nearest neighbor graphs for point clouds. IEEE Trans. Vis. Comput. Graph. 14(4), 599–608 (2009)
  8. A. Dashti, I. Komarov, R.M. D’Souza, Efficient computation of k-nearest neighbour graphs for large high-dimensional data sets on GPU clusters. PLoS ONE 8, e74113 (2013), plosone.org
  9. V. Garcia, E. Debreuve, M. Barlaud, kNN CUDA, http://vincentfpgarcia.github.io/kNN-CUDA/
  10. R.A. Gingold, J.J. Monaghan, Smoothed particle hydrodynamics: theory and application to non-spherical stars. Mon. Not. R. Astron. Soc. 181, 375–389 (1977)
  11. M. Griebel, S. Knapek, G. Zumbusch, Numerical Simulation in Molecular Dynamics (Springer, Berlin/Heidelberg, 2007)
  12. P. Leite, J.M. Teixeira, T. Farias, B. Reis, V. Teichrieb, J. Kelner, Nearest neighbor searches on the GPU. Int. J. Parallel Program. 40(3), 313–330 (2012)
  13. J. Mellor-Crummey, D. Whalley, K. Kennedy, Improving memory hierarchy performance for irregular applications using data and computation reorderings. Int. J. Parallel Program. 29, 217–247 (2001)
  14. D.M. Mount, S. Arya, ANN: a library for approximate nearest neighbor searching, http://www.cs.umd.edu/~mount/ANN/
  15. S.A. Nene, S.K. Nayar, A simple algorithm for nearest neighbor search in high dimensions. IEEE Trans. Pattern Anal. Mach. Intell. 19, 989–1003 (1997)
  16. M.L. Parks, R.B. Lehoucq, S.J. Plimpton, S.A. Silling, Implementing peridynamics within a molecular dynamics code. Comput. Phys. Commun. (EL, ed.) 179, 777–783 (2008)
  17. S. Plimpton, Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19 (1995)
  18. N. Satish, M. Harris, M. Garland, Designing efficient sorting algorithms for manycore GPUs, in IEEE International Symposium in Parallel & Distributed Processing, Rome, 2009, pp. 1–10
  19. M.A. Schweitzer, A Parallel Multilevel Partition of Unity Method for Elliptic Partial Differential Equations. Lecture Notes in Computational Science and Engineering, vol. 29 (Springer, New York, 2003)
  20. Y.D. Sergeyev, R.G. Strongin, D. Lera, Introduction to Global Optimization Exploiting Space-Filling Curves (Springer, New York/Heidelberg, 2013)
  21. S.A. Silling, Reformulation of elasticity theory for discontinuities and long-range forces. Sandia report SAND98-2176, Sandia National Laboratories, 1998
  22. S.A. Silling, E. Askari, A meshfree method based on the peridynamic model of solid mechanics. Comput. Struct. 83, 1526–1535 (2005)
  23. E. Sintorn, U. Assarsson, Fast parallel GPU-sorting using a hybrid algorithm. J. Parallel Distrib. Comput. 68, 1381–1388 (2008)
  24. H. Tropf, H. Herzog, Multidimensional range search in dynamically balanced trees. Angew. Inform. (Appl. Inform.) 2, 71–77 (1981). Vieweg Verlag
  25. M.S. Warren, J.K. Salmon, A parallel hashed oct-tree n-body algorithm, in Proceedings of the 1993 ACM/IEEE Conference on Supercomputing (Supercomputing’93), Portland (ACM, New York, 1993), pp. 12–21
  26. W. Wen-mei, GPU Computing Gems Emerald Edition, Applications of GPU Computing Series, 1st edn. (Morgan Kaufmann, Burlington, Massachusetts, 2011)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Institute for Numerical Simulation, Bonn, Germany
