Emerging Architectures Enable to Boost Massively Parallel Data Mining Using Adaptive Sparse Grids

  • Published in: International Journal of Parallel Programming

Abstract

Gaining knowledge from vast datasets is a central challenge in data-driven applications today. Sparse grids provide a numerical method for both classification and regression in data mining that scales only linearly in the number of data points and is thus well suited for huge amounts of data. Due to the recursive nature of sparse grid algorithms and their classically random memory access pattern, they pose a challenge for parallelization on modern hardware architectures such as accelerators. In this paper, we present the parallelization on several current task- and data-parallel platforms, covering multi-core CPUs with vector units, GPUs, and hybrid systems. We demonstrate that an implementation which is less efficient from an algorithmic point of view can be beneficial if it enables vectorization and a higher degree of parallelism instead. Furthermore, we analyze the suitability of parallel programming languages for the implementation. With respect to hardware, we restrict ourselves to the x86 platform with SSE and AVX vector extensions and to NVIDIA’s Fermi architecture for GPUs. We consider both multi-core CPU and GPU architectures independently, as well as hybrid systems with up to 12 cores and 2 Fermi GPUs. With respect to parallel programming, we examine both the open standard OpenCL and Intel Array Building Blocks, a recently introduced high-level programming approach, and comment on their ease of use. As the baseline, we use the best results obtained with classically parallelized sparse grid algorithms and their OpenMP-parallelized intrinsics counterparts (SSE and AVX instructions), reporting both single- and double-precision measurements. The huge datasets we use comprise a real-life dataset stemming from astrophysics and artificial ones, all of which exhibit challenging properties. In all settings, we achieve excellent results, obtaining speedups of up to 188× using single precision on a hybrid system.
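To illustrate why the cost scales only linearly in the number of data points, the following sketch (not the authors' code; the function and variable names are our own) evaluates a sparse-grid surrogate built from hierarchical hat basis functions at a set of data points. Evaluating all N points costs O(N · G) for G grid points, i.e. linear in N, which is the kernel that the paper parallelizes and vectorizes.

```python
# Illustrative sketch of sparse-grid evaluation with the 1-D hierarchical
# hat basis phi_{l,i}(x) = max(0, 1 - |2^l * x - i|); d-dimensional basis
# functions are tensor products of these.

def hat(l, i, x):
    """1-D hierarchical hat basis function phi_{l,i} on [0, 1]."""
    return max(0.0, 1.0 - abs((2 ** l) * x - i))

def evaluate(grid, alphas, points):
    """f(x) = sum_j alpha_j * prod_d phi_{l_jd, i_jd}(x_d).

    grid   : list of grid points, each a list of (level, index) pairs per dim
    alphas : hierarchical surplus coefficients, one per grid point
    points : data points, each a list of coordinates in [0, 1]^d
    """
    results = []
    for x in points:                        # O(N): one pass per data point
        fx = 0.0
        for lis, a in zip(grid, alphas):    # O(G): one term per grid point
            phi = 1.0
            for (l, i), xd in zip(lis, x):  # tensor product over dimensions
                phi *= hat(l, i, xd)
            fx += a * phi
        results.append(fx)
    return results

# 1-D example: a single basis function phi_{1,1} peaks at x = 0.5
print(evaluate([[(1, 1)]], [1.0], [[0.5], [0.25]]))  # [1.0, 0.5]
```

The inner loops over grid points are independent across data points, which is what makes a data-parallel (SIMD/GPU) mapping natural even though the classical recursive traversal of the grid is not.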



Author information

Correspondence to Alexander Heinecke.

Cite this article

Heinecke, A., Pflüger, D. Emerging Architectures Enable to Boost Massively Parallel Data Mining Using Adaptive Sparse Grids. Int J Parallel Prog 41, 357–399 (2013). https://doi.org/10.1007/s10766-012-0202-0
