Extending a Highly Parallel Data Mining Algorithm to the Intel® Many Integrated Core Architecture

  • Alexander Heinecke
  • Michael Klemm
  • Dirk Pflüger
  • Arndt Bode
  • Hans-Joachim Bungartz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7156)

Abstract

Extracting knowledge from vast datasets is a major challenge in data-driven applications such as classification and regression, which are mostly compute-bound. In this paper, we extend our SG++ algorithm to the Intel® Many Integrated Core Architecture (Intel® MIC Architecture) and demonstrate how little effort the port requires: existing SSE code can be migrated in a very straightforward manner. We evaluate a current pre-release prototype coprocessor board, codenamed Intel® “Knights Ferry”, using the pragma-based offload programming model offered by the Intel® Composer XE for Intel MIC Architecture, which generates both the host and the coprocessor code from a single source. We compare the achieved performance with an NVIDIA C2050 accelerator and show that the pre-release Knights Ferry coprocessor not only outperforms the C2050 but also surpasses it in the productivity of implementing algorithms for the coprocessor.
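
As a minimal sketch of the pragma-based offload model mentioned above, the following C program marks a region for execution on the coprocessor; the vector-add kernel, array names, and sizes are illustrative assumptions, not taken from the paper.

    #include <stdio.h>

    #define N 1024

    int main(void) {
        /* Statically sized arrays; the offload clauses below describe
           how they are copied between host and coprocessor memory. */
        static float a[N], b[N], c[N];
        for (int i = 0; i < N; i++) {
            a[i] = (float)i;
            b[i] = 2.0f * (float)i;
        }

        /* The Intel compiler generates both host and coprocessor code
           for this region from the single annotated source. */
        #pragma offload target(mic) in(a, b) out(c)
        {
            #pragma omp parallel for
            for (int i = 0; i < N; i++)
                c[i] = a[i] + b[i];
        }

        printf("c[42] = %f\n", c[42]);
        return 0;
    }

Because host and coprocessor binaries are produced from one annotated source, the porting effort concentrates on the compute kernel itself; this single-source property is the basis of the productivity comparison drawn in the abstract.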

Keywords

Intel® Many Integrated Core Architecture · Intel® MIC Architecture · Intel® Knights Ferry · NVIDIA Fermi* · GPGPU · accelerators · coprocessors · data mining · sparse grids



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Alexander Heinecke (1)
  • Michael Klemm (3)
  • Dirk Pflüger (1)
  • Arndt Bode (2)
  • Hans-Joachim Bungartz (2)

  1. Technische Universität München, Garching, Germany
  2. Leibniz-Rechenzentrum der Bayerischen Akademie der Wissenschaften, Garching, Germany
  3. Intel GmbH, Feldkirchen, Germany