Cyme: A Library Maximizing SIMD Computation on User-Defined Containers

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 8488)

Abstract

This paper presents Cyme, a C++ library that abstracts the use of SIMD instructions while maximizing utilization of the underlying hardware. Unlike similar efforts such as Boost.simd or Vc, Cyme provides users with generic high-level containers that hide SIMD complexity. Cyme accomplishes this by (1) optimizing the abstract syntax tree with expression templates to prevent temporary copies and maximize the use of fused multiply-add (FMA) instructions, and (2) creating a data layout in memory (AoS or AoSoA) that minimizes data addressing and manipulation across all SIMD registers. The library has been implemented on the IBM Blue Gene/Q architecture using the 256-bit SIMD extensions (QPX) of the Power A2 processor. Its functionality is demonstrated on a computationally intensive kernel of a neuroscientific application, where an increase in GFlop/s performance by a factor of 6.72 over the original implementation is observed using the Clang compiler.
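
To make the first point of the abstract concrete, the following is a minimal sketch of how expression templates can avoid temporaries and collapse a multiply followed by an add into one fused operation. It is not Cyme's code: the names `Expr`, `Pack`, `Mul`, `Add`, `Fma` and the width `W` are hypothetical, a plain array stands in for a SIMD register, and `std::fma` stands in for a hardware FMA instruction (whether it maps to one depends on the target and compiler flags).

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>
#include <utility>

// All names here are hypothetical and chosen for illustration; this is not Cyme's API.
constexpr std::size_t W = 4;  // 4 doubles = 256 bits, matching the QPX register width

// CRTP base: marks a type as a node of the expression tree.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Leaf node: W lanes of double, standing in for one SIMD register.
struct Pack : Expr<Pack> {
    std::array<double, W> v;
    explicit Pack(double x = 0.0) { v.fill(x); }
    double eval(std::size_t i) const { return v[i]; }
    double operator[](std::size_t i) const { assert(i < W); return v[i]; }

    // Assigning an expression walks the tree once per lane and writes the
    // result directly into this pack; no intermediate Pack is materialized.
    template <class E>
    Pack& operator=(const Expr<E>& e) {
        for (std::size_t i = 0; i < W; ++i) v[i] = e.self().eval(i);
        return *this;
    }
};

// Interior nodes hold references to their operands and compute lazily.
template <class L, class R>
struct Mul : Expr<Mul<L, R>> {
    const L& l; const R& r;
    Mul(const L& l_, const R& r_) : l(l_), r(r_) {}
    double eval(std::size_t i) const { return l.eval(i) * r.eval(i); }
};

template <class L, class R>
struct Add : Expr<Add<L, R>> {
    const L& l; const R& r;
    Add(const L& l_, const R& r_) : l(l_), r(r_) {}
    double eval(std::size_t i) const { return l.eval(i) + r.eval(i); }
};

template <class A, class B, class C>
struct Fma : Expr<Fma<A, B, C>> {
    const A& a; const B& b; const C& c;
    Fma(const A& a_, const B& b_, const C& c_) : a(a_), b(b_), c(c_) {}
    double eval(std::size_t i) const { return std::fma(a.eval(i), b.eval(i), c.eval(i)); }
};

template <class L, class R>
Mul<L, R> operator*(const Expr<L>& l, const Expr<R>& r) {
    return Mul<L, R>(l.self(), r.self());
}

template <class L, class R>
Add<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Add<L, R>(l.self(), r.self());
}

// Tree rewrite: (a * b) + c matches this overload better than the generic
// operator+ above, so the compiler builds a single Fma node instead.
template <class A, class B, class C>
Fma<A, B, C> operator+(const Mul<A, B>& m, const Expr<C>& c) {
    return Fma<A, B, C>(m.l, m.r, c.self());
}

// Compile-time check that the rewrite really happens.
static_assert(
    std::is_same<decltype(std::declval<Pack>() * std::declval<Pack>() + std::declval<Pack>()),
                 Fma<Pack, Pack, Pack>>::value,
    "a * b + c should collapse into a single Fma node");
```

With `Pack a(2.0), b(3.0), c(1.0), r;`, the statement `r = a * b + c;` evaluates one `std::fma` per lane and leaves 7.0 in every lane of `r`. The sketch only rewrites the form `(a * b) + c`; the mirrored form `c + (a * b)` falls back to the generic `Add` node, and, as with any expression-template design that stores references, the expression should be assigned within the same statement rather than kept in an `auto` variable.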

Keywords

  • SIMD
  • Vectorization
  • Memory layout
  • C++
  • Generic Programming
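
The memory-layout keyword refers to the AoS and AoSoA layouts named in the abstract. As a rough illustration of the difference, the sketch below maps a (record, field) pair to a flat offset in each layout and repacks an AoS buffer into AoSoA. The function names and the parameters `F` (fields per record) and `W` (lanes per SIMD register) are assumptions made for this example and are not taken from Cyme.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; not Cyme's API.

// AoS: all F fields of record r are stored next to each other.
inline std::size_t aos_index(std::size_t r, std::size_t f, std::size_t F) {
    return r * F + f;
}

// AoSoA: records are grouped into blocks of W. Inside a block, each field
// occupies W contiguous slots, one per record, so a single aligned vector
// load fetches the same field of W consecutive records.
inline std::size_t aosoa_index(std::size_t r, std::size_t f,
                               std::size_t F, std::size_t W) {
    return ((r / W) * F + f) * W + (r % W);
}

// Repack an AoS buffer into AoSoA. The record count is rounded up to a
// multiple of W and the padding slots are zero-filled.
inline std::vector<double> to_aosoa(const std::vector<double>& aos,
                                    std::size_t F, std::size_t W) {
    assert(F > 0 && W > 0 && aos.size() % F == 0);
    const std::size_t n = aos.size() / F;         // number of records
    const std::size_t blocks = (n + W - 1) / W;   // ceil(n / W)
    std::vector<double> out(blocks * F * W, 0.0);
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t f = 0; f < F; ++f)
            out[aosoa_index(r, f, F, W)] = aos[aos_index(r, f, F)];
    return out;
}
```

For example, with `F = 2` and `W = 4`, field 0 of records 0 to 3 lands at offsets 0 to 3 and field 1 of the same records at offsets 4 to 7, so each group of four values can fill one 256-bit register without gather or shuffle operations.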

  • DOI: 10.1007/978-3-319-07518-1_29
  • Chapter length: 10 pages

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ewart, T., Delalondre, F., Schürmann, F. (2014). Cyme: A Library Maximizing SIMD Computation on User-Defined Containers. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2014. Lecture Notes in Computer Science, vol 8488. Springer, Cham. https://doi.org/10.1007/978-3-319-07518-1_29

  • DOI: https://doi.org/10.1007/978-3-319-07518-1_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07517-4

  • Online ISBN: 978-3-319-07518-1

  • eBook Packages: Computer Science, Computer Science (R0)