Cyme: A Library Maximizing SIMD Computation on User-Defined Containers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8488)

Abstract

This paper presents Cyme, a C++ library that abstracts the use of SIMD instructions while maximizing the use of the underlying hardware. Unlike similar efforts such as Boost.SIMD or Vc, Cyme provides users with generic high-level containers that hide SIMD complexity. Cyme accomplishes this by 1) optimizing the Abstract Syntax Tree using expression templates to prevent temporary copies and maximize the use of Fused Multiply-Add (FMA) instructions, and 2) creating a data layout in memory (AoS or AoSoA) that minimizes data addressing and manipulation across all SIMD registers. The Cyme library has been implemented on the IBM Blue Gene/Q architecture using the 256-bit SIMD extensions (QPX) of the Power A2 processor. The functionality of the library is demonstrated on a computationally intensive kernel of a neuroscientific application, where a 6.72-fold increase in GFlop/s performance over the original implementation is observed using the Clang compiler.
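The two techniques named in the abstract can be illustrated with a minimal C++ sketch. It is not Cyme's API: every name in it (`Pack`, `Leaf`, `Mul`, `Add`, `AoSoA`, `fma_update`, `kLanes`) is hypothetical, a scalar loop over four doubles stands in for one 256-bit register, and `std::fma` stands in for a hardware fused multiply-add instruction. Under those assumptions, the expression-template part shows how an `a * b + c` tree can be recognized at compile time and evaluated without temporaries, and the container part shows an AoSoA layout in which each field of a block is contiguous.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>
#include <vector>

// Assumption: 4 lanes, i.e. a 256-bit register holding 64-bit doubles.
constexpr std::size_t kLanes = 4;

// Pack models one SIMD register (hypothetical type, plain array underneath).
struct Pack { std::array<double, kLanes> v; };

// Expression nodes. Building them does no arithmetic and copies no Pack.
struct Leaf {
    const Pack& p;
    double lane(std::size_t i) const { return p.v[i]; }
};

template <class L, class R>
struct Mul {
    L l; R r;
    double lane(std::size_t i) const { return l.lane(i) * r.lane(i); }
};

template <class L, class R>
struct Add {
    static constexpr bool fused = false;
    L l; R r;
    double lane(std::size_t i) const { return l.lane(i) + r.lane(i); }
};

// Partial specialization: the tree (a * b) + c is evaluated as one fused
// multiply-add per lane instead of a multiply followed by an add.
template <class A, class B, class C>
struct Add<Mul<A, B>, C> {
    static constexpr bool fused = true;
    Mul<A, B> m; C c;
    double lane(std::size_t i) const {
        return std::fma(m.l.lane(i), m.r.lane(i), c.lane(i));
    }
};

// Trait restricting the operators below to expression nodes only.
template <class T> struct is_expr : std::false_type {};
template <> struct is_expr<Leaf> : std::true_type {};
template <class L, class R> struct is_expr<Mul<L, R>> : std::true_type {};
template <class L, class R> struct is_expr<Add<L, R>> : std::true_type {};

template <class L, class R,
          class = typename std::enable_if<is_expr<L>::value && is_expr<R>::value>::type>
Mul<L, R> operator*(L l, R r) { return {l, r}; }

template <class L, class R,
          class = typename std::enable_if<is_expr<L>::value && is_expr<R>::value>::type>
Add<L, R> operator+(L l, R r) { return {l, r}; }

// Walks the whole tree once per lane; the only Pack written is the result.
template <class E>
Pack eval(const E& e) {
    Pack out{};
    for (std::size_t i = 0; i < kLanes; ++i) out.v[i] = e.lane(i);
    return out;
}

// AoSoA storage: elements are grouped kLanes at a time, and within a group
// the kLanes values of each field are contiguous, so a field is one Pack.
template <std::size_t Fields>
class AoSoA {
public:
    explicit AoSoA(std::size_t n) : blocks_((n + kLanes - 1) / kLanes) {}
    double& at(std::size_t elem, std::size_t field) {
        return blocks_[elem / kLanes][field].v[elem % kLanes];
    }
    Pack& pack(std::size_t block, std::size_t field) { return blocks_[block][field]; }
    std::size_t blocks() const { return blocks_.size(); }
private:
    std::vector<std::array<Pack, Fields>> blocks_;  // zero-initialized
};

// Updates field 2 of every element: f2 = f0 * f1 + f2, one block at a time.
template <std::size_t Fields>
void fma_update(AoSoA<Fields>& data) {
    static_assert(Fields >= 3, "needs at least three fields");
    for (std::size_t b = 0; b < data.blocks(); ++b) {
        data.pack(b, 2) = eval(Leaf{data.pack(b, 0)} * Leaf{data.pack(b, 1)}
                               + Leaf{data.pack(b, 2)});
    }
}
```

A caller would fill an `AoSoA<3>` through `at(elem, field)` and then call `fma_update`. The partial specialization of `Add` is the design point: the fusion decision is made from the expression's type, so no run-time check is needed to pick the fused path.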

References

  1. Bik, A.J.C.: The Software Vectorization Handbook: Applying Intel Multimedia Extensions for Maximum Performance. Intel Press (2004)

  2. Esterie, P., Gaunard, M., Falcou, J., Lapresté, J.T., Rozoy, B.: Boost.SIMD: Generic programming for portable SIMDization. In: PACT, pp. 431–432. ACM (2012)

  3. Kretz, M., Lindenstruth, V.: Vc: A C++ library for explicit vectorization. Software: Practice and Experience 42(11), 1409–1430 (2012)

  4. Vandevoorde, D., Josuttis, N.M.: C++ Templates. Addison-Wesley (2002)

  5. http://software.intel.com/en-us/articles/intel-array-building-blocks

  6. Markram, H.: The Blue Brain Project. Nature Reviews Neuroscience 7(2) (2006)

  7. Hay, E., Hill, S., Schürmann, F., Markram, H., Segev, I.: Models of neocortical layer 5b pyramidal cells capturing a wide range of dendritic and perisomatic active properties. PLoS Comput. Biol. 7(7) (2011)

  8. http://www.neuron.yale.edu/neuron/

  9. Core Conductor Theory and Cable Properties of Neurons. J. Wiley & Sons (2011)

  10. Herculano-Houzel, S., Mota, B., Lent, R.: Cellular scaling rules for rodent brains. Proceedings of the National Academy of Sciences of the United States of America 103(32), 12138–12143 (2006)

  11. IBM System Blue Gene Solution: BG/Q Application Development. IBM (2013)

  12. Finkel, H.: http://trac.alcf.anl.gov/projects/llvm-bgq

  13. Salapura, V., Ganesan, K., Gara, A., Gschwind, M., Sexton, J.C., Walkup, R.: Next-generation performance counters: Towards monitoring over thousand concurrent events. In: ISPASS, pp. 139–146. IEEE (2008)

  14. Williams, S., Waterman, A., Patterson, D.: Roofline: An insightful visual performance model for multicore architectures. Commun. ACM 52(4), 65–76 (2009)

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ewart, T., Delalondre, F., Schürmann, F. (2014). Cyme: A Library Maximizing SIMD Computation on User-Defined Containers. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2014. Lecture Notes in Computer Science, vol 8488. Springer, Cham. https://doi.org/10.1007/978-3-319-07518-1_29

  • DOI: https://doi.org/10.1007/978-3-319-07518-1_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07517-4

  • Online ISBN: 978-3-319-07518-1

  • eBook Packages: Computer Science, Computer Science (R0)
