Multi-core and Many-core Processor Architectures



No book on programming would be complete without an overview of the hardware on which the software will execute. In this chapter we outline the main design principles and solutions applied when designing multi-core and many-core chips, the challenges facing the hardware industry, and an outlook on promising technologies not yet in common practice. The chapter's main goal is to introduce the reader to the most important processor architecture concepts (core organization, interconnects, memory architectures, support for parallel programming, etc.) relevant in the context of multi-core processors, as well as the most common processor architectures available today. We also analyze the challenges faced by processor designs as the number of cores continues to scale, and the emerging technologies (such as transactional memory, support for speculative threading, novel interconnects, and 3D stacking of memory) that will allow continued scaling of processors in terms of available computational power.


Keywords: Graphic Processing Unit; Memory Bandwidth; Transactional Memory; Cache Coherence; Helper Thread



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Oy L M Ericsson Ab, Jorvas, Finland
