The Synchronization Power of Coalesced Memory Accesses
Multicore architectures have established themselves as the new generation of processor architectures. As part of the evolution from one core to many cores, memory access mechanisms have advanced rapidly, and several new mechanisms have been implemented in modern commodity multicore processors. Because memory access mechanisms determine how processing cores access the shared memory, they directly influence the synchronization capabilities of multicore processors. It is therefore crucial to investigate the synchronization power of these new memory access mechanisms.
This paper investigates the synchronization power of coalesced memory accesses, a family of memory access mechanisms introduced in recent large multicore architectures such as the CUDA graphics processors. We first design three memory access models that capture the fundamental features of the new memory access mechanisms. We then prove the exact synchronization power of these models in terms of their consensus numbers. These tight results show that coalesced memory access mechanisms can facilitate strong synchronization between the threads of multicore processors without the need for synchronization primitives other than reads and writes. For contemporary CUDA processors, our results imply that the coalesced memory access mechanisms have consensus numbers up to sixteen.