Abstract
The Single-chip Cloud Computer (SCC) is an experimental multicore processor created by Intel Labs for the many-core research community to study many-core processors, their programmability, and their scalability in relation to communication models. It is based on a distributed memory architecture that combines small, fast on-chip memory with large off-chip private and shared memory. Its design is also meant to favour message passing over traditional shared-memory programming. To this end, the platform deliberately provides neither hardware-supported cache coherence nor atomic memory read/write operations across cores. Because of this limited hardware support, the algorithmic designs of concurrent data structures found in the literature are not suitable for it.
In this paper, we study the problem of designing concurrent data structures on such systems. By utilising their very efficient message passing together with the limited shared memory available, we provide two techniques that use the concept of a coordinator and one that combines local locks with message passing. All three achieve high concurrency and resiliency. These techniques allow us to design three efficient algorithms for concurrent FIFO queues. The techniques are general and can be used to implement other concurrent abstract data types. We also provide an experimental study of the proposed queues on the SCC platform, analysing how the throughput of our algorithms behaves under different memory placement policies.
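To make the coordinator idea concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes one dedicated coordinator that privately owns the FIFO, with clients reaching it only through messages, so no cache coherence or cross-core atomic operation is needed on the queue itself. Python threads and `queue.Queue` channels stand in for SCC cores and their message buffers; the names `coordinator`, `Client`, and `demo` are hypothetical.

```python
import threading
import queue
from collections import deque

ENQ, DEQ, STOP = "enq", "deq", "stop"

def coordinator(inbox):
    """Own the FIFO privately and serve one request message at a time."""
    fifo = deque()  # touched only by the coordinator, so it needs no lock
    while True:
        op, value, reply = inbox.get()
        if op == STOP:
            break
        if op == ENQ:
            fifo.append(value)
            reply.put(True)  # acknowledge the enqueue
        elif op == DEQ:
            # None signals an empty queue in this sketch
            reply.put(fifo.popleft() if fifo else None)

class Client:
    """A client that never touches the queue, only exchanges messages."""
    def __init__(self, inbox):
        self.inbox = inbox
        self.reply = queue.Queue(maxsize=1)  # per-client reply channel

    def enqueue(self, value):
        self.inbox.put((ENQ, value, self.reply))
        return self.reply.get()

    def dequeue(self):
        self.inbox.put((DEQ, None, self.reply))
        return self.reply.get()

def demo():
    inbox = queue.Queue()  # stand-in for the coordinator's message buffer
    worker = threading.Thread(target=coordinator, args=(inbox,))
    worker.start()
    client = Client(inbox)
    for i in range(3):
        client.enqueue(i)
    results = [client.dequeue() for _ in range(4)]  # last call finds it empty
    inbox.put((STOP, None, None))
    worker.join()
    return results
```

Calling `demo()` enqueues three items and dequeues four times, so the final dequeue reports an empty queue. The trade-off in this simplified form is that every operation costs a message round trip and the coordinator serialises all requests.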
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Walulya, I., Nikolakopoulos, Y., Papatriantafilou, M., Tsigas, P. (2014). Concurrent Data Structures in Architectures with Limited Shared Memory Support. In: Lopes, L., et al. Euro-Par 2014: Parallel Processing Workshops. Euro-Par 2014. Lecture Notes in Computer Science, vol 8805. Springer, Cham. https://doi.org/10.1007/978-3-319-14325-5_17
DOI: https://doi.org/10.1007/978-3-319-14325-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14324-8
Online ISBN: 978-3-319-14325-5
eBook Packages: Computer Science (R0)