Latencies of Conflicting Writes on Contemporary Multicore Architectures

  • Josef Weidendorfer
  • Michael Ott
  • Tobias Klug
  • Carsten Trinitis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4671)

Abstract

This paper provides a detailed investigation of the latency penalties caused by repeated writes from different threads of a parallel program to nearby memory cells. When such writes map to the same cache line in the caches of multiple processors, the so-called false sharing effect can be observed. Because cache hierarchies on contemporary processor architectures operate at cache-line granularity, this effect can slow down parallel code unnecessarily. This contribution presents a benchmark that allows quantitative estimates of the consequences of false sharing. Results show that multicore architectures with a shared cache can reduce the unwanted effects of false sharing.
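The mechanism described in the abstract can be illustrated with a small toy model. The sketch below is not the paper's benchmark and measures no real latencies. It assumes a 64-byte cache line and private per-core caches, and it counts how often a written line was last written by a different core, which is the situation that forces a coherence transfer. The names `cache_line`, `count_transfers`, and `interleaved_writes` are hypothetical helpers introduced only for this illustration.

```python
# Toy model (an assumption for illustration, not the paper's benchmark):
# each core has a private cache, and a write to a line that was last
# written by a different core counts as one ownership transfer.

LINE_SIZE = 64  # bytes; assumed value, common on x86 processors


def cache_line(addr: int, line_size: int = LINE_SIZE) -> int:
    """Return the index of the cache line containing byte address addr."""
    return addr // line_size


def count_transfers(writes, line_size: int = LINE_SIZE) -> int:
    """Count line ownership transfers for a sequence of (core, addr) writes."""
    owner = {}  # cache line index -> core that last wrote it
    transfers = 0
    for core, addr in writes:
        line = cache_line(addr, line_size)
        if line in owner and owner[line] != core:
            transfers += 1
        owner[line] = core
    return transfers


def interleaved_writes(addr_a: int, addr_b: int, rounds: int):
    """Core 0 writes addr_a and core 1 writes addr_b, strictly alternating."""
    writes = []
    for _ in range(rounds):
        writes.append((0, addr_a))
        writes.append((1, addr_b))
    return writes


# Two 8-byte counters in the same line: every write after the first
# touches a line last written by the other core.
adjacent = count_transfers(interleaved_writes(0, 8, rounds=1000))

# The same counters padded onto separate lines: no transfers occur.
padded = count_transfers(interleaved_writes(0, 64, rounds=1000))

print(adjacent, padded)  # prints "1999 0"
```

In this model the two threads never touch the same variable, yet the adjacent layout produces a transfer on almost every write, while padding the counters onto separate lines removes the transfers entirely. The model says nothing about how expensive each transfer is; that cost, and how a shared cache changes it, is what a real benchmark has to measure.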

Keywords

Multicore, CMP, False Sharing, Cache


Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Josef Weidendorfer (1)
  • Michael Ott (1)
  • Tobias Klug (1)
  • Carsten Trinitis (1)
  1. Technische Universität München, Lehrstuhl für Rechnertechnik und Rechnerorganisation / Parallelrechnerarchitektur, Boltzmannstraße 3, 85748 Garching bei München
