Part of the book series: Undergraduate Topics in Computer Science (UTICS)

Abstract

The discussion so far has assumed a shared memory address space in which every address has the same access cost. This assumption is unrealistic, particularly for multicore machines, where some of the memory references made by a parallel program can take significantly longer than others. In this chapter we consider a simplified model of parallel machines that demonstrates this claim and serves as a “formal” model for studying ParC’s memory references. Although this model does not simulate a multicore machine, it can be regarded as an intermediate stage between a uniform cost for shared memory references and the complexity of real multicore machines.

Notes

  1. Not all systems allow the reading of a hardware clock with sufficient granularity.

Author information

Corresponding author

Correspondence to Yosi Ben-Asher.

Copyright information

© 2012 Springer-Verlag London

About this chapter

Cite this chapter

Ben-Asher, Y. (2012). Locality and Synchronization. In: Multicore Programming Using the ParC Language. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-2164-0_3

  • DOI: https://doi.org/10.1007/978-1-4471-2164-0_3

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-2163-3

  • Online ISBN: 978-1-4471-2164-0

  • eBook Packages: Computer Science, Computer Science (R0)
