Timing Properties and Correctness for Structured Parallel Programs on x86-64 Multicores

  • Conference paper
Foundational and Practical Aspects of Resource Analysis (FOPARA 2015)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9964)

Abstract

This paper determines correctness and timing properties for structured parallel programs on x86-64 multicores. Multicore architectures are increasingly common, but real architectures have unpredictable timing properties, and commonly used relaxed-memory concurrency models mean that even functional correctness is not obvious. This paper takes a rigorous approach to correctness and timing properties, examining common locking protocols from first principles, and extending this through queues to structured parallel constructs. We prove functional correctness and derive simple timing models, extending these for the first time from low-level machine operations to high-level parallel patterns. Our derived high-level timing models for structured parallel programs allow us to accurately predict upper bounds on program execution times on x86-64 multicores.

Notes

  1. We will ignore shared cache here, since it does not have a significant impact on the proofs.

  2. In the sense that locks are not visible to the programmer, not in the sense that the hardware does not use them.

Acknowledgements

This work has been partially supported by the EU Horizon 2020 grant “RePhrase: Refactoring Parallel Heterogeneous Resource-Aware Applications – a Software Engineering Approach” (ICT-644235), by COST Action IC1202 (TACLe), supported by COST (European Cooperation in Science and Technology), and by EPSRC grant EP/M027317/1 “C³: Scalable & Verified Shared Memory via Consistency-directed Cache Coherence”.

Author information

Correspondence to Kevin Hammond.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Hammond, K., Brown, C., Sarkar, S. (2016). Timing Properties and Correctness for Structured Parallel Programs on x86-64 Multicores. In: van Eekelen, M., Dal Lago, U. (eds) Foundational and Practical Aspects of Resource Analysis. FOPARA 2015. Lecture Notes in Computer Science, vol 9964. Springer, Cham. https://doi.org/10.1007/978-3-319-46559-3_6

  • DOI: https://doi.org/10.1007/978-3-319-46559-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46558-6

  • Online ISBN: 978-3-319-46559-3

  • eBook Packages: Computer Science, Computer Science (R0)
