Abstract
Since the 1960s processors have, for efficiency, sometimes executed instructions out of program order, provided that the (sequential) semantics is preserved. On uniprocessor architectures this behaviour is not observable; however, multicore architectures can expose instruction reorderings as unexpected, or “weak”, behaviours, which are notoriously difficult to reason about. In this paper we introduce a novel program operator, parallelized sequential composition, where ‘\(c_1 \overset{\textsc{m}}{;} c_2\)’ may execute instructions of \(c_2\) before those of \(c_1\), depending on \(\textsc{m}\), which controls the reordering of atomic instructions. When appropriately instantiated the operator exhibits many of the weak behaviours of TSO, Release Consistency, Arm, and RISC-V, and generalises sequential and parallel composition. We show how the nondeterminism introduced by reordering can be reasoned about by reduction to sequential or parallel forms, from where established techniques (such as rely/guarantee or Owicki-Gries) can be applied. This gives a more direct, intuitive and compositional framework for reasoning about weak behaviours that arise from processor reordering than semantics that are based on complex data structures over properties of global traces. The semantics and theory are encoded and verified in Isabelle/HOL, and we use an implementation in the Maude rewriting engine to show empirically that the behaviours agree with hardware.
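As a rough illustration of the operator described above, the following Python sketch enumerates the execution orders of a straight-line program under a reordering relation. It is not the paper's Isabelle/HOL or Maude encoding: the names `Instr`, `traces`, `sc`, `par` and `tso_like` are hypothetical, instructions are reduced to a kind and a variable, and features such as forwarding, fences and multiple threads are left out. The assumption is that a later instruction may execute early only if the relation allows it to be reordered before every earlier instruction still pending.

```python
from typing import Callable, Iterator, NamedTuple, Tuple


class Instr(NamedTuple):
    kind: str  # "store" or "load" (a simplifying assumption)
    var: str   # name of the shared variable accessed


# m(a, b) is True when the later instruction b may execute before the earlier a.
Model = Callable[[Instr, Instr], bool]


def sc(a: Instr, b: Instr) -> bool:
    """Never reorder: behaves like ordinary sequential composition."""
    return False


def par(a: Instr, b: Instr) -> bool:
    """Always reorder: any instruction may overtake any other."""
    return True


def tso_like(a: Instr, b: Instr) -> bool:
    """Simplified TSO-style rule: a load may overtake an earlier store
    to a different variable. Real TSO has more detail than this."""
    return a.kind == "store" and b.kind == "load" and a.var != b.var


def traces(prog: Tuple[Instr, ...], m: Model) -> Iterator[Tuple[Instr, ...]]:
    """Yield every execution order of prog permitted by m."""
    if not prog:
        yield ()
        return
    for j, b in enumerate(prog):
        # b may go next only if it can be reordered before all pending
        # instructions that precede it in program order.
        if all(m(a, b) for a in prog[:j]):
            rest = prog[:j] + prog[j + 1:]
            for tail in traces(rest, m):
                yield (b,) + tail


if __name__ == "__main__":
    prog = (Instr("store", "x"), Instr("load", "y"))
    for name, model in [("sc", sc), ("par", par), ("tso_like", tso_like)]:
        print(name, len(list(traces(prog, model))))
```

Under these assumptions, a store to `x` followed by a load of `y` has a single order under `sc` and two orders under `par` and `tso_like`, whereas a store and load of the same variable keep program order under `tso_like`. This mirrors, in miniature, the claim that the operator ranges between sequential and parallel composition depending on how \(\textsc{m}\) is instantiated.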
Notes
- 1.
- 2.
- 3. We give an atomic expression evaluation semantics for assignments and guards, which is typically reasonable for assembler-level instructions.
- 4. We use finite loops only to avoid the usual complications infinite loops introduce, which are orthogonal to the effects of instruction reordering.
- 5. We deal only with partial correctness as we consider only finite traces.
- 6.
- 7. The proviso on (30) rules out miraculous cases (if \((\!\vert p \vert\!) \cdot c\) is infeasible then \(\{p\}\, c \, \{q\}\) trivially holds for all \(q\), but \(q\) is not reachable in this case).
- 8. Arm’s LDAPR explicitly weakens the ordering between release/acquire instructions, which can be handled by distinguishing annotations syntactically rather than within the memory model definition.
Acknowledgements
We thank Graeme Smith, Kirsten Winter, Nicholas Coughlin and Ian Hayes for feedback on this work, and anonymous reviewers of earlier versions. We also thank Luc Maranget, Jade Alglave, and Christopher Pulte for assistance with litmus test analysis.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Colvin, R.J. (2021). Parallelized Sequential Composition and Hardware Weak Memory Models. In: Calinescu, R., Păsăreanu, C.S. (eds) Software Engineering and Formal Methods. SEFM 2021. Lecture Notes in Computer Science, vol 13085. Springer, Cham. https://doi.org/10.1007/978-3-030-92124-8_12
DOI: https://doi.org/10.1007/978-3-030-92124-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92123-1
Online ISBN: 978-3-030-92124-8
eBook Packages: Computer Science, Computer Science (R0)