We consider parameterized verification of systems executing according to the total store ordering (TSO) semantics. The processes manipulate abstract data types over potentially infinite domains. We present a framework that translates the reachability problem for such systems to the reachability problem for register machines enriched with the given abstract data type. We use the translation to obtain tight complexity bounds for TSO-based parameterized verification over several abstract data types, such as push-down automata, ordered multi push-down automata, one-counter nets, one-counter automata, and Petri nets. We apply the framework to get complexity bounds for higher order stack and counter variants as well.

1 Introduction

A parameterized system consists of a fixed but arbitrary number of identical processes that execute in parallel. The goal of parameterized verification is to prove the correctness of the system regardless of the number of processes. Examples of such systems are sensor networks, leader election protocols, and mutual exclusion protocols. The topic has been the subject of intensive research for more than three decades (see e.g. [6, 10, 13, 32]), and it is the subject of one chapter of the Handbook of Model Checking [8]. Research on parameterized verification has been mostly conducted under the premise that (i) the processes run according to the classical Sequential Consistency (SC) semantics, and (ii) the processes are finite-state machines.

Under SC, the processes operate on a set of shared variables through which they communicate atomically, i.e., read and write operations take effect immediately. In particular, a write operation is visible to all the processes as soon as the writing process carries out its operation. Therefore, the processes always maintain a uniform view of the shared memory: they all see the latest value written on any given variable, hence we can interpret program runs as interleavings of sequential process executions. Although SC has been immensely popular as an intuitive way of understanding the behaviours of concurrent processes, it is no longer realistic to assume that computation platforms guarantee SC. The reason is that, due to hardware and compiler optimizations, most modern platforms allow more relaxed program behaviours than those permitted under SC, leading to so-called weak memory models. Weakly consistent platforms are found at all levels of system design such as multiprocessor architectures (e.g., [47, 48]), cache protocols (e.g., [21, 46]), language level concurrency (e.g., [41]), and distributed data stores (e.g., [17]). Therefore, in recent years, research on the parameterized verification of concurrent programs under weak memory models has started to become popular. Notable examples are the cases of the TSO semantics [4] and the Release-Acquire semantics of C11 [39].

In a parallel development, several works have extended the basic model of parameterized systems (under the SC semantics) by considering processes that are infinite-state systems. The most dominant such class has been the case where the individual processes are variants of push-down automata [28, 30, 33, 36, 40, 42].

Parameterized verification is difficult, even under the original assumption of both SC and finite-state processes as we still need to handle an infinite state space. The extension to weakly consistent systems is even more complex due to the intricate extra process behaviours. Almost all weak memory models induce infinite state spaces even without parameterization and even when the program itself is finite-state. Therefore, performing parameterized verification under weak consistency requires handling a state space that is infinite in two dimensions; one due to parameterization and one due to the weak memory model. The same applies to the extension of parameterized verification under SC where the processes are infinite-state: in addition to infiniteness due to parameterization, we have a second source of infinity due to the infiniteness of the processes.

In this paper, we combine the above two extensions. We study parameterized verification of programs under the TSO semantics, where the processes use infinite data structures such as stacks and counters. The framework is uniform in that the manipulation of the data structure can be described using an abstract data type.

We revisit the pivot abstraction technique presented in [4]. As a first contribution, we show that we can capture pivot abstraction precisely, using a class of register machines in which the registers assume values over a finite domain. We show that, for any given abstract data type \(\textsf{A}\), we can reduce, in polynomial time, the parameterized verification problem under TSO and \(\textsf{A}\) to the reachability problem for register machines manipulating \(\textsf{A}\). Furthermore, we show that the reduction also holds in the other direction: the reachability problem for register machines over \(\textsf{A}\) is polynomial-time reducible to the parameterized verification problem under TSO for \(\textsf{A}\). In particular, the register machine model abstracts away the semantics of TSO (in fact, it abstracts away concurrency altogether) since we are dealing with a single register machine.

We summarize the contributions of the paper as follows:

  • We present a register abstraction scheme that captures the behaviour of parameterized systems under the TSO semantics.

  • We translate parameterized verification under the TSO semantics when the processes manipulate an ADT \(\textsf{A}\), to the reachability problem for register machines operating over \(\textsf{A}\).

  • We instantiate the framework for deciding the complexity of parameterized verification under TSO for different abstract data types. In particular, we show the problem is PSpace-complete when \(\textsf{A}\) is a counter, ExpTime-complete if \(\textsf{A}\) is a stack, 2-ETime-complete if \(\textsf{A}\) is an ordered multi stack, and ExpSpace-complete if \(\textsf{A}\) is a Petri net. We obtain further complexity bounds for higher order counters and stacks.

Related Work There has been an extensive research effort on parameterized verification since the 1980s (see [8, 13] for recent surveys of the field). Early works showed the undecidability of the general problem (even assuming finite-state processes) [10], and hence the emphasis has been on finding useful special cases. Such cases are characterized by three aspects, namely the system topology (un-ordered, arrays, trees, graphs, rings, etc.), the allowed communication patterns (shared memory, Rendez-vous, broadcast, lossy channels, etc.), and the process types (anonymous, with IDs, with priorities, etc.) [20, 23, 24, 27, 31, 43].

Another line of research to counter undecidability considers over-approximations based on regular model checking [1, 14, 16, 38], monotonic abstraction [5], and symmetry reduction [7, 22, 37].

A seminal work in the area is the paper by German and Sistla [32]. The authors consider the verification of systems consisting of an arbitrary number of finite-state processes interacting through Rendez-Vous communication. The paper shows that the model checking problem is ExpSpace-complete. In a series of more recent papers, parameterized verification has been considered in the case where the individual processes are push-down automata [28, 30, 33, 36, 40, 42]. All the above works assume the SC semantics.

Due to the relevance of weak memory models in parameterized verification, papers on the topic have started to appear in the last two years. The paper [4] considers parameterized verification of programs running under TSO, and shows that the reachability problem is PSpace-complete. However, the paper assumes that the processes are finite-state and, in particular, the processes do not manipulate unbounded data domains. The model of the paper corresponds to the particular case of our framework where we take the abstract data type to be empty. In this case our framework also implies PSpace-completeness.

The paper [39] shows PSpace-completeness when the underlying semantics is the Release-Acquire fragment of C11. The latter gives rise to different behaviours compared to TSO. That paper also assumes finite-state processes.

The paper [2] considers parameterized verification of programs running under TSO. However, the paper applies the framework of well-structured systems where the buffers of the processes are modeled as lossy channels, and hence the complexity of the algorithm is non-primitive recursive. In particular, the paper does not give any complexity bounds for the reachability problem (or any other verification problems). Conchon et al. [19] address the parameterized verification of programs under TSO as well. They make use of a model checker modulo theories; no decidability or complexity results are given. The paper [15] considers checking the robustness property against SC for parameterized systems running under the TSO semantics. However, the robustness problem is entirely different from reachability and the techniques and results developed in the paper cannot be applied in our setting. The paper shows that the problem is ExpSpace-hard. All these works assume finite-state processes.

In contrast to all the above works, the current paper is the first to study decidability and complexity of parameterized verification under the TSO semantics when the individual processes are infinite-state.

2 Preliminaries

We denote a function f between sets A and B by \(f: A \mathop {\xrightarrow {~~}}B\). We write \(f[a \leftarrow b]\) to denote the function \(f'\) such that \(f'(a) = b\) and \(f'(x) = f(x)\) for all \(x \ne a\).

For a finite set A, we use \(|A|\) to refer to the size of A. We also use \(A^{*}\) to denote the set of words over A including the empty word \(\epsilon \). For a word \(w \in A^{*}\), we use \(|w|\) to refer to the length of w. We say a word w is differentiated if all symbols in w are pairwise different. The set \(A^\textsf{diff}\) is the set of all differentiated words over the set A. Finally, for a differentiated word w, we define \(\mathop {\textrm{pos}}(w)(a) \) as the unique position of the letter a in w.

A labelled transition system is a tuple \(\langle \textsf{C}, \textsf{C}_\textsf{init}, \textsf{Labs}, \mathop {\xrightarrow {~~}}\rangle \), where \(\textsf{C}\) is the set of configurations, \(\textsf{C}_\textsf{init}\subseteq \textsf{C}\) is the set of initial configurations, \(\textsf{Labs}\) is a finite set of labels and \(\mathop {\xrightarrow {~~}}\subseteq \textsf{C}\times \textsf{Labs}\times \textsf{C}\) is the transition relation over the set of configurations. For a transition \(\langle \textsf{c}_1, \textsf{lab}, \textsf{c}_2\rangle \in \mathop {\xrightarrow {~~}}\), we usually write \(\textsf{c}_1 \mathop {\xrightarrow {~\textsf{lab}~}} \textsf{c}_2\) instead. We use \(\textsf{c}_1 \mathop {\xrightarrow {~~}}\textsf{c}_2\) to denote that \(\textsf{c}_1 \mathop {\xrightarrow {~\textsf{lab}~}} \textsf{c}_2\) for some \(\textsf{lab}\in \textsf{Labs}\). Furthermore, we write \(\mathop {\xrightarrow {~*~}}\) to denote the transitive reflexive closure over \(\mathop {\xrightarrow {~~}}\), and if \(\textsf{c}_1 \mathop {\xrightarrow {~*~}} \textsf{c}_2\) then we say \(\textsf{c}_2\) is reachable from \(\textsf{c}_1\). If \(\textsf{c}_1 \in \textsf{C}_\textsf{init}\), then we just say that \(\textsf{c}_2\) is reachable. A run \(\mathsf {\rho }\) is an alternating sequence of configurations and labels and is expressed as follows: \( \textsf{c}_0 \mathop {\xrightarrow {~\textsf{lab}_1~}} \textsf{c}_1 \mathop {\xrightarrow {~\textsf{lab}_2~}} \textsf{c}_2 \dots \textsf{c}_{n-1} \mathop {\xrightarrow {~\textsf{lab}_n~}} \textsf{c}_n ~.\) Given \(\mathsf {\rho }\), we write \(\textsf{c}_0 \mathop {\xrightarrow {~n~}} \textsf{c}_n\) meaning that \(\textsf{c}_n\) is reachable from \(\textsf{c}_0\) by n steps, and we write \(\textsf{c}_0 \mathop {\xrightarrow {~\rho ~}} \textsf{c}_n\) meaning that \(\textsf{c}_n\) is reachable from \(\textsf{c}_0\) through the run \(\mathsf {\rho }\).
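To make the reachability notion concrete, the following minimal Python sketch explores a labelled transition system breadth-first; the representation (a successor function returning label/configuration pairs) and all names are ours, not from the paper.

```python
from collections import deque

def reachable(initial, successors, target):
    """Breadth-first search over a labelled transition system.

    `initial` is an iterable of initial configurations, `successors`
    maps a configuration to its outgoing (label, configuration) pairs,
    and `target` is a predicate on configurations.
    """
    seen = set(initial)
    queue = deque(seen)
    while queue:
        c = queue.popleft()
        if target(c):
            return True
        for _lab, c2 in successors(c):
            if c2 not in seen:
                seen.add(c2)
                queue.append(c2)
    return False

# A two-state toggle system: state 0 flips to 1 and back.
succ = lambda c: [("toggle", 1 - c)]
print(reachable([0], succ, lambda c: c == 1))  # True
```

This only terminates on finite (or finitely explorable) state spaces; the systems studied in the paper are infinite-state, which is exactly why dedicated decision procedures are needed.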

3 Abstract Data Types (ADT)

In this section, we introduce the notion of abstract data types (ADTs) which will be used extensively in the paper. An ADT is a labelled transition system \(\textsf{A}= \langle \textsf{Vals}, \{\textsf{val}_\textsf{init}\}, \textsf{Ops}, \mathop {\xrightarrow {~~}}\nolimits _{\textsf{A}}\rangle \). Intuitively, this describes the behaviour of some data type such as a stack, or a counter. \(\textsf{Vals}\) is the set of configurations of \(\textsf{A}\). It describes the possible values the data type can assume. The initial configuration is \(\textsf{val}_\textsf{init}\in \textsf{Vals}\). The set of labels \(\textsf{Ops}\) represents the operations that can be executed on the data type and the transition relation \(\mathop {\xrightarrow {~~}}\nolimits _{\textsf{A}}\subseteq \textsf{Vals}\times \textsf{Ops}\times \textsf{Vals}\) describes the semantics of these operations. Below, we give some concrete examples of abstract data types.

Example 1 (Counter)

We define a counter, denoted by the ADT \(\textsc {Ct}\), as follows. The set of configurations \(\textsf{Vals}^\textsc {Ct}= \mathbb {N}\) are the natural numbers. The initial value, denoted by \(\textsf{val}_\textsf{init}^\textsc {Ct}\), is 0. The set of operations is \(\textsf{Ops}^\textsc {Ct}= \{ \texttt{inc}, \texttt{dec}, \texttt{isZero}\}\). The transition relation \(\mathop {\xrightarrow {~~}}\nolimits _{\textsc {Ct}}\) is as follows: The operations \(\texttt{inc}\) and \(\texttt{dec}\) increase or decrease the value of the counter by one, respectively. The latter operation is only enabled if the value of the counter is non-zero, otherwise it blocks. Finally, the transition \(\texttt{isZero}\) checks that the value of the counter is zero, i.e. it is only enabled if that condition is true.
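The counter ADT \(\textsc {Ct}\) can be sketched in Python as a partial transition function; encoding a blocked transition as `None` is our own choice, not the paper's.

```python
def counter_step(val, op):
    """One transition of the counter ADT Ct: returns the successor
    value, or None if the operation is not enabled (it blocks)."""
    if op == "inc":
        return val + 1
    if op == "dec":
        return val - 1 if val > 0 else None   # blocks at zero
    if op == "isZero":
        return val if val == 0 else None      # test, value unchanged
    raise ValueError(f"unknown operation {op}")

# dec and isZero block on the wrong values:
assert counter_step(0, "dec") is None
assert counter_step(0, "isZero") == 0
assert counter_step(2, "inc") == 3
```

Dropping the `isZero` branch yields exactly the weak counter \(\textsc {wCt}\) of Example 2.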

Example 2 (Weak Counter)

A weak counter differs from a counter in that it cannot be checked for zero. The ADT \(\textsc {wCt}\) representing a weak counter is defined as in Example 1, except the operations of \(\textsc {wCt}\) are reduced to \(\textsf{Ops}^\textsc {wCt} = \{ \texttt{inc}, \texttt{dec}\}\).

Example 3 (Stack)

Let \(\varGamma \) be a finite set representing the stack alphabet. A stack \(\textsc {St}= \langle \textsf{Vals}^\textsc {St}, \{\textsf{val}_\textsf{init}^\textsc {St}\}, \textsf{Ops}^\textsc {St}, \mathop {\xrightarrow {~~}}\nolimits _{\textsc {St}}\rangle \) on \(\varGamma \) is defined as follows. The configurations of \(\textsc {St}\) are \(\textsf{Vals}^\textsc {St}= \varGamma ^{*}\) and the initial configuration is the empty stack \(\textsf{val}_\textsf{init}^\textsc {St}= \varepsilon \). The set of operations is \(\textsf{Ops}^\textsc {St}= \{ \texttt{pop}(\gamma ), \texttt{push}(\gamma ), \texttt{isEmpty}\mid \gamma \in \varGamma \}\). The transition relation is as follows. For every word \(w\in \varGamma ^{*}\) and every symbol \(\gamma \in \varGamma \), \(\texttt{push}(\gamma )\) adds the symbol \(\gamma \) to the top of the stack. Similarly, the \(\texttt{pop}(\gamma )\) operation removes the topmost symbol from the stack. It is only enabled if the topmost symbol on the stack is \(\gamma \). The \(\texttt{isEmpty}\) operation does not change the stack, but can only be performed if the stack is the empty word \(\varepsilon \).
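In the same style as the counter above, the stack ADT \(\textsc {St}\) can be sketched as follows; representing a stack configuration as a tuple with the top at the end, and blocking as `None`, are our own encoding choices.

```python
def stack_step(val, op, arg=None):
    """One transition of the stack ADT St over some alphabet Γ.
    `val` is a tuple of symbols (top of stack last); None = blocked."""
    if op == "push":
        return val + (arg,)                              # push(γ)
    if op == "pop":
        return val[:-1] if val and val[-1] == arg else None  # pop(γ)
    if op == "isEmpty":
        return val if val == () else None                # test only
    raise ValueError(f"unknown operation {op}")

s = stack_step((), "push", "a")
s = stack_step(s, "push", "b")
assert stack_step(s, "pop", "a") is None   # 'a' is not the topmost symbol
assert stack_step(s, "pop", "b") == ("a",)
```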

Example 4 (Petri Nets)

Given a Petri net [44], we can define a corresponding ADT \(\textsc {Petri}\) that models its semantics. The values are the markings, the operations are the Petri net transitions, and the transition relation is given by the input and output vectors of the Petri net transitions.

Higher Order ADTs We extend the ADT \(\textsc {St}\) to higher order stacks referred to as \(n\text {-}\textsc {St} \). This is done recursively [18, 25]. The formal definition is in the full version of our paper [3]. A value of a level n higher order stack \(n\text {-}\textsc {St}\) is a stack of level \(n-1\) stacks. For level 1, it is the standard stack \(\textsc {St}\). The operations for level n are \(\textsf{Ops}^{{n} \text {-} \textsc {St}} = \{\texttt{pop}(\gamma ), \texttt{push}(\gamma ), \texttt{pop}_k, \texttt{push}_k ~ | ~ \gamma \in \varGamma , 2 \le k \le n\}\). The operations \(\texttt{pop}(\gamma )\) and \( \texttt{push}(\gamma )\) are recursively applied to the top element in the stack (which consists of a stack that is one level lower) until the level of the top element is 1. Here, they have the standard stack behaviour. Operations \(\texttt{pop}_k\) and \( \texttt{push}_k\) are recursively applied to the top element until the level of the element is k. Then, for \(\texttt{push}_k\), a copy of this level k stack is pushed on top of the original, and for \(\texttt{pop}_k\), the topmost copy is removed.

Since a counter can be seen as a stack with an alphabet of size 1 (and a bottom element \(\bot \)), we can extend the definitions of \(\textsc {wCt}\) and \(\textsc {Ct}\) to \(n\text {-}\textsc {wCt}\) and \(n\text {-}\textsc {Ct}\) in the same way. We add operations \(\texttt{inc}_k, \texttt{dec}_k\). All operations are recursively applied to the top counter. For \(\texttt{inc}, \texttt{dec}, \texttt{isZero}\), we use the standard behaviour once the level is 1. For \(\texttt{inc}_k, \texttt{dec}_k\), we copy/remove the top element once the level is k.
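The recursive structure of higher order stacks can be sketched in Python as nested lists (a level-1 stack is a list of symbols, a level-n stack a list of level-\((n-1)\) stacks). This follows the standard recursive definition, with \(\texttt{push}_k\) duplicating the topmost level-\((k-1)\) stack; the representation and function names are ours.

```python
import copy

def hs_push(s, level, gamma):
    """push(γ): recurse into the top element down to level 1."""
    if level == 1:
        return s + [gamma]
    return s[:-1] + [hs_push(s[-1], level - 1, gamma)]

def hs_pop(s, level, gamma):
    """pop(γ): blocked (None) unless γ is the topmost level-1 symbol."""
    if level == 1:
        return s[:-1] if s and s[-1] == gamma else None
    inner = hs_pop(s[-1], level - 1, gamma)
    return None if inner is None else s[:-1] + [inner]

def hs_push_k(s, level, k):
    """push_k: duplicate the topmost level-(k-1) stack."""
    if level == k:
        return s + [copy.deepcopy(s[-1])] if s else None
    inner = hs_push_k(s[-1], level - 1, k)
    return None if inner is None else s[:-1] + [inner]

# A level-2 stack holding one level-1 stack:
s2 = [["a", "b"]]
s2 = hs_push_k(s2, 2, 2)        # duplicate the top level-1 stack
assert s2 == [["a", "b"], ["a", "b"]]
s2 = hs_push(s2, 2, "c")        # push onto the new topmost copy only
assert s2 == [["a", "b"], ["a", "b", "c"]]
```

The corresponding \(\texttt{pop}_k\) would symmetrically remove the topmost copy instead of duplicating it.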

Example 5 (Ordered Multi Stack)

We extend the stack to a numbered list of n stacks, \(n\text {-}\textsc {OMSt}\) [12]. A value of \(n\text {-}\textsc {OMSt}\) consists of a list of stacks \(\textsf{val}_1^\textsc {St}\ldots \textsf{val}_n^\textsc {St}\). An operation from \(\textsf{Ops}^{n\text {-}\textsc {OMSt}}=\{\texttt{isEmpty}_i, \texttt{pop}_i(\gamma ), \texttt{push}_i(\gamma ) ~ | ~ \gamma \in \varGamma , i \le n\}\) works on stack number i in the standard way. One additional condition is that the stacks have to be ordered, meaning an operation \(\texttt{pop}_i(\gamma )\) is only enabled if the stacks \(1\ldots i-1\) are empty.
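The ordering condition of Example 5 can be sketched as follows; the tuple-of-tuples representation of a configuration and the function name are ours, and a blocked operation again returns `None`.

```python
def omst_step(stacks, op, i, gamma=None):
    """One transition of the ordered multi stack n-OMSt; `stacks` is a
    tuple of stacks (stack number i is stacks[i-1], top at the end)."""
    s = stacks[i - 1]
    if op == "push":
        new = s + (gamma,)
    elif op == "pop":
        # ordering condition: stacks 1..i-1 must be empty
        if any(stacks[j] for j in range(i - 1)):
            return None
        if not (s and s[-1] == gamma):
            return None
        new = s[:-1]
    elif op == "isEmpty":
        if s != ():
            return None
        new = s
    else:
        raise ValueError(f"unknown operation {op}")
    return stacks[:i - 1] + (new,) + stacks[i:]

cfg = omst_step(((), ()), "push", 2, "a")
assert omst_step(cfg, "pop", 2, "a") == ((), ())   # stack 1 is empty: ok
cfg2 = omst_step(cfg, "push", 1, "b")
assert omst_step(cfg2, "pop", 2, "a") is None      # stack 1 is not empty
```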

4 TSO with an Abstract Data Type: \(\textsf{TSO}(\textsf{A})\)

In this section, we introduce concurrent programs running under \(\textsf{TSO}(\textsf{A})\) for an ADT \(\textsf{A}= \langle \textsf{Vals}, \{ \textsf{val}_\textsf{init}\}, \textsf{Ops}, \mathop {\xrightarrow {~~}}\nolimits _{\textsf{A}}\rangle \). These programs consist of concurrent processes where the communication between processes is performed using shared memory under the TSO semantics. In addition, each process maintains a local variable of type \(\textsf{A}\).

Syntax of \(\textsf{TSO}(\textsf{A})\). Let \(\textsf{Dom}\) be a finite data domain and \(\textsf{Vars}\) be a finite set of shared variables over \(\textsf{Dom}\). Let \(\textsf{d}_\textsf{init}\in \textsf{Dom}\) be the initial value of the variables. We define the instruction set of \(\textsf{TSO}(\textsf{A})\) as \(\textsf{Instrs}= \{ \texttt{rd}(\mathsf {\textsf{x}, \textsf{d}}), \texttt{wr}(\mathsf {\textsf{x}, \textsf{d}})\mid \textsf{x}\in \textsf{Vars}, \textsf{d}\in \textsf{Dom}\} \cup \{ \texttt{skip}, \texttt{mf}\}\), which are called read, write, skip and memory fence, respectively.

A process is represented by a finite state transition system. It is given by the tuple \(\textsf{Proc}= \langle \textsf{Q}, \textsf{q}_\textsf{init}, \delta \rangle \), where \(\textsf{Q}\) is a finite set of states, \(\textsf{q}_\textsf{init}\in \textsf{Q}\) is the initial state, and \(\delta \subseteq \textsf{Q}\times (\textsf{Instrs}\cup \textsf{Ops}) \times \textsf{Q}\) is the transition relation. We call this tuple the description of the process. A concurrent program is a tuple of processes \(\mathcal {P}= \langle \textsf{Proc}_\iota \rangle _{\iota \in \mathcal {I}}\), where \(\mathcal {I}\) is some finite set of process identifiers. For each \(\iota \in \mathcal {I}\) we have \(\textsf{Proc}_\iota = \langle \textsf{Q}^\iota , \textsf{q}_\textsf{init}^\iota , \delta ^\iota \rangle \).

Semantics of \(\textsf{TSO}(\textsf{A})\). We describe the semantics of a program \(\mathcal {P}\) running under \(\textsf{TSO}(\textsf{A})\) by a labelled transition system \(\mathcal {T}_\mathcal {P}= \langle \textsf{C}^\mathcal {P}, \textsf{C}_\textsf{init}^\mathcal {P}, \textsf{Labs}^\mathcal {P}, \mathop {\xrightarrow {~~}}\nolimits _{\mathcal {P}}\rangle \). The formal definition is given in [3]. Under \(\textsf{TSO}(\textsf{A})\), there is an unbounded FIFO buffer of writes between each process and the memory. A configuration \(\textsf{c}\in \textsf{C}^\mathcal {P}\) of the system consists of the value of each variable in the shared memory as well as for each process: its local state, its value of the ADT, and the content of the corresponding write buffer.

The labelled transitions \(\mathop {\xrightarrow {~~}}\nolimits _{\mathcal {P}}\) are as follows: A local \(\texttt{skip}\) transition simply updates the state of the corresponding process. An ADT operation additionally updates the ADT value according to the ADT behaviour \(\mathop {\xrightarrow {~~}}\nolimits _{\textsf{A}}\). When a process executes a write instruction, the operation is enqueued as a pending write message into its buffer. A message \(\textsf{msg}\) is an assignment of the form \(\textsf{msg}=\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \), where \(\textsf{x}\in \textsf{Vars}\) and \(\textsf{d}\in \textsf{Dom}\). We denote the set of all messages by \(\textsf{Msgs}= \textsf{Vars}\times \textsf{Dom}\). The buffer content for a process is given as a word over \(\textsf{Msgs}\). The messages inside each buffer are moved non-deterministically to the main memory in a FIFO manner. Once a message reaches the memory, it becomes visible to all the other processes. When executing a read instruction on a variable \(\textsf{x}\in \textsf{Vars}\), the process first checks its buffer for pending write messages on \(\textsf{x}\). If the buffer contains such a message, then it reads the value of the most recent one. If the buffer contains no write messages on \(\textsf{x}\), then the process fetches the value of \(\textsf{x}\) from the memory. The initial configuration is \(\textsf{c}_\textsf{init}^\mathcal {P}\), where each process is in its initial state, each ADT holds its initial value, each store buffer is empty and the memory holds the initial values of all variables. Note that since the FIFO buffers are unbounded, this is an infinite state transition system, even for a finite ADT.
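The buffer-based write, read-own-write, and propagate steps described above can be sketched for a single process as follows. This is a simplified model that ignores fences and the ADT; the class and method names are ours.

```python
from collections import deque

class TSOProcess:
    """Store-buffer view of one TSO process: writes are buffered in a
    FIFO of (variable, value) messages; memory is a shared dict."""

    def __init__(self, memory):
        self.memory = memory
        self.buffer = deque()

    def write(self, x, d):
        self.buffer.append((x, d))      # enqueued, not yet visible

    def read(self, x):
        # read-own-write: latest pending write on x, if any
        for var, d in reversed(self.buffer):
            if var == x:
                return d
        return self.memory[x]           # otherwise fetch from memory

    def propagate(self):
        # the oldest pending write becomes visible to everyone
        x, d = self.buffer.popleft()
        self.memory[x] = d

mem = {"x": 0}
p, q = TSOProcess(mem), TSOProcess(mem)
p.write("x", 1)
assert p.read("x") == 1    # p sees its own buffered write
assert q.read("x") == 0    # q still sees the old memory value
p.propagate()
assert q.read("x") == 1    # after propagation the write is global
```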

A sequence of transitions \(\textsf{c}_0 \mathop {\xrightarrow {~\textsf{lab}_1~}}\nolimits _{\mathcal {P}}\textsf{c}_1 \mathop {\xrightarrow {~\textsf{lab}_2~}}\nolimits _{\mathcal {P}}\textsf{c}_2 \dots \textsf{c}_{n-1} \mathop {\xrightarrow {~\textsf{lab}_n~}}\nolimits _{\mathcal {P}}\textsf{c}_n\) where \(\textsf{c}_0 = \textsf{c}_\textsf{init}^\mathcal {P}\) is the initial configuration and \(\textsf{lab}_i\in \textsf{Labs}^\mathcal {P}\) is called a run in the \(\textsf{TSO}(\textsf{A})\) transition system. If there is a run ending in a configuration with state \(\textsf{q}_\textsf{final}\), then we say \(\textsf{q}_\textsf{final}\) is reachable by \(\textsf{Proc}\) under \(\textsf{TSO}(\textsf{A})\).

5 Parameterized Reachability in \(\textsf{TSO}(\textsf{A})\)

In this section, we consider the parameterized TSO setting which allows for an a priori unbounded number of processes with the same process description. We begin by formally introducing the parameterized state reachability problem, and then develop a generic construction that allows us to represent the TSO semantics (except for the ADT) in a finite manner.

The Parameterized State Reachability Problem Intuitively, parameterization allows for an arbitrary number of identical processes. The parameterized state reachability problem for \(\textsf{TSO}(\textsf{A})\), called \(\textsf{TSO}(\textsf{A})\)-P-Reach, identifies a family of (standard) reachability problem instances. We want to determine whether reachability holds in some member of the family. We now introduce this formally.

For a given process description \(\textsf{Proc}\), we consider the program instance \(\mathcal {P}_\textsf{Proc}^n\), parameterized by a natural number n, as follows. For \(\mathcal {I}= \{1, \ldots , n\}\), let \(\mathcal {P}_\textsf{Proc}^n = \langle \textsf{Proc}_1, \ldots , \textsf{Proc}_n \rangle \) with \(\textsf{Proc}_\iota = \textsf{Proc}\) for all \(\iota \in \mathcal {I}\). That is, the \(n^\text {th}\) slice of the parameterized family of programs contains n processes, all with identical descriptions \(\textsf{Proc}\). We require that all processes maintain copies of the ADT \(\textsf{A}\).

(The decision problem \(\textsf{TSO}(\textsf{A})\)-P-Reach is stated formally in a figure.)

When talking about a certain family of ADTs, e.g. the family of Petri nets, we mean the corresponding restriction of \(\textsf{TSO}(\textsf{A})\)-P-Reach, i.e. to instances where \(\textsf{A}\) is a Petri net.

The main difference between the non-parameterized case and the parameterized case of the problem is that in the first case the index set \(\mathcal {I}\) is a priori fixed, while in the second case it can be arbitrary. This results in \(\textsf{C}_\textsf{init}^\mathcal {P}\) being a singleton in the non-parameterized case, while it becomes infinite (one initial configuration for each n-slice) in the parameterized case.

We determine upper and lower bounds for the complexity of the state reachability problem. The challenge of solving this problem varies with the ADT. The problem for plain TSO without an ADT has been studied in [4], where it was shown to be PSpace-complete. The result is based on an abstraction technique called the pivot semantics. The pivot semantics is exact in the sense that a state \(\textsf{q}\) is reachable under parameterized TSO if and only if it is reachable under the pivot semantics.

We show that the dynamics underlying the pivot abstraction can be generalized to our model with ADT. We show that the pivot abstraction can be extended to obtain a register machine. We use this construction to give a general characterization of \(\textsf{TSO}(\textsf{A})\)-P-Reach. First, we recall the pivot abstraction.

The Pivot Abstraction [4]. For a set of variables \(\textsf{Vars}\) and data domain \(\textsf{Dom}\), processes generate pending write messages from the set \(\textsf{Msgs}= \textsf{Vars}\times \textsf{Dom}\) by executing \(\texttt{wr}\) instructions. This set has size \(|\textsf{Vars}| \cdot |\textsf{Dom}|\), and hence at most that many distinct (variable, value) pairs can be produced in any run. For a run \(\mathsf {\rho }\) of the program and each message \(\textsf{msg}= \langle \mathsf {\textsf{x}, \textsf{d}}\rangle \in \textsf{Msgs}\), the pivot point \(\mathop {\textrm{pvt}}(\textsf{msg})\) is the first point along \(\mathsf {\rho }\) at which some write on variable \(\textsf{x}\) with value \(\textsf{d}\) is propagated to the memory. The pivot abstraction identifies these points, one for each distinct message in \(\textsf{Msgs}\) that occurs along \(\mathsf {\rho }\).

The core observation is that if at some point in \(\mathsf {\rho }\), a process \(\textsf{Proc}_\iota \) propagates a message \(\textsf{msg}= \langle \mathsf {\textsf{x}, \textsf{d}}\rangle \) from its buffer to the memory, then after that point, the value \(\textsf{d}\) will always be available to read on variable \(\textsf{x}\) from the shared memory. Technically, this follows from parameterization. There are arbitrarily many processes executing identical descriptions. This means transitions of the original process \(\textsf{Proc}_\iota \) can be mimicked by a clone process \(\textsf{Proc}_{\iota '}\) identical to \(\textsf{Proc}_\iota \). Hence, \(\textsf{Proc}_{\iota '}\) can replicate the execution of \(\textsf{Proc}_\iota \) right up to the point where the message \(\textsf{msg}\) is the oldest message in its buffer. Then a single propagate step updates the value of \(\textsf{x}\) in the shared memory to \(\textsf{d}\). There can be arbitrarily many such clones and the propagate step can happen at any time. It follows that beyond the \(\mathop {\textrm{pvt}}(\textsf{msg})\) point in \(\mathsf {\rho }\), the value \(\textsf{d}\) can always be read from \(\textsf{x}\).

For distinct messages from \(\textsf{Msgs}\), we can order the pivot points corresponding to these messages according to the order in which they appear in \(\mathsf {\rho }\). This gives us a first update sequence, denoted by \(\omega \). No two messages in \(\omega \) are the same; the set of such sequences is the set of differentiated words \(\textsf{Msgs}^\textsf{diff}\). A message \(\textsf{msg}\in \textsf{Msgs}\) in \(\omega \) has rank k if it is the k-th pivot point in \(\omega \).
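Extracting the first update sequence \(\omega \) from the order in which messages are propagated along a run amounts to keeping the first occurrence of each distinct message. A small sketch (representing messages as pairs is our own choice):

```python
def first_update_sequence(propagations):
    """Given the sequence of messages propagated to memory along a run
    (in order), return the first update sequence ω: each distinct
    message placed at the position of its pivot point."""
    omega, seen = [], set()
    for msg in propagations:
        if msg not in seen:       # later propagations of msg are ignored
            seen.add(msg)
            omega.append(msg)
    return omega

run = [("x", 1), ("y", 1), ("x", 1), ("x", 2)]
assert first_update_sequence(run) == [("x", 1), ("y", 1), ("x", 2)]
```

By construction \(\omega \) is a differentiated word, so its length is bounded by \(|\textsf{Vars}| \cdot |\textsf{Dom}|\).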

Providers. The pivot abstraction simulates a run \(\mathsf {\rho }\) under the TSO semantics by running abstract processes called providers in a sequential manner. For \(1 \le k \le |\omega |+1\), the k-provider simulates the process that generates the write of the rank k message \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \) corresponding to the k-pivot in \(\mathsf {\rho }\). The k-provider completes its task when it has simulated this process until the point it generates \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \). At this point, it invokes the \((k+1)\)-provider. With this background, we now develop the formal pivot semantics for parameterized \(\textsf{TSO}(\textsf{A})\).

Formal Pivot semantics for Parameterized \(\textsf{TSO}(\textsf{A})\). We define the formal operational semantics of the pivot abstraction as a labelled transition system. Given a process description \(\textsf{Proc}= \langle \textsf{Q}, \textsf{q}_\textsf{init}, \delta \rangle \) and ADT \(\textsf{A}= \langle \textsf{Vals}, \{\textsf{val}_\textsf{init}\}, \textsf{Ops}, \mathop {\xrightarrow {~~}}\nolimits _{\textsf{A}}\rangle \), a configuration of the pivot transition system represents the view of a provider when simulating a run of the program. A view \(\textsf{v}= \langle \textsf{q}, \textsf{val}, \textsf{Lw}, \omega , {\phi _E}, {\phi _L}, {\phi _P}\rangle \) is defined as follows. The process state is given by \(\textsf{q}\in \textsf{Q}\). The value of the provider’s ADT \(\textsf{A}\) is \(\textsf{val}\in \textsf{Vals}\). The function \(\textsf{Lw}: \textsf{Vars}\mathop {\xrightarrow {~~}}\textsf{Dom}\mathop \cup \{\oslash \}\) gives for each \(\textsf{x}\in \textsf{Vars}\), the value of the latest (i.e., most recent) write the provider has performed on \(\textsf{x}\). If no such instruction exists (the process has made no writes to \(\textsf{x}\)) then \(\textsf{Lw}(\textsf{x}) = \oslash \). Note that \(\textsf{Lw}\) abstracts the buffer in terms of read-own-write operations since the process can only read from the most recent pending write in its buffer on each variable (if it exists). We define \(\textsf{Lw}_\oslash \) such that \(\textsf{Lw}_\oslash (\textsf{x})=\oslash \) for all \(\textsf{x}\in \textsf{Vars}\). The first update sequence of pivot messages is \(\omega \in \textsf{Msgs}^\textsf{diff}\). It is unchanged by transitions and remains constant throughout the pivot run.

The external pointer, \({\phi _E}\in \{0,1,\ldots ,|\omega |\}\), helps the provider keep track of which messages from \(\omega \) it has observed. These messages have been propagated by other processes. The external pointer is used to identify which variables are still holding their initial values in the memory. If the provider observes an external write on a variable x (by accessing the memory), then this write has overwritten the initial value of x in the memory. The local pointer \({\phi _L}: \textsf{Vars}\mathop {\xrightarrow {~~}}\{0,1,\ldots ,|\omega |\}\) assigns one pointer to each variable \(\textsf{x}\in \textsf{Vars}\). The function \({\phi _L}(\textsf{x})\) gives the highest ranked write operation the provider itself has performed (on any variable) before it performed the latest write on \(\textsf{x}\). The local pointer is necessary to know which variables lose their initial values when we need to empty the buffer. In other words, the local pointer abstracts the buffer in terms of update operations. We define \({\phi _{L}^{\max }}:=\max \{{\phi _L}(\textsf{x})\mid \textsf{x}\in \textsf{Vars}\}\) as the highest value of a local pointer and \({\phi _{L}^{0}}\) such that \({\phi _{L}^{0}}(\textsf{x})=0\) for all variables \(\textsf{x}\in \textsf{Vars}\), i.e., the pointers are all in the leftmost position. The progress pointer \({\phi _P}\in \{1,2,\ldots ,|\omega |+1\}\) gives the rank of the process the current provider is simulating.
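The components of a view can be collected in a small Python structure; the field names mirror the tuple \(\langle \textsf{q}, \textsf{val}, \textsf{Lw}, \omega , {\phi _E}, {\phi _L}, {\phi _P}\rangle \) from the text, but the encoding (e.g. `None` for \(\oslash \)) is our own.

```python
from dataclasses import dataclass, field

@dataclass
class View:
    """A view of the pivot semantics (sketch)."""
    q: str          # process state
    val: object     # current ADT value
    Lw: dict        # latest own write per variable (None encodes ⊘)
    omega: tuple    # first update sequence (constant along a pivot run)
    phiE: int = 0   # external pointer
    phiL: dict = field(default_factory=dict)  # local pointer per variable
    phiP: int = 1   # progress pointer (rank of the simulated provider)

def initial_view(q_init, val_init, variables, omega, k):
    """v_init(ω, k): no writes yet, all pointers reset, rank-k provider."""
    return View(q_init, val_init, {x: None for x in variables},
                tuple(omega), 0, {x: 0 for x in variables}, k)
```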

Fig. 1. The transition relation of the pivot semantics for a process \(\textsf{Proc}\).

Given an update sequence \(\omega \in \textsf{Msgs}^\textsf{diff}\) and \(1 \le k \le |\omega |+1\), we define the initial view induced by \(\omega \) and k, denoted by \(\textsf{v}_\textsf{init}(\omega , k)\), as the view \(\langle \textsf{q}_\textsf{init}, \textsf{val}_\textsf{init}, \textsf{Lw}_\oslash , \omega , 0, {\phi _{L}^{0}} , k \rangle \). For a given \(\omega \), the k-provider starts with \(\textsf{v}_\textsf{init}(\omega , k)\): \(\textsf{Lw}_\oslash \) and \({\phi _{L}^{0}}\) imply that the simulated process has not performed any writes, and \({\phi _E}= 0\) means that it has not read from or updated the memory.

We define the labelled transition relation \(\mathop {\xrightarrow {~~}}\nolimits _{\textsf{pvt}}\) on the set of views by the inference rules given in Figure 1. The set of labels is \(\textsf{Instrs}\mathop \cup \textsf{Ops}\). We describe the inference rules briefly. The \(\texttt{skip}\) rule only changes the local state of the process. There are two inference rules, write(1) and write(2), that describe the execution of a write operation \(\texttt{wr}(\mathsf {\textsf{x}, \textsf{d}})\). The rule write(1) describes the situation where the rank of \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \) is strictly smaller than the progress pointer \({\phi _P}\). In this case, we update both \(\textsf{Lw}\) and \({\phi _L}\). The rule write(2) describes the situation where the rank of \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \) equals the progress pointer. This means that the provider has provided the message \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \) with rank \({\phi _P}\). Hence it has completed its mission, and it initiates the next provider by transitioning to \(\textsf{v}_\textsf{init}(\omega , {\phi _P}+ 1)\).

There are three inference rules that describe a read operation \(\texttt{rd}(\mathsf {\textsf{x}, \textsf{d}})\). The rule read(1) describes the case where the last value the provider wrote to \(\textsf{x}\) is \(\textsf{d}\), i.e., \(\textsf{Lw}(\textsf{x})= \textsf{d}\). In this case, the provider simply reads from its local buffer. The rule read(2) describes the read of an initial value. It ensures that the read is possible by checking that no write operation on \(\textsf{x}\) has been executed by the provider (\(\textsf{Lw}(\textsf{x})= \oslash \)), and by checking that the initial value of the variable has not been overwritten in the memory. This is achieved by checking that the position of \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \) in \(\omega \), i.e., \(\mathop {\textrm{pos}}(\omega )(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle )\), is strictly larger than \({\phi _E}\). The rule read(3) describes the case where the simulated process reads from the memory. It checks that the message \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \) has been generated by some previous provider (\(\mathop {\textrm{pos}}(\omega )(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle ) < {\phi _P}\)), and then it updates the external pointer to \(\max ({\phi _E}, {\phi _L}(\textsf{x}), \mathop {\textrm{pos}}(\omega )(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle ))\). The memory fence rule describes the case where the simulated process performs a fence action. The rule updates the external pointer to \(\max ({\phi _E}, {\phi _{L}^{\max }})\). Finally, the data-operation rule describes the case where the simulated process performs an ADT operation.
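To make the read rules concrete, the following Python sketch computes the effect of \(\texttt{rd}(\mathsf {\textsf{x}, \textsf{d}})\) on a view given as a plain dict, with \(\oslash \) rendered as `None`. This is our own encoding, not the formal rules; in particular, the read(2) guard is phrased here via the rank of the first message on `x`, which is our reading of the rule:

```python
def pos(omega, msg):
    """1-based rank of a message in the update sequence (0 if absent)."""
    return omega.index(msg) + 1 if msg in omega else 0

def read(view, x, d, d_init):
    """Apply rd(x, d); return the successor view, or None if no rule is enabled."""
    omega, lw = view["omega"], view["lw"]
    if lw.get(x) == d:                 # read(1): read own last write on x
        return view
    if lw.get(x) is not None:          # a pending own write forces its value
        return None
    # read(2): read the initial value; x unwritten by the provider and the
    # first write on x in omega not yet observed (our interpretation).
    rk_x = min((pos(omega, m) for m in omega if m[0] == x),
               default=len(omega) + 1)
    if d == d_init and rk_x > view["phi_e"]:
        return view
    # read(3): read a message provided by an earlier provider.
    r = pos(omega, (x, d))
    if r and r < view["phi_p"]:
        new = dict(view)
        new["phi_e"] = max(view["phi_e"], view["phi_l"][x], r)
        return new
    return None
```

Note how only read(3) advances the external pointer, matching the intuition that observing an external write commits the provider to everything ranked below it.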

The set of initial views is \(\textsf{V}_\textsf{init}= \{ \textsf{v}_\textsf{init}(\omega , 1) \mid \omega \in \textsf{Msgs}^\textsf{diff}\}\). This is the set of initial views of the 1-provider and it is finite because \(\textsf{Msgs}^\textsf{diff}\) is finite, unlike the set of initial configurations \(\textsf{C}_\textsf{init}\) in the parameterized case under TSO.

6 Register Machines

Our goal is to design a general method to determine the decidability and complexity of \(\textsf{TSO}(\textsf{A})\)-P-Reach depending on \(\textsf{A}\). We examine the pivot abstraction introduced in the previous section. A view \(\textsf{v}= \langle \textsf{q}, \textsf{val}, \textsf{Lw}, \omega , {\phi _E}, {\phi _L}, {\phi _P}\rangle \) of the pivot transition system can be partitioned into the following two components: (1) \(\textsf{q}, \textsf{Lw}, \omega , {\phi _E}, {\phi _L}, {\phi _P}\), which contains the local state and also effectively abstracts the unbounded FIFO buffers and shared memory of the TSO system, and (2) \(\textsf{val}\), which captures the value of the ADT. The first part is finite since each component takes finitely many values. We call it the book-keeping state since it keeps track of the progress of the core TSO system. However, the ADT part can be infinite, depending on the abstract data type.

We will use a register machine in order to represent the book-keeping state in a finite way using states and registers. On the other hand, we will keep the ADT component general and only later instantiate it to some interesting cases.

A register machine is a finite state automaton that has access to a finite set of registers, each holding a natural number. The register machine can execute two operations on a register: it can write a given value, or it can read a given value. A read is blocking if the given value is not in the register. We differ from most definitions of register machines in two significant ways. First, since we only require a finite domain to model \(\textsf{TSO}(\textsf{A})\) semantics, the values of the registers are bounded from above by some \(N \in \mathbb {N}\). This makes the set of register assignments finite, whereas most definitions allow for an unbounded domain. Second, our register machine is augmented with an ADT.

Given an ADT \(\textsf{A}= \langle \textsf{Vals}, \{\textsf{val}_\textsf{init}\}, \textsf{Ops}, \mathop {\xrightarrow {~~}}\nolimits _{\textsf{A}}\rangle \), let \(\textsf{Regs}\) be a finite set of registers and \(\textsf{Dom}= \{0, \dots , N\}\) their domain. We define the set of actions \(\textsf{Acts}= \{ \texttt{SKP}, \texttt{WRITE}(\mathsf {\texttt {r}, \textsf{d}}), \texttt{READ}(\mathsf {\texttt {r}, \textsf{d}})\mid \texttt {r}\in \textsf{Regs}, \textsf{d}\in \textsf{Dom}\}\). A register machine is then defined as a tuple \(\mathcal {R}(\textsf{A})= \langle \textsf{Q}, \textsf{q}_\textsf{init}, \delta \rangle \), where \(\textsf{Q}\) is a finite set of states, \(\textsf{q}_\textsf{init}\in \textsf{Q}\) is the initial state and \(\delta \subseteq \textsf{Q}\times (\textsf{Acts}\cup \textsf{Ops}) \times \textsf{Q}\) is the transition relation.

The semantics of the register machine are given in terms of a transition system. The set of configurations is \(\textsf{Q}\times \textsf{Dom}^\textsf{Regs}\times \textsf{Vals}\). A configuration consists of a state, a register assignment \(\textsf{Regs}\mathop {\xrightarrow {~~}}\textsf{Dom}\) and a value of \(\textsf{A}\). The initial configuration is \(\langle \textsf{q}_\textsf{init}, 0^\textsf{Regs}, \textsf{val}_\textsf{init}\rangle \), where all registers contain the value 0.

The transition relation \(\mathop {\xrightarrow {~~}}\) is described in the following. \(\texttt{SKP}\) only changes the local state, not the registers or the ADT value. \(\texttt{WRITE}(\mathsf {\texttt {r}, \textsf{d}})\) sets the value of the register \(\texttt {r}\) to \(\textsf{d}\). \(\texttt{READ}(\mathsf {\texttt {r}, \textsf{d}})\) is only enabled if the value of \(\texttt {r}\) is \(\textsf{d}\); it does not change the value. The operations in \(\textsf{Ops}\) work as usual; they do not change any register. We define the state reachability problem for register machines, \(\mathcal {R}(\textsf{A})\)-Reach, in the usual way. A state \(\textsf{q}_\textsf{final}\in \textsf{Q}\) is reachable if there is a run of the transition system defined by the semantics of \(\mathcal {R}(\textsf{A})\) that starts in the initial configuration and ends in a configuration with state \(\textsf{q}_\textsf{final}\).
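A direct operational reading of this transition relation, with the ADT kept abstract as a function from an operation and a value to the set of successor values, might look as follows (our own minimal encoding; the action and tuple shapes are ours):

```python
def step(cfg, trans, adt_step):
    """One transition of R(A).  cfg = (state, registers, adt value);
    trans = (q, action, q'); adt_step(op, val) yields successor ADT values.
    Returns the list of successor configurations (empty if blocked)."""
    q, regs, val = cfg
    src, act, dst = trans
    if q != src:
        return []
    kind = act[0]
    if kind == "SKP":                       # only the local state changes
        return [(dst, regs, val)]
    if kind == "WRITE":                     # set register r to d
        _, r, d = act
        return [(dst, {**regs, r: d}, val)]
    if kind == "READ":                      # blocking read: enabled iff r == d
        _, r, d = act
        return [(dst, regs, val)] if regs.get(r) == d else []
    if kind == "OP":                        # ADT operation, registers untouched
        return [(dst, regs, v2) for v2 in adt_step(act[1], val)]
    return []
```

Instantiating `adt_step` with, say, a counter's `inc`/`dec` yields a concrete machine; state reachability is then a plain search over this successor function.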

6.1 Simulating Pivot Abstraction by Register Machines

In this section we will show how to simulate the pivot abstraction by a register machine. The idea is to save the book-keeping state (except for the local state) in the registers. Given a process description \(\textsf{Proc}= \langle \textsf{Q}^\textsf{Proc}, \textsf{q}_\textsf{init}^\textsf{Proc}, \delta ^\textsf{Proc}\rangle \) for an ADT \(\textsf{A}\), we construct a register machine \(\mathcal {R}(\textsf{A})= \langle \textsf{Q}, \textsf{q}_\textsf{init},\delta \rangle \) that simulates the pivot semantics as follows. The set of registers is

$$\begin{aligned} { \textsf{Regs}:= \{ \textsf{Lw}(\textsf{x}), \textsf{rk}_\textsf{Vars}(\textsf{x}), \textsf{rk}_\textsf{Msgs}(\textsf{msg}), {\phi _E}, {\phi _L}(\textsf{x}), {\phi _{L}^{\max }}, {\phi _P}, \textsf{rk}_\textsf{nxt} \mid \textsf{x}\in \textsf{Vars}, \textsf{msg}\in \textsf{Msgs}\}~. } \end{aligned}$$

The registers \(\textsf{rk}_\textsf{Vars}(\textsf{x})\) and \(\textsf{rk}_\textsf{Msgs}(\textsf{msg})\) hold the rank of each variable and message, respectively. This implicitly gives rise to an update sequence. The auxiliary register \(\textsf{rk}_\textsf{nxt}\) is used to initialize the other rank registers, as will be explained later on. The remaining registers correspond to their respective counterparts in the pivot abstraction. Note that the number of registers is linear in the number of messages \(|\textsf{Msgs}|\). The domain of the registers is defined to be \(\textsf{Dom}= \{ 0, \dots , |\textsf{Msgs}|+ 1 \}\). Since the TSO memory domain is finite, we can assume w.l.o.g. that the memory values are positive integers. If \(\textsf{Lw}(\textsf{x})= 0\), it means that there has been no write on \(\textsf{x}\) and it still holds the initial value. The set of states \(\textsf{Q}\) contains \(\textsf{Q}^\textsf{Proc}\cup \{\textsf{q}_\textsf{init}^\mathcal {R}(\textsf{A}), \textsf{q}_\textsf{init}^\textsf{ptr}\}\) as well as a number of (unnamed) auxiliary states that will be used in the following.

To simplify our construction, we will use additional operations on registers, instead of just \(\texttt{WRITE}\) and \(\texttt{READ}\). We introduce different blocking comparisons between registers and values such as \(==,<,\le ,\ne \), register assignments such as \(\texttt {r}:=\texttt {r}'\), and increments by one denoted as \(\texttt {r}\texttt {++}\). A more detailed description of these instructions is given in [3].
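As an illustration of how such derived operations reduce to the primitives, an increment \(\texttt {r}\texttt {++}\) over the bounded domain \(\{0,\ldots ,N\}\) can be expanded into pairs of blocking \(\texttt{READ}\) and \(\texttt{WRITE}\) actions. The sketch below is our own illustration of this standard idea, not the construction of [3]:

```python
def expand_increment(r, N, src, dst, fresh):
    """Compile r++ into primitive transitions: for each value d < N, guess d
    with a blocking READ(r, d), then WRITE(r, d+1) via a fresh intermediate
    state supplied by 'fresh'.  If r == N, every READ blocks, so the
    increment is disabled, as intended for the bounded domain."""
    transitions = []
    for d in range(N):
        mid = fresh()
        transitions.append((src, ("READ", r, d), mid))
        transitions.append((mid, ("WRITE", r, d + 1), dst))
    return transitions
```

The expansion adds one intermediate state and two transitions per domain value, so the overhead is polynomial in N, in line with the construction staying polynomial overall.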

The Initializer. The pivot semantics define an exponential number of initial states: one per possible update sequence. The register machine instead guesses an update sequence at the start of the execution and stores it in the rank registers. This part of the register machine is the rank initializer (shown in Figure 2 (a)). It uses the auxiliary register \(\textsf{rk}_\textsf{nxt}\) to keep track of the next rank that is to be assigned. In a nondeterministic manner, the rank initializer chooses a so far unranked message and then it assigns the next rank to this message. If the variable of the message has no rank assigned yet, it updates the rank of the variable. Then it increases the \(\textsf{rk}_\textsf{nxt}\) register and continues. After each rank assignment, the initializer can choose to stop the rank assignment. In that case, it initializes the register \({\phi _P}\) to 1 and finishes in the initial state of \(\textsf{Proc}\).
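Flattened into ordinary code, each terminating run of the rank initializer corresponds to one outcome of the following enumeration (our own sketch; not-yet-ranked registers are rendered as 0, and a variable takes the rank of its first ranked message):

```python
from itertools import permutations

def rank_assignments(msgs):
    """Enumerate the possible outcomes of the rank initializer: for every
    ordering of the messages and every stopping point after a rank
    assignment, yield the resulting (rk_Msgs, rk_Vars) register contents."""
    for perm in permutations(msgs):
        for cut in range(1, len(perm) + 1):   # may stop after any assignment
            rk_msgs = {m: 0 for m in msgs}    # 0 = not yet ranked
            rk_vars = {}
            for rank, (x, d) in enumerate(perm[:cut], start=1):
                rk_msgs[(x, d)] = rank
                rk_vars.setdefault(x, rank)   # first message on x ranks x
            yield rk_msgs, rk_vars
```

The point of the construction is exactly that this exponential set of outcomes need not be enumerated up front: the machine guesses one outcome with polynomially many states and transitions.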

In addition to the rank initializer, we have the pointer initializer. It is responsible for resetting all pointers except the progress pointer to zero. The progress pointer is incremented by one instead. This initializer is not executed at the beginning of the simulation, but between epochs of the pivot abstraction.

The simulator. The main part of this construction handles the simulation of the pivot abstraction. It contains \(\textsf{Q}^\textsf{Proc}\) as well as several auxiliary states that are described in the following. It simulates each instruction of \(\textsf{TSO}(\textsf{A})\). The skip instruction and the data instructions are carried out unchanged. A visualization of the remaining instructions is depicted in Figure 2. In case of a write instruction \(\texttt{wr}(\mathsf {\textsf{x}, \textsf{d}})\), we first compare the rank of the write message with the progress pointer. If they are equal, the epoch is finished and the next process should start; therefore we jump to the first state of the pointer initializer. Otherwise, we set the last write pointer \(\textsf{Lw}(\textsf{x})\) to \(\textsf{d}\). Next, we ensure that \({\phi _{L}^{\max }}\) is at least as large as the rank of \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \), and finally we update the local pointer \({\phi _L}(\textsf{x})\) to be equal to \({\phi _{L}^{\max }}\). For the memory fence instruction, it only needs to be ensured that the external pointer is at least as large as the maximum local pointer \({\phi _{L}^{\max }}\). For a read instruction \(\texttt{rd}(\mathsf {\textsf{x}, \textsf{d}})\), if the last write to \(\textsf{x}\) was of value \(\textsf{d}\), we can execute the read directly. Otherwise, after checking that the message has been provided by a previous provider, we ensure that the external pointer is at least as large as both the rank of \(\langle \mathsf {\textsf{x}, \textsf{d}}\rangle \) and the local pointer of \(\textsf{x}\). For the special case \(\textsf{d}= \textsf{d}_\textsf{init}\), there is an additional way in which the read can be performed: we can read \(\textsf{d}_\textsf{init}\) from the memory if the process has neither already written to \(\textsf{x}\) nor observed a write with rank greater than or equal to the rank of \(\textsf{x}\).
This gives us the following theorem, proven in Appendix C of the full version [3]:

Theorem 1

\(\textsf{TSO}(\textsf{A})\)-P-Reach is polynomial time reducible to \(\mathcal {R}(\textsf{A})\)-Reach.

Fig. 2. The rank initializer and the simulator for some instructions \(\textsf{instr}\).

6.2 Simulating Register Machines by TSO

We will now show how to simulate an ADT register machine with a parameterized program running under \(\textsf{TSO}(\textsf{A})\). The main idea is to keep the information about the registers in the last pending write operations, while making sure that not a single write operation actually hits the memory. Thus, the simulator always reads the initial values or its own writes, never the writes of other processes.

The TSO program has a variable for each register, and two additional variables \(\textsf{x}_s\) and \(\textsf{x}_c\) that act as flags: \(\textsf{x}_s\) indicates that the verifier should start working, while \(\textsf{x}_c\) indicates that the verifier has successfully completed the verification. At the beginning of the execution, each process nondeterministically chooses to be either simulator, scheduler, or verifier. Each role will be described in the following. The complete construction is shown in Appendix C of [3].

The simulator uses the same states and transitions as \(\mathcal {R}(\textsf{A})\), but instead of reading from and writing to registers, it uses the memory. If the simulator reaches the target state \(\textsf{q}_\textsf{target}\), it first checks the \(\textsf{x}_s\) flag. If it is already set, the simulator stops, never reaching the final state \(\textsf{q}_\textsf{final}\). Otherwise, it waits until it observes the flag \(\textsf{x}_c\) to be set. It then enters the final state. The scheduler’s only responsibility is to signal the start of the verification process. It does so by setting the flag \(\textsf{x}_s\) at a nondeterministically chosen time during the execution of the program. The verifier waits until it observes the flag \(\textsf{x}_s\). It then starts the verification process, which consists of checking each variable that corresponds to a register. If all of them still contain their initial value, the verification was successful. The verifier signals this to the simulator process by setting the \(\textsf{x}_c\) flag.
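The verifier’s role, for instance, boils down to the following check (our own schematic, with the shared memory as a plain dict and the store buffers elided):

```python
def verifier(mem, reg_vars, d_init=0):
    """One verification attempt over the shared memory: runs only after the
    scheduler has set x_s, succeeds only if every register variable still
    holds its initial value, and then raises the completion flag x_c."""
    if mem.get("x_s") != 1:
        return False              # scheduler has not signalled the start yet
    if any(mem.get(x, d_init) != d_init for x in reg_vars):
        return False              # some register write hit the memory: abort
    mem["x_c"] = 1                # signal successful verification
    return True
```

The check is deliberately one-shot: if any register variable has lost its initial value, the simulation was not buffer-local and the run is discarded.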

Any execution ending in \(\textsf{q}_\textsf{final}\) must first perform a simulation of \(\mathcal {R}(\textsf{A})\) ending in \(\textsf{q}_\textsf{target}\); then a scheduler propagates the setting of the flag \(\textsf{x}_s\), and afterwards a verifier executes. This ensures that the initial values are read by the verifier after the register machine has been simulated, and thus that the shared memory is unchanged. This means the simulator only accessed its own write buffer and never the writes of other processes. It follows that \(\textsf{q}_\textsf{target}\) is reachable by \(\mathcal {R}(\textsf{A})\) if and only if \(\textsf{q}_\textsf{final}\) is reachable by \(\textsf{Proc}\) under \(\textsf{TSO}(\textsf{A})\). This gives us the following result:

Theorem 2

\(\mathcal {R}(\textsf{A})\)-Reach is polynomial time reducible to \(\textsf{TSO}(\textsf{A})\)-P-Reach.

Theorem 1 and Theorem 2 give us a method for determining upper and lower bounds on the complexity of \(\textsf{TSO}(\textsf{A})\)-P-Reach for different instantiations of the ADT. Since we have reductions in both directions, we can conclude that \(\textsf{TSO}(\textsf{A})\)-P-Reach is decidable if and only if \(\mathcal {R}(\textsf{A})\)-Reach is decidable. We know that \(\textsf{TSO}(\textsc {NoAdt})\)-P-Reach is PSpace-hard, where \(\textsc {NoAdt}\) is the trivial ADT that models plain TSO semantics [4]. We can immediately derive a lower bound for any ADT \(\textsf{A}\): \(\textsf{TSO}(\textsf{A})\)-P-Reach is PSpace-hard.

7 Instantiations of ADTs

In the following, we instantiate our framework to a number of ADTs in order to show its applicability.

Theorem 3

\(\textsf{TSO}(\textsc {Ct})\)-P-Reach and \(\textsf{TSO}(\textsc {wCt})\)-P-Reach are PSpace-complete.

We know \(\textsf{TSO}(\textsf{A})\)-P-Reach is PSpace-hard for any ADT \(\textsf{A}\), including \(\textsc {Ct}\) and \(\textsc {wCt}\). Regarding the upper bound for \(\textsc {Ct}\), we can show that \(\mathcal {R}(\textsc {Ct})\)-Reach can be polynomially reduced to \(\mathcal {R}(\textsc {NoAdt})\)-Reach. The idea is to show that there is a bound on the counter values that suffices to find a witness for \(\mathcal {R}(\textsc {Ct})\)-Reach. This bound is polynomial in the number of possible states and register assignments (i.e., at most exponential in the size of \(\mathcal {R}(\textsc {Ct})\)). Assume a run contains a configuration \(\textsf{c}\) with a counter value that exceeds the bound; then some state and register assignment is repeated in the run with different counter values. We can use this to shorten the run such that the counter value in \(\textsf{c}\) is reduced.

We can encode the counter value (up to this bound) in binary into registers acting as bits. The number of additional registers is polynomial in the size of \(\mathcal {R}(\textsc {Ct})\). In order to simulate an \(\texttt{inc}\) operation on this binary encoding using \(\texttt{WRITE}\) and \(\texttt{READ}\), we only have to go through the bits, starting at the least significant bit, and flip them until one is flipped from 0 to 1. The \(\texttt{dec}\) operation works analogously. This only requires a polynomial overhead in states and transitions.
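The bit-flipping walk can be made concrete as follows. The sketch is our own, with the bit registers rendered as a Python list, least significant bit first; in the register machine, each flip would be a READ/WRITE pair on the corresponding bit register:

```python
def inc(bits):
    """Increment a counter stored as bit registers (LSB first): flip
    trailing 1s to 0 until a 0 is flipped to 1.  Returns False if the
    counter is at its bound, mirroring a blocked transition."""
    if all(bits):
        return False
    for i, b in enumerate(bits):
        if b == 0:
            bits[i] = 1      # flipped 0 -> 1: increment complete
            return True
        bits[i] = 0          # flip trailing 1 -> 0, carry continues

def dec(bits):
    """Decrement analogously: flip trailing 0s to 1 until a 1 becomes 0.
    Returns False if the counter is already zero (decrement blocks)."""
    if not any(bits):
        return False
    for i, b in enumerate(bits):
        if b == 1:
            bits[i] = 0
            return True
        bits[i] = 1
```

Each operation touches at most one pass over the bits, so the state and transition overhead per operation is linear in the number of bit registers.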

We know that \(\mathcal {R}(\textsc {NoAdt})\)-Reach is in PSpace [4]. It follows from the polynomial reduction that \(\mathcal {R}(\textsc {Ct})\)-Reach is in PSpace. Applying Theorem 1 gives us that \(\textsf{TSO}(\textsc {Ct})\)-P-Reach is in PSpace. Since any \(\textsc {wCt}\) is a \(\textsc {Ct}\), it follows that \(\textsf{TSO}(\textsc {wCt})\)-P-Reach is in PSpace as well. The full proof is in [3].

Theorem 4

\(\textsf{TSO}(\textsc {St})\)-P-Reach is ExpTime-complete.

For membership, we encode the registers of \(\mathcal {R}(\textsc {St})\) in the states, which yields a finite state machine with access to a stack, i.e., a pushdown automaton. The construction has an exponential number of states. From [45], we have that checking the emptiness of the context-free language generated by a pushdown automaton is polynomial in the size of the automaton. Combined, we get that state reachability of the constructed pushdown automaton, and thus \(\mathcal {R}(\textsc {St})\)-Reach, is in ExpTime. It follows that \(\textsf{TSO}(\textsc {St})\)-P-Reach is in ExpTime (thanks to Theorem 1).

To prove the lower bound, we can reduce the problem of checking the emptiness of the intersection of a pushdown automaton with n finite-state automata [35], which is well known to be ExpTime-complete, to \(\mathcal {R}(\textsc {St})\)-Reach. The idea is to use the stack to simulate the pushdown automaton and n registers to keep track of the states of the finite-state automata. We apply Theorem 2 and get that \(\textsf{TSO}(\textsc {St})\)-P-Reach is ExpTime-hard. The formal proof is in [3].

Theorem 5

\(\textsf{TSO}(\textsc {Petri})\)-P-Reach is ExpSpace-complete.

Proof

Petri net coverability is known to be ExpSpace-complete [26]. We show hardness by reducing coverability of a marking m to \(\mathcal {R}(\textsc {Petri})\)-Reach. The idea is to construct a register machine with a Petri net as ADT. This register machine has two states \(\textsf{q}_\textsf{init}\) and \(\textsf{q}_\textsf{final}\). For every transition t of the original Petri net, we have \(\textsf{q}_\textsf{init}\mathop {\xrightarrow {~t~}}\nolimits _{} \textsf{q}_\textsf{init}\) as a transition of the register machine (we simply simulate the original Petri net). Furthermore, we have \(\textsf{q}_\textsf{init}\mathop {\xrightarrow {~t_{-m}~}}\nolimits _{} \textsf{q}_\textsf{final}\) as a transition of the register machine, where \(t_{-m}\) consumes the marking m. Thus, the state \(\textsf{q}_\textsf{final}\) can be reached iff m can be covered. Theorem 2 then lifts the hardness to \(\textsf{TSO}(\textsc {Petri})\)-P-Reach.

For the upper bound, we reduce reachability of \(\mathcal {R}(\textsc {Petri})\) to Petri net coverability. We construct the Petri net by taking the ADT \(\textsc {Petri}\) and adding a place \(p_\textsf{q}\) for every state \(\textsf{q}\) and a place \(p_{\textsf{reg},d}\) for every register \(\textsf{reg}\in \textsf{Regs}\) and register value \(d\in \textsf{Dom}\). The idea is that a marking with a token in \(p_\textsf{q}\) and one in \(p_{\textsf{reg},d}\), but none in \(p_{\textsf{reg},d'}\) for \(d' \ne d\), corresponds to a configuration of \(\mathcal {R}(\textsc {Petri})\) with state \(\textsf{q}\) and \(\textsf{reg}= d\). The value of \(\textsc {Petri}\) is given by the remainder of the marking.

We simulate any \(\textsf{q}\mathop {\xrightarrow {~\textsf{instr}~}}\nolimits _{} \textsf{q}'\) with a transition t that takes one token from \(p_\textsf{q}\) and puts one in \(p_{\textsf{q}'}\). If \(\textsf{instr}\in \textsf{Ops}\), then \(\textsf{instr}\) is a Petri net transition, and we simply add the same input and output arcs to t. To simulate a write \(\texttt{WRITE}(\mathsf {\textsf{reg}, d})\), we add a new transition \(t_{d'}\) for every \(d'\in \textsf{Dom}\) with an arc to \(p_{\textsf{reg},d}\) and an arc from \(p_{\textsf{reg},d'}\). The initial marking is consistent with \(\textsf{val}_\textsf{init}^\textsc {Petri}\) and has one token in \(p_{\textsf{q}_\textsf{init}}\). A state \(\textsf{q}\) is reachable if and only if a marking with one token in \(p_{\textsf{q}}\) is coverable. Together with Theorem 1, this yields the ExpSpace upper bound.
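The simulation of a write can be sketched as follows (our own encoding; a Petri net transition is rendered as a pair of input and output token multisets over string-named places):

```python
def encode_write(q, q2, reg, d, dom):
    """Petri-net transitions simulating q --WRITE(reg,d)--> q2: one
    transition per possible current register value d', moving both the
    control token (p_q -> p_q2) and the register token (p_reg,d' -> p_reg,d)."""
    trans = []
    for d_old in dom:
        inputs  = {f"p_{q}": 1, f"p_{reg},{d_old}": 1}
        outputs = {f"p_{q2}": 1, f"p_{reg},{d}": 1}
        trans.append((inputs, outputs))
    return trans
```

Because each register always carries exactly one token among its value places, the guessed d' is forced to be the register's actual value, so the encoding is faithful.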

Higher Order ADTs. Let the \(\mathcal {M}(\textsf{A})\)-Reach problem be the restriction of \(\mathcal {R}(\textsf{A})\)-Reach with no registers. The \(\mathcal {M}(\textsf{A})\)-Reach problem has been studied for many ADTs, such as higher order counter and higher order stack variations [25, 34].

Theorem 6

  • \(\textsf{TSO}(\textsc {HOSt}_n)\)-P-Reach is \({(n-1)} \text {-} \textsc {ExpTime}\)-hard and in \(n\text {-}\textsc {ExpTime}\), where \(\textsc {HOSt}_n\) denotes the order-n stack ADT.

  • \(\textsf{TSO}(\textsc {HOCt}_n)\)-P-Reach is \({(n-2)} \text {-} \textsc {ExpTime}\)-hard and in \({(n-1)} \text {-} \textsc {ExpTime}\), where \(\textsc {HOCt}_n\) denotes the order-n counter ADT.

  • \(\textsf{TSO}(\textsc {HOwCt}_n)\)-P-Reach is \({(n-2)} \text {-} \textsc {ExpSpace}\)-hard and in \({(n-1)} \text {-} \textsc {ExpSpace}\), where \(\textsc {HOwCt}_n\) denotes the order-n weak counter ADT.

Proof

\(\mathcal {M}(\textsc {HOSt}_n)\)-Reach has been shown to be \({(n-1)} \text {-} \textsc {ExpTime}\)-complete [25]. We know \(\mathcal {M}(\textsc {HOCt}_n)\)-Reach is \({(n-2)} \text {-} \textsc {ExpTime}\)-complete and \(\mathcal {M}(\textsc {HOwCt}_n)\)-Reach is \({(n-2)} \text {-} \textsc {ExpSpace}\)-complete [34]. Since the reduction from \(\mathcal {M}(\textsf{A})\)-Reach to \(\mathcal {R}(\textsf{A})\)-Reach is trivial, any hardness result can be applied to \(\textsf{TSO}(\textsf{A})\)-P-Reach immediately using Theorem 2. In order to reduce \(\mathcal {R}(\textsf{A})\)-Reach to \(\mathcal {M}(\textsf{A})\)-Reach, we encode the register assignments into the states, which results in an exponential blow-up in the number of states. Then we apply Theorem 1 to obtain our upper bounds.

Theorem 7

\(\textsf{TSO}(\textsc {Ompa})\)-P-Reach is \({2} \text {-} \textsc {ETime}\)-complete, where \(\textsc {Ompa}\) denotes the ordered multi-pushdown ADT.

Proof

We know that \(\mathcal {M}(\textsc {Ompa})\)-Reach is \({2} \text {-} \textsc {ETime}\)-complete [12], and we can apply Theorem 2 to get \({2} \text {-} \textsc {ETime}\)-hardness. According to Theorem 4.6 in [11], \(\mathcal {M}(\textsc {Ompa})\)-Reach is in \(\mathcal {O}(|\mathcal {M}(\textsc {Ompa})|^{2^{dn}})\) for some constant \(d\in \mathbb {N}\). We apply the exponential size reduction to \(\mathcal {R}(\textsc {Ompa})\)-Reach and Theorem 1 and get that \(\textsf{TSO}(\textsc {Ompa})\)-P-Reach is in \(\mathcal {O}(({2^{ |\mathcal {P}| }})^{2^{dn}} ) =\mathcal {O}(2^{|\mathcal {P}| \cdot 2^{dn}} )\), and thus also in \(\mathcal {O}(2^{2^{|\mathcal {P}|} \cdot 2^{dn}} )= \mathcal {O}(2^{2^{{|\mathcal {P}|} +dn}} )\). Thus, \(\textsf{TSO}(\textsc {Ompa})\)-P-Reach is in \({2} \text {-} \textsc {ETime}\).

We study well-structured ADTs [9, 29], as defined in [3]:

Theorem 8

If ADT \(\textsf{A}\) is well structured, then \(\textsf{TSO}(\textsf{A})\)-P-Reach is decidable.

A register machine for a well-structured ADT \(\textsf{A}\) is equivalent to the composition of a well-structured transition system (WSTS) modeling \(\textsf{A}\) and a finite transition system (and thus a WSTS) that models the states and registers. According to [9], the composition is again a WSTS, and reachability is decidable. The above theorem is then an immediate corollary of Theorem 1.

8 Conclusions and Future Work

In this paper, we have taken the first step toward studying the complexity of parameterized verification under weak memory models when the processes manipulate unbounded data domains. Concretely, we have presented complexity results for parameterized concurrent programs running under the classical TSO memory model when the processes operate on an abstract data type. We reduced the problem to reachability for register machines enriched with the given abstract data type.

State reachability for finite automata with an ADT has been extensively studied for many ADTs [25, 34]. We have shown in Theorem 6 that we can apply our framework to existing complexity results for this problem. This provides us with decidability and complexity results for the corresponding instances of \(\textsf{TSO}(\textsf{A})\)-P-Reach. However, due to the exponential number of register assignments, the upper bounds are exponentially larger than the lower bounds. We aim to study these cases further and determine more refined parametric bounds.

A direction for future work is considering other memory models, such as the partial store ordering semantics, the release-acquire semantics, and the ARM semantics. It is also interesting to re-consider the problem under the assumption of having distinguished processes (so-called leader processes). Adding leaders is known to make the parameterized verification problem harder. The complexity/decidability of parameterized verification under TSO with a single leader is open, even when the processes are finite-state.