Advertisement

Acta Informatica

, Volume 55, Issue 8, pp 649–668 | Cite as

TSO-to-TSO linearizability is undecidable

  • Chao Wang
  • Yi Lv
  • Peng Wu
Original Article
  • 418 Downloads

Abstract

TSO-to-TSO linearizability is a variant of linearizability for concurrent libraries on the total store order (TSO) memory model. It is proved in this paper that TSO-to-TSO linearizability for a bounded number of processes is undecidable. We first show that the trace inclusion problem of a classic-lossy single-channel system, which is known undecidable, can be reduced to the history inclusion problem of specific libraries on the TSO memory model. Based on the equivalence between history inclusion and extended history inclusion for these libraries, we then prove that the extended history inclusion problem of libraries is undecidable on the TSO memory model. By means of extended history inclusion as an equivalent characterization of TSO-to-TSO linearizability, we finally prove that TSO-to-TSO linearizability is undecidable for a bounded number of processes. Additionally, we prove that all variants of history inclusion problems are undecidable on TSO for a bounded number of processes.

1 Introduction

Libraries of high performance concurrent data structures have been widely used in concurrent programs to take advantage of multi-core architectures, such as java.util.concurrent for Java and std::thread for C\(++\)11. It is important but notoriously difficult to ensure that concurrent libraries are designed and implemented correctly. Linearizability [13] is accepted as a de facto correctness condition for a concurrent library with respect to its sequential specification on the sequential consistency (SC) memory model [14]. Intuitively, linearizability provides the vision that every individual operation appears to take place instantaneously at some point between its invocation and return. It is well known that on the SC memory model linearizability of a concurrent library is decidable for a bounded number of processes [1], but undecidable for an unbounded number of processes [6].

However, modern multiprocessors (e.g., 86 \(\times \) [17], POWER [18]) and programming languages (e.g., C/C\(++\) [5], Java [16]) do not comply with the SC memory model. As a matter of fact, they provide relaxed memory models, which allow subtle behaviors due to hardware or compiler optimization. For instance, in a multiprocessor system implementing the total store order (TSO) memory model [17], each processor is equipped with a FIFO store buffer. Any write action performed by a processor is put into its local store buffer first and can then be flushed into the main memory at any time.

The notion of linearizability has been extended for relaxed memory models, e.g., TSO-to-TSO linearizability [8] and TSO-to-SC linearizability [12] for the TSO memory model and two variants of linearizability [3] for the C\(++\) memory model. These notions generalize the original one by relating concurrent libraries with their abstract implementations, in the way as shown in [11] for the SC memory model. It is worth mentioning that these notions of linearizability satisfy the abstraction theorem [3, 8, 12]: if a library is linearizable with respect to its abstract implementation, every observable behavior of any client program using the former can be observed when the program uses the latter instead. Concurrent software developer can benefit from this correspondence in that the library can be safely replaced with its abstract implementation for the sake of optimization or the ease of verification of the client program.

The decision problems for linearizability on relaxed memory models become more complicated. Because of the hierarchy of memory models, it is rather trivial to see that linearizability on relaxed memory models is undecidable for an unbounded number of processes, based on the known undecidability result on the SC memory model [6]. But the decision problem of linearizability on relaxed memory models remains open for a bounded number of processes.

In this paper we mainly study the decision problem for the TSO-to-TSO linearizability of concurrent libraries within a bounded number of processes. TSO-to-TSO linearizability is the first definition of linearizability on relaxed memory models. It relates a library running on the TSO memory model to its abstract implementation running also on the TSO memory model. Histories of method invocations/responses are typically concerned by the standard notion of linearizability. For TSO-to-TSO linearizability, such histories have to be extended to reflect the interactions between concurrent libraries and processor-local store buffers.

The main result of this paper is that TSO-to-TSO linearizability is undecidable for a bounded number of processes. We first show that the extended history inclusion is an equivalent characterization of TSO-to-TSO linearizability. Then, we prove our undecidability result by reducing the trace inclusion problem between any two configurations of a classic-lossy single-channel system to the extended history inclusion problem between two specific libraries. Recall that the trace inclusion problem between configurations of a classic-lossy single-channel system is undecidable [19]. The reduction is achieved by using as a bridge the history inclusion between these two specific libraries.

Technically, we present a library template that can be instantiated as a specific library for a configuration of a classic-lossy single-channel system. The library is designed with three methods \(M_i\) for \(1\le i\le 3\). We use two processes \(P_1\) and \(P_2\), calling methods \(M_1\) and \(M_2\), respectively, to simulate the traces of the classic-lossy single-channel system starting from the given configuration. This is based on the observation that on the TSO memory model, a process may miss updates by other processes because multiple flush actions may occur between consecutive read actions of the process [2]. But a channel system accesses the content of a channel always in a FIFO manner; while on the contrary, a process on the TSO memory model always reads the latest updates in its local store buffer (whenever possible). Herein, processes \(P_1\) and \(P_2\) alternatively update their own store buffers, but read only from each other’s store buffer. In this way, the labeled transitions of the classic-lossy single-channel system can be reproduced through the interactions between processes \(P_1\) and \(P_2\). Furthermore, we use the third process \(P_3\), calling method \(M_3\) repeatedly, to return each fired transition label repeatedly, so that the traces of the classic-lossy single-channel system starting from a given configuration can be mimicked by the histories of the library exactly. Specially, methods \(M_1\) and \(M_2\) never return, while method \(M_3\) just uses an atomic write action to return labels in order not to touch process \(P_3\)’s store buffer. Consequently, we can easily establish the equivalence between the history inclusion and the extended history inclusion between the specific libraries.

By constructing two specific libraries based on the above library template, we show that the trace inclusion problem between any two configurations of a classic-lossy single-channel system can be reduced to the history inclusion problem between the corresponding two concurrent libraries, while the history inclusion relation and the extended history inclusion relation are equivalent between these two libraries. Then, the undecidability result of TSO-to-TSO linearizability for a bounded number of processes follows from its equivalent characterization and the undecidability result of classic-lossy single-channel system. To our best knowledge, this is the first result on the decidability of linearizability of concurrent libraries on relaxed memory models.

Apart from histories and extended histories, there are other forms of sequences that are used to represented behaviors of libraries. For example, in [9, 10] the behavior of concurrent libraries on TSO are essentially recorded by sequences of call and flush return actions. Based on this variant of history, they propose TSO-linearizability, a variant of linearizability without abstraction theorem, as correctness condition. To deal with various possible forms of histories, we also consider variants of histories. As by-product of our work, we prove that all variants of history inclusion problems, including history inclusion problem and extended history inclusion problem, are undecidable on TSO for a bounded number of processes. Some variants of history inclusion problems can be similarly proved as history inclusion problem and extended history inclusion problem. To deal with other variants of history inclusion problems, we slightly modify the specific libraries. Then the traces of the classic-lossy single-channel system can be mimicked by sequences of call actions.

Related work Efforts have been devoted on the decidability and model checking of linearizability on the SC memory model [1, 6, 7, 15, 20]. The principle of our equivalent characterization for TSO-to-TSO linearizability is similar to that of the characterization given by Bouajjani et al. in [7], where history inclusion is proved to be an equivalent characterization of linearizability. Alur et al. proved that for a bounded number of processes, checking whether a regular set of histories is linearizable with respect to its regular sequential specification can be reduced to a history inclusion problem, and hence is decidable [1]. Bouajjani et al. proved that the problem of whether a library is linearizable with respect to its regular sequential specification for a unbounded number of processes is undecidable, by a reduction from the reachability problem of a counter machine (which is known to be undecidable) [6].

On the other hand, the decidability of linearizability on relaxed memory models is still open for a bounded number of processes. The closest work to ours is [2] by Atig et al., where a lossy channel system is simulated by a concurrent program on the TSO memory model. Our approach of using methods \(M_1\) and \(M_2\) to simulate a classic-lossy single-channel system is inspired by their work. However, in [2], it was the decidable reachability problem of the channel system that was reduced to the reachability problem of the concurrent program on the TSO memory model. Hence, only the start and end configurations of the channel system are needed in their reduction. In this paper, we reduce the trace inclusion problem between any two configurations of a classic-lossy single-channel system, which is undecidable, to the TSO-to-TSO linearizability problem. Our reduction needs to show exactly each step of transitions in the channel system.

Paper outline We give the definitions of libraries, concurrent systems and its operational semantics in Sect. 2. We introduce the definition of TSO-to-TSO linearizability and variants of histories, and prove that extended history inclusion is an equivalent characterization of TSO-to-TSO linearizability in Sect. 3. In Sect. 4, we present how to generate specific libraries to mimic behaviors of classic-lossy single-channel systems. We prove in Sect. 5 that TSO-to-TSO linearizability and all variants of history inclusion problems are undecidable on TSO for a bounded number of processes. We conclude in Sect. 6.

Differences from the conference paper

This article is an extended version of our ATVA’15 conference paper [22], containing all the proofs of the lemmas and theorems mentioned in the paper. Since the conference, we have also extended our result from undecidability of history inclusion problem and extended history inclusion problem into all variants of history inclusion problems.

2 TSO concurrent systems

In this section, we first present the notations of libraries, the most general clients and TSO concurrent systems. Then, we introduce their operational semantics on the TSO memory model.

2.1 Notations

In general, a finite sequence on an alphabet \(\Sigma \) is denoted \(l=\alpha _1 \cdot \alpha _2 \cdot \ldots \cdot \alpha _k\), where \(\cdot \) is the concatenation symbol and \(\alpha _i\in \Sigma \) for each \(1\le i\le k\). Let |l| denote the length of l, i.e., \(|l|=k\), and l(i) denote the ith element of l for \(1 \le i \le k\), i.e., \(l(i)=\alpha _i\). For an alphabet \(\Sigma '\), let \(l \uparrow _{\Sigma '}\) denote the projection of l to \(\Sigma '\). Given a function f, let f[x : y] be the function that shares the same value as f everywhere, except for x, where it has the value y. We use \(\_\) for an item, of which the value is irrelevant.

A labelled transition system (LTS) is a tuple \(\mathcal {A}=(Q,\Sigma ,\rightarrow ,q_0)\), where Q is a set of states, \(\Sigma \) is a set of transition labels, \(\rightarrow \subseteq Q\times \Sigma \times Q\) is a transition relation and \(q_0\) is the initial state. A state of the LTS \(\mathcal {A}\) may be referred to as a configuration in the rest of the paper.

A path of \(\mathcal {A}\) is a finite transition sequence \(q_1\xrightarrow {\beta _1}q_2\overset{\beta _2}{\longrightarrow }\cdots \overset{\beta _k}{\longrightarrow }q_{k+1}\) for \(k\ge 0\). A trace of \(\mathcal {A}\) is a finite sequence \(t= \beta _1 \cdot \beta _2 \cdot \ldots \cdot \beta _k\), where \(k \ge 0\) if there exists a path \(q_1\overset{\beta _1}{\longrightarrow }q_2\overset{\beta _2}{\longrightarrow }\cdots \overset{\beta _k}{\longrightarrow }q_{k+1}\) of \(\mathcal {A}\). Let \(\textit{path}(\mathcal {A},q)\) and \(\textit{trace}(\mathcal {A},q)\) denote all the paths and traces of \(\mathcal {A}\) that start from q, respectively. We write \(\textit{path}(\mathcal {A})\) and \(\textit{trace}(\mathcal {A})\) for short if \(q=q_0\).

2.2 Libraries and the most general clients

A library implementing a concurrent data structure provides a set of methods for external users to access the data structure. It may contain private memory locations for its own use. A client program is a program that interacts with libraries. For simplicity, we assume that each method has just one parameter and one return value if it returns. Furthermore, all the parameters and the return values are passed via a special register \(r_f\).

For a library, let \(\mathcal {X}\) be a finite set of its memory locations, \(\mathcal {M}\) be a finite set of its method names, \(\mathcal {D}\) be its finite data domain, \(\mathcal {R}\) be a finite set of its register names and \(\mathcal {RE}\) be a finite set of its register expressions over \(\mathcal {R}\). Then, a set \(\textit{PCom}\) of primitive commands considered in this paper includes:
  • Register assign commands in the form of \( r_1 = \textit{re} \);

  • Register reset commands in the form of \( \textit{havoc}\);

  • Read commands in the form of \( \textit{read}(x,r_1) \);

  • Write commands in the form of \( \textit{write}(r_1,x) \);

  • Lock commands in the form of \(\textit{lock}\);

  • Unlock commands in the form of \(\textit{unlock}\);

  • Assume commands in the form of \( \textit{assume}(r_1)\);

  • Call commands in the form of \( \textit{call}(m)\);

where \(r_1 \in \mathcal {R}, \textit{re} \in \mathcal {RE}, x \in \mathcal {X}\). Herein, the commands in this paper are borrowed from [8], where lock and unlock commands are called xlock and xunlock in [8]. A \(\textit{havoc}\) command [8] assigns arbitrary values to all registers in \(\mathcal {R}\).

A control-flow graph is a tuple \(\textit{CFG} = (N,L,T,q_i,q_f)\), where N is a finite set of program positions, L is a set of primitive commands, \(T \subseteq N \times L \times N\) is a control-flow transition relation, \(q_i\) is the initial position and \(q_f\) is the final position.

A library \(\mathcal {L}\) can then be defined as a tuple \(\mathcal {L} =(Q_\mathcal {L},\rightarrow _\mathcal {L},\textit{InitV}_\mathcal {L})\), such that \(Q_\mathcal {L} = \bigcup _{m \in \mathcal {M}} Q_m \) is a finite set of program positions, where \(Q_m\) is the program positions of a method m of this library; \(\rightarrow _\mathcal {L} = \bigcup _{m \in \mathcal {M}} \rightarrow _m\) is a control-flow transition relation, where for each \(m \in \mathcal {M}\), \(( Q_m, \textit{PCom}, \rightarrow _m, i_m, f_m )\) is a control-flow graph with a unique initial position \(i_m\) and a unique final position \(f_m\); \(\textit{InitV}_\mathcal {L}: \mathcal {X} \rightarrow \mathcal {D}\) is an initial valuation for its memory locations.

The most general client of a library is a special client program that is used to exhibit all possible behaviors of the library. Formally, the most general client \(\mathcal {MGC}\) of library \(\mathcal {L}\) is defined as a tuple \((\{q_c,q_c'\},\rightarrow _c)\), where \(q_c\) and \(q_c'\) are two program positions, \(\rightarrow _c= \{ (q_c,\textit{havoc},q'_c ) \} \cup \{ (q'_c,\textit{call}(m),q_c) \vert m \in \mathcal {M}\} \) is a control-flow transition relation and \(( \{q_c,q'_c\}, \textit{PCom}, \rightarrow _c, q_c, q_c )\) is a control-flow graph. Intuitively, the most general client repeatedly calls an arbitrary method with an arbitrary argument for arbitrarily many times.

2.3 TSO operational semantics

Assume a concurrent system consists of n processes, each of which runs the most general client program of a library on a separate processor. Then, the operational semantics of a library can be defined in the context of the concurrent system.

For a library \(\mathcal {L}\)=\((Q_{\mathcal {L}},\rightarrow _{\mathcal {L}},\textit{InitV}_{\mathcal {L}})\), its operational semantics on the TSO memory model is defined as an LTS \(\llbracket \mathcal {L},n \rrbracket _{\textit{te}}\) 1 \(= (\textit{Conf}_{\textit{te}}, \Sigma _{\textit{te}},\,\rightarrow _{\textit{te}}, \textit{InitConf}_{\textit{te}} )\), where \(\textit{Conf}_{\textit{te}}, \Sigma _{\textit{te}},\) \(\rightarrow _{\textit{te},} \textit{InitConf}_{\textit{te}}\) are defined as follows.

Each configuration of \(\textit{Conf}_{te}\) is a tuple (pdurl), where
  • \(p:\{1, \dots , n \} \rightarrow \{ q_c , q'_c \} \cup Q_{\mathcal {L}}\) represents control states of each process;

  • \(d:\mathcal {X} \rightarrow \mathcal {D}\) represents values at each memory location;

  • \(u:\{1, \dots , n\} \rightarrow ( \{ (x,a) \vert x \in \mathcal {X}, a \in \mathcal {D} \} \cup \{\textit{call}(m,a) \vert m \in \mathcal {M},a \in \mathcal {D} \} \cup \{\textit{return}(m,a) \vert m \in \mathcal {M},a \in \mathcal {D} \})^*\) represents contents of each processor-local store buffer; each processor-local store buffer may contain a finite sequence of pending write, pending call or pending return actions;

  • \(r:\{1,\ldots ,n\} \rightarrow (\mathcal {R} \rightarrow \mathcal {D})\) represents values of the registers of each process.

  • \(l \subseteq \{1,\ldots ,n\}\) contains all the processes that can currently execute commands.

\(\Sigma _{te}\) consists of the following subsets of actions as transition labels.
  • Internal actions: \(\{ \tau (i) \vert 1 \le i \le n \}\);

  • Read actions: \(\{\textit{read}(i,x,a) \vert 1 \le i \le n, x \in \mathcal {X},a \in \mathcal {D} \}\);

  • Write actions: \(\{\textit{write}(i,x,a) \vert 1 \le i \le n, x \in \mathcal {X}, a \in \mathcal {D} \}\);

  • Lock actions: \(\{\textit{lock}(i) \vert \) \(1 \le i \le n \}\);

  • Unlock actions: \(\{\textit{unlock}(i) \vert \) \(1 \le i \le n \}\);

  • Flush actions: \(\{ \textit{flush}(i,\) \(x,a) \vert \ 1 \le i \le n, x \in \mathcal {X}, a \in \mathcal {D} \}\);

  • Call actions: \(\Sigma _{\textit{cal}}\) = \(\{\textit{call}(i,m,a) \vert 1 \le i \le n, m \in \mathcal {M}, a \in \mathcal {D} \}\);

  • Return actions: \(\Sigma _{\textit{ret}}\) = \(\{\textit{return}(i,m,a) \vert 1 \le i \le n, m \in \mathcal {M}, a \in \mathcal {D} \}\);

  • Flush call actions: \(\Sigma _{\textit{fcal}}\) = \(\{\textit{flushCall}(i,m,a) \vert 1 \le i \le n, m \in \mathcal {M}, a \in \mathcal {D} \}\);

  • Flush return actions: \(\Sigma _{\textit{fret}}\) = \(\{\textit{flushReturn}(i,m,a) \vert 1 \le i \le n, m \in \mathcal {M}, a \in \mathcal {D} \}\).

Fig. 1

Transition rules of \(\rightarrow _{\textit{te}}\)

The initial configuration \(\textit{InitConf}_{te}\in \textit{Conf}_{te}\) is a tuple \((p_{\textit{init}}, \textit{InitV}_{\mathcal {L}}, u_{\textit{init}}, r_{\textit{init}} , l_{\textit{init}} )\), where \(p_{\textit{init}}(i)=q_c\), \(u_{\textit{init}}(i)=\epsilon \) (representing an empty buffer), \(r_{\textit{init}}(i)(r)=\textit{regV}_{\textit{init}}\) (a special initial value of a register) and \(l_{\textit{init}}=\{1,\ldots ,n\}\) for \(1\le i\le n\), \(r \in \mathcal {R}\);

The transition relation \(\rightarrow _{\textit{te}}\) is the least relation satisfying the transition rules shown in Fig. 1. Our operational semantics is similar to the one presented in [8].
  • \(\textit{Register-Assign}\) rule: A function \(f_{\textit{re}} : (\mathcal {R} \rightarrow \mathcal {D}) \times \mathcal {RE} \rightarrow \mathcal {D}\) is used to evaluate register expression \(\textit{re}\) under register valuation \(\textit{rv}\) of current process, and its value is assigned to register \(r_1\).

  • \(\textit{Library-Havoc}\) and \(\mathcal {MGC}{} \textit{-Havoc}\) rules: \(\textit{havoc}\) commands are executed for libraries and the most general clients respectively.

  • \(\textit{Assume}\) rule: If the value of register \(r_1\) is \(\textit{true}\), current process can execute \(\textit{assume}\) command. Otherwise, it must wait.

  • \(\textit{Read}\) rule: A function \(\textit{lookup}(u,d,i,x)\) is used to search for the latest value of x from its processor-local store buffer or the main memory, i.e.,
    $$\begin{aligned} \textit{lookup(u,d,i,x)} = \left\{ \begin{array}{ll} a &{} \text {if}\quad u(i)\uparrow _{\Sigma _x}=(x,a) \cdot l, \ \textit{for some} \ l \in \Sigma _x^* \\ d(x) &{} \text {otherwise }\\ \end{array} \right. \end{aligned}$$
    where \(\Sigma _x\) = \(\{(x,a) \vert a \in \mathcal {D}\}\) is the set of pending write actions for x. A read action will take the latest value of x from its processor-local store buffer if possible, otherwise, it looks up the value in memory.
  • \(\textit{Write}\) rule: A write action will insert a pair of a location and a value to the tail of its processor-local store buffer.

  • \(\textit{Lock}\) and \(\textit{unlock}\) rules: Only when the processor-local store buffer is empty, a processor can perform lock or unlock commands. A process executing lock makes itself the only active process and prevents other processes from executing commands. After a process executes unlock command, other processes become active. Thus, the commands executed from lock to unlock are not interleaved with commands of other processes.

  • \(\textit{Flush}\) rule: The memory system may decide to flush the entry at the head of a processor-local store buffer to memory at any time.

  • \(\textit{Call}\) and \(\textit{return}\) rules: To deal with \(\textit{call}\) command, a call marker is added into the tail of processor-local store buffer and current process starts to execute the initial position of method m. When the process comes to the final position of method m it can launch a \(\textit{return}\) action, add a return marker to the tail of processor-local store buffer and start to execute the most general client.

  • \(\textit{Flush-Call}\) and \(\textit{Flush-Return}\) rules: The call and return marker can be discarded when they are at the head of processor-local store buffer. Such actions are used to define TSO-to-TSO linearizability only.

Given above commands, a memory barrier can be implemented as “lock;unlock”. The cas (compare-and-swap) commands can also be implemented. A cas command takes three arguments: a memory location x, an expected value a and a new value b. It atomically reads x, updates it with b and returns 1 when the value of x is a; otherwise, it does nothing and return 0. As in [8], \(\textit{cas}(x,a,b)\) is defined as syntactic sugar for the control-flow graph representation of:
  • lock;

  • if (x==a) {x=b; unlock; return 1;}

  • {unlock; return 0;}

3 TSO-to-TSO linearizability and equivalent characterization

In this section we introduce the definition of TSO-to-TSO linearizability and then prove that it can be equivalently characterized by extended history inclusion. We also give definitions of variants of histories.

3.1 TSO-to-TSO linearizability

The behavior of a library is typically represented by histories of interactions between the library and the clients calling it (through call and return actions). A finite sequence \(h\in (\Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}})^*\) is a history of an LTS \(\mathcal {A}\) if there exists a trace t of \(\mathcal {A}\) such that \(t \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )} = h\). Let \(\textit{history}(\mathcal {A})\) denote all the histories of \(\mathcal {A}\).

TSO-to-TSO linearizability is a variant of linearizability on the TSO memory model. It additionally concerns the behavior of a library in the context of processor-local store buffers, i.e., the interactions between the library and store buffers through flush call and flush return actions. A finite sequence \(eh\in (\Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}}\cup \Sigma _{\textit{fcal}}\cup \Sigma _{\textit{fret}})^*\) is an extended history of an LTS \(\mathcal {A}\) if there exists a trace t of \(\mathcal {A}\) such that \(t \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}}\cup \Sigma _{\textit{fcal}}\cup \Sigma _{\textit{fret}} )} = eh\). Let \(\textit{ehistory}(\mathcal {A})\) denote all the extended histories of \(\mathcal {A}\), and \(\textit{eh} \vert _{\textit{i}} \) the projection of \(\textit{eh}\) to the actions of the ith process. Two extended histories \(\textit{eh}_1\) and \(\textit{eh}_2\) are equivalent, if for each \(1\le i\le n\), \(\textit{eh}_1 \vert _i = \textit{eh}_2 \vert _i\).

Definition 1

(TSO-to-TSO linearizability [8]) For any two extended histories \(\textit{eh}_1\) and \(\textit{eh}_2\) of libraries, \(\textit{eh}_1\) is TSO-to-TSO linearizable to \(\textit{eh}_2\), if
  • \(\textit{eh}_1\) and \(\textit{eh}_2\) are equivalent;

  • there is a bijection \(\pi : \{ 1,\dots ,\vert \textit{eh}_1 \vert \} \rightarrow \{ 1,\dots ,\vert \textit{eh}_2 \vert \}\) such that for any \(1\le i\le |eh_1|\), \(\textit{eh}_1(i)=\textit{eh}_2(\pi (i))\);

  • for any \(1\le i<j\le |eh_1|\), if \( (\textit{eh}_1(i) \in \Sigma _{\textit{ret}} \cup \Sigma _{\textit{fret}} ) \wedge (\textit{eh}_1(j) \in \Sigma _{\textit{cal}} \cup \Sigma _{\textit{fcal}})\), then \(\pi (i) < \pi (j)\).

For two libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\), we say that \(\mathcal {L}_2\) \(\textit{TSO-to-TSO linearizes}\) \(\mathcal {L}_1\), if for any \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_1,n \rrbracket _{\textit{te}} ) \), there exists \(\textit{eh}_2 \in \textit{ehistory}( \llbracket \mathcal {L}_2,\) \(n \rrbracket _{\textit{te}} ) \), such that \(\textit{eh}_1\) is \(\textit{TSO-to-TSO linearizable}\) to \(\textit{eh}_2\).

Informally speaking, if \(\textit{eh}_1\) is TSO-to-TSO Linearizable to \(\textit{eh}_2\), then \(\textit{eh}_2\) keeps all the non-overlapping pairs of call/flush call and return/flush return actions in \(\textit{eh}_1\) in the same order. It is proved in [8] that TSO-to-TSO linearizability satisfies a so-called abstraction theorem. Therefore, if \(\mathcal {L}_2\) TSO-to-TSO linearizes \(\mathcal {L}_1\), then it is safe to replace \(\mathcal {L}_1\) with \(\mathcal {L}_2\) and this will not introduce any new behaviors in the view of client programs.

The following is an example of implementation library and its specification library of TSO-to-TSO linearizability in [8]. The implementation library, spinlock, of software lock is used in various version of the Linux kernel [4] and is shown in (a):

Method release writes 1 to x without executing a memory barrier or lock, which can result in a delay for other processes on TSO. The abstraction library is given in (b) as follows:

Here, the write to x in the release of abstraction library can also be delayed in the store buffer. It can be seen that the resulting specification still guarantees mutual exclusion. It has been proved in [8] that the abstraction library TSO-to-TSO linearizes the implementation library.

Apart from histories and extended histories, there are other forms of sequences that are used to represent behaviors of libraries. For example, in [9, 10] the behavior of a method starts with call action and ends with flush return action. In such situation the behavior of a library essentially contains sequences of call and flush return actions. To deal with all possible variants of histories, we generalize the notions of history as follows: Let \(\textit{cal}\), \(\textit{ret}\), \(\textit{fcal}\) and \(\textit{fret}\) represent call, return, flush call and flush return actions, respectively. Given distinct \(x,y,z,w \in \{ \textit{cal}, \textit{ret}, \textit{fcal}, \textit{fret}\}\), a (x)-history is a sequence of x actions, a (xy)-history is a sequence of x and y actions, a (xyz)-history is a sequence of x, y and z actions, and a (xyzw)-history is a sequence of x, y, z and w actions. It is easy to see that there are fifteen variants of histories, while the (standard) histories can be defined as \((\textit{call}, \textit{ret})\)-histories, and extended histories can be defined as \((\textit{call}, \textit{ret}, \textit{fcal}, \textit{fret})\)-histories.

3.2 Equivalence characterization

To handle the decision problem of TSO-to-TSO linearizability, we show that the extended history inclusion is an equivalent characterization of TSO-to-TSO linearizability. It is obvious that extended history inclusion implies TSO-to-TSO linearizability. To prove the opposite direction, we need to prove that if \(\mathcal {L}_2\) TSO-to-TSO linearizes \(\mathcal {L}_1\), \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_1,n \rrbracket _{\textit{te}} )\), \(\textit{eh}_2 \in \textit{ehistory}( \llbracket \mathcal {L}_2,n\) \(\rrbracket _{\textit{te}} )\) and \(\textit{eh}_1\) is TSO-to-TSO linearizable to \(\textit{eh}_2\), then \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_2,n \rrbracket _{\textit{te}} )\).

A transformation \(\Rightarrow _{\textit{ER}}\) is a relation between two extended histories and is defined as follows: \(\textit{eh}_1 \Rightarrow _{\textit{ER}} \textit{eh}_2\), if \(\textit{eh}_1\) = \(l_1 \cdot \alpha \cdot \beta \cdot l_2\), \(\textit{eh}_2\) = \(l_1 \cdot \beta \cdot \alpha \cdot l_2\) and \((\alpha ,\beta )\) is neither in \((\Sigma _{\textit{ret}} \cup \Sigma _{\textit{fret}}) \times (\Sigma _{\textit{cal}} \cup \Sigma _{\textit{fcal}})\), nor actions of same process. Or we can say, \(\textit{eh}_2\) can be obtained by swapping two adjacent elements of \(\textit{eh}_1\) without violating TSO-to-TSO linearizability. We write \(\Rightarrow _{\textit{ER}}^*\) to denote the transition closure of \(\Rightarrow _{\textit{ER}}\).

Given two equivalent extended histories \(\textit{eh}_1\) and \(\textit{eh}_2\), we say that \(\pi \) is their bijection, if \(\pi \) is a bijection between \(\{1, \ldots , \vert \textit{eh}_1 \vert \}\) and \(\{1, \ldots , \vert \textit{eh}_2 \vert \}\), and \(\textit{eh}_1(i)=\textit{eh}_2(\pi (i))\) for each i. We use predicate \(\textit{eWit}(\textit{eh}_1,\textit{eh}_2,i_1,i_2)\) to denote a difference between two equivalent extended histories, and \(\textit{eWit}(\textit{eh}_1,\textit{eh}_2,i_1,i_2)\) holds if \(i_1 < i_2\) and \(\pi ^{-1}(i_1) > \pi ^{-1}(i_2)\). Given two equivalent extended histories \(\textit{eh}_1\) and \(\textit{eh}_2\), a non-negative distance function \(eWitSum(\textit{eh}_1, \textit{eh}_2)\) is used to measure the difference between them. Formally, \(\textit{eWitSum}(\textit{eh}_1,\textit{eh}_2)\) = \(\vert \{ (m,n) \vert \textit{eWit}(\textit{eh}_1,\) \(\textit{eh}_2,m,n) \ holds\} \vert \).

Through the well-defined transformation relation and distance function, we can show that if an extended history \(\textit{eh}_1\) is TSO-to-TSO linearizable to another extended history \(\textit{eh}_2\) and \(eh_1\ne eh_2\), then there exists a third extended history \(eh_3\), such that
  • \(\textit{eh}_1\) is TSO-to-TSO linearizable to \(\textit{eh}_3\);

  • \(\textit{eh}_3\) \(\Rightarrow _{\textit{ER}}\) \(\textit{eh}_2\);

  • \(\textit{eh}_3\) is TSO-to-TSO linearizable to \(\textit{eh}_2\);

  • the distance between \(eh_1\) and \(eh_3\) is strictly less than the one between \(eh_1\) and \(eh_2\).

Based on this, we prove that if \(\textit{eh}_1\) is TSO-to-TSO linearizable to \(\textit{eh}_2\), then \(\textit{eh}_1 \Rightarrow _{\textit{ER}}^* \textit{eh}_2\). Or we can say, \(\textit{eh}_2\) can be obtained from \(\textit{eh}_1\) by a finite number of \(\Rightarrow _{\textit{ER}}\) transformations. We also prove that if \(\textit{eh}_2\), an extended history of \(\llbracket \mathcal {L},n \rrbracket _{\textit{te}}\), can be obtained from \(\textit{eh}_1\) by one step of \(\Rightarrow _{\textit{ER}}\) transformation, then \(\textit{eh}_1\) is also an extended history of \(\llbracket \mathcal {L},n \rrbracket _{\textit{te}}\).

Based on the two results in above paragraph, it is not hard to see that TSO-to-TSO linearizability implies extended history inclusion. Therefore, extended history inclusion is an equivalent characterization of TSO-to-TSO linearizability, as presented by the following lemma.

Lemma 1

For any two libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\), \(\mathcal {L}_2\) TSO-to-TSO linearizes \(\mathcal {L}_1\) if and only if \(\textit{ehistory}( \llbracket \mathcal {L}_1,n \rrbracket _{\textit{te}} ) \subseteq \textit{ehistory}( \llbracket \mathcal {L}_2,n \rrbracket _{\textit{te}} )\).

4 Specific libraries for classic-lossy single-channel systems

In this section, we introduce the definition of classic-lossy single-channel systems, and then show how to simulate a classic-lossy single-channel system with a concurrent library.

4.1 Classic-lossy single-channel systems

A classic-lossy single-channel system [19] is a tuple \(\mathcal {S}\) = \((Q_{\textit{cs}},\Sigma _{\textit{cs}},\{c_{\textit{cs}}\},\Gamma _{\textit{cs}},\) \(\Delta _{\textit{cs}})\), where \(Q_{\textit{cs}}\) is a finite set of control states, \(\Sigma _{\textit{cs}}\) is a finite alphabet of messages, \(c_{\textit{cs}}\) is the name of the single channel, \(\Gamma _{\textit{cs}}\) is a finite set of transition labels and \(\Delta _{\textit{cs}} \subseteq Q_{\textit{cs}} \times \Sigma ^*_{\textit{cs}} \times \Gamma _{\textit{cs}} \times Q_{\textit{cs}} \times \Sigma ^*_{\textit{cs}}\) is a transition relation.

Given two finite sequences \(l_1=\alpha _1 \cdot \alpha _2 \cdot \ldots \cdot \alpha _u\) and \(l_2=\beta _1 \cdot \beta _2 \cdot \ldots \cdot \beta _v\), we say that \(l_1\) is a \(\textit{subword}\) of \(l_2\), denoted \(l_1 \sqsubseteq l_2\), if there exists \(1\le i_1< \cdots < i_u \le v \) such that for any \(1\le j \le u\), \(\alpha _j=\beta _{i_j}\). Then, the operational semantics of \(\mathcal {S}\) is given by an LTS \(\mathcal {CL}(\mathcal {S})\) = \((\textit{Conf}_{\textit{cs}}, \Gamma _{\textit{cs}},\rightarrow _{\textit{cs}},\textit{initConf}_{\textit{cs}})\), where \(\textit{Conf}_{\textit{cs}} = Q_{\textit{cs}} \times \Sigma ^*_{\textit{cs}}\) is a set of configurations with \(\textit{initConf}_{\textit{cs}} \in \textit{Conf}_{\textit{cs}}\) as the initial configuration. The transition relation \(\rightarrow _{\textit{cs}}\) is defined as follows: \((q_1,W_1)\) \(\overset{ \alpha }{\longrightarrow }_{\textit{cs}}\) \((q_2,W_2)\) if there exists \((q_1,U,\alpha ,q_2,V)\) \(\in \) \(\Delta _{\textit{cs}}\) and \(W' \in \Sigma _{\textit{cs}}^*\) such that \(U \cdot W' \sqsubseteq W_1\) and \(W_2 \sqsubseteq W' \cdot V\).

It is known that for two configurations \((q_1,W_1),(\) \(q_2,W_2) \in \textit{Conf}_{\textit{cs}}\) of a classic-lossy single-channel system \(\mathcal {S}\), the trace inclusion between \((q_1,W_1)\) and \((q_2,W_2)\) is undecidable [19].

4.2 Simulation on the TSO memory model

On the TSO memory model flush actions are launched nondeterministically by the memory system. Therefore, between two consecutive \(\textit{read}(x,\_)\) actions, more than one flush actions to x may happen. The second read action can only read the latest flush action to x, while missing the intermediate ones. These missing flush actions are similar to the missing messages that may happen in a classic-lossy single-channel system. This makes it possible to simulate a classic-lossy single-channel system with a concurrent program running on the TSO memory model. We implement such simulation through a library \(\mathcal {L}_{\mathcal {S},q,W}\) specifically constructed based on a classic-lossy single-channel system \(\mathcal {S}\) and a given configuration \((q,W) \in \textit{Conf}_{\textit{cs}}\).

For a classic-lossy single-channel system \(\mathcal {S}\) =\((Q_{\textit{cs}},\Sigma _{\textit{cs}},\{ c_{\textit{cs}} \},\) \(\Gamma _{\textit{cs}},\Delta _{\textit{cs}})\), assume the finite data domain \(\mathcal {D}_{\textit{cs}}\) = \(Q_{\textit{cs}} \cup \Sigma _{\textit{cs}} \cup \Delta _{\textit{cs}} \cup \{ \sharp , \textit{start}, \textit{end}, \bot ,\textit{true},\textit{false},\textit{regV}_{\textit{init}}, \textit{rule}_{f}\}\), where \(Q_{\textit{cs}} \cap \Sigma _{\textit{cs}} = \emptyset \), \(Q_{\textit{cs}} \cap \Delta _{\textit{cs}} = \emptyset \), \(\Sigma _{\textit{cs}} \cap \Delta _{\textit{cs}} = \emptyset \), and the symbols \(\sharp , \textit{start}, \textit{end}\), \(\bot \), \(\textit{true}\), \(\textit{false}\), \(\textit{regV}_{\textit{init}}\) and \(\textit{rule}_f\) do not exist in \(Q_{\textit{cs}} \cup \Sigma _{\textit{cs}} \cup \Delta _{\textit{cs}}\). Given a configuration \((q,W) \in \textit{Conf}_{\textit{cs}}\) of \(\mathcal {S}\), the library \(\mathcal {L}_{\mathcal {S},q,W}\) is constructed with three methods \(M_1\), \(M_2\) and \(M_3\), and three private memory locations x, y and z. x is used to transmit the channel contents from \(M_2\) to \(M_1\), while y is used to transmit the channel contents from \(M_1\) to \(M_2\). z is used to transmit the transition labels of \(\mathcal {CL}(\mathcal {S})\) from \(M_2\) to \(M_3\). It is also used to synchronize \(M_2\) and \(M_3\). The symbol \(\sharp \) is used as the delimiter to ensure that one element will not be read twice. The symbols \(\textit{start}\) and \(\textit{end}\) represent the start and the end of the channel contents, respectively. \(\bot \) is the initial value of x, y and z. The symbol \(\textit{rule}_f\) is an additional transition rule that is used to indicate the end of a simulation.

We now present the three methods in the pseudo-code, shown in Methods 1, 2 and 3. The \(\textit{if}\) and \(\textit{while}\) statements used in the pseudo-code can be easily implemented by the \(\textit{assume}\) commands as well as other commands in our formation of a library. For the sake of brevity, the following macro notations are used. For sequence \(l=a_1 \cdot \ldots \cdot a_m\), let writeSeq(x,l) denote the commands of writing \(a_1, \sharp , \ldots , a_m, \sharp \) to x in sequence, and readSeq(x,l) denote the commands of reading \(a_1, \sharp , \ldots , a_m, \sharp \) from x in sequence. We use \(v:=\textit{readOne}(x)\) to represent the commands of reading \(e,\sharp \) from x in sequence for some \(e \ne \sharp \) and then assigning e to v. If \(\textit{readSeq}(x,l)\) or \(\textit{readOne}(x)\) fails to read the specified content, then the calling process will no long proceed. We use \(\textit{writeOne}(x,\textit{reg})\) to represent the commands of writing a, \(\sharp \) to x in sequence where a is the current value of register \(\textit{reg}\). In the pseudo-code, r is a register in \(\mathcal {R}\).

The pseudo-code of method \(M_1\) is shown in Method 1. \(M_1\) contains an infinite loop that never returns (Lines 1–3). At each round of the loop, it reads a new update from x and writes it to y.
The pseudo-code of method \(M_2\) is shown in Method 2. \(M_2\) first guesses a transition rule \(\textit{rule}_1\), puts \(\textit{rule}_1\) and W into the processor-local store buffer by writing them to x (Lines 1–2). Then, it begins an infinite loop that never returns (Lines 3–16). At each round of the loop, it reads the current transition rule \(\textit{rule}_2 \in \Delta _{\textit{cs}}\) of \(\mathcal {S}\) (Line 4) and guesses a transition rule \(\textit{rule}_3 \in \Delta _{\textit{cs}} \cup \{ \textit{rule}_f \}\) (Line 5). If \(M_2\) guesses the rule \(\textit{rule}_f\) in Line 5, then in the next round of this loop it will be blocked at Line 4 and the simulation terminates. \(M_2\) does not confirm \(\textit{rule}_2\) until it reads \(\textit{start} \cdot U_1\) from y at Line 6 (intermediate values of y may be lost). At Line 7, it writes \(\textit{rule}_3 \cdot \textit{start}\) to y. Then, it reads the remaining contents of method \(M_1\)’s processor-local store buffer (intermediate values of y may be lost) and writes them and \(V_1\) to x (Lines 8–13). In Lines 14–16, it transmits the transition label \(\alpha _1\) to method \(M_3\).
The pseudo-code of method \(M_3\) is shown in Method 3. \(M_3\) first waits for \(M_2\) to transmit transition label to it though z by a non \(\bot \) value (Lines 1–4). Then it acknowledges \(M_2\) at Lines 5–7 and returns this transition label z at Line 8. \(M_3\) uses lock and unlock commands to communicate with \(M_2\). It never puts a pending write action to its processor-local store buffer. If the programming language does not provide lock and unlock commands of TSO memory model but \(\textit{cas}\) command instead, it is safe to use \(\textit{cas}(z,r,\bot );\) to take place of Lines 5–7 of method \(M_3\).

5 Undecidability of TSO-to-TSO linearizability

As the main result of this paper, we present in this section that the TSO-to-TSO linearizability of concurrent libraries is undecidable for a bounded number of processes. We first reduce the trace inclusion problem between any two configurations of a classic-lossy single-channel system to the history inclusion problem between two specific concurrent libraries. Then, our main undecidability result follows from the equivalence between the history inclusion and the extended history inclusion for these two libraries. Recall that the latter is equivalent to TSO-to-TSO linearizability between the two libraries based on the above Lemma 1. Moreover, we prove that in general all variants of history inclusion problems, including history inclusion problem and extended history inclusion problem, are undecidable on TSO for a bounded number of processes.

5.1 Undecidability of history inclusion

In this subsection we show that given a classic-lossy single-channel system \(\mathcal {S}\) and a configuration \((q,W)\in Conf_{cs}\), the histories of library \(\mathcal {L}_{\mathcal {S},q,W}\) simulate exactly the paths of \(\mathcal {S}\) starting from (qW). Therefore, trace inclusion between \((q_1,W_1)\) and \((q_2,W_2)\) can be reduced into history inclusion between \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1},3 \rrbracket _{\textit{te}} )\) and \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2},3 \rrbracket _{\textit{te}} )\).

A path \(p_{\mathcal {S}}= (q_1,W_1)\) \(\overset{\alpha _1}{\longrightarrow }_{\textit{cs}}\) \((q_2,W_2)\) \(\overset{\alpha _2}{\longrightarrow }_{\textit{cs}}\) \(\cdots \) \(\overset{\alpha _k}{\longrightarrow }_{\textit{cs}}\) \((q_{k+1},W_{k+1})\) \(\in \textit{path}(\mathcal {CL}(\mathcal {S}),\) \( (q_1,W_1) )\) is \(\textit{conservative}\), if the following two conditions hold: (1) it contains at least one transition, (2) assume the ith step uses rule \(r_i=(q_i,U_i,\alpha _i,q_{\textit{i+1}},V_i)\) for each \(1\le i\le k\), then for each \(1\le i\le k\), there exists \(W'_i,W''_i \in \Sigma _{\textit{cs}}^*\) such that \(U_i \cdot W'_i \sqsubseteq W_i\), \(W''_i \sqsubseteq W'_i\) and \(W_{\textit{i+1}}=W''_i \cdot V_i\). Intuitively, each ith step of a conservative path does not lose any element in \(V_i\). We prove that the traces of conservative paths equals to that of all paths for classic-lossy single-channel systems. A trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S},q,W},3 \rrbracket _{\textit{te}} )\) is \(\textit{effective}\), if \(t_{\mathcal {L}}\) contains at least one return action \(\textit{return}(\_,M_3,\_)\). Otherwise, it is ineffective.

There is actually a close connection between the conservative paths of \(\mathcal {CL}(\mathcal {S})\) and the effective traces of \(\llbracket \mathcal {L}_{\mathcal {S},q,W},3 \rrbracket _{\textit{te}}\). An effective trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S},q,W},3 \rrbracket _{\textit{te}} )\) and a conservative path \(p_{\mathcal {S}} \in \textit{path}( \mathcal {CL}(\mathcal {S}),(q,W) )\) correspond, if the sequence of return values of \(M_3\) in \(t_{\mathcal {L}}\) is the same as the sequence of transition labels of \(p_{\mathcal {S}}\). The following lemma states that given a conservative path \(p_{\mathcal {S}} \in \textit{path}( \mathcal {CL}(\mathcal {S}), (q,W) )\), there exists a corresponding effective trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{ \mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\).

Lemma 2

Given a conservative path \(p_{\mathcal {S}} \in \textit{path}( \mathcal {CL}(\mathcal {S}), (q,W) )\), there exists an effective trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{ \mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\) such that \(t_{\mathcal {L}}\) and \(p_{\mathcal {S}}\) correspond.

Proof

(Sketch) Assume \(p_{\mathcal {S}}\) = \((q_1,W_1) \overset{\alpha _1}{\longrightarrow }_{\textit{cs}} (q_2,W_2) \overset{\alpha _2}{\longrightarrow }_{\textit{cs}} \cdots \overset{\alpha _k}{\longrightarrow }_{\textit{cs}} (q_{k+1},W_{k+1})\), where (1) \((q_1,W_1)=(q,W)\), (2) for each i, the ith transition uses rule \(r_i=(q_i,U_i,\alpha _i,q_{\textit{i+1}},V_i)\), (3) for each i, \(\exists W'_i,W''_i\), such that \(U_i \cdot W'_i \sqsubseteq W_i\), \(W''_i \sqsubseteq W'_i\) and \(W_{\textit{i+1}}=W''_i \cdot V_i\).

Let us sketch how to construct a path \(p_{\mathcal {L}} \in \textit{path}( \llbracket \mathcal {L}_{ \mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\) that simulates k transitions on \(p_{\mathcal {S}}\). From initial configuration we first perform the following actions in sequence:
  • Call \(M_1\) in process 1 and call \(M_2\) in process 2.

  • Run \(M_2\) from Line 1 to Line 2, write \(r_1 \cdot \textit{start} \cdot W_1 \cdot \textit{end}\) to x while no flush action of process 2 happens during this period.

To simulate the ith (\(1 \le i \le k\)) transition of \(p_{\mathcal {S}}\), we need to perform the following actions in sequence:
  • Call \(M_3\) in process 3.

  • Run \(M_2\) from Line 3 to Line 13. \(M_2\) reads \(r_i \cdot \textit{start} \cdot U_i \cdot W''_i \cdot \textit{end}\) from y and writes \(r_{\textit{i+1}} \cdot \textit{start} \cdot W''_i \cdot V_i \cdot \textit{end}\) to x (in the case of \(i=k\), \(M_2\) write \(\textit{rule}_f \cdot \textit{start} \cdot W''_k \cdot V_k \cdot \textit{end}\) instead). Then \(M_2\) transmits transition label \(\alpha _i\) to \(M_3\) and \(M_3\) returns \(\alpha _i\). Since \(W_{\textit{i+1}}\) is equal to \(W''_i \cdot V_i\), \(M_2\) writes \(W_{\textit{i+1}}\) to x while it simulates the ith transition of \(p_{\mathcal {S}}\).

It is obvious that \(p_{\mathcal {L}}\) holds as required and its trace, \(t_{\mathcal {L}}\), corresponds with \(p_{\mathcal {S}}\). This completes the proof of Lemma 2. \(\square \)
Figure 2 shows an example of generating a corresponding effective trace of \(\llbracket \mathcal {L}_{\mathcal {S},q,W},\) \(3 \rrbracket _{\textit{te}}\) from a conservative path of \(\mathcal {CL}(\mathcal {S})\). Note that many possible executions of \(\llbracket \mathcal {L}_{\mathcal {S},q,W},\) \(3 \rrbracket _{\textit{te}}\) can get into deadlock due to the operational semantics and the pseudo-code of each method. Herein, we consider only the executions where \(M_3\) always manages to output return labels accordingly. In Fig. 2, contents of a store buffer are written from left to right, while the time progresses from left to right, too. Assume \((q,W) = (q_1, a \cdot a)\) and there is a conservative path \(p_{\mathcal {S}} = (q_1,a \cdot a)\) \(\overset{\alpha _1}{\longrightarrow }_{\textit{cs}}\) \((q_2,b \cdot c)\) \(\overset{\alpha _2}{\longrightarrow }_{\textit{cs}}\) \((q_3,a)\), where the first step uses rule \(\textit{rule}_1=(q_1,a,\alpha _1,q_2,b \cdot c)\) (loses a in the channel), and the second one uses rule \(\textit{rule}_2=(q_2,b,\alpha _2,q_3,a)\) (loses c in the channel). For this path, we can get a corresponding effective trace of \(t_{\mathcal {L}}\) as follows:
  1. 1.

    Run \(M_1\), \(M_2\) and \(M_3\) in processes \(P_1\), \(P_2\) and \(P_3\) respectively. Recall that \(M_1\) and \(M_2\) never return, while each invocation of \(M_3\) is associated with an interval shown in Fig. 2.

     
  2. 2.

    At Line 2 of Method 2, \(M_2\) puts \((x,\textit{rule}_1)\), \((x,\sharp )\), \((x,\textit{start})\), \((x,\sharp )\), (xa), \((x,\sharp )\), (xa), \((x,\sharp )\), \((x,\textit{end})\), \((x,\sharp )\) into the store buffer of process \(P_2\).

     
  3. 3.

    By several loops between Lines 1–3, \(M_1\) captures the updates of x in a lossy manner, and puts \((y,\textit{rule}_1)\), \((y,\sharp )\), \((y,\textit{start})\), \((y,\sharp )\), (ya), \((y,\sharp )\), \((y,\textit{end})\), \((y,\sharp )\) into the store buffer of process \(P_1\).

     
  4. 4.

    At Line 4 of Method 2, \(M_2\) captures the updates of y in a lossy manner. \(M_2\) guesses an applicable transition rule \( \textit{rule}_2\), and then puts \((x,\textit{rule}_2)\) ,\((x,\sharp )\), \((x,\textit{start})\), \((x,\sharp )\), (xb), \((x,\sharp )\), (xc), \((x,\sharp )\), \((x,\textit{end})\), \((x,\sharp )\) into the store buffer of process \(P_2\), according to transition rule \(\textit{rule}_1\).

     
  5. 5.

    \(M_2\) sends the transition label \(\alpha _1\) to \(M_3\) at Line 14 of Method 2. Then, \(M_3\) returns \(\alpha _1\) and we finish simulating the first transition in \(p_{\mathcal {S}}\).

     
  6. 6.

    By several loops between Lines 1–3, \(M_1\) captures the updates of x in a lossy manner, and puts \((y,\textit{rule}_2)\), \((y,\sharp )\), \((y,\textit{start})\), \((y,\sharp )\), (yb), \((y,\sharp )\), \((y,\textit{end})\), \((y,\sharp )\) into the store buffer of process \(P_1\).

     
  7. 7.

    At Line 4 of Method 2, \(M_2\) captures the updates of y. Then, \(M_2\) decides to terminate the simulation and puts \((x,\textit{rule}_f)\),\((x,\sharp )\), \((x,\textit{start})\), \((x,\sharp )\), (xa), \((x,\sharp )\), \((x,\textit{end})\), \((x,\sharp )\) into the store buffer of process \(P_2\), according to transition rule \(\textit{rule}_2\).

     
  8. 8.

    \(M_2\) sends the transition label \(\alpha _2\) to \(M_3\) at Line 14 of Method 2. Then, \(M_3\) returns \(\alpha _2\) and we finish simulating the second transition in \(p_{\mathcal {S}}\).

     
Fig. 2

A conservative path and its corresponding effective trace

It can be seen that \(t_{\mathcal {L}}\) and \(p_{\mathcal {S}}\) correspond in this example. The following lemma is the opposite direction of Lemma 2.

Lemma 3

For each effective trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\), there exists a conservative path \(p_{\mathcal {S}} \in \textit{path}( \mathcal {CL}(\mathcal {S}), (q,W) )\) such that \(t_{\mathcal {L}}\) and \(p_{\mathcal {S}}\) correspond.

Proof

(Sketch) Given a path \(p_{\mathcal {L}} \in \textit{path}( \llbracket \mathcal {L}_{ \mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\) and let \(t_{\mathcal {L}}\) be its trace. It is easy to see that the sequence of values of x which is read by \(M_1\) is a subword of the the sequence of values of x which is written by \(M_2\). And the sequence of values of y which is read by \(M_2\) is a subword of the the sequence of values of y which is written by \(M_1\).

A round of \(M_2\) is executions from Line 1 to Line 2 or from Line 3 to Line 13 of \(M_2\). Assume that \(t_{\mathcal {L}}\) has k return actions of \(M_3\), and for each \(1 \le i \le k\), \(M_2\) guesses rule \(r_i=(q_i,U_i,\alpha _i,\) \(q_{\textit{i+1}},\) \(V_i)\) in its ith round. \(M_2\) writes \(r_1 \cdot \textit{start} \cdot W \cdot \textit{end}\) to x during its first round. Assume for each \(1 < i \le k\), \(M_2\) reads \(r_i \cdot \textit{start} \cdot U_i \cdot L'_i \cdot \textit{end}\) from y during its \(\textit{i+1}\)th round and writes \(r_i \cdot \textit{start} \cdot L_i' \cdot V_i \cdot \textit{end}\) to x during its \(\textit{i+1}\)th round.

Recall that \(M_2\) acts according to transition rules \(r_i\) and when \(M_1\) reads updates of x or \(M_2\) reads updates of y, arbitrary message can be lost. Therefore, it is not hard to see that \(p_{\mathcal {S}}=(q,W)\) \(\overset{\alpha _1}{\longrightarrow }_{\textit{cs}}\) \((q_2, L'_1 \cdot V_1)\) \(\overset{\alpha _2}{\longrightarrow }_{\textit{cs}}\) \(\cdots \) \(\overset{\alpha _k}{\longrightarrow }_{\textit{cs}}\) \((q_{k+1}, L'_k \cdot V_k)\) is a conservative path of \(\textit{path}( \mathcal {CL}(\mathcal {S}), (q,W) )\) and \(t_{\mathcal {L}}\) and \(p_{\mathcal {S}}\) correspond, which completes the proof of Lemma 3. \(\square \)

Lemmas 2 and 3 states that there is a close connection between the conservative paths of \(\mathcal {CL}(\mathcal {S})\) and the effective traces of \(\llbracket \mathcal {L}_{\mathcal {S},q,W},3 \rrbracket _{\textit{te}}\). Based on them we can now prove the following lemma, which shows that the history inclusion between concurrent libraries is undecidable on the TSO memory model for a bounded number of processes.

Lemma 4

For any two libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\), it is undecidable whether \(\textit{history}( \llbracket \mathcal {L}_1,3 \rrbracket _{\textit{te}} )\) \(\subseteq \textit{history}( \llbracket \mathcal {L}_2, 3 \rrbracket _{\textit{te}} )\).

Proof

(Sketch) Based on Lemmas 2 and 3, for any two configurations \((q_1,W_1),\) \((q_2,\) \(W_2) \in \textit{Conf}_{\textit{cs}}\) of a classic-lossy single-channel system \(\mathcal {S}\), let us prove that \(\textit{history}( \llbracket \) \( \mathcal {L}_{\mathcal {S},q_1,W_1} ,3\rrbracket _{\textit{te}}) \subseteq \textit{history}\) \(( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2} ,3 \rrbracket _{\textit{te}})\), if and only if \(\textit{trace}( \mathcal {CL}\) \((\mathcal {S}),\) \((q_1,W_1) )\) \(\subseteq \textit{trace}( \) \( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\).

The \(\textit{only if}\) direction is proved by contradiction. Assume \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1},3 \rrbracket _{\textit{te}} ) \subseteq \) \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S}, q_2,W_2},3 \rrbracket _{\textit{te}} )\) but \(\textit{trace}( \mathcal {CL}(\mathcal {S}),(q_1,W_1) )\) is not a subset of \(\textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,\) \(W_2) )\). Thus there must exists a trace \(t_{\mathcal {S}1}\), such that \(t_{\mathcal {S}1} \in \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_1,W_1) )\) and \(t_{\mathcal {S}1} \notin \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,\) \(W_2) )\). It is clear that \(t_{\mathcal {S}1} \ne \epsilon \).

Let \(p_{\mathcal {S}1}\) be the path of \(t_{\mathcal {S}1}\) on \(\mathcal {CL}(\mathcal {S})\) from \((q_1,W_1)\). We can safely assume \(p_{\mathcal {S}1}\) to be conservative. According to Lemma 2 there exists an effective trace \(t_{\mathcal {L}1} \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S}, q_1,W_1},\) \(3 \rrbracket _{\textit{te}} )\), such that \(t_{\mathcal {L}1}\) and \(p_{\mathcal {S}1}\) correspond. Let history \(h = t_{\mathcal {L}1} \uparrow _{ ( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} ) }\). It is obvious that \(h \in \textit{history}( \llbracket \mathcal {L}_{ \mathcal {S} ,q_1,W_1},3 \rrbracket _{\textit{te}} )\) and by assumption \(h \in \textit{history}( \llbracket \mathcal {L}_{ \mathcal {S} ,q_2,W_2},3 \rrbracket _{\textit{te}} )\).

There exists a trace \(t_{\mathcal {L}2} \in \textit{trace}( \llbracket \mathcal {L}_{ \mathcal {S} ,q_2,W_2},3 \rrbracket _{\textit{te}} )\) such that \(h = t_{\mathcal {L}2} \uparrow _{ ( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} ) }\). It is obvious that \(t_{\mathcal {L}2}\) is effective. According to Lemma 3, there exists a conservative path \(p_{\mathcal {S}2} \in \textit{path}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\) such that \(t_{\mathcal {L}2}\) and \(p_{\mathcal {S}2}\) correspond. Let trace \(t_{\mathcal {S}2}\) be the trace of \(p_{\mathcal {S}2}\). Thus \(t_{\mathcal {S}2} \in \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\) by its definition. Because the sequence of return values of \(M_3\) in \(t_{\mathcal {L}1}\) is same to that in \(t_{\mathcal {L}2}\), \(t_{\mathcal {L}1}\) and \(p_{\mathcal {S}1}\) correspond, and \(t_{\mathcal {L}2}\) and \(p_{\mathcal {S}2}\) correspond, we can obtain that \(t_{\mathcal {S}1}=t_{\mathcal {S}2}\) and \(t_{\mathcal {S}1} \in \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\), which contradicts our assumption.

The \(\textit{if}\) direction can be similarly proved and its proof is omitted here. Therefore, the undecidability result follows from the fact that the trace inclusion problem between any two configurations of a classic-lossy single-channel system is undecidable [19]. \(\square \)

5.2 Undecidability of TSO-to-TSO linearizability

Although we prove above that history inclusion is undecidable on the TSO memory model, there is still a gap between the history inclusion and the extended history inclusion between concurrent libraries. Obviously there exist libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\) such that \(\textit{history}( \llbracket \mathcal {L}_1 ,n \rrbracket _{\textit{te}} ) \subseteq \textit{history}( \llbracket \mathcal {L}_2 ,n \rrbracket _{\textit{te}} )\) but \(\textit{ehistory}( \llbracket \mathcal {L}_1 ,n \rrbracket _{\textit{te}} ) \not \subseteq \textit{ehistory}( \llbracket \mathcal {L}_2 ,n \rrbracket _{\textit{te}} )\). We show in this subsection that for the two libraries \(\mathcal {L}_{\mathcal {S},q_1,W_1}\) and \(\mathcal {L}_{\mathcal {S},q_2,W_2}\), corresponding to the configurations \((q_1, W_1)\) and \((q_2, W_2)\) of a classic-lossy single-channel system, respectively, the history inclusion and the extended history inclusion between \(\mathcal {L}_{\mathcal {S},q_1,W_1}\) and \(\mathcal {L}_{\mathcal {S},q_2,W_2}\) coincides on the TSO memory model.

Without loss of generality, assume \(M_1\) and \(M_2\) of \(\mathcal {L}_{\mathcal {S},q,W}\) are called by processes \(P_1\), \(P_2\), respectively; while \(M_3\) of \(\mathcal {L}_{\mathcal {S},q,W}\) is repeatedly called by process \(P_3\). Then, an extended history \(\textit{eh} \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q,W}, 3 \rrbracket _{\textit{te}} )\) that contains at least one return action of \(M_3\) is in the following form:
  • The first six actions of \(\textit{eh}\) are always call and corresponding flush call actions of \(M_1\), \(M_2\) and \(M_3\), while these actions may occur in any order.

  • The projection of \(\textit{eh}\) on \(P_i\) is exactly \(\textit{call}(i,M_i,\_) \cdot \textit{flushCall}(i,M_i,\_)\) for \(i\in \{1,2\}\).

  • Figure 3 shows the possible positions of flush call (\(\textit{fcal}\)) and flush return (\(\textit{fret}\)) actions in eh. Since \(M_3\) always executes lock and unlock commands before it returns, during each round of a call to \(M_3\) in \(P_3\), the flush call action must occur before the lock action (see the dashed vertical lines in Fig. 3); hence it can only occur before the return action of \(M_3\). During each round of a call to \(M_3\) in \(P_3\), the flush return action may occur alternatively at two positions: the first position is after the return action of \(M_3\) and before the next round of a call action of \(M_3\), as shown by the position of \(\textit{fret}_1\) in Fig. 3 (a); while the second one is after the next round of a call action of \(M_3\) and before the consequent flush call action, as shown by the position of \(\textit{fret}_1\) in Fig. 3 (b).

Fig. 3

Possible positions of flush call and flush return actions. a \(\textit{fret}_1\) occurs between \({ ret}_{1}\) and \({ cal}_{2}\), b \(\textit{fret}_1\) occurs between \({ cal}_{2}\) and \({ fcal}_{2}\)

To prove that the history inclusion and the extended history inclusion coincide between libraries \(\mathcal {L}_{\mathcal {S},q_1,W_1}\) and \(\mathcal {L}_{\mathcal {S},q_2,W_2}\), we need to show that for an extended history \(\textit{eh}_1\) of \(\llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}}\), if \(\textit{eh}_1\) contains a return action in \(P_3\) and \(\textit{eh}_1 \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )} \in \textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\), then \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\). Because \(\textit{eh}_1 \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )}\) is a history of \(\llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}}\), there exists a path \(p_{\mathcal {L}}'\) of \(\llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}}\) corresponding to \(\textit{eh}_1\). From \(p_{\mathcal {L}}'\) we can generate another path \(p_{\mathcal {L}}\) of \(\llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}}\) such that the extended history along \(p_{\mathcal {L}}\) is exactly \(\textit{eh}_1\).

The path \(p_{\mathcal {L}}\) is generated from \(p_{\mathcal {L}}'\) by changing the positions of the flush return actions. Recall that during each round of a call to \(M_3\), the flush return action may occur alternatively at two positions only. Since \(M_3\) does not insert any pending write action into the process \(P_3\)’s store buffer, \(p_\mathcal {L}'\) can be transformed into \(p_\mathcal {L}\) by swapping each flush return action in \(p_\mathcal {L}'\) from its current position to the other possible one (if necessary).

An extended history is \(\textit{effective}\) if it contains at least one \(\textit{return}(\_,M_3,\_)\) action. Otherwise, it is ineffective. The following lemma formalizes the idea describe above.

Lemma 5

For a classic-lossy single-channel system \(\mathcal {S}\) and two configurations \((q_1,W_1),\) \((q_2,W_2)\) \(\in \) \(\textit{Conf}_{\textit{cs}}\), if \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} )\) is an effective extended history and \(\textit{eh}_1 \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{ret} )}\) \(\in \textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} ) \), then \(\textit{eh}_1\) \(\in \) \(\textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\).

With the help of Lemma 5, we can prove that the history inclusion and the extended history inclusion between the specific libraries coincide on the TSO memory model.

Lemma 6

For two configurations \((q_1,W_1),(q_2,W_2)\) of a classic-lossy single-channel system \(\mathcal {S}\), \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} )\) \(\subseteq \) \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\) if and only if \(\textit{ehistory}\) \(( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} )\) \(\subseteq \) \(\textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\).

Proof

The \(\textit{if}\) direction is obvious.

The only if direction can be proved by contradiction. Assume there is an extended history \(\textit{eh}_1\) such that \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} )\) but \(\textit{eh}_1 \notin \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\).

It can be seen that the sets of the ineffective extended histories of \(\mathcal {L}_{\mathcal {S},q_1,W_1}\) and \(\mathcal {L}_{\mathcal {S},q_2,W_2}\) are the same. By assumption, \(\textit{eh}_1\) is not an ineffective extended history of \(\mathcal {L}_{\mathcal {S},q_2,W_2}\), so \(\textit{eh}_1\) must be an effective extended history of \(\mathcal {L}_{\mathcal {S},q_1,W_1}\).

Let history \(h=\textit{eh}_1 \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )}\). It is obvious that \(h \in \textit{history}(\llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} )\). Then, by assumption, \(h \in \textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\). By Lemma 5, \(\textit{eh}_1 \in \textit{ehistory}(\) \( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3\) \(\rrbracket _{\textit{te}} )\), which contradicts the assumption. \(\square \)

The undecidability of TSO-to-TSO linearizability for a bounded number of processes is a direct consequence of Lemmas 1, 4 and 6.

Theorem 1

For any two concurrent libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\), it is undecidable whether \(\mathcal {L}_2\) TSO-to-TSO linearizes \(\mathcal {L}_1\) for a bounded number of processes.

5.3 Undecidability of all variants of history inclusion problems

In this subsection we show that all variants of history inclusion problems are undecidable on TSO for a bounded number of processes.

By constructing a close connection between the conservative paths of \(\mathcal {CL}(\mathcal {S})\) and return sequences, as in the undecidability proof of Lemma 4, it is not hard to prove that \((\textit{ret})\)-history inclusion problem is also undecidable on TSO for a bounded number of processes.

Recall that flush call action has fixed positions, flush return action has two possible positions and can be swapped from one position to another position if necessary. Therefore, it can be similarly proved as in Theorem 1 that, the \((\textit{fcal},\textit{ret})\)/ \((\textit{cal},\textit{fcal},\textit{ret})\)/ \((\textit{fret})\)/ \((\textit{cal},\textit{fret})\)/ \((\textit{fcal},\textit{fret})\)/ \((\textit{cal},\textit{fcal},\textit{fret})\)/ \((\textit{ret},\textit{fret})\)/ \((\textit{cal},\textit{ret},\textit{fret})\)/ \((\textit{fcal},\textit{ret},\textit{fret})\)-history inclusion problems are all undecidable on TSO for a bounded number of processes.

To prove that other variants of history inclusion problems to be undecidable, we slightly modify the library \(\mathcal {L}_{\mathcal {S},q,W}\) into library \(\mathcal {L}_{\mathcal {S},q,W}^c\). The only difference between \(\mathcal {L}_{\mathcal {S},q,W}^c\) and \(\mathcal {L}_{\mathcal {S},q,W}\) is method \(M_3\), and the pseudo-code of method \(M_3\) of \(\mathcal {L}_{\mathcal {S},q,W}^c\) is shown in Method 4. \(M_3\) of \(\mathcal {L}_{\mathcal {S},q,W}^c\) needs to ensure that if \(M_3\) returns, then its argument and return value must equal to one step of transitions of \(\mathcal {S}\).

According to the construction of \(\mathcal {L}_{\mathcal {S},q,W}^c\), it is obvious that if \(\alpha _1 \cdot \ldots \cdot \alpha _k\) is a trace of \(\mathcal {CL}(\mathcal {S})\) from (qW), then there must be a trace \(t \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S}, q,W}^c,3 \rrbracket _{\textit{te}} )\), such that t contains \(k+1\) call actions of \(M_3\), and the first k arguments of \(M_3\) are \(\alpha _1, \ldots , \alpha _k\), while the last argument of \(M_3\) is irrelevant. It is not hard to construct a close connection between the conservative paths of \(\mathcal {CL}(\mathcal {S})\) and sequences of call actions (except the last one), and this also holds for flush call actions. Therefore, it is not hard to prove that, the \((\textit{cal})\)/ \((\textit{fcal})\)/ \((\textit{cal},\textit{fcal})\)-history inclusion problems are undecidable on TSO for a bounded number of processes.

The following theorem states that all variants of history inclusion problems are undecidable on TSO for a bounded number of processes.

Theorem 2

On TSO memory model, all variants of history inclusion problem are undecidable for a bounded number of processes.

Similarly, method \(M_3\) of \(\mathcal {L}_{\mathcal {S},q,W}^c\) can use \(\textit{cas}\) commands instead of lock and unlock commands. Therefore, on other relaxed memory models that are weaker than TSO and also provides TSO behaviors for write, read and \(\textit{cas}\) commands , the \((\textit{cal})\)/ \((\textit{ret})\)/ \((\textit{cal},\textit{ret})\)-history inclusion problems are still undecidable for a bounded number of processes.

6 Conclusion and future work

We have shown that the decision problem of TSO-to-TSO linearizability is undecidable for a bounded number of processes. The proof method is essentially by a reduction from a known undecidable problem, the trace inclusion problem of a classic-lossy single-channel system. To facilitate such a reduction, we introduced an intermediate notion of history inclusion between concurrent libraries on the TSO memory model. We then demonstrated that a configuration (qW) of a classic-lossy single-channel system \(\mathcal {S}\) can be simulated by a specific library \(\mathcal {L}_{ \mathcal {S} ,q,W}\), interacting with three specific processes on the TSO memory model. Although history inclusion does not coincide with extended history inclusion in general, they do coincide on a restricted class of libraries. We prove that \(\mathcal {L}_{ \mathcal {S} ,q,W}\) lies within such class. Finally, our undecidability result follows from the equivalence between extended history inclusion and TSO-to-TSO linearizability.

The problem of the linearizability between libraries on the SC memory model [11] can be shown to be decidable for a bounded number of processes. This is due to the provable equivalence between history inclusion and linearizability on the SC memory model, while the former is decidable. Thus, our work states clearly a boundary of decidability for linearizability of concurrent libraries on various memory models. As by-product of this work, we prove that all variants of history inclusion problems are undecidable on TSO for a bounded number of processes. This reveals that the undecidability of TSO-to-TSO linearizability comes from the unbounded size of processor-local store buffer, instead of which actions are chosen.

Other relaxed memory models, such as the memory models of POWER and ARM, are much weaker than the TSO memory model. We conjecture that variants of linearizability on these relaxed memory models may also be reduced to some new forms of extended history inclusion, similar to the variants of linearizability for C/C\(++\) memory model in [3], and these variants should also be undecidable. However, the decision problem of TSO-to-SC linearizability, which amounts to checking whether histories of a library on the TSO memory model belong to a regular language, still remains open. For concurrent programs using write, read and \(\textit{cas}\) commands but not call and return actions, Atig et al. proved in [2] that the reachability problem between any two configurations is decidable for a bounded number of processes on the TSO memory model. However, this reachability problem turns much more complex when call and return actions are involved. In [21], we have already proved that if the number of call and return actions is bounded in a history, the decision problem of TSO-to-SC linearizability is decidable for a bounded number of processes. As future work, we would like to further investigate the decidability of TSO-to-SC linearizability and other variants of linearizability for relaxed memory models.

Footnotes

  1. 1.

    t” represents TSO memory model. “e” represents that the operational semantics in this paper extends standard TSO operational semantics [17] similarly as [8].

References

  1. 1.
    Alur, R., McMillan, K., Peled, D.: Model-checking of correctness conditions for concurrent objects. In: LICS 1996, pp. 219–228. IEEE Computer Society (1996)Google Scholar
  2. 2.
    Atig, M.F., Bouajjani, A., Burckhardt, S., Musuvathi, M.: On the verification problem for weak memory models. In: Hermenegildo, M.V., Palsberg, J. (eds.) POPL 2010, pp. 7–18. ACM (2010)Google Scholar
  3. 3.
    Batty, M., Dodds, M., Gotsman, A.: Library abstraction for C/C++ concurrency. In: Giacobazzi, R., Cousot, R. (eds.) POPL 2013, pp. 235–248. ACM (2013)Google Scholar
  4. 4.
    Bovet, D., Cesati, M.: Understanding the Linux Kernel, 3rd edn. O’Reilly, Sebastopol (2005)Google Scholar
  5. 5.
    Batty, M., Owens, S., Sarkar, S., Sewell, P., Weber, T.: Mathematizing C++ concurrency. In: Ball, T., Sagiv, M. (eds.) POPL 2011, pp. 55–66. ACM (2011)Google Scholar
  6. 6.
    Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: Verifying concurrent programs against sequential specifications. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013, pp. 290–309. Springer (2013)Google Scholar
  7. 7.
    Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: Tractable refinement checking for concurrent objects. In: Rajamani, S.K., Walker, D. (eds.) POPL 2015, pp. 651–662. ACM (2015)CrossRefGoogle Scholar
  8. 8.
    Burckhardt, S., Gotsman, A., Musuvathi, M., Yang, H.: Concurrent library correctness on the TSO memory model. In: Seidl, H. (eds.) ESOP 2012, pp. 87–107. Springer (2012)Google Scholar
  9. 9.
    Derrick, J., Smith, G., Groves, L., Dongol, B.: Using coarse-grained abstractions to verify linearizability on TSO. In: Yahav, E. (eds.) HVC 2014, pp. 1–16. Springer (2014)Google Scholar
  10. 10.
    Derrick, J., Smith, G., Dongol, B.: Verifying linearizability on TSO architectures. In: Albert, E., Sekerinski, E. (eds.) IFM 2014, pp. 341–356. Springer (2014)Google Scholar
  11. 11.
    Filipovic, I., O’Hearn, P., Rinetzky, N., Yang, H.: Abstraction for concurrent objects. In: Castagna, G. (eds.) ESOP 2009, pp. 252–266. Springer (2009)Google Scholar
  12. 12.
    Gotsman, A., Musuvathi, M., Yang, H.: Show no weakness: sequentially consistent specifications of TSO libraries. In: Aguilera, M.K. (eds.) DISC 2012, pp. 31–45. Springer (2012)Google Scholar
  13. 13.
    Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)CrossRefGoogle Scholar
  14. 14.
    Lamport, L.: How to make a multiprocessor computer that correctly executes multiprocess program. IEEE Trans. Comput. 28(9), 690–691 (1979)CrossRefGoogle Scholar
  15. 15.
    Liu, Y., Chen, W., Liu, Y.A., Sun, J., Zhang, S.J., Dong, J.S.: Verifying linearizability via optimized refinement checking. IEEE Trans. Softw. Eng. 39(7), 1018–1039 (2013)CrossRefGoogle Scholar
  16. 16.
    Manson, J., Pugh, W., Adve, S.V.: The Java memory model. In: Palsberg, J., Abadi, M. (eds.) POPL 2005, pp. 378–391. ACM (2005)Google Scholar
  17. 17.
    Owens, S., Sarkar, S., Sewell, P.: A better x86 memory model: x86-TSO. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009, pp. 391–407. Springer (2009)Google Scholar
  18. 18.
    Sarkar, S., Sewell, P., Alglave, J., Maranget, L., Williams, D.: Understanding POWER multiprocessors. In: Hall, M. W., Padua, D.A. (eds.) PLDI 2011, pp. 175–186. ACM (2011)Google Scholar
  19. 19.
    Schnoebelen, P.: Bisimulation and other undecidable equivalences for lossy channel systems. In: Kobayashi, N., Pierce, B.C. (eds.) TACS 2001, pp. 385–399. Springer (2001)Google Scholar
  20. 20.
    Vechev, M.T., Yahav, E., Yorsh, G.: Experience with model checking linearizability. In: Pasareanu, C.S. (ed.) SPIN 2009, pp. 261–278. Springer (2009)Google Scholar
  21. 21.
    Wang, C., Lv, Y., Wu, P.: Bounded TSO-to-SC linearizability is decidable. In: Freivalds, M.R., Engels, G., Catania, B. (eds.) SOFSEM 2016, pp. 404–417. Springer (2016)Google Scholar
  22. 22.
    Wang, C., Lv, Y., Wu, P.: TSO-to-TSO linearizability is undecidable. In: Finkbeiner, B., Pu, G., Zhang, L. (eds.) ATVA 2015, pp. 309–325. Springer (2015)Google Scholar

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. 1.State Key Laboratory of Computer Science, Institute of SoftwareChinese Academy of SciencesBeijingChina
  2. 2.University of Chinese Academy of SciencesBeijingChina

Personalised recommendations