TSOtoTSO linearizability is undecidable
Abstract
TSOtoTSO linearizability is a variant of linearizability for concurrent libraries on the total store order (TSO) memory model. In this paper we prove that TSOtoTSO linearizability for a bounded number of processes is undecidable. We first show that the trace inclusion problem of a classiclossy singlechannel system, which is known to be undecidable, can be reduced to the history inclusion problem of specific libraries on the TSO memory model. Based on the equivalence between history inclusion and extended history inclusion for these libraries, we then prove that the extended history inclusion problem of libraries is undecidable on the TSO memory model. Using extended history inclusion as an equivalent characterization of TSOtoTSO linearizability, we finally prove that TSOtoTSO linearizability is undecidable for a bounded number of processes. Additionally, we prove that all variants of history inclusion problems are undecidable on TSO for a bounded number of processes.
1 Introduction
Libraries of high performance concurrent data structures have been widely used in concurrent programs to take advantage of multicore architectures, such as java.util.concurrent for Java and std::thread for C\(++\)11. It is important but notoriously difficult to ensure that concurrent libraries are designed and implemented correctly. Linearizability [13] is accepted as a de facto correctness condition for a concurrent library with respect to its sequential specification on the sequential consistency (SC) memory model [14]. Intuitively, linearizability provides the vision that every individual operation appears to take place instantaneously at some point between its invocation and return. It is well known that on the SC memory model linearizability of a concurrent library is decidable for a bounded number of processes [1], but undecidable for an unbounded number of processes [6].
However, modern multiprocessors (e.g., x86 [17], POWER [18]) and programming languages (e.g., C/C\(++\) [5], Java [16]) do not comply with the SC memory model. As a matter of fact, they provide relaxed memory models, which allow subtle behaviors due to hardware or compiler optimization. For instance, in a multiprocessor system implementing the total store order (TSO) memory model [17], each processor is equipped with a FIFO store buffer. Any write action performed by a processor is put into its local store buffer first and can then be flushed into the main memory at any time.
The notion of linearizability has been extended for relaxed memory models, e.g., TSOtoTSO linearizability [8] and TSOtoSC linearizability [12] for the TSO memory model and two variants of linearizability [3] for the C\(++\) memory model. These notions generalize the original one by relating concurrent libraries with their abstract implementations, in the way shown in [11] for the SC memory model. It is worth mentioning that these notions of linearizability satisfy the abstraction theorem [3, 8, 12]: if a library is linearizable with respect to its abstract implementation, every observable behavior of any client program using the former can be observed when the program uses the latter instead. Concurrent software developers can benefit from this correspondence in that the library can be safely replaced with its abstract implementation for the sake of optimization or the ease of verification of the client program.
The decision problems for linearizability on relaxed memory models become more complicated. Because of the hierarchy of memory models, it is rather trivial to see that linearizability on relaxed memory models is undecidable for an unbounded number of processes, based on the known undecidability result on the SC memory model [6]. But the decision problem of linearizability on relaxed memory models remains open for a bounded number of processes.
In this paper we mainly study the decision problem for the TSOtoTSO linearizability of concurrent libraries within a bounded number of processes. TSOtoTSO linearizability is the first definition of linearizability on relaxed memory models. It relates a library running on the TSO memory model to its abstract implementation, which also runs on the TSO memory model. The standard notion of linearizability is typically concerned with histories of method invocations and responses. For TSOtoTSO linearizability, such histories have to be extended to reflect the interactions between concurrent libraries and processorlocal store buffers.
The main result of this paper is that TSOtoTSO linearizability is undecidable for a bounded number of processes. We first show that the extended history inclusion is an equivalent characterization of TSOtoTSO linearizability. Then, we prove our undecidability result by reducing the trace inclusion problem between any two configurations of a classiclossy singlechannel system to the extended history inclusion problem between two specific libraries. Recall that the trace inclusion problem between configurations of a classiclossy singlechannel system is undecidable [19]. The reduction is achieved by using as a bridge the history inclusion between these two specific libraries.
Technically, we present a library template that can be instantiated as a specific library for a configuration of a classiclossy singlechannel system. The library is designed with three methods \(M_i\) for \(1\le i\le 3\). We use two processes \(P_1\) and \(P_2\), calling methods \(M_1\) and \(M_2\), respectively, to simulate the traces of the classiclossy singlechannel system starting from the given configuration. This is based on the observation that on the TSO memory model, a process may miss updates by other processes because multiple flush actions may occur between consecutive read actions of the process [2]. However, a channel system always accesses the content of a channel in a FIFO manner, whereas a process on the TSO memory model always reads the latest update in its local store buffer (whenever possible). Therefore, processes \(P_1\) and \(P_2\) alternately update their own store buffers, but read only from each other’s store buffer. In this way, the labeled transitions of the classiclossy singlechannel system can be reproduced through the interactions between processes \(P_1\) and \(P_2\). Furthermore, we use the third process \(P_3\), calling method \(M_3\) repeatedly, to return each fired transition label repeatedly, so that the traces of the classiclossy singlechannel system starting from a given configuration can be mimicked exactly by the histories of the library. In particular, methods \(M_1\) and \(M_2\) never return, while method \(M_3\) just uses an atomic write action to return labels in order not to touch process \(P_3\)’s store buffer. Consequently, we can easily establish the equivalence between the history inclusion and the extended history inclusion between the specific libraries.
By constructing two specific libraries based on the above library template, we show that the trace inclusion problem between any two configurations of a classiclossy singlechannel system can be reduced to the history inclusion problem between the corresponding two concurrent libraries, while the history inclusion relation and the extended history inclusion relation are equivalent between these two libraries. Then, the undecidability result of TSOtoTSO linearizability for a bounded number of processes follows from its equivalent characterization and the undecidability result of classiclossy singlechannel system. To the best of our knowledge, this is the first result on the decidability of linearizability of concurrent libraries on relaxed memory models.
Apart from histories and extended histories, there are other forms of sequences that are used to represent behaviors of libraries. For example, in [9, 10] the behavior of concurrent libraries on TSO is essentially recorded by sequences of call and flush return actions. Based on this variant of history, the authors propose TSOlinearizability, a variant of linearizability without an abstraction theorem, as a correctness condition. To deal with various possible forms of histories, we also consider variants of histories. As a byproduct of our work, we prove that all variants of history inclusion problems, including the history inclusion problem and the extended history inclusion problem, are undecidable on TSO for a bounded number of processes. Some variants can be proved undecidable in the same way as the history inclusion problem and the extended history inclusion problem. To deal with the other variants, we slightly modify the specific libraries so that the traces of the classiclossy singlechannel system can be mimicked by sequences of call actions.
Related work Efforts have been devoted to the decidability and model checking of linearizability on the SC memory model [1, 6, 7, 15, 20]. The principle of our equivalent characterization for TSOtoTSO linearizability is similar to that of the characterization given by Bouajjani et al. in [7], where history inclusion is proved to be an equivalent characterization of linearizability. Alur et al. proved that for a bounded number of processes, checking whether a regular set of histories is linearizable with respect to its regular sequential specification can be reduced to a history inclusion problem, and hence is decidable [1]. Bouajjani et al. proved that the problem of whether a library is linearizable with respect to its regular sequential specification for an unbounded number of processes is undecidable, by a reduction from the reachability problem of a counter machine (which is known to be undecidable) [6].
On the other hand, the decidability of linearizability on relaxed memory models is still open for a bounded number of processes. The closest work to ours is [2] by Atig et al., where a lossy channel system is simulated by a concurrent program on the TSO memory model. Our approach of using methods \(M_1\) and \(M_2\) to simulate a classiclossy singlechannel system is inspired by their work. However, in [2], it was the decidable reachability problem of the channel system that was reduced to the reachability problem of the concurrent program on the TSO memory model. Hence, only the start and end configurations of the channel system are needed in their reduction. In this paper, we reduce the trace inclusion problem between any two configurations of a classiclossy singlechannel system, which is undecidable, to the TSOtoTSO linearizability problem. Our reduction needs to show exactly each step of transitions in the channel system.
Paper outline We give the definitions of libraries and concurrent systems, together with their operational semantics, in Sect. 2. We introduce the definition of TSOtoTSO linearizability and variants of histories, and prove that extended history inclusion is an equivalent characterization of TSOtoTSO linearizability in Sect. 3. In Sect. 4, we present how to generate specific libraries to mimic behaviors of classiclossy singlechannel systems. We prove in Sect. 5 that TSOtoTSO linearizability and all variants of history inclusion problems are undecidable on TSO for a bounded number of processes. We conclude in Sect. 6.
Differences from the conference paper
This article is an extended version of our ATVA’15 conference paper [22], containing all the proofs of the lemmas and theorems mentioned in the paper. Since the conference, we have also extended our undecidability result from the history inclusion problem and the extended history inclusion problem to all variants of history inclusion problems.
2 TSO concurrent systems
In this section, we first present the notations of libraries, the most general clients and TSO concurrent systems. Then, we introduce their operational semantics on the TSO memory model.
2.1 Notations
In general, a finite sequence on an alphabet \(\Sigma \) is denoted \(l=\alpha _1 \cdot \alpha _2 \cdot \ldots \cdot \alpha _k\), where \(\cdot \) is the concatenation symbol and \(\alpha _i\in \Sigma \) for each \(1\le i\le k\). Let \(\vert l \vert \) denote the length of l, i.e., \(\vert l \vert =k\), and l(i) denote the ith element of l for \(1 \le i \le k\), i.e., \(l(i)=\alpha _i\). For an alphabet \(\Sigma '\), let \(l \uparrow _{\Sigma '}\) denote the projection of l to \(\Sigma '\). Given a function f, let f[x : y] be the function that shares the same value as f everywhere, except for x, where it has the value y. We use \(\_\) for an item, of which the value is irrelevant.
A labelled transition system (LTS) is a tuple \(\mathcal {A}=(Q,\Sigma ,\rightarrow ,q_0)\), where Q is a set of states, \(\Sigma \) is a set of transition labels, \(\rightarrow \subseteq Q\times \Sigma \times Q\) is a transition relation and \(q_0\) is the initial state. A state of the LTS \(\mathcal {A}\) may be referred to as a configuration in the rest of the paper.
A path of \(\mathcal {A}\) is a finite transition sequence \(q_1\xrightarrow {\beta _1}q_2\overset{\beta _2}{\longrightarrow }\cdots \overset{\beta _k}{\longrightarrow }q_{k+1}\) for \(k\ge 0\). A trace of \(\mathcal {A}\) is a finite sequence \(t= \beta _1 \cdot \beta _2 \cdot \ldots \cdot \beta _k\), where \(k \ge 0\) if there exists a path \(q_1\overset{\beta _1}{\longrightarrow }q_2\overset{\beta _2}{\longrightarrow }\cdots \overset{\beta _k}{\longrightarrow }q_{k+1}\) of \(\mathcal {A}\). Let \(\textit{path}(\mathcal {A},q)\) and \(\textit{trace}(\mathcal {A},q)\) denote all the paths and traces of \(\mathcal {A}\) that start from q, respectively. We write \(\textit{path}(\mathcal {A})\) and \(\textit{trace}(\mathcal {A})\) for short if \(q=q_0\).
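As an illustration of these definitions, the following Python sketch enumerates the traces of a small finite LTS up to a length bound. The example system `EXAMPLE`, the function name `traces_from` and the bound `max_len` are assumptions made for illustration only and do not appear in the paper.

```python
from typing import Hashable, Set, Tuple

Transition = Tuple[Hashable, str, Hashable]  # (source state, label, target state)


def traces_from(transitions: Set[Transition], q: Hashable, max_len: int) -> Set[Tuple[str, ...]]:
    """Return all traces of length <= max_len of paths that start in state q."""
    result = {()}  # the empty trace corresponds to the path with k = 0
    frontier = {((), q)}  # pairs (trace so far, current state)
    for _ in range(max_len):
        next_frontier = set()
        for trace, state in frontier:
            for (src, label, dst) in transitions:
                if src == state:
                    next_frontier.add((trace + (label,), dst))
        result |= {trace for trace, _ in next_frontier}
        frontier = next_frontier
    return result


# Hypothetical LTS with states q0, q1 and labels a, b.
EXAMPLE = {("q0", "a", "q1"), ("q1", "b", "q0"), ("q0", "b", "q0")}
```

Calling `traces_from(EXAMPLE, "q0", 2)` plays the role of \(\textit{trace}(\mathcal {A},q_0)\) restricted to traces of length at most two.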
2.2 Libraries and the most general clients
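A library implementing a concurrent data structure provides a set of methods for external users to access the data structure. It may contain private memory locations for its own use. A client program is a program that interacts with libraries. For simplicity, we assume that each method has just one parameter and one return value if it returns. Furthermore, all the parameters and the return values are passed via a special register \(r_f\).

Let \(\mathcal {X}\) denote the set of memory locations, \(\mathcal {M}\) the set of method names, \(\mathcal {D}\) the data domain, \(\mathcal {R}\) the set of register names and \(\mathcal {RE}\) the set of register expressions. The set \(\textit{PCom}\) of primitive commands consists of the following forms: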

Register assign commands in the form of \( r_1 = \textit{re} \);

Register reset commands in the form of \( \textit{havoc}\);

Read commands in the form of \( \textit{read}(x,r_1) \);

Write commands in the form of \( \textit{write}(r_1,x) \);

Lock commands in the form of \(\textit{lock}\);

Unlock commands in the form of \(\textit{unlock}\);

Assume commands in the form of \( \textit{assume}(r_1)\);

Call commands in the form of \( \textit{call}(m)\);
A controlflow graph is a tuple \(\textit{CFG} = (N,L,T,q_i,q_f)\), where N is a finite set of program positions, L is a set of primitive commands, \(T \subseteq N \times L \times N\) is a controlflow transition relation, \(q_i\) is the initial position and \(q_f\) is the final position.
A library \(\mathcal {L}\) can then be defined as a tuple \(\mathcal {L} =(Q_\mathcal {L},\rightarrow _\mathcal {L},\textit{InitV}_\mathcal {L})\), such that \(Q_\mathcal {L} = \bigcup _{m \in \mathcal {M}} Q_m \) is a finite set of program positions, where \(Q_m\) is the program positions of a method m of this library; \(\rightarrow _\mathcal {L} = \bigcup _{m \in \mathcal {M}} \rightarrow _m\) is a controlflow transition relation, where for each \(m \in \mathcal {M}\), \(( Q_m, \textit{PCom}, \rightarrow _m, i_m, f_m )\) is a controlflow graph with a unique initial position \(i_m\) and a unique final position \(f_m\); \(\textit{InitV}_\mathcal {L}: \mathcal {X} \rightarrow \mathcal {D}\) is an initial valuation for its memory locations.
The most general client of a library is a special client program that is used to exhibit all possible behaviors of the library. Formally, the most general client \(\mathcal {MGC}\) of library \(\mathcal {L}\) is defined as a tuple \((\{q_c,q_c'\},\rightarrow _c)\), where \(q_c\) and \(q_c'\) are two program positions, \(\rightarrow _c= \{ (q_c,\textit{havoc},q'_c ) \} \cup \{ (q'_c,\textit{call}(m),q_c) \vert m \in \mathcal {M}\} \) is a controlflow transition relation and \(( \{q_c,q'_c\}, \textit{PCom}, \rightarrow _c, q_c, q_c )\) is a controlflow graph. Intuitively, the most general client repeatedly calls an arbitrary method with an arbitrary argument for arbitrarily many times.
2.3 TSO operational semantics
Assume a concurrent system consists of n processes, each of which runs the most general client program of a library on a separate processor. Then, the operational semantics of a library can be defined in the context of the concurrent system.
For a library \(\mathcal {L}=(Q_{\mathcal {L}},\rightarrow _{\mathcal {L}},\textit{InitV}_{\mathcal {L}})\), its operational semantics on the TSO memory model is defined as an LTS \(\llbracket \mathcal {L},n \rrbracket _{\textit{te}} = (\textit{Conf}_{\textit{te}}, \Sigma _{\textit{te}}, \rightarrow _{\textit{te}}, \textit{InitConf}_{\textit{te}} )\), where \(\textit{Conf}_{\textit{te}}\), \(\Sigma _{\textit{te}}\), \(\rightarrow _{\textit{te}}\) and \(\textit{InitConf}_{\textit{te}}\) are defined as follows. Each configuration in \(\textit{Conf}_{\textit{te}}\) is a tuple (p, d, u, r, l), where:

\(p:\{1, \dots , n \} \rightarrow \{ q_c , q'_c \} \cup Q_{\mathcal {L}}\) represents control states of each process;

\(d:\mathcal {X} \rightarrow \mathcal {D}\) represents values at each memory location;

\(u:\{1, \dots , n\} \rightarrow ( \{ (x,a) \vert x \in \mathcal {X}, a \in \mathcal {D} \} \cup \{\textit{call}(m,a) \vert m \in \mathcal {M},a \in \mathcal {D} \} \cup \{\textit{return}(m,a) \vert m \in \mathcal {M},a \in \mathcal {D} \})^*\) represents contents of each processorlocal store buffer; each processorlocal store buffer may contain a finite sequence of pending write, pending call or pending return actions;

\(r:\{1,\ldots ,n\} \rightarrow (\mathcal {R} \rightarrow \mathcal {D})\) represents values of the registers of each process.

\(l \subseteq \{1,\ldots ,n\}\) contains all the processes that can currently execute commands.

The set \(\Sigma _{\textit{te}}\) of transition labels consists of the following actions:

Internal actions: \(\{ \tau (i) \vert 1 \le i \le n \}\);

Read actions: \(\{\textit{read}(i,x,a) \vert 1 \le i \le n, x \in \mathcal {X},a \in \mathcal {D} \}\);

Write actions: \(\{\textit{write}(i,x,a) \vert 1 \le i \le n, x \in \mathcal {X}, a \in \mathcal {D} \}\);

Lock actions: \(\{\textit{lock}(i) \vert \) \(1 \le i \le n \}\);

Unlock actions: \(\{\textit{unlock}(i) \vert \) \(1 \le i \le n \}\);

Flush actions: \(\{ \textit{flush}(i,\) \(x,a) \vert \ 1 \le i \le n, x \in \mathcal {X}, a \in \mathcal {D} \}\);

Call actions: \(\Sigma _{\textit{cal}}\) = \(\{\textit{call}(i,m,a) \vert 1 \le i \le n, m \in \mathcal {M}, a \in \mathcal {D} \}\);

Return actions: \(\Sigma _{\textit{ret}}\) = \(\{\textit{return}(i,m,a) \vert 1 \le i \le n, m \in \mathcal {M}, a \in \mathcal {D} \}\);

Flush call actions: \(\Sigma _{\textit{fcal}}\) = \(\{\textit{flushCall}(i,m,a) \vert 1 \le i \le n, m \in \mathcal {M}, a \in \mathcal {D} \}\);

Flush return actions: \(\Sigma _{\textit{fret}}\) = \(\{\textit{flushReturn}(i,m,a) \vert 1 \le i \le n, m \in \mathcal {M}, a \in \mathcal {D} \}\).
The initial configuration \(\textit{InitConf}_{te}\in \textit{Conf}_{te}\) is a tuple \((p_{\textit{init}}, \textit{InitV}_{\mathcal {L}}, u_{\textit{init}}, r_{\textit{init}} , l_{\textit{init}} )\), where \(p_{\textit{init}}(i)=q_c\), \(u_{\textit{init}}(i)=\epsilon \) (representing an empty buffer), \(r_{\textit{init}}(i)(r)=\textit{regV}_{\textit{init}}\) (a special initial value of a register) and \(l_{\textit{init}}=\{1,\ldots ,n\}\) for \(1\le i\le n\), \(r \in \mathcal {R}\).

The transition relation \(\rightarrow _{\textit{te}}\) is determined by the following rules:

\(\textit{RegisterAssign}\) rule: A function \(f_{\textit{re}} : (\mathcal {R} \rightarrow \mathcal {D}) \times \mathcal {RE} \rightarrow \mathcal {D}\) is used to evaluate register expression \(\textit{re}\) under register valuation \(\textit{rv}\) of current process, and its value is assigned to register \(r_1\).

\(\textit{LibraryHavoc}\) and \(\mathcal {MGC}{} \textit{Havoc}\) rules: \(\textit{havoc}\) commands are executed for libraries and the most general clients respectively.

\(\textit{Assume}\) rule: If the value of register \(r_1\) is \(\textit{true}\), current process can execute \(\textit{assume}\) command. Otherwise, it must wait.
\(\textit{Read}\) rule: A function \(\textit{lookup}(u,d,i,x)\) is used to search for the latest value of x from its processorlocal store buffer or the main memory, i.e.,
$$\begin{aligned} \textit{lookup}(u,d,i,x) = \left\{ \begin{array}{ll} a & \text {if}\quad u(i)\uparrow _{\Sigma _x}=(x,a) \cdot l \ \text {for some} \ l \in \Sigma _x^* \\ d(x) & \text {otherwise} \end{array} \right. \end{aligned}$$
where \(\Sigma _x\) = \(\{(x,a) \vert a \in \mathcal {D}\}\) is the set of pending write actions for x. A read action will take the latest value of x from its processorlocal store buffer if possible; otherwise, it looks up the value in memory.

\(\textit{Write}\) rule: A write action will insert a pair of a location and a value to the tail of its processorlocal store buffer.

\(\textit{Lock}\) and \(\textit{unlock}\) rules: A process can perform lock or unlock commands only when its processorlocal store buffer is empty. A process executing lock makes itself the only active process and prevents other processes from executing commands. After a process executes an unlock command, the other processes become active again. Thus, the commands executed from lock to unlock are not interleaved with commands of other processes.

\(\textit{Flush}\) rule: The memory system may decide to flush the entry at the head of a processorlocal store buffer to memory at any time.

\(\textit{Call}\) and \(\textit{return}\) rules: To deal with \(\textit{call}\) command, a call marker is added into the tail of processorlocal store buffer and current process starts to execute the initial position of method m. When the process comes to the final position of method m it can launch a \(\textit{return}\) action, add a return marker to the tail of processorlocal store buffer and start to execute the most general client.

\(\textit{FlushCall}\) and \(\textit{FlushReturn}\) rules: The call and return marker can be discarded when they are at the head of processorlocal store buffer. Such actions are used to define TSOtoTSO linearizability only.

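For example, the lock and unlock commands can express an atomic compare-and-swap that changes x from a to b:

lock;

if (x==a) {x=b; unlock; return 1;}

else {unlock; return 0;}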
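The Write, Flush and Read rules of this section can be made concrete with a minimal Python sketch of memory plus per-process store buffers. The class `TSOMemory` and its method names are hypothetical. The sketch models only pending writes (no call or return markers, no locks), numbers processes from 0, and assumes that the tail of a buffer holds the most recent write, as stated in the Write rule.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class TSOMemory:
    """Toy model: shared memory d plus one FIFO store buffer u[i] per process."""

    def __init__(self, n: int, init: Dict[str, object]) -> None:
        self.d: Dict[str, object] = dict(init)
        self.u: List[Deque[Tuple[str, object]]] = [deque() for _ in range(n)]

    def write(self, i: int, x: str, a: object) -> None:
        # Write rule: append the pending write (x, a) at the tail of buffer i.
        self.u[i].append((x, a))

    def flush(self, i: int) -> None:
        # Flush rule: move the entry at the head of buffer i to main memory.
        x, a = self.u[i].popleft()
        self.d[x] = a

    def read(self, i: int, x: str) -> object:
        # Read rule: take the most recent pending write to x in buffer i,
        # otherwise fall back to main memory.
        for loc, val in reversed(self.u[i]):
            if loc == x:
                return val
        return self.d[x]
```

With two processes and x initially 0, a write of 1 by process 0 is visible to process 0 immediately, but process 1 still reads 0 until the write is flushed.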
3 TSOtoTSO linearizability and equivalent characterization
In this section we introduce the definition of TSOtoTSO linearizability and then prove that it can be equivalently characterized by extended history inclusion. We also give definitions of variants of histories.
3.1 TSOtoTSO linearizability
The behavior of a library is typically represented by histories of interactions between the library and the clients calling it (through call and return actions). A finite sequence \(h\in (\Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}})^*\) is a history of an LTS \(\mathcal {A}\) if there exists a trace t of \(\mathcal {A}\) such that \(t \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )} = h\). Let \(\textit{history}(\mathcal {A})\) denote all the histories of \(\mathcal {A}\).
TSOtoTSO linearizability is a variant of linearizability on the TSO memory model. It additionally concerns the behavior of a library in the context of processorlocal store buffers, i.e., the interactions between the library and store buffers through flush call and flush return actions. A finite sequence \(eh\in (\Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}}\cup \Sigma _{\textit{fcal}}\cup \Sigma _{\textit{fret}})^*\) is an extended history of an LTS \(\mathcal {A}\) if there exists a trace t of \(\mathcal {A}\) such that \(t \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}}\cup \Sigma _{\textit{fcal}}\cup \Sigma _{\textit{fret}} )} = eh\). Let \(\textit{ehistory}(\mathcal {A})\) denote all the extended histories of \(\mathcal {A}\), and \(\textit{eh} \vert _{\textit{i}} \) the projection of \(\textit{eh}\) to the actions of the ith process. Two extended histories \(\textit{eh}_1\) and \(\textit{eh}_2\) are equivalent, if for each \(1\le i\le n\), \(\textit{eh}_1 \vert _i = \textit{eh}_2 \vert _i\).
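Definition 1

An extended history \(\textit{eh}_1\) is TSOtoTSO linearizable to an extended history \(\textit{eh}_2\) if:

\(\textit{eh}_1\) and \(\textit{eh}_2\) are equivalent;

there is a bijection \(\pi : \{ 1,\dots ,\vert \textit{eh}_1 \vert \} \rightarrow \{ 1,\dots ,\vert \textit{eh}_2 \vert \}\) such that for any \(1\le i\le \vert \textit{eh}_1 \vert \), \(\textit{eh}_1(i)=\textit{eh}_2(\pi (i))\);

for any \(1\le i<j\le \vert \textit{eh}_1 \vert \), if \( (\textit{eh}_1(i) \in \Sigma _{\textit{ret}} \cup \Sigma _{\textit{fret}} ) \wedge (\textit{eh}_1(j) \in \Sigma _{\textit{cal}} \cup \Sigma _{\textit{fcal}})\), then \(\pi (i) < \pi (j)\).

A library \(\mathcal {L}_2\) TSOtoTSO linearizes a library \(\mathcal {L}_1\) if for every \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_1,n \rrbracket _{\textit{te}} )\) there exists \(\textit{eh}_2 \in \textit{ehistory}( \llbracket \mathcal {L}_2,n \rrbracket _{\textit{te}} )\) such that \(\textit{eh}_1\) is TSOtoTSO linearizable to \(\textit{eh}_2\).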
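The conditions of Definition 1 can be checked mechanically on two concrete extended histories. The Python sketch below is an illustration under stated assumptions: actions are encoded as tuples `(kind, process, method, value)` with `kind` in `cal`, `ret`, `fcal`, `fret`, the function names are hypothetical, and the bijection is fixed to the one that maps the k-th occurrence of an action in the first history to its k-th occurrence in the second (for histories without repeated actions this is the only possible bijection).

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Action = Tuple[str, int, str, object]  # (kind, process, method, value)
RET_LIKE = {"ret", "fret"}
CAL_LIKE = {"cal", "fcal"}


def project(eh: Sequence[Action], i: int) -> List[Action]:
    """Projection of eh to the actions of process i."""
    return [a for a in eh if a[1] == i]


def is_linearizable_to(eh1: Sequence[Action], eh2: Sequence[Action]) -> bool:
    """Check the three conditions of Definition 1 for eh1 and eh2."""
    procs = {a[1] for a in eh1} | {a[1] for a in eh2}
    # Condition 1: the per-process projections coincide.
    if any(project(eh1, i) != project(eh2, i) for i in procs):
        return False
    # Condition 2: map the k-th occurrence of an action in eh1
    # to its k-th occurrence in eh2 (assumed choice of bijection).
    positions: Dict[Action, List[int]] = defaultdict(list)
    for pos, a in enumerate(eh2):
        positions[a].append(pos)
    seen: Dict[Action, int] = defaultdict(int)
    pi = []
    for a in eh1:
        pi.append(positions[a][seen[a]])
        seen[a] += 1
    # Condition 3: a return-like action that precedes a call-like action
    # in eh1 must still precede it in eh2.
    for i in range(len(eh1)):
        for j in range(i + 1, len(eh1)):
            if eh1[i][0] in RET_LIKE and eh1[j][0] in CAL_LIKE and pi[i] > pi[j]:
                return False
    return True
```

For instance, a history in which two calls overlap is linearizable to a history that runs them one after the other, whereas two non-overlapping calls cannot be reordered.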
Informally speaking, if \(\textit{eh}_1\) is TSOtoTSO linearizable to \(\textit{eh}_2\), then \(\textit{eh}_2\) keeps all the nonoverlapping pairs of call/flush call and return/flush return actions in \(\textit{eh}_1\) in the same order. It is proved in [8] that TSOtoTSO linearizability satisfies a socalled abstraction theorem. Therefore, if \(\mathcal {L}_2\) TSOtoTSO linearizes \(\mathcal {L}_1\), then it is safe to replace \(\mathcal {L}_1\) with \(\mathcal {L}_2\), and this will not introduce any new behaviors in the view of client programs.
The following is an example, taken from [8], of an implementation library and its specification library for TSOtoTSO linearizability. The implementation library, a spinlock, is a software lock used in various versions of the Linux kernel [4] and is shown in (a):
Here, the write to x in the release method of the abstraction library can also be delayed in the store buffer. It can be seen that the resulting specification still guarantees mutual exclusion. It has been proved in [8] that the abstraction library TSOtoTSO linearizes the implementation library.
Apart from histories and extended histories, there are other forms of sequences that are used to represent behaviors of libraries. For example, in [9, 10] the behavior of a method starts with a call action and ends with a flush return action. In this situation the behavior of a library essentially consists of sequences of call and flush return actions. To deal with all possible variants of histories, we generalize the notion of history as follows: Let \(\textit{cal}\), \(\textit{ret}\), \(\textit{fcal}\) and \(\textit{fret}\) represent call, return, flush call and flush return actions, respectively. Given distinct \(x,y,z,w \in \{ \textit{cal}, \textit{ret}, \textit{fcal}, \textit{fret}\}\), a (x)history is a sequence of x actions, a (x, y)history is a sequence of x and y actions, a (x, y, z)history is a sequence of x, y and z actions, and a (x, y, z, w)history is a sequence of x, y, z and w actions. It is easy to see that there are fifteen variants of histories (one for each nonempty subset of the four kinds of actions), while the (standard) histories can be defined as \((\textit{cal}, \textit{ret})\)histories, and extended histories can be defined as \((\textit{cal}, \textit{ret}, \textit{fcal}, \textit{fret})\)histories.
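The count of fifteen variants can be verified by enumerating the nonempty subsets of the four kinds of actions. The short Python sketch below does this and also projects a trace to a chosen variant; the tuple encoding `(kind, process, method, value)` of actions and both function names are assumptions made for illustration.

```python
from itertools import combinations

KINDS = ("cal", "ret", "fcal", "fret")


def history_variants():
    """All nonempty subsets of action kinds; each one names a variant of history."""
    return [combo for size in range(1, len(KINDS) + 1) for combo in combinations(KINDS, size)]


def variant_history(trace, kinds):
    """Project a trace of (kind, process, method, value) actions to the chosen kinds."""
    return [action for action in trace if action[0] in kinds]
```

In this encoding, `("cal", "ret")` corresponds to standard histories and `("cal", "ret", "fcal", "fret")` to extended histories.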
3.2 Equivalence characterization
To handle the decision problem of TSOtoTSO linearizability, we show that the extended history inclusion is an equivalent characterization of TSOtoTSO linearizability. It is obvious that extended history inclusion implies TSOtoTSO linearizability. To prove the opposite direction, we need to prove that if \(\mathcal {L}_2\) TSOtoTSO linearizes \(\mathcal {L}_1\), \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_1,n \rrbracket _{\textit{te}} )\), \(\textit{eh}_2 \in \textit{ehistory}( \llbracket \mathcal {L}_2,n \rrbracket _{\textit{te}} )\) and \(\textit{eh}_1\) is TSOtoTSO linearizable to \(\textit{eh}_2\), then \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_2,n \rrbracket _{\textit{te}} )\).
A transformation \(\Rightarrow _{\textit{ER}}\) is a relation between two extended histories and is defined as follows: \(\textit{eh}_1 \Rightarrow _{\textit{ER}} \textit{eh}_2\), if \(\textit{eh}_1\) = \(l_1 \cdot \alpha \cdot \beta \cdot l_2\), \(\textit{eh}_2\) = \(l_1 \cdot \beta \cdot \alpha \cdot l_2\) and \((\alpha ,\beta )\) is neither in \((\Sigma _{\textit{ret}} \cup \Sigma _{\textit{fret}}) \times (\Sigma _{\textit{cal}} \cup \Sigma _{\textit{fcal}})\) nor a pair of actions of the same process. In other words, \(\textit{eh}_2\) can be obtained by swapping two adjacent elements of \(\textit{eh}_1\) without violating TSOtoTSO linearizability. We write \(\Rightarrow _{\textit{ER}}^*\) to denote the transitive closure of \(\Rightarrow _{\textit{ER}}\).
Given two equivalent extended histories \(\textit{eh}_1\) and \(\textit{eh}_2\), we say that \(\pi \) is their bijection, if \(\pi \) is a bijection between \(\{1, \ldots , \vert \textit{eh}_1 \vert \}\) and \(\{1, \ldots , \vert \textit{eh}_2 \vert \}\), and \(\textit{eh}_1(i)=\textit{eh}_2(\pi (i))\) for each i. We use predicate \(\textit{eWit}(\textit{eh}_1,\textit{eh}_2,i_1,i_2)\) to denote a difference between two equivalent extended histories, and \(\textit{eWit}(\textit{eh}_1,\textit{eh}_2,i_1,i_2)\) holds if \(i_1 < i_2\) and \(\pi ^{-1}(i_1) > \pi ^{-1}(i_2)\). Given two equivalent extended histories \(\textit{eh}_1\) and \(\textit{eh}_2\), a nonnegative distance function \(\textit{eWitSum}(\textit{eh}_1, \textit{eh}_2)\) is used to measure the difference between them. Formally, \(\textit{eWitSum}(\textit{eh}_1,\textit{eh}_2) = \vert \{ (m,n) \vert \textit{eWit}(\textit{eh}_1,\textit{eh}_2,m,n) \text { holds}\} \vert \).

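It can be shown that if \(\textit{eh}_1\) is TSOtoTSO linearizable to \(\textit{eh}_2\) and \(\textit{eh}_1 \ne \textit{eh}_2\), then there exists an extended history \(\textit{eh}_3\) such that:

\(\textit{eh}_1\) is TSOtoTSO linearizable to \(\textit{eh}_3\);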

\(\textit{eh}_3\) \(\Rightarrow _{\textit{ER}}\) \(\textit{eh}_2\);

\(\textit{eh}_3\) is TSOtoTSO linearizable to \(\textit{eh}_2\);

the distance between \(eh_1\) and \(eh_3\) is strictly less than the one between \(eh_1\) and \(eh_2\).
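The swap relation \(\Rightarrow _{\textit{ER}}\) and the distance \(\textit{eWitSum}\) can be illustrated on small examples. In the Python sketch below, actions are encoded as tuples `(kind, process, method, value)`, the function names `er_swap` and `ewit_sum` are hypothetical, the two histories passed to `ewit_sum` are assumed to be equivalent, and repeated actions are matched by order of occurrence (an assumption of this sketch).

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Action = Tuple[str, int, str, object]  # (kind, process, method, value)
RET_LIKE = {"ret", "fret"}
CAL_LIKE = {"cal", "fcal"}


def er_swap(eh: List[Action], k: int) -> Optional[List[Action]]:
    """Swap eh[k] and eh[k + 1] if the pair is allowed by =>_ER, else return None."""
    alpha, beta = eh[k], eh[k + 1]
    if alpha[1] == beta[1]:
        return None  # actions of the same process must keep their order
    if alpha[0] in RET_LIKE and beta[0] in CAL_LIKE:
        return None  # a return-like action may not move behind a call-like one
    return eh[:k] + [beta, alpha] + eh[k + 2:]


def ewit_sum(eh1: List[Action], eh2: List[Action]) -> int:
    """Count pairs of positions of eh2 whose order is reversed in eh1."""
    positions: Dict[Action, List[int]] = defaultdict(list)
    for pos, action in enumerate(eh1):
        positions[action].append(pos)
    seen: Dict[Action, int] = defaultdict(int)
    inverse = []  # inverse[i] plays the role of pi^{-1}(i)
    for action in eh2:
        inverse.append(positions[action][seen[action]])
        seen[action] += 1
    return sum(
        1
        for i1 in range(len(inverse))
        for i2 in range(i1 + 1, len(inverse))
        if inverse[i1] > inverse[i2]
    )
```

Each permitted swap changes the relative order of exactly one pair of actions, which is why a distance that counts reversed pairs is a natural measure of how many swaps separate two equivalent extended histories.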
Based on the two results above, it is not hard to see that TSOtoTSO linearizability implies extended history inclusion. Therefore, extended history inclusion is an equivalent characterization of TSOtoTSO linearizability, as stated in the following lemma.
Lemma 1
For any two libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\), \(\mathcal {L}_2\) TSOtoTSO linearizes \(\mathcal {L}_1\) if and only if \(\textit{ehistory}( \llbracket \mathcal {L}_1,n \rrbracket _{\textit{te}} ) \subseteq \textit{ehistory}( \llbracket \mathcal {L}_2,n \rrbracket _{\textit{te}} )\).
4 Specific libraries for classiclossy singlechannel systems
In this section, we introduce the definition of classiclossy singlechannel systems, and then show how to simulate a classiclossy singlechannel system with a concurrent library.
4.1 Classiclossy singlechannel systems
A classiclossy singlechannel system [19] is a tuple \(\mathcal {S} = (Q_{\textit{cs}},\Sigma _{\textit{cs}},\{c_{\textit{cs}}\},\Gamma _{\textit{cs}},\Delta _{\textit{cs}})\), where \(Q_{\textit{cs}}\) is a finite set of control states, \(\Sigma _{\textit{cs}}\) is a finite alphabet of messages, \(c_{\textit{cs}}\) is the name of the single channel, \(\Gamma _{\textit{cs}}\) is a finite set of transition labels and \(\Delta _{\textit{cs}} \subseteq Q_{\textit{cs}} \times \Sigma ^*_{\textit{cs}} \times \Gamma _{\textit{cs}} \times Q_{\textit{cs}} \times \Sigma ^*_{\textit{cs}}\) is a transition relation.
Given two finite sequences \(l_1=\alpha _1 \cdot \alpha _2 \cdot \ldots \cdot \alpha _u\) and \(l_2=\beta _1 \cdot \beta _2 \cdot \ldots \cdot \beta _v\), we say that \(l_1\) is a \(\textit{subword}\) of \(l_2\), denoted \(l_1 \sqsubseteq l_2\), if there exists \(1\le i_1< \cdots < i_u \le v \) such that for any \(1\le j \le u\), \(\alpha _j=\beta _{i_j}\). Then, the operational semantics of \(\mathcal {S}\) is given by an LTS \(\mathcal {CL}(\mathcal {S})\) = \((\textit{Conf}_{\textit{cs}}, \Gamma _{\textit{cs}},\rightarrow _{\textit{cs}},\textit{initConf}_{\textit{cs}})\), where \(\textit{Conf}_{\textit{cs}} = Q_{\textit{cs}} \times \Sigma ^*_{\textit{cs}}\) is a set of configurations with \(\textit{initConf}_{\textit{cs}} \in \textit{Conf}_{\textit{cs}}\) as the initial configuration. The transition relation \(\rightarrow _{\textit{cs}}\) is defined as follows: \((q_1,W_1)\) \(\overset{ \alpha }{\longrightarrow }_{\textit{cs}}\) \((q_2,W_2)\) if there exists \((q_1,U,\alpha ,q_2,V)\) \(\in \) \(\Delta _{\textit{cs}}\) and \(W' \in \Sigma _{\textit{cs}}^*\) such that \(U \cdot W' \sqsubseteq W_1\) and \(W_2 \sqsubseteq W' \cdot V\).
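The subword relation and the lossy transition relation can be tried out on small examples. The following Python sketch is a brute-force illustration under stated assumptions: messages are single characters, channel contents are strings, rules are tuples `(q1, U, label, q2, V)`, and the names `is_subword`, `subwords` and `can_step` are hypothetical. The witness \(W'\) is searched among the subwords of \(W_1\), which is exponential and only suitable for tiny inputs.

```python
from itertools import combinations
from typing import Iterable, Iterator, Sequence, Tuple

Rule = Tuple[str, str, str, str, str]  # (q1, U, label, q2, V)


def is_subword(l1: Sequence, l2: Sequence) -> bool:
    """True if l1 can be obtained from l2 by deleting elements."""
    it = iter(l2)
    return all(any(a == b for b in it) for a in l1)


def subwords(w: str) -> Iterator[str]:
    """All subwords of w (exponentially many; for tiny examples only)."""
    for size in range(len(w) + 1):
        for idx in combinations(range(len(w)), size):
            yield "".join(w[i] for i in idx)


def can_step(rules: Iterable[Rule], c1: Tuple[str, str], label: str, c2: Tuple[str, str]) -> bool:
    """Check (q1, W1) --label--> (q2, W2) by searching for a witness W'."""
    (q1, w1), (q2, w2) = c1, c2
    for (src, u, lab, dst, v) in rules:
        if (src, lab, dst) != (q1, label, q2):
            continue
        for w_prime in subwords(w1):
            # U . W' must be a subword of W1, and W2 a subword of W' . V.
            if is_subword(u + w_prime, w1) and is_subword(w2, w_prime + v):
                return True
    return False
```

With the single rule `("p", "a", "t", "q", "c")`, the configuration `("p", "ab")` can step to `("q", "bc")`, but also to `("q", "c")` or `("q", "")`, because messages may be lost both before and after the rule is applied.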
It is known that for two configurations \((q_1,W_1),(q_2,W_2) \in \textit{Conf}_{\textit{cs}}\) of a classiclossy singlechannel system \(\mathcal {S}\), the trace inclusion between \((q_1,W_1)\) and \((q_2,W_2)\) is undecidable [19].
4.2 Simulation on the TSO memory model
On the TSO memory model, flush actions are launched nondeterministically by the memory system. Therefore, between two consecutive \(\textit{read}(x,\_)\) actions, more than one flush action to x may happen. The second read action can only observe the latest flush action to x and misses the intermediate ones. These missed flush actions are similar to the messages that may be lost in a classiclossy singlechannel system. This makes it possible to simulate a classiclossy singlechannel system with a concurrent program running on the TSO memory model. We implement such a simulation through a library \(\mathcal {L}_{\mathcal {S},q,W}\) specifically constructed from a classiclossy singlechannel system \(\mathcal {S}\) and a given configuration \((q,W) \in \textit{Conf}_{\textit{cs}}\).
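The lossy effect described above can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: the class `TsoProcess` and its methods are hypothetical names, the flush schedule is fixed by hand rather than chosen nondeterministically, and `None` stands for the initial value of a location.

```python
from collections import deque


class TsoProcess:
    """A process with a FIFO store buffer, as on TSO (illustrative model only)."""

    def __init__(self, memory):
        self.memory = memory
        self.buffer = deque()

    def write(self, loc, value):
        # Writes go to the local store buffer first, not to main memory.
        self.buffer.append((loc, value))

    def flush(self):
        # The memory system moves the oldest buffered write to main memory.
        loc, value = self.buffer.popleft()
        self.memory[loc] = value

    def read(self, loc):
        # Read the newest buffered write to loc if any, otherwise main memory.
        for l, v in reversed(self.buffer):
            if l == loc:
                return v
        return self.memory[loc]


memory = {"x": None}
writer, reader = TsoProcess(memory), TsoProcess(memory)
for v in ["a", "b", "c"]:
    writer.write("x", v)

seen = []
writer.flush()                 # x = a
seen.append(reader.read("x"))
writer.flush()                 # x = b
writer.flush()                 # x = c, overwriting b before any read
seen.append(reader.read("x"))
print(seen)  # ['a', 'c']
```

The reader observes a, c, which is a subword of the written sequence a, b, c: the value b plays the role of a lost message.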
For a classiclossy singlechannel system \(\mathcal {S} = (Q_{\textit{cs}},\Sigma _{\textit{cs}},\{ c_{\textit{cs}} \},\Gamma _{\textit{cs}},\Delta _{\textit{cs}})\), assume the finite data domain \(\mathcal {D}_{\textit{cs}} = Q_{\textit{cs}} \cup \Sigma _{\textit{cs}} \cup \Delta _{\textit{cs}} \cup \{ \sharp , \textit{start}, \textit{end}, \bot ,\textit{true},\textit{false},\textit{regV}_{\textit{init}}, \textit{rule}_{f}\}\), where \(Q_{\textit{cs}} \cap \Sigma _{\textit{cs}} = \emptyset \), \(Q_{\textit{cs}} \cap \Delta _{\textit{cs}} = \emptyset \), \(\Sigma _{\textit{cs}} \cap \Delta _{\textit{cs}} = \emptyset \), and the symbols \(\sharp \), \(\textit{start}\), \(\textit{end}\), \(\bot \), \(\textit{true}\), \(\textit{false}\), \(\textit{regV}_{\textit{init}}\) and \(\textit{rule}_f\) do not occur in \(Q_{\textit{cs}} \cup \Sigma _{\textit{cs}} \cup \Delta _{\textit{cs}}\). Given a configuration \((q,W) \in \textit{Conf}_{\textit{cs}}\) of \(\mathcal {S}\), the library \(\mathcal {L}_{\mathcal {S},q,W}\) is constructed with three methods \(M_1\), \(M_2\) and \(M_3\), and three private memory locations x, y and z. Location x is used to transmit the channel contents from \(M_2\) to \(M_1\), while y is used to transmit the channel contents from \(M_1\) to \(M_2\). Location z is used to transmit the transition labels of \(\mathcal {CL}(\mathcal {S})\) from \(M_2\) to \(M_3\), and also to synchronize \(M_2\) and \(M_3\). The symbol \(\sharp \) is used as a delimiter to ensure that no element is read twice. The symbols \(\textit{start}\) and \(\textit{end}\) mark the start and the end of the channel contents, respectively. \(\bot \) is the initial value of x, y and z. The symbol \(\textit{rule}_f\) is an additional transition rule that is used to indicate the end of a simulation.
We now present the three methods in pseudocode, shown in Methods 1, 2 and 3. The \(\textit{if}\) and \(\textit{while}\) statements used in the pseudocode can be easily implemented by the \(\textit{assume}\) commands together with the other commands in our formalization of a library. For the sake of brevity, the following macro notations are used. For a sequence \(l=a_1 \cdot \ldots \cdot a_m\), let writeSeq(x,l) denote the commands of writing \(a_1, \sharp , \ldots , a_m, \sharp \) to x in sequence, and readSeq(x,l) denote the commands of reading \(a_1, \sharp , \ldots , a_m, \sharp \) from x in sequence. We use \(v:=\textit{readOne}(x)\) to represent the commands of reading \(e,\sharp \) from x in sequence for some \(e \ne \sharp \) and then assigning e to v. If \(\textit{readSeq}(x,l)\) or \(\textit{readOne}(x)\) fails to read the specified content, then the calling process will no longer proceed. We use \(\textit{writeOne}(x,\textit{reg})\) to represent the commands of writing a, \(\sharp \) to x in sequence, where a is the current value of register \(\textit{reg}\). In the pseudocode, r is a register in \(\mathcal {R}\).
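The macros can be pictured with the following Python sketch. It is a simplified model and not the paper's pseudocode: `observed` is assumed to be an iterator over the values a process obtains from its consecutive reads of a location, the string `"#"` stands for \(\sharp \), and a process that no longer proceeds is modeled by raising a hypothetical `Stuck` exception.

```python
SHARP = "#"  # stands for the delimiter symbol


class Stuck(Exception):
    """Models a process that can no longer proceed."""


def write_seq(l):
    """Values written by writeSeq(x, l): each element followed by the delimiter."""
    out = []
    for a in l:
        out += [a, SHARP]
    return out


def read_one(observed):
    """readOne(x): read some e other than the delimiter, then the delimiter."""
    e = next(observed, None)
    if e is None or e == SHARP or next(observed, None) != SHARP:
        raise Stuck
    return e


def read_seq(observed, l):
    """readSeq(x, l): read exactly a1, '#', ..., am, '#' in sequence."""
    for a in l:
        if read_one(observed) != a:
            raise Stuck


stream = iter(write_seq(["start", "a", "b", "end"]))
read_seq(stream, ["start"])
print(read_one(stream))  # a
```

Because every element is followed by the delimiter, a reader that sees the same element twice in a row without an intervening delimiter gets stuck instead of counting the element twice.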
5 Undecidability of TSOtoTSO linearizability
As the main result of this paper, we show in this section that the TSOtoTSO linearizability of concurrent libraries is undecidable for a bounded number of processes. We first reduce the trace inclusion problem between any two configurations of a classiclossy singlechannel system to the history inclusion problem between two specific concurrent libraries. Then, our main undecidability result follows from the equivalence between the history inclusion and the extended history inclusion for these two libraries. Recall that, by Lemma 1 above, the latter is equivalent to TSOtoTSO linearizability between the two libraries. Moreover, we prove that in general all variants of history inclusion problems, including the history inclusion problem and the extended history inclusion problem, are undecidable on TSO for a bounded number of processes.
5.1 Undecidability of history inclusion
In this subsection we show that, given a classiclossy singlechannel system \(\mathcal {S}\) and a configuration \((q,W)\in \textit{Conf}_{\textit{cs}}\), the histories of library \(\mathcal {L}_{\mathcal {S},q,W}\) simulate exactly the paths of \(\mathcal {S}\) starting from (q, W). Therefore, trace inclusion between \((q_1,W_1)\) and \((q_2,W_2)\) can be reduced to history inclusion between \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1},3 \rrbracket _{\textit{te}} )\) and \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2},3 \rrbracket _{\textit{te}} )\).
A path \(p_{\mathcal {S}}= (q_1,W_1) \overset{\alpha _1}{\longrightarrow }_{\textit{cs}} (q_2,W_2) \overset{\alpha _2}{\longrightarrow }_{\textit{cs}} \cdots \overset{\alpha _k}{\longrightarrow }_{\textit{cs}} (q_{k+1},W_{k+1}) \in \textit{path}(\mathcal {CL}(\mathcal {S}), (q_1,W_1) )\) is \(\textit{conservative}\) if the following two conditions hold: (1) it contains at least one transition; (2) assuming that the ith step uses rule \(r_i=(q_i,U_i,\alpha _i,q_{\textit{i+1}},V_i)\) for each \(1\le i\le k\), then for each \(1\le i\le k\) there exist \(W'_i,W''_i \in \Sigma _{\textit{cs}}^*\) such that \(U_i \cdot W'_i \sqsubseteq W_i\), \(W''_i \sqsubseteq W'_i\) and \(W_{\textit{i+1}}=W''_i \cdot V_i\). Intuitively, the ith step of a conservative path does not lose any element of \(V_i\). We prove that the set of traces of conservative paths equals the set of traces of all paths for classiclossy singlechannel systems. A trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S},q,W},3 \rrbracket _{\textit{te}} )\) is \(\textit{effective}\) if \(t_{\mathcal {L}}\) contains at least one return action \(\textit{return}(\_,M_3,\_)\). Otherwise, it is ineffective.
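Condition (2) can be checked step by step. The Python sketch below is an illustration only, under the assumption that words are strings of one-character messages and rules are tuples `(q, U, alpha, q_next, V)`; the function names and the sample rule are hypothetical. Since \(W_{i+1}\) must end with \(V_i\), the candidate for \(W''_i\) is determined, and it suffices to compare it against the largest admissible \(W'_i\).

```python
def is_subword(l1, l2):
    """Return True if l1 can be obtained from l2 by deleting letters."""
    it = iter(l2)
    return all(any(a == b for b in it) for a in l1)


def remainder_after(u, w):
    """Embed u into w as early as possible; return the rest of w, or None."""
    pos = 0
    for a in u:
        pos = w.find(a, pos)
        if pos < 0:
            return None
        pos += 1
    return w[pos:]


def is_conservative_step(rule, w_i, w_next):
    """Check that w_next = W'' . V, where W'' is a subword of some W' with U . W' a subword of w_i."""
    _, u, _, _, v = rule
    if not w_next.endswith(v):
        return False                       # some element of V was lost
    w2 = w_next[: len(w_next) - len(v)]    # the only candidate for W''
    rest = remainder_after(u, w_i)         # largest admissible W'
    return rest is not None and is_subword(w2, rest)


rule = ("q0", "a", "t", "q1", "bc")  # hypothetical rule, for illustration only
print(is_conservative_step(rule, "xay", "ybc"))  # True
print(is_conservative_step(rule, "xay", "yc"))   # False
```

The second step is a legal lossy step of the channel system, but it is not conservative because the element b of \(V\) is dropped.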
There is actually a close connection between the conservative paths of \(\mathcal {CL}(\mathcal {S})\) and the effective traces of \(\llbracket \mathcal {L}_{\mathcal {S},q,W},3 \rrbracket _{\textit{te}}\). An effective trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S},q,W},3 \rrbracket _{\textit{te}} )\) and a conservative path \(p_{\mathcal {S}} \in \textit{path}( \mathcal {CL}(\mathcal {S}),(q,W) )\) correspond, if the sequence of return values of \(M_3\) in \(t_{\mathcal {L}}\) is the same as the sequence of transition labels of \(p_{\mathcal {S}}\). The following lemma states that given a conservative path \(p_{\mathcal {S}} \in \textit{path}( \mathcal {CL}(\mathcal {S}), (q,W) )\), there exists a corresponding effective trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{ \mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\).
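The correspondence relation can be stated operationally. In the Python sketch below, a trace is modeled as a list of action tuples `(kind, process, name, value)`; this encoding and the function names are assumptions made for illustration and are not the paper's formal notation.

```python
def m3_returns(trace):
    """Sequence of values returned by M3 in a trace."""
    return [a[3] for a in trace if a[0] == "return" and a[2] == "M3"]


def is_effective(trace):
    """A trace is effective if it contains at least one return action of M3."""
    return len(m3_returns(trace)) > 0


def correspond(trace, path_labels):
    """An effective trace corresponds to a path if M3's return values equal its labels."""
    return is_effective(trace) and m3_returns(trace) == list(path_labels)


# Hypothetical trace fragment; internal actions are heavily abbreviated.
trace = [
    ("call", 1, "M1", None),
    ("call", 2, "M2", None),
    ("call", 3, "M3", None),
    ("write", 2, "x", "start"),
    ("return", 3, "M3", "alpha1"),
    ("call", 3, "M3", None),
    ("return", 3, "M3", "alpha2"),
]
print(correspond(trace, ["alpha1", "alpha2"]))  # True
```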
Lemma 2
Given a conservative path \(p_{\mathcal {S}} \in \textit{path}( \mathcal {CL}(\mathcal {S}), (q,W) )\), there exists an effective trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{ \mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\) such that \(t_{\mathcal {L}}\) and \(p_{\mathcal {S}}\) correspond.
Proof
(Sketch) Assume \(p_{\mathcal {S}}\) = \((q_1,W_1) \overset{\alpha _1}{\longrightarrow }_{\textit{cs}} (q_2,W_2) \overset{\alpha _2}{\longrightarrow }_{\textit{cs}} \cdots \overset{\alpha _k}{\longrightarrow }_{\textit{cs}} (q_{k+1},W_{k+1})\), where (1) \((q_1,W_1)=(q,W)\), (2) for each i, the ith transition uses rule \(r_i=(q_i,U_i,\alpha _i,q_{\textit{i+1}},V_i)\), (3) for each i, \(\exists W'_i,W''_i\), such that \(U_i \cdot W'_i \sqsubseteq W_i\), \(W''_i \sqsubseteq W'_i\) and \(W_{\textit{i+1}}=W''_i \cdot V_i\).

Call \(M_1\) in process 1 and call \(M_2\) in process 2.

Run \(M_2\) from Line 1 to Line 2, write \(r_1 \cdot \textit{start} \cdot W_1 \cdot \textit{end}\) to x while no flush action of process 2 happens during this period.

Call \(M_3\) in process 3.

Run \(M_2\) from Line 3 to Line 13. \(M_2\) reads \(r_i \cdot \textit{start} \cdot U_i \cdot W''_i \cdot \textit{end}\) from y and writes \(r_{\textit{i+1}} \cdot \textit{start} \cdot W''_i \cdot V_i \cdot \textit{end}\) to x (in the case of \(i=k\), \(M_2\) writes \(\textit{rule}_f \cdot \textit{start} \cdot W''_k \cdot V_k \cdot \textit{end}\) instead). Then \(M_2\) transmits transition label \(\alpha _i\) to \(M_3\) and \(M_3\) returns \(\alpha _i\). Since \(W_{\textit{i+1}}\) is equal to \(W''_i \cdot V_i\), \(M_2\) writes \(W_{\textit{i+1}}\) to x while it simulates the ith transition of \(p_{\mathcal {S}}\).
1. Run \(M_1\), \(M_2\) and \(M_3\) in processes \(P_1\), \(P_2\) and \(P_3\) respectively. Recall that \(M_1\) and \(M_2\) never return, while each invocation of \(M_3\) is associated with an interval shown in Fig. 2.
2. At Line 2 of Method 2, \(M_2\) puts \((x,\textit{rule}_1)\), \((x,\sharp )\), \((x,\textit{start})\), \((x,\sharp )\), (x, a), \((x,\sharp )\), (x, a), \((x,\sharp )\), \((x,\textit{end})\), \((x,\sharp )\) into the store buffer of process \(P_2\).
3. By several loops between Lines 1–3, \(M_1\) captures the updates of x in a lossy manner, and puts \((y,\textit{rule}_1)\), \((y,\sharp )\), \((y,\textit{start})\), \((y,\sharp )\), (y, a), \((y,\sharp )\), \((y,\textit{end})\), \((y,\sharp )\) into the store buffer of process \(P_1\).
4. At Line 4 of Method 2, \(M_2\) captures the updates of y in a lossy manner. \(M_2\) guesses an applicable transition rule \(\textit{rule}_2\), and then puts \((x,\textit{rule}_2)\), \((x,\sharp )\), \((x,\textit{start})\), \((x,\sharp )\), (x, b), \((x,\sharp )\), (x, c), \((x,\sharp )\), \((x,\textit{end})\), \((x,\sharp )\) into the store buffer of process \(P_2\), according to transition rule \(\textit{rule}_1\).
5. \(M_2\) sends the transition label \(\alpha _1\) to \(M_3\) at Line 14 of Method 2. Then, \(M_3\) returns \(\alpha _1\) and we finish simulating the first transition in \(p_{\mathcal {S}}\).
6. By several loops between Lines 1–3, \(M_1\) captures the updates of x in a lossy manner, and puts \((y,\textit{rule}_2)\), \((y,\sharp )\), \((y,\textit{start})\), \((y,\sharp )\), (y, b), \((y,\sharp )\), \((y,\textit{end})\), \((y,\sharp )\) into the store buffer of process \(P_1\).
7. At Line 4 of Method 2, \(M_2\) captures the updates of y. Then, \(M_2\) decides to terminate the simulation and puts \((x,\textit{rule}_f)\), \((x,\sharp )\), \((x,\textit{start})\), \((x,\sharp )\), (x, a), \((x,\sharp )\), \((x,\textit{end})\), \((x,\sharp )\) into the store buffer of process \(P_2\), according to transition rule \(\textit{rule}_2\).
8. \(M_2\) sends the transition label \(\alpha _2\) to \(M_3\) at Line 14 of Method 2. Then, \(M_3\) returns \(\alpha _2\) and we finish simulating the second transition in \(p_{\mathcal {S}}\).
It can be seen that \(t_{\mathcal {L}}\) and \(p_{\mathcal {S}}\) correspond in this example. The following lemma is the opposite direction of Lemma 2.
Lemma 3
For each effective trace \(t_{\mathcal {L}} \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\), there exists a conservative path \(p_{\mathcal {S}} \in \textit{path}( \mathcal {CL}(\mathcal {S}), (q,W) )\) such that \(t_{\mathcal {L}}\) and \(p_{\mathcal {S}}\) correspond.
Proof
(Sketch) Let \(p_{\mathcal {L}} \in \textit{path}( \llbracket \mathcal {L}_{ \mathcal {S}, q,W},3 \rrbracket _{\textit{te}} )\) be a path and let \(t_{\mathcal {L}}\) be its trace. It is easy to see that the sequence of values of x read by \(M_1\) is a subword of the sequence of values of x written by \(M_2\), and that the sequence of values of y read by \(M_2\) is a subword of the sequence of values of y written by \(M_1\).
A round of \(M_2\) is an execution from Line 1 to Line 2 or from Line 3 to Line 13 of \(M_2\). Assume that \(t_{\mathcal {L}}\) has k return actions of \(M_3\), and for each \(1 \le i \le k\), \(M_2\) guesses rule \(r_i=(q_i,U_i,\alpha _i,q_{\textit{i+1}},V_i)\) in its ith round. \(M_2\) writes \(r_1 \cdot \textit{start} \cdot W \cdot \textit{end}\) to x during its first round. Assume for each \(1 < i \le k\), \(M_2\) reads \(r_i \cdot \textit{start} \cdot U_i \cdot L'_i \cdot \textit{end}\) from y during its \(\textit{i+1}\)th round and writes \(r_i \cdot \textit{start} \cdot L_i' \cdot V_i \cdot \textit{end}\) to x during its \(\textit{i+1}\)th round.
Recall that \(M_2\) acts according to the transition rules \(r_i\), and that arbitrary messages can be lost when \(M_1\) reads updates of x or \(M_2\) reads updates of y. Therefore, it is not hard to see that \(p_{\mathcal {S}}=(q,W) \overset{\alpha _1}{\longrightarrow }_{\textit{cs}} (q_2, L'_1 \cdot V_1) \overset{\alpha _2}{\longrightarrow }_{\textit{cs}} \cdots \overset{\alpha _k}{\longrightarrow }_{\textit{cs}} (q_{k+1}, L'_k \cdot V_k)\) is a conservative path in \(\textit{path}( \mathcal {CL}(\mathcal {S}), (q,W) )\) and that \(t_{\mathcal {L}}\) and \(p_{\mathcal {S}}\) correspond, which completes the proof of Lemma 3. \(\square \)
Lemmas 2 and 3 state that there is a close connection between the conservative paths of \(\mathcal {CL}(\mathcal {S})\) and the effective traces of \(\llbracket \mathcal {L}_{\mathcal {S},q,W},3 \rrbracket _{\textit{te}}\). Based on them we can now prove the following lemma, which shows that the history inclusion between concurrent libraries is undecidable on the TSO memory model for a bounded number of processes.
Lemma 4
For any two libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\), it is undecidable whether \(\textit{history}( \llbracket \mathcal {L}_1,3 \rrbracket _{\textit{te}} )\) \(\subseteq \textit{history}( \llbracket \mathcal {L}_2, 3 \rrbracket _{\textit{te}} )\).
Proof
(Sketch) Based on Lemmas 2 and 3, for any two configurations \((q_1,W_1), (q_2,W_2) \in \textit{Conf}_{\textit{cs}}\) of a classiclossy singlechannel system \(\mathcal {S}\), let us prove that \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1} ,3\rrbracket _{\textit{te}}) \subseteq \textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2} ,3 \rrbracket _{\textit{te}})\) if and only if \(\textit{trace}( \mathcal {CL}(\mathcal {S}),(q_1,W_1) ) \subseteq \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\).
The \(\textit{only if}\) direction is proved by contradiction. Assume \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1},3 \rrbracket _{\textit{te}} ) \subseteq \textit{history}( \llbracket \mathcal {L}_{\mathcal {S}, q_2,W_2},3 \rrbracket _{\textit{te}} )\) but \(\textit{trace}( \mathcal {CL}(\mathcal {S}),(q_1,W_1) )\) is not a subset of \(\textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\). Thus there must exist a trace \(t_{\mathcal {S}1}\) such that \(t_{\mathcal {S}1} \in \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_1,W_1) )\) and \(t_{\mathcal {S}1} \notin \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\). It is clear that \(t_{\mathcal {S}1} \ne \epsilon \).
Let \(p_{\mathcal {S}1}\) be the path of \(t_{\mathcal {S}1}\) on \(\mathcal {CL}(\mathcal {S})\) from \((q_1,W_1)\). We can safely assume \(p_{\mathcal {S}1}\) to be conservative. According to Lemma 2 there exists an effective trace \(t_{\mathcal {L}1} \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S}, q_1,W_1},\) \(3 \rrbracket _{\textit{te}} )\), such that \(t_{\mathcal {L}1}\) and \(p_{\mathcal {S}1}\) correspond. Let history \(h = t_{\mathcal {L}1} \uparrow _{ ( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} ) }\). It is obvious that \(h \in \textit{history}( \llbracket \mathcal {L}_{ \mathcal {S} ,q_1,W_1},3 \rrbracket _{\textit{te}} )\) and by assumption \(h \in \textit{history}( \llbracket \mathcal {L}_{ \mathcal {S} ,q_2,W_2},3 \rrbracket _{\textit{te}} )\).
There exists a trace \(t_{\mathcal {L}2} \in \textit{trace}( \llbracket \mathcal {L}_{ \mathcal {S} ,q_2,W_2},3 \rrbracket _{\textit{te}} )\) such that \(h = t_{\mathcal {L}2} \uparrow _{ ( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} ) }\). It is obvious that \(t_{\mathcal {L}2}\) is effective. According to Lemma 3, there exists a conservative path \(p_{\mathcal {S}2} \in \textit{path}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\) such that \(t_{\mathcal {L}2}\) and \(p_{\mathcal {S}2}\) correspond. Let \(t_{\mathcal {S}2}\) be the trace of \(p_{\mathcal {S}2}\). Thus \(t_{\mathcal {S}2} \in \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\) by definition. Because the sequence of return values of \(M_3\) in \(t_{\mathcal {L}1}\) is the same as that in \(t_{\mathcal {L}2}\), \(t_{\mathcal {L}1}\) and \(p_{\mathcal {S}1}\) correspond, and \(t_{\mathcal {L}2}\) and \(p_{\mathcal {S}2}\) correspond, we obtain that \(t_{\mathcal {S}1}=t_{\mathcal {S}2}\) and hence \(t_{\mathcal {S}1} \in \textit{trace}( \mathcal {CL}(\mathcal {S}),(q_2,W_2) )\), which contradicts our assumption.
The \(\textit{if}\) direction can be similarly proved and its proof is omitted here. Therefore, the undecidability result follows from the fact that the trace inclusion problem between any two configurations of a classiclossy singlechannel system is undecidable [19]. \(\square \)
5.2 Undecidability of TSOtoTSO linearizability
Although we proved above that history inclusion is undecidable on the TSO memory model, there is still a gap between the history inclusion and the extended history inclusion between concurrent libraries. Obviously there exist libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\) such that \(\textit{history}( \llbracket \mathcal {L}_1 ,n \rrbracket _{\textit{te}} ) \subseteq \textit{history}( \llbracket \mathcal {L}_2 ,n \rrbracket _{\textit{te}} )\) but \(\textit{ehistory}( \llbracket \mathcal {L}_1 ,n \rrbracket _{\textit{te}} ) \not \subseteq \textit{ehistory}( \llbracket \mathcal {L}_2 ,n \rrbracket _{\textit{te}} )\). We show in this subsection that for the two libraries \(\mathcal {L}_{\mathcal {S},q_1,W_1}\) and \(\mathcal {L}_{\mathcal {S},q_2,W_2}\), corresponding to the configurations \((q_1, W_1)\) and \((q_2, W_2)\) of a classiclossy singlechannel system, respectively, the history inclusion and the extended history inclusion between \(\mathcal {L}_{\mathcal {S},q_1,W_1}\) and \(\mathcal {L}_{\mathcal {S},q_2,W_2}\) coincide on the TSO memory model.

The first six actions of \(\textit{eh}\) are always call and corresponding flush call actions of \(M_1\), \(M_2\) and \(M_3\), while these actions may occur in any order.

The projection of \(\textit{eh}\) on \(P_i\) is exactly \(\textit{call}(i,M_i,\_) \cdot \textit{flushCall}(i,M_i,\_)\) for \(i\in \{1,2\}\).

Figure 3 shows the possible positions of flush call (\(\textit{fcal}\)) and flush return (\(\textit{fret}\)) actions in eh. Since \(M_3\) always executes lock and unlock commands before it returns, during each round of a call to \(M_3\) in \(P_3\), the flush call action must occur before the lock action (see the dashed vertical lines in Fig. 3); hence it can only occur before the return action of \(M_3\). During each round of a call to \(M_3\) in \(P_3\), the flush return action may occur at one of two positions: the first is after the return action of \(M_3\) and before the call action of the next round of \(M_3\), as shown by the position of \(\textit{fret}_1\) in Fig. 3 (a); the second is after the call action of the next round of \(M_3\) and before the subsequent flush call action, as shown by the position of \(\textit{fret}_1\) in Fig. 3 (b).
To prove that the history inclusion and the extended history inclusion coincide between libraries \(\mathcal {L}_{\mathcal {S},q_1,W_1}\) and \(\mathcal {L}_{\mathcal {S},q_2,W_2}\), we need to show that for an extended history \(\textit{eh}_1\) of \(\llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}}\), if \(\textit{eh}_1\) contains a return action in \(P_3\) and \(\textit{eh}_1 \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )} \in \textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\), then \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\). Because \(\textit{eh}_1 \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )}\) is a history of \(\llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}}\), there exists a path \(p_{\mathcal {L}}'\) of \(\llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}}\) corresponding to \(\textit{eh}_1\). From \(p_{\mathcal {L}}'\) we can generate another path \(p_{\mathcal {L}}\) of \(\llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}}\) such that the extended history along \(p_{\mathcal {L}}\) is exactly \(\textit{eh}_1\).
The path \(p_{\mathcal {L}}\) is generated from \(p_{\mathcal {L}}'\) by changing the positions of the flush return actions. Recall that during each round of a call to \(M_3\), the flush return action may occur alternatively at two positions only. Since \(M_3\) does not insert any pending write action into the process \(P_3\)’s store buffer, \(p_\mathcal {L}'\) can be transformed into \(p_\mathcal {L}\) by swapping each flush return action in \(p_\mathcal {L}'\) from its current position to the other possible one (if necessary).
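The swap of a flush return action between its two admissible positions can be sketched in Python. The encoding is an assumption made for illustration: only the actions of \(P_3\) are listed, each as a pair `(kind, round)`, and return values as well as the actions of \(P_1\) and \(P_2\) are omitted. The function names are hypothetical.

```python
def move_fret(eh, k):
    """Move the flush return of round k to its other admissible position.

    The two positions are just before or just after the call action of
    round k + 1, so the move is a swap of two adjacent actions.
    """
    eh = list(eh)
    i = eh.index(("fret", k))
    nxt = ("cal", k + 1)
    if i + 1 < len(eh) and eh[i + 1] == nxt:
        eh[i], eh[i + 1] = eh[i + 1], eh[i]
    elif i > 0 and eh[i - 1] == nxt:
        eh[i], eh[i - 1] = eh[i - 1], eh[i]
    return eh


def history(eh):
    """Projection of an extended history onto call and return actions."""
    return [a for a in eh if a[0] in ("cal", "ret")]


# Position (a): fret of round 1 occurs before the call of round 2.
eh_a = [("cal", 1), ("fcal", 1), ("ret", 1), ("fret", 1), ("cal", 2), ("fcal", 2)]
# Position (b): fret of round 1 occurs after the call of round 2.
eh_b = move_fret(eh_a, 1)
print(eh_b)
print(history(eh_a) == history(eh_b))  # True
```

The swap changes the extended history but leaves its projection onto call and return actions untouched, which is why a path with a given history can be adjusted to produce a prescribed extended history.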
An extended history is \(\textit{effective}\) if it contains at least one \(\textit{return}(\_,M_3,\_)\) action. Otherwise, it is ineffective. The following lemma formalizes the idea described above.
Lemma 5
For a classiclossy singlechannel system \(\mathcal {S}\) and two configurations \((q_1,W_1), (q_2,W_2) \in \textit{Conf}_{\textit{cs}}\), if \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} )\) is an effective extended history and \(\textit{eh}_1 \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )} \in \textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} ) \), then \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\).
With the help of Lemma 5, we can prove that the history inclusion and the extended history inclusion between the specific libraries coincide on the TSO memory model.
Lemma 6
For two configurations \((q_1,W_1),(q_2,W_2)\) of a classiclossy singlechannel system \(\mathcal {S}\), \(\textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} ) \subseteq \textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\) if and only if \(\textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} ) \subseteq \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\).
Proof
The \(\textit{if}\) direction is obvious.
The only if direction can be proved by contradiction. Assume there is an extended history \(\textit{eh}_1\) such that \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} )\) but \(\textit{eh}_1 \notin \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\).
It can be seen that the sets of the ineffective extended histories of \(\mathcal {L}_{\mathcal {S},q_1,W_1}\) and \(\mathcal {L}_{\mathcal {S},q_2,W_2}\) are the same. By assumption, \(\textit{eh}_1\) is not an ineffective extended history of \(\mathcal {L}_{\mathcal {S},q_2,W_2}\), so \(\textit{eh}_1\) must be an effective extended history of \(\mathcal {L}_{\mathcal {S},q_1,W_1}\).
Let history \(h=\textit{eh}_1 \uparrow _{( \Sigma _{\textit{cal}} \cup \Sigma _{\textit{ret}} )}\). It is obvious that \(h \in \textit{history}(\llbracket \mathcal {L}_{\mathcal {S},q_1,W_1}, 3 \rrbracket _{\textit{te}} )\). Then, by assumption, \(h \in \textit{history}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\). By Lemma 5, \(\textit{eh}_1 \in \textit{ehistory}( \llbracket \mathcal {L}_{\mathcal {S},q_2,W_2}, 3 \rrbracket _{\textit{te}} )\), which contradicts the assumption. \(\square \)
The undecidability of TSOtoTSO linearizability for a bounded number of processes is a direct consequence of Lemmas 1, 4 and 6.
Theorem 1
For any two concurrent libraries \(\mathcal {L}_1\) and \(\mathcal {L}_2\), it is undecidable whether \(\mathcal {L}_2\) TSOtoTSO linearizes \(\mathcal {L}_1\) for a bounded number of processes.
5.3 Undecidability of all variants of history inclusion problems
In this subsection we show that all variants of history inclusion problems are undecidable on TSO for a bounded number of processes.
By constructing a close connection between the conservative paths of \(\mathcal {CL}(\mathcal {S})\) and return sequences, as in the undecidability proof of Lemma 4, it is not hard to prove that the \((\textit{ret})\)history inclusion problem is also undecidable on TSO for a bounded number of processes.
Recall that each flush call action has a fixed position, while each flush return action has two possible positions and can be moved from one to the other if necessary. Therefore, it can be proved similarly to Theorem 1 that the \((\textit{fcal},\textit{ret})\)/ \((\textit{cal},\textit{fcal},\textit{ret})\)/ \((\textit{fret})\)/ \((\textit{cal},\textit{fret})\)/ \((\textit{fcal},\textit{fret})\)/ \((\textit{cal},\textit{fcal},\textit{fret})\)/ \((\textit{ret},\textit{fret})\)/ \((\textit{cal},\textit{ret},\textit{fret})\)/ \((\textit{fcal},\textit{ret},\textit{fret})\)history inclusion problems are all undecidable on TSO for a bounded number of processes.
According to the construction of \(\mathcal {L}_{\mathcal {S},q,W}^c\), it is obvious that if \(\alpha _1 \cdot \ldots \cdot \alpha _k\) is a trace of \(\mathcal {CL}(\mathcal {S})\) from (q, W), then there must be a trace \(t \in \textit{trace}( \llbracket \mathcal {L}_{\mathcal {S}, q,W}^c,3 \rrbracket _{\textit{te}} )\), such that t contains \(k+1\) call actions of \(M_3\), and the first k arguments of \(M_3\) are \(\alpha _1, \ldots , \alpha _k\), while the last argument of \(M_3\) is irrelevant. It is not hard to construct a close connection between the conservative paths of \(\mathcal {CL}(\mathcal {S})\) and sequences of call actions (except the last one), and this also holds for flush call actions. Therefore, it is not hard to prove that, the \((\textit{cal})\)/ \((\textit{fcal})\)/ \((\textit{cal},\textit{fcal})\)history inclusion problems are undecidable on TSO for a bounded number of processes.
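Taken together with Lemmas 4 and 6, the cases discussed in this subsection appear to cover every non-empty choice among the four action kinds. The Python sketch below is a bookkeeping check under that reading; it assumes that ordinary history inclusion is the \((\textit{cal},\textit{ret})\) variant and that extended history inclusion is the \((\textit{cal},\textit{fcal},\textit{ret},\textit{fret})\) variant. The names `KINDS`, `variants` and `covered` are illustrative.

```python
from itertools import combinations

KINDS = ("cal", "fcal", "ret", "fret")

# Every non-empty subset of the four action kinds, in a fixed order.
variants = [c for r in range(1, len(KINDS) + 1) for c in combinations(KINDS, r)]
print(len(variants))  # 15

covered = {
    ("cal", "ret"),                   # history inclusion (Lemma 4)
    ("cal", "fcal", "ret", "fret"),   # extended history inclusion (Lemmas 4 and 6)
    ("ret",),                         # return sequences
    # variants handled by moving flush return actions
    ("fcal", "ret"), ("cal", "fcal", "ret"), ("fret",), ("cal", "fret"),
    ("fcal", "fret"), ("cal", "fcal", "fret"), ("ret", "fret"),
    ("cal", "ret", "fret"), ("fcal", "ret", "fret"),
    # variants handled through call arguments
    ("cal",), ("fcal",), ("cal", "fcal"),
}
print(set(variants) == covered)  # True
```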
The following theorem states that all variants of history inclusion problems are undecidable on TSO for a bounded number of processes.
Theorem 2
On the TSO memory model, all variants of the history inclusion problem are undecidable for a bounded number of processes.
Similarly, method \(M_3\) of \(\mathcal {L}_{\mathcal {S},q,W}^c\) can use \(\textit{cas}\) commands instead of lock and unlock commands. Therefore, on other relaxed memory models that are weaker than TSO and also provide TSO behaviors for write, read and \(\textit{cas}\) commands, the \((\textit{cal})\)/ \((\textit{ret})\)/ \((\textit{cal},\textit{ret})\)history inclusion problems are still undecidable for a bounded number of processes.
6 Conclusion and future work
We have shown that the decision problem of TSOtoTSO linearizability is undecidable for a bounded number of processes. The proof is essentially a reduction from a known undecidable problem, the trace inclusion problem of a classiclossy singlechannel system. To facilitate such a reduction, we introduced an intermediate notion of history inclusion between concurrent libraries on the TSO memory model. We then demonstrated that a configuration (q, W) of a classiclossy singlechannel system \(\mathcal {S}\) can be simulated by a specific library \(\mathcal {L}_{ \mathcal {S} ,q,W}\), interacting with three specific processes on the TSO memory model. Although history inclusion does not coincide with extended history inclusion in general, they do coincide on a restricted class of libraries. We proved that \(\mathcal {L}_{ \mathcal {S} ,q,W}\) lies within such a class. Finally, our undecidability result follows from the equivalence between extended history inclusion and TSOtoTSO linearizability.
The problem of linearizability between libraries on the SC memory model [11] can be shown to be decidable for a bounded number of processes. This is due to the provable equivalence between history inclusion and linearizability on the SC memory model, where the former is decidable. Thus, our work clearly delineates a boundary of decidability for linearizability of concurrent libraries on various memory models. As a byproduct of this work, we prove that all variants of history inclusion problems are undecidable on TSO for a bounded number of processes. This reveals that the undecidability of TSOtoTSO linearizability comes from the unbounded size of the processor-local store buffers, rather than from which actions are chosen.
Other relaxed memory models, such as the memory models of POWER and ARM, are much weaker than the TSO memory model. We conjecture that variants of linearizability on these relaxed memory models may also be reduced to some new forms of extended history inclusion, similar to the variants of linearizability for the C/C\(++\) memory model in [3], and that these variants should also be undecidable. However, the decision problem of TSOtoSC linearizability, which amounts to checking whether histories of a library on the TSO memory model belong to a regular language, still remains open. For concurrent programs using write, read and \(\textit{cas}\) commands but not call and return actions, Atig et al. proved in [2] that the reachability problem between any two configurations is decidable for a bounded number of processes on the TSO memory model. However, this reachability problem becomes much more complex when call and return actions are involved. In [21], we have already proved that if the number of call and return actions in a history is bounded, the decision problem of TSOtoSC linearizability is decidable for a bounded number of processes. As future work, we would like to further investigate the decidability of TSOtoSC linearizability and other variants of linearizability for relaxed memory models.
References
 1. Alur, R., McMillan, K., Peled, D.: Model-checking of correctness conditions for concurrent objects. In: LICS 1996, pp. 219–228. IEEE Computer Society (1996)
 2. Atig, M.F., Bouajjani, A., Burckhardt, S., Musuvathi, M.: On the verification problem for weak memory models. In: Hermenegildo, M.V., Palsberg, J. (eds.) POPL 2010, pp. 7–18. ACM (2010)
 3. Batty, M., Dodds, M., Gotsman, A.: Library abstraction for C/C++ concurrency. In: Giacobazzi, R., Cousot, R. (eds.) POPL 2013, pp. 235–248. ACM (2013)
 4. Bovet, D., Cesati, M.: Understanding the Linux Kernel, 3rd edn. O’Reilly, Sebastopol (2005)
 5. Batty, M., Owens, S., Sarkar, S., Sewell, P., Weber, T.: Mathematizing C++ concurrency. In: Ball, T., Sagiv, M. (eds.) POPL 2011, pp. 55–66. ACM (2011)
 6. Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: Verifying concurrent programs against sequential specifications. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013, pp. 290–309. Springer (2013)
 7. Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: Tractable refinement checking for concurrent objects. In: Rajamani, S.K., Walker, D. (eds.) POPL 2015, pp. 651–662. ACM (2015)
 8. Burckhardt, S., Gotsman, A., Musuvathi, M., Yang, H.: Concurrent library correctness on the TSO memory model. In: Seidl, H. (ed.) ESOP 2012, pp. 87–107. Springer (2012)
 9. Derrick, J., Smith, G., Groves, L., Dongol, B.: Using coarse-grained abstractions to verify linearizability on TSO. In: Yahav, E. (ed.) HVC 2014, pp. 1–16. Springer (2014)
 10. Derrick, J., Smith, G., Dongol, B.: Verifying linearizability on TSO architectures. In: Albert, E., Sekerinski, E. (eds.) IFM 2014, pp. 341–356. Springer (2014)
 11. Filipovic, I., O’Hearn, P., Rinetzky, N., Yang, H.: Abstraction for concurrent objects. In: Castagna, G. (ed.) ESOP 2009, pp. 252–266. Springer (2009)
 12. Gotsman, A., Musuvathi, M., Yang, H.: Show no weakness: sequentially consistent specifications of TSO libraries. In: Aguilera, M.K. (ed.) DISC 2012, pp. 31–45. Springer (2012)
 13. Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)
 14. Lamport, L.: How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Trans. Comput. 28(9), 690–691 (1979)
 15. Liu, Y., Chen, W., Liu, Y.A., Sun, J., Zhang, S.J., Dong, J.S.: Verifying linearizability via optimized refinement checking. IEEE Trans. Softw. Eng. 39(7), 1018–1039 (2013)
 16. Manson, J., Pugh, W., Adve, S.V.: The Java memory model. In: Palsberg, J., Abadi, M. (eds.) POPL 2005, pp. 378–391. ACM (2005)
 17. Owens, S., Sarkar, S., Sewell, P.: A better x86 memory model: x86-TSO. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009, pp. 391–407. Springer (2009)
 18. Sarkar, S., Sewell, P., Alglave, J., Maranget, L., Williams, D.: Understanding POWER multiprocessors. In: Hall, M.W., Padua, D.A. (eds.) PLDI 2011, pp. 175–186. ACM (2011)
 19. Schnoebelen, P.: Bisimulation and other undecidable equivalences for lossy channel systems. In: Kobayashi, N., Pierce, B.C. (eds.) TACS 2001, pp. 385–399. Springer (2001)
 20. Vechev, M.T., Yahav, E., Yorsh, G.: Experience with model checking linearizability. In: Pasareanu, C.S. (ed.) SPIN 2009, pp. 261–278. Springer (2009)
 21. Wang, C., Lv, Y., Wu, P.: Bounded TSOtoSC linearizability is decidable. In: Freivalds, M.R., Engels, G., Catania, B. (eds.) SOFSEM 2016, pp. 404–417. Springer (2016)
 22. Wang, C., Lv, Y., Wu, P.: TSOtoTSO linearizability is undecidable. In: Finkbeiner, B., Pu, G., Zhang, L. (eds.) ATVA 2015, pp. 309–325. Springer (2015)