Learning Pomset Automata

We extend the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathtt {L}^{\!\star }$$\end{document}L⋆ algorithm to learn bimonoids recognising pomset languages. We then identify a class of pomset automata that accepts precisely the class of pomset languages recognised by bimonoids and show how to convert between bimonoids and automata.


Introduction
Automata learning algorithms are useful in automated inference of models, which is needed for verification of hardware and software systems.In active learning, the algorithm interacts with a system through tests and observations to produce a model of the system's behaviour.One of the first active learning algorithms proposed was L ⋆ , due to Dana Angluin [2], which infers a minimal deterministic automaton for a target regular language.L ⋆ has been used in a range of verification tasks, including learning error traces in a program [5].For more advanced verification tasks, richer automata types are needed and L ⋆ has been extended to e.g.input-output [1], register [19], and weighted automata [15].None of the existing extensions can be used in analysis of concurrent programs.
Partially ordered multisets (pomsets) [13,12] are basic structures used in the modeling and semantics of concurrent programs.Pomsets generalise words, allowing to capture both the sequential and the parallel structure of a trace in a concurrent program.Automata accepting pomset languages are therefore useful to study the operational semantics of concurrent programs-see, for instance, work on concurrent Kleene algebra [16,25,20,23].
In this paper, we propose an active learning algorithm for a class of pomset automata.The approach is algebraic: we consider languages of pomsets recognised by bimonoids [27] (which we shall refer to as pomset recognisers).This can be thought of as a generalisation of the classical approach to language theory of using monoids as word acceptors: bimonoids have an extra operation that models parallel composition in addition to sequential.The two operations give rise to a complex branching structure that makes the learning process non-trivial.
The key observation is that pomset recognisers are tree automata whose algebraic structure satisfies additional equations.We extend tree automata learning algorithms [7,8,30] to pomset recognisers.The main challenge is to ensure that intermediate hypotheses in the algorithm are valid pomset recognisers, which is essential in practical scenarios where the learning process might not run to the very end, returning an approximation of the system under learning.This requires equations of bimonoids to be correctly propagated and preserved in the core data structure of the algorithm-the observation table.The proof of termination, in analogy to L ⋆ , relies on the existence of a canonical pomset recogniser of a language, which is based on its syntactic bimonoid.The steps of the algorithm provide hypotheses that get closer in size to the canonical recogniser.
Finally, we bridge the learning algorithm to pomset automata [20,21] by providing two constructions that enable us to seamlessly move between pomset recognisers and pomset automata.Note that although bimonoids provide a useful formalism to denote pomset languages, which is amenable to the design of the learning algorithm, they enforce a redundancy that is not present in pomset automata: whereas a pomset automaton processes a pomset from left to right in sequence, one letter per branch at a time, a bimonoid needs to be able to take the pomset represented as a binary tree in any way and process it bottom-up.This requirement of different decompositions leading to the same result makes bimonoids in general much larger than pomset automata and hence the latter are, in general, a more efficient representation of a pomset language.
The rest of the paper is organised as follows.We conclude this introductory section with a review of relevant related work.Section 2 contains the basic definitions on pomsets and pomset recognisers.The learning algorithm for pomset recognisers appears in Section 3, including proofs to ensure termination and invariant preservation.Section 4 presents constructions to translate between (a class of) pomset automata and pomset recognisers.We conclude with discussion of further work in Section 5. Omitted proofs appear in Appendices A and B.

Related Work.
There is a rich literature on adaptations and extensions of L ⋆ from deterministic automata to various kinds of models, see, e.g., [33,17] for an overview.To the best of our knowledge, this paper is the first to provide an active learning algorithm for pomset languages recognised by finite bimonoids.
Our algorithm learns an algebraic recogniser.Urbat and Schröder [32] provide a very general learning approach for languages recognised by algebras for monads [4,31], based on a reduction to categorical automata, for which they present an L ⋆ -type algorithm.Their reduction gives rise to an infinite alphabet in general, so tailored work is needed for deriving algorithms and finite representations.This can be done for instance for monoids, recognising regular languages, but it is not clear how this could extend to pomset recognisers.We present a direct learning algorithm for bimonoids, which does not rely on any encoding.
Our concrete learning algorithm for bimonoids is closely related to learning approaches for bottom-up tree automata [7,8,30]: pomset languages can be viewed as tree languages satisfying certain equations.Incorporating these equa-tions turned out to be a non-trivial task, which requires additional checks on the observation table during execution of the algorithm.
Conversion between recognisers and automata for a pomset language was first explored by Lodaya and Weil [27,26].Their results relate the expressive power of these formalisms to sr-expressions.As a result, converting between recognisers and automata using their construction uses an sr-expression as an intermediate representation, increasing the resulting state space.Our construction, however, converts recognisers directly to pomset automata, which keeps the state space relatively small.Moreover, Lodaya and Weil work focus on pomset languages of bounded width, i.e., with an upper bound on the number of parallel events.In contrast, our conversions work for all recognisable pomset languages (and a suitable class of pomset automata), including those of unbounded width.
Ésik and Németh [9] considered automata and recognisers for biposets, i.e., sp-pomsets without commutativity of parallel composition.They equate languages recognised by bisemigroups (bimonoids without commutativity or units) with those accepted by parenthesizing automata.Our equivalence is similar in structure, but relates a subclass of pomset automata to bimonoids instead.The results in this paper can easily be adapted to learn representations of biposet languages using bisemigroups, and convert those to parenthesizing automata.

Pomset Recognisers
Throughout this paper we fix a finite alphabet Σ and assume ∈ Σ.When defining sets parameterised by a set X, say S(X), we may use S to refer to S(Σ).
We recall pomsets [12,13], a generalisation of words that model concurrent traces.A labelled poset over X is a tuple u = S u , ≤ u , λ u , where S u is a finite set (the carrier of u), ≤ u is a partial order on S u (the order of u), and λ u : S u → X is a function (the labelling of u).Pomsets are labelled posets up to isomorphism.Definition 1 (Pomsets).Let u, v be labelled posets over X.An embedding of u in v is an injection h : ).An isomorphism is a bijective embedding whose inverse is also an embedding.We say u is isomorphic to v, denoted u ∼ = v, if there exists an isomorphism between u and v.A pomset over X is an isomorphism class of labelled posets over X, i.e., [v] = {u : u ∼ = v}.When u = [u] and v = [v] are pomsets, u is a subpomset of v when there exists an embedding of u in v.
When two pomsets are in scope, we tacitly assume that they are represented by labelled posets with disjoint carriers.We write 1 for the empty pomset.When a ∈ X, we write a for the pomset represented by the labelled poset whose sole element is labelled by a. Pomsets can be composed in sequence and in parallel: be pomsets over X.We write u v for the parallel composition of u and v, which is the pomset over X represented by the labelled poset Similarly, we write u • v for the sequential composition of u and v, that is, the pomset represented by the labelled poset We may elide the dot for sequential composition, for instance writing ab for a •b.
The pomsets we use can be built using sequential and parallel composition.
Definition 3 (Series-parallel pomsets).The set of series-parallel pomsets (sp-pomsets) over X, denoted SP(X), is the smallest set such that 1 ∈ SP(X) and a ∈ SP(X) for every a ∈ X, closed under parallel and sequential composition.
Concurrent systems admit executions of operations that are not only ordered in sequence but also allow parallel branches.An algebraic structure consisting of both a sequential and a parallel composition operation, with a shared unit, is called a bimonoid.Formally, its definition is as follows.

Definition 4 (Bimonoid).
A bimonoid is a tuple M, ⊙, ȅ, 1 where -M is a set called the carrier of the bimonoid, -⊙ is a binary associative operation on M , ȅ is a binary associative and commutative operation on M , and -1 ∈ M is a unit for both ⊙ (on both sides) and ȅ.
Bimonoid homomorphisms are defined in the usual way.
Given a set X, the free bimonoid [12] over X is SP(X), •, , 1 .The fact that it is free means that for every function f : X → M for a given bimonoid M, ⊙, ȅ, 1 M there exists a unique bimonoid homomorphism f ♯ : SP(X) → M such that the restriction of f ♯ to X is f .Just as monoids can recognise words, bimonoids can recognise pomsets [27].A bimonoid together with the witnesses of recognition is a pomset recogniser.

Definition 5 (Pomset recogniser).
A pomset recogniser is a tuple R = M, ⊙, ȅ, 1, i, F where M, ⊙, ȅ, 1 is a bimonoid, i : Σ → M , and F ⊆ M .The language recognised by R is given by L R = {u ∈ SP : i ♯ (u) ∈ F } ⊆ SP.Example 6. Suppose a program consists of a loop, where each iteration runs actions a and b in parallel.We can describe the behaviour of this program by We can describe this language using a pomset recogniser, as follows.Let M = {q a , q b , q 1 , q ⊥ , 1}, and let ⊙ and ȅ be the operations on M given by We set i(a) = q a , i(b) = q b , and F = {1, q 1 }.Now, for n > 0: No other pomsets are mapped to q 1 ; hence, M, ⊙, ȅ, 1, i, F accepts L.
Example 7. Suppose a program solves a problem recursively, such that the recursive calls are performed in parallel.In that case, the program would either perform the base action b, or some preprocessing action a followed by running two copies of itself in parallel.This behaviour can be described by the smallest pomset language L satisfying the following inference rules: This language can be described by a pomset recogniser.Let our carrier set be M = {q a , q b , q 1 , q ⊥ , 1}, and let ⊙ and ȅ be the operations on M given by , and i : Σ → M is given by setting i(a) = q a and i(b) = q b .One can then show that M, ⊙, ȅ, 1, i, F accepts L.
Pomset contexts are used to describe the behaviour of individual elements in a pomset recogniser.Formally, the set of pomset contexts over a set X is given by PC(X) = SP(X ∪ { }).Here the element acts as a placeholder, where a pomset can be plugged in: given a context c ∈ PC(X) and t ∈ SP(X), let c[t] ∈ SP(X) be obtained by substituting t for in c.

Learning Pomset Recognisers
In this section we present our algorithm to learn pomset recognisers from an oracle (the teacher ) that answers membership and equivalence queries.A membership query consists of a pomset, to which the teacher replies whether that pomset is in the language; an equivalence query consists of a hypothesis pomset recogniser, to which the teacher replies yes if it is correct or no with a counterexample-a pomset incorrectly classified by the hypothesis-if it is not.
A pomset recogniser is essentially a tree automaton, with the additional constraint that its algebraic structure satisfies the bimonoid axioms.Our algorithm is therefore relatively close to tree automata learning-in particular Drewes and Högberg [7,8]-but there are several key differences: we optimise the algorithm by taking advantage of the bimonoid axioms, and at the same time need to ensure that the hypotheses generated by the learning process satisfy those axioms.

Observation Table
We fix a target language L ⊆ SP throughout this section.As in the original L ⋆ algorithm, the state of the learner throughout a run of the algorithm is given by a data structure called the observation table, which collects information about L. The table contains rows indexed by pomsets, representing the state reached by the correct pomset recogniser after reading that pomset; and columns indexed by pomset contexts, used to approximately indentify the behaviour of each state.To represent the additional rows needed to approximate the pomset recogniser structure, we use the following definition.Given U ⊆ SP, we define We often write row instead of row S,E when S and E are clear from the context.
We depict observation tables, or more precisely row, as two separate tables with rows in S and S + \ S respectively, see for instance Example 9 below.
The goal of the learner is to extract a hypothesis pomset recogniser from the rows in the table.More specifically, the carrier of the underlying bimonoid of the hypothesis will be given by the rows indexed by pomsets in S. The structure on the rows is obtained by transferring the structure of the row labels onto the rows (e.g., row(s) ⊙ row(t) = row(s • t)), but this is not well-defined unless the table satisfies closedness, consistency, and associativity.Closedness and consistency are standard in L ⋆ , whereas associativity is a new property specific to bimonoid learning.We discuss each of these properties next, also including compatibility, a property that is used to show minimality of hypotheses.
The first potential issue is a closedness defect: this is the case when a composed row, indexed by an element of S + , is not indexed by a pomset in S.

Example 9 (Table not closed).
Recall L = {a b} * from Example 6, and suppose The carrier of the hypothesis bimonoid is M = {row(1), row(a), row(b)}, but the composition row(a) ⊙ row(a) cannot be defined since row(aa) ∈ M .
The absence of the issue described above is captured with closedness.

Definition 10 (Closed table). An observation table S, E is closed if for all
t ∈ S + there exists s ∈ S such that row(s) = row(t).
Another issue that may occur is that the same row being represented by different index pomsets leads to an inconsistent definition of the structure.The absence of this issue is referred to as consistency.
Whenever closedness and consistency hold, one can define sequential and parallel composition operations on the rows of the table.However, these operations are not guaranteed to be associative, as we show with the following example.

Example 12 (Table not associative). Consider L = {au : u ∈ {b}
* } over Σ = {a, b}, and suppose S = {1, a, b} and E = { , a}.The induced table is: This table does not lead to an associative sequential operation on rows: To prevent this issue we enforce the following additional property: The table from Example 12 is not •-associative: we have row(a) = row(ab) and row(b) = row(ba) but row(aa) = row(ab).
Putting the above definitions of closedness, consistency and associativity of tables together, we have the following result for constructing a hypothesis.

Lemma 14 (Hypothesis). A closed, consistent and associative table S, E induces a hypothesis pomset recogniser
Proof.The operations ⊙ H and ȅ H are well-defined by closedness and consistency, and 1 H is well-defined because 1 ∈ S by the observation table definition.Commutativity of ȅ H follows from commutativity of , and similarly that 1 H is a unit for both operations follows from 1 being a unit.Associativity follows by associativity of the table (it does not follow from • and being associative: Since a hypothesis is constructed from an observation table S, E that records for given s ∈ S and e ∈ E whether e[s] is accepted by the language or not, one would expect that the hypothesis classifies those pomsets correctly.This is not necessarily the case, as we show in the following example.
Example 15.Consider the language L from Example 7, and let S = {1, b} and From this closed, consistent, and associative table we obtain a hypothesis pomset recogniser that satisfies We thus have the following definition, parametric in a subset of T S,E .Ensuring hypotheses are compatible with their table will not be a crucial step in proving termination, but plays a key role in ensuring minimality (Section 3.4).This was originally shown by van Heerdt [14] for Mealy machines.

The Learning Algorithm
We are now ready to introduce our learning algorithm, Algorithm 1.The main algorithm initialises the table to {1}, { } and starts by augmenting the table to make sure it is closed and associative.We give an example below.

Example 17 (Fixing closedness and associativity).
Consider the table from Example 9, where row(aa) ∈ {row (1), row(a), row(b)} witnesses a closedness defect.To fix this, the algorithm would add aa to the set S, which means row(aa) will become part of the carrier of the hypothesis.Now consider the table from Example 12.Here we found an associativity defect witnessed by row(a) = row(ab) and row(b) = row(ba) but row(aa) = row(ab).More specifically, row(aa)( ) = row(ab)( ).Thus, s 1 = s 3 = s l = a, s 2 = s r = b, s l = a, and e = .A membership query on aba shows aba ∈ L, so b = 0. We have row(aa)( ) = 0, and therefore the algorithm would add the context [a • ] = a • to E.
Note that the algorithm does not explicitly check for consistency; this is because we actually ensure a stronger property-sharpness [3]-as an invariant (Lemma 25).This property ensures every row indexed by a pomset in S is indexed by exactly one pomset in S (implying consistency): The idea of maintaining sharpness is due to Maler and Pnueli [28].Once the table is closed and associative, we construct the hypothesis and check if it is compatible with its table.If this is not the case, a witness for incompatibility is a counterexample by definition, so HandleCounterexample is invoked to extract an extension of E, and we return to checking closedness and associativity.Once we obtain a hypothesis that is compatible with its table, we submit it to the teacher to check for equivalence with the target language.
If the teacher provides a counterexample, we again process this and return to checking closedness and associativity.Once we have a compatible hypothesis for which there is no counterexample, we return this correct pomset recogniser.
The procedure HandleCounterexample, adapted from [7,8], is provided with an observation table S, E a pomset z, and a context c and finds a single context to add to E. The main invariant is that c[z] is a counterexample.Recursive calls replace subpomsets from S + with elements of S in this counterexample while maintaining the invariant.There are two types of return values: if c is a suitable context, c is returned; otherwise the return value is an element of S that is to replace z.The context c is suitable if z ∈ S + and adding c to E would distinguish row(s) from row(z), where s ∈ S is such that currently row(s) = row(z).Because S is non-empty and subpomset-closed, if z ∈ S ∪ S + it can be decomposed into z = u 1 ♥ u 2 for non-empty u 1 , u 2 ∈ SP and ♥ ∈ {•, }.We then recurse into u 1 and u 2 to replace them with elements of S and replace z with u 1 ♥ u 2 ∈ S + in a final recursive call.If c = , the return value cannot be in S, as we will show in Lemma 25 that these elements are not counterexamples.

Termination and Query Complexity
Our termination argument is based on a comparison of the current observation table with the infinite table SP, PC .We first show that the latter induces a hypothesis, called the canonical pomset recogniser for the language.Its underlying bimonoid is isomorphic to the syntactic bimonoid [27] for the language.
Lemma 20.SP, PC is a closed, consistent, and associative observation table.
Definition 21 (Canonical pomset recogniser).The canonical pomset recogniser for L is the the hypothesis for the observation table SP, PC .We denote this hypothesis by The comparison of the current table with SP, PC is in terms of the number of distinct rows they hold.In the following lemma we show that the number of the former is bounded by the number of the latter.⊓ ⊔ An important fact will be that none of the pomsets in S can form a counterexample for the hypothesis of a table S, E .In order to show this we will first show that the hypothesis is always reachable, a concept we define for arbitrary pomset recognisers below.

Definition 23 (Reachability
Our reachability lemma relies on the fact that S is subpomset-closed. Lemma 24 (Hypothesis reachability).Given a closed, consistent, and associative observation table S, E , the hypothesis it induces is reachable.In particular, i H ♯ (s) = row(s) for any s ∈ S.
From the above it follows that we always have compatibility with respect to the set of row indices, as we show next.

Lemma 25. The hypothesis of any closed, consistent, and associative observation table S, E is S-compatible.
Before turning to our termination proof, we show that some simple properties hold throughout a run of the algorithm.
Lemma 26 (Invariant).Throughout execution of Algorithm 1, we have that S, E is a sharp observation table.
Proof.Subpomset-closedness holds throughout each run since {1} is subpomsetclosed and adding a single element of S + to S preserves the property.
For sharpness, first note that the initial table is sharp as it only has one row.Sharpness of S, E can only be violated when adding elements to S. But the only place where this happens is on line 7, and there the new row is unequal to all previous rows, which means sharpness is preserved.

⊓ ⊔
The preceding results allow us to prove our termination theorem.
Proof.First, we observe that fixing a closedness defect by adding a row (line 7) can only happen finitely many times, since, by Lemma 22, the size of {row(s) : This means that it suffices to show the following two points: 1.Each iteration of any of the loops starting on lines 2-4 either fixes a closedness defect by adding a row, or adapts E so that S, E ends up not being closed at the end of loop body.In the second case, a closedness defect will be fixed in the following iteration of the inner while loop.2. The calls to HandleCounterexample terminate.
Combined, these show that the algorithm terminates.For the first point, we treat each of the cases: -If the table is not closed, we directly find a new row that is taken from the S + -part of the table and added to the S-part of the table.-Consider the failure of ♥-associativity, for ♥ ∈ {•, }, and let s 1 , s 2 , s 3 , s l , s r ∈ S and e ∈ E be such that row(s l ) = row(s 1 ♥ s 2 ), row(s r ) = row(s 2 ♥ s 3 ), and row(s l ♥ s 3 )(e) = row(s 1 ♥ s r )(e).Suppose row(s l ♥ s 3 )(e) = b, with b be the result of a membership query on s 1 ♥ s 2 ♥ s 3 .Then e[ ♥ s 3 ] distinguishes the previously equal rows row(s 1 ♥ s 2 ) and row(s l ), so adding it to E creates a closedness defect.The fact that row(s 1 ♥ s 2 ) cannot remain equal to another row than row(s l ) is a result of the sharpness invariant.Alternatively, row(s l ♥ s 3 )(e) = b means row(s 1 ♥ s r )(e) = b, for otherwise we would contradict row(s l ♥ s 3 )(e) = row(s 1 ♥ s r )(e).For similar reasons the context e[s 1 ♥ ] in this case distinguishes the previously equal rows row(s 1 ♥ s 2 ) and row(s r ), creating a closedness defect.-A compatibility defect results in the identification of a counterexample, the handling of which we discuss next.
-Whenever a counterexample is identified, we eventually find a context c, s ∈ S, and t ∈ S + \ S such that row(t) = row(s) and c[t] ∈ L ⇐⇒ c[s] ∈ L. Thus, adding c to E creates a closedness defect.
Termination of HandleCounterexample follows: the first two recursive calls in the procedure replace z with strict subpomsets of z, whereas the last one replaces z with an element of S + , so no further recursion will happen.
⊓ ⊔ Query Complexity.We determine upper bounds on the membership and equivalence query numbers of a run of the algorithm in terms of the size of the canonical pomset recogniser n = |M L |, the size of the alphabet k = |Σ|, and the maximum number of operations (from {•, }, used to compose alphabet symbols) m found in a counterexample.We note that since the number of distinct rows indexed by S is bounded by n and the table remains sharp throughout any run, the final size of S is at most n.Thus, the final size of S + is in O(n 2 + k).Given the initialisation of S with a single element, the number of closedness defects fixed throughout a run is at most n − 1.This means that the total number of associativity defects fixed and counterexamples handled (including those resulting from compatibility defects) together is n − 1.We can already conclude that the number of equivalence queries posed is bounded by n.Moreover, we know that the final table will have at most n columns, and therefore the total number of cells in that table will be in O(n 3 + kn).
The number of membership queries posed during a run of the algorithm is given by the number of cells in the table plus the number of queries needed during the processing of counterexamples.Consider the counterexample z that contains the maximum number of operations among those encountered during a run.The first two recursive calls of HandleCounterexample break down one operation, whereas the third is used to execute a base case making two membership queries and does not lead to any further recursion.The number of membership queries made starting from a given counterexample is thus in O(m).This means the total number of membership queries during the processing of counterexamples is in O(mn), from which we conclude that the number of membership queries posed during a run is in O(n 3 + mn + kn).

Minimality of Hypotheses
In this section we will show that all hypotheses submitted by the algorithm to the teacher are minimal.We first need to define what minimality means.As is the case for DFAs, it is the combination of an absence of unreachable states and of every state exhibiting its own distinct behaviour.

Definition 28 (Minimality). A pomset recogniser
Before proving the main result of this section, we need the following: The minimality theorem below relies on table compatibility, which allows us to distinguish the behaviour of states based on the contents of their rows.Note that the algorithm only submits a hypothesis in an equivalence query if that hypothesis is compatible with its table.

Theorem 30 (Minimality of hypotheses). A closed, consistent, and associative observation S, E induces a minimal hypothesis if the hypothesis is compatible with its table.
Proof.We obtain the hypothesis from Lemma 14.Since S is subpomset-closed, we have by Lemma 24 that the hypothesis is reachable.Moreover, for every s ∈ S we have i H . Then there exist s 1 , s 2 ∈ S such that row(s 1 ) = i H ♯ (u 1 ) and row(s 2 ) = i H ♯ (u 2 ), and we have row(s 1 ) = row(s 2 ).Let e ∈ E be such that row(s 1 )(e) = row(s 2 )(e).We have As a corollary, we find that the canonical pomset recogniser is minimal.

Conversion to Pomset Automata
Bimonoids are a useful representation of pomset languages because sequential and parallel composition are on an equal footing; in the case of the learning algorithm of the previous section, this helps us treat both operations similarly.On the other hand, the behaviour of a program is usually thought of as a series of actions, some of which involve launching two or more threads that later combine.
Here, sequential actions form the basic unit of computation, while fork/join patterns of threads are specified separately.Pomset automata [21] encode this more asymmetric model: they can be thought of as non-deterministic finite automata with an additional transition type that brokers forking and joining threads.
In this section, we show how to convert a pomset recogniser to a certain type of pomset automaton, where acceptance of a pomset is guided by its structure; conversely, we show that each of the pomset automata in this class can be represented by a pomset recogniser.Together with the previous section, this establishes that the languages of pomset automata in this class are learnable.
If S is a set, we write M(S) for the set of finite multisets over S. A finite multiset over S is written φ = {|s 1 , . . ., s n | }.
Definition 32 (Pomset automata).A pomset automaton (PA) is a tuple A = Q, I, F, δ, γ where -Q is a set of states, with I, F ⊆ Q the initial and accepting states, and Lastly, for every q ∈ Q there are finitely many φ ∈ M(Q) such that γ(q, φ) = ∅.
A finite PA can be represented graphically: every state is drawn as a vertex, with accepting states doubly circled and initial states pointed out by an arrow, while δ-transitions are represented by labelled edges, and γ-transitions are drawn as a multi-ended edge.For instance, in Figure 1a, we have drawn a PA with states q 0 through q 5 with q 5 accepting, and q 1 ∈ δ(q 0 , a) (among other δ-transitions), while the multi-ended edge represents that q 2 ∈ γ(q 1 , {|q 3 , q 4 | }), i.e., q 2 can launch threads starting in q 3 and q 4 , which, upon termination, resume in q 2 .Fig. 1: Some pomset automata.
The sequential transition function is interpreted as in non-deterministic finite automata: if q ′ ∈ δ(q, a), then a machine in state q may transition to state q ′ after performing the action a.The intuition to the parallel transition function is that if q ′ ∈ γ(q, {|r 1 , . . ., r n | }), then a machine in state q may launch threads starting in states r 1 through r n , and when each of those has terminated succesfully, may proceed in state q ′ .Note how the representation of starting states in a γtransition allows for the possibility of launching multiple instances of the same thread, and disregards their order-i.e., γ(q, {|r 1 , . . ., r n | }) = γ(q, {|r n , . . ., r 1 | }).This intuition is made precise through the notion of a run.

Definition 33 (Run relation). The run relation of a PA
The language accepted by A is L A = {u ∈ SP : ∃q ∈ I, q ′ ∈ F. q u − →A q ′ }.
Example 34.If A is the PA from Figure 1a, we can see that q 3 b − →A q 5 and q 4 c − →A q 5 as a result of the second rule; by the third rule, we find that q 1 b c − − →A q 2 .Since q 2 a − → q 5 and q 0 a − →A q 1 (again by the second rule), we can conclude q 0 a•(b c)•a −−−−−→A q 5 by repeated application of the last rule.The language accepted by this PA is the singleton set {a • (b c) • a}.
In general, finite pomset automata can accept a very wide range of pomset languages, including all context free (pomset) languages [22].The intuition behind this is that the mechanism of forking and joining encoded in γ can be used to simulate a call stack.For example, the automaton in Figure 1b accepts the strictly context-free language (of words) {a n • b n : n ∈ N}.It follows that PAs can represent strictly more pomset languages than pomset recognisers.To tame the expressive power of PAs at least slightly, we propose the following.
Definition 35 (Saturation).We say that A = Q, I, F, δ, γ is saturated when for all u, v ∈ SP with u, v = 1, both of the following are true: Example 36.Returning to Figure 1, we see that the PA in Figure 1a is saturated, while Figure 1b is not, as a result of the run q 1 a•a•b•b − −−− →A q 4 , which does not admit an intermediate state q such that q 1 a•a − − →A q and q b•b − − →A q 4 .
We now have everything in place to convert the encoding of a language given by a pomset recogniser to a pomset automaton.The idea is to represent every element q of the bimonoid by a state which accepts exactly the language of pomsets mapped to q; the transition structure is derived from the operations.
The pomset automaton that arises from the construction above is partially depicted in Figure 2; we have not drawn the state q ⊥ and its incoming transitions, or forks into 1, to avoid clutter.In this PA, we see that, since q a ⊙ q 1 = q b and i(a) = q a , we have q 1 ∈ δ(q b , a).Furthermore, since (q b ȅ q b ) ⊙ 1 = q 1 ⊙ 1 = q 1 , we also have Fig. 2: Part of the PA obtained from the pomset recogniser from Example 7, using the construction from Lemma 37. The state q ⊥ (which does not contribute to the language of the automaton) and forks into the state 1 are not pictured.
We have thus shown that the language of any pomset recogniser can be accepted by a finite and saturated PA.In turn, this shows that our algorithm can, in principle, be adapted to work with a teacher that takes a (saturated) PA instead of a pomset recogniser as hypothesis, by simply converting the hypothesis pomset recogniser to an equivalent PA before sending it over.
Conversely, we can show that the transition relations of a saturated PA carry the algebraic structure of a bimonoid, and use that to show that a language recognised by a saturated PA is also recognised by a bimonoid.This shows that our characterisation is "tight", i.e., languages recognised by saturated PAs are precisely those recognised by bimonoids, and hence learnable.
Lemma 39.Let A = Q, I, F, δ, γ be a saturated pomset automaton.We can construct a pomset recogniser R = M, ⊙, ȅ, 1, i, F ′ , where Now ⊙ and ȅ are well-defined, and R is a pomset recogniser such that L R = L A .
If A is finite, then so is R, since each of the elements of M is a relation on Q, and there are finitely many relations on a finite set.
In general, the PA obtained from a pomset recogniser may admit runs where the same fork transition is nested repeatedly.Recognisable pomset languages of bounded width may be recognised by a pomset recogniser that is depthnilpotent [27], which can be converted into a fork-acyclic PA by way of an sr-expression [27,21].However, this detour via sr-expressions is not necessary: one can adapt Lemma 37 to produce a fork-acyclic PA, when given a depthnilpotent pomset recogniser.The details are discussed in Appendix C.
We conclude this section by remarking that the minimal pomset recogniser for a bounded-width language is necessarily depth-nilpotent [27]; since our algorithm produces a minimal pomset recogniser, this means that we can also produce a fork-acyclic PA after learning a bounded-width recognisable pomset language.

Discussion
To learn DFAs, there are several alternatives to the observation table data structure that reduce the space complexity of the algorithm.Most notable is the classification tree [24], which distinguishes individual pairs of words (which for us would be pomsets) at every node rather than filling an entire row for each of them.The TTT algorithm [18] further builds on this and achieves optimal space complexity.Given that we developed the first learning algorithm for pomset languages, we opted for the simplicity of the observation table-optimisations such as those analogous to the aforementioned work are left to future research.
We would like to extend our algorithm to learn recognisers based on arbitrary algebraic theories.One challenge is to ensure that the equations of the theory hold for hypotheses, by generalising our definition of associativity (Definition 13).
Our algorithm can also be specialised to learn languages recognised by commutative monoids.These languages of multisets can alternatively be represented as semi-linear sets [29] or described using Presburger arithmetic [11].While not all languages described this way are recognisable (for instance, the set of multisets over Σ = {a, b} with as many a's as b's [27]), it would be interesting to be able to learn at least the fragment representable by commutative monoids, and apply that to one of the domains where semi-linear sets are used.
Our algorithm is limited to learning languages of series-parallel pomsets; there exist pomsets which are not series-parallel, each of which must contain an "N-shape" [12,13,34].Since N-shapes appear in pomsets that describe message passing between threads, we would like to be able to learn such languages as well.We do not see an obvious way to extend our algorithm to include these pomsets, but perhaps recent techniques from [10] can provide a solution.
Every hypothesis of our algorithm can be converted to a pomset automaton.The final pomset recogniser for a bounded-width language is minimal, and hence depth-nilpotent [27], which means that it can be converted to a fork-acyclic PA.In future work, we would like to guarantee that the same holds for intermediate hypotheses when learning a bounded-width language.
Running two threads in parallel may be implemented by running some initial section of those threads in parallel, followed by running the remainder of those threads in parallel.This interleaving is represented by the exchange law [12,13].One can specialise pomset recognisers to include this interleaving to obtain recognisers of pomset languages closed under subsumption [27], i.e., such that if a pomset u is recognised, then so are all of the "more sequential" versions of u.We would like to adapt our algorithm to learn these types of recognisers, and exploit the extra structure provided by the exchange law to optimise further.
We have shown that recognisable pomset languages correspond to saturated regular pomset languages (Lemmas 37 and 39).One question that remains is whether there is an algorithm that can learn all or at least a larger class of regular pomset languages.Given that pomset automata can accept context-free languages (Figure 1b), we wonder if a suitable notion of context-free grammars for pomset languages could be identified.Clark [6] showed that there exists a subclass of context-free languages that can be learned via an adaptation of L ⋆ .Arguably, this adaptation learns recognisers with a monoidal structure and reverses this structure to obtain a grammar.An extension of this work to pomset languages might lead to a learning algorithm that learns more PAs.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material.If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
A Omitted Proofs about the Algorithm Lemma 20.SP, PC is a closed, consistent, and associative observation table.
Proof.Closedness holds trivially, and associativity follows from consistency.It remains to show that consistency holds.For ♥ ∈ {•, } we have for all s 1 , s 2 , t ∈ SP such that row(s 1 ) = row(s 2 ) and e ∈ PC that row(s and symmetrically we can show that Lemma 24 (Hypothesis reachability).Given a closed, consistent, and associative observation table S, E , the hypothesis it induces is reachable.In particular, i H ♯ (s) = row(s) for any s ∈ S.
Proof.By induction on the pomsets in S, using subpomset-closedness.We have Now assume for the inductive case that s 1 , s 2 ∈ S are such that i H ♯ (s 1 ) = row(s 1 ) and i H ♯ (s 2 ) = row(s 2 ).Then and similarly i H ♯ (s 1 s 2 ) = row(s 1 s 2 ).⊓ ⊔ Lemma 25.The hypothesis of any closed, consistent, and associative observation table S, E is S-compatible.
Proof.For all s ∈ S, we have Proof.Proof by induction on the elements of PC.For any a ∈ Σ trivially have i ♯ (a) = i ♯ (a), and similarly i ♯ (1) = i ♯ (1).Furthermore, The canonical pomset recogniser is minimal.
Proof.From Lemma 25 we know that the pomset recogniser accepts L, so by Theorem 30 it is minimal.
Proof.Our proof relies on the following property of A: Claim.q u − →A q ′ if and only if i ♯ (u) ⊙ q ′ = q.
Proof.For the direction from left to right, we proceed by induction on → A .In the base, there are two cases.On the one hand, suppose q u − →A q ′ is a trivial run, i.e., u = 1 and q = q ′ .We then calculate i ♯ (u) ⊙ q ′ = 1 ⊙ q ′ = q ′ = q.On the other hand, suppose q u − →A q ′ is a δ-run, which is to say that u = a for some a ∈ Σ with q ′ ∈ δ(q, a).It then follows that, i ♯ (u) ⊙ q ′ = i(a) ⊙ q ′ = q.
For the inductive step, there are again two cases to consider.
-Suppose that q u − →A q ′ is a composite run, i.e., that u = v • w and q ′′ ∈ M such that q v − →A q ′′ and q ′′ w − →A q ′ .By induction, we have i ♯ (v) ⊙ q ′′ = q and i ♯ (w) ⊙ q ′ = q ′′ .In total, we find that Fig. 3: Part of the PA obtained from the pomset recogniser from Example 6, using the construction from Lemma 44.Once more, the state q ⊥ and transitions into 1 are not pictured.
One can combine existing results to show that a language recognised by a depth-nilpotent pomset recogniser can also be accepted by a fork-acyclic PA: by [27, Theorem 3.9], every such language is series-rational, and by [21, Theorem 7.16], every series-rational language is recognised by a fork-acyclic PA.However, this detour is not necessary: we can convert a depth-nilpotent bimonoid to a fork-acyclic automaton directly, by adapting the construction from Lemma 37.
Proof (of Lemma 44).Our proof rests on the following two properties of A.
Proof.We proceed by induction on u.In the base, we have two cases.On the one hand, if u = 1, then i ♯ (u) = 1, and so the claim holds immediately.On the other hand, if u = a for some a ∈ Σ, then q ′ ∈ δ(q, a), whence q u − →A q ′ again.For the inductive step, there are two cases.
Proof.It suffices to validate the claim for the pairs that generate A .

⊓ ⊔
We return to the proof of Lemma 44.If u ∈ L A , then there exists a q ∈ F with q u − →A 1.By the same argument as in Lemma 37, i ♯ (u) ⊙ 1 = q-after all, the transitions of the new automaton are also transitions of the old automaton.Hence i ♯ (u) ∈ F , meaning u ∈ L R .Conversely, if u ∈ L R , then i ♯ (u) = 0 by depth-nilpotency, and by the first claim above we have q u − →A 1, meaning u ∈ L A .For fork-acyclicity, let q, r, s ∈ M be such that γ(q, {|r, s| }) = ∅.In that case, D R (r) < D R (q) by definition; thus q A r cannot hold, because that would imply D R (r) ≤ D R (r) by the second claim above.

⊓ ⊔
The converse also exists in the literature: the language of a fork-acyclic PA can also be recognised by a depth-nilpotent pomset recogniser, because any fork-acyclic automaton can be converted to an equivalent sr-expression [21, Theorem 8.4], and every sr-expression can in turn be converted to a depth-nilpotent pomset recogniser [27,Theorem 3.9].A direct construction is also possible, if we use the techniques from [21] to show that every fork-acyclic PA can be converted to an equivalent fork-acyclic PA that is also saturated, and reuse the construction from Lemma 39; we omit this proof for the sake of brevity.

Example 19 (0 a 1 aa 1 a a 1
Processing a counterexample).Consider L = {a, aa, a a}, and let S = {1, a} and E = { }.This induces a closed, sharp, and associative table1Suppose an equivalence query on its pomset recogniser, which rejects only the empty pomset, gives counterexample z = a a aa.We may decompose z as ( aa)[a a], where a a ∈ S + \ S. Because row(a a) = row(a), ( aa)[a] = a aa, and a aa ∈ L ⇐⇒ z ∈ L, we update z = a aa and repeat the process.Now we decompose z = (a )[aa].Since row(aa) = row(a), (a )[a] = a a, and a a ∈ L ⇐⇒ z ∈ L, we finish by adding a to E.
A non-saturated PA.