IDL-PMCFG, a Grammar Formalism for Describing Free Word Order Languages

We introduce Interleave-Disjunction-Lock parallel multiple context-free grammars (IDL-PMCFG), a novel grammar formalism designed to describe the syntax of free word order languages that allow for extensive interleaving of grammatical constituents. Though interleaved constituents, and especially the so-called hyperbaton, are common in several ancient (Classical Latin and Greek, Sanskrit...) and modern (Hungarian, Finnish...) languages, these syntactic structures are often difficult to express in existing formalisms. The IDL-PMCFG formalism combines Seki et al.'s parallel multiple context-free grammars (PMCFG) with Nederhof and Satta's IDL expressions. We define the semantics of IDL-PMCFGs and study their expressivity, proving that IDL-PMCFG extends both PMCFG and IDL-CFG (context-free grammars equipped with IDL expressions) and that IDL-PMCFG parsing is NP-hard. We then introduce COMPĀ, a programming language extending Ranta's Grammatical Framework (GF) and built as a high-level front-end to IDL-PMCFG for practical grammar development. We present a parsing algorithm for IDL-PMCFG inspired by earlier Earley-style PMCFG parsing algorithms and by Nederhof and Satta's IDL graphs, and give a worst-case estimate of its complexity as a function of several metrics on IDL expressions, the size of the input, and a new notion of the G-density of a language.


The Challenge of Free Word Order
Since Kashket's (1986) seminal contribution, developing models and parsing techniques for free word order languages has been an ongoing challenge for computational linguists. Whilst free word order phenomena are largely absent from modern Western languages such as English, they are frequent in ancient Indo-European languages such as Sanskrit (Schaufele 1991), Greek and Latin (Conrad 1965; Devine and Stephens 2006; Spevak 2010), in Finno-Ugric languages such as Hungarian (Kiss 1981) or Finnish (Kay and Karttunen 1984), but also in Australian (Kashket 1986; Austin 2001), Turkic (Hoffman 1995), and, to a certain extent, Slavic (Siewierska and Uhlirova 1998) and Germanic (Reape 1994) idioms. In morphologically rich languages, a certain level of word order freedom is generally present, ranging from a simple relaxation of linear ordering constraints to genuine non-configurationality. Additionally, loosening of common word order constraints is a frequent feature of literary, especially metrical, texts, in which prosodic, stylistic and expressive factors favor alternative and unusual word orderings. At this point, it is worth mentioning that even the notion of free word order is, in itself, rather imprecise. Three different phenomena are generally qualified as such: (i) freedom in the linear reordering of grammatical constituents, as in Today I walk-I walk today; (ii) discontinuous constituents that may span a whole sentence; this common feature of e.g. German can also be demonstrated with English phrasal verbs in sentences such as I checked this out; (iii) hyperbaton, i.e. the interleaving of grammatical constituents, as frequently occurs for instance in Classical Latin: cetera labuntur celeri caelestia motu 1 ('the other heavenly [bodies] move quickly', litt. 'the-other move quick heavenly movement'). This last and, by most aspects, most complex phenomenon produces crossing dependencies between constituents.
Classical Latin, which provides innumerable examples of this, will serve as a reference for further investigation, but similar patterns can also be exhibited in Ancient Greek, Sanskrit, Old Norse, Slavic and Finno-Ugric languages, among others.
Context-free grammars (CFGs), introduced by Noam Chomsky in the 1950s, can be considered the de facto baseline of most generative grammar formalisms in both computer science and linguistics. Nevertheless, CFGs, unlike many dependency grammar formalisms, turned out to be unable to describe certain syntactic phenomena occurring in the grammar of natural languages, especially those involving free constituent order or discontinuous constituents. These limitations fostered the development of new, non context-free formalisms better suited to describing natural language: indexed grammars (Aho 1968), immediate dominance/linear precedence grammars (ID/LP) (Pullum 1982; Shieber 1984), tree-adjoining grammars (TAG) (Vijayashanker and Joshi 1988), parallel multiple context-free grammars (PMCFG) (Seki et al. 1991), affix grammars over a finite lattice (AGFL) (Koster 1991), positive range concatenation grammars (PRCG) (Boullier 1998), among others. Following Chomsky (1956), these formalisms can all be classified as Type-1 grammars, and the languages they generate are generally referred to as context-sensitive. Most of the effort focussed on the development of so-called mildly context-sensitive formalisms. A complete survey of the most common non context-free formalisms and their use in computational linguistics can be found in Kallmeyer (2010).
With rather strict word order languages accounting for a significant part of the available digital corpora and potential application fields, computational linguists, many of whom are native speakers of one of these idioms, may have been tempted to address the grammatical modelling of free word order languages with tools chiefly designed to describe English or similar languages. These tools rarely integrate an operator allowing for arbitrary constituent order, let alone for interleaving, since such operators come with a high computational cost that can, and should, be avoided whenever the language being parsed does not require them. This is especially true as regards multilingual parsing, translation or text generation systems that added support for some of the above languages at a later stage of their development. Of all the grammatical formalisms listed above, only ID/LP can easily encode hyperbaton, but it does not provide support for discontinuous constituents.
Note that this paper does not make a theoretical claim that none of the existing mildly context-sensitive formalisms is expressive enough, from a theoretical viewpoint, to encode free word order phenomena observed in natural languages. There are in fact good reasons to think that some of them are. In practice, if we assume that interleaving phenomena always have a finite depth, we can encode hyperbatic phenomena through a finite, yet exponential, number of context-free rules; recent theoretical results (Ho 2018) have shown that even without a finite-depth assumption, hyperbaton without copy is still mildly context-sensitive. What this paper does observe, however, is that we lack a general grammar description framework with built-in support for free word order phenomena, in which describing e.g. Classical Latin syntax requires neither an exponential inflation in the number of rules compared to the fixed word order case nor a complex conversion process. We lack a framework that would allow us to describe free word order syntax as linguists or grammarians would do, e.g. by defining single attachment rules that do not necessarily impose ordering constraints.
Early attempts to design grammatical formalisms for free word order languages have not led to the development of general-purpose tools; nor were they designed to provide cross-lingual interoperability with fixed word order languages. Covington's (1990) approach, whose applications to parsing a "tiny subset of Latin" were explored by Koch (1993), relies on dependency rather than phrase structure grammar, which both authors consider less suited to addressing free word order phenomena. Dependency- and constraint-based methods have also been implemented by Bharati and Sangal (1993) for Indian languages, building on notions from Pāṇinian grammar. Though the underlying dependency relations between words are indeed the real issue when describing the syntax of free word order languages, we do not believe that this point of view should be deemed irreconcilable with the traditional structured approaches to grammar writing, which involve clear-cut constituents.
We are indeed looking for a formalism that would allow us to conveniently describe the syntax of free word order languages, and that could be used to produce wide-coverage, modular grammars in the style of the Resource Grammar Library. In addition to providing native support for free word order languages, the new framework should still be able to encode standard fixed order rules; ideally, it would be built as a "free word order extension" of an existing framework, in order to capitalize on past efforts and guarantee compatibility with existing fixed word order grammars. A new formalism fulfilling these requirements, which we will introduce and study in Sect. 2, is called Interleave-Disjunction-Lock parallel multiple context-free grammars, or IDL-PMCFG.
Another essential factor to take into account when designing a grammatical formalism is its suitability for the practical implementation of wide-coverage grammars. One way to ensure that users can easily define and use their own grammar models is to provide a complete front-end syntax for grammatical description in the form of a special-purpose programming language. In this regard, we built on Ranta's (2011) Grammatical Framework (GF) and Nederhof and Satta's (2004) IDL expressions to elaborate our own grammar description system, COMPĀ, whose syntax extends so-called context-free GF (Ljunglöf 2004) with new operators to encode interleaving, disjunction and locking of constituents. High-level COMPĀ code is compiled into a low-level IDL-PMCF grammar that can be used directly for parsing. COMPĀ and its compiler are introduced in Sect. 3; the parsing algorithm itself is presented and studied in Sect. 4.
Before we proceed with the description of our formalism, a short look at the precise linguistic facts behind extensive word order freedom will help us identify the exact features we are looking for.

Towards a Natural Account of Free Word Order Syntax: The Case of Classical Latin
A language with considerable freedom of word order, Classical Latin presents many syntactic phenomena alien to most modern Western European languages. By looking at a few typical aspects of Latin syntax, we shall see in this section which kinds of features our desired framework should have in order to concisely encode the syntactic phenomena at play in free word order languages in general, and in Classical Latin in particular.

Hyperbaton and Interleaved Constituents
As Devine and Stephens (2006) put it, "[p]hrasal discontinuity, traditionally called hyperbaton in Classical studies, is perhaps the most distinctively alien feature of Latin word order". Hyperbaton is a very general, transcategorial phenomenon that can occur whenever a syntactic constituent is non-contiguous. Danckaert (2017) emphasizes that modern research has shifted away from the opposition of regular vs. exceptional word orders as it is found for example in Marouzeau (1922); still, recent transformational approaches have relied on some kind of default word order to distinguish emphatic from non-emphatic word orders. This may be totally justified when pragmatic information is available, provided that, in the words of Devine and Stephens (2006), "[t]he syntax is massaged to provide for a simple and direct translation into a pragmatically structured meaning". Unfortunately, such information is generally not available in usual parsing contexts. In particular, the ante- or postposition of adjectives and genitive modifiers in Classical Latin does not obey general syntactic rules (Devine and Stephens 2006). Statistical patterns may vary from word to word, and rarely spread uniformly over whole semantic lexical categories. Not surprisingly, discontinuous adjective and genitive attachment represents an overwhelming majority of all instances of hyperbata. In verse, where discontinuity is the standard rather than the exception, Conrad (1965) has shown it to be a characteristic feature of a long Greco-Roman poetic tradition dating back to the oral tradition of Homeric times, influenced by the Roman taste for phenomena such as the clash of ictus and accent on the fourth foot of the hexameter. Latin poets made such an extensive use of the device that in Horace, we find stanzas with three crossing attribute dependencies.
2 In this context, there is no reason to deny hyperbaton its status as a standard, independent feature of Classical Latin; as parsing systems do not have access to pragmatic information, and since hyperbaton is extremely common even in simple prose, we need to be able to formulate general adjective attachment rules that, within the clause, relax all constraints on both linear order and the intervention of other constituents.

Locking of Clauses and Prepositional Phrases
One seemingly absolute constraint on word reordering in Classical Latin concerns the impossibility of so-called 'long hyperbata' between finite clauses. 'Long hyperbata' are defined in Devine and Stephens (2006) as hyperbata that involve the extraction of a word from one clause to another; 'short hyperbata', on the other hand, are hyperbata that allow for interleaving words only within the bounds of a given clause. We must be able to express that finite clauses generally need to be 'locked', i.e. protected against interleaving with other clauses.
We only say 'generally', since verse texts provide well-known counter-examples to this rule, 3 showing that mixing of material from different finite clauses was not altogether impossible in poetic contexts. Moreover, it must be noted that this general exclusion of long hyperbata in finite clauses does not generalize to non-finite (infinitive and participial) clauses, which can be freely interleaved.
Another important issue, especially in verse, is that of the position of the subordinator not at the beginning, but within the clause, which has been extensively studied by Marouzeau (1949). Bortolussi (2006) has emphasized the high frequency of occurrence and expressive value of this so-called traiectio, which leads to the subordinator appearing (at least) second in the clause. Yet, an almost absolute rule that opposes rightward movement of subordinators is that a subordinator cannot stand last within the clause it introduces. Therefore, we still need to be able to restrict (linear) freedom of word order in certain cases.
Finally, another instance of locking with an additional constraint on word order occurs in the context of prepositional phrases: while all but one element of the prepositional phrase may be arbitrarily interleaved within the clause, at least one element (not necessarily the head) must be placed directly after the preposition. To account for this type of syntactic limitation, a combination of mostly free word order with targeted locking and linear constraints is again required.

Multiple Fields and Features
General-purpose grammar description systems such as Grammatical Framework (Ranta 2004) make extensive use of records and fields in order to store the various forms of a word, to keep track of the parts of discontinuous grammatical constituents, or to handle reduplication phenomena. As our goal is to describe Classical Latin syntax as generally as possible, and as we may want to keep some interoperability with existing frameworks, records and fields are required in practice.
There is, however, no obvious reason why we should require copying to be available in order to describe Classical Latin. Allowing copy in our framework can be desirable in order to account for specific syntactic phenomena in other natural languages (see below), because copying is a general phenomenon in language (Kobele 2006), or to preserve compatibility with existing tools such as Grammatical Framework. But this is a design decision independent of the specific characteristics of Latin syntax: its goal is not to stick closely to the formal requirements of Latin, but rather to preserve some general linguistic expressiveness. As it makes sense to think of a new framework as having to match the needs of free word order languages in general, and not of Classical Latin exclusively, we will allow copy operations in our formalism.

Summary
The above discussion suggests five characteristics that a grammatical formalism should have in order to enable a straightforward description of the syntax of free word order languages such as Classical Latin: operators to interleave grammatical constituents, to lock phrases and to restrict reorderings of constituents; a record and field system; and, finally, and maybe less importantly, support for copy operations.
Notations We will use the following conventions:
- Symbol ε denotes the empty word, while ◊ ('diamond') is a special symbol;
- All alphabets Σ used in this paper are assumed not to contain the symbols ε and ◊;
- For (a, b) ∈ N², [[a, b]] denotes the integer interval {a, a + 1, …, b};
- Rules in grammars are written using the following functional notation: A_1 → ⋯ → A_n → B : a_1, …, a_n → e, which reads "given an item a_1 of category A_1, …, an item a_n of category A_n, an item of category B can be produced which is equal to expression e"; expression e depends on the current formalism but will usually contain instances of a_1, …, a_n.

IDL Expressions
IDL (Interleave-Disjunction-Lock) expressions, introduced by Nederhof and Satta (2004), are a family of regular expressions tailored to describe and parse natural language sentences. Since they do not allow for the use of nonterminal symbols, Nederhof and Satta's original IDL expressions are no grammars and can therefore only be used to describe specific (finite) families of utterances; a single IDL expression cannot encode a complex language model. However, they already include everything needed to account for free constituent order, hyperbata and their respective limitations. The definitions below closely follow those of the original paper.
Note that, unlike usual regular expressions dealing with character strings, IDL expressions used in typical computational linguistics applications use an alphabet Σ composed of full words (tokens), which are to be combined into grammatical constituents and sentences. Therefore, throughout this document, the word string should be understood as a shortcut for 'token list', and the words set of strings as a shortcut for 'set of token lists'. The informal semantics of the constructors, which all act on sets of strings, is as follows:
- The dot · represents standard concatenation;
- Disjunction ∨ has its usual semantics as set union;
- The interleave operator || allows for arbitrarily mixing the tokens contained in n strings, as long as the relative ordering within each initial string is preserved in the final string;
- The lock operator × prevents a string from being divided into several substrings by an instance of the interleave operator.
Since comb produces all possible interleavings of two input strings, it is clearly associative and commutative. We can thus see comb as an n-ary operator for all n, and write comb_{i=1}^{n} a_i := comb(a_1, comb(a_2, … comb(a_{n−1}, a_n) … )).
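Concretely, comb can be sketched in a few lines of Python. The function name and the token-tuple representation below are our own; this is an illustration of the operator, not the paper's implementation:

```python
def comb(xs, ys):
    """Return the set of all interleavings of token lists xs and ys,
    preserving the relative order of tokens inside each list."""
    xs, ys = tuple(xs), tuple(ys)
    if not xs:
        return {ys}
    if not ys:
        return {xs}
    # Either the next token comes from xs, or it comes from ys.
    return ({(xs[0],) + rest for rest in comb(xs[1:], ys)} |
            {(ys[0],) + rest for rest in comb(xs, ys[1:])})

# C(4, 2) = 6 interleavings of ("a", "b") and ("c", "d")
interleavings = comb(("a", "b"), ("c", "d"))
```

Swapping the two arguments yields the same set, and folding comb over a list of strings in any nesting order gives the same result, which is what justifies the n-ary notation above.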
We can now define the language of an IDL expression, which exactly matches the mechanics exposed and demonstrated above.

Definition 3 (Language of an IDL expression) Let Σ be an alphabet. For any IDL expression e over Σ, language L(e) is given by L(e) := lock(σ(e)), where lock erases every occurrence of ◊ from every string of its argument, and where for any IDL expression e over Σ, σ(e) is a subset of (Σ ∪ {◊})* that is defined inductively as follows:
- σ(ε) = {ε} and σ(a) = {a} for every a ∈ Σ;
- σ(e_1 · e_2) = {x ◊ y | x ∈ σ(e_1), y ∈ σ(e_2)};
- σ(∨(e_1, …, e_n)) = σ(e_1) ∪ ⋯ ∪ σ(e_n);
- σ(||(e_1, …, e_n)) = comb_{i=1}^{n} σ(e_i), where comb interleaves the ◊-delimited factors of its arguments;
- σ(×(e)) = lock(σ(e)).

To see why IDL expressions are well-suited to describe grammatically valid reorderings of utterances in free word order languages, consider the following example from Latin: Marcus cum amico caro ambulat ('Marcus walks with his dear friend', litt. 'Marcus with friend dear walks'). For a permutation of the five above words to be considered valid in Classical Latin verse, the only condition to be met is that cum ('with') must stand immediately before either amico ('friend', ablative singular) or caro ('dear', ablative singular masculine). Besides this single constraint, the order of constituents is free, and the verb modifier cum amico caro might even be discontinuous. This means that even heavily reordered utterances such as amico Marcus ambulat cum caro should be considered grammatical, as similar structures are, indeed, well documented. Now, this seemingly unusual syntactic constraint is surprisingly easy to express in terms of an IDL expression:

||(Marcus, ambulat, ∨(×(cum · amico) || caro, ×(cum · caro) || amico))

Of course, IDL expressions alone cannot provide much more than ad-hoc solutions for a set of specific utterances. In order for their expressive power to be used for general language description, they must be integrated into a complete grammatical formalism.
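A toy denotational interpreter helps make the diamond mechanics concrete. In the sketch below, all names (tok, cat, disj, lock, ilv, lang) are ours, and the representation is an assumption of this illustration: a string is a tuple of blocks, and block boundaries play the role of the diamond, so the interleave operator may only cut a string at a boundary, concatenation keeps a boundary between its operands, and lock fuses everything into a single block.

```python
from itertools import chain

# A "string" is a tuple of blocks; a block is a tuple of tokens.
# Block boundaries stand for diamonds: interleaving may only cut there.

def tok(w):
    """Single terminal: one string made of one block."""
    return {((w,),)}

def cat(a, b):
    """Concatenation: keeps a boundary (a diamond) between operands."""
    return {x + y for x in a for y in b}

def disj(*args):
    """Disjunction: plain set union."""
    return set().union(*args)

def lock(a):
    """Lock: fuse all blocks of every string into a single block."""
    return {(tuple(chain.from_iterable(x)),) for x in a}

def comb2(x, y):
    """All interleavings of two block sequences."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({(x[0],) + r for r in comb2(x[1:], y)} |
            {(y[0],) + r for r in comb2(x, y[1:])})

def ilv(*args):
    """n-ary interleave, folding comb2 over the argument sets."""
    res = {()}
    for a in args:
        res = {r for x in res for y in a for r in comb2(x, y)}
    return res

def lang(a):
    """Erase the boundaries to read off plain token strings."""
    return {tuple(chain.from_iterable(x)) for x in a}

# The Latin example: cum must stand immediately before amico or caro.
M, V, cum, am, car = (tok(w) for w in
                      ("Marcus", "ambulat", "cum", "amico", "caro"))
e = ilv(M, V, disj(ilv(lock(cat(cum, am)), car),
                   ilv(lock(cat(cum, car)), am)))
sentences = lang(e)
```

Of the 120 permutations of the five words, exactly 48 keep cum immediately before amico or caro, and these are exactly the token strings the expression produces.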

Parallel Multiple Context-Free Grammars
Parallel multiple context-free grammars (PMCFG) were introduced by Seki et al. (1991) precisely to deal with discontinuous constituents in natural language; their definition is given below. Parallel multiple context-free grammars extend context-free grammars by manipulating tuples of strings instead of strings. Each category of a PMCFG is assigned a dimension (a tuple size). Every production consumes a number of named argument tuples of fixed categories and produces a new tuple. Each element of this tuple is the concatenation of an arbitrary number of terminals and (nonterminal, index) pairs, each of which uniquely identifies a field of one of the arguments. The start category, usually denoted by S, defines the grammar's language, and therefore has dimension 1.
We have the following typical example (Lemma 5): language L_{3n} := {a^n b^n c^n | n ∈ N} is a PMCFL.

Proof
The following PMCFG grammar matches L_{3n}:

S → T : t → t[0] t[1] t[2]
T → T : t → ⟨a t[0], b t[1], c t[2]⟩
T : ⟨ε, ε, ε⟩

where, following usual programming conventions, we start indexing at 0 and write (nonterminal, index) pairs as nonterminal[index].
How does this grammar define the above language? First, it states that a (one-dimensional) tuple of type S can be produced from a (three-dimensional) tuple t of type T by concatenating the three fields of t; then, that a tuple of type T can be produced in either of two ways: either it is generated from another tuple t of type T by prepending a, b and c respectively to each of the three fields, or it is equal to ⟨ε, ε, ε⟩. It is straightforward to see that T matches exactly the tuples ⟨a^n, b^n, c^n⟩, thus yielding the expected behavior for S.
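The tuple mechanics of this grammar can be mimicked directly in Python (function names are ours; this is a sketch of the three productions, not a general PMCFG engine): t_base and t_step implement the two T-productions, and s_of implements the S-production.

```python
def t_base():
    """The axiom T -> <eps, eps, eps>."""
    return ("", "", "")

def t_step(t):
    """The production T -> T: prepend a, b, c to the three fields."""
    return ("a" + t[0], "b" + t[1], "c" + t[2])

def s_of(t):
    """The production S -> T: concatenate the three fields."""
    return t[0] + t[1] + t[2]

def L3n(max_n):
    """All words of L_3n with at most max_n copies of each letter."""
    out, t = set(), t_base()
    for _ in range(max_n + 1):
        out.add(s_of(t))
        t = t_step(t)
    return out
```

Each application of t_step grows every field by one symbol, so after n steps s_of yields the word a^n b^n c^n.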
The formal definition of PMCFG, slightly adapted from Seki et al. (1991), is as follows:

Definition 4 (PMCF grammar) A PMCF grammar (or PMCFG) is a sextuple G = (N, δ, Σ, F, P, S) where
1. N is a finite set of nonterminal symbols (also called categories);
2. δ : N → N maps each nonterminal symbol A to its dimension δ(A);
3. Σ is a finite set of terminal symbols disjoint from N;
4. F is a finite set of functions such that for all f ∈ F, there exist an integer a(f) ∈ N, called the arity of f, a(f) integers d_1(f), …, d_{a(f)}(f) encoding the dimensions of the a(f) arguments of f, and an integer r(f) encoding the dimension of the image of f, such that the signature of f is (Σ*)^{d_1(f)} × ⋯ × (Σ*)^{d_{a(f)}(f)} → (Σ*)^{r(f)};
5. For any f ∈ F, letting ρ := r(f), f is of the form ⟨s_1, …, s_{a(f)}⟩ → ⟨α_{11} s_{β_{11}}[γ_{11}] α_{12} s_{β_{12}}[γ_{12}] ⋯ α_{1δ_1}, …, α_{ρ1} s_{β_{ρ1}}[γ_{ρ1}] α_{ρ2} s_{β_{ρ2}}[γ_{ρ2}] ⋯ α_{ρδ_ρ}⟩, where all the δ_i, β_{ij} and γ_{ij} are integers with β_{ij} ≤ a(f) and γ_{ij} ≤ d_{β_{ij}}(f), and α_{ij} ∈ Σ* for all i, j. In other terms, every component of the tuple produced by f is obtained by arbitrary concatenation of symbols from Σ and components of f's arguments;
6. For q ∈ N, let F_q denote the subset of all functions of arity q in F;
7. P, called the set of productions or rules, is a finite subset of ⋃_{q ∈ N} F_q × N^{q+1} such that for all q ∈ N and ⟨f, A_1, …, A_{q+1}⟩ ∈ P, we have d_k(f) = δ(A_k) for every k ∈ {1, …, q} as well as r(f) = δ(A_{q+1}), i.e. the dimensions of the arguments (resp. of the image) of f match the dimensions of the categories on the left (resp. right) side of the production;
8. S ∈ N is the start symbol, of dimension δ(S) = 1.
Note that according to the previous definition, a PMCFG G = (N, δ, Σ, F, P, S) such that δ(N) = {1} (i.e. a PMCFG whose categories all have dimension 1) is exactly a CFG.
Finally, we define the language of a parallel multiple context-free grammar as follows. We define a derivation relation → between nonterminals and tuples of strings inductively: A → ⟨t_1, …, t_{δ(A)}⟩ if, and only if, there exists a production ⟨f, A_1, …, A_{a(f)}, A⟩ ∈ P and strings (s_{ij}), i ≤ a(f), j ≤ d_i(f), such that the two following conditions are met:
1. A_i → ⟨s_{i1}, …, s_{i d_i(f)}⟩ for every i ≤ a(f);
2. ⟨t_1, …, t_{δ(A)}⟩ = f(⟨s_{11}, …, s_{1 d_1(f)}⟩, …, ⟨s_{a(f) 1}, …, s_{a(f) d_{a(f)}(f)}⟩).
The language recognized by G is defined as L(G) = {s ∈ Σ* | S → ⟨s⟩}, and we call PMCFL (resp. CFL) the set of all languages that are recognized by at least one PMCFG (resp. CFG).
Let us give an example of how PMCFG extends the expressivity of CFG, using the following well-known result: language L_{3n} is not context-free.

Proof This is a classical result whose proof (usually using the pumping lemma) can be found e.g. in Hopcroft et al. (2013).
This lemma, combined with Lemma 5, results in the following strict inclusion: Proposition 1 CFL ⊊ PMCFL.
Through its use of tuples, PMCFG provides a handy way to handle discontinuous constituents. Various parts of the linearization of a constituent can be stored in different fields, and later on integrated into a larger phrase. Since the same argument can appear an arbitrary number of times in the right hand side of any production, general PMCFGs can also define reduplication phenomena as encountered e.g. in short Swiss German verbs (Lötscher 1993), Indonesian plurals (Dalrymple and Mofu 2011) or Telugu distributives (Balusu 2006). The formalism has proved efficient as a parsing front-end for context-free GF (Ljunglöf 2004;Angelov et al. 2009;Ljunglöf 2012). Nevertheless, expressing free order of constituents or interleaving of constituents is not easy in PMCFG. Until 2018, it was not even known whether this was possible. Although the answer is now known to be positive (Ho 2018), there is still no convenient way to concisely express the interleaving of groups, since PMCFG lacks a specific operator for this type of reordering. One way to overcome this difficulty is to define all legal orderings manually and pass them as arguments to the corresponding rule; this technique has been demonstrated by Lange (2017) in the case of Latin.

Bringing Together IDL Expressions and PMCFG: IDL-PMCFG
The conclusions of the last two subsections suggest combining IDL expressions and parallel multiple context-free grammars into a single formalism that can handle discontinuous constituents and copy (as does PMCFG) as well as free constituent order and hyperbaton (as do IDL expressions). This formalism is IDL-PMCF grammars (IDL-PMCFG), whose definition is given below. Although the complexity of the membership problem is polynomial for both IDL expressions and PMCF grammars, this is not the case for their combination: we will see in this subsection that parsing IDL-PMCF grammars is NP-hard, which, as an important corollary (Theorem 1), implies that IDL-PMCFG provides a strict extension of PMCFG unless P = NP.
In IDL-PMCF grammars, productions are defined not as concatenations, but as IDL expressions of terminals and (nonterminal, index) pairs. Instead of tuples of strings, tuples of sets of strings are now the basic data type manipulated by the various rules. The ◊ symbol, which marks positions at which new words can be interleaved into the current string, is added to the alphabet.
Definition 6 (IDL-PMCF grammar) An IDL-PMCF grammar (or IDL-PMCFG) is a sextuple G = (N, δ, Σ, F, P, S) whose components are as in Definition 4, with the following two differences. First, every f ∈ F now operates on tuples of sets of strings over Σ ∪ {◊}: there exist an arity a(f) ∈ N, integers d_1(f), …, d_{a(f)}(f) encoding the dimensions of the a(f) arguments of f, and an integer r(f) encoding the dimension of the image of f, such that the signature of f is P((Σ ∪ {◊})*)^{d_1(f)} × ⋯ × P((Σ ∪ {◊})*)^{d_{a(f)}(f)} → P((Σ ∪ {◊})*)^{r(f)}. Second, letting ρ := r(f) and letting the X_{ij} (for i ≤ a(f) and j ≤ d_i(f)) be fresh variable symbols, f is of the form ⟨s_1, …, s_{a(f)}⟩ → ⟨e_1, …, e_ρ⟩, where each e_k is an IDL expression over Σ ∪ {X_{ij}}. Function f now produces a set of tuples of length r(f), which are derived in three steps: each variable X_{ij} is replaced by a string picked from the j-th component of the i-th argument; the resulting IDL expressions are evaluated using the σ function of Definition 3; and the resulting sets of strings are collected into tuples. As before, the dimensions of the arguments (resp. of the image) of f match those of the categories on the left (resp. right) side of the production, and S ∈ N is the start symbol, of dimension δ(S) = 1.
IDL context-free grammars are of essential theoretical interest. As we come to evaluate the expressivity gain obtained by replacing simple concatenation with IDL expressions, it will be important to single out the respective contributions of the PMCFG formalism and of IDL expressions to the extension of the class of languages that can be described by IDL-PMCFGs. Hence, comparing the expressivity of PMCFG to that of IDL-(PM)CFG, as we will do in Sect. 2.4, shall inform us more thoroughly about the complementarity of the two approaches we combined.
Finally, the language matched by a given IDL-PMCFG can now be defined. We define a big-step derivation relation → between nonterminals and tuples of sets of strings over Σ ∪ {◊} inductively as follows: A → ⟨t_1, …, t_{δ(A)}⟩ if, and only if, there exists a production ⟨f, A_1, …, A_{a(f)}, A⟩ ∈ P and sets of strings (s_{ij}), i ≤ a(f), j ≤ d_i(f), such that the two following conditions are met:
1. A_i → ⟨s_{i1}, …, s_{i d_i(f)}⟩ for every i ≤ a(f);
2. ⟨t_1, …, t_{δ(A)}⟩ = f(⟨s_{11}, …⟩, …, ⟨s_{a(f) 1}, …⟩).
Finally, we use the lock function introduced in Definition 3 to define the language recognized by G as L(G) = {s ∈ Σ* | S → ⟨t⟩ for some t with s ∈ lock(t)}, and call IDLPMCFL (resp. IDLCFL) the set of all languages that are recognized by at least one IDL-PMCFG (resp. IDL-CFG).
Note that the ◊ symbols are only erased from the output in the last step of the definition of L(G), after the set of all strings that can be derived from the start symbol S has been retrieved. If they had been erased each time a production was used, interleaving constituents that are not arguments of the same rule would have been impossible, and constituents obtained from any production would have been locked.
Let us give an example of this. Consider the following IDL-CFG G_abcd on Σ = {a, b, c, d} (when describing IDL-CFG grammars, we shall omit the [0] indexes identifying the first field of every argument):

S → X → Y : x, y → x || y
X : a · b
Y : c · d

Clearly, L(G_abcd) = {abcd, acbd, acdb, cabd, cadb, cdab}. Consider for instance the string acbd. For its derivation to be possible, the diamonds in a◊b and c◊d are still required; otherwise, we would have comb({ab}, {cd}) = {abcd, cdab}, which does not contain acbd. Keeping the diamonds in place until the end of the derivation process is therefore essential.
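The role of the diamonds can be checked mechanically. In the small sketch below (names are ours), a derived string is a tuple of blocks and a diamond is a block boundary; with the boundaries in place, interleaving a◊b with c◊d yields all six strings of L(G_abcd), while erasing the diamonds too early leaves only two orderings.

```python
def comb2(x, y):
    """All interleavings of two block sequences (a block is a locked
    group of symbols; block boundaries stand for diamonds)."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({(x[0],) + r for r in comb2(x[1:], y)} |
            {(y[0],) + r for r in comb2(x, y[1:])})

def flat(seqs):
    """Forget the boundaries and read off plain strings."""
    return {"".join("".join(block) for block in s) for s in seqs}

# With diamonds kept: a-b and c-d each consist of two blocks.
with_d = flat(comb2((("a",), ("b",)), (("c",), ("d",))))
# With diamonds erased too early: ab and cd are single locked blocks.
without_d = flat(comb2((("ab",),), (("cd",),)))
```

Running this confirms the counts claimed above: six strings with the diamonds, two without.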
The following result comes "for free", since IDL expressions generalize plain concatenation:

Proposition 2 CFL ⊆ IDLCFL and PMCFL ⊆ IDLPMCFL.

We finally give a simple example of how this can be used to implement a very simple grammar. Suppose that we want to encode a small subset of Latin that contains sentences composed of a final verb and an optional subject. This subject is a noun phrase, i.e. a noun to which an arbitrary number of optional adjectives may be attached. The following IDL-CFG describes exactly this:

S → NP → V : np, v → np · v
S → V : v → v
NP → N : n → n
NP → NP → A : np, a → np || a

Remark that while we used the || operator for building a new NP from an NP and an adjective (meaning that the adjective is allowed to appear either before, within or after the NP it is attached to), we resorted to simple concatenation for building a sentence from an NP and a verb, as we want the verb to appear at the end of the sentence, after the subject NP.

Expressivity
We shall now investigate the expressivity of IDL-(PM)CFGs and try to locate the corresponding class of languages within the hierarchy of polynomial languages. Recall the following series of inclusions:

CFL ⊊ TAL ⊊ PMCFL ⊊ PRCL = P

where:
- CFL is the class of context-free languages;
- TAL is the class of tree-adjoining languages (Vijayashanker and Joshi 1988);
- PMCFL is the class of parallel multiple context-free languages;
- PRCL is the class of positive range concatenation languages (Boullier 1998);
- P is the class of languages recognizable in polynomial time.
The main contribution of this subsection is a proof that IDL-PMCFG can be located strictly above PRCG in the hierarchy (Theorem 1). We also show that IDL-CFGs are strictly more expressive than TAGs and not more expressive than PMCFGs, and raise the question of whether IDLCFL ⊂ PMCFL as a natural generalization of a recently solved classification problem.
We first observe that IDL-CFG allows us to define the nMIX language family in a very compact manner:

Proposition 3 For all n ∈ N⁺, the nMIX language defined as nMIX := {x ∈ {a_1, …, a_n}* | |x|_{a_1} = ⋯ = |x|_{a_n}} is in IDLCFL.
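Membership in nMIX itself is a simple counting condition, which the following utility makes explicit (the helper names are ours, and the brute-force enumeration is only practical for tiny alphabets and lengths). A natural candidate for the compact IDL-CFG encoding, stated here only as an informal sketch, is the single recursive rule S → ∨(ε, ||(S, a_1, …, a_n)): each unfolding interleaves one occurrence of every letter into the string derived so far.

```python
from collections import Counter
from itertools import product

def in_nmix(word, letters):
    """word is in nMIX iff it uses only the given letters and
    every letter occurs the same number of times."""
    counts = Counter(word)
    if set(counts) - set(letters):
        return False
    return len({counts[a] for a in letters}) == 1

def nmix_upto(letters, max_len):
    """Brute-force enumeration of nMIX words up to a given length."""
    return {"".join(w)
            for k in range(max_len + 1)
            for w in product(letters, repeat=k)
            if in_nmix(w, letters)}
```

For the two-letter case, the enumeration up to length 4 yields the empty word, ab, ba, and the six words containing two a's and two b's.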
The position of the nMIX languages within the hierarchy has been intensively studied over the last decade. Language 2MIX is context-free. For n ≥ 3, the problem turns out to be much more difficult. The original MIX language (or Bach language), i.e. 3MIX, has been proven a PMCFL by Salvati (2015). Kanazawa and Salvati (2012) also proved that MIX is not a tree-adjoining language. Together with the fact that IDL-CFGs generate all nMIX languages, this result provides us with the following corollary. For many years, no general classification results were available for n > 3. Only very recently did Ho (2018) prove that for all n, the word problem of ℤⁿ is in PMCFL.
Since the word problem of ℤⁿ and nMIX are rationally equivalent (Salvati 2015), this yields the inclusion of the whole nMIX family within PMCFL.
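Membership in nMIX itself is of course easy to decide by letter counting; the point of Proposition 3 is that IDL-CFG generates the family compactly. Writing the alphabet as the first n lowercase letters, a quick sketch (function names ours):

```ocaml
(* x is in nMIX over {a_1, ..., a_n} (here: 'a', 'b', ...) iff x uses
   only these letters and each occurs the same number of times. *)
let is_nmix n x =
  let counts = Array.make n 0 in
  let ok = ref true in
  String.iter
    (fun c ->
      let i = Char.code c - Char.code 'a' in
      if i >= 0 && i < n then counts.(i) <- counts.(i) + 1 else ok := false)
    x;
  !ok && Array.for_all (fun k -> k = counts.(0)) counts
```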
Proving that IDLCFL ⊂ PMCFL would be an even stronger result, given that nMIX ∈ IDLCFL for all n; the inclusion might even appear likely in the light of Ho's proof. Nevertheless, the amount of work needed to address the "specific" case of nMIX languages suggests that this will be anything but easy, and we will leave this for future work.
On the other hand, it is clear that IDLCFL does not contain PMCFL.
Proposition 5 Language L_{3^n} above is not in IDLCFL.
Proof By contradiction, let G be an IDL-CFG matching L := L_{3^n}. Consider the context-free grammar G′ obtained from G by replacing every IDL expression e over some alphabet Σ in the right-hand side of a rule by a string s(e) defined inductively as follows:

s(∨(e_1, …, e_n)) = ∨(s(e_1), …, s(e_n)) for all n ∈ ℕ,
s(||(e_1, …, e_n)) = s(e_1) · ⋯ · s(e_n) for all n ∈ ℕ.
Note that an equivalent CFG in a more canonical form can easily be obtained by removing the disjunction nodes in exchange for an increase in the number of rules. Now, it is straightforward that the language L′ generated by G′ is a subset of L. Moreover, for any string w generated by G, there exists a string w′ ∈ L′ ⊂ L such that |w| = |w′| and such that w′ is obtained by applying in G′ the same rules that were used to produce w in G. By construction, w′ is a permutation of w. Let x ∈ L and n = |x|. By definition of L, x is the only word of length n in L. As a consequence, x ∈ L′. This means that L = L′ and that the CFG G′ recognizes L, which is impossible since L is not context-free.
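The flattening s used in this proof can be spelled out on a small datatype for IDL expressions (constructor names are ours; the base and concatenation cases, left implicit in the proof, are taken to be homomorphic, and locks are simply dropped since they do not constrain a fully concatenative expression):

```ocaml
(* A simplified datatype for IDL expressions over terminals and
   nonterminal occurrences. *)
type idl =
  | Sym of string     (* terminal or nonterminal occurrence *)
  | Cat of idl list   (* concatenation  e1 . ... . en       *)
  | Disj of idl list  (* disjunction    \/(e1, ..., en)     *)
  | Inter of idl list (* interleave     ||(e1, ..., en)     *)
  | Lock of idl       (* lock           x(e)                *)

(* s keeps disjunctions and turns every interleave into a plain
   concatenation; the result is purely context-free. *)
let rec s = function
  | Sym _ as e -> e
  | Cat es -> Cat (List.map s es)
  | Disj es -> Disj (List.map s es)
  | Inter es -> Cat (List.map s es)
  | Lock e -> s e
```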
Again, this yields a classification result. We have the following corollary:

Proposition 7 IDLCFL ⊊ IDLPMCFL.
Proof By Propositions 2 and 6.
An essential result is that, unless P = NP, IDL-PMCFGs can define some non-polynomial languages. This is in line with Kirman and Salvati's (2013) findings that even classes of grammars that are "close to [...] mildly context sensitive" may have NP-hard membership problems as soon as commutation is allowed. In the case of IDL-PMCFG, we prove this in three steps. First, we recall the definition of the NP-complete problem 3SAT and suggest a polynomial encoding of it over a simple finite alphabet. Second, we construct an IDL-PMCFG grammar that recognizes the language of satisfiable 3-CNF formulae in the previous encoding. A final step then leads us to the result.
The 3SAT problem is one of Karp's (1972) 21 NP-complete problems. It asks whether a finite boolean formula over a countably infinite set of variables {x_n}_{n∈ℕ}, given in conjunctive normal form (CNF) with at most three literals per clause, is satisfiable. Consider for example the 3-CNF formulae f_1 and f_2. Formula f_1 is satisfiable because the valuation x_1 → ⊤, x_2 → ⊥, x_3 → ⊤ results in the formula reducing to ⊤. Formula f_2 is not satisfiable: the last two unary clauses impose x_1 → ⊥ and x_3 → ⊥, but then the first clause requires x_2 → ⊥ to be satisfied whereas the second one needs x_2 → ⊤, a contradiction.
The size of a 3-CNF formula is measured by the number of its clauses, without regard to the number of variables. In the two instances above, this gives |f_1| = 3, |f_2| = 4.
So far, our description of 3-CNF formulae, unlike the grammars we study in this paper, used an infinite alphabet to encode variables. We now introduce an encoding of 3-CNF logical formulae over a finite alphabet, by means of:
- a mapping ν encoding variables as unary numerals: ν(x_n) = 1ⁿ for all n ∈ ℕ;
- a mapping μ : SV → Σ* encoding optionally negated variables by prepending the character ! to negated variables;
- a mapping π : SV³ → Σ* encoding ternary clauses, each encoded literal being enclosed in parentheses and the clause itself in square brackets;
- a mapping τ : ⋃_{n∈ℕ} (SV³)ⁿ → Σ* encoding 3-CNF formulae as the concatenation of their encoded clauses.
Mappings ν, μ, π and τ are clearly bijective.
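A minimal executable rendering of this encoding (the OCaml types are ours; the parenthesis and bracket delimiters follow the clause-building rule described in the proof of the hardness lemma below, where literals are parenthesized and clauses bracketed):

```ocaml
type literal = Pos of int | Neg of int (* x_n or its negation, n >= 1 *)

let nu n = String.make n '1'           (* nu(x_n) = 1^n *)

let mu = function                      (* ! marks negated variables *)
  | Pos n -> nu n
  | Neg n -> "!" ^ nu n

(* pi: each literal in parentheses, the whole clause in brackets *)
let pi (l1, l2, l3) = "[(" ^ mu l1 ^ ")(" ^ mu l2 ^ ")(" ^ mu l3 ^ ")]"

(* tau: a 3-CNF formula is the concatenation of its encoded clauses *)
let tau clauses = String.concat "" (List.map pi clauses)
```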
The above encoding is only applicable to formulae in 3-CNF where every clause contains exactly three literals. A straightforward observation makes this restriction largely irrelevant and will simplify the discussion later on.

Proposition 8 Let f be a logical formula in 3-CNF. There exists another logical formula f̂ in 3-CNF such that:
1. Formulae f and f̂ are equivalent and |f̂| = |f|;
2. All clauses in f̂ have exactly three literals;
3. The set Ŵ of variables used in f̂ is equal to (x_n)_{n∈[[1,|Ŵ|]]}.
Moreover, for all such f, a formula f̂ matching the three above conditions can be computed in time linear in |f|.
Proof Let f be a logical formula in 3-CNF and W the set of its variables. We derive f̂ from f as follows:
1. Rename the variables in f to produce a new formula g with variable set X such that X = (x_n)_{n∈[[1,|W|]]}. One convenient way to achieve this is to process the formula from left to right, keeping in mind the index of the smallest currently unused variable in the new (partial) formula, as well as the correspondence between variables in f and their renamings in g. This is done in time linear in |f|.

The following lemma provides the key argument.

Proof We build an IDL-PMCF grammar G that recognizes satisfiable 3-CNF formulae encoded as in Definition 9. First, we define variables (of category V and arity 1) as sequences of 1s. We proceed by defining literals (of category L, arity 1) as variables preceded by the optional negation symbol !. A satisfiable formula and the valuation satisfying it are produced in parallel through a number of double-steps, each of them consisting of:
1. a selection step where a new variable x_i := x_{i+1} is selected and its boolean value v_i in the valuation is chosen;
2. an insertion step where an arbitrary number of ternary clauses containing the current literal is added at arbitrary positions in the already produced formula.
Each double-step uses a category F of arity 3 that stores the current formula f_i as well as the current literal. The selection step retrieves the next variable and chooses its value. The insertion step adds arbitrary clauses containing the current literal to the current formula. This rule can be described informally as follows: interleave into the current formula a (locked) clause consisting of three interleaved (locked) literals, the first two of which are arbitrary while the third one is equal to the current literal; literals are enclosed in parentheses while clauses are enclosed in square brackets.
Finally, we have to indicate that the start category can be produced from any F and that the empty formula is satisfiable (with 1 as the first variable to consider). In other words, E_i is a set of clauses such that the current value of x_i in v makes all clauses in the set reduce to true. Let κ(⊤) = ε and κ(⊥) = !. This decomposition necessarily exists since f̂ is satisfiable, but it is in general not unique. The string τ(f̂) is recognized by G using the following derivations for i = 1 up to N, starting with ⟨ε, 1, κ(v(x_1))⟩ of category F:
- without loss of generality, suppose v(x_i) = ⊤ and therefore κ(v(x_i)) = ε;
- up to reordering of i, j and k, C_n = x_i ∨ σ_j x_j ∨ σ_k x_k;
- produce two literals of category L containing σ_j x_j and σ_k x_k respectively;
- use them along with φ to produce a clause that has been added at a position compatible with the final reordering in τ(f̂).
Conversely, let f be a 3-CNF formula such that τ(f̂) ∈ L(G). As f and f̂ are equivalent, it suffices to prove that f̂ ∈ 3SATL. Now, it is straightforward to see that subsets E_i verifying the same properties as in the first half of the proof can be constructed by considering the clauses added at iteration i. The existence of these subsets guarantees that f̂ ∈ 3SATL, which concludes the proof.
Proof By contradiction, suppose that IDLPMCFL ⊂ P. We will now prove that 3SATL ∈ P.
Let G be the grammar defined in Lemma 3. Thanks to our hypothesis that IDLPMCFL ⊂ P, the language L(G) recognized by G is in P. Let T be a (deterministic) Turing machine that recognizes L(G) in polynomial time. Consider the following procedure Poly3SAT: given f, compute g := f̂, then h := τ(g), and return T(h).

First, this procedure runs in time polynomial in the size |f| of the input:
1. Computing f̂ takes time O(|f|) according to Proposition 8, and |f̂| = |f|.
2. Computing τ(g) is also clearly linear in the size |g| = |f| of its input. The size of τ(g) ∈ Σ* is given by its length. The set W of variables appearing in g is included in (x_i)_{i∈[[1,3|g|]]} according to Proposition 8. Therefore, |ν(w)| ≤ 3|g| = 3|f| for all w ∈ W. Following Definition 9, we get that |τ(g)| is polynomial in |f|.
3. Finally, computing T(h) is by assumption a O(|h|^α) for some α ∈ ℕ₊. As |h| is polynomial in |f|, this step is polynomial in |f| as well.

Second, it recognizes 3SATL. This is a direct consequence of Lemma 3: f ∈ 3SATL iff h = τ(f̂) ∈ L(G) = L(T), iff Poly3SAT returns true when applied to f.
We have proved that Poly3SAT recognizes 3SATL in polynomial time. Hence, 3SATL ∈ P. As 3SAT is NP-complete, this yields P = NP.
Independently of the answers to the previous questions, it is already clear that the presence in a CFL-based formalism of all three operators ||, · and ×, together with tuples of size at least 2 and copying, is sufficient to leave the realm of P. As noted in Sect. 1.2, interleaving, linear constraints, locking, records and copying are reasonable requirements for a grammatical formalism designed to describe the syntax of free word order languages in general, and of Classical Latin in particular. This, of course, does not mean that Classical Latin itself would be non-polynomial: the reduction presented above is not linguistically relevant and involves copying, which Latin does not require. It simply means that a grammatical formalism for free word order languages containing the features above leads to worst-case non-polynomial scenarios which might not necessarily be linguistically relevant.

Grammatical Framework and COMPĀ
Grammatical Framework (GF) (Ranta 2004), developed by Ranta et al. since 1998, is a special-purpose programming language aimed at writing grammars of natural languages. Practically, GF serves as the natural-language counterpart of tools such as YACC (Johnson 1975) or Menhir (Pottier and Régis-Gianas 2016) for programming languages. From a logical point of view, Grammatical Framework is a logical framework relying on Martin-Löf type theory (Martin-Löf and Sambin 1984). A functional programming language, GF also has extensive support for modularity and enforces conventions and standards simplifying the development of multilingual applications. Its community actively contributes to the Resource Grammar Library, which unites 'concrete' wide-coverage grammars for over 30 individual languages around a common 'abstract' grammar. Over the last 20 years, GF, which remains fully open-source, has been used in several experimental as well as industrial contexts, for applications ranging from morphological generation to natural-language transcription of formal (mathematical, proof, technical) language, and from multilingual translation of 'controlled language' to language learning tools. This section describes COMPĀ, a GF-like programming language tailored to encode the grammatical syntax of free word order languages. Though it has been primarily conceived to model and study the syntax of the Latin language, its design as well as the description we give here are language-agnostic. The name COMPĀ stands for COMPĀgēs Grammaticālis Latīna, which means 'Latin Grammatical Framework' in Latin.
The syntax and semantics of COMPĀ are largely borrowed from standard GF: it is a Haskell-style functional programming language manipulating sets of words/terminals, and providing records and tables over finite types, as well as finite lambda functions (Ranta 2011). As an experimental language focussing on the syntactic description of individual languages, COMPĀ does not implement structures and operators mainly directed at handling morphology, semantics and multilingualism or providing additional modularity, such as abstract grammars, dependent types, token-level gluing or general lambda functions. More precisely, COMPĀ extends a subset of GF, so-called context-free GF (Ljunglöf 2004). In turn, it provides three operators absent from standard GF, viz. the interleave (||), disjunction (∨) and lock (×) operators. While standard GF compiles (mostly for parsing purposes) into Angelov et al.'s (2009) PMCFG-equivalent PGF, COMPĀ can be transcompiled into IDL-PMCFG.
In this section, we will focus on aspects of COMPĀ's design that differ from standard GF, and show how it can be used as an efficient front-end for writing practical grammars of free word order languages. For a detailed presentation of the syntax of standard GF, the reader can refer to the GF reference manual (Ranta 2011).

Data Types
As a language designed to model free word order languages, COMPĀ relies on one fundamental data type, Set, the type of sets of token lists (short: 'sets'), which replaces the standard GF type Str. The basic operators described below take as input, and return, only data of type Set.
Besides the fundamental type Set, each grammar may define an arbitrary number of parameter types. These finite types are often used to encode specific grammatical features (e.g. case, number or gender).
Record types can be built from a list of (distinct) identifier names and a list of types, each of which may be either the type Set or any finite type. Records store structured information and allow for an accurate representation of grammatical constituents (storing some of their features as well as one or several sets that represent their linearization).
Tables are finite functions that map every value of a finite type to a value of some (unique) other type.
Given a set of finite (i.e. enumerated) types Π and the set of admissible string identifiers S, the syntax of admissible types is formally defined as follows (Fig. 2):

Operations on Sets
Sets are introduced by means of the standard syntax for strings. Thus, in COMPĀ, the expression consul does not represent the singleton token list [consul] as in GF; instead, it stands for the singleton set {[consul]}. Similarly, COMPĀ's [] does not stand, as in GF, for the empty list of tokens (the empty string), but for the singleton containing the empty list of tokens. A more practical way to put it is to see this set as the set of possible phrases that can be derived from the expression consul: there is only one, containing one word, the word consul, hence the singleton set above. Note that the empty set of strings has its own syntax, variants {}, which is also borrowed from standard GF.
COMPĀ defines four basic operations on sets, which are the exact counterparts of those defined in Nederhof and Satta's IDL expression formalism (Fig. 3):
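The intended behaviour of the four operations on finite sets can be prototyped directly (a sketch; the names concat, disj, inter and lock are ours, COMPĀ sets need not be finite, and lock is modelled as the identity here since contiguity constraints only take effect once locked phrases are interleaved with others during parsing):

```ocaml
(* Phrases are token lists; a set is a finite list of phrases. *)
type set = string list list

(* Concatenation: append a phrase of a to a phrase of b, pairwise. *)
let concat (a : set) (b : set) : set =
  List.concat_map (fun x -> List.map (fun y -> x @ y) b) a

(* Disjunction: set-theoretic union. *)
let disj (a : set) (b : set) : set =
  a @ List.filter (fun y -> not (List.mem y a)) b

(* Interleave: every shuffle of a phrase of a with a phrase of b. *)
let rec shuffle x y =
  match (x, y) with
  | [], y -> [ y ]
  | x, [] -> [ x ]
  | hx :: tx, hy :: ty ->
      List.map (fun z -> hx :: z) (shuffle tx y)
      @ List.map (fun z -> hy :: z) (shuffle x ty)

let inter (a : set) (b : set) : set =
  List.concat_map (fun x -> List.concat_map (shuffle x) b) a

(* Lock: modelled as the identity in this flat sketch; in COMPĀ it
   additionally forbids later interleaves from breaking the phrase. *)
let lock (a : set) : set = a
```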

Structure of Programs
Each COMPĀ program, called a grammar, is enclosed in a file, each COMPĀ file defining exactly one such grammar. The standard extension for COMPĀ files is .cp. The syntax of programs is as follows (note that all whitespace and line breaks are ignored) (Fig. 4). The identifier following the concrete keyword is an (arbitrary) name. We will now go through each of the four sections of the grammar and consider their individual syntaxes.

Including GF Lexica
The first section is used to import existing standard GF lexica into COMPĀ. Its syntax is extremely simple (Fig. 5): filename is the name of a concrete GF file functioning as a lexicon. When an include is read, the corresponding GF file is retrieved and all words it defines are automatically extracted. The include section thus provides some compatibility with standard GF, as well as support (through the use of GF itself) for efficient morphological analysis.

Parameters
Parameters are declared as follows (Fig. 6): In the above description, ident 0 is the name of the new finite type, and (ident k ) k≥1 its values. Both type and value parameter identifiers must be unique throughout the whole grammar, and are usually (but not necessarily) capitalized.

Categories
Categories are introduced in the lincat section according to the syntax below, where paramType is any parameter type defined in the param section (Fig. 7):

Linearization Functions
The lin section collects the functional rules that are the heart of any GF or COMPĀ grammar. Each linearization rule describes a way to combine several arguments (of given input categories) into a new item (of a given output category). Unlike GF, which separates the type-annotated declaration of linearization functions in abstract syntax files from the non-type-annotated definition of linearization functions in concrete syntax files, COMPĀ uses only a single (concrete) syntax, which is directly annotated with types. COMPĀ includes a complete type-checker.
Let us first formally describe the syntax of linearization functions (Fig. 8). In this figure, the non-terminals paramType and paramValue match parameter types and values introduced in the param section, whereas the non-terminal lincatName matches the name of any category defined in the lincat section. In the definition of lin, ident_0 and lincatName_0 are respectively the name of the linearization function and its output category, while (ident_k)_{k≥1} and (lincatName_k)_{k≥1} are the names and categories of the function's arguments.

Iterating over Finite Types with for
To handle those cases where similar rules must be constructed for all possible values of a given parameter, a loop structure absent from standard GF is proposed. This structure is available in COMPĀ through a for-do construction.
Suppose that verb category V has type {s : Tense ⇒ Set}, where Tense is a finite type enumerating the available tenses in the language, and that we want to write a rule that takes a verb of category V and produces a conjugated verb of category ConjugV and type {s : Set; tense : Tense}, which stores a conjugated verb and keeps track of its tense. This can be achieved in COMPĀ as follows:

conjugateVerb (v : V) : ConjugV = for t : Tense do { s = v.s ! t; tense = t };

When translated into low-level IDL-PMCFG, this results in several parallel rules being constructed, one for each available value of the bound variable. This is especially useful when a parameter (e.g. a verb tense or mode) provides different linearizations without playing any part in the syntactic structure itself, or when another parameter (e.g. number) can be chosen arbitrarily at some syntactic level before being propagated downwards into the tree.

The COMPĀ (Trans)Compiler
Just as standard GF must be compiled into low-level PMCFG for parsing purposes, the COMPĀ language is used as a grammar description front-end that has to be translated into IDL-PMCFG before parsing.
Using OCaml, we implemented a lightweight transcompiler that type-checks and converts a COMPĀ grammar into an equivalent IDL-PMCFG grammar. The essential conversion step employs finite function resolution techniques similar to those presented by Ljunglöf (2004): tables and parameter fields are replaced by new fields and categories, and new rules are finally created between new categories.
The compiler's source code can be found in the corresponding GitHub repository. 5

A Parsing Algorithm for COMPĀ
In this section, we present a parsing algorithm for IDL-PMCF grammars and provide an analysis of its complexity. This algorithm, for which we also provide a complete OCaml implementation, is inspired by the works of Ljunglöf (2012) and Angelov (2009) on GF parsing, while building on techniques introduced by Nederhof and Satta (2004) to parse IDL expressions. We extend Nederhof and Satta's graph-based finite-state approach, enriching it by decorating active nodes with sets of word positions.

Parsing COMPĀ's IDL Expressions
Nederhof and Satta (2004) present a parsing algorithm for IDL expressions relying on left-to-right scanning of the input and on a representation of the current parsing state as a cut (a set of nodes verifying certain properties, which does not necessarily match the traditional graph-theoretical definition of a cut; see below) within a so-called IDL graph. Each IDL expression is compiled into a single IDL graph. Transitions from one state/cut to another are encoded in the IDL graph; edges are labelled with the words that must be read to move from one cut to another. The input is parsed successfully if and only if the final state is reached once all characters have been read. Unlike in the original publication, where IDL expressions were used as autonomous regular expressions rather than within a grammar, the edges of COMPĀ's IDL graphs may be annotated both with terminals, i.e. words, and with (nonterminal, index) pairs. The latter labelling corresponds to the case where we want to match a field of one of the arguments of the current rule.
Let us now define the IDL graph associated with a given IDL expression. Note that this definition, though closely following the lines of Nederhof and Satta's contribution, does not encode the lock operator in the same way. This different encoding has proven more practical for parsing full IDL-PMCF grammars, as will become apparent when we discuss our algorithm.

Definition 10 (IDL graph) Let G = (N, δ, Σ, F, P, S) be an IDL-PMCFG and let e be an IDL expression over Σ. The IDL graph γ_e associated with e is defined by induction on e. As an example, the IDL graph associated with the IDL expression describing the valid permutations of Marcus cum amico caro ambulat is shown in Fig. 9.⁶ The parsing process of the sentence Caro cum amico Marcus ambulat with the IDL graph presented in Fig. 9 is given in Fig. 10.
Given an IDL expression and its IDL graph, we also define the set of its cuts, which will serve as states in the parsing algorithm.

Definition 11 (Cuts of an IDL expression) Let G = (N, δ, Σ, F, P, S) be an IDL-PMCFG. Let ⟨f, A_1, …, A_q, A⟩ ∈ P and let e be an IDL expression over Σ with IDL graph γ_e.
The set of cuts of e, C_e ⊂ P(V(γ_e)), where V(γ_e) denotes the set of vertices of γ_e, is defined by induction on e. IDL graphs can be regarded as automata recognizing a given regular expression by reading it left to right, allowing for parallel reading of several interleaved substrings. The initial and final cuts are composed of a single node. Split edges marked by n cause several branches to be explored in parallel (thus increasing the cardinality of the current cut by n − 1 elements), while merge edges marked by n allow n nodes in the old cut to be replaced by one single node in the new cut. Labelled edges can be used to replace the node on the left-hand side of the edge by the node on the right-hand side of the edge in the current cut, provided that the terminal or nonterminal labelling the edge can be read at the current position. Epsilon-labelled edges (aka ε-transitions) can be taken under no specific assumption, provided that the left-hand-side node of the edge is in the current cut. They are especially used to encode disjunction nodes, which do not result in several branches being taken at the same time, but in only one of them being chosen. The special lock edges, which were absent from Nederhof and Satta's original publication, will be discussed later.
An additional degree of complexity has to be dealt with in the context of IDL-PMCFG: we have to check that the substrings matched by the various nonterminals previously read are compatible with the constraints imposed on word order and interleaving. Therefore, throughout the execution of the algorithm, the current state of the parsing process within each IDL graph must be carefully saved. Any field of any input category can match an arbitrary (and not necessarily contiguous) substring of the input. Moreover, given the ability of the formalism to encode nested lock constructions, an arbitrary number of such position sets must be remembered. This suggests the state space presented in Definition 13.
We first formally define a notion of stacks over an arbitrary set. The state space of an IDL expression can now be defined. Informally, each state of an IDL parsing item is a pair (c, σ) where c is a cut and σ is a map from the nodes of this cut to stacks of position sets. These stacks store the positions of the terminals (words) that have already been read by the automaton when the current state is reached. Using a stack allows us to distinguish word positions that were matched in the current branch or at the current level of nested locks from words matched before the last split or outside of the current level of nested locks. When a set of split transitions is taken, each of the new nodes added to the cut stores an independent copy of the previous stack, extended with an empty head ∅. During the processing of the current branch, positions matched in the same branch are added to the head of the stack, while non-head elements store information from previous branches. When a set of merge transitions is taken, the heads of the various stacks are first merged together (ensuring that no contradiction occurs) and then merged with the second element of all stacks (to take into account the closing of the current parallel processing and to check, again, that no impossibility arises). With this technique, we can also give a simple semantics to the ↑ and ↓ edges: when a ↑ transition is taken, an ∅ is pushed onto the current stack; when a ↓ transition is taken, we check whether the head element of the current stack is an interval and, if it is the case (and no incompatibility arises), we merge it with the second element.
The incompatibilities we mentioned can be of two sorts: either the same positions have been read in two different branches, which therefore cannot be merged; or what has been parsed does not respect the principle that an IDL graph, within the same branch, reads its input from left to right. To formalize this, we introduce a partial order on sets of positions as well as a useful predicate:

Definition 14 The relation ≺ ⊂ P_f(ℕ)² is defined by: A ≺ B iff a < b for all a ∈ A and all b ∈ B. Note that, in particular, for any A ∈ P_f(ℕ), ∅ ≺ A and A ≺ ∅; moreover ∅ ≺ ∅.

Definition 15 The predicate interval ⊂ P_f(ℕ) is defined by: interval(A) holds iff A is a set of consecutive integers, i.e. A = ∅ or A = [[min A, max A]].
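Both predicates admit a direct executable reading on position sets represented as sorted integer lists (our encoding; we take A ≺ B to mean that every position of A precedes every position of B, vacuously true when either set is empty, consistently with the note above, and interval(A) to mean that the positions form a contiguous range):

```ocaml
(* Position sets as sorted lists of distinct ints (our encoding). *)

(* prec a b: A < B pointwise; vacuously true on empty sets. *)
let prec a b = List.for_all (fun x -> List.for_all (fun y -> x < y) b) a

(* interval a: the positions form a contiguous range (or are empty). *)
let interval a =
  match a with
  | [] -> true
  | h :: t ->
      snd (List.fold_left (fun (p, ok) x -> (x, ok && x = p + 1)) (h, true) t)
```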
We can finally define a transition relation between states.

Definition 16 (Transition relation of an IDL expression) Let G = (N, δ, Σ, F, P, S) be an IDL-PMCFG. Let e be an IDL expression over Σ, γ_e its IDL graph, C_e its cuts and S_e its states. The relation Δ_e ⊂ S_e × Σ × P_f(ℕ) × S_e =: Ω × S_e is the smallest relation verifying the following axioms:
- The first axiom encodes the fact that, when reading the (non)terminal a at position set π with π′ ≺ π (meaning that π is located right of the previously read position set π′), we can update the current cut by replacing the node on the LHS of any edge labelled with a by the node on its RHS, appending the position set π to the positions stored on the top of the stack.

- A ↑-edge can always be used to replace the node on the LHS of the edge by the node on its RHS, pushing an empty position set onto the top of the corresponding stack; this is used to isolate the parsing of locked subexpressions, which are finally tested for connexity through a ↓-edge.

- The ↓-edges are used at the end of locked subexpressions: the position set on the top of the stack, which stores the positions used in the current locked branch, is tested for connexity (with the interval primitive) and linear precedence (the newly closed locked branch at positions π must be located right of the previously read positions π′); if both tests succeed, the node on the LHS of the edge is replaced by its RHS and both position sets are merged.
- As soon as the current cut contains the LHS of a set of split edges, this axiom opens n parallel (interleaved) branches, replacing the LHS node by n RHS nodes, all of which come with the same stack as previously, except for an additional empty position set on top, which will later isolate the positions read in the various parallel branches.

- At the end of a series of parallel branches marked with merge edges (closing the parsing of an || node), this axiom checks that the position sets matched by the various parallel branches are compatible (disjoint) and that these positions, when merged, are compatible with previously matched positions (i.e. located right of them); in this case, it replaces the set of nodes on the LHS by the single RHS of the merge edges.
Although these rules would essentially suffice to describe the parsing algorithm if every non-terminal appearing in a rule appeared exactly once, the fact that the same nonterminal may appear several times (copying) or not at all (erasure) requires us to keep track of the partial parsing contexts in which each argument may or may not have already been identified. We do this by introducing so-called context tables.
A context table is a partial function that associates to some arguments of the rule a partial mapping between some fields of these arguments and position sets. It helps us remember which arguments have already been fixed and which ones can still be chosen freely. For each argument that has already been fixed, it retains which of its fields are available and what positions in the input string are matched by each available field. Let G = (N, δ, Σ, F, P, S) be an IDL-PMCFG.

Definition 17 (Context table)
Let ⟨f, A_1, …, A_q, A⟩ ∈ P. Let e be an IDL expression, γ_e its IDL graph, C_e its cuts and S_e its states. We call context table for e any partial map Γ from [[1, q]] to partial maps from field indices to position sets. The set of context tables for e is denoted by G_e.
The set of context tables is equipped with three primitives defined as follows:
- For any input string s, compat_s checks that the current context table Γ is compatible with mapping field ℓ of argument k to position set π, which is the case iff either (i) [new nonterminal (k, ℓ) matched] k is not in the context and π does not intersect any of the position sets stored in Γ, or (ii) [copy of an already matched nonterminal] k is already in the context, mapped to a partial function Γ(k), which itself maps field index ℓ to a position set π′ such that the substring s_{π′} is the same as s_π.
- The map reserve ∈ G_e × ℕ × (ℕ ⇀ P_f(ℕ)) → G_e registers k in Γ, mapping it to r, iff k is not yet in the context.
- For any input string s, unify_s ⊂ G_e³ identifies triplets of contexts Γ, Γ′ and Γ″ such that Γ″ can be obtained from Γ and Γ′ by first (i) adding to Γ all matchings k → r from Γ′ for which k is outside the domain of Γ, and then (ii) checking that for all k such that there exist k → r ∈ Γ and k → r′ ∈ Γ′, r and r′ define the same fields and map them to identical substrings.
The semantics of the three primitives is rather natural. First, compat indicates whether the assertion "the ℓth field of argument k can be identified in the input string at position set r" is compatible with all prior decisions stored in the current context. This is possible iff either the kth argument has never been matched before, or the ℓth field of the already detected kth argument in the current context is identical to the one matched by position set r. Once compatibility has been checked, reserve is used to update the context table according to the newly matched item. Finally, unify allows us to compute the union of two contexts that do not interfere with each other.
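The compat and reserve primitives can be prototyped on a naive association-list representation of context tables (a sketch; the encoding, the names, and the rejection of fields absent from Γ(k) in the copy case are our simplifications):

```ocaml
(* A context table: argument index -> (field index -> position set).
   Position sets are sorted int lists; the input is a word array. *)
type ctx = (int * (int * int list) list) list

let substring (s : string array) pos =
  String.concat " " (List.map (fun i -> s.(i)) pos)

let compat (s : string array) (gamma : ctx) (k : int) (l : int) (pi : int list) =
  match List.assoc_opt k gamma with
  | None ->
      (* (i) argument k is new: pi must avoid all stored positions *)
      List.for_all
        (fun (_, fields) ->
          List.for_all
            (fun (_, pi') -> List.for_all (fun p -> not (List.mem p pi')) pi)
            fields)
        gamma
  | Some fields -> (
      (* (ii) copy: field l must match the very same substring *)
      match List.assoc_opt l fields with
      | Some pi' -> substring s pi' = substring s pi
      | None -> false)

let reserve (gamma : ctx) (k : int) (r : (int * int list) list) : ctx =
  if List.mem_assoc k gamma then gamma else (k, r) :: gamma
```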

Parsing COMPĀ Grammars
Our algorithm is inspired by Earley-style parsers designed to parse context-free GF or PMCFG grammars (Ljunglöf 2012). The input is read from left to right and three different kinds of items are built bottom-up. The structure of the above context tables, which contain all essential information about the arguments of each rule and their positions, makes it easier to recursively reconstruct all valid parse trees of a given input.
Definition 18 (Parsing items) Let G = (N, δ, Σ, F, P, S) be an IDL-PMCFG. The three types of items we use are: -Active items, of the form [φ; e; s; Γ]_A, where φ = (f, A_1, …, A_q, A) ∈ P, e is an IDL expression of the grammar, s a state of the IDL graph of e, and Γ ∈ G_e; -Passive items, of the form [φ; A; i; r; Γ]_P, where φ = (f, A_1, …, A_q, A) ∈ P, i is a field index, r a position set, and Γ ∈ G_e, with e the IDL expression that was parsed; -Completed items, which record, for a given category, the set of fields parsed so far together with a common context. While active items store the current parsing status of a given IDL expression and passive items memorize successful parsing of a given IDL expression, completed items unify parsing results for different fields of the same category, checking that the various contexts are compatible with each other. Passive items are not absolutely essential; they are essentially syntactic sugar for active items whose current cut is reduced to the final node.
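For concreteness, the three item types could be rendered as immutable Python records. This is a hypothetical sketch (field names and the completed-item layout are only illustrative); the dataclasses are frozen so that a parsing environment can deduplicate items in a set:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Active:
    """[phi; e; s; Gamma]_A: current parsing status of an IDL expression."""
    rule: str            # production phi
    expr: str            # IDL expression being traversed
    state: tuple         # current cut in the IDL graph of expr
    ctx: tuple           # context table Gamma, frozen for hashability

@dataclass(frozen=True)
class Passive:
    """[phi; A; i; r; Gamma]_P: field i of category A matched at position set r."""
    rule: str
    cat: str
    fld: int
    positions: frozenset
    ctx: tuple

@dataclass(frozen=True)
class Completed:
    """Unified parsing results for several fields of the same category."""
    cat: str
    fields: tuple        # (field index, position set) pairs
    ctx: tuple
```

Value equality and hashability come for free with frozen dataclasses, which is what makes the "appended only if new" discipline of the environment cheap to enforce.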
The deduction-style rules that make up the core of our algorithm are presented in Fig. 12. Predict, Scan and Combine have their usual semantics from bottom-up parsing algorithms, and make extensive use of the context tables and transition relations defined in Sect. 4.1.
Step explores ε-transitions. Save produces a passive item from an active item that has reached its final state; this passive item is immediately converted into a completed item with only one activated field by Singleton. Finally, when a passive item can be used to extend the domain of a preexisting completed item, Unify performs this operation and returns a new completed item. Practical implementation of the parsing algorithm requires that an (efficient) ordering be defined on the rules to apply. This ordering must guarantee correctness (i.e. that all possible syntax trees can be output) as well as an acceptable running time.
The graph from Fig. 13 displays the seven deduction rules from Fig. 12 as functions from and to the sets of active (A), passive (P) and completed items (C), as well as products thereof. Dashed arrows are added between two sets X and Y whenever there exists Z such that Y = X × Z (red) or Y = Z × X (green).
This graph can be viewed as a kind of "recursive control flow diagram" for our algorithm. Each edge labelled with a rule name corresponds to a recursive call to a corresponding function, which will try to apply the rule using the item output by the previous successful function call; each dashed edge corresponds to a fold operation through the item set matching the right-hand side of the destination type (for red arrows) or the left-hand side of the destination type (for green arrows). The rule Predict is applied only at the first iteration. At each iteration, the parameters j and a = t_j used by Scan are updated, reading the input from left to right, and a sequence of recursive calls takes place, building new items that are appended to the existing parsing environment.
In fact, due to the structure of our parsing system, one of the arrows above is redundant: the red arrow r from C to C × P can be removed without altering the correctness of the algorithm, as long as, when handling a passive item, the fold operation suggested by the green arrow g from P to C × P is executed before the recursive call encoded in the Singleton arrow. Let us consider the first iteration where the red arrow r is taken. This can only occur when a new completed item c has just been created; this completed item has itself been generated by either the Singleton or the Unify rule. If it has been generated by Singleton, say from a passive item p, a recursive fold through the set C has already taken place via the green arrow g. That fold has added to the environment all completed items that can be computed from p and any other available completed item. Now, for any available passive item p′, a completed item c′ has been derived from p′ at some point in the past. The fold operation triggered by g has already, if possible, derived a new completed item from p′ and c′ that would contain exactly the same information as the item to be created from c and p′. If c has been created by Unify, a fold has already been triggered through g (the only possible path to reach C × P has been through g, because of our hypothesis that we have not taken r before) and the same reasoning applies by considering the items c′ and p′ used to produce c.
The resulting pseudocode is presented in Algorithm 2.

Complexity
The goal of the final part of this paper is to provide an upper bound on the complexity of our parsing algorithm under some practically reasonable assumptions; this bound will be obtained as Theorem 4. The detailed proof of this theorem can be found in Appendix A.1. Before addressing the actual complexity problem, three remarks must be made. First, ∨ nodes are not absolutely needed in the IDL-PMCFG formalism. By creating some new rules and introducing intermediate categories, it is easy to transform any grammar into an equivalent one without any ∨ node. In the following discussion, we will often exclude the case of ∨ nodes, and give upper bounds only for the case where those nodes are not used. Practical experience showed that disjunction, being redundant with the creation of two separate rules, is a useful but less frequently used feature of the formalism. Nevertheless, we shall give some insights in Appendix A.1.3 about how to take disjunction into account in the final estimate. Second, we introduce a notion of G-density of a language: Definition 19 (G-density of a language) Let m ∈ N. Let G = (N, δ, Σ, F, P, S) be an m-parallel IDL-MCF grammar. Let T ⊂ Σ* be such that ε ∉ T. The G-density of T is defined as ρ_{T,G} := sup_{t∈T} (1/|t|) · max_{A∈N, 1≤i≤δ(A)} |{t_p | p ∈ P([[1,|t|]]), t_p is matched by the ith field of A}|, where t_p denotes the substring of t composed of the tokens at positions p in t (see notations).
The G-density of a language serves as a proxy for the amount of ambiguity that this language contains from the point of view of grammar G. It answers the question: 'How many different substrings of any string in the language can be matched by the same field of the same category?'. This 'how many' is quantified as a quotient of the number of different matches over the length of the input string. Introducing G-density will allow us to discuss the worst-case complexity of parsing on reasonable sets of inputs, i.e. those for which ρ_{T,G} is finite or, equivalently, for which the number of matched substrings grows at most linearly in the size of the string.
Consider the case where we want to describe adjective-noun attachment in a natural language where adjectives can be placed arbitrarily before or after the noun they modify. We are given an (arbitrarily large) lexicon with a number of terminal rules producing adjectives (of category A) and nouns (of category N). These terminals are stored in an alphabet we denote by Σ. The part of the grammar building noun phrases (of category S) in our toy IDL-CF grammar simply interleaves the noun with any number of adjectives. Now, how ambiguous can noun phrases be? If we take T = Ω := Σ* \ {ε}, then considering a string with a noun and n arbitrary adjectives in any order results in all substrings containing the noun being valid noun phrases; in this case, ρ_{T,G} ≥ 2^n/(n+1) for all n ∈ N+, and therefore ρ_{T,G} = +∞. But if we now consider the (more practical) case where the number of adjectives to be attached to the same noun is less than some reasonable constant M, and call U the corresponding sublanguage of Ω, then we get no more than 2^min(k−1,M) different matching substrings for any input of length k ≥ 1; as a consequence, ρ_{U,G} = sup_{k∈N+} 2^min(k−1,M)/k = 2^M/(M+1) < +∞. Third, we need to keep in mind the following fact, which is an immediate consequence of Theorem 1: Theorem 3 IDL-PMCFG parsing is NP-complete. Therefore, unless P = NP, general IDL-PMCFG parsing is not polynomial in the size of the input string.
Proof Theorem 1 provides a reduction from the NP-complete problem 3SAT to parsing with the IDL-PMCF grammar G from Lemma 3.
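Returning to the adjective-noun example, the counting argument behind ρ_{U,G} can be checked numerically. The sketch below simply enumerates the position sets that contain the noun, which is what the toy grammar matches (the helper names are, of course, ours):

```python
from itertools import combinations

def noun_phrase_matches(k):
    """All position sets of a k-token string (noun last) that contain the noun."""
    noun = k - 1
    out = []
    for r in range(k):
        for combo in combinations(range(k - 1), r):
            out.append(frozenset(combo) | {noun})
    return out

# A string with n adjectives and one noun yields 2**n matched substrings,
# hence a density ratio of 2**n / (n + 1) for that input.
counts = {k: len(noun_phrase_matches(k)) for k in range(1, 7)}

# Restricting to at most M adjectives per noun caps the supremum at 2**M / (M + 1),
# reached at input length k = M + 1.
M = 4
rho = max(2 ** (k - 1) / k for k in range(1, M + 2))
```

The unbounded case diverges because 2^n/(n+1) grows without limit, while the bounded case tops out at k = M + 1, in line with the closed form above.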

Measuring IDL Graphs
We now introduce two measures of the complexity of IDL graphs, encoded in two primitives height and width. While height was already defined in Nederhof and Satta's paper (though it was there called width, and defined in a slightly different manner), width plays a new and complementary role that we shall emphasize later. The informal interpretation of these metrics is simple: height measures the maximal number of branches that can be traversed in parallel, while width quantifies the maximal number of edges labelled with a terminal or ε on any left-to-right path from the start to the end node.
Definition 20 (Height and width of an IDL expression graph) Let Σ be a set of symbols that does not contain ε, and E the set of IDL expressions over Σ. The height and width of an IDL expression e ∈ E are defined inductively as follows: height(ε) := height(a) := 1 and width(ε) := width(a) := 1 for all a ∈ Σ; height(e_1 · e_2) := max(height(e_1), height(e_2)) and width(e_1 · e_2) := width(e_1) + width(e_2); height(× e) := height(e) and width(× e) := width(e); height(∨(e_1, …, e_n)) := max_i height(e_i) and width(∨(e_1, …, e_n)) := max_i width(e_i); height(||(e_1, …, e_n)) := Σ_i height(e_i) and width(||(e_1, …, e_n)) := max_i width(e_i).
In the graph γ of Fig. 9, we have width(γ) = 3 (a left-to-right path through the graph contains at most three edges labelled with a terminal or ε) and height(γ) = 6 (there are at most six nodes in a cut, or equivalently six branches traversed in parallel).
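The inductive clauses translate directly into a recursive traversal of the expression tree. The sketch below uses a hypothetical tuple encoding of IDL expressions (the tags "sym", "cat", "lock", "or" and "ilv" are our own):

```python
def width(e):
    """Maximal number of terminal/epsilon edges on a left-to-right path."""
    op, args = e
    if op == "sym":                          # terminal or epsilon
        return 1
    if op == "cat":                          # concatenation: widths add up
        return sum(width(a) for a in args)
    if op == "lock":                         # lock: unchanged
        return width(args[0])
    return max(width(a) for a in args)       # "or" and interleave: widest branch

def height(e):
    """Maximal number of branches traversed in parallel."""
    op, args = e
    if op == "sym":
        return 1
    if op == "lock":
        return height(args[0])
    if op == "ilv":                          # interleave: branches run in parallel
        return sum(height(a) for a in args)
    return max(height(a) for a in args)      # concatenation and "or": max

# ||(a . b, c): width 2 (longest branch), height 2 (two parallel branches)
example = ("ilv", [("cat", [("sym", "a"), ("sym", "b")]), ("sym", "c")])
```

Both metrics are linear-time in the size of the expression, so precomputing them for every expression in the grammar is cheap.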

Final Complexity Estimate
Based on the previous definitions of height and width, we can prove the following result. Theorem 4 Let m ∈ N. Let G = (N, δ, Σ, F, P, S) be an m-parallel IDL-MCF grammar. Let E be the set of IDL expressions used in G, and assume that no e ∈ E contains a ∨ node. Let T ⊂ Σ* and put ρ := ρ_{T,G}. Let w := max_{e∈E} width(e), h := max_{e∈E} height(e), R := |P|, α := max_{f∈F} a(f) and M := max_{e∈E} |e|. Finally, let t ∈ T and n := |t|, and assume that w ≥ 6 and n ≥ w. An upper bound on the complexity of the parsing algorithm described in Algorithm 2 is then given as a function of these parameters. Proof See Appendix A.1.

Conclusion
In this paper, we have presented and studied IDL-PMCFG, a new grammatical formalism that extends PMCFG with Nederhof and Satta's IDL expressions. This formalism, along with its GF-like experimental front-end COMPĀ, was designed as a tool to formally encode the syntax of free word order languages. COMPĀ, its IDL-PMCFG backbone and the associated parsing algorithm have been implemented in an experimental setup focussing on the parsing of Classical Latin. The corresponding code can be found in our GitHub repository. To our knowledge, this formalism is to date the only one to allow for a straightforward, wide-coverage syntactic description of Classical Latin and similar languages for rule-based parsing purposes. The fact that IDL-PMCFG extends PMCFG with only two new operators should make extending existing tools to support hyperbatic constructions comparatively smoother than if an ad hoc approach to Latin syntax had been chosen.
In order to easily encode the kind of extensive free word order encountered in Classical Latin, an operator allowing grammatical constituents to be swapped and intertwined, the || operator, is required; no less required for conciseness is the ability, in particular instances, to impose fixed constituent order (through the · operator) or non-interleaving, or locking, of constituents (through the × operator). Since these operators are virtually unavoidable when it comes to providing a linguistically intuitive description of the actual syntactic constraints in the language, it is reasonable to think of IDL-PMCFG as the "smallest extension of PMCFG with built-in support for free word order as observed in Classical Latin". Note that this does not mean that IDL-PMCFG would be the smallest extension of CFG with this same property, since copying is not required to encode hyperbatic constructions. Besides the design and analysis of the formalism itself, one of the main contributions of this paper is the classification result of Theorem 1. This theorem has two main consequences.
The first one is mainly theoretical: whenever a CFG-derived grammatical formalism is coupled with IDL expressions and includes a record system that does not restrict copying, parsing in this formalism must, in the worst case, be non-polynomial in the size of the input. As an immediate corollary, such formalisms cannot be mildly context-sensitive. In fact, even if we disallowed copying, there is no way a formalism able to generate Latin hyperbatic structures in a linguistically meaningful way could be mildly context-sensitive: Becker et al. (1992) showed that scrambling as it occurs in German (a kind of free word order that is strictly less general than Latin hyperbata) is not mildly context-sensitive.
The second, more practical consequence is that the corresponding parsing algorithms will not be polynomial. This does of course not mean that parsing will be intractable altogether, since practical linguistic settings rarely present the level of ambiguity that leads to theoretical worst cases; our first experiments would rather suggest the opposite. Studying the complexity of the IDL-PMCFG parsing algorithm in the particular case of IDL-MCF grammars without copy would be an interesting path for further research.
Since a majority of works in formal NLP draw most of their examples from fixed word order languages (most notably English, but also to a certain extent German, French, or Chinese) in which hyperbata are almost always ungrammatical, it might be tempting to think that mild context sensitivity, and in particular polynomial-time parsing, is sufficiently expressive to account for syntactic phenomena in a vast majority of instances and for almost all natural languages. This view indeed seems to be widely accepted,7 and it has proved practical in many cases, often preventing unnecessary computational explosion. Becker et al. (1992) showed, however, that its theoretical accuracy should be questioned. Indeed, in the light of this study, the alleged minimality of mildly context-sensitive languages, while not contradicted by the grammar of English and similar languages, appears to have somewhat underestimated the complexity of (very) free word order: Classical Latin and Greek, Sanskrit, etc. present hyperbatic constructions that are considerably more complex than German scrambling. These may well be specific languages, and, in one sense, they are: Classical Latin and Greek, as well as Sanskrit, belong to a rather extremal subset of the (somewhat imprecise) galaxy of so-called free word order languages. In these languages, audacious interleavings and permutations have become part of a canon of refined rhetorical and prosodic effects, further enhancing the natural syntactic flexibility of a morphologically rich linguistic system. Many other idioms, such as English, are not concerned by this kind of phenomenon, and it would be equally unsatisfying to impose free word order formalisms upon them. What is at stake is not so much the pertinence of mild context sensitivity for the vast majority of formal NLP applications, but rather its universality throughout natural languages.
Two important questions remain open. On the formal side, the positions of IDL-CFGs and IDL-MCFGs (without copy) in the hierarchy are still unclear, and so is the complexity of their respective parsing algorithms. On the linguistic side, the level of expressivity needed to account for hyperbaton and locking of clauses is not precisely known. Answering these two questions would provide a more complete insight into the level of syntactic complexity of free word order in natural language, while paving the way for the development of more efficient description and parsing systems.
Finally, we define sort as the unique function from ⋃_{k∈N+} F_k into itself such that sort|_{F_k} = sort_k for all k ∈ N+. We can now introduce relaxed IDL expressions.
Definition 22 (Relaxed IDL expression) Let Σ be a set of symbols that does not contain ε, and E* the set of IDL expressions over Σ that do not contain any ∨ node. The relaxation ẽ of an IDL expression e ∈ E* is defined inductively as follows: ã := a for a ∈ Σ ∪ {ε}, identifying a with the one-branch interleave ||(a); (× e)~ := ẽ; if ẽ_1 = ||(u_1, …, u_p) and ẽ_2 = ||(v_1, …, v_q) with p ≤ q, then (e_1 · e_2)~ := sort(||(u_1 · v_1, …, u_p · v_p, v_{p+1}, …, v_q)), and symmetrically if p > q; (||(e_1, …, e_n))~ := sort(||(w_11, …, w_1m_1, …, w_n1, …, w_nm_n)), where ẽ_i = ||(w_i1, …, w_im_i) for all i ∈ [[1, n]]. Relaxed IDL expressions over Σ are thus of the form e = ||(a_11 · … · a_1k_1, …, a_n1 · … · a_nk_n) where a_ij ∈ Σ ∪ {ε} for all i ∈ [[1, n]] and j ∈ [[1, k_i]], and k_1 ≥ … ≥ k_n; in particular, height(e) = n and width(e) = k_1. A simple simultaneous induction is sufficient to prove both this fact and the well-definedness of relaxation.
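Relaxation can be sketched as a function flattening a ∨-free expression into a sorted list of branches, using a hypothetical tuple encoding with tags "sym", "cat", "lock" and "ilv" (a branch is a plain list of symbols, and empty padding branches play the role of ε):

```python
def relax(e):
    """Relaxed form of a vee-free IDL expression, as a list of branches
    (lists of symbols) sorted by decreasing length."""
    op, args = e
    if op == "sym":                   # a single symbol is a one-branch interleave
        return [[args]]
    if op == "lock":                  # locks are dropped by relaxation
        return relax(args[0])
    if op == "ilv":                   # interleave: pool the branches of all operands
        out = []
        for a in args:
            out.extend(relax(a))
        return sorted(out, key=len, reverse=True)
    # concatenation: pad the shorter branch list, then concatenate pairwise
    b1, b2 = relax(args[0]), relax(args[1])
    n = max(len(b1), len(b2))
    b1 = b1 + [[] for _ in range(n - len(b1))]
    b2 = b2 + [[] for _ in range(n - len(b2))]
    return sorted((x + y for x, y in zip(b1, b2)), key=len, reverse=True)

# (a || b) . c relaxes to ||(a.c, b): the branch count (height) and the
# longest-branch length (width) of the original expression are preserved.
relaxed = relax(("cat", [("ilv", [("sym", "a"), ("sym", "b")]), ("sym", "c")]))
```

Pairing the longest branch of one operand with the longest branch of the other is what makes the widths add up under concatenation, as required by Proposition 9.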
Relaxation increases the number of substrings of the input that can be matched by an expression: Lemma 4 Let Σ be a set of symbols that does not contain ε, and E* the set of IDL expressions over Σ that do not contain any ∨ node. Let e ∈ E* and t ∈ Σ*. For all p ∈ P([[1, |t|]]), if t_p is recognized by e, then t_p is also recognized by ẽ.
Proof By immediate induction on e ∈ E * .
We can now check that relaxation keeps height and width invariant: Proposition 9 Let Σ be a set of symbols that does not contain ε, and e an IDL expression over Σ that does not contain ∨. We have width(ẽ) = width(e) and height(ẽ) = height(e).
-If e = × e′, then by induction hypothesis width(ẽ) = width(ẽ′) = width(e′) = width(e) and height(ẽ) = height(ẽ′) = height(e′) = height(e).
-If e = ||(e_1, …, e_n), the branches of ẽ are exactly the branches of ẽ_1, …, ẽ_n pooled together, so by induction hypothesis height(ẽ) = Σ_i height(ẽ_i) = Σ_i height(e_i) = height(e) and width(ẽ) = max_i width(ẽ_i) = max_i width(e_i) = width(e). The other important result that we need to prove is, essentially, that relaxation can only increase the number of states in the IDL graph. Unfortunately, this is not true when all states are taken into account: relaxation sometimes leads to merging interleave branches, resulting in the disappearance of split and merge nodes. But the failing argument can be fixed by considering only those states that actually matter in the complexity analysis, i.e. those from which no ε-transition is available. This is captured by the following definition of so-called stable states: states that have no outgoing ε-transition.
Definition 23 (Stable states) Let Σ be a set of symbols that does not contain ε, and e an IDL expression over Σ that does not contain ∨. Let S_e be the state space of e. The set of general stable states of e is defined as S̄_e := {s ∈ S_e | ∀s′ ∈ S_e, ¬Δ_e(s, ε, ∅, s′)}, and the set of t-specific stable states of e as S̄_{e,t} := {s ∈ S_{e,t} | ∀s′ ∈ S_{e,t}, ¬Δ_e(s, ε, ∅, s′)}.
We can now prove that through relaxation, the number of (specific) stable states can only increase.
Proposition 10 Let Σ be a set of symbols that does not contain ε, and e an IDL expression over Σ that does not contain ∨. For all t ∈ Σ*, we have |S̄_{e,t}| ≤ |S̄_{ẽ,t}|.

Proof
We prove this by induction on e ∈ E.
• If e = a, then ẽ = e and there is nothing to prove.
• If e = e_1 · e_2, assume without loss of generality that k_2 ≥ k_1, where k_i denotes the number of branches of ẽ_i.
The situation is as follows (implicit i and ī labels have been omitted). Red vertices and sets denote unstable (non-stable) states. Purple vertices denote vertices that cannot belong to any stable state because they carry the left end of an ε-transition. We first prove that, for all t ∈ T, |S̄_{ẽ_1 · ẽ_2, t}| ≤ |S̄_{(e_1 · e_2)~, t}|. This can be done by identifying, for each t ∈ T, an injective map S̄_{ẽ_1 · ẽ_2, t} → S̄_{(e_1 · e_2)~, t}. Let t ∈ T. We build a map f as follows: -Each stable state whose cut is in ẽ_1 is mapped to its natural counterpart in (e_1 · e_2)~, where v_i is replaced by x_i for all i ∈ [[1, k_1]] and all (x_i)_{i∈[[k_1+1, k_2]]} are added with stack [∅, ∅]; -Each stable state whose cut is in ẽ_2 is mapped to its natural counterpart in (e_1 · e_2)~, where w_i is replaced by y_i for all i ∈ [[1, k_2]]; -The final state of ẽ_1 · ẽ_2 is mapped to the final state of (e_1 · e_2)~.
It is easy to see that f is injective. Therefore, |S̄_{ẽ_1 · ẽ_2, t}| ≤ |S̄_{(e_1 · e_2)~, t}|. We then have to check that, for all t ∈ T, |S̄_{e_1 · e_2, t}| ≤ |S̄_{ẽ_1 · ẽ_2, t}|. Let t ∈ T. We build an injective map f′ as before. Note that cuts internal to either e_1 or e_2 are easy to handle: the top of the stack is used independently in every branch to decide whether a given state can be accessed or not. The only difficulty thus concerns the final cut. Now, the states provided by the final cut of any IDL expression e′ are in bijection with the sets of positions p ∈ P([[1, |t|]]) such that t_p is matched by e′. Lemma 4 then concludes the proof. Now we have |S̄_{ẽ, t}| = |S̄_{(e_1 · e_2)~, t}| ≥ |S̄_{ẽ_1 · ẽ_2, t}| ≥ |S̄_{e_1 · e_2, t}| = |S̄_{e, t}|.
• If e = × e′, then ẽ = ẽ′, whose graph is that of (× e′)~ with the lock nodes removed (with the same color coding as above); the rest of the proof is straightforward. • If e = ||(e_1, …, e_n), the situation is as follows (with the same conventions as before), up to a reordering of the branches triggered by sort. We build a bijective map f: S̄_{||(ẽ_1, …, ẽ_n), t} → S̄_{ẽ, t} as follows: -Let s ∈ S̄_{||(ẽ_1, …, ẽ_n), t}. If s is not the final node, write s as s_1 ⊗ … ⊗ s_n with s_i ∈ S̄_{ẽ_i, t} for i ∈ [[1, n]]. For i ∈ [[1, n]], define s′_i as the natural counterpart of s_i in ẽ, except for s_i = {(v_1, z)}, which has no counterpart in ẽ and is associated with {(x_{i1}, z_1), …, (x_{ik_i}, z_{k_i})}, where z_1, …, z_{k_i} are stacks chosen to replicate the non-stable state leading to (v_1, z). Put f(s) = s′_1 ⊗ … ⊗ s′_n. -If s is the final node, map it to the final node of ẽ; associativity of interleave guarantees that the mapping between states based on the final nodes is bijective.
This ensures that |S̄_{||(ẽ_1, …, ẽ_n), t}| = |S̄_{ẽ, t}|. The rest of the proof exploits the same arguments as for concatenation.

A.1.2 Counting Parse States in Relaxed IDL Graphs
Notations and assumptions. In the remainder of this subsection, we fix a set of symbols Σ that does not contain ε and call E* the set of IDL expressions over Σ that do not contain ∨; we further denote by Ẽ the set of relaxed IDL expressions over Σ. We consider e ∈ Ẽ and define w := width(e), h := height(e). We choose a set T ⊂ Σ*, an IDL-PMCF grammar G, and put ρ := ρ_{T,G}. We finally choose t ∈ T, let n := |t| and Q_t := {p ∈ P([[1, |t|]]) | t_p ∈ L(e)}, and assume w ≥ h.
The goal of this subsection is to prove Theorem 5 below, which provides an upper bound for |S̄_{e,t}| as a function of w, h and ρ, as well as of the size n of t. The definition of E ,n and associated combinatorial results can be found in the appendix.
As every IDL-PMCF grammar can be easily translated into an equivalent IDL-PMCFG without ∨ node by adding new rules, this does not restrict the set of languages that can be matched. At the end of this section, we will briefly address the role of the ∨ node in the complexity of our algorithm and propose a simple approach to efficiently measuring it.
The complexity of the algorithm is measured in terms of the number of elementary operations. Creating and adding an item to the current environment is considered an O(1). Listing available transitions to new states given a state, a (non)terminal and an associated set of positions is an O(hn): the size of a cut is at most h and operations on position sets have a complexity of O(n). The constant is improved in practice by pre-computing all available transitions for every new active item to be added to the environment. Denoting by α the maximal arity of any rule in G (α := max_{f∈F} a(f)), we find that reserve is an O(αm), whereas compat and unify are O(αmn). Initializing a context is also an O(αm).
To simplify the analysis, the contribution of any of the 8 mutually recursive functions is evaluated by multiplying an upper bound of the total number of times it will be called by the cost of the elementary operations it uses, excluding recursive calls. The number of rules in G is defined as R = |P| and the maximal number of nodes in any IDL expression in G is denoted by M. We will show that the number of times each function is called can be easily bounded by a function of the maximal (or, equivalently, final) number of: -Active items in E, which we denote by A; -Passive items in E, which we denote by P; -Complete items in E, which we denote by C.
We can now inspect the complexity of each of the nine functions presented in Algorithm 2.
Parse The two for loops have a complexity of O(nA).
PredictAll The function itself is called only once by Parse. The internal code is executed at most mR times. It consists of a context initialization (O(αm)) and an item creation (O(1)). The contribution of PredictAll is therefore an O(m^2 Rα).
TryScan The function is called at most O(nA) times by Parse. Computing available transitions is an O(hn) (see above). For each transition, the cost of operations is constant. This results in an overall O(Ahn^2).
TrySave The function is called exactly P times. After a constant-cost test and identification of the production at stake, new items are added to the environment in time O(1), resulting in an O(P).
TryStep The function is called at most mAMR times: each node is processed at most once for each new active item. Its cost is an O(hn). Its overall contribution to complexity is therefore O(hmAMRn).
TryUnify The function is called exactly P times. The total number of completed items of a given category in the environment is at most (ρn + 1)^m (each field provides at most ρn matches, from which at most one must be selected). At most that many items must be checked by the outer loop; the inner loop then has cost O(αmn). The final complexity is O(Pαmn(ρn)^m).
The total complexity of the algorithm is now O(nA + m^2 Rα + Ahn^2 + P + hmAMRn + ACαhmn^2 + Pαmn(ρn)^m) = O(m^2 Rα + hmAMRn + ACαhmn^2 + Pαmn(ρn)^m) = O(m(mRα + AMRhn + ACαhn^2 + Pαn(ρn)^m)).

TryCombineLHS
Using the results of the previous part and the remarks above, we can now compute simple upper bounds for A, C and P as follows. Let Ē be the image of E under e ↦ ẽ and S := max_{e′∈Ē} |S̄_{e′, t}|.