Semi-Oblivious Chase Termination: The Sticky Case

The chase procedure is a fundamental algorithmic tool in database theory with a variety of applications. A key problem concerning the chase procedure is all-instances termination: for a given set of tuple-generating dependencies (TGDs), is it the case that the chase terminates for every input database? In view of the fact that this problem is undecidable, it is natural to ask whether known well-behaved classes of TGDs, introduced in different contexts such as ontological reasoning, ensure decidability. We consider a prominent paradigm that led to a robust TGD-based formalism, called stickiness. We show that for sticky sets of TGDs, all-instances chase termination is decidable if we focus on the (semi-)oblivious chase, and we pinpoint its exact complexity: PSpace-complete in general, and NLogSpace-complete for predicates of bounded arity. These complexity results are obtained via a graph-based syntactic characterization of chase termination that is of independent interest.


Introduction
The chase procedure (or simply chase) is a fundamental algorithmic tool that has been successfully applied to several database problems such as containment of queries under constraints [2] and checking logical implication of constraints [5,27]. The restricted chase has a clear advantage over the (semi-)oblivious chase when it comes to the size of the final result. But, of course, this advantage does not come for free: at each application, the restricted chase has to check that there is no way to satisfy the head of the TGD at hand, and this can be computationally costly in practice. On the other hand, the advantage of the semi-oblivious chase over the oblivious chase comes without any additional overhead since both versions have to keep track of the TGDs and pairs of tuples that have been considered so far.
It has been recently observed that for RAM-based implementations the restricted chase is the preferred approach, since the benefit from producing smaller instances justifies the additional effort for checking whether a TGD is already satisfied; see, e.g., [6,22]. However, as discussed in [6], an RDBMS-based implementation of the restricted chase is quite challenging, whereas an efficient implementation of the semi-oblivious chase is feasible. Hence, both the semi-oblivious and restricted versions of the chase are relevant algorithmic tools for practical implementations, whereas the oblivious version of the chase is mostly of theoretical interest.

The Challenge of Non-Termination
There are indeed efficient implementations of the semi-oblivious and restricted chase that allow us to solve central database problems by adopting a materialization-based approach [6,22,30,33]. Nevertheless, for this to be feasible in practice we need a guarantee that the chase terminates, which is not always the case. This fact motivated a long line of research on identifying classes of TGDs that ensure the termination of the semi-oblivious and/or restricted chase, regardless of what the input database looks like. A prime example is the class of weakly-acyclic sets of TGDs [17], which was introduced in the context of data exchange, and guarantees the termination of both the semi-oblivious and restricted chase. Many other termination criteria can be found in the literature; see, e.g., [4, 9, 14-16, 21, 23, 28, 29].
With so much effort spent on identifying sufficient conditions for the termination of the chase, the question that immediately comes up is whether a sufficient condition that is also necessary exists. In other words, given a set Σ of TGDs, is it possible to decide whether, for every database D, the semi-oblivious or the restricted chase on D and Σ terminates? The answer is negative, no matter which version of the chase we consider [18]; this is actually true even for the oblivious version of the chase. The problem remains undecidable even if the database is known; this has been established in [15] for the restricted chase, and it was observed in [28] that the same proof shows undecidability also for the (semi-)oblivious chase.

Deciding the Termination of the Chase
The undecidability proof given in [18] constructs a sophisticated set of TGDs that goes beyond existing well-behaved classes of TGDs that enjoy certain syntactic properties, which in turn ensure useful model-theoretic properties. Such well-behaved classes of TGDs have been proposed in the context of ontological reasoning. The two main paradigms that led to robust TGD-based formalisms, without forcing the chase to terminate, are guardedness [4,11,12] and stickiness [13]:

Guardedness A TGD is guarded if its body has an atom that contains (or "guards") all the universally quantified variables. This condition has been inspired by the guarded fragment of first-order logic, and is powerful enough to capture important Description Logics (DLs) such as the members of the EL family. The key model-theoretic property of the class of guarded TGDs, which explains its robust behaviour, is the existence of tree-like universal models [11].

Stickiness On the other hand, sticky sets of TGDs are powerful enough to model interesting statements that are inherently non-tree-like, and thus, not expressible via guarded TGDs. Such a statement, for example, consists of the TGDs

∀x∀y (R(x, y) → ∃z (R(y, z) ∧ P(z)))
∀x∀y (P(x) ∧ P(y) → S(x, y)),

which compute the cartesian product of an infinite unary relation, a useful modeling feature that, in DL parlance, is known as concept product [32].

The fact that the set of TGDs constructed in the undecidability proof of [18] is neither guarded nor sticky raised the following question: is the semi-oblivious/restricted chase termination problem decidable for guarded or sticky sets of TGDs? The current state of affairs concerning this central question is as follows:
-For the semi-oblivious chase and guarded TGDs the problem is well-understood: it is 2EXPTIME-complete in general, and EXPTIME-complete for predicates of bounded arity [8]. The same paper [8] also considered linear TGDs, i.e., TGDs with only one body atom, which form a central subclass of guarded TGDs: the problem becomes PSPACE-complete, and NLOGSPACE-complete for predicates of bounded arity. An alternative proof for linear TGDs, which relies on derivation trees, a notion that was originally introduced in the context of ontological query answering [3], has been recently proposed in [24].
-For the restricted chase and guarded TGDs, it has been recently shown, by exploiting Monadic Second-Order Logic over infinite trees of bounded degree, that the problem is decidable in elementary time assuming only one atom in the head, whereas the case of more than one atom in the head remains a challenging open problem [19]. The case of linear TGDs with only one atom in the head has been explicitly considered in [24], where the decidability of the problem in question has been shown by relying on derivation trees.
-Finally, for the restricted chase and sticky sets of TGDs, it has been recently shown, by exploiting Büchi Automata, that the problem is decidable in elementary time assuming only one atom in the head, whereas the case of more than one atom in the head remains a challenging open problem [19].
Towards completing the picture concerning the chase termination problem, in this work we concentrate on the semi-oblivious chase and sticky sets of TGDs, and provide precise complexity results: PSPACE-complete in general, and NLOGSPACE-complete for predicates of bounded arity. Our results also apply to the oblivious chase which, although not very useful for practical purposes, is a relevant technical tool due to its simplicity; for a discussion on the usefulness of the oblivious chase see [11].

Summary of Contributions
Our results can be summarized as follows:
-In Section 4, we provide a semantic characterization of non-termination of the semi-oblivious chase under sticky sets of TGDs via the existence of "path-like" infinite chase derivations, which forms the basis for our decision procedure.
-By exploiting the above semantic characterization, we then provide, in Section 5, a syntactic characterization of semi-oblivious chase termination via a graph-based condition. To this end, we exploit a recent syntactic characterization from [8] of the termination of the semi-oblivious chase under linear TGDs.
-In Section 6, by using the graph-based syntactic characterization from the previous section, we establish the desired complexity upper bounds for our problem: PSPACE in general, and NLOGSPACE for predicates of bounded arity. We finally establish matching lower bounds. The PSPACE-hardness is obtained by simulating the behaviour of a polynomial space Turing machine, while the NLOGSPACE-hardness is obtained via a reduction from graph reachability.
We consider the mutually disjoint countably infinite sets C, N, and V of constants, (labeled) nulls, and variables, respectively. We refer to constants, nulls and variables as terms. For an integer n > 0, we may write [n] for the set {1, . . . , n}.

Relational Databases
A schema S is a finite set of relation symbols (or predicates) with associated arity. We write R/n to denote that R has arity n ≥ 0; we may also write ar(R) for the integer n. A (predicate) position of S is a pair (R, i), where R/n ∈ S and i ∈ [n], that essentially identifies the i-th argument of R. We write pos(S) for the set of positions of S, that is, the set {(R, i) | R/n ∈ S and i ∈ [n]}. An atom over S is an expression of the form R(t̄), where R/n ∈ S and t̄ is an n-tuple of terms. A fact is an atom whose arguments consist only of constants. For an atom R(t̄) and a term x, we write pos(R(t̄), x) for the set of positions of R(t̄) at which x occurs. We write var(R(t̄)) for the set of variables in t̄. The notations pos(·, x) and var(·) extend to sets of atoms. An instance over S is a (possibly infinite) set of atoms over S with constants and nulls. A database over S is a finite set of facts over S. The active domain of an instance I, denoted dom(I), is the set of terms in I.

Substitutions and Homomorphisms
A substitution from a set of terms T to a set of terms T' is a function h : T → T'. Henceforth, we treat a substitution h as the set of mappings {t ↦ h(t) | t ∈ T}. The restriction of h to a subset S of T, denoted h|S, is the substitution {t ↦ h(t) | t ∈ S}. A homomorphism from a set of atoms A to a set of atoms B is a substitution h from the set of terms in A to the set of terms in B such that (i) t ∈ C implies h(t) = t, i.e., h is the identity on C, and (ii) R(t_1, . . . , t_n) ∈ A implies h(R(t_1, . . . , t_n)) = R(h(t_1), . . . , h(t_n)) ∈ B.
Tuple-Generating Dependencies A tuple-generating dependency (TGD) σ is a first-order sentence (without constants) of the form

∀x̄∀ȳ (φ(x̄, ȳ) → ∃z̄ ψ(x̄, z̄)),

where x̄, ȳ and z̄ are mutually disjoint tuples of variables of V, while φ(x̄, ȳ) and ψ(x̄, z̄) are conjunctions of atoms. For brevity, we write σ as φ(x̄, ȳ) → ∃z̄ ψ(x̄, z̄), and use comma instead of ∧ for joining atoms. We refer to φ(x̄, ȳ) and ψ(x̄, z̄) as the body and head of σ, denoted body(σ) and head(σ), respectively. The frontier of the TGD σ, denoted fr(σ), is the set of variables x̄, i.e., the variables that appear both in the body and the head of σ. Note that, by abuse of notation, we may treat a tuple of variables as a set of variables, and a conjunction of atoms as a set of atoms. The schema of a set Σ of TGDs, denoted sch(Σ), is the set of predicates occurring in Σ, and we write ar(Σ) for the maximum arity over all those predicates. An instance I satisfies a TGD σ as the one above, written I |= σ, if the following holds: whenever there exists a homomorphism h from φ(x̄, ȳ) to I, there exists h' ⊇ h|x̄ that is a homomorphism from ψ(x̄, z̄) to I. The instance I satisfies a set Σ of TGDs, written I |= Σ, if I |= σ for each σ ∈ Σ.
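The satisfaction condition I |= σ can be checked by brute force on a finite instance. The following is a minimal sketch under assumptions that are not part of the paper: atoms are encoded as (predicate, argument-tuple) pairs, a TGD as a (body, head) pair of atom lists in which every term is a variable name, and the function names extend and satisfies are hypothetical.

```python
def extend(atoms, instance, h):
    """Yield every extension of the mapping h that sends all atoms into instance."""
    if not atoms:
        yield h
        return
    (pred, args), rest = atoms[0], atoms[1:]
    for fpred, fargs in instance:
        if fpred == pred and len(fargs) == len(args):
            g = dict(h)
            # setdefault binds a fresh variable or returns its existing image
            if all(g.setdefault(v, t) == t for v, t in zip(args, fargs)):
                yield from extend(rest, instance, g)


def satisfies(instance, tgd):
    """Check I |= sigma for a finite instance (illustrative encoding only)."""
    body, head = tgd
    head_vars = {v for _, args in head for v in args}
    for h in extend(body, instance, {}):
        # h restricted to the frontier, i.e., the body variables that occur in the head
        frontier_map = {v: t for v, t in h.items() if v in head_vars}
        if next(extend(head, instance, frontier_map), None) is None:
            return False
    return True
```

For instance, with the TGD R(x, y) → ∃z R(y, z), the instance {R(a, a)} satisfies it, whereas {R(a, b)} does not, since no atom of the form R(b, ·) is present.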
Stickiness One of the main syntactic paradigms for TGDs is stickiness [13]. The key property underlying this condition is as follows: variables that appear more than once in the body of a TGD should be inductively propagated (or "stick") to every head atom. This can be graphically illustrated as

[Figure: two example sets of TGDs]

where the first set of TGDs is sticky, while the second is not. The formal definition relies on an inductive procedure that marks the variables that may violate the above property. The base step marks a variable in the body of a TGD that does not occur in every head atom. Then, the marking is inductively propagated as follows:

[Figure: inductive propagation of the marking]

Stickiness requires a marked variable to appear only once in the body of a TGD. Let us now formalize the above intuitive discussion. Consider a set Σ of TGDs; we assume, w.l.o.g., that the TGDs in Σ do not share variables. Let σ ∈ Σ and x a variable in body(σ). We inductively define when x is marked in Σ:
-If x does not occur in every atom of head(σ), then x is marked in Σ.
-Assuming that head(σ) contains an atom of the form R(t̄) and x ∈ t̄, if there exists σ' ∈ Σ that has in its body an atom of the form R(t̄'), and each variable in R(t̄') at a position of pos(R(t̄), x) is marked in Σ, then x is marked in Σ.
The set Σ is sticky if there is no TGD whose body contains two occurrences of a variable that is marked in Σ. We denote by S the family of all sticky finite sets of TGDs.
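The marking procedure and the resulting stickiness test can be sketched as a simple fixpoint computation. This is an illustration only, under the following assumptions that are not part of the paper: atoms are (predicate, argument-tuple) pairs, a TGD is a (body, head) pair of atom lists, variable names are not shared between TGDs (as assumed w.l.o.g. above), and the names marked_variables, propagates and is_sticky are hypothetical.

```python
def propagates(x, head, tgds, marked):
    """Inductive step: x occurs in a head atom R(t), and some body atom R(t')
    carries only marked variables at the positions of x in R(t)."""
    for pred, hargs in head:
        positions = [i for i, t in enumerate(hargs) if t == x]
        if not positions:
            continue
        for body2, _ in tgds:
            for pred2, bargs in body2:
                if pred2 == pred and all(bargs[i] in marked for i in positions):
                    return True
    return False


def marked_variables(tgds):
    """Compute the set of body variables that are marked in the given set of TGDs."""
    marked = set()
    # Base step: a body variable that is missing from some head atom is marked.
    for body, head in tgds:
        for _, args in body:
            for x in args:
                if any(x not in hargs for _, hargs in head):
                    marked.add(x)
    # Inductive step, repeated until no new variable gets marked.
    changed = True
    while changed:
        changed = False
        for body, head in tgds:
            body_vars = {x for _, args in body for x in args}
            for x in body_vars - marked:
                if propagates(x, head, tgds, marked):
                    marked.add(x)
                    changed = True
    return marked


def is_sticky(tgds):
    """Sticky iff no marked variable occurs twice in the body of a TGD."""
    marked = marked_variables(tgds)
    for body, _ in tgds:
        occurrences = [x for _, args in body for x in args]
        if any(occurrences.count(x) > 1 for x in marked):
            return False
    return True
```

On the concept-product TGDs from the introduction (with variables renamed apart), only the two body variables of the first TGD are marked, and each occurs once in its body, so the set is sticky. The transitivity rule R(x, y), R(y, z) → R(x, z) is rejected, since the marked variable y occurs twice in the body.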

The Chase Procedure
The chase procedure (or simply chase) accepts as input a database D and a set Σ of TGDs, and constructs a (possibly infinite) instance that contains D and satisfies Σ. Central notions in this context are those of active trigger and trigger application, which come in two variations, oblivious and semi-oblivious, which in turn give rise to the oblivious [11] and the semi-oblivious [20,28] chase. The key difference between these two versions of the chase lies in the adopted naming scheme for the newly generated null values, which are used as witnesses for the existentially quantified variables occurring in the head of a TGD.

Definition 1 (Trigger and Trigger Application)
A trigger for a set Σ of TGDs on an instance I is a pair (σ, h), where σ ∈ Σ and h is a homomorphism from body(σ) to I. The oblivious result and semi-oblivious result of (σ, h), denoted o-result(σ, h) and so-result(σ, h), are the sets of atoms μ_o(head(σ)) and μ_so(head(σ)), respectively, where μ_o, μ_so : var(head(σ)) → C ∪ N are defined as

μ_o(x) = h(x) if x ∈ fr(σ), and μ_o(x) = ⊥^x_{σ,h} otherwise,
μ_so(x) = h(x) if x ∈ fr(σ), and μ_so(x) = ⊥^x_{σ,h|fr(σ)} otherwise,

where ⊥^x_{σ,h} and ⊥^x_{σ,h|fr(σ)} are nulls from N. We call the trigger (σ, h) obliviously active if o-result(σ, h) ⊈ I, and semi-obliviously active if so-result(σ, h) ⊈ I. The oblivious application of (σ, h) to I returns the instance J = I ∪ o-result(σ, h), and is denoted as I ⟨o, σ, h⟩ J. Analogously, the semi-oblivious application of (σ, h) to I returns the instance J = I ∪ so-result(σ, h), and is denoted as I ⟨so, σ, h⟩ J.
Observe that in the definition of ⋆-result(σ, h), where ⋆ ∈ {o, so}, each existentially quantified variable x occurring in head(σ) is mapped by μ_⋆ to a "fresh" null value of N whose name is uniquely determined by the trigger (σ, h) and x itself. Therefore, for a trigger (σ, h), we can unambiguously write down the set of atoms ⋆-result(σ, h). In our analysis, it is useful to be able to refer to the terms occurring in ⋆-result(σ, h) that have been propagated (not invented) during the application of (σ, h). Formally, the frontier of ⋆-result(σ, h), denoted fr(⋆-result(σ, h)), is the set of terms dom(h(body(σ))) ∩ dom(⋆-result(σ, h)).

(Semi-)Oblivious Chase
The main idea of the chase is, starting from a database D, to exhaustively apply active triggers for the given set Σ of TGDs on the instance constructed so far. We simultaneously define oblivious and semi-oblivious chase derivations. To this end, we distinguish the two cases where a derivation is finite or not:
-A finite sequence (I_i)_{0≤i≤n} of instances, with D = I_0 and n ≥ 0, is an oblivious (resp., semi-oblivious) chase derivation of D w.r.t. Σ if, for each i ∈ {0, . . . , n−1}, there exists an obliviously (resp., semi-obliviously) active trigger (σ, h) for Σ on I_i such that I_i ⟨o, σ, h⟩ I_{i+1} (resp., I_i ⟨so, σ, h⟩ I_{i+1}), and no obliviously (resp., semi-obliviously) active trigger for Σ on I_n exists.
-An infinite sequence (I_i)_{i≥0} of instances, with D = I_0, is an oblivious (resp., semi-oblivious) chase derivation of D w.r.t. Σ if, for each i ≥ 0, there exists an obliviously (resp., semi-obliviously) active trigger (σ, h) for Σ on I_i such that I_i ⟨o, σ, h⟩ I_{i+1} (resp., I_i ⟨so, σ, h⟩ I_{i+1}). Moreover, (I_i)_{i≥0} is fair if, for each i ≥ 0, and for every obliviously (resp., semi-obliviously) active trigger (σ, h) for Σ on I_i, there exists j > i such that (σ, h) is not an obliviously (resp., semi-obliviously) active trigger for Σ on I_j. The latter is known as the fairness condition, and guarantees that all the active triggers will eventually be deactivated.
A (semi-)oblivious chase derivation is valid if it is finite, or infinite and fair. Infinite but unfair chase derivations are not valid since they do not serve the main purpose of the chase procedure, i.e., to build an instance that satisfies the given TGDs. Henceforth, we write o-chase and so-chase for oblivious and semi-oblivious chase, respectively. In general, due to the adopted naming scheme and the definition of active triggers, the semi-oblivious chase builds smaller instances than the oblivious one. This is because a trigger that is semi-obliviously active is also obliviously active, but the other direction is not always true. This has been already illustrated by Example 2 in Section 1, which, for the sake of readability, we recall below. [...], which is, of course, finite. Indeed, if we apply again the TGD σ, then we will obtain the atom R(a, ⊥^z_{σ,{x↦a}}), which is already present.
Chase Relation A useful notion that we are going to use in our proofs is the so-called chase relation [13], which essentially describes how the atoms generated during the chase depend on each other. Consider a ⋆-chase derivation δ = (I_i)_{i≥0}, where ⋆ ∈ {o, so}, of a database D w.r.t. a set Σ of TGDs, and assume that for each i ≥ 0, I_i ⟨⋆, σ_i, h_i⟩ I_{i+1}, which means that I_{i+1} = I_i ∪ ⋆-result(σ_i, h_i). The chase relation of δ, denoted ≺_δ, is a binary relation over ⋃_{i≥0} I_i such that α ≺_δ β iff there exists i ≥ 0 such that α ∈ h_i(body(σ_i)) and β ∈ I_{i+1} \ I_i. Notice that the relation ≺_δ is acyclic, or, in other words, it forms a directed acyclic graph over ⋃_{i≥0} I_i.
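The effect of the two naming schemes can be observed with a small step-bounded chase. The sketch below is only an illustration under assumptions that are not taken from the paper: atoms are (predicate, argument-tuple) pairs, a TGD is a (body, head) pair of atom lists over variable names, nulls are encoded as tuples, triggers are picked in an arbitrary (not necessarily fair) order, and the names homomorphisms, result and chase are hypothetical. The TGD used in the usage note, R(x, y) → ∃z R(x, z), is likewise an assumed example.

```python
def homomorphisms(atoms, instance, h):
    """Yield every extension of h that maps all atoms into instance."""
    if not atoms:
        yield h
        return
    (pred, args), rest = atoms[0], atoms[1:]
    for fpred, fargs in instance:
        if fpred == pred and len(fargs) == len(args):
            g = dict(h)
            if all(g.setdefault(v, t) == t for v, t in zip(args, fargs)):
                yield from homomorphisms(rest, instance, g)


def result(idx, tgd, h, mode):
    """o-result (mode 'o') or so-result (mode 'so') of the trigger (tgd, h)."""
    body, head = tgd
    body_vars = {v for _, args in body for v in args}
    head_vars = {v for _, args in head for v in args}
    frontier = body_vars & head_vars
    # The null name depends on all of h (oblivious) or on h restricted
    # to the frontier (semi-oblivious).
    named_by = body_vars if mode == "o" else frontier
    key = tuple(sorted(((v, h[v]) for v in named_by), key=lambda p: p[0]))
    mu = {v: (h[v] if v in frontier else ("null", v, idx, key)) for v in head_vars}
    return {(pred, tuple(mu[v] for v in args)) for pred, args in head}


def chase(database, tgds, mode="so", max_steps=100):
    """Apply active triggers one at a time; return (instance, finished).
    finished is False when the step budget runs out."""
    instance = set(database)
    for _ in range(max_steps):
        snapshot = list(instance)
        new_atoms = next(
            (res
             for idx, tgd in enumerate(tgds)
             for h in homomorphisms(tgd[0], snapshot, {})
             for res in [result(idx, tgd, h, mode)]
             if not res <= instance),  # the trigger is active
            None,
        )
        if new_atoms is None:
            return instance, True
        instance |= new_atoms
    return instance, False
```

With D = {R(a, b)} and the single TGD R(x, y) → ∃z R(x, z), calling chase in mode "so" stops after one application, because every later trigger agrees on the frontier variable x and thus reproduces the same null. In mode "o" each new atom yields a trigger with a different h, so the step budget is exhausted.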

Chase Termination Problem
It is known that, due to the existentially quantified variables, a valid ⋆-chase derivation, where ⋆ ∈ {o, so}, may be infinite. This is true even for very simple settings: it is easy to verify that the only ⋆-chase derivation of D = {R(a, b)} w.r.t. the set Σ consisting of the single TGD R(x, y) → ∃z R(y, z) is infinite. The question that comes up is, given a set Σ of TGDs, can we check whether, for every database D, all or some valid (semi-)oblivious chase derivations of D w.r.t. Σ are finite? Before formalizing the above problem, let us recall two central classes of sets of TGDs:

CT^⋆_∀∀ = {Σ | for every database D, every valid ⋆-chase derivation of D w.r.t. Σ is finite}
CT^⋆_∀∃ = {Σ | for every database D, there exists a finite ⋆-chase derivation of D w.r.t. Σ}

The problems tackled in this work are as follows, where C is a class of sets of TGDs: given a set Σ ∈ C, decide whether Σ ∈ CT^⋆_∀∀ (the problem CT^⋆_∀∀(C)), and whether Σ ∈ CT^⋆_∀∃ (the problem CT^⋆_∀∃(C)). It is well-known from [20] that, for each ⋆ ∈ {o, so}, CT^⋆_∀∀ = CT^⋆_∀∃. This immediately implies that, after fixing the version of the chase in question, i.e., oblivious or semi-oblivious, the above decision problems are equivalent. Henceforth, for a class C of sets of TGDs, we simply refer to the problem CT^⋆_∀(C), and we write CT^⋆_∀ for the classes CT^⋆_∀∀ and CT^⋆_∀∃, where ⋆ ∈ {o, so}.
We know that our main decision problems are, in general, undecidable. Assuming that TGD denotes the class of arbitrary sets of TGDs, we have that CT^⋆_∀(TGD), for ⋆ ∈ {o, so}, is undecidable [18], even if we focus on binary and ternary predicates.
However, the set of TGDs employed in the undecidability proof of [18] is not sticky. What about CT^⋆_∀(S) then? This is a non-trivial problem, and pinpointing its complexity is the main goal of this work.

Some Useful Results
Before proceeding with the complexity analysis of CT^⋆_∀(S), let us recall a couple of technical results that allow us to significantly simplify our investigation.
Critical Database It would be useful to have a special database of a very simple form that gives rise to a valid infinite chase derivation whenever there is a database with the same property. Interestingly, such a database exists, which is known as the critical database for a set Σ of TGDs [28]. Formally, the critical database for Σ is

cr(Σ) = {R(∗, . . . , ∗) | R ∈ sch(Σ)},

where ∗ ∈ C is a fixed constant. In other words, cr(Σ) consists of all the atoms that can be formed using the predicates of sch(Σ) and the constant ∗. The following result states that cr(Σ) is indeed the desired database:

Proposition 1 Consider a set Σ of TGDs. For ⋆ ∈ {o, so}, the following are equivalent:
1. Σ ∈ CT^⋆_∀.
2. There is no valid infinite ⋆-chase derivation of cr(Σ) w.r.t. Σ.

Henceforth, we always use ∗ for the constant that appears in a critical database. Notice that this special constant does not depend on the set of TGDs in question.
Fairness As one might expect, we are going to focus on the complement of CT^⋆_∀(S), for ⋆ ∈ {o, so}, and pinpoint the complexity of the following problem: given a set Σ ∈ S, is there a valid infinite ⋆-chase derivation δ of cr(Σ) w.r.t. Σ? (See Proposition 1.) However, as observed in [8], where the same problem is studied but for the class of guarded TGDs, one of the main difficulties is to ensure that δ enjoys the fairness condition. Interestingly, as shown in [8], we can completely neglect the fairness condition since the existence of a (possibly unfair) infinite ⋆-chase derivation of some database w.r.t. Σ implies the existence of a fair one.
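Constructing the critical database is straightforward once the schema is known. The following minimal sketch assumes, for illustration only, that a TGD is a (body, head) pair of atom lists with atoms encoded as (predicate, argument-tuple) pairs, and that the fixed constant is represented by the string "*"; the names schema_of and critical_database are hypothetical.

```python
def schema_of(tgds):
    """Collect the predicates occurring in the TGDs together with their arities."""
    schema = {}
    for body, head in tgds:
        for pred, args in body + head:
            schema[pred] = len(args)
    return schema


def critical_database(tgds, star="*"):
    """One atom per predicate of the schema, with the fixed constant at every position."""
    return {(pred, (star,) * arity) for pred, arity in schema_of(tgds).items()}
```

Since there is a single constant, the critical database contains exactly one atom per predicate of the schema.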

Proposition 2 ([8])
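Consider a database D and a set Σ of TGDs. For ⋆ ∈ {o, so}, the following are equivalent:
1. There exists a valid infinite ⋆-chase derivation of D w.r.t. Σ.
2. There exists an infinite ⋆-chase derivation of D w.r.t. Σ.

By combining Propositions 1 and 2, we obtain the following useful corollary:

Corollary 1 Consider a set Σ of TGDs. For ⋆ ∈ {o, so}, the following are equivalent:
1. Σ ∉ CT^⋆_∀.
2. There exists an infinite ⋆-chase derivation of cr(Σ) w.r.t. Σ.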

Our Main Result and Plan of Attack
As discussed above, the main goal of this work is to pinpoint the complexity of chase termination under sticky sets of TGDs, focussing on the oblivious and semi-oblivious versions of the chase procedure. Our main result follows: CT^o_∀(S) and CT^so_∀(S) are PSPACE-complete, and NLOGSPACE-complete for predicates of bounded arity.
Consider a set Σ ∈ S. By Corollary 1, our main challenge is to show that the problem of deciding whether there exists an infinite ⋆-chase derivation of cr(Σ) w.r.t. Σ, where ⋆ ∈ {o, so}, is PSPACE-complete, and NLOGSPACE-complete for predicates of bounded arity. In fact, the bulk of our work concentrates on establishing the desired upper bounds for the semi-oblivious chase, i.e., when ⋆ = so. We can then easily obtain the same upper bounds for the oblivious chase by exploiting a simple reduction from CT^o_∀(S) to CT^so_∀(S). Our plan of attack follows:
-The upper bounds heavily rely on the following semantic characterization given in Section 4: there exists an infinite so-chase derivation of cr(Σ) w.r.t. Σ iff there exists a "path-like" infinite so-chase derivation of cr(Σ) w.r.t. Σ.
-The above semantic characterization allows us to provide, in Section 5, a syntactic graph-based characterization of the existence of an infinite so-chase derivation of cr(Σ) w.r.t. Σ. Actually, the latter coincides with the existence of a certain "bad" cycle in the dependency graph of a "linearized" version of Σ.
-We show, in Section 6, that checking whether a "bad" cycle exists in the dependency graph of the "linearized" version of Σ is in PSPACE, and in NLOGSPACE for predicates of bounded arity. This shows the desired upper bounds for CT^so_∀(S). We then explain how the upper bounds for CT^so_∀(S) can be transferred to CT^o_∀(S) by exploiting a simple construction known as enrichment [20]. We finally provide matching lower bounds for CT^o_∀(S) and CT^so_∀(S). The PSPACE-hardness is obtained by simulating a deterministic polynomial space Turing machine, while the NLOGSPACE-hardness is obtained by a reduction from graph reachability.
At this point, one may wonder whether a powerful termination criterion from the literature allows us to characterize the fragment of sticky sets of TGDs that guarantees the termination of the semi-oblivious chase, which in turn would lead to the desired complexity results. To the best of our knowledge, such a criterion does not exist. Interestingly, model-faithful acyclicity, one of the largest classes of TGDs known today that ensure the termination of the semi-oblivious chase [14], is not powerful enough for characterizing the class S ∩ CT^so_∀. For example, the sticky set of TGDs [...] belongs to CT^so_∀, but it violates the model-faithful acyclicity condition.

Semantic Characterization of Semi-Oblivious Chase Non-Termination
We proceed to characterize the non-termination of the so-chase under sticky sets of TGDs. For a set Σ ∈ S, our goal is to show that, if there is an infinite so-chase derivation of cr(Σ) w.r.t. Σ, then we can isolate a "path-like" infinite so-chase derivation δ', which we call linear. Roughly speaking, linearity means that there exists an infinite simple path α_0, α_1, α_2, . . . in the chase relation of δ' such that α_0 ∈ cr(Σ) and α_i is constructed during the i-th trigger application, while all the atoms that are needed to construct this path, and are not already on the path, are atoms of cr(Σ). [...] (I_i)_{i≥0} such that the following hold: [...] We are now ready to present our main characterization of non-termination of the semi-oblivious chase under sticky sets of TGDs.
Theorem 3 Consider a set Σ ∈ S. The following are equivalent:
1. There exists an infinite so-chase derivation of cr(Σ) w.r.t. Σ.
2. There exists a linear infinite so-chase derivation of cr(Σ) w.r.t. Σ.

It is clear that (2) ⇒ (1) holds trivially. The non-trivial direction is (1) ⇒ (2), which is established in two main steps:
1. We show that the chase relation of an infinite so-chase derivation δ of cr(Σ) w.r.t. Σ always contains a special path, called continuous, rooted at an atom of cr(Σ), which essentially guarantees the continuous propagation of a new null. Note that the existence of such a special path does not rely on stickiness.
2. By exploiting the existence of a continuous path, we construct a linear infinite so-chase derivation of cr(Σ) w.r.t. Σ. In fact, due to stickiness, we can convert an infinite suffix P of the continuous path in ≺_δ, together with all the atoms that are needed to generate the atoms on P via a single trigger application, into a linear infinite so-chase derivation δ' of cr(Σ) w.r.t. Σ. As we shall see, stickiness helps us to ensure that δ' is linear, while continuity allows us to show that δ' is infinite.
In the rest of this section, we fix a set Σ ∈ S. For technical clarity, we assume that all the TGDs of Σ have a non-empty frontier, i.e., for every TGD σ ∈ Σ, there exists at least one variable that appears both in body(σ) and head(σ). Furthermore, we assume that Σ is in normal form, i.e., each TGD of Σ has only one atom in its head. The normalization procedure, which preserves stickiness, is rather standard and can be found, e.g., in [13]. The above simplifying assumptions do not affect the generality of our proof. In other words, assuming that Σ is what we obtain after removing from Σ̂ ∈ S all the TGDs with an empty frontier and then normalizing it, we can easily show that there exists a linear infinite so-chase derivation of cr(Σ̂) w.r.t. Σ̂ iff there exists a linear infinite so-chase derivation of cr(Σ) w.r.t. Σ.

Existence of a Continuous Path
Let us first formalize the notion of path in the chase relation of a derivation. Given an infinite so-chase derivation δ = (I_i)_{i≥0} of cr(Σ) w.r.t. Σ, a finite δ-path is a finite sequence of atoms (α_i)_{0≤i≤n} such that α_0 ∈ I_0 and α_i ≺_δ α_{i+1} for each i ∈ {0, . . . , n−1}. Analogously, we can define the notion of infinite δ-path, which refers to an infinite sequence of atoms rooted at an atom of I_0. The intention underlying continuity is to ensure the continuous propagation of a new null on a path. Roughly, a δ-path (α_i)_{i≥0} is continuous via a sequence of indices (ℓ_i)_{i≥0}, with ℓ_0 = 1, if for each i ≥ 0, a new null is invented in α_{ℓ_i} that is "necessarily propagated" up to the atom α_{ℓ_{i+1}}. At this point, it is crucial to formalize what we mean by "necessarily propagated".
Let α, β be atoms of ⋃_{i≥0} I_i, and assume that β ∈ I_j \ I_{j−1}, for some j > 0, with I_{j−1} ⟨so, σ, h⟩ I_j, i.e., β = so-result(σ, h). We say that the i-th position of α and the j-th position of β are related (w.r.t. δ) [...], with R_0 and R_n being the predicates of α_0 and α_n, respectively, if there exists a sequence of integers (ℓ_k)_{1≤k≤n−1} such that [...]. In simple words, the term occurring at position (α_0, s) is necessarily propagated up to the position (α_n, t) via the intermediate positions (α_1, ℓ_1), . . . , (α_{n−1}, ℓ_{n−1}).
For introducing continuity, we also need the notion of the birth atom of a null value. Consider a null ⊥ occurring in ⋃_{i≥0} I_i. The birth atom (w.r.t. δ) of ⊥, denoted birth_δ(⊥), is the atom in which ⊥ is invented, i.e., the first atom generated along δ that contains ⊥. It is clear that the atom birth_δ(⊥) is unique (since we consider TGDs in normal form). We are now ready to introduce continuity.
A simple example that illustrates the notion of continuity follows: [...] It is easy to verify that there exists an infinite so-chase derivation δ of cr(Σ) w.r.t. Σ such that the following is part of ≺_δ; a black solid edge from α to β labeled by σ means that (α, β) belongs to ≺_δ due to a trigger that involves the TGD σ:

[Figure: part of the chase relation ≺_δ]

It can be verified that the path with P-atoms is a continuous δ-path. Let us explain the reason. The first atom in which a null is born is P(c, ⊥_1, ⊥_2), with ⊥_1, ⊥_2 being the new nulls, and continuity is satisfied since ⊥_2 is propagated (this is indicated via the red dashed arrows) to the next atom where the new null ⊥_3 is born. Now, since the null ⊥_3 is propagated up to the next birth atom P(⊥_3, ⊥_1, ⊥_6), continuity is satisfied. In the rest of the path the same pattern is repeated, and thus continuity is globally satisfied. In fact, the pattern that we can extract is graphically illustrated as

[Figure: the repeating propagation pattern]

where the continuous propagation of a new null (red dashed arrows) can be seen.
We are now ready to establish our main technical result concerning continuity. Note that the next result holds for arbitrary (not necessarily sticky) sets of TGDs.

Lemma 1 For every infinite so-chase derivation δ of cr(Σ) w.r.t. Σ, there exists a continuous infinite δ-path.

Proof Assume that δ = (I_i)_{i≥0} is an infinite so-chase derivation of cr(Σ) w.r.t. Σ, and let I = ⋃_{i≥0} I_i. Given a term t ∈ dom(I), which is either the constant ∗ or a null, and a null ⊥ ∈ dom(I), we write t ⇝_δ ⊥ if t occurs in fr(birth_δ(⊥)). Now, for each t ∈ dom(I), we inductively define the rank (w.r.t. δ) of t as follows: [...]
For a null ⊥ ∈ dom(I), we select a term t ∈ dom(I) such that rank(t) = rank(⊥) − 1 and t ⇝_δ ⊥ (i.e., t occurs in fr(birth_δ(⊥))), and we write t →_δ ⊥. Since, by assumption, all the TGDs of Σ have a non-empty frontier, it is easy to see that the binary relation →_δ over dom(I) forms an infinite rooted tree T_δ, where the root is the constant ∗ occurring in cr(Σ). The key property of T_δ is given by the following result:

Lemma 2 The nodes of T_δ have finite out-degree.

Proof We can show that, for each i ≥ 0, the set {t ∈ dom(I) | rank(t) = i} is finite. This can be shown by induction on i ≥ 0, where the key fact is that only finitely many semi-obliviously active triggers can be formed due to which a null with rank i + 1 is generated (since, by induction hypothesis, the set of terms with rank at most i is finite). Therefore, the nodes of T_δ have finite out-degree.
Having Lemma 2 in place, we can apply König's Lemma on T_δ, and get that T_δ contains an infinite simple path starting from the root node. Let ⊥_0, ⊥_1, ⊥_2, . . . be such a path, where ⊥_0 = ∗. By construction, for each i ≥ 0, ⊥_i ∈ fr(birth_δ(⊥_{i+1})). Moreover, there is a sequence of atoms [...] is a continuous δ-path, and the claim follows.

From Arbitrary to Linear Infinite Derivations
We can now show that an infinite so-chase derivation δ = (I_i)_{i≥0} of cr(Σ) w.r.t. Σ can always be converted into a linear infinite so-chase derivation δ', which in turn establishes the non-trivial direction (1) ⇒ (2) of Theorem 3. The construction proceeds in three main steps, which are described below. We first describe those steps in a semi-formal way, and exploit Example 4 in order to illustrate the key ideas. We then proceed to give the formal construction.

Semi-Formal Description
Useful part of δ. We first isolate a useful part of the so-chase derivation δ = (I_i)_{i≥0}. By Lemma 1, there exists a continuous infinite δ-path P = (α_i)_{i≥0}. Recall that continuity guarantees the continuous propagation on P of infinitely many nulls, which we call, for the purpose of this discussion, pivotal. By stickiness, there exists j ≥ 0 such that α_j is the last atom on P in which a term t becomes sticky. By saying that the term t becomes sticky, we mean that the first time t participates in a join is during the trigger application that generates α_j, and thus t occurs in (or sticks to) every atom of {α_i}_{i≥j}. Let k ≥ j be the integer such that α_k is the first atom on P after α_j in which a new pivotal null is invented. The useful part of δ that we are going to focus on is the infinite sequence of atoms (α_i)_{i≥k}, which we call the backbone, and the atoms of ⋃_{i≥0} I_i, which we call side atoms, that are needed to generate the atoms on the backbone via a single trigger application. In other words, for a backbone atom α, if α is obtained via the trigger (σ, h) for Σ on instance I_i, for some i ≥ 0, then the atoms h(body(σ)), excluding the backbone atoms, are side atoms.
Example 5 Consider again the set Σ ∈ S from Example 4. As discussed above, there exists an infinite so-chase derivation δ of cr(Σ) w.r.t. Σ such that a continuous infinite δ-path exists (see the figures above). The useful part of δ is shown in the corresponding figure (not reproduced here). Observe that the last atom on the continuous path in which a term becomes sticky is P(⊥_2, ⊥_1, ⊥_3); in fact, the sticky term is ⊥_1, which is the only sticky term on the continuous path. It happens that P(⊥_2, ⊥_1, ⊥_3) also invents a new pivotal null, that is, ⊥_3, and therefore the suffix of the continuous path that starts at P(⊥_2, ⊥_1, ⊥_3) is the backbone. It is now easy to verify that all the other atoms, apart from S( ), indeed contribute to the generation of a backbone atom via a single trigger application.

Renaming step
We proceed to rename some of the nulls that occur in backbone atoms or side atoms. In particular, for every null ⊥ occurring in a side atom α, we apply the following renaming steps, where the replacement term is always the constant occurring in cr(Σ): (i) every occurrence of ⊥ in α is replaced by this constant, and (ii) every occurrence of ⊥ in a backbone atom β that is propagated from α to β is replaced by this constant. For a backbone or side atom α, let ρ(α) be the atom obtained from α after globally applying the above renaming steps. We now define the sequence of instances δ′ = (J_i)_{i≥0} as follows: which is a subset of cr(Σ), and where H is the set of atoms that are generated together with α_{k+i−1}, after renaming the propagated nulls that do not occur in ρ(α_{k+i−1}) to the constant of cr(Σ). It is crucial to observe that a new null generated in a backbone atom never participates in a join. This is because the first backbone atom α_k comes after the atom α_j, which is the last atom on P in which a term becomes sticky. This fact allows us to modify triggers from δ in order to construct, for every i ≥ 0, a trigger (σ_i, h_i) such that J_i⟨so, σ_i, h_i⟩J_{i+1}.

Example 6
We consider again our running example. Before renaming the nulls that appear in side atoms, we first need to understand how nulls are propagated from side atoms to backbone atoms during the chase. This is depicted in the following figure each instance occurs in δ finitely many times. Therefore, after the pruning step, the obtained so-chase derivation is infinite. Thus, δ is a linear infinite so-chase derivation of J 0 w.r.t. Σ. Since J 0 ⊆ cr(Σ ), we can construct a linear infinite so-chase derivation δ of cr(Σ ) w.r.t. Σ by adding to J 0 the set of atoms cr(Σ ) \ J 0 .
Example 7 Coming back to our running example, it can be seen that the sequence of instances devised in Example 6 is not the desired linear derivation due to non-active triggers, which implies that there are consecutive occurrences of the same instance, e.g., J_1, J_2, J_3. However, after applying the pruning step, we get the sequence . . . Now, it is easy to verify that after adding the atoms S( ) and Q( ) to J_0, we get a linear infinite so-chase derivation of cr(Σ) w.r.t. Σ.
Given an atom α i = so-result(σ, h) for some i > 0, we say that a term t ∈ dom(I) becomes sticky in α i if i is the smallest integer such that there is a variable x ∈ fr(σ ) occurring more than once in body(σ ) and t = h(x). By stickiness of , and from the fact that the arity of each predicate in is finite, there exists some j ≥ 0 such that, for every i ≥ j , no term t ∈ dom(I) becomes sticky in α i . Let k ≥ 0 be the smallest integer such that no term becomes sticky in α k . Note that this integer exists since the sequence of indices ( i ) i≥0 is infinite. We call the sequence of atoms B = (α i ) i≥ k the backbone of δ; let B = {α i } i≥ k . Furthermore, we define the set exists, then the term at position (β, j ) is renamed to the constant . Furthermore, for each i ≥ 0, assuming that β i = so-result(σ, h), let η i = (σ, h ) be the trigger obtained from (σ, h) after updating h to h as dictated by the renaming step applied to β i . Assuming B = (β i ) i≥0 , we define δ = (J i ) i≥0 such that J 0 = cr(Σ ), and for -no term becomes sticky in some atom of B, and -for each null ⊥ such that β = birth δ (⊥) ∈ B, which means that ⊥ has been generated on B, ⊥ is not renamed on B , we get that, for i ≥ 0, η i is a trigger for Σ on J i , and thus, J i+1 = J i ∪ so-result(η i ).
The sequence δ′ is not yet a so-chase derivation since some of the involved triggers might not be semi-obliviously active. In other words, there might exist i ≥ 0 such that J_i⟨so, σ_i, h_i⟩J_{i+1}, but so-result(σ_i, h_i) ∈ J_i, which implies that J_i = J_{i+1}. Let δ″ be the sequence of instances obtained from δ′ where only one occurrence of each instance is kept. Clearly, δ″ is still locally consistent. Moreover, by exploiting continuity, we can show that, for each i ≥ 0, J_i occurs in δ′ finitely many times, which in turn implies that δ″ is a linear infinite so-chase derivation of cr(Σ) w.r.t. Σ, as needed.

Graph-Based Characterization of Semi-Oblivious Chase Termination
In this section, we characterize the termination of the semi-oblivious chase for sticky sets of TGDs via a graph-based condition. More precisely, we show that a set Σ ∈ S belongs to CT^so_∀ iff a linearized version of it, i.e., a set of linear TGDs obtained from Σ, enjoys a condition inspired by weak-acyclicity [17], called critical-weak-acyclicity, introduced in [8]. Recall that linear TGDs are TGDs with only one body atom [12]; we write L for the class of linear TGDs. The proof of the above result boils down to showing that the given sticky set Σ of TGDs can be rewritten into a set of linear TGDs such that the rewriting preserves the termination of the semi-oblivious chase. The latter heavily relies on Theorem 3, which establishes that non-termination of the semi-oblivious chase coincides with the existence of a linear infinite chase derivation of cr(Σ) w.r.t. Σ. We can then apply the characterization of semi-oblivious chase termination for linear TGDs from [8], which states that a set Σ ∈ L belongs to CT^so_∀ iff it is critically-weakly-acyclic. Let us first recall the notion of critical-weak-acyclicity for linear TGDs.

Critically-Weakly-Acyclic Linear TGDs
Since critical-weak-acyclicity is inspired by weak-acyclicity, it is not surprising that it relies on the dependency graph of a set of TGDs introduced in [17], which encodes how terms might be propagated during the chase. We assume a fixed ordering on the head-atoms of TGDs. For a TGD σ with head(σ) = α_1, . . . , α_k, we write (σ, i) for the TGD that has only one atom in its head, called single-head, obtained from σ by keeping only the atom α_i, and the existentially quantified variables in α_i. Recall that pos(α, x) is the set of positions in α at which x occurs, while pos(body(σ), x) is the set of positions at which x occurs in the body of σ. We also write pos(sch(Σ)) for the set of positions of sch(Σ), i.e., the set {(R, i) | R/n ∈ sch(Σ) and i ∈ [n]}.
Intuitively speaking, a normal edge (π, π′) encodes the fact that a term may propagate from π to π′ during the chase. Moreover, a special edge (π, π″) keeps track of the fact that the propagation of a term from π to π′ also creates a new null at position π″. A simple example that illustrates the notion of the dependency graph follows: The graph dg(Σ) is as follows where the dashed arrows represent special edges. The normal edges occur due to the variable x, while the special edges due to the existentially quantified variable z.
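The intuition about normal and special edges can be made concrete with a minimal sketch. It assumes the usual definition of the dependency graph from the data exchange literature (the formal definition is not reproduced in this section), ignores the edge labels λ, and uses a hypothetical TGD R(x, y) → ∃z S(x, z) that is not taken from the paper. The encoding of atoms as (predicate, arguments) pairs and the helper names are illustrative assumptions.

```python
def positions(atoms, var):
    """Positions (predicate, 1-based index) at which `var` occurs in `atoms`."""
    return {(pred, i + 1)
            for pred, args in atoms
            for i, a in enumerate(args) if a == var}


def dependency_graph(tgds):
    """Return (normal_edges, special_edges) for TGDs given as
    (body_atoms, head_atoms, existential_vars) triples. Labels are omitted."""
    normal, special = set(), set()
    for body, head, existentials in tgds:
        body_vars = {a for _, args in body for a in args}
        head_vars = {a for _, args in head for a in args}
        for x in body_vars & head_vars:          # frontier variables
            for src in positions(body, x):
                # x may propagate from src to each head position of x
                for dst in positions(head, x):
                    normal.add((src, dst))
                # the same propagation also creates a null at each
                # head position of an existential variable
                for z in existentials:
                    for dst in positions(head, z):
                        special.add((src, dst))
    return normal, special


# Hypothetical TGD (an assumption for illustration): R(x, y) -> exists z. S(x, z)
sigma = ([("R", ("x", "y"))], [("S", ("x", "z"))], {"z"})
normal_edges, special_edges = dependency_graph([sigma])
```

In this toy case the only frontier variable is x, so position (R, 2) has no outgoing edges.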
The class of weakly-acyclic sets of TGDs is a well-known formalism, introduced in the context of data exchange, that guarantees the termination of the semi-oblivious chase [17]. Formally, a set Σ is weakly-acyclic if there is no cycle in dg(Σ) that contains a special edge. It would be extremely useful if, whenever we focus on linear TGDs, weak-acyclicity were also a necessary condition for the termination of the semi-oblivious chase. Unfortunately, this is not the case. A simple counterexample follows:
Example 9 Consider the set Σ of linear TGDs consisting of R(x, x) → ∃z R(z, x). In dg(Σ) there is a cycle that contains a special edge. However, there exists only one so-chase derivation of cr(Σ) w.r.t. Σ, which is finite, and thus, Σ ∈ CT^so_∀.
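The behaviour claimed in Example 9 can be checked with a small simulation. The following is a minimal sketch of the semi-oblivious chase restricted to single-head linear TGDs, assuming that a trigger is identified by the TGD and the restriction of the homomorphism to the frontier, and that nulls are named accordingly. The constant name "c", the step budget, and the contrasting TGD R(x, y) → ∃z R(y, z) are assumptions made for illustration; exceeding the budget is only evidence of non-termination, not a proof.

```python
def match(body_args, fact_args):
    """Homomorphism from a body atom's arguments to a fact's arguments, or None."""
    h = {}
    for v, t in zip(body_args, fact_args):
        if h.setdefault(v, t) != t:
            return None
    return h


def so_chase_linear(tgds, instance, max_steps=1000):
    """Semi-oblivious chase for single-head linear TGDs given as
    (body_atom, head_atom) pairs. Returns (instance, terminated)."""
    instance = set(instance)
    applied = set()          # (tgd index, frontier assignment) already fired
    steps, changed = 0, True
    while changed:
        changed = False
        for idx, ((bp, bargs), (hp, hargs)) in enumerate(tgds):
            frontier = [v for v in dict.fromkeys(bargs) if v in hargs]
            for fp, fargs in list(instance):
                if fp != bp or len(fargs) != len(bargs):
                    continue
                h = match(bargs, fargs)
                if h is None:
                    continue
                key = (idx, tuple(h[v] for v in frontier))
                if key in applied:
                    continue
                applied.add(key)
                steps += 1
                if steps > max_steps:
                    return instance, False
                # existential variables get a null named by the trigger key
                new_args = tuple(h[v] if v in h else ("null", idx, v, key[1])
                                 for v in hargs)
                instance.add((hp, new_args))
                changed = True
    return instance, True


critical = {("R", ("c", "c"))}                      # cr(Σ) with stand-in constant "c"
example9 = [(("R", ("x", "x")), ("R", ("z", "x")))]  # R(x, x) -> exists z. R(z, x)
result9, terminated9 = so_chase_linear(example9, critical)

contrast = [(("R", ("x", "y")), ("R", ("y", "z")))]  # hypothetical: R(x, y) -> exists z. R(y, z)
_, terminated_contrast = so_chase_linear(contrast, critical, max_steps=50)
```

For Example 9, the single generated atom R(⊥, c) no longer matches the body R(x, x), so the chase stops after one step.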
Interestingly, as it has been shown in [8], there is an extension of weak-acyclicity, called critical-weak-acyclicity, that, whenever we focus on linear TGDs, provides a necessary and sufficient condition for the termination of the semi-oblivious chase. A key notion underlying critical-weak-acyclicity is the notion of compatibility among two single-head linear TGDs. Intuitively, if a single-head linear TGD σ 1 is compatible with a single-head linear TGD σ 2 , then the atom obtained during the chase by applying σ 1 may trigger σ 2 . To formally define the notion of compatibility, we first need to recall the standard notion of unification among atoms.
We say that two atoms α and β (containing only variables from V) unify if there exists a substitution γ from the variables occurring in α and β to V, called unifier for α and β, such that γ(α) = γ(β). A most general unifier (MGU) for α and β is a unifier for α and β, denoted mgu(α, β), such that, for each other unifier γ for α and β, there exists a substitution γ′ such that γ = γ′ ∘ mgu(α, β). It is well-known that if two atoms α and β unify, then they have an MGU that is unique (modulo variable renaming), and thus, we can refer to the MGU for α and β [1]. For brevity, given a TGD σ and a variable x, let Π^σ_x = pos(body(σ), x). Let var(α, Π), where α is an atom and Π a set of positions, be the set of variables in α at positions of Π.
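Since the atoms considered here contain only variables, unification reduces to merging variables that share a position. The following is a minimal sketch, assuming atoms are encoded as (predicate, arguments) pairs; the function names and the union-find representation are illustrative choices, not the paper's.

```python
def mgu(atom_a, atom_b):
    """Most general unifier of two variable-only atoms, as a dict mapping each
    variable to a class representative; None if the atoms do not unify."""
    (pred_a, args_a), (pred_b, args_b) = atom_a, atom_b
    if pred_a != pred_b or len(args_a) != len(args_b):
        return None
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    # variables at the same position must be identified
    for x, y in zip(args_a, args_b):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx
    return {v: find(v) for v in list(parent)}


def apply_subst(subst, atom):
    """Apply a substitution to an atom."""
    pred, args = atom
    return (pred, tuple(subst.get(a, a) for a in args))


alpha = ("R", ("x", "x"))
beta = ("R", ("y", "z"))
gamma = mgu(alpha, beta)
no_unifier = mgu(("R", ("x", "y")), ("S", ("x", "y")))
```

With variables only, two atoms unify exactly when they have the same predicate and arity, which is why a predicate check suffices in this setting.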
Having the notion of compatibility among two single-head linear TGDs in place, we can recall the notion of resolvent of a sequence σ 1 , . . . , σ n of single-head linear TGDs, which is in turn a single-head TGD. Roughly, such a resolvent mimics the behavior of the sequence σ 1 , . . . , σ n during the chase. Notice that the existence of such a resolvent is not guaranteed, but if it exists, this implies that we may have a sequence of trigger applications that involve the TGDs σ 1 , . . . , σ n in this order. In such a case, we call the sequence σ 1 , . . . , σ n active.
At this point, one may think that the right extension of weak-acyclicity, which will provide a necessary condition for the termination of the semi-oblivious chase under linear TGDs, is to allow cycles with special edges in the underlying dependency graph as long as the corresponding sequence of single-head TGDs, which can be extracted from the edge labels, is not active. However, as thoroughly discussed in [8], this is not enough. If a cycle with a special edge is labeled with an active sequence, then we can conclude that it will be traversed at least once during the chase. Nevertheless, it is not guaranteed that it will be traversed infinitely many times. A cycle that is labeled with an active sequence σ 1 , . . . , σ n , and contains a special edge, will be certainly traversed infinitely many times if the resolvent of the sequence ρ, . . . , ρ of length k, where ρ = [σ 1 , . . . , σ n ], exists, for every k > 0. Interestingly, for ensuring the latter condition, it suffices to consider sequences of length at most (ω + 1), where ω is the arity of the predicate of body(σ 1 ). This brings us to critical sequences. For brevity, we write σ k for the sequence σ, . . . , σ of length k.
We can now recall critical-weak-acyclicity as defined in [8]. It is essentially weak-acyclicity, with the key difference that a cycle in the underlying graph is "dangerous" only if it contains a special edge and is also labeled with a critical sequence of single-head linear TGDs. The formal definition follows.
Definition 8 (Critical-Weak-Acyclicity) Consider a set Σ ∈ L of TGDs, and let dg(Σ) = (N, E, λ). A cycle v_0, v_1, . . . , v_n, v_0 in dg(Σ) is critical if the sequence λ(v_0, v_1), λ(v_1, v_2), . . . , λ(v_n, v_0) of single-head linear TGDs is critical. We call Σ critically-weakly-acyclic if no critical cycle in dg(Σ) contains a special edge.
The essence of critical-weak-acyclicity is revealed by the following result:
Theorem 4 Consider a set Σ ∈ L of TGDs. The following are equivalent:
1. Σ ∈ CT^so_∀.
2. Σ is critically-weakly-acyclic.

From Sticky to Linear TGDs
Before presenting the linearization procedure, we need to introduce some auxiliary notions. Given a tuplet = (t 1 , . . . , t n ) ∈ (V ∪ { }) n , we write shape(t) for the tuple obtained fromt by replacing each variable of V with the special symbol * . For example, shape((x, y, , z, x, )) = ( * , * , , * , * , ). We also write svar(t) for the tuple obtained fromt by removing all the occurrences of the constant . For example, svar ((x, y, , z, x, )) = (x, y, z, x). For an atom α = R(t), let -free(α) = R shape(t) (svar(t)), where R shape(t) is of arity |svar(t)|. In fact, -free(α) is the constant-free version of α, while the subscript shape(t) keeps track of the original shape of α, i.e., where each occurrence of occurs in α. Notice that, having the constant-free version of an atom α in place, we can unambiguously write down α. For a set of atoms A, let Given a TGD σ and an atom α ∈ body(σ ), let that is, the set of body variables of σ that do not occur only in α. Now, given a TGD σ , and an atom α ∈ body(σ ), let i.e., for each subset T of var(body(σ )) that contains all the variables of V α,σ , M α,σ contains a mapping that maps each variable of T to the special constant .
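The shape and svar operations, and the claim that the constant-free version of an atom determines the original atom, can be illustrated with a minimal sketch. The string "c" is a stand-in for the special constant (the actual symbol is an assumption here), "*" plays the role of the special symbol, and the encoding of the predicate R_shape as a (name, shape) pair is an illustrative choice. The sketch assumes no variable is itself named "c".

```python
CONST = "c"   # stand-in for the special constant (assumed name)
STAR = "*"    # the special symbol replacing variables in a shape


def shape(t):
    """Replace every variable of the tuple by STAR, keeping the constant."""
    return tuple(CONST if a == CONST else STAR for a in t)


def svar(t):
    """Drop every occurrence of the constant from the tuple."""
    return tuple(a for a in t if a != CONST)


def const_free(atom):
    """Constant-free version of an atom: the predicate is annotated with the shape."""
    pred, args = atom
    return ((pred, shape(args)), svar(args))


def restore(cf_atom):
    """Recover the original atom from its constant-free version."""
    (pred, shp), variables = cf_atom
    it = iter(variables)
    return (pred, tuple(CONST if s == CONST else next(it) for s in shp))


t = ("x", "y", CONST, "z", "x", CONST)
atom = ("R", t)
```

Here shape(t) and svar(t) reproduce the two examples given in the text, and restore inverts const_free.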
We are now ready to introduce the linearization of a set of TGDs. Note that the following definition talks about arbitrary TGDs. The notion of stickiness is used later for showing that the termination of the semi-oblivious chase is preserved.
The linearization procedure converts a TGD σ into a set of (constant-free) linear TGDs, where the body atom of each such linear TGD corresponds to an atom α of body(σ ), while the variables in body(σ ) \ {α}, and possibly additional variables that occur only in α, are instantiated with the special constant . An example follows: Example 10 Consider the TGD σ = P (x, y, z)

Theory of Computing Systems
We have that the set V α 1 ,σ consists of all the variables in body(σ ) \ {α 1 }, while V α 2 ,σ of all the variables in body(σ ) \ {α 2 }, that is, There are four sets of variables in body(σ ) that are supersets of V α 1 ,σ : Moreover, there are two sets of variables in body(σ ) that are supersets of V α 2 ,σ : Therefore due to the set of mappings M α 2 ,σ .
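The way the sets of mappings are enumerated can be sketched as follows, assuming the description given earlier: V_{α,σ} collects the body variables occurring in some body atom other than α, and M_{α,σ} contains one mapping for each set T of body variables with V_{α,σ} ⊆ T. The body P(x, y, z), S(y, w) used below is a hypothetical one, chosen only so that the counts agree with the "four" and "two" mentioned in the example; the constant name "c" is likewise a stand-in.

```python
from itertools import combinations

CONST = "c"   # stand-in for the special constant (assumed name)


def v_set(body, i):
    """Variables occurring in some body atom other than the i-th one."""
    return {v for j, (_, args) in enumerate(body) if j != i for v in args}


def mappings(body, i):
    """One mapping per set T with v_set(body, i) ⊆ T ⊆ var(body);
    each mapping sends every variable of T to the constant."""
    all_vars = sorted({v for _, args in body for v in args})
    required = v_set(body, i)
    optional = [v for v in all_vars if v not in required]
    result = []
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            chosen = required | set(extra)
            result.append({v: CONST for v in sorted(chosen)})
    return result


# Hypothetical body (an assumption): P(x, y, z), S(y, w)
body = [("P", ("x", "y", "z")), ("S", ("y", "w"))]
m_alpha1 = mappings(body, 0)   # variables x and z occur only in P
m_alpha2 = mappings(body, 1)   # variable w occurs only in S
```

The number of mappings is 2 raised to the number of variables occurring only in the chosen atom, which is one source of the exponential blow-up of the linearization in the arity.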
Lemma 3 allows us to show that this procedure preserves the termination of the semi-oblivious chase whenever the input set of TGDs is sticky.
(⇒) Assume that Lin(Σ) ∉ CT^so_∀. By Corollary 1, there exists an infinite so-chase derivation δ = (I_i)_{i≥0} of cr(Lin(Σ)) w.r.t. Lin(Σ), with I_i⟨σ_i^{α_i,g_i}, h_i⟩I_{i+1}, or, equivalently, I_{i+1} = I_i ∪ so-result(σ_i^{α_i,g_i}, h_i), for each i ≥ 0. By construction, for each i ≥ 0, the TGD σ_i^{α_i,g_i} is linear, with its body being an atom of the form R_t̄(x̄), where x̄ is a tuple of variables, and t̄ is a tuple whose entries are either ∗ or the special constant. Let ĥ_i be the extension of h_i that maps each variable in body(σ_i) that is not in x̄ to the special constant.
Consider the infinite sequence of instances δ′ = (J_i)_{i≥0}, where J_0 = cr(Σ), and J_{i+1} = J_i ∪ so-result(σ_i, ĥ_i), for each i ≥ 0. By construction of ĥ_i, for each i ≥ 0, (σ_i, ĥ_i) is a trigger for Σ on J_i, which implies that J_i⟨so, σ_i, ĥ_i⟩J_{i+1}. However, the above sequence is not necessarily an infinite so-chase derivation of cr(Σ) w.r.t. Σ since some of the involved triggers may not be semi-obliviously active. We therefore consider the sequence of instances δ″ obtained from δ′ by simply removing the triggers that are not semi-obliviously active, or, more formally, by removing the instances obtained from triggers that are not semi-obliviously active. Recall that, for each i > 0, J_i is obtained via a trigger that is not semi-obliviously active iff J_i = J_{i−1}. It remains to show that δ″ is infinite, which in turn implies that Σ ∉ CT^so_∀. It suffices to show that, for each i ≥ 0, J_i occurs in δ′ finitely many times.
By contradiction, assume that there is k ≥ 0 such that J_k occurs in δ′ infinitely many times, which means that J_k = J_{k+1} = J_{k+2} = · · · . Since Lin(Σ) is finite, there are indices k ≤ j < i such that σ_i^{α_i,g_i} = σ_j^{α_j,g_j}; we refer to those TGDs as ρ.
This implies that ĥ_i(x) = ĥ_j(x), and therefore, J_i = J_j, which is a contradiction.
For every i ≥ 0, let X i = var(body(σ i ) \ {β i }), g i = h i|X i , and f i = h i \ g i . Note that, since δ is linear, for every x ∈ X i , g i (x) = . Assuming that σ i is of the form By construction, Σ ⊆ Lin(Σ ), and therefore, Σ ∈ CT so ∀ implies Lin(Σ ) ∈ CT so ∀ . Hence, to conclude the proof of Lemma 3, it suffices to exhibit an infinite so-chase derivation of cr(Σ ) ⊆ cr(Σ ) w.r.t. Σ . To this end, consider the infinite sequence of instances δ = (J i ) i≥0 , where J 0 = cr(Σ ), and J i = J i−1 ∪ {α i }, where α i = f i (body(σ i )) = f i ( -free(g i (β i ))), for each i > 0. It is not difficult to verify that δ is an infinite so-chase derivation of cr(Σ ) w.r.t. Σ (modulo null renaming).
The main result of this section follows from Theorem 4 and Lemma 3:
Theorem 5 Consider a set Σ ∈ S of TGDs. The following are equivalent:
1. Σ ∈ CT^so_∀.
2. Lin(Σ) is critically-weakly-acyclic.
We would like to remark that the above characterization of the termination of the so-chase is different from the one presented in the conference paper [10] (see Corollary 24). More precisely, the linearization procedure in [10] produces TGDs with constants, and thus, we had to properly extend the notion of critical-weak-acyclicity from [8], which was defined only for constant-free TGDs. It turned out that we can directly use the notion of critical-weak-acyclicity as defined in [8], at the price of a slightly more complex linearization procedure that produces constant-free TGDs.

Complexity Analysis
We are now ready to complete the proof of our main result, that is, Theorem 2, which states that CT o ∀ (S) and CT so ∀ (S) are PSPACE-complete, and NLOGSPACE-complete for predicates of bounded arity. We first concentrate on the upper bounds.

Upper Bounds for CT so ∀ (S)
We know that the problem CT^so_∀(L) is in PSPACE, and in NLOGSPACE for predicates of bounded arity [8]. In fact, these results exploit Theorem 4, and are obtained by showing that deciding whether a set of linear TGDs is critically-weakly-acyclic is in PSPACE, and in NLOGSPACE for predicates of bounded arity. However, despite the fact that we can reduce CT^so_∀(S) to CT^so_∀(L) (see Lemma 3), we cannot directly exploit the complexity results for linear TGDs. The reason is that the linearization procedure takes exponential time, in general, and polynomial time in the case of bounded-arity predicates. Therefore, we cannot simply compute the set Lin(Σ), and then check for critical-weak-acyclicity; a more refined approach is needed.
We focus on the complement of our problem, i.e., given a set Σ ∈ S of TGDs, we want to check whether Σ ∉ CT^so_∀. By Theorem 5, it suffices to show that Lin(Σ) is not critically-weakly-acyclic. The latter problem can be seen as a generalization of the standard graph reachability problem. Indeed, we need to check whether there exists a node in the dependency graph of Lin(Σ) that is reachable from itself via a critical cycle that contains a special edge. However, as discussed above, we cannot explicitly construct Lin(Σ) and its dependency graph G. Instead, the above reachability check should be performed on a compact representation of G, which is the set Σ itself.

Lemma 4 Consider a set Σ of TGDs. The problem of deciding whether Lin(Σ ) is not critically-weakly-acyclic is in
where ω = ar(Σ ), and m is the number of variables occurring in Σ.
Proof We employ Algorithm 1, which takes as input a set Σ of TGDs, and checks whether Lin(Σ) is not critically-weakly-acyclic by non-deterministically searching for the existence of a critical cycle in dg(Lin(Σ)) that contains a special edge. Note that this is done without explicitly computing the graph dg(Lin(Σ)). Before showing that Algorithm 1 is correct, and analyzing its space complexity, let us first introduce an auxiliary notion that is used by the algorithm.
Claim Consider a Σ-label , and two positions v, u ∈ sch(Lin(Σ)). Then:
We can now show that Algorithm 1 accepts on input Σ iff there is a critical cycle in dg(Lin(Σ)) that contains a special edge:
(⇒) Assume first that Algorithm 1 accepts. This implies that there is an accepting computation that guesses a sequence of Σ-labels 1 , . . . , n , and a sequence of positions v_0, . . . , v_n of pos(sch(Lin(Σ))). By construction, the following hold: The above properties imply that v_0, . . . , v_{n−1}, v_0 is a critical cycle in dg(Lin(Σ)) that contains at least one special edge, and the claim follows.
We proceed to analyze the space needed to store a Σ-label, a position of sch(Lin(Σ )), and a single-head linear TGD τ ( ) for a Σ-label . We also analyze the space for a compatibility check, for constructing a resolvent, and for checking whether [ρ] ω+1 is active. It will be then apparent that indeed the space used at each step of a computation of Algorithm 1 on input Σ is what we claimed above. Note that the latter ensures the termination of Algorithm 1 since we can always force a space-bounded algorithm to terminate and reject after exponentially many steps in the required space [31].
-The required space for a Σ-label = (σ, α, β, T ) is A TGD σ ∈ Σ takes O(log |Σ|) space. For storing an atom occurring in Σ we need to store a predicate of sch(Σ ), and, in the worst-case, ω variables occurring in Σ, which can be done in space O(log |sch(Σ)| + ω · log m). Finally, since T contains at most 2ω variables occurring in Σ, it can be stored in space O(ω · log m). Summing up, requires the space stated above.
-A position of sch(Lin(Σ )) takes space This follows from the fact that, by construction, the number of predicates occurring in Lin(Σ ) is at most |sch(Σ)| · 2 ω . -Given a Σ-label , the single-head linear TGD τ ( ) takes space It is clear that the two predicates occurring in τ ( ) require O(|sch(Σ)|·2 ω ) space. We also need to store, in the worst-case, 2ω variables occurring in Σ, which takes O(ω · log m) space. Summing up, τ ( ) requires the space stated above. -To check whether two single-head linear TGDs σ 1 , σ 2 , computed during the execution of the algorithm, are compatible, we only need to check that, for each x ∈ var(body(σ 2 )), either var(head(σ 1 ), Π σ 2 x ) ⊆ fr(σ 1 ), or there is an existentially quantified variable z in σ 1 such that var(head(σ 1 ), Π σ 2 x ) = {z}. The latter can be performed using space which is the space needed for storing Π σ 2 x . Note that head(σ 1 ) and body(σ 2 ) always unify since, at the point that we perform the compatibility check, we know that they have the same predicate (this has been checked by isEdge), and thus, there is no way for the unification check to fail.
-Given two single-head linear TGDs σ 1 , σ 2 , computed during the execution of the algorithm, that are compatible, constructing [σ 1 , σ 2 ] can be done using space which is essentially the space needed for storing mgu(head(σ 1 ), body(σ 2 )). 10 -Finally, given the single-head linear TGD ρ, computed after the execution of the repeat-until, checking whether [ρ] ω+1 is active can be done using space This easily follows from the analysis performed above.
Summing up, each step of a computation of Algorithm 1 on input Σ takes space and the claim follows.
Having Lemma 4 in place, it is clear that the complement of CT^so_∀(S), and thus CT^so_∀(S) itself, is in PSPACE, and in NLOGSPACE for predicates of bounded arity.

Upper Bounds for CT o ∀ (S)
We proceed to explain how the upper bounds established above for CT^so_∀(S) can be transferred to CT^o_∀(S). Since the semi-oblivious chase is a refined version of the oblivious chase, it is not surprising that CT^o_∀(TGD) can be reduced to CT^so_∀(TGD). This relies on a very simple construction, known as enrichment [20]. Formally, the enrichment of a set Σ of TGDs, denoted enrichment(Σ), is the set of TGDs obtained by replacing each TGD σ ∈ Σ of the form φ(x̄, ȳ) → ∃z̄ ψ(x̄, z̄) with the TGD φ(x̄, ȳ) → ∃z̄ ψ(x̄, z̄), Aux_σ(x̄, ȳ), where Aux_σ is an auxiliary predicate of arity (|x̄| + |ȳ|) not occurring in sch(Σ). It is easy to show that the notion of enrichment provides the desired reduction from CT^o_∀(TGD) to CT^so_∀(TGD). More precisely:
Lemma 5 For a set Σ of TGDs, the following are equivalent:
1. Σ ∈ CT^o_∀.
2. enrichment(Σ) ∈ CT^so_∀.
Proof The fact that (2) implies (1) is a consequence of Theorem 6.1 in [20]. We proceed to show that (1) implies (2) by contraposition. Assume that enrichment(Σ) ∉ CT^so_∀. Therefore, there exists a database D, and an infinite so-chase derivation δ = (I_i)_{i≥0} of D w.r.t. enrichment(Σ), where I_i⟨so, σ′_i, h_i⟩I_{i+1}, and, for each i ≥ 0, σ′_i ∈ enrichment(Σ) is the TGD corresponding to some TGD σ_i ∈ Σ. We can easily construct an infinite o-chase derivation δ′ = (J_i)_{i≥0} of D w.r.t. Σ: let J_0 = D, and, for each i > 0, J_i is obtained from I_i by removing every atom with a predicate symbol of the form Aux_σ, for some σ ∈ Σ, and by replacing every null of the form ⊥^z_{σ′,h} with the null ⊥^z_{σ,h}. The fact that J_i⟨o, σ_i, h_i⟩J_{i+1} and (σ_i, h_i) is obliviously active, for each i ≥ 0, follows by construction and the observation that, for each TGD σ′_i ∈ enrichment(Σ) for i ≥ 0, fr(σ′_i) coincides with var(body(σ′_i)). Hence, Σ ∉ CT^o_∀.
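The enrichment construction is simple enough to sketch directly. The following minimal example assumes TGDs are encoded as (body_atoms, head_atoms, existential_vars) triples; the function name `enrich`, the TGD name "s", and the sample TGD P(x, y), Q(y, w) → ∃z R(x, z) are hypothetical and used only for illustration.

```python
def enrich(tgd, name):
    """Add the atom Aux_name(x̄, ȳ) to the head, where x̄ are the frontier
    variables and ȳ the remaining body variables (in order of first occurrence)."""
    body, head, existentials = tgd
    body_vars = list(dict.fromkeys(v for _, args in body for v in args))
    head_vars = {v for _, args in head for v in args}
    xs = [v for v in body_vars if v in head_vars]       # frontier x̄
    ys = [v for v in body_vars if v not in head_vars]   # non-frontier ȳ
    aux = ("Aux_" + name, tuple(xs + ys))
    return (body, head + [aux], existentials)


def frontier(tgd):
    """Variables occurring in both the body and the head."""
    body, head, _ = tgd
    body_vars = {v for _, args in body for v in args}
    head_vars = {v for _, args in head for v in args}
    return body_vars & head_vars


# Hypothetical TGD (an assumption): P(x, y), Q(y, w) -> exists z. R(x, z)
sigma = ([("P", ("x", "y")), ("Q", ("y", "w"))], [("R", ("x", "z"))], {"z"})
sigma_enriched = enrich(sigma, "s")
```

After enrichment every body variable is a frontier variable, which is the observation the proof of Lemma 5 relies on: semi-oblivious triggers of the enriched TGD are distinguished exactly as oblivious triggers of the original one.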
It is also crucial to observe that the class of sticky sets of TGDs is closed under enrichment. In other words:
Lemma 6 For every set Σ ∈ S, enrichment(Σ) ∈ S.
Proof Consider a set Σ ∈ S. The claim follows from the fact that, for each σ ∈ Σ of the form φ(x̄, ȳ) → ∃z̄ ψ(x̄, z̄), the corresponding TGD σ′ ∈ enrichment(Σ) of the form φ(x̄, ȳ) → ∃z̄ ψ(x̄, z̄), Aux_σ(x̄, ȳ) is such that the atom Aux_σ(x̄, ȳ) contains every body variable of σ′, and no TGD in enrichment(Σ) has an atom with predicate Aux_σ in its body. Hence, no variable in enrichment(Σ) is marked.
From Lemmas 5 and 6, we can conclude that CT^o_∀(S) can be reduced in logspace to CT^so_∀(S). Since, as shown above, CT^so_∀(S) is in PSPACE, and PSPACE is closed under logspace reductions, we immediately get that CT^o_∀(S) is in PSPACE, as needed. However, in the case of predicates of bounded arity, we cannot immediately inherit the NLOGSPACE upper bound since the reduction provided by Lemmas 5 and 6 introduces auxiliary predicates of the form Aux_σ, where σ is a TGD, of unbounded arity. Indeed, the arity of this auxiliary predicate is the number of variables occurring in the body of the TGD σ, which can be unbounded even if we use only bounded-arity predicates. Nevertheless, a predicate Aux_σ does not occur in the body of a TGD, which means that it cannot be part of a cycle in the underlying dependency graph. Therefore, we can consider a slightly modified version of Algorithm 1 that simply ignores the atoms that mention a predicate Aux_σ. By relying on this modified algorithm, we get that for a set Σ of TGDs over predicates of bounded arity, checking whether Lin(enrichment(Σ)) is not critically-weakly-acyclic is in NLOGSPACE, which implies that CT^o_∀(S) is in NLOGSPACE for predicates of bounded arity.

Lower Bounds
We conclude the proof of Theorem 2 by providing matching lower bounds. We show that CT o ∀ (S) and CT so ∀ (S) are PSPACE-hard, and NLOGSPACE-hard for predicates of bounded arity. Let us start with the general case of unbounded-arity predicates.
Predicates of Unbounded Arity. Since, as discussed above, CT^o_∀(S) can be reduced in logspace to CT^so_∀(S), it suffices to show that CT^o_∀(S) is PSPACE-hard, or, equivalently, its complement is PSPACE-hard. We show that every problem in PSPACE can be reduced in logspace to the complement of CT^o_∀(S). Fix a problem Π in PSPACE, and let M = (S, Λ, f, s_1) be the deterministic polynomial space Turing machine that solves Π, where S = {s_1, . . . , s_q} is the set of states of M, Λ is the tape alphabet of M, consisting of 0, 1, and the blank symbol, f : S × Λ → (S × Λ × {←, −, →}) is the transition function of M, and s_1 ∈ S is the initial state. We assume, w.l.o.g., that M is well-behaved and never tries to read beyond its tape boundaries, always halts, and uses exactly n = m^k tape cells, where k > 0 and m is the size of the input string. We also assume that the machine accepts if it reaches a configuration with state s_2. For the purposes of the current proof, we represent a configuration of M as a string s, t_1, c_1, t_2, c_2, . . . , t_n, c_n, where s ∈ S, (t_i, c_i) ∈ Λ × {↑, }, for each i ∈ [n], and there is exactly one i ∈ [n] such that c_i = ↑. Such a string encodes the configuration where the state is s, the i-th cell of the machine contains the symbol t_i, and, assuming that c_i = ↑, the cursor is on the i-th cell of the machine. For example, the initial configuration of M on input I = a_1, . . . , a_m is s_1, a_1, ↑, a_2, , . . . , a_m, , , , . . . , ,
Our goal is to construct a set Σ ∈ S such that M accepts on input I = a_1, . . . , a_m iff Σ ∉ CT^o_∀, or, equivalently, there exists an infinite o-chase derivation of cr(Σ) w.r.t. Σ. The high-level idea is, starting from an atom of the form Start( , ⊥, ), where ⊥ is a null, to apply a TGD σ_start and generate the initial configuration of M on input I, which will be stored in a predicate Config.
Then, each application of a TGD will mimic a transition rule of f and generate a valid configuration of M. Once an accepting configuration is reached, an atom of the form Start(⊥, ⊥′, ), where ⊥′ is a null different from ⊥, will be generated, which will again trigger the TGD σ_start. This will give rise to an infinite o-chase derivation of cr(Σ) w.r.t. Σ. To achieve this via a sticky set of TGDs, however, we need a proper encoding of a configuration of M as a Config-atom generated during the execution of the chase. We proceed to explain this encoding, which will then allow us to define our sticky set of TGDs.
The key idea is to encode a state of M, the symbols of Λ, and the symbols ↑, , as tuples consisting of a null ⊥ ∈ N and a single occurrence of the special constant, while the position at which the constant occurs in this tuple identifies the state or symbol in question. In particular, a state s_i ∈ S, for i ∈ [q], will be represented in the chase as a tuple
implies that CT^o_∀(S) is NLOGSPACE-hard since CONLOGSPACE = NLOGSPACE. The set Σ consists of the following TGDs:
Loop(x) → P_s(x)
P_v(x) → P_u(x), for each (v, u) ∈ E
P_t(x) → ∃z Loop(z).
It is easy to see that indeed t is reachable from s iff Σ ∉ CT^o_∀, and the claim follows.
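The reduction from graph reachability can be explored with a minimal sketch. It builds the three kinds of TGDs listed above for a directed graph given as an edge list, and then runs a bounded oblivious chase from an instance that holds one atom per predicate over a stand-in constant "c" (this instance, the encoding of unary TGDs as triples, the null naming, and the budget are all assumptions made for illustration). Exceeding the null budget is only evidence that the chase does not terminate, not a proof.

```python
from collections import deque


def reachability_tgds(edges, s, t):
    """Unary TGDs as (body_predicate, head_predicate, has_existential) triples."""
    rules = [("Loop", "P_" + s, False)]                    # Loop(x) -> P_s(x)
    rules += [("P_" + v, "P_" + u, False) for v, u in edges]  # P_v(x) -> P_u(x)
    rules.append(("P_" + t, "Loop", True))                 # P_t(x) -> exists z. Loop(z)
    return rules


def oblivious_chase_exceeds(rules, null_budget=20):
    """Run the oblivious chase on unary atoms; return True if more than
    `null_budget` nulls get invented, False if the chase reaches a fixpoint."""
    preds = {p for body, head, _ in rules for p in (body, head)}
    instance = {(p, "c") for p in preds}
    queue = deque(sorted(instance))
    nulls = 0
    while queue:
        pred, term = queue.popleft()
        # every (rule, atom) pair is applied exactly once: oblivious chase
        for idx, (body, head, existential) in enumerate(rules):
            if body != pred:
                continue
            if existential:
                new_term = ("null", idx, term)   # null named by the full trigger
                nulls += 1
                if nulls > null_budget:
                    return True
            else:
                new_term = term
            atom = (head, new_term)
            if atom not in instance:
                instance.add(atom)
                queue.append(atom)
    return False


reachable_case = oblivious_chase_exceeds(
    reachability_tgds([("a", "b"), ("b", "d")], "a", "d"))
unreachable_case = oblivious_chase_exceeds(
    reachability_tgds([("a", "b")], "a", "d"))
```

When t is reachable from s, each fresh Loop-null travels along the path from s to t and spawns another null; otherwise only the null created from the constant is ever invented.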

Discussion and Future Work
We have studied all-instances (semi-)oblivious chase termination for sticky sets of TGDs, and provided precise complexity results. It turned out that our results and techniques allow us to obtain further complexity results concerning the chase procedure.

Further Results
Interestingly, our main result has already found applications in the context of (semi-)oblivious chase boundedness, which has been recently studied in [7]. Roughly, chase boundedness guarantees, not only the termination of the chase procedure, but also the existence of a uniform bound on the depth of the chase. As shown in [7], in the case of sticky sets of TGDs, (semi-)oblivious chase termination and (semi-)oblivious chase boundedness coincide. Therefore, from the main result of our work, we immediately get the complexity of deciding (semi-)oblivious chase boundedness in the case of sticky sets of TGDs. Moreover, the techniques that have been developed in this work can also be applied to shy sets of TGDs, another well-known TGD-based formalism [25,26]. Shy sets of TGDs are not powerful enough to join over null values that have been invented during the execution of the chase. This central property of shy sets of TGDs allows us to reuse the machinery developed for stickiness, and show that the (semi-)oblivious chase termination problem for shy sets of TGDs is PSPACE-complete in general, and NLOGSPACE-complete for predicates of bounded arity.
Future Work. Although all-instances (semi-)oblivious chase termination for sticky sets of TGDs is well-understood, there are still interesting and highly non-trivial open questions that we are planning to address in the near future:
1. Whenever the (semi-)oblivious chase terminates under a sticky set of TGDs, what is the exact size of the final result?
2. Even if the (semi-)oblivious chase does not terminate, it might be the case that a finite universal model exists [15]. Can we decide, in the case of sticky sets of TGDs, whether a finite universal model exists? And what about its size?