In this section, we investigate the existence and definability of limits of sequences of LAV mappings. In fact, we will consider a much broader class of GLAV mappings, namely k-premise-bounded GLAV mappings for arbitrary k ≥ 1. LAV mappings correspond to the special case of k = 1.
Definition 7
Let \(\mathcal {M}\) be a GLAV mapping and k a positive integer. We call \(\mathcal {M}\) a k-premise-bounded GLAV mapping if the premise of every constraint in \(\mathcal {M}\) has at most k atoms.
Let \((\mathcal {M}_{n})_{n\geq 1}\) be a sequence of GLAV mappings. We say that \((\mathcal {M}_{n})_{n\geq 1}\) is premise-bounded if there exists an integer k such that every element \(\mathcal {M}_{n}\) of \((\mathcal {M}_{n})_{n\geq 1}\) is k-premise-bounded.
Unlike the case of GAV mappings, the notions of pointwise Cauchy and uniformly Cauchy sequences of premise-bounded GLAV mappings coincide. Moreover, the same holds true for the notions of pointwise limit and uniform limit of sequences of such schema mappings.
Theorem 4
Let \((\mathcal {M}_{n})_{n\geq 1}\) be a sequence of premise-bounded GLAV mappings.
(1) The sequence \((\mathcal {M}_{n})_{n\geq 1}\) is pointwise Cauchy if and only if it is uniformly Cauchy.
(2) The sequence \((\mathcal {M}_{n})_{n\geq 1}\) has a pointwise limit if and only if it has a uniform limit.
Proof 9
We prove the first part and then use it to prove the second part.
Part 1. It is obvious that every uniformly Cauchy sequence of mappings is also pointwise Cauchy. We focus on the reverse direction. Let \((\mathcal {M}_{n})_{n\geq 1}\) be a pointwise Cauchy sequence of premise-bounded GLAV mappings. We have to show that for every m, there is an \(N_{0}\) such that for all \(n, n^{\prime } \geq N_{0}\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}_{m}} \mathcal {M}_{n^{\prime }}\).
Fix an integer m. Since \((\mathcal {M}_{n})_{n\geq 1}\) is pointwise Cauchy, for every source instance I, there is an integer \(n_{0}(I)\) such that for all \(n, n^{\prime } \geq n_{0}(I)\) and for every conjunctive query q in \(\mathsf {CQ}_{m}\), we have that \(cert(q,I,\mathcal {M}_{n}) = cert(q,I,\mathcal {M}_{n^{\prime }})\). Let p be the number of relation symbols in the target schema, let r be their maximum arity, and let k be the bound on the number of atoms in the premises of the members of the sequence \((\mathcal {M}_{n})_{n\geq 1}\). We write \(\mathcal {I}\) to denote the class of all source instances with at most \(k \cdot p \cdot m^{r}\) atoms. Clearly, up to isomorphism, there are only finitely many instances \(I \in \mathcal {I}\). Moreover, if \(I^{\prime } \cong I^{\prime \prime }\), then \(n_{0}(I^{\prime }) = n_{0}(I^{\prime \prime })\). Consequently, the quantity \(N_{0} = \max \{n_{0}(I) \mid I \in \mathcal {I}\}\) is a positive integer. We claim that for all \(n, n^{\prime } \geq N_{0}\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}_{m}} \mathcal {M}_{n^{\prime }}\).
Let I be an arbitrary source instance and let q be an arbitrary conjunctive query in \(\mathsf {CQ}_{m}\). We have to show that \(cert(q,I, \mathcal {M}_{n}) = cert(q, I, \mathcal {M}_{n^{\prime }})\), for all \(n, n^{\prime } \geq N_{0}\). Let \(\mathbf {a}\) be a tuple of constants such that \(\mathbf {a} \in cert(q, I, \mathcal {M}_{n})\), hence \(\mathbf {a} \in q(chase(I, \mathcal {M}_{n}))_{\downarrow }\). Since the query q has at most m variables, it must consist of at most \(p \cdot m^{r}\) atoms. Let \(h: \mathit {atoms}(q) \to chase(I, \mathcal {M}_{n})\) be a homomorphism establishing that \(\mathbf {a} \in q(chase(I,\mathcal {M}_{n}))\). It follows that there are at most \(p \cdot m^{r}\) facts in \(chase(I,\mathcal {M}_{n})\) witnessing that \(\mathbf {a} \in q(chase(I, \mathcal {M}_{n}))_{\downarrow }\). Each of these facts must be produced in a single step while chasing the source instance I with \(\mathcal {M}_{n}\), which implies that each of these facts is produced using at most k facts from I. Let \(I^{*}\) be the subinstance of I consisting of all the aforementioned facts of I used to produce the facts in \(chase(I,\mathcal {M}_{n})\) witnessing that \(\mathbf {a} \in q(chase(I, \mathcal {M}_{n}))_{\downarrow }\). We then have that \(|I^{*}| \leq k \cdot p \cdot m^{r}\) and \(\mathbf {a} \in q(chase(I^{*}, \mathcal {M}_{n}))_{\downarrow }\). Since \(I^{*} \in \mathcal {I}\) and \(n, n^{\prime } \geq N_{0} \geq n_{0}(I^{*})\), we have that \(q(chase(I^{*}, \mathcal {M}_{n}))_{\downarrow } = q(chase(I^{*}, \mathcal {M}_{n^{\prime }}))_{\downarrow }\), hence \(\mathbf {a} \in q(chase(I^{*}, \mathcal {M}_{n^{\prime }}))_{\downarrow }\). By the monotonicity of the chase procedure, we have that \(\mathbf {a} \in q(chase(I, \mathcal {M}_{n^{\prime }}))_{\downarrow }\). It follows that \(q(chase(I, \mathcal {M}_{n}))_{\downarrow } \subseteq q(chase(I, \mathcal {M}_{n^{\prime }}))_{\downarrow }\). A symmetric argument establishes the containment \(q(chase(I, \mathcal {M}_{n^{\prime }}))_{\downarrow } \subseteq q(chase(I, \mathcal {M}_{n}))_{\downarrow }\), hence \(q(chase(I, \mathcal {M}_{n}))_{\downarrow } = q(chase(I, \mathcal {M}_{n^{\prime }}))_{\downarrow }\), which, in turn, implies that \(cert(q,I, \mathcal {M}_{n}) = cert(q, I, \mathcal {M}_{n^{\prime }})\).
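As a sanity check on the counting step in Part 1, the following minimal sketch enumerates all distinct atoms over m variables and compares their number with the bound \(p \cdot m^{r}\). The encoding of the target schema as a dictionary from relation names to arities, the assumption that queries contain no constants, and the names `all_atoms` and `witness_bound` are illustrative choices, not notation from the text.

```python
from itertools import product

def all_atoms(arities, m):
    """Enumerate every distinct atom R(z_i1, ..., z_ia) over m variables.

    `arities` maps each (hypothetical) relation name to its arity.
    """
    variables = [f"z{i}" for i in range(1, m + 1)]
    return [(rel, args)
            for rel, arity in arities.items()
            for args in product(variables, repeat=arity)]

def witness_bound(arities, m, k):
    """The bound k * p * m^r on the size of the subinstance I* in the proof."""
    p = len(arities)           # number of target relation symbols
    r = max(arities.values())  # maximum arity
    return k * p * m ** r

# Example: a target schema with a binary F and a unary G, and m = 3 variables.
arities = {"F": 2, "G": 1}
atoms = all_atoms(arities, m=3)
```

In this example the enumeration yields 9 atoms for F and 3 for G, which stays below \(p \cdot m^{r} = 2 \cdot 3^{2}\); the bound is deliberately coarse, since it treats every relation as if it had the maximum arity.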
Part 2. It is obvious that if a sequence of schema mappings has a uniform limit, then it has a pointwise limit. We focus on the reverse direction. Let \((\mathcal {M}_{n})_{n\geq 1}\) be a sequence of premise bounded GLAV mappings that has a pointwise limit \(\mathcal {M}\). We claim that \(\mathcal {M}\) is also a uniform limit of \((\mathcal {M}_{n})_{n\geq 1}\).
Since \((\mathcal {M}_{n})_{n\geq 1}\) has a pointwise limit, we have that \((\mathcal {M}_{n})_{n\geq 1}\) is pointwise Cauchy. The previous part implies that \((\mathcal {M}_{n})_{n\geq 1}\) is uniformly Cauchy as well. Fix an integer m. Since \((\mathcal {M}_{n})_{n\geq 1}\) is uniformly Cauchy, there exists an \(n_{0}\) such that for all \(n, n^{\prime } \geq n_{0}\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}_{m}} \mathcal {M}_{n^{\prime }}\). We claim that also \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}_{m}} \mathcal {M}\) holds, for every \(n \geq n_{0}\). To show this, fix some \(n \geq n_{0}\) and let I be a source instance and q a conjunctive query in \(\mathsf {CQ}_{m}\). We have to show that \(cert(q,I,\mathcal {M}_{n}) = cert(q,I,\mathcal {M})\). Since \(\mathcal {M}\) is a pointwise limit of \((\mathcal {M}_{n})_{n\geq 1}\), there is an \(n^{\prime }_{0}(I)\) such that for all \(n^{\prime } \geq n^{\prime }_{0}(I)\), we have that \(cert(q,I,\mathcal {M}_{n^{\prime }})=cert(q,I,\mathcal {M})\). Take an integer \(n^{\prime }\) such that \(n^{\prime } \geq \max \{n_{0}, n^{\prime }_{0}(I)\}\). Since \(n^{\prime } \geq n_{0}\), we have that \(cert(q,I,\mathcal {M}_{n}) = cert(q,I,\mathcal {M}_{n^{\prime }})\). Since \(n^{\prime }\geq n^{\prime }_{0}(I)\), we have that \(cert(q,I,\mathcal {M}_{n^{\prime }}) = cert(q,I,\mathcal {M})\). Thus, \(cert(q,I,\mathcal {M}_{n}) = cert(q,I,\mathcal {M})\). □
Note that the preceding proof of Part 2 used only two facts: that the sequence \((\mathcal {M}_{n})_{n\geq 1}\) is uniformly Cauchy, which follows from Part 1, and that it has a pointwise limit, which is the hypothesis. As a matter of fact, this is an instance of a general result about pseudometric spaces, namely, that if a uniformly Cauchy sequence of functions converges pointwise, then it also converges uniformly.
The following two propositions further demarcate the differences between GAV and premise-bounded GLAV mappings. In fact, these differences are already witnessed by sequences of LAV mappings. The first difference concerns the existence of limits of uniformly Cauchy sequences. In contrast to the GAV case, uniformly Cauchy sequences of LAV mappings may have no uniform limit; in fact, they may not even have a pointwise limit.
Proposition 6
There exists a uniformly Cauchy sequence of LAV mappings that has no pointwise limit; in particular, it has no uniform limit either.
Proof 10
Let S be a source schema consisting of a binary relation symbol E and let T be a target schema consisting of a binary relation F. For every n ≥ 1, let \(\mathcal {M}_{n}\) be the LAV mapping specified by the constraint
$$\forall x,y (E(x,y) \to q_{n+1})$$
where \(q_{n} = \exists z_{1},\ldots , z_{n} \bigwedge _{1 \leq i < j \leq n}(F(z_{i},z_{j}) \wedge F(z_{j},z_{i}))\) is the boolean conjunctive query which is satisfied by the graphs containing a self-loop or a clique of size n (now considering F as the edge relation).
We first show that the sequence \((\mathcal {M}_{n})_{n\geq 1}\) is uniformly Cauchy. Let k ≥ 1. We claim that if we take \(n_{0} = k\), then for every source instance I, for every \(n, m \geq n_{0}\), and every \(q \in \mathsf {CQ}_{k}\), we have that \(cert(q, I, \mathcal {M}_{n}) = cert(q, I, \mathcal {M}_{m})\). To see this, note that for every source instance I and for every t ≥ 1, the universal solutions of I w.r.t. \(\mathcal {M}_{t}\) have active domains consisting entirely of labeled nulls. Hence, only boolean queries may return a non-empty result. Moreover, observe that these universal solutions have no self-loops, i.e., they contain no atoms of the form F(v, v) for some labeled null v.
We now distinguish two cases: First, suppose that \(q \in \mathsf {CQ}_{k}\) is a boolean conjunctive query which contains a “self-loop”, i.e., an atom of the form F(z, z) for some variable z. Then we clearly have \(cert(q, I, \mathcal {M}_{n}) = \mathit {false} = cert(q, I, \mathcal {M}_{m})\). It remains to consider the case that \(q \in \mathsf {CQ}_{k}\) is a boolean CQ containing no self-loop. Then we clearly have \(cert(q, I, \mathcal {M}_{n}) = \mathit {true} = cert(q, I, \mathcal {M}_{m})\), since we are assuming that m, n ≥ k holds.
Using an argument similar to the one in the proof of Proposition 2, we now show that the sequence \((\mathcal {M}_{n})_{n\geq 1}\) has no pointwise limit. Towards a contradiction, assume that \((\mathcal {M}_{n})_{n\geq 1}\) does have a pointwise limit \(\mathcal {M}\). Let I be a non-empty source instance. We consider three cases.
First, assume that \(\text {Sol}(I,\mathcal {M})\) is empty. Then, for every boolean conjunctive query q, it holds trivially that \(cert(q,I,\mathcal {M}) = \mathit {true}\). This is, in particular, the case for the query q = ∃z F(z, z), which asks for the existence of a self-loop. However, for this query q, we have that \(cert(q, I, \mathcal {M}_{n}) = \mathit {false}\) for every n ≥ 1.
Second, assume that \(\text {Sol}(I,\mathcal {M})\) is non-empty and that all solutions \(J \in \text {Sol}(I,\mathcal {M})\) contain a self-loop. For the query q = ∃z F(z, z) as above, we again have \(cert(q, I, \mathcal {M}) = {\mathit {true}}\), whereas \(cert(q, I, \mathcal {M}_{n}) = \mathit {false}\), for every n ≥ 1.
Finally, assume that \(\text {Sol}(I,\mathcal {M})\) is non-empty and that at least one solution \(J \in \text {Sol}(I,\mathcal {M})\) does not contain a self-loop. Let m be the biggest integer such that J contains a clique of size m. Consider the conjunctive query
$$q = \exists z_{1}, {\dots} z_{m+1} \bigwedge\limits_{1 \leq i < j \leq m+1} (F(z_{i},z_{j}) \wedge F(z_{j},z_{i})).$$
Then q evaluates to false over J and we have \(cert(q, I, \mathcal {M}) = \mathit {false}\). On the other hand, for all n ≥ m + 1 we have \(cert(q, I, \mathcal {M}_{n}) = \mathit {true}\). Again, this contradicts our assumption that \(\mathcal {M}\) is the pointwise limit of \((\mathcal {M}_{n})_{n\geq 1}\). □
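The behaviour of the clique queries used in this proof can be explored with a small brute-force sketch. It assumes a source instance with a single E-fact, so that the canonical universal solution w.r.t. \(\mathcal {M}_{n}\) is one clique of n + 1 nulls without self-loops; the function names and the encoding of graphs as sets of pairs are hypothetical and chosen only for illustration.

```python
from itertools import product

def clique(size):
    """Edge set of a clique on `size` nulls: both directions, no self-loops."""
    return {(u, v) for u in range(size) for v in range(size) if u != v}

def holds(query_atoms, edges, nodes):
    """Brute-force test for a homomorphism from a Boolean CQ into a graph."""
    variables = sorted({z for atom in query_atoms for z in atom})
    for image in product(nodes, repeat=len(variables)):
        h = dict(zip(variables, image))
        if all((h[a], h[b]) in edges for a, b in query_atoms):
            return True
    return False

def clique_query(size):
    """Atoms of q_size: F(z_i, z_j) and F(z_j, z_i) for all i < j."""
    return [(i, j) for i in range(size) for j in range(size) if i != j]

def chase_single_fact(n):
    """Assumed canonical solution for a one-fact source under M_n: K_{n+1}."""
    return clique(n + 1), range(n + 1)
```

For instance, with n = 3 the solution is a clique of size 4, so \(q_{4}\) holds while \(q_{5}\) and the self-loop query ∃z F(z, z) fail, which matches the case distinction above.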
The next difference is the definability of uniform limits. In Section 4, we saw that if a sequence of GAV mappings has a uniform limit, then it is eventually constant, hence it has a GAV mapping as a uniform limit. This property need not hold for sequences of LAV mappings (hence, it need not hold for sequences of premise-bounded schema mappings).
Proposition 7
There exists a sequence \((\mathcal {M}_{n})_{n\geq 1}\) of LAV mappings that has a uniform limit, but no uniform limit of \((\mathcal {M}_{n})_{n\geq 1}\) admits universal solutions. In particular, no SO tgd is a uniform limit of the sequence \((\mathcal {M}_{n})_{n\geq 1}\).
Proof 11
For every n ≥ 1, let \(\mathcal {M}_{n}\) be the LAV mapping specified by the constraint
$$\forall x (V(x) \to \exists P_{n})$$
where \(\exists P_{n}\) is the conjunctive query \(\exists z_{1} \ldots \exists z_{n} (F(z_{1}, z_{2}) \wedge \ldots \wedge F(z_{n-1}, z_{n}))\) asserting that there is a “path” (possibly with repeated vertices) of length n in the target instance. We now show that the sequence \((\mathcal {M}_{n})_{n\geq 1}\) has a uniform limit, but no uniform limit of this sequence admits universal solutions.
Part 1. For the first part of the claim, consider the schema mapping
$$\mathcal{M} = \{(I, J) \mid I\neq \emptyset\text{ and }J\in v(C_{k}), k > 1\},$$
where \(C_{k}\) is a target instance consisting of a simple cycle of nulls of size k and \(v(C_{k})\) is the set of all isomorphic copies of \(C_{k}\) via isomorphisms that rename nulls. We will show that \(\mathcal {M}\) is a uniform limit of the sequence \((\mathcal {M}_{n})_{n\geq 1}\). Specifically, we will show for every m, there exists \(n_{0}\) such that for all \(n \geq n_{0}\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}_{m}} \mathcal {M}\).
Let \(n_{0} = m\). Since each \(\mathcal {M}_{n}\) has solutions consisting entirely of nulls, it suffices to consider boolean CQs only. Let q be a boolean CQ with at most m variables and assume that \(cert(q, I, \mathcal {M}_{n})= \mathit {true}\), where n ≥ m. This implies that there is a homomorphism from the body of q into \(P_{n}\), where \(P_{n}\) is the simple path with n nodes. In turn, this implies that \(C_{k} \models q\), for every k. Thus, \(cert(q, I, \mathcal {M}) = \mathit {true}\) as well. In the other direction, assume that \(cert(q, I, \mathcal {M}) = \mathit {true}\). Note that q cannot contain a directed cycle, since no directed cycle can be mapped homomorphically into every cycle of length greater than one. Let h be a homomorphism from the body of q into \(C_{m+1}\). Since \(q \in \mathsf {CQ}_{m}\), the variables of q have at most m distinct images among the nodes of \(C_{m+1}\). This means that \(\tilde C_{m+1} \models q\), where \(\tilde C_{m+1}\) is obtained from \(C_{m+1}\) by removing the facts that contain at least one element that is not the image of one of the variables of q under h. Note that \(\tilde C_{m+1}\) has at least one fact less than \(C_{m+1}\), and so it is a collection of simple paths of length at most m; therefore, there is a homomorphism from \(\tilde C_{m+1}\) to \(P_{n}\), hence \(P_{n} \models q\).
Part 2. For the second part of the claim and towards a contradiction, assume that \(\mathcal {M}^{\prime }\) is a uniform limit of \((\mathcal {M}_{n})_{n\geq 1}\) such that there exists a non-empty source instance I and a finite universal solution J for I w.r.t. \(\mathcal {M}^{\prime }\). Note that for every i, we have that \(cert(\exists P_{i}, I, \mathcal {M}^{\prime }) = \mathit {true}\), because \(\mathcal {M}^{\prime }\) is a (uniform and, hence also pointwise) limit of the sequence \((\mathcal {M}_{n})_{n\geq 1}\). Then we also have that \(J \models \exists P_{i}\), since J is universal. Since J is finite, this is possible only if J contains a directed cycle.
We can now derive a contradiction as follows. For each positive integer l, let \(\exists C_{l}\) be the boolean conjunctive query asserting the existence of a cycle of length l. Then there is no n such that \(cert(\exists C_{l}, I, \mathcal {M}_{n}) = \mathit {true}\). Thus, \(cert(\exists C_{l}, I, \mathcal {M}^{\prime }) = \mathit {false}\) must hold for every l, since \(\mathcal {M}^{\prime }\) is a limit of \((\mathcal {M}_{n})_{n\geq 1}\). Hence, J cannot contain cycles.
Since every SO tgd admits universal solutions, it follows that no SO tgd is a (uniform or pointwise) limit of \((\mathcal {M}_{n})_{n\geq 1}\). □
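The observation from Part 1 that every path query \(\exists P_{n}\) holds in every cycle \(C_{k}\) can be made concrete by exhibiting the homomorphism that walks around the cycle. The sketch below is only an illustration under an assumed encoding (nodes of \(C_{k}\) are the integers 0 to k − 1, atoms are pairs); the names `path_query`, `cycle`, `wrap_around`, and `is_homomorphism` are hypothetical.

```python
def path_query(n):
    """Atoms F(z_1, z_2), ..., F(z_{n-1}, z_n) of the query for P_n."""
    return [(i, i + 1) for i in range(1, n)]

def cycle(k):
    """Edge set of the simple directed cycle C_k on nodes 0..k-1."""
    return {(i, (i + 1) % k) for i in range(k)}

def wrap_around(n, k):
    """Candidate homomorphism from P_n into C_k: send z_i to i mod k."""
    return {i: i % k for i in range(1, n + 1)}

def is_homomorphism(h, atoms, edges):
    """Check that every atom F(a, b) is mapped onto an edge."""
    return all((h[a], h[b]) in edges for a, b in atoms)
```

Because consecutive variables are sent to consecutive nodes modulo k, the check succeeds for every n and every k > 1, regardless of whether the path is shorter or longer than the cycle.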
By Theorem 1, every SO tgd is the uniform limit of a sequence of GLAV mappings. Proposition 7 implies that the converse is false, even for sequences of LAV mappings.
In the previous section, we showed that a sequence of GAV mappings has a GAV mapping as a pointwise limit if and only if it has a pointwise limit that allows for CQ-rewriting. Is there some structural property that characterizes when a sequence of premise-bounded GLAV mappings has a GLAV mapping as a pointwise limit (which, for premise-bounded mappings, is the same as a uniform limit)? We will show that the property of admitting universal solutions is the key to this question. Specifically, we have the following result.
Theorem 5
Let \((\mathcal {M}_{n})_{n\geq 1}\) be a premise-bounded sequence of GLAV mappings. The following statements are equivalent.
(1) \((\mathcal {M}_{n})_{n\geq 1}\) has a GLAV mapping \(\mathcal {M}\) as a uniform limit.
(2) \((\mathcal {M}_{n})_{n\geq 1}\) has a uniform limit that admits universal solutions.
Moreover, if \((\mathcal {M}_{n})_{n\geq 1}\) is a sequence of LAV mappings, then \((\mathcal {M}_{n})_{n\geq 1}\) has a LAV mapping as a uniform limit if and only if \((\mathcal {M}_{n})_{n\geq 1}\) has a uniform limit that admits universal solutions.
We now give two lemmas which will be used in the proof of Theorem 5, but are also of interest in their own right.
Lemma 2
If \(\mathcal {M}\) is the uniform limit of a sequence \((\mathcal {M}_{n})_{n\geq 1}\) of schema mappings each of which allows for CQ-rewriting, then also \(\mathcal {M}\) allows for CQ-rewriting.
Proof 12
Let q be a target conjunctive query with m variables. Since \(\mathcal {M}\) is a uniform limit of \((\mathcal {M}_{n})_{n\geq 1}\), there exists an integer \(n_{0}\) such that for every \(n \geq n_{0}\) and every source instance I, we have that \(cert(q,I,\mathcal {M}) = cert(q,I,\mathcal {M}_{n})\). In particular, \(cert(q,I,\mathcal {M}) = cert(q,I,\mathcal {M}_{n_{0}})\). Since \(\mathcal {M}_{n_{0}}\) allows for CQ-rewriting, there is a source conjunctive query \(q^{\prime }\) such that \(cert(q,I,\mathcal {M}_{n_{0}}) = q^{\prime }(I)\), for every source instance I. Hence, \(cert(q,I,\mathcal {M}) = q^{\prime }(I)\) holds, for every source instance I. □
It should be noted that the conclusion of Lemma 2 does not hold, in general, if \(\mathcal {M}\) is a pointwise limit of a sequence \((\mathcal {M}_{n})_{n\geq 1}\) of schema mappings each of which allows for CQ-rewriting. Indeed, if \((\mathcal {M}_{n})_{n\geq 1}\) is the sequence of GAV mappings in the proof of Proposition 5, then Theorem 3 and Proposition 5 imply that no pointwise limit of \((\mathcal {M}_{n})_{n\geq 1}\) allows for CQ-rewriting.
Lemma 3
Let \(\mathcal {M}\) be a uniform limit of a sequence \((\mathcal {M}_{n})_{n\geq 1}\) of LAV mappings. If \(\mathcal {M}\) admits universal solutions, then it is closed under unions.
Proof 13
The proof proceeds through several stages and involves four claims, each of which builds on preceding ones. We first state the claims without proof and then use the last claim to show the desired conclusion. After this, we complete the proof of the lemma by proving each claim.
We first modify the notion of CQ-equivalence by limiting the number of atoms of CQs, rather than the number of variables. This yields an equivalent notion of uniform limit.
-
For ℓ ≥ 1, we define \(\mathsf {CQ}^{\prime }_{\ell } = \{q \in \mathsf {CQ} \mid \mathit {length}(q) \leq \ell \}\), where \(\mathit {length}(q)\) denotes the number of atoms in q.
-
We say that two schema mappings \(\mathcal {M}_{1}\) and \(\mathcal {M}_{2}\) are \(\mathsf {CQ}^{\prime }_{\ell }\)-equivalent, denoted by \(\mathcal {M}_{1} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}_{2}\), if for every source instance I and for every \(q \in \mathsf {CQ}^{\prime }_{\ell }\), we have that \(cert(q,I,\mathcal {M}_{1}) = cert(q,I,\mathcal {M}_{2})\).
-
We say that \(\mathcal {M}\) is the \(u^{\prime }\)-limit of a sequence \((\mathcal {M}_{n})_{n\geq 1}\), denoted by \(\mathcal {M}_{n} \stackrel {u^{\prime }}{\longrightarrow } \mathcal {M}\), if for every ℓ, there exists \(n_{0}\) such that for all \(n \geq n_{0}\), it holds that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\).
Claim A.
The notions of \(u^{\prime }\)-limit and uniform limit coincide. Formally, for every sequence \((\mathcal {M}_{n})_{n\geq 1}\) of schema mappings and every schema mapping \(\mathcal {M}\), we have that \(\mathcal {M}_{n} \stackrel {u}{\longrightarrow } \mathcal {M}\) if and only if \(\mathcal {M}_{n} \stackrel {u^{\prime }}{\longrightarrow } \mathcal {M}\).
Next, we use the given sequence \((\mathcal {M}_{n})_{n\geq 1}\) to construct another sequence \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\) of LAV mappings that possesses some desirable properties. To define the sequence \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\), we need another claim.
Claim B.
Assume that \(\mathcal {M}_{n} \stackrel {u}{\longrightarrow } \mathcal {M}\). Then, there exists a strictly increasing sequence \((n_{i})_{i \geq 1}\) of positive integers, such that for every ℓ ≥ 1 and for every \(n \geq n_{\ell }\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\).
Let \((n_{i})_{i \geq 1}\) be the strictly increasing sequence of positive integers according to Claim B. We define the sequence \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\) of LAV mappings as follows:
$$\mathcal{M}^{\prime}_{n} = \left\{\begin{array}{ll} \mathcal{M}_{n} & \text{if } n < n_{1} \\ \displaystyle\bigcup\limits_{\tau \in \mathcal{M}_{n}} T(\tau,\ell) & \text{otherwise, if } n_{\ell} \leq n < n_{\ell +1} \end{array}\right. $$
Here, T(τ, ℓ) contains all LAV constraints obtained from τ by restricting the conclusion to at most ℓ atoms. Formally, let \(\tau = A(\mathbf {x}) \to \exists \mathbf {y}\, A_{1}(\mathbf {x},\mathbf {y}) \land \ldots \land A_{r}(\mathbf {x},\mathbf {y})\) and let \(\{j_{1},\ldots , j_{p}\} \subseteq \{1,\ldots , r\}\) for p ≥ 1. Define \(\tau [j_{1},\ldots , j_{p}] \text {:=}\, A(\mathbf {x}) \to \exists \mathbf {y}\, A_{j_{1}}(\mathbf {x},\mathbf {y})\land \ldots \land A_{j_{p}}(\mathbf {x},\mathbf {y})\). We define
$$T(\tau,\ell) \text{:=} \{\tau[j_{1},\ldots, j_{p}] \mid \{j_{1},\ldots,j_{p}\}\subseteq \{1,\ldots,r\} \text{ and } p\leq \ell \}. $$
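To make the construction of T(τ, ℓ) concrete, the following minimal sketch enumerates all restrictions of a single LAV constraint to at most ℓ conclusion atoms. The encoding of atoms as tuples and the function name `restrictions` are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def restrictions(premise, conclusion, ell):
    """All constraints tau[j_1, ..., j_p] with 1 <= p <= ell.

    `premise` is a single atom and `conclusion` a list of atoms; each
    result keeps the premise and a non-empty subset of the conclusion.
    """
    r = len(conclusion)
    result = []
    for p in range(1, min(ell, r) + 1):
        for indices in combinations(range(r), p):
            result.append((premise, [conclusion[j] for j in indices]))
    return result

# tau = A(x) -> exists y1, y2 . F(x, y1) and F(y1, y2) and F(y2, x)
tau_premise = ("A", ("x",))
tau_conclusion = [("F", ("x", "y1")), ("F", ("y1", "y2")), ("F", ("y2", "x"))]
t2 = restrictions(tau_premise, tau_conclusion, ell=2)
```

For this τ with three conclusion atoms, ℓ = 2 yields the three single-atom and the three two-atom restrictions; once ℓ is at least 3, τ itself is among the results.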
Claim C.
Let \((n_{i})_{i \geq 1}\) be the strictly increasing sequence of positive integers according to Claim B and let \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\) be the sequence of LAV mappings constructed above. Then, for every ℓ ≥ 1, the following properties hold: (i) for every \(n \geq n_{\ell }\), we have that \(\mathcal {M}^{\prime }_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\); (ii) the conclusion of every LAV constraint in \(\mathcal {M}^{\prime }_{n_{\ell }}\) is of length at most ℓ.
We now make the following claim about the sequence \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\).
Claim D.
For every source instance I, there exists an integer \(n_{0} \geq 1\) such that for every \(I^{\prime } \subseteq I\), we have that \(\text {Sol}(I^{\prime },\mathcal {M}^{\prime }_{n_{0}}) = \text {Sol}(I^{\prime },\mathcal {M})\).
Next, we use Claim D to show that \(\mathcal {M}\) is closed under unions, i.e., given \((I_{1},J_{1}) \in \mathcal {M}\) and \((I_{2},J_{2}) \in \mathcal {M}\), we must show that \((I,J) \in \mathcal {M}\) with \(I = I_{1} \cup I_{2}\) and \(J = J_{1} \cup J_{2}\). From Claim D, we know that there exists \(n_{0}\) such that \(\text {Sol}(I^{\prime },\mathcal {M}^{\prime }_{n_{0}}) = \text {Sol}(I^{\prime },\mathcal {M})\), for every \(I^{\prime } \subseteq I\). In particular, \(I_{1}, I_{2} \subseteq I\). Hence, for each i ∈{1,2}, we have \(J_{i} \in \text {Sol}(I_{i}, \mathcal {M}^{\prime }_{n_{0}})\), that is, \((I_{1}, J_{1})\in \mathcal {M}^{\prime }_{n_{0}}\) and \((I_{2}, J_{2})\in \mathcal {M}^{\prime }_{n_{0}}\). Since \(\mathcal {M}^{\prime }_{n_{0}}\) is a LAV mapping, it is closed under unions. Hence, \((I,J) \in \mathcal {M}^{\prime }_{n_{0}}\), and, since \(\text {Sol}(I,\mathcal {M}^{\prime }_{n_{0}}) = \text {Sol}(I,\mathcal {M})\), we conclude that \(J \in \text {Sol}(I,\mathcal {M})\), i.e., \((I,J)\in \mathcal {M}\).
To complete the proof of the lemma, it remains to prove Claims A-D.
Claim A.
The notions of \(u^{\prime }\)-limit and uniform limit coincide. Formally, for every sequence \((\mathcal {M}_{n})_{n\geq 1}\) of schema mappings and every schema mapping \(\mathcal {M}\), we have that \(\mathcal {M}_{n} \stackrel {u}{\longrightarrow } \mathcal {M}\) if and only if \(\mathcal {M}_{n} \stackrel {u^{\prime }}{\longrightarrow } \mathcal {M}\).
(⇒) Assume \(\mathcal {M}_{n} \stackrel {u}{\longrightarrow } \mathcal {M}\). We have to show that also \(\mathcal {M}_{n} \stackrel {u^{\prime }}{\longrightarrow } \mathcal {M}\) holds. Consider an arbitrary ℓ ≥ 1 and let r be the maximal arity of the target schema of \(\mathcal {M}\). Any conjunctive query with at most ℓ atoms can have at most \(m = \ell \cdot r\) variables. Hence, the inclusion \(\mathsf {CQ}^{\prime }_{\ell } \subseteq \mathsf {CQ}_{m}\) holds.
We are assuming \(\mathcal {M}_{n} \stackrel {u}{\longrightarrow } \mathcal {M}\). Hence, there exists \(n_{0}(m)\) such that for all \(n \geq n_{0}(m)\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}_{m}} \mathcal {M}\). That is, for all \(q \in \mathsf {CQ}_{m}\) and for all I, it holds that \(cert(q, I, \mathcal {M}_{n}) = cert(q, I, \mathcal {M})\). Since \(\mathsf {CQ}^{\prime }_{\ell } \subseteq \mathsf {CQ}_{m}\), we may conclude that for all \(q \in \mathsf {CQ}^{\prime }_{\ell }\) and for all I, it holds that \(cert(q, I, \mathcal {M}_{n}) = cert(q, I, \mathcal {M})\). Hence, \(\mathcal {M}_{n} \stackrel {u^{\prime }}{\longrightarrow } \mathcal {M}\) indeed holds.
(⇐) Assume \(\mathcal {M}_{n} \stackrel {u^{\prime }}{\longrightarrow } \mathcal {M}\). We have to show that also \(\mathcal {M}_{n} \stackrel {u}{\longrightarrow } \mathcal {M}\) holds. Consider an arbitrary m ≥ 1. As above, let r be the maximal arity of the target schema of \(\mathcal {M}\). Moreover, let p be the number of target relation symbols. Any conjunctive query with at most m variables can have at most \(\ell = p \cdot m^{r}\) atoms. Hence, the inclusion \(\mathsf {CQ}_{m} \subseteq \mathsf {CQ}^{\prime }_{\ell }\) holds.
We are assuming \(\mathcal {M}_{n} \stackrel {u^{\prime }}{\longrightarrow } \mathcal {M}\). Hence, there exists \(n_{0}(\ell )\) such that for all \(n \geq n_{0}(\ell )\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\). That is, for all \(q \in \mathsf {CQ}^{\prime }_{\ell }\) and for all I, it holds that \(cert(q, I, \mathcal {M}_{n}) = cert(q, I, \mathcal {M})\). Since \(\mathsf {CQ}_{m} \subseteq \mathsf {CQ}^{\prime }_{\ell }\), we may conclude that for all \(q \in \mathsf {CQ}_{m}\) and for all I, it holds that \(cert(q, I, \mathcal {M}_{n}) = cert(q, I, \mathcal {M})\). Hence, \(\mathcal {M}_{n} \stackrel {u}{\longrightarrow } \mathcal {M}\) indeed holds.
Claim B.
Assume that \(\mathcal {M}_{n} \stackrel {u}{\longrightarrow } \mathcal {M}\). Then, there exists a strictly increasing sequence \((n_{i})_{i \geq 1}\) of positive integers, such that for every ℓ ≥ 1 and for every \(n \geq n_{\ell }\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\).
By Claim A, we have \(\mathcal {M}_{n} \stackrel {u^{\prime }}{\longrightarrow } \mathcal {M}\). Hence, for each ℓ ≥ 1 there exists an integer \(n^{\prime }_{\ell }\) such that for all \(n \geq n^{\prime }_{\ell } \), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\). We may choose \(n_{\ell }\) as follows to ensure strict monotonicity:
$$n_{1} \text{:=}\, n^{\prime }_{1}, \qquad n_{\ell } \text{:=}\, \max (n_{\ell -1} + 1, n^{\prime }_{\ell }) \quad \text{for } \ell \geq 2.$$
Then the sequence \((n_{i})_{i \geq 1}\) is strictly increasing and for all ℓ ≥ 1 and for all \(n \geq n_{\ell }\), we have that \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\).
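The choice of the thresholds in the proof of Claim B amounts to a simple recurrence, sketched below. The function name and the representation of the thresholds \(n^{\prime }_{1}, n^{\prime }_{2}, \ldots \) as a finite list are assumptions made for illustration; the sketch only covers finitely many values of ℓ.

```python
def strictly_increasing_thresholds(raw):
    """Turn thresholds n'_1, n'_2, ... into a strictly increasing n_1, n_2, ...

    Implements n_1 := n'_1 and n_l := max(n_{l-1} + 1, n'_l) for l >= 2.
    """
    thresholds = []
    for value in raw:
        if thresholds:
            # Never fall below the original threshold, and always move
            # past the previously chosen one.
            value = max(thresholds[-1] + 1, value)
        thresholds.append(value)
    return thresholds
```

Each resulting \(n_{\ell }\) is at least \(n^{\prime }_{\ell }\), so the equivalence guaranteed from \(n^{\prime }_{\ell }\) onwards still holds from \(n_{\ell }\) onwards.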
Claim C.
Let \((n_{i})_{i \geq 1}\) be the strictly increasing sequence of positive integers according to Claim B and let \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\) be the sequence of LAV mappings constructed above. Then, for every ℓ ≥ 1, the following properties hold: (i) for every \(n \geq n_{\ell }\), we have that \(\mathcal {M}^{\prime }_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\); (ii) the conclusion of every LAV constraint in \(\mathcal {M}^{\prime }_{n_{\ell }}\) is of length at most ℓ.
Consider an arbitrary ℓ ≥ 1. By the construction of the sequence \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\), every LAV constraint in \(\mathcal {M}^{\prime }_{n_{\ell }}\) has a conclusion of length at most ℓ. Hence, property (ii) clearly holds.
To prove property (i), consider an arbitrary \(n \geq n_{\ell }\). We have to show that \(\mathcal {M}^{\prime }_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\), i.e., for arbitrary source instance I and arbitrary conjunctive query \(q \in \mathsf {CQ}^{\prime }_{\ell }\), we have to show that \(cert(q, I, \mathcal {M}^{\prime }_{n}) = cert(q, I, \mathcal {M})\). By Claim B, we have \(\mathcal {M}_{n} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\). Hence, it suffices to show that \(cert(q, I, \mathcal {M}^{\prime }_{n}) = cert(q, I, \mathcal {M}_{n})\) holds. We prove the two inclusions separately.
By the construction of \(\mathcal {M}^{\prime }_{n}\), we clearly have \(chase(I,\mathcal {M}^{\prime }_{n}) \to chase(I,\mathcal {M}_{n})\). From this, it follows immediately that \(cert(q,I,\mathcal {M}^{\prime }_{n}) \subseteq cert(q,I,\mathcal {M}_{n})\).
For the reverse inclusion, consider an arbitrary tuple \(\mathbf {a} \in cert(q, I,\mathcal {M}_{n})\). Then, there exists a homomorphism \(h_{n} \colon q \to chase(I,\mathcal {M}_{n})\) with \(h_{n}(\mathbf {z}) = \mathbf {a}\), where \(\mathbf {z}\) denotes the free variables of q. Let \(h_{n}(q) = \{A_{1}, \ldots , A_{k}\} \subseteq chase(I,\mathcal {M}_{n})\) with k ≤ ℓ. By construction, \(\mathcal {M}^{\prime }_{n}\) is obtained by restricting the conclusions of the LAV constraints \(\tau \in \mathcal {M}_{n}\) in all possible ways to at most ℓ atoms. Hence, since k ≤ ℓ, we have that also \(chase(I,\mathcal {M}^{\prime }_{n})\) contains the set \(\{A_{1}, \ldots , A_{k}\}\) of atoms (up to renaming of labeled nulls). Thus, there exists a homomorphism \(h \colon \{A_{1}, \ldots , A_{k}\} \to chase(I,\mathcal {M}^{\prime }_{n})\) and \(h(h_{n}(\cdot ))\) is a homomorphism \(q \to chase(I,\mathcal {M}^{\prime }_{n})\) with \(h(h_{n}(\mathbf {z})) = \mathbf {a}\). Therefore, \(\mathbf {a} \in cert(q, I,\mathcal {M}^{\prime }_{n})\) holds.
Before presenting the proof of Claim D, we need to bring the notion of fact block size into the picture; this notion was introduced in [7].
Fact Blocks. Let J be an instance. The Gaifman graph of facts \(G_{J}\) of J is the graph whose nodes are the facts of J and in which there is an edge between two facts if they have a null in common. The fact blocks (or f-blocks) of J are the sets of nodes of the connected components of \(G_{J}\). The block size of an undirected graph G is the size of the largest connected component of G, where the size of a component is given as the number of nodes. The fact block size (f-block size) of an instance J is the block size of the Gaifman graph of facts of J.
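The definition of f-blocks can be illustrated with a short union-find sketch that groups facts sharing a null. The encoding of facts as pairs of a relation name and an argument tuple, the predicate `is_null`, and the function names are hypothetical choices for this illustration.

```python
from collections import defaultdict

def fact_blocks(facts, is_null):
    """Group facts into f-blocks: components of the 'shares a null' graph."""
    parent = list(range(len(facts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # null -> index of the first fact containing it
    for i, (_, args) in enumerate(facts):
        for value in args:
            if is_null(value):
                if value in first_seen:
                    # Merge this fact's component with the one that
                    # already contains the same null.
                    parent[find(i)] = find(first_seen[value])
                else:
                    first_seen[value] = i

    blocks = defaultdict(list)
    for i, fact in enumerate(facts):
        blocks[find(i)].append(fact)
    return list(blocks.values())

def f_block_size(facts, is_null):
    """Number of facts in the largest f-block (0 for the empty instance)."""
    return max((len(b) for b in fact_blocks(facts, is_null)), default=0)
```

For example, in the instance {F(a, n1), F(n1, n2), F(b, n3), F(a, b)} with nulls n1, n2, n3, the first two facts share the null n1 and form one f-block, while the other two facts are singleton blocks, so the f-block size is 2. Sharing a constant, such as a or b, does not connect facts.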
Claim D.
For every source instance I, there exists an integer \(n_{0} \geq 1\) such that for every \(I^{\prime } \subseteq I\), we have that \(\text {Sol}(I^{\prime },\mathcal {M}^{\prime }_{n_{0}}) = \text {Sol}(I^{\prime },\mathcal {M})\).
Consider an arbitrary \(I^{\prime } \subseteq I\). Let J denote a universal solution for \(I^{\prime }\) w.r.t. \(\mathcal {M}\) and let \(J^{\prime } = chase(I^{\prime },\mathcal {M}^{\prime }_{n_{0}})\). We set ℓ = size(J), where size(J) denotes the number of atoms in J. Moreover, we set \(n_{0} = n_{\ell }\) from the construction of \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\). We claim that \(n_{0}\) has the desired property. The proof proceeds in three steps, namely, we will show (i) \(J \to J^{\prime }\), (ii) \(J^{\prime } \to J\), and, finally, (iii) \(\text {Sol}(I^{\prime },\mathcal {M}^{\prime }_{n_{0}}) = \text {Sol}(I^{\prime },\mathcal {M})\).
-
(i) Let \(\mathbf {u} = (u_{1}, \ldots , u_{i})\) denote the labeled nulls in J and let \(\mathbf {y} = (y_{1}, \ldots , y_{i})\) denote a vector of pairwise distinct variables. Consider the boolean conjunctive query \(\exists \mathbf {y}\, q_{J}\) whose atoms are the atoms in J where we instantiate the labeled nulls \(\mathbf {u} = (u_{1}, \ldots , u_{i})\) with \(\mathbf {y} = (y_{1}, \ldots , y_{i})\). Clearly \(q_{J} \to J\) holds and, therefore, also \(cert(\exists \mathbf {y}\, q_{J}, I^{\prime },\mathcal {M}) = \mathit {true}\).
Since \(\mathcal {M}^{\prime }_{n_{0}} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\) and \(\exists \mathbf {y}\, q_{J} \in \mathsf {CQ}^{\prime }_{\ell }\), also \(cert(\exists \mathbf {y}\, q_{J}, I^{\prime }, \mathcal {M}^{\prime }_{n_{0}}) = \mathit {true}\) holds. Hence, there exists a homomorphism \(h^{\prime } \colon q_{J} \to J^{\prime }\), which can be easily transformed into a homomorphism \(h \colon J \to J^{\prime }\) by setting \(h(u_{\alpha }) = h^{\prime }(y_{\alpha })\) for every α ∈{1,…, i}.
-
(ii) For every f-block \(F^{\prime }\) of \(J^{\prime }\), we consider the boolean conjunctive query \(\exists \mathbf {z}\, q_{F^{\prime }}\) whose atoms are the atoms in \(F^{\prime }\) and where \(\mathbf {z} = (z_{1}, \ldots , z_{i})\) instantiates the labeled nulls \(\mathbf {v} = (v_{1}, \ldots , v_{i})\) in \(F^{\prime }\) with pairwise distinct variables. Clearly, for every \(F^{\prime }\), we have \(q_{F^{\prime }} \to J^{\prime }\) and, therefore, also \(cert(\exists \mathbf {z}\, q_{F^{\prime }}, I^{\prime },\mathcal {M}^{\prime }_{n_{0}}) = \mathit {true}\).
Since all LAV constraints in \(\mathcal {M}^{\prime }_{n_{0}}\) have conclusion size bounded by ℓ, the number of atoms in any f-block of \(J^{\prime }\) is bounded by ℓ. Hence, for every \(F^{\prime }\), the corresponding conjunctive query \(q_{F^{\prime }}\) is in \(\mathsf {CQ}^{\prime }_{\ell }\). Since \(\mathcal {M}^{\prime }_{n_{0}} \equiv _{\mathsf {CQ}^{\prime }_{\ell }} \mathcal {M}\), we have that \(cert(\exists \mathbf {z}\, q_{F^{\prime }}, I^{\prime }, \mathcal {M}) = \mathit {true}\). Hence, for every f-block \(F^{\prime }\) of \(J^{\prime }\), there exists a homomorphism \(h_{F^{\prime }} \colon q_{F^{\prime }} \to J\), which can easily be transformed into a homomorphism \(g_{F^{\prime }} \colon F^{\prime } \to J\) by setting \(g_{F^{\prime }}(v_{\alpha }) = h_{F^{\prime }}(z_{\alpha })\) for every α ∈{1,…, i}. These homomorphisms from the f-blocks of \(J^{\prime }\) to J can be combined to the desired homomorphism \(h^{\prime } = \bigcup g_{F^{\prime }}\) with \(h^{\prime } \colon J^{\prime } \to J\).
-
(iii) Finally, we show that \(\text {Sol}(I^{\prime },\mathcal {M}^{\prime }_{n_{0}}) = \text {Sol}(I^{\prime },\mathcal {M})\) holds.
“ ⊆”: Let \(K \in \text {Sol}(I^{\prime },\mathcal {M}^{\prime }_{n_{0}})\). Since \(J^{\prime }\) is a universal solution for \(I^{\prime }\) w.r.t. \(\mathcal {M}^{\prime }_{n_{0}}\), there exists a homomorphism \(g^{\prime } \colon J^{\prime } \to K\). By composing \(g^{\prime }\) with the homomorphism \(h \colon J \to J^{\prime }\), we obtain a homomorphism from J to K. Since \(\mathcal {M}\) is closed under target homomorphisms, we conclude that \(K \in \text {Sol}(I^{\prime },\mathcal {M})\).
“ ⊇”: Now let \(K \in \text {Sol}(I^{\prime },\mathcal {M})\). Since J is a universal solution for \(I^{\prime }\) w.r.t. \(\mathcal {M}\), there exists a homomorphism \(g \colon J \to K\). By composing g with the homomorphism \(h^{\prime } \colon J^{\prime } \to J\), we obtain a homomorphism from \(J^{\prime }\) to K. Since the LAV mapping \(\mathcal {M}^{\prime }_{n_{0}}\) is closed under target homomorphisms, we conclude that \(K \in \text {Sol}(I^{\prime },\mathcal {M}^{\prime }_{n_{0}})\).
The proof of Lemma 3 is now complete.
□
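Steps (i) and (ii) of the preceding proof can be illustrated on a toy example. The following is a minimal sketch, not part of the original argument: it assumes that instances are finite sets of atoms of the form (relation name, tuple of terms) and that labeled nulls are encoded as strings starting with an underscore. The names `find_homomorphism`, `f_blocks`, and `apply` are hypothetical helpers introduced only for this illustration, and the homomorphism search is plain brute force.

```python
from itertools import product

def is_null(term):
    # Assumption of this sketch: labeled nulls are strings starting with "_".
    return isinstance(term, str) and term.startswith("_")

def nulls_of(atoms):
    return sorted({t for _, args in atoms for t in args if is_null(t)})

def apply(h, atoms):
    # Replace nulls according to h; constants are left unchanged.
    return {(rel, tuple(h.get(t, t) for t in args)) for rel, args in atoms}

def find_homomorphism(src, dst):
    # Brute force: try every assignment of the nulls of src to terms of dst.
    nulls = nulls_of(src)
    targets = sorted({t for _, args in dst for t in args}, key=str)
    for image in product(targets, repeat=len(nulls)):
        h = dict(zip(nulls, image))
        if apply(h, src) <= dst:
            return h
    return None

def f_blocks(atoms):
    # Group atoms into blocks that are connected through shared nulls.
    blocks = []  # list of (atoms of the block, nulls of the block)
    for atom in atoms:
        atom_nulls = {t for t in atom[1] if is_null(t)}
        merged_atoms, merged_nulls = {atom}, set(atom_nulls)
        rest = []
        for b_atoms, b_nulls in blocks:
            if b_nulls & atom_nulls:
                merged_atoms |= b_atoms
                merged_nulls |= b_nulls
            else:
                rest.append((b_atoms, b_nulls))
        blocks = rest + [(merged_atoms, merged_nulls)]
    return [b_atoms for b_atoms, _ in blocks]

# Toy instances: J plays the role of J, J_prime the role of J'.
J = {("T", ("a", "_u1")), ("T", ("_u1", "b"))}
J_prime = {("T", ("a", "_v1")), ("T", ("_v1", "b")), ("T", ("a", "_v2"))}

# Step (i): a single homomorphism h from J to J'.
h = find_homomorphism(J, J_prime)

# Step (ii): one homomorphism per f-block of J', combined by union.
# The union is well defined because distinct f-blocks share no nulls.
h_prime = {}
for block in f_blocks(J_prime):
    h_prime.update(find_homomorphism(block, J))  # not None in this example
```

In this example `J_prime` has two f-blocks, and the union of the two block-wise maps sends both nulls of `J_prime` to the single null of `J`, so the two instances are homomorphically equivalent, which is what step (iii) relies on.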
We now have all the tools needed to present the proof of Theorem 5. Before doing so and for the sake of readability, we reproduce its statement.
Let \((\mathcal {M}_{n})_{n\geq 1}\) be a premise-bounded sequence of GLAV mappings. The following statements are equivalent.
-
(1)
\((\mathcal {M}_{n})_{n\geq 1}\) has a GLAV mapping \(\mathcal {M}\) as a uniform limit.
-
(2)
\((\mathcal {M}_{n})_{n\geq 1}\) has a uniform limit that admits universal solutions.
Moreover, if \((\mathcal {M}_{n})_{n\geq 1}\) is a sequence of LAV mappings, then \((\mathcal {M}_{n})_{n\geq 1}\) has a LAV mapping as a uniform limit if and only if \((\mathcal {M}_{n})_{n\geq 1}\) has a uniform limit that admits universal solutions.
Proof 14 (Proof of Theorem 5)
The direction (1) ⇒ (2) is obvious. For the direction (2) ⇒ (1), we start with the case when \((\mathcal {M}_{n})_{n\geq 1}\) is a sequence of LAV mappings.
Assume that \(\mathcal {M}\) is a uniform limit of a sequence \((\mathcal {M}_{n})_{n\geq 1}\) of LAV mappings and that \(\mathcal {M}\) admits universal solutions. Without loss of generality, we may also assume that \(\mathcal {M}\) is closed under target homomorphisms. Indeed, if we let \(\mathcal {M}^{\prime }\) be the schema mapping obtained by closing \(\mathcal {M}\) under target homomorphisms, then \(\mathcal {M}^{\prime }\) is also a uniform limit of \((\mathcal {M}_{n})_{n\geq 1}\) and it admits universal solutions; this is so because the notion of uniform limit is based on CQ-equivalence and conjunctive queries are preserved under homomorphisms. Then the schema mapping \(\mathcal {M}\) has the following properties:
-
1.
\(\mathcal {M}\) allows for CQ-rewriting (by Lemma 2);
-
2.
\(\mathcal {M}\) admits universal solutions (by hypothesis);
-
3.
\(\mathcal {M}\) is closed under target homomorphisms (by hypothesis);
-
4.
\(\mathcal {M}\) is closed under unions (by Lemma 3).
Theorem 3.1 in [19] asserts that if a schema mapping admits universal solutions, allows for query rewriting, and is closed under both target homomorphisms and unions, then it is logically equivalent to a LAV mapping. Consequently, we have that \(\mathcal {M}\) is logically equivalent to a LAV mapping.
For the case when \((\mathcal {M}_{n})_{n\geq 1}\) is a sequence of premise-bounded GLAV mappings (but not necessarily LAV mappings), we apply yet another structural characterization of GLAV mappings from [19], namely, Theorem 3.9, which asserts that if a schema mapping allows for CQ-rewriting, admits universal solutions, is closed under target homomorphisms, and is n-modular, for some fixed n, then it is logically equivalent to a GLAV mapping.
Let k be the constant bounding the length of premises in \((\mathcal {M}_{n})_{n\geq 1}\). We proceed exactly as in the proof of Lemma 3 and construct a sequence \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\), in which the premises of the tgds are the same as in the tgds of \((\mathcal {M}_n)_{n\geq 1}\); hence each tgd in \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\) has at most k atoms in its premise. In the same way, we establish the following analog of Claim D.
Claim D (in the proof of Lemma 3) For every source instance I, there exists an integer \(n_{0} \geq 1\) such that for every \(I^{\prime } \subseteq I\), we have that \(\text {Sol}(I^{\prime },\mathcal {M}^{\prime }_{n_{0}}) = \text {Sol}(I^{\prime },\mathcal {M})\).
Now, since each tgd in every element of \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\) has at most k atoms in its premise, it follows that there is a positive integer \(N_{k}\) so that each mapping \(\mathcal {M}^{\prime }_{n}\) in \((\mathcal {M}^{\prime }_{n})_{n\geq 1}\) is \(N_{k}\)-modular. It is easy to see that \(N_{k} \leq k \cdot r\) holds, where r is the maximum relation arity in the source schema.
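The counting behind the bound of k times r can be made concrete as follows. This is only a sketch under stated assumptions: premises are lists of atoms whose arguments are all variables, source instances are sets of facts, and the helper names `triggers`, `witness`, and `domain` are hypothetical. It does not test modularity itself; it only shows that the image of a premise with at most k atoms is a subinstance with at most k facts, and hence with at most k times r domain elements.

```python
def triggers(premise, instance):
    # Enumerate assignments of premise variables under which every
    # premise atom becomes a fact of the instance.
    def extend(i, env):
        if i == len(premise):
            yield dict(env)
            return
        rel, args = premise[i]
        for fact_rel, fact_args in instance:
            if fact_rel != rel or len(fact_args) != len(args):
                continue
            new_env = dict(env)
            # setdefault binds a fresh variable or returns its earlier value.
            if all(new_env.setdefault(v, c) == c for v, c in zip(args, fact_args)):
                yield from extend(i + 1, new_env)
    yield from extend(0, {})

def witness(premise, env):
    # The subinstance consisting of the images of the premise atoms.
    return {(rel, tuple(env[v] for v in args)) for rel, args in premise}

def domain(atoms):
    return {c for _, args in atoms for c in args}

# Toy data: a premise with k = 2 atoms over a binary relation (r = 2).
premise = [("E", ("x", "y")), ("E", ("y", "z"))]
k, r = 2, 2
I = {("E", (i, i + 1)) for i in range(1, 5)}  # E(1,2), E(2,3), E(3,4), E(4,5)

witnesses = [witness(premise, env) for env in triggers(premise, I)]
bound_holds = all(len(w) <= k and len(domain(w)) <= k * r for w in witnesses)
```

Here every witness has two facts and three domain elements, which is within the bound of four. If a target instance fails to be a solution because some trigger of a tgd is violated, restricting the source instance to the corresponding witness preserves the violation.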
We now prove that \(\mathcal {M}\) is \(N_{k}\)-modular. Assume that J is not a solution for I w.r.t. \(\mathcal {M}\). Take an integer \(n_{0}\) as in Claim D and consider \(\mathcal {M}^{\prime }_{n_{0}}\). It follows that J is not a solution for I w.r.t. \(\mathcal {M}^{\prime }_{n_{0}}\). Since \(\mathcal {M}^{\prime }_{n_{0}}\) is \(N_{k}\)-modular, there is a subinstance \(I^{\prime }\) of I such that J is not a solution for \(I^{\prime }\) w.r.t. \(\mathcal {M}^{\prime }_{n_{0}}\) and \(|dom(I^{\prime })| \leq N_{k}\). Again by Claim D, we have that J is not a solution for \(I^{\prime }\) w.r.t. \(\mathcal {M}\); hence \(\mathcal {M}\) is \(N_{k}\)-modular.
Thus, \(\mathcal {M}\) has the following properties: it admits CQ-rewriting (since it is the uniform limit of GLAV mappings that admit CQ-rewriting), it admits universal solutions, is closed under target homomorphisms (if it is not, we take its closure before we begin the construction), and, as just shown, it is \(N_{k}\)-modular. Consequently, by Theorem 3.9 in [19], we have that \(\mathcal {M}\) is logically equivalent to a GLAV schema mapping, which completes the proof. □
We conclude this section with a conjecture concerning uniform limits of arbitrary sequences of GLAV mappings.
Conjecture 1
The following statements are equivalent for a sequence \((\mathcal {M}_{n})_{n\geq 1}\) of GLAV mappings.
-
1.
\((\mathcal {M}_{n})_{n\geq 1}\) has an SO tgd as a uniform limit.
-
2.
\((\mathcal {M}_{n})_{n\geq 1}\) has a uniform limit that admits universal solutions.
It is not hard to show that the preceding conjecture is implied by a conjecture in [2] to the effect that the language of plain SO-tgds (Footnote 2) can be characterized by the following three properties: allowing for CQ-rewriting, admitting universal solutions, and closure under target homomorphisms.