Limits of Schema Mappings

Schema mappings have been extensively studied in the context of data exchange and data integration, where they have turned out to be the right level of abstraction for formalizing data inter-operability tasks. Up to now and for the most part, schema mappings have been studied as static objects, in the sense that each time the focus has been on a single schema mapping of interest or, in the case of composition, on a pair of schema mappings of interest. In this paper, we adopt a dynamic viewpoint and embark on a study of sequences of schema mappings and of the limiting behavior of such sequences. To this effect, we first introduce a natural notion of distance on sets of finite target instances that expresses how “close” two sets of target instances are as regards the certain answers of conjunctive queries on these sets. Using this notion of distance, we investigate pointwise limits and uniform limits of sequences of schema mappings, as well as the companion notions of pointwise Cauchy and uniformly Cauchy sequences of schema mappings. We obtain a number of results about the limits of sequences of GAV schema mappings and the limits of sequences of LAV schema mappings that reveal striking differences between these two classes of schema mappings. We also consider the completion of the metric space of sets of target instances and obtain concrete representations of limits of sequences of schema mappings in terms of generalized schema mappings, that is, schema mappings with infinite target instances as solutions to (finite) source instances.


Introduction
Schema mappings have been extensively studied in the context of data exchange and data integration, where they have turned out to be the right level of abstraction for formalizing data inter-operability tasks (see the surveys [11,12] and the monograph [1]). Up to now and for the most part, schema mappings have been studied as static objects, in the sense that each time the focus has been on a single schema mapping or on a finite and, typically, small number of schema mappings. In the case of data exchange [6], a single schema mapping is used to specify the relationship between a source schema and a target schema. In the case of operators on schema mappings [3], such as the composition operator [8,14], a fixed number of schema mappings is used as input (e.g., two schema mappings in the case of composition) and another schema mapping is returned as output. Even the case of schema-mapping evolution [9] entails a finite (but potentially large) number of schema mappings.
In this paper, we adopt a dynamic viewpoint and embark on a systematic investigation of sequences of schema mappings and of the limiting behavior of such sequences. The original motivation came from the earlier work [2,5,7,10,14] on schema-mapping optimization and the study of various notions of equivalence between schema mappings that, intuitively, stipulate that two schema mappings cannot be distinguished using conjunctive queries (CQ-equivalence) or conjunctive queries with at most n variables (CQ_n-equivalence), for some fixed n ≥ 1. In particular, in [5] and, implicitly, in [14], it was shown that, given an SO tgd (second-order tuple-generating dependency) σ and a positive integer n, one can construct a GLAV schema mapping that is CQ_n-equivalent to σ. Informally, this means that a given SO tgd can be "approximated" by GLAV schema mappings up to any fixed level of precision, even though an SO tgd is a formula of second-order logic that may not be logically equivalent to any formula of first-order logic and, in particular, to any GLAV schema mapping. A more dynamic interpretation is that, given an SO tgd σ, one can obtain a sequence of GLAV schema mappings (M_n)_{n≥1} whose "limit" is σ.

Summary of Results
Our contributions are both conceptual and technical. At the conceptual level, we develop a framework for studying sequences of schema mappings by first introducing a natural notion of distance on the powerset P(Inst(T)) of the set Inst(T) of finite instances over a schema T. Intuitively, this notion of distance expresses how "close" two sets of finite T-instances are as regards the certain answers of conjunctive queries on these sets. The pair (P(Inst(T)), dist) is a pseudometric space, which means that the distance function dist(·, ·) is symmetric and obeys the triangle inequality, but different sets of finite target instances may have distance zero; however, two such sets have distance zero if and only if they are CQ-equivalent, i.e., every conjunctive query has the same certain answers on these two sets. Thus, we will also work with the metric space obtained by considering the CQ-equivalence classes of members of P(Inst(T)), and will use the same notation for it.
Sequences of functions from some set to a metric space occupy a central place in the study of metric spaces (see, e.g., [18]). In particular, there are natural notions of a pointwise limit and of a uniform limit of a sequence (f_n)_{n≥1} of functions from some set to a metric space; moreover, there are companion notions of a pointwise Cauchy and of a uniformly Cauchy sequence of such functions. We now describe briefly how these notions can be applied to sequences of schema mappings. In its most general formulation, a schema mapping M over a source schema S and a target schema T is a set of pairs (I, J), where I is a finite S-instance and J is a finite T-instance. It follows that a schema mapping M can also be viewed as a function f from the set Inst(S) of all finite S-instances to the powerset P(Inst(T)) of the set of all finite T-instances, where f(I) = {J : (I, J) ∈ M}. This way, a sequence (M_n)_{n≥1} of schema mappings over a source schema S and a target schema T can be viewed as a sequence of functions from Inst(S) to the (pseudo)metric space (P(Inst(T)), dist).
After the conceptual framework has been laid out, we study in depth the limiting behavior of sequences of GAV mappings and the convergence of sequences of LAV mappings. We establish a number of technical results that reveal rather dramatic and perhaps unanticipated differences between GAV schema mappings and LAV schema mappings.
For sequences of GAV mappings, we point out that every uniformly Cauchy sequence of GAV mappings is eventually constant, hence it has a GAV mapping as a uniform limit. We also show that every pointwise Cauchy sequence of GAV mappings has a pointwise limit, but it need not have a uniform limit; moreover, there are pointwise Cauchy sequences of GAV mappings such that no GAV mapping is their pointwise limit. This raises the question as to when a sequence of GAV mappings has a GAV mapping as a pointwise limit. We prove that a sequence of GAV mappings has a GAV mapping as a pointwise limit if and only if it has a pointwise limit that allows for CQ-rewriting.
For sequences of LAV mappings, we show that the notions of uniform limit and pointwise limit coincide; moreover, the same holds true for the notions of uniformly Cauchy and pointwise Cauchy sequences. However, there are uniformly Cauchy sequences of LAV mappings that have no uniform limit. We also establish that a uniformly Cauchy sequence of LAV mappings has a LAV mapping as a uniform limit if and only if it has a uniform limit that admits universal solutions. The aforementioned results lift to premise-bounded sequences of GLAV mappings, i.e., sequences of GLAV mappings for which there is a k ≥ 1 such that, for every mapping in the sequence, the left-hand side of every GLAV constraint has at most k source atoms (LAV mappings have k = 1).
In terms of techniques, we use systematically the structural characterizations of schema-mapping languages established in [19], thus creating a link with a different line of research.
The metric space (P(Inst(T)), dist) is incomplete, i.e., there are Cauchy sequences of elements of P(Inst(T)) that have no limit in P(Inst(T)). It is well known that every incomplete metric space (X, d) has a completion, which means that it can be embedded into a complete metric space (X * , d * ) so that X is a dense subset of X * . Moreover, pointwise (respectively, uniformly) Cauchy sequences of functions on X have pointwise (respectively, uniform) limits that take values in X * . The construction of X * from X involves equivalence classes of Cauchy sequences of elements of X, thus, in general, the members of X * do not have a concrete representation. In the last part of the paper, we show that the members of P(Inst(T)) * can be represented by suitably constructed infinite T-instances. As a consequence of this, the pointwise (respectively, uniform) limits of Cauchy sequences of schema mappings can be represented by generalized schema mappings, i.e., schema mappings that allow for infinite target instances as solutions to finite source instances.

Preliminaries
This section contains a minimum amount of necessary background material.

Schemas, Instances, and Conjunctive Queries
A schema R is a finite sequence R_1, ..., R_k of relation symbols, where each R_i has a fixed arity. An instance I over R, or an R-instance, is a sequence (R_1^I, ..., R_k^I), where each R_i^I is a finite relation of the same arity as R_i. We will often use R_i to denote both the relation symbol and the relation R_i^I that interprets it. The active domain adom(I) of an instance I is the set of all values occurring in the relations of I. A fact of an instance I (over R) is an expression R_i^I(t_1, ..., t_m) (or simply R_i(t_1, ..., t_m)), where R_i is a relation symbol of R and (t_1, ..., t_m) ∈ R_i^I. A conjunctive query is a first-order formula of the form ∃z θ(x, z), where θ(x, z) is a conjunction of atomic formulas R_i(v_1, ..., v_m) and each v_j is one of the variables in x and z. A boolean conjunctive query is a conjunctive query with no free variables. We write CQ for the class of all conjunctive queries over some schema. For every n ≥ 1, we let CQ_n denote the class of all conjunctive queries with at most n variables. We also let CQ_0 denote the singleton consisting of a trivially true query. If I is an instance and q is a conjunctive query, then we write q(I) for the result of evaluating q on I; in particular, for boolean conjunctive queries q we have that q(I) = true if and only if I satisfies q.
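To make the evaluation q(I) concrete, the following is a minimal sketch of brute-force conjunctive query evaluation. The representation is an assumption made only for illustration: an instance is a dictionary from relation names to sets of tuples, a query is given by its tuple of free variables and its list of atoms, and the name evaluate_cq is hypothetical. A boolean query returns {()} when it is true on the instance and the empty set when it is false.

```python
from itertools import product

def evaluate_cq(free_vars, atoms, instance):
    """Return q(I) for the conjunctive query q = ∃z θ(x, z).

    free_vars: tuple of free variables x (empty for a boolean query)
    atoms:     list of (relation_name, tuple_of_variables) forming θ
    instance:  dict mapping relation_name -> set of tuples (a finite instance)
    """
    variables = sorted({v for _, args in atoms for v in args})
    adom = sorted({val for facts in instance.values() for t in facts for val in t},
                  key=repr)
    answers = set()
    # Brute force: try every assignment of active-domain values to the variables.
    for values in product(adom, repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(tuple(a[v] for v in args) in instance.get(rel, set())
               for rel, args in atoms):
            answers.add(tuple(a[v] for v in free_vars))
    return answers

I = {"E": {(1, 2), (2, 3)}}
two_step = [("E", ("x", "z")), ("E", ("z", "y"))]   # q(x, y) = ∃z E(x, z) ∧ E(z, y)
print(evaluate_cq(("x", "y"), two_step, I))          # {(1, 3)}
print(evaluate_cq((), [("E", ("x", "x"))], I))       # set(): no self-loop in I
```

The exhaustive search is exponential in the number of variables; it is meant to mirror the definition, not to be efficient.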

Schema Mappings, Universal Solutions, Certain Answers
Motivated by the terminology in data exchange [6], we typically work with two schemas, a source schema S and a target schema T with no relation symbols in common. We refer to S-instances as source instances, and to T-instances as target instances. We assume that the values occurring in the active domains of instances come from two fixed countably infinite disjoint sets, the set Const of all constants and the set Null of (labeled) nulls. We also assume that the active domains of source instances consist entirely of constants; the active domains of target instances may contain both constants and nulls.
In its most general form, a schema mapping M between a source schema S and a target schema T is a set of pairs (I, J), where I is a source instance and J a target instance. To avoid anomalies that arise from such a relaxed notion, we will assume that a schema mapping M must also possess a mild closure property, namely, that M is closed under isomorphisms that rename nulls by other nulls. This is a natural "genericity" condition that is akin to the condition that database queries are closed under arbitrary isomorphisms. The precise definitions are as follows.
Definition 1 Let S be a source schema and T a target schema.
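• An isomorphism that renames nulls between two target instances J and J′ is a one-to-one and onto function h : adom(J) → adom(J′) such that: (i) h(c) = c, for every constant c in adom(J), and h(u) is a null, for every null u in adom(J); and (ii) for every relation symbol R of T and every tuple (a_1, ..., a_m) of values from adom(J), we have that (a_1, ..., a_m) ∈ R^J if and only if (h(a_1), ..., h(a_m)) ∈ R^{J′}. In this case, we write J′ = h(J) and say that J′ is an isomorphic copy of J via an isomorphism that renames nulls.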
• A schema mapping M between S and T is a set of pairs (I, J), where I is a source instance and J a target instance, such that the following holds: if a pair (I, J) is in M and if J′ = h(J) is an isomorphic copy of J via an isomorphism h that renames nulls, then also (I, J′) is in M.
A schema mapping is often (but not always) given as a triple M = (S, T, Σ), where Σ is a set of formulas in some logical formalism such that (I, J) ∈ M if and only if I ∪ J |= Σ. Clearly, if Σ is a set of first-order formulas or a set of second-order formulas, then M is indeed closed under isomorphisms that rename nulls.
Let M be a fixed schema mapping. In data exchange, the main problem is, given a source instance I, to find a solution for I w.r.t. M, that is, a target instance J such that (I, J) ∈ M (or determine that no solution exists). We use the notation Sol(I, M) = {J | (I, J) ∈ M} to denote the set of all solutions for I w.r.t. M. In data integration, the main problem is to compute the certain answers of queries [12]. Specifically, given a query q over the target schema and a source instance I, the certain answers of q on I w.r.t. M is the set
\[ \mathrm{cert}(q, I, M) = \bigcap \{\, q(J) : J \in \mathrm{Sol}(I, M) \,\}. \]
If q is a boolean conjunctive query, then cert(q, I, M) = true, if q(J) = true, for every solution J for I w.r.t. M; otherwise, cert(q, I, M) = false. Note also that if q is a non-boolean conjunctive query, then either cert(q, I, M) = ∅ or every tuple t ∈ cert(q, I, M) is null-free, that is, it consists entirely of constants. This is a consequence of the closure of M under isomorphisms that rename nulls. Indeed, assume that t ∈ cert(q, I, M). Let J be a solution for I w.r.t. M and let J′ = h(J) be a target instance that is an isomorphic copy of J via an isomorphism h that renames nulls from the active domain of J to nulls outside the active domain of J (such a target instance J′ and such an isomorphism h exist because J is a finite set of facts, hence its active domain is a finite set). By the closure property of M, the target instance J′ is also a solution for I w.r.t. M, hence t ∈ q(J′). It follows that t must consist of values in the intersection adom(J) ∩ adom(J′) of the active domains of J and J′, hence t must consist entirely of constants. Note that the only property of conjunctive queries used in this argument is that they are safe, that is, they return tuples from the active domain of the instance on which they are evaluated.
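The null-renaming argument above can be replayed on a small example. The sketch below is illustrative only and rests on several assumptions: the set of solutions is replaced by a finite, non-empty family of instances (the real Sol(I, M) is typically infinite), nulls are encoded as strings starting with an underscore, and the helper names evaluate_cq, rename_nulls and certain_answers are hypothetical.

```python
from itertools import product

def evaluate_cq(free_vars, atoms, instance):
    """Evaluate a conjunctive query by brute force over the active domain."""
    variables = sorted({v for _, args in atoms for v in args})
    adom = sorted({val for facts in instance.values() for t in facts for val in t},
                  key=repr)
    answers = set()
    for values in product(adom, repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(tuple(a[v] for v in args) in instance.get(rel, set())
               for rel, args in atoms):
            answers.add(tuple(a[v] for v in free_vars))
    return answers

def rename_nulls(instance, renaming):
    """Isomorphic copy of an instance under a renaming of nulls (constants fixed)."""
    return {rel: {tuple(renaming.get(v, v) for v in t) for t in facts}
            for rel, facts in instance.items()}

def certain_answers(free_vars, atoms, solutions):
    """Intersection of q(J) over a finite, non-empty family of solutions (assumption)."""
    if not solutions:
        raise ValueError("expects a non-empty finite family of solutions")
    return set.intersection(*(evaluate_cq(free_vars, atoms, J) for J in solutions))

J = {"F": {("a", "_n1"), ("_n1", "b"), ("a", "b")}}
J_copy = rename_nulls(J, {"_n1": "_n2"})     # same instance with a fresh null
q_atoms = [("F", ("x", "y"))]                # q(x, y) = F(x, y)
print(certain_answers(("x", "y"), q_atoms, [J, J_copy]))   # {('a', 'b')}
```

The tuples mentioning _n1 or _n2 occur in only one of the two copies, so only the null-free tuple survives the intersection.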
On the face of it, the definition of certain answers may entail computing an intersection of infinitely many sets. One of the main findings in [6] is that there is a notion of a "good" solution in data exchange, called universal solution, that can also be used to compute the certain answers of conjunctive queries in a much more direct way.
Let J_1 and J_2 be two target instances. A function h is a homomorphism from J_1 to J_2 if the following hold: (i) for every constant c, we have that h(c) = c; and (ii) for every relation symbol R in T and every tuple (a_1, ..., a_n) ∈ R^{J_1}, we have that (h(a_1), ..., h(a_n)) ∈ R^{J_2}. We write J_1 → J_2 to denote that there is a homomorphism from J_1 to J_2. We say that J_1 is homomorphically equivalent to J_2, written J_1 ↔ J_2, if J_1 → J_2 and J_2 → J_1.
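The definition of a homomorphism translates directly into an exhaustive search. The following sketch assumes, purely for illustration, that instances are dictionaries from relation names to sets of tuples and that nulls are strings starting with an underscore; the names find_homomorphism and hom_equivalent are hypothetical. Constants are never remapped, which enforces condition (i).

```python
from itertools import product

def is_null(value):
    # Assumed encoding: nulls are strings that start with an underscore.
    return isinstance(value, str) and value.startswith("_")

def adom(instance):
    return {v for facts in instance.values() for t in facts for v in t}

def find_homomorphism(J1, J2):
    """Return a dict mapping the nulls of J1 to values of J2, or None if J1 -/-> J2."""
    nulls = sorted(v for v in adom(J1) if is_null(v))
    targets = sorted(adom(J2), key=repr)
    for image in product(targets, repeat=len(nulls)):
        h = dict(zip(nulls, image))
        # Constants are left untouched by h.get(v, v), as condition (i) requires.
        if all(tuple(h.get(v, v) for v in t) in J2.get(rel, set())
               for rel, facts in J1.items() for t in facts):
            return h
    return None

def hom_equivalent(J1, J2):
    # Compare with None explicitly: an empty mapping {} is a valid homomorphism.
    return (find_homomorphism(J1, J2) is not None
            and find_homomorphism(J2, J1) is not None)

edge = {"E": {("_x", "_y"), ("_y", "_x")}}
square = {"E": {(f"_{i}", f"_{(i + 1) % 4}") for i in range(4)}
               | {(f"_{(i + 1) % 4}", f"_{i}") for i in range(4)}}
print(hom_equivalent(edge, square))   # True: a 4-cycle is 2-colorable
```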
Let I be a source instance. A universal solution for I w.r.t. M is a solution J such that for every solution J′ ∈ Sol(I, M), we have that J → J′. Intuitively, a universal solution for I is a "most general" solution for I. We write UnivSol(I, M) to denote the set of all universal solutions for I w.r.t. M (note that universal solutions need not always exist, so it is possible that UnivSol(I, M) = ∅). The following useful property of universal solutions was first identified in [6].

Proposition 1
Assume that M is a schema mapping, I is a source instance, and J is a universal solution for I w.r.t. M. If q is a conjunctive query, then cert (q, I, M) = q(J ) ↓ , where q(J ) ↓ is the set of all null-free tuples in q(J ).
Proof First, assume that t ∈ cert(q, I, M). Then, as discussed earlier, t must be a null-free tuple. Since J is a solution for I w.r.t. M, we have that t ∈ q(J), hence t ∈ q(J)↓. Next, assume that t is a null-free tuple in q(J). If J′ is an arbitrary solution for I w.r.t. M, then, since J is a universal solution for I w.r.t. M, there is a homomorphism h from J to J′. Since conjunctive queries are preserved under homomorphisms and h maps every constant to itself, it follows that h(t) = t ∈ q(J′). Thus, t ∈ cert(q, I, M).
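Proposition 1 reduces the computation of certain answers to a single query evaluation followed by a filter. The sketch below illustrates this on a hand-written instance that plays the role of a universal solution; the encoding of nulls as underscore-prefixed strings and the names evaluate_cq and null_free are assumptions made for the example.

```python
from itertools import product

def evaluate_cq(free_vars, atoms, instance):
    """Evaluate a conjunctive query by brute force over the active domain."""
    variables = sorted({v for _, args in atoms for v in args})
    adom = sorted({val for facts in instance.values() for t in facts for val in t},
                  key=repr)
    answers = set()
    for values in product(adom, repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(tuple(a[v] for v in args) in instance.get(rel, set())
               for rel, args in atoms):
            answers.add(tuple(a[v] for v in free_vars))
    return answers

def null_free(answers):
    """q(J)↓: keep only the tuples that consist entirely of constants."""
    return {t for t in answers
            if not any(isinstance(v, str) and v.startswith("_") for v in t)}

# Assumed universal solution for I = {E(a, b)} under E(x, y) → ∃z (F(x, z) ∧ F(z, y)).
J = {"F": {("a", "_n"), ("_n", "b")}}

first_column = [("F", ("x", "y"))]                       # q1(x) = ∃y F(x, y)
two_step = [("F", ("x", "z")), ("F", ("z", "y"))]        # q2(x, y) = ∃z F(x, z) ∧ F(z, y)
print(null_free(evaluate_cq(("x",), first_column, J)))   # {('a',)}
print(null_free(evaluate_cq(("x", "y"), two_step, J)))   # {('a', 'b')}
```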

Structural Properties of Schema Mappings
We now present a number of structural properties that a schema mapping may or may not possess. These properties were investigated in their own right in [19], where they were used to obtain characterizations of schema-mapping languages that will be of great interest to us in this paper.
Let M be a schema mapping.
• M allows for CQ-rewriting if for every target conjunctive query q, there exists a union q′ of source conjunctive queries such that cert(q, I, M) = q′(I), for every source instance I.
• M admits universal solutions if for every source instance I, there is a universal solution for I w.r.t. M.
• M is n-modular, for n ≥ 1, if whenever (I, J) ∉ M, there is a subinstance I′ ⊆ I with at most n elements in its active domain such that (I′, J) ∉ M ("small counterexample").

Schema Mapping Languages
A GLAV (global-and-local-as-view) constraint is a first-order formula of the form ∀x(ϕ(x) → ∃yψ(x, y)), where ϕ(x) is a conjunction of atoms over the source schema S, each variable in x occurs in at least one atom in ϕ(x), and ψ(x, y) is a conjunction of atoms over the target schema T with variables in x and y. We refer to ϕ(x) as the left-hand side, or premise, and ∃yψ(x, y) as the right-hand side, or conclusion of the constraint. Another name for GLAV constraints is source-to-target tuple-generating dependencies or, in short, s-t tgds.
A LAV (local-as-view) constraint is a GLAV constraint whose left-hand side is a single atom over the source, while a GAV (global-as-view) constraint is a GLAV constraint whose right-hand side contains no existential quantifiers and consists of a single atom over the target. For example, ∀x, y(E(x, y) → ∃z(F (x, z) ∧ F (z, y))) is a LAV constraint, and ∀x, y, z(E(x, z) ∧ E(z, y) → F (x, y)) is a GAV constraint.
A GLAV (global-and-local-as-view) mapping is a schema mapping M = (S, T, Σ) such that Σ is a finite set of GLAV constraints. The notions of a LAV mapping and of a GAV mapping are defined analogously.
Every GLAV mapping M admits universal solutions [6]; furthermore, given a source instance I , a canonical universal solution chase(I, M) can be produced via the oblivious chase procedure as follows: whenever the antecedent of an s-t tgd in M becomes true, fresh null values are introduced and facts involving these nulls are added to chase(I, M), so that the conclusion of the s-t tgd becomes true. Every GLAV mapping is also known to allow for CQ-rewriting and to be n-modular, for some n ≥ 1. Moreover, every LAV mapping is closed under unions, while every GAV mapping is closed under target intersections.
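The oblivious chase described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a reference implementation: a constraint is encoded as a triple (premise atoms, conclusion atoms, existential variables), fresh nulls are generated as the strings _n1, _n2, and so on, source constants are assumed not to start with an underscore, and the function name chase is hypothetical.

```python
from itertools import count, product

def chase(source, constraints):
    """Oblivious chase of a source instance with a finite set of s-t tgds.

    source:      dict relation_name -> set of tuples of constants
    constraints: list of (premise_atoms, conclusion_atoms, existential_vars),
                 where atoms are (relation_name, tuple_of_variables)
    """
    fresh = count(1)
    target = {}
    adom = sorted({v for facts in source.values() for t in facts for v in t}, key=repr)
    for premise, conclusion, exist_vars in constraints:
        prem_vars = sorted({v for _, args in premise for v in args})
        for values in product(adom, repeat=len(prem_vars)):
            a = dict(zip(prem_vars, values))
            if all(tuple(a[v] for v in args) in source.get(rel, set())
                   for rel, args in premise):
                # One batch of fresh nulls per assignment that satisfies the premise.
                a.update({z: f"_n{next(fresh)}" for z in exist_vars})
                for rel, args in conclusion:
                    target.setdefault(rel, set()).add(tuple(a[v] for v in args))
    return target

source = {"E": {("a", "b"), ("b", "c")}}
# LAV: E(x, y) → ∃z (F(x, z) ∧ F(z, y));  GAV: E(x, z) ∧ E(z, y) → F(x, y)
lav = ([("E", ("x", "y"))], [("F", ("x", "z")), ("F", ("z", "y"))], ["z"])
gav = ([("E", ("x", "z")), ("E", ("z", "y"))], [("F", ("x", "y"))], [])
print(chase(source, [gav]))   # {'F': {('a', 'c')}}
print(chase(source, [lav]))   # four F-facts using two fresh nulls
```

For the GAV constraint no nulls are created, so the result consists of constants only; for the LAV constraint each E-fact contributes its own null.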
Second-Order tgds, or SO tgds, were introduced in [8] and were shown to be exactly the constraints needed to express the composition of a finite number of GLAV mappings. Instead of giving the precise definition of an SO tgd, we illustrate this notion with an example from [8]. The formula ∃f (∀e(Emp(e) → Mgr(e, f (e))) ∧ ∀e(Emp(e) ∧ (e = f (e)) → SelfMgr(e))) expresses the property that every employee has a manager, and if an employee is the manager of himself/herself, then this employee is a self-manager. Clearly, SO tgds are existential second-order formulas with existentially quantified function symbols, which can be thought of as acting like Skolem functions. The use of these function symbols, however, is limited by the syntax of SO tgds: they can only appear in equations between terms in the antecedent of an implication or as arguments of atoms in the conclusion of an implication. As regards expressive power, SO tgds are, in general, strictly more expressive than GLAV constraints, but less expressive than arbitrary existential second-order formulas. In particular, the above formula is an SO tgd that is not logically equivalent to any (finite or infinite) set of GLAV constraints [8].
Every SO tgd allows for CQ-rewriting and admits universal solutions; however, an SO tgd may not be closed under target homomorphisms and there may not exist any n ≥ 1 such that the SO tgd is n-modular (see [8,19]).

Pseudometric Spaces and Metric Spaces
A pseudometric space is a pair (X, d), where X is a set and d is a function from X × X to the set R^+ of non-negative real numbers with the following properties, for every x, y, z in X: (i) d(x, x) = 0; (ii) d(x, y) = d(y, x) (symmetry); and (iii) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality). A metric space is a pseudometric space (X, d) such that if d(x, y) = 0, then x = y. It is easy to see that if (X, d) is a pseudometric space, then the relation R_d = {(x, y) ∈ X × X | d(x, y) = 0} is an equivalence relation on X. From this, it follows that every pseudometric space (X, d) gives rise to a metric space $(\widehat{X}, \widehat{d})$, where $\widehat{X}$ is the set of equivalence classes of elements of X modulo the equivalence relation R_d and $\widehat{d}([x], [y]) = d(x, y)$.
Let (X, d) be a pseudometric space. A sequence of elements x_1, x_2, ... of X converges to an element x of X, denoted by lim_{n→∞} x_n = x, if for every ε > 0, there is an integer n_0 such that d(x_n, x) < ε, for every n ≥ n_0. We say that x is a limit of this sequence. The limit is unique if (X, d) is a metric space. A sequence x_1, x_2, ... of elements of X is Cauchy if for every ε > 0, there is an integer n_0 such that d(x_n, x_{n′}) < ε, for all n, n′ ≥ n_0.
Using the triangle inequality, it is easy to see that if a sequence of elements in a (pseudo)metric space has a limit, then the sequence is Cauchy. The converse, however, does not hold for arbitrary (pseudo)metric spaces. A (pseudo)metric space (X, d) is complete if every Cauchy sequence of elements of X has a limit in X; otherwise, it is incomplete.
It is well known that every incomplete (pseudo)metric space (X, d) can be embedded into a complete (pseudo)metric space (X*, d*), called the completion of (X, d), in such a way that X is a dense subset of X*, i.e., every member of X* is the limit of a sequence of members of X. The members of X* are equivalence classes of Cauchy sequences of X, where two Cauchy sequences x_1, x_2, ... and y_1, y_2, ... of elements of X are equivalent if lim_{n→∞} d(x_n, y_n) = 0, while the distance function d* is defined by d*([(x_n)_{n≥1}], [(y_n)_{n≥1}]) = lim_{n→∞} d(x_n, y_n). The proof of correctness of this construction can be found in [18] or any other book on metric spaces.
As a concrete example, the metric space of the real numbers is the completion of the metric space of the rational numbers (both with the standard distance).

Metric Space of Target Instances
To study the limits of sequences of schema mappings, we first introduce a pseudometric space of sets of target instances. By considering schema mappings as functions that map each source instance to the set of its solutions, we can view sequences of schema mappings as sequences of functions. The (pointwise or uniform) limit of a sequence of schema mappings is then simply defined in the standard way as the limit of a sequence of functions taking values in a pseudometric space. Moreover, by passing to the associated metric space of equivalence classes of sets of target instances, we ensure the uniqueness of the limit.
If T is a schema, we write Inst(T) for the set of all finite instances of T. We also write P(Inst(T)) for the power set of Inst(T). The notion of distance on P(Inst(T)) that we are about to introduce is heavily based on the notion of the certain answers to conjunctive queries and on the idea that two members 𝒥 and 𝒥′ of P(Inst(T)) are "close" to each other if only "big" conjunctive queries can yield different certain answers on 𝒥 and 𝒥′.
Definition 2 Let T be a schema.
• Let q be a query over T and let 𝒥 be a member of P(Inst(T)). The certain answers of q over 𝒥 are defined as
\[ \mathrm{cert}(q, \mathcal{J}) = \bigcap \{\, q(J) : J \in \mathcal{J} \,\}. \]
• We say that two sets of instances 𝒥 and 𝒥′ in P(Inst(T)) are CQ-equivalent, denoted 𝒥 ≡_CQ 𝒥′, if cert(q, 𝒥) = cert(q, 𝒥′) holds for all conjunctive queries q.
• We say that 𝒥 and 𝒥′ are CQ_n-equivalent, denoted 𝒥 ≡_{CQ_n} 𝒥′, if it holds that cert(q, 𝒥) = cert(q, 𝒥′) for all conjunctive queries q with at most n variables (i.e., for all q in CQ_n).

Definition 3
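Let 𝒥 and 𝒥′ be two sets of instances in P(Inst(T)). The similarity sim(𝒥, 𝒥′) and the distance dist(𝒥, 𝒥′) between 𝒥 and 𝒥′ are defined as follows:
\[ \mathrm{sim}(\mathcal{J}, \mathcal{J}') = \sup \{\, n \ge 0 : \mathcal{J} \equiv_{CQ_n} \mathcal{J}' \,\}, \qquad \mathrm{dist}(\mathcal{J}, \mathcal{J}') = 2^{-\mathrm{sim}(\mathcal{J}, \mathcal{J}')}, \]
where $2^{-\infty}$ is taken to be 0. It is easy to verify that the pair (P(Inst(T)), dist) is a pseudometric space; in fact, dist is an ultrametric distance function, that is,
\[ \mathrm{dist}(\mathcal{J}, \mathcal{J}'') \le \max \{ \mathrm{dist}(\mathcal{J}, \mathcal{J}'), \mathrm{dist}(\mathcal{J}', \mathcal{J}'') \} \]
holds for all 𝒥, 𝒥′, 𝒥″ in P(Inst(T)). Moreover, dist(𝒥, 𝒥′) = 0 if and only if 𝒥 and 𝒥′ are CQ-equivalent.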

Definition 4
Let T be a schema. If J is a T-instance, then we write v(J) to denote the member of P(Inst(T)) consisting of all isomorphic copies of J via isomorphisms that rename nulls. In other words, v(J) consists of all T-instances J′ such that J′ is isomorphic to J via an isomorphism h that maps each constant to itself and maps each null to a (possibly different) null.
The next lemma will be used repeatedly in the sequel.

Lemma 1 Let T be a schema.
1. If J is a T-instance whose active domain consists entirely of nulls and q is a non-boolean conjunctive query, then cert(q, v(J)) = ∅.
2. If J is a T-instance whose active domain consists entirely of nulls and q is a boolean conjunctive query, then cert(q, v(J)) = q(J).
3. If J and J′ are T-instances whose active domains consist entirely of nulls, then, for every k ≥ 1, the following statements are equivalent: (a) v(J) ≡_{CQ_k} v(J′); (b) J and J′ satisfy the same boolean conjunctive queries in CQ_k.
Proof For the first two parts of the lemma, let J be a T-instance whose active domain consists entirely of nulls. For every non-boolean conjunctive query q, we have that cert(q, v(J)) = ∅, because v(J) contains instances with disjoint active domains. For every boolean query q, we have cert(q, v(J)) = q(J) for the following reason: first, J is a member of v(J), so if cert(q, v(J)) = true, then q(J) = true as well; second, since every member of v(J) is isomorphic to J and since boolean conjunctive queries are preserved under isomorphisms, we have that if q(J) = true, then cert(q, v(J)) = true.
For the third part of the lemma, let J and J′ be T-instances whose active domains consist entirely of nulls and let k be a positive integer. If v(J) ≡_{CQ_k} v(J′), then J and J′ must satisfy the same boolean conjunctive queries in CQ_k because J ∈ v(J) and J′ ∈ v(J′). For the converse, assume that J and J′ satisfy the same boolean conjunctive queries in CQ_k. We have to show that cert(q, v(J)) = cert(q, v(J′)), for every conjunctive query q in CQ_k. If q is a non-boolean conjunctive query in CQ_k, then, by the first part of the lemma, we have that cert(q, v(J)) = ∅ = cert(q, v(J′)). If q is a boolean query in CQ_k, then, by the second part of the lemma and the hypothesis about J and J′, we have that cert(q, v(J)) = q(J) = q(J′) = cert(q, v(J′)).
The preceding lemma will be used in the next example, which presents a sequence from P(Inst(T)) that has a limit in P(Inst(T)).
Example 1 Let T be a schema consisting of a single binary relation E and let C_m be the undirected cycle of length m, m ≥ 1, where the vertices of the cycle are pairwise distinct labeled nulls. Consider the sequence (v(C_{2n+1}))_{n≥1} arising from the cycles of odd length. Then, for every m ≥ 1, we have that lim_{n→∞} v(C_{2n+1}) = v(C_{2m}). We first show that v(C_{2m}) ≡_CQ v(C_2), for every m ≥ 1. By Lemma 1, it suffices to show that C_{2m} and C_2 satisfy the same boolean conjunctive queries. This is true because C_{2m} and C_2 are homomorphically equivalent (and boolean conjunctive queries are preserved under homomorphisms). Indeed, there is a homomorphism from C_2 to C_{2m} because C_2 is a subgraph of C_{2m}, and there is a homomorphism from C_{2m} to C_2 because C_{2m} is 2-colorable.
We will show that lim_{n→∞} v(C_{2n+1}) = v(C_2) by showing that for every k, there exists n_0 such that for all n ≥ n_0, we have that v(C_{2n+1}) ≡_{CQ_k} v(C_2). For this, we take n_0 = k and show that if n ≥ k, then v(C_{2n+1}) ≡_{CQ_k} v(C_2). By the third part of Lemma 1, it suffices to show that if q is a boolean conjunctive query in CQ_k, then q(C_{2n+1}) = q(C_2). Since C_2 is a subgraph of C_{2n+1}, we have that if q(C_2) = true, then also q(C_{2n+1}) = true. Assume that q(C_{2n+1}) = true. Since q ∈ CQ_k, there is a subgraph H of C_{2n+1} with at most k distinct nodes such that q(H) = true. Since 2n + 1 > n ≥ k, we have that H is a proper subgraph of C_{2n+1}. Consequently, H is 2-colorable and so there is a homomorphism from H to C_2, which, in turn, implies that q(C_2) = true.
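The argument of Example 1 can be checked mechanically for small parameters. The sketch below relies on the third part of Lemma 1 and on the following observation, which is used in the proof above: two graphs whose nodes are nulls satisfy the same boolean conjunctive queries with at most k variables if and only if every subgraph on at most k nodes of either graph maps homomorphically into the other. Graphs are encoded as sets of directed edge pairs over integer nodes (standing in for nulls), and all function names are hypothetical.

```python
from itertools import combinations, product

def cycle(m):
    """Undirected cycle C_m as a symmetric edge relation on nodes 0..m-1."""
    return ({(i, (i + 1) % m) for i in range(m)}
            | {((i + 1) % m, i) for i in range(m)})

def nodes(edges):
    return sorted({v for e in edges for v in e})

def has_hom(H, G):
    """Brute-force test for a homomorphism from graph H to graph G."""
    hn, gn = nodes(H), nodes(G)
    for image in product(gn, repeat=len(hn)):
        h = dict(zip(hn, image))
        if all((h[a], h[b]) in G for a, b in H):
            return True
    return False

def small_subgraphs(G, k):
    """Induced subgraphs of G on at most k nodes."""
    for size in range(1, k + 1):
        for subset in combinations(nodes(G), size):
            s = set(subset)
            yield {(a, b) for a, b in G if a in s and b in s}

def cq_k_equivalent(G1, G2, k):
    """Same boolean conjunctive queries with at most k variables (nodes are nulls)."""
    return (all(has_hom(H, G2) for H in small_subgraphs(G1, k))
            and all(has_hom(H, G1) for H in small_subgraphs(G2, k)))

print(cq_k_equivalent(cycle(7), cycle(2), 3))   # True:  n = 3 >= k = 3
print(cq_k_equivalent(cycle(3), cycle(2), 3))   # False: the triangle is not 2-colorable
```

It suffices to consider induced subgraphs, since any subgraph on the same nodes has fewer edges and therefore maps wherever the induced one does.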
In contrast to what we have just seen, there are Cauchy sequences of elements of P(Inst(T)) that have no limit in P(Inst(T)). Thus, the pseudo-metric space (P(Inst(T)), dist) is incomplete.

Proposition 2
Let T be a schema consisting of a single binary relation E and let K_n be the clique of size n, for n ≥ 1, where the vertices are pairwise distinct labeled nulls. The sequence (v(K_n))_{n≥1} is Cauchy, but has no limit in P(Inst(T)).
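Proof The sequence (v(K_n))_{n≥1} is Cauchy, because if m ≥ n, then v(K_m) and v(K_n) are CQ_n-equivalent. To show this, by the third part of Lemma 1, it suffices to show that if m ≥ n, then K_m and K_n satisfy the same boolean conjunctive queries in CQ_n. Let q be a boolean conjunctive query in CQ_n. If q(K_n) = true, then q(K_m) = true, since K_n is a subgraph of K_m. Conversely, if q(K_m) = true, then, since q has at most n variables, there is a subgraph H of K_m with at most n nodes such that q(H) = true; as H has at most n nodes and no self-loops, there is a homomorphism from H to K_n, hence q(K_n) = true.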
It remains to show that the sequence (v(K_n))_{n≥1} has no limit in P(Inst(T)). Assume to the contrary that there does exist a set 𝒥 of finite instances over T such that lim_{n→∞} v(K_n) = 𝒥. We distinguish three cases.
First, if 𝒥 = ∅, then cert(q, 𝒥) = true, for every conjunctive query q. In particular, this holds for the query q = ∃x E(x, x), which asserts the existence of a self-loop. In contrast, for this conjunctive query, we have that cert(q, v(K_n)) = false, for every n ≥ 1, since K_n ∈ v(K_n) and none of the graphs K_n, n ≥ 1, contains a self-loop.
Second, if 𝒥 ≠ ∅ and if every member J of 𝒥 contains a self-loop, then we again consider the query q = ∃x E(x, x). We thus have cert(q, 𝒥) = true, whereas cert(q, v(K_n)) = false, for every n ≥ 1.
It remains to consider the case that 𝒥 ≠ ∅ and at least one member J ∈ 𝒥 does not contain a self-loop. Let m be the biggest integer such that J contains a clique of size m. We define the query q as
\[ q = \exists x_1 \cdots \exists x_{m+1} \bigwedge_{1 \le i \ne j \le m+1} E(x_i, x_j). \]
For graphs without self-loops, q asserts the existence of a clique of size m + 1. We now have that q evaluates to false over J. Hence, cert(q, 𝒥) = false holds, while cert(q, v(K_n)) = true, for every n ≥ m + 1.
Since (v(K_n))_{n≥1} is a Cauchy sequence, it has a limit in the completion of (P(Inst(T)), dist). As we will see in Section 6, a concrete representation of this limit is the set consisting of all disjoint unions of cliques of all finite sizes in which every node is a null.
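The Cauchy property of the clique sequence can also be observed numerically. The following sketch computes the similarity of two cliques up to a cutoff, using the third part of Lemma 1 to reduce CQ_k-equivalence to boolean queries and testing the latter via homomorphisms of subgraphs on at most k nodes. The cutoff parameter cap, the encoding of graphs as sets of edge pairs over integer nodes, and all function names are assumptions introduced for illustration.

```python
from itertools import combinations, product

def clique(n):
    """K_n as a symmetric, loop-free edge relation on nodes 0..n-1."""
    return {(i, j) for i in range(n) for j in range(n) if i != j}

def nodes(edges):
    return sorted({v for e in edges for v in e})

def has_hom(H, G):
    """Brute-force test for a homomorphism from graph H to graph G."""
    hn, gn = nodes(H), nodes(G)
    return any(all((h[a], h[b]) in G for a, b in H)
               for h in (dict(zip(hn, img)) for img in product(gn, repeat=len(hn))))

def small_subgraphs(G, k):
    """Induced subgraphs of G on at most k nodes."""
    for size in range(1, k + 1):
        for subset in combinations(nodes(G), size):
            s = set(subset)
            yield {(a, b) for a, b in G if a in s and b in s}

def cq_k_equivalent(G1, G2, k):
    return (all(has_hom(H, G2) for H in small_subgraphs(G1, k))
            and all(has_hom(H, G1) for H in small_subgraphs(G2, k)))

def sim_up_to(G1, G2, cap):
    """Largest k <= cap such that G1 and G2 are CQ_k-equivalent (k = 0 always holds)."""
    k = 0
    while k < cap and cq_k_equivalent(G1, G2, k + 1):
        k += 1
    return k

for n in range(1, 5):
    print(n, sim_up_to(clique(n), clique(n + 2), n + 2))   # similarity equals n
```

In these runs the similarity between K_n and a larger clique is n, so any distance that decreases with the similarity shrinks along the sequence, in line with the first part of the proof.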
The following definitions are perfectly meaningful for every pseudometric space (X, d) and for every sequence of functions taking values in X. For concreteness, we give the definitions for sequences of functions taking values in P(Inst(T)).

Definition 5
Let A be a set, let (f_n)_{n≥1} be a sequence of functions from A to P(Inst(T)), and let f be a function from A to P(Inst(T)).
• We say that (f_n)_{n≥1} converges pointwise to f, denoted as p lim_{n→∞} f_n = f, if for every element x ∈ A, we have lim_{n→∞} f_n(x) = f(x).
• We say that (f_n)_{n≥1} converges uniformly to f, denoted as u lim_{n→∞} f_n = f, if for every ε > 0, there exists an integer n_0 ≥ 1 such that for every integer n ≥ n_0 and for every element x ∈ A, we have dist(f_n(x), f(x)) < ε.
• We say that (f_n)_{n≥1} is pointwise Cauchy, if for every element x ∈ A, the sequence (f_n(x))_{n≥1} is Cauchy.
• We say that (f_n)_{n≥1} is uniformly Cauchy, if for every ε > 0, there exists an integer n_0 ≥ 1 such that for all integers n, n′ ≥ n_0 and for every element x ∈ A, we have dist(f_n(x), f_{n′}(x)) < ε.
Clearly, if (f_n)_{n≥1} converges pointwise (resp., uniformly), then (f_n)_{n≥1} is pointwise (resp., uniformly) Cauchy. The converse is not in general true for arbitrary (pseudo)metric spaces; in particular, it is not true for the pseudometric space (P(Inst(T)), dist), as we shall see later on.
We now bring schema mappings into the picture. Every schema mapping M over a source schema S and a target schema T can be identified with a function f : Inst(S) −→ P(Inst(T)), where f (I ) = Sol(I, M) (recall that Sol(I, M) is the set of all solutions of I w.r.t. M, i.e., the set of all finite T instances J such that (I, J ) ∈ M). Thus, a sequence (M n ) n≥1 of schema mappings over a source schema S and target schema T can be viewed as a sequence of functions from Inst(S) to P(Inst(T)). Therefore, we can talk about a sequence of schema mappings being pointwise Cauchy and uniformly Cauchy if the sequence of the associated functions has these properties. Similarly, we say that a sequence of schema mappings has a pointwise limit (resp., a uniform limit) if the sequence of the associated functions converges pointwise (resp., converges uniformly) to a schema mapping.
The preceding notion of convergence of a sequence of schema mappings allows us to draw a connection to earlier work on schema mapping optimization [5,7]. Here, we are considering CQ-equivalence and CQ n -equivalence of sets of instances. In previous works, these notions of equivalence have been mainly applied to schema mappings (see, e.g., [5,7,14]). Specifically, two schema mappings M, M′ are CQ-equivalent (resp., CQ n -equivalent) if for every target conjunctive query q (resp., every target conjunctive query q in CQ n ) and every source instance I , we have that cert (q, I, M) = cert (q, I, M′ ). In this case, we write M ≡ CQ M′ (resp., M ≡ CQ n M′ ). The notion of CQ n -equivalence has been studied in the context of schema mapping optimization [5,7]. Below we discuss its relationship to the convergence of schema mappings.
Proposition 3 Let (M n ) n≥1 be a sequence of schema mappings and let M be a schema mapping. Then u lim n→∞ M n = M if and only if for every integer k ≥ 1, there is an integer n 0 ≥ 1 such that M n ≡ CQ k M, for every integer n ≥ n 0 .
Proof The result follows by unfolding and comparing the definitions. Specifically, u lim n→∞ M n = M means that for every ε > 0, there is an integer n 0 ≥ 1 such that for every integer n ≥ n 0 and for every source instance I we have that dist (Sol(I, M n ), Sol(I, M)) < ε. In turn, this means that for every integer k ≥ 1, there is an integer n 0 ≥ 1 such that for every integer n ≥ n 0 and for every source instance I we have that Sol(I, M n ) ≡ CQ k Sol(I, M). Thus, for every integer k ≥ 1, there is an integer n 0 ≥ 1 such that for every integer n ≥ n 0 , we have that M n ≡ CQ k M.
Intuitively, the preceding proposition states that it takes bigger and bigger conjunctive queries to distinguish the members of a sequence (M n ) n≥1 from its uniform limit.
Although never explicitly introduced, the notion of uniform convergence was implicit in [5], where it was shown that for every SO tgd σ and for every n ≥ 1, there is a GLAV mapping M n such that σ ≡ CQ n M n . From this, it is easy to see that u lim n→∞ M n = σ . Thus, we have the following result.
Theorem 1 (implicit in [5]) Every SO tgd is a uniform limit of a sequence of GLAV mappings.
There are SO tgds that are not CQ-equivalent to any GLAV mapping. Indeed, from Example 4.6 and Theorem 4.10 in [7], it follows that the SO-tgd is not CQ-equivalent to any GLAV mapping. Thus, the point of Theorem 1 is that SO tgds can be "approximated" up to any level of CQ k -equivalence by GLAV mappings, which are both syntactically simpler and generally more well-behaved.
As stated earlier, (P(Inst(T)), dist) is a pseudometric space since it cannot distinguish CQ-equivalent sets of instances. Consequently, the limit of a sequence of sets of instances and the (uniform or pointwise) limit of a sequence of mappings need not be unique. However, the limit is unique up to CQ-equivalence and, as described in Section 2, there is an associated metric space ( P(Inst(T)), dist) obtained by considering the equivalence classes of P(Inst(T)) modulo the equivalence relation ≡ CQ .
In subsequent sections, we will work with the metric space ( P(Inst(T)), dist). Moreover, we will be interested in schema mappings modulo CQ-equivalence, which means that from now on we will view schema mappings as functions from source instances to equivalence classes of sets of target instances modulo CQ-equivalence. However, for notational simplicity, we will work each time with representatives of the equivalence classes. By a slight abuse of notation, we will write (P(Inst(T)), dist), instead of ( P(Inst(T)), dist). Likewise, we will not explicitly distinguish between a schema mapping M and the equivalence class of the schema mappings that are CQ-equivalent to M.

Limits of Sequences of GAV Mappings
Our goal in this section is to analyze sequences of GAV mappings. To this effect, we first investigate the existence of limits of such sequences and then examine the definability of limits. As discussed in Section 3, if a sequence (M n ) n≥1 of schema mappings has a pointwise (resp., uniform) limit, then the sequence is pointwise (resp., uniformly) Cauchy. The next result asserts that the converse holds for sequences of GAV mappings.

Theorem 2 Let (M n ) n≥1 be a sequence of GAV mappings.
• If (M n ) n≥1 is pointwise Cauchy, then it has a pointwise limit.
• If (M n ) n≥1 is uniformly Cauchy, then it is eventually constant and thus has a GAV schema mapping as a uniform limit.
Proof We consider GAV mappings over a source schema S and a target schema T.
Let r denote the maximum arity of the relation symbols in T. For showing the first claim, assume that (M n ) n≥1 is a pointwise Cauchy sequence of schema mappings and let I be a source instance. For each n ≥ 1, consider the universal solution chase(I, M n ) for I w.r.t. M n obtained by using the oblivious chase procedure. Since each M n is a GAV schema mapping, we have that chase(I, M n ) contains constants from the active domain of I and no nulls. We claim that there exists some n 0 such that for all n ≥ n 0 , we have that chase(I, M n ) = chase(I, M n 0 ). In other words, we claim that the sequence (chase(I, M n )) n≥1 is eventually constant (does not oscillate). Since every instance in the sequence (chase(I, M n )) n≥1 has no nulls, it can be identified by evaluating on that instance the atomic queries R(x 1 , . . . , x k ), where R ranges over the relation symbols of T and k (with k ≤ r) denotes the arity of R. The assumption that the sequence (M n ) n≥1 is pointwise Cauchy implies that there exists a positive integer n 0 (that depends on I and r) such that for every integer n ≥ n 0 and every conjunctive query q ∈ CQ r , we have that cert (q, I, M n ) = cert (q, I, M n 0 ). Since the instances chase(I, M n ) contain no nulls, this implies that q(chase(I, M n )) = q(chase(I, M n 0 )) for every such query q, in particular for the atomic queries above; consequently, for every n ≥ n 0 , we have that chase(I, M n ) = chase(I, M n 0 ).
We have just shown that if (M n ) n≥1 is a pointwise Cauchy sequence of GAV mappings, then for every I , there exists a positive integer m I such that chase(I, M m I ) = chase(I, M n ), for all n ≥ m I . It follows that the schema mapping M = {(I, chase(I, M m I )) | I is a source instance} is a pointwise limit of the sequence (M n ) n≥1 . Note that M is indeed a schema mapping because chase(I, M m I ) contains no nulls.
For showing the second claim, assume that (M n ) n≥1 is a uniformly Cauchy sequence of GAV mappings. We claim that (M n ) n≥1 is eventually constant, i.e., there is some n 0 such that for all n ≥ n 0 , M n ≡ CQ M n 0 holds. For this, we repeat the previous argument, but also note that, since the sequence (M n ) n≥1 is uniformly Cauchy, there exists a positive integer n 0 that depends only on r such that for every source instance I , for every integer n ≥ n 0 and every conjunctive query q ∈ CQ r , we have that cert (q, I, M n ) = cert (q, I, M n 0 ). This implies that for every source instance I and every n ≥ n 0 , we have that q(chase(I, M n )) = q(chase(I, M n 0 )); consequently, for every source instance I and every n ≥ n 0 , we have that chase(I, M n ) = chase(I, M n 0 ). Since n 0 does not depend on I , it follows that M n ≡ CQ M n 0 , for every n ≥ n 0 .
Next, we point out that, for sequences of GAV mappings, the notions of pointwise convergence and uniform convergence are genuinely different.

Proposition 4
There exists a sequence of GAV mappings that has a GAV mapping as a pointwise limit, but has no uniform limit.
Proof For every n ≥ 1, let q n be the conjunction ⋀ 1≤i<j≤n (E(x i , x j ) ∧ E(x j , x i )) over the variables x 1 , . . . , x n . Intuitively, if E is interpreted as edge relation, then q n yields a non-empty answer over any graph that contains a self-loop or a clique of size n. Let S be a source schema consisting of a binary relation symbol E and a unary relation symbol P , let T be a target schema consisting of a unary relation symbol P′ . Let (M n ) n≥1 be the sequence of GAV mappings, where M n is specified by the constraint ∀x∀x 1 , . . . , x n+1 (P (x) ∧ q n+1 → P′ (x)). Intuitively, M n is a "copy" schema mapping, but the copying action is triggered only if the source instance contains a self-loop or a clique of size n + 1. We will show that the GAV schema mapping M = {∀x∀y(P (x) ∧ E(y, y) → P′ (x))} is a pointwise limit of (M n ) n≥1 , but that this pointwise limit is not a uniform limit of (M n ) n≥1 and thus no uniform limit of (M n ) n≥1 exists.
We first show that the GAV mapping M is a pointwise limit of (M n ) n≥1 . Given a source instance I , we consider two cases. If I contains a self-loop, then both M and every M n copy all P -facts of I , hence chase(I, M n ) = chase(I, M), for every n ≥ 1. If I contains no self-loop, then chase(I, M) = ∅; moreover, q n+1 can be satisfied in I only by n + 1 pairwise distinct elements, hence chase(I, M n ) = ∅ = chase(I, M), for every n ≥ |adom(I )|.
Next, we show that (M n ) n≥1 has no uniform limit. Towards a contradiction, suppose that such a uniform limit exists. Every uniform limit is also a pointwise limit; moreover, pointwise and uniform limits are unique up to CQ-equivalence. Hence, since the schema mapping M defined above is a pointwise limit of (M n ) n≥1 , it follows that M is also a uniform limit of (M n ) n≥1 . Let m = 1. Then there exists an n 0 such that for all n ≥ n 0 we have that M n ≡ CQ 1 M. Take n = n 0 . Let I be the source instance K n+1 ∪ {P (c)} and let q be the target conjunctive query ∃x P′ (x). We now claim that cert (q, I, M n ) ≠ cert (q, I, M), which contradicts the previously derived fact that M n ≡ CQ 1 M. Indeed, since I contains a clique of size n + 1, we have that cert (q, I, M n ) = true; on the other hand, since I contains no self-loop, we have that cert (q, I, M) = false.
Proposition 4 and Theorem 2 imply that the sequence of GAV mappings in the proof of Proposition 4 is an example of a pointwise Cauchy sequence that is not uniformly Cauchy. Theorem 2 also implies that if a sequence of GAV mappings has a uniform limit, then it must have a GAV mapping as such a limit. In turn, this gives rise to the following natural question concerning the definability of pointwise limits: if a sequence of GAV mappings has a pointwise limit, does it have a GAV mapping as such a limit? We answer this question in the negative by showing that even the much richer language of SO tgds cannot express pointwise limits of sequences of GAV mappings.
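The behavior of the sequence in the proof of Proposition 4 can be checked on small instances. The following is a minimal sketch, not part of the original development: it assumes that graphs are represented as sets of ordered pairs, and the helper names `has_hom`, `clique_atoms`, `chase_Mn` and `chase_M` are hypothetical. It evaluates the clique condition by brute-force homomorphism search, so it is only meant for tiny inputs.

```python
from itertools import product

def has_hom(query_edges, graph_edges):
    """Brute force: is there a homomorphism from the query atoms into the graph?"""
    variables = sorted({v for edge in query_edges for v in edge})
    nodes = sorted({u for edge in graph_edges for u in edge})
    for image in product(nodes, repeat=len(variables)):
        h = dict(zip(variables, image))
        if all((h[a], h[b]) in graph_edges for a, b in query_edges):
            return True
    return False

def clique_atoms(n):
    """Pairs (i, j) with i != j: the atoms of the clique condition on n
    variables, and at the same time the edge set of the clique K_n."""
    return {(i, j) for i in range(n) for j in range(n) if i != j}

def chase_Mn(n, E, P):
    """Facts copied by M_n: all of P if the clique condition on n + 1
    variables holds in E (self-loop or (n+1)-clique), nothing otherwise."""
    return set(P) if has_hom(clique_atoms(n + 1), E) else set()

def chase_M(E, P):
    """Facts copied by the candidate limit M: all of P if E has a self-loop."""
    return set(P) if any(a == b for a, b in E) else set()

K4, P = clique_atoms(4), {"c"}
# For the fixed instance K_4 with P(c), the sequence eventually agrees with M.
print([chase_Mn(n, K4, P) == chase_M(K4, P) for n in range(1, 6)])
# For every n, the instance K_{n+1} with P(c) still separates M_n from M.
print([chase_Mn(n, clique_atoms(n + 1), P) != chase_M(clique_atoms(n + 1), P)
       for n in range(1, 5)])
```

For each fixed instance the disagreement disappears once n reaches the size of the active domain, whereas no single index works for all instances at once, which is the gap between pointwise and uniform convergence.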

Proposition 5
There is a pointwise Cauchy sequence of GAV schema mappings such that no SO tgd is a pointwise limit of that sequence.
Proof Consider a source schema S consisting of a binary relation symbol E, and a target schema T consisting of a binary relation symbol F . For every n ≥ 1, let P n (x, y) be the conjunctive query expressing the property "there is an E-path of length n from x to y", and let M n be the GAV mapping specified by the set {∀x, y(P i (x, y) → F (x, y)) | 1 ≤ i ≤ n}. Consider the schema mapping M = {(I, T C(I )) | I is a source instance}, where T C(I ) is the target instance in which F is interpreted as the transitive closure of E in I . It is easy to see that M is a pointwise limit of the sequence (M n ) n≥1 ; the reason for this is that, for every source instance I and for every n ≥ |adom(I )|^2 , we have that chase(I, M n ) = T C(I ). However, M is not CQ-equivalent to any schema mapping M′ that allows for CQ-rewriting: if it were, then there would exist a union q′ of conjunctive queries over the source such that, for every source instance I , the answer q′ (I ) consists of exactly the pairs in the transitive closure of E in I . Consequently, the transitive closure of I would be first-order definable over the source, which is not the case. Since every SO tgd allows for CQ-rewriting, no SO tgd is a pointwise limit of the sequence (M n ) n≥1 .
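To illustrate why the chase stabilizes on each fixed instance in the proof of Proposition 5, the following sketch computes the F-facts produced by M n , namely all pairs joined by an E-path of length between 1 and n. It is an illustration under the assumption that a source instance is given as a set of E-pairs; the function name `chase_paths` is hypothetical.

```python
def chase_paths(E, n):
    """F-facts produced by M_n: pairs (x, y) joined by an E-path of length i, 1 <= i <= n."""
    facts = set()
    frontier = set(E)  # pairs joined by a path of length exactly 1
    for _ in range(n):
        facts |= frontier
        # extend every path of the current length by one more E-edge
        frontier = {(x, z) for (x, y) in frontier for (y2, z) in E if y == y2}
    return facts

short_path = {(1, 2), (2, 3), (3, 4)}
# The number of F-facts grows with n and then stays constant.
print([len(chase_paths(short_path, n)) for n in range(1, 6)])

long_path = {(i, i + 1) for i in range(10)}
# A longer source path needs a larger n before the result stabilizes.
print(chase_paths(long_path, 3) == chase_paths(long_path, 10))
```

On every fixed instance the output stops changing once n is large enough, but the required n grows with the instance, in line with the fact that the limit is only a pointwise one.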
We have just seen that there are sequences of GAV mappings that have a pointwise limit, but no such limit is definable by a GAV mapping. This raises the question of finding necessary and sufficient conditions guaranteeing that a sequence of GAV mappings has a GAV mapping as a pointwise limit. The next result provides an answer to this question.

Theorem 3 A pointwise Cauchy sequence (M n ) n≥1 of GAV mappings has a GAV mapping as a pointwise limit if and only if it has a pointwise limit that allows for CQ-rewriting.

It is clear that M is also a pointwise limit of (M n ) n≥1 . The result we seek is an immediate consequence of the fact that the following four statements are equivalent:
(a) (M n ) n≥1 has a GAV mapping as a pointwise limit.
(b) (M n ) n≥1 has a pointwise limit that allows for CQ-rewriting.
(c) M allows for CQ-rewriting.
(d) M is logically equivalent to a GAV mapping.
We now show that these four conditions are equivalent.
Thus, M allows for CQ-rewriting, admits universal solutions, and is closed under both target homomorphisms and target intersections. Theorem 3.2 in [19] asserts that a schema mapping is logically equivalent to a GAV schema mapping if and only if it allows for CQ-rewriting, admits universal solutions, and is closed under both target homomorphisms and target intersections. It follows that M is logically equivalent to a GAV mapping. (d) ⇒ (a) This is obvious since M is a pointwise limit of (M n ) n≥1 .
Observe that Theorem 3 (and its proof) provide necessary and sufficient conditions for a pointwise Cauchy sequence of GAV mappings to have a GAV mapping as a pointwise limit, but these conditions are on the pointwise limit and not on the sequence itself. By analyzing the proof of Theorem 3, however, it is possible to extract a necessary and sufficient condition on the sequence itself. For this, we need to introduce the following concept.

Definition 6
Let (M n ) n≥1 be a sequence of schema mappings. We say that (M n ) n≥1 allows for CQ-rewriting if for every target conjunctive query q, there is a union q′ of source conjunctive queries having the following property: for every source instance I , there is a positive integer n I such that cert (q, I, M n ) = q′ (I ), for every n ≥ n I .
Let M be a pointwise limit of a sequence (M n ) n≥1 of schema mappings. It is easy to show that M allows for CQ-rewriting if and only if (M n ) n≥1 allows for CQ-rewriting. Indeed, assume first that M allows for CQ-rewriting. To show that (M n ) n≥1 allows for CQ-rewriting, let q be a conjunctive query and let q′ be a union of conjunctive queries such that cert (q, I, M) = q′ (I ), for every source instance I . Since M is a pointwise limit of (M n ) n≥1 , for every instance I , there is a positive integer n I such that cert (q, I, M) = cert (q, I, M n ), for every n ≥ n I . It follows that cert (q, I, M n ) = q′ (I ), for every n ≥ n I , which shows that (M n ) n≥1 allows for CQ-rewriting. In the other direction, assume that (M n ) n≥1 allows for CQ-rewriting. To show that M allows for CQ-rewriting, let q be a conjunctive query and let q′ be a union of conjunctive queries such that for every source instance I , there is a positive integer n I such that cert (q, I, M n ) = q′ (I ), for every n ≥ n I . By the pointwise convergence of (M n ) n≥1 to M, for every source instance I , there is a positive integer n′ I such that cert (q, I, M) = cert (q, I, M n ), for every n ≥ n′ I . Let I be a source instance. By taking any n ≥ max{n I , n′ I }, we have that cert (q, I, M) = cert (q, I, M n ) and cert (q, I, M n ) = q′ (I ), hence cert (q, I, M) = q′ (I ), which shows that M allows for CQ-rewriting.
By combining the preceding remarks with Theorems 2 and 3, we obtain the following result. Since every schema mapping specified by an SO tgd allows for CQ-rewriting, Theorem 3 also implies the following result. Finally, we note that Proposition and Theorem 3 yield a fairly complete picture of the definability of pointwise limits of GAV mappings. Specifically, there are two mutually exclusive possibilities: (1) No pointwise limit allows for CQ-rewriting and no GAV mapping is a pointwise limit. (2) Every pointwise limit admits CQ-rewriting and there is a GAV mapping that is a pointwise limit. Moreover, this happens precisely when the schema mapping M in the proof of Theorem 3 allows for CQ-rewriting or, equivalently, when M is logically equivalent to a GAV mapping.

Limits of Sequences of LAV Mappings
In this section, we investigate the existence and definability of limits of sequences of LAV mappings. In fact, we will consider a much broader class of GLAV mappings, namely k-premise-bounded GLAV mappings for arbitrary k ≥ 1. LAV mappings correspond to the special case of k = 1.

Definition 7
Let M be a GLAV mapping and k a positive integer. We call M a k-premise-bounded GLAV mapping if the premise of every constraint in M has at most k atoms. Let (M n ) n≥1 be a sequence of GLAV mappings. We say that (M n ) n≥1 is premise-bounded if there exists an integer k such that every element M n of (M n ) n≥1 is k-premise-bounded.
Unlike the case of GAV mappings, the notions of pointwise Cauchy and uniformly Cauchy sequences of premise-bounded GLAV mappings coincide. Moreover, the same holds true for the notions of pointwise limit and uniform limit of sequences of such schema mappings.

Theorem 4
Let (M n ) n≥1 be a sequence of premise-bounded GLAV mappings.

(1) The sequence (M n ) n≥1 is pointwise Cauchy if and only if it is uniformly Cauchy.
(2) The sequence (M n ) n≥1 has a pointwise limit if and only if it has a uniform limit.
Proof We prove the first part and then use it to prove the second part.
Part 1. It is obvious that every uniformly Cauchy sequence of mappings is also pointwise Cauchy. We focus on the reverse direction. Let (M n ) n≥1 be a pointwise Cauchy sequence of premise-bounded GLAV mappings. We have to show that for every m, there is an N 0 such that for all n, n′ ≥ N 0 , we have that M n ≡ CQ m M n′ .
Fix an integer m. Since (M n ) n≥1 is pointwise Cauchy, for every source instance I , there is an integer n 0 (I ) such that for all n, n′ ≥ n 0 (I ) and for every conjunctive query q in CQ m , we have that cert (q, I, M n ) = cert (q, I, M n′ ). Let p be the number of relation symbols in the target schema, let r be their maximum arity, and let k be the bound on the number of atoms in the premises of the members of the sequence (M n ) n≥1 . We write ℐ to denote the class of all source instances with at most k · p · m^r atoms. Clearly, up to isomorphism, there are only finitely many instances I ∈ ℐ. Moreover, if I ≅ I′ , then we may assume that n 0 (I ) = n 0 (I′ ). Consequently, the quantity N 0 = max{n 0 (I ) | I ∈ ℐ} is a positive integer. We claim that for all n, n′ ≥ N 0 , we have that M n ≡ CQ m M n′ .
Let I be an arbitrary source instance and let q be an arbitrary conjunctive query in CQ m . We have to show that cert (q, I, M n ) = cert (q, I, M n′ ), for all n, n′ ≥ N 0 . Let a be a tuple of constants such that a ∈ cert (q, I, M n ), hence a ∈ q(chase(I, M n )) ↓ . Since the query q has at most m variables, it must consist of at most p · m^r atoms. Let h : atoms(q) → chase(I, M n ) be a homomorphism establishing that a ∈ q(chase(I, M n )). It follows that there are at most p · m^r facts in chase(I, M n ) witnessing that a ∈ q(chase(I, M n )) ↓ . Each of these facts must be produced in a single step while chasing the source instance I with M n , which implies that each of these facts is produced using at most k facts from I . Let I * be the subinstance of I consisting of all the aforementioned facts of I used to produce the facts in chase(I, M n ) witnessing that a ∈ q(chase(I, M n )) ↓ . We then have that |I * | ≤ k · p · m^r and a ∈ q(chase(I * , M n )) ↓ . Since n, n′ ≥ N 0 , we have that q(chase(I * , M n )) ↓ = q(chase(I * , M n′ )) ↓ , hence a ∈ q(chase(I * , M n′ )) ↓ . By the monotonicity of the chase procedure, we have that a ∈ q(chase(I, M n′ )) ↓ . It follows that q(chase(I, M n )) ↓ ⊆ q(chase(I, M n′ )) ↓ . A symmetric argument establishes the containment q(chase(I, M n′ )) ↓ ⊆ q(chase(I, M n )) ↓ , hence q(chase(I, M n )) ↓ = q(chase(I, M n′ )) ↓ , which, in turn, implies that cert (q, I, M n ) = cert (q, I, M n′ ).
Part 2. It is obvious that if a sequence of schema mappings has a uniform limit, then it has a pointwise limit. We focus on the reverse direction. Let (M n ) n≥1 be a sequence of premise-bounded GLAV mappings that has a pointwise limit M. We claim that M is also a uniform limit of (M n ) n≥1 .
Since (M n ) n≥1 has a pointwise limit, we have that (M n ) n≥1 is pointwise Cauchy. The previous part implies that (M n ) n≥1 is uniformly Cauchy as well. Fix an integer m. Since (M n ) n≥1 is uniformly Cauchy, there exists an n 0 such that for all n, n′ ≥ n 0 , we have that M n ≡ CQ m M n′ . We claim that also M n ≡ CQ m M holds, for every n ≥ n 0 . To show this, fix some n ≥ n 0 and let I be a source instance and q a conjunctive query in CQ m . We have to show that cert (q, I, M n ) = cert (q, I, M). Since M is a pointwise limit of (M n ) n≥1 , there is an n 0 (I ) such that for all n′ ≥ n 0 (I ), we have that cert (q, I, M n′ ) = cert (q, I, M). Take an integer n′ such that n′ ≥ max{n 0 , n 0 (I )}. Since n′ ≥ n 0 , we have that cert (q, I, M n ) = cert (q, I, M n′ ). Since n′ ≥ n 0 (I ), we have that cert (q, I, M n′ ) = cert (q, I, M). Thus, cert (q, I, M n ) = cert (q, I, M).
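The counting step in Part 1 can be made concrete. The sketch below is only an illustration under stated assumptions: a target schema is modeled as a dictionary from relation names to arities, and the helper names `all_atoms` and `witness_bound` are hypothetical. It enumerates every distinct atom over m variables, confirming that a query in CQ m has at most p · m^r atoms, and evaluates the resulting size bound k · p · m^r for the witnessing subinstance.

```python
from itertools import product

def all_atoms(schema, variables):
    """Every distinct atom R(v1, ..., vj) over the given variables.
    `schema` maps each relation name to its arity."""
    return [(rel, args)
            for rel, arity in schema.items()
            for args in product(variables, repeat=arity)]

def witness_bound(k, schema, m):
    """The bound k * p * m**r on the number of source facts needed,
    where p is the number of relations and r their maximum arity."""
    p = len(schema)
    r = max(schema.values())
    return k * p * m ** r

schema = {"F": 2, "G": 1}          # p = 2 relations, maximum arity r = 2
atoms = all_atoms(schema, ["x", "y", "z"])
print(len(atoms), witness_bound(1, schema, 3), witness_bound(2, schema, 3))
```

The bound depends only on k, m and the target schema, not on the source instance, which is what allows the maximum N 0 to be taken over finitely many instances up to isomorphism.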
Note that the preceding proof of Part 2 used only the hypothesis that the sequence (M n ) n≥1 has a pointwise limit and the fact, established in Part 1, that the sequence (M n ) n≥1 is uniformly Cauchy. As a matter of fact, this is an instance of a general result about pseudometric spaces, namely, that if a uniformly Cauchy sequence of functions converges pointwise, then it also converges uniformly.
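For completeness, the general fact about pseudometric spaces can be derived in a few lines from the triangle inequality. The notation below is an assumption made for this sketch: (X, d) is an arbitrary pseudometric space, the functions f n and f map a set A to X, and n′ x denotes an index chosen separately for each x.

```latex
% Assumptions: (X, d) is a pseudometric space, f_n, f : A \to X,
% (f_n) is uniformly Cauchy and converges pointwise to f.
\begin{align*}
&\text{Fix } \varepsilon > 0 \text{ and choose } n_0 \text{ with }
  d\bigl(f_n(x), f_{n'}(x)\bigr) < \tfrac{\varepsilon}{2}
  \text{ for all } n, n' \ge n_0 \text{ and all } x \in A. \\
&\text{For each } x \in A \text{ choose } n'_x \ge n_0 \text{ with }
  d\bigl(f_{n'_x}(x), f(x)\bigr) < \tfrac{\varepsilon}{2}. \\
&\text{Then, for every } n \ge n_0 \text{ and every } x \in A:\quad
  d\bigl(f_n(x), f(x)\bigr)
  \le d\bigl(f_n(x), f_{n'_x}(x)\bigr) + d\bigl(f_{n'_x}(x), f(x)\bigr)
  < \varepsilon .
\end{align*}
```

Since n 0 was chosen independently of x, the last inequality is exactly uniform convergence.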
The following two propositions further demarcate the differences between GAV and premise-bounded GLAV mappings. In fact, these differences are already witnessed by sequences of LAV mappings. The first difference concerns the existence of limits of uniformly Cauchy sequences. In contrast to the GAV case, uniformly Cauchy sequences of LAV mappings may have no uniform limit; in fact, they may not even have a pointwise limit.

Proposition 6
There exists a uniformly Cauchy sequence of LAV mappings that has no pointwise limit; in particular, it has no uniform limit either.
Proof Let S be a source schema consisting of a binary relation symbol E and let T be a target schema consisting of a binary relation F . For every n ≥ 1, let M n be the LAV mapping specified by the constraint

∀x, y(E(x, y) → q n+1 )
where q n = ∃z 1 , . . . , z n ⋀ 1≤i<j≤n (F (z i , z j ) ∧ F (z j , z i )) is the boolean conjunctive query which is satisfied by the graphs containing a self-loop or a clique of size n (now considering F as the edge relation).
We first show that the sequence (M n ) n≥1 is uniformly Cauchy. Let k ≥ 1. We claim that if we take n 0 = k, then for every source instance I , for every n, m ≥ n 0 , and every q ∈ CQ k , we have that cert (q, I, M n ) = cert (q, I, M m ). To see this, note that for every source instance I and for every t ≥ 1, the universal solutions of I w.r.t. M t have active domains consisting entirely of labeled nulls. Hence, only boolean queries may return a non-empty result. Moreover, observe that these universal solutions have no self-loops, i.e., they contain no atoms of the form F (v, v) for some labeled null v.
We now distinguish two cases for a non-empty source instance I . First, suppose that q ∈ CQ k is a boolean conjunctive query which contains a "self-loop", i.e., an atom of the form F (z, z) for some variable z. Then we clearly have cert (q, I, M n ) = false = cert (q, I, M m ). It remains to consider the case that q ∈ CQ k is a boolean CQ containing no self-loop. Then we clearly have cert (q, I, M n ) = true = cert (q, I, M m ), since we are assuming that m, n ≥ k holds and since the at most k variables of q can be mapped to pairwise distinct nodes of a clique with at least k nodes.
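The case distinction above rests on a simple homomorphism fact: a loop-free boolean query with at most k variables maps into every clique with at least k nodes, while a query with an atom F (z, z) maps into no clique. The following brute-force check is a sketch under the assumption that queries and graphs are given as collections of ordered pairs; the helper names `has_hom` and `clique` are hypothetical.

```python
from itertools import product

def has_hom(query_edges, graph_edges):
    """Brute force: is there a homomorphism from the query atoms into the graph?"""
    variables = sorted({v for edge in query_edges for v in edge})
    nodes = sorted({u for edge in graph_edges for u in edge})
    for image in product(nodes, repeat=len(variables)):
        h = dict(zip(variables, image))
        if all((h[a], h[b]) in graph_edges for a, b in query_edges):
            return True
    return False

def clique(n):
    """Edge set of a clique on n nodes, without self-loops."""
    return {(i, j) for i in range(n) for j in range(n) if i != j}

triangle = [("a", "b"), ("b", "c"), ("c", "a")]  # 3 variables, no self-loop atom
loop = [("a", "a")]                              # contains a self-loop atom

# The triangle query needs at least 3 clique nodes; the loop query never succeeds.
print([has_hom(triangle, clique(n)) for n in (2, 3, 4)])
print(has_hom(loop, clique(5)))
```

The failure on the clique with two nodes shows why the threshold n 0 has to grow with the number k of variables.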
Using an argument similar to the one in the proof of Proposition 2, we now show that the sequence (M n ) n≥1 has no pointwise limit. Towards a contradiction, assume that (M n ) n≥1 does have a pointwise limit M. Let I be a non-empty source instance. We consider three cases.
First, assume that Sol(I, M) is empty. Then, for every boolean conjunctive query q, it holds trivially that cert (q, I, M) = true. This is, in particular, the case for the query q = ∃z F (z, z), which asks for the existence of a self-loop. However, for this query q, we have that cert (q, I, M n ) = false for every n ≥ 1.
Second, assume that Sol(I, M) is non-empty and that all solutions J ∈ Sol(I, M) contain a self-loop. For the query q = ∃zF (z, z) as above, we again have cert (q, I, M) = true, whereas cert (q, I, M n ) = false, for every n ≥ 1.
Finally, assume that Sol(I, M) is non-empty and that at least one solution J ∈ Sol(I, M) does not contain a self-loop. Let m be the biggest integer such that J contains a clique of size m. Consider the conjunctive query q = q m+1 , which, for graphs without self-loops, asserts the existence of a clique of size m + 1. Then q evaluates to false over J and we have cert (q, I, M) = false. On the other hand, for all n ≥ m + 1 we have cert (q, I, M n ) = true. Again, this contradicts our assumption that M is the pointwise limit of (M n ) n≥1 .
The next difference is the definability of uniform limits. In Section 4, we saw that if a sequence of GAV mappings has a uniform limit, then it is eventually constant, hence it has a GAV mapping as a uniform limit. This property need not hold for sequences of LAV mappings (hence, it need not hold for sequences of premise-bounded schema mappings).

Proposition 7
There exists a sequence (M n ) n≥1 of LAV mappings that has a uniform limit, but no uniform limit of (M n ) n≥1 admits universal solutions. In particular, no SO tgd is a uniform limit of the sequence (M n ) n≥1 .
Proof For every n ≥ 1, let M n be the LAV mapping specified by the constraint where ∃P n is the conjunctive query ∃z 1 . . . ∃z n (F (z 1 , z 2 ) ∧ . . . ∧ F (z n−1 , z n )) asserting that there is a "path" (possibly with repeated vertices) of length n in the target instance. We now show that the sequence (M n ) n≥1 has a uniform limit, but no uniform limit of this sequence admits universal solutions. where C k is a target instance consisting of a simple cycle of nulls of size k and v(C k ) is the set of all isomorphic copies of C k via isomorphisms that rename nulls. We will show that M is a uniform limit of the sequence (M n ) n≥1 . Specifically, we will show for every m, there exists n 0 such that for all n ≥ n 0 , we have that M n ≡ CQ m M.
Let n 0 = m. Since each M n has solutions consisting entirely of nulls, it suffices to consider boolean CQs only. Let q be a boolean CQ with m variables and assume that cert (q, I, M n ) = true, where n ≥ m. This implies that there is a homomorphism from the body of q into P n , where P n is the simple path with n nodes. In turn, this implies that C k |= q, for every k. Thus, cert (q, I, M) = true as well. In the other direction, assume that cert (q, I, M) = true. Note that q cannot contain a directed cycle, since no directed cycle can be mapped homomorphically into every cycle of length greater than one. Let h be a homomorphism from the body of q into C m+1 . Since q ∈ CQ m , the variables of q have at most m distinct images among the nodes of C m+1 . This means that C̃ m+1 |= q, where C̃ m+1 is obtained from C m+1 by removing the facts that contain at least one element that is not the image of one of the variables of q under h. Note that C̃ m+1 has at least one fact less than C m+1 , and so it is a collection of simple paths of length at most m; therefore, there is a homomorphism from C̃ m+1 to P n , hence P n |= q.
Part 2. For the second part of the claim and towards a contradiction, assume that M′ is a uniform limit of (M n ) n≥1 such that there exists a non-empty source instance I and a finite universal solution J for I w.r.t. M′ . Note that for every i, we have that cert (∃P i , I, M′ ) = true, because M′ is a (uniform and, hence also pointwise) limit of the sequence (M n ) n≥1 . Then we also have that J |= ∃P i , since J is universal. Since J is finite, this is possible only if J contains a directed cycle.
We can now derive a contradiction as follows. For each positive integer l, let ∃C l be the boolean conjunctive query asserting the existence of a cycle of length l. Then there is no n such that cert (∃C l , I, M n ) = true. Thus, cert (∃C l , I, M′ ) = false must hold for every l, since M′ is a limit of (M n ) n≥1 . Hence, J cannot contain cycles, which is a contradiction.
Since every SO tgd admits universal solutions, it follows that no SO tgd is a (uniform or pointwise) limit of (M n ) n≥1 .
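The two homomorphism facts used in the proof of Proposition 7 can be checked by brute force on small cases: a path query maps into every directed cycle, whereas a cycle query maps neither into a simple path nor into every cycle. This is a minimal sketch, assuming that paths and cycles are encoded as sets of ordered pairs over integers; the names `has_hom`, `path` and `cycle` are hypothetical.

```python
from itertools import product

def has_hom(query_edges, graph_edges):
    """Brute force: is there a homomorphism from the query atoms into the graph?"""
    variables = sorted({v for edge in query_edges for v in edge})
    nodes = sorted({u for edge in graph_edges for u in edge})
    for image in product(nodes, repeat=len(variables)):
        h = dict(zip(variables, image))
        if all((h[a], h[b]) in graph_edges for a, b in query_edges):
            return True
    return False

def path(n):
    """Simple directed path on n nodes: 0 -> 1 -> ... -> n-1."""
    return {(i, i + 1) for i in range(n - 1)}

def cycle(k):
    """Simple directed cycle on k nodes."""
    return {(i, (i + 1) % k) for i in range(k)}

# A path on 5 nodes maps into every cycle by winding around it.
print([has_hom(path(5), cycle(k)) for k in range(2, 7)])
# A cycle of length 3 maps neither into a path nor into a cycle of length 4.
print(has_hom(cycle(3), path(8)), has_hom(cycle(3), cycle(4)))
```

This matches the two halves of the argument: every path query is certain in the limit, so a finite universal solution would need a cycle, yet no cycle query is certain.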
By Theorem 1, every SO tgd is the uniform limit of a sequence of GLAV mappings. Proposition 7 implies that the converse is false, even for sequences of LAV mappings.
In the previous section, we showed that a sequence of GAV mappings has a GAV mapping as a pointwise limit if and only if it has a pointwise limit that allows for CQ-rewriting. Is there some structural property that characterizes when a sequence of premise-bounded GLAV mappings has a GLAV mapping as a pointwise limit (which, for premise-bounded mappings, is the same as a uniform limit)? We will show that the property of admitting universal solutions is the key to this question. Specifically, we have the following result.

Theorem 5 Let (M n ) n≥1 be a sequence of premise-bounded GLAV mappings. Then the following statements are equivalent:
(1) (M n ) n≥1 has a GLAV mapping M as a uniform limit.
(2) (M n ) n≥1 has a uniform limit that admits universal solutions.

Moreover, if (M n ) n≥1 is a sequence of LAV mappings, then (M n ) n≥1 has a LAV mapping as a uniform limit if and only if (M n ) n≥1 has a uniform limit that admits universal solutions.
We now give two lemmas which will be used in the proof of Theorem 5, but are also of interest in their own right.

Lemma 2 If M is the uniform limit of a sequence (M n ) n≥1 of schema mappings each of which allows for CQ-rewriting, then also M allows for CQ-rewriting.
Proof Let q be a target conjunctive query with m variables. Since M is a uniform limit of (M n ) n≥1 , there exists an integer n 0 such that for every n ≥ n 0 and every source instance I , we have that cert (q, I, M) = cert (q, I, M n ). In particular, cert (q, I, M) = cert (q, I, M n 0 ). Since M n 0 allows for CQ-rewriting, there is a source conjunctive query q′ such that cert (q, I, M n 0 ) = q′ (I ), for every source instance I . Hence, cert (q, I, M) = q′ (I ) holds, for every source instance I .
It should be noted that the conclusion of Lemma 2 does not hold, in general, if M is a pointwise limit of a sequence (M n ) n≥1 of schema mappings each of which allows for CQ-rewriting. Indeed, if (M n ) n≥1 is the sequence of GAV mappings in the proof of Proposition 5, then Theorem 3 and Proposition 5 imply that no pointwise limit of (M n ) n≥1 allows for CQ-rewriting.

Lemma 3 Let M be a uniform limit of a sequence (M n ) n≥1 of LAV mappings. If M admits universal solutions, then it is closed under unions.
Proof The proof proceeds through several stages and involves four claims, each of which builds on preceding ones. We first state the claims without proof and then use the last claim to show the desired conclusion. After this, we complete the proof of the lemma by proving each claim.
We first modify the notion of CQ-equivalence by limiting the number of atoms of CQs, rather than the number of variables. This yields an equivalent notion of uniform limit.
• For ≥ 1, we define CQ = {q ∈ CQ | length(q) ≤ }, where length(q) denotes the number of atoms in q. Next, we use the given sequence (M n ) n≥1 to construct another sequence (M n ) n≥1 of LAV mappings that possesses some desirable properties. To define the sequence (M n ) n≥1 , we need another claim.

Claim B.
Assume that M n u −→ M. Then, there exists a strictly increasing sequence (n i ) i≥1 of positive integers, such that for every ≥ 1 and for every n ≥ n , we have that M n ≡ CQ M.
Let (n i ) i≥1 be the strictly increasing sequence of positive integers according to Claim B. We define the sequence (M n ) n≥1 of LAV mappings as follows: Here, T (τ, ) contains all LAV constraints obtained from τ by restricting the conclusion to at most atoms. Formally, let τ = A(x) → ∃y A 1 (x, y)  Claim C. Let (n i ) i≥1 be the strictly increasing sequence of positive integers according to Claim B and let (M n ) n≥1 be the sequence of LAV mappings constructed above. Then, for every ≥ 1, the following properties hold: (i) for every n ≥ n , we have that M n ≡ CQ M; (ii) the conclusion of every LAV constraint in M n is of length at most .
We now make the following claim about the sequence (M n ) n≥1 . Since M n u −→ M, for each ≥ 1 there exists an integer n such that for all n ≥ n , we have that M n ≡ CQ M. We may choose n as follows to ensure strict monotonicity: n 1 := n 1 . . . n := max(n −1 + 1, n ) Then the sequence (n i ) i≥1 is strictly increasing and for all ≥ 1 and for all n ≥ n , we have that M n ≡ CQ M. Claim C. Let (n i ) i≥1 be the strictly increasing sequence of positive integers according to Claim B and let (M n ) n≥1 be the sequence of LAV mappings constructed above. Then, for every ≥ 1, the following properties hold: (i) for every n ≥ n , we have that M n ≡ CQ M; (ii) the conclusion of every LAV constraint in M n is of length at most .
Consider an arbitrary ℓ ≥ 1. By the construction of the sequence (M′ n ) n≥1 , every LAV constraint in M′ n ℓ has a conclusion of length at most ℓ. Hence, property (ii) clearly holds.
To prove property (i), consider an arbitrary n ≥ n ℓ . We have to show that M′ n ≡ CQ ℓ M.
Since M′ n 0 ≡ CQ ℓ M and ∃y q J ∈ CQ ℓ , also cert (∃y q J , I , M′ n 0 ) = true holds. Hence, there exists a homomorphism from q J to J′ , which can be easily transformed into a homomorphism h : J → J′ by mapping each null u α of J to the image of the variable y α , for every α ∈ {1, . . . , i}.
(ii) For every f-block F of J′ , we consider the boolean conjunctive query ∃z q F whose atoms are the atoms in F and z = (z 1 , . . . , z i ) instantiates the labeled nulls v = (v 1 , . . . , v i ) in F with pairwise distinct variables. Clearly, for every F , we have q F → J′ and, therefore, also cert (∃z q F , I , M′ n 0 ) = true.
Since all LAV constraints in M′ n 0 have conclusion size bounded by ℓ, the number of atoms in any f-block of J′ is bounded by ℓ. Hence, for every F , the corresponding conjunctive query q F is in CQ ℓ . Since M′ n 0 ≡ CQ ℓ M, we have that cert (∃z q F , I , M) = true. Hence, for every f-block F of J′ , there exists a homomorphism h F : q F → J , which can easily be transformed into a homomorphism g F : F → J by setting g F (v α ) = h F (z α ) for every α ∈ {1, . . . , i}. These homomorphisms from the f-blocks of J′ to J can be combined to the desired homomorphism h′ = ⋃ g F with h′ : J′ → J .
"⊆": Let K ∈ Sol(I , M′ n 0 ). Since J′ is a universal solution for I w.r.t. M′ n 0 , there exists a homomorphism g : J′ → K. By composing g with the homomorphism h : J → J′ , we obtain a homomorphism from J to K. By the closure of M under target homomorphisms, we conclude that K ∈ Sol(I , M).
"⊇": Now let K ∈ Sol(I , M). Since J is a universal solution for I w.r.t. M, there exists a homomorphism g′ : J → K. By composing g′ with the homomorphism h′ : J′ → J , we obtain a homomorphism from J′ to K. Since the LAV mapping M′ n 0 is closed under target homomorphisms, we conclude that K ∈ Sol(I , M′ n 0 ).
The proof of Lemma 3 is now complete.
We now have all the tools needed to present the proof of Theorem 5. Before doing so and for the sake of readability, we reproduce its statement.
Let (M n ) n≥1 be a premise-bounded sequence of GLAV mappings. The following statements are equivalent.
(1) (M n ) n≥1 has a GLAV mapping M as a uniform limit.
(2) (M n ) n≥1 has a uniform limit that admits universal solutions.
Moreover, if (M n ) n≥1 is a sequence of LAV mappings, then (M n ) n≥1 has a LAV mapping as a uniform limit if and only if (M n ) n≥1 has a uniform limit that admits universal solutions.

Proof of Theorem 5
The direction (1) ⇒ (2) is obvious. For the direction (2) ⇒ (1), we start with the case when (M n ) n≥1 is a sequence of LAV mappings.
Assume that M is a uniform limit of a sequence (M n ) n≥1 of LAV mappings and that M admits universal solutions. Without loss of generality, we may also assume that M is closed under target homomorphisms. Indeed, if we let M′ be the schema mapping obtained by closing M under target homomorphisms, then M′ is also a uniform limit of (M n ) n≥1 and it admits universal solutions; this is so because the notion of uniform limit is based on CQ-equivalence and conjunctive queries are preserved under homomorphisms. Then the schema mapping M has the following properties:
For the case when (M n ) n≥1 is a sequence of premise-bounded GLAV mappings (but not necessarily LAV mappings), we apply yet another structural characterization of GLAV mappings from [19], namely, Theorem 3.9, which asserts that if a schema mapping allows for CQ-rewriting, admits universal solutions, is closed under target homomorphisms, and is n-modular, for some fixed n, then it is logically equivalent to a GLAV mapping.
Let k be the constant bounding the length of premises in (M n ) n≥1 . We proceed exactly as in the proof of Lemma 3 and construct a sequence (M′ n ) n≥1 , in which the premises of the tgds are the same as in the tgds of (M n ) n≥1 ; hence each tgd in (M′ n ) n≥1 has at most k atoms in its premise. Again as in the proof of Lemma 3, we establish the following analog of Claim D. Now, since each tgd in every element of (M′ n ) n≥1 has at most k atoms in its premise, it follows that there is a positive integer N k such that each mapping M′ n in (M′ n ) n≥1 is N k -modular. It is easy to see that N k ≤ k · r holds, where r is the maximum relation arity in the source schema.
We now prove that M is N k -modular. Assume that J is not a solution for I w.r.t. M. Take an integer n 0 as in Claim D and consider M′ n 0 . It follows that J is not a solution for I w.r.t. M′ n 0 . Since M′ n 0 is N k -modular, there is a subinstance I′ of I such that J is not a solution for I′ w.r.t. M′ n 0 and |dom(I′ )| ≤ N k . Again by Claim D, we have that J is not a solution for I′ w.r.t. M, hence M is N k -modular.
Thus, M has the following properties: it admits CQ-rewriting (since it is the uniform limit of GLAV mappings that admit CQ-rewriting), it admits universal solutions, is closed under target homomorphisms (if it is not, we take its closure before we begin the construction), and, as just shown, it is N k -modular. Consequently, by Theorem 3.9 in [19], we have that M is logically equivalent to a GLAV schema mapping, which completes the proof.
We conclude this section with a conjecture concerning uniform limits of arbitrary sequences of GLAV mappings.

Conjecture 1
The following statements are equivalent for a sequence (M n ) n≥1 of GLAV mappings.
1. (M n ) n≥1 has an SO tgd as a uniform limit.
2. (M n ) n≥1 has a uniform limit that admits universal solutions.
It is not hard to show that the preceding conjecture is implied by a conjecture in [2] to the effect that the language of plain SO-tgds can be characterized by the following three properties: allowing for CQ-rewriting, admitting universal solutions, and closure under target homomorphisms.

Metric Space Completion and Generalized Schema Mappings
Let T be a schema containing a binary relation symbol. By Proposition 2, the metric space (P(Inst(T)), dist) is not complete, i.e., there are Cauchy sequences of elements of P(Inst(T)) that have no limit in P(Inst(T)). Let (P(Inst(T)) * , dist * ) be the completion of (P(Inst(T)), dist). As described in Section 2, the elements of P(Inst(T)) * are the equivalence classes of Cauchy sequences of elements of P(Inst(T)), where two Cauchy sequences I 1 , I 2 , . . . and J 1 , J 2 , . . . are equivalent if lim n→∞ dist (I n , J n ) = 0. Clearly, this is a rather abstract description of P(Inst(T)) * .
In this section we show that, in many cases, the elements of P(Inst(T)) * can be represented by suitably constructed infinite T-instances. In turn, this result and basic results about complete metric spaces imply that the (pointwise or uniform) limits of a Cauchy sequence of schema mappings can be represented by a generalized schema mapping, that is, a schema mapping in which infinite solutions are allowed. We also establish a tight connection between these results and the representation of structural limits in the monograph by Nešetřil and Ossona de Mendez [15].

Representing Limits of Cauchy Sequences in the Metric Completion
Let T be a schema. Recall that, by definition, a T-instance is a finite set of facts.
In what follows, we will also consider infinite T-instances, where, by definition, an infinite T-instance is an infinite set I of facts R i (t 1 , . . . , t m ). The term T-instance will continue to denote a finite T-instance, but, at times and for emphasis or disambiguation, we will also use the term finite T-instance, especially in contexts in which infinite T-instances are also considered.
According to Definitions 2 and 3, the notion of the distance between two sets of finite instances has been defined using the notion of CQ n -equivalence, where two sets J and J′ of finite T-instances are CQ n -equivalent, denoted J ≡ CQ n J′ , if it holds that cert (q, J ) = cert (q, J′ ), for all q ∈ CQ n . The notion of CQ n -equivalence naturally extends to arbitrary (i.e., finite or infinite) T-instances. Hence, the notions of similarity and distance, both of which were defined via CQ n -equivalence, also carry over immediately to sets of arbitrary T-instances. Furthermore, the set of sets of arbitrary T-instances forms a pseudometric space, in which we can speak about Cauchy sequences and limits.

Definition 8 Let T be a schema.
• Let X and Y be two sets of finite T-instances. We say that Y is an isomorphic copy of X with nulls named apart if
1. For every member J of X , there is a member J′ of Y that is an isomorphic copy of J via an isomorphism that renames nulls.
2. Every member J′ of Y is an isomorphic copy of some member J of X via an isomorphism that renames nulls.
3. No two members of Y have nulls in common.
• If Y is a set of finite T-instances, then ⋃Y denotes the union of all members of Y (where each member of Y is viewed as a set of facts).
• If X is a set of finite T-instances, then X denotes the set consisting of the unions of isomorphic copies of X with nulls named apart, i.e., X = { ⋃Y | Y is an isomorphic copy of X with nulls named apart }.
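To make the notion of an isomorphic copy with nulls named apart more tangible, the following Python sketch renames the nulls of each instance in a finite collection so that no two copies share a null, and then forms the union of the copies. The encoding is our own assumption, not part of the formal development: a fact is a pair (relation name, tuple of terms), a null is a pair `("null", id)`, and every other term is a constant. The names `is_null`, `name_apart` and `union_of` are hypothetical.

```python
from itertools import count


def is_null(term):
    # Assumed encoding: nulls are pairs ("null", id); anything else is a constant.
    return isinstance(term, tuple) and len(term) == 2 and term[0] == "null"


def name_apart(instances):
    """Return isomorphic copies of the given instances with pairwise disjoint nulls."""
    fresh = count(1)
    copies = []
    for instance in instances:
        renaming = {}  # injective renaming of the nulls of this instance
        copy = set()
        for relation, args in sorted(instance, key=repr):
            new_args = []
            for term in args:
                if is_null(term):
                    if term not in renaming:
                        renaming[term] = ("null", next(fresh))
                    term = renaming[term]
                new_args.append(term)  # constants are left unchanged
            copy.add((relation, tuple(new_args)))
        copies.append(copy)
    return copies


def union_of(instances):
    """Union of all members, each member viewed as a set of facts."""
    return set().union(*instances)


n = ("null", 1)
X = [{("E", (n, "a"))}, {("E", (n, n))}]  # both members use the same null
Y = name_apart(X)
print(Y)
print(union_of(Y))
```

The sketch handles finite collections only; it is meant to illustrate the renaming step, not the infinite unions that arise later in this section.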
Several remarks are in order now.
• Note also that if X is a set of finite T-instances such that at least one instance in X contains nulls, then X is infinite (even if X is a finite set).
• According to Definition 4, if J is a T-instance whose active domain contains nulls only, then v(J ) is the set of all T-instances that are isomorphic copies of J via an isomorphism that renames nulls. This notation makes sense also for infinite T-instances J whose active domains contain nulls only. With this in mind, observe that if X is a finite set of T-instances and if Y is an isomorphic copy of X with nulls named apart, then
• As a concrete example, if K = {K n | n ≥ 1}, where K n is a clique of size n in which every node is a null, then the members of K are precisely the disjoint unions of cliques of all finite sizes in which every node is a null.

Definition 9
Let q be a conjunctive query over the schema T with k free variables, k ≥ 0, and let a be a k-tuple of constants (if k = 0, then a = (), i.e., a is the empty tuple).
We write q(a) to denote the T-instance J obtained from q and a by (i) substituting the free variables of q by the respective elements of a; (ii) replacing the existential variables of q by fresh distinct labeled nulls; and (iii) treating the resulting body atoms of q as facts of the T-instance J .
Note that if q is a boolean query (in which case a = ()), then q(()) is the canonical database of q, i.e., the T-instance whose active domain is the set of variables of q viewed as distinct nulls and whose facts are the atoms of q. Conversely, every T-instance J whose active domain consists entirely of nulls is the canonical database of a boolean conjunctive query.
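The three steps of Definition 9 can be sketched in Python as follows. The encoding is an assumption made for illustration only: a conjunctive query is given by a list of free variables and a list of atoms (relation name, tuple of variable names), all terms of the query are variables, and a null is a pair `("null", id)`. The name `canonical_instance` is hypothetical.

```python
from itertools import count

_fresh_ids = count(1)  # supply of identifiers for fresh labeled nulls


def canonical_instance(free_vars, atoms, a):
    """Build q(a) from a conjunctive query q and a tuple a of constants."""
    if len(free_vars) != len(a):
        raise ValueError("a must contain one constant per free variable")
    # (i) substitute the free variables by the respective elements of a
    assignment = dict(zip(free_vars, a))
    facts = set()
    for relation, variables in atoms:
        row = []
        for v in variables:
            if v not in assignment:
                # (ii) replace each existential variable by a fresh distinct null
                assignment[v] = ("null", next(_fresh_ids))
            row.append(assignment[v])
        # (iii) treat the resulting atom as a fact
        facts.add((relation, tuple(row)))
    return facts


# q(x) = ∃y ∃z E(x, y) ∧ E(y, z), instantiated with a = ("c",)
print(canonical_instance(["x"], [("E", ("x", "y")), ("E", ("y", "z"))], ("c",)))
```

For a boolean query, the call with `free_vars = []` and `a = ()` yields the canonical database of the query.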
Before stating the main result of this section, we need to introduce one more concept. Let 𝒥 be a set of finite or infinite T-instances. We say that 𝒥 is closed under isomorphisms that rename nulls if for every (finite or infinite) T-instance J in 𝒥 and for every (finite or infinite) T-instance J′ that is an isomorphic copy of J via an isomorphism that renames nulls, we have that J′ is also in 𝒥. Note that if X is a set of finite T-instances, then X is closed under isomorphisms that rename nulls. Moreover, if M is a schema mapping between S and T, then, for every source instance I , the set Sol(I, M) of the solutions of I w.r.t. M is closed under isomorphisms that rename nulls (see Definition 1).

Theorem 6
Let (J n ) n≥1 be a Cauchy sequence of elements of P(Inst(T)) such that each J n is closed under isomorphisms that rename nulls. Then the limit of the sequence (J n ) n≥1 is the set J * , where J * = {q(a) | q ∈ CQ and there is an integer p such that a ∈ cert(q, J i ), for every i ≥ p}.
Proof We have to show that, for every m ≥ 1, there is some n 0 such that for every q ∈ CQ m and every n ≥ n 0 , we have that cert (q, J n ) = cert (q, J * ). This will be done in two steps, as follows.
Step 1: We will show that, for every m ≥ 1, there is some n 1 such that cert (q, J n ) ⊆ cert (q, J * ), for every q ∈ CQ m and every n ≥ n 1 .
Step 2: We will show that, for every m ≥ 1, there is some n 2 such that cert (q, J * ) ⊆ cert (q, J n ), for every q ∈ CQ m and every n ≥ n 2 .
We start by pointing out that for every n ≥ 1 and every q ∈ CQ, the certain answers cert (q, J n ) consist entirely of null-free tuples. This follows from the assumption that J n is closed under isomorphisms that rename nulls (the proof is essentially the same as the proof of Proposition 1 in Section 2). Moreover, for every q ∈ CQ, the certain answers cert (q, J * ) also consist entirely of null-free tuples. This is so because J * contains isomorphic copies of J * having no nulls in common (e.g., if v 1 , . . . , v n , . . . is a list of all nulls, then J * contains an isomorphic copy of J * in which all nulls have even index and an isomorphic copy of J * in which all nulls have odd index). Thus, we only need to focus on tuples of constants as possible certain answers.
To prove Step 1, since the sequence (J n ) n≥1 is Cauchy, for every m ≥ 1, there is some n 1 such that if s ≥ n 1 and t ≥ n 1 , then J s ≡ CQ m J t . We now claim that cert (q, J n ) ⊆ cert (q, J * ), for every q ∈ CQ m and every n ≥ n 1 . Indeed, assume that q ∈ CQ m and let a be a (possibly empty) tuple of constants in cert (q, J n ), where n ≥ n 1 . It follows that a ∈ cert (q, J j ), for every j ≥ n 1 , hence the finite T-instance q(a) is in the set J * . Consequently, a ∈ q( Y), for every isomorphic copy Y of J * with nulls named apart, which implies that a ∈ cert (q, J * ).
To prove Step 2, we will first show that the set D of constants occurring in J * is finite (note that D is also the set of constants occurring in J * ). As a stepping stone, we will show the finiteness of a set D′ that is defined next.
A single-atom conjunctive query is a query of the form ∃yR(x, y), where R is a relation symbol in the schema T. Let D′ be the set of all constants b for which there is a single-atom query q and an index p, such that b occurs in cert (q, J i ), for all i ≥ p. We claim that the set D′ is finite. To see this, observe first that every single-atom query has at most r variables, where r is the maximum arity of the relation symbols in T. Since the sequence (J n ) n≥1 is Cauchy, there exists an integer p r such that J i ≡ CQ r J p r , for all i ≥ p r . This implies that the certain answers to single-atom conjunctive queries become fixed in (J n ) n≥1 starting from the index p r , which depends only on the schema T. By definition, the certain answers hold in every instance in J p r . Since J p r consists entirely of finite instances, the set D′ must be finite as well.
To complete the proof of the finiteness of D, we will show that D ⊆ D′ . Let a be a tuple of constants for which there is a conjunctive query q and an index p, such that a ∈ cert (q, J i ), for all i ≥ p. Let s be the number of atoms of q and consider the single-atom queries q 1 (y 1 ), . . . , q s (y s ) that cover q in the following sense: for every j with 1 ≤ j ≤ s, the atom of q j is the j -th atom of q, and y j contains exactly the free variables of q that occur in this atom. Let a j be the tuple of elements from a assigned to the variables y j . Clearly, every element of a is an element of some a j , 1 ≤ j ≤ s. Observe that a ∈ cert (q, J i ) implies that a j ∈ cert (q j , J i ), hence we have that a j ∈ cert (q j , J i ), for every i ≥ p. Thus, each element of a j , 1 ≤ j ≤ s, is an element of D′ . This shows that D ⊆ D′ holds, hence D is a finite set.
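The covering of a conjunctive query by single-atom queries used in the argument above can be sketched in Python. As an illustrative assumption, a query is encoded by its list of free variables and its list of atoms (relation name, tuple of variable names); the name `single_atom_cover` is hypothetical.

```python
def single_atom_cover(free_vars, atoms, a):
    """For each atom of q, return the atom, the free variables y_j of q
    occurring in it, and the restriction a_j of the tuple a to y_j."""
    value = dict(zip(free_vars, a))
    cover = []
    for relation, variables in atoms:
        # y_j: exactly the free variables of q that occur in the j-th atom
        y_j = tuple(v for v in free_vars if v in variables)
        # a_j: the elements of a assigned to the variables in y_j
        a_j = tuple(value[v] for v in y_j)
        cover.append(((relation, variables), y_j, a_j))
    return cover


# q(x1, x2) = ∃z E(x1, z) ∧ E(z, x2) with a = ("c", "d")
for atom, y_j, a_j in single_atom_cover(
        ["x1", "x2"], [("E", ("x1", "z")), ("E", ("z", "x2"))], ("c", "d")):
    print(atom, y_j, a_j)
```

Provided that every free variable occurs in some atom, every element of a shows up in at least one of the tuples a j , which is the property the finiteness argument relies on.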
We now return to the proof of Step 2. We will show that for every m ≥ 1, there is some n 2 such that cert (q, J * ) ⊆ cert (q, J n ), for every q ∈ CQ m and every n ≥ n 2 . Assume that q ∈ CQ m and let a be a tuple of constants such that a ∈ cert (q, J * ). Then, for every instance J ∈ J * , we have that a ∈ q(J ), hence there is a homomorphism h from the variables of q to the active domain of J such that the tuple of the free variables of q is mapped to a and the atoms of q are mapped to facts of J . Let s be the number of atoms of q and let f 1 , . . . , f s be the facts of J that are the images of the atoms of q under the homomorphism h. Up to renaming nulls, each fact f j is a fact of some finite T-instance of the form q j (b j ), where q j is a conjunctive query and b j is a tuple of constants such that b j ∈ cert (q j , J i ), for all sufficiently large i. Let n q(a) be an index such that for every i ≥ n q(a) , we have that b j ∈ cert (q j , J i ) holds, for 1 ≤ j ≤ s. Furthermore, let n 2 be the maximum such index n q(a) , for all q in CQ m and for all tuples a of elements of D. Such an index exists (i.e., it is a finite number) because both the set CQ m and the set of tuples of elements of D of length at most m are finite.
Observe that n 2 has been chosen so that for every tuple a and for every q ∈ CQ m with a homomorphism h mapping q(a) to some instance in J * (and thus to every instance in J * , by renaming the nulls in the co-domain of h accordingly), every fact f j in h(q(a)) can be mapped further to every instance J n , n ≥ n 2 , via a homomorphism h i defined on the entire f-block of f j . (Recall that, by the definition of J * , each fact f j instantiates an atom of some conjunctive query q j whose certain answers persist in the sequence (J n ) n≥1 ; the bodies of these queries are mapped into instances of J * after renaming apart the nulls in them, thus ensuring that no two distinct queries end up in the same f -block of an instance of J * ). The union of two homomorphisms h 1 , h 2 defined on two distinct f-blocks B 1 , B 2 is unambiguously defined, and it is a homomorphism on the instance B 1 ∪ B 2 , since homomorphisms are the identity on constants and f-blocks do not share nulls. Thus, for an instance J ∈ J * and for the image {f 1 , . . . , f s } of q(a) under some homomorphism h, we also have a homomorphism from q(a) to J n , n ≥ n 2 , obtained by composing h with a union h 1 ∪ · · · ∪ h s of homomorphisms from the f-blocks of the atoms f 1 , . . . , f s to J n . It follows that a ∈ cert (q, J n ), for every n ≥ n 2 . This establishes the inclusion cert (q, J * ) ⊆ cert (q, J n ), for n ≥ n 2 , and completes the proof of the theorem.
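The argument above relies on the fact that distinct f-blocks share no nulls, so that homomorphisms defined on different f-blocks can be combined. As an illustration, the following Python sketch computes the f-blocks of a finite instance, under the assumption that an f-block is a connected component of the facts with respect to the relation "shares a null". The encoding of facts and nulls, as well as the names `is_null` and `f_blocks`, are hypothetical.

```python
def is_null(term):
    # Assumed encoding: nulls are pairs ("null", id); anything else is a constant.
    return isinstance(term, tuple) and len(term) == 2 and term[0] == "null"


def f_blocks(instance):
    """Partition a finite instance into f-blocks, i.e., into the connected
    components of its facts, where two facts are linked if they share a null."""
    facts = sorted(instance, key=repr)
    parent = list(range(len(facts)))  # union-find over fact indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_fact_with = {}  # null -> index of the first fact containing it
    for i, (_, args) in enumerate(facts):
        for term in args:
            if is_null(term):
                if term in first_fact_with:
                    parent[find(i)] = find(first_fact_with[term])
                else:
                    first_fact_with[term] = i

    blocks = {}
    for i, fact in enumerate(facts):
        blocks.setdefault(find(i), set()).add(fact)
    return list(blocks.values())


n1, n2, n3 = ("null", 1), ("null", 2), ("null", 3)
J = {("E", (n1, n2)), ("E", (n2, "a")), ("E", (n3, "a")), ("P", ("a",))}
for block in f_blocks(J):
    print(block)
```

In this example, the constant a occurs in three facts, but it does not link them: only shared nulls do. This is why the union of homomorphisms defined on distinct f-blocks is well defined, given that homomorphisms are the identity on constants.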
Recall the sequence (v(K n )) n≥1 in Proposition 2, where K n is the clique of size n whose vertices are pairwise distinct labeled nulls. By Proposition 2, this sequence is Cauchy, but has no limit in P(Inst(T)). Theorem 6 tells us how to find the limit in the complete metric space via the conjunctive queries with non-empty certain answers over all but finitely many members of the sequence. Since the instances K n , n ≥ 1, have active domains consisting entirely of nulls, Lemma 1 tells us that we only need to consider boolean conjunctive queries and, moreover, it suffices to evaluate them on each K n . These queries can only use the edge relation E, thus they can be considered as graphs, with the variables representing the vertices. If a query contains a self-loop (i.e., an atom of the form E(z, z) for some variable z), then the query evaluates to false over every K n . On the other hand, if a query contains no self-loop, then it evaluates to true over all but finitely many instances K n . Indeed, let q be a conjunctive query without self-loops and suppose that q contains m variables. It is easy to verify that q evaluates to true over all instances K n with n ≥ m. Hence, by Theorem 6, the limit of (v(K n )) n≥1 is G, where G is a set of graphs with the following properties: (i) every member of G is a graph with no self-loops and with labeled nulls as vertices; (ii) every graph with no self-loops is isomorphic to a graph in G. Clearly, K is also the limit of (v(K n )) n≥1 , where K is a set of graphs with the following properties: (i) every member of K is a clique with labeled nulls as vertices; (ii) every clique is isomorphic to a graph in K. Thus, the limit of (v(K n )) n≥1 is the set consisting of all disjoint unions of cliques of all finite sizes in which every node is a null. At any rate, it is clear that infinite instances have to be used to represent the limit of (v(K n )) n≥1 .
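The behavior of boolean conjunctive queries on the cliques K n can be checked by brute force for small n. The Python sketch below assumes that a boolean query over the edge relation E is given as a list of pairs of variable names and that K n has an edge between every two distinct vertices; the function name `holds_in_clique` is hypothetical.

```python
from itertools import product


def holds_in_clique(atoms, n):
    """Decide whether the boolean conjunctive query with atoms E(u, v),
    given as pairs (u, v) of variable names, is true in the clique K_n."""
    variables = sorted({v for atom in atoms for v in atom})
    for values in product(range(n), repeat=len(variables)):
        h = dict(zip(variables, values))
        # E(u, v) is mapped to an edge of K_n iff the two images are distinct
        if all(h[u] != h[v] for u, v in atoms):
            return True
    return False


triangle = [("x", "y"), ("y", "z"), ("z", "x")]
path = [("x", "y"), ("y", "z")]
self_loop = [("z", "z")]

for name, query in [("triangle", triangle), ("path", path), ("self-loop", self_loop)]:
    print(name, [holds_in_clique(query, n) for n in range(1, 6)])
```

The triangle query becomes true from n = 3 on, while the path query is already true for n = 2; this illustrates that n ≥ m is a sufficient but not a necessary condition. The self-loop query stays false for every n.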
Next, we extend our results about limits of Cauchy sequences of instances to limits of Cauchy sequences of mappings. To this end, we first recall two basic results about complete metric spaces.

Proposition 8 Let (Y, d) be a complete metric space and let (f n ) n≥1 be a sequence of functions from a set X to Y .
• If (f n ) n≥1 is a pointwise Cauchy sequence, then (f n ) n≥1 has a pointwise limit.
• If (f n ) n≥1 is a uniformly Cauchy sequence, then (f n ) n≥1 has a uniform limit. Moreover, the pointwise limit f : X → Y of (f n ) n≥1 is also the uniform limit of (f n ) n≥1 .
The proof of the first part of Proposition 8 is immediate from the definitions; the proof of the second part can be found in any standard book on metric spaces (see, e.g., Proposition 3.6.6 in [18]). In fact, the argument is essentially the same as the one given in the proof of Part 2 of Theorem 4. Note that the second part of Proposition 8 is known as the Cauchy criterion.
We are now ready to obtain concrete representations of the (pointwise or uniform) limits of Cauchy sequences of schema mappings.
Definition 10 Let S, T be two schemas. A generalized schema mapping is a set M of pairs (I, J ) such that I is a finite S-instance, J is a finite or infinite T-instance, and M has the following closure property: if (I, J ) ∈ M and if J′ is an isomorphic copy of J via an isomorphism that renames nulls, then (I, J′ ) ∈ M.

Corollary 3 Let (M n ) n≥1 be a sequence of schema mappings. Consider the generalized schema mapping
• If (M n ) n≥1 is a pointwise Cauchy sequence, then the schema mapping M is the pointwise limit of (M n ) n≥1 .
• If (M n ) n≥1 is a uniformly Cauchy sequence, then the schema mapping M is the uniform limit of (M n ) n≥1 .
Proof The first part follows from Theorem 6 and the definitions. The second part follows from the first part and Proposition 8.

Connections with Representations of Structural Limits
In their recent monograph [15], Nešetřil and Ossona de Mendez considered a notion of distance between instances, as well as sequences of instances and limits of such sequences. In what follows, we describe the main differences between their setting and ours.
• The first main difference is that they did not distinguish two classes of domain elements (namely, constants and nulls), as we did here. As a result, in the definition of homomorphism in [15], no special treatment of constants is needed, while, in our setting, constants must always be mapped to themselves. Their notion of homomorphism coincides with ours on instances whose active domains consist of labeled nulls only. Note that this is exactly the scenario we had in Example 1 and Proposition 2, which are both inspired by results in [15].
• The second main difference is that the notion of distance in [15] is between a pair of two instances, while our notion of distance is between a pair of two sets of instances. This, of course, raises the question of how the two notions compare if, in our setting, both sets are singletons. We will address this question soon.
• The third main difference is that, when cast in terms of the certain answers of conjunctive queries, the notion of distance in [15] involves boolean conjunctive queries only, while ours involves all conjunctive queries (boolean and non-boolean ones).
In what follows, we recall the definition of the similarity measure and the metric from [15] and briefly sketch the approach that Nešetřil and Ossona de Mendez took in representing limits of Cauchy sequences of instances via infinite instances.
Let T be a schema and let J and J′ be two T-instances. By a slight abuse of notation, we write J → J′ to denote the existence of a homomorphism from J to J′ in the sense of Nešetřil and Ossona de Mendez (i.e., not distinguishing two types of domain elements). As mentioned before, if the active domains of J and J′ contain nulls only, then this notion of homomorphism coincides with the one considered in the context of schema mappings and data exchange (which is the one we used here).

Definition 11 [15] Let T be a schema and let J , J′ be two T-instances.

Nešetřil and Ossona de Mendez call this distance the "left distance", because it is defined in terms of homomorphisms from other structures. This is to distinguish the notion from the "right distance", which is defined in terms of homomorphisms to other structures. For our purposes here, only the left distance is relevant. Because of the basic connection between homomorphisms and boolean conjunctive queries, it is easy to see that if J and J′ are T-instances, then the following statements are equivalent.
• m is the largest number such that J and J′ satisfy the same boolean conjunctive queries with at most m − 1 variables.
How do the notions sim h of similarity and dist h of distance compare with our notions sim of similarity and dist of distance? Clearly, this comparison is meaningful only when, in our setting, we consider singletons of instances and, moreover, the active domains of these instances contain nulls only. Recall that, according to the notation introduced in Definition 4, if J is a T-instance whose active domain contains nulls only, then v(J ) is the set of all T-instances that are isomorphic copies of J via an isomorphism that renames nulls. The next observation is a direct consequence of Definitions 3 and 11, Lemma 1, and the preceding remarks.

Proposition 9
Let T be a schema and let J and J′ be two T-instances whose active domains contain nulls only. Then the following statements are true.
In what follows, we will write NInst(T) to denote the set of all T-instances whose active domain consists entirely of nulls. The pair (NInst(T), dist h ) is a pseudometric space, so a metric space can be obtained from it by passing to the equivalence classes; the equivalence class of an instance J consists of all target instances that are homomorphically equivalent to J . As we did for the distance dist and the pseudometric space (P (Inst(T)), dist), we will identify each equivalence class with one of its members.
Cauchy sequences and limits arising from dist h are called left Cauchy sequences and left limits in [15]. Proposition 9 implies that if (J n ) n≥1 is a sequence of elements of NInst(T), then (J n ) n≥1 is Cauchy with respect to the distance dist h if and only if the sequence (v(J n )) n≥1 is Cauchy with respect to the distance dist.
If , there is a homomorphism from L to J . As before, we will not distinguish between equivalence classes and their members. The partial order ≤ h extends to a partial order ≤ * h on the metric completion (NInst(T) * , dist * h ) of (NInst(T), dist h ) in the following way.
Let (X, ≤) be a (finite or infinite) partially ordered set.
• A downset is a subset F of X with the property that for all x ∈ F and y ≤ x, also y ∈ F holds.
• An ideal is a downset F with the additional property that for all x and y in F , there exists z in F such that both x ≤ z and y ≤ z hold.
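The two order-theoretic notions just defined can be checked mechanically on a finite partially ordered set. The Python sketch below follows the definitions literally; the function names and the example (the divisors of 12 ordered by divisibility) are our own illustrative choices.

```python
def is_downset(F, X, leq):
    """F is a downset of (X, leq): whenever x is in F and y <= x, y is in F."""
    return all(y in F for x in F for y in X if leq(y, x))


def is_ideal(F, X, leq):
    """F is an ideal: a downset in which any two elements have an upper bound in F."""
    return is_downset(F, X, leq) and all(
        any(leq(x, z) and leq(y, z) for z in F) for x in F for y in F
    )


X = {1, 2, 3, 4, 6, 12}


def divides(a, b):
    # a <= b in the divisibility order
    return b % a == 0


print(is_ideal({1, 2, 3, 6}, X, divides))    # downset with common upper bound 6
print(is_ideal({1, 2, 3}, X, divides))       # downset, but 2 and 3 have no upper bound in it
print(is_downset({2, 6}, X, divides))        # not a downset: 1 is missing
```

Read literally, the definitions make the empty set an ideal; the sketch does not add a non-emptiness requirement.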
In [15], it is shown that there is a correspondence between left limits of Cauchy sequences from NInst(T) and ideals in the partial order (NInst(T), ≤ h ). Before presenting this correspondence, we need to introduce a piece of notation.
If X is a set of T-instances, then the disjoint union X is the set Y, where Y is an isomorphic copy of X with nulls named apart. In other words, X is the union of copies of all elements of X (one copy of each element of X ) so that no two members in the union have nulls in common. Clearly, X is unique up to isomorphisms that rename nulls.
Let (J n ) n≥1 be a Cauchy sequence from NInst(T) and let
We now have all the conceptual and technical apparatus needed to establish a tight connection between the representations of limits given in Theorem 6 and the representation of limits given in Proposition 10.
Let {J n } n≥1 be a Cauchy sequence (w.r.t. the distance function dist h ) such that each J n is a member of NInst(T), i.e., each J n is a T-instance whose active domain consists entirely of nulls. Let h lim n→∞ J n be its left-limit in the metric completion of (NInst(T), dist h ). As discussed earlier, the sequence {v(J n )} n≥1 is Cauchy (w.r.t. the distance function dist), so it has a limit lim n→∞ v(J n ) in the metric completion of (P(Inst(T)), dist). The following proposition establishes the close relationship between these two limits.

Proposition 11
Let {J n } n≥1 be a Cauchy sequence (w.r.t. the distance function dist h ) such that each J n is a member of NInst(T). Then Since the active domains of the elements of v(J n ) consist entirely of nulls and are pairwise disjoint, we have that only boolean conjunctive queries q contribute to this expression. Moreover, by Lemma 1, the condition a ∈ cert (q, {v(J i )}) means that cert (q, {J i }) = true or, equivalently, that J i |= q. As mentioned earlier, every boolean conjunctive query q can be identified with its canonical database D q . Moreover, J i |= q if and only if D q → J i . Thus, the preceding equation becomes

Concluding Remarks
In this paper, we have embarked on a systematic study of the limiting behavior of sequences of schema mappings using concepts and tools from metric spaces. For the important special cases of GAV and LAV mappings, our main results are summarized in Figs. 1 and 2. In words, we have shown that, for GAV mappings, a pointwise Cauchy sequence need not be uniformly Cauchy; moreover, the existence of a pointwise limit does not imply the existence of a uniform limit. This cannot happen for LAV mappings. On the other side, a uniformly Cauchy sequence of LAV mappings need not even have a pointwise limit, which cannot happen for GAV mappings. We have also shown that structural properties of schema mappings can be used to characterize when the limit of a pointwise Cauchy sequence of GAV (or of LAV) mappings is equivalent to a GAV (or to a LAV) mapping. Finally, we have shown that infinite target instances and generalized mappings (i.e., schema mappings where target instances may be infinite) can be used to represent limits of Cauchy sequences of sets of target instances and limits of Cauchy sequences of arbitrary schema mappings.
We believe that the work reported here has laid the foundation for several interesting lines of subsequent investigations. We have seen that our results about sequences of LAV mappings extend in a natural way to sequences of premise-bounded GLAV mappings; an analogous extension of our results about sequences of GAV mappings to sequences of conclusion-bounded GLAV mappings is left for future work. We have also seen that there are sequences of LAV mappings for which no SO tgd is a uniform limit. Are there structural properties that characterize when a sequence of GLAV mappings has an SO tgd as a pointwise limit? In this vein, we have offered Conjecture 1. A related interesting open problem is whether schema mappings with target constraints are powerful enough to express pointwise limits or uniform limits of sequences of arbitrary GLAV schema mappings. We have some preliminary evidence that this is plausible, but much more work remains to be done.
We believe that the work reported in this paper provides a new perspective on the study of schema mappings by examining them from a dynamic viewpoint. As stated earlier, our original motivation came from schema-mapping optimization and, in particular, from the idea that "complex" schema mappings can be "approximated" by "simpler" ones. It remains to be seen whether the work reported here will lead to applications to schema-mapping optimization. We believe, however, that the study of the limiting behavior of schema mappings via metric spaces is interesting in its own right.
We also note that there are several areas in theoretical computer science where the study of the limiting behavior of objects has produced results that were significant in their own right and also had fruitful consequences. For example, starting with the work of Fagin [4], there has been an extensive investigation of the asymptotic probabilities of logical properties and of 0-1 laws for various logics of interest in computer science. More recently, there has been a study of profinite words, which has found applications to automata theory and to the satisfiability problem for variants of monadic second-order logic (see, e.g., [17,20]). Note that the profinite words form the completion of a metric space on words in which the distance is based on the size of the largest deterministic finite automaton needed to separate two words. Finally, the connection between graph limits in the monograph [15] by Nešetřil and Ossona de Mendez and the completion of the metric space (P(Inst(T)), dist), which was mentioned in the previous section, may merit further exploration. It should also be pointed out that, motivated by the study of large-scale networks, there has been an extensive body of work on a notion of graph limits arising from converging sequences of homomorphism densities; a detailed account of this work is given in the monograph [13] by Lovász. In addition, Nešetřil and Ossona de Mendez [16] developed a general framework for limits of graphs and relational structures; in that framework, different fragments of first-order logic are used to define different notions of limits arising from converging sequences of the frequencies that first-order formulas in the fragment at hand are satisfied by an assignment (homomorphism densities correspond to the fragment consisting of all quantifier-free conjunctive queries). Homomorphisms, metric completions, and representations of limits of finite structures play a central role in [13,16].