Multivariate Central Limit Theorems for Random Clique Complexes

Motivated by open problems in applied and computational algebraic topology, we establish multivariate normal approximation theorems for three random vectors which arise organically in the study of random clique complexes. These are: (1) the vector of critical simplex counts attained by a lexicographical Morse matching, (2) the vector of simplex counts in the link of a fixed simplex, and (3) the vector of total simplex counts. The first of these random vectors forms a cornerstone of modern homology algorithms, while the second one provides a natural generalisation for the notion of vertex degree, and the third one may be viewed from the perspective of U-statistics. To obtain distributional approximations for these random vectors, we extend the notion of dissociated sums to a multivariate setting and prove a new central limit theorem for such sums using Stein's method.


Introduction
Methods from applied and computational algebraic topology have recently found substantial applications in the analysis of nonlinear and unstructured datasets [21,10]. The modus operandi of topological data analysis is to first build a nested family of simplicial complexes around the elements of a dataset, and to then compute the associated persistent homology barcodes [14]. Of central interest, when testing hypotheses under this paradigm, is the question of which homology groups to expect when the input data are randomly generated. Significant efforts have therefore been devoted to answering this question for various models of noise, giving rise to the field of stochastic topology [26,8,25,1,13]. Our work here is a contribution to this area at the interface between probability theory and algebraic topology.
A cornerstone for statistical inference, beyond the availability of expectations, is the availability of distributional approximations. This paper establishes the first multivariate normal approximations for three important counting problems in stochastic topology; as these approximations are based on Stein's method, explicit bounds on the approximation errors are provided. Our starting point is the ubiquitous graph model G(n, p); a graph G chosen from this model has as its vertex set [n] = {1, 2, . . ., n}, and each of its (n choose 2) possible edges is included independently with probability p ∈ [0, 1]. It was established by Erdős and Rényi in [16] that p = log(n)/n is a sharp threshold for connectivity in G(n, p), in the sense that the following assertions hold for any random graph G ∼ G(n, p) and every arbitrarily small ε > 0: if p exceeds (1 + ε) · log(n)/n, then G is connected with high probability. Conversely, if p is smaller than (1 − ε) · log(n)/n, then G is disconnected with high probability.
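For readers coming from the computational side, the G(n, p) model and the connectivity property whose threshold it exhibits can be sketched in a few lines of Python; this is a minimal illustration, and the function names `sample_gnp` and `is_connected` are ours, not the paper's.

```python
import random

def sample_gnp(n, p, rng=None):
    """Sample G(n, p) as an edge set on [n] = {1, ..., n}: each of the
    (n choose 2) possible edges is included independently with probability p."""
    rng = rng or random.Random(0)
    return {(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)
            if rng.random() < p}

def is_connected(n, edges):
    """Depth-first search from vertex 1; the graph is connected iff the
    search reaches all n vertices."""
    adj = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, stack = {1}, [1]
    while stack:
        v = stack.pop()
        for u in adj[v] - seen:
            seen.add(u)
            stack.append(u)
    return len(seen) == n
```

Sampling repeatedly at p slightly above and below log(n)/n illustrates the Erdős-Rényi threshold empirically.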
A natural higher-order generalisation of G(n, p) is furnished by the random clique complex model X(n, p), whose constituent complexes L are constructed as follows. One first selects an underlying graph G ∼ G(n, p), and then deterministically fills out all k-cliques in G with (k − 1)-dimensional simplices for k ≥ 3. Higher connectivity is now measured by the Betti numbers β_k(L), which are ranks of the rational homology groups H_k(L; Q); in particular, β_0(L) equals the number of connected components of the underlying random graph G. In [27], Kahle proved the following far-reaching generalisation of the Erdős-Rényi connectivity result: for each k ≥ 1 and ε > 0, (1) if p exceeds an explicit threshold depending on k and ε, then β_k(L) = 0 with high probability; and moreover, (2) if p lies in an explicit intermediate window below this threshold, then β_k(L) ≠ 0 with high probability.
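The deterministic filling-in step admits a direct implementation: a (k + 1)-subset of the vertices spans a k-simplex exactly when every pair inside it is an edge. The following sketch (the name `clique_complex` is ours) builds all simplices up to a chosen dimension.

```python
from itertools import combinations

def clique_complex(vertices, edges, max_dim):
    """All simplices of the clique complex up to dimension max_dim:
    a (k + 1)-subset spans a k-simplex iff every pair inside it is an edge."""
    edge_set = {frozenset(e) for e in edges}
    simplices = {frozenset([v]) for v in vertices}
    for k in range(1, max_dim + 1):  # k = dimension of the candidate simplex
        for s in combinations(sorted(vertices), k + 1):
            if all(frozenset(pair) in edge_set for pair in combinations(s, 2)):
                simplices.add(frozenset(s))
    return simplices
```

For example, a triangle with a pendant edge yields four vertices, four edges, and a single 2-simplex.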
With this result in mind, we motivate and describe three random vectors pertaining to L ∼ X(n, p); the normal approximation of these three random vectors will be our focus in this paper. All three are denoted T = (T_1, . . ., T_d) for an integer d > 0.
Random Vector 1: Critical Simplex Counts. The computation of the Betti numbers β_k(L) begins with the chain complex · · · → C_{k+1} → C_k → C_{k−1} → · · · . Here C_k is a vector space whose dimension equals the number of k-simplices in L, while d_k : C_k → C_{k−1} is an incidence matrix encoding which (k − 1)-simplices lie in the boundary of a given k-simplex. These matrices satisfy the property that every successive composite d_k ∘ d_{k+1} equals zero, and β_k(L) is the dimension of the quotient vector space ker d_k / im d_{k+1}. Thus, one is required to diagonalise the matrices {d_k : C_k → C_{k−1}} via row and column operations, which is a straightforward task in principle. Unfortunately, Gaussian elimination on an m × m matrix incurs an O(m^3) cost, which becomes prohibitive when facing simplicial complexes built around large data sets [38]. The standard remedy is to construct a much smaller chain complex which has the same homology groups, and by far the most fruitful mechanism for achieving such homology-preserving reductions is discrete Morse theory [18,36,22,31].
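The homology computation just described can be sketched concretely. The snippet below (names ours) computes Betti numbers over the two-element field GF(2) rather than over Q, purely for simplicity of the rank computation; for the small torsion-free examples used to test it, the two coincide.

```python
from itertools import combinations

def rank_gf2(rows):
    """Rank of a 0/1 matrix over GF(2); each row is an integer bitmask."""
    pivots = {}  # highest set bit -> reduced row owning that pivot
    rank = 0
    for r in rows:
        while r:
            hb = r.bit_length() - 1
            if hb in pivots:
                r ^= pivots[hb]  # eliminate the current leading bit
            else:
                pivots[hb] = r
                rank += 1
                break
    return rank

def betti(complex_, k):
    """beta_k = dim C_k - rank d_k - rank d_{k+1}, computed over GF(2).
    complex_ is a set of frozensets (the simplices)."""
    def simplices(d):
        return sorted((s for s in complex_ if len(s) == d + 1), key=sorted)
    def boundary_rank(d):
        cols = {s: i for i, s in enumerate(simplices(d - 1))}
        rows = []
        for s in simplices(d):
            mask = 0
            for face in combinations(sorted(s), d):  # the (d-1)-faces of s
                mask |= 1 << cols[frozenset(face)]
            rows.append(mask)
        return rank_gf2(rows)
    n_k = len(simplices(k))
    rk = boundary_rank(k) if k > 0 else 0
    return n_k - rk - boundary_rank(k + 1)
```

A hollow triangle has one connected component and one loop; filling it in kills the loop.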
The key structure here is that of an acyclic partial matching, which pairs together certain adjacent simplices of L; the homology groups of L may be recovered from a chain complex whose vector spaces are spanned by the unpaired, or critical, simplices. One naturally seeks an optimal acyclic partial matching on L which admits the fewest possible critical simplices. Unfortunately, the optimal matching problem is computationally intractable to solve [24], even approximately [5], for large L. Our first random vector is obtained by letting T_k equal the number of critical k-simplices for a specific type of acyclic partial matching on L, called the lexicographical matching. Knowledge of this random vector serves to simultaneously quantify the benefit of using discrete Morse theoretic reductions on random simplicial complexes and to provide a robust null model by which to measure their efficacy on general (i.e., not necessarily random) simplicial complexes.
Random Vector 2: Link Simplex Counts. The link of a simplex t in L, denoted lk(t), consists of all simplices s for which the union s ∪ t is also a simplex in L and the intersection s ∩ t is empty. The link of t forms a simplicial complex in its own right; and if we restrict attention to the underlying random graph G, then the link of a vertex is precisely the collection of its neighbours. Therefore, the Betti numbers β_k(lk(t)) generalise the degree distribution for vertices of random graphs in two different ways: one can study neighbourhoods of higher-dimensional simplices by increasing the dimension of t, and one can examine higher-order connectivity properties by increasing the homological dimension k. The second random vector of interest to us here is obtained by letting T_k equal the number of k-simplices lying in the link of a fixed simplex t in L, given that t indeed is a simplex in the random complex. As far as we are aware, ours is the first work that studies this random vector. A different conditional distribution, which follows directly from results on subgraph counts in G(n, p), has been studied before; see Remark 5.1.
There are compelling reasons to better understand the combinatorics and topology of such links from a probabilistic viewpoint. For instance, the fact that the link of a k-simplex in a triangulated n-manifold is always a triangulated sphere of dimension (n − k − 1) has been exploited to produce canonical stratifications of simplicial complexes into homology manifolds [2,37]. Knowledge of simplex counts (and hence, Betti numbers) of links would therefore form an essential first step in any systematic study involving canonical stratifications of random clique complexes.
Random Vector 3: Total Simplex Counts. The strategy employed in Kahle's proof of the second assertion above involves first checking that the expected number of k-simplices in L ∼ X(n, p) is much larger than the expected number of simplices of dimensions k ± 1 whenever p lies in the range indicated by (2). Therefore, one may combine the Morse inequalities with the linearity of expectation in order to guarantee that the expected β_k(L) is nonzero; see [27, Section 4] for details. To facilitate more refined analysis and estimates of this sort, the third random vector we study in this paper is obtained by letting T_k equal the total number of k-dimensional simplices in L.
Since T_k is precisely the number of (k + 1)-cliques in G ∼ G(n, p), this random vector falls within the purview of generalised U-statistics. We extend results from [23] to show not only asymptotic distributional convergence but a stronger result: explicit non-asymptotic bounds on the approximation error. Several interesting problems can be seen as special cases; these include classical U-statistics [32,30], monochromatic subgraph counts of inhomogeneous random graphs with independent random vertex colours, and the number of overlapping patterns in a sequence of independent Bernoulli trials. To the best of our knowledge, this is the first multivariate normal approximation result with explicit bounds in which the sizes of the subgraphs are permitted to increase with n.
Main Results. The central contributions of this work are multivariate normal approximations for all three random vectors T described above. For the purposes of these introductory remarks, we restrict attention to the case where T is the vector of critical simplex counts; letting {Y_{i,j}}_{1≤i<j≤n} be a sequence of i.i.d. Bernoulli variables, the k-th component may be written explicitly in terms of these edge indicators. This variable, which we discuss in Section 4, arises naturally in stochastic topology [9,8] but has been poorly studied from a distributional approximation perspective. To the best of our knowledge, only the expected value of a closely-related random variable has been calculated (see [6, Section 8]). While there is no shortage of multivariate normal approximation theorems [17,40,35,12], the existing ones are not sufficiently fine-grained for our purposes. We therefore return to the pioneering work of Barbour, Karoński, and Ruciński [4], who proved a univariate central limit theorem (CLT) for a decomposable sum of random variables using Stein's method, treating the case of dissociated sums as a special case. Our approximation result, described below, forms a new extension of their ideas to the multivariate setting, and may be of independent interest.
Let n and d be positive integers. For each i ∈ [d] := {1, 2, . . ., d}, we fix an index set I_i ⊂ [n] × {i} and consider the union of disjoint sets I := ⋃_{i∈[d]} I_i. Associate to each s = (k, i) ∈ I a real centered random variable X_s and form, for each i ∈ [d], the sum W_i := ∑_{s∈I_i} X_s. Consider the resulting random vector W = (W_1, . . ., W_d) ∈ R^d. The following notion is a natural multivariate generalisation of the dissociated sum from [34]; see also [4]. DEFINITION 1.1. We call W a vector of dissociated sums if for each s ∈ I and j ∈ [d] there exists a dependency neighbourhood D_j(s) ⊂ I_j satisfying three criteria: (1) the difference W_j − ∑_{u∈D_j(s)} X_u is independent of X_s; (2) for each t ∈ I, the quantity W_j − ∑_{u∈D_j(s)} X_u − ∑_{v∈D_j(t)∖D_j(s)} X_v is independent of the pair (X_s, X_t); and finally, (3) X_s and X_t are independent whenever t ∉ ⋃_{j∈[d]} D_j(s).
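Criterion (3) can be made concrete in the setting of Random Vector 3. For the (k + 1)-clique counts in G(n, p), the summands are clique indicators indexed by vertex subsets, and two such indicators are independent precisely when the subsets share at most one vertex, since they then involve disjoint sets of edge variables; the dependency neighbourhood of s may thus be taken to consist of all subsets meeting s in at least two vertices. The exact covariance computation below is a minimal sketch (the function name `clique_cov` is ours).

```python
from math import comb

def clique_cov(k, m, p):
    """Exact covariance of the indicators that two k-subsets sharing m
    vertices both span cliques in G(n, p), each edge being present
    independently with probability p."""
    e = comb(k, 2)       # edges inside one k-subset
    shared = comb(m, 2)  # edges counted by both indicators
    return p ** (2 * e - shared) - p ** (2 * e)
```

The covariance vanishes exactly when m ≤ 1, i.e. when no edge variable is shared.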
Let W be a vector of dissociated sums as defined above. For each s ∈ I, by construction, the sets D_j(s), j ∈ [d], are disjoint (although for s ≠ t, the sets D_j(s) and D_j(t) may not be disjoint). We write D(s) = ⋃_{j∈[d]} D_j(s) for the disjoint union of these dependency neighbourhoods. With this preamble in place, we state our main result. THEOREM 1.2. Let h : R^d → R be any three times continuously differentiable function whose third partial derivatives are Lipschitz continuous and bounded. Consider a standard d-dimensional Gaussian vector Z ∼ MVN(0, Id_{d×d}). Assume that for all s ∈ I, we have E{X_s} = 0 and E|X_s|^3 < ∞. Then, for any vector of dissociated sums W ∈ R^d with a positive semi-definite covariance matrix Σ, the difference |E h(W) − E h(Σ^{1/2} Z)| is at most an explicit quantity B_{1.2}, expressed in terms of third moments of the X_s and the sizes of the dependency neighbourhoods. In the special case where each W_k is a sum of an equal number of i.i.d. random variables and the i.i.d. sequences are mutually independent, the bound in Theorem 1.2 is optimal with respect to the size n of the sum in each component. However, compared to the CLT from [17], the bound is not optimal in the length d of the vector W. In any event, the desired CLT for critical simplex counts follows as a corollary to Theorem 1.2. We state a simplified version of this result here and note that the full statement and proof have been recorded as Theorem 4.6 below. In the statement below, W ∈ R^d is an appropriately scaled and centered vector whose k-th component counts the number of critical simplices of dimension k for the lexicographical acyclic partial matching on L ∼ X(n, p). THEOREM 1.3. Let Z ∼ MVN(0, Id_{d×d}) and let Σ be the covariance matrix of W. Let h : R^d → R be a three times continuously differentiable function whose third partial derivatives are Lipschitz continuous and bounded.
Then there is a constant B_{1.3} > 0 independent of n and a natural number N_{1.3} such that the corresponding approximation bound holds for any n ≥ N_{1.3}; see Theorem 4.6 for the precise statement. En route to proving Theorem 1.3, we also establish the following properties, which are of direct interest in computational topology. Here we assume that p ∈ (0, 1) and k ∈ {1, 2, . . .} are constants.
(1) The expected number of critical k-simplices is one order of n smaller than the expected number of total k-simplices; see Lemma 4.2. (2) The variance of the number of critical k-simplices is at least of the order n^{2k}, as shown in Lemma 4.4; an upper bound of the same order can be proved similarly, and the variance of the total number of k-simplices is also of this order. (3) Knowing the expected value and the variance, one can prove concentration results using standard concentration inequalities, for example Chebyshev's inequality. This shows not only that the expected number of critical simplices is smaller than the number of all simplices, but also that large deviations from the mean are unlikely; the substantial improvement of one order of n is therefore not only expected but also likely. (4) For counting critical simplices to high accuracy in probability, it is not necessary to check every simplex: certain simplices have a very small chance of being critical, and can be safely ignored. The probability of this omission causing an error is vanishingly small asymptotically; see Proposition 4.5.
More details are provided in Remark 4.7.
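Item (3) of the list above can be made precise with a one-line application of Chebyshev's inequality. The following sketch uses only the orders of magnitude stated in items (1) and (2): the mean of T_k is of order n^k (one power of n below the total simplex count, which is of order n^{k+1}), and the variance is of order n^{2k}.

```latex
\Pr\bigl( |T_k - \mathbb{E}\,T_k| \ge \varepsilon\, n^{k+1} \bigr)
  \;\le\; \frac{\operatorname{Var}(T_k)}{\varepsilon^2\, n^{2k+2}}
  \;=\; O\!\left(n^{-2}\right)
  \qquad \text{for every fixed } \varepsilon > 0,
```

so deviations of the critical simplex count that are comparable in size to the total simplex count occur with vanishing probability.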
Related Work. Theorem 1.2 is not the first generalisation of the results in [4] to a multivariate setting; see for example [17,40]. The key advantage of our approach is that it allows for bounds which are non-uniform in each component of the vector W. This is useful when, for example, the numbers of summands in the components are of different orders, or when the sizes of the dependency neighbourhoods in each component are of different orders. The applications considered here are precisely of this type, where the non-uniformity of the bounds is crucial. Moreover, we do not require the covariance matrix Σ to be invertible, and can therefore accommodate degenerate multivariate normal distributions.
Another multivariate central limit theorem for centered subgraph counts in the more general setting of a random graph associated to a graphon can be found in [29]. That proof is based on Stein's method via a Stein coupling. Translating this result for uncentered subgraph counts would yield an approximation by a function of a multivariate normal. In [41], an exchangeable pair coupling led to [41, Proposition 2], which can be specialised to joint counts of edges and triangles; our approximation significantly generalises this result beyond the case where k ∈ {1, 2}. Several univariate normal approximation theorems for subgraph counts are available; recent developments in this area include [39], which uses Malliavin calculus together with Stein's method, and [15], which uses the Stein-Tikhomirov method.
Organisation. In Section 2 we prove our main approximation theorem using smooth test functions and extend the result to non-smooth test functions using a smoothing technique from [19]. In Section 3, we recall concepts from the theory of simplicial complexes, which we later use. In Section 4 we prove an approximation theorem for critical simplex counts of lexicographical matchings. Two technical computations required in this section have been consigned to the Appendix. In Section 5 we prove an approximation theorem for count variables of simplices that lie in the link of a fixed simplex. In Section 6 we introduce a slight generalisation of generalised U-statistics for which Theorem 1.2 gives a CLT with explicit bounds. We then apply the CLT to simplex counts in the random clique complex.
Acknowledgements. TT acknowledges funding from EPSRC studentship 2275810. VN is supported by the EPSRC grant EP/R018472/1. GR is funded in part by the EPSRC grants EP/T018445/1 and EP/R018472/1. The authors would like to thank Xiao Fang and Matthew Kahle for helpful discussions.

A Multivariate CLT for Dissociated Sums
Throughout this paper we use the following notation. Given positive integers n, m we write [m, n] for the set {m, m + 1, . . ., n} and [n] for the set [1, n]. Given a set X we write |X| for its cardinality and P(X) for its powerset; given a positive integer k, we write (X choose k) for the collection of k-element subsets of X. Throughout this section, W ∈ R^d is a vector of dissociated sums in the sense of Definition 1.1, with covariance matrix Σ whose entries are Σ_{ij} = Cov(W_i, W_j). Recall the Stein characterisation: a d-dimensional vector is distributed as MVN(0, Σ) if and only if the identity E[trace(Σ Hess f(Z)) − Z^T ∇f(Z)] = 0 holds for all twice continuously differentiable f : R^d → R for which the expectation exists. In particular, we will use the following result based on [35, Lemma 1 and Lemma 2]. As Lemma 1 and Lemma 2 in [35] are stated there only for infinitely differentiable test functions, we give the proof here for completeness.
LEMMA 2.1 (Lemma 1 and Lemma 2 in [35]). Fix n ≥ 2. Let h : R^d → R be n times continuously differentiable with Lipschitz n-th partial derivatives, and let Z ∼ MVN(0, Id_{d×d}). Then, if Σ ∈ R^{d×d} is symmetric positive semidefinite, there exists a solution f : R^d → R to the Stein equation (2.2) such that f is n times continuously differentiable and the stated derivative bounds hold for every k = 1, . . ., n. PROOF. Let h be as in the assertion. It is shown in Lemma 2.1 in [11], which is based on a reformulation of Eq. (2.20) in [3], that a solution of (2.2) for h is given by an explicit Gaussian smoothing integral. Since the n-th partial derivatives of h are Lipschitz, the derivative may be brought inside the integral, and it is then straightforward to see that the solution f is n times continuously differentiable.

The bound on the partial derivatives of f follows from an explicit representation of its k-th partial derivatives in terms of those of h, valid for any i_1, i_2, . . ., i_k; see, for example, Equation (10) in [35]. Taking the sup-norm on both sides and bounding the right-hand side of the equation gives the claimed estimate. Note that neither Lemma 2.1 nor indeed Theorem 1.2 requires the covariance matrix Σ to be invertible.
PROOF OF THEOREM 1.2. To prove Theorem 1.2, we replace w by W in Equation (2.2) and take the expected value on both sides. As a result, we aim to bound the expression (2.3), where f is a solution to the Stein equation (2.2) for the test function h. The variables {X_s | s ∈ I} are centered, and X_t is independent of the corresponding partial sums; we now use the decomposition of Σ_{ij} from (2.4) in the expression (2.3). For each pair (s, j) ∈ I × [d] and t ∈ D(s) we set D_j(t; s) = D_j(t) ∖ D_j(s) and define the associated partial sums as in (2.5). By Definition 1.1, W_j^s is independent of X_s, while W_j^{s,t} is independent of the pair (X_s, X_t). Next we decompose the right-hand side of (2.3) into the terms R_1, R_2, R_3. Here we recall that s = (k, i) denotes an index in I_i. As with the vector of dissociated sums W ∈ R^d itself, we can assemble these differences into random vectors: thus, W^s ∈ R^d is (W_1^s, . . ., W_d^s), and similarly W^{s,t} = (W_1^{s,t}, . . ., W_d^{s,t}). In the next three claims, we provide bounds on R_i for i ∈ [3].
CLAIM 2.2. The absolute value of the expression R_1 from (2.6) is bounded above by the stated quantity. PROOF. For each s ∈ I_i, it follows from (2.5) that W = U^s + W^s. Using the Lagrange form of the remainder term in Taylor's theorem, we obtain an expansion with some random θ_s ∈ (0, 1). Using this Taylor expansion in the expression for R_1, we get a four-term summand S_{i,s} for each i ∈ [d] and s ∈ I_i. The second and fourth terms cancel each other. Recalling that X_s is centered by definition and independent of W^s by Definition 1.1, the third term also vanishes. CLAIM 2.3. The absolute value of the expression R_2 from (2.7) is bounded above by the stated quantity. PROOF. Recall that U_j^s = ∑_{t∈D_j(s)} X_t and D(s) = ⋃_{j∈[d]} D_j(s). Fix s ∈ I and t ∈ D_j(s). Recall that, by (2.5), W^s = W^{s,t} + V^{s,t}. Using the Lagrange form of the remainder term in Taylor's theorem, we obtain an expansion with some random θ_{s,t} ∈ (0, 1). Using this Taylor expansion in the expression for R_2, we get a three-term summand S_{s,t} for each pair (s, t) ∈ I × D_j(s). Recalling that W^{s,t} is independent of the pair (X_s, X_t), the first and the last terms cancel each other and only the sum over k is left. Using the Lagrange form of the remainder term in Taylor's theorem once more, we obtain an expansion with some random ρ_{s,t} ∈ (0, 1). Now take any h ∈ H_d and let f : R^d → R be the associated solution from Lemma 2.1. Combining Claims 2.2 to 2.4 and using Lemma 2.1 yields the asserted bound. In most of our applications, the variables X_s are centered and rescaled Bernoulli random variables; hence, the following lemma is useful. LEMMA 2.5. Let ξ_1, ξ_2, ξ_3 be Bernoulli random variables with expected values µ_1, µ_2, µ_3 respectively. Let c_1, c_2, c_3 > 0 be any constants, and consider the variables X_i := c_i(ξ_i − µ_i) for i = 1, 2, 3. Then the stated moment bound holds. PROOF. Note that X_3 can take two values: −c_3µ_3 or c_3(1 − µ_3). As 0 ≤ µ_3 ≤ 1, the variable X_3 is bounded in absolute value by c_3. Applying the Cauchy-Schwarz inequality and a direct calculation of the second moments finishes the proof.

Non-smooth Test Functions.
Here we follow [29, Section 5.3] very closely to derive a bound on the convex set distance between a vector of dissociated sums W ∈ R^d with covariance matrix Σ and a target multivariate normal distribution Σ^{1/2} Z, where Z ∼ MVN(0, Id_{d×d}). The smoothing technique used here was introduced in [19]. However, a better (polylogarithmic) dependence on d could potentially be achieved using a recent result [20, Proposition 2.6], at the expense of larger constants. The recursive approach from [42,28] usually yields better dependence on n; however, it requires the target normal distribution to have an invertible covariance matrix. Since this property does not always hold in our applications of interest, we do not use the recursive approach here. To state our next result, let K be a class of convex sets in R^d. THEOREM 2.6. Consider a standard d-dimensional Gaussian vector Z ∼ MVN(0, Id_{d×d}). For any centered vector of dissociated sums W ∈ R^d with a positive semi-definite covariance matrix Σ and finite third absolute moments, the convex set distance between W and Σ^{1/2} Z admits an explicit bound, where the quantity B_{1.2} is as in Theorem 1.2.
PROOF. Fix A ∈ K and ε > 0, and define the ε-neighbourhood A^ε of A. Let H_{ε,A} := {h_{ε,A} : R^d → [0, 1]; A ∈ K} be a class of functions such that h_{ε,A}(x) = 1 for x ∈ A and h_{ε,A}(x) = 0 for x ∉ A^ε. Then, by [7, Lemma 2.1] as well as inequalities (1.2) and (1.4) from [7], a smoothing inequality on the supremum over A ∈ K holds for any ε > 0. Let f : R^d → R be a bounded Lebesgue measurable function, and for δ > 0 consider its Gaussian smoothing, where I_{A^{ε/4}} is the indicator function of the subset A^{ε/4} ⊆ R^d. By [19, Lemma 3.9], the function h_{ε,A} is bounded, has three continuous bounded partial derivatives, and its third partials are Lipschitz; moreover, explicit bounds on these derivatives hold. Since the resulting bound works for every ε > 0, we minimise it by optimising over ε.
The next result provides a simplification of Theorems 1.2 and 2.6 under the assumption that one uses bounds that are uniform in s, t, u ∈ I. Its proof follows immediately from rewriting the sum ∑_{s∈I} ∑_{t,u∈D(s)} in uniform terms. COROLLARY 2.7. We have the following two bounds: (1) under the assumptions of Theorem 1.2, a uniform version of the bound holds; (2) assuming the hypotheses of Theorem 2.6, the analogous uniform bound on the convex set distance holds. Here B_{2.7} is a sum over the index pairs (i, j), and α_{ij} is the largest value attained by |D_j(s)| over s ∈ I_i.

Simplicial Complex Preliminaries
3.1. First definitions. Firstly, we recall the notion of a simplicial complex [43, Ch 3.1]; these provide higher-dimensional generalisations of graphs and constitute data structures of interest across algebraic topology in general, as well as applied and computational topology in particular.
A simplicial complex L on a vertex set V is a set of nonempty subsets of V (i.e. ∅ ∉ L ⊆ P(V)) such that the following properties are satisfied: (1) for each v ∈ V the singleton {v} lies in L, and (2) if t ∈ L and s ⊂ t, then s ∈ L.
The dimension of a simplicial complex L is max_{s∈L} |s| − 1. Elements of a simplicial complex are called simplices. If s is a simplex, then its dimension is |s| − 1; a simplex of dimension k is called a k-simplex. Note that a one-dimensional simplicial complex is precisely a graph, with singletons as vertices and two-element subsets as edges.
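The two axioms above are easy to verify mechanically; the following is a minimal sketch (the function name is ours), representing a complex as a set of frozensets, so that axiom (1) is subsumed by downward closure for every vertex that appears in some simplex.

```python
from itertools import combinations

def is_simplicial_complex(L):
    """Check that L is a set of nonempty frozensets closed under taking
    nonempty subsets; this also forces {v} in L for every vertex v
    appearing in some simplex."""
    if any(len(s) == 0 for s in L):
        return False
    return all(frozenset(f) in L
               for s in L for r in range(1, len(s))
               for f in combinations(s, r))
```

Removing a single face from a valid complex breaks downward closure.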
Recall that G(n, p) is a random graph on n vertices in which each pair of vertices is connected with probability p, independently of every other pair. The random simplicial complex X(n, p) is the clique complex of the random graph G(n, p); this random model is studied in stochastic topology [25,27]. Note that t ∈ X if and only if the vertices of t span a clique in G. Thus, the simplices of X(n, p) are precisely the cliques of G(n, p).

Links.
The link of a simplex t in a simplicial complex L is the subcomplex lk(t) := {s ∈ L : s ∪ t ∈ L and s ∩ t = ∅}. If we view a graph as a one-dimensional simplicial complex, then the vertices are sets of the form {i} and edges are sets of the form {i, j}. For a vertex t = {v}, an edge of the form s = {v, u} will not be in the link of t because the condition s ∩ t = ∅ fails. If we pick s = {i, j} with v ∉ s, then s ∪ t ∈ L fails, since s ∪ t has three elements and the complex is one-dimensional. So there will be no edges in the link. However, if s = {u} and u is a neighbour of v, then s ∪ t ∈ L and s ∩ t = ∅. Hence the link of a vertex consists precisely of the other vertices that the vertex is connected to; the notion of the link generalises the idea of a neighbourhood in a graph.
Figure 1. Left: the link (highlighted in blue) of the vertex 1 (highlighted in red). Right: the link (highlighted in blue) of the edge {1, 2} (highlighted in red). The two-dimensional simplices are shaded in grey.
EXAMPLE 3.2. Now consider the simplicial complex depicted in Figure 1: it has 8 vertices, 12 edges, and 3 two-dimensional simplices that are shaded in grey. On the left hand side of the figure we see highlighted in blue the link of the vertex 1, which is highlighted in red. So lk({1}) = {{2}, {3}, {5}, {6}, {8}, {2, 3}, {2, 8}, {5, 6}}. On the right hand side of the figure we see highlighted in blue the link of the edge {1, 2}, which is highlighted in red. That is, lk({1, 2}) = {{3}, {8}}.
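The defining condition of the link translates directly into code; the following one-liner (the function name is ours) computes lk(t) from a complex stored as a set of frozensets.

```python
def link(L, t):
    """lk(t) = {s in L : s | t in L and s & t is empty}."""
    t = frozenset(t)
    return {s for s in L if (s | t) in L and not (s & t)}
```

On the full triangle {1, 2, 3} with all of its faces, the link of the vertex 1 is the opposite edge together with its endpoints, and the link of an edge is the opposite vertex.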

Discrete Morse theory.
A partial matching on a simplicial complex L is a collection Σ of pairs (s, t) of simplices of L with s ⊂ t and |t ∖ s| = 1, such that every simplex appears in at most one pair of Σ. A Σ-path (of length k ≥ 1) is a sequence of distinct simplices of L which alternates between the smaller and larger members of pairs in Σ, with each larger simplex containing the next smaller simplex in the sequence as a face of codimension one. Given a partial matching Σ on L, we say that a simplex t ∈ L is critical iff t does not appear in any pair of Σ.
For a one-dimensional simplicial complex, viewed as a graph, a partial matching Σ is comprised of elements (v; {u, v}) with v a vertex and {u, v} an edge. A Σ-path is then a sequence of distinct vertices and edges, i.e. a path in the graph-theoretic sense. We refer the interested reader to [18] for an introduction to discrete Morse theory, and to [36] to see how it is used to reduce computations in the persistent homology algorithm. In this work we aim to understand how much improvement one is likely to get on a random input when using a specific type of acyclic partial matching, defined below. DEFINITION 3.3. Let L be a simplicial complex and assume that the vertices are ordered by [n] = {1, . . ., n}. For each simplex s ∈ L define I_L(s) := {i ∈ [n] : i < min(s) and s ∪ {i} ∈ L}.

Now consider the pairings
s ↔ s ∪ {i}, where i = min I_L(s) is the smallest element of the set I_L(s), defined whenever I_L(s) ≠ ∅. We call this the lexicographical matching. Due to the min I_L(s) construction, the indices are decreasing along any Σ-path, and hence no Σ-path can form a cycle; this shows that the lexicographical matching is indeed an acyclic partial matching on L. EXAMPLE 3.4. Consider the simplicial complex L depicted in Figure 2. The complex has 5 vertices, 6 edges, and one two-dimensional simplex, which is shaded in grey. The red arrows show the lexicographical matching on this simplicial complex: there is an arrow from a simplex s to t iff the pair (s, t) is part of the matching. More explicitly, the lexicographical matching on L is Σ = {({2}, {1, 2}), ({3}, {2, 3}), ({4}, {1, 4}), ({5}, {3, 5}), ({4, 5}, {3, 4, 5})}.
Note that {3, 4} cannot be matched because the set I_L({3, 4}) is empty. Also, in any lexicographical matching, {1} is always critical: there are no vertices with a smaller label, and hence the set I_L({1}) is empty. So under this matching there are two critical simplices, {1} and {3, 4}, highlighted in blue in the figure. Hence, if we were computing the homology of this complex, considering only two simplices would be sufficient instead of all 12 which are in L, a significant improvement.
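Definition 3.3 and Example 3.4 translate directly into code. The sketch below (the function name is ours) pairs each simplex s with s ∪ {min I_L(s)} whenever I_L(s) is nonempty; on the complex of Figure 2 (vertices 1 through 5, edges {1,2}, {2,3}, {1,4}, {3,5}, {4,5}, {3,4}, and the triangle {3,4,5}) it reproduces the matching Σ and the two critical simplices of the example.

```python
def lexicographical_matching(L):
    """Pair each simplex s with s | {min I_L(s)}, where
    I_L(s) = {i < min(s) : s | {i} in L}; unpaired simplices are critical."""
    vertices = sorted({v for s in L for v in s})
    matching = set()
    for s in L:
        I = [i for i in vertices if i < min(s) and (s | {i}) in L]
        if I:
            matching.add((s, s | frozenset({min(I)})))
    matched = {x for pair in matching for x in pair}
    return matching, set(L) - matched
```

One can check by hand (as in the example) that no simplex ends up in two pairs, so this naive pairing really is a partial matching.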

Critical Simplex Counts for Lexicographical Morse Matchings
Now we attend to our motivating problem: critical simplex counts. Consider the random simplicial complex X(n, p). In this section we study the joint distribution of critical simplices in different dimensions with respect to the lexicographical matching on X(n, p). We start with the following lemma, an immediate consequence of Definition 3.3, which allows us to write down the variables of interest in terms of the edge indicators. LEMMA 4.1. Let L be a simplicial complex, and consider the lexicographical matching on L. Then t ∈ L matches with one of its cofaces (i.e. s ∈ L with |s| − |t| = 1 and t ⊂ s) iff t ∪ {j} ∈ L for some j < min(t). Also, t ∈ L matches with one of its faces (i.e. s ∈ L with |t| − |s| = 1 and s ⊂ t) iff (t ∖ {min(t)}) ∪ {j} ∉ L for all j < min(t).
For any pair of integers 1 ≤ i < j ≤ n let Y i,j := 1 ({i, j} ∈ X(n, p)) be the edge indicator.
Fix s ∈ C_k. Define the variables X_s^+ := 1(s matches with its coface, given it is a simplex) and X_s^− := 1(s matches with its face, given it is a simplex). The events indicated by these two variables are disjoint.
By Lemma 4.1, both X_s^+ and X_s^− can be expressed in terms of the edge indicators, where s^− := s ∖ {min(s)}. The random variable of interest, counting the number of (k − 1)-simplices that are critical under the lexicographical matching, is then the sum T_k recorded in (4.1). Note that this random variable does not fit into the framework of generalised U-statistics, which we discuss in Section 6, because the summands in T_k depend not only on the variables indexed by the subset s.

Moments.
LEMMA 4.2. For any 1 ≤ k ≤ n − 1, the expectation of T_k admits an explicit closed form. PROOF. The computation is a direct expansion of the indicators. In this example, bounding the variance is not immediate. The proofs of the following Lemmas 4.3 and 4.4 are long (and not particularly insightful) calculations, which are deferred to the Appendix. LEMMA 4.3. For any integer 1 ≤ k ≤ n − 1, an explicit formula for the variance of T_k holds, in the notation recorded there. LEMMA 4.4. For a fixed integer 1 ≤ k ≤ n − 1 and p ∈ (0, 1), there is a constant C_{p,k} > 0 independent of n and a natural number N_{p,k} such that for any n ≥ N_{p,k} the variance of T_k is at least C_{p,k} n^{2k}. In Lemma 4.4 the constant could have been made explicit at the expense of an even longer calculation.
Just knowing the expectation and the variance can already give us some information about the variable; for example, we obtain the following proposition, which shows that considering only a subset of the simplices already gives a good approximation to the critical simplex counts. We recall the notation f = ω(g), meaning that f/g → ∞ as n → ∞.
PROPOSITION 4.5. Let T^K_{k+1} denote the contribution to T_{k+1} from simplices whose minimal vertex is at most K. If K = ω(ln^{1+ε}(n)) for some ε > 0, then the variable T_{k+1} − T^K_{k+1} vanishes with high probability, provided that p and k stay constant.
PROOF. A similar calculation to that for Lemma 4.2 gives an explicit bound on the expectation of T_{k+1} − T^K_{k+1}. Using Markov's inequality, we obtain a bound which asymptotically vanishes as long as K = ω(ln^{1+ε}(n)).

Approximation theorem.
For i ∈ [d], recall the random variable counting the i-simplices in X(n, p) that are critical under the lexicographical matching, as given in (4.1). We write I_i for the i-th index set, and let W ∈ R^d be the correspondingly centered and rescaled vector. For bounds that asymptotically go to zero in this example, we use Theorems 1.2 and 2.6 directly: the uniform bounds from Corollary 2.7 are not fine enough here. THEOREM 4.6. Let Z ∼ MVN(0, Id_{d×d}) and let Σ be the covariance matrix of W.
(1) Let h ∈ H_d. Then there is a constant B_{4.6.1} > 0 independent of n and a natural number N_{4.6.1} such that for any n ≥ N_{4.6.1} we have (2) Let K be the class of convex sets in R^d. Then there is a constant B_{4.6.2} > 0 independent of n and a natural number N_{4.6.2} such that for any n ≥ N_{4.6.2} we have PROOF. It is clear that W satisfies the conditions of Theorems 1.2 and 2.6 with the dependency neighbourhoods D_j(s) = {(ψ, j) ∈ I_j : |φ ∩ ψ| ≥ 1} for any s = (φ, i) ∈ I_i. We apply Theorems 1.2 and 2.6. For the bounds on the quantity B_{1.2} from Theorems 1.2 and 2.6 we use Lemma 2.5 and Lemma 4.4. We write C for an unspecified positive constant that does not depend on n. Also, we assume here that n is large enough for the bound in Lemma 4.4 to apply. Let Then we have: REMARK 4.7. The relevance of understanding the number of critical simplices in the context of applied and computational topology is as follows. We assume that p ∈ (0, 1) and k ∈ {1, 2, . . .} are constants.
(1) As seen in Lemma 4.2, the expected number of critical k-simplices under the lexicographical matching is one power of n smaller than the total number of k-simplices in X(n, p). (2) In light of our approximation Theorem 4.6, we also know that the (rescaled) deviations from the mean are approximately normal, and the bounds are of the same order in n as those for the approximation of all simplex counts in X(n, p) given in Corollary 6.6. Knowing the expectation and the variance from Lemmas 4.2 and 4.3, one can apply concentration inequalities, for example Chebyshev's inequality, to show that the number of critical simplices concentrates around its mean. Hence, because of this concentration of measure, the computational improvements resulting from the lexicographical matching are likely substantial in X(n, p). (3) By Proposition 4.5, with high probability no k-simplex s ∈ X(n, p) with min(s) = ω(ln^{1+ε}(n)), for any fixed ε > 0, is critical.

Simplex Counts in Links
Consider a random simplicial complex X(n, p). For 1 ≤ i < j ≤ n define the edge indicator In this section we study the count of (k − 1)-simplices that would lie in the link of a fixed subset t ⊆ [n] if that subset spanned a simplex in X(n, p). Given that t is a simplex, the variable counts the number of (k − 1)-simplices in lk(t). Thus, the random variable of interest is Note that the product ∏_{i∈s, j∈t} Y_{i,j} ensures that t ∪ s is a simplex whenever t spans a simplex.
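The count T^t_k above can be sketched directly from its defining product of edge indicators and cross-checked against an independent construction of the link: restrict the clique complex to the common neighbours of t and count (k − 1)-simplices there. The function names below are ours; the two counts must agree for every edge configuration.

```python
import itertools
import random

def sample_edges(n, p, seed=0):
    """Symmetric 0/1 edge indicators Y[i, j] for G(n, p) on {1, ..., n}."""
    rng = random.Random(seed)
    Y = {}
    for i, j in itertools.combinations(range(1, n + 1), 2):
        Y[(i, j)] = Y[(j, i)] = int(rng.random() < p)
    return Y

def T_link(Y, n, t, k):
    """Sum over k-subsets s disjoint from t of the product of edge
    indicators inside s and between s and t (the summand of T^t_k)."""
    rest = sorted(set(range(1, n + 1)) - set(t))
    total = 0
    for s in itertools.combinations(rest, k):
        inside = all(Y[(i, j)] for i, j in itertools.combinations(s, 2))
        across = all(Y[(i, j)] for i in s for j in t)
        total += int(inside and across)
    return total

def link_count_direct(Y, n, t, k):
    """Independent check: count (k-1)-simplices of the clique complex
    restricted to the common neighbours of t."""
    common = [v for v in range(1, n + 1)
              if v not in t and all(Y[(v, j)] for j in t)]
    return sum(
        all(Y[(i, j)] for i, j in itertools.combinations(s, 2))
        for s in itertools.combinations(common, k))
```

A subset s contributes to T_link exactly when s lies in the common neighbourhood of t and spans a clique, so the two functions agree identically, whether or not t itself spans a simplex.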
REMARK 5.1. The random variable T^t_k does not fit into the framework of generalised U-statistics, which we will discuss in Section 6, because the summands depend not only on the variables indexed by the subset s, and because we sum not over all subsets s but only over those that do not intersect t.
Moreover, note that given the number of vertices of the link of a simplex t, the conditional distribution of the link of t is again X(n′, p), where n′ is a random variable equal to the number of vertices in the link. If one is interested in this conditional distribution, the results of Section 6 apply. However, in this section we study the number of simplices in the link of t given that t is a simplex, rather than given the number of vertices of the link of t. Such a random variable behaves differently from the simplex counts in X(n, p) studied in Section 6. For example, the summands of T^t_k have a different dependence structure compared to the summands of T_k from Equation (6.4). As a result, the approximation bounds are of a different order.
It is natural to ask whether the results obtained in this section follow from those of Section 6 below. This might well be the case, but the answer is not straightforward. One could derive an approximation for the number of simplices in lk(t) given the number of vertices in the link; the variable T^t_k could then be approximated by a mixture, induced by the distribution of the number of vertices in the link (which is binomial). However, applying this approach naïvely yields bounds that do not converge to zero. While it is certainly possible that a different approach would succeed, we prefer not to rely on Section 6 and prove the approximation directly.

Moments.
It is easy to see that for any positive integer k and t ⊆ [n], since there are (n−|t| choose k+1) choices of s ∈ C_{k+1} such that s ∩ t = ∅. Next we derive a lower bound on the variance. LEMMA 5.2. For any fixed 1 ≤ k ≤ n − 1 and t ⊆ [n] we have: PROOF. First let us calculate Cov(T^t_{k+1}, T^t_{l+1}). For fixed subsets s ∈ C_{k+1} and u ∈ C_{l+1}, if |s ∩ u| = 0, then the corresponding variables ∏_{i≠j∈s} Y_{i,j} ∏_{i∈s, j∈t} Y_{i,j} and ∏_{i≠j∈u} Y_{i,j} ∏_{i∈u, j∈t} Y_{i,j} are independent and so have zero covariance.
For 1 ≤ m ≤ l + 1, the number of pairs of subsets s ∈ C_{k+1} and u ∈ C_{l+1} with |s ∩ u| = m is Since each summand is non-negative, we lower bound by the m = 1 summand. Taking l = k completes the proof.

Approximation theorem.
For a multivariate normal approximation of the counts given in Equation (5.1), we write Then we have the following approximation theorem. THEOREM 5.3. Let Z ∼ MVN(0, Id_{d×d}) and let Σ be the covariance matrix of W^t.
(1) Let h ∈ H_d. Then (2) Let K be the class of convex sets in R^d. Then Here PROOF. It is clear that W^t satisfies the conditions of Corollary 2.7 with the dependency neighbourhoods D_j(s) = {(ψ, j) ∈ I_j : |φ ∩ ψ| ≥ 1} for any s = (φ, i) ∈ I_i. So we aim to bound the quantity B_{2.7} from the corollary.
Given φ ∈ C^t_{i+1} and m ≤ min(i + 1, j + 1), there are giving a bound for α_{ij}. For a bound on β_{ijk}, applying Lemma 2.5, for any i, j, k ∈ [d] and s (5.3) (5.4) Now we apply Lemma 5.2 and get Taking both sides of the inequality to the power −1/2, we get for any i ∈ [d] (5.5) Using Equations (5.2)–(5.5) to bound B_{2.7} from Corollary 2.7, we get: . By Stirling's approximation, if p ∈ (0, 1) is a constant, then max(k, |t|) = Ω(ln^{1+ε}(n)) for any positive ε forces the expectation to go to 0 asymptotically. Hence, by Markov's inequality, with high probability there are no k-simplices in the link of t as long as max(k, |t|) is of order ln^{1+ε}(n) or larger, for any ε > 0 and constant p.
Recall that in Theorem 5.3 we count all simplices up to dimension d in the link of t. Note that if max(d², d|t|) = O(ln^{1−ε}(n)) for any ε > 0, then the bounds in Theorem 5.3 tend to 0 as n tends to infinity, as long as p ∈ (0, 1) stays constant. In particular, if d is a constant, Theorem 5.3 gives an approximation for all sizes of t for which the approximation is needed.

Simplex Counts in X(n, p)
In this section we study the simplex counts in X(n, p) or, equivalently, the clique counts in G(n, p). In order to do that, we prove a multivariate normal approximation theorem for generalised U-statistics, which might be of independent interest. The approximation theorem for simplex counts in X(n, p) then follows as a special case.
Here we consider generalised U-statistics, which were first introduced in [23]. We expand the notion slightly by considering independent but not necessarily identically distributed variables instead of i.i.d. variables.
Let {ξ_i}_{1≤i≤n} be a sequence of independent random variables taking values in a measurable set X, and let {Y_{i,j}}_{1≤i<j≤n} be an array of independent random variables taking values in a measurable set Y, independent of {ξ_i}_{1≤i≤n}. We use the convention that Y_{i,j} = Y_{j,i} for any i < j. For example, one can think of ξ_i as a random label of a vertex i in a random graph in which Y_{i,j} is the indicator for the edge connecting i and j. Given a subset s ⊆ [n], we write ξ_s and Y_s for the collections of variables indexed by s and by the pairs within s, respectively.

6.1. The First Approximation Theorem. Let {k_i}_{i∈[d]} be a collection of positive integers, each being at most n, and for each i ∈ [d] let S_{n,k_i}(f_i) be the associated generalised U-statistic, standardised to give W_i. By construction, W_i has mean 0 and variance 1. ASSUMPTION 6.2. We assume that (1) For any i ∈ [d] there is some α_i > 0 such that for all s, t ∈ I_i, the variables (2) There is β ≥ 0 such that for any i, j, l ∈ [d] and any s ∈ I_i, t ∈ I_j, u ∈ I_l we have (3) The random variables X_s have finite absolute third moments.
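The basic object S_{n,k}(f), a sum of f over all k-subsets with f seeing the vertex variables ξ_s and the pair variables Y_s, can be sketched as follows. The particular choice of f below, an indicator of "monochromatic triangles", is a hypothetical example of ours, chosen only because it genuinely uses both components.

```python
import itertools

def generalised_u_statistic(xi, Y, k, f):
    """S_{n,k}(f): sum of f over all k-subsets s of {0, ..., n-1},
    where f sees the vertex variables xi restricted to s and the
    pair variables Y restricted to pairs inside s."""
    n = len(xi)
    total = 0.0
    for s in itertools.combinations(range(n), k):
        xi_s = [xi[i] for i in s]
        Y_s = {(i, j): Y[(i, j)] for i, j in itertools.combinations(s, 2)}
        total += f(xi_s, Y_s)
    return total

# Hypothetical kernel: indicator that s spans a triangle whose three
# vertex labels all coincide (a "monochromatic triangle" count).
def f_mono_triangle(xi_s, Y_s):
    return float(all(Y_s.values()) and len(set(xi_s)) == 1)
```

On the labelled graph with vertices 0, 1, 2, 3, labels (0, 0, 0, 1), and all edges present except {2, 3}, the only triangle with equal labels is {0, 1, 2}, so the statistic equals 1.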
The first assumption is not necessary but very convenient: we use it to derive a lower bound for the variance σ²_i. It holds in a variety of settings, for example for subgraph counts in a random graph. A normal approximation theorem can still be proven in our framework when the assumption does not hold, provided a sufficiently strong lower bound for the variance is obtained in a different way. Similarly, we use the second assumption to get a convenient bound on mixed moments. However, depending on the particular question at hand, one might want to use a bound on mixed moments that is not uniform in i, j, l, and sometimes even one that is not uniform in s, u, v. We discussed such an example (which does not fit into the framework of generalised U-statistics) in Section 4. In this section, in order to maintain the generality and simplicity of the proofs, we work under Assumption 6.2. In [23, Theorem 6] it is assumed that all summands in the generalised U-statistic have finite second moment, and that the sums admit a particular decomposition which is not easily translatable to our framework. In contrast to [23], we obtain a non-asymptotic bound on the normal approximation, as follows. THEOREM 6.3. Let Z ∼ MVN(0, Id_{d×d}) and let W with covariance matrix Σ satisfy Assumption 6.2.
(1) Let h ∈ H_d. Then (2) Let K be the class of convex sets in R^d. Then Here, PROOF. Note that if s = (φ, i) ∈ I_i and u = (ψ, j) ∈ I_j are chosen such that φ ∩ ψ = ∅, then the corresponding variables X_s and X_u are independent, since f_i(ξ_φ, Y_φ) and f_j(ξ_ψ, Y_ψ) do not share any random variables from the collections {ξ_i}_{1≤i≤n} and {Y_{i,j}}_{1≤i<j≤n}. Hence, if for any s = (φ, i) ∈ I_i we set D_j(s) = {(ψ, j) ∈ I_j : |φ ∩ ψ| ≥ 1}, then W satisfies the assumptions of Corollary 2.7. It remains to bound the quantity B_{2.7}.
First, to find α_{ij} as in Corollary 2.7: given φ ∈ C_{k_i} and m with k_i, k_j ≥ m, there are subsets ψ ∈ C_{k_j} such that |φ ∩ ψ| = m. Note that Using Assumption 6.2, for any i, j, l ∈ [d] and s To take care of the variance terms, we lower bound the variance using Assumption 6.2; Here the second-to-last inequality follows by keeping only the term for m = 1. Now we take both sides of the inequality to the power −1/2 to get that for any i ∈ [d]

Approximation Theorem with no Variables {ξ_i}.
Next we consider the special case in which the functions in Definition 6.1 depend only on the second component, so that the sequence {ξ_i}_{i∈[n]} can be ignored. Continuing with the same notation, we want to understand the joint distribution of S_{n,k_1}(f_1), S_{n,k_2}(f_2), . . ., S_{n,k_d}(f_d). However, we add an additional assumption. ASSUMPTION 6.4. We assume that the functions f_i depend only on the variables Y_{i,j} for 1 ≤ i < j ≤ n. That is, we can write f_i as a function of Y_s alone. Such functions appear naturally, for example, when counting subgraphs in an inhomogeneous Bernoulli random graph. A detailed example of such a generalised U-statistic is worked out in Section 6.3.
In this case, we can adapt the previous theorems slightly and get improved bounds. We still work under Assumption 6.2. The key difference is that the dependency neighbourhoods become smaller: now the subsets need to overlap in at least 2 elements for the corresponding summands to share at least one variable Y_{i,j} and hence become dependent. This makes both the variance and the size of the dependency neighbourhoods smaller. In the context of Theorem 1.2, the tradeoff works out in our favour to give smaller bounds, as follows. For any s = (φ, i) ∈ I_i we set D_j(s) = {(ψ, j) ∈ I_j : |φ ∩ ψ| ≥ 2}, so that W, under the additional Assumption 6.4, satisfies the assumptions of Corollary 2.7.
In this case, we can adjust Equations (6.1) and (6.3). The proofs are exactly the same as before, the only difference being that when we sum over m, we start at m = 2 instead of m = 1. THEOREM 6.5. Consider W satisfying Assumptions 6.2 and 6.4. Let Z ∼ MVN(0, Id_{d×d}) and let Σ be the covariance matrix of W. Here PROOF. Equation (6.1) becomes Equation (6.3) becomes Using the adjusted bounds in Corollary 2.7 gives the result.

Approximation Theorem for Simplex Counts.
In this section we apply Theorem 6.5 to approximate simplex counts. Consider G ∼ G(n, p). For 1 ≤ x < y ≤ n let Y_{x,y} := 1(x ∼ y) be the edge indicator. We are interested in the (i + 1)-clique count in G(n, p) or, equivalently, the i-simplex count in X(n, p), given by Let Y_{i+1} = {0, 1}^{(i+1 choose 2)} and let f_i : Y_{i+1} → R be the function Then the associated generalised U-statistic S_{n,i+1}(f_i) equals the (i + 1)-clique count T_{i+1}, as given by Equation (6.4). To apply Theorem 6.5 we need to center and rescale our variables. It is easy to see that E{f_i(Y_φ)} = p^{(i+1 choose 2)} for φ ∈ C_{i+1}. Just like in Section 6.1, we let I_i := C_{i+1} × {i}, and for s = (φ, i) ∈ I_i we define X_s to be the summand f_i(Y_φ) − p^{(i+1 choose 2)}, rescaled so that W_i = Σ_{s∈I_i} X_s has unit variance. The vector of interest is then W = (W_1, W_2, . . ., W_d) ∈ R^d. This brings us to the next approximation theorem. COROLLARY 6.6. Let Z ∼ MVN(0, Id_{d×d}) and let Σ be the covariance matrix of W.
(1) Let h ∈ H_d. Then |Eh(W) − Eh(ΣZ)| ≤ (2) Let K be the class of convex sets in R^d. Then sup_{K∈K} |P(W ∈ K) − P(ΣZ ∈ K)| ≤ B_{6.6} n^{−1/4}. Here PROOF. Firstly, observe that for any φ, ψ ∈ C_{i+1} with |φ ∩ ψ| ≤ 1 the covariance vanishes, while if |φ ∩ ψ| ≥ 2 the covariance is non-zero, and we have For s = (φ, i) . Then by Lemma 2.5 we get: ; Since , we see that Assumption 6.2 holds. Assumption 6.4 also holds, and therefore we can apply Theorem 6.5; taking β = p^{−1} − 1 finishes the proof. REMARK 6.7. It is easy to show that with high probability there are no large cliques in G(n, p) for constant p < 1. Indeed, the expectation of the number of k-cliques is (n choose k) p^{(k choose 2)}. By Stirling's approximation, k = Ω(ln^{1+ε}(n)) for any positive ε forces this expectation to go to 0 asymptotically. Hence, by Markov's inequality, with high probability there are no cliques of order ln^{1+ε}(n) or larger, for any ε > 0. For cliques of order larger than ln^{1/2}(n) and fixed p, a Poisson approximation might be more suitable.
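The clique-count construction above admits a direct sketch: f_i is the product of the (i + 1 choose 2) edge indicators of φ, the summands are centred by p^{(i+1 choose 2)}, and summing the centred summands recovers T_{i+1} − (n choose i+1) p^{(i+1 choose 2)}. We omit the variance normalisation (which would require Lemma 4.3-style calculations) and only check the centring identity; the function names are ours.

```python
import itertools
import random

def sample_Y(n, p, seed=0):
    """Edge indicators of G(n, p), keyed by sorted vertex pairs."""
    rng = random.Random(seed)
    return {e: int(rng.random() < p)
            for e in itertools.combinations(range(1, n + 1), 2)}

def f_clique(y_values):
    """f_i: product of the (i+1 choose 2) edge indicators of a subset."""
    out = 1
    for y in y_values:
        out *= y
    return out

def clique_count(Y, n, size):
    """T_size = S_{n,size}(f): number of cliques on `size` vertices."""
    return sum(
        f_clique([Y[e] for e in itertools.combinations(phi, 2)])
        for phi in itertools.combinations(range(1, n + 1), size))

def centred_count(Y, n, size, p):
    """Sum of the centred summands f(Y_phi) - p^(size choose 2); dividing
    by the standard deviation of the sum would give W_i."""
    mean = p ** (size * (size - 1) // 2)
    return sum(
        f_clique([Y[e] for e in itertools.combinations(phi, 2)]) - mean
        for phi in itertools.combinations(range(1, n + 1), size))
```

By construction, centred_count equals clique_count minus (n choose size) times p^{(size choose 2)}, which is the expectation computed in the proof above.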
Recall that in Corollary 6.6 the size of the maximal clique we count is d + 1. Note that if d = O(ln^{1/2−ε}(n)) for any ε > 0, then the bounds in Corollary 6.6 tend to 0 as n tends to infinity, as long as p ∈ (0, 1) stays constant. This might seem quite small, but in light of there being no cliques of order ln^{1+ε}(n) with high probability, it is meaningfully large. REMARK 6.8. Note that in Corollary 6.6 we use the multivariate normal distribution with covariance Σ, which is the covariance of W for finite n; it differs from the limiting covariance, as mentioned in [41]. To approximate W with the limiting distribution, one could proceed in the spirit of [41, Proposition 3] in two steps: use the existing theorems to approximate W with ΣZ, and then approximate ΣZ with Σ_L Z, where Σ_L is the limiting covariance, which is non-invertible, as observed in [23]. REMARK 6.9. Corollary 6.6 generalises the result [41, Proposition 2] beyond the case d = 2, and we get a bound of the same order in n. [29, Theorem 3.1] considers centered subgraph counts in a random graph associated to a graphon. If we take the graphon to be constant, the associated random graph is just G(n, p). Compared to [29, Theorem 3.1], we place weaker smoothness conditions on our test functions. However, we make use of the special structure of cliques, whereas [29, Theorem 3.1] applies to arbitrary centered subgraph counts. Translating [29, Theorem 3.1] into a result for uncentered subgraph counts, as we provide here in the special case of clique counts, is not trivial for general d.
It should be possible, however, to extend our results, using the same abstract approximation theorem, beyond the random clique complex to Linial–Meshulam random complexes [33], or even to the more general multiparameter Costa–Farber random complexes [13]. We shall consider this in future work.
For the rest of the proof, when calculating probabilities, we assume w.l.o.g. that min(s) ≤ min(t). Then we have for Y^+_s Y^+_t: Recall the notation [a, b] = {a, a + 1, . . ., b} for two positive integers a ≤ b. Setting q_{s,t} : This strategy of splitting the product Y^+_s Y^+_t into three products of independent variables, only one of which depends on Z_s Z_t, works in exactly the same way for the variables We write i = min(s), j = min(t), and q instead of q_{s,t}. Also, we set Using the described strategy we get: Now we are ready to calculate the covariance: Next we consider the two covariance sums in (A.1) separately. First let us assume that min(s) = min(t). Given i, j ∈ Next we argue that To see this, assume i < j. Note that to pick a pair (s, t) ∈ Γ^+_{k+1}(i, j, m, q) with min(s) = i and min(t) = j, we need to pick the 2k − m vertices in s ∪ t. Firstly, we pick the vertices that are not included in s ∩ [min(s), min(t) − 1]; this amounts to choosing 2k − m − (q − 1) vertices out of n − j. Then we decide which of the vertices that we have just picked will lie in t. This means we further need to choose k out of 2k + 1 − m − q vertices. Then we choose m − 1 out of the k vertices of t to lie in s ∩ t (under the assumption that we already have min(t) ∈ s). Finally, we choose the set s ∩ [min(s), min(t) − 1], which amounts to picking q − 1 vertices out of j − i + 1 possible choices. If any of the binomial coefficients are negative, we set them to 0. The case j < i is analogous.

FIGURE 1. Left: the link (highlighted in blue) of the vertex 1 (highlighted in red). Right: the link (highlighted in blue) of the edge {1, 2} (highlighted in red). The two-dimensional simplices are shaded in grey.

FIGURE 2. Lexicographical matching given by the red arrows. Critical simplices are highlighted in blue.

2.1. Smooth Test Functions.
2. For each s ∈ I we denote by D(s) ⊂ I the disjoint union ⊔_{j=1}^{d} D_j(s). For each triple (s, t, j) ∈ I² × [d] we write the set difference D_j(t) \ D_j(s) as D_j(t; s), with D(t; s) ⊂ I denoting the disjoint union of such differences over j ∈ [d].