1 Introduction

In the causal set approach to quantum gravity, the Lorentzian manifold used in general relativity and other theories of gravity to represent the spacetime geometry is simply the large-scale view of a locally finite partially ordered set, a causal set [1]. Part of the motivation for this approach is the observation [2, 3] that if one chooses a sufficiently dense set of uniformly distributed random points in a spacetime manifold, one can recover the spacetime geometry on scales larger than the one determined by the point density simply by using the causal ordering of the points. In this view, then, spacetime is a purely combinatorial structure: a collection \(\mathscr {C}\) of events with a relation \(p\prec q\) which is reflexive, transitive and antisymmetric, and which makes \(\mathscr {C}\) locally finite in the sense that for any \(p, q\in \mathscr {C}\) the interval or Alexandrov set \(I(p,q):= \{r\mid p\prec r\prec q\}\) has only a finite number of elements. For the purposes of this paper, without loss of generality, from now on we will assume that \(\mathscr {C}\) is actually of finite cardinality and corresponds to a spacetime region of finite volume.

Given a Lorentzian manifold without closed causal curves, any choice of a finite number N of points in it, with the partial order induced by causality, will produce a causal set. If the points are randomly distributed with uniform density, the causal set samples the continuum geometry uniformly and can be seen as a discretization of it at the volume scale given by the inverse density, provided the manifold has no length scales of the order of, or smaller than, the average distance between the points. On the other hand, given a large number N, the vast majority of the causal sets that can be constructed out of N elements cannot arise as discretizations of Lorentzian manifolds. One of the most general questions causal set theory must address, then, is what makes it possible, in this approach, for the causal sets arising out of the dynamics to be such that they are seen as Lorentzian manifolds at large scales. Some preliminary steps towards answering this question consist in showing that if a large causal set is manifoldlike then the manifold is approximately unique in an appropriate sense [3, 4], establishing criteria for manifoldlikeness of causal sets, possibly involving their embeddability in Lorentzian manifolds [5,6,7,8,9], and identifying procedures for using the structure of a manifoldlike causal set to determine properties of the corresponding manifold, such as its dimensionality, topology or curvature [10,11,12,13].

In this paper, we use the distribution of maximal-chain lengths between pairs of points in a causal set to find a criterion for manifoldlikeness. As shown in Sect. 3, the path distribution has an approximately Gaussian shape, with a maximum and a width that can be used as two parameters characterizing the structure of that interval in the causal set.

2 The path length distribution

In this section we will derive an expression for the distribution of maximal-chain lengths in a causal set sprinkled uniformly at random in an Alexandrov set \(I(p,q)\) of 2-dimensional Minkowski spacetime, and show the results of some numerical simulations. This distribution provides a good opportunity for statistical analysis of properties of the causal set. We will refer to an element of such a sprinkling that is contained in some smaller region within \(I(p,q)\) simply as a point in that region, and we will call a region empty if it does not contain any of the randomly distributed points. A maximal chain of length k between points \(x^{}_{1}\) and \(x^{}_{k+1}\) is defined by \(k+1\) related points \(x^{}_{1} \prec x^{}_{2} \prec \cdots \prec x^{}_{k} \prec x^{}_{k+1}\) such that for each i the Alexandrov set \(I^{}_{i,i+1} = I(x^{}_{i},x^{}_{i+1})\) is empty; in the following, a maximal chain will be called a path.

The number of k-paths between two points \(p = x^{}_{1}\) and \(q = x^{}_{k+1}\) in a causal set uniformly embedded in Minkowski space is a random variable, whose mean value can be evaluated analytically by picking \(k-1\) possible locations for the other \(x^{}_{i}\), calculating the probability that one causal set point is found in each infinitesimal neighborhood \(\mathrm{d}x^{}_{i}\) and each interval \(I^{}_{i,i+1}\) is empty, and integrating over all the \(x^{}_{i}\).

When N points are sprinkled uniformly in a volume \(V_0\), the probability of finding exactly k of them within a region of volume V inside it is given by the binomial distribution,

$$\begin{aligned} P_k = {N\atopwithdelims ()k}\left( \frac{V}{V_0}\right) ^{k}\left( 1-\frac{V}{V_0}\right) ^{N-k}. \end{aligned}$$
(1)

In the small \(V/V_0\) limit, as in the case of an infinitesimal \(\mathrm{d}x^{}_{i}\), this probability can be approximated by a Poisson distribution of density \(\rho = N/V_0\). Thus, for each of the \(\mathrm{d}^d x_i\) the probability that it contains exactly one point can be written as \(\rho \,\mathrm{d}^d x^{}_{i}\,\mathrm{e}^{-\rho \,\mathrm{d}^d x^{}_{i}}\) \( \approx \rho \,\mathrm{d}^d x_i\), and all of those probabilities can be considered to be independent as long as the number of links in the chain is much smaller than the total number of points, \(k \ll N\). The intervals \(I^{}_{i,i+1}\) however may not be small, and for those we will use the binomial distribution. In particular, the probability that a region of volume V is empty is given by \(P^{}_{0} = (1-V/V_0)^N\), and for the union \(I^{}_{1,2}\cup \cdots \cup I^{}_{k,k+1}\) of all Alexandrov sets between pairs of points that probability can be written as

$$\begin{aligned} P_0=\left( 1-\frac{\sum _{i=1}^k V^{}_{i,i+1}}{V_0}\right) ^{N}. \end{aligned}$$
(2)
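The choice between the two distributions can be illustrated numerically. The following is a minimal sketch, not taken from the paper, that compares the binomial probability \((1-V/V_0)^N\) that a region is empty with its Poisson counterpart \(\mathrm{e}^{-\rho V}\), \(\rho = N/V_0\). The function names and the sample values of N and \(V/V_0\) are illustrative assumptions.

```python
import math

def p_empty_binomial(n_points, volume_fraction):
    """Probability that a region with V/V_0 = volume_fraction is empty
    when exactly n_points are sprinkled in V_0 (binomial, k = 0)."""
    return (1.0 - volume_fraction) ** n_points

def p_empty_poisson(n_points, volume_fraction):
    """Poisson approximation exp(-rho V) with rho = N / V_0."""
    return math.exp(-n_points * volume_fraction)

# Illustrative values only (hypothetical, not from the paper).
for frac in (0.001, 0.1, 0.5):
    print(frac, p_empty_binomial(100, frac), p_empty_poisson(100, frac))
```

For small \(V/V_0\) the two expressions agree closely, while for regions comparable in size to \(V_0\) they can differ by many orders of magnitude, which is consistent with using the binomial form for the intervals \(I^{}_{i,i+1}\).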

Putting these together we then get, following the same approach as in Ref. [10],

$$\begin{aligned}&P(x_2,\ldots ,x_k)\,\mathrm{d}^d x_2\,\cdots \,\mathrm{d}^d x_k\nonumber \\&\quad = \rho ^{k-1}\,\mathrm{d}^dx_2\ \cdots \mathrm{d}^dx_k\ \left( 1-\frac{\sum _{i=1}^k V_{i,i+1}}{V_0}\right) ^{N}\nonumber \\&\qquad +\, \hbox {higher-order terms}. \end{aligned}$$
(3)

We now identify the probability in Eq. 3 with the mean number of paths through those locations, which integrated over all \(x_i\) gives the mean number of k-paths between p and q,

$$\begin{aligned} \langle n_k\rangle = \rho ^{k-1}\int _{I_1}\mathrm{d}^dx_2\cdots \int _{I_{k-1}}\mathrm{d}^dx_k~\left( 1-\frac{\sum _{i=1}^k V_{i,i+1}}{V_0}\right) ^{N}, \end{aligned}$$
(4)

where \(I_i = I(x^{}_{i},q)\) is the Alexandrov set between \(x^{}_{i}\) and the maximal element q in the manifold; for simplicity, from now on we will drop the angle brackets, \(\langle n_k\rangle \mapsto n_k\). Using the binomial expansion we can write

$$\begin{aligned} n_k= & {} \rho ^{k-1} \sum _{n=0}^N {N \atopwithdelims ()n}\left( {-\frac{1}{V_0}}\right) ^{n}\nonumber \\&\times \sum _{i_1=0}^n {n\atopwithdelims ()i_1} \cdots \sum _{i_k=0}^{i_{k-1}} {i_{k-1} \atopwithdelims ()i_k}\nonumber \\&\times \int _{I_1}\mathrm{d}^dx_2\cdots \int _{I_{k-1}}\mathrm{d}^dx_k\, (V_{12})^{n-i_1}\cdots (V_{k,k+1})^{i_k}.\qquad \end{aligned}$$
(5)

In two dimensions the volume of an Alexandrov set can be easily calculated using the null coordinates

$$\begin{aligned} u = (t+x)/\sqrt{2}\;, \qquad v = (t-x)/\sqrt{2}\;, \end{aligned}$$
(6)

in terms of which

$$\begin{aligned} V_{i,i+1} = (u_{i+1}-u_i)(v_{i+1}-v_i). \end{aligned}$$
(7)

With these expressions for the volumes, in the \(d = 2\) case the integrals in Eq. 5 give

$$\begin{aligned} n_k = N^{k-1}\sum _{i=0}^N{N\atopwithdelims ()i}(-1)^i\frac{\varGamma (i+1)}{\varGamma (i+k)^2}\,f_{i,k} \end{aligned}$$
(8)

where we used \(N = \rho V_0\) and

$$\begin{aligned}&f^{}_{i,k} = \sum _{i_2=0}^i \varGamma (1+i-i_2) \nonumber \\&\quad \times \sum _{i_3=0}^{i_2} \varGamma (1+i_2-i_3)\; \cdots \sum _{i_{k-1}=0}^{i_{k-2}}\varGamma (1+i_{k-2}-i_{k-1}) \nonumber \\&\quad \times \sum _{i_k=0}^{i_{k-1}}\varGamma (1+i_{k-1}-i_k)\,\varGamma (i_k+1). \end{aligned}$$
(9)

This definition implies the recursion relation

$$\begin{aligned} f^{}_{i,k}=\sum _{j=0}^i\varGamma (1+i-j)f^{}_{j,k-1}, \end{aligned}$$
(10)

with \(f^{}_{i,1}:= \varGamma (i+1).\) Using Eqs. 8-10, \(n^{}_{k}\) may now be calculated for any N and k.
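As an illustration of how Eqs. 8 and 10 can be evaluated, the following minimal sketch computes \(n^{}_{k}\) with exact rational arithmetic. It is not the authors' code; the function names `f_table` and `mean_paths` are assumptions made for this example. Because the sum in Eq. 8 alternates in sign, a floating-point evaluation can lose precision to cancellation, and exact fractions avoid that at the cost of speed.

```python
from fractions import Fraction
from math import comb, factorial

def f_table(n_points, k):
    """Return [f_{0,k}, ..., f_{N,k}] from the recursion of Eq. 10,
    starting from f_{i,1} = Gamma(i+1) = i!."""
    f = [factorial(i) for i in range(n_points + 1)]
    for _ in range(k - 1):
        # f_{i,k} = sum_j Gamma(1+i-j) f_{j,k-1}
        f = [sum(factorial(i - j) * f[j] for j in range(i + 1))
             for i in range(n_points + 1)]
    return f

def mean_paths(n_points, k):
    """Mean number of k-paths n_k from Eq. 8, as an exact Fraction."""
    f = f_table(n_points, k)
    total = Fraction(0)
    for i in range(n_points + 1):
        # Gamma(i+1) / Gamma(i+k)^2 = i! / ((i+k-1)!)^2
        total += Fraction((-1) ** i * comb(n_points, i) * factorial(i) * f[i],
                          factorial(i + k - 1) ** 2)
    return n_points ** (k - 1) * total

# Example: the distribution for a small, illustrative N.
distribution = {k: float(mean_paths(20, k)) for k in range(1, 15)}
```

As a consistency check, for \(N = 1\) and \(k = 2\) Eq. 8 reduces to \(1 - \int (V_{12}+V_{23})/V_0 = 1/2\), which is the value `mean_paths(1, 2)` returns.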

3 Results of simulations

We wish to compare the analytical distribution with actual manifoldlike causal sets obtained from numerical simulations of random sets of points sprinkled with uniform density in the Alexandrov set defined by two timelike related points in Minkowski space. Figure 1 shows that the theory matches the average of the simulations well, although, as the large error bars suggest, individual sprinklings can deviate significantly from the theory. This problem can be somewhat mitigated by considering only the peak position and full width at half maximum of the distribution rather than its entirety. The shape of these distributions is nearly Gaussian, allowing us to characterize each curve with just these two numbers, and to identify the peak position with the mean path length calculated in the previous section. As one can see in Fig. 2, the relative errors in both the peak position and the width decrease with N, implying that the larger errors in Fig. 1 are primarily due to fluctuations in the total number of paths rather than in the shape of the curve considered as a probability distribution; also, the path-length distribution for a single 100-element causal set in Fig. 3 shows the same Gaussian shape and consistent values for the peak position and width. Nevertheless, the residual errors in the peak and width are not negligible. As a result, any application based on this distribution should account for statistical fluctuations in evaluating a single causal set. For instance, if one wishes to use this distribution for its stated purpose of determining manifoldlikeness of causal sets, the failure of a particular causal set to exactly match either the full analytical distribution or its peak and width should not be taken as a sign that the causal set is not manifoldlike; rather, a fairly large range around the analytical distribution should be used, and causal sets which fall into this range should be considered candidates for manifoldlike causal sets. We will discuss this further in the next section.
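A simulation of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the figures: points are sprinkled uniformly in the null coordinates (u, v) of Eq. 6, in which the Alexandrov set between \(p = (0,0)\) and \(q = (1,1)\) is the unit square, and paths are counted by dynamic programming over links. The function names are hypothetical, and null-related points are treated as related so that regular lattices can also be handled.

```python
import random
from collections import Counter

def precedes(a, b):
    """Causal order in null coordinates (u, v); ties count as related."""
    return a != b and a[0] <= b[0] and a[1] <= b[1]

def path_length_counts(points):
    """Return {k: number of k-paths} from the least to the greatest point.

    `points` must be distinct and include the minimal element p and the
    maximal element q. The cost is O(n^3), adequate for small causal sets.
    """
    pts = sorted(points)  # lexicographic order respects the causal order
    paths = [Counter() for _ in pts]
    paths[0][0] = 1  # the trivial chain at p has length 0
    for j in range(1, len(pts)):
        for i in range(j):
            if not precedes(pts[i], pts[j]):
                continue
            # Only points sorted between i and j can lie in I(x_i, x_j).
            if any(precedes(pts[i], pts[m]) and precedes(pts[m], pts[j])
                   for m in range(i + 1, j)):
                continue  # interval not empty, so (i, j) is not a link
            for length, count in paths[i].items():
                paths[j][length + 1] += count
    return dict(paths[-1])

def sprinkle(n_points, rng):
    """N uniformly random points in the unit (u, v) square, plus p and q."""
    pts = [(rng.random(), rng.random()) for _ in range(n_points)]
    return [(0.0, 0.0)] + pts + [(1.0, 1.0)]

# Example: path length distribution of one 50-point sprinkling.
counts = path_length_counts(sprinkle(50, random.Random(0)))
```

Averaging `counts` over many seeds would give an estimate of \(n_k\) that can be compared with the analytical distribution.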

Fig. 1

A comparison of the average path length distributions for 300 sprinklings of 50 points and 100 sprinklings of 100 points, on the top and bottom figures respectively, with their corresponding analytical distributions. Due to numerical error, the theory curve starts at \(k = 9\)

Fig. 2

A plot of width (full width at half maximum) vs peak position for a variety of sizes of sprinkled causal sets. The color indicates the number N of elements in the causal set

Fig. 3

The distribution of path lengths for a single 100-element causal set sprinkled in 2D Minkowski space

4 Manifoldlike causal sets

One of the motivations for this work was to explore the possibility of using the mean path length between two causal set elements p and q, given a known value of the volume of \(I(p,q)\), as a dimension estimator, similar to the use of the longest path length in Ref. [10], with the possible computational advantage that sampling the set of paths between p and q and using an average length to estimate the mean may be easier than finding the longest path. From simulations whose results are shown in this paper, as well as simulations in higher-dimensional Minkowski space, it appears that the average length of a sample of a few paths is indeed a valid dimension estimator, though it is unclear whether it is computationally better than the longest path method. One benefit of our approach and similar ones using the distribution of path lengths, however, is that it provides a criterion of manifoldlikeness for causal sets.

Fig. 4

Top: an artificially produced causal set whose purpose is to mimic both the height and width of our path length distribution. Bottom: a similarly produced artificial causal set which eliminates some redundancies to reduce the number of points while maintaining the path length distribution; however, it still has far more points than its manifoldlike cohorts

It is clear even from simple examples that quantities like the longest or the mean path length may be good dimension estimators only for causal sets known to be manifoldlike, and do not by themselves distinguish those causal sets from non-manifoldlike ones. For example, the union of m chains of length k with minimal points and maximal points identified is a causal set that can always be embedded in 2D Minkowski space, but adjusting the values of m and k one can obtain a relationship between the total number of elements \(N = m(k-1)+2\) and the longest or mean path length k that reproduces that of any Minkowski dimensionality. Similarly, a causal set could be constructed as the union of separate paths of various lengths all sharing the same minimal and maximal element, and with no other overlap, as in the top part of Fig. 4, with the number of chains of each length adjusted in a way that exactly matches the mean and width of the typical manifoldlike distribution. However, while the construction would yield the right value of \(n_k\) for any length k, the total number of points in the causal set, \(N=\sum _{k=2}^{k_\mathrm{max}} n_k(k-1)+2,\) would be quite different as the manifoldlike distribution would have many paths sharing points and this contrived example does not. We could make the example slightly more realistic by forcing the paths to share all points not linked to the maximum point as in the bottom part of Fig. 4. This would limit the number of points significantly, with a total of \(N = \sum _{k = 2}^{k_\mathrm{max}} n_k+k_\mathrm{max}\), where \({k_\mathrm{max}}\) is the length of the longest path in the causal set. However, for \(N\gg 1\) this may still have several orders of magnitude more points than a manifoldlike causal set, as we can see by considering for example the bottom part of Fig. 1. 
If we use the average values of these 100-point sprinklings, the first method requires around \(4\times 10^5\) points while the second one requires around \(4\times 10^4\) points.
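The two point counts quoted above follow directly from the path length distribution. As a minimal sketch (the function names and the sample histogram are hypothetical, not taken from the paper), they can be evaluated from a histogram \(\{k: n_k\}\) as:

```python
def points_disjoint_paths(n):
    """N = sum_k n_k (k - 1) + 2: separate paths sharing only p and q."""
    return sum(count * (k - 1) for k, count in n.items() if k >= 2) + 2

def points_shared_paths(n):
    """N = sum_k n_k + k_max: paths sharing all points not linked to q."""
    lengths = [k for k in n if k >= 2]
    return sum(n[k] for k in lengths) + max(lengths)

# Illustrative histogram only: three 2-paths and one 3-path.
example = {2: 3, 3: 1}
print(points_disjoint_paths(example), points_shared_paths(example))
```

With a single length k and \(n_k = m\), the first function reduces to the count \(N = m(k-1)+2\) of the union of m chains.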

Fig. 5

Top: a regular lattice of \(11^2 = 121\) points. All paths in a square lattice have the same length, \(k = 20\) in this case. Bottom: a 20-element causal set illustrating the Kleitman-Rothschild limit, in which all paths have length \(k = 2\)

What we propose as a first manifoldlikeness criterion based on the distribution of path lengths \(n_k\) is simply that any N-element causal set for which the mean value \(k^{}_{0}\) and the width \(\varDelta \) of that distribution are not consistent with the corresponding theoretical values within statistical fluctuations cannot be manifoldlike. Based on the few examples we just saw, finding nonmanifoldlike causal sets that satisfy this condition is not trivial. Nevertheless, this condition is most likely not a sufficient one for manifoldlikeness. To further explore which causal sets meet or do not meet our criterion we will now provide two other types of examples of nonmanifoldlike causal sets.
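A minimal sketch of how such a test could be applied to a measured histogram \(\{k: n_k\}\) is given below. It is an illustration only: the discrete definition of the width (the span of bins at or above half the maximum), the function names, and the reference values and tolerances passed to `passes_criterion` are all assumptions, and in practice the tolerances would have to be taken from the spread of sprinklings such as those in Fig. 2.

```python
def peak_and_width(n):
    """Return (peak position, full width at half maximum) of {k: n_k}.

    The width is taken as the number of bins spanned by the lengths whose
    count is at least half the maximum (an assumed discrete definition).
    """
    peak = max(n, key=n.get)
    half = n[peak] / 2
    above = [k for k, count in n.items() if count >= half]
    return peak, max(above) - min(above) + 1

def passes_criterion(n, k0_ref, width_ref, k0_tol, width_tol):
    """True if peak and width agree with the reference values within the
    given tolerances (all four reference numbers are inputs, not derived)."""
    peak, width = peak_and_width(n)
    return abs(peak - k0_ref) <= k0_tol and abs(width - width_ref) <= width_tol

# A distribution with a single path length has width 1 under this definition.
print(peak_and_width({20: 184756}))
```

A causal set for which `passes_criterion` returns False would be rejected as nonmanifoldlike; one for which it returns True remains only a candidate.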

One type includes causal sets that are not manifoldlike but are interesting for other reasons, and fail our criterion. The first example is a causal set that has one maximal and one minimal element, with all other elements located between them and unrelated to each other (i.e., one large antichain with added minimal and maximal elements); the path length distribution is proportional to a Kronecker delta \(n^{}_{k} \propto \delta ^{}_{k,2}\), with a sharp peak at length 2 and zero for other lengths. A regular “diamond lattice” (shown in the top part of Fig. 5) also has a path distribution sharply peaked at a single length, \(k^{}_{0} = 2(\sqrt{N}-1)\), with no paths of other lengths, as shown in Fig. 6. More generally, most randomly chosen causal sets of N elements with \(N\gg 1\) will look like the 3-layer Kleitman and Rothschild limit [14] shown in the bottom part of Fig. 5, in which the first and third layers have \(N/4\) elements, each of which is related to about half of the \(N/2\) elements in the second layer; in this causal set all paths have length 2, \(n^{}_{k} \propto \delta ^{}_{k,2}\). The causal sets in these examples are all nonmanifoldlike, as one would not obtain them from uniform distributions of points in a Lorentzian manifold.

Fig. 6

Path length distribution for the 121-point square lattice in Fig. 5. This lattice can certainly be embedded in 2D Minkowski space, but we see from Fig. 2 that because of its peak position at \(k = 20\) and width \(\varDelta = 1\) it fails our manifoldlikeness criterion

The other type of examples includes causal sets that are still not manifoldlike, but are likely to meet our criterion for manifoldlikeness. Figure 7 shows the effect of adding one extra point to a causal set obtained from a 250-point random sprinkling in an Alexandrov set \(I(p,q)\) of 2D Minkowski space. The added point was to the future of a point approximately in the middle of the sprinkled causal set, and linked directly to the maximal element q; the top part of the figure shows the resulting augmented causal set. Because the added point gives rise to additional paths which are shorter than the ones that go through the original, sprinkled causal set, the new path length distribution will exhibit an additional small bump with a peak length shorter than the overall \(k_0\). The bottom part of the figure shows the difference between the new path length distribution and the one without the additional point. This difference is very small compared to the overall distribution, but the feature it shows may be identifiable as characteristic of this particular type of nonmanifoldlike causal set.

5 Concluding remarks

Fig. 7

Top: causal set obtained from sprinkling in 2D Minkowski space and one added point with “non-local” links. Bottom: difference between the path length distributions

With these results we aim to set up a procedure to establish whether a causal set is close to a spacetime manifold and address, at least in 2 dimensions, one of the most fundamental questions in causal set theory. Similar calculations and simulations for higher-dimensional spacetimes, with both flat and curved metrics, are being carried out and may have other applications, for instance in the expression for the Green’s function of a scalar field propagating on a background causal set in 4 dimensions [15]. For example, we show in Fig. 8 the path length distributions for causal sets in 3D, 4D and 5D flat spacetime, which have the same Gaussian shape as in 2 dimensions. Last but not least, this work provides a relation between the most probable path length and the proper time in the continuum, which can be used to discretize the action written in Refs. [16, 17] and was not known in general.

We plan to continue studying other types of causal sets, both manifoldlike and more general ones, to establish which additional criteria are needed to exclude nonmanifoldlike causal sets that are not physically interesting, and possibly to identify causal sets which, although strictly speaking not faithfully embeddable in a Lorentzian manifold, are physically interesting enough that we may want to consider them manifoldlike (see, e.g., [5]).

Fig. 8

Probability distributions for path lengths in 3-, 4- and 5-dimensional Minkowski space, obtained by normalizing the average distributions from 100 sprinklings of \(N = 1000\) points in each case