Abstract
We present an analysis of the depth-first search algorithm in a random digraph model with independent outdegrees having a geometric distribution. The results include asymptotic results for the depth profile of vertices, the height (maximum depth) and average depth, the number of trees in the forest, the size of the largest and second-largest trees, and the numbers of arcs of different types in the depth-first jungle. Most results are first order. For the height, we show an asymptotic normal distribution. This analysis, proposed by Donald Knuth for a forthcoming volume of The Art of Computer Programming, gives interesting insight into one of the most elegant and efficient algorithms for graph analysis, due to Tarjan.
1 Introduction
The motivation of this paper is a new section in Donald Knuth’s The Art of Computer Programming [16], which is dedicated to Depth-First Search (DFS) in a digraph. Briefly, the DFS starts with an arbitrary vertex and explores the arcs from that vertex one by one. When an arc is found leading to a vertex that has not been seen before, the DFS explores the arcs from that vertex in the same way, in a recursive fashion, before returning to the next arc from its parent. This eventually yields a tree containing all descendants of the first vertex (which is the root of the tree). If there still are some unseen vertices, the DFS starts again with one of them and finds a new tree, and so on until all vertices are found. We refer to [16] for details as well as for historical notes. (See also the pseudo-code below and S1–S2 in Sect. 7.) Note that the digraphs in [16] and here are multi-digraphs, where loops and multiple arcs are allowed. (Although in our random model, these are few and usually not important.) The DFS algorithm generates a spanning forest (the depth-first forest) in the digraph, with all arcs in the forest directed away from the roots. Our main purpose is to study the properties of the depth-first forest, starting with a random digraph G; in particular we study the distribution of the depth of vertices in the depth-first forest.
The random digraph model that we consider (following Knuth [16]) has n vertices and a given outdegree distribution \(\textbf{P}\), which in the main part of the present paper is a geometric distribution \({\text {Ge}}(1-p)\) for some fixed \(0<p<1\). (See Sect. 1.1 for definitions of the two versions \({\text {Ge}}(1-p)\) and \({\text {Ge}}_1(1-p)\) of geometric distributions.) The outdegrees (number of outgoing arcs) of the n vertices are independent random numbers with this distribution. The endpoint of each arc is uniformly selected at random among the n vertices, independently of all other arcs. (Therefore, an arc can loop back to the starting vertex, and multiple arcs can occur.) We consider asymptotics as \(n\rightarrow \infty \) for a fixed outdegree distribution.
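For illustration, this random multi-digraph is easy to sample directly; the sketch below is ours (not from [16]), drawing each \({\text {Ge}}(1-p)\) outdegree as the number of failures before a first success, and each arc endpoint uniformly at random.

```python
import random

def random_digraph(n, p, seed=0):
    """Sample the random multi-digraph: each vertex gets an independent
    Ge(1-p) number of outgoing arcs, and each arc endpoint is uniform on
    the n vertices, so loops and multiple arcs can occur."""
    rng = random.Random(seed)
    arcs = {}
    for v in range(n):
        k = 0                      # Ge(1-p) on {0,1,...}: failures before
        while rng.random() < p:    # the first success (success prob. 1-p)
            k += 1
        arcs[v] = [rng.randrange(n) for _ in range(k)]
    return arcs

G = random_digraph(10, 0.6)
```

The mean outdegree is \(\lambda =p/(1-p)\); for example, \(p=1/2\) gives the critical case \(\lambda =1\).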
In the present paper, we study the case of a geometric outdegree distribution in detail; we also (in Sect. 6) briefly give corresponding results for the shifted geometric outdegree distribution \({\text {Ge}}_1(1-p)\) and discuss the similarities and differences between the two cases. The case of a general outdegree distribution (with finite variance) will be studied in a forthcoming paper [12], where we use a somewhat different method which allows us to extend many (but not all) of the results in the present paper and obtain similar, but partly weaker, results; see also Sect. 7. One reason for studying the geometric case separately is that its lack-of-memory property leads to interesting features and simplifications not present for general outdegree distributions; this is seen both in [16] and in the proofs and results below. In particular, the depth process studied in Sect. 2 will be a Markov chain, which is the basis of our analysis.
In addition to studying the depth-first forest, we also give (in Sect. 5) some results on the numbers of arcs of different types in the depth-first jungle; this is defined in [16] as the original digraph with arcs classified by the DFS algorithm into the following five types, see Fig. 1 for examples:
-
loops;
-
tree arcs, the arcs in the resulting depth-first forest;
-
back arcs, the arcs which point to an ancestor of the current vertex in the current tree;
-
forward arcs, the arcs which point to an already discovered descendant of the current vertex in the current tree;
-
cross arcs, all other arcs (these point to an already discovered vertex which is neither a descendant nor an ancestor of the current vertex and might be in another tree).
(See further the exercises in [16].)
For completeness of the presentation, we show a pseudo-code of the depth-first search with indications of the arc classifications. The stack at time t is the chain of the arguments of the successive calls of function \(\textsc {Deep}{}\) which are not terminated at time t. Thus a vertex u is an ancestor of vertex v in the depth-first forest if u is still in the stack when v is discovered. In the pseudo-code, the instructions which are not necessary for the functioning of the algorithm are prefixed as comment with a #, for example, the evolution of the parameters t and d. (To be precise, d(t) is set each time t is changed.) \({{\mathcal {N}}}(u)\) denotes the multiset of children of u, i.e., the endpoints of the arcs starting at u. (This is a multiset since there may be multiple arcs.)
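The pseudo-code itself appears as a displayed figure in the published version and is not reproduced here. As a stand-in, the following Python sketch (ours) implements the same recursive procedure, here called deep; the discovery clock plays the role of the parameter t, and a vertex is in the stack while its call of deep is unfinished. The arc classification follows the standard discovery-time argument.

```python
import sys
from collections import Counter

def dfs_jungle(arcs):
    """Depth-first search of the whole digraph, classifying every arc
    as loop / tree / back / forward / cross.  `arcs[u]` is the multiset
    (list) of endpoints of the arcs from u; vertices are 0..n-1."""
    n = len(arcs)
    sys.setrecursionlimit(10 * n + 1000)
    disc = [None] * n            # discovery time of each vertex
    in_stack = [False] * n       # vertices whose call of deep() is active
    kinds = Counter()
    clock = 0

    def deep(u):
        nonlocal clock
        disc[u] = clock; clock += 1
        in_stack[u] = True
        for w in arcs[u]:
            if w == u:
                kinds['loop'] += 1
            elif disc[w] is None:            # new vertex: tree arc, recurse
                kinds['tree'] += 1
                deep(w)
            elif in_stack[w]:                # ancestor: back arc
                kinds['back'] += 1
            elif disc[w] > disc[u]:          # discovered descendant: forward
                kinds['forward'] += 1
            else:                            # neither: cross arc
                kinds['cross'] += 1
        in_stack[u] = False

    for v in range(n):                       # restart from unseen vertices
        if disc[v] is None:
            deep(v)
    return kinds
```

For an arc \(u\rightarrow w\) examined at u: \(w=u\) gives a loop; an unseen w a tree arc; a w still in the stack a back arc (w is an ancestor); a seen w with later discovery time a forward arc (w must be a descendant of u, since u is still active); and otherwise a cross arc.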
Remark 1.1
Some related results for DFS in an undirected Erdős–Rényi graph \(G(n,\lambda /n)\) are proved by Faraud and Ménard [9] and Diskin and Krivelevich [8], and DFS in a random Erdős–Rényi digraph has been studied for example in the proof of [17, Theorem 3]. These models are closely related to our model with a Poisson outdegree distribution \(\textbf{P}\); they will therefore be further discussed in [12].
Remark 1.2
We consider only the case of a fixed outdegree distribution \(\textbf{P}\). The results can be extended to distributions \(\textbf{P}_n\) depending on n, under suitable conditions. This is particularly interesting in the critical case, with expectations \(\lambda _n\rightarrow 1\) (where \(\lambda _n\) is the expectation of \(\textbf{P}_n\)); however, this is out of the scope of the present paper.
The main results for a geometric outdegree distribution are stated and proved in Sects. 2–5. We analyze the process d(t) of depths of the vertices, in the order they are found by the DFS. For a geometric outdegree distribution (but not in general), d(t) is a Markov chain, and we find its first-order limit by a martingale argument; moreover, we show that fluctuations are of order \(\sqrt{n}\), and, in a subrange, asymptotically Gaussian. The first-order limit of the function d(t)/n is an explicit and simple function of t/n and \(\lambda \), namely \(\left[ \frac{t}{n}+\frac{1}{\lambda }\log (1-\frac{t}{n})\right] ^+\), where \([x]^+\) is the positive part of x (Theorem 2.4). This leads to results on the height of the depth-first forest (Corollary 2.5 and Theorem 3.4), the average depth (Corollary 2.6), the number of trees in the forest (Theorem 4.1), the size of the largest tree in the forest (Theorem 4.3), and the numbers of the different types of arcs defined above (Theorems 5.1 and 5.3); the latter include verifying some conjectures from previous versions of [16].
In Sect. 6, we study briefly the case of a shifted geometric outdegree distribution. The same method as in previous sections works in this case too, but the explicit results are somewhat different (Theorem 6.1). In particular, the first-order limit of d(t)/n is a new explicit function of t/n and \(\lambda \) which is different from the function for the non-shifted geometric distribution, namely \(\frac{\lambda }{\lambda -1}\left[ \frac{t}{n}+\frac{1}{\lambda }\log (1-\frac{t}{n})\right] ^+\). One motivation for this section is to show that some of the relations found in Sects. 2–5 for a geometric outdegree distribution do not hold for arbitrary distributions.
We end in Sect. 7 with some comments on the case of general outdegree distributions.
Appendix A gives a generalized version of Donsker’s theorem in a form convenient to use in our proofs.
1.1 Some Notation
We denote the given outdegree distribution by \(\textbf{P}\). Recall that our standing assumption is that the outdegrees of the vertices are i.i.d. (independent and identically distributed).
The mean outdegree, i.e., the expectation of \(\textbf{P}\), is denoted by \(\lambda \). In analogy with branching processes, we say that the random digraph is subcritical if \(\lambda <1\), critical if \(\lambda =1\), and supercritical if \(\lambda >1\).
As usual, w.h.p. means with high probability, i.e., with probability \(1-o(1)\) as \({n\rightarrow \infty }\). We use \(\overset{\textrm{p}}{\longrightarrow }\) for convergence in probability, and \(\overset{\textrm{d}}{\longrightarrow }\) for convergence in distribution of random variables.
Moreover, let \((a_n)\) be a sequence of positive numbers, and \(X_n\) be a sequence of random variables. We write \(X_n=o_{\textrm{p}}(a_n)\) if, as \({n\rightarrow \infty }\), \(X_n/a_n\overset{\textrm{p}}{\longrightarrow }0\), i.e., if for every \(\varepsilon >0\), we have \({\mathbb {P}}(|X_n|>\varepsilon a_n)\rightarrow 0\). Note that this is equivalent to the existence of a sequence \(\varepsilon _n\rightarrow 0\) such that \({\mathbb {P}}(|X_n|>\varepsilon _n a_n)\rightarrow 0\), or in other words \(|X_n|\le \varepsilon _na_n\) w.h.p. (This is sometimes denoted “\(X_n=o(a_n)\) w.h.p. ”, but we will not use this notation.)
Furthermore, \(X_n=O_{L^2}(a_n)\) means \({\mathbb {E}}\bigl [|X_n/a_n|^2\bigr ]=O(1)\). Note that \(X_n=O_{L^2}(a_n)\) implies \(X_n=o_{\textrm{p}}(\omega _na_n)\), for any sequence \(\omega _n\rightarrow \infty \). Note also that \(X_n=O_{L^2}(a_n)\) implies \({\mathbb {E}}X_n=O(a_n)\); thus error terms of this type imply immediately estimates for expectations and second moments. In particular, for the most common case below, \(X_n=O_{L^2}(n^{1/2})\) is equivalent to \({\mathbb {E}}X_n=O(n^{1/2})\) and \({\text {Var}}X_n= O(n)\).
\({\text {Ge}}(1-p)\) denotes the geometric distribution on \(\{0,1,\dots \}\); thus \(\eta \sim {\text {Ge}}(1-p)\) means that \(\eta \) is a random variable with
$$\begin{aligned} {\mathbb {P}}(\eta =k)=(1-p)p^{k},\qquad k=0,1,2,\dots . \end{aligned}$$(1.1)
Similarly, \({\text {Ge}}_1(1-p)\) denotes the shifted geometric distribution on \(\{1,2,\dots \}\); thus \(\eta \sim {\text {Ge}}_1(1-p)\) means
$$\begin{aligned} {\mathbb {P}}(\eta =k)=(1-p)p^{k-1},\qquad k=1,2,\dots . \end{aligned}$$(1.2)
\({\text {Po}}(\lambda )\) denotes a Poisson distribution with mean \(\lambda \).
We define \(\rho _0(x)\), for \(x\ge 0\), as the largest solution in [0, 1) to
$$\begin{aligned} \rho _0(x)=1-e^{-x\rho _0(x)}. \end{aligned}$$(1.3)
As is well known, \(\rho _0(x)\) is the survival probability of a Bienaymé–Galton–Watson process with a Poisson offspring distribution \({\text {Po}}(x)\) with mean x. We have \(\rho _0(x)=0\) for \(x\le 1\) and \(0<\rho _0(x)<1\) for \(x>1\). (See e.g. [4, Theorem I.5.1].)
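Numerically, \(\rho _0(x)\) can be obtained by iterating \(\rho \mapsto 1-e^{-x\rho }\) starting from \(\rho =1\); the iterates decrease monotonically to the largest root. A short sketch (ours):

```python
import math

def rho0(x, iters=200):
    """Survival probability of a Bienaymé-Galton-Watson process with
    Po(x) offspring: the largest solution in [0,1) of r = 1 - exp(-x*r).
    Starting from r = 1, the iterates decrease monotonically to it."""
    if x <= 1.0:
        return 0.0               # (sub)critical: extinction is certain
    r = 1.0
    for _ in range(iters):
        r = 1.0 - math.exp(-x * r)
    return r
```

For example, \(\rho _0(2)\approx 0.797\) and \(\rho _0(3)\approx 0.941\).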
For a real number x, we write \(x^+:=\max \{x,0\}\). Let \([n]:=\{1,\dots ,n\}\). All logarithms are natural. C and c are sometimes used for positive constants.
Remark 1.3
We state many results with error estimates in \(L^2\), which means estimates on the second moment; we conjecture that the results extend to higher moments and estimates in \(L^p\) for any \(p<\infty \), but we have not pursued this.
2 Depth Analysis with Geometric Outdegree Distribution
In this section and the next ones (until we explicitly say otherwise in Sect. 6), we assume that the outdegree distribution is geometric \({\text {Ge}}(1-p)\) for some fixed \(0<p<1\) and thus has mean
$$\begin{aligned} \lambda =\frac{p}{1-p}. \end{aligned}$$(2.1)
When doing the DFS on a random digraph of the type studied in this paper, it is natural to reveal the outdegree of a vertex as soon as we find it. (See S1–S2 in Sect. 7.) However, for a geometric outdegree distribution, because of its lack-of-memory property, we do not have to immediately reveal the outdegree when we find a new vertex v. Instead, we only check whether there is at least one outgoing arc (probability p), and if so, we find its endpoint and explore this endpoint if it has not already been visited; eventually, we return to v and then we check whether there is another outgoing arc (again probability p, by the lack-of-memory property), and so on. This will yield the important Markov property in the construction in the next subsection. In the arguments below, we use only this version of the DFS.
By a future arc from some vertex, we mean an arc from that vertex that at the current time has not yet been seen by the DFS (using the version just described). Note that this is a temporary designation which changes as the DFS proceeds, unlike the permanent classification into five types discussed in the introduction.
2.1 Depth Markov Chain
Our aim is to track the evolution of the search depth as a function of the number t of discovered vertices. Let \(v_t\) be the t-th vertex discovered by the DFS (\(t=1,\dots ,n\)), and let d(t) be the depth of \(v_t\) in the resulting depth-first forest, i.e., the number of tree arcs on the path from the root of the current tree to \(v_t\). The first found vertex \(v_1\) is a root, and thus \(d(1)=0\).
The quantity d(t) follows a Markov chain with transitions (\(1\le t<n\)):
-
(i)
\(d(t+1)=d(t)+1\).
This happens if, for some \(k\ge 1\), \(v_t\) has at least k outgoing arcs, the first \(k-1\) arcs lead to vertices already visited, and the kth arc leads to a new vertex (which then becomes \(v_{t+1}\)). The probability of this is
$$\begin{aligned} \sum _{k=1}^\infty p^k \Bigl (\frac{t}{n}\Bigr )^{k-1}\Bigl (1-\frac{t}{n}\Bigr ) =\frac{(1-t/n)p}{1-pt/n}. \end{aligned}$$(2.2) -
(ii)
\(d(t+1)=d(t)\), assuming \(d(t)>0\).
This holds if all arcs from \(v_t\) lead to already visited vertices, i.e., (i) does not happen, and furthermore, the parent of \(v_t\) has at least one future arc leading to an unvisited vertex. These two events are independent. Moreover, by the lack-of-memory property, the number of future arcs from the parent of \(v_t\) has also the distribution \({\text {Ge}}(1-p)\). Hence, the probability that one of these future arcs leads to an unvisited vertex equals the probability in (2.2). The probability of (ii) is thus
$$\begin{aligned} \Bigl (1-\frac{(1-t/n)p}{1-pt/n}\Bigr )\frac{(1-t/n)p}{1-pt/n}. \end{aligned}$$(2.3) -
(iii)
\(d(t+1)=d(t)-\ell \), assuming \(d(t)>\ell \ge 1\).
This happens if all arcs from \(v_t\) lead to already visited vertices, and so do all future arcs from the \(\ell \) nearest ancestors of \(v_t\), while the \((\ell +1)\)th ancestor has at least one future arc leading to an unvisited vertex. The argument in (ii) generalizes and shows that this has probability
$$\begin{aligned} \Bigl (1-\frac{(1-t/n)p}{1-pt/n}\Bigr )^{\ell +1}\frac{(1-t/n)p}{1-pt/n}. \end{aligned}$$(2.4) -
(iv)
\(d(t+1)=d(t)-\ell \), assuming \(d(t)=\ell \ge 0\).
By the same argument as in (ii) and (iii), except that the \((\ell +1)\)th ancestor does not exist and we ignore it, we obtain the probability
$$\begin{aligned} \Bigl (1-\frac{(1-t/n)p}{1-pt/n}\Bigr )^{\ell +1}. \end{aligned}$$(2.5)
Note that (iv) is the case when \(d(t+1)=0\) and thus \(v_{t+1}\) is the root of a new tree in the depth-first forest.
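As a mechanical check on the case analysis, for a vertex \(v_t\) at depth \(d(t)=d\) the probabilities (2.2)–(2.5) must sum to 1: they are the point probabilities of \(0,1,\dots ,d\) and the tail beyond d of a geometric variable with success parameter \(\pi _t=(1-t/n)p/(1-pt/n)\). A numeric sketch (ours; the parameter values are arbitrary):

```python
def pi_t(p, t, n):
    """Success parameter pi_t = (1 - t/n) p / (1 - p t/n), cf. (2.2)."""
    return (1 - t / n) * p / (1 - p * t / n)

def case_probabilities(p, t, n, d):
    """Probabilities of cases (i)-(iv) for a vertex v_t at depth d(t) = d:
    one step up, dropping by l = 0,...,d-1, or dropping to a new root."""
    q = pi_t(p, t, n)
    up = q                                  # (i):   prob. (2.2)
    drops = [(1 - q) ** (l + 1) * q         # (ii)-(iii): drop by l,
             for l in range(d)]             #   prob. (2.3)-(2.4)
    new_root = (1 - q) ** (d + 1)           # (iv):  prob. (2.5)
    return [up] + drops + [new_root]

probs = case_probabilities(p=0.6, t=300, n=1000, d=5)
```

Summing a geometric series shows the total is exactly 1 for every d, confirming that (i)–(iv) exhaust all possibilities.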
We can summarize (i)–(iv) in the formula
$$\begin{aligned} d(t+1)=\bigl (d(t)+1-\xi _t\bigr )^+, \end{aligned}$$(2.6)
where \(\xi _t\) is a random variable, independent of the history, with the distribution
$$\begin{aligned} {\mathbb {P}}(\xi _t=\ell )=(1-\pi _t)^\ell \pi _t,\qquad \ell \ge 0, \end{aligned}$$(2.7)
where
$$\begin{aligned} \pi _t:=\frac{(1-t/n)p}{1-pt/n}. \end{aligned}$$(2.8)
In other words, \(\xi _t\) has the geometric distribution \({\text {Ge}}(\pi _t)\). Define
$$\begin{aligned} \widetilde{d}(t):=\sum _{i=1}^{t-1}(1-\xi _i) \end{aligned}$$(2.9)
(in particular, \(\widetilde{d}(1)=0\)) and note that (2.9) is a sum of independent random variables. Then d(t) can be recovered from the simpler process \(\widetilde{d}(t)\) as follows.
Lemma 2.1
We have
$$\begin{aligned} d(t)=\widetilde{d}(t)-\min _{1\le j\le t}\widetilde{d}(j),\qquad 1\le t\le n. \end{aligned}$$(2.10)
Proof
We use induction on t. Evidently, (2.10) holds for \(t=1\) since \(d(1)=\widetilde{d}(1)=0\).
Suppose that (2.10) holds for some \(t<n\). Then (2.9) yields
$$\begin{aligned} \widetilde{d}(t+1)=\widetilde{d}(t)+1-\xi _t =d(t)+1-\xi _t+\min _{1\le j\le t}\widetilde{d}(j). \end{aligned}$$(2.11)
If \(d(t)+1-\xi _t\ge 0\), then (2.11) shows that \(\widetilde{d}(t+1)\ge \min _{1\le j\le t} \widetilde{d}(j)\), and thus \(\min _{1\le j\le t+1} \widetilde{d}(j)=\min _{1\le j\le t} \widetilde{d}(j)\); furthermore, \(d(t+1)=d(t)+1-\xi _t\) by (2.6), and it follows that (2.10) holds for \(t+1\).
On the other hand, if \(d(t)+1-\xi _t<0\), then (2.11) shows that \(\widetilde{d}(t+1)<\min _{1\le j\le t} \widetilde{d}(j)\), and thus \(\min _{1\le j\le t+1} \widetilde{d}(j)=\widetilde{d}(t+1)\). In this case, \(d(t+1)=0\) by (2.6), and it follows that (2.10) holds for \(t+1\) in this case too. \(\square \)
Remark 2.2
Similar formulas have been used for other, related, problems with random graphs and trees, where trees have been coded as walks, see for example [3, Section 1.3]. Note that in our case, unlike, e.g., [3], \(\widetilde{d}(t)\) may have negative jumps of arbitrary size.
2.2 Main Result for Depth Analysis
Note first that (2.8) implies that, using \(\lambda =p/(1-p)\),
$$\begin{aligned} {\mathbb {E}}\xi _t=\frac{1-\pi _t}{\pi _t}=\frac{1}{\lambda (1-t/n)}=:\mu _t. \end{aligned}$$(2.12)
Hence, (2.9) implies that the expectation of \(\widetilde{d}(t)\) is
$$\begin{aligned} {\mathbb {E}}\widetilde{d}(t)=\sum _{i=1}^{t-1}(1-\mu _i) =\sum _{i=1}^{t-1}\Bigl (1-\frac{1}{\lambda (1-i/n)}\Bigr ). \end{aligned}$$(2.13)
Let \(\theta :=t/n\). We fix a \(\theta ^*<1\) and obtain that, uniformly for \(\theta \le \theta ^*\),
$$\begin{aligned} {\mathbb {E}}\widetilde{d}(t)=n\widetilde{\ell }(\theta )+O(1), \end{aligned}$$(2.14)
where
$$\begin{aligned} \widetilde{\ell }(\theta ):=\theta +\lambda ^{-1}\log (1-\theta ). \end{aligned}$$(2.15)
Note that the derivative \(\widetilde{\ell }'(\theta )=1-\lambda ^{-1}/(1-\theta )\) is (strictly) decreasing on (0, 1), i.e., \(\widetilde{\ell }\) is concave. Moreover, if \(\lambda >1\) (i.e., \(p>\frac{1}{2}\)) (the supercritical case), then \(\widetilde{\ell }'(0)>0\), and (2.15) shows that \(\widetilde{\ell }(\theta )\) is positive and increasing for \(\theta <\theta _0:=1-\lambda ^{-1}=(2p-1)/p\). After the maximum at \(\theta _0\), \(\widetilde{\ell }(\theta )\) decreases and tends to \(-\infty \) as \(\theta \nearrow 1\). Hence, there exists a \(\theta _0<\theta _1<1\) such that \(\widetilde{\ell }(\theta _1)=0\); we then have \(\widetilde{\ell }(\theta )>0\) for \(0<\theta <\theta _1\) and \(\widetilde{\ell }(\theta )<0\) for \(\theta >\theta _1\). We will see that in this case the depth-first forest w.h.p. contains a giant tree, of order and height both linear in n, while all other trees are small.
On the other hand, if \(\lambda \le 1\) (i.e., \(p\le \frac{1}{2}\)) (the subcritical and critical cases), then \(\widetilde{\ell }'(0)\le 0\) and \(\widetilde{\ell }(\theta )\) is negative and decreasing for all \(\theta \in (0,1)\). In this case, we define \(\theta _0:=\theta _1:=0\) and note that the properties just stated for \(\widetilde{\ell }\) still hold (rather trivially). We will see that in this case w.h.p. all trees in the depth-first forest are small.
Note that in all cases,
$$\begin{aligned} \widetilde{\ell }(\theta )>0 \quad \text {for } \theta \in (0,\theta _1), \qquad \widetilde{\ell }(\theta )<0 \quad \text {for } \theta \in (\theta _1,1), \end{aligned}$$(2.16)
and that \(\theta _1\) is the largest solution in [0, 1) to
$$\begin{aligned} \theta _1+\lambda ^{-1}\log (1-\theta _1)=0. \end{aligned}$$(2.17)
Remark 2.3
The equation (2.17) may also be written \(1-\theta _1=\exp (-\lambda \theta _1)\), which shows that
$$\begin{aligned} \theta _1=\rho _0(\lambda ), \end{aligned}$$(2.18)
the survival probability of a Bienaymé–Galton–Watson process with \({\text {Po}}(\lambda )\) offspring distribution defined in (1.3).
We define \(\widetilde{\ell }^+(\theta ):=\bigl (\widetilde{\ell }(\theta )\bigr )^+\). Thus, by (2.15) and the comments above,
$$\begin{aligned} \widetilde{\ell }^+(\theta )= {\left\{ \begin{array}{ll} \theta +\lambda ^{-1}\log (1-\theta ),&{}\quad 0\le \theta \le \theta _1,\\ 0,&{}\quad \theta _1\le \theta \le 1. \end{array}\right. } \end{aligned}$$(2.19)
We can now state one of our main results.
Theorem 2.4
We have
$$\begin{aligned} \max _{1\le t\le n}\bigl |d(t)-n\widetilde{\ell }^+(t/n)\bigr | =O_{L^2}\bigl (n^{1/2}\bigr ). \end{aligned}$$(2.20)
Proof
Since (2.9) is a sum of independent random variables, \(\widetilde{d}(t)-{\mathbb {E}}\widetilde{d}(t)\) (\(t=1,\dots ,n\)) is a martingale, and Doob’s inequality [10, Theorem 10.9.4] yields, for all \(T\le n\),
$$\begin{aligned} {\mathbb {E}}\max _{t\le T}\bigl |\widetilde{d}(t)-{\mathbb {E}}\widetilde{d}(t)\bigr |^2 \le 4\sum _{i=1}^{T-1}{\text {Var}}\,\xi _i. \end{aligned}$$(2.21)
As above, fix \(\theta ^*<1\), and assume, as we may, that \(\theta ^*>\theta _1\). Let \(T^*:=\lfloor n\theta ^*\rfloor \), and consider first \(t\le T^*\). For \(i<T^*\), we have \({\text {Var}}\xi _i = O(1)\), and thus, for \(T=T^*\), the sum in (2.21) is \(O(T^*)=O(n)\). Consequently, (2.21) yields
$$\begin{aligned} {\mathbb {E}}\max _{t\le T^*}\bigl |\widetilde{d}(t)-{\mathbb {E}}\widetilde{d}(t)\bigr |^2 =O(n). \end{aligned}$$(2.22)
Hence, by (2.14),
$$\begin{aligned} M^*:=\max _{t\le T^*}\bigl |\widetilde{d}(t)-n\widetilde{\ell }(t/n)\bigr | =O_{L^2}\bigl (n^{1/2}\bigr ). \end{aligned}$$(2.23)
(Note that \(T^*\) and \(M^*\) depend on the choice of \(\theta ^*\).) For \(t\le T^*\), the definition of \(M^*\) in (2.23) implies \(\bigl |\widetilde{d}(j)-n\widetilde{\ell }(j/n)\bigr |\le M^*\) for \(1\le j\le t\), and thus
Moreover, for \(t/n\le \theta _1\), we have \(0\le \min _{1\le j\le t}\widetilde{\ell }(j/n)\le \widetilde{\ell }(1/n)=O(1/n)\), while for \(t/n\ge \theta _1\), we have \(\min _{1\le j\le t}\widetilde{\ell }(j/n)=\widetilde{\ell }(t/n)\). Hence, for all \(t\le T^*\),
and thus, by (2.24),
Finally, by (2.10), (2.23) and (2.26),
This holds uniformly for \(t\le T^*\), and thus, by (2.23),
It remains to consider \(T^*<t\le n\). Then the argument above does not quite work, because \(\pi _t\searrow 0\) and thus \({\text {Var}}\xi _t\nearrow \infty \) as \(t\nearrow n\). We therefore modify \(\xi _t\). We define \({\widehat{\pi }}_t:=\max \{\pi _t,\pi _{T^*}\}\); thus \({\widehat{\pi }}_t=\pi _t\) for \(t\le T^*\) and \({\widehat{\pi }}_t>\pi _t\) for \(t>T^*\). We may then define independent random variables \({\widehat{\xi }}_t\) such that \({\widehat{\xi }}_t\sim {\text {Ge}}({\widehat{\pi }}_t)\) and \({\widehat{\xi }}_t\le \xi _t\) for all \(t< n\). (Thus, \({\widehat{\xi }}_t=\xi _t\) for \(t\le T^*\). For \(t>T^*\), we may assume that \(\xi _t:=\min \{j:U_{t,j}<\pi _t\}-1\) for an array of independent U(0, 1) random variables \((U_{t,j})_{j,t}\) and then define \({\widehat{\xi }}_t:=\min \{j:U_{t,j}<{\widehat{\pi }}_t\}-1\).)
In analogy with (2.9)–(2.10), we further define
Since \({\widehat{\xi }}_i\le \xi _i\), (2.30) implies that \(\widehat{d}(t)\ge d(t)\) for all t.
We have \({\text {Var}}\bigl [{\widehat{\xi }}_t\bigr ]=O(1)\), uniformly for all \(t<n\), and thus the argument above yields
where
We have \(\widehat{\widetilde{\ell }}(\theta )=\widetilde{\ell }(\theta )\) for \(\theta \le \theta ^*\), and for \(\theta \ge \theta ^*\), \(\widehat{\widetilde{\ell }}(\theta )\) is negative and decreasing (since \(\theta ^*>\theta _1\)). Hence, \([\widehat{\widetilde{\ell }}(\theta )]^+=\widetilde{\ell }^+(\theta )\) for all \(0<\theta \le 1\). In particular, \([\widehat{\widetilde{\ell }}(\theta )]^+=\widetilde{\ell }^+(\theta )=0\) for all \(\theta \ge \theta ^*\), and (2.31) implies
Recalling \(0\le d(t)\le \widehat{d}(t)\), we thus have
which completes the proof. \(\square \)
Corollary 2.5
The height \(\Upsilon \) of the depth-first forest is
$$\begin{aligned} \Upsilon =\upsilon n+O_{L^2}\bigl (n^{1/2}\bigr ), \end{aligned}$$(2.35)
where
$$\begin{aligned} \upsilon :=\widetilde{\ell }^+(\theta _0)= {\left\{ \begin{array}{ll} 1-\lambda ^{-1}-\lambda ^{-1}\log \lambda ,&{}\quad \lambda >1,\\ 0,&{}\quad \lambda \le 1. \end{array}\right. } \end{aligned}$$(2.36)
Proof
Immediate from Theorem 2.4 and (2.15), since we have \(\max _t\widetilde{\ell }^+(t/n)=\max _\theta \widetilde{\ell }^+(\theta )+O(1/n)\) and \(\max _\theta \widetilde{\ell }^+(\theta )=\widetilde{\ell }^+(\theta _0)=\widetilde{\ell }(\theta _0)\). \(\square \)
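Numerically, this is easy to confirm: with \(\widetilde{\ell }(\theta )=\theta +\lambda ^{-1}\log (1-\theta )\) and \(\theta _0=1-\lambda ^{-1}\), the maximum is \(\widetilde{\ell }(\theta _0)=1-\lambda ^{-1}-\lambda ^{-1}\log \lambda \). A short sketch (ours, with an arbitrary supercritical \(\lambda \)):

```python
import math

def ell(theta, lam):
    """First-order depth profile: theta + log(1 - theta) / lam."""
    return theta + math.log(1 - theta) / lam

lam = 3.0                            # supercritical: lam > 1
theta0 = 1 - 1 / lam                 # zero of ell'(theta) = 1 - 1/(lam*(1-theta))
upsilon = ell(theta0, lam)           # = 1 - 1/lam - log(lam)/lam

# a grid search over (0,1) should not beat the analytic maximizer
grid_max = max(ell(k / 10000, lam) for k in range(1, 10000))
```

The grid maximum agrees with \(\widetilde{\ell }(\theta _0)\) up to the grid resolution, confirming that the height constant is attained at \(\theta _0\).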
In Sect. 3, we will improve this when \(\lambda >1\) and show that the height \(\Upsilon \) is then asymptotically normally distributed (Theorem 3.4).
Corollary 2.6
The average depth \({\overline{d}}\) in the depth-first forest is
$$\begin{aligned} {\overline{d}}=\alpha n+O_{L^2}\bigl (n^{1/2}\bigr ), \end{aligned}$$(2.37)
where \(\alpha =0\) if \(\lambda \le 1\), and, in general,
$$\begin{aligned} \alpha :=\int _0^1\widetilde{\ell }^+(\theta )\,\textrm{d}\theta =\int _0^{\theta _1}\Bigl (\theta +\frac{\log (1-\theta )}{\lambda }\Bigr )\textrm{d}\theta =\theta _1\Bigl (1-\frac{\theta _1}{2}-\frac{1}{\lambda }\Bigr ). \end{aligned}$$(2.38)
Proof
By Theorem 2.4, using that (2.19) shows that \(\widetilde{\ell }^+(\theta )\) is a Lipschitz function on \([0,1]\),
where
which yields (2.38), using (2.17). \(\square \)
Remark 2.7
When \(\lambda >1\), the height \(\Upsilon \) and average depth \({\overline{d}}\) are thus linear in n, unlike many other types of random trees. This might imply a rather slow performance of algorithms that operate on the depth-first forest if it is built explicitly in a computer’s memory.
3 Asymptotic Normality
In this section, we show that in the supercritical case \(\lambda >1\), Theorem 2.4 can be improved to yield convergence of d(t) (after rescaling) to a Gaussian process, at least on \([0,\theta _1)\). As a consequence, we show that the height \(\Upsilon \) is asymptotically normal.
Recall that for an interval \(I\subseteq \mathbb {R}\), D(I) is the space of functions \(I\rightarrow \mathbb {R}\) that are right-continuous with left limits (càdlàg) equipped with the Skorohod topology. For definitions of the topology see e.g. [5, 11, 15, Appendix A.2], or [13]; for our purposes it is enough to know that convergence in D(I) to a continuous limit is equivalent to uniform convergence on compact subsets of I. (Note that it thus matters if the endpoints are included in I or not; for example, convergence in \(D[0,1)\) and \(D[0,1]\) mean different things.)
We define \(d(0):=\widetilde{d}(0):=0\).
Lemma 3.1
Assume \(\lambda >1\). Then
$$\begin{aligned} \frac{\widetilde{d}(\lfloor n\theta \rfloor )-n\widetilde{\ell }(\theta )}{n^{1/2}} \overset{\textrm{d}}{\longrightarrow }Z(\theta ) \qquad \text {in } D[0,1), \end{aligned}$$(3.1)
where \(Z(\theta )\) is a continuous Gaussian process on \([0,1)\) with mean \({\mathbb {E}}Z(\theta )=0\) and covariance \({\text {Cov}}\bigl (Z(x),Z(y)\bigr )=g\bigl (\min \{x,y\}\bigr )\), where
$$\begin{aligned} g(\theta ):=\frac{\theta }{\lambda ^{2}(1-\theta )}-\frac{\log (1-\theta )}{\lambda }. \end{aligned}$$(3.2)
Equivalently, \(Z(\theta )=B\bigl (g(\theta )\bigr )\) for a Brownian motion B(x).
Proof
Since the random variables \(\xi _t\) are independent, (2.9) and (2.7)–(2.8) yield, similarly to (2.13),
Hence, uniformly for \(t/n\le \theta ^*\) for any \(\theta ^*<1\),
with
in agreement with (3.2). (Note that this definition of \(g(\theta )\) as an integral shows that \(g(\theta )\) is strictly increasing in \(\theta \).) Since also \({\mathbb {E}}\widetilde{d}\bigl (\lfloor n\theta \rfloor \bigr )=n\widetilde{\ell }(\theta )+O(1)\) by (2.14), the marginal convergence for a fixed \(\theta \) in (3.1) follows by the classical central limit theorem for independent (not identically distributed) variables, e.g. using Lyapounov’s condition [10, Theorem 7.2.2].
The functional limit (3.1) is thus a version of Donsker’s theorem [5, Theorem 16.1], extended from the i.i.d. case to the non-identically distributed variables \(\xi _i\). Such extensions exist, but since we have not found a version in the literature that can be immediately applied here, we give such a version in Appendix A. The result (3.1) follows by applying Theorem A.1 to the variables \(n^{-1/2}\bigl ((1-\xi _{i-1})-{\mathbb {E}}(1-\xi _{i-1})\bigr )\). \(\square \)
Lemma 3.2
Assume \(\lambda >1\) and let \(0<\theta ^*<\theta _1\). Then
$$\begin{aligned} \Bigl |\min _{1\le t\le n\theta ^*}\widetilde{d}(t)\Bigr | =o_{\textrm{p}}\bigl (n^{1/2}\bigr ). \end{aligned}$$(3.6)
Proof
Let \(t_n:=\lceil n^{2/3}\rceil \). If n is large enough, then \(t_n<n\theta ^*\), and, since \(\widetilde{\ell }'(0)=1-\lambda ^{-1}>0\) by (2.15),
for some constant \(c>0\). Furthermore, (2.23) implies
(recall that \(O_{L^2}(a_n)\) implies \(o_{\textrm{p}}(\omega _na_n)\) for any \(a_n\) and any \(\omega _n\rightarrow \infty \)). It follows from (3.7)–(3.8) that w.h.p. \( \bigl |\widetilde{d}(t)- n\widetilde{\ell }(t/n)\bigr | < n\widetilde{\ell }(t/n)\) for all \(t\in [t_n,n\theta ^*]\). Hence, w.h.p. , \(\widetilde{d}(t)>0=\widetilde{d}(1)\) for all \(t\in [t_n,n\theta ^*]\). Consequently, w.h.p. ,
For \(t\le t_n\), we use Doob’s inequality in the form (2.21) again. Since \(\min _{t\le t_n}\widetilde{d}(t)\le \widetilde{d}(1)=0\) and \({\mathbb {E}}\widetilde{d}(t)\ge 0\) for \(t\le t_n\) (for n large), we have \(\bigl | \min _{t\le t_n}\widetilde{d}(t)\bigr |\le \max _{t\le t_n}|\widetilde{d}(t)-{\mathbb {E}}\widetilde{d}(t)|\) and thus (2.21) yields
Hence,
The proof is completed by combining (3.9) and (3.11). \(\square \)
Theorem 3.3
Assume \(\lambda >1\). Then
$$\begin{aligned} \frac{d(\lfloor n\theta \rfloor )-n\widetilde{\ell }(\theta )}{n^{1/2}} \overset{\textrm{d}}{\longrightarrow }Z(\theta ) \qquad \text {in } D[0,\theta _1), \end{aligned}$$(3.12)
where \(Z(\theta )\) is the continuous Gaussian process defined in Lemma 3.1.
Proof
By (2.10) and Lemma 3.2, for any \(\theta ^*<\theta _1\),
$$\begin{aligned} \max _{1\le t\le n\theta ^*}\bigl |d(t)-\widetilde{d}(t)\bigr | =\Bigl |\min _{1\le j\le n\theta ^*}\widetilde{d}(j)\Bigr | =o_{\textrm{p}}\bigl (n^{1/2}\bigr ). \end{aligned}$$(3.13)
The theorem now follows from Lemma 3.1. \(\square \)
Theorem 3.4
Let \(\lambda >1\). Then the height \(\Upsilon \) of the depth-first forest has an asymptotic normal distribution:
$$\begin{aligned} \frac{\Upsilon -\upsilon n}{n^{1/2}} \overset{\textrm{d}}{\longrightarrow }N\bigl (0,\sigma ^2\bigr ), \end{aligned}$$(3.14)
with \(\upsilon \) given by (2.36), and
$$\begin{aligned} \sigma ^2=g(\theta _0) =\frac{\log \lambda +1-\lambda ^{-1}}{\lambda }. \end{aligned}$$(3.15)
Proof
Fix some \(\theta ^*\in (\theta _0,\theta _1)\). By Theorem 3.3 and the Skorohod coupling theorem [15, Theorem 4.30], we may assume that the random variables for different n are coupled such that the limit in (3.12) holds (almost) surely. Since \(Z(\theta )\) is continuous, this implies uniform convergence on \([0,\theta ^*]\), i.e.,
uniformly on \([0,\theta ^*]\). (The \(o(n^{1/2})\) here are random, but uniform in \(\theta \).) For \(|\theta -\theta _0|\le n^{-1/6}\), we have \(Z(\theta )=Z(\theta _0)+o(1)\), since Z is continuous, and thus (3.16) yields, almost surely,
Since \(\max _\theta \widetilde{\ell }(\theta )=\widetilde{\ell }(\theta _0)\), it follows that
On the other hand, for \(|\theta -\theta _0|\ge n^{-1/6}\), we have by a Taylor expansion, for some \(c>0\),
Hence, (2.20) implies
Comparing (3.18) and (3.20), we see that w.h.p. the maximum in (3.18) is larger than the one in (3.20), and thus
Hence, w.h.p. ,
which implies
Since \(\widetilde{\ell }(\theta _0)=\upsilon \) by (2.36), this shows (3.14) with \(\sigma ^2:=g(\theta _0)\), which gives (3.15) by (3.2) and \(\theta _0:=1-\lambda ^{-1}\). \(\square \)
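The height result can be probed by simulation: for the geometric model the depths follow the Markov chain of Sect. 2.1, so no digraph needs to be built. The sketch below is ours; the values \(n=40000\), \(p=0.75\) and the seed are arbitrary, and the closed form \(\upsilon =1-\lambda ^{-1}-\lambda ^{-1}\log \lambda \) is our evaluation of \(\widetilde{\ell }(\theta _0)\) at \(\theta _0=1-\lambda ^{-1}\).

```python
import math, random

def simulate_depths(n, p, rng):
    """Simulate the depth sequence d(1),...,d(n) via the Markov chain
    d(t+1) = max(d(t) + 1 - xi_t, 0) with xi_t ~ Ge(pi_t), which for
    the geometric outdegree model has the same law as the DFS depths."""
    d, depths = 0, [0]
    for t in range(1, n):
        pi = (1 - t / n) * p / (1 - p * t / n)
        xi = 0                       # Ge(pi): failures before first success
        while rng.random() >= pi:
            xi += 1
        d = max(d + 1 - xi, 0)
        depths.append(d)
    return depths

n, p = 40000, 0.75                   # lam = p/(1-p) = 3 > 1: supercritical
lam = p / (1 - p)
upsilon = 1 - 1 / lam - math.log(lam) / lam
height = max(simulate_depths(n, p, random.Random(12345)))
```

The simulated height/n should lie within a few multiples of \(\sigma n^{-1/2}\) of \(\upsilon \), in line with (the linear order of) Corollary 2.5.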
4 The Trees in the Forest
Theorem 4.1
Let N be the number of trees in the depth-first forest. Then
$$\begin{aligned} N=\psi n+O_{L^2}\bigl (n^{1/2}\bigr ), \end{aligned}$$(4.1)
where
$$\begin{aligned} \psi :=\int _{\theta _1}^{1}\bigl (1-\lambda (1-\theta )\bigr )\,\textrm{d}\theta =1-\theta _1-\frac{\lambda }{2}(1-\theta _1)^2. \end{aligned}$$(4.2)
Figure 2 shows the parameter \(\psi \) as a function of the average degree \(\lambda \).
Proof
Let \(J_t:={\varvec{1}}\{d(t)=0\}\), the indicator that vertex t is a root and thus starts a new tree. Thus \(N=\sum _1^n J_t\).
If \(\theta _1>0\) (i.e., \(\lambda >1\)), then Theorem 2.4 shows that w.h.p. \(d(t)>0\) in the interval \((1,n\theta _1)\), except possibly close to the endpoints. Thus the DFS will find one giant tree of order \(\approx \theta _1 n\), possibly preceded by a few small trees, and, as we will see later in the proof, followed by many small trees. To obtain a precise estimate, we note that there exists a constant \(c>0\) such that \(\widetilde{\ell }(\theta )\ge \min \{c\theta ,c(\theta _1-\theta )\}\) for \(\theta \in [0,\theta _1]\). Furthermore, if \(d(t)=0\), then (2.10) yields \(\widetilde{d}(t)=\min _{1\le j\le t}\widetilde{d}(j)\le \widetilde{d}(1)=0\), and thus, recalling (2.23), if also \(t\le n\theta _1\) and thus \(t\le T^*\), we have \(M^*\ge n\widetilde{\ell }(t/n)\). Hence, if \(t\le n\theta _1\) and \(d(t)=0\), then
Consequently, \(d(t)=0\) with \(t\le n\theta _1\) implies \(t\in [1,c^{-1}M^*] \cup [n\theta _1-c^{-1}M^*,n\theta _1]\). The number of such t is thus \(O(M^*+1)=O_{L^2}(n^{1/2})\), using (2.23).
Let \(T_1:=\lceil n\theta _1\rceil \). We have just shown that (the case \(\theta _1=0\) is trivial)
It remains to consider \(t\ge T_1\). For any integer \(k\ge 0\), the conditional distribution of \(\xi _t-k\) given \(\xi _t\ge k\) equals the distribution of \(\xi _t\). Hence, recalling (2.12),
We use again the stochastic recursion (2.6). Let \({\mathcal {F}}_t\) be the \(\sigma \)-field generated by \(\xi _1,\dots ,\xi _{t-1}\). Then d(t) is \({\mathcal {F}}_t\)-measurable, while \(\xi _t\) is independent of \({\mathcal {F}}_t\). Hence, (2.6) and (4.5) yield
We write \(\Delta d(t):=d(t+1)-d(t)\) and \(\overline{J}_t:=1-J_t\). Then (4.6) yields
Define
Then \({\mathcal {M}}_t\) is \({\mathcal {F}}_t\)-measurable, and (4.7) shows that \({\mathcal {M}}_t\) is a martingale. We have, with \(\Delta {\mathcal {M}}_t:={\mathcal {M}}_{t+1}-{\mathcal {M}}_t\), using (2.6),
and thus, since \(\pi _t\le p<1\) for all t by (2.8) (and using the inequality \((a+b)^2\le 2a^2+2b^2\) for any real a, b),
Hence, uniformly for all \(T\le n\),
The definition (4.8) yields
By a summation by parts, and interpreting \(\mu _n^{-1}:=0\),
As t increases, \(\mu _t\) increases by (2.12), and thus \(\mu _{t-1}^{-1}-\mu _t^{-1}>0\). Hence, (4.13) implies
by (2.20), since \(\widetilde{\ell }^+(t/n)=0\) for \(t\ge T_1\ge n\theta _1\). Furthermore, (4.11) shows that \({\mathcal {M}}_n,{\mathcal {M}}_{T_1}=O_{L^2}(n^{1/2})\). Hence, (4.12) yields, using (2.12),
We have
and thus (4.15) yields
where
The result follows by (4.17) and (4.4). \(\square \)
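The constants in Theorems 4.1 and 4.3 can likewise be checked by simulating the depth chain: roots are exactly the vertices \(v_t\) with \(d(t)=0\), and consecutive roots delimit the trees. In the sketch below (ours; parameters and seed are arbitrary), \(\theta _1\) is computed from \(1-\theta _1=e^{-\lambda \theta _1}\) (Remark 2.3), and for \(\psi \) we use the closed form \((1-\theta _1)-\tfrac{\lambda }{2}(1-\theta _1)^2\), our evaluation of the integral \(\int _{\theta _1}^1(1-\lambda (1-\theta ))\,\textrm{d}\theta \) suggested by the proof of Theorem 4.1.

```python
import math, random

def forest_statistics(n, p, seed=1):
    """Simulate the depth chain d(t+1) = max(d(t)+1-xi_t, 0), xi_t ~ Ge(pi_t),
    and read off the forest: v_t is a root exactly when d(t) = 0."""
    rng = random.Random(seed)
    d, roots = 0, [1]                 # v_1 is always a root
    for t in range(1, n):
        pi = (1 - t / n) * p / (1 - p * t / n)
        xi = 0
        while rng.random() >= pi:
            xi += 1
        d = max(d + 1 - xi, 0)
        if d == 0:
            roots.append(t + 1)
    sizes = [b - a for a, b in zip(roots, roots[1:])] + [n + 1 - roots[-1]]
    return len(roots), max(sizes)

n, p = 40000, 0.75                    # lam = 3
lam = p / (1 - p)
theta1 = 1.0
for _ in range(200):                  # theta1 = rho0(lam), cf. Remark 2.3
    theta1 = 1 - math.exp(-lam * theta1)
psi = (1 - theta1) - lam * (1 - theta1) ** 2 / 2
num_trees, largest = forest_statistics(n, p)
```

Here num_trees/n should be close to \(\psi \) and largest/n close to \(\theta _1\), in line with Theorems 4.1 and 4.3(ii).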
The arguments in the proof of Theorem 4.1 show that in the supercritical case \(\lambda >1\), the DFS w.h.p. first finds possibly a few small trees, then a giant tree containing all \(v_t\) with \(O_{L^2}(n^{1/2})\le t\le \theta _1n+O_{L^2}(n^{1/2})\), and then a large number of small trees. We give some details in the following lemma and theorem.
Lemma 4.2
Let (a, b) be a fixed interval with \(0\le a<b\le 1\) and \(b>\theta _1\). Then w.h.p. there exists a root \(v_t\) in the depth-first forest with \(t/n\in (a,b)\).
Proof
By increasing a, we may assume that \(\theta _1<a<b\le 1\). Then, cf. (2.15), \(\widetilde{\ell }'(a)<\widetilde{\ell }'(\theta _1)\le 0\) and thus \( \lambda (1-a)<1\). Hence, the argument yielding (4.15) in the proof of Theorem 4.1 yields also
with \(c:=(b-a)(1-\lambda (1-a))>0\). Hence, w.h.p. there are many roots \(v_t\) with \(t\in (an,bn)\). \(\square \)
Theorem 4.3
Let \(\textbf{T}_1\) be the largest tree in the depth-first forest.
-
(i)
If \(\lambda \le 1\), then \(|\textbf{T}_1|=o_{\textrm{p}}(n)\).
-
(ii)
If \(\lambda >1\), then \(|\textbf{T}_1|=\theta _1 n+O_{L^2}(n^{1/2})\). Furthermore, the second-largest tree has order \(|\textbf{T}_2|=o_{\textrm{p}}(n)\).
Proof
Let \(\varepsilon >0\). By covering \([\theta _1,1]\) with a finite number of intervals of length \(<\varepsilon /2\), it follows from Lemma 4.2 that w.h.p. every tree \(\textbf{T}\) having a root \(v_t\) with \(t>(\theta _1-\varepsilon /2)n\) has \(|\textbf{T}|\le \varepsilon n\).
In particular, if \(\lambda \le 1\), so \(\theta _1=0\), this applies to all trees, and thus w.h.p. \(|\textbf{T}_1|\le \varepsilon n\), which proves (i).
Suppose now \(\lambda >1\). Consider the tree \(\textbf{T}\) in the depth-first forest that contains \(v_{\lfloor n\theta _0\rfloor }\), denote its root by \(v_r\) and let \(v_s\) be its last vertex. By the proof of Theorem 4.1, \(d(t)>0\) for \(O_{L^2}(n^{1/2})\le t\le \theta _1 n-O_{L^2}(n^{1/2})\), and thus \(r=O_{L^2}(n^{1/2})\) and \(s\ge \theta _1n-O_{L^2}(n^{1/2})\).
On the other hand, let \(\theta ^*\in (\theta _1,1)\). If \(s\ge \theta _1n\), let \(u:=\min \{s,\lfloor \theta ^*n\rfloor \}\). Since \(r/n\le \theta _0\), we have \(\widetilde{\ell }(r/n)\ge 0\). Furthermore, (2.10) implies that \(\widetilde{d}(t)>\min _{j\le t}\widetilde{d}(j)=\widetilde{d}(r)\) for \(t\in (r,s]\), and thus \(\widetilde{d}(u)>\widetilde{d}(r)\). Hence, by (2.23),
Since \(\widetilde{\ell }(\theta _1)=0\) and \(\widetilde{\ell }'(\theta )\le -c<0\) for \(\theta \ge \theta _1\), it follows that \(u\le \theta _1n+O_{L^2}(n^{1/2})\), and thus \(s\le \theta _1n+O_{L^2}(n^{1/2})\).
Consequently, \(s= \theta _1n+O_{L^2}(n^{1/2})\), and thus \(|\textbf{T}|=s-r+1=\theta _1n+O_{L^2}(n^{1/2})\). Furthermore, any tree found before \(\textbf{T}\) has order \(\le r=o_{\textrm{p}}(n)\). The first part of the proof now shows that for every \(\varepsilon >0\), there is w.h.p. no tree other than \(\textbf{T}\) of order \(>\varepsilon n\). Hence, w.h.p. \(\textbf{T}\) is the largest tree, and (ii) follows. \(\square \)
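Theorem 4.3(ii) is easy to probe numerically. The following sketch (ours, not part of the paper) simulates the stack-based exploration S1–S2 of Sect. 7 with i.i.d. \({\text {Ge}}(1-p)\) outdegrees and uniformly random arc endpoints, and records the tree sizes of the depth-first forest; for \(\lambda =p/(1-p)=2\) the largest tree should occupy a fraction close to \(\theta _1\approx 0.797\) of the vertices, and the second-largest tree should be much smaller. The function name and parameters are ours.

```python
import random

def dfs_tree_sizes(n, p, rng):
    """Simulate the DFS exploration (steps S1-S2 of Sect. 7) on the random
    digraph with i.i.d. Ge(1-p) outdegrees, P(eta = k) = (1-p)p^k, and
    uniformly random arc endpoints.  Returns the tree sizes of the
    depth-first forest, in the order the trees are found."""
    seen = [False] * n
    found = 0
    stack = 0                      # arcs on the stack (endpoints unrevealed)
    order = list(range(n))         # roots are picked from a random order
    rng.shuffle(order)
    nxt = 0
    sizes, size = [], 0
    while found < n:
        if stack == 0:             # S1: stack empty -> start a new tree
            if size:
                sizes.append(size)
            size = 0
            while seen[order[nxt]]:
                nxt += 1
            v = order[nxt]
        else:                      # S1: pop an arc, reveal its endpoint
            stack -= 1
            v = rng.randrange(n)
            if seen[v]:            # endpoint already seen: repeat S1
                continue
        seen[v] = True
        found += 1
        size += 1
        eta = 0                    # S2: reveal the Ge(1-p) outdegree of v
        while rng.random() < p:
            eta += 1
        stack += eta               # push eta arcs with unrevealed endpoints
    sizes.append(size)
    return sizes

# lambda = p/(1-p) = 2 > 1 (supercritical); theta_1 ~ 0.797
sizes = sorted(dfs_tree_sizes(20000, 2/3, random.Random(1)), reverse=True)
print(sizes[0] / 20000, sizes[1] / 20000)
```

Note that trees end exactly when the stack empties, since all pending arcs start at vertices of the current tree.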
Remark 4.4
As said in Remark 2.3, \(\theta _1\), the asymptotic fraction of vertices in the giant tree, equals the survival probability of a Bienaymé–Galton–Watson process with \({\text {Po}}(\lambda )\) offspring distribution. Heuristically, this may be explained by the following argument, well known from similar situations. Start at a random vertex and follow the arcs backwards. The indegree of a given vertex is asymptotically \({\text {Po}}(\lambda )\), and the process of exploring backwards from a vertex may be approximated by a Bienaymé–Galton–Watson process with this offspring distribution. Hence, the probability of a “large” backwards process converges to the survival probability \(\theta _1\). It seems reasonable that most vertices in the giant tree have a large backwards process, while most vertices outside the giant tree have a small backwards process.
Note also that the asymptotic size of the giant tree thus equals the asymptotic size of the giant component in an undirected Erdős–Rényi random graph \(G(n,\lambda /n)\), which heuristically is given by the same argument. (See also Remark 1.1 and [12].)
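Numerically, this survival probability is the positive root of \(\theta = 1-e^{-\lambda \theta }\) (for \(\lambda >1\)), and can be found by fixed-point iteration. A small sketch (ours; the function name is hypothetical):

```python
import math

def theta1(lam, tol=1e-12):
    """Survival probability of a Bienaymé-Galton-Watson process with
    Po(lam) offspring: the positive solution of theta = 1 - exp(-lam*theta)
    (zero when lam <= 1).  Fixed-point iteration started at 1 decreases
    monotonically to the root when lam > 1."""
    theta = 1.0
    while True:
        nxt = 1.0 - math.exp(-lam * theta)
        if abs(nxt - theta) < tol:
            return nxt
        theta = nxt

print(theta1(2.0))   # ~ 0.7968, the asymptotic giant-tree fraction for lambda = 2
```

The same number is the asymptotic giant-component fraction of \(G(n,\lambda /n)\), as noted above.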
5 Types of Arcs
Recall from the introduction the classification of the arcs in the digraph G. Since we assume that the outdegrees are \({\text {Ge}}(1-p)\) and independent, the total number of arcs, M say, has a negative binomial distribution with mean \(\lambda n\), and, by a weak version of the law of large numbers,
In the following theorem, we give the asymptotics of the number of arcs of each type.
Theorem 5.1
Let L, T, B, F, and C be the numbers of loops, tree arcs, back arcs, forward arcs, and cross arcs in the random digraph. Then
where
Proof
Let \(\eta _t\) be the number of arcs from \(v_t\), and let \(\eta ^<_t,\eta ^=_t,\eta ^>_t\) be the numbers of these arcs that lead to some \(v_u\) with \(u<t\), \(u=t\) and \(u>t\), respectively. Then
Furthermore, an arc \(v_tv_u\) with \(u>t\) is either a tree arc or a forward arc; conversely, every tree arc or forward arc is of this type. Consequently,
Similarly, or by (5.9) and (5.10),
Conditioned on \(\eta _t\), \(\eta ^<_t\) has a binomial distribution \({\text {Bin}}(\eta _t,(t-1)/n)\), since each arc has probability \((t-1)/n\) to go to a vertex \(v_u\) with \(u<t\). In general, if \(X\sim {\text {Bin}}(m,p)\), then \({\mathbb {E}}X = mp\) and \({\mathbb {E}}X^2={\text {Var}}X + ({\mathbb {E}}X)^2=mp(1-p)+(m p)^2\). Hence, by first conditioning on \(\eta _t\),
Furthermore, the random variables \(\eta ^<_t\), \(t=1,\dots ,n\), are independent. Hence, (5.11) yields
and thus
The same argument with (5.10) yields
Similarly, conditioned on \(\eta _t\), we have \(\eta ^=_t\sim {\text {Bin}}(\eta _t,1/n)\), and we find
This proves (5.2). We prove (5.3)–(5.6) one by one.
T In any forest, the number of vertices equals the number of edges + the number of trees. Hence, \(T=n-N\), where N is the number of trees in the depth-first forest, and thus Theorem 4.1 implies (5.3) with \(\tau =1-\psi \) given by (5.7).
B Let \(B_t\) be the number of back arcs from \(v_t\); thus \(B=\sum _1^n B_t\). Let \({\mathcal {F}}_t\) be the \(\sigma \)-field generated by all arcs from \(v_i\), \(i\le t\) (i.e., by the outdegrees \(\eta _i\) and the endpoints of all these arcs); note that this includes complete information on the DFS until \(v_{t+1}\) is found, but also on some further arcs (the future arcs from the ancestors of \(v_{t+1}\)). Then d(t) is \({\mathcal {F}}_{t-1}\)-measurable and \(B_t\) is \({\mathcal {F}}_t\)-measurable. Moreover, \(\eta _t\) is independent of \({\mathcal {F}}_{t-1}\). Thus, conditioned on \({\mathcal {F}}_{t-1}\), we still have \(\eta _t\sim {\text {Ge}}(1-p)\); we also know d(t), and each arc from \(v_t\) is a back arc with probability d(t)/n. Hence, \({\mathbb {E}}\bigl [B_t\mid {\mathcal {F}}_{t-1},\eta _t\bigr ]=\eta _t d(t)/n\), and consequently
Similarly, since, as above, \(X\sim {\text {Bin}}(m,p)\) implies \({\mathbb {E}}X^2=mp(1-p)+(m p)^2\),
Define \(\Delta Z_t:=B_t-\lambda d(t)/n\) and \(Z_t:=\sum _1^t \Delta Z_i\). Then (5.23) shows that \({\mathbb {E}}\bigl [\Delta Z_t\mid {\mathcal {F}}_{t-1}\bigr ]=0\), and thus \((Z_i)_0^n\) is a martingale, with \(Z_0=0\). Hence, \({\mathbb {E}}Z_n=0\). Furthermore, (5.23) implies \({\mathbb {E}}\bigl [(\Delta Z_t)^2\bigr ]={\mathbb {E}}\bigl [(B_t-{\mathbb {E}}[B_t\mid {\mathcal {F}}_{t-1}])^2\bigr ] \le {\mathbb {E}}\bigl [B_t^2\bigr ]\), and thus by (5.24),
Consequently, \(Z_n=O_{L^2}(n^{1/2})\), and thus
Finally, (5.26) and Corollary 2.6 yield
which shows (5.4) with \(\beta =\lambda \alpha \) as in (5.8), recalling (2.38).
F By (5.19) and (5.3), we have (5.5) with \(\varphi =\frac{\lambda }{2}-\tau \), which agrees with (5.8) by (5.7) and a simple calculation. In particular, \(\varphi =\beta \).
C Similarly, it follows from (5.16) and (5.4) that we have (5.6) with \(\chi :=\lambda /2-\beta \). Since we have found \(\beta =\varphi \), and we always have \(\tau +\varphi =\lambda /2=\beta +\chi \), see (5.16) and (5.19), we thus have \(\chi =\tau \), and thus (5.7) holds. \(\square \)
Note that \(T+F\) and \(B+C\) are asymptotically normal; this follows immediately from (5.10) and (5.11) by the central limit theorem.
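The arc classification of Theorem 5.1 can be checked by direct simulation. The sketch below (ours, not from the paper) samples the random multidigraph with geometric outdegrees, runs an iterative DFS, and classifies every arc at the moment it is explored; for \(\lambda =2\) the tree and cross counts, and the back and forward counts, should be close to each other, in line with \(\tau =\chi \) and \(\beta =\varphi \). (Sampling all outdegrees up front gives the same distribution as revealing them during the search.)

```python
import random

def classify_arcs(n, p, rng):
    """Sample the random multidigraph (i.i.d. Ge(1-p) outdegrees, uniform
    endpoints), run a DFS over all vertices in order, and count the arcs
    of each type.  An arc v->w explored while v is the current vertex is:
    a loop if w = v; a tree arc if w is unseen; a back arc if w is an
    active ancestor of v; a forward arc if w was discovered after v
    (then w is a finished descendant of v); otherwise a cross arc."""
    out = []
    for v in range(n):
        eta = 0
        while rng.random() < p:
            eta += 1
        out.append([rng.randrange(n) for _ in range(eta)])
    pre = [-1] * n              # discovery (preorder) index, -1 = unseen
    active = [False] * n        # on the current root-to-vertex path?
    counts = dict(loop=0, tree=0, back=0, forward=0, cross=0)
    clock = 0
    for root in range(n):
        if pre[root] != -1:
            continue
        pre[root] = clock; clock += 1; active[root] = True
        stack = [(root, iter(out[root]))]
        while stack:
            v, arcs = stack[-1]
            w = next(arcs, None)
            if w is None:               # all arcs from v explored
                active[v] = False
                stack.pop()
            elif w == v:
                counts['loop'] += 1
            elif pre[w] == -1:
                counts['tree'] += 1
                pre[w] = clock; clock += 1; active[w] = True
                stack.append((w, iter(out[w])))
            elif active[w]:
                counts['back'] += 1
            elif pre[w] > pre[v]:
                counts['forward'] += 1
            else:
                counts['cross'] += 1
    return counts

# lambda = 2: Theorem 5.3 predicts E T = E C and E B = E F
c = classify_arcs(30000, 2/3, random.Random(2))
print({k: v / 30000 for k, v in c.items()})
```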
Conjecture 5.2
All four variables T, B, F, C are (jointly) asymptotically normal.
The equalities \(\tau =\chi \) and \(\beta =\varphi \) mean asymptotic equality of the corresponding expectations of numbers of arcs. In fact, there are exact equalities.
Theorem 5.3
For any n, \({\mathbb {E}}T = {\mathbb {E}}C\) and \({\mathbb {E}}B = {\mathbb {E}}F=\lambda {\mathbb {E}}{\overline{d}}\).
Proof
Let v, w be two distinct vertices. If the DFS finds w as a descendant of v, then there will later be \({\text {Ge}}(1-p)\) arcs from w, and each has probability 1/n of being a back arc to v. Similarly, there will be \({\text {Ge}}(1-p)\) future arcs from v, and each has probability 1/n of being a forward arc to w. Hence, if \(I_{vw}\) is the indicator that w is a descendant of v, and \(B_{vw}\) [\(F_{vw}\)] is the number of back [forward] arcs vw, then
Summing over all pairs of distinct v and w, we obtain
Finally, \({\mathbb {E}}T+{\mathbb {E}}F = {\mathbb {E}}C + {\mathbb {E}}B\) by (5.17) and (5.14), and thus (5.29) implies \({\mathbb {E}}T = {\mathbb {E}}C\). \(\square \)
Remark 5.4
Knuth [16] conjectured, based on exact calculation of generating functions for small n, that, much more strongly, B and F have the same distribution for every n. (Note that T and C do not have the same distribution; we have \(T\le n-1\), while C may take arbitrarily large values.) This conjecture has recently been proved by Nie [20], using a reformulation in [14].
Remark 5.5
A simple argument with generating functions shows that the number of loops at a given vertex v is \({\text {Ge}}(1-p/(n-np+p))\); these numbers are independent, and thus \(L\sim {\text {NegBin}}\bigl (n,1-p/(n-np+p)\bigr )\) with \({\mathbb {E}}L = p/(1-p)=\lambda =O(1)\) and \({\text {Var}}(L)=p(1-p+p/n)/(1-p)^2=\lambda (1+\lambda /n)=O(1)\) [16]. Moreover, it is easily seen that asymptotically, L has a Poisson distribution, \(L\overset{\textrm{d}}{\longrightarrow }{\text {Po}}(\lambda )\) as \({n\rightarrow \infty }\).
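The Poisson limit for L is easy to see in simulation. The following Monte Carlo sketch (ours) draws many copies of L directly from the model — each of a vertex's \({\text {Ge}}(1-p)\) arcs is a loop with probability 1/n — and compares the empirical mean with \(\lambda \) and the empirical \({\mathbb {P}}(L=0)\) with \(e^{-\lambda }\).

```python
import math, random

def loop_count(n, p, rng):
    """Number of loops in one sample of the random digraph: each of a
    vertex's Ge(1-p) arcs is a loop with probability 1/n."""
    total = 0
    for v in range(n):
        while rng.random() < p:        # one iteration per arc from v
            if rng.randrange(n) == v:  # the arc's uniform endpoint is v
                total += 1
    return total

rng = random.Random(3)
n, p, m = 500, 2/3, 2000               # lambda = p/(1-p) = 2
samples = [loop_count(n, p, rng) for _ in range(m)]
mean = sum(samples) / m
p0 = samples.count(0) / m
print(mean, p0)                        # mean ~ lambda = 2, P(L=0) ~ exp(-2)
```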
6 Depth, Trees and Arc Analysis in the Shifted Geometric Outdegree Distribution
In this section, the outdegree distribution is \({\text {Ge}}_1(1-p)=1+{\text {Ge}}(1-p)\). Thus we now have the mean
Thus \(\lambda >1\), and only the supercritical case occurs. As in Sect. 2, the depth d(t) is a Markov chain given by (2.6), but the distribution of \(\xi _t\) is now different. The probability (2.2) is replaced by \((1-t/n)/(1-pt/n)\), but the number of future arcs from an ancestor is still \({\text {Ge}}(1-p)\), and, with \(\theta :=t/n\),
where \(\pi _t=p{\overline{\pi }}_t\) is as in (2.8). The rest of the analysis does not change, and the results in Theorems 2.4–5.1 still hold, but we get different values for many of the constants.
We now have
and instead of (2.14), we have \({\mathbb {E}}\widetilde{d}(t) = n\widetilde{\ell }(\theta ) + O(1)\) where now \(\widetilde{\ell }(\theta )\) takes the new value
Note that \(\widetilde{\ell }(\theta )\) in (6.4) is proportional to (2.15) for the (unshifted) geometric distribution with the same \(\lambda \), but larger by a factor 1/p. Figures 3 and 4 show \(\widetilde{\ell }(\theta )\) for both geometric distributions with the same p (0.6) and the same \(\lambda \) (2.0), respectively.
Note that \(\widetilde{\ell }(\theta _1)=0\) still gives the formulas (2.17) and (2.18) for \(\theta _1\), now with \(\lambda =1/(1-p)\) as in (6.1), and that \(\lambda >1\) so \(\theta _1>0\) for every p. Differentiating (6.4) shows that the maximum point \(\theta _0=p>0\), which again is given by (2.16). Straightforward calculations yield
Furthermore, (6.2) yields by a simple calculation, with \(\theta :=t/n\),
Hence, (3.4) holds with (3.5) replaced by
and then Lemma 3.1 and Theorem 3.3 hold with this \(g(\theta )\).
Consequently, Theorem 3.4 holds with
In the proof of Theorem 4.1, (4.5) for \(k\ge 1\) still holds with \(\mu _t\) given by (2.12) (except for the formula with \(\lambda \)), and thus (4.6) is replaced by, using (6.3),
The rest of the proof remains the same with minor modifications and leads to, instead of (4.15), with \(\theta :=t/n\),
and thus Theorem 4.1 holds with
just as in (4.2).
In the proof of Theorem 5.1, (5.27) still holds, and we obtain (5.4) with \(\beta =\lambda \alpha \), and then (5.6) with \(\chi =\lambda /2-\beta \) just as before (but recall that \(\alpha \) now is different). On the other hand, now the expected numbers of back and forward arcs differ since \({\mathbb {E}}B = \lambda {\mathbb {E}}{\overline{d}}\sim \lambda \alpha n\) and \({\mathbb {E}}F=(\lambda -1){\mathbb {E}}{\overline{d}}\sim (\lambda -1)\alpha n\) because the average number of future arcs at a vertex after a descendant has been created is \(\lambda -1\). The asymptotic formula (5.3) holds as above with \(\tau :=1-\psi \); hence (5.17) implies that (5.5) holds too, with \(\varphi =\lambda /2-\tau \); as just noted, we now have \(\varphi =(\lambda -1)\alpha \ne \beta \). Collecting these constants, we see that Theorem 5.1 holds with
Thus the equality \(\beta =\varphi \), and the equality of the expected numbers of back and forward arcs in Theorems 5.1 and 5.3, were artifacts of the geometric outdegree distribution. Similarly, \(\chi =\lambda /2-\beta <\lambda /2-\varphi =\tau \), so the equality of the expected numbers of tree arcs and cross arcs in Theorem 5.3 also fails.
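The asymmetry \({\mathbb {E}}B/{\mathbb {E}}F\rightarrow \lambda /(\lambda -1)\) in the shifted model can be checked numerically. The sketch below (ours) samples the digraph with \({\text {Ge}}_1(1-p)=1+{\text {Ge}}(1-p)\) outdegrees and counts back and forward arcs during a DFS; for \(p=1/2\), so \(\lambda =1/(1-p)=2\), the ratio B/F should be close to \(\lambda /(\lambda -1)=2\).

```python
import random

def count_back_forward(n, p, rng):
    """DFS over the random digraph with shifted geometric outdegrees
    Ge_1(1-p) = 1 + Ge(1-p); counts back and forward arcs only.
    A back arc goes to an active ancestor; a forward arc goes to an
    already-finished vertex discovered later than the current one."""
    out = []
    for v in range(n):
        eta = 1                          # shifted: at least one arc
        while rng.random() < p:
            eta += 1
        out.append([rng.randrange(n) for _ in range(eta)])
    pre = [-1] * n
    active = [False] * n
    back = forward = 0
    clock = 0
    for root in range(n):
        if pre[root] != -1:
            continue
        pre[root] = clock; clock += 1; active[root] = True
        stack = [(root, iter(out[root]))]
        while stack:
            v, arcs = stack[-1]
            w = next(arcs, None)
            if w is None:
                active[v] = False
                stack.pop()
            elif pre[w] == -1:           # tree arc: descend
                pre[w] = clock; clock += 1; active[w] = True
                stack.append((w, iter(out[w])))
            elif w != v and active[w]:   # back arc (loops excluded)
                back += 1
            elif pre[w] > pre[v]:        # forward arc
                forward += 1
    return back, forward

b, f = count_back_forward(30000, 0.5, random.Random(4))
print(b / f)   # -> lambda/(lambda-1) = 2 as n grows
```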
We summarize the results above.
Theorem 6.1
Let the outdegree distribution \(\textbf{P}\) be the shifted geometric distribution \({\text {Ge}}_1(1-p)\) with \(p\in (0,1)\). Then Theorems 2.4–5.1 hold, with the constants now having the values described above (and always \(\lambda >1\)), while Theorem 5.3 does not hold.
7 A General Outdegree Distribution: Stack Index
In this section, we consider a general outdegree distribution \(\textbf{P}\), with mean \(\lambda \) and finite variance.
When the outdegree distribution is general, the depth no longer follows a simple Markov chain, since we would have to keep track of the number of children seen so far at each level of the branch leading to the current vertex. We recover a Markov chain if, instead of the depth d(t) of the current vertex, we consider the stack index I(t), defined as follows.
The DFS can be regarded as keeping a stack of unexplored arcs, for which we have seen the start vertex but not the endpoint. Let again \(v_t\) be the t-th vertex seen by the DFS, and let I(t) be the size of this stack when \(v_t\) is found (but before we add the arcs from \(v_t\) to the stack). This stack index I(t) is given by the following modification of the pseudo-code for \(\textsc {Deep}{}\) (initialized at zero at the beginning):
This recursive pseudo-code looks very similar to our first recursive pseudo-code, although there is a subtle difference. In the first pseudo-code, there was no need to load \({{\mathcal {N}}}(u)\) into a local variable, since the current neighbor can be tracked by a local index. Thus the stack memory requirement is proportional to the number of recursive calls to \(\textsc {Deep}{}\), and hence at most of the order of the number of vertices in the graph. In the second pseudo-code, we load the full neighborhood \({{\mathcal {N}}}(u)\) into a local variable, which typically increases the stack memory requirement to the order of the number of arcs in the graph; in a multi-digraph this can exceed the square of the number of vertices. This is even more visible in the following non-recursive version of the pseudo-code. For our random model, with a general outdegree distribution, the evolution of the stack can be described as follows, with the stack initially empty:
-
S1.
If the stack is empty, pick a new vertex v that has not been seen before (if there is no such vertex, we have finished). Otherwise, pop the last arc from the stack and reveal its endpoint v (which is uniformly random over all vertices). If v has already been seen, repeat.
-
S2.
(v is now a new vertex) Reveal the outdegree \(\eta \) of v and add to the stack \(\eta \) new arcs from v, with unspecified endpoints. GOTO S1.
It can easily be seen that the stack size I(t) will be a Markov chain, similar (but not identical) to the depth process d(t) in the geometric case studied above. Moreover, it is possible to recover the depth of the vertices from the stack size process, which makes it possible to extend many of the results above, although sometimes with less precision. For details, see [12].
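As a concrete illustration (our sketch, not from the paper), the process S1–S2 can be simulated for any outdegree distribution, recording the stack index I(t) for each new vertex; here we use a \({\text {Po}}(\lambda )\) outdegree as an example of a general distribution with finite variance. The function names and the Poisson sampler are ours.

```python
import math, random

def stack_index_process(n, sample_outdegree, rng):
    """Run steps S1-S2 with an arbitrary outdegree sampler, recording the
    stack index I(t): the number of pending arcs on the stack when the
    t-th vertex is found, before its own arcs are pushed."""
    seen = [False] * n
    order = list(range(n))      # new roots are taken from a random order
    rng.shuffle(order)
    nxt = 0
    stack = 0
    I = []
    while len(I) < n:
        if stack == 0:          # S1: empty stack -> pick an unseen vertex
            while seen[order[nxt]]:
                nxt += 1
            v = order[nxt]
        else:                   # S1: pop an arc and reveal its endpoint
            stack -= 1
            v = rng.randrange(n)
            if seen[v]:         # already seen: repeat S1
                continue
        seen[v] = True
        I.append(stack)         # stack index of the new vertex
        stack += sample_outdegree(rng)   # S2: push its arcs
    return I

def poisson(lam):
    """Po(lam) sampler (product method); an example outdegree law."""
    def sample(rng):
        limit, k, prod = math.exp(-lam), 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        return k
    return sample

I = stack_index_process(10000, poisson(2.0), random.Random(6))
print(I[:10])
```

Note that, unlike the depth process, I(t) = 0 does not identify roots: the stack may empty just as the last pending arc leads to a new vertex of the same tree.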
Data availability
Not applicable.
References
Aldous, D.: Stopping times and tightness. Ann. Probab. 6(2), 335–340 (1978)
Aldous, D.: Stopping times and tightness. II. Ann. Probab. 17(2), 586–595 (1989)
Aldous, D.: Brownian excursions, critical random graphs and the multiplicative coalescent. Ann. Probab. 25(2), 812–854 (1997)
Athreya, K.B., Ney, P.E.: Branching Processes. Springer, Berlin (1972)
Billingsley, P.: Convergence of Probability Measures. Wiley, New York (1968)
Borovkov, A.A.: Estimates in the invariance principle (Russian). Dokl. Akad. Nauk SSSR 206, 1037–1039 (1972)
Brown, B.M.: Martingale central limit theorems. Ann. Math. Stat. 42, 59–66 (1971)
Diskin, S., Krivelevich, M.: On the performance of the Depth First Search algorithm in supercritical random graphs. Preprint (2021). arXiv:2111.07345
Enriquez, N., Faraud, G., Ménard, L.: Limiting shape of the depth first search tree in an Erdős–Rényi graph. Random Struct. Algorithms 56(2), 501–516 (2020)
Gut, A.: Probability: A Graduate Course, 2nd edn. Springer, New York (2013)
Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes. Springer, Berlin (1987)
Jacquet, P., Janson, S.: Depth-First Search performance in random digraphs with general outdegree distributions. In preparation
Janson, S.: Orthogonal decompositions and functional limit theorems for random graph statistics. Mem. Am. Math. Soc. 111(534). Am. Math. Soc., Providence, RI (1994)
Janson, S.: On Knuth's conjecture for back and forward arcs in Depth-First Search in a random digraph with geometric outdegree distribution. Preprint (2023). arXiv:2301.04131
Kallenberg, O.: Foundations of Modern Probability, 2nd edn. Springer, New York (2002)
Knuth, D.E.: The Art of Computer Programming, Section 7.4.1.2 (preliminary draft, 13 February 2022). http://cs.stanford.edu/~knuth/fasc12a.ps.gz
Krivelevich, M., Sudakov, B.: The phase transition in random graphs: a simple proof. Random Struct. Algorithms 43(2), 131–138 (2013)
McLeish, D.L.: Dependent central limit theorems and invariance principles. Ann. Probab. 2, 620–628 (1974)
Merlevède, F., Peligrad, M.: Functional CLT for nonstationary strongly mixing processes. Stat. Probab. Lett. 156, 108581 (2020)
Nie, Z.: On a conjecture of Knuth about forward and back arcs. Preprint (2023). arXiv:2301.05704
Funding
Open access funding provided by Uppsala University.
Ethics declarations
Conflict of interest
None
We thank Donald Knuth for posing the questions and conjectures that led to the present paper. We also thank two anonymous referees for their comments. Supported by the Knut and Alice Wallenberg Foundation.
A Generalized Donsker’s Theorem
The classical Donsker theorem [5, Theorem 16.1] yields functional convergence of the successive partial sums of an i.i.d. sequence of random variables with finite variance, suitably normalized, to a Brownian motion. This has been extended to independent variables with different distributions, and further to dependent variables under various conditions (in particular for martingales and under mixing conditions); see, e.g., [1, 6, 7, 18, 19]. We give here a simple version of Donsker's theorem for independent but not necessarily identically distributed variables; it is adapted to the application in the present paper but stated in a rather general form for future reference.
A triangular array is a family of random variables \(X_{ni}\) defined for \(n\ge 1\) and \(1\le i\le k_n\), where \((k_n)_1^\infty \) is a given sequence of positive integers. We say that a triangular array is row-wise independent if for each n, the random variables \((X_{ni})_1^{k_n}\) are independent. For notational convenience, we may extend the array by defining \(X_{ni}:=0\) for \(i>k_n\); thus we may assume that \(X_{ni}\) is defined for all \(n,i\ge 1\).
Recall the comments on D(J) at the beginning of Sect. 3.
Theorem A.1
Let J be an interval [0, b) or [0, b], where \(0<b\le \infty \) or \(0<b<\infty \), respectively. Let \((X_{ni})_{i=1}^{k_n}\) be a row-wise independent triangular array of random variables such that \({\mathbb {E}}X_{ni}=0\) for all n and i, and furthermore, the following two conditions hold as \({n\rightarrow \infty }\):
-
(i)
There is a continuous function \(g:J\rightarrow [0,\infty )\) such that, for each fixed \(x\in J\),
$$\begin{aligned} \sum _{i=1}^{\lfloor nx\rfloor }{\mathbb {E}}X^2_{ni} \rightarrow g(x). \end{aligned}$$
(A.1)
-
(ii)
(Lindeberg condition) For every \(x\in J\) and every \(\varepsilon >0\),
$$\begin{aligned} \sum _{i=1}^{\lfloor nx\rfloor }{\mathbb {E}}\bigl [X^2_{ni}{\varvec{1}}\{|X_{ni}|>\varepsilon \}\bigr ] \rightarrow 0. \end{aligned}$$
(A.2)
(If \(J=[0,b]\), it evidently suffices that (A.2) holds for \(x=b\).)
Then, as \({n\rightarrow \infty }\),
where Z(x) is a continuous Gaussian process on J with mean \({\mathbb {E}}Z(x)=0\) and covariance
Equivalently, \(Z(x)=B\bigl (g(x)\bigr )\) for a standard Brownian motion B.
Proof
This is essentially (up to a change of time) a special case of e.g. [18, Corollary 2.8], [1, Theorem 3], or [19, Theorem 2.1], but we prefer to give a direct proof.
Define \(Z(x):=B\bigl (g(x)\bigr )\) for a Brownian motion B. Then Z(x) is a continuous Gaussian process (since \(g\) is continuous), and \({\mathbb {E}}Z(x)=0\). To find its covariance function, let \(x,y\in J\) with \(x\le y\). Then (A.1) implies \(g(x)\le g(y)\), and thus
which verifies (A.4).
Let
We note first that the standard central limit theorem for triangular arrays [10, Theorem 7.2.4] shows that for each fixed \(x\in J\), we have \(M_n(x)\overset{\textrm{d}}{\longrightarrow }N\bigl (0,g(x)\bigr )\), or equivalently \(M_n(x)\overset{\textrm{d}}{\longrightarrow }Z(x)\); in other words, the limit in (A.3) holds for each fixed x. Similarly, if \(0\le x_1\le x_2\in J\), then the central limit theorem yields \(M_n(x_2)-M_n(x_1)\overset{\textrm{d}}{\longrightarrow }N\bigl (0,g(x_2)-g(x_1)\bigr )\), and thus \(M_n(x_2)-M_n(x_1)\overset{\textrm{d}}{\longrightarrow }Z(x_2)-Z(x_1)\). Since \(M_n\) and Z are processes with independent increments, it follows that (A.3) holds in the sense of finite-dimensional convergence.
To improve this to the convergence in D(J) asserted in the theorem, we thus have to prove also tightness of the sequence \(M_n\) in D(J). This can be done in several ways, but the simplest seems to be to use a theorem by Aldous [2] for functional convergence of continuous-time martingales; note that each \(M_n(x)\) is a martingale on J (trivially, for the natural filtration). Suppose first that \(J=[0,b)\). For each fixed \(x\ge 0\), we have \({\mathbb {E}}M_n(x)^2= \sum _{i=1}^{\lfloor nx\rfloor }{\mathbb {E}}X^2_{ni} \rightarrow g(x)\) by (A.1), and thus \(\sup _n{\mathbb {E}}M_n(x)^2<\infty \). In particular, still for each fixed x, the family \(M_n(x)\) is uniformly integrable [10, Theorem 5.4.2]. This is what we need, since [2, Proposition 1.2] says that the functional convergence (A.3) follows from the finite-dimensional convergence just shown, together with this pointwise uniform integrability and the continuity of the limit Z(x). (This proposition in [2] is stated for \(J=[0,\infty )\), but the result transfers to any \(J=[0,b)\) by a change of variable.)
This proves the theorem in the case \(J=[0,b)\). The case \(J=[0,b]\) follows from the case \(J=[0,\infty )\) by (re)defining \(X_{ni}:=0\) for \(i>\lfloor nb\rfloor \) and extending \(g\) to \([0,\infty )\) by \(g(x):=g(b)\) for \(x>b\). \(\square \)
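As a quick numerical illustration of Theorem A.1 (our sketch, with an arbitrarily chosen variance profile), take \(X_{ni}=Y_i/\sqrt{n}\) with \(Y_i\) independent centered Gaussians with \({\text {Var}}Y_i = 1+i/n\); then (A.1) holds with \(g(x)=x+x^2/2\), and \(M_n(1)\) should be approximately \(N(0,g(1))=N(0,3/2)\).

```python
import math, random

def sample_Mn1(n, m, rng):
    """Draw m independent copies of M_n(1) = sum_{i<=n} X_ni for the
    triangular array X_ni = Y_i / sqrt(n), Y_i ~ N(0, 1 + i/n) independent;
    here (A.1) gives g(x) = x + x^2/2, so Var M_n(1) -> g(1) = 3/2."""
    vals = []
    for _ in range(m):
        s = 0.0
        for i in range(1, n + 1):
            s += rng.gauss(0.0, math.sqrt(1.0 + i / n)) / math.sqrt(n)
        vals.append(s)
    return vals

vals = sample_Mn1(200, 2000, random.Random(5))
mean = sum(vals) / len(vals)
var = sum(v * v for v in vals) / len(vals) - mean ** 2
print(mean, var)    # mean ~ 0, variance ~ 1.5
```

(With Gaussian summands the Lindeberg condition (A.2) is immediate; the theorem of course allows far more general distributions.)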
Jacquet, P., Janson, S. Depth-First Search Performance in a Random Digraph with Geometric Outdegree Distribution. La Matematica 3, 262–292 (2024). https://doi.org/10.1007/s44007-024-00085-2