The Number of Optimal Matchings for Euclidean Assignment on the Line


We consider the Random Euclidean Assignment Problem in dimension \(d=1\), with linear cost function. In this version of the problem, in general, there is a large degeneracy of the ground state, i.e. there are many different optimal matchings (say, \(\sim \exp (S_N)\) at size N). We characterize all possible optimal matchings of a given instance of the problem, and we give a simple product formula for their number. Then, we study the probability distribution of \(S_N\) (the zero-temperature entropy of the model), in the uniform random ensemble. We find that, for large N, \(S_N \sim \frac{1}{2} N \log N + N s + {\mathcal {O}}\left( \log N \right) \), where s is a random variable whose distribution p(s) does not depend on N. We give expressions for the moments of p(s), both from a formulation as a Brownian process, and via singularity analysis of the generating functions associated to \(S_N\). The latter approach provides a combinatorial framework that allows us to compute an asymptotic expansion to arbitrary order in 1/N for the mean and the variance of \(S_N\).


The Euclidean Assignment Problem (EAP) is a combinatorial optimization problem in which one has to pair N white points to N black points minimizing a total cost function that depends on the Euclidean distances among the points. The Euclidean Random Assignment Problem (ERAP) is the statistical system in which these 2N points are drawn from a given probability measure. In the latter version of the problem, one is interested in characterizing the statistical properties of the optimal solution, such as its average cost, average structural properties, and so on.

The EAP has applications in Computer Science, where it has been used in computer graphics, image processing and machine learning [20], and it is also a discretised version of a problem in functional analysis, called the optimal transport problem, where the N-tuples of points are replaced by probability measures. The optimal transport problem has recently attracted growing interest in pure mathematics, where it has found applications in measure theory and gradient flows [2].

To be more definite, the EAP problem is the optimization problem defined by

$$\begin{aligned} \begin{aligned} \min _{\pi \in {\mathfrak {S}}_N} H_J(\pi )\, \end{aligned} \end{aligned}$$


  • \(J=(\{w_i\},\{b_j\})\) is the instance of the problem, i.e., in the Euclidean version in dimension d, the datum of the positions of N white points \(\{w_i\}\) and N black points \(\{b_j\}\) in \({\mathbb {R}}^d\);

  • \(\pi \) is a bijection between the two sets of points, linking each white point to a unique black point; in other words, it is a perfect matching. We denote by \({\mathfrak {S}}_N\) the set of all possible bijections between sets of size N;

  • \(H_J(\pi )\) is the cost function

    $$\begin{aligned} \begin{aligned} H_J(\pi ) = \sum _{i=1}^N c\left( \mathrm {dist}\left( w_i, b_{\pi (i)}\right) \right) \, \end{aligned} \end{aligned}$$

    i.e. the sum of the costs of the links of \(\pi \), where a link is weighted using a link cost function \(c(x):{\mathbb {R}}^+\rightarrow {\mathbb {R}}\), depending only on the Euclidean distance between the two points.

In the EAP, J is considered as fixed, while in the ERAP, J is a random variable with a fixed probability distribution.

It is a longstanding project to understand the “phase diagram” of the model in the \((p,d)\) plane, say, for definiteness, in the version of this problem where J is given by 2N i.i.d. points from the d-dimensional hypercube \([0,1]^d\), and the link cost function is given by \(c(x)=x^p\). Even the definition of the phase diagram requires a clarification, as it is not obvious which observable should be adopted as order parameter. A first relevant quantity is the scaling in N of the cost of the optimal matching, a quantity that, as we will see in a moment, undergoes phase transitions that are connected to changes in the structural properties of the optimal matchings.

This phase diagram is, to a certain extent, still mysterious, although some progress has been made recently (see for example [1]), and all the information collected so far is consistent with the assumption that, for all \((p,d)\), there exist exponents \(\gamma \) and \(\gamma '\) such that the optimal cost at size N scales as \(E_N \sim N^\gamma (\ln N)^{\gamma '}\), where \(\gamma =\gamma (p,d)\) is continuous and piece-wise smooth, and \(\gamma '=\gamma '(p,d)\) is zero everywhere except possibly on the lines of discontinuity of \(\gamma (p,d)\) (which are the critical lines). It is possible that one critical line is the half-line \(d=2\), \(p \ge 1\), and that two other critical lines reach a tricritical point \((p,d)=(1,2)\), starting from \(d=0\), and passing through \(p=\frac{1}{2}\) and \(p=1\) when \(d=1\). In particular, when \(d=1\), the analytical characterization of optimal solutions is quite tractable [4, 16], and the emerging landscape is as follows:

  • For \(p>1\), the optimal matching is ordered (see Footnote 1), independently of J; this is due to the fact that the link cost function is increasing and convex. In this case, plenty of results have been obtained on the statistics of the average optimal cost [7,8,9, 11].

  • For \(0<p<1\), the optimal matching is non-crossing, meaning that pairs of matched points are either nested one inside the other, or disjoint (that is, matched pairs do not interlace) [16]. This property is imposed by the concavity of the link cost function. In this case, the optimal matching is not uniquely determined by the ordering of the points (although the viable candidates are typically reduced, roughly, from \(N!\) to \(\sqrt{N!}\)), and comparatively fewer results have been found so far for the random version of the problem [3, 5, 10, 15]. Note that, remarkably, it is expected that, within this interval, there are two transitions: one at \(p=1\), with \(\gamma '(1,1)=0\), somewhat corresponding to the ordered / non-crossing structural transition, and one at \(p=\frac{1}{2}\), with \(\gamma '(\frac{1}{2},1)=1\), corresponding to a transition in which the leading contribution to the cost of the optimal matching comes either from the few longest edges (of length \({\mathcal {O}}(1)\), when \(p>\frac{1}{2}\)) or from the many shortest edges (of length \({\mathcal {O}}(N^{-1})\), when \(p<\frac{1}{2}\)); the logarithmic correction at \(p=\frac{1}{2}\) comes from the fact that, in this case, all length scales contribute to the leading part of the optimal cost [5].

  • For \(p<0\), due to the fact that an overall positive factor in c(x) is irrelevant in the determination of the optimal matching, it is questionable whether one should consider the analytic continuation of the cost function \(c(x) = x^p\) (which has the counter-intuitive property that the preferred links are the longest ones), or of the cost function \(c(x) = p \, x^p\) (which has the property that the preferred links are the shortest ones, and that the limit \(p \rightarrow 0\) is well-defined, as it corresponds to \(c(x)=\log x\), but has the disadvantage of having average cost \(-\infty \) when \(p \le -d\)). In the first case, the optimal matching is cyclical, meaning that the permutation that describes the optimal matching has a single cycle, and some results on the average optimal cost were obtained in [7]. In the second case, in the pertinent range \(-d<p\le 0\), it seems that the qualitative features of the \(0< p <\frac{1}{2}\) regime are preserved.

We notice that in all the cases above, as well as in all the cases with \(d \ne 1\) and any p, the optimal matching is ‘effectively unique’, meaning that it is almost surely unique, and that, even for an instance showing degeneracy of ground states, almost surely an infinitesimal perturbation immediately lifts the degeneracy.

This generic non-degeneracy property fails only in the \(d=p=1\) case (note that \(p=1\) is the exponent at which the cost function changes concavity, that is, the critical point for the ordered / non-crossing structural transition). It is known [4] that in this case there are (almost surely) at least two distinct optimal matchings for each instance of the problem, the ordered matching and the Dyck matching [5], which coincide only in the rather atypical case (with probability \(N!/(2N-1)!!\)) in which, for all i, the i-th white point and the i-th black point are consecutive along the segment.

These observations suggest some fundamental questions for the \((d,p)=(1,1)\) problem:

  1. Is it possible to characterize all the optimal matchings of a fixed instance J of the EAP?

  2. How many optimal configurations are there for a fixed instance J of the EAP?

  3. For random J’s, what are the statistical properties of the number of optimal configurations?

The aim of this paper is to answer the three questions above.

Our interest is not purely combinatorial. In fact, the ERAP is a well-known and well-studied toy model of Euclidean spin glass [6, 17,18,19]. The characterization of the set \({\mathcal {Z}}_J \subseteq {\mathfrak {S}}_N\) of the optimal matchings of J is thus related to the computation of the zero-temperature partition function \(Z_J = |{\mathcal {Z}}_J|\) of the disordered model, and of its zero-temperature entropy \(S_{N}(J) =\log (Z_J)\).

In this manuscript we will focus on the statistical properties of the zero-temperature entropy, the thermodynamic potential that rules the physics of the model when the disorder is quenched, i.e. when the timescale of the dynamics of the disorder degrees of freedom of the model is much larger than that of the microscopic degrees of freedom. We leave to future investigations the study of the annealed and replicated partition functions \(Z_J\) and \(Z_J^k\), which are less fundamental from the point of view of statistical physics, but, as we will show elsewhere, have remarkable combinatorial and number-theoretical properties.

Summary of Results

In Sect. 2.1 we prove that, given a fixed instance J of the \((d,p) = (1,1)\) EAP problem, a matching \(\pi \) is optimal if and only if

$$\begin{aligned} \begin{aligned} k_\mathrm{LB}(z) = k_\pi (z) \, \end{aligned} \end{aligned}$$

where \(k_\mathrm{LB}(z)\) is a function of J that counts the absolute difference between the numbers of white and black points to the left of z, and \(k_\pi (z)\) counts the number of links of \(\pi \) whose endpoints lie on opposite sides of z.

In Sect. 2.2, we show that \({\mathcal {Z}}_J\), the set of optimal matchings, depends on J only through the ordering of the points, while it is independent of their positions (provided that the ordering is not changed). Using a straightforward bijection between the ordering of bi-colored point configurations on the line and a class of lattice paths (the Dyck bridges), we provide a combinatorial recipe to construct the set \({\mathcal {Z}}_J\). As a corollary, we obtain a rather simple product formula for the cardinality \(Z_J := |{\mathcal {Z}}_J|\), that is, roughly speaking,

$$\begin{aligned} \begin{aligned} Z_J = \prod _{\text {descending steps}} \text {height of the step} \, \end{aligned} \end{aligned}$$

where the product runs over the descending steps of the Dyck bridge associated with the ordering of the points of configuration J, and the height of a step is, roughly speaking, the absolute value of its vertical position. This implies an analogous sum formula for the entropy \(S_N (J) = \log (Z_J)\) (the subscript N recalls that the configuration J consists of 2N points).

Then, we study the statistics of \(S_N (J)\) when J is a random instance of the problem of fixed size N. Our techniques apply equally well, and with calculations that can be performed in parallel, to two interesting statistical ensembles of 2N points:

  • Dyck bridges: the case in which white and black points are just i.i.d. on the unit interval.

  • Dyck excursions: the restriction of the previous ensemble to the case in which, for all \(z \in [0,1]\), there are at least as many white points on the left of z as black points.

The possibility of studying these two ensembles in parallel arises also in our study of the energy distribution of the “Dyck matching”, which we perform elsewhere [5, 10].

In Sect. 3.1, we highlight a connection between \(S_N\) (in the two ensembles) and the observable

$$\begin{aligned} \begin{aligned} s[{\varvec{\sigma }}] = \int _0^1 dt \log \left( |{\varvec{\sigma }}(t)| \right) \, \end{aligned} \end{aligned}$$

over Brownian bridges (or excursions) \({\varvec{\sigma }}\). Simple scaling arguments imply that

$$\begin{aligned} \begin{aligned} s = \lim _{N \rightarrow \infty } \frac{S_{N} - \frac{1}{2} N\log N}{N} \, \end{aligned} \end{aligned}$$

is a random variable with a non-trivial limit distribution for large N.

We provide integral formulas for the integer moments of s, and we use these formulas to compute analytically its first two moments in the two ensembles.

In Sect. 3.2, we complement this analysis with a combinatorial framework at finite size N. We use this second approach to provide an effective strategy for the computation of finite-size corrections, that we illustrate by calculating the first and second moment, for both bridges and excursions.

In particular, we can establish that

$$\begin{aligned} \begin{aligned} S_N {\mathop {=}\limits ^{d}} \frac{1}{2} N \log N + N s + {\mathcal {O}}(\log (N)) \, \end{aligned} \end{aligned}$$

where s is a random variable whose distribution depends on the ensemble (among bridges and excursions). For Dyck bridges we have

$$\begin{aligned} \begin{aligned} \langle s \rangle _\mathrm{B}&=- \frac{\gamma _E+2}{2} + {\mathcal {O}}\left( \frac{\log N}{\sqrt{N}} \right) \\ \langle s^2 \rangle _\mathrm{B}&=\frac{4}{3} + \frac{\gamma _E^2}{4} + \gamma _E - \frac{\pi ^2}{72} + {\mathcal {O}}\left( \frac{(\log N)^2}{\sqrt{N}} \right) \end{aligned} \end{aligned}$$

and for Dyck excursions we have

$$\begin{aligned} \begin{aligned} \langle s \rangle _\mathrm{E}&=- \frac{\gamma _E}{2} + {\mathcal {O}}\left( \frac{\log N}{\sqrt{N}} \right) \\ \langle s^2 \rangle _\mathrm{E}&=\frac{\gamma _E ^2}{4}+\frac{5 \pi ^2}{24} -2 + {\mathcal {O}}\left( \frac{(\log N)^2}{\sqrt{N}} \right) \end{aligned} \end{aligned}$$

In Sect. 4 we provide numerical evidence that the distribution of the rescaled entropy s is non-Gaussian (but we leave open the question of whether the centered distribution for the excursions is an even function), and we confirm that the predicted values for the first two moments match with the simulated data.

Both of our approaches seem to give access to higher-order moments as well; however, we cannot prove at present that, for any finite moment, the evaluation can be performed in closed form, nor that (as we conjecture) the result is a polynomial with rational coefficients involving only \(\gamma _E\) and (multiple) zeta values. We will investigate these aspects in future works.

Optimal Matchings at \(p=1\)

In the following, we focus on the EAP with \(p=d=1\). We have N white points \(W=\{w_{i}\}_{i=1}^{N}\) and N black points \(B=\{b_{i}\}_{i=1}^{N}\) on a segment [0, L] (i.e., we have ‘closed’ boundary conditions, instead of ‘periodic’, as we would have if the points were located on a circle). We assume that the instance is generic (i.e., no two coordinates are the same), and we label the points so that the lists above are ordered (\(w_i<w_{i+1}\) and \(b_i<b_{i+1}\)).

Characterization of Optimal Matchings

We start by giving an alternative, integral representation of the cost function \(H_J(\pi )\). For a given matching \(\pi \), and every \(z \in {\mathbb {R}}\) which is not a point of W or B, define the function

$$\begin{aligned} k_{\pi }(z)&= \sum _{i=1}^N \kappa (z,w_i,b_{\pi (i)})&\kappa (z,x,y)=\left\{ \begin{array}{ll} 1 &{} \text {if}\, x<z<y \text {~or~} y<z<x \\ 0 &{} \text {otherwise} \end{array} \right. \end{aligned}$$

In words, \(\kappa (z,x,y)\) is just the indicator function over the segment with extreme points x and y, and \(k_\pi (z)\) counts the number of links of \(\pi \) that have endpoints on opposite sides of z.

Furthermore, define the function

$$\begin{aligned} k_\mathrm{LB}(z)&= \big | \#( W \cap [0,z]) - \#( B \cap [0,z] ) \big | \, , \end{aligned}$$

where \(\#(I)\) denotes the cardinality of the set I. Like \(k_{\pi }(z)\), \(k_\mathrm{LB}\) is defined for all \(z \in {\mathbb {R}} \setminus (W \cup B)\), and counts the excess of black or white points to the left of z.

Then, we have two simple observations

Proposition 1

At \(p=1\)

$$\begin{aligned} H_J(\pi ) = \int \mathrm {d}{z}\, k_{\pi }(z) \, . \end{aligned}$$

Moreover, at \(p=1\), for all \(\pi \) and all z,

$$\begin{aligned} k_{\pi }(z) \ge k_\mathrm{LB}(z) \, . \end{aligned}$$

This has the immediate corollary that, for all \(\pi \),

$$\begin{aligned} H_J(\pi ) \ge H_J^\mathrm{LB} := \int \mathrm {d}{z}\, k_\mathrm{LB}(z) \, . \end{aligned}$$

Now, call \(\pi _\mathrm{id}\) the identity permutation, that is, the so-called ordered matching. By simple inspection, we have that \(H_J(\pi _\mathrm{id}) = H_J^\mathrm{LB}\). This implies that \(\pi _\mathrm{id}\) is optimal, and more generally

Corollary 1

\(\pi \in {\mathcal {Z}}_J\) iff the functions \(k_\mathrm{LB}\) and \(k_{\pi }\) coincide.

See Fig. 1 for the description of all optimal matchings at \(N=2\).

Fig. 1: Optimal matchings at \(p=1\) for \(2N=4\) points. \(\pi _{1}\) is the matching in which the first white point is matched with the first black point; \(\pi _{2}\) is the only other possible matching. All optimal configurations satisfy \(k_{\pi }(x) = k_\mathrm{LB}(x)\).
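The characterization of Corollary 1 can be verified by brute force on small instances. The following Python sketch (helper names are ours) draws a random instance, enumerates all matchings, and checks that the minimizers of the cost are exactly the matchings \(\pi \) for which \(k_\pi \) coincides with \(k_\mathrm{LB}\) on every interval between consecutive points (both functions are constant there):

```python
import random
from itertools import permutations

random.seed(0)
N = 4
# a generic instance J: N white and N black points, i.i.d. on [0, 1], sorted
w = sorted(random.random() for _ in range(N))
b = sorted(random.random() for _ in range(N))

def cost(pi):
    # H_J(pi) = sum of |w_i - b_{pi(i)}|  (link cost c(x) = x, i.e. p = 1)
    return sum(abs(w[i] - b[pi[i]]) for i in range(N))

def k_pi(z, pi):
    # number of links of pi whose endpoints lie on opposite sides of z
    return sum(1 for i in range(N)
               if min(w[i], b[pi[i]]) < z < max(w[i], b[pi[i]]))

def k_LB(z):
    # absolute excess of white over black points to the left of z
    return abs(sum(1 for x in w if x < z) - sum(1 for x in b if x < z))

# one test point z in each interval between consecutive points
pts = sorted(w + b)
zs = [(pts[i] + pts[i + 1]) / 2 for i in range(2 * N - 1)]

opt = min(cost(pi) for pi in permutations(range(N)))
for pi in permutations(range(N)):
    is_optimal = abs(cost(pi) - opt) < 1e-12
    matches_lb = all(k_pi(z, pi) == k_LB(z) for z in zs)
    assert is_optimal == matches_lb   # Corollary 1
```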

In the following we will provide a simple algorithm to construct the optimal matchings of a given instance. In order to do this, we shall now give another characterization of optimal matchings:

Definition 1

Let \(\pi \) be a matching. Let us call \(P=(p_1,\ldots ,p_{2N})\) the ordered list of the points in \(W \cup B\). For \(1\le i \le 2N\), we call \(P_i(\pi )\) the stack of \(\pi \) at i, that is, the set of points in \(\{p_1,\ldots ,p_i\}\) that are paired by \(\pi \) to points in \(\{p_{i+1},\ldots ,p_{2N}\}\) (see Fig. 2).

Fig. 2: The stack of a matching at position i is given by all the points on the left side of point i (including i itself) that are matched to a point on the right side of point i. In the picture, \(w_{i}\) is the i-th white point from the right, and analogously \(b_{i}\) is the i-th black point from the right. At the locations specified by the dashed lines, we show the stack of the represented matching.

Proposition 2

\(\pi \in {\mathcal {Z}}_J\) iff, for all \(1 \le i \le 2N\), the stack of \(\pi \) at i is either empty or monochromatic, i.e. if \(P_i(\pi ) \cap W = \varnothing \) or \(P_i(\pi ) \cap B = \varnothing \).


Suppose that, for some \(1 \le i \le 2N\), the stack of \(\pi \) at i is non-empty and non-monochromatic. Then there are \(p_w \in P_i(\pi ) \cap W\) and \(p_b \in P_i(\pi ) \cap B\) which are matched to \(q_b \in P_i^c(\pi ) \cap B\) and \(q_w \in P_i^c(\pi ) \cap W\), respectively. In the matching \(\pi '\) in which we swap these two pairs, we have \(k_{\pi '}(z) = k_{\pi }(z)-2\) for all \(\max (p_w,p_b)<z<\min (q_w,q_b)\), and \(k_{\pi '}(z) = k_{\pi }(z)\) elsewhere; thus, by Corollary 1, \(\pi \) cannot be optimal.

Vice versa, let \(\pi \) have only empty or monochromatic stacks. This means that, in a right neighbourhood of \(p_i\), the cardinality of the stack, which by definition coincides with \(k_\pi (x)\), is exactly given by \(k_\mathrm{LB}\). Furthermore, both these functions are constant on the intervals between the points (where they jump by \(\pm 1\)), so the two functions coincide everywhere on the domain. This means, by Corollary 1, that \(\pi \) must be optimal. \(\square \)

Enumeration of Optimal Matchings

We are now interested in enumerating the optimal matchings at \(p=1\) for a fixed configuration J of size N. First of all, we give an alternative representation of J (already adopted in [5]) that will be useful in the following. A configuration J can be encoded by sorting the 2N points in order of increasing coordinate (as in Definition 1 above), and defining

  • a vector of spacings \(\mathbf {s}(J) \in ({\mathbb {R}}^+)^{2N}\), given by \(\mathbf {s}(J)=(p_1,p_2-p_1,p_3-p_2,\ldots ,p_{2N}-p_{2N-1})\);

  • a vector of signs, \({\varvec{\sigma }}(J) \in \{-1,+1\}^{2N}\) such that if \(p_i\) is white (resp. black), \(\sigma _{i}=+1\) (resp. \(-1\)). Note that \(\sum _i \sigma _i = 0\).

It is easily seen that the criterion in Proposition 2 is stated only in terms of \({\varvec{\sigma }}(J)\). This makes clear that the set \({\mathcal {Z}}_J\) itself is fully determined by \({\varvec{\sigma }}(J)\). Thus, from this point onward, we understand that \({\mathcal {Z}}({\varvec{\sigma }})\), \(Z({\varvec{\sigma }})\) and \(S({\varvec{\sigma }})\) are synonyms of the quantities \({\mathcal {Z}}_J\), \(Z_J\) and \(S_J\), for any J with \({\varvec{\sigma }}(J)={\varvec{\sigma }}\).

Binary vectors can be represented as lattice paths, i.e. paths in the plane, starting at the origin and composed by up-steps (or rises) \((+1,+1)\) and down-steps (or falls) \((+1,-1)\). So we have a bijection between zero-sum binary vectors \({\varvec{\sigma }}\) and lattice bridges, in which the i-th step of the path is \((+1,\sigma _i)\). The bijection between color orderings, binary vectors and lattice paths is so elementary that in the following, with a slight abuse of notation, we will just identify the three objects.

We are interested in two classes of binary vectors / lattice paths:

  • Dyck bridges \({\mathcal {B}}_N\) of semi-length (size) N. These are lattice paths with an equal number of up- and down-steps. They are precisely in bijection with the color orderings of N white and N black points, i.e. with the color orderings of all the possible configurations J. There are \(B_N = |{\mathcal {B}}_N| = \left( {\begin{array}{c}2N\\ N\end{array}}\right) \) Dyck bridges of size N.

  • Dyck paths \({\mathcal {C}}_N\) of semi-length (size) N. These are lattice bridges that never reach negative ordinate (we will also call them excursions, in analogy with their continuum counterparts). They are in bijection with configurations J in the aforementioned “Dyck excursions” ensemble. There are \(C_N = |{\mathcal {C}}_N| = \frac{1}{N+1} B_N\) Dyck paths of size N.

In the following, the generating functions of the sequences \(B_N\) and \(C_N\) will prove useful. We have

$$\begin{aligned} \begin{aligned} B(z)&= \sum _{N \ge 0} B_N z^N = (1-4z)^{-\frac{1}{2}} \, ,\\ C(z)&= \sum _{N \ge 0} C_N z^N = \frac{1-\sqrt{1-4z}}{2z} \, . \end{aligned} \end{aligned}$$
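As a quick sanity check on these counting formulas, one can enumerate all \(\pm 1\) sequences for small N and compare with the closed forms \(B_N = \binom{2N}{N}\) and \(C_N = B_N/(N+1)\) (the Catalan numbers). A minimal sketch (function name is ours):

```python
from itertools import product
from math import comb

def count_paths(N):
    """Count Dyck bridges (final height zero) and Dyck paths
    (final height zero, never below the axis) of semi-length N."""
    bridges = paths = 0
    for steps in product((+1, -1), repeat=2 * N):
        heights, h = [], 0
        for s in steps:
            h += s
            heights.append(h)
        if h == 0:
            bridges += 1
            if min(heights) >= 0:
                paths += 1
    return bridges, paths

for N in range(1, 6):
    B, C = count_paths(N)
    assert B == comb(2 * N, N)              # B_N = binom(2N, N)
    assert C == comb(2 * N, N) // (N + 1)   # C_N = Catalan number
```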

Our notion of height will be associated to the steps of the path. We call \(h_{i}({\varvec{\sigma }})\) the height of the path at step i, that is, the height of the midpoint of the i-th step of the path, that in terms of the binary vector reads

$$\begin{aligned} \begin{aligned} h_{i}({\varvec{\sigma }}) = \sigma _1 + \sigma _2 + \cdots + \sigma _{i-1} + \frac{\sigma _{i}}{2} \, . \end{aligned} \end{aligned}$$

The choice of the midpoint to compute the height is arbitrary, but has the advantage of being a symmetric definition with respect to reflections w.r.t. the x-axis, while taking values in an equispaced range. We can then define the positive integers

$$\begin{aligned} {\bar{h}}_i=|h_i|+\frac{1}{2} \, . \end{aligned}$$

Then we have

Lemma 1

$$\begin{aligned} Z({\varvec{\sigma }}) = \prod _{\begin{array}{c} i=1,\ldots ,2N \\ h_i \sigma _i < 0 \end{array}} {\bar{h}}_i({\varvec{\sigma }}) \, . \end{aligned}$$


The proof goes through the characterisation of the stacks of optimal configurations, given in Proposition 2. First of all, notice that the list of positions given by the condition \(h_i \sigma _i < 0\) is the one at which the stack at \(i-1\) decreases its size, because one point in the stack is paired to the i-th point. We shall call closing steps the elements of this set (and closing points the associated points in P), and opening steps those in the complementary set. Indeed, the cardinality of the stack at \(i-1\) is exactly \({\bar{h}}_i\), while the sign of \(h_i\) determines the colour of the points in the stack at \(i-1\) (which is also the colour of the stack at i, unless the latter is empty). So, there are exactly \({\bar{h}}_i\) choices for the pairing at i, while, if i is not in the list above, the choice is unique. As the cardinalities of the stacks are the same for all optimal configurations, the choice at i does not affect the number of possible choices at \(j>i\), and we end up with Eq. (18). \(\square \)

Notice that Eq. (18) is trivially equivalent to

$$\begin{aligned} \begin{aligned} Z({\varvec{\sigma }}) = \prod _{i=1}^{2N} \sqrt{{\bar{h}}_i({\varvec{\sigma }})} = \prod _{\begin{array}{c} i=1,\ldots ,2N \\ \sigma _i = -1 \end{array}} {\bar{h}}_i({\varvec{\sigma }}) \, . \end{aligned} \end{aligned}$$
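Lemma 1 translates directly into a few lines of code. The following Python sketch (function names ours) computes \(Z({\varvec{\sigma }})\) from the midpoint heights, and cross-checks it, on a few small sign sequences, against a brute-force count of the minimizers of the \(p=1\) cost with unit spacings (which suffices, since \({\mathcal {Z}}_J\) depends on J only through the ordering):

```python
from itertools import permutations

def Z(sigma):
    """Lemma 1: Z(sigma) = product of hbar_i over the closing steps,
    i.e. the steps with h_i * sigma_i < 0, where h_i is the midpoint
    height and hbar_i = |h_i| + 1/2 is a positive integer."""
    z, h = 1, 0
    for s in sigma:
        mid = h + s / 2              # midpoint height of this step
        if mid * s < 0:              # closing step
            z *= int(abs(mid) + 0.5)
        h += s
    return z

def brute_force_Z(sigma):
    """Count optimal matchings directly, placing the 2N points at
    integer positions 0, 1, ..., 2N-1 (only the ordering matters)."""
    w = [i for i, s in enumerate(sigma) if s == +1]
    b = [i for i, s in enumerate(sigma) if s == -1]
    costs = [sum(abs(wi - b[p[k]]) for k, wi in enumerate(w))
             for p in permutations(range(len(b)))]
    return costs.count(min(costs))

for sigma in [(1, 1, -1, -1), (1, -1, 1, -1),
              (1, 1, -1, 1, -1, -1), (1, -1, -1, 1, 1, -1)]:
    assert Z(sigma) == brute_force_Z(sigma)
```

For instance, the nested configuration \({\varvec{\sigma }} = (+,+,-,-)\) gives \(Z=2\), in agreement with Fig. 1.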

The proof above has a stronger implication. Write each optimal matching \(\pi \in {\mathcal {Z}}_J\) in the form \(\pi =\{(i_1,j_1),\ldots ,(i_N,j_N)\}\) where \(i_a\) and \(j_a\) are in the range \(\{1,\ldots ,2N\}\), and are the indices of the white and black points (altogether), ordered from left to right. Order the pairs canonically, by setting \(i_a<j_a\) for all \(a=1,\ldots ,N\), and the \(i_a\)’s in increasing order. Observe that, calling \(\varvec{i}(\pi )\) the list of the \(i_a\)’s in the resulting order, we have that all the optimal matchings have the same list \(\varvec{i}\) (and no non-optimal matching has this same list). Then, call \(\varvec{j}(\pi )\) the list of the \(j_a\)’s, in the order induced by the \(i_a\)’s. Finally, order the set \({\mathcal {Z}}_J\) according to the lexicographic order of the strings \(\varvec{j}(\pi )\).

Despite the fact that, as is apparent from Lemma 1, the typical values of \(Z_J\) are potentially at least exponential in N, we can construct the m-th of the \(Z_J\) solutions (in the order above) by a polynomial-time algorithm (which takes on average time \(\sim N \log N\) and space \(\sim \sqrt{N}\) if a suitable data structure is used). The algorithm goes as follows. First, rewrite m in the form \(m-1=a_1 + a_2 {\bar{h}}_1 + a_3 {\bar{h}}_1 {\bar{h}}_2 + \cdots + a_N {\bar{h}}_1 {\bar{h}}_2 \cdots {\bar{h}}_{N-1}\), with \(0 \le a_j < {\bar{h}}_j\). Then, say that \(i(j)=i\) if the j-th closing point is \(p_i\). Now, produce the N pairs of the m-th optimal matching by considering the closing points in order of increasing j, and pairing the j-th closing point to the \(a_j\)-th element of the stack at \(i(j)-1\), when the latter is sorted in increasing order.
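The algorithm above can be written compactly; in the following Python sketch (function names ours; digits \(a_j\) count from zero; a plain list stands in for the more efficient data structure alluded to in the text), the mixed-radix digits of \(m-1\) select, for each closing point, which element of the sorted stack it gets paired to:

```python
def heights(sigma):
    """Midpoint heights h_i = sigma_1 + ... + sigma_{i-1} + sigma_i/2."""
    out, h = [], 0
    for s in sigma:
        out.append(h + s / 2)
        h += s
    return out

def mth_optimal_matching(sigma, m):
    """Return the m-th optimal matching (1 <= m <= Z(sigma)) as a
    sorted list of 0-based index pairs (opening point, closing point)."""
    h = heights(sigma)
    hbar = [int(abs(x) + 0.5) for x in h]
    closing = [i for i in range(len(sigma)) if h[i] * sigma[i] < 0]
    # mixed-radix expansion m-1 = a_1 + a_2*hbar_1 + a_3*hbar_1*hbar_2 + ...
    digits, r = [], m - 1
    for i in closing:
        digits.append(r % hbar[i])
        r //= hbar[i]
    stack, pairs, d = [], [], iter(digits)
    for i, s in enumerate(sigma):
        if h[i] * s < 0:                     # closing point:
            pairs.append((stack.pop(next(d)), i))  # pop chosen element
        else:                                # opening point: push
            stack.append(i)                  # stack stays sorted increasingly
    return sorted(pairs)

# the two optimal matchings of sigma = (+,+,-,-), cf. Fig. 1
sols = [mth_optimal_matching((1, 1, -1, -1), m) for m in (1, 2)]
assert sols == [[(0, 2), (1, 3)], [(0, 3), (1, 2)]]
```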

Statistical Properties of \(S_N\)

We are now interested in the statistical properties of the entropy

$$\begin{aligned} \begin{aligned} S_N({\varvec{\sigma }}) = \log Z({\varvec{\sigma }}) = \frac{1}{2} \sum _{i=1}^{2N} \log ({\bar{h}}_i({\varvec{\sigma }})) \, \end{aligned} \end{aligned}$$

when \({\varvec{\sigma }}(J)\) is a random variable induced by some probability measure on the space of configurations J of 2N points, and N tends to infinity. The subscript N reminds us that we are at size \(|W|=|B|=N\).

In matching problems, the typical choices for the configurational probability measure are factorized into a measure on the spacings and a measure on the color orderings, i.e.

$$\begin{aligned} \begin{aligned} \mu (J) = \mu _\mathrm{spacing}(\mathbf {s}(J)) \; \mu _\mathrm{color}({\varvec{\sigma }}(J)) \, \end{aligned} \end{aligned}$$

(see [5] for more details and examples). As \(S_N(J) = S_N({\varvec{\sigma }}(J))\), we can again forget about the spacing degrees of freedom, and study the statistics of \(S_N({\varvec{\sigma }})\) induced by some measure \(\mu _\mathrm{color}({\varvec{\sigma }})\). In particular, we will study the cases in which \({\varvec{\sigma }}\) is uniformly drawn from the set of Dyck paths, or uniformly drawn from the set of Dyck bridges.

Integral Formulas for the Integer Moments of \(S_N\) via Wiener Processes

It is well known (see Donsker’s theorem [12]) that lattice paths such as Dyck paths and bridges converge, as \(N\rightarrow \infty \) and after a proper rescaling, to Brownian bridges and Brownian excursions. Brownian bridges are Wiener processes constrained to end at null height, while Brownian excursions are Wiener processes constrained to end at null height and to lie in the upper half-plane. The correct rescaling of the steps of the lattice paths that highlights this convergence is given by \((+1,\pm 1) \rightarrow \left( +\frac{1}{N}, \pm \frac{1}{\sqrt{N}} \right) \).

These scalings suggest considering a rescaled version of the entropy

$$\begin{aligned} \begin{aligned} s({\varvec{\sigma }}) = \frac{S_N({\varvec{\sigma }}) - \frac{1}{2}N\log N}{N} = \frac{1}{2N} \sum _{i=1}^{2N} \log \left( \frac{{\bar{h}}_i({\varvec{\sigma }})}{\sqrt{N}}\right) \, . \end{aligned} \end{aligned}$$

In the limit \(N\rightarrow \infty \), the rescaled entropy converges to an integral functional of the Wiener process

$$\begin{aligned} \begin{aligned} s[{\varvec{\sigma }}] = \int _0^1 dt \, \log \left( |{\varvec{\sigma }}(t)| \right) \end{aligned} \end{aligned}$$

where \({\varvec{\sigma }}(t)\) is a Brownian bridge/excursion.
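This convergence can be probed numerically by sampling uniform Dyck bridges (a uniform shuffle of N up-steps and N down-steps) and evaluating the rescaled entropy above. A minimal Python sketch (sample sizes and names are ours; recall from the results of Sect. 3.2 that the corrections to the mean are \({\mathcal {O}}(\log N/\sqrt{N})\), so the agreement at moderate N is only approximate):

```python
import math
import random

random.seed(1)

def sampled_s(N):
    """One sample of s = (S_N - (N/2) log N) / N for a uniform Dyck bridge."""
    sigma = [+1] * N + [-1] * N
    random.shuffle(sigma)                       # uniform Dyck bridge
    S, h = 0.0, 0
    for step in sigma:
        mid = h + step / 2                      # midpoint height
        S += 0.5 * math.log(abs(mid) + 0.5)     # S_N = (1/2) sum_i log hbar_i
        h += step
    return (S - 0.5 * N * math.log(N)) / N

N, samples = 2000, 400
mean_s = sum(sampled_s(N) for _ in range(samples)) / samples
# to be compared with the large-N bridge mean -(gamma_E + 2)/2 ≈ -1.289,
# up to O(log N / sqrt(N)) finite-size corrections
```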

The integer moments of \(s[{\varvec{\sigma }}]\) can be readily computed as correlation functions of the Brownian process:

$$\begin{aligned} \begin{aligned} \langle (s[{\varvec{\sigma }}])^k \rangle _\mathrm{B/E}&= \int {\mathcal {D}}_\mathrm{B/E}[{\varvec{\sigma }}] \int _0^1 dt_1 \dots dt_k \prod _{a=1}^k \log \left( |{\varvec{\sigma }}(t_a)| \right) \\&= k! \int _{\varDelta _k} dt_1 \dots dt_k \int _{{\mathbb {R}}} dx_1 \dots dx_k \prod _{a=1}^k \log \left( |x_a| \right) \int {\mathcal {D}}_\mathrm{B/E}[{\varvec{\sigma }}] \prod _{a=1}^k \delta ( {\varvec{\sigma }}(t_a) - x_a ) \, , \end{aligned} \end{aligned}$$

where \(\varDelta _k \subset {\mathbb {R}}^k\) is the canonical simplex \(\{0 = t_0< t_1< t_2< \cdots< t_k < t_{k+1} = 1\}\), and \({\mathcal {D}}_\mathrm{B/E}[{\varvec{\sigma }}]\) is the standard measure on the Brownian process of choice among bridges and excursions. The last integral is the probability that the Brownian process we are interested in starts and ends at the origin and visits the points \((t_1,x_1),\dots ,(t_k, x_k )\), while subject to its constraints.

Let us consider Brownian bridges first. In this case, the probability that a Wiener process travels from \((t_i,x_i)\) to \((t_f,x_f)\) is given by \({\mathcal {N}}( x_f-x_i | 2(t_f-t_i))\) where \({\mathcal {N}}(x | \sigma ^2 )\) is the p.d.f. of a centered Gaussian distribution with variance \(\sigma ^2\). The factor 2 comes from Donsker’s theorem, and is due to the fact that the variance of the distribution of the steps in the lattice paths is exactly 2. Thus, for Brownian bridges

$$\begin{aligned} \begin{aligned} \int {\mathcal {D}}_\mathrm{B}[{\varvec{\sigma }}] \prod _{a=1}^k \delta ( {\varvec{\sigma }}(t_a) - x_a ) = \frac{\sqrt{4 \pi }}{\prod _{a=0}^{k} \sqrt{4 \pi (t_{a+1} - t_{a} ) }} \exp \left[ - \sum _{a=0}^k \frac{(x_{a+1} - x_{a})^2}{4(t_{a+1} - t_{a})} \right] \, \end{aligned} \end{aligned}$$

where \(x_0=0\), \(x_{k+1}=0\) and the factor \(\sqrt{4 \pi }\) is a normalization, so that

$$\begin{aligned} \begin{aligned} \langle (s[{\varvec{\sigma }}])^k \rangle _\mathrm{B}&= k! \int _{\varDelta _k} dt_1 \dots dt_k \int _{{\mathbb {R}}} dx_1 \dots dx_k \frac{\sqrt{4\pi }\prod _{a=1}^k \log \left( |x_a| \right) }{\prod _{a=0}^{k} \sqrt{4 \pi (t_{a+1} - t_{a} ) }} \exp \left[ - \sum _{a=0}^k \frac{(x_{a+1} - x_{a})^2}{4(t_{a+1} - t_{a})} \right] \, . \end{aligned} \end{aligned}$$

Brownian excursions can be treated analogously using the reflection principle. In this case, the conditional probability that a Wiener process travels from \((t_i,x_i)\) to \((t_f,x_f)\) without ever reaching negative heights, given that it already reached \((t_i ,x_i )\), is given by \({\mathcal {N}}( x_f-x_i | 2 (t_f - t_i) ) - {\mathcal {N}}( x_f+x_i | 2 ( t_f -t_i ) )\) for \(x_{i,f} > 0\), while for \(x_{i}=0\) and \(x_f = x\) (or vice versa) it equals \(\frac{|x|}{2 (t_f -t_i )} {\mathcal {N}}(x| 2 (t_f -t_i ))\). Moreover, now all \(x_i\)’s are constrained to be positive. Thus, for Brownian excursions

$$\begin{aligned} \begin{aligned} \langle (s[{\varvec{\sigma }}])^k \rangle _\mathrm{E}&= k! \int _{\varDelta _k} dt_1 \dots dt_k \int _{[0,+\infty )} dx_1 \dots dx_k \frac{\sqrt{4\pi } x_1 x_k \prod _{a=1}^k \log \left( x_a \right) }{t_1 (1-t_k) \prod _{a=0}^{k} \sqrt{4 \pi (t_{a+1} - t_{a} ) }} \\&\quad \times \exp \left[ -\frac{x_1^2}{4 t_1} -\frac{x_k^2}{4 (1- t_k)} \right] \prod _{a=1}^{k-1} \left\{ \exp \left[ - \frac{(x_{a+1} - x_{a})^2}{4(t_{a+1} - t_{a})} \right] - \exp \left[ - \frac{(x_{a+1} + x_{a})^2}{4(t_{a+1} - t_{a})} \right] \right\} \, . \end{aligned}\nonumber \\ \end{aligned}$$

In both cases, the Gaussian integrations on the heights \(x_i\) can be explicitly performed. First of all, we replace

$$\begin{aligned} \begin{aligned} \log |x_a | = \frac{1}{2} \partial _{\kappa _a} \left[ x_a ^{2\kappa _a} \right] _{\kappa _a =0} \, . \end{aligned} \end{aligned}$$

Then, we treat the contact terms. In the case of bridges, the contact terms can be rewritten (under integration) as

$$\begin{aligned} \begin{aligned} \exp \left[ \frac{x_{a+1} x_a }{2( t_{a+1} -t_a )} \right] \rightarrow \cosh \left( \frac{x_{a+1} x_a }{2( t_{a+1} -t_a )} \right) \, , \end{aligned} \end{aligned}$$

where the hyperbolic sine term is discarded due to the parity of the rest of the integrand in the variables \(x_a\). In the case of excursions, the contact term instead reads

$$\begin{aligned} \begin{aligned} \exp \left[ \frac{x_{a+1} x_a }{2( t_{a+1} -t_a )} \right] - \exp \left[ - \frac{x_{a+1} x_a }{2( t_{a+1} -t_a )} \right] = 2\sinh \left( \frac{x_{a+1} x_a }{2( t_{a+1} -t_a )} \right) \, . \end{aligned} \end{aligned}$$

In both cases, we can expand the hyperbolic function in power-series, so that the integrations in the \(x_a\) variables are now factorized and of the kind

$$\begin{aligned} \begin{aligned} \int _{{\mathbb {R}}} dx \, x^{2k} \exp \left[ -\frac{x^2}{\lambda } \right] = \varGamma \left( k+\frac{1}{2} \right) \lambda ^{k+\frac{1}{2}} \, \end{aligned} \end{aligned}$$

(in the case of excursions, a factor 1/2 must be added to take into account the halved integration domain).
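This Gaussian moment identity is easy to check numerically; a minimal sketch (the value of \(\lambda \) is an arbitrary choice of ours):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

lam = 0.7  # arbitrary positive value of lambda
for k in range(5):
    val, _ = quad(lambda x, k=k: x**(2 * k) * np.exp(-x**2 / lam),
                  -np.inf, np.inf)
    # compare with Gamma(k + 1/2) * lambda^(k + 1/2)
    assert abs(val - gamma(k + 0.5) * lam**(k + 0.5)) < 1e-8
```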

Using these manipulations, the first two moments for both bridges and excursions can be analytically computed. We detail the computations in Appendix A. The results are:

$$\begin{aligned} \begin{aligned} \langle s[{\varvec{\sigma }}] \rangle _\mathrm{B}&= -\frac{\gamma _E+2}{2} \\ \langle s[{\varvec{\sigma }}] \rangle _\mathrm{E}&= -\frac{\gamma _E}{2} \, , \end{aligned} \end{aligned}$$

where \(\gamma _E\) is the Euler–Mascheroni constant, and

$$\begin{aligned} \begin{aligned} \left\langle (s[{\varvec{\sigma }}])^2 \right\rangle _\mathrm{B}&= \frac{4}{3}+\gamma _E +\frac{\gamma _E ^2}{4}-\frac{\pi ^2}{72}\\ \left\langle (s[{\varvec{\sigma }}])^2 \right\rangle _\mathrm{E}&= \frac{\gamma _E ^2}{4}+\frac{5 \pi ^2}{24}-2 \, . \end{aligned} \end{aligned}$$
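As a sanity check on the bridge value: for \(k=1\) only the one-time marginal of the bridge matters, which (from the bridge density above) is a centered Gaussian of variance \(2t(1-t)\) at time t, and \({\mathbb {E}}[\log |X|] = \log \sigma - (\gamma _E + \log 2)/2\) for \(X \sim {\mathcal {N}}(0,\sigma ^2)\) (a standard identity, not taken from the text). A numerical sketch:

```python
import numpy as np
from scipy.integrate import quad

gamma_E = np.euler_gamma

def mean_log_abs(t):
    # Brownian-bridge marginal at time t: centered Gaussian, variance 2t(1-t);
    # E[log|X|] = (1/2) log(sigma^2) - (gamma_E + log 2)/2 for X ~ N(0, sigma^2)
    sigma2 = 2.0 * t * (1.0 - t)
    return 0.5 * np.log(sigma2) - 0.5 * (gamma_E + np.log(2.0))

value, _ = quad(mean_log_abs, 0.0, 1.0)
assert abs(value + (gamma_E + 2.0) / 2.0) < 1e-6  # matches -(gamma_E + 2)/2
```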

The approach presented in this Section is simple in spirit, and allows us to connect our problem to the vast literature on Wiener processes. Moreover, it is suitable for performing Monte Carlo numerical integration to retrieve the moments of \(s({\varvec{\sigma }})\).

In this Section we worked directly in the continuum limit. In the next Section, we provide a combinatorial approach that allows us to recover the values of the first two moments in a discrete setting, and to compute finite-size corrections in the limit \(N\rightarrow \infty \).

Combinatorial Properties of the Integer Moments of \(S_N\) at Finite N

In this section, we introduce a combinatorial method to compute the moments of \(S_N({\varvec{\sigma }})\) in the limit \(N\rightarrow \infty \). This new approach allows us to retain information on the finite-size corrections.

The underlying idea is to reproduce Eq. (24) in the discrete setting for the variable \(S_N({\varvec{\sigma }})\), and to study its large-N behaviour using methods from analytic combinatorics.

We start again from

$$\begin{aligned} \begin{aligned} S_N({\varvec{\sigma }}) = \sum _{\begin{array}{c} i=1\dots 2N\\ \sigma _i=-1 \end{array}} \log ({\bar{h}}_i({\varvec{\sigma }})) \, \end{aligned} \end{aligned}$$

In the following, the superscript/subscript \(\mathrm T=E,B\) will stand for Dyck paths (excursions, E) and Dyck bridges (B) respectively, \(T_N = C_N,B_N\) and \({\mathcal {T}}_N = {\mathcal {C}}_N, {\mathcal {B}}_N\); we will maintain the notation unified whenever possible.

The k-th integer moment equals

$$\begin{aligned} \begin{aligned} \langle \left( S_N({\varvec{\sigma }}) \right) ^k \rangle _\mathrm{T}&:= M_{N,k}^\mathrm{(T)} = \frac{1}{T_N} \sum _{{\varvec{\sigma }}\in {\mathcal {T}}_N} \left[ S_N({\varvec{\sigma }}) \right] ^k = \frac{k!}{T_N} \sum _{{\varvec{\sigma }}\in {\mathcal {T}}_N} \sum _{\begin{array}{c} 1 \le t_1 , \dots , t_k \le 2N \\ \sigma _{t_1} = \dots = \sigma _{t_k} = -1 \end{array} } \prod _{a=1}^k \log ({\bar{h}}_{t_a}({\varvec{\sigma }})) \\&= k! \sum _{c=1}^k \sum _{ \begin{array}{c} 1 \le t_1< t_2< \cdots < t_c \le 2N \\ \nu _1,\ldots ,\nu _c \ge 1 \\ \nu _1 + \cdots + \nu _c = k \\ {\bar{h}}_1, \ldots , {\bar{h}}_c >0 \end{array}} \prod _{a=1}^c \left( \frac{ \left( \log {\bar{h}}_a \right) ^{\nu _a} }{\nu _a !} \right) \frac{{\mathcal {M}}_N^\mathrm{(T)}(t_1,\ldots ,t_c;{\bar{h}}_1,\ldots ,{\bar{h}}_c)}{T_N} \, \end{aligned}\nonumber \\ \end{aligned}$$

where \({\mathcal {M}}^\mathrm{(T)}_N(t_1, \dots , t_c;{\bar{h}}_1, \dots , {\bar{h}}_c)\) is the number of paths of type \(\mathrm T\) that have closing steps at horizontal positions \(t_1, \dots , t_c\), and at heights \(h_1 = \pm ({\bar{h}}_1 - 1/2), \dots , h_c = \pm ({\bar{h}}_c - 1/2)\).

The last equation reproduces, as anticipated earlier, Eq. (24) in the discrete setting. Notice that here we must take into account the multiplicities \(\nu _a\), while in the continuous setting we could just set \(c=k\) and \(\nu _a=1\) for all a, as the contribution from the other terms is washed out in the continuum limit. This suggests that, in this more precise approach, we will verify explicitly that the leading contribution in the large-N limit comes from the \(c=k\) term of Eq. (35).

In order to study Eq. (35), we take the following route. As this equation depends on N only implicitly through the summation range, and explicitly through a normalization, we would like to introduce a generating function

$$\begin{aligned} M_{k}^\mathrm{(T)}(z) = \sum _{N \ge 1} z^N T_N M_{N,k}^\mathrm{(T)} \, \end{aligned}$$

that will decouple the summation range over the variables \(t_{i+1}-t_i\). By singularity analysis [14], the asymptotic expansion for \(N \rightarrow \infty \) of \(M^\mathrm{(T)}_{N,k}\) will then be retrieved from the singular expansion of \(M_k^\mathrm{(T)}(z)\) around its dominant singularity.

We start by giving an explicit form for \({\mathcal {M}}^\mathrm{(T)}_N\) for Dyck paths and Dyck bridges.

Proposition 3

In the case of Dyck bridges, we have

$$\begin{aligned} \begin{aligned}&{\mathcal {M}}^\mathrm{(B)}_N (t_1,\ldots ,t_c;{\bar{h}}_1,\ldots ,{\bar{h}}_c) \\&\quad = 2 B_{t_1-1,{\bar{h}}_1} \left( B_{t_2 - t_1 -1, {\bar{h}}_2 - ({\bar{h}}_1 -1)} + B_{t_2 - t_1 -1, {\bar{h}}_2 + ({\bar{h}}_1 -1)} \right) \ldots \\&\qquad \cdots \left( B_{t_c - t_{c-1} -1, {\bar{h}}_c - ({\bar{h}}_{c-1} -1)} + B_{t_c - t_{c-1} -1, {\bar{h}}_c + ({\bar{h}}_{c-1} -1)} \right) B_{2N - t_c, {\bar{h}}_c -1} \, , \end{aligned} \end{aligned}$$


$$\begin{aligned} \begin{aligned} B_{a,b} = {\left\{ \begin{array}{ll} \left( {\begin{array}{c}a\\ \frac{a+b}{2}\end{array}}\right) &{} \text {if } a \ge |b| \text { and } a+b \text { is even} \\ 0 &{} \text {otherwise} \end{array}\right. } \end{aligned} \end{aligned}$$

is the number of unconstrained paths that start at \((x,y)\) and end at \((x+a,y+b)\).

In the case of Dyck paths, we have

$$\begin{aligned} \begin{aligned}&{\mathcal {M}}^\mathrm{(E)}_N (t_1,\ldots ,t_c;{\bar{h}}_1,\ldots ,{\bar{h}}_c) \\&\quad = C_{t_1-1,{\bar{h}}_1,0} \, C_{t_2-t_1-1,{\bar{h}}_2 - ({\bar{h}}_1-1), {\bar{h}}_1-1} \cdots \\&\qquad \cdots C_{t_c-t_{c-1}-1, {\bar{h}}_c - ({\bar{h}}_{c-1} -1), {\bar{h}}_{c-1}-1} C_{2N-t_c, -({\bar{h}}_c-1), {\bar{h}}_c-1} \end{aligned} \end{aligned}$$


$$\begin{aligned} \begin{aligned} C_{a,b,d} = \left( B_{a,b} - B_{a,b+2(d+1)} \right) \theta (b+d) \, \qquad a,d \in {\mathbb {Z}}^{\ge 0} \, , \ b \in {\mathbb {Z}} \, , \end{aligned} \end{aligned}$$

is the number of paths that start at \((x,y)\), end at \((x+a,y+b)\) and never fall below height \(y-d\), where \(\theta (x)=1\) for \(x\ge 0\) and zero otherwise.

Notice that, while in general the \(\theta \) factors are necessary for the definition of \(C_{a,b,d}\) in terms of \(B_{a,b}\), in our specific case they are all automatically satisfied, as \({\bar{h}}_a \ge 1\) for all \(1\le a \le c\).


Let us start by considering Dyck bridges. The idea is to decompose a path contributing to the count of \({\mathcal {M}}^\mathrm{(B)}_N (t_1,\ldots ,t_c;{\bar{h}}_1,\ldots ,{\bar{h}}_c)\) around its closing steps:

  • the first closing step starts at coordinate \(\left( t_1-1,\pm {\bar{h}}_1 \right) \). There are \(B_{t_1-1,{\bar{h}}_1} + B_{t_1-1,-{\bar{h}}_1} = 2B_{t_1-1,{\bar{h}}_1}\) different portions of path joining the origin to the starting point of the first closing step.

  • the a-th closing step happens \(t_a - t_{a-1}-1\) steps after the \((a-1)\)-th one, and, based on the relative sign of the heights of the two closing steps, their difference in height equals \({\bar{h}}_a - ({\bar{h}}_{a-1}-1)\) or \({\bar{h}}_a + ({\bar{h}}_{a-1}-1)\). Thus, there are \(B_{t_a - t_{a-1} -1, {\bar{h}}_a - ({\bar{h}}_{a-1}-1)} + B_{t_a - t_{a-1} -1, {\bar{h}}_a + ({\bar{h}}_{a-1}-1)}\) different portions of path connecting the two closing steps.

  • the last closing step happens \(2N-t_c\) steps before the end of the path and at height \(h_c = \pm \left( {\bar{h}}_c - \frac{1}{2} \right) \). Thus, there are \(B_{2N-t_c,{\bar{h}}_c-1}\) portions of path concluding the original path.

The product of the contribution of each subpath recovers Eq. (37).

The case of Dyck paths can be treated analogously, with a few crucial differences. In fact, each of the portions of path between the i-th and \((i+1)\)-th closing steps (which, for excursions, are just down-steps) has now the constraint that it must never fall below the horizontal axis, i.e. must never drop more than \(\left( {\bar{h}}_i -\frac{1}{2} \right) \) below its starting height. Let us count these paths. A useful trick to this end is the discrete version of the reflection method, which we already used in Sect. 3.1. Call a the total number of steps, b the relative height of the final step with respect to the starting step, and d the maximum fall allowed with respect to the starting step. Moreover, call bad paths all paths that do not respect the last constraint. A bad path is characterized by reaching relative height \(-d-1\) at some point (say, for the first time after s steps). By reflecting the portion of path composed of the first s steps, we obtain a bijection between bad paths and unconstrained paths that start at relative height \(-2(d+1)\) and reach relative height b after a steps. Thus, the total number of good paths \(C_{a,b,d}\) is given by subtraction as

$$\begin{aligned} \begin{aligned} C_{a,b,d} = B_{a,b} - B_{a,b+2(d+1)} \, . \end{aligned} \end{aligned}$$

This line of thought holds for all values of \(a,d > 0\) and \(b \ge -d\); if \(b < -d\) we just have \(C_{a,b,d} = 0\). Moreover, by properties of \(B_{a,b}\), \(C_{a,b,d}=0\) if \(a+b\) is not an even number.

Equation (39) can be easily established by decomposing a generic (marked) path around its closing steps, and by applying our result above. \(\square \)
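Both counting formulas can be verified by exhaustive enumeration of short lattice paths; a minimal sketch (the function names are ours):

```python
from math import comb
from itertools import product

def B(a, b):
    # unconstrained +/-1-step paths of length a with total displacement b
    if a >= 0 and abs(b) <= a and (a + b) % 2 == 0:
        return comb(a, (a + b) // 2)
    return 0

def C(a, b, d):
    # reflection principle: paths that never fall below relative height -d
    return B(a, b) - B(a, b + 2 * (d + 1)) if b + d >= 0 else 0

def brute(a, b, d):
    # direct enumeration over all 2^a step sequences
    count = 0
    for steps in product((1, -1), repeat=a):
        heights = [sum(steps[:i + 1]) for i in range(a)]
        if sum(steps) == b and (not heights or min(heights) >= -d):
            count += 1
    return count

for a in range(8):
    for b in range(-a, a + 1):
        assert B(a, b) == brute(a, b, a)   # d = a is no constraint at all
        for d in range(3):
            assert C(a, b, d) == brute(a, b, d)
```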

The fact that we now want to exploit is that, while the binomial factor \(B_{a,b}\) (and its constrained variant \(C_{a,b,d}\)) is not easy to handle exactly, its generating function in a has a simple expression, induced by an analogously simple decomposition. We collect these expressions in the following:

Proposition 4

$$\begin{aligned} B_b(z)&:= \sum _a z^{\frac{a}{2}} B_{a,b} = B(z) (\sqrt{z} C(z))^{|b|} \, , \end{aligned}$$
$$\begin{aligned} C_{b,d}(z)&:= \sum _a z^{\frac{a}{2}} C_{a,b,d} = B(z) \left[ (\sqrt{z} C(z))^{|b|} - (\sqrt{z} C(z))^{|b+2(d+1)|} \right] \theta (b+d) \, , \end{aligned}$$
$$\begin{aligned} C_b(z)&:= \sum _a z^{\frac{a}{2}} C_{a,b,0} = B(z) \left( 1-zC(z)^2 \right) (\sqrt{z} C(z))^b \theta (b) \, , \end{aligned}$$

where, as in (15),

$$\begin{aligned} B(z)&=\sum _{k \ge 0} z^{k} B_k = \frac{1}{\sqrt{1-4z}} \, ;&C(z) =\sum _{k \ge 0} z^{k} C_k = \frac{1-\sqrt{1-4z}}{2z} \, . \end{aligned}$$


To obtain Eq. (42), observe that a path going from (0, 0) to (a, b) with non-negative b can be uniquely decomposed as \(w = w_{0} u_{1} w_{1} u_{2} w_{2} \dots u_{b} w_{b}\) where \(u_{i}\) is the right-most up-step of w at height \(i-1/2\), and \(w_{i}\) is a (possibly empty) Dyck path, for all \(i=1,\dots ,b\), while \(w_{0}\) is a (possibly empty) Dyck bridge. Thus,

$$\begin{aligned} \begin{aligned} B_{a,b} = \sum _{\begin{array}{c} \ell _0 , \dots , \ell _b \ge 0 \\ 2 \sum _{i=0}^b \ell _i + b = a \end{array}} B_{\ell _0} C_{\ell _1} \cdots C_{\ell _b} \, . \end{aligned} \end{aligned}$$

For negative b, the same reasoning holds with \(u_i\)’s replaced by down-steps, hence the absolute value on b in the result. Equation (42) then follows easily.

Equation (43) follows from \(C_{a,b,d} = \left( B_{a,b} - B_{a,b+2(d+1)} \right) \theta (b+d)\).

Equation (44) can be derived either as a special case of Eq. (43), or as a variation of (42) where \(w_0\) must be a Dyck path. The equivalence of these two decompositions is guaranteed by the fact that \(C(z) = \left( 1-z C(z)^2 \right) B(z)\). \(\square \)
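The closed forms for B(z) and C(z) can be sanity-checked by series expansion: their coefficients are the central binomial and Catalan numbers counting Dyck bridges and Dyck paths. A sketch using sympy:

```python
import sympy as sp
from math import comb

z = sp.symbols('z')
Bz = 1 / sp.sqrt(1 - 4 * z)                # bridges
Cz = (1 - sp.sqrt(1 - 4 * z)) / (2 * z)    # excursions

sB = sp.series(Bz, z, 0, 10).removeO()
sC = sp.series(Cz, z, 0, 10).removeO()
for k in range(8):
    assert sB.coeff(z, k) == comb(2 * k, k)             # B_k = binom(2k, k)
    assert sC.coeff(z, k) == comb(2 * k, k) // (k + 1)  # C_k = Catalan number
```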

Let us introduce the symbol \(x=x(z)\) for the recurring quantity

$$\begin{aligned} x(z) = z C(z)^2 = C(z)-1 \end{aligned}$$

which, if used to parametrise the other relevant quantities, gives

$$\begin{aligned} z(x)&= \frac{x}{(1+x)^2} \, ;&B(z(x))&= \frac{1+x}{1-x} \, . \end{aligned}$$
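These relations are straightforward to confirm numerically at any point \(0< x <1\); a minimal sketch:

```python
import numpy as np

def C(z):  # generating function of the Catalan numbers
    return (1 - np.sqrt(1 - 4 * z)) / (2 * z)

def B(z):  # generating function of the central binomial coefficients
    return 1 / np.sqrt(1 - 4 * z)

x0 = 0.3
z0 = x0 / (1 + x0)**2                             # z(x) = x / (1+x)^2
assert abs(z0 * C(z0)**2 - x0) < 1e-12            # x = z C(z)^2
assert abs((C(z0) - 1) - x0) < 1e-12              # x = C(z) - 1
assert abs(B(z0) - (1 + x0) / (1 - x0)) < 1e-12   # B = (1+x)/(1-x)
```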

Then, Eq. (36) reads

$$\begin{aligned} \begin{aligned} M_{k}^\mathrm{(T)}(z)&= k! \sum _{c=1}^k \sum _{ \begin{array}{c} \nu _1,\ldots ,\nu _c \ge 1\\ {\bar{h}}_1, \ldots , {\bar{h}}_c >0\\ \sum _a \nu _a=k \end{array}} \prod _{a=1}^c \left( \frac{ \left( \log {\bar{h}}_a \right) ^{\nu _a} }{\nu _a !} \right) {\mathcal {M}}^\mathrm{(T)}(z;{\bar{h}}_1,\ldots ,{\bar{h}}_c) \, , \end{aligned} \end{aligned}$$


$$\begin{aligned} \begin{aligned} {\mathcal {M}}^\mathrm{(T)}(z;{\bar{h}}_1, \ldots , {\bar{h}}_c) = \sum _{N\ge 0}z^N \sum _{1 \le t_1< t_2< \ldots < t_c \le 2N} {\mathcal {M}}^\mathrm{(T)}_N(t_1,\ldots ,t_c; {\bar{h}}_1, \ldots , {\bar{h}}_c) \, . \end{aligned} \end{aligned}$$

Proposition 5

Using x to denote x(z), we have that for bridges

$$\begin{aligned} \quad {\mathcal {M}}^\mathrm{(B)}(z;{\bar{h}}_1,\ldots ,{\bar{h}}_c) = 2 z^{\frac{c}{2}} B(z)^{c+1} \sqrt{x}^{\,{\bar{h}}_1}&\big ( \sqrt{x}^{\,|{\bar{h}}_2-{\bar{h}}_1+1|} + \sqrt{x}^{\,{\bar{h}}_2+{\bar{h}}_1-1} \big ) \nonumber \\&\quad \cdots \big ( \sqrt{x}^{\,|{\bar{h}}_c-{\bar{h}}_{c-1}+1|} + \sqrt{x}^{\,{\bar{h}}_c+{\bar{h}}_{c-1}-1} \big ) \sqrt{x}^{\,{\bar{h}}_c-1} \, , \quad \end{aligned}$$

and for excursions

$$\begin{aligned} \quad {\mathcal {M}}^\mathrm{(E)}(z;{\bar{h}}_1,\ldots ,{\bar{h}}_c) = z^{\frac{c}{2}} B(z)^{c+1}&(1-x) \sqrt{x}^{\,{\bar{h}}_1} \big ( \sqrt{x}^{\,|{\bar{h}}_2-{\bar{h}}_1+1|} - \sqrt{x}^{\,{\bar{h}}_2+{\bar{h}}_1+1} \big ) \nonumber \\&\cdots \big ( \sqrt{x}^{\,|{\bar{h}}_c-{\bar{h}}_{c-1}+1|} - \sqrt{x}^{\,{\bar{h}}_c+{\bar{h}}_{c-1}+1} \big ) (1-x) \sqrt{x}^{\,{\bar{h}}_c-1} \, . \quad \end{aligned}$$


First of all, we notice that

$$\begin{aligned} \begin{aligned}&{\mathcal {M}}^\mathrm{(T)}_N(t_1, \ldots , t_c; {\bar{h}}_1, \ldots , {\bar{h}}_c) \\&\quad = f_1(t_1-1;{\bar{h}}_1)f_2(t_2-t_1-1;{\bar{h}}_2,{\bar{h}}_1)\ldots f_c(t_c-t_{c-1}-1;{\bar{h}}_c,{\bar{h}}_{c-1}) f_{c+1}(2N-t_c;{\bar{h}}_c) \, \end{aligned}\nonumber \\ \end{aligned}$$

for some functions \(f_i\) that depend on the type of paths \(\mathrm T\) that we are studying. Thus, by performing the change of summation variables \(\{t_1,\ldots ,t_c,N\} \rightarrow \{\alpha _1, \ldots , \alpha _{c+1}\}\) such that

$$\begin{aligned} \begin{aligned}&\alpha _1 = t_1-1 \, , \\&\alpha _i = t_i - t_{i-1} -1 \, , \qquad 2 \le i \le c \, ,\\&\alpha _{c+1} = 2N-t_c \, , \, \end{aligned} \end{aligned}$$

we have that

$$\begin{aligned} \begin{aligned}&{\mathcal {M}}^\mathrm{(T)}(z;{\bar{h}}_1, \ldots , {\bar{h}}_c) \\&\quad = \sum _{N\ge 0}z^N \sum _{1 \le t_1< t_2< \cdots < t_c \le 2N} {\mathcal {M}}^\mathrm{(T)}_N(t_1,\ldots ,t_c; {\bar{h}}_1, \ldots , {\bar{h}}_c) \\&\quad = z^{c/2}\sum _{\alpha _1, \ldots , \alpha _{c+1} \ge 0} f_1(\alpha _1;{\bar{h}}_1)z^{\alpha _1/2} \ldots f_{c}(\alpha _{c};{\bar{h}}_c,{\bar{h}}_{c-1})z^{\alpha _c/2} f_{c+1}(\alpha _{c+1};{\bar{h}}_c)z^{\alpha _{c+1}/2} \, , \end{aligned} \end{aligned}$$

so that all summations are now untangled. Equations (51) and (52) can now be recovered by using the explicit form of the functions \(f_i\) for Dyck paths and Dyck bridges given in Proposition 3, and the analytical form for the generating functions given in Proposition 4. Again, notice that the \(\theta \) functions are all automatically satisfied as \({\bar{h}}_i \ge 1\) for all \(1 \le i \le c\). \(\square \)

At this point, we have obtained a quite explicit expression for \(M^\mathrm{(T)}_k(z)\). In the following sections, we will study the behaviour near the leading singularities of the quantities above, for the first two moments, i.e. \(k=1,2\). Higher-order moments require a more involved computational machinery that will be presented elsewhere.

Singularity Analysis for \(k=1\)

We start our analysis from the simplest case, \(k=1\), to illustrate how singularity analysis is applied in this context. We expect to recover Eq. (32). From now on, for simplicity, as the \({\bar{h}}\) indices are dummy summation indices, we will call them simply h. We have

$$\begin{aligned} \begin{aligned} M_1^\mathrm{(B)}(z)&= 2 \sqrt{z} B(z)^2 \sum _{h_1 \ge 1} \left( \log h_1 \right) \sqrt{x}^{\,2h_1-1} = \frac{2(1+x)}{(1-x)^2} \sum _{h \ge 1} \log h \; x^h = \frac{2(1+x)}{(1-x)^2} {{\,\mathrm{Li}\,}}_{0,1}(x) \, , \end{aligned}\nonumber \\ \end{aligned}$$


$$\begin{aligned} \begin{aligned} M_1^\mathrm{(E)}(z)&= \sqrt{z} B(z)^2 (1-x)^2 \sum _{h_1 \ge 1} \left( \log h_1 \right) \sqrt{x}^{\,2h_1-1} = (1+x) \sum _{h \ge 1} \log h \; x^h = (1+x) {{\,\mathrm{Li}\,}}_{0,1}(x) \, , \end{aligned}\nonumber \\ \end{aligned}$$

where \({{\,\mathrm{Li}\,}}_{s,r}(x) = \sum _{h\ge 1} h^{-s} \left( \log (h) \right) ^r x^h\) is the generalized polylogarithm function [13].

In both cases, the dominant singularity is at \(x(z)=1\), i.e. at \(z=1/4\). We have

$$\begin{aligned} \begin{aligned} x\left( z \right) = 1 - 2\sqrt{1-4z} + 2(1-4z) + {\mathcal {O}}\left( (1-4z)^{\frac{3}{2}} \right) \quad \text {for } z \rightarrow \left( \genfrac{}{}{0.25pt}1{1}{4} \right) ^- \, , \end{aligned} \end{aligned}$$


$$\begin{aligned} \begin{aligned} {{\,\mathrm{Li}\,}}_{0,1}(x) = \frac{{{\,\mathrm{L}\,}}(x) - \gamma _E}{1-x} + {\mathcal {O}}\left( {{\,\mathrm{L}\,}}(x)\right) \quad \text {for } x \rightarrow 1^- \, \end{aligned} \end{aligned}$$

where \({{\,\mathrm{L}\,}}(x) = \log \left( (1-x)^{-1} \right) \). Here and in the following, the rewriting of \({{\,\mathrm{Li}\,}}_{\alpha ,r}(x)\) (for \(-\alpha , r \in {\mathbb {N}}\)) in the form \(P({{\,\mathrm{L}\,}}(x))/(1-x)^{1-\alpha }\), with P(y) a polynomial of degree r, can be done either by matching the asymptotics of the coefficients in the two expressions (and appealing to the Transfer Theorem), or by using the explicit formulas in [14, Thm. VI.7]. In this paper we mostly adopt the first strategy. Passing to the variable z gives

$$\begin{aligned} \begin{aligned} {{\,\mathrm{Li}\,}}_{0,1}\left( x(z) \right) = \frac{{{\,\mathrm{L}\,}}(4z) - 2 \gamma _E - 2\log 2}{4 \sqrt{1-4z}} + {\mathcal {O}}\left( {{\,\mathrm{L}\,}}(4z) \right) \quad \text {for } z \rightarrow \left( \genfrac{}{}{0.25pt}1{1}{4}\right) ^- \, . \end{aligned} \end{aligned}$$
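The leading singular behaviour of \({{\,\mathrm{Li}\,}}_{0,1}\) can be probed by brute-force summation of the defining series close to \(x=1\); a minimal numerical sketch:

```python
import numpy as np

gamma_E = np.euler_gamma

def Li01(x, terms=100_000):
    # truncated generalized polylogarithm Li_{0,1}(x) = sum_h log(h) x^h
    h = np.arange(1, terms + 1, dtype=np.float64)
    return float(np.sum(np.log(h) * x**h))

x = 0.999
leading = (np.log(1.0 / (1.0 - x)) - gamma_E) / (1.0 - x)
# the subleading correction is O(log(1/(1-x))), so the ratio is close to 1
assert abs(Li01(x) / leading - 1.0) < 1e-2
```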

Thus, the singular expansion of \(M^\mathrm{(T)}_1(z)\) is given by

$$\begin{aligned} \begin{aligned} M_1^\mathrm{(B)}(z) = \frac{{{\,\mathrm{L}\,}}(4z) - 2 \gamma _E - 2\log 2}{4 (1-4z)^{\frac{3}{2}}} + {\mathcal {O}}\left( \frac{{{\,\mathrm{L}\,}}(4z)}{1-4z} \right) \quad \text {for } z \rightarrow \left( \genfrac{}{}{0.25pt}1{1}{4}\right) ^- \, \end{aligned} \end{aligned}$$


$$\begin{aligned} \begin{aligned} M_1^\mathrm{(E)}(z) = \frac{{{\,\mathrm{L}\,}}(4z) - 2 \gamma _E - 2\log 2}{2 \sqrt{1-4z}} + {\mathcal {O}}\left( {{\,\mathrm{L}\,}}(4z) \right) \quad \text {for } z \rightarrow \left( \genfrac{}{}{0.25pt}1{1}{4}\right) ^- \, . \end{aligned} \end{aligned}$$

The behaviour of \(T_N M^\mathrm{(T)}_{N,1} = [z^N] M_1 ^\mathrm{(T)} (z)\) for large N can now be estimated by using the so-called transfer theorem, which allows one to move back and forth between the singular expansion of a generating function and the large-order asymptotic expansion of its coefficients (see [14], in particular Chapter VI for general information, and the table in Fig. VI.5 for the explicit formulas). The pertinent result is also reported here in Appendix B, namely in Eq. (102). In practice, we can expand the approximate generating functions given in Eqs. (61) and (62) to get an asymptotic approximation for \(T_N M^\mathrm{(T)}_{N,1}\). Finally, using the asymptotic expansion for \(T_N\), i.e.

$$\begin{aligned} B_N&= \frac{4^N}{\sqrt{\pi }N^{\frac{1}{2}}} \left( 1+{\mathcal {O}}(N^{-1}) \right) \, ,&C_N&= \frac{4^N}{\sqrt{\pi }N^{\frac{3}{2}}} \left( 1+{\mathcal {O}}(N^{-1}) \right) \, \end{aligned}$$

for \(N\rightarrow \infty \), we obtain an asymptotic expansion for the first moment of \(S_N\) (which agrees with what we already found in Eq. (32))

$$\begin{aligned} \begin{aligned} M^\mathrm{(B)}_{1,N}&= B_N ^{-1} [z^{N}] M_1 ^\mathrm{(B)}(z) = \frac{1}{2} N \log N - \frac{\gamma _E+2}{2} N + {\mathcal {O}}\left( \sqrt{N} \log (N)\right) \\ M^\mathrm{(E)}_{1,N}&= C_N ^{-1} [z^{N}] M_1 ^\mathrm{(E)}(z) = \frac{1}{2} N \log N - \frac{\gamma _E}{2} N + {\mathcal {O}}\left( \sqrt{N} \log (N)\right) \, . \end{aligned} \end{aligned}$$

Notice that, although we have truncated our perturbative series at the first significant order, in principle the combinatorial method gives us access to finite-size corrections at arbitrary finite order.
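The leading asymptotics of \(B_N\) and \(C_N\) quoted above are easy to verify at moderate N, working with logarithms to avoid overflow; a minimal sketch:

```python
from math import lgamma, log, pi

def log_binom(n, k):
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

N = 2000
logB = log_binom(2 * N, N)          # B_N = binom(2N, N)
logC = logB - log(N + 1)            # C_N = binom(2N, N) / (N + 1)
logB_asym = N * log(4) - 0.5 * log(pi * N)
logC_asym = N * log(4) - 0.5 * log(pi) - 1.5 * log(N)
# agreement up to corrections of relative order O(1/N)
assert abs(logB - logB_asym) < 1e-2
assert abs(logC - logC_asym) < 1e-2
```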

Singularity Analysis for \(k=2\)

For \(k=2\), we compute Eq. (49) by studying separately terms at different values of c. Let us start from bridges. For \(c=1\), and thus \(\nu _1=2\), we have

$$\begin{aligned} \begin{aligned} M^\mathrm{(B)}_2(z) \vert _{c=1} =4 \sqrt{z} B(z)^2 \sum _{h_1 \ge 1} \frac{\left( \log h_1 \right) ^2}{2} \sqrt{x}^{\,2h_1-1} = \frac{2(1+x)}{(1-x)^2} {{\,\mathrm{Li}\,}}_{0,2}(x) \, \end{aligned} \end{aligned}$$

while for \(c=2\), and thus \(\nu _1=\nu _2=1\), we have

$$\begin{aligned} \begin{aligned} M^\mathrm{(B)}_2(z) \vert _{c=2} =4 z B(z)^3 \sum _{h_1,h_2 \ge 1} \log h_1 \log h_2 \sqrt{x}^{\,h_1} \left( \sqrt{x}^{\,|h_2-h_1+1|} + \sqrt{x}^{\,h_2+h_1-1} \right) \sqrt{x}^{\,h_2-1} \, . \end{aligned}\nonumber \\ \end{aligned}$$

The presence of the absolute value \(|h_2-h_1+1|\) forces us to consider separately the case \(h_1 > h_2\) and \(h_1 \le h_2\). In the first case we get

$$\begin{aligned} \begin{aligned}&4 z B(z)^3 \sum _{h_1> h_2 \ge 1} \left( \log h_1 \log h_2 \right) \sqrt{x}^{\,h_1} (\sqrt{x}^{\,h_1-h_2-1}+\sqrt{x}^{\,h_1+h_2-1}) \sqrt{x}^{\,h_2-1} \\&= 4 z B(z)^3 \bigg ( \sum _{h_1 \ge 1} \left( \log (h_1+1) \log (h_1!) \right) x^{h_1} + \sum _{h_1 > h_2 \ge 1} \left( \log h_1 \log h_2 \right) x^{h_1+h_2-1} \bigg ) \, , \end{aligned} \end{aligned}$$

while in the second case we obtain

$$\begin{aligned} \begin{aligned}&4 z B(z)^3 \sum _{1 \le h_1 \le h_2} \left( \log h_1 \log h_2 \right) \sqrt{x}^{\,h_1} (\sqrt{x}^{\,h_2-h_1+1}+\sqrt{x}^{\,h_1+h_2-1}) \sqrt{x}^{\,h_2-1} \\&= 4 z B(z)^3 \bigg ( \sum _{h_2 \ge 1} \left( \log h_2 \log h_2! \right) x^{h_2} + \sum _{1 \le h_1 \le h_2} \left( \log h_1 \log h_2 \right) x^{h_1+h_2-1} \bigg ) \, . \end{aligned} \end{aligned}$$

The combination of these two terms gives

$$\begin{aligned} \begin{aligned} M_2^\mathrm{(B)}(z)|_{c=2}&= \frac{4 x (1+x)}{(1-x)^3} \bigg ( \sum _{h \ge 1} \left( \log (h^2+h) \log h! \right) x^{h} + \frac{1}{x} ({{\,\mathrm{Li}\,}}_{0,1}(x) )^2 \bigg ) \, . \end{aligned} \end{aligned}$$

In the case of excursions, the computations are completely analogous, and give

$$\begin{aligned}&M^\mathrm{(E)}_2(z) \vert _{c=1} =2 \sqrt{z} B(z)^2 (1-x)^2 \sum _{h_1 \ge 1} \frac{\left( \log h_1 \right) ^2}{2} \sqrt{x}^{\,2h_1-1} = (1+x) {{\,\mathrm{Li}\,}}_{0,2}(x) \, \end{aligned}$$
$$\begin{aligned}&M_2^\mathrm{(E)}(z)|_{c=2} = \frac{2x(1+x)}{1-x} \bigg ( \sum _{h \ge 1} \left( \log (h^2+h) \log h! \right) x^{h} - ({{\,\mathrm{Li}\,}}_{0,1}(x) )^2 \bigg ) \, . \end{aligned}$$

In order to compute the singular expansion of \({{\,\mathrm{Li}\,}}_{0,2}(x)\) and of \(\sum _{h\ge 1}( \log (h^2+h) \log (h!) x^h)\), one can again use the transfer theorem, obtaining

$$\begin{aligned} \begin{aligned} {{\,\mathrm{Li}\,}}_{0,2}(x) = \frac{{{\,\mathrm{L}\,}}(x)^2 - 2\gamma _E {{\,\mathrm{L}\,}}(x) + \gamma _E^2 + \frac{\pi ^2}{6}}{1-x} + {\mathcal {O}}\left( {{\,\mathrm{L}\,}}(x)^2 \right) \, \end{aligned} \end{aligned}$$


$$\begin{aligned} \begin{aligned}&\sum _{h\ge 1} \log (h^2+h) \log (h!) \, x^h = \\&\quad \frac{2 {{\,\mathrm{L}\,}}(x)^2 +2(1-2\gamma _E) {{\,\mathrm{L}\,}}(x) + \frac{\pi ^2}{3} +2\gamma _E^2-2\gamma _E-2 }{(1-x)^2} + {\mathcal {O}}\left( \frac{{{\,\mathrm{L}\,}}(x)^2}{1-x} \right) \, . \end{aligned} \end{aligned}$$

We see that, for both Dyck bridges and paths, the \(c=1\) term is subleading with respect to the \(c=2\) term, by a factor \(1-x\) which, after use of the Transfer Theorem, implies a factor \({\mathcal {O}}(N^{-1})\). It is natural to expect (in agreement with the discussion in Sect. 3.1) that this pattern will also hold for all subsequent moments, that is, the leading term for the k-th moment will be given by the \(c=k\) contribution alone, all other terms altogether giving a correction \({\mathcal {O}}(N^{-1})\).

After substituting x(z) with its expansion around \(z = 1/4\), and after having performed a series of tedious but trivial computations, we obtain

$$\begin{aligned} \begin{aligned} M^\mathrm{(B)}_{2,N}&= \frac{1}{4} N^2 \left( \log N \right) ^2 - \frac{\gamma _E+2}{2} N^2 \log N + \left( \frac{4}{3} + \frac{\gamma _E^2}{4} + \gamma _E - \frac{\pi ^2}{72} \right) N^2 + {\mathcal {O}}\left( N^\frac{3}{2} \left( \log N \right) ^2\right) \, ,\\ M^\mathrm{(E)}_{2,N}&= \frac{1}{4} N^2 \left( \log N \right) ^2 - \frac{\gamma _E}{2} N^2 \log N + \left( \frac{\gamma _E^2}{4}+\frac{5 \pi ^2}{24}-2 \right) N^2 + {\mathcal {O}}\left( N^\frac{3}{2} \left( \log N \right) ^2\right) \, . \end{aligned}\nonumber \\ \end{aligned}$$

Finally, we recover the moments of \(s({\varvec{\sigma }})\)

$$\begin{aligned} \begin{aligned} \langle (s({\varvec{\sigma }}))^2 \rangle _\mathrm{B}&= \frac{4}{3} + \frac{\gamma _E^2}{4} + \gamma _E - \frac{\pi ^2}{72} + {\mathcal {O}}\left( \frac{(\log N)^2}{\sqrt{N}} \right) \, ,\\ \langle (s({\varvec{\sigma }}))^2 \rangle _\mathrm{E}&= \frac{\gamma _E ^2}{4}+\frac{5 \pi ^2}{24} -2 + {\mathcal {O}}\left( \frac{(\log N)^2}{\sqrt{N}} \right) \, . \end{aligned} \end{aligned}$$

Details for these computations can be found in Appendix B.
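From the first two moments, the variances of s in the two ensembles follow at once; interestingly, in both cases the dependence on \(\gamma _E\) cancels (this combination is our own remark, not stated in the text). A small numerical sketch:

```python
import numpy as np

g = np.euler_gamma
mean_B, mean_E = -(g + 2) / 2, -g / 2
m2_B = 4 / 3 + g + g**2 / 4 - np.pi**2 / 72
m2_E = g**2 / 4 + 5 * np.pi**2 / 24 - 2

var_B = m2_B - mean_B**2   # simplifies to 1/3 - pi^2/72   (about 0.1963)
var_E = m2_E - mean_E**2   # simplifies to 5 pi^2/24 - 2   (about 0.0562)
assert abs(var_B - (1 / 3 - np.pi**2 / 72)) < 1e-12
assert abs(var_E - (5 * np.pi**2 / 24 - 2)) < 1e-12
```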

Fig. 3

Top: distribution of s for the Dyck excursions (left) and Dyck bridges (right) ensembles at various sizes N. The shaded fillings are histograms, while the black profiles are kernel density estimates. Bottom: ratio between empirical and predicted values of the first two moments as a function of the size N, for both ensembles

Numerical Results

To check our analytical predictions, we performed an exact sampling of our configurations and collected statistics on the resulting entropies, from which we estimated the distribution of the rescaled entropy

$$\begin{aligned} \begin{aligned} s({\varvec{\sigma }}) = \frac{1}{N}\left( S_N ({\varvec{\sigma }}) - \frac{1}{2} N \log N \right) \, . \end{aligned} \end{aligned}$$

For both bridges and excursions, and for values of \(N=10^3,10^4,10^5,10^6\), we sampled uniformly \(10^5\) random paths.
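Uniform sampling in both ensembles is elementary; a minimal sketch (our own implementation, not necessarily the one used for Fig. 3; the excursion sampler relies on the cycle lemma):

```python
import random

def sample_bridge(N, rng=random):
    # uniform Dyck bridge: a random shuffle of N up-steps and N down-steps
    steps = [1] * N + [-1] * N
    rng.shuffle(steps)
    return steps

def sample_excursion(N, rng=random):
    # cycle lemma: among the 2N+1 cyclic rotations of a word with N+1 up-steps
    # and N down-steps, exactly one has all partial sums strictly positive;
    # dropping its leading up-step leaves a uniformly random Dyck excursion
    steps = [1] * (N + 1) + [-1] * N
    rng.shuffle(steps)
    h, min_h, cut = 0, 0, 0
    for i, s in enumerate(steps):
        h += s
        if h <= min_h:           # track the *last* attainment of the minimum
            min_h, cut = h, i + 1
    rotated = steps[cut:] + steps[:cut]
    return rotated[1:]

random.seed(0)
for N in (1, 2, 5):
    for _ in range(200):
        e = sample_excursion(N)
        h = 0
        for s in e:
            h += s
            assert h >= 0        # never below the horizontal axis
        assert h == 0            # ends at height 0
```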

Figure 3 summarises our results. We clearly see that, as N grows, the prediction for the first two moments of s matches the empirical values better and better in both ensembles. The distribution of s is clearly non-Gaussian in the case of Dyck bridges. For Dyck excursions, a quick Kolmogorov test rules out the Gaussian hypothesis (more simply, the 4-th centered moment is only \(\sim 2.7\) times the squared 2-nd centered moment, instead of the factor 3 required by Gaussianity). However, at the present statistical precision we cannot rule out the hypothesis that the centered distribution is symmetric, i.e. that all the centered odd moments vanish, although we have no theoretical argument for conjecturing this fact, neither from the probabilistic approach of Sect. 3.1 nor from the combinatorial approach of Sect. 3.2. It would be interesting to understand the reasons for this unexpected (to us) apparent symmetry in the distribution for Dyck excursions.

The code and the raw data used to produce Fig. 3 are available at

Availability of data and material

Not applicable.


  1. The ordered matching is the one in which the first white point from the left is matched with the first black point and so on, and is represented by the identity permutation \(\pi _\mathrm{ord}(i) = i\) if the points are sorted by increasing coordinate.


  1. D’Achille, M.P.: Statistical properties of the Euclidean random assignment problem. Theses, Université Paris-Saclay (2020).

  2. Ambrosio, L., Gigli, N.: A User’s Guide to Optimal Transport. In: Ambrosio, L., Bressan, A., Helbing, D., Klar, A., Zuazua, E. (eds.) Modelling and Optimisation of Flows on Networks: Cetraro, Italy 2009, Editors: Benedetto Piccoli, Michel Rascle, Lecture Notes in Mathematics, pp. 1–155. Springer, Berlin, Heidelberg (2013).

  3. Bobkov, S.G., Ledoux, M.: Transport Inequalities on Euclidean spaces with non-Euclidean metrics. J. Fourier Anal. Appl. 26(4), 60 (2020).


  4. Boniolo, E., Caracciolo, S., Sportiello, A.: Correlation function for the Grid-Poisson Euclidean matching on a line and on a circle. J. Stat. Mech. 2014(11), P11023 (2014).


  5. Caracciolo, S., D’Achille, M.P., Erba, V., Sportiello, A.: The Dyck bound in the concave 1-dimensional random assignment model. J. Phys. A 53(6), 064001 (2020).


  6. Caracciolo, S., D’Achille, M.P., Malatesta, E.M., Sicuro, G.: Finite size corrections in the random assignment problem. Phys. Rev. E 95, 052129 (2017).

  7. Caracciolo, S., D’Achille, M.P., Sicuro, G.: Random Euclidean matching problems in one dimension. Phys. Rev. E 96, 042102 (2017).

  8. Caracciolo, S., D’Achille, M.P., Sicuro, G.: Anomalous scaling of the optimal cost in the one-dimensional random assignment problem. J. Stat. Phys. 174(4), 846–864 (2019).

  9. Caracciolo, S., Di Gioacchino, A., Malatesta, E.M., Molinari, L.G.: Selberg integrals in 1D random Euclidean optimization problems. J. Stat. Mech. 2019, 063401 (2019).

  10. Caracciolo, S., Erba, V., Sportiello, A.: The p-Airy distribution. arXiv preprint arXiv:2010.14468 (2020)

  11. Caracciolo, S., Sicuro, G.: On the one-dimensional Euclidean matching problem: exact solutions, correlation functions and universality. Phys. Rev. E 90, 042112 (2014).

  12. Donsker, M.D.: Justification and extension of Doob’s heuristic approach to the Kolmogorov-Smirnov theorems, pp. 277–281. The Institute of Mathematical Statistics (1952).

  13. Fill, J.A., Flajolet, P., Kapur, N.: Singularity analysis, Hadamard products, and tree recurrences. J. Comput. Appl. Math. 174(2), 271–313 (2005).


  14. Flajolet, P., Sedgewick, R.: Analytic Combinatorics. Cambridge University Press, Cambridge; New York (2009)


  15. Juillet, N.: On a solution to the Monge transport problem on the real line arising from the strictly concave case. SIAM J. Math. Anal. 52(5), 4783–4805 (2020).


  16. McCann, R.J.: Exact solutions to the transportation problem on the line. Proc. R. Soc. Lond. A 455(1984), 1341–1380 (1999).


  17. Mézard, M., Parisi, G.: Replicas and optimization. Le Journal de Physique - Lettres 46(17), 771–778 (1985).

  18. Mézard, M., Parisi, G.: Mean-field equations for the matching and the travelling salesman problems. Europhys. Lett. 2(12), 913–918 (1986).


  19. Orland, H.: Mean-field theory for optimization problems. Le Journal de Physique - Lettres 46(17), 763–770 (1985).

  20. Peyré, G., Cuturi, M.: Computational optimal transport: With applications to data science. Found. Trends Mach. Learn. 11(5–6), 355–607 (2019).




Open access funding provided by Università degli Studi di Milano within the CRUI-CARE Agreement. A. Sportiello is partially supported by the Agence Nationale de la Recherche, Grant Numbers ANR-18-CE40-0033 (ANR DIMERS) and ANR-15-CE40-0014 (ANR MetAConc).

Author information




All authors contributed equally to all phases of the research and writing process.

Corresponding author

Correspondence to Vittorio Erba.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Code availability

Numerical code is available at

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by Federico Ricci-Tersenghi.


The First Two Moments of the Rescaled Entropy in the Integral Representation

In this Appendix we provide the computation of the first two moments of the rescaled entropy in the bridge ensemble, using the integral representation. The case of excursions can be treated analogously.

Useful Formulas

We begin by collecting some useful identities.

First, we give an explicit representation for the (possibly non-integer) Gaussian moments:

$$\begin{aligned} \int _{- \infty }^{\infty } d y \, y^{2 k} e^{- \frac{y^2}{\lambda }} = \varGamma \left( k + \frac{1}{2} \right) \lambda ^{k + \frac{1}{2}} . \end{aligned}$$
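These Gaussian moments are straightforward to check numerically. The following sketch (function names are ours) compares a midpoint-rule estimate of the integral with the closed form:

```python
import math

def gaussian_moment(k, lam, n=100000, cutoff=12.0):
    """Midpoint-rule estimate of the integral of y^(2k) * exp(-y^2/lam) over the real line."""
    half_width = cutoff * math.sqrt(lam)   # the integrand is negligible beyond ~12 widths
    h = 2 * half_width / n
    total = 0.0
    for i in range(n):
        y = -half_width + (i + 0.5) * h
        total += y ** (2 * k) * math.exp(-y * y / lam)
    return total * h

def closed_form(k, lam):
    """Right-hand side of the identity: Gamma(k + 1/2) * lam^(k + 1/2)."""
    return math.gamma(k + 0.5) * lam ** (k + 0.5)

max_rel_err = max(
    abs(gaussian_moment(k, lam) / closed_form(k, lam) - 1)
    for k in (0, 1, 2) for lam in (0.5, 2.0)
)
```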

In the following we shall also use the duplication formula for the Gamma function

$$\begin{aligned} \begin{aligned} \varGamma (s) \varGamma \left( s + \frac{1}{2} \right) = 2^{1 - 2 s} \sqrt{\pi } \varGamma (2 s) \, , \end{aligned} \end{aligned}$$

and the expansion for the hyperbolic cosine

$$\begin{aligned} \begin{aligned} \cosh (2 z) = \sum _{s \geqslant 0} \frac{(2 z)^{2 s}}{(2 s) !} = \sqrt{\pi } \sum _{s \geqslant 0} \frac{1}{\varGamma \left( s + \frac{1}{2} \right) } \frac{z^{2 s}}{s!} \, . \end{aligned} \end{aligned}$$
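Both identities can be checked numerically; the sketch below (names ours) compares the duplication formula and the rewritten series against direct evaluation:

```python
import math

# Duplication formula: Gamma(s) * Gamma(s + 1/2) = 2^(1-2s) * sqrt(pi) * Gamma(2s)
dup_err = max(
    abs(math.gamma(s) * math.gamma(s + 0.5)
        / (2 ** (1 - 2 * s) * math.sqrt(math.pi) * math.gamma(2 * s)) - 1)
    for s in (0.3, 1.0, 2.5, 4.0)
)

def cosh_series(z, terms=40):
    """sqrt(pi) * sum_{s>=0} z^(2s) / (Gamma(s+1/2) * s!): the rewritten cosh(2z)."""
    return math.sqrt(math.pi) * sum(
        z ** (2 * s) / (math.gamma(s + 0.5) * math.factorial(s)) for s in range(terms)
    )

cosh_err = max(abs(cosh_series(z) / math.cosh(2 * z) - 1) for z in (0.1, 1.0, 3.0))
```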

Finally, we recall the definition of the hypergeometric function

$$\begin{aligned} \begin{aligned} {}_2 F_1 (a, b ; c ; z) = \frac{\varGamma (c)}{\varGamma (a) \varGamma (b) } \sum _{s \geqslant 0} \frac{\varGamma (s + a) \varGamma (s + b)}{\varGamma (s + c) } \frac{z^s}{s!} \, , \end{aligned} \end{aligned}$$

and the Euler identity

$$\begin{aligned} \begin{aligned} {}_2 F_1 (a, b ; c ; z) = (1 - z)^{c - b - a} {}_2 F_1 (c - a, c - b ; c ; z) \, \end{aligned} \end{aligned}$$

which implies that

$$\begin{aligned} \begin{aligned} \sum _{s \geqslant 0} \frac{\varGamma (s + a) \varGamma (s + b)}{\varGamma (s + c) } \frac{z^s}{s!}&= (1 - z)^{c - b - a} \frac{\varGamma (a) \varGamma (b)}{\varGamma (c - a) \varGamma (c - b)} \sum _{s \geqslant 0} \frac{\varGamma (s + c - a) \varGamma (s + c - b)}{\varGamma (s + c) } \frac{z^s}{s!} \, . \end{aligned}\nonumber \\ \end{aligned}$$
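This consequence of the Euler identity can also be verified numerically; in the sketch below (names ours) the sums are built from term ratios, so that no large Gamma values are ever evaluated:

```python
import math

def gamma_sum(a, b, c, z, terms=400):
    """sum_{s>=0} Gamma(s+a) * Gamma(s+b) / Gamma(s+c) * z^s / s!  for |z| < 1,
    accumulated via term ratios to avoid overflowing math.gamma at large s."""
    term = math.gamma(a) * math.gamma(b) / math.gamma(c)   # s = 0 term
    total = term
    for s in range(terms):
        term *= (s + a) * (s + b) / ((s + c) * (s + 1)) * z
        total += term
    return total

a, b, c, z = 0.25, 0.5, 1.75, 0.4
lhs = gamma_sum(a, b, c, z)
rhs = ((1 - z) ** (c - a - b)
       * math.gamma(a) * math.gamma(b) / (math.gamma(c - a) * math.gamma(c - b))
       * gamma_sum(c - a, c - b, c, z))
euler_err = abs(lhs / rhs - 1)
```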

First Moment

We wish to compute

$$\begin{aligned} \begin{aligned} \langle s [{\varvec{\sigma }}] \rangle _\mathrm{B} = \int _0^1 d t \int _{- \infty }^{\infty } d x \frac{\log x^2}{2} \frac{e^{- \frac{x^2}{4 t (1 - t) }}}{\sqrt{4 \pi t (1 - t)}} = \frac{1}{2} \int _{0}^1 dt I_1(t) \, . \end{aligned} \end{aligned}$$

First of all, we substitute \(\log x^2 \rightarrow x^{2k} \). We will later take the derivative with respect to k and evaluate the result at \(k=0\) to recover the logarithmic contribution.

Thus, we start from

$$\begin{aligned} \begin{aligned} I_1 (t,k) = \int _{- \infty }^{\infty } d x \, x^{2 k} \frac{e^{- \frac{x^2}{4 t (1 - t) }}}{\sqrt{4 \pi t (1 - t)}} = \frac{\lambda ^{k + \frac{1}{2}} \varGamma \left( k + \frac{1}{2} \right) }{\sqrt{\pi \lambda }} = \frac{\lambda ^{k} \varGamma \left( k + \frac{1}{2} \right) }{\sqrt{\pi }} \, \end{aligned} \end{aligned}$$

where \(\lambda = 4 t (1 - t)\) so that

$$\begin{aligned} \begin{aligned} I_1 (t,k) = \frac{[4 t (1 - t)]^k}{\sqrt{\pi }} \varGamma \left( k + \frac{1}{2} \right) \, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} I_1 (t) = \partial _k I_1 (t,k) \vert _{k=0} = \log [4 t (1 - t)] + \psi _0 \left( \frac{1}{2} \right) \, \end{aligned} \end{aligned}$$

where \(\psi _0 (z) = \frac{d}{d z} \log \varGamma (z) \) is the digamma function, and

$$\begin{aligned} \begin{aligned} \psi _0 \left( \frac{1}{2} \right) = - \gamma _E - \log 4 \, . \end{aligned} \end{aligned}$$

Carrying out the remaining integral over t, we finally obtain

$$\begin{aligned} \begin{aligned} \langle s [{\varvec{\sigma }}] \rangle _\mathrm{B} = \frac{1}{2} \int _0^1 d t \{ \log [t (1 - t)] - \gamma _E \} = - \frac{2 + \gamma _E}{2} \, . \end{aligned} \end{aligned}$$
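As a numerical sanity check of this closed form (a sketch; the midpoint rule handles the integrable logarithmic singularities at the endpoints):

```python
import math

gamma_E = 0.5772156649015329   # Euler-Mascheroni constant

# Midpoint-rule estimate of (1/2) * integral_0^1 [log(t(1-t)) - gamma_E] dt
n = 200000
total = 0.0
for i in range(n):
    t = (i + 0.5) / n
    total += math.log(t * (1 - t)) - gamma_E
mean_entropy = 0.5 * total / n

exact = -(2 + gamma_E) / 2   # the closed form derived above, ~ -1.28861
```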

Second Moment

We wish to compute

$$\begin{aligned} \begin{aligned} \left\langle (s[{\varvec{\sigma }}])^2 \right\rangle _\mathrm{B}&= 2! \int _{\varDelta _2 } dt_1 dt_2 \frac{1}{4 \pi \sqrt{ t_1 (t_2 - t_1 ) (1-t_2 )}} \\&\qquad \times \int _{-\infty }^{\infty } dx dy \frac{\log x^2 }{2} \frac{\log y^2 }{2} \exp \left[ -\frac{x^2}{4t_1 } - \frac{(x-y)^2}{4(t_2 -t_1 )} - \frac{y^2}{4(1-t_2 )} \right] \\&= \frac{1}{2} \int _{\varDelta _2 } dt_1 dt_2 \, I_2 (t_1 ,t_2 ) \,. \end{aligned} \end{aligned}$$

Again, we substitute \(\log x^2 \rightarrow x^{2k_1}\) and \(\log y^2 \rightarrow y^{2k_2}\), and we will recover the correct logarithmic factors by taking derivatives in \(k_1\) and \(k_2\) later. We start by dealing with the contact term:

$$\begin{aligned} \begin{aligned} \exp \left[ -\frac{(x-y)^2}{4 (t_2 -t_1 )} \right]&= \exp \left[ -\frac{x^2 + y^2}{4 (t_2 -t_1 )} \right] \cosh \left( \frac{x y}{2 (t_2 - t_1 )} \right) \\&= \exp \left[ -\frac{x^2 + y^2}{4 (t_2 -t_1 )} \right] \sum _{s \ge 0} \frac{\sqrt{\pi }}{\varGamma \left( s+\frac{1}{2}\right) s!} \left( \frac{xy}{4(t_2 - t_1)} \right) ^{2s} \, \end{aligned} \end{aligned}$$

where the hyperbolic sine term has been discarded, since it is odd in x and y and thus integrates to zero by parity. With this substitution, the integrals in x and y decouple, and can be evaluated as

$$\begin{aligned} \begin{aligned} I_2&(t_1 ,t_2 ,k_1 ,k_2 ) = \\&= \frac{1}{4 \pi \sqrt{t_1 (t_2 -t_1 )(1-t_2 )}} \int _{-\infty }^{\infty } dx dy \, x^{2k_1} y^{2k_2} \exp \left[ -\frac{x^2}{4t_1 } - \frac{(x-y)^2}{4(t_2 -t_1 )} - \frac{y^2}{4(1-t_2 )} \right] \\&= \frac{2}{\pi \sqrt{\varDelta _1 \varDelta _2 \varDelta _3 }} \sum _{s \ge 0} \frac{\sqrt{\pi }}{\varGamma \left( s+\frac{1}{2}\right) s! \varDelta _2 ^{2s} } \\&\quad \int _{-\infty }^{\infty } dx x^{2(s+k_1 ) } \exp \left[ - \frac{x^2}{\lambda _1 }\right] \int _{-\infty }^{\infty } dy y^{2(s+k_2 )} \exp \left[ - \frac{y^2}{\lambda _2 }\right] \\&= \frac{2 \lambda _1 ^{k_1 +\frac{1}{2} } \lambda _2 ^{k_2 +\frac{1}{2} }}{\sqrt{\pi \varDelta _1 \varDelta _2 \varDelta _3 }} \sum _{s \ge 0} \frac{ \varGamma \left( k_1 +s+\frac{1}{2} \right) \varGamma \left( k_2 + s+ \frac{1}{2}\right) }{\varGamma \left( s+\frac{1}{2}\right) s!} \left( \frac{ \lambda _1 \lambda _2 }{\varDelta _2 ^2}\right) ^s \, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \lambda _1&= \frac{4 t_1 (t_2 - t_1 ) }{t_2 } \, ,&\lambda _2&= \frac{4 (t_2 -t_1 )(1 - t_2 )}{1-t_1 } \, ,&\varDelta _i&= 4(t_{i} -t_{i-1} ) \, , \end{aligned}$$

(we recall that \(t_0 =0\) and \(t_3 = 1\) by convention). We also notice that the following identity holds

$$\begin{aligned} \begin{aligned} \frac{2 \sqrt{\lambda _1 \lambda _2}}{\sqrt{\varDelta _1 \varDelta _2 \varDelta _3}} \left( 1- \frac{\lambda _1 \lambda _2}{\varDelta _2 ^2}\right) ^{-\frac{1}{2}} = 1 \, \end{aligned} \end{aligned}$$

as it can be explicitly verified by substituting the definitions.
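The verification can also be carried out numerically; a minimal sketch (names ours):

```python
import math

def identity_lhs(t1, t2):
    """Left-hand side of the identity, with lambda_1, lambda_2, Delta_i as defined above."""
    lam1 = 4 * t1 * (t2 - t1) / t2
    lam2 = 4 * (t2 - t1) * (1 - t2) / (1 - t1)
    d1, d2, d3 = 4 * t1, 4 * (t2 - t1), 4 * (1 - t2)
    return (2 * math.sqrt(lam1 * lam2) / math.sqrt(d1 * d2 * d3)
            / math.sqrt(1 - lam1 * lam2 / d2 ** 2))

identity_err = max(
    abs(identity_lhs(t1, t2) - 1)
    for t1, t2 in [(0.1, 0.2), (0.3, 0.7), (0.05, 0.95), (0.45, 0.55)]
)
```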

By using the previous identity and Eq. (82), we can rewrite \(I_2 (t_1 ,t_2 ,k_1 ,k_2 )\) as

$$\begin{aligned} \begin{aligned} I_2&(t_1 ,t_2 ,k_1 ,k_2 ) = \\&= \frac{\lambda _1 ^{k_1 } \lambda _2 ^{k_2 }}{ \sqrt{\pi }} \left( 1-\frac{\lambda _1 \lambda _2 }{\varDelta _2 ^2}\right) ^{-k_1 -k_2 } \frac{\varGamma \left( k_1 +\frac{1}{2} \right) \varGamma \left( k_2 +\frac{1}{2}\right) }{\varGamma \left( -k_1\right) \varGamma \left( -k_2 \right) } \sum _{s \ge 0} \frac{ \varGamma \left( s - k_1 \right) \varGamma \left( s- k_2 \right) }{\varGamma \left( s+\frac{1}{2}\right) s!} \left( \frac{ \lambda _1 \lambda _2 }{\varDelta _2 ^2}\right) ^s \\&= k_1 k_2 \left[ \sqrt{\pi } \sum _{s\ge 1} \frac{\varGamma \left( s\right) \varGamma \left( s\right) }{\varGamma \left( s+\frac{1}{2}\right) s!} \left( \frac{\lambda _1 \lambda _2 }{\varDelta _2 ^2}\right) ^s + {\mathcal {O}}(k_1 ) {\mathcal {O}}(k_ 2) \right] \\&\quad + \frac{\lambda _1 ^{k_1 } \lambda _2 ^{k_2 }}{\pi } \left( 1-\frac{\lambda _1 \lambda _2 }{\varDelta _2 ^2}\right) ^{-k_1 -k_2 } \varGamma \left( k_1 +\frac{1}{2} \right) \varGamma \left( k_2 +\frac{1}{2}\right) \, , \end{aligned}\nonumber \\ \end{aligned}$$

where in the last line we used the fact that \(\varGamma (-k) = -k^{-1} + \dots \) as k goes to zero, to highlight the linear dependence on \(k_1\) and \(k_2\) in the first term. Notice that

$$\begin{aligned} \begin{aligned} \sum _{s\ge 1} \frac{\varGamma \left( s\right) \varGamma \left( s\right) }{\varGamma \left( s+\frac{1}{2}\right) s!} \, z^s = \frac{2}{\sqrt{\pi }} \arcsin ^2 \left( \sqrt{z}\right) \,. \end{aligned} \end{aligned}$$
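This classical series for \(\arcsin^2\) is easy to confirm numerically; the sketch below (names ours) again accumulates the sum via term ratios:

```python
import math

def arcsin_sq_series(z, terms=2000):
    """sum_{s>=1} Gamma(s)^2 / (Gamma(s+1/2) * s!) * z^s  for 0 <= z < 1."""
    term = z / math.gamma(1.5)   # s = 1 term: Gamma(1)^2 / (Gamma(3/2) * 1!) * z
    total = term
    for s in range(1, terms):
        term *= s * s / ((s + 0.5) * (s + 1)) * z
        total += term
    return total

series_err = max(
    abs(arcsin_sq_series(z) - 2 / math.sqrt(math.pi) * math.asin(math.sqrt(z)) ** 2)
    for z in (0.1, 0.5, 0.9)
)
```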

We can now take the derivatives with respect to \(k_1\) and \(k_2\) and evaluate the result at \(k_1 =k_2 =0\) to obtain \(I_2 (t_1 ,t_2 )\)

$$\begin{aligned} \begin{aligned} I_2 (t_1 ,t_2 )&= \partial ^2 _{k_1 ,k_2 } I_2 (t_1 ,t_2 ,k_1 ,k_2 ) \vert _{k_1 =k_2 = 0 } \\&= 2 \arcsin ^2 \left( \sqrt{\frac{\lambda _1 \lambda _2 }{\varDelta _2 ^2}}\right) \\&\quad + \left[ \psi _0 \left( \frac{1}{2}\right) + \log \lambda _1 - \log \left( 1 - \frac{\lambda _1 \lambda _2 }{\varDelta _2 ^2}\right) \right] \\&\quad \left[ \psi _0 \left( \frac{1}{2}\right) + \log \lambda _2 - \log \left( 1 - \frac{\lambda _1 \lambda _2 }{\varDelta _2 ^2}\right) \right] \\&= 2 \arcsin ^2 \left( \sqrt{\frac{ t_1 (1-t_2 )}{(1-t_1 )t_2 }}\right) \\&\quad + \left[ \psi _0 \left( \frac{1}{2}\right) + \log \left( 4 t_1 (1-t_1 )\right) \right] \left[ \psi _0 \left( \frac{1}{2}\right) + \log \left( 4 t_2 (1-t_2 )\right) \right] \, . \end{aligned} \end{aligned}$$

The second term is easily treated by noting that the integrand is symmetric under the exchange \(t_1 \leftrightarrow t_2\), giving

$$\begin{aligned} \begin{aligned}&\frac{1}{2} \int _0 ^1 dt_1 \int _{t_1} ^1 dt_2 \left[ \psi _0 \left( \frac{1}{2}\right) + \log \left( 4 t_1 (1-t_1 )\right) \right] \left[ \psi _0 \left( \frac{1}{2}\right) + \log \left( 4 t_2 (1-t_2 )\right) \right] \\&\quad = \frac{1}{4} \int _0 ^1 dt_1 \int _0 ^1 dt_2 \left[ \psi _0 \left( \frac{1}{2}\right) + \log \left( 4 t_1 (1-t_1 )\right) \right] \left[ \psi _0 \left( \frac{1}{2}\right) + \log \left( 4 t_2 (1-t_2 )\right) \right] \\&\quad =\left[ \frac{1}{2}\int _0 ^1 dt \left( \log \left( t (1-t )\right) - \gamma _E \right) \right] ^2 \\&\quad = \left( - \frac{\gamma _E +2}{2} \right) ^2 = \left( \left\langle s[{\varvec{\sigma }}] \right\rangle _\mathrm{B} \right) ^2 \, . \end{aligned} \end{aligned}$$

Thus, the variance of \(s[{\varvec{\sigma }}]\) is given by

$$\begin{aligned} \begin{aligned} \left\langle \left( s[{\varvec{\sigma }}] \right) ^2 \right\rangle _\mathrm{B} - \left( \left\langle s[{\varvec{\sigma }}] \right\rangle _\mathrm{B} \right) ^2&= \frac{1}{2} \int _0 ^1 dt_1 \int _{t_1}^1 dt_2 \, 2 \arcsin ^2 \left( \sqrt{ \frac{t_1 (1-t_2 )}{(1-t_1 ) t_2}} \right) \\&= \int _0 ^1 dt \int _0 ^1 dz \, \frac{t(1-t)}{(t+z-tz)^2} \arcsin ^2 \left( \sqrt{z} \right) \\&= \int _0 ^1 dz \left[ - \frac{2(1-z) + (1+z) \log z}{(1-z)^3} \right] \arcsin ^2 \left( \sqrt{z} \right) \\&= \frac{1}{3} - \frac{\pi ^2}{72} \, . \, \end{aligned} \end{aligned}$$
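The final value \(1/3 - \pi^2/72 \approx 0.19626\) can be confirmed by direct numerical integration of the last one-dimensional integral; in the sketch below, computing \(u = 1-z\) directly and using log1p keeps the \(0/0\) cancellation of the bracket near \(z=1\) (where it tends to 1/6) under control:

```python
import math

# Midpoint estimate of int_0^1 -(2(1-z) + (1+z) log z)/(1-z)^3 * arcsin(sqrt(z))^2 dz
n = 200000
total = 0.0
for i in range(n):
    u = (n - i - 0.5) / n          # u = 1 - z, computed directly for accuracy near z = 1
    z = 1.0 - u
    log_z = math.log1p(-u)         # log(z) without catastrophic cancellation for small u
    bracket = -(2 * u + (2 - u) * log_z) / u ** 3   # tends to 1/6 as z -> 1
    total += bracket * math.asin(math.sqrt(z)) ** 2
variance = total / n

exact = 1 / 3 - math.pi ** 2 / 72
```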

Computations Needed in Sect. 3.2.2

In this Appendix, we present in greater detail the computations sketched in Sect. 3.2.2.

First of all, let us state the transfer theorem of singularity analysis in a slightly imprecise, but operationally correct, form. See [14, Chapter VI] for the details.

Theorem 1

Let f(z), g(z) and h(z) be generating functions with unit radius of convergence, and let \(f_n\), \(g_n\) and \(h_n\) be their coefficients. Then

$$\begin{aligned} \begin{aligned} f(z) = g(z) + {\mathcal {O}}(h(z)) \qquad z \rightarrow 1^- \end{aligned} \end{aligned}$$

if and only if

$$\begin{aligned} \begin{aligned} f_n = g_n + {\mathcal {O}}(h_n) \qquad n \rightarrow \infty \, . \end{aligned} \end{aligned}$$

In practice, one builds a standard scale of functions whose coefficient expansions are known, expands a generic generating function over the standard scale (assuming that the scale is rich enough to contain the generating function of interest in its linear span), and uses the transfer theorem to guarantee that an asymptotic equivalence at the level of the generating functions implies (and is implied by) an asymptotic equivalence of their coefficients. A standard scale, adopted already in [14, Section VI.8] and rich enough for our purposes, is given by functions of the form

$$\begin{aligned} \begin{aligned} f(x) = \sum _{n\ge 0} f_n x^n = \left( \frac{1}{1-x} \right) ^\alpha {{\,\mathrm{L}\,}}(x)^\beta \, \end{aligned} \end{aligned}$$

where \({{\,\mathrm{L}\,}}(x) = \log \left( (1-x)^{-1} \right) \). In this case, if \(\alpha \) is not a non-positive integer, the coefficients \(f_n\) satisfy

$$\begin{aligned} \begin{aligned} f_n = [x^n] \left( \frac{1}{1-x} \right) ^\alpha {{\,\mathrm{L}\,}}(x)^\beta \sim \frac{1}{\varGamma \left( \alpha \right) } n^{\alpha -1} \log ^{\beta } (n) \, \end{aligned} \end{aligned}$$

for \(n \rightarrow \infty \). See [14, Fig. VI.5] for the behaviour of subleading corrections.
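As a concrete illustration of the transfer (our example, not from the text): for \(\alpha = \beta = 1\) the standard-scale function is \((1-x)^{-1} {{\,\mathrm{L}\,}}(x)\), whose n-th coefficient is the harmonic number \(H_n\), and the scale predicts \(f_n \sim \log n\), with \(\gamma_E\) as the first subleading correction:

```python
import math

n = 1_000_000
H_n = sum(1.0 / k for k in range(1, n + 1))   # [x^n] (1-x)^{-1} L(x) = H_n
predicted = math.log(n)                        # n^(alpha-1) * log^beta(n) / Gamma(alpha)
ratio = H_n / predicted

# The subleading correction is gamma_E: H_n - log n -> gamma_E as n grows
gamma_gap = H_n - math.log(n)
```

At this n the ratio is still a few percent away from 1 (the relative correction is of order \(\gamma_E/\log n\)), while \(H_n - \log n\) is already within \(10^{-6}\) of \(\gamma_E\).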

Singular Behaviour of \({{\,\mathrm{Li}\,}}_{0,2}(x)\)

The coefficient of order h of \({{\,\mathrm{Li}\,}}_{0,2}(x)\) equals \(\log ^2(h)\). It is easy to see, using [14, Fig. VI.5], that the singular expansion of \({{\,\mathrm{Li}\,}}_{0,2}(x)\) over the standard scale must be built from the terms \((1-x)^{-1} L(x)^2\), \((1-x)^{-1} L(x)\), \((1-x)^{-1}\) and higher-order terms. The coefficients of the expansion can be retrieved by basic linear algebra:

  • the coefficient of \((1-x)^{-1} L(x)^2\) must be 1 to reconstruct the \(\log ^2(h)\) term;

  • the coefficient of \((1-x)^{-1} L(x)\) must be \(-2 \gamma _E\) to cancel the \(\log (h)\) term introduced by the first standard scale function;

  • the coefficient of \((1-x)^{-1}\) must be \(\zeta (2) + \gamma _E^2 = \pi ^2/6 + \gamma _E^2\) to cancel the constant terms introduced in the expansion by the previous standard scale functions.

The error term can be obtained by observing that all expansions used above are valid up to order \({\mathcal {O}}\left( h^{-1} \log h \right) \). Thus

$$\begin{aligned} \begin{aligned} {{\,\mathrm{Li}\,}}_{0,2}(x) = \frac{{{\,\mathrm{L}\,}}(x)^2 - 2\gamma _E {{\,\mathrm{L}\,}}(x) + \gamma _E^2 + \frac{\pi ^2}{6}}{1-x} + {\mathcal {O}}\left( {{\,\mathrm{L}\,}}(x)^2 \right) \, . \end{aligned} \end{aligned}$$
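This singular expansion can be tested numerically by comparing a truncated series for \({{\,\mathrm{Li}\,}}_{0,2}(x)\) with the right-hand side close to \(x = 1\) (a sketch; the loose tolerance reflects the \({\mathcal {O}}(L(x)^2)\) error term):

```python
import math

gamma_E = 0.5772156649015329
x = 0.999
L = math.log(1 / (1 - x))

# Truncated series: terms beyond h ~ 40000 are damped by x^h ~ e^{-40}
li02 = sum(math.log(h) ** 2 * x ** h for h in range(2, 40000))

singular = (L ** 2 - 2 * gamma_E * L + gamma_E ** 2 + math.pi ** 2 / 6) / (1 - x)
li02_rel_err = abs(li02 / singular - 1)
```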

Singular Behaviour of \(\sum _{h \ge 1} \left( \log (h^2+h) \log h! \right) x^{h}\)

The procedure is analogous to the computation for \({{\,\mathrm{Li}\,}}_{0,2}(x)\). We just need to use Stirling’s approximation for \(\log h!\) to obtain the expansion of the coefficients in h. For the coefficient of order h of this function, we have

$$\begin{aligned} \begin{aligned}&\left( \log (h^2+h) \log h! \right) \\&\quad = 2 h \log ^2(h)-2 h \log (h)+\log ^2(h)\\&\qquad +(1+\log (2)+\log (\pi )) \log (h)-1+ {\mathcal {O}}\left( \frac{\log (h)}{h}\right) \, \end{aligned} \end{aligned}$$

so that the singular expansion must be built from the terms \((1-x)^{-2} L(x)^2, (1-x)^{-2} L(x), (1-x)^{-2}\), and higher-order ones. The coefficients can be found using the same strategy adopted for \({{\,\mathrm{Li}\,}}_{0,2}(x)\), obtaining

$$\begin{aligned} \begin{aligned}&\sum _{h \ge 1} \left( \log (h^2+h) \log h! \right) x^{h} \\&\quad =\frac{ 2 L(x)^2 + 2(1-2\gamma _E) L(x) +2 \gamma _E^2+\frac{\pi ^2}{3} -2-2 \gamma _E }{(1-x)^2} + {\mathcal {O}}\left( \frac{L(x)^2}{1-x} \right) \, . \end{aligned} \end{aligned}$$
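The same kind of numerical test applies here (a sketch; \(\log h!\) is evaluated as lgamma(h+1), and the tolerance is loose because of the \({\mathcal {O}}(L(x)^2/(1-x))\) error term):

```python
import math

gamma_E = 0.5772156649015329
x = 0.999
L = math.log(1 / (1 - x))

# Truncated series; log(h!) computed as lgamma(h+1)
series = sum(
    math.log(h * h + h) * math.lgamma(h + 1) * x ** h for h in range(2, 40000)
)

singular = (2 * L ** 2 + 2 * (1 - 2 * gamma_E) * L
            + 2 * gamma_E ** 2 + math.pi ** 2 / 3 - 2 - 2 * gamma_E) / (1 - x) ** 2
gf_rel_err = abs(series / singular - 1)
```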

The Change of Variable x(z)

First of all, we rewrite the change of variable \(x(z) = C(z) -1\) in terms of the singular variables \(X = (1-x)^{-1}\) and \(Z = (1-4z)^{-1}\). We have that

$$\begin{aligned} \begin{aligned} X(Z) = \frac{1}{2} \left( \sqrt{Z} + 1 \right) \, \end{aligned} \end{aligned}$$

so that

$$\begin{aligned} \begin{aligned} (1-x)^{-\alpha }&= X^\alpha = \left( \frac{\sqrt{Z}}{2} \right) ^\alpha \left( 1 + \frac{\alpha }{\sqrt{Z}} + {\mathcal {O}}(Z^{-1}) \right) \, , \\ {{\,\mathrm{L}\,}}(x)^k&= \log ^k(X) = \left( \frac{1}{2} \log Z - \log 2 + {\mathcal {O}}(1/\sqrt{Z}) \right) ^k \\&\quad = \left( \frac{1}{2} \log Z - \log 2 \right) ^k + {\mathcal {O}}\left( \frac{\log ^{k-1}(Z)}{\sqrt{Z}} \right) \, . \end{aligned} \end{aligned}$$

These relations are enough to convert singular expansions in X into singular expansions in Z at leading algebraic order. The formulas of this subsection are, in principle, also useful for higher moments.
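As a consistency check of the rewriting (a sketch, assuming \(C(z) = (1-\sqrt{1-4z})/(2z)\), the generating function of the Catalan numbers used in Sect. 3.2.2 for the change of variable):

```python
import math

def X_from_x(z):
    """X = 1/(1-x) with x = C(z) - 1, assuming C(z) is the Catalan generating function."""
    C = (1 - math.sqrt(1 - 4 * z)) / (2 * z)
    return 1 / (2 - C)

def X_from_Z(z):
    """The rewritten form X(Z) = (sqrt(Z) + 1)/2 with Z = 1/(1-4z)."""
    Z = 1 / (1 - 4 * z)
    return 0.5 * (math.sqrt(Z) + 1)

change_var_err = max(abs(X_from_x(z) - X_from_Z(z)) for z in (0.01, 0.1, 0.2, 0.24))
```

For instance, at \(z = 0.24\) both expressions give exactly \(X = 3\).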

Explicit Expressions for \(M^\mathrm{(T)}_2(z)\)

We report here the explicit expressions of \(M^\mathrm{(T)}_2(z)\), for \(\mathrm{T} \in \{\mathrm{E}, \mathrm{B}\}\), both as functions of x and as functions of z:

$$\begin{aligned} \begin{aligned} M^\mathrm{(E)}_2(z(x))&= \frac{4 L(x)^2 + 8(1-\gamma _E) L(x) -8-8 \gamma _E+4 \gamma _E^2+\frac{4 \pi ^2}{3} }{(1-x)^3} + {\mathcal {O}}\left( \frac{L(x)^2}{(1-x)^2} \right) \\ M^\mathrm{(E)}_2(z)&= \left( \frac{1}{8} L(4z)^2 -\frac{1 - \gamma _E - \log (2)}{2} L(4z)\right) \frac{1}{(1-4z)^\frac{3}{2}} \\&\quad +\frac{\pi ^2-6+3 \gamma _E (\gamma _E-2+\log (4))+(\log (2)-2) \log (8)}{6 (1-4z)^\frac{3}{2}} + {\mathcal {O}}\left( \frac{L(4z)^2}{1-4z} \right) \\ M^\mathrm{(B)}_2(z(x))&= \frac{24 L(x)^2 + 16(1-3\gamma _E) L(x)-16-16 \gamma _E+24 \gamma _E^2+\frac{8 \pi ^2}{3}}{(1-x)^5} + {\mathcal {O}}\left( \frac{L(x)^2}{(1-x)^4} \right) \\ M^\mathrm{(B)}_2(z)&= \left( \frac{3}{16} L(4z)^2 -\frac{3 \gamma _E+1- 3 \log (2)}{4} L(4z)\right) \frac{1}{(1-4z)^\frac{5}{2}} \\&\quad +\frac{9 \gamma _E^2+\pi ^2-6+6 \gamma _E (\log (8)-1)+(\log (8)-2) \log (8)}{12 (1-4z)^\frac{5}{2}} + {\mathcal {O}}\left( \frac{L(4z)^2}{(1-4z)^2} \right) \, . \end{aligned}\nonumber \\ \end{aligned}$$

These expressions can easily be recomputed with smaller error terms by repeating the procedure of the previous paragraphs with more terms in the Taylor expansions, and they provide an evaluation of the variance of our quantities of interest at the desired order in N.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit




Cite this article

Caracciolo, S., Erba, V. & Sportiello, A. The Number of Optimal Matchings for Euclidean Assignment on the Line. J Stat Phys 183, 3 (2021).




  • Random combinatorial optimization
  • Euclidean correlations
  • Assignment problem