Abstract
We introduce a new model called the Brownian Conga line. It is a random curve evolving in time, generated when a particle performing a two dimensional Gaussian random walk leads a long chain of particles connected to each other by cohesive forces. We approximate the discrete Conga line in some sense by a smooth random curve and subsequently study the properties of this smooth curve.
Introduction
The Conga line is a Cuban carnival march that has become popular in many cultures over time. It consists of a queue of people, each one holding onto the person in front of him. The person at the front of the line can move as he wishes, and the person holding onto him from behind follows him. The third person in the queue follows the second, and so on. Often people keep joining the line over time by attaching themselves to the last person in the line. As the Conga line grows in time, it displays interesting motion patterns in which the randomness in the motion of the first person propagates down the line, diminishing in influence as it moves further down. The Conga line also appears in molecular biology as a model for various long polymers built of smaller monomers and is closely linked to (a discrete version of) the curve shortening problem (see [1]). In this article, we devise a mathematical formulation of the Conga line and study its properties.
The formulation is as follows.
Let \(Z_k, \ k \ge 1\), be i.i.d. standard two dimensional normal random variables. Fix some \(\alpha \in (0,1)\). Let \(X_1(0)=0\) and \(X_1(n)=\sum _{i=1}^nZ_i\) for \(n \ge 1\). This denotes the leading particle, or the tip of the Conga line.
Now, we define processes \(X_k\) inductively as follows. Suppose that \(\{X_k(n), \ n \ge 0\}\) have already been defined for \(1 \le k \le j\). Then we let \(X_{j+1}(0)=0\) and
$$\begin{aligned} X_{j+1}(n)=(1-\alpha )X_{j+1}(n-1)+\alpha X_j(n-1) \end{aligned}$$(1.1)
for \(n \ge 1\). Here, the process \(X_k\) denotes the motion over time of the particle at distance k from the leading particle. The relation (1.1) describes the manner in which a particle \(X_{j+1}\) follows the preceding particle \(X_j\). It is easy to check from (1.1) that \(X_j(n)=0\) for all \(j> n\). These represent the particles at rest at the origin at time n. Note that the jth particle \(X_j\) joins the Conga line at time j. See Fig. 1 for the construction of the Conga line for \(n=1,2,3,4\).
The Conga line at time n is defined as the collection of random variables \(\{X_k(n), \ k \le n\}\).
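The dynamics above are straightforward to simulate. The sketch below assumes the corner-cutting form of the recursion (1.1), \(X_{j+1}(n)=(1-\alpha )X_{j+1}(n-1)+\alpha X_j(n-1)\) (an assumed reading, consistent with the geometric delays introduced in Sect. 2), and checks that particles \(X_j\) with \(j>n\) are still at rest at the origin at time n:

```python
import random

def conga_line(n_steps, n_particles, alpha, seed=0):
    """Simulate the discrete Conga line up to time n_steps.

    The tip performs a two dimensional Gaussian random walk; each follower
    moves a fraction alpha of the way toward its predecessor's previous
    position (the corner-cutting recursion assumed here for (1.1)):
        X_{j+1}(n) = (1 - alpha) * X_{j+1}(n-1) + alpha * X_j(n-1).
    """
    rng = random.Random(seed)
    # pos[k] is the current position of particle X_{k+1}; all start at 0
    pos = [(0.0, 0.0)] * n_particles
    for _ in range(n_steps):
        old = pos[:]
        # leading particle: independent standard normal increments
        pos[0] = (old[0][0] + rng.gauss(0, 1), old[0][1] + rng.gauss(0, 1))
        for j in range(1, n_particles):
            pos[j] = tuple((1 - alpha) * old[j][i] + alpha * old[j - 1][i]
                           for i in range(2))
    return pos

pos = conga_line(n_steps=50, n_particles=60, alpha=0.5)
# particles X_j with j > n are still at the origin at time n (cf. text)
assert all(p == (0.0, 0.0) for p in pos[50:])
assert pos[0] != (0.0, 0.0)  # the tip has moved
```

Plotting `pos` for moderately large n reproduces the qualitative picture of Fig. 2: widely spaced, rough motion near the tip and a closely packed, smooth-looking curve further down the line.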
One can also think of this model as a discrete version of a long string or molecule whose tip is moving randomly under the effect of an erratic force and the rest of it performs a constrained motion governed by the tip together with the cohesive forces. Burdzy and Pal [2] performed some simulations (see Fig. 2) which led them to make the following observations:

1.
For a fixed large n, the locations of the particles \(\{X_k(n), \ k \ge 1\}\) sufficiently away from the tip look like a ‘smooth’ curve, and the smoothness increases as we move away from the tip.

2.
For k significantly larger than 1, there is very little variability in the location of the particles over short periods of time.

3.
The small loops in the curve tend to die out over time. Just before death, they look ‘elongated’ and their death site forms a cusp.

4.
The particles near the origin seem to freeze, showing very little movement over time. Moreover, the direction in which the particles come out of the origin seems to stabilise over time.
All the above observations need precise mathematical formulations. Once the rigorous foundations are established, we can ask the correct questions and try to answer them. This, broadly, is the goal of the article.
We give a brief outline of the content of each section.
In Sect. 2, we try to make mathematical sense of the statement ‘the process looks like a smooth curve’. This is the toughest challenge as the Conga line, unlike most known stochastic processes which can be approximated by continuous models, does not seem to have an interesting scaling limit under uniform scaling. This is because if we look at the Conga line for any fixed n, the distance between the particles decays as we move away from the tip, i.e., increase k, as suggested by Fig. 2. But this is precisely what makes it a novel model, which exhibits particles moving in different scales ‘in the same picture’. The particles near the tip are wider spaced and their paths mostly resemble a Gaussian random walk, but those for large k are more closely packed and the Conga line looks very smooth in this region (see Fig. 2). To circumvent this problem, we describe a coupling between our discrete process \(\{X_k(n), \ k \le n\}\) and a smooth random process \(\{u(x,t): (x,t) \in \mathbb {R}^{+}\times \mathbb {R}^{+}\}\) such that, when observed sufficiently away from the tip, more precisely for \(k \ge n^{\epsilon }\) for any fixed \(\epsilon >0\) and large n, the points \(X_{k+1}(n)\) are uniformly close to the points u(k, n). Thus, u serves as a smooth approximation to the discrete process X in a suitable sense. The x variable of u represents distance from the tip and the t variable represents time. Future references to the Conga line refer to this smooth version u. We close the section by presenting another smooth process \(\overline{u}\) that also serves as an approximation in the same sense, and is more intuitive when considering the motion of individual particles, i.e. trajectories of the form \(\{X_k(n): n \ge k\}\) for fixed k. It is also used in Sect. 6 to study the phenomenon of freezing of the Conga line near the origin.
Note In the following, we will be investigating the properties of the continuous two dimensional Conga line (which is the smooth approximation to our original discrete model) as well as the process corresponding to its x (equivalently y) coordinate, which we will call the (continuous) one dimensional Conga line. To save additional notation, we will denote both of these by u, but the dimension will be clear from the context.
In Sect. 3, we study the properties of the continuous, one dimensional Conga line u. First, we investigate the phenomenon of the particles at different distances from the tip moving in ‘different scales’ suggested by their different order of variances. The particles near the tip move in the same scale as the leading Gaussian random walk as indicated by their variance being O(t), while those far away from the tip show very little movement away from the origin, as indicated by exponentially decaying variances. Furthermore, there exists a cutoff near \(x\!=\!\alpha t\), where the variance shows a sharp transition from ‘very large to very small’. We identify this and study the fine changes in variance around this point.
Next, using the scaling properties of Brownian motion, we show that for fixed t, the Conga line (in both one and two dimensions) can be scaled so that the space variable x runs in [0, 1]. We call this scaled version \(u_t\) and study its analytical properties. Upper bounds on the growth rate of the derivatives show that \(u_t\) is real analytic. We also make a detailed study of the covariance structure of the derivatives. This turns out to be a major tool in studying the subsequent properties like critical points, length, loops, etc.
With the basic framework of the Conga line established, we set out to investigate its finer properties. We investigate the distribution of critical points of the scaled one dimensional Conga line \(u_t\), i.e., points at which the derivative vanishes. The number of critical points in an interval serves as a measure of how wiggly the Conga line looks on that interval. The critical points are distributed as a point process on the real line and we show using an expectation metatheorem for smooth Gaussian fields (see [3, p. 263]) that its first intensity at x (for a large time t) is approximately of the form \(\sqrt{t}\,x^{-1/2}\). This shows that, though the typical number of these points in a given interval is \(O(\sqrt{t})\) for large t, the proportion of critical points around x decreases as \(x^{-\frac{1}{2}}\) as we go farther away from the tip. We also show subsequently using second moment estimates that the critical points are reasonably well-spaced and they do not tend to crowd around any point. Furthermore, we show that the first intensity is a good estimate of the point process itself as, for a given interval I sufficiently away from the ends \(x=0\) and \(x=1\), the ratio \({\frac{N_t(I)}{\mathbb {E}N_t(I)}}\) goes to one in probability as t grows large.
In Sect. 4 we give a fluctuation estimate for the continuous two dimensional Conga line. This, along with the approximation result in Sect. 2, tells us that for any \(\delta \in (0,\alpha )\) and for large n, the linear interpolation of the discrete Conga line, along with its derivative (which exists everywhere except at integer points), comes uniformly close to \((u(\cdot , n), \partial _xu(\cdot , n))\) on the entire interval \([\delta n, \alpha n]\). This explains why the discrete Conga line looks smooth away from the tip. We then study properties of the scaled two dimensional Conga line, like length and number of loops. We also investigate a strange phenomenon. Although the mechanism of subsequent particles following the preceding ones and ‘cutting corners’ results in progressively smoothing out the roughness of the Gaussian random walk of the tip, we see that as t increases, the scaled Conga line looks more and more like Brownian motion in that the sup-norm distance between them on [0, 1] is roughly of order \(t^{-1/4}\) (with a log correction term). This can be explained by the fact that the noticeable smoothing of the paths of the unscaled Conga line takes place in a window of width \(\sqrt{t}\) around each point, which translates into a window of width \(t^{-1/2}\) as we scale space and time by t. Thus in the scaled version, the smoothing window becomes smaller with time, resulting in this phenomenon. Consequently, the scaled Conga line \(u_t\) for large t serves as a smooth approximation to Brownian motion which smooths out microscopic irregularities but retains its macroscopic characteristics.
In Sect. 5 we study the evolution of loops in the family of two dimensional paths that the particles at successively larger distances from the tip trace out. We study this evolution under a metric similar to the Skorohod metric. It turns out that with probability one, every singularity, i.e., a point where the speed of the curve becomes zero, in a particle path is a cusp singularity (it looks like the graph of \(y=x^{2/3}\) in a local coordinate frame). Furthermore, there is a bijection between dying loops and cusp singularities in the sense that small loops die (i.e. the loop shrinks to a point) creating cusp singularities, and conversely, if such a singularity appears in the path of some particle, we can find a loop in the paths of the immediately preceding particles, and it dies creating the singularity.
Finally, in Sect. 6, we investigate the phenomenon of freezing near the origin. We work with the smooth approximation \(\overline{u}\), and show that for an appropriate choice of a sequence \(x_t\) of distances from the tip such that the particles at these distances remain sufficiently close to the origin, \(\overline{u}(x_t,t)\) converges almost surely and in \(L^2\), and find the limiting function.
Notation Before we proceed, we clarify the notation that we will be using here:

(i)
If g is a random function and V is a random variable with distribution function F and independent of g, then
$$\begin{aligned} \mathbb {E}_Vg(V)=\int g(v)\,dF(v) \end{aligned}$$denotes the expectation with respect to V for a fixed realisation of g.

(ii)
\(\Phi \) denotes the standard normal distribution function and \(\overline{\Phi }=1-\Phi \). We denote the corresponding density function by \(\phi \).

(iii)
For any function f of several variables, \(\partial ^k_x f\) denotes the partial derivative of f with respect to the variable x taken k times.

(iv)
For functions \(f,g: [0,\infty ) \rightarrow \mathbb {R}^+\), \(f(t) \sim g(t)\) means that f and g have the same growth rate in t, i.e., there exists a constant C such that
$$\begin{aligned}{\frac{f(t)}{g(t)}\vee \frac{g(t)}{f(t)} \le C}\end{aligned}$$for all sufficiently large t.

(v)
For a family of real-valued functions \(\{f_t: t \in (0, \infty )\}\) defined on a compact set \(I \subseteq \mathbb {R}^k\) and a function \(a: (0,\infty ) \rightarrow [0,\infty )\), we say that
$$\begin{aligned} f_t=O^{\infty }(a(t))\quad \text {on}\ I \end{aligned}$$if
$$\begin{aligned} \sup _{t \in (0,\infty )}\frac{\sup _{x \in I}|f_t(x)|}{a(t)} \le C \end{aligned}$$for some constant \(C<\infty \). Sometimes, (by abuse of notation) we will write
$$\begin{aligned} f_t(x)=O^{\infty }\left( a(t)\right) \quad \text {for}\ x \in I \end{aligned}$$to denote the same.
The discrete Conga line
We set out by finding a useful way to express \(X_{k}(n)\) in terms of \(X_1(n)\). This turns out to be the starting point in our approximation procedure of the discrete Conga line \(\{X_k(n): k \ge 1\}\) for sufficiently large k by a smooth curve.
Let \(T_1,T_2,\ldots \) be i.i.d. Geom(\(\alpha \)) random variables and let
$$\begin{aligned} \Theta _j=\sum _{i=1}^jT_i. \end{aligned}$$
Then \(\Theta _j \sim NB(j,\alpha )\), where NB(a, b) represents the Negative Binomial distribution with parameters a and b. It follows from the recursion relation (1.1) that one can write
where we set \(X_k(n)=0\) for all \(n \le 0\), for each k.
By induction, we get
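The representation obtained from this induction can be checked numerically. The sketch below reads the conclusion as \(X_{k+1}(n)=\mathbb {E}\,X_1(n-\Theta _k)\) with \(X_1(m)=0\) for \(m\le 0\) (an assumed reading, consistent with \(\Theta _k\sim NB(k,\alpha )\)) and verifies it, in one coordinate, against the recursion (1.1):

```python
import random

# Assumed reading of the induction: X_{k+1}(n) = E X_1(n - Theta_k), with
# Theta_k = T_1 + ... + T_k, T_i i.i.d. Geom(alpha) on {1, 2, ...}, and
# X_1(m) = 0 for m <= 0.  We verify this against the recursion (1.1).
alpha, n, kmax = 0.3, 40, 5
rng = random.Random(1)

# one coordinate of the tip path X_1
X1 = [0.0]
for _ in range(n):
    X1.append(X1[-1] + rng.gauss(0, 1))

# followers via X_{j+1}(m) = (1 - alpha) X_{j+1}(m-1) + alpha X_j(m-1)
X = [X1]
for _ in range(kmax):
    prev, cur = X[-1], [0.0]
    for m in range(1, n + 1):
        cur.append((1 - alpha) * cur[m - 1] + alpha * prev[m - 1])
    X.append(cur)

# pmf of Theta_kmax on {0, ..., n} by convolving the geometric pmf
geo = [0.0] + [alpha * (1 - alpha) ** (j - 1) for j in range(1, n + 1)]
pmf = geo[:]
for _ in range(kmax - 1):
    pmf = [sum(pmf[i] * geo[l - i] for i in range(l + 1)) for l in range(n + 1)]

# E X_1(n - Theta_kmax); terms with Theta_kmax >= n hit X_1(m <= 0) = 0,
# so truncating the pmf at n loses nothing
mix = sum(pmf[l] * X1[n - l] for l in range(n + 1))
assert abs(mix - X[kmax][n]) < 1e-9
```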
Approximation by a smooth process
Here we show that for any fixed \(\epsilon >0\), the discrete Conga line \(\{X_k(n)\}\) can be approximated uniformly in k, for \(n^{\epsilon }\le k \le n\), for large n, by a smooth process \(\{u(x,t): (x,t) \in \mathbb {R}^+ \times \mathbb {R}^+\}\) evaluated at integer points (k, n). This process arises as a smoothing kernel acting on Brownian motion. Furthermore, the increments \(X_{k+1}(n)-X_k(n)\) (which correspond to the ‘discrete derivative’ of X for fixed n) can also be uniformly approximated by the derivative of \(u(\cdot , n)\) at k.
Let \(B_l\sim \mathrm{{Bin}}(l,\alpha )\). From (2.1) and the fact that
$$\begin{aligned} P(\Theta _k \le l)=P(B_l \ge k), \end{aligned}$$
we get
The above expression also yields
The next step is the key to the approximation. We obtain a coupling between a Brownian motion and our process X. Let \((\Omega , \mathcal {F}, P)\) be a probability space supporting a Brownian motion \(\{W(t): t \ge 0\}\), where we set \(W(t)=0\) for all \(t \le 0\). Then
gives the desired coupling on this space. Note that we can write
where
Here \(\lfloor z \rfloor \) denotes the largest integer less than or equal to z, and \(W^t_z= W(t)-W(t-z)\), \(0\le z\le t\), is the time reversed Brownian motion from time t.
Let \(\sigma =\sqrt{\alpha (1-\alpha )}\). Consider the “space-time” process
(we obtain the second expression from the first by an application of stochastic integration by parts, see [4]). Note that u is smooth in x and its first derivative is given by
We prove in what follows that for large n, and for \(n^{\epsilon }\le k \le n\) for any fixed \(\epsilon >0\), the points \(X_{k+1}(n)\) are “uniformly close” to the points u(k, n) (u evaluated at integer points) for the given range of k. Similarly, the increments \(X_{k+2}(n)-X_{k+1}(n)\) are also uniformly close to \(\partial _x u(k,n)\) for k in the given range. Our strategy is to first consider a discretized version of the process u(x, t) and its derivative \(\partial _x u(x,t)\), respectively given by
In Lemma 1, we give a bound on the \(L^2\) distance between \(X_{k+1}(n)\) and \(\hat{u}(k,n)\) and between \(X_{k+2}(n)-X_{k+1}(n)\) and \(\widehat{\partial _x u}(k,n)\) for large n when \(n^{\epsilon } \le k \le n\). In Lemma 2, a similar bound is achieved for the \(L^2\) distance between \(\hat{u}(k,n)\) and u(k, n) (and between \(\widehat{\partial _x u}(k,n)\) and \(\partial _x u(k,n)\)). In Theorem 1, we prove using a Borel–Cantelli argument that for large n the two processes X and u (evaluated at integer points) come uniformly close on \(n^{\epsilon } \le k \le n\). A similar result holds for the increments \(X_{k+2}(n)-X_{k+1}(n)\) and the partial derivative \(\partial _x u(k,n)\).
In the following, \(C_1,C_2,\ldots \) represent absolute constants, \(C_{\epsilon }\), \(C'_{\epsilon }\) denote constants that depend only on \(\epsilon \), \(C_p\) denotes a constant depending only on p and \(D_{\epsilon ,p}\), \(D'_{\epsilon ,p}\) denote constants depending upon both \(\epsilon \) and p.
Lemma 1
Fix \(\epsilon >0\). For \(n^{\epsilon }\le k \le n,\)
where \(\sigma =\sqrt{\alpha (1-\alpha )}\). Consequently,
and
uniformly on \(n^{\epsilon }\le k \le n.\)
Proof
Choose \(C>0\) such that \(\epsilon (C-1)\ge 2\) and \(C\ge \frac{3}{2}\). Take \(L_k=\lfloor \alpha ^{-1}\sqrt{Ck \log k}\rfloor \). Then, we can write
Here, \(S_1^{(k)}\) and \(S_3^{(k)}\) correspond to the tails of the distribution functions, and we shall show that they are negligible compared to \(S_2^{(k)}\). To this end, note that
Now, by Bernstein’s inequality,
We also have
Therefore, for large k,
Similarly, for \(S_3^{(k)}\), we get
Now, for \(S_2^{(k)}\), we use the Berry–Esseen Theorem (see [5]).
The second inequality in (2.5) follows similarly, except that we use the De Moivre–Laplace Theorem (see [6]) in place of the Berry–Esseen Theorem to bound the second sum. The De Moivre–Laplace Theorem gives a moderate deviations bound: it says that if \(A_N\) is nondecreasing and satisfies \({\frac{A_N}{N^{1/6}} \rightarrow 0}\) as \(N \rightarrow \infty \), then there is a constant C depending only on \(\alpha \) such that
Putting \(A_N = \alpha ^{-1}\sqrt{C\log N}\) in this expression yields the required bound. This completes the proof of the lemma. \(\square \)
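Since the Gaussian increments are independent, the \(L^2\) distance controlled in Lemma 1 is a sum of squared differences between the binomial tail \(P(B_l\ge k)\) and its normal surrogate \(\overline{\Phi }\left( \frac{k-\alpha l}{\sigma \sqrt{l}}\right) \). A small numerical check of this sum, using exact binomial tails (the forward-increment discretization below is a simplification of the time-reversed convention in the text):

```python
import math

def phi_bar(x):
    """Standard normal upper tail, 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def binom_tail(l, k, a):
    """P(B_l >= k) for B_l ~ Bin(l, a), computed exactly."""
    return sum(math.comb(l, j) * a**j * (1 - a)**(l - j)
               for j in range(k, l + 1))

alpha, n, k = 0.5, 400, 100
sigma = math.sqrt(alpha * (1 - alpha))

# sum of squared coefficient differences between the discrete kernel
# P(B_l >= k) and its normal surrogate; for unit-variance independent
# increments this is (under the simplified discretization used here)
# the L^2 distance between X_{k+1}(n) and u_hat(k, n)
D = sum((binom_tail(l, k, alpha)
         - phi_bar((k - alpha * l) / (sigma * math.sqrt(l)))) ** 2
        for l in range(1, n))
assert D < 0.1  # small, in line with the O(sqrt(log k)/sqrt(k)) bound
```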
Lemma 2
\({\mathbb {E}|\hat{u}(k,n)-u(k,n)|^2 \le C_{\epsilon }\frac{\sqrt{\log k}}{\sqrt{k}}}\) and \(\mathbb {E}|\widehat{\partial _x u}(k,n)-\partial _xu(k,n)|^2 \le C_{\epsilon }\frac{\sqrt{\log k}}{k^{3/2}}\) uniformly on \(n^{\epsilon }\le k \le n.\)
Proof
Write \({\hat{u}(k,n)=\int _0^n\hat{f}(k,z)dW_z^n}\) and \({u(k,n)=\int _0^nf(k,z)dW_z^n}\) where \(\hat{f}(k,z)= \overline{\Phi }\left( \frac{k-\alpha \lfloor z \rfloor }{\sigma \sqrt{\lfloor z \rfloor }}\right) \hbox { and }{f(k,z)=\overline{\Phi }\left( \frac{k-\alpha z}{\sigma \sqrt{z}}\right) }.\)
Then, we can decompose \({\mathbb {E}|\hat{u}(k,n)-u(k,n)|^2}\) as in the proof of Lemma 1 as follows:
Now,
We can follow the same argument as in the proof of Lemma 1 and verify that \(I_1^{(k)}\) and \(I_3^{(k)}\) are bounded above by \(C_5(\sqrt{k})^{-1}\). To handle the second term, note that by (2.6), we have
on \(\lfloor \frac{k}{\alpha }\rfloor -L_k \le z \le \lfloor \frac{k}{\alpha }\rfloor +L_k\). So,
This gives the first bound claimed in the lemma. To prove the second bound, we proceed exactly the same way, but now we use
on \(\lfloor \frac{k}{\alpha }\rfloor -L_k \le z \le \lfloor \frac{k}{\alpha }\rfloor +L_k\), in place of the derivative bound on f. \(\square \)
So, by the preceding lemmas, we have proved that
and
uniformly on \(n^{\epsilon }\le k \le n\).
Now, \(X_{k+1}(n)- u(k,n) = \int _0^n(g(k,z)-f(k,z))\,dW^n_z\). As this is a centred Gaussian random variable,
and similarly,
We use this to obtain the following theorem.
Theorem 1
For any \(\mu > 0,\epsilon > 0\) and \( \eta >0,\) define the following events :
and
Then \(P(\limsup _nB_n^{\epsilon })=0\).
Proof
By a Chebyshev-type inequality (for the 2pth moment) and (2.7), we get, for any \(p \ge 1\),
Hence,
Now, choose p large enough such that \(P(B_n^{\epsilon })\le C n^{-2}\) for some constant C. The result now follows by the Borel–Cantelli lemma. \(\square \)
Note The above theorem suggests that the distance between the points \(X_{k+1}(n)\) and u(k, n) decreases as we increase k. This is exactly what is suggested by Fig. 2. Furthermore, the difference between \(X_{k+2}(n)-X_{k+1}(n)\) and \(\partial _x u(k,n)\) decreases even more rapidly on increasing k.
After we have investigated the path properties of the continuous Conga line u in subsequent sections, we will prove a fluctuation estimate for u in Lemma 12 which can be combined with Theorem 1 to prove Theorem 4 which states that the linear interpolation of \(X(\cdot ,n)\) and its derivative (which exists everywhere except integer points) come uniformly close to \((u(\cdot ,n), \partial _x u(\cdot ,n))\). This will formally explain why the discrete Conga line looks smooth when observed sufficiently far away from the tip.
Another smooth approximation
Here we give another smooth approximation \(\overline{u}\) to X given by
where \(\rho =\sigma /\alpha ^{3/2}\), \(Z_{\rho ^2x}\) is a normal random variable with mean zero and variance \(\rho ^2x\) and \(\mathbb {E}_{Z_{\rho ^2x}}\) represents expectation taken with respect to \(Z_{\rho ^2x}\) for a fixed realisation of W (see Notation (i)).
This approximation is more convenient and intuitive for investigating the paths of individual particles and studying the phenomenon of freezing near the origin. Note that \(\overline{u}\) has the following properties:

The curve \(\{\overline{u}(x,t): x \in (0,t]\}\) for fixed t corresponds to the Conga line. Increasing x (i.e. moving away from the tip) results in increasing smoothness along the same curve \(\overline{u}(\cdot ,t)\), indicated by the increasing variance of the \(Z_{\cdot }\) variable (see Fig. 2).

The curve \(\{\overline{u}(x,t): t \in [x,\infty )\}\) for fixed x corresponds to the path of the particle at distance x from the tip. As successive particles ‘cut corners’, the family of curves \(\overline{u}(x,\cdot )\) become progressively smoother with x (see Fig. 5).
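Concretely, \(\overline{u}(x,\cdot )\) averages the driving Brownian path against a Gaussian kernel whose variance \(\rho ^2x\) grows down the line. The sketch below assumes the reading \(\overline{u}(x,t)=\mathbb {E}_{Z_{\rho ^2x}}W(t-x/\alpha +Z_{\rho ^2x})\), where the centring \(t-x/\alpha \) is meant to reflect the mean delay of the particle at distance x (both are assumptions of this sketch), and illustrates the increasing smoothness through a grid-based total variation:

```python
import math, random

def smooth_path(W, h, x, alpha, rho, t_grid):
    """Evaluate u_bar(x, t) = E_Z W(t - x/alpha + Z), Z ~ N(0, rho^2 x),
    on a grid of times; W is given as samples on a grid of mesh h (linear
    interpolation in between).  This is an assumed reading of u_bar."""
    var = rho * rho * x
    sd = math.sqrt(var)
    # discretized Gaussian weights on +-4 standard deviations
    zs = [i * h for i in range(int(-4 * sd / h), int(4 * sd / h) + 1)]
    w = [math.exp(-z * z / (2 * var)) for z in zs]
    tot = sum(w)
    w = [wi / tot for wi in w]

    def W_at(s):  # piecewise-linear interpolation, W = 0 for s <= 0
        if s <= 0:
            return 0.0
        i = min(int(s / h), len(W) - 2)
        frac = s / h - i
        return (1 - frac) * W[i] + frac * W[i + 1]

    return [sum(wi * W_at(t - x / alpha + z) for wi, z in zip(w, zs))
            for t in t_grid]

rng = random.Random(0)
h, T, alpha, rho = 0.1, 200.0, 0.5, 1.0
W = [0.0]
for _ in range(int(T / h)):
    W.append(W[-1] + rng.gauss(0, math.sqrt(h)))

t_grid = [100 + 0.1 * i for i in range(500)]
tv = {}
for x in (1.0, 25.0):
    u = smooth_path(W, h, x, alpha, rho, t_grid)
    tv[x] = sum(abs(u[i + 1] - u[i]) for i in range(len(u) - 1))
# the smoothing window widens with x, so the path gets smoother down the line
assert tv[25.0] < tv[1.0]
```

Convolving with a wider Gaussian only removes variation (Gaussian variances add, and convolution with a probability density never increases total variation), which is the mechanism behind the progressively smoother curves of Fig. 5.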
Theorem 2
The result of Theorem 1 holds with u replaced by \(\overline{u}.\)
Proof
Let \(\rho =\sigma /\alpha ^{3/2}\). Consider another continuous process \(u^*\) given by
Rewrite \(u^*\) as follows:
Clearly,
So, on \(n^{\epsilon } \le k \le n\),
Furthermore, by using stochastic integration by parts to express \(e_1(x,t)\) as a stochastic integral and then computing the second moment, it can be verified that
So, on \(n^{\epsilon } \le k \le n\),
By calculations similar to those in the proof of Lemma 2,
It is routine to check that a similar analysis yields the required bound for \(\mathbb {E}|\partial _xu(k,n)-\partial _x\overline{u}(k,n)|^2\). Now, proceeding exactly as in the proof of Theorem 1, we get the result. \(\square \)
The continuous one dimensional Conga line
Here we investigate properties of the continuous one dimensional Conga line u, the coordinate process of the random smooth curve obtained in the previous section as an approximation to the discrete Conga line X.
Particles moving at different scales
It is not hard to observe by estimating
$$\begin{aligned} \mathrm{{Var}}(u(x,t))=\int _0^t\overline{\Phi }^2\left( \frac{x-\alpha z}{\sigma \sqrt{z}}\right) dz \end{aligned}$$
that particles at distances ct from the leading particle have variance O(t) if \(c < \alpha \) and o(1) (in fact, the variance decays exponentially with t) if \(c > \alpha \). Also in a window of width \(c\sqrt{t}\) about \(\alpha t\), the variance is \(O(\sqrt{t})\). In particular, this indicates that there is a window of the form \([\alpha t, \alpha t + c_t]\), with \(\frac{c_t}{t} \rightarrow 0\) and \(\frac{c_t}{\sqrt{t}} \rightarrow \infty \), where the variance changes from being ‘very large to very small’. Furthermore, we will show that there is a cutoff around which the variance shows a sharp transition: below it the variance grows to infinity with t, and above it the variance decreases to zero.
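By Itô's isometry applied to the kernel \(\overline{\Phi }\left( \frac{x-\alpha z}{\sigma \sqrt{z}}\right) \) from Sect. 2, the variance \(\int _0^t\overline{\Phi }^2\left( \frac{x-\alpha z}{\sigma \sqrt{z}}\right) dz\) can be evaluated numerically to see the transition at \(x=\alpha t\) (a sketch; the grid and parameters are illustrative):

```python
import math

def phi_bar(x):
    """Standard normal upper tail, 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def var_u(x, t, alpha, n_grid=4000):
    """Var u(x,t) = int_0^t Phi_bar((x - alpha*z)/(sigma*sqrt(z)))^2 dz,
    by Ito's isometry, computed with a midpoint rule."""
    sigma = math.sqrt(alpha * (1 - alpha))
    h = t / n_grid
    return h * sum(phi_bar((x - alpha * z) / (sigma * math.sqrt(z))) ** 2
                   for z in (h * (j + 0.5) for j in range(n_grid)))

t, alpha = 400.0, 0.5
assert var_u(0.3 * t, t, alpha) > 0.3 * t   # c < alpha: variance of order t
assert var_u(0.7 * t, t, alpha) < 1e-6      # c > alpha: exponentially small
```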
Theorem 3

(i)
For \(\lambda > 0\) and \(1/2 \le \beta < 1,\) \(\mathrm{{Var}}(u(\alpha t - \lambda t^{\beta },t))\sim t^{\beta }.\)

(ii)
\(\mathrm{{Var}}(u(\alpha t + \sigma \sqrt{\lambda t \log t},t)) \sim {\frac{t^{1/2-\lambda }}{(\log t)^{3/2}}}.\)
Proof
(i) Take any \(c>\lambda /\alpha \). Then, decomposing the variance,
The first integral satisfies
where the second step above follows from a change of variables.
It is easy to check that
proving part (i).
(ii) We decompose the variance as
The first integral decays like \(e^{-Ct}\) for some constant C. For the second integral, we make a change of variables similar to (i) and use standard estimates for the normal c.d.f. to get
proving (ii). \(\square \)
Part (ii) of the above theorem has the following interesting consequence, demonstrating a cutoff phenomenon for the variance of u(x, t) in the vicinity of \({x=\alpha t + \sigma \sqrt{\frac{1}{2}t \log t}}\).
Corollary 1
As \(t \rightarrow \infty \)

(i)
\(\mathrm{{Var}}(u(\alpha t + \sigma \sqrt{\lambda t \log t},t)) \rightarrow 0\) if \(\lambda \ge 1/2.\)

(ii)
\(\mathrm{{Var}}(u(\alpha t + \sigma \sqrt{\lambda t \log t},t)) \rightarrow \infty \) if \(\lambda < 1/2.\)

(iii)
For \(0 < \delta < \infty ,\) \(\mathrm{{Var}}(u(\alpha t + \sigma \sqrt{t((1/2) \log t - (3/2)\log \log t - \log \delta )},t))\sim \delta .\)
So, the variance exhibits a sharp transition around \(\alpha t + \sigma \sqrt{t((1/2) \log t - (3/2)\log \log t)}.\)
The proof follows easily from part (ii) of Theorem 3.
Analyticity of the scaled Conga line
For a fixed time t, by a change of variables in (2.4), we have:
where \({\sigma _t=\frac{\sigma }{\sqrt{t}}}\) and \(W^{(t)}(z)=t^{-\frac{1}{2}}W(tz), 0 \le z \le 1\). So, to study the Conga line for fixed t, we study the scaled process
for \(0 \le x \le 1\). From (3.1), note that u and \(u^W_t\) are connected by the exact equality
for \(0 \le x \le 1\). In particular, for fixed t,
When the driving Brownian motion W is clear from the context, we will suppress the superscript W and write \(u_t\) for \(u^W_t\).
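The scaling identity behind (3.1) is exact at the level of Riemann sums, as a quick check confirms (the forward-increment discretization below is a simplification of the time-reversed integral in the text; the Brownian-scaling step is the same):

```python
import math, random

def phi_bar(x):
    """Standard normal upper tail, 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

alpha = 0.5
sigma = math.sqrt(alpha * (1 - alpha))
t = 100
rng = random.Random(2)
dW = [rng.gauss(0, 1) for _ in range(t)]  # unit increments of W on [0, t]

x = 30.0
# unscaled: u(x, t) ~ sum_l Phi_bar((x - alpha*l)/(sigma*sqrt(l))) dW_l
u = sum(phi_bar((x - alpha * l) / (sigma * math.sqrt(l))) * dW[l - 1]
        for l in range(1, t + 1))

# scaled: sigma_t = sigma/sqrt(t), W^{(t)}(s) = t^{-1/2} W(ts), s in [0, 1]
sigma_t = sigma / math.sqrt(t)
dW_scaled = [z / math.sqrt(t) for z in dW]  # increments of W^{(t)}
u_t = sum(phi_bar((x / t - alpha * s) / (sigma_t * math.sqrt(s)))
          * dW_scaled[l - 1]
          for l, s in ((l, l / t) for l in range(1, t + 1)))

# the change of variables z = ts makes this an exact identity term by term
assert abs(u - math.sqrt(t) * u_t) < 1e-9
```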
Now, we take a look at the derivatives of \(u_t\). It is easy to check that we can differentiate under the integral. Thus,
In general, the \((n+1)\)th derivative takes the following form:
where \(\mathrm{{He}}_n\) is the nth Hermite polynomial (probabilist version) given by
$$\begin{aligned} \mathrm{{He}}_n(x)=(-1)^ne^{x^2/2}\frac{d^n}{dx^n}e^{-x^2/2}. \end{aligned}$$
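For computations, \(\mathrm{{He}}_n\) is conveniently generated by the standard three-term recurrence \(\mathrm{{He}}_{n+1}(x)=x\,\mathrm{{He}}_n(x)-n\,\mathrm{{He}}_{n-1}(x)\); the sketch below also checks the classical identity \(\phi ^{(n)}(x)=(-1)^n\mathrm{{He}}_n(x)\phi (x)\), which is what makes these polynomials appear in the derivative formula:

```python
import math

def hermite_prob(n, x):
    """Probabilist Hermite polynomial He_n via the three-term recurrence
    He_{n+1}(x) = x He_n(x) - n He_{n-1}(x)."""
    h0, h1 = 1.0, x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

# He_2(x) = x^2 - 1 and He_3(x) = x^3 - 3x
for x in (-1.5, 0.0, 2.0):
    assert abs(hermite_prob(2, x) - (x * x - 1)) < 1e-12
    assert abs(hermite_prob(3, x) - (x**3 - 3 * x)) < 1e-12

# identity behind the kernel derivatives:
#   d^n/dx^n phi(x) = (-1)^n He_n(x) phi(x)
phi = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
x, h = 0.7, 1e-3
# central finite difference for the third derivative of phi
third = (phi(x + 2 * h) - 2 * phi(x + h)
         + 2 * phi(x - h) - phi(x - 2 * h)) / (2 * h**3)
assert abs(third - (-1) ** 3 * hermite_prob(3, x) * phi(x)) < 1e-4
```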
In the following lemma, we give an upper bound on the growth rate of the derivatives. Using this, we will prove that, for fixed t, \(u_t\) is real analytic on the interval (0, 1), and the radius of convergence around \(x_0\) is comparable to \(x_0\). This is natural as the Conga line gets smoother as we move away from the tip. We start off with the following lemma.
Lemma 3
For \({0<\epsilon < \frac{x}{\alpha }},\)
where \({\Vert W\Vert = \sup _{0\le s\le 1}|W_s|}.\)
Proof
In the proof, we consider C as a generic positive constant whose value might change in between steps.
Let
Then
So, \({|\partial _x^{n+1}u_t(x)|\le \Vert W\Vert \int _0^1|\partial _sK_t^n(x,s)|\,ds}\), and hence it suffices to estimate the integral \({\int _0^1|\partial _sK_t^n(x,s)|\,ds}\). Now,
So,
From the above, it is clear that estimating the second integral suffices.
where
Here we use the facts that
and
Similarly,
The lemma follows from the above. \(\square \)
From this lemma, it is not too hard to see that \(u_t\) is real analytic on (0, 1). Let \(x_0\) be any point in this interval. For \(0<\delta < x_0\), define
The nth order Taylor polynomial based at \(x_0\) is given by \(T_t^n(x)= \sum _{i=0}^n\frac{\partial _x^i u_t(x_0)}{i!}(x-x_0)^i\). By Taylor’s inequality,
From the above lemma, we know that, for \({\epsilon < \frac{x_0}{\alpha }}\),
The above error will go to zero only when \({\frac{\sqrt{2}\delta }{x_0-\delta -\alpha \epsilon } < 1}\), i.e., \( {\delta < \frac{x_0-\alpha \epsilon }{1+\sqrt{2}}}\). We can make \(\epsilon \) arbitrarily small to get the following:
Corollary 2
With probability one, the scaled Conga line \(u_t\) is real analytic on (0, 1). The power series expansion of \(u_t\) around \(x_0 \in (0,1)\) converges in \({(\frac{\sqrt{2}x_0}{1+\sqrt{2}}, \min \{\sqrt{2}x_0,1\})}.\)
We are going to use this property of the Conga line multiple times in this article.
Covariance structure of the derivatives
In the following sections, we will analyse the finer properties of the Conga line like distribution of critical points, length and shape and number of loops. For all of these, fine estimates on the covariance structure of the derivatives are of utmost importance. This section is devoted to finding these estimates for the one dimensional scaled Conga line \(u_t\).
To get uniform estimates for the covariance structure of derivatives, we will look at the scaled Conga line sufficiently away from the tip \(x=0\). More precisely, we will consider \(\{u_t(x): \delta \le x \le \alpha \}\) for an arbitrary \(\delta \in (0, \alpha )\). This amounts to analysing the unscaled version \(u(\cdot ,t)\) in the region \(\delta t \le x \le \alpha t\).
The next lemma is about the covariance between the first derivatives at two points. In what follows, we write \({L_t^x(M)=\alpha ^{-1}\sqrt{M\frac{x}{t}\log \frac{x}{t}}}\).
Lemma 4
For \(\delta \le x,y \le \alpha ,\) \({\mathrm{{Cov}}(u_t'(x), u_t'(y)) \ge 0}\) and satisfies
Consequently, the correlation function \({{\rho }_t(x,y)=\mathrm{{Corr}}(u_t'(x), u_t'(y))}\) is always nonnegative and has the following decay rate
where constants \(C_1,C_2\) depend only on \(\delta \) and \(\alpha .\)
Proof
To prove this lemma, note that, by completing squares in the exponent, we get
Now we want to estimate the integral \({\int _0^1\frac{1}{\sigma _t^2s}\phi ^2\left( \frac{x - \alpha s}{\sigma _t\sqrt{s}}\right) ds}\) where \(\delta \le x \le \alpha \). By choosing M large enough, we can ensure that
Notice that
Substituting \(\sqrt{\frac{x^2 +y^2}{2}}\) in place of x proves the lemma. \(\square \)
Lemma 5
For \(\delta \le x \le \alpha ,\)
Proof
Follows along the same lines as the proof of Lemma 4. \(\square \)
Lemma 6
For \(\delta \le x \le \alpha ,\)
Proof
where \({f_t(x,s)=\frac{2x}{x+\alpha s}-1}\).
Using the fact that \({\int _{-\infty }^{\infty }s \exp (-s^2)\,ds=0}\), we get
choosing sufficiently large M.
Also note that
for \(\delta \le x \le \alpha ,\frac{x}{\alpha }-L_t^x(M) \le s \le \frac{x}{\alpha }+L_t^x(M)\), which yields
Equations (3.6) and (3.7) prove the lemma. \(\square \)
Corollary 3
For \(\delta \le x \le \alpha ,\)
Proof
This follows from Lemmas 4, 5 and 6. \(\square \)
Let \(\Sigma _t(x,y)\) be the covariance matrix of \((u_t'(x),u_t'(y))\). We need the following technical lemma to estimate the determinant of the matrix. It turns out to be crucial in certain second moment computations in Sect. 3.4.
Lemma 7
There exist constants \(C^*,C_1,C_2\) such that, for \(\delta \le x,y \le \alpha \) with \({|x-y| \le \frac{C^*}{\sqrt{t}}},\)
Proof
We fix \(x \in [\delta ,\alpha ]\) and consider the function \(\Psi _{t,x}(y)=\det \Sigma _t(x,y)\). Consider the function \({g_t(y)= \mathrm{{Var}}\,{u_t'(y)} = \int _0^1\frac{1}{\sigma _t^2 s} \phi ^2\left( \frac{y - \alpha s}{\sigma _t\sqrt{s}}\right) ds}\). Let \(\mathrm{{H}}_n\) denote the nth Hermite polynomial (physicist version) given by
$$\begin{aligned} \mathrm{{H}}_n(x)=(-1)^ne^{x^2}\frac{d^n}{dx^n}e^{-x^2}. \end{aligned}$$
Then we can write the nth derivative of \(g_t\) as
Using the fact that \(\int _{-\infty }^{\infty }H_n(s)\exp \{-s^2\}\,ds=0\) and the same technique as the proof of Lemma 6, one can show that, for \(n\ge 1\),
Let \({\eta =\sqrt{\frac{x^2+y^2}{2}}}\).
Consider the functions
and
Then, writing down \(\mathrm{{Cov}}(u_t'(x),u_t'(y))\) as in the proof of Lemma 4, we get
It is easy to check that
The double derivative of \(E_{t,x}\) takes the form
Using (3.8) we deduce
which, along with (3.10) yields
for some constant \(C < \infty \).
Now, to estimate \(F_{t,x}\), note that in the region \(\delta \le x,y \le \alpha \),
Using (3.12) along with the fact that \(e^{-C}x \le 1-e^{-x} \le x\) on \(0\le x \le C\), and Lemma 4, we get
where \(C, C^*\) are positive, finite constants.
Equations (3.11) and (3.13) together prove the lemma. \(\square \)
Analyzing the distribution of critical points
Let \({N_t(I)}\) denote the number of critical points of the scaled one dimensional Conga line \(u_t\) in an interval \(I\subseteq [\delta , \alpha ]\). Then \(N_t\) defines a simple point process on \([\delta , \alpha ]\). Our first goal is to find out the first intensity of this process. For this, we use the Expectation metatheorem for smooth Gaussian fields (see [3, p. 263]), which implies the following:
where \(p_t^y\) is the density of \(u_t'(y)\). Before we go further, we remark that the metatheorem from [3] mentioned above is a very general theorem which holds in a much wider setup under a set of assumptions. In our case, it is easy to check that all the assumptions hold. Now, we utilize (3.14) and the developments in Sect. 3.3 to derive a nice expression for the first intensity density.
Lemma 8
The first intensity density \(\rho _t\) for \(N_t\) satisfies
for \(\delta \le x \le \alpha \).
Proof
By standard formulae for normal conditional densities and the lemmas proved in Sect. 3.3, we manipulate (3.14) as follows:
Here, we use Corollary 3 to get the second equality and the estimates proved in Lemmas 4 and 5 to get the third equality. This proves the lemma. \(\square \)
Thus \({\hat{\rho }_t(x)=\frac{\sqrt{\alpha t}}{\pi \sigma \sqrt{2x}}}\) gives us the approximate first intensity for \(N_t\). From this, we see that the expected number of critical points in a small interval \([x,x+h]\) is approximately \({\frac{\sqrt{\alpha t}h}{\pi \sigma \sqrt{2x}}}\).
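The \(\sqrt{t}\) growth of the number of critical points can be seen in simulation. Below is a hedged sketch of the discrete Conga line, assuming the update rule (1.1) has the averaging form \(X_{j+1}(n)=(1-\alpha )X_{j+1}(n-1)+\alpha X_j(n-1)\) (this specific form is our assumption); critical points of one coordinate are counted as sign changes of its finite differences over \([\delta n, \alpha n]\). No attempt is made to match the constant, which involves \(\sigma \); only the growth with t is checked.

```python
import numpy as np

def conga_line(n, alpha, rng):
    """Positions X_k(n), k = 0..n, of the discrete Conga line at time n.
    Assumes the update X_{j+1}(m) = (1-alpha) X_{j+1}(m-1) + alpha X_j(m-1)."""
    X = np.zeros((n + 1, 2))
    for _ in range(n):
        X[1:] = (1 - alpha) * X[1:] + alpha * X[:-1]   # followers (old values on RHS)
        X[0] += rng.standard_normal(2)                 # leading particle's Gaussian step
    return X

def count_critical_points(n, alpha=0.5, delta=0.1, a=0.9, seed=0):
    X = conga_line(n, alpha, np.random.default_rng(seed))
    coord = X[int(delta * n):int(a * n), 0]            # one coordinate, away from the tip
    d = np.diff(coord)
    return int(np.sum(d[:-1] * d[1:] < 0))             # sign changes of the derivative

# The count grows with n, consistent with a first intensity of order sqrt(t).
small, big = count_critical_points(4000), count_critical_points(16000)
assert 0 < small < big
```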
Now that we know the first intensity reasonably accurately, we can ask finer questions about the distribution of critical points, such as

(i)
What can we say about the spacings of the critical points? Are there points in \([\delta , \alpha ]\) around which there is a large concentration of critical points, or are they more or less well-spaced?

(ii)
Given an interval \(I \subseteq [\delta , \alpha ]\), how good is \(\mathbb {E}N_t(I)\) as an estimate of \(N_t(I)\)?
The next lemma answers (i) by estimating the second intensity of \(N_t\). First we present a formula for the second intensity of \(N_t\) taken from [3].
where \({p_t^{y,z}}\) is the joint density of \((u_t'(y),u_t'(z))\).
In the following, \(C^*\) represents a positive constant.
Lemma 9
For \({t > \frac{4(1+\sqrt{2})^2}{2\delta ^2}},\) and \(\delta \le y,z \le \alpha \) with \({|y-z| \le \frac{C^*}{\sqrt{t}}},\)
for some constant \(C >0.\)
Proof
The hypothesis of the lemma tells us that y and z lie in the region of analyticity of each other, i.e. we can write
and the same holds with y and z interchanged. If we know that \(u_t'(y)=0\) and \(u_t'(z)=0\), the above equation becomes
From this, we can solve for \(u_t''(z)\) to get
and the same holds with y and z interchanged. Thus, the conditional expectation in (3.16) becomes
Now, by the Cauchy–Schwarz inequality and the fact that the conditional variance is bounded above by the total variance, we have
We know that
So, using the same techniques as in the proof of Lemma 3, for \({0<\epsilon <\frac{\delta }{2\alpha }}\), we estimate the variance as
To estimate the first integral, we note that the function
is maximised at \(s=\sqrt{2m+3}\). So, \({g_m(s) \le (2m+3)^{(2m+3)/2}\exp \{-(2m+3)/2\}}\).
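This maximisation is elementary calculus, and can be sanity-checked numerically; the sketch below assumes \(g_m(s)=s^{2m+3}e^{-s^2/2}\), which is the form consistent with the stated maximiser \(s=\sqrt{2m+3}\).

```python
import numpy as np

# Assumed form g_m(s) = s^(2m+3) exp(-s^2/2), consistent with the stated
# maximiser s = sqrt(2m+3); its maximum is (2m+3)^((2m+3)/2) exp(-(2m+3)/2).
s = np.linspace(1e-3, 15.0, 400001)
for m in range(11):
    g = s ** (2 * m + 3) * np.exp(-s ** 2 / 2)
    peak = (2 * m + 3) ** ((2 * m + 3) / 2) * np.exp(-(2 * m + 3) / 2)
    # the grid maximum sits just below the true maximum at s = sqrt(2m+3)
    assert peak * (1 - 1e-6) <= g.max() <= peak * (1 + 1e-12)
```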
By Stirling’s Formula,
where \(\Gamma (\cdot )\) is the Gamma function. Using this, we get \({g_m(s) \le C\sqrt{m}2^{m}(m+1)!}\). So,
The second integral is easier to estimate. Finally, we get
Therefore,
where, by the assumptions of the lemma, \({\sum _{m=0}^{\infty }a_m(t,y,z) \le C}\) for a constant C that does not depend on t, y, z. Thus, we have
which proves the lemma. \(\square \)
We know that \({p_t^{y,z}(0)=\frac{1}{2\pi \sqrt{\det \Sigma _t(y,z)}}}\). Using Lemmas 7 and 9, we get
Lemma 10
For \({t > \frac{4(1+\sqrt{2})^2}{2\delta ^2}},\) and \(h \le C^* t ^{-1/2},\)
In particular, we get
Using this lemma, we can deduce that if we divide \([\delta ,\alpha ]\) into subintervals of sufficiently small length, the number of critical points in any of these should not exceed one. The following corollary makes this precise.
Corollary 4
Let \(\{a_t\}\) be any sequence such that \(a_t=o(t^{1/4})\). Divide the interval \([\delta ,\alpha ]\) into subintervals \(I_1,\ldots ,I_{\lfloor \sqrt{t}/a_t\rfloor +1}\) of length at most \({\frac{a_t}{\sqrt{t}}}\). Then
as \(t \rightarrow \infty .\)
This follows easily from Lemma 10 using the union bound.
Now, we answer (ii).
Note that for a Poisson point process, the first intensity determines the whole process. The Conga line lacks the Markov property; we can think of it as a process that ‘gains smoothness at the cost of the Markov property’. But Lemma 4 tells us that there is exponential decorrelation, i.e. pieces of the Conga line that are reasonably far apart are almost independent. Thus, we expect the first intensity of \(N_t\) to give us a lot of information about the process \(N_t\) itself. We conclude this section on critical points by making this intuition precise. We show the following:
Lemma 11
Let \(I \subseteq (0, \alpha ]\) be a closed interval. Then
as \(t \rightarrow \infty \).
Proof
We prove the result for \(I=[\delta , \alpha ]\) for an arbitrary \(\delta >0\), although the same proof carries over to a general closed interval contained in \((0,\alpha ]\).
Consider a collection of intervals
where each interval is of length \({\frac{1}{\sqrt{t}}}\) in \([\delta ,\alpha ]\), and \({d(I_j,I_k) \ge \frac{r}{\sqrt{t}}}\) for a sufficiently large r (which can be a function of t), whose optimal choice will be made later; here d(A, B) represents the usual distance between sets A and B. Using the long-range independence of the Conga line (see Lemma 4), we will prove that \(\mathrm{{Var}}(N_t(\bigcup _{j=1}^{C\lfloor \sqrt{t}/r\rfloor }I_j))\) is very small compared to \(\mathbb {E}(N_t(\bigcup _{j=1}^{C\lfloor \sqrt{t}/r\rfloor }I_j))^2\). The proof is completed by covering \([\delta ,\alpha ]\) with \(\lfloor {r}\rfloor \) translates \(\mathcal {C}_1,\ldots ,\mathcal {C}_{\lfloor {r}\rfloor }\) of such collections and an application of the Chebyshev inequality.
Note that all the constants used in this proof depend on \(\delta \).
We begin by computing \(\mathbb {E}(N_t(I_1)N_t(I_2))\) using an analogue of the Expectation metatheorem (which can also be derived from the second intensity formula (3.16)) as follows:
where \(\Sigma _t(y,z)\) is the covariance matrix for \((u_t'(y),u_t'(z))\). We know that if
then
where \(\Sigma =\begin{bmatrix} \Sigma _{11}&\Sigma _{12}\\ \Sigma _{21}&\Sigma _{22} \end{bmatrix}\) and \(\Sigma ^*=\Sigma _{11}-\Sigma _{12}\Sigma _{22}^{-1}\Sigma _{21}.\) Now
Take \((y,z) \in I_j \times I_k\), where \(I_j, I_k \in \mathcal {C}\) with \(j\ne k\).
The proof of Lemma 4 shows that for \(\eta =\sqrt{\frac{y^2+z^2}{2}}\),
and
So, as \({|y-z|\ge \frac{r}{\sqrt{t}}}\),
Calculations similar to those in the proof of Lemma 4 show
and thus
Also, from Lemma 6,
Similar calculations also show
writing \({\frac{y-\alpha s}{\sigma _t\sqrt{s}}}\) as \({\frac{y-\eta }{\sigma _t\sqrt{s}}+\frac{\eta -\alpha s}{\sigma _t\sqrt{s}}}\) and similarly for \({\frac{z-\alpha s}{\sigma _t\sqrt{s}}}\). Furthermore, we see that
for sufficiently large r. Using Eqs. (3.20)–(3.24) to estimate the right side of Eq. (3.19), we see that there is a \(K>0\) for which
Plugging this into the expression (3.18), we get
We know from Lemma 5 that
Thus
If we choose \(r=\sqrt{M\log t}\) for a large enough M, then
Consequently, from (3.25),
where the last step above follows from (3.14) using Corollary 3 (see the proof of Lemma 8).
Thus,
for this choice of r.
Now, we have all we need to compute the variance of \(N_t(\bigcup _{j=1}^{C\lfloor \sqrt{t}/r\rfloor }I_j)\).
where we used Lemma 10 crucially in putting the constant bound on \(\mathrm{{Var}}\,N_t(I_j)\).
With our choice of \(r=\sqrt{M \log t}\), the above becomes
Finally, we have, for small \(\epsilon >0\),
which goes to zero as \(t \rightarrow \infty \). \(\square \)
The continuous two dimensional Conga line
We will write the (unscaled) continuous two dimensional Conga line as \(u(\cdot ,t)=(u_1(\cdot ,t),u_2(\cdot ,t))\), where \(u_1\) and \(u_2\) are independent and each has the same distribution as the (unscaled) continuous one dimensional Conga line investigated in the previous section. First, we give a fluctuation estimate for u and its first derivative on intervals of the form \([k,k+1]\) for \(k \in [\delta t,\alpha t]\) (\(\delta \in (0,\alpha )\)) for sufficiently large t.
Lemma 12
Take any \(\delta \in (0,\alpha ),\epsilon >0\) and integer \(p \ge 1\). For \(t>0\) and \(k \in [\delta t,\alpha t],\) there is a constant \(C_p\) depending only on p such that
Consequently, for any \(\eta >0\) and \(\mu >0,\) with probability one, there is a positive integer N such that for all \(n \ge N,\)
for all \(k \in [\delta n, \alpha n] \cap \mathbb {Z}.\)
Proof
From the scaling relation (3.2) and Corollary 2, we see that for sufficiently large t and \(k \in [\delta t,\alpha t]\), \(u(\cdot ,t)\) has a power series expansion around k that converges in \([k,k+1]\). Thus, we can write
The pth moment can be bounded as
Now we collect the moment estimates of the derivatives from the previous section. In the following, \(C_1, C_2, C_1', C_2',\ldots \) will be constants depending only on \(\alpha \). From Lemmas 4 and 5 along with the scaling relation (3.2), we have
for \(l=1,2\) and \(k \in [\delta t, \alpha t]\). To bound the variances of the higher order derivatives, we use (3.17) (with \(\epsilon \) there taken to be \(\frac{y}{2\alpha }\)) and (3.2) to get
for \(l \ge 3\) and \(k \in [\delta t, \alpha t]\). Along with these estimates, and the fact that the derivatives \(\partial _x^lu(k,n)\) are Gaussian, we use (4.3) to get the first bound in (4.1). The second bound follows similarly. Finally, (4.2) follows from (4.1) and the Borel–Cantelli lemma. \(\square \)
For any integer \(n \ge 1\), let \(\{X(x,n): x \in [0,n]\}\) denote the linear interpolation of \(\{X_k(n): k \in [0,n] \cap \mathbb {Z}\}\). Note that \(X(\cdot ,n)\) is differentiable everywhere except the integers. We set the convention \(\partial _xX(k,n)=X_{k+1}(n)-X_k(n)\). Then, Theorem 1 and Lemma 12 together imply the following theorem.
Theorem 4
For any \(\eta >0\) and \(\mu >0,\) with probability one, there is a positive integer N such that for all \(n \ge N,\)
The above theorem tells us that for large n, \(\{(X(x,n), \partial _xX(x,n)): x \in [\delta n, \alpha n]\}\) comes uniformly close to \(\{(u(x,n), \partial _xu(x,n)): x \in [\delta n, \alpha n]\}\). Furthermore, the distance shrinks as x gets larger. This explains why the discrete Conga line sufficiently far away from the tip looks so smooth in Fig. 2.
In the rest of this section, we study properties of the continuous scaled two dimensional Conga line \(u_t=(u_{1,t},u_{2,t})\).
Analyzing length
The length of the Conga line in the interval \([\delta ,\alpha ]\) is given by \({l_t=\int _{\delta }^{\alpha }\Vert u_t'(x)\Vert \,dx}\), where \(\Vert \cdot \Vert \) denotes the Euclidean (\(L^2\)) norm. In this section, we give estimates for the expected length and its concentration about the mean.
Lemma 13
\(\mathbb {E}(l_t) \sim t^{1/4}.\)
Proof
From Lemma 4, we have
Thus,
where the second equality above follows from the fact that \(u_t'(x)=(u_{1,t}'(x),u_{2,t}'(x))\) has a bivariate normal distribution. Using this, we get
\(\square \)
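The Gaussian fact driving this computation is that the norm of a centered bivariate normal with i.i.d. \(N(0,\sigma ^2)\) coordinates has mean \(\sigma \sqrt{\pi /2}\) (the mean of a chi distribution with two degrees of freedom). A quick Monte Carlo check of this fact:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0
# norms of bivariate normal vectors with i.i.d. N(0, sigma^2) coordinates
samples = np.linalg.norm(sigma * rng.standard_normal((10 ** 6, 2)), axis=1)
estimate = samples.mean()
exact = sigma * np.sqrt(np.pi / 2)      # chi(2) mean, scaled by sigma
assert abs(estimate - exact) < 0.01
```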
But this gives us only a rough estimate of the behaviour of length for large t. To get a better idea of how the length behaves for large time t, we need higher moments and, if possible, some form of concentration about the mean. The next lemma gives us an estimate of the variance of \({l_t}\).
Lemma 14
Proof
From Lemma 4, we know that for \(\delta \le x,y \le \alpha \), \(\rho _t(x,y)=\mathrm{{Corr}}(u_{1,t}'(x), u_{1,t}'(y))\) satisfies
and \(\sigma _t^2(x)=\mathrm{{Var}}(u_{1,t}(x))\) satisfies (4.5).
Let f denote the probability density of \((u_{1,t}'(x),u_{2,t}'(x),u_{1,t}'(y),u_{2,t}'(y))\). We use the fact that this is a Gaussian random vector to write down f explicitly as
where \(\mathbf {a}=(a_1,a_2)\), \(\mathbf {b}=(b_1,b_2)\), \(\Vert \cdot \Vert \) represents the \(L^2\) norm and \(\langle \cdot ,\cdot \rangle \) represents the dot product of two vectors. Using this expression, we can write down
where
It is routine to verify from the form of g above that
Let S denote the above supremum. From (4.6), it follows that there is \(0<A<\infty \) such that if \(|x-y|\ge A/ \sqrt{t}\), then \(\rho _t(x,y) \le 1/2\) for sufficiently large t. On \(\{(x,y) \in [\delta , \alpha ]^2: |x-y| \ge A/ \sqrt{t}\}\), we bound \(\mathrm{{Cov}}(u_t'(x), u_t'(y))\) by \(S \rho _t(x,y)\sigma _t(x)\sigma _t(y)\). On \(\{(x,y) \in [\delta , \alpha ]^2: |x-y| < A/ \sqrt{t}\}\), we bound \(\mathrm{{Cov}}(u_t'(x), u_t'(y))\) by \(\sigma _t(x)\sigma _t(y)\), which follows from the Cauchy–Schwarz inequality. From (4.5), we see that \(\sigma _t(x)\sigma _t(y)\) is bounded above by \(C\sqrt{t}\) for some finite constant C.
Now, we can combine the estimates above to bound the variance as follows:
\(\square \)
Thus, although the expected length grows like \(t^{1/4}\), the variance is bounded. This already tells us that the actual length cannot deviate much from the expected length.
In what we do next, we get Gaussian concentration of length about the mean in a window of scale \(O(\sqrt{\log t})\).
We know that for most useful concentration results, we need some ‘independence’ in our model. Our strategy here is to construct a new process \(\hat{u}_t\) which is very ‘close’ to the original process \(u_t\) and is nicer to analyze as \(\hat{u}_t(x)\) and \(\hat{u}_t(x')\) are independent whenever x and \(x'\) are sufficiently far apart. As this yields a useful tool which is going to be used in later sections, we give a detailed construction.
Construction of \(\hat{u}_t\)
By Lemma 4, we see that the correlation between \(u_t'(x)\) and \(u_t'(y)\) in the Conga line with \({|x-y|=\frac{\lambda }{\sqrt{t}}}\) decays like \(e^{-C\lambda ^2}\) as \(\lambda \) increases. This indicates that pieces of the Conga line sufficiently far away from each other are ‘almost independent’. In what follows, we make use of this fact.
Choose and fix a constant \(M>2\). Divide the interval \(\left[ \delta -\sqrt{\frac{M\log t}{t}},\alpha +\sqrt{\frac{M\log t}{t}}\right] \) into subintervals
of length at most \({\frac{\sqrt{M \log t}}{\sqrt{t}}}\). Define the process \(\hat{u}_t\) as
if \({x \in [y_{k+1},y_{k+2}]}\).
In the following, all constants \(C, C_1, C_2,\ldots \) depend only on \(\delta \) and \(\alpha \).
Lemma 15
\({\hat{u}_t}\) satisfies the following properties :

(i)
\(\hat{u}_t\) is smooth everywhere except possibly at the points \(y_k.\)

(ii)
\(\{\hat{u}_t(x): x \in I_k\}\) is independent of \(\{\hat{u}_t(x): x \in I_{k+3}\}\) for all k.

(iii)
For \(x \in I_{k+1},\)
$$\begin{aligned} \left| \hat{u}_t(x)-u_t(x)+W\left( 1-\frac{y_k}{\alpha }\right) \right| \le Ct^{-M/2}\Vert W\Vert \end{aligned}$$and
$$\begin{aligned} \left| u_t'(x)-\hat{u}_t'(x)\right| \le \frac{C}{\sqrt{M}}\left( \frac{x}{t}\right) ^{\frac{M}{4}-\frac{1}{2}}\Vert W\Vert , \end{aligned}$$where \({\Vert W\Vert =\sup _{0 \le s \le 1}|W_s|}.\)
Proof
Properties (i) and (ii) follow from the definition of \(\hat{u}_t\).
To prove property (iii) notice that, for \({x \in I_{k+1}}\),
From (3.5) and a similar equation for the derivatives of \(\hat{u}_t\), we obtain
where \(K_t^n\) is defined as in (3.4). If \(x \in I_{k+1}\), then \({\frac{x}{\alpha }-L_t^x(M) \ge \frac{y_{k}}{\alpha }}\) and \({\frac{x}{\alpha }+L_t^x(M) \le \frac{y_{k+3}}{\alpha }}\). So, the above differences yield part (iii). \(\square \)
Consequently, if \(\hat{l}_t\) is the length of the curve \(\hat{u}_t\) restricted to \([\delta ,\alpha ]\), then
and
So, to find the concentration of the length around the mean at time t, we look at the length of the curve \(\hat{u}_t\). Let \(W_k\) be the Brownian motion defined on \({I=\left[ 0,\frac{3}{\alpha }\sqrt{\frac{M\log t}{t}}\right] }\) by
For each k, the Brownian motions \(W_k\) and \(W_{k+3}\) so defined are clearly independent.
As length is an additive functional, we can find the length on subintervals \(I_k\) and add them together. Heuristically, we can see that this gives us concentration as the length of the curve on every third interval is independent of each other, and as these are summed up, the errors get averaged out.
Now, we give the rigorous arguments. In the following, we fix the probability space \({(\Omega ,\mathcal {B}(\Omega ), \mathcal {P})}\), where \({\Omega =C\left[ 0,\frac{3}{\alpha }\sqrt{\frac{M\log t}{t}}\right] }\) denotes the set of continuous complex-valued functions on I equipped with the sup-norm metric d, and \({\mathcal {P}}\) is the Wiener measure.
We need some concepts from Concentration of Measure Theory. See [7] for an excellent survey of techniques in this area. We give a very brief outline of the concepts we need.
Transportation Cost Inequalities and Concentration: Let \((\chi , d)\) be a complete separable metric space equipped with the Borel sigma algebra \(\mathcal {B}(\chi )\). Consider the pth Wasserstein distance between two probability measures P and Q on this space, defined as
where the infimum is over all couplings \(\pi \) of a pair of random elements \((X,X')\) with the marginal of X being P and that of \(X'\) being Q.
Now, fix a probability measure P. Suppose there is a constant \(C>0\) such that for all probability measures \(Q\ll P\), we have
where H refers to the relative entropy \(H(Q\mid P)=\mathbb {E}^Q \log (dQ/dP)\). Then we say that P satisfies the \(L^p\) Transportation Cost Inequality. In short, we write \(P \in \mathcal {T}_p(C)\).
Now, we present one of the key results which connects Transportation Cost Inequalities and Concentration of Measures.
Lemma 16
Suppose P is a probability measure on \((\chi ,\mathcal {B} (\chi ))\) with \(P \in \mathcal {T}_1(C).\) Then, for any 1-Lipschitz map \(F:\chi \rightarrow \mathbb {R}\) and any \(r >0,\)
It is easy to see that \(\mathcal {T}_2(C)\) implies \(\mathcal {T}_1(C)\). But the main advantage in dealing with \(\mathcal {T}_2(C)\) comes from its tensorization property described in the following lemma.
Lemma 17
Suppose \(P_i, i=1,2,\ldots , n\) are probability measures on \((\chi , \mathcal {B} (\chi )).\) Suppose further that each \(P_i\) is in \(\mathcal {T}_2(C).\) On \(\chi ^n,\) define the distance between \(x^n=(x_1,\ldots ,x_n)\) and \(y^n=(y_1,\ldots ,y_n)\) by
Then \({\bigotimes _{i=1}^nP_i \in \mathcal {T}_2(C)}\) on \((\chi ^n, d^n).\)
The following lemma, which follows from the developments in [8], is of key importance to us.
Lemma 18
The Wiener measure on \({C\left[ 0,T\right] }\) satisfies the transportation inequality \({\mathcal {T}_2(T)}\) with respect to the supnorm metric.
These tools are all we need to establish a concentration result for \(l_t\).
Let us define the function \({T_t^k:\Omega \rightarrow \mathbb {R}}\) as follows:
Notice that \({\hat{l}_t=\sum _{k=0}^{\frac{\sqrt{t}}{\sqrt{M \log t}}}T_t^k(W_k)}\). Suppose we prove that \({T_t^k}\) is Lipschitz with respect to d with Lipschitz constant \(C_t\). Then, with \({N=\left\lfloor \frac{\alpha -\delta }{3}\sqrt{\frac{t}{M\log t}}\right\rfloor }\), the functions \({\{T_t^{(i)}:\Omega ^N \rightarrow \mathbb {R}:i=0,1,2\}}\) defined by
where \(f=(f_1,\ldots ,f_N)\), are also Lipschitz with respect to \(d^N\) with the same constant \(C_t\).
Lemma 19
For each \(k,\) \({T_t^k}\) is Lipschitz on \((\Omega ,d)\) with Lipschitz constant \({C\sqrt{M\log t}},\) where C is a constant depending only on \(\delta \) and \(\alpha .\)
Proof
For \(f_1,f_2\) in \(\Omega \),
By the estimates obtained in the proof of Lemma 3,
This proves the lemma. \(\square \)
Now, for any \(f \in \Omega ^{3N}\), define the following functions in \(\Omega ^N\): \(f^{(1)}=(f_1,f_4,\ldots ,f_{3N2})\), \(f^{(2)}=(f_2,f_5,\ldots ,f_{3N1})\) and \(f^{(3)}=(f_3,f_6,\ldots ,f_{3N})\). Notice that
where \(\tilde{W}=(W_{0},W_{1},\ldots ,W_{3N-1}) \in \Omega ^{3N}\). Using this fact and Lemmas 17 and 16, we get for any \(r>0\),
As \(M>2\) is arbitrary, it can be absorbed in the constant \(C_2\). Thus (4.10), along with (4.8) and (4.9), gives us our main conclusion:
Theorem 5
We have
where \(C_1, C_2\) are constants depending only on \(\delta \) and \(\alpha \).
How close is the scaled Conga line to Brownian motion?
Though the unscaled Conga line seen far away from the tip ‘smoothes out’ Brownian motion more and more with increasing t, simulations of the scaled Conga line show that making t larger actually makes the curve rougher, resembling Brownian motion more and more. Closer analysis reveals that this in fact results from the scaling. Again, before we supply the rigorous arguments, we give a heuristic reasoning. Looking at Eq. (3.2), we see that although the scaling takes the Brownian motion W on [0, t] to a Brownian motion \(W^{(t)}\) on [0, 1], the width of the window on which the smoothing takes place in the unscaled Conga line, which is comparable to \(\sqrt{t}\), is taken to \(O(t^{-1/2})\) in the scaled version, which shrinks with time t.
In the following, we consider the family of two dimensional random curves \(u_t(\cdot )\) indexed by t, and \(L_t^x= \alpha ^{-1}\sqrt{-M\sigma _t^2x\log (\sigma _t^2x)}\).
Theorem 6
There exists a deterministic constant \(\kappa \) such that, almost surely, there is \(T=T(\omega )>0\) for which
for all \( x \in (0,\alpha ]\) satisfying \(x > \alpha L_t^x\), for all \(t\ge T.\) In particular, for any fixed \(\beta \in (0,1),\) the above holds almost surely for \(x \in [t^{-\beta },\alpha ]\) for all \(t \ge T.\) Furthermore, \(\kappa \) could be chosen appropriately so that the following holds for sufficiently large t :
Thus, the scaled Conga line is close to Brownian motion for large t although the unscaled one is not, as can be seen from the right side of Eq. (4.11). This subsection is devoted to proving the above theorem.
For any continuous function \(f:[0,1] \rightarrow \mathbb {C}\), define
Note that the Conga line is given by \({u_t(x)=P_tW(x)}\). \(P_tf\) can be thought of as a smoothing kernel acting on the function \(x \mapsto f(1-x/\alpha )\). The following lemma shows that if f is Lipschitz, then for large t, \(P_tf(x)\) is close to \(f(1-x/\alpha )\).
Lemma 20
If f is Lipschitz with constant \(\mathcal {C},\) then for large enough t and for \( x \in (0,\alpha ]\) satisfying \(x > \alpha L_t^x,\)
Note that
where
and
As with \(I_t^x\), \(S_t^x\) is small compared to \(J_t^x\). \(\square \)
Now, Brownian motion is not Lipschitz, but it can be uniformly approximated on [0, 1] by piecewise linear random functions whose Lipschitz constants can be controlled, using Lévy’s construction of Brownian motion, which we now briefly describe following [9]. Define the nth level dyadic partition \(\mathcal {D}_n=\{\frac{k}{2^n}: 0 \le k \le 2^n\}\) and let \(\mathcal {D}= \cup _{n=0}^{\infty }\mathcal {D}_n\). Let \(\{Z_d: d \in \mathcal {D}\}\) be i.i.d standard normal random variables. Define the random piecewise linear functions \(F_n\) as follows.
\(F_0(x)=xZ_1\) for \(x \in [0,1]\). For \(n \ge 1\),
With this, Levy’s construction says that a Brownian motion W can be constructed via
for \(x \in [0,1]\).
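A minimal sketch of this construction on a level-N dyadic grid (the displacement \(2^{-(n+1)/2}Z_d\) at each new dyadic point of level n is the standard normalization; evaluating on a fixed grid is our discretization choice):

```python
import numpy as np

def levy_level(n, N, Z):
    """F_n on the level-N dyadic grid: zero on D_{n-1}, equal to
    2^{-(n+1)/2} Z_d at the new points d in D_n \\ D_{n-1}, linear in between."""
    knots_x = np.arange(2 ** n + 1) / 2 ** n
    knots_y = np.zeros(2 ** n + 1)
    knots_y[1::2] = 2.0 ** (-(n + 1) / 2) * Z       # odd knots are the new points
    grid = np.arange(2 ** N + 1) / 2 ** N
    return np.interp(grid, knots_x, knots_y)

def levy_bm(N, rng):
    """Partial sum W_N = sum_{n=0}^{N} F_n on the level-N dyadic grid."""
    grid = np.arange(2 ** N + 1) / 2 ** N
    W = grid * rng.standard_normal()                # F_0(x) = x Z_1
    for n in range(1, N + 1):
        W = W + levy_level(n, N, rng.standard_normal(2 ** (n - 1)))
    return W

# Once a dyadic point enters the construction its value never changes again:
# F_m vanishes on D_n for every m > n.
assert np.allclose(levy_level(3, 5, np.ones(4))[::8], 0.0)  # F_3 is zero on D_2

# Var W_N(1/2) matches the Brownian value 1/2 (here it is exactly 1/2).
rng = np.random.default_rng(0)
vals = np.array([levy_bm(4, rng)[2 ** 3] for _ in range(4000)])
assert abs(vals.var() - 0.5) < 0.06
```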
Let \({W_N(x)=\sum _{n=0}^{N}F_n(x)}\). This function serves as the piecewise linear (hence Lipschitz) approximation to W. From Lemma 20, for any N,
Fix \(c > \sqrt{2 \log 2}\). Let \({N^*=\inf \{n: |Z_d| \le c\sqrt{m} \ \forall \ d \in \mathcal {D}_m \setminus \mathcal {D}_{m-1} \ \forall \ m > n\}}\).
So, by the Borel–Cantelli lemma, \(P(N^* < \infty )=1\).
Now, for \(n > N^*\), \(\Vert F_n\Vert _{\infty } \le c\sqrt{n}2^{-n/2}\) and \({\Vert F_n'\Vert _{\infty } \le \frac{2\Vert F_n\Vert _{\infty }}{2^{-n}} \le 2c\sqrt{n}2^{n/2}}\). So, for \(l> N^*\), we get
Now, take t large enough that, for every \(x \in (0,\alpha ]\), the first term is less than \({\sqrt{-\left( \frac{x}{t}\right) ^{1/2} \log \left( \frac{x}{t}\right) }}\) and \({\sqrt{\frac{x}{t}} \in (2^{-l},2^{-l+1}]}\) for some \(l>N^*\). Plugging this l into Eq. (4.14) and using the fact that the second sum above is dominated by its last term, and the third sum is dominated by its leading term, we get (4.11).
To prove (4.12), note that the last two sums in (4.14) are bounded above by \({C'\sqrt{-\left( \frac{x}{t}\right) ^{1/2} \log \left( \frac{x}{t}\right) }}\), where \(C'\) is deterministic (it does not depend on \(N^*\)). So, it suffices to control the first sum. In the remaining part of the proof, C will denote a generic, deterministic constant.
Note that the first sum is bounded above by \({C\sigma \sqrt{\frac{x}{t}}2^{N^*/2}U_{N^*}}\), where \(U_{n}=\sup _{d \in \mathcal {D}_n}|Z_d|\). Thus, the probability in (4.12) is bounded above by \(\text {P}(2^{N^*/2}U_{N^*} > t^{1/4}\sqrt{\log t})\). Choose and fix any \(\epsilon \in (0,\frac{1}{2})\). Choose c in the definition of \(N^*\) above to satisfy \({c>\sqrt{\left( \frac{4}{\epsilon }+2\right) \log 2}}\). Now,
The first probability above can be bounded as
The second probability has the following bound:
These bounds, along with \(\epsilon \left( \frac{c^2}{2 \log 2}-1\right) >2\), give (4.12).
Analyzing number of loops
A loop L in a continuous curve \(f:\mathbb {R} \rightarrow \mathbb {C}\) is defined as a restriction of the form \(f\mid _{[a,b]}\), where \(f(a)=f(b)\) and f is injective on [a, b). Note that L divides the plane into a bounded component and an unbounded component. Define the size of the loop
It can be shown (the quick way is to look at the expectation metatheorem from [3]) that if f is a continuously differentiable Gaussian process, then with probability one, it has no singularities (points where the first derivatives of both Re f and Im f vanish). Using this fact, it is easy to see that if I is a compact interval on which f is not injective, then \(f\mid _I\) has at least one loop L of positive size.
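Computationally, a sampled curve has a loop precisely when two non-adjacent chords of the sampled polyline cross. A hedged sketch using the standard orientation test; the cubic curve below, which crosses itself exactly once at the origin, is purely illustrative:

```python
import numpy as np

def _ccw(a, b, c):
    """Twice the signed area of the triangle (a, b, c)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def _segments_cross(p1, p2, q1, q2):
    """True iff the open segments p1p2 and q1q2 cross transversally."""
    return (_ccw(q1, q2, p1) * _ccw(q1, q2, p2) < 0
            and _ccw(p1, p2, q1) * _ccw(p1, p2, q2) < 0)

def count_self_intersections(pts):
    """Number of proper crossings between non-adjacent chords of a polyline."""
    n = len(pts) - 1
    return sum(
        _segments_cross(pts[i], pts[i + 1], pts[j], pts[j + 1])
        for i in range(n) for j in range(i + 2, n)
    )

# Illustrative curve s -> (s^3 - s, 1 - s^2): one loop through the origin.
s = np.linspace(-2.0, 2.0, 400)
loop_curve = np.column_stack([s ** 3 - s, 1 - s ** 2])
assert count_self_intersections(loop_curve) >= 1

line = np.column_stack([s, 2 * s])                  # injective: no loops
assert count_self_intersections(line) == 0
```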
As the number of loops is bounded above by the number of critical points of Re f (equivalently, Im f), we see by Lemma 11 that, for a large fixed t, the number of loops in the Conga line is bounded above by \(C\sqrt{t}\) with very high probability. This section is dedicated to achieving a lower bound. The simulation (Fig. 2) shows a number of loops, most of them small. In the following, we obtain a lower bound for the number of small loops, which differs from the upper bound by a logarithmic factor. For this, a key ingredient is the support theorem for Brownian motion, which we state as the following lemma:
Lemma 21
If \(f:[0,1] \rightarrow \mathbb {C}\) is continuous with \(f(0)=0\) and W is a complex Brownian motion on [0, 1], then for any \(\epsilon >0,\)
where \(\Vert g\Vert =\sup _{x\in [0,1]}|g(x)|\).
The above lemma can be proved either by approximating f by piecewise linear functions and using Lévy’s construction of Brownian motion, or by an application of the Girsanov theorem (see [4]).
We also need to exploit the exponentially decaying correlation between \(u_t(x)\) and \(u_t(x')\) as \(|x-x'|\) increases (see Lemma 4) by bringing into play the approximation of \(u_t\) by the process \(\hat{u}_t\) introduced in Sect. 4.1.
Now, we state the main theorem of this section.
Theorem 7
Choose \(R > 6 \kappa ,\) where \(\kappa \) is the constant in Theorem 6. Let \(N_t^l\) be the number of loops of size less than or equal to \(2R(\frac{\log t}{t})^{1/4}\) in the (scaled) Conga line \(u_t\) in \([\delta ,\alpha ]\) at time t. Then there exist constants C and \(C'\) such that
as \(t \rightarrow \infty .\)
Proof
The upper bound follows from Lemma 11.
Proving the lower bound is more involved.
Our strategy is to choose a function f which has a loop and run the Brownian motion W in a narrow tube around f, which, by Lemma 21, we can do with positive probability. Now by Theorem 6, we know that for large t, \(u_t\) is ‘close’ to the Brownian motion W with very high probability, and thus the curve \(u_t\) is forced to run in a narrow sausage around f, thereby inducing a loop.
Such a function is \({f(x)=C((4x-2)^3-(4x-2),1-(4x-2)^2)}\) for \(x \in [0,1]\), where C is a suitably chosen constant to make the size of the loop in f equal to R. Let us denote the set of continuous functions staying in the \(\epsilon \)-sausage around \(f\mid _{[a,b]}\) as
Fix \(\alpha '\) such that \(\delta <\alpha '<\alpha \). For \(x\in [\delta ,\alpha ']\), define
where \(M>2\) is any fixed constant as in Sect. 4.1. For any continuously differentiable complex-valued Gaussian process g defined on a subset of [0, 1] containing \([x,x+\alpha \sqrt{\frac{M \log t}{t}}]\) and any complex number c, if \(g \in S(c+f^{(t)}_x; \frac{R}{2}(\frac{M \log t}{t})^{1/4},[x,x+\alpha \sqrt{\frac{M \log t}{t}}])\), then g has a self-intersection on \([x,x+\alpha \sqrt{\frac{M \log t}{t}}]\) and thus, by the absence of singularities with probability one, g has a loop of positive size on this interval.
We break up the proof into parts:

(i)
In Lemma 22, we prove that the probability of \(u_t\mid _{[x,x + \alpha \sqrt{\frac{M \log t}{t}}]}\) having a loop of size comparable to \((\frac{\log t}{t})^{1/4}\) is bounded below uniformly for all \(x\in [\delta ,\alpha ']\) by a fixed positive constant p independent of x and t.

(ii)
We use part (iii) of Lemma 15 and Lemma 22 to deduce that the probability of \(\hat{u}_t\) having a loop of size comparable to \((\frac{\log t}{t})^{1/4}\) on each interval \(I_{k+1}\) is bounded below by p / 2.

(iii)
We use the independence of \({\hat{u}_{I_k}}\) and \({\hat{u}_{I_{k+3}}}\) for every k to deduce in Lemma 23 that the total number of such loops in \(\hat{u}_t\) is bounded below by \(\frac{p}{4}\left\lfloor (\alpha '-\delta )\sqrt{\frac{t}{M \log t}}\right\rfloor \) with very high probability.

(iv)
We finally use part (iii) of Lemma 15 again to translate the result of Lemma 23 to the original process \(u_t\) in Lemma 24.
Lemma 22
There is a constant \(p>0\) independent of x and t such that
for all \(x\in [\delta , \alpha '],\) for all sufficiently large t.
Proof
Choose and fix any \(x\in [\delta ,\alpha ']\). By Theorem 6 and by the translation and scaling invariance of Brownian motion, we get for \(R >6\kappa \) (here \(\kappa \) is the constant in Theorem 6) and large t,
Here we used Theorem 6 and Lemma 21 for the last step. By virtue of the second-to-last step above, we can choose p independent of x and t, and the above lower bound works uniformly for all \(x \in [\delta ,\alpha ']\). \(\square \)
Recall that by part (iii) of Lemma 15, we know that for \(x \in I_{k+1}\),
Define the event
If \(A_k\) holds, then \(\hat{u}_t\) has a loop in \(I_{k+1}\). Write
Then the following holds.
Lemma 23
Proof
By Lemma 22, it is easy to see that
for large enough t and small enough \(\epsilon \). Thus we see that \({\mathbb {E}S_t \ge \frac{p}{2}\left\lfloor (\alpha '-\delta )\sqrt{\frac{t}{M \log t}}\right\rfloor }\).
Now, as \(\hat{u}_t\) is independent on every third interval, \(A_k\) is independent of \(A_{k+3}\) for every k. The result now follows from Bernstein’s inequality. \(\square \)
The above implies that with very high probability \({S_t\ge \frac{p}{4}\left\lfloor (\alpha '-\delta )\sqrt{\frac{t}{M \log t}}\right\rfloor }\).
Define the event
and the corresponding sum
Our final lemma is the following.
Lemma 24
as \(t \rightarrow \infty \).
Proof
Note that \(N_t^l \ge \tilde{N}_t^l\). By part (iii) of Lemma 15, we note that for small enough \(\epsilon >0\), the events \(A_k\) and \({\lbrace \Vert W\Vert \le \frac{\epsilon t^{M/2}}{C(\delta )}(\frac{M \log t}{t})^{1/4}\rbrace }\) imply that \(B_k\) holds. We see that, for large t,
which goes to one as \(t \rightarrow \infty \) by Lemma 23. \(\square \)
The proof of the lower bound in Theorem 7 follows from the above lemmas. \(\square \)
Loops and singularities in particle paths
We start off this section by describing an interesting phenomenon that one notices in simulations of the paths of the individual particles in the discrete Conga line. The leading particle (\(k=1\)) performs an erratic Gaussian random walk. But as k increases, the successive particles are seen to cut corners in the paths of the preceding particles making them smoother (see Fig. 5). This can be heuristically explained by the fact that a particle following another one in front directs itself along the shortest path between itself and the preceding particle (see Eq. (1.1)), and hence cuts corners. This phenomenon is captured by the process \(\overline{u}\) described in Sect. 2.2. So, we use the approximation of the discrete Conga line X by the smooth process \(\overline{u}\) in this section.
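The corner-cutting effect is easy to reproduce in simulation. The sketch below assumes the update rule (1.1) has the averaging form \(X_{j+1}(n)=(1-\alpha )X_{j+1}(n-1)+\alpha X_j(n-1)\) (this specific form is our assumption) and compares the average speed of successive particles: each one moves more slowly and smoothly than the particle it follows.

```python
import numpy as np

def particle_paths(n, n_particles, alpha, rng):
    """paths[m, k] = position of particle k at time m (particle 0 leads).
    Assumes the update X_{j+1}(m) = (1-alpha) X_{j+1}(m-1) + alpha X_j(m-1)."""
    X = np.zeros((n_particles, 2))
    paths = np.zeros((n + 1, n_particles, 2))
    for m in range(1, n + 1):
        X[1:] = (1 - alpha) * X[1:] + alpha * X[:-1]   # followers (old values on RHS)
        X[0] += rng.standard_normal(2)                 # leading particle's Gaussian step
        paths[m] = X
    return paths

paths = particle_paths(2000, 30, 0.5, np.random.default_rng(2))

def mean_step(paths, k, burn=200):
    """Average speed of particle k after a burn-in period."""
    steps = np.diff(paths[burn:, k], axis=0)
    return np.linalg.norm(steps, axis=1).mean()

# Successive particles cut corners in the path ahead of them, so their
# average speed (and roughness) decreases down the line.
speeds = [mean_step(paths, k) for k in (0, 5, 20)]
assert speeds[0] > speeds[1] > speeds[2]
```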
Recall that a singularity of a curve \(\gamma : \mathbb {R} \rightarrow \mathbb {C}\) is a point \(t_0\) at which its speed vanishes, i.e. \(\gamma '(t_0)=0\). A singularity \(t_0\) of a curve \(\gamma \) is called a cusp singularity if \(\gamma \) is analytic in a neighborhood of \(t_0\) and there exists a translation and rotation of coordinates taking \(\gamma (t_0)\) to the origin, under which \(\gamma \) has the representation \(\gamma ^*=(\gamma ^*_1,\gamma ^*_2)\) with the following power series expansions:
with \(a_2\ne 0, \ b_3\ne 0\), for \(t \in [t_0-\delta , t_0+\delta ]\) for some \(\delta >0\).
Intuitively, this means that the graph of \(\gamma \) locally around \(t_0\) looks like \(y=x^{2/3}\) under a rigid motion of coordinates taking \(\gamma (t_0)\) to the origin. We call these transformed coordinates the natural coordinate frame based at the cusp singularity \(t_0\).
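A quick numeric sanity check of this local picture: keeping only the leading terms of the expansion, \(\gamma ^*(t) \approx (a_2 t^2, b_3 t^3)\), the cubic coordinate scales exactly like the 3/2 power of the quadratic one, so near the cusp one coordinate is the 2/3 power of the other.

```python
# For the leading terms gamma*(t) = (a2*t^2, b3*t^3) of the cusp expansion,
# |y| / |x|^(3/2) is the constant |b3| / |a2|^(3/2) for every t != 0.
a2, b3 = 2.0, -1.5           # arbitrary nonzero coefficients
expected = abs(b3) / abs(a2) ** 1.5
for t in [0.1, 0.01, -0.05]:
    x, y = a2 * t * t, b3 * t ** 3
    ratio = abs(y) / abs(x) ** 1.5
    assert abs(ratio - expected) < 1e-9
```

This is the power-law behind the \(y=x^{2/3}\) picture; which coordinate plays the role of x depends on the orientation of the natural frame.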
Making a change of variables \(p=t-\frac{x}{\alpha }\) and \(\tau =\rho ^2 x\), we can rewrite \(\overline{u}\) as
We restrict our attention to \(p>0, \tau >0\). Fixing \(\tau \) and varying p in the above expression for f yields the path of the particle at distance \(x=\tau /\rho ^2\) from the tip. This change of variables lets us write \(\overline{u}\) as the solution to the heat equation with the Brownian motion W as initial data, with p as the space variable and \(\tau \) as time. Later, we will see that this makes it easier to write down analytic expansions of \(\overline{u}\) around a singularity.
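This heat-semigroup picture can be sketched numerically. The exact kernel normalization in the displayed formula is not reproduced in this excerpt; assuming \(f(p,\tau )=\mathbb {E}\,W(p+Z_\tau )\) with \(Z_\tau \sim N(0,\tau )\), convolving a sampled Brownian path with a Gaussian kernel visibly smooths it, which is the corner-cutting mechanism described above.

```python
# Heat smoothing of a sampled Brownian path (kernel variance convention is an
# assumption; the paper's exact normalization is not reproduced here).
import random, math

random.seed(1)
STEP = 0.01
GRID = [i * STEP for i in range(4001)]           # W sampled on [0, 40]
w = [0.0]
for _ in range(len(GRID) - 1):
    w.append(w[-1] + math.sqrt(STEP) * random.gauss(0, 1))

def W(y):
    # linear interpolation of the sampled path, clamped to the window
    y = min(max(y, 0.0), GRID[-1])
    i = min(int(y / STEP), len(w) - 2)
    frac = y / STEP - i
    return (1 - frac) * w[i] + frac * w[i + 1]

def heat(p, tau):
    # f(p, tau): normalized Gaussian-kernel average of the path around p
    tot = norm = 0.0
    for y, wy in zip(GRID, w):
        k = math.exp(-(y - p) ** 2 / (2 * tau))
        tot += k * wy
        norm += k
    return tot / norm

pts = [10 + 0.05 * k for k in range(21)]         # a small window of the path
def tv(vals):
    return sum(abs(b - a) for a, b in zip(vals, vals[1:]))

rough = tv([W(p) for p in pts])                  # total variation of W itself
smooth = tv([heat(p, 2.0) for p in pts])         # after heat smoothing
assert smooth < rough
```

Fixing \(\tau \) and sweeping p in `heat` traces one particle's path, exactly as described in the paragraph above.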
Another interesting observation, described briefly in the Introduction, is the evolution of loops as we look at the paths of successive particles. If a particle in the (two-dimensional) Conga line goes through a loop, the particle following it, which cuts corners while trying to ‘catch up with’ it, goes through a smaller loop. This is suggested by the simulations, where small loops are seen to ‘die’ (i.e. shrink to a point); just before death they look somewhat ‘elongated’, and the death site looks like a cusp singularity. Other loops are seen to break after some time, that is, their end points come apart. Figure 5, representing successive particle paths, depicts loops in various stages of evolution. In this section, we investigate evolving loops in the paths of successive particles, especially the relationship between dying loops and the formation of singularities.
Before we can start off, we give some definitions that will be useful in describing the evolution of loops.
We define a metric space \((\mathcal {M},d)\), with a metric similar to the Skorohod metric on RCLL paths, on which we want to study loop evolutions:
If \(f:[a_1,b_1] \rightarrow \mathbb {C}\) and \(g:[a_2,b_2] \rightarrow \mathbb {C}\) are elements of \(\mathcal {M}\), define
where \(\Vert \cdot \Vert \) denotes the sup-norm metric. It can easily be checked that \((\mathcal {M},d)\) is a metric space.
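The displayed formula for d is not reproduced in this excerpt; as a purely illustrative sketch, a Skorohod-style discretization would reparametrize g affinely onto f's domain and charge both the sup distance of the paths and the mismatch of the domains. This hypothetical version captures the intended behavior: loops that differ only by a small reparametrization are close.

```python
# Hypothetical Skorohod-style distance between curves on different intervals.
# This is an illustrative assumption, NOT the paper's definition of d.
import math

def d(f, a1, b1, g, a2, b2, n=200):
    lam = lambda t: a2 + (t - a1) * (b2 - a2) / (b1 - a1)   # affine time change
    ts = [a1 + (b1 - a1) * k / n for k in range(n + 1)]
    sup = max(abs(f(t) - g(lam(t))) for t in ts)
    # penalize both the path mismatch and the domain mismatch
    return max(abs(a1 - a2), abs(b1 - b2)) + sup

# Two copies of the unit circle on slightly shifted parameter intervals:
circle = lambda t: complex(math.cos(t), math.sin(t))
shifted = lambda t: circle(t - 0.001)
assert d(circle, 0.0, 2 * math.pi, shifted, 0.001, 2 * math.pi + 0.001) < 0.01
```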
Define the evolution of a loop L as a continuous function \({L(\cdot ):[0,T) \rightarrow \mathcal {M}}\) such that \(L(0)=L\) and L(t) is a loop for every \(0\le t<T\). If \(f:\mathbb {R}^+ \times \mathbb {R}^+ \rightarrow \mathbb {C}\) is a continuous space-time process, and \({L(t)=f(\cdot ,t)\mid _{[a_t,b_t]}}\) is a loop evolution on \(0 \le t < T\), we say that \(\{L(t): 0\le t < T\}\) is a loop evolution of f starting from \(L=f(\cdot ,0)\mid _{[a_0,b_0]}\). Say that the loop \(L=f(\cdot ,0)\mid _{[a_0,b_0]}\) vanishes after time \(T^*\) if
Such a vanishing loop is said to die at space-time point \((p,T^*)\) if \(a_t \rightarrow p\) and \(b_t-a_t \rightarrow 0\) as \(t \rightarrow T^*\). Analogous definitions hold with [0, T) replaced by a general interval \([T_1,T_2)\).
We note here that loops can vanish without dying. This can happen, for example, when the ‘end points come apart’. One can check that an instance of this happening for f defined in (5.2) is when the end points of the loop have their velocity vectors parallel, but normal acceleration vectors pointing in opposite directions.
Note Although with probability one f has no singularities for a fixed time \(\tau \), it can be verified by an application of the expectation metatheorem of [3] that the expected number of singularities of \(f(\cdot ,\tau )\) for \((p,\tau )\) lying in a compact set \(K=[a,b] \times [c,d]\) is positive, and thus singularities do occur with positive probability if we allow both space and time to vary.
It is easy to see that if a loop dies at a site \((p_0,\tau _0)\), then \(p_0\) is a singularity of the curve \(f(\cdot ,\tau _0)\). We prove in Lemma 25 that with probability one, any singularity is a cusp singularity. In Theorem 8 we prove that for any (cusp) singularity \(p_0\) of \(f(\cdot ,\tau _0)\), there exists a unique loop at each time in some small interval \([\tau _0-\delta ,\tau _0)\). Furthermore, the loop at time \(\tau _0-\delta \) dies at \((p_0,\tau _0)\), giving birth to the singularity \(p_0\) of \(f(\cdot ,\tau _0)\). The theorem also shows that the dying loops, under a suitable rescaling, converge to a deterministic limiting loop.
Lemma 25
With probability one, any singularity \(p_0\) of the curve \(f(\cdot ,\tau _0)\) is a cusp singularity.
Proof
We first prove that for every \(\tau _0 \in \mathbb {R}^+\), \(f(\cdot ,\tau _0)\) is analytic. For this, note that
By using the fact that \({\lim _{y\rightarrow \infty }\frac{W(y)}{y}=0}\) almost surely, we get that with probability one,
for some random constant C. This bound implies analyticity of \(f(\cdot ,\tau _0)\).
Write \(f=(f^1,f^2)\). We need to show that if \(p_0\) is a singularity of \(f(\cdot ,\tau _0)\), then there exists a rigid motion of coordinates taking \(f(p_0,\tau _0)\) to the origin under which (5.1) holds (with \(\gamma (\cdot )\) replaced by \(f(\cdot ,\tau _0)\)). It suffices to prove the lemma for \((p_0,\tau _0)\) lying in a fixed rectangle \(K=[a,b] \times [c,d]\). Our first step towards this is to show the following:
To show this, define \(A_n\) to be the event which holds when all the following are satisfied:

(i)
There exists \((p_0,\tau _0) \in K\) for which \(\partial _p f(p_0,\tau _0)=0\) and
$$\begin{aligned} (\partial ^2_p f^2(p_0,\tau _0),\partial ^3_p f^2(p_0,\tau _0))=\lambda (\partial ^2_p f^1(p_0,\tau _0),\partial ^3_p f^1(p_0,\tau _0)) \end{aligned}$$for some \(\lambda \in [-n,n]\).

(ii)
The Lipschitz constants of the functions \(\{\partial ^i_p f^j(p,\tau ): (p,\tau ) \in K; i=1, 2, 3; j=1,2\}\) are less than or equal to n.
We will show that \(P(A_n)=0\) which will yield (5.4).
Partition the rectangle into a grid of subrectangles of side length \({\le } \epsilon \), where \(\epsilon \) is small. Call the set of grid points \(\hat{K}\).
Now, suppose \(A_n\) holds. Let \((p_0,\tau _0)\) lie in a subrectangle R and let \((p_i,\tau _j) \in \hat{K}\) be a grid point adjacent to R. Note that as the Lipschitz constants of the above functions and \(\lambda \) are bounded by n, the following event holds:
Thus we have
We show that there is a constant C depending on n such that \(P(A_n^{ij})\le C\epsilon \).
To save notation, call
and similarly Y for \(f^2\). X and Y are independent, and each follows a centred trivariate normal distribution. Denote the density function of X by \(p_{ij}\) and the distribution of Y by \(Q_{ij}\). Then, as X and Y have uniformly bounded densities,
where \(C_{ij}', C_{ij}\) depend on n. Note that the determinants of the covariance matrices of X and Y are continuous and do not vanish at any point on the compact set K. Thus we can bound \(C_{ij}\) by C (which depends on n) uniformly over \(i,j,\epsilon \). Using these facts, we get
As \(\epsilon \) is arbitrary, we get \({P(A_n)=0}\).
Now if \(p_0\) is a singularity occurring at time \(\tau _0\), i.e. \(\exists \) \((p_0,\tau _0) \in K\) for which \(\partial _p f(p_0,\tau _0)=0\), we can apply a rigid motion of coordinates such that \(f(p_0,\tau _0)\) is the new origin and the rotation angle \(\theta \) is chosen to satisfy the equation
where \(A_{\theta }\) is the rotation matrix corresponding to \(\theta \). By (5.4), we see that \(a_2\) and \(b_3\) are nonzero. Then, the result follows by taking this new coordinate frame as the natural coordinates. \(\square \)
Theorem 8
If \(p_0\) is a (cusp) singularity of \(f(\cdot , \tau _0)\), then there exists \(\delta >0\) such that \(f(\cdot ,\tau )\) has a unique loop \(L(\tau )=f(\cdot ,\tau )\mid _{[a_{\tau },b_{\tau }]}\) in an interval \(I_{\tau }\) containing \(p_0\) for all \(\tau \in [\tau _0-\delta , \tau _0)\). Moreover, \(I_{\tau }\) can be chosen so that it shrinks to \(p_0\) as \(\tau \rightarrow \tau _0\), and \(L(\cdot ): [\tau _0-\delta ,\tau _0) \rightarrow \mathcal {M}\) is a loop evolution of f starting from \(f(\cdot , \tau _0-\delta )\mid _{[a_{\tau _0-\delta },b_{\tau _0-\delta }]}\).
Furthermore, if \(f=(f_1,f_2)\) in natural coordinates based at the cusp singularity \(p_0\) of \(f(\cdot , \tau _0),\) then there is \(M>0\) such that the rescaling of f given by
for \(s \in (0,\delta ]\) has a unique loop \(\hat{L}_s\) in \([-M,M]\) which converges to a deterministic loop \(\hat{L}_0\) in \((\mathcal {M}, d)\) as \(s \rightarrow 0.\)
Proof
First we show that f is jointly analytic in \((p,\tau )\) for \((p,\tau ) \in \mathbb {R} \times \mathbb {R}^+\). From (5.3), we know that for any \(\tau _0 \in \mathbb {R}^+\), \(f(\cdot ,\tau _0)\) has an analytic representation in the space variable p given by
which holds for all \(p \in \mathbb {R}\), where \(\mathbf {a}_i(p_0,\tau _0) =\partial _p^i f(p_0,\tau _0) \in \mathbb {R}^2\) for each i. Note that if we can prove
for \(p \in \mathbb {R}\) and \(0\le \delta < \epsilon \) for some \(\epsilon >0\), then we can write
for \(p \in \mathbb {R}\) and \(\tau \in [\tau _0,\tau _0+\epsilon )\) and it follows that f is jointly analytic on \(\mathbb {R}\times [\tau _0,\tau _0+\epsilon )\). Joint analyticity on \(\mathbb {R} \times \mathbb {R}^+\) is immediate as a result.
From (5.3), we see that (5.7) holds when
which is satisfied for \({\delta < \frac{\tau _0}{8}}\) and all \(p \in \mathbb {R}\).
If we set \(\zeta _n(p,t)=\mathbb {E}_{Z_t}\left((p-p_0)+Z_t\right)^n\) for \(p \in \mathbb {R}\) and \(t\ge 0\), we note that this can be readily computed from the moments of \(Z_t\). Further, observe that the polynomial thus obtained for \(\zeta _n(p,t)\) is well defined for all \((p,t) \in \mathbb {R}^2\). The joint analyticity of f along with (5.8) tells us that we can write \(f(p,\tau )=\sum _{i=0}^{\infty }\mathbf {a}_i(p_0,\tau _0) \zeta _i(p,\tau -\tau _0)\) for \((p,\tau )\) in the region of analyticity around \((p_0,\tau _0)\).
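The polynomials \(\zeta _n\) can be made concrete. Assuming \(Z_t \sim N(0,t)\), the moments \(\mathbb {E}\,Z_t^{2m}=(2m-1)!!\,t^m\) turn \(\zeta _n\) into an explicit "heat polynomial" in \(x=p-p_0\) and t, and a short symbolic check confirms that each one solves the heat equation \(\partial _t \zeta _n = \tfrac{1}{2}\partial _x^2 \zeta _n\), which is why the series for f solves the heat equation term by term.

```python
# Heat polynomials zeta_n(x, t) = E[(x + Z_t)^n], Z_t ~ N(0, t) (variance
# normalization assumed), built from Gaussian moments and checked against the
# heat equation d/dt = (1/2) d^2/dx^2 with exact integer arithmetic.
from math import comb

def dfact(k):                     # odd double factorial, with (-1)!! = 1
    out = 1
    while k > 1:
        out *= k
        k -= 2
    return out

def zeta(n):
    # polynomial stored as {(i, j): coeff} meaning coeff * x^i * t^j
    return {(n - 2 * m, m): comb(n, 2 * m) * dfact(2 * m - 1)
            for m in range(n // 2 + 1)}

def d_dt(poly):
    return {(i, j - 1): c * j for (i, j), c in poly.items() if j > 0}

def half_dxx(poly):
    return {(i - 2, j): c * i * (i - 1) // 2
            for (i, j), c in poly.items() if i >= 2}

for n in range(1, 8):
    assert d_dt(zeta(n)) == half_dxx(zeta(n))
```

For example, `zeta(2)` is \(x^2+t\) and `zeta(3)` is \(x^3+3xt\); the sign attached to \(Z_t\) is immaterial by symmetry.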
Choose and fix \(M \in (\sqrt{3}, 2\sqrt{3}-1)\). If \(f(\cdot ,\tau _0)\) has a cusp singularity at \(p_0\), we can write \(f=(f_1,f_2)\) in the natural coordinate frame based at \(p_0\). We will show that there is \(\delta >0\) small enough for which the rescaled process \(\hat{f}_s\) given by (5.5) in this coordinate frame has a unique loop \(\hat{L}_s\) in the interval \([-M,M]\) for all \(0<s \le \delta \). We will in fact show something stronger: for \(0<s \le \delta \), there is a unique pair \((\hat{a}_s,\hat{b}_s)\) such that \(\hat{a}_s, \hat{b}_s \in [-M,M]\), \(\hat{a}_s < \hat{b}_s\) and \(\hat{f}_s(\hat{a}_s)=\hat{f}_s(\hat{b}_s)\).
This will imply that for all \(0<s\le \delta \), \(f(\cdot , \tau _0-s)\) has a unique loop \(L(\tau _0-s)=f(\cdot , \tau _0-s)\mid _{[a_{\tau _0-s},b_{\tau _0-s}]}\) in \([-M\sqrt{s}, M\sqrt{s}]\), where \(a_{\tau _0-s}=\sqrt{s}\hat{a}_s\) and \(b_{\tau _0-s}=\sqrt{s}\hat{b}_s\).
By scaling properties of Gaussian random variables, \(\hat{f}_s\) has the following representation:
Let
Then by (5.9) there is a (random) constant C depending only on M and \(\delta \) such that
and
for all \(P \in [-M,M]\) and all \(s \in (0,\delta ]\).
The bound (5.10) shows that on \([-M,M]\), \(\hat{f}_s\) is forced to run in a ‘tube’ of width \(C\sqrt{s}\) around \(\hat{g}\). Note that \(\hat{g}\) has positive speed in \([-M,M]\), and has a unique loop \(\hat{g}\mid _{[-\sqrt{3},\sqrt{3}]}\) in this interval. This ‘forces’ \(\hat{f}_s\) to intersect itself. More precisely, we can find a (random) constant \(A>0\) depending only on M and \(\delta \) such that there exist \(\hat{a}_s \in [-\sqrt{3}-A\sqrt{s}, -\sqrt{3}+A\sqrt{s}]\) and \(\hat{b}_s \in [\sqrt{3}-A\sqrt{s}, \sqrt{3}+A\sqrt{s}]\) satisfying \(\hat{f}_s(\hat{a}_s)=\hat{f}_s(\hat{b}_s)\) for \(0<s \le \delta \). We will now show that \((\hat{a}_s, \hat{b}_s)\) is the required pair.
Notice that the only point in \([-M,M]\) at which the derivative of \(\hat{g}^1\) vanishes is \(P=0\), and the only such points for \(\hat{g}^2\) are \(P=-1\) and \(P=1\). Thus, if we choose any \(0< \theta <1\), it follows from (5.11) that for small enough \(\delta \), there is \(I>0\) (depending only on M and \(\delta \)) such that
for \(0<s \le \delta \). This, in turn, gives us
for \(0<s \le \delta \). In particular, there do not exist \(P,Q \in [-M,M]\) with \(0<Q-P \le \theta \) such that \(\hat{f}_s(P)=\hat{f}_s(Q)\).
Take and fix any \(\epsilon \in (0,2\sqrt{3}-1-M)\) and \(L \in (1+M+\epsilon ,2\sqrt{3})\). As the only self-intersection of \(\hat{g}\) in \([-M,M]\) is given by \(\hat{g}(-\sqrt{3})=\hat{g}(\sqrt{3})\),
thus, for sufficiently small \(\delta \),
for \(s \in (0,\delta ]\). Equations (5.12) and (5.13) together imply that if \(P,Q \in [-M,M]\), \(P<Q\), are such that \(\hat{f}_s(P)=\hat{f}_s(Q)\), then \(P \in [-M,-1-\epsilon ]\) and \(Q \in [1+\epsilon ,M]\). As \((\hat{g}^1)'<0, (\hat{g}^2)'>0\) on \([-M,-1-\epsilon ]\) and \((\hat{g}^1)'>0, (\hat{g}^2)'>0\) on \([1+\epsilon ,M]\), the same relations hold with \(\hat{g}\) replaced by \(\hat{f}_s\) for \(s \in (0,\delta ]\), provided \(\delta \) is sufficiently small. It is then routine to check that such a pair (P, Q) is unique.
Thus, for \(\delta \) small enough, the unique loop of \(\hat{f}_s\) in \([-M,M]\) for \(s \in (0,\delta ]\) is given by \(\hat{L}_s=\hat{f}_s\mid _{[\hat{a}_s,\hat{b}_s]}\). As \(\max \{|\hat{a}_s+\sqrt{3}|, |\hat{b}_s-\sqrt{3}|\} \le A\sqrt{s}\), this, together with (5.10) and (5.11), implies convergence of \(\hat{L}_s\) to \(\hat{L}_0=\hat{g}\mid _{[-\sqrt{3},\sqrt{3}]}\) in \((\mathcal {M},d)\).
The final thing left to prove is the continuity of the map \(L(\cdot ):[\tau _0-\delta ,\tau _0) \rightarrow \mathcal {M}\). To show this, take any \(s_0 \in (0,\delta ]\). By joint analyticity of f, for small enough \(\eta >0\), \(\sup _{p \in [-M\sqrt{s_0},M\sqrt{s_0}]}|f(p,\tau _0-s)- f(p,\tau _0-s_0)| \le C|s-s_0|\) whenever \(|s-s_0| \le \eta \), for some (random) constant C depending only on \(M, s_0\) and \(\eta \). This, along with the fact that \(f(\cdot , \tau _0-s_0)\) has positive speed in \([-M\sqrt{s_0},M\sqrt{s_0}]\), implies that \(\max \{|a_s-a_{s_0}|, |b_s-b_{s_0}|\} < A|s-s_0|\) whenever \(|s-s_0| \le \eta \), where A is again a (random) constant depending on \(M, s_0\) and \(\eta \). This, along with uniform continuity of f on \([-M\sqrt{s_0},M\sqrt{s_0}] \times [s_0-\eta ,s_0+\eta ]\), implies that \(L(\tau _0-s) \rightarrow L(\tau _0-s_0)\) in \((\mathcal {M},d)\) as \(s \rightarrow s_0\), proving continuity. \(\square \)
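The explicit formula for the limit curve \(\hat{g}\) is in a displayed equation not reproduced in this excerpt. The heat-polynomial expansion at the cusp suggests the candidate \(\hat{g}(P)=(a_2(P^2-1),\,b_3(P^3-3P))\); this is an inference, not the paper's display. Taking \(a_2=b_3=1\), a quick check confirms that this candidate has every property the proof uses:

```python
# Candidate limiting loop g(P) = (P^2 - 1, P^3 - 3P) (an inferred form, up to
# the cusp coefficients a2, b3), checked against the properties used above.
from math import sqrt, isclose

g1 = lambda P: P * P - 1
g2 = lambda P: P ** 3 - 3 * P
dg1 = lambda P: 2 * P                 # (g^1)'
dg2 = lambda P: 3 * (P * P - 1)       # (g^2)'

# (g^1)' vanishes only at P = 0; (g^2)' only at P = -1 and P = 1
assert dg1(0) == 0 and dg2(-1) == 0 and dg2(1) == 0

# unique self-intersection g(-sqrt(3)) = g(sqrt(3)): the loop on [-sqrt3, sqrt3]
r = sqrt(3)
assert isclose(g1(-r), g1(r)) and isclose(g2(-r), g2(r), abs_tol=1e-9)

# sign pattern invoked at the end of the proof, for M in (sqrt(3), 2*sqrt(3)-1):
# on [-M, -1-eps]: (g^1)' < 0 < (g^2)'; on [1+eps, M]: both derivatives > 0
M, eps = 2.0, 0.1
assert all(dg1(P) < 0 < dg2(P) for P in (-M, -1 - eps))
assert all(dg1(P) > 0 and dg2(P) > 0 for P in (1 + eps, M))
```

The different powers of P in the two coordinates also foreshadow the remark below on the elongated shape of dying loops.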
From Lemma 25 and Theorem 8, it is clear that there is a bijection between dying loops and singularities in the particle path evolution.
Shape of a dying loop Observe that in the previous theorem, the two coordinates had to be scaled differently to get the limiting loop. The difference in the scaling exponent explains why the loops look elongated before death (see Fig. 5).
Freezing in the tail
This section addresses Observation 4 of Burdzy and Pal [2]. The tail of the Conga line refers to the particles at distance \(x > \alpha t\) from the leading particle.
Note that, from the scaling relation (3.2) and Eq. (4.12), we have for any fixed \(\beta \in (0,1)\),
Thus, almost surely, there is N such that
for all \(n \ge N\). In particular, for any \(\delta \in (0,\alpha )\), if we look at the family of curves
for \(\delta \le x \le \alpha \), then (6.2), along with Strassen’s Theorem for Brownian motion (see [10, p. 221, Theorem 22.2]), tells us that the set of continuous curves that form the limit points of \(\{\mathcal {U}_n(x): \delta \le x \le \alpha \}_{n \ge 1}\) with respect to the sup-norm distance on \([\delta , \alpha ]\) is precisely given by
Thus, the Conga line u under the above scaling shows appreciable movement in time. In other words, we do not observe freezing in the scale of the driving Brownian motion W.
On the other hand, if we zoom in on the tail of the Conga line, the particles seem to freeze in time. Furthermore, the direction in which the particles come out of the origin shows very little change with time after a while (see Figs. 2, 5). The small variance of particles in the tail region compared to that of the driving Brownian motion W for large t necessitates a rescaling of the tail to study its properties. Thus, the tail behaves in a very different manner compared to particles near the tip.
To study the phenomenon of ‘freezing in the tail’, we use the continuous version \(\overline{u}\) described in Sect. 2.2. For any fixed \(\eta \in (0, (1-\alpha )/\alpha )\), we look at the following distances from the tip
and study
The choice of distances \(x_t\) from the tip ensures that these particles remain in the tail region for all t. We rescale \(\overline{v}_t\) as
Also define
where W is the driving Brownian motion in expression (2.9).
Theorem 9
For any fixed \(\eta \in (0, (1-\alpha )/\alpha )\),
almost surely and in \(L^2\) as \(t \rightarrow \infty \).
Proof
In the following, \(C_1, C_2,\dots \) represent finite, positive constants.
From (2.9) we can write
Almost sure convergence follows from the fact that
with probability one, and the Dominated Convergence Theorem.
To prove \(L^2\) convergence, note that
where
It is easy to see that
From (6.6), we get
Thus
giving \(L^2\) convergence. \(\square \)
Remark
Note that the above theorem only gives freezing (under rescaling) in a part of the Conga line that goes to zero exponentially fast with time. We have also seen that freezing does not occur in the scale of the driving Brownian motion. The interesting (and harder) question that remains to be investigated is about possible freezing in the intermediate regime (in a window of the form \([\alpha t, \alpha t + \sigma \sqrt{\frac{1}{2}t \log t}]\) where a sharp transition in variance occurs, see Theorem 3 and Corollary 1). We hope to investigate this in a future article.
Conclusion
In this paper, we approximate the smooth part of the Conga line, i.e. \(X_k(n)\) for \(k\gg 1\), by the smooth random Gaussian process u (equivalently \(\overline{u}\)), and then investigate the path properties of u and \(\overline{u}\). But this sheds little light on the Conga line near the tip \((k=1)\) or on the motion of particles in time n for smaller values of k. Note from (2.3) that for fixed k, the increment process \(\{X_{k+1}(n)-X_k(n): n \ge 1\}\) behaves like a weighted moving average, which finds frequent application in time series analysis to smooth out short-term fluctuations and highlight longer-term trends and cycles (see [11]). From an analytical point of view, it would be interesting to analyse the Conga line at time n in a suitable window around a sequence \(a_n\) that grows to infinity slowly enough that the approximation by u fails. We will try to address this in a future article.
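The moving-average point of view can be illustrated directly. Equation (2.3) is not reproduced in this excerpt; assuming the recursion \(X_{j+1}(n)=X_{j+1}(n-1)+\alpha (X_j(n-1)-X_{j+1}(n-1))\) as before, each follower is an exponentially weighted moving average of its leader's past positions, one coordinate shown here for simplicity.

```python
# Under the ASSUMED recursion, the follower satisfies the closed form
#   X_2(n) = alpha * sum_{j=0}^{n-1} (1-alpha)^j * X_1(n-1-j),
# i.e. an exponentially weighted moving average of the leader's past positions.
import random

random.seed(2)
alpha, n = 0.3, 50

lead = [0.0]                              # one coordinate of the leader
for _ in range(n):
    lead.append(lead[-1] + random.gauss(0, 1))

foll = [0.0]                              # follower via the assumed recursion
for m in range(1, n + 1):
    foll.append(foll[-1] + alpha * (lead[m - 1] - foll[-1]))

# compare the recursion's output with the explicit weighted-average formula
ewma = alpha * sum((1 - alpha) ** j * lead[n - 1 - j] for j in range(n))
assert abs(foll[-1] - ewma) < 1e-9
```

The geometrically decaying weights are the smoothing mechanism: short-term fluctuations of the leader are averaged out while longer-term trends persist, which is the sense of the remark above.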
References
Chou, K.S., Zhu, X.P.: The Curve Shortening Problem. CRC Press, Boca Raton (2010)
Burdzy, K., Pal, S.: Private communication
Adler, R.J., Taylor, J.E.: Random Fields and Geometry, vol. 115. Springer, Berlin (2007)
Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, vol. 293. Springer, Berlin (1999)
Feller, W.: An Introduction to Probability Theory and Its Applications, vol. 2. Wiley, New York (2008)
Chow, Y.S., Teicher, H.: Probability Theory: Independence, Interchangeability, Martingales. Springer Science and Business Media, New York (2003)
Ledoux, M.: The Concentration of Measure Phenomenon, vol. 89. AMS Bookstore, Toulouse (2005)
Pal, S.: Concentration for multidimensional diffusions and their boundary local times. Probab. Theory Relat. Fields 154(1–2), 225–254 (2012)
Mörters, P., Peres, Y.: Brownian Motion, vol. 30. Cambridge University Press, Cambridge (2010)
Billingsley, P.: Convergence of Probability Measures. Wiley, New York (2013)
Hamilton, J.D.: Time Series Analysis, vol. 2. Princeton University Press, Princeton (1994)
Acknowledgments
I thank my adviser Krzysztof Burdzy for guiding me through the project and giving interesting ideas to work on. I also thank Shirshendu Ganguly and Soumik Pal for helpful discussions. I thank Krzysztof Burdzy, Shirshendu Ganguly and Mary Solbrig for the figures, and Bharathwaj Palvannan, Yannick Van Huele and Tvrtko Tadic for helping me with computer graphics issues. I am grateful to an anonymous referee for providing valuable comments and suggestions that greatly improved the paper. This research was partly supported by NSF Grant Number DMS-1206276.
Banerjee, S. The Brownian Conga line. Probab. Theory Relat. Fields 165, 901–961 (2016). https://doi.org/10.1007/s00440-015-0649-1
Mathematics Subject Classification
 60G15
 60G17
 60D05