1 Introduction

The puzzle of why “we sometimes get nonsense-correlation between time-series” was first addressed in the seminal paper by Yule (1926). One model that he suggested to explain correlation between independent series was the random walk, called “conjunct series the differences of which are random” by Yule (1926, p. 26). For independent random walks, Yule (1926, p. 33) provided experimental evidence, obtained by drawing playing cards from shuffled packs, that “The frequency-distribution of the correlations of samples of 10 observations [...] are much more widely dispersed than the correlations from samples of random series”. His findings were complemented by the computer experimental evidence on spurious regressions by Granger and Newbold (1974) for independent random walks of length 50; see also Palm and Sneek (1984) for further Monte Carlo results. Phillips (1986) showed that nonsense correlation between independent random walks is not only a finite sample problem. From Phillips (1986, Thm. 1) the limiting distribution of the sample correlation is available: it converges to a nondegenerate random variable. More recently, Ernst et al. (2017) determined the variance of this limit, and numerical evaluation showed that it equals 0.240522 (Ernst, Shepp and Wyner (2017, p. 1807)). Of course, such findings cannot fully explain why nonsense correlation occurs between random walks in small samples.

In this note, we return to the finite sample puzzle. Yule (1926, Fig. 14) observed that random walks may trend in the same direction (concordance) or in the opposite direction (discordance) for certain periods of time. This is an intuitive explanation for nonsense correlation: there will be clusters of association between independent random walks. To add some rigour to this intuition, we would like to know: what is the maximum length to be expected for such spells of concordance or discordance given a fixed sample size? How large is the mode of this maximum length? And how large is the probability of observing values equal to or even larger than the mode? These questions will be answered by means of the corresponding probability distribution given in Corollary 1, building on the little-known Hungarian paper by Székely and Tusnády (1976-1979); see Révész (1990, Thm. 7) and Révész (2013, Thm. 2.7) for a reference. For independent random walks of length \(n=25\) we learn, for instance: the probability that the maximum length of spells with consecutive concordance, or consecutive discordance, is at least equal to 4 amounts to 84.76%. Hence, long spells of random association (relative to the small sample size) are rather likely. The merits of exact results will be demonstrated by comparison with approximations. The asymptotic results in Proposition 2 can be traced back to Földes (1975), again a Hungarian paper, referenced by Révész (1990, Thm. 6). Further, Gordon et al. (1986) provided approximations that allow for correlated random walks, too, which will be evaluated at the end of our note. Since no exact probabilities are available for the correlated case, we confront the asymptotic results with finite sample Monte Carlo figures.

The rest of this paper is organized as follows. The next section motivates this study with some Monte Carlo results. Section 3 makes the notion of random association precise and provides the exact distributional result under independence. The latter is evaluated numerically in Sect. 4 to shed light on why nonsense correlation is likely between independent random walks in finite samples. Section 5 compares the exact results with approximations. Section 6 is devoted to the extension to correlated random walks. A short summary is contained in the final section.

A word on notation before we begin. Let \(\lfloor x \rfloor \) denote the integer part of \(x \in {\mathbb {R}}\), with fractional part \(\{x\} := x - \lfloor x \rfloor \). Let \(\log _b\) stand for the logarithm to the base b, while \(\ln \) denotes the natural logarithm.

2 Some experimental evidence

Consider a bivariate random walk \((X_i, Y_i)_{i=0,1, \ldots , n}\) defined by

$$\begin{aligned} X_i=X_{i-1}+ \varepsilon _i \ \text{ and } \ Y_i=Y_{i-1}+\eta _i \, , \quad i=1,\ldots , n \, , \end{aligned}$$
(1)

where \((X_0, Y_0)\) is an arbitrary starting value. Before we begin with the theory, let us collect some experimental evidence. For computer simulation, the differences \((\Delta X_i, \Delta Y_i) = (\varepsilon _i, \eta _i)\) are drawn from a bivariate normal distribution:

$$\begin{aligned} \begin{pmatrix} \varepsilon _i\\ \eta _i \end{pmatrix} \sim {\mathcal {N}}_2 \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \begin{pmatrix} 1 &{} \rho \\ \rho &{} 1 \end{pmatrix}\right) . \end{aligned}$$
(2)

We simulated random walks with \((X_0, Y_0)=(0,0)\) and computed the sample correlation:

$$\begin{aligned} {\widehat{\rho }} = \frac{\sum _{i=1}^n (X_i - {\overline{X}})(Y_i - {\overline{Y}})}{\sqrt{\sum _{i=1}^n (X_i - {\overline{X}})^2}\sqrt{\sum _{i=1}^n (Y_i - {\overline{Y}})^2}}. \end{aligned}$$

Then we took the absolute value \(| {\widehat{\rho }} |\) (since it is known that \({\widehat{\rho }}\) varies symmetrically around zero for \(\rho =0\)). We report the average over \(10^5\) replications for growing sample size. Clearly, there is massive evidence in favour of nonsense correlation for \(\rho = 0\): the absolute correlation coefficients are of the same size for small n as for large n, see Table 1. For moderate correlation \(\rho = 0.2, 0.4\), the sample correlation still exaggerates the true values, while \(\rho =0.6\) results in averages \(| {\widehat{\rho }} | \approx 0.6\) and \(\rho =0.8\) yields on average \(| {\widehat{\rho }} | \approx 0.77\); these figures, too, are rather robust over the sample size n.
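The experiment can be replicated with a short simulation. The following Python sketch (function names, seed and replication count are our illustrative choices, not those of the paper) draws the increments via a Cholesky-type factorization of the covariance matrix in (2) and averages \(| {\widehat{\rho }} |\):

```python
import math
import random

def sample_corr(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_abs_corr(n, rho, reps, seed=1):
    """Average |rho_hat| over `reps` bivariate random walks of length n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = y = 0.0
        xs, ys = [], []
        for _ in range(n):
            e = rng.gauss(0.0, 1.0)
            # eta_i = rho * eps_i + sqrt(1 - rho^2) * independent noise
            u = rho * e + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
            x += e
            y += u
            xs.append(x)
            ys.append(y)
        total += abs(sample_corr(xs, ys))
    return total / reps
```

For \(\rho = 0\) the average settles well above zero even for moderate n, in line with the pattern documented in Table 1.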

In this paper we offer the length of random association between independent or moderately correlated random walks at small and medium sample sizes as an explanation for the nonsense or exaggerated correlation documented in Table 1.

Table 1 Absolute value of sample correlation, \({|{\widehat{\rho }}|}\)

3 Spells of concordance and discordance

We maintain a bivariate random walk \((X_i, Y_i)_{i=0,1, \ldots , n}\) defined by equation (1), where \((X_0, Y_0)\) is an arbitrary starting value. We now focus on independence (to be relaxed in Assumption 2). More precisely, the differences \((\Delta X_i, \Delta Y_i) = (\varepsilon _i, \eta _i)\) meet the following set of assumptions.

Assumption 1

Let \((\varepsilon _i, \eta _i)_{i=1, \ldots , n}\) be a sequence of independent, identically distributed and continuous random variables with

$$\begin{aligned} p_\varepsilon := \text{ P }(\varepsilon _i< 0), \ \text{ P }(\varepsilon _i> 0) = 1-p_\varepsilon , \quad p_\eta := \text{ P }(\eta _i < 0) , \ \text{ P }(\eta _i > 0) = 1-p_\eta , \end{aligned}$$

\(p_\varepsilon , p_\eta \in (0,1)\). Further, \(\varepsilon _i\) and \(\eta _i\) are independent, and at least one of the probabilities equals 1/2: \(p_\varepsilon = 1/2\) or \(p_\eta =1/2\).

Remark 1

Note that the asymptotic theory by Phillips (1986) or Ernst et al. (2017) requires \(\text{ E }(\varepsilon _i) = \text{ E }(\eta _i) = 0\), which we do not need. For Propositions 1 and 2 we only need that the median of \(\varepsilon _i\) or \(\eta _i\) equals zero.

We say that the variables from (1) are concordant on the ith interval if \(X_i\) and \(Y_i\) move in the same direction; if they move in the opposite direction, they are called discordant. In terms of the usual sign function this provides the following definition.

Definition 1

Concordance on the ith interval means that \(\text{ sign }(\Delta X_i \Delta Y_i ) = 1\). Discordance on the ith interval means that \(\text{ sign }(\Delta X_i \Delta Y_i ) = - 1\).

Note that we rule out \(\Delta X_i = 0\) or \(\Delta Y_i = 0\) with probability 1 by assumption. For convenience, we define \(C_i\) as a concordance indicator, taking on the value 0 if \(\Delta X_i\) and \(\Delta Y_i\) have the same sign:

$$\begin{aligned} C_i= \left\{ \begin{array}{cl} 0 &{} \text{ if } \text{ sign }(\Delta X_i \Delta Y_i ) = 1 \\ 1 &{} \text{ if } \text{ sign }(\Delta X_i \Delta Y_i ) = - 1 \end{array} \right. \, , \quad i=1, \ldots , n \, . \end{aligned}$$
(3)

By Assumption 1, it holds that

$$\begin{aligned} p:= \text{ P }(C_i=0) = 1- p_\varepsilon - p_\eta + 2 p_\varepsilon p_\eta = \frac{1}{2} = \text{ P }(C_i=1) \, . \end{aligned}$$
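To see why this identity yields a fair Bernoulli process, note that concordance occurs when both increments are negative or both are positive; a minimal numerical check (the function name is ours):

```python
def p_concordance(p_eps, p_eta):
    """P(C_i = 0) under Assumption 1: both increments negative
    or both positive, using independence of eps_i and eta_i."""
    return p_eps * p_eta + (1.0 - p_eps) * (1.0 - p_eta)
```

Whenever one argument equals 1/2, the result is 1/2 regardless of the other, which is exactly the role of the median-zero condition in Assumption 1.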

Consider a subsequence of consecutive zeros in \((C_i)_{i=1, \ldots , n}\), called a zero run. Let \(Z_n\) stand for the length of the longest zero run, which corresponds to the length of the longest spell without interruption where \(X_i\) and \(Y_i\) move in the same direction. The probabilities \(\text{ P }(Z_n = k)\) for given n can be expressed in terms of generalized Fibonacci numbers. We adopt the definition by Spickerman and Joyner (1984, p. 327), which is the most convenient for our purposes.

Definition 2

A Fibonacci sequence of order \(\ell \), \((f^{(\ell )}_m)_{m = 1,2, \ldots }\) for \(\ell \in \{1,2, \ldots \}\), is defined by the linear difference equation

$$\begin{aligned} f_{m}^{(\ell )} = \sum _{i=1}^{\ell } f_{m -i}^{(\ell )} \quad \text{ for } m > \ell \, , \end{aligned}$$

with \(f_{m}^{(\ell )} =2^{m-1}\) for \(m=1, \ldots , \ell \).

The trivial case \(\ell =1\) covers a sequence of ones. For \(\ell =2\), the usual Fibonacci numbers are obtained. The case \(\ell =3\) has been called ‘tribonacci’ sequence, see e.g. Spickerman (1982). The following table corresponds to Székely and Tusnády (1976-1979, p. 149).

m: 1, 2, 3, 4, 5, 6, 7, 8, 9
\(f_{m}^{(1)}\): 1, 1, 1, 1, 1, 1, 1, 1, 1
\(f_{m}^{(2)}\): 1, 2, 3, 5, 8, 13, 21, 34, 55
\(f_{m}^{(3)}\): 1, 2, 4, 7, 13, 24, 44, 81, 149
\(f_{m}^{(4)}\): 1, 2, 4, 8, 15, 29, 56, 108, 208
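The recursion in Definition 2 translates directly into code; the following Python sketch (the function name is ours) reproduces the table above:

```python
def gen_fibonacci(order, m_max):
    """Fibonacci sequence of the given order (Definition 2):
    f_m = 2^(m-1) for m <= order, otherwise the sum of the
    preceding `order` terms."""
    f = []
    for m in range(1, m_max + 1):
        f.append(2 ** (m - 1) if m <= order else sum(f[-order:]))
    return f
```

For instance, `gen_fibonacci(3, 9)` returns the ‘tribonacci’ row 1, 2, 4, 7, 13, 24, 44, 81, 149.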

Trivially, \( \text{ P }(Z_n < k) =1\) for \(k > n\). The general probability distribution is given next. Révész (1990, Thm. 7) and Révész (2013, Thm. 2.7) stated it without proof referring to Székely and Tusnády (1976-1979).

Proposition 1

Let \((X_i, Y_i)_{i=0,1, \ldots , n}\) from equation (1) satisfy Assumption 1. It then holds that

$$\begin{aligned} \text{ P }(Z_n < k) = \frac{ f_{n+1}^{(k)}}{2^{n}} \, , \quad 1 \le k \le n \, . \end{aligned}$$

Proof

See Székely and Tusnády (1976-1979). For completeness and easier accessibility, a separate proof is provided in the Appendix. \(\square \)

By Proposition 1, it immediately follows that

$$\begin{aligned} \text{ P }(Z_n = k) = \frac{ f_{n+1}^{(k+1)} - f_{n+1}^{(k)}}{2^{n}} \, , \quad 1 \le k \le n \, . \end{aligned}$$
(4)

Further, \(Z_n=0\) corresponds to a sequence of n ones with probability \(\text{ P }(Z_n = 0) = 1/2^{n}\).

More generally, we are interested in the length of the longest spell of consecutive intervals where \(X_i\) and \(Y_i\) are concordant or discordant without interruption. This corresponds to the maximum length of zero runs or runs of ones in \((C_i)\). Let \(S_n\) stand for the length of this longest spell of consecutive ones or zeros. With Proposition 1, it is straightforward to establish the following distribution.

Corollary 1

Under the assumptions of Proposition 1 it holds that

$$\begin{aligned} \text{ P }(S_n< k) = \text{ P }(Z_{n-1} < k-1) = \frac{f_{n}^{(k-1)}}{2^{n-1}} \, , \quad 2 \le k \le n \, , \end{aligned}$$

and \(\text{ P }(S_n < 1) = 0\).

Proof

See Appendix. \(\square \)

By Corollary 1, it immediately follows that

$$\begin{aligned} \text{ P }(S_n = k) = \frac{ f_{n}^{(k)} - f_{n}^{(k-1)}}{2^{n-1}} \, , \quad 2 \le k \le n \, . \end{aligned}$$
(5)

From (4) and (5) one obtains with \(\text{ P }(S_n = 1) = 2^{1-n} = \text{ P }(Z_{n-1} = 0) \) that

$$\begin{aligned} \text{ P }(S_n =k ) = \text{ P }(Z_{n-1} = k-1) \, , \quad k=1, \ldots , n \, , \end{aligned}$$
(6)

which will be used below.

4 Numerical work

Given the relation in (6), our numerical evaluation will be restricted to the length of the longest spell of consecutive zeros or ones, \(S_n\). The computation requires determining (generalized) Fibonacci numbers. We employ the recursion from Definition 2 and do not rely on explicit solutions.
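A minimal Python sketch of this computation (function names are ours), combining the recursion from Definition 2 with Corollary 1 and (5); integer arithmetic keeps the Fibonacci numbers exact:

```python
def gen_fib(order, m_max):
    """Generalized Fibonacci numbers from Definition 2."""
    f = []
    for m in range(1, m_max + 1):
        f.append(2 ** (m - 1) if m <= order else sum(f[-order:]))
    return f

def cdf_S(n, k):
    """P(S_n < k) from Corollary 1, with the boundary cases
    P(S_n < 1) = 0 and P(S_n < k) = 1 for k > n."""
    if k <= 1:
        return 0.0
    if k > n:
        return 1.0
    return gen_fib(k - 1, n)[-1] / 2 ** (n - 1)

def pmf_S(n, k):
    """P(S_n = k) via (5), including P(S_n = 1) = 2^(1-n)."""
    return cdf_S(n, k + 1) - cdf_S(n, k)
```

For example, `1 - cdf_S(25, 4)` evaluates to roughly 0.8477, matching (up to rounding) the probability quoted in the introduction that the longest spell for \(n=25\) is at least 4.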

Fig. 1 \(\text{ P }(S_n = k)\), \(k=1,\ldots , 15\)

Statistical measures of \(S_n\) are given in Table 2, and they are illustrated by the plots in Fig. 1. For the expected values from Table 2 one observes a logarithmic rate: doubling n adds roughly 1 to \(\text{ E }(S_n)\); an asymptotic explanation for this feature will be given in the next section. While the variance mildly grows with n, the skewness and the kurtosis decrease with the sample size. All in all, we find the spread in \(S_n\) rather small.

Table 2 Measures of \(S_n\)

Looking more closely into the figures behind Fig. 1 reveals that the five outcomes with the highest probabilities, including the most probable value (the mode), cover roughly 90% of the probability mass:

$$\begin{aligned} \text{ P }(3 \le S_{25} \le 7)= & {} 0.9195 \, , \quad \text{ P }(4 \le S_{50} \le 8) = 0.8995 \, , \\ \text{ P }(5 \le S_{100} \le 9)= & {} 0.8850 \, , \quad \text{ P }(6 \le S_{200} \le 10) = 0.8758 \, . \end{aligned}$$

Table 3 looks more closely at the mode, \(mod_n\). While Fig. 1 and Table 2 are restricted to \(n= 2^s \cdot 25\) for \(s=0,1,2, \ldots \), we now consider more generally \(n= 2^s \cdot B\) and vary \(B \in \{25, 30,35\}\). From Table 3 we observe a logarithmic rate, \(mod_n = \lfloor \log _2 n \rfloor = s + \lfloor \log _2 B \rfloor \); as with the expectation, this feature calls for an explanation provided in the next section. As we know from Fig. 1, the maximum probability decreases with n. For large n this probability seems to settle around 0.25 or slightly below, and an approximate explanation will again be provided in the next section. At the same time, it is interesting to look at the probabilities for larger values, say larger than the mode, \(\text{ P }(S_n > mod_n)\): Throughout, the probability for the maximum length of a spell to exceed the mode varies only very little with s given \(n= 2^s \cdot B\), but it does depend on B, which will again be clarified in the subsequent section. In any case we observe large probabilities \(\text{ P }(S_n > mod_n)\): long spells of concordance or discordance (relative to the sample size) will be the rule rather than the exception. This is in line with the experimental evidence documented in Table 1 for no correlation.

Table 3 Mode of \(S_n\), \(n=2^s \cdot B\)

5 Approximate results

In this section we compare our exact figures from Tables 2 and 3 and Fig. 1 with approximate figures. The following approximation can be traced back to Földes (1975), see Révész (1990, Thm. 6). Easier to access is the proof by Földes (1979), while extensions have been provided by Gordon et al. (1986, Thm. 1), see also Proposition 3 below. For this and the next section, recall the definition of the fractional part of a real number x, \(\{x\} := x - \lfloor x \rfloor \), with \(\lfloor \cdot \rfloor \) being the usual floor function.

Proposition 2

Under the assumptions of Proposition 1 it holds uniformly for any integer z that

$$\begin{aligned} \text{ P }(Z_n - \lfloor \log _2 n \rfloor < z) = F_n(z) + o(1) \, , \end{aligned}$$

where \(F_n(z) := \exp \left( -2^{-(z+1 -\{\log _2 n \})} \right) \).

Proof

Földes (1979, Thm. 4). \(\square \)

Now we are equipped to turn to an approximation of \(S_n\) with \(S_n \approx Z_n+1\) building on \(\text{ P }(S_n = k) \approx \text{ P }(Z_n =k-1)\) for large n according to (6). The distribution of \(Z_n\) can be approximated by truncating a Gumbel distribution with distribution function \(F_n\). Let \(V_n\) be Gumbel distributed with parameters \(\{ \log _2 n \} -1\) and \(1/\ln 2\) such that

$$\begin{aligned} \text{ E }(V_n) = \{ \log _2 n \} -1 + \frac{\gamma }{\ln 2} \ \text{ and } \ \text{ Var }(V_n) = \frac{\pi ^2}{6} \frac{1}{\ln ^2 2} \, , \end{aligned}$$

where \(\gamma \approx 0.5772\) is Euler’s constant. It is known that \(F_n(v) = \text{ P }(V_n \le v)\), \(v \in {\mathbb {R}}\), with \(F_n\) given in Proposition 2, the mode is \(\text{ mod }(V_n) = \{ \log _2 n \} -1\), i.e. the density \(f_n(v)\) is maximized at \(\text{ mod }(V_n)\), and the median is \(\text{ med }(V_n) = \text{ mod }(V_n) - {\ln (\ln 2)}/{\ln 2}\). We then have by Proposition 2 that \(Z_n - \lfloor \log _2 n \rfloor \approx \lfloor V_n \rfloor \) in the sense that

$$\begin{aligned} \text{ P }(Z_n - \lfloor \log _2 n \rfloor \le z-1) \approx \text{ P }(V_n \le z) = \text{ P }(\lfloor V_n \rfloor \le z-1). \end{aligned}$$

Consequently,

$$\begin{aligned} S_n \approx \lfloor V_n \rfloor + \lfloor \log _2 n \rfloor +1 \, . \end{aligned}$$
(7)

Because of

$$\begin{aligned} \text{ P }(\lfloor V_n \rfloor = z-1) = \text{ P }(z-1 \le V_n < z) = \int _{z-1}^z f_n(v) \text{ d } v \, , \end{aligned}$$

the mode \(\text{ mod }(V_n)\) with \(-1< \text{ mod }(V_n) < 0\) suggests that \(\text{ mod }(\lfloor V_n \rfloor ) = -1\). Hence, (7) suggests that \(\text{ mod }(S_n) = \lfloor \log _2 n \rfloor \), which was observed in Table 3.

Remark 2

Note that the approximation in (7) builds on the convergence result in Proposition 2, which, however, may not be interpreted as a limiting distribution: The approximating random variable \(V_n\) with the distribution function \(F_n\) does not converge with n, simply because the fractional part \(0 \le \{ \log _2 n \} <1\) does not.

More loosely speaking, it follows from Proposition 2 that (\(k=1,2,\ldots \))

$$\begin{aligned} \text{ P }(S_n \le k ) \approx \text{ P }(Z_n < k ) \approx \exp \left( -2^{-(k +1 -\log _2 n) } \right) = F_n (k -\lfloor \log _2 n \rfloor )\, . \end{aligned}$$
(8)

Hence, \(\text{ P }(S_n =k )\) can be approximated by \(P_n (k)\) defined as follows:

$$\begin{aligned} \text{ P }(S_n =k ) \approx P_n (k) :=F_n (k -\lfloor \log _2 n \rfloor ) - F_n (k -1 -\lfloor \log _2 n \rfloor ) \, . \end{aligned}$$
(9)

As in Table 3, consider \(n=2^s \cdot B\) such that \(\lfloor \log _2 n \rfloor = s + \lfloor \log _2 B \rfloor \) with \(\{ \log _2 (2^s \cdot B) \} = \{ \log _2 B \}\). Obviously, \(P_n (\lfloor \log _2 (2^s \cdot B) \rfloor )\) is constant for all s,

$$\begin{aligned} P_n (\lfloor \log _2 (2^s \cdot B) \rfloor ) = \exp \left( -2^{\{\log _2 B \} -1}\right) - \exp \left( -2^{\{\log _2 B\}}\right) . \end{aligned}$$

Since \(\{\log _2 B\} \in [0,1)\), it is straightforward to verify that \(P_n (\lfloor \log _2 (2^s \cdot B) \rfloor ) \) varies only between 0.233 and 0.250, which explains \(\text{ P }(S_n = \lfloor \log _2 n \rfloor )\) in Table 3, in particular \(P_n (\lfloor \log _2 (2^s \cdot 25) \rfloor ) = 0.2482\), \(P_n (\lfloor \log _2 (2^s \cdot 30) \rfloor ) = 0.2383\), and \(P_n (\lfloor \log _2 (2^s \cdot 35) \rfloor ) = 0.2438\). Similarly,

$$\begin{aligned} \text{ P }(S_n > \lfloor \log _2 \left( 2^s \cdot B\right) \rfloor ) \approx 1- F_n (0) = 1 - \exp \left( -2^{\{ \log _2 B \} -1 } \right) \, . \end{aligned}$$

Again, for \(B \in \{25, 30, 35\}\) this very well explains the figures from Table 3 since

$$\begin{aligned} \text{ P }(S_n > \lfloor \log _2 \left( 2^s \cdot B\right) \rfloor ) \approx \left\{ \begin{array}{cc} 0.5422 &{} \text{ for } B=25 \\ 0.6084 &{} \text{ for } B= 30 \\ 0.4212 &{} \text{ for } B=35 \end{array} \right. . \end{aligned}$$

Further, Fig. 2 displays selected differences of the exact and the approximate probabilities, \(\text{ P }(S_n =k ) - P_n (k)\): (9) does a fairly good job in approximating the single exact probabilities from Corollary 1 for \(n \ge 100\), while for \(n=25\) or \(n=50\) the deviations may be considerable.

Fig. 2 \(\text{ P }(S_n = k)- P_n (k)\), see (9), \(k=1,\ldots , 15\)
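The approximation in (9) is equally simple to code; a Python sketch (names are ours) with \(F_n\) from Proposition 2:

```python
import math

def F(n, z):
    """F_n(z) = exp(-2^-(z + 1 - {log2 n})) from Proposition 2."""
    frac = math.log2(n) - math.floor(math.log2(n))
    return math.exp(-2.0 ** (-(z + 1 - frac)))

def P_approx(n, k):
    """Approximate P(S_n = k) via (9)."""
    j = math.floor(math.log2(n))
    return F(n, k - j) - F(n, k - 1 - j)
```

At the mode, `P_approx(25, 4)` gives about 0.2482 and coincides with `P_approx(50, 5)`, illustrating that \(P_n (\lfloor \log _2 (2^s \cdot B) \rfloor )\) is constant in s; likewise `1 - F(25, 0)` is about 0.5422, the exceedance probability for \(B=25\).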

Using \(\text{ E }(V_n)\) and \(\text{ Var }(V_n)\), we could roughly approximate \(\text{ E }(S_n)\) and \(\text{ Var }(S_n)\), but more elaborate results are available from the literature. Because of (6) we have

$$\begin{aligned} \text{ E }(S_n) = \sum _{k=1}^{n} k \text{ P }(S_{n} = k)= \sum _{k=0}^{n-1} (k+1) \text{ P }(Z_{n-1} = k)= \text{ E }(Z_{n-1})+1 \, . \end{aligned}$$

Gordon et al. (1986, Thm. 2) provided \(\text{ E }(Z_n) \approx \log _2 n + {\gamma }/{\ln 2} - {3}/{2}\). It follows that

$$\begin{aligned} \text{ E }(S_n) \approx \mu _n := \log _2 n + \frac{\gamma }{\ln 2} - \frac{1}{2} \, . \end{aligned}$$
(10)

More precisely, one has, see Guibas and Odlyzko (1980, Thm. 4.1), that

$$\begin{aligned} \text{ E }(S_n) = \mu _n + r(n) + o(1) \, , \end{aligned}$$

where r(n) does not vanish but is small: \(|r(x)| \le 1.6 \cdot 10^{-6}\) for all x according to Guibas and Odlyzko (1980, p. 245). Due to r(n), \(S_n \) does not converge with n even if demeaned by \(\mu _n\), see Remark 2. Still, the mean can be very well approximated, as the evaluation of (10) demonstrates:

n: 25, 50, 100, 200, 400
\(\mu _n\): 4.9766, 5.9766, 6.9766, 7.9766, 8.9766

A look at Table 2 demonstrates the close correspondence with the exact expectation even for small n. Finally, Gordon et al. (1986, Thm. 2) provided an approximation of the variance, too. Since \(\text{ Var }(S_n) = \text{ Var }(Z_n)\) we have from their paper that

$$\begin{aligned} \text{ Var }(S_n) \approx \frac{\pi ^2}{6 \ln ^2 2} + \frac{1}{12} \approx 3.5070 \, , \end{aligned}$$

see also Guibas and Odlyzko (1980, Thm. 4.1). This value, which is independent of n, does not explain well the exact variances for small n given in Table 2.
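For completeness, the approximations for the mean in (10) and for the variance can each be evaluated in a line of Python (a sketch; names are ours):

```python
import math

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def mu(n):
    """mu_n from (10): log2 n + gamma / ln 2 - 1/2."""
    return math.log2(n) + GAMMA / math.log(2.0) - 0.5

# Gordon et al. (1986) variance approximation: pi^2/(6 ln^2 2) + 1/12
VAR_APPROX = math.pi ** 2 / (6.0 * math.log(2.0) ** 2) + 1.0 / 12.0
```

`mu(25)` is about 4.9766, and doubling n adds exactly 1, matching the tabulated values; `VAR_APPROX` is about 3.5070.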

6 Correlated random walks

Drawing from the paper by Gordon et al. (1986), we briefly consider an extension of Proposition 2. We now relax Assumption 1 and allow for correlation between the random walks. In terms of the concordance from Definition 1, correlation allows for \(\text{ P }(C_i=0) \ne \text{ P }(C_i=1)\). Technically, this means we have a Bernoulli process without symmetry, which is the model for tossing a coin that is not fair. The stronger the positive correlation between the two random walks, the larger the probability of concordance p,

$$\begin{aligned} p:= \text{ P }(C_i=0) \quad \text{ and } \quad q:= 1-p = \text{ P }(C_i=1). \end{aligned}$$

Negative correlation implies \(p < 1/2\).

Assumption 2

Let \((\Delta X_i, \Delta Y_i)_{i=1, \ldots , n}\) be a sequence of independent, identically distributed and continuous random variables with \(0< p < 1\).

From Gordon et al. (1986, Thm. 1) we have the following result, see also Arratia, Gordon and Waterman (1990, Coro. 3).

Proposition 3

Let \((X_i, Y_i)_{i=0,1, \ldots , n}\) from equation (1) satisfy Assumption 2. It then holds uniformly for any integer z that

$$\begin{aligned} \text{ P }(Z_n - \lfloor m_{n,p} \rfloor < z) = \exp \left( -p^{z - \{m_{n,p}\}} \right) + o(1) \, , \end{aligned}$$

where \(m_{n,p} := \log _{1/p} (n q)\).

Proof

The result follows from Gordon et al. (1986, Thm. 1); details are provided in the Appendix. \(\square \)

Note that \(m_{n,1/2} = \log _2 (n) -1\) and \(\{\log _2 (n) -1\}= \{\log _2 n\}\), such that Proposition 2 arises as a special case. Further, Proposition 3 allows us to approximate in the sense of (8) that

$$\begin{aligned} \text{ P }(Z_n \le k ) \approx \exp \left( -p^{k+1 - m_{n,p}} \right) \, . \end{aligned}$$
(11)

This formula underlies Table 4, which is dedicated to the effect of p on \(\text{ P }(Z_n > \lfloor \log _2 n \rfloor )\) by means of the approximation from (11). Our choices of p equal the probabilities if \((\Delta X_i, \Delta Y_i)\) are jointly normal with a correlation of \(\rho \): \(p= 0.23, 0.42, 0.5, 0.58, 0.77\) arise from \(\rho = -0.75, -0.25, 0, 0.25, 0.75\). It is intuitively clear that \(p > 0.5\) increases the probabilities of long zero runs, and it does so dramatically, e.g. for \(p=0.77\). For \(p < 0.5\), on the other hand, zero runs become less likely because the random walks tend to drift in a discordant manner. Of course, this does not reduce the correlation between the random walks. Let \(O_n\) stand for the length of the longest sequence of ones in \(\left( C_i \right) _{i=1, \ldots , n}\). It is clear from (11) that

$$\begin{aligned} \text{ P }(O_n \le k ) \approx \exp \left( -q^{k+1 - \mu _{n,q}} \right) , \end{aligned}$$

where \(\mu _{n,q}\) is defined analogously to \(m_{n,p}\) from Proposition 3: \(\mu _{n,q} := \log _{1/q} (n p)\). Table 4 formalizes the following intuition: the stronger the correlation between the random walks, i.e. the larger \(|p- 0.5|\), the more likely are long runs of zeros or ones in \(\left( C_i \right) \), depending on the sign of \(p- 0.5\). The approximate results from the first panel of Table 4 are well supported by finite sample Monte Carlo estimates for p not too large; for \(p=0.77\), however, the approximate figures are too conservative in that the Monte Carlo estimates are sizeably larger.
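The tail probability behind Table 4 can be sketched as follows, based on (11); the function name and the choice \(n=100\) are our illustrative assumptions, and we do not reproduce the Monte Carlo panel here:

```python
import math

def tail_prob(n, p, k):
    """Approximate P(Z_n > k) for concordance probability p,
    using (11) with m_{n,p} = log_{1/p}(n(1-p))."""
    q = 1.0 - p
    m = math.log(n * q) / math.log(1.0 / p)  # m_{n,p}
    return 1.0 - math.exp(-p ** (k + 1 - m))

# effect of p on P(Z_n > floor(log2 n)) for n = 100
n = 100
k = math.floor(math.log2(n))
probs = [tail_prob(n, p, k) for p in (0.23, 0.42, 0.5, 0.58, 0.77)]
```

The probabilities increase sharply in p, mirroring the intuition that positive correlation between the walks makes long zero runs more likely.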

Table 4 Probabilities \(\text{ P }(Z_n > \lfloor \log _2 n \rfloor )\) for varying p

7 Summary

There exists a well understood asymptotic theory of why one gets nonsense correlation between independent long random walks, see Phillips (1986, Thm. 1). In this note we focus on finite samples, with special interest in small sizes. What is, for instance, the maximum length of random association (consecutive concordance or consecutive discordance) between two independent random walks of sample size \(n=50\)? Evaluating Corollary 1, one can verify that the probability of the maximum length of random association being equal to 5 amounts to 27.68% (see also Fig. 1). The exact probability that this maximum length is at least equal to 5 amounts to 82.09% (see Table 3), and the expected value is 5.9783 (Table 2). Hence, long episodes (relative to the small sample size) of random association occur frequently, which explains why nonsense correlation arises between independent short random walks. We also included the case of correlated random walks, where long episodes of association are of course more likely; see Table 4 for a quantification.