1 Introduction

Motivated by applications to the analysis of queueing networks, we study the large deviations of extreme values of multivariate Gaussians. We focus on the bivariate case for simplicity, but our analysis carries over to the more general case with some effort. Let \(\{X_1, X_2,\ldots \}\) be an ensemble of independent and identically distributed (i.i.d.) bivariate Gaussians with covariance matrix \(\Sigma \), and let \({\bar{X}}_n := (\max _{1\le i\le n} X^{\scriptscriptstyle (1)}_i, \max _{1\le i\le n}X_i^{\scriptscriptstyle (2)})\) be the componentwise maximum, or extreme value, random vector. For simplicity, we assume that \({\mathbb {E}}[X_1] = 0\). In the context of a queueing network, \({\bar{X}}_n\) approximates, for instance, the maximum congestion experienced over a typical interval in a network of infinite server queues. We characterize the likelihood of the tail event \(\{{\bar{X}}_n > a_n u\}\), where \(u \in (0,\infty )^2\), as the number of random vectors n tends to infinity, under the assumption that \(a_n \rightarrow \infty \) as \(n \rightarrow \infty \). We consider two cases.

Case 1: The right scale. Under the condition that \(a_n = \sqrt{\log n}\), we prove a “restricted” large deviations principle (RLDP) (in the sense of [15]) in Theorem 2 that shows that if \(u > \sqrt{2} (\sigma ^{\scriptscriptstyle (1)},\sigma ^{\scriptscriptstyle (2)})\) (where \(\sigma ^{\scriptscriptstyle (j)}\) is the standard deviation of marginal j), then

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}({\bar{X}}_n > a_n u) = J(u/\sigma ), \end{aligned}$$
(1)

where

$$\begin{aligned} J(u)= {\left\{ \begin{array}{ll} 1 - \frac{1}{2} \left( {u^{\scriptscriptstyle (1)}} \right) ^2&{}\quad \text {when}~u^{\scriptscriptstyle (2)} \le \rho u^{\scriptscriptstyle (1)},\\ 1 - \frac{1}{2} \left( {u^{\scriptscriptstyle (2)}} \right) ^2 &{}\quad \text {when}~u^{\scriptscriptstyle (1)} \le \rho u^{\scriptscriptstyle (2)},\\ \max \Big \{2 - \tfrac{1}{2} \left\| {u} \right\| _2^2, 1-\frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\Big \}&{}\quad \text {otherwise,} \end{array}\right. } \end{aligned}$$
(2)

\(u/\sigma := (u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}, u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})\), \(\Vert u\Vert _2^2 := \sum _{i=1,2} |u^{\scriptscriptstyle (i)}|^2\) and \(\rho \in [-\,1,+\,1]\) is the correlation coefficient. Here, the top two cases can only arise when \(\rho >0\), so that when \(\rho \le 0\), J(u) equals the last line. The proof follows by using the Laplace principle in the key Lemma 1 and combining it with the ‘largest probability wins’ principle.

The different cases in (2) originate from the different scenarios in which the bivariate ensemble can attain its maximum. In the cases where a term \(+\,1\) is present, the maximum is attained by a single index i for which \(X_i\) simultaneously attains the maximum in both coordinates. In the cases where a term \(+\,2\) is present, the maximum is attained by two different indices, one attaining the maximum of the first coordinate and one attaining the maximum of the second coordinate. The latter case has the most distinct possibilities (“larger entropy”), while the former may have a larger probability for appropriate correlation coefficients \(\rho \). The optimal strategy is characterized by the ‘largest probability wins’ principle.
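To make this case distinction concrete, the following minimal Python sketch (an illustration only; the helper name `J` is ours) evaluates the rate function (2) for standardized marginals and applies the ‘largest probability wins’ principle in the middle region.

```python
def J(u1, u2, rho):
    """Rate function (2) for standardized marginals (sigma^(1) = sigma^(2) = 1)."""
    if u2 <= rho * u1:                     # one index, driven by the first coordinate
        return 1.0 - 0.5 * u1 ** 2
    if u1 <= rho * u2:                     # one index, driven by the second coordinate
        return 1.0 - 0.5 * u2 ** 2
    two_indices = 2.0 - 0.5 * (u1 ** 2 + u2 ** 2)          # 'entropy' term
    one_index = 1.0 - (u1 ** 2 - 2 * rho * u1 * u2 + u2 ** 2) / (2 * (1 - rho ** 2))
    return max(two_indices, one_index)     # 'largest probability wins'

print(J(1.6, 1.6, 0.9))  # strong correlation: the one-index term wins here
print(J(1.6, 1.6, 0.1))  # weak correlation: the two-index term wins here
```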

Case 2: Larger scales. On a much larger scale, where \(a_n \gg \sqrt{\log n}\), we establish two main results. First, we prove a leading-order asymptotic for the extreme value that aligns with the result in Case 1. Precisely, in Theorem 3, we prove the RLDP

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}\big ({\bar{X}}_n> a_n u\big ) = -\,I(u/\sigma ), \end{aligned}$$
(3)

where

$$\begin{aligned} I(u)={\left\{ \begin{array}{ll} \frac{1}{2}(u^{\scriptscriptstyle (1)})^2 &{}\quad \text {when }u^{\scriptscriptstyle (2)}\le \rho u^{\scriptscriptstyle (1)},\\ \frac{1}{2} \left( {u^{\scriptscriptstyle (2)}} \right) ^2 &{}\quad \text {when}~u^{\scriptscriptstyle (1)} \le \rho u^{\scriptscriptstyle (2)},\\ \frac{1}{2}\min \big \{\Vert u\Vert ^2_2, \frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{1-\rho ^2}\big \} &{}\quad \text {otherwise.} \end{array}\right. } \end{aligned}$$
(4)

In Theorem 4, on the other hand, we establish sharp asymptotics for the likelihood and show that there exist a continuous function \(I :{\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) and constants K, b and c such that

$$\begin{aligned} \lim _{n\rightarrow \infty } a_n^b n^{-c} {\mathrm e}^{a_n^2 I(u)} {\mathbb {P}}({\bar{X}}_n > a_n u) = K. \end{aligned}$$
(5)

The proofs of these theorems use the inclusion–exclusion principle to bound the likelihood from above and below.

1.1 Related literature

Multivariate Gaussians emerge as stationary limits of networks of infinite server queues. Recall that the steady-state number in system of an isolated \(M/G/\infty \) queue is Poisson distributed and can be approximated by an appropriate Gaussian random variable when the arrival rate is high (i.e., in heavy traffic in the sense of [9]); see [1, 19] as well. Similar invariance principles can be established for \(G/G/\infty \) queues in heavy traffic [9]. In an open queueing network, it can be argued that the number in the network is well approximated by a multivariate Gaussian random vector with independent covariates. On the other hand, in a closed network of infinite server queues, [10] shows that the number in the system is multinomially distributed. Once again, as the number in the closed system scales to infinity, the multinomial central limit theorem shows that the steady-state number in the system is well approximated by a multivariate Gaussian random vector with dependent covariates. Similar problems arise in ‘repairman’ problems where multiple simultaneous repairs must be conducted in parallel by specialized crews. In [8], a diffusion approximation model of the repairman system is again shown to have a Gaussian steady-state distribution. Given a finite number of observations of the network or repairman system, understanding large exceedances over these observations is of particular interest. Under an i.i.d. assumption on the observations (which would require a justification of its own on a case-by-case basis), the large deviation limit in this paper provides an approximation to the rare event probabilities.

Next, there is an explicit connection with extreme value theory (EVT). The logarithmic asymptotics established here complement the uniform convergence results for EVT; see [18, Chapter 4] and [2, Section I, Chapter C]. There are also clear connections with recent work on extremes of multidimensional Gaussian processes in [3,4,5, 14, 15] and other related works, where logarithmic asymptotics are derived for the “at least one in the set” extremum (not the componentwise extremum considered here) for Gaussian processes. We note, in particular, [15] where logarithmic asymptotics are derived for the “at least one in set” extremum of a sequence of (non-i.i.d.) generally distributed random vectors. The authors present a general theory closely aligned with the RLDP for univariate random variables introduced in [7], whereby the Gärtner-Ellis condition need not be satisfied. Of course, our results are more restrictive in the sense that we only study i.i.d. Gaussian random vectors, but we also consider large-scale asymptotics that are not under consideration there.

Our results are also closely related to the important series of papers by Hashorva and Hüsler [11,12,13] generalizing the classic Mills ratio Gaussian tail bound [17]. We observe that the quadratic program logarithmic asymptote derived in Lemma 1 is also implied by the tail bound derived in [11, 13]. In [13], the authors derive exact asymptotics for integrals of Gaussian random vectors, and, in particular, focus on the “at least one in the set” extremum for half-space extreme value sets. Our proof does not rely on the bound in [11, 13], however. We also note [16], where a crude asymptotic for Gaussian stochastic processes, closely related to Lemma 1, is proved, but not the logarithmic asymptotics. Furthermore, in our case, the threshold scales with n (or, equivalently, with the time index t in [16]), which makes for a stronger result.

It would be interesting to strengthen Theorem 2 to sharp asymptotics, as performed for large scales in Theorem 4. This is hard: various error terms that are easily handled in the proof of Theorem 4, because there they are much smaller than the leading order, become only marginally smaller at the right scale. It would also be interesting to extend our analysis to other multivariate random vectors with nontrivial dependence.

1.2 Notation and setting

All vector relations should be understood componentwise. Thus, \(x > y\) implies that \(x^{\scriptscriptstyle (j)} > y^{\scriptscriptstyle (j)}\) for every component j. Following [15], we define a restricted large deviation principle (RLDP) as follows: for some \(q \in {\mathbb {R}}^d\), a sequence of multivariate \({\mathbb {R}}^d\)-valued random variables \(\{W_n\}\) satisfies an RLDP with rate function \(J :{\mathbb {R}}^d \rightarrow [0,\infty ]\) if

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{v_n} \log {\mathbb {P}}(W_n/a_n > q) = -\,J(q), \end{aligned}$$
(6)

where \(v_n,a_n \rightarrow \infty \) as \(n\rightarrow \infty \). This asymptotic is not a full-fledged large deviations principle (LDP) since it does not provide any insight into what happens for negative q, i.e., it only deals with attaining large positive values. Furthermore, as noted in [7, 15], if \(W_n\) satisfies an LDP with continuous rate function, then it automatically satisfies the RLDP. On the other hand, if the rate function is discontinuous, then it might not satisfy the RLDP.
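As a quick univariate illustration of the RLDP (6) (a numerical sketch with parameters of our own choosing, not part of the formal development), take \(W_n = \max _{i\le n} Z_i\) for i.i.d. standard normals, \(a_n = \sqrt{\log n}\) and \(v_n = a_n^2\); the limit in (6) is then \(q^2/2 - 1\) for \(q > \sqrt{2}\).

```python
import numpy as np
from scipy.stats import norm

q = 1.8  # q > sqrt(2), so the RLDP limit should be 1 - q^2/2 = -0.62
for n in [10 ** 4, 10 ** 6, 10 ** 8, 10 ** 12]:
    a_n = np.sqrt(np.log(n))
    # P(W_n > q*a_n) = 1 - Phi(q*a_n)^n, evaluated in log-space for tiny tails
    log_p = np.log(-np.expm1(n * norm.logcdf(q * a_n)))
    print(n, log_p / np.log(n))  # converges (slowly) to 1 - q^2/2
```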

We can write

$$\begin{aligned} (X^{\scriptscriptstyle (1)}_i, X^{\scriptscriptstyle (2)}_i){\mathop {=}\limits ^{d}} \left( \sigma ^{\scriptscriptstyle (1)}Z^{\scriptscriptstyle \mathrm (1)}_i + \mu ^{\scriptscriptstyle (1)}, \sigma ^{\scriptscriptstyle (2)}Z^{\scriptscriptstyle \mathrm (2)}_i + \mu ^{\scriptscriptstyle (2)}\right) , \end{aligned}$$
(7)

where \((\sigma ^{\scriptscriptstyle (1)},\mu ^{\scriptscriptstyle (1)})\) and \((\sigma ^{\scriptscriptstyle (2)},\mu ^{\scriptscriptstyle (2)})\) are the standard deviation and mean of \(X^{\scriptscriptstyle (1)}_i\) and \(X^{\scriptscriptstyle (2)}_i\), respectively, while \((Z^{\scriptscriptstyle (1)}_i, Z^{\scriptscriptstyle (2)}_i)\) is a standard bivariate normal pair with correlation coefficient \(\rho \in [-\,1,1]\). We assume that \(\mu ^{\scriptscriptstyle (i)} = 0\) for \(i \in \{1,2\}\), without loss of generality. Throughout, we assume that the covariance matrix \(\Sigma \) of the bivariate Gaussian is non-singular, so that \(\rho \in (-\,1,1)\).
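Representation (7) also gives a convenient way to simulate the ensemble; the following sketch (with an illustrative helper `sample_bivariate` of our own) draws correlated pairs and checks the empirical correlation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_bivariate(n, sigma1, sigma2, rho):
    """Draw n i.i.d. copies of (X^(1), X^(2)) via (7) with zero means."""
    z1 = rng.standard_normal(n)
    z2 = rho * z1 + np.sqrt(1 - rho ** 2) * rng.standard_normal(n)  # corr(z1, z2) = rho
    return sigma1 * z1, sigma2 * z2

x1, x2 = sample_bivariate(10 ** 6, 1.0, 2.0, 0.6)
print(np.corrcoef(x1, x2)[0, 1])  # approximately 0.6
```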

2 Right scale asymptote

We start by analyzing extreme events for bivariate Gaussian random variables:

Lemma 1

(Extreme events for single normal random variables) Let \(\{a_n\}_{n\ge 1}\) be any unbounded, increasing sequence, and let \(\varepsilon \in {\mathbb {R}}^2\). Then

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}\left( X_1> a_n \varepsilon \right) = -\,\mathop {{{\,\mathrm{ess\,inf}\,}}}\limits _{x > \varepsilon } \frac{1}{2} x^T \Sigma ^{-1} x. \end{aligned}$$
(8)

Proof

By definition, and with C an explicit constant,

$$\begin{aligned} \frac{1}{a_n^2}\log {\mathbb {P}}\left( X_1> a_n \varepsilon \right)&= \frac{1}{a_n^2}\log \left( C \int _{x> a_n \varepsilon } \exp \left( -\,\frac{1}{2} x^T\Sigma ^{-1}x\right) \hbox {d}x \right) \nonumber \\&=\frac{1}{a_n^2}\log \left( a_n^2 C \int _{x > \varepsilon } \exp \left( -\,a_n^2\frac{1}{2} x^T\Sigma ^{-1}x\right) \hbox {d}x \right) , \end{aligned}$$
(9)

where the second equality follows by substitution of variables. Laplace’s principle [6, Chapter 4] implies the claim. \(\square \)
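Lemma 1 can be checked numerically; the sketch below (our illustration, with \(\Sigma \) and \(\varepsilon \) chosen arbitrarily) compares \(a^{-2}\log {\mathbb {P}}(X_1 > a\varepsilon )\), computed from the bivariate normal cdf, with the value of the quadratic program in (8). The accuracy of the cdf routine limits how far into the tail one can probe.

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.optimize import minimize

Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
eps = np.array([1.0, 0.8])

# Quadratic program in (8): minimize (1/2) x^T Sigma^{-1} x subject to x >= eps.
Sinv = np.linalg.inv(Sigma)
qp = minimize(lambda x: 0.5 * x @ Sinv @ x, x0=eps + 0.1,
              jac=lambda x: Sinv @ x, bounds=[(e, None) for e in eps])

for a in [2.0, 3.0, 4.0, 5.0]:
    # By symmetry of the centered Gaussian, P(X > a*eps) = P(X <= -a*eps).
    p = multivariate_normal(cov=Sigma).cdf(-a * eps)
    print(a, np.log(p) / a ** 2, -qp.fun)  # first value approaches the second
```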

Next, we consider the asymptotics of the logarithmic likelihood of the event \(\{\exists i \le n:X_i > a_n u\}\), in which a single index exceeds both thresholds simultaneously. Note that this is not the componentwise maximum event \(\{{\bar{X}}_n > a_n u\}\).

Proposition 1

(A single index attains the maximum) Let \(a_n := \sqrt{\log n}\). The bivariate Gaussian ensemble satisfies the RLDP limit

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}(\exists i \le n :X_i> a_n u) = 1 - \mathop {{{\,\mathrm{ess\,inf}\,}}}\limits _{x > u} \frac{1}{2} x^T \Sigma ^{-1} x, \end{aligned}$$
(10)

for \(u := (u^{\scriptscriptstyle (1)},u^{\scriptscriptstyle (2)}) > \sqrt{2}(\sigma ^{\scriptscriptstyle (1)},\sigma ^{\scriptscriptstyle (2)})\).

Remark 1

The condition \((u^{\scriptscriptstyle (1)},u^{\scriptscriptstyle (2)}) > \sqrt{2}(\sigma ^{\scriptscriptstyle (1)},\sigma ^{\scriptscriptstyle (2)})\) is very natural. Indeed, since the marginal distributions of each of the coordinates are normal with mean zero and standard deviation \(\sigma ^{\scriptscriptstyle (j)}\), we have that \(\max _{1\le i\le n} X_i^{\scriptscriptstyle (j)}/\sqrt{\log {n}} {\mathop {\longrightarrow }\limits ^{\scriptscriptstyle a.s.}}\sqrt{2} \sigma ^{\scriptscriptstyle (j)}\) for \(j=1,2\). Thus, when \(u^{\scriptscriptstyle (j)}\le \sqrt{2}\sigma ^{\scriptscriptstyle (j)}\), one expects this event not to contribute to the asymptotics in Proposition 1. In particular, when \((u^{\scriptscriptstyle (1)},u^{\scriptscriptstyle (2)}) \le \sqrt{2}(\sigma ^{\scriptscriptstyle (1)},\sigma ^{\scriptscriptstyle (2)})\), the limit in (10) equals zero, whereas the right-hand side is strictly positive.
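The almost-sure limit invoked in the remark is easy to visualize by simulation; a minimal sketch (for one marginal, with an arbitrary \(\sigma \)):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0
for n in [10 ** 3, 10 ** 5, 10 ** 7]:
    x = sigma * rng.standard_normal(n)
    # max_{i<=n} X_i^{(j)} / sqrt(log n) should approach sqrt(2)*sigma
    print(n, x.max() / np.sqrt(np.log(n)), np.sqrt(2) * sigma)
```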

Proof

Observe that, by independence and the geometric-sum identity \(1-b^n = (1-b)\sum _{i=0}^{n-1}b^i\),

$$\begin{aligned} {\mathbb {P}}(\exists i \le n :X_i> a_n u)&= {\mathbb {P}}\left( \cup _{i=1}^n \left\{ X_i> a_n u\right\} \right) \nonumber \\&= 1 - {\mathbb {P}}\big (\cap _{i=1}^n \{X_i>a_n u\}^c\big ) = 1 - {\mathbb {P}}(\{X_1>a_n u\}^c)^n \nonumber \\&=(1-{\mathbb {P}}(\{X_1> a_n u\}^c))\sum _{i=0}^{n-1} {\mathbb {P}}(\{X_1>a_n u\}^c)^i \nonumber \\&= {\mathbb {P}}(X_1 > a_n u) \sum _{i=0}^{n-1} b_n^i, \end{aligned}$$
(11)

where \(b_n := {\mathbb {P}}(\{X_1 > a_n u\}^c)\).

From (11), it follows that

$$\begin{aligned} \log {\mathbb {P}}(\exists i \le n:X_i> a_n u) \le \log {\mathbb {P}}(X_1 > a_n u) + \log n, \end{aligned}$$
(12)

using the fact that \(b_n < 1\) for all finite n. Lemma 1 implies that

$$\begin{aligned} \limsup _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}\left( \exists i \le n :X_i> a_n u\right) \le 1 - \frac{1}{2}\mathop {{{\,\mathrm{ess\,inf}\,}}}\limits _{x > u} x^T \Sigma ^{-1} x. \end{aligned}$$
(13)

Next, for the lower bound, we work with the term \(\log \sum _{i=0}^{n-1} b_n^i\) to obtain a finer analysis. Since \(b_n \in [0,1]\), each summand is at least \(b_n^{n-1}\), so that \(\sum _{i=0}^{n-1} b_n^i \ge n b_n^{n-1}\). Suppose we demonstrate that \(\log (n b_n^{n-1}) \ge \log n + o(\log n)\) as \(n \rightarrow \infty \); then, it follows that

$$\begin{aligned} \log \sum _{i=0}^{n-1} b_n^i > \log (n b_n^{n-1}) \ge \log n + o(\log n)~\text {as}~n\rightarrow \infty . \end{aligned}$$
(14)

Consequently, Lemma 1, combined with this result, implies that

$$\begin{aligned} \liminf _{n\rightarrow \infty }\frac{1}{a_n^2} \log {\mathbb {P}}\left( \exists i \le n :X_i> a_n u\right) \ge 1 - \mathop {{{\,\mathrm{ess\,inf}\,}}}\limits _{x > u} \frac{1}{2} x^T \Sigma ^{-1} x, \end{aligned}$$
(15)

thereby completing the proof of the proposition.

It remains to show (14). Observe that the inclusion–exclusion formula implies that

$$\begin{aligned} b_n = {\mathbb {P}}(\{X_1 > a_n u\}^c)={\mathbb {P}}\big (\{X_1^{\scriptscriptstyle (1)} \le a_n u^{\scriptscriptstyle (1)}\}\cup \{X_1^{\scriptscriptstyle (2)} \le a_n u^{\scriptscriptstyle (2)}\}\big ) \end{aligned}$$
(16)

satisfies

$$\begin{aligned} b_n&= {\mathbb {P}}({X_1^{\scriptscriptstyle (1)} \le a_n u^{\scriptscriptstyle (1)}}) + {\mathbb {P}}(X_1^{\scriptscriptstyle (2)} \le a_n u^{\scriptscriptstyle (2)}) - {\mathbb {P}}(\{X_1^{\scriptscriptstyle (1)} \le a_n u^{\scriptscriptstyle (1)}\}\cap \{X_1^{\scriptscriptstyle (2)} \le a_n u^{\scriptscriptstyle (2)}\})\nonumber \\&= 2 - {\mathbb {P}}({X_1^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}}) - {\mathbb {P}}(X_1^{\scriptscriptstyle (2)} > a_n u^{\scriptscriptstyle (2)})\nonumber \\&\quad - \,{\mathbb {P}}(\{X_1^{\scriptscriptstyle (1)} \le a_n u^{\scriptscriptstyle (1)}\}\cap \{X_1^{\scriptscriptstyle (2)} \le a_n u^{\scriptscriptstyle (2)}\}). \end{aligned}$$
(17)

Therefore

$$\begin{aligned} b_n \ge 1 - {\mathbb {P}}({X_1^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}}) - {\mathbb {P}}(X_1^{\scriptscriptstyle (2)} > a_n u^{\scriptscriptstyle (2)}). \end{aligned}$$
(18)

By the Taylor series expansion of \(\log (1-x) = -\,x + o(x)\), as well as \(1-b_n=o(1)\),

$$\begin{aligned} \log (n b_n^{n-1})&= \log n + (n-1) \log {b_n}=\log n - (n-1) (1-b_n)(1 + o(1))\nonumber \\&\ge \log n - n\left( {\mathbb {P}}({X_1^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}}) + {\mathbb {P}}(X_1^{\scriptscriptstyle (2)} > a_n u^{\scriptscriptstyle (2)})\right) (1 + o(1)). \end{aligned}$$
(19)

Next, the Gaussian upper tail bound implies that, for large n,

$$\begin{aligned} \log (n b_n^{n-1})&\ge \log n - \frac{n}{\sqrt{2\pi }} \frac{1}{\sqrt{\log n}} \sum _{j\in \{1,2\}} \frac{1}{n^{(u^{\scriptscriptstyle (j)}/\sigma ^{\scriptscriptstyle (j)})^2/2}} \frac{\sigma ^{\scriptscriptstyle (j)}}{u^{\scriptscriptstyle (j)}}(1 + o(1))\nonumber \\&\ge \log n - \frac{1}{\sqrt{2\pi }} \frac{2}{\sqrt{\log n}} \nonumber \\&\quad \max \left\{ \frac{1}{n^{\scriptscriptstyle (u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)})^2/2-1}} \frac{\sigma ^{\scriptscriptstyle (1)}}{u^{\scriptscriptstyle (1)}}, \frac{1}{n^{\scriptscriptstyle (u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})^2/2-1}} \frac{\sigma ^{\scriptscriptstyle (2)}}{u^{\scriptscriptstyle (2)}} \right\} (1 + o(1)). \end{aligned}$$
(20)

Since \((u^{\scriptscriptstyle (1)},u^{\scriptscriptstyle (2)}) > \sqrt{2}(\sigma ^{\scriptscriptstyle (1)},\sigma ^{\scriptscriptstyle (2)})\), it follows that

$$\begin{aligned} \frac{\log (n b_n^{n-1})}{\log n} \ge 1 + o(1) ~\text {as}~n\rightarrow \infty , \end{aligned}$$
(21)

thereby completing the proof. \(\square \)
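The lower bound (19)–(21) can be checked numerically using only the marginal tails entering (18); the following sketch (with thresholds of our choosing, exceeding \(\sqrt{2}\) componentwise) shows the normalized bound approaching 1.

```python
import numpy as np
from scipy.stats import norm

u_over_sigma = np.array([1.6, 1.7])  # componentwise larger than sqrt(2)
for n in [10 ** 4, 10 ** 8, 10 ** 16]:
    a_n = np.sqrt(np.log(n))
    p_sum = norm.sf(a_n * u_over_sigma).sum()  # p_1 + p_2, as in (18)-(19)
    print(n, (np.log(n) - n * p_sum) / np.log(n))  # lower bound for (21), tends to 1
```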

Lemma 2

(Analysis of variational problem) By a straightforward calculation,

$$\begin{aligned}&J_1(u/\sigma ):=1-\tfrac{1}{2}\mathop {{{\,\mathrm{ess\,inf}\,}}}\limits _{x > u} x^T \Sigma ^{-1} x\nonumber \\&\quad ={\left\{ \begin{array}{ll} 1-\frac{1}{2}(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)})^2 &{}\text {when }u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)}\le \rho u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)},\\ 1-\frac{1}{2}(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})^2 &{}\text {when }u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}\le \rho u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)},\\ 1-\frac{(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)})^2-2\rho (u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}) (u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})+(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)} &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$
Fig. 1  Fix \(\rho =0.5\). The blue cone represents the region where \(u^{\scriptscriptstyle (1)} \le \rho u^{\scriptscriptstyle (2)}\) and the red cone the region where \(u^{\scriptscriptstyle (2)} \le \rho u^{\scriptscriptstyle (1)}\)

Proof

Fix \(\rho \in (0,1)\) and without loss of generality assume that \(\sigma ^{\scriptscriptstyle (1)} = \sigma ^{\scriptscriptstyle (2)} = 1\). We can divide the positive quadrant into three regions, as shown in Fig. 1 for \(\rho = 0.5\). Suppose that u is such that \(u^{\scriptscriptstyle (2)} \le \rho u^{\scriptscriptstyle (1)}\) (the red region in Fig. 1). Then, for any \(x > u\),

$$\begin{aligned} \frac{(x^{\scriptscriptstyle (1)})^2-2\rho x^{\scriptscriptstyle (1)}x^{\scriptscriptstyle (2)}+(x^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)} \ge \tfrac{1}{2}(x^{\scriptscriptstyle (1)})^2 > \tfrac{1}{2}(u^{\scriptscriptstyle (1)})^2, \end{aligned}$$
(22)

where the first inequality follows from \((x^{\scriptscriptstyle (2)}-\rho x^{\scriptscriptstyle (1)})^2 \ge 0\) and the final inequality holds for any \(x > u\). Since the value \(\tfrac{1}{2}(u^{\scriptscriptstyle (1)})^2\) is approached as \(x \downarrow (u^{\scriptscriptstyle (1)}, \rho u^{\scriptscriptstyle (1)})\), which is admissible because \(\rho u^{\scriptscriptstyle (1)} \ge u^{\scriptscriptstyle (2)}\), it follows that \(\tfrac{1}{2}{{\,\mathrm{ess\,inf}\,}}_{x > u} x^T \Sigma ^{-1} x = \tfrac{1}{2}(u^{\scriptscriptstyle (1)})^2\). A similar argument shows that \(\tfrac{1}{2}{{\,\mathrm{ess\,inf}\,}}_{x > u} x^T \Sigma ^{-1} x = \frac{1}{2}(u^{\scriptscriptstyle (2)})^2\) when \(u^{\scriptscriptstyle (1)} \le \rho u^{\scriptscriptstyle (2)}\). Finally, in the region where neither of these conditions holds (the blank region in Fig. 1), it is straightforward to verify the Karush–Kuhn–Tucker (KKT) conditions at u and, since \(\Sigma ^{-1}\) is positive definite, u is the unique optimizer. For \(\rho \le 0\), the first two regions are empty and the same KKT argument applies. \(\square \)
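As a sanity check on Lemma 2 (an illustration only, not part of the proof), one can solve the constrained quadratic program numerically and compare it with the three-region closed form, here for standardized marginals:

```python
import numpy as np
from scipy.optimize import minimize

def qp_value(u, rho):
    """Numerically minimize (1/2) x^T Sigma^{-1} x over x >= u (sigma = 1)."""
    Sinv = np.linalg.inv(np.array([[1.0, rho], [rho, 1.0]]))
    res = minimize(lambda x: 0.5 * x @ Sinv @ x, x0=np.asarray(u, float) + 0.5,
                   jac=lambda x: Sinv @ x, bounds=[(c, None) for c in u])
    return res.fun

def closed_form(u, rho):
    u1, u2 = u
    if u2 <= rho * u1:
        return 0.5 * u1 ** 2
    if u1 <= rho * u2:
        return 0.5 * u2 ** 2
    return (u1 ** 2 - 2 * rho * u1 * u2 + u2 ** 2) / (2 * (1 - rho ** 2))

for u in [(2.0, 0.5), (0.5, 2.0), (1.5, 1.4)]:  # one point per region of Fig. 1
    print(u, qp_value(u, 0.5), closed_form(u, 0.5))
```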

As a consequence, we obtain the main result of this section:

Theorem 2

(Extreme value asymptotics for bivariate Gaussians) Under the conditions of Proposition 1,

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}\left( {\bar{X}}_n > a_n u\right) = J(u/\sigma ), \end{aligned}$$
(23)

where

$$\begin{aligned}&J(u/\sigma )\\&\quad = {\left\{ \begin{array}{ll} 1 - \frac{1}{2} \left( {u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}} \right) ^2 &{}\quad \text {when}~u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)} \le \rho u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)},\\ 1 - \frac{1}{2} \left( {u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)}} \right) ^2 &{}\quad \text {when}~u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)} \le \rho u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)},\\ \max \big \{2 - \tfrac{1}{2} \left\| {u/\sigma } \right\| _2^2, 1-\frac{(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)})^2-2\rho (u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}) (u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})+(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\big \}&{}\quad \text {otherwise.} \end{array}\right. } \end{aligned}$$

Further, with \((I^\star , J^\star )\) the indices that maximize \(\bar{X}_n\) (i.e., \({\bar{X}}_n=(X^{\scriptscriptstyle (1)}_{I^\star }, X^{\scriptscriptstyle (2)}_{J^\star })\)),

$$\begin{aligned} \lim _{n\rightarrow \infty }{\mathbb {P}}\big (I^\star \ne J^\star \mid \bar{X}_n> a_n u\big )= {\left\{ \begin{array}{ll} 1 &{}\quad \text { when }2-\tfrac{1}{2}\Vert u/\sigma \Vert ^2_2>J_1(u/\sigma ),\\ 0 &{}\quad \text { when }2-\tfrac{1}{2}\Vert u/\sigma \Vert ^2_2<J_1(u/\sigma ). \end{array}\right. } \end{aligned}$$
(24)

It is an interesting problem to extend (24) to the case where

$$\begin{aligned} J(u/\sigma )=2-\tfrac{1}{2}\Vert u/\sigma \Vert ^2_2=1-\frac{(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)})^2-2\rho (u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}) (u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})+(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}, \end{aligned}$$
(25)

but this seems quite difficult. Note that this result shows a sharp transition between the case where a single index causes the maximum and the case where two indices ‘conspire’ to cause the maximum. The proof explicitly uses the principle of the largest term [6, Lemma 1.2.15] and a strict ordering between \(2-\frac{1}{2}\Vert u/\sigma \Vert _2^2\) and \(J_1(u/\sigma )\). In the boundary case in (25), the principle of the largest term is no longer sufficient to distinguish between the one- and two-index cases; indeed, we anticipate that the limit of the conditional probability in (24) will then lie in the interval (0, 1). This seems to require a finer analysis than the crude principle of the largest term and lies outside the scope of this paper.
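The transition is easy to trace numerically; the sketch below (with \(\rho \) and \(u^{\scriptscriptstyle (1)}\) of our choosing) scans the middle cone and reports which of \(J_1\) and \(J_2\) wins.

```python
import numpy as np

rho, u1 = 0.6, 2.0
# Scan u2 through the middle cone rho*u1 < u2 <= u1 and compare the two terms in (29).
for u2 in np.linspace(rho * u1 + 0.05, u1, 8):
    J2 = 2 - 0.5 * (u1 ** 2 + u2 ** 2)                                       # two indices
    J1 = 1 - (u1 ** 2 - 2 * rho * u1 * u2 + u2 ** 2) / (2 * (1 - rho ** 2))  # one index
    print(f"u2={u2:.3f}  J1={J1:.3f}  J2={J2:.3f}  winner={'one' if J1 > J2 else 'two'}")
```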

Proof

Note that

$$\begin{aligned}&{\mathbb {P}}\left( {\bar{X}}_n> a_n u\right) \nonumber \\&\quad ={\mathbb {P}}(\exists i \le n:X_i> a_n u)\nonumber \\&\qquad +{\mathbb {P}}(\exists i\ne j \le n:X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)}, \not \exists i \le n:X_i > a_n u). \end{aligned}$$
(26)

Depending on u, the first or the second term is dominant. For the moment, we ignore the event \(\{\not \exists i \le n:X_i > a_n u\}\) in the second term (which leads to an upper bound; this event will be incorporated in more detail below), so that

$$\begin{aligned}&{\mathbb {P}}(\exists i\ne j \le n:X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)})\nonumber \\&\quad \approx n(n-1) {\mathbb {P}}(X_1^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}){\mathbb {P}}(X_1^{\scriptscriptstyle (2)} > a_n u^{\scriptscriptstyle (2)})\nonumber \\&\quad \approx n^2 \exp {\Big \{-a_n^2 [(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)})^2+(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})^2]/2\Big \}}. \end{aligned}$$
(27)

Taking logs and dividing by \(\log {n}=a_n^2\) gives

$$\begin{aligned}&\lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}(\exists i\ne j \le n:X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)} > a_n u^{\scriptscriptstyle (2)})\nonumber \\&\quad =2-\tfrac{1}{2}[(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)})^2+(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})^2]=2 - \tfrac{1}{2} \Vert u/\sigma \Vert _2^2\nonumber \\&\quad :=J_2(u/\sigma ). \end{aligned}$$
(28)

Then, by the principle of the largest term [6, Lemma 1.2.15], it follows that

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}({\bar{X}}_n> a_n u)&= \max \Big \{\lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}\left( \exists i \le n:X_i> a_n u\right) ,\nonumber \\&\quad \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}\left( \exists i\ne j \le n:X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)} > a_n u^{\scriptscriptstyle (2)} \right) \Big \}\nonumber \\&= \max \left\{ J_1(u/\sigma ),~J_2(u/\sigma )\right\} . \end{aligned}$$
(29)

Further,

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{a_n^2} \log {\mathbb {P}}\big (I^\star \ne J^\star \mid {\bar{X}}_n> a_n u\big )\nonumber \\&\quad =\lim _{n\rightarrow \infty }\frac{1}{a_n^2} \Big [\log {\mathbb {P}}\big (I^\star \ne J^\star ,{\bar{X}}_n> a_n u\big )-\log {\mathbb {P}}\big ({\bar{X}}_n> a_n u\big )\Big ]\nonumber \\&\quad \le J_2(u/\sigma )-J(u/\sigma )<0, \end{aligned}$$
(30)

when \(J_2(u/\sigma )<J(u/\sigma )\), showing that \({\mathbb {P}}\big (I^\star \ne J^\star \mid {\bar{X}}_n> a_n u\big )=o(1)\) when \(J_2(u/\sigma )<J(u/\sigma )\). Similarly,

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{a_n^2} \log {\mathbb {P}}\big (I^\star =J^\star \mid {\bar{X}}_n> a_n u\big )\nonumber \\&\quad =\lim _{n\rightarrow \infty }\frac{1}{a_n^2} \Big [\log {\mathbb {P}}\big (I^\star =J^\star ,{\bar{X}}_n> a_n u\big )-\log {\mathbb {P}}\big ({\bar{X}}_n> a_n u\big )\Big ]\nonumber \\&\quad \le J_1(u/\sigma )-J(u/\sigma )<0, \end{aligned}$$
(31)

when \(J_1(u/\sigma )<J(u/\sigma )\), showing that \({\mathbb {P}}\big (I^\star =J^\star \mid {\bar{X}}_n> a_n u\big )=o(1)\) when \(J_1(u/\sigma )<J(u/\sigma )\). This proves (24) subject to (23).

We are left to prove the claim in (23), for which we consider several cases:

Case (1): \(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)} \le \rho u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}\). Under the assumption that \(u^{\scriptscriptstyle (j)} > \sqrt{2} \sigma ^{\scriptscriptstyle (j)}\) for \(j = 1,2\), it is straightforward to see that \(1 - \frac{1}{2} \left( u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}\right) ^2 > 2 - \frac{1}{2} \Vert u/\sigma \Vert _2^2\). Lemma 2 implies that

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}({\bar{X}}_n > a_n u) = 1 - \tfrac{1}{2} \left( u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}\right) ^2=J_1(u/\sigma ). \end{aligned}$$
(32)

Case (2): \(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)} \le \rho u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)}\). The proof follows case (1) and is omitted.

Case (3): \(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)} > \rho u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}\) and \(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)} > \rho u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)}\). Lemma 2 implies that (29) is simply

$$\begin{aligned}&\max \{J_1(u/\sigma ), J_2(u/\sigma )\}\\&\quad =\max \left\{ 2 - \tfrac{1}{2} \left\| {u/\sigma } \right\| _2^2, 1-\frac{(u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)})^2-2\rho (u^{\scriptscriptstyle (1)}/\sigma ^{\scriptscriptstyle (1)}) (u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})+(u^{\scriptscriptstyle (2)}/\sigma ^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\right\} . \end{aligned}$$

Which of the two is the maximizer depends sensitively on the relation between \(\rho \) and u.

Returning now to (26), we justify the approximation of the second term. The inclusion–exclusion principle applied to the event \(\{\not \exists i \le n :X_i > a_n u\}\) yields

$$\begin{aligned}&{\mathbb {P}}\big (\exists i \ne j :X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)}, \not \exists i \le n :X_i> a_n u \big )\nonumber \\&\quad = {\mathbb {P}}\big (\exists i \ne j :X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)}\big ) \nonumber \\&\qquad - \,{\mathbb {P}}\big ( \exists i \le n:X_i> a_n u, \exists i \ne j :X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)} > a_n u^{\scriptscriptstyle (2)} \big ). \end{aligned}$$
(33)

Observe that the second term in (33) can be bounded above, using a simple counting argument, by

$$\begin{aligned}&{\mathbb {P}}\big ( \exists i \le n :X_i> a_n u, \exists i \ne j :X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)} \big )\\&\quad \le n^2 {\mathbb {P}}\big ( X_1> a_n u \big ) {\mathbb {P}}\big ( X_1^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)} \big ) + n^2 {\mathbb {P}}\big ( X_1> a_n u\big ) {\mathbb {P}}\big ( X_1^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)} \big )\\&\qquad +\, n^3 {\mathbb {P}}\big ( X_1> a_n u \big ) {\mathbb {P}}\big ( X_1^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)} \big ){\mathbb {P}}\big ( X_1^{\scriptscriptstyle (2)} > a_n u^{\scriptscriptstyle (2)} \big ). \end{aligned}$$

Now, under the conditions of Proposition 1, in particular that \(u = (u^{\scriptscriptstyle (1)},u^{\scriptscriptstyle (2)}) > \sqrt{2}(\sigma ^{\scriptscriptstyle (1)},\sigma ^{\scriptscriptstyle (2)})\), it follows that \({\mathbb {P}}\big ( X_1^{\scriptscriptstyle (i)} > a_n u^{\scriptscriptstyle (i)} \big ) = o(1/n)\) for \(i = 1,2\). Consequently,

$$\begin{aligned} {\mathbb {P}}\big ( \exists i \le n:X_i> a_n u, \exists i \ne j :X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)} \big ) \le o({\mathbb {P}}( X_1 > a_n u )). \end{aligned}$$

It follows that

$$\begin{aligned}&{\mathbb {P}}\big (\exists i \ne j :X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)}, \not \exists i \le n :X_i> a_n u \big ) \\&\quad \ge {\mathbb {P}}\big (\exists i \ne j :X_i^{\scriptscriptstyle (1)}> a_n u^{\scriptscriptstyle (1)}, X_j^{\scriptscriptstyle (2)}> a_n u^{\scriptscriptstyle (2)}\big )\\&\qquad -\, o({\mathbb {P}}\big ( X_1 > a_n u \big )), \end{aligned}$$

thereby justifying the statement following (26). \(\square \)

3 Large-scale asymptote

In this section, we consider \({\mathbb {P}}({\bar{X}}_n>a_n u)\), where \(a_n\gg \sqrt{\log {n}}\), so that we are considering a large deviation event. Recall that \((X_i^{\scriptscriptstyle (1)},X_i^{\scriptscriptstyle (2)}) {\mathop {=}\limits ^{d}} (\sigma ^{\scriptscriptstyle (1)} Z_i^{\scriptscriptstyle (1)}, \sigma ^{\scriptscriptstyle (2)} Z_i^{\scriptscriptstyle (2)})\). By changing \(u=(u^{\scriptscriptstyle (1)}, u^{\scriptscriptstyle (2)})\) if needed, it thus suffices to study the standard case, and in what follows we therefore focus on \({\mathbb {P}}({\bar{Z}}_n>a_n u)\). We prove the following two main theorems:

Theorem 3

(Leading order asymptotics extremes) For any \(a_n\gg \sqrt{\log {n}}\), and with \(0\le u^{\scriptscriptstyle (2)}\le u^{\scriptscriptstyle (1)}\),

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{a_n^2} \log {\mathbb {P}}\big ({\bar{Z}}_n> a_n u \big ) = -\,I(u), \end{aligned}$$
(34)

where

$$\begin{aligned} I(u)={\left\{ \begin{array}{ll} \frac{1}{2}(u^{\scriptscriptstyle (1)})^2 &{}\quad \text {when }u^{\scriptscriptstyle (2)}\le \rho u^{\scriptscriptstyle (1)},\\ \frac{1}{2}\min \big \{\Vert u\Vert ^2_2, \frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{1-\rho ^2}\big \} &{}\quad \text {otherwise.} \end{array}\right. } \end{aligned}$$
(35)

In the following theorem, we extend Theorem 3 to sharp asymptotics:

Theorem 4

(Sharp asymptotics extremes) For any \(a_n\gg \sqrt{\log {n}}\), and with \(0\le u^{\scriptscriptstyle (2)}\le u^{\scriptscriptstyle (1)}\),

$$\begin{aligned} \lim _{n\rightarrow \infty } a_n^bn^{-c}\mathrm{e}^{a_n^2 I(u)}{\mathbb {P}}\big (\bar{Z}_n> a_n u\big ) = K, \end{aligned}$$
(36)

where

$$\begin{aligned} b=1+{\mathchoice{1\mathrm l}{1\mathrm l}{1\mathrm l}{1\mathrm l}}_{\{u^{\scriptscriptstyle (2)}> \rho u^{\scriptscriptstyle (1)}\}}, \qquad c=1+{\mathchoice{1\mathrm l}{1\mathrm l}{1\mathrm l}{1\mathrm l}}_{\{I(u)=\Vert u\Vert ^2/2\}}, \end{aligned}$$
(37)

and

$$\begin{aligned} K= {\left\{ \begin{array}{ll} \frac{1}{2\pi u^{\scriptscriptstyle (1)}u^{\scriptscriptstyle (2)}} &{}\quad \text { when }I(u)=\tfrac{1}{2}\Vert u\Vert ^2, u^{\scriptscriptstyle (1)}\ne u^{\scriptscriptstyle (2)},\\ \frac{1}{4\pi u^{\scriptscriptstyle (1)}u^{\scriptscriptstyle (2)}} &{}\quad \text { when }I(u)=\tfrac{1}{2}\Vert u\Vert ^2, u^{\scriptscriptstyle (1)} = u^{\scriptscriptstyle (2)},\\ \frac{1}{\sqrt{2\pi }\, u^{\scriptscriptstyle (1)}} &{}\quad \text { when }I(u)<\tfrac{1}{2}\Vert u\Vert ^2, u^{\scriptscriptstyle (2)}<\rho u^{\scriptscriptstyle (1)},\\ \frac{1}{2\sqrt{2\pi }\, u^{\scriptscriptstyle (1)}} &{}\quad \text { when }I(u)<\tfrac{1}{2}\Vert u\Vert ^2, u^{\scriptscriptstyle (2)}=\rho u^{\scriptscriptstyle (1)},\\ \frac{(1-\rho ^2)^{3/2}}{2\pi (u^{\scriptscriptstyle (1)}-\rho u^{\scriptscriptstyle (2)})(u^{\scriptscriptstyle (2)}-\rho u^{\scriptscriptstyle (1)})}&{}\quad \text { when }I(u)<\tfrac{1}{2}\Vert u\Vert ^2, u^{\scriptscriptstyle (2)}>\rho u^{\scriptscriptstyle (1)}. \end{array}\right. } \end{aligned}$$
(38)

Consequently, with \((I^\star , J^\star )\) the indices that maximize \({\bar{Z}}_n\) (i.e., \({\bar{Z}}_n=(Z^{\scriptscriptstyle (1)}_{I^\star }, Z^{\scriptscriptstyle (2)}_{J^\star })\)),

$$\begin{aligned} \lim _{n\rightarrow \infty }{\mathbb {P}}\big (I^\star \ne J^\star \mid \bar{Z}_n> a_n u\big )= {\left\{ \begin{array}{ll} 1 &{}\quad \text { when }I(u)=\tfrac{1}{2}\Vert u\Vert ^2_2,\\ 0 &{}\quad \text { otherwise}. \end{array}\right. } \end{aligned}$$
(39)

Proof

We prove Theorems 3 and 4 in one go. We note that

$$\begin{aligned} {\mathbb {P}}({\bar{Z}}_n>a_n u)={\mathbb {P}}\Big (\bigcup _{(i,j)} \{Z^{\scriptscriptstyle \mathrm (1)}_i>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}_j> a_n u^{\scriptscriptstyle (2)}\}\Big ). \end{aligned}$$
(40)

We obtain

$$\begin{aligned} {\mathbb {P}}({\bar{Z}}_n>a_n u)={\mathbb {P}}\Big (\bigcup _{(i,j)} A_{(i,j)}\Big ), \end{aligned}$$
(41)

where

$$\begin{aligned} A_{(i,j)}=\{Z^{\scriptscriptstyle \mathrm (1)}_i>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}_j> a_n u^{\scriptscriptstyle (2)}\}. \end{aligned}$$
(42)

From this formula, we see the importance of symmetry: \(A_{(i,j)}\) and \(A_{(j,i)}\) play interchangeable roles in the symmetric case where \(u^{\scriptscriptstyle (1)}=u^{\scriptscriptstyle (2)}\), but not otherwise. In the symmetric case, we may therefore write, instead,

$$\begin{aligned} {\mathbb {P}}({\bar{Z}}_n>a_n u)={\mathbb {P}}\left( \bigcup _{(i,j):i\le j} \{Z^{\scriptscriptstyle \mathrm (1)}_i>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}_j> a_n u^{\scriptscriptstyle (2)}\}\right) . \end{aligned}$$
(43)

Using inclusion–exclusion. We use inclusion–exclusion to obtain that

$$\begin{aligned} \sum _{(i,j)} {\mathbb {P}}\big (A_{(i,j)}\big ) -e_n(u)\le {\mathbb {P}}({\bar{Z}}_n>a_n u)\le \sum _{(i,j)} {\mathbb {P}}\big (A_{(i,j)}\big ), \end{aligned}$$
(44)

where

$$\begin{aligned} e_n(u)=\frac{1}{2}\sum _{(i,j)\ne (k,l)}{\mathbb {P}}\big (A_{(i,j)}\cap A_{(k,l)}\big ), \end{aligned}$$
(45)

while for the symmetric case we sum over ordered pairs \((i,j)\) with \(i\le j\) instead.

Below, we analyze each of these terms. We separate between the cases where (i) the indices are different; (ii) they are equal but the probability simplifies; (iii) they are equal and we need to perform the integral over the joint density using the Laplace method. We start with the asymmetric case where \(u^{\scriptscriptstyle (1)}\ne u^{\scriptscriptstyle (2)}\), remarking on the extension to the symmetric case at the end of the proof. Without loss of generality, we may assume that \(u^{\scriptscriptstyle (1)}>u^{\scriptscriptstyle (2)}\).

Sum of probabilities: unequal indices. First consider the case where \(i\ne j\). Then, since \(Z^{\scriptscriptstyle \mathrm (1)}_i\) and \(Z^{\scriptscriptstyle \mathrm (2)}_j\) are independent standard normal random variables,

$$\begin{aligned} \sum _{(i,j):i\ne j} {\mathbb {P}}\big (A_{(i,j)}\big ) =n(n-1) \big [1-\Phi (a_n u^{\scriptscriptstyle (1)})\big ]\big [1-\Phi (a_n u^{\scriptscriptstyle (2)})\big ], \end{aligned}$$
(46)

where \(\Phi (x)={\mathbb {P}}(Z\le x)\) is the distribution function of a standard normal. By the tail asymptotics, for large x,

$$\begin{aligned} 1-\Phi (x)=\frac{1}{\sqrt{2\pi } x}\mathrm{e}^{-x^2/2}(1+O(x^{-2})), \end{aligned}$$
(47)

we thus obtain that

$$\begin{aligned} \sum _{(i,j):i\ne j} {\mathbb {P}}\big (A_{(i,j)}\big ) =n^2 \frac{1}{2\pi u^{\scriptscriptstyle (1)}u^{\scriptscriptstyle (2)}a_n^2}\mathrm{e}^{-a_n^2 \Vert u\Vert _2^2/2}(1+o(1)). \end{aligned}$$
(48)
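The quality of the Mills-ratio asymptotic (47) improves quickly with x, as a short sketch confirms:

```python
import numpy as np
from scipy.stats import norm

for x in [2.0, 4.0, 8.0]:
    exact = norm.sf(x)                                        # 1 - Phi(x)
    approx = np.exp(-x ** 2 / 2) / (np.sqrt(2 * np.pi) * x)   # right side of (47)
    print(x, exact, approx, approx / exact)                   # ratio -> 1, error O(x^-2)
```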

Sum of probabilities: simple cases of equal indices. We next proceed with the case where \(i=j\), for which we get

$$\begin{aligned} \sum _{(i,i)} {\mathbb {P}}\big (A_{(i,i)}\big ) =n{\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\big ). \end{aligned}$$
(49)

In this paragraph, we deal with the ‘simple’ case where \(\rho u^{\scriptscriptstyle (1)}\ge u^{\scriptscriptstyle (2)}\), leaving the other cases to the next paragraph. We use that, conditionally on \(Z^{\scriptscriptstyle \mathrm (1)}\), the law of \(Z^{\scriptscriptstyle \mathrm (2)}\) equals that of \(\rho Z^{\scriptscriptstyle \mathrm (1)} +\sqrt{1-\rho ^2}Z\), where Z is a standard normal independent of \(Z^{\scriptscriptstyle \mathrm (1)}\). We thus get that

$$\begin{aligned}&{\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\big )\nonumber \\&\quad ={\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, \rho Z^{\scriptscriptstyle \mathrm (1)} +\sqrt{1-\rho ^2}Z> a_n u^{\scriptscriptstyle (2)}\big )\nonumber \\&\quad ={\mathbb {E}}\Big [{\mathchoice{1\mathrm l}{1\mathrm l}{1\mathrm l}{1\mathrm l}}_{\{Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}\}}{\mathbb {P}}\big (\sqrt{1-\rho ^2} Z>a_n u^{\scriptscriptstyle (2)}-\rho Z^{\scriptscriptstyle \mathrm (1)}\mid Z^{\scriptscriptstyle \mathrm (1)} \big )\Big ]. \end{aligned}$$
(50)

When \(\rho u^{\scriptscriptstyle (1)}>u^{\scriptscriptstyle (2)}\),

$$\begin{aligned} {\mathbb {P}}\big (\sqrt{1-\rho ^2} Z>a_n u^{\scriptscriptstyle (2)}-\rho Z^{\scriptscriptstyle \mathrm (1)}\mid Z^{\scriptscriptstyle \mathrm (1)} \big )=1+o(1), \end{aligned}$$
(51)

so that

$$\begin{aligned} \sum _{(i,i)} {\mathbb {P}}\big (A_{(i,i)}\big ) =n{\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}\big )(1+o(1)) =\frac{n}{\sqrt{2\pi }\, u^{\scriptscriptstyle (1)} a_n}\mathrm{e}^{-a_n^2 (u^{\scriptscriptstyle \mathrm (1)})^2/2}(1+o(1)). \end{aligned}$$
(52)

When \(\rho u^{\scriptscriptstyle (1)}=u^{\scriptscriptstyle (2)}\), instead

$$\begin{aligned} {\mathbb {P}}\big (\sqrt{1-\rho ^2} Z>a_n u^{\scriptscriptstyle (2)}-\rho Z^{\scriptscriptstyle \mathrm (1)}\mid Z^{\scriptscriptstyle \mathrm (1)} \big )=\frac{1}{2}+o(1), \end{aligned}$$
(53)

since \(Z^{\scriptscriptstyle \mathrm (1)}-a_n u^{\scriptscriptstyle (1)}=o_{\scriptscriptstyle {{\mathbb {P}}}}(1)\) conditionally on \(Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}\). Thus, for \(\rho u^{\scriptscriptstyle (1)}=u^{\scriptscriptstyle (2)}\), this leads to

$$\begin{aligned} \sum _{(i,i)} {\mathbb {P}}\big (A_{(i,i)}\big ) =\frac{n}{2}\,{\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}\big )(1+o(1)) =\frac{n}{2\sqrt{2\pi }\, u^{\scriptscriptstyle (1)} a_n}\mathrm{e}^{-a_n^2 (u^{\scriptscriptstyle \mathrm (1)})^2/2}(1+o(1)). \end{aligned}$$
(54)

Sum of probabilities: Laplace integral for equal indices. When the above simple cases do not apply, we write \({\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\big )\) explicitly as a two-dimensional integral as

$$\begin{aligned} {\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\big ) =\frac{1}{2\pi \sqrt{1-\rho ^2}}\int _{a_n u^{\scriptscriptstyle (1)}}^{\infty } \int _{a_n u^{\scriptscriptstyle (2)}}^{\infty } \mathrm{e}^{-(x_1^2 -2\rho x_1x_2 +x_2^2)/(2(1-\rho ^2))}\hbox {d}x_2\hbox {d}x_1. \end{aligned}$$
(55)

We rescale the integration variables by \(a_n\) to obtain

$$\begin{aligned} {\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\big ) =\frac{a_n^2}{2\pi \sqrt{1-\rho ^2}}\int _{u^{\scriptscriptstyle (1)}}^{\infty } \int _{u^{\scriptscriptstyle (2)}}^{\infty } \mathrm{e}^{-a_n^2(x_1^2 -2\rho x_1x_2 +x_2^2)/(2(1-\rho ^2))}\hbox {d}x_2\hbox {d}x_1. \end{aligned}$$
(56)

This is a classic example of a Laplace integral. Thus, the integral is dominated by the minimum of \((x_1^2 -2\rho x_1x_2 +x_2^2)/(2(1-\rho ^2))\) over all \((x_1,x_2)\) for which \(x_1\ge u^{\scriptscriptstyle (1)}, x_2\ge u^{\scriptscriptstyle (2)}\). Since \((x_1^2 -2\rho x_1x_2 +x_2^2)/(2(1-\rho ^2))\) is convex, this minimum is attained on the boundary. Since \(\rho u^{\scriptscriptstyle (1)}<u^{\scriptscriptstyle (2)}\), this minimum is attained at \(x_1=u^{\scriptscriptstyle (1)}, x_2=u^{\scriptscriptstyle (2)}\) (see also the analysis in Lemma 2).

Thus,

$$\begin{aligned}&{\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\big )=\frac{a_n^2}{2\pi \sqrt{1-\rho ^2}}\exp {\left\{ -\,a_n^2 \frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\right\} }\nonumber \\&\qquad \times \,\int _{u^{\scriptscriptstyle (1)}}^{\infty } \int _{u^{\scriptscriptstyle (2)}}^{\infty } \exp {\Big \{-a_n^2\frac{(x_1^2 -2\rho x_1x_2 +x_2^2)-(u^{\scriptscriptstyle (1)})^2+2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}-(u^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\Big \}} \hbox {d}x_2\hbox {d}x_1. \end{aligned}$$
(57)

Therefore, we obtain that

$$\begin{aligned}&2\pi \sqrt{1-\rho ^2}\, a_n^{-2} \exp {\left\{ a_n^2 \frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\right\} }{\mathbb {P}}\left( Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\right) \nonumber \\&\quad =\int _{u^{\scriptscriptstyle (1)}}^{\infty } \int _{u^{\scriptscriptstyle (2)}}^{\infty } \exp {\bigg \{-\,a_n^2\frac{(x_1-u^{\scriptscriptstyle (1)})^2 -2\rho (x_1-u^{\scriptscriptstyle (1)})(x_2-u^{\scriptscriptstyle (2)})+(x_2-u^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\bigg \}}\nonumber \\&\qquad \times \exp {\left\{ -\,a_n^2\frac{2u^{\scriptscriptstyle (1)}(x_1-u^{\scriptscriptstyle (1)})+2u^{\scriptscriptstyle (2)}(x_2-u^{\scriptscriptstyle (2)})-2\rho u^{\scriptscriptstyle (1)}(x_2-u^{\scriptscriptstyle (2)})-2\rho u^{\scriptscriptstyle (2)}(x_1-u^{\scriptscriptstyle (1)})}{2(1-\rho ^2)}\right\} } \hbox {d}x_2\hbox {d}x_1. \end{aligned}$$
(58)

Since \(\rho u^{\scriptscriptstyle (1)}<u^{\scriptscriptstyle (2)}\), we have that the quadratic function inside the exponential is minimized for \(x_1=u^{\scriptscriptstyle (1)}, x_2=u^{\scriptscriptstyle (2)}\). Shifting the integration variables by \(u^{\scriptscriptstyle (1)}\) and \(u^{\scriptscriptstyle (2)}\), respectively, leads to

$$\begin{aligned}&2\pi \sqrt{1-\rho ^2}\, a_n^{-2} \exp {\left\{ a_n^2 \frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\right\} }{\mathbb {P}}\left( Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\right) \nonumber \\&\quad =\int _{0}^{\infty } \int _{0}^{\infty } \exp {\left\{ -\,a_n^2\frac{x_1^2 -2\rho x_1x_2 +x_2^2}{2(1-\rho ^2)}\right\} }\nonumber \\&\qquad \times \exp {\left\{ -\,a_n^2\frac{2u^{\scriptscriptstyle (1)}x_1+2u^{\scriptscriptstyle (2)}x_2 -2\rho u^{\scriptscriptstyle (1)}x_2-2\rho u^{\scriptscriptstyle (2)}x_1}{2(1-\rho ^2)}\right\} }\hbox {d}x_2\hbox {d}x_1. \end{aligned}$$
(59)

Now, rescaling both integration variables by \(a_n^{-2}\) leads to

$$\begin{aligned}&2\pi \sqrt{1-\rho ^2}\, a_n^{2} \exp {\left\{ a_n^2 \frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\right\} }{\mathbb {P}}\left( Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\right) \nonumber \\&\quad =\int _{0}^{\infty } \int _{0}^{\infty } \exp {\left\{ -\,a_n^{-2}\frac{x_1^2 -2\rho x_1x_2 +x_2^2}{2(1-\rho ^2)}\right\} } \nonumber \\&\qquad \times \exp {\left\{ -\frac{(u^{\scriptscriptstyle (1)}-\rho u^{\scriptscriptstyle (2)})x_1+(u^{\scriptscriptstyle (2)}-\rho u^{\scriptscriptstyle (1)})x_2}{1-\rho ^2}\right\} }\hbox {d}x_2\hbox {d}x_1. \end{aligned}$$
(60)

Again we see the significance of the assumption that \(\rho u^{\scriptscriptstyle (1)}<u^{\scriptscriptstyle (2)}\), which implies that both linear terms have a negative coefficient, and, thus, the exponential functions are integrable. As a result, the first exponential in the integral only leads to an error term, so that

$$\begin{aligned}&2\pi \sqrt{1-\rho ^2}\, a_n^{2} \exp {\left\{ a_n^2 \frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{2(1-\rho ^2)}\right\} }{\mathbb {P}}\left( Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\right) \nonumber \\&\quad =(1+o(1))\int _{0}^{\infty } \int _{0}^{\infty }\exp {\left\{ -\frac{(u^{\scriptscriptstyle (1)}-\rho u^{\scriptscriptstyle (2)})x_1+(u^{\scriptscriptstyle (2)}-\rho u^{\scriptscriptstyle (1)})x_2}{1-\rho ^2}\right\} }\hbox {d}x_2\hbox {d}x_1\nonumber \\&\quad =(1+o(1))\, \frac{(1-\rho ^2)^2}{(u^{\scriptscriptstyle (1)}-\rho u^{\scriptscriptstyle (2)})(u^{\scriptscriptstyle (2)}-\rho u^{\scriptscriptstyle (1)})}. \end{aligned}$$
(61)

Combining (48) and (52)–(54) with (61) yields the asymptotics of the sum of probabilities. Note that the final outcome yields (36) in the asymmetric case, so what is left is to show that the error term \(e_n(u)\) is of smaller order.
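The outcome of the Laplace computation (61) can be verified directly against the bivariate orthant probability; the sketch below (parameters of our choosing, with the cdf routine again limiting the tail depth) normalizes the probability by the predicted asymptotics.

```python
import numpy as np
from scipy.stats import multivariate_normal

rho, u1, u2 = 0.3, 1.0, 0.8  # middle cone: rho*u1 < u2 <= u1
Q = (u1 ** 2 - 2 * rho * u1 * u2 + u2 ** 2) / (2 * (1 - rho ** 2))
C = (1 - rho ** 2) ** 1.5 / (2 * np.pi * (u1 - rho * u2) * (u2 - rho * u1))
mvn = multivariate_normal(cov=[[1.0, rho], [rho, 1.0]])
for x in [2.0, 3.0, 4.0]:
    p = mvn.cdf([-x * u1, -x * u2])  # = P(Z^(1) > x*u1, Z^(2) > x*u2) by symmetry
    print(x, p * x ** 2 * np.exp(x ** 2 * Q) / C)  # approaches 1 as x grows
```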

The symmetric case: sum of probabilities. We now look at the sum of probabilities for the symmetric case and analyze \({\mathbb {P}}(A_{(i,j)})\) there. The analysis for the case where \(i\ne j\) is identical to the one above, except for the fact that the prefactor (due to the number of pairs \((i,j)\)) is changed from \(n(n-1)\) to \(n(n-1)/2\). The contribution for the case where \(i=j\) is also the same as above. In fact, it is easy to see that for \(u^{\scriptscriptstyle (1)}=u^{\scriptscriptstyle (2)}=u\), we have \(I((u,u))=u^2\) when \(\rho \le 0\), while \(I((u,u))=u^2/(1+\rho )\) for \(\rho >0\), since

$$\begin{aligned} \frac{(u^{\scriptscriptstyle (1)})^2-2\rho u^{\scriptscriptstyle (1)} u^{\scriptscriptstyle (2)}+(u^{\scriptscriptstyle (2)})^2}{1-\rho ^2} =u^2 \frac{2(1-\rho )}{1-\rho ^2}=u^2 \frac{2}{1+\rho }<2u^2, \end{aligned}$$
(62)

precisely when \(\rho >0\).

The error term \(e_n(u)\): asymmetric case. In dealing with error terms, we will make essential use of the fact that \(a_n\gg \sqrt{\log {n}}\). This condition implies that if a certain event A satisfies \({\mathbb {P}}(A)\le \mathrm{e}^{-a_n^2 J}\) for some \(J>I(u)\), then \({\mathbb {P}}(A)\) will constitute an error term in evaluating \({\mathbb {P}}({\bar{Z}}_n>a_n u)\), irrespective of the precise powers of \(a_n\) and n. Recall (45). We investigate the different ways that \((i,j)\ne (k,l)\) can occur, depending on the cardinality of \(\{i,j,k,l\}\), which ranges from 2 to 4. Below, the indices \(i,j,k,l\) are assumed to be distinct.

  • Case (2a): \((i,j), (j,i)\). This corresponds to \({\mathbb {P}}(\bar{Z}_n>a_n \bar{u})\), where \(\bar{u}=(u^{\scriptscriptstyle (1)}\vee u^{\scriptscriptstyle (2)}, u^{\scriptscriptstyle (1)}\vee u^{\scriptscriptstyle (2)})\) and \(x\vee y=\max \{x,y\}\) for \(x,y\in {{\mathbb {R}}}\). This case was investigated in the previous step, and we see that the rate at speed \(a_n^2\) equals \(I(\bar{u})>I(u)\), since \(u^{\scriptscriptstyle (1)}\ne u^{\scriptscriptstyle (2)}\).

  • Case (2b): \((i,i), (i,j)\) or \((i,i), (j,i)\). By independence, these probabilities equal \({\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\big ){\mathbb {P}}(Z>a_nu^{\scriptscriptstyle (2)})\) and \({\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u^{\scriptscriptstyle (1)}, Z^{\scriptscriptstyle \mathrm (2)}> a_n u^{\scriptscriptstyle (2)}\big ){\mathbb {P}}(Z>a_nu^{\scriptscriptstyle (1)})\), respectively. Obviously, the rate at speed \(a_n^2\) is strictly larger than I(u).

  • Case (3a): \((i,i), (j,k)\). By independence, this probability equals \({\mathbb {P}}(A_{(i,i)}){\mathbb {P}}(A_{(j,k)})\), the rate at speed \(a_n^2\) again being strictly larger than I(u).

  • Case (3b): \((i,j), (j,k)\). By independence, this probability equals \({\mathbb {P}}(A_{(j,j)}){\mathbb {P}}(Z>a_nu^{\scriptscriptstyle (1)}){\mathbb {P}}(Z>a_nu^{\scriptscriptstyle (2)})\), the rate at speed \(a_n^2\) again being strictly larger than I(u).

  • Case (3c): \((i,j), (k,j)\). This case is similar.

  • Case (4): \((i,j), (k,l)\). By independence, this probability equals \({\mathbb {P}}(A_{(i,j)}){\mathbb {P}}(A_{(k,l)})\), the rate at speed \(a_n^2\) being at least 2I(u), which is again strictly larger than I(u).

Together, these cases show that \(e_n(u)\) is of smaller order than the sum of probabilities in the asymmetric case.

The error term \(e_n(u)\): symmetric case. This analysis is similar to the asymmetric case, except that some cases do not arise. We again go through the distinct possibilities, writing \(u=u^{\scriptscriptstyle (1)}=u^{\scriptscriptstyle (2)}\):

  • Case (2): \((i,i), (i,j)\) or \((i,i), (j,i)\). By independence, this probability equals \({\mathbb {P}}\big (Z^{\scriptscriptstyle \mathrm (1)}>a_n u, Z^{\scriptscriptstyle \mathrm (2)}> a_n u\big ){\mathbb {P}}(Z>a_nu)\). Obviously, the rate at speed \(a_n^2\) is strictly larger than \(I((u,u))\).

  • Case (3a): \((i,i), (j,k)\). By independence, this probability equals \({\mathbb {P}}(A_{(i,i)}){\mathbb {P}}(A_{(j,k)})\), the rate at speed \(a_n^2\) again being strictly larger than \(I((u,u))\).

  • Case (3b): \((i,j), (j,k)\). By independence, this probability equals \({\mathbb {P}}(A_{(j,j)}){\mathbb {P}}(Z>a_nu){\mathbb {P}}(Z>a_nu)\), the rate at speed \(a_n^2\) again being strictly larger than \(I((u,u))\).

  • Case (4): \((i,j), (k,l)\). By independence, this probability equals \({\mathbb {P}}(A_{(i,j)}){\mathbb {P}}(A_{(k,l)})\), the rate at speed \(a_n^2\) being at least \(2I((u,u))\), which is again strictly larger than \(I((u,u))\).

Together, these cases show that \(e_n(u)\) is also of smaller order than the sum of probabilities in the symmetric case. \(\square \)