Appendix
Details of proofs using Transform method
Proof of Lemma 2
Proof
(of Lemma 2) We omit the dependence on N and t of the variables, for ease of exposition. By definition of indicator function, for any \(i\in [N]\) we have
$$\begin{aligned} \mathbbm {1}_{\left\{ q_i=0 \right\} }\exp \left( {\scriptstyle \theta N^{-\alpha }q_\Sigma }\right)&= \mathbbm {1}_{\left\{ q_i=0 \right\} }\exp \left( {\scriptstyle -\theta N^{1-\alpha }q_i}\right) \exp \left( {\scriptstyle \theta N^{-\alpha }q_\Sigma }\right) \\&{\mathop {=}\limits ^{(a)}} \mathbbm {1}_{\left\{ q_i=0 \right\} } + \mathbbm {1}_{\left\{ q_i=0 \right\} } \left( \exp \left( {\scriptstyle -\theta N^{1-\alpha }q_{\perp i}}\right) -1\right) , \end{aligned}$$
where \(q_{\perp i}\) is the \(i{}^{\text {th}}\) component of \({\varvec{q}}_\perp \). Here, (a) holds by definition of \({\varvec{q}}_\perp \) according to (2), and after adding and subtracting \(\mathbbm {1}_{\left\{ q_i=0 \right\} }\). Then, recalling the definition of \(\phi ({\varvec{q}},N)\) and reorganizing terms we obtain
$$\begin{aligned} \phi ({\varvec{q}},N)&{\mathop {=}\limits ^{\triangle }}\left( \exp \left( {\scriptstyle \theta N^{-\alpha }q_\Sigma }\right) - 1\right) \left( \sum _{i=1}^N \mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \\&= \sum _{i=1}^N \mathbbm {1}_{\left\{ q_i=0 \right\} } \left( \exp \left( {\scriptstyle -\theta N^{1-\alpha }q_{\perp i}}\right) -1\right) . \end{aligned}$$
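Step (a) of the first display rests on a one-line identity for the exponent. As a sketch, assuming (as the projection in (2) suggests) that \(q_\Sigma =\sum _{j=1}^N q_j\) and that every component of \({\varvec{q}}_\parallel \) equals \(q_\Sigma /N\), we have
$$\begin{aligned} -\theta N^{1-\alpha } q_{\perp i} = -\theta N^{1-\alpha }\left( q_i - \frac{q_\Sigma }{N}\right) = -\theta N^{1-\alpha } q_i + \theta N^{-\alpha } q_\Sigma , \end{aligned}$$
so the product of the two exponentials multiplying \(\mathbbm {1}_{\left\{ q_i=0 \right\} }\) collapses to \(\exp \left( -\theta N^{1-\alpha }q_{\perp i}\right) \).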
We now compute the desired bound. We have
$$\begin{aligned} \left| {\mathbb {E}}\left[ \phi ({\overline{{\varvec{q}}}},N) \right] \right|&{\mathop {\le }\limits ^{(a)}} {\mathbb {E}}\left[ \sum _{i=1}^N \mathbbm {1}_{\left\{ {\overline{q}}_i=0 \right\} } \left| \exp \left( {\scriptstyle -\theta N^{1-\alpha } {\overline{q}}_{\perp i}}\right) -1 \right| \right] \nonumber \\&{\mathop {\le }\limits ^{(b)}} |\theta | N^{1-\alpha } {\mathbb {E}}\left[ \sum _{i=1}^N \mathbbm {1}_{\left\{ {\overline{q}}_i=0 \right\} }|{\overline{q}}_{\perp i}| \exp \left( {\scriptstyle |\theta | N^{1-\alpha }|{\overline{q}}_{\perp i}|}\right) \right] \nonumber \\&{\mathop {\le }\limits ^{(c)}} |\theta | N^{1-\alpha } {\mathbb {E}}\left[ \sum _{i=1}^N \mathbbm {1}_{\left\{ {\overline{q}}_i=0 \right\} } \right] ^{1-\frac{1}{r}} {\mathbb {E}}\left[ \sum _{i=1}^N |{\overline{q}}_{\perp i}|^r \exp \left( {\scriptstyle |\theta | N^{1-\alpha } r |{\overline{q}}_{\perp i}|}\right) \right] ^{\frac{1}{r}} \nonumber \\&{\mathop {=}\limits ^{(d)}} |\theta | N^{(1-\alpha )\left( 2-\frac{1}{r}\right) } {\mathbb {E}}\left[ \sum _{i=1}^N |{\overline{q}}_{\perp i}|^r \exp \left( {\scriptstyle |\theta | N^{1-\alpha } r |{\overline{q}}_{\perp i}|}\right) \right] ^{\frac{1}{r}}, \end{aligned}$$
(25)
where \(r>1\). Here, (a) holds by the triangle inequality; (b) holds because \(|\exp \left( {\scriptstyle x}\right) -1|\le |x|\exp \left( {\scriptstyle |x|}\right) \) for all \(x\in {\mathbb {R}}\); (c) holds by Hölder’s inequality for the vectors \({\varvec{X}}\) and \({\varvec{Y}}\) with elements \(X_i=\mathbbm {1}_{\left\{ {\overline{q}}_i=0 \right\} }\) and \(Y_i=|{\overline{q}}_{\perp i}| \exp \left( {\scriptstyle |\theta | N^{1-\alpha } |{\overline{q}}_{\perp i}|}\right) \) for \(i\in [N]\), and noticing that every positive power of \(X_i\) equals \(X_i\) because it is an indicator function; and (d) holds by Lemma 1.
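The elementary inequality used in (b) follows from the power series of the exponential. A short derivation, valid for every \(x\in {\mathbb {R}}\), is
$$\begin{aligned} \left| \exp (x)-1\right| = \left| \sum _{k=1}^\infty \frac{x^k}{k!}\right| \le |x| \sum _{k=1}^\infty \frac{|x|^{k-1}}{k!} \le |x| \sum _{k=1}^\infty \frac{|x|^{k-1}}{(k-1)!} = |x|\exp (|x|), \end{aligned}$$
where the last inequality uses \(k!\ge (k-1)!\). In (25) it is applied with \(x=-\theta N^{1-\alpha }{\overline{q}}_{\perp i}\).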
Now we bound the expectation in (25) using properties of norms, Cauchy-Schwarz inequality and SSC. For \(r\ge 2\) we have
$$\begin{aligned}&{\mathbb {E}}\left[ \sum _{i=1}^N |{\overline{q}}_{\perp i}|^r \exp \left( {\scriptstyle |\theta | N^{1-\alpha } r |{\overline{q}}_{\perp i}|}\right) \right] ^{\frac{1}{r}} \nonumber \\&{\mathop {\le }\limits ^{(a)}} {\mathbb {E}}\left[ \left\| {\overline{{\varvec{q}}}}_\perp \right\| ^r_r \exp \left( {\scriptstyle |\theta | N^{1-\alpha } r \Vert {\overline{{\varvec{q}}}}_\perp \Vert }\right) \right] ^{\frac{1}{r}} \nonumber \\&{\mathop {\le }\limits ^{(b)}} {\mathbb {E}}\left[ \left\| {\overline{{\varvec{q}}}}_\perp \right\| ^r \exp \left( {\scriptstyle |\theta | N^{1-\alpha } r \Vert {\overline{{\varvec{q}}}}_\perp \Vert }\right) \right] ^{\frac{1}{r}} \nonumber \\&{\mathop {\le }\limits ^{(c)}} {\mathbb {E}}\left[ \Vert {\overline{{\varvec{q}}}}_\perp \Vert ^{2r} \right] ^{\frac{1}{2r}} {\mathbb {E}}\left[ \exp \left( {\scriptstyle |\theta | N^{1-\alpha } 2r \Vert {\overline{{\varvec{q}}}}_\perp \Vert }\right) \right] ^{\frac{1}{2r}}, \nonumber \\ \end{aligned}$$
(26)
where (a) holds using that \(|{\overline{q}}_{\perp i}|\le \Vert {\overline{{\varvec{q}}}}_\perp \Vert \) in the exponent and by definition of the r-norm; (b) holds because the r-norm is no larger than the Euclidean norm for all \(r\ge 2\); and (c) holds by the Cauchy-Schwarz inequality.
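Step (c) can be unpacked as follows. Writing, for this sketch only, \(A=\Vert {\overline{{\varvec{q}}}}_\perp \Vert \) and \(B=\exp \left( |\theta | N^{1-\alpha } r A\right) \) (shorthand that is not used elsewhere), the Cauchy-Schwarz inequality gives
$$\begin{aligned} {\mathbb {E}}\left[ A^r B\right] \le {\mathbb {E}}\left[ A^{2r}\right] ^{\frac{1}{2}} {\mathbb {E}}\left[ B^2\right] ^{\frac{1}{2}}, \qquad B^2=\exp \left( |\theta | N^{1-\alpha }\, 2r A\right) , \end{aligned}$$
and raising both sides to the power \(1/r\) yields the exponents \(\frac{1}{2r}\) that appear in (26).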
Now we bound each of the terms in (26) using SSC. From Proposition 1, recall that for every positive integer k we have
$$\begin{aligned} {\mathbb {E}}\left[ \left\| {\overline{{\varvec{q}}}}_\perp \right\| ^k \right] ^{\frac{1}{k}} \le {\overline{C}}k \left( \dfrac{N^2}{d-1}\right) , \end{aligned}$$
and for every \(\theta ^*\) satisfying \(|\theta ^*| < \frac{1}{2}\log \left( 1 + \tfrac{\lambda _0(d-1)}{2N^2}\right) \) we have
$$\begin{aligned} {\mathbb {E}}\left[ \exp \left( {\scriptstyle \theta ^* \left\| {\overline{{\varvec{q}}}}_\perp \right\| }\right) \right] \le \dfrac{\lambda _0(d-1)\exp \left( {\scriptstyle \tfrac{2\theta ^*N^2}{\lambda _0(d-1)}}\right) }{\lambda _0(d-1) + 2N^2\left( 1- \exp \left( {\scriptstyle 2\theta ^*}\right) \right) }. \end{aligned}$$
Using these results in (26) with \(k=2r\) and \(\theta ^*=2|\theta |rN^{1-\alpha }\), we obtain
$$\begin{aligned}&{\mathbb {E}}\left[ \sum _{i=1}^N |{\overline{q}}_{\perp i}|^r \exp \left( {\scriptstyle |\theta | N^{1-\alpha } r |{\overline{q}}_{\perp i}|}\right) \right] ^{\frac{1}{r}} \\&\le 2{\overline{C}}\lambda _0 \left( \dfrac{ rN^2 \exp \left( {\scriptstyle \tfrac{4|\theta | r N^{3-\alpha }}{\lambda _0(d-1)}}\right) }{\lambda _0(d-1) + 2N^2\left( 1-\exp \left( {\scriptstyle 4|\theta | r N^{1-\alpha }}\right) \right) }\right) . \end{aligned}$$
Using this result in (25), we obtain
$$\begin{aligned} \left| {\mathbb {E}}\left[ \phi \left( {\overline{{\varvec{q}}}},N\right) \right] \right|&\le 2{\overline{C}}\lambda _0 |\theta | \left( \dfrac{rN^{(1-\alpha )\left( 2-\frac{1}{r}\right) } N^2 \exp \left( {\scriptstyle \tfrac{4|\theta | r N^{3-\alpha }}{\lambda _0(d-1)}}\right) }{\lambda _0(d-1) + 2N^2\left( 1 - \exp \left( {\scriptstyle 4|\theta | r N^{1-\alpha }}\right) \right) } \right) . \end{aligned}$$
Since this upper bound holds for every \(r\ge 2\), we minimize it with respect to r and find that \(r = \lceil \alpha -1\rceil \lceil \log (N)\rceil \) gives the tightest bound. Substituting this value, we obtain the result. \(\square \)
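To see why a value of r of order \((\alpha -1)\log (N)\) is natural, note that r enters the power of N in the bound only through the factor \(N^{-(1-\alpha )/r}\). A rough check, assuming \(\alpha >1\) and \(N\ge 2\), is
$$\begin{aligned} N^{-\frac{1-\alpha }{r}} = \exp \left( \frac{(\alpha -1)\log (N)}{r}\right) \le e \quad \text {whenever } r\ge (\alpha -1)\log (N), \end{aligned}$$
and \(\lceil \alpha -1\rceil \lceil \log (N)\rceil \ge (\alpha -1)\log (N)\). This choice therefore keeps the factor bounded by a constant while r grows only logarithmically in N. This is a heuristic reading of the minimization, not a full optimization, since r also appears in the exponential terms.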
Proof of Lemma 3
In this proof we use the definition of drift and we reorganize terms appropriately.
Proof
(of Lemma 3) We have:
$$\begin{aligned}&\Delta V_\parallel ({\varvec{q}}) \\&= \lambda N \sum _{i=1}^N \dfrac{\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \left( \left\| \left( {\varvec{q}}+{\varvec{e}}^{(\psi _{{\varvec{q}}}(i))}\right) _\parallel \right\| ^2 - \left\| {\varvec{q}}_\parallel \right\| ^2 \right) + \sum _{i=1}^N \left( 1-\mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \\&\quad \times \left( \left\| \left( {\varvec{q}}-{\varvec{e}}^{(i)}\right) _\parallel \right\| ^2 - \left\| {\varvec{q}}_\parallel \right\| ^2 \right) \\&{\mathop {=}\limits ^{(a)}} \lambda \sum _{i=1}^N \dfrac{\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \left( 1 + 2\sum _{j=1}^N q_j\right) + \frac{1}{N} \sum _{i=1}^N \left( 1-\mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \left( 1-2\sum _{j=1}^N q_j \right) , \end{aligned}$$
where (a) holds by the definition of \({\varvec{x}}_\parallel \) given a vector \({\varvec{x}}\) in (2), and computing the norms. This completes the proof. \(\square \)
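The computation behind (a) is short. As a sketch, assuming the projection in (2) is \({\varvec{x}}_\parallel =\frac{1}{N}\left( \sum _{j=1}^N x_j\right) {\varvec{1}}\), so that \(\left\| {\varvec{x}}_\parallel \right\| ^2=\frac{1}{N}\left( \sum _{j=1}^N x_j\right) ^2\), adding or removing one job at any queue k changes the total by \(\pm 1\) and
$$\begin{aligned} \left\| \left( {\varvec{q}}\pm {\varvec{e}}^{(k)}\right) _\parallel \right\| ^2 - \left\| {\varvec{q}}_\parallel \right\| ^2 = \frac{1}{N}\left[ \left( \sum _{j=1}^N q_j \pm 1\right) ^2 - \left( \sum _{j=1}^N q_j\right) ^2\right] = \frac{1}{N}\left( 1\pm 2\sum _{j=1}^N q_j\right) . \end{aligned}$$
The factor \(\frac{1}{N}\) cancels the N in the arrival rate \(\lambda N\) and remains in front of the departure term.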
Details of the Proof of Proposition 1
Proof of Lemma 6
In the proof of Lemma 6, we use the bound (18) to compute an upper bound on the moment generating function of \(Z({\overline{X}})\).
Proof
(of Lemma 6) First observe that \(Z({\overline{X}})\ge 0\) by assumption of Lemma 5. Then,
$$\begin{aligned} \exp \left( {\scriptstyle \theta Z({\overline{X}})}\right) \le \exp \left( {\scriptstyle |\theta | Z({\overline{X}})}\right) . \end{aligned}$$
We compute an upper bound for \({\mathbb {E}}\left[ \exp \left( {\scriptstyle |\theta | Z({\overline{X}})}\right) \right] \). Let \(F_Z(x)\) be the cumulative distribution function of \(Z({\overline{X}})\). Then,
$$\begin{aligned}&{\mathbb {E}}\left[ \exp \left( {\scriptstyle |\theta | Z({\overline{X}})}\right) \right] \\&= \int _0^\infty \exp \left( {\scriptstyle |\theta | x}\right) \,\mathrm{d}F_Z(x) \\&{\mathop {=}\limits ^{(a)}} \left[ -\exp \left( {\scriptstyle |\theta | x}\right) {\mathbb {P}}\left( Z({\overline{X}})>x\right) \right] ^\infty _0 + |\theta |\int _0^\infty \exp \left( {\scriptstyle |\theta | x}\right) {\mathbb {P}}\left( Z({\overline{X}})>x\right) \,\mathrm{d}x \\&= {\mathbb {P}}\left( Z({\overline{X}})>0\right) + |\theta | \int _0^B \exp \left( {\scriptstyle |\theta | x}\right) {\mathbb {P}}\left( Z({\overline{X}})>x\right) \,\mathrm{d}x \\ {}&\quad + |\theta | \int _B^\infty \exp \left( {\scriptstyle |\theta | x}\right) {\mathbb {P}}\left( Z({\overline{X}})>x\right) \,\mathrm{d}x \\&{\mathop {\le }\limits ^{(b)}} \exp \left( {\scriptstyle |\theta | B}\right) + \sum _{j=0}^\infty \int _{B+2\nu _{\max }j}^{B+2\nu _{\max }(j+1)} |\theta | \exp \left( {\scriptstyle |\theta | x}\right) {\mathbb {P}}\left( Z({\overline{X}})>x\right) \, \mathrm{d}x \\&{\mathop {\le }\limits ^{(c)}} \exp \left( {\scriptstyle |\theta | B}\right) + \sum _{j=0}^\infty \int _{B+2\nu _{\max }j}^{B+2\nu _{\max }(j+1)} |\theta | \exp \left( {\scriptstyle |\theta | x}\right) {\mathbb {P}}\left( Z({\overline{X}})> B+2\nu _{\max }j\right) \,\mathrm{d}x \\&{\mathop {\le }\limits ^{(d)}} \exp \left( {\scriptstyle |\theta | B}\right) + \exp \left( {\scriptstyle |\theta | B}\right) \left( \exp \left( {\scriptstyle 2|\theta |\nu _{\max }}\right) -1 \right) \left( \dfrac{G_{\max }\nu _{\max }}{G_{\max }\nu _{\max }+\gamma } \right) \\ {}&\quad \times \sum _{j=0}^\infty \left( \dfrac{G_{\max }\nu _{\max }\exp \left( {\scriptstyle 2|\theta |\nu _{\max }}\right) }{G_{\max }\nu _{\max }+\gamma } \right) ^j \\&{\mathop {=}\limits ^{(e)}} \dfrac{\exp \left( {\scriptstyle |\theta | B}\right) \gamma }{\gamma + G_{\max }\nu _{\max }(1-\exp \left( {\scriptstyle 2\nu _{\max }|\theta |}\right) )} 
\end{aligned}$$
where (a) holds integrating by parts; (b) holds because probabilities are upper bounded by 1, solving \(\int _0^B \exp \left( {\scriptstyle |\theta | x}\right) \,dx\), and breaking the last integral into intervals; (c) holds because \(f(x)=1-F_Z(x)={\mathbb {P}}\left( Z({\overline{X}})>x\right) \) is a nonincreasing function; (d) holds by (18) and solving the integral; and (e) holds after solving the geometric summation and reorganizing terms, because \(|\theta |<\frac{1}{2\nu _{\max }}\log \left( 1+\tfrac{\gamma }{G_{\max }\nu _{\max }}\right) \) by assumption and, hence, the geometric sum converges. \(\square \)
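Step (e) can be verified explicitly. Writing, for this sketch only, \(\rho =\frac{G_{\max }\nu _{\max }}{G_{\max }\nu _{\max }+\gamma }\) and \(z=\exp \left( 2|\theta |\nu _{\max }\right) \) (shorthand that is not part of the original notation), the assumption on \(|\theta |\) is equivalent to \(\rho z<1\), and
$$\begin{aligned} 1 + (z-1)\rho \sum _{j=0}^\infty (\rho z)^j = 1 + \frac{(z-1)\rho }{1-\rho z} = \frac{1-\rho }{1-\rho z} = \frac{\gamma }{\gamma + G_{\max }\nu _{\max }(1-z)}. \end{aligned}$$
Multiplying by \(\exp \left( |\theta | B\right) \) recovers the final expression.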
Proof of Lemma 7
In this proof we use the definition of drift and properties of concave functions.
Proof
(of Lemma 7) First observe that if g(x) is a differentiable concave function on \({\mathbb {R}}_+\), we have that for any \(x,y\in {\mathbb {R}}_+\)
$$\begin{aligned} g(x)-g(y)\le g'(y)(x-y). \end{aligned}$$
(27)
Now, observe that \(W_\perp ({\varvec{q}})=\left\| {\varvec{q}}_\perp \right\| =\sqrt{\left\| {\varvec{q}}_\perp \right\| ^2}\) and \(g(x)=\sqrt{x}\) is a concave function. Therefore, by definition of drift in Definition 1, and the generator matrix in (1), we have
$$\begin{aligned}&\Delta W_\perp ({\varvec{q}}) \\&= \lambda N\sum _{i=1}^N \dfrac{\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \left( W_\perp \left( {\varvec{q}}+{\varvec{e}}^{(\psi _{{\varvec{q}}}(i))}\right) - W_\perp \left( {\varvec{q}}\right) \right) \\ {}&\quad + \sum _{i=1}^N \left( 1-\mathbbm {1}_{\left\{ q_i=0 \right\} } \right) \left( W_\perp \left( {\varvec{q}}-{\varvec{e}}^{(i)}\right) - W_\perp ({\varvec{q}}) \right) \\&{\mathop {\le }\limits ^{(a)}} \lambda N \sum _{i=1}^N \dfrac{\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \left( \dfrac{\left\| \left( {\varvec{q}}+{\varvec{e}}^{(\psi _{{\varvec{q}}}(i))} \right) _\perp \right\| ^2 - \left\| {\varvec{q}}_\perp \right\| ^2}{2\left\| {\varvec{q}}_\perp \right\| } \right) \\&\quad + \sum _{i=1}^N \left( 1-\mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \left( \dfrac{\left\| \left( {\varvec{q}}- {\varvec{e}}^{(i)}\right) _\perp \right\| ^2 - \left\| {\varvec{q}}_\perp \right\| ^2}{2\left\| {\varvec{q}}_\perp \right\| } \right) \\&{\mathop {=}\limits ^{(b)}} \dfrac{\lambda N}{2\left\| {\varvec{q}}_\perp \right\| }\sum _{i=1}^N \dfrac{\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) }\left( V\left( {\varvec{q}}+{\varvec{e}}^{(\psi _{{\varvec{q}}}(i))}\right) - V({\varvec{q}}) - \left( V_\parallel \left( {\varvec{q}}+{\varvec{e}}^{(\psi _{{\varvec{q}}}(i))}\right) - V_\parallel ({\varvec{q}}) \right) \right) \\&\quad + \sum _{i=1}^N \left( \dfrac{1-\mathbbm {1}_{\left\{ q_i=0 \right\} }}{2\left\| {\varvec{q}}_\perp \right\| }\right) \left( V\left( {\varvec{q}}-{\varvec{e}}^{(i)}\right) - V({\varvec{q}}) - \left( V_\parallel \left( {\varvec{q}}-{\varvec{e}}^{(i)}\right) - V_\parallel ({\varvec{q}}) \right) \right) \\&{\mathop {=}\limits ^{(c)}} \dfrac{1}{2\left\| {\varvec{q}}_\perp \right\| }\left( \Delta V({\varvec{q}}) - \Delta 
V_{\parallel }({\varvec{q}}) \right) \end{aligned}$$
where (a) holds by (27) applied in the first and the second term in the following way. In the first term we use \(x=\left\| \left( {\varvec{q}}+{\varvec{e}}^{(\psi _{{\varvec{q}}}(i))} \right) _\perp \right\| ^2\) and \(y=\left\| {\varvec{q}}_\perp \right\| ^2\), and in the second term we use \(x=\left\| \left( {\varvec{q}}- {\varvec{e}}^{(i)}\right) _\perp \right\| ^2\) and \(y=\left\| {\varvec{q}}_\perp \right\| ^2\). Equality (b) holds by the definition of \(V(\cdot )\) and \(V_\parallel (\cdot )\) in (20) and because for any vector \({\varvec{x}}\in {\mathbb {R}}^N\), we have \(\left\| {\varvec{x}}_\perp \right\| ^2=\left\| {\varvec{x}}\right\| ^2 - \left\| {\varvec{x}}_\parallel \right\| ^2\); and (c) holds by reorganizing terms and by definition of drift. \(\square \)
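For \(g(x)=\sqrt{x}\), inequality (27) reads \(\sqrt{x}-\sqrt{y}\le \frac{x-y}{2\sqrt{y}}\), and it can be checked directly for \(x\ge 0\) and \(y>0\):
$$\begin{aligned} \frac{x-y}{2\sqrt{y}} - \left( \sqrt{x}-\sqrt{y}\right) = \frac{x-2\sqrt{xy}+y}{2\sqrt{y}} = \frac{\left( \sqrt{x}-\sqrt{y}\right) ^2}{2\sqrt{y}} \ge 0. \end{aligned}$$
Since the bound divides by \(\left\| {\varvec{q}}_\perp \right\| \), it is meaningful only for states with \({\varvec{q}}_\perp \ne {\varvec{0}}\).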
Proof of Lemma 8
In this proof we use properties of the order statistics \(q_{(i)}\) for \(i\in [N]\). Recall that \(q_{(i)}\) represents the \(i{}^{\text {th}}\) smallest element of \({\varvec{q}}\), with ties broken by the minimum index.
Proof
(of Lemma 8) We have
$$\begin{aligned}&\Delta V({\varvec{q}}) \nonumber \\&= \lambda N \sum _{i=1}^N \dfrac{\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \left( \left\| {\varvec{q}}+ {\varvec{e}}^{(\psi _{{\varvec{q}}}(i))}\right\| ^2 - \left\| {\varvec{q}}\right\| ^2 \right) \nonumber \\ {}&\quad + \sum _{i=1}^N \left( 1-\mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \left( \left\| {\varvec{q}}-{\varvec{e}}^{(i)} \right\| ^2 - \left\| {\varvec{q}}\right\| ^2\right) \nonumber \\&{\mathop {=}\limits ^{(a)}} \lambda N \sum _{i=1}^N \dfrac{\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \left( 1+2 q_{(i)}\right) + \sum _{i=1}^N (1-\mathbbm {1}_{\left\{ q_i=0 \right\} }) \left( 1-2q_i \right) \nonumber \\&{\mathop {\le }\limits ^{(b)}} N(\lambda +1) - 2(1-\lambda ) \sum _{i=1}^N q_i + 2\lambda \sum _{i=1}^N \left( \dfrac{N \left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) }-1\right) q_{(i)}, \end{aligned}$$
(28)
where (a) holds because, by definition of \(\psi _{{\varvec{q}}}(i)\), we have \(q_{\psi _{{\varvec{q}}}(i)}=q_{(i)}\); and (b) holds because \(\mathbbm {1}_{\left\{ q_i=0 \right\} }q_i=0\) for all \(i\in [N]\), because \(\sum _{i=1}^N \mathbbm {1}_{\left\{ q_i=0 \right\} }\ge 0\) and reorganizing terms.
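Both steps reduce to expanding a square in a single coordinate. As a sketch, for any index \(j\in [N]\),
$$\begin{aligned} \left\| {\varvec{q}}\pm {\varvec{e}}^{(j)}\right\| ^2 - \left\| {\varvec{q}}\right\| ^2 = (q_j\pm 1)^2 - q_j^2 = 1\pm 2q_j, \end{aligned}$$
which gives (a) with \(j=\psi _{{\varvec{q}}}(i)\) for arrivals and \(j=i\) for departures. For (b), the departure term simplifies as
$$\begin{aligned} \sum _{i=1}^N \left( 1-\mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \left( 1-2q_i\right) = N - \sum _{i=1}^N \mathbbm {1}_{\left\{ q_i=0 \right\} } - 2\sum _{i=1}^N q_i \le N - 2\sum _{i=1}^N q_i, \end{aligned}$$
and \(\sum _{i=1}^N q_{(i)}=\sum _{i=1}^N q_i\) allows the remaining terms to be grouped as in (28).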
The last step of the proof is to show that
$$\begin{aligned} \sum _{i=1}^N \left( \dfrac{N \left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) }-1\right) q_{(i)} \le -\left( \dfrac{d-1}{N}\right) \left\| {\varvec{q}}_\perp \right\| , \end{aligned}$$
(29)
which we do at the end of this section. Using the bound (29), we obtain the result. \(\square \)
In the proof of (29), we use properties of the order statistics and majorization. Specifically, we use the following lemma, which is proved in [38, Section 16.A.2.a].
Lemma 10
Consider two vectors \({\varvec{a}},{\varvec{b}}\in {\mathbb {R}}^N\). The inequality
$$\begin{aligned} \sum _{i=1}^N a_i x_{(i)}\le \sum _{i=1}^N b_i x_{(i)} \end{aligned}$$
holds for every \({\varvec{x}}\in {\mathbb {R}}^N\) if and only if the following two conditions are satisfied:
(C1) The total sums satisfy
$$\begin{aligned} \sum _{i=1}^N a_i = \sum _{i=1}^N b_i. \end{aligned}$$
(C2) For every \(k\in [N]\), the partial sums satisfy
$$\begin{aligned} \sum _{i=k}^N a_i \le \sum _{i=k}^N b_i. \end{aligned}$$
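The sufficiency of (C1) and (C2) can be seen through summation by parts. As a sketch, writing \(c_i=b_i-a_i\) (a shorthand introduced only here) and using that \(x_{(1)}\le x_{(2)}\le \cdots \le x_{(N)}\), we have
$$\begin{aligned} \sum _{i=1}^N c_i x_{(i)} = x_{(1)}\sum _{i=1}^N c_i + \sum _{k=2}^N \left( x_{(k)}-x_{(k-1)}\right) \sum _{i=k}^N c_i \ge 0, \end{aligned}$$
because the first term vanishes by (C1), the increments \(x_{(k)}-x_{(k-1)}\) are nonnegative, and the tail sums \(\sum _{i=k}^N c_i\) are nonnegative by (C2). This is only an illustration; the complete statement, including necessity, is in the cited reference.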
Now we show (29).
Proof
(of (29)) For each \(i\in [N]\) define
$$\begin{aligned} \eta _i {\mathop {=}\limits ^{\triangle }}\dfrac{ N \left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) }, \end{aligned}$$
(30)
and observe that \(\eta _i=0\) for \(i\ge N-d+2\), because then \(N-i<d-1\). Then,
$$\begin{aligned}&\sum _{i=1}^N \left( \dfrac{N \left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) }-1\right) q_{(i)} = \sum _{i=1}^N \left( \eta _i -1\right) q_{(i)}. \end{aligned}$$
Observe that \(\eta _1=d\). Then,
$$\begin{aligned}&\sum _{i=1}^N \left( \eta _i-1\right) q_{(i)}\nonumber \\&= (d-1)q_{(1)} + \sum _{i=2}^N \left( \eta _i - 1\right) q_{(i)} \nonumber \\&{\mathop {=}\limits ^{(a)}} \left( \dfrac{d-1}{N}\right) \sum _{i=1}^N \left( q_{(1)}-q_i\right) + \sum _{i=1}^N \left( \eta _i - \dfrac{N-d+1}{N}\right) q_{(i)} - (d-1)q_{(1)} , \end{aligned}$$
(31)
where (a) holds after reorganizing terms. We bound each of the terms of (31). For the first term we have
$$\begin{aligned} \left( \dfrac{d-1}{N}\right) \sum _{i=1}^N \left( q_{(1)}-q_i \right)&{\mathop {=}\limits ^{(a)}} -\left( \dfrac{d-1}{N}\right) \sum _{i=1}^N \left| q_i-q_{(1)}\right| \\&= -\left( \dfrac{d-1}{N}\right) \left\| {\varvec{q}}-q_{(1)}{\varvec{1}}\right\| _1 \\&{\mathop {\le }\limits ^{(b)}} -\left( \dfrac{d-1}{N}\right) \left\| {\varvec{q}}- q_{(1)}{\varvec{1}}\right\| \\&{\mathop {\le }\limits ^{(c)}} -\left( \dfrac{d-1}{N}\right) \left\| {\varvec{q}}_\perp \right\| , \end{aligned}$$
where (a) holds because \(q_{(1)}=\min _{i\in [N]}q_i\); (b) holds because norm-1 upper bounds the Euclidean norm; and (c) holds because, by definition of projection, the function \(g(x)=\left\| {\varvec{q}}-x{\varvec{1}}\right\| \) is minimized at \(x=\dfrac{1}{N}\sum _{i=1}^N q_i\), which equals the elements of \({\varvec{q}}_\parallel \). Then, the inequality holds by definition of \({\varvec{q}}_\perp {\mathop {=}\limits ^{\triangle }}{\varvec{q}}-{\varvec{q}}_\parallel \).
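Step (c) can also be checked with the Pythagorean identity. Writing \(m=\frac{1}{N}\sum _{i=1}^N q_i\) for the common entry of \({\varvec{q}}_\parallel \) (notation for this sketch only), the vector \({\varvec{q}}_\perp \) is orthogonal to \({\varvec{1}}\), so for every \(x\in {\mathbb {R}}\)
$$\begin{aligned} \left\| {\varvec{q}}-x{\varvec{1}}\right\| ^2 = \left\| {\varvec{q}}_\perp + (m-x){\varvec{1}}\right\| ^2 = \left\| {\varvec{q}}_\perp \right\| ^2 + N(m-x)^2 \ge \left\| {\varvec{q}}_\perp \right\| ^2. \end{aligned}$$
Taking \(x=q_{(1)}\) gives \(\left\| {\varvec{q}}-q_{(1)}{\varvec{1}}\right\| \ge \left\| {\varvec{q}}_\perp \right\| \).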
Now we only need to show that
$$\begin{aligned} \sum _{i=1}^N \eta _i q_{(i)} - (d-1)q_{(1)}\le \left( \dfrac{N-d+1}{N}\right) \sum _{i=1}^N q_{(i)}. \end{aligned}$$
We use Lemma 10 with \({\varvec{a}}\) and \({\varvec{b}}\) defined as follows:
$$\begin{aligned} a_1&{\mathop {=}\limits ^{\triangle }}\eta _1 - (d-1) = 1,\quad a_i{\mathop {=}\limits ^{\triangle }}\eta _i = \dfrac{N\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) }\quad \forall i\in [N], i\ge 2\\ b_i&{\mathop {=}\limits ^{\triangle }}\dfrac{N-d+1}{N} \quad \forall i\in [N]. \end{aligned}$$
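Both conditions of Lemma 10 can be previewed on a small instance. As an illustrative example (the values \(N=4\) and \(d=2\) are chosen here and do not come from the analysis), we have \(\eta _i=\frac{2(4-i)}{3}\) and
$$\begin{aligned} {\varvec{a}}&=\left( 1,\tfrac{4}{3},\tfrac{2}{3},0\right) ,&{\varvec{b}}&=\left( \tfrac{3}{4},\tfrac{3}{4},\tfrac{3}{4},\tfrac{3}{4}\right) ,&\sum _{i=1}^4 a_i&=\sum _{i=1}^4 b_i = 3 = N-d+1,\\ \sum _{i=2}^4 a_i&=2\le \tfrac{9}{4},&\sum _{i=3}^4 a_i&=\tfrac{2}{3}\le \tfrac{3}{2},&a_4&=0\le \tfrac{3}{4}, \end{aligned}$$
so (C1) holds with equality and (C2) holds for every k.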
We first show that condition (C1) is satisfied. To do so, we compute the sum of the elements of \({\varvec{a}}\) and \({\varvec{b}}\). For the vector \({\varvec{a}}\) we obtain
$$\begin{aligned} \sum _{i=1}^N a_i&= 1 + \dfrac{N}{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \sum _{i=2}^N \left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) {\mathop {=}\limits ^{(a)}} 1 + (N-1) \dfrac{\left( {\begin{array}{c}N-2\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N-1\\ d-1\end{array}}\right) } {\mathop {=}\limits ^{(b)}} N-d+1, \end{aligned}$$
where (a) holds after solving the summation; and (b) holds after simplifying the last term.
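The summation in (a) is an instance of the hockey-stick identity. As a sketch, for every \(k\in [N]\),
$$\begin{aligned} \sum _{i=k}^N \binom{N-i}{d-1} = \sum _{m=0}^{N-k} \binom{m}{d-1} = \binom{N-k+1}{d} = \frac{N-k+1}{d}\binom{N-k}{d-1}. \end{aligned}$$
Taking \(k=2\) and using \(\binom{N}{d}=\frac{N}{d}\binom{N-1}{d-1}\) gives
$$\begin{aligned} \frac{N}{\binom{N}{d}}\sum _{i=2}^N \binom{N-i}{d-1} = (N-1)\frac{\binom{N-2}{d-1}}{\binom{N-1}{d-1}} = (N-1)\,\frac{N-d}{N-1} = N-d. \end{aligned}$$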
For the vector \({\varvec{b}}\) we obtain
$$\begin{aligned} \sum _{i=1}^N b_i&= \sum _{i=1}^N \dfrac{N-d+1}{N} = N-d+1, \end{aligned}$$
where the last equality holds because the general term of the summation does not depend on the index i. Hence, condition (C1) is satisfied.
To prove condition (C2), we consider three cases: (i) \(k\ge N-d+2\), (ii) \(2\le k\le N-d+1\), and (iii) \(k=1\). First observe that in case (iii) the inequality trivially holds after proving (C1). Now we prove the other two cases.
We start with case (i). Since \(k\ge N-d+2\), we have \(\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) =0\), and hence \(a_i=0\), for all \(i\ge k\). Additionally, \(b_i\ge 0\) for all \(i\in [N]\) by definition. Therefore, condition (C2) is satisfied for \(k\ge N-d+2\).
For case (ii) we compute the partial sums. We obtain
$$\begin{aligned} \sum _{i=k}^N a_i&= \dfrac{N}{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \sum _{i=k}^N \left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) \nonumber \\&{\mathop {=}\limits ^{(a)}} \dfrac{N}{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \left( \dfrac{N+1-k}{d}\right) \left( {\begin{array}{c}N-k\\ d-1\end{array}}\right) \nonumber \\&{\mathop {=}\limits ^{(b)}} (N+1-k)\dfrac{\left( {\begin{array}{c}N-k\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N-1\\ d-1\end{array}}\right) } \nonumber \\&{\mathop {\le }\limits ^{(c)}} (N+1-k)\dfrac{\left( {\begin{array}{c}N-2\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N-1\\ d-1\end{array}}\right) } \nonumber \\&{\mathop {=}\limits ^{(d)}} (N+1-k)\left( \dfrac{N-d}{N-1}\right) \end{aligned}$$
(32)
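where (a) holds after solving the summation; (b) holds after reorganizing terms; (c) holds because \(k\ge 2\) implies \(\left( {\begin{array}{c}N-k\\ d-1\end{array}}\right) \le \left( {\begin{array}{c}N-2\\ d-1\end{array}}\right) \); and (d) holds after simplifying the ratio of binomial coefficients. Then, it suffices to show that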
$$\begin{aligned} (32) \le \sum _{i=k}^N b_i = \dfrac{(N-k+1)(N-d+1)}{N}, \end{aligned}$$
which is satisfied if and only if
$$\begin{aligned} \dfrac{N-d}{N-1}\le \dfrac{N-d+1}{N}. \end{aligned}$$
(33)
Reorganizing terms in (33) we see that the condition is equivalent to \(d\ge 1\), which holds by assumption. This completes the proof. \(\square \)
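The equivalence used for (33) can be checked by cross-multiplying. As a sketch, for \(N\ge 2\) both denominators are positive, so
$$\begin{aligned} \frac{N-d}{N-1}\le \frac{N-d+1}{N} \iff N(N-d)\le (N-1)(N-d+1) = N(N-d)+d-1 \iff d\ge 1. \end{aligned}$$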
Proof of Lemma 9
The goal of this section is to compute a lower bound on \(\Delta V_\parallel ({\varvec{q}})\). We use Lemma 3 (where we computed \(\Delta V_\parallel ({\varvec{q}})\)), properties of the Euclidean norm and of indicator functions.
Proof
(of Lemma 9) From Lemma 3 we have
$$\begin{aligned} \Delta V_\parallel ({\varvec{q}})&= \lambda \sum _{i=1}^N \dfrac{\left( {\begin{array}{c}N-i\\ d-1\end{array}}\right) }{\left( {\begin{array}{c}N\\ d\end{array}}\right) } \left( 1 + 2\sum _{j=1}^N q_j\right) + \frac{1}{N} \sum _{i=1}^N \left( 1-\mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \left( 1-2\sum _{j=1}^N q_j \right) \nonumber \\&{\mathop {=}\limits ^{(a)}} \lambda - 2(1-\lambda ) \sum _{i=1}^N q_i + \dfrac{1}{N}\sum _{i=1}^N \left( 1-\mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \nonumber \\ {}&\quad + \dfrac{2}{N}\left( \sum _{i=1}^N \mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \left( \sum _{i=1}^N q_i \right) \nonumber \\&{\mathop {\ge }\limits ^{(b)}} -2(1-\lambda )\sum _{i=1}^N q_i, \end{aligned}$$
(34)
where (a) holds after reorganizing terms; and (b) holds because \(\lambda \ge 0\), \(1-\mathbbm {1}_{\left\{ q_i=0 \right\} }\ge 0\) for all \(i\in [N]\), and \(\left( \sum _{i=1}^N \mathbbm {1}_{\left\{ q_i=0 \right\} }\right) \left( \sum _{i=1}^N q_i \right) \ge 0\) since every term is nonnegative. This completes the proof. \(\square \)
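Step (a) uses that the routing probabilities sum to one, that is, \(\sum _{i=1}^N \binom{N-i}{d-1}\big /\binom{N}{d}=1\) by the hockey-stick identity. Writing \(q_\Sigma =\sum _{j=1}^N q_j\) and \(n_0=\sum _{i=1}^N \mathbbm {1}_{\left\{ q_i=0 \right\} }\) for the number of empty queues (shorthand for this sketch only), the reorganization is
$$\begin{aligned} \lambda \left( 1+2q_\Sigma \right) + \frac{N-n_0}{N}\left( 1-2q_\Sigma \right)&= \lambda + 2\lambda q_\Sigma + \frac{N-n_0}{N} - 2q_\Sigma + \frac{2}{N}\,n_0\, q_\Sigma \\&= \lambda - 2(1-\lambda )q_\Sigma + \frac{N-n_0}{N} + \frac{2}{N}\,n_0\, q_\Sigma , \end{aligned}$$
which matches the second line of (34) term by term.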