CLT for linear spectral statistics of large dimensional sample covariance matrices with dependent data

Abstract

This paper investigates the central limit theorem for linear spectral statistics of high dimensional sample covariance matrices of the form \({\mathbf {B}}_n=n^{-1}\sum _{j=1}^{n}{\mathbf {Q}}{\mathbf {x}}_j{\mathbf {x}}_j^{*}{\mathbf {Q}}^{*}\) under the assumption that \(p/n\rightarrow y>0\), where \({\mathbf {Q}}\) is a \(p\times k\) nonrandom matrix and \(\{{\mathbf {x}}_j\}_{j=1}^n\) is a sequence of independent k-dimensional random vectors with independent entries. A key novelty here is that the dimension \(k\ge p\) can be arbitrary, possibly infinite. This new model of sample covariance matrix \({\mathbf {B}}_n\) covers most of the known models as its special cases. For example, standard sample covariance matrices are obtained with \(k=p\) and \({\mathbf {Q}}={\mathbf {T}}_n^{1/2}\) for some positive definite Hermitian matrix \({\mathbf {T}}_n\). Also, with \(k=\infty \) our model covers the case of repeated linear processes considered in recent high-dimensional time series literature. The CLT found in this paper substantially generalizes the seminal CLT in Bai and Silverstein (Ann Probab 32(1):553–605, 2004). Applications of this new CLT are proposed for testing the AR(1) or AR(2) structure for a causal process. Our proposed tests are then used to analyze a large fMRI data set.

Notes

  1. The recent paper by Li et al. (2018) generalized our CLT result. However, the theoretical results of Li et al. (2018) are based on our paper, which was posted on arXiv.org, and some of their techniques and results are taken from it.

References

  1. Ahn M, Shen H, Lin W, Zhu H (2015) A sparse reduced rank framework for group analysis of functional neuroimaging data. Stat Sin 25(1):295–312

  2. Bai ZD (1999) Methodologies in spectral analysis of large dimensional random matrices, a review. Stat Sin 9(3):611–677

  3. Bai ZD, Silverstein JW (1998) No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices. Ann Probab 26(1):316–345

  4. Bai ZD, Silverstein JW (2004) CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann Probab 32(1):553–605

  5. Bai ZD, Silverstein JW (2010) Spectral analysis of large dimensional random matrices. Science Press, Beijing

  6. Bai ZD, Zhou W (2008) Large sample covariance matrices without independence structures in columns. Stat Sin 18(2):425–442

  7. Bai ZD, Jiang DD, Yao JF, Zheng SR (2009) Corrections to LRT on large dimensional covariance matrix by RMT. Ann Stat 37(6B):3822–3840

  8. Bai ZD, Jiang DD, Yao JF, Zheng SR (2013) Testing linear hypotheses in high-dimensional regressions. Statistics 47(6):1207–1223

  9. Banna M, Merlevède F (2015) Limiting spectral distribution of large sample covariance matrices associated with a class of stationary processes. J Theor Probab 28(2):745–783

  10. Billingsley P (1995) Probability and measure. Wiley, New York

  11. Burkholder DL (1973) Distribution function inequalities for martingales. Ann Probab 1(1):19–42

  12. Couillet R, Debbah M (2013) Signal processing in large systems: a new paradigm. IEEE Signal Process Mag 30(1):24–39

  13. Dobriban E (2015) Efficient computation of limit spectra of sample covariance matrices. Random matrices: theory and applications 4(4):1550019

  14. Jiang DD, Bai ZD, Zheng SR (2013) Testing the independence of sets of large-dimensional variables. Sci China Math 56(1):135–147

  15. Jin BS, Wang C, Miao BQ (2009) Limiting spectral distribution of large-dimensional sample covariance matrices generated by VARMA. J Multivar Anal 100(9):2112–2125

  16. Jin BS, Wang C, Bai ZD, Nair KK, Harding M (2014) Limiting spectral distribution of a symmetrized auto-cross covariance matrix. Ann Appl Probab 24(3):1199–1225

  17. Johnstone IM (2007) High dimensional statistical inference and random matrices. In: International Congress of Mathematicians I, pp 307–333

  18. Jonsson D (1982) Some limit theorems for the eigenvalues of a sample covariance matrix. J Multivar Anal 12(1):1–38

  19. Li WM (2014) Local expectations of the population spectral distribution of a high-dimensional covariance matrix. Stat Pap 55(2):563–573

  20. Li WM, Li Z, Yao JF (2018) Joint CLT for eigenvalue statistics from several dependent large dimensional sample covariance matrices with application. Scand J Stat 45(3):699–728

  21. Lindquist MA (2008) The statistical analysis of fMRI data. Stat Sci 23(4):439–464

  22. Liu B, Xu L, Zheng SR, Tian GL (2014) A new test for the proportionality of two large-dimensional covariance matrices. J Multivar Anal 131(4):293–308

  23. Marčenko VA, Pastur LA (1967) Distribution of eigenvalues for some sets of random matrices. Math USSR Sbornik 1(1):457–483

  24. Najim J, Yao JF (2016) Gaussian fluctuations for linear spectral statistics of large random covariance matrices. Ann Appl Probab 26(3):1837–1887

  25. Pan GM (2014) Comparison between two types of large sample covariance matrices. Ann Probab Stat 50(2):655–677

  26. Pan GM, Zhou W (2008) Central limit theorem for signal-to-interference ratio of reduced rank linear receiver. Ann Appl Probab 18(3):1232–1270

  27. Paul D, Aue A (2014) Random matrix theory in statistics: a review. J Stat Plan Inference 150:1–29

  28. Silverstein JW (1995) Strong convergence of the empirical distribution of eigenvalues of large dimensional random matrices. J Multivar Anal 55(2):331–339

  29. Silverstein JW, Bai ZD (1995) On the empirical distribution of eigenvalues of a class of large dimensional random matrices. J Multivar Anal 54(2):175–192

  30. Silverstein JW, Choi SI (1995) Analysis of the limiting spectral distribution of large dimensional random matrices. J Multivar Anal 54(2):295–309

  31. Verbyla AP (1985) A note on the inverse covariance matrix of the autoregressive process. Aust J Stat 27(2):221–224

  32. Wachter KW (1978) The strong limits of random matrix spectra for sample matrices of independent elements. Ann Probab 6(1):1–18

  33. Wang QW, Yao JF (2013) On the sphericity test with large-dimensional observations. Electron J Stat 7(3):2164–2192

  34. Wang C, Jin BS, Miao BQ (2011) On limiting spectral distribution of large sample covariance matrices by VARMA(p, q). J Time Ser Anal 32(5):539–546

  35. Wong WK, Miller RB (1990) Repeated time series analysis of ARIMA-noise models. J Bus Econ Stat 8(2):243–250

  36. Wong WK, Miller RB, Shrestha K (2001) Maximum likelihood estimation of ARMA model with error processes for replicated observation. J Appl Stat Sci 10(4):287–297

  37. Xia NN, Bai ZD (2015) Functional CLT of eigenvectors for large sample covariance matrices. Stat Pap 56(1):23–60

  38. Yao JF (2012) A note on a Marčenko–Pastur type theorem for time series. Stat Probab Lett 82(1):22–28

  39. Yao JF, Zheng SR, Bai ZD (2015) Large sample covariance matrices and high-dimensional data analysis. Cambridge University Press, Cambridge

  40. Zheng SR, Bai ZD, Yao JF (2015) Substitution principle for CLT of linear spectral statistics of high-dimensional sample covariance matrices with applications to hypothesis testing. Ann Stat 43(2):546–591

  41. Zheng SR, Bai ZD, Yao JF (2017) CLT for eigenvalue statistics of large dimensional general Fisher matrices with applications. Bernoulli 23(2):1130–1178

  42. Zheng SR, Chen Z, Cui HJ, Li RZ (2019) Hypothesis testing on linear structures of high-dimensional covariance matrix. Ann Stat 47(6):3300–3334

Acknowledgements

We thank the editor and referees for their constructive and insightful comments. Shurong Zheng’s research was supported by NSFC Grants 12071066, 11690012 and also supported by 2020YFA0714102. Zhidong Bai’s research was supported by NSFC Grant 11771073.

Author information

Corresponding authors

Correspondence to Tingting Zou or Shurong Zheng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix A: Proofs of the main theorems

A.1 Proof of Theorem 1

A.1.1 Truncation, centralization and rescaling

Truncation. By Assumption (a), there exists a sequence of constants \(\eta _n\downarrow 0\) such that

$$\begin{aligned} \frac{1}{pn\eta _n^2}\sum _{i=1}^k\sum _{j=1}^n \Vert {\mathbf {q}}_{i}\Vert ^2{\mathrm{E}}|x_{ij}|^2I\Big (|x_{ij}|>\eta _n \sqrt{n}/\Vert {\mathbf {q}}_i\Vert \Big )\rightarrow 0. \end{aligned}$$
(27)

Define \({{\widehat{x}}}_{ij}=x_{ij}I(|x_{ij}|\le \eta _n\sqrt{n}/\Vert {\mathbf {q}}_i\Vert )\), \({{\widehat{{\mathbf {X}}}}}_n=({{\widehat{x}}}_{ij})\) and \({{\widehat{{\mathbf {B}}}}}_n=n^{-1}{\mathbf {Q}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*\). Applying Lemma 6, we have

$$\begin{aligned} \Vert F^{{\mathbf {B}}_n}-F^{{{\widehat{{\mathbf {B}}}}}_n}\Vert \le p^{-1}{\mathrm{rank}}({\mathbf {X}}_n-{{\widehat{{\mathbf {X}}}}}_n)\le p^{-1}\sum _{i=1}^k\sum _{j=1}^n I(|x_{ij}|>\eta _n\sqrt{n}/\Vert {\mathbf {q}}_i\Vert )\rightarrow 0 \quad a.s. \end{aligned}$$

To justify the almost sure convergence, note that

$$\begin{aligned}&{\mathrm{E}}\left( p^{-1}\sum \limits _{i=1}^k\sum \limits _{j=1}^nI(|x_{ij}|>{\eta _n\sqrt{n}}/{\Vert {\mathbf {q}}_i\Vert })\right) \\&\quad \le (pn\eta ^2_n)^{-1}\sum \limits _{i=1}^k\sum \limits _{j=1}^n\Vert {\mathbf {q}}_i\Vert ^2{\mathrm{E}}|x_{ij}|^2I(|x_{ij}|>{\eta _n\sqrt{n}}/{\Vert {\mathbf {q}}_i\Vert })\rightarrow 0, \end{aligned}$$

and

$$\begin{aligned}&{\mathrm{Var}}\left( p^{-1}\sum \limits _{i=1}^k\sum \limits _{j=1}^nI(|x_{ij}|>{\eta _n\sqrt{n}}/{\Vert {\mathbf {q}}_i\Vert })\right) \\&\quad \le p^{-2}\sum \limits _{i=1}^k\sum \limits _{j=1}^n{\mathrm{P}}(|x_{ij}|>{\eta _n\sqrt{n}}/{\Vert {\mathbf {q}}_i\Vert })\\&\quad \le (p^2n\eta ^2_n)^{-1}\sum \limits _{i=1}^k\sum \limits _{j=1}^n\Vert {\mathbf {q}}_i\Vert ^2{\mathrm{E}}|x_{ij}|^2I(|x_{ij}|>{\eta _n\sqrt{n}}/{\Vert {\mathbf {q}}_i\Vert })\\&\quad =o(p^{-1}). \end{aligned}$$

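Based on Bernstein’s inequality, we have, for any \(\epsilon >0\), \(P(p^{-1}\sum _{i=1}^k\sum _{j=1}^n I(|x_{ij}|>\eta _n\sqrt{n}/\Vert {\mathbf {q}}_i\Vert )>\epsilon )\le K\exp (-bp)\), for some constants \(K<\infty \) and \(b>0\). This bound is summable in p, so the almost sure convergence follows from the Borel–Cantelli lemma.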
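As a numerical illustration of the rank bound used in the truncation step, the following Python sketch (assuming NumPy) truncates a simulated data matrix entrywise at the levels \(\eta _n\sqrt{n}/\Vert {\mathbf {q}}_i\Vert \) and compares the sup distance between the two ESDs with \(p^{-1}{\mathrm{rank}}({\mathbf {X}}_n-{{\widehat{{\mathbf {X}}}}}_n)\). The sizes p, k, n, the Gaussian entries, the random \({\mathbf {Q}}\) and the value of `eta` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def esd_sup_distance(eigs_a, eigs_b):
    """Sup distance between two empirical spectral distribution functions."""
    a, b = np.sort(eigs_a), np.sort(eigs_b)
    grid = np.concatenate([a, b])  # the sup of |F_a - F_b| is attained at a jump point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(0)
p, k, n, eta = 60, 80, 120, 0.25            # illustrative sizes and truncation level
Q = rng.standard_normal((p, k)) / np.sqrt(k)
X = rng.standard_normal((k, n))

col_norms = np.linalg.norm(Q, axis=0)        # ||q_i|| for each column q_i of Q
thresholds = eta * np.sqrt(n) / col_norms    # eta * sqrt(n) / ||q_i||
X_hat = np.where(np.abs(X) <= thresholds[:, None], X, 0.0)

B = Q @ X @ X.T @ Q.T / n
B_hat = Q @ X_hat @ X_hat.T @ Q.T / n

dist = esd_sup_distance(np.linalg.eigvalsh(B), np.linalg.eigvalsh(B_hat))
rank_bound = np.linalg.matrix_rank(X - X_hat) / p
n_truncated = int(np.count_nonzero(X != X_hat))
print(f"truncated entries: {n_truncated}, sup distance: {dist:.4f}, rank bound: {rank_bound:.4f}")
```

The second inequality in the display corresponds to the fact that the rank of \({\mathbf {X}}_n-{{\widehat{{\mathbf {X}}}}}_n\) cannot exceed its number of nonzero entries, which is what `n_truncated` counts.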

Centralization. Define \({{\widetilde{{\mathbf {X}}}}}_n={{\widehat{{\mathbf {X}}}}}_n-{\mathrm{E}}{{\widehat{{\mathbf {X}}}}}_n\) and \({{\widetilde{{\mathbf {B}}}}}_n=n^{-1}{\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*\). By Lemma 7, we have

$$\begin{aligned} L^4(F^{{{\widehat{{\mathbf {B}}}}}_n},F^{{{\widetilde{{\mathbf {B}}}}}_n})\le 2p^{-2}n^{-2}{\mathrm{tr}}{\mathbf {Q}}({{\widehat{{\mathbf {X}}}}}_n-{{\widetilde{{\mathbf {X}}}}}_n)({{\widehat{{\mathbf {X}}}}}_n-{{\widetilde{{\mathbf {X}}}}}_n)^*{\mathbf {Q}}^* {\mathrm{tr}}({\mathbf {Q}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*+{\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*), \end{aligned}$$

where \(L(\cdot ,\cdot )\) is the Lévy distance. Notice that

$$\begin{aligned}&(pn)^{-1}{\mathrm{tr}}({\mathbf {Q}}({{\widehat{{\mathbf {X}}}}}_n-{{\widetilde{{\mathbf {X}}}}}_n)({{\widehat{{\mathbf {X}}}}}_n-{{\widetilde{{\mathbf {X}}}}}_n)^*{\mathbf {Q}}^*)\\&\quad =(pn)^{-1}{\mathrm{tr}}{\mathbf {Q}}{\mathrm{E}}{{\widehat{{\mathbf {X}}}}}_n({\mathrm{E}}{{\widehat{{\mathbf {X}}}}}_n)^*{\mathbf {Q}}^*\\&\quad =(pn)^{-1}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\left| \sum _{\ell =1}^kq_{i\ell }{\mathrm{E}}{{\hat{x}}}_{\ell j}\right| ^2\\&\quad \le n^{-1}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{\ell =1}^k|q_{i\ell }|^2 \sum \limits _{h=1}^k\frac{\Vert {\mathbf {q}}_h\Vert ^2}{pn\eta _n^2}{\mathrm{E}}|x_{hj}|^2I(|x_{hj}|>\eta _n\sqrt{n}/\Vert {\mathbf {q}}_h\Vert )\\&\qquad (\text{ by } \text{ Cauchy--Schwarz } \text{ inequality})\\&\quad =n^{-1}\sum \limits _{i=1}^p\sum \limits _{\ell =1}^k|q_{i\ell }|^2\cdot (pn\eta _n^2)^{-1}\sum \limits _{j=1}^n \sum \limits _{h=1}^k\Vert {\mathbf {q}}_h\Vert ^2{\mathrm{E}}|x_{hj}|^2I(|x_{hj}|>\eta _n\sqrt{n}/\Vert {\mathbf {q}}_h\Vert )\\&\quad =n^{-1}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^{*})(pn\eta _n^2)^{-1}\sum \limits _{j=1}^n\sum \limits _{h=1}^k\Vert {\mathbf {q}}_h\Vert ^2{\mathrm{E}}|x_{hj}|^2I(|x_{hj}|>\eta _n\sqrt{n}/\Vert {\mathbf {q}}_h\Vert )=o(1). \end{aligned}$$

Noticing that the above bound is non-random, to show \(L^4(F^{{{\widehat{{\mathbf {B}}}}}_n}, F^{{{\widetilde{{\mathbf {B}}}}}_n})\rightarrow 0\) a.s., one only needs to prove that

$$\begin{aligned}&\frac{1}{pn}{\mathrm{tr}}({\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*)=\frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\left| \sum _{\ell =1}^kq_{i\ell }{{\tilde{x}}}_{\ell j}\right| ^2\nonumber \\&\quad =\frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =1}^k|q_{i\ell }|^2|{{\tilde{x}}}_{\ell j}|^2+\frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n \sum _{k_1\ne k_2}q_{ik_1}{{\bar{q}}}_{ik_2}{{\tilde{x}}}_{k_1,j}\bar{{{\tilde{x}}}}_{k_2,j}=O_{a.s.}(1).\nonumber \\ \end{aligned}$$
(28)

Note that

$$\begin{aligned} {\mathrm{E}}\left( \frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =1}^k|q_{i\ell }|^2|{{\tilde{x}}}_{\ell j}|^2\right) \le \frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =1}^k|q_{i\ell }|^2=O(1) \end{aligned}$$

and

$$\begin{aligned}&{\mathrm{E}}\left( \frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =1}^k|q_{i\ell }|^2(|{{\tilde{x}}}_{\ell j}|^2-{\mathrm{E}}|{{\tilde{x}}}_{\ell j}|^2)\right) ^4\nonumber \\&\quad \le \frac{1}{p^4n^4}\sum _{j=1}^n\sum _{\ell =1}^k\left( \sum _{i=1}^p|q_{i\ell }|^2\right) ^4{\mathrm{E}}|{{\tilde{x}}}_{\ell j}|^8 +\frac{3}{p^4n^4}\left( \sum _{j=1}^n\sum _{\ell =1}^k\left( \sum _{i=1}^p|q_{i\ell }|^2\right) ^2{\mathrm{E}}|{{\tilde{x}}}_{\ell j}|^4\right) ^2\nonumber \\&\quad \le \frac{\eta _n^6}{p^4}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^{*})+\frac{3\eta _n^4}{p^4}{\mathrm{tr}}^2({\mathbf {Q}}{\mathbf {Q}}^{*})=o(p^{-2}). \end{aligned}$$
(29)

These inequalities simply imply \((pn)^{-1}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =1}^k|q_{i\ell }|^2|{{\tilde{x}}}_{\ell j}|^2=O_{a.s.}(1)\). Furthermore,

$$\begin{aligned}&{\mathrm{E}}\left( \frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{k_1\ne k_2}q_{ik_1}{{\bar{q}}}_{ik_2}{{\tilde{x}}}_{k_1,j}\bar{{{\tilde{x}}}}_{k_2,j}\right) ^2 \le \frac{2}{p^2n^2}\sum _{j=1}^n\sum _{k_1\ne k_2}\left| \sum _{i=1}^pq_{ik_1}{{\bar{q}}}_{ik_2}\right| ^2\nonumber \\&\quad \le \frac{2}{p^2n}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^*)^2=O(p^{-2}), \end{aligned}$$
(30)

which implies that \((pn)^{-1}\sum _{i=1}^p\sum _{j=1}^n\sum _{k_1\ne k_2}q_{ik_1}{{\bar{q}}}_{ik_2}{{\tilde{x}}}_{k_1,j}\bar{{{\tilde{x}}}}_{k_2,j}\rightarrow 0,\,a.s.\) Hence the assertion (28) is proved.

Rescaling. Denote \(\sigma _{ij}^2={\mathrm{E}}|{{\tilde{x}}}_{ij}|^2\), \(\breve{x}_{ij}=\sigma _{ij}^{-1}{{\tilde{x}}}_{ij}\), \(\breve{\mathbf {X}}_n=(\breve{x}_{ij})\) and \(\breve{\mathbf {B}}_n=n^{-1}{\mathbf {Q}}\breve{\mathbf {X}}_n\breve{\mathbf {X}}_n^*{\mathbf {Q}}^*\). Applying Lemma 7 again, we have

$$\begin{aligned} L^4(F^{\breve{\mathbf {B}}_n},F^{{{\widetilde{{\mathbf {B}}}}}_n})\le 2p^{-2}n^{-2}{\mathrm{tr}}{\mathbf {Q}}({{\widetilde{{\mathbf {X}}}}}_n-\breve{\mathbf {X}}_n)({{\widetilde{{\mathbf {X}}}}}_n-\breve{\mathbf {X}}_n)^*{\mathbf {Q}}^* {\mathrm{tr}}({\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*+{\mathbf {Q}}\breve{\mathbf {X}}_n\breve{\mathbf {X}}_n^*{\mathbf {Q}}^*). \end{aligned}$$

We have proved in (28) that \((pn)^{-1}{\mathrm{tr}}({\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*)=O_{a.s.}(1)\). Similarly, we can prove that \((pn)^{-1}{\mathrm{tr}}({\mathbf {Q}}\breve{\mathbf {X}}_n\breve{\mathbf {X}}_n^*{\mathbf {Q}}^*)=O_{a.s.}(1)\). What remains is to show that

$$\begin{aligned} \frac{1}{pn}{\mathrm{tr}}{\mathbf {Q}}({{\widetilde{{\mathbf {X}}}}}_n-\breve{\mathbf {X}}_n)({{\widetilde{{\mathbf {X}}}}}_n-\breve{\mathbf {X}}_n)^*{\mathbf {Q}}^*= & {} \frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\left| \sum _{\ell =1}^kq_{i\ell }(\breve{x}_{\ell j}-{{\tilde{x}}}_{\ell j})\right| ^2\\ \nonumber= & {} o_{a.s.}(1). \end{aligned}$$
(31)

Note that

$$\begin{aligned} \sum _{i=1}^p\sum _{j=1}^n\left| \sum _{\ell =1}^kq_{i\ell }(\breve{x}_{\ell j}-{{\tilde{x}}}_{\ell j})\right| ^2= & {} \sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =1}^k|q_{i\ell }|^2|\breve{x}_{\ell j}-{{\tilde{x}}}_{\ell j}|^2\\&+\sum _{i=1}^p\sum _{j=1}^n\sum _{k_1\ne k_2}q_{ik_1}{{\bar{q}}}_{ik_2}(\breve{x}_{k_1j}-{{\tilde{x}}}_{k_1j})(\bar{\breve{x}}_{k_2j}-\bar{{{\tilde{x}}}}_{k_2j}), \end{aligned}$$

and

$$\begin{aligned}&{\mathrm{E}}\left( \frac{1}{pn}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{\ell =1}^k|q_{i\ell }|^2|\breve{x}_{\ell j}-{{\tilde{x}}}_{\ell j}|^2\right) \\&\quad =\frac{1}{pn}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{\ell =1}^k|q_{i\ell }|^2(1-\sigma _{\ell j})^2 \le \frac{1}{pn}\sum \limits _{j=1}^n\sum \limits _{\ell =1}^k\Vert {\mathbf {q}}_\ell \Vert ^2(1-\sigma _{\ell j}^2)\\&\quad \le \frac{2}{pn}\sum \limits _{j=1}^n\sum \limits _{\ell =1}^k\Vert {\mathbf {q}}_\ell \Vert ^2{\mathrm{E}}|x_{\ell j}|^2I(|x_{\ell j}|>\eta _n\sqrt{n}/\Vert {\mathbf {q}}_\ell \Vert )=o(1). \end{aligned}$$

Similar to (29) and (30), one can prove that

$$\begin{aligned} {\mathrm{E}}\left( \frac{1}{pn}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{\ell \in E_{(j)}}|q_{i\ell }|^2 (|\breve{x}_{\ell j}-{{\tilde{x}}}_{\ell j}|^2-{\mathrm{E}}|\breve{x}_{\ell j}-{{\tilde{x}}}_{\ell j}|^2)\right) ^4=O(p^{-2}\eta _n^{-4}) \end{aligned}$$

and

$$\begin{aligned} {\mathrm{Var}}\left( \frac{1}{pn}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{k_1\ne k_2}q_{ik_1}{{\bar{q}}}_{ik_2}(\breve{x}_{k_1j}-{{\tilde{x}}}_{k_1j}) (\bar{\breve{x}}_{k_2j}-\bar{{{\tilde{x}}}}_{k_2j})\right) =O(p^{-2}). \end{aligned}$$

From these, it is easy to show (31).

Case that \({\mathbf {Q}}\) has an infinite number of columns. If the spectral norm of \({\mathbf {T}}_n={\mathbf {Q}}{\mathbf {Q}}^*\) is uniformly bounded in p, then for a fixed p and any \(h\in \{1,\ldots ,p\}\), we have

$$\begin{aligned} p^{-1}\sum _{\ell =1}^{\infty }|q_{h\ell }|^2\le p^{-1}\sum _{i=1}^p\sum _{\ell =1}^{\infty }|q_{i\ell }|^2 =p^{-1}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^*)=p^{-1}\sum _{i=1}^p\lambda _i^{{\mathbf {T}}_n}\le M, \end{aligned}$$

where \(\{\lambda _i^{{\mathbf {T}}_n}\}_{i=1}^p\) are eigenvalues of \({\mathbf {T}}_n\) and M is a positive constant. This means that the series \(p^{-1}\sum _{\ell =1}^{\infty }|q_{h\ell }|^2\) converges. Then we can find an integer \(k_{ph}\) satisfying

$$\begin{aligned} p^{-1}\sum _{\ell =k_{ph}+1}^{\infty }|q_{h\ell }|^2<p^{-(3+\delta )}, \end{aligned}$$

where \(\delta \) is a positive constant. Letting \(k_p=\max \{k_{ph}, h=1,\ldots ,p\}\), we have

$$\begin{aligned} p^{-1}\sum _{i=1}^p\sum _{\ell =k_p+1}^{\infty }|q_{i\ell }|^2<p^{-(2+\delta )}. \end{aligned}$$

Since such a \(k_p\) exists for every fixed p, we can find a sequence \(\{k_p\}\) that satisfies

$$\begin{aligned} p^{-1}\sum _{i=1}^p\sum _{\ell =k_p+1}^{\infty }|q_{i\ell }|^2=o(p^{-2}). \end{aligned}$$
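To make the choice of \(k_p\) concrete, here is a small Python sketch (assuming NumPy) for a hypothetical causal AR(1)-type coefficient array, \(q_{i\ell }=\phi ^{\ell -(p+1-i)}\) for \(\ell \ge p+1-i\) and \(q_{i\ell }=0\) otherwise. This layout, the value of \(\phi \) and the value of \(\delta \) are assumptions made only for illustration. For \(k\ge p\) the tail mass has the closed form \(p^{-1}\sum _{i=1}^p\phi ^{2(k-p+i)}/(1-\phi ^2)\), and the code searches for the smallest such k that pushes it below \(p^{-(2+\delta )}\).

```python
import numpy as np

def tail_mass(k, p, phi):
    """p^{-1} * sum_{i=1}^p sum_{l>k} |q_{il}|^2 for the toy AR(1)-type array (valid for k >= p)."""
    i = np.arange(1, p + 1)
    return float(np.sum(phi ** (2 * (k - p + i))) / (p * (1.0 - phi ** 2)))

def smallest_k(p, phi, delta):
    """Smallest k >= p whose tail mass is below p^{-(2 + delta)}."""
    target = p ** (-(2.0 + delta))
    k = p
    while tail_mass(k, p, phi) >= target:
        k += 1
    return k

phi, delta = 0.5, 0.1                       # illustrative AR coefficient and exponent margin
for p in (10, 50, 200):
    k_p = smallest_k(p, phi, delta)
    print(p, k_p, tail_mass(k_p, p, phi), p ** (-(2.0 + delta)))
```

Because the tail decays geometrically in this toy case, only a logarithmic number of extra columns beyond p is needed; slower-decaying coefficients would require a larger \(k_p\).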

For simplicity, we write k for \(k_p\). If \({\mathbf {Q}}\) is a \(p\times \infty \) dimensional matrix, then we truncate \({\mathbf {Q}}\) as \({\mathbf {Q}}=(\widehat{{{\mathbf {Q}}}}, \widetilde{{{\mathbf {Q}}}})\), where \(\widehat{{\mathbf {Q}}}\) is a \(p\times k\) matrix and \(\widetilde{{\mathbf {Q}}}=(q_{ij})\) is a \(p\times \infty \) matrix with \(i=1,\ldots ,p, j=k+1,\ldots ,\infty \). Similarly, truncate \({\mathbf {X}}_n\) as \({\mathbf {X}}_n=\left( \begin{array}{c}\widehat{{\mathbf {X}}}_n \\ \widetilde{{\mathbf {X}}}_n \end{array}\right) \), where \(\widehat{{{\mathbf {X}}}}_n\) is a \(k\times n\) matrix and \(\widetilde{{{\mathbf {X}}}}_n\) is an \(\infty \times n\) matrix. Then we have

$$\begin{aligned} L^4(F^{n^{-1}{\mathbf {Q}}{\mathbf {X}}_n{\mathbf {X}}_n^*{\mathbf {Q}}^*}, F^{n^{-1}\widehat{{{\mathbf {Q}}}}\widehat{{{\mathbf {X}}}}_n \widehat{{{\mathbf {X}}}}_n^*\widehat{{{\mathbf {Q}}}}^*})\le 2p^{-2}n^{-2}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {X}}_n{\mathbf {X}}_n^*{\mathbf {Q}}^* +\widehat{{{\mathbf {Q}}}}\widehat{{{\mathbf {X}}}}_n\widehat{{{\mathbf {X}}}}_n^*\widehat{{{\mathbf {Q}}}}^*) {\mathrm{tr}}\widetilde{{{\mathbf {Q}}}}\widetilde{{{\mathbf {X}}}}_n\widetilde{{{\mathbf {X}}}}_n^*\widetilde{{{\mathbf {Q}}}}^*. \end{aligned}$$

We have

$$\begin{aligned}&\frac{1}{pn}{\mathrm{tr}}(\widetilde{{\mathbf {Q}}}\widetilde{{\mathbf {X}}}_n\widetilde{{\mathbf {X}}}_n^*\widetilde{{\mathbf {Q}}}^*) =\frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\left| \sum _{\ell =k+1}^{\infty }q_{i\ell }x_{\ell j}\right| ^2\nonumber \\&\quad =\frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =k+1}^{\infty }|q_{i\ell }|^2|x_{\ell j}|^2 +\frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{k_1\ne k_2}q_{ik_1}{\bar{q}}_{ik_2} x_{k_1,j}{\bar{x}}_{k_2,j}. \end{aligned}$$

Note that

$$\begin{aligned} {\mathrm{E}}\left( \frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =k+1}^{\infty }|q_{i\ell }|^2|x_{\ell j}|^2\right) =\frac{1}{p}\sum _{i=1}^p\sum _{\ell =k+1}^{\infty }|q_{i\ell }|^2=o(p^{-2}) \end{aligned}$$

and

$$\begin{aligned}&{\mathrm{E}}\left( \frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =k+1}^{\infty }|q_{i\ell }|^2(|x_{\ell j}|^2-{\mathrm{E}}|x_{\ell j}|^2)\right) ^4\nonumber \\&\quad \le \frac{1}{p^4n^4}\sum _{j=1}^n\sum _{\ell =k+1}^{\infty }\left( \sum _{i=1}^p|q_{i\ell }|^2\right) ^4{\mathrm{E}}|x_{\ell j}|^8\nonumber \\&\quad +\frac{3}{p^4n^4}\left( \sum _{j=1}^n\sum _{\ell =k+1}^{\infty }\left( \sum _{i=1}^p|q_{i\ell }|^2\right) ^2{\mathrm{E}}|x_{\ell j}|^4\right) ^2\nonumber \\&\quad \le \frac{\eta _n^6}{p^4}\sum _{\ell =k+1}^{\infty }\Vert {\mathbf {q}}_\ell \Vert ^2+\frac{3\eta _n^4}{p^4}\left( \sum _{\ell =k+1}^{\infty }\Vert {\mathbf {q}}_\ell \Vert ^2\right) ^2=o(p^{-2}). \end{aligned}$$
(32)

These inequalities simply imply \((pn)^{-1}\sum _{i=1}^p\sum _{j=1}^n\sum _{\ell =k+1}^{\infty }|q_{i\ell }|^2|x_{\ell j}|^2=o_{a.s.}(1)\). Furthermore,

$$\begin{aligned}&{\mathrm{E}}\left( \frac{1}{pn}\sum _{i=1}^p\sum _{j=1}^n\sum _{k_1\ne k_2}q_{ik_1}{{\bar{q}}}_{ik_2}x_{k_1,j}{\bar{x}}_{k_2,j}\right) ^2 \le \frac{2}{p^2n^2}\sum _{j=1}^n\sum _{k_1\ne k_2}\left| \sum _{i=1}^pq_{ik_1}{{\bar{q}}}_{ik_2}\right| ^2\nonumber \\&\quad \le \frac{2}{p^2n}{\mathrm{tr}}(\widetilde{{{\mathbf {Q}}}}\widetilde{{{\mathbf {Q}}}}^*)^2=o(p^{-2}), \end{aligned}$$
(33)

which implies that \((pn)^{-1}\sum _{i=1}^p\sum _{j=1}^n\sum _{k_1\ne k_2}q_{ik_1}{{\bar{q}}}_{ik_2}x_{k_1,j}{\bar{x}}_{k_2,j}=o_{a.s.}(1)\). Then we have

$$\begin{aligned} (pn)^{-1}{\mathrm{tr}}(\widetilde{{{\mathbf {Q}}}}\widetilde{{{\mathbf {X}}}}_n\widetilde{{{\mathbf {X}}}}_n^*\widetilde{{{\mathbf {Q}}}}^*)=o_{a.s.}(1). \end{aligned}$$

Similarly, we can prove \((pn)^{-1}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {X}}_n{\mathbf {X}}_n^*{\mathbf {Q}}^*)=O_{a.s.}(1)\) and \((pn)^{-1}{\mathrm{tr}}(\widehat{{{\mathbf {Q}}}}\widehat{{{\mathbf {X}}}}_n\widehat{{{\mathbf {X}}}}_n^*\widehat{{{\mathbf {Q}}}}^*)=O_{a.s.}(1)\). Then we have

$$\begin{aligned} L^4(F^{n^{-1}{\mathbf {Q}}{\mathbf {X}}_n{\mathbf {X}}_n^*{\mathbf {Q}}^*}, F^{n^{-1}\widehat{{{\mathbf {Q}}}}\widehat{{{\mathbf {X}}}}_n\widehat{{{\mathbf {X}}}}_n^*\widehat{{{\mathbf {Q}}}}^*})=o_{a.s.}(1). \end{aligned}$$

Therefore, without loss of generality, we will hereafter assume that the number of columns k of \({\mathbf {Q}}\) is finite.

A.1.2 The proofs of (4) and (5)

Proof of (4). Let \(\mathfrak {I}(z)=v\). For the following analysis we will assume \(v>0\). Constants appearing in inequalities will be denoted by K and may take on different values from one expression to the next. Let \({\mathbf {r}}_j=n^{-1/2}{\mathbf {Q}}{\mathbf {x}}_j\), then \({\mathbf {B}}_{n}=\sum _{j=1}^n{\mathbf {r}}_j{\mathbf {r}}_j^*\). Define \( {\mathbf {D}}(z)={\mathbf {B}}_n-z{\mathbf {I}}_p\) and

$$\begin{aligned} {\mathbf {D}}_{j}(z)= & {} {\mathbf {D}}(z)-{\mathbf {r}}_j{\mathbf {r}}_j^*,\quad \beta _j(z)=[1+{\mathbf {r}}_j^*{\mathbf {D}}_{j}^{-1}(z){\mathbf {r}}_j]^{-1},\nonumber \\ {\gamma }_j(z)= & {} \beta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-2}(z){\mathbf {r}}_j. \end{aligned}$$
(34)

Since \(\mathfrak {I}(\beta _j^{-1}(z))=v{\mathbf {r}}_j^*{\mathbf {D}}_{j}^{-1}(z)({\mathbf {D}}_{j}^{-1}(z))^*{\mathbf {r}}_j>v\Big |{\mathbf {r}}_j^*{\mathbf {D}}_{j}^{-2}(z){\mathbf {r}}_j\Big |\), we obtain

$$\begin{aligned} |{\gamma }_j(z)|\le v^{-1}. \end{aligned}$$
(35)

Moreover,

$$\begin{aligned} \mathfrak {I}[z{\mathbf {r}}_j^{*}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j]= & {} (2\mathbf{i})^{-1}[z{\mathbf {r}}_j^{*}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-{\bar{z}}{\mathbf {r}}_j^{*}({\mathbf {D}}^{-1}_j(z))^{*}{\mathbf {r}}_j]\\= & {} v{\mathbf {r}}_j^{*}{\mathbf {D}}_j^{-1}(z)\left( \sum \limits _{i\not =j}{\mathbf {r}}_i{\mathbf {r}}_i^{*}\right) ({\mathbf {D}}_j^{-1}(z))^{*}{\mathbf {r}}_j\ge 0, \end{aligned}$$

where \({\bar{z}}\) denotes the conjugate of z. Thus we have

$$\begin{aligned} |\beta _j(z)|\le |z|v^{-1}. \end{aligned}$$
(36)

Let \({\mathrm{E}}_j\) denote the conditional expectation given \(\{{\mathbf {r}}_1,\ldots ,{\mathbf {r}}_j\}\), and let \({\mathrm{E}}_0\) denote the unconditional expectation. Then, we have

$$\begin{aligned} m_n(z)-{\mathrm{E}} m_n(z)= & {} p^{-1}\sum _{j=1}^n ({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathrm{tr}}{\mathbf {D}}^{-1}(z)\nonumber \\= & {} p^{-1}\sum _{j=1}^n ({\mathrm{E}}_j-{\mathrm{E}}_{j-1})[{\mathrm{tr}}{\mathbf {D}}^{-1}(z)-{\mathrm{tr}}{\mathbf {D}}_{j}^{-1}(z)]\nonumber \\= & {} -p^{-1}\sum _{j=1}^n ({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\gamma }_j(z). \end{aligned}$$
(37)
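The rank-one step behind (37) and the bounds (35) and (36) can be checked numerically. By the Sherman–Morrison formula, \({\mathrm{tr}}{\mathbf {D}}^{-1}(z)-{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)=-{\gamma }_j(z)\); the sign plays no role in the moment bound that follows. The Python sketch below (assuming NumPy; the sizes, the real Gaussian data, the random \({\mathbf {Q}}\) and the point z are illustrative assumptions) evaluates this identity and the two bounds for every j.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, n = 30, 40, 50                       # illustrative sizes
z = 0.7 + 0.3j                             # a point in the upper half plane
v = z.imag

Q = rng.standard_normal((p, k)) / np.sqrt(k)
X = rng.standard_normal((k, n))
R = Q @ X / np.sqrt(n)                     # columns are r_j = n^{-1/2} Q x_j
D = R @ R.T - z * np.eye(p)                # D(z) = B_n - z I_p
D_inv = np.linalg.inv(D)

max_identity_error = 0.0
max_gamma = 0.0
max_beta = 0.0
for j in range(n):
    r = R[:, j]
    Dj_inv = np.linalg.inv(D - np.outer(r, r))      # D_j(z) = D(z) - r_j r_j^*
    beta = 1.0 / (1.0 + r @ Dj_inv @ r)             # beta_j(z)
    gamma = beta * (r @ Dj_inv @ Dj_inv @ r)        # gamma_j(z)
    # Sherman-Morrison: tr D^{-1} - tr D_j^{-1} + gamma_j should vanish.
    err = abs(np.trace(D_inv) - np.trace(Dj_inv) + gamma)
    max_identity_error = max(max_identity_error, err)
    max_gamma = max(max_gamma, abs(gamma))
    max_beta = max(max_beta, abs(beta))

print(f"identity error: {max_identity_error:.2e}")
print(f"max |gamma_j| = {max_gamma:.3f}  (bound 1/v = {1 / v:.3f})")
print(f"max |beta_j|  = {max_beta:.3f}  (bound |z|/v = {abs(z) / v:.3f})")
```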

By Lemma 2 (Burkholder inequality) and the inequality (35), for any \(\ell >1\), we obtain

$$\begin{aligned} {\mathrm{E}}|m_n(z)-{\mathrm{E}}m_n(z)|^{\ell }\le & {} Kp^{-\ell }{\mathrm{E}}\left( \sum _{j=1}^n |({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\gamma }_j(z)|^2\right) ^{\frac{\ell }{2}}\nonumber \\\le & {} Kv^{-\ell }p^{-\ell }n^{\frac{\ell }{2}}. \end{aligned}$$
(38)

Taking \(\ell >2\), Chebyshev’s inequality, (38) and the Borel–Cantelli lemma imply (4): \(m_n(z)-{\mathrm{E}} m_n(z)\rightarrow 0\) a.s.

Proof of (5). Following the steps of the proof of Theorem 1.1 of Bai and Zhou (2008), define \({\mathbf {K}}=(1+y_na_{n,1})^{-1}{\mathbf {T}}_n\) and \(y_n=p/n\), where \({\mathbf {T}}_n={\mathbf {Q}}{\mathbf {Q}}^*\), \(a_{n,\ell }=p^{-1}{\mathrm{E}}{\mathrm{tr}}[{\mathbf {T}}_n^{\ell }{\mathbf {D}}^{-1}(z)]\), \(\ell =0\) or 1. We have

$$\begin{aligned} ({\mathbf {K}}-z{\mathbf {I}})^{-1}-{\mathbf {D}}^{-1}(z)= & {} \sum \limits _{j=1}^n({\mathbf {K}}-z{\mathbf {I}})^{-1} {\mathbf {r}}_{j}{\mathbf {r}}_{j}^*{\mathbf {D}}^{-1}(z)-({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {K}}{\mathbf {D}}^{-1}(z)\nonumber \\= & {} \sum \limits _{j=1}^n({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {r}}_{j}{\mathbf {r}}_{j}^*{\mathbf {D}}^{-1}_{j}(z)\beta _{j}(z) -({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {K}}{\mathbf {D}}^{-1}(z).\nonumber \\ \end{aligned}$$
(39)

For \(\ell =0,1\), multiplying both sides by \({\mathbf {T}}_n^{\ell }\) and then taking trace and dividing by p, we have

$$\begin{aligned}&p^{-1}{\mathrm{E}}{\mathrm{tr}}[{\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}]-a_{n,\ell }\nonumber \\&\quad = p^{-1}\sum _{j=1}^n{\mathrm{E}}{\mathbf {r}}_j^*{\mathbf {D}}^{-1}_j(z){\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {r}}_j\beta _j(z)-p^{-1}{\mathrm{E}}{\mathrm{tr}} [{\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {K}}{\mathbf {D}}^{-1}(z)].\nonumber \\ \end{aligned}$$
(40)

One can prove a formula similar to (1.15) of Bai and Silverstein (2004) and verify that

$$\begin{aligned}&{\mathrm{E}}\left| {\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}({\mathbf {Q}}^*{\mathbf {D}}_j^{-1}(z){\mathbf {Q}})\right| ^2\nonumber \\&\quad \le 2n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}^*{\mathbf {D}}_j^{-1}(z){\mathbf {Q}})({\mathbf {Q}}^*{\mathbf {D}}_j^{-1}(z){\mathbf {Q}})^* +n^{-2}\sum _{i=1}^k |({\mathbf {Q}}^*{\mathbf {D}}_j^{-1}(z){\mathbf {Q}})_{ii}|^2{\mathrm{E}}|x_{ij}|^4\nonumber \\&\quad \le 2n^{-2}pv^{-2}\Vert {\mathbf {T}}_n\Vert ^2+n^{-2}v^{-2}\sum _{i=1}^k\Vert {\mathbf {q}}_i\Vert ^4\eta _n^2n/\Vert {\mathbf {q}}_i\Vert ^2\le K\eta _n^2\rightarrow 0. \end{aligned}$$
(41)
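The first inequality in (41) can be illustrated by Monte Carlo. The Python sketch below (assuming NumPy) conditions on a fixed stand-in for \({\mathbf {D}}_j^{-1}(z)\), built from vectors independent of \({\mathbf {x}}_j\), draws many copies of \({\mathbf {x}}_j\), and compares the empirical second moment of \({\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}({\mathbf {Q}}^*{\mathbf {D}}_j^{-1}(z){\mathbf {Q}})\) with the right-hand side of that inequality. Real standard Gaussian entries (so \({\mathrm{E}}|x_{ij}|^4=3\)), the sizes, the point z and the number of draws are assumptions of this sketch only.

```python
import numpy as np

rng = np.random.default_rng(2)
p, k, n, n_draws = 30, 40, 50, 20000       # illustrative sizes and Monte Carlo budget
z = 0.7 + 0.3j

Q = rng.standard_normal((p, k)) / np.sqrt(k)
# Stand-in for D_j(z): built from n - 1 vectors that are independent of x_j.
R_others = Q @ rng.standard_normal((k, n - 1)) / np.sqrt(n)
Dj_inv = np.linalg.inv(R_others @ R_others.T - z * np.eye(p))

M = Q.T @ Dj_inv @ Q / n                   # r_j^* D_j^{-1} r_j equals x_j^T M x_j
x = rng.standard_normal((n_draws, k))      # independent draws of x_j
quad = np.einsum("ai,ij,aj->a", x, M, x)   # x^T M x for each draw
mc_second_moment = float(np.mean(np.abs(quad - np.trace(M)) ** 2))

fourth_moment = 3.0                        # E|x_ij|^4 for a standard normal entry
frobenius_sq = float(np.sum(np.abs(M) ** 2))          # tr(M M^*)
bound = 2.0 * frobenius_sq + fourth_moment * float(np.sum(np.abs(np.diag(M)) ** 2))

print(f"Monte Carlo second moment: {mc_second_moment:.5f}")
print(f"right-hand side of the first inequality: {bound:.5f}")
```

For real Gaussian entries and a symmetric M the second moment equals \(2\,{\mathrm{tr}}(MM^*)\) exactly, so the simulated value should sit near the first term of the bound and below the full right-hand side.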

By Lemma 8, one can prove that

$$\begin{aligned} \left| n^{-1}{\mathrm{tr}}({\mathbf {Q}}^*{\mathbf {D}}_j^{-1}(z){\mathbf {Q}})-n^{-1}{\mathrm{tr}}({\mathbf {Q}}^*{\mathbf {D}}^{-1}(z){\mathbf {Q}})\right| ^2\le K(n^2v^2)^{-1} \end{aligned}$$

and, by a method similar to the proof of (4), we have

$$\begin{aligned} {\mathrm{E}}\left| n^{-1}{\mathrm{tr}}({\mathbf {Q}}^*{\mathbf {D}}^{-1}(z){\mathbf {Q}})-n^{-1}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}^*{\mathbf {D}}^{-1}(z){\mathbf {Q}})\right| ^2\le K(nv^2)^{-1}. \end{aligned}$$

Therefore, we have \({\mathrm{E}}|\beta _j^{-1}(z)-(1+y_n a_{n,1})|^2=o(1)\). Applying Lemma 8 again and (41), we have

$$\begin{aligned}&p^{-1}\sum _{j=1}^n{\mathrm{E}}{\mathbf {r}}_j^*{\mathbf {D}}^{-1}_j(z){\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {r}}_j\beta _j(z) -p^{-1}{\mathrm{E}}{\mathrm{tr}}[{\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {K}}{\mathbf {D}}^{-1}(z)]\nonumber \\&\quad =p^{-1}\sum _{j=1}^n{\mathrm{E}}{\mathbf {r}}_j^*{\mathbf {D}}^{-1}_j(z){\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {r}}_j(1+y_na_{n,1})^{-1}\nonumber \\&\qquad -p^{-1}{\mathrm{E}}{\mathrm{tr}}[{\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {K}}{\mathbf {D}}^{-1}(z)]+o(1)\nonumber \\&\quad =(pn)^{-1}\sum _{j=1}^n{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}^{-1}_j(z){\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {T}}_n(1+y_na_{n,1})^{-1}\nonumber \\&\qquad -p^{-1}{\mathrm{E}}{\mathrm{tr}}[{\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {K}}{\mathbf {D}}^{-1}(z)]+o(1)\nonumber \\&\quad =(pn)^{-1}\sum _{j=1}^n{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}^{-1}_j(z){\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {K}}-p^{-1}{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}^{-1}(z){\mathbf {T}}_n^{\ell }({\mathbf {K}}-z{\mathbf {I}})^{-1}{\mathbf {K}}+o(1)\nonumber \\&\quad =o(1). \end{aligned}$$
(42)

It then follows from (40) and (42) that

$$\begin{aligned} a_{n,\ell }= & {} p^{-1}{\mathrm{tr}}\Big \{{\mathbf {T}}_n^{\ell }\big [(1+y_na_{n,1})^{-1}{\mathbf {T}}_n-z{\mathbf {I}}\big ]^{-1}\Big \}+o(1)\nonumber \\= & {} \int \frac{t^\ell }{t(1+y_na_{n,1})^{-1}-z}dH_n(t)+o(1), \end{aligned}$$
(43)

where \(H_n\) is the ESD of \({\mathbf {T}}_n\). Because \(\mathfrak {I}[z(1+y_na_{n,1})]\ge v\), we conclude that \(|(1+y_na_{n,1})^{-1}|\le |z|/v\). Taking \(\ell =1\) in (43) and multiplying both sides by \((1+y_na_{n,1})^{-1}\), we obtain

$$\begin{aligned} \frac{a_{n,1}}{1+y_na_{n,1}}= & {} \int \frac{t(1+y_na_{n,1})^{-1}}{t(1+y_na_{n,1})^{-1}-z}dH_n(t)+o(1)\nonumber \\= & {} 1+z\int \frac{1}{t(1+y_na_{n,1})^{-1}-z}dH_n(t)+o(1)\\= & {} 1+za_{n,0}+o(1). \end{aligned}$$

From this, one can easily derive that

$$\begin{aligned} \frac{1}{1+y_na_{n,1}}=1-y_n(1+za_{n,0})+o(1)=1-y_n[1+z{\mathrm{E}} m_n(z)]+o(1). \end{aligned}$$
(44)

Finally, from (43) with \(\ell =0\), we obtain

$$\begin{aligned} {\mathrm{E}} m_n(z)=\int \frac{1}{t[1-y_n(1+z{\mathrm{E}} m_n(z))]-z}dH_n(t)+o(1). \end{aligned}$$
(45)

Letting \(n\rightarrow \infty \), the limiting form of this equation is

$$\begin{aligned} m(z)=\int \frac{1}{t[1-y(1+zm(z))]-z}dH(t). \end{aligned}$$
(46)

It was proved in Silverstein (1995) that for each \(z\in {\mathbb {C}}^+\) the above equation has a unique solution m(z) satisfying \(\mathfrak {I}(m)>0\). By this fact, we conclude that \({\mathrm{E}} m_n(z)\) tends to the unique solution to Eq. (46).
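
As a numerical illustration of Eq. (46), the following minimal sketch treats only the special case \(H=\delta _1\) (that is, \({\mathbf {T}}_n={\mathbf {I}}_p\)), where (46) reduces to the quadratic \(yzm^2+(z+y-1)m+1=0\). The sizes `p`, `n`, the point `z`, the Gaussian entries and the helper names `mp_stieltjes` and `residual_46` are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def mp_stieltjes(z, y):
    """Root with the largest imaginary part of y*z*m^2 + (z + y - 1)*m + 1 = 0.

    The quadratic is Eq. (46) specialised to H = delta_1 (all mass at t = 1).
    """
    roots = np.roots([y * z, z + y - 1.0, 1.0])
    return roots[np.argmax(roots.imag)]

def residual_46(m, z, y):
    """Left side minus right side of Eq. (46) when H = delta_1."""
    return m - 1.0 / ((1.0 - y * (1.0 + z * m)) - z)

# Illustrative choices (assumptions, not values from the paper).
p, n = 200, 400
y = p / n
z = 1.0 + 1.0j

m_limit = mp_stieltjes(z, y)

# Empirical Stieltjes transform of B_n = n^{-1} X X^T, i.e. Q = I_p.
rng = np.random.default_rng(0)
X = rng.standard_normal((p, n))
eigenvalues = np.linalg.eigvalsh(X @ X.T / n)
m_empirical = np.mean(1.0 / (eigenvalues - z))

print("solution of (46):   ", m_limit)
print("empirical m_n(z):   ", m_empirical)
print("residual of (46):   ", abs(residual_46(m_limit, z, y)))
```

For general \(H\) no closed form is available and Eq. (46) would have to be solved iteratively; the quadratic case is used here only because the root can be checked directly.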

A.2 Proof of Theorem 2

A.2.1 Truncation, Centralization and Rescaling

Truncation By Assumption (d), there exists a sequence of constants \(\eta _n\downarrow 0\) such that

$$\begin{aligned} \frac{1}{pn\eta _n^6}\sum _{i=1}^k\sum _{j=1}^n \Vert {\mathbf {q}}_{i}\Vert ^2{\mathrm{E}}|x_{ij}|^4I\left( |x_{ij}|>\eta _n \sqrt{n/\Vert {\mathbf {q}}_i\Vert }\right) \rightarrow 0. \end{aligned}$$
(47)

Define \({{\widehat{x}}}_{ij}=x_{ij}I(|x_{ij}|\le \eta _n\sqrt{n/\Vert {\mathbf {q}}_i\Vert })\), \({{\widehat{{\mathbf {X}}}}}_n=({{\widehat{x}}}_{ij})\) and \({{\widehat{{\mathbf {B}}}}}_n=n^{-1}{\mathbf {Q}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*\). Then

$$\begin{aligned} {\mathrm{P}}({\mathbf {B}}_n\ne {{\widehat{{\mathbf {B}}}}}_n)\le & {} {\mathrm{E}}\sum \limits _{i=1}^k\sum \limits _{j=1}^nI(|x_{ij}|>\eta _n\sqrt{n/\Vert {\mathbf {q}}_i\Vert })\nonumber \\\le & {} \eta _n^{-4}n^{-2}\sum \limits _{i=1}^k\sum \limits _{j=1}^n\Vert {\mathbf {q}}_i\Vert ^2{\mathrm{E}}|x_{ij}|^4I(|x_{ij}|>\eta _n\sqrt{n/\Vert {\mathbf {q}}_i\Vert })\rightarrow 0. \end{aligned}$$
(48)

Centralization Define \({{\widetilde{{\mathbf {X}}}}}_n={{\widehat{{\mathbf {X}}}}}_n-{\mathrm{E}}{{\widehat{{\mathbf {X}}}}}_n\) and \({{\widetilde{{\mathbf {B}}}}}_n=n^{-1}{\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*\). Using the approach and bounds used in the proof of Lemma 7, for \(\ell =1,\ldots ,L\), we have

$$\begin{aligned}&{\mathrm{E}}\left| \int f_\ell (x)d{\widehat{G}}_n(x)-\int f_\ell (x)d{\widetilde{G}}_n(x)\right| \\&\quad \le 2K_\ell \left( {\mathrm{tr}}({\mathbf {Q}}{\mathrm{E}}{{\widehat{{\mathbf {X}}}}}_n{\mathrm{E}}{{\widehat{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*)\right) ^{1/2} \left( n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*+{\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*)\right) ^{1/2}, \end{aligned}$$

where \(K_\ell \) is a bounded constant. Notice that

$$\begin{aligned}&\frac{1}{n^2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*) =\frac{1}{n^2}\sum \limits _{i=1}^p\sum \limits _{j=1}^n{\mathrm{E}}\left| \sum \limits _{h=1}^k q_{ih}{\hat{x}}_{hj}\right| ^2\\&\quad =\frac{1}{n^2}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{h=1}^k|q_{ih}|^2{\mathrm{E}}|{\hat{x}}_{hj}|^2 +\frac{1}{n^2}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum _{h_1\ne h_2}q_{ih_1}{\bar{q}}_{ih_2}{\mathrm{E}}{\hat{x}}_{h_1j}{\mathrm{E}}\bar{{\hat{x}}}_{h_2j} \end{aligned}$$

and

$$\begin{aligned}&\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum _{h_1\ne h_2}q_{ih_1}{\bar{q}}_{ih_2}{\mathrm{E}}{\hat{x}}_{h_1j}{\mathrm{E}}\bar{{\hat{x}}}_{h_2j} \le \sum \limits _{i=1}^p\sum \limits _{j=1}^n\left| \sum \limits _{h=1}^k q_{ih}{\mathrm{E}}{\hat{x}}_{hj}\right| ^2\\&\quad ={\mathrm{tr}}({\mathbf {Q}}{\mathrm{E}}{{\widehat{{\mathbf {X}}}}}_n{\mathrm{E}}{{\widehat{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*) \le \sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{h=1}^k|q_{ih}|^2\sum \limits _{m=1}^k|{\mathrm{E}}{\hat{x}}_{mj}|^2\\&\quad ={\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^*)\sum \limits _{m=1}^k\sum \limits _{j=1}^n|{\mathrm{E}}{\hat{x}}_{mj}|^2\\&\quad \le Kn^{-1}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^*)\eta _n^{-6}n^{-2}\sum \limits _{m=1}^k\sum \limits _{j=1}^n\Vert {\mathbf {q}}_m\Vert ^2{\mathrm{E}}|x_{mj}|^4I(|x_{mj}|>\eta _n\sqrt{n/\Vert {\mathbf {q}}_m\Vert })\rightarrow 0, \end{aligned}$$

where K is a bounded constant. Therefore, we have \(n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*)=O(1)\). Similarly, we also have \(n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*)=O(1)\). It follows that

$$\begin{aligned} \int f_\ell (x)d{\widehat{G}}_n(x)=\int f_\ell (x)d{\widetilde{G}}_n(x)+o_p(1), \quad \ell =1,\ldots ,L. \end{aligned}$$

Rescaling Define \(\sigma _{ij}^2={\mathrm{E}}|{\tilde{x}}_{ij}|^2\), \(\breve{x}_{ij}=\sigma _{ij}^{-1}{\tilde{x}}_{ij}\), \(\breve{\mathbf {X}}_n=(\breve{x}_{ij})\) and \(\breve{\mathbf {B}}_n=\frac{1}{n}{\mathbf {Q}}\breve{\mathbf {X}}_n\breve{\mathbf {X}}_n^*{\mathbf {Q}}^*\). Still using the approach and bounds used in the proof of Lemma 7, for \(\ell =1,\ldots ,L\), we obtain

$$\begin{aligned}&{\mathrm{E}}\left| \int f_\ell (x)d{\widetilde{G}}_n(x)-\int f_\ell (x)d\breve{G}_n(x)\right| \nonumber \\&\quad \le 2K_\ell \left( {\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}({{\widetilde{{\mathbf {X}}}}}_n-\breve{\mathbf {X}}_n)({{\widetilde{{\mathbf {X}}}}}_n-\breve{\mathbf {X}}_n)^*{\mathbf {Q}}^*)\right) ^{1/2}\\&\qquad \times \left( n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*+{\mathbf {Q}}\breve{\mathbf {X}}_n\breve{\mathbf {X}}_n^*{\mathbf {Q}}^*)\right) ^{1/2}, \end{aligned}$$

where \(K_\ell \) is a bounded constant. Note that

$$\begin{aligned} n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{\mathbf {Q}}^*) =n^{-2}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{h=1}^k|q_{ih}|^2{\mathrm{E}}|{\tilde{x}}_{hj}|^2\le n^{-1}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^*) \end{aligned}$$

and

$$\begin{aligned} n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}\breve{\mathbf {X}}_n\breve{\mathbf {X}}_n^*{\mathbf {Q}}^*) =n^{-2}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{h=1}^k|q_{ih}|^2{\mathrm{E}}|\breve{x}_{hj}|^2= n^{-1}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^*). \end{aligned}$$

Since \({\tilde{x}}_{ij}-\breve{x}_{ij}=(1-\sigma _{ij}^{-1}){\tilde{x}}_{ij}\) and \(\sigma _{ij}\le 1\), we get

$$\begin{aligned}&{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}({{\widetilde{{\mathbf {X}}}}}_n-\breve{\mathbf {X}}_n)({{\widetilde{{\mathbf {X}}}}}_n-\breve{\mathbf {X}}_n)^*{\mathbf {Q}}^*) =\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{h=1}^k|q_{ih}|^2(1-\sigma _{hj})^2\\&\quad \le K\eta _n^{-4}n^{-2}\sum \limits _{h=1}^k\sum \limits _{j=1}^n\Vert {\mathbf {q}}_h\Vert ^2{\mathrm{E}}|x_{hj}|^4I(|x_{hj}|>\eta _n\sqrt{n/\Vert {\mathbf {q}}_h\Vert })\rightarrow 0. \end{aligned}$$

Therefore, we have

$$\begin{aligned} \int f_\ell (x)d{\widetilde{G}}_n(x)=\int f_\ell (x)d\breve{G}_n(x)+o_p(1), \quad \ell =1,\ldots ,L. \end{aligned}$$

Case that \({\mathbf {Q}}\) has an infinite number of columns Similar to the proof of Theorem 1, if the spectral norm of \({\mathbf {T}}_n\) is uniformly bounded in p, we can find a sequence \(\{k_p\}\) that satisfies

$$\begin{aligned} p^{-1}\sum _{i=1}^p\sum _{h=k_p+1}^{\infty }|q_{ih}|^2=o(p^{-1}). \end{aligned}$$

For simplicity, we write k for \(k_p\). If \({\mathbf {Q}}\) is a \(p\times \infty \) dimensional matrix, then we truncate \({\mathbf {Q}}\) as \({\mathbf {Q}}=(\widehat{{{\mathbf {Q}}}}, \widetilde{{{\mathbf {Q}}}})\), where \(\widehat{{\mathbf {Q}}}\) is a \(p\times k\) matrix and \(\widetilde{{\mathbf {Q}}}=(q_{ij})\) is a \(p\times \infty \) matrix with \(i=1,\ldots ,p, j=k+1,\ldots ,\infty \). Similarly, truncate \({\mathbf {X}}_n\) as \({\mathbf {X}}_n=\left( \begin{array}{c}\widehat{{\mathbf {X}}}_n \\ \widetilde{{\mathbf {X}}}_n \end{array}\right) \), where \(\widehat{{{\mathbf {X}}}}_n\) is a \(k\times n\) matrix and \(\widetilde{{{\mathbf {X}}}}_n\) is an \(\infty \times n\) matrix. Denote \({{\widehat{{\mathbf {B}}}}}_n=n^{-1}{{\widehat{{\mathbf {Q}}}}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{{\widehat{{\mathbf {Q}}}}}^*\) and \({{\widehat{G}}}_n(x)=p[F^{{{\widehat{{\mathbf {B}}}}}_n}(x)-F^{y_n, H_n}(x)]\). Then, for \(\ell =1,\ldots ,L\), we have

$$\begin{aligned}&{\mathrm{E}}\left| \int f_\ell (x)dG_n(x)-\int f_\ell (x)d{\widehat{G}}_n(x)\right| \\&\quad \le 2K_\ell \left( {\mathrm{E}}{\mathrm{tr}}({{\widetilde{{\mathbf {Q}}}}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{{\widetilde{{\mathbf {Q}}}}}^*)\right) ^{1/2} \left( n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {X}}_n{\mathbf {X}}_n^*{\mathbf {Q}}^*+{{\widehat{{\mathbf {Q}}}}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{{\widehat{{\mathbf {Q}}}}}^*)\right) ^{1/2}. \end{aligned}$$

Because

$$\begin{aligned}&n^{-2}{\mathrm{E}}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {X}}_n{\mathbf {X}}_n^*{\mathbf {Q}}^*)=n^{-2}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{h=1}^\infty |q_{ih}|^2=n^{-1}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^*), \\&n^{-2}{\mathrm{E}}{\mathrm{tr}}({{\widehat{{\mathbf {Q}}}}}{{\widehat{{\mathbf {X}}}}}_n{{\widehat{{\mathbf {X}}}}}_n^*{{\widehat{{\mathbf {Q}}}}}^*) =n^{-2}\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{h=1}^k|q_{ih}|^2\le n^{-1}{\mathrm{tr}}({\mathbf {Q}}{\mathbf {Q}}^*) \end{aligned}$$

and

$$\begin{aligned} {\mathrm{E}}{\mathrm{tr}}({{\widetilde{{\mathbf {Q}}}}}{{\widetilde{{\mathbf {X}}}}}_n{{\widetilde{{\mathbf {X}}}}}_n^*{{\widetilde{{\mathbf {Q}}}}}^*) =\sum \limits _{i=1}^p\sum \limits _{j=1}^n\sum \limits _{h=k+1}^\infty |q_{ih}|^2 =n\sum \limits _{i=1}^p\sum \limits _{h=k+1}^\infty |q_{ih}|^2=o(1), \end{aligned}$$

we obtain

$$\begin{aligned} \int f_\ell (x)dG_n(x)=\int f_\ell (x)d{\widehat{G}}_n(x)+o_p(1), \quad \ell =1,\ldots ,L. \end{aligned}$$

Therefore, without loss of generality, we assume that k is finite in the following section.

A.2.2 Proof of Lemma 1

For simplicity, we assume that \(|x_{ij}|\le \eta _n\sqrt{n/\Vert {\mathbf {q}}_i\Vert }\), \({\mathrm{E}}x_{ij}=0\) and \({\mathrm{E}}|x_{ij}|^2=1\) for \(i=1,\ldots ,k\) and \(j=1,\ldots ,n\). Similar to the proof of (1.9a) and (1.9b) in Bai and Silverstein (2004), we can obtain that for any \(\eta _1>\limsup _n\lambda _{\max }^{{\mathbf {T}}_n}(1+\sqrt{y})^2\) and any positive \(\ell \),

$$\begin{aligned} P(\lambda _{\max }^{{\mathbf {B}}_n}\ge \eta _1)=o(n^{-\ell }). \end{aligned}$$

If \(\liminf _n\lambda _{\min }^{{\mathbf {T}}_n}I(0<y<1)>0\), for any \(0<\eta _2<\liminf _n\lambda _{\min }^{{\mathbf {T}}_n}(1-\sqrt{y})^2I(0<y<1)\),

$$\begin{aligned} P(\lambda _{\min }^{{\mathbf {B}}_n}\le \eta _2)=o(n^{-\ell }). \end{aligned}$$

We define \({\widehat{M}}_n(z)\) as follows. Let \(x_r\) be a number greater than \(\limsup _n\lambda _{\max }^{{\mathbf {T}}_n}(1+\sqrt{y})^2\). Let \(x_l\) be a number between 0 and \(\liminf _n\lambda _{\min }^{{\mathbf {T}}_n}(1-\sqrt{y})^2I(0<y<1)\) if the latter is greater than 0. Otherwise, let \(x_l\) be a negative number. Let \(\eta _l\) and \(\eta _r\) satisfy

$$\begin{aligned} x_l<\eta _l<\lim \inf \limits _{n}\lambda _{\min }^{{\mathbf {T}}_n}(1-\sqrt{y})^2I(0<y<1)<\lim \sup \limits _{n}\lambda _{\max }^{{\mathbf {T}}_n}(1+\sqrt{y})^2<\eta _r<x_r. \end{aligned}$$

Let \(v_0\) be any positive number. Define

$$\begin{aligned} {{{\mathcal {C}}}}=\{x_l+{\mathbf {i}}v: |v|\le v_0\}\cup {{{\mathcal {C}}}}_u\cup {{{\mathcal {C}}}}_b\cup \{x_r+{\mathbf {i}}v: |v|\le v_0\}, \end{aligned}$$

where

$$\begin{aligned} {{{\mathcal {C}}}}_u=\{x+{\mathbf {i}}v_0: x\in [x_l,x_r]\},\quad {{{\mathcal {C}}}}_b=\{x-{\mathbf {i}}v_0: x\in [x_l,x_r]\}. \end{aligned}$$
(49)

Define

$$\begin{aligned} {{{\mathcal {C}}}}_n=\left\{ \begin{array}{ll} \{z:z\in {{{\mathcal {C}}}}~\text{ and }~|\mathfrak {I}(z)|>n^{-1}\epsilon _n\},&{}\quad \text{ if }\quad x_l>0,\\ \{z:z\in {{{\mathcal {C}}}}~\text{ and }~|\mathfrak {I}(x_r+{\mathbf {i}}v)|>n^{-1}\epsilon _n\}, &{} \quad \text{ if }~ x_l<0, \end{array}\right. \end{aligned}$$

with \(\epsilon _n\ge n^{-\alpha }\) and \(0<\alpha <1\). Let

$$\begin{aligned} {\widehat{M}}_n(z)=\left\{ \begin{array}{ll} M_n(z), &{} z\in {{{\mathcal {C}}}}_n,\\ M_n(x_r+{\mathbf {i}}n^{-1}\epsilon _n), &{} x=x_r,~v\in [0,n^{-1}\epsilon _n],\\ M_n(x_r-{\mathbf {i}}n^{-1}\epsilon _n), &{} x=x_r,~v\in [-n^{-1}\epsilon _n,0),\\ M_n(x_l+{\mathbf {i}}n^{-1}\epsilon _n), &{} x_l>0, x=x_l,~v\in [0,n^{-1}\epsilon _n],\\ M_n(x_l-{\mathbf {i}}n^{-1}\epsilon _n), &{} x_l>0, x=x_l,~v\in [-n^{-1}\epsilon _n,0). \end{array}\right. \end{aligned}$$

Write \(M_n(z)=M_n^1(z)+M^2_n(z)\) for \(z\in {{{\mathcal {C}}}}_n\), where

$$\begin{aligned} M_n^1(z)=p[m_n(z)-{\mathrm{E}}m_n(z)]\quad \text {and}\quad M_n^2(z)=p[{\mathrm{E}}m_n(z)-m_n^0(z)]. \end{aligned}$$

Then the proof of Lemma 1 is divided into the following three steps:

  • Step 1 For any positive integer r and any complex numbers \(z_1,\ldots ,z_r\in {{{\mathcal {C}}}}_n\), the random vector \((M_n^1(z_j), j=1,\ldots ,r)\) converges to a 2r-dimensional centered Gaussian vector with covariance function (15).

  • Step 2 Tightness of \(M_n^1(z)\);

  • Step 3 \(M_n^2(z)\) converges to the mean function (14) for \(z\in {{{\mathcal {C}}}}_n\).

Step 1: Convergence of finite dimensional distributions In this section we will show that for any positive integer r and any complex numbers \(z_1,\ldots , z_r\in {{{\mathcal {C}}}}_n\), the sum \(\sum _{j=1}^r\alpha _jM_n^1(z_j)\) converges in distribution to a Gaussian random variable. Because of Assumption (e), without loss of generality, we may assume \(\Vert {\mathbf {Q}}\Vert \le 1\) for all n. Constants appearing in inequalities will be denoted by K and may take on different values from one expression to the next. Let

$$\begin{aligned}&\epsilon _j(z)={\mathbf {r}}_j^*{\mathbf {D}}^{-1}_j(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}^{-1}_j(z), \end{aligned}$$
(50)
$$\begin{aligned}&\delta _j(z)={\mathbf {r}}_j^*{\mathbf {D}}^{-2}_j(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}^{-2}_j(z)=\frac{d}{dz}\epsilon _j(z) \end{aligned}$$
(51)

and

$$\begin{aligned} {\bar{\beta }}_j(z)=\frac{1}{1+n^{-1}{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)},\quad b_j(z)=\frac{1}{1+n^{-1}{\mathrm{E}}{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)}. \end{aligned}$$
(52)

Notice that

$$\begin{aligned} {\mathbf {D}}^{-1}(z)={\mathbf {D}}_j^{-1}(z)-{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)\beta _j(z). \end{aligned}$$
(53)
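
The rank-one update (53), together with the trace identity \({\mathrm{tr}}[{\mathbf {D}}^{-1}(z)-{\mathbf {D}}_j^{-1}(z)]=-\beta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-2}(z){\mathbf {r}}_j\) that drives the martingale decomposition, can be checked on a small random instance. This is only a sketch: the sizes, the point `z`, the complex Gaussian columns and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, j = 6, 10, 3            # illustrative sizes; j indexes the removed column
z = 0.7 + 0.5j                # any point off the real axis

# Columns of R play the role of r_1, ..., r_n (hypothetical complex Gaussian data).
R = (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2 * n)
r_j = R[:, [j]]               # keep it as a p x 1 column

D = R @ R.conj().T - z * np.eye(p)      # D(z)   = sum_i r_i r_i^* - z I
D_j = D - r_j @ r_j.conj().T            # D_j(z) = D(z) - r_j r_j^*

D_inv = np.linalg.inv(D)
D_j_inv = np.linalg.inv(D_j)

# beta_j(z) = 1 / (1 + r_j^* D_j^{-1}(z) r_j)
beta_j = 1.0 / (1.0 + (r_j.conj().T @ D_j_inv @ r_j).item())

# Right-hand side of (53).
rhs_53 = D_j_inv - beta_j * (D_j_inv @ r_j @ r_j.conj().T @ D_j_inv)

# Trace identity obtained by taking the trace of (53).
trace_gap = np.trace(D_inv) - np.trace(D_j_inv)
trace_formula = -beta_j * (r_j.conj().T @ D_j_inv @ D_j_inv @ r_j).item()

print("max error in (53):      ", np.max(np.abs(D_inv - rhs_53)))
print("error in trace identity:", abs(trace_gap - trace_formula))
```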

By (53), we obtain

$$\begin{aligned} p[m_n(z)-{\mathrm{E}} m_n(z)]= & {} {\mathrm{tr}}[{\mathbf {D}}^{-1}(z)-{\mathrm{E}}{\mathbf {D}}^{-1}(z)]\\= & {} \sum _{j=1}^n{\mathrm{tr}}{\mathrm{E}}_j{\mathbf {D}}^{-1}(z)-{\mathrm{tr}} {\mathrm{E}}_{j-1}{\mathbf {D}}^{-1}(z)\\= & {} \sum _{j=1}^n{\mathrm{tr}}{\mathrm{E}}_j[{\mathbf {D}}^{-1}(z)-{\mathbf {D}}^{-1}_j(z)]-{\mathrm{tr}}{\mathrm{E}}_{j-1}[{\mathbf {D}}^{-1}(z)-{\mathbf {D}}^{-1}_j(z)]\\= & {} -\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1})\beta _j(z){\mathbf {r}}_j^*{\mathbf {D}}^{-2}_j(z){\mathbf {r}}_j\\= & {} \frac{d}{dz}\sum _{j=1}^n ({\mathrm{E}}_j-{\mathrm{E}}_{j-1})\log \beta _j(z)\\= & {} -\frac{d}{dz}\sum _{j=1}^n ({\mathrm{E}}_j-{\mathrm{E}}_{j-1})\log \big (1+\epsilon _j(z){{\bar{\beta }}}_j(z)\big ) \end{aligned}$$

where \(\beta _j(z)={\bar{\beta }}_j(z)-\beta _j(z){\bar{\beta }}_j(z)\epsilon _j(z)\). By Lemma 3, we have

$$\begin{aligned}&{\mathrm{E}}\left| \frac{d}{dz}\sum _{j=1}^n ({\mathrm{E}}_j-{\mathrm{E}}_{j-1})\big [\log \big (1+\epsilon _j(z){{\bar{\beta }}}_j(z)\big )-\epsilon _j(z){{\bar{\beta }}}_j(z)\big ]\right| ^2 \nonumber \\&\quad \le K\sum _{j=1}^n {\mathrm{E}}\left| \frac{1}{2\pi \mathbf{i}}\oint _{|\zeta -z|=v/2}\frac{\big [\log \big (1+\epsilon _j(\zeta ){{\bar{\beta }}}_j(\zeta )\big ) -\epsilon _j(\zeta ){{\bar{\beta }}}_j(\zeta )\big ]}{(z-\zeta )^2}d\zeta \right| ^2\nonumber \\&\quad \le K\frac{1}{2\pi v^4}\sum _{j=1}^n \oint _{|\zeta -z|=v/2}{\mathrm{E}}|\epsilon _j(\zeta ){{\bar{\beta }}}_j(\zeta )|^4d\zeta \nonumber \\&\quad \le \frac{K}{2\pi n^4v^4}\sum _{j=1}^n \oint _{|\zeta -z|=v/2}\bigg \{{\mathrm{E}} \left[ {\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(\zeta ){\mathbf {T}}_n{\mathbf {D}}_j^{-1}(\zeta )\right] ^2\nonumber \\&\qquad +\sum _{i=1}^k {\mathrm{E}}|x_{ij}|^8 {\mathrm{E}}|{\mathbf {q}}_i^T{\mathbf {D}}_j^{-1}(\zeta ){\mathbf {q}}_i|^4d\zeta \bigg \}\nonumber \\&\quad \le Kn^{-1}+K\eta _n^4\rightarrow 0. \end{aligned}$$
(54)

Therefore, we only need to derive the finite dimensional limiting distribution of

$$\begin{aligned} -\frac{d}{dz}\sum _{j=1}^n ({\mathrm{E}}_j-{\mathrm{E}}_{j-1})\epsilon _j(z){{\bar{\beta }}}_j(z)=-\frac{d}{dz}\sum _{j=1}^n {\mathrm{E}}_j\epsilon _j(z){{\bar{\beta }}}_j(z). \end{aligned}$$
(55)

Similar to the last three lines of the proof of (54), we can show that

$$\begin{aligned} \sum _{j=1}^n {\mathrm{E}}\left| {\mathrm{E}}_j\frac{d}{dz}\epsilon _j(z){{\bar{\beta }}}_j(z)\right| ^2 I\left( \left| {\mathrm{E}}_j\frac{d}{dz}\epsilon _j(z){{\bar{\beta }}}_j(z)\right| \ge \epsilon \right) \le \frac{1}{\epsilon ^2}\sum _{j=1}^n {\mathrm{E}}\Big |{\mathrm{E}}_j\frac{d}{dz}\epsilon _j(z){{\bar{\beta }}}_j(z)\Big |^4\rightarrow 0. \end{aligned}$$

Thus, the martingale difference sequence \(\{({\mathrm{E}}_j-{\mathrm{E}}_{j-1})\frac{d}{dz}\epsilon _j(z){{\bar{\beta }}}_j(z)\}\) satisfies the Lyapunov condition. Applying Lemma 5, the random vector \((M_n^1(z_1),\ldots ,M_n^1(z_r))\) tends to a 2r-dimensional Gaussian vector \((M(z_1),\ldots , M(z_r))\) whose covariance function is given by

$$\begin{aligned} {\mathrm{Cov}}(M(z_1),M(z_2))=\lim _{n\rightarrow \infty }\sum _{j=1}^n {\mathrm{E}}_{j-1}\left( {\mathrm{E}}_j\frac{d}{d z_1}\epsilon _j(z_1){{\bar{\beta }}}_j(z_1) {\mathrm{E}}_j\frac{d}{d z_2}\epsilon _j(z_2){{\bar{\beta }}}_j(z_2)\right) . \end{aligned}$$

Consider the sum

$$\begin{aligned} \Gamma _n(z_1,z_2)=\sum _{j=1}^n{\mathrm{E}}_{j-1}\left[ {\mathrm{E}}_j({{\bar{\beta }}}_j(z_1)\epsilon _j(z_1)){\mathrm{E}}_j({{\bar{\beta }}}_j(z_2)\epsilon _j(z_2))\right] . \end{aligned}$$

Using the same approach as in Bai and Silverstein (2004), we can replace \({{\bar{\beta }}}_j(z)\) by \(b_j(z)\). Therefore, by (1.15) of Bai and Silverstein (2004), we have

$$\begin{aligned} \Gamma _n(z_1,z_2)= & {} \sum _{j=1}^nb_j(z_1)b_j(z_2){\mathrm{E}}_{j-1}\left[ {\mathrm{E}}_j(\epsilon _j(z_1)){\mathrm{E}}_j(\epsilon _j(z_2))\right] \nonumber \\= & {} \frac{1}{n^2}\sum _{j=1}^nb_j(z_1)b_j(z_2){\mathrm{tr}}\Bigg [{\mathbf {T}}_n{\mathrm{E}}_{j}{\mathbf {D}}^{-1}_j(z_1){\mathbf {T}}_n{\mathrm{E}}_j{\mathbf {D}}^{-1}_j(z_2)\nonumber \\&+\,\alpha _x{\mathrm{tr}}{{\bar{{\mathbf {Q}}}}}{\mathbf {Q}}^*{\mathrm{E}}_j{\mathbf {D}}_j^{-1}(z_1){\mathbf {Q}}{\mathbf {Q}}^T{\mathrm{E}}_j({\mathbf {D}}_j^T)^{-1}(z_2)\nonumber \\&+\,\beta _x\sum _{i=1}^k{\mathbf {q}}_i^*{\mathrm{E}}_j{\mathbf {D}}_j^{-1}(z_1){\mathbf {q}}_i{\mathbf {q}}_i^{*}{\mathrm{E}}_j{\mathbf {D}}_j^{-1}(z_2){\mathbf {q}}_i\Bigg ]\nonumber \\= & {} \frac{1}{n^2}\sum _{j=1}^nb_j(z_1)b_j(z_2){\mathrm{tr}}\Bigg [{\mathbf {T}}_n{\mathrm{E}}_{j}{\mathbf {D}}^{-1}_j(z_1){\mathbf {T}}_n{\mathrm{E}}_j{\mathbf {D}}^{-1}_j(z_2)\nonumber \\&+\,\alpha _x{\mathrm{tr}}{\mathbf {T}}_n{\mathrm{E}}_j{\mathbf {D}}_j^{-1}(z_1){\mathbf {T}}_n{\mathrm{E}}_j({\mathbf {D}}_j^T)^{-1}(z_2)\nonumber \\&+\,\beta _x\sum _{i=1}^k{\mathbf {q}}_i^*{\mathrm{E}}_j{\mathbf {D}}_j^{-1}(z_1){\mathbf {q}}_i{\mathbf {q}}_i^{*}{\mathrm{E}}_j{\mathbf {D}}_j^{-1}(z_2){\mathbf {q}}_i\Bigg ], \end{aligned}$$
(56)

where \(\alpha _x=|Ex_{11}^2|^2\) and \(\beta _x=E|x_{11}|^4-\alpha _x-2\). Here, the third equality holds if either \(\alpha _x=0\) or \({\mathbf {Q}}\) is real, which implies that \({\mathbf {T}}_n={\mathbf {Q}}{\mathbf {Q}}^T={{\bar{{\mathbf {Q}}}}}{\mathbf {Q}}^*\).

We now derive the limit of the first term by a new method, which differs from and is simpler than the one used in Bai and Silverstein (2004). Let \(v_0\) be a lower bound on \(\mathfrak {I}(z_i)\). Define \(\breve{\mathbf {r}}_j\) as an i.i.d. copy of \({\mathbf {r}}_j\), \(j=1,\ldots ,n\), and define \(\breve{\mathbf {D}}_j(z)\) in the same way as \({\mathbf {D}}_j(z)\), but using \({\mathbf {r}}_1,\ldots ,{\mathbf {r}}_{j-1},\breve{\mathbf {r}}_{j+1},\ldots ,\breve{\mathbf {r}}_n\). Then we have

$$\begin{aligned} {\mathrm{tr}}[{\mathbf {T}}_n{\mathrm{E}}_j{\mathbf {D}}^{-1}_j(z_1){\mathbf {T}}_n{\mathrm{E}}_j{\mathbf {D}}^{-1}_j(z_2)] ={\mathrm{tr}}{\mathrm{E}}_{j}[{\mathbf {T}}_n{\mathbf {D}}^{-1}_j(z_1){\mathbf {T}}_n\breve{\mathbf {D}}^{-1}_j(z_2)]. \end{aligned}$$

Similar to (44), from (6), one can prove that

$$\begin{aligned} n^{-1}{\mathrm{E}}_j[z_1{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)-z_2{\mathrm{tr}}{\mathbf {T}}_n\breve{\mathbf {D}}_j^{-1}(z_2)]\rightarrow z_1[b^{-1}(z_1)-1]-z_2[b^{-1}(z_2)-1],~a.s. \end{aligned}$$

where \(b(z)=\lim \limits _{n\rightarrow \infty } b_j(z)=-z{\underline{m}}(z)\). On the other hand,

$$\begin{aligned}&n^{-1}{\mathrm{E}}_j[z_1{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)-z_2{\mathrm{tr}}{\mathbf {T}}_n\breve{\mathbf {D}}_j^{-1}(z_2)]\\&\quad =n^{-1}{\mathrm{E}}_j{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)\Big [(z_1-z_2)\sum _{i=1}^{j-1}{\mathbf {r}}_i{\mathbf {r}}_i^* +\sum _{i=j+1}^n(z_1\breve{\mathbf {r}}_i\breve{\mathbf {r}}_i^*-z_2{\mathbf {r}}_i{\mathbf {r}}_i^*)\Big ]\breve{\mathbf {D}}_j^{-1}(z_2)\\&\quad =n^{-1}\sum _{i=1}^{j-1}(z_1-z_2){\mathrm{E}}_j\beta _{ji}(z_1)\breve{\beta }_{ji}(z_2){\mathbf {r}}_i^*\breve{\mathbf {D}}_{ji}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_{ji}^{-1}(z_1){\mathbf {r}}_i\\&\qquad +n^{-1}\sum _{i=j+1}^n{\mathrm{E}}_j\Big [z_1\breve{\beta }_{ji}(z_2)\breve{\mathbf {r}}_i^*\breve{\mathbf {D}}_{ji}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)\breve{\mathbf {r}}_i\\&\qquad -z_2\beta _{ji}(z_1){\mathbf {r}}_i^*\breve{\mathbf {D}}_{j}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_{ji}^{-1}(z_1){\mathbf {r}}_i\Big ]\\&\quad =n^{-2}\sum _{i=1}^{j-1}(z_1-z_2)b(z_1)b(z_2){\mathrm{E}}_j{\mathrm{tr}}{\mathbf {T}}_n\breve{\mathbf {D}}_{ji}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_{ji}^{-1}(z_1)\\&\qquad +n^{-2}\sum _{i=j+1}^n{\mathrm{E}}_j\Big [z_1b(z_2){\mathrm{tr}}{\mathbf {T}}_n\breve{\mathbf {D}}_{ji}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)\\&\qquad -z_2b(z_1){\mathrm{tr}}{\mathbf {T}}_n\breve{\mathbf {D}}_{j}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_{ji}^{-1}(z_1)\Big ]+o_{a.s.}(1)\\&(\text{ by } \text{ replacing }\;{\mathbf {D}}_{ji}^{-1}={\mathbf {D}}_j^{-1}+{\mathbf {D}}_{ji}^{-1}{\mathbf {r}}_i{\mathbf {r}}_i^*{\mathbf {D}}_{ji}^{-1}\beta _{ji})\\&\quad =\Big [\frac{j-1}{n}(z_1-z_2)b(z_1)b(z_2)+\frac{n-j}{n}(z_1b(z_2)-z_2b(z_1))\Big ]\\&\qquad \times n^{-1}{\mathrm{tr}}{\mathrm{E}}_j[{\mathbf {T}}_n\breve{\mathbf {D}}_{j}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_{j}^{-1}(z_1)]+o_{a.s.}(1). \end{aligned}$$

Comparing the two estimates, we obtain

$$\begin{aligned} n^{-1}{\mathrm{tr}}{\mathrm{E}}_j[{\mathbf {T}}_n\breve{\mathbf {D}}_{j}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_{j}^{-1}(z_1)]=\frac{z_1(b^{-1}(z_1)-1)-z_2(b^{-1}(z_2)-1)+o_{a.s.}(1)}{\frac{j-1}{n}(z_1-z_2)b(z_1)b(z_2)+\frac{n-j}{n}(z_1b(z_2)-z_2b(z_1))}. \end{aligned}$$

Consequently, we obtain

$$\begin{aligned}&\frac{1}{n^2}\sum _{j=1}^nb_j(z_1)b_j(z_2){\mathrm{tr}}[{\mathbf {T}}_n{\mathrm{E}}_{j}{\mathbf {D}}^{-1}_j(z_1){\mathbf {T}}_n{\mathrm{E}}_j{\mathbf {D}}^{-1}_j(z_2)]\\&\quad \rightarrow a(z_1,z_2)\int _0^1\frac{1}{1-ta(z_1,z_2)}dt={-\log (1-a(z_1,z_2))}=\int _0^{a(z_1,z_2)}\frac{1}{1-z}dz, \end{aligned}$$
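
The passage from the sum over j to the integral in the last display is a Riemann-sum limit: with \(a=a(z_1,z_2)\), the average \(n^{-1}\sum _{j=1}^n a/(1-\tfrac{j}{n}a)\) approaches \(a\int _0^1(1-ta)^{-1}dt=-\log (1-a)\). The sketch below checks this for one complex value of `a`, which is a hypothetical number chosen for illustration rather than a value computed from the model; the principal branch of the logarithm is assumed, which is valid whenever \(1-a\) is not on the negative real axis.

```python
import numpy as np

a = -0.35 + 0.20j                 # hypothetical value standing in for a(z_1, z_2)
n = 10_000                        # number of terms in the sum over j

t = np.arange(1, n + 1) / n       # grid points j / n, j = 1, ..., n
riemann_sum = np.mean(a / (1.0 - t * a))

closed_form = -np.log(1.0 - a)    # principal branch of the complex logarithm

print("Riemann sum :", riemann_sum)
print("-log(1 - a) :", closed_form)
print("difference  :", abs(riemann_sum - closed_form))
```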

where

$$\begin{aligned} a(z_1,z_2)= & {} \frac{b(z_1)b(z_2)[z_1(b^{-1}(z_1)-1)-z_2(b^{-1}(z_2)-1)]}{z_1b(z_2)-z_2b(z_1)}\\= & {} 1+\frac{b(z_1)b(z_2)(z_2-z_1)}{z_1b(z_2) -z_2b(z_1)}=1+\frac{{\underline{m}}(z_1){\underline{m}}(z_2)(z_1-z_2)}{{\underline{m}}(z_2)-{\underline{m}}(z_1)}. \end{aligned}$$

Thus, we have

$$\begin{aligned}&\frac{\partial ^2}{\partial z_2\partial z_1}\int _0^{a(z_1,z_2)}\frac{1}{1-z}dz =\frac{\partial }{\partial z_2}\left( \frac{1}{1-a(z_1,z_2)}\frac{\partial a(z_1,z_2)}{\partial z_1}\right) \\&\quad =-\frac{\partial }{\partial z_2}\left( \frac{1}{{\underline{m}}(z_1)}\frac{d{\underline{m}}(z_1)}{d z_1} +\frac{1}{z_1-z_2}+\frac{1}{{\underline{m}}(z_2)-{\underline{m}}(z_1)}\frac{d{\underline{m}}(z_1)}{d z_1}\right) \\&\quad =\frac{1}{({\underline{m}}(z_2)-{\underline{m}}(z_1))^2}\frac{d{\underline{m}}(z_1)}{d z_1}\frac{d{\underline{m}}(z_2)}{d z_2}-\frac{1}{(z_1-z_2)^2}. \end{aligned}$$
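
The mixed derivative above can be sanity-checked by finite differences. The sketch below assumes the special case \(H=\delta _1\) with \(y=0.5\), for which \({\underline{m}}(z)\) is the root with positive imaginary part of \(z{\underline{m}}^2+(z+1-y){\underline{m}}+1=0\); the evaluation points, the step size and the helper names `m_under`, `a_fun`, `d_dz` and `first_derivative` are assumptions of this example. To avoid any branch issue with the logarithm, it differentiates \((1-a)^{-1}\partial a/\partial z_1\) with respect to \(z_2\), which is the first equality in the display.

```python
import numpy as np

Y = 0.5  # illustrative value of the ratio y (assumption)

def m_under(z):
    """Companion Stieltjes transform for H = delta_1: the root with the
    largest imaginary part of z*m^2 + (z + 1 - Y)*m + 1 = 0."""
    roots = np.roots([z, z + 1.0 - Y, 1.0])
    return roots[np.argmax(roots.imag)]

def a_fun(z1, z2):
    """a(z1, z2) = 1 + m(z1) m(z2) (z1 - z2) / (m(z2) - m(z1))."""
    m1, m2 = m_under(z1), m_under(z2)
    return 1.0 + m1 * m2 * (z1 - z2) / (m2 - m1)

def d_dz(f, z, h=1e-4):
    """Central difference; for an analytic f this approximates f'(z)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

def first_derivative(z1, z2):
    """(1 - a)^{-1} * da/dz1, i.e. the z1-derivative of -log(1 - a)."""
    da = d_dz(lambda w: a_fun(w, z2), z1)
    return da / (1.0 - a_fun(z1, z2))

z1, z2 = 1.0 + 1.0j, 2.0 + 0.5j   # illustrative points in the upper half-plane

# Left side: numerical mixed derivative.
lhs = d_dz(lambda w: first_derivative(z1, w), z2)

# Right side: closed form from the last line of the display.
m1, m2 = m_under(z1), m_under(z2)
rhs = d_dz(m_under, z1) * d_dz(m_under, z2) / (m2 - m1) ** 2 - 1.0 / (z1 - z2) ** 2

print("finite differences:", lhs)
print("closed form       :", rhs)
```

The identity is purely algebraic in \({\underline{m}}\), so the same check should pass for any other analytic choice of `m_under` with \({\underline{m}}(z_1)\ne {\underline{m}}(z_2)\).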

Next, we compute the limit of the second term of (56). In this step, we need the assumption that \(\Vert \mathfrak {I}({\mathbf {Q}})\Vert =o(1)\). Similarly, we consider

$$\begin{aligned} n^{-1}{\mathrm{E}}_j[z_1{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)-z_2{\mathrm{tr}}{\mathbf {T}}_n(\breve{\mathbf {D}}_j^T)^{-1}(z_2)]\rightarrow z_1[b^{-1}(z_1)-1]-z_2[b^{-1}(z_2)-1],~a.s. \end{aligned}$$

On the other hand,

$$\begin{aligned}&n^{-1}{\mathrm{E}}_j[z_1{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)-z_2{\mathrm{tr}}{\mathbf {T}}_n(\breve{\mathbf {D}}_j^T)^{-1}(z_2)]\\&\quad =n^{-1}{\mathrm{E}}_j{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)\left[ \sum _{i=1}^{j-1}(z_1{{\bar{{\mathbf {r}}}}}_i{\mathbf {r}}_i^T-z_2{\mathbf {r}}_i{\mathbf {r}}_i^*) +\sum _{i=j+1}^n(z_1\bar{\breve{\mathbf {r}}}_i\breve{\mathbf {r}}_i^T-z_2{\mathbf {r}}_i{\mathbf {r}}_i^*)\right] (\breve{\mathbf {D}}_j^T)^{-1}(z_2)\\&\quad =n^{-1}\sum _{i=1}^{j-1}{\mathrm{E}}_j\bigg [z_1\breve{\beta }_{ji}(z_2){\mathbf {r}}_i^T(\breve{\mathbf {D}}_{ji}^T)^{-1}(z_2){\mathbf {T}}_n \left( {\mathbf {D}}_{ji}^{-1}(z_1)-{\mathbf {D}}_{ji}^{-1}(z_1){\mathbf {r}}_i{\mathbf {r}}_i^*{\mathbf {D}}_{ji}^{-1}(z_1)\beta _{ji}(z_1)\right) {\bar{{\mathbf {r}}}}_i\\&\qquad -z_2\beta _{ji}(z_1){\mathbf {r}}_i^*\left( (\breve{\mathbf {D}}_{ji}^T)^{-1}(z_2) -\breve{\beta }_{ji}(z_2)(\breve{\mathbf {D}}^T_{ji})^{-1}(z_2){{\bar{{\mathbf {r}}}}}_i{\mathbf {r}}_i^T(\breve{\mathbf {D}}^T_{ji})^{-1}(z_2)\right) {\mathbf {T}}_n{\mathbf {D}}_{ji}^{-1}(z_1){\mathbf {r}}_i\bigg ]\\&\qquad +n^{-1}\sum _{i=j+1}^n{\mathrm{E}}_j\Big [z_1\breve{\beta }_{ji}(z_2){\breve{\mathbf {r}}}_i^T(\breve{\mathbf {D}}_{ji}^T)^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z_1)\bar{\breve{\mathbf {r}}}_i\\&\qquad -z_2\beta _{ji}(z_1){\mathbf {r}}_i^*\breve{\mathbf {D}}_{j}^{-1}(z_2){\mathbf {T}}_n{\mathbf {D}}_{ji}^{-1}(z_1){\mathbf {r}}_i\Big ]\\&\quad =\Big \{\frac{j-1}{n}\alpha _x[-(z_1b(z_2)-z_2b(z_1))+b(z_1)b(z_2)(z_1-z_2)]+(z_1b(z_2)-z_2b(z_1))\Big \}\\&\qquad \times \,n^{-1}{\mathrm{tr}}{\mathrm{E}}_j[{\mathbf {T}}_n{\mathbf {D}}_{j}^{-1}(z_1){\mathbf {T}}_n(\breve{\mathbf {D}}_{j}^T)^{-1}(z_2)]+o_{a.s.}(1). \end{aligned}$$

Comparing the two estimates, we obtain

$$\begin{aligned}&n^{-1}{\mathrm{tr}}{\mathrm{E}}_j[{\mathbf {T}}_n{\mathbf {D}}_{j}^{-1}(z_1){\mathbf {T}}_n(\breve{\mathbf {D}}_{j}^T)^{-1}(z_2)]\\&\quad =\frac{z_1[b^{-1}(z_1)-1]-z_2[b^{-1}(z_2)-1]+o_{a.s.}(1)}{\frac{j-1}{n}\alpha _x[-(z_1b(z_2)-z_2b(z_1))+b(z_1)b(z_2)(z_1-z_2)]+[z_1b(z_2)-z_2b(z_1)]}. \end{aligned}$$

Consequently, we have

$$\begin{aligned}&\frac{1}{n^2}\sum _{j=1}^n\alpha _xb_j(z_1)b_j(z_2){\mathrm{tr}}[{\mathbf {T}}_n{\mathrm{E}}_{j}{\mathbf {D}}^{-1}_j(z_1){\mathbf {T}}_n{\mathrm{E}}_j({\mathbf {D}}^T_j)^{-1}(z_2)]\nonumber \\&\quad \rightarrow {{\tilde{a}}}(z_1,z_2)\int _0^1\frac{1}{1-t{{\tilde{a}}}(z_1,z_2)}dt={-}\log (1-{{\tilde{a}}}(z_1,z_2))=\int _0^{{{\tilde{a}}}(z_1,z_2)}\frac{1}{1-z}dz, \end{aligned}$$

where

$$\begin{aligned} {{\tilde{a}}}(z_1,z_2)= & {} \frac{\alpha _xb(z_1)b(z_2)[z_1(b^{-1}(z_1)-1)-z_2(b^{-1}(z_2)-1)]}{z_1b(z_2)-z_2b(z_1)}\\= & {} \alpha _x\left( 1+\frac{b(z_1)b(z_2)(z_2-z_1)}{z_1b(z_2)-z_2b(z_1)}\right) =\alpha _x\left( 1+\frac{{\underline{m}}(z_1){\underline{m}}(z_2)(z_1-z_2)}{{\underline{m}}(z_2)-{\underline{m}}(z_1)}\right) . \end{aligned}$$

Last, we will compute the limit of the third term of (56). By (9.9.12) of Bai and Silverstein (2010), we have

$$\begin{aligned}&\frac{1}{n^2}\sum _{j=1}^{n}\beta _x\sum \limits _{i=1}^{k}\mathbf{e}_i^T{\mathbf {Q}}^*{\mathrm{E}}_j{\mathbf {D}}_j^{-1}(z_1){\mathbf {Q}}\mathbf{e}_i\mathbf{e}_i^T{\mathbf {Q}}^*{\mathrm{E}}_j{\mathbf {D}}_j^{-1}(z_2){\mathbf {Q}}\mathbf{e}_i\\&\quad =\frac{1}{n^2z_1z_2}\sum _{j=1}^{n}\beta _x\sum \limits _{i=1}^{k}\mathbf{e}_i^T{\mathbf {Q}}^*({\underline{m}}(z_1){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {Q}}\mathbf{e}_i \mathbf{e}_i^T{\mathbf {Q}}^*({\underline{m}}(z_2){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {Q}}\mathbf{e}_i+o_p(1). \end{aligned}$$

If \(\beta _x\ne 0\), then by Assumption (f), the matrix \({\mathbf {Q}}^{*}{\mathbf {Q}}\) satisfies \(\Vert {\mathbf {Q}}^{*}{\mathbf {Q}}-{\mathrm{diag}}({\mathbf {Q}}^{*}{\mathbf {Q}})\Vert =o(1)\). Using the identity \({\mathbf {Q}}^*[{\underline{m}}(z){\mathbf {T}}_n+{\mathbf {I}}_p]^{-1}{\mathbf {Q}}={\mathbf {Q}}^{*}{\mathbf {Q}}-{\underline{m}}(z){\mathbf {Q}}^{*}{\mathbf {Q}}({\mathbf {I}}_k+{\underline{m}}(z){\mathbf {Q}}^{*}{\mathbf {Q}})^{-1}{\mathbf {Q}}^{*}{\mathbf {Q}}\), we have

$$\begin{aligned}&\frac{1}{n^2z_1z_2}\sum _{j=1}^{n}\beta _x\sum \limits _{i=1}^{k}\mathbf{e}_i^T{\mathbf {Q}}^*[{\underline{m}}(z_1){\mathbf {T}}_n+{\mathbf {I}}_p]^{-1}{\mathbf {Q}}\mathbf{e}_i \mathbf{e}_i^T{\mathbf {Q}}^*[{\underline{m}}(z_2){\mathbf {T}}_n+{\mathbf {I}}_p]^{-1}{\mathbf {Q}}\mathbf{e}_i\\&\quad =\frac{1}{n^2z_1z_2}\sum _{j=1}^{n}\beta _x{\mathrm{tr}}\{{\mathbf {Q}}^*[{\underline{m}}(z_1){\mathbf {T}}_n+{\mathbf {I}}_p]^{-1}{\mathbf {Q}}{\mathbf {Q}}^*[{\underline{m}}(z_2){\mathbf {T}}_n+{\mathbf {I}}_p]^{-1}{\mathbf {Q}}\}\\&\quad =\frac{y\beta _x}{z_1z_2}\int \frac{t^2}{[1+t{\underline{m}}(z_1)][1+t{\underline{m}}(z_2)]}dH(t)+o(1). \end{aligned}$$
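
Two algebraic facts are used in this step: the identity for \({\mathbf {Q}}^*[{\underline{m}}(z){\mathbf {T}}_n+{\mathbf {I}}_p]^{-1}{\mathbf {Q}}\), and the reduction of the trace to a sum over the eigenvalues of \({\mathbf {T}}_n\). Both can be verified numerically; the following sketch uses a small real \({\mathbf {Q}}\) (so that \({\mathbf {T}}_n={\mathbf {Q}}{\mathbf {Q}}^T\)) and two hypothetical complex values `m1`, `m2` standing in for \({\underline{m}}(z_1)\), \({\underline{m}}(z_2)\). The sizes and helper names are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(2)
p, k = 5, 8                                  # illustrative sizes with k >= p
Q = rng.standard_normal((p, k)) / np.sqrt(k) # hypothetical real p x k matrix
T = Q @ Q.T                                  # T_n = Q Q^* (p x p)
G = Q.T @ Q                                  # Q^* Q     (k x k)

m1, m2 = -0.3 + 0.6j, -0.5 + 0.35j           # stand-ins for underline-m(z_1), underline-m(z_2)

def sandwich(m):
    """Q^* [m T + I_p]^{-1} Q."""
    return Q.T @ np.linalg.inv(m * T + np.eye(p)) @ Q

def identity_rhs(m):
    """Q^*Q - m Q^*Q (I_k + m Q^*Q)^{-1} Q^*Q."""
    return G - m * G @ np.linalg.inv(np.eye(k) + m * G) @ G

# Trace of the product versus the spectral sum over eigenvalues t of T_n.
t = np.linalg.eigvalsh(T)
trace_lhs = np.trace(sandwich(m1) @ sandwich(m2))
trace_rhs = np.sum(t**2 / ((1 + t * m1) * (1 + t * m2)))

print("identity error:", np.max(np.abs(sandwich(m1) - identity_rhs(m1))))
print("trace error   :", abs(trace_lhs - trace_rhs))
```

Dividing the spectral sum by p gives \(\int t^2\{[1+t{\underline{m}}(z_1)][1+t{\underline{m}}(z_2)]\}^{-1}dH_n(t)\), which is where the factor \(y\) in the limit comes from after the normalisation by \(n\).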

Then the third term of \({\mathrm{Cov}}(M(z_1), M(z_2))\) is

$$\begin{aligned}&\frac{\partial ^2}{\partial z_1\partial z_2}\left\{ \frac{y\beta _xb(z_1)b(z_2)}{z_1z_2}\int \frac{t^2}{[1+t{\underline{m}}(z_1)][1+t{\underline{m}}(z_2)]}dH(t)\right\} \\&\quad =\frac{\partial ^2}{\partial z_1\partial z_2}\left\{ y\beta _x\int \frac{t^2{\underline{m}}(z_1){\underline{m}}(z_2)}{[1+t{\underline{m}}(z_1)][1+t{\underline{m}}(z_2)]}dH(t)\right\} \\&\quad =y\beta _x\frac{d{\underline{m}}(z_1)}{d z_1}\frac{d{\underline{m}}(z_2)}{d z_2}\int \frac{t^2}{[1+t{\underline{m}}(z_1)]^2[1+t{\underline{m}}(z_2)]^2}dH(t). \end{aligned}$$

Step 2: Tightness of \(M_n^1(z)\) As done in Bai and Silverstein (2004), the proof of tightness of \(M_n^1\) relies on the proof of

$$\begin{aligned}&\sup _{n,z_1,z_2\in {{\mathcal {C}}}_n}{\mathrm{E}}\left| \frac{M_n^1(z_1)-M_n^1(z_2)}{z_1-z_2}\right| ^2\nonumber \\&\quad = \sup _{n,z_1,z_2\in {{\mathcal {C}}}_n}{\mathrm{E}}\left| {\mathrm{tr}}{\mathbf {D}}^{-1}(z_1){\mathbf {D}}^{-1}(z_2)-{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}^{-1}(z_1){\mathbf {D}}^{-1}(z_2)\right| ^2\nonumber \\&\quad :=\sup _{n,z_1,z_2\in {{\mathcal {C}}}_n} J_n(z_1,z_2)<\infty . \end{aligned}$$
(57)

By the formula (53), we have

$$\begin{aligned}&J_n(z_1,z_2)\\&\quad ={\mathrm{E}}\left| \sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathrm{tr}}{\mathbf {D}}^{-1}(z_1){\mathbf {D}}^{-1}(z_2)\right| ^2\\&\quad ={\mathrm{E}}\Big |\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z_1){\mathbf {D}}_j^{-1}(z_2){\mathbf {r}}_j{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z_2){\mathbf {D}}_j^{-1}(z_1){\mathbf {r}}_j\beta _j(z_1)\beta _j(z_2)\\&\qquad -\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z_1){\mathbf {D}}_j^{-1}(z_2){\mathbf {D}}_j^{-1}(z_1){\mathbf {r}}_j\beta _j(z_1)\\&\qquad -\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z_2){\mathbf {D}}_j^{-1}(z_1){\mathbf {D}}_j^{-1}(z_2){\mathbf {r}}_j\beta _j(z_2)\Big |^2, \end{aligned}$$

and, from \(\beta _j(z)=b_j(z)-\beta _j(z)b_j(z)\big ({\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{E}}{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)\big )\), we have

$$\begin{aligned}&{\mathrm{E}}\Big |\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z_1){\mathbf {D}}_j^{-1}(z_2){\mathbf {r}}_j{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z_2){\mathbf {D}}_j^{-1}(z_1){\mathbf {r}}_j\beta _j(z_1)\beta _j(z_2)\Big |^2=O(1),\\&{\mathrm{E}}\Big |\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z_1){\mathbf {D}}_j^{-1}(z_2){\mathbf {D}}_j^{-1}(z_1){\mathbf {r}}_j\beta _j(z_1)\Big |^2=O(1),\\&{\mathrm{E}}\Big |\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z_2){\mathbf {D}}_j^{-1}(z_1){\mathbf {D}}_j^{-1}(z_2){\mathbf {r}}_j\beta _j(z_2)\Big |^2=O(1). \end{aligned}$$

Therefore, the tightness of \(M_n^1(z)\) is verified.

Step 3: Convergence of \(M_n^2(z)\). In this step, we verify that for \(z\in {{\mathcal {C}}}_n\), \(\{M^2_n(z)\}\) converges to (14) in Lemma 1. Note that \(M^2_n(z)=n[{\mathrm{E}}{\underline{m}}_n(z)-{\underline{m}}_n^0(z)]\). Let \({{\mathcal {C}}}_1={{\mathcal {C}}}_u\), or \({{\mathcal {C}}}_u\cup {{\mathcal {C}}}_l\) if \(x_l<0\), and \({{\mathcal {C}}}_2={{\mathcal {C}}}_r\), or \({{\mathcal {C}}}_r\cup {{\mathcal {C}}}_l\) if \(x_l>0\). First, we prove

$$\begin{aligned} \sup _{z\in {{\mathcal {C}}}_n}|{\mathrm{E}}{\underline{m}}_n(z)-{\underline{m}}(z)|\rightarrow 0,\quad \text {as}\, n\rightarrow \infty , \end{aligned}$$
(58)

where \({\underline{m}}_n(z)\) and \({\underline{m}}(z)\) are the Stieltjes transforms of the ESD of \({\underline{{\mathbf {B}}}}_n\) and of \({\underline{F}}^{y, H}\), respectively. Since \(F^{{\underline{{\mathbf {B}}}}_n}\rightarrow {\underline{F}}^{y,H}\) weakly, almost surely, the dominated convergence theorem gives \({\mathrm{E}}F^{{\underline{{\mathbf {B}}}}_n}\rightarrow {\underline{F}}^{y,H}\) weakly. It is easy to verify that \({\mathrm{E}}F^{{\underline{{\mathbf {B}}}}_n}\) is a proper c.d.f. As z ranges over \({{\mathcal {C}}}_1\), the functions \(\lambda \mapsto (\lambda -z)^{-1}\), \(\lambda \in [0,\infty )\), form a bounded and equicontinuous family, so it follows (see, e.g. Billingsley 1995, Problem 8, p. 17) that

$$\begin{aligned} \sup _{z\in {{\mathcal {C}}}_1}|{\mathrm{E}}{\underline{m}}_n(z)-{\underline{m}}(z)|\rightarrow 0. \end{aligned}$$

For \(z\in {{\mathcal {C}}}_2\), from the definitions of \(\eta _l\) and \(\eta _r\) in Sketch of proof of Theorem 2, we have

$$\begin{aligned} {\mathrm{E}}{\underline{m}}_n(z)-{\underline{m}}(z)= & {} \int \frac{1}{\lambda -z}I_{[\eta _l,\eta _r]}(\lambda )d({\mathrm{E}}F^{{{\underline{{\mathbf {B}}}}}_n}(\lambda )-{\underline{F}}^{y,H}(\lambda ))\\&+\int \frac{1}{\lambda -z}I_{[\eta _l,\eta _r]^c}(\lambda )d{\mathrm{E}}F^{{{\underline{{\mathbf {B}}}}}_n}(\lambda ). \end{aligned}$$

As in the discussion above, the first term converges uniformly to zero. When \(\ell \ge 2\), by Assumption (g), we get

$$\begin{aligned}&\sup _{z\in {{\mathcal {C}}}_2}\left| {\mathrm{E}}\int \frac{1}{\lambda -z}I_{[\eta _l,\eta _r]^c}(\lambda )dF^{{{\underline{{\mathbf {B}}}}}_n}(\lambda )\right| \\&\quad \le (\epsilon _n/n)^{-1}P(\Vert {\mathbf {B}}_n\Vert \ge \eta _r\text { or } \lambda _{\min }^{{\mathbf {B}}_n}\le \eta _l) \le Kn\epsilon _n^{-1}n^{-\ell } \rightarrow 0. \end{aligned}$$

Since \(F^{y_n,H_n}\rightarrow F^{y,H}\) weakly (see Bai and Silverstein 1998, below (3.10)) and \({{\mathcal {C}}}\) lies outside the support of \(F^{y,H}\), it is easy to show that

$$\begin{aligned} \sup _{z\in {{\mathcal {C}}}}|{{\underline{m}}_n^0(z)}-{\underline{m}}(z)|\rightarrow 0,\quad \text { as}\, n\rightarrow \infty , \end{aligned}$$
(59)

where \({\underline{m}}_n^0(z)\) is the Stieltjes transform of \({\underline{F}}^{y_n,H_n}\). Next, we prove that

$$\begin{aligned} \sup _{n, z\in {{\mathcal {C}}}_n}\Vert ({{\mathrm{E}}{\underline{m}}_n}(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}\Vert <\infty . \end{aligned}$$
(60)

From Lemma 2.11 of Bai and Silverstein (1998), \(\Vert ({{\mathrm{E}}{\underline{m}}_n}(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}\Vert \) is bounded by \(\max (2,4v_0^{-1})\) on \({{\mathcal {C}}}_u\). Now let us consider the bound on \({{\mathcal {C}}}_l\cup {{\mathcal {C}}}_r\). By Theorem 4.1 of Silverstein and Choi (1995), there exists a support point \(t_0\) of H such that \(1+t_0{\underline{m}}(x_l)\ne 0\). Since \({\underline{m}}(z)\) is analytic on \({{\mathcal {C}}}_l\), there exist positive constants \(\delta _1\) and \(\mu _0\) such that

$$\begin{aligned} \inf _{z\in {{\mathcal {C}}}_l}|1+t_0{\underline{m}}(z)|>\delta _1\ \ \text { and }\ \ \sup _{z\in {{\mathcal {C}}}_l}|{\underline{m}}(z)|<\mu _0. \end{aligned}$$

By (58) and \(H_n\rightarrow H\), for all sufficiently large n, there exists an eigenvalue \(\lambda _j\) of \({\mathbf {T}}_n\) such that \(|\lambda _j-t_0|<\delta _1/(4\mu _0)\) and \(\sup _{z\in {{\mathcal {C}}}_l}|{\mathrm{E}}{\underline{m}}_n(z)-{\underline{m}}(z)|<\delta _1/4\). Then, we have

$$\begin{aligned} \inf _{z\in {{\mathcal {C}}}_l}|1+\lambda _j{\mathrm{E}}{\underline{m}}_n(z)|>\delta _1/2. \end{aligned}$$

For \({{\mathcal {C}}}_r\), a similar argument gives

$$\begin{aligned} \inf _{z\in {{\mathcal {C}}}_r}|1+\lambda _j{\mathrm{E}}{\underline{m}}_n(z)|>\delta _1/2. \end{aligned}$$

This completes the proof of (60).

Next, we show that there exists \(\xi \in (0,1)\) satisfying

$$\begin{aligned} \sup _{z\in {{\mathcal {C}}}_n}\left| y_n{\mathrm{E}}{\underline{m}}^2_n(z)\int \frac{t^2}{(1+t{\mathrm{E}}{\underline{m}}_n(z))^2}dH_n(t)\right| <\xi , \end{aligned}$$
(61)

for all sufficiently large n. Since (1.1) of Bai and Silverstein (1998)

$$\begin{aligned} {\underline{m}}(z)=-\Big (z-y\int \frac{t}{1+t{\underline{m}}(z)}dH(t)\Big )^{-1} \end{aligned}$$
(62)

is valid for \(z=x+\mathbf{i}v\) outside the support of \(F^{y,H}\), we get

$$\begin{aligned} \mathfrak {I}\,{\underline{m}}(z)=\frac{v+\mathfrak {I}\,{\underline{m}}(z)y\int t^2|1+t{\underline{m}}(z)|^{-2}dH(t)}{\left| z-y\int t(1+t{\underline{m}}(z))^{-1}dH(t)\right| ^2}. \end{aligned}$$
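
This identity can be recovered from (62) in one step. Writing \(w(z)=z-y\int t(1+t{\underline{m}}(z))^{-1}dH(t)\), a shorthand introduced here, (62) reads \({\underline{m}}(z)=-1/w(z)\), so that

$$\begin{aligned} \mathfrak {I}\,{\underline{m}}(z)=\frac{\mathfrak {I}\,w(z)}{|w(z)|^2},\qquad \mathfrak {I}\,w(z)=v-y\int t\,\mathfrak {I}\left( \frac{1}{1+t{\underline{m}}(z)}\right) dH(t)=v+\mathfrak {I}\,{\underline{m}}(z)\,y\int \frac{t^2}{|1+t{\underline{m}}(z)|^{2}}dH(t), \end{aligned}$$

where the last step uses \(\mathfrak {I}\,(1+t{\underline{m}})^{-1}=-t\,\mathfrak {I}\,{\underline{m}}/|1+t{\underline{m}}|^2\) for real t.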

Therefore

$$\begin{aligned}&\left| y{\underline{m}}^2(z)\int \frac{t^2}{(1+t{\underline{m}}(z))^2}dH(t)\right| \nonumber \\&\quad \le \frac{y\int t^2|1+t{\underline{m}}(z)|^{-2}dH(t)}{\left| z-y\int t(1+t{\underline{m}}(z))^{-1}dH(t)\right| ^2}\nonumber \\&\quad = \frac{\mathfrak {I}\,{\underline{m}}(z)y\int t^2|1+t{\underline{m}}(z)|^{-2}dH(t)}{v+\mathfrak {I}\,{\underline{m}}(z)y\int t^2|1+t{\underline{m}}(z)|^{-2}dH(t)}\nonumber \\&\quad = \frac{y\int |x-z|^{-2}d{\underline{F}}^{y,H}(x)\int t^2|1+t{\underline{m}}(z)|^{-2}dH(t)}{1+y\int |x-z|^{-2}d{\underline{F}}^{y,H}(x)\int t^2|1+t{\underline{m}}(z)|^{-2}dH(t)}<1, \end{aligned}$$
(63)

for all \(z\in {{\mathcal {C}}}\). By continuity, there exists \(\xi _1<1\) such that

$$\begin{aligned} \sup _{z\in {{\mathcal {C}}}}\left| y{\underline{m}}(z)^2\int \frac{t^2}{(1+t{\underline{m}}(z))^2}dH(t)\right| <\xi _1. \end{aligned}$$
(64)

Thus, according to (58), we conclude that (61) holds.
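
As a numerical illustration of (63), the following Python sketch evaluates its left-hand side in the special case \(H=\delta _1\), where (62) reduces to the quadratic equation \(z{\underline{m}}^2+(z+1-y){\underline{m}}+1=0\). The function names, the choice \(H=\delta _1\) and the sample point are assumptions made for illustration only and are not part of the proof.

```python
import cmath


def m_underline(z, y):
    """Root of z*m^2 + (z + 1 - y)*m + 1 = 0 with the larger imaginary part.

    This is (62) specialised to H = delta_1 (an assumption of this sketch);
    for Im z > 0 the root with positive imaginary part is the one of interest.
    """
    disc = cmath.sqrt((z + 1 - y) ** 2 - 4 * z)
    roots = [(-(z + 1 - y) + sign * disc) / (2 * z) for sign in (1, -1)]
    return max(roots, key=lambda m: m.imag)


def lhs_63(z, y):
    """Left-hand side of (63) when H = delta_1: |y * m^2 / (1 + m)^2|."""
    m = m_underline(z, y)
    return abs(y * m ** 2 / (1 + m) ** 2)


if __name__ == "__main__":
    z, y = 3 + 1j, 0.5
    print("m(z) =", m_underline(z, y))
    print("left-hand side of (63) =", lhs_63(z, y))
```

For points with \(v>0\) the printed value should be strictly below one, in line with (63).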

We now give bounds on some quantities that appeared earlier. Recall the functions \(\beta _j(z)\) defined in (34) and let \(\zeta _j(z)={\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{E}}{\mathrm{tr}}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)\). For \(p\ge 4\), based on Lemma 3, we have

$$\begin{aligned} {\mathrm{E}}|\zeta _j(z)|^p\le Kn^{-2},\quad j=1,\ldots ,n. \end{aligned}$$
(65)

Let \({\mathbf {M}}\) be a \(p\times p\) nonrandom matrix. Then

$$\begin{aligned}&{\mathrm{E}}|{\mathrm{tr}}{\mathbf {D}}^{-1}(z){\mathbf {M}}-{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}^{-1}(z){\mathbf {M}}|^2={\mathrm{E}}|\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathrm{tr}}{\mathbf {D}}^{-1}(z){\mathbf {M}}|^2\\&\quad = {\mathrm{E}}|\sum _{j=1}^n({\mathrm{E}}_j-{\mathrm{E}}_{j-1}){\mathrm{tr}}({\mathbf {D}}^{-1}(z)-{\mathbf {D}}_j^{-1}(z)){\mathbf {M}}|^2\\&\quad = \sum _{j=1}^n{\mathrm{E}}|({\mathrm{E}}_j-{\mathrm{E}}_{j-1})\beta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j|^2\\&\quad = \sum _{j=1}^n|b_j(z)|^2{\mathrm{E}}|({\mathrm{E}}_j-{\mathrm{E}}_{j-1})(1-\beta _j(z)\zeta _j(z)){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j|^2\\&\quad = \sum _{j=1}^n|b_j(z)|^2{\mathrm{E}}|{\mathrm{E}}_j({\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {T}}_n)\\&\qquad -\,({\mathrm{E}}_j-{\mathrm{E}}_{j-1})\beta _j(z)\zeta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j|^2\\&\quad \le 2\sum _{j=1}^n|b_j(z)|^2{\mathrm{E}}|{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {T}}_n|^2\\&\qquad +\,4\sum _{j=1}^n|b_j(z)|^2{\mathrm{E}}|\beta _j(z)\zeta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j|^2\\&\quad \le 2\sum _{j=1}^n|b_j(z)|^2{\mathrm{E}}|{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {T}}_n|^2\\&\qquad +\,4\sum _{j=1}^n|b_j(z)|^2({\mathrm{E}}|\zeta _j(z)|^4)^{1/2}({\mathrm{E}}|\beta _j(z)|^8)^{1/4}({\mathrm{E}}|{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {M}}{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j|^8)^{1/4}. \end{aligned}$$

Using (9.9.3) of Bai and Silverstein (2010), (65) and the boundedness of \(b_j(z)\), we get

$$\begin{aligned} {\mathrm{E}}|{\mathrm{tr}}{\mathbf {D}}^{-1}(z){\mathbf {M}}-{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}^{-1}(z){\mathbf {M}}|^2\le K\Vert {\mathbf {M}}\Vert ^2. \end{aligned}$$
(66)

The same argument applies to \({\mathbf {D}}_j^{-1}(z)\), so we also get

$$\begin{aligned} {\mathrm{E}}|{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z){\mathbf {M}}-{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z){\mathbf {M}}|^2\le K\Vert {\mathbf {M}}\Vert ^2. \end{aligned}$$
(67)

From (5.2) in Bai and Silverstein (1998), for \(z\in {{\mathcal {C}}}_n\), we have

$$\begin{aligned}&n\left( y_n\int \frac{dH_n(t)}{1+t{\mathrm{E}}{\underline{m}}_n(z)}+zy_n{\mathrm{E}}m_n(z)\right) \\&\quad =\sum _{j=1}^n{\mathrm{E}}\beta _j(z)\bigg [{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j -n^{-1}{\mathrm{E}}{\mathrm{tr}}({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}^{-1}(z)\bigg ]. \end{aligned}$$

Throughout the following, all bounds, including \(O(\cdot )\) and \(o(\cdot )\) expressions, and convergence statements hold uniformly for \(z\in {{\mathcal {C}}}_n\). From (53), we have

$$\begin{aligned}&n^{-1}\sum _{j=1}^n{\mathrm{E}}\beta _j(z){\mathrm{E}}{\mathrm{tr}}({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n({\mathbf {D}}_j^{-1}(z)-{\mathbf {D}}^{-1}(z))\nonumber \\&\quad =n^{-1}\sum _{j=1}^n{\mathrm{E}}\beta _j(z){\mathrm{E}}\beta _j(z){\mathrm{tr}}({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)\nonumber \\&\quad =n^{-1}\sum _{j=1}^n{\mathrm{E}}\beta _j(z)b_j(z){\mathrm{E}}(1-\beta _j(z)\zeta _j(z)){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j.\nonumber \\ \end{aligned}$$
(68)

From (60), for \(j=1,\ldots ,n\), we get

$$\begin{aligned}&|{\mathrm{E}}\beta _j(z)\zeta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j|\\&\quad \le ({\mathrm{E}}|{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j|^4)^{1/4}\\&\qquad \times ({\mathrm{E}}|\zeta _j(z)|^2)^{1/2}({\mathrm{E}}|\beta _j(z)|^4)^{1/4}\le Kn^{-1/2}. \end{aligned}$$

Since \(\beta _j(z)=b_j(z)-b_j(z)\beta _j(z)\zeta _j(z)\), we have \({\mathrm{E}}\beta _j(z)=b_j(z)+O(n^{-1/2})\). Thus,

$$\begin{aligned} |(68)-n^{-2}\sum _{j=1}^nb_j^2(z){\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z){\mathbf {T}}_n|\le Kn^{-1/2}. \end{aligned}$$

Since \(\beta _j(z)=b_j(z)-b_j^2(z)\zeta _j(z)+\beta _j(z)b_j^2(z)\zeta _j^2(z)\), we have

$$\begin{aligned}&\sum _{j=1}^n{\mathrm{E}}\beta _j(z)({\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j -n^{-1}{\mathrm{E}}{\mathrm{tr}}({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z))\\&\quad =-\sum _{j=1}^n b_j^2(z){\mathrm{E}}\zeta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\qquad +\sum _{j=1}^n b_j^2(z){\mathrm{E}}\beta _j(z)\zeta _j^2(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\qquad -n^{-1}\sum _{j=1}^n b_j^2(z){\mathrm{E}}\beta _j(z)\zeta _j^2(z){\mathrm{E}}{\mathrm{tr}}({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)\\&\quad =-\sum _{j=1}^n b_j^2(z){\mathrm{E}}\zeta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\qquad +\sum _{j=1}^n b_j^2(z){\mathrm{E}}\beta _j(z)\zeta _j^2(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\qquad -n^{-1}\sum _{j=1}^n b_j^2(z){\mathrm{E}}\beta _j(z)\zeta _j^2(z){\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n\\&\qquad +n^{-1}\sum _{j=1}^n b_j^2(z)\text {Cov}(\beta _j(z)\zeta _j^2(z),{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n). \end{aligned}$$

Using (60), (65), (67) and Lemma 3, we have

$$\begin{aligned}&\Big |\sum _{j=1}^n b_j^2(z){\mathrm{E}}\beta _j(z)\zeta _j^2(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\qquad -n^{-1}\sum _{j=1}^n b_j^2(z){\mathrm{E}}\beta _j(z)\zeta _j^2(z){\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n\big |\\&\quad \le \sum _{j=1}^n|b_j(z)|^2({\mathrm{E}}|\beta _j(z)|^4)^{1/4}({\mathrm{E}}|\zeta _j(z)|^4)^{1/2}({\mathrm{E}}|{\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\qquad -n^{-1}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n|^4)^{1/4}\le Kn^{-1/2} \end{aligned}$$

and

$$\begin{aligned}&\Big |n^{-1}\sum _{j=1}^n b_j^2(z)\text {Cov}(\beta _j(z)\zeta _j^2(z),{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n)\Big |\\&\quad \le n^{-1}\sum _{j=1}^n|b_j(z)|^2({\mathrm{E}}|\beta _j(z)|^6)^{1/6}({\mathrm{E}}|\zeta _j(z)|^6)^{1/3}({\mathrm{E}}|{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n\\&\qquad -{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n|^2)^{1/2}\le Kn^{-2/3}. \end{aligned}$$

Write

$$\begin{aligned}&\sum _{j=1}^n b_j^2(z){\mathrm{E}}\zeta _j(z){\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\quad =\sum _{j=1}^n b_j^2(z){\mathrm{E}}[({\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z){\mathbf {T}}_n)({\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\qquad -n^{-1}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n)]\\&\qquad +n^{-2}\sum _{j=1}^n\text {Cov}({\mathrm{tr}}{\mathbf {D}}_j^{-1}(z){\mathbf {T}}_n,{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n). \end{aligned}$$

Based on (67), we get that the second term above is \(O(n^{-1})\). Therefore, we have

$$\begin{aligned}&n\left( y_n\int \frac{dH_n(t)}{1+t{\mathrm{E}}{\underline{m}}_n}+zy_n{\mathrm{E}}m_n(z)\right) \\&\quad =n^{-2}\sum _{j=1}^nb_j^2(z){\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z){\mathbf {T}}_n\\&\qquad -\sum _{j=1}^n b_j^2(z){\mathrm{E}}[({\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z){\mathbf {r}}_j-n^{-1}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z){\mathbf {T}}_n)({\mathbf {r}}_j^*{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {r}}_j\\&\qquad -n^{-1}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}}_p)^{-1}{\mathbf {T}}_n)]+o(1)\\&\quad =-\frac{\alpha _x}{n^2}\sum _{j=1}^nb_j^2(z){\mathrm{E}}{\mathrm{tr}}{\mathbf {T}}_n({\mathbf {D}}_j^T)^{-1}(z){\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}})^{-1}\\&\qquad -\frac{\beta _x}{n^2}\sum _{j=1}^nb_j^2(z)\sum _{i=1}^{k}{\mathrm{E}}{\mathbf {q}}_i^{*}{\mathbf {D}}_j^{-1}(z){\mathbf {q}}_i \cdot {\mathbf {q}}_i^{*}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}})^{-1}{\mathbf {q}}_i+o(1). \end{aligned}$$

Let

$$\begin{aligned} A_n(z)=y_n\int \frac{dH_n(t)}{1+t{\mathrm{E}}{\underline{m}}_n(z)}+zy_n{\mathrm{E}}m_n(z). \end{aligned}$$

From the identity

$$\begin{aligned} {\mathrm{E}}{\underline{m}}_n(z)=-\frac{(1-y_n)}{z}+y_n{\mathrm{E}}m_n(z), \end{aligned}$$

we have

$$\begin{aligned} A_n(z)= & {} y_n\int \frac{dH_n(t)}{1+t{\mathrm{E}}{\underline{m}}_n(z)}-y_n+z{\mathrm{E}}{\underline{m}}_n(z)+1\\= & {} -{\mathrm{E}}{\underline{m}}_n(z)\left( -z-\frac{1}{{\mathrm{E}}{\underline{m}}_n(z)}+y_n\int \frac{tdH_n(t)}{1+t{\mathrm{E}}{\underline{m}}_n(z)}\right) . \end{aligned}$$

It follows that

$$\begin{aligned} {\mathrm{E}}{\underline{m}}_n(z)=\left[ -z+y_n\int \frac{tdH_n(t)}{1+t{\mathrm{E}}{\underline{m}}_n(z)}+A_n/{\mathrm{E}}{\underline{m}}_n(z)\right] ^{-1}. \end{aligned}$$
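
The two displays above follow from elementary algebra. With the shorthand \({\underline{m}}_E={\mathrm{E}}{\underline{m}}_n(z)\), introduced here for brevity, the identity relating \({\mathrm{E}}{\underline{m}}_n(z)\) and \({\mathrm{E}}m_n(z)\) gives \(zy_n{\mathrm{E}}m_n(z)=z{\underline{m}}_E+1-y_n\), and

$$\begin{aligned} y_n\int \frac{dH_n(t)}{1+t{\underline{m}}_E}-y_n=-{\underline{m}}_E\,y_n\int \frac{t\,dH_n(t)}{1+t{\underline{m}}_E},\qquad \text {so that}\qquad A_n(z)=1+z{\underline{m}}_E-{\underline{m}}_E\,y_n\int \frac{t\,dH_n(t)}{1+t{\underline{m}}_E}. \end{aligned}$$

Dividing by \({\underline{m}}_E\) and solving for \(1/{\underline{m}}_E\) gives \(1/{\underline{m}}_E=-z+y_n\int t(1+t{\underline{m}}_E)^{-1}dH_n(t)+A_n/{\underline{m}}_E\), which is the last display.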

From (62), we get

$$\begin{aligned} n[{\mathrm{E}}{\underline{m}}_n(z)-{\underline{m}}_n^0(z)]=-\frac{{\underline{m}}_n^0(z)nA_n}{1-y_n{\mathrm{E}}{\underline{m}}_n(z){\underline{m}}_n^0(z)\int \frac{t^2dH_n(t)}{(1+t{\mathrm{E}}{\underline{m}}_n(z))(1+t{\underline{m}}_n^0(z))}}. \end{aligned}$$
(69)
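
Equation (69) is obtained by subtracting the reciprocals of the two fixed-point equations. The following sketch assumes that \({\underline{m}}_n^0(z)\) satisfies (62) with \((y,H)\) replaced by \((y_n,H_n)\), and writes \({\underline{m}}_E={\mathrm{E}}{\underline{m}}_n(z)\) and \({\underline{m}}^0={\underline{m}}_n^0(z)\) as shorthand introduced here.

$$\begin{aligned} {\underline{m}}_E-{\underline{m}}^0&={\underline{m}}_E{\underline{m}}^0\left( \frac{1}{{\underline{m}}^0}-\frac{1}{{\underline{m}}_E}\right) \\&={\underline{m}}_E{\underline{m}}^0\left( y_n\int \frac{t\,dH_n(t)}{1+t{\underline{m}}^0}-y_n\int \frac{t\,dH_n(t)}{1+t{\underline{m}}_E}\right) -{\underline{m}}^0A_n\\&=({\underline{m}}_E-{\underline{m}}^0)\,y_n{\underline{m}}_E{\underline{m}}^0\int \frac{t^2dH_n(t)}{(1+t{\underline{m}}_E)(1+t{\underline{m}}^0)}-{\underline{m}}^0A_n. \end{aligned}$$

Collecting the terms in \({\underline{m}}_E-{\underline{m}}^0\) on the left and multiplying by n gives (69).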

Based on (61) and the corresponding bound involving \({\underline{m}}_n^0(z)\), the denominator of (69) is bounded away from zero. Therefore, by (9.9.12) of Bai and Silverstein (2010), we have

$$\begin{aligned}&z^2{\underline{m}}^2(z)n^{-1}{\mathrm{E}}{\mathrm{tr}}{\mathbf {T}}_n({\mathbf {D}}_j^T)^{-1}(z){\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}})^{-1}\\&\quad ={\underline{m}}^2(z)n^{-1}{\mathrm{tr}}({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-3}{\mathbf {T}}_n^2\\&\qquad +\,z^2{\underline{m}}^4(z)n^{-1}\sum \limits _{i\ne j} {\mathrm{E}}{\mathrm{tr}}\bigg \{({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}{\mathbf {T}}_n({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}({\mathbf {r}}_i{\mathbf {r}}_i^{*}-\frac{1}{n}{\mathbf {T}}_n){\mathbf {D}}_{ij}^{-1}(z)\\&\qquad \times ({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}{\mathbf {T}}_n({\mathbf {D}}^T_{ij})^{-1}(z)({{\bar{{\mathbf {r}}}}}_i{\mathbf {r}}_i^T-\frac{1}{n}{\mathbf {T}}_n)\bigg \}+o(1)\\&\quad ={\underline{m}}^2(z)n^{-1}{\mathrm{tr}}({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-3}{\mathbf {T}}_n^2\\&\qquad +\,\alpha _x z^2{\underline{m}}^4(z)n^{-1}{\mathrm{tr}}(({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}{\mathbf {T}}_n)^2\\&\qquad \times n^{-1}{\mathrm{E}}{\mathrm{tr}}{\mathbf {D}}_j^{-1}(z)({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}{\mathbf {T}}_n({\mathbf {D}}^T_j)^{-1}(z){\mathbf {T}}_n+o(1). \end{aligned}$$

Then we have

$$\begin{aligned}&z^2{\underline{m}}^2(z)n^{-1}{\mathrm{E}}{\mathrm{tr}}{\mathbf {T}}_n({\mathbf {D}}_j^T)^{-1}(z){\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}})^{-1}\\&\quad =\frac{{\underline{m}}^2(z)n^{-1}{\mathrm{tr}}({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-3}{\mathbf {T}}_n^2}{1-\alpha _x{\underline{m}}^2(z)n^{-1}{\mathrm{tr}}(({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}{\mathbf {T}}_n)^2}+o(1)\\&\quad =\frac{y\int {\underline{m}}^2(z)t^2(1+t{\underline{m}}(z))^{-3}dH(t)}{1-\alpha _xy\int {\underline{m}}^2(z)t^2(1+t{\underline{m}}(z))^{-2}dH(t)}+o(1). \end{aligned}$$
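
The first equality is the solution of a linear equation. As a sketch, write \(u_n\) for the left-hand side, \(a_n={\underline{m}}^2(z)n^{-1}{\mathrm{tr}}({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-3}{\mathbf {T}}_n^2\) and \(s_n=n^{-1}{\mathrm{tr}}(({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}{\mathbf {T}}_n)^2\); these symbols are shorthand introduced here. Assuming that, up to \(o(1)\), the trace in the last line of the preceding display equals \(u_n/(z^2{\underline{m}}^2(z))\) (by cyclicity of the trace and replacing \({\mathrm{E}}{\underline{m}}_n(z)\) by \({\underline{m}}(z)\)), that display reads

$$\begin{aligned} u_n=a_n+\alpha _x{\underline{m}}^2(z)s_nu_n+o(1),\qquad \text {hence}\qquad u_n=\frac{a_n}{1-\alpha _x{\underline{m}}^2(z)s_n}+o(1), \end{aligned}$$

provided the denominator stays bounded away from zero. The second equality then replaces the normalized traces by their limiting integrals with respect to H.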

Thus we obtain

$$\begin{aligned}&\frac{\alpha _x n^{-2}\sum _{j=1}^nb_j^2(z){\mathrm{E}}{\mathrm{tr}}{\mathbf {T}}_n({\mathbf {D}}_j^T)^{-1}(z){\mathbf {T}}_n{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}})^{-1}}{1-y\int {\underline{m}}^2(z)t^2(1+t{\underline{m}}(z))^{-2}dH(t)}\\&\quad =\frac{\alpha _xy\int {\underline{m}}^2(z)t^2(1+t{\underline{m}}(z))^{-3}dH(t)}{\left( 1-y\int \frac{{\underline{m}}^2(z)t^2dH(t)}{(1+t{\underline{m}}(z))^2}\right) \left( 1-\alpha _xy\int \frac{{\underline{m}}^2(z)t^2dH(t)}{(1+t{\underline{m}}(z))^2}\right) }+o(1). \end{aligned}$$

Moreover, if \(\beta _x=0\) or \(\Vert {\mathbf {Q}}^*{{\mathbf {Q}}}-{\mathrm{diag}}({\mathbf {Q}}^*{{\mathbf {Q}}})\Vert =o(1)\), then we have

$$\begin{aligned}&\frac{\beta _x n^{-2}\sum _{j=1}^nb_j^2(z)\sum _{i=1}^{k}{\mathrm{E}}{\mathbf {q}}_i^{*}{\mathbf {D}}_j^{-1}(z){\mathbf {q}}_i \cdot {\mathbf {q}}_i^{*}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}})^{-1}{\mathbf {q}}_i}{1-y\int {\underline{m}}^2(z)t^2(1+t{\underline{m}}(z))^{-2}dH(t)}\\&\quad =\frac{\beta _xy\int {\underline{m}}^2(z)t^2(1+t{\underline{m}}(z))^{-3}dH(t)}{1-y\int {\underline{m}}^2(z)t^2(1+t{\underline{m}}(z))^{-2}dH(t)}+o(1), \end{aligned}$$

where

$$\begin{aligned}&{\mathrm{E}}{\mathbf {q}}_i^{*}{\mathbf {D}}_j^{-1}(z){\mathbf {q}}_i\cdot {\mathbf {q}}_i^{*}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+{\mathbf {I}})^{-1}{\mathbf {q}}_i\\&\quad ={\mathrm{E}}\mathbf{e}_i^T{\mathbf {Q}}^{*}{\mathbf {D}}_j^{-1}(z){\mathbf {Q}}\mathbf{e}_i\mathbf{e}_i^T{\mathbf {Q}}^{*}{\mathbf {D}}_j^{-1}(z)({\mathrm{E}}{\underline{m}}_n(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}{\mathbf {Q}}\mathbf{e}_i\\&\quad =z^{-2}{} \mathbf{e}_i^T{\mathbf {Q}}^{*}({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-1}{\mathbf {Q}}\mathbf{e}_i\mathbf{e}_i^T{\mathbf {Q}}^{*}({\underline{m}}(z){\mathbf {T}}_n+\mathbf{I}_p)^{-2}{\mathbf {Q}}\mathbf{e}_i+o(1) \end{aligned}$$

and

$$\begin{aligned} ({\underline{m}}(z){\mathbf {Q}}{\mathbf {Q}}^{*}+{\mathbf {I}}_p)^{-1}={\mathbf {I}}_p-{\underline{m}}(z){\mathbf {Q}}({\mathbf {I}}_k+{\underline{m}}(z){\mathbf {Q}}^{*}{\mathbf {Q}})^{-1}{\mathbf {Q}}^{*}. \end{aligned}$$

That is,

$$\begin{aligned}&n[{\mathrm{E}}{\underline{m}}_n(z)-{\underline{m}}_n^0(z)]\\&\quad =-\frac{{\underline{m}}_n^0(z)nA_n(z)}{1-y_n{\mathrm{E}}{\underline{m}}_n(z){\underline{m}}_n^0(z)\int \frac{t^2dH_n(t)}{(1+t{\mathrm{E}}{\underline{m}}_n(z))(1+t{\underline{m}}_n^0(z))}}\\&\quad =\frac{\alpha _xy\int {\underline{m}}^3(z)t^2(1+t{\underline{m}}(z))^{-3}dH(t)}{\left( 1-y\int \frac{{\underline{m}}^2(z)t^2dH(t)}{(1+t{\underline{m}}(z))^2}\right) \left( 1-\alpha _xy\int \frac{{\underline{m}}^2(z)t^2dH(t)}{(1+t{\underline{m}}(z))^2}\right) }\\&\qquad +\frac{\beta _xy\int {\underline{m}}^3(z)t^2(1+t{\underline{m}}(z))^{-3}dH(t)}{1-y\int {\underline{m}}^2(z)t^2(1+t{\underline{m}}(z))^{-2}dH(t)}+o(1). \end{aligned}$$
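
The matrix identity for \(({\underline{m}}(z){\mathbf {Q}}{\mathbf {Q}}^{*}+{\mathbf {I}}_p)^{-1}\) used above can be verified by direct multiplication. The following check assumes finite k and suppresses the argument z of \({\underline{m}}\).

$$\begin{aligned}&({\underline{m}}{\mathbf {Q}}{\mathbf {Q}}^{*}+{\mathbf {I}}_p)\left[ {\mathbf {I}}_p-{\underline{m}}{\mathbf {Q}}({\mathbf {I}}_k+{\underline{m}}{\mathbf {Q}}^{*}{\mathbf {Q}})^{-1}{\mathbf {Q}}^{*}\right] \\&\quad ={\mathbf {I}}_p+{\underline{m}}{\mathbf {Q}}{\mathbf {Q}}^{*}-{\underline{m}}{\mathbf {Q}}({\mathbf {I}}_k+{\underline{m}}{\mathbf {Q}}^{*}{\mathbf {Q}})({\mathbf {I}}_k+{\underline{m}}{\mathbf {Q}}^{*}{\mathbf {Q}})^{-1}{\mathbf {Q}}^{*}={\mathbf {I}}_p. \end{aligned}$$

Multiplying this identity by \({\mathbf {Q}}^{*}\) on the left and \({\mathbf {Q}}\) on the right, with \({\mathbf {T}}_n={\mathbf {Q}}{\mathbf {Q}}^{*}\), gives the expression for \({\mathbf {Q}}^*[{\underline{m}}(z){\mathbf {T}}_n+{\mathbf {I}}_p]^{-1}{\mathbf {Q}}\) used in the treatment of the \(\beta _x\) terms.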

A.2.3 Applying Lemma 1 to derive Theorem 2

In this section, we show how to derive Theorem 2 from Lemma 1. Choose \(v_0\), \(x_r\) and \(x_l\) so that \(f_1,\ldots ,f_L\) are all analytic on and inside the contour \({{{\mathcal {C}}}}\). For any \(f\in \{f_1,\ldots ,f_L\}\) and all sufficiently large n, with probability one

$$\begin{aligned} \int f(x)dG_n(x)=-\frac{1}{2\pi \mathbf{i}}\oint \limits _{{{{\mathcal {C}}}}}f(z)M_n(z)dz. \end{aligned}$$
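
The contour representation above rests on Cauchy's integral formula. As an informal numerical illustration, the Python sketch below checks the analogous identity \(p^{-1}\sum _i f(\lambda _i)=-\frac{1}{2\pi \mathbf{i}}\oint f(z)m(z)dz\) for a discrete spectrum, where \(m(z)=p^{-1}\sum _i(\lambda _i-z)^{-1}\). The circular contour (the proof uses a rectangle), the toy eigenvalues and the function names are assumptions made for illustration.

```python
import cmath
import math


def stieltjes(z, eigs):
    """Stieltjes transform of the empirical distribution of `eigs` at z."""
    return sum(1.0 / (lam - z) for lam in eigs) / len(eigs)


def contour_lss(f, eigs, center, radius, n_nodes=400):
    """Approximate -(1/(2*pi*i)) * contour integral of f(z) * m(z) dz.

    The contour is a counterclockwise circle, discretised with the
    trapezoidal rule; it must enclose all of `eigs`.
    """
    total = 0j
    for k in range(n_nodes):
        theta = 2 * math.pi * k / n_nodes
        z = center + radius * cmath.exp(1j * theta)
        dz = 1j * radius * cmath.exp(1j * theta) * (2 * math.pi / n_nodes)
        total += f(z) * stieltjes(z, eigs) * dz
    return -total / (2j * math.pi)


if __name__ == "__main__":
    eigs = [0.5, 1.0, 2.0]                      # toy spectrum (assumption)
    direct = sum(lam ** 2 for lam in eigs) / len(eigs)
    via_contour = contour_lss(lambda z: z ** 2, eigs, center=1.25, radius=2.0)
    print(direct, via_contour)
```

The two printed numbers should agree up to rounding error, since f is analytic inside the contour and every eigenvalue lies inside it.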

From the definition of \({\widehat{M}}_n(z)\), with probability one

$$\begin{aligned}&\left| \oint \limits _{{{{\mathcal {C}}}}}f(z)(M_n(z)-{\widehat{M}}_n(z))dz\right| \le 4K\epsilon _n(| \max (\lambda _{\max }^{{\mathbf {T}}_n}(1+\sqrt{y_n})^2,\lambda _{\max }^{{\mathbf {B}}_n})-x_r|^{-1} \\&\quad +|\min (\lambda _{\min }^{{\mathbf {T}}_n}(1-\sqrt{y_n})^2I(0<y_n<1),\lambda _{\min }^{{\mathbf {B}}_n})-x_l|^{-1}) \rightarrow 0, \end{aligned}$$

as \(n\rightarrow \infty \). Here K is a bound on f over \({{{\mathcal {C}}}}\). Since

$$\begin{aligned} {\widehat{M}}_n(\cdot )\longrightarrow \left( -\frac{1}{2\pi \mathbf{i}}\oint \limits _{{{{\mathcal {C}}}}}f_1(z)\,{\widehat{M}}_n(z)dz,\ldots ,-\frac{1}{2\pi \mathbf{i}}\oint \limits _{{{{\mathcal {C}}}}}f_L(z)\,{{\widehat{M}}}_n(z)dz \right) \end{aligned}$$

is a continuous mapping of \(C({{{\mathcal {C}}}}, {{\mathbb {R}}}^2)\) into \({{\mathbb {R}}}^L\), the above vector forms a tight sequence and has the weak limit equal in distribution to

$$\begin{aligned} \left( -\frac{1}{2\pi \mathbf{i}}\oint \limits _{{{{\mathcal {C}}}}}f_1(z)\,M(z)dz,\ldots ,-\frac{1}{2\pi \mathbf{i}}\oint \limits _{{{{\mathcal {C}}}}}f_L(z)\,M(z)dz\right) . \end{aligned}$$

Since the Riemann sums corresponding to the integrals in the above vector are multivariate Gaussian, and weak limits of Gaussian vectors can only be Gaussian, the above vector is multivariate Gaussian. The limiting expressions (11) and (12) then follow immediately.

A.3 Proof of Corollary 2

Let \(a(y)=(1-\sqrt{y})^2\) and \(b(y)=(1+\sqrt{y})^2\). Then, for \(\ell \in \{1,\ldots ,L\}\), we have

$$\begin{aligned} {\mathrm{E}}X_{f_{\ell }}= & {} \frac{[a(y)]^{\ell }+[b(y)]^{\ell }}{4}-\frac{1}{2\pi }\int \limits _0^\pi (1+y-2\sqrt{y}\cos \theta )^{\ell }d\theta \\&-\beta _x\frac{1}{2\pi \mathbf{i}}\oint z^{\ell }\frac{y{\underline{m}}^3(z)(1+{\underline{m}}(z))^{-3}}{1-y{\underline{m}}^2(z)(1+{\underline{m}}(z))^{-2}}dz\\= & {} \frac{[a(y)]^{\ell }+[b(y)]^{\ell }}{4}-\frac{1}{2}\sum \limits _{\ell _1=0}^{\ell } \left( {\begin{array}{c}\ell \\ \ell _1\end{array}}\right) ^2y^{\ell _1}+\beta _x\sum \limits _{\ell _2=2}^{\ell }\left( {\begin{array}{c}\ell \\ \ell _2-2\end{array}}\right) \left( {\begin{array}{c}\ell \\ \ell _2\end{array}}\right) y^{\ell +1-\ell _2}, \end{aligned}$$

where

$$\begin{aligned}&\frac{1}{2\pi \mathbf{i}}\oint z^{\ell }\frac{y{\underline{m}}^3(z)(1+{\underline{m}}(z))^{-3}}{1-y{\underline{m}}^2(z)(1+{\underline{m}}(z))^{-2}}dz\\&\quad =\frac{y}{2\pi \mathbf{i}}\oint \left( -\frac{1}{{\underline{m}}(z)}+\frac{y}{1+{\underline{m}}(z)}\right) ^{\ell } \frac{{\underline{m}}(z)}{(1+{\underline{m}}(z))^3}d{\underline{m}}(z)\\&\quad =\frac{y}{2\pi \mathbf{i}}\oint \sum \limits _{{\ell _1}=0}^{\ell }\frac{\left( {\begin{array}{c}\ell \\ {\ell _1}\end{array}}\right) (-1)^{\ell _1}y^{\ell -{\ell _1}}{\underline{m}}(z)}{{\underline{m}}^{\ell _1}(z)(1+{\underline{m}}(z))^{\ell +3-{\ell _1}}}d{\underline{m}}(z)\\&\quad =\frac{y}{2\pi \mathbf{i}}\oint \sum \limits _{{\ell _1}=2}^{\ell }\frac{\left( {\begin{array}{c}\ell \\ {\ell _1}\end{array}}\right) (-1)^{\ell _1}y^{\ell -{\ell _1}}}{{\underline{m}}^{{\ell _1}-1}(z)(1+{\underline{m}}(z))^{\ell +3-{\ell _1}}}d{\underline{m}}(z)\\&\quad =\sum \limits _{{\ell _1}=2}^{\ell }\frac{\left( {\begin{array}{c}\ell \\ {\ell _1}\end{array}}\right) (-1)^{\ell _1}y^{\ell +1-{\ell _1}}}{2\pi \mathbf{i}}\oint \frac{1}{{\underline{m}}^{{\ell _1}-1}(z)(1+{\underline{m}}(z))^{\ell +3-{\ell _1}}}d{\underline{m}}(z)\\&\quad =-\sum \limits _{{\ell _1}=2}^{\ell }\left( {\begin{array}{c}\ell \\ {\ell _1}-2\end{array}}\right) \left( {\begin{array}{c}\ell \\ {\ell _1}\end{array}}\right) y^{\ell +1-{\ell _1}}, \end{aligned}$$

and

$$\begin{aligned}&{\mathrm{Cov}}(X_{f_{\ell }}, X_{f_{\ell '}})\\&\quad =2y^{\ell +\ell '}\sum \limits _{\ell _1=0}^{\ell -1}\sum \limits _{\ell _2=0}^{\ell '} \left( {\begin{array}{c}\ell \\ \ell _1\end{array}}\right) \left( {\begin{array}{c}\ell '\\ \ell _2\end{array}}\right) \left( \frac{1-y}{y}\right) ^{\ell _1+\ell _2}\\&\qquad \times \sum \limits _{\ell _3=0}^{\ell -\ell _1}\ell _3\left( {\begin{array}{c}2\ell -1-\ell _1-\ell _3\\ \ell -1\end{array}}\right) \left( {\begin{array}{c}2\ell '-1-\ell _2+\ell _3\\ \ell '-1\end{array}}\right) \\&\qquad +\beta _x y\sum \limits _{\ell _1=1}^{\ell }\left( {\begin{array}{c}\ell \\ \ell _1-1\end{array}}\right) \left( {\begin{array}{c}\ell \\ \ell _1\end{array}}\right) y^{\ell -\ell _1} \sum \limits _{\ell _2=1}^{\ell '}\left( {\begin{array}{c}\ell '\\ \ell _2-1\end{array}}\right) \left( {\begin{array}{c}\ell '\\ \ell _2\end{array}}\right) y^{\ell '-\ell _2},\quad \ell , \ell '\in \{1,\ldots ,L\} \end{aligned}$$

with

$$\begin{aligned} \frac{1}{2\pi \mathbf{i}}\oint \frac{z^{\ell }}{(1+{\underline{m}}(z))^2}d{\underline{m}}(z)= & {} \frac{1}{2\pi \mathbf{i}}\oint \frac{1}{(1+{\underline{m}}(z))^2}\left( -\frac{1}{{\underline{m}}(z)}+\frac{y}{1+{\underline{m}}(z)}\right) ^{\ell }d{\underline{m}}(z)\\= & {} \frac{1}{2\pi \mathbf{i}}\oint \sum \limits _{{\ell _1}=0}^{\ell }\frac{\left( {\begin{array}{c}\ell \\ {\ell _1}\end{array}}\right) (-1)^{\ell _1}y^{\ell -{\ell _1}}}{{\underline{m}}^{\ell _1}(z)(1+{\underline{m}}(z))^{\ell +2-{\ell _1}}}d{\underline{m}}(z)\\= & {} \sum \limits _{{\ell _1}=1}^{\ell }\frac{\left( {\begin{array}{c}\ell \\ {\ell _1}\end{array}}\right) (-1)^{\ell _1}y^{\ell -{\ell _1}}}{2\pi i}\oint \frac{1}{{\underline{m}}^{\ell _1}(z)(1+{\underline{m}}(z))^{\ell +2-{\ell _1}}}d{\underline{m}}(z)\\= & {} \sum \limits _{{\ell _1}=1}^{\ell }\left( {\begin{array}{c}\ell \\ {\ell _1}-1\end{array}}\right) \left( {\begin{array}{c}\ell \\ {\ell _1}\end{array}}\right) y^{\ell -{\ell _1}}. \end{aligned}$$
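
The two contour integrals above reduce to residue computations at \({\underline{m}}=0\) after substituting \(z=-1/{\underline{m}}+y/(1+{\underline{m}})\). The Python sketch below recomputes them in exact rational arithmetic by expanding \(((y-1){\underline{m}}-1)^{\ell }(1+{\underline{m}})^{-k}\) as a power series, and compares the results with the closed forms displayed above. The function names are hypothetical, and the sign convention (each integral equals minus the residue at \({\underline{m}}=0\)) is an assumption inferred from the displayed results.

```python
from fractions import Fraction
from math import comb


def _series_coeff(ell, y, k, d):
    """Coefficient of m^d in ((y-1)*m - 1)^ell * (1+m)^(-k), computed exactly."""
    if d < 0:
        return Fraction(0)
    num = [Fraction(1)] + [Fraction(0)] * d      # running power, truncated at degree d
    for _ in range(ell):                          # multiply by ((y-1)*m - 1)
        num = [-num[i] + (y - 1) * (num[i - 1] if i else 0) for i in range(d + 1)]
    inv = [Fraction((-1) ** r * comb(k + r - 1, r)) for r in range(d + 1)]
    return sum(num[i] * inv[d - i] for i in range(d + 1))


def cov_contour_term(ell, y):
    """(1/(2*pi*i)) * integral of z^ell / (1+m)^2 dm, taken as minus the residue at 0."""
    return -_series_coeff(ell, y, ell + 2, ell - 1)


def cov_closed_form(ell, y):
    return sum(comb(ell, l1 - 1) * comb(ell, l1) * y ** (ell - l1)
               for l1 in range(1, ell + 1))


def mean_contour_term(ell, y):
    """(y/(2*pi*i)) * integral of z^ell * m / (1+m)^3 dm, same sign convention."""
    return -y * _series_coeff(ell, y, ell + 3, ell - 2)


def mean_closed_form(ell, y):
    return -sum(comb(ell, l1 - 2) * comb(ell, l1) * y ** (ell + 1 - l1)
                for l1 in range(2, ell + 1))


def cov_X(l, lp, y, beta_x):
    """Cov(X_{f_l}, X_{f_l'}) transcribed from the display in this proof."""
    first = 2 * y ** (l + lp) * sum(
        comb(l, l1) * comb(lp, l2) * ((1 - y) / y) ** (l1 + l2)
        * sum(l3 * comb(2 * l - 1 - l1 - l3, l - 1)
              * comb(2 * lp - 1 - l2 + l3, lp - 1)
              for l3 in range(0, l - l1 + 1))
        for l1 in range(l) for l2 in range(lp + 1))
    return first + beta_x * y * cov_closed_form(l, y) * cov_closed_form(lp, y)


if __name__ == "__main__":
    y = Fraction(1, 3)
    for ell in range(1, 5):
        print(ell, cov_contour_term(ell, y) == cov_closed_form(ell, y),
              mean_contour_term(ell, y) == mean_closed_form(ell, y))
```

As a sanity check on the transcription, `cov_X(1, 1, y, beta_x)` reduces to \((2+\beta _x)y\), which matches a direct variance computation for \({\mathrm{tr}}{\mathbf {B}}_n\) when \({\mathbf {T}}_n={\mathbf {I}}_p\).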

A.4 Proof of Theorem 3

First, we prove the conclusion (a) of Theorem 3. Recall that

$$\begin{aligned} {\hat{\phi }}=\frac{\sum _{j=1}^n\sum _{i=2}^py_{ij}y_{i-1,j}}{\sum _{j=1}^n\sum _{i=2}^py_{i-1,j}^2}. \end{aligned}$$
(70)

Without loss of generality, we assume that \(\sigma _e^2=1\). Under \(H_{01}\), we have

$$\begin{aligned} {\mathrm{E}}\Big ((p-1)^{-1}n^{-1}\sum _{j=1}^n\sum _{i=2}^py_{ij}y_{i-1,j}\Big )=\frac{\phi }{1-\phi ^2} \end{aligned}$$

and

$$\begin{aligned} {\mathrm{E}}\Big ((p-1)^{-1}n^{-1}\sum _{j=1}^n\sum _{i=2}^py_{i-1,j}^2\Big )=\frac{1}{1-\phi ^2}. \end{aligned}$$

Thus, for deriving \(n({\hat{\phi }}-\phi )=O_p(1)\), it suffices to verify that

$$\begin{aligned} X_n=n\left( (p-1)^{-1}n^{-1}\sum _{j=1}^n\sum _{i=2}^py_{ij}y_{i-1,j}-\frac{\phi }{1-\phi ^2}\right) =O_p(1) \end{aligned}$$
(71)

and

$$\begin{aligned} Y_n=n\left( (p-1)^{-1}n^{-1}\sum _{j=1}^n\sum _{i=2}^py_{i-1,j}^2-\frac{1}{1-\phi ^2}\right) =O_p(1). \end{aligned}$$
(72)

Let \({\mathbf {y}}_j^{(1)}=(y_{2j},\ldots ,y_{pj})^T\), \({\mathbf {y}}_j^{(p)}=(y_{1j},\ldots ,y_{p-1,j})^T\) and let \(\widetilde{\varvec{\Sigma }}_{\phi }\) be the \((p-1)\times (p-1)\) matrix whose (i, j)th element is \((1-\phi ^2)^{-1}\phi ^{|i-j|}\). By the identity (1.15) in Bai and Silverstein (2004), we have

$$\begin{aligned} {\mathrm{Var}}(X_n)= & {} {\mathrm{E}}\left\{ (p-1)^{-1}\sum _{j=1}^n\left( ({\mathbf {y}}_j^{(1)})^T{\mathbf {y}}_j^{(p)}-\phi {\mathrm{tr}}\widetilde{\varvec{\Sigma }}_{\phi }\right) \right\} ^2\nonumber \\\le & {} (\phi ^2+1+|\beta _e|)n(p-1)^{-2}{\mathrm{tr}}\widetilde{\varvec{\Sigma }}_{\phi }^2=O(1) \end{aligned}$$
(73)

and

$$\begin{aligned} {\mathrm{Var}}(Y_n)= & {} {\mathrm{E}}\left\{ (p-1)^{-1}\sum _{j=1}^n\left( ({\mathbf {y}}_j^{(p)})^T{\mathbf {y}}_j^{(p)}-{\mathrm{tr}}\widetilde{\varvec{\Sigma }}_{\phi }\right) \right\} ^2\nonumber \\\le & {} (2+|\beta _e|)n(p-1)^{-2}{\mathrm{tr}}\widetilde{\varvec{\Sigma }}_{\phi }^2=O(1), \end{aligned}$$
(74)

where \(\beta _e={\mathrm{E}}e_{ij}^4-3\). By (73), (74) and Chebyshev’s inequality, we get

$$\begin{aligned} X_n=O_p(1)\quad \text{ and }\quad Y_n=O_p(1). \end{aligned}$$

Next, we prove the conclusion (b) of Theorem 3. Let

$$\begin{aligned} T_{n1}={\mathrm{tr}}\{\varvec{\Sigma }_{\phi }^{-1}{\mathbf {B}}_n/[p^{-1}{\mathrm{tr}}(\varvec{\Sigma }_{\phi }^{-1}{\mathbf {B}}_n)]-{\mathbf {I}}_p\}^2. \end{aligned}$$

Since \(\sigma _e^{-2}\varvec{\Sigma }_{\phi }^{-1/2}{\mathrm{Cov}}(\mathbf{y})\varvec{\Sigma }_{\phi }^{-1/2}=\mathbf{I}_p\) under \(H_{01}\), it follows from Corollary 2 and the delta method that

$$\begin{aligned} \frac{T_{n1}-py_n-(\beta _e+1)y_n}{2y_n}{\mathop {\longrightarrow }\limits ^{d}} N(0,1). \end{aligned}$$
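A minimal numerical sketch of the statistic \(T_{n1}\) and its standardization is given below. It assumes real Gaussian data with identity covariance (so that \(\varvec{\Sigma }_{\phi }^{-1}={\mathbf {I}}_p\) and \(\beta _e=0\)), takes \(y_n=p/n\), forms \({\mathbf {B}}_n\) without centering, and uses the hypothetical helper name `t_stat`; these choices are illustrative and do not come from the paper.

```python
import numpy as np

def t_stat(B, Sigma_inv):
    """Compute tr{(M / (tr(M)/p) - I_p)^2} with M = Sigma_inv @ B."""
    p = B.shape[0]
    M = Sigma_inv @ B
    a = np.trace(M) / p          # normalizing factor p^{-1} tr(Sigma^{-1} B_n)
    D = M / a - np.eye(p)
    return np.trace(D @ D)

rng = np.random.default_rng(0)
p, n = 40, 80
Y = rng.standard_normal((p, n))  # assumed data: i.i.d. N(0, 1) entries
B = Y @ Y.T / n                  # sample covariance matrix B_n
Sigma_inv = np.eye(p)            # assumed true precision matrix

T = t_stat(B, Sigma_inv)

# Equivalent form: tr(M^2) / a^2 - p, since tr(M / a) = p.
M = Sigma_inv @ B
a = np.trace(M) / p
T_alt = np.trace(M @ M) / a**2 - p

# Standardization from the display above, with beta_e = 0 for Gaussian data.
y_n, beta_e = p / n, 0.0
z = (T - p * y_n - (beta_e + 1.0) * y_n) / (2.0 * y_n)
print(T, T_alt, z)
```

The equivalent form \({\mathrm{tr}}({\mathbf {M}}^2)/a^2-p\) explains why only ratios of \({\mathrm{tr}}(\cdot )^2\) terms to squared normalized traces appear when two such statistics are subtracted.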

To derive (19), it suffices to prove that \({\widehat{T}}_{n1}-T_{n1}=o_p(1)\). From the definitions of \({\widehat{T}}_{n1}\) and \(T_{n1}\), we obtain

$$\begin{aligned} {\widehat{T}}_{n1}-T_{n1}= & {} \frac{{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)^2}{\big [p^{-1}{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)\big ]^2} -\frac{{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)^2}{\big [p^{-1}{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)\big ]^2}\\= & {} \frac{{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)^2}{\big [p^{-1}{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)\big ]^2} -\frac{{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)^2}{\big [p^{-1}{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)\big ]^2}+\frac{{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)^2-{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)^2}{\big [p^{-1}{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)\big ]^2}. \end{aligned}$$

According to Theorem 2.1 in Zheng et al. (2019), we have

$$\begin{aligned} p^{-1}{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)= & {} 1+o_p(1), \end{aligned}$$
(75)
$$\begin{aligned} p^{-1}{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)^2= & {} 1+y_n+o_p(1). \end{aligned}$$
(76)

Therefore, in order to derive \({\widehat{T}}_{n1}-T_{n1}=o_p(1)\), it suffices to show that

$$\begin{aligned} {\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)-{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)= & {} o_p(1), \end{aligned}$$
(77)
$$\begin{aligned} {\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)^2-{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)^2= & {} o_p(1). \end{aligned}$$
(78)

Let \({\mathbf {E}}_i\) be the identity matrix with its first and last i diagonal ones set to zero, and let \({\mathbf {F}}_i\) be the matrix with ones along the upper and lower ith off-diagonals and zeros elsewhere. By formula (3) for the inverse covariance matrix of an autoregressive process given in Verbyla (1985), we have

$$\begin{aligned} \varvec{\Sigma }_\phi ^{-1}= & {} {\mathbf {I}}_p+\phi ^2{\mathbf {E}}_1-\phi {\mathbf {F}}_1,\end{aligned}$$
(79)
$$\begin{aligned} \widehat{\varvec{\Sigma }}_\phi ^{-1}= & {} {\mathbf {I}}_p+{\hat{\phi }}^2{\mathbf {E}}_1-{\hat{\phi }}{\mathbf {F}}_1. \end{aligned}$$
(80)
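The representation (79) can be checked numerically. The sketch below assumes \(\sigma _e^2=1\) and that \(\varvec{\Sigma }_\phi \) is the \(p\times p\) matrix with entries \((1-\phi ^2)^{-1}\phi ^{|i-j|}\); the function names are hypothetical.

```python
import numpy as np

def ar1_covariance(p, phi):
    """Stationary AR(1) covariance with unit innovation variance (assumed form)."""
    idx = np.arange(p)
    return phi ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - phi ** 2)

def ar1_precision(p, phi):
    """Right-hand side of (79): I_p + phi^2 E_1 - phi F_1."""
    E1 = np.eye(p)
    E1[0, 0] = E1[-1, -1] = 0.0                # first and last diagonal ones set to zero
    F1 = np.eye(p, k=1) + np.eye(p, k=-1)      # ones on the first upper and lower off-diagonals
    return np.eye(p) + phi ** 2 * E1 - phi * F1

p, phi = 8, 0.6
product = ar1_precision(p, phi) @ ar1_covariance(p, phi)
err = np.max(np.abs(product - np.eye(p)))
print(err)  # should be at the level of floating-point round-off
```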

Since \({\hat{\phi }}-\phi =O_p(n^{-1})\), based on the conclusion (a) in Theorem 2.1 of Zheng et al. (2019), we have

$$\begin{aligned}&{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)-{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)\nonumber \\&\quad ={\mathrm{tr}}\left[ \big (({\hat{\phi }}^2-\phi ^2){\mathbf {E}}_1-({\hat{\phi }}-\phi ){\mathbf {F}}_1\big ){\mathbf {B}}_n\right] \nonumber \\&\quad =({\hat{\phi }}^2-\phi ^2){\mathrm{tr}}({\mathbf {E}}_1{\mathbf {B}}_n)-({\hat{\phi }}-\phi ){\mathrm{tr}}({\mathbf {F}}_1{\mathbf {B}}_n)\nonumber \\&\quad =({\hat{\phi }}^2-\phi ^2){\mathrm{tr}}({\mathbf {E}}_1\varvec{\Sigma }_\phi )-({\hat{\phi }}-\phi ){\mathrm{tr}}({\mathbf {F}}_1\varvec{\Sigma }_\phi )+o_p(1)\nonumber \\&\quad ={\mathrm{tr}}\big [(\widehat{\varvec{\Sigma }}_\phi ^{-1}-\varvec{\Sigma }_\phi ^{-1})\varvec{\Sigma }_\phi \big ]+o_p(1)\nonumber \\&\quad ={\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}\varvec{\Sigma }_\phi )-p+o_p(1). \end{aligned}$$
(81)

Note that

$$\begin{aligned} {\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}\varvec{\Sigma }_\phi )= & {} (1-\phi ^2)^{-1}[2(1-{\hat{\phi }}\phi )+(p-2)(1+{\hat{\phi }}^2-2{\hat{\phi }}\phi )]\nonumber \\= & {} (1-\phi ^2)^{-1}p(1+{\hat{\phi }}^2-2{\hat{\phi }}\phi )+o_p(1)\nonumber \\= & {} p+(1-\phi ^2)^{-1}p({\hat{\phi }}-\phi )^2+o_p(1)\nonumber \\= & {} p+o_p(1). \end{aligned}$$
(82)
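The first two lines of (82) can be verified on a small example. In the sketch below the values of p, \(\phi \) and \({\hat{\phi }}\) are arbitrary illustrative choices, \(\sigma _e^2=1\) is assumed, and the helper names are hypothetical.

```python
import numpy as np

def ar1_covariance(p, phi):
    """Assumed form of Sigma_phi: entries phi^{|i-j|} / (1 - phi^2)."""
    idx = np.arange(p)
    return phi ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - phi ** 2)

def ar1_precision(p, phi):
    """I_p + phi^2 E_1 - phi F_1, as in (79) and (80)."""
    E1 = np.eye(p)
    E1[0, 0] = E1[-1, -1] = 0.0
    F1 = np.eye(p, k=1) + np.eye(p, k=-1)
    return np.eye(p) + phi ** 2 * E1 - phi * F1

p, phi, phi_est = 50, 0.4, 0.43   # phi_est plays the role of hat{phi}

# Left-hand side of (82): tr(hat{Sigma}^{-1} Sigma).
lhs = np.trace(ar1_precision(p, phi_est) @ ar1_covariance(p, phi))

# First line of (82): exact closed form.
exact = (2 * (1 - phi_est * phi)
         + (p - 2) * (1 + phi_est ** 2 - 2 * phi_est * phi)) / (1 - phi ** 2)

# Third line of (82) without the remainder: p + p (phi_est - phi)^2 / (1 - phi^2).
leading = p + p * (phi_est - phi) ** 2 / (1 - phi ** 2)

print(lhs, exact, leading)
```

The gap between `exact` and `leading` equals \(2{\hat{\phi }}(\phi -{\hat{\phi }})/(1-\phi ^2)\), which is \(O_p(n^{-1})\) when \({\hat{\phi }}-\phi =O_p(n^{-1})\); this is the remainder absorbed into \(o_p(1)\).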

Thus, (77) follows from (81) and (82). Next, we verify (78). Since

$$\begin{aligned}&{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)^2-{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)^2\\&\quad ={\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n) -{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)\\&\qquad +{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n) -{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n), \end{aligned}$$

we only need to show that

$$\begin{aligned} {\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n) -{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)= & {} o_p(1), \end{aligned}$$
(83)
$$\begin{aligned} {\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n) -{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)= & {} o_p(1). \end{aligned}$$
(84)

From (79) and (80), we have

$$\begin{aligned}&{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n) -{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)\nonumber \\&\quad =({\hat{\phi }}^2-\phi ^2){\mathrm{tr}}({\mathbf {E}}_1{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n) -({\hat{\phi }}-\phi ){\mathrm{tr}}({\mathbf {F}}_1{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n). \end{aligned}$$
(85)

By the conclusion (b) in Theorem 2.1 of Zheng et al. (2019) and \({\hat{\phi }}-\phi =O_p(n^{-1})\), we obtain

$$\begin{aligned} p^{-1}{\mathrm{tr}}({\mathbf {E}}_1{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)= & {} (1+y_n)p^{-1}{\mathrm{tr}}({\mathbf {E}}_1\varvec{\Sigma }_\phi )+o_p(1), \end{aligned}$$
(86)
$$\begin{aligned} p^{-1}{\mathrm{tr}}({\mathbf {F}}_1{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)= & {} (1+y_n)p^{-1}{\mathrm{tr}}({\mathbf {F}}_1\varvec{\Sigma }_\phi )+o_p(1). \end{aligned}$$
(87)

Substituting (86) and (87) into (85) and using (82), we get

$$\begin{aligned}&{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n) -{\mathrm{tr}}(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)\\&\quad =(1+y_n)\big [({\hat{\phi }}^2-\phi ^2){\mathrm{tr}}({\mathbf {E}}_1\varvec{\Sigma }_\phi )-({\hat{\phi }}-\phi ){\mathrm{tr}}({\mathbf {F}}_1\varvec{\Sigma }_\phi )\big ]+o_p(1)\\&\quad =(1+y_n)\big [{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_\phi ^{-1}\varvec{\Sigma }_\phi )-p\big ]+o_p(1)=o_p(1). \end{aligned}$$

The proof of (84) is similar to, and simpler than, that of (83), so it is omitted. This completes the proof of Theorem 3.

A.5 Proof of Theorem 4

The proof of Theorem 4 is similar to that of Theorem 3. Therefore, we only give the key part of this proof. Without loss of generality, we assume that \(\sigma _\varepsilon ^2=1\). Similar to the proof of (71) and (72), we can prove

$$\begin{aligned} (p-2)^{-1}n^{-1}\gamma _{01n}= & {} \gamma _0\rho _1+O_p(n^{-1}),\quad (p-2)^{-1}n^{-1}\gamma _{02n}=\gamma _0\rho _2+O_p(n^{-1}),\\ (p-2)^{-1}n^{-1}\gamma _{11n}= & {} \gamma _0+O_p(n^{-1}), \quad (p-2)^{-1}n^{-1}\gamma _{12n}=\gamma _0\rho _1+O_p(n^{-1}),\\ (p-2)^{-1}n^{-1}\gamma _{22n}= & {} \gamma _0+O_p(n^{-1}). \end{aligned}$$

Based on the expressions of \(\gamma _0\), \(\rho _1\) and \(\rho _2\), we obtain

$$\begin{aligned} {\hat{\phi }}_1=\phi _1+O_p(n^{-1}),\quad {\hat{\phi }}_2=\phi _2+O_p(n^{-1}). \end{aligned}$$

Let \(T_{n2}={\mathrm{tr}}\{\varvec{\Sigma }_{\phi _1,\phi _2}^{-1}{\mathbf {B}}_n/[p^{-1}{\mathrm{tr}}(\varvec{\Sigma }_{\phi _1,\phi _2}^{-1}{\mathbf {B}}_n)]-{\mathbf {I}}_p\}^2\). From Corollary 2 and the delta method, we have, under \(H_{02}\),

$$\begin{aligned} \frac{T_{n2}-py_n-(\beta _\varepsilon +1)y_n}{2y_n}{\mathop {\longrightarrow }\limits ^{d}} N(0,1). \end{aligned}$$

Therefore, for deriving the conclusion (b) of Theorem 4, it suffices to prove that \({\widehat{T}}_{n2}-T_{n2}=o_p(1)\). Still according to formula (3) in Verbyla (1985), we have

$$\begin{aligned} \varvec{\Sigma }_{\phi _1,\phi _2}^{-1}= & {} {\mathbf {I}}_p+\phi _1^2{\mathbf {E}}_1+\phi _2^2{\mathbf {E}}_2-\phi _1{\mathbf {F}}_1-\phi _2{\mathbf {F}}_2+\phi _1\phi _2{\mathbf {G}}_{1,2},\\ \widehat{\varvec{\Sigma }}_{\phi _1,\phi _2}^{-1}= & {} {\mathbf {I}}_p+{\hat{\phi }}_1^2{\mathbf {E}}_1+{\hat{\phi }}_2^2{\mathbf {E}}_2-{\hat{\phi }}_1{\mathbf {F}}_1-{\hat{\phi }}_2{\mathbf {F}}_2+{\hat{\phi }}_1{\hat{\phi }}_2{\mathbf {G}}_{1,2}, \end{aligned}$$

where \({\mathbf {G}}_{1,2}={\mathbf {E}}_1{\mathbf {F}}_1{\mathbf {E}}_1\). By some simple calculations, we get

$$\begin{aligned} {\mathrm{tr}}(\widehat{\varvec{\Sigma }}_{\phi _1,\phi _2}^{-1}\varvec{\Sigma }_{\phi _1,\phi _2})= & {} \gamma _0p\big [1+{\hat{\phi }}_1^2+{\hat{\phi }}_2^2-2{\hat{\phi }}_2\rho _2-2{\hat{\phi }}_1(1-{\hat{\phi }}_2)\rho _1\big ]+o_p(1)\\= & {} p+\gamma _0p\big [({\hat{\phi }}_1-\phi _1)^2+({\hat{\phi }}_2-\phi _2)^2+2\rho _1({\hat{\phi }}_1-\phi _1)({\hat{\phi }}_2-\phi _2)\big ]\\&\quad +o_p(1)=p+o_p(1). \end{aligned}$$

The rest of the proof is exactly the same as that of conclusion (b) of Theorem 3, so it is omitted. This completes the proof of Theorem 4.

A.6 Proof of Proposition 1

First, we prove the consistency of \({\hat{\beta }}_e\). Rewrite

$$\begin{aligned} {\hat{\beta }}_e=\frac{p^{-1}{\hat{V}}_e/[p^{-1}{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {B}}_n)]^2-2}{p^{-1}\sum _{i=1}^\infty ({\hat{{\mathbf {q}}}}_{1i}^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\hat{{\mathbf {q}}}}_{1i})^2}. \end{aligned}$$

From (75) and (77), we have \(p^{-1}{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {B}}_n)=\sigma _e^2+o_p(1)\). Moreover, based on (79), (80), (82) and \(\varvec{\Sigma }_{\phi }={\mathbf {Q}}_1{\mathbf {Q}}_1^T\), we can prove

$$\begin{aligned} p^{-1}\sum _{i=1}^\infty ({\hat{{\mathbf {q}}}}_{1i}^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\hat{{\mathbf {q}}}}_{1i})^2 -p^{-1}\sum _{i=1}^\infty ({\mathbf {q}}_{1i}^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {q}}_{1i})^2=o_p(1). \end{aligned}$$

Thus, in order to derive the consistency of \({\hat{\beta }}_e\), it suffices to show that

$$\begin{aligned} p^{-1}{\hat{V}}_e=\sigma _e^4\Big [2+\beta _ep^{-1}\sum _{i=1}^\infty ({\mathbf {q}}_{1i}^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {q}}_{1i})^2\Big ]+o_p(1). \end{aligned}$$
(88)

Define

$$\begin{aligned} V_e=(n-1)^{-1}\sum _{j=1}^n\Big ({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j-n^{-1}\sum _{j=1}^n{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j\Big )^2. \end{aligned}$$

The verification of (88) will be split into two parts:

$$\begin{aligned} p^{-1}{\hat{V}}_e= & {} p^{-1}V_e+o_p(1),\end{aligned}$$
(89)
$$\begin{aligned} p^{-1}V_e= & {} \sigma _e^4\Big [2+\beta _ep^{-1}\sum _{i=1}^\infty ({\mathbf {q}}_{1i}^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {q}}_{1i})^2\Big ]+o_p(1). \end{aligned}$$
(90)

We first deal with the proof of (89). From the definitions of \({\hat{V}}_e\) and \(V_e\), we have

$$\begin{aligned} {\hat{V}}_e-V_e= & {} (n-1)^{-1}\sum _{j=1}^n\big [({\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j)^2-({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2\big ]\\&-(n-1)^{-1}n\Big [(n^{-1}\sum _{j=1}^n{\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j)^2 -(n^{-1}\sum _{j=1}^n{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2\Big ]\\= & {} (n-1)^{-1}\sum _{j=1}^n\big [({\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j)^2-({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2\big ]\\&-(n-1)^{-1}n\big [{\mathrm{tr}}^2(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)-{\mathrm{tr}}^2(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)\big ]. \end{aligned}$$

Still from (75) and (77), we obtain \(p^{-1}\big ({\mathrm{tr}}^2(\widehat{\varvec{\Sigma }}_\phi ^{-1}{\mathbf {B}}_n)-{\mathrm{tr}}^2(\varvec{\Sigma }_\phi ^{-1}{\mathbf {B}}_n)\big )=o_p(1)\). Note that

$$\begin{aligned}&p^{-1}(n-1)^{-1}\sum _{j=1}^n\big [({\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j)^2-({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2\big ]\\&\quad =p^{-1}(n-1)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j-{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2\\&\qquad +2p^{-1}(n-1)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j-{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j){\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j. \end{aligned}$$

By (79) and (80), we have

$$\begin{aligned}&p^{-1}(n-1)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j-{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j){\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j \nonumber \\&\quad =({\hat{\phi }}^2-\phi ^2)p^{-1}(n-1)^{-1}\sum _{j=1}^n{\mathbf {y}}_j^T{\mathbf {E}}_1{\mathbf {y}}_j{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j\nonumber \\&\qquad -({\hat{\phi }}-\phi )p^{-1}(n-1)^{-1}\sum _{j=1}^n{\mathbf {y}}_j^T{\mathbf {F}}_1{\mathbf {y}}_j{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j. \end{aligned}$$
(91)

Since \({\mathrm{E}}e_{ij}^8\) is finite, based on Chebyshev’s inequality and Lemma 3, we can prove

$$\begin{aligned} p^{-2}(n-1)^{-1}\sum _{j=1}^n{\mathbf {y}}_j^T{\mathbf {E}}_1{\mathbf {y}}_j{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j= & {} \sigma _e^4p^{-1}{\mathrm{tr}}({\mathbf {E}}_1\varvec{\Sigma }_{\phi })+o_p(1), \end{aligned}$$
(92)
$$\begin{aligned} p^{-2}(n-1)^{-1}\sum _{j=1}^n{\mathbf {y}}_j^T{\mathbf {F}}_1{\mathbf {y}}_j{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j= & {} \sigma _e^4p^{-1}{\mathrm{tr}}({\mathbf {F}}_1\varvec{\Sigma }_{\phi })+o_p(1). \end{aligned}$$
(93)

Substituting (92) and (93) into (91), it follows from (82) that

$$\begin{aligned}&p^{-1}(n-1)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j-{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j){\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j\\&\quad =\sigma _e^4\big [({\hat{\phi }}^2-\phi ^2){\mathrm{tr}}({\mathbf {E}}_1\varvec{\Sigma }_{\phi })-({\hat{\phi }}-\phi ){\mathrm{tr}}({\mathbf {F}}_1\varvec{\Sigma }_{\phi })\big ]+o_p(1)\\&\quad =\sigma _e^4\big [{\mathrm{tr}}(\widehat{\varvec{\Sigma }}_{\phi }^{-1}\varvec{\Sigma }_{\phi })-p\big ]+o_p(1)=o_p(1). \end{aligned}$$

Similarly, we can prove \(p^{-1}(n-1)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\widehat{\varvec{\Sigma }}_{\phi }^{-1}{\mathbf {y}}_j-{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2=o_p(1)\). Thus, (89) is proved. Next, we verify (90). After some simple calculations, we have

$$\begin{aligned} p^{-1}V_e= & {} p^{-1}(n-1)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2-(pn)^{-1}(n-1)^{-1}\Big (\sum _{j=1}^n{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j\Big )^2\\= & {} p^{-1}(n-1)^{-1}\Big [(n-1)n^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2 -n^{-1}\sum _{i\ne j}{\mathbf {y}}_i^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_i{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j\Big ]\\= & {} (pn)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j)^2 -(pn)^{-1}(n-1)^{-1}\sum _{i\ne j}{\mathbf {y}}_i^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_i{\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j\\= & {} (pn)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j-p\sigma _e^2)^2\\&-(pn)^{-1}(n-1)^{-1}\sum _{i\ne j}({\mathbf {y}}_i^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_i-p\sigma _e^2)({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j-p\sigma _e^2). \end{aligned}$$
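The chain of equalities above is a purely algebraic identity for a sample variance and holds for any centering constant. The sketch below checks it on arbitrary numbers: the array `a` stands in for the quadratic forms \({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j\) and `c` for \(p\sigma _e^2\); both, along with the chi-square draws, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 12, 5
c = float(p)                      # stands in for p * sigma_e^2 with sigma_e^2 = 1
a = rng.chisquare(p, size=n)      # stand-ins for the n quadratic forms

# Unbiased sample variance, i.e. V_e in the text.
V = np.sum((a - a.mean()) ** 2) / (n - 1)

# Last line of the display (multiplied through by p).
d = a - c
sum_sq = np.sum(d ** 2)
cross = d.sum() ** 2 - sum_sq     # sum over i != j of d_i * d_j
rhs = sum_sq / n - cross / (n * (n - 1))

print(V, rhs)
```

Centering at \(p\sigma _e^2\) is what makes the two pieces tractable: the first becomes an average of squared centered quadratic forms and the second an average of products of independent centered terms.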

It follows from (1.15) in Bai and Silverstein (2004) that

$$\begin{aligned} (pn)^{-1}\sum _{j=1}^n({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j-p\sigma _e^2)^2 =\sigma _e^4\Big [2+\beta _ep^{-1}\sum _{i=1}^\infty ({\mathbf {q}}_{1i}^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {q}}_{1i})^2\Big ]+o_p(1)\nonumber \\ \end{aligned}$$
(94)

and

$$\begin{aligned} (pn)^{-1}(n-1)^{-1}\sum _{i\ne j}({\mathbf {y}}_i^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_i-p\sigma _e^2)({\mathbf {y}}_j^T\varvec{\Sigma }_{\phi }^{-1}{\mathbf {y}}_j-p\sigma _e^2)=o_p(1). \end{aligned}$$
(95)

This completes the proof of conclusion (a). The proof of conclusion (b) is similar to that of conclusion (a), so it is omitted. Proposition 1 is proved.

Appendix B: Mathematical tools

Lemma 2

(Burkholder 1973). Let \(\{X_k\}\) be a complex martingale difference sequence with respect to the increasing \(\sigma \)-field \(\{{\mathscr {F}}_k\}\). Then for \(p>1\)

$$\begin{aligned} {\mathrm{E}}\left| \sum X_k\right| ^p\le K_p{\mathrm{E}}\left( \sum |X_k|^2\right) ^{p/2}. \end{aligned}$$

Lemma 3

For \({\mathbf {X}}=(X_1,\ldots ,X_n)^T\) with i.i.d. standardized (complex) entries and an \(n\times n\) (complex) matrix \({\mathbf {C}}=(c_{ij})\), we have

$$\begin{aligned} {\mathrm{E}}|{\mathbf {X}}^*{\mathbf {C}}{\mathbf {X}}-{\mathrm{tr}}{\mathbf {C}}|^4\le K\left( \left( {\mathrm{tr}}({\mathbf {C}}{\mathbf {C}}^*)\right) ^{2}+\sum _{i=1}^n {\mathrm{E}}|X_{i}|^8|c_{ii}|^4\right) . \end{aligned}$$

The proof of the lemma follows easily from a direct calculation and is thus omitted.

Lemma 4

(Lemma 2.3 of Bai and Silverstein 2004). Let \(f_1,f_2,\ldots \) be analytic in D, a connected open set of \(\mathbb {C}\), satisfying \(|f_n(z)|\le M\) for every n and z in D, and \(f_n(z)\) converges, as \(n\rightarrow \infty \) for each z in a subset of D having a limit point in D. Then there exists a function f, analytic in D for which \(f_n(z)\rightarrow f(z)\) and \(f'_n(z)\rightarrow f'(z)\) for all \(z\in D\) where \('\) denotes the derivative. Moreover, on any set bounded by a contour interior to D the convergence is uniform and \(\{f'_n(z)\}\) is uniformly bounded.

Lemma 5

(Theorem 35.12 of Billingsley 1995). Suppose that, for each n, \(Y_{n1}\), \(Y_{n2},\ldots ,\) \(Y_{nr_n}\) is a real martingale difference sequence with respect to the increasing \(\sigma \)-field \(\{{\mathscr {F}}_{nj}\}\) having second moments. If, as \(n\rightarrow \infty \),

$$\begin{aligned} \sum _{j=1}^{r_n}{\mathrm{E}}(Y^2_{nj}|{\mathscr {F}}_{n,j-1}){\mathop {\rightarrow }\limits ^{i.p.}}\sigma ^2, \end{aligned}$$
(96)

where \(\sigma ^2\) is a positive constant, and for each \(\epsilon >0\)

$$\begin{aligned} \sum _{j=1}^{r_n}{\mathrm{E}}(Y^2_{nj}I{(|Y_{nj}|\ge \epsilon )})\rightarrow 0 \end{aligned}$$
(97)

then

$$\begin{aligned} \sum _{j=1}^{r_n}Y_{nj}{\mathop {\rightarrow }\limits ^{d}} N(0,\sigma ^2). \end{aligned}$$

Lemma 6

(Lemma 2.6 of Bai 1999). For \(p\times n\) complex matrices \({\mathbf {A}}\) and \({\mathbf {B}}\)

$$\begin{aligned} \Vert F^{{\mathbf {A}}{\mathbf {A}}^*}-F^{{\mathbf {B}}{\mathbf {B}}^*}\Vert \le p^{-1}{\mathrm{rank}}({\mathbf {A}}-{\mathbf {B}}), \end{aligned}$$

where \(\Vert \cdot \Vert \) here denotes sup norm on functions.
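The rank inequality of Lemma 6 can be illustrated numerically. In the sketch below, \(F^{{\mathbf {A}}{\mathbf {A}}^*}\) is taken to be the empirical distribution function of the eigenvalues of \({\mathbf {A}}{\mathbf {A}}^*\); the matrix sizes, the real Gaussian entries, the rank-3 perturbation and the helper name `esd_sup_distance` are assumptions for this example.

```python
import numpy as np

def esd_sup_distance(eig_a, eig_b):
    """Sup-norm distance between two empirical spectral distribution functions.

    Both functions are right-continuous step functions, so the supremum of
    their absolute difference is attained at one of the jump points.
    """
    grid = np.concatenate([eig_a, eig_b])
    Fa = np.searchsorted(np.sort(eig_a), grid, side="right") / eig_a.size
    Fb = np.searchsorted(np.sort(eig_b), grid, side="right") / eig_b.size
    return np.max(np.abs(Fa - Fb))

rng = np.random.default_rng(2)
p, n, r = 30, 60, 3
A = rng.standard_normal((p, n))
# Rank-r perturbation built as a product of p x r and r x n factors.
B = A + rng.standard_normal((p, r)) @ rng.standard_normal((r, n))

dist = esd_sup_distance(np.linalg.eigvalsh(A @ A.T), np.linalg.eigvalsh(B @ B.T))
bound = np.linalg.matrix_rank(A - B) / p
print(dist, bound)
```

The bound depends only on the rank of the perturbation, not on its size, which is why the lemma is useful for truncation and centering steps that change a matrix by a low-rank term.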

Lemma 7

(Lemma 2.7 of Bai 1999). For \(p\times n\) complex matrices \({\mathbf {A}}\) and \({\mathbf {B}}\)

$$\begin{aligned} L^4(F^{{\mathbf {A}}{\mathbf {A}}^*},F^{{\mathbf {B}}{\mathbf {B}}^*})\le 2p^{-2}{\mathrm{tr}}({\mathbf {A}}-{\mathbf {B}})({\mathbf {A}}^*-{\mathbf {B}}^*){\mathrm{tr}}({\mathbf {A}}{\mathbf {A}}^*+{\mathbf {B}}{\mathbf {B}}^*), \end{aligned}$$

where \(L(\cdot ,\cdot )\) denotes the Lévy distance between distribution functions.

Lemma 8

(Lemma 2.6 of Silverstein and Bai 1995). Let \(z\in \mathbb {C}^+\) with \(v=\mathfrak {I}\,z\), \({\mathbf {A}}\) and \({\mathbf {B}}\) being \(n\times n\) with \({\mathbf {B}}\) Hermitian, and \({\mathbf {r}}\in \mathbb {C}^n\). Then

$$\begin{aligned} \bigl |{\mathrm{tr}}\bigl (({\mathbf {B}}-z{\mathbf {I}})^{-1}-({\mathbf {B}}+{\mathbf {r}}{\mathbf {r}}^*-z{\mathbf {I}})^{-1}\bigr ){\mathbf {A}}\bigr |= \left| \frac{{\mathbf {r}}^*( {\mathbf {B}}-z{\mathbf {I}})^{-1}{\mathbf {A}}( {\mathbf {B}}-z{\mathbf {I}})^{-1}{\mathbf {r}}}{1+{\mathbf {r}}^*({\mathbf {B}}-z{\mathbf {I}})^{-1}{\mathbf {r}}}\right| \le \frac{\Vert {\mathbf {A}}\Vert }{v}. \end{aligned}$$
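Both the equality and the bound in Lemma 8 can be checked on a random instance. The equality is the Sherman-Morrison rank-one update formula applied to \(({\mathbf {B}}+{\mathbf {r}}{\mathbf {r}}^*-z{\mathbf {I}})^{-1}\). The dimension, the value of z and the random complex matrices below are arbitrary illustrative choices, and \(\Vert {\mathbf {A}}\Vert \) is taken to be the spectral norm.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
z = 0.7 + 0.5j
v = z.imag                                    # v = Im z > 0

H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = (H + H.conj().T) / 2                      # Hermitian matrix
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
r = rng.standard_normal(n) + 1j * rng.standard_normal(n)

I = np.eye(n)
R = np.linalg.inv(B - z * I)                                  # (B - zI)^{-1}
R_plus = np.linalg.inv(B + np.outer(r, r.conj()) - z * I)     # (B + r r^* - zI)^{-1}

# Left-hand side: |tr{(R - R_plus) A}|.
lhs = abs(np.trace((R - R_plus) @ A))

# Middle expression: |r^* R A R r / (1 + r^* R r)|.
quad = r.conj() @ R @ A @ R @ r
middle = abs(quad / (1 + r.conj() @ R @ r))

# Right-hand side: ||A|| / v with the spectral norm.
bound = np.linalg.norm(A, 2) / v
print(lhs, middle, bound)
```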

Cite this article

Zou, T., Zheng, S., Bai, Z. et al. CLT for linear spectral statistics of large dimensional sample covariance matrices with dependent data. Stat Papers (2021). https://doi.org/10.1007/s00362-021-01250-3

Keywords

  • Sample covariance matrices
  • Linear spectral statistics
  • Central limit theorem
  • Repeated linear processes
  • High-dimensional dependent data

Mathematics Subject Classification

  • 15B52
  • 62E20