Abstract
Testing the structure of a high-dimensional covariance matrix plays an important role in financial stock analysis, genetic series analysis, and many other fields. This paper focuses on testing whether the covariance matrix is block-diagonal in the high-dimensional setting. Several existing test procedures address this problem, but they rely on normality assumptions, two-diagonal-block assumptions, or sub-block dimensionality assumptions. To relax these assumptions, we develop a test framework based on U-statistics and establish the asymptotic distributions of these U-statistics under the null and local alternative hypotheses. Moreover, we develop a test approach for alternatives with different sparsity levels. Finally, a simulation study and a real data analysis demonstrate the performance of the proposed methods.
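To make the idea of a U-statistic for block-diagonality concrete, here is a minimal sketch. It is not the test statistic proposed in the paper. It assumes mean-zero observations and only two variable groups, and the names `cross_block_ustat`, `rows`, `idx_a`, and `idx_b` are hypothetical. Under these assumptions, averaging the product of the two within-block inner products over all pairs of distinct observations gives an unbiased estimate of the squared Frobenius norm of the off-diagonal covariance block, which is zero when the covariance matrix is block-diagonal.

```python
from itertools import combinations


def cross_block_ustat(rows, idx_a, idx_b):
    """Illustrative U-statistic for the squared Frobenius norm of the
    cross-covariance block between variable groups idx_a and idx_b.

    Assumption (not from the paper): observations have mean zero, so
    E[(x_i^A . x_j^A)(x_i^B . x_j^B)] = ||Sigma_AB||_F^2 for i != j.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two observations")

    def dot(u, v, idx):
        # Inner product restricted to the coordinates in idx.
        return sum(u[k] * v[k] for k in idx)

    total = 0.0
    # Sum the kernel over all unordered pairs of distinct observations.
    for i, j in combinations(range(n), 2):
        total += dot(rows[i], rows[j], idx_a) * dot(rows[i], rows[j], idx_b)

    # Divide by the number of unordered pairs, n * (n - 1) / 2.
    return 2.0 * total / (n * (n - 1))
```

A value near zero is consistent with the two groups being uncorrelated. Turning such a statistic into a test requires a variance estimate and a limiting distribution, which this sketch does not attempt.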
Data Availability
The datasets used for this study are available upon request from the authors.
References
Al-Shalalfa M, Alhajj R (2007) Attractive feature reduction approach for colon data classification. In: 21st International Conference on Advanced Information Networking and Applications Workshops (AINAW’07), vol 1. IEEE, Niagara Falls, ON, Canada, pp 678–683. https://doi.org/10.1109/AINAW.2007.103
Anderson TW (1984) An introduction to multivariate statistical analysis, 2nd edn. Wiley, New York
Bai Z, Jiang D, Yao J et al (2009) Corrections to LRT on large dimensional covariance matrix by RMT. Ann Stat 37(6B):3822–3840. https://doi.org/10.1214/09-AOS694
Bao Z, Hu J, Pan G et al (2017) Test of independence for high-dimensional random vectors based on block correlation matrices. Electron J Stat 11:1527–1548. https://doi.org/10.1214/17-EJS1259
Berisa T, Pickrell JK (2016) Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics 32(2):283–285. https://doi.org/10.1093/bioinformatics/btv546
Bodnar T, Dette H, Parolya N (2019) Testing for independence of large dimensional vectors. Ann Stat 47(5):2977–3008. https://doi.org/10.1214/18-AOS1771
Cai T, Jiang T (2011) Limiting laws of coherence of random matrices with applications to testing covariance structure and construction of compressed sensing matrices. Ann Stat 39(3):1496–1525. https://doi.org/10.1214/11-AOS879
Cai T, Ma Z (2013) Optimal hypothesis testing for high dimensional covariance matrices. Bernoulli 19(5B):2359–2388. https://doi.org/10.3150/12-BEJ455
Chen S, Zhang L, Zhong P (2010) Tests for high-dimensional covariance matrices. J Am Stat Assoc 105(490):810–819. https://doi.org/10.1198/jasa.2010.tm09560
Devijver E, Gallopin M (2018) Block-diagonal covariance selection for high-dimensional Gaussian graphical models. J Am Stat Assoc 113(521):306–314. https://doi.org/10.1080/01621459.2016.1247002
Dudoit S, Fridlyand J, Speed TP (2002) Comparison of discrimination methods for the classification of tumors using gene expression data. J Am Stat Assoc 97(457):77–87. https://doi.org/10.1198/016214502753479248
He Y, Xu G, Wu C et al (2021) Asymptotically independent U-statistics in high-dimensional testing. Ann Stat 49(1):154–181. https://doi.org/10.1214/20-AOS1951
Hyodo M, Shutoh N, Nishiyama T et al (2015) Testing block-diagonal covariance structure for high-dimensional data. Stat Neerl 69(4):460–482. https://doi.org/10.1111/stan.12068
Jiang D, Qi Y (2015) Likelihood ratio tests for high-dimensional normal distributions. Scand J Stat 42(4):988–1009. https://doi.org/10.1111/sjos.12147
Jiang D, Jiang T, Yang F (2012) Likelihood ratio tests for covariance matrices of high-dimensional normal distributions. J Stat Plan Inference 142(8):2241–2256. https://doi.org/10.1016/j.jspi.2012.02.057
Jiang D, Bai Z, Zheng S (2013) Testing the independence of sets of large-dimensional variables. Sci China Math 56(1):135–147. https://doi.org/10.1007/s11425-012-4501-0
Jiang T, Yang F (2013) Central limit theorems for classical likelihood ratio tests for high-dimensional normal distributions. Ann Stat 41(4):2029–2074. https://doi.org/10.1214/13-AOS1134
John S (1971) Some optimal multivariate tests. Biometrika 58(1):123–127. https://doi.org/10.1093/biomet/58.1.123
Kan R (2008) From moments of sum to moments of product. J Multivar Anal 99(3):542–554. https://doi.org/10.1016/j.jmva.2007.01.013
Kumar SS, Sumathi A, Ramaraj DE (2012) Development of an efficient clustering technique for colon dataset. Int J Eng Innovative Technol 1(5):83–86
Ledoit O, Wolf M (2002) Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. Ann Stat 30(4):1081–1102. https://doi.org/10.1214/aos/1031689018
Li W, Yao J (2018) On structure testing for component covariance matrices of a high dimensional mixture. J R Stat Soc Ser B (Stat Methodol) 80(2):293–318. https://doi.org/10.1111/rssb.12248
Li W, Chen J, Yao J (2017) Testing the independence of two random vectors where only one dimension is large. Statistics 51(1):141–153. https://doi.org/10.1080/02331888.2016.1266988
Lin Z, Xiang Y (2008) A hypothesis test for independence of sets of variates in high dimensions. Statist Probab Lett 78(17):2939–2946. https://doi.org/10.1016/j.spl.2008.05.003
Marques F, Coelho C, Marques P (2013) The block-matrix sphericity test: exact and near-exact distributions for the test statistic. Recent Developments in Modeling and Applications in Statistics pp 169–177. https://doi.org/10.1007/978-3-642-32419-2_18
Nagao H (1973) On some test criteria for covariance matrix. Ann Stat 1(4):700–709. https://doi.org/10.2307/2958313
Nirmalakumari K, Rajaguru H, Rajkumar P (2020) Inference on the shape of elliptical distributions based on the mcd. Int J Imaging Syst Technol pp 1–21. https://doi.org/10.1002/ima.22431
Pavlenko T, Björkström A, Tillander A (2012) Covariance structure approximation via glasso in high-dimensional supervised classification. J Appl Stat 39(8):1643–1666
Qi Y, Wang F, Zhang L (2019) Limiting distributions of likelihood ratio test for independence of components for high-dimensional normal vectors. Ann Inst Stat Math 71:911–946. https://doi.org/10.1007/s10463-018-0666-9
Qiu Y, Chen S (2012) Test for bandedness of high-dimensional covariance matrices and bandwidth estimation. Ann Stat 40(3):1285–1314. https://doi.org/10.1214/12-AOS1002
Rahman MA, Muniyandi RC (2018) Feature selection from colon cancer dataset for cancer classification using artificial neural network. Int J Adv Sci Eng Inform Technol 8(4–2):1387–1393
Schott JR (2005) Testing for complete independence in high dimensions. Biometrika 92(4):951–956. https://doi.org/10.1093/biomet/92.4.951
Silva IR, Zhuang Y, Junior JCAdS (2021) Kronecker delta method for testing independence between two vectors in high-dimension. Stat Pap (Berl) pp 1–23. https://doi.org/10.1007/s00362-021-01238-z
Srivastava M (2005) Some tests concerning the covariance matrix in high-dimensional data. J Japan Stat Soc 35(2):251–272. https://doi.org/10.14490/jjss.35.251
Srivastava M, Reid N (2012) Testing the structure of the covariance matrix with fewer observations than the dimension. J Multivar Anal 112:156–171. https://doi.org/10.1016/j.jmva.2012.06.004
Wang Q, Yao J (2013) On the sphericity test with large-dimensional observations. Electron J Stat 7:2164–2192. https://doi.org/10.1214/13-EJS842
Wang X, Xu G, Zheng S (2022) Adaptive tests for bandedness of high-dimensional covariance matrices. arXiv:2204.11155 [stat.ME]
Xiao H, Wu W (2013) Asymptotic theory for maximum deviations of sample covariance matrix estimates. Stochastic Processes and Their Appl 123(7):2899–2920. https://doi.org/10.1016/j.spa.2013.03.012
Xu K (2017) Testing diagonality of high-dimensional covariance matrix under non-normality. J Stat Comput Simul 87(16):3208–3224. https://doi.org/10.1080/00949655.2017.1362405
Xu K, Hao X (2019) A nonparametric test for block-diagonal covariance structure in high dimension and small samples. J Multivar Anal 173:551–567. https://doi.org/10.1016/j.jmva.2019.05.001
Yamada Y, Hyodo M, Nishiyama T (2017) Testing block-diagonal covariance structure for high-dimensional data under non-normality. J Multivar Anal 155:305–316. https://doi.org/10.1016/j.jmva.2016.12.009
Yang Y, Pan G (2015) Independence test for high dimensional data based on regularized canonical correlation coefficients. Ann Stat 43(2):467–500. https://doi.org/10.1214/14-AOS1284
Yata K, Aoshima M (2016) High-dimensional inference on covariance structures via the extended cross-data-matrix methodology. J Multivar Anal 151:151–166. https://doi.org/10.1016/j.jmva.2016.07.011
Yu K, Li Q, Bergen AW et al (2009) Pathway analysis by adaptive combination of p-values. Genet Epidemiol 33(8):700–709. https://doi.org/10.1002/gepi.20422
Zhang W, Jin B, Bai Z (2021) Learning block structures in U-statistic based matrices. Biometrika 108(4):933–946. https://doi.org/10.1093/biomet/asaa099
Zhang X, Cheng G (2014) Bootstrapping high dimensional time series. arXiv:1406.1037 [math.ST]
Zheng S, He X, Guo J (2022) Hypothesis testing for block-structured correlation for high dimensional variables. Statistica Sinica 32. https://doi.org/10.5705/ss.202019.0319
Zhu Z, Kay SM (2016) The Rao test for testing bandedness of complex-valued covariance matrix. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Shanghai, China, pp 3960–3963. https://doi.org/10.1109/ICASSP.2016.7472420
Acknowledgements
We are grateful to the Editor, the Associate Editor, and the two referees for their constructive comments, which helped us improve the manuscript. The authors thank Prof. Kai Xu for providing the code for Xu and Hao (2019). This work is supported by NSFC grants 12071066 and 12231011.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest.
Code availability
The code used for the simulations and the real data analysis is available upon request from the authors.
Supplementary Information
The online supplementary material contains the proofs of Theorems 1–2 and Proposition 1, the proofs of their supporting lemmas, and simulation results for data from the Gamma(4,2)-2 distribution. (pdf 846KB)
About this article
Cite this article
Lai, J., Wang, X., Zhao, K. et al. Block-diagonal test for high-dimensional covariance matrices. TEST 32, 447–466 (2023). https://doi.org/10.1007/s11749-022-00842-x
DOI: https://doi.org/10.1007/s11749-022-00842-x