Abstract
The multiple response regression model is commonly used to investigate the relationship between multiple outcomes and a set of potential predictors; single-response analysis and multivariate analysis of variance (MANOVA) are two frequently used methods for the association analysis. However, both methods have limitations: the former rests on the independence of the multiple responses, and the latter assumes that the responses are normally distributed. In this work, the authors propose a test statistic, based on the F statistic, for multiple response association analysis in high-dimensional settings. It does not require the normality assumption, and its asymptotic normal distribution is obtained under some regularity conditions. Extensive computer simulations and four real data applications show its superiority over single-response analysis and MANOVA.
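To make the single-response baseline mentioned in the abstract concrete, the following is a minimal sketch that computes one F statistic per response, treating each outcome separately. It is not the authors' proposed test statistic, which the abstract does not spell out. The single-predictor setup, the toy data, and the names `f_statistic` and `per_response_f` are all assumptions made for illustration.

```python
import random


def f_statistic(x, y):
    """F statistic for H0: slope = 0 in the simple regression y = a + b*x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    ss_reg = sxy ** 2 / sxx   # regression sum of squares, 1 degree of freedom
    ss_res = syy - ss_reg     # residual sum of squares, n - 2 degrees of freedom
    return ss_reg / (ss_res / (n - 2))


def per_response_f(x, responses):
    """Single-response analysis: one F statistic per outcome, computed separately."""
    return [f_statistic(x, y) for y in responses]


# Toy data (assumed): response 0 depends on x, response 1 is pure noise.
random.seed(0)
n = 50
x = [random.gauss(0, 1) for _ in range(n)]
y0 = [2.0 * xi + random.gauss(0, 1) for xi in x]
y1 = [random.gauss(0, 1) for _ in range(n)]

f_values = per_response_f(x, [y0, y1])
print(f_values)
```

Because each response is tested on its own, this approach ignores any dependence among the outcomes, which is the limitation the abstract attributes to single-response analysis.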
Ethics declarations
The authors declare no conflict of interest.
Additional information
This paper was supported in part by the China Postdoctoral Science Foundation Funded Project under Grant No. 2021M700433 and the National Natural Science Foundation of China under Grant Nos. 12101047 and 12201432.
About this article
Cite this article
Wang, J., Jiang, Z., Liu, H. et al. Association Testing for High-Dimensional Multiple Response Regression. J Syst Sci Complex 36, 1680–1696 (2023). https://doi.org/10.1007/s11424-023-1168-2
DOI: https://doi.org/10.1007/s11424-023-1168-2