Matrix regression heterogeneity analysis

  • Original Paper
  • Statistics and Computing

Abstract

Advances in modern science and technology have made it possible to collect large amounts of matrix-valued data in fields such as biomedicine. Matrix data modeling has been studied extensively and improves on the naive approach of flattening the matrix into a vector. However, existing matrix modeling methods focus mainly on homogeneous data and cannot handle the heterogeneity frequently encountered in biomedical studies, where samples from the same study belong to several underlying subgroups and different subgroups follow different models. In this paper, we focus on regression-based heterogeneity analysis. We propose a heterogeneity analysis framework for matrix data that combines matrix bilinear sparse decomposition with penalized fusion techniques and enables data-driven subgroup detection, including determination of the number of subgroups and of subgroup membership. We provide a rigorous theoretical analysis that establishes asymptotic consistency for subgroup detection, the number of subgroups, and the regression coefficients. Extensive numerical studies on simulated and real data demonstrate the superior performance of the proposed method in analyzing heterogeneous matrix data.
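
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rank-one bilinear form, the choice of the minimax concave penalty (MCP) as the fusion penalty, and the names `bilinear_predictor`, `mcp`, and `fusion_penalty` are all assumptions made here for illustration only.

```python
import numpy as np

def bilinear_predictor(X, a, b):
    """Rank-one bilinear predictor a^T X b for a p x q matrix covariate X (assumed form)."""
    return a @ X @ b

def mcp(t, lam, gamma):
    """Minimax concave penalty at |t| (an assumed choice of concave fusion penalty)."""
    t = np.abs(t)
    return np.where(t <= gamma * lam, lam * t - t**2 / (2 * gamma), gamma * lam**2 / 2)

def fusion_penalty(B, lam, gamma):
    """Sum of MCP over Frobenius norms of all pairwise differences B[i] - B[j],
    where B[i] is the (hypothetical) sample-specific coefficient matrix."""
    n = B.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += float(mcp(np.linalg.norm(B[i] - B[j]), lam, gamma))
    return total

rng = np.random.default_rng(0)
p, q = 4, 3
X = rng.normal(size=(p, q))
a = rng.normal(size=p)
b = rng.normal(size=q)

# The bilinear form equals the flattened inner product with B = a b^T,
# but it is described by p + q numbers instead of p * q.
flat = np.outer(a, b).ravel() @ X.ravel()
bilin = bilinear_predictor(X, a, b)

# Four samples in two well-separated subgroups: identical matrices within a
# subgroup contribute zero penalty, while each of the four cross-subgroup
# pairs contributes the MCP plateau gamma * lam**2 / 2.
B1 = np.outer(a, b)
B2 = B1 + 5.0
B = np.stack([B1, B1, B2, B2])
pen = fusion_penalty(B, lam=1.0, gamma=3.0)
```

Because a concave penalty such as MCP levels off for large differences, well-separated subgroups are not shrunk toward each other, while samples whose coefficient matrices are fused to exactly equal values are read off as one subgroup. This is the general idea behind penalized fusion; the paper's actual objective and algorithm may differ.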

Data and code availability

The data and code needed to reproduce the numerical results are available on GitHub at https://github.com/Zhang-Fengchuan/Matrix-heterogeneity-linear-regression.git and https://github.com/Zhang-Fengchuan/Matrix-heterogeneity-logistic-regression.git.

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Nos. 12171454, U19B2940), the Fundamental Research Funds for the Central Universities, the Beijing Natural Science Foundation (JQ20029), the National Key R&D Program of China (2022YFC3502502), and the National Natural Science Foundation of China (No. 82071000).

Author information

Contributions

SZ and MR conceived the study. FZ and MR wrote the main manuscript text and supplementary information, planned and carried out the simulations, and developed the theory. SL provided the real data and guided the real data analysis. All authors reviewed the manuscript and contributed to the final manuscript.

Corresponding authors

Correspondence to Shi-Ming Li or Mingyang Ren.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 10124 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, F., Zhang, S., Li, SM. et al. Matrix regression heterogeneity analysis. Stat Comput 34, 95 (2024). https://doi.org/10.1007/s11222-024-10401-z

  • DOI: https://doi.org/10.1007/s11222-024-10401-z
