The minimum covariance determinant estimator for interval-valued data

  • Original Paper
  • Published in: Statistics and Computing

Abstract

Effective estimation of covariance matrices is crucial for statistical analyses and applications. In this paper, we focus on robust estimation of the covariance matrix for interval-valued data in low and moderately high dimensions. In the low-dimensional scenario, we extend the Minimum Covariance Determinant (MCD) estimator to interval-valued data. We derive an iterative algorithm for computing this estimator, demonstrate its convergence, and establish theoretically that it retains the high breakdown point of the MCD estimator. To extend the MCD estimator to moderately high-dimensional settings, we further propose a projection-based estimator and a regularization-based estimator. We develop efficient iterative algorithms for computing these two estimators and demonstrate their convergence properties. We conduct extensive simulation studies and real data analysis to validate the finite-sample properties of the proposed estimators.
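
For readers unfamiliar with the MCD estimator named above, the following is a minimal sketch of the classical, point-valued MCD computed with concentration steps, in the spirit of the fast algorithm of Rousseeuw and Driessen (1999). It is not the interval-valued algorithm of this paper, which is not reproduced in this preview. The restriction to two dimensions, the use of random h-subsets as starting points, and all function names (mean_cov_2d, concentrate, mcd_2d, make_demo_data) are assumptions made purely for illustration.

```python
import random


def mean_cov_2d(points):
    """Sample mean and (biased) covariance of a list of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), (sxx, sxy, syy)


def det_2d(cov):
    """Determinant of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy


def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det_2d(cov)


def concentrate(points, subset, max_iter=100):
    """Repeat concentration steps until the h-subset stops changing."""
    subset = sorted(subset)
    center, cov = mean_cov_2d([points[i] for i in subset])
    for _ in range(max_iter):
        if det_2d(cov) <= 0.0:
            break  # degenerate subset: distances are undefined
        # Keep the h points closest to the current fit.
        ranked = sorted(range(len(points)),
                        key=lambda i: mahalanobis_sq(points[i], center, cov))
        new_subset = sorted(ranked[:len(subset)])
        if new_subset == subset:
            break  # fixed point reached
        subset = new_subset
        center, cov = mean_cov_2d([points[i] for i in subset])
    return center, cov, subset


def mcd_2d(points, h, n_starts=10, seed=0):
    """Run several random starts and keep the smallest determinant."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        start = rng.sample(range(len(points)), h)
        result = concentrate(points, start)
        if best is None or det_2d(result[1]) < det_2d(best[1]):
            best = result
    return best


def make_demo_data(seed=1):
    """Hypothetical data: 40 inliers near (0, 0), then 10 outliers near (50, 50)."""
    rng = random.Random(seed)
    inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
    outliers = [(50 + rng.gauss(0, 1), 50 + rng.gauss(0, 1)) for _ in range(10)]
    return inliers + outliers


if __name__ == "__main__":
    center, cov, subset = mcd_2d(make_demo_data(), h=30)
    print("robust center:", center)
    print("determinant of robust covariance:", det_2d(cov))
```

Each concentration step cannot increase the determinant of the subset covariance, which is why the loop terminates at a fixed subset. Several starts are used because a single start may stop at a local minimum.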

Funding

This work was supported by the National Natural Science Foundation of China (No. 72071008) and the Open Research Fund of the Key Laboratory of Advanced Theory and Application in Statistics and Data Science (East China Normal University), Ministry of Education.

Author information

Contributions

WT: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Writing (original draft). ZQ: Conceptualization, Methodology, Validation, Supervision, Writing (review and editing).

Corresponding author

Correspondence to Zhongfeng Qin.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary file 1 (PDF, 490 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tian, W., Qin, Z. The minimum covariance determinant estimator for interval-valued data. Stat Comput 34, 80 (2024). https://doi.org/10.1007/s11222-024-10386-9

  • DOI: https://doi.org/10.1007/s11222-024-10386-9
