Statistical Methods & Applications, Volume 27, Issue 4, pp 595–602

Discussion of “The power of monitoring: how to make the most of a contaminated multivariate sample” by Andrea Cerioli, Marco Riani, Anthony C. Atkinson and Aldo Corbellini

  • Stephane Heritier
  • Maria-Pia Victoria-Feser
Original Paper

Abstract

This paper discusses the contribution of Cerioli et al. (Stat Methods Appl, 2018), where robust monitoring based on high breakdown point estimators is proposed for multivariate data. The results follow years of development in robust diagnostic techniques. We discuss the issues of extending data monitoring to other models with complex structure, e.g. factor analysis and mixed linear models, for which S- and MM-estimators exist, and to data with deviating cells. We emphasise the importance of robust testing, which is often overlooked even though robust tests are readily available once S- and MM-estimators have been defined. We mention open questions, such as out-of-sample inference and big data issues, that would benefit from monitoring.

Keywords

S-estimators · Mixed models · Deviating cells · Out-of-sample inference

References

  1. Agostinelli C, Leung A, Yohai VJ, Zamar RH (2015) Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination. Test 24:441–461
  2. Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723
  3. Alqallaf F, Van Aelst S, Yohai VJ, Zamar RH (2009) Propagation of outliers in multivariate data. Ann Stat 37:311–331
  4. Avella Medina M, Ronchetti E (2017) Robust and consistent variable selection in high-dimensional generalized linear models. Biometrika (to appear)
  5. Bianco AM, Garcia Ben M, Yohai VJ (2005) Robust estimation for linear regression with asymmetric errors. Can J Stat 33(4):511–528
  6. Butler R, Davies P, Jhun M (1993) Asymptotics for the minimum covariance determinant estimator. Ann Stat 21:1385–1400
  7. Candès EJ, Li X, Ma Y, Wright J (2011) Robust principal component analysis? J ACM 58(3), Article number 11
  8. Cerioli A, Riani M, Atkinson AC, Corbellini A (2018) The power of monitoring: how to make the most of a contaminated multivariate sample. Stat Methods Appl (in press)
  9. Cheng T-C, Victoria-Feser M-P (2002) High breakdown estimation of multivariate mean and covariance with missing observations. Br J Math Stat Psychol 55:317–335
  10. Chervoneva I, Vishnyakov M (2011) Constrained S-estimators for linear mixed effects models with covariance components. Stat Med 30(14):1735–1750
  11. Chervoneva I, Vishnyakov M (2014) Generalized S-estimators for linear mixed effect models. Stat Sin 24:1257–1276
  12. Copt S, Heritier S (2007) A robust alternative to the F-test in mixed linear models based on MM-estimates. Biometrics 63:1045–1052
  13. Copt S, Victoria-Feser M-P (2004) Fast algorithms for computing high breakdown covariance matrices with missing data. In: Hubert M, Pison G, Struyf A, Van Aelst S (eds) Theory and applications of recent robust methods. Statistics for industry and technology series, Birkhäuser, Basel, pp 71–82
  14. Copt S, Victoria-Feser M-P (2006) High breakdown inference for mixed linear models. J Am Stat Assoc 101:292–300
  15. Critchley F, Schyns M, Haesbroeck G (2010) RelaxMCD: smooth optimisation for the minimum covariance determinant estimator. Comput Stat Data Anal 54:843–857
  16. Croux C, Haesbroeck G (2000) Principal component analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies. Biometrika 87:603–618
  17. Croux C, Ruiz-Gazen A (2005) High breakdown estimators for principal components: the projection-pursuit approach revisited. J Multivar Anal 95:206–226
  18. Croux C, Filzmoser P, Oliveira M (2007) Algorithms for projection pursuit robust principal component analysis. Chemometr Intell Lab Syst 87:218–225
  19. Danilov M, Yohai VJ, Zamar RH (2012) Robust estimation of multivariate location and scatter in the presence of missing data. J Am Stat Assoc 107:1178–1186
  20. Davies PL (1987) Asymptotic behaviour of S-estimators of multivariate location parameters and dispersion matrices. Ann Stat 15:1269–1292
  21. Devlin SJ, Gnanadesikan R, Kettenring JR (1975) Robust estimation and outlier detection with correlation coefficients. Biometrika 62:531–545
  22. Devlin SJ, Gnanadesikan R, Kettenring JR (1981) Robust estimation of dispersion matrices and principal components. J Am Stat Assoc 76:354–362
  23. Donoho DL (1982) Breakdown properties of multivariate location estimators. Ph.D. qualifying paper, Department of Statistics, Harvard University
  24. Dupuis Lozeron E, Victoria-Feser M-P (2010) Robust estimation of constrained covariance matrices for confirmatory factor analysis. Comput Stat Data Anal 54:3020–3032
  25. Dupuis DJ, Victoria-Feser M-P (2011) Fast robust model selection in large datasets. J Am Stat Assoc 106:203–212
  26. Dupuis DJ, Victoria-Feser M-P (2013) Robust VIF regression with application to variable selection in large datasets. Ann Appl Stat 7:319–341
  27. Efron B (2004) The estimation of prediction error. J Am Stat Assoc 99(467):619–632
  28. Farcomeni A (2014a) Snipping for robust \(k\)-means clustering under component-wise contamination. Stat Comput 24:909–917
  29. Farcomeni A (2014b) Robust constrained clustering in presence of entry-wise outliers. Technometrics 56:102–111
  30. Gnanadesikan R, Kettenring JR (1972) Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28:81–124
  31. Hampel FR, Ronchetti E, Rousseeuw PJ, Stahel WA (1986) Robust statistics: the approach based on influence functions. Wiley, New York
  32. Heritier S, Ronchetti E (1994) Robust bounded-influence tests in general parametric models. J Am Stat Assoc 89(427):897–904
  33. Heritier S, Cantoni E, Copt S, Victoria-Feser M-P (2009) Robust methods in biostatistics. Wiley, New York
  34. Huber P, Ronchetti E (2009) Robust statistics, 2nd edn. Wiley, New York
  35. Hubert M, Rousseeuw PJ, Vanden Branden K (2005) ROBPCA: a new approach to robust principal component analysis. Technometrics 47:64–79
  36. Hubert M, Rousseeuw PJ, Verdonck T (2009) Robust PCA for skewed data and its outlier map. Comput Stat Data Anal 53:2264–2274
  37. Hubert M, Rousseeuw PJ, Verdonck T (2012) A deterministic algorithm for robust location and scatter. J Comput Graph Stat 21:618–637
  38. Hubert M, Rousseeuw PJ, Segaert P (2015) Multivariate functional outlier detection. Stat Methods Appl 24:177–202
  39. Kent JT, Tyler DE (1996) Constrained \(M\)-estimation for multivariate location and scatter. Ann Stat 24:1346–1370
  40. Khan JA, Van Aelst S, Zamar RH (2007) Robust linear model selection based on least angle regression. J Am Stat Assoc 102:1289–1299
  41. Koller M (2016) robustlmm: an R package for robust estimation of linear mixed-effects models. J Stat Softw 75(6):1–24
  42. Leung A, Zhang H, Zamar R (2016) Robust regression estimation and inference in the presence of cellwise and casewise contamination. Comput Stat Data Anal 99:1–11
  43. Little RJA, Rubin DB (1987) Statistical analysis with missing data. Wiley, New York
  44. Little RJA, Smith PJ (1987) Editing and imputing for quantitative survey data. J Am Stat Assoc 82:58–68
  45. Liu RY, Parelius JM, Singh K (1999) Multivariate analysis by data depth: descriptive statistics, graphics and inference (with discussion and a rejoinder by Liu and Singh). Ann Stat 27:783–858
  46. Lopuhaä HP (1989) On the relation between S-estimators and M-estimators of multivariate location and covariance. Ann Stat 17:1662–1683
  47. Machado JAF (1993) Robust model selection and \(M\)-estimation. Econ Theory 9:478–493
  48. Mallows CL (1973) Some comments on \(C_p\). Technometrics 15(4):661–675
  49. Maronna RA, Zamar RH (2002) Robust estimates of location and dispersion for high-dimensional datasets. Technometrics 44(4):307–317
  50. Maronna RA, Martin RD, Yohai VJ (2006) Robust statistics: theory and methods. Wiley, Chichester
  51. Mavridis D, Moustaki I (2008) Detecting outliers in factor analysis using the forward search algorithm. Multivar Behav Res 43:453–475
  52. Mavridis D, Moustaki I (2009) The forward search algorithm for detecting aberrant response patterns in factor analysis for binary data. J Comput Graph Stat 18:1016–1034
  53. McQuarrie A, Tsai C (1998) Regression and time series model selection, vol 43. World Scientific, Singapore
  54. Moustaki I, Victoria-Feser M-P (2006) Bounded-bias robust inference for generalized linear latent variable models. J Am Stat Assoc 101:644–653
  55. Olive D (2004) A resistant estimator of multivariate location and dispersion. Comput Stat Data Anal 46:99–102
  56. Öllerer V, Alfons A, Croux C (2016) The shooting S-estimator for robust regression. Comput Stat 31:829–844
  57. Rocke DM (1996) Robustness properties of S-estimators of multivariate location and shape in high dimension. Ann Stat 24:1327–1345
  58. Rocke DM, Woodruff DL (1996) Identification of outliers in multivariate data. J Am Stat Assoc 91:1047–1061
  59. Ronchetti E, Staudte RG (1994) A robust version of Mallows’s \(C_p\). J Am Stat Assoc 89:550–559
  60. Ronchetti E, Field C, Blanchard W (1997) Robust linear model selection by cross-validation. J Am Stat Assoc 92:1017–1023
  61. Rousseeuw PJ (1984) Least median of squares regression. J Am Stat Assoc 79:871–880
  62. Rousseeuw PJ, Leroy AM (1987) Robust regression and outlier detection. Wiley, New York
  63. Rousseeuw PJ, Van den Bossche W (2017) Detecting deviating data cells. Technometrics. https://doi.org/10.1080/00401706.2017.1340909 (in press)
  64. Rousseeuw PJ, Van Driessen K (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41:212–223
  65. Rousseeuw PJ, Yohai VJ (1984) Robust regression by means of S-estimators. In: Franke JW, Härdle W, Martin RD (eds) Robust and nonlinear time series analysis. Springer, New York, pp 256–272
  66. Salibian-Barrera M, Yohai VJ (2006) A fast algorithm for S-regression estimates. J Comput Graph Stat 15(2):414–427
  67. Salibian-Barrera M, Yohai VJ (2008) High breakdown point robust regression with censored data. Ann Stat 36(1):118–146
  68. Salibian-Barrera M, Van Aelst S, Willems G (2006) PCA based on multivariate MM-estimators with fast and robust bootstrap. J Am Stat Assoc 101:1198–1211
  69. Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
  70. Stahel WA (1981) Breakdown of covariance estimators. Technical report 31, Fachgruppe für Statistik, ETH, Zurich
  71. Tatsuoka KS, Tyler DE (2000) The uniqueness of S and M-functionals under nonelliptical distributions. Ann Stat 28:1219–1243
  72. Van Aelst S, Wang Y (2017) Robust variable screening for regression using factor profiling, manuscript
  73. Van Aelst S, Willems G (2012) Robust and efficient one-way MANOVA tests. J Am Stat Assoc 106(494):706–718
  74. Van Aelst S, Vandervieren E, Willems G (2012) A Stahel-Donoho estimator based on huberized outlyingness. Comput Stat Data Anal 56:531–542
  75. Woodruff DL, Rocke DM (1994) Computable robust estimation of multivariate location and shape in high dimension using compound estimators. J Am Stat Assoc 89:888–896
  76. Xu H, Caramanis C, Sanghavi S (2012) Robust PCA via outlier pursuit. IEEE Trans Inf Theory 58:3047–3064
  77. Xu H, Caramanis C, Mannor S (2013) Outlier-robust PCA: the high-dimensional case. IEEE Trans Inf Theory 59:546–572
  78. Yohai VJ (1987) High breakdown point and high efficiency robust estimates for regression. Ann Stat 15:642–656
  79. Zuo Y, Cui H (2005) Depth weighted scatter estimators. Ann Stat 33:381–413

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2017

Authors and Affiliations

  1. School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
  2. Geneva School of Economics and Management, Geneva University, Geneva, Switzerland