Abstract
Maximum likelihood is perhaps the most common method for estimating model parameters in applied statistics. However, it is well known that maximum likelihood estimators often perform poorly when outliers are present. Robust estimation methods are often used to estimate model parameters in the presence of outliers, but these methods lack a unified approach. We propose a unified method that uses the EM algorithm to make statistical modelling more robust. In this paper, we describe the proposed method of robust estimation and demonstrate it by estimating a location parameter. Well-known real data sets containing outliers are used to demonstrate the application of the proposed estimator. Finally, the proposed estimator is compared with the standard M-estimator. The location case is considered here for simplicity, but the method extends directly to the robust estimation of parameters in a broad range of statistical models. Hence the proposed method aligns with classical statistical modelling in offering a unified approach.
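To make the idea of EM-based robust location estimation concrete, the following is a minimal Python sketch rather than the chapter's actual algorithm. It assumes a two-component mixture: a normal component N(θ, σ²) for regular observations and a flat "noise" component with constant density for outliers. That specific model, the function name `em_robust_location`, the parameter `noise_density`, the starting values, and the toy data are all illustrative assumptions, not details taken from the chapter.

```python
import math
from statistics import median


def em_robust_location(y, noise_density=0.05, max_iter=200, tol=1e-8):
    """EM sketch for a robust location estimate (hypothetical helper).

    Assumed model: with probability p an observation comes from
    N(theta, sigma^2); otherwise it comes from a flat noise component
    with constant density `noise_density`.
    Returns (theta, sigma, z), where z[i] is the estimated probability
    that y[i] belongs to the normal component.
    """
    n = len(y)
    theta = median(y)  # robust starting value
    # MAD-based starting scale; fall back to 1.0 if the MAD is zero.
    sigma = 1.4826 * median(abs(v - theta) for v in y) or 1.0
    p = 0.9  # assumed starting proportion of regular observations
    z = [1.0] * n

    for _ in range(max_iter):
        # E-step: posterior probability of the normal component.
        z = []
        for v in y:
            dens = math.exp(-0.5 * ((v - theta) / sigma) ** 2)
            good = p * dens / (sigma * math.sqrt(2.0 * math.pi))
            bad = (1.0 - p) * noise_density
            z.append(good / (good + bad))

        # M-step: the z-weighted mean maximizes the z-weighted
        # normal log-likelihood in theta.
        sz = sum(z)
        new_theta = sum(zi * v for zi, v in zip(z, y)) / sz
        new_sigma = math.sqrt(
            sum(zi * (v - new_theta) ** 2 for zi, v in zip(z, y)) / sz
        )
        new_sigma = max(new_sigma, 1e-12)  # guard against collapse
        # Keep p strictly inside (0, 1) so both components stay active.
        p = min(max(sz / n, 1e-6), 1.0 - 1e-6)

        converged = (abs(new_theta - theta) < tol
                     and abs(new_sigma - sigma) < tol)
        theta, sigma = new_theta, new_sigma
        if converged:
            break

    return theta, sigma, z


# Toy data (made up for illustration): eight regular values, one outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 25.0]
theta_hat, sigma_hat, weights = em_robust_location(data)
sample_mean = sum(data) / len(data)
print(theta_hat, sample_mean)
```

On this toy sample the ordinary mean is pulled up to about 11.67 by the single outlier, whereas the sketch assigns the outlier a weight near zero and its estimate stays near the bulk of the data. The weights play a role similar to those in an iteratively reweighted M-estimator, but here they arise as posterior membership probabilities under the assumed mixture.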
References
Abbey, S.: Robust measures and the estimator limit. Geostand. Newslett. 12, 241 (1988)
Analytical Methods Committee: Robust statistics: how not to reject outliers. The Analyst 114, 1693–1702 (1989)
Andrews, D.F., Bickel, P.J., Hampel, F.R., Huber, P.J., Rogers, W.H., Tukey, J.W.: Robust Estimates of Location: Survey and Advances, vol. 279. Princeton University Press, Princeton (1972)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc. (B) 39, 1–38 (1977)
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis. Chapman and Hall, London (1995)
Hennig, C.: Robustness of ML estimators of location-scale mixtures. In: Innovations in Classification, Data Science, and Information Systems, pp. 128–137. Springer, Heidelberg (2005)
Hennig, C., Coretto, P.: The noise component in model-based cluster analysis. In: Data Analysis, Machine Learning and Applications, pp. 127–138. Springer, Berlin (2008)
Holland, P.W., Welsch, R.E.: Robust regression using iteratively reweighted least-squares. Commun. Stat. Theor. Methods A6(9), 813–827 (1977)
Huber, P.J.: Robust estimation of a location parameter. Ann. Math. Stat. 35, 73–101 (1964)
Huber, P., Ronchetti, E.M.: Robust Statistics, 2nd edn. Wiley, New York (2009)
Longford, N.T., D’Urso, P.: Mixture models with an improper component. J. Appl. Stat. 38(11), 2511–2521 (2011)
Louis, T.A.: Finding the observed information matrix when using the EM algorithm. J. R. Stat. Soc. B 44, 226–233 (1982)
Maronna, R.A., Martin, R.D., Yohai, V.J.: Robust Statistics. Wiley, West Sussex, England (2006)
McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions. Wiley, Hoboken, New Jersey (1996)
Rohan, M.: Using Finite Mixtures to Robustify Statistical Models. Ph.D. Thesis, The University of Waikato (2011)
Venables, W.N., Ripley, B.D.: Modern Applied Statistics with S, 4th edn. Springer, New York (2002)
Acknowledgements
I would like to express my sincere thanks to Dr Murray Jorgensen (AUT University) for his valuable discussions throughout my research and his advice on this manuscript. I also thank Dr Iain Hume (NSW DPI) for his comments on the manuscript.
Appendix A: Proof of Equation (2.9)
For MLE, \( l_c(\theta) = \sum_{i=1}^{n} \hat{z}_i (y_i - \theta)^2 + \text{constant} \) is to be maximized with respect to \( \theta \),
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rohan, M. (2020). Computing Robust Statistics via an EM Algorithm. In: Rahman, A. (eds) Statistics for Data Science and Policy Analysis. Springer, Singapore. https://doi.org/10.1007/978-981-15-1735-8_2
DOI: https://doi.org/10.1007/978-981-15-1735-8_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1734-1
Online ISBN: 978-981-15-1735-8
eBook Packages: Mathematics and Statistics (R0)