Abstract
We consider robust estimation of wrapped models for multivariate circular data, that is, points on the surface of a p-torus, based on the weighted likelihood methodology. Robust model fitting is achieved through a set of weighted likelihood estimating equations, in which data-dependent weights down-weight anomalous values, such as unexpected directions that do not share the main pattern of the bulk of the data. Weighted likelihood estimating equations with weights evaluated on the torus or obtained after unwrapping the data onto the Euclidean space are proposed and compared. Asymptotic properties and robustness features of the estimators are studied, and their finite sample behavior is investigated through Monte Carlo numerical experiments and real data examples.
References
Agostinelli, C.: Robust estimation for circular data. Comput. Stat. Data Anal. 51(12), 5867–5875 (2007)
Agostinelli, C., Greco, L.: Discussion of “the power of monitoring: how to make the most of a contaminated multivariate sample” by Andrea Cerioli, Marco Riani, Anthony C. Atkinson and Aldo Corbellini. Stat. Methods Appl. 27(4), 609–619 (2018)
Agostinelli, C., Greco, L.: Weighted likelihood estimation of multivariate location and scatter. TEST 28(3), 756–784 (2019)
Azzalini, A., Menardi, G.: Clustering via nonparametric density estimation: the R package pdfCluster. J. Stat. Softw. 57(11), 1–26 (2014)
Bahlmann, C.: Directional features in online handwriting recognition. Pattern Recognit. 39(1), 115–125 (2006)
Baltieri, D., Vezzani, R., Cucchiara, R.: People orientation recognition by mixtures of wrapped distributions on random trees. In: European Conference on Computer Vision, Springer, pp. 270–283 (2012)
Basu, A., Lindsay, B.G.: Minimum disparity estimation for continuous models: efficiency, distributions and robustness. Ann. Inst. Stat. Math. 46(4), 683–705 (1994)
Beran, R.: Minimum Hellinger distance estimates for parametric models. Ann. Stat., pp. 445–463 (1977)
Bourne, P.E.: The protein data bank. Nucleic Acids Res. 28, 235–242 (2000)
Chakraborty, S., Wong, S.W.K.: BAMBI: an R package for fitting bivariate angular mixture models. J. Stat. Softw. 99(11), 1–69 (2021)
Chang, M., Artymiuk, P., Wu, X., et al.: Human triosephosphate isomerase deficiency resulting from mutation of Phe-240. Am. J. Hum. Genet. 52, 1260 (1993)
Coles, S.: Inference for circular distributions and processes. Stat. Comput. 8(2), 105–113 (1998)
Cremers, J., Klugkist, I.: One direction? A tutorial for circular data analysis using R with examples in cognitive psychology. Front. Psychol., p. 2040 (2018)
Davies, P.L., Gather, U.: Breakdown and groups. Ann. Stat. 33(3), 977–1035 (2005)
Davies, P.L., Gather, U.: Addendum to the discussion of “breakdown and groups”. Ann. Stat., pp. 1577–1579 (2006)
Eltzner, B., Huckermann, S., Mardia, K.: Torus principal component analysis with applications to RNA structure. Ann. Appl. Stat. 12(2), 1332–1359 (2018)
Farcomeni, A., Greco, L.: Robust Methods for Data Reduction. CRC Press (2016)
Fisher, N., Lee, A.: Time series analysis of circular data. J. R. Stat. Soc. B 56, 327–339 (1994)
Greco, L., Agostinelli, C.: Weighted likelihood mixture modeling and model-based clustering. Stat. Comput. 30(2), 255–277 (2020)
Greco, L., Lucadamo, A., Agostinelli, C.: Weighted likelihood latent class linear regression. Stat. Methods Appl., pp. 1–36 (2020)
Greco, L., Saraceno, G., Agostinelli, C.: Robust fitting of a wrapped normal model to multivariate circular data and outlier detection. Stats 4(2), 454–471 (2021)
Greco, L., Novi Inverardi, P., Agostinelli, C.: Finite mixtures of multivariate wrapped normal distributions for model based clustering of p-torus data. J. Comput. Graph. Stat. 32(3), 1215–1228 (2022)
He, X., Simpson, D.G.: Robust direction estimation. Ann. Stat. 20(1), 351–369 (1992)
Huber, P., Ronchetti, E.: Robust Statistics. Wiley, London (2009)
Jammalamadaka, S., SenGupta, A.: Topics in Circular Statistics, Multivariate Analysis, vol. 5. World Scientific, Singapore (2001)
Jona Lasinio, G., Gelfand, A., Jona Lasinio, M.: Spatial analysis of wave direction data using wrapped Gaussian processes. Ann. Appl. Stat. 6(4), 1478–1498 (2012)
Ko, D., Guttorp, P.: Robustness of estimators for directional data. Ann. Stat., pp. 609–618 (1988)
Kurz, G., Gilitschenski, I., Hanebeck, U.D.: Efficient evaluation of the probability density function of a wrapped normal distribution. In: 2014 Sensor Data Fusion: Trends, Solutions, Applications (SDF), pp. 1–5. IEEE (2014)
Lenth, R.V.: Robust measures of location for directional data. Technometrics 23(1), 77–81 (1981)
Lindsay, B.: Efficiency versus robustness: the case for minimum Hellinger distance and related methods. Ann. Stat. 22, 1081–1114 (1994)
Lund, U.: Cluster analysis for directional data. Commun. Stat. Simul. Comput. 28(4), 1001–1009 (1999)
Mardia, K.: Statistics of Directional Data. Academic Press (1972)
Mardia, K., Jupp, P.: Directional Statistics. Wiley, New York (2000)
Mardia, K., Taylor, C., Subramaniam, G.: Protein bioinformatics and mixtures of bivariate von Mises distributions for angular data. Biometrics 63(2), 505–512 (2007)
Mardia, K., Kent, J., Zhang, Z., et al.: Mixtures of concentrated multivariate sine distributions with applications to bioinformatics. J. Appl. Stat. 39(11), 2475–2492 (2012)
Mardia, K.V., Frellsen, J.: Statistics of bivariate von Mises distributions. In: Bayesian Methods in Structural Bioinformatics. Springer, pp. 159–178 (2012)
Mardia, K.V., Jupp, P.E.: Directional Statistics. Wiley Online Library (2000b)
Markatou, M., Basu, A., Lindsay, B.G.: Weighted likelihood equations with bootstrap root search. J. Am. Stat. Assoc. 93(442), 740–750 (1998)
Maronna, R.A., Martin, R.D., Yohai, V.J., et al.: Robust Statistics: Theory and Methods (with R). Wiley, London (2019)
Munkres, J.R.: Elements of Algebraic Topology. CRC Press (2018)
Nodehi, A., Golalizadeh, M., Maadooliat, M., et al.: Estimation of parameters in multivariate wrapped models for data on a p-torus. Comput. Stat. 36, 193–215 (2021)
Park, C., Basu, A.: The generalized Kullback–Leibler divergence and robust inference. J. Stat. Comput. Simul. 73(5), 311–332 (2003)
Park, C., Basu, A., Lindsay, B.: The residual adjustment function and weighted likelihood: a graphical interpretation of robustness of minimum disparity estimators. Comput. Stat. Data Anal. 39(1), 21–33 (2002)
Pewsey, A., Neuhäuser, M., Ruxton, G.: Circular Statistics in R. Oxford University Press, Oxford (2013)
Prestele, C.: Credit portfolio modelling with elliptically contoured distributions. Ph.D. thesis, Institute for Finance Mathematics, University of Ulm (2007)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2021), https://www.R-project.org/
Ranalli, M., Maruotti, A.: Model-based clustering for noisy longitudinal circular data, with application to animal movement. Environmetrics 31(2), e2572 (2020)
Rao, B.: Nonparametric Functional Estimation. Academic Press (2014)
Rivest, L.P., Duchesne, T., Nicosia, A., et al.: A general angular regression model for the analysis of data on animal movement in ecology. J. R. Stat. Soc.: Ser. C (Appl. Stat.) 65(3), 445–463 (2016)
Rousseeuw, P.J., Hampel, F.R., Ronchetti, E.M., et al.: Robust Statistics: The Approach Based on Influence Functions. Wiley, London (2011)
Rutishauser, U., Ross, I.B., Mamelak, A.N., et al.: Human memory strength is predicted by theta-frequency phase-locking of single neurons. Nature 464(7290), 903–907 (2010)
Saraceno, G., Agostinelli, C., Greco, L.: Robust estimation for multivariate wrapped models. Metron 79(2), 225–240 (2021)
Serfling, R.J.: Approximation Theorems of Mathematical Statistics. Wiley, London (2009)
Wadley, L., Keating, K., Duarte, C., et al.: Evaluating and learning from RNA pseudotorsional space: quantitative validation of a reduced representation for RNA structure. J. Mol. Biol. 372(4), 942–957 (2007)
Warren, W.H., Rothman, D.B., Schnapp, B.H., et al.: Wormholes in virtual space: from cognitive maps to cognitive graphs. Cognition 166, 152–163 (2017)
Acknowledgements
The authors wish to thank the Associate Editor, who supported and encouraged the reviewing process, and two anonymous referees, whose comments helped improve the quality of the paper.
Appendices
Appendix A: MLE for wrapped unimodal elliptically symmetric distributions
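Let us consider the circular model
\[
m^\circ(\varvec{y};\, \varvec{\mu}, \Sigma) = \sum_{\varvec{j} \in {\mathbb{Z}}^p} m(\varvec{y} + 2\pi \varvec{j};\, \varvec{\mu}, \Sigma),
\]
where
\[
m(\varvec{x};\, \varvec{\mu}, \Sigma) \propto \vert \Sigma \vert^{-1/2}\, h\left( (\varvec{x} - \varvec{\mu})^\top \Sigma^{-1} (\varvec{x} - \varvec{\mu}) \right)
\]
is the density of a unimodal elliptically symmetric distribution, for a suitable non-negative function \(h\). The log-likelihood function based on an i.i.d. sample \(\varvec{y}_1, \ldots , \varvec{y}_n\) is
\[
\ell(\varvec{\mu}, \Sigma) = \sum_{i=1}^n \log \sum_{\varvec{j} \in {\mathbb{Z}}^p} m(\varvec{y}_i + 2\pi \varvec{j};\, \varvec{\mu}, \Sigma).
\]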
Recall that for given square matrices \(A\) and \(B\), both symmetric and positive definite, we have that
1. \(\nabla _{A} {\text {tr}}(BA) = B^\top\),
2. \(\nabla _A \log (\vert A \vert ) = \left( A^{-1}\right) ^\top\),
3. \(\nabla _{\varvec{x}} (\varvec{x}^\top A \varvec{x}) = 2 A \varvec{x}\).
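As a quick sanity check of these three identities, the following Python sketch compares each closed form with a central finite-difference gradient. The helper `num_grad`, the random test matrices, and the tolerance are illustrative assumptions rather than part of the original derivation. The entries of \(A\) are perturbed one at a time, so the symmetry constraint is ignored, as the stated formulas implicitly do.

```python
import numpy as np

def num_grad(f, X, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at the array X."""
    G = np.zeros_like(X, dtype=float)
    for idx in np.ndindex(X.shape):
        E = np.zeros_like(X, dtype=float)
        E[idx] = eps
        G[idx] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
p = 3
M = rng.normal(size=(p, p))
A = M @ M.T + p * np.eye(p)          # symmetric positive definite
N = rng.normal(size=(p, p))
B = N @ N.T + p * np.eye(p)          # symmetric positive definite
x = rng.normal(size=p)

# 1. gradient of tr(BA) with respect to A
g1 = num_grad(lambda Z: np.trace(B @ Z), A)
# 2. gradient of log|A| with respect to A (slogdet returns (sign, log|det|))
g2 = num_grad(lambda Z: np.linalg.slogdet(Z)[1], A)
# 3. gradient of x' A x with respect to x
g3 = num_grad(lambda z: z @ A @ z, x)

ok1 = np.allclose(g1, B.T, atol=1e-5)
ok2 = np.allclose(g2, np.linalg.inv(A).T, atol=1e-5)
ok3 = np.allclose(g3, 2 * A @ x, atol=1e-5)
print(ok1, ok2, ok3)
```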
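Let \(d_{i\varvec{j}}(\varvec{\mu },\Sigma ) = (\varvec{y}_i + 2\pi \varvec{j} - \varvec{\mu })^\top \Sigma ^{-1}(\varvec{y}_i + 2\pi \varvec{j} - \varvec{\mu })\), abbreviated to \(d_{i\varvec{j}}\) below. Each term of the wrapped sum in the log-likelihood is proportional to \(\vert \Sigma \vert^{-1/2} h(d_{i\varvec{j}})\), so taking the derivatives w.r.t. \(\varvec{\mu }\) and \(\Sigma ^{-1}\), the likelihood equations are
\[
\sum_{i=1}^n \frac{\sum_{\varvec{j} \in {\mathbb{Z}}^p} h^\prime(d_{i\varvec{j}})\, \Sigma^{-1} (\varvec{y}_i + 2\pi \varvec{j} - \varvec{\mu})}{\sum_{\varvec{j} \in {\mathbb{Z}}^p} h(d_{i\varvec{j}})} = \varvec{0}
\]
and
\[
\frac{n}{2} \Sigma + \sum_{i=1}^n \frac{\sum_{\varvec{j} \in {\mathbb{Z}}^p} h^\prime(d_{i\varvec{j}})\, (\varvec{y}_i + 2\pi \varvec{j} - \varvec{\mu})(\varvec{y}_i + 2\pi \varvec{j} - \varvec{\mu})^\top}{\sum_{\varvec{j} \in {\mathbb{Z}}^p} h(d_{i\varvec{j}})} = 0,
\]
where \(h^\prime (d) = \partial h(d)/\partial d\). Let
\[
v_{i\varvec{j}} = \frac{-2\, h^\prime(d_{i\varvec{j}})}{\sum_{\varvec{k} \in {\mathbb{Z}}^p} h(d_{i\varvec{k}})},
\]
then the MLE \((\hat{\varvec{\mu }}, {{\hat{\Sigma }}})\) is the solution to the (set of) fixed point equations
\[
\hat{\varvec{\mu}} = \frac{\sum_{i=1}^n \sum_{\varvec{j}} v_{i\varvec{j}}\, (\varvec{y}_i + 2\pi \varvec{j})}{\sum_{i=1}^n \sum_{\varvec{j}} v_{i\varvec{j}}}, \qquad
{\hat{\Sigma}} = \frac{1}{n} \sum_{i=1}^n \sum_{\varvec{j}} v_{i\varvec{j}}\, (\varvec{y}_i + 2\pi \varvec{j} - \hat{\varvec{\mu}})(\varvec{y}_i + 2\pi \varvec{j} - \hat{\varvec{\mu}})^\top,
\]
with the \(v_{i\varvec{j}}\) evaluated at \((\hat{\varvec{\mu }}, {{\hat{\Sigma }}})\). The WN distribution corresponds to \(h(t) = \exp \left( -\frac{t}{2} \right)\). Since \(h^\prime (t) = -\frac{1}{2} h(t)\), then
\[
v_{i\varvec{j}} = \frac{\exp\left(-d_{i\varvec{j}}/2\right)}{\sum_{\varvec{k} \in {\mathbb{Z}}^p} \exp\left(-d_{i\varvec{k}}/2\right)}
\]
and the estimating equations simplify to
\[
\hat{\varvec{\mu}} = \frac{1}{n} \sum_{i=1}^n \sum_{\varvec{j}} v_{i\varvec{j}}\, (\varvec{y}_i + 2\pi \varvec{j}), \qquad
{\hat{\Sigma}} = \frac{1}{n} \sum_{i=1}^n \sum_{\varvec{j}} v_{i\varvec{j}}\, (\varvec{y}_i + 2\pi \varvec{j} - \hat{\varvec{\mu}})(\varvec{y}_i + 2\pi \varvec{j} - \hat{\varvec{\mu}})^\top,
\]
with \(\sum_{\varvec{j}} v_{i\varvec{j}} = 1\) for every \(i\).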
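For the WN case, the fixed point equations can be iterated numerically once the infinite sum over \({\mathbb {Z}}^p\) is truncated. The Python sketch below is a minimal illustration under that assumption, not the authors' implementation. The truncation level `J`, the function names, the starting values, and the simulated data are all hypothetical choices. The normalized weights \(\exp(-d_{i\varvec{j}}/2)/\sum_{\varvec{k}} \exp(-d_{i\varvec{k}}/2)\) are recomputed at the current parameter values in each pass.

```python
import itertools
import numpy as np

def wrap_lattice(p, J=2):
    """All integer vectors j in {-J, ..., J}^p (an assumed truncation of Z^p)."""
    return np.array(list(itertools.product(range(-J, J + 1), repeat=p)), dtype=float)

def wn_weights(Y, mu, Sigma, lattice):
    """Normalized weights exp(-d_ij/2) / sum_k exp(-d_ik/2), shape (n, K)."""
    X = Y[:, None, :] + 2 * np.pi * lattice[None, :, :]    # candidates y_i + 2*pi*j, (n, K, p)
    R = X - mu
    d = np.einsum("nkp,pq,nkq->nk", R, np.linalg.inv(Sigma), R)
    w = np.exp(-0.5 * (d - d.min(axis=1, keepdims=True)))  # subtract row minimum for stability
    return w / w.sum(axis=1, keepdims=True), X

def wn_fixed_point(Y, mu, Sigma, J=2, max_iter=200, tol=1e-8):
    """Iterate the WN fixed point equations from the starting values (mu, Sigma)."""
    n, p = Y.shape
    lattice = wrap_lattice(p, J)
    for _ in range(max_iter):
        w, X = wn_weights(Y, mu, Sigma, lattice)
        mu_new = np.einsum("nk,nkp->p", w, X) / n
        R = X - mu_new
        Sigma_new = np.einsum("nk,nkp,nkq->pq", w, R, R) / n
        change = max(np.abs(mu_new - mu).max(), np.abs(Sigma_new - Sigma).max())
        mu, Sigma = mu_new, Sigma_new
        if change < tol:
            break
    return mu, Sigma

# Simulated bivariate example (hypothetical parameter values).
rng = np.random.default_rng(1)
mu_true = np.array([1.0, 2.0])
Sigma_true = np.array([[0.2, 0.05], [0.05, 0.1]])
Y = np.mod(rng.multivariate_normal(mu_true, Sigma_true, size=500), 2 * np.pi)
mu_hat, Sigma_hat = wn_fixed_point(Y, mu=np.array([1.5, 1.5]), Sigma=np.eye(2))
print(mu_hat)
print(Sigma_hat)
```

The returned location is one representative of \(\hat{\varvec{\mu }}\) modulo \(2\pi\). A small `J` is adequate only when the distribution is reasonably concentrated, since the terms dropped by the truncation are then negligible.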
Appendix B: EM algorithm for WN estimation
Given an i.i.d. sample \((\varvec{y}_1, \ldots , \varvec{y}_n)\) from a WN distribution, in the EM algorithm the wrapping coefficients \(\varvec{j}\) are considered as latent variables and the observed torus data \(\varvec{y}_i\) as being incomplete, that is, \(\varvec{y}_i\) is assumed to be one component of the pair \((\varvec{y}_i,\varvec{\omega }_i)\), where \(\varvec{\omega }_i=(\omega _{i\varvec{j}}: \varvec{j} \in {\mathbb {Z}}^p)\) is the associated latent label vector of wrapping coefficients. Then, the MLE for \(\varvec{\theta } = (\varvec{\mu }, \Sigma )\) is the result of the EM algorithm based on the complete log-likelihood function
\[
\ell_c(\varvec{\theta}) = \sum_{i=1}^n \sum_{\varvec{j} \in {\mathbb{Z}}^p} \omega_{i\varvec{j}} \log m(\varvec{y}_i + 2\pi \varvec{j};\, \varvec{\theta}). \tag{25}
\]
In the expectation step (E-step), we evaluate the conditional expectation of (25) given the observed data and the current parameter value \(\varvec{\theta }\) by computing the conditional probability that \(\varvec{y}_i\) has \(\varvec{j}\) as wrapping coefficients vector, that is
\[
v_{i\varvec{j}} = \frac{m(\varvec{y}_i + 2\pi \varvec{j};\, \varvec{\theta})}{\sum_{\varvec{k} \in {\mathbb{Z}}^p} m(\varvec{y}_i + 2\pi \varvec{k};\, \varvec{\theta})}.
\]
Parameter estimation is carried out in the maximization step (M-step) by solving the set of (complete) likelihood equations
\[
\sum_{i=1}^n \sum_{\varvec{j} \in {\mathbb{Z}}^p} v_{i\varvec{j}}\, u(\varvec{y}_i + 2\pi \varvec{j};\, \varvec{\theta}) = \varvec{0},
\]
with \(u(\varvec{y}_i + 2 \pi \varvec{j};\, \varvec{\theta })=\nabla _{\varvec{\theta }} \log m(\varvec{y}_i + 2 \pi \varvec{j};\, \varvec{\theta })\). An alternative estimation strategy can be based on a CEM algorithm, which leads to an approximate solution. At each iteration, a Classification step (C-step) is performed after the E-step to provide crisp assignments. Let
\[
\hat{\varvec{j}}_i = \mathop{\mathrm{arg\,max}}_{\varvec{j} \in {\mathbb{Z}}^p} v_{i\varvec{j}},
\]
then set \(\omega _{i\varvec{j}} = 1\) when \(\varvec{j}=\hat{\varvec{j}}_i\), and \(\omega _{i\varvec{j}} = 0\) otherwise. As a result, the torus data \(\varvec{y}_i\) are unwrapped to (fitted) linear data \(\hat{\varvec{x}}_i = \varvec{y}_i + 2\pi \hat{\varvec{j}}_i\). It is easy to see that the M-step simplifies to
\[
\hat{\varvec{\mu}} = \frac{1}{n} \sum_{i=1}^n \hat{\varvec{x}}_i, \qquad
{\hat{\Sigma}} = \frac{1}{n} \sum_{i=1}^n (\hat{\varvec{x}}_i - \hat{\varvec{\mu}})(\hat{\varvec{x}}_i - \hat{\varvec{\mu}})^\top.
\]
Both procedures are iterated until some convergence criterion is fulfilled, which could be based on the changes in the likelihood or in the fitted parameter values (Nodehi et al. 2021).
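The CEM variant described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code. The lattice \({\mathbb {Z}}^p\) is truncated to \(\{-J,\ldots ,J\}^p\), the function name `cem_wn`, the starting values, and the simulated data are hypothetical, and convergence is judged by the change in the fitted parameters. For the WN model all wrapping candidates share the same \(\vert \Sigma \vert\), so the most probable wrapping coefficient is the one with the smallest Mahalanobis distance, which is what the C-step uses.

```python
import itertools
import numpy as np

def cem_wn(Y, mu, Sigma, J=2, max_iter=100, tol=1e-8):
    """Classification EM sketch for a wrapped normal on the p-torus (truncated lattice)."""
    n, p = Y.shape
    lattice = np.array(list(itertools.product(range(-J, J + 1), repeat=p)), dtype=float)
    X_all = Y[:, None, :] + 2 * np.pi * lattice[None, :, :]   # candidates y_i + 2*pi*j
    for _ in range(max_iter):
        # E-step and C-step: the conditional probability is largest where the
        # Mahalanobis distance d_ij is smallest, giving a crisp assignment.
        R = X_all - mu
        d = np.einsum("nkp,pq,nkq->nk", R, np.linalg.inv(Sigma), R)
        j_hat = d.argmin(axis=1)
        X_hat = X_all[np.arange(n), j_hat]                    # unwrapped (fitted) linear data
        # M-step: ordinary Gaussian MLE on the unwrapped data.
        mu_new = X_hat.mean(axis=0)
        C = X_hat - mu_new
        Sigma_new = C.T @ C / n
        change = max(np.abs(mu_new - mu).max(), np.abs(Sigma_new - Sigma).max())
        mu, Sigma = mu_new, Sigma_new
        if change < tol:
            break
    return mu, Sigma, lattice[j_hat].astype(int)

# Simulated bivariate example (hypothetical parameter values).
rng = np.random.default_rng(2)
mu_true = np.array([1.0, 2.0])
Sigma_true = np.array([[0.2, 0.05], [0.05, 0.1]])
Y = np.mod(rng.multivariate_normal(mu_true, Sigma_true, size=500), 2 * np.pi)
mu_cem, Sigma_cem, j_cem = cem_wn(Y, mu=np.array([1.5, 1.5]), Sigma=np.eye(2))
print(mu_cem)
print(Sigma_cem)
```

Because each observation is assigned to a single wrapping coefficient, the iteration stops changing as soon as the assignments stabilize, which usually takes only a few passes on concentrated data.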
About this article
Cite this article
Agostinelli, C., Greco, L. & Saraceno, G. Weighted likelihood methods for robust fitting of wrapped models for p-torus data. AStA Adv Stat Anal (2024). https://doi.org/10.1007/s10182-024-00494-2
DOI: https://doi.org/10.1007/s10182-024-00494-2