Abstract
In recent years, there has been growing interest in statistical methods for the analysis of spatially referenced data. Modeling the spatial dependence structure is indispensable for estimating the parameters that define this structure. In this paper, we use the family of elliptical distributions to estimate the spatial dependence in spatially referenced data, thereby extending the Gaussian spatial linear model. We also use the local influence methodology to assess the sensitivity of the maximum likelihood estimators to small perturbations in the data and/or in the assumptions of the spatial linear model. The methodology is illustrated with a real data set. The results show that atypical values in the sample data have a strong influence and change the spatial dependence structure. A small simulation study is also included.
References
Atkinson AC, Riani M, Cerioli A (2004) Exploring multivariate data with the forward search. Springer, New York
Barnett V (2004) Environmental statistics: methods and applications. Wiley, Chichester
Borssoi JA, De Bastiani F, Uribe-Opazo MA, Galea M (2011) Local influence of explanatory variables in Gaussian spatial linear models. Chil J Stat 2:29–38
Cambanis S, Huang S, Simons G (1981) On the theory of elliptically contoured distributions. J Multivar Anal 11:368–385
Cerioli A, Riani M (1999) The ordering of spatial data and the detection of multiple outliers. J Comput Graph Stat 8:239–258
Christensen R, Johnson W, Pearson L (1992) Prediction diagnostics for spatial linear models. Biometrika 79:583–591
Christensen R, Johnson W, Pearson L (1993) Covariance function diagnostics for spatial linear models. Math Geol 25:145–160
Cook RD (1986) Assessment of local influence. J R Stat Soc Ser B 48:133–169
Cressie N (1993) Statistics for spatial data. Wiley, New York
Diamond P, Armstrong M (1984) Robustness of variograms and conditioning of kriging matrices. Math Geol 16:809–822
Diggle PJ, Ribeiro PJ Jr (2007) Model-based geostatistics. Springer, Berlin
Fang KT, Anderson TW (eds) (1990) Statistical inference in elliptically contoured and related distributions. Allerton Press, New York
Fang KT, Zhang YT (1990) Generalized multivariate analysis. Springer/Science Press, Berlin
Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate and related distributions. Chapman & Hall, London
Filzmoser P, Ruiz-Gazen A, Thomas-Agnan Ch (2014) Identification of local multivariate outliers. Stat Pap 55:29–47
Galea M, Paula GA, Bolfarine H (1997) Local influence in elliptical linear regression models. Statistician 46:71–79
Galea M, Paula GA, Uribe-Opazo MA (2003) On influence diagnostics in univariate elliptical linear regression models. Stat Pap 44:23–45
Galea M, Paula GA, Cysneiros FJ (2005) On diagnostic in symmetrical nonlinear models. Stat Probab Lett 73:459–467
Gradshteyn I, Ryzhik I (2000) Tables of integrals, series and products. Academic Press, New York
Gupta AK, Varga T (1993) Elliptically contoured models in statistics. Kluwer Academic Publishers, Massachusetts
Hoaglin D, Welsch R (1978) The hat matrix in regression and ANOVA. Am Stat 32:17–22
Ibacache-Pulgar G, Paula GA (2011) Local influence for Student-t partially linear models. Comput Stat Data Anal 55:1462–1478
Isaaks E, Srivastava R (1989) An introduction to applied geostatistics. Oxford University Press, New York
Jameson A (1968) Solution of the equation \(AX + XB = C\) by inversion of an \(M\times M\) or \(N\times N\) matrix. SIAM J Appl Math 16:1020–1023
Jones R (1989) Fitting a stochastic partial differential equation to aquifer data. Stoch Hydrol Hydraul 3:85–96
Kano Y, Berkane M, Bentler P (1993) Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations. J Am Stat Assoc 88:135–143
Kelker D (1970) Distribution theory of spherical distributions and a location-scale parameter generalization. Sankhya A 32:419–430
Lange KL, Little RJ, Taylor JM (1989) Robust statistical modeling using the t distribution. J Am Stat Assoc 84:881–896
Lange KL, Sinsheimer JS (1993) Normal/independent distributions and their applications in robust regression. J Comput Graph Stat 2:175–198
Lesaffre E, Verbeke G (1998) Local influence in linear mixed models. Biometrics 54:570–582
Liu S (2000) On local influence for elliptical linear models. Stat Pap 41:211–224
Mardia K, Marshall R (1984) Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 71:135–146
Martin R (1992) Leverage, influence and residuals in regression models when observations are correlated. Commun Stat Theor M 21:1183–1212
Militino AF, Palacios MB, Ugarte MD (2006) Outliers detection in multivariate spatial linear models. J Stat Plan Inference 136:125–146
Mitchell AF (1989) The information matrix, skewness tensor and \(\alpha \)-connections for the general multivariate elliptic distribution. Ann I Stat Math 41:289–304
Nel DG (1980) On matrix differentiation in statistics. S Afr Stat J 14:137–193
Osorio F, Paula GA, Galea M (2007) Assessment of local influence in elliptical linear models with longitudinal structure. Comput Stat Data Anal 51:4354–4368
Paula GA (1993) Assessing local influence in restricted regression models. Comput Stat Data Anal 16:63–79
Poon W, Poon YS (1999) Conformal normal curvature and assessment of local influence. J R Stat Soc Ser B 61:51–61
Ross W (1987) The geometry of case deletion and the assessment of influence in nonlinear regression. Can J Stat 15:91–103
Schabenberger O, Gotway C (2005) Statistical methods for spatial data analysis. Chapman & Hall, London
St. Laurent RT, Cook RD (1992) Leverage and superleverage in nonlinear regression. J Am Stat Assoc 87:985–990
Uribe-Opazo MA, Borssoi JA, Galea M (2012) Influence diagnostics in Gaussian spatial linear models. J Appl Stat 39:615–630
Waller L, Gotway C (2004) Applied spatial statistics for public health data. Wiley, New Jersey
Warnes JA (1986) Sensitivity analysis for universal kriging. Math Geol 18:653–676
Webster R, Oliver M (2007) Geostatistics for environmental scientists, 2nd edn. Wiley, New York
Wei B, Hu Y, Fung W (1998) Generalized leverage and its applications. Scand J Stat 25:25–37
Zellner A (1976) Bayesian and non-Bayesian analysis of the regression model with multivariate Student-t error terms. J Am Stat Assoc 71:400–405
Zhu HT, Lee SY (2001) Local influence for incomplete-data models. J R Stat Soc Ser B 63:111–126
Zhu HT, Ibrahim JG, Lee S, Zhang H (2007) Perturbation selection and influence measures in local influence analysis. Ann Stat 35:2565–2588
Acknowledgments
We would like to thank the Associate Editor and two referees for their helpful comments and suggestions, leading to improvement of the paper. Also, we acknowledge the partial financial support from Fundação Araucária of Paraná State, Capes, CNPq and FACEPE, Brazil, and Project FONDECYT 1110318, Chile.
Appendices
Appendix A: The observed information matrix for elliptical spatial linear models
The log-likelihood function is given by
\[{\mathcal L}({\varvec{\theta }})=-\frac{1}{2}\log |{\varvec{\varSigma }}|+\log g(\delta ),\]
where \(\delta =(\mathbf {Z}- \mathbf {X}{\varvec{\beta }})^{\top }{\varvec{\varSigma }}^{-1}(\mathbf {Z}- \mathbf {X}{\varvec{\beta }})\).
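The matrix of second derivatives is given by
\[\frac{\partial ^2 {\mathcal L}({\varvec{\theta }})}{\partial {\varvec{\theta }}\partial {\varvec{\theta }}^{\top }}=\begin{pmatrix} L_{\beta \beta } &amp; L_{\beta \phi }\\ L_{\phi \beta } &amp; L_{\phi \phi }\end{pmatrix},\]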
where \(L_{\beta \beta }=\frac{\partial ^2 {\mathcal L}({\varvec{\theta }})}{\partial {\varvec{\beta }}\partial {\varvec{\beta }}^{\top }}= 2\mathbf {X}^{\top }{\varvec{\varSigma }}^{-1}\{W_{g}(\delta ){\varvec{\varSigma }}+ 2 W^{'}_{g}(\delta ){\varvec{\epsilon }}{\varvec{\epsilon }}^{\top }\}{\varvec{\varSigma }}^{-1}\mathbf {X}\), \(L_{\beta \phi }=\frac{\partial ^2 {\mathcal L}({\varvec{\theta }})}{\partial {\varvec{\beta }}\partial {\varvec{\phi }}^{\top }}\), with \(\frac{\partial ^2 {\mathcal L}({\varvec{\theta }})}{\partial {\varvec{\beta }}\partial \phi _{j}}=2\mathbf {X}^{\top }{\varvec{\varSigma }}^{-1}\Big \{W^{'}_{g}(\delta ) {\varvec{\epsilon }}{\varvec{\epsilon }}^{\top }{\varvec{\varSigma }}^{-1}\frac{\partial {\varvec{\varSigma }}}{\partial \phi _j} + W_{g}(\delta )\frac{\partial {\varvec{\varSigma }}}{\partial \phi _j}\Big \}{\varvec{\varSigma }}^{-1}{\varvec{\epsilon }}\) as its \(j\)th column, for \(j=1, 2, 3\); \(L_{\phi \beta }=L_{\beta \phi }^{\top }\) and \(L_{\phi \phi }=\frac{\partial ^2 {\mathcal L}({\varvec{\theta }})}{\partial {\varvec{\phi }}\partial {\varvec{\phi }}^{\top }}\), with elements
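\[\frac{\partial ^2 {\mathcal L}({\varvec{\theta }})}{\partial \phi _i\partial \phi _j}=\frac{1}{2}\mathop {\mathrm{tr}}\nolimits \Big \{{\varvec{\varSigma }}^{-1}\frac{\partial {\varvec{\varSigma }}}{\partial \phi _i}{\varvec{\varSigma }}^{-1}\frac{\partial {\varvec{\varSigma }}}{\partial \phi _j}-{\varvec{\varSigma }}^{-1}\frac{\partial ^2{\varvec{\varSigma }}}{\partial \phi _i\partial \phi _j}\Big \}+W^{'}_{g}(\delta )\Big ({\varvec{\epsilon }}^{\top }{\varvec{\varSigma }}^{-1}\frac{\partial {\varvec{\varSigma }}}{\partial \phi _i}{\varvec{\varSigma }}^{-1}{\varvec{\epsilon }}\Big )\Big ({\varvec{\epsilon }}^{\top }{\varvec{\varSigma }}^{-1}\frac{\partial {\varvec{\varSigma }}}{\partial \phi _j}{\varvec{\varSigma }}^{-1}{\varvec{\epsilon }}\Big )+W_{g}(\delta ){\varvec{\epsilon }}^{\top }{\varvec{\varSigma }}^{-1}\Big \{\frac{\partial {\varvec{\varSigma }}}{\partial \phi _i}{\varvec{\varSigma }}^{-1}\frac{\partial {\varvec{\varSigma }}}{\partial \phi _j}+\frac{\partial {\varvec{\varSigma }}}{\partial \phi _j}{\varvec{\varSigma }}^{-1}\frac{\partial {\varvec{\varSigma }}}{\partial \phi _i}-\frac{\partial ^2{\varvec{\varSigma }}}{\partial \phi _i\partial \phi _j}\Big \}{\varvec{\varSigma }}^{-1}{\varvec{\epsilon }},\]
for \(i, j=1, 2, 3\). The first- and second-order derivatives of the scale matrix \({\varvec{\varSigma }}\) in (3) with respect to \(\phi _{j}\), \(j=1, 2, 3\), are presented for some covariance functions in Uribe-Opazo et al. (2012).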
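The closed form of \(L_{\beta \beta }\) can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the log-likelihood has the elliptical form \(-\frac{1}{2}\log |{\varvec{\varSigma }}|+\log g(\delta )\) with \(W_{g}=g^{'}/g\), picks the Student-t generator (so that \(W_{g}(\delta )=-(\nu +n)/\{2(\nu +\delta )\}\)), and uses a hypothetical nugget-plus-exponential scale matrix and simulated data. All function names are illustrative.

```python
import numpy as np

def student_t_weights(delta, n, nu):
    """W_g and W_g' for the Student-t generator g(d) ~ (1 + d/nu)^(-(nu+n)/2)."""
    w = -(nu + n) / (2.0 * (nu + delta))
    w_prime = (nu + n) / (2.0 * (nu + delta) ** 2)
    return w, w_prime

def loglik(beta, Z, X, Sigma, nu):
    """Assumed log-likelihood, constant dropped: -0.5*log|Sigma| + log g(delta)."""
    n = Z.size
    eps = Z - X @ beta
    delta = eps @ np.linalg.solve(Sigma, eps)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * logdet - 0.5 * (nu + n) * np.log1p(delta / nu)

def L_beta_beta(beta, Z, X, Sigma, nu):
    """Analytic block 2 X' S^-1 {W_g S + 2 W_g' eps eps'} S^-1 X."""
    n = Z.size
    eps = Z - X @ beta
    Sinv = np.linalg.inv(Sigma)
    delta = eps @ Sinv @ eps
    w, wp = student_t_weights(delta, n, nu)
    inner = w * Sigma + 2.0 * wp * np.outer(eps, eps)
    return 2.0 * X.T @ Sinv @ inner @ Sinv @ X

def numerical_hessian(f, x, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at x."""
    p = x.size
    H = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            ei = np.zeros(p); ei[i] = h
            ej = np.zeros(p); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

# Hypothetical setup: random sites, nugget + exponential scale matrix.
rng = np.random.default_rng(0)
n, nu = 12, 5.0
coords = rng.uniform(0.0, 10.0, size=(n, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
Sigma = 0.5 * np.eye(n) + 2.0 * np.exp(-dist / 3.0)
X = np.column_stack([np.ones(n), coords[:, 0]])
beta = np.array([1.0, 0.3])
Z = X @ beta + np.linalg.cholesky(Sigma) @ rng.standard_normal(n)

analytic = L_beta_beta(beta, Z, X, Sigma, nu)
numeric = numerical_hessian(lambda b: loglik(b, Z, X, Sigma, nu), beta)
```

Agreement between `analytic` and `numeric` only confirms the formula for this particular generator and data; other members of the elliptical family would need their own \(W_{g}\).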
Appendix B: \({\varvec{\varDelta }}\) matrix for perturbation scheme of the mean
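In this case we have that \({\mathcal L}({\varvec{\theta }}, {{\varvec{\omega }}})\) is given by
\[{\mathcal L}({\varvec{\theta }}, {{\varvec{\omega }}})=-\frac{1}{2}\log |{\varvec{\varSigma }}|+\log g(\delta _{\omega }),\qquad (11)\]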
where \(\delta _{\omega }=\{\mathbf {Z}- {\varvec{\mu }}({{\varvec{\omega }}})\}^{\top }{\varvec{\varSigma }}^{-1}\{\mathbf {Z}- {\varvec{\mu }}({{\varvec{\omega }}})\}={\varvec{\epsilon }}_{\omega }^{\top }{\varvec{\varSigma }}^{-1}{\varvec{\epsilon }}_{\omega }\), \({\varvec{\epsilon }}_{\omega }=\mathbf {Z}-{\varvec{\mu }}({{\varvec{\omega }}})\) and \({\varvec{\mu }}({{\varvec{\omega }}})=\mathbf {X}{\varvec{\beta }}+\mathbf {A}{{\varvec{\omega }}}\). Then
Differentiating (12) with respect to \({\varvec{\beta }}\), see Nel (1980),
The derivative with respect to \(\phi _{j}\) is given by,
for \(j=1, 2, 3\). Evaluating (13) and (14) at \({{\varvec{\omega }}}={{\varvec{\omega }}}_{0}\), we obtain the \({\varvec{\varDelta }}=({\varvec{\varDelta }}^{\top }_{\beta }, {\varvec{\varDelta }}^{\top }_{\phi })^{\top }\) matrix.
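As a rough illustration of the \({\varvec{\beta }}\) block of \({\varvec{\varDelta }}\), the sketch below differentiates a perturbed log-likelihood numerically and compares it with the expression \(2\mathbf {X}^{\top }{\varvec{\varSigma }}^{-1}\{W_{g}(\delta ){\varvec{\varSigma }}+2W^{'}_{g}(\delta ){\varvec{\epsilon }}{\varvec{\epsilon }}^{\top }\}{\varvec{\varSigma }}^{-1}\mathbf {A}\). That expression is our own derivation from \({\varvec{\mu }}({{\varvec{\omega }}})=\mathbf {X}{\varvec{\beta }}+\mathbf {A}{{\varvec{\omega }}}\) under the assumed form \(-\frac{1}{2}\log |{\varvec{\varSigma }}|+\log g(\delta _{\omega })\), and is not necessarily the layout used in the paper. The Student-t generator, the choice \({{\varvec{\omega }}}_{0}=\mathbf {0}\), \(\mathbf {A}={\varvec{\varSigma }}^{1/2}\), the covariance model and the data are all assumptions.

```python
import numpy as np

def sym_sqrt(Sigma):
    """Symmetric square root via the eigendecomposition."""
    alpha, V = np.linalg.eigh(Sigma)
    return (V * np.sqrt(alpha)) @ V.T

def perturbed_loglik(beta, omega, Z, X, Sigma, A, nu):
    """Assumed perturbed log-likelihood with mean X beta + A omega (constant dropped)."""
    n = Z.size
    eps = Z - X @ beta - A @ omega
    delta = eps @ np.linalg.solve(Sigma, eps)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * logdet - 0.5 * (nu + n) * np.log1p(delta / nu)

def delta_beta_analytic(beta, Z, X, Sigma, A, nu):
    """2 X' S^-1 {W_g S + 2 W_g' eps eps'} S^-1 A at omega_0 = 0 (Student-t weights)."""
    n = Z.size
    eps = Z - X @ beta
    Sinv = np.linalg.inv(Sigma)
    delta = eps @ Sinv @ eps
    w = -(nu + n) / (2.0 * (nu + delta))
    wp = (nu + n) / (2.0 * (nu + delta) ** 2)
    inner = w * Sigma + 2.0 * wp * np.outer(eps, eps)
    return 2.0 * X.T @ Sinv @ inner @ Sinv @ A

def delta_beta_numeric(beta, Z, X, Sigma, A, nu, h=1e-4):
    """Central finite differences for d^2 L / d beta d omega' at omega_0 = 0."""
    p, n = beta.size, Z.size
    f = lambda b, w: perturbed_loglik(b, w, Z, X, Sigma, A, nu)
    out = np.zeros((p, n))
    for i in range(p):
        db = np.zeros(p); db[i] = h
        for k in range(n):
            dw = np.zeros(n); dw[k] = h
            out[i, k] = (f(beta + db, dw) - f(beta + db, -dw)
                         - f(beta - db, dw) + f(beta - db, -dw)) / (4.0 * h * h)
    return out

# Hypothetical setup: random sites, nugget + exponential scale matrix.
rng = np.random.default_rng(2)
n, nu = 10, 5.0
coords = rng.uniform(0.0, 10.0, size=(n, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
Sigma = 0.5 * np.eye(n) + 2.0 * np.exp(-dist / 3.0)
A = sym_sqrt(Sigma)
X = np.column_stack([np.ones(n), coords[:, 0]])
beta = np.array([1.0, 0.3])
Z = X @ beta + np.linalg.cholesky(Sigma) @ rng.standard_normal(n)

Delta_beta = delta_beta_analytic(beta, Z, X, Sigma, A, nu)
Delta_beta_fd = delta_beta_numeric(beta, Z, X, Sigma, A, nu)
```

The \({\varvec{\phi }}\) block is not covered here, because with \(\mathbf {A}={\varvec{\varSigma }}^{1/2}\) it also involves the derivatives of \(\mathbf {A}\) with respect to \(\phi _{j}\).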
Appendix C: The Fisher information matrix \(\mathbf {G}({{\varvec{\omega }}})\)
To select an adequate matrix \(\mathbf {A}\) we can use the methodology proposed by Zhu et al. (2007). In effect, the score function for \({{\varvec{\omega }}}\) in the perturbed log-likelihood function (11) is given by
Following Zhu et al. (2007), let \(\mathbf {G}({{\varvec{\omega }}})\) be the Fisher information matrix with respect to the perturbation vector \({{\varvec{\omega }}}\), that is, \(\mathbf {G}({{\varvec{\omega }}})=E_{\omega }\{U({{\varvec{\omega }}})U^{\top }({{\varvec{\omega }}})\}\), where \(E_{\omega }\) denotes the expectation with respect to \(f(\mathbf {z}, {\varvec{\theta }},{{\varvec{\omega }}})\). A perturbation \({{\varvec{\omega }}}\) is appropriate if it satisfies \(\mathbf {G}({{\varvec{\omega }}}_{0})=c\mathbf {I}_n\), where \(c>0\). In our case we have
That is, \(\mathbf {G}({{\varvec{\omega }}}_{0})=c\mathbf {A}^{\top }{\varvec{\varSigma }}^{-1}\mathbf {A}\) with \(c=c({{\varvec{\omega }}}_{0})\) a positive constant, see Appendix . Notice that usually \(\mathbf {A}^{\top }{\varvec{\varSigma }}^{-1}\mathbf {A}\ne \mathbf {I}_{n}\). However, if \(\mathbf {A}={\varvec{\varSigma }}^{1/2}\), then \(\mathbf {G}({{\varvec{\omega }}}_{0})=c\mathbf {I}_{n}\), and so \({\varvec{\mu }}({{\varvec{\omega }}})=\mathbf {X}{\varvec{\beta }}+{\varvec{\varSigma }}^{1/2}{{\varvec{\omega }}}\) is an appropriate perturbation scheme. The derivatives \(\partial {\varvec{\varSigma }}^{1/2}/\partial \phi _{j}\), for \(j=1, 2, 3\), are given in Appendix D.
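The effect of the choice of \(\mathbf {A}\) can be seen in a small numerical sketch. It evaluates \(\mathbf {A}^{\top }{\varvec{\varSigma }}^{-1}\mathbf {A}\), the matrix to which \(\mathbf {G}({{\varvec{\omega }}}_{0})\) is proportional, for \(\mathbf {A}=\mathbf {I}_{n}\) and for \(\mathbf {A}={\varvec{\varSigma }}^{1/2}\). The scale matrix (nugget plus exponential) and the function names are assumptions made only for illustration.

```python
import numpy as np

def sym_sqrt(Sigma):
    """Symmetric nonnegative definite square root of Sigma."""
    alpha, V = np.linalg.eigh(Sigma)
    return (V * np.sqrt(alpha)) @ V.T

def information_shape(A, Sigma):
    """A' Sigma^{-1} A; G(omega_0) equals this matrix up to the constant c."""
    return A.T @ np.linalg.solve(Sigma, A)

# Hypothetical scale matrix on random sites.
rng = np.random.default_rng(3)
n = 6
coords = rng.uniform(0.0, 10.0, size=(n, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
Sigma = 0.5 * np.eye(n) + 2.0 * np.exp(-dist / 3.0)

G_identity = information_shape(np.eye(n), Sigma)      # equals Sigma^{-1}
G_sqrt = information_shape(sym_sqrt(Sigma), Sigma)    # equals I_n up to rounding
```

With \(\mathbf {A}=\mathbf {I}_{n}\) the result is \({\varvec{\varSigma }}^{-1}\), which is not proportional to the identity unless the observations are uncorrelated with equal scale.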
Appendix D: Derivative of the square root \({\varvec{\varSigma }}^{1/2}\)
For any \(n \times n\) symmetric nonnegative definite matrix \({\varvec{\varSigma }}\), there is a symmetric nonnegative definite matrix \({\varvec{\varSigma }}^{1/2}=\mathbf {W}\) such that \({\varvec{\varSigma }}={\varvec{\varSigma }}^{1/2}{\varvec{\varSigma }}^{1/2}={\mathbf {W}}^2\). Furthermore, \(\mathbf {W}\) is unique and can be expressed as
\[\mathbf {W}=\mathbf {P}^{\top }\mathbf {A}^{1/2}\mathbf {P},\]
where \(\mathbf {A}^{1/2}=\mathop {\mathrm{diag}}\nolimits (\sqrt{\alpha _1},\ldots ,\sqrt{\alpha _n})\), with \(\alpha _1,\ldots ,\alpha _n\) the eigenvalues of \({\varvec{\varSigma }}\), and \(\mathbf {P}\) is an \(n \times n\) orthogonal matrix \((\mathbf {P}\mathbf {P}^{\top } = \mathbf {I}_n)\) such that \(\mathbf {P}{\varvec{\varSigma }}\mathbf {P}^{\top }=\mathbf {A}\), with \(\mathbf {A}=\mathop {\mathrm{diag}}\nolimits (\alpha _1,\ldots ,\alpha _n)\). Differentiating \({\varvec{\varSigma }}={\mathbf {W}}^2\), the derivative of \({\varvec{\varSigma }}\) with respect to \(\phi _j\) is given by
\[\frac{\partial {\varvec{\varSigma }}}{\partial \phi _j}=\mathbf {W}\frac{\partial \mathbf {W}}{\partial \phi _j}+\frac{\partial \mathbf {W}}{\partial \phi _j}\mathbf {W}.\qquad (15)\]
This equation can be written as \(\dot{{\varvec{\varSigma }}}_j = \mathbf {W} \dot{\mathbf {W}}_j + \dot{\mathbf {W}}_j \mathbf {W}\), where \(\dot{{\varvec{\varSigma }}}_j=\frac{\partial {\varvec{\varSigma }}}{\partial \phi _j}\) and \(\frac{\partial \mathbf {W}}{\partial \phi _j}=\dot{\mathbf {W}}_j\), which has been extensively studied in the literature; see, for instance, Jameson (1968). Note that \(\dot{{\varvec{\varSigma }}}_j\), \(\mathbf {W}\) and \(\dot{\mathbf {W}}_j\) are symmetric matrices. Let \(\mathbf {J}_j=\mathbf {P}\dot{{\varvec{\varSigma }}}_j\mathbf {P}^{\top }\) and \(\mathbf {Q}=[(q_{rs})]\) be \(n\times n\) symmetric matrices, with \(q_{rs}=(\sqrt{\alpha _r}+\sqrt{\alpha _s})^{-1}\), for \(r,s=1,\ldots ,n\). Then, the solution to Eq. (15) is given by
\[\dot{\mathbf {W}}_j=\mathbf {P}^{\top }(\mathbf {Q}\odot \mathbf {J}_j)\mathbf {P},\]
for \(j=1, 2, 3\), where \(\odot \) denotes the Hadamard product.
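The Hadamard-product solution can be sketched numerically as follows. This is an illustration under assumptions, not the authors' implementation: the scale matrix \(\phi _1\mathbf {I}_n+\phi _2\exp (-d/\phi _3)\), the parameter values and the function names are hypothetical. In the code, the eigenvector matrix `V` returned by `numpy.linalg.eigh` satisfies \({\varvec{\varSigma }}=\mathbf {V}\mathop {\mathrm{diag}}\nolimits (\alpha )\mathbf {V}^{\top }\), so it plays the role of \(\mathbf {P}^{\top }\) under the convention \(\mathbf {P}{\varvec{\varSigma }}\mathbf {P}^{\top }=\mathbf {A}\).

```python
import numpy as np

def sqrt_and_derivative(Sigma, dSigma):
    """Return W = Sigma^{1/2} and dW solving dSigma = W dW + dW W."""
    alpha, V = np.linalg.eigh(Sigma)            # Sigma = V diag(alpha) V'
    root = np.sqrt(alpha)
    W = (V * root) @ V.T                        # V diag(sqrt(alpha)) V'
    Q = 1.0 / (root[:, None] + root[None, :])   # q_rs = 1/(sqrt(a_r) + sqrt(a_s))
    J = V.T @ dSigma @ V                        # dSigma expressed in the eigenbasis
    dW = V @ (Q * J) @ V.T                      # Hadamard product, then rotate back
    return W, dW

def covariance(phi, dist):
    """Hypothetical scale matrix: phi1 * I + phi2 * exp(-dist / phi3)."""
    phi1, phi2, phi3 = phi
    return phi1 * np.eye(dist.shape[0]) + phi2 * np.exp(-dist / phi3)

rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 10.0, size=(8, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
phi = np.array([0.5, 2.0, 3.0])

Sigma = covariance(phi, dist)
# Derivative of the assumed scale matrix with respect to phi3.
dSigma_3 = phi[1] * np.exp(-dist / phi[2]) * dist / phi[2] ** 2
W, dW_3 = sqrt_and_derivative(Sigma, dSigma_3)

# Finite-difference check of d Sigma^{1/2} / d phi3.
h = 1e-6
step = np.array([0.0, 0.0, h])
W_plus, _ = sqrt_and_derivative(covariance(phi + step, dist), dSigma_3)
W_minus, _ = sqrt_and_derivative(covariance(phi - step, dist), dSigma_3)
dW_3_fd = (W_plus - W_minus) / (2.0 * h)
```

The division defining `Q` requires \(\sqrt{\alpha _r}+\sqrt{\alpha _s}>0\), which holds here because the nugget keeps the assumed scale matrix positive definite.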
Appendix E: The likelihood function of the \({\varvec{t}}\) model is an increasing function of \(\nu \)
As noted by Zellner (1976), for the case of the usual linear regression model, “the necessary conditions on \({\varvec{\beta }}\), \({\varvec{\varSigma }}=\phi _{1}\mathbf {I}\) and \(\nu \) for a maximum of the likelihood function cannot be satisfied for \(\nu \ge 1\)”. In our case, the likelihood function is an increasing function of \(\nu \). For illustration, we consider the bivariate case, \({\varvec{t}}_{2}(0, \mathbf {I}, \nu )\), with density function given by
\[f(\mathbf {z})=\frac{1}{2\pi }\Big (1+\frac{\delta }{\nu }\Big )^{-(\nu +2)/2},\]
with \(\delta =\mathbf {z}^{\top }\mathbf {z}\). Clearly, from Fig. 5, the likelihood function is an increasing function of \(\nu \) and also of \(\delta \).
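A small numerical check of the behaviour in \(\nu \) can be sketched as follows, assuming the standard bivariate Student-t density \((2\pi )^{-1}(1+\delta /\nu )^{-(\nu +2)/2}\). The value \(\delta =1\) and the grid of \(\nu \) values are arbitrary choices, and the check covers only this value of \(\delta \).

```python
import numpy as np

def log_density_t2(delta, nu):
    """Log-density of t_2(0, I, nu) at squared radius delta = z'z (assumed form)."""
    return -np.log(2.0 * np.pi) - 0.5 * (nu + 2.0) * np.log1p(delta / nu)

nus = np.array([1.0, 2.0, 5.0, 10.0, 50.0, 200.0])
delta = 1.0
values = log_density_t2(delta, nus)

# Bivariate standard normal log-density at the same delta (limit as nu grows).
gaussian_limit = -np.log(2.0 * np.pi) - 0.5 * delta
```

For this \(\delta \), `values` increases along the grid and stays below `gaussian_limit`, so no finite \(\nu \) on the grid maximizes the log-density.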
Cite this article
De Bastiani, F., Mariz de Aquino Cysneiros, A.H., Uribe-Opazo, M.A. et al. Influence diagnostics in elliptical spatial linear models. TEST 24, 322–340 (2015). https://doi.org/10.1007/s11749-014-0409-z
DOI: https://doi.org/10.1007/s11749-014-0409-z