Birnbaum–Saunders spatial modelling and diagnostics applied to agricultural engineering data

  • Fabiana Garcia-Papani
  • Miguel Angel Uribe-Opazo
  • Victor Leiva
  • Robert G. Aykroyd
Original Paper

Abstract

Applications of statistical models to describe spatial dependence in geo-referenced data are widespread across many disciplines including the environmental sciences. Most of these applications assume that the data follow a Gaussian distribution. However, in many of them the normality assumption, and even a more general assumption of symmetry, are not appropriate. In non-spatial applications, where the data are uni-modal and positively skewed, the Birnbaum–Saunders (BS) distribution has excelled. This paper proposes a spatial log-linear model based on the BS distribution. Model parameters are estimated using the maximum likelihood method. Local influence diagnostics are derived to assess the sensitivity of the estimators to perturbations in the response variable. As illustration, the proposed model and its diagnostics are used to analyse a real-world agricultural data set, where the spatial variability of phosphorus concentration in the soil is considered—which is extremely important for agricultural management.

Keywords

Asymmetric distributions; Local influence; Matérn model; Maximum likelihood methods; Monte Carlo simulation; Non-normality; R software; Spatial data analysis

1 Introduction

Spatial statistical models take into account the dependence of a variable over space using geo-referenced data. These models are essential in many fields; see, for example, Krige (1951), Mardia and Marshall (1984), Waller and Gotway (2004) and Militino et al. (2006). Recent studies include Borssoi et al. (2011), Uribe-Opazo et al. (2012) and Grzegozewski et al. (2013). All these works assume a normal (or Gaussian) distribution in the modelling. However, such an assumption is not always appropriate; see Davis (1952) and Lange et al. (1989). One approach to deal with non-normality is to transform the data to achieve at least approximate normality. Nevertheless, data transformations can introduce problems into the modelling, such as the difficulty of interpreting the results of the analysis on the original scale; see Azzalini and Capitanio (1999). When the normality assumption is not valid, an alternative approach is to model the data with a non-normal distribution that suits them. For example, Assumpção et al. (2011, 2014) conducted a geo-statistical study using the Student-t distribution, which has heavier tails than the normal distribution. De Bastiani et al. (2015) studied spatial modelling and diagnostics based on the family of elliptic (symmetric) distributions, which has the Gaussian and Student-t cases as members. Even so, elliptic models are not appropriate if the data follow a skew distribution.

There has been little work in the literature that investigates the use of asymmetric distributions to analyse spatial data. However, in non-spatial situations, many distributions have been proposed to model phenomena that give rise to skew data, such as, the Birnbaum–Saunders (BS), exponential, gamma, log-normal and Weibull distributions; see Johnson et al. (1994, 1995). In particular, the BS distribution was proposed for modelling random variables describing processes of fatigue by Birnbaum and Saunders (1969). Applications in earth sciences of the BS distribution have been considered by, for example, Leiva et al. (2008, 2009, 2016a), Podlaski (2008), Vilca et al. (2010), Marchant et al. (2013) and Saulo et al. (2013). What makes this distribution attractive for the analysis of skew data are its properties and its relationship with the normal distribution; see Johnson et al. (1995, pp. 651–663). In contrast to its original application to fatigue processes, Leiva et al. (2015a) justified why the BS distribution is suitable for modelling earth and environmental data using theoretical arguments based on the law of proportionate effects. Rieck and Nedelman (1991) defined a relationship between the BS distribution and its logarithmic version, named the log-BS distribution. They used this relationship to propose a BS fixed effect model; whereas Villegas et al. (2011) considered a BS mixed effect model. A multivariate extension to the BS fixed effect model was studied by Marchant et al. (2016). An approach to BS spatial modelling was provided by Xia et al. (2011), who presented a methodology based on semi-Markov processes to produce a spatio-temporal model for the movement of tourists. The authors considered several distributions, including the BS model. To date, however, spatial models based on the BS distribution have not been studied.

The identification of cases that can produce substantial changes in the estimated parameters is an important step in any statistical investigation. The task of detecting possible atypical cases can be addressed by eliminating cases one-by-one from the data set and measuring the effects on estimated parameters—this is known as global influence; see Cook (1987). Another method for detecting cases that could potentially be influential was proposed by Cook (1987), which is known as local influence. This method studies the effect of small perturbations introduced into the models and/or the data on the maximum likelihood (ML) estimates. Different perturbation schemes are often considered to evaluate the sensitivity of ML estimates of the model parameters to such perturbations. The local influence method has at least two advantages over the global influence method: it has a lower computational cost, especially when the number of cases is large, and it allows us to detect groups of data exerting a joint influence. Zhu et al. (2007) proposed a methodology for choosing a perturbation scheme that is appropriate for the particular model to be considered. Gimenez and Galea (2013) applied the method proposed by Zhu et al. (2007) to heteroscedastic models with functional measurement errors. Galea et al. (2004) applied the local influence method in the BS fixed effect model, whereas Leiva et al. (2014, 2016b) and Liu et al. (2016) derived diagnostic tools in accelerated life models, in fixed effect models with stochastic restrictions and in the possibly heteroskedastic linear model with exact restrictions. In spatial modelling, diagnostic techniques have been discussed for Gaussian models by Militino et al. (2006) and Uribe-Opazo et al. (2012), for Student-t models by Assumpção et al. (2014), and for elliptic models by De Bastiani et al. (2015).

The main objective of this paper is to develop a spatial log-linear model based on the BS distribution and to derive its corresponding diagnostics. This distribution can be more appropriate than the Gaussian distribution in the analysis of spatial data with positive asymmetric behaviour. ML estimators of the model parameters and local influence diagnostic tools are derived for the BS spatial model. A computational framework in R code of the developed methodology is available from the authors on request. Specifically, Sect. 2 provides a background on uni/multi-variate BS and log-BS distributions, and on spatial modelling. Section 3 formulates the BS spatial log-linear model, estimates its parameters using the ML method and derives an appropriate perturbation scheme for the response variable using the methodology proposed by Zhu et al. (2007). Section 4 conducts two Monte Carlo (MC) simulation studies for evaluating the performance of the corresponding ML estimators and diagnostic tools. Section 5 illustrates the potential applications of the proposed model and its diagnostics with real-world data from agricultural engineering. Section 6 discusses some conclusions and possible future work. Detailed algebra is presented in appendices.

2 Background

2.1 The Birnbaum–Saunders distribution

If a random variable T follows a BS distribution with shape parameter \(\alpha\) and scale parameter \(\beta ,\) we use the notation \(T\sim \mathrm{BS}(\alpha ,\,\beta ).\) The distribution can be defined by its cumulative distribution function (CDF) given by
$$\begin{aligned} F_{T}(t;\,\alpha ,\,\beta )=\Phi \left( \frac{1}{\alpha }(\sqrt{{t}/{\beta }}-\sqrt{{\beta }/{t}})\right) , \quad t>0,\, \alpha >0,\, \beta >0, \end{aligned}$$
(1)
where \(\Phi (\cdot )\) is the CDF of the standard normal distribution. Then, the probability density function (PDF) of T obtained from (1) is expressed as
$$\begin{aligned} f_{T}(t;\,\alpha ,\,\beta ) = \frac{1}{2\alpha } \left( \sqrt{{1}/({\beta t})}+\sqrt{{\beta }/{t^{3}}}\right) \phi \left( \frac{1}{\alpha }(\sqrt{{t}/{\beta }}-\sqrt{{\beta }/{t}})\right), \quad t>0,\,\alpha >0,\, \beta >0, \end{aligned}$$
(2)
where \(\phi (\cdot )\) is the standard normal PDF. Thus, in turn the PDF in (2) can be re-written as
$$\begin{aligned} f_{T}(t;\,\alpha ,\,\beta ) = \frac{\exp (\alpha ^{-2})}{2\alpha \sqrt{2\pi \beta }} \exp \left( -\frac{1}{2\alpha ^{2}}\left( \frac{t}{\beta }+\frac{\beta }{t}\right) \right) {\frac{(t+\beta )}{t^{\frac{3}{2}}}}, \quad t>0,\quad \alpha >0,\quad \beta >0. \end{aligned}$$
(3)
Also, it may be said that a continuous random variable T has a BS distribution with parameters \(\alpha >0\) and \(\beta >0,\) if and only if \(Z= (1/\alpha )(\sqrt{{T}/{\beta }}-\sqrt{{\beta }/{T}})\sim \mathrm{N}(0,\,1).\) Some properties of the BS distribution are presented as follows. If \(T\sim \mathrm{BS}(\alpha ,\,\beta ),\) then: (i) \(\mathrm{E}(T)=\beta (1+\alpha ^{2}/2)\) and \(\mathrm{Var}(T)=(\alpha \beta )^{2}(1+5\alpha ^{2}/4),\) (ii) if \(b > 0,\) then \(bT\sim \mathrm{BS}(\alpha ,\,b\beta ),\) which means that the BS distribution is closed under scalar multiplication, (iii) \(1/T \sim \mathrm{BS}(\alpha ,\,1/\beta ),\) which means that the BS distribution is closed under reciprocity, (iv) the median of the distribution of T is \(\beta ,\) which can be directly obtained when \(q=0.5\) from its quantile function given by
$$t(q;\,\alpha ,\,\beta ) = F^{-1}_{T}(q;\, \alpha ,\,\beta ) = \beta \left({\alpha z(q)}/{2} + \sqrt{(\alpha z(q)/2)^{2}+1}\right)^{2}, \quad 0 < q < 1,\quad \alpha >0,\quad \beta >0,$$
where z(q) is the standard normal quantile function, and (v) the BS distribution is positively skewed as \(\alpha\) increases and approximately symmetrical around \(\beta\) as \(\alpha\) goes to zero; see Fig. 1(left).
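As an informal illustration, the CDF in (1), the closed-form PDF in (3) and the quantile function can be sketched in Python using only the standard library. The function names `bs_cdf`, `bs_pdf` and `bs_quantile` are hypothetical and are not part of the authors' R framework; the sketch is meant only to make properties such as the median being equal to \(\beta\) easy to check numerically.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

# Standard normal: cdf is Phi, inv_cdf is z(q).
_STD_NORMAL = NormalDist()


def bs_cdf(t, alpha, beta):
    """CDF (1): Phi((1/alpha) * (sqrt(t/beta) - sqrt(beta/t))), for t > 0."""
    return _STD_NORMAL.cdf((sqrt(t / beta) - sqrt(beta / t)) / alpha)


def bs_pdf(t, alpha, beta):
    """PDF written in the closed form (3), for t > 0."""
    const = exp(alpha ** -2) / (2.0 * alpha * sqrt(2.0 * pi * beta))
    kernel = exp(-(t / beta + beta / t) / (2.0 * alpha ** 2))
    return const * kernel * (t + beta) / t ** 1.5


def bs_quantile(q, alpha, beta):
    """Quantile function: beta * (alpha*z(q)/2 + sqrt((alpha*z(q)/2)^2 + 1))^2."""
    half = alpha * _STD_NORMAL.inv_cdf(q) / 2.0
    return beta * (half + sqrt(half ** 2 + 1.0)) ** 2
```

For instance, `bs_quantile(0.5, alpha, beta)` returns `beta` for any admissible `alpha`, in agreement with property (iv). Note that the factor `exp(alpha ** -2)` may overflow for very small `alpha`; a production implementation would probably work on the log scale.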

2.2 The log-Birnbaum–Saunders distribution

A continuous random variable Y has a log-BS distribution with shape parameter \(\alpha >0\) and location parameter \(\mu \in {\mathbb {R}},\) which is denoted by log-BS(α, μ) if and only if \(Z=(2/\alpha )\mathrm{sinh}({(Y-\mu )}/{2})\sim \mathrm{N}(0,\,1).\) Then, the CDF of Y is given by
$$\begin{aligned} F_{Y}(y;\, \alpha ,\, \mu )=\Phi \left( \frac{2}{\alpha }\mathrm{sinh}\left( \frac{y-\mu }{2}\right) \right) ,\quad y \in{\mathbb {R}},\, \alpha > 0, \, \mu \in {\mathbb {R}}. \end{aligned}$$
(4)
Consequently, from (4), the PDF of Y is obtained as
$$\begin{aligned} f_{Y}(y;\, \alpha ,\, \mu )=\frac{1}{\alpha \sqrt{2\pi }}\mathrm{cosh}\left( \frac{y-\mu }{2}\right) \mathrm{exp}\left( -\frac{2}{\alpha ^{2}}\mathrm{sinh}^{2}\left( \frac{y-\mu }{2}\right) \right) , \quad y \in{\mathbb {R}},\, \alpha > 0,\, \mu \in {\mathbb {R}}. \end{aligned}$$
(5)
Some properties of the log-BS distribution are presented as follows. If \(Y\sim {\text {log}}{\text{-BS}}(\alpha ,\,\mu ),\) then: (i) \(T = \exp (Y) \sim \mathrm{BS}(\alpha ,\,\beta ),\) with \(\beta = \exp (\mu ),\) which means that the log-BS PDF given in (5) can be obtained from the standard normal PDF or from the BS PDF defined in (3), (ii) \(\mathrm{E}(Y)=\mu ,\) (iii) there is no closed form for the variance of Y, but based upon an asymptotic approximation for the moment generating function of the log-BS distribution, it follows that, if \(\alpha \rightarrow 0,\) then \(\mathrm{Var}(Y)\approx \alpha ^{2}-\alpha ^{4}/4,\) and if \(\alpha \rightarrow \infty ,\) then \(\mathrm{Var}(Y) \approx 4(\log ^{2}(\sqrt{2}\alpha )+2-2\log (\sqrt{2}\alpha )),\) (iv) if \(X={\pm } Y+d,\) then \(X\sim {\text {log}}{\text{-BS}}(\alpha ,\,{\pm }\mu +d),\) and (v) the log-BS distribution is symmetric around \(\mu ,\) unimodal for \(\alpha \le 2\) and bimodal for \(\alpha > 2;\) see Fig. 1 (right).
Fig. 1

PDF of (left) BS\((\alpha ,\,1)\) and (right) log-BS\((\alpha ,\,0)\) distributions for the indicated value of the shape parameter \(\alpha\)
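The univariate log-BS functions in (4) and (5) can be sketched in the same way. The names `log_bs_cdf` and `log_bs_pdf` are hypothetical, and `bs_cdf` is repeated so that the block runs on its own. The sketch may be used to check numerically that \(F_{Y}(y;\,\alpha ,\,\mu ) = F_{T}(\exp (y);\,\alpha ,\,\exp (\mu ))\) and that the density has a local minimum at \(\mu\) when \(\alpha > 2.\)

```python
from math import cosh, exp, pi, sinh, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def log_bs_cdf(y, alpha, mu):
    """CDF (4): Phi((2/alpha) * sinh((y - mu)/2))."""
    return _STD_NORMAL.cdf(2.0 / alpha * sinh((y - mu) / 2.0))


def log_bs_pdf(y, alpha, mu):
    """PDF (5): cosh(u) * exp(-2*sinh(u)^2/alpha^2) / (alpha*sqrt(2*pi)), u = (y - mu)/2."""
    u = (y - mu) / 2.0
    return cosh(u) * exp(-2.0 * sinh(u) ** 2 / alpha ** 2) / (alpha * sqrt(2.0 * pi))


def bs_cdf(t, alpha, beta):
    """BS CDF (1), repeated here so that the block is self-contained."""
    return _STD_NORMAL.cdf((sqrt(t / beta) - sqrt(beta / t)) / alpha)
```

The identity between the two CDFs follows because \(\sqrt{t/\beta }-\sqrt{\beta /t} = 2\sinh ((y-\mu )/2)\) when \(t=\exp (y)\) and \(\beta =\exp (\mu ).\)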

If a random vector \({\varvec{Y}}=(Y_{1},\ldots ,Y_{n})^{\top }\) follows an n-variate log-BS distribution, the notation \({\varvec{Y}} \sim {\text {log}}{\text{-BS}}_{n}({\varvec{\alpha }},\,{\varvec{\mu }},\,{\varvec{\Sigma }})\) is used. Here, the vector of shape parameters is \({\varvec{\alpha }}=(\alpha _{1},\ldots ,\alpha _{n})^{\top },\) with \(\alpha _{i} >0,\) the vector of location parameters is \({\varvec{\mu }}=(\mu _{1},\ldots ,\mu _{n})^{\top },\) with \(\mu_{i} \in {\mathbb{R}},\) for \(i=1,\ldots ,n,\) and \({\varvec{\Sigma }}\) is an \(n\times n\) positive definite (non-singular) matrix of scale and dependence parameters. The CDF of \({\varvec{Y}}\) can be defined from (4) as
$$\begin{aligned} F_{\varvec{Y}}({\varvec{y}};\,{\varvec{\alpha }},\,{\varvec{\mu }},\,{\varvec{\Sigma }} )=\Phi _{n}\left( \frac{2}{\alpha _{1}} \mathrm{sinh}\left( \frac{y_{1}-\mu _{1}}{2}\right) ,\ldots ,\frac{2}{\alpha _{n}}\mathrm{sinh}\left( \frac{y_{n}-\mu _{n}}{2}\right) ;\,{\varvec{\Sigma }}\right) ,\quad {\varvec{y}} \in \mathbb {R}^{n}, \, {\varvec{\alpha }} \in{{\mathbb {R}}^n_+}, \, {\varvec{\mu }} \in {{\mathbb {R}}^n},\, {\varvec{\Sigma }}>0,\end{aligned}$$
(6)
where \(\Phi _{n}(\cdot ;\,{\varvec{\Sigma }})\) is the CDF of the n-variate normal distribution with mean vector equal to zero and variance–covariance matrix \({\varvec{\Sigma }}.\) Therefore, the PDF of \({\varvec{Y}}\) can be obtained from (6) as
$$\begin{aligned} f_{{\varvec{Y}}}({\varvec{y}};\,{\varvec{\alpha }},\,{\varvec{\mu }},\,{\varvec{\Sigma }} )=\phi _{n}\left( \frac{2}{\alpha _{1}} \sinh \left( \frac{y_{1}-\mu _{1}}{2}\right) ,\ldots ,\frac{2}{\alpha _{n}} \sinh \left( \frac{y_{n}-\mu _{n}}{2}\right) ;\,{\varvec{\Sigma} }\right) \prod \limits _{i=1}^{n} \frac{1}{\alpha _{i}}\cosh \left( \frac{y_{i}-\mu _{i}}{2}\right) , \quad {\varvec{y}} \in \mathbb {R}^{n}, \quad {\varvec{\alpha }} \in{{\mathbb {R}}^n_+}, \quad {\varvec{\mu }} \in {{\mathbb {R}}^n},\quad {\varvec{\Sigma }}>0,\end{aligned}$$
(7)
where \(\phi _{n}(\cdot ;\,{\varvec{\Sigma} })\) is the PDF of the n-variate normal distribution with mean vector equal to zero and variance–covariance matrix \({\varvec{\Sigma} }.\)

2.3 Spatial models

Consider a stochastic process \(\{Y({\varvec{s}}),\,{\varvec{s}}\in {\varvec{D}}\},\) which is defined over a region D, with \({\varvec{D}}\subset {\mathbb {R}}^{2},\) described by the spatial linear model
$$\begin{aligned} Y({\varvec{s}}) = \mu ({\varvec{s}}) + \varepsilon ({\varvec{s}}), \quad {\varvec{s}}\in {\varvec{D}}, \end{aligned}$$
(8)
where \(\mu (\cdot )\) is a mean function and \(\varepsilon (\cdot )\) is the model error. This error has mean zero and common variance \(\sigma ^{2},\) which means that \(\mathrm{E}(Y({\varvec{s}})) = \mu ({\varvec{s}})\) and \(\mathrm{Var}(Y({\varvec{s}}))= \sigma ^{2},\) for all \({\varvec{s}} \in {\varvec{D}}.\)
If the spatial process \(\{Y({\varvec{s}}),\,{\varvec {s}}\in {\varvec{D}}\}\) is assumed to be stationary, then its mean function is constant, that is, \(\mu ({\varvec{s}})=\mu ,\) for all \({\varvec{s}} \in {\varvec{D}}.\) If it is further assumed to be isotropic, then its covariance function only depends on the distance between spatial locations, that is,
$$\begin{aligned} \mathrm{Cov}\left( Y\left( {\varvec{s}}_{i}\right) ,\,Y\left( {\varvec{s}}_{j}\right) \right) = \mathrm{C}\left( {\varvec{s}}_{i},\,{\varvec{s}}_{j}\right) =\mathrm{C}\left( h_{ij}\right) , \quad {\varvec{s}}_{i},\, {\varvec{s}}_{j} \in D, \end{aligned}$$
(9)
where \(h_{ij}=\Vert {\varvec{s}}_{i}-{\varvec{s}}_{j}\Vert\) is the Euclidean distance between \({\varvec{s}}_{i}\) and \({\varvec{s}}_{j}.\)

2.4 Covariance and variogram models

Now suppose that n measurements are collected at a set of known spatial locations \({\varvec{S}} = \{{\varvec{s}}_{1},\ldots ,{\varvec{s}}_{n}\},\) providing the n-variate random vector \({\varvec{Y}} =(Y_{1}, \ldots ,Y_{n})^{\top },\) where \(Y_{i} = Y({\varvec{s}}_{i}),\) for \(i=1,\ldots ,n.\) The covariance between all pairs of random variables \((Y_{i},\,Y_{j})\) is determined by an \(n\times n\) scale and dependence matrix \({\varvec{\Sigma }}=(\sigma _{ij}),\) which must be symmetric and positive-definite, where \(\sigma _{ij} = \mathrm{C}(h_{ij}),\) with \(\mathrm{C}(\cdot )\) given in (9). Then, \({\varvec{\Sigma }}\) defines the spatial dependence structure for stationary and isotropic processes by a parameter vector \({\varvec{\varphi }} = (\varphi _{1},\, \varphi _{2},\, \varphi _{3})^{\top },\) where \(\varphi _{1}\ge 0,\, \varphi _{2}\ge 0\) are parameters known as nugget effect and partial sill, respectively, whereas \(\varphi _{3}\ge 0\) is a parameter related to the effective range or spatial dependence radius \(a = g(\varphi _{3});\) see Mardia and Marshall (1984) and Uribe-Opazo et al. (2012). The nugget effect is related to an analytical error, indicating an unexplained variability from one point of the sampling grid to another. This variability may be attributed to measurement errors or to a variability not captured due to the sampling distance used; see, for example, Cambardella et al. (1994). The nugget effect can also act as a regulatory tool in spatial design for random fields making many designs feasible; see Müller and Stehlík (2009, 2010). We assume a particular parametric form for the scale and dependence matrix given by
$$\begin{aligned} {\varvec{\Sigma }}=\varphi _{1}{\varvec{I}}_{n}+\varphi _{2}{\varvec{R}}, \end{aligned}$$
(10)
where \({\varvec{I}}_{n}\) is the \(n\times n\) identity matrix and \({\varvec{R}}=(r_{ij})\) is an \(n\times n\) symmetric matrix with diagonal elements \(r_{ii}=1,\) for \(i=1,\ldots ,n.\) Specific forms for \(r_{ij}\) given by \(r_{ij} = \sigma _{ij}/\varphi _{2},\) with \(i\ne j\) and \(\varphi _{2}\ne 0,\) define the model used to explain the spatial dependence, with the most common forms being those obtained from the Matérn and power exponential families; see Isaaks and Srivastava (1989) and Diggle and Ribeiro (2007). In the family of Matérn models, we have
$$\begin{aligned} r_{ij}=\left\{ \begin{array}{ll} 1,& i=j,\\ \frac{1}{2^{\delta -1}\Gamma (\delta )}\left( \frac{h_{ij}}{\varphi _{3}}\right) ^{\delta } K_{\delta }\left( \frac{h_{ij}}{\varphi _{3}}\right) ,& i\ne j, \end{array} \right. \end{aligned}$$
(11)
where \(\delta\) is a shape parameter, \(\Gamma (\cdot )\) is the usual gamma function and \(K_{\delta }(\cdot )\) is the modified Bessel function of third kind of order \(\delta .\) From (11), we get
$$\begin{aligned} \sigma _{ij}=\left\{ \begin{array}{ll} \varphi _{1}+\varphi _{2},& i=j,\\ \frac{\varphi _{2}}{2^{\delta -1}\Gamma (\delta )}\left( \frac{h_{ij}}{\varphi _{3}}\right) ^{\delta } K_{\delta }\left( \frac{h_{ij}}{\varphi _{3}}\right) ,& i\ne j.\\ \end{array} \right. \end{aligned}$$
(12)
In the family of power exponential models, for \(i\ne j,\) we have \(r_{ij} = \exp (-(h_{ij}/\varphi _{3})^{p}),\) where \(0<p\le 2\) is a shape parameter, which implies
$$\begin{aligned} {\sigma _{ij}}=\left\{ \begin{array}{ll} \varphi _{1}+\varphi _{2},& i=j,\\ \varphi _{2}\exp \left( -\left( \frac{h_{ij}}{\varphi _{3}}\right) ^{p}\right) ,& i\ne j. \end{array} \right. \end{aligned}$$
(13)
Although the models in (12) and (13) have no finite range, the effective range a can be defined as the smallest distance between two locations, such that the covariance has dropped to 5 % of the maximum covariance, C(0). The exponential and Gaussian models are members of the power exponential family when \(p=1\) and 2, respectively, and also of the Matérn family when \(\delta =0.5\) and \(\delta \rightarrow \infty ,\) respectively. In stationary processes, from the covariance function given in (9), it is possible to define the variogram function by
$$\begin{aligned} \gamma (h) = {\text{C}}(0) - {\text{C}}(h),\quad h>0, \end{aligned}$$
(14)
where \({\text{C}}(0) = \varphi _{1}+\varphi _{2}\) and C(h) is specified from \(\mathrm{C}(h_{ij}) = \sigma _{ij},\) for \(h = h_{ij},\) with a suitable member of the Matérn or power exponential families given in (12) and (13), respectively. The plot of points \((h,\,\gamma (h))\) obtained from the variogram function given in (14) is a useful tool in spatial statistics.
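To make the construction in (10), (13) and (14) concrete, the following minimal sketch builds \({\varvec{\Sigma }}\) for the power exponential family and evaluates the variogram. It assumes two-dimensional coordinates given as pairs and uses hypothetical function names; the Matérn family in (12) is not covered because it needs a modified Bessel function that is not in the Python standard library.

```python
from math import exp, hypot


def power_exp_corr(h, phi3, p=1.0):
    """Correlation r(h) = exp(-(h/phi3)^p) for h > 0, with 0 < p <= 2."""
    return exp(-((h / phi3) ** p))


def build_sigma(sites, phi1, phi2, phi3, p=1.0):
    """Sigma = phi1*I + phi2*R as in (10), with entries given by (13)."""
    n = len(sites)
    sigma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                sigma[i][j] = phi1 + phi2  # nugget plus partial sill
            else:
                h_ij = hypot(sites[i][0] - sites[j][0], sites[i][1] - sites[j][1])
                sigma[i][j] = phi2 * power_exp_corr(h_ij, phi3, p)
    return sigma


def variogram(h, phi1, phi2, phi3, p=1.0):
    """Variogram (14): gamma(h) = C(0) - C(h), defined for h > 0."""
    return (phi1 + phi2) - phi2 * power_exp_corr(h, phi3, p)
```

As \(h\) approaches zero from above, `variogram` approaches the nugget effect \(\varphi _{1},\) and as \(h\) grows it approaches \(\varphi _{1}+\varphi _{2}.\)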

2.5 Kriging interpolation

In geo-statistical analyses, a commonly used method for interpolation is Kriging; see Krige (1951). The Kriging prediction is given by a linear combination of the observed data \({\varvec{y}}=(y_{1},\ldots ,y_{n})^{\top }\) defined as
$$\begin{aligned} {\widehat{y}}\left( {\varvec{s}}_{0}\right) = \sum \limits _{i=1}^{n}\lambda _{i} y_{i}, \end{aligned}$$
(15)
where \({\widehat{y}}({\varvec{s}}_{0})\) is the predicted value at a new location \({\varvec{s}}_{0}\) and \(\lambda _{1},\ldots , \lambda _{n}\) are weights chosen to define the best linear unbiased predictor. This can be achieved by minimizing the variance of the error with respect to the weights whilst requiring the predictor to be unbiased. It may be shown that, under the stationarity assumption, the values \(\lambda _{1},\ldots , \lambda _{n}\) are given by the solution to \({\varvec{C\lambda }}={\varvec{\Upsilon }},\) where
$$\begin{aligned} {\varvec{C}}=\left( \begin{array}{cccc} \mathrm{C}({\varvec{s}}_{1},\, {\varvec{s}}_{1}) &{}\ldots &{}\mathrm{C}({\varvec{s}}_{1},\,{\varvec{s}}_{n}) &{}1\\ \vdots &{}\ddots &{}\vdots &{}\vdots \\ \mathrm{C}({\varvec{s}}_{n},\, {\varvec{s}}_{1}) &{}\ldots &{}\mathrm{C}({\varvec{s}}_{n},\, {\varvec{s}}_{n}) &{}1\\ 1 &{}\ldots &{}1 &{}0 \end{array} \right) ;\quad {\varvec{\lambda }} = \left( \lambda _{1},\ldots ,\lambda _{n},\,\varrho \right) ^{\top };\quad {\varvec{\Upsilon }}= \left( \mathrm{C}\left( {\varvec{s}}_{1},\, {\varvec{s}}_{0}\right) ,\ldots ,\mathrm{C}\left( {\varvec{s}}_{n},\,{\varvec{s}}_{0}\right) ,\,1\right) ^{\top }; \end{aligned}$$
and \(\varrho\) is a Lagrange multiplier introduced to ensure unbiasedness when minimizing the error variance.
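The system \({\varvec{C\lambda }}={\varvec{\Upsilon }}\) can be illustrated with a small pure-Python sketch. It assumes an isotropic covariance function of distance, passed as the hypothetical argument `cov`, and solves the system by Gaussian elimination with partial pivoting; a practical implementation would rely on a numerical linear algebra library instead.

```python
from math import exp, hypot


def _dist(u, v):
    """Euclidean distance between two planar locations."""
    return hypot(u[0] - v[0], u[1] - v[1])


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(sites, values, s0, cov):
    """Return the prediction (15) at s0 and the weights lambda_1, ..., lambda_n."""
    n = len(sites)
    c = [[cov(_dist(sites[i], sites[j])) for j in range(n)] + [1.0] for i in range(n)]
    c.append([1.0] * n + [0.0])  # unbiasedness constraint row
    upsilon = [cov(_dist(s, s0)) for s in sites] + [1.0]
    solution = solve_linear(c, upsilon)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    return sum(w * y for w, y in zip(weights, values)), weights
```

With four sites at the corners of the unit square and `cov = lambda h: exp(-h / 2.0)`, the prediction at the centre `(0.5, 0.5)` uses four equal weights of 0.25 by symmetry, and the weights always sum to one because of the constraint row.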

2.6 Accuracy measures

To quantify the similarity between two maps, the global accuracy (GA) and kappa (\(\kappa\)) indexes can be used. Consider two maps (one called the reference map and the other the model map), both divided into the same m classes, denoted by \(M_{i},\) for \(i=1,\ldots ,m.\) In addition, let \(N_{ij},\) for \(i,\,j=1,\ldots ,m,\) be the number of pixels (a pixel is defined as a single point in an image; for example, in our application of Sect. 5, 1 pixel \({\approx } 85\,\mathrm{m}^{2}\)) belonging to class \(M_{i}\) of the model map and to class \(M_{j}\) of the reference map, and N be the total number of pixels in each map. The GA index is based on those pixels that belong to the same class in both maps and is defined as \(\mathrm{GA} = (\sum \nolimits _{i=1}^{m} N_{ii})/N.\) Model and reference maps have an acceptable similarity if the GA index is greater than 0.85; see Anderson et al. (1976). The \(\kappa\) index is based on all of the pixels (those belonging to the same class or not) and is defined as \(\kappa =(N\sum \nolimits _{i=1}^{m} N_{ii}-\sum \nolimits _{i=1}^{m} N_{{\bullet } i} N_{i\bullet })/(N^{2}-\sum \nolimits _{i=1}^{m} N_{\bullet i} N_{i\bullet }),\) where \(N_{i\bullet } = \sum \nolimits _{j=1}^{m}N_{ij}\) and \(N_{\bullet i}=\sum \nolimits _{j=1}^{m}N_{ji}.\) Model and reference maps have a low similarity if \(\kappa <0.67,\) a medium similarity if \(0.67 \le \kappa \le 0.80,\) and a high similarity if \(\kappa >0.80;\) see Krippendorff (2004).
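Both indexes can be computed directly from the \(m\times m\) table of pixel counts \(N_{ij}.\) The sketch below assumes the table is supplied as a list of rows, with rows indexing the classes of the model map and columns those of the reference map; the function name `accuracy_indexes` is hypothetical.

```python
def accuracy_indexes(n_matrix):
    """Return (GA, kappa) from an m x m table with n_matrix[i][j] = N_ij."""
    m = len(n_matrix)
    total = sum(sum(row) for row in n_matrix)        # N
    agree = sum(n_matrix[i][i] for i in range(m))    # sum of N_ii
    row_sums = [sum(n_matrix[i]) for i in range(m)]  # N_{i.}
    col_sums = [sum(n_matrix[j][i] for j in range(m)) for i in range(m)]  # N_{.i}
    chance = sum(r * c for r, c in zip(row_sums, col_sums))
    ga = agree / total
    kappa = (total * agree - chance) / (total ** 2 - chance)
    return ga, kappa
```

For the table `[[40, 5], [10, 45]]`, we have \(N=100,\) \(\sum N_{ii}=85\) and \(\sum N_{\bullet i}N_{i\bullet }=5000,\) so that \(\mathrm{GA}=0.85\) and \(\kappa =(8500-5000)/(10000-5000)=0.70.\)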

3 The Birnbaum–Saunders spatial model

3.1 Formulation of the model

Let \(\{T({\varvec{s}}),\,{\varvec{s}}\in D\}\) be a stochastic process defined over a region D, with \(D\subset {\mathbb {R}}^{2}.\) Suppose that n measurements \({\varvec{T}} =(T_{1},\ldots , T_{n})^{\top },\) where \(T_{i}=T({\varvec{s_{i}}}),\) for \(i=1,\ldots ,n,\) are collected at a set of known spatial locations \({\varvec{S}} = \{{\varvec{s}}_{1},\ldots , {\varvec{s}}_{n}\}.\) Consider a spatial model of the form
$$\begin{aligned} T_{i}=\exp \left( \mu _{i}\right) \eta _{i}, \quad i=1,\ldots , n. \end{aligned}$$
(16)
Assume stationarity, such that \(\mu _{i}=\mu ({\varvec{s}}_{i}) = \mu ,\) and the model error \(\eta _{i}=\eta ({\varvec{s}}_{i}) \sim \mathrm{BS}(\alpha ,\,1),\) for \(i=1,\ldots ,n.\) Then, \(\exp (\mu )\) is the median of the model. Note that the shape parameter \(\alpha\) is also assumed to be constant across the spatial locations; see Marchant et al. (2016). Applying a logarithmic transformation to (16), a BS spatial log-linear model is obtained as
$$\begin{aligned} Y_{i}=\log \left( T_{i}\right) =\mu +\log \left( \eta _{i}\right) =\mu +\varepsilon _{i}, \quad i=1,\ldots ,n, \end{aligned}$$
(17)
where \(\varepsilon _{i}=\log (\eta _{i}) \sim {\text {log}}{\text{-BS}}(\alpha ,\,0),\) for \(i=1,\ldots ,n.\) Note that the BS spatial log-linear model defined in (17) has a similar form to the model given in (8). For ease of notation, the model in (17) can be written in matrix form as
$$\begin{aligned} {\varvec{Y}}=\mu {\mathbf{1}}+{\varvec{\varepsilon }}, \end{aligned}$$
(18)
with \(n\times 1\) vectors \({\varvec{Y}}=(Y_{1},\ldots ,Y_{n})^{\top },\,{\mathbf{1}}=(1,\ldots ,1)^{\top }\) and \({\varvec{\varepsilon }} = (\varepsilon _{1}, \ldots , \varepsilon _{n})^{\top }\). Here, \({\varvec{\varepsilon }}\) is a stationary spatial stochastic process with mean vector \(\mathrm{E}({\varvec{\varepsilon }})=\mathbf{0}.\) Suppose that the covariance between all pairs \((Y_{i},\, Y_{j})\) is determined by the \(n\times n\) scale and dependence matrix \({\varvec{\Sigma }}\) satisfying the conditions given in (10), where the elements of \({\varvec{R}}\) can be modelled with the Matérn structure given in (11).

3.2 Parameter estimation

Let \({\varvec{\theta }}=(\alpha ,\,\mu ,\,\varphi _{1},\,\varphi _{2},\,\varphi _{3})^{\top }\) be the vector of unknown parameters of the spatial model formulated in (18) to be estimated. Then, the likelihood function for \({\varvec{\theta }},\) based on the observations \({\varvec{y}}=(y_{1},\ldots ,y_{n})^{\top }\) of \({\varvec{Y}},\) obtained from (7), is given by
$$\begin{aligned} {\mathcal {L}}({\varvec{\theta }})={\mathcal {L}}({\varvec{\theta }};\,{\varvec{y}})=\frac{\alpha ^{-n}}{(2\pi )^{\frac{n}{2}}|{\varvec{\Sigma }}|^{\frac{1}{2}}} \exp \left( -\frac{2}{\alpha ^{2}}{\varvec{V}}^{\top }{\varvec{\Sigma }}^{-1}{\varvec{V}}\right) \prod \limits _{i=1}^{n}\cosh \left( \frac{y_{i}-\mu }{2}\right) , \end{aligned}$$
(19)
where \({\varvec{V}}=(V_{1},\ldots ,V_{n})^{\top }\) is an \(n\times 1\) vector with elements \(V_{i} =\sinh ({(y_{i}-\mu )}/{2}),\) for \(i=1,\ldots ,n.\) The corresponding log-likelihood function for \({\varvec{\theta }}\) obtained from (19) is then
$$\begin{aligned} \ell ({\varvec{\theta }})=-\frac{n}{2}\log (2\pi )-\frac{1}{2}\log (|{\varvec{\Sigma }}|)-n\log (\alpha )-\frac{2}{\alpha ^{2}}{\varvec{V}}^{\top }{\varvec{\Sigma }}^{-1}{\varvec{V}}+ \sum \limits _{i=1}^{n}\log \left( \cosh \left( \frac{y_{i}-\mu }{2}\right) \right) . \end{aligned}$$
(20)
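A direct evaluation of (20) may look as follows. This is only a sketch under the assumption that \({\varvec{\Sigma }}\) is supplied as a symmetric positive-definite list of rows; it uses a hand-written Cholesky factorisation to obtain \(\log (|{\varvec{\Sigma }}|)\) and the quadratic form, and the function names are hypothetical rather than taken from the authors' R code.

```python
from math import cosh, log, pi, sinh, sqrt


def cholesky(a):
    """Lower-triangular L with a = L L^T, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = sqrt(s) if i == j else s / low[j][j]
    return low


def bs_spatial_loglik(y, alpha, mu, sigma):
    """Log-likelihood (20) for observations y and a given matrix Sigma."""
    n = len(y)
    v = [sinh((yi - mu) / 2.0) for yi in y]
    low = cholesky(sigma)
    # Forward substitution: w solves L w = V, so V^T Sigma^{-1} V = w^T w.
    w = []
    for i in range(n):
        w.append((v[i] - sum(low[i][k] * w[k] for k in range(i))) / low[i][i])
    quad = sum(wi * wi for wi in w)
    log_det = 2.0 * sum(log(low[i][i]) for i in range(n))  # log|Sigma|
    return (-0.5 * n * log(2.0 * pi) - 0.5 * log_det - n * log(alpha)
            - 2.0 * quad / alpha ** 2
            + sum(log(cosh((yi - mu) / 2.0)) for yi in y))
```

When \({\varvec{\Sigma }}\) is the identity matrix, the value reduces to the sum of the logarithms of the univariate log-BS densities in (5), which gives a simple check of the implementation.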
The ML method defines the estimator \({\widehat{\varvec{\theta }}}\) of \({\varvec{\theta }}\) as the vector which maximises \(\mathcal{{ L}}({\varvec{\theta }}),\) or equivalently \(\ell ({\varvec{\theta }}),\) over the parameter space of \({\varvec{\theta }}.\) Thus,
$$\begin{aligned} {\widehat{\varvec{\theta }}} =\mathrm{arg}\max _{{\varvec{\theta }}} \ell ({\varvec{\theta }}). \end{aligned}$$
(21)
When the value in (21) is associated with an extreme point, it can be obtained from the solution of a system of equations created from the score vector and given by
$$\begin{aligned} {\frac{\partial \ell ({\varvec{\theta }})}{\partial \alpha }=0,\quad \frac{\partial \ell ({\varvec{\theta }})}{\partial \mu }=0,\quad \frac{\partial \ell ({\varvec{\theta }})}{\partial {\varvec{\varphi }}}={\bf{0}}}; \end{aligned}$$
(22)
see details of the score vector in Appendix 1. Note that no analytical solution to the system of equations given in (22) can be obtained. Then, the ML estimator \({\widehat{\varvec{\theta }}}\) must be computed with an iterative procedure to solve the non-linear system. Here the Broyden–Fletcher–Goldfarb–Shanno (BFGS) quasi-Newton procedure (see Nocedal and Wright 1999; Lange 2001) may be used through the functions optim and optimx implemented in the R software; see www.R-project.org and R-Team (2015). The signs of the determinants of the corresponding Hessian matrix and of its minors were also checked to ensure that a valid maximum had been found.
Under the usual regularity conditions, the ML estimator \({\widehat{{\varvec{\theta }}}}\) is consistent for \({{\varvec{\theta }}}\) and has an asymptotic normal distribution. Then, as \(n \rightarrow \infty ,\)
$$\begin{aligned} \sqrt{n}({\widehat{{\varvec{\theta }}}} -{\varvec{\theta }}) \mathrel {\mathop {\rightarrow }\limits ^{{\mathrm{D}}}_{ }} \mathrm{N}_{5}\left( {\mathbf{0}},\, {\varvec{J}}({\varvec{\theta }})^{-1}\right) , \end{aligned}$$
(23)
where \({\varvec{J}}({\varvec{\theta }}) = \lim \nolimits _{n\rightarrow \infty }{\varvec{I}}({\varvec{\theta }})/n,\) with \({\varvec{I}}({\varvec{\theta }})\) being the expected Fisher information matrix given in Appendix 3, and \(\mathrel {\mathop {\rightarrow }\limits ^{{\mathrm{D}}}_{ }}\) denotes convergence in distribution. Asymptotic confidence intervals (CIs) of a \(100\times (1-\zeta )\,\%\) level for \(\mu ,\,\alpha\) and \(\varphi _{i},\) with \(i=1,\,2,\,3,\) can be obtained from the asymptotic normality property given in (23) as
$$\begin{aligned} \mathrm{CI}(\mu ,\,(1-\zeta )\times 100\,\%)&= \left({\widehat{\mu }}-z(1-\zeta /2){\widehat{\mathrm{SE}}}({\widehat{\mu }}),\,{\widehat{\mu }}+z(1-\zeta /2){\widehat{\mathrm{SE}}}({\widehat{\mu }})\right),\\ \mathrm{CI}(\theta ,\,(1-\zeta )\times 100\,\%)&= \left(\exp ({\widehat{\theta ^{*}}}-z(1-\zeta /2){\widehat{\mathrm{SE}}}({\widehat{\theta ^{*}}})),\, \exp ({\widehat{\theta ^{*}}}+z(1-\zeta /2){\widehat{\mathrm{SE}}}({\widehat{\theta ^{*}}}))\right), \end{aligned}$$
where \({\widehat{\theta ^{*}}}\) is the ML estimate of \(\theta ^{*}=\log (\theta ),\) with \(\theta =\alpha\) or \(\theta =\varphi _{i},\) for \(i=1,\,2,\,3,\) and \({\widehat{\mathrm{SE}}}({\widehat{\theta ^{*}}})\) is the estimated asymptotic standard error (SE) of the ML estimator of \(\theta ^{*};\) see Leiva et al. (2016c).
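The two interval forms can be sketched as below, assuming that the estimates and their estimated standard errors are already available (for instance, from the inverse of an estimated information matrix). The function names are hypothetical. For the positive parameters, the ML estimate of \(\theta ^{*}=\log (\theta )\) is taken as the logarithm of the ML estimate of \(\theta ,\) and the standard error supplied must be that of the estimator of \(\theta ^{*}.\)

```python
from math import exp, log
from statistics import NormalDist


def ci_location(mu_hat, se_mu, zeta=0.05):
    """Symmetric asymptotic CI for mu at the 100*(1 - zeta)% level."""
    z = NormalDist().inv_cdf(1.0 - zeta / 2.0)
    return mu_hat - z * se_mu, mu_hat + z * se_mu


def ci_positive(theta_hat, se_log_theta, zeta=0.05):
    """CI for alpha or phi_i, built on theta* = log(theta) and mapped back by exp."""
    z = NormalDist().inv_cdf(1.0 - zeta / 2.0)
    centre = log(theta_hat)
    return exp(centre - z * se_log_theta), exp(centre + z * se_log_theta)
```

Working on the logarithmic scale guarantees that both limits returned by `ci_positive` are positive, which a symmetric interval around \({\widehat{\theta }}\) would not ensure.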

3.3 Local influence

Let \(\ell ({\varvec{\theta }})\) be the log-likelihood function for the vector of model parameters \({\varvec{\theta }} = (\alpha ,\,\mu ,\,\varphi _{1},\,\varphi _{2},\,\varphi _{3})^{\top }\) given in (18), which is referred to as the non-perturbed log-likelihood. Then, let \({\varvec{\omega }}=(\omega _{1},\ldots ,\omega _{n})^{\top } \in \Omega \subset {\mathbb {R}}^{n}\) be an \(n\times 1\) perturbation vector, where \(\Omega\) is an open set of relevant perturbations. Let \(\ell ({\varvec{\theta }}|{\varvec{\omega }})\) be the log-likelihood function perturbed by \({\varvec{\omega }},\) called the perturbed log-likelihood, and \({\widehat{\varvec{\theta }}}_{\varvec{\omega }}\) be the ML estimate of \({\varvec{\theta }}\) obtained from \(\ell ({\varvec{\theta }}|{\varvec{\omega }}).\) In addition, let \({\varvec{\omega }}_{0} \in \Omega\) be an \(n \times 1\) non-perturbation vector, such that \(\ell ({\varvec{\theta }}|{\varvec{\omega }}_{0})= \ell ({\varvec{\theta }}).\) Suppose that \(\ell ({\varvec{\theta }}|{\varvec{\omega }})\) is twice continuously differentiable in a neighbourhood of \(({\widehat{\varvec{\theta }}},\, {\varvec{\omega }}_{0}).\) Now comparing the parameter estimates \({\widehat{\varvec{\theta }}}\) and \({\widehat{\varvec{\theta }}}_{\varvec{\omega }}\) using local influence, we are able to investigate how the inference is affected by the perturbation. Consider the likelihood displacement (LD) given by
$$\begin{aligned} \mathrm{LD}({\varvec{\omega }}) = 2\left( \ell ({\widehat{\varvec{\theta }}})- \ell ( {\widehat{\varvec{\theta }}}_{{\varvec{\omega }}}) \right) , \end{aligned}$$
(24)
which is used to assess the influence of the perturbation \({\varvec{\omega }}.\) Large values of LD\(({\varvec{\omega }})\) in (24) indicate that \({\widehat{\varvec{\theta }}}\) and \({\widehat{\varvec{\theta }}}_{\varvec{\omega }}\) differ considerably relative to the contours of the non-perturbed log-likelihood function \(\ell ({\varvec{\theta }}).\) The method studies the local behaviour of the influence plot \(a({\varvec{\omega }}) = ({\varvec{\omega }}^{\top },\,\mathrm{LD}({\varvec{\omega }}))^{\top }\) around \({\varvec{\omega }}_{0}.\) Cook (1987) suggested examining the direction of maximum curvature, \(C_{{\max }},\) of the surface \(a({\varvec{\omega }}).\) For LD\(({\varvec{\omega }})\) in (24), \(C_{{\max }} = \max _{||{\varvec{d}}||=1} C_{\varvec{d}},\) where \(C_{\varvec{d}} = 2|{\varvec{d}}^{\top }{\varvec{B}} {\varvec{d}}|,\) with \({\varvec{B}}\) being an \(n \times n\) matrix and \({\varvec{d}}\) a unit-length direction vector. To find \(C_{{\max }}\) and the direction vector \({\varvec{d}}_{{\max }},\) we need to calculate \({\varvec{B}}= - {\varvec{\Delta }}^{\top } {\varvec{\ddot{\ell }}}({\widehat{\varvec{\theta }}})^{{-1}} {\varvec{\Delta }},\) where \({\varvec{\ddot{\ell }}}({\widehat{\varvec{\theta }}})\) is the Hessian matrix obtained from (19) evaluated at \({\varvec{\theta }} = {\widehat{\varvec{\theta }}};\) see details of the Hessian matrix in Appendix 2. Here, \({\varvec{\Delta }}\) is a \(5 \times n\) matrix obtained from the perturbed log-likelihood function and given by
$$\begin{aligned} {\varvec{\Delta }} = {\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }}) \over \partial {\varvec{\theta }} \partial {\varvec{\omega }}^{\top } } =\left( \begin{matrix} {\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }}) \over \partial \alpha \partial {\varvec{\omega }}^{\top } } \\ {\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }}) \over \partial \mu \partial {\varvec{\omega }}^{\top } } \\ {\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }}) \over \partial {\varvec{\varphi }} \partial {\varvec{\omega }}^{\top } } \end{matrix}\right) = \left( \begin{matrix} \Delta _{\alpha }\\ \Delta _{\mu }\\ {\varvec{\Delta }}_{\varvec{\varphi }}\\ \end{matrix}\right) . \end{aligned}$$
(25)
Note that (25) must be evaluated at \({\varvec{\theta }}={\widehat{\varvec{\theta }}}\) and \({\varvec{\omega }}={\varvec{\omega }}_{0}.\) Then, \({\varvec{d}}_{{\max }}\) is a unit-length eigenvector associated with the maximum absolute eigenvalue \(C_{{\max }}\) of \({\varvec{B}}.\) A large absolute value of any element of \({\varvec{d}}_{{\max }}\) reveals that the case is likely to be influential. Other important directions correspond to the canonical basis vectors \({\varvec{d}} = {\varvec{e}}_{i},\) for \(i=1,\ldots ,n,\) where \({\varvec{e}}_{i}\) is an \(n \times 1\) vector with a one (1) in its ith position and zeros (0) in the other positions. In this case, the curvature is given by \(C_{i}=2|b_{ii}|,\) where \(b_{ii}\) is the \((i,\,i)\) element of the matrix \({\varvec{B}},\) for \(i=1,\ldots ,n.\) The plot of \(C_{i}=C_{\varvec{d}_{i}}\) versus the index i can also be used to identify influential cases. We use index plots of \(C_{i}\) and \(|d_{{\max }_{i}}|\) as diagnostic measures of local influence in Sect. 5. Although there is no consensus about a benchmark to determine an influential case, we use a value analogous to that proposed by Zhu and Lee (2001), which indicates the case i as influential if \(C_{i} > {\overline{C}} + 2\mathrm{SE}(C),\) for \(i=1,\ldots ,n,\) where \({\overline{C}}\) and \(\mathrm{SE}(C)\) denote, respectively, the mean normal curvature and the corresponding sample SE. Similarly, in the index plot of \(|d_{{\max }_{i}}|,\) the case i is indicated as influential if \(|d_{{\max }_{i}}| > {\overline{|{\varvec{d}}_{{\max }}|}} + 2\mathrm{SE}(|{\varvec{d}}_{{\max }}|),\) for \(i=1,\ldots ,n,\) where \({\overline{|{\varvec{d}}_{{\max }}|}}\) and \(\mathrm{SE}(|{\varvec{d}}_{{\max }}|)\) denote, respectively, the mean of the elements in absolute value of the vector \({\varvec{d}}_{{\max }}\) and the corresponding sample SE. 
We use these benchmark values in the simulation study and in the agricultural application considered in Sects. 4 and 5, respectively.
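The benchmark rule described above can be sketched as follows, assuming the matrix \({\varvec{B}}\) has already been computed and is supplied as a nested list. The function names are hypothetical, and the "sample SE" is interpreted here as the sample standard deviation of the plotted values, which is an assumption about the authors' convention.

```python
from statistics import mean, stdev


def case_curvatures(B):
    """C_i = 2 |b_ii|, the curvature in the direction of the i-th basis vector."""
    return [2 * abs(B[i][i]) for i in range(len(B))]


def flag_influential(values):
    """Return the cutoff mean + 2 * SD and the 1-based indices exceeding it.

    `values` can be the curvatures C_i or the absolute elements |d_max_i|.
    """
    cutoff = mean(values) + 2 * stdev(values)
    flagged = [i + 1 for i, v in enumerate(values) if v > cutoff]
    return cutoff, flagged


# Hypothetical usage: nine unremarkable cases and one with a large curvature.
cutoff, flagged = flag_influential([0.1] * 9 + [5.0])
```

The same function serves both index plots, since the rule has the same form for \(C_{i}\) and \(|d_{{\max }_{i}}|.\)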

3.4 Selection of the appropriate perturbation

Suppose that the perturbation of the response variable is of the form \({\varvec{Y_{\omega }}}({\varvec{s}})={\varvec{Y}}({\varvec{s}})+{\varvec{A}}{\varvec{\omega }},\) where \({\varvec{A}}\) is a symmetric and non-singular matrix. Hence, \({\varvec{Y}}_{\omega _{0}}({\varvec{s}})={\varvec{Y}}({\varvec{s}})\) and \(Y_{\varvec{\omega }}({\varvec{s}}_{i})=Y({\varvec{s}}_{i})+{\varvec{a}}_{i}{\varvec{\omega }},\) where \({\varvec{a}}_{i}\) is the ith row of the matrix \({\varvec{A}}.\) In this case, the perturbed log-likelihood function is given by
$$\begin{aligned} \ell ({\varvec{\theta }}|{\varvec{\omega }})=-\frac{n}{2}\log (2\pi )-\frac{1}{2}\log (|{\varvec{\Sigma }}|)-n\log (\alpha )-\frac{2}{\alpha ^{2}}{{\varvec{V}}_{\varvec{\omega} }^{\top}}{\varvec{\Sigma }}^{-1}{\varvec{V_{\omega }}}+\sum \limits _{i=1}^{n}\log \left( \cosh \left( \frac{y_{i}+{\varvec{a}}_{i}{\varvec{\omega }}-\mu }{2}\right) \right) , \end{aligned}$$
(26)
where \({\varvec{V_{\omega }}}=({V}_{{\omega }_{1}},\ldots ,{V}_{{\omega }_{n}})^{\top },\) with \({V}_{{\omega }_{i}}=\sinh ({(y_{i}+{\varvec{a}}_{i}{\varvec{\omega }}-\mu )}/{2}),\) for \(i=1,\ldots ,n.\) Then, the corresponding score vector obtained from (26) is given by (see details in Appendix 4)
$$\begin{aligned} {\varvec{U}}({\varvec{\omega }})=\frac{\partial \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial {\varvec{\omega }}}=-\frac{2}{\alpha ^{2}}{\varvec{A}}{\varvec{\Sigma }}^{-1}{\varvec{V_{\omega }}}+\frac{1}{2}{\varvec{A}}{\varvec{V_{\omega }}}. \end{aligned}$$
(27)
Consider the variance of the score vector given in (27) as a function of the perturbation vector \({\varvec{\omega }},\) that is, \({\varvec{G}}({\varvec{\omega }})= \mathrm{Var}({\varvec{U}}({\varvec{\omega }})) = \mathrm{E}({\varvec{U}}({\varvec{\omega }}){\varvec{U}}^{\top }({\varvec{\omega }})),\) recalling that \(\mathrm{E}({\varvec{U}}({\varvec{\omega }}))= {\varvec{0}}.\) For the BS spatial log-linear model, we have that (again see details in Appendix 4)
$$\begin{aligned} {\varvec{G}}({\varvec{\omega }})={\varvec{A}}\left( \frac{1}{\alpha }{\varvec{\Sigma }}^{-\frac{1}{2}}-\frac{\alpha }{4}{\varvec{\Sigma }}^{\frac{1}{2}}\right) ^{2}{\varvec{A}}. \end{aligned}$$
(28)
According to Zhu et al. (2007), the perturbation \({\varvec{\omega }}\) is appropriate if and only if \({\varvec{G}}({\varvec{\omega }}_{0})=c{\varvec{I}}_{n},\) for \(c > 0,\) with \({\varvec{G}}(\cdot )\) given in (28). In general, for an arbitrary symmetric and non-singular matrix \({\varvec{A}},\, {\varvec{G}}({\varvec{\omega }})\ne c{\varvec{I}}_{n}.\) Instead, for the perturbation to be appropriate, \({\varvec{A}}\) must be found using the condition
$$\begin{aligned} {\varvec{A}}\left( \frac{1}{\alpha }{\varvec{\Sigma }}^{-\frac{1}{2}}-\frac{\alpha }{4}{\varvec{\Sigma }}^{\frac{1}{2}}\right) ^{2}{\varvec{A}}=c{\varvec{I}}_{n}, \end{aligned}$$
(29)
for some value of \(c>0.\) Since any positive value of c is admissible, taking \(c=1\) immediately shows that \({\varvec{A}}=(({1}/{\alpha }){\varvec{\Sigma }}^{-\frac{1}{2}}-({\alpha }/{4}){\varvec{\Sigma }}^{\frac{1}{2}})^{-1}\) satisfies the condition given in (29). Thus, for the BS spatial log-linear model, an appropriate perturbation scheme for the response variable is given by
$$\begin{aligned} {\varvec{Y_{\omega }}}({\varvec{s}})={\varvec{Y}}({\varvec{s}})+ \left( \frac{1}{\alpha }{\varvec{\Sigma }}^{-\frac{1}{2}}-\frac{\alpha }{4}{\varvec{\Sigma }}^{\frac{1}{2}}\right) ^{-1}{\varvec{\omega }}. \end{aligned}$$
(30)
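As a brief check of this choice (the shorthand \({\varvec{K}}\) is introduced here only for illustration and is not part of the original notation), write \({\varvec{K}}=({1}/{\alpha }){\varvec{\Sigma }}^{-\frac{1}{2}}-({\alpha }/{4}){\varvec{\Sigma }}^{\frac{1}{2}}.\) Because \({\varvec{\Sigma }}^{\frac{1}{2}}\) and \({\varvec{\Sigma }}^{-\frac{1}{2}}\) are symmetric, so is \({\varvec{K}},\) and, assuming \({\varvec{K}}\) is non-singular, \({\varvec{A}}={\varvec{K}}^{-1}\) is symmetric and non-singular as required. Substituting into the left-hand side of (29) then gives

$$\begin{aligned} {\varvec{A}}{\varvec{K}}^{2}{\varvec{A}}={\varvec{K}}^{-1}{\varvec{K}}\,{\varvec{K}}{\varvec{K}}^{-1}={\varvec{I}}_{n}. \end{aligned}$$

The non-singularity assumption matters: if \(\lambda\) is an eigenvalue of \({\varvec{\Sigma }},\) the corresponding eigenvalue of \({\varvec{K}}\) is \(1/(\alpha \sqrt{\lambda })-\alpha \sqrt{\lambda }/4,\) which vanishes exactly when \(\alpha ^{2}\lambda =4.\)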
Now consider the perturbation matrix \({\varvec{\Delta }}\) defined in (25), the perturbation given in (30), and
$$\begin{aligned} \frac{\partial \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial {\varvec{\omega }}^\top }=-\frac{2}{\alpha ^{2}}{\varvec{V_{\omega }}}^{\top }{\varvec{\Sigma }}^{-1}{\varvec{A}}+\frac{1}{2}{\varvec{V_{\omega }}}^{\top }{\varvec{A}}={\varvec{V_{\omega }}}^{\top }\left( -\frac{2}{\alpha ^{2}}{\varvec{\Sigma }}^{-1}+\frac{1}{2}{\varvec{I_n}}\right) {\varvec{A}}. \end{aligned}$$
(31)
Then, the elements of \({\varvec{\Delta }}\) obtained from (31) are expressed as
$$\begin{aligned} \Delta _{\mu }= & {} \frac{\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial \mu \partial {\varvec{\omega }}^{\top }}={\mathbf{1}}^{\top } \left( \frac{1}{\alpha ^{2}}{\varvec{\Sigma }}^{-1}-\frac{1}{4}{\varvec{I_{n}}}\right) {\varvec{A}},\end{aligned}$$
(32)
$$\begin{aligned} \Delta _{\alpha }= & {} \frac{\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial \alpha \partial {\varvec{\omega }}^{\top }}={\varvec{D}}^{\top }\left( \frac{-1}{\alpha ^{2}}{\varvec{\Sigma }}^{-1}+\frac{1}{4}{\varvec{I}}_{n}\right) {\varvec{A}}\nonumber \\&\quad +{\varvec{V}}_{\varvec{\omega} }^{\top }\left( \frac{4}{\alpha ^{3}}{\varvec{\Sigma }}^{-1}{\varvec{A}}+\left( - \frac{2}{\alpha ^{2}}{\varvec{\Sigma }}^{-1}+\frac{1}{2}{\varvec{I}}_{n}\right)\times {\varvec{A}}\left( \frac{1}{\alpha ^{2}}{\varvec{\Sigma }}^{-\frac{1}{2}}+\frac{1}{4}{\varvec{\Sigma }}^{\frac{1}{2}}\right) {\varvec{A}}\right) ,\end{aligned}$$
(33)
$$\begin{aligned} {\varvec{\Delta _{\varvec{\varphi }}}}= & {} \frac{\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial {\varvec{\varphi }}\partial {\varvec{\omega }}^{\top }}=\left( \frac{\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial \varphi _{1}\partial {\varvec{\omega }}^{\top }},\,\frac{\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial \varphi _{2}\partial {\varvec{\omega }}^{\top }},\,\frac{\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial \varphi _{3}\partial {\varvec{\omega }}^{\top }}\right) ^{\top }, \end{aligned}$$
(34)
where \({\varvec{D}}=(d_{1},\ldots ,d_{n})^{\top },\) with \(d_{i}={\varvec{l}}_{i}{\varvec{\omega }},\) for \(i=1,\ldots ,n,\) and \({\varvec{l}}_{i}\) being the ith row of the matrix \({\varvec{A}}(({1}/{\alpha ^{2}}){\varvec{\Sigma }}^{-\frac{1}{2}}+({1}/{4}){\varvec{\Sigma }}^{\frac{1}{2}}){\varvec{A}}.\) The elements of \({\varvec{\Delta _{\varvec{\varphi }}}}\) given in (34) are defined as
$$\begin{aligned} \frac{\partial ^{2} \ell ({\varvec{\theta }}|{\varvec{\omega }})}{\partial \varphi _{i}\partial {\varvec{\omega }}^{\top }}&= {\varvec{M}}^{\top }\left( -\frac{1}{\alpha ^{2}}{\varvec{\Sigma }}^{-1}+\frac{1}{4}{\varvec{I}}_{n}\right) {\varvec{A}}\\&\quad +{\varvec{V}}_{\varvec{\omega} }^{\top }\left( \frac{2}{\alpha ^{2}}{\varvec{\Sigma }}^{-1}\frac{\partial {\varvec{\Sigma }}}{\partial \varphi _{i}}{\varvec{\Sigma }}^{-1}\right) {\varvec{A}}\\&\quad +{\mathbf{1}}^{\top }\left( -\frac{2}{\alpha ^{2}}{\varvec{\Sigma }}^{-1}+\frac{1}{2}{\varvec{I}}_{n}\right)\\&\quad \times{\varvec{A}}\left( \frac{1}{\alpha }{\varvec{\Sigma }}^{-\frac{1}{2}}\frac{\partial {\varvec{\Sigma }}^{\frac{1}{2}}}{\partial \varphi _{i}}{\varvec{\Sigma }}^{-\frac{1}{2}}+\frac{\alpha }{4}\frac{\partial {\varvec{\Sigma }}^{\frac{1}{2}}}{\partial \varphi _{i}}\right) {\varvec{A}}, \end{aligned}$$
where \({\varvec{M}}=(m_{1},\ldots ,m_{n})^{\top },\) with \(m_{i}={\varvec{l}}_{i}{\varvec{\omega }},\) for \(i=1,\ldots ,n,\) and \({\varvec{l}}_{i}\) being now the ith row of the matrix
$$\begin{aligned} {\varvec{A}}\left( \frac{1}{\alpha }{\varvec{\Sigma }}^{-\frac{1}{2}}\frac{\partial {\varvec{\Sigma }}^{\frac{1}{2}}}{\partial \varphi _{i}}{\varvec{\Sigma }}^{-\frac{1}{2}}+\frac{\alpha }{4}\frac{\partial {\varvec{\Sigma }}^{\frac{1}{2}}}{\partial \varphi _{i}}\right) {\varvec{A}}. \end{aligned}$$
Detailed algebra of the above expressions is presented in Appendix 5.

4 Simulation study

In this section, we conduct two MC simulation studies based on n sampling points \({\varvec{y}} = (y_{1}, \ldots ,y_{n})^{\top }\) generated from a log-BS distribution with shape parameter \(\alpha \in \{0.3,\,1.0\},\) scale parameter \(\mu =2\) and spatial structure described by the Matérn model with \(\delta =0.5,\, \varphi _{1}\in \{0.5,\,1.0\},\,\varphi _{2}\in \{0.5,\,1.0\}\) and \(\varphi _{3}\in \{1.0,\,1.5\}.\) We consider a regular grid with a minimum distance between points of one unit. We evaluate the performance of the estimators of the model parameters for small and large sample sizes. Then, we assess the performance of the proposed diagnostic methodology on the detection of influential cases.
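The spatial set-up of the simulation can be sketched as below. Several points are assumptions rather than statements from the text: the grid is taken to be square (6 × 6 for \(n=36\) and 10 × 10 for \(n=100\)), the scale matrix is taken as \({\varvec{\Sigma }}=\varphi _{1}{\varvec{I}}_{n}+\varphi _{2}{\varvec{R}},\) and the Matérn correlation with \(\delta =0.5\) is written in its exponential form \(\exp (-h/\varphi _{3}),\) which holds under the usual parametrisation but may differ by a constant from the one in (11). All function names are hypothetical.

```python
from math import exp, hypot


def regular_grid(side):
    """Coordinates of a side x side lattice with unit spacing."""
    return [(float(i), float(j)) for i in range(side) for j in range(side)]


def scale_matrix(coords, phi1, phi2, phi3):
    """Sigma = phi1 * I + phi2 * R, with R_ij = exp(-h_ij / phi3).

    The exponential correlation is the Matern model with delta = 0.5
    under the standard parametrisation (assumed here).
    """
    n = len(coords)
    sigma = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            h = hypot(xi - xj, yi - yj)  # Euclidean distance between sites
            sigma[i][j] = phi2 * exp(-h / phi3)
            if i == j:
                sigma[i][j] += phi1  # nugget-type term on the diagonal
    return sigma


# One of the simulated scenarios: phi1 = 0.5, phi2 = 1.0, phi3 = 1.0, n = 36.
coords = regular_grid(6)
sigma = scale_matrix(coords, 0.5, 1.0, 1.0)
```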

4.1 Study I: ML estimation

To assess the efficiency of the estimator \({\widehat{\theta }_{j}}\) empirically, we use the absolute relative bias (ARB) and the root mean squared error (RMSE) defined as the square root of the mean squared error given by
$$\begin{aligned} \mathrm{ARB}( {\widehat{\theta }_{j}}) = \left| \frac{\bar{\theta _{j}} -\theta _{j}}{\theta _{j}}\right| \times 100, \quad \mathrm{MSE}( {\widehat{\theta }_{j}}) = \frac{1}{p}\sum _{k=1}^{p}( {\widehat{\theta }}_{jk}-\theta _{j}) ^{2}, \quad j=1,\ldots,5, \end{aligned}$$
where \(\bar{\theta _j} = (1/p)\sum _{k=1}^{p}\widehat{\theta }_{jk},\) with \(\widehat{\theta }_{jk}\) being the ML estimate of \(\theta _{j}\in \{\alpha ,\, \mu ,\, \varphi _{1},\, \varphi _{2},\, \varphi _{3}\}\) for the kth MC replication. Note that \(\widehat{\varvec{\theta }} = (\widehat{\alpha },\, \widehat{\mu },\, \widehat{\varphi }_{1},\, \widehat{\varphi }_{2},\, \widehat{\varphi }_{3})^{\top }\) is the ML estimate of \({\varvec{\theta }} = (\alpha ,\, \mu ,\, \varphi _{1},\, \varphi _{2},\, \varphi _{3})^{\top }.\) We consider \(p=500\) MC replications in each case. Table 1 displays the empirical ARBs and RMSEs of the corresponding ML estimators. Note that, for \(n=100,\) all the ARBs are small. In particular, the ARBs of the ML estimators of the parameter \(\alpha\) are less than \(10\,\%\) in 15 of the 16 cases studied, and in \(50\,\%\) of these cases they are less than \(2\,\%.\) For the ML estimator of the parameter \(\mu ,\) the ARBs are less than \(1\,\%\) in practically all cases. For the ML estimators of the parameters describing the spatial structure, the ARBs are greater than \(10\,\%\) in a few cases, with a maximum of \(13\,\%.\) However, for \(n = 36,\) this picture is quite different, because only the ARBs of the ML estimator of the parameter \(\mu\) remain low. For the other parameters, the ARBs are high in various scenarios. In general, the RMSEs are small for the estimators of \(\alpha\) and \(\mu ,\) but are moderate for the spatial parameters. As expected, the RMSEs are generally larger for \(n=36\) than for \(n=100.\) These results show the sensitivity of the ML estimators of the BS spatial model to small samples.
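The two performance measures can be computed from a list of replicated estimates as in the following sketch. The function names and the three-replicate example are hypothetical and serve only to illustrate the formulas; the study itself uses \(p=500\) replications.

```python
from math import sqrt
from statistics import fmean


def arb(estimates, true_value):
    """Absolute relative bias, in percent, of the MC mean of the estimates."""
    return abs((fmean(estimates) - true_value) / true_value) * 100


def rmse(estimates, true_value):
    """Square root of the mean squared error around the true value."""
    return sqrt(fmean([(e - true_value) ** 2 for e in estimates]))


# Hypothetical replicates of an estimator of mu, whose true value is 2.
replicates = [1.9, 2.1, 2.3]
bias_percent = arb(replicates, 2.0)
error = rmse(replicates, 2.0)
```

Note that the ARB is computed from the mean of the replicates, so positive and negative errors can cancel, whereas the RMSE accumulates every deviation from the true value.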

4.2 Study II: influence diagnostics

We evaluate the performance of the corresponding diagnostic tools in detecting influential cases. In order to carry out this evaluation, we generate two data sets using a response variable as given in (18), with scale matrix structure described by the Matérn family model, considering the shape parameter \(\delta \in \{0.5,\,1.0\}\) and \(n= 100.\) After generating each data set \({\varvec{y}} = (y_{1}, \ldots ,y_{n})^{\top },\) we contaminate its maximum value to generate an outlier. This contamination is similar to that used in Ortega et al. (2003) to study the influence of a perturbation in an explanatory variable for generalized log-gamma fixed effect models. Specifically, we consider the contamination
$$\begin{aligned} y^{*}_{{\max }} = y_{{\max }} + 10 \sqrt{{\varvec{y}}^{\top }{\varvec{y}}}. \end{aligned}$$
(35)
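A minimal sketch of the contamination in (35), with a hypothetical function name and toy data, is the following. It replaces the largest observation by itself plus ten times the Euclidean norm of the data vector and reports which case was modified.

```python
from math import sqrt


def contaminate_max(y):
    """Apply (35): y_max* = y_max + 10 * sqrt(y'y).

    Returns the contaminated copy and the 1-based index of the altered case.
    """
    i_max = max(range(len(y)), key=lambda i: y[i])
    norm = sqrt(sum(v * v for v in y))  # sqrt(y'y), computed before contamination
    contaminated = list(y)
    contaminated[i_max] = y[i_max] + 10 * norm
    return contaminated, i_max + 1


# Toy example: the norm of (3, 4) is 5, so the maximum 4 becomes 4 + 50.
y_star, case = contaminate_max([3.0, 4.0])
```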
For the first data set (\(\delta =0.5,\,n = 100\)), the contaminated case according to (35) turns out to be case #59. Figure 2 (left) shows the box-plot for this data set, which identifies case #59 as an outlier. Figure 2 (right) displays the corresponding model map divided into quartiles, from which it is possible to identify where outliers are located. After the contamination, the estimates of the parameters (obtained using the BFGS method) with their estimated asymptotic SEs (in parentheses) are: \(\widehat{\alpha }=4.828\,(3.414 \times 10^{-1}),\, \widehat{\mu }=71.188\,(1.309 \times 10),\, \widehat{\varphi _{1}}=480.101\,(8.989 \times 10^{2}),\, \widehat{\varphi _{2}}=200.090\,(8.999 \times 10^{2})\) and \(\widehat{\varphi _{3}}=0.346\,(1.472 \times 10^{-3}),\) resulting in an estimated spatial dependence radius \(\widehat{a}=1.038\,(4.416 \times 10^{-3})\). Figure 3 displays the index plots of \(C_{i}\) (left) and \(|d_{{\max }_{i}}|\) (right) for the spatial structure described by the Matérn family model with \(\delta =0.5,\) considering the contaminated case according to (35). This study of local influence shows that case #59 is detected as potentially influential.
Fig. 2

Box-plot (left) and model map (right) of the simulated data for the Matérn model with \(\delta =0.5\)

Fig. 3

Index plots of (left) \(C_{i}\) and (right) \(|d_{{\max }_{i}}|\) of the simulated data for the Matérn model with \(\delta =0.5\)

As with the first data set, for the second data set (\(\delta =1.0,\,n = 100\)), the contaminated case according to (35) turns out to be case #59. Figure 4 (left) shows the box-plot for this second data set, whereas Fig. 4 (right) displays the corresponding model map divided into quartiles. In this second data set, after the contamination, the estimates of the parameters (obtained with the BFGS method) with their SEs (in parentheses) are: \(\widehat{\alpha }=5.004\,(3.538 \times 10^{-1}),\, \widehat{\mu }=71.250\,(1.253 \times 10),\, \widehat{\varphi _{1}}=400.118\,(1.403 \times 10^{3}),\, \widehat{\varphi _{2}}=200.097\,(1.404 \times 10^{3})\) and \(\widehat{\varphi _{3}}=0.222\,(3.937 \times 10^{-4}),\) presenting an estimated maximum distance of spatial dependence \(\widehat{a}=0.888\, (1.575 \times 10^{-3}).\) Figure 5 displays the index plots of local influence which, as in the first data set, indicate case #59 as potentially influential.

Therefore, the proposed diagnostic methodology seems to be effective for detecting outlying and potentially influential cases.
Fig. 4

Box-plot (left) and model map (right) of the simulated data for the Matérn model with \(\delta =1\)

Fig. 5

Index plots of (left) \(C_{i}\) and (right) \(|d_{{\max_{i} }}|\) of the simulated data for the Matérn model with \(\delta =1\)

Table 1

ARB (in \(\%\)) and RMSE of the ML estimator of the indicated parameter and n from simulated data with the BS spatial model

| Parameter | True value (\(n=36\)) | ARB (\(n=36\)) | RMSE (\(n=36\)) | True value (\(n=36\)) | ARB (\(n=36\)) | RMSE (\(n=36\)) | True value (\(n=100\)) | ARB (\(n=100\)) | RMSE (\(n=100\)) | True value (\(n=100\)) | ARB (\(n=100\)) | RMSE (\(n=100\)) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\alpha\) | 1.00 | 8.501 | 0.211 | 0.30 | 38.932 | 0.175 | 1.00 | 8.683 | 0.170 | 0.30 | 6.526 | 0.058 |
| \(\mu\) | 2.00 | 0.377 | 0.425 | 2.00 | 1.010 | 0.339 | 2.00 | 0.48 | 0.245 | 2.00 | 0.102 | 0.070 |
| \(\varphi _{1}\) | 0.50 | 15.204 | 0.628 | 0.50 | 0.134 | 0.474 | 0.50 | 6.446 | 0.505 | 0.50 | 1.624 | 0.586 |
| \(\varphi _{2}\) | 1.00 | 2.879 | 0.549 | 1.00 | 12.064 | 0.546 | 1.00 | 3.702 | 0.492 | 1.00 | 9.129 | 0.200 |
| \(\varphi _{3}\) | 1.00 | 10.631 | 1.549 | 1.00 | 3.283 | 1.591 | 1.00 | 8.288 | 0.922 | 1.00 | 10.102 | 0.731 |
| \(\alpha\) | 1.00 | 0.036 | 0.131 | 0.30 | 16.731 | 0.093 | 1.00 | 0.403 | 0.121 | 0.30 | 7.886 | 0.083 |
| \(\mu\) | 2.00 | 0.782 | 0.172 | 2.00 | 0.310 | 0.076 | 2.00 | 0.298 | 0.310 | 2.00 | 0.124 | 0.051 |
| \(\varphi _{1}\) | 0.50 | 16.061 | 0.407 | 0.50 | 37.419 | 0.362 | 0.50 | 12.548 | 0.424 | 0.50 | 9.748 | 0.448 |
| \(\varphi _{2}\) | 0.50 | 20.495 | 0.357 | 0.50 | 9.631 | 0.285 | 0.50 | 4.074 | 0.579 | 0.50 | 8.521 | 0.418 |
| \(\varphi _{3}\) | 1.00 | 19.786 | 1.406 | 1.00 | 31.122 | 0.598 | 1.00 | 12.092 | 1.000 | 1.00 | 3.368 | 0.840 |
| \(\alpha\) | 1.00 | 5.018 | 0.180 | 0.30 | 0.713 | 0.052 | 1.00 | 1.543 | 0.122 | 0.30 | 0.969 | 0.049 |
| \(\mu\) | 2.00 | 0.569 | 0.196 | 2.00 | 0.128 | 0.088 | 2.00 | 0.094 | 0.308 | 2.00 | 0.056 | 0.054 |
| \(\varphi _{1}\) | 1.00 | 23.470 | 0.559 | 1.00 | 30.419 | 0.598 | 1.00 | 4.416 | 0.444 | 1.00 | 7.977 | 0.296 |
| \(\varphi _{2}\) | 0.50 | 69.708 | 0.812 | 0.50 | 72.644 | 1.077 | 0.50 | 3.011 | 0.712 | 0.50 | 12.575 | 0.594 |
| \(\varphi _{3}\) | 1.00 | 23.311 | 1.479 | 1.00 | 27.248 | 0.508 | 1.00 | 3.736 | 0.966 | 1.00 | 10.600 | 0.838 |
| \(\alpha\) | 1.00 | 11.580 | 0.149 | 0.30 | 2.084 | 0.044 | 1.00 | 0.675 | 0.116 | 0.30 | 1.361 | 0.041 |
| \(\mu\) | 2.00 | 1.073 | 0.347 | 2.00 | 0.044 | 0.095 | 2.00 | 0.261 | 0.219 | 2.00 | 0.143 | 0.065 |
| \(\varphi _{1}\) | 0.50 | 13.250 | 0.417 | 0.50 | 10.198 | 0.392 | 0.50 | 3.457 | 0.380 | 0.50 | 1.860 | 0.226 |
| \(\varphi _{2}\) | 0.50 | 47.626 | 0.420 | 0.50 | 7.230 | 0.409 | 0.50 | 9.557 | 0.600 | 0.50 | 2.624 | 0.265 |
| \(\varphi _{3}\) | 1.50 | 4.095 | 1.365 | 1.50 | 28.613 | 0.875 | 1.50 | 7.062 | 0.999 | 1.50 | 7.850 | 1.048 |
| \(\alpha\) | 1.00 | 9.409 | 0.182 | 0.30 | 34.818 | 0.127 | 1.00 | 6.615 | 0.131 | 0.30 | 9.041 | 0.050 |
| \(\mu\) | 2.00 | 0.832 | 0.424 | 2.00 | 0.135 | 0.112 | 2.00 | 0.351 | 0.262 | 2.00 | 0.141 | 0.071 |
| \(\varphi _{1}\) | 1.00 | 22.382 | 0.688 | 1.00 | 58.626 | 0.704 | 1.00 | 9.877 | 0.600 | 1.00 | 6.292 | 0.609 |
| \(\varphi _{2}\) | 1.00 | 16.801 | 0.698 | 1.00 | 32.248 | 0.494 | 1.00 | 11.614 | 0.649 | 1.00 | 10.743 | 0.399 |
| \(\varphi _{3}\) | 1.00 | 29.993 | 1.441 | 1.00 | 25.988 | 0.722 | 1.00 | 6.278 | 1.088 | 1.00 | 6.915 | 0.854 |
| \(\alpha\) | 1.00 | 3.315 | 0.213 | 0.30 | 7.271 | 0.084 | 1.00 | 1.731 | 0.136 | 0.30 | 3.474 | 0.068 |
| \(\mu\) | 2.00 | 1.291 | 0.464 | 2.00 | 0.208 | 0.131 | 2.00 | 1.147 | 0.312 | 2.00 | 0.029 | 0.088 |
| \(\varphi _{1}\) | 0.50 | 13.031 | 0.513 | 0.50 | 0.375 | 0.799 | 0.50 | 6.876 | 0.406 | 0.50 | 4.700 | 0.419 |
| \(\varphi _{2}\) | 1.00 | 8.044 | 0.385 | 1.00 | 11.701 | 0.383 | 1.00 | 1.374 | 0.477 | 1.00 | 7.779 | 0.324 |
| \(\varphi _{3}\) | 1.50 | 11.971 | 1.455 | 1.50 | 26.713 | 0.862 | 1.50 | 11.409 | 1.010 | 1.50 | 1.422 | 1.048 |
| \(\alpha\) | 1.00 | 2.479 | 0.122 | 0.30 | 36.118 | 0.150 | 1.00 | 1.645 | 0.096 | 0.30 | 0.124 | 0.047 |
| \(\mu\) | 2.00 | 1.012 | 0.254 | 2.00 | 0.016 | 0.110 | 2.00 | 0.310 | 0.339 | 2.00 | 0.101 | 0.067 |
| \(\varphi _{1}\) | 1.00 | 12.815 | 0.434 | 1.00 | 49.457 | 0.593 | 1.00 | 2.893 | 0.401 | 1.00 | 8.174 | 0.312 |
| \(\varphi _{2}\) | 0.50 | 44.539 | 0.610 | 0.50 | 51.302 | 0.840 | 0.50 | 9.810 | 0.648 | 0.50 | 7.033 | 0.827 |
| \(\varphi _{3}\) | 1.50 | 19.714 | 1.927 | 1.50 | 10.903 | 0.570 | 1.50 | 2.042 | 1.105 | 1.50 | 0.266 | 1.052 |
| \(\alpha\) | 1.00 | 7.312 | 0.159 | 0.30 | 21.1904 | 0.094 | 1.00 | 6.414 | 0.176 | 0.30 | 10.473 | 0.051 |
| \(\mu\) | 2.00 | 0.318 | 0.456 | 2.00 | 0.335 | 0.136 | 2.00 | 0.622 | 0.331 | 2.00 | 0.207 | 0.093 |
| \(\varphi _{1}\) | 1.00 | 9.491 | 0.691 | 1.00 | 42.096 | 0.615 | 1.00 | 1.245 | 0.466 | 1.00 | 3.463 | 0.560 |
| \(\varphi _{2}\) | 1.00 | 8.964 | 0.528 | 1.00 | 28.573 | 1.121 | 1.00 | 1.951 | 0.910 | 1.00 | 9.094 | 0.328 |
| \(\varphi _{3}\) | 1.50 | 24.874 | 1.753 | 1.50 | 27.398 | 0.883 | 1.50 | 10.009 | 1.138 | 1.50 | 13.259 | 0.928 |


5 Numerical applications

5.1 Description of study and variables

The study was conducted during the crop year of 2012/2013 in a 167.35 ha non-experimental agricultural area for commercial use of grain production in Cascavel, a town in the western region of the state of Paraná, Brazil. The area has a geographic location of, approximately, 24.95\(^{\circ }\) south/53.57\(^{\circ }\) west, with an average altitude of 650 m. The soil is classified as Red Haplortox Oxisol, with a clay texture; see EMBRAPA (2009). The climate of the region is humid subtropical (Koeppen climatic type: Cfa) with an average annual temperature of 21 \(^{\circ }\)C.

Soil sampling locations were defined at points forming a regular square lattice with an inter-point distance of about 140 m. In addition, extra locations were chosen at random across the region. This leads to a total of \(n = 102\) locations, which were geo-referenced in UTM coordinates using a GPS device. To analyse the chemical properties, four soil samples were collected at depths from 0.0 to 0.2 m and mixed to produce a single representative sample at each location. The mixed soil samples were analysed by the Laboratory of the Cooperativa Central de Desenvolvimento Tecnológico e Econômico Ltda. (Coodetec, Brazil) to determine the concentration of macro and micro nutrients. Phosphorus concentration was chosen as the study variable, because it has a positive effect on plant growth and nutrition. Furthermore, its spatial variability is extremely important for agricultural management. According to COAMO/COODETEC (2001), the phosphorus concentration in clayey soil is classified as follows. For soybean planting, it is: (i) low for levels less than 3 mg/dm\(^{3},\) (ii) medium for levels between 3.1 and 6 mg/dm\(^{3},\) (iii) high for levels between 6.1 and 9 mg/dm\(^{3},\) and (iv) very high for levels greater than 9 mg/dm\(^{3}.\) For corn planting, it is: (i) low for levels less than 2 mg/dm\(^{3},\) (ii) medium for levels between 2.1 and 4.5 mg/dm\(^{3},\) (iii) high for levels between 4.6 and 11 mg/dm\(^{3},\) and (iv) very high for levels greater than 11 mg/dm\(^{3}.\)

5.2 Exploratory and spatial dependence analysis

The exploratory data analysis (EDA) for the phosphorus concentration is divided into two parts: non-spatial and spatial. From the non-spatial point of view, the sample mean of the phosphorus concentration in the soil of the area under study is 18.11 mg/dm\(^{3},\) whereas the corresponding sample coefficients of variation (CV), skewness (CS) and kurtosis (CK) are CV = 0.41 (41 %), CS = 1.787 and CK = 5.104. These descriptive statistics indicate a reasonable degree of homogeneity around the mean, a positive skewness and a high kurtosis level, as visualized in the box-plot of Fig. 6 (left). This box-plot detects four outliers, identified as cases #32, #53, #57 and #59. The circled points in Fig. 6 (right) identify these outlying cases, which are located in the lower part of the studied region. The non-spatial EDA supports the use of the BS distribution. From the spatial point of view, a standard analysis of sample variograms (omitted here), using the directions 0\(^{\circ },\,45^{\circ },\, 90^{\circ }\) and 135\(^{\circ },\) shows that the directional variograms have similar behaviour up to a distance of about 900 m. Therefore, we can assume that there is isotropy up to that distance.
Fig. 6

Box-plot (left) and model map (right) of the phosphorus concentration data

5.3 Parameter estimation and Kriging

To choose the best model for describing the spatial dependence structure of the phosphorus concentration, the cross-validation criterion and the maximum value of the log-likelihood function are considered. Note that the parameter \(\delta\) corresponds to the order of the variogram model in the Matérn family, which is not estimated, to avoid identifiability problems in the estimation of the covariance matrix parameters. Thus, several variogram models based on the Matérn family are fitted and then the model with the smallest cross-validation value is chosen. Once \(\delta\) is determined, the best Matérn model with parameter \(\delta\) is used to estimate \({\varvec{\theta }}\) by the ML method using a profile likelihood approach. Using this criterion, the best model corresponds to the Matérn family with parameter \(\delta =2.5\). The estimated BS model and variogram parameters with estimated asymptotic SEs (in parentheses) are: \(\widehat{\alpha }=0.997\,(3.521),\,\widehat{\mu }=2.807\,(0.082),\, {\widehat{\varphi }_{1}}=0.134\,(0.946),\,{\widehat{\varphi }}_{2}=0.020\, (0.142),\,{\widehat{\varphi }}_{3}=177.940\,(0.0000014),\) and \(\widehat{a}=1.053\,(0.0000083);\) see Table 4. Then, we obtain the fitted spatial map shown in Fig. 8 (left) for the soil phosphorus concentration by using the BS spatial model obtained from (16) and the ordinary Kriging interpolation described in (15). Observe that this map is obtained from the ML estimates provided above, the fitted scale and dependence matrix \(\widehat{\varvec{\Sigma }} = 0.134{\varvec{I}}_{n} + 0.020{\widehat{\varvec{R}}},\) with \(\widehat{\varvec{R}}\) given from (11), and \(\delta =2.5\). According to the classification provided by COAMO/COODETEC (2001) (see Sect. 5.1), note that the phosphorus concentration is considered very high for both soybean and corn planting, which is suitable for both crops.

5.4 Model selection

We compare the spatial BS and Gaussian models using the Akaike (AIC) and Schwarz Bayesian (BIC) information criteria. These are given by \(\mathrm{AIC} = -2\ell (\widehat{\varvec{\theta} })+2d\) and \(\mathrm{BIC} =-2\ell (\widehat{\varvec{\theta}})+d\log ({n}),\) where \(\ell (\widehat{\varvec{\theta} })\) is the log-likelihood function for the parameter \(\varvec{\theta}\) associated with the model evaluated at \(\varvec{\theta} =\widehat{\varvec{\theta} },\,d\) is the dimension of the parameter space, and n the size of the data set. Both criteria are based on the log-likelihood function with a penalty that grows as the model becomes more complex, that is, as it has more parameters. Thus, a model whose information criterion has a smaller value is better; see Ferreira et al. (2012), Leiva et al. (2015b) and references therein. Besides the AIC and BIC information criteria, the Bayes factor (BF) can also be used to compare the BS and Gaussian spatial models. The BF, denoted by \(B_{12},\) allows us to compare M1 (model considered as correct) to M2 (model to be contrasted with M1) by \(2\log (B_{12}) \approx \mathrm{BIC}_{\mathrm{M}_2}-\mathrm{BIC}_{\mathrm{M}_1},\) where \(\mathrm{BIC}_{\mathrm{M}_{j}}\) stands for the BIC associated with the model \(\mathrm{M}_{j}\), for \(j=1,\,2.\) The BF provides an objective value to quantify the degree of superiority of one model with respect to another. An interpretation of the BF is displayed in Table 2. Thus, according to Tables 2 and 3, the BS model is superior to the Gaussian model, with very strong evidence in its favor, when the data are not transformed by the logarithm.
Table 2

Interpretation of \(2\log (B_{12})\) associated with the BF

| \(2\log (B_{12})\) | Evidence in favor of \(\mathrm{M}_{1}\) |
|---|---|
| \(<\)0 | Negative (\(\mathrm{M}_{2}\) is accepted) |
| \([0,\,2)\) | Weak |
| \([2,\,6)\) | Positive |
| \([6,\,10)\) | Strong |
| \({\ge } 10\) | Very strong |

Table 3

Log-likelihood, AIC, BIC and \(2\log (B_{12})\) values in the indicated spatial model for phosphorus concentration data

| Model | \(\ell (\widehat{\theta })\) | AIC | BIC | \(2\log (B_{12})\) |
|---|---|---|---|---|
| BS (transformed data) | \({-}44.577\) | 99.154 | 112.279 | |
| Gaussian (transformed data) | \({-}43.890\) | 95.780 | 106.280 | |
| BS | \({-}332.576\) | 675.152 | 688.276 | 28.224 |
| Gaussian | \({-}349.000\) | 706.000 | 716.500 | |
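
The entries of Table 3 for the untransformed data can be reproduced, up to rounding, with the short sketch below. The log-likelihood values and \(n=102\) are taken from the text; the numbers of parameters (\(d=5\) for the BS model and \(d=4\) for the Gaussian model) are inferred from the reported AIC values and are therefore an assumption, as are the function names.

```python
from math import log


def aic(loglik, d):
    """AIC = -2 * loglik + 2 * d."""
    return -2 * loglik + 2 * d


def bic(loglik, d, n):
    """BIC = -2 * loglik + d * log(n)."""
    return -2 * loglik + d * log(n)


def evidence_for_m1(two_log_b12):
    """Verbal scale of Table 2 for 2 * log(B12)."""
    if two_log_b12 < 0:
        return "negative"
    if two_log_b12 < 2:
        return "weak"
    if two_log_b12 < 6:
        return "positive"
    if two_log_b12 < 10:
        return "strong"
    return "very strong"


n = 102
bic_bs = bic(-332.576, 5, n)        # M1: BS model, d = 5 (inferred)
bic_gauss = bic(-349.000, 4, n)     # M2: Gaussian model, d = 4 (inferred)
two_log_b12 = bic_gauss - bic_bs    # approximation 2 log(B12) = BIC_M2 - BIC_M1
label = evidence_for_m1(two_log_b12)
```

With these inputs the difference of the two BIC values is about 28.22, which falls in the last row of Table 2.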

5.5 Influence diagnostics

To evaluate the effect of atypical cases on the fitted spatial map shown in Fig. 8 (left), we carry out local influence diagnostics. By using a response variable perturbation scheme for the detection of influential cases, plots of \(C_{i}\) versus i and \(|d_{{\max_i }}|\) versus i are considered. Figure 7 shows that cases #2, #48 and #94 are identified as influential by both techniques. Note that these three cases are different from the outlying cases detected by the box-plot in Fig. 6 (left), which indicates the relevance of using the local influence method instead of a simple analysis employing the box-plot.
Fig. 7

Index plots of (left) \(C_{i}\) and (right) \(|d_{{\max_{i} }}|\) of phosphorus concentration data

Table 4 shows estimated model parameters and the corresponding estimated asymptotic SEs in parenthesis, for various spatial models fitted with the complete data set and subsets of it when the influential cases are removed, either individually or jointly. As usual in the local influence method, once one or more influential cases are identified, the cases are removed from the data set to investigate how their removal affects the model selection, the estimation of parameters and the construction of maps. Note that removal of the influential cases causes dramatic changes in the estimated spatial range, \(\widehat{a},\) especially when cases #32 and #48 are removed together. This is due to the considerable change in the estimate \(\widehat{\varphi }_{3}\) and a change in the estimated variogram model chosen to describe the spatial variability. Table 4 also shows the p-values for hypotheses of the form \(\mathrm{H}_{0}{\text {:}}\,\theta _{j} = 0\) versus \(\mathrm{H}_{1}{\text {:}}\, \theta _{j}\ne 0,\) where \(\theta _{j}\) is any of the parameters of the vector \({\varvec{\theta }}=(\alpha ,\,\mu ,\,\varphi _{1},\,\varphi _{2},\,\varphi _{3})^{\top }.\) To test H\(_{0},\) we can use the test statistic \(Z ={\widehat{\theta _{j}}}/\mathrm{SE}({\widehat{\theta _{j}}}),\) where \({\widehat{\theta _{j}}}\) is the ML estimator of \({\theta _{j}}\) and SE\({(\widehat{\theta _{j}})}\) the corresponding SE. The asymptotic distribution of Z is known to be normal; see results detailed in (23). From the p-values shown in this table, it follows that, in all models, the hypothesis H\(_{0}\) is rejected at a 5 % significance level for the parameters \(\mu\) and \(\varphi _{3},\) whereas H\(_{0}\) is not rejected in the tests for the parameters \(\alpha ,\,\varphi _{1}\) and \(\varphi _{2}.\) Note that no inferential changes are detected when removing the influential cases. 
Observe that, when the parameter space is the set of positive real numbers, H\(_{1}\) must be restricted to this set, and this restriction was taken into account when calculating the corresponding p-value. Note also that, for the Matérn model, the spatial dependence radius a is a function of \(\varphi _{3},\) that is, \(a = c \varphi _{3},\) where c is a constant that depends on \(\delta ,\) so that \(\widehat{a} = c \widehat{\varphi }_{3}\) and then \(\mathrm{SE}(\widehat{a}) = c \mathrm{SE}(\widehat{\varphi }_{3}).\)
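As a rough check of the testing procedure described above, the following Python sketch recomputes the p-values of the positive parameters for the complete-data fit from the estimates and SEs reported in Table 4, and recovers \(\mathrm{SE}(\widehat{a})\) from the relation \(\widehat{a} = c \widehat{\varphi }_{3}.\) The function name is illustrative, and the constant c is taken here as the ratio \(\widehat{a}/\widehat{\varphi }_{3}\) implied by the table rather than derived from the Matérn correlation function; both are assumptions made for this example.

```python
import math


def one_sided_p_value(estimate, se):
    """Return P(Z >= z) for Z ~ N(0, 1), with z = estimate / se.

    This corresponds to H0: theta = 0 against H1: theta > 0, the
    alternative used when the parameter space is the positive reals.
    """
    z = estimate / se
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Complete-data fit (row "None" of Table 4): (estimate, asymptotic SE).
# Only parameters restricted to positive values are listed here.
fit = {
    "alpha": (0.997, 3.521),
    "phi1": (0.134, 0.946),
    "phi2": (0.020, 0.142),
}
p_values = {name: one_sided_p_value(est, se) for name, (est, se) in fit.items()}

# Spatial range: a = c * phi3, hence SE(a) = c * SE(phi3).
phi3_hat, se_phi3 = 177.940, 0.0000014
a_hat = 1053.405
c = a_hat / phi3_hat  # constant implied by the table (assumption)
se_a = c * se_phi3

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
print(f"c = {c:.2f}, SE(a) = {se_a:.7f}")
```

Under these assumptions, the p-values round to 0.389, 0.444 and 0.444, and \(\mathrm{SE}(\widehat{a})\) rounds to 0.0000083, which agrees with the corresponding entries of Table 4 for the complete data set.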

Figure 8 displays contour maps of the soil phosphorus concentration using the BS spatial model and ordinary Kriging interpolation. The maps were created based on two scenarios: (i) using the complete data set (reference map) and (ii) removing the influential cases #2, #48 and #94, individually and jointly. To construct the maps, we consider five classes obtained by dividing the range of estimated phosphorus concentration into five intervals of equal width. Note that the removal of the cases individually does not change the map significantly. However, the joint removal of the cases causes a dramatic change. A more objective comparison of the maps is carried out using the GA and \(\kappa\) indexes; see Table 4. Thus, when the influential cases are removed individually, the model maps are similar to the reference map displayed in Fig. 6 (right). Nevertheless, this does not occur when the cases are removed jointly. Using the classification of Krippendorff (2004) for the \(\kappa\) index, note the following: (i) the model map created removing the influential case #48 has a high similarity to the reference map (\(\kappa = 0.92\)), (ii) the maps created removing case #2 or case #94 individually have a medium similarity (\(\kappa = 0.72\) and \(\kappa = 0.69,\) respectively), and (iii) the other maps, all obtained by removing cases jointly, have a low similarity (\(\kappa \le 0.60\)), which suggests that the cases are only jointly influential.
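The map comparison just described can be sketched in Python as follows. The sketch classifies two sets of predicted values into five equal-width classes, cross-tabulates them, and computes GA as the proportion of locations assigned to the same class and \(\kappa\) as the chance-corrected agreement (Cohen's kappa). The toy prediction values, the function names, the use of the reference map's range to define the class limits, and the cut-offs 0.67 and 0.80 separating low, medium and high similarity are assumptions made for illustration; they are not taken from the paper.

```python
def equal_width_classes(values, lo, hi, k=5):
    """Assign each value to one of k equal-width classes on [lo, hi]."""
    width = (hi - lo) / k
    classes = []
    for v in values:
        idx = int((v - lo) / width)
        classes.append(min(max(idx, 0), k - 1))  # clamp values at or beyond the limits
    return classes


def confusion_matrix(ref_classes, mod_classes, k=5):
    """Cross-tabulate reference classes (rows) against model classes (columns)."""
    matrix = [[0] * k for _ in range(k)]
    for r, m in zip(ref_classes, mod_classes):
        matrix[r][m] += 1
    return matrix


def global_accuracy_and_kappa(matrix):
    """Return (GA, kappa) computed from a square confusion matrix."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    p_obs = sum(matrix[i][i] for i in range(k)) / n
    row_tot = [sum(matrix[i]) for i in range(k)]
    col_tot = [sum(row[j] for row in matrix) for j in range(k)]
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return p_obs, (p_obs - p_exp) / (1.0 - p_exp)


def similarity_label(kappa, medium=0.67, high=0.80):
    """Label agreement using assumed cut-offs (not stated in the text)."""
    if kappa >= high:
        return "high"
    return "medium" if kappa >= medium else "low"


# Hypothetical predictions at ten grid locations (not the paper's data).
reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
reduced = [1.0, 3.0, 3.0, 4.0, 5.0, 6.0, 6.0, 8.0, 9.0, 10.0]

lo, hi = min(reference), max(reference)
ref_cls = equal_width_classes(reference, lo, hi)
red_cls = equal_width_classes(reduced, lo, hi)
ga, kappa = global_accuracy_and_kappa(confusion_matrix(ref_cls, red_cls))
print(f"GA = {ga:.2f}, kappa = {kappa:.2f}, similarity = {similarity_label(kappa)}")
```

With the assumed cut-offs, the labels reproduce the grouping reported in the text for the \(\kappa\) values of Table 4: 0.92 is high, 0.72 and 0.69 are medium, and values of 0.60 or below are low.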
Fig. 8

Shaded contour plots showing the effects of removing the indicated case(s) for phosphorus concentration data

Table 4

ML estimates of the indicated model parameters with estimated asymptotic SEs (in parentheses) and p-values [in brackets], and values of the GA and \(\kappa\) indexes for the phosphorus concentration data set with the indicated dropped case(s)

| Dropped case(s) | Model | \(\widehat{\alpha }\) | \(\widehat{\mu }\) | \(\widehat{\varphi _{1}}\) | \(\widehat{\varphi _{2}}\) | \(\widehat{\varphi _{3}}\) | \(\widehat{a}\) | GA | \(\kappa\) |
|---|---|---|---|---|---|---|---|---|---|
| None | Matérn \(\delta =2.5\) | 0.997 | 2.807 | 0.134 | 0.020 | 177.940 | 1053.405 | | |
| | | (3.521) | (0.082) | (0.946) | (0.142) | (0.0000014) | (0.0000083) | | |
| | | [0.389] | [\({<}0.001\)] | [0.444] | [0.444] | [\({<}0.001\)] | [\({<}0.001\)] | | |
| #2 | Matérn \(\delta =2.5\) | 0.993 | 2.831 | 0.125 | 0.016 | 108.655 | 643.238 | 0.90 | 0.72 |
| | | (4.097) | (0.059) | (1.031) | (0.132) | (0.0000008) | (0.0000047) | | |
| | | [0.404] | [\({<}0.001\)] | [0.452] | [0.452] | [\({<}0.001\)] | [\({<}0.001\)] | | |
| #48 | Matérn \(\delta =2.5\) | 0.996 | 2.824 | 0.125 | 0.019 | 152.374 | 902.054 | 0.97 | 0.92 |
| | | (3.417) | (0.073) | (0.856) | (0.133) | (0.0000012) | (0.0000071) | | |
| | | [0.386] | [\({<}0.001\)] | [0.442] | [0.443] | [\({<}0.001\)] | [\({<}0.001\)] | | |
| #94 | Matérn \(\delta =1\) | 0.997 | 2.817 | 0.122 | 0.028 | 308.828 | 1235.312 | 0.84 | 0.69 |
| | | (3.417) | (0.073) | (0.856) | (0.133) | (0.0000012) | (0.0009092) | | |
| | | [0.374] | [\({<}0.001\)] | [0.436] | [0.437] | [\({<}0.001\)] | [\({<}0.001\)] | | |
| #2, #48 | Matérn \(\delta =0.5\) (Exponential) | 0.991 | 2.845 | 0.071 | 0.060 | 81.052 | 243.156 | 0.65 | 0.31 |
| | | (6.149) | (0.046) | (0.882) | (0.746) | (0.0000884) | (0.0002652) | | |
| | | [0.436] | [\({<}0.001\)] | [0.468] | [0.469] | [\({<}0.001\)] | [\({<}0.001\)] | | |
| #2, #94 | Matérn \(\delta =0.5\) (Exponential) | 0.995 | 2.840 | 0.097 | 0.038 | 182.059 | 546.177 | 0.72 | 0.45 |
| | | (3.308) | (0.063) | (0.644) | (0.254) | (0.0002974) | (0.0008922) | | |
| | | [0.382] | [\({<}0.001\)] | [0.440] | [0.441] | [\({<}0.001\)] | [\({<}0.001\)] | | |
| #48, #94 | Matérn \(\delta =2.5\) | 0.994 | 2.834 | 0.114 | 0.024 | 144.768 | 856.737 | 0.80 | 0.60 |
| | | (2.980) | (0.077) | (0.683) | (0.145) | (0.0000010) | (0.0000059) | | |
| | | [0.370] | [\({<}0.001\)] | [0.434] | [0.435] | [\({<}0.001\)] | [\({<}0.001\)] | | |
| #2, #48, #94 | Matérn \(\delta =0.5\) (Exponential) | 0.982 | 2.855 | 0.085 | 0.041 | 143.699 | 431.097 | 0.66 | 0.35 |
| | | (3.442) | (0.056) | (0.597) | (0.287) | (0.0002051) | (0.0006153) | | |
| | | [0.388] | [\({<}0.001\)] | [0.444] | [0.443] | [\({<}0.001\)] | [\({<}0.001\)] | | |

6 Conclusions and future work

We have proposed a novel spatial log-linear model based on the BS distribution. This is an alternative to the Gaussian model for describing data with a spatial dependence structure and, most importantly, with a positively skewed distribution. ML estimates of the model parameters were calculated using an iterative approach. Local influence diagnostics for the new model were derived, and the corresponding equations for the most appropriate perturbation were obtained. We evaluated the performance of the estimation procedure and diagnostic tools by simulation. For large samples, both estimation and diagnostics perform well. The proposed approach was also used to analyse real-world agricultural engineering data. In this application, influential cases were detected, and their removal caused a considerable change in the spatial dependence radius and in the spatial maps. Importantly, not all of these influential cases would have been identified as traditional outliers.

Some possible issues to be addressed in future studies are the following. First, because the BS distribution is based on the normal distribution, a heavy-tailed version based, for example, on the Student-t distribution can be considered, thereby reducing the influence of atypical data, which can have an adverse effect on spatial maps. Second, explanatory variables may be considered in the spatial modelling, which can help to improve its predictive power. Third, other perturbation schemes could be considered to assess the influence of atypical data. Fourth, we could consider more than one random variable in the spatial modelling by means of multivariate structures for the BS distribution. Work on these four issues is currently in progress and we hope to report some findings in a future paper.

Notes

Acknowledgments

The authors thank the Editors and anonymous referees for their constructive comments on an earlier version of the manuscript, which resulted in this improved version. We are grateful to Carolina Brianezi-Melchior, who translated this work into English, from its original Portuguese. This research work was partially supported by CNPq Grants from the Brazilian Government, and by FONDECYT 1120879 Grant from the Chilean Government.

References

  1. Anderson J, Hardy E, Roach J, Witmer R (1976) A land use and land cover classification system for use with remote sensor data. Technical Report Paper 964. US Geological Survey Professional, Washington, DC
  2. Assumpção R, Uribe-Opazo M, Galea M (2011) Local influence for spatial analysis of soil physical properties and soybean yield using Student-t distribution. Rev Bras Ciênc Solo 35:1917–1926
  3. Assumpção R, Uribe-Opazo M, Galea M (2014) Analysis of local influence in geostatistics using Student-t distribution. J Appl Stat 41:2323–2341
  4. Azzalini A, Capitanio A (1999) Statistical applications of the multivariate skew-normal distribution. J R Stat Soc B 61:579–602
  5. Birnbaum Z, Saunders S (1969) A new family of life distributions. J Appl Probab 6:319–327
  6. Borssoi J, De Bastiani F, Uribe-Opazo M, Galea M (2011) Local influence of explanatory variables in Gaussian spatial linear models. Chil J Stat 2:29–38
  7. Cambardella C, Moorman T, Novak J, Parkin T, Karlen D, Turco R, Konopka A (1994) Field-scale variability of soil properties in central Iowa soils. Soil Sci Soc Am J 58:1501–1511
  8. COAMO/COODETEC (2001) Soil fertility and plant nutrition. Technical report. Cooperativa Agropecuaria Mouraoense Ltda./Development Center Technological and Economic Cooperative Ltda. (COAMO/COODETEC), Cascavel, Brazil
  9. Cook R (1987) Influence assessment. J Appl Stat 14:117–131
  10. Davis D (1952) An analysis of some failure data. J Am Stat Assoc 47:113–150
  11. De Bastiani F, Cysneiros A, Uribe-Opazo M, Galea M (2015) Influence diagnostics in elliptical spatial linear models. TEST 24:322–340
  12. Diggle P, Ribeiro P (2007) Model-based geostatistics. Springer, New York
  13. EMBRAPA (2009) Brazilian system of soil classification. Technical report. Brazilian Enterprise for National Agricultural Research/Centre of Soil Research (EMBRAPA/CPI), Rio de Janeiro, Brazil
  14. Ferreira M, Gomes M, Leiva V (2012) On an extreme value version of the Birnbaum–Saunders distribution. Revstat Stat J 10:181–210
  15. Galea M, Leiva V, Paula G (2004) Influence diagnostics in log-Birnbaum–Saunders regression models. J Appl Stat 31:1049–1064
  16. Gimenez P, Galea M (2013) Influence measures on corrected score estimators in functional heteroscedastic measurement error models. J Multivar Anal 114:1–15
  17. Grzegozewski D, Uribe-Opazo M, De Bastiani F, Galea M (2013) Local influence when fitting Gaussian spatial linear models: an agriculture application. Ciênc Investig Agrár 40:235–252
  18. Isaaks E, Srivastava R (1989) An introduction to applied geostatistics. Oxford University Press, Oxford
  19. Johnson N, Kotz S, Balakrishnan N (1994) Continuous univariate distributions, vol 1. Wiley, New York
  20. Johnson N, Kotz S, Balakrishnan N (1995) Continuous univariate distributions, vol 2. Wiley, New York
  21. Kendrick D (2002) Stochastic control for economic models. McGraw Hill, New York
  22. Krige D (1951) A statistical approach to some basic mine valuation problems on the Witwatersrand. J Chem Metall Min Soc S Afr 52:119–139
  23. Krippendorff K (2004) Content analysis: an introduction to its methodology. Sage, Thousand Oaks
  24. Lange K (2001) Numerical analysis for statisticians. Springer, New York
  25. Lange K, Little J, Taylor M (1989) Robust statistical modeling using the \(t\) distribution. J Am Stat Assoc 84:881–896
  26. Leiva V, Barros M, Paula G, Sanhueza A (2008) Generalized Birnbaum–Saunders distribution applied to air pollutant concentration. Environmetrics 19:235–249
  27. Leiva V, Sanhueza A, Angulo JM (2009) A length-biased version of the Birnbaum–Saunders distribution with application in water quality. Stoch Environ Res Risk Assess 23:299–307
  28. Leiva V, Rojas E, Galea M, Sanhueza A (2014) Diagnostics in Birnbaum–Saunders accelerated life models with an application to fatigue data. Appl Stoch Models Bus Ind 30:115–131
  29. Leiva V, Marchant C, Ruggeri F, Saulo H (2015a) A criterion for environmental assessment using Birnbaum–Saunders attribute control charts. Environmetrics 26:463–476
  30. Leiva V, Tejo M, Guiraud P, Schmachtenberg O, Orio P, Marmolejo F (2015b) Modeling neural activity with cumulative damage distributions. Biol Cybern 109:421–433
  31. Leiva V, Ferreira M, Gomes M, Lillo C (2016a) Extreme value Birnbaum–Saunders regression models applied to environmental data. Stoch Environ Res Risk Assess. doi: 10.1007/s00477-015-1069-6
  32. Leiva V, Liu S, Shi L, Cysneiros F (2016b) Diagnostics in elliptical regression models with stochastic restrictions applied to econometrics. J Appl Stat 43:627–642
  33. Leiva V, Santos-Neto M, Cysneiros F, Barros M (2016c) A methodology for stochastic inventory models based on a zero-adjusted Birnbaum–Saunders distribution. Appl Stoch Models Bus Ind 32:74–89
  34. Liu S, Leiva V, Ma T, Welsh A (2016) Influence diagnostic analysis in the possibly heteroskedastic linear model with exact restrictions. Stat Methods Appl. doi: 10.1007/s10260-015-0329-4
  35. Marchant C, Leiva V, Cavieres M, Sanhueza A (2013) Air contaminant statistical distributions with application to PM10 in Santiago, Chile. Rev Environ Contam Toxicol 223:1–31
  36. Marchant C, Leiva V, Cysneiros F (2016) A multivariate log-linear model for Birnbaum–Saunders distributions. IEEE Trans Reliab. doi: 10.1109/TR.2015.2499964
  37. Mardia K, Marshall R (1984) Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 71:135–146
  38. Militino A, Palacios M, Ugarte M (2006) Outliers detection in multivariate spatial linear models. J Stat Plan Inference 136:125–146
  39. Muirhead R (1982) Aspects of multivariate statistical theory. Wiley, New York
  40. Müller W, Stehlík M (2009) Issues in the optimal design of computer simulation experiments. Appl Stoch Models Bus Ind 25:163–177
  41. Müller W, Stehlík M (2010) Compound optimal spatial designs. Environmetrics 21:354–364
  42. Nocedal J, Wright S (1999) Numerical optimization. Springer, New York
  43. Ortega E, Bolfarine H, Paula G (2003) Influence diagnostics in generalized log-gamma regression models. Comput Stat Data Anal 42:165–186
  44. Podlaski R (2008) Characterization of diameter distribution data in near-natural forests using the Birnbaum–Saunders distribution. Can J For Res 38:518–527
  45. R-Team (2015) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
  46. Rieck J, Nedelman J (1991) A log-linear model for the Birnbaum–Saunders distribution. Technometrics 33:51–60
  47. Saulo H, Leiva V, Ziegelmann F, Marchant C (2013) A nonparametric method for estimating asymmetric densities based on skewed Birnbaum–Saunders distributions applied to environmental data. Stoch Environ Res Risk Assess 27:1479–1491
  48. Uribe-Opazo M, Borssoi J, Galea M (2012) Influence diagnostics in Gaussian spatial linear models. J Appl Stat 39:615–630
  49. Vilca F, Sanhueza A, Leiva V, Christakos G (2010) An extended Birnbaum–Saunders model and its application in the study of environmental quality in Santiago, Chile. Stoch Environ Res Risk Assess 24:771–782
  50. Villegas C, Paula G, Leiva V (2011) Birnbaum–Saunders mixed models for censored reliability data analysis. IEEE Trans Reliab 60:748–758
  51. Waller L, Gotway C (2004) Applied spatial statistics for public health data. Wiley, Hoboken
  52. Xia J, Zeephongsekul P, Packer D (2011) Spatial and temporal modelling of tourist movements using semi-Markov processes. Tour Manag 32:844–851
  53. Zhu H, Ibrahim J, Lee S, Zhang H (2007) Perturbation selection and influence measures in local influence analysis. Ann Stat 35:2565–2588
  54. Zhu H, Lee S (2001) Local influence for incomplete-data models. J R Stat Soc B 63:111–126

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Fabiana Garcia-Papani (1)
  • Miguel Angel Uribe-Opazo (1)
  • Victor Leiva (2)
  • Robert G. Aykroyd (3)
  1. Postgraduate Program in Agricultural Engineering, Centre of Exact Sciences and Technology, Universidade Estadual do Oeste do Paraná, Cascavel, Brazil
  2. Faculty of Engineering and Sciences, Universidad Adolfo Ibáñez, Viña del Mar, Chile
  3. Department of Statistics, University of Leeds, Leeds, UK
