Appendix 1
Asymptotic results can be proved for spatial processes observed at finitely many locations in the sampling region. In this case we deal with an increasing domain setup, where the sampling region is unbounded. In the sequel we suppose that the mean function of the random field is known and, without loss of generality, equal to zero. More precisely, we consider a weakly dependent random field \(\{Z(s), s\in S\}\) defined over an arbitrary lattice \(S\) in \(\mathbb {R}^{d}\) that is not necessarily regular. The lattice \(S\) is equipped with the metric \(\delta (s_k,s_l)=\max _{1\le m \le d}|s_{k,m}-s_{l,m}|\) and the distance between any subsets \(A,B\subset S\) is defined as \(\delta (A,B)=\inf \{\delta (s_k,s_l): \, s_k\in A\text { and } s_l\in B\}\). We denote
$$\begin{aligned} \alpha (U,V)&= \sup _{A,B}\{|P(A\cap B)-P(A)P(B)|:\, A\in \mathcal {F}(U),\\&B\in \mathcal {F}(V)\}, \end{aligned}$$
where \(\mathcal {F}(E)\) is the \(\sigma \)-algebra generated by the random variables \(\{Z(s),\, s\in E\}\). The \(\alpha \)-mixing coefficient (Doukhan 1994) for the random field \(\{Z(s), s\in S\}\) is defined as
$$\begin{aligned} \alpha (a,b,m)=\sup _{U,V}\{\alpha (U,V):\,|U|< a,\, |V| <b , \,\delta (U,V) \ge m\}, \end{aligned}$$
where \(|C|\) denotes the cardinality of the set \(C\).
We make the following assumptions:
- C1:: \(S\) is infinite and locally finite: for all \(s\in S\) and \(r>0\), \(|\mathcal {B}(s,r)\cap S|=O(r^{d})\), where \(\mathcal {B}(s,r)\) is the \(d\)-dimensional ball with center \(s\) and radius \(r\); moreover there exists a set of neighbourhoods, \(V_s\subset S\), such that \(|V_s|\) is uniformly bounded.
- C2:: \(D_n\) is an increasing sequence of finite subsets of \(S\): \(d_n=|D_n|\rightarrow \infty \) as \(n\rightarrow \infty \).
- C3:: \(Z\) is a Gaussian random field with covariance function \(C(h;\theta )\), where \(\theta \in \Theta \) and \(\Theta \) is a compact subset of \(\mathbb {R}^p\). The function \(\theta \mapsto C(h;\theta )\) has continuous second-order partial derivatives with respect to \(\theta \in \Theta \); these functions are continuous with respect to \(h\), and \(\inf _{ \theta \in \Theta }C(h;\theta )> 0\);
- C4:: The true unknown value of the parameter \(\theta \), namely \(\theta ^*\), is an interior point of \(\Theta \).
- C5:: The Gaussian random field is \(\alpha \)-mixing with mixing coefficient \(\alpha (m)=\alpha (\infty ,\infty ,m)\) satisfying:
- (C5a): \(\exists \,\eta >0\) such that \(\sum _{k,l\in D_{n}}\alpha (\delta (k,l))^{ \frac{\eta }{2+\eta }}=O(d_{n})\),
- (C5b): \(\sum _{m\ge 0}m^{d-1}\alpha (m)<\infty \);
- C6:: Let
$$\begin{aligned} g_k(Y(k);\theta )=-(1/2)\sum _{l\in V_k, l\ne k}l_{(k,l)}(\theta ) \end{aligned}$$
with \(Y(k)=(Z(s), s\in V_k)\) and \(l_{(k,l)}=l_{kl}\), \(l_{k|l}\) or \(l_{k-l}\), where the functions \(l_{kl}\), \(l_{k|l}\) and \(l_{k-l}\) are defined as in (10), (11) and (12), respectively. The composite likelihood is defined as
$$\begin{aligned}&Q_n(\theta )=\frac{1}{d_n}\sum _{k\in D_n}g_k(Y(k);\theta ); \end{aligned}$$
(28)
and the composite likelihood estimator is given by
$$\begin{aligned}&\widehat{\theta }_n=\mathrm{argmin }_{\theta \in \Theta }Q_n(\theta ). \end{aligned}$$
- C7:: The function \(\overline{Q}_n(\theta )=\mathrm {E}_{\theta ^*}[Q_n(\theta )]\) has a unique global minimum over \(\Theta \) at the true value \(\theta ^*\).
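To make the definition in C6 concrete, here is a minimal numerical sketch (not the paper's code) of the estimator \(\widehat{\theta }_n\) minimizing (28), assuming a zero-mean Gaussian field with exponential correlation \(\rho (h)=\exp (-\Vert h\Vert /\phi )\), the marginal pairwise term \(l_{kl}\), and neighbourhoods \(V_k\) given by a distance cut-off (all illustrative assumptions):

```python
# Sketch: pairwise composite likelihood (28) for a zero-mean Gaussian field
# with exponential correlation rho(h) = exp(-||h||/phi) (illustrative model).
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)

# Irregular lattice D_n in R^2 and a field simulated from the true model.
sites = rng.uniform(0, 10, size=(100, 2))
sigma2_true, phi_true = 1.0, 2.0
H = cdist(sites, sites)
Z = rng.multivariate_normal(np.zeros(100), sigma2_true * np.exp(-H / phi_true))

# Ordered pairs (k, l) with 0 < delta(k, l) <= cut-off: the neighbourhoods V_k.
cutoff = 3.0
k_idx, l_idx = np.where((H > 0) & (H <= cutoff))
h = H[k_idx, l_idx]

def Q_n(theta):
    """Average pairwise negative log-likelihood Q_n(theta), theta = (sigma2, phi)."""
    sigma2, phi = theta
    if sigma2 <= 0 or phi <= 0:
        return np.inf
    rho = np.exp(-h / phi)
    quad = (Z[k_idx]**2 + Z[l_idx]**2 - 2 * rho * Z[k_idx] * Z[l_idx]) \
           / (sigma2 * (1 - rho**2))
    # -2 log of the bivariate normal density, dropping the 2*log(2*pi) constant
    ll = 2 * np.log(sigma2) + np.log(1 - rho**2) + quad
    return 0.5 * ll.mean()

res = minimize(Q_n, x0=[0.5, 1.0], method="Nelder-Mead")
print(res.x)  # estimates of (sigma2, phi)
```

Under C1, the distance cut-off keeps \(|V_k|\) uniformly bounded as the sampling domain grows.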
Remarks
1. Assumptions C1 and C2 are quite general. For instance we can consider a rectangular lattice, as in Shaby and Ruppert (2012): \(D_n\subset \Delta \mathbb {Z}^d\) for a fixed \(\Delta >0\), with \(D_n \subset D_{n+1}\) for all \(n\).
2. The \(\alpha \)-mixing assumption C5 is a bit hard to check in general. It is satisfied when we consider compactly supported correlation functions, such as the taper functions (2) and (3). When we consider a rectangular lattice, the condition is satisfied for a stationary Gaussian random field with correlation function \(C(h;\theta )=O(\Vert h\Vert ^{-c})\), for some \(c> d\), and a spectral density bounded below (Doukhan 1994, Corollary 2, p. 59). In our examples this condition is satisfied by the exponential model.
3. Assumption C7 is an identifiability condition. For each \(s\), the function \(\mathrm {E}_{\theta ^*}[g_s(Y_s;\theta ) ]\) has a global minimum at \(\theta ^*\) according to the Kullback-Leibler inequality, but in the multidimensional case (\(p >1\)) \(\theta ^*\) fails, in general, to be the unique minimizer. Assumption C7 is not satisfied, for instance, when we consider a rectangular lattice and a compactly supported correlation function, see (2) and (3) for instance, with \(\Delta > d\). In the case of the covariance models (20), (21) and (22) the condition is clearly satisfied.
4. Assumption C6 is satisfied if we use a cut-off weight function for \(w_{kl}\).
5. Any individual log-likelihood \(l_{(k,l)}\) can be written as
$$\begin{aligned}&l_{(k,l)}=c_{1}(\theta ,k-l)+c_{2}(\theta , k-l)Z_k^2+c_{3}(\theta ,k-l)Z_l^2 \\&\quad +c_{4}(\theta ,k-l)Z_k Z_l, \end{aligned}$$
where the functions \(c_{i}, i=1,\ldots ,4\) are \(\mathcal {C}^{2}\) functions with respect to \(\theta \).
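As a concrete instance (our illustrative convention, taking \(l=-2\log f\) up to an additive constant): for the marginal pairwise term \(l_{kl}\) of a zero-mean Gaussian pair one can take \(c_1=2\log \sigma ^2+\log (1-\rho ^2)\), \(c_2=c_3=1/(\sigma ^2(1-\rho ^2))\) and \(c_4=-2\rho /(\sigma ^2(1-\rho ^2))\), each \(\mathcal {C}^{2}\) in \(\theta \). A quick numerical sanity check:

```python
# Check that -2 log of the bivariate normal density matches the quadratic-form
# decomposition c1 + c2*zk^2 + c3*zl^2 + c4*zk*zl (constants dropped).
import numpy as np

def neg2_log_density(zk, zl, sigma2, rho):
    """-2 log zero-mean bivariate normal density, dropping the 2*log(2*pi) term."""
    quad = (zk**2 + zl**2 - 2 * rho * zk * zl) / (sigma2 * (1 - rho**2))
    return 2 * np.log(sigma2) + np.log(1 - rho**2) + quad

sigma2, rho, zk, zl = 1.5, 0.4, 0.7, -1.2
c1 = 2 * np.log(sigma2) + np.log(1 - rho**2)
c2 = c3 = 1 / (sigma2 * (1 - rho**2))
c4 = -2 * rho / (sigma2 * (1 - rho**2))
rhs = c1 + c2 * zk**2 + c3 * zl**2 + c4 * zk * zl
print(np.isclose(neg2_log_density(zk, zl, sigma2, rho), rhs))  # True
```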
Consistency
Given the previous assumptions C1-C7, \(\widehat{\theta }_n\) is a consistent estimator of \(\theta ^*\) provided that \(\sup _{\theta \in \Theta }|Q_n(\theta )-\overline{Q}_n(\theta )|\rightarrow 0\) in probability, as \(n\rightarrow \infty \). According to Corollary 2.2 in Newey (1991), we have to prove that
1. for each \(\theta \in \Theta \), \(Q_n(\theta )-\overline{Q}_n(\theta )\rightarrow 0\) in probability, as \(n\rightarrow \infty \);
2. for \(M_n=O_p(1)\),
$$\begin{aligned} |\overline{Q}_n(\theta ') - \overline{Q}_n(\theta )|\le M_n \Vert \theta ' -\theta \Vert . \end{aligned}$$
We sketch the proof for \(l_{(k,l)}=l_{kl}\); the same arguments apply to the other sub-likelihoods, using the fifth remark.
1. We prove that \(\sup _{k\in D_n}\mathrm {E}[(\sup _{\theta \in \Theta } g_k(Y(k);\theta ))^{2+\eta }]<\infty \), for \(\eta >0\). In fact, we have
$$\begin{aligned} g_k(Y(k);\theta )&= \frac{1}{2}\sum _{l\in V_k, l\ne k}\left\{ 2\log \sigma ^2+\log (1-\rho ^2_{kl}) \right. \\&\left. +\frac{Z_k^2+Z_l^2-2 \rho _{kl} Z_kZ_l}{\sigma ^2(1-\rho ^2_{kl} )}\right\} \\&\le \sum _{l\in V_k, l\ne k}\log \sigma ^2+\frac{1}{2}\log (1-\rho ^2_{kl})\\&+\frac{Z_k^2+Z_l^2}{\sigma ^2(1-\rho ^2_{kl} )} \\&\le c_1 |V_k|\log \sigma ^2+c_2|V_k|Z_k^2\\&+c_2\sum _{l\in V_k, l\ne k} Z_l^2 \end{aligned}$$
and \(|V_k| \) is uniformly bounded according to assumption C1. These uniformly bounded moments of \(g_k(Y(k);\theta )\) entail uniform \(L^1\) integrability of \(g_k\), and together with assumption C5 we obtain (Jenish and Prucha 2009, Theorem 3)
$$\begin{aligned}&Q_n(\theta )-\overline{Q}_n(\theta )=d_n^{-1}\sum _{k\in D_n} \left\{ g_k(Y(k),\theta ) \right. \\&\quad \left. -\mathrm {E}_\theta [g_k(Y(k),\theta )]\right\} \rightarrow 0, \text { in probability} \end{aligned}$$
2. We have
$$\begin{aligned}&|g_k(Y(k);\theta ')-g_k(Y(k);\theta ) | \\&\quad =\frac{1}{2}\sum _{l\in V_k, l\ne k} \left| 2\log \frac{\sigma ^{'2}}{\sigma ^{2}}+\log \frac{1-\rho ^{'2}_{kl}}{1-\rho ^{2}_{kl}} \right. \\&\quad \quad \quad +(Z^2_k+Z^2_l)\left[ \frac{1}{\sigma ^{'2}(1-\rho ^{'2}_{kl} )}-\frac{1}{\sigma ^2(1-\rho ^2_{kl} )}\right] \\&\quad \quad \quad - \left. 2Z_kZ_l\left[ \frac{\rho ^{'}_{kl} }{\sigma ^{'2}(1-\rho ^{'2}_{kl} )}-\frac{\rho _{kl} }{\sigma ^2(1-\rho ^2_{kl} )}\right] \right| \\&\quad \le c_1 |V_k|\Vert \theta '- \theta \Vert +c_2 (|V_k| Z_k^2\\&\quad \quad +\sum _{l\in V_k, l\ne k} Z_l^2)\Vert \theta '- \theta \Vert \\&|\overline{Q}_n(\theta ') - \overline{Q}_n(\theta )| \\&\quad \le d_n^{-1}\sum _{k\in D_n}| q_k(\theta ') - q_k(\theta ) | \\&\quad \le c_3 d_n^{-1}\sum _{k\in D_n}(1+Z_k^2+\sum _{l\in V_k, l\ne k}Z_l^2)\Vert \theta '- \theta \Vert \\&\quad =M_n\Vert \theta '- \theta \Vert \end{aligned}$$
for some positive constants \(c_1, c_2\) and \(c_3\) and \(M_n=c_3 d_n^{-1}\sum _{k\in D_n}(1+Z_k^2+\sum _{l\in V_k, l\ne k}Z_l^2)\). Since \(\mathrm {E}_\theta [M_n]<\infty \), we obtain the desired result.
Asymptotic normality
We make the additional assumption:
- N1:: there exist two symmetric nonnegative definite matrices \(H\) and \(J\) such that, for large \(n\),
$$\begin{aligned}&J_{n}=\mathrm {var}_{\theta ^*}(\sqrt{d_{n}}\nabla Q_{n}(\theta ^* ))\ge J\,\, \hbox {and} \,\,H_{n} \\&\quad =\mathrm {E}_{\theta ^* }(\nabla ^2Q_{n}(\theta ^* ))\ge H, \end{aligned}$$
where, for two symmetric matrices \(A\) and \(B\), \(A \ge B\) means that \(A-B\) is a nonnegative definite matrix.
We note that, because \(g_{s}\) is \(\mathcal {C}^{2}\) and \(\Theta \) is compact, there exists a random variable \(h(Y(s))\) with \(\mathrm {E}_{\theta }(h(Y(s)))<\infty \) satisfying:
$$\begin{aligned} \left| \frac{\partial ^2}{\partial \theta _k\partial \theta _l} g_{s}(Y(s),\theta )\right| ^2 \le h(Y(s)). \end{aligned}$$
Moreover, for all \(s\in S\), \(\mathrm {E}_{\theta }[\frac{\partial }{\partial \theta _k} g_{s}(\theta )]=0\), because \(g_s\) is a sum of log-likelihoods, and it is easy to show that \(\sup _{s\in S,\,\theta \in \Theta }\mathrm {E}_\theta \left[ \left| \frac{\partial }{\partial \theta _k} g_{s}(\theta )\right| ^{2+\eta }\right] <\infty \) and \(\sup _{s\in S,\,\theta \in \Theta }\mathrm {E}_\theta \left[ \left| \frac{\partial ^2}{\partial \theta _k\partial \theta _l} g_{s}(\theta )\right| ^{2+\eta }\right] <\infty \), for all \(\eta >0\).
Under conditions C1-C7 and N1, conditions (H1)-(H3) of Theorem 3.4.5 in Guyon (1995) are satisfied and
$$\begin{aligned} \sqrt{d_n}J_n^{-1/2}H_n (\hat{\theta }_n-\theta ^*) \overset{{d}}{\rightarrow }\mathcal {N}({0},{I}_p), \end{aligned}$$
where \({I}_p\) is the \(p\times p\) identity matrix.
Appendix 2
In this appendix we provide the formulas for the Godambe information matrix associated with \(pl_a (\theta )\), \(a=M,C,D\), for a Gaussian random field \(\{Z(s)\}\) with mean function \(E(Z(s))=\mu \) and covariance function \(Cov(Z(s_i),Z(s_j))=\sigma ^2\rho (s_i-s_j;\phi )=\sigma ^2 \rho _{ij}\). We denote
$$\begin{aligned}&H_{{a}}(\theta )=-\mathrm {E}[\nabla ^{2}pl_{{a}}(\theta )],\qquad J_{{a}}(\theta ) \\&\quad =\mathrm {E}[\nabla pl_{{a}}(\theta )\nabla pl_{{a}}(\theta )\,^\intercal ]\qquad a=M,C,D. \end{aligned}$$
where \(\theta \) is the vector of the unknown parameters.
The Godambe information matrix is given by
$$\begin{aligned} G_{{a}}(\theta )=H_{{a}}(\theta )J_{{a}}(\theta )^{-1}H_{{a}}(\theta )^\intercal , \qquad a=M,C,D. \end{aligned}$$
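Numerically, forming \(G_a\) is a small piece of linear algebra once estimates of \(H_a\) and \(J_a\) are available. A minimal sketch (illustrative, not the paper's code), which also shows that the sandwich collapses to \(H_a\) when \(J_a=H_a\), as for a correctly specified full likelihood:

```python
# Sketch: Godambe (sandwich) information G_a = H_a J_a^{-1} H_a^T from given
# estimates of H_a = -E[grad^2 pl_a] and J_a = E[grad pl_a grad pl_a^T].
import numpy as np

def godambe(H_a, J_a):
    """Sandwich information matrix G_a = H_a J_a^{-1} H_a^T (symmetric inputs)."""
    return H_a @ np.linalg.solve(J_a, H_a.T)

# Toy p = 2 example: when J_a = H_a the sandwich equals H_a itself.
H_a = np.array([[2.0, 0.3],
                [0.3, 1.0]])
print(godambe(H_a, H_a))  # equals H_a up to rounding
```

Using `np.linalg.solve` avoids forming \(J_a^{-1}\) explicitly, which is preferable when \(J_a\) is nearly singular.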
Let \(B_{ij}\), \(G_{ij}\) and \(U_{ij}\) be defined as in Sect. 3. Moreover we define
$$\begin{aligned} F_{ij}&= \rho _{ij}(Z(s_i)-\mu )^2+\rho _{ij}(Z(s_j)-\mu )^2 \\&\quad -(Z(s_i)-\mu )(Z(s_j)-\mu )(1+\rho _{ij}^2)\\&= \rho _{ij}B_{ij}-(Z(s_i)-\mu )(Z(s_j)-\mu )(1-\rho _{ij}^2)\\ Q_{ij}&= Z(s_i)+Z(s_j) \end{aligned}$$
For \(pl_M(\theta )\) and \(pl_C(\theta ), \theta =(\phi ,\sigma ^2,\mu )^\intercal \), the pairwise score functions are given by
and for \(pl_D(\theta )\), \(\theta =(\phi ,\sigma ^2)^\intercal \), we have
where \(\alpha _{ij} =(1+ \rho _{ij})^{-1}{\sqrt{1+ \rho _{ij}^{2}}}\) and \( \kappa _{ij} =(1- \rho _{ij})^{-1}\nabla \rho _{ij} \). Moreover we have
In order to derive the covariances entering the \(J_a(\theta )\), \(a=M,C,D\), matrices, we can exploit the following formula, which holds for a zero-mean stationary Gaussian random field:
$$\begin{aligned} Cov(Z(s_i)Z(s_j),Z(s_k)Z(s_l))=\sigma ^4(\rho _{ik}\rho _{jl}+\rho _{il}\rho _{jk}). \end{aligned}$$
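This product-moment identity (a consequence of Isserlis' theorem for zero-mean Gaussian vectors) can be checked by simulation; the sketch below uses an arbitrary valid \(4\times 4\) correlation matrix, chosen for illustration only:

```python
# Monte Carlo check (not a proof) of
# Cov(Z_i Z_j, Z_k Z_l) = sigma^4 (rho_ik rho_jl + rho_il rho_jk).
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 1.5
R = np.array([[1.0, 0.5, 0.3, 0.2],
              [0.5, 1.0, 0.4, 0.1],
              [0.3, 0.4, 1.0, 0.6],
              [0.2, 0.1, 0.6, 1.0]])  # a positive definite correlation matrix
Z = rng.multivariate_normal(np.zeros(4), sigma2 * R, size=1_000_000)

i, j, k, l = 0, 1, 2, 3
mc = np.cov(Z[:, i] * Z[:, j], Z[:, k] * Z[:, l])[0, 1]
exact = sigma2**2 * (R[i, k] * R[j, l] + R[i, l] * R[j, k])
print(mc, exact)  # should agree to roughly two decimal places
```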
After some algebra, we have:
- \(Cov(Q_{ij},Q_{lk})=\sigma ^4 \frac{\rho _{il}+\rho _{ik}+\rho _{jl}+\rho _{jk}}{\sqrt{1+\rho _{ij}}\sqrt{1+\rho _{lk}}} \rho _{ij} \rho _{lk} \)
- \(Cov(G_{ij},G_{kl})=\sigma ^2 (\rho _{ik}-\rho _{il}\rho _{kl}- \rho _{ij}\rho _{jk}+ \rho _{ij}\rho _{kl}\rho _{jk}) \)
- \(Cov(G^2_{ij},G^2_{kl})=2[Cov(G_{ij},G_{kl})]^2\)
- \( Cov(U_{ij},U_{kl})=\sigma ^2 (\rho _{ik}-\rho _{il}-\rho _{jk}+\rho _{jl}) \)
- \( Cov(U_{ij}^2,U_{kl}^2)=2[Cov(U_{ij},U_{kl})]^2\)
- \(Cov(B_{ij},B_{kl})= 2\sigma ^4[\rho _{ik}^2+\rho _{il}^2-2\rho _{kl}\rho _{ik}\rho _{il}+\rho _{jk}^2+\rho _{jl}^2-2\rho _{kl}\rho _{jk}\rho _{jl} -2\rho _{ij}\rho _{ik}\rho _{jk}-2\rho _{ij}\rho _{il}\rho _{jl}+2\rho _{ij}\rho _{kl}(\rho _{ik}\rho _{jl}+\rho _{il}\rho _{jk})]\)
- \(Cov(F_{ij},F_{kl}) =\rho _{ij}\rho _{kl}Cov(B_{ij},B_{kl}) +\sigma ^4(1-\rho _{ij}^2)(1-\rho _{kl}^2)(\rho _{ik}\rho _{jl}+\rho _{il}\rho _{jk}) -2\sigma ^4\rho _{ij}(1-\rho _{kl}^2)(\rho _{ik}-\rho _{jk})(\rho _{il}-\rho _{jl}) -2\sigma ^4\rho _{kl}(1-\rho _{ij}^2)(\rho _{ik}-\rho _{il})(\rho _{jk}-\rho _{jl})\)
- \(Cov(F_{ij},B_{kl})=\rho _{ij}Cov(B_{ij},B_{kl}) -2\sigma ^4(1-\rho _{ij}^2)(\rho _{ik}-\rho _{il})(\rho _{jk}-\rho _{jl})\)
- \(Cov(F_{ij},G^2_{kl}) =2\sigma ^4[\rho _{ij}\rho ^2_{ik}+\rho _{ij}\rho ^2_{kl}\rho ^2_{il}-2\rho _{ij}\rho _{kl}\rho _{ik}\rho _{il} + \rho _{ij}\rho ^2_{jk} +\rho _{ij}\rho ^2_{kl}\rho ^2_{jl} -2\rho _{ij}\rho _{kl}\rho _{jk}\rho _{jl} -(1+\rho ^2_{ij})\rho _{ik}\rho _{jk}-(1+\rho ^2_{ij})\rho ^2_{kl}\rho _{il}\rho _{jl} + (1+\rho ^2_{ij})\rho _{kl}(\rho _{ik}\rho _{jl}+\rho _{il}\rho _{jk})]\)