Letters in Spatial and Resource Sciences, Volume 6, Issue 2, pp 91–101

Testing for spatial error dependence in probit models

  • Pedro V. Amaral
  • Luc Anselin
  • Daniel Arribas-Bel
Original Paper

Abstract

In this note, we compare three test statistics that have been suggested to assess the presence of spatial error autocorrelation in probit models. We highlight the differences between the tests proposed by Pinkse and Slade (J Econom 85(1):125–154, 1998), Pinkse (Asymptotics of the Moran test and a test for spatial correlation in Probit models, 1999; Advances in Spatial Econometrics, 2004) and Kelejian and Prucha (J Econom 104(2):219–257, 2001), and compare their properties in an extensive set of Monte Carlo simulation experiments, both under the null and under the alternative. We also assess the conjecture by Pinkse (Asymptotics of the Moran test and a test for spatial correlation in Probit models, 1999) that the usefulness of these test statistics is limited when the explanatory variables are spatially correlated. The Kelejian and Prucha (J Econom 104(2):219–257, 2001) generalized Moran’s I statistic turns out to perform best, even in medium-sized samples of several hundred observations. The other two tests are acceptable in very large samples.

Keywords

Spatial econometrics · Spatial probit · Moran’s I

JEL Classification

C21 · C25

1 Introduction

In contrast to the situation in the standard linear model, relatively little is known about tests against spatial error autocorrelation in specifications with limited dependent variables, such as a probit model. Three different test statistics have been proposed in the literature, respectively by Pinkse and Slade (1998), Pinkse (1999, 2004) and Kelejian and Prucha (2001). Even though they have the same asymptotic distribution, they tend to yield different results in empirical practice.1

Very few systematic findings are available regarding the relative performance of these tests in finite samples. The only small sample results to date are contained in a limited simulation experiment in Novo (2001) and in Amaral and Anselin (2011). The results in Novo (2001) are based on a small number of replications (2,000) and, therefore, suffer from a lack of precision in the reported rejection frequencies. Also, his experiment was limited to sample sizes up to \(N = 225\). Amaral and Anselin (2011) only considered one test, the Kelejian and Prucha (2001) generalized Moran’s I test, but for both probit and tobit models.

As is well known, ignoring spatial error autocorrelation in a probit model has more serious consequences than in the standard linear regression. In the latter, the main problem is a lack of efficiency (e.g., Anselin 1988). In a probit model, however, ignoring spatially autocorrelated errors results not only in inefficiency, but also in inconsistency of the standard maximum likelihood estimator (e.g., Fleming 2004). In part, this is due to the fact that most spatial processes for spatial error autocorrelation are also heteroskedastic (Anselin 2006). A better understanding of the properties of test statistics developed to detect this form of misspecification is therefore not only of theoretical interest, but also has ramifications for empirical practice.

In this note, we extend the study in Amaral and Anselin (2011) to a comparative assessment of the three test statistics in the probit model. We highlight the similarities and differences between the three test statistics and carry out a series of Monte Carlo simulation experiments to compare their size and power. We also assess a conjecture formulated in Pinkse (1999), where it was suggested that the tests may have limited usefulness in the presence of spatially correlated explanatory variables. As stated in the beginning of this introduction, we consider only spatial autocorrelation in standard probit models, not models with a spatial lag.

2 Model and test statistics

2.1 Probit and spatial probit

The point of departure is the standard linear latent variable model:
$$\begin{aligned} \mathbf y ^*=\mathbf X \beta +\varepsilon , \end{aligned}$$
(1)
in which \(\mathbf y ^*\) is a \(N\times 1\) vector with the unobserved (latent) dependent variable, \(\mathbf X \) is a \(N\times k\) matrix of observations on the explanatory variables, \(\beta \) is a \(k\times 1\) vector of coefficients, and \(\varepsilon \) is the \(N\times 1\) vector of random errors, each element assumed to be normally distributed with \(\varepsilon _i \sim N(0,1)\).
The observed counterpart of \(\mathbf y ^*\) is the binary vector \(\mathbf y \), whose elements are 1 for \(y^*_i>0\) and zero otherwise. This yields the probit result:
$$\begin{aligned} P[y_i=1|\mathbf{x }_i]=P[y^*_i>0|\mathbf{x }_i]=P[\varepsilon _i<\mathbf{x }_i \beta |\mathbf{x }_i] \end{aligned}$$
(2)
Typically, for identification purposes, \(\varepsilon _i\) is assumed to have a constant unit variance. However, in the presence of spatial error autocorrelation, this is no longer the case. Assuming an autoregressive form of correlation, as presented in Anselin (2006), we have:
$$\begin{aligned} \begin{aligned} \mathbf y ^*&=\mathbf X \beta +\mathbf u \\ \mathbf u&=\lambda \mathbf W \mathbf u +\varepsilon , \end{aligned} \end{aligned}$$
(3)
in which \(\mathbf u \) is an \(N\times 1\) vector of spatially correlated errors that follow a SAR process, \(\lambda \) is the autoregressive parameter, \(\mathbf W \) is the standard \(N\times N\) spatial lag matrix, and
$$\begin{aligned} P[y_i=1|\mathbf x _i]=P[y^*_i>0|\mathbf x _i]=P\left[\left.\frac{u_i}{\sigma _i}<\frac{\mathbf x _i\beta }{\sigma _i}\right|\mathbf x _i\right]. \end{aligned}$$
(4)
Unlike the expression for the standard probit model in Eq. 2, the relevant distribution in the spatial case is no longer that of an independent standard normal variate. Instead, the error term in Eq. 4 is heteroskedastic and follows a marginal distribution from a multivariate normal with variance-covariance matrix \([(\mathbf I -\lambda \mathbf W )^{\prime }(\mathbf I -\lambda \mathbf W )]^{-1}\). Consequently, the proper maximum likelihood estimator has to take this multivariate distribution into account (e.g., Beron and Vijverberg 2004).
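To make the variance structure concrete, the following minimal numpy sketch (our own illustration, not code from the paper) computes the variance-covariance matrix \([(\mathbf I -\lambda \mathbf W )^{\prime }(\mathbf I -\lambda \mathbf W )]^{-1}\) and the implied heteroskedastic standard deviations \(\sigma _i\) for a toy weights matrix:

```python
import numpy as np

def sar_error_covariance(W, lam):
    """Covariance of u = (I - lam*W)^{-1} eps with eps ~ N(0, I):
    Sigma_u = [(I - lam*W)' (I - lam*W)]^{-1}. Returns the full
    matrix and the implied (non-constant) standard deviations."""
    A = np.eye(W.shape[0]) - lam * W
    Sigma_u = np.linalg.inv(A.T @ A)
    return Sigma_u, np.sqrt(np.diag(Sigma_u))

# Toy row-standardized weights for three observations on a line
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])

Sigma_u, sigma = sar_error_covariance(W, 0.5)  # heteroskedastic sigma_i
Sigma_0, _ = sar_error_covariance(W, 0.0)      # lam = 0: back to the identity
```

With \(\lambda =0\) the covariance collapses to the identity, recovering the homoskedastic unit-variance case of the standard probit.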

2.2 Residuals

All three test statistics take the form of a generalized Moran’s I, squared in order to obtain an asymptotic \(\chi ^2(1)\) distribution:
$$\begin{aligned} I^2=\frac{(\mathbf e^{\prime } \mathbf W \mathbf e )^2}{v}\rightarrow \chi ^2(1), \end{aligned}$$
(5)
where \(\mathbf e \) is a vector of (unstandardized or standardized) residuals, \(\mathbf W \) is the spatial weights matrix and \(v\) is a variance estimate necessary to ensure that the asymptotic distribution is \(\chi ^2(1)\) (or standard normal, for the unsquared version).2 The test statistics differ by the estimation of the residual and by the variance estimate used for standardization.

Unlike the standard linear model, there is no unambiguous estimate for the residual in the probit model, since the true residual vector \(\mathbf y^* -\mathbf X \hat{\beta }\) is unobserved. Instead, the estimated residual needs to be based on the difference between the observed \(y_i\) and a predicted value \(\hat{\Phi }_i\), the cumulative standard normal distribution evaluated at \(\mathbf x^{\prime } _i\hat{\beta }\).

The most straightforward residual can thus be estimated as:
$$\begin{aligned} e_{1i}=y_i-\Phi (\mathbf x^{\prime } _i\hat{\beta }), \end{aligned}$$
(6)
with its variance estimated as:
$$\begin{aligned} \text{ Var}[e_{1i}]=\hat{\Phi }_i (1-\hat{\Phi }_i), \end{aligned}$$
(7)
where \(\hat{\Phi }_i\) is as above. Since \(\hat{\Phi }_i\) changes with \(\mathbf x^{\prime } _i\), the residual variance is not constant. To correct for this non-constant variance, a standardized residual can be estimated by dividing Expression 6 by the square root of the variance, which results in the following estimate:
$$\begin{aligned} e_{2i}=\frac{e_{1i}}{\sqrt{\hat{\Phi }_i (1-\hat{\Phi }_i )}}, \end{aligned}$$
(8)
with constant unit variance.
A third alternative is based on the notion of a generalized residual proposed by Cox and Snell (1968). This residual is a weighted adjustment to \(e_{1i}\), which is estimated as:
$$\begin{aligned} e_{3i}=\frac{\hat{\phi _i}}{\hat{\Phi _i}(1-\hat{\Phi _i})} e_{1i}, \end{aligned}$$
(9)
with \(\hat{\phi _i}\) as the normal density evaluated at \(\mathbf{x }_\mathbf{i }^{\prime }\hat{\beta }\) and \(\hat{\Phi _i}\) as the cumulative normal distribution evaluated at \(\mathbf{x }_\mathbf{i }^{\prime }\hat{\beta }\). Given that Var\([e_{1i}]=\hat{\Phi }_i (1-\hat{\Phi }_i)\), an estimate for the variance of \(e_{3i}\) is:
$$\begin{aligned} \text{ Var}[e_{3i}]=\frac{\hat{\phi _i}^2}{\hat{\Phi _i}^2 (1-\hat{\Phi _i})^2}[\hat{\Phi }_i (1-\hat{\Phi }_i )] =\frac{\hat{\phi _i}^2}{\hat{\Phi _i} (1-\hat{\Phi _i})}. \end{aligned}$$
(10)
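As an illustration, the three residuals can be computed directly from the fitted probit index \(\mathbf x _i^{\prime }\hat{\beta }\). The sketch below is our own; in practice \(\hat{\beta }\) would come from a probit maximum likelihood fit (e.g., via statsmodels or PySAL), and the index values used here are hypothetical:

```python
import numpy as np
from scipy.stats import norm

def probit_residuals(y, xb_hat):
    """The three probit residuals of Sect. 2.2, given binary y and the
    fitted index xb_hat = x_i' beta_hat."""
    Phi = norm.cdf(xb_hat)              # predicted P[y_i = 1]
    phi = norm.pdf(xb_hat)
    e1 = y - Phi                        # naive residual, Eq. (6)
    v1 = Phi * (1.0 - Phi)              # Var[e1_i], Eq. (7)
    e2 = e1 / np.sqrt(v1)               # standardized residual, Eq. (8)
    e3 = (phi / v1) * e1                # Cox-Snell generalized residual, Eq. (9)
    return e1, e2, e3, v1

y = np.array([1.0, 0.0, 1.0])
xb = np.array([0.0, -1.0, 2.0])         # hypothetical fitted indices
e1, e2, e3, v1 = probit_residuals(y, xb)
# At xb = 0, Phi = 0.5, so e1 = 0.5 and e2 = 0.5 / 0.5 = 1.0
```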

2.3 Test statistics

The first specification test for spatial error autocorrelation in a probit model was proposed by Pinkse and Slade (1998). The suggested statistic is essentially the same as the familiar Lagrange Multiplier statistic for error autocorrelation in the standard linear model with a constant variance \(\sigma ^2_i = 1\), because of the use of the standardized residuals \(e_{2i}\) from Eq. 8. The statistic takes the form:
$$\begin{aligned} LM_{PS}=\frac{(\mathbf{e }_\mathbf{2 }^{\prime }\mathbf{W }\mathbf{e }_\mathbf{2 })^2}{tr(\mathbf{WW }+\mathbf{W ^{\prime }}\mathbf{W })}. \end{aligned}$$
(11)
Pinkse and Slade (1998, p. 131) did not derive the asymptotic distribution of the statistic, but based inference on a bootstrap procedure. For the purposes of our evaluation, we will assess the extent to which the distribution of the statistic is asymptotically \(\chi ^2(1)\).
The second approach consists of the derivation of Lagrange Multiplier statistics for a class of limited dependent variable specifications (including the probit model) in Pinkse (1999) and Pinkse (2004). Again, this statistic takes the same form as the LM-Error statistic in the standard linear regression model, but Pinkse uses the Cox–Snell generalized residual \(e_{3i}\) in the numerator of the statistic. The variance estimate in the denominator has the same matrix trace term as \(LM_{PS}\), but multiplies this by an error variance estimate (since the residual is no longer standardized). The Pinkse LM test statistic for spatial error autocorrelation is then:
$$\begin{aligned} LM_{P}=\frac{(\mathbf{e }_\mathbf{3 }^{\prime }\mathbf{W }\mathbf{e }_\mathbf{3 })^2}{\hat{\sigma }^4 tr(\mathbf{WW }+\mathbf{W^{\prime } }\mathbf{W })} \stackrel{d}{\rightarrow } \chi ^2(1), \end{aligned}$$
(12)
with
$$\begin{aligned} \hat{\sigma }^2=\frac{1}{n}\sum _i\frac{\hat{\phi _i}^2}{\hat{\Phi }_i(1-\hat{\Phi }_i)}. \end{aligned}$$
(13)
Note how this approach smoothes the heteroskedastic error variances by using an overall average as the variance estimate.
The third approach, outlined in Kelejian and Prucha (2001), generalizes the Moran’s I statistic to specification tests in a range of models with limited dependent variables. For the probit specification, the resulting expression is:
$$\begin{aligned} MI = \frac{\mathbf{e }_{\mathbf{1 }}^{\prime }\mathbf{W }\mathbf{e }_{\mathbf{1 }}}{\sqrt{tr(\mathbf{W }{\varvec{\Sigma }} \mathbf{W }{\varvec{\Sigma }}+\mathbf{W }^{\prime }{\varvec{\Sigma }} \mathbf{W }{\varvec{\Sigma }})}} \stackrel{d}{\rightarrow }N(0,1), \end{aligned}$$
(14)
or,
$$\begin{aligned} I^2=\frac{(\mathbf{e }_{\mathbf{1 }}^{\prime } \mathbf{W } \mathbf{e }_\mathbf{1 })^2}{tr(\mathbf{W }{\varvec{\Sigma }} \mathbf{W }{\varvec{\Sigma }}+\mathbf{W }^{\prime }{\varvec{\Sigma }} \mathbf{W }{\varvec{\Sigma }})} \stackrel{d}{\rightarrow } \chi ^2(1), \end{aligned}$$
(15)
where \({\varvec{\Sigma }}\) is a diagonal matrix containing the variances of the individual residual terms, \(\hat{\sigma }^2_i=\hat{\Phi }_i(1-\hat{\Phi }_i)\). Again note the similarity to the classic LM-Error statistic: for a homoskedastic variance, the denominator of Eq. 15 would reduce to \(\hat{\sigma }^4 tr(\mathbf{WW }+\mathbf{W^{\prime }W })\).
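The three statistics can be implemented compactly in numpy/scipy. The sketch below is our own illustration under the notation of Sects. 2.2–2.3 (the fitted index values and the toy weights matrix are hypothetical; in practice the index comes from a probit ML fit):

```python
import numpy as np
from scipy.stats import norm, chi2

def spatial_error_probit_tests(y, xb_hat, W):
    """Squared forms of LM_PS (Eq. 11), LM_P (Eq. 12) and I^2 (Eq. 15),
    each asymptotically chi^2(1) under the null."""
    Phi, phi = norm.cdf(xb_hat), norm.pdf(xb_hat)
    v1 = Phi * (1.0 - Phi)                     # Var[e1_i]
    e1 = y - Phi                               # naive residual
    e2 = e1 / np.sqrt(v1)                      # standardized residual
    e3 = (phi / v1) * e1                       # Cox-Snell generalized residual
    tr_w = np.trace(W @ W + W.T @ W)
    lm_ps = (e2 @ W @ e2) ** 2 / tr_w          # Pinkse-Slade, Eq. (11)
    sigma2 = np.mean(phi ** 2 / v1)            # smoothed variance, Eq. (13)
    lm_p = (e3 @ W @ e3) ** 2 / (sigma2 ** 2 * tr_w)   # Pinkse, Eq. (12)
    WS = W * v1                                # equals W @ diag(v1)
    tr_ws = np.trace(WS @ WS + (W.T * v1) @ WS)  # tr(W Sig W Sig + W' Sig W Sig)
    i2 = (e1 @ W @ e1) ** 2 / tr_ws            # Kelejian-Prucha, Eq. (15)
    return {"LM_PS": lm_ps, "LM_P": lm_p, "I2": i2}

# Toy example: three observations on a line, hypothetical fitted indices
W = np.array([[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]])
stats = spatial_error_probit_tests(np.array([1.0, 0.0, 1.0]),
                                   np.array([0.2, -0.4, 0.9]), W)
pvals = {k: chi2.sf(v, df=1) for k, v in stats.items()}
```

Each statistic is then compared to the \(\chi ^2(1)\) critical value (3.84 at the 5 % level).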

3 Design of the experiments

We consider the performance of the three test statistics in a series of Monte Carlo simulation experiments in which we manipulate sample size and the value of the spatial autoregressive parameter. We use seven different sample sizes, each consisting of a regular grid, ranging from \(7\times 7\) grid cells (\(N=49\)) to \(625\times 625\) cells (\(N=390{,}625\)), with \(N\in \{49, 100, 225, 625, 2{,}500, 15{,}625, 390{,}625\}\). We use thirteen different values for the spatial autoregressive parameter, with \(\lambda \in \{-0.8,-0.5,-0.3,-0.1,-0.05,-0.01,0.0,0.01,0.05,0.1,0.3,0.5,0.8\}\). \(\mathbf W \) is defined considering all neighbouring regions according to rook contiguity.

Each experiment consists of 10,000 replications.3 A nominal Type I error of 0.05 is used throughout, which leads to an associated sample standard deviation in each simulation run of \(\sqrt{0.05 \times 0.95/10{,}000}=0.0022\). In other words, rejection frequencies within the range [0.0478, 0.0522] are within one standard deviation of the true value of 0.05, and frequencies within the range [0.0456, 0.0544] are within two standard deviations of the true value.

The experiments are based on simulating values for the unobserved latent variable \(y_i^*\). This is subsequently turned into an “observed” value of 1 or 0. The specification under the null hypothesis is based on the following model:
$$\begin{aligned} \mathbf{y }^*=\iota +0.5\mathbf{x }+\varepsilon \end{aligned}$$
(16)
in which \(\iota \) is a \(N\times 1\) vector of ones, \(\mathbf x \) is a non-stochastic \(N\times 1\) regressor vector uniformly distributed over the interval \([-7,3)\), and \(\varepsilon \sim N(0,1)\). The parameters have been chosen in order to provide a balanced sample of \(\mathbf y \), i.e., \(Pr(y=1|x) \approx 0.5\).4 The estimates were obtained by means of maximum likelihood estimation.
For the alternative, we limit ourselves to a spatial autoregressive error specification. The model under the alternative hypothesis of spatial error autocorrelation is then:
$$\begin{aligned} \mathbf y ^*=\iota +0.5\mathbf{x }+(I-\lambda \mathbf{W })^{-1}\varepsilon . \end{aligned}$$
(17)
Finally, we also consider the effect of spatial autocorrelation in the explanatory variables. We implement this through both a spatial autoregressive and a spatial moving average transformation of the \(\mathbf x \) vector:
$$\begin{aligned} \mathbf y ^*=\iota +0.5(I-\gamma \mathbf W )^\theta \mathbf x +\varepsilon , \end{aligned}$$
(18)
in which \(\theta \) is either equal to \(-1\) (autoregressive) or \(+1\) (moving average). We consider four values for the spatial coefficient, with \(\gamma =\{-0.8,-0.3,0.3,0.8 \}\). Note that for the moving average transformation this implies that positive values for \(\gamma \) correspond to negative spatial autocorrelation and vice versa.5
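The design above can be sketched with numpy as follows (our own illustration; the paper's actual experiments used PySAL). The sketch builds a rook-contiguity \(\mathbf W \) for a square grid (row-standardized here, which is an assumption on our part) and draws one replication under the spatial-error alternative of Eq. 17:

```python
import numpy as np

def rook_W(side):
    """Row-standardized rook-contiguity weights for a side x side grid."""
    n = side * side
    W = np.zeros((n, n))
    for i in range(side):
        for j in range(side):
            k = i * side + j
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < side and 0 <= jj < side:
                    W[k, ii * side + jj] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def draw_sample(W, lam, rng):
    """One replication under Eq. (17): latent y* with SAR(lam) errors,
    then the observed binary y. Design as in Eq. (16); x is treated as
    non-stochastic in the paper and is drawn once here."""
    n = W.shape[0]
    x = rng.uniform(-7.0, 3.0, n)
    eps = rng.standard_normal(n)
    u = np.linalg.solve(np.eye(n) - lam * W, eps)   # (I - lam*W)^{-1} eps
    y_star = 1.0 + 0.5 * x + u
    return (y_star > 0).astype(float), x

W = rook_W(7)                                       # the paper's smallest layout, N = 49
y, x = draw_sample(W, 0.5, np.random.default_rng(0))
```

Each Monte Carlo replication would redraw \(\varepsilon \), re-estimate the probit, and recompute the three statistics.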

4 Relative performance of the test statistics

4.1 Size and distribution under the null hypothesis

The rejection frequencies under the null hypothesis of the \(MI\), \(LM_{PS}\) and \(LM_P\) statistics are reported in Table 1, with marked values reflecting a frequency within one standard deviation of the Type I error of 0.05. All tests are within two standard deviations of 0.05 for \(N\ge 625\), but only \(MI\) is consistently within one standard deviation for these sample sizes. All tests are biased and under-reject the null in the three smallest samples. This is especially the case for \(LM_{PS}\).
Table 1  Size of tests

  N          MI        LM_PS     LM_P
  49         0.0473    0.0153    0.0355
  100        0.0476    0.0295    0.0437
  225        0.0434    0.0389    0.0427
  625        0.0484*   0.0470    0.0514*
  2,500      0.0478*   0.0487*   0.0470
  15,625     0.0483*   0.0498*   0.0495*
  390,625    0.0490*   0.0498*   0.0502*

10,000 replications; values marked with * are within one st. dev. of \(p=0.05\)

We assess the extent to which the test statistics obtain their asymptotic distribution by means of a Kolmogorov–Smirnov test. We take the null hypothesis to be a \(\chi ^2(1)\) distribution. To make all test statistics comparable, we use the square of \(MI\), or \(I^2\) from Eq. 15. Also, it should be noted that Pinkse and Slade (1998, p. 131) did not derive an asymptotic distribution for the \(LM_{PS}\) statistic, but instead proposed a bootstrap procedure. For the purposes of this exercise, we compare its distribution to a \(\chi ^2(1)\).

The results are reported in Table 2. For the \(I^2\) test statistic, the Kolmogorov–Smirnov test fails to reject the null (of a \(\chi ^2(1)\) distribution) in all samples. In other words, this statistic achieves its asymptotic distribution even in sample sizes as small as \(N = 49\). This is not the case for the other two tests. Again, \(LM_{PS}\) fares the worst, although in some sense this is not a fair comparison, since its asymptotic distribution was not derived. However, there is some evidence that the distribution of this test statistic approaches a \(\chi ^2(1)\) in the two largest samples (\(N = 15{,}625\) and \(N = 390{,}625\)). \(LM_P\) approaches its asymptotic distribution reliably for sample sizes of \(N\ge 625\). For \(N = 100\) and \(N = 225\) the null hypothesis is weakly rejected, but clearly for this test the asymptotic distribution is not appropriate for \(N = 49\).
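The mechanics of this comparison can be sketched with scipy (our illustration, not the paper's code): draw a large sample from a distribution known to be \(\chi ^2(1)\), standing in for 10,000 simulated values of a squared statistic under the null, and apply the Kolmogorov–Smirnov test:

```python
import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(12345)
# z^2 with z ~ N(0,1) is exactly chi^2(1); it stands in here for the
# simulated null distribution of a squared test statistic.
draws = rng.standard_normal(10_000) ** 2

# One-sample KS test of the empirical distribution against chi^2(1)
D, p = kstest(draws, chi2(df=1).cdf)
# A large p (no rejection) indicates the statistic has reached its
# asymptotic distribution; a small p (as for LM_PS at small N) does not.
```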
Table 2  Kolmogorov–Smirnov test against \(\chi ^2(1)\) distribution

  N          I^2              LM_PS            LM_P
  49         0.0139 (0.289)   0.1825 (0.000)   0.0383 (0.000)
  100        0.0136 (0.313)   0.1047 (0.000)   0.0178 (0.084)
  225        0.0158 (0.165)   0.0596 (0.000)   0.0188 (0.058)
  625        0.0098 (0.723)   0.0313 (0.000)   0.0102 (0.676)
  2,500      0.0134 (0.331)   0.0244 (0.005)   0.0139 (0.289)
  15,625     0.0087 (0.844)   0.0096 (0.746)   0.0100 (0.699)
  390,625    0.0119 (0.478)   0.0157 (0.170)   0.0117 (0.500)

\(p\) values in parentheses

4.2 Power of the test statistics

We compare the power of the test statistics against an alternative hypothesis of a spatially autoregressive error term. We report both the “naive” rejection frequencies, as well as the results where we adjust the critical value for the test statistic according to the empirical distribution under the null as suggested in Hendry (2006).6 Both \(MI\) and \(LM_P\) have strong power against the alternative, with \(LM_{PS}\) only slightly less. All three tests achieve a rejection rate of over \(90~\%\) for \(\lambda =0.5\) at \(N = 625\) and for \(\lambda =0.3\) at \(N=2{,}500\). In the largest sample, they achieve 100 % rejection for \(\lambda = 0.1\) (Table 3).
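The size adjustment can be sketched as follows (our own illustration with stand-in distributions): the 95th empirical percentile of the statistic's simulated null distribution replaces the asymptotic \(\chi ^2(1)\) critical value of 3.84 when computing rejection frequencies under the alternative:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(7)
# Stand-in for the 10,000 simulated null values of one statistic
null_draws = rng.standard_normal(10_000) ** 2
# Stand-in for the same statistic simulated under an alternative
alt_draws = (0.6 + rng.standard_normal(10_000)) ** 2

crit_asym = chi2(df=1).ppf(0.95)               # asymptotic critical value, ~3.84
crit_emp = np.quantile(null_draws, 0.95)       # size-adjusted critical value

power_naive = np.mean(alt_draws > crit_asym)   # "naive" rejection frequency
power_adjusted = np.mean(alt_draws > crit_emp) # size-adjusted rejection frequency
```

By construction, the adjusted statistic rejects a true null 5 % of the time, so the adjusted columns compare power at equalized size.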
Table 3  Rejection frequency: spatial autoregressive error

                      Sp. Error                   Sp. Error (adjusted)
  N         lambda    MI      LM_PS   LM_P        MI      LM_PS   LM_P
  49        0.01      0.047   0.014   0.034       0.050   0.049   0.051
            0.05      0.045   0.014   0.031       0.046   0.046   0.046
            0.1       0.043   0.015   0.030       0.046   0.043   0.045
            0.3       0.064   0.022   0.047       0.067   0.068   0.067
            0.5       0.137   0.070   0.115       0.142   0.156   0.147
            0.8       0.526   0.430   0.505       0.534   0.569   0.551
  100       0.01      0.049   0.028   0.045       0.051   0.050   0.053
            0.05      0.051   0.028   0.047       0.053   0.048   0.055
            0.1       0.048   0.029   0.045       0.050   0.051   0.053
            0.3       0.109   0.065   0.101       0.112   0.100   0.113
            0.5       0.296   0.220   0.293       0.301   0.294   0.315
            0.8       0.882   0.837   0.883       0.884   0.883   0.891
  225       0.01      0.049   0.042   0.049       0.054   0.053   0.054
            0.05      0.051   0.040   0.048       0.057   0.051   0.054
            0.1       0.060   0.047   0.057       0.068   0.059   0.064
            0.3       0.207   0.166   0.208       0.220   0.195   0.224
            0.5       0.604   0.514   0.605       0.619   0.558   0.624
            0.8       0.997   0.995   0.997       0.997   0.997   0.998
  625       0.01      0.051   0.044   0.050       0.052   0.047   0.048
            0.05      0.056   0.054   0.058       0.057   0.056   0.057
            0.1       0.082   0.074   0.084       0.084   0.078   0.083
            0.3       0.467   0.384   0.480       0.470   0.397   0.475
            0.5       0.943   0.904   0.949       0.945   0.909   0.948
            0.8       1.000   1.000   1.000       1.000   1.000   1.000
  2,500     0.01      0.050   0.051   0.050       0.052   0.052   0.053
            0.05      0.090   0.082   0.093       0.093   0.084   0.096
            0.1       0.237   0.206   0.241       0.242   0.210   0.247
            0.3       0.976   0.957   0.980       0.977   0.958   0.981
            0.5       1.000   1.000   1.000       1.000   1.000   1.000
            0.8       1.000   1.000   1.000       1.000   1.000   1.000
  15,625    0.01      0.062   0.059   0.063       0.065   0.059   0.064
            0.05      0.340   0.293   0.345       0.346   0.295   0.347
            0.1       0.871   0.816   0.881       0.873   0.816   0.881
            0.3       1.000   1.000   1.000       1.000   1.000   1.000
            0.5       1.000   1.000   1.000       1.000   1.000   1.000
            0.8       1.000   1.000   1.000       1.000   1.000   1.000
  390,625   0.01      0.337   0.283   0.337       0.340   0.283   0.337
            0.05      1.000   1.000   1.000       1.000   1.000   1.000
            0.1       1.000   1.000   1.000       1.000   1.000   1.000
            0.3       1.000   1.000   1.000       1.000   1.000   1.000
            0.5       1.000   1.000   1.000       1.000   1.000   1.000
            0.8       1.000   1.000   1.000       1.000   1.000   1.000

10,000 replications: \(p=0.05\)

4.3 Spatial autocorrelation in the explanatory variables

As a final issue, we examine the conjecture by Pinkse (1999, p. 10) that the LM tests “tend to reject the null hypothesis of spatial independence of the errors when regressors are spatially dependent, even when the errors are independent.” We consider both SAR and SMA processes to induce spatial autocorrelation in the \(\mathbf x \) vector. The results are given in Table 4 for sample sizes of \(N\ge 625\), since for these sample sizes the rejection frequencies under the null hypothesis consistently remained within two standard deviations of the Type I error of 0.05 for all three test statistics. In the table, we mark those rejection frequencies that fall outside two standard deviations. Note that negative values of \(\gamma \) for the MA transformation correspond with positive spatial autocorrelation and vice versa.5
Table 4  Rejection frequency: spatially correlated regressors

                      AR(X)                         MA(X)
  N         gamma     MI       LM_PS    LM_P        MI       LM_PS    LM_P
  625       -0.8      0.0513   0.0435*  0.0575*     0.0493   0.0451*  0.0509
            -0.3      0.0548*  0.0513   0.0508      0.0532   0.0442*  0.0490
            0         0.0484   0.0470   0.0514      0.0484   0.0470   0.0514
            0.3       0.0482   0.0456   0.0463      0.0484   0.0474   0.0471
            0.8       0.0493   0.0388*  0.0575*     0.0505   0.0470   0.0527
  2,500     -0.8      0.0488   0.0436*  0.0569*     0.0551*  0.0491   0.0582*
            -0.3      0.0468   0.0505   0.0467      0.0503   0.0516   0.0513
            0         0.0478   0.0487   0.0470      0.0478   0.0487   0.0470
            0.3       0.0489   0.0489   0.0490      0.0532   0.0529   0.0545*
            0.8       0.0481   0.0440*  0.0582*     0.0518   0.0473   0.0560*
  15,625    -0.8      0.0521   0.0519   0.0612*     0.0471   0.0497   0.0514
            -0.3      0.0496   0.0492   0.0493      0.0482   0.0492   0.0485
            0         0.0483   0.0498   0.0495      0.0483   0.0498   0.0495
            0.3       0.0521   0.0468   0.0509      0.0477   0.0498   0.0477
            0.8       0.0521   0.0499   0.0597*     0.0500   0.0503   0.0538
  390,625   -0.8      0.0495   0.0527   0.0568*     0.0530   0.0503   0.0550*
            -0.3      0.0487   0.0477   0.0475      0.0490   0.0492   0.0493
            0         0.0490   0.0498   0.0502      0.0490   0.0498   0.0502
            0.3       0.0499   0.0473   0.0491      0.0539   0.0520   0.0541
            0.8       0.0491   0.0503   0.0571*     0.0530   0.0537   0.0570*

10,000 replications, \(p = 0.05\); values marked with * are outside two st. dev.

The rejection frequencies for \(LM_P\) indeed tend to be somewhat elevated, but only for very large values of the spatial parameter (\(| \gamma |=0.8\)). Interestingly, the effect on \(LM_{PS}\) works in the other direction, yielding a few cases of under-rejection. Overall, the \(MI\) statistic does not seem to be affected by spatial autocorrelation in the regressors, especially for \(N > 625\). The difference between the test statistics may be due to the way the residuals are calculated, since \(MI\) uses the “naive” residuals, whereas \(LM_P\) is based on the generalized Cox–Snell residuals, which involve \(\mathbf x \) in the weighting factor as well.

5 Conclusion

Our simulation experiments are the first systematic evaluation of the properties of the three tests proposed in the literature to assess spatial error autocorrelation in a probit model. They demonstrated that of the three tests, \(MI\) is overall the most reliable. It is unbiased across the widest range of sample sizes and achieves its asymptotic distribution under the null, even for \(N = 49\). The two other statistics also perform well under the null, but require larger sample sizes (\(N > 2{,}500\)) to obtain the \(\chi ^2(1)\) asymptotic distribution. All three tests have good power against the alternative, especially in the larger sample sizes. Finally, the \(MI\) test is not affected by spatial correlation in the regressors, whereas there is a slight effect on the two other tests.

Footnotes

  1. For example, using the Columbus data from Anselin (1988) (\(N=49\)), truncated in the same fashion as in McMillen (1992) and others, with \(y=1\) for crime \(>40\), yields values of respectively 2.48, 1.22 and 2.89 for the statistics in a probit estimation of crime on income and housing values. Whereas the first two values do not reject the null, the last statistic weakly rejects the null.

  2. Recall that the LM test statistic for spatial error autocorrelation in the standard linear regression takes the form \(LM=[(\mathbf e^{\prime } \mathbf W \mathbf e ) / \hat{\sigma }^2]^2 / tr(\mathbf{WW }+\mathbf{W^{\prime }W })\), where \(\mathbf e \) is a vector of OLS residuals, \(tr\) is the trace operator and \(\hat{\sigma }^2\) is the usual estimate of the error variance (Anselin 1988, p. 104). Using the notation from Eq. 5, \(v=\hat{\sigma }^4 tr(\mathbf{WW }+\mathbf{W^{\prime }W })\).

  3. For all computations, we used PySAL, a Python library for spatial analysis (Rey and Anselin 2007).

  4. All the experiments were also performed for irregular layouts and unbalanced samples. The results were similar to those for a balanced sample on a regular lattice and were omitted from this paper.

  5. The spatial transformation induces a change in the mean and variance of the \(x\) variables. In order to ensure that the sample remains balanced and that the approximate \(R^2\) is comparable across samples, we carry out a transformation of the \(\mathbf x \) vector and adjust the variance of the error term \(\varepsilon \) such that \(R^2 \approx 0.67\) in all settings. The transformation of the \(\mathbf x \) vector required to maintain a balanced sample is \(\mathbf x \sim -2(1-\gamma )^{-\theta }+U(-5,5)\).

  6. We take the 95th percentile of the distribution under the null obtained from our simulations as the “correct” critical value, rather than the value of 3.84 for a \(\chi ^2(1)\). This correction becomes negligible for the larger sample sizes, since all three tests achieve their asymptotic distribution there. The correction is most pronounced for \(LM_{PS}\) in the smaller samples.


Acknowledgments

This project was supported by Award No. 2009-SQ-B9-K101 by the National Institute of Justice, Office of Justice Programs, U.S. Department of Justice. The opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect those of the Department of Justice.

References

  1. Amaral, P., Anselin, L.: Finite sample properties of Moran’s I test for spatial autocorrelation in Probit and Tobit models—empirical evidence. Working Paper 2011–7, GeoDa Center for Geospatial Analysis and Computation, Arizona State University, Tempe, AZ (2011)
  2. Anselin, L.: Spatial Econometrics: Methods and Models. Kluwer, Dordrecht (1988)
  3. Anselin, L.: Spatial econometrics. In: Mills, T.C., Patterson, K. (eds.) Palgrave Handbook of Econometrics. Econometric Theory, vol. 1, pp. 901–969. Palgrave Macmillan, Basingstoke (2006)
  4. Beron, K.J., Vijverberg, W.P.: Probit in a spatial context: a Monte Carlo analysis. In: Anselin, L., Florax, R.J., Rey, S.J. (eds.) Advances in Spatial Econometrics, pp. 169–195. Springer, Heidelberg (2004)
  5. Cox, D.R., Snell, E.J.: A general definition of residuals. J. R. Stat. Soc. Ser. B 30, 248–275 (1968)
  6. Fleming, M.: Techniques for estimating spatially dependent discrete choice models. In: Anselin, L., Florax, R.J., Rey, S.J. (eds.) Advances in Spatial Econometrics, pp. 145–168. Springer, Heidelberg (2004)
  7. Hendry, D.F.: A comment on specification searches in spatial econometrics: the relevance of Hendry’s methodology. Reg. Sci. Urban Econ. 36, 309–312 (2006)
  8. Kelejian, H., Prucha, I.: On the asymptotic distribution of the Moran I test statistic with applications. J. Econom. 104(2), 219–257 (2001)
  9. McMillen, D.: Probit with spatial autocorrelation. J. Reg. Sci. 32(3), 335–348 (1992)
  10. Novo, A.: Spatial probit models: statistical inference and estimation. Monte Carlo simulations and an application to contagious currency crises. PhD thesis, University of Illinois, Urbana-Champaign, IL (2001)
  11. Pinkse, J.: Asymptotics of the Moran test and a test for spatial correlation in Probit models. Working paper, Department of Economics, University of British Columbia, Vancouver, BC (1999)
  12. Pinkse, J.: Moran-flavored tests with nuisance parameter. In: Anselin, L., Florax, R.J., Rey, S.J. (eds.) Advances in Spatial Econometrics, pp. 67–77. Springer, Heidelberg (2004)
  13. Pinkse, J., Slade, M.E.: Contracting in space: an application of spatial statistics to discrete-choice models. J. Econom. 85(1), 125–154 (1998)
  14. Rey, S., Anselin, L.: PySAL, a Python library of spatial analytical methods. Rev. Reg. Stud. 37(1), 5–27 (2007)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Pedro V. Amaral (1, 2)
  • Luc Anselin (2)
  • Daniel Arribas-Bel (2)

  1. Department of Land Economy, University of Cambridge, Cambridge, UK
  2. GeoDa Center for Geospatial Analysis and Computation, Arizona State University, Tempe, USA