1 Introduction

Spatial econometrics is a subfield of econometrics dealing with spatial lags among geographical units. The early literature in this field started with contributions of Moran (1948), Whittle (1954), and Ord (1975), followed by the seminal contribution of Anselin (1988), and a series of textbooks by LeSage and Pace (2009), Elhorst (2014), Kelejian and Piras (2017), and Beenstock and Felsenstein (2019).

According to Elhorst (2014), three generations of spatial econometric models could be distinguished about halfway through the decade 2010–2020. The first generation consists of models based on cross-sectional data. The second generation comprises non-dynamic models based on spatial panel data. These models might just pool time-series cross-sectional data, but more often they also control for fixed or random spatial and/or time-period specific effects. The third generation of spatial econometric models encompasses dynamic spatial panel data models. Today (read: 2021), a fourth generation of spatial econometric models has developed: the general nesting spatial (GNS) econometric model for spatial panels with common factors (CF). This model accounts for local spatial dependence by means of an endogenous spatial lag, exogenous spatial lags, and a spatial lag in the error term. It accounts for dynamic effects by means of the dependent variable lagged in time, and the dependent variable lagged in both space and time. Finally, it accounts for global cross-sectional dependence by means of cross-sectional averages or principal components with heterogeneous coefficients, which generalizes the traditional controls for time-invariant and spatial-invariant variables by unit-specific and time-specific effects. With these properties it is the most general spatial econometric model currently available. The aim of this paper is threefold. First, the full model is set out mathematically. Second, the rationale behind each term that is part of the model is explained. Third, potential objections or pitfalls of including certain terms are discussed from a statistical or an economic viewpoint. In addition, different kinds of data are discussed: regional or macroeconomic data, microeconomic data, and economic-historical data.

According to Elhorst (2010), the year 2007 marks a sea change in the spatial econometricians’ way of thinking. Prior to 2007 they were interested mainly in models containing one spatial lag, while after 2007 the interest in models containing more than one spatial lag increased. For this reason he added the words “Raising the Bar” to the title of his paper. The interest in common factors and in the distinction between weak and strong cross-sectional dependence, which arose about halfway through the decade 2010–2020, is another sea change in the spatial econometricians’ way of thinking, explaining the title of this paper.

2 Model

The general nesting spatial econometric model for spatial panels with common factors reads as

$$\begin{array}{cc} Y_{t}=\tau Y_{t-1}+\rho WY_{t}+\eta WY_{t-1}+X_{t}\beta +WX_{t}\theta +\sum _{r}\Gamma _{r}^{T}f_{rt}+u_{t} & u_{t}=\lambda Wu_{t}+\varepsilon _{t} \end{array}$$
(1)

where \(Y_{t}=\left(y_{1t},\ldots ,y_{Nt}\right)^{T}\) denotes an N × 1 vector consisting of one observation on the dependent variable \(y_{it}\) for every unit i (i = 1, …, N) in the sample at time t (t = 1, …, T). \(Y_{t-1}\) and \(WY_{t}\) represent, respectively, the temporal and spatial lag of \(Y_{t}\), and \(WY_{t-1}\) the spatiotemporal lag of \(Y_{t}\), while τ, ρ, and η are the response parameters of these variables, better known as, respectively, the serial, spatial, and spatiotemporal autoregressive coefficients. The N × N matrix W is a nonnegative matrix of known constants describing the spatial arrangement of the units in the sample. Its diagonal elements are set to zero to prevent units from explaining themselves. \(X_{t}\) is an N × K matrix of explanatory variables and \(WX_{t}\) an N × K matrix of contemporaneous spatial lags of these explanatory variables. The impacts of these variables are measured by, respectively, the K × 1 vectors β and θ. The N × 1 vectors \(u_{t}\) and \(\varepsilon _{t}\) denote the error terms of the model. It is assumed that \(u_{t}\) follows a first-order spatial autoregressive process with spatial autocorrelation coefficient λ, which may be labeled as a spatial lag in the error term, and that \(\varepsilon _{t}=\left(\varepsilon _{1t},\ldots ,\varepsilon _{Nt}\right)^{T}\) is a vector of disturbance terms, where \(\varepsilon _{it}\) are independently and identically distributed error terms for all i with zero mean and variance \(\sigma ^{2}\). Since the spatial econometric model in Equation (1) contains spatial lags in the dependent variable, in each of the explanatory variables, and in the error term, it is also known as a general nesting spatial model (Elhorst 2014). The determinants of the model described so far capture potential local spatial dependence (weak cross-sectional dependence) among the observations.
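To make the structure of Equation (1) concrete, the following Python sketch simulates data from it. It is an illustration under stated assumptions, not an estimation routine: the common factor terms are omitted, the initial value \(Y_{0}\) is set to zero, W is a hypothetical row-normalized ring of neighbors, the errors are drawn from a normal distribution, and all parameter values and function names are invented for the example.

```python
import numpy as np

def row_normalized_ring(n):
    """Hypothetical W: each unit's neighbors are the two adjacent units on a ring."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 1.0
        W[i, (i + 1) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)  # zero diagonal, rows sum to one

def simulate_gns(W, X, tau, rho, eta, beta, theta, lam, sigma, rng):
    """Simulate Y_1..Y_T from Equation (1) without common factors, with Y_0 = 0.

    X has shape (T, N, K); the result has shape (T, N).
    """
    T, N, K = X.shape
    I = np.eye(N)
    A_inv = np.linalg.inv(I - rho * W)  # solves out the endogenous spatial lag W Y_t
    B_inv = np.linalg.inv(I - lam * W)  # solves u_t = lam W u_t + eps_t for u_t
    Y = np.zeros((T + 1, N))
    for t in range(1, T + 1):
        eps = rng.normal(0.0, sigma, size=N)  # normality is an assumption of this sketch
        u = B_inv @ eps
        Xt = X[t - 1]
        rhs = tau * Y[t - 1] + eta * (W @ Y[t - 1]) + Xt @ beta + (W @ Xt) @ theta + u
        Y[t] = A_inv @ rhs
    return Y[1:]

# Example with invented parameter values (tau + rho + eta < 1).
rng = np.random.default_rng(0)
N, T, K = 6, 50, 2
W = row_normalized_ring(N)
X = rng.normal(size=(T, N, K))
Y = simulate_gns(W, X, tau=0.4, rho=0.3, eta=-0.12,
                 beta=np.array([1.0, -0.5]), theta=np.array([0.2, 0.1]),
                 lam=0.2, sigma=1.0, rng=rng)
```

Solving for \(Y_{t}\) through \((I-\rho W)^{-1}\) is what distinguishes the endogenous spatial lag from the exogenous lags \(WX_{t}\), which enter the right-hand side directly.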

The common factors \(f_{rt}\) (r = 1, …, R) capturing potential global cross-sectional dependence can take three forms. First, if two factors are considered, \(f_{1t}=\left(1,\ldots ,1\right)^{T}\) and \(f_{2t}=\left(\xi _{1},\ldots ,\xi _{T}\right)^{T}\), and the parameter restrictions \(\Gamma _{1}^{T}=(v_{1},\ldots ,v_{N})\) and \(\Gamma _{2}^{T}=(1,\ldots ,1)\) are imposed, the model boils down to a dynamic GNS model with cross-sectional and time-period fixed effects. Formally, the cross-sectional fixed effects represent one common factor (\(f_{1t}\)) which is constant over time but with heterogeneous coefficients (\(\Gamma _{1}\)). The time-period fixed effects represent another common factor of length T (\(f_{2t}\)) which changes over time but with homogeneous coefficients (\(\Gamma _{2}\)). The total number of common factor parameters to be estimated in this setting amounts to N + T − 1, since one of the T time dummies should be left aside to avoid perfect multicollinearity with the cross-sectional fixed effects.

The second possibility is to maintain the cross-sectional fixed effects, but to replace the time dummies by time-specific cross-sectional averages of the dependent variable at times t and t − 1, i.e., \(\overline{Y}_{t}=\frac{1}{N}\sum _{i=1}^{N}y_{it}\), \(\overline{Y}_{t-1}=\frac{1}{N}\sum _{i=1}^{N}y_{i,t-1}\), and/or the time-specific cross-sectional averages of the explanatory variables at time t, \(\overline{X}_{kt}=\frac{1}{N}\sum _{i=1}^{N}x_{ikt}\), where k denotes the kth variable among the set of K explanatory variables. Furthermore, just as the cross-sectional fixed effects have heterogeneous coefficients, one for each single unit in the sample and thus N in total, so does each cross-sectional average. This implies that if all cross-sectional averages are included, 2 for the dependent variable at times t and t − 1, and K for the explanatory variables at time t, the total number of common factor coefficients to be estimated (including the cross-sectional fixed effects) increases to N + (2 + K)N. Just like time-period fixed effects, these cross-sectional averages may be treated as exogenous explanatory variables based on the assumption that the contribution of each unit to the cross-sectional averages at a particular point in time goes to zero if N goes to infinity (Pesaran 2006, assumption 5 and remark 3). Elhorst (2021) provides a set of commands with which this model can be estimated in Stata.
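The cross-sectional averages defined above are simple to construct from a balanced panel. The sketch below is one way to do so in Python; the array layout (time in the first dimension, an initial period stored in row 0) and the function name are assumptions made for the illustration and are unrelated to the Stata commands mentioned above.

```python
import numpy as np

def cross_sectional_averages(Y, X):
    """Build the common-factor proxies for periods t = 1..T.

    Y: array of shape (T + 1, N); row 0 holds the initial period.
    X: array of shape (T + 1, N, K), aligned with Y.
    Returns an array of shape (T, 2 + K) whose row for period t contains
    Ybar_t, Ybar_{t-1}, Xbar_{1t}, ..., Xbar_{Kt}.
    """
    ybar = Y.mean(axis=1)  # average over units, one value per period
    xbar = X.mean(axis=1)  # shape (T + 1, K)
    return np.column_stack([ybar[1:], ybar[:-1], xbar[1:]])

# Tiny example: N = 2 units, K = 1 variable, periods 0, 1, 2.
Y_demo = np.array([[1.0, 3.0], [2.0, 4.0], [6.0, 8.0]])
X_demo = np.array([[[0.0], [2.0]], [[1.0], [3.0]], [[5.0], [7.0]]])
F_demo = cross_sectional_averages(Y_demo, X_demo)
```

In the model each column of this array would receive a unit-specific coefficient, which is where the (2 + K)N additional parameters come from.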

The third possibility is to approximate the unobservable common factors by one or more principal components. In that case the Γ parameters represent the factor loadings of the principal components. Shi and Lee (2017) develop a quasi maximum likelihood (QML) estimator for this model. This estimator does not require any specification of the distribution function of the disturbance term, except that the error term should have zero mean and variance \(\sigma ^{2}\). This explains the term quasi. The coefficient estimates are corrected for the Nickell bias and the impact of this bias on the other coefficients in the equation. For this purpose, a Matlab routine called SFactors has been developed, which the first author made available at his web site www.w-shi.net. This routine is also made available at spatial-panels.com and extended to include the determination of the log-likelihood function value and \(R^{2}\) of the model. Since every principal component requires the estimation of 2N parameters, the total number of common factor parameters to be estimated in this setting amounts to 2NR.
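The three parameter counts stated in this section can be compared side by side. The following Python sketch simply evaluates the formulas N + T − 1, N + (2 + K)N, and 2NR from the text; the function name and the panel dimensions used in the example are hypothetical.

```python
def n_common_factor_params(N, T, K, R):
    """Number of common factor parameters under the three specifications in the text."""
    return {
        "fixed_effects": N + T - 1,                   # unit effects plus T - 1 time dummies
        "cross_sectional_averages": N + (2 + K) * N,  # unit effects plus N loadings per average
        "principal_components": 2 * N * R,            # 2N parameters per principal component
    }

# Hypothetical panel: 46 units, 30 periods, 2 regressors, 2 principal components.
counts = n_common_factor_params(N=46, T=30, K=2, R=2)
```

For these invented dimensions the counts are 75, 230, and 184, which shows how quickly the heterogeneous-coefficient specifications grow with N.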

3 A spatial lag in the dependent variable: WYt

A spatial lag in the dependent variable implies that \(y_{it}\) observed in cross-sectional unit i is explained by \(y_{jt}\) in other cross-sectional units j, \(j\neq i\), and vice versa. The units j which are included depend on the specification of the spatial weight matrix W. A linear regression model that contains a spatial lag in the dependent variable only (WY) is known as a spatial autoregressive (SAR) model. It is one of the most widely used spatial econometric models to introduce new methods of estimation or spatial statistics. Two other popular spatial econometric models used for these purposes are the spatial error (SE) model, which includes a spatial lag in the error term (Wu), and the spatial autoregressive combined (SAC) model, which includes both types of spatial lags (WY and Wu). Leading examples are Ord (1975), who introduces the maximum likelihood (ML) estimator of the SAR and SE models; Anselin (1988, pp. 82–86) and Kelejian and Prucha (1998, 1999), who introduce the instrumental variables (IV) and generalized method-of-moments (GMM) estimators of the SAR, SE and the SAC models; Lee (2004), who introduces the quasi maximum likelihood (QML) estimator of the SAR model and also discusses the regularity conditions that need to be imposed on the spatial weight matrix W; LeSage and Pace (2009, Ch. 5), who set out the Bayesian Markov Chain Monte Carlo (MCMC) estimator of the SAR model; Bao and Ullah (2007), who investigate the finite sample properties of the ML estimator of the SAR model; Ahrens and Bhattacharjee (2015), who exploit the Lasso estimator and mimic two-stage least squares (2SLS) to estimate the SAR model and the corresponding spatial weight matrix; Kyriacou et al. (2017), who introduce the Indirect Inference (II) estimator of the SAR model; and Smirnov (2021), who derives a closed-form consistent estimator of the spatial autoregressive parameter in the SAR model and the spatial autocorrelation coefficient in the SE model.

Although many different estimators of the SAR model have been developed, it should be stressed that this does not imply that the SAR model also makes sense from an economic-theoretical viewpoint. Many empirical studies justify the inclusion of a spatial lag in the dependent variable based on the simple finding that its coefficient is significant. Two leading examples of this practice, which characterizes many empirical studies, are the following. When running Moran’s I test on the dependent variable, the corresponding null hypothesis that this variable is not spatially correlated often needs to be rejected. The robust Lagrange multiplier (LM) tests developed by Anselin et al. (1996) to test for the SAR model (as well as the SE model) as an extension of the standard linear regression model without any spatial lags are also often provided as empirical evidence in favor of the SAR model. When estimating the SAR model subsequently, one can easily find empirical evidence in favor of a significant spatial autoregressive parameter ρ of WY for several potential specifications of W. However, this approach has been severely criticized in the spatial econometric literature. First of all, researchers apparently do not realize that Moran’s I test is unfocused and that the robust LM tests do not control for potential spatial lags in the explanatory variables. Theoretically, it is possible that a standard linear regression without any spatial lags is sufficient, even if the dependent variable according to Moran’s I test is spatially correlated, since the explanatory variables may also be spatially correlated and, moreover, in such a way that they fully account for the spatial correlation in the dependent variable. 
According to Pinkse and Slade (2010), this is also a primary criticism of standard spatial econometrics; researchers try to fit their preferred model (usually a SAR model) onto every empirical problem rather than having the nature of the empirical problem inform which particular model best answers the question. In addition, they criticize the SAR model for the laughable notion that the entire spatial dependence structure is reduced to one single unknown coefficient. Similarly, McMillen (2012) critiques the overuse of the SAR model (and the SE model) as a quick fix for nearly any model misspecification issue related to space. Corrado and Fingleton (2012) demonstrate by using a simple Monte Carlo simulation experiment that the coefficient estimate for the WY variable may be significant because it could be picking up the effect of omitted WX variables or nonlinearities in the X variables if they are erroneously specified as being linear. This makes the interpretation of a causal (spillover) effect difficult, i.e., it is hard to discern whether the significant coefficient of the WY variable is due to omitted variables or due to a causal effect of WY. Another important limitation of the SAR model, as demonstrated by Elhorst (2010), is that the ratio between the marginal impacts of changes to explanatory variables in one cross-sectional unit on the dependent variable values in other units (spillover effect) and in the own unit (direct effect) is independent of its coefficient β and therefore the same for every explanatory variable, which is unlikely to hold in many applied settings. The appendix to this paper contains a detailed description of the direct and indirect (spatial spillover) effects that can be derived from the spatial econometric model in Equation (1). A related issue that is gaining more attention in the empirical literature is that global spillovers are often difficult to justify. 
One speaks of global spillover effects if changes in the explanatory variables X in one unit j impact the dependent variable observed in another unit i, even if these two units are not connected to each other according to the spatial weight matrix (\(w_{ij}=0\)). Halleck Vega and Elhorst (2015) show that this kind of spillover can occur only if at least a spatial lag in the dependent variable is part of the model, while Pinkse and Slade (2010, p. 115), as well as Arbia and Fingleton (2008), Gibbons and Overman (2012), Corrado and Fingleton (2012), Partridge et al. (2012), Lacombe and LeSage (2015), and Elhorst et al. (2020b) argue that it is often difficult to form a reasonable argument to include a spatial lag in the dependent variable even if it is easily found to be statistically significant. For example, if teen smoking behavior is being analyzed then it would be sound to argue that an individual’s propensity to smoke is directly influenced by the smoking behavior of friends. However, if real per capita sales of cigarettes are analyzed at the aggregate level of geographical units, then it is difficult to justify that the average levels of consumption in different units affect one another. The resulting global spillovers would mean that a change in price or income in one particular unit potentially impacts consumption in all units, even if these units are unconnected. Other examples than this one, which is taken from Halleck Vega and Elhorst (2015), concern poverty rates (Partridge et al. 2012) and car use (Elhorst et al. 2020b).
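Both properties of the SAR model discussed above, the fixed ratio between spillover and direct effects and the global reach of spillovers, can be verified numerically. The Python sketch below assumes a static SAR model, a hypothetical weight matrix for four units on a line, an invented value ρ = 0.4, and the usual summary measures (average diagonal element of the matrix of partial derivatives for the direct effect, average off-diagonal row sum for the indirect effect).

```python
import numpy as np

# Hypothetical example: four units on a line, 1-2-3-4, first-order contiguity.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = W / W.sum(axis=1, keepdims=True)  # row-normalize

rho = 0.4                                # assumed value
S = np.linalg.inv(np.eye(4) - rho * W)   # spatial multiplier (I - rho W)^{-1}

def sar_effects(beta_k):
    """Direct and indirect effects of variable k in a static SAR model."""
    M = S * beta_k                       # matrix of partial derivatives dY/dx_k
    direct = np.trace(M) / 4
    indirect = (M.sum() - np.trace(M)) / 4
    return direct, indirect

# The ratio indirect/direct does not depend on beta_k.
ratios = [sar_effects(b)[1] / sar_effects(b)[0] for b in (0.5, -2.0, 10.0)]

# Units 1 and 4 are not neighbors (w_14 = 0), yet the multiplier links them.
unconnected_effect = S[0, 3]
```

Because \((I-\rho W)^{-1}=I+\rho W+\rho ^{2}W^{2}+\ldots\), the higher-order terms connect units that are not neighbors, so `unconnected_effect` is strictly positive even though the corresponding element of W is zero.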

The number of studies that do provide an economic-theoretical underpinning of a spatial lag in the dependent variable is limited. According to Anselin (2006), the SAR model is generally conceptualized as representing the empirical counterpart to an equilibrium solution of strategic interaction or a spatial reaction function, \(y_{i}=R(y_{-i},x_{i})\), where \(y_{i}\) stands for the level of decision variable y of agent i, \(y_{-i}\) reflects a function of the decision variables chosen by other agents, \(x_{i}\) is a vector of exogenous characteristics of i, and R is a functional form to be specified. Xu and Lee (2019) show that if (i) N individuals maximize their utilities, (ii) individual i’s benefit is proportional to his action and depends on his own characteristics and those of others, \(y_{i}(\rho \sum _{j=1}^{N}w_{ij}y_{j}+X_{i}^{'}\beta +\varepsilon _{i})\), and (iii) individual i’s cost equals \(\frac{1}{2}y_{i}^{2}\), the utility function of individual i takes the form \({U_{i}}\left(y_{i}\right)=y_{i}\left(\rho \sum _{j=1}^{N}w_{ij}y_{j}+X_{i}^{'}\beta +\varepsilon _{i}\right)-\frac{1}{2}y_{i}^{2}\), and that his optimal action takes the form of a SAR model. In other words, SAR can be regarded as a model of the Nash equilibrium of a static complete information game with a linear-quadratic utility function. The time dimension is not part of this setting but can be added in a straightforward manner.
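The step from this utility function to the SAR model can be written out explicitly. The following is a sketch that assumes \(w_{ii}=0\), as stated for W in Section 2, and that each individual takes the actions of the others as given. The first-order condition for individual i is

$$\frac{\partial U_{i}}{\partial y_{i}}=\rho \sum _{j=1}^{N}w_{ij}y_{j}+X_{i}^{'}\beta +\varepsilon _{i}-y_{i}=0\quad \Rightarrow \quad y_{i}=\rho \sum _{j=1}^{N}w_{ij}y_{j}+X_{i}^{'}\beta +\varepsilon _{i},$$

and since \(\partial ^{2}U_{i}/\partial y_{i}^{2}=-1< 0\), this is a maximum. Stacking the N best-response functions gives \(Y=\rho WY+X\beta +\varepsilon\), which is the SAR model.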

Pinkse et al. (2002) and LeSage et al. (2017) have used spatial econometric models to show that, when one petrol station decreases its price, geographically nearby service stations need to follow in order not to lose market share. The first of these two studies also provides an economic-theoretical model for this strategic behavior. In the literature on strategic interaction among local governments, a spatial lag in the dependent variable is theoretically consistent with the situation where taxation and expenditures on public services interact with taxation and expenditures on public services in nearby jurisdictions (Wildasin 1988; Besley and Case 1995; Brueckner 2003, 2006). For more references in this field see Allers and Elhorst (2011), who argue that many studies of fiscal policy interactions are based on single equation models of either taxation or expenditures, without specifying the underlying social welfare function, without taking account of budget constraints and without allowing for cost differences between jurisdictions. By taking this into account, they derive an extended version of the linear expenditure system with policy interaction effects that correspond to a system of several SAR models. Hanson (2005) develops an augmented market-potential function derived from the Krugman model of economic geography, reflecting the impact of scale economies and transport costs, to explain wage curves. Behrens et al. (2012) derive a quantity-based structural gravity equation system in which both trade flows and error terms are cross-sectionally correlated. This system can be estimated using techniques borrowed from the spatial econometrics literature, in particular the literature on SAR models extended to include an error term with an autoregressive or a moving average spatial structure. One of the first studies explaining interregional trade flows incorporating the effect of spatial interactions is Keller and Shiue (2007). Blonigen et al. 
(2007), who develop an economic-theoretical model of foreign direct investment (FDI), show that this model results in a linear regression model extended to include an endogenous spatial lag on FDI, measured by FDI into markets nearby the host country, and an exogenous market potential variable among the set of explanatory variables, measured by the size of markets nearby the FDI host country in terms of gross domestic product (GDP). The signs and significance levels of the coefficients of these two variables can be used to answer the question whether these outcomes are compatible with horizontal, vertical, export-platform or complex vertical FDI.

Although this is just a selection of economic-theoretical studies motivating the inclusion of a spatial lag in the dependent variable, and there are certainly more studies of this type, their number remains relatively limited.

4 Temporal and spatiotemporal lags: \(Y_{t-1}\) and \(WY_{t-1}\)

The main reason to control for a temporal lag in the dependent variable, \(Y_{t-1}\), is habit persistence. It takes time to change behavior. A household may not change its consumption level and labor supply immediately in response to a change in prices or its income. Similarly, a firm may react with some delay to changes in costs and to changes in demand for its product. Moreover, time lags can arise from imperfect information. Economic agents require time to gather relevant information, and this delays the decision-making process. Institutional factors can also result in lags. Households may be contractually obliged to supply a certain level of labor hours, though other conditions would indicate a reduction or increase in labor supply. The half-life of a change in one of the explanatory variables explaining the dependent variable, \(Y_{t}\), can be calculated as \(h=\ln \left(\frac{1}{2}\right)/\ln \left(\tau \right)\). If, for example, \(\tau =0.8\), then \(h=3.1\), which implies that it takes more than three time periods before the impact on the dependent variable due to a change of one of the explanatory variables has been halved. Only if \(\tau < \frac{1}{2}\) is the half-life shorter than one time period.
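The half-life formula is easy to evaluate for any value of τ. The short Python sketch below reproduces the example from the text; the function name is hypothetical, and the restriction to 0 < τ < 1 is an assumption made so that the logarithm is defined and the effect actually decays.

```python
import math

def half_life(tau):
    """Half-life h = ln(1/2) / ln(tau) of a shock under serial coefficient tau."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(tau)

h_example = half_life(0.8)     # about 3.1 periods, as in the text
h_threshold = half_life(0.5)   # exactly one period
```

The second call illustrates the threshold mentioned above: at τ = 1/2 the half-life equals one period, and it falls below one period for smaller values of τ.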

Korniotis (2010) interprets the coefficients of the temporal and spatiotemporal lags of the dependent variable, \(Y_{t-1}\) and \(WY_{t-1}\), as measures of the relative strength of internal and external habit persistence, where external habit persistence reflects the time agents of a particular unit need to pick up information from their neighbors. An econometric model that contains temporal and spatiotemporal lags of the dependent variable, \(Y_{t-1}\) and \(WY_{t-1}\), but not the spatial lag of the dependent variable, WYt, is known as the time-space recursive spatial econometric model and has gained a lot of attention in the spatial econometrics literature. According to Anselin et al. (2008), this model is especially useful to study spatial diffusion phenomena. In a social learning framework (e.g., Goyal 2009, ch. 5), the spatial reaction function may take the form \(y_{it}=R(y_{it-1},y_{-it-1},x_{i})\). LeSage and Pace (2009, ch. 7) refer to this model as a classic spatiotemporal (partial adjustment) model and employ it to show that high temporal dependence and low spatial dependence might nonetheless imply a long-run equilibrium with high spatial dependence. Fogli and Veldkamp (2011) adopt this model to investigate whether the labor force participation rate varies with past participation rates in surrounding areas, based on decennial data of female participation rates over the period 1940–2000 at the U.S. county level. A way to view these papers is that information diffusion can change preferences, but that people require time to gather information, creating a delay in the decision-making process, and hence spatial dependence takes time to manifest itself. The time-space recursive model may also be extremely useful to analyse the rise and spread of the Covid-19 virus on a daily basis. 
New infections occur due to people who have recently been infected in the own and in neighboring areas, but the transfer of the virus takes time, i.e., in this particular case a couple of days.

Despite the popularity of the time-space recursive model, a basic question is whether the removal of the spatial lag in the dependent variable, \(WY_{t}\), is supported by the data. Indeed, some researchers are troubled by the idea that the spatial autoregressive interaction between Y and WY is instantaneous (see Upton and Fingleton 1985, p. 369 for one of the first discussions on this issue). Instead, they suggest a model in which the autoregressive response is allotted one period to take effect, \(Y_{t}=\eta WY_{t-1}\). By contrast, other researchers do not seem to have problems with the idea that \(Y_{t}\) in one spatial unit is regressed on \(Y_{t}\) of other spatial units, \(Y_{t}=\rho WY_{t}\). Data frequency may also matter (daily, monthly, quarterly or annual data). For that reason they do not rule out this specification in advance and suggest letting the data determine the most appropriate model. Elhorst et al. (2020b) deliberately include the variable \(WY_{t}\) even though they expect that its coefficient will be zero. By investigating this, they are able to provide empirical evidence in favor of this hypothesis.

An important restriction that is frequently overlooked is that the serial, spatial and spatiotemporal autoregressive coefficients may not sum up to a value that is equal to or greater than 1, \(\tau +\rho +\eta < 1\); otherwise the spatial econometric model is not stable, i.e., a change in one of the explanatory variables or a shock in the error term will have the effect that the dependent variable does not return to an equilibrium value but instead explodes. If the variables \(Y_{t-1}\) and \(WY_{t-1}\) are not included, as in a static spatial econometric model, researchers are generally aware that the spatial autoregressive parameter ρ of the variable \(WY_{t}\) should take a value in the interval \(1/\omega _{\min }< \rho < 1\), where \(\omega _{\min }\) is the smallest (most negative) eigenvalue of W. However, when these variables are also included, then this condition changes into \(\tau +\rho +\eta < 1\). Details on accompanying restrictions that should also hold but are less relevant in empirical research are available in Elhorst (2014, Sect. 4.3). One outstanding example in which this restriction is overlooked concerns the advanced study of Fogli and Veldkamp (2011). These authors only include the variables \(Y_{t-1}\) and \(WY_{t-1}\), as a result of which \(\tau +\eta < 1\) is required for stability. However, in their preferred model they find \(0.916+0.570> 1\). In a similar study based on a panel of 108 regions across eight EU countries over the period 1986–2010, Halleck Vega and Elhorst (2017) find that the sum of both coefficients is smaller than one (0.845 + 0.019 for the total working population, 0.875 + 0.014 for the male, and 0.928 + 0.004 for the female working population).
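The main stability condition can be checked directly on reported coefficient estimates. The Python sketch below applies it to the figures quoted in this section, with ρ set to zero because the cited models exclude \(WY_{t}\). It covers only the condition \(\tau +\rho +\eta < 1\); the accompanying restrictions in Elhorst (2014, Sect. 4.3) are not implemented, and the function and dictionary names are hypothetical.

```python
def is_stable(tau, rho, eta):
    """Check the main stability condition tau + rho + eta < 1 (other restrictions not checked)."""
    return tau + rho + eta < 1.0

# Coefficient pairs (tau, eta) as quoted in the text; rho = 0 in these models.
reported = {
    "Fogli and Veldkamp (2011)": (0.916, 0.570),
    "Halleck Vega and Elhorst (2017), total": (0.845, 0.019),
    "Halleck Vega and Elhorst (2017), male": (0.875, 0.014),
    "Halleck Vega and Elhorst (2017), female": (0.928, 0.004),
}

stability = {name: is_stable(tau, 0.0, eta) for name, (tau, eta) in reported.items()}
```

Only the first entry fails the check, which matches the discussion above.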

Another empirical regularity which is often found but many researchers are not aware of is \(\eta =-\tau \rho\). Parent and LeSage (2010, 2011) show that imposing this parameter constraint might avoid overidentification problems, while Elhorst (2010) shows that under this constraint the impact of a change in one of the explanatory variables gradually diminishes over both space and time, i.e., these two effects can be separated from each other mathematically. The impact of a change in one of the explanatory variables over space falls by the factor ρW for every higher-order neighbor, and over time by the factor τ for every next time period. Due to this property, Lee and Yu (2015) label it as the separable space-time filter. Although this empirical regularity does not have to be met in theory, empirical evidence in favor of it has been found in many studies. For example, in the short empirical application on housing prices accompanying the work of Shi and Lee (2017, Table 4), the authors find a positive and significant value for η of 0.05405, while the constraint \(\eta =-\tau \rho\), which equals \(0.05405\approx -[(-0.05527)\times 0.68981]\), cannot be rejected statistically. Since the degree of habit persistence τ in most studies is positive, this empirical regularity implies that ρ and η have opposite signs, i.e., if the spatial lag has a positive sign, the spatiotemporal lag has a negative sign, as a result of which the net effect of these two terms is smaller than the positive effect of the spatial lag. Many researchers are puzzled by such a finding, perhaps because it has a stronger statistical than an economic-theoretical background. Lee and Yu (2015) discuss several limitations of imposing this empirical regularity. First, if \(\tau =0\) or \(\rho =0\), the spatiotemporal lag \(WY_{t-1}\) will automatically also have no effect since \(\eta =0\), which rules out diffusion and external habit persistence as in Korniotis (2010). 
Second, the omission of \(WY_{t-1}\) causes inaccuracy in forecasting when this variable is part of the true but unknown data generating process. Third, it rules out the possibility that ρ and η have the same sign, provided that τ is positive. Fourth, since both τ and ρ are smaller than one in absolute value, so will η.
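Why the constraint \(\eta =-\tau \rho\) separates the space and time effects can be seen from a short derivation. As a sketch, collect all remaining right-hand side terms of Equation (1) in a hypothetical vector \(Z_{t}\), and apply a spatial filter \((I-\rho W)\) to the time-filtered variable \(Y_{t}-\tau Y_{t-1}\):

$$(I-\rho W)(Y_{t}-\tau Y_{t-1})=Z_{t}\quad \Rightarrow \quad Y_{t}=\tau Y_{t-1}+\rho WY_{t}-\tau \rho WY_{t-1}+Z_{t}.$$

Comparing this with Equation (1) shows that the coefficient of \(WY_{t-1}\) equals \(-\tau \rho\), so the product form of the two filters holds exactly when \(\eta =-\tau \rho\). It also makes the listed limitations visible: η vanishes whenever τ or ρ is zero, and its sign is fixed by the signs of τ and ρ.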

5 Spatial lags in the explanatory variables: WXt

Halleck Vega and Elhorst (2015) and Elhorst and Halleck Vega (2017) provide four reasons to include spatial lags in the explanatory variables. First, since a spatial econometric model may potentially contain K spatial lags in the explanatory variables, one in the dependent variable, and one in the error term, the spatial lags in the explanatory variables dominate in number, i.e., K out of K + 2. In view of this it makes sense to focus on these spatial lags first.

Second, the SAR, SE and SAC models are of limited use in empirical research due to initial restrictions on the spillover effects they can potentially produce. In the SAR and SAC models the ratio between the spillover effect and the direct effect is the same for every explanatory variable, while in the SE model the spillover effects are set to zero by construction. Only in the SLX, SDE, SD and GNS models can the spatial spillover effects take any value. Table 1 gives an overview.

Table 1 Spatial econometric models with different combinations of spatial lags and their flexibility regarding spatial spillovers

Third, the spatial weight matrix of spatial lags in the explanatory variables can easily be parameterized, for example, according to an exponential or inverse distance matrix, or as a gravity type of model, which has a stronger background in economic theory. Consequently, this setup offers the opportunity to consider a broader spectrum of potential specifications of the spatial weight matrix than the traditional first-order binary contiguity matrix or the pre-specified exponential and inverse distance matrices (with or without a cut-off point).
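A parameterized weight matrix of the kind described here can be sketched in a few lines of Python. The example below builds an inverse distance matrix \(w_{ij}=d_{ij}^{-\gamma }\) with a zero diagonal; the symbol γ for the distance decay parameter, the use of Euclidean distances between point coordinates, the row normalization, and the function name are all assumptions made for the illustration. How γ would be estimated alongside the other parameters is not shown.

```python
import numpy as np

def inverse_distance_W(coords, gamma):
    """Row-normalized inverse distance matrix with w_ij = d_ij^(-gamma), w_ii = 0.

    coords: sequence of N distinct points; gamma: positive distance decay parameter.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))  # pairwise Euclidean distances
    np.fill_diagonal(d, np.inf)           # inf ** (-gamma) = 0, so the diagonal is zero
    W = d ** (-gamma)
    return W / W.sum(axis=1, keepdims=True)

# Three hypothetical units located on a line at positions 0, 1, and 3.
W_demo = inverse_distance_W([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)], gamma=1.0)
```

Raising γ concentrates the weights on the nearest units, whereas values of γ close to zero spread them almost evenly, so a single parameter spans a wide range of candidate matrices.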

Fourth, econometric-theoretical researchers are mainly interested in spatial econometric models containing spatial lags in the dependent variable, the error term, or both (i.e., the SAR, SE, and SAC models, respectively), because of the econometric problems and often complicated regularity conditions accompanying the estimation of these models. The reason they do not focus on the spatial econometric model with spatial lags in the explanatory variables is that their inclusion does not cause severe additional econometric problems, provided that the explanatory variables X are exogenous and the spatial weight matrix W is known and exogenous. This causes a gap in the level of interest in spatial lags between econometric theoreticians and practitioners. One of the advantages of the SLX model over other spatial econometric models, or of including WX variables in general, is that non-spatial econometric techniques can be used to test for potential endogeneity of the X and the accompanying WX variables. These techniques are the Hausman test for endogeneity in combination with tests for the validity of the instruments to assess whether they satisfy the relevance and exogeneity criteria. The methodology behind these tests is explained in Halleck Vega and Elhorst (2015) and applied to the cigarette demand data set of 46 U.S. states over the period 1963–1992. One of their main findings is that the price of cigarettes in the own state is endogenous, but the price in neighboring states, reflecting the spatial lag of this explanatory variable, is not. In principle, these kinds of tests can also be used to test whether the variables used to instrument the spatial lag in the dependent variable, \(WY_{t}\), are relevant and exogenous when applying IV/GMM estimators to estimate the parameters of a spatial econometric model, but remarkably, this is rarely done (see Drukker et al. 2013).

Just as for the spatial lag in the dependent variable, the number of studies that do provide an economic-theoretical underpinning of spatial lags in the explanatory variables is limited. In their spatial econometric textbook, LeSage and Pace (2009) provide several motivations for including spatial lags in general and for considering the spatial Durbin model in particular, although most of these motivations are statistically rather than economic-theoretically driven. Ertur and Koch (2007) develop an economic-theoretical model of economic growth that results in a spatial Durbin model, i.e., a model in which economic growth is regressed on economic growth in neighboring economies, on the initial income level in the own and in neighboring economies, and on the rates of saving, population growth, technological change and depreciation in the own and in neighboring economies. Yesilyurt and Elhorst (2017) develop an economic-theoretical model of military expenditures as a ratio of GDP which likewise results in a spatial Durbin model. In this model the expenditures in one country are explained by their counterparts in neighboring countries, as well as economic, political, and strategic factors that mark the own and neighboring countries. Costa da Silva et al. (2017) develop a spatially augmented population growth model building on Glaeser (2008) that results in a dynamic GNS model. Heijnen and Elhorst (2018) develop an economic-theoretical model explaining the diffusion of waste disposal taxes across municipalities. In this model, spillover effects may occur for two reasons. First, (illegal) dumping of waste will become more prevalent, which may not be confined to the municipality that introduces a waste disposal tax. 
Second, if a particular municipality introduces a waste disposal tax, the policymakers and citizens of neighboring municipalities obtain valuable information about the impact of this taxing scheme, which may help them to decide whether it is also suitable for them. Their economic-theoretical model of these spillover effects results again in a spatial Durbin model. In the economic-theoretical game model of Xu and Lee (2019), a spatial Durbin model results if the benefits of individual i take the form \(y_{i}(\rho \sum _{j=1}^{N}w_{ij}y_{j}+X_{i}^{'}\beta +\sum _{j=1}^{N}w_{ij}X_{j}^{'}\theta +\varepsilon _{i})\).

6 A spatial lag in the error term: \(Wu_{t}\)

A spatial lag in the error term does not require a theoretical model for a spatial or social interaction process, but instead, is consistent with a situation where determinants of the dependent variable omitted from the model are spatially autocorrelated, or with a situation where unobserved shocks follow a spatial pattern.

In contrast to the spatial econometrics literature, the (G)VAR literature is more focused on the impact of idiosyncratic shocks to the dependent variable in a given unit on that of the unit itself and on neighboring units, where the impact on neighboring areas is sometimes labeled contagion. These effects can be simulated by replacing the second \(N\times N\) matrix on the right-hand side in Equations (3) and (4) in the appendix to this paper by an \(N\times 1\) vector \(S=(\ldots ,s_{i},\ldots )\), where \(s_{i}\) is generally set to one standard error of the error term representing the shock, and premultiplying this \(N\times 1\) vector by the \(N\times N\) spatial multiplier matrix to get an \(N\times 1\) vector of responses to a shock in a particular unit. For applications, see Lacombe and LeSage (2015), Elhorst and Zigová (2014), and Elhorst et al. (2021).
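The simulation described above can be sketched as follows. Because Equations (3) and (4) are not reproduced here, the sketch assumes the simplest static spatial multiplier matrix, \((I-\rho W)^{-1}\), and assumes that all elements of S other than the shocked unit are zero; the function and argument names are hypothetical.

```python
import numpy as np

def shock_response(W, rho, unit, sigma):
    """Responses of all N units to a one-off shock of size sigma in one unit.

    Assumes the static spatial multiplier (I - rho * W)^{-1}; a dynamic
    model would need the multiplier matrix from its own reduced form.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    S = np.zeros(n)
    S[unit] = sigma                      # shock hits a single unit
    # Solve (I - rho * W) r = S instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - rho * W, S)
```

The element of the returned vector for the shocked unit is its own response including feedback, while the remaining elements are the responses of the other units, the effect sometimes labeled contagion.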

An important and well-known econometric property of a spatial lag in the error term is that it does affect the efficiency of the parameter estimates of the right-hand side variables in the spatial econometric model, but not their consistency. This property has so far been underused to test for misspecification problems. A Hausman test can be used whenever there are two consistent estimators, one of which is inefficient, while the other is efficient. Pace and LeSage (2008) develop this test to compare OLS and SEM estimates. According to LeSage and Pace (2009, p. 62), rejection of the null hypothesis of equality in OLS and SEM coefficient estimates can be useful in diagnosing the presence of omitted variables that are correlated with variables included in the model. The test statistic follows a chi-squared distribution with degrees of freedom equal to the number of regression parameters under test. Three different outcomes are possible (Elhorst and Halleck Vega 2017). First, the OLS and SEM coefficient estimates are not significantly different from each other and the spatial autocorrelation coefficient is not significant. When this occurs, extension of the OLS model with spatial autocorrelation is not necessary and may be left aside. Second, the OLS and SEM coefficient estimates are not significantly different from each other, but the spatial autocorrelation coefficient is significant. If this occurs, SEM yields a significantly higher log-likelihood function value than OLS, as a result of which the conclusion must be that the spatial error term is capturing the effect of omitted variables. However, since the null hypothesis cannot be rejected, it may concurrently be concluded that these omitted variables are not correlated with the included variables and thus that the SEM re-specification of the OLS model leads to an efficiency gain.
Third, the OLS and SEM coefficient estimates are significantly different from each other and the spatial autocorrelation coefficient is significant (the probability that the spatial autocorrelation coefficient will be insignificant here is negligible). This outcome points to misspecification problems due to omission of relevant explanatory variables. By replacing OLS and SEM with SLX and SDEM, or with SDM and GNS, respectively, in the three potential outcomes set out above, for both static and dynamic versions of these models, similar tests can be carried out for more advanced spatial econometric models. Such an approach may help to test for misspecification problems on a broader scale than has been done up to now.
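A minimal sketch of the generic Hausman statistic and of the three outcomes listed above is given below. It takes coefficient vectors and covariance matrices as given inputs and does not reproduce the specific covariance expressions of Pace and LeSage (2008); the covariance matrix of the inefficient estimator is assumed to be one that remains valid when the errors are spatially autocorrelated. All function and argument names are hypothetical.

```python
import numpy as np

def hausman_statistic(b_ineff, b_eff, V_ineff, V_eff):
    """Generic Hausman statistic d' (V_ineff - V_eff)^{-1} d.

    b_ineff, V_ineff: consistent but inefficient estimator (e.g. OLS).
    b_eff, V_eff:     consistent and efficient estimator (e.g. SEM).
    Returns the statistic and its chi-squared degrees of freedom.
    """
    d = np.asarray(b_ineff, dtype=float) - np.asarray(b_eff, dtype=float)
    V = np.asarray(V_ineff, dtype=float) - np.asarray(V_eff, dtype=float)
    stat = float(d @ np.linalg.solve(V, d))
    return stat, d.size

def classify_outcome(coefs_differ, spatial_coef_significant):
    """Map the two test results to the three outcomes described in the text."""
    if not coefs_differ and not spatial_coef_significant:
        return "no spatial error extension needed"
    if not coefs_differ and spatial_coef_significant:
        return "efficiency gain from the spatial error specification"
    if coefs_differ and spatial_coef_significant:
        return "misspecification: relevant explanatory variables omitted"
    return "case treated as negligible in the text"
```

The statistic is compared with the chi-squared critical value for the returned degrees of freedom; `coefs_differ` is true when the null hypothesis of equal coefficients is rejected.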

7 Cross-sectional and time-period specific effects

The standard reasoning behind spatial specific effects is that they control for all space-specific time-invariant variables whose omission could bias the estimates in a typical cross-sectional study (Baltagi 2005). The spatial specific effects may be treated as fixed or as random effects. In the fixed effects model, a dummy variable is introduced for each spatial unit, while in the random effects model, \(v_{i}\) is treated as a random variable that is independently and identically distributed with zero mean and variance \(\sigma _{v}^{2}\).

The standard reasoning behind time specific effects is that they control for all time specific spatial-invariant variables whose omission could bias the estimates in a typical time-series study. The time specific effects may be treated as fixed or as random effects. In the fixed effects model, a dummy variable is introduced for each time period, while in the random effects model, \(\xi _{t}\) is treated as a random variable that is independently and identically distributed with zero mean and variance \(\sigma _{\xi }^{2}\).

In addition, it is assumed that the variables \(v_{i}\) and \(\xi _{t}\), if specified as being random, are independent of each other and independent of \(\varepsilon _{it}\) (with variance \(\sigma _{\varepsilon }^{2}\)). To test this assumption of zero correlation between the random effects components, a Hausman specification test might be used (Baltagi 2005, pp. 66–68; Lee and Yu 2012). However, one may question whether this test is really needed. Experience shows that spatial econometricians tend to work with space-time data of adjacent spatial units located in unbroken study areas, since otherwise potential spatial spillover effects and the spatial weight matrix cannot be adequately measured. Consequently, the study area often takes a form similar to all counties of a state or all regions in a country. Under these circumstances the fixed effects model is more appropriate than the random effects model, because the idea that a limited set of regions is sampled from a larger population must be rejected.

The same holds for time specific effects. Most researchers use data over a consecutive time span, otherwise dynamic effects cannot adequately be analysed. Consequently, the study period often covers all its time periods. Under these circumstances the fixed effects model is more appropriate than the random effects model, because the idea that a limited set of time periods is sampled from a larger population must be rejected.
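For a balanced panel in a linear model, including a dummy variable for each spatial unit and each time period gives the same slope estimates as applying the two-way within transformation to every variable. The sketch below illustrates this transformation under that assumption; it is not a full estimator and ignores any bias corrections that spatial panel estimators may require. The function name is hypothetical.

```python
import numpy as np

def two_way_demean(Y):
    """Two-way within transformation of a balanced (N, T) panel.

    Subtracts unit means and time-period means and adds back the overall
    mean, which removes additive spatial and time-period fixed effects.
    """
    Y = np.asarray(Y, dtype=float)
    unit_means = Y.mean(axis=1, keepdims=True)    # one mean per spatial unit
    time_means = Y.mean(axis=0, keepdims=True)    # one mean per time period
    return Y - unit_means - time_means + Y.mean()
```

A variable that consists only of a unit effect plus a time effect is reduced to zero by this transformation, which is why time-invariant and spatial-invariant variables cannot be identified separately in the fixed effects model.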

8 Cross-sectional averages

One objection to time period fixed effects is that each time dummy has the same homogeneous impact on all observations in period t, while it is likely that, for example, business cycle effects hit one unit harder than another unit. The time needed and the extent to which a unit is able to recover from a shock may also differ from one unit to another. An alternative is to replace these time dummies by time-specific cross-sectional averages of the variables \(\overline{Y}_{t}\), \(\overline{Y}_{t-1}\), and \(\overline{X}_{kt}\) (k = 1, …, K) whose impacts are heterogeneous, i.e., allowed to differ from one unit to another. Since the number of parameters to be estimated increases rapidly with the number of common factors, most empirical studies try to keep the number of cross-sectional averages to a minimum. Ciccarelli and Elhorst (2018) find that, using cigarette demand data of 69 Italian provinces over the period 1877–1913, controlling for \(\overline{Y}_{t}\) and \(\overline{Y}_{t-1}\) only already effectively filters out the common time trends in the data. The cross-sectional averages of the explanatory variables in their model are not needed.
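The difference with time dummies can be illustrated with a deliberately simplified sketch: the cross-sectional averages \(\overline{Y}_{t}\) and \(\overline{Y}_{t-1}\) are computed from a balanced panel, and each unit is then given its own coefficients on them by unit-by-unit OLS. This ignores the spatial lags, the time lags of the dependent variable, and the explanatory variables of the full model, and the function names are hypothetical.

```python
import numpy as np

def cross_sectional_averages(Y):
    """Return (ybar_t, ybar_{t-1}) for t = 2, ..., T from an (N, T) panel."""
    ybar = np.asarray(Y, dtype=float).mean(axis=0)
    return ybar[1:], ybar[:-1]

def unit_specific_loadings(Y):
    """OLS of y_it on [1, ybar_t, ybar_{t-1}], separately for each unit i.

    Returns an (N, 3) array: intercept, coefficient on ybar_t, and
    coefficient on ybar_{t-1}, all allowed to differ across units.
    """
    Y = np.asarray(Y, dtype=float)
    ybar_t, ybar_lag = cross_sectional_averages(Y)
    Z = np.column_stack([np.ones_like(ybar_t), ybar_t, ybar_lag])
    # Each column of Y[:, 1:].T is one unit, so lstsq runs all N regressions.
    coefs, *_ = np.linalg.lstsq(Z, Y[:, 1:].T, rcond=None)
    return coefs.T
```

With K cross-sectional averages this adds K coefficients per unit, which shows why the number of parameters grows quickly and why most studies keep the number of averages small.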

The idea of linking the individual observations to cross-sectional averages and estimating this relationship for each individual observation (in regional studies, often linking regional observations to their counterpart observed at the national level) dates back to Thirlwall (1966) and Brechling (1967), and is known as the (regional) cyclical sensitivity literature. A critical overview of 13 studies on regional unemployment cyclical sensitivity models can be found in Elhorst (2003, Sect. 2.1). Importantly, regional unemployment rates tend to move in tandem with the national unemployment rate, but within the common rises and falls over time, the extent to which a region’s rate responds to changes in the national rate can be quite heterogeneous. This implies that studies on cyclical sensitivity that appeared back in the 1960s already paid attention to what can be termed common factors, and that spatial econometric studies have started to pay attention again to this important type of cross-sectional dependence. This literature also contrasts with two-step procedures that have been proposed in the literature, where the observations are first taken in deviation from their national average (US) as in Blanchard and Katz (1992) or continental average (EU) as in Decressin and Fatás (1995).

Although interest in this literature waned, the prevalence of recessionary shocks, notably the financial, euro and COVID-19 crises, makes it ever more pertinent to study cyclical sensitivity. Moreover, since the common factor literature based on cross-sectional averages developed by Pesaran (2006) is gaining more attention, the cyclical sensitivity literature is coming back into the picture. Of particular importance is that heterogeneity is considered in both strands of literature and that common factors can be embedded in the economic-theoretical literature on cyclical sensitivity. In line with this, Halleck Vega and Elhorst (2016) and Ciccarelli and Elhorst (2018) also attempt to interpret the estimated common factor coefficients, another strength of this approach which unfortunately has hardly been explored up to now.

9 Principal components

A potential disadvantage of principal components is that they are often difficult to interpret, especially if they are compared with cross-sectional averages. Up to now, not many empirical studies have attempted to interpret the factor loadings of the principal components.

To find out which set of controls filters out common factors most effectively, the cross-sectional dependence (CD) test developed by Pesaran (2015) may be used. This test is based on the correlation coefficients between the time-series observations of each pair of units with respect to a particular variable, in this case the residuals of Equation (1), resulting in N(N − 1)/2 distinct correlations. Denoting these estimated correlation coefficients between the time-series for units i and j as \(\kappa _{ij}\), the test statistic is defined as \(CD=\sqrt{2T/(N(N-1))}\sum _{i=1}^{N-1}\sum _{j=i+1}^{N}\kappa _{ij}\). It is a two-sided test statistic whose limiting distribution is the standard normal distribution, so that −1.96 and 1.96 are the critical values at the 5% significance level, provided that N goes to infinity faster than T or when T is fixed, reflecting the case in most spatial econometric studies.
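The CD statistic defined above can be computed directly from an N × T array of residuals. The sketch below assumes a balanced panel and uses the ordinary sample correlation between the residual series of each pair of units; the function name is hypothetical.

```python
import numpy as np

def pesaran_cd(resid):
    """CD statistic from an (N, T) array of residuals (balanced panel).

    CD = sqrt(2T / (N(N-1))) * sum over i < j of kappa_ij, where kappa_ij
    is the correlation between the residual series of units i and j.
    """
    resid = np.asarray(resid, dtype=float)
    N, T = resid.shape
    kappa = np.corrcoef(resid)               # N x N matrix, rows are units
    upper = np.triu_indices(N, k=1)          # the N(N-1)/2 pairs with i < j
    return np.sqrt(2.0 * T / (N * (N - 1))) * kappa[upper].sum()
```

A value outside the interval [−1.96, +1.96] indicates, at the 5% significance level, that the chosen controls have not removed the cross-sectional dependence from the residuals.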

Elhorst (2021) compares the performance of a dynamic spatial panel data model applied to the cigarette demand data set of 46 U.S. states over the period 1963–1992 when using spatial and time period fixed effects, cross-sectional averages, and principal components. It is found that only the model with spatial and time period fixed effects is able to produce a CD-test on the residuals that takes a value in the interval [−1.96,+1.96]. It should be noted, however, that two other recent studies point to different results. Ciccarelli and Elhorst (2018) find that the dynamic model with cross-sectional averages outperforms its counterpart with spatial and time-period fixed effects. Similarly, Elhorst et al. (2020b) find that the dynamic model with principal components outperforms its counterpart with spatial and time-period fixed effects. The conclusion must be that the best model to control for common time trends might differ from one empirical study to another.

10 Data

Most spatial econometric studies are based on geographical data. Units of observation might be zip codes, neighborhoods, cities, municipalities, regions, counties, states, jurisdictions, or countries. Spatial econometric models are also used to explain the behavior of economic agents other than geographical units, such as individuals, firms, or governments. One example is a hedonic price equation in which the price of each house is explained by the price and characteristics of other houses that were sold earlier in the neighborhood of that house (Kelejian and Piras 2017, p. 13). Elhorst et al. (2020a) illustrate that the popularity of spatial econometric studies based on microeconomic data sets is increasing by providing an overview of all studies based on relatively large microeconomic data sets (up to 382,000 observations) that appeared in the last four volumes of the journal Spatial Economic Analysis.

Another type of data that deserves more attention is economic-historical data. Keller and Shiue (2007) study interregional trade by examining the spatial pattern of rice price differences in 121 Chinese prefectural markets between the years 1742 and 1795. Groote et al. (2009) employ a dynamic econometric model with lags in both space and time to measure the impact of infrastructure improvements on the standard of living in municipalities located in the north of the Netherlands during the period 1815/1835–1890. Ciccarelli and Elhorst (2018) adopt a dynamic spatial panel data model with common factors to explain the non-stationary diffusion process of cigarette consumption across 69 Italian provinces over the period 1877–1913. The need for more studies of this type and, related to that, the need to use more advanced models, such as the one set out in this paper, appears from the debate that is currently going on in the literature on persistence studies (Kelly 2019; Voth 2020). A substantial literature on persistence finds that historical characteristics or events in specific places determine current socioeconomic outcomes. Kelly (2019) analyses 27 persistence studies that appeared in four leading economic or econometric journals over the period 2001–2019 and concludes that the residuals of the equations adopted in those studies appear to be substantially spatially autocorrelated in most cases. One reason for this, according to Voth (2020), is that cross-sectional differences are the main source of variation. Kelly (2019) and Voth (2020) also discuss several remedies: the inclusion of cross-sectional fixed effects, clustered standard errors, and noise simulations. However, a more fundamental modelling approach would be one that accounts for dynamic effects, local spatial dependence and global cross-sectional dependence within one simultaneous framework.

11 Conclusion

The general nesting spatial (GNS) econometric model for spatial panels with common factors (CF) is the most general spatial econometric model currently available for empirical research. Hopefully, this paper encourages more scholars to work with this model in their empirical research. At the same time, they should be warned that this is a difficult model to work with since the estimation results produced by this model are often quite puzzling, especially in the beginning. This advanced model requires extensive research experience in spatial econometrics and sufficient economic-theoretical knowledge of the problem at hand. Often the results are not in line with initial expectations, but after thinking them over and debating them with other researchers, progress towards an acceptable model specification can be made step by step.