Abstract
Bayesian approaches play an important role in the development of new spatial econometric methods, but are uncommon in applied work. This is partly due to a lack of accessible, flexible software for the Bayesian estimation of spatial models. Established probabilistic software struggles with the specifics of spatial econometrics, while classical implementations do not harness the flexibility of Bayesian modelling. In this paper, I present a layered, object-oriented software architecture that bridges this gap. An R implementation in the bsreg package allows quick and easy estimation of spatial econometric models, while remaining maintainable and extensible. I demonstrate the benefits of the Bayesian approach, and of the software, using a well-known dataset on cigarette demand. First, I show that Bayesian posterior densities yield better insights into the uncertainty of nonlinear models. Second, I find that earlier studies overestimate spillover effects for distance-based connectivities due to a scaling error, highlighting the need for tried and tested software.
1 Introduction
The spatial dimension of economic and other activities of interest is often vital for understanding processes at work. Spatial spillover effects, i.e. impacts of an observational unit i on other units \(j \ne i\), are commonplace in theory, but can pose considerable challenges for econometric modelling. Spatial econometrics allows researchers to analyse and control for this dimension in a parsimonious way. Spatial models are an important part of the empirical toolkit of many disciplines, in particular of economics. They have successfully been used to investigate the determinants of growth (Acemoglu et al. 2019; Crespo Cuaresma et al. 2014; LeSage and Fischer 2008; Panzera and Postiglione 2021), the drivers of land use change (Arima et al. 2011; Chakir and Le Gallo 2013; Kuschnig et al. 2021), international trade (Behrens et al. 2012; Krisztin and Fischer 2015; Yang et al. 2017), and many more pressing issues.
Bayesian approaches to spatial models allow for flexible specifications that loosen restrictive assumptions and impose structure where needed. This allows for models that better reflect reality and natively account for uncertainty. Related developments in the field of spatial econometrics include Bayesian model averaging and model selection (LeSage and Parent 2007; LeSage and Fischer 2008; Crespo Cuaresma and Feldkircher 2013; Pfarrhofer and Piribauer 2019), hierarchical and mixture modelling (Cornwall and Parent 2017; Dong and Harris 2015; Dong et al. 2015; Lacombe and McIntyre 2016), the flexible treatment of connectivity (Debarsy and LeSage 2018, 2020; Han and Lee 2016; Krisztin and Piribauer 2021), and models with limited dependent variables (Krisztin et al. 2021; LeSage 2000). However, Bayesian approaches are rarely used for applied spatial econometrics. This is due, in part, to a lack of dedicated software for Bayesian spatial modelling.
The flexibility of Bayesian modelling and the peculiarities of spatial econometrics challenge good software implementations. General-purpose software for Bayesian modelling, such as JAGS (Plummer 2003) and Stan (Carpenter et al. 2017), is not optimised for these models and can be computationally inefficient (see Wolf et al. 2018). Bivand and Piras (2015) first document a lack of software for Bayesian spatial econometric methods, with the notable exception of the MATLAB (The MathWorks Inc 2021) Econometrics Toolbox by LeSage (1999). At the time of writing, some toolbox routines were ported to the spatialreg package (Bivand and Piras 2015), the R-INLA (Lindgren and Rue 2015) package allows for marginal inference (also see Bivand et al. 2015; Gómez-Rubio et al. 2021), and there exist many implementations of specific (e.g. Dong et al. 2016) or related models (e.g. Morris et al. 2019). However, comprehensive free and open source software for spatial econometric models remains scarce—particularly for the R (R Core Team 2021) ecosystem.
In this paper, I present a layered, object-oriented software architecture that bridges the flexibility of Bayesian modelling with the necessities of spatial econometrics. I provide an implementation in bsreg (Kuschnig 2021), an R package for estimating spatial econometric models. The package provides routines for common models and, thanks to its architecture, can readily be adapted and extended. A combination of two distinct programming and user interfaces allows for a universal underlying structure, while accommodating existing R workflows. The R6 (Chang 2021) object-oriented system is used for the underlying structure; outputs are provided in a standard format, integrating into established procedures with base, third-party, or custom functionality. This duality of a flexible, object-oriented structure and a familiar user interface makes the proposed software architecture maintainable, extensible, and convenient to use.
I demonstrate the flexibility of Bayesian spatial modelling and bsreg by reproducing a frequentist analysis of cigarette demand. Halleck Vega and Elhorst (2015) propose a model with local spatial lags, where distance-based connectivities are parameterised—I present a Bayesian variant of their approach. I find that Bayesian posterior densities yield much improved measures of uncertainty. Uncertainty in nonlinear models is characterised by heavy tails that are not reflected in conventional measures of uncertainty. I also find that the estimates of average spillover effects by Halleck Vega and Elhorst (2015) were considerably inflated for distance-based connectivities. This is due to a scaling issue that, at the time of writing, also exists in the widely used spatialreg package. This suggests that (a) Bayesian methods are suitable for flexibly extending spatial models, (b) Bayesian posteriors yield improved insights into uncertainty, and (c) the complexities of spatial models, including their interpretation, present a dangerous source of errors for ad-hoc, untested approaches.
The remainder of this paper is structured as follows. First, I introduce relevant econometric methods to the reader. This provides a foundation to discuss the software architecture and implementation. I start by sketching out spatial econometrics, and refer the interested reader to Anselin (2013) and LeSage and Pace (2009) for further information. Then, I introduce Bayesian inference and simulation methods for Bayesian estimation, where I also present a Bayesian approach to the parameterised connectivity model of Halleck Vega and Elhorst (2015). For more information on Bayesian inference and computation, I refer to Lancaster (2004), Gelman et al. (2013), and Gamerman and Lopes (2006). In Sect. 4, I present the proposed software architecture, discuss the implementation of the bsreg package and demonstrate its use. In Sect. 5, I estimate several spatial models of cigarette demand and discuss similarities and differences of the Bayesian approach to previous results by Halleck Vega and Elhorst (2015). Section 6 concludes and provides an outlook for future work.
2 Spatial econometric models
Consider a standard linear regression model
$$\begin{aligned} {\mathbf {y}} = {\mathbf {X}} \beta + {\mathbf {e}}, \end{aligned}$$
(1)
where \({\mathbf {y}} \in {\mathbb {R}}^N\) and \({\mathbf {X}} \in {\mathbb {R}}^{N \times K}\) collect the dependent and explanatory variables, \(\beta \in {\mathbb {R}}^K\) is a coefficient vector, and \({\mathbf {e}} \in {\mathbb {R}}^N\) is an error term with mean zero. The idea behind spatial econometric models is to extend this model with spatial information by using neighbouring values. A comprehensive spatial econometric model, also termed the ‘general nested model’, is given by
$$\begin{aligned} {\mathbf {y}} = \lambda {\mathbf {W}} {\mathbf {y}} + {\mathbf {X}} \beta + {\mathbf {W}} {\mathbf {X}} \theta + {\mathbf {u}}, \quad {\mathbf {u}} = \rho {\mathbf {W}} {\mathbf {u}} + \varepsilon , \end{aligned}$$
where \({\mathbf {W}} \in {\mathbb {R}}^{N \times N}\) is a connectivity matrix that imposes a neighbourhood structure, with elements \({\mathbf {w}}_{ij} > 0\) for neighbours i and j (where \(i \ne j\)) and 0 otherwise. The matrix is usually scaled in some way (see Anselin 2013, for further information on connectivity matrices), and variables are assumed to be centred for notational convenience.
This model includes three spatial lags that are the building blocks of spatial econometric models. These are the spatial autoregressive lag \({\mathbf {W}} {\mathbf {y}}\), the spatially lagged regressors \({\mathbf {W}} {\mathbf {X}}\), and the spatially lagged error \({\mathbf {W}} {\mathbf {u}}\). In practice, this comprehensive model sees little use and models with one or two of the spatially lagged terms are preferred. I briefly introduce the most prominent ones below.
The spatial lag of \({\mathbf {X}}\) (SLX) model is given by
$$\begin{aligned} {\mathbf {y}} = {\mathbf {X}} \beta + {\mathbf {W}} {\mathbf {X}} \theta + \varepsilon . \end{aligned}$$
(2)
This model is notable for two main reasons. First, it allows for local spillovers from neighbour to neighbour that are captured in the parameter \(\theta\). Second, for a given connectivity matrix, the SLX model is linear in parameters. We can express it in the form of a standard linear model (Eq. (1)) by letting \({\mathbf {X}} = \left[ {\mathbf {X}},\, {\mathbf {W}}{\mathbf {X}} \right]\). This means that the SLX model is straightforward to estimate, while yielding insights into spillover effects. For a more indepth discussion of the SLX model, I refer to LeSage and Pace (2009) and Halleck Vega and Elhorst (2015).
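The linearity in parameters can be made concrete with a small numerical sketch (in Python, for illustration only; the data and the circular two-neighbour contiguity matrix are simulated assumptions, not the cigarette application):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 200, 2

# Hypothetical connectivity: each unit neighbours its predecessor and
# successor on a ring, with row-normalised weights (row sums of one)
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5

# Simulate SLX data: y = X beta + W X theta + e
X = rng.standard_normal((N, K))
beta, theta = np.array([1.0, -0.5]), np.array([0.4, 0.2])
y = X @ beta + W @ X @ theta + 0.1 * rng.standard_normal(N)

# The SLX model is linear in parameters: least squares on the
# augmented design matrix Z = [X, WX] recovers beta and theta
Z = np.hstack([X, W @ X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat, theta_hat = coef[:K], coef[K:]
```

For a fixed connectivity matrix, any linear-model routine can thus estimate the SLX model; uncertainty about the connectivity itself is what requires the SLX(\(\delta\)) machinery discussed later.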
The next notable models are the spatial autoregressive (SAR) model and the spatial error model (SEM). We can express the SAR model in two useful ways, which are
$$\begin{aligned} {\mathbf {y}} = \lambda {\mathbf {W}} {\mathbf {y}} + {\mathbf {X}} \beta + \varepsilon , \end{aligned}$$
(3)
$$\begin{aligned} {\mathbf {y}} = \left( {\mathbf {I}} - \lambda {\mathbf {W}} \right) ^{-1} \left( {\mathbf {X}} \beta + \varepsilon \right) , \end{aligned}$$
(4)
and the SEM in the following two ways
$$\begin{aligned} {\mathbf {y}} = {\mathbf {X}} \beta + {\mathbf {u}}, \quad {\mathbf {u}} = \lambda {\mathbf {W}} {\mathbf {u}} + \varepsilon , \end{aligned}$$
(5)
$$\begin{aligned} {\mathbf {y}} = {\mathbf {X}} \beta + \left( {\mathbf {I}} - \lambda {\mathbf {W}} \right) ^{-1} \varepsilon . \end{aligned}$$
(6)
Both the SEM and SAR model induce nonlinear spatial filters \(\left( {\mathbf {I}} - \lambda {\mathbf {W}} \right)\), in which the strength of connectivity is captured by a single spatial parameter, \(\lambda\). Nonlinearity can be a challenge for estimation; suitable modes of estimation are the generalised method of moments (Kelejian and Prucha 1998, 1999), maximum likelihood (Lee 2004), and Bayesian methods (LeSage and Pace 2009). The SAR model’s autoregressive parameter captures global spillovers, which occur across all units. This offers additional insights, but complicates interpretation (as discussed below). The SEM allows for spatial dependence of shocks; this is usually interpreted as spatial heterogeneity as opposed to a spatial spillover (see e.g. LeSage and Pace 2009).
Three further spatial econometric models combine two of the three spatially lagged terms. They are the spatial Durbin model (SDM), spatial Durbin error model (SDEM), and the spatial autoregressive combined (SAC) model. Arguably the most notable one is the SDM—a combination of the spatial terms in the SLX and SAR models. Here I will just note that the SDM has convenient properties that lead LeSage and Pace (2009) to argue for considering it as the ‘default’ spatial model. See Elhorst (2010) for a discussion of this notion and Halleck Vega and Elhorst (2015) for a different argument, favouring the SLX model. Note that this discussion is shaped by the previously dominant role of the SAR model, which is sometimes referred to as the ‘spatial lag model’.
2.1 Interpretation of spatial econometric models
Partial effects of an explanatory variable k are directly captured by the respective coefficient \(\beta _k\) in the standard linear model. This is not generally the case for models with spatial lags of the dependent and explanatory variables, such as the SLX and SAR models. In these models, partial effects are (1) matrices, and (2) depend on the spatial parameters and connectivity matrix.
Partial effects of a variable k in a SAR model involve the spatial filter and are given by
$$\begin{aligned} \frac{\partial {\mathbf {y}}}{\partial {\mathbf {x}}_k'} = \left( {\mathbf {I}} - \lambda {\mathbf {W}} \right) ^{-1} {\mathbf {I}}_N \beta _k, \end{aligned}$$
whereas in an SLX model, they are given by
$$\begin{aligned} \frac{\partial {\mathbf {y}}}{\partial {\mathbf {x}}_k'} = {\mathbf {I}}_N \beta _k + {\mathbf {W}} \theta _k. \end{aligned}$$
The resulting matrices can be interpreted using the summary measures of average total, direct, and indirect effects (LeSage and Pace 2009). These are given by
$$\begin{aligned} {\bar{M}}_{direct} = N^{-1} \, \mathrm {tr} \left( \frac{\partial {\mathbf {y}}}{\partial {\mathbf {x}}_k'} \right) , \quad {\bar{M}}_{total} = N^{-1} \, \iota _N' \frac{\partial {\mathbf {y}}}{\partial {\mathbf {x}}_k'} \iota _N, \quad {\bar{M}}_{indirect} = {\bar{M}}_{total} - {\bar{M}}_{direct}, \end{aligned}$$
where \(\iota _N\) is an N-dimensional vector of ones.
For the SAR model, these summary measures involve the inverse of \({\mathbf {S}}\), which can be prohibitive to compute. Average direct effects in the SLX model, i.e. the average trace of the partial effect matrix, are directly given by \(\beta _k\). In certain situations, e.g. when using a row-stochastic connectivity matrix \({\mathbf {W}}\), the average indirect effect is also given by \(\theta _k\). However, this is not the case in general. The average indirect effect in the SLX model is given by \(s_{\theta } \theta _k\), where \(s_{\theta }\) is a scaling factor equal to the sum of all elements of \({\mathbf {W}}\), divided by N; for a row-stochastic \({\mathbf {W}}\), this factor is one.
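A minimal numerical sketch of these summary measures for the SLX model may be helpful (Python, with a simulated inverse-distance connectivity; all values are hypothetical). It illustrates that the average indirect effect scales with the elements of \({\mathbf {W}}\), and collapses to \(\theta _k\) only once \({\mathbf {W}}\) is row-stochastic:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50

# Hypothetical inverse-distance connectivity (not row-stochastic),
# with a zero diagonal (no self-neighbours)
coords = rng.random((N, 2))
d = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
W = np.zeros((N, N))
mask = ~np.eye(N, dtype=bool)
W[mask] = 1.0 / d[mask]

beta_k, theta_k = 1.0, 0.5

# Partial-effect matrix of variable k in the SLX model: I beta_k + W theta_k
P = np.eye(N) * beta_k + W * theta_k
direct = np.trace(P) / N        # average direct effect = beta_k
total = P.sum() / N             # average total effect
indirect = total - direct       # average indirect effect = s_theta * theta_k

# The scaling factor: sum of all elements of W over N (average row sum)
s_theta = W.sum() / N

# After row-normalising W, the average indirect effect is theta_k itself
Wn = W / W.sum(axis=1, keepdims=True)
Pn = np.eye(N) * beta_k + Wn * theta_k
indirect_rownorm = Pn.sum() / N - np.trace(Pn) / N
```

Reporting \(\theta _k\) as the indirect effect without this scaling is exactly the error discussed in the application below.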
3 Bayesian inference
In Bayesian statistics, probability expresses a degree of belief. Bayes’ theorem is used as a tool to update probabilities in the light of new information and is given by
$$\begin{aligned} p(\theta \mid {\mathcal {D}}) \propto p({\mathcal {D}} \mid \theta ) \, p(\theta ), \end{aligned}$$
where \(\theta\) denotes a set of unknown quantities, \({\mathcal {D}}\) denotes observed quantities, p is the probability density of a probability distribution, and \(\propto\) reads as ‘is proportional to’. The posterior probability, \(p(\theta \mid {\mathcal {D}})\), is obtained by updating the prior information, \(p(\theta )\), with the likelihood, \(p({\mathcal {D}} \mid \theta )\). The notion of sequentially updating information—e.g. in a hierarchical setup—is central to Bayesian inference and provides a rich and intuitive modelling framework.
3.1 Closed form inference and the linear model
A Bayesian model is built on distribution assumptions. Consider, for example, the standard linear regression model in Eq. (1). Assuming that errors are distributed normally, with mean zero and constant variance \(\sigma ^2\), the model can be expressed as
$$\begin{aligned} {\mathbf {y}} \mid {\mathbf {X}}, \beta , \sigma ^2 \sim {\mathcal {N}}_N \left( {\mathbf {X}} \beta , \sigma ^2 {\mathbf {I}}_N \right) , \end{aligned}$$
(7)
where \({\mathcal {N}}_q(\mu , \mathbf {\Sigma })\) denotes the density of a q-dimensional Normal distribution with mean \(\mu\) and covariance \(\mathbf {\Sigma }\), \({\mathbf {I}}_N\) is the identity matrix, the unknown quantities are \(\theta = \left( \beta , \sigma ^2 \right)\), and the observed ones are \({\mathcal {D}} = \left( {\mathbf {y}}, {\mathbf {X}} \right)\). The likelihood of this model is given by
$$\begin{aligned} p({\mathcal {D}} \mid \theta ) = \left( 2 \pi \sigma ^2 \right) ^{-N/2} \exp \left( -\frac{({\mathbf {y}} - {\mathbf {X}} \beta )'({\mathbf {y}} - {\mathbf {X}} \beta )}{2 \sigma ^2} \right) . \end{aligned}$$
(8)
Next, prior distributions have to be specified for the unobserved quantities, e.g.
$$\begin{aligned} p(\beta \mid \sigma ^2) = {\mathcal {N}}_K \left( \mu _0, \sigma ^2 \mathbf {\Sigma }_0 \right) , \quad p(\sigma ^2) = {\mathcal {G}}^{-1} \left( a_0, b_0 \right) , \end{aligned}$$
(9)
where \({\mathcal {G}}^{-1}(a, b)\) refers to the density of an inverted Gamma distribution with shape a and scale b. This prior is often termed the conjugate Normal Inverse-Gamma prior.
Bayesian inference proceeds by combining the priors and likelihood into the joint posterior, \(p(\beta , \sigma ^2 \mid {\mathcal {D}})\). In this setup, the prior is conjugate, meaning that the posterior is from the same family of probability distributions as the prior. As a result, there is a closed form expression for the joint posterior, and knowledge of its probability distribution can be used for inference or to draw samples directly from the posterior.
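Direct sampling under conjugacy can be sketched as follows (a minimal Python illustration using the standard Normal Inverse-Gamma posterior moments; the function name and default hyperparameters are illustrative assumptions, not the bsreg implementation):

```python
import numpy as np

def conjugate_posterior_draws(y, X, n_draws=1000, b0=None, B0_inv=None,
                              a0=2.0, c0=1.0, seed=0):
    """Independent draws from the posterior under the conjugate prior
    beta | sigma2 ~ N(b0, sigma2 B0), sigma2 ~ IG(a0, c0)."""
    rng = np.random.default_rng(seed)
    N, K = X.shape
    b0 = np.zeros(K) if b0 is None else b0
    B0_inv = np.eye(K) * 1e-4 if B0_inv is None else B0_inv  # diffuse default
    # Standard posterior moments of the Normal Inverse-Gamma family
    B_bar_inv = B0_inv + X.T @ X
    B_bar = np.linalg.inv(B_bar_inv)
    b_bar = B_bar @ (B0_inv @ b0 + X.T @ y)
    a_bar = a0 + N / 2
    c_bar = c0 + (y @ y + b0 @ B0_inv @ b0 - b_bar @ B_bar_inv @ b_bar) / 2
    # sigma2 ~ IG(a_bar, c_bar); beta | sigma2 ~ N(b_bar, sigma2 B_bar)
    sigma2 = 1.0 / rng.gamma(a_bar, 1.0 / c_bar, size=n_draws)
    z = rng.multivariate_normal(np.zeros(K), B_bar, size=n_draws)
    beta = b_bar + z * np.sqrt(sigma2)[:, None]
    return beta, sigma2
```

Because the draws are independent, no burn-in or convergence diagnostics are needed—this is the practical appeal of conjugacy.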
Conjugate priors and the closed form inference they allow play an important role in Bayesian modelling. However, they are limited to special settings and cannot be used for the arbitrarily flexible models implied by Bayes’ theorem. In general, posteriors are not of a well-known form and another approach is needed for inference.
3.2 Posterior simulation and the SLX(\(\delta\)) model
Consider adapting the conjugate Normal Inverse-Gamma prior in Eq. (9) so that the prior covariance of \(\beta\) is independent of \(\sigma\). The resulting prior is often termed the independent Normal Inverse-Gamma prior and is given by
$$\begin{aligned} p(\beta ) = {\mathcal {N}}_K \left( \mu _0, \mathbf {\Sigma }_0 \right) , \quad p(\sigma ^2) = {\mathcal {G}}^{-1} \left( a_0, b_0 \right) . \end{aligned}$$
(10)
This small adaptation means that the joint posterior is no longer available in closed form. Samples from the posterior distribution can still be obtained using Markov chain Monte Carlo (MCMC) methods. Closed forms of the conditional posteriors \(p(\beta \mid {\mathcal {D}}, \sigma ^2)\) and \(p(\sigma ^2 \mid {\mathcal {D}}, \beta )\) are available, and Gibbs sampling can be used to obtain dependent draws from the posterior. These dependent samples are subject to some autocorrelation, meaning that their effective size, or utility for inferential purposes, will be lower than that of an independent sample of equal size. A Gibbs sampling algorithm for this case works as follows.

0. Let i be zero and set an initial value \(\beta _{(i)}\).
1. Draw \(\sigma ^2_{(i + 1)}\) from the conditional posterior \(p(\sigma ^2 \mid {\mathcal {D}}, \beta _{(i)})\).
2. Draw \(\beta _{(i + 1)}\) from the conditional posterior \(p(\beta \mid {\mathcal {D}}, \sigma ^2_{(i + 1)})\).
3. Increment i and go to Step 1 if more samples are desired.

Algorithm 1: Gibbs sampler for the independent Normal Inverse-Gamma prior
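The steps of Algorithm 1 can be sketched in Python (a minimal illustration with the standard conditional posteriors of the independent Normal Inverse-Gamma setup; function name and default hyperparameters are illustrative assumptions, not the bsreg implementation, which is written in R):

```python
import numpy as np

def gibbs_linear(y, X, n_save=1000, n_burn=500, b0=None, B0_inv=None,
                 a0=2.0, c0=1.0, seed=0):
    """Gibbs sampler for y = X beta + e under the independent prior
    beta ~ N(b0, B0), sigma2 ~ IG(a0, c0)."""
    rng = np.random.default_rng(seed)
    N, K = X.shape
    b0 = np.zeros(K) if b0 is None else b0
    B0_inv = np.eye(K) * 1e-4 if B0_inv is None else B0_inv  # diffuse default
    XtX, Xty = X.T @ X, X.T @ y

    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # Step 0: initial value
    draws = np.empty((n_save, K + 1))
    for i in range(n_save + n_burn):
        # Step 1: sigma2 | beta ~ IG(a0 + N/2, c0 + e'e / 2)
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(a0 + N / 2, 1.0 / (c0 + resid @ resid / 2))
        # Step 2: beta | sigma2 ~ N(b_bar, B_bar)
        B_bar = np.linalg.inv(B0_inv + XtX / sigma2)
        b_bar = B_bar @ (B0_inv @ b0 + Xty / sigma2)
        beta = rng.multivariate_normal(b_bar, B_bar)
        # Step 3: loop; store draws only after the burn-in period
        if i >= n_burn:
            draws[i - n_burn] = np.r_[beta, sigma2]
    return draws
```

The stored draws approximate the joint posterior and can be summarised by their means, quantiles, or full densities.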
Assuming that \(\beta _{(0)}\) from Step 0 is a valid draw from \(p(\beta \mid {\mathcal {D}})\), it is easy to show that subsequent draws are valid draws from the posterior. For an arbitrary initial value \(\beta _{(0)}\), the Gibbs sampler converges to this state under mild conditions. The stationary distribution of the Gibbs sampler mirrors the joint posterior, and draws from it can be used for inference. A number of the initial samples is often discarded as burn-in, to ensure that draws are from the stationary distribution and to limit the influence of the initial value.
For many models of interest, the conditional posteriors are not conveniently available. Consider the SLX model with parameterised connectivity (Halleck Vega and Elhorst 2015), which we will term the SLX(\(\delta\)) model. Departing from the Bayesian linear model in Eq. (7), we can express this model as
$$\begin{aligned} {\mathbf {y}} = {\mathbf {Z}}(\delta ) \beta + \varepsilon , \quad \varepsilon \sim {\mathcal {N}}_N \left( \mathbf {0}, \sigma ^2 {\mathbf {I}}_N \right) , \end{aligned}$$
where \({\mathbf {Z}}(\delta ) = \left[ {\mathbf {X}},\, \mathbf {\Psi }(\delta ){\mathbf {X}}\right]\) and \(\mathbf {\Psi }(\delta )\) is a connectivity function with parameter \(\delta\) and some form of scaling. For fixed values \({\bar{\delta }}\), this model collapses to the standard SLX model (see Eq. (2)) with connectivity matrix \({\mathbf {W}} = \mathbf {\Psi }({\bar{\delta }})\). An example for \(\mathbf {\Psi }\) is an inverse-distance decay function with \(\psi _{ij} = d_{ij}^{-\delta }\) for \(i \ne j\) and 0 otherwise, where \(d_{ij}\) is some distance between observations i and j. In this case, a suitable prior for \(\delta\) is
In order to obtain samples from the joint posterior \(p(\delta , \beta , \sigma ^2 \mid {\mathcal {D}})\), the sampling approach from above needs to be adapted. First, note that conditional on knowing \(\delta\), the Gibbs sampling steps for \(\beta\) and \(\sigma ^2\) can be left unchanged—the hurdle is to obtain draws of \(\delta\). Gibbs sampling is not an option, since the conditional posterior of \(\delta\) has no well-known form. It is given by
$$\begin{aligned} p(\delta \mid {\mathcal {D}}, \beta , \sigma ^2) \propto p(\delta ) \exp \left( -\frac{\left( {\mathbf {y}} - {\mathbf {Z}}(\delta ) \beta \right) ' \left( {\mathbf {y}} - {\mathbf {Z}}(\delta ) \beta \right) }{2 \sigma ^2} \right) . \end{aligned}$$
An alternative is the Metropolis–Hastings (MH) algorithm, another MCMC method that does not rely on a well-known conditional posterior distribution. In an MH algorithm, a draw is proposed from an arbitrary proposal density \({\mathcal {Q}}(\delta ^\star \mid \delta _{(i)})\) and is accepted with a certain probability \(\alpha (\delta ^\star \mid \delta _{(i)})\), such that draws will reflect the joint posterior. Note that the Gibbs sampler outlined above is a special case of the MH algorithm, where draws are proposed in such a way that they are always accepted. An MH sampler for \(\delta\) works as follows.

0. Let i be zero and set an initial draw \(\delta _{(i)}\).
1. Propose a value \(\delta ^\star\) using \({\mathcal {Q}}(\delta ^\star \mid \delta _{(i)})\).
2. Set \(\delta _{(i + 1)}\) to \(\delta ^\star\) with probability \(\alpha (\delta ^\star \mid \delta _{(i)})\) and to \(\delta _{(i)}\) otherwise.
3. Increment i and go to Step 1 if more samples are desired.

Algorithm 2: Metropolis–Hastings sampler for a connectivity parameter \(\delta\)
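A random-walk variant of Algorithm 2 can be sketched generically (Python; the function targets any log-posterior passed to it, so it can be checked against a known density—the proposal here is a symmetric Normal, an illustrative assumption):

```python
import numpy as np

def metropolis_hastings(log_post, delta0, n_draws=5000, scale=1.0, seed=0):
    """Random-walk Metropolis-Hastings: proposes delta* ~ N(delta_i, scale^2),
    a symmetric proposal, so the proposal ratio cancels from the acceptance
    probability, which is computed on the log scale for numerical stability."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_draws)
    delta, lp = delta0, log_post(delta0)
    accepted = 0
    for i in range(n_draws):
        # Step 1: propose a value from the (symmetric) proposal density
        proposal = delta + scale * rng.standard_normal()
        lp_prop = log_post(proposal)
        # Step 2: accept with probability min(1, p(delta*) / p(delta_i))
        if np.log(rng.random()) < lp_prop - lp:
            delta, lp = proposal, lp_prop
            accepted += 1
        # Step 3: store the current state, including repeated values
        draws[i] = delta
    return draws, accepted / n_draws
```

Applied to a standard Normal log-density, the sampler should recover a mean near zero and a standard deviation near one, with a moderate acceptance rate.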
The acceptance probability in Step 2 is composed of the ratio of the conditional posteriors and the ratio of the proposal densities, evaluated at the two values. It is given by
$$\begin{aligned} \alpha (\delta ^\star \mid \delta _{(i)}) = \min \left\{ 1, \, \frac{p(\delta ^\star \mid {\mathcal {D}}, \beta , \sigma ^2)}{p(\delta _{(i)} \mid {\mathcal {D}}, \beta , \sigma ^2)} \, \frac{{\mathcal {Q}}(\delta _{(i)} \mid \delta ^\star )}{{\mathcal {Q}}(\delta ^\star \mid \delta _{(i)})} \right\} . \end{aligned}$$
Assume for now that the second ratio cancels out, as is the case for a symmetric proposal density. Then the first ratio, i.e. the conditional posterior evaluated at \(\delta ^\star\) and at \(\delta _{(i)}\), implies that the sampler will always move to a value \(\delta ^\star\) that is more probable than the current \(\delta _{(i)}\); to a less probable value, it will only move with a certain probability. Samples obtained this way will converge to a stationary distribution mirroring the joint posterior under mild conditions. As in the case of the Gibbs sampler, it is common to discard a number of initial samples.
3.3 Sampling considerations and other spatial models
Bayesian estimation of models with nonlinear spatial filters, such as the SAR model, works similarly to the approach described above. A common prior for the parameter \(\lambda\), which is constrained to the support \((-1, 1)\), is a Beta density
$$\begin{aligned} p(\lambda ) \propto \left( \frac{1 + \lambda }{2} \right) ^{a - 1} \left( \frac{1 - \lambda }{2} \right) ^{b - 1}. \end{aligned}$$
In such a model, sampling of the parameters \(\beta\) and \(\sigma ^2\) works as described above. An MH step can be used to sample \(\lambda\) from its conditional posterior. Letting \({\mathbf {S}}(\lambda ) = \left( {\mathbf {I}} - \lambda {\mathbf {W}}\right)\), the conditional posterior in a SAR model is given by
$$\begin{aligned} p(\lambda \mid {\mathcal {D}}, \beta , \sigma ^2) \propto p(\lambda ) \, \left| {\mathbf {S}}(\lambda ) \right| \exp \left( -\frac{\left( {\mathbf {S}}(\lambda ) {\mathbf {y}} - {\mathbf {X}} \beta \right) ' \left( {\mathbf {S}}(\lambda ) {\mathbf {y}} - {\mathbf {X}} \beta \right) }{2 \sigma ^2} \right) . \end{aligned}$$
This likelihood can be prohibitive to compute (mainly due to the \(N \times N\) determinant), which is a central issue for Bayesian (and maximum likelihood) estimation (Bivand et al. 2013). The determinant generally has a computational complexity of \({\mathcal {O}}(N^3)\), which can be prohibitive for sampling. However, it can be computed efficiently using the eigenvalues of \({\mathbf {W}}\), or via other approaches, including sparse matrix factorisations and approximate methods (see e.g. Barry and Pace 1999; Pace et al. 2004; Smirnov and Anselin 2009). Efficient sampling is crucial and can be attained, e.g., by concentrating the likelihood with respect to \(\lambda\) and employing suitable simulation methods.
Bayesian simulation methods aim to generate a usable number of effective draws from the posterior in as little time as possible. This requires balancing the computational cost of obtaining draws and the effective size of these draws. The Gibbs and MH algorithms are computationally cheap ways of producing dependent draws. Their efficiency in producing effective draws relies on rapid mixing, i.e. movement through the support of the posterior. A sampler that mixes slowly has high autocorrelation, hence producing a low number of effective draws. An important concept in simulation methods is posterior exploration, i.e. how well a sampler explores the full posterior density. Exploration is a prerequisite for proper sampling, since any number of draws is irrelevant if they do not reflect the posterior.
An important advantage of the MH algorithm over Gibbs sampling is that it can be tuned. The MH sampling steps can be improved by adjusting the proposal density to achieve suitable acceptance rates. If the acceptance rate is too high or too low, the sampler will suffer from poor posterior exploration and converge to the stationary distribution very slowly. This can be diagnosed by a high degree of autocorrelation in the posterior samples. MH samplers are usually tuned to acceptance rates between 0.2 and 0.5 (see e.g. Bédard 2008; Sherlock and Roberts 2009, for research on optimal acceptance rates). There exist many more MCMC methods that can be very effective, e.g. ones based on Hamiltonian Monte Carlo. See Wolf et al. (2018) for a review of sampling methods in a spatial econometric context. Markov chain Monte Carlo simulation methods and increasing computational power play an important role in the prominence of Bayesian inference. They enable inference in countless settings that are analytically intractable and can readily be combined for modern flexible Bayesian modelling.
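Tuning the proposal can be sketched with a simple burn-in adaptation scheme (Python; the batch sizes, adjustment factor, and target band are illustrative assumptions, not a statement of how any particular package tunes its samplers):

```python
import numpy as np

def tuned_scale(log_post, delta0, scale=1.0, n_batches=50, batch=100,
                lower=0.2, upper=0.5, seed=0):
    """Adapt a random-walk proposal scale during burn-in: widen the proposal
    when the acceptance rate exceeds `upper`, narrow it when below `lower`."""
    rng = np.random.default_rng(seed)
    delta, lp = delta0, log_post(delta0)
    for _ in range(n_batches):
        accepted = 0
        for _ in range(batch):
            proposal = delta + scale * rng.standard_normal()
            lp_prop = log_post(proposal)
            if np.log(rng.random()) < lp_prop - lp:
                delta, lp, accepted = proposal, lp_prop, accepted + 1
        rate = accepted / batch
        if rate > upper:
            scale *= 1.1      # accepting too often: take larger steps
        elif rate < lower:
            scale /= 1.1      # rejecting too often: take smaller steps
    return scale
```

Starting from a step size that is far too small (high acceptance, slow exploration) or far too large (low acceptance), the scheme moves the scale towards the target band before the main sampling phase begins.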
4 Software
The flexibility of Bayesian modelling can be a challenge for clean and efficient software implementations. On the one hand, generalpurpose software may be inefficient in certain important cases. Adapting to corner cases leads to convoluted routines that introduce inefficiencies and are hard to maintain properly. On the other hand, more focused implementations are of limited use; necessary adaptations may be errorprone, inefficient to create, and hard to maintain.
In the following, I present a software architecture that bridges this gap. This software architecture builds on an objectoriented system with a layered interface. I provide an implementation in the bsreg (Kuschnig 2021) R (R Core Team 2021) package. The package follows this layered design, with an objectoriented base system that is accessed via an internal programming interface, and an idiomatic R user interface. This structure makes bsreg efficient and maintainable codewise, flexible and extensible for advanced modelling, and accessible to users.
In this section, I introduce the software architecture and design philosophy behind it. Then, I discuss the technical implementation of bsreg to illustrate the software architecture. Finally, I demonstrate the use of the package. In the next section, I provide an application to realworld data, where I reproduce and extend an analysis of cigarette demand.
4.1 Design philosophy
MCMC methods of interest to us can be understood as a machine with a state—the parameters—and a set of rules to update this state—the sampling steps. To instantiate this machine, some inputs are necessary—namely the data and prior settings, i.e. the immutable parts of the model, and an initial state. See Fig. 1 for a visualisation of the concept. We can use this idealised sampler to construct a prototype in an object-oriented framework. This framework allows us to extend the prototype with additional functionality where needed.
The linear model in Eq. (7) with an independent Normal Inverse-Gamma prior (Eq. (10)) is a sensible starting point for conceptualising this approach. First, we require a method to instantiate our object. The necessary inputs are the priors—\(p(\beta )\) and \(p(\sigma ^2)\)—the data—\({\mathbf {y}}\) and \({\mathbf {X}}\)—and an initial state. Second, we need a method for updating the state. For our model setup, we can employ the Gibbs sampling algorithm described in Sect. 3. Third, we need to be able to access the current state.
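This prototype can be sketched as a class (a Python analogue for illustration—the package itself implements the pattern with R6 classes in R, and the class and method names here are hypothetical):

```python
import numpy as np

class NormalGammaSampler:
    """Prototype MCMC 'machine': immutable inputs (data and priors), a
    mutable state (beta, sigma2), and an update rule (one Gibbs sweep)."""

    def __init__(self, y, X, b0=None, B0_inv=None, a0=2.0, c0=1.0, seed=0):
        # Immutable inputs: data, prior settings, and cached cross-products
        self.rng = np.random.default_rng(seed)
        self.y, self.X = y, X
        N, K = X.shape
        self.b0 = np.zeros(K) if b0 is None else b0
        self.B0_inv = np.eye(K) * 1e-4 if B0_inv is None else B0_inv
        self.a0, self.c0 = a0, c0
        self.XtX, self.Xty = X.T @ X, X.T @ y
        # Initial state, set via least squares
        self.beta = np.linalg.lstsq(X, y, rcond=None)[0]
        self.sigma2 = float(np.var(y - X @ self.beta))

    def sample(self):
        """One update of the state: Gibbs steps for sigma2 and beta."""
        resid = self.y - self.X @ self.beta
        N = len(self.y)
        self.sigma2 = 1.0 / self.rng.gamma(
            self.a0 + N / 2, 1.0 / (self.c0 + resid @ resid / 2))
        B_bar = np.linalg.inv(self.B0_inv + self.XtX / self.sigma2)
        b_bar = B_bar @ (self.B0_inv @ self.b0 + self.Xty / self.sigma2)
        self.beta = self.rng.multivariate_normal(b_bar, B_bar)

    @property
    def state(self):
        """Access the current state of the machine."""
        return {"beta": self.beta.copy(), "sigma2": self.sigma2}
```

An inheritor would override the instantiation to accept, say, a connectivity matrix, and extend `sample()` with additional steps, leaving the parent's steps untouched.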
Departing from this prototype, we can formulate a number of relevant setups. A simple example is an adaptation to the conjugate variant of the prior (Eq. (9)). The conjugacy implies that we can draw independent samples directly from the posterior. This means that we no longer require an initial state and that the update step can be simplified. Another simple example is the SLX model. It can be accommodated implicitly, where regressors already include their spatial lag, or explicitly, via an extra input argument for the connectivity matrix \({\mathbf {W}}\) that is used to lag regressors during instantiation.
The prototype can also be extended in more elaborate ways. For example, to the parameterised SLX(\(\delta\)) model, the SAR model, or to the SEM. For these models, the pattern of extension essentially remains the same. First, they require additional inputs for their respective parameters and connectivities. Previous sampling steps can be left unchanged, conditional on the new parameters (c.f. the alternative SAR and SEM formulations in Eqs. (4) and (6)). The new parameters can be updated using the MH algorithm, which itself could be expressed as an instance of an objectoriented MH class. These building blocks already allow us to express all common spatial models, with compound models like the SDM expressed via layered inheritance.
4.2 Technical implementation
The implementation of bsreg follows the object-oriented structure outlined above. For the object-oriented system, I rely on the third-party R6 package (Chang 2021) over more idiomatic, native systems. This type of object-oriented system is somewhat alien to R, but the layered interface allows the package to remain idiomatic for practical purposes. The top-level user interface builds on base formats and the standard S3 system. This allows for customary usage patterns that are linked to the R6 system via an intermediate programming interface.
The implementation starts with an abstract ‘Base’ class that provides a skeleton for successors. Instantiating this class is done in a layered way. First, the ‘initialize()’ method sets up meta information and evaluates related inheritor methods that provide priors and other settings. Next, the ‘setup()’ method takes in observed data and evaluates related inheritor methods. These may take in observed quantities and additional settings, and set starting values. This layered structure allows for the correct ordering of the steps required for full instantiation. Then, update steps are summarised in a ‘sample()’ method that calls individual sampling steps and updates meta information using a ‘finalize()’ method. This base class also provides fields for accessing and storing the data, cache, meta information, and potential Metropolis–Hastings samplers.
The first functional inheritor is the ‘NormalGamma’ class, which implements the linear model with an independent Normal Inverse-Gamma prior. Its ‘initialize()’ method has arguments for the prior mean and precision of \(\beta\), as well as the shape and rate of \(\sigma ^2\). In its ‘setup()’ method, prior parameters are adapted to fit the model dimensions and the known posterior shape is computed. Starting values can be supplied as additional arguments; otherwise, they are determined using least squares. The sampling steps for \(\beta\) and \(\sigma ^2\) follow the Gibbs sampling steps outlined in Sect. 3. The class also contains methods to access the parameters, residuals, and settings in their respective fields. The ‘NormalGamma’ class serves as the prototype for all other classes. Among them, the ‘ConjugateNormalGamma’ class is a straightforward inheritor. Since its posterior parameters are known, the sampling steps can be adapted accordingly, and independent draws of \(\beta\) and \(\sigma ^2\) can be obtained directly from their posteriors.
A spatial inheritor is the ‘SpatialLX’ class. It accommodates both the standard SLX and the SLX(\(\delta\)) model, depending on the connectivity and settings that are provided when instantiated. For the standard variant, the ‘setup()’ method requires only a connectivity matrix (or function and parameter) and, optionally, an indicator for regressors to lag spatially. Regressors and other cached values are modified accordingly, and the sampling proceeds as before. The more flexible SLX(\(\delta\)) variant, which considers uncertainty around the connectivity, makes some additional adaptations. First, settings and priors for the free \(\delta\) parameter must be provided to the ‘initialize()’ method. At this stage, functions for updating the parameter and the cached values that depend on it, as well as an object for the MH sampling step, are created internally. These objects are then fully set up in the ‘setup()’ method, where observed quantities are provided. The sampling step is extended with an MH step for the parameter \(\delta\). By using an object-oriented system for MH samplers, this can be achieved with very little code.
The next spatial inheritors accommodate the SEM and SAR model. The respective classes are ‘SpatialEM’ and ‘SpatialAR’. Both have ‘initialize()’ methods to set priors and options and to create internal functions for updating parameters and dependent quantities, as well as the MH sampler. The ‘setup()’ methods are used to provide connectivities and set up the filtered quantities and the MH sampler. Note that updating the spatial parameter \(\lambda\) affects other steps via the filtered dependent and/or explanatory variables. In my implementation, this works by redirecting access methods to the filtered quantities. Computation of the (log) determinant in the marginal likelihood of \(\lambda\) is facilitated by (a) a spectral decomposition of the connectivity, or (b) a grid-based approximation using LU decompositions. Additional options for panel settings, where connectivities are repeated in a block-diagonal structure, are available. This aspect of the sampler can be extended in several ways (see Bivand et al. 2013).
These core building blocks form a suitable foundation for estimating standard spatial econometric models. The linear classes and spatial inheritors can be used in a straightforward way. The composite SDM or SDEM can be created by adjusting the parent class of ‘SpatialAR’ or ‘SpatialEM’ to ‘SpatialLX’. At the moment, layered adjustment of filtered quantities, as would be needed for the SAC model, is not supported. To simplify the link between this object-oriented core and the accessible, idiomatic user interface, I use an intermediate programming interface. This interface provides constructor functions that allow the dynamic creation of classes, allowing for composite models or different base classes. Additional functions help with the instantiation, perform robustness checks, and provide sensible default values. Finally, helper functions for obtaining and working with a number of samples from the posterior facilitate Bayesian inference.
4.3 Usage and the interface
Useful software relies on good accessibility, i.e. a good user interface that simplifies interaction with the underlying code. The bsreg package achieves this with an R-idiomatic user interface on top of the programming interface and object-oriented structure. This top-level interface focuses on facilitating posterior inference and uses the established formula system and basic S3 methods. The outputs come in a standard format that allows users to stick with familiar procedures for analysis. There are dedicated functions in the style of ‘lm()’ that wrap spatial models of interest, which perform the instantiation, setup, and sampling in one swoop. In this way, bsreg enables near effortless Bayesian inference for spatial econometric models.
The interface builds on the ‘bm()’ function for Bayesian models and uses wrappers for specific specifications, like the classical linear, SLX, SLX(\(\delta\)), and SAR models, as well as the SEM, SDM, and SDEM. These use the customary R formula interface, similarly to ‘lm()’ or the spatialreg package.
A linear model with default settings can be estimated as follows.
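Such a call might look as follows (a sketch: the wrapper name ‘blm()’ and the column names of the ‘cigarettes’ dataset follow the package's naming scheme, but should be checked against its documentation):

```r
library(bsreg)  # provides bm() and the model-specific wrappers

# Bayesian linear model with default settings; the column names used in
# the formula are illustrative stand-ins for the packaged dataset.
x <- blm(log(sales) ~ log(price) + log(income), data = cigarettes,
  n_save = 1000L, n_burn = 500L, verbose = FALSE)
```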
The first argument is a formula with a symbolic description of the model. The variables are obtained from the environment or are provided via the ‘data’ argument; in this case, we use the packaged ‘cigarettes’ dataset by Baltagi and Li (2004) (see the next section for more information). Other arguments include the number of posterior draws to save (‘n_save’) and discard (‘n_burn’), and ‘verbose’ to control console output. The argument ‘options’ and related helper functions—most importantly ‘set_options()’—can be used to adjust further settings, including model priors.
The default values of ‘set_options()’ amount to relatively uninformative priors and generally applicable settings. However, priors and other settings are best when provided specifically for a problem. Below, we adjust the prior settings of the linear model above to use a conjugate prior with more informative parameters. For this, we use the prior-specific argument ‘NG’ and helper function ‘set_NG()’.
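A sketch of such an adjustment (the exact parameter names of ‘set_NG()’ are assumptions here and should be checked against the package documentation):

```r
library(bsreg)

# Conjugate Normal-Gamma prior with more informative parameters; the
# argument names (mu, precision, shape, rate) are illustrative -- see
# ?set_NG for the actual signature.
x <- blm(log(sales) ~ log(price) + log(income), data = cigarettes,
  options = set_options(NG = set_NG(mu = 0, precision = 1,
    shape = 2, rate = 1)))
```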
Spatial models additionally rely on the ‘W’ argument for the connectivity, and ones with spatially lagged explanatories have an optional argument to specify variables to lag. With a suitable connectivity matrix \({\mathbf {W}}\), e.g. based on the distance between observations, we can estimate spatial models, like the SDM. See the appendix for how to construct the connectivity matrix below from the packaged ‘us_states’ dataset. Below, we estimate an SDM, which we adjust to use a Uniform prior (equivalent to the \(\mathrm {Beta}(1, 1)\) prior) on the spatial autoregressive parameter \(\lambda\) using the argument ‘SAR’ and helper function ‘set_SAR()’. In addition, we increase the length of the burn-in and the number of saved posterior draws.
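A sketch of such a call (the wrapper name ‘bsdm()’ and the parameter names of ‘set_SAR()’ are assumptions based on the naming scheme described above):

```r
library(bsreg)

# W is an N x N connectivity matrix, e.g. inverse-distance based (see the
# appendix for its construction from the packaged data).
x <- bsdm(log(sales) ~ log(price) + log(income), data = cigarettes, W = W,
  n_save = 5000L, n_burn = 2000L,
  # Beta(1, 1), i.e. Uniform, prior on lambda; argument names are illustrative
  options = set_options(SAR = set_SAR(lambda_a = 1, lambda_b = 1)))
```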
From this call, we get an S3 object that supports standard ‘print()’, ‘plot()’, and ‘summary()’ methods. These can be used for quick analysis, but, more importantly, the object can be used directly. It essentially consists of two elements—a matrix with saved posterior draws and the underlying R6 model object that was used to obtain these draws. We can use the matrix in custom or third-party procedures; interfacing to packages like coda (Plummer et al. 2006) is straightforward. The model object preserves the state of the sampler, making it easy to obtain additional posterior draws. This facilitates interactive work and is done by simply using our object in a new call to the estimation function or the generic ‘bm()’ function.
After estimation, we would typically check convergence of the sampler. For this, we can call ‘plot(x)’, which offers basic trace plots. Alternatively, the coda interface provides advanced diagnostic statistics and plots and is accessible via an ‘as.mcmc(x)’ method. If we are satisfied with our samples, we can turn to analysis. By calling ‘print(x)’ we get posterior parameter means, similarly to ‘lm()’. More detailed summary measures, such as various types of credible intervals, can be computed directly (using the matrix slot ‘x[[1]]’), or e.g. with coda (‘coda::HPDinterval()’). Posteriors can also be visualised using base functionality, or the output can be adapted to specialised packages like ggplot2 (Wickham 2016) and ggdist (Kay 2021). If, instead, we find that the number of draws is not sufficient (e.g. via ‘coda::effectiveSize()’), we can get additional samples by simply calling ‘bm(x)’.
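The full post-estimation workflow described above might look as follows (a sketch, assuming a fitted bsreg object ‘x’):

```r
# Post-estimation workflow for a fitted bsreg object 'x'
plot(x)                       # basic trace plots for a convergence check

library(coda)                 # advanced MCMC diagnostics
mc <- as.mcmc(x)              # hand the samples to coda
effectiveSize(mc)             # effective number of draws per parameter
HPDinterval(mc, prob = 0.99)  # highest posterior density intervals

print(x)                      # posterior parameter means, similar to lm()
draws <- x[[1]]               # matrix of saved draws for custom analysis

x <- bm(x)                    # resume the sampler for additional draws
```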
As can be seen, the bsreg package features an easy-to-use interface for applied use. It builds on an extensible and flexible object-oriented system and programming interface. The package, more extensive information on its use, as well as up-to-date documentation is available from the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/package=bsreg or from the repository at https://github.com/nk027/bsreg.
5 Cigarette demand model
In this section, I use bsreg to estimate various specifications of the demand model for cigarettes in the continental United States (US) by Baltagi and Li (2004). With this application, I follow Halleck Vega and Elhorst (2015) and focus on specifics of the Bayesian approach. The cigarette demand model is

$$\begin{aligned} \ln C_{it} = \alpha + \beta _{price} \ln P_{it} + \beta _{inc} \ln I_{it} + \mu _i + \phi _t + \varepsilon _{it}, \end{aligned}$$
(13)
where subscripts are used to indicate a subset of US states and Washington, D.C. (\(i = 1, \dots , 46\)) and time periods (\(t = 1, \dots , 30\)). The dependent variable C is real per capita cigarette sales, explanatories are the average price of cigarettes (P) and the real per capita disposable income (I). The parameters \(\mu _i\) and \(\phi _t\) are region- and time-fixed effects; errors \(\varepsilon _{it}\) are assumed to follow a multivariate Normal distribution with mean zero and spherical variance-covariance \({\mathbf {I}} \sigma ^2\).
I consider the linear model (LM) in Eq. (13) and the five spatial econometric models currently supported by bsreg. For the spatial lags, I follow Halleck Vega and Elhorst (2015), who use connectivity matrices based on (1) binary contiguity (row-stochastic), and (2) inverse-distance decay (scaled with the maximum eigenvalue). Connectivities are the same for every year. I estimate these models using bsreg as a demonstration of the package and Bayesian methods; for more information on the data, actual application, and additional background see Baltagi and Li (2004), Kelejian and Piras (2014), and Halleck Vega and Elhorst (2015).
Estimation results for the linear model and spatial models using contiguity-based connectivity are presented in Table 1. The posterior point estimates are close to the frequentist ones obtained by Halleck Vega and Elhorst (2015), as can be seen in Fig. 2. Spatial dependence plays an important role—spatial lags are significant in all cases. Estimates of \(\beta _{price}\) and \(\beta _{inc}\) are relatively stable throughout, but omit important details related to spillover effects. In order to adequately compare and interpret the models, we need to consider partial effects of the variables, or summary statistics thereof. As discussed in Sect. 2, the partial effects of models with spatial lags are (a) not directly given by the coefficients, and (b) carry additional locational information (see LeSage and Pace 2009). It is helpful to work with (average) total effects, which can be further divided into direct (within a location itself) and indirect (impacting neighbouring locations) effects. The average total effects of price and income are visualised in Fig. 2. In the plot, we can clearly discern the impact of omitting spatial spillover effects.
Following the Bayesian paradigm, summary statistics of parameters are easy to compute, and uncertainty measurements are straightforward to obtain. To calculate, e.g., the average total effects, we simply compute this effect for every draw from the posterior, or a subset thereof. We obtain a posterior density of the effect in question that reflects all available information on it and uncertainty around it. We can compute the mean, standard error, or credible intervals of this density. The workings of Bayesian uncertainty can be seen in Fig. 2. The linear model (orange, left) entails the assumption of no spatial dependence, omitting all uncertainty that concerns spatial parameters and yielding a narrow posterior. The SLX model (yellow, centre-right) introduces local spillover effects (with given connectivity), which are reflected in the location and scale of the posterior. The SDM (blue, right) additionally considers global spillover effects that act as a filter for the dependent variable. For it, and the related SAR model, we see clear benefits of obtaining the full posterior—total effects are non-Gaussian (also see the appendix for quantile–quantile plots). Uncertainty around these nonlinear models is characterised by heavy tails—they are not adequately summarised by their means and standard errors.
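The draw-by-draw computation can be sketched in a few lines of base R; here, simulated draws and a toy connectivity stand in for actual posterior output:

```r
set.seed(1)
N <- 46
# Toy row-stochastic contiguity-style connectivity for illustration
W <- matrix(rbinom(N * N, 1, 0.1), N, N); diag(W) <- 0
W <- W / pmax(rowSums(W), 1)

# Simulated posterior draws of (beta, theta, lambda) for one variable of an SDM
draws <- data.frame(beta = rnorm(500, -1, 0.1), theta = rnorm(500, 0.2, 0.1),
  lambda = runif(500, 0.1, 0.5))

# Average total effect per draw: N^-1 1' (I - lambda W)^-1 (I beta + W theta) 1
ate <- vapply(seq_len(nrow(draws)), function(i) {
  S <- solve(diag(N) - draws$lambda[i] * W,
    diag(N) * draws$beta[i] + W * draws$theta[i])
  sum(S) / N
}, numeric(1))

# Posterior summaries of the effect follow directly from its draws
mean(ate); quantile(ate, c(0.005, 0.995))
```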
One layer of uncertainty that we have ignored so far concerns the structure of spatial connectivity. By fixing \({\mathbf {W}}\) to a contiguity-based structure, we impose the type of connectivity and implicitly assume perfect knowledge of it. In practice, researchers often check the robustness of results against different connectivity structures. However, this is usually done in an ad hoc manner and—most of the time—only a few, related types of connectivity are considered (Plümper and Neumayer 2010; Neumayer and Plümper 2016), which are known to have limited impact on results (see LeSage and Pace 2014). More formal approaches to uncertainty around spatial connectivity structure are Bayesian model averaging (see e.g. Debarsy and LeSage 2020; Krisztin and Piribauer 2021) or the parameterisation of connectivity, e.g. in the SLX(\(\delta\)) model in Eq. (11). This model treats the connectivity structure as a function with parameters to estimate. One useful functional form is that of inverse-distance decay, with one parameter \(\delta\) controlling the speed of decay (see Sect. 3). With this in mind, I extend estimation results of the SLX model in Table 1 to four variants with different inverse-distance decay connectivities in Table 2.
The inverse-distance decay connectivity of the SLX model in Table 2 mirrors the results of Halleck Vega and Elhorst (2015) and paints a different picture than the specification using contiguity-based connectivity. Differences in local spillovers are pronounced—indirect effects of the price variable experience a sign switch, while indirect effects from the income variable increase considerably in size. As can be seen in Fig. 3, these differences have a large impact on the total effects of these variables. Total effects of prices are lower, with a lower direct effect that is additionally offset by positive local spillovers. With inverse-distance-based connectivity, the total effects of income are larger than before. To the keen-eyed reader, and considering the findings of Halleck Vega and Elhorst (2015), this may seem paradoxical—estimates of \(\beta _{inc}\) are comparable while \(\theta _{inc}\) is considerably more negative.
However, the large values of \(\theta\) in the inverse-distance decay specification can be misleading due to the scaling applied. Partial derivatives of the SLX model are not always as trivial as they may seem. The average direct effect of an explanatory variable k is given by \(\beta _k\). The average indirect effect of a unit i on another unit j, however, is generally more complicated to obtain. It is given by \(N^{-1} \sum _i \sum _j w_{i j} \theta _k\). For a row-stochastic connectivity matrix \({\mathbf {W}}\), as used for Table 1, this value is equivalent to \(\theta _k\). For connectivity that is scaled in some other way, e.g. using the maximum eigenvalue, this is not the case and the average indirect effects need to be computed explicitly. In Table 2, I additionally report \(s_\theta\), the appropriate scaling factors for obtaining average indirect effects from the reported estimates of \(\theta\). For the SLX(3) specification, estimates are inflated by a factor of seven. Scaled estimates that can be compared to the contiguity specification are \(\theta _{price} = 0.036\) and \(\theta _{inc} = -0.115\), suggesting less pronounced differences between the specifications.
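The scaling factor \(s_\theta\) can be computed directly from the connectivity matrix. The following base-R sketch, using a toy inverse-distance matrix, contrasts the row-stochastic case (where the factor is one) with maximum-eigenvalue scaling (where it is not):

```r
set.seed(1)
N <- 46
coords <- cbind(runif(N, -120, -70), runif(N, 25, 50))  # toy state centroids
D <- as.matrix(dist(coords))
W <- 1 / D^3; diag(W) <- 0  # inverse-distance decay with delta = 3

# Row-stochastic scaling: rows sum to one, so the average indirect effect
# N^-1 sum_i sum_j w_ij theta_k equals theta_k itself
W_rs <- W / rowSums(W)
sum(W_rs) / N  # equals one

# Maximum-eigenvalue scaling: the scaling factor s_theta differs from one,
# so reported theta must be rescaled before comparing specifications
W_ev <- W / max(abs(eigen(W, only.values = TRUE)$values))
s_theta <- sum(W_ev) / N  # average indirect effect is s_theta * theta_k
```

For the actual distance matrix used in the paper, this factor works out to roughly one seventh, which is where the reported inflation of the SLX(3) estimates comes from.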
With Table 2 we have five different SLX specifications with different implications—which one to use? A standard approach would be model selection based on information criteria, like the Schwarz information criterion (SIC). For the four classical SLX models we arrive at SIC values of − 2758 (contiguity-based), − 3004 (\(\delta = 2\)), − 3047 (\(\delta = 3\)), and − 3037 (\(\delta = 4\)). These values heavily favour the inverse-distance decay specifications, and among them the SLX(3) specification. In fact, Bayesian model averaging using the SIC approximation for the marginal likelihood would place more than 99% of the weight on the SLX(3) variant and an essentially zero weight (less than \(1\mathrm {e}{-60}\)) on the contiguity variant. However, the exact parameterisation of the three inverse-distance decay specifications compared is essentially arbitrary. The SLX(\(\delta\)) model allows us to accurately reflect uncertainty around this parameterisation. For this flexibility, we require a prior and need to closely monitor samples of the parameter for convergence and mixing behaviour.
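The model weights implied by the SIC approximation can be verified in a few lines; with the log marginal likelihood approximated by \(-\mathrm{SIC}/2\), the weights follow from normalised exponentials:

```r
# SIC values from the text: contiguity-based, then delta = 2, 3, 4
sic <- c(-2758, -3004, -3047, -3037)

# SIC approximation to the log marginal likelihood is -SIC / 2; subtract
# the maximum before exponentiating for numerical stability
lml <- -sic / 2
w <- exp(lml - max(lml))
w <- w / sum(w)

round(w, 4)  # the weight on the SLX(3) variant exceeds 0.99
w[1]         # the contiguity variant receives less than 1e-60
```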
In the SLX(\(\delta\)) model, I use an inverse-gamma prior with \(d_0 = D_0 = 2\) (see Eq. (12)), which has support \((0, \infty )\), a mean of 2, and places 99% of the prior weight on the interval (0.27, 19.22). See the appendix for trace and density plots indicating good mixing and proper convergence of the sampler. The posterior indicates that connectivity decays rather quickly, at around \(\delta = 3\). This means that very close neighbours hold a relatively large weight (see the appendix for a visualisation of connectivity strengths). Compared to the contiguity specification, most of the connectivity is between a few states, while connectivity at medium and long distances remains relevant. This implies that the SLX(2) specification picks up effects at larger distances, which may not be relevant to the issue at hand. Meanwhile, the SLX(4) specification is focused on extremely close neighbours, potentially omitting the effects of midrange neighbours. Regarding results, we find that the SLX(2) model overestimates average indirect effects and the SLX(4) underestimates them when compared to the SLX(3) model of choice.
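The stated prior quantities can be checked in base R, using the fact that the reciprocal of an inverse-gamma variable follows a gamma distribution:

```r
d0 <- D0 <- 2  # inverse-gamma prior parameters

# If delta ~ IG(d0, D0), then 1 / delta ~ Gamma(shape = d0, rate = D0),
# so quantiles of delta follow from reciprocal gamma quantiles
ig_mean <- D0 / (d0 - 1)                                    # prior mean of 2
ig_q <- 1 / qgamma(c(0.995, 0.005), shape = d0, rate = D0)  # 99% interval

round(ig_q, 2)  # roughly matching the interval stated in the text
```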
Focusing on the preferred fixed SLX(3) and flexible SLX(\(\delta\)) specifications, we find the aforementioned uncertainty around the connectivity specification at work in Table 2. The former pinpoints the maximum likelihood value of \(\delta\), while the latter considers all possible values of \(\delta\), reflecting the associated uncertainty in all posterior densities. The 99% credible interval of \(\delta\) covers (2.54, 3.74); notably, the previously considered alternatives of \(\delta \in \{2, 4\}\) fall well outside this range. This uncertainty surrounding \(\delta\) is reflected in other posterior densities, such as the indirect price effect parameter \(\theta _{price}\). With the (unscaled) 99% credible interval covering [0.022, 0.507], the effect could be interpreted as barely significant. Meanwhile, the SLX(3) specification that remains ignorant of uncertainty around \(\delta\) yields a much narrower (unscaled) 99% interval of [0.116, 0.402]. Moreover, posterior densities in Fig. 3 again imply that the posterior effects of the flexible SLX(\(\delta\)) model are non-Gaussian. The Bayesian paradigm offers a suitable tool for dealing with these inherent uncertainties.
6 Concluding remarks
In this paper, I presented a software architecture for Bayesian modelling in the spatial econometric domain. I implemented this object-oriented system with a layered interface for users and programmers in the bsreg package. The result is flexible, easy-to-use software that can readily be extended. For the purpose of demonstration, I used the package to reproduce an analysis of cigarette demand in the United States. I demonstrated the flexibility of Bayesian methods and their suitability for acknowledging and accommodating uncertainty in nonlinear spatial econometric models. I documented continued issues with the interpretation of spatial models—highlighting the need for tried and tested software.
Departing from this paper, much future work for Bayesian spatial software and Bayesian spatial methods remains to be done. Regarding software, larger-scale efforts are required—to extend bsreg or replace it. Potential features include (1) a more optimised object-oriented core system, (2) extensions to more models and settings, such as large panel settings, (3) more focus on exposing the programming interface and tighter integration into the existing ecosystem, and (4) a more comprehensive and well-connected user interface, with more methods for analysis and links to existing packages. Alternatively, better integration of spatial econometric methods in existing probabilistic languages could be pursued. Method-wise, the uncertainties of spatial econometric models need to be addressed—both in terms of more flexible models and in terms of communicating and presenting results. The summary measures of average direct and indirect effects have raised the bar on spatial econometrics—perhaps posterior densities can help find its range.
Notes
R's native object-oriented systems are the S3 and S4 systems (R Core Team 2021), which are rather language-specific, and the reference class system, also referred to as RC (R Core Team 2021). The R6 and RC systems are more standard, i.e. comparable to other languages such as C++, making them better suited to this application; between them, R6 is faster and more lightweight (Chang 2021).
References
Acemoglu D, Naidu S, Restrepo P, Robinson JA (2019) Democracy does cause growth. J Polit Econ 127(1):47–100. https://doi.org/10.1086/700936
Almeida A, Loy A, Hofmann H (2018) ggplot2 compatible quantile–quantile plots in R. R J 10(2):248–261. https://doi.org/10.32614/RJ201805
Anselin L (2013) Spatial econometrics: methods and models, vol 4. Springer Science and Business Media, Berlin. https://doi.org/10.1007/978-94-015-7799-1 (ISBN 978-94-015-7799-1)
Arima EY, Richards P, Walker R, Caldas MM (2011) Statistical confirmation of indirect land use change in the Brazilian Amazon. Environ Res Lett 6(2):024010. https://doi.org/10.1088/17489326/6/2/024010
Baltagi BH, Li D (2004) Prediction in the panel data model with spatial correlation. Advances in spatial econometrics. Springer, Berlin, pp 283–295. https://doi.org/10.1007/978-3-662-05617-2_13
Barry RP, Pace RK (1999) Monte Carlo estimates of the log determinant of large sparse matrices. Linear Algebra Appl 289(1–3):41–54. https://doi.org/10.1016/S0024-3795(97)10009-X
Bédard M (2008) Optimal acceptance rates for Metropolis algorithms: moving beyond 0.234. Stoch Process Appl 118(12):2198–2222. https://doi.org/10.1016/j.spa.2007.12.005
Behrens K, Ertur C, Koch W (2012) ‘Dual Gravity’: using spatial econometrics to control for multilateral resistance. J Appl Economet 27(5):773–794. https://doi.org/10.1002/jae.1231
Bivand R, Piras G (2015) Comparing implementations of estimation methods for spatial econometrics. J Stat Softw 63(18):1–36. https://doi.org/10.18637/jss.v063.i18 (ISSN 1548-7660)
Bivand R, Gómez-Rubio V, Rue H (2015) Spatial data analysis with R-INLA with some extensions. J Stat Softw 63(20):1–31. https://doi.org/10.18637/jss.v063.i20 (ISSN 1548-7660)
Bivand RS, Pebesma E, Gómez-Rubio V (2013) Applied spatial data analysis with R. Springer, New York. https://doi.org/10.1007/978-1-4614-7618-4
Carpenter B, Gelman A, Hoffman M, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, Riddell A (2017) Stan: a probabilistic programming language. J Stat Softw 76(1):1–32. https://doi.org/10.18637/jss.v076.i01
Chakir R, Le Gallo J (2013) Predicting land use allocation in France: a spatial panel data analysis. Ecol Econ 92:114–125. https://doi.org/10.1016/j.ecolecon.2012.04.009
Chang W (2021) R6: encapsulated classes with reference semantics. https://CRAN.R-project.org/package=R6. R package version 2.5.1
Cornwall GJ, Parent O (2017) Embracing heterogeneity: the spatial autoregressive mixture model. Reg Sci Urban Econ 64:148–161. https://doi.org/10.1016/j.regsciurbeco.2017.03.004
Crespo Cuaresma J, Feldkircher M (2013) Spatial filtering, model uncertainty and the speed of income convergence in Europe. J Appl Economet 28(4):720–741. https://doi.org/10.1002/jae.2277
Crespo Cuaresma J, Doppelhofer G, Feldkircher M (2014) The determinants of economic growth in European regions. Region Stud 48(1):44–67. https://doi.org/10.1080/00343404.2012.678824
Debarsy N, LeSage J (2018) Flexible dependence modeling using convex combinations of different types of connectivity structures. Reg Sci Urban Econ 69:48–68. https://doi.org/10.1016/j.regsciurbeco.2018.01.001
Debarsy N, LeSage JP (2020) Bayesian model averaging for spatial autoregressive models based on convex combinations of different types of connectivity matrices. J Bus Econ Stat. https://doi.org/10.1080/07350015.2020.1840993
Dong G, Harris R (2015) Spatial autoregressive models for geographically hierarchical data structures. Geogr Anal 47(2):173–191. https://doi.org/10.1111/gean.12049
Dong G, Harris R, Jones K, Yu J (2015) Multilevel modelling with spatial interaction effects with application to an emerging land market in Beijing, China. PLOS One 10(6):e0130761. https://doi.org/10.1371/journal.pone.0130761
Dong G, Harris R, Mimis A (2016) HSAR: an R package for integrated spatial econometric and multilevel modelling. GIS Research UK 2016
Elhorst JP (2010) Applied spatial econometrics: raising the bar. Spatial Econ Anal 5(1):9–28. https://doi.org/10.1080/17421770903541772
Gamerman D, Lopes HF (2006) Markov chain Monte Carlo: stochastic simulation for Bayesian inference. CRC Press, Boca Raton. https://doi.org/10.1201/9781482296426
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis. CRC Press, Boca Raton. https://doi.org/10.1201/b16018 (ISBN 9781439840955)
GómezRubio V, Bivand RS, Rue H (2021) Estimating spatial econometrics models with integrated nested Laplace approximation. Mathematics 9(17):2044. https://doi.org/10.3390/math9172044
Halleck Vega S, Elhorst JP (2015) The SLX model. J Reg Sci 55(3):339–363. https://doi.org/10.1111/jors.12188
Han X, Lee LF (2016) Bayesian analysis of spatial panel autoregressive models with timevarying endogenous spatial weight matrices, common factors, and random coefficients. J Bus Econ Stat 34(4):642–660. https://doi.org/10.1080/07350015.2016.1167058
Kay M (2021) ggdist: visualizations of distributions and uncertainty. https://cran.r-project.org/package=ggdist. R package version 3.0.0
Kelejian HH, Piras G (2014) Estimation of spatial models with endogenous weighting matrices, and an application to a demand model for cigarettes. Reg Sci Urban Econ 46:140–149. https://doi.org/10.1016/j.regsciurbeco.2014.03.001 (ISSN 0166-0462)
Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Financ Econ 17(1):99–121. https://doi.org/10.1023/A:1007707430416
Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. Int Econ Rev 40(2):509–533. https://doi.org/10.1111/1468-2354.00027
Krisztin T, Fischer MM (2015) The gravity model for international trade: specification and estimation issues. Spat Econ Anal 10(4):451–470. https://doi.org/10.1080/17421772.2015.1076575
Krisztin T, Piribauer P (2021) A Bayesian approach for estimation of weight matrices in spatial autoregressive models (under review)
Krisztin T, Piribauer P, Wögerer M (2021) A spatial multinomial logit model for analysing urban expansion. Spatial Econ Anal. https://doi.org/10.1080/17421772.2021.1933579
Kuschnig N (2021) bsreg: Bayesian spatial regression models. https://CRAN.R-project.org/package=bsreg. R package version 0.0.1
Kuschnig N, Cuaresma JC, Krisztin T, Giljum S (2021) Spillover effects in agriculture drive deforestation in Mato Grosso, Brazil. Sci Rep 11(1):1–9. https://doi.org/10.1038/s41598-021-00861-y
Lacombe DJ, McIntyre SG (2016) Local and global spatial effects in hierarchical models. Appl Econ Lett 23(16):1168–1172. https://doi.org/10.1080/13504851.2016.1142645
Lancaster T (2004) An introduction to modern Bayesian econometrics. Blackwell, Oxford (ISBN 9781405117197)
Lee LF (2004) Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72(6):1899–1925. https://doi.org/10.1111/j.1468-0262.2004.00558.x
LeSage JP (1999) Applied econometrics using MATLAB. https://spatial-econometrics.com/
LeSage JP (2000) Bayesian estimation of limited dependent variable spatial autoregressive models. Geogr Anal 32(1):19–35. https://doi.org/10.1111/j.1538-4632.2000.tb00413.x
LeSage JP, Fischer MM (2008) Spatial growth regressions: model specification, estimation and interpretation. Spat Econ Anal 3(3):275–304. https://doi.org/10.1080/17421770802353758
LeSage J, Pace RK (2009) Introduction to spatial econometrics. Chapman and Hall/CRC, London. https://doi.org/10.1201/9781420064254
LeSage JP, Pace RK (2014) The biggest myth in spatial econometrics. Econometrics 2(4):217–249. https://doi.org/10.3390/econometrics2040217
LeSage JP, Parent O (2007) Bayesian model averaging for spatial econometric models. Geogr Anal 39(3):241–267. https://doi.org/10.1111/j.1538-4632.2007.00703.x
Lindgren F, Rue H (2015) Bayesian spatial modelling with R-INLA. J Stat Softw 63(19):1–25. https://doi.org/10.18637/jss.v063.i19. https://www.jstatsoft.org/v063/i19
Morris M, Wheeler-Martin K, Simpson D, Mooney SJ, Gelman A, DiMaggio C (2019) Bayesian hierarchical spatial models: implementing the Besag York Mollié model in Stan. Spatial Spatiotemporal Epidemiol 31:100301. https://doi.org/10.1016/j.sste.2019.100301 (ISSN 1877-5845)
Neumayer E, Plümper T (2016) W. Polit Sci Res Methods 4(1):175–193. https://doi.org/10.1017/psrm.2014.40
Pace RK, Barry R, Slawson VC, Sirmans CF (2004) Simultaneous spatial and functional form transformations. Advances in spatial econometrics. Springer, Berlin, pp 197–224. https://doi.org/10.1007/978-3-662-05617-2
Panzera D, Postiglione P (2021) The impact of regional inequality on economic growth: a spatial econometric approach. Region Stud. https://doi.org/10.1080/00343404.2021.1910228
Pfarrhofer M, Piribauer P (2019) Flexible shrinkage in highdimensional Bayesian spatial autoregressive models. Spatial Stat 29:109–128. https://doi.org/10.1016/j.spasta.2018.10.004
Plummer M (2003) JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. In: Proceedings of the 3rd international workshop on distributed statistical computing, 2003. http://www.ci.tuwien.ac.at/Conferences/DSC2003/
Plümper T, Neumayer E (2010) Model specification in the analysis of spatial dependence. Eur J Polit Res 49(3):418–442. https://doi.org/10.1111/j.1475-6765.2009.01900.x
Plummer M, Best N, Cowles K, Vines K (2006) CODA: convergence diagnosis and output analysis for MCMC. R News 6(1):7–11. https://journal.r-project.org/archive/ (ISSN 1609-3631)
R Core Team (2021) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Sherlock C, Roberts G (2009) Optimal scaling of the random walk Metropolis on elliptically symmetric unimodal targets. Bernoulli 15(3):774–798. https://doi.org/10.3150/08-BEJ176
Smirnov OA, Anselin LE (2009) An \({\cal{O}}(n)\) parallel method of computing the log-Jacobian of the variable transformation for models with spatial interaction on a lattice. Comput Stat Data Anal 53(8):2980–2988. https://doi.org/10.1016/j.csda.2008.10.010
The MathWorks Inc (2021) MATLAB—the language of technical computing. http://www.mathworks.com/products/matlab/
Wickham H (2016) ggplot2: elegant graphics for data analysis. Springer, New York. https://doi.org/10.1007/978-3-319-24277-4
Wolf LJ, Anselin L, ArribasBel D (2018) Stochastic efficiency of Bayesian Markov chain Monte Carlo in spatial econometric models: an empirical comparison of exact sampling methods. Geogr Anal 50(1):97–119. https://doi.org/10.1111/gean.12135
Yang Q, Geng Y, Dong H, Zhang J, Xiaoman Yu, Sun L, Xiaorong L, Chen Y (2017) Effect of environmental regulations on China’s graphite export. J Clean Prod 161:327–334. https://doi.org/10.1016/j.jclepro.2017.05.131
Acknowledgements
I thank Jesús Crespo Cuaresma, Lukas Vashold, Gregor Zens, and Bettina Grün for comments on earlier drafts of this paper. My thanks also go out to the R, spatial, and spatial econometric communities for their attitudes towards open science, and the provision of free and open source software.
Funding
Open access funding provided by Vienna University of Economics and Business (WU).
A: Appendix
1.1 Reproducibility
1.1.1 Data
The datasets used in this paper were kindly made available online by Paul Elhorst on his website at https://spatialpanels.com/ and can be downloaded there directly. There is a small error in the state coordinates in the original files; the longitude of Utah was coded as 11.9 instead of approximately 111.7. The impact of this error on overall results is somewhat limited. Distances used are measured in arc degrees—one degree is about 111 kilometres. The results reproduced for this paper can be found in Tables 2, 3, and 4 of Halleck Vega and Elhorst (2015).
1.1.2 Code
All code used for this paper is made available online under the GPL-3 free software license at https://github.com/nk027/bsreg. Results were produced using version 0.0.2 of bsreg. Visualisations were created using the ggplot2 (Wickham 2016), ggdist (Kay 2021), and qqplotr (Almeida et al. 2018) R packages.
1.1.3 Connectivity matrix
The bsreg package also includes datasets on the boundaries of US states, as well as the original dataset by Baltagi and Li (2004) extended with centroid coordinates that allow reproducing some of the results. Using this dataset, the inverse-distance decay matrix used in Sect. 4 can be constructed using the following code.
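A sketch of this construction (the column names of the packaged dataset are assumptions; see the package documentation for the actual code):

```r
library(bsreg)

# Pairwise distances in arc degrees, as in the original data; the
# 'longitude' and 'latitude' column names are illustrative.
coords <- unique(cigarettes[, c("longitude", "latitude")])
D <- as.matrix(dist(coords))

W <- 1 / D^3  # inverse-distance decay with delta = 3
diag(W) <- 0
W <- W / max(abs(eigen(W, only.values = TRUE)$values))  # max-eigenvalue scaling
```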
1.2 Additional figures
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Kuschnig, N. Bayesian spatial econometrics: a software architecture. J Spat Econometrics 3, 6 (2022). https://doi.org/10.1007/s43071-022-00023-w