1 Introduction

Recent literature has dealt with the extension of generalized autoregressive conditional heteroscedasticity (GARCH) models to spatial and spatiotemporal processes (e.g., Otto et al. 2018, 2021; Sato and Matsuda 2020). Whereas the classical ARCH model is defined as a process over time, these random processes have a multidimensional support. Thus, they allow for spatially dependent second-order moments, while the observations are uncorrelated and the mean is constant in space (see Otto et al. 2021). For all geo-referenced processes, it is important to allow for instantaneous spatial interactions, as “near things are more related than distant things” (Tobler’s first law of geography, Tobler 1970). That is, nearby observations are more similar than observations separated by larger distances. With regard to autoregressive dependencies, the focus has mostly been on the mean process rather than on spatial conditional heteroscedasticity. Thus, Otto et al. (2018) first suggested a purely spatial ARCH model. Furthermore, Sato and Matsuda (2017, 2020) introduced a random process incorporating elements of GARCH and exponential GARCH (E-GARCH) processes, which is, however, neither a GARCH nor an E-GARCH process, not even in a one-dimensional space, where the model should collapse to a classical time-series GARCH model. Their model can rather be considered a symmetric spatial log-GARCH process. Moreover, Otto et al. (2018) focussed only on spatial ARCH processes without considering the influences from the realized conditional variance at neighbouring locations. Direct extensions of GARCH and E-GARCH processes to spatial settings do not exist in current research.

Thus, we introduce a completely novel generalized spatial ARCH model (spatial GARCH or spGARCH) in this paper. Because a general definition of this model is used, time-series GARCH models (Bollerslev 1986), the previously introduced spatial ARCH (Otto et al. 2018) and the symmetric log-GARCH (Sato and Matsuda 2020) are nested. This definition also allows us to define a spatial exponential GARCH model, which will be the subject of a future paper. Moreover, other GARCH-type models, like threshold or multivariate GARCH models, can easily be constructed. This unified spatial GARCH process is a completely new class of models in spatial statistics/econometrics, for which we derive consistent estimators based on a non-linear least-squares approach. In addition, all models are computationally implemented in one library, the R package spGARCH (version \(> 2.0\)).

From a practical perspective, this unified spatial GARCH model can be used to model spill-over effects in the conditional variances across the spatial units. That means that an increasing variance in a certain region of the considered space would lead to an increase or decrease in the variance of the adjacent regions, depending on the direction (sign) of the spatial dependence. Compared to previous spatiotemporal GARCH models, these spill-overs are instantaneous. That is, there is no time lag needed. All previously proposed spatiotemporal GARCH models include the spatial autoregressive dependence in the first temporal lag (i.e., spatial spill-overs are temporally delayed), so that these models can be seen as special cases of multivariate time-series GARCH models (e.g., Hølleland and Karlsen 2020; Borovkova and Lopuhaa 2012; Caporin and Paruolo 2015, or Billio et al. 2021 for networks). Contrary to these spatiotemporal models, the proposed spatial GARCH model can also be applied to purely spatial data, for instance for modelling local climate risks, such as fluctuations in temperature and precipitation, or financial risks in spatially constrained markets, such as real estate or labour. Furthermore, spatial GARCH-type models can be used as error models for any linear or non-linear spatial regression model to account for local model uncertainties (i.e., areas in which the considered models perform worse than in others). These model uncertainties can be considered to be a kind of local risk. Instead of modelling such autoregressive dependence as a GARCH process, stochastic volatility models could also be used. Taşpınar et al. (2021) considered a stochastic volatility approach for spatial settings and showed its applicability for U.S. house prices.

The remainder of this paper is structured as follows. In the next section, we introduce the generalized framework of spatial and spatiotemporal autoregressive conditional heteroscedasticity models and discuss two examples nested within this approach, more precisely, the novel spatial GARCH (as an equivalent to the time-series GARCH models) and the spatial log-GARCH processes by Sato and Matsuda (2017, 2020). Following from there, a non-linear least-squares procedure is introduced for this model class. These theoretical sections are followed up with a discussion of the insights gained from simulation studies. The paper then supplies a real-world example, namely the real estate prices in the German capital city of Berlin. In Sect. 6, we stress some important extensions for future research before concluding the paper.

2 Spatial and spatiotemporal GARCH-type models

Let \(\left\{ Y(\varvec{s}) \in \mathbb {R}: \varvec{s} \in D_{\varvec{s}} \right\} \) be a univariate stochastic process, where \(D_{\varvec{s}}\) represents a set of possible locations in a q-dimensional space. Thus, spatial and spatiotemporal models are both covered by this approach. With regard to spatiotemporal processes, the temporal dimension can easily be considered as one of the q dimensions. In addition, time-series GARCH models are included for \(q = 1\).

Let \(\varvec{s}_1, \ldots , \varvec{s}_n\) denote all locations, and let \(\varvec{Y}\) stand for the vector of observations \(\left( Y\left( \varvec{s}_i\right) \right) _{i = 1, \ldots , n}\). The commonly applied spatial autoregressive (SAR) model implies that the conditional variance \({{\,\mathrm{Var}\,}}(Y(\varvec{s}_i) \vert Y(\varvec{s}_j), j \ne i)\) is constant (cf. Cressie 1993; Cressie and Wikle 2011) and does not depend on the observations of neighbouring locations. This approach is extended by assuming the changes in the volatility can spill over to neighbouring regions and that conditional variances can vary over space, resulting in clusters of high and low variance. As in time-series ARCH models developed by Engle (1982), the vector of observations is given by the non-linear relationship

$$\begin{aligned} \varvec{Y} = \text {diag}(\varvec{h})^{1/2} \varvec{\varepsilon } \, , \end{aligned}$$
(1)

where \(\varvec{h} = (h(\varvec{s}_1), \ldots , h(\varvec{s}_n))'\) and \(\varvec{\varepsilon } = (\varepsilon (\varvec{s}_1), \ldots , \varepsilon (\varvec{s}_n))'\) is a noise component, which is later specified in more detail.

Moreover, we assume that a known function f exists, which relates \(\varvec{h}\) to a vector \(\varvec{F} = (f(h(\varvec{s}_1)), \ldots , f(h(\varvec{s}_n)))^\prime \). This general approach is beneficial because different spatial GARCH-type processes can be defined by choosing f and a suitable model of \(\varvec{F}\). For instance, they could have additive or multiplicative dynamics, or the spill-over effects in the conditional variances could be global or locally constrained to directly neighbouring observations. In this paper, we extend the class of spatial ARCH models to generalized spatial GARCH models (spGARCH), which are defined analogously to the time-series GARCH models and have additive dynamics (cf. Bollerslev 1986). Besides, previously introduced spatial ARCH models are nested within this general approach (e.g., Otto et al. 2018; Sato and Matsuda 2020; see also Examples 1 and 3 in Sect. 2.3).

2.1 Generalized approach

Below, we introduce a general approach covering some important spatial and spatiotemporal GARCH-type models, namely the spatial ARCH model of Otto et al. (2018) and the logarithmic model of Sato and Matsuda (2020). For these models, vector \(\varvec{F}\) is chosen as

$$\begin{aligned} \varvec{F} = \varvec{\alpha } + \mathbf {W}_1 \varvec{\gamma }(\varvec{Y}^{(2)}) + \mathbf {W}_2 \varvec{F} \end{aligned}$$
(2)

with a measurable function \(\varvec{\gamma }(\varvec{x}) = (\gamma _1(\varvec{x}), \ldots , \gamma _n(\varvec{x}))^\prime \) and \(\varvec{Y}^{(2)} = (Y(\varvec{s}_1)^2, \ldots , Y(\varvec{s}_n)^2)^\prime \). The weighting matrices \(\mathbf {W}_1 = (w_{1,ij})_{i,j = 1, \ldots , n}\) and \(\mathbf {W}_2 = (w_{2,ij})_{i,j = 1, \ldots , n}\) are assumed to be non-negative with zeros on the diagonal (i.e., \(w_{v,ij} \ge 0\) and \(w_{v,ii}=0\) for all \(i,j = 1, \ldots , n\) and \(v = 1, 2\)). Moreover, let \(\varvec{\alpha } = (\alpha _i)_{i = 1, \ldots , n}\) be a positive vector.

First, we discuss under what conditions the process is well-defined. To do this, we make use of the Banach fixed point theorem for random processes. The field of random fixed point theorems has been studied by several authors (e.g., Hanš 1957; Bharucha-Reid et al. 1976; Tan and Yuan 1997).

In the following, we make use of the notation \(\mathbf {E} = \text{ diag }(\varepsilon (\varvec{s}_1)^2, \ldots , \varepsilon (\varvec{s}_n)^2)\). Considering the operator

$$\begin{aligned} \varvec{T}(\omega )\circ \varvec{z} = \varvec{\alpha } + \mathbf {W}_1 \varvec{\gamma }(\mathbf {E}(\omega ) \varvec{z}) - (\mathbf {I} - \mathbf {W}_2) ( f(z_i) )_{i=1,\ldots ,n} + \varvec{z} \end{aligned}$$
(3)

defined on \(I\!\!R^n\) with a norm \(\vert \vert . \vert \vert \), the following conditions can be derived such that the process is well-defined. Furthermore, \(\omega \) is an element in the probability space of the error process and \(\circ \) denotes the Hadamard product.

Theorem 1

Suppose that the operator \(\varvec{T}(\omega )\) defined in (3) is a continuous random operator on \((I\!\!R^n, \vert \vert .\vert \vert )\) to itself and that there is a non-negative real-valued random variable \(L_n(\omega ) < 1\) a.s. such that \(\vert \vert \varvec{T}(\omega ) \circ \varvec{z}_1 - \varvec{T}(\omega ) \circ \varvec{z}_2\vert \vert \le L_n(\omega ) \vert \vert \varvec{z}_1 - \varvec{z}_2\vert \vert \) for all \(\varvec{z}_1, \varvec{z}_2 \in I\!\!R^n\). Then the equations (1) and (2) have exactly one real-valued measurable solution \(\varvec{z}\).

The proof of this theorem is given in the Appendix. Note that the condition of Theorem 1 is fulfilled if, for example, \(\varvec{\gamma }\) satisfies a Lipschitz condition with constant \(L_1\), \((f(z_i))_{i=1,\ldots ,n}\) satisfies a Lipschitz condition with a constant \(L_2\) and \(L_n := L_1 \vert \vert \mathbf {W}_1 \mathbf {E}\vert \vert + L_2 \vert \vert \mathbf {I} - \mathbf {W}_2\vert \vert < 1\) where we make use of the matrix norm induced by the vector norm. However, in order to guarantee that \(L_n\) does not depend on n, we need stronger conditions. If we take the 1-norm and if the matrices \(\mathbf {W}_1\) and \(\mathbf {W}_2\) are row-standardized then \(\vert \vert \mathbf {I} - \mathbf {W}_2\vert \vert < 2\). To ensure that \(\vert \vert \mathbf {W}_1 \mathbf {E}\vert \vert \) is bounded we have to assume that the \(\varepsilon (\varvec{s}_i)\) are uniformly bounded. We refer to Otto et al. (2018) where this problem is discussed for a spatial ARCH process in more detail. Moreover, it is important to note that the operator \(\varvec{T}(\omega )\) is continuous if f and \(\gamma \) are continuous.

Further, the solution of (3) which reflects \(\varvec{h}\) should be non-negative such that the process \(\varvec{Y}\) is a well-defined real-valued process. In many applications, the functions f and \(\varvec{\gamma }\) are defined to be zero for negative values, such that \(\varvec{h}\) is always positive if \(\varvec{\alpha } > 0\). We will come back to this point later.

In addition, the fixed-point theorem of Banach implies that the sequence \(\varvec{z}_m = \varvec{T}(\varvec{z}_{m-1})\), \(m \ge 1\) converges to \(\varvec{h}\) for given \(\varvec{\alpha }\), \(\mathbf {E}\), \(\mathbf {W}_1\), and \(\mathbf {W}_2\). Consequently, this result represents one way to simulate such a process.
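The iteration can be sketched in a few lines for the additive case in which f and \(\varvec{\gamma }\) act as the identity on non-negative values, so that the operator in (3) reduces to \(\varvec{z} \mapsto \varvec{\alpha } + \mathbf {W}_1 \mathbf {E} \varvec{z} + \mathbf {W}_2 \varvec{z}\). This is only an illustration under assumptions: the function names, the four-location neighbourhood structure, the parameter values and the fixed error values are hypothetical, and the code is not the implementation of the spGARCH package.

```python
# Fixed-point simulation sketch for the additive case (f and gamma = identity).
# All names and numbers below are illustrative assumptions.

def mat_vec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def simulate_h(alpha, w1, w2, eps, tol=1e-12, max_iter=10_000):
    """Iterate z_m = alpha + W1 E z_{m-1} + W2 z_{m-1} until it stabilizes."""
    e = [x * x for x in eps]                  # diagonal of E
    z = list(alpha)                           # starting value z_0 = alpha
    for _ in range(max_iter):
        ez = [ei * zi for ei, zi in zip(e, z)]
        z_new = [al + a + b
                 for al, a, b in zip(alpha, mat_vec(w1, ez), mat_vec(w2, z))]
        if max(abs(p - q) for p, q in zip(z_new, z)) < tol:
            return z_new
        z = z_new
    raise RuntimeError("no convergence; the contraction condition may fail")

# Hypothetical setting: four locations on a line, row-standardized neighbours.
w_star = [[0.0, 1.0, 0.0, 0.0],
          [0.5, 0.0, 0.5, 0.0],
          [0.0, 0.5, 0.0, 0.5],
          [0.0, 0.0, 1.0, 0.0]]
rho, lam = 0.3, 0.4
w1 = [[rho * x for x in row] for row in w_star]
w2 = [[lam * x for x in row] for row in w_star]
alpha = [1.0] * 4
eps = [0.5, -1.1, 0.8, -0.3]                  # fixed, bounded error values

h = simulate_h(alpha, w1, w2, eps)
y = [hi ** 0.5 * ei for hi, ei in zip(h, eps)]   # observations as in (1)
```

With these values the row-sum norm of \(\mathbf {W}_1 \mathbf {E} + \mathbf {W}_2\) is at most \(0.3 \cdot 1.21 + 0.4 < 1\), so the iteration is a contraction; for unbounded errors this need not hold and the loop may fail to converge.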

2.2 Properties of spatial GARCH models

Below, we discuss some important properties of this process including the following condition for stationarity.

Corollary 2

Suppose that the assumptions of Theorem 1 are fulfilled and that the solution of (3) is non-negative. If \((\varepsilon (\varvec{s}_1),\ldots ,\varepsilon (\varvec{s}_n))^\prime \) is strictly stationary, then \((Y(\varvec{s}_1), \ldots , Y(\varvec{s}_n))^\prime \) is strictly stationary as well.

Moreover, the observations \(Y(\varvec{s})\) are uncorrelated with a mean of zero, as we will show in the following theorem. Thus, spatial GARCH models are suitable error models for use with other linear or non-linear spatial regression models, such as spatial autoregressive or spatial error models (see also Elhorst 2010), without affecting the mean equation. In this way, locally varying model uncertainties can be captured.

Theorem 3

Let \(i \in \{1,\ldots ,n\}\). Suppose that the assumptions of Theorem  1 are satisfied and that the solution of (3) is non-negative. Further let \(\varvec{\varepsilon }\) be sign-symmetric, i.e.,

$$\begin{aligned} \varvec{\varepsilon } {\mathop {=}\limits ^{d}} ((-1)^{v_1} \varepsilon (\varvec{s}_1),\ldots ,(-1)^{v_n} \varepsilon (\varvec{s}_n)) \qquad \text{ for } \text{ all } \qquad v_1,\ldots ,v_n \in \{0,1\} . \end{aligned}$$
(a) Then \(Y(\varvec{s}_i)\) is a symmetric random variable. All odd moments and all conditional odd moments of \(Y(\varvec{s}_i)\) are zero, provided that they exist.

(b) It holds that \(\text{ Cov }(Y(\varvec{s}_i), Y(\varvec{s}_j)) = 0\) for \(i \ne j\) if the second moment exists.

In the spatial setting, however, the conditional variance \({{\,\mathrm{Var}\,}}(Y(\varvec{s}_i) \vert Y(\varvec{s}_j), j \ne i)\) is not exactly equal to \(h(\varvec{s}_i)\) (see Otto et al. 2021). In principle, this is due to the fact that there is no clear (causal) ordering of observations as in time series, where only past observations can influence future observations, but not vice versa. In case of a directional spatial dependence, however, \(h(\varvec{s}_i)\) is equal to the conditional variance at location \(\varvec{s}_i\). A detailed analysis of this point can be found in Otto et al. (2021). Nevertheless, the interpretation of \(\varvec{h}\) is similar to the conditional variance. In locations \(\varvec{s}\), where \(h(\varvec{s})\) is large, the conditional variance is also large and vice versa (see Otto et al. 2021, Fig. 1). That means that the local risk or level of uncertainty of this particular region is high compared to its neighbours. Such regions could be identified via \(\varvec{h}\); this could be of interest in terms of the valuation of real estate or other immovable assets since it provides insights into an individual location’s risk.

In addition to this, the spatial GARCH coefficients measure potential risk spill-overs from neighbouring locations. It is worth noting that in the case of directional spatial processes, \(\varvec{h}\) is equal to the conditional variances. Thus, it can be interpreted in the same way as with time-series GARCH models.

2.3 Examples of spatial GARCH models

This general framework allows for a large range of GARCH-type models. Depending on the definition of f and \(\varvec{\gamma }\), the resulting spatial GARCH-type models have different stochastic properties. We discuss some important special cases below, starting with the spatial ARCH model, which is a direct extension of the ARCH process of Engle (1982) to spatial and spatiotemporal processes. It was originally introduced by Otto et al. (2018). For more details on its stochastic properties, we refer to Otto et al. (2021).

Example 1

(Spatial ARCH process of Otto et al. 2018) Choosing \(f(x) = x I_{[0,\infty )}(x)\), \(\gamma _i(\varvec{x}) = x_i I_{[0,\infty )}(x_i)\) for \(i = 1, \ldots , n\), and \(\mathbf {W}_2 = \mathbf {0}\) the spatial ARCH (spARCH) process is obtained. It is given by

$$\begin{aligned} Y(\varvec{s}_i) = \sqrt{h(\varvec{s}_i)} \varepsilon (\varvec{s}_i), \quad i=1,\ldots ,n \end{aligned}$$

with

$$\begin{aligned} \varvec{h} = \varvec{\alpha } + \mathbf {W}_1 \varvec{Y}^{(2)}\, . \end{aligned}$$

Here, \(I_A(x)\) denotes the indicator function on a set A. The process is well-defined if \(\vert \vert \mathbf {W}_1 \mathbf {E}\vert \vert < 1\). This is an immediate consequence of Theorem 1. Indeed, the spatial ARCH process can be easily extended to a spatial GARCH process by considering the realized values of \(h(\cdot )\) in adjacent locations. This novel spatial GARCH process is defined in the following example.

Example 2

(Spatial GARCH process) Taking \(f(x) = x I_{[0,\infty )}(x)\) and \(\gamma _i(\varvec{x}) = x_i I_{[0,\infty )}(x_i)\) for \(i=1,\ldots ,n\) a spatial GARCH (spGARCH) process is obtained. That is,

$$\begin{aligned} Y(\varvec{s}_i) = \sqrt{h(\varvec{s}_i)} \varepsilon (\varvec{s}_i), \quad i=1,\ldots ,n \end{aligned}$$

with

$$\begin{aligned} \varvec{h} = \varvec{\alpha } + \mathbf {W}_1 \varvec{Y}^{(2)} + \mathbf {W}_2 \varvec{h} \, . \end{aligned}$$

Since \(\varvec{Y}^{(2)} = \mathbf {E} \varvec{h}\), the quantity \(\varvec{h}\) can be specified as

$$\begin{aligned} \varvec{h} = ( \mathbf {I} - \mathbf {W}_1 \mathbf {E} - \mathbf {W}_2)^{-1} \varvec{\alpha } \, , \end{aligned}$$
(4)

if the inverse exists. For this simple example, there is a unique solution if \(\vert \vert \mathbf {W}_1 \mathbf {E} + \mathbf {W}_2\vert \vert < 1\), as it is already expressed in Theorem 1. Alternatively, the condition is fulfilled if the process is directional. In this case, \(\mathbf {W}_1\) and \(\mathbf {W}_2\) are lower or upper triangular matrices (cf. Basak et al. 2018; Merk and Otto 2021).
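As a small numerical illustration of (4), the following sketch solves the linear system \((\mathbf {I} - \mathbf {W}_1 \mathbf {E} - \mathbf {W}_2) \varvec{h} = \varvec{\alpha }\) directly instead of forming the inverse. The three-location weight structure, the parameter values, the error values and the helper names are assumptions made for illustration only.

```python
# Direct evaluation of h = (I - W1 E - W2)^{-1} alpha for a toy example.
# The helper names and all numbers are illustrative assumptions.

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]      # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-14:
            raise ValueError("matrix is numerically singular")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                    # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def h_closed_form(alpha, w1, w2, eps):
    """Build I - W1 E - W2 and solve for h; (W1 E)_ij = w1_ij * eps_j^2."""
    n = len(alpha)
    e = [x * x for x in eps]
    a = [[(1.0 if i == j else 0.0) - w1[i][j] * e[j] - w2[i][j]
          for j in range(n)] for i in range(n)]
    return solve(a, alpha)

# Hypothetical example with three mutually neighbouring locations.
w_star = [[0.0, 0.5, 0.5],
          [0.5, 0.0, 0.5],
          [0.5, 0.5, 0.0]]
rho, lam = 0.2, 0.5
w1 = [[rho * x for x in row] for row in w_star]
w2 = [[lam * x for x in row] for row in w_star]
alpha = [1.0, 1.0, 1.0]
eps = [1.0, -0.5, 0.7]

h = h_closed_form(alpha, w1, w2, eps)
```

Here the row-sum norm of \(\mathbf {W}_1 \mathbf {E} + \mathbf {W}_2\) is at most \(0.2 + 0.5 < 1\), so the system has a unique solution with positive entries. If the norm condition fails, the solve may still succeed numerically but can return negative entries, which are not admissible values of \(\varvec{h}\).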

The case of triangular matrices also includes causal temporal and spatiotemporal GARCH processes. Let us first consider time-series GARCH models. In this case, the dimension of the underlying domain \(D_{\varvec{s}}\) is equal to \(q = 1\) and there is a strict causal relation between the observations. That is, current observations are only influenced by past observations. This implies a triangular structure of the weight matrices. For instance, for a GARCH(1, 1) process, the first subdiagonal elements of \(\mathbf {W}_1\) and \(\mathbf {W}_2\) are equal to one if \(\varvec{s}_1 = 1, \ldots , \varvec{s}_n = T\) representing the consecutive time points from one to T. Similarly, for spatiotemporal processes, the weight matrices are triangular if instantaneous spatial interactions are excluded, such as for the models of Hølleland and Karlsen (2020), Borovkova and Lopuhaa (2012), and Caporin and Paruolo (2006). In contrast to these causal processes, the current observations of this new spatiotemporal GARCH model are not only influenced by past realisations of the process, but also by their neighbouring observations. In this way, the model can account for instantaneous spatial interactions as implied by the first law of geography.
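The nesting of the time-series GARCH(1, 1) model can be checked numerically. The sketch below uses a shift matrix with ones on the first subdiagonal and, as an assumption in the spirit of the parametrization used later in Sect. 3, scales it by two coefficients a and b. The convention \(h_1 = \alpha \) for the first time point, the function names and all numerical values are likewise assumptions for illustration.

```python
# GARCH(1,1) written as a directional spatial GARCH with subdiagonal weights.
# Coefficients, error values and function names are illustrative assumptions.

def garch_recursion(alpha, a, b, eps):
    """Classical recursion h_t = alpha + a*y_{t-1}^2 + b*h_{t-1}, h_1 = alpha."""
    h, y = [], []
    for t, e in enumerate(eps):
        ht = alpha if t == 0 else alpha + a * y[-1] ** 2 + b * h[-1]
        h.append(ht)
        y.append(ht ** 0.5 * e)
    return h, y

def sp_garch_directional(alpha, w1, w2, eps):
    """Forward substitution for strictly lower-triangular W1 and W2."""
    n = len(eps)
    h, y2 = [0.0] * n, [0.0] * n
    for i in range(n):
        h[i] = alpha + sum(w1[i][j] * y2[j] + w2[i][j] * h[j] for j in range(i))
        y2[i] = h[i] * eps[i] ** 2            # Y^(2) = E h
    return h

T = 6
# Shift matrix: ones on the first subdiagonal, zeros elsewhere.
shift = [[1.0 if i == j + 1 else 0.0 for j in range(T)] for i in range(T)]
a, b, alpha = 0.2, 0.7, 0.1
w1 = [[a * x for x in row] for row in shift]
w2 = [[b * x for x in row] for row in shift]
eps = [0.3, -1.4, 0.9, 2.0, -0.6, 1.1]

h_ts, _ = garch_recursion(alpha, a, b, eps)
h_sp = sp_garch_directional(alpha, w1, w2, eps)   # agrees with h_ts
```

Because the weight matrices are strictly lower triangular, no norm condition is needed here: each \(h(\varvec{s}_i)\) depends only on earlier indices and can be computed in one forward pass.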

Contrary to the previous examples, Sato and Matsuda (2017, 2020) have considered a slightly different choice of f, and have used the log-transformation to avoid any non-negativity problems of \(\varvec{h}\). Thus, their model combines the GARCH and the E-GARCH approaches and can be regarded as the spatial extension of a (symmetric) log-GARCH process (see also Francq et al. 2013). Let \(\varvec{h}_L = ( \log (h(\varvec{s}_i)) )_{i=1,\ldots ,n}\) and \(\varvec{Y}^{(2)}_L = ( \log (Y(\varvec{s}_i)^2 ) )_{i=1,\ldots ,n}\).

Example 3

(Spatial log-GARCH process of Sato and Matsuda 2017) Choosing \(f(x) = \log (x)\) and \(\gamma _i(\varvec{x}) = \log (x_i)\) the symmetric spatial log-GARCH (log-spGARCH) process is obtained, i.e.,

$$\begin{aligned} Y(\varvec{s}_i) = \sqrt{h(\varvec{s}_i)} \varepsilon (\varvec{s}_i), \quad i=1,\ldots ,n \end{aligned}$$

with

$$\begin{aligned} \varvec{h}_L = \varvec{\alpha } + \mathbf {W}_1 \varvec{Y}^{(2)}_L + \mathbf {W}_2 \varvec{h}_L \, . \end{aligned}$$

The process has a unique solution if \(\vert \vert \mathbf {W}_1 + \mathbf {W}_2\vert \vert < 1\). This is also an immediate consequence of Theorem 1. It is obtained by setting \(\gamma (\varvec{x}) = (\log (x_i) )_{i=1,\ldots ,n}\), \(f(x)=\log (x)\), and considering the right side of (3) to be a function of \(\log (z)\). We see that the condition on the existence of a solution is much simpler than for the spGARCH process, since it only depends on the weight matrices and not on the random matrix \(\mathbf {E}\). This simplification is due to the fact that we have an additive decomposition of the function \(\varvec{\gamma }\), i.e., \(\varvec{\gamma }(\mathbf {E} \varvec{z}) = \varvec{\phi }_1(\mathbf {E}) + \varvec{\phi }_2(\varvec{z})\) with certain functions \(\varvec{\phi }_1\) and \(\varvec{\phi }_2\). This functional equation is solved by the logarithm function. However, the behaviour of the log-spGARCH differs from that of the spGARCH. Thus, one or the other could be preferable for empirical applications. To summarize this section, we provide an overview of all nested spatial GARCH-type models and other important GARCH models, e.g., the classical time-series GARCH models, in Table 1.
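To illustrate why the existence condition of the log-spGARCH process does not involve \(\mathbf {E}\), note that substituting \(\varvec{Y}^{(2)}_L = \varvec{h}_L + (\log \varepsilon (\varvec{s}_i)^2)_{i=1,\ldots ,n}\) into the recursion of Example 3 gives \(\varvec{h}_L = \varvec{\alpha } + \mathbf {W}_1 (\log \varepsilon (\varvec{s}_i)^2)_{i} + (\mathbf {W}_1 + \mathbf {W}_2) \varvec{h}_L\). The sketch below iterates this equation; the weight structure, parameters, error values and function names are assumptions chosen for illustration.

```python
import math

# Fixed-point sketch for the log-spGARCH recursion of Example 3.
# All names and numbers are illustrative assumptions.

def mat_vec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def simulate_log_h(alpha, w1, w2, eps, tol=1e-12, max_iter=10_000):
    """Iterate h_L = alpha + W1 log(eps^2) + (W1 + W2) h_L."""
    log_e2 = [math.log(x * x) for x in eps]          # requires eps != 0
    const = [al + s for al, s in zip(alpha, mat_vec(w1, log_e2))]
    w = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(w1, w2)]
    hl = list(const)
    for _ in range(max_iter):
        new = [c + s for c, s in zip(const, mat_vec(w, hl))]
        if max(abs(p - q) for p, q in zip(new, hl)) < tol:
            return new
        hl = new
    raise RuntimeError("no convergence; check that ||W1 + W2|| < 1")

w_star = [[0.0, 1.0, 0.0, 0.0],
          [0.5, 0.0, 0.5, 0.0],
          [0.0, 0.5, 0.0, 0.5],
          [0.0, 0.0, 1.0, 0.0]]
rho, lam = 0.4, 0.5                                  # rho + lam < 1
w1 = [[rho * x for x in row] for row in w_star]
w2 = [[lam * x for x in row] for row in w_star]
alpha = [0.0] * 4                                    # no sign restriction on the log scale
eps = [0.5, 3.0, -0.8, 0.2]                          # includes a large shock

h_log = simulate_log_h(alpha, w1, w2, eps)
h = [math.exp(v) for v in h_log]                     # positive by construction
```

The contraction factor is the norm of \(\mathbf {W}_1 + \mathbf {W}_2\) (here 0.9), whatever the size of the errors, and the exponential back-transformation guarantees a positive \(\varvec{h}\).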

Table 1 Overview of nested spatial GARCH-type models for \(\gamma _i(\varvec{Y}) = Y(\varvec{s}_i)^2\)

3 Statistical inference

In the following section, we first discuss the choice of the weight matrices in more detail. In a general setting, \(\mathbf {W}_1\) and \(\mathbf {W}_2\) have \(n(n-1)\) free parameters, while only n values are observed. In spatial econometrics, these matrices are therefore usually replaced with a parametric model to control the influence of adjacent regions. Note that the parameter estimates are typically biased if the weights are misspecified. Alternatively, they might instead be estimated using statistical learning approaches, e.g., lasso-type estimators under the assumption of a certain degree of sparsity (e.g., Bhattacharjee and Jensen-Butler 2013; Otto and Steinert 2021). In this section of the paper, however, we will focus on a classical parametric model. For this, we develop an estimation method based on non-linear least squares estimators and show the consistency of these estimators. In practice, the weight matrices must be selected carefully.

3.1 Choice of weight matrices

There is great flexibility in the choice of the weight matrices (see Getis 2009 for an overview). In practice, these are usually dependent upon additional parameters and spatial locations. Frequently, it is assumed that \(\mathbf {W}_1 = \rho \mathbf {W}^{*}_1\) and \(\mathbf {W}_2 = \lambda \mathbf {W}^{*}_2\) with the predefined, known matrices \(\mathbf {W}^{*}_1\) and \(\mathbf {W}^{*}_2\). That is, \(\mathbf {W}^{*}_1\) and \(\mathbf {W}^{*}_2\) describe the structure of the spatial dependence, with the weights as a multiple of these specific matrices. In settings such as these, it is easy to test whether a random process exhibits such a spatial dependence by testing the parameters \(\rho \) and \(\lambda \). As with time-series GARCH models, \(\rho \) measures the extent to which a volatility shock in one region spills over to neighbouring regions, while \(\rho + \lambda \) gives an impression of how fast this effect will fade out in space (see, e.g., Campbell et al. 1997). It is important to note that spill-over effects of shocks always happen simultaneously in a purely spatial setting, i.e., without any temporal delay. For spatial autoregressive models, a similar distinction between local and global spill-over effects can be made (see Fingleton 2008, 2009). A more general approach can be obtained by choosing \(\mathbf {W}_{k} = \text {diag}(\rho _1, \ldots , \rho _1, \ldots , \rho _k, \ldots , \rho _k) {\mathbf {W}_{\cdot }^{*}}\) as the weights for \(k \in \{1,2\}\). Here, different areas are weighted in different ways. For instance, all counties of state i are weighted by \(\rho _i\), while counties of another state, j, get a different weighting factor, \(\rho _j\). Alternatively, \(\mathbf {W}_{k}^{*}\) could be chosen as \((K_\theta (\varvec{s}_i - \varvec{s}_j))_{i,j = 1, \ldots , n}\) for \(k \in \{1,2\}\) with a known function K. In this case, the spatial correlation depends on the distance between two locations. Examples include inverse distance weighting schemes \(K(\varvec{x}) = \vert \vert \varvec{x}\vert \vert ^{-k}\), with k being estimated, and anisotropic weighting schemes that depend upon the bearing between two locations.
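A row-standardized inverse-distance weight matrix of this kind can be constructed as follows. This is a minimal sketch: the function name and the coordinates are hypothetical, and the distance exponent is passed as a fixed argument rather than estimated.

```python
import math

# Row-standardized inverse-distance weights with a zero diagonal.
# The function name, coordinates and exponent are illustrative assumptions.

def inverse_distance_weights(coords, power=1.0):
    """Return W* with w_ij proportional to ||s_i - s_j||^(-power), w_ii = 0."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Locations must be distinct, otherwise the distance is zero.
                w[i][j] = math.dist(coords[i], coords[j]) ** (-power)
        total = sum(w[i])
        w[i] = [x / total for x in w[i]]             # each row sums to one
    return w

coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
w_star = inverse_distance_weights(coords, power=1.0)
```

Row standardization is not required by the model definition itself, but it is the normalization assumed later for the consistency result, and it makes the norm conditions on \(\rho \) and \(\lambda \) easy to check.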

3.2 Parameter estimation

Below, we assume that the weight matrices have the structure

$$\begin{aligned} \mathbf {W}_1 = \rho \mathbf {W}_1^*, \mathbf {W}_2 = \lambda \mathbf {W}_2^*, \varvec{\alpha } = \alpha \varvec{1} . \end{aligned}$$
(5)

Thus, the model has three parameters to be estimated, \(\varvec{\vartheta } = (\rho , \lambda , \alpha )^\prime \). Let \(\varvec{\vartheta }_0\) denote the true parameters. In the following we use the symbol \(\vert \vert \varvec{x}\vert \vert _2\) for the Euclidean norm of a vector \(\varvec{x}\) and \(\vert \vert \mathbf {A}\vert \vert _\infty = \max _{1 \le i \le n} \sum _{j=1}^{n} \vert a_{ij}\vert \) for the matrix norm of an \(n \times n\) matrix \(\mathbf {A}\), which is induced by a maximum norm.

Even though this seems to be a strong restriction compared to the general definition of the model in (1) and (2), it is probably the most widely applied specification in practice. Generally, the estimation method could also be applied for more complex specifications of the spatial interactions, such as the above-mentioned choices or higher-order spatial lags. In this case, particular attention should be paid to the identifiability of the process parameters (cf. Manski 1993) and to the consistency of the parameter estimators.

One possible method for estimating the parameters is the non-linear least-squares approach (NLSE). Squaring the components of (1) and taking the logarithms, we get that for \(i=1, \ldots , n\)

$$\begin{aligned} \text{ log }(Y(\varvec{s}_i)^2)= & {} \text{ log }(h(\varvec{s}_i)) + \text{ log }(\varepsilon (\varvec{s}_i)^2)\\= & {} {{\,\mathrm{\mathbb {E}}\,}}( \text{ log }(\varepsilon (\varvec{s}_i)^2) ) + \text{ log }(h(\varvec{s}_i)) + \eta (\varvec{s}_i) \end{aligned}$$

with \(\eta (\varvec{s}_i) = \text{ log }(\varepsilon (\varvec{s}_i)^2) - {{\,\mathrm{\mathbb {E}}\,}}( \text{ log }(\varepsilon (\varvec{s}_i)^2) )\). Now \(\eta (\varvec{s}_i), i=1,\ldots ,n\) is a white noise process. Moreover, it follows with \(\tau (x) = f(\text{ exp }(x))\) that

$$\begin{aligned} \varvec{F}= & {} ( \tau (\text{ log }(h(\varvec{s}_i))) )_{i=1,\ldots ,n} = (\mathbf {I} - \lambda \mathbf {W}_2^*)^{-1} ( \alpha \varvec{1} + \rho \mathbf {W}_1^* \varvec{\gamma }(\varvec{Y}^{(2)} ) ) \\= & {} (\mathbf {I} - \lambda \mathbf {W}_2^*)^{-1} (\alpha \varvec{1} + \rho \mathbf {W}_1^* \tilde{\varvec{\gamma }}(\text{ log }(\varvec{Y}^{(2)}) )) \end{aligned}$$

where \( \tilde{\varvec{\gamma }}(\varvec{x}) = ( \gamma _i(\text{ exp }(x_1),\ldots , \text{ exp }(x_n)) )_{i=1,\ldots ,n}\). Note that \((\mathbf {I} - \lambda \mathbf {W}_2^*)^{-1}\) exists if \(\vert \vert \lambda \mathbf {W}_2^*\vert \vert < 1\).

Now, let \(( c_i(\lambda ) )_{i=1,\ldots ,n} = (\mathbf {I} - \lambda \mathbf {W}_2^*)^{-1} \varvec{1}\) and \(( \varvec{d}_i(\lambda )^\prime )_{i=1,\ldots ,n} = (\mathbf {I} - \lambda \mathbf {W}_2^*)^{-1} \mathbf {W}_1^*\). In order to denote the dependence on \(\varvec{\vartheta }\) we write \(h_{\varvec{\vartheta }}(\varvec{s}_i)\), \(i=1,\ldots ,n\). Then,

$$\begin{aligned} \text{ log }(h_{\varvec{\vartheta }}(\varvec{s}_i)) = \tau ^{-1}(\alpha \, c_i(\lambda ) + \rho \, \varvec{d}_i(\lambda )^\prime \, \tilde{\varvec{\gamma }}(\text{ log }(\varvec{Y}^{(2)} ) ) ) . \end{aligned}$$

Here, we assume that \(c = {{\,\mathrm{\mathbb {E}}\,}}( \text{ log }(\varepsilon (\varvec{s}_i)^2) )\) is a known quantity. Using \(H_i = \text{ log }(Y(\varvec{s}_i)^2) - {{\,\mathrm{\mathbb {E}}\,}}( \text{ log }(\varepsilon (\varvec{s}_i)^2) )\) and \(\varvec{H} = ( H_i )_{i=1,\ldots ,n}\) the estimators of the parameters \(\alpha \), \(\lambda \), and \(\rho \) are obtained by minimizing the non-linear sum of squares

$$\begin{aligned} \sum _{i=1}^n \left( H_i - \text{ log }(h_{\varvec{\vartheta }}(\varvec{s}_i)) \right) ^2 = \sum _{i=1}^n \left( H_i - \tau ^{-1}(\alpha \, c_i(\lambda ) + \rho \, \varvec{d}_i(\lambda )^\prime \, \tilde{\varvec{\gamma }}(\varvec{H} + c \varvec{1} ) ) \right) ^2 \end{aligned}$$

with respect to \(\varvec{\vartheta }\).

Although \(\tau ^{-1}\) is a known function, this minimization problem is complex. Thus, we will impose further assumptions which are fulfilled for all relevant special cases. We will suppose that \(\varvec{\gamma }(\varvec{x}) = ( \gamma ( x_i ) )\) with a known function \(\gamma \). Consequently, \(\tilde{\varvec{\gamma }}(\varvec{x}) = ( \tilde{\gamma }(x_i) )_{i=1,\ldots ,n}\) with \(\tilde{\gamma }(x) = \gamma ( \text{ exp }(x) )\), which leads to the easier minimization of

$$\begin{aligned} Q_n(\varvec{\vartheta } ) = \frac{1}{n} \; \sum _{i=1}^n \left( H_i - \tau ^{-1}\left( \alpha \, c_i(\lambda ) + \rho \, \varvec{d}_i(\lambda )^\prime \, \left( \tilde{\gamma }( H_v + c )\right) _{v=1,\ldots ,n} \right) \right) ^2 . \end{aligned}$$
(6)
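For the spGARCH case, \(\tau ^{-1}\) is the logarithm and \(\tilde{\gamma }(H_v + c) = Y(\varvec{s}_v)^2\), so the criterion (6) reduces to the mean squared difference between \(H_i\) and \(\log (h_{\varvec{\vartheta }}(\varvec{s}_i))\) with \(( h_{\varvec{\vartheta }}(\varvec{s}_i) )_{i} = (\mathbf {I} - \lambda \mathbf {W}_2^*)^{-1}(\alpha \varvec{1} + \rho \mathbf {W}_1^* \varvec{Y}^{(2)})\). The following sketch evaluates this criterion. It rests on several assumptions that are not part of the text: standard normal errors, for which \(c = -(\gamma _E + \log 2) \approx -1.27\) with Euler's constant \(\gamma _E\); a simple iteration in place of an explicit matrix inverse; and hypothetical function names and data. A numerical optimizer over \(\Theta \) would be needed on top of it to obtain the estimator.

```python
import math

# Sketch of the criterion Q_n in (6) for the spGARCH case (tau^{-1} = log).
# Assumption: standard normal errors, so c = E log(eps^2) = -(gamma_E + log 2).
EULER_GAMMA = 0.5772156649015329
C_NORMAL = -(EULER_GAMMA + math.log(2.0))

def mat_vec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def solve_neumann(lam, w, g, tol=1e-13, max_iter=10_000):
    """Solve (I - lam*W) x = g by iterating x <- g + lam*W x (||lam*W|| < 1)."""
    x = list(g)
    for _ in range(max_iter):
        new = [gi + lam * s for gi, s in zip(g, mat_vec(w, x))]
        if max(abs(p - q) for p, q in zip(new, x)) < tol:
            return new
        x = new
    raise RuntimeError("no convergence; check that ||lam * W2*|| < 1")

def q_n(theta, y, w1_star, w2_star, c=C_NORMAL):
    """Mean squared residual H_i - log(h_theta(s_i)); y must be non-zero."""
    rho, lam, alpha = theta
    y2 = [v * v for v in y]
    g = [alpha + rho * s for s in mat_vec(w1_star, y2)]   # alpha*1 + rho*W1* Y^(2)
    h = solve_neumann(lam, w2_star, g)                    # h_theta(s_i) > 0
    resid = [math.log(v) - c - math.log(hi) for v, hi in zip(y2, h)]
    return sum(r * r for r in resid) / len(y)

# Hypothetical data: four locations on a line and made-up observations.
w_star = [[0.0, 1.0, 0.0, 0.0],
          [0.5, 0.0, 0.5, 0.0],
          [0.0, 0.5, 0.0, 0.5],
          [0.0, 0.0, 1.0, 0.0]]
y_obs = [0.6, -1.5, 1.0, -0.4]
q_a = q_n((0.3, 0.4, 1.0), y_obs, w_star, w_star)
q_b = q_n((0.1, 0.2, 0.5), y_obs, w_star, w_star)
```

When the data are generated exactly from the model, the residuals at the true parameter coincide with \(\eta (\varvec{s}_i)\), so the criterion at \(\varvec{\vartheta }_0\) equals the sample mean of \(\eta (\varvec{s}_i)^2\).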

Note that since the (i, i)-th element of \(\mathbf {W}_1^*\) is zero, it follows that \(\varvec{d}_i(\lambda )^\prime \, \left( \tilde{\gamma }( H_v + c ) \right) _{v=1,\ldots ,n}\) is not a function of \(H_i\). Minimization problems of that type have been studied in detail by, e.g., Amemiya (1985), Pötscher and Prucha (1997), and Newey and McFadden (1994). There, sufficient conditions are given for the consistency and asymptotic normality of the resulting estimators under various conditions. Note that, in the present case, \(\{ H_i \}\) is a strictly stationary process. Moreover, in most papers on this topic, the regression function is assumed to be a deterministic function depending on certain parameters. In the present case, however, it is a function depending on the observations \(\{ H_i \}\), which makes the analysis of the asymptotic behaviour of the estimators much harder. Further, it must be noted that a spatial problem is present. The positions \(\varvec{s}_i\) are points in a space and we need a certain distance measure between these points to assess the dependence of the observations.

Theorem 4

Suppose that \(\varvec{\vartheta }_0 \in \Theta = [\rho _l, \rho _u] \times [\lambda _l, \lambda _u] \times [\alpha _l, \alpha _u] \subseteq [0, 1) \times [0,1) \times [0,\infty )\). Let \(\{ \varepsilon (\varvec{s}_i) : i \in I\!\!N \}\) be independent and identically distributed random variables with existing moment \({{\,\mathrm{\mathbb {E}}\,}}( (\log (\vert \varepsilon (\varvec{s}_1)\vert ))^2 )\). Let f be a differentiable and invertible function with \(f^\prime > 0\) on \((0,\infty )\) and let \(\gamma \) be a measurable function on \([0, \infty )\) with \({{\,\mathrm{Var}\,}}(\gamma (Y(\varvec{s}_1)^2)) < \infty \). Suppose that \(f^\prime (h_{\varvec{\vartheta }}(\varvec{s}_i)) \, h_{\varvec{\vartheta }}(\varvec{s}_i) \ge L > 0\) for all \(i, \varvec{\vartheta } \in \Theta , \omega \). Further assume that for \(\varvec{\vartheta }= (\rho , \lambda , \alpha ) \in \Theta \)

$$\begin{aligned}&\lim _{n \rightarrow \infty } \frac{1}{n} \sum _{i=1}^n \left( {{\,\mathrm{\mathbb {E}}\,}}( \log (h_{\varvec{\vartheta }}(\varvec{s}_i))) - {{\,\mathrm{\mathbb {E}}\,}}(\log (h_{\varvec{\vartheta }_0}(\varvec{s}_i)) ) \right) ^2 \quad \text{ exists } , \end{aligned}$$
(7)
$$\begin{aligned}&\frac{1}{n} \sum _{i=1}^n \left( \log (h_{\varvec{\vartheta }}(\varvec{s}_i)) - {{\,\mathrm{\mathbb {E}}\,}}( \log (h_{\varvec{\vartheta }}(\varvec{s}_i))) \right) ^2 {\mathop {\rightarrow }\limits ^{p}} 0 \end{aligned}$$
(8)

as n tends to infinity and that the limit function in (7) has a unique minimum at \(\varvec{\vartheta }_0\). Moreover, let \(\mathbf {W}_1^*\) and \(\mathbf {W}_2^*\) be row-standardized, i.e., \(\mathbf {W}_1^* \varvec{1} = \mathbf {W}_2^* \varvec{1} = \varvec{1}\). Then the minimization problem (6) has a solution \(\hat{\varvec{\vartheta }}_n\) and it holds that \(\hat{\varvec{\vartheta }}_n {\mathop {\rightarrow }\limits ^{p}} \varvec{\vartheta }_0\) as \(n \rightarrow \infty \).

Note that the solution of (6) does not have to be unique. For more details, we refer to Sect. 4 of Amemiya (1985).

Moreover, for a spGARCH process it holds that \(f^\prime (x) x = x\) and \(h_{\varvec{\vartheta }}(\varvec{s}_i) \ge \alpha _l > 0\) and thus the above condition is fulfilled. For a log-spGARCH process we have that \(f^\prime (x) x = 1\) and thus it is fulfilled as well. Further, for a log-spGARCH process the condition (7) can be easily seen to be fulfilled since for a strictly stationary process \(\{ Y(\varvec{s}_i ) \}\) the quantity \({{\,\mathrm{\mathbb {E}}\,}}(\log (h_{\varvec{\vartheta }}(\varvec{s}_i)))\) does not depend on i at all.
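As a brief check of the lower-bound condition \(f^\prime (h_{\varvec{\vartheta }}(\varvec{s}_i)) \, h_{\varvec{\vartheta }}(\varvec{s}_i) \ge L > 0\) in these two cases, the following sketch assumes \(f(x) = x\) on \([0,\infty )\) for the spGARCH process and \(f(x) = \log x\) for the log-spGARCH process; the constant L is then read off directly:

$$\begin{aligned} f(x) = x :&\quad f^\prime (x) \, x = x , \quad \text{ so that } \quad f^\prime (h_{\varvec{\vartheta }}(\varvec{s}_i)) \, h_{\varvec{\vartheta }}(\varvec{s}_i) = h_{\varvec{\vartheta }}(\varvec{s}_i) \ge \alpha _l =: L > 0 , \\ f(x) = \log x :&\quad f^\prime (x) \, x = \frac{1}{x} \, x = 1 =: L . \end{aligned}$$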

To establish the consistency of a local minimum, we consider the roots of the first derivative of the non-linear least-squares criterion with respect to the parameters, i.e., the solutions of

$$\begin{aligned} \frac{\partial Q_n(\varvec{\vartheta })}{\partial \varvec{\vartheta }} = 0 . \end{aligned}$$

Theorem 5

Suppose that the conditions of Theorem 4 are fulfilled and (7) and (8) hold in an open neighbourhood N of \(\varvec{\vartheta }_0\).

Let \(\Theta _T\) denote the set of roots of the equation

$$\begin{aligned} \frac{\partial Q_n(\varvec{\vartheta })}{\partial \varvec{\vartheta }} = 0 . \end{aligned}$$

Then it holds for all \(\varepsilon > 0\) that

$$\begin{aligned} \lim _{n \rightarrow \infty } P\left( \inf _{\varvec{\vartheta } \in \Theta _T} (\varvec{\vartheta } - \varvec{\vartheta }_0)^\prime (\varvec{\vartheta } - \varvec{\vartheta }_0) > \varepsilon \right) = 0 . \end{aligned}$$

As in the previous section, we will now consider a special case of the general framework, namely an spGARCH model. That is, we choose \(f(x) = x I_{ [0, \infty ]}(x)\) while \(\gamma \) is an arbitrary function satisfying certain conditions.

Lemma 6

Let \(\{ Y(s_i) \}\) be an spGARCH process. Suppose that \(\varvec{\Theta } = (0,1) \times (0,1) \times (0,\infty )\). Let \(\{ \varepsilon (\varvec{s}_i) : i \in I\!\!N \}\) be independent and identically distributed random variables with existing moment \({{\,\mathrm{\mathbb {E}}\,}}(\log (\vert \varepsilon (\varvec{s}_1)\vert )^2 )\). Let \(\gamma \) be a non-negative measurable function on \([0, \infty )\) with \({{\,\mathrm{Var}\,}}(\gamma (Y(\varvec{s}_1)^2)) < \infty \). Suppose that \(\{ Y(s_i) \}\) is strictly stationary. Moreover, let \(\mathbf {W}_1\) and \(\mathbf {W}_2\) be row-standardized.

  (a) If there is an open neighbourhood N of \(\varvec{\vartheta }_0\) such that for all \(\varvec{\vartheta } = (\rho , \lambda , \alpha ) \in N\) it holds with \(\varvec{\Delta } = \left( \gamma ( Y(\varvec{s}_j)^2 ) - {{\,\mathrm{\mathbb {E}}\,}}( \gamma ( Y(\varvec{s}_j)^2 ) ) \right) _{j=1,\ldots ,n}\) that

    $$\begin{aligned} \frac{1}{n} \; \varvec{\Delta }^\prime \mathbf {W}_1^{* \prime } (\mathbf {I} - \lambda \mathbf {W}_2^{* \prime })^{-1} (\mathbf {I} - \lambda \mathbf {W}_2^*)^{-1} \mathbf {W}_1^* \varvec{\Delta } {\mathop {\rightarrow }\limits ^{p}} 0 \end{aligned}$$
    (9)

    as n tends to infinity then the assumption (8) is fulfilled.

  (b) If

    $$\begin{aligned} \frac{1}{n} \varvec{1}^\prime (\mathbf {I} - \lambda \mathbf {W}_2^*)^{-1} \mathbf {W}_1^* {{\,\mathrm{Cov}\,}}( \varvec{\Delta } ) \mathbf {W}_1^{* \prime } (\mathbf {I} - \lambda \mathbf {W}_2^{* \prime })^{-1} \varvec{1} \rightarrow 0 \end{aligned}$$
    (10)

    as n tends to infinity then the condition (9) is fulfilled.

Note that the assumption (9) is a statement about the topological structure of the underlying space. Moreover, (10) shows that it can be interpreted as an assumption on the underlying autocorrelation structure of the process. It is fulfilled if \(\mathbf {W}_1^*\) and \(\mathbf {W}_2^*\) are sparse to limit the spatial dependence to a manageable degree. Here the choice of the weight matrices is restricted. It is also satisfied if the autocorrelation is weak.

4 Computational implementation and simulation studies

In this section, we assume the simple parametric setting given by (5). We simulated a spatial GARCH process as specified in Example 2, with the weighting matrices \(\mathbf {W}^*_1\) and \(\mathbf {W}^*_2\) set as row-standardized Rook contiguity matrices, for which the upper triangular elements were set to zero to avoid negative values of \(h(\varvec{s}_i)\). Thus, the conditions of Theorem 1 are fulfilled. In practice, such processes are relevant for modelling directional processes, for instance. It is worth mentioning at this point that the condition for invertibility of the spatial GARCH process, i.e., for representing \(Y(\varvec{s})\) as a closed-form function of all \(\varepsilon (\varvec{s})\) such that the solutions \(Y(\varvec{s})\) are real-valued, is restrictive. This condition is, however, only sufficient. For estimation, a function for the residual process \(\varepsilon (\varvec{s})\) given all (real-valued) observations \(y(\varvec{s})\) is needed, which is easier to obtain for spatial GARCH models; therefore, we can relax the assumption of a directional process.
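The mapping from observations to residuals mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the spGARCH recursion \(\varvec{h} = \alpha \varvec{1} + \rho \mathbf {W}_1^* \varvec{y}^{(2)} + \lambda \mathbf {W}_2^* \varvec{h}\) with \(\varepsilon (\varvec{s}_i) = y(\varvec{s}_i) / \sqrt{h(\varvec{s}_i)}\), and the function names `rook_weights` and `residuals_from_observations` are hypothetical and not part of the R package spGARCH. Note that the weight matrix used here is a full (non-triangular) Rook matrix.

```python
import numpy as np


def rook_weights(d):
    """Row-standardized Rook contiguity matrix on a d x d grid (row-major order)."""
    n = d * d
    W = np.zeros((n, n))
    for r in range(d):
        for c in range(d):
            i = r * d + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < d and 0 <= cc < d:
                    W[i, rr * d + cc] = 1.0
    # every cell has at least two neighbours for d >= 2, so row sums are positive
    return W / W.sum(axis=1, keepdims=True)


def residuals_from_observations(y, W1, W2, rho, lam, alpha):
    """Assumed recursion: h = alpha*1 + rho*W1 y^2 + lam*W2 h, eps = y / sqrt(h)."""
    n = y.size
    rhs = alpha * np.ones(n) + rho * (W1 @ y**2)
    # solve (I - lam*W2) h = rhs instead of forming the inverse explicitly
    h = np.linalg.solve(np.eye(n) - lam * W2, rhs)
    return h, y / np.sqrt(h)


# illustrative inputs only: arbitrary observations on a 4 x 4 grid
rng = np.random.default_rng(1)
W = rook_weights(4)
y = rng.normal(size=16)
h, eps = residuals_from_observations(y, W, W, rho=0.3, lam=0.4, alpha=1.0)
```

Because the weights are non-negative and row-standardized and \(\lambda < 1\), the solved \(\varvec{h}\) stays at or above \(\alpha \) in this sketch, so the square root is well defined for any real-valued observations.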

The simulation study is performed on a \(d \times d\) spatial unit grid (i.e., \(D_{\varvec{s}} = \{\varvec{s} = (s_1, s_2)' \in \mathbb {Z}^2 : 1 \le s_1, s_2 \le d \}\)), resulting in \(n = d^2\) observations, with \(m = 10000\) replications. The size of this spatial field has been successively increased with \(d \in \{5, 10, 15\}\). Moreover, we have considered different settings depending on the data-generating parameters \(\varvec{\vartheta }_0 = (\rho _0, \lambda _0, \alpha _0)'\). For all settings, the errors were independently drawn from a standard normal distribution and \(\alpha _0\) equals 1 (i.e., the spatially constant term), whereas \(\rho _0\) and \(\lambda _0\) varied across the settings. To be precise, \(\rho _0 \in \{0.2, 0.4, 0.7\}\) and \(\lambda _0 \in \{0.2, 0.4, 0.7\}\) to have settings with a weak, moderate and strong dependence in the conditional spatial heteroscedasticity.
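One replication of such a simulation could look as follows. This is a minimal sketch and not the code used for the study: it assumes the recursion \(h(\varvec{s}_i) = \alpha + \rho \sum _j w_{1,ij} Y(\varvec{s}_j)^2 + \lambda \sum _j w_{2,ij} h(\varvec{s}_j)\), and it reads the description of the weights literally (row-standardize the Rook matrix first, then set the upper triangle to zero). The function names are hypothetical.

```python
import numpy as np


def rook_weights(d):
    """Row-standardized Rook contiguity matrix on a d x d grid (row-major order)."""
    n = d * d
    W = np.zeros((n, n))
    for r in range(d):
        for c in range(d):
            i = r * d + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < d and 0 <= cc < d:
                    W[i, rr * d + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)


def simulate_spgarch(W1, W2, rho, lam, alpha, rng):
    """Simulate Y = sqrt(h) * eps recursively; requires strictly lower-triangular weights."""
    n = W1.shape[0]
    eps = rng.standard_normal(n)
    h = np.empty(n)
    y = np.empty(n)
    for i in range(n):
        # only locations j < i enter, so h[i] depends on already simulated values
        h[i] = alpha + rho * W1[i, :i] @ y[:i] ** 2 + lam * W2[i, :i] @ h[:i]
        y[i] = np.sqrt(h[i]) * eps[i]
    return y, h, eps


d, rho0, lam0, alpha0 = 5, 0.4, 0.4, 1.0
W = np.tril(rook_weights(d), k=-1)  # assumption: zero the upper triangle after standardizing
rng = np.random.default_rng(0)
y, h, eps = simulate_spgarch(W, W, rho0, lam0, alpha0, rng)
```

With triangular weights, every \(h(\varvec{s}_i)\) is a sum of non-negative terms plus \(\alpha \), which is why negative values cannot occur in this setting.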

For all three parameters \(\rho \), \(\lambda \), and \(\alpha \), where \(\rho _0 + \lambda _0 < 1\), the average bias and RMSE are shown in Tables 2 and 3. We see that the RMSE decreases with an increasing number of observations in all cases, while the absolute bias decreases in almost all cases. The exceptions are observed for cases where the bias is small. Moreover, the non-linear least-squares approach can be implemented efficiently and runs fast on standard computers, even if the number of observations is large. On average, the computing time for estimating the parameters ranges from 3.4 to 3.8 s for 25 observations, 3.2–4.8 s for 100 observations, and 5.5–8.5 s for 225 observations on a standard notebook. This method is implemented in the R package spGARCH (version \(> 2.0\), see also Otto 2019).
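For reference, the two performance measures can be computed from the replications as in the following sketch. The numbers are made up purely for illustration and are not results of the simulation study; `bias_and_rmse` is a hypothetical helper name.

```python
import numpy as np


def bias_and_rmse(estimates, true_value):
    """Column-wise average bias and root mean squared error over replications."""
    errors = np.asarray(estimates, dtype=float) - np.asarray(true_value, dtype=float)
    bias = errors.mean(axis=0)
    rmse = np.sqrt((errors**2).mean(axis=0))
    return bias, rmse


# hypothetical estimates (rho, lambda, alpha) from m = 4 replications
estimates = np.array([
    [0.1, 0.4, 0.8],
    [0.3, 0.4, 1.2],
    [0.3, 0.4, 0.8],
    [0.3, 0.4, 1.2],
])
theta0 = np.array([0.2, 0.4, 1.0])
bias, rmse = bias_and_rmse(estimates, theta0)
```

The third column shows why both measures are reported: its errors cancel on average, so the bias is zero, while the RMSE still reflects the spread of the estimates.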

Table 2 Bias of the estimates \(\hat{\rho },\hat{\lambda },\hat{\alpha }\) for different settings with \(\alpha _0 = 1\)
Table 3 RMSE of the estimates \(\hat{\rho },\hat{\lambda },\hat{\alpha }\) for different settings with \(\alpha _0 = 1\)

5 Real-world application: condominium prices in Berlin

In markets that are constrained in space, one can typically expect to find locally varying risks. Typical examples of such markets are real estate and labour. For the former, the property prices are highly dependent upon the location of the real estate and prices in the surrounding areas. Similarly, for the latter, this market is also often constrained in space due to the limited mobility of labourers.

On the one hand, we observe conditional mean levels that vary in space, so-called spatial clusters. That is, both clustered areas of higher prices and lower prices can be observed. On the other hand, we may also expect to find locally varying price risks, which can be considered as local volatility clusters. The proposed spatial GARCH-type models are capable of capturing such spatial dependencies in the conditional variance. This motivates why we consider condominium prices (average prices per square metre) at a fine spatial scale. In particular, we will analyze the relative changes from 1995 to 2014 across all Berlin postcode regions (i.e., \(n = 190\)), more precisely, the logarithmic returns over the entire period, to obtain a purely spatial data set of the long-term changes. The data are depicted in Fig. 1. The sample mean of these price changes is 0.8103 with a median of 0.6965. In total, the price changes range from −2.5650 to 7.3131. We can observe a spatial cluster of positive values in the north-western postcode regions. To model this spatial dependence, we consider first-order contiguity matrices \(\mathbf {W}^*_1\) and \(\mathbf {W}^*_2\) giving equal weights to each directly neighbouring region. This choice appears to be the best with respect to the Akaike and Bayesian information criteria. For estimation, we do not need further restrictions on the error process or the weight matrix, such as the triangular weighting scheme in the simulation study. There is a well-defined mapping from the observed process \(\{Y(\varvec{s})\}\) to the errors \(\{\varepsilon (\varvec{s})\}\), and the restrictions were only needed for the inverse relation (i.e., the mapping from \(\{\varepsilon (\varvec{s})\}\) to \(\{Y(\varvec{s})\}\)).

Fig. 1 Long-term logarithmic returns of the condominium prices for all Berlin postcode regions

However, this fine spatial scale of all postcode regions causes another problem: exogenous regressors are often not available or cannot be assigned at the given small-scale resolution. For instance, the average household income could play an important role in the increase of the condominium prices, but the place of living (in terms of postal code areas) does not usually coincide with the place of work. There is no reliable way to associate quantities like personal or household income with postcode areas. Moreover, the use of local infrastructure such as schools, leisure facilities, and parks is not limited to the residents of the respective postal code area. Thus, modelling an empirical process on such a small spatial scale is typically prone to heteroscedasticity induced by latent variables.

To illustrate these effects, we applied the developed spatial GARCH model to the residuals of a spatial autoregressive model, briefly SAR (see, e.g. Halleck Vega and Elhorst 2015; Lee 2004), with and without exogenous regressors. More precisely, we select the regressors from a set of potential covariates available for different spatial scales, including the number of crimes (in so-called life-world oriented spaces, LOR, 161 units), number of schools and kindergartens (postcode-area level), percentage of migrants (LOR level), number of inhabitants (LOR level), size of areas used for infrastructure, living, water, vegetation (district level, 12 units), and average net income per household (district level). The included regressors were chosen such that the Akaike information criterion is minimal. The results of these two models are shown in Table 4 along with the spGARCH coefficients of the error process. All covariates were standardized. Regarding the residuals of these two mean models with and without regressors (before fitting a spGARCH model to the residuals), we observe that neither of them is autocorrelated in space (Moran’s \(I = -0.0375\) with a p-value of 0.7811 for the intercept-only model, and \(I = -0.0124\) with \(p = 0.5676\) for the model including covariate effects). That is, the spatial correlation of the original data (\(I = 0.5104\)) could be fully captured by the mean models. However, looking at the absolute values of the residuals, we observe that there is significant autocorrelation for both models (\(I = 0.1027\) (\(p = 0.0042\)) and \(I = 0.0808\) (\(p = 0.0181\)) for the intercept-only model and regressive model, respectively).
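The diagnostic used here, comparing Moran's I of the residuals with Moran's I of their absolute values, can be sketched as follows. This is an illustration on a toy chain of six regions with made-up residuals, not the Berlin data. The p-value is computed by random permutations, which is an assumption of this sketch; the p-values reported above may have been obtained differently.

```python
import numpy as np


def morans_i(x, W):
    """Moran's I = (n / S0) * (z' W z) / (z' z), with z the centred values."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (x.size / W.sum()) * (z @ W @ z) / (z @ z)


def permutation_pvalue(x, W, n_perm=999, seed=0):
    """One-sided permutation p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, W)
    perms = np.array([morans_i(rng.permutation(x), W) for _ in range(n_perm)])
    return observed, (1 + np.sum(perms >= observed)) / (n_perm + 1)


# toy example: six regions on a line, each adjacent to its direct neighbours
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

# hypothetical residuals: alternating signs, but large magnitudes cluster on the left
resid = np.array([3.0, -3.0, 3.0, -0.1, 0.1, -0.1])
i_resid = morans_i(resid, W)
i_abs, p_abs = permutation_pvalue(np.abs(resid), W)
```

In this toy case the signed residuals show no positive autocorrelation, whereas their absolute values do, which is the pattern that motivates fitting a spatial GARCH-type model to the residuals.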

In the final regressive model, only four regressors were selected: a proxy for the available free space (i.e., the proportion of settlement area to total area), the net household income in each district, and linear trends in the east-west and north-south directions (i.e., the coordinates of the centroids of each postal code unit). While a significant negative trend can be observed from west to east, the increase in the north-south direction seems to be of minor importance. Furthermore, the average household income clearly influences the price development of condominiums in Berlin. Prices of condominiums in high-income districts have increased relatively less than those in lower-income areas.

It is worth noting that the dependence in the conditional heteroscedasticity could partly be covered by these covariates and the spatial autocorrelation in the absolute residuals was reduced. Nevertheless, latent effects are present which were not modelled by including these regressors. This can also be seen in Fig. 2, where the residuals of the SAR model are displayed in a diverging colour scheme (blue areas indicate negative residuals, red areas are positive residuals). Even though there is no spatial dependence in the residuals (red and blue areas are irregularly located across space without clustering), we observe a certain dependence in the absolute residuals, because darker colours are clustered together while in other areas the residuals are close to zero indicated by yellow-coloured regions. Thus, an spGARCH model has been fitted in a second step to the residuals of both models. The obtained spGARCH parameters can be interpreted as local model uncertainties and simultaneously cover latent variables, which could not be included due to the fine spatial scale of postcode-area levels.

Table 4 Estimated parameters of the spGARCH model for the residuals of a spatial autoregressive model with and without regressors, where the dependent variables are the changes in the condominium prices in Berlin
Fig. 2 Residuals of the SAR model

In both cases, we observe significant positive dependence in the conditional second moments. More precisely, the GARCH effects amount to \(\hat{\rho } = 0.2136\) and \(\hat{\rho } = 0.2022\) with \(\hat{\lambda }\) roughly equal to 0.70 for the intercept-only and regression model, respectively. These parameters can be interpreted similarly to the time-series case, although \(\varvec{h}\) does not necessarily coincide with the conditional second moments (see Otto et al. 2021). Furthermore, \(\hat{\alpha }\) does not significantly differ from zero. It is worth noting that, according to (4), \(\varvec{h}\) is obtained by multiplying \(( \mathbf {I} - \mathbf {W}_1 \mathbf {E} - \mathbf {W}_2)^{-1}\) with \(\varvec{\alpha }\), such that a small value of \(\varvec{\alpha }\) does not necessarily imply small values of \(\varvec{h}\). Analyzing the residuals of this combined model shows that the remaining dependence in the heteroscedasticity could be explained. The squared residuals are no longer significantly autocorrelated in space.

The resulting conditional variance is visualized for the model with regressors in Fig. 3. The highest values of \(h(\varvec{s}_i)\) can be observed for the outer regions in the north-west, and another cluster is located in the southern city centre. This indicates that the highest uncertainty in the price changes is observed for these regions, while there is a band around the city centre where the price changes could be predicted more accurately by the regression model (i.e., the estimated values of \(h(\varvec{s}_i)\) are lower). These results seem very reasonable because the real-estate market changed the most in these areas. First, due to increasing prices in the centre, new land for building has been created outside the city, mostly along the regional transport tracks, which mainly run in an east-west direction. Second, the major airport of West Berlin, namely the airport Berlin-Tempelhof, was closed in 2008, which changed the atmosphere in this region: some areas changed from regions in the flight paths to calm regions very close to the city centre, while others were not affected. This explains the second cluster in the centre.

Fig. 3 Spatial equivalent of the conditional variance, \(\hat{\varvec{h}}(\varvec{s})\)

6 Discussion and conclusions

Recently, a few papers have introduced spatial ARCH and GARCH-type models that allow the modelling of an instantaneous spatial autoregressive dependence of heteroscedasticity. In this paper, we propose a generalized spatial ARCH model that additionally covers all previous approaches. Due to the flexible definition of the model as a set of functions, we can derive a common estimation strategy for all these spatial GARCH-type models. It is based on non-linear least squares.

In the second part of the paper, we confirmed our theoretical findings on the consistency of the estimators by means of Monte Carlo simulation studies. The estimation method is computationally implemented in the R package spGARCH. Finally, the use of the model was demonstrated through an empirical example. More precisely, this paper has shown how the model uncertainties of local price changes in the real estate market in Berlin can be described using an spGARCH model as the residual process. Though all proposed models are uncorrelated and have a zero mean, potential interactions between the error process and the mean equation should be analyzed in greater detail in future research.

In addition, we want to stress that the dependence structure does not necessarily have to be interpreted in a spatial sense. Thus, we briefly discuss a further example below, in which the “spatial” proximity is defined by the edges of a network. In such cases, \(\mathbf {W}_1\) and \(\mathbf {W}_2\) would be interpreted as adjacency matrices. For instance, one might consider the financial returns of several stocks as a network, where the only assets that are connected are those that are correlated above a certain threshold. Choosing the threshold equal to 0.5, a financial network as displayed in Fig. 4 can be created. Thus, spGARCH models can be used to analyze various forms of information, whether that might be volatility, risk, or spill-overs from one stock to another, if these assets are close to one another within a certain network. In future research, the modelling of volatility clusters within networks using spatial GARCH models should be analyzed in greater detail. Moreover, a temporal dimension can be added. The main difference between the spatiotemporal GARCH in our framework and a multivariate vec-GARCH model is that we allow for instantaneous spatial/network interactions, while multivariate GARCH models include only temporally lagged cross-variable (i.e., spatial) interactions. Alternatively, in financial applications, the spatial locations could be considered to be unknown. Santi et al. (2021a, 2021b) considered the case of unknown or incompletely known spatial locations for autoregressive models. They propose to approximate the geographical space by another space spanned by certain covariates, which seems to be a promising approach also for spatial GARCH models and financial applications.
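The construction of such a network-based weight matrix can be sketched as follows. The returns are synthetic and serve only as an illustration. The function name `correlation_network`, the use of the plain (not absolute) sample correlation, and the row-standardization step are assumptions of this sketch, not a description of how Fig. 4 was produced.

```python
import numpy as np


def correlation_network(returns, threshold=0.5):
    """Adjacency matrix linking assets whose sample correlation exceeds the threshold."""
    corr = np.corrcoef(returns, rowvar=False)  # columns are assets
    A = (corr > threshold).astype(float)
    np.fill_diagonal(A, 0.0)  # no self-loops, so the diagonal of the weights is zero
    row_sums = A.sum(axis=1, keepdims=True)
    # row-standardize; isolated assets keep a zero row
    W = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)
    return A, W


# synthetic returns: assets 0 and 1 share a common factor, asset 2 is independent
rng = np.random.default_rng(42)
T = 500
factor = rng.standard_normal(T)
returns = np.column_stack([
    factor + 0.5 * rng.standard_normal(T),
    factor + 0.5 * rng.standard_normal(T),
    rng.standard_normal(T),
])
A, W = correlation_network(returns, threshold=0.5)
```

The resulting matrix W could then take the role of \(\mathbf {W}_1\) and \(\mathbf {W}_2\); an asset without any connection has no neighbours and therefore contributes only the constant term.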

Fig. 4 Financial network of selected stocks of the S&P 500, where the colour of the nodes denotes the annual returns in 2017 with darker colours indicating higher returns

Up to now, we have assumed that suitable functions of the spGARCH model framework are known. Hence, it is possible to maximize certain goodness-of-fit criteria in order to obtain the best-fitting model. However, these functions can also be estimated using a non-parametric approach; for instance by penalized or classical B-splines. Besides, further choices of f have not been discussed in this paper yet, including choices of f to obtain E-spGARCH or logarithmic spGARCH models. Also, multivariate models remain open for future research. This will be the subject of some forthcoming papers.