Introduction

Earth’s crust deforms in response to the relative motions of tectonic plates, and the deformation is particularly significant in plate convergence zones. Accumulation of tectonic stress causes earthquakes, and large-scale crustal deformation leads to tectonic landforms. Precise knowledge of crustal deformation therefore plays a fundamental role in understanding the tectonics and dynamics of a region. Present-day crustal motions can be accurately measured with space geodetic techniques. In particular, networks of permanent GNSS stations are operated worldwide, and crustal velocities at the stations can be obtained very precisely, which contributes to monitoring ongoing crustal deformation (e.g., Kreemer et al. 2014; Nishimura et al. 2014; Murray et al. 2020). Velocity data generally depend on a reference station or a reference frame and include rigid rotations. Therefore, strain rates, obtained by differentiating velocity fields with respect to space, are more suitable for interpreting internal deformation of Earth’s crust.

Estimation of a strain-rate field from spatially discrete velocity data is a longstanding problem in quantifying crustal deformation, and many methods were proposed even before space geodetic measurements became available. Classical methods divide a region into a triangulated network to estimate a mean strain rate within each cell using triangulation survey data (Frank 1966; Prescott 1976) as well as trilateration survey and GNSS data (Feigl et al. 1990). In this approach, however, the estimated strains are discontinuous at cell boundaries and depend on how the region is partitioned. Later, continuous interpolation methods were developed (e.g., Haines and Holt 1993; Shen et al. 1996). These methods impose a certain degree of smoothness on strain-rate fields to stabilize the estimation from discrete velocity data without knowledge of major faults or block motions. Such methods are still being developed with advanced mathematical tools. For example, Tape et al. (2009) applied spherical wavelets to estimate velocity and strain-rate fields, which enables the separation of crustal deformation into different length scales. Sandwell and Wessel (2016) developed an interpolation method for discrete 2-D vector data on the basis of Green’s functions of an elastic body, which ensures the coupling of the two components.

In this direction, Shen et al. (1996) proposed a modified least-squares inversion method to investigate regional crustal deformation in the seismically active Los Angeles basin. Assuming local uniformity of a strain-rate field, they simultaneously estimated velocity and strain-rate fields through a bilinear fitting in which data are weighted according to their distance from an estimation point. The distance dependence of the weights is controlled by a hyperparameter called the distance decaying constant (DDC). This method is easy to understand and implement, and has been widely applied to clarify characteristics of crustal deformation fields. For example, applying the method to GNSS data in Japan, Sagiya et al. (2000) found a high strain-rate region named the Niigata–Kobe tectonic zone (NKTZ) that passes through central Japan. Furthermore, through the analysis of GNSS data in this region before and after the 2011 Tohoku-oki earthquake, Meneses-Gutierrez and Sagiya (2016) succeeded in separating elastic strain and inelastic strain; Fukahata et al. (2020) further succeeded in separating the inelastic strain into plastic and viscous strains. Nishimura et al. (2018) clarified strain partitioning between inland active faults and the interplate coupling in southwest Japan. Outside Japan, the method has also been used to reveal crustal deformation related to regional tectonics and seismic activity, for example, in Taiwan (Lin et al. 2010), mainland China (Wang and Shen 2020), Spain (Stich et al. 2006), Italy (Devoti et al. 2011), and Greece (Chousianitis et al. 2015). The strain-rate field obtained by the method has also been used for long-term earthquake prediction (Shen et al. 2007). Improvements to the method continue to be proposed (Shen et al. 2015), as described in the next section. In brief, the method of Shen et al. (1996) has made a great contribution to geophysics.

However, we note three theoretical points that need to be examined when the method is applied. First, the simultaneously estimated velocity and strain-rate fields are mathematically inconsistent: the estimated strain-rate field cannot be obtained by differentiating the estimated velocity field, as shown in the Appendix. Secondly, there is no criterion to objectively determine the optimal value of the DDC, even though this value greatly affects the estimation result. Thirdly, as pointed out by Shen et al. (2015), it is difficult to properly estimate the uncertainties of the velocity and strain-rate fields.

As an alternative, in this study, we propose a method of basis function expansion to estimate a velocity field from spatially discrete geodetic data, in which a velocity field is expressed by a linear combination of basis functions. The coefficients of the basis functions, which determine the weights of the respective basis functions, are obtained through an inversion analysis. Once we obtain a velocity field, the associated strain-rate field can be analytically calculated by spatially differentiating the velocity field.

Techniques of basis function expansion have been broadly used in waveform inversion to estimate seismic source processes (e.g., Olson and Aspel 1982; Hartzell and Heaton 1983; Ide et al. 1996; Yagi et al. 2004). In these studies, however, boxcar functions have commonly been used as basis functions, which cannot be differentiated at cell boundaries. Yabuki and Matsu’ura (1992) introduced bicubic B-spline functions as basis functions to estimate coseismic slip distribution from geodetic data. Cubic B-splines are continuous up to the second derivative. Their method has been widely used to estimate not only coseismic slip distribution (e.g., Fukahata and Wright 2008; Funning et al. 2014), but also interseismic slip distribution (e.g., Yoshioka et al. 1993; Sagiya 1999; Fukahata et al. 2004). Fukahata et al. (1996) applied the method to reconstruct temporal variation of vertical crustal motions from levelling data. In this study, we essentially follow the formulation of Yabuki and Matsu'ura (1992) and Fukahata et al. (1996) to estimate a velocity field from GNSS data. In this approach, as mentioned above, the strain-rate field is obtained by spatially differentiating the estimated velocity field.

Yabuki and Matsu'ura (1992) used a framework of Bayesian inversion in which a smoothness constraint was imposed as prior information, and its importance relative to the observed data was objectively determined from the data based on Akaike’s Bayesian information criterion (ABIC; Akaike 1980). The statistically rigorous framework of Yabuki and Matsu'ura (1992) also enables us to evaluate estimation errors appropriately. Therefore, the three above-mentioned points concerning the method of Shen et al. (1996, 2015) are all overcome in the method of this study. Luo et al. (2016) also used ABIC to reconstruct a surface deformation field from InSAR and GNSS data, but they used boxcars as basis functions, so their method was not suitable for estimating strain-rate fields.

In the following, we first explain the method of Shen et al. (1996, 2015) and the basis function expansion with ABIC for estimating a strain-rate field. Next, we apply these methods to GNSS data in Japan, compare the results, and discuss the characteristics of the methods. Finally, we examine the strain-rate field in Japan based on the results obtained by the method of basis function expansion.

Methods to estimate a strain-rate field from spatially discrete geodetic data

Method of Shen et al.

Shen et al. (1996) proposed a method to estimate a velocity field and a strain-rate field simultaneously from spatially discrete velocity data. In their method, assuming local uniformity of a strain-rate field, horizontal velocity components \(\left( {u,v} \right)\), strain rates \(\left( {e_{xx} ,e_{xy} ,e_{yy} } \right) = \left( {\partial_{x} u,\left( {\partial_{x} v + \partial_{y} u} \right)/2,\partial_{y} v} \right)\) and a rotation rate \(\omega = \left( {\partial_{x} v - \partial_{y} u} \right)/2\) at an arbitrary point \(\left( {x,y} \right)\) are related to the observed velocity data \(\left( {u_{i} ,v_{i} } \right)\) at the \(i\)th station, located at \(\left( {x_{i} ,y_{i} } \right)\), by the following relation:

$$ \left( {\begin{array}{*{20}c} {u_{i} } \\ {v_{i} } \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {\begin{array}{*{20}c} 1 & 0 \\ 0 & 1 \\ \end{array} } & {\begin{array}{*{20}c} {\Delta x_{i} } & {\Delta y_{i} } \\ 0 & {\Delta x_{i} } \\ \end{array} } & {\begin{array}{*{20}c} 0 & { - \Delta y_{i} } \\ {\Delta y_{i} } & {\Delta x_{i} } \\ \end{array} } \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {\begin{array}{*{20}c} u \\ v \\ \end{array} } \\ {\begin{array}{*{20}c} {e_{xx} } \\ {e_{xy} } \\ \end{array} } \\ {\begin{array}{*{20}c} {e_{yy} } \\ \omega \\ \end{array} } \\ \end{array} } \right) + \left( {\begin{array}{*{20}c} {\delta u_{i} } \\ {\delta v_{i} } \\ \end{array} } \right), $$
(1)

where \((\Delta x_{i} ,\Delta y_{i} ) = \left( {x_{i} - x,y_{i} - y} \right)\) is the position of the observation station relative to the estimation point. \(\delta u_{i}\) and \(\delta v_{i}\) are called observation errors in Sagiya et al. (2000), but it would be more appropriate to regard them as fitting errors. Note that, following the convention (e.g., Segall 2010), anticlockwise rotation is taken to be positive for the rotation rate in Eq. (1), although clockwise rotation is taken to be positive in the original formulation of Shen et al. (1996). Using the definitions of the strain rate and rotation rate given above, Eq. (1) can be decoupled into

$$ \left\{ {\begin{array}{*{20}c} {u_{i} = \left( {\begin{array}{*{20}c} 1 & {\Delta x_{i} } & {\Delta y_{i} } \\ \end{array} } \right)\left( {\begin{array}{*{20}c} u \\ {\partial_{x} u} \\ {\partial_{y} u} \\ \end{array} } \right) + \delta u_{i} } \\ {v_{i} = \left( {\begin{array}{*{20}c} 1 & {\Delta x_{i} } & {\Delta y_{i} } \\ \end{array} } \right)\left( {\begin{array}{*{20}c} v \\ {\partial_{x} v} \\ {\partial_{y} v} \\ \end{array} } \right) + \delta v_{i} } \\ \end{array} } \right.. $$
(2)
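
As a quick check of this decoupling (a worked step added here for illustration, using only the definitions of the strain rates and rotation rate given above), note that \(e_{xy} - \omega = \partial_{y} u\) and \(e_{xy} + \omega = \partial_{x} v\). Substituting these into the two rows of Eq. (1) gives

$$ \begin{aligned} u_{i} & = u + e_{xx} \Delta x_{i} + \left( {e_{xy} - \omega } \right)\Delta y_{i} + \delta u_{i} = u + \partial_{x} u\,\Delta x_{i} + \partial_{y} u\,\Delta y_{i} + \delta u_{i} , \\ v_{i} & = v + \left( {e_{xy} + \omega } \right)\Delta x_{i} + e_{yy} \Delta y_{i} + \delta v_{i} = v + \partial_{x} v\,\Delta x_{i} + \partial_{y} v\,\Delta y_{i} + \delta v_{i} , \\ \end{aligned} $$

which are the two rows of Eq. (2).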

Equation (2) corresponds to the Taylor expansion of the velocity field at the first order. Equation (2) indicates that, as long as the errors \(\left( {\delta u_{i} ,\delta v_{i} } \right)\) are mutually independent, the estimations of \(\left( {u,\partial_{x} u,\partial_{y} u} \right)\) and \(\left( {v,\partial_{x} v,\partial_{y} v} \right)\) are independent of each other. Therefore, we explain the method for one velocity component in the following.

We have velocity data \(v_{i}\) with observation-error variance \(\sigma_{i}^{2}\) at the positions \((x_{i} ,y_{i} )\) of \(N\) stations (\(i = 1, \ldots ,N\)). We consider a multiple regression model at each estimation point P \(\left( {x,y} \right):\)

$$ v_{i} = v + \partial_{x} v \cdot \Delta x_{i} + \partial_{y} v \cdot \Delta y_{i} + \delta v_{i} . $$
(3)

Model parameters to be estimated are velocity \(v\) and its spatial derivatives \(\partial_{x} v\) and \(\partial_{y} v\) at a position P \(\left( {x,y} \right)\). A key point in the estimation is that the fitting errors \(\delta v_{i}\) are weighted according to the distance between the observation point \((x_{i} ,y_{i} )\) and the estimation point P \(\left( {x,y} \right)\) as

$$ \delta v_{i} \sim {\mathcal{N}}\left( {0,\sigma_{i}^{2} \exp \frac{{\Delta x_{i}^{2} + \Delta y_{i}^{2} }}{{D^{2} }}} \right), $$
(4)

where the notation \({\mathcal{N}}\left( {m,\sigma^{2} } \right)\) represents the Gaussian distribution with mean \(m\) and variance \(\sigma^{2}\). Equation (4) means that data at points remote from P contribute little to the estimation at P, because their assumed error variance grows rapidly with distance. \(D\) is the distance decaying constant (DDC), which is a hyperparameter in this analysis and controls the spatial range over which data are significant. Since the weight of each station varies continuously, this method leads to a smooth estimation of velocity and strain-rate fields.

The observation Eq. (3) is expressed in a matrix form as

$$ {\mathbf{d}} = {\mathbf{H}}_{{\text{P}}} {\mathbf{a}}_{{\text{P}}} + {\mathbf{e}}_{{\text{P}}} ,\;{\mathbf{e}}_{{\text{P}}} \sim {\mathcal{N}}\left( {{\mathbf{0}},{\mathbf{E}}_{{\text{P}}} } \right), $$
(5)

with

$$ {\mathbf{d}} = \left( {\begin{array}{*{20}c} {v_{1} } \\ \vdots \\ {v_{N} } \\ \end{array} } \right),\;{\mathbf{a}}_{{\text{P}}} = \left( {\begin{array}{*{20}c} v \\ {\partial_{x} v} \\ {\partial_{y} v} \\ \end{array} } \right),\;{\mathbf{H}}_{{\text{P}}} = \left( {\begin{array}{*{20}c} 1 & {\Delta x_{1} } & {\Delta y_{1} } \\ \vdots & \vdots & \vdots \\ 1 & {\Delta x_{N} } & {\Delta y_{N} } \\ \end{array} } \right),\;{\mathbf{E}}_{{\text{P}}} = {\text{diag}}\left( {\sigma_{i}^{2} \exp \frac{{\Delta x_{i}^{2} + \Delta y_{i}^{2} }}{{D^{2} }}} \right), $$
(6)

where “diag” represents a diagonal matrix. We attach the subscript P to note that these quantities depend on an estimation point P. Since \({\mathbf{e}}_{{\text{P}}}\) follows the Gaussian distribution, the relation of Eq. (5) can be expressed in the form of probability distribution:

$$ p_{{\text{P}}} \left( {{\mathbf{d}}{|}{\mathbf{a}}_{{\text{P}}} ;D} \right) = \left( {2\pi } \right)^{ - N/2} \left| {{\mathbf{E}}_{{\text{P}}} } \right|^{ - 1/2} \exp \left[ { - \frac{1}{2}\left( {{\mathbf{d}} - {\mathbf{H}}_{{\text{P}}} {\mathbf{a}}_{{\text{P}}} } \right)^{{\text{T}}} {\mathbf{E}}_{{\text{P}}}^{ - 1} \left( {{\mathbf{d}} - {\mathbf{H}}_{{\text{P}}} {\mathbf{a}}_{{\text{P}}} } \right)} \right], $$
(7)

which can be regarded as the likelihood function for the model parameter \({\mathbf{a}}_{{\text{P}}}\). By maximizing the likelihood for given data \({\mathbf{d}}\) and the hyperparameter \(D\), we obtain the optimal values of \({\mathbf{a}}_{{\text{P}}}\) with its uncertainties as

$$ {\hat{\mathbf{a}}}_{{\text{P}}} = \left( {{\mathbf{H}}_{{\text{P}}}^{{\text{T}}} {\mathbf{E}}_{{\text{P}}}^{ - 1} {\mathbf{H}}_{{\text{P}}} } \right)^{ - 1} {\mathbf{H}}_{{\text{P}}}^{{\text{T}}} {\mathbf{E}}_{{\text{P}}}^{ - 1} {\mathbf{d}},\;{\text{cov}} \left[ {{\mathbf{a}}_{{\text{P}}} } \right] = \left( {{\mathbf{H}}_{{\text{P}}}^{{\text{T}}} {\mathbf{E}}_{{\text{P}}}^{ - 1} {\mathbf{H}}_{{\text{P}}} } \right)^{ - 1} . $$
(8)
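
The local fit of Eqs. (3)–(8) can be sketched in a few lines of NumPy. This is a minimal illustration for one velocity component, not the code used in this study: the function name `shen_estimate`, the synthetic station layout, and the numerical values are all assumptions made for the example.

```python
import numpy as np

def shen_estimate(x, y, xs, ys, vs, sigmas, D):
    """Weighted least-squares fit of Eq. (3) at the estimation point (x, y).

    Returns the estimate (v, dv/dx, dv/dy) and its covariance, as in Eq. (8).
    """
    dx = xs - x
    dy = ys - y
    H = np.column_stack([np.ones_like(dx), dx, dy])   # H_P in Eq. (6)
    # Diagonal of the inverse of E_P in Eq. (6): the weight decays with distance.
    w = np.exp(-(dx**2 + dy**2) / D**2) / sigmas**2
    A = H.T @ (w[:, None] * H)                        # H^T E^-1 H
    b = H.T @ (w * vs)                                # H^T E^-1 d
    cov = np.linalg.inv(A)
    return cov @ b, cov

# Hypothetical synthetic data: an exactly linear velocity field.
rng = np.random.default_rng(0)
xs = rng.uniform(-50.0, 50.0, 30)     # station coordinates (km)
ys = rng.uniform(-50.0, 50.0, 30)
vs = 2.0 + 0.03 * xs - 0.01 * ys      # velocity (mm/yr)
sigmas = np.full(30, 0.5)             # observation errors (mm/yr)

a_hat, cov = shen_estimate(5.0, -3.0, xs, ys, vs, sigmas, D=25.0)
```

For a field that is exactly linear, the fit recovers the velocity and its gradient at the estimation point regardless of \(D\); the choice of \(D\) matters only when the true field departs from local linearity or the data are noisy. Note also that `cov` depends only on the station geometry, \(\sigma_{i}\) and \(D\), not on the data values.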

Shen et al. (2015) investigated the model structure further: they applied a quadratic decay besides the Gaussian decay (Eq. 4), made compensation for uneven azimuthal distribution of data, and determined the value of \(D\) at each point by assigning the total weighting of data \(W\), which is defined as

$$ W = \sum \limits_{i} \exp \left( { - \frac{{\Delta x_{i}^{2} + \Delta y_{i}^{2} }}{{D^{2} }}} \right). $$
(9)
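
To illustrate how a prescribed total weighting \(W\) fixes \(D\) at a given point, the sketch below solves Eq. (9) for \(D\) by bisection, using the fact that \(W\) increases monotonically with \(D\). It covers only the Gaussian weighting written in Eq. (9); the azimuthal compensation of Shen et al. (2015) is not included. The function names, the bracketing interval, and the synthetic offsets are assumptions for this example.

```python
import numpy as np

def total_weight(D, dx, dy):
    """Total weighting W of Eq. (9) for a given decay constant D."""
    return np.sum(np.exp(-(dx**2 + dy**2) / D**2))

def solve_D(W_target, dx, dy, lo=1e-3, hi=1e4, tol=1e-6):
    """Bisection for D such that W(D) = W_target (W increases with D)."""
    if not total_weight(lo, dx, dy) < W_target < total_weight(hi, dx, dy):
        raise ValueError("W_target is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_weight(mid, dx, dy) < W_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical station offsets (km) relative to one estimation point.
rng = np.random.default_rng(1)
dx = rng.uniform(-100.0, 100.0, 50)
dy = rng.uniform(-100.0, 100.0, 50)
D_opt = solve_D(6.0, dx, dy)
```

With this scheme, \(D\) becomes larger where stations are sparse and smaller where they are dense, so that a comparable amount of data contributes at every estimation point.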

They selected an optimal scheme by mapping the difference of estimated strain-rate fields.

An advantage of the method of Shen et al. (1996, 2015) is that both velocity and strain rate are simultaneously computed at any point and that these fields vary continuously in space. In addition, the implementation is simple, so the method has been used by many researchers and has contributed to the advancement of geophysical research (e.g., Sagiya et al. 2000). As mentioned in “Introduction”, however, there are three points to be examined in this method. First, because the velocity \(v\) and its derivatives \(\partial_{x} v\) and \(\partial_{y} v\) are estimated at each point independently, the velocity and strain-rate fields are in general mutually inconsistent. That is to say, the obtained strain-rate field differs from the spatial derivative of the obtained velocity field except in special cases (see Appendix). Secondly, it is difficult to objectively determine the optimal value of the hyperparameter \(D\). This is because the regression model, Eq. (7), is constructed independently at each estimation point and we do not treat all (unweighted) data at once, which prevents us from determining the hyperparameter using a single objective function. Thirdly, as pointed out by Shen et al. (2015), the estimation of uncertainties is inappropriate. This is because the uncertainty (Eq. 8) depends only on the variance \(\sigma_{i}^{2}\) of the velocity data at each station (together with the station geometry and \(D\)), so it does not reflect the actual fit to the velocity data \(v_{i}\).

Basis function expansion with ABIC

We first formulate for one velocity component, and then describe how to deal with two components. We express the velocity field as a linear combination of a set of fixed basis functions \(\left\{ {\varPhi_{j} \left( {x,y} \right)} \right\}_{j = 1}^{M} :\)

$$ v\left( {x,y} \right) = \sum \limits_{j = 1}^{M} a_{j} \varPhi_{j} \left( {x,y} \right) = {{\varvec{\Phi}}}{\mathbf{a}}, $$
(10)

where \({\mathbf{a}}\) is a model parameter vector to be determined from observation data. As mentioned in “Introduction”, we use bicubic B-splines for basis functions \(\varPhi_{j}\). Once we obtain the velocity field \(v\), we can analytically calculate the strain-rate field using its spatial derivatives (e.g., \(\partial_{x} v\left( {x,y} \right) = \sum \limits_{j = 1}^{M} a_{j} \partial_{x} \varPhi_{j} \left( {x,y} \right) = {{\varvec{\Phi}}}_{x} {\mathbf{a}}\)).
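
The following sketch shows one way to evaluate bicubic B-spline basis functions and their \(x\)-derivatives at a set of points. It assumes uniform knots with spacing `h` and writes each basis function as a product of centred one-dimensional cubic B-splines; the function names, grid, and column ordering are choices made for this illustration rather than details taken from the formulation above.

```python
import numpy as np

def bspline3(t):
    """Uniform cubic B-spline centred at 0, support [-2, 2], knot spacing 1."""
    a = np.abs(np.asarray(t, dtype=float))
    return np.where(a < 1.0, 2.0 / 3.0 - a**2 + 0.5 * a**3,
                    np.where(a < 2.0, (2.0 - a)**3 / 6.0, 0.0))

def bspline3_d1(t):
    """First derivative of bspline3 with respect to t."""
    t = np.asarray(t, dtype=float)
    s, a = np.sign(t), np.abs(t)
    return s * np.where(a < 1.0, -2.0 * a + 1.5 * a**2,
                        np.where(a < 2.0, -0.5 * (2.0 - a)**2, 0.0))

def design_matrices(x, y, cx, cy, h):
    """Basis values Phi and x-derivatives Phi_x at points (x, y).

    Rows are points; column j = k * len(cy) + l holds X_k(x) * Y_l(y),
    where cx[k] and cy[l] are the centres of the 1-D splines.
    """
    tx = (x[:, None] - cx[None, :]) / h
    ty = (y[:, None] - cy[None, :]) / h
    X, Y = bspline3(tx), bspline3(ty)
    Xd = bspline3_d1(tx) / h                      # chain rule: d/dx = (1/h) d/dt
    Phi = (X[:, :, None] * Y[:, None, :]).reshape(len(x), -1)
    Phi_x = (Xd[:, :, None] * Y[:, None, :]).reshape(len(x), -1)
    return Phi, Phi_x

# Hypothetical 100 km x 100 km region with 20-km spacing; centres extend one
# node beyond the region so that every interior point is fully covered.
h = 20.0
cx = np.arange(-20.0, 121.0, h)
cy = np.arange(-20.0, 121.0, h)
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 100.0, 25)
y = rng.uniform(0.0, 100.0, 25)
Phi, Phi_x = design_matrices(x, y, cx, cy, h)

# Coefficients equal to the x-centres reproduce the linear field v = x.
a_lin = np.repeat(cx, len(cy))
```

Two properties are worth noting: the basis functions sum to one at every covered point, and setting the coefficients to the centre coordinates reproduces a linear field exactly, so that `Phi_x @ a_lin` is identically one. The derivative is obtained analytically from the same coefficients, which is what makes the velocity and strain-rate fields mutually consistent.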

In our problem to reconstruct a velocity field from spatially discrete velocity data, Eq. (10) corresponds to the observation equation, which is written in a matrix form as

$$ {\mathbf{d}} = {\mathbf{Ha}} + {\mathbf{e}}, $$
(11)

with

$$ {\mathbf{d}} = \left( {\begin{array}{*{20}c} {v_{1} } \\ \vdots \\ {v_{N} } \\ \end{array} } \right),\;{\mathbf{a}} = \left( {\begin{array}{*{20}c} {a_{1} } \\ \vdots \\ {a_{M} } \\ \end{array} } \right),\;{\mathbf{H}} = \left( {\begin{array}{*{20}c} {\varPhi_{1} \left( {x_{1} ,y_{1} } \right)} & \cdots & {\varPhi_{M} \left( {x_{1} ,y_{1} } \right)} \\ \vdots & \ddots & \vdots \\ {\varPhi_{1} \left( {x_{N} ,y_{N} } \right)} & \cdots & {\varPhi_{M} \left( {x_{N} ,y_{N} } \right)} \\ \end{array} } \right). $$
(12)

We assume the errors to be isotropic Gaussian, \({\mathbf{e}}\sim {\mathcal{N}}\left( {{\mathbf{0}},\sigma^{2} {\mathbf{I}}} \right)\), where \(\sigma^{2}\) is an unknown scale factor (hyperparameter) of the variance and \({\mathbf{I}}\) is the \(N \times N\) unit matrix. Then, we obtain the data distribution as

$$ p\left( {{\mathbf{d}}{|}{\mathbf{a}};\sigma^{2} } \right) = \left( {2\pi \sigma^{2} } \right)^{ - N/2} \exp \left[ { - \frac{1}{{2\sigma^{2} }}\left( {{\mathbf{d}} - {\mathbf{Ha}}} \right)^{{\text{T}}} \left( {{\mathbf{d}} - {\mathbf{Ha}}} \right)} \right]. $$
(13)

Following Yabuki and Matsu’ura (1992), we construct a Bayesian model incorporating prior information that the velocity field should change smoothly in space. Although velocity discontinuity could exist at boundaries of rigid motions, gradual variation of velocity is usually detected by dense observation networks in deformation zones. The method of Shen et al. (1996, 2015) also imposes local uniformity on strain rates. The roughness of a velocity field can be measured by the following quantity:

$$ r = \iint {\left[ {\left( {\frac{{\partial^{2} v}}{{\partial x^{2} }}} \right)^{2} + 2\left( {\frac{{\partial^{2} v}}{\partial x\partial y}} \right)^{2} + \left( {\frac{{\partial^{2} v}}{{\partial y^{2} }}} \right)^{2} } \right]{\text{d}}x{\text{d}}y}. $$
(14)

In order to realize a smooth velocity field, the roughness \(r\) should be small. In terms of model parameters, it is rewritten in a positive-semidefinite quadratic form as

$$ r\left( {\mathbf{a}} \right) = \sum \limits_{i = 1}^{M} \sum \limits_{j = 1}^{M} a_{i} a_{j} \iint {\left[ {\frac{{\partial^{2} \varPhi_{i} }}{{\partial x^{2} }}\frac{{\partial^{2} \varPhi_{j} }}{{\partial x^{2} }} + 2\frac{{\partial^{2} \varPhi_{i} }}{\partial x\partial y}\frac{{\partial^{2} \varPhi_{j} }}{\partial x\partial y} + \frac{{\partial^{2} \varPhi_{i} }}{{\partial y^{2} }}\frac{{\partial^{2} \varPhi_{j} }}{{\partial y^{2} }}} \right]{\text{d}}x{\text{d}}y = {\mathbf{a}}^{{\text{T}}} {\mathbf{Ra}},} $$
(15)

where

$$ {\mathbf{R}} = \iint {\left( {{{\varvec{\Phi}}}_{xx}^{{\text{T}}} {{\varvec{\Phi}}}_{xx} + 2{{\varvec{\Phi}}}_{xy}^{{\text{T}}} {{\varvec{\Phi}}}_{xy} + {{\varvec{\Phi}}}_{yy}^{{\text{T}}} {{\varvec{\Phi}}}_{yy} } \right){\text{d}}x{\text{d}}y,} $$
(16)

with

$$ {{\varvec{\Phi}}}_{xx} = \left( {\frac{{\partial^{2} \varPhi_{1} }}{{\partial x^{2} }}, \cdots ,\frac{{\partial^{2} \varPhi_{M} }}{{\partial x^{2} }}} \right),\;{{\varvec{\Phi}}}_{xy} = \left( {\frac{{\partial^{2} \varPhi_{1} }}{\partial x\partial y}, \cdots ,\frac{{\partial^{2} \varPhi_{M} }}{\partial x\partial y}} \right),\;{{\varvec{\Phi}}}_{yy} = \left( {\frac{{\partial^{2} \varPhi_{1} }}{{\partial y^{2} }}, \cdots ,\frac{{\partial^{2} \varPhi_{M} }}{{\partial y^{2} }}} \right). $$
(17)

With the roughness defined in Eq. (15), we can introduce prior constraints on the smoothness of a velocity field in the form of the degenerate Gaussian distribution (Fukahata 2012):

$$ p\left( {{\mathbf{a}};\rho^{2} } \right) \propto \left( {2\pi \rho^{2} } \right)^{{ - \frac{P}{2}}} \left| {{{\varvec{\Lambda}}}_{P} } \right|^{\frac{1}{2}} \exp \left[ { - \frac{1}{{2\rho^{2} }}{\mathbf{a}}^{{\text{T}}} {\mathbf{Ra}}} \right], $$
(18)

where \(P\) is the rank of \({\mathbf{R}}\), and \(\left| {{{\varvec{\Lambda}}}_{P} } \right|\) is the product of the non-zero eigenvalues of \({\mathbf{R}}\). \(\rho^{2}\) is a hyperparameter that controls the strength of smoothness constraint.

We combine the two distributions, Eqs. (13) and (18), through Bayes’ theorem to derive the posterior distribution of model parameter \({\mathbf{a}}\) for given observation data \({\mathbf{d}}\):

$$ p\left( {{\mathbf{a}};\sigma^{2} ,\rho^{2} {|}{\mathbf{d}}} \right) = cp\left( {{\mathbf{d}}{|}{\mathbf{a}};\sigma^{2} } \right)p\left( {{\mathbf{a}};\rho^{2} } \right), $$
(19)

where \(c\) is a normalization constant independent of model parameter \({\mathbf{a}}\) and hyperparameters \(\sigma^{2}\) and \(\rho^{2}\). For fixed values of the hyperparameters, the model parameter \({\mathbf{a}}\) and its covariance are obtained by maximizing Eq. (19) as

$$ {\hat{\mathbf{a}}} = \left( {{\mathbf{H}}^{{\text{T}}} {\mathbf{H}} + \alpha^{2} {\mathbf{R}}} \right)^{ - 1} {\mathbf{H}}^{{\text{T}}} {\mathbf{d}},\;{\text{cov}}\left[ {\mathbf{a}} \right] = \sigma^{2} \left( {{\mathbf{H}}^{{\text{T}}} {\mathbf{H}} + \alpha^{2} {\mathbf{R}}} \right)^{ - 1} , $$
(20)

with \(\alpha^{2} = \sigma^{2} /\rho^{2} .\)
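
For a fixed \(\alpha^{2}\), Eq. (20) is a regularized least-squares problem. The sketch below solves it for a toy one-dimensional stand-in, which is an assumption made to keep the example short: the model parameters are values on a grid, \({\mathbf{H}}\) simply selects the observed nodes, and the roughness matrix is built from second differences instead of the B-spline integral of Eq. (16). The function name `solve_map` is hypothetical.

```python
import numpy as np

def solve_map(H, R, d, alpha2):
    """Estimate of Eq. (20) for a fixed alpha^2 = sigma^2 / rho^2.

    Returns the model parameters and the matrix A = H^T H + alpha^2 R,
    whose inverse scaled by sigma^2 gives the covariance in Eq. (20).
    """
    A = H.T @ H + alpha2 * R
    a_hat = np.linalg.solve(A, H.T @ d)
    return a_hat, A

# Toy 1-D stand-in (hypothetical): 20 grid nodes, every third node observed.
M = 20
obs = np.arange(0, M, 3)
H = np.eye(M)[obs]                      # picks the observed nodes
L = np.diff(np.eye(M), n=2, axis=0)     # (M-2) x M second-difference operator
R = L.T @ L                             # positive-semidefinite roughness matrix
truth = 0.5 * np.arange(M) + 1.0        # linear profile: zero roughness
d = truth[obs]                          # noise-free observations

a_hat, A = solve_map(H, R, d, alpha2=1.0)
```

Although only 7 of the 20 parameters are observed, the smoothness term makes the system well posed, and a noise-free linear profile is recovered exactly because it has zero roughness and fits the data.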

The optimal values of hyperparameters can be objectively selected by minimizing Akaike’s Bayesian information criterion (ABIC) introduced by Akaike (1980). ABIC is defined by

$$ {\text{ABIC}}\left( {\sigma^{2} ,\rho^{2} } \right) = - 2\log \left[ {\int {p\left( {{\mathbf{d}}{|}{\mathbf{a}};\sigma^{2} } \right)p\left( {{\mathbf{a}};\rho^{2} } \right)d{\mathbf{a}}} } \right] + 2N_{h} , $$
(21)

where \(N_{h}\) is the number of hyperparameters (\(N_{h} = 2\) for the current modeling). By minimizing ABIC, the optimal value of \(\sigma^{2}\) can be analytically solved as

$$ \hat{\sigma }^{2} = s\left( {{\hat{\mathbf{a}}}} \right)/\left( {N + P - M} \right), $$
(22)

with

$$ s\left( {\mathbf{a}} \right) = \left( {{\mathbf{d}} - {\mathbf{Ha}}} \right)^{{\text{T}}} \left( {{\mathbf{d}} - {\mathbf{Ha}}} \right) + \alpha^{2} {\mathbf{a}}^{{\text{T}}} {\mathbf{Ra}}. $$
(23)

Then, ABIC is expressed in terms of \(\alpha^{2}\) as

$$ \begin{aligned} {\text{ABIC}}\left( {\alpha^{2} } \right) & = \left( {N + P - M} \right)\log \left[ {\frac{2\pi }{{\left( {N + P - M} \right)}}s\left( {{\hat{\mathbf{a}}}} \right)} \right] - P\log \alpha^{2} + \log \left| {{\mathbf{H}}^{{\text{T}}} {\mathbf{H}} + \alpha^{2} {\mathbf{R}}} \right| \\ & \quad - \log \left| {{{\varvec{\Lambda}}}_{P} } \right| + \left( {N + P - M} \right) + 2N_{h} . \\ \end{aligned} $$
(24)

The search for the optimal value of \(\alpha^{2}\) is performed numerically. Substituting the optimal value \(\hat{\alpha }^{2}\) into Eqs. (22) and (20) gives optimal values of \(\sigma^{2}\) and the model parameters.
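
The numerical search can be sketched as follows. The example evaluates Eq. (24) on a logarithmic grid of \(\alpha^{2}\) and takes the minimum; the grid search itself, the eigenvalue threshold used to determine the rank \(P\), and the toy one-dimensional problem (identity design matrix, second-difference roughness) are assumptions for illustration and are not taken from the procedure used in this study.

```python
import numpy as np

def abic(alpha2, H, R, d, n_hyper=2):
    """Evaluate Eq. (24); also return the estimate and Eq. (22)."""
    N, M = H.shape
    eig = np.linalg.eigvalsh(R)
    nonzero = eig[eig > 1e-10 * eig.max()]          # non-zero eigenvalues of R
    P = nonzero.size                                # rank of R
    A = H.T @ H + alpha2 * R
    a_hat = np.linalg.solve(A, H.T @ d)             # Eq. (20)
    res = d - H @ a_hat
    s = res @ res + alpha2 * (a_hat @ R @ a_hat)    # Eq. (23)
    dof = N + P - M
    _, logdet_A = np.linalg.slogdet(A)
    value = (dof * np.log(2.0 * np.pi * s / dof) - P * np.log(alpha2)
             + logdet_A - np.sum(np.log(nonzero)) + dof + 2 * n_hyper)
    return value, a_hat, s / dof                    # s / dof is Eq. (22)

# Toy 1-D problem (hypothetical): a smooth profile observed with noise.
rng = np.random.default_rng(2)
M = 40
H = np.eye(M)
L = np.diff(np.eye(M), n=2, axis=0)
R = L.T @ L
xg = np.linspace(0.0, 1.0, M)
d = np.sin(2.0 * np.pi * xg) + 0.1 * rng.standard_normal(M)

grid = np.logspace(-3, 4, 36)                       # candidate alpha^2 values
values = np.array([abic(a2, H, R, d)[0] for a2 in grid])
best_alpha2 = grid[np.argmin(values)]
_, a_hat, sigma2_hat = abic(best_alpha2, H, R, d)
```

The term \(\log \left| {{{\varvec{\Lambda}}}_{P} } \right|\) does not depend on \(\alpha^{2}\), so it shifts ABIC by a constant and does not affect the location of the minimum. In practice the grid can be refined around the minimum, or a one-dimensional minimizer can be applied to \(\log \alpha^{2}\).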

It is now straightforward to treat two velocity components, where we assume that the degree of smoothness is common to both components. Components of data, model parameters and errors are distinguished by subscripts as

$$ {\mathbf{d}} = \left( {\begin{array}{*{20}c} {{\mathbf{d}}_{x} } \\ {{\mathbf{d}}_{y} } \\ \end{array} } \right),\;{\mathbf{a}} = \left( {\begin{array}{*{20}c} {{\mathbf{a}}_{x} } \\ {{\mathbf{a}}_{y} } \\ \end{array} } \right),\;{\mathbf{e}} = \left( {\begin{array}{*{20}c} {{\mathbf{e}}_{x} } \\ {{\mathbf{e}}_{y} } \\ \end{array} } \right), $$
(25)

while the matrices \({\mathbf{H}}\) and \({\mathbf{R}}\) and the hyperparameters \(\sigma^{2}\) and \(\rho^{2}\) are common to both components. The observation equation and roughness are written as

$$ \left( {\begin{array}{*{20}c} {{\mathbf{d}}_{x} } \\ {{\mathbf{d}}_{y} } \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {\mathbf{H}} & 0 \\ 0 & {\mathbf{H}} \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {{\mathbf{a}}_{x} } \\ {{\mathbf{a}}_{y} } \\ \end{array} } \right) + \left( {\begin{array}{*{20}c} {{\mathbf{e}}_{x} } \\ {{\mathbf{e}}_{y} } \\ \end{array} } \right), $$
(26)

$$ r = \left( {{\mathbf{a}}_{x}^{{\text{T}}} {\mathbf{a}}_{y}^{{\text{T}}} } \right)\left( {\begin{array}{*{20}c} {\mathbf{R}} & 0 \\ 0 & {\mathbf{R}} \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {{\mathbf{a}}_{x} } \\ {{\mathbf{a}}_{y} } \\ \end{array} } \right). $$
(27)

The optimal model parameters and their covariance matrices are given by

$$ {\hat{\mathbf{a}}}_{a} = \left( {{\mathbf{H}}^{{\text{T}}} {\mathbf{H}} + \alpha^{2} {\mathbf{R}}} \right)^{ - 1} {\mathbf{H}}^{{\text{T}}} {\mathbf{d}}_{a} ,\;{\text{cov}}\left[ {{\mathbf{a}}_{a} } \right] = \sigma^{2} \left( {{\mathbf{H}}^{{\text{T}}} {\mathbf{H}} + \alpha^{2} {\mathbf{R}}} \right)^{ - 1} { }\left( {a = x,y} \right), $$
(28)

which is identical to Eq. (20). ABIC is calculated as

$$ \begin{aligned} {\text{ABIC}}\left( {\alpha^{2} } \right) & = 2\left( {N + P - M} \right)\log \left[ {\frac{\pi }{{\left( {N + P - M} \right)}}s\left( {{\hat{\mathbf{a}}}} \right)} \right] - 2P\log \alpha^{2} + 2\log \left| {{\mathbf{H}}^{{\text{T}}} {\mathbf{H}} + \alpha^{2} {\mathbf{R}}} \right| \\ & \quad - 2\log \left| {{{\varvec{\Lambda}}}_{P} } \right| + 2\left( {N + P - M} \right) + 2N_{h} , \\ \end{aligned} $$
(29)

with

$$ s\left( {\mathbf{a}} \right) = \left( {{\mathbf{d}}_{x} - {\mathbf{Ha}}_{x} } \right)^{{\text{T}}} \left( {{\mathbf{d}}_{x} - {\mathbf{Ha}}_{x} } \right) + \left( {{\mathbf{d}}_{y} - {\mathbf{Ha}}_{y} } \right)^{{\text{T}}} \left( {{\mathbf{d}}_{y} - {\mathbf{Ha}}_{y} } \right) + \alpha^{2} \left( {{\mathbf{a}}_{x}^{{\text{T}}} {\mathbf{Ra}}_{x} + {\mathbf{a}}_{y}^{{\text{T}}} {\mathbf{Ra}}_{y} } \right). $$
(30)

Here, note that the numbers of data and model parameters are \(2N\) and \(2M\), respectively, in this case. The optimal value of \(\sigma^{2}\) is given by

$$ \hat{\sigma }^{2} = \frac{{s\left( {{\hat{\mathbf{a}}}} \right)}}{{2\left( {N + P - M} \right)}}. $$
(31)

The two velocity components are estimated almost independently, but they interact with each other through the common hyperparameters.

Comparison of the two methods through application to GNSS data in Japan

Data and model setting

We apply the above two methods to GNSS velocity data in Japan to estimate velocity and strain-rate fields. We use 1336 stations in the region spanning 128°–146° E and 30°–46° N (Fig. 1; Additional file 1). Raw data of these stations are archived at the Geospatial Information Authority of Japan (GSI), the Japan Coast Guard, Kyoto University, the International GNSS Service (IGS), and UNAVCO. We estimate daily coordinates of continuous GNSS stations using precise point positioning with ambiguity resolution (Bertiger et al. 2010a, b) implemented in the GIPSY Ver. 6.4 software (https://gipsy-oasis.jpl.nasa.gov/). The coordinates are transformed into the IGS14 reference frame (http://www.igs.org/article/igs14-reference-frame-transition), which is the GNSS realization of the International Terrestrial Reference Frame (ITRF) 2014 (Altamimi et al. 2016), using the transformation parameters provided by the Jet Propulsion Laboratory. The daily coordinates from January 2006 to December 2009 are used in the following analysis. Coordinate offsets associated with 14 large earthquakes (Table 1), two dyke intrusion events in the eastern Izu volcanoes (Ueno et al. 2012), and equipment maintenance are removed with reference to catalogues provided by GSI and the Japan Meteorological Agency (JMA). The velocity vectors are estimated with a conventional procedure similar to that in Sagiya et al. (2000): a linear trend and a sinusoidal annual variation are fitted by the least-squares method for each component separately, and the coefficient of the linear term is then used as the velocity component.
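
The last step of this procedure, fitting a linear trend plus an annual sinusoid to one coordinate component, can be sketched as an ordinary least-squares problem. The example below is a simplified illustration on a noise-free synthetic time series with offsets already removed; the function name and all numerical values are hypothetical.

```python
import numpy as np

def fit_velocity(t_years, pos):
    """Least-squares fit of offset + linear trend + annual sinusoid.

    Returns the linear-term coefficient (the velocity) and all coefficients
    in the order (offset, velocity, sine amplitude, cosine amplitude).
    """
    G = np.column_stack([np.ones_like(t_years), t_years,
                         np.sin(2.0 * np.pi * t_years),
                         np.cos(2.0 * np.pi * t_years)])
    coef, *_ = np.linalg.lstsq(G, pos, rcond=None)
    return coef[1], coef

# Hypothetical daily positions (mm) over about four years, noise-free.
t = np.arange(0, 4 * 365) / 365.25
pos = (3.0 + 12.5 * t
       + 2.0 * np.sin(2.0 * np.pi * t) - 1.0 * np.cos(2.0 * np.pi * t))
vel, coef = fit_velocity(t, pos)
```

Writing the annual term as a sine and cosine pair keeps the problem linear in the unknowns; the amplitude and phase can be recovered afterwards from the two coefficients if needed.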

Fig. 1

Horizontal velocity observed at permanent GNSS stations with respect to the ITRF 2014 for January 2006–December 2009. Green stars represent the epicenters of earthquakes whose coordinate offsets are removed from GNSS data

Table 1 List of large earthquakes whose coordinate offsets are removed from GNSS data

In the method of Shen et al. (1996, 2015), hereafter Shen’s method, the form of the weighting function must be chosen. We adopt the original formulation of Shen’s method (Shen et al. 1996), because the hyperparameter \(D\) is more intuitive and, as shown in the Discussion (see Fig. 11), the trade-off curve indicates better performance than the method proposed by Shen et al. (2015), which uses \(W\) instead of \(D\). An appropriate value of the hyperparameter \(D\) generally depends on the observation density and the length scale of crustal deformation, and in practice it is determined through comparison of estimated results. Consequently, different values of \(D\) have been adopted in different studies: e.g., 25 km in Shen et al. (1996) and 35 km in Sagiya et al. (2000). Therefore, the results for several values of the hyperparameter, specifically \(D = 15,\) 25 and 35 km, are compared in this study.

In the basis function expansion with ABIC, hereafter the ABIC method or simply ABIC, we set the Cartesian coordinates with the origin at (137° E, 38° N), and take a rectangle of 1800 km (north–south) by 1600 km (east–west) as the analysis region (the range of Fig. 1). A basis function \(\varPhi_{j} \left( {x,y} \right) = X_{k} \left( x \right)Y_{l} \left( y \right)\) is the product of cubic B-splines in the \(x\) and \(y\) directions. Basis functions are placed at 20-km intervals and truncated at the boundary of the analysis region. The effects of the setting of basis functions on the estimation results are discussed in detail in the section ‘Setting of basis functions in the ABIC method’.

Comparison of the results of the two methods

The velocity fields estimated by both methods are shown in Fig. 2, in which the results are presented for the places where three or more observation stations exist within a radius of 50 km. All results are similar in most regions. Looking into the details, however, Shen’s method with \(D = 15\) km yields an unstable eastward velocity field along the Pacific coast in northeast Japan. This suggests overfitting to the observation data, and \(D = {15}\) km is considered too short for a stable estimation. Figure 3 is an enlarged view of the eastward velocity around the Izu Islands. The observation data generally show westward motions, but Shen’s method consistently estimates eastward motions offshore to the east of the islands. This is because the method extrapolates the large velocity gradient observed on Miyakejima Island, where adjacent stations have significantly different velocities. On the other hand, ABIC estimates a reasonable velocity field even in the offshore region. In brief, both methods usually give similar velocity fields, but Shen’s method can be unstable on the periphery of observation networks.

Fig. 2

Comparison of the velocity fields estimated by the ABIC and Shen's methods. Results of different values of the distance decaying constant (D = 15, 25, and 35 km) are presented for Shen's method. Boxes A and B indicate the area shown in Figs. 3 and 5, respectively. Three black lines (a, b and c) show cross sections presented in Figs. 6 and 9. The green star represents the epicenter of the 2008 Iwate-Miyagi inland earthquake, which is also shown in Fig. 1

Fig. 3

Comparison of the estimated eastward velocity fields around the Izu Islands. The location is indicated by box A in Fig. 2. Results of different values of D are presented for Shen’s method. Open circles represent the GNSS stations. The top panel shows the observed velocity data

Estimation bias, i.e., the mean residual between estimates and observations at the observation sites, \( \sum \nolimits_{i = 1}^{N} [v\left( {x_{i} ,y_{i} } \right) - v_{i} ]/N\), is listed in Table 2. The bias of ABIC is extremely small (of the order of \(10^{ - 15}\), which is probably at the level of numerical error). This is an advantage of treating all data simultaneously, which leads to the cancellation of the bias. In Shen’s method, on the other hand, velocities are estimated independently at each data point, so biases are not cancelled out and accumulate randomly.
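The bias defined above can be illustrated with a short sketch. It is not the ABIC inversion itself: it uses an ordinary least-squares fit of a straight line to synthetic data, and the site positions, velocities and the helper name `estimation_bias` are assumptions made only for illustration. The sketch shows a generic property of fitting all data simultaneously with a model that can represent a constant offset: the mean residual vanishes to within rounding error.

```python
import numpy as np

def estimation_bias(v_est, v_obs):
    """Mean residual (estimate minus observation) over the N observation sites."""
    diff = np.asarray(v_est, dtype=float) - np.asarray(v_obs, dtype=float)
    return float(np.mean(diff))

# Synthetic data (hypothetical): site positions in km, velocities in mm/yr.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 100.0, 50)
v_obs = -0.05 * x + rng.normal(0.0, 1.0, x.size)

# Simultaneous least-squares fit of all data with a constant and a linear term.
A = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, v_obs, rcond=None)
v_fit = A @ coef

# The mean residual is zero up to floating-point rounding.
bias_simultaneous = estimation_bias(v_fit, v_obs)
print(f"bias of the simultaneous fit: {bias_simultaneous:.2e} mm/yr")
```

When each estimate is instead obtained from a separate local fit, as in Shen’s method, no such global cancellation is guaranteed.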

Table 2 Estimation bias of the velocity components

Estimated strain-rate fields, specifically the dilatation rate \(\Delta = e_{xx} + e_{yy}\) and maximum shear-strain rate \(\Sigma = \sqrt {e_{xy}^{2} + \left( {e_{xx} - e_{yy} } \right)^{2} /4}\), are presented in Fig. 4. Here, \({\Delta } > 0\) (warm colors) represents expansive deformation, while \({\Delta } < 0\) (cold colors) represents contractive deformation. Differences between the estimation methods are clearer in the strain-rate field than in the velocity field. In Shen’s method, the scale of spatial variation depends on the value of D: the result for D = 15 km shows shorter length-scale variations, while that for D = 35 km shows a much smoother field. A difficulty is that there is no objective criterion for determining the value of D. ABIC estimates a strain-rate field similar to that of Shen’s method with D = 25 km.
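The two scalar measures defined above can be computed directly from the strain-rate components. The following is a minimal sketch; the function names and the numerical values (in nanostrain/year) are hypothetical and are not results of this study.

```python
import numpy as np

def dilatation_rate(exx, eyy):
    """Dilatation rate: Delta = e_xx + e_yy (positive means expansion)."""
    return np.asarray(exx, dtype=float) + np.asarray(eyy, dtype=float)

def max_shear_strain_rate(exx, eyy, exy):
    """Maximum shear-strain rate: Sigma = sqrt(e_xy^2 + (e_xx - e_yy)^2 / 4)."""
    exx = np.asarray(exx, dtype=float)
    eyy = np.asarray(eyy, dtype=float)
    exy = np.asarray(exy, dtype=float)
    return np.sqrt(exy**2 + (exx - eyy) ** 2 / 4.0)

# Hypothetical strain-rate components (nanostrain/yr) at a single point.
exx, eyy, exy = -100.0, 20.0, 30.0
delta = dilatation_rate(exx, eyy)             # negative value: contraction
sigma = max_shear_strain_rate(exx, eyy, exy)

# Consistency check: Sigma equals half the difference of the principal strain rates.
principal = np.linalg.eigvalsh(np.array([[exx, exy], [exy, eyy]]))
half_difference = 0.5 * (principal[1] - principal[0])
```

Because both functions operate element-wise, they can also be applied to gridded fields of the strain-rate components.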

Fig. 4

Comparison of the estimated strain-rate fields. Results of the ABIC method and Shen’s method with different D values are presented. Positive dilatation means expansion

Differences between the two methods are more easily recognized around the focal region of the 2008 M7.2 Iwate-Miyagi inland earthquake (140.9° E, 39.0° N), the epicenter of which is marked by a green star in Fig. 1; an enlarged view is shown in Fig. 5. A pair of weak positive and strong negative dilatation rates is observed in the result of ABIC. In Shen’s method, the positive dilatation is unclear for \(D = 25\) km and disappears for \(D = 35\) km. The result for \(D = 15\) km shows a pair of positive and negative dilatation rates similar to that of ABIC, but many such pairs exist in other areas, which makes it difficult to distinguish meaningful signals.

Fig. 5

Comparison of the estimated dilatation-rate fields in the Tohoku region. The location is indicated by box B in Fig. 2. Results of different D values are presented for Shen’s method. Green stars represent the epicenter of the 2008 Iwate-Miyagi inland earthquake. Black lines indicate the location of the cross section shown in Fig. 6c. Open circles represent the GNSS stations

To compare the estimated results more quantitatively, the velocity and dilatation-rate fields along three cross sections (see Fig. 2) are plotted in Fig. 6. Figure 6a, b shows results along north–south and east–west sections, respectively. The velocity fields are almost identical for both methods and all values of D, except outside the observation network. There is an outlier at 33.5° N on the 133° E section (Fig. 6a). The velocity profile of Shen’s method is not affected by it, while that of ABIC is pulled toward it slightly, but not significantly. Both methods can therefore be regarded as sufficiently robust to a single outlier in estimating a velocity field. Strain-rate fields are more sensitive to the estimation method. In particular, along the 35.3° N section that passes through Mt. Fuji around 138.7° E, the magnitude of the dilatation rate differs significantly among the methods (Fig. 6b).

Fig. 6

Cross sections of the estimated eastward velocity and dilatation rates. The locations of the cross sections are indicated by the solid lines in Fig. 2. Results of different D values are presented for Shen’s method. Observed velocity data within \(\pm 0.2^\circ\) longitude or latitude from the sections are also plotted. Error bars show the estimated uncertainty of velocity data at each site. The green star in c indicates the epicenter of the 2008 Iwate-Miyagi inland earthquake

Figure 6c presents the velocity and dilatation-rate profiles that pass through the focal region of the 2008 Iwate-Miyagi inland earthquake. Observed eastward velocities generally decrease from west to east, which indicates contractive motion in the east–west direction. However, a few data in 140.5°–141.0° E have faster eastward velocities; this motion causes larger east–west contraction around the epicenter and small expansion outside it (Ohzono et al. 2012). ABIC captures this tendency smoothly. In Shen’s method, the eastward velocity decreases monotonically for \(D = 35\) km and only barely levels off for \(D = 25\) km; the velocity profile for \(D = 15\) km shows a better fit to the data in the focal region, but the strain-rate field oscillates at small scales. ABIC resolves sharp crustal deformation in the focal region while keeping smooth variation outside it.

Setting of basis functions in the ABIC method

In analyses using ABIC, the interval \(L\) of basis functions has typically been fixed beforehand (e.g., Fukahata et al. 1996), but it can significantly influence the fitting performance. We therefore examine its effect below. As \(L\) gets smaller, the fit generally improves and then saturates once \(L\) is fine enough to resolve the variation of the data, whereas the computational cost increases dramatically. The computation time of the matrix inversion (the most expensive part of the analysis) is proportional to the cube of the number of basis functions \(M\), which is inversely proportional to \(L^{2}\); if the interval is halved, the computational cost increases by a factor of approximately \(2^{2 \times 3} = 64\). Therefore, we need to strike a balance between finer fitting and computational cost.
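The scaling argument above can be written as a small helper. This is a sketch under the two proportionalities stated in the text (\(M \propto L^{-2}\) and inversion time \(\propto M^{3}\)); the function name is hypothetical, and boundary effects on the exact number of basis functions are ignored.

```python
def relative_inversion_cost(L_new_km, L_ref_km):
    """Approximate cost ratio of the matrix inversion when the basis-function
    interval changes from L_ref_km to L_new_km.

    Assumes M is proportional to L**-2 and inversion time to M**3,
    so the cost scales as L**-6.
    """
    m_ratio = (L_ref_km / L_new_km) ** 2   # ratio of the numbers of basis functions
    return m_ratio ** 3                    # ratio of inversion times

# Halving the interval from 20 km to 10 km:
factor = relative_inversion_cost(10.0, 20.0)
print(factor)  # 64.0
```

The same helper shows that even a modest refinement is costly: going from 30 km to 20 km already multiplies the inversion time by about \(1.5^{6} \approx 11\).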

The dependence of \(M\) and the value of ABIC (Eq. 29) on \(L\) is shown in Fig. 7. Figure 7a illustrates that the number of basis functions, and hence computational cost, increases rapidly as \(L\) decreases. As shown in Fig. 7b, ABIC tends to increase with \(L\), because of the lack of resolution to represent the variation of data. Fluctuation of the plot implies that the results could be affected by the relative position between observation data and nodes of basis functions. This fluctuation also suggests that the interval is too coarse to fit the data. The values of ABIC stay almost constant for \(L \le 30\) km, which implies sufficient resolution in this range.

Fig. 7

Relation between the basis-function interval \(L\) and estimation performance in the ABIC method. a Relation between the interval \(L\) and the number of basis functions \(M\). b Relation between the interval \(L\) and the value of ABIC

The estimated velocity and strain-rate fields for different \(L\) are presented in Figs. 8 and 9. The velocity field is almost identical for \(L \le 30\) km. Although the estimated strain rate changes around an outlier (Fig. 9a) and the focal region of the 2008 Iwate-Miyagi inland earthquake (Fig. 9c), the difference is much smaller than that of Shen’s method with different D (Fig. 6). In particular, the results for \(L = 15\) and 20 km are very similar to each other. In brief, results of the ABIC method are stable with respect to changes in \(L\), provided that \(L\) is sufficiently small. In this study, we mainly present the result for \(L = 20\) km, because it achieves sufficient resolution at moderate computational cost.

Fig. 8

Comparison of the estimated eastward velocity and dilatation-rate fields in the ABIC method. Results of the different values of the basis-function interval \(L\) are shown

Fig. 9

Cross sections of the estimated eastward velocity and dilatation rates. Results of different values of the basis-function interval \(L\) are presented. The locations of the cross sections are the same as in Fig. 6; the location map is shown in Fig. 2. Observed velocity data within \(\pm 0.2^\circ\) longitude or latitude from the sections are also plotted. Error bars show the estimated uncertainty of velocity data at each site. The green star in c indicates the epicenter of the 2008 Iwate-Miyagi inland earthquake

We next investigate the effect of the form of basis functions at the boundary of the analysis region (Fig. 10). Yabuki and Matsu’ura (1992) and Fukahata et al. (1996) used basis functions that take zero at the edge of the analysis region (Fig. 10a), which forces the field to be zero at the boundary. This model assumption is reasonable for expressing the slip distribution of earthquakes, but inappropriate for expressing a horizontal velocity field. If we use these basis functions, the velocity field gradually approaches zero in the region outside the observation network (Fig. 10c) to mitigate large roughness near the boundary (Fig. 10g). This leads to a reversal of the dilatation rate from the coastal area to the sea area (Fig. 10e). In Fig. 10e, contraction is dominant in the land area, which is consistent with compressional tectonics in Japan, while expansion is dominant in the sea area. The reversal of the dilatation rate is clearer where observation sites are closer to the boundary (e.g., north and east of Hokkaido, west of Kyushu).

Fig. 10

Dependence of results in the ABIC method on the setting of basis functions at the model boundary. a, b Setting of basis functions. Blue curves represent individual cubic B-splines, and the red curve represents their summation. The vertical line indicates the boundary of the analysis region. Basis functions are truncated at the boundary in b. c, d Estimated eastward velocity fields using basis functions shown in a and b, respectively. e, f Estimated dilatation-rate fields using basis functions shown in a and b, respectively. g, h Spatial distribution of roughness density using basis functions shown in a and b, respectively

To remove such artificial deformation outside the observation network, in this study we use basis functions truncated at the boundary, as shown in Fig. 10b; this strategy was partly used by Fukahata and Wright (2008) to express coseismic slip distribution at the Earth’s surface. This changes the number of model parameters and the components of the prior constraint \({\mathbf{R}}\) (Eq. 16) near the boundary. The obtained result shows natural extrapolation of the velocity field to the sea area (Fig. 10d), and the reversal of dilatation rates disappears (Fig. 10f). Setting appropriate basis functions is critical for reasonable modeling.
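The difference between the two boundary settings can be illustrated in one dimension. The sketch below uses the standard uniform cubic B-spline; the node placement relative to the boundary (placed at \(x = 0\)) and the helper names are assumptions made for illustration and may differ from the exact configuration in Fig. 10. With basis functions that vanish at the edge, the sum of the splines drops to zero at the boundary, so a constant velocity cannot be represented there. With basis functions that straddle the boundary and are truncated there, the splines still sum to one.

```python
import numpy as np

def cubic_bspline(t):
    """Uniform cubic B-spline centred at 0 with support |t| < 2 (t in units of L)."""
    t = np.abs(np.asarray(t, dtype=float))
    inner = 2.0 / 3.0 - t**2 + 0.5 * t**3      # valid for |t| < 1
    outer = (2.0 - t) ** 3 / 6.0               # valid for 1 <= |t| < 2
    return np.where(t < 1.0, inner, np.where(t < 2.0, outer, 0.0))

def basis_sum(x, centers, L):
    """Sum of all basis functions evaluated at positions x inside the region."""
    x = np.asarray(x, dtype=float)
    return sum(cubic_bspline((x - c) / L) for c in centers)

L = 20.0                                        # basis-function interval (km)
x = np.array([0.0, 10.0, 20.0, 40.0, 100.0])    # region is x >= 0, boundary at x = 0

# (a) Only splines whose support lies inside the region: all vanish at x = 0.
centers_zero_edge = L * np.arange(2, 12)
# (b) Splines straddling the boundary are kept and truncated at x = 0.
centers_truncated = L * np.arange(-1, 12)

sum_zero_edge = basis_sum(x, centers_zero_edge, L)
sum_truncated = basis_sum(x, centers_truncated, L)
print(sum_zero_edge)   # tends to zero toward the boundary
print(sum_truncated)   # equal to one everywhere, including the boundary
```

In setting (a), any field expressed by the basis is pulled toward zero near the edge regardless of the coefficients, which is the behavior seen in Fig. 10c.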

Discussion on the problems in Shen’s method

We mentioned three theoretical points to be examined in the method of Shen et al. (1996, 2015): velocity and strain-rate fields do not satisfy the relationship of differentiation, the value of the hyperparameter D (distance decaying constant) cannot be objectively determined, and the estimated uncertainty is unreliable. We discuss how much impact these factors have on the estimation results.

First, we consider the optimization of hyperparameters. In fitting problems, it is crucial to strike a balance between resolution and certainty (robustness), which trade off against each other (Backus and Gilbert 1970; Menke 2012). In Shen’s method, the hyperparameter \(D\) plays the role of regularization; smaller values yield high-resolution but unstable (non-robust) solutions, while larger values yield stable (robust) but low-resolution solutions. In the basis function expansion, the smoothness of solutions is represented by prior information, whose weight is controlled by \(\rho^{2}\); the optimal value of \(\rho^{2}\) is objectively determined from the observed data using ABIC.

To quantify the resolution and certainty, we use the RMS of fitting errors (residual) and the roughness (Eq. 14) of the velocity field, respectively. The lower these indices, the better the model. The relation between the residual and roughness of the estimated results is plotted in Fig. 11. For Shen's method, we also show the case where W (Eq. 9) is used as a hyperparameter instead of D. In Shen’s method, the residual and roughness show a clear trade-off. The original method (Shen et al. 1996), which varies D, performs slightly better by this criterion than the revised method (Shen et al. 2015), which varies W. Roughness increases rapidly below a certain value of the hyperparameters (\(D\sim 25\) km and \(W\sim 6\)), which implies overfitting to the observation data. On the other hand, large D or W strongly flattens the estimated velocity field and gives too small a roughness, which increases the residual. The indices of Shen’s method with \(D = 25\) km are closest to those of ABIC with \(L \le 30\) km, which agrees with a visual comparison in Fig. 4. A difficult problem is how to determine the optimal point on the trade-off curve in the absence of other criteria. In ABIC, the results for \(L \ge 50\) km are comparable to or worse than those of Shen’s method, which suggests that the basis functions are too coarse to fit the data variation. On the other hand, the indices (residual and roughness) for \(L \le 30\) km, for which the value of ABIC is sufficiently small (Fig. 7b), are located to the lower left of the trade-off curves of Shen’s method. This indicates the superiority of the ABIC method by this criterion. Although the curve of the ABIC method does not converge for small \(L\), the dependence of the indices on \(L\) is much milder than that on \(D\) and \(W\) in Shen’s method. It is a convenient property of the ABIC method that a smaller \(L\) generally gives a better result and that any sufficiently small \(L\) gives a similar result. This is because a smaller L ensures a higher resolution, while the certainty is maintained by determining the optimal values of the hyperparameters with ABIC. This is in clear contrast with Shen’s method, in which results are sensitive to the value of D, and both too large and too small D yield poor results.

Fig. 11

Trade-off curve between residual and roughness. The root-mean squares (RMS) of fitting errors (residual) and roughness (Eq. 14) of the velocity field are plotted. The lower left area corresponds to good models. Blue and orange curves represent the results of Shen’s method with different \(D\) (km) and \(W\), respectively (the value is attached at each point). The black curve represents the results of the ABIC method; attached numbers represent the basis-function interval \(L\) (km)

Second, as shown in the Appendix, the velocity and strain-rate fields estimated by Shen’s method do not satisfy the relationship of differentiation. The discrepancy between the dilatation rate directly estimated by Shen’s method and that calculated from the estimated velocity field through differentiation is presented in Fig. 12. The maximum discrepancy is 159.13, 30.03 and 14.83 nanostrain/year for \(D = 15\), 25 and 35 km, respectively. The discrepancy is generally small except for the overfitted model with \(D = 15\) km. In practice, no serious problem would occur in most regions. However, even for \(D = 25\) km, the discrepancy is evident around the Izu Islands (the bottom panels in Fig. 12), where observation stations are sparse. A large discrepancy exists not only in the extrapolated sea area, but also in the vicinity of the GNSS stations on the islets. We should keep in mind the possibility of inconsistency in strain-rate fields. The discrepancy does not occur in the ABIC method because the strain-rate field is analytically calculated by differentiating the velocity field.
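A consistency check of this kind can be sketched as follows. The sketch differentiates a gridded velocity field with centered finite differences, which is an assumption: the text does not state how the differentiation was carried out for Fig. 12. The grid, the velocity field and the directly estimated dilatation field are synthetic placeholders, and the function name is hypothetical.

```python
import numpy as np

def dilatation_from_velocity(vx, vy, dx_km, dy_km):
    """Finite-difference dilatation rate (nanostrain/yr) from gridded
    velocities in mm/yr. Arrays are indexed as [iy, ix]."""
    dvx_dx = np.gradient(vx, dx_km, axis=1)   # d(vx)/dx in (mm/yr)/km
    dvy_dy = np.gradient(vy, dy_km, axis=0)   # d(vy)/dy in (mm/yr)/km
    # 1 (mm/yr)/km = 1e-6 /yr = 1e3 nanostrain/yr
    return (dvx_dx + dvy_dy) * 1.0e3

# Synthetic velocity field on a 10-km grid (placeholder, not data from this study).
x = np.arange(0.0, 101.0, 10.0)
y = np.arange(0.0, 61.0, 10.0)
X, Y = np.meshgrid(x, y)
vx = -0.05 * X          # uniform east-west contraction of 50 nanostrain/yr
vy = 0.01 * Y           # uniform north-south extension of 10 nanostrain/yr

dil_from_velocity = dilatation_from_velocity(vx, vy, 10.0, 10.0)

# Placeholder for a dilatation field estimated directly by another method.
dil_direct = np.full_like(dil_from_velocity, -40.0)

max_discrepancy = np.max(np.abs(dil_direct - dil_from_velocity))
```

For a self-consistent pair of fields the maximum discrepancy is zero up to discretization and rounding error; nonzero values indicate that the strain-rate field is not the derivative of the velocity field.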

Fig. 12

Discrepancy in strain-rate fields estimated by Shen’s method with different values of D. The discrepancy between the dilatation rate directly obtained by Shen’s method and that calculated from the velocity field through differentiation is shown. Bottom panels show enlarged views around the Izu Islands; the location is indicated by boxes in the top panels. Open circles represent the GNSS stations. Positive dilatation represents expansion

Finally, we consider the uncertainties of the estimated velocity and strain-rate fields. Figure 13 shows the uncertainties of the absolute value of velocity vectors estimated by both methods. In Shen’s method, the uncertainty is small over the whole land area and increases exponentially toward the outside of the observation network. We also observe that a smaller D results in a larger uncertainty despite a smaller residual (Fig. 11). This is because the uncertainty, defined by Eq. (8), depends only on the input uncertainties \(\sigma_{i}\) estimated at each observation site and does not reflect the actual fit to the velocity data \(v_{i}\). Equations (4) and (6) indicate that a smaller D yields larger errors at each data point (i.e., larger \(\delta v_{i}\) and components of \({\mathbf{E}}_{{\text{P}}}\)), which leads to a larger estimated uncertainty (Eq. 8). As pointed out by Shen et al. (2015), this value itself cannot be used to measure the error of estimations. In the ABIC method, on the other hand, the covariance of the model parameters is given by Eq. (28), in which the value of \(\sigma\) is determined through the balance between data fit and smoothness of the model, which are related to the observed data \({\mathbf{d}}\) as well as the model parameter \({\mathbf{a}}\) (Eqs. 30 and 31). Therefore, the estimated uncertainty reflects the fitting accuracy of the velocity field.

Fig. 13

Comparison of the estimation errors in terms of the absolute value of velocity vector. Results of different D values are presented for Shen’s method

Figure 14 shows the uncertainties of the dilatation-rate field. The characteristics are similar to those of the velocity field; the uncertainty decreases monotonically with D in Shen’s method, while the ABIC method estimates a larger uncertainty than Shen’s method in most land areas. The change of uncertainty with D in Shen’s method is even clearer in the strain-rate field. In Hokkaido, the result of the ABIC method shows a slightly larger uncertainty owing to the sparser distribution of observation stations, while this characteristic is not so clear in the results of Shen's method with D = 25 km and 35 km.

Fig. 14

Comparison of the estimation errors in terms of the absolute value of dilatation rates. Results of different D values are presented for Shen’s method

Strain-rate field in Japan

In this section, we give an overview of the characteristics of the strain-rate fields in Japan revealed by the ABIC method (Figs. 15, 16 and 17). Although very high strain rates are observed along the Pacific coast, particularly in southwestern Japan, we neglect them in the following discussion, because this crustal deformation is chiefly caused by interplate coupling and is considered to be mostly cyclic.

Fig. 15

The principal axes of the strain-rate field estimated using the ABIC method with \(L = 20\) km. Red and blue arrows represent expansive and contractive strain rates, respectively. Green stars represent the epicenters of earthquakes whose coordinate offsets are removed from GNSS data. Yellow lines trace MTL and OBR. Black lines represent surface traces of major active faults (Headquarters for Earthquake Research Promotion 2017). Red triangles represent Mt. Fuji, Hakusan and Sakurajima, and black triangles represent other active volcanos (JMA 2013). The following acronyms are used: MTL Median Tectonic Line active fault system, OBR Ou Backbone Range, IB Ise Bay, A Aichi, F Fukushima, I Ibaraki, K Kobe, Na Nara, Ni Niigata

Fig. 16

The dilatation-rate field estimated from the ABIC method with \(L = 20\) km. Green stars represent the epicenters of earthquakes whose coordinate offsets are removed from GNSS data. Black lines represent surface traces of major active faults (Headquarters for Earthquake Research Promotion 2017). White triangles represent Mt. Fuji, Hakusan and Sakurajima, and black triangles represent other active volcanos (JMA 2013)

Fig. 17

The maximum shear-strain rate field estimated from the ABIC method with \(L = 20\) km. Green stars represent the epicenters of earthquakes whose coordinate offsets are removed from GNSS data. Black lines represent surface traces of major active faults (Headquarters for Earthquake Research Promotion 2017). White triangles represent Mt. Fuji, Hakusan and Sakurajima, and black triangles represent other active volcanos (JMA 2013)

The principal strain rates (Fig. 15) and dilatation rates (Fig. 16) indicate that EW contraction dominates the Japanese Islands, which is consistent with previous analyses of GNSS data (e.g., Sagiya et al. 2000). This EW contraction is also consistent with the stress field of Japan estimated from seismological data (e.g., Terakawa and Matsu'ura 2010) and with geologically detected deformation over the past few million years (e.g., Hujita 1980). Although EW contraction is dominant in the Japanese Islands, the obtained strain-rate fields show large spatial variation. Areas of high dilatation rates usually correspond to those of high shear-strain rates (Fig. 17), though there are exceptions.

A conspicuous high strain-rate zone extends from the eastern margin of the Japan Sea in the Tohoku district, via the Niigata Plain, to Kobe; it is known as the Niigata–Kobe Tectonic Zone (NKTZ; Sagiya et al. 2000). The high shear-strain rate zone further extends from Kobe to middle Kyushu along the Median Tectonic Line active fault system in Shikoku, which is consistent with the most active boundary in southwestern Japan revealed by block modeling of GNSS data (Nishimura et al. 2018). We also observe a branch of a high strain-rate zone from around Hakusan volcano (136.8° E, 36.2° N) to Ise Bay along 137° E. This region has already been suggested, from a GNSS data analysis, to be the contractive boundary between northeastern Japan (the North American plate) and southwestern Japan (the Amurian plate) (Heki and Miyazaki 2001).

In contrast, we have found that a low strain-rate zone extends in the forearc, from the Pacific coast of the southern Tohoku district (Fukushima Prefecture) to central Japan (Aichi Prefecture or possibly Nara Prefecture). This low strain-rate zone was briefly mentioned in Sagiya (2004), but the present study delineates it more sharply owing to denser GNSS stations and the improved analysis method (Figs. 4 and 11). This forearc low strain-rate zone, which corresponds well to a zone of high seismic velocity in the shallow crust (Nishida et al. 2008), may be attributed to a low geothermal gradient in the forearc (Tanaka et al. 2004) and weaker effects of interplate coupling between continental and oceanic plates, that is, a low coupling ratio for the Pacific plate (Noda et al. 2013) and a low plate convergence rate due to strain partitioning of both continental and oceanic plates in central Japan (Heki and Miyazaki 2001; Nishimura et al. 2018).

The Ou Backbone Range in the Tohoku district has several contractive spots around active volcanoes, although the postseismic deformation of the 2008 Iwate-Miyagi inland earthquake is also included. It is interesting that the locations of these contractive spots correspond well to the subsidence areas detected by InSAR after the 2011 Tohoku-oki earthquake (Takada and Fukushima 2013), although GNSS data from an earlier period (1997–2000) did not show such good agreement (Miura et al. 2002). This agreement suggests that strain-rate variations with such short wavelengths capture meaningful signals. This contraction may be attributed to a high geothermal gradient under the stress condition of EW compression (Shibazaki et al. 2008).

Although contraction is dominant in the Japanese Islands, there are several regions with areal expansion, which are captured better than in previous studies owing to improved resolution. The most distinctive expansion occurs in the Izu Islands, where backarc spreading is ongoing (Nishimura 2011). Positive dilatation is also apparent around Mt. Sakurajima (131.7° E, 31.6° N) and Mt. Fuji (138.7° E, 35.3° N), which likely reflects the inflation of the volcanoes. Weak positive dilatation is also observed near the focal areas of recent large earthquakes, such as the 2003 Tokachi-oki (M8.0) and the 1993 southwest off Hokkaido (M7.8) earthquakes, where high shear-strain rates are also observed. Postseismic deformation of these earthquakes is likely to be the cause of the weak dilatation and high shear-strain rates. Weak positive dilatation along the Pacific coast in Fukushima and Ibaraki prefectures might also be related to the 2008 Fukushima-oki (M6.9) and the 2008 Ibaraki-oki (M7.0) earthquakes as well as to decadal transient deformation before the 2011 Tohoku-oki earthquake (Mavrommatis et al. 2014), though shear-strain rates are low in these regions. It is interesting that positive dilatation occurred before the 2011 Tohoku-oki earthquake along the Pacific coast of Fukushima prefecture, where a large normal-fault event (M7.0), which is very rare in Honshu, Japan, occurred after that earthquake (Fukushima et al. 2013).

Conclusions

Among various methods for estimating strain-rate fields, we investigated theoretical properties of a widely applied method of Shen et al. (1996, 2015). In this method, the estimated strain-rate field is mathematically not consistent with the estimated velocity field (Appendix). Moreover, the value of a hyperparameter D (Eq. 4) that controls the degree of smoothness must be manually selected, and estimated uncertainty of the velocity and strain-rate fields is unreliable. We then proposed a method using basis function expansion with ABIC to resolve these difficulties.

We applied the two methods to GNSS data (2006–2009) in Japan and examined their characteristics. The basis function expansion with ABIC yielded a better result than Shen’s method in terms of the trade-off curve between the residual and roughness (Fig. 11). If we use an appropriate value of D = 25 km, the results of the ABIC method and Shen’s method are quite similar, but significant differences exist in islet and coastal regions, where the density of observation stations changes drastically, and in areas of large local deformation, where postseismic deformation and volcanic inflation occur (Figs. 2, 3, 4, 5 and 6). In Shen’s method, the self-inconsistency between the estimated velocity and strain-rate fields does not have serious effects in most regions (Fig. 12), but the estimated uncertainty is clearly unreliable (Figs. 13 and 14).

A main practical concern of Shen’s method is the determination of the value of the hyperparameter D, which significantly changes the estimation result and interpretation of crustal deformation. Other mathematical interpolation methods also require adjustment of hyperparameters to specify the smoothness of solutions. The optimal values are usually determined by checking whether the obtained result is in harmony with expected forms of crustal deformation. That is to say, subjective judgement by analysts affects the selection of a final result. This may allow us to show good-looking results, but could lead to a biased solution. On the other hand, ABIC objectively determines the optimal values of hyperparameters in the method of basis function expansion. Although the interval \(L\) of basis functions must be chosen, the dependence is simple: estimation results are reasonable and hardly change for sufficiently small \(L\) (Figs. 7 and 8). Therefore, once we check that the value of ABIC and estimation result do not significantly change below a certain \(L_{0}\) (approximately 30 km in the current dataset), any value of \(L \le L_{0}\) can be chosen as long as computational cost allows.
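The selection rule for \(L_{0}\) described above can be sketched as a simple plateau check on the ABIC values. The helper name, the tolerance and the numbers below are hypothetical placeholders chosen only to exercise the code; they are not values from this study, and the tolerance itself remains a choice of the analyst. A full check would also compare the estimated fields, as stated in the text.

```python
import numpy as np

def largest_stable_interval(intervals_km, abic_values, tol):
    """Return the largest L0 such that ABIC at every tested L <= L0 stays
    within `tol` of the ABIC value at the smallest tested interval."""
    order = np.argsort(intervals_km)
    L = np.asarray(intervals_km, dtype=float)[order]
    abic = np.asarray(abic_values, dtype=float)[order]
    L0 = L[0]
    for Li, ai in zip(L, abic):
        if abs(ai - abic[0]) > tol:
            break                     # ABIC has left the plateau
        L0 = Li
    return float(L0)

# Placeholder values for illustration only (ABIC given relative to its minimum).
intervals = [5, 10, 20, 40, 80]
abic_rel = [0.0, 0.2, 0.1, 3.0, 9.0]
L0 = largest_stable_interval(intervals, abic_rel, tol=0.5)
print(L0)  # 20.0
```

Any tested interval not larger than the returned value would then be acceptable, subject to the computational cost.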

The strain-rate field in Japan (Figs. 15, 16 and 17) estimated from the basis function expansion with ABIC revealed a forearc low strain-rate zone that extends from the southern Tohoku district to central Japan. The result also shows several contractive spots around active volcanoes in the Ou Backbone Range, the locations of which correspond well to the subsidence areas detected by InSAR after the 2011 Tohoku-oki earthquake (Takada and Fukushima 2013). Thus, the method of basis function expansion with ABIC can serve as an effective tool for estimating strain-rate fields from GNSS data.