1 Introduction

Geophysics is the investigation of the Earth based on the principles of physics, whereby the physical properties of the Earth medium (i.e., the unknown model) are inverted from either surface- or space-based observations. Geophysical inverse problems are often underdetermined (e.g., Menke 2015; Tarantola and Valette 1982; Wiggins 1972) owing to the various limitations of these observations. However, advances in generalized and regularized inverse theory over the last century now make it possible to solve underdetermined inverse problems (e.g., Hansen 1992; Lawson and Hanson 1995; Levenberg 1944; Moore 1920; Morozov 1984; Penrose 1955; Tikhonov 1963). Although the quality and reliability of such an under-constrained solution are essential in a data-poor environment, the verification of a geophysical solution is difficult or even impossible, as the investigated medium is generally inaccessible. Geophysicists therefore have to devote considerable effort to the methodology of solution appraisal. However, understanding what is reliable in some geophysical solutions remains perhaps the most exciting challenge to date (Foulger et al. 2015).

An inverted solution that represents the investigated medium (x) can be described as:

$$\underline {\varvec{x}} = r({\varvec{x}}),$$
(1)

where the operator r, which can also be denoted by r:x → x, represents the relationship between the true medium and the solution. This problem can be linearized as:

$$\underline {\varvec{x}} = {{\varvec{Rx}}},$$
(2)

where R, which can also be denoted by R:x → x, is the resolution matrix (Backus and Gilbert 1968, 1970), a linear projection approximation of r:x → x (Eq. (1)). R has been widely used in solution appraisals in geophysics (e.g., Aki et al. 1977; An 2012; Aster et al. 2005; Menke 2015; Tarantola and Valette 1982; Wiggins 1972; Yao et al. 1999) and in other research areas (e.g., Lütkenhöner and Grave de Peralta Menendez 1997; Katamreddy and Yalavarthy 2012).

However, the properties of R remain poorly understood in practical problems, and considerable uncertainties and/or inconsistencies regarding R still exist. For example, early studies mostly analyzed the diagonal entries to evaluate the resolvability of the target medium (e.g., Aki et al. 1977; Day-Lewis et al. 2005; Wiggins 1972), even though the solution is actually related to all of the entries in R (Eq. (2)). R has mostly been applied to estimate the resolution length (or resolution width). This length should be retrieved from a given row in R (e.g., An 2012; Barmin et al. 2001; Crosson 1976), although the result appears to be similar to that from the corresponding column (e.g., Alumbaugh and Newman 2000; Miller and Routh 2007; Pilkington 2016). Why and when are they similar? An inverse problem is often solved using a reference model, whereby the inverted solution is not a model of the medium but rather a perturbation (Δx) of the reference model (xi). R can be provided in such inversions (e.g., Jackson 1972; Ren and Kalscheuer 2020), but it represents the Δx → Δx projection, not x → x. What is the relationship between this matrix and r:x → x? Can the matrix be used to estimate the resolution length of the solution x (= Δx + xi)? All of the above questions are addressed in this paper.

R is particularly useful in model appraisal, but it has various limitations. For example, some general factors (e.g., observational and data-processing errors) cannot be reflected by the matrix. The resolution estimated from such a limited matrix may be unrealistically high (Pilkington 2016). A matrix R:x → x that accounts for all of the factors in the complete process could overcome these limitations; however, no such matrix has been available to date.

Overall, R is a unique quantitative indicator of the reliability of a given solution, and it is also important in understanding the relationships between the solution and the observations, regularization, and other factors. However, various uncertainties and limitations regarding R remain. This paper reviews previous resolution matrices and clarifies both the significance and the properties of the matrices that often appear in practical inversions, in order to explain how to appropriately employ such a matrix in a given study. Furthermore, this paper clarifies the resolution matrices in nonlinear inversions and suggests a new resolution matrix that can include all of the factors in a linear or nonlinear problem. This study can therefore assist in the appropriate selection and implementation of a resolution matrix and provide the reader with a better understanding of both the quality and reliability of the solution and the relationships between the solution and all of the factors in the study system, which are important for further improving the system.

2 Resolution Matrices from Observations and Regularization Matrices

Resolution matrices that are derived from observations and regularization matrices have been widely applied in various research studies. This section reviews these resolution matrices and their significance. Furthermore, the properties of these matrices, which are important for their application, are clarified for the first time.

2.1 Variables Used

x, x̲: Real medium (or true model) vector; solution (or inverted model) vector (the underline denotes an inverted quantity, as in Eqs. (1) and (2)).

xi, x̲i: ith real-medium parameter; ith solution parameter.

ri,j: Entry at the ith row and jth column of matrix R.

ri,* (or r*,j): ith row (or jth column) vector of matrix R.

Σri,* or ΣiR: Sum of all of the entries in the ith row of matrix R.

2.2 Solutions of the Inverse Problem

The goal of a geophysical investigation is to directly retrieve the solution of the medium (xD) from observations (d̲ = d + δd) that are contaminated with errors (δd) via the inverse equation:

$$\underline {{\varvec{x}}}_{\text{D}} = g^{ - {\text{g}}} \underline {{\varvec{d}}} ,$$
(3)

which is based on the physical relationship between the medium parameters (m × 1 vector x; [x1, x2,…, xm]T) and observational data (n × 1 vector d; [d1, d2,…, dn]T), with the latter defined as:

$${{\varvec{d}}} = g({{\varvec{x}}}),$$
(4)

the operator g in Eq. (4) is often not invertible, such that a generalized inverse of g, g–g, is used, as in Eq. (3).

Equation (3) can be expressed in linear form as:

$$\underline {{\varvec{x}}}_{\text{D}} = {{\varvec{G}}}^{ - {\text{g}}} \underline {{\varvec{d}}} ,$$
(5)

and rewritten as:

$${{\varvec{d}}} = {{\varvec{Gx}}},$$
(6)

the n × m matrix G, which is normally termed the observation matrix, is composed of the sensitivities of d with respect to x. The generalized inverse of G, G–g (e.g., Lawson and Hanson 1995; Moore 1920; Penrose 1955; Tan 2017; Tarantola and Valette 1982), can be either a right inverse:

$${{\varvec{G}}^{ - {\text{g}}} {\text{ = }}{\varvec{G}}^{\text{T}} \left( {{\varvec{GG}}^{\text{T}} } \right)^{ - 1}}, \quad {n < m},$$
(7)

or a left inverse:

$$\begin{array}{*{20}c} {{{\varvec{G}}}^{ - {\text{g}}} { = }\left( {{{\varvec{G}}}^{\text{T}} {{\varvec{G}}}} \right)^{ - 1} {{\varvec{G}}}^{\text{T}} ,} & {n \ge m} \\ \end{array} .$$
(8)

G–g can also be obtained via singular value decomposition (SVD) (Golub and Kahan 1965; Golub and Reinsch 1970; Varah 1973) or some form of matrix factorization (e.g., Gentle 2007; Golub 1965). The G–g matrices calculated via these methods are closely related or exactly equivalent.
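For readers who want to experiment numerically, the Python sketch below (a minimal illustration, not code from this study) computes the generalized inverse of a toy full-row-rank G via the right inverse of Eq. (7) and via the SVD-based pseudoinverse, and the left inverse of Eq. (8) for an overdetermined case; the toy matrix values are arbitrary.

import numpy as np

# Toy underdetermined observation matrix (n = 3, m = 6); values are arbitrary
# and unrelated to Example 1 in the text.
rng = np.random.default_rng(0)
G = rng.normal(size=(3, 6))

# Right inverse, Eq. (7): requires G to have full row rank (n < m).
G_right = G.T @ np.linalg.inv(G @ G.T)

# Pseudoinverse via SVD; for a full-row-rank G it coincides with Eq. (7).
G_pinv = np.linalg.pinv(G)
print(np.allclose(G_right, G_pinv))          # True for this toy G

# Left inverse, Eq. (8), for the overdetermined case (n >= m): here applied
# to G.T, which has full column rank.
H = G.T
H_left = np.linalg.inv(H.T @ H) @ H.T
print(np.allclose(H_left, np.linalg.pinv(H)))  # also True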

Figure 1a shows an example (Example 1) of a one-dimensional (1-D) underdetermined inverse problem that mimics the relationship between distance, slowness (x), and travel time (d). The problem is described by 10 linear equations (Eq. (S1) in the supporting information), with 100 (m = 100) unknown parameters in x and 10 (n = 10) observations (d). The coefficient of the ith parameter (xi) in the jth equation (for dj) is the product of the length of the segment of xi that is traveled by the jth ray (Fig. 1a) and the sensitivity of dj with respect to xi. All of the coefficients are stored in G (Fig. 1b). Several coefficients in Eq. (S1) (or the entries in G) for the second and ninth observations are greater than one, which means that higher sensitivities are assigned to these parameters, given that the segment lengths of all of the parameters are equal to one (Fig. 1a). Figure 1c shows a synthetic x and its pseudo-inverse solution xD. The synthetic error-free observations d (δd = {0}) and the predictions of xD (dD) are shown in Fig. 1d.

Fig. 1

Inverse problem Example 1. This is a simple 1-D ray-propagation (linear) problem that is defined by either d = Gx or Eq. (S1). Arrow lines in (a) illustrate the ray paths. The observation vector d contains the travel times for all ten rays (dj; j = 1, …, 10). The vector x includes the slowness (reciprocal of speed) in each unit segment to be resolved (xi; i = 1, …, 100). The matrix G in (b) contains the combinations of the distance segments travelled by the ten rays and their synthetic sensitivities. Several parameters are set to a high sensitivity in the second and ninth observations, which simulates a study with parameters that possess different sensitivities. c Synthetic model x (circles) and solutions xD, x(I), and x(L1) (lines) obtained via generalized inversions, with zeroth- (I) and first-order (L1) Tikhonov regularizations included in the latter two inversions. d The synthetic observation data d (circles) from x, and the predictions dD, d(I), and d(L1) (lines) from the solutions xD, x(I) and x(L1), respectively

Both the data coverage (observation distribution) and sensitivities, which are stored in G, influence the solution in the synthetic example (Fig. 1). For example, the solution parameters from x1 to x10 are constrained by only one observation (the first travel path; the top left arrow in Fig. 1a), such that the nonzero entries for all ten parameters in the first row of G (Fig. 1b) and the coefficients in the first row of Eq. (S1) are the same and equal to one. Similarly, the parameters from x11 to x20 are also constrained by one observation (the second path), but the nonzero entries for the ten parameters in the second row of G (Fig. 1b) and the coefficients in the second row of Eq. (S1) are different. The resultant x1–x10 values in xD (Fig. 1c) are the same and equal to the average of the synthetic model parameters x1–x10 (circles in Fig. 1c), whereas the resultant x11–x20 values are quite different, both from each other and their respective synthetic values (x11–x20). The differences in x11–x20 are caused by different sensitivities because they all have the same path segments. Parameters x21–x40 and x41–x60 are constrained by two paths, but their path-overlapping patterns (Fig. 1a and b) differ. The differences in x21–x60 in xD (Fig. 1c) are related to the path coverage. Parameters x61–x90 are constrained by several observations. x91–x100 are not constrained by any observations, such that they are all equal to zero in xD.

However, most geophysical inverse problems are ill-posed (i.e., there are not sufficient observations to obtain a unique and stable solution), and the generalized solution xD, such as that in Example 1 (Fig. 1c), is not physically rational. Furthermore, large discrepancies may exist between the real model x and solution xD, especially for parameters with poor or no observation coverage (Fig. 1a–c), even though there is a good fit between the observation data and model predictions (Fig. 1d). Additional artificial constraints (often called regularization) (e.g., Aster et al. 2005; Benning and Burger 2018; Engl et al. 2000; Levenberg 1944; Menke 1989; Tikhonov 1963) on the model parameters must therefore be included during the inversion to obtain a physically rational solution. The regularization-based forward equation then becomes:

$${{\varvec{b}}} = {{\varvec{Ax}}},$$
(9)

where the nb × m matrix A and nb × 1 vector b are:

$$\begin{array}{*{20}c} {{{\varvec{A}}} = \left[ {\begin{array}{*{20}r} \hfill {{\varvec{G}}} \\ \hfill {{\varvec{C}}} \\ \end{array} } \right]} & {{\text{and}}} & {{{\varvec{b}}} = \left[ {\begin{array}{*{20}r} \hfill {{\varvec{d}}} \\ \hfill {{\varvec{c}}} \\ \end{array} } \right]} \\ \end{array} ,$$
(10)

respectively, which contain an nc × m (nc = nb − n) matrix C and an nc × 1 vector c, both of which are related to the regularization.

Tikhonov regularization (Levenberg 1944; Tikhonov 1963) is widely used in geophysical inversions (e.g., Aster et al. 2005; Constable et al. 1987; Menke 1989). C is denoted by λLn in nth-order Tikhonov regularization, and c (Eq. (10)) is a zero vector ({0}). The factor λ is the regularization parameter, which balances the contributions of the observations and the regularization in the inversion, i.e., the fit to the observational data d versus the fit to the regularization vector c. Tests that employ ad hoc methods (e.g., Craven and Wahba 1978; Hansen 1992; Morozov 1984) are often used to determine λ. The matrix L0 for zeroth-order Tikhonov regularization (damping regularization), which minimizes the model (Levenberg 1944), is the identity matrix I. Ln (n > 0; e.g., L1 in Eq. (S2) in the supporting information) is a zero-row-sum band matrix. L1 regularization (flatness regularization) flattens the model by minimizing the first-order gradient of the model. Ln Tikhonov regularization applies uniform a priori constraints to all of the solution parameters. Therefore, regularization (C = λLn) with an optimal weighting factor λ produces the optimal average regularizing effect for all of the parameters. However, if C contains a diagonal matrix W (= diag(w1, w2,…)) (i.e., C = λWLn), then the regularization is heterogeneous and spatially variant across the model (e.g., An 2020; Katamreddy and Yalavarthy 2012; Pogue et al. 1999; Sanny et al. 2018).
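As a concrete illustration of Eqs. (9) and (10), the Python sketch below assembles the regularized system for zeroth- or first-order Tikhonov regularization; the first-difference form of L1 used here is one common convention and is only assumed to correspond to the L1 of Eq. (S2).

import numpy as np

def tikhonov_system(G, d, lam, order=1):
    """Stack observations and Tikhonov regularization into b = A x (Eqs. 9-10).

    order=0 uses L0 = I (damping); order=1 uses a first-difference matrix
    (a common form of the flatness operator).  The regularization vector c
    is a zero vector.
    """
    n, m = G.shape
    L = np.eye(m) if order == 0 else np.diff(np.eye(m), axis=0)
    C = lam * L                                    # C = lambda * Ln
    A = np.vstack([G, C])                          # Eq. (10)
    b = np.concatenate([d, np.zeros(C.shape[0])])  # c = {0}
    return A, b, L

Each row of this first-difference L1 sums to zero, which is the zero-row-sum property exploited in Sect. 2.4.3.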

Regularization at least makes A a full-column-rank matrix, such that its generalized (left) inverse A−g, which is an m × nb matrix, can be uniquely obtained. The solution (x) of an inversion using regularization is:

$$\underline {{\varvec{x}}} = {{\varvec{A}}}^{ - {\text{g}}} \underline {{\varvec{b}}} .$$
(11)

where the vector b in Eq. (10) is now formed from the measured (error-contaminated) data rather than from the error-free data d. x can also be obtained via a least-squares or minimum-norm inversion using the objective function:

$$\min \left\| {\left[ {\begin{array}{*{20}c} {{\varvec{G}}} \\ {{\varvec{C}}} \\ \end{array} } \right]{{\varvec{x}}} - \left[ {\begin{array}{*{20}c} {{\varvec{d}}} \\ {{\varvec{c}}} \\ \end{array} } \right]} \right\|^2 .$$
(12)

When Tikhonov regularization (C = λL, c = {0}) is used, a truncated form of A−g (m × n matrix A−t) can be obtained via (e.g., Aster et al. 2005; Barmin et al. 2001; Crosson 1976):

$${{\varvec{A}}}^{ - {\text{t}}} = ({{\varvec{G}}}^{\text{T}} {{\varvec{G}}} + \lambda^2 {{\varvec{L}}}^{\text{T}} {{\varvec{L}}})^{ - 1} {{\varvec{G}}}^{\text{T}} .$$
(13)

x is then given by:

$$\underline {{\varvec{x}}} = {{\varvec{A}}}^{ - {\text{t}}} \underline {{\varvec{d}}} ,$$
(14)

which is the same as that in Eq. (11). x in Example 1 (Fig. 1a), which uses first-order Tikhonov regularization, is shown in Fig. 1c. An optimal factor λ of one is selected by Morozov’s discrepancy principle (Morozov 1984) for the inversion.
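For readers who wish to reproduce this type of inversion, the Python sketch below implements Eqs. (13) and (14), together with a crude grid search in the spirit of Morozov's discrepancy principle (retain the strongest regularization whose data misfit does not exceed the estimated noise level); the grid search is an illustrative assumption, not the selection procedure used in this study.

import numpy as np

def truncated_inverse(G, L, lam):
    """A^-t = (G^T G + lam^2 L^T L)^-1 G^T  (Eq. 13)."""
    return np.linalg.solve(G.T @ G + lam**2 * (L.T @ L), G.T)

def regularized_solution(G, d_obs, L, lam):
    """x = A^-t d  (Eq. 14)."""
    return truncated_inverse(G, L, lam) @ d_obs

def discrepancy_lambda(G, d_obs, L, noise_norm, lambdas):
    """Return the largest candidate lambda whose data misfit stays within the
    estimated noise norm (a simple reading of the discrepancy principle);
    lambdas is an iterable of positive trial values."""
    best = None
    for lam in sorted(lambdas):
        x = regularized_solution(G, d_obs, L, lam)
        # The data misfit grows as lam increases, so the last lam that still
        # fits the data within the noise level is kept.
        if np.linalg.norm(G @ x - d_obs) <= noise_norm:
            best = lam
    return best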

The resultant x via Eq. (11), which is obtained using regularization, is generally more rational than xD. For example, x (denoted as x(L1)) in Example 1 (Fig. 1c) is closer to the synthetic model than xD, such that x is preferred over xD (Eq. (5)) in practical inversions (i.e., the solution provided in a practical ill-posed inversion is x rather than xD).

If the errors (δd) in the measured observation data d̲ (= d + δd) are known, then the solution x in Eq. (11) becomes:

$$\underline {{\varvec{x}}} = {{\varvec{A}}}^{ - g} {{\varvec{b}}} + \delta {{\varvec{x}}}_{\text{d}} ,$$
(15)

where:

$$\delta {{\varvec{x}}}_{\text{d}} = {{\varvec{A}}}^{ - g} \left[ {\begin{array}{*{20}r} \hfill {\delta {{\varvec{d}}}} \\ \hfill 0 \\ \end{array} } \right] = {{\varvec{A}}}^{ - {\text{t}}} \delta {{\varvec{d}}}.$$
(16)

It is noted that δxd is only the contribution of the observational errors to the solution and is therefore not the solution error. The solution error or residual, δx (= x̲ − x), is related to both δd and the other factors in the x → x process.

2.3 Resolution Matrices from the Observations and Regularization Matrices

The process of obtaining the solution of a true medium (x) is a projection (r:x → x) from the medium to the solution, which can be described using Eq. (1) (Fig. 2a). If the projection is linear, then the solution can be written as a linear regression equation (Fig. 2a):

$$\underline {{\varvec{x}}} = {{\varvec{Rx}}} + \delta {{\varvec{x}}}_{{\text{of}}} ,$$
(17)

where R is the slope of the regression and δxof is a constant offset. If δxof is ignored, then the linear regression becomes the linear projection of Eq. (2) (Fig. 2a), where R is the so-called resolution (Backus and Gilbert 1968, 1970) or projection matrix. For a nonlinear problem, either R or R:x → x can be considered a linear approximation of r:x → x (Fig. 2a).

Fig. 2

Illustration of resolution matrices a in a general study and b in a study with offset error. The model x (= [x1]) in the illustration contains one parameter (x1). A nonlinear relation r:x → x between the solution and the true model (x) is approximated by a linear projection (x = Rx, Eq. (2)) and a linear regression (x = Rx + δxof, Eq. (17)). R is the slope of the linear relations

In practice, however, this matrix is obtained via matrix operations, and several different resolution matrices can result (An 2012). If the observational errors (δd) in the data d in Eq. (10) are ignored, then the measured data equal the error-free data d. Replacing the measured data in Eq. (5) with d from Eq. (6) causes the projection from x to xD (x → xD) to become:

$$\underline {{\varvec{x}}}_{\text{D}} = {{\varvec{R}}}_{\text{D}} {{\varvec{x}}},$$
(18)

where the resolution matrix RD (Table 1) is of the following form (e.g., Jackson 1972; Menke 1989; Wiggins 1972):

$${{\varvec{R}}}_{\text{D}} = {{\varvec{G}}}^{ - {\text{g}}} {{\varvec{G}}}.$$
(19)
Table 1 Resolution matrices

The transformation from x to x (x → x) is obtained by inserting Eq. (9) into Eq. (11):

$$\underline {{\varvec{x}}} = {{\varvec{R}}}_{\text{I}} {{\varvec{x}}},$$
(20)

where resolution matrix RI (Table 1) is of the form (An 2012):

$${{\varvec{R}}}_{\text{I}} = {{\varvec{A}}}^{ - g} {{\varvec{A}}}.$$
(21)

When the vector c is a zero vector (e.g., in Tikhonov regularization) and d in Eq. (11) is replaced by that in Eq. (6), the transformation from x to x (x → x) is:

$$\underline {{\varvec{x}}} = {{\varvec{R}}}_{\text{H}} {{\varvec{x}}},$$
(22)

where the resolution matrix (RH) (Table 1) takes the form:

$$\begin{array}{*{20}c} {{{\varvec{R}}}_{\text{H}} = {{\varvec{A}}}^{ - g} {{\varvec{B}}},} & {{{\varvec{B}}} = \left[ {\begin{array}{*{20}c} {{\varvec{G}}} \\ 0 \\ \end{array} } \right]} \\ \end{array},$$
(23)

if the truncated form A–t is used, then Eq. (23) becomes (An 2012):

$${{\varvec{R}}}_{\text{H}} = {{\varvec{A}}}^{ - {\text{t}}} {{\varvec{G}}},$$
(24)

if A−t is not truncated from A−g and instead calculated from Eq. (13), which employs Tikhonov regularization of the form λL, then RH becomes (e.g., Aster et al. 2005; Barmin et al. 2001; Boschi 2003; Crosson 1976):

$${{\varvec{R}}}_{\text{H}} = ({{\varvec{G}}}^{\text{T}} {{\varvec{G}}} + \lambda^2 {{\varvec{L}}}^{\text{T}} {{\varvec{L}}})^{ - 1} {{\varvec{G}}}^{\text{T}} {{\varvec{G}}}.$$
(25)

The three resolution matrices for Example 1 are shown in Fig. 3a–c: RD, RH for the inversion using the regularization matrix I (RH(I)), and RH for the inversion using the regularization matrix L1 (RH(L1)).

Fig. 3

The direct resolution matrix a RD, and hybrid resolution matrices b RH(I) and c RH(L1) for the inverse problem in Fig. 1a. A regularization parameter (λ) of one is used in the inversions. RH(I) and RH(L1) are for the inversions that employ I and L1 regularization matrices, respectively. d, e The 48th row and 48th column vectors of the matrices. The x48 parameter is constrained by the fifth observation, which overlaps with the sixth observation (Fig. 1a); the parameters from x40 to x60 are influenced by these two observations. The entries in row r48,* of RD (d) are zeros, with the exception of r48,41–r48,55 (positive) and r48,56–r48,60 (negative)
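The matrices compared in Fig. 3 can be reproduced for any toy problem with a few lines of Python; the sketch below (an illustration with arbitrary test matrices, not the code used for Example 1) evaluates RD from Eq. (19) via the pseudoinverse and RH from Eqs. (13) and (24).

import numpy as np

def resolution_matrices(G, L, lam):
    """Direct and hybrid resolution matrices (Eqs. 19 and 24/25)."""
    R_D = np.linalg.pinv(G) @ G                               # Eq. (19)
    A_t = np.linalg.solve(G.T @ G + lam**2 * (L.T @ L), G.T)  # Eq. (13)
    R_H = A_t @ G                                             # Eq. (24)
    return R_D, R_H

# Toy usage with a random G and a first-difference L1 (values illustrative only)
rng = np.random.default_rng(1)
G = rng.normal(size=(4, 10))
L1 = np.diff(np.eye(10), axis=0)
R_D, R_H = resolution_matrices(G, L1, lam=1.0)
print(R_D.shape, R_H.shape)   # both (10, 10)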

If G is a full-column-rank matrix and no regularization is used, then xD equals x, and RD, RI, and RH are all identity matrices. Otherwise, xD and x are different, and the three resolution matrices differ. Regularization needs to ensure that A is a full-column-rank matrix, such that RI is still an identity matrix. Owing to regularization, the practical solution is x rather than xD, and the resolution matrix for the x → x projection is RH rather than RD. RD has therefore received little attention in previous regularized inversions. However, RD is still very important for understanding the reliability of the solution in such inversions, as explained in the "Resolvability and constrainability from the resolution matrix" section.

Even though RD and RH are often different, they are both commonly called the model resolution matrix, which may confuse readers. The notations suggested by An (2012) for the matrices are adopted here for clarity, where RD is the direct resolution matrix, RI is the regularized resolution matrix, and RH is the hybrid resolution matrix.

2.4 Properties of the Resolution Matrices

2.4.1 RD from only the Observation Matrix

Truncated SVD of G allows RD to be written in the form (e.g., Aster et al. 2005; Jackson 1972; Wiggins 1972):

$${{\varvec{R}}}_{\text{D}} = {{\varvec{V}}}_\rho {{\varvec{V}}}_\rho^{\text{T}} ,$$
(26)

where Vρ (= {vi,j}m×ρ, ρ = rank(G)) is a column-orthonormal matrix composed of right singular vectors of G. Equation (26) indicates that RD is a Gram matrix (or Gramian), i.e., a matrix created by multiplying a matrix by its own transpose, as in Eq. (26).

The Gram matrix (e.g., Gentle 2007) RD (e.g., Fig. 3a) is symmetric (Table 2), with its rank (rank(RD)) equal to both its trace (trace(RD)) and the rank of Vρ (rank(Vρ) = ρ) (Eq. (26)). Here, rank(RD) equals ρ because rank(Vρ) is equal to rank(G). The diagonal elements of RD are nonnegative, but the off-diagonal elements of RD can be negative, unless VρT is a full-column-rank matrix (equivalently, Vρ is a full-row-rank matrix). However, Vρ for an underdetermined problem is not a full-row-rank matrix because ρ is smaller than m, such that the off-diagonal entries of RD for an underdetermined problem can be negative.

Table 2 Properties of typical resolution matrices
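These Gram-matrix properties are easy to verify numerically; the fragment below (an illustrative check with a random underdetermined G) builds RD from the truncated SVD as in Eq. (26) and tests its symmetry, the trace–rank equality, and the sign behavior of its entries.

import numpy as np

rng = np.random.default_rng(2)
G = rng.normal(size=(5, 12))              # underdetermined toy G (rank 5)

U, s, Vt = np.linalg.svd(G, full_matrices=False)
rho = np.linalg.matrix_rank(G)
V_rho = Vt[:rho].T                        # m x rho right singular vectors

R_D = V_rho @ V_rho.T                     # Eq. (26)
print(np.allclose(R_D, R_D.T))            # symmetric (Gram matrix)
print(np.isclose(np.trace(R_D), rho))     # trace(R_D) = rank(R_D) = rho
print((np.diag(R_D) >= 0).all())          # diagonal entries are nonnegative
print(R_D.min())                          # off-diagonal entries may be negative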

A negative ri,j entry appears in RD (Table 2) when two rows of G share a pattern like that of the band-matrix rows below (with 0 and nonzero numbers a to d):

$$\left[ {\begin{array}{*{20}c} {...}&a&{...}&b&{...}&0&{...} \\ {...}&0&{...}&c&{...}&d&{...} \\ \end{array} } \right].$$
(27)

The entries in G–g that are related to the nonzero entries a and d in G have signs opposite to those in G, which causes the negative ri,j. This happens in G when xi and xj are constrained by different observations that also constrain another parameter in common. For example, x48 and x58 are constrained by the fifth and sixth observations, respectively (Fig. 1a), but the two observations also constrain the solution parameters x52–x55 in common. Consequently, r48,58 in RD is negative (Fig. 3a).

2.4.2 RH with Uniform Regularization Using λI

When uniform zeroth-order Tikhonov regularization (λL0 or λI) is used, the SVD of G allows the generalized inverse A–t (Eq. (13)) to be written as (e.g., Menke 2012):

$${{\varvec{A}}}^{ - {\text{t}}} = {{\varvec{V}}}{(}{{\varvec{SS}}}{\bf{ + }}\lambda^2 {{\varvec{I}}}{)}^{ - {1}} {{\varvec{SU}}}^{\text{T}} ,$$
(28)

where U is a unitary matrix containing the left singular vectors of G, and S is a nonnegative diagonal matrix containing the singular values of G (si). The resolution matrix RH(λI) (Eq. (25)) can then be written as:

$$\begin{aligned} {{\varvec{R}}}_{\text{H}} (\lambda {{\varvec{I}}}) &= {{\varvec{V}}}{(}{{\varvec{SS}}}{\bf{ + }}\lambda^2 {{\varvec{I}}}{)}^{ - {1}} {{\varvec{SU}}}^{\text{T}} {{\varvec{USV}}}^T , \\ &= {{\varvec{V}}}{(}{{\varvec{SS}}}{\bf{ + }}\lambda^2 {{\varvec{I}}}{)}^{ - {1}} {{\varvec{SSV}}}^T \\ &= {{\varvec{VFV}}}^{\text{T}} = {{\varvec{V}}}_\rho {{\varvec{F}}}_\rho {{\varvec{V}}}_\rho^{\text{T}} \\ \end{aligned}$$
(29)

where Fρ (= diag(f1, f2, …, fρ)) (Aster et al. 2005) is the truncated form of F, with the positive constants (fi):

$$f_i = \frac{s_i^2 }{{s_i^2 + \lambda^2 }},$$
(30)

The positive diagonal matrix Fρ can be written as the product of Eρ (= diag(f1^1/2, f2^1/2, …, fρ^1/2)) and EρT. Equation (29) can then be written as:

$${{\varvec{R}}}_{\text{H}} (\lambda {{\varvec{I}}}) = {{\varvec{V}}}_\rho {{\varvec{E}}}_\rho {{\varvec{E}}}_\rho^{\text{T}} {{\varvec{V}}}_\rho^{\text{T}} = ({{\varvec{V}}}_\rho {{\varvec{E}}}_\rho )({{\varvec{V}}}_\rho {{\varvec{E}}}_\rho )^{\text{T}} ,$$
(31)

where:

$${{\varvec{V}}}_\rho {{\varvec{E}}} = \{ v_{i,j} f_j^{1/2} \} .$$
(32)

Equation (31) indicates that RH(λI) is a Gram matrix (Table 2), like RD. However, when a spatially variant regularization of the form λWI is used, the matrix RH(λWI) cannot be written as the product of a matrix and its own transpose, and it is therefore not Gramian.

RH(λI) (e.g., Fig. 3b) is symmetric, like RD. RH(λI) has a rank that is equal to rank(VρEρ) (Eq. (31)). The diagonal entries of RH(λI) are nonnegative, but the other entries can be negative. Equation (30) indicates that fj^1/2 lies in the range (0,1). Equation (32) indicates that a given column (e.g., the jth column) of VρEρ equals the same column vector of Vρ multiplied by fj^1/2. All of the entries in VρEρ are therefore closer to zero than the corresponding entries in Vρ, as fj^1/2 is a positive value that is less than one. All of the entries in RH(λI) consequently have weaker intensities than those in RD, but they possess similar intensity patterns, as observed in comparisons of Fig. 3a and b and of Fig. 3d and e. Furthermore, trace(RH(λI)) is smaller than trace(RD), such that the resolvability of x decreases after regularization using λI, as explained in the "Resolvability and constrainability from the resolution matrix" section.
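Equations (29)–(31) can also be verified directly: the toy fragment below compares RH(λI) computed from Eq. (25) with the filter-factor form VρFρVρT, where the fi are given by Eq. (30); the test matrix is arbitrary.

import numpy as np

rng = np.random.default_rng(3)
G = rng.normal(size=(5, 12))
lam, m = 1.0, G.shape[1]

U, s, Vt = np.linalg.svd(G, full_matrices=False)
f = s**2 / (s**2 + lam**2)                            # filter factors, Eq. (30)
R_H_svd = (Vt.T * f) @ Vt                             # V_rho F_rho V_rho^T, Eq. (29)

A_t = np.linalg.solve(G.T @ G + lam**2 * np.eye(m), G.T)   # Eq. (13) with L = I
R_H = A_t @ G                                              # Eq. (25)

print(np.allclose(R_H, R_H_svd))                      # the two forms agree
print(np.allclose(R_H, R_H.T))                        # symmetric (Gram matrix)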

2.4.3 RH Using a Zero-Row-Sum Regularization Matrix

The matrices of derivative regularizations, e.g., the high-order Tikhonov regularization matrices λLn and λWLn (n > 0), are zero-row-sum ns × m matrices (S0) (ns > 0), such that

$${{\varvec{S}}}^0 {\bf{1}} = {\bf{0,}}$$
(33)

where 1 (= {1}m×1) is a vector with all elements equal to one and 0 is a zero vector. Regardless of the number (ns) of rows in S0, the relation below:

$$\begin{aligned} ({{\varvec{G}}}^{\text{T}} {{\varvec{G}}} + \lambda^2 ({{\varvec{S}}}^0 )^{\text{T}} {{\varvec{S}}}^0 {\bf{)1}} &= {{\varvec{G}}}^{\text{T}} {{\varvec{G}}}{\bf{1}} + \lambda^2 ({{\varvec{S}}}^0 )^{\text{T}} ({{\varvec{S}}}^0 {\bf{1}}{)} \\ &= {{\varvec{G}}}^{\text{T}} {{\varvec{G}}}{\bf{1}} + \lambda^2 ({{\varvec{S}}}^0 )^{\text{T}} {\bf{0}} \\ &= {{\varvec{G}}}^{\text{T}} {{\varvec{G}}}{\bf{1}} \\ \end{aligned},$$
(34)

exists. If the regularization matrix (C, Eq. (10)) is a zero-row-sum matrix (S0), then Eq. (34) leads to the following relation for the resolution matrix RH (Eqs. (24) and (25)):

$$\begin{aligned} {{\varvec{R}}}_{\text{H}} {\bf{1}} &= {{\varvec{A}}}^{ - {\text{t}}} {{\varvec{G}}}{\bf{1}}{\bf{}} \\ &= ({{\varvec{G}}}^{\text{T}} {{\varvec{G}}} + \lambda^2 ({{\varvec{S}}}^0 )^{\text{T}} {{\varvec{S}}}^0 )^{ - 1} {{\varvec{G}}}^{\text{T}} {{\varvec{G}}}{\bf{1}} \\ &= ({{\varvec{G}}}^{\text{T}} {{\varvec{G}}} + \lambda^2 ({{\varvec{S}}}^0 )^{\text{T}} {{\varvec{S}}}^0 )^{ - 1} ({{\varvec{G}}}^{\text{T}} {{\varvec{G}}} + \lambda^2 ({{\varvec{S}}}^0 )^{\text{T}} {{\varvec{S}}}^0 ){\bf{1}} \\ &= {\bf{1}} \\ \end{aligned},$$
(35)

therefore, when the regularization matrix is a zero-row-sum matrix (S0), the resolution matrix RH is a one-row-sum matrix S1 (i.e., S1·1 = 1). This can be summarized as:

$$\left[ {\begin{array}{*{20}c} {{\varvec{G}}} \\ {{{\varvec{S}}}^0 } \\ \end{array} } \right]^{ - {\text{g}}} \left[ {\begin{array}{*{20}c} {{\varvec{G}}} \\ {\bf{0}} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {{\varvec{G}}} \\ {{{\varvec{S}}}^0 } \\ \end{array} } \right]^{ - {\text{t}}} {{\varvec{G}}} = {{\varvec{S}}}^1 ,$$
(36)

where 0 is a zero matrix.

The high-order Tikhonov regularization matrices λLn and λWLn (n > 0) (e.g., flatness and smoothness) are both zero-row-sum S0 matrices. Therefore, RH(λLn) and RH(λWLn) are S1 matrices (i.e., all of the rows in RH(λLn) and RH(λWLn) sum to one; Table 2). All of the rows in RH(L1) in Fig. 3c sum to one, as shown in Fig. 4a.

Fig. 4

a Row-vector sums and b main-diagonal elements of the resolution matrices RD and RH in Fig. 3

The resolution matrix RH(S0) is similar to a stochastic matrix (a square matrix with nonnegative elements and each row summing to one), but the entries in RH (or S1) can be negative for the same reason given above for RD. The S0 matrix (often a band matrix), or the matrix combining S0 with G (Eq. (36), or A in Eq. (10)), often includes two rows like those in Eq. (27); i.e., the regularization always yields two parameters (e.g., x1 and x11) that are constrained by different rows, one (x1) by an observation (the first row of G) and the other (x11) by another observation or a regularization row (the tenth row of L1 in Eq. (S2)), with the two rows also constraining a third parameter (x10) in common. The entries in A−t and A−g that are related to these parameters often have signs opposite to those in G, and the r1,11 entry in RH(L1) (Fig. 3c) is consequently negative. Therefore, the application of Ln regularization can cause more negative entries in RH(Ln) (e.g., Fig. 3c) than in RD (Fig. 3a), and RH is thus not a stochastic matrix.

A one-row-sum matrix RH(S1) implies that one is an eigenvalue of the projection RH, such that there exists an equation:

$${{\varvec{R}}}_{\text{H}} {{\varvec{x}}}_{\text{p}} = 1{{\varvec{x}}}_{\text{p}},$$
(37)

Eqs. (22) and (37) indicate that such a medium (xp) can be fully resolved, with the solution equal to the medium (xp = RHxp), even though xp is not the medium that is currently being measured. In contrast, it is impossible for all of the row sums in RH(λI) to equal one, which demonstrates that higher-order Tikhonov regularization is superior to damping (or λI) regularization from the viewpoint of row sums.
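Both results of this subsection (a zero-row-sum regularization matrix yields a one-row-sum RH, and a uniform model is then fully recovered) can be confirmed with a short toy computation; the first-difference L1 below is an assumed form of the flatness operator, and the test values are arbitrary.

import numpy as np

rng = np.random.default_rng(4)
G = rng.normal(size=(5, 12))
L1 = np.diff(np.eye(12), axis=0)           # zero-row-sum flatness operator
lam = 1.0

A_t = np.linalg.solve(G.T @ G + lam**2 * (L1.T @ L1), G.T)
R_H = A_t @ G

print(np.allclose(L1.sum(axis=1), 0.0))    # S0 1 = 0, Eq. (33)
print(np.allclose(R_H.sum(axis=1), 1.0))   # R_H 1 = 1, Eq. (35)

x_p = np.full(12, 3.0)                     # a uniform (constant) model
print(np.allclose(R_H @ x_p, x_p))         # R_H x_p = x_p, Eq. (37)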

2.4.4 RH Using Mixed Regularization

If the regularization mixes higher-order regularizations (e.g., L2 and L3), then the combined regularization matrix is still an S0 matrix, and RH is an S1 matrix. However, if the regularization mixes damping and a higher-order Ln (n > 0) (e.g., Sigloch 2011; Tewarson 1977), then the mixed regularization matrix C is neither a diagonal matrix nor a zero-row-sum matrix. Therefore, the new RH is neither a symmetric (Gram) matrix nor a one-row-sum matrix.

2.5 Significance of the Resolution Matrices

2.5.1 Row Vector = Content Function of the Medium

Equation (2) highlights that a solution parameter xi can be expressed as a weighted sum of all of the model parameters in x (or [x1, x2, …xm]T):

$$\underline x_i = \sum_{j = 1}^m {r_{i,j} x_j } ,$$
(38)

where the ri,j entry of matrix R plays a role in weighting the jth model parameter xj in the summation. The ri,* row vector of R therefore acts like an averaging vector (Backus and Gilbert 1968), with the entries in ri,* representing the accurate contents (or contributions) of all of the medium parameters to (or in) the ith solution parameter xi. Therefore, ri,* can also be termed the content (or contribution) function of the medium in xi (Table 3).

Table 3 Vectors of resolution matrix

2.5.2 Column Vector = Spreading Function of the Medium

The ri,j entry signifies the contribution of the medium parameter xj to the solution parameter xi, such that all of the entries in the jth column vector (r*,j) (Table 3) correspond to the contributions of the jth model parameter (xj) to all of the solution parameters, i.e., the spread of xj into those parameters. This vector has been considered the Green's function (also termed the point spread function (Smith 1997) or impulse response function) of xj in the solution.

2.5.3 Significance of the Matrices

The matrices RD, RI, and RH are all slopes of the linear projection from x to x, but they carry different significance (Table 1) for the transformation. Regularization at least makes matrix A full column rank (left-invertible); i.e., RI should be an identity matrix. If the x → x projection matrix RI is an identity matrix, then a unique solution (x) can be obtained from a given G and C. Otherwise, some of the model parameters remain poorly constrained, and further regularization should be employed.

RD is produced from G alone (Eq. (19)). Therefore, RD represents the effects of, and contributions from, the given observations on the solution, regardless of whether regularization is used.

RH only reflects the x → x projection, with no consideration of other factors (e.g., errors) in the matrix. RH can therefore evaluate the reliability of x. Furthermore, the construction of RH as a mixture of the observations (G) and regularization (λC in A) (Eq. (23)) means that it reflects some combination of the observational and regularization effects in the solution. The variations or differences between RH and RD therefore represent the effects of regularization on the inversion, as RD only reflects the observational effects on the solution.

All three matrices, especially RH and RD, are therefore essential for understanding the inversion and its result, such as the resolvability of x, the uniqueness and reliability of x, and the effects of regularization.

2.5.4 Column Vector Variations Due to Regularization Changes

The role of regularization on the projection from x to x can be revealed via a comparison of the resolution matrices RD and RH. One entry (ri,j) of either RD in Eq. (19) or RH in Eq. (24) can be written as:

$$r_{i,j} = \sum_k {u_{i,k} g_{k,j} } ,$$
(39)

where ui,k represents an entry of either G−g or A−t, and gk,j is an entry of G. RH (Eq. (24)) can be considered RD (Eq. (19)) with the ui,k values changed by replacing G−g with A−t. Equation (39) indicates that if the g*,j column vector of G is given, then a variation in ui,k only influences the jth column vector (r*,j) of R. Therefore, the addition of regularization (C) in A (Eq. (10)) allows the jth column vector in RH to be considered a function of only the jth column in RD (with no relationship arising among the other columns). Regularization essentially changes the spread functions from RD to RH, such that the spread function is sensitive to regularization (Table 3).

These column-vector variations due to regularization changes are well illustrated by comparing RD and RH(L1) (Fig. 5c and d, respectively) for a linear inverse problem example (Example 2) with three-point observations (G in Fig. 5a) and 50 unknowns in x (Fig. 5a and b). The L1 regularization matrix and λ = 1 are used in the inversion. If a column in RD (e.g., r*,5) is all zeros (Fig. 5c), then the corresponding column vector in RH (r*,5) (Fig. 5d) is also all zeros. In contrast, the tenth column in RD has one nonzero entry (r10,10), and the tenth column vector in RH(L1) (r*,10) (Fig. 5c and d) has nonzero entries around the r10,10 entry. The variations between a column in RD and the corresponding column in RH are due to regularization. Similar results can be found via a comparison of RH(L1) and RD (Fig. 3) for Example 1 (Fig. 1).

Fig. 5

Resolution matrices for linear inverse problem Example 2. a The observation matrix G. The observations consist of only three points at parameters x10, x30, and x32. Only one entry in each row in G is nonzero. b A synthetic model and its corresponding solution, which was constrained using first-order Tikhonov regularization and λ = 1. c, d Resolution matrices RD and RH. RD has only three nonzero entries. RH has more nonzero entries, but they are in the same columns (10, 30, and 32) as those in RD

Comparisons of the column vectors of RD and RH for the same parameter xj can reveal the regularization effect, as the variations in the spread functions from RD to RH are due to regularization. Here, the jth column vector in RH is only related to the jth column in RD, such that the relationship between the relative magnitudes of neighboring entries in a row vector of RD may be preserved in RH. A given row vector in RH may exhibit a similar pattern or curve to that in RD. For example, the curves of the r48,* row vectors in RD and RH (Fig. 3d) are similar, but those of the r*,48 column vectors (Fig. 3e) are very different.

2.6 Do the Projection Matrices Represent Practical Projections?

The projection r:x → x (Eq. (1)) includes the effects of all of the factors (e.g., uncertainty in observation d and prior information c (Menke 2015), Eq. (12)) in the process from x to x, but RD, RI, and RH do not. The resolution estimated from RH can be unrealistically higher than that obtained via synthetic tests (Pilkington 2016), which may be due to the limitation of the projection matrix.

2.6.1 RH Versus Observational Errors

The observational errors δd influence the solution x. When δd is considered, Eq. (2) becomes:

$$\underline {{\varvec{x}}} = {{\varvec{R}}}(\delta {{\varvec{d}}}){{\varvec{x}}},$$
(40)

where R(δd) represents R as a function of δd, with R(δd = 0) equaling R for the error-free case.

Inserting Eq. (6) into Eq. (15) then yields a projection that is similar to the form in Eq. (17):

$$\underline {{\varvec{x}}} = {{\varvec{R}}}_{\text{H}} {{\varvec{x}}} + \delta {{\varvec{x}}}_{\text{d}},$$
(41)

Eq. (41) shows that RH is independent of δxd, which represents the effect of δd on the solution. The error ranges can, however, be used to weight the data in an inversion that employs a weighted least-squares (WLS) approach; the RH determined via a WLS inversion is slightly different from the above RH, but it is also not R(δd). Therefore, RH does not include the effect of observational errors.

2.6.2 Resolution Matrix of the Full Process from x to x?

Is there a matrix that reflects all of the factors in the x → x process? If x is the result of a process that incorporates all of the factors, then the resolution matrix R that is directly inverted from the model and the solution via Eq. (2) reflects all of these factors. However, such an inversion is generally impossible in geophysics: the true model for a region (x) may never be known, and conversely, the matrix R for a region with a known x would be redundant and unnecessary.

Equation (2) represents a transformation from the real model x to the corresponding solution, i.e., a process via the projection r:x → x (or R). This projection is independent of any particular model and can therefore be isolated from the practical true model and the practical solution. If x is not the true medium but rather a synthetic model, then the inversion of R is possible.

Recovery tests, e.g., checkerboard tests (Lévêque et al. 1993), that employ a synthetic model with a specific structure are frequently used to retrieve a qualitative resolution. The output solution of a synthetic test that employs an input model with a random structure also contains resolution information (An 2012). An (2012) suggested that the statistical resolution matrix RS (Table 1) can be determined by statistically comparing a limited number of input synthetic random models and their corresponding output solutions via a Gaussian function approximation of each row vector.

The output solution x includes all of the known factors in the x → x process, with RS including all of these factors. However, the matrix is only approximate and not necessarily accurate. If a large number of x and x are given, then an accurate and complete resolution matrix can be obtained on the basis of Eq. (2), as discussed in the following section.

3 Resolution Matrices of the Complete Process

An accurate resolution matrix that includes all of the factors in the complete x → x projection process is suggested in this section.

3.1 Method

Equation (2) has to be reorganized for the inversion of R because R is an m × m matrix, and both x and x are m × 1 vectors. The extended form of Eq. (2) is:

$$\left\{ {\begin{array}{*{20}r} \hfill {\underline x_1 = } & \hfill {r_{{1},{1}} x_1 + } & \hfill {r_{{1},2} x_2 + } & \hfill { \cdot \cdot \cdot + } & \hfill {r_{{1},m} x_m } \\ \hfill {\underline x_2 = } & \hfill {r_{{2},{1}} x_1 + } & \hfill {r_{{2},2} x_2 + } & \hfill { \cdot \cdot \cdot + } & \hfill {r_{{2},m} x_m } \\ \hfill \vdots & \hfill {} & \hfill {} & \hfill {} & \hfill {} \\ \hfill {\underline x_m = } & \hfill {r_{m,{1}} x_1 + } & \hfill {r_{m,2} x_2 + } & \hfill { \cdot \cdot \cdot + } & \hfill {r_{m,m} x_m } \\ \end{array} } \right.,$$
(42)

where ri,j is the element at the ith row and jth column of R. The equation is reorganized as:

$$\left\{ {\begin{array}{*{20}r} \hfill {\underline x_1 = } & \hfill {x_1 r_{{1},{1}} + } & \hfill {x_2 r_{{1},2} + } & \hfill { \cdot \cdot \cdot + } & \hfill {x_m r_{{1},m} } & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} \\ \hfill {\underline x_2 = } & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {x_1 r_{{2},{1}} + } & \hfill { \cdot \cdot \cdot + } & \hfill {x_m r_{{2},m} } & \hfill {} & \hfill {} & \hfill {} & \hfill {} \\ \hfill \vdots & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill \ddots & \hfill {} & \hfill {} & \hfill {} \\ \hfill {\underline x_m = } & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {} & \hfill {x_1 r_{m,{1}} + } & \hfill { \cdot \cdot \cdot + } & \hfill {x_m r_{m,m} } \\ \end{array} } \right..$$
(43)

The compact form of Eq. (43) is:

$$\underline {{\varvec{x}}} = {{\varvec{Xr}}},$$
(44)

where X is a band matrix that is composed of xT vectors and r is a vectorization of RT:

$${{\varvec{X}}}{ = }\left[ {\begin{array}{*{20}c} {{{\varvec{x}}}^{\text{T}} } & {} & {} & 0 \\ {} & {{{\varvec{x}}}^{\text{T}} } & {} & {} \\ {} & {} & \ddots & {} \\ 0 & {} & {} & {{{\varvec{x}}}^{\text{T}} } \\ \end{array} } \right]_{m \times m^2 }$$
(45)

and:

$${{\varvec{r}}} = {\text{vec}}({{\varvec{R}}}^{\text{T}} ) = {[}\begin{array}{*{20}c} {r_{1,1} } & {r_{1,2} } & { \cdot \cdot \cdot } & {r_{1,m} } & {r_{2,1} } & { \cdot \cdot \cdot } & {r_{2,m} } & { \cdot \cdot \cdot } & {r_{m,1} } & { \cdot \cdot \cdot } & {r_{m,m} } \\ \end{array} {]}^{\text{T}} .$$
(46)

Unlike in Eq. (2), the solution in Eq. (44) is a dependent variable of r rather than of x. If we have a real model (X) and its corresponding solution, then Eq. (44) provides equations for the resolution vector r; however, r contains m² unknowns, whereas a single model pair provides only m equations (because the solution has only m elements).
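The reorganization in Eqs. (44)–(46) simply writes X as a block-diagonal matrix of xT blocks; the small check below, with an arbitrary R and x chosen purely for illustration, confirms that Xr reproduces Rx.

import numpy as np

rng = np.random.default_rng(5)
m = 4
R = rng.normal(size=(m, m))        # an arbitrary "resolution matrix"
x = rng.normal(size=m)             # an arbitrary model vector

X = np.kron(np.eye(m), x)          # m x m^2 band matrix of x^T blocks, Eq. (45)
r = R.reshape(-1)                  # vec(R^T): row-wise stacking of R, Eq. (46)

print(np.allclose(X @ r, R @ x))   # Eq. (44) reproduces Eq. (2)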

Application of the projection to a random synthetic model xk outputs the corresponding solution xk. The model Xk (a band matrix for xk, Eq. (45)) and xk still satisfy Eq. (44). One can therefore obtain N solutions (x1, x2, …, xN) by performing the same projection for N different random synthetic models (X1, X2,…, XN), and then all of the solutions can be used to construct a new equation from N equations, following Eq. (44):

$$\{ \underline {{\varvec{x}}} \} = \{ {{\varvec{X}}}\} {{\varvec{r}}},$$
(47)

where:

$$\begin{array}{*{20}c} {\{ \underline {{\varvec{x}}} \} = \left[ {\begin{array}{*{20}c} {\underline {{\varvec{x}}}^1 } \\ {\underline {{\varvec{x}}}^2 } \\ \vdots \\ {\underline {{\varvec{x}}}^N } \\ \end{array} } \right]} & {{\text{and}}} & {\{ {{\varvec{X}}}\} = \left[ {\begin{array}{*{20}c} {{{\varvec{X}}}^1 } \\ {{{\varvec{X}}}^2 } \\ \vdots \\ {{{\varvec{X}}}^N } \\ \end{array} } \right]} \\ \end{array},$$
(48)

the extended form of Eq. (47) is shown in Eq. (S4) in the supporting information. One synthetic model (Xk) produces m equations, as outlined in Eq. (43), with Eq. (47) including N × m equations for N synthetic models. If Eq. (47) is constructed of m² independent equations, then r will be uniquely resolvable via:

$${{\varvec{r}}} = \{ {{\varvec{X}}}\}^{ - 1} \{ \underline {{\varvec{x}}} \},$$
(49)

a resolution matrix R is then obtained by converting the vector r back into a matrix.

The new matrix R is obtained via either Eq. (49) or (2) without any approximation. The synthetic solutions (xk) are the result of the complete x → x process with all of the factors. Therefore, the resultant R reflects all of the factors (various errors, simplification, etc.) in the complete process, and is termed the complete resolution matrix, which is denoted RC (Table 1).

This extraction of RC from random synthetic input models and output solutions is similar to that proposed by An (2012). An (2012) focused on the extraction of a reliable resolution length from a small number of input models and output solutions to construct the approximate resolution matrix RS (Table 1). Here, an accurate resolution matrix (RC) is directly inverted without approximation. Various procedures, including linear and nonlinear inverse problems and even non-inverse problems (An 2012), can be implemented to obtain RC (Fig. 6) and RS, as the extraction is isolated from the internal details of the x → x process. For example, RS can be obtained for kriging and minimum-curvature gridding (Chiao et al. 2014). The extraction of RC via Eq. (47) is a linear regression of the relationship between the input models and output solutions (Fig. 2a), such that RC can be considered a linear approximation of r:x → x for either a nonlinear inverse problem or a non-inverse problem.

Fig. 6

Flowchart for obtaining the complete resolution matrix (RC) from synthetic random models and solutions. r is the vector form of RC

3.2 Equation Simplification

Resolving RC via Eq. (49) requires the inversion of a large matrix with ≥ m² rows and m² columns. However, the calculation can be simplified.

Equations (42) and (43) state that the ith solution parameter xik of the solution xk can be written as:

$$\underline{x}_i^k = {{\varvec{R}}}_i {{\varvec{x}}}^k = ({{\varvec{x}}}^k )^{\text{T}} {{\varvec{R}}}_i^{\text{T}} ,$$
(50)

where Ri is the ith row of R (ri,*). All of the equations for the parameters from xi1 to xiN form:

$$\{ \underline{x}_i \} = \{ {{\varvec{x}}}^{\text{T}} \} {{\varvec{R}}}_i^{\text{T}} ,$$
(51)

where:

$$\begin{array}{*{20}c} {\{ \underline{x}_i \} = \left[ {\begin{array}{*{20}l} {\underline{x}_i^1 } \hfill \\ {\underline{x}_i^2 } \hfill \\ \vdots \hfill \\ {\underline{x}_i^N } \hfill \\ \end{array} } \right]} & {{\text{and}}} & {\{ {{\varvec{x}}}^{\text{T}} \} = \left[ {\begin{array}{*{20}l} {({{\varvec{x}}}^1 )^{\text{T}} } \hfill \\ {({{\varvec{x}}}^2 )^{\text{T}} } \hfill \\ \vdots \hfill \\ {({{\varvec{x}}}^N )^{\text{T}} } \hfill \\ \end{array} } \right]} \\ \end{array} .$$
(52)

Equation (51) is actually the same as Eq. (S4), but it only contains the rows for the solution parameter xi. Equation (51) can then be used to invert the ith row of RC (Ri) from:

$${{\varvec{R}}}_i^{\text{T}} = \{ {{\varvec{x}}}^{\text{T}} \}^{ - 1} \{ \underline{x}_i \},$$
(53)

the other rows of RC can also be inverted using Eq. (53).

The inversion of RC via Eq. (53) is easier than that via Eq. (49). Equation (49) requires the (left) inverse of the matrix {X}, which has N × m rows and m² columns, whereas Eq. (53) requires the (left) inverse of the much smaller (N × m) matrix {xT}. Furthermore, the inverse of {xT} used for RiT in Eq. (53) is the same for all of the other rows (e.g., RjT), and it can therefore be computed once and used directly in the calculation of all of the rows of RC.
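In practice, Eq. (53) can be solved for all rows of RC at once as a single least-squares problem. The Python sketch below assumes that the user supplies a list of (input model, output solution) pairs produced by whatever complete processing chain is being appraised; the function name and interface are hypothetical.

import numpy as np

def complete_resolution_matrix(model_pairs):
    """Estimate R_C from N (x_in, x_out) model pairs via Eq. (53).

    model_pairs : list of (x_in, x_out) arrays of length m, where x_out is the
    solution returned by the complete processing chain for the random
    synthetic model x_in.  N >= m (preferably more) independent pairs are
    required.
    """
    X_in = np.vstack([x_in for x_in, _ in model_pairs])      # {x^T}, N x m
    X_out = np.vstack([x_out for _, x_out in model_pairs])   # solutions, N x m
    # Solve {x^T} R^T = {x_out} in the least-squares sense for all rows at once.
    R_T, *_ = np.linalg.lstsq(X_in, X_out, rcond=None)
    return R_T.T                                             # R_C, m x m

For a linear, error-free chain, this estimate reproduces RH, consistent with the comparison of Figs. 3c and 7b.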

For comparison, we derived RC for Example 1 (Fig. 1) using the same regularization as was used to obtain RH(L1) (Fig. 3c) and without considering observational errors (δd). The solutions for 100 (N = 100) different synthetic random models were obtained in step 1 (Fig. 6) of the calculation. The resultant RC (Fig. 7b) is the same as RH (Fig. 3c) for this linear inverse problem. However, RC can reflect all of the factors in a practical study, whereas RH only reflects the effects of G and the regularization.

Fig. 7

Resolution matrix RC for the 1-D inverse problem in Fig. 1. a Example synthetic (true) random model (input model) and its corresponding solution (output model) for the inverse problem. b Matrix RC or R-of. The resolution matrix RC is calculated from 100 pairs of random input and output models, as in (a). Flatness regularization, with λ = 1.0 used. R-of is the resolution matrix after the offset errors are isolated (Eq. (55)). c, d Two row vectors and e, f two column vectors of RC in (a). Red lines in (c–f) are Gaussian approximations of the vectors around the diagonal entry. The row and column vectors for the 48th parameter can be represented by Gaussian function curves (c and e, respectively), whereas those for the 95th parameter are far from Gaussian curves (d and f, respectively). The column vector (r*,95) (f) is all zeros. The widths of the Gaussian approximation curves (c–e) can represent the widths of the vector curves

3.3 Resolution Matrices with Error Effects

All measurements and processing include errors, which influence the solution and thus the x → x projection. RC can include the effects of various (quantifiable and unquantifiable) errors and of additional prior information (such as C and c in Eq. (10)). For example, system simplification may cause errors in the solution that are often difficult to quantify. If the synthetic solutions are obtained through such a simplified system, the resulting RC will reflect the error related to the simplification. However, for the sake of comparison, examples of the effects of quantified data errors are given below.

When the observational errors δd are considered, the equation with R(δd) (Eq. (40)) is of the same form as Eq. (2). Therefore, the above method (Fig. 6) for obtaining the complete resolution matrix RC via either Eq. (49) or (53) on the basis of Eq. (2) can be used to obtain R(δd) (Eq. (40)). One pair of models serves as an independent measurement of R in this method. However, the addition of more factors in the processing than those contained in G and C makes the relationship between the model and the solution more complex, such that a larger number (> m) of model pairs is often necessary to obtain a reliable RC.

The traditional resolution matrix RH (Eq. (41)) cannot include the effect of observational errors and thus equals R(0), or R(δd = 0). The resolution matrix (RC or R(δd)) that includes the effect of observational errors is different from RH. An equation for their difference can be obtained from Eqs. (40) and (41):

$$\delta \underline {{\varvec{x}}} = ({{\varvec{R}}}(\delta {{\varvec{d}}}) - {{\varvec{R}}}_{\text{H}} ){{\varvec{x}}},$$
(54)

Eq. (54) indicates that the difference (R(δd) − RH), or (R(δd) − R(0)), is the projection matrix that maps the model onto the solution error (δx).

Random errors exist in practical studies. The influence of random errors on the solution diminishes as the number of observations increases. However, if observations are limited, then the average effect of the random errors appears as a regular systematic error; we therefore only test systematic errors here. A stronger regularization (larger λ) is normally required for inversions with observational errors. However, we used the same λ that was applied in the above error-free inversion for comparison.

Two main types of systematic errors, offset and scale factor errors, are tested here. Offset observational errors in a linear inversion will introduce offset errors in x (Fig. 2b). The x → x process with offset errors in x can be better represented by Eq. (17) than Eq. (40). Equation (17) has more variables (R and δxof) than Eq. (40), such that the process with offset errors is more complex and requires a larger number of model pairs to obtain R(δd). If we still invert for R(δd) via Eq. (40) using the same number (N = 100) of model pairs (xk, xk) in Fig. 7, then the resultant matrix is somewhat unstable (Fig. 8a). The column vector is somewhat stable (Fig. 8b) due to regularization (L1) because flatness regularization directly influences the column vectors, but not the row vectors. R(δd) generally becomes stable when a larger number (e.g., N = 200) of model pairs (Fig. 8a) is used.

Fig. 8

a Row and b column vectors of the resolution matrices for a process that incorporates the offset observational errors calculated via Eq. (40) using 100 (N = 100) and 200 (N = 200) model pairs. The row vector of the matrix obtained from 100 model pairs is unstable in the row vector (a). The column vector (b) is less influenced by the errors because the L1 regularization used here mainly affects the column vectors via a smoothing process, whereas the row vectors are minimally affected

Scale factor observational errors in a linear inversion will introduce scale factor errors in x. The x → x process with scale factor errors in x can be well represented by Eq. (40), with R(δd) equaling RH multiplied by the factors related to the scale factor errors.

An offset error of 2.0 s (δd1 = 2.0 s) and a scale factor error of 20% (δd1 = 0.2d1 ≈ 2.5 s) in the first observation (d1) are considered; the synthetic observations have a mean of ~ 12.5 s. Two hundred (N = 200) model pairs are used, and the resultant matrices R(δd) are shown in Figure S1. The matrix differences, R(δd) − RH(L1), which reflect the observational error effects in the projection from x to x, are shown in Fig. 9c and d.

Fig. 9

Complete resolution matrices for processes considering (left) a 2.0 offset error and (right) 20% scale factor error in the first observation data (d1) for the inverse problem in Fig. 1a. a, b Example synthetic model and corresponding solution. c, d Resolution matrix difference E (= R(δd) – R(0)). e, f Row sums of R(δd). R(δd) is the projection matrix estimated by Eq. (40), which incorporates the observational errors in d1. R(0) is the same as in Fig. 7b. PE = parameters constrained by erroneous observation d1

These two types of observational errors yielded similar errors (δxd) in the solution parameters (x1–x25) in x(L1) (Fig. 9a, b), resulting in similar summation curves for the 1st–25th content vectors in R(δd) (Fig. 9e, f). However, the errors produced different effects in the projection matrices R(δd). The row vector of R(δd) for a solution parameter with errors (e.g., x5) in Fig. 9c is very different from that in Fig. 9d. The addition of the scale factor error δd1 (= 0.2d1) only caused resolution matrix changes in the 1st–10th columns (Fig. 9d) (i.e., only the contributions of x1–x10, which are constrained by d1, are influenced, as expected). However, the offset error δd1 (= 2.0) caused changes in all of the columns (Fig. 9c). Therefore, the spread functions are sensitive to different types of errors.

3.4 Offset Error Isolation

The x → x process with offset errors in x (δx) (Fig. 2b) can be better represented via linear regression (Eq. (17)), whereby the offset errors are isolated from either x or R(δd)x. When the offset errors δxof (= [o1, o2, …, om]T) are isolated, the obtained resolution matrix is denoted as either R-of or R-of(δd). Equation (17) can be written as:

$$\underline {{\varvec{x}}} = {{\varvec{R}}}_{\text{ - of}} (\delta {{\varvec{d}}}){{\varvec{x}}} + \delta {{\varvec{x}}}_{{\text{of}}} = {{\varvec{R}}}_{ + 1} {{\varvec{x}}}_{ + 1} ,$$
(55)

where R+1 = [R-of δxof], x+1 = [xT 1]T. The extended form of Eq. (55) is:

$$\left[ {\begin{array}{*{20}c} {\underline x_1 } \\ {\underline x_2 } \\ \vdots \\ {\underline x_m } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {r_{1,1} }&{r_{1,2} }&{ \cdot \cdot \cdot }&{r_{1,m} }&{o_1 } \\ {r_{2,1} }&{r_{2,2} }&{}&{r_{2,m} }&{o_2 } \\ \vdots & {} &\ddots & {} &\vdots \\ {r_{m,1} }&{r_{m,2} }&{ \cdot \cdot \cdot }&{r_{m,m} }&{o_m } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {x_1 } \\ {x_2 } \\ \vdots \\ {x_m } \\ 1 \\ \end{array} } \right].$$
(56)

Equation (55) is of the same form as Eq. (17), such that the m × (m + 1) matrix R+1 can be inverted using model pairs (e.g., xk and xk) via the above procedure for RC (Fig. 6). The obtained R+1 is denoted as RC+1 (Table 1). As expected, the resultant matrix R-of(δd) for the above case with offset errors is the same as R(δd = 0) (Fig. 7b), and the resultant δxof (not shown here) equals the offset errors in x.
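The offset-isolating form of Eqs. (55) and (56) only requires appending a constant 1 to each input model before the same least-squares step; a hypothetical sketch following the notation above:

import numpy as np

def resolution_with_offset(model_pairs):
    """Estimate [R_-of | dx_of] from (x_in, x_out) pairs via Eqs. (55)-(56)."""
    X_in = np.vstack([np.append(x_in, 1.0) for x_in, _ in model_pairs])  # x_+1^T rows
    X_out = np.vstack([x_out for _, x_out in model_pairs])
    R_plus1_T, *_ = np.linalg.lstsq(X_in, X_out, rcond=None)
    R_plus1 = R_plus1_T.T                    # m x (m + 1) matrix R_C+1
    return R_plus1[:, :-1], R_plus1[:, -1]   # R_-of and the offset vector dx_of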

3.5 Properties of the Complete Resolution Matrix

Unlike RD (from G only) and RH (from G and C), a complete resolution matrix can be obtained from any combination of all of the factors in a study (Table 1). If the synthetic solutions are obtained from a generalized inversion using error-free observations, then RC is equal to RD, and the two matrices share the same properties. If regularization is used during the processing, then RC is equal to RH, and again the two matrices share the same properties. Therefore, the properties of RC vary with the factors considered in the x → x process.

3.6 Utilities of Resolution Matrices

Resolution matrices (e.g., RC) are a quantitative indicator not only of the solution but also of the other factors in a study system. All of these factors (e.g., the solution, the observations, and the regularization) can be appraised using the matrices.

The resolution matrix has been widely used to appraise solution reliability in an inverse problem (e.g., Aki et al. 1977; Aster et al. 2005; Backus and Gilbert 1970; Menke 2012; Tarantola and Valette 1982; Wiggins 1972). The diagonal entry ri,i reflects whether xi can be resolved, or how much of xi is contained in xi, and has therefore received considerable attention for several decades (e.g., Aki et al. 1977; Day-Lewis et al. 2005; Wiggins 1972). The resolution spread estimated from the resolution matrix quantifies the departure of R from an identity matrix, and thus the goodness of the model (Backus and Gilbert 1970; Menke 2012, 2015). However, the resolution length, discussed in the “Resolution length” section, is the measure most widely used in practical model appraisals at present. Furthermore, the matrices RS and RC can also be applied to evaluate solution stability. In an unstable inversion, a small variation in the observations causes a large change in the solution; RS and RC, which are calculated from the solutions of random models, are therefore sensitive to this instability.

The resolvability and constrainability of x under given observations, which constitute the most important information in a study, control the solution reliability and must be taken into account when selecting the parameterization and regularization. Both can be quantitatively retrieved from resolution matrices, as explained in the “Resolvability and constrainability from the resolution matrix” section.

The influences of regularization and errors on the solution can also be evaluated with resolution matrices. The matrix RD reflects the model projection under the given observations, whereas RH reflects the projection under the combination of observations and regularization. The difference RH – RD therefore reflects the influence of regularization on the solution, as discussed above in the “Significances of the resolution matrices” section. If an error exists in the process used to obtain x, RC will include the effect of this error, and the difference RC – RD reflects it, as discussed in Sects. 3.3 and 3.4.
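
As a numerical illustration of the first of these differences, the following sketch computes RD and a regularized counterpart and subtracts them. The observation matrix G, the first-order roughening operator C, the weight λ, and the stacked-system form of the regularized inverse are all hypothetical choices consistent with standard Tikhonov least squares, not the specific operators of the examples above:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 20
G = rng.normal(size=(10, m))                  # hypothetical under-determined observation matrix
C = np.diff(np.eye(m), axis=0)                # first-order (L1-type) roughening operator
lam = 0.1                                     # illustrative regularization parameter

R_D = np.linalg.pinv(G) @ G                   # direct resolution matrix (cf. Eq. (19))
A = np.vstack([G, lam * C])                   # stacked system combining G and lam*C
R_H = np.linalg.pinv(A)[:, :G.shape[0]] @ G   # regularized resolution matrix (cf. Eq. (24))

reg_influence = R_H - R_D                     # contribution of the regularization alone
```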

4 Resolution Length

The size of the smallest feature that can be detected is an important constraint on a model. This feature size is generally called the (spatial) resolution or resolution length (or width). Following the suggestion of Lebedev and Nolet (2003), the resolution length is defined here as half the size of the feature. This section focuses on how to obtain the resolution length from a resolution matrix.

4.1 Content Extent Versus Resolution Length

Equation (38) indicates that one solution parameter is generally a weighted average of all of the medium parameters, with the entries of a row vector of R acting as the weights or contents. The smallest resolvable feature represented by xi, or the resolution length at xi, should therefore be estimated from the row vector.

The medium parameters with high contents/weights in xi mainly determine xi, such that the feature represented by xi is related to the high-content segment (e.g., r48,41 to r48,55 in Fig. 7c) of the row-vector curve. However, as the feature is represented by xi, the resolution length is not defined by the extent of the high-value segment, but rather by the distances from xi to the parameters at the segment borders. In Fig. 7c, the resolution length for x48 is not half the distance from x41 (r48,41) to x55 (r48,55), but is instead related to the distances from x48 (r48,48) to x41 (r48,41) and x55 (r48,55); similarly, the resolution length for x95 is not half the distance from x86 to x90, but is instead related to the distances from x95 to x86 and x90 (Fig. 7d).

In general, a parameter (e.g., xi) and its neighbors should provide a large contribution to the average (the solution parameter xi) (Jackson 1972), and the contributions (ri,j; j = 1, …, m) should decrease quickly with increasing distance between xi and the other parameters (xj). Therefore, each row of the resolution matrix (ri,j; j = 1, …, m) can be approximated as either a Gaussian-shaped function (e.g., An 2012; Fichtner and Trampert 2011; Nolet 2008) or a cone (Barmin et al. 2001). The row-vector curve may be similar in shape to a Gaussian function (e.g., r48,*; Fig. 7c), but it can also be quite different (e.g., r95,*; Fig. 7d). In general, the width of the Gaussian approximation to a row-vector curve can represent the resolution length (the distance from xi to the borders of the high-content segment) (Fig. 7c and d). For example, the resolution lengths in Fig. 7c and d are 4 km for x48 and 9 km for x95, respectively.
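
One simple way to automate such an estimate is to fit a Gaussian to a row-vector curve and take its fitted width as the resolution length. The sketch below assumes that the model nodes have known positions (e.g., in km); the function name and the exact width convention are illustrative choices, not prescribed by the discussion above:

```python
import numpy as np
from scipy.optimize import curve_fit

def resolution_length_from_row(R, i, positions):
    """Fit a Gaussian centered on node i to the i-th row of R and return its width."""
    row = R[i, :]
    gauss = lambda z, a, sigma: a * np.exp(-0.5 * ((z - positions[i]) / sigma) ** 2)
    (a, sigma), _ = curve_fit(gauss, positions, row, p0=[max(row[i], 1e-6), 1.0])
    return abs(sigma)                          # one possible width convention
```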

4.2 Estimation from a Row or Column Vector?

The resolution length may also be estimated from a column vector (e.g., Alumbaugh and Newman 2000; Smith 1997). RD is symmetric, such that the resolution lengths estimated from its ith row and column vectors are the same. However, regularization directly influences the spread functions (or column vectors), such that most of the resolution matrices with regularization (e.g., RH) are asymmetric. Therefore, the resolution lengths estimated from the ith column and row vectors can be different.

The resolution lengths estimated from the column and row vectors are largely the same or similar when Tikhonov regularization is applied. The resolution matrix RH(λI) is symmetric, such that the lengths from its row and column vectors are the same. Higher-order Tikhonov regularizations (λLn) yield a smoother column vector (r*,48 in Fig. 7e) than the corresponding row vector (r48,* in Fig. 7e), yet the lengths from the two vectors remain largely similar, as previously confirmed (Miller and Routh 2007; Pilkington 2016), albeit with exceptions.

The column vector cannot provide a valid resolution length for a parameter that is constrained by no observations (e.g., x95 in Fig. 1a, b). The column vector for such a parameter in RH (e.g., r*,95 in Fig. 7f) is all zeros, but the corresponding row vector is not (r95,* in Fig. 7d), with a row-vector sum for R(S0) (e.g., RH(Ln)) equal to one. The row vector can provide a reliable resolution length in this case, whereas the column vector cannot (Table 3).

The row vector may be unstable and exhibit strong oscillations when large errors exist (e.g., r5,* in Fig. 8a, c). The high-amplitude oscillations around the diagonal entry of the row vector may give an illusion of high resolution. However, the corresponding column vector in RH(Ln) is smoother (Fig. 8b) than the row vector (Fig. 8a), such that the length estimated from the column vector is more reliable and less influenced by these large errors (Table 3).

In summary, the resolution length should generally be taken from the row vectors, based on the definition of the resolution length (Table 3). However, special cases (e.g., observational errors) may yield an unstable row-vector curve that in turn degrades the resolution estimated from this vector. It is therefore advisable to extract the resolution length from both the row and column vectors of the resolution matrix to ensure that the estimate is reliable.
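
Following this advice, the sketch below computes a width from both the ith row and the ith column, here simply as a weighted second moment about the diagonal entry. The second-moment measure, the helper name, and the node positions are assumptions; any consistent width measure could be substituted:

```python
import numpy as np

def row_and_column_widths(R, i, positions):
    """Return width estimates from the i-th row and the i-th column of R."""
    def width(vec):
        total = np.abs(vec).sum()
        if total == 0.0:                       # e.g., the all-zero column of an unconstrained parameter
            return np.nan
        w = np.abs(vec) / total
        return np.sqrt(np.sum(w * (positions - positions[i]) ** 2))
    return width(R[i, :]), width(R[:, i])
```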

4.3 Resolution Estimations that are not from R

The spatial resolution in seismic tomography studies is widely estimated via visual inspection of the restoration of a synthetic structure (e.g., checkerboard tests) (Feng and An 2010; Lévěque et al. 1993; Thurber and Ritsema 2009), as illustrated in Fig. 10a–c. If the checker size can be recovered at a given location, then the resolution at that location in the final result is at least as fine as the checker size. This method is powerful and easily implemented. However, the resulting resolution length is qualitative, not quantitative. Furthermore, the recovered model (x) and the synthetic checkerboard model (x) in one test are equivalent to only one pair of x and x, and a few tests cannot produce enough model pairs to provide full resolution information, as explained in the “Resolution matrices of the complete process” section.
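
For reference, a one-dimensional checkerboard-style input model of the kind used in such tests can be built in a few lines; the block size (in nodes), amplitude, and background value below are illustrative parameters only:

```python
import numpy as np

def checkerboard_1d(m, block, amplitude=0.05, background=1.0):
    """Alternating +/- amplitude blocks of 'block' nodes on a constant background model."""
    sign = np.where((np.arange(m) // block) % 2 == 0, 1.0, -1.0)
    return background * (1.0 + amplitude * sign)
```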

Fig. 10
figure 10

Example resolution lengths for a given tomography study of Rayleigh-wave dispersion at a period of 50 s. The figures are edited from Figs. 13 and 14 in Ma et al. (2014). a–c The resolution lengths are estimated using checkerboard tests, with the recoveries of three specific checker sizes inspected. The statistical resolution lengths in (d) are estimated from synthetic random models

A quantitative resolution length can still be retrieved when no resolution matrix is given or needed. The output solution x of a synthetic test using an input model with a random structure x contains the resolution information (An 2012). Quantitative resolution information can be retrieved via a number of approaches, including a comparison of many x and x (An 2012), cross correlation of x and x (Trampert et al. 2013), and autocorrelation of x (Fichtner and Leeuwen 2015). The resolution lengths in Fig. 10d were obtained using the An (2012) method on the basis of a limited number of pairs of random synthetic models and solutions (Ma et al. 2014). The An (2012) method has been readily applied in various studies (e.g., Chevrot et al. 2014; Chiao et al. 2014; Lin et al. 2014; Ma et al. 2014). The resulting resolution length distribution (Fig. 10d) is often easier to interpret and more informative for the general reader than synthetic checkerboard recovery tests (Fig. 10a–c) (Ma et al. 2014).
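
The sketch below is an illustrative correlation-based variant of this idea, not an implementation of any of the cited methods: over many random-model pairs, it measures how far around node i the input models remain correlated with the solution at i. The threshold level, node spacing dx, and function name are assumptions:

```python
import numpy as np

def statistical_resolution_length(X_true, X_sol, i, dx=1.0, level=0.5):
    """Half-width (in units of dx) of the zone around node i where the solution at i
    stays correlated with the input models above 'level' times its peak correlation.
    X_true and X_sol are (m, K) arrays whose columns are random models and solutions."""
    m = X_true.shape[0]
    corr = np.array([np.corrcoef(X_sol[i, :], X_true[j, :])[0, 1] for j in range(m)])
    idx = np.flatnonzero(np.abs(corr) >= level * np.abs(corr[i]))
    return 0.5 * (idx.max() - idx.min()) * dx
```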

5 Resolvability and Constrainability from the Resolution Matrix

Several essential questions arise when the real model x cannot be fully resolved in x for an ill-posed problem. For example, how much information from x can be reflected in the solution x, or what is the resolvability of xi (or the content of xi in xi)? How much information from x is controlled by the observations? What is the constrainability (constraining status) of an individual solution parameter xi under given observations? These questions are not only essential for understanding the reliability of the solution, but also instrumental in providing basic information to guide the improvement of the study system. These questions are essentially centered on the relationship between x and x, such that their answers can be derived from the x → x projection/resolution matrix.

The ri,j entry in R represents the exact content or contribution of the jth model parameter xj to the ith solution parameter xi. Therefore, the entries of the ith content vector (ri,*) and their sum (Σri,*) are indicators of the resolvability of xi and the constrainability of xi. As previously mentioned, RH reflects the combined effects of the observation matrix and regularization during the x → x process, whereas RD only reflects the observational effects. The reliability of a practical solution is therefore evaluated via RH rather than RD. Conversely, for the same reason, the constrainability of the solution parameter xi under the given observations is better obtained from RD than from RH. Factors other than the observation and regularization matrices can also influence the reliability of xi; these cannot be reflected by RD or RH, but can be reflected by RC.

5.1 Resolvability Defined by the Main-Diagonal Element

Equation (38) indicates that the main-diagonal element ri,i reflects the content (or contribution) of the real model parameter xi in (to) its counterpart xi. This entry reflects whether xi can be resolved or how much of xi is contained in xi. ri,i can therefore be considered the resolvability of xi (Table 3) and has received considerable attention for several decades (e.g., Aki et al. 1977; Day-Lewis et al. 2005; Wiggins 1972).

The main-diagonal element ri,i (e.g., Fig. 4b for Example 1) may take one of the following values:

  • ri,i = 1 (as illustrated in Fig. 11a). The curve shape of the elements in the ith row vector is a delta function, where all of the elements are zero, except ri,i. In this case, xi equals xi, which means that xi is well constrained and xi can be fully resolved. If ri,i is in RD, then the parameter xi is fully resolvable under the given observations.

  • ri,i = 0 (Fig. 11b). This case indicates that xi makes no contribution to its counterpart xi and is unresolvable. For example, parameter x95 in Example 1 is not constrained by an observation (Fig. 1a, b), such that r95,95 in both RD and RH is zero (Figs. 3 and 4b), and x95 in xD equals zero (Fig. 1c).

  • ri,i ∈ (0,1). xi partially contributes to its counterpart xi and is partially resolvable. If ri,i in RD belongs to (0,1), then xi shares the observation with other parameters (e.g., xj); xi therefore contributes to both its counterpart xi and xj. For example, r1,1 = 0.1 in RD in Example 1 (Fig. 4b), which indicates that x1 shares observation 1 with x2 (Fig. 1a); x1 therefore contributes to both x1 and x2.

Fig. 11
figure 11

Illustration of the four constraining statuses for the ith solution parameter xi based on the elements of the ith row vector ri,*

The main-diagonal element ri,i in RD reflects the resolvability of xi under a given observational condition, such that the sum of all of the main-diagonal elements, trace(RD), can be considered the resolvability of the model vector x. Because RD is a projection matrix, trace(RD) equals the number of independent observations, rank(G). RD is therefore an identity matrix, and x can be fully resolved, when either trace(RD) or rank(G) equals the number of model parameters (m).
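
A quick numerical check of this statement, with an arbitrary hypothetical G, is:

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.normal(size=(6, 10))                       # 6 independent observations of 10 parameters
R_D = np.linalg.pinv(G) @ G                        # direct resolution matrix
print(np.trace(R_D), np.linalg.matrix_rank(G))     # both are (numerically) 6
```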

5.2 Deviation from the Expectation Given by the Row-Vector Sum

Equation (38) indicates that xi equals the weighted sum of all of the parameters in x, and Σri,* (or ΣiR) is the sum of all of the weights ri,*. If R is a stochastic matrix, then a sum of one means that xi reflects the true average of x (Nolet 2008) and lies at least within the extremes of x. For example, if Σ1RD = 1 (Fig. 4a) and the entries r1,* in RD are positive (Fig. 3a), then x1 in xD is a good representative of x1 (Fig. 1c). However, this assumption is often false, as the entries in R can be negative (Menke 2015) (i.e., R is often not a stochastic matrix). For example, Σr1,* in RH(L1) in Example 1 equals one (Fig. 4a), but the parameter x1 in x(L1) (Fig. 1c) deviates from the expected average. The polarity of ri,* must therefore be considered when ΣiR is used to judge the reliability of xi.

While ΣiR = 1 does not necessarily correspond to the perfectness of xi, the deviation of ΣiR from one is a good indicator of the deviation of xi from the true model average (Table 3). A comparison of Figs. 4a and 1c indicates that an overestimated (larger than the model average) parameter xi (x54 in xD and x(I) in Fig. 1c) corresponds to ΣiR > 1 (e.g., Σ54RD and Σ54RH(I) in Fig. 4a), and an underestimated (smaller than the model average) parameter (x48 in xD and x(I) in Fig. 1c) corresponds to ΣiR < 1 (e.g., Σ48RD and Σ48RH(I) in Fig. 4a).

5.3 Difference Between Neighboring Parameters

If the parameter xi is partially resolvable (ri,i ∈ (0,1)), then the solution parameter xi will include content (ri,ii > 0) from neighboring parameters (e.g., xii), and the ri,i and ri,ii entries in the ith row vector of R can reflect the similarity between xi and xii.

When two neighboring parameters xi and xii (e.g., x60 and x61; Fig. 1a, b) are constrained by unrelated observations, ri,i < 1 and ri,ii = 0 (Fig. 11c), as observed for r60,61 and r60,60 in either RD or RH(I) (Fig. 3a and b). In this case, xii does not contribute to xi (Eq. (38)), and xi does not contribute to xii. It is possible to discriminate the difference between xi and xii in xi and xii from the unrelated observations, as x61 in either xD or x(I) is obviously different from x60 (Fig. 1c).

When two neighboring parameters xii and xi (e.g., x13 and x14, which are constrained by the second observation in Fig. 1a, b) are constrained by the same or related observations, both ri,i and ri,ii (ii = i − 1 or i + 1; Fig. 11d) (e.g., r13,13 and r13,14 in either RD or RH(I); Fig. 3a and b) are in the range (0,1). In this case, xi has contents from both xii and xi. xii also has contents from both xii and xi because of the symmetry of RD. Consequently, xi and xii cannot be discriminated, and their difference is often related to the difference Δri,ii:

$$\Delta r_{i,ii} = \left| {r_{i,i} - r_{i,ii} } \right|.$$
(57)

The difference between xi and xii in xD is often very large when Δri,ii is large, even if the real model parameters xi and xii are the same. When Δri,ii is small, the difference is also small, even if xi and xii are quite different. When Δri,ii is zero, xi is often equal to xii. The solution parameters x13 and x14 in either xD or x(I) are quite different (Fig. 1c), even though the real model parameters x13 and x14 are almost the same. Δri,ii is therefore a good indicator of the difference between xi and xii (Table 3) if ri,i is less than one, even though this difference has no relation to the difference between xi and xii.

5.4 Short Summary on Constrainability

In summary, the constrainability of a solution parameter can be quantitatively evaluated from its content vector in a resolution matrix (Table 3). The main-diagonal element (ri,i) can be considered the resolvability of xi. If RD is used, then ri,i values of 0, 1, and within (0,1) mean that xi is unresolvable, fully resolvable, and partially resolvable, respectively, under the given observations. The deviation of the content-vector sum (Σri,*) from one can be considered an indicator of the deviation from the model expectation: Σri,* > 1 and < 1 mean that xi is overestimated and underestimated, respectively, whereas Σri,* = 1 with all elements ri,* nonnegative means that xi is a true average of the model. The difference between ri,i and ri,ii (Δri,ii) reflects the difference between xi and xii for a partially resolvable parameter xi (ri,i ∈ (0,1)). Large and small Δri,ii values correspond to large and small differences between the two neighboring solution parameters, respectively, although this difference has no relation to the difference between the true medium parameters.
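
These three diagnostics can be read directly off a resolution matrix. The sketch below gathers them for one parameter; the function name and the returned dictionary structure are illustrative conveniences, not part of the formal definitions above:

```python
import numpy as np

def constrainability(R, i):
    """Diagnostics for solution parameter i from its content (row) vector in R."""
    row = R[i, :]
    diag = row[i]                                  # resolvability r_ii
    row_sum = row.sum()                            # deviation of the sum from 1 flags over/underestimation
    neighbors = [abs(diag - row[j])                # Delta r_{i,ii} for the existing neighbors
                 for j in (i - 1, i + 1) if 0 <= j < R.shape[0]]
    return {"r_ii": diag, "row_sum": row_sum, "delta_r_neighbors": neighbors}
```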

5.5 Constrainability from R H

Practical studies that employ regularization do not provide RD, but rather RH. In this case, the parameter constrainability under the given observations must be estimated from RH to determine how much information in the solution comes from the observations rather than from the regularization. As mentioned above, RH(λI) is the matrix most similar to RD, making it a good alternative to RD for evaluating the constrainability. However, most of the main-diagonal elements in RH(λI) are smaller than those in RD. The row-vector sums of RH(λI) may therefore be smaller than those of RD, such that their deviation from one cannot be used to identify underestimated parameters. RH(λI) can nevertheless be used to identify well-constrained parameters, as the curve shape of their row vectors is still a delta function, even though ri,i may not equal one.

RH(Ln) (n > 0; e.g., RH(L1) in Fig. 3c) is significantly different from RD (Fig. 3a). With the exception of the unconstrained parameters, the main-diagonal elements of RH(L1) (ri,i and rii,ii) for two neighboring parameters (xi and xii) that are constrained by the same observation can be different (Fig. 4b). The row-vector sum of RH(L1) for any single parameter equals one (Fig. 4a). Therefore, the main-diagonal elements ri,i and row-vector sum ΣiRH(L1) cannot be used to evaluate the constrainability of the parameter xi. However, the all-zero column vectors of RH(Ln) (n = 0, 1, 2) for the unconstrained parameters are the same as those in RD, such that the unconstrained parameters and unconstrained neighbors can be evaluated from RH(Ln). Furthermore, the curve shape of the row vectors of RH(Ln) (n = 0, 1) is similar to that of RD (Figs. 1f and 3). The mutual relationship between two neighboring parameters can therefore be roughly evaluated using Δri,ii of RH(Ln).

5.6 Resolution Upper Bound

RD is only related to G (Eq. (19)), with no relationship to the observation data d, the observational errors (δd), or other factors, whereas RH is composed of both G and C. Regularization adds artificial constraints to make the solution appear more rational. However, various factors, including observational errors, regularization, and the instability of the solution, can decrease (but not increase) the resolution reflected by RD. Therefore, the resolution derived from RD marks the upper bound of resolvability (Table 1).

While repeated observations can improve the precision of the solution by improving the precision of the observation data (d), they cannot increase either rank(G) or the number of independent observations. Repeated observations therefore have no effect on RD and cannot raise the upper bound of resolution. The only way to raise this bound is to increase the number of independent observations (i.e., rank(G)).
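
A small numerical illustration of this point, with a hypothetical G in which a duplicated row stands for a repeated observation:

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.normal(size=(5, 8))
G_rep = np.vstack([G, G[0]])                       # the first observation repeated
R_D = np.linalg.pinv(G) @ G
R_D_rep = np.linalg.pinv(G_rep) @ G_rep
print(np.allclose(R_D, R_D_rep))                   # True: R_D, and hence the resolution bound, is unchanged
```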

5.7 What is a Perfect Inversion?

A perfect inversion, or perfect constrainability, corresponds to a resolution matrix that equals the identity matrix I (e.g., Jackson 1972; Menke 1989). However, this rule is only applicable to the direct resolution matrix RD, not to the other resolution matrices (e.g., RH or RC). A main-diagonal element ri,i of one in RD means that xi is fully resolved in xi, implying that a perfect inversion has an RD equal to the identity matrix. RD always equals I except when the problem is under-determined. RH, however, includes regularization, and regularization is mostly employed when the constrainability of the solution parameters under the given observational conditions is imperfect (i.e., trace(RD) is often smaller than m, or the inverse of G is unstable). Therefore, trace(RH) is always smaller than trace(RD), regardless of the quality of the regularization, such that RH is never an identity matrix. The perfectness of the inversion therefore cannot be judged from the degree of similarity between RH and I. Instead, the perfectness of a study using regularization should be judged from the amount of valid information in G that is reflected in the solution, as different regularizations yield different solutions.

6 Resolution Matrices in Nonlinear Inversions

The relationship between x and x differs from that between x (or x) and d (or d). For a nonlinear problem, the relationship between x and x (r:x → x) can be nonlinear, but it can also be linear, for example when the true model x is perfectly resolved (x = x = Ix = r(x)). This indicates that the x → x projection is a relation distinct from the forward problem itself, although it reflects the ability to solve the problem. A nonlinear inverse problem cannot be described by either Eq. (6) or Eq. (9), such that neither RD (Eq. (19)) nor RH (Eq. (24)) can be obtained for the problem. Nevertheless, Eq. (2) remains a valid linear projection approximation of the nonlinear equation (Eq. (1)), such that RC, which is obtained directly from Eq. (2), can still be provided, regardless of the method used to solve the problem.

A nonlinear inverse problem can be solved via either a global optimization method or a linearized method. If the problem is solved using linearized iterative methods (e.g., Aster et al. 2005; Bourgeois et al. 1989), such as Newton’s method, then a resolution matrix can be obtained after each iteration (Jackson 1972) from Eq. (19), Eq. (24), or a similar equation. However, this resolution matrix is not any of the above-mentioned resolution matrices (RD, RH, RC, RI, and RS), but rather one of three new resolution matrices that are specifically constructed for nonlinear inverse problems.

An iterative inversion is designed to invert for the model perturbation Δxi (= xxi–1) of the reference model xi–1 at the ith iteration on the basis of the first-order approximation of the inverse problem (Eq. (4)):

$$\Delta {{\varvec{d}}}^i = {{\varvec{G}}}_{\text{J}}^i \Delta {{\varvec{x}}}^i .$$
(58)

where Δdi = d − g(xi−1) and GJi is the Jacobian matrix of the partial derivatives of d with respect to x, evaluated at xi−1. Equation (58) is a linear equation, such that the solution (Δxi) of the perturbation Δxi can be obtained from GJi and/or C using the same methods that were employed to obtain the model solution x from G and/or C in the above linear inversions (e.g., Eq. (11)). However, the inverted solution (Δxi) is a model perturbation, not the model itself. The inverted model (solution) after the ith iteration, xi = xi−1 + Δxi, becomes the reference model for the next iteration. This is the general procedure of Newton’s method.
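
A minimal sketch of this procedure is given below. It assumes hypothetical callables g(x) for the forward prediction and jacobian(x) for GJ, and uses an unregularized generalized inverse for clarity; a regularized solve could be substituted at the same step:

```python
import numpy as np

def newton_iterations(d, g, jacobian, x0, n_iter=5):
    """Linearized iterative inversion; returns the final model and the last R_JD^i."""
    x = x0.copy()
    R_JD = None
    for _ in range(n_iter):
        GJ = jacobian(x)                           # G_J^i evaluated at the current reference model
        dd = d - g(x)                              # data residual Delta d^i
        dx = np.linalg.pinv(GJ) @ dd               # perturbation solution Delta x^i (cf. Eq. (58))
        R_JD = np.linalg.pinv(GJ) @ GJ             # per-iteration resolution matrix (cf. Eq. (61))
        x = x + dx                                 # becomes the reference model of the next iteration
    return x, R_JD
```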

A surface-wave inversion to constrain the S-wave velocity structure (Example 3) (Fig. 12a) is synthesized here to illustrate the three resolution matrices that can be implemented in a linearized nonlinear inversion. The employed inversion is a typical nonlinear inversion approach in geophysics that is widely applied to elucidate 1-D and 3-D sedimentary, crustal, and lithospheric Earth structures (e.g., Feng and An 2010; Knopoff 1972; Snoke and James 1997; Wiggins 1972; Xia et al. 1999); therefore, the details of the nonlinear relationship between x and d are not explained here. First-order Tikhonov regularization has been widely used in this inversion approach, although it prevents the correction of bad discontinuities in the reference model (An 2020). A regularization approach that is adapted to the reference models, as suggested by An (2020), can overcome this problem and lead to a rapidly convergent iterative inversion; this regularization approach is used here. The regularization parameter was set to λ = 0.01 after a series of tests that explored the trade-off between the misfit and model flatness. The synthetic observations (d) and predictions (Fig. 12a), as well as the partial-derivative matrix GJi (Eq. (58)), were calculated using the surf96 program (Herrmann 2013). Given a reference model at the first iteration (the starting model), the model solutions after the first to fifth iterations (xi) and their fits are shown in Fig. 12a. The solution x5 after the fifth iteration is nearly the same as x4 after the fourth iteration, which implies that the inversion converged around x4; x4 is therefore considered the final solution.

Fig. 12
figure 12

Example resolution matrices for the solutions after iteration 4 of Newton’s iterations of the surface-wave dispersion inversion to determine the 1-D S-wave velocity structure. a Synthetic true model (x) and solutions (xi) for the 1-D S-wave velocity structure after the ith iteration. Synthetic observations (d) and predictions d(xi) of the surface-wave dispersion curves for solution xi. b Complete resolution matrix RC for solution x4 (i = 4). c RJH4, which is the resolution matrix for solution improvement just after iteration 4. d Difference between RJH4 and RJH1→4, which is the resolution matrix of the solution improvement on the starting model. Gaussian widths (red dashes) in (b–d) correspond to the resolution length

6.1 Linear Approximation of r:xx

The RC calculation for a nonlinear inversion is the same as that for the above linear inversions, whereby only the synthetic input models (x) and their corresponding solutions (x) are used. This calculation, which is based on either Eq. (43) or (47), is in fact a linear regression of x and x (i.e., RC is a linear approximation of r:x → x) (Table 1) (Fig. 2a). However, the relationship between x and x is often nonlinear because of the complexity of the x → x process for a nonlinear problem. The calculation of a reliable RC therefore requires more pairs of input/output models. Furthermore, the resultant RC will depend somewhat on the synthetic random input models: the closer they are to the practical medium, the more realistic the resultant RC.

6.2 Projection of Solution Improvement After an Iteration

As Eq. (58) is of the same form as Eq. (6), the process from Δxi (= xxi–1) to Δxi (= xixi–1) after the ith iteration can be expressed in a form similar to Eq. (22) to represent the x → x process (e.g., Jackson 1972; Wiggins 1972):

$$\Delta \underline {{\varvec{x}}}^i = {{\varvec{R}}}_{\text{J}}^i \Delta {{\varvec{x}}}^i$$
(59)

or:

$$\underline {{\varvec{x}}}^i - \underline {{\varvec{x}}}^{i - 1} = {{\varvec{R}}}_{\text{J}}^i ({{\varvec{x}}} - \underline {{\varvec{x}}}^{i - 1} ),$$
(60)

where RJi denotes the resolution matrix R:Δxi → Δxi. Equation (59) is of the same form as Eq. (2), such that RJi (denoted RJDi) can be obtained via Eq. (19):

$${{\varvec{R}}}_{{\text{JD}}}^i = ({{\varvec{G}}}_{\text{J}}^i )^{ - {\text{g}}} {{\varvec{G}}}_{\text{J}}^i .$$
(61)

When the regularization matrix C is used, RJi (denoted RJHi) can be obtained via Eq. (24):

$${{\varvec{R}}}_{{\text{JH}}}^i = ({{\varvec{A}}}_{\text{J}}^i )^{ - {\text{t}}} {{\varvec{G}}}_{\text{J}}^i ,$$
(62)

where AJi is the combination of GJi and C, in the same way that A is the combination of G and C in Eq. (10). The matrix RJDi (or RJHi) has often been denoted simply R in previous studies, but it has a different significance than the resolution matrix R:x → x.

The first-order Taylor expansion of Eq. (1) at xi−1 is:

$$\underline {{\varvec{x}}}^i = \underline {{\varvec{x}}}^{i - 1} + {{\varvec{J}}}_r (\underline {{\varvec{x}}}^{i - 1} )({{\varvec{x}}} - \underline {{\varvec{x}}}^{i - 1} ),$$
(63)

where Jr(xi−1) denotes the Jacobian matrix (or the gradient) of the projection r:x → x at the reference model xi–1. Equation (63) has the same form as Eq. (59); therefore, RJi is exactly the Jacobian matrix Jr(xi−1) (Table 1). In practice, RJi represents the projection from x − xi−1 to xi − xi−1 (Eq. (59)) (i.e., the projection of the solution improvement on the reference model (xi–1) just after the ith iteration) (e.g., RJ1 illustrated in Fig. 13b). The matrix RJi in Example 3 (e.g., RJH4 just after the fourth iteration (Fig. 12c)) is slightly different from RC (Fig. 12b).

Fig. 13
figure 13

Illustration of resolution matrices in a linearized inversion. The model x contains one parameter (x1). a True model (x) and solution at each iteration. b Projections with resolution matrices RC or RCi (for r:x → xi), RC1 (for r:x → x1), RJ1 at first iteration (slope of r:x → x at x0), and RJ1→i after i iterations (slope of r:x → x from x0 to xi)

6.3 Projection of the Solution Improvement up to an Iteration

The inversion after each iteration is represented by Eq. (59), but an application generally requires more than one iteration. The solution improvement from the kth to the ith iteration can be expressed as:

$$\underline {{\varvec{x}}}^i - \underline {{\varvec{x}}}^{k - 1} = {{\varvec{R}}}_{\text{J}}^{k \to i} ({{\varvec{x}}} - \underline {{\varvec{x}}}^{k - 1} ),$$
(64)

where:

$${{\varvec{R}}}_{\text{J}}^{k \to i} = {{\varvec{R}}}_{\text{J}}^i + {{\varvec{R}}}_{\text{J}}^{k \to (i - 1)} - {{\varvec{R}}}_{\text{J}}^i {{\varvec{R}}}_{\text{J}}^{k \to (i - 1)} .$$
(65)

If k = i just after the ith iteration of a given inversion, then Eq. (64) should be the same as Eq. (59). The matrix RJi→(i−1) = {0}, which means that RJi→i = RJi. Therefore, if k = 1, then Eq. (64) becomes:

$$\underline {{\varvec{x}}}^i - \underline {{\varvec{x}}}^0 = {{\varvec{R}}}_{\text{J}}^{1 \to i} ({{\varvec{x}}} - \underline {{\varvec{x}}}^0 ),$$
(66)

where:

$${{\varvec{R}}}_{\text{J}}^{1 \to i} = {{\varvec{R}}}_{\text{J}}^i + {{\varvec{R}}}_{\text{J}}^{1 \to (i - 1)} - {{\varvec{R}}}_{\text{J}}^i {{\varvec{R}}}_{\text{J}}^{1 \to (i - 1)} .$$
(67)

The matrix RJ1→i represents the projection from Δx (= x − x0) to Δx (= xi − x0), which is the projection of the solution improvement on the starting model x0 after i iterations (xi − x0) (Fig. 13b). This is different from the gradient of r:x → x at xi−1 (RJi), as RJ1→i is the slope of r:x → x from x0 to xi (Table 1). The magnitude difference between RJH1→4, which is obtained using RJH1, …, RJH4, and RJH4 (Fig. 12d) is remarkable. The matrix RJ1→i therefore represents the solution improvement in the inversion better than RJi.
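
The recursion in Eq. (67) is straightforward to apply to the per-iteration matrices. The minimal sketch below assumes that a list of RJ1, …, RJi is available from the iterations; the function name is illustrative:

```python
import numpy as np

def accumulate_RJ(RJ_list):
    """Build R_J^{1->i} from R_J^1, ..., R_J^i via the recursion of Eq. (67)."""
    m = RJ_list[0].shape[0]
    R_acc = np.zeros((m, m))                       # R_J^{1->0} = 0, so the first step gives R_J^1
    for R_i in RJ_list:
        R_acc = R_i + R_acc - R_i @ R_acc          # Eq. (67)
    return R_acc
```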

6.4 Four Types of Resolution Matrices in a Linearized Inversion

The matrices RJi:Δxi → Δxi and RJ1→i:Δx → Δx in a linearized inversion can also be obtained via the RC calculation. If the synthetic perturbations Δxi (= x − xi−1) and solution Δxi after the ith iteration are used as the true model x and corresponding solution x, respectively, then the resultant RC (denoted RJCi) should be the same as RJHi:Δxi → Δxi. If the synthetic perturbations Δx (= x − x0) and corresponding solution Δx1→i (= xi − x0) are used, then the resultant RC (denoted RJC1→i) is the same as RJH1→i.

Furthermore, if the synthetic x and the corresponding solution xi (= xi–1 + Δxi) after the ith iteration are used in the RC calculation, then the resultant RC (denoted RCi) (e.g., RC and RC1 in Fig. 13b) represents the projection from x to xi just after that iteration (Ri:x → xi). This is yet another type of resolution matrix that may appear in a linearized iterative application. If xi is the final solution x, then RCi is written as RC.

In summary, a complete resolution matrix for the x → x process can be obtained from the linear approximation of r:x → x, regardless of the method used to solve a given nonlinear inverse problem. However, a linearized iterative application can have four classes of resolution matrices (Tables 1 and 2, Fig. 13b), RC, RCi, RJ1→i, and RJi, which represent the x → x, x → xi, (x − x0) → (xi − x0), and (x − xi−1) → (xi − xi−1) projections, respectively. RC is a linear approximation of the operator r, whereas RJi is the Jacobian matrix of r. The resolution matrix RJHi, which is often provided in the literature, reflects the solution improvement just at the ith iteration, whereas RJH1→i reflects the cumulative solution improvement from the starting model to the solution xi after all i iterations.

The surface-wave dispersion inversion in Example 3 highlights that even though the magnitudes of RC, RJH4, and RJH1→4 (Fig. 12b–d) are obviously different, their magnitude patterns (that of RJH1→4 is not shown here) are similar. The resolution lengths estimated from the three matrices (Fig. 12d) are also similar. Therefore, the resolution lengths retrieved from either RJH1→4 or RJH4 in a surface-wave dispersion inversion are also acceptable if RC cannot be given.

7 Conclusion

Here, we reviewed previous resolution matrices and their applications to clarify the properties of resolution matrices in linear and nonlinear inversions that implement zeroth- and higher-order Tikhonov regularizations. We explained how to use the resolution matrix to understand both the resolvability of the medium parameters and the constrainability of the solution parameters. Furthermore, we suggested a new resolution matrix, the complete resolution matrix, which reflects all of the factors in a study system. This new matrix, which overcomes many of the limitations of previous matrices, can be broadly applied in linear and nonlinear (inverse and non-inverse) problems. This study is designed to help the reader fully understand both the concept and the application of a resolution matrix, appraise a solution appropriately, and recognize the relationship between the solution and all of the factors in the study. These suggestions on resolution matrices can also guide the reader in improving the study system.