On Resolution Matrices

Solution appraisal, realized on the basis of projections from the true medium to the solution, is an essential procedure in practical studies, especially in computed tomography. The projection operator in a linear problem, or its linear approximation in a nonlinear problem, is the resolution matrix for the solution (or model). In practical applications, a resolution matrix can be used to quantitatively retrieve the resolvability of the medium, the constrainability of the solution parameters, and the relationship between the solution and the factors in the study system. A given row vector of the matrix for a solution parameter can be used to quantify the resolvability, the deviation from expectation, and the difference between that solution parameter and its neighbor from the main-diagonal element, the row-vector sum, and the difference between neighboring elements in the row vector, respectively. The resolution length of a solution parameter should be estimated from the row vector, although the estimate may be unreliable when the vector is unstable (e.g., due to errors). Comparatively, the resolution lengths estimated from the column vectors of the observation-constrained parameters are reliable in this instance. Previous studies have generally employed either the direct resolution matrix or the hybrid resolution matrix as the model resolution matrix. The direct resolution matrix and the hybrid resolution matrix in an inversion with damping (or general Tikhonov regularization) are Gramian (e.g., symmetric). The hybrid resolution matrix in an inversion using zero-row-sum regularization matrices (e.g., higher-order Tikhonov regularizations) is one-row-sum but is not a stochastic matrix. When the two resolution matrices appear in iterative nonlinear inversions, they are not a projection of the solution, but rather the gradient of the projection or a projection of the solution improvement immediately after a given iteration.
Regardless, their resultant resolution lengths in iterative nonlinear inversions of surface-wave dispersion remain similar to those from the projection of the solution. The solution is influenced by various factors in the study, but the direct resolution matrix is derived only from the observation matrix, whereas the hybrid resolution matrix is derived from the observation and regularization matrices. These limitations imply that the appropriateness of the two resolution matrices may be questionable in practical applications. Here we propose a new complete resolution matrix to overcome these limitations, in which all of the factors (e.g., errors) in linear or nonlinear (inverse or non-inverse) studies can be incorporated. Insights on all of the above are essential for ensuring a reliable and appropriate application of the resolution matrix to appraise the model/solution and to understand the relationship between the solution and all of the factors in the study system, which is also important for improving the system.


Introduction
Geophysics is the investigation of the Earth based on the principles of physics, whereby the physical properties of the Earth medium (i.e., the unknown model) are inverted from either surface- or space-based observations. Geophysical inverse problems are often underdetermined (e.g., Menke 2015; Tarantola and Valette 1982; Wiggins 1972) owing to the various limitations of these observations. However, advances in generalized or regularized inverse theory over the last century now make it possible to solve underdetermined inverse problems (e.g., Hansen 1992; Lawson and Hanson 1995; Levenberg 1944; Moore 1920; Morozov 1984; Penrose 1955; Tikhonov 1963). Although the quality and reliability of such an under-constrained solution are essential in a data-poor environment, the verification of a geophysical solution is difficult or even impossible, as the investigated medium is generally inaccessible. As such, geophysicists have to make great efforts in the methodology of solution appraisal. However, understanding what is reliable in some geophysical solutions remains perhaps the most exciting challenge to date (Foulger et al. 2015).
An inverted solution (x̂) that represents the investigated medium (x) can be described as:

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/s00024-022-03211-9.
x̂ = r(x), (1)

where the operator r, which can also be denoted by r: x → x̂, represents the relationship between x and x̂. This problem can be linearized as:

x̂ = R x, (2)

where R, which can also be denoted by R: x → x̂, is the resolution matrix (Backus and Gilbert 1968, 1970), a linear projection approximation of r: x → x̂ (Eq. (1)). R has been widely used in solution appraisals in geophysics (e.g., Aki et al. 1977; Aster et al. 2005; Menke 2015; Tarantola and Valette 1982; Wiggins 1972; Yao et al. 1999) and other research areas (e.g., Lütkenhöner and Grave de Peralta Menendez 1997; Katamreddy and Yalavarthy 2012). However, the properties of R remain poorly understood in practical problems, and considerable uncertainties and/or inconsistencies regarding R still exist. For example, early studies mostly analyzed the diagonal entries to evaluate the resolvability of the target medium (e.g., Aki et al. 1977; Day-Lewis et al. 2005; Wiggins 1972), even though the solution is actually related to all of the entries in R (Eq. (2)). R has mostly been applied to estimate the resolution length (or resolution width). This length should be retrieved from a given row of R (e.g., An 2012; Barmin et al. 2001; Crosson 1976), although the result appears to be similar to that from the corresponding column (e.g., Alumbaugh and Newman 2000; Miller and Routh 2007; Pilkington 2016). Why and when are they similar? An inverse problem is often solved using a reference model, whereby the inverted solution is not a model of the medium but rather a perturbation (Δx̂) of the reference model (x_i). R can be provided in such inversions (e.g., Jackson 1972; Ren and Kalscheuer 2020), but it represents the Δx → Δx̂ projection, not x → x̂. What is the relationship between this matrix and r: x → x̂? Can the matrix be used to estimate the resolution length of the solution x̂ (= Δx̂ + x_i)? All of the above questions will be addressed in this paper.
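The projection in Eq. (2) can be made concrete with a small numerical experiment. The following is a minimal NumPy sketch; the toy observation matrix, the synthetic medium, and the problem sizes are invented for illustration and are not from the paper:

```python
import numpy as np

# Hypothetical toy problem: 3 observations constraining 6 unknowns (underdetermined).
rng = np.random.default_rng(0)
G = rng.standard_normal((3, 6))   # observation (sensitivity) matrix
x_true = rng.standard_normal(6)   # synthetic "true" medium x
d = G @ x_true                    # error-free synthetic observations

# Minimum-norm generalized-inverse solution and its resolution matrix R = G^-g G.
G_inv = np.linalg.pinv(G)
x_hat = G_inv @ d
R = G_inv @ G

# For a linear problem with error-free data, the solution equals the
# projection of the true medium through R (Eq. (2)): x_hat = R x_true.
assert np.allclose(x_hat, R @ x_true)
print(np.round(np.diag(R), 2))    # diagonal entries: resolvability of each parameter
```

The assertion holds only in this idealized error-free linear setting; the later sections of the paper discuss exactly which factors break this identity in practice.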
R is particularly useful in model appraisal, but it comes with various limitations. For example, some general factors (e.g., observational and data-processing errors) cannot be reflected by the matrix. The resolution estimated from a matrix with such limitations may be unrealistically high (Pilkington 2016). A matrix R: x → x̂ that incorporates all of the factors in the complete process would overcome these limitations. However, no such matrix exists to date.
In summary, R is a unique quantitative indicator of the reliability of a given solution, and it is also important for understanding the relationships between the solution and the observations, regularization, and other factors. However, various uncertainties and limitations regarding R remain. This paper reviews previously proposed resolution matrices and clarifies both the significance and the properties of the matrices that often appear in practical inversions, to explain how to appropriately employ such a matrix in a given study. Furthermore, this paper clarifies the resolution matrices in nonlinear inversions and suggests a new resolution matrix that can include all of the factors in a linear/nonlinear problem. This study can therefore assist in the appropriate selection and implementation of a resolution matrix and provide the reader with a better understanding of both the quality and reliability of the solution and the relationships between the solution and all of the factors in the study system, which are important for further improving the system.

Resolution Matrices from Observations and Regularization Matrices
Resolution matrices derived from observations and regularization matrices have been widely applied in various research studies. This section reviews these resolution matrices and their significance. Furthermore, the matrix properties, which are important in matrix applications, are clarified here for the first time.

Variables Used
x, x̂: real-medium (or true model) vector; solution (or inverted model) vector.
x_i, x̂_i: ith real-medium parameter; ith solution parameter.
r_{i,j}: entry at the ith row and jth column of matrix R.
r_{i,*} (or r_{*,j}): ith row (or jth column) vector of matrix R.
Σr_{i,*}: sum of all of the entries in the ith row of matrix R.

Solutions of the Inverse Problem
The goal of a geophysical investigation is to directly retrieve the solution of the medium (x̂_D) from observations (d = d̄ + δd) that are contaminated with errors (δd) via the inverse equation:

x̂_D = g^{-g}(d), (3)

which is based on the physical relationship between the medium parameters (the m × 1 vector x = [x_1, x_2, ..., x_m]^T) and the observational data (the n × 1 vector d = [d_1, d_2, ..., d_n]^T), with the latter defined as:

d = g(x). (4)

The operator g in Eq. (4) is often not invertible, such that a generalized inverse of g, g^{-g}, is used, as in Eq. (3). Equation (3) can be expressed in linear form as:

x̂_D = G^{-g} d, (5)

and the error-free form of Eq. (4) can be rewritten as:

d̄ = G x, (6)

where the n × m matrix G, normally termed the observation matrix, is composed of the sensitivities of d with respect to x. The generalized inverse of G, G^{-g} (e.g., Lawson and Hanson 1995; Moore 1920; Penrose 1955; Tan 2017; Tarantola and Valette 1982), can be either a right inverse:

G^{-g} = G^T (G G^T)^{-1}, (7)

or a left inverse:

G^{-g} = (G^T G)^{-1} G^T. (8)

G^{-g} can also be obtained via singular value decomposition (SVD) (Golub and Kahan 1965; Golub and Reinsch 1970; Varah 1973) or some other form of matrix factorization (e.g., Gentle 2007; Golub 1965). The G^{-g} matrices calculated from these methods are tightly related or exactly equivalent. Figure 1a shows an example (Example 1) of a one-dimensional (1-D) underdetermined inverse problem that mimics the relationship between distance, slowness (x), and travel time (d). The problem is described by 10 linear equations (Eq. (S1) in the supporting information), with 100 (m = 100) unknown parameters in x and 10 (n = 10) observations (d). The coefficient of the ith parameter (x_i) in the jth equation for d_j is the product of the segment length of x_i that is traveled by the jth observation (Fig. 1a) and the sensitivity of d_j with respect to x_i. All of the coefficients are stored in G (Fig. 1b). Several coefficients in Eq. (S1) (or entries in G) for the second and ninth observations are greater than one, which means that higher sensitivities are set for those parameters, because the segment lengths of all the parameters are equal to one (Fig. 1a). Figure 1c shows a synthetic x and its pseudo-inverse solution x̂_D. The synthetic error-free observation d̄ (δd = {0}) and the prediction of x̂_D (d_D) are shown in Fig. 1d.
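The equivalence of the right-inverse route (Eq. (7)) and the SVD route to G^{-g} is easy to verify numerically. The sketch below uses an assumed random, full-row-rank G (the sizes and values are illustrative only, not Example 1):

```python
import numpy as np

# Assumed toy setup: n = 4 observations, m = 10 unknowns, G of full row rank.
rng = np.random.default_rng(1)
G = rng.standard_normal((4, 10))

# Right inverse, Eq. (7): G^-g = G^T (G G^T)^{-1}, so that G G^-g = I_n.
right_inv = G.T @ np.linalg.inv(G @ G.T)

# SVD route: G = U S V^T  =>  G^-g = V S^{-1} U^T.
U, s, Vt = np.linalg.svd(G, full_matrices=False)
svd_inv = Vt.T @ np.diag(1.0 / s) @ U.T

assert np.allclose(right_inv, svd_inv)            # the two routes agree
assert np.allclose(right_inv, np.linalg.pinv(G))  # both equal the pseudo-inverse
assert np.allclose(G @ right_inv, np.eye(4))      # right-inverse property
```

For a full-row-rank G these three constructions are exactly equivalent, which is the "tightly related or exactly equivalent" statement in the text.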
Both the data coverage (observation distribution) and the sensitivities, which are stored in G, influence the solution in the synthetic example (Fig. 1). For example, the solution parameters x̂_1 to x̂_10 are constrained by only one observation (the first travel path; the top left arrow in Fig. 1a), such that the nonzero entries for all ten parameters in the first row of G (Fig. 1b) and the coefficients in the first row of Eq. (S1) are the same and equal to one. Similarly, the parameters x̂_11 to x̂_20 are also constrained by one observation (the second path), but the nonzero entries for these ten parameters in the second row of G (Fig. 1b) and the coefficients in the second row of Eq. (S1) are different. The resultant x̂_1–x̂_10 values in x̂_D (Fig. 1c) are the same and equal to the average of the synthetic model parameters x_1–x_10 (circles in Fig. 1c), whereas the resultant x̂_11–x̂_20 values are quite different, both from each other and from their respective synthetic values (x_11–x_20). The differences in x̂_11–x̂_20 are caused by the different sensitivities because they all have the same path segments. Parameters x_21–x_40 and x_41–x_60 are constrained by two paths each, but their path-overlapping patterns (Fig. 1a and b) differ. The differences in x̂_21–x̂_60 in x̂_D (Fig. 1c) are related to the path coverage. Parameters x_61–x_90 are constrained by several observations. x_91–x_100 are not constrained by any observations, such that they are all equal to zero in x̂_D.
However, most geophysical inverse problems are ill-posed (i.e., there are not sufficient observations to obtain a unique and stable solution), and the generalized solution x̂_D, such as that in Example 1 (Fig. 1c), is not physically rational. Furthermore, large discrepancies may exist between the real model x and the solution x̂_D, especially for parameters with poor or no observation coverage (Fig. 1a–c), even though there is a good fit between the observational data and the model predictions (Fig. 1d). Additional artificial constraints (often called regularization) (e.g., Aster et al. 2005; Benning and Burger 2018; Engl et al. 2000; Levenberg 1944; Menke 1989; Tikhonov 1963) on the model parameters must therefore be included during the inversion to obtain a physically rational solution. The regularization-based forward equation then becomes:

b = A x, (9)

where the n_b × m matrix A and the n_b × 1 vector b are:

A = [G; C],  b = [d; c], (10)

respectively (with G stacked above C, and d above c), which contain an n_c × m (n_c = n_b − n) matrix C and an n_c × 1 vector c, both of which are related to the regularization. Tikhonov regularization (Levenberg 1944; Tikhonov 1963) is widely used in geophysical inversions (e.g., Aster et al. 2005; Constable et al. 1987; Menke 1989). C is denoted by λL_n in nth-order Tikhonov regularization, and c (Eq. (10)) is a zero vector ({0}). The factor λ is the regularization parameter, which balances the contributions of the observations and the regularization in the inversion, or equivalently the fit to the observational data d and to the regularization vector c.

Figure 1. Inverse problem Example 1. This is a simple 1-D ray-propagation (linear) problem that is defined by either d = Gx or Eq. (S1). Arrow lines in (a) illustrate the ray paths. The observation vector d contains the travel times for all ten rays (d_j; j = 1, ..., 10). The vector x includes the slowness (reciprocal of speed) in each unit segment to be resolved (x_i; i = 1, ..., 100). The matrix G in (b) contains the combinations of the distance segments travelled by the ten rays and their synthetic sensitivities. Several parameters are set to a high sensitivity in the second and ninth observations, which simulates a study with parameters that possess different sensitivities. (c) Synthetic model x (circles) and solutions x̂_D, x̂(I), and x̂(L_1) (lines) obtained via generalized inversions, with zeroth- (I) and first-order (L_1) Tikhonov regularizations included in the latter two inversions. (d) The synthetic observation data d̄ (circles) from x, and the predictions d_D, d(I), and d(L_1) (lines) from the solutions x̂_D, x̂(I), and x̂(L_1), respectively.

114 M. An Pure Appl. Geophys.
Tests that employ ad hoc methods (e.g., Craven and Wahba 1978; Hansen 1992; Morozov 1984) are often used to determine λ. The matrix L_0 for zeroth-order Tikhonov regularization (damping regularization), which minimizes the model norm (Levenberg 1944), is the identity matrix I. L_n (n > 0; e.g., L_1 in Eq. (S2) in the supporting information) is a zero-row-sum band matrix. L_1 regularization (flatness regularization) flattens the model by minimizing its first-order gradient. L_n Tikhonov regularization exerts uniform a priori constraints on all of the solution parameters. Therefore, regularization (C = λL_n) with an optimal weighting factor λ produces the optimal average regularizing effect for all of the parameters. However, if C contains a diagonal matrix W (= diag(w_1, w_2, ...)) (i.e., C = λWL_n), then the regularization is heterogeneous and spatially variant for the model (e.g., An 2020; Katamreddy and Yalavarthy 2012; Pogue et al. 1999; Sanny et al. 2018). Regularization at least makes A a full-column-rank matrix, such that A^{-1} (or the left inverse A^{-g}), which is an m × n_b matrix, can be uniquely obtained. The solution (x̂) of an inversion using regularization is:

x̂ = A^{-g} b, (11)

where b in Eq. (10) contains the measured data d. x̂ can also be obtained via least-squares or minimum-norm inversions using the objective function:

min_x ( ||G x − d||² + ||C x − c||² ). (12)

When Tikhonov regularization (C = λL, c = {0}) is used, a truncated form of A^{-g} (the m × n matrix A^{-t}) can be obtained via (e.g., Aster et al. 2005; Barmin et al. 2001; Crosson 1976):

A^{-t} = (G^T G + C^T C)^{-1} G^T, (13)

and x̂ is then given by:

x̂ = A^{-t} d, (14)

which is the same as that in Eq. (11). x̂ in Example 1 (Fig. 1a), obtained using first-order Tikhonov regularization, is shown in Fig. 1c. An optimal factor λ of one is selected via Morozov's discrepancy principle (Morozov 1984) for the inversion.
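The truncated-inverse solution of Eqs. (13)–(14) with first-order Tikhonov (flatness) regularization can be sketched as follows. The random G, the data, and λ = 1 are assumptions for illustration; only the structure of L_1 and the formula for A^{-t} follow the text:

```python
import numpy as np

def first_order_tikhonov(G, d, lam):
    """Sketch of Eqs. (13)-(14): x_hat = (G^T G + lam^2 L1^T L1)^{-1} G^T d,
    with L1 the first-difference (flatness) operator, a zero-row-sum band matrix."""
    m = G.shape[1]
    # Rows of L1 look like [..., -1, 1, ...]: each row sums to zero.
    L1 = np.eye(m - 1, m, 1) - np.eye(m - 1, m)
    A_t = np.linalg.solve(G.T @ G + lam**2 * (L1.T @ L1), G.T)  # Eq. (13)
    return A_t @ d, A_t                                          # Eq. (14)

# Assumed toy problem: 5 observations, 20 unknowns.
rng = np.random.default_rng(2)
G = rng.standard_normal((5, 20))
d = G @ np.linspace(0.0, 1.0, 20)
x_hat, A_t = first_order_tikhonov(G, d, lam=1.0)
```

Here λ would in practice be chosen by an ad hoc criterion such as Morozov's discrepancy principle, as the text notes; the fixed λ = 1 simply mirrors the choice made for Example 1.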
The resultant x̂ via Eq. (11), which is obtained using regularization, is generally more rational than x̂_D. For example, x̂ (denoted as x̂(L_1)) in Example 1 (Fig. 1c) is closer to the synthetic model than x̂_D, such that x̂ is preferred over x̂_D (Eq. (5)) in practical inversions (i.e., the solution provided in a practical ill-posed inversion is x̂ rather than x̂_D).
If the errors (δd) in the measured observation data d (= d̄ + δd) are known, then the solution x̂ in Eq. (11) becomes:

x̂ = A^{-t} d̄ + δx_d, (15)

where:

δx_d = A^{-t} δd. (16)

It is noted that δx_d is only the contribution of the observational errors to the solution and is therefore not the solution error. The solution error or residual, δx (= x̂ − x), is related to both δd and the other factors in the x → x̂ process.
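The decomposition in Eqs. (15)–(16) is a direct consequence of linearity and can be checked numerically. In this sketch the toy G, the synthetic model, and the error level are assumptions, not values from the paper:

```python
import numpy as np

# Assumed toy setup: 5 observations, 20 unknowns, first-order Tikhonov, lam = 1.
rng = np.random.default_rng(7)
G = rng.standard_normal((5, 20))
m = G.shape[1]
lam = 1.0
L1 = np.eye(m - 1, m, 1) - np.eye(m - 1, m)
A_t = np.linalg.solve(G.T @ G + lam**2 * (L1.T @ L1), G.T)   # Eq. (13)

x_true = np.sin(np.linspace(0.0, np.pi, m))   # synthetic medium (assumed)
dd = 0.01 * rng.standard_normal(5)            # observational errors delta_d (assumed)
d = G @ x_true + dd                           # contaminated data d = d_bar + delta_d

x_hat = A_t @ d                               # Eq. (14)
dx_d = A_t @ dd                               # Eq. (16): error contribution only

# Eq. (15): the solution splits into an error-free part plus dx_d.
assert np.allclose(x_hat, A_t @ (G @ x_true) + dx_d)
```

Note that `dx_d` is only the error contribution; the full residual x̂ − x also contains the regularization bias, matching the distinction drawn in the text.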

Resolution Matrices from the Observations and Regularization Matrices
The process to obtain the solution (x̂) of a true medium (x) is a projection (r: x → x̂) from x to x̂, which can be described using Eq. (1) (Fig. 2a). If the projection is linear, then x̂ can be written as a linear regression equation (Fig. 2a):

x̂ = R x + δx_of, (17)

where R is the slope of the regression and δx_of is a constant offset. If δx_of is ignored, then the linear regression becomes the linear projection of Eq. (2) (Fig. 2a), where R is the so-called resolution matrix (Backus and Gilbert 1968, 1970) or projection matrix. For a nonlinear problem, either R or R: x → x̂ can be considered a linear approximation of r: x → x̂ (Fig. 2a). However, this matrix has been obtained via matrix operations, with several different resolution matrices potentially being obtained. If the observational errors (δd) in the data d in Eq. (10) are ignored, then d = d̄. Replacing d in Eq. (5) with d̄ in Eq. (6) causes the projection from x to x̂_D (x → x̂_D) to become:

x̂_D = R_D x, (18)

where the resolution matrix R_D (Table 1) is of the following form (e.g., Jackson 1972; Menke 1989; Wiggins 1972):

R_D = G^{-g} G. (19)

The transformation from x to x̂ (x → x̂) is obtained by inserting Eq. (9) into Eq. (11):

x̂ = R_I x, (20)

where the resolution matrix R_I (Table 1) is of the form (An 2012):

R_I = A^{-g} A. (21)

When the vector c is a zero vector (e.g., in Tikhonov regularization) and d in Eq. (11) is replaced by that in Eq. (6), the transformation from x to x̂ (x → x̂) is:

x̂ = R_H x, (22)

where the resolution matrix R_H (Table 1) takes the form:

R_H = A^{-g} [G; 0], (23)

in which [G; 0] denotes the n_b × m matrix A with its regularization rows set to zero. If the truncated form A^{-t} is used, then Eq. (23) becomes:

R_H = A^{-t} G. (24)

If A^{-t} is not truncated from A^{-g} but instead calculated from Eq. (13), which employs Tikhonov regularization of the form λL, then R_H becomes (e.g., Aster et al. 2005; Barmin et al. 2001; Boschi 2003; Crosson 1976):

R_H = (G^T G + λ² L^T L)^{-1} G^T G. (25)

The three resolution matrices for Example 1, R_D, R_H in the inversion using the regularization matrix I (R_H(I)), and R_H in the inversion using the regularization matrix L_1 (R_H(L_1)), are shown in Fig. 3a–c.

Vol. 180, (2023) On Resolution Matrices
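The contrast between R_D (Eq. (19)), built from G alone, and R_H (Eqs. (24)–(25)), built from G and the regularization, can be sketched numerically. The toy G, sizes, and λ = 1 below are assumptions for illustration:

```python
import numpy as np

# Assumed toy setup: 5 observations, 20 unknowns.
rng = np.random.default_rng(3)
G = rng.standard_normal((5, 20))
m = G.shape[1]

# Direct resolution matrix, Eq. (19): depends only on the observation matrix.
R_D = np.linalg.pinv(G) @ G

# Hybrid resolution matrix, Eq. (25), with first-order Tikhonov C = lam * L1.
lam = 1.0
L1 = np.eye(m - 1, m, 1) - np.eye(m - 1, m)
A_t = np.linalg.solve(G.T @ G + lam**2 * (L1.T @ L1), G.T)
R_H = A_t @ G                                  # Eq. (24)

# Both are m x m projections x -> x_hat, but they differ once the
# regularization contributes; their difference isolates the regularization effect.
print(np.abs(R_D - R_H).max())
```

Comparing the two matrices in this way mirrors the later discussion in which the variation from R_D to R_H is attributed entirely to the regularization.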
If G is a full-column-rank matrix and no regularization is used, then x̂_D equals x̂, and R_D, R_I, and R_H are all identity matrices. Otherwise, x̂_D and x̂ are different, and the three resolution matrices differ. Regularization needs to ensure that A is a full-column-rank matrix, such that R_I is still an identity matrix. The practical solution is x̂, as opposed to x̂_D, because of the regularization, and the resolution matrix for the x → x̂ projection is R_H, as opposed to R_D. Little attention has therefore been paid to R_D in previous regularized inversions. However, R_D is still very important for understanding the reliability of the solution in such inversions, as explained in the "Resolvability and constrainability from the resolution matrix" section. Even though R_D and R_H are often different, both are commonly called the model resolution matrix, which may confuse readers. The notations suggested previously for the matrices are adopted here for clarity: R_D is the direct resolution matrix, R_I is the regularized resolution matrix, and R_H is the hybrid resolution matrix.

R_D from only the Observation Matrix
Truncated SVD of G allows R_D to be written in the form (e.g., Aster et al. 2005; Jackson 1972; Wiggins 1972):

R_D = V_q V_q^T, (26)

where V_q (= {v_{i,j}}_{m×q}, q = rank(G)) is a matrix whose columns are orthonormal right singular vectors of G. Equation (26) indicates that R_D is a Gram matrix (or Gramian), which can be created by multiplying a matrix with its own transpose. The Gram matrix (e.g., Gentle 2007) R_D (e.g., Fig. 3a) is symmetric (Table 2), with its rank (rank(R_D)) equal to both its trace (trace(R_D)) and the rank of V_q (rank(V_q) = q) (Eq. (26)). Here rank(R_D) equals q because rank(V_q) is equal to the rank of the truncated G. The diagonal elements of R_D are nonnegative, but the off-diagonal elements can be negative, unless V_q^T is a full-column-rank matrix or V_q is a full-row-rank matrix. However, V_q for an underdetermined problem is not a full-row-rank matrix because q is smaller than m, such that the off-diagonal entries of R_D of an underdetermined problem can be negative.

A negative r_{i,j} entry appears in R_D (Table 2) when some columns in two rows of G are like those in a band matrix with the entries (0 and the nonzero numbers a to d) below:

[ ... a ... b ... 0 ... ]
[ ... 0 ... c ... d ... ] (27)

Figure 3. The direct resolution matrix (a) R_D, and the hybrid resolution matrices (b) R_H(I) and (c) R_H(L_1) for the inverse problem in Fig. 1a. A regularization parameter (λ) of one is used in the inversions. R_H(I) and R_H(L_1) are for the inversions that employ the I and L_1 regularization matrices, respectively. (d, e) The 48th row and 48th column vectors of the matrices. The x_48 parameter is constrained by the fifth observation, which overlaps with the sixth observation (Fig. 1a); the parameters from x_40 to x_60 are influenced by these two observations.
The entries in row r_{48,*} of R_D (d) are zeros, with the exception of the r_{48,41}–r_{48,55} (positive) and r_{48,56}–r_{48,60} (negative) entries.

The entries in G^{-g} related to the nonzero entries a and d in G can have signs that differ from those in G, causing the negative r_{i,j}. This happens in G when x_i and x_j are constrained by different observations that also constrain another parameter in common. For example, x_48 and x_58 are respectively constrained by the fifth and sixth observations (Fig. 1a), but the two observations also constrain the solution parameters x_52–x_55 in common. Consequently, r_{48,58} in R_D is negative (Fig. 3a).

R_H with Uniform Regularization Using λI
When uniform zeroth-order Tikhonov regularization (λL_0 or λI) is used, SVD of G allows the generalized inverse A^{-t} (Eq. (13)) to be written as (e.g., Menke 2012):

A^{-t} = V_q F_q S_q^{-1} U_q^T, (28)

where U is a matrix whose columns are the left singular vectors of G, and S is a nonnegative diagonal matrix containing the singular values of G (s_i). The resolution matrix R_H(λI) (Eq. (25)) can then be written as:

R_H(λI) = V_q F_q V_q^T, (29)

where F_q is the truncated form of the diagonal filter-factor matrix F, with the positive constants (f_i):

f_i = s_i² / (s_i² + λ²). (30)

The positive diagonal matrix F_q can be written as the product of E_q (= diag(f_1^{1/2}, f_2^{1/2}, ..., f_q^{1/2})) and E_q^T.
Equation (29) can then be written as:

R_H(λI) = (V_q E_q)(V_q E_q)^T, (31)

where the jth column satisfies:

(V_q E_q)_{*,j} = f_j^{1/2} v_{*,j}. (32)

Equation (31) indicates that R_H(λI) is a Gram matrix (Table 2), like R_D. However, when a spatially variant regularization of the form λWI is used, the matrix R_H(λWI) cannot be written as a matrix multiplied by its own transpose, and is therefore not Gramian.
R_H(λI) (e.g., Fig. 3b) is symmetric, like R_D. R_H(λI) has a rank equal to rank(V_q E_q) (Eq. (31)). The diagonal entries of R_H(λI) are nonnegative, but the other entries can be negative. Equation (30) indicates that f_j^{1/2} is in the range (0, 1). Equation (32) indicates that a given column (e.g., the jth column) of V_q E_q equals the same column vector of V_q multiplied by f_j^{1/2}. All of the entries in V_q E_q are therefore closer to zero than those in V_q, as f_j^{1/2} is a positive value less than one. All of the entries in R_H(λI) therefore have weaker intensities than those in R_D, but they possess similar intensity patterns, as observed in comparisons of Fig. 3a and b, and of Fig. 3d and e. Furthermore, trace(R_H(λI)) is smaller than trace(R_D), such that the resolvability of x decreases after regularization using λI, as explained in the "Resolvability and constrainability from the resolution matrix" section.
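The filter-factor relations of Eqs. (28)–(31) and the trace inequality can be verified directly from an SVD. The random G and λ = 1 in this sketch are assumptions for illustration:

```python
import numpy as np

# Assumed toy setup: 5 observations, 20 unknowns, damping parameter lam = 1.
rng = np.random.default_rng(4)
G = rng.standard_normal((5, 20))
lam = 1.0

U, s, Vt = np.linalg.svd(G, full_matrices=False)
f = s**2 / (s**2 + lam**2)            # filter factors, Eq. (30): 0 < f_i < 1

R_H = Vt.T @ np.diag(f) @ Vt          # Eq. (29): R_H(lam*I) = V_q F_q V_q^T
R_D = Vt.T @ Vt                       # Eq. (26): damping-free limit (all f_i = 1)

assert np.allclose(R_H, R_H.T)        # Gramian, hence symmetric (Eq. (31))
assert np.trace(R_H) < np.trace(R_D)  # resolvability decreases under damping
```

The last assertion is exactly the trace comparison in the text: trace(R_D) = q while trace(R_H(λI)) = Σ f_i < q for any λ > 0.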

R_H Using a Zero-Row-Sum Regularization Matrix
The matrices of derivative regularizations, e.g., the higher-order Tikhonov regularization matrices λL_n and λWL_n (n > 0), are zero-row-sum n_s × m matrices (S_0) (n_s > 0), such that:

S_0 1 = 0, (33)

where 1 (= {1}_{m×1}) is a vector with all elements equal to one and 0 is a zero vector. Regardless of the number (n_s) of rows of S_0, the relation below:

S_0^T S_0 1 = 0, (34)

exists. If the regularization matrix (C, Eq. (10)) is a zero-row-sum matrix (S_0), then Eq. (34) allows a relation on the resolution matrix R_H (Eqs. (24) and (25)):

R_H 1 = (G^T G + S_0^T S_0)^{-1} G^T G 1 = (G^T G + S_0^T S_0)^{-1} (G^T G + S_0^T S_0) 1 = 1. (35)

Therefore, when the regularization matrix is a zero-row-sum matrix (S_0), the resolution matrix R_H is a one-row-sum matrix S_1 (with S_1 1 = 1). This can be simplified as:

R_H(S_0) 1 = 1. (36)

The higher-order Tikhonov regularization matrices λL_n and λWL_n (n > 0) (e.g., flatness and smoothness) are both zero-row-sum S_0 matrices. Therefore, R_H(λL_n) and R_H(λWL_n) are S_1 matrices (i.e., all of the rows in R_H(λL_n) and R_H(λWL_n) sum to one; Table 2). All of the rows in R_H(L_1) in Fig. 3c sum to one, as shown in Fig. 4a.

Table 2. Properties of typical resolution matrices (fragment; columns: Matrix, Inverse problem, Properties). R_D: linear or linearized nonlinear; Gramian (e.g., symmetric, with nonnegative main-diagonal elements (but not the off-diagonal elements)).
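The one-row-sum property of Eqs. (33)–(36) follows purely from the zero row sums of the regularization matrix and can be confirmed numerically. The toy G and λ below are assumptions; the L_1 operator is the zero-row-sum first-difference matrix from the text:

```python
import numpy as np

# Assumed toy setup: 5 observations, 20 unknowns, first-order Tikhonov, lam = 1.
rng = np.random.default_rng(5)
G = rng.standard_normal((5, 20))
m = G.shape[1]
lam = 1.0
L1 = np.eye(m - 1, m, 1) - np.eye(m - 1, m)   # zero-row-sum S_0 matrix

A_t = np.linalg.solve(G.T @ G + lam**2 * (L1.T @ L1), G.T)
R_H = A_t @ G                                  # Eq. (24)

assert np.allclose(L1 @ np.ones(m), 0.0)       # Eq. (33): S_0 1 = 0
assert np.allclose(R_H @ np.ones(m), 1.0)      # Eq. (36): every row of R_H sums to 1
print(R_H.min())  # entries may still be negative, so R_H is not stochastic
```

Note that the row sums equal one exactly (up to floating-point precision) for any λ, because λ²L_1ᵀL_1·1 = 0 cancels in Eq. (35).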
The resolution matrix R_H(S_0) is similar to a stochastic matrix (a square matrix with nonnegative elements and each row summing to one), but the entries in R_H (or S_1) can be negative for the same reason given above for R_D. The S_0 matrix (often a band matrix) or the matrix combining S_0 with G (Eq. (36) or A in Eq. (10)) often includes two rows like those in Eq. (27); i.e., the regularization always yields two parameters (e.g., x_1 and x_11) that are constrained by the same observation (the first row of G), with one of them (x_11) also being constrained by another observation or regularization row (the tenth row of L_1 in Eq. (S2)). The two rows also constrain a third parameter (x_10) in common. The entries in A^{-t} and A^{-g} related to these parameters often have different signs from those in G, and the r_{1,11} entry in R_H(L_1) (Fig. 3c) is consequently negative. Therefore, the application of L_n regularization can cause more negative entries in R_H(L_n) (e.g., Fig. 3c) than in R_D (Fig. 3a), and R_H is thus not a stochastic matrix.
A one-row-sum matrix R_H(S_1) implies that one is an eigenvalue of the projection R_H, such that there exists an equation:

R_H x_p = x_p. (37)

Equations (22) and (37) indicate that a medium (x_p) can be fully resolved and equal to a solution (x̂ = R_H x_p = x_p), even though x_p is not the medium that is currently measured. It is impossible for all of the row sums in R_H(λI) to equal one, thereby demonstrating that higher-order Tikhonov regularization is superior to damping (or λI) regularization from the viewpoint of row sums.

R_H Using Mixed Regularization
If the regularization mixes higher-order regularizations (e.g., L_2 and L_3), then the combined regularization matrix is still an S_0 matrix, and R_H is an S_1 matrix. Otherwise, if it mixes damping and a higher-order L_n (n > 0) (e.g., Sigloch 2011; Tewarson 1977), then the mixed regularization matrix C is neither a diagonal matrix nor a zero-row-sum matrix. Therefore, the resulting R_H is neither a symmetric (Gram) matrix nor a one-row-sum matrix.

Row Vector = Content Function of the Medium
Equation (2) highlights that a solution parameter x̂_i can be expressed as a weighted sum of all of the model parameters in x (= [x_1, x_2, ..., x_m]^T):

x̂_i = Σ_j r_{i,j} x_j, (38)

where the r_{i,j} entry of matrix R weights the jth model parameter x_j in the summation. The r_{i,*} row vector of R therefore acts like an averaging vector (Backus and Gilbert 1968), with the entries in r_{i,*} representing the accurate contents (or contributions) of all of the medium parameters to (or in) the ith solution parameter x̂_i. Therefore, r_{i,*} can also be termed the content (or contribution) function of the medium in x̂_i (Table 3).
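The row-as-averaging-vector reading of Eq. (38) can be checked in one line: a single solution parameter equals the dot product of the corresponding row of R with the whole medium vector. The toy sizes and matrices below are assumptions for illustration:

```python
import numpy as np

# Assumed toy setup: 4 observations, 12 unknowns, error-free data.
rng = np.random.default_rng(6)
G = rng.standard_normal((4, 12))
x_true = rng.standard_normal(12)

G_inv = np.linalg.pinv(G)
R = G_inv @ G                      # direct resolution matrix
x_hat = G_inv @ (G @ x_true)       # error-free generalized-inverse solution

# Eq. (38): x_hat_i = r_{i,*} . x, i.e., the i-th row of R is the set of
# weights (the "content function") of all medium parameters in x_hat_i.
i = 5
assert np.isclose(x_hat[i], R[i, :] @ x_true)
```

This makes explicit why a single diagonal entry r_{i,i} alone understates the situation: every entry of the row contributes to x̂_i.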

Column Vector = Spreading Function of the Medium
The r_{i,j} entry signifies the contribution of the medium parameter x_j to x̂_i, such that all of the entries in the jth column vector (r_{*,j}) (Table 3) correspond to the contributions of the jth model parameter (x_j) to all of the parameters in x̂, i.e., the spread of x_j into those parameters. This has been considered the Green's function (also termed the point spread function (Smith 1997) or impulse response function) of x_j to x̂.

Significances of the Matrices
The matrices R_D, R_I, and R_H are slopes of the linear projection from x to x̂ but have different significances (Table 1) for the transformation. Regularization at least makes the matrix A full column rank (left-invertible); i.e., R_I should be an identity matrix. If the x → x̂ projection matrix R_I is an identity matrix, then a unique solution (x̂) can be obtained from a given G and C. Otherwise, some of the model parameters remain poorly constrained, and further regularization should be employed. R_D is produced only from G (Eq. (19)). Therefore, R_D represents the effects and contributions of the given observations on x̂, regardless of whether regularization is used.
R_H reflects only the x → x̂ projection, with no consideration of the other factors (e.g., errors) in the study. Its composition from both G and C (Eq. (23)) means that it reflects some combination of the observational and regularization effects in the solution. The variations or differences between R_H and R_D therefore represent the effects of regularization on the inversion, as R_D reflects only the observational effects on the solution. All three matrices, especially R_H and R_D, are therefore essential for understanding the inversion and its result, such as the resolvability of x, the uniqueness and reliability of x̂, and the effects of regularization.

Column Vector Variations Due to Regularization Changes
The role of regularization in the projection from x to x̂ can be revealed via a comparison of the resolution matrices R_D and R_H. One entry (r_{i,j}) of either R_D in Eq. (19) or R_H in Eq. (24) can be written as:

r_{i,j} = Σ_k u_{i,k} g_{k,j}, (39)

where u_{i,k} represents an entry of either G^{-g} or A^{-t}, and g_{k,j} is an entry of G. R_H (Eq. (24)) can be considered R_D (Eq. (19)) with the variation in u_{i,k} obtained by replacing G^{-g} with A^{-t}. Equation (39) indicates that if the g_{*,j} column vector of G is given, then a variation of u_{i,k} only influences the jth column vector (r_{*,j}) of R. Therefore, the addition of regularization (C) in A (Eq. (10)) allows the jth column vector in R_H to be considered a function of the jth column in R_D (with no relationship arising among the other columns). Regularization essentially changes the spread functions from R_D to R_H, such that the spread function is sensitive to regularization (Table 3). These column-vector variations due to regularization changes are well illustrated by comparing R_D and R_H(L_1) (Fig. 5c and d, respectively) for a linear inverse problem example (Example 2) with three point observations (G in Fig. 5a) and 50 unknowns in x (Fig. 5a and b). The L_1 regularization matrix and λ = 1 are used in the inversion. If a column in R_D (e.g., r_{*,5}) is all zeros (Fig. 5c), then the corresponding column vector in R_H (r_{*,5}) (Fig. 5d) will be all zeros. In contrast, the tenth column in R_D has one nonzero entry (r_{10,10}), and the tenth column vector in R_H(L_1) (r_{*,10}) (Fig. 5c and d) has nonzero entries around the r_{10,10} entry. The variations between a column in R_D and that in R_H are due to regularization. Similar results can be found via a comparison of R_H(L_1) and R_D (Fig. 3) for Example 1 (Fig. 1).
Comparisons of the column vectors of R_D and R_H for the same parameter x_j can reveal the regularization effect, as the variations in the spread functions from R_D to R_H are due to regularization. The jth column vector in R_H is only related to the jth column in R_D, such that the relationship between the relative magnitudes of neighboring entries in a row vector of R_D may be preserved in R_H. A given row vector in R_H may therefore exhibit a similar pattern or curve to that in R_D. For example, the curves of the r_48,* row vectors in R_D and R_H (Fig. 3d) are similar, but those of the r_*,48 column vectors (Fig. 3e) are very different.
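The column-wise effect of regularization described above can be sketched numerically. The following toy example is invented for illustration (the observation matrix, flatness regularization matrix, and λ are assumptions, not the paper's Example 2): it builds R_D from a pseudoinverse and R_H from a first-order Tikhonov normal-equations solve, and checks that columns that are zero in R_D remain zero in R_H, as Eq. (39) implies.

```python
import numpy as np

m = 12                      # number of model parameters (invented)
# Hypothetical observation matrix: three point observations,
# one nonzero entry per row
G = np.zeros((3, m))
G[0, 2] = 1.0
G[1, 5] = 1.0
G[2, 8] = 1.0

# First-difference (flatness, L1) regularization matrix; rows sum to zero
L1 = np.diff(np.eye(m), axis=0)          # (m-1) x m

lam = 1.0
# Direct resolution matrix: R_D = G^-g G, using the pseudoinverse for G^-g
R_D = np.linalg.pinv(G) @ G
# Hybrid resolution matrix for Tikhonov-regularized least squares:
# R_H = (G^T G + lam^2 L1^T L1)^-1 G^T G
R_H = np.linalg.solve(G.T @ G + lam**2 * L1.T @ L1, G.T @ G)

# Columns that are all zero in R_D stay all zero in R_H (Eq. (39))
zero_cols = [j for j in range(m) if np.allclose(R_D[:, j], 0)]
print(all(np.allclose(R_H[:, j], 0) for j in zero_cols))
```

The nonzero columns of R_H, by contrast, spread around the point-observation entries of R_D, mirroring the behavior seen in Fig. 5c and d.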

Do the Projection Matrices Represent Practical Projections?
The projection r: x → x̂ (Eq. (1)) includes the effects of all of the factors (e.g., uncertainty in observation d and prior information c (Menke 2015), Eq. (12)) in the process from x to x̂, but R_D, R_I, and R_H do not. The resolution estimated from R_H can be unrealistically higher than that obtained via synthetic tests (Pilkington 2016), which may be due to this limitation of the projection matrix.

R_H Versus Observational Errors
The observational errors δd influence the solution x̂. When δd is considered, Eq. (2) becomes:

x̂ = R(δd) x,   (40)

where R(δd) represents R as a function of δd, with R(δd = 0) equaling R for the error-free case. Inserting Eq. (6) into Eq. (15) then yields a projection that is similar to the form in Eq. (17):

x̂ = R_H x + δx_d.   (41)

M. An Pure Appl. Geophys.
Equation (41) shows that R_H is independent of δx_d, which represents the effect of δd in the solution. However, the error ranges can be used to weight the data in an inversion that employs a weighted least squares (WLS) approach; the R_H determined via a WLS inversion is slightly different from the above R_H, but it is also not R(δd). Therefore, R_H does not include the effect of observational errors.
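This independence can be sketched numerically. The small linear problem below is invented (G, the flatness regularization, λ, and the error vector are assumptions): perturbing the data shifts the solution by A^-t δd, while the matrix R_H itself does not change.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 12, 6
G = rng.normal(size=(n, m))              # invented observation matrix
L1 = np.diff(np.eye(m), axis=0)          # flatness regularization
lam = 1.0

# A^-t G^T in the regularized normal equations: maps data d to solution x_hat
A_inv_Gt = np.linalg.solve(G.T @ G + lam**2 * L1.T @ L1, G.T)
R_H = A_inv_Gt @ G                       # hybrid resolution matrix

x = rng.normal(size=m)                   # a synthetic true model
dd = 0.1 * rng.normal(size=n)            # hypothetical observational error

x_hat_clean = A_inv_Gt @ (G @ x)         # error-free solution = R_H x
x_hat_noisy = A_inv_Gt @ (G @ x + dd)    # noisy solution = R_H x + dx_d

# The solution moves with dd, but R_H is built from G and L1 only:
print(np.allclose(x_hat_noisy, R_H @ x + A_inv_Gt @ dd))
```

The term `A_inv_Gt @ dd` is the solution error δx_d of Eq. (41); nothing in the construction of `R_H` depends on it.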

Resolution Matrix of the Full Process from x to x̂?
Is there a matrix that reflects all of the factors in the x → x̂ process? If x̂ is the result of a process that incorporates all of the factors, then the resolution matrix R that is directly inverted from x and x̂ via Eq. (2) reflects all of the factors. However, the inversion scheme to determine such a matrix is often impossible in geophysics: the true model for a region (x) may never be known, and a matrix R for a region with known x would in any case be redundant and unnecessary.

Figure 5
Resolution matrices for linear inverse problem Example 2. a The observation matrix G. The observations consist of only three points at parameters x_10, x_30, and x_32. Only one entry in each row of G is nonzero. b A synthetic model and its corresponding solution, which was constrained using first-order Tikhonov regularization and λ = 1. c, d Resolution matrices R_D and R_H. R_D has only three nonzero entries. R_H has more nonzero entries, but they are in the same columns (10, 30, and 32) as those in R_D.

Vol. 180, (2023) On Resolution Matrices

Equation (2) represents a transformation from the real model x to the corresponding solution x̂, or a process via the projection r: x → x̂ (or R). This projection is independent of x and can be isolated from the practical true model x and practical solution x̂. If x is not the true medium but rather a synthetic model, then the inversion of R is possible.
Recovery tests, e.g., checkerboard tests (Lévêque et al. 1993), that employ a synthetic model with a specific structure are frequently used to retrieve a qualitative resolution. The output solution x̂ of a synthetic test that employs an input model with a random structure x also contains resolution information (An 2012). An (2012) suggested that the statistical resolution matrix R_S (Table 1) can be determined by statistically comparing a limited number of input synthetic random models x and their corresponding output solutions x̂ via a Gaussian-function approximation for each row vector.
The output solution x̂ includes all of the known factors in the x → x̂ process, so R_S includes all of these factors. However, the matrix is only approximate and not necessarily accurate. If a large number of x and x̂ pairs are given, then an accurate and complete resolution matrix can be obtained on the basis of Eq. (2), as discussed in the following section.

Resolution Matrices of the Complete Process
An accurate resolution matrix that includes all of the factors in the complete x → x̂ projection process is proposed in this section.

Method
Equation (2) has to be reorganized for the inversion of R because R is an m × m matrix, whereas both x and x̂ are m × 1 vectors. The extended form of Eq. (2) is:

x̂_i = r_i,1 x_1 + r_i,2 x_2 + … + r_i,m x_m,  i = 1, …, m,   (42)

where r_i,j is the element at the ith row and jth column of R. The equation is reorganized as:

diag(x^T, x^T, …, x^T) [r_1,1 … r_1,m  r_2,1 … r_m,m]^T = x̂.   (43)

The compact form of Eq. (43) is:

X r = x̂,   (44)

where X is an m × m² band matrix composed of x^T blocks and r is a vectorization of R^T:

X = diag(x^T, x^T, …, x^T),   (45)

and

r = vec(R^T) = [r_1,1  r_1,2 … r_1,m  r_2,1 … r_2,m … r_m,1 … r_m,m]^T.   (46)
Unlike in Eq. (2), x̂ in Eq. (44) is a linear function of the unknown vector r. If we have a real model (X) and its corresponding solution (x̂), then the resolution vector r can in principle be inverted via Eq. (44); however, r contains m² unknowns, whereas Eq. (44) supplies only m equations because x̂ has m elements, so a single model pair is insufficient.
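The band-matrix reorganization can also be written with a Kronecker product, X = I_m ⊗ x^T, which is an equivalent construction of Eq. (45); the 4-parameter example below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
m = 4                              # invented problem size
R = rng.normal(size=(m, m))        # a stand-in resolution matrix
x = rng.normal(size=m)             # a stand-in model vector

# Eq. (44): x_hat = X r, with X = I_m (kron) x^T and r = vec(R^T)
X = np.kron(np.eye(m), x)          # m x m^2 band matrix of x^T blocks
r = R.reshape(-1)                  # row-major flatten = vec(R^T)
print(np.allclose(X @ r, R @ x))   # the two forms of Eq. (2) agree
```

This makes the dimension count explicit: X has m rows but m² columns, which is why a single model pair cannot determine r.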
Application of the projection to a random synthetic model x^k outputs the corresponding solution x̂^k. The model X^k (the band matrix for x^k, Eq. (45)) and x̂^k still satisfy Eq. (44). One can therefore obtain N solutions (x̂^1, x̂^2, …, x̂^N) by performing the same projection for N different random synthetic models (X^1, X^2, …, X^N), and then all of the solutions can be used to construct a new equation from the N equations of the form of Eq. (44):

{X} r = {x̂},   (47)

where

{X} = [X^1; X^2; …; X^N],  {x̂} = [x̂^1; x̂^2; …; x̂^N].   (48)

If Eq. (47) is constructed of m² independent equations, then r will be uniquely resolvable via:

r = {X}^-g {x̂},   (49)

where {X}^-g is a (left) generalized inverse of {X}; a resolution matrix R is then obtained by converting the vector r back into a matrix.
The new matrix R is obtained via either Eq. (49) or (2) without any approximation. The synthetic solutions (x̂^k) are the result of the complete x → x̂ process with all of the factors. Therefore, the resultant R reflects all of the factors (various errors, simplification, etc.) in the complete process, and is termed the complete resolution matrix, denoted R_C (Table 1).
This extraction of R_C from random synthetic input models and output solutions is similar to that proposed by An (2012), which focused on the extraction of a reliable resolution length from a small number of x^k and x̂^k to construct the approximate resolution matrix R_S (Table 1). Here, an accurate resolution matrix (R_C) is directly inverted without approximation. Various procedures, such as linear and nonlinear inverse problems, and also non-inverse problems, can be implemented to obtain R_C (Fig. 6) and R_S, as the matrices are isolated from the x → x̂ process. For example, R_S can be obtained for kriging and minimum-curvature gridding (Chiao et al. 2014). The extraction of R_C via Eq. (47) is a linear regression of the relationship between x^k and x̂^k (Fig. 2a), such that R_C can be considered a linear approximation of r: x → x̂ for either a nonlinear inverse or a non-inverse problem.

Equation Simplification
Resolving R_C via Eq. (49) requires the (left) inversion of a large matrix with N × m rows and m² columns. However, the calculation can be simplified.
Equations (42) and (43) state that the ith solution parameter x̂_i^k of the solution x̂^k can be written as:

x̂_i^k = R_i x^k,   (50)

where R_i is the ith row of R (r_i,*). All of the equations for the parameters from x̂_i^1 to x̂_i^N form:

{x^T} R_i^T = {x̂_i},   (51)

where

{x^T} = [(x^1)^T; (x^2)^T; …; (x^N)^T],  {x̂_i} = [x̂_i^1  x̂_i^2 … x̂_i^N]^T.   (52)
Equation (51) is actually the same as Eq. (S4), but it only contains the rows for the solution parameter x̂_i. Equation (51) can then be used to invert the ith row of R_C (R_i) from:

R_i^T = {x^T}^-g {x̂_i};   (53)

the other rows of R_C can also be inverted using Eq. (53). The inversion of R_C via Eq. (53) is easier than that via Eq. (49): Eq. (49) requires a (left) inverse of the matrix {X}, which has N × m rows and m² columns, whereas Eq. (53) requires a (left) inverse of the much smaller (N × m) matrix {x^T}. Furthermore, the left inverse of {x^T} used for R_i^T in Eq. (53) is the same for the other rows (e.g., R_j^T), and can therefore be computed once and used directly in the calculation of all of the rows of R_C.
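The computational point, that a single left inverse of {x^T} serves every row of R_C, can be sketched with an invented projection matrix standing in for the x → x̂ process:

```python
import numpy as np

rng = np.random.default_rng(6)
m, N = 8, 50                           # invented sizes
R_true = rng.normal(size=(m, m))       # stand-in projection matrix

X_in = rng.normal(size=(N, m))         # rows are (x^k)^T, i.e. {x^T} in Eq. (51)
X_out = X_in @ R_true.T                # rows are (x_hat^k)^T

# Compute the left inverse of {x^T} once ...
Xp = np.linalg.pinv(X_in)              # m x N
# ... and reuse it for every row R_i of R_C (Eq. (53)):
R_C = (Xp @ X_out).T                   # column i of Xp @ X_out is R_i^T

print(np.allclose(R_C, R_true))
```

Factoring `X_in` once replaces m independent least-squares solves, which is the simplification the text describes.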
We derived R_C for Example 1 (Fig. 1) using the same regularization as that used to obtain R_H(L1) (Fig. 3c), without considering observation errors (δd), for comparison. The solutions (x̂^k) for 100 (N = 100) different synthetic random models (x^k) were resolved in step 1 (Fig. 6) of the calculation. The resultant R_C (Fig. 7b) is the same as R_H (Fig. 3c) for this linear inverse problem. However, R_C can reflect all of the factors in a practical study, whereas R_H only reflects the effects of G and regularization.
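The two-step recipe (project N random models, then regress) can be sketched as follows; a small invented linear problem replaces Example 1, and all matrices and λ are assumptions. For this error-free linear case the recovered R_C matches R_H, consistent with the comparison above.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, N = 12, 6, 100                   # invented sizes; N = 100 as in the text
G = rng.normal(size=(n, m))
L1 = np.diff(np.eye(m), axis=0)        # flatness regularization
lam = 1.0

solve_op = np.linalg.solve(G.T @ G + lam**2 * L1.T @ L1, G.T)  # d -> x_hat
R_H = solve_op @ G                     # hybrid resolution matrix

# Step 1: solutions for N random synthetic models (error-free projection)
X_in = rng.normal(size=(N, m))         # rows are (x^k)^T
X_out = X_in @ R_H.T                   # rows are (x_hat^k)^T = (R_H x^k)^T

# Step 2: invert Eq. (2) by linear regression: X_in @ R_C^T = X_out
R_C = np.linalg.lstsq(X_in, X_out, rcond=None)[0].T

print(np.allclose(R_C, R_H, atol=1e-6))
```

With more factors in the projection (errors, nonlinearity), the same regression still applies but, as noted below, more model pairs are needed for a stable R_C.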

Resolution Matrices with Error Effects
All measurements and processing include errors, which influence the solution x̂ and thus the x → x̂ projection. R_C can include the effects of various (quantifiable and unquantifiable) errors and additional prior information (such as C and c in Eq. (10)). For example, system simplification may cause errors in the solution that are often difficult to quantify; if x̂^k is obtained through the simplification, the resultant R_C will reflect the error related to the simplification. However, for the sake of comparison, examples of the effects of quantified errors in the data are given below.
When the observational errors δd are considered, the equation with R(δd) (Eq. (40)) is of the same form as Eq. (2). Therefore, the above method (Fig. 6) for obtaining the complete resolution matrix R_C via either Eq. (49) or (53) on the basis of Eq. (2) can be used to obtain R(δd) (Eq. (40)). One pair of models (x^k, x̂^k) serves as an independent measurement for R in this method. However, the addition of more factors in the processing than those contained in G and C makes the relationship between x and x̂ more complex, such that a larger number (> m) of model pairs is often necessary to obtain a reliable R_C.
The traditional resolution matrix R_H (Eq. (41)) cannot include the effect of observational error and therefore equals R(0), or R(δd = 0). The resolution matrix (R_C or R(δd)) that includes the effect of observational error differs from R_H. An equation for their difference can be obtained from Eqs. (40) and (41):

(R(δd) − R_H) x = δx_d.   (54)

Equation (54) indicates that the difference R(δd) − R_H, or R(δd) − R(0), is a projection matrix onto the solution error (δx_d).
Random errors exist in practical studies. The influence of random errors on the solution diminishes as the number of observations increases. However, if the observations are limited, then the average effect of the random errors appears as a regular systematic error; we therefore only test systematic errors here. A stronger regularization (larger λ) is normally required for inversions with observational errors. However, we used the same λ that was applied in the above error-free inversion for comparison.
Two main types of systematic errors, offset and scale-factor errors, are tested here. Offset observational errors in a linear inversion will introduce offset errors in x̂ (Fig. 2b). The x → x̂ process with offset errors in x̂ is better represented by Eq. (17) than by Eq. (40). Equation (17) has more variables (R and δx_of) than Eq. (40), such that the process with offset errors is more complex and requires a larger number of model pairs to obtain R(δd). If we still invert for R(δd) via Eq. (40) using the same number (N = 100) of model pairs (x^k, x̂^k) as in Fig. 7, then the resultant matrix is somewhat unstable (Fig. 8a). The column vector is somewhat stable (Fig. 8b) due to regularization (L1), because flatness regularization directly influences the column vectors, but not the row vectors. R(δd) generally becomes stable when a larger number (e.g., N = 200) of model pairs is used (Fig. 8a). Scale-factor observational errors in a linear inversion will introduce scale-factor errors in x̂. The x → x̂ process with scale-factor errors in x̂ can be well represented by Eq. (40), with R(δd) equaling R_H multiplied by the factors related to the scale-factor errors.

Figure 7
Resolution matrix R_C for the 1-D inverse problem in Fig. 1. a Example synthetic (true) random model (input model) and its corresponding solution (output model) for the inverse problem. b Matrix R_C (or R_-of). The resolution matrix R_C is calculated from 100 pairs of random input and output models, as in (a). Flatness regularization with λ = 1.0 is used. R_-of is the resolution matrix after the offset errors are isolated (Eq. (55)). c, d Two row vectors and e, f two column vectors of R_C in (b). Red lines in (c-f) are Gaussian approximations of the vectors around the diagonal entry. The row and column vectors for the 48th parameter can be represented by Gaussian function curves (c and e, respectively), whereas those for the 95th parameter are far from Gaussian curves (d and f, respectively). The column vector (r_*,95) (f) is all zeros. The widths of the Gaussian approximation curves (c-e) can represent the widths of the vector curves.
An offset error of 2.0 s (δd_1 = 2.0 s) and a scale-factor error of 20% (δd_1 = 0.2 d_1 ≈ 2.5 s) in the first observation datum (d_1) are considered; the synthetic observations have a mean of ~12.5 s. Two hundred (N = 200) model pairs are used, with the resultant matrices R(δd) shown in Figure S1. The matrix differences, R(δd) − R_H(L1), which reflect the observational error effects in the projection from x to x̂, are shown in Fig. 9c and d.
These two types of observational errors yielded similar errors (δx_d) in the solution parameters (x̂_1–x̂_25) in x̂(L1) (Fig. 9a, b), resulting in similar summation curves for the 1st–25th content vectors in R(δd) (Fig. 9e, f). However, the errors produced different effects in the projection matrices R(δd). The row vector of R(δd) for a solution parameter with errors (e.g., x̂_5) in Fig. 9c is very different from that in Fig. 9d. The addition of the scale-factor error δd_1 (= 0.2 d_1) only caused resolution matrix changes in the 1st–10th columns (Fig. 9d) (i.e., only the contributions of x_1–x_10, which are constrained by d_1, are influenced, as expected). However, the offset error δd_1 (= 2.0 s) caused changes in all of the columns (Fig. 9c). Therefore, the spread functions are sensitive to different types of errors.
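The column-selective effect of a scale-factor error can be reproduced in a toy setting. The setup below is invented (three point observations; here the first datum constrains only one parameter, so only that single column of R(δd) changes, a miniature of the 1st–10th column behavior described above).

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, N = 12, 3, 200                   # invented sizes; N = 200 as in the text
# Hypothetical point observations: d_1 constrains only parameter x_2 (index 2)
G = np.zeros((n, m))
G[0, 2] = 1.0; G[1, 5] = 1.0; G[2, 8] = 1.0
L1 = np.diff(np.eye(m), axis=0)
lam = 1.0
solve_op = np.linalg.solve(G.T @ G + lam**2 * L1.T @ L1, G.T)
R_H = solve_op @ G

X_in = rng.normal(size=(N, m))
# 20% scale-factor error on the first observation d_1 only
scale = np.array([1.2, 1.0, 1.0])
X_out = X_in @ (solve_op @ (scale[:, None] * G)).T   # noisy solutions

# Regress R(dd) from the model pairs (Eq. (49)/(53) by least squares)
R_dd = np.linalg.lstsq(X_in, X_out, rcond=None)[0].T

# R(dd) - R_H is nonzero only in the column(s) constrained by d_1
diff = R_dd - R_H
changed = [j for j in range(m) if not np.allclose(diff[:, j], 0, atol=1e-8)]
print(changed)
```

An offset error, by contrast, breaks the purely linear form of Eq. (40) and spreads into all columns, as discussed next.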

Offset Error Isolation
The x → x̂ process with offset errors in x̂ (δx_of) (Fig. 2b) can be better represented via linear regression (Eq. (17)), whereby the offset errors are isolated from either x̂ or R(δd)x. When the offset errors δx_of (= [o_1, o_2, …, o_m]^T) are isolated, the obtained resolution matrix is denoted as either R_-of or R_-of(δd). Equation (17) can be written as:

x̂ = R_+1 x_+1,   (55)

where R_+1 = [R_-of  δx_of] and x_+1 = [x^T 1]^T. The extended form of Eq. (55) is:

x̂_i = r_i,1 x_1 + r_i,2 x_2 + … + r_i,m x_m + o_i,  i = 1, …, m.   (56)
Equation (55) is of the same form as Eq. (2), such that the m × (m + 1) matrix R_+1 can be inverted using model pairs (x^k and x̂^k) via the above procedure for R_C (Fig. 6). The obtained R_+1 is denoted as R_C+1 (Table 1). As expected, the resultant matrix R_-of(δd) for the above case with offset errors is the same as R(δd = 0) (Fig. 7b), and the resultant δx_of (not shown here) equals the offset errors in x̂.
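The offset isolation can be sketched under an invented linear setup (G, regularization, λ, and the offset are assumptions): augmenting each model vector with a trailing 1 and regressing recovers both R_-of and δx_of.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, N = 12, 6, 200                   # invented sizes
G = rng.normal(size=(n, m))
L1 = np.diff(np.eye(m), axis=0)
lam = 1.0
solve_op = np.linalg.solve(G.T @ G + lam**2 * L1.T @ L1, G.T)
R_H = solve_op @ G                     # error-free resolution matrix R(0)

dd = np.full(n, 2.0)                   # constant offset error in the data
dx_of = solve_op @ dd                  # resulting offset error in the solution

X_in = rng.normal(size=(N, m))
X_out = X_in @ R_H.T + dx_of           # x_hat = R x + dx_of (Eq. (17) form)

# Augment each model with a 1: x_hat = [R_-of  dx_of] [x; 1] (Eq. (55))
X_aug = np.hstack([X_in, np.ones((N, 1))])
Rp1 = np.linalg.lstsq(X_aug, X_out, rcond=None)[0].T   # m x (m+1), i.e. R_C+1
R_minus_of, dx_est = Rp1[:, :m], Rp1[:, m]

print(np.allclose(R_minus_of, R_H, atol=1e-6),
      np.allclose(dx_est, dx_of, atol=1e-6))
```

As in the text, the recovered R_-of equals the error-free R(0) and the recovered offset equals the offset error in the solution.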

Properties of the Complete Resolution Matrix
Unlike R_D (from G only) and R_H (from G and C), a complete resolution matrix can be obtained from any combination of all of the factors in a study (Table 1). If the synthetic solution x̂^k is obtained from a generalized inversion using error-free observations, then R_C is equal to R_D, and the two matrices share the same properties. If regularization is used during the processing, then R_C is equal to R_H, again with the same properties. The properties of R_C therefore vary based on the factors considered in the x → x̂ process.

Utilities of Resolution Matrices
Resolution matrices (e.g., R_C) are a quantitative indicator not only for the solution but also for the other factors in a study system. All of the factors (e.g., solution, observation, and regularization) can be appraised by the matrices.
The resolution matrix has been widely used to appraise solution reliability in an inverse problem (e.g., Aki et al. 1977; Aster et al. 2005; Backus and Gilbert 1970; Menke 2012; Tarantola and Valette 1982; Wiggins 1972). The diagonal entry r_i,i reflects whether x_i can be resolved, or how much of x_i is contained in x̂_i, and has therefore received considerable attention for several decades (e.g., Aki et al. 1977; Day-Lewis et al. 2005; Wiggins 1972). The resolution spread estimated from a resolution matrix can quantify the degree of departure of R from an identity matrix, and thereby the goodness of the model (Backus and Gilbert 1970; Menke 2012, 2015). However, the resolution length, discussed in the ''Resolution Length'' section, is more widely used in practical model appraisals at present. Furthermore, the matrices R_S and R_C can also be applied to evaluate solution stability: in an unstable inversion, a small variation in observation causes a large change in solution, so R_S and R_C calculated from the unstable solutions of random models are sensitive to the instability.
The resolvability and constrainability of x under given observations, which are among the most important information in a study, control the solution reliability and must be taken into account when selecting the parameterization and regularization. They can be quantitatively retrieved from resolution matrices, as explained in the ''Resolvability and Constrainability from the Resolution Matrix'' section.
The influences of regularization and errors on the solution can also be evaluated with resolution matrices. The matrix R_D reflects the model projection under given observations, whereas R_H reflects the projection under the combination of observation and regularization. The difference, R_H − R_D, reflects the regularization influence on the solution, as discussed above in the ''Significances of the Resolution Matrices'' section. If an error exists in the process used to obtain x̂, R_C will include the effect of the error; the difference, R_C − R_D, reflects the error's effect, as discussed in Sects. 3.3 and 3.4.

Resolution Length
The smallest feature that can be detected is an important characteristic of a model. The feature size is generally called the (spatial) resolution or resolution length (or width). Following the suggestion of Lebedev and Nolet (2003), the resolution length is defined here as the half size of the feature. This section focuses on how to obtain the resolution length from a resolution matrix.

Content Extent Versus Resolution Length
Equation (38) indicates that one solution parameter generally comes from a weighted average of all of the medium parameters, with the entries of a row vector of R being the weights or contents. The smallest resolvable feature represented by x̂_i, or the resolution length at x̂_i, should therefore be estimated from the row vector.
The medium parameters with high contents/weights in x̂_i mainly determine x̂_i, such that the feature represented by x̂_i is related to the high-content segment (e.g., r_48,41 to r_48,55 in Fig. 7c) in the row-vector curve. However, as the feature is represented by x̂_i, the resolution length is not defined by the extent of the high-value segment, but rather by the distances from x_i to the parameters at the segment borders. In Fig. 7c, the resolution length for x̂_48 is not the half distance from x_41 (r_48,41) to x_55 (r_48,55); it is instead related to the distances from x_48 (r_48,48) to x_41 (r_48,41) and to x_55 (r_48,55). Similarly, the resolution length for x̂_95 is not the half distance from x_86 to x_90, but is instead related to the distances from x_95 to x_86 and to x_90 (Fig. 7d).
In general, a parameter (e.g., x_i) and its neighbors should provide a large contribution to the average (the solution parameter x̂_i) (Jackson 1972), and the contributions (r_i,j; j = 1, …, m) should decrease quickly with increasing distance from x_i to the other parameters (x_j). Therefore, each row of the resolution matrix can be approximated as either a Gaussian-shaped function (e.g., Fichtner and Trampert 2011; Nolet 2008) or a cone (Barmin et al. 2001). The row-vector curve may be similar to the shape of a Gaussian function (e.g., r_48,*; Fig. 7c), but it can also be quite different (e.g., r_95,*; Fig. 7d). In general, the width of the Gaussian-function approximation to the row-vector curve can represent the resolution length (the distance from x_i to the borders of the high-content segment) (Fig. 7c and d). For example, the resolution lengths in Fig. 7c and d are 4 km for x̂_48 and 9 km for x̂_95, respectively.
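One simple, hedged way to turn a row vector into a width is a second-moment estimate, a stand-in for the Gaussian curve fit used in the paper (the clipping of negative side lobes and the toy row vector below are assumptions, not the paper's procedure):

```python
import numpy as np

def gaussian_width(row, i, dx=1.0):
    """Width (one standard deviation, in model units) of a Gaussian matched
    by second moment to the row vector around the diagonal entry i."""
    w = np.clip(row, 0.0, None)          # ignore negative side lobes (assumption)
    if w.sum() == 0.0:
        return np.inf                    # parameter with no content at all
    j = np.arange(len(row))
    var = np.sum(w * (j - i) ** 2) / np.sum(w)
    return np.sqrt(var) * dx

# Invented row vector peaked at parameter 5, roughly Gaussian with sigma = 1.5
row = np.exp(-0.5 * ((np.arange(11) - 5) / 1.5) ** 2)
print(round(gaussian_width(row, 5), 2))  # close to the generating sigma of 1.5
```

For strongly non-Gaussian rows (such as the r_95,* case in Fig. 7d), a moment-based width should be read with the same caution as the curve-fit width.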

Estimation from a Row or Column Vector?
The resolution length may also be estimated from a column vector (e.g., Alumbaugh and Newman 2000; Smith 1997). R_D is symmetric, such that the resolution lengths estimated from its ith row and column vectors are the same. However, regularization directly influences the spread functions (or column vectors), such that most resolution matrices with regularization (e.g., R_H) are asymmetric. The resolution lengths estimated from the ith column and row vectors can therefore be different.
The resolution lengths that are estimated from the column and row vectors are nevertheless largely the same or similar when Tikhonov regularization is applied. The resolution matrix R_H(λI) is symmetric, such that the lengths from its row and column vectors are the same. Higher-order Tikhonov regularizations (λL_n) yield a smoother column vector (r_*,48 in Fig. 7e) than the corresponding row vector (r_48,* in Fig. 7c), but the lengths from the two vectors are largely similar, as previously confirmed (Miller and Routh 2007; Pilkington 2016), although with exceptions.
The column vector cannot provide a valid resolution length for a parameter that is constrained by no observations (e.g., x_95 in Fig. 1a, b). The column vector for such a parameter in R_H (e.g., r_*,95 in Fig. 7f) is all zeros, but the corresponding row vector is not (r_95,* in Fig. 7d), because the row-vector sum of R(S_0) (e.g., R_H(L_n)) is equal to one. The row vector can provide a reliable resolution length in this case, whereas the column vector cannot (Table 3).
The row vector may be unstable and possess strong oscillations when large errors exist (e.g., r_5,* in Fig. 8a, c). The high-amplitude oscillations around the diagonal entry of the row vector may give an illusion of high resolution. However, the corresponding column vector in R_H(L_n) is smoother (Fig. 8b) than the row vector (Fig. 8a), such that the length estimated from the column vector is more reliable and less influenced by these large errors (Table 3).
In summary, the resolution length should generally be taken from the row vectors, based on the definition of the resolution length (Table 3). However, special cases (e.g., observation errors) may yield an unstable row-vector curve that biases the resolution estimated from this vector. It is therefore advisable to extract the resolution length from both the row and column vectors of the resolution matrix to ensure that the resolution length is reliably determined.

Resolution Estimations that are not from R
The spatial resolution in seismic tomography studies is widely estimated via visual inspection of the restoration of a synthetic structure (e.g., checkerboard tests) (Feng and An 2010; Lévêque et al. 1993; Thurber and Ritsema 2009), as illustrated in Fig. 10a-c. If the checker size can be recovered at a given location, then the resolution at that location in the final result is at least as fine as the checker size. This method is powerful and easily implemented. However, the resulting resolution length is qualitative rather than quantitative. Furthermore, the recovered model (x̂) and the synthetic checkerboard model (x) in one test are equivalent to only one pair of x and x̂; several tests cannot produce sufficient model pairs to provide full resolution information, as explained in the ''Resolution Matrices of the Complete Process'' section.
A quantitative resolution length can still be retrieved when no resolution matrix is given or needed. The output solution x̂ of a synthetic test using an input model with a random structure x contains resolution information (An 2012). Quantitative resolution information can be retrieved via a number of approaches, including a comparison of many x and x̂ (An 2012), cross-correlation of x and x̂ (Trampert et al. 2013), and autocorrelation of x (Fichtner and Leeuwen 2015). The resolution lengths in Fig. 10d were obtained using the An (2012) method on the basis of limited pairs of random synthetic models and solutions (Ma et al. 2014). The An (2012) method has been easily implemented in various studies (e.g., Chevrot et al. 2014; Chiao et al. 2014; Lin et al. 2014; Ma et al. 2014). The resolution-length distribution (Fig. 10d) is often easier to read and more informative for the general reader than synthetic checkerboard recovery tests (Fig. 10a-c) (Ma et al. 2014).

Resolvability and Constrainability from the Resolution Matrix
Several essential questions arise when the real model x cannot be fully resolved in x̂ for an ill-posed problem. For example, how much information from x can be reflected in the solution x̂, or what is the resolvability of x_i (i.e., the content of x_i in x̂_i)? How much information in x̂ is controlled by the observations? What is the constrainability (constraining status) of an individual solution parameter x̂_i under the given observations? These questions are not only essential for understanding the reliability of the solution, but also instrumental in providing basic information to guide the improvement of the study system. They are essentially centered on the relationship between x and x̂, such that their answers can be derived from the x → x̂ projection/resolution matrix.
The r_i,j entry in R represents the exact content or contribution of the jth model parameter x_j to the ith solution parameter x̂_i. Therefore, the entries of the ith content vector (r_i,*) and their sum (Σr_i,*) are indicators of the resolvability of x_i and the constrainability of x̂_i. As previously mentioned, R_H reflects the combined effects of the observation matrix and regularization during the x → x̂ process, whereas R_D only reflects the observational effects. The reliability of the practical solution has therefore been evaluated via R_H, rather than R_D. However, the constrainability of the solution parameter x̂_i under the given observations can be better obtained from R_D than from R_H for the same reason. Factors other than the observation and regularization matrices can also influence the reliability of x̂_i; these cannot be reflected by R_D or R_H, but can be by R_C.

Resolvability Defined by the Main-Diagonal Element
Equation (38) indicates that the main-diagonal element r_i,i reflects the content (or contribution) of the real model parameter x_i in (to) its counterpart x̂_i. This entry reflects whether x_i can be resolved, or how much of x_i is contained in x̂_i; r_i,i can therefore be considered the resolvability of x_i (Table 3) and has received considerable attention for several decades (e.g., Aki et al. 1977; Day-Lewis et al. 2005; Wiggins 1972).
The main-diagonal element r_i,i (e.g., Fig. 4b for Example 1) may take one of the following values:
• r_i,i = 1 (as illustrated in Fig. 11a). The curve of the elements in the ith row vector is a delta function, where all of the elements are zero except r_i,i. In this case, x̂_i equals x_i, which means that x̂_i is well constrained and x_i can be fully resolved. If r_i,i is in R_D, then the parameter x_i is fully resolvable under the given observations.
• r_i,i = 0 (Fig. 11b). This case indicates that x_i makes no contribution to its counterpart x̂_i and is unresolvable. For example, parameter x_95 in Example 1 is not constrained by any observation (Fig. 1a, b), such that r_95,95 in both R_D and R_H is zero (Figs. 3 and 4b), and x̂_95 in x̂_D equals zero (Fig. 1c).
• r_i,i ∈ (0, 1). x_i partially contributes to its counterpart x̂_i and is partially resolvable. If r_i,i in R_D belongs to (0, 1), then x_i shares its observation with other parameters (e.g., x_j); x_i therefore contributes to both its counterpart x̂_i and x̂_j. For example, r_1,1 = 0.1 in R_D in Example 1 (Fig. 4b), which indicates that x_1 shares observation 1 with x_2 (Fig. 1a); x_1 therefore contributes to both x̂_1 and x̂_2.
The main-diagonal element r_i,i in R_D reflects the resolvability of x_i under a given observational condition, such that the sum of all of the main-diagonal elements, trace(R_D), can be considered the resolvability of the model vector x. As R_D is a Gram matrix, trace(R_D) equals the number of independent observations, rank(G). R_D will therefore be an identity matrix, and x can be fully resolved, when either trace(R_D) or rank(G) equals the number of model parameters (m).
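These two properties, symmetry and trace(R_D) = rank(G), are easy to check numerically; the observation matrix below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 10, 4
G = rng.normal(size=(n, m))     # 4 independent observations of 10 parameters
R_D = np.linalg.pinv(G) @ G     # direct resolution matrix (a Gram projector)

# trace equals the number of independent observations, rank(G) = 4
print(np.isclose(np.trace(R_D), np.linalg.matrix_rank(G)))
# and R_D is symmetric
print(np.allclose(R_D, R_D.T))
```

With n = m independent observations, the same construction returns the identity matrix, the fully resolved case described above.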

Deviation from the Expectation Given by the Row-Vector Sum
Equation (38) indicates that x̂_i equals the weighted sum of all of the parameters in x, and Σr_i,* is the sum of all of the weights r_i,*. If R is a stochastic matrix, then a row sum of one means that x̂_i reflects a true average of x (Nolet 2008) and lies at least within the extremes of x. For example, if Σr_1,* in R_D equals one (Fig. 4a) and the entries r_1,* in R_D are positive (Fig. 3a), then x̂_1 in x̂_D is a good representative of x_1 (Fig. 1c). However, this assumption is often false, as the entries in R can be negative (Menke 2015) (i.e., R is often not a stochastic matrix). For example, Σr_1,* in R_H(L1) in Example 1 equals one (Fig. 4a), but the parameter x̂_1 in x̂(L1) (Fig. 1c) deviates from the expected average. The polarity of r_i,* must therefore be considered when Σr_i,* is used to judge the reliability of x̂_i.
While Σr_i,* = 1 does not necessarily correspond to a perfect x̂_i, the deviation of Σr_i,* from one is a good indicator of the deviation of x̂_i from the true model average (Table 3). A comparison of Figs. 4a and 1c indicates that an overestimated (larger than the model average) parameter x̂_i (x̂_54 in x̂_D and x̂(I) in Fig. 1c) corresponds to Σr_i,* > 1 (e.g., Σr_54,* in R_D and in R_H(I) in Fig. 4a), and an underestimated (smaller than the model average) parameter (x̂_48 in x̂_D and x̂(I) in Fig. 1c) corresponds to Σr_i,* < 1 (e.g., Σr_48,* in R_D and in R_H(I) in Fig. 4a).
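A two-line illustration of why the unit row sum alone is not sufficient (the row vectors and model here are invented): a nonnegative row summing to one averages within the model range, while a sum-one row with a negative entry can push the solution parameter outside it.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])          # invented true model, range [1, 3]
r_pos = np.array([0.2, 0.6, 0.2])      # row sum 1, all entries nonnegative
r_neg = np.array([-0.5, 0.5, 1.0])     # row sum 1, but with a negative entry

print(r_pos @ x)   # stays within the extremes of x
print(r_neg @ x)   # exceeds the maximum of x despite the unit sum
```

This is the polarity caveat above in miniature: the row sum must be read together with the signs of the entries.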

Difference Between Neighboring Parameters
If the parameter x_i is partially resolvable (r_i,i ∈ (0, 1)), then the solution parameter x̂_i will include content (r_i,ii > 0) from neighboring parameters (e.g., x_ii), and the r_i,i and r_i,ii entries in the ith row vector of R can reflect the similarity between x̂_i and x̂_ii.
When two neighboring parameters x_i and x_ii (e.g., x_60 and x_61; Fig. 1a, b) are constrained by unrelated observations, r_i,i < 1 and r_i,ii = 0 (Fig. 11c), as observed for r_60,61 and r_60,60 in either R_D or R_H(I) (Fig. 3a and b). In this case, x_ii does not contribute to x̂_i (Eq. (38)), and x_i does not contribute to x̂_ii. It is possible to discriminate the difference between x_i and x_ii in x̂_i and x̂_ii from the unrelated observations, as x̂_61 in either x̂_D or x̂(I) is obviously different from x̂_60 (Fig. 1c). When two neighboring parameters x_ii and x_i (e.g., x_13 and x_14, which are constrained by the second observation in Fig. 1a, b) are constrained by the same or related observations, both r_i,i and r_i,ii (ii = i − 1 or i + 1; Fig. 11d) (e.g., r_13,13 and r_13,14 in either R_D or R_H(I); Fig. 3a and b) are in the range (0, 1). In this case, x̂_i has contents from both x_ii and x_i; x̂_ii also has contents from both x_ii and x_i because of the symmetry of R_D. Consequently, x̂_i and x̂_ii cannot be discriminated, and their difference is often related to the difference Δr_i,ii = r_i,i − r_i,ii. The difference between x̂_i and x̂_ii in x̂_D is often very large when Δr_i,ii is large, even if the real model parameters x_i and x_ii are the same. When Δr_i,ii is small, the difference is also small, even if x_i and x_ii are quite different. When Δr_i,ii is zero, x̂_i is often equal to x̂_ii. The solution parameters x̂_13 and x̂_14 in either x̂_D or x̂(I) are quite different (Fig. 1c), even though the real model parameters x_13 and x_14 are almost the same. Δr_i,ii is therefore a good indicator of the difference between x̂_i and x̂_ii (Table 3) if r_i,i is less than one, even though this difference has no relation to the difference between x_i and x_ii.

Figure 11
Illustration of the four constraining statuses for the ith solution parameter x̂_i based on the elements of the ith row vector r_i,*

Short Summary on Constrainability
In summary, the constrainability of a solution parameter can be quantitatively evaluated from its content vector in a resolution matrix (Table 3). The main-diagonal element r_{i,i} can be considered the resolvability of x_i. If R_D is used, then r_{i,i} values of 0, 1, and within (0,1) mean that x_i is unresolvable, fully resolvable, and partially resolvable, respectively, for the given observations. The deviation of the content-vector sum Σr_{i,*} from one can be considered an indicator of the deviation from the model expectation: Σr_{i,*} > 1 and < 1 mean that x_i is overestimated and underestimated, respectively. In the case where Σr_{i,*} = 1 and all of the elements r_{i,*} are nonnegative, x_i is a true average of the model parameters. The difference Δr_{i,ii} between r_{i,i} and r_{i,ii} reflects the difference between the solution parameters x_i and x_ii for a partially resolvable parameter x_i (r_{i,i} ∈ (0,1)). Large and small Δr_{i,ii} values correspond to large and small differences between the two neighboring solution parameters, respectively, although the solution difference has no relation to the difference between the true medium parameters.
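These three diagnostics can be read directly off a row of any numerical resolution matrix. A minimal sketch (the helper name `constrainability` and the toy matrix are assumptions for illustration):

```python
import numpy as np

def constrainability(R, i):
    """Return the three row-vector diagnostics for solution parameter i:
    resolvability r_ii, row-sum deviation from one, and the differences
    Δr with the two neighboring elements (None at the edges)."""
    r = R[i]
    resolvability = r[i]                                    # r_ii
    deviation = r.sum() - 1.0                               # Σr - 1
    d_left = r[i] - r[i - 1] if i > 0 else None             # Δr with left neighbor
    d_right = r[i] - r[i + 1] if i < R.shape[0] - 1 else None
    return resolvability, deviation, d_left, d_right

# Toy resolution matrix: x_0 fully resolvable, x_1 and x_2 smeared together.
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.5, 0.5]])
print(constrainability(R, 1))   # r_11 = 0.5, zero row-sum deviation, Δr_12 = 0
```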

Constrainability from R_H
Practical studies that employ regularization do not provide R_D, but rather R_H. Estimation of the parameter constrainability under a given observation from R_H is necessary in this case to determine how much of the information in the solution comes from the observations rather than from the regularization. As mentioned above, R_H(λI) is the matrix most similar to R_D, making it a good alternative to R_D for evaluating the constrainability. However, most of the main-diagonal elements in R_H(λI) are smaller than those in R_D. The row-vector sum of R_H(λI) may therefore be smaller than that of R_D, such that its deviation from one cannot be used to evaluate the underestimated parameters. R_H(λI) can still be used to evaluate the well-constrained parameters, as the curve shape of their row vectors remains a delta function, even though r_{i,i} may not be one.
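The diagonal shrinkage can be checked numerically, assuming the standard damped least-squares form R_H(λI) = (GᵀG + λ²I)⁻¹GᵀG for an under-determined toy system (the sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, m = 5, 8                      # fewer observations than parameters
G = rng.standard_normal((n_obs, m))
lam = 0.5                            # damping parameter λ

R_D = np.linalg.pinv(G) @ G          # direct resolution matrix
R_H = np.linalg.solve(G.T @ G + lam**2 * np.eye(m), G.T @ G)

# Damping shrinks the resolution: trace(R_H) < trace(R_D) = rank(G),
# and the main-diagonal elements of R_H are generally reduced.
print(np.trace(R_D), np.trace(R_H))
```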
R_H(L_n) (n > 0; e.g., R_H(L_1) in Fig. 3c) is significantly different from R_D (Fig. 3a). With the exception of the unconstrained parameters, the main-diagonal elements of R_H(L_1) (r_{i,i} and r_{ii,ii}) for two neighboring parameters (x_i and x_ii) that are constrained by the same observation can be different (Fig. 4b). The row-vector sum of R_H(L_1) for any single parameter equals one (Fig. 4a). Therefore, neither the main-diagonal elements r_{i,i} nor the row-vector sum of R_H(L_1) can be used to evaluate the constrainability of the parameter x_i. However, the all-zero column vectors of R_H(L_n) (n = 0, 1, 2) for the unconstrained parameters are the same as those in R_D, such that the unconstrained parameters and unconstrained neighbors can be evaluated from R_H(L_n). Furthermore, the curve shape of the row vectors of R_H(L_n) (n = 0, 1) is similar to that of R_D (Figs. 1f and 3). The mutual relationship between two neighboring parameters can therefore be roughly evaluated using Δr_{i,ii} of R_H(L_n).
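The one-row-sum property follows from L_1·1 = 0: since (GᵀG + λ²L_1ᵀL_1)·1 = GᵀG·1, the hybrid matrix maps the constant vector to itself. A sketch under the same assumed hybrid form as above:

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, m = 3, 6
G = rng.standard_normal((n_obs, m))

# First-order Tikhonov (roughening) matrix: each row sums to zero.
L1 = np.diff(np.eye(m), axis=0)      # rows like [-1, 1, 0, ...]
lam = 0.3

A = G.T @ G + lam**2 * L1.T @ L1
R_H = np.linalg.solve(A, G.T @ G)

print(R_H.sum(axis=1))               # every row sums to one
```

Note that the rows sum to one but individual elements can be negative, so R_H(L_1) is not a stochastic matrix.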

Resolution Upper Bound
R_D is related only to G (Eq. (19)), with no relationship to the observation data d, the observational errors δd, or other factors, whereas R_H is composed of both G and C. Regularization adds artificial constraints to make the solution appear more rational. However, various factors, including observational errors, regularization, and the instability of the solution, can decrease (but not increase) the resolution reflected by R_D. Therefore, the resolution derived from R_D marks the upper bound of resolvability (Table 1).
While repeated observations can improve the precision of the solution by improving the precision of the observation data d, they cannot increase either rank(G) or the number of independent observations. Repeated observations therefore have no effect on R_D and cannot improve the upper bound of resolution. The only way to improve the upper bound of resolution is to increase the number of independent observations (i.e., rank(G)).
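This invariance is easy to verify numerically: duplicating rows of G changes neither its rank nor its row space, so the projector R_D = G⁺G is unchanged (a toy sketch; sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.standard_normal((3, 6))
G_rep = np.vstack([G, G[0], G[0]])   # "repeat" the first observation twice

R_D = np.linalg.pinv(G) @ G
R_D_rep = np.linalg.pinv(G_rep) @ G_rep

print(np.allclose(R_D, R_D_rep))     # True: R_D is unaffected
```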

What is a Perfect Inversion?
A perfect inversion, or perfect constrainability, corresponds to a resolution matrix that equals the identity matrix I (e.g., Jackson 1972; Menke 1989). However, this rule is only applicable to the direct resolution matrix R_D, not the other resolution matrices (e.g., R_H or R_C). A main-diagonal element r_{i,i} of one in R_D means that the parameter x_i can be fully resolved in the solution, thereby implying that a perfect inversion has R_D equal to the identity matrix. R_D always equals I except when the problem is under-determined. However, R_H includes regularization, and regularization is largely employed when the constrainability of the solution parameters under a given observational condition is imperfect (i.e., trace(R_D) is often smaller than m, or the inverse of G is unstable). Therefore, trace(R_H) is always smaller than trace(R_D), regardless of the quality of the regularization, such that R_H will never be an identity matrix. The perfectness of the inversion therefore cannot be judged from the degree of similarity between R_H and I. The perfectness of a study using regularization should instead be judged from the amount of valid information in G that is reflected in the solution, as different regularizations yield different solutions.
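The contrast can be sketched numerically with two assumed toy systems, using the pseudoinverse projector for R_D:

```python
import numpy as np

rng = np.random.default_rng(3)
m = 4
G_full = rng.standard_normal((6, m))    # over-determined, full column rank
G_under = rng.standard_normal((2, m))   # under-determined

R_D_full = np.linalg.pinv(G_full) @ G_full
R_D_under = np.linalg.pinv(G_under) @ G_under

print(np.allclose(R_D_full, np.eye(m)))   # True: perfect constrainability
print(np.trace(R_D_under))                # 2.0 = rank(G_under) < m
```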

Resolution Matrices in Nonlinear Inversions
The relationship between the solution x̃ and the true model x is different from that between x (or x̃) and the data d. The projection r: x → x̃ of a nonlinear problem can be nonlinear, but it can also be linear, such as when the true model x is perfectly resolved (x̃ = x = Ix = r(x)). This indicates that the x → x̃ projection is a relation distinct from the problem itself, but one related to the ability to solve the problem. A nonlinear inverse problem cannot be described by either Eq. (6) or (9), such that neither R_D (Eq. (19)) nor R_H (Eq. (24)) can be obtained for the problem. Nevertheless, Eq. (2) remains a valid linear projection approximation of the nonlinear equation (Eq. (1)), such that R_C, which is directly obtained from Eq. (2), can still be provided, regardless of the method used to solve the problem.
A nonlinear inverse problem can be solved via either global optimization or linearized methods. If the problem is solved using linearized iterative methods (e.g., Aster et al. 2005; Bourgeois et al. 1989), such as Newton's method, then a resolution matrix can be obtained after each iteration (Jackson 1972) from Eq. (19), Eq. (24), or a similar equation. However, this resolution matrix is not any of the above-mentioned resolution matrices (R_D, R_H, R_C, R_I, and R_S), but rather one of three new resolution matrices that are specifically constructed for nonlinear inverse problems.
An iterative inversion is designed to invert for the model perturbation Δx_i (= x − x_{i−1}) of the reference model x_{i−1} at the ith iteration on the basis of the first-order approximation of the inverse problem (Eq. (4)):

Δd_i = G_J^i Δx_i,  (58)

where Δd_i = d − g(x_{i−1}) and G_J^i is the Jacobian matrix with the partial derivatives of d taken with respect to x at x_{i−1}. Equation (58) can be solved for Δx̃_i from G_J^i and/or C using the same methods that were employed to obtain the model solution from G and/or C in the above linear inversions (e.g., Eq. (11)). However, the inverted solution Δx̃_i is a model perturbation, not the model. The inverted model (solution) after the ith iteration is x_i = x_{i−1} + Δx̃_i, which is the reference model at the next inversion iteration. This is the general procedure of Newton's method. A surface-wave inversion to constrain the S-wave velocity structure (Example 3) (Fig. 12a) is synthesized here to illustrate the three resolution matrices that can be implemented in a linearized nonlinear inversion. The employed inversion is a typical nonlinear inversion approach in geophysics that is widely applied to elucidate 1-D and three-dimensional sedimentary, crustal, and lithospheric Earth structures (e.g., Feng and An 2010; Knopoff 1972; Snoke and James 1997; Wiggins 1972; Xia et al. 1999); therefore, the details of the nonlinear relationship between x and d are not explained here. First-order Tikhonov regularization has been widely used in this inversion approach, although it prevents the correction of bad discontinuities in the reference model (An 2020). A regularization approach that is adapted to the reference models, as suggested by An (2020), can overcome this problem and lead to a rapidly convergent iterative inversion; this regularization approach is used here. The regularization parameter was set to λ = 0.01 after a series of tests that explored the tradeoff between the misfit and model flatness. The synthetic observations (d) and predictions (Fig. 12a) and the partial-derivative matrix G_J^i (Eq. (58)) were calculated using the surf96 program (Herrmann 2013). Given a reference model at the first iteration (the starting model), the model solutions after the first to fifth iterations (x_i) and their fits to the reference model are shown in Fig. 12a. The solution x_5 after the fifth iteration is nearly the same as x_4 after the fourth iteration, which implies that the inversion converged around x_4; x_4 is therefore considered the final solution.
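The iteration loop is the standard damped Gauss-Newton scheme; a self-contained sketch with an assumed toy forward function g (a stand-in for the dispersion computation, not surf96) and a finite-difference Jacobian:

```python
import numpy as np

def g(x):
    # assumed toy nonlinear forward problem (stand-in for surf96)
    return np.array([x[0]**2 + x[1],
                     np.sin(x[0]) + x[1]**2,
                     x[0] * x[1]])

def jacobian(x, h=1e-6):
    # G_J^i: central finite differences of g at the reference model x
    J = np.zeros((3, 2))
    for j in range(2):
        e = np.zeros(2)
        e[j] = h
        J[:, j] = (g(x + e) - g(x - e)) / (2 * h)
    return J

x_true = np.array([0.8, 0.5])
d = g(x_true)                        # synthetic observations
lam = 0.01                           # damping (regularization) parameter
x = np.array([0.5, 0.2])             # starting model x_0

for _ in range(10):
    GJ = jacobian(x)                 # G_J^i at the reference model x_{i-1}
    dd = d - g(x)                    # data residual: d - g(x_{i-1})
    # solve the linearized system in the damped least-squares sense
    dx = np.linalg.solve(GJ.T @ GJ + lam**2 * np.eye(2), GJ.T @ dd)
    x = x + dx                       # x_i = x_{i-1} + perturbation

print(np.linalg.norm(d - g(x)))      # residual shrinks toward zero
```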
Linear Approximation of r: x → x̃

The R_C calculation for a nonlinear inversion is the same as that for the above linear inversions, whereby only the synthetic input models x and their corresponding solutions x̃ are used. This calculation, which is based on either Eq. (43) or (47), is in fact a linear regression of x and x̃ (i.e., R_C is a linear approximation of r: x → x̃) (Table 1) (Fig. 2a). However, the relationship between x and x̃ is often nonlinear due to the complexity of the x → x̃ process for a nonlinear problem. The calculation of a reliable R_C therefore requires more pairs of input/output models. Furthermore, the resultant R_C will depend somewhat on the synthetic random input models: if they are closer to the practical medium, then the resultant R_C will be more realistic.
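The regression can be sketched for a linear toy solver, where the fitted R_C should reproduce the hybrid matrix exactly (the solver and sizes are assumptions; for a truly nonlinear r, the fit is only a linear approximation):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n_obs, n_pairs = 5, 3, 200
G = rng.standard_normal((n_obs, m))
lam = 0.2

def solve(d):
    # damped least-squares solver standing in for the full inversion
    return np.linalg.solve(G.T @ G + lam**2 * np.eye(m), G.T @ d)

# pairs of synthetic input models x and their solutions x~
X = rng.standard_normal((m, n_pairs))
Xs = np.column_stack([solve(G @ X[:, k]) for k in range(n_pairs)])

# least-squares linear regression x~ ≈ R_C x over all pairs
R_C = Xs @ np.linalg.pinv(X)

R_H = np.linalg.solve(G.T @ G + lam**2 * np.eye(m), G.T @ G)
print(np.allclose(R_C, R_H))   # True for this linear, noise-free solver
```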

Projection of Solution Improvement After an Iteration
As Eq. (58) is of the same form as Eq. (6), the process from Δx_i (= x − x_{i−1}) to Δx̃_i (= x_i − x_{i−1}) after the ith iteration can be expressed in a form similar to Eq. (22) to represent the x → x̃ process (e.g., Jackson 1972; Wiggins 1972):

Δx̃_i = R_J^i Δx_i,  (59)

or, equivalently, x_i − x_{i−1} = R_J^i (x − x_{i−1}), where R_J^i denotes the resolution matrix R: Δx_i → Δx̃_i. Equation (59) is of the same form as Eq. (2), such that R_J^i (denoted R_JD^i) can be obtained via Eq. (19) by replacing G with G_J^i. When the regularization matrix C is used, R_J^i (denoted R_JH^i) can be obtained via Eq. (24), where A_J^i is the combination of G_J^i and C, similar to how A is the combination of G and C in Eq. (10). The matrix R_JD^i (or R_JH^i) has often been denoted R in previous studies, but it possesses a different significance than the resolution matrix R: x → x̃.
The first-order Taylor expansion of Eq. (1) at x_{i−1} is:

x̃ − x_{i−1} ≈ J_r(x_{i−1}) (x − x_{i−1}),  (63)

where J_r(x_{i−1}) denotes the Jacobian matrix (or the gradient) of the projection r: x → x̃ at the reference model x_{i−1}. Equation (63) has the same form as Eq. (59); therefore, R_J^i is exactly the Jacobian matrix J_r(x_{i−1}) (Table 1). Practically, R_J^i represents the projection from x − x_{i−1} to x_i − x_{i−1} (Eq. (59)), i.e., the projection of the solution improvement on the reference model x_{i−1} just after the ith iteration:

x_i − x_{k−1} = R_J^{k→i} (x − x_{k−1}).  (64)

If k = i just after the ith iteration of a given inversion, then Eq. (64) should be the same as Eq. (59), because the matrix R_J^{i→(i−1)} = {0}, which means that R_J^{i→i} = R_J^i. Therefore, if k = 1, then Eq. (64) becomes:

x_i − x_0 = R_J^{1→i} (x − x_0),

where:

R_J^{1→i} = I − (I − R_J^i)(I − R_J^{i−1}) ⋯ (I − R_J^1).

The matrix R_J^{1→i} represents the projection from Δx (= x − x_0) to Δx̃ (= x_i − x_0), which is a projection of the solution improvement on the starting model x_0 after i iterations (x_i − x_0) (Fig. 13b). This is different from the gradient of r: x → x̃ at x_i (i.e., R_J^i), as R_J^{1→i} is the slope of r: x → x̃ from x_0 to x_i (Table 1).
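The composed projection can be checked numerically: simulating the per-iteration relation Δx̃_i = R_J^i Δx_i with arbitrary matrices, the accumulated improvement x_i − x_0 must match R_J^{1→i} (x − x_0) (all matrices here are random stand-ins, not from a real inversion):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n_iter = 4, 3
x_true = rng.standard_normal(m)          # true model x
x0 = rng.standard_normal(m)              # starting model x_0
RJ = [0.5 * rng.standard_normal((m, m)) for _ in range(n_iter)]

# iterate: x_i = x_{i-1} + R_J^i (x - x_{i-1})
x = x0.copy()
for R in RJ:
    x = x + R @ (x_true - x)

# composed projection R_J^{1->i} = I - (I - R_J^i) ... (I - R_J^1)
P = np.eye(m)
for R in RJ:
    P = (np.eye(m) - R) @ P
RJ_1_to_i = np.eye(m) - P

print(np.allclose(x - x0, RJ_1_to_i @ (x_true - x0)))   # True
```

The identity follows because each iteration multiplies the remaining error x − x_{i−1} by (I − R_J^i), so the accumulated improvement is one minus the product of these factors.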
The magnitudes of R_JH^{1→4}, which is obtained using R_JH^1, …, R_JH^4, differ from those of R_JH^4 (Fig. 12d). Furthermore, if the synthetic x and the corresponding solution x_i (= x_{i−1} + Δx̃_i) after the ith iteration are used in the R_C calculation, then the resultant R_C (denoted R_C^i) (e.g., R_C and R_C^1 in Fig. 13b) represents the projection from x to x_i just after the iteration (R^i: x → x_i). This is one more new type of resolution matrix that may appear in a linearized iterative application. If x_i is the final solution x̃, then R_C^i is written as R_C.
In summary, the complete resolution matrix for the x → x̃ process can be obtained from the linear approximation of r: x → x̃, regardless of the method used to solve a given nonlinear inverse problem. However, a linearized iterative application can have four classes of resolution matrices (Tables 1 and 2, Fig. 13b): R_C, R_C^i, R_J^{1→i}, and R_J^i, which represent the x → x̃, x → x_i, (x − x_0) → (x_i − x_0), and (x − x_{i−1}) → (x_i − x_{i−1}) projections, respectively. R_C is a linear approximation of the operator r, whereas R_J is the Jacobian matrix of r. The resolution matrix R_JH^i, which is often provided in the literature, reflects the solution improvement at the ith iteration only. The matrix R_JH^{1→i} reflects the improvement of the solution after all i iterations from the starting model to the solution x_i. The surface-wave dispersion inversion in Example 3 highlights that even though the magnitudes of R_C, R_JH^4, and R_JH^{1→4} (Fig. 12b-d) are obviously different, their magnitude patterns (that of R_JH^{1→4} is not shown here) are similar. The resolution lengths estimated from the three matrices (Fig. 12d) are also similar. Therefore, the resolution lengths retrieved from either R_JH^{1→4} or R_JH^4 in a surface-wave dispersion inversion are also acceptable if R_C cannot be given.

Conclusion
Here, we reviewed previous resolution matrices and their applications to clarify the properties of resolution matrices in linear and nonlinear inversions that implement zeroth- and higher-order Tikhonov regularizations. We explained how to use the resolution matrix to understand both the resolvability of the medium parameters and the constrainability of the solution parameters. Furthermore, we suggested a new resolution matrix, the complete resolution matrix, which reflects all of the factors in a study system. This new matrix, which overcomes many of the limitations encountered by previous matrices, can be broadly applied to linear and nonlinear (inverse and non-inverse) problems. This study is designed to assist the reader in fully understanding both the concept and application of a resolution matrix, in appropriately appraising a solution, and in understanding the relationship between the solution and all of the factors in the study, which can in turn guide improvements to the study system.