Recovering a perturbation of a matrix polynomial from a perturbation of its first companion linearization

A number of theoretical and computational problems for matrix polynomials are solved by passing to linearizations. Therefore a perturbation theory that relates perturbations in the linearization to equivalent perturbations in the corresponding matrix polynomial is needed. In this paper we develop an algorithm that finds which perturbation of the matrix coefficients of a matrix polynomial corresponds to a given perturbation of the entire linearization pencil. Moreover, we find transformation matrices that, via strict equivalence, transform a perturbation of the linearization to the linearization of a perturbed polynomial. For simplicity, we present the results for the first companion linearization, but they can be generalized to a broader class of linearizations.


Introduction
Nonlinear eigenvalue problems play an important role in mathematics and its applications, see e.g., the surveys [21,26,30]. In particular, polynomial eigenvalue problems have received much attention [3,14,15,22,24,25]. Recall that

  P(λ) = λ^d A_d + · · · + λA_1 + A_0,  A_i ∈ C^{m×n},  i = 0, . . . , d,  (1.1)

is a matrix polynomial and that the number d is called the grade of P(λ). If A_d ≠ 0 then the grade coincides with the degree of the polynomial. Frequently, complete eigenstructures, i.e. the elementary divisors and minimal indices of matrix polynomials (for the definitions, see e.g., [6,14]), provide an understanding of the properties and behaviour of the underlying physical system and thus are the actual objects of interest. The complete eigenstructure is usually computed by passing to a (strong) linearization, which replaces a matrix polynomial by a matrix pencil, i.e. a matrix polynomial of degree d = 1, with the same finite (and infinite) elementary divisors and with known changes in the minimal indices, see [26] for more details. For example, a classical linearization of (1.1), used in this paper, is the first companion form

  C1_{P(λ)} = λ ⎡ A_d          ⎤   ⎡ A_{d−1}  A_{d−2}  · · ·  A_0 ⎤
              ⎢      I_n      ⎥ + ⎢  −I_n                        ⎥
              ⎢         ⋱     ⎥   ⎢          ⋱                   ⎥
              ⎣           I_n ⎦   ⎣               −I_n           ⎦,  (1.2)

where I_n is the n × n identity matrix and all nonspecified blocks are zeros.
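The first companion form described above can be assembled directly from the coefficient matrices; the following is a minimal sketch (the function name `first_companion` and the representation of a pencil λW1 + W0 by the pair `(W1, W0)` are our own conventions, not notation from the paper):

```python
import numpy as np

def first_companion(coeffs):
    """Assemble the first companion pencil C1(lam) = lam*W1 + W0 for
    P(lam) = sum_i lam^i * coeffs[i], with coeffs = [A_0, ..., A_d].
    The pair (W1, W0) represents the pencil."""
    A = list(coeffs)
    d = len(A) - 1
    m, n = A[d].shape
    dtype = np.result_type(*A)
    W1 = np.zeros((m + n * (d - 1), n * d), dtype=dtype)
    W0 = np.zeros_like(W1)
    W1[:m, :n] = A[d]                      # leading coefficient A_d
    W1[m:, n:] = np.eye(n * (d - 1))       # identity blocks
    for j in range(d):                     # first block row: A_{d-1}, ..., A_0
        W0[:m, j * n:(j + 1) * n] = A[d - 1 - j]
    for j in range(d - 1):                 # -I_n on the block subdiagonal
        W0[m + j * n:m + (j + 1) * n, j * n:(j + 1) * n] = -np.eye(n)
    return W1, W0
```

For a square polynomial with nonsingular leading coefficient, the eigenvalues of P(λ) can be read off the pencil, e.g. as the eigenvalues of −W1⁻¹W0.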
In this paper, we find which perturbation of the matrix coefficients of a given matrix polynomial corresponds to a given perturbation of the entire linearization pencil. To be exact, we find such a perturbation of the matrix polynomial coefficients that the linearization of this perturbed polynomial (2.2) has the same complete eigenstructure as a given perturbed linearization (2.1). We also note that the existence of such a perturbation (2.2) was proven before for Fiedler-type linearizations [8,14,32], and even for the larger class of block-Kronecker linearizations [15]. Nevertheless, the problem of finding this perturbation explicitly has been open until now. The main contribution of this paper is an algorithm for finding this perturbation explicitly. Moreover, the existence of such a perturbation now also follows from the convergence of the algorithm developed in this paper.
The results of this paper can be applied to a number of problems in numerical linear algebra. One example is solving various distance problems for matrix polynomials when the corresponding problems are solved for matrix pencils, e.g., finding a singular matrix polynomial nearby a given matrix polynomial [4,18,20]. Another application lies in the stratification theory [8,14]: constructing an explicit perturbation of a matrix polynomial when a perturbation of its linearization is known. This will allow us to say which perturbation produces a certain change in the complete eigenstructure of a given polynomial. (In [11,16,17] explicit perturbations for investigating such changes for matrix pencils and for bi- and sesquilinear forms are derived.) Moreover, our result may also be useful for investigating the backward stability of polynomial eigenvalue problems solved by applying backward stable methods to the linearizations, see e.g., [29].

Perturbations of matrix polynomials and their linearizations
Recall that for every matrix X = [x_ij] its Frobenius norm is given by ‖X‖ := ‖X‖_F = (Σ_{i,j} |x_ij|²)^{1/2}. Hereafter, unless otherwise stated, we use the Frobenius norm for matrices. Let P(λ) be an m × n matrix polynomial of grade d and define a norm of P(λ) as ‖P(λ)‖ := (Σ_{i=0}^{d} ‖A_i‖²)^{1/2}.

Definition 2.1 Let P(λ) and E(λ) be two m × n matrix polynomials, with grade P(λ) ≥ grade E(λ). The matrix polynomial P(λ) + E(λ) is a perturbation of the m × n matrix polynomial P(λ).
In this paper E(λ) is typically small. Definition 2.1 is also applicable to matrix pencils as a particular case of matrix polynomials.
The first companion form C1_{P(λ)} of P(λ), defined in (1.2), is a well-known way to linearize matrix polynomials, i.e. to replace an investigation of a matrix polynomial by an investigation of a certain matrix pencil with the same characteristics of interest. Namely, P(λ) and C1_{P(λ)} have the same finite and infinite elementary divisors (the same finite and infinite eigenvalues and their multiplicities) and the same left minimal indices, and there is a simple relation between their right minimal indices (those of C1_{P(λ)} are greater by d − 1 than those of P(λ)), see [6] for the definitions and more details. Define a (full) perturbation of the linearization of an m × n matrix polynomial of grade d as

  C1_{P(λ)} + E,  where E = λE′ + E″ is an arbitrary pencil of the same size as C1_{P(λ)},  (2.1)

and define a structured perturbation of the linearization, i.e. a perturbation in which only the blocks A_i, i = 0, 1, . . . , d, are perturbed, as the first companion form C1_{P(λ)+E(λ)} of a perturbed polynomial.  (2.2)

We also refer to (2.2) as the linearization of a perturbed matrix polynomial.
Recall that an m × n matrix pencil λA_1 + A_0 is called strictly equivalent to λB_1 + B_0 if there are nonsingular matrices Q and R such that

  Q(λA_1 + A_0)R = λB_1 + B_0.

Note that two m × n matrix pencils have the same complete eigenstructure if and only if they are strictly equivalent. Moreover, two m × n matrix polynomials of degree d, P(λ) and Q(λ), have the same complete eigenstructure if and only if C1_{P(λ)} and C1_{Q(λ)} are strictly equivalent. Now we can state one of our goals as finding a perturbation E(λ) such that C1_{P(λ)} + E and C1_{P(λ)+E(λ)} are strictly equivalent. The existence of such a perturbation E(λ) is known and stated in Theorem 2.1, which is a slightly adapted formulation of Theorem 2.5 in [10], see also Theorem 5.21 in [15] and [14,23,32].

Theorem 2.1 Let P(λ) be an m × n matrix polynomial of degree d and let C1_{P(λ)} be its first companion form. If C1_{P(λ)} + E is a perturbation of C1_{P(λ)} whose norm is sufficiently small, then there is some perturbation polynomial E(λ) such that C1_{P(λ)} + E is strictly equivalent to the linearization of the perturbed polynomial, C1_{P(λ)+E(λ)}, i.e. there exist two nonsingular matrices U and V (they are small perturbations of the identity matrices) such that

  U(C1_{P(λ)} + E)V = C1_{P(λ)+E(λ)}.

Theorem 2.1 guarantees the existence of the structured perturbation (2.2) and of the transformation matrices U and V but says nothing about how to find them. In the following section we present an algorithm that, for a given perturbation (2.1), finds such a structured perturbation (2.2) and the transformation matrices explicitly.

Reduction algorithm
In this section we describe our iterative algorithm (Algorithm 3.1) that, by a strict equivalence transformation, reduces a full perturbation (2.1) of a linearization pencil to a structured perturbation (2.2) of this pencil, i.e. a perturbation in which only the blocks that correspond to the matrix coefficients of a matrix polynomial are perturbed. The corresponding transformation matrices are derived too. We also analyze important characteristics of the proposed algorithm and its outputs.

Define an unstructured perturbation E^u = λE′^u + E″^u of the linearization C1_{P(λ)} as a perturbation (2.1) in which the blocks occupying the positions of the coefficients A_d, A_{d−1}, . . . , A_0 are replaced by zero blocks of the corresponding sizes. In other words, E^u consists of all the perturbation blocks that are not included in the structured perturbation (2.2), i.e. of all the perturbations of the identity and zero blocks of the linearization C1_{P(λ)}. In turn, the structured perturbation E^s consists of the perturbations of the blocks A_d, A_{d−1}, . . . , A_0 only.

In Sect. 3.1 we show that the unstructured part of the perturbation tends to zero (entrywise) as the number of iterations of our algorithm grows; in Sect. 3.2 we derive a bound on the norm of the resulting structured perturbation; in Sect. 3.3 we explain how to construct the corresponding transformation matrices, i.e. matrices that reduce a full perturbation to a structured one.
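The splitting of a perturbation pencil into its structured and unstructured parts can be sketched in code (a minimal sketch; the function name `split_perturbation` and the representation of a pencil by its two coefficient matrices are our own conventions):

```python
import numpy as np

def split_perturbation(E1, E0, m, n, d):
    """Split a full perturbation pencil lam*E1 + E0 of the first
    companion pencil of an m x n polynomial of grade d into the
    structured part (blocks sitting on the coefficients A_d, ..., A_0)
    and the unstructured remainder (blocks sitting on the identity and
    zero blocks of the linearization)."""
    assert E1.shape == (m + n * (d - 1), n * d)
    Es1, Es0 = np.zeros_like(E1), np.zeros_like(E0)
    Es1[:m, :n] = E1[:m, :n]        # perturbation of A_d
    Es0[:m, :] = E0[:m, :]          # perturbations of A_{d-1}, ..., A_0
    Eu1, Eu0 = E1 - Es1, E0 - Es0   # everything else is unstructured
    return (Es1, Es0), (Eu1, Eu0)
```

By construction the two parts add up to the full perturbation, and the unstructured part vanishes on the coefficient blocks.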
We note that the construction of the corresponding transformation matrices in this paper is similar to the construction of the transformation matrices for the reduction to miniversal deformations of matrices in [12,13], and that the evaluation of the norm of the structured part has some similarities with the evaluation of the norm of the miniversal deformations of (skew-)symmetric matrix pencils in [7,9], see also [12,13]. These similarities are due to the fact that our structured perturbation is a versal deformation (but not a miniversal one), see the mentioned papers for the definitions and details.

Algorithm 3.1 Let C1_{P(λ)} be a first companion linearization of a matrix polynomial P(λ) and let E_1 be a full perturbation of C1_{P(λ)}.

Input: Matrix polynomial P(λ), perturbed matrix pencil C1_{P(λ)} + E_1, and the tolerance parameter tol;
Initialization: U_1 := I, V_1 := I, and i := 1;
Computation: While ‖E^u_i‖ > tol
- find the minimum norm least-squares solution (X_i, Y_i) to the coupled Sylvester equations;
- update the perturbation of the linearization (the updated perturbed linearization remains strictly equivalent to the original one);
- update the transformation matrices;
- extract the new unstructured perturbation E^u_{i+1} to be eliminated;
- increase the counter: i := i + 1;
Output: The structured perturbation E^s, and the transformation matrices U and V.
The system of coupled Sylvester equations appearing at the computation step of Algorithm 3.1 has a solution, since the tangent space to the strict equivalence orbit of C1_{P(λ)} is transversal to the space of the first companion linearizations, i.e., these spaces add up to the whole space of matrix pencils of the corresponding size, see e.g., [14,23,32]. Following Algorithm 3.1, one can explicitly write down how the resulting perturbation of the linearization C1_{P(λ)} + E_k is constructed, see (3.1). In the rest of the paper we investigate properties of this algorithm and perform numerical experiments.
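One pass of the iteration can be sketched numerically. This is a sketch under stated assumptions, not the paper's exact formulation: we assume the coupled Sylvester equations take the form X·W1 + W1·Y = −E1u, X·W0 + W0·Y = −E0u, imposed only on the unstructured positions (cf. Remark 3.1) and solved in the minimum norm least-squares sense via the Kronecker product matrix; the names `reduction_step`, `W1`, `W0` are ours:

```python
import numpy as np

def reduction_step(W1, W0, E1, E0, m, n):
    """One pass of the reduction iteration for the companion pencil
    C1(lam) = lam*W1 + W0 of an m x n polynomial: solve the (assumed)
    coupled Sylvester equations restricted to the unstructured
    positions, then apply the strict equivalence update."""
    p, q = W1.shape
    vec = lambda M: M.reshape(-1, order="F")          # column-stacking vec
    # boolean masks of the unstructured positions (identity/zero blocks)
    mask1 = np.ones((p, q), dtype=bool); mask1[:m, :n] = False  # A_d block
    mask0 = np.ones((p, q), dtype=bool); mask0[:m, :] = False   # first block row
    # vec(X*W + W*Y) = (W.T kron I_p) vec(X) + (I_q kron W) vec(Y)
    T = np.block([[np.kron(W1.T, np.eye(p)), np.kron(np.eye(q), W1)],
                  [np.kron(W0.T, np.eye(p)), np.kron(np.eye(q), W0)]])
    b = -np.concatenate([vec(E1 * mask1), vec(E0 * mask0)])
    rows = np.concatenate([vec(mask1), vec(mask0)])   # drop structured rows
    x = np.linalg.lstsq(T[rows], b[rows], rcond=None)[0]
    X = x[:p * p].reshape(p, p, order="F")
    Y = x[p * p:].reshape(q, q, order="F")
    # strict equivalence update: (I+X)(C1+E)(I+Y) - C1
    E1n = (np.eye(p) + X) @ (W1 + E1) @ (np.eye(q) + Y) - W1
    E0n = (np.eye(p) + X) @ (W0 + E0) @ (np.eye(q) + Y) - W0
    return E1n, E0n, X, Y
```

On a tiny example the unstructured part shrinks roughly quadratically per pass, matching the behaviour proved in Sect. 3.1.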

Elimination of the unstructured perturbation
We start by deriving an auxiliary lemma that will be used to prove that in Algorithm 3.1 the unstructured perturbation converges to zero. For a given matrix T, define κ(T) := κ_F(T) = ‖T‖ · ‖T†‖ to be the Frobenius condition number of T, see e.g., [2,5,28]. We recall that T† denotes the pseudoinverse (the Moore–Penrose inverse) of T, see e.g., [19].
Proof Using the Kronecker product we can rewrite the system of coupled Sylvester equations as a system of linear equations Tx = b. The minimum norm least-squares solution to such a system can be written as x = T†b, implying ‖x‖ ≤ ‖T†‖ · ‖b‖; taking into account ‖x‖² = ‖X‖² + ‖Y‖², we obtain the bound (3.3), where κ(T) is the Frobenius condition number of T.

The bounding expression in (3.3) depends on the conditioning of the problem (3.4) as well as on how small (normwise) the right-hand side of (3.4) (or, equivalently, (3.2)) is, compared to the matrix coefficients in the left-hand side. The conditioning of (3.2) may actually be better than the conditioning of (3.4). Thus for very ill-conditioned problems and large perturbations, it may be reasonable to use a solver for (3.2) instead of passing to the Kronecker product matrices.
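The two facts used in the proof, that the minimum norm least-squares solution is x = T†b and that ‖x‖ ≤ ‖T†‖ · ‖b‖, can be checked on a small example (the matrix sizes here are arbitrary, chosen only to make the system rank deficient):

```python
import numpy as np

# Minimum norm least-squares solution x = pinv(T) @ b and the bound
# ||x|| <= ||pinv(T)|| * ||b||, on a small rank-deficient system.
rng = np.random.default_rng(1)
T = rng.standard_normal((6, 4)) @ rng.standard_normal((4, 8))  # rank <= 4
b = rng.standard_normal(6)

x_pinv = np.linalg.pinv(T) @ b                  # x = T^+ b
x_lstsq = np.linalg.lstsq(T, b, rcond=None)[0]  # same minimum norm solution
```

`np.linalg.lstsq` returns the minimum norm solution for rank-deficient systems (both routines go through the SVD), so the two computations agree, and the norm bound holds with the Frobenius norm of the pseudoinverse.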
In the following theorem we prove that Algorithm 3.1 eliminates the unstructured perturbation, i.e. we show that the norm of the unstructured part of the perturbation tends to zero as the number of iterations grows.

Theorem 3.1 Let C1_{P(λ)} + E_1 be a perturbation of the linearization and let α‖E_1‖ < 1, where α is the constant defined in (3.6), a supremum over the iterations of an expression involving κ(T_i) and the norms of the pencils C1_{P(λ)} + E_i. Then the norm of the unstructured part of the perturbation in Algorithm 3.1 tends to zero as the number of iterations grows.

Proof We start by proving a bound for the norm of the unstructured part of the perturbation at the (i + 1)-st step of the algorithm in terms of the norm of the unstructured part at the i-th step. Define C1_{P(λ)} = λW′ + W″. Following Algorithm 3.1 we obtain matrices X_i and Y_i by solving the system of coupled Sylvester matrix equations (3.8). Using the solution X_i and Y_i of (3.8) we compute the updated perturbation E_{i+1}. Since X_i and Y_i are a solution to (3.8), the first-order terms of the unstructured part cancel. Splitting the perturbation into the structured and unstructured parts, we see that, in general, E′^u_{i+1} and E″^u_{i+1} are not zero matrices, but we show that they tend to zero (entrywise) as i → ∞. In (3.9)–(3.12) below, T_i is the Kronecker product matrix defined in (3.7) and associated with the system of coupled Sylvester equations (3.8). Using the bound (3.5) we obtain (3.9) and, respectively, using the bound (3.3) we obtain (3.10). Combining (3.9) and (3.10) we obtain the bound (3.11) on the Frobenius norm of E″^u_{i+1}; similarly, for the matrix E′^u_{i+1} we obtain (3.12). Using α, defined in (3.6), we can write the bounds on the unstructured part of the perturbation for both coefficients of the matrix pencil at step i + 1 as

  ‖E′^u_{i+1}‖ ≤ α‖E^u_i‖²  and  ‖E″^u_{i+1}‖ ≤ α‖E^u_i‖².  (3.13)

Note that the supremum in the definition of α is finite since κ(T_i) and the norms of the corresponding matrices are finite. This results in the bound for the whole pencil:

  ‖E^u_{i+1}‖ ≤ α‖E^u_i‖².  (3.14)

Applying the bounds (3.13) and (3.14) at each step we get

  ‖E^u_{k+1}‖ ≤ α^{2^k − 1} ‖E^u_1‖^{2^k},  k = 1, 2, . . . .  (3.15)

If α‖E_1‖ < 1 then the norm of the unstructured part of the perturbation tends to zero as the number of iterations grows.

Remark 3.1
In our case we should exclude some rows from (3.7), since we want to eliminate only the unstructured part of the perturbation E_i. Therefore, the norm of the solution of such a least-squares problem is less than or equal to ‖x‖, where x = T†b. Clearly, the bounds from Lemma 3.1 remain valid.
The sharpness of the bounds (3.15) depends on the value of α and on the size of the initial perturbation: the better conditioned the problem is and the smaller the initial perturbation is, the better the bounds (3.15) are. Even if the problem is ill-conditioned, we can still guarantee convergence for small enough perturbations. Note that a proper scaling of a matrix polynomial improves the conditioning of the problem, see e.g., [15]. Moreover, in practice, Algorithm 3.1 converges to a structured perturbation very well and requires only a small number of iterations, see the numerical experiments in Sect. 4. The numerical experiments also suggest quadratic convergence, and this can indeed be verified using (3.15).

Bound on the norm of the structured perturbation
In this section we find a bound on the resulting structured perturbation. Similarly to the analysis in Sect. 3.1, we have a dependence on the conditioning of the problem as well as on the norm of the original perturbation; therefore we need to assume that these quantities are small enough.

Theorem 3.2 Let C1_{P(λ)} + E_1 be a perturbation of the linearization satisfying α‖E_1‖ < 1, with α defined in (3.6), the solutions X_i and Y_i as in Algorithm 3.1, and the Kronecker product matrix T_i in (3.7). Denote ε := ‖E_1‖. Define also β := sup_i 2(n+m) κ(T_i) and a constant γ < α defined analogously to α. Then the norm of the resulting structured perturbation E^s is bounded by the convergent series in (3.18).
Proof For the input C1_{P(λ)} + E_1, following Algorithm 3.1 step by step, we can build the resulting perturbation as explained in (3.1). Omitting the eliminated unstructured parts of (3.1), we start by evaluating the structured part of the perturbation coming from the coupled Sylvester equations using the bounds (3.5). Recall that κ(T_i) and the norms of the corresponding perturbations are finite, and thus the supremum in the definition of β is finite. Using the bounds in (3.10), we obtain

  ε + βε + γε² + βγε² + γ³ε⁴ + βγ³ε⁴ + γ⁷ε⁸ + · · · = (1 + β)(ε + γε² + γ³ε⁴ + γ⁷ε⁸ + · · ·).  (3.18)

In (3.18) we used that γε < 1, which holds since, by the assumption of the theorem, αε < 1 and γ < α.

Construction of the transformation matrices
In this section we investigate the transformation matrices that bring a full perturbation of the linearization to a structured perturbation of the linearization. Following Algorithm 3.1, we observe that the transformation matrices are constructed as the infinite products

  U = lim_{k→∞} (I + X_k) · · · (I + X_1)  and  V = lim_{k→∞} (I + Y_1) · · · (I + Y_k),  (3.19)

where I denotes identity matrices of the appropriate sizes. The convergence of these infinite products to nonsingular matrices is proven in Theorem 3.3. Note that, for small initial perturbations, such transformation matrices are small perturbations of the identity matrices.

Theorem 3.3 Let C1_{P(λ)} + E_1 be a perturbation of the linearization satisfying α‖E_1‖ < 1, with α defined in (3.6). Let also X_i and Y_i be a solution to (3.8) for the corresponding index i. Then the limits in (3.19) exist and are nonsingular matrices.

Proof By [31, Theorem 4] the limits in (3.19) exist and are nonsingular matrices if the sums

  Σ_{i=1}^{∞} ‖X_i‖  and  Σ_{i=1}^{∞} ‖Y_i‖  (3.20)

converge absolutely. Using the bound (3.5) for a solution of the coupled Sylvester equations and noting that ‖X‖² ≤ ‖X‖² + ‖Y‖², we have the bound (3.21) for both ‖X_i‖ and ‖Y_i‖. The bounds (3.21), together with our assumption α‖E_1‖ < 1 (note that ‖E^u_1‖ ≤ ‖E_1‖), allow us to conclude that the sums in (3.20) converge absolutely.
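The mechanism behind [31, Theorem 4] can be illustrated numerically; this is a toy sketch with geometrically decaying factors X_i (mimicking the rapidly decaying solutions of Algorithm 3.1, not the actual iterates):

```python
import numpy as np

# If sum_i ||X_i|| converges, the products (I + X_k) ... (I + X_1)
# converge to a nonsingular matrix; here ||X_i|| decays like 0.1^i.
rng = np.random.default_rng(7)
U = np.eye(3)
partial_sum = 0.0
for i in range(1, 40):
    Xi = (0.1 ** i) * rng.standard_normal((3, 3))   # ||X_i|| ~ 0.1^i
    partial_sum += np.linalg.norm(Xi)
    U = (np.eye(3) + Xi) @ U                        # accumulate the product
```

Since each ‖X_i‖ is eventually well below 1, every factor I + X_i is nonsingular and the accumulated product stays bounded away from singularity.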

Numerical experiments
All the numerical experiments are performed on a MacBook Pro (processor: 2.6 GHz Intel Core i7, memory: 32 GB 2400 MHz DDR4), using Matlab R2019a (64-bit). We consider a large number of randomly generated matrix polynomials, matrix polynomials coming from real world applications, and matrix polynomials crafted specially for testing the limits of the proposed algorithm.

Example 4.1
Consider 1000 random polynomials of size 8 × 8 and degree 4. The entries of the matrix coefficients of these polynomials are generated from the normal distribution with mean μ = 0 and standard deviation σ = 10 (variance σ² = 100). The polynomials are normalized to have Frobenius norm equal to 1. Each polynomial is perturbed by adding a matrix polynomial whose matrix coefficients have entries that are uniformly distributed on the interval (0, 0.01). At most 6 iterations are needed for the norm of the unstructured part of a perturbation to be of order 10⁻¹⁶ (the tolerance we require). In Fig. 1 we present the results in whisker plots (box plots).
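The random test set just described can be generated as follows (a sketch; the seed is our choice, and we assume the polynomial norm from Sect. 2, i.e. the square root of the sum of squared Frobenius norms of the coefficients):

```python
import numpy as np

# Setup of Example 4.1: a random 8x8 grade-4 polynomial with N(0, 10^2)
# entries, normalized so that ||P|| = 1, then perturbed by coefficients
# with entries uniform on (0, 0.01).
rng = np.random.default_rng(2024)   # seed chosen here, not from the paper
m = n = 8
d = 4

A = [rng.normal(0.0, 10.0, (m, n)) for _ in range(d + 1)]
normP = np.sqrt(sum(np.linalg.norm(Ai, 'fro')**2 for Ai in A))
A = [Ai / normP for Ai in A]        # normalize the whole polynomial

E = [rng.uniform(0.0, 0.01, (m, n)) for _ in range(d + 1)]  # perturbation
```

In the experiment this setup is repeated 1000 times and each perturbed polynomial is fed to Algorithm 3.1 through its linearization.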
In the following two examples we consider two quadratic matrix polynomials coming from applications. Both matrix polynomials belong to the NLEVP collection [3].

Example 4.3
Consider a 21 × 16 quadratic matrix polynomial arising from calibration of a surveillance camera using a human body as a calibration target [3,27]. Note that the polynomial is rectangular. In Fig. 5 we present the decay of the norm of the unstructured part of the perturbation. The changes in the norm of the structured part of the perturbation and in the norms of the transformation matrices are presented in Figs. 6 and 7, respectively.
In the following example we tune the conditioning of the problem and the value of the initial perturbation to test the limits of Algorithm 3.1.

Example 4.4
Consider the 5 × 5 quadratic matrix polynomial from Example 4.2. We scale the matrix coefficients of this polynomial and increase the initial perturbation to achieve the following goals: (a) making the structured perturbation much larger than the initial perturbation and (b) forcing Algorithm 3.1 to diverge. Notably, even if (a) is achieved, i.e. the limit perturbation is much larger than the original one, we may still have convergence. We summarize the results of our experiment in Table 1. In the last column we underline the word "yes" if the convergence is also provable by Theorem 3.1; e.g., in the first and second experiments the values of α‖E_1‖ are 7.75e-05 and 0.98, respectively. In experiments 3, 5, 6, and 8 the value of α‖E_1‖ is larger than 1, thus Theorem 3.1 is not applicable; nevertheless we still observe convergence numerically. In experiments 9 and 10 we do not even have numerical convergence, and of course the value of α‖E_1‖ is larger than 1.
The existence of structured perturbations for these broader classes of linearizations follows, e.g., from [8,14,15]. Such a generalization will require solving the corresponding structured coupled Sylvester equations, or at least the corresponding structured least-squares problem.

Table 1 We show how the choice of the scalars α_i, i = 0, 1, 2, in the matrix polynomial Q(λ) = α_2 A_2 λ² + α_1 A_1 λ + α_0 A_0 and the initial perturbation E_1 change the norm of the resulting structured perturbation E^s and the convergence of the algorithm. Entries of E_1 are uniformly distributed in (0, 5.0e-06).