Convergence results for some piecewise linear solvers

Let $A$ be a real $n\times n$ matrix and $z,b\in \mathbb R^n$. The piecewise linear equation system $z-A\vert z\vert = b$ is called an \textit{absolute value equation}. We consider two solvers for this problem, one direct, one semi-iterative, and extend their previously known ranges of convergence.


Introduction
Denote by $M_n(\mathbb R)$ the space of $n \times n$ real matrices and let $A \in M_n(\mathbb R)$ and $z, b \in \mathbb R^n$. The piecewise linear equation system
$$z - A|z| = b \qquad (1.1)$$
is called an absolute value equation (AVE) and was first introduced by Rohn in [Roh89]. Mangasarian proved its polynomial equivalence to the linear complementarity problem (LCP) [MM06]. In [Neu90, pp. 216-230] Neumaier authored a detailed survey about the AVE's intimate connection to the research field of linear interval equations. Especially closely related system types are equilibrium problems of the form
$$Bx + \max(0, x) = c\,, \qquad (1.2)$$
where $B \in M_n(\mathbb R)$ and $x, c \in \mathbb R^n$. A prominent example is the first hydrodynamic model presented in [BC08]. Using the identity $\max(s, t) = (s + t + |s - t|)/2$, equality (1.2) can be reformulated as
$$(2B + I)x + |x| = 2c\,,$$
and for regular $(2B + I)$ this is equivalent to an AVE (1.1).
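For illustration, the identity and the resulting transformation can be checked numerically in the scalar case. The following sketch assumes a scalar equilibrium problem of the form $Bx + \max(0, x) = c$; the concrete numbers are illustrative choices, not data from the cited models:

```python
# Check max(s, t) = (s + t + |s - t|) / 2 and the scalar reformulation of
# B*x + max(0, x) = c into AVE form z - A|z| = b with A = -1/(2B + 1)
# and b = 2c/(2B + 1).  (Illustrative numbers, not from the cited models.)
def max_via_abs(s, t):
    return (s + t + abs(s - t)) / 2

assert all(max_via_abs(s, t) == max(s, t)
           for s, t in [(3.0, -1.5), (-2.0, -2.0), (0.0, 7.25)])

B, x = 1.5, -0.75                 # a regular scalar "2B + 1" and a trial solution
c = B * x + max(0.0, x)           # right-hand side that makes x the solution
A = -1.0 / (2 * B + 1)            # scalar AVE data
b = 2 * c / (2 * B + 1)
assert abs((x - A * abs(x)) - b) < 1e-12   # x also solves the equivalent AVE
```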
The AVE also has a connection to nonsmooth optimization: piecewise affine systems of arbitrary structure may arise as local linearizations of piecewise differentiable objective functions [GHR+18] or as intermediary problems in the numerical solution of ordinary differential equations with nonsmooth right-hand side [GSL+18]. Such systems can be transformed, with a one-to-one solution correspondence, into an AVE [GBRS15, Lem. 6.5].
This position at the crossroads of several interesting problem areas has given rise to the development of efficient solvers for the AVE. The latest publications on that matter include approaches via linear programming [Man14] and concave minimization [Man07a], as well as a variety of Newton and fixed point methods (see, e.g., [BC08], [YY12], [HHZ11]). In this article we present and further analyze two solvers for the AVE: the signed Gaussian elimination, a direct solver developed in [Rad16a], and a semi-iterative generalized Newton method developed in [GBRS15, SGRB14]. In particular, we unify and further extend the known convergence results for both algorithms.
Content and structure of this note: In Section 2 we assemble the necessary preliminaries from the literature and (re-)prove some auxiliary results. In Section 3 we prove a theorem that allows us to unify and extend the existing correctness resp. convergence results for the two solvers mentioned above. In Sections 4 and 5 the solvers are presented and the main results proved. In Section 6 we provide an example which demonstrates that, despite the parallel results presented in this note, the two solvers are not equivalent.

Preliminaries
We denote by $[n]$ the set $\{1, \dots, n\}$. For vectors and matrices, absolute values and comparisons are used entrywise. Zero vectors and matrices of proper dimension are denoted by $0$. Let $c \in \mathbb R^n$; then we denote by $\operatorname{diag}(c)$ the diagonal matrix in $M_n(\mathbb R)$ with entries $c_1, \dots, c_n$.
A signature matrix $S$, or, briefly, a signature, is a diagonal matrix with diagonal entries $+1$ or $-1$, i.e., $|S| = I$. The set of $n$-dimensional signature matrices is denoted by $\mathcal S_n$. A single diagonal entry of a signature is a sign $s_i$ ($i \in [n]$). Let $z \in \mathbb R^n$. We write $S_z$ for the signature with $s_i = 1$ if $z_i \geq 0$ and $s_i = -1$ else. Clearly, we then have $S_z z = |z|$. Using this convention, we can rewrite (1.1) as
$$(I - AS_z)\,z = b\,. \qquad (2.1)$$
In this form it becomes apparent that the main difficulty is to determine the proper signature $S$ for $z$, that is, to determine in which of the $2^n$ orthants about the origin $z$ lies. This is NP-hard in general [Man07b].

Denote by $\rho(A)$ the spectral radius of $A$ and let (cf. [Roh89, Chap. 5])
$$\rho_0(A) \equiv \max\{|\lambda| : \lambda \text{ is a real eigenvalue of } A\}$$
be the real spectral radius of $A$, where $\rho_0(A) \equiv 0$ if $A$ has no real eigenvalues. Then its sign-real spectral radius is defined as follows (see [Rum97, Def. 1.1]):
$$\rho^{\mathbb R}(A) \equiv \max\{\rho_0(SA) : S \in \mathcal S_n\}\,.$$
The exponential number of signatures $S$ accounts for the NP-hardness of the computation of $\rho^{\mathbb R}(A)$ [Rum97, Cor. 2.9]. For a fixed signature $\hat S$, we have $\{S(\hat S A) : S \in \mathcal S_n\} = \{SA : S \in \mathcal S_n\}$. Furthermore, since all $S \in \mathcal S_n$ are involutive, i.e., $S^{-1} = S$, the spectra of $A$ and $SAS$ are identical. These observations immediately yield the useful identity
$$\rho^{\mathbb R}(A) = \max\{\rho_0(SA\tilde S) : S, \tilde S \in \mathcal S_n\}\,.$$
The solvability properties of (2.1) and the quantity $\rho^{\mathbb R}(A)$ are strongly connected (cf. [Rum97], [Neu90, p. 220, Thm. 6.1.3-5]):

Theorem 2.1. Let $A \in M_n(\mathbb R)$. Then the following are equivalent:
1. $\rho^{\mathbb R}(A) < 1$.
2. The system $(I - AS_z)z = b$ has a unique solution for all $b \in \mathbb R^n$.
3. All matrices $I - AS$, $S \in \mathcal S_n$, have positive determinant.
4. The piecewise linear function $\varphi(z) \equiv z - A|z|$ is bijective.

We provide a brief account of the implications essential to our investigation. For a complete proof of Theorem 2.1 we refer to the afore cited references.

$(3) \Leftrightarrow (4)$: The piecewise linear function $\varphi$ is positively homogeneous. Hence, it is bijective if and only if it is bijective at the origin. By Clarke's inverse function theorem this is the case if and only if all the matrices $I - AS$, $S \in \mathcal S_n$, which are the Jacobians of the selection functions of $\varphi$, have the same determinant sign. This sign cannot be negative, because in the described situation all matrices in the polytope $P := \operatorname{conv}(I - AS : S \in \mathcal S_n)$ have the same nonzero determinant sign and $P$ contains the identity.

$(4) \Rightarrow (1)$: If we had $\rho^{\mathbb R}(A) \geq 1$, we could find a matrix in $P$, as defined in the last step, that is singular.
There exist various other proofs for the equivalences listed in Theorem 2.1; see, e.g., [Rum97, Neu90, Rad16a]. Moreover, note that the sign-real spectral radius is but one facet of the unified Perron-Frobenius theory developed in [Rum02], which extends several key properties of the Perron root of nonnegative real matrices to general real and complex matrices via the concepts of the sign-real and sign-complex spectral radius, respectively. A unified expression for these three quantities is derived in [Rum02, Thm. 2.4].
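In small dimensions the defining maximum can be evaluated by brute force over all $2^n$ signatures. The following pure-Python sketch does this for $n = 2$ (the function names are ours; $\rho_0$ of a matrix without real eigenvalues is taken as $0$):

```python
from itertools import product

def real_eigs_2x2(M):
    # real eigenvalues of [[a, b], [c, d]] via the characteristic polynomial
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        return []                     # complex conjugate pair, no real ones
    r = disc ** 0.5
    return [(tr + r) / 2, (tr - r) / 2]

def sign_real_spectral_radius_2x2(A):
    # rho^R(A) = max over signatures S of the real spectral radius of S*A
    best = 0.0
    for s1, s2 in product((1.0, -1.0), repeat=2):
        SA = [[s1 * A[0][0], s1 * A[0][1]],
              [s2 * A[1][0], s2 * A[1][1]]]
        best = max([best] + [abs(lam) for lam in real_eigs_2x2(SA)])
    return best

# rho_0(A) = 0 here (no real eigenvalues), yet a signature flip exposes +-2:
A = [[0.0, 2.0], [-2.0, 0.0]]
assert real_eigs_2x2(A) == []
assert sign_real_spectral_radius_2x2(A) == 2.0
```

For larger $n$ the same enumeration applies but, as noted above, becomes exponentially expensive.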

The unifying theorem
Hereafter, $\operatorname{sign}$ denotes the signum function. The following simple observation is key to the subsequent discussion:

Lemma 3.1. Let $A \in M_n(\mathbb R)$ with $\|A\|_\infty < 1$ and let $z, b \in \mathbb R^n$ satisfy $z - A|z| = b$. Then $\operatorname{sign}(z_i) = \operatorname{sign}(b_i)$ for every index $i$ with $|z_i| = \|z\|_\infty$.

Proof. Let $z_i$ be an entry of $z$ s.t. $|z_i| = \|z\|_\infty$. If $\|z\|_\infty = 0$, then $z = 0$ and thus $b \equiv z - A|z| = 0$, and the statement holds trivially. If $|z_i| > 0$, then $|e_i^\intercal A|z|| < |z_i|$, due to the norm constraint on $A$. Thus, $b_i = z_i - e_i^\intercal A|z|$ will adopt the sign of $z_i$.

We do not know, though, for which indices the signs coincide. The theorem below states restrictions on $A$ which guarantee the coincidence of the signs of $z_i$ and $b_i$ for all $i \in [n]$ where $|b_i| = \|b\|_\infty$, and thus provides the basis for the convergence proofs in Sections 4 and 5. For $b \in \mathbb R^n$ we set
$$I_b^{\max} \equiv \{\, i \in [n] : |b_i| = \|b\|_\infty \,\}$$
and define
$$\operatorname{Neq}(A, z, b) \equiv \{\, i \in I_b^{\max} : \operatorname{sign}(z_i) \neq \operatorname{sign}(b_i) \,\}\,.$$

Theorem 3.1. Let $A \in M_n(\mathbb R)$ and $z, b \in \mathbb R^n$ with $z - A|z| = b$. Then $\operatorname{Neq}(A, z, b) = \emptyset$ if either of the following conditions is satisfied:
1. $\|A\|_\infty < \frac12$.
2. $\|A\|_\infty \leq \frac12$ and $A$ is irreducible.
3. The third condition of [Rad16a, Thm. 3.1] holds.
4. $A$ is symmetric with $\|A\|_\infty \leq 1$.

We note that $\rho^{\mathbb R}(A)$ is bounded by all $p$-norms of $A$ [Rum97, Thm. 2.15]; together with Theorem 2.1 this affirms that all systems considered in the sequel are uniquely solvable.
The first three points are cited from [Rad16a, Thm. 3.1]. We will prove the fourth point and reprove the first two by somewhat more elegant means than in the latter reference. This includes a new proof for the following lemma.
Lemma 3.2. Let $A \in M_n(\mathbb R)$ with $\|A\|_\infty \leq \frac12$, and let $A$ be irreducible or $\|A\|_\infty < \frac12$. Then the inverse of $B = I - A$ is strictly diagonally dominant and has a positive diagonal.
Proof. As $\|A\|_\infty \leq \frac12 < 1$, the inverse of $(I - A)$ exists and can be expressed via the Neumann series
$$(I - A)^{-1} = \sum_{k \geq 0} A^k = I + \sum_{k \geq 1} A^k\,.$$
Since $\big\|\sum_{k \geq 1} A^k\big\|_\infty \leq \sum_{k \geq 1} \|A\|_\infty^k = \|A\|_\infty/(1 - \|A\|_\infty)$, this already proves strict diagonal dominance for $\|A\|_\infty < \frac12$. Moreover, due to $\|A\|_\infty < 1$, we have $\rho(A') < 1$ for any principal submatrix $A'$ of $A$ (including the case $A' = A$). This implies that the real part of every eigenvalue of $I - A'$ is positive. Consequently, all real eigenvalues of $I - A'$ are positive and no complex eigenvalue is $0$. Since the complex eigenvalues of real matrices appear in conjugate pairs, whose products are positive as well, we get $\det(I - A') > 0$. The positivity of the diagonal of $(I - A)^{-1}$ now follows from Cramer's rule.
To further explore the diagonal dominance of a matrix sum $I + M + R$, where we will use $M = A^m$ and $R = \sum_{k \geq 1,\, k \neq m} A^k$, we bound the diagonal dominance gap in row $i$ from below:
$$(1 + M_{ii}) - \sum_{j \neq i} |M_{ij}| - \|R\|_\infty \;\geq\; 1 + M_{ii} + |M_{ii}| - \|M\|_\infty - \|R\|_\infty\,. \qquad (3.1)$$
Thus we get strict diagonal dominance in row $i$ both for $\|M\|_\infty + \|R\|_\infty < 1$ and in the case where $\|M\|_\infty + \|R\|_\infty = 1$ and $M_{ii} > 0$. Here the partition $A(I - A)^{-1} = M + R$ can be chosen differently for every $i = 1, \dots, n$.
If $\rho(A) < \frac12$, then $(2A)^k$ converges toward zero, so that there is some $K$ with $\|A^K\|_\infty < 2^{-K}$. Setting $M = A^K$ and $R = \sum_{k \geq 1,\, k \neq K} A^k$ we obtain
$$\|M\|_\infty + \|R\|_\infty \;<\; 2^{-K} + \sum_{k \geq 1,\, k \neq K} 2^{-k} \;=\; 1\,,$$
where the bound on $\|R\|_\infty$ follows from $\|A\|_\infty \leq \frac12$, which holds by hypothesis.

In the case $\rho(A) = \frac12$ the assumptions of the Wielandt theorem [Mey00, Wie50] are satisfied, since $\rho(A) = \rho(|A|) = \|A\|_\infty$, so that there are a sign $s$ and a signature matrix $S = \operatorname{diag}(s_1, \dots, s_n)$ with $|s| = |s_i| = 1$, $i = 1, \dots, n$, such that $A = s\,S^{-1}|A|\,S$ and thus, for the powers of $A$, $A^k = s^k S^{-1}|A|^k S$. The diagonal elements of $|A|^k$ are sums of products over $k$-cycles of non-negative elements. Since $|A|$ is irreducible, there is at least one $k_i$-cycle, $k_i \in [n]$, through each diagonal position $i$. Thus we find $(|A|^{k_i})_{ii} = |(A^{k_i})_{ii}| > 0$. For the square of that power we note that the diagonal element satisfies the identity
$$\Big|\sum_{j=1}^n (A^{k_i})_{ij}(A^{k_i})_{ji}\Big| \;=\; |(A^{2k_i})_{ii}| \;=\; (|A|^{2k_i})_{ii} \;=\; \sum_{j=1}^n (|A|^{k_i})_{ij}(|A|^{k_i})_{ji}\,.$$
By the triangle inequality, the identity of the leftmost and rightmost terms is only possible if all the terms in the sum on the left have the same sign. As $(A^{k_i})_{ii}^2 > 0$, all those terms are positive and consequently $(A^{2k_i})_{ii} > 0$, which proves diagonal dominance in row $i$ by setting $M = A^{2k_i}$ and $R = \sum_{m \geq 1,\, m \neq 2k_i} A^m$ in the separation inequality (3.1). This can be done for any index, thus proving overall diagonal dominance of $(I - A)^{-1}$.
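The boundary case of Lemma 3.2 can be illustrated numerically with the Neumann series from the proof. A small pure-Python sketch (function names ours), using an irreducible $A$ with $\|A\|_\infty = \rho(A) = \frac12$:

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_inverse(A, terms=200):
    # (I - A)^{-1} = sum_{k >= 0} A^k, convergent here since rho(A) < 1
    n = len(A)
    S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    P = [row[:] for row in S]                  # running power A^k
    for _ in range(terms):
        P = mat_mul(P, A)
        S = [[S[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    return S

def strictly_diag_dominant_pos(B):
    n = len(B)
    return all(B[i][i] > sum(abs(B[i][j]) for j in range(n) if j != i)
               for i in range(n))

# irreducible, ||A||_inf = 1/2, rho(A) = 1/2 (the boundary case of the lemma)
A = [[0.0, 0.5], [-0.5, 0.0]]
B = neumann_inverse(A)        # close to [[0.8, 0.4], [-0.4, 0.8]]
assert strictly_diag_dominant_pos(B)
```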
Proof. (Theorem 3.1) (1) and (2): In both cases we have $\|A\|_\infty \leq \frac12$; moreover, in case (2), $A$ is irreducible. Set $S \equiv S_z$. Then $(I - AS)^{-1}$ is strictly diagonally dominant with positive diagonal by Lemma 3.2, since both irreducibility and the norm constraint are invariant under multiplication of $A$ by a signature matrix. Hence, $z_i = e_i^\intercal (I - AS)^{-1} b$ will adopt the sign of $b_i$ for all $i \in I_b^{\max}$.
(3): See [Rad16a]. (4): The proof proceeds by induction. The $(2 \times 2)$-case can be verified by direct computation. (We note that it is the only part of the proof that makes use of the symmetry of $A$.) Now assume the statement of the theorem holds for some $N \geq 2$, but that there is a tuple $(A, z, b)$ in dimension $N + 1$ which violates it. We will further assume that not all entries of $z$ have the same absolute value.
For all entries $z_j$ of $z$ whose absolute value is maximal in $z$ we then have $\operatorname{sign}(z_j) = \operatorname{sign}(b_j)$. Consequently, if there existed a tuple $(A, z, b)$ which violated the claim of the theorem, then for any $i \in \operatorname{Neq}(A, z, b)$ the absolute value of $z_i$ would not be maximal in $z$.
As $N + 1 \geq 3$, we can assume without loss of generality that $|z_{N+1}| = \|z\|_\infty$, while the first row holds the contradiction, so that we have $|b_1| = \|b\|_\infty$ and $\operatorname{sign}(b_1) \neq \operatorname{sign}(z_1)$. Then there exists a scalar $\zeta \in [0, 1]$ with which the variable $z_{N+1}$ can be eliminated from the system. More precisely, let $A \in M_{N+1}(\mathbb R)$, denote by $A_{N+1,N+1}$ the matrix in $M_N(\mathbb R)$ that is derived from $A$ by removing its $(N+1)$-th row and column, and let $(\bar A, \bar z, \bar b)$ denote the correspondingly reduced data. Then the tuple $(\bar A, \bar z, \bar b)$ contradicts the induction hypothesis for dimension $N$, as we still have $1 \in \operatorname{Neq}(\bar A, \bar z, \bar b)$.
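The statement of Theorem 3.1 is easy to probe empirically. The following sketch checks condition (1) on pseudo-random data (the sampling scheme and seed are illustrative choices of ours):

```python
import random

random.seed(7)                     # deterministic illustrative data
n = 6
for _ in range(100):
    z = [random.uniform(-3.0, 3.0) for _ in range(n)]
    A = []
    for _ in range(n):
        r = [random.uniform(-1.0, 1.0) for _ in range(n)]
        s = sum(abs(x) for x in r)
        A.append([0.49 * x / s for x in r])   # every row sum of |A| is 0.49 < 1/2
    # b = z - A|z|, then compare signs on the indices where |b| is maximal
    b = [z[i] - sum(A[i][j] * abs(z[j]) for j in range(n)) for i in range(n)]
    bmax = max(abs(bi) for bi in b)
    assert all((z[i] > 0) == (b[i] > 0)
               for i in range(n) if abs(b[i]) == bmax)   # Neq(A, z, b) empty
```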

Signed Gaussian elimination
If one is sure of the sign $s_k$ of $z_k$, one can remove this variable from the right-hand side of the AVE. Let $A_{*k}$ denote the $k$-th column $Ae_k$ and $A_{j*}$ the $j$-th row $e_j^\intercal A$. Then the removal of the variable is reflected in the formula
$$\bar A_{ji} \equiv A_{ji} + \frac{s_k A_{jk}}{1 - s_k A_{kk}}\,A_{ki}\,, \qquad \bar b_j \equiv b_j + \frac{s_k A_{jk}}{1 - s_k A_{kk}}\,b_k\,, \qquad i, j \neq k\,. \qquad (4.1)$$
The signed Gaussian elimination (SGE) repeats this step: it picks an index $k$ with $|b_k|$ maximal, sets $s_k \equiv \operatorname{sign}(b_k)$, which is justified by Theorem 3.1, and recurses on the reduced system. For dense $A$ the SGE has a cubic computational cost. For $A$ with band structure it was shown in [Rad16a] that the computation has the asymptotic cost of sorting $n$ floating point numbers. Moreover, note that the SGE is numerically stable, since $I - AS$ is strictly diagonally dominant if $\|A\|_\infty < 1$.

Theorem 4.1. Let $A \in M_n(\mathbb R)$ and $b \in \mathbb R^n$. If $A$ conforms to any of the conditions listed in Theorem 3.1, then the SGE computes the unique solution of the AVE (1.1).

The proof applies Theorem 3.1 recursively; in particular, all conditions listed in Theorem 3.1 are invariant under the elimination step (4.1).
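The procedure just described can be sketched compactly in pure Python (function names are ours; correctness presupposes one of the conditions of Theorem 3.1, e.g. $\|A\|_\infty < \frac12$):

```python
def solve_ave_sge(A, b):
    # Signed Gaussian elimination sketch for z - A|z| = b.
    # Pick the index k with |b_k| maximal, fix s_k = sign(b_k) (Theorem 3.1),
    # eliminate z_k via the update (4.1), and recurse on the reduced system.
    n = len(b)
    if n == 0:
        return []
    k = max(range(n), key=lambda i: abs(b[i]))
    s = 1.0 if b[k] >= 0 else -1.0
    d = 1.0 - s * A[k][k]
    idx = [i for i in range(n) if i != k]
    A_red = [[A[i][j] + s * A[i][k] * A[k][j] / d for j in idx] for i in idx]
    b_red = [b[i] + s * A[i][k] * b[k] / d for i in idx]
    z = dict(zip(idx, solve_ave_sge(A_red, b_red)))
    z[k] = (b[k] + sum(A[k][j] * abs(z[j]) for j in idx)) / d
    return [z[i] for i in range(n)]

# example with ||A||_inf = 0.375 < 1/2; b is built from the solution z = (1, -2)
A = [[0.25, 0.125], [0.125, 0.25]]
b = [0.5, -2.625]
z = solve_ave_sge(A, b)          # z is close to [1.0, -2.0]
```

A production implementation would update the system in place to realize the cubic cost bound; the recursion above favors closeness to formula (4.1) over efficiency.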
For counterexamples which demonstrate the sharpness of conditions (1)-(3) in Theorem 3.1 with respect to the SGE's correctness, see [Rad16a]. Concerning condition (4), let $A \equiv (1 + \varepsilon)I$, where $\varepsilon > 0$ is arbitrarily small. Then we have $\|A\|_\infty = 1 + \varepsilon$ and the SGE will pick the wrong first sign for arbitrary right-hand sides $b$.

Full step Newton method
In this section we analyze the full step Newton method (FN), which is defined by the recursion
$$z^{k+1} = (I - AS_k)^{-1} b\,, \qquad (5.1)$$
where $S_k \equiv S_{z^k}$. The iteration has the terminating criterion $z^{k+1} = z^k$. It was developed in [GBRS15] and is equivalent to the semi-iterative solver for the equilibrium problem (1.2) developed in [BC08]. A first, albeit rather restrictive, convergence result is [GBRS15, Prop. 7.2]:

Proposition 5.1. If $\|A\|_p < 1/3$ for any $p$-norm, then the iteration (5.1) converges for all $b$ in finitely many iterations from any $z^0$ to the unique solution of (2.1). Moreover, the $p$-norms of both $z^i - z$ as well as $(I - AS_{i+1})z^{i+1} - b$ are monotonically reduced.
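A pure-Python sketch of iteration (5.1), stopping once the signature of the iterate is reproduced (function names are ours; the textbook Gaussian solver merely stands in for any method of solving the linear systems):

```python
def solve_linear(M, rhs):
    # textbook Gaussian elimination with partial pivoting
    n = len(rhs)
    T = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(T[r][c]))
        T[c], T[p] = T[p], T[c]
        for r in range(c + 1, n):
            f = T[r][c] / T[c][c]
            for j in range(c, n + 1):
                T[r][j] -= f * T[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (T[r][n] - sum(T[r][j] * x[j] for j in range(r + 1, n))) / T[r][r]
    return x

def full_step_newton(A, b, z0, max_iter=50):
    # z^{k+1} = (I - A S_k)^{-1} b with S_k = S_{z^k}; if the new iterate
    # reproduces the signature, it is a fixed point and solves the AVE
    n = len(b)
    z = z0[:]
    for _ in range(max_iter):
        s = [1.0 if zi >= 0 else -1.0 for zi in z]
        M = [[(1.0 if i == j else 0.0) - A[i][j] * s[j] for j in range(n)]
             for i in range(n)]
        z = solve_linear(M, b)
        if [1.0 if zi >= 0 else -1.0 for zi in z] == s:
            return z
    return z

A = [[0.25, 0.125], [0.125, 0.25]]      # ||A||_inf = 0.375 < 1/2
b = [0.5, -2.625]
z = full_step_newton(A, b, [0.0, 0.0])  # converges to approx. [1.0, -2.0]
```

On this example data the signature stabilizes after two steps; under the conditions of Theorem 3.1, at most $n + 1$ iterations are needed.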
Moreover, in [GBRS15, Prop. 7] convergence was proved under the first two restrictions on $A$ in Theorem 3.1. The following result extends this to conditions (3) and (4) of Theorem 3.1.
Theorem 5.1. Let $A \in M_n(\mathbb R)$ and $z, b \in \mathbb R^n$ such that (1.1) is satisfied. If $A$ conforms to any of the conditions listed in Theorem 3.1, then for any initial vector $z^0 \in \mathbb R^n$ the full step Newton method (5.1) computes the unique solution of the AVE (1.1) correctly in at most $n + 1$ iterations.
Proof. Note that all conditions listed in Theorem 3.1 are invariant under scalings of $A$ by a signature matrix. Now let $z$ be an iterate of (5.1), i.e., $z - ASz = b$ for the signature $S$ of the preceding iterate, and set $\tilde S \equiv S_z$. Then, since $\tilde S\tilde S = I$, we have
$$b = z - AS\tilde S\,\tilde S z \equiv z - A'|z|\,,$$
where $A' \equiv AS\tilde S$ satisfies $\|A'\|_\infty = \|A\|_\infty$ and still conforms to the respective condition of Theorem 3.1. This implies that $\operatorname{Neq}(A', z, b)$ is empty, by Theorem 3.1. Hence, for all $k \geq 1$ and all $i \in I_b^{\max}$ we have $\operatorname{sign}(z_i^k) = \operatorname{sign}(b_i)$; that is, the signs with index in $I_b^{\max}$ are fixed correctly throughout all iterations. Now assume $i \in I_b^{\max}$. Since $\operatorname{sign}(z_i^k) = \operatorname{sign}(b_i)$ for all $k \geq 1$, we can rewrite the $i$-th equation in (1.1) and express $z_i^k$ as a linear combination of the other $z_j^k$ via the transformation of $A$ and $b$ to $\bar A$ and $\bar b$ as defined in (4.1). This corresponds to one step of Gaussian elimination. As mentioned in the proof of Theorem 4.1, all restrictions listed in Theorem 3.1 are invariant under the latter operation, which implies that the argument applies recursively and all signs of $z$ are fixed correctly in at most $n + 1$ iterations. Again, we remark that the conditions in Theorem 3.1 imply the uniqueness of the solution at which we arrive via the procedure described above.

Comparison of both solvers
Despite their proved ranges of correctness resp. convergence being the same, the two solvers are not equivalent, and neither solver has a strictly larger range than the other:

In [GBRS15, Sec. 7] it was shown that for a certain parametrized system with parameter value $a = 5/8$ the FN method cycles for all starting signatures that contain both positive and negative signs. It is straightforward to show that the SGE solves the corresponding AVE, but we refrain from performing this exercise.

Conversely, consider $A$ and $z$ as in [Rad16a, Prop. 5.2], where $\|A\|_\infty = \frac12 + \varepsilon$ with $\varepsilon > 0$ arbitrarily small. Then, for $b \equiv z - A|z|$, we have $b = \big({-\tfrac{2+\varepsilon^2}{4}}, \tfrac12\big)^\intercal$, and clearly $|b_1| > |b_2|$, but $\operatorname{sign}(b_1) \neq \operatorname{sign}(z_1)$. It was shown in [Rad16a, Prop. 5.2] that for $A$ and $b$ as described the SGE is led astray. On the other hand, an elementary calculation shows that for $n \leq 2$ the FN method converges whenever $\|A\|_\infty < 1$ [Rad16b].