1 Introduction

Denote by \({\text {M}}_n(\mathbb R)\) the space of \(n\times n\) real matrices and let \(A\in {\text {M}}_n(\mathbb R)\) and \(z,b\in \mathbb R^n\). The piecewise linear equation system

$$\begin{aligned} z - A\vert z\vert = b \end{aligned}$$
(1.1)

is called an absolute value equation (AVE) and was first introduced by Rohn in [16]. Mangasarian and Meyer proved its polynomial equivalence to the linear complementarity problem (LCP) [10]. In [11, pp. 216-230] Neumaier authored a detailed survey about the AVE's intimate connection to the research field of linear interval equations. An especially closely related system type is the equilibrium problem of the form

$$\begin{aligned} Bx+ \max (0, x)= c, \end{aligned}$$
(1.2)

where \(B\in {\text {M}}_n(\mathbb R)\) and \(x, c \in \mathbb R^n\). A prominent example is the first hydrodynamic model presented in [2]. Using the identity \(\max (s, t) = (s+t +\vert s-t \vert )/ 2\), equality (1.2) can be reformulated as

$$\begin{aligned} Bx + \frac{x+\vert x\vert }{2}=c\quad \Longleftrightarrow \quad (2B+I)x +\vert x \vert =2c, \end{aligned}$$
(1.3)

and for nonsingular \((2B+I)\) this is equivalent to an AVE (1.1).

This position at the crossroads of several interesting problem areas has given rise to the development of efficient solvers for the AVE. Publications on the matter include approaches via linear programming [9], concave minimization [21, 6], shift splitting [19], successive over-relaxation [22], as well as a variety of Newton and fixed point methods, cf. [2, 5, 20].

In this article we will present and further analyze two solvers for the AVE: the signed Gaussian elimination, which is a direct solver that was developed in [12], and a semi-iterative generalized Newton method which was developed for equivalent system types in [2, 8] and adapted to the AVE in [4, 18]. Previously, four correctness results for the signed Gaussian elimination have been proved, two of which were shown to hold for the semi-iterative generalized Newton method as well. We extend and unify these results. That is, after some preliminaries (Sect. 2), we slightly strengthen one of the correctness results for the signed Gaussian elimination, and then give a unifying proof that extends all correctness results for the signed Gaussian elimination to the generalized Newton method (Sects. 3-5). Further, we show that both algorithms are, nevertheless, not equivalent and provide some numerical results (Sect. 6).

2 Preliminaries

We denote by [n] the set \(\{1,\dots ,n\}\). For vectors and matrices, absolute values and comparisons are understood entrywise. A signature matrix S, or, briefly, a signature, is a diagonal matrix with entries \(+1\) or \(-1\), i.e., \(\vert S\vert =I\). The set of n-dimensional signature matrices is denoted by \(\mathcal S_n\). A single diagonal entry of a signature is a sign \(s_i\), where \(i\in [n]\). Let \(z\in \mathbb R^n\). We write \(S_z\) for the signature with \(s_i=1\) if \(z_i\ge 0\) and \(s_i=-1\) otherwise. We then have \(S_zz=\vert z\vert \). Using this convention, we can rewrite (1.1) as

$$\begin{aligned} (I-AS_z)z\ =\ b . \end{aligned}$$
(2.1)

In this form it becomes apparent that the main difficulty of the AVE is to determine the proper signature S for z. That is, to determine in which of the \(2^n\) orthants about the origin z lies. This is NP-hard in general [7].
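As a concrete illustration, the following minimal numpy sketch (with arbitrary illustrative data; all names are ours) verifies that (1.1) and (2.1) agree once \(S_z\) is known:

```python
import numpy as np

def signature(z):
    """Diagonal of S_z: +1 where z_i >= 0, and -1 otherwise."""
    return np.where(z >= 0, 1.0, -1.0)

A = np.array([[0.2, -0.3],
              [0.1,  0.4]])        # any A works for this check
z = np.array([1.0, -2.0])
b = z - A @ np.abs(z)              # right-hand side of (1.1)

s = signature(z)                   # S_z z = |z|
lhs = (np.eye(2) - A * s) @ z      # (I - A S_z) z; A * s scales column j by s_j
print(np.allclose(lhs, b))         # True: (2.1) is just (1.1) rewritten
```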

An important statement from [17] that we will frequently use is

Theorem 2.1

Let \(A\in {\text {M}}_n(\mathbb {R})\). Then the AVE (2.1) is uniquely solvable if \(\Vert A\Vert _p<1\), where \(\Vert \cdot \Vert _p\) is any p-norm.

3 The unifying theorem

Hereafter, \({\text {sign}}\) denotes the signum function. The following simple observation is key to the subsequent discussion:

Proposition 3.1

Let \(A\in {\text {M}}_n(\mathbb R)\) and \(z,b\in \mathbb R^n\) satisfy (2.1). If \(\Vert A\Vert _{\infty }<1\), then for at least one \(i\in [n]\) we have \({\text {sign}}(z_i)={\text {sign}}(b_i)\) .

Proof

Let \(z_i\) be an entry of z s.t. \(\vert z_i\vert \ge \vert z_j\vert \ \) for all \(j\in [n]\). If \(z_i=0\), then \(z=\mathbf{0}\) and thus \(b\equiv z-A\vert z\vert =\mathbf {0}\), where \(\mathbf {0}\) denotes the zero-vector in \(\mathbb {R}^n\), and the statement holds trivially. If \(\vert z_i\vert >0\), then \(\left| e_i^\intercal A\vert z\vert \right| < \vert z_i\vert \), due to the norm constraint on A. Thus, \(b_i = z_i - e_i^\intercal A\vert z\vert \) will adopt the sign of \(z_i\). \(\square \)

We do not know, though, for which indices the signs coincide. The theorem below states restrictions on A which guarantee the coincidence of the signs of \(z_i\) and \(b_i\) for all \(i\in [n]\) with \(\vert b_i\vert = \vert b\vert _\infty \) and thus provides the basis for the convergence proofs in Sects. 4 and 5. For \(b\in \mathbb R^n\) we set

$$\begin{aligned} \mathcal I^b_{\max }\ \equiv \left\{ \, i\in [n] \ : \ \vert b_i\vert = \vert b\vert _\infty \, \right\} , \end{aligned}$$

and define

$$\begin{aligned}{\text {Neq}}(A,b,z)\ \equiv \ \{ i\in \mathcal I^b_{\max }\ :\ {\text {sign}}(b_i) \ne {\text {sign}}(z_i) \}.\end{aligned}$$

Theorem 3.2

Let \(A\in {\text {M}}_n(\mathbb R)\) and \(b, z\in \mathbb R^n\) such that (2.1) is satisfied. Then we have

$$\begin{aligned}{\text {Neq}}(A,b,z)\ =\ \emptyset \end{aligned}$$

if either of the following conditions is satisfied.

1. \(\Vert A\Vert _{\infty }<\frac{1}{2}\).

2. A is irreducible, and \(\Vert A\Vert _{\infty }\le \frac{1}{2}\).

3. A is strictly diagonally dominant and \(\Vert A\Vert _\infty \le \frac{2}{3}\).

4. \(\vert A\vert \) is tridiagonal, symmetric, \(\Vert A\Vert _\infty <1\), and \(n\ge 2\).

The first three points are cited from [12, Thm. 3.1]. We will here prove the fourth point.

Proof (Theorem 3.2.4)

The proof is performed by induction. The \((2\times 2)\)-case can be verified by direct computation. Now assume the statement of the theorem holds for some \(N\ge 2\), but that a tuple (A, z, b) contradicts it in dimension \(N+1\). Since \(\Vert A\Vert _{\infty } <1\), we have \({\text {sign}}(z_i)={\text {sign}}(b_i)\) for all \(i\in [N+1]\) if \(\vert z_1\vert =\dots =\vert z_{N+1}\vert \) (the argument from the proof of Proposition 3.1 then applies to every row), in which case we would be done. Thus we may assume that not all entries of z have the same absolute value.

Let \(z\in \mathbb R^{N+1}\) be such that \(z-A\vert z\vert =b\), and let \(i\in [N+1]\) be such that \(\vert z_i\vert =\Vert z\Vert _\infty \). Then \(\sum _{j=1}^{N+1} \vert a_{ij}z_j\vert < \vert z_i\vert \), since \(\Vert A\Vert _\infty <1\), and we thus have \({\text {sign}}(b_i) = {\text {sign}}(z_i)\). Now let \(k \in {\text {Neq}}(A,b,z)\), i.e., \({\text {sign}}(z_k)\ne {\text {sign}}(b_k)\). Then we cannot have \(\vert z_k\vert =\Vert z\Vert _\infty \), so \(\vert z_k\vert <\Vert z\Vert _\infty \).

Let i be an index such that \(\vert z_i\vert =\Vert z\Vert _\infty \). By the arguments made above, a row other than the i-th must hold the contradiction. If this row has an index j with \(j\le i-2\) or \(j\ge i+2\), we are done: we can eliminate the i-th row and column from the system, and the resulting N-dimensional system still contains the row holding the contradiction (the matrix is tridiagonal, hence eliminating a row/column whose index differs from j by two or more does not affect row j).

Thus it remains to deal with the case \(j=i\pm 1\). We assume that \(i\in \{2,\dots ,N\}\); if not, simply omit the corresponding half of the operations outlined below. Since \(\vert z_i\vert =\Vert z\Vert _\infty \), there exist scalars \(\zeta _1,\zeta _2\in \left[ 0, 1 \right] \) such that \( \zeta _1 \cdot \vert z_i\vert = \vert z_{i-1}\vert \) and \( \zeta _2 \cdot \vert z_i\vert = \vert z_{i+1}\vert \). Multiplying both sides with \(a_{i,i-1}\), resp. \(a_{i,i+1}\), we get

$$\begin{aligned} \zeta _1\cdot a_{i,i-1} \cdot \vert z_i\vert = a_{i,i-1}\cdot \vert z_{i-1}\vert \quad \text {and}\quad \zeta _2\cdot a_{i,i+1} \cdot \vert z_i\vert = a_{i,i+1}\cdot \vert z_{i+1}\vert \,. \end{aligned}$$

Denote by \(A_{i,i}\) the matrix in \({\text {M}}_N(\mathbb R)\) that is derived from A by removing its i-th row and column. Further, let D be a real \((N\times N)\)-matrix which is zero with the exception of the two entries

$$\begin{aligned} d_{i-1,i-1}=\zeta _1\cdot a_{i,i-1}\quad \text {and}\quad d_{i,i}=\zeta _2\cdot a_{i,i+1} \end{aligned}$$

(note that the i-th row/column of \(A_{i,i}\) holds entries that were formerly contained in row/column \(i+1\) of A). Then

$$\begin{aligned}\bar{A}\ \equiv \ A_{i,i} + D \end{aligned}$$

is still tridiagonal with \(\Vert \bar{A}\Vert _{\infty }<1\) and \(\vert \bar{A}\vert \) symmetric. Further, for

$$\begin{aligned} \bar{z}\equiv (z_1,\dots , z_{i-1},z_{i+1},\dots ,z_{N+1})^\intercal \quad \text {and}\quad \bar{b}\equiv (b_1,\dots , b_{i-1},b_{i+1},\dots ,b_{N+1})^\intercal \end{aligned}$$

we have

$$\begin{aligned}\bar{z}-\bar{A}\vert \bar{z}\vert \ =\ \bar{b}.\end{aligned}$$

Hence, the tuple \((\bar{A},\bar{z}, \bar{b})\) contradicts the induction hypothesis for dimension N, as either the index \(i-1\) or i (formerly \(i+1\)) is contained in \({\text {Neq}}(\bar{A},\bar{b},\bar{z})\). \(\square \)

4 Signed Gaussian elimination

Let C be a nonsingular coefficient matrix of a linear system. The Gaussian elimination algorithm performs a sequence of rank-1-updates on C whose result is an upper triangular matrix. This shape allows one to read off the solution of the system via backwards substitution. For the algorithmic principle of the backwards substitution to apply, it suffices to have a coefficient matrix that can be transformed to an upper triangular matrix via symmetric row/column-permutations. We will use this fact to reformulate the signed Gaussian elimination introduced in [12] in such a way that it no longer requires excessive row/column-pivoting.

If one is sure of the sign \(s_k\) of \(z_k\) one can remove this variable from the left-hand side of the AVE. Let \(A_{*k}\) denote the k-th column \(Ae_k\) and \(A_{j*}\) the j-th row \(e_j^\intercal A\). Then the removal of the variable is reflected in the formula

$$\begin{aligned} (I-A_{*k}e_k^\intercal s_k)z = b + (A-A_{*k}e_k^\intercal )\vert z\vert . \end{aligned}$$

The inverse of a rank-1 modification of the identity is well known to be (see, e.g., [1]; note that this requires \(v^\intercal u\ne 1\))

$$\begin{aligned} (I-uv^\intercal )^{-1}=I+\frac{1}{1-v^\intercal u}uv^\intercal . \end{aligned}$$

Thus it is easy to remove the matrix factor on the left side. We then have

$$\begin{aligned} z&=\bar{b} + \bar{A}\vert z\vert , \end{aligned}$$
(4.1)

where

$$\begin{aligned} \bar{b} = b + \frac{1}{1-A_{kk}s_k}s_kA_{*k}\,b_k\quad \text {and}\quad \bar{A} = A_{red}+ \frac{1}{1-A_{kk}s_k}s_kA_{*k}\,(A_{red})_{k*}, \end{aligned}$$

with

$$\begin{aligned} A_{red}=A-A_{*k}e_k^\intercal =A(I-e_ke_k^\intercal ). \end{aligned}$$

Now let \({\text {El}}\) be the set of those indices for which the latter rank-1-update has already been performed. We can then formulate an elimination step for an arbitrary row/column as follows:

(Algorithm: sign-controlled elimination step; pseudocode figure not reproduced.)
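For concreteness, here is a minimal numpy transcription of the update formulas (4.1); the function name and calling conventions are ours, and this is a sketch rather than the reference implementation of [12]:

```python
import numpy as np

def elimination_step(A, b, k, s_k):
    """One sign-controlled elimination step, following (4.1).

    Assumes the sign of z_k is s_k and that 1 - A[k, k] * s_k != 0;
    returns the updated pair (A, b) of the fixed-point form
    z = b + A|z| in which column k of A has been zeroed out.
    """
    a_k = A[:, k].copy()                  # column A_{*k}
    A_red = A.copy()
    A_red[:, k] = 0.0                     # A_red = A - A_{*k} e_k^T
    f = s_k / (1.0 - A[k, k] * s_k)       # Sherman-Morrison scalar
    b_new = b + f * a_k * b[k]
    A_new = A_red + f * np.outer(a_k, A_red[k, :])
    return A_new, b_new
```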

Performing this elimination step n times (for n pairwise different indices, always updating \({\text {El}}\)) turns A into a matrix that is permutationally similar to an upper triangular matrix. From the transformed system, z can be computed in a straightforward fashion via (an adapted) backwards substitution, since all sign-choices have been eliminated.

Now let \(J\subseteq [n]\) be an index set and define

$$\begin{aligned}{\text {J\_b}}\ \equiv \{ i\in J\ : \ \vert b_i\vert \ge \vert b_j\vert \ \forall j\in J \}.\end{aligned}$$

Further, let \(\vert J\vert \) be the number of elements in J. Using this convention, we can give the pseudocode of a slight modification of the algorithm that was introduced as signed Gaussian elimination (SGE) in [12]:

(Algorithm: signed Gaussian elimination (SGE); pseudocode figure not reproduced.)
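The following sketch assembles the SGE from the elimination step above (our own transcription, assuming A satisfies one of the conditions of Theorem 3.2; ties within \({\text {J\_b}}\) are broken arbitrarily and no safeguards are included):

```python
import numpy as np

def signed_gaussian_elimination(A, b):
    """SGE sketch for z - A|z| = b."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    remaining = set(range(len(b)))
    while remaining:
        # pick an index from J_b: maximal |b_i| among the remaining indices
        k = max(remaining, key=lambda i: abs(b[i]))
        s_k = 1.0 if b[k] >= 0 else -1.0   # sign-pick justified by Thm 3.2
        a_k = A[:, k].copy()
        A[:, k] = 0.0
        f = s_k / (1.0 - a_k[k] * s_k)
        b = b + f * a_k * b[k]
        A = A + f * np.outer(a_k, A[k, :])
        remaining.remove(k)
    # After n steps every column of A is zero, i.e. z = b + A|z| = b:
    # the (adapted) backwards substitution is collapsed into the b-updates.
    return b
```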

Theorem 4.1

Let \(A\in {\text {M}}_n(\mathbb {R})\) and \(z, b\in \mathbb {R}^n\) such that (1.1) is satisfied. If A conforms to any of the conditions listed in Theorem 3.2, then the signed Gaussian elimination computes the unique solution of the AVE (1.1) correctly.

Proof

Theorem 3.2 ensures the correctness of the sign-picks. Further, the conditions listed in Theorem 3.2 are invariant under the (sign-controlled) elimination step: For conditions (1)-(3), we refer to [12] and [4]. Concerning condition (4): Assume without loss of generality that \(1\in \mathcal I^b_{\max }\). Let \(A'\) be the matrix into which A is transformed via the elimination step, and \(A'_{11}\) the matrix obtained from \(A'\) by eliminating its first row and column, and define \(A_{11}\) analogously. Since A is tridiagonal, \(A'_{11}\) differs from \(A_{11}\) only in a single entry, the upper left one, which we will denote \(a'_{11}\). We denote the corresponding entry of A by \(a_{22}\). Hence, the symmetry is preserved. Further, for any \(S\in \mathcal {S}_n\), the matrix \(I-AS\) is strictly diagonally dominant, since \(\Vert A\Vert _\infty <1\). So \(\vert a_{12}\vert =\alpha \cdot \vert 1-a_{11}s_{1}\vert \) for some \(\alpha \in [0,1)\). Since \(a_{12}=a_{21}\), the update formula (4.1) implies \(\vert a'_{11}\vert \le \vert a_{22}\vert +\alpha \cdot \vert a_{12}\vert \). Keeping in mind that \(a_{23}=a'_{12}\), we have

$$\begin{aligned} \vert a'_{11}\vert +\vert a'_{12}\vert&\le \vert a_{22}\vert + \alpha \cdot \vert a_{12}\vert + \vert a_{23}\vert = \vert a_{22}\vert + \alpha \cdot \vert a_{21}\vert + \vert a_{23}\vert \\&< \vert a_{22}\vert + \vert a_{21}\vert + \vert a_{23}\vert< 1\,. \end{aligned}$$

This proves the preservation of the norm constraint. Now apply the argument recursively down to the scalar level. This results in a nonsingular linear system with a coefficient matrix that is permutationally similar to an upper triangular matrix, which can be solved by backwards substitution. This solution is unique, since criteria (1)-(4) imply the unique solvability of the AVE, cf. Theorem 2.1. \(\square \)

For dense A the SGE has a cubic computational cost. For A with band structure it was shown in [12] that the computation has the asymptotic cost of sorting n floating point numbers. Moreover, note that the SGE is numerically stable, since \(I-AS\) is strictly diagonally dominant if \(\Vert A\Vert _\infty <1\).

For counterexamples which demonstrate the sharpness of the conditions (1)-(3) in Theorem 3.2 with respect to the SGE's correctness, see [12]. Concerning condition (4), let \(A\equiv I\). Then we have \(\Vert A\Vert _\infty = 1\) and the AVE does not necessarily have a solution, due to the fact that another quantity which measures the number of AVE solutions, the aligning radius of A, is not smaller than one, cf. [15]. E.g., if b is the vector of ones, there is no solution and any sign-pick is necessarily wrong.

5 Full step Newton method

In this section we analyze the full step Newton method (FN), which is defined by the recursion

$$\begin{aligned} z^{k+1}\ =\ (I-AS_k)^{-1}b, \end{aligned}$$
(5.1)

where \(S_k\equiv S_{z^k}\), and \(z^0\equiv b\). The iteration has the terminating criterion

$$\begin{aligned} z^k=z^{k+1}\; . \end{aligned}$$
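A minimal numpy sketch of the iteration (the iteration cap and the error handling are ours):

```python
import numpy as np

def full_step_newton(A, b, z0=None, max_iter=100):
    """Full step Newton (5.1) for z - A|z| = b; stops as soon as the
    signature S_k repeats, which is equivalent to z^k = z^{k+1}."""
    n = len(b)
    I = np.eye(n)
    z = b.copy() if z0 is None else np.asarray(z0, dtype=float)
    for _ in range(max_iter):
        s = np.where(z >= 0, 1.0, -1.0)         # diagonal of S_k
        z_next = np.linalg.solve(I - A * s, b)  # z^{k+1} = (I - A S_k)^{-1} b
        if np.all((z_next >= 0) == (z >= 0)):   # signature unchanged: done
            return z_next
        z = z_next
    raise RuntimeError("no convergence within max_iter iterations")
```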

To the best knowledge of the authors it was developed simultaneously for two different equivalent system types in [8] and [2], and later adapted to the present formulation of the AVE in [4]. A first, albeit rather restrictive, convergence result is [4, Prop. 7.2]:

Proposition 5.1

If \(\Vert A\Vert _p < 1/3\) for any p-norm, then the iteration (5.1) converges for all b and any initial vector \(z^0\) in finitely many iterations to the unique solution of (2.1).

Moreover, in [4, Prop. 7] convergence was proved under the first two restrictions on A in Theorem 3.2. The following extends this result to conditions (3) and (4) of Theorem 3.2.

Theorem 5.2

Let \(A\in {\text {M}}_n(\mathbb {R})\) and \(z,b\in \mathbb {R}^n\) such that (1.1) is satisfied. If A conforms to any of the conditions listed in Theorem 3.2, then for any initial vector \(z^0\in \mathbb {R}^n\) the full step Newton method (5.1) computes the unique solution of the AVE (1.1) correctly in at most \(n+1\) iterations.

Proof

Note that all conditions listed in Theorem 3.2 are invariant under scalings of A from the right by a signature matrix. Now observe that every iterate \(z\equiv z^{k+1}\) of (5.1) satisfies the equality

$$\begin{aligned} z-(AS_k)z\ =\ b. \end{aligned}$$

Set \(S\equiv S_z\). Then, since \(SS=I\), we have

$$\begin{aligned}b\ =\ z-(AS_k)SS z\ =\ z-(AS_kS) \vert z\vert \ =\ z-A'\vert z\vert , \end{aligned}$$

where \(A'\equiv AS_kS\) is again a scaling of A by a signature (namely \(S_kS\)) and thus still satisfies the respective condition of Theorem 3.2; e.g., for condition (3), \(A'\) is still strictly diagonally dominant with \(\Vert A'\Vert _\infty <1\). This implies that \({\text {Neq}}(A',b,z)\) is empty, by Theorem 3.2. Hence, the signs with index in \(\mathcal I^b_{\max }\) are fixed throughout all iterations.

Now assume without loss of generality that \(1\in \mathcal I^b_{\max }\). Then for all \(k\ge 1\) we will have \({\text {sign}}(z_1^k)={\text {sign}}(b_1)\). Let \(z^k\), where \(k\ge 1\), be an iterate and let \(\tilde{z}^k\) be the vector obtained from \(z^k\) by eliminating its first entry. Then \(\tilde{z}^k\) is the unique solution of the system \(\tilde{z}^k -\tilde{A}\vert \tilde{z}^k\vert =\tilde{b}\), where \(\tilde{A}\) and \(\tilde{b}\) are obtained from \(\bar{A}\) and \(\bar{b}\) as defined in (4.1) by eliminating their first row and column, and first entry, respectively. (To avoid confusion, note that this system never appears anywhere as an intermediate step of the algorithm. We merely use the fact that \(\tilde{z}^k\) is its solution.) The latter system equals a subsystem obtained by one step of Gaussian elimination. As mentioned in the proof of Theorem 4.1, all restrictions listed in Theorem 3.2.1-4 are invariant under the latter operation. That means that for all \(\tilde{z}^K\) with \(K\ge 2\), all signs \(s_j\) with \(j\in \mathcal {I}^{\tilde{b}}_{\max }\) stay fixed to the sign of \(\tilde{b}_j\); but then the signs with index \(j+1\) in \(z^K\) stay fixed as well. Applying this argument recursively implies that all signs of z are fixed correctly after at most \(n+1\) iterations. Again, we remark that the conditions in Theorem 3.2.1-4 imply the uniqueness of the solution at which the procedure described above arrives. \(\square \)

6 Comparison of both solvers

In this section we will make theoretical and numerical comparisons of both solvers. We will show that they are not equivalent, despite similar correctness, resp. convergence, results. Further, we will test them on random data, as well as two systems from the literature.

6.1 The algorithms are not equivalent

Both solvers are not equivalent, and neither solver has a strictly larger range of applicability than the other: Let

$$\begin{aligned}A\equiv \begin{bmatrix} \frac{\varepsilon }{2} & \frac{1+\varepsilon }{2} \\ 0 & \frac{1}{2} \end{bmatrix} \qquad \text {and} \qquad z\equiv \begin{bmatrix} \frac{\varepsilon }{2} \\ 1 \end{bmatrix}, \end{aligned}$$

where \(\varepsilon >0\) is arbitrarily small. Then, for \(b\equiv z-A\vert z\vert \) we have \(b=(-\frac{2+\varepsilon ^2}{4}, \frac{1}{2})^\intercal \). Clearly \(\vert b_1\vert >\vert b_2\vert \), but \({\text {sign}}(b_1)\ne {\text {sign}}(z_1)\). It was shown in [12, Prop. 5.2] that for A and b as described, the SGE is led astray, although \(\Vert A\Vert _\infty = \frac{1}{2} + \varepsilon \) (see the sketch after the second example below). An elementary calculation shows that for \(n\le 2\) the FN method converges whenever \(\Vert A\Vert _\infty <1\) [13]. In [4, Sec. 7] it was shown that for a system with

$$\begin{aligned}A\equiv \begin{bmatrix} 0 & 0 & a \\ a & 0 & 0 \\ 0 & a & 0 \end{bmatrix} \qquad \text {and} \qquad b\equiv \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \end{aligned}$$

where \(a=5/8\), the FN method cycles for all starting signatures that contain both positive and negative signs. It is straightforward to show that the SGE solves the corresponding AVE.
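Both counterexamples are easy to reproduce numerically; here is a small sketch (\(\varepsilon \) and the mixed starting signature are chosen arbitrarily):

```python
import numpy as np

# First example: the largest-|b_i| sign-pick fails although ||A||_inf = 1/2 + eps.
eps = 1e-3
A1 = np.array([[eps / 2, (1 + eps) / 2],
               [0.0,     0.5]])
z1 = np.array([eps / 2, 1.0])
b1 = z1 - A1 @ np.abs(z1)
print(b1, np.sign(b1[0]) != np.sign(z1[0]))    # approx (-0.5, 0.5), True

# Second example: FN cycles from any mixed-sign starting signature.
a = 5 / 8
A2 = np.array([[0, 0, a], [a, 0, 0], [0, a, 0]])
b2 = np.ones(3)
z = np.array([1.0, 1.0, -1.0])
for k in range(6):
    s = np.where(z >= 0, 1.0, -1.0)
    z = np.linalg.solve(np.eye(3) - A2 * s, b2)
    print(k, np.where(z >= 0, 1, -1))           # the signatures repeat with period 3
```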

6.2 Random systems

500 tuples (A, b) of dimension 2,000 were generated at random as follows. The entries of A and b were picked uniformly at random in [0, 1]. Then signs were randomly flipped. Finally, A was scaled by \(1/(\Vert A\Vert _\infty +1/n)\) to achieve unique solvability of the system.
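A sketch of this recipe (the seed and the helper name are ours):

```python
import numpy as np

def random_ave(n, rng):
    """One random test tuple (A, b) as described above."""
    A = rng.uniform(0.0, 1.0, (n, n)) * rng.choice([-1.0, 1.0], (n, n))
    b = rng.uniform(0.0, 1.0, n) * rng.choice([-1.0, 1.0], n)
    A /= np.linalg.norm(A, np.inf) + 1.0 / n   # now ||A||_inf < 1: unique solvability
    return A, b

rng = np.random.default_rng(0)
A, b = random_ave(2000, rng)
```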

1. Signed Gaussian elimination: All systems were solved. We remark that this is no surprise, since for large random systems constructed as above, the sign patterns of b and z coincide with high probability, cf. [12].

2. Full step Newton: All systems were solved. The average number of iterations, when started with \(S_b\) as the initial signature, was approx. 3. Moreover, the number of updated signs never exceeded 20 and was approximately 10 on average.

Remark 6.1

The results imply a practical advantage of the full step Newton method over the signed Gaussian elimination, since the former can be assembled from existing linear solvers, while the signed Gaussian elimination has to be implemented from scratch. For example, our test-implementation of FN utilizing the numpy library is roughly \(1,000\times \) faster than our proof-of-concept SGE implementation using the code presented above.

Theoretically, an efficient implementation of the SGE should outperform the FN algorithm on the given examples by a factor equal to the number of FN iterations. But to create an implementation from scratch whose efficiency rivals that of modern linear algebra libraries would be a research project in and of itself.

6.3 The system of Brugnano and Casulli

The text [2] is a standard reference. In it, a hydrodynamic system of the form (1.2) is solved, which we repeat here:

$$\begin{aligned} Bz+ \max (0, z)= c, \end{aligned}$$
(6.1)

where \(B\equiv {\text {tridiag}}(-1,2,-1)\) and

$$\begin{aligned} z_i \equiv \exp \left( 6\frac{i-1}{n-1}-5 \right) -1 \qquad \text {for all}\quad i\in [n]\,. \end{aligned}$$

Reformulation into an AVE via (1.3) yields \(A\equiv -(2B+I)^{-1}\).
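A sketch of the setup (dense linear algebra for brevity; at this size a banded solver would be preferable):

```python
import numpy as np

n = 1000
i = np.arange(1, n + 1)
z_true = np.exp(6 * (i - 1) / (n - 1) - 5) - 1

B = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # tridiag(-1, 2, -1)
c = B @ z_true + np.maximum(0, z_true)                # right-hand side of (6.1)

M = 2 * B + np.eye(n)
A = -np.linalg.inv(M)           # AVE matrix per (1.3)
b = 2 * np.linalg.solve(M, c)   # AVE right-hand side

print(np.allclose(z_true - A @ np.abs(z_true), b))    # True: reformulation checks out
```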

We tested both algorithms with system sizes of \(n=1000\) and \(n=10000\). It was already noted that the generalized Newton method is equivalent to the algorithms presented in [2]. Hence the convergence proof developed in the latter reference also applies to our setting. The algorithm terminated after the third iteration for both system sizes. It is perhaps interesting that in both dimensions the majority of sign updates were performed during the first iteration of the algorithm (828, resp. 8330). Only 3, resp. 4, more signs were corrected in the step after that. The signed Gaussian elimination solved both systems correctly.

6.4 The system of Wu and Li

In the aforecited reference [19], the following tridiagonal system is proposed:

$$\begin{aligned} B\equiv {\text {tridiag}}(-1,4,-1)\in {\text {M}}_n(\mathbb {R}),\quad z\equiv (-1,1,-1,\dots ,1,-1,1)^\intercal , \end{aligned}$$

where \(c\equiv Bz-\vert z\vert \), so that z is the exact solution. To convert this into an AVE in the sense of this paper, one needs to invert B. This can be accomplished in linear time to working precision, e.g., via the algorithm in [14]. One readily checks, e.g., via the explicit inversion formulas in [3], that \(\Vert B^{-1}\Vert _\infty <\frac{1}{2}\). Hence, both algorithms solve the problem correctly, cf. Thms. 4.1 and 5.2. Moreover, the generalized Newton method terminates after one iteration for any dimension. Proof: Let \(b\equiv B^{-1}c=z-B^{-1}\vert z\vert \). Since \(\Vert B^{-1}\Vert _\infty<\frac{1}{2}<1\) and \(\vert z_1\vert =\vert z_2\vert =\dots =\vert z_n\vert \), we have \({\text {sign}}(b_i)={\text {sign}}(z_i)\) for all \(i\in [n]\). And the generalized Newton method starts with \(S_b\) as its initial signature-pick, so its first linear solve already uses the correct signature \(S_z\). \(\square \)
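A numerical check of this claim (dense inversion for brevity; n is chosen arbitrarily and even, so that z ends with the entries \(-1,1\)):

```python
import numpy as np

n = 1000
B = 4 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 4, -1)
z_true = np.where(np.arange(n) % 2 == 0, -1.0, 1.0)    # (-1, 1, ..., -1, 1)

A = np.linalg.inv(B)
b = z_true - A @ np.abs(z_true)                        # b = B^{-1} c

print(np.linalg.norm(A, np.inf) < 0.5)                 # True: ||B^{-1}||_inf < 1/2

# One FN step from S_b recovers z, since sign(b) == sign(z).
s = np.where(b >= 0, 1.0, -1.0)
z1 = np.linalg.solve(np.eye(n) - A * s, b)
print(np.allclose(z1, z_true))                         # True
```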