1 Introduction

We consider the absolute value equation problem of finding an \(x\in \mathbb {R}^n\) such that

$$\begin{aligned} Ax-b=|x|, \end{aligned}$$
(AVE)

where \(A\in \mathbb {R}^{n\times n}\), \(b\in \mathbb {R}^{n}\) and \(|\cdot |\) denotes the componentwise absolute value. A slightly more generalized form of AVE was introduced by Rohn [43] (see also [41]).

Many methods, including Newton-like methods [12, 30, 53] or concave optimization methods [32, 33, 52], have been developed for solving AVE. An important point concerning numerical methods is the precision of the computed solution. To the best of the authors' knowledge, only a few papers are devoted to this subject for AVE; see, for instance, [1, 49, 50]. Wang et al. [49, 50] use interval methods for numerical validation. Hladík [20] derives various bounds for the solution set of AVE.

Error bounds play a crucial role in theoretical and numerical analysis of linear algebraic and optimization problems [11, 13, 14, 18, 38]. In this paper, we study error bounds for AVE under the assumption that uniqueness of the solution of AVE is guaranteed. Then, we compute upper bounds for \(\Vert x- x^\star \Vert \), the distance to the solution \(x^\star \) of AVE, in terms of a computable residual function.

1.1 Organization and contribution of the paper

The paper is organized as follows. Section 1.2 presents basic definitions and preliminaries needed to state the results. In Sect. 2, we propose error bounds for the absolute value equations. They naturally give rise to a corresponding condition number of AVE. We further investigate properties of the condition number for various norms, including the computational complexity issues. Since the calculation of the condition number can be computationally hard, we present several bounds, and we also inspect special classes of matrices, for which it can be computed efficiently.

It is well known that a linear complementarity problem can be formulated as an absolute value equation [31]. Indeed, it is one of the main applications of absolute value equations. In Sect. 3, we study error bounds for absolute value equations obtained by the reformulation of linear complementarity problems. In addition, thanks to the given results, we provide a new error bound condition for linear complementarity problems.

Section 4 is devoted to a relative condition number of AVE. The motivation stems from the relative error bounds that we propose there.

Error bounds are important in convergence analysis of iterative methods. That is why in Sect. 5, we apply the presented error bounds for AVE to convergence analysis of two prominent methods; we prove their convergence and also address the rate of convergence.

Lastly, Sect. 6 discusses the case when AVE does not have a unique solution. We show a so-called local error bounds property under certain assumptions.

1.2 Basic definitions and preliminaries

The n-dimensional Euclidean space is denoted by \(\mathbb {R}^n\). We use e and I to denote the vector of ones and the identity matrix, respectively. We denote an arbitrary scaling p-norm on \(\mathbb {R}^n\) by \(\Vert \cdot \Vert \), that is, \(\Vert x\Vert =\Vert Dx\Vert _p\) for a positive diagonal matrix D and a p-norm. In particular, \(\Vert \cdot \Vert _1\), \(\Vert \cdot \Vert _2\) and \(\Vert \cdot \Vert _\infty \) stand for 1-norm, 2-norm and \(\infty \)-norm, respectively. We use \(\hbox {sgn}(x)\) to denote the componentwise sign of x.

Let A and B be \(n\times n\) matrices. We denote the smallest singular value and the spectral radius of A by \(\sigma _{\min }(A)\) and \(\rho (A)\), respectively. The eigenvalues of a symmetric matrix \(A\in {\mathbb {R}}^{n\times n}\) are denoted and sorted as follows: \(\lambda _{\max }(A)=\lambda _1(A)\ge \dots \ge \lambda _n(A)=\lambda _{\min }(A)\). For a given norm \(\Vert \cdot \Vert \) on \(\mathbb {R}^n\), \(\Vert A\Vert \) denotes the induced matrix norm by \(\Vert \cdot \Vert \), i.e.,

$$\begin{aligned} \Vert A\Vert =\max \{\Vert Ax\Vert : \Vert x\Vert =1\}. \end{aligned}$$

Throughout the paper, we consider only induced matrix norms. The matrix inequality \(A\ge B\), |A| and \(\max (A, B)\) are understood entrywise. For \(d\in \mathbb {R}^n\), \(\hbox {diag}(d)\) stands for the diagonal matrix whose entries on the diagonal are the components of d. In contrast, \(\hbox {Diag}(A)\) denotes the vector of diagonal elements of A. The ith row and jth column of A are denoted by \(A_{i*}\) and \(A_{*j}\), respectively. We denote the comparison matrix of A by \(\langle A\rangle \), which is defined as

$$\begin{aligned}&\langle A\rangle _{ii}=|A_{ii}|,&i=1, \dots ,n,\\&\langle A\rangle _{ij}=-|A_{ij}|,&i,j=1, \dots ,n, i\ne j. \end{aligned}$$

We recall the following definitions for an \(n\times n\) real matrix A:

  • A is a P-matrix if each principal minor of A is positive.

  • A is an M-matrix if \(A^{-1}\ge 0\) and \(A_{ij}\le 0\) for \(i, j = 1,2,\dots ,n\) with \(i\ne j\).

  • A is an H-matrix if its comparison matrix is an M-matrix.
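These classes can be tested numerically on small examples. The following Python/NumPy sketch (the helper names are ours) builds the comparison matrix and checks the M-matrix property via the inverse-nonnegativity characterization above, and the H-matrix property via the comparison matrix; it is meant as an illustration, not a robust implementation.

import numpy as np

def comparison_matrix(A):
    # <A>_ii = |A_ii| and <A>_ij = -|A_ij| for i != j
    C = -np.abs(A)
    np.fill_diagonal(C, np.abs(np.diag(A)))
    return C

def is_M_matrix(A, tol=1e-12):
    # nonsingular M-matrix: nonpositive off-diagonal entries and A^{-1} >= 0
    off_diag = A - np.diag(np.diag(A))
    if np.any(off_diag > tol):
        return False
    try:
        return bool(np.all(np.linalg.inv(A) >= -tol))
    except np.linalg.LinAlgError:
        return False

def is_H_matrix(A, tol=1e-12):
    # H-matrix: the comparison matrix <A> is an M-matrix
    return is_M_matrix(comparison_matrix(A), tol)

A = np.array([[3.0, -1.0], [-1.0, 2.0]])
print(comparison_matrix(A))
print(is_M_matrix(A), is_H_matrix(A))   # True True for this A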

We will exploit some results from interval linear algebra, so we recall basic notions from this discipline. For two \(n\times n\) matrices \(\underline{{A}}\) and \(\overline{{A}}\) with \(\underline{{A}}\le \overline{{A}}\), the interval matrix \(\varvec{A} = [\underline{{A}}, \overline{{A}}]\) is defined as \(\varvec{A}=\{A: \underline{{A}}\le A \le \overline{{A}}\}\). An interval matrix \(\varvec{A}\) is called regular if each \(A\in \varvec{A}\) is nonsingular; the H-matrix property of interval matrices is defined analogously. Furthermore, we denote and define the inverse of a regular interval matrix \(\varvec{A}\) as \(\varvec{A}^{-1}:=\{A^{-1}: A\in \varvec{A}\}\). Note that the inverse of an interval matrix is not necessarily an interval matrix.

In this paper, generalized Jacobian matrices [9] are used in the presence of nonsmooth functions. Let \(f:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^m\) be a locally Lipschitz function. The generalized Jacobian of f at \({\hat{x}}\), denoted by \(\partial f({\hat{x}})\), is defined as

$$\begin{aligned} \partial f({\hat{x}}):=\hbox {co}\{\textstyle \lim _{k\rightarrow \infty } \nabla f(x_k): x_k\rightarrow {\hat{x}}, x_k\notin X_f\}, \end{aligned}$$

where \( X_f\) is the set of points at which f is not differentiable and \(\hbox {co}(S)\) denotes the convex hull of a set S.

In what follows, we remind some known theorems that we will need later on.

Theorem 1

(Wu and Li [51, Theorem 3.3]) AVE has a unique solution for each \(b\in \mathbb {R}^n\) if and only if the interval matrix \([A-I, A+I]\) is regular.

Theorem 2

(Rohn et al. [45, Theorem 4]) AVE has a unique solution for each \(b\in \mathbb {R}^n\) if \(\rho (|A^{-1}|)<1\).
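The sufficient condition of Theorem 2 is easy to test numerically; a minimal sketch (the test matrix is ours, chosen for illustration):

import numpy as np

def unique_solution_sufficient(A):
    # Theorem 2: rho(|A^{-1}|) < 1 guarantees a unique solution of AVE for every b
    Ainv = np.linalg.inv(A)
    return np.max(np.abs(np.linalg.eigvals(np.abs(Ainv)))) < 1.0

A = np.array([[4.0, 1.0], [-1.0, 3.0]])
print(unique_solution_sufficient(A))    # True for this A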

Theorem 3

(Kuttler [25]) An interval matrix \(\varvec{A}\) is inverse nonnegative (i.e., \(A^{-1}\ge 0\) for every \(A\in \varvec{A}\)) if and only if \(\underline{{A}}^{-1}\ge 0\) and \(\overline{{A}}^{-1}\ge 0\). In this case, \( \varvec{A}^{-1}\subseteq [\overline{{A}}^{-1},\underline{{A}}^{-1}]. \)

The first item of the following theorem can be found, e.g., in Berman and Plemmons [3], Theorem 2.3 in Chapter 6. The second item is Proposition 3.6.3(iii) from Neumaier [36] with \(B:=I\).

Theorem 4

If \(A\in {\mathbb {R}}^{n\times n}\) is an M-matrix, then the following properties hold:

  1. (i)

    \(A+I\) is an M-matrix and \(\rho ((A+I)^{-1}(A-I))<1\);

  2. (ii)

    \(A-I\) is an M-matrix if and only if \(\rho (A^{-1})<1\).

The following result is a special case of Theorem 3.7.5 from Neumaier [36].

Theorem 5

Let \(A\in {\mathbb {R}}^{n\times n}\) be an H-matrix. Then

  1. (i)

    \(|A^{-1}| \le \langle A\rangle ^{-1}\);

  2. (ii)

    \([A-I,A+I]\) is an H-matrix if and only if \(\rho (\langle A\rangle ^{-1})<1\).

The Sherman-Morrison formula for the inverse of a rank-one update can be found, e.g., in [22].

Theorem 6

(Sherman-Morrison formula) Let \(A\in {\mathbb {R}}^{n\times n}\) be nonsingular and \(u,v\in {\mathbb {R}}^n\). If \(v^TA^{-1}u\not =-1\), then \((A+uv^T)^{-1}=A^{-1}-\frac{1}{1+v^TA^{-1}u}A^{-1}uv^TA^{-1}\).
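A quick numerical sanity check of the Sherman-Morrison formula (the random data and the seed are arbitrary, used only for illustration):

import numpy as np

rng = np.random.default_rng(0)
A = 4 * np.eye(4) + rng.standard_normal((4, 4))
u, v = rng.standard_normal(4), rng.standard_normal(4)
Ainv = np.linalg.inv(A)
denom = 1 + v @ Ainv @ u                     # hypothesis of Theorem 6: denom != 0
lhs = np.linalg.inv(A + np.outer(u, v))
rhs = Ainv - (Ainv @ np.outer(u, v) @ Ainv) / denom
print(abs(denom) > 1e-12, np.allclose(lhs, rhs))   # True True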

2 Error bounds for the absolute value equations

Consider an absolute value equation AVE. We will need the assumption of regularity of \([A-I, A+I]\) throughout this section; by Theorem 1, AVE has then a unique solution and we denote it by \(x^\star \). Even though AVE can possess multiple solutions in practice (we discuss the case of multiple solutions in Sect. 6), there are also important classes of problems leading to a unique solution. Consider, for instance, strictly convex quadratic programs or Rohn’s characterization of extreme points of the solution set of interval equations [41].

Theorem 7

If the interval matrix \([A-I, A+I]\) is regular, then

$$\begin{aligned} \Vert x- x^\star \Vert \le \max _{\Vert d\Vert _{\infty }\le 1} \Vert (A-\hbox {diag}(d))^{-1}\Vert \cdot \Vert Ax-|x|-b\Vert , \ \ \forall x\in \mathbb {R}^n. \end{aligned}$$
(1)

Proof

Note that due to regularity of \([A-I, A+I]\) the right side of the above inequality is finite. Define the residual function \(\phi :{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) by \(\phi (x)=Ax-|x|-b\). By the mean value theorem, see Theorem 8 in [19],

$$\begin{aligned} \phi (x)=\phi (x)-\phi (x^\star )=(\textstyle \sum _{i=1}^n \lambda _i {\mathfrak {A}}_i)(x- x^\star ), \end{aligned}$$

where \({\mathfrak {A}}_i\in \partial \phi (x_i)\), \(x_i\in co(\{x, x^\star \})\), \(\lambda _i\ge 0\), \(i=1, \dots , n\) and \(\sum _{i=1}^n \lambda _i=1\). It is easily seen that \(\partial \phi (y)\subseteq \{A+\hbox {diag}(d): \Vert d\Vert _{\infty }\le 1\}=[A-I, A+I]\) for \(y\in \mathbb {R}^n\). Due to the convexity of \(\{A+\hbox {diag}(d): \Vert d\Vert _{\infty }\le 1\}\), we have

$$\begin{aligned} \phi (x)={\hat{A}}(x- x^\star ), \end{aligned}$$

for some \({\hat{A}}\in [A-I, A+I]\). By multiplying \({\hat{A}}^{-1}\) on both sides and using the induced norm property, we obtain

$$\begin{aligned} \Vert x- x^\star \Vert =\Vert {\hat{A}}^{-1}\phi (x)\Vert \le \Vert {\hat{A}}^{-1}\Vert \cdot \Vert \phi (x)\Vert \le \max _{\Vert d\Vert _{\infty }\le 1} \Vert (A-\hbox {diag}(d))^{-1}\Vert \cdot \Vert \phi (x)\Vert , \end{aligned}$$

which completes the proof. \(\square \)

To take advantage of this formulation, we need to compute the optimal value of the following optimization problem,

$$\begin{aligned} c(A):=\max \ \Vert (A-\hbox {diag}(d))^{-1}\Vert \ \ \hbox {s.t.}\ \ \Vert d\Vert _{\infty }\le 1. \end{aligned}$$
(2)

We call the optimal value of (2) the condition number of the absolute value equation AVE with respect to the norm \(\Vert \cdot \Vert \). In addition, we denote the condition number with respect to the 1-norm, 2-norm and \(\infty \)-norm by \(c_1(A)\), \(c_2(A)\) and \(c_\infty (A)\), respectively. By the properties of matrix norms, we have the following results.

Proposition 1

Let \([A-I, A+I]\) be regular and \(\alpha \) be a scalar with \(|\alpha |\ge 1\). Then c(A) and \(c(\alpha A)\) exist, and

  1. (i)

    \(c(-A)=c(A)\);

  2. (ii)

    \(c_1(A^T)=c_\infty (A)\);

  3. (iii)

    \(c(\alpha A)\le |\alpha ^{-1}| c(A)\).

Proof

Parts (i) and (ii) are straightforward. Part (iii) follows from the fact that

$$\begin{aligned} \max _{\Vert d\Vert _{\infty }\le 1} \ \Vert (\alpha A-\hbox {diag}(d))^{-1}\Vert&= \max _{\Vert d\Vert _{\infty }\le |\alpha ^{-1}|} \ \Vert (\alpha A-\alpha \hbox {diag}(d))^{-1}\Vert \\&\le |\alpha ^{-1}|\max _{\Vert d\Vert _{\infty }\le 1} \ \Vert ( A-\hbox {diag}(d))^{-1}\Vert . \end{aligned}$$

\(\square \)

In the next proposition, we show that optimization problem (2) attains its maximum at some vertices of the box \(\{d:\Vert d\Vert _\infty \le 1\}=[-e,e]\).

Proposition 2

Let the interval matrix \([A-I, A+I]\) be regular. Then, there exists a vertex of the box \([-e,e]\) which is a solution of (2).

Proof

It is enough to show that function \(d\mapsto \Vert ( A-\hbox {diag}(d))^{-1}\Vert \) is convex in each coordinate \(d_i\), \(i=1,\dots ,n\). Then the maximum must be attained in a vertex of \([-e,e]\).

Without loss of generality, we show convexity in \(d_1\). Fix the last \(n-1\) components of d and collect them in \({\check{d}}\in \mathbb {R}^{n-1}\), and let \(f:[-1, 1]\rightarrow \mathbb {R}\) be given by \(f(t)=\Vert (A-\hbox {diag}((t, \check{d})))^{-1}\Vert \). The matrix \(A-\hbox {diag}((t, \check{d}))\) is a rank-one update of \({\hat{A}}:=A-\hbox {diag}((0, {\check{d}}))\), changing only the entry in position (1, 1) by \(-t\). By the Sherman-Morrison formula (Theorem 6), \(f(t)=\Vert {\hat{A}}^{-1}+\frac{t}{1-t{\hat{A}}^{-1}_{11}}E\Vert \), where \(E={\hat{A}}^{-1}_{*1}{\hat{A}}^{-1}_{1*}\). Due to regularity of \([A-I, A+I]\), the function \(g(t)=\frac{t}{1-t{\hat{A}}^{-1}_{11}}\) is well defined for \(t\in [-1, 1]\). Since \(\Vert {\hat{A}}^{-1}+\tau E\Vert \) is a convex function of \(\tau \) and g is strictly monotone on \([-1, 1]\), f is convex on its domain [4]. \(\square \)

Remark 1

Note that the function \(d\mapsto \Vert ( A-\hbox {diag}(d))^{-1}\Vert \) is not necessarily convex or concave; see Example 1. By Proposition 2, to handle problem (2), one only needs to check all vertices of \([-e,e]\). As the number of vertices is \(2^n\), this method may not be effective for large n. Indeed, problem (2) is NP-hard in general. It is known that for any rational \(p\in [1, \infty )\), except for \(p=1, 2\), computation of the matrix p-norm of a given matrix is NP-hard [17]. Consequently, problem (2) is NP-hard for any rational \(p\in [1, \infty )\) except \(p=1, 2\). Below, we prove intractability for the 1-norm; by Proposition 1(ii), it is then NP-hard for the \(\infty \)-norm, too. We conjecture that it is also NP-hard for the 2-norm.
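For small n, Proposition 2 suggests a brute-force evaluation of (2) over the \(2^n\) sign vertices. The following sketch (the helper names are ours; the test matrix is chosen so that \([A-I, A+I]\) is regular) computes \(c_\infty (A)\) in this way, solves AVE by trying all sign patterns, and verifies the bound of Theorem 7 at a perturbed point.

import numpy as np
from itertools import product

def cond_ave(A, p=np.inf):
    # condition number (2), evaluated over the vertices of [-e, e] (Proposition 2)
    n = A.shape[0]
    return max(np.linalg.norm(np.linalg.inv(A - np.diag(d)), p)
               for d in product([-1.0, 1.0], repeat=n))

def solve_ave(A, b):
    # brute force over sign patterns: x solves AVE iff (A - diag(s))x = b with s = sgn(x)
    n = A.shape[0]
    for s in product([-1.0, 1.0], repeat=n):
        x = np.linalg.solve(A - np.diag(s), b)
        if np.all(np.sign(x) * np.array(s) >= 0):
            return x
    return None

rng = np.random.default_rng(1)
A = 3 * np.eye(3) + 0.3 * rng.standard_normal((3, 3))
b = rng.standard_normal(3)
x_star = solve_ave(A, b)
x = x_star + 0.1 * rng.standard_normal(3)            # an arbitrary trial point
residual = np.linalg.norm(A @ x - np.abs(x) - b, np.inf)
print(np.linalg.norm(x - x_star, np.inf) <= cond_ave(A) * residual)   # True by Theorem 7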

Proposition 3

Computation of \(c_1(A)\) is an NP-hard problem.

Proof

By [42], solving the problem

$$\begin{aligned} \max \ e^T|x| \ \ \ \hbox {s.t.}\ \ \ |Ax|\le e \end{aligned}$$
(3)

is NP-hard. Even more, it is intractable even with accuracy less than \(\frac{1}{2}\) when \(A^{-1}\) is a so-called MC-matrix [42]. Recall that \(M\in {\mathbb {R}}^{n\times n}\) is an MC-matrix if it is symmetric, \(M_{ii}=n\) and \(M_{ij}\in \{0,-1\}\), \(i\not =j\). For an MC-matrix M we have \(\lambda _{\max }(M)\le 2n-1\), from which \(\lambda _{\min }(M^{-1})\ge \frac{1}{2n-1}\). Therefore \(\lambda _{\min }(A)\ge \frac{1}{2n-1}\), and we can achieve \(\lambda _{\min }(A)>1\) by a suitable scaling. As a consequence, \([A-I,A+I]\) is regular.

Feasible solutions to the above optimization problem can be equivalently characterized as

$$\begin{aligned} Ax=b,\ \ b\in [-e,e], \end{aligned}$$

or, substituting \(b=\hbox {diag}(b)e=\hbox {diag}(b)y\) with \(y=e\),

$$\begin{aligned} \begin{pmatrix} A &{} -\hbox {diag}(b) \\ 0 &{} I \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} =\begin{pmatrix} 0 \\ e \end{pmatrix},\ \ b\in [-e,e]. \end{aligned}$$

Introducing an auxiliary variable \(z=1\), we get

$$\begin{aligned} \begin{pmatrix} A &{} -\hbox {diag}(b) &{} 0 \\ 0 &{} I &{} -e \\ 0 &{} 0 &{} 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} =\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},\ \ b\in [-e,e]. \end{aligned}$$

Rewrite the system as

$$\begin{aligned} \begin{pmatrix} D &{} A &{} 0 \\ I &{} 0 &{} -e \\ 0 &{} 0 &{} 1 \end{pmatrix} \begin{pmatrix} y \\ x \\ z \end{pmatrix} =\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},\ \ |D|\le I. \end{aligned}$$

Let \(\alpha >0\) be sufficiently large. The system equivalently reads

$$\begin{aligned} \begin{pmatrix} D &{} A &{} 0 \\ \alpha I &{} 0 &{} -e\alpha \\ 0 &{} 0 &{} 2 \end{pmatrix} \begin{pmatrix} \frac{1}{\alpha }y \\[2pt] \frac{1}{\alpha }x \\[2pt] \frac{1}{\alpha }z \end{pmatrix} =\begin{pmatrix} 0 \\ 0 \\ \frac{2}{\alpha } \end{pmatrix},\ \ |D|\le I. \end{aligned}$$

Now, we relax the system by introducing intervals on the remaining diagonal entries

$$\begin{aligned} \begin{pmatrix} D &{} A &{} 0 \\ \alpha I &{} D' &{} -e\alpha \\ 0 &{} 0 &{} 2+d \end{pmatrix} \begin{pmatrix} \frac{1}{\alpha }y \\[2pt] \frac{1}{\alpha }x \\[2pt] \frac{1}{\alpha }z \end{pmatrix} =\begin{pmatrix} 0 \\ 0 \\ \frac{2}{\alpha } \end{pmatrix},\ \ |D|,|D'|\le I,\ |d|\le 1. \end{aligned}$$

Denote by \(M(D,D',d)\) the constraint matrix. The solution is \(\frac{2}{\alpha }\)-multiple of the last column of the inverse matrix \(M(D,D',d)^{-1}\). That is why we analytically express the inverse matrix (notice that it exists due to regularity of \([\alpha A-I,\alpha A+I]\))

$$\begin{aligned} M(D,D',d)^{-1}= \begin{pmatrix} -D'C &{} \frac{1}{\alpha }(I+D'CD) &{} \frac{1}{2+d}(e+D'CDe) \\[2pt] \alpha C &{} -CD &{} -\frac{\alpha }{2+d}CDe \\[2pt] 0 &{} 0 &{} \frac{1}{2+d} \end{pmatrix}, \end{aligned}$$

where \(C:=(\alpha A-DD')^{-1}\), \(|D|, |D'|\le I, |d|\le 1\). The idea of the proof is to reduce the above mentioned NP-hard problem to computation of the condition number for matrix M(0, 0, 0). Obviously, 1-norm of \(M(D,D',d)^{-1}\) is attained for the value of \(d=-1\), so we can fix it for the remainder of the proof.

Claim A. There exist \({\bar{D}}\) and \({\bar{D}}'\) such that \(|{\bar{D}}|=|{\bar{D}}'|=I\) and \(c_1(M(0,0,0))=\Vert M({\bar{D}}, \bar{D}', -1)^{-1}_{*(2n+1)}\Vert _1\).

Proof of the Claim A

By Proposition 2, the maximum norm is attained for \(|D|=|D'|=I\). Therefore, we need only to investigate the matrices with \(|D|=|D'|=I\). Let \(c_1(M(0,0,0))=\Vert M( D, D', -1)^{-1}\Vert _1\) with \(|D|=|D'|=I\). If the 1-norm of \(M( D, D', -1)^{-1}\) is attained at the last column, the claim follows. Otherwise, since \(\alpha >0\) is arbitrarily large, the 1-norm is attained at no column of the middle block. Suppose that the norm is attained at the ith column of the first column block. We compare the norms of this column and the last column of \(M(D,D',d)^{-1}\), that is, we compare the vectors

$$\begin{aligned} \begin{pmatrix} -D'C_{*i} \\ \alpha C_{*i} \\ 0 \end{pmatrix}\quad \hbox {and}\quad \begin{pmatrix} e+D'CDe \\ -\alpha CDe \\ 1 \end{pmatrix}. \end{aligned}$$

We compare separately their three blocks. Obviously, for the last entry the latter is larger. Since \(C\rightarrow 0\) as \(\alpha \rightarrow \infty \), the first block of entries of the former vector is arbitrarily small and neglectable. Thus, we focus on the second block. The former vector has entries \(\alpha C_{*i}\). Notice that by the triangle inequality one has either \(\Vert u\Vert \le \Vert u+v\Vert \) or \(\Vert u\Vert \le \Vert u-v\Vert \) for any \(u,v\in {\mathbb {R}}^n\) and any norm. Thus, one can choose a suitable \({\bar{D}}\) such that \(|{\bar{D}}|=I\) and \(\Vert \alpha C_{*i} \Vert _1\le \Vert \alpha C{\bar{D}}e \Vert _1=\Vert \alpha C_{*i}+\alpha \sum _{j\ne i} C_{*j}{\bar{d}}_{jj} \Vert _1\). Furthermore, one can select a matrix \({\bar{D}}'\) with \(|{\bar{D}}'|=I\) and \(\Vert e+D'CDe\Vert _1=\Vert e+\bar{D}'C{\bar{D}}e\Vert _1\). Because \(c_1(M(0,0,0))=\Vert M( D, D', -1)^{-1}\Vert _1\), the given matrices \({\bar{D}}\) and \({\bar{D}}'\) fulfill the claim.

Claim B. The 1-norm of the last column is arbitrarily close to \(1+n+ e^T|A^{-1}De|\).

Proof of the Claim B

The last entry of the column is 1. Since \(C\rightarrow 0\) as \(\alpha \rightarrow \infty \), the first block tends to e as \(\alpha \rightarrow \infty \). The second block reads \(-\alpha CDe=-(A-\frac{1}{\alpha }DD')^{-1}De\), which tends to \(-A^{-1}De\) as \(\alpha \rightarrow \infty \). So its 1-norm tends to \(e^T|A^{-1}De|\).

By Claim B, the 1-norm of the last column is by \(1+n\) larger than the objective value of (3). So by maximizing the 1-norm of \(M(D,D',d)^{-1}\) we can deduce the maximum of (3) with arbitrary precision. Notice that \(e^T|A^{-1}|e\) is an upper bound on (3) and it has polynomial size, so we can find \(\alpha \) of polynomial size, too, by standard means (cf. [46]). \(\square \)

In general, the computation of c(A) is not easy. However, computation of the condition number with respect to some norms or for some classes of matrices is not difficult. In the rest of the section, we study the given condition number from this aspect.

Proposition 4

If \(\max _{D\in [-I, I]}\Vert A^{-1}D\Vert \le \gamma <1\), then

$$\begin{aligned} c(A)_{\Vert \cdot \Vert }\le \frac{\Vert A^{-1}\Vert }{1-\gamma }. \end{aligned}$$

Proof

Let \(D:=\hbox {diag}(d)\) for some d with \(|d|\le e\). By the assumption, \(\rho (A^{-1}D)\le \Vert A^{-1}D\Vert <1\). By using Neumann series [22],

$$\begin{aligned} (A-D)^{-1}=(I-A^{-1}D)^{-1}A^{-1} =\sum _{k=0}^{\infty }\left( A^{-1}D\right) ^k A^{-1}. \end{aligned}$$

We have

$$\begin{aligned} \Vert (A-D)^{-1}\Vert&\le \sum _{k=0}^{\infty }\big \Vert A^{-1}D\big \Vert ^k \cdot \Vert A^{-1}\Vert \le \frac{\Vert A^{-1}\Vert }{1-\gamma }. \end{aligned}$$

\(\square \)

We say that a matrix norm is monotone if \(|A|\le |B|\) implies \(\Vert A\Vert \le \Vert B\Vert \). For instance, the scaled matrix p-norms are monotone. It is seen that if \(\Vert |A^{-1}| \Vert <1\) for a monotone norm \(\Vert \cdot \Vert \), then the assumption of Proposition 4 holds. It is worth mentioning that if \(\Vert A^{-1}\Vert <1\) and \(\max _{D\in [-I, I]}\Vert D\Vert \le 1\), then the assumption of Proposition 4 holds as well.
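A small sketch of the resulting bound for the \(\infty \)-norm (which is monotone), using \(\gamma =\Vert |A^{-1}| \Vert _\infty \); the example matrix is ours.

import numpy as np

def prop4_bound_inf(A):
    # upper bound on c_infty(A) from Proposition 4 with gamma = || |A^{-1}| ||_inf
    Ainv = np.linalg.inv(A)
    gamma = np.linalg.norm(np.abs(Ainv), np.inf)
    if gamma >= 1:
        raise ValueError("assumption of Proposition 4 is not satisfied")
    return np.linalg.norm(Ainv, np.inf) / (1 - gamma)

A = np.array([[5.0, 1.0], [-1.0, 4.0]])
print(prop4_bound_inf(A))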

Theorem 8

If \(\rho (|A^{-1}|)<\gamma <1\), then there exists a scaling 1-norm such that

(4)

Proof

By Theorems 1 and 2, AVE has a unique solution and \([A-I, A+I]\) is regular. Due to the continuity of eigenvalues with respect to the matrix elements, there exists an invertible matrix B with \(|A^{-1}|<B\) and \(\rho (B)= \gamma \). By the Perron-Frobenius theorem, there exists \(v> 0\) such that \(Bv=\rho (B)v\). We define norm as . Note that . As , we have

$$\begin{aligned}&Ax-|x|-b=A(x-x^\star )-(|x|-|x^\star |)\\&\Rightarrow x-x^\star =A^{-1}(Ax-|x|-b)+A^{-1}(|x|-|x^\star |)\\&\Rightarrow |x-x^\star |\le |A^{-1}||Ax-|x|-b|+|A^{-1}|(||x|-|x^\star ||)\\&\Rightarrow (I-|A^{-1}|)|x-x^\star |\le |A^{-1}||Ax-|x|-b| \end{aligned}$$

By Neumann series theorem [22], and \((I-B)^{-1}\) exist and are non-negative. Hence,

$$\begin{aligned}&|x-x^\star |\le (I-|A^{-1}|)^{-1} |A^{-1}| |Ax-|x|-b|\\&\Rightarrow |x-x^\star |\le (I-B)^{-1}B|Ax-|x|-b| \end{aligned}$$

The last inequality follows from \((I-|A^{-1}|)^{-1}=\sum _{i=0}^\infty |A^{-1}|^i\le \sum _{i=0}^\infty B^i=(I-B)^{-1}\). Hence,

Moreover, for d with \(\Vert d\Vert _\infty \le 1\),

$$\begin{aligned} |(B^{-1}-\hbox {diag}(d))^{-1}| = |(I-B\hbox {diag}(d))^{-1}B| = \bigg | \sum _{i=0}^\infty (B\hbox {diag}(d))^i B\bigg | \le \sum _{i=1}^\infty B^i. \end{aligned}$$

Since \(\sum _{i=1}^\infty B^i=(B^{-1}-I)^{-1}\), the Perron–Frobenius theorem then implies .\(\square \)

One may wonder why we do not use the well-known result which states the existence of a matrix norm with \(\Vert |A^{-1}| \Vert <1\), see Lemma 5.6.10 in [22], to prove the above theorem. The underlying reason is that the matrix norm given by this result is not necessarily a scaled matrix p-norm. It is worth mentioning that, under the assumption of Theorem 8, when , one obtains

(5)

for some scaling 1-norm. Note that a sufficient condition for having is the existence of a diagonal matrix S with \(|S|=I\) such that \(A^{-1}S\ge 0\) and \((A-S)^{-1}S\ge 0\). In fact, Theorem 5.2 in Chapter 7 of [3] implies that \(\rho (A^{-1}S)<1\) under this condition, which is equivalent to \(\rho (|A^{-1}|)<1\).

Error bounds can be utilized as a tool in stability analysis [10, 14]. As mentioned earlier, AVE has a unique solution for each \(b\in {\mathbb {R}}^n\) if and only if \([A-I, A+I]\) is regular. Denote

$$\begin{aligned} {\mathcal {A}}:=\{A\in {\mathbb {R}}^{n\times n}: [A-I, A+I]\hbox { is regular}\}. \end{aligned}$$

It is easily seen that \({\mathcal {A}}\) is an open set. Let function \(X(A, b):{\mathcal {A}}\times \mathbb {R}^n\rightarrow \mathbb {R}^n\) return the solution of AVE. In the following proposition, we list some properties of function X.

Proposition 5

Let \(A\in {\mathcal {A}}\).

  1. (i)

    For any \(b_1, b_2\in \mathbb {R}^n\),

    $$\begin{aligned} \Vert X(A,b_1)-X(A,b_2)\Vert \le c(A)\Vert b_1-b_2\Vert . \end{aligned}$$
  2. (ii)

    Function X is locally Lipschitz with modulus c(A), that is,

    $$\begin{aligned} \Vert X(A_1,b_1)-X(A_2,b_2)\Vert \le c(A)(\Vert A_1-A_2\Vert +\Vert b_1-b_2\Vert ) \end{aligned}$$
    (6)

    for any \(A_1, A_2\) and \(b_1, b_2\) in certain neighborhoods of A and b, respectively.

Proof

First, we show the first part. Suppose that \(X(A,b_1)=x_1\) and \(X(A,b_2)=x_2\). Thus,

$$\begin{aligned} Ax_1-|x_1|-(Ax_2-|x_2|)=b_1-b_2. \end{aligned}$$

There exists a matrix \(D\in [-I, I]\) such that \(|x_2|-|x_1|= D(x_1-x_2)\). So the above equality can be written as

$$\begin{aligned} (A+D)(x_1-x_2)=b_1-b_2, \end{aligned}$$

which implies that \(\Vert x_1-x_2\Vert \le \Vert (A+D)^{-1}\Vert \cdot \Vert b_1-b_2\Vert \le c(A)\Vert b_1-b_2\Vert \).

Now, we prove the second part. Consider the locally Lipschitz function \(\phi :{\mathcal {A}}\times \mathbb {R}^n\times \mathbb {R}^n\rightarrow \mathbb {R}^n\) given by \(\phi (A, b, x)=Ax-|x|-b\). We have \(\partial _x\phi (A, b, x)\subseteq [A-I, A+I]\). As \([A-I, A+I]\) is regular, the implicit function theorem (see Chapter 7 in [9]) implies that there exists a locally Lipschitz function \(X(A, b):{\mathcal {A}}\times \mathbb {R}^n\rightarrow \mathbb {R}^n\) with \(\phi (A, b, X(A, b))=0\). In addition, (6) holds. \(\square \)

2.1 Condition number of AVE for 2-norm

Since \(\Vert A^{-1}\Vert _2=\frac{1}{\sigma _{\min }(A)}\), the value of \(c_2(A)\) is the reciprocal of the optimal value of the following optimization problem,

$$\begin{aligned} \min \ \sigma _{\min }(A-\hbox {diag}(d)) \ \ \hbox {s.t.}\ \ \Vert d\Vert _{\infty }\le 1. \end{aligned}$$
(7)

In general, the function \(\sigma _{\min }(\cdot )\) is neither convex nor concave; see Remark 5.2 in [39]. In (7), \(\sigma _{\min }(\cdot )\) is considered as a function of the diagonal entries only. Nonetheless, even in this case it is neither convex nor concave; the following example illustrates this point. From this perspective, Proposition 2 above is far from obvious.

Example 1

Let \( A=\left( {\begin{matrix} 2 &{} 1\\ -2 &{} 1 \end{matrix}}\right) \) and \(E=\left( {\begin{matrix} 0 &{} 0\\ 0 &{} 1 \end{matrix}}\right) \). We have

$$\begin{aligned} \sigma _{\min }(A)=\sqrt{2}< \tfrac{1}{2}\sigma _{\min }(A+I)+\tfrac{1}{2}\sigma _{\min }(A-I)\approx 1.541,\\ \sigma _{\min }(A)=\sqrt{2}> \tfrac{1}{2}\sigma _{\min }(A+E)+\tfrac{1}{2}\sigma _{\min }(A-E)\approx 1.34. \end{aligned}$$

In the next proposition, we give a formula for symmetric matrices. Before we get to the proposition, we present a lemma, which follows directly from [21, Thm. 17].

Lemma 1

Let A be symmetric. The interval matrix \([A-I, A+I]\) is regular if and only if

$$\begin{aligned} |\lambda _i(A)|>1, \ \ i=1, \dots , n. \end{aligned}$$
(8)

Note that condition (8) is equivalent to \(\sigma _{\min }(A)>1\).

Proposition 6

Let the interval matrix \([A-I, A+I]\) be regular. If A is symmetric, then \(c_2(A)=\frac{1}{\sigma _{\min }(A)-1}\).

Proof

As \([A-I, A+I]\) is regular, \(\sigma _{\min }(A)>1\). For d with \(\Vert d\Vert _\infty \le 1\), \(\sigma _{\min }(A+\hbox {diag}(d))\ge \sigma _{\min }(A)-1\). By the proof of Lemma 1, it is seen that there exists \({\bar{d}}\) with \(\Vert {\bar{d}}\Vert _\infty = 1\) such that \(\sigma _{\min }(A+\hbox {diag}({\bar{d}}))= \sigma _{\min }(A)-1\). Hence, the proposition follows from formulation (7). \(\square \)
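For a symmetric matrix the formula of Proposition 6 can be compared against the brute-force evaluation from Proposition 2; a sketch with a small symmetric test matrix of ours:

import numpy as np
from itertools import product

def c2_bruteforce(A):
    n = A.shape[0]
    return max(np.linalg.norm(np.linalg.inv(A - np.diag(d)), 2)
               for d in product([-1.0, 1.0], repeat=n))

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 1.0],
              [0.0, 1.0, 6.0]])                    # symmetric with sigma_min(A) > 1
sigma_min = np.linalg.svd(A, compute_uv=False)[-1]
print(1.0 / (sigma_min - 1.0), c2_bruteforce(A))   # the two values agree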

Proposition 7

If \(\sigma _{\min }(A)>1\), then

$$\begin{aligned} c_2(A)\le \frac{1}{\sigma _{\min }(A)-1}. \end{aligned}$$
(9)

Proof

Note that under the assumption, AVE has a unique solution for any b, see Proposition 3 in [31], and consequently \([A-I, A+I]\) is regular. Let \({\hat{d}}\in \{d:\Vert d\Vert _\infty \le 1\}\). Consider the formulation (7). Since \(\sigma _{\min }(A+B)\ge \sigma _{\min }(A)-\Vert B\Vert _2\) and \(\max _{\Vert d\Vert _\infty \le 1} \Vert \hbox {diag}(d)\Vert _2=1,\) we obtain the desired inequality. \(\square \)

In the following example, we show that the bound (9) can be arbitrarily large while the condition number with respect to the 2-norm, \(c_2(A)\), remains bounded.

Example 2

Let \(\epsilon >0\),

$$\begin{aligned} A=\tfrac{\sqrt{2}}{2}\begin{pmatrix} 1 &{} -1\\ 1 &{} 1 \end{pmatrix} \begin{pmatrix} 5 &{} 0\\ 0 &{} 1+\epsilon \end{pmatrix}\ \ \hbox {and}\ \ E=\begin{pmatrix} -1 &{} 0\\ 0 &{} 1 \end{pmatrix}. \end{aligned}$$

As \(\sigma _{\min }(A)=1+\epsilon \), we have the assumption of Proposition 7. By Proposition 2,

$$\begin{aligned} c_2(A)=\max \left\{ \Vert (A-I)^{-1}\Vert _2, \Vert (A-E)^{-1}\Vert _2, \Vert (A+E)^{-1}\Vert _2, \Vert (A+I)^{-1}\Vert _2\right\} . \end{aligned}$$

With a little algebra, it is seen that \( c_2(A)\le 6\), while \(\frac{1}{\sigma _{\min }(A)-1}\) goes to infinity as \(\epsilon \) tends to zero.
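The comparison in Example 2 is easy to reproduce numerically; a sketch with a small \(\epsilon \) (the vertex enumeration follows Proposition 2):

import numpy as np
from itertools import product

eps = 1e-3
Q = np.sqrt(2) / 2 * np.array([[1.0, -1.0], [1.0, 1.0]])
A = Q @ np.diag([5.0, 1.0 + eps])
c2 = max(np.linalg.norm(np.linalg.inv(A - np.diag(d)), 2)
         for d in product([-1.0, 1.0], repeat=2))
sigma_min = np.linalg.svd(A, compute_uv=False)[-1]
print(c2, 1.0 / (sigma_min - 1.0))    # c2 stays below 6 while the bound (9) blows up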

For matrix A, let

$$\begin{aligned} r_i(A)=\sum _{j\ne i} |A_{ij}|, \ \ \hbox {cl}_i(A)=\sum _{j\ne i} |A_{ji}|. \end{aligned}$$

Proposition 8

Let \({\bar{d}}=\hbox {sgn}(\hbox {Diag}(A))\). If

$$\begin{aligned} \alpha :=\min _{i=1, \dots , n} \big \{|A_{ii}|-\textstyle \frac{1}{2}(r_i(A)+\hbox {cl}_i(A))\big \}>1, \end{aligned}$$

then \(c_2(A)= \Vert (A-\hbox {diag}({\bar{d}}))^{-1}\Vert _2\).

Proof

Let \(d\in \{d:\Vert d\Vert _\infty \le 1\}\). By Theorem 3 in [23], \(\sigma _{\min }(A-\hbox {diag}(d))\ge \alpha -1\). So, \([A-I, A+I]\) is regular. Since \(\Vert A^{-1}\Vert ^{-2}_2 = \lambda _{\min }(A^TA)\), by Proposition 2, \(c_2(A)^{-2}=\min _{|d|=e}\lambda _{\min }\big ((A-\hbox {diag}(d))^T(A-\hbox {diag}(d))\big )\). Suppose that \(|d|=e\). Consider matrix

$$\begin{aligned} T&=(A-\hbox {diag}(d))^T(A-\hbox {diag}(d))-(A-\hbox {diag}({\bar{d}}))^T(A-\hbox {diag}({\bar{d}}))\\&=\hbox {diag}({\bar{d}})A+A^T\hbox {diag}({\bar{d}})-\hbox {diag}( d)A-A^T\hbox {diag}( d). \end{aligned}$$

It is easily seen that T is diagonally dominant with nonnegative diagonal, so it is positive semi-definite. Consequently, \(\lambda _{\min }\big ((A-\hbox {diag}(d))^T(A-\hbox {diag}(d))\big )\ge \lambda _{\min }\big ((A-\hbox {diag}({\bar{d}}))^T(A-\hbox {diag}({\bar{d}}))\big )\), which implies the desired equality. \(\square \)

Note that under the assumptions of Proposition 8, we also have the following bound

$$\begin{aligned} c_2(A)\le \frac{1}{\alpha -1}. \end{aligned}$$
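Under the assumption of Proposition 8, \(c_2(A)\) reduces to a single matrix inversion; a sketch (the helper name is ours):

import numpy as np

def c2_prop8(A):
    # Proposition 8: if alpha > 1, then c_2(A) = ||(A - diag(dbar))^{-1}||_2
    # with dbar = sgn(Diag(A))
    absA = np.abs(A)
    r = absA.sum(axis=1) - np.diag(absA)       # row sums without the diagonal
    cl = absA.sum(axis=0) - np.diag(absA)      # column sums without the diagonal
    alpha = np.min(np.abs(np.diag(A)) - 0.5 * (r + cl))
    if alpha <= 1:
        raise ValueError("assumption of Proposition 8 is not satisfied")
    dbar = np.sign(np.diag(A))
    return np.linalg.norm(np.linalg.inv(A - np.diag(dbar)), 2)

A = np.array([[4.0, 1.0], [-1.0, -5.0]])
print(c2_prop8(A))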

Since for a permutation matrix P we have \(\Vert AP\Vert _2=\Vert A\Vert _2\) and \([-I, I]P=[-I, I]\), the following corollary gives a more general form of Proposition 8.

Corollary 1

Let P be a permutation matrix and \(B=AP\). If

$$\begin{aligned} \alpha :=\min _{i=1, \dots , n} \big \{|B_{ii}|-\textstyle \frac{1}{2}(r_i(B)+\hbox {cl}_i(B))\big \}>1, \end{aligned}$$

then \(c_2(A)= \Vert (B-\hbox {diag}({\bar{d}}))^{-1}\Vert _2\), where \(\bar{d}=\hbox {sgn}(\hbox {Diag}(B))\).

As mentioned earlier, one class of effective approaches for handling AVE consists of concave optimization methods. Mangasarian [32] proposed the following concave optimization problem,

$$\begin{aligned} \min \ e^T(Ax-|x|-b)\ \ \hbox {s.t.}\ \ (A+I)x\ge b,\ (A-I)x\ge b. \end{aligned}$$
(10)

He showed that AVE has a solution if and only if the optimal value of (10) is zero. Now, we show that (10) has the weak sharp minima property. Consider an optimization problem \(\min _{x\in X} f(x)\) with the optimal solution set S. The set S is called a set of weak sharp minima if there is an \(\alpha >0\) such that

$$\begin{aligned} \alpha \cdot {{\,\mathrm{dist}\,}}_S(x)\le f(x)-f(s), \quad \forall x\in X,\ \forall s\in S, \end{aligned}$$

where \({{\,\mathrm{dist}\,}}_S(x):=\min \{\Vert x -s\Vert _2 : s\in S\}\). The notion of weak sharp minima has wide applications in the convergence analysis of iterative methods and in error bounds [5, 6].

Proposition 9

Let \(A\in {\mathcal {A}}\). Then the optimal solution of (10) is a weak sharp minimum.

Proof

Let X and \(x^\star \) denote the feasible set and the unique solution of (10), respectively. By Theorem 7, \(c_2(A)\in \mathbb {R}_+\) and

$$\begin{aligned} \frac{1}{c_2(A)}\Vert x- x^\star \Vert _2 \le \Vert Ax-|x|-b \Vert _2, \quad \forall x\in X. \end{aligned}$$

As \(\Vert Ax-|x|-b \Vert _2\le \Vert Ax-|x|-b \Vert _1\) and \(Ax-|x|-b \ge 0\) for \(x\in X\), we have

$$\begin{aligned} \frac{1}{c_2(A)}\Vert x- x^\star \Vert _2 \le e^T(Ax-|x|-b), \quad \forall x\in X, \end{aligned}$$

which shows that \(x^\star \) is a weak sharp minimum. \(\square \)

2.2 Condition number of AVE for \(\infty \)-norm

Several upper bounds have been proposed for \(\Vert A^{-1}\Vert _\infty \) and \(\Vert A^{-1}\Vert _1\); see [24, 26, 35, 48]. As Theorem 7 holds for any scaling p-norm, it can be advantageous to use these norms.

Proposition 10

If \((AP-I)^{-1}\ge 0\) and \((AP+I)^{-1}\ge 0\) for some diagonal matrix P with \(|\hbox {Diag}(P)|=e\), then \(c_\infty (A)=\Vert (AP-I)^{-1}e\Vert _\infty \).

Proof

By Theorem 3, under the assumptions of the proposition, the interval matrix \([AP-I, AP+I]\) is regular and inverse nonnegative. In addition, \([AP-I, AP+I]^{-1}\subseteq [(AP+I)^{-1}, (AP-I)^{-1}]\). Since \([AP-I, AP+I]=[A-I, A+I]P\), the interval matrix \([A-I, A+I]\) is regular. It is easily seen that for any non-negative matrix M we have \(\Vert M\Vert _\infty =\Vert Me\Vert _\infty \). Because \(\Vert PM\Vert _\infty =\Vert M\Vert _\infty \) for any matrix M, we get \(c_\infty (A)=\Vert (AP-I)^{-1}e\Vert _\infty \). \(\square \)

One can establish that the assumption of Proposition 10 is equivalent to the condition that each row of B has a constant pattern of signs for any \(B\in [A-I, A+I]^{-1}\). Moreover, we have \(c_1(A)=\Vert (AP-I)^{-1}\Vert _1\) under the assumptions of Proposition 10.

Proposition 11

If \(\rho (|A^{-1}|)<1\), then

$$\begin{aligned} c_\infty (A)\le \Vert \max (|B_1|, |B_2|)\Vert _\infty , \end{aligned}$$
(11)

where \(H=(I-|A^{-1}|)^{-1}\), \(T=(2\hbox {diag}(\hbox {Diag}(H))-I)^{-1}\) and

$$\begin{aligned}&B_1=\min \{-H|A^{-1}|+T(A^{-1}+|A^{-1}|), T(-H|A^{-1}|+T(A^{-1}+|A^{-1}|))\},\\&B_2=\max \{H|A^{-1}|+T(A^{-1}-|A^{-1}|), T(H|A^{-1}|+T(A^{-1}-|A^{-1}|))\}. \end{aligned}$$

Proof

By Theorem 2.40 in [15], \([A-I, A+I]^{-1}\subseteq [B_1, B_2]\). Thus,

$$\begin{aligned} c_\infty (A)&= \max _{\Vert d\Vert _{\infty }\le 1} \Vert (A-\hbox {diag}(d))^{-1}\Vert _\infty \\&\le \max _{X\in [B_1, B_2]} \Vert X\Vert _\infty \le \Vert \max (|B_1|, |B_2|)\Vert _\infty . \end{aligned}$$

\(\square \)

Proposition 12

Let A be an M-matrix. If \(\rho (A^{-1})<1\), then

$$\begin{aligned} c_{\infty }(A)= \Vert (A- I)^{-1}e\Vert _{\infty }. \end{aligned}$$
(12)

Proof

By Theorem 4(ii), \(A-I\) is an M-matrix. In addition, as M-matrices are preserved by the addition of positive diagonal matrices [3], \(A+I\) is also an M-matrix. Hence, by Theorem 3, \([A-I, A+I]\) is inverse nonnegative. The statement now follows from Proposition 10 with \(P=I\). \(\square \)
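A numerical sketch of Proposition 12: for an M-matrix with \(\rho (A^{-1})<1\) (the matrix below is ours), the closed-form value coincides with the brute-force evaluation over the vertices.

import numpy as np
from itertools import product

A = np.array([[ 3.0, -0.5, -0.4],
              [-0.6,  3.0, -0.5],
              [-0.4, -0.6,  3.0]])               # an M-matrix with rho(A^{-1}) < 1
c_formula = np.linalg.norm(np.linalg.solve(A - np.eye(3), np.ones(3)), np.inf)
c_brute = max(np.linalg.norm(np.linalg.inv(A - np.diag(d)), np.inf)
              for d in product([-1.0, 1.0], repeat=3))
print(c_formula, c_brute)                        # the two values coincide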

Proposition 13

Let A be an H-matrix. If \(\rho (\langle A\rangle ^{-1})<1\), then

$$\begin{aligned} c_{\infty }(A)\le \Vert (\langle A\rangle -I)^{-1}e\Vert _{\infty }. \end{aligned}$$
(13)

Proof

By Theorem 5, the interval matrix \([A-I, A+I]\) is an H-matrix, and thus it is regular. In addition, \(\langle [A-I, A+I]\rangle =[\langle A\rangle -I, \langle A\rangle +I]\). By Theorem 3, \([\langle A\rangle -I, \langle A\rangle +I]^{-1} \subseteq [(\langle A\rangle +I)^{-1}, (\langle A\rangle -I)^{-1}]\). Because \((\langle A\rangle +I)^{-1}\ge 0\),

$$\begin{aligned} c_{\infty }(A)&=\max _{\Vert d\Vert _\infty \le 1} \ \Vert (A-\hbox {diag}(d))^{-1}\Vert _{\infty } \\&\le \max _{\Vert d\Vert _\infty \le 1} \ \Vert \langle A-\hbox {diag}(d)\rangle ^{-1}\Vert _{\infty } \\&= \Vert (\langle A\rangle -I)^{-1}e\Vert _{\infty }, \end{aligned}$$

where the first inequality follows from Theorem 5. \(\square \)

Proposition 14

Let \(r>0\) and define the scaling norm \(\Vert x\Vert _r:=\Vert \hbox {diag}(r)^{-1}x\Vert _\infty \). If

$$\begin{aligned} \alpha :=\min _{i=1, \dots , n}\{|A_{ii}|-1-r_i^{-1}\sum _{j\ne i} r_j|A_{ij}|\}>0, \end{aligned}$$

then \(c(A)_{\Vert \cdot \Vert _r}\le \frac{1}{\alpha }\).

Proof

First, we show that for a given d with \(\Vert d \Vert _\infty \le 1\), we have the following inequality

$$\begin{aligned} \Vert (A-\hbox {diag}(d))^{-1}\Vert _r\le \frac{1}{\alpha }. \end{aligned}$$
(14)

Suppose that \(y\in \mathbb {R}^n\) is nonzero and let i be an index with \(\Vert y\Vert _r=|y_i|/r_i\). We have

$$\begin{aligned} \frac{|((A-\hbox {diag}(d))y)_i|}{r_i}\ge (|A_{ii}|-1)\frac{|y_i|}{r_i}-\frac{1}{r_i}\sum _{j\ne i} |A_{ij}|\,|y_j| \ge \Big (|A_{ii}|-1-r_i^{-1}\sum _{j\ne i} r_j|A_{ij}|\Big )\Vert y\Vert _r\ge \alpha \Vert y\Vert _r, \end{aligned}$$

and thus \(\Vert (A-\hbox {diag}(d))y\Vert _r\ge \alpha \Vert y\Vert _r\) for every y. Consequently, the interval matrix \([A-I, A+I]\) is regular and (14) holds. Since (14) is valid for every d with \(\Vert d\Vert _\infty \le 1\), we obtain \(c(A)_{\Vert \cdot \Vert _r}\le \frac{1}{\alpha }\), and the proof is complete. \(\square \)

Corollary 2

If \(\alpha :=\min _{i=1, \dots , n}\{|A_{ii}|-r_i(A)\}>1,\) then \(c_{\infty }(A)\le \frac{1}{\alpha -1}\).

Corollary 3

If \(\beta :=\min _{j=1, \dots , n}\{|A_{jj}|-\hbox {cl}_j(A)\}>1,\) then \(c_{1}(A)\le \frac{1}{\beta -1}\).

3 Error bounds and a condition number of AVE related to linear complementarity problems

The study of AVE is inspired by the well-known linear complementarity problem (LCP) [31], which provides a unified framework for many mathematical programs [10]. In this section, we study error bounds for AVE obtained by transforming LCPs. Consider a general linear complementarity problem

$$\begin{aligned} Mx+q\ge 0, \ \ x\ge 0, \ \ x^T(Mx+q)=0, \end{aligned}$$
(LCP)

where \(M\in \mathbb {R}^{n\times n}\) and \(q\in \mathbb {R}^{n}\). Throughout this section, without loss of generality, we assume that one is not an eigenvalue of M, so the matrix \(M-I\) is nonsingular. This assumption is not restrictive, as one can rescale M and q in LCP. Problem LCP can be formulated as the following AVE,

$$\begin{aligned} (M+I)(M-I)^{-1}(x+q)=|x|; \end{aligned}$$
(15)

see [29]. The following proposition states the relationship between M and \((M+I)(M-I)^{-1}\); see Theorem 2 in [44].

Proposition 15

Let \(M-I\) be non-singular. Matrix M is a P-matrix if and only if \([(M+I)(M-I)^{-1}-I, (M+I)(M-I)^{-1}+I]\) is regular.

In addition to the error bounds introduced for some classes of matrices in the former section, in the following results, we propose error bounds for absolute value equation (15) according to some properties of M.

Proposition 16

Let M be an M-matrix with \(\hbox {Diag}(M)\le e\) and \(M-I\) be nonsingular. Then

$$\begin{aligned} c((M+I)(M-I)^{-1})=\frac{1}{2}\Vert I-M^{-1}\Vert . \end{aligned}$$

Proof

Since the off-diagonal elements of M are non-positive and \(M^{-1}\ge 0\), we have \(\hbox {Diag}(M^{-1})\ge e\). Putting \(A=(M+I)(M-I)^{-1}\), we get

$$\begin{aligned}&A-I=((M+I)-(M-I))(M-I)^{-1}=2(M-I)^{-1},\\&A+I=2M(M-I)^{-1}=2(I-M^{-1})^{-1}. \end{aligned}$$

Therefore, \((A-I)^{-1}=\frac{1}{2}(M-I)\le 0\) and \((A+I)^{-1}=\frac{1}{2}(I-M^{-1})\le 0\). Theorem 3 implies that \([A-I, A+I]\) is regular and \([A-I, A+I]^{-1}\subseteq \frac{1}{2}[I-M^{-1},M-I]\), and consequently, \(c(A)=\frac{1}{2}\Vert I-M^{-1}\Vert \). \(\square \)
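Proposition 16 can be checked numerically on a small M-matrix M with \(\hbox {Diag}(M)\le e\) (the data below are ours); the closed form \(\frac{1}{2}\Vert I-M^{-1}\Vert _\infty \) matches the brute-force value of \(c_\infty \) for \(A=(M+I)(M-I)^{-1}\).

import numpy as np
from itertools import product

M = np.array([[0.9, -0.2], [-0.3, 0.8]])          # M-matrix with Diag(M) <= e
I = np.eye(2)
A = (M + I) @ np.linalg.inv(M - I)
c_formula = 0.5 * np.linalg.norm(I - np.linalg.inv(M), np.inf)
c_brute = max(np.linalg.norm(np.linalg.inv(A - np.diag(d)), np.inf)
              for d in product([-1.0, 1.0], repeat=2))
print(c_formula, c_brute)                         # the two values coincide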

It is worth noting that the assumption \(\hbox {Diag}(M)\le e\) is not restrictive since LCP(M, q) is equivalent to LCP\((\lambda M, \lambda q)\) for \(\lambda >0\). In the following, we investigate the case that M is an H-matrix. Before we get to the theorem, which gives a bound in this case, we first present a lemma.

Lemma 2

If M is an H-matrix with non-negative diagonal, then \(M+I\) is an H-matrix.

Proof

By the assumption, \(\langle M+I\rangle =\langle M\rangle +I\). By Theorem 4(i), \(\langle M\rangle +I\) is an M-matrix, so \(M+I\) is an H-matrix. \(\square \)

Theorem 9

Let \(M-I\) be nonsingular and let M be an H-matrix with \(0\le \hbox {Diag}(M)\le e\). Then

$$\begin{aligned} c((M+I)(M-I)^{-1})\le \frac{1}{2}\Vert \langle M\rangle ^{-1}-I\Vert . \end{aligned}$$

Proof

Consider vector \(d\in \mathbb {R}^n\) with \(\Vert d\Vert _\infty \le 1\). We have

$$\begin{aligned} |(M-I)(M+I)^{-1}\hbox {diag}(d)|&\le |(M-I)(M+I)^{-1}|\\&\le |M-I| |(M+I)^{-1}|\\&\le (I-\langle M\rangle )(\langle M\rangle +I)^{-1}, \end{aligned}$$

where the last inequality follows from \(|M-I|\le I-\langle M\rangle \), Theorem 5 and Lemma 2. Thus, \(\rho ((M-I)(M+I)^{-1}\hbox {diag}(d))\le \rho ((I-\langle M\rangle )(\langle M\rangle +I)^{-1})\). Since \(\langle M\rangle \) is an M-matrix and \(\rho (BC)=\rho (CB)\), we have \(\rho ((I-\langle M\rangle )(\langle M\rangle +I)^{-1})<1\); see Theorem 4(i). Hence, \(\rho ((M-I)(M+I)^{-1}\hbox {diag}(d))<1\).

Let \({\hat{A}}\in [(M+I)(M-I)^{-1}-I, (M+I)(M-I)^{-1}+I]\). So \({\hat{A}}=(M+I)(M-I)^{-1}-\hbox {diag}(d)\) for some d with \(\Vert d\Vert _\infty \le 1\). Hence,

$$\begin{aligned}&((M+I)(M-I)^{-1}-\hbox {diag}(d))^{-1}\\&\quad =(I-(M-I)(M+I)^{-1}\hbox {diag}(d))^{-1}(M-I)(M+I)^{-1}. \end{aligned}$$

By applying Neumann series and the obtained results, we have

$$\begin{aligned} |((M+I)(M-I)^{-1}&-\hbox {diag}(d))^{-1}|\\&\le \left| \sum _{i=0}^\infty ((M-I)(M+I)^{-1}\hbox {diag}(d))^i\right| |(M-I)(M+I)^{-1}|,\\&\le \sum _{i=1}^\infty |(M-I)(M+I)^{-1}|^i ,\\&\le \sum _{i=1}^\infty ((I-\langle M\rangle )(\langle M\rangle +I)^{-1})^i,\\&= (I-(I-\langle M\rangle )(\langle M\rangle +I)^{-1})^{-1}(I-\langle M\rangle )(\langle M\rangle +I)^{-1}\\&= \frac{1}{2}(\langle M\rangle ^{-1}-I), \end{aligned}$$

where the last equality is obtained by using the relations \((I-A)^{-1}A=(A^{-1}-I)^{-1}\) and \(((I+\langle M\rangle )(I-\langle M\rangle )^{-1}-I)^{-1}=(2\langle M\rangle (I-\langle M\rangle )^{-1})^{-1}=\frac{1}{2}(\langle M\rangle ^{-1}-I)\). Therefore, \(\Vert {\hat{A}}^{-1}\Vert \le \frac{1}{2}\Vert \langle M\rangle ^{-1}-I\Vert \), and the proof is complete. \(\square \)

In the rest of this section, by using the obtained results, we present new error bounds for linear complementarity problems. Many papers were devoted to the error bounds for the LCP(M, q); see [7, 8, 10, 16, 38]. It is easily seen that \({\hat{x}}\) is a solution of LCP if and only if \({\hat{x}}\) solves

$$\begin{aligned} \theta (x):=\min (Mx+q, x)=0. \end{aligned}$$

The function \(\theta (x)\) is called the natural residual of LCP. As mentioned earlier, LCP has a unique solution for each q if and only if M is a P-matrix. For M being a P-matrix, Chen and Xiang [7] proposed the following error bound

$$\begin{aligned} \Vert x-x^\star \Vert \le \max _{0\le D\le I} \Vert (I-D+DM)^{-1}\Vert \cdot \Vert \theta (x)\Vert , \end{aligned}$$

where \(x^\star \) is the solution of LCP and \(x\in {\mathbb {R}}^n\) arbitrary. By introducing a new variable d with \(\hbox {diag}(d)=2D-I\), we have

$$\begin{aligned} \max _{0\le D\le I}\,&\Vert (I-D+DM)^{-1}\Vert \nonumber \\&=\max _{\Vert d\Vert _\infty \le 1} \Vert (I-\textstyle \frac{1}{2}(\hbox {diag}(d)+I)+\textstyle \frac{1}{2}(\hbox {diag}(d)+I)M)^{-1}\Vert \nonumber \\&= 2\max _{\Vert d\Vert _\infty \le 1} \Vert (I-M)^{-1}((I+M)(I-M)^{-1}-\hbox {diag}(d))^{-1}\Vert . \end{aligned}$$
(16)

Because \(c(A)=c(-A)\), we have

$$\begin{aligned} \max _{0\le D\le I} \Vert (I-D+DM)^{-1}\Vert \le 2c((I+M)(M-I)^{-1})\Vert (I-M)^{-1}\Vert . \end{aligned}$$

Therefore, the given results in this paper can be exploited for providing an upper bound for this maximization. For instance, Chen and Xiang, see Theorem 2.2 in [7], proved that when M is an M-matrix, then

$$\begin{aligned} \max _{0\le D\le I}\, \Vert (I-D+DM)^{-1}\Vert _1=\max _{v\in V}\, f(v), \end{aligned}$$

where \(f(v)=\max _{1\le i \le n}(e+v-M^Tv)_i\) and \(V=\{v: M^Tv\le e, v\ge 0\}\). As seen, f is a piece-wise linear convex function. However, maximization of a convex function is an intractable problem in general. In this case, one needs to solve n linear programs. In the next proposition, we give an explicit formula for the optimal value for \(\infty \)-norm.
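Before turning to that proposition, the 1-norm quantity of Chen and Xiang can be computed by the n linear programs just mentioned; a sketch assuming SciPy's linprog is available (the example matrix is ours):

import numpy as np
from scipy.optimize import linprog

def chen_xiang_1norm(M):
    # max of f(v) = max_i (e + v - M^T v)_i over V = {v : M^T v <= e, v >= 0},
    # computed by maximizing each component separately (one LP per index i)
    n = M.shape[0]
    best = -np.inf
    for i in range(n):
        c = -(np.eye(n) - M.T)[i]                 # maximize ((I - M^T) v)_i
        res = linprog(c, A_ub=M.T, b_ub=np.ones(n),
                      bounds=[(0, None)] * n, method="highs")
        best = max(best, 1.0 - res.fun)           # add the constant entry of e
    return best

M = np.array([[2.0, -0.5], [-1.0, 3.0]])          # an M-matrix, for illustration
print(chen_xiang_1norm(M))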

Proposition 17

Let M be an M-matrix with \(\hbox {Diag}(M)\le e\). Then

$$\begin{aligned} \max _{0\le D\le I} \Vert (I-D+DM)^{-1}\Vert _\infty = \Vert \hat{B}\Vert _\infty , \end{aligned}$$

where for \(i, j=1, \dots , n\)

$$\begin{aligned} \underline{{B}}_{ij}=\sum _{k=1}^n \min \{(I-M)^{-1}_{ik}(I-M)_{kj}, (I-M)^{-1}_{ik}(M^{-1}-I)_{kj} \},\\ \overline{{B}}_{ij}=\sum _{k=1}^n \max \{(I-M)^{-1}_{ik}(I-M)_{kj}, (I-M)^{-1}_{ik}(M^{-1}-I)_{kj} \}, \end{aligned}$$

and \(\hat{B}=\max (| \underline{{B}}|, | \overline{{B}}|)\).

Proof

Similarly to the proof of Proposition 16, if \(\hbox {Diag}(M)\le e\), we have \([(I+M)(I-M)^{-1}-I, (I+M)(I-M)^{-1}+I]^{-1}\subseteq \frac{1}{2}[I-M, M^{-1}-I]\). Therefore, by (16)

$$\begin{aligned} \max _{0\le D\le I}\, \Vert (I-D+DM)^{-1}\Vert _\infty \le \max _{I-M\le X \le M^{-1}-I} \Vert (I-M)^{-1} X\Vert _\infty . \end{aligned}$$

Furthermore, \(\{ (I-M)^{-1} X: I-M\le X \le M^{-1}-I \}\subseteq [\underline{{B}}, \overline{{B}}]\). Hence,

$$\begin{aligned} \max _{0\le D\le I} \Vert (I-D+DM)^{-1}\Vert _\infty \le \Vert \hat{B}\Vert _\infty . \end{aligned}$$

On the other hand, suppose that \(\Vert \hat{B}\Vert _\infty =\Vert \hat{B}_{i*}\Vert _\infty \). There exists \({\check{B}}\in \{ (I-M)^{-1} X: I-M\le X \le M^{-1}-I \}\) such that \(| {\check{B}}_{i*}|=\hat{B}_{i*}\), which implies that the above inequality holds with equality, and the proof is complete. \(\square \)

For M being an H-matrix with \(0\le \hbox {Diag}(M)\le e\), similarly to the proof of Theorem 9, one can show that for d with \(\Vert d\Vert _\infty \le 1\),

$$\begin{aligned} \big |(I-M)^{-1}&\big ((I+M)(I-M)^{-1}-\hbox {diag}(d)\big )^{-1}\big |\\&= \big |\big ((I+M)-\hbox {diag}(d)(I-M)\big )^{-1}\big |,\\&=\left| (I+M)^{-1}\sum _{i=0}^\infty \big (\hbox {diag}(d)(I-M)(M+I)^{-1}\big )^i\right| ,\\&\le (\langle M\rangle +I)^{-1} \sum _{i=0}^\infty \big ((I-\langle M\rangle )(\langle M\rangle +I)^{-1}\big )^i,\\&= (\langle M\rangle +I)^{-1}\big (I-(I-\langle M\rangle )(\langle M\rangle +I)^{-1}\big )^{-1} =\frac{1}{2}\langle M\rangle ^{-1}. \end{aligned}$$

Therefore, by (16), we get

$$\begin{aligned} \max _{0\le D\le I} \Vert (I-D+DM)^{-1}\Vert \le \Vert \langle M\rangle ^{-1}\Vert , \end{aligned}$$
(17)

which is a well-known bound; see Theorem 2.1 in [7]. Here, we obtain inequality (17) with a different method as a by-product of our analysis.

4 Relative condition number of AVE

We introduce a relative condition number as follows

$$\begin{aligned} c^*(A) :=\max _{\Vert d\Vert _\infty \le 1}\Vert (A-\hbox {diag}(d))^{-1}\Vert \cdot \max _{\Vert d\Vert _\infty \le 1}\Vert A-\hbox {diag}(d)\Vert , \end{aligned}$$

which is equal to \(c(A) \max _{\Vert d\Vert _\infty \le 1}\Vert A-\hbox {diag}(d)\Vert \). The meaning of the relative condition number follows from the bounds presented in the proposition below. They extend the bounds known for the error of standard linear systems of equations [18].

Proposition 18

If the interval matrix \([A-I, A+I]\) is regular and \(b\ne 0\), then for each \(x\in \mathbb {R}^n\)

$$\begin{aligned} c^*(A)^{-1} \frac{\Vert Ax-|x|-b\Vert }{\Vert b\Vert } \le \frac{\Vert x- x^\star \Vert }{\Vert x^\star \Vert } \le c^*(A) \frac{\Vert Ax-|x|-b\Vert }{\Vert b\Vert }. \end{aligned}$$

Proof

Since \(b\ne 0\), we have \(x^\star \ne 0\). First, we show the upper bound. Denote \(s^\star :=\hbox {sgn}(x^\star )\). As \(Ax^\star -b=|x^\star |=\hbox {diag}(s^\star )x^\star \), we derive \((A-\hbox {diag}(s^\star ))x^\star =b\), from which \(\Vert A-\hbox {diag}(s^\star )\Vert \cdot \Vert x^\star \Vert \ge \Vert b\Vert \). Now, we have by Theorem 7

$$\begin{aligned} \Vert x- x^\star \Vert \le c(A)\Vert Ax-|x|-b\Vert \le c(A) \Vert Ax-|x|-b\Vert \frac{\Vert A-\hbox {diag}(s^\star )\Vert \cdot \Vert x^\star \Vert }{\Vert b\Vert }, \end{aligned}$$

from which the bound follows.

Now, we establish the lower bound. From the proof of Theorem 7, we know that there exist some \({\hat{A}}\in [A-I, A+I]\) such that \(Ax-|x|-b=\hat{A}(x-x^\star )\). Hence,

$$\begin{aligned} \Vert Ax-|x|-b\Vert&= \Vert \hat{A}(x-x^\star )\Vert \le \Vert \hat{A}\Vert \cdot \Vert x-x^\star \Vert \\&\le \Vert \hat{A}\Vert \cdot \Vert x-x^\star \Vert \frac{\Vert (A-\hbox {diag}(s^\star ))^{-1}\Vert \cdot \Vert b\Vert }{\Vert x^\star \Vert }, \end{aligned}$$

from which the statement follows. \(\square \)

Remark 2

The solutions of AVE lying in the orthant \(\hbox {diag}(d)x\ge 0\), \(d\in \{\pm 1\}^n\), are described by \((A-\hbox {diag}(d))x=b\). This may suggest introducing the condition number as

$$\begin{aligned} \max _{d\in \{\pm 1\}^n} \kappa (A-\hbox {diag}(d)), \end{aligned}$$

where \(\kappa \) is the classical condition number. This value then reads

$$\begin{aligned} \max _{d\in \{\pm 1\}^n} \Vert (A-\hbox {diag}(d))^{-1}\Vert \cdot \Vert A-\hbox {diag}(d)\Vert . \end{aligned}$$

The main difference to \(c^*(A)\) is that in the definition of \(c^*(A)\) we have two separated maximization problems. The need for that may stem from possible variations of the solution between different orthants (e.g., when it lies on the border between two of them), whereas the above expression handles orthants separately.

In order to compute \(c^*(A)\) we have to determine c(A) and \(\max _{\Vert d\Vert _\infty \le 1}\Vert A-\hbox {diag}(d)\Vert \). The former is discussed in detail in the previous sections, so we focus on the latter now. Recall that a norm is absolute if \(\Vert A\Vert =\Vert |A|\Vert \), and it is monotone if \(|A|\le |B|\) implies \(\Vert A\Vert \le \Vert B\Vert \). For example, 1-norm, \(\infty \)-norm, Frobenius norm or max norm are both absolute and monotone.

Proposition 19

For any absolute and monotone matrix norm

$$\begin{aligned} \max _{\Vert d\Vert _\infty \le 1}\Vert A-\hbox {diag}(d)\Vert =\Vert |A|+I_n\Vert . \end{aligned}$$

Proof

We have

$$\begin{aligned} \max _{\Vert d\Vert _\infty \le 1}\Vert A-\hbox {diag}(d)\Vert \le \max _{\Vert d\Vert _\infty \le 1}\Vert |A|+|\hbox {diag}(d)|\Vert =\Vert |A|+I\Vert , \end{aligned}$$

and equality is attained for a certain d with \(\Vert d\Vert _\infty =1\), e.g., for \(d_i=-\hbox {sgn}(A_{ii})\) if \(A_{ii}\ne 0\) and \(d_i=1\) otherwise. \(\square \)
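Combining Proposition 19 with the vertex enumeration of Proposition 2 gives a direct (exponential-time) evaluation of \(c^*(A)\) for the \(\infty \)-norm; a sketch (helper name ours):

import numpy as np
from itertools import product

def relative_cond_ave_inf(A):
    # c^*(A) for the infinity-norm: c(A) * || |A| + I ||_inf (Proposition 19)
    n = A.shape[0]
    c = max(np.linalg.norm(np.linalg.inv(A - np.diag(d)), np.inf)
            for d in product([-1.0, 1.0], repeat=n))
    return c * np.linalg.norm(np.abs(A) + np.eye(n), np.inf)

A = np.array([[3.0, 1.0], [-1.0, 4.0]])
print(relative_cond_ave_inf(A))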

Proposition 20

For spectral norm we have

$$\begin{aligned} \max _{\Vert d\Vert _\infty \le 1}\Vert A-\hbox {diag}(d)\Vert _2 \le \Vert A\Vert _2+1. \end{aligned}$$

Moreover, it holds with equality when A is symmetric.

Proof

We have \(\Vert A-\hbox {diag}(d)\Vert _2\le \Vert A\Vert _2+\Vert \hbox {diag}(d)\Vert _2\le \Vert A\Vert _2+1\). \(\square \)

5 Error bounds and convergence analysis

As mentioned earlier, error bounds have been employed as a powerful tool for the analysis of iterative methods. In this section, we study two well-known algorithms, a generalized Newton method [30] and the Picard method [45], for solving AVE. By using error bounds, we provide some sufficient conditions for convergence. In addition, we establish the linear convergence of the aforementioned methods. Our approach is in the spirit of convergence analysis in [27, 28, 47].

Mangasarian [30] proposed a generalized Newton method for solving AVE. In this method, the starting point \(x^1\) is chosen arbitrarily and the Newton iteration is as follows,

$$\begin{aligned} (A-D^k)x^{k+1}=b, \quad k=1, 2, \dots \end{aligned}$$
(18)

where \(D^k=\hbox {diag}(\hbox {sgn}(x^k))\). The function \(U(x)=\Vert Ax-|x|-b\Vert _2\) is non-negative and \(U({\bar{x}})=0\) if and only if \({\bar{x}}\) is a solution of AVE. In the literature, function U is called a potential function or a Lyapunov function.
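A sketch of iteration (18) in NumPy (the stopping rule and the test data are ours):

import numpy as np

def generalized_newton(A, b, x0, tol=1e-10, max_iter=100):
    # iteration (18): (A - D^k) x^{k+1} = b with D^k = diag(sgn(x^k))
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x = np.linalg.solve(A - np.diag(np.sign(x)), b)
        if np.linalg.norm(A @ x - np.abs(x) - b, 2) <= tol:   # U(x) small enough
            break
    return x

rng = np.random.default_rng(2)
A = 6 * np.eye(4) + 0.5 * rng.standard_normal((4, 4))   # so that sigma_min(A) > 3
b = rng.standard_normal(4)
x = generalized_newton(A, b, np.zeros(4))
print(np.linalg.norm(A @ x - np.abs(x) - b, 2))          # residual U(x), near zero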

Mangasarian established that the generalized Newton method is convergent when \(\sigma _{\min }(A)>4\); see Proposition 7 in [30]. Cruz et al. proved the convergence under the weaker assumption \(\sigma _{\min }(A)>3\); see Remark 3 in [12]. In the next theorem, we show the convergence of the generalized Newton method by a different approach; namely, we prove the linear convergence of \(\{U(x^k)\}\) by using error bounds.

Theorem 10

If \(\sigma _{\min }(A)>3\), then the generalized Newton iteration (18) converges linearly from any starting point and

$$\begin{aligned} U(x^{k+1}) \le \frac{2}{\sigma _{\min }(A)-1} U(x^{k}). \end{aligned}$$
(19)

Proof

The assumption implies that the interval matrix \([A-I, A+I]\) is regular, and consequently AVE has a unique solution. Due to the Newton iteration (18), we have

$$\begin{aligned} U(x^{k+1})&=\left\| Ax^{k+1}-\left| x^{k+1}\right| -b\right\| _2=\left\| D^kx^{k+1}-\left| x^{k+1}\right| \right\| _2 \nonumber \\&=\left\| D^k(x^{k+1}-x^k)-\left| x^{k+1}\right| +\left| x^k\right| \right\| _2 \le 2\left\| x^{k+1}-x^k\right\| _2. \end{aligned}$$
(20)

Hence, by Theorem 7, we get

$$\begin{aligned} \Vert x^{k+1}-x^\star \Vert _2 \le c_2(A) U(x^{k+1}) \le 2c_2(A)\left\| x^{k+1}-x^k\right\| _2. \end{aligned}$$
(21)

On the other hand, by (20) together with (18), we obtain

$$\begin{aligned} U(x^{k+1})-U(x^{k})&\le 2\Vert x^{k+1}-x^k\Vert _2-\Vert Ax^{k}-|x^{k}|-b\Vert _2\nonumber \\&=2\left\| x^{k+1}-x^k\right\| _2-\Vert (A-D^k)(x^{k}-x^{k+1})\Vert _2 \nonumber \\&\le -\left( \sigma _{\min }(A-D^k)-2\right) \left\| x^{k+1}-x^k\right\| _2 \nonumber \\&\le - \left( \sigma _{\min }(A)-3\right) \left\| x^{k+1}-x^k\right\| _2, \end{aligned}$$
(22)

where the last inequality follows from \(\sigma _{\min }(A-D^k)\ge \sigma _{\min }(A)-\sigma _{\max }(D^k)\). Inequalities (20) and (22) imply

$$\begin{aligned} U(x^{k+1}) \le 2\left\| x^{k+1}-x^k\right\| _2 \le 2\left( \sigma _{\min }(A)-3\right) ^{-1}\left( U(x^{k})-U(x^{k+1}) \right) , \end{aligned}$$

yielding (19). Since the function U is non-negative, inequality (19) implies that \(U(x^k)\) goes to zero as k tends to infinity. Hence, by (21), \(\Vert x^{k+1}-x^\star \Vert _2\) tends to zero and the algorithm is convergent. Moreover, inequality (19) implies the linear convergence of \(\{U(x^k)\}\). \(\square \)

In the next theorem, we prove that the generalized Newton method is convergent under the assumptions of Proposition 10. Barrios et al. employed similar assumptions to prove the convergence of a semi-smooth Newton method for the piecewise linear system \(\max (x,0)+Tx=b\), where \(T\in \mathbb {R}^{n\times n}\), \(b\in \mathbb {R}^{n}\); see Theorem 3 in [2]. Note that the piecewise linear system \(\max (x,0)+Tx=b\) is equivalent to \(-(I+2T)x-|x|=-2b\). To prove the convergence of the generalized Newton method, we use the potential function \(W(x)=\Vert x-x^\star \Vert _1\).

Theorem 11

Let \((AP-I)^{-1}\ge 0\) and \((AP+I)^{-1}\ge 0\) for a diagonal matrix P such that \(|\hbox {Diag}(P)|=e\). Then the generalized Newton iteration (18) converges linearly from any starting point and

$$\begin{aligned} W(x^{k+1})\le \tfrac{\ell c_1(A)-1}{\ell c_1(A)} W(x^{k}), \ k=2, 3, \ldots , \end{aligned}$$
(23)

where \(\ell =\max _{\Vert d\Vert _\infty \le 1} \Vert A-\hbox {diag}(d)\Vert _1\).

Proof

By the proof of Proposition 10, one can infer that AVE has a unique solution. Without loss of generality, we may assume that \(P=I\). For the residual function \(\phi (x)=Ax-|x|-b\), we have

$$\begin{aligned} \phi (y)-\phi (x)-(A-\hbox {diag}(\hbox {sgn}(x)))(y-x)=-|y|+\hbox {diag}(\hbox {sgn}(x))y\le 0, \end{aligned}$$
(24)

for \( x, y\in \mathbb {R}^n\). By (24) together with \(\phi (x^{k})=(A-D^k)(x^{k}-x^{k+1})\), we get

$$\begin{aligned} \phi (x^{k+1})\le 0, \ k=1, 2, \ldots . \end{aligned}$$
(25)

By virtue of (24) for \(k\ge 2\), we obtain

$$\begin{aligned}&\phi (x^\star )-\phi (x^k)-(A-D^k)(x^\star -x^k)\le 0 \\&\Rightarrow \ x^k\le x^k-(A-D^k)^{-1}\phi (x^k)\le x^\star \end{aligned}$$

where the last inequalities follow from \((A-D^k)^{-1}\ge 0\) and (25). Because \(x^{k+1}=x^k-(A-D^k)^{-1}\phi (x^k)\), we get

$$\begin{aligned} x^k\le x^{k+1}\le x^\star , \ k=2, 3, \ldots . \end{aligned}$$
(26)

By the above inequality, we get

$$\begin{aligned} W(x^{k+1})-W(x^k)&\le -\Vert x^{k+1}-x^k\Vert _1. \end{aligned}$$
(27)

By using Theorem 7 and \(\phi (x^{k})=(A-D^k)(x^{k}-x^{k+1})\), we have

$$\begin{aligned} \Vert x^{k}-x^\star \Vert _1 \le c_1(A)\Vert \phi (x^{k})\Vert _1 \le \ell c_1(A)\left\| x^{k+1}-x^k\right\| _1. \end{aligned}$$
(28)

We can infer from inequalities (26)–(28),

$$\begin{aligned} W(x^{k+1})&\le -\Vert x^{k+1}-x^k\Vert _1+\Vert x^{k}-x^\star \Vert _1\le (\ell c_1(A)-1)\Vert x^{k+1}-x^k\Vert _1\\&=(\ell c_1(A)-1)(W(x^{k})-W(x^{k+1})), \end{aligned}$$

from which the statement follows. \(\square \)

Note that, under the assumptions of Theorem 11, the unique solution of AVE can be obtained by solving just one linear program; see Proposition 4 in [52]. In the following proposition, we establish the finite convergence of the generalized Newton method under some mild conditions.

Proposition 21

Let \([A-I, A+I]\) be regular. If \(\{x^k\}\) converges to \(x^\star \), then the generalized Newton method is finitely convergent.

Proof

First, we consider the case that all components of \(x^\star \) are non-zero. In this case, the proof follows from the fact that we have \(D^k=\hbox {diag}(\hbox {sgn}(x^\star ))\) for \(x^k\) sufficiently close to \(x^\star \). Now, we investigate the case that some components of \(x^\star \) are zero. Let \({\mathcal {K}}=\{i: x^\star _i=0\}\). Due to the regularity of \([A-I, A+I]\), it is seen that \(x^\star \) is the unique solution of the linear system \((A-D)x=b\) for any diagonal matrix D with

$$\begin{aligned} D_{ii}= {\left\{ \begin{array}{ll} -1,\ 0 \hbox { or }1 &{} \hbox {if } i\in {\mathcal {K}} \\ \hbox {sgn}(x^\star _i), &{} \hbox {otherwise}. \end{array}\right. } \end{aligned}$$

Hence, for \(x^k\) sufficiently close to \(x^\star \), we have \((A-D^k)x^{\star }=b\) and the proof is complete. \(\square \)

In what follows, we investigate the Picard iterative method for solving AVE. We refer the interested reader to Chapter 7 in [37] for more information on this method.

The Picard iterative method was employed by Rohn et al. [45] for tackling AVE. The method can be summarized as follows

$$\begin{aligned} x^{k+1}=A^{-1}\left( |x^k|+b\right) , \ \ \ \ k=1, 2, \dots , \end{aligned}$$
(29)

where \(x^1\in \mathbb {R}^n\) is an arbitrary point. They proved that the Picard method (29) is convergent if \(\rho (|A^{-1}|)<1\); see Theorem 2 in [45]. The next proposition gives a sufficient condition for the convergence by using error bounds.
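Before that, iteration (29) itself can be sketched in NumPy as follows (the stopping rule and the test data are ours; in practice one would factorize A once and reuse the factorization):

import numpy as np

def picard(A, b, x0, tol=1e-10, max_iter=500):
    # iteration (29): x^{k+1} = A^{-1}(|x^k| + b)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = np.linalg.solve(A, np.abs(x) + b)
        if np.linalg.norm(x_new - x, 2) <= tol:
            return x_new
        x = x_new
    return x

rng = np.random.default_rng(3)
A = 4 * np.eye(3) + 0.5 * rng.standard_normal((3, 3))   # sigma_min(A) > 1
b = rng.standard_normal(3)
x = picard(A, b, np.zeros(3))
print(np.linalg.norm(A @ x - np.abs(x) - b, 2))          # near zero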

Proposition 22

If \(\sigma _{\min }(A)>1\), then the Picard method (29) converges from any starting point and

$$\begin{aligned} U(x^{k+1})\le \frac{1}{\sigma _{\min }(A)^k} U(x^{1}). \end{aligned}$$

Proof

We follow the analogous arguments used in Theorem 10. By (29),

$$\begin{aligned} U(x^{k+1})=\left\| Ax^{k+1}-\left| x^{k+1}\right| -b\right\| _2=\left\| |x^{k}|-|x^{k+1}|\right\| _2\le \left\| x^{k+1}-x^k\right\| _2. \end{aligned}$$
(30)

Due to Theorem 7, we have

$$\begin{aligned} \Vert x^{k+1}-x^\star \Vert \le c_2(A) U(x^{k+1})\le c_2(A)\left\| x^{k+1}-x^k\right\| _2. \end{aligned}$$

By virtue of (29) and (30), we get

$$\begin{aligned} U(x^{k+1})-U(x^{k})&\le \left\| x^{k+1}-x^k\right\| _2-\left\| Ax^{k}-|x^{k}|-b\right\| _2\\&=\left\| x^{k+1}-x^k\right\| _2-\left\| A(x^{k+1}-x^k)\right\| _2\\&\le - (\sigma _{\min }(A)-1)\left\| x^{k+1}-x^k\right\| _2. \end{aligned}$$

The rest of the proof is analogous to that of Theorem 10. \(\square \)

It is worth mentioning that the conditions \(\sigma _{\min }(A)>1\) and \(\rho (|A^{-1}|)<1\) do not necessarily imply each other. To prove the convergence under the assumption \(\rho (|A^{-1}|)<1\) by using this framework, one needs to modify the potential function U. Let \(\rho (|A^{-1}|)<\gamma <1\). Similarly to the proof of Theorem 8, there exists an invertible matrix B with \(|A^{-1}|<B\) and \(\rho (B)= \gamma \). In addition, for some \(v>0\), is a norm with . We define the potential function .

Proposition 23

If \(\rho (|A^{-1}|)<\gamma <1\), then the Picard method (29) converges from any starting point and

$$\begin{aligned} V(x^{k+1})\le \gamma ^{k}V(x^{1}). \end{aligned}$$

Proof

By virtue of (29),

(31)

where the last inequality follows from . By Theorem 8, we have

Equations (29) and (31) imply that

The rest of the proof is analogous to the proof of Theorem 10. \(\square \)

6 Error bounds for the absolute value equations with multiple solutions

This section studies error bounds for AVE when the solution set is non-empty and the interval matrix \([A-I, A+I]\) is not necessarily regular. In this setting, AVE may have multiple or even infinitely many solutions. Let \(X^\star \) denote the solution set of AVE. It is easily seen that \(X^\star \) may be written as a finite union of polyhedral sets.

By the locally upper Lipschitzian property of polyhedral set-valued mappings, see Proposition 1 in [40], AVE has the local error bounds property. That is, there exist \(\epsilon >0\) and \(\kappa >0\) such that

$$\begin{aligned} {{\,\mathrm{dist}\,}}_{X^\star }(x)\le \kappa \Vert Ax-|x|-b\Vert , \end{aligned}$$
(32)

when \(\Vert Ax-|x|-b\Vert \le \epsilon \). However, the global error bound property does not hold in general. The following example illustrates this point.

Example 3

Consider the system AVE in the form

$$\begin{aligned} \tfrac{1}{4}\begin{pmatrix}3 &{}\quad 1\\ 2 &{}\quad 2\end{pmatrix} \begin{pmatrix}x_1\\ x_2\end{pmatrix}- \begin{pmatrix}|x_1|\\ |x_2|\end{pmatrix}- \begin{pmatrix}-4\\ -4\end{pmatrix}=0. \end{aligned}$$

One can check that \(X^\star =\left\{ (-2,-2)^T,(-3,5)^T,(10,-6)^T \right\} \). Let \(x(t)=(t,t)^T\), where \(t>0\). It is seen that \(\Vert Ax(t)-|x(t)|-b\Vert =\Vert b\Vert \), while \({{\,\mathrm{dist}\,}}_{X^\star }(x(t))\) tends to infinity as \(t\rightarrow \infty \).
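The solution set in Example 3 can be recovered by checking all sign patterns, since within the orthant \(\hbox {diag}(s)x\ge 0\) the equation reduces to the linear system \((A-\hbox {diag}(s))x=b\); a sketch (the singular orthant is simply skipped here):

import numpy as np
from itertools import product

A = 0.25 * np.array([[3.0, 1.0], [2.0, 2.0]])
b = np.array([-4.0, -4.0])
solutions = []
for s in product([-1.0, 1.0], repeat=2):
    B = A - np.diag(s)
    if abs(np.linalg.det(B)) < 1e-12:      # A - I is singular for this example
        continue
    x = np.linalg.solve(B, b)
    if np.all(np.sign(x) * np.array(s) >= 0):
        solutions.append(x)
print(np.array(solutions))                 # (-2,-2), (-3,5) and (10,-6), as claimed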

In the next theorem, we give a sufficient condition under which the global error bounds hold.

Theorem 12

Let \(X^\star \) be non-empty. If zero is the unique solution of \(Ax-|x|=0\), then there exists \(\kappa >0\) such that

$$\begin{aligned} {{\,\mathrm{dist}\,}}_{X^\star }(x)\le \kappa \Vert Ax-|x|-b\Vert , \ \ \ \forall x\in \mathbb {R}^n. \end{aligned}$$
(33)

Proof

The idea of the proof is similar to that of Theorem 2.1 in [34]. Suppose to the contrary that (33) does not hold. Hence, for each \(k\in \mathbb {N}\), there exists \(x^k\) such that

$$\begin{aligned} \Vert x^k-{\bar{x}}\Vert \ge {{\,\mathrm{dist}\,}}_{X^\star }(x^k)> k\Vert Ax^k-|x^k|-b\Vert , \end{aligned}$$
(34)

where \({\bar{x}}\in X^\star \). Due to the local error bounds property (32), there exists \(\epsilon >0\) such that \(\Vert Ax^k-|x^k|-b\Vert >\epsilon \) for each \(k\ge k_0\), where \(k_0\) is sufficiently large. Consequently, \(\Vert x^k-{\bar{x}}\Vert \) tends to infinity as \(k\rightarrow \infty \). Choosing subsequences if necessary, we may assume that \(\frac{x^k}{\Vert x^k\Vert }\) goes to a non-zero vector d. By dividing both sides of (34) by \(k\Vert x^k\Vert \) and taking the limit as k goes to infinity, we get

$$\begin{aligned} Ad-|d|=0, \end{aligned}$$

which contradicts the assumptions. \(\square \)

7 Conclusion

In this paper, we studied error bounds for absolute value equations. We suggested formulas for the computation of error bounds for certain classes of matrices. The investigation of other classes of matrices may be of interest for further research. The proposed formulas can be employed not only for the absolute value equations obtained by transforming the linear complementarity problem, but also for the linear complementarity problem itself. In addition, we showed that for a general matrix the computation of the condition number is NP-hard, except possibly for the 2-norm, for which the complexity remains an open problem. To demonstrate the importance of the error bounds, we applied them in the convergence analysis of two methods used to solve the absolute value equations.