Error bounds and a condition number for the absolute value equations

Absolute value equations have been intensively studied recently, due to their relation to the linear complementarity problem. In this paper, we present error bounds for absolute value equations. Along with the error bounds, we introduce an appropriate condition number. We consider general scaled matrix p-norms, as well as particular p-norms. We discuss basic properties of the condition number, its computational complexity, its bounds, and exact values for special classes of matrices. We also consider matrices that arise from the transformation of the linear complementarity problem.

1. Introduction. We consider the absolute value equation problem of finding an x ∈ R^n such that

Ax − |x| = b, (AVE)

where A ∈ R^{n×n}, b ∈ R^n and |·| denotes the entrywise absolute value. A slightly more general form of (AVE) was introduced by Rohn [36], written as Ax + B|x| = b, where B ∈ R^{n×n}, but we will deal merely with (AVE). Many methods, including Newton-like methods [12,27,45] and concave optimization methods [29,30], have been developed for (AVE). An important point concerning numerical methods is the precision of the computed solution. To the best knowledge of the authors, only a few papers are devoted to this subject for (AVE); for instance, see [1,42,43]. Wang et al. [42,43] use interval methods for numerical validation. In addition, some general bounds for the solution set were presented in [20]. One effective method for numerical validation is the error bound method [33].
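To make the problem concrete, the following sketch implements a generalized Newton iteration in the spirit of the Newton-like methods cited above: at each step the sign pattern of the current iterate fixes a generalized Jacobian A − diag(sgn(x)), and a linear system is solved. The instance below is hypothetical, chosen so that the iteration converges quickly.

```python
import numpy as np

# Sketch of a generalized Newton iteration for A x - |x| = b
# (in the spirit of the Newton-like methods cited in the text);
# the 2x2 instance below is hypothetical.
def generalized_newton(A, b, x0, iters=50, tol=1e-12):
    x = x0.astype(float)
    for _ in range(iters):
        D = np.diag(np.sign(x))            # generalized Jacobian of x -> |x|
        x_new = np.linalg.solve(A - D, b)  # solve (A - diag(sgn(x))) x = b
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

A = np.array([[3.0, 1.0], [0.0, 2.0]])
b = np.array([-2.5, 0.5])
x = generalized_newton(A, b, np.ones(2))
assert np.allclose(A @ x - np.abs(x), b)   # x solves the AVE
```

Here σ_min(A) > 1, which is a standard sufficient condition for unique solvability, so the iteration is well behaved on this instance.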
Error bounds play a crucial role in the theoretical and numerical analysis of linear algebraic and optimization problems [11,13,14,18,33]. In this paper, we study error bounds for (AVE). Indeed, under the assumption guaranteeing unique solvability for each b ∈ R^n, we compute upper bounds for ‖x − x⋆‖, the distance to the solution x⋆, in terms of a computable residual function. We discuss various kinds of norms and investigate special classes of matrices.
It is well-known that a linear complementarity problem can be formulated as an absolute value equation [28]. In fact, it is one of the main applications of absolute value equations. In Section 3, we study error bounds for absolute value equations obtained by the reformulation of linear complementarity problems. In addition, thanks to the given results, we provide a new error bound for linear complementarity problems.
The paper is organized as follows. After reviewing terminology and notation, we investigate error bounds for absolute value equations in Section 2. Section 3 is devoted to linear complementarity problems. We study the relative condition number of AVE in Section 4.
1.1. Notation. The n-dimensional Euclidean space is denoted by R^n. Vectors are considered to be column vectors and the superscript T represents the transpose operation. We use e and I to denote the vector of ones and the identity matrix, respectively. We denote an arbitrary scaled p-norm on R^n by ‖·‖, that is, ‖x‖ = ‖Dx‖_p for a positive diagonal matrix D and a p-norm ‖·‖_p. In particular, ‖·‖_1, ‖·‖_2 and ‖·‖_∞ stand for the 1-norm, 2-norm and ∞-norm, respectively. We use sgn(x) to denote the sign vector of x.
Let A and B be n × n matrices. We denote the smallest singular value and the spectral radius of A by σ_min(A) and ρ(A), respectively. We use λ(A) to denote the vector of eigenvalues of a symmetric matrix A, and λ_min(A) and λ_max(A) stand for the smallest and the largest eigenvalue, respectively. For a given norm ‖·‖ on R^n, ‖A‖ denotes the matrix norm induced by ‖·‖, which is defined as ‖A‖ = max{‖Ax‖ : ‖x‖ = 1}. The matrix inequality A ≥ B, the absolute value |A| and max(A, B) are understood entrywise. For d ∈ R^n, diag(d) stands for the diagonal matrix whose diagonal entries are the components of d. In contrast, Diag(A) denotes the vector of diagonal elements of A. The ith row and ith column of A are denoted by A_{i*} and A_{*i}, respectively. We denote the comparison matrix of A by ⟨A⟩, which is defined as ⟨A⟩_{ii} = |a_{ii}| and ⟨A⟩_{ij} = −|a_{ij}| for i ≠ j. We recall the following definitions for an n × n real matrix A:
• A is a P-matrix if each principal minor of A is positive.
• A is an H-matrix if its comparison matrix ⟨A⟩ is an M-matrix.
We will exploit some results from interval linear algebra, so we recall several notions from this discipline. For two n × n matrices A̲ and Ā with A̲ ≤ Ā, the interval matrix A = [A̲, Ā] is defined as A = {A : A̲ ≤ A ≤ Ā}. An interval matrix A is called regular if each A ∈ A is nonsingular. Furthermore, we define the inverse of a regular interval matrix A as A^{-1} := {A^{-1} : A ∈ A}. Note that the inverse of an interval matrix is not necessarily an interval matrix.
In this paper, generalized Jacobian matrices [9] are used in the presence of nonsmooth functions. Let f : R^n → R^m be a locally Lipschitz function. The generalized gradient of f at x̄, denoted by ∂f(x̄), is defined as ∂f(x̄) = co{ lim_{i→∞} ∇f(x_i) : x_i → x̄, x_i ∉ X_f }, where X_f is the set of points at which f is not differentiable and co(S) denotes the convex hull of a set S.
2. Error bounds for the absolute value equations. Consider an absolute value equation (AVE). It is known that (AVE) has a unique solution for each b ∈ R^n if and only if the interval matrix [A − I, A + I] is regular; see Theorem 3.3 in [44]. That is why in many statements below we assume that the interval matrix [A − I, A + I] is regular. In this case, we denote the unique solution of (AVE) by x⋆.
Theorem 2.1. Let the interval matrix [A − I, A + I] be regular. Then for each x ∈ R^n,

‖x − x⋆‖ ≤ max_{‖d‖_∞ ≤ 1} ‖(A − diag(d))^{-1}‖ · ‖Ax − |x| − b‖.

Proof. Note that due to regularity of [A − I, A + I], the right-hand side of the above inequality is finite. Define the residual function φ : R^n → R^n by φ(x) = Ax − |x| − b. By the mean value theorem, see Theorem 8 in [19], there exists a matrix Â ∈ [A − I, A + I] such that φ(x) − φ(x⋆) = Â(x − x⋆), that is, Ax − |x| − b = Â(x − x⋆). By multiplying Â^{-1} on both sides and using the induced norm property, we obtain ‖x − x⋆‖ ≤ ‖Â^{-1}‖ · ‖Ax − |x| − b‖. Since Â = A − diag(d) for some d with ‖d‖_∞ ≤ 1, this completes the proof.
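The residual bound can be verified numerically. The sketch below uses a hypothetical 2×2 instance; the maximum over the box is evaluated by enumerating sign vectors, which is justified by Proposition 2.3 further below.

```python
import numpy as np
from itertools import product

# Numerical check (hypothetical instance) of the residual error bound:
#   ||x - x_star||_2 <= max_{||d||_inf <= 1} ||(A - diag(d))^{-1}||_2
#                       * ||A x - |x| - b||_2.
# By Proposition 2.3 the maximum is attained at a sign vector, so for
# tiny n we may enumerate the 2^n vertices of the box.
A = np.array([[3.0, 1.0], [0.0, 2.0]])
x_star = np.array([-0.75, 0.5])
b = A @ x_star - np.abs(x_star)          # b constructed so x_star solves AVE

c = max(np.linalg.norm(np.linalg.inv(A - np.diag(s)), 2)
        for s in product([-1.0, 1.0], repeat=2))

x = x_star + np.array([0.3, -0.2])       # a perturbed point
res = np.linalg.norm(A @ x - np.abs(x) - b, 2)   # computable residual
assert np.linalg.norm(x - x_star, 2) <= c * res + 1e-12
```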
To take advantage of this bound, we need to compute the optimal value of the following optimization problem,

c(A) := max { ‖(A − diag(d))^{-1}‖ : ‖d‖_∞ ≤ 1 }. (2.2)

We call the optimal value of (2.2) the condition number of the absolute value equation (AVE) with respect to the norm ‖·‖. In addition, we denote the condition number with respect to the 1-norm, 2-norm and ∞-norm by c_1(A), c_2(A) and c_∞(A), respectively. By properties of matrix norms, we have the following results.
Proof. Parts i) and ii) are straightforward. Part iii) follows from the fact that

In the next proposition, we show that optimization problem (2.2) attains its maximum at some vertex of the box {d : ‖d‖_∞ ≤ 1}.

Proof. We will show that problem (2.2) has an optimal solution whose components are either one or minus one. As the feasible set is compact, problem (2.2) attains its maximum. Let d̂ be an optimal solution. If d̂ is a vertex of {d : ‖d‖_∞ ≤ 1}, the proof is complete. Otherwise, without loss of generality, suppose that |d̂_1| < 1. Let f : [−1, 1] → R be given by f(t) = ‖(A − diag((t, ď)))^{-1}‖, where ď is obtained by removing the first component of d̂. By the Sherman–Morrison formula [21], f is well-defined for t ∈ [−1, 1]. Consider the optimization problem max_{t∈[−1,1]} f(t). Since the dependence on t is, by the Sherman–Morrison formula, through a term that is convex and strictly monotone on [−1, 1], f is convex on its domain [4], and consequently max_{t∈[−1,1]} f(t) = max{f(−1), f(1)}. Hence, due to optimality of d̂, we get a new point that is optimal to (2.2), agrees with d̂ in all components except the first, and whose first component is either one or minus one. Along the same lines, one can obtain an optimal solution d̃ with |d̃| = e, which completes the proof.
By Proposition 2.3, to handle problem (2.2), one needs to check only the vertices of {d : ‖d‖_∞ ≤ 1}. As the number of vertices is 2^n, this method may not be effective for large n. Indeed, problem (2.2) is NP-hard in general. It is known that for any rational p ∈ [1, ∞) except p = 1, 2, the computation of the matrix p-norm of a given matrix is NP-hard [17]. Consequently, problem (2.2) is NP-hard for any rational p ∈ [1, ∞) except possibly p = 1, 2. Below, we prove intractability for the 1-norm, whence it is NP-hard for the ∞-norm, too. We conjecture that it is also NP-hard for the 2-norm.
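The transfer of hardness from the 1-norm to the ∞-norm rests on the duality ‖B‖_1 = ‖B^T‖_∞ together with (A − diag(d))^{-T} = (A^T − diag(d))^{-1}, which gives c_1(A) = c_∞(A^T). The following sketch checks this identity by brute-force vertex enumeration on a hypothetical, safely nonsingular instance.

```python
import numpy as np
from itertools import product

# Check (hypothetical instance) of the duality behind the hardness transfer:
# since ||B||_1 = ||B^T||_inf and (A - diag(d))^{-T} = (A^T - diag(d))^{-1},
# we have c_1(A) = c_inf(A^T). Vertex enumeration suffices by Prop. 2.3.
# A is strictly diagonally dominant with margin > 1, so every A - diag(s)
# with |s| = e is nonsingular.
A = np.array([[4.0, 1.0, -1.0],
              [0.5, -5.0, 1.0],
              [1.0, 2.0, 6.0]])

def cond_number(A, ord):
    return max(np.linalg.norm(np.linalg.inv(A - np.diag(s)), ord)
               for s in product([-1.0, 1.0], repeat=A.shape[0]))

assert np.isclose(cond_number(A, 1), cond_number(A.T, np.inf))
```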
Proof. By the triangle inequality and ‖u‖ = 1, the statement follows.
Proof. By [35], solving the problem

max e^T|x| subject to |Ax| ≤ e (2.3)

is NP-hard. Even more, it is intractable even with accuracy less than 1/2 when A^{-1} is a so-called MC-matrix [35]. Recall that M ∈ R^{n×n} is an MC-matrix if it is symmetric, its diagonal entries equal n and its off-diagonal entries lie in {0, −1}; hence λ_max(M) ≤ 2n − 1. Therefore λ_min(A) ≥ 1/(2n − 1), and we can achieve λ_min(A) > 1 by a suitable scaling. As a consequence, [A − I, A + I] is regular.
Feasible solutions to the above optimization problem can be equivalently characterized as follows. Introducing an auxiliary variable z = 1, we obtain an extended system. Rewrite the system accordingly and let α > 0 be sufficiently large; the system then reads equivalently. Now, we relax the system by introducing intervals on the remaining diagonal entries. That is why we analytically express the inverse matrix (notice that it exists due to regularity of [αA − I, αA + I]). If the 1-norm is attained for the last column, the claim follows. Otherwise, since α > 0 is arbitrarily large, the 1-norm is attained for no column of the middle part. Suppose that the norm is attained for the ith column of the first column block. We compare the norms of this column and the last column of M(D, D′, d)^{-1}. We compare their three blocks separately. Obviously, for the last entry the latter is larger. Since C → 0 as α → ∞, the first block of entries of the former vector is arbitrarily small and negligible. Thus we focus on the second block. The former vector has entries αC_{*i}. In view of Lemma 2.4, one can choose a suitable D̃ such that |D̃| = I and ‖αC_{*i}‖_1 ≤ ‖αCD̃e‖_1 = ‖αC_{*i} + α Σ_{j≠i} C_{*j} d̃_{jj}‖_1. Furthermore, one can select a matrix D̃′ with |D̃′| = I and ‖e + D′CD̃e‖_1 = ‖e + D̃′CD̃e‖_1. Because c_1(M(0, 0, 0)) = ‖M(D̃, D̃′, −1)^{-1}‖_1, the given matrices D̃ and D̃′ fulfill the claim.
Claim B. The 1-norm of the last column is arbitrarily close to 1 + n + e^T|A^{-1}De|.
Proof of Claim B. The last entry of the column is 1. Since C → 0 as α → ∞, the first block tends to e as α → ∞. The second block reads as displayed.
By Claim B, the 1-norm of the last column is larger by 1 + n than the objective value of (2.3). So by maximizing the 1-norm of M(D, D′, d)^{-1} we can deduce the maximum of (2.3) with arbitrary precision. Notice that e^T|A^{-1}|e is an upper bound on (2.3) and has polynomial size, so we can also find an α of polynomial size by standard means (cf. [39]).
In general, the computation of c(A) is not easy. However, computation of the condition number with respect to some norms or for some classes of matrices is not difficult. In the rest of the section, we study the given condition number from this aspect.
In what follows, we say that a matrix norm is monotone if |A| ≤ B implies ‖A‖ ≤ ‖B‖. For instance, the scaled matrix p-norms are monotone.
Proof. By Proposition 2.3, we need to check only the vertices of {d : ‖d‖_∞ ≤ 1}. Let d be such that |d| = e and denote D := diag(d). Then, by monotonicity of the matrix norm, the claim follows.
Proof. Note that the assumption implies that (AVE) has a unique solution, see Theorem 4 in [38], and [A − I, A + I] is regular. Due to the continuity of eigenvalues with respect to the matrix entries, there exists a matrix B with |A^{-1}| < B and ρ(B) = γ. By the Perron–Frobenius theorem, there exists v > 0 such that Bv = ρ(B)v.
By the Neumann series theorem [21], (I − |A^{-1}|)^{-1} and (I − B)^{-1} exist and are nonnegative. Hence the claimed bound follows; moreover, it holds for every d with ‖d‖_∞ ≤ 1. One may wonder why we do not use the well-known result stating that for each ε > 0 there exists a matrix norm ‖·‖ with ‖A‖ ≤ ρ(A) + ε, see Lemma 5.6.10 in [21], to prove the above theorem. The underlying reason is that the matrix norm given by this result is not necessarily a scaled matrix p-norm. It is worth mentioning that, under the assumptions of Theorem 2.7, when |A^{-1}| > 0, one obtains the bound for some scaled 1-norm. Note that a sufficient condition for having ρ(|A^{-1}|) < 1 is the existence of a diagonal matrix S with |S| = I such that A^{-1}S ≥ 0 and (A − S)^{-1}S ≥ 0. In fact, Theorem 5.2 in Chapter 7 of [3] implies that ρ(A^{-1}S) < 1 under this condition, which is equivalent to ρ(|A^{-1}|) < 1.
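The Neumann-series mechanism can be illustrated numerically: when ρ(|A^{-1}|) < 1, expanding (A − diag(d))^{-1} = Σ_k (A^{-1} diag(d))^k A^{-1} and taking entrywise absolute values gives |(A − diag(d))^{-1}| ≤ (I − |A^{-1}|)^{-1}|A^{-1}|. The instance below is hypothetical.

```python
import numpy as np
from itertools import product

# Numerical check (hypothetical instance) of the Neumann-series bound:
# if rho(|A^{-1}|) < 1, then for every d with ||d||_inf <= 1, entrywise
#   |(A - diag(d))^{-1}| <= (I - |A^{-1}|)^{-1} |A^{-1}|,
# so any monotone norm of the left side is bounded by that of the right.
A = np.array([[4.0, -1.0, 0.0],
              [1.0, 5.0, 1.0],
              [0.0, -1.0, 4.0]])
Ainv_abs = np.abs(np.linalg.inv(A))
assert max(abs(np.linalg.eigvals(Ainv_abs))) < 1      # hypothesis holds

bound = np.linalg.inv(np.eye(3) - Ainv_abs) @ Ainv_abs
for s in product([-1.0, 1.0], repeat=3):
    inv = np.linalg.inv(A - np.diag(s))
    assert np.all(np.abs(inv) <= bound + 1e-12)
```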
An error bound can be utilized as a tool in stability analysis [10,14]. As mentioned earlier, (AVE) has a unique solution for each b ∈ R^n if and only if [A − I, A + I] is regular. We denote the set of matrices satisfying this property by A. It is easily seen that A is an open set. Let the function X : A × R^n → R^n return the solution of (AVE). In the following proposition, we list some properties of the function X.
ii) Function X is locally Lipschitz with modulus c(A).
Proof. First, we prove part i). Suppose that X(A, b_1) = x_1 and X(A, b_2) = x_2. Thus Ax_1 − |x_1| = b_1 and Ax_2 − |x_2| = b_2. There exists a matrix D ∈ [−I, I] such that |x_2| − |x_1| = D(x_2 − x_1). So the above equalities can be written as (A − D)(x_2 − x_1) = b_2 − b_1. Now, we prove part ii). Consider the locally Lipschitz function φ : A × R^n × R^n → R^n given by φ(A, b, x) = Ax − |x| − b. As [A − I, A + I] is regular, the implicit function theorem (see Chapter 7 in [9]) implies that there exists a locally Lipschitz function X(A, b) : A × R^n → R^n with φ(A, b, X(A, b)) = 0. In addition, the Lipschitz estimate holds where A_1, A_2 and b_1, b_2 are in some neighborhoods of A and b, respectively.
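The identity (A − D)(x_2 − x_1) = b_2 − b_1 in the proof yields the Lipschitz estimate ‖x_1 − x_2‖ ≤ c(A)‖b_1 − b_2‖, which can be observed numerically. The sketch below solves tiny AVEs by brute force over sign patterns; the instance is hypothetical.

```python
import numpy as np
from itertools import product

# Illustration (hypothetical instance) of Lipschitz dependence on b:
#   ||X(A, b1) - X(A, b2)||_2 <= c(A) * ||b1 - b2||_2.
def solve_ave(A, b):
    # Brute force over sign patterns s: x = (A - diag(s))^{-1} b must have
    # sgn(x) compatible with s (zeros are allowed either sign).
    for s in product([-1.0, 1.0], repeat=A.shape[0]):
        x = np.linalg.solve(A - np.diag(s), b)
        if np.all(np.sign(x) * np.array(s) >= 0):
            return x
    raise ValueError("no solution found")

A = np.array([[3.0, 1.0], [0.0, 2.0]])
c = max(np.linalg.norm(np.linalg.inv(A - np.diag(s)), 2)
        for s in product([-1.0, 1.0], repeat=2))

b1, b2 = np.array([-2.5, 0.5]), np.array([-2.0, 1.0])
x1, x2 = solve_ave(A, b1), solve_ave(A, b2)
assert np.linalg.norm(x1 - x2, 2) <= c * np.linalg.norm(b1 - b2, 2) + 1e-12
```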
As mentioned earlier, one class of effective approaches for handling (AVE) is concave optimization. Mangasarian [29] proposed the concave optimization problem (2.6) and showed that (AVE) has a solution if and only if the optimal value of (2.6) is zero. Now, we show that (2.6) has the weak sharp minima property. Consider an optimization problem min_{x∈X} f(x) with optimal solution set S. The set S is called a set of weak sharp minima if there is an α > 0 such that f(x) − min_{s∈S} f(s) ≥ α dist_S(x) for each x ∈ X, where dist_S(x) := min{‖x − s‖_2 : s ∈ S}. The notion of weak sharp minima has wide applications in the convergence analysis of iterative methods and in error bounds [5,6]. Proposition 2.9. Let A ∈ A. Then the optimal solution of (2.6) is a weak sharp minimum.
Proof. Let X and x⋆ denote the feasible set and the unique solution of (2.6), respectively. By Theorem 2.1, c_2(A) ∈ R_+ and (1/c_2(A))‖x − x⋆‖_2 is bounded above by the objective value of (2.6) at x, which shows that x⋆ is a weak sharp minimum.

Condition number of AVE for 2-norm. Since ‖(A − diag(d))^{-1}‖_2 = 1/σ_min(A − diag(d)), the condition number c_2(A) can be computed via the following optimization problem,

min { σ_min(A − diag(d)) : ‖d‖_∞ ≤ 1 },

as the reciprocal of its optimal value. In general, the function σ_min(·) is neither convex nor concave; see Remark 5.2 in [34]. Here, σ_min(·) is considered as a function of the diagonal entries. Nonetheless, σ_min(·) is neither convex nor concave even in this case; the following example clarifies this point. From this perspective, Proposition 2.3 mentioned above is far from obvious.
In the next proposition, we give a formula for symmetric matrices. Before we get to the proposition, we present a lemma. The "only if" part follows from the Bauer–Fike theorem [2].
In the following example, we show that the bound (2.9) can be arbitrarily large while the error bound with respect to the 2-norm remains bounded.
For the matrix A, let T be given as follows. It is easily seen that T is diagonally dominant with a nonnegative diagonal, so it is positive semidefinite. Consequently, the desired equality follows.
Note that under the assumptions of Proposition 2.15, we also have the following bound, since for a permutation matrix P,

Condition number of AVE for ∞-norm.
Some upper bounds have been proposed for ‖A^{-1}‖_∞ and ‖A^{-1}‖_1; see [23,25,31,40]. As Theorem 2.1 holds for any scaled p-norm, it would be advantageous to use these norms.
Proof. By virtue of Theorem 3.6.3 in [32], A − I is an M-matrix. In addition, as M-matrices are preserved by the addition of positive diagonal matrices [3], A+I is also an M-matrix. Hence, by Kuttler's theorem [24], [A − I, A + I] is inverse nonnegative, and we proceed as in the proof of Proposition 2.17.
where the first inequality follows from the fact that for an H-matrix A, ‖A^{-1}‖_∞ ≤ ‖⟨A⟩^{-1}‖_∞; see Theorem 1 in [41].
Proof. First, we show that for a given d with ‖d‖_∞ ≤ 1, the following inequality holds. Consequently, the interval matrix [A − I, A + I] is regular. Similarly to the proof of Proposition 2.15, one can show the corresponding equality. The above equality and (2.13) imply c_{‖·‖}(A) ≤ 1/α, and the proof is complete.
3. Error bounds and a condition number of AVE related to linear complementarity problems. The study of AVE is inspired by the well-known linear complementarity problem (LCP) [28]. LCP provides a unified framework for many mathematical programs [10]. In this section, we study error bounds for AVE obtained by transforming LCPs. Consider a general linear complementarity problem (LCP): find z ≥ 0 such that Mz + q ≥ 0 and z^T(Mz + q) = 0, where M ∈ R^{n×n} and q ∈ R^n. Throughout the section, without loss of generality, we assume that one is not an eigenvalue of M, so the matrix (M − I) is nonsingular. This assumption is not restrictive, as one can rescale M and q in (LCP). Problem (LCP) can be formulated as an AVE; see [26]. The following proposition states the relationship between M and (M + I)(M − I)^{-1}; see Theorem 2 in [37]. In addition to the error bounds introduced for some classes of matrices in the former section, in the following results we propose error bounds for the absolute value equation arising from this transformation. It is worth noting that the assumption Diag(M) ≤ e is not restrictive, since LCP(M, q) is equivalent to LCP(λM, λq) for λ > 0. In the following, we investigate the case where M is an H-matrix. Before we get to the theorem, which gives a bound in this case, we need to present a lemma first. By applying the Neumann series and the obtained results, we have the stated estimate, where the last equality is obtained by using the above relations. Therefore, ‖Â^{-1}‖ ≤ (1/2)‖M^{-1} − I‖, and the proof is complete. In the rest of this section, using the obtained results, we present new error bounds for linear complementarity problems. Many papers have been devoted to error bounds for LCP(M, q); see [7,8,10,16,33]. It is easily seen that x̄ is a solution of (LCP) if and only if x̄ solves θ(x) = 0, where θ(x) := min(x, Mx + q). The function θ(x) is called the natural residual of (LCP). As mentioned earlier, (LCP) has a unique solution for each q if and only if M is a P-matrix.
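The reduction behind this section can be sketched numerically. Under the complementarity relations, setting w = Mz + q and x = (w − z)/2 gives z = |x| − x and w = |x| + x, and x satisfies (M + I)(M − I)^{-1} x − |x| = (M − I)^{-1} b-side with b = (M − I)^{-1} q; this is one standard form of the reduction (cf. [26]) and may differ from the paper's exact formulation in scaling. The instance below is hypothetical.

```python
import numpy as np

# Sketch of a standard LCP -> AVE reduction (cf. [26]) on a hypothetical
# instance. With w = Mz + q and x = (w - z)/2, complementarity gives
# z = |x| - x, w = |x| + x, and x solves
#   (M + I)(M - I)^{-1} x - |x| = (M - I)^{-1} q,
# provided 1 is not an eigenvalue of M.
M = np.array([[2.0, 0.0], [0.0, 3.0]])
q = np.array([-2.0, 1.0])

z = np.array([1.0, 0.0])                  # solves LCP(M, q)
w = M @ z + q
assert np.all(z >= 0) and np.all(w >= 0) and np.isclose(z @ w, 0)

A = (M + np.eye(2)) @ np.linalg.inv(M - np.eye(2))
b = np.linalg.solve(M - np.eye(2), q)
x = (w - z) / 2
assert np.allclose(A @ x - np.abs(x), b)  # x solves the resulting AVE
```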
For M a P-matrix, Chen and Xiang [7] proposed the error bound

‖x − x̄‖ ≤ max_{d∈[0,1]^n} ‖(I − D + DM)^{-1}‖ · ‖θ(x)‖, where D = diag(d).

Therefore, the results given in this paper can be exploited to provide an upper bound for this maximization. For instance, Chen and Xiang, see Theorem 2.2 in [7], proved that when M is an M-matrix, the maximum can be evaluated explicitly. As seen, f is a piecewise linear convex function. However, maximization of a convex function is, in general, an intractable problem. In this case, one needs to solve n linear programs.
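The natural residual θ(x) = min(x, Mx + q) appearing in the Chen–Xiang bound is directly computable; it vanishes exactly at LCP solutions. A minimal sketch on a hypothetical instance:

```python
import numpy as np

# The natural residual of (LCP), theta(x) = min(x, Mx + q): it vanishes
# exactly at solutions of the LCP. Hypothetical instance.
def natural_residual(M, q, x):
    return np.minimum(x, M @ x + q)

M = np.array([[2.0, 0.0], [0.0, 3.0]])
q = np.array([-2.0, 1.0])

x_sol = np.array([1.0, 0.0])              # solves LCP(M, q)
assert np.allclose(natural_residual(M, q, x_sol), 0.0)

x = np.array([1.2, 0.1])                  # a nearby non-solution
r = natural_residual(M, q, x)             # computable residual
assert np.linalg.norm(r, np.inf) > 0
```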
In the next proposition, we give an upper bound for the optimal value for ∞-norm.
On the other hand, suppose that ‖B‖_∞ = ‖B_{i*}‖_∞. There exists B̃ ∈ {(I − M)^{-1}X} attaining this value, which is a well-known bound; see Theorem 2.1 in [7]. Here, we obtain inequality (3.3) with a different method, as a by-product of our analysis.
4. Relative condition number of AVE. We introduce a relative condition number c*(A), which is equal to c(A) · max_{‖d‖_∞≤1} ‖A − diag(d)‖. The meaning of the relative condition number follows from the bounds presented in the proposition below. They extend the bounds known for the error of standard linear systems of equations [18].
Proof. Since b ≠ 0, we have x⋆ ≠ 0. First, we show the upper bound; denoting the residual as before, the bound follows. Now, we establish the lower bound. From the proof of Theorem 2.1 we know that there exists some Â ∈ [A − I, A + I] such that Ax − b − |x| = Â(x − x⋆). Hence the statement follows.
In order to compute c*(A), we have to determine c(A) and max_{‖d‖_∞≤1} ‖A − diag(d)‖. The former was discussed in detail in the previous sections, so we now focus on the latter. Recall that a norm is absolute if ‖A‖ = ‖|A|‖, and it is monotone if |A| ≤ |B| implies ‖A‖ ≤ ‖B‖. For example, the 1-norm, ∞-norm, Frobenius norm and max norm are both absolute and monotone. For the 2-norm, we have ‖A − diag(d)‖_2 ≤ ‖A‖_2 + 1.
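For an absolute and monotone norm, the maximum is easy: entrywise |A − diag(d)| ≤ |A| + I over the box, with equality on the diagonal when d_i = −sgn(a_ii), so max_{‖d‖_∞≤1} ‖A − diag(d)‖ = ‖|A| + I‖. The sketch below checks this closed form against brute force for the ∞-norm on a hypothetical instance.

```python
import numpy as np
from itertools import product

# Check (hypothetical instance) that for an absolute and monotone norm,
# here the inf-norm, max_{||d||_inf <= 1} ||A - diag(d)|| = || |A| + I ||;
# the maximizer is d_i = -sgn(a_ii), growing each diagonal entry by 1.
A = np.array([[1.5, -2.0], [0.5, -3.0]])

brute = max(np.linalg.norm(A - np.diag(s), np.inf)
            for s in product([-1.0, 1.0], repeat=2))
closed_form = np.linalg.norm(np.abs(A) + np.eye(2), np.inf)
assert np.isclose(brute, closed_form)
```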
Moreover, it holds with equality when A is symmetric.
5. Conclusion. In this paper, we studied error bounds for absolute value equations. We suggested formulas for the computation of error bounds for some classes of matrices. The investigation of other classes of matrices may be of interest for further research. The proposed formulas can be employed not only for absolute value equations obtained by transforming linear complementarity problems, but also for linear complementarity problems themselves. In addition, we showed that, in general, the computation of the error bounds is NP-hard, except for the 2-norm, for which the complexity remains an open problem.