Exclusion regions for parameter-dependent systems of equations

This paper presents a new interval-based algorithm for rigorously constructing inner estimates of feasible parameter regions, together with enclosures of the solution set, for parameter-dependent systems of nonlinear equations in low (parameter) dimensions. The proposed method allows us to explicitly construct feasible parameter sets around a regular parameter value and, simultaneously, to rigorously enclose a particular solution curve (resp. manifold) by a union of inclusion regions. The method is based on the calculation of inclusion and exclusion regions for zeros of square nonlinear systems of equations. Starting from an approximate solution at a fixed parameter value $p$, the new method provides an algorithmic concept for constructing a box $\mathbf{s}$ around $p$ such that for each element $s\in \mathbf{s}$ the existence of a solution can be proved within certain error bounds.


Introduction
In this paper we consider parameter-dependent nonlinear systems of equations H(x, s) = 0, (1.1) with solutions x = x(s) ∈ R^n depending on a set of parameters s ∈ R^p. We assume H : X × S ⊆ R^n × R^p → R^n to be differentiable with Lipschitz continuous derivative.
Branch and bound methods for finding all zeros of a (square) nonlinear system of equations in a box frequently have the difficulty that subboxes containing no solution cannot be eliminated if there is a nearby zero outside the box. This results in the so-called cluster effect, i.e., the creation of many small boxes by repeated splitting, whose processing may dominate the total work spent on the global search. Schichl & Neumaier [26] presented a method to reduce the cluster effect for nonlinear n × n systems of equations by computing so-called inclusion and exclusion regions around an approximate zero, with the property that a true solution lies in the inclusion region and no other solution lies in the corresponding exclusion region, which can therefore be safely discarded.
In the parameter-dependent case, it would be convenient to show the existence of such inclusion regions for a whole set of parameter values in order to rigorously identify feasible parameter boxes s where for all s ∈ s solutions x(s) of (1.1) exist. Thus, we extend the method from Schichl & Neumaier [26] to this problem class, and show how to compute parameter boxes s ⊆ S such that for each parameter set s ∈ s the existence of a solution x(s) ∈ x ⊆ X of (1.1) within a narrow inclusion box can be guaranteed.
The procedure for computing such a feasible parameter box s consists of three main steps: (i) solve (1.1) for a fixed parameter p ∈ s and compute a pair of inclusion and exclusion regions for a corresponding approximate zero z ≈ x(p) as described in [26], (ii) construct an approximation function x̂(s) : S → X for the solution curve, and (iii) extend the estimates and bounds from step (i) using slope forms in order to calculate a feasible parameter box s around p such that for all s ∈ s the existence of a solution x*(s) of (1.1) can be proved.
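The three steps can be illustrated on a hypothetical scalar family H(x, s) = x² − s (our own toy example, not one from the paper); the sampled parameter check below merely mimics the structure of step (iii) and is not rigorous interval arithmetic:

```python
import numpy as np

# Hypothetical scalar family H(x, s) = x^2 - s, used only to illustrate
# the three-step procedure; the sampling below is NOT rigorous interval
# arithmetic, it only mimics the structure of the verification.
def H(x, s):
    return x * x - s

p = 1.0

# Step (i): solve H(x, p) = 0 by Newton's method for the fixed parameter p
z = 2.0
for _ in range(30):
    z -= H(z, p) / (2 * z)          # H_x(x, s) = 2x

# Step (ii): tangent approximation of the solution curve at (z, p),
# x_hat(s) = z + Theta*(s - p) with Theta = -H_x^{-1} H_s = 1/(2z)
def x_hat(s):
    return z + (s - p) / (2 * z)

# Step (iii): for each s in a candidate box around p, check a Krawczyk-type
# contraction on [x_hat(s) - d, x_hat(s) + d] with C = H_x(z, p)^{-1}
r, d = 0.2, 0.05
C = 1.0 / (2 * z)
feasible = True
for s in np.linspace(p - r, p + r, 101):
    center_shift = abs(C * H(x_hat(s), s))
    lo, hi = x_hat(s) - d, x_hat(s) + d
    defect = max(abs(1 - C * 2 * lo), abs(1 - C * 2 * hi))  # |1 - C H_x| on box
    if center_shift + defect * d > d:
        feasible = False
```

For this family the check succeeds on the whole box [0.8, 1.2], i.e., a solution x(s) near x̂(s) exists for every sampled parameter value.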
Other known approaches. Parameter-dependent systems of equations can be solved by continuation methods (e.g., [1,2]), which trace a particular solution curve or a solution manifold, if p > 1 in (1.1). Another approach for parametric polynomial systems is the use of Gröbner bases (e.g., [7,14]). The authors of [9] proposed an iterative method to construct a linear interval enclosure of the solution set of (1.1) over a given parameter interval. Goldsztejn [4] used a weak version of the parametric Miranda theorem (see [15, Thm. 5.3.7]) to verify the existence of solutions over a given parameter interval and to compute a reliable inner estimate of the feasible parameter region. Independently from the work of Goldsztejn, the authors recently pursued a similar approach and proposed some tools utilizing Miranda's theorem and centered forms for rigorously solving parameter-dependent systems of equations [19].
Outline. The paper is organized as follows: in Section 2 we review some central results about rigorously computing solutions of square nonlinear systems of equations. Additionally, we summarize basic definitions and known results about slope forms, which will be an important tool when extending the exclusion-region concept from [26] to the parameter-dependent case. In Section 3 we outline the method introduced by Schichl & Neumaier [26], as it is the starting point for the new method. In Section 4 we then state and prove the main results of this paper and describe how to extend the inclusion/exclusion-region concept to the parameter-dependent case. In Section 5 the new method is demonstrated on a numerical example. Finally, in Section 6 a summary as well as an outlook on future work is given.
Notation. Throughout the paper, we will use the following notation. For a matrix A ∈ R^{n×m} we denote by A_{:K} the n × k submatrix consisting of the k columns with indices in K ⊆ {1, . . . , m}; similarly, A_{K:} denotes the k × m submatrix with row indices in K ⊆ {1, . . . , n}. Let F : R^m → R^n. If y = (x, s)^T ∈ R^m is a partition of y with x = y_I, s = y_J, where I, J are index sets with I ∩ J = ∅ and I ∪ J = {1, . . . , m}, then the Jacobian of F with respect to x consists of the columns of the full Jacobian with indices in I. A slope of F with center z at y is written as F[z, y]; a slope with respect to x is then given by the corresponding column submatrix. Since second order slopes (resp. first order slopes of the Jacobian) are third order tensors, we use the multiplication rules from [25] for a 3-tensor T ∈ R^{n×m×r}, a vector v ∈ R^r, and matrices C ∈ R^{s×n}, B ∈ R^{r×s}. Additionally, we define a corresponding product of a vector v ∈ R^n with a 3-tensor T ∈ R^{n×n×n}.
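The display formulas for these multiplication rules did not survive extraction; the following sketch fixes one natural set of index conventions consistent with the stated shapes (an assumption on our part, not necessarily the exact convention of [25]):

```python
import numpy as np

def tensor_vec(T, v):
    # (T v)_{ij} = sum_k T_{ijk} v_k : contracts the last index,
    # R^{n x m x r} * R^r -> R^{n x m}
    return np.einsum('ijk,k->ij', T, v)

def mat_tensor(C, T):
    # (C T)_{ijk} = sum_l C_{il} T_{ljk} : acts on the first index,
    # R^{s x n} * R^{n x m x r} -> R^{s x m x r}
    return np.einsum('il,ljk->ijk', C, T)

def tensor_mat(T, B):
    # (T B)_{ijl} = sum_k T_{ijk} B_{kl} : acts on the last index,
    # R^{n x m x r} * R^{r x s} -> R^{n x m x s}
    return np.einsum('ijk,kl->ijl', T, B)

def vec_tensor(v, T):
    # (v T)_{jk} = sum_i v_i T_{ijk} : contracts the first index,
    # R^n * R^{n x n x n} -> R^{n x n}
    return np.einsum('i,ijk->jk', v, T)
```

With these conventions, e.g., the expression C T v for a second order slope tensor T reduces to an ordinary n × m matrix, as needed in the bounds of Sections 3 and 4.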

Preliminaries and known results
Consider a twice continuously differentiable function F : D ⊆ R^n → R^n. For any two points z, x ∈ D we can always write (see [26])
F(x) = F(z) + F[z, x](x − z)
with a suitable matrix F[z, x] ∈ R^{n×n}, a so-called slope matrix for F with center z at x. While in the multivariate case the slope matrix is not uniquely determined, by differentiability we always have F[z, z] = F'(z). Assuming that the slope matrix is continuously differentiable in both points, we can similarly write
F[z, x] = F[z, z] + F[z, z, x](x − z)
with a second order slope F[z, z, x], and analogously for a general second center. Hence, the first and second order slope forms given in (2.4) and (2.5), respectively, provide enclosures for the true range of the function F over an interval x. There are recursive procedures to calculate slopes, given x and z (see [8,10,21,24]). A Matlab implementation of first order slopes is contained in INTLAB [22]; the COCONUT environment [23] also provides algorithms.
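The recursive slope procedures mentioned above can be illustrated by a minimal scalar sketch (our own toy implementation, not INTLAB's): each node propagates the triple (f(z), f(x), f[z, x]) through arithmetic operations, so that the slope identity f(x) − f(z) = f[z, x](x − z) holds exactly.

```python
class Slope:
    """First-order slope arithmetic (scalar sketch): each node carries
    (value at z, value at x, slope f[z,x]) so that
    f(x) - f(z) = f[z,x] * (x - z) holds exactly for +, -, *."""
    def __init__(self, fz, fx, s):
        self.fz, self.fx, self.s = fz, fx, s

    @staticmethod
    def var(z, x):        # the independent variable: slope is 1
        return Slope(z, x, 1.0)

    @staticmethod
    def const(c):         # constants have slope 0
        return Slope(c, c, 0.0)

    def __add__(self, o):
        o = o if isinstance(o, Slope) else Slope.const(o)
        return Slope(self.fz + o.fz, self.fx + o.fx, self.s + o.s)
    __radd__ = __add__

    def __sub__(self, o):
        o = o if isinstance(o, Slope) else Slope.const(o)
        return Slope(self.fz - o.fz, self.fx - o.fx, self.s - o.s)

    def __mul__(self, o):
        o = o if isinstance(o, Slope) else Slope.const(o)
        # product rule for slopes: (uv)[z,x] = u[z,x] v(x) + u(z) v[z,x]
        return Slope(self.fz * o.fz, self.fx * o.fx,
                     self.s * o.fx + self.fz * o.s)
    __rmul__ = __mul__

def f(t):
    return t * t * t - 2 * t

ft = f(Slope.var(1.0, 3.0))   # slope of f between z = 1 and x = 3
```

For this polynomial, f(3) − f(1) = 21 − (−1) = 22 and the computed slope f[1, 3] = 11 satisfies 22 = 11 · (3 − 1). Replacing the point x by an interval (with outward rounding) turns this into the slope form enclosure referenced above.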
Similarly to derivatives, slopes obey a chain rule. Let F : R^m → R^p and g : R^n → R^m. Then we have F(g(x)) = F(g(z)) + F[g(z), g(x)] g[z, x](x − z), i.e., F[g(z), g(x)] g[z, x] is a slope matrix for F ∘ g.
Exclusion regions for n × n-systems are usually constructed using uniqueness tests based on the Krawczyk operator (see [15]) or the Kantorovich theorem (see [3,6,18]), which both provide existence and uniqueness regions for zeros of square systems of equations.
Kahan [5] used the Krawczyk operator to make existence statements. An important advantage of the Krawczyk operator is that it only needs first order information. Together with later improvements about slopes, his result is contained in the following statement.
Theorem 2.1 (Kahan). Let F : R^n → R^n be as before and let z ∈ z ⊆ x. If there is a matrix C ∈ R^{n×n} such that the Krawczyk operator K(z, x) := z − CF(z) + (I − CF[z, x])(x − z) satisfies K(z, x) ⊆ int(x), then x contains a unique zero of F.
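A minimal scalar sketch of this test (floating-point only, without the directed rounding a rigorous implementation requires; the interval helpers are our own):

```python
def imul(a, b):
    # interval product of a = [a0, a1], b = [b0, b1]
    ps = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(ps), max(ps))

def iadd(a, b):
    # interval sum
    return (a[0] + b[0], a[1] + b[1])

def krawczyk_contains_unique_zero(f, df, box, z, C):
    """Sketch of the Krawczyk test (no outward rounding, illustrative only):
    K(z, box) = z - C f(z) + (1 - C f'(box)) (box - z);
    K(z, box) in the interior of box certifies a unique zero."""
    center = z - C * f(z)
    defect = iadd((1.0, 1.0), imul((-C, -C), df(box)))
    K = iadd((center, center), imul(defect, (box[0] - z, box[1] - z)))
    return box[0] < K[0] and K[1] < box[1]

f = lambda x: x * x - 2.0
df = lambda X: (2 * X[0], 2 * X[1])   # derivative range on a positive box
ok = krawczyk_contains_unique_zero(f, df, (1.3, 1.5), 1.4, 1.0 / 2.8)
```

Here the test succeeds on [1.3, 1.5] (the box contains the unique zero √2), while it fails on a box such as [0.5, 0.7] containing no zero.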
Neumaier & Zuhe [17] proved that the Krawczyk operator with slopes always provides existence regions which are at least as large as those computed by Kantorovich's theorem. Based on a more detailed analysis of the properties of the Krawczyk operator, Schichl & Neumaier [26] provided componentwise and affine invariant existence, uniqueness, and nonexistence regions given a zero or any other point in the search region. More recently, this concept was extended to optimization problems; see Schichl et al. [25].

Inclusion/exclusion regions for a fixed parameter
We consider the nonlinear system of equations (1.1) at a fixed parameter value p,
H(x, p) = 0. (3.1)
Let z ≈ x(p) be an approximate solution of (3.1), i.e., H(z, p) ≈ 0. (3.2) Our first aim is the verification of a true solution x* of system (3.1) in a neighbourhood of z by computing an inclusion (resp. exclusion) region around z as described by Schichl & Neumaier [26]. Assuming regularity of the Jacobian H_x(z, p), we take
C := H_x(z, p)^{−1} (3.3)
as a fixed preconditioning matrix and compute the componentwise bounds (3.4), where the second order slopes H_xx are fixed. Throughout the paper, we assume z ≈ x(p) ∈ x as fixed center and the bounds from (3.4) valid for all x ∈ x, where x ⊆ X is chosen appropriately (see below). Following [26, Thm. 4.3], we choose a suitable vector 0 < v ∈ R^n, which essentially determines the scaling of the inclusion/exclusion regions, set w as in (3.5), and obtain the inclusion and exclusion radii λ^i and λ^e from the componentwise quadratic conditions in (3.6). If λ^e > λ^i, then there is at least one zero x* of (3.1) in the inclusion region R^i_0, and the zeros in this region are the only zeros of (3.1) in the interior of the exclusion region R^e_0.
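Assuming the componentwise quadratic structure a_j λ² − w_j λ + b_j described in the proofs of Section 4 (the display formulas (3.4)-(3.6) are not reproduced here, so this shape is an assumption), the two radii per component are the roots of this quadratic; the smaller root is evaluated in the cancellation-free form also advocated later for numerical stability:

```python
import math

def inclusion_exclusion_radii(a, w, b):
    """Per-component roots of a_j*l^2 - w_j*l + b_j = 0 (assumed quadratic
    shape).  The small root is computed as 2b/(w + sqrt(D)) to avoid
    cancellation when b ~ 0.  Returns (lambda_i, lambda_e), or None if
    some discriminant D_j = w_j^2 - 4 a_j b_j is nonpositive."""
    roots = []
    for aj, wj, bj in zip(a, w, b):
        D = wj * wj - 4 * aj * bj
        if D <= 0:
            return None
        r = wj + math.sqrt(D)
        roots.append((2 * bj / r, r / (2 * aj)))
    lam_i = max(li for li, _ in roots)   # inclusion radius: max of small roots
    lam_e = min(le for _, le in roots)   # exclusion radius: min of large roots
    return lam_i, lam_e

# mirrors the example of Section 5: b = 0 (exact zero), w = v = (1,1), a = (1,1)
lam_i, lam_e = inclusion_exclusion_radii([1.0, 1.0], [1.0, 1.0], [0.0, 0.0])
```

With b = 0 and D_1 = D_2 = 1 this yields λ^i = 0 and λ^e = 1, and λ^e > λ^i certifies the inclusion/exclusion pair.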

Parameter-dependent problem
Let (z, p) be an approximate solution of (1.1) for which a pair of inclusion and exclusion regions can be computed as described in Section 3. In addition, we assume the bounds from (3.4) valid for x ⊆ X with z ∈ x. We aim to prove the existence of a solution of (1.1) for every s ∈ s ⊆ S. Therefore, we first extend the results from Schichl & Neumaier [26] to the parameter-dependent case. In Thm. 4.4 we then state a method to explicitly construct such a parameter interval s. As a by-product we get an outer enclosure of a solution region x(s) ⊆ x over the parameter set s.
Consider any box s ⊆ S ⊆ R^p with p ∈ s and a continuously differentiable approximation function x̂ : S → X which satisfies x̂(p) = z and x̂(s) ∈ x for all s ∈ s, and prove for every s ∈ s the existence of an inclusion box R^i_s ⊆ x with 0 ∈ H(R^i_s, s). Note that the choice of the approximation function may greatly influence the quality, i.e., the radius, of the parameter interval s (see Section 5).
We define g(s) := (x̂(s), s)^T. (4.2) With x̂[p, s] denoting a slope matrix for x̂ with center p at s, a slope matrix for g is obtained by stacking x̂[p, s] on top of the p × p identity matrix. Let C be the fixed preconditioning matrix from (3.3). For each s ∈ s we define bounds analogous to those in (3.4), and, similarly, we estimate the first derivative of H with respect to x via a slope expansion (cf. (4.8) and (4.9)), where the 3-tensor (H_x)[g(p), g(s)] ∈ R^{n×n×(n+p)} is a slope for H_x.
By taking absolute values we get, with y := |s − p| and the bounds from (3.4), the corresponding componentwise estimates. Note that A(s) ∈ R^{n×n×p} is the result of multiplying a 3-tensor with an ((n + p) × p)-matrix; therefore, A(s) is computed by the appropriate multiplication rule from (1.2).
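The stacked slope matrix for g can be sketched for a linear approximation function (the numbers Θ, z, p below are made up for illustration; for a linear x̂ the slope identity holds exactly):

```python
import numpy as np

n, p_dim = 2, 1
Theta = np.array([[0.5], [0.25]])   # hypothetical slope of a linear x_hat
z = np.array([3.0, 4.0])
p = np.array([1.0])

def x_hat(s):
    # linear approximation of the solution curve: x_hat(s) = z + Theta (s - p)
    return z + Theta @ (s - p)

def g(s):
    # g(s) = (x_hat(s), s): solution approximation augmented by the parameter
    return np.concatenate([x_hat(s), s])

def g_slope():
    # slope matrix for g: stack the slope of x_hat on top of the identity
    return np.vstack([Theta, np.eye(p_dim)])

s = np.array([1.3])
lhs = g(s) - g(p)                # slope identity: g(s) - g(p) = g[p,s] (s - p)
rhs = g_slope() @ (s - p)
```

The identity lhs = rhs confirms that the stacked matrix is indeed a slope for g with center p.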
Proposition 4.1. Let (z, p) ∈ x × s ⊆ X × S, where x, s are any subboxes of X and S containing (z, p) (from (3.2)) such that the bounds (3.4) hold for all x ∈ x. Additionally, let s ∈ s be an arbitrary parameter value, and let x̂ := x̂(s) ∈ int(x) be the value of the approximation function from (4.1) at s. Further, let 0 < v ∈ R^n and λ^e be as in (3.6).
Then for a true solution x = x(s) of (1.1) at s with |x − z| ≤ λ^e v, the deviation |x − x̂| can be bounded as follows. Suppose x is a solution of (1.1) at s. For (x_1, s_1) = (x, s) and g(s) as in (4.2) the slope expansion simplifies, where we calculate H(g(s)) by (4.7) with respect to (z, p). Considering the deviation between the approximate and a true solution, (4.11) extends by (4.13); taking absolute values, we get by (4.6), (4.8), and (4.9) the desired componentwise bound. Using this result, we are able to formulate a first criterion for existence regions. For x ∈ M_u(s) we get with (4.11) and (4.13) an analogous expansion, and taking absolute values yields the corresponding componentwise estimate.

Based on the above results, the following theorem provides a way of constructing inclusion and exclusion regions for an approximate solution x̂(s). If λ^e_s > λ^i_s and condition (4.20) holds, then there exists at least one zero x* of (1.1) for the parameter value s (i.e., with H(x*, s) = 0) in the inclusion region, and these zeros are the only zeros of H at s in the interior of the exclusion region.

Proof. We set u = λv with arbitrary 0 < v ∈ R^n, and check for which λ = λ(s) ∈ R^+ the vector u satisfies property (4.14). This leads to a sufficient condition whose jth component requires λ to lie between the solutions of a quadratic equation, which are exactly λ^i_j(s) and λ^e_j(s). Since D_j(s) > 0 for all j by assumption, the interval [λ^i_s, λ^e_s] is nonempty. Thus, for all λ(s) ∈ [λ^i_s, λ^e_s], the vector u satisfies (4.14).
It remains to check whether the solution(s) in R^i_s are the only ones in R^e_s. Assume that x is a solution of (1.1) at s with x ∈ int(R^e_s) \ R^i_s, and let λ = λ(s) be minimal with |x − x̂| ≤ λv. By construction, we have λ^i_s < λ < λ^e_s, which contradicts the bound obtained for the Krawczyk operator (2.7) in the proof of Thm. 4.2.

Since by construction G_0 ≥ G_0(s) and A ≥ A(s) for all s ∈ s, we have b ≥ b(s) and w ≤ w(s) for all s ∈ s. By computing an upper bound over an appropriate box x ⊆ X with z ∈ x (e.g., take x = k R^i_0 with k ∈ R^+, k ≤ 1) we get upper bounds on the second order slopes from (4.6). Hence, the lowest values of the discriminant D from (4.17) are obtained by (4.22) and (4.23), with w_j and b_j from (3.4). Solving each quadratic equation for µ, we get (4.29). Since H(z, p) ≈ 0, the discriminant in (4.29) is smaller than β_j^2, because b_j is an upper bound for the function value at (z, p) and thus close to zero. Hence, both solutions of (4.28) are positive. In order to derive numerically stable results, we compute these solutions in the standard cancellation-avoiding form. These bounds hold for all s ∈ s, x ∈ x.

Let further 0 < y ∈ R^p and 0 < v ∈ R^n as before, with α, β, γ as defined in (4.25) and (4.27), respectively. Let η ∈ [0, µ] be maximal such that λ^e_η > λ^i_η, where λ^e_η = min_j λ^e_j(η), with λ^e_j(η) defined via w_j(η), b_j(η), a, and D_j(η) as in (4.26) and (4.25). Further, let σ ∈ [0, µ] be the largest value such that condition (4.36) holds for all j = 1, . . . , n.

Proof. W.l.o.g., let s = p + ν y ∈ int(s) ∩ S, i.e., 0 < ν < µ. In order to meet all requirements of Thm. 4.3, which provides the result about the inclusion/exclusion regions at s, we have to check that the following three conditions hold: (i) D_j(s) > 0 for all j = 1, . . . , n, with D_j as in (4.17) (positivity requirement).
(ii) λ^i_s < λ^e_s (monotonicity of the inclusion/exclusion parameters).
Condition (i) is satisfied by construction, since for D_j(s) as in (4.17) the calculations preceding the statement of the theorem give D_j(s) > 0 for all s ∈ int(s) ∩ S.

Since by construction w ≤ w(s), b ≥ b(s), a ≥ a(s), we obtain corresponding inequalities for λ^e_j(s) and λ^i_j(s) as in (4.18). Both λ^e_j(ν) and λ^i_j(ν) depend continuously on ν. In particular, for increasing ν, λ^e_j(ν) is monotonically decreasing and λ^i_j(ν) monotonically increasing. Hence, λ^e_ν = min_j λ^e_j(ν) and λ^i_ν = max_j λ^i_j(ν) show the same monotonicity behaviour, since we take a minimum (resp. maximum) of monotonically decreasing (resp. increasing) functions. By computing µ from (4.33), we get a lower and an upper bound on the exclusion and inclusion parameters λ^e and λ^i, respectively, since D_k(µ) = 0 implies λ^e_k(µ) = λ^i_k(µ) for some k ∈ {1, . . . , n}. Since we choose η ∈ [0, µ] in such a way that λ^e_η > λ^i_η, we have in particular λ^e_j(η) > λ^i_j(η) for all j = 1, . . . , n. By monotonicity, the corresponding inequalities carry over to all ν ≤ η. Taking the minimum (resp. maximum) over all j, we get by (4.39), (4.40), and assumption (4.34) the claimed ordering. Finally, for ν ≤ σ we have by assumption (4.36), since x̂(s) ∈ x̂(s_ν) and λ^i_s ≤ λ^i_ν by (ii), that condition (iii) is satisfied as well, which concludes the proof.
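The monotonicity argument above suggests computing the threshold values µ, η, and σ numerically by bisection on a monotone predicate; the following generic sketch is our own (the predicate in the usage line is hypothetical):

```python
def largest_feasible(pred, hi, tol=1e-10):
    """Bisection sketch for the step-size choices in Thm. 4.4: given a
    predicate pred(nu) that is true near 0 and, by monotonicity,
    switches to false at most once on [0, hi], return approximately
    the largest nu with pred(nu) true."""
    if pred(hi):
        return hi            # the whole candidate interval is feasible
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pred(mid):
            lo = mid         # feasible: move the lower bound up
        else:
            hi = mid         # infeasible: move the upper bound down
    return lo

# hypothetical predicate standing in for "lambda_e(nu) > lambda_i(nu)"
eta = largest_feasible(lambda nu: nu <= 0.37, 1.0)
```

In practice, pred(ν) would evaluate the bounds D_j(ν) > 0 and λ^e_ν > λ^i_ν; a rigorous implementation must additionally round the returned ν downward.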

Example
Inclusion/exclusion region for a fixed parameter p. We consider the system of equations (5.1). We set p = 1 and compute a corresponding solution z = (3, 4)^T. A slope for H with center (z, p) is computed accordingly, and for the preconditioning matrix C we take the inverse of the Jacobian from (3.3). In the resulting bounds, b and B_0 both vanish, since z happens to be an exact zero of (5.1) and we computed without roundoff errors. With v = (1, 1)^T we get w from (3.5), and with D_1 = D_2 = 1 we get the radii λ^i and λ^e from (3.6), which yield the inclusion and exclusion regions.

Construction of a feasible parameter interval s. We consider a linear approximation function x̂ : R → R^2, x̂(s) = z + Θ(s − p) with Θ ∈ R^{2×1}. We compute the parameter interval s from Thm. 4.4 for two different approximation functions: (i) a tangent x̂_tan at (z, p), and (ii) a secant x̂_sec through the center (z, p) and a second point. The approximation function greatly influences the size of the computed parameter box as well as the quality of the enclosure of the solution set. A good choice of x̂(s) is thus important, and may require a closer analysis of the problem at hand.
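The influence of the approximation function can be reproduced on a toy family (H(x, s) = x² − s with solution x(s) = √s, our own example, not the system (5.1)): compare the worst-case deviation of a tangent and a secant approximation of the solution curve over a parameter interval.

```python
import numpy as np

# Toy comparison (not the paper's system): for H(x, s) = x^2 - s the
# solution curve is x(s) = sqrt(s); we approximate it around (z, p) = (1, 1).
p, z = 1.0, 1.0

def tangent(s):
    # tangent at (z, p): slope x'(p) = 1/(2z)
    return z + 0.5 * (s - p)

s2 = 1.4   # second interpolation point for the secant

def secant(s):
    # secant through (z, p) and (sqrt(s2), s2)
    return z + (np.sqrt(s2) - z) / (s2 - p) * (s - p)

s_grid = np.linspace(p, s2, 201)
err_tan = np.max(np.abs(tangent(s_grid) - np.sqrt(s_grid)))
err_sec = np.max(np.abs(secant(s_grid) - np.sqrt(s_grid)))
```

On [p, s2] the secant keeps a noticeably smaller worst-case deviation than the tangent here, which directly translates into a larger feasible parameter box in a construction like Thm. 4.4.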
In Fig. 1 two solution curves for x_1 over s are shown together with the approximating tangent x̂_tan(s) at (z, p) and the corresponding inclusion and exclusion regions. The dash-dotted lines represent inclusion (exclusion) regions computed using the true bounds at each parameter value s, i.e., with G_0(s) and A(s) from (4.8) and (4.9). The dotted lines represent the development of the inclusion (exclusion) regions calculated using upper bounds over the initial interval s, i.e., with G_0 and A from (4.22) and (4.23). The latter increase (decrease) much faster and intersect at the boundary of s. In Fig. 2 these inclusion and exclusion regions as well as a comparison between the respective regions for the tangent and the secant are depicted.

Future Work
The method described above makes it possible to explicitly construct feasible areas in the parameter space and to rigorously enclose the solution set of (1.1) for all parameters in the computed parameter boxes. The method shows promising applicability for problems in low parameter dimensions. However, it suffers from a sort of cluster effect when approaching the boundaries of the feasible parameter regions, i.e., the step size µ and thus the radii of the parameter boxes become smaller and smaller. This problem may be tackled using an extension of Miranda's theorem and is addressed in [19]. An application of the new method is, for example, the workspace computation of parallel manipulators (see Merlet [12]). In particular, the computation of the total orientation workspace requires the solution of a parameter-dependent system of nonlinear equations. To the authors' knowledge, there are only a few results addressing this problem using rigorous methods (Merlet [11], Merlet et al. [13]). Therefore, the proposed method will be applied to the workspace problem in [20].