1 Introduction

In this paper we consider parameter-dependent nonlinear systems of equations

$$\begin{aligned} H(x,s) = 0, \end{aligned}$$
(1.1)

with solutions \(x = x(s) \in {\mathbb {R}}^n\) depending on a set of parameters \(s\in {\mathbb {R}}^p\). Here we assume

$$H:X\times S \subseteq {\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}^n$$

to be differentiable with Lipschitz continuous derivative.

Branch and bound methods for finding all zeros of a (square) nonlinear system of equations in a box frequently have the difficulty that subboxes containing no solution cannot be eliminated if there is a nearby zero outside the box. This results in the so-called cluster effect, i.e., the creation of many small boxes by repeated splitting, whose processing may dominate the total work spent on the global search. Schichl and Neumaier [34] presented a method for reducing the cluster effect for nonlinear \(n\times n\)-systems of equations by computing so-called inclusion and exclusion regions around an approximate zero with the property that a true solution lies in the inclusion region and no other solution lies in the corresponding exclusion region, which thus can be safely discarded.

In the parameter-dependent case, it would be convenient to show the existence of such inclusion regions for a whole set of parameter values in order to rigorously identify feasible parameter boxes \(\mathbf {s}\) where for all \(s\in \mathbf {s}\) solutions x(s) of (1.1) exist. Thus, we extend the method from Schichl and Neumaier [34] to this problem class, and show how to compute parameter boxes \(\mathbf {s}\subseteq S\) such that for each parameter set \(s \in \mathbf {s}\) the existence of a solution \(x(s) \in \mathbf {x}\subseteq X\) of (1.1) within a narrow inclusion box can be guaranteed.

The procedure for computing such a feasible parameter box \(\mathbf {s}\) consists of three main steps: (i) solve (1.1) for a fixed parameter \(p\in \mathbf {s}\) and compute a pair of inclusion and exclusion regions for a corresponding approximate zero \(z\approx x(p)\) as described in Schichl and Neumaier [34], (ii) consider an approximation function \({\hat{x}}:S\rightarrow X\) for the solution curve, and (iii) extend the estimates and bounds from step (i) using slope forms in order to calculate a feasible parameter box \(\mathbf {s}\) around p such that for all \(s\in \mathbf {s}\) the existence of a solution \(x^*(s)\) of (1.1) can be proved.

Other known approaches.  Parameter-dependent systems of equations can be solved by continuation methods (e.g., [1, 2]) which trace a particular solution curve or a solution manifold, if \(p>1\) in (1.1). More recently, Martin et al. [17] presented a new rigorous continuation method for 1-manifolds (improving the method proposed by Kearfott and Xing [13]) which utilizes parallelotopes (as defined in [9]) to enclose consecutive portions of the followed manifold. Another approach for parametric polynomial systems is to use Gröbner bases (e.g., [12, 21]).

Neumaier [23, Thm. 5.1.3] formulated a semilocal version of the implicit function theorem and provided a tool Neumaier [23, Prop. 5.5.2] for constructing an enclosure of the solution set of (1.1) with parameters varying in narrow intervals. Furthermore, Neumaier [22] performed a rigorous sensitivity analysis for parameter-dependent systems of equations and proved a quadratic approximation property of a slope based enclosure. A parametric Krawczyk operator was proposed in Rump [29]; a comparison of this Newton-like operator with a parametric Hansen–Sengupta operator [6] was presented in Goldsztejn [7]. In Goldsztejn and Granvilliers [9], Goldsztejn and Granvilliers extended the preconditioned interval Newton operator to underconstrained systems of equations by generalizing the domain from boxes to parallelepipeds.

Kolev and Nenov [15] proposed an iterative method to construct a linear interval enclosure of the solution set of (1.1) over a given parameter interval. Goldsztejn [5] used a weak version of the parametric Miranda theorem (see [23], Thm. 5.3.7) to verify the existence of solutions over a given parameter interval and to compute a reliable inner estimate of the feasible parameter region. Independently from the work of Goldsztejn [5], the authors recently pursued a similar approach and proposed some tools utilizing Miranda’s theorem and centered forms for rigorously solving parameter-dependent systems of equations [27].

In addition, several recent contributions are concerned with rigorously solving problems arising in the field of robotics. For example, Tannous et al. [38] proposed an interval linearization method to enclose the solution set of a system (1.1) over a small parameter interval given a nominal approximate solution, in order to provide verified results for the sensitivity analysis of serial and parallel manipulators; Caro et al. [3] compute verified enclosures of n-manifolds in order to determine generalized aspects (connected components of a set of nonsingular reachable configurations) of parallel robots; Goldsztejn et al. [8] use a parametric version of the Kantorovich theorem for proving hypotheses regarding the maximal pose error of parallel manipulators, and for computing upper bounds on the pose error in a safe way.

Other contributions regarding parametric interval methods for solving systems of equations are, e.g., [36, 37] for robust simulation and design of chemical processes, or [39] for infinite dimensional systems of equations (PDE).

Outline. The paper is organized as follows: in Sect. 2 we review some central results about rigorously computing solutions of square nonlinear systems of equations. Additionally, we summarize basic definitions and known results about slope forms, which will be an important tool when extending the exclusion-region concept from Schichl and Neumaier [34] to the parameter-dependent case. In Sect. 3 we outline the method introduced by Schichl and Neumaier [34], as it is the starting point for the new method. In Sect. 4 we then state and prove the main results of this paper and describe how to extend the inclusion/exclusion-region concept to the parameter-dependent case. In Sect. 5 the new method is illustrated with several numerical examples, and, additionally, some preliminary computational results for rigorously computing an inner approximation of the feasible solution region over an initial box \(\mathbf {s}\) are presented. Section 6 provides a discussion of the proposed method as well as an outlook on future work.

Notation. Throughout the paper, we will use the following notation: For a matrix \(A\in {\mathbb {R}}^{n\times m}\) we denote by \(A_{: K}\) the \(n\times k\) submatrix consisting of the k columns with indices in \(K\subseteq \lbrace {1,\dots ,m}\rbrace \), and, similarly, \(A_{K:}\) denotes the \(k\times m\) submatrix with row indices in \(K\subseteq \lbrace 1,\dots ,n\rbrace \). Let \(F:{\mathbb {R}}^m\rightarrow {\mathbb {R}}^n\). If \(y\in {\mathbb {R}}^m\) is partitioned as \(y = (x,s)^T\) with \(x=y_I\), \(s=y_J\), where I, J are index sets with \(I\cap J=\emptyset \) and \(I\cup J=\lbrace {1,\dots ,m}\rbrace \), the Jacobian of F with respect to x is

$$\begin{aligned} F'_x(y)= \frac{\partial F}{\partial x}(y) = \left( \frac{\partial F}{\partial y}(y)\right) _{:I}. \end{aligned}$$

A slope of F with center z at y is written as \({F}\!\left[ z,y\right] \), a slope with respect to x then is

$$\begin{aligned} {F_x}\!\left[ z,y\right] = ({F}\!\left[ z,y\right] )_{:I}. \end{aligned}$$

Since second order slopes (resp. first order slopes of the Jacobian) are third order tensors, we use the following multiplication rules (see [33]) for a 3-tensor \({\mathcal {T}} \in {\mathbb {R}}^{n \times m\times r}\), a vector \(v \in {\mathbb {R}}^r\), and matrices \(C\in {\mathbb {R}}^{s\times n}\), \(B\in {\mathbb {R}}^{r \times s}\):

$$\begin{aligned} \left( {\mathcal {T}}v\right) _{ij} = \sum _{k=1}^r {\mathcal {T}}_{ijk}v_k,&\quad \quad \left( C{\mathcal {T}} \right) _{ijk} = \sum _{l=1}^n C_{il}{\mathcal {T}}_{ljk},&\quad \quad \left( {\mathcal {T}}B\right) _{ijk} = \sum _{l=1}^r {\mathcal {T}}_{ijl}B_{lk}. \end{aligned}$$
(1.2)

Additionally, we define for a vector \(v\in {\mathbb {R}}^n\) and a 3-tensor \({\mathcal {T}}\in {\mathbb {R}}^{n\times n\times n}\) the product

$$\begin{aligned} v^T{\mathcal {T}}\;v=({\mathcal {T}}v)\; v, \quad \text {i.e., with }\quad \left( v^T{\mathcal {T}}\;v\right) _i = \sum _{j=1}^n \sum _{k=1}^n v_k {\mathcal {T}}_{ijk}\,v_j \end{aligned}$$
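These rules are ordinary tensor contractions. As a quick illustration (the paper's own implementation is in Matlab/Intlab; the NumPy sketch below is only added here for concreteness), the rules (1.2) and the product \(v^T{\mathcal {T}}\,v\) can be written as einsum contractions:

```python
import numpy as np

# Illustrative sketch of the multiplication rules (1.2) for a 3-tensor
# T in R^{n x m x r}, a vector v in R^r, and matrices C in R^{s x n},
# B in R^{r x s}.  (NumPy stand-in; the paper's solver uses Matlab/Intlab.)
n, m, r, s = 2, 3, 4, 5
rng = np.random.default_rng(0)
T = rng.standard_normal((n, m, r))
v = rng.standard_normal(r)
C = rng.standard_normal((s, n))
B = rng.standard_normal((r, s))

Tv = np.einsum('ijk,k->ij', T, v)    # (T v)_{ij}  = sum_k T_{ijk} v_k
CT = np.einsum('il,ljk->ijk', C, T)  # (C T)_{ijk} = sum_l C_{il} T_{ljk}
TB = np.einsum('ijl,lk->ijk', T, B)  # (T B)_{ijk} = sum_l T_{ijl} B_{lk}

# For a square tensor T in R^{n x n x n} and v in R^n the product
# v^T T v = (T v) v, i.e. (v^T T v)_i = sum_{j,k} v_k T_{ijk} v_j.
Tq = rng.standard_normal((n, n, n))
w = rng.standard_normal(n)
vTv = np.einsum('ijk,k->ij', Tq, w) @ w
```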

2 Preliminaries and known results

Consider a twice continuously differentiable function \(F:D \subseteq {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\). We can always write (see [34])

$$\begin{aligned} F(x) - F(z) = {F}\!\left[ z,x\right] \;(x-z), \end{aligned}$$
(2.1)

for any two points z, \(x\in D\) and a suitable matrix \({F}\!\left[ z,x\right] \in {\mathbb {R}}^{n\times n}\), a so-called slope matrix for F with center z at x. While the slope matrix is not uniquely determined in the multivariate case, differentiability always gives

$$\begin{aligned} {F}\!\left[ z,z\right] = F'(z). \end{aligned}$$

Assuming that the slope matrix is continuously differentiable in both points, we can write similarly

$$\begin{aligned} {F}\!\left[ z, x\right]&= {F}\!\left[ z,z'\right] +{F}\!\left[ z, z', x\right] (x-z') \end{aligned}$$
(2.2)

which simplifies for \(z=z'\) to

$$\begin{aligned} {F}\!\left[ z, x\right]&=F'(z) + {F}\!\left[ z, z, x\right] (x-z), \end{aligned}$$
(2.3)

where the second order slopes \({F}\!\left[ z,z',x\right] \), \({F}\!\left[ z,z,x\right] \), respectively, are continuous in z, \(z'\) and x. If F is quadratic, the first order slopes are linear, and thus, the second order slope matrices are constant. Let z be a fixed center in the domain of F. Having a slope \({F}\!\left[ z,x\right] \) for all \(x \in \mathbf {x}\) we get

$$\begin{aligned} F(\mathbf {x}) \subseteq F(z) + {F}\!\left[ z,\mathbf {x}\right] \left( \mathbf {x}- z\right) , \end{aligned}$$
(2.4)

and, analogously,

$$\begin{aligned} F(\mathbf {x}) \subseteq F(z) + \left( F'(z) + {F}\!\left[ z, z, \mathbf {x}\right] (\mathbf {x}-z)\right) \left( \mathbf {x}- z\right) . \end{aligned}$$
(2.5)

Hence, the first and second order slope forms given in (2.4) and (2.5), respectively, provide enclosures for the true range of the function F over an interval \(\mathbf {x}\). There are recursive procedures to calculate slopes, given x and z (see [14, 16, 30, 35]). A Matlab implementation for first order slopes is in Intlab [31]; also, the Coconut environment [32] provides algorithms.
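As a simple univariate illustration (added here; it is not an example from the paper), take \(F(x)=x^2\) with center \(z=1\) and \(\mathbf {x}=[0,2]\). A slope is \({F}\!\left[ z,x\right] = x+z\), so the first order slope form (2.4) yields

$$\begin{aligned} F(\mathbf {x}) \subseteq F(z) + (\mathbf {x}+z)(\mathbf {x}-z) = 1 + [1,3]\,[-1,1] = [-2,4], \end{aligned}$$

an enclosure of the true range \([0,4]\); the overestimation is caused by the dependence between the two occurrences of \(\mathbf {x}\).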

Similarly to derivatives, slopes obey a sort of chain rule. Consider \(F :{\mathbb {R}}^m\rightarrow {\mathbb {R}}^p\) and \(g:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^m\). Then we have

$$\begin{aligned} \begin{aligned} (F\circ g)(x)&= (F\circ g)(z) + {(F\circ g)}\!\left[ z,x\right] (x-z)\\&= F(g(z)) + {F}\!\left[ g(z), g(x)\right] \; {g}\!\left[ z,x\right] (x-z), \end{aligned} \end{aligned}$$
(2.6)

i.e., \({F}\!\left[ g(z),\,g(x)\right] \;{g}\!\left[ z,x\right] \) is a slope matrix for \(F\circ g\).
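For a univariate illustration added here for concreteness, take \(g(x)=x+1\) and \(F(u)=u^2\): then \({g}\!\left[ z,x\right] =1\) and \({F}\!\left[ g(z),g(x)\right] =g(x)+g(z)=x+z+2\), and indeed \((x+1)^2-(z+1)^2=(x+z+2)\,(x-z)\), so the product is a slope for \(F\circ g\).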

Exclusion regions for \(n\times n\)-systems are usually constructed using uniqueness tests based on the Krawczyk operator (see [23]) or the Kantorovich theorem (see [4, 11, 25]), which both provide existence and uniqueness regions for zeros of square systems of equations. Kahan [10] used the Krawczyk operator to make existence statements. An important advantage of the Krawczyk operator is that it only needs first order information. Together with later improvements based on slopes, his result is contained in the following statement.

Theorem 2.1

(Kahan) Let \(F:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) be as before and let \(z\in \mathbf {z}\subseteq \mathbf {x}\). If there is a matrix \(C\in {\mathbb {R}}^{n\times n}\) such that the Krawczyk operator

$$\begin{aligned} \text {K}({\mathbf {z}, \mathbf {x}}) \mathrel {\mathop :}=z-CF(z)-\left( {CF}\!\left[ \mathbf {z}, \mathbf {x}\right] -\mathbbm {1}\right) (\mathbf {x}-z) \end{aligned}$$
(2.7)

satisfies \(\text {K}({\mathbf {z}, \mathbf {x}}) \subseteq \mathbf {x}\), then \(\mathbf {x}\) contains a zero of F. Moreover, if \(\text {K}({\mathbf {x}, \mathbf {x}})\subseteq {{\,\mathrm{int}\,}}(\mathbf {x})\), then \(\mathbf {x}\) contains a unique zero.

Neumaier and Zuhe [24] proved that the Krawczyk operator with slopes always provides existence regions which are at least as large as those computed by Kantorovich’s theorem. Based on a more detailed analysis of the properties of the Krawczyk operator, Schichl and Neumaier [34] provided componentwise and affine invariant existence, uniqueness, and nonexistence regions given a zero or any other point in the search region. More recently, this concept was extended to optimization problems (see [33]).
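To make Theorem 2.1 concrete, the following Python sketch performs the existence test for the univariate example \(F(x)=x^2-2\) (chosen here purely for illustration; it is not an example from the paper). Intervals are represented as (lo, hi) pairs in plain floating point, so outward rounding is ignored and the test is not rigorous; a verified implementation would use Intlab or a similar interval library.

```python
# Non-rigorous illustration of the Krawczyk test (2.7) for F(x) = x^2 - 2
# with slope F[z, x] = x + z; intervals are (lo, hi) tuples, no outward rounding.

def add(a, b):    return (a[0] + b[0], a[1] + b[1])
def neg(a):       return (-a[1], -a[0])
def mul(a, b):
    p = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(p), max(p))
def subset(a, b): return b[0] <= a[0] and a[1] <= b[1]
def point(t):     return (t, t)

z = 1.4
xbox = (1.3, 1.5)

F_z = z**2 - 2                       # F(z)
slope = add(xbox, point(z))          # F[z, x] = x + z over xbox
C = 1.0 / (2.0*z)                    # preconditioner, roughly F'(z)^{-1}

# K(z, x) = z - C F(z) - (C F[z,x] - 1)(x - z)
CF_slope_minus_1 = add(mul(point(C), slope), point(-1.0))
K = add(add(point(z), point(-C*F_z)),
        neg(mul(CF_slope_minus_1, add(xbox, point(-z)))))

print(K, subset(K, xbox))   # K is contained in xbox => a zero of F lies in xbox
```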

3 Inclusion/exclusion regions for a fixed parameter

We consider the nonlinear system of equations (1.1) at a fixed parameter value p,

$$\begin{aligned} H(x, p) = 0, \quad H(\cdot , p):X \subseteq {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n. \end{aligned}$$
(3.1)

Let \(z\approx x(p)\) be an approximate solution of (3.1), i.e.,

$$\begin{aligned} H(z,p) \approx 0. \end{aligned}$$
(3.2)

Our first aim is the verification of a true solution \(x^*\) of system (3.1) in a neighbourhood of z by computing an inclusion (resp. exclusion) region around z as described by Schichl and Neumaier [34]. Assuming regularity of the Jacobian \(H'_x(z, p)\) we take

$$\begin{aligned} C \approx H'_x(z, p)^{-1} \in {\mathbb {R}}^{n \times n} \end{aligned}$$
(3.3)

as a fixed preconditioning matrix and compute the componentwise bounds

$$\begin{aligned} \begin{aligned} \overline{b}&\ge \left| {CH(z, p)}\right| \\ B_0&\ge \left| {CH'_x(z, p)-\mathbbm {1}}\right| \\ B(x)&\ge \left| {C{H_{xx}}\!\left[ (z, p), (z, p), (x, p)\right] }\right| \\ \overline{B}&\ge B(x) \quad \forall \; x \in \; {\mathbf {x}}\subseteq X, \end{aligned} \end{aligned}$$
(3.4)

where the parameter is kept fixed at p in the second order slopes \(H_{xx}\). Throughout the paper, we assume \(z \approx x(p)\in \mathbf {x}\) to be a fixed center and the bounds from (3.4) to be valid for all \(x\in \mathbf {x}\), where \(\mathbf {x}\subseteq X\) is chosen appropriately (see below). Following [34, Thm. 4.3], we choose a suitable vector \(0<v\in {\mathbb {R}}^n\), which essentially determines the scaling of the inclusion/exclusion regions, and set

$$\begin{aligned} w \mathrel {\mathop :}=\left( \mathbbm {1}-B_0\right) v, \quad a \mathrel {\mathop :}=v^T\overline{B}\;v \end{aligned}$$
(3.5)

Supposing

$$\begin{aligned} D_j = w_j^2-4a_j\overline{b}_j>0 \end{aligned}$$

for all \(j=1,\ldots ,n\), we define

$$\begin{aligned} \lambda _j^e&\mathrel {\mathop :}=\frac{w_j+\sqrt{D_j}}{2a_j},&\lambda _j^i&\mathrel {\mathop :}=\frac{\overline{b}_j}{a_j\lambda _j^e}, \nonumber \\ \lambda ^e&\mathrel {\mathop :}=\min _{j=1,\dots ,n}\lambda ^{e}_j,&\lambda ^i&\mathrel {\mathop :}=\max _{j=1,\dots , n} \lambda ^{i}_j. \end{aligned}$$
(3.6)

If \(\lambda ^e>\lambda ^i\), then there is at least one zero \(x^*\) of (3.1) in the inclusion region \(\mathbf {R}^{i}_0\) and the zeros in this region are the only zeros of (3.1) in the interior of the exclusion region \(\mathbf {R}^{e}_0\) with

$$\begin{aligned} \mathbf {R}^{i}_0 \mathrel {\mathop :}=[z\!-\lambda ^i\,v,\; z\!+\lambda ^i\,v]\subseteq \mathbf {x},\qquad \mathbf {R}^{e}_0 \mathrel {\mathop :}=[z\!-\lambda ^e\,v,\; z\!+\lambda ^e\,v]\cap \mathbf {x}. \end{aligned}$$
(3.7)

In the important special case where H(x, p) is quadratic in x, the first order slope matrix \({H}\!\left[ (z, p), (x,p)\right] \) is linear in x. Hence, all second order slope matrices are constant in x. Therefore, the upper bounds \(B(x)=B\) are constant as well. Thus, we can set \(\overline{B} = B\) and the estimate from (3.4) becomes valid everywhere. Otherwise, an appropriate choice of \(\mathbf {x}\subseteq X\) is crucial in order to keep the bounds \(\overline{B}\) on the second order slopes small.
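Once the bounds (3.4) are available, the quantities (3.5)–(3.7) are plain arithmetic. The following NumPy sketch (added here as an illustration in ordinary floating point, without the directed rounding a verified implementation would need) carries them out using the bound data that will be derived for Example 5.1 in Sect. 5:

```python
import numpy as np

# Illustration of (3.5)-(3.7), using the bounds derived for Example 5.1:
# b_bar = 0, B0 = 0, and the second order slope bound B_bar with slices B_bar[:, :, k].
b_bar = np.array([0.0, 0.0])
B0    = np.zeros((2, 2))
B_bar = np.stack([np.array([[3.0, 0.0], [4.0, 0.0]]) / 14,
                  np.array([[8.0, 3.0], [6.0, 4.0]]) / 14], axis=2)
v = np.array([1.0, 1.0])
z = np.array([3.0, 4.0])

w = (np.eye(2) - B0) @ v                          # (3.5)
a = np.einsum('ijk,k->ij', B_bar, v) @ v          # a = v^T B_bar v
D = w**2 - 4*a*b_bar                              # discriminants

assert np.all(D > 0)
lam_e_j = (w + np.sqrt(D)) / (2*a)                # (3.6)
lam_i_j = b_bar / (a * lam_e_j)
lam_e, lam_i = lam_e_j.min(), lam_i_j.max()

R_incl = np.array([z - lam_i*v, z + lam_i*v])     # inclusion box (3.7)
R_excl = np.array([z - lam_e*v, z + lam_e*v])     # exclusion box, before intersecting with x
print(lam_e, lam_i)                               # 1.0, 0.0 as in Example 5.1
```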

4 Parameter-dependent problem

Let \((z,p)\) be an approximate solution of (1.1) for which a pair of inclusion and exclusion regions can be computed as described in Sect. 3. In addition, we assume the bounds from (3.4) are valid for \(\mathbf {x}\subseteq X\) with \(z\in \mathbf {x}\). We aim to prove the existence of a solution of (1.1) for every \(s\in \mathbf {s}\subseteq S\). Therefore, we first extend the results from Schichl and Neumaier [34] to the parameter-dependent case. In Theorem 4.3 we then state a method to explicitly construct such a parameter interval \(\mathbf {s}\). As a by-product we get an outer enclosure of a solution region \(\mathbf {x}(\mathbf {s})\subseteq \mathbf {x}\) over the parameter set \(\mathbf {s}\).

Consider any box \(\mathbf {s}\subseteq S\subseteq {\mathbb {R}}^p\) with \(p\in \mathbf {s}\) and a continuously differentiable approximation function

$$\begin{aligned} {\hat{x}}:{\mathbb {R}}^p\rightarrow {\mathbb {R}}^n \end{aligned}$$
(4.1)

which satisfies \({\hat{x}}(p) = z\) and \({\hat{x}}(s)\in \mathbf {x}\) for all \(s\in \mathbf {s}\), and prove for every \(s\in \mathbf {s}\) the existence of an inclusion box

$${\mathbf {R}}^i_s\subseteq {\mathbf {x}}\quad \text { with }\quad 0\in H\left( {\mathbf {R}}^i_s,s\right) .$$

In principle, the approximation function (4.1) can be chosen arbitrarily. The easiest choice would be a linear approximation of the solution curve at hand. Another possible approach would be to use higher-order Taylor expansions of the solution. For computational reasons, expansions of order at most 2 are most useful, since higher-order approximations would lead to more overestimation. Note that the choice of the approximation function may greatly influence the quality, i.e., the radius, of the parameter interval \(\mathbf {s}\) (see Fig. 2).

We define

$$\begin{aligned} g(s) = \begin{pmatrix}{\hat{x}}(s)\\ s\end{pmatrix}, \quad g:\mathbf {s}\rightarrow X \times S \subseteq {\mathbb {R}}^{n}\times {\mathbb {R}}^{p}. \end{aligned}$$
(4.2)

With \({{\hat{x}}}\!\left[ p, s\right] \) denoting a slope matrix for \({\hat{x}}\) with center p at s, a slope matrix for g is given by

$$\begin{aligned} {g}\!\left[ p, s\right] = \begin{pmatrix}{{\hat{x}}}\!\left[ p, s\right] \\ \mathbbm {1}\end{pmatrix} \quad \in {\mathbb {R}}^{(n+p)\times p}, \end{aligned}$$
(4.3)

since

$$\begin{aligned} g(s)-g(p) = \begin{pmatrix} {\hat{x}}(s)-{\hat{x}}(p)\\ s-p \end{pmatrix} = \begin{pmatrix} {{\hat{x}}}\!\left[ p, s\right] \\ \mathbbm {1}\end{pmatrix} \left( s-p\right) . \end{aligned}$$

Let C be the fixed preconditioning matrix from (3.3). For each \(s \in \mathbf {s}\) we define similar bounds as in (3.4)

$$\begin{aligned} \overline{{\mathfrak {b}}}(s)&\ge \left| {CH({\hat{x}}(s),s)}\right| , \end{aligned}$$
(4.4a)
$$\begin{aligned} {\mathfrak {B}}_0(s)&\ge \left| {CH'_x({\hat{x}}(s),s)-\mathbbm {1}}\right| , \end{aligned}$$
(4.4b)
$$\begin{aligned} {\mathfrak {B}}(x,s)&\ge \left| {C{H_{xx}}\!\left[ ({\hat{x}}(s),s), ({\hat{x}}(s),s), (x,s)\right] }\right| \end{aligned}$$
(4.4c)

and calculate estimates on the bounds from (4.4a) and (4.4b) with respect to the bounds from (3.4) using first order slope approximations. Applying the chain rule (2.6) to \(H({\hat{x}}(s),s) = (H\circ g)(s)\) we get

$$\begin{aligned} H({\hat{x}}(s),s) = H(g(p)) + {H}\!\left[ g(p), g(s)\right] \,{g}\!\left[ p,\;s\right] \,(s-p), \end{aligned}$$
(4.5)

and, similarly we estimate the first derivative of H with respect to x by

$$\begin{aligned} H'_x({\hat{x}}(s),s)&= H'_x(z, p)+ {(H'_x)}\!\left[ g(p),\; g (s)\right] \, {g}\!\left[ p,\;s\right] \,(s-p), \end{aligned}$$

where the 3-tensor \({(H'_x)}\!\left[ g(p),\; g (s)\right] \in {\mathbb {R}}^{n\times n \times (n+p)}\) is a slope for \(H'_x\).

By taking absolute values we get with \(\widetilde{y} \mathrel {\mathop :}=\left| {s-p}\right| \) and (3.4)

$$\begin{aligned} \begin{array}{lcccl} \left| {CH({\hat{x}}(s),s)}\right| &{} \le &{} \overline{b} &{} + &{} \left| {{CH}\!\left[ g(p),\; g(s)\right] }\right| \, \left| {{g}\!\left[ p,\; s\right] }\right| \, \widetilde{y},\\ \left| {CH'_x({\hat{x}}(s),s)-\mathbbm {1}}\right| &{} \le &{} B_0 &{} + &{} \left| {C{(H'_x)}\!\left[ g(p),\; g(s)\right] }\right| \, \left| {{g}\!\left[ p,\; s\right] }\right| \, \widetilde{y}. \end{array} \end{aligned}$$

Hence, we define

$$\begin{aligned} \overline{{\mathfrak {b}}}(s)\mathrel {\mathop :}=\overline{b} + G_0(s)\, \widetilde{y} \quad \text { with }\quad G_0(s)\mathrel {\mathop :}=\left| {{CH}\!\left[ g(p),\; g(s)\right] }\right| \, \left| {{g}\!\left[ p,\; s\right] }\right| \end{aligned}$$
(4.6)

and

$$\begin{aligned} {\mathfrak {B}}_0(s)\mathrel {\mathop :}=B_0 + A(s)\, \widetilde{y}\quad \text { with } \quad A(s) \mathrel {\mathop :}=\left| {C{(H'_x)}\!\left[ g(p),\; g(s)\right] }\right| \, \left| {{g}\!\left[ p,\; s\right] }\right| . \end{aligned}$$
(4.7)

Note that \(A(s) \in {\mathbb {R}}^{n\times n\times p}\) is the result of the multiplication of a 3-tensor with an \(((n+p)\times p)\)-matrix. Therefore, A(s) is computed by the appropriate multiplication rule from (1.2).

Proposition 4.1

Let \((z,p)\in \mathbf {x}\times \mathbf {s}\subseteq X\times S\), where \(\mathbf {x}\), \(\mathbf {s}\) are any subboxes of X and S containing \((z,p)\) (from (3.2)) such that the bounds (3.4) hold for all \(x\in \mathbf {x}\). Additionally, let \(s\in \mathbf {s}\) be an arbitrary parameter value, and \({\hat{x}} \mathrel {\mathop :}={\hat{x}}(s)\in {{\,\mathrm{int}\,}}(\mathbf {x})\) be the function value of the approximation function from (4.1) at s. Further, let \(0<v\in {\mathbb {R}}^n\) and \(\lambda ^e\) be as in (3.6). Then for a true solution \(x=x(s)\) of (1.1) at s with \(\left| {x-z}\right| \le \lambda ^e\,v\) the deviation

$$d_s\mathrel {\mathop :}=\left| {x-{\hat{x}}}\right| $$

satisfies

$$\begin{aligned} 0\le d_s \le \overline{{\mathfrak {b}}}(s) + \left( {\mathfrak {B}}_0(s)+{\mathfrak {B}}(x,s)\,d_s\right) d_{s} \end{aligned}$$
(4.8)

with \(\overline{{\mathfrak {b}}}(s)\), \({\mathfrak {B}}_0(s)\), and \({\mathfrak {B}}(x,s)\) as defined in (4.6), (4.7), and (4.4c), respectively.

Proof

Let \((x^1, s^1)\) be an arbitrary point in the domain of definition of H. Then we have by (2.1)

$$\begin{aligned} H(x, s) = H(x^1, s^1) + {H}\!\left[ (x^1,\; s^1), (x,s)^T\right] \, \begin{pmatrix} x-x^1\\ s-s^1 \end{pmatrix} = 0 \end{aligned}$$

since x is a solution of (1.1) at s. This simplifies for \((x^1, s^1)= ({\hat{x}}, s)\) and g(s) as in (4.2) to

$$\begin{aligned} H(x, s) = H(g(s)) + {H_{x}}\!\left[ g(s),\; (x,s)^T\right] \, (x-{\hat{x}}), \end{aligned}$$
(4.9)

where we calculate H(g(s)) by (4.5) with respect to (zp) as

$$\begin{aligned} H(g(s)) = H(g(p))+ {(H\circ g)}\!\left[ p, s\right] \, ( s-p) \end{aligned}$$
(4.10)

with \( {(H\circ g)}\!\left[ p, s\right] \mathrel {\mathop :}={H}\!\left[ g(p), g(s)\right] \, {g}\!\left[ p,s\right] \), and \({H_x}\!\left[ g(s), (x,s)^T\right] \) by (2.3) as

$$\begin{aligned} {H_{x}}\!\left[ g(s), (x,s)^T\right] = H'_x(g(s)) + {H_{xx}}\!\left[ g(s),\; g(s),\; (x,s)^T\right] \,(x-{\hat{x}}) \end{aligned}$$
(4.11)

with g(s), \({g}\!\left[ p, s\right] \) as in (4.2) and (4.3).

Now we consider the deviation between the approximate and a true solution and get with (4.9)

$$\begin{aligned} -(x-{\hat{x}})&= -(x-{\hat{x}}) + CH(g(s)) + {CH_{x}}\!\left[ g(s),\;(x,s)^T\right] (x-{\hat{x}}) \end{aligned}$$

which extends by (4.11) to

$$\begin{aligned}&= C H(g(s)) \,\, +(CH'_x(g(s))-\mathbbm {1}) (x-{\hat{x}})\\&\qquad +(x-{\hat{x}})^T\,{CH_{xx}}\!\left[ g(s), g(s), (x,s)^T\right] \, (x-{\hat{x}}). \end{aligned}$$

Taking absolute values, we get by (4.4c), (4.6), and (4.7)

$$\begin{aligned} d_s&= \left| {x-{\hat{x}}}\right| \le \left| {C H(g(s))}\right| + \left| {CH'_x(g(s))-\mathbbm {1}}\right| \left| {x-{\hat{x}}}\right| \\&\quad +\left| {x-{\hat{x}}}\right| ^T\,\left| {{CH_{xx}}\!\left[ g(s),\; g(s),\; (x,s)^T\right] }\right| \, \left| {x-{\hat{x}}}\right| \\&\le \quad \overline{{\mathfrak {b}}}(s) + \left( {\mathfrak {B}}_0(s)+{\mathfrak {B}}(x,s)\,d_s\right) \,d_{s}. \end{aligned}$$

\(\square \)

Using this result, we are able to formulate a first criterion for existence regions.

Theorem 4.1

Let again \(s\in \mathbf {s}\) with corresponding function value \({\hat{x}} \mathrel {\mathop :}={\hat{x}}(s)\in {{\,\mathrm{int}\,}}(\mathbf {x})\) of the approximation function from (4.1) at s. In addition to the assumptions from Proposition 4.1 let \(0<u\in {\mathbb {R}}^{n}\) be such that

$$\begin{aligned} \overline{{\mathfrak {b}}}(s) + \left( {\mathfrak {B}}_0(s)+{\mathfrak {B}}(s)\,u\right) \,u\le u \end{aligned}$$
(4.12)

with \({\mathfrak {B}}(s)\ge \, {\mathfrak {B}}(x,s)\) for all x in \(M_u(s)\), where

$$\begin{aligned} M_u(s) \mathrel {\mathop :}=\left\{ x \mid \left| {x-{\hat{x}}}\right| \le u\right\} \subseteq \mathbf {x}. \end{aligned}$$
(4.13)

Then (1.1) has a solution \(x(s) \in M_u(s)\).

Proof

For arbitrary x in the domain of definition of H we define

$$\begin{aligned} \text {K}_s(x) \mathrel {\mathop :}=x - CH(x,s). \end{aligned}$$

For \(x\in M_u(s)\) we get with (4.9) and (4.11)

$$\begin{aligned} \text {K}_s(x)&= {\hat{x}} - \;CH(g(s)) \, - ({CH_x}\!\left[ g(s),\,(x,s)^T\right] -\mathbbm {1})\,(x-{\hat{x}})\nonumber \\&= {\hat{x}} - \Big ( CH(g(s)) + (CH'_x (g(s)) -\mathbbm {1}) (x-{\hat{x}} )\nonumber \\&\quad + (x-{\hat{x}} )^T\,{CH_{xx}}\!\left[ g(s),\, g(s),\, (x,s )^T\right] \, (x-{\hat{x}} )\Big ). \end{aligned}$$
(4.14)

Taking absolute values we get

$$\begin{aligned} \begin{array}{llll} \left| {\text {K}_s(x) -{\hat{x}}}\right| &{}\le \left| {CH (g(s) )}\right| &{}&{} + \left| {C H'_x(g(s)) -\mathbbm {1}}\right| \left| {x-{\hat{x}}}\right| \\ &{} &{}&{} + \left| {x-{\hat{x}}}\right| \,\left| {{CH_{xx}}\!\left[ g(s),\, g(s), (x,s)^T\right] }\right| \,\left| {x-{\hat{x}}}\right| \\ &{} \le \quad \quad \overline{{\mathfrak {b}}}(s) &{}&{} + \left( {\mathfrak {B}}_0(s) + {\mathfrak {B}}(s)\,u \right) \,u\\ &{}\le \quad \quad u \end{array} \end{aligned}$$
(4.15)

by assumption (4.12). Thus, \(\text {K}_s(x) \in M_u(s)\) for all \(x \in M_u(s)\). Further, (4.14) shows that \(\text {K}_s(x)\) is equal to the Krawczyk operator (2.7) for a fixed parameter s. Hence, we get by Theorem 2.1, or, equivalently, by direct application of Brouwer’s fixed point theorem, that there exists a solution of (1.1) in \(M_u(s)\). \(\square \)

Based on the above results, the following theorem provides a way of constructing inclusion and exclusion regions for an approximate solution \({\hat{x}}(s)\).

Theorem 4.2

In addition to the assumptions from Proposition 4.1 and Theorem 4.1, we take

$$\begin{aligned} {\mathfrak {B}}(s) \ge {\mathfrak {B}}(x,s) \quad \forall \; x \in \mathbf {x}. \end{aligned}$$

For \(0<v\in {\mathbb {R}}^n\) we define

$$\begin{aligned} {\mathfrak {w}}(s) \mathrel {\mathop :}=\left( \mathbbm {1}-{\mathfrak {B}}_0(s)\right) v, \quad {\mathfrak {a}}(s) = v^T\, {\mathfrak {B}}(s)\,v. \end{aligned}$$

If

$$\begin{aligned} D_j(s) \mathrel {\mathop :}={\mathfrak {w}}_j(s)^2 - 4\,{\mathfrak {a}}_j(s)\,\overline{{\mathfrak {b}}}_j(s) > 0 \end{aligned}$$
(4.16)

for all \(j=1,\dots ,n\), we define

$$\begin{aligned} \lambda _j^e(s) \mathrel {\mathop :}=\frac{{\mathfrak {w}}_j(s)+\sqrt{D_j(s)}}{2{\mathfrak {a}}_j(s)}, \quad&\lambda _j^i(s) \mathrel {\mathop :}=\frac{\overline{{\mathfrak {b}}}_j(s)}{{\mathfrak {a}}_j(s)\cdot \lambda _j^e(s)} \end{aligned}$$
(4.17)

and

$$\begin{aligned} \lambda _s^e \mathrel {\mathop :}=\min _{j} \; \lambda _j^e(s), \quad&\lambda _s^i \mathrel {\mathop :}=\max _{j}\;\lambda _j^i(s). \end{aligned}$$
(4.18)

If \(\lambda _s^e > \lambda _s^i\) and

$$\begin{aligned} \left( {\hat{x}}_j +\left[ -1,\, 1\right] \,\lambda _s^i\, v_j\right) \subseteq \mathbf {x}_{j} \quad \text { for all }j, \end{aligned}$$
(4.19)

then there exists at least one zero \(x^*\) of (1.1) for a parameter set s (i.e., \(H(x^*,s)=0\)) in the inclusion region

$$\begin{aligned} {\mathbf {R}}^i_s \mathrel {\mathop :}=\left[ {\hat{x}}-\lambda _s^iv,\; {\hat{x}}+\lambda _s^iv\right] \subseteq {\mathbf {x}}, \end{aligned}$$
(4.20)

and these zeros are the only zeros of H at s in the interior of the exclusion region

$$\begin{aligned} {\mathbf {R}}^e_s \mathrel {\mathop :}=\left[ {\hat{x}}-\lambda _s^ev,\; {\hat{x}}+\lambda _s^ev\right] \cap {\mathbf {x}}. \end{aligned}$$

Proof

We set \(u=\lambda v\) with arbitrary \(0<v\in {\mathbb {R}}^{n}\), and check for which \(\lambda =\lambda (s) \in {\mathbb {R}}_+\) the vector u satisfies property (4.12). We get

$$\begin{aligned} \lambda v&\ge \overline{{\mathfrak {b}}}(s) + \left( {\mathfrak {B}}_0(s) + \lambda \, {\mathfrak {B}}(s)\,v\right) \,\lambda v\\&= \overline{{\mathfrak {b}}}(s) + \lambda \left( v-{\mathfrak {w}}(s)\right) + \lambda ^2 {\mathfrak {a}}(s), \end{aligned}$$

which leads to the sufficient condition

$$\begin{aligned} \lambda ^2{\mathfrak {a}}(s)-\lambda {\mathfrak {w}}(s) + \overline{{\mathfrak {b}}}(s) \le 0. \end{aligned}$$

The jth component of this inequality requires \(\lambda \) to be between the solutions of the quadratic equation

$$\begin{aligned} \lambda ^2{\mathfrak {a}}_j(s)-\lambda {\mathfrak {w}}(s)_j + \overline{{\mathfrak {b}}}(s)_j= 0, \end{aligned}$$

which are exactly \(\lambda _j^i(s)\) and \(\lambda _j^e(s)\). Since \(D_j(s)>0\) for all j by assumption, the interval \(\left[ \lambda _s^i,\; \lambda _s^e\right] \) is nonempty. Thus, for all \(\lambda (s) \in \left[ \lambda _s^i,\; \lambda _s^e\right] \), the vector u satisfies (4.12).

It remains to check whether the solution(s) in \({\mathbf {R}}^i_s\) are the only ones in \({\mathbf {R}}^e_s\). Assume that x is a solution of (1.1) at s with \(x \in {{\,\mathrm{int}\,}}({\mathbf {R}}^e_s)\setminus {\mathbf {R}}^i_s\), and let \(\lambda =\lambda (s)\) be minimal with \(\left| {x-{\hat{x}}}\right| \le \lambda v\). By construction, we have \(\lambda _s^i< \lambda < \lambda _s^e\). In the proof of Theorem 4.1 we got for the Krawczyk operator (2.7)

$$\begin{aligned} \text {K}_s(x) \mathrel {\mathop :}=x-CH(x,s) = x, \end{aligned}$$

since x is a solution of (1.1) at s. Thus, we get by the same considerations as in the proof of Theorem 4.1 from (4.15)

$$\begin{aligned} \left| {x-{\hat{x}}}\right| \le \overline{{\mathfrak {b}}}(s) + \left( {\mathfrak {B}}_0(s) + \lambda \,{\mathfrak {B}}(s)\,v\right) \,\lambda v < \lambda v, \end{aligned}$$

since \(\lambda _s^i< \lambda < \lambda _s^e\). This contradicts the minimality of \(\lambda \), so there is no solution of (1.1) at s in \({{\,\mathrm{int}\,}}({\mathbf {R}}^e_s)\setminus {\mathbf {R}}^i_s\). Hence, if (4.19) is satisfied for all j, there exists at least one solution \(x^*\) of (1.1) at s in the inclusion box \({\mathbf {R}}^i_s\) and there are no other solutions in \({{\,\mathrm{int}\,}}({\mathbf {R}}^e_s)\setminus {\mathbf {R}}^i_s\). \(\square \)

The final step is now to compute a feasible parameter \(0<\mu \in {\mathbb {R}}\) such that Theorem 4.2 holds for all

$$\begin{aligned} s\in \hat{\mathbf {s}}\mathrel {\mathop :}=\left[ p-\mu \,y, \; p+\mu \,y\right] , \end{aligned}$$

with an arbitrary scaling vector \(0<y\in {\mathbb {R}}^p\). Assume \(\widetilde{\mathbf {s}}\subseteq \mathbf {s}\subseteq S\), where \(\mathbf {s}\) is an arbitrary box containing p. We compute a lower bound on each component

$$\begin{aligned} D_j(s) \mathrel {\mathop :}={\mathfrak {w}}_j(s)^2 - 4{\mathfrak {a}}_j(s)\,\overline{{\mathfrak {b}}}_j(s) \end{aligned}$$

from the positivity requirement (4.16) of Theorem 4.2 over the box \(\mathbf {s}\). For the bounds from (4.6) and (4.7) we compute upper bounds

$$\begin{aligned} \overline{{\mathfrak {b}}}&\mathrel {\mathop :}=\overline{{\mathfrak {b}}(\mathbf {s})} = \overline{b} + \mu \, \overline{G_0}\, y \quad \text { with }\quad \overline{G_0} \mathrel {\mathop :}=\overline{G_0(\mathbf {s})}, \end{aligned}$$
(4.21)
$$\begin{aligned} \overline{{\mathfrak {B}}_0}&\mathrel {\mathop :}=\overline{{\mathfrak {B}}_0(\mathbf {s})} = B_0+\mu \,\overline{A}\,y \quad \text { with } \quad \overline{A}\mathrel {\mathop :}=\overline{A(\mathbf {s})}. \end{aligned}$$
(4.22)

Since by construction \(\overline{G_0}\ge G_0(s)\) and \(\overline{A}\ge A(s)\) for all \(s\in \mathbf {s}\), we have

$$\begin{aligned} \overline{{\mathfrak {b}}}\ge \overline{{\mathfrak {b}}}(s)\quad \text { and }\quad \underline{{\mathfrak {w}}} \mathrel {\mathop :}=\left( \mathbbm {1}-\overline{{\mathfrak {B}}_0}\right) v \le \left( \mathbbm {1}-{\mathfrak {B}}_0(s)\right) v={\mathfrak {w}}(s) \end{aligned}$$

for all \(s\in \mathbf {s}\). By computing an upper bound over an appropriate box \(\mathbf {x}\in X\) with \(z\in \mathbf {x}\) (e.g., take \(\mathbf {x} = k\, \mathbf {R}_0^e\) with \(k\in {\mathbb {R}}_+,\, k \le 1\)) we get upper bounds on the second order slopes from (4.4c)

$$\begin{aligned} \overline{{\mathfrak {B}}} \mathrel {\mathop :}=\overline{\left| {{CH_{xx}}\!\left[ g(\mathbf {s}), g(\mathbf {s}), ({\mathbf {x}}, \mathbf {s})^T \right] }\right| }, \end{aligned}$$
(4.23)

which satisfy

$$\begin{aligned} \overline{{\mathfrak {B}}}_j \ge {\mathfrak {B}}_j(s) \ge {\mathfrak {B}}_j(x,s) \quad \text { for all } s \in \mathbf {s},\; x\in \mathbf {x}. \end{aligned}$$

Hence, a lower bound on the discriminant D from (4.16) is obtained by

$$\begin{aligned} \underline{D}_j = \underline{{\mathfrak {w}}}_j^2-4\;\overline{{\mathfrak {a}}}_j\;\overline{{\mathfrak {b}}}_j \end{aligned}$$

with \(\overline{{\mathfrak {b}}}\) from (4.21) and

$$\begin{aligned} \underline{{\mathfrak {w}}} \mathrel {\mathop :}={\mathfrak {w}}(\mu )= \left( \mathbbm {1}-\overline{{\mathfrak {B}}_0}\right) v,\qquad \overline{{\mathfrak {a}}} \mathrel {\mathop :}=v^T\, \overline{{\mathfrak {B}}}\, v. \end{aligned}$$
(4.24)

Considering \(\underline{D}_j = \underline{D}_j(\mu )\), we get

$$\begin{aligned} \underline{D}_j(\mu ) = \alpha _j^2\,\mu ^2-2\,\beta _j\,\mu + \gamma _j \end{aligned}$$
(4.25)

with

$$\begin{aligned} \alpha _j \mathrel {\mathop :}=\left( (\overline{A}y)v\right) _j, \qquad \beta _j \mathrel {\mathop :}=\alpha _j w_j+2\,\overline{{\mathfrak {a}}}_j\left( \overline{G_0}y\right) _j, \qquad \gamma _j = w_j^2-4\,\overline{{\mathfrak {a}}}_j\,\overline{b}_j \end{aligned}$$
(4.26)

and \(w_j\) from (3.5) and \(\overline{b}_j\) from (3.4). Solving each quadratic equation

$$\begin{aligned} \underline{D}_j(\mu ) = \alpha _j^2\,\mu ^2-2\,\beta _j\,\mu + \gamma _j = 0 \end{aligned}$$
(4.27)

for \(\mu \), if \(\alpha _j \ne 0\), we get

$$\begin{aligned} \mu _j = \frac{\beta _j \pm \sqrt{\beta _j^2-\alpha _j^2\,\gamma _j}}{\alpha _j^2}. \end{aligned}$$
(4.28)

Since \(H(z,p)\approx 0\), the discriminant in (4.28) is smaller than \(\beta _j^2\): we have

$$\begin{aligned} \beta _j^2-\alpha _j^2\gamma _j = \beta _j^2-\alpha _j^2(w_j^2-4\,\overline{{\mathfrak {a}}}_j\,\overline{b}_j) \end{aligned}$$

with \(\overline{b}_j\) being an upper bound for the function value at \((z,p)\) and thus close to zero. Hence, both solutions of (4.27) are positive. In order to derive numerically stable results, we compute these solutions by

$$\begin{aligned} \overline{\mu }_j = \frac{\beta _j + \sqrt{\beta _j^2-\alpha _j^2\gamma _j}}{\alpha _j^2}, \qquad \underline{\mu }_j=\frac{\gamma _j}{\alpha _j^2\,\overline{\mu }_j}. \end{aligned}$$
(4.29)

and set

$$\begin{aligned} \mu \mathrel {\mathop :}=\min _j \; \underline{\mu }_j, \end{aligned}$$

since we need \(\mu \in [0,\, \underline{\mu }_j]\) in order to meet the positivity requirement (4.16) in the j-th component.
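The reformulation (4.29) is the usual way of avoiding cancellation when computing the smaller root of a quadratic. The following toy snippet (added here, with made-up coefficients unrelated to any concrete system) shows the effect when \(\gamma _j\) is small compared to \(\beta _j^2\), which is exactly the situation here since \(\overline{b}\approx 0\):

```python
import math

# Smaller root of  alpha^2 mu^2 - 2 beta mu + gamma = 0  (cf. (4.27)-(4.29));
# made-up coefficients with gamma << beta^2 to expose the cancellation.
alpha, beta, gamma = 1.0, 1.0e8, 1.0

disc = math.sqrt(beta**2 - alpha**2 * gamma)
mu_naive  = (beta - disc) / alpha**2          # cancellation: beta - disc loses all digits
mu_upper  = (beta + disc) / alpha**2          # well-conditioned larger root, cf. (4.29)
mu_stable = gamma / (alpha**2 * mu_upper)     # stable smaller root, cf. (4.29)

print(mu_naive)   # 0.0
print(mu_stable)  # 5e-09 (correct to full precision)
```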

Now we are able to state and prove the main result of this section.

Theorem 4.3

Let \(\mathbf {s}\subseteq S\) with \(p\in \mathbf {s}\) and \(\mathbf {x}\subseteq X\) with \(z\in \mathbf {x}\) be as above. In addition to the assumptions of Theorem 4.2 we assume upper bounds

  1. (i)

    on the first order slope of H(g(s)),

    $$\begin{aligned} \overline{G}_0 \mathrel {\mathop :}=\overline{G_0(\mathbf {s})} = \overline{\left| {{CH}\!\left[ g\left( p\right) , g(\mathbf {s})\right] }\right| \, \left| {{g}\!\left[ p,\; \mathbf {s}\right] }\right| } , \end{aligned}$$
    (4.30)
  2. (ii)

    on the slope of the first derivative of H(g(s)) wrt. x,

    $$\begin{aligned} \overline{A} \mathrel {\mathop :}=\overline{A(\mathbf {s})} = \overline{\left| C{(H'_x)}\!\left[ g(p),\; g(\mathbf {s})\right] \right| \, \left| {g}\!\left[ p,\; \mathbf {s}\right] \right| }, \end{aligned}$$
    (4.31)
  3. (iii)

    and on the second order slopes of H(g(s)) wrt. x

    $$\begin{aligned} \overline{{\mathfrak {B}}}\mathrel {\mathop :}=\overline{\left| {C{H_{xx}}\!\left[ g(\mathbf {s}),\; g(\mathbf {s}),\; (\mathbf {x},\mathbf {s})^T\right] }\right| } \end{aligned}$$

which hold for all \(s\in \mathbf {s}\), \(x\in \mathbf {x}\). Let further \(0<y\in {\mathbb {R}}^{p}\), \(0<v\in {\mathbb {R}}^{n}\) as before,

$$\begin{aligned} \overline{\mu }_j = \frac{\beta _j + \sqrt{\beta _j^2-\alpha _j^2\gamma _j}}{\alpha _j^2}, \qquad \underline{\mu }_j=\frac{\gamma _j}{\alpha _j^2\,\overline{\mu }_j}. \end{aligned}$$
(4.32)

with \(\alpha \), \(\beta \), \(\gamma \) as defined in (4.26), using (4.24), and

$$\begin{aligned} \underline{\mu }\mathrel {\mathop :}=\min _j\underline{\mu }_j. \end{aligned}$$

Let \(\eta \in [0, \underline{\mu }]\) be maximal such that

$$\begin{aligned} \lambda ^e(\eta )>\lambda ^i(\eta ), \end{aligned}$$
(4.33)

where

$$\begin{aligned} \begin{aligned} \lambda ^e_{\eta } = \min _j \lambda _j^e(\eta ), \quad \text { with }\quad&\lambda _j^e(\eta ) =\frac{{\mathfrak {w}}_j(\eta )+\sqrt{\underline{D}_j(\eta )}}{2\overline{{\mathfrak {a}}}_j},\\ \lambda ^i_{\eta } = \max _j \lambda _j^i(\eta ), \quad \text { with }\quad&\lambda _j^i(\eta ) =\frac{\overline{{\mathfrak {b}}}_j(\eta )}{\overline{{\mathfrak {a}}}_j\; \lambda _j^e(\eta )} \end{aligned} \end{aligned}$$
(4.34)

with \({\mathfrak {w}}_j(\eta )\), \(\overline{{\mathfrak {b}}}_j(\eta )\), \(\overline{{\mathfrak {a}}}\) and \(D_j(\eta )\) as defined in (4.24), (4.25). Further, let \(\sigma \in \left[ 0,\underline{\mu }\right] \) be the largest value such that for all \(j=1,\dots ,n\)

$$\begin{aligned} {\hat{x}}_j(\mathbf {s}_{\sigma }) + \left[ -1, \, 1\right] \,\lambda ^i_{\sigma }\, v_j \subseteq \mathbf {x}_{j} \end{aligned}$$
(4.35)

for \(\mathbf {s}_{\sigma } \mathrel {\mathop :}=\left[ p-\sigma y, \, p+\sigma y\right] \). If

$$\begin{aligned} \mu \mathrel {\mathop :}=\min (\eta , \sigma ) > 0, \end{aligned}$$
(4.36)

then for all \(s\in {{\,\mathrm{int}\,}}(\widetilde{\mathbf {s}})\cap \mathbf {s}\) with

$$\begin{aligned} \widetilde{\mathbf {s}} \mathrel {\mathop :}=\left\{ s\mid \left| {s-p}\right| \le \mu \, y\right\} \end{aligned}$$

there exists at least one solution x of (1.1) which lies inside the inclusion box \({\mathbf {R}}^i_s\) (as defined in (4.20)) and there are no solutions in \(\cup _{s\in \mathbf {s}}{{\,\mathrm{int}\,}}({\mathbf {R}}^e_s)\setminus \cup _{s\in \mathbf {s}}({\mathbf {R}}^i_s)\).

Proof

Without loss of generality, let \(s=p+ \nu \, y\in {{\,\mathrm{int}\,}}(\widetilde{\mathbf {s}})\cap \mathbf {s}\), i.e., \(0<\nu <\mu \). In order to meet all requirements of Theorem 4.2, which provides the result about the inclusion/exclusion regions at s, we have to check that the following three conditions hold:

  1. (i)

    \(D_j(s)>0\) for all \(j=1,\dots ,n\), with \(D_j\) as in (4.16) (positivity requirement).

  2. (ii)

    \(\lambda _s^i< \lambda _s^e\) (monotonicity of the inclusion/exclusion parameters).

  3. (iii)

    \(\left( {\hat{x}}_j(s) +\left[ -1,\, 1\right] \, \lambda ^i_s\,\,v_j\right) \subseteq \mathbf {x}_j\) \(\forall \, j=1, \dots ,n\) (feasibility of the inclusion region \({\mathbf {R}}_s^i\)).

Condition (i) is satisfied by construction, since we get for \(D_j(s)\) as in (4.16), by the calculations preceding the statement of the theorem,

$$\begin{aligned} 0 \le \underline{D}_j\bigl (\underline{\mu }\bigr ) \le \underline{D}_j\left( \mu \right) <\underline{D}_j\left( \nu \right) \le D_j\left( s\right) \end{aligned}$$

for all \(s\in {{\,\mathrm{int}\,}}(\widetilde{\mathbf {s}})\cap \mathbf {s}\).

For (ii) we consider \(\lambda ^e_j\), \(\lambda ^i_j\) componentwise as functions in \(\nu \), i.e.,

$$\begin{aligned} \lambda _j^e(\nu ) \mathrel {\mathop :}=\frac{\underline{{\mathfrak {w}}}_j(\nu )+\sqrt{\underline{{\mathfrak {w}}}_j(\nu )^2-4\,\overline{{\mathfrak {a}}}_j\overline{{\mathfrak {b}}}_j(\nu )}}{2\,\overline{{\mathfrak {a}}}_j}, \quad \lambda _j^i(\nu ) \mathrel {\mathop :}=\frac{\overline{{\mathfrak {b}}}_j(\nu )}{\overline{{\mathfrak {a}}}_j\; \lambda _j^e(\nu )}. \end{aligned}$$
(4.37)

Since by construction \(\underline{{\mathfrak {w}}}\le {\mathfrak {w}}(s)\), \(\overline{{\mathfrak {b}}}\ge \overline{{\mathfrak {b}}}(s)\), \(\overline{{\mathfrak {a}}}\ge {\mathfrak {a}}(s)\), we have

$$\begin{aligned} \lambda _j^e(\nu ) \le \lambda _j^e(s), \quad \lambda _j^i(\nu ) \ge \lambda _j^i(s) \end{aligned}$$
(4.38)

with \(\lambda _j^e(s)\), \(\lambda _j^i(s)\) as in (4.17). Both \(\lambda ^e_j\left( \nu \right) \) and \(\lambda ^i_j\left( \nu \right) \) depend continuously on \(\nu \). In particular, for increasing \(\nu \), \(\lambda ^e_j(\nu )\) is monotonically decreasing and \(\lambda ^i_j(\nu )\) is monotonically increasing. Hence, \(\lambda ^e_{\nu } = \min _j \lambda _j^e(\nu )\) and \(\lambda ^i_{\nu } = \max _j \lambda _j^i(\nu )\) show the same monotonicity behaviour, since we take a minimum (resp. maximum) of monotonically decreasing (resp. increasing) functions. By computing \(\underline{\mu }\) from (4.32), we get a lower and an upper bound on the exclusion and inclusion parameters \(\lambda ^e\) and \(\lambda ^i\), respectively, since \(\underline{D}_k\bigl (\underline{\mu }\bigr ) = 0\) implies \(\lambda _k^e\bigl (\underline{\mu }\bigr )=\lambda _k^i\bigl (\underline{\mu }\bigr )\) for some \(k\in \lbrace 1,\dots ,n\rbrace \). Since we choose \(\eta \in \bigl [0,\, \underline{\mu }\bigr ]\) in such a way that \(\lambda ^e_{\eta }>\lambda ^i_{\eta }\), we have in particular

$$\begin{aligned} \lambda _j^e(\eta )\ge \lambda _k^e\bigl (\underline{\mu }\bigr ), \quad \lambda _j^i(\eta )\le \lambda _k^i\bigl (\underline{\mu }\bigr ) \end{aligned}$$

for all \(j = 1,\dots ,n\). By monotonicity, we have

$$\begin{aligned} \lambda _j^e(\nu )\ge \lambda _j^e(\eta ), \quad \lambda _j^i(\nu )\le \lambda _j^i(\eta ). \end{aligned}$$
(4.39)

Taking the minimum (resp. maximum) over all j, we get by (4.38), (4.39) and assumption (4.33)

$$\begin{aligned} \lambda ^i_s \le \lambda ^i_{\nu }\le \lambda ^i_{\eta }<\lambda ^e_{\eta }\le \lambda ^e_{\nu }\le \lambda ^e_s, \end{aligned}$$

hence, condition (ii) is satisfied.

Finally, we have for \(\nu \le \sigma \) and by assumption (4.35)

$$\begin{aligned} {\hat{x}}_j(s)+\lambda _s^i\,v_j \le \overline{{\hat{x}}_j(\mathbf {s}_{\nu })}+\lambda ^i_{\nu }\,v_j\le \overline{x}_j, \quad \text { and }\quad {\hat{x}}_j(s)-\lambda _s^i\,v_j \ge \underline{{\hat{x}}_j(\mathbf {s}_{\nu })}-\lambda ^i_{\nu }\,v_j\ge \underline{x}_j, \end{aligned}$$

since \({\hat{x}}(s)\in {\hat{x}}(\mathbf {s}_{\nu })\), and \(\lambda _s^i\le \lambda _{\nu }^i\) by (ii). Hence, condition (iii) is satisfied as well, which concludes the proof. \(\square \)

As for the non-parametric case, the above considerations simplify if the system (1.1) is quadratic in both x and s. Since the first order slopes then are linear in x and s, all second order slopes are constant. Hence the estimates \({\mathfrak {B}}(x,s)\) become valid everywhere in the domain of definition of H, i.e., \(\overline{{\mathfrak {B}}}={\mathfrak {B}}(x,s)=\overline{B}\). If the approximation function (4.1) is linear, its first order slope (2.1) is constant, which simplifies the bounds in (4.30) and (4.31). In particular, if (1.1) is quadratic and the approximation function \({\hat{x}}\) is linear both in x and s, \(\overline{A}\) from (4.31) is constant.

5 Numerical results

The proposed method is an integral tool in the new solver rPSE for rigorously solving Parameter-dependent Systems of Equations. The solver uses a branch-and-bound approach for computing an inner approximation of the feasible parameter region in a given initial box together with an enclosure of the solution set over found regions. Additionally, infeasible parameter regions are identified. In the following, the main steps of a generic implementation of the algorithm are described. A variant of the solver, \({{{\texttt {\textsc {rPSE}}}}_{q}}\) for quadratic problems, is currently implemented in Matlab, utilizing the interval toolbox Intlab [31]. In the quadratic case, the second order slopes needed for the calculations in (4.21) and (4.22) are constant. All computations were performed on a laptop equipped with a 2.40 GHz Intel Core i7-5500U CPU, 32GB RAM. A comprehensive description of the solver together with numerical results and a thorough comparison with other existing methods is provided in Ponleitner [26].

In the following, the new method is demonstrated on three examples. Example 5.1 illustrates the computation of feasible parameter regions as proposed in Sect. 4, i.e., performing approxSol, inExPoint, inExParam. Examples 5.2 and 5.3 provide computational results from applying a basic implementation of \({{{\texttt {\textsc {rPSE}}}}_{q}}\). Here, in step 4 (inExParam) of rPSE, we use a linear approximation function \({\hat{x}}(s)\), in splitBox we perform splits only in the parameter space, and for the generic feasibility tests in boxFeasCheck we utilize centered forms. Since the examples serve to present the new method, we exclusively used the new parametric inclusion-exclusion region method for computing feasible parameter regions, and did not apply additional parametric methods, more sophisticated heuristics, or constraint propagation for improving the enclosures of the solution curves.

Main steps of the solver rPSE. The generic implementation of the solver performs the following steps for rigorously computing an inner approximation of the feasible parameter region for (1.1) in an initial box \(\mathbf {s}\):

  • Input:
    • initial box \((\mathbf {x}, \mathbf {s})\),
    • model equations for (1.1),
    • accuracy thresholds \(\epsilon _x\), \(\epsilon _s\)

  • Initialize:

    $$\begin{aligned} \begin{array}{ll} {\mathcal {L}}\mathrel {\mathop :}=\{(\mathbf {x},\mathbf {s})\}\quad &{} \text {list of unprocessed boxes}\\ {\mathcal {B}}_{sol}\mathrel {\mathop :}=\emptyset \quad &{} \text {list of } feasible \text { parameter boxes together with solution enclosures}\\ {\mathcal {B}}_{undec}\mathrel {\mathop :}=\emptyset \quad &{} \text {list of } undecided \text { boxes (neither proven feasible, nor found infeasible)}\\ {\mathcal {B}}_{infeas}\mathrel {\mathop :}=\emptyset \quad &{} \text {list of } infeasible \text { parameter intervals together with corr. variable boxes}\\ \end{array} \end{aligned}$$

  • while \({\mathcal {L}}\ne \emptyset \) do
    • Step 1: \(B = {\mathcal {L}}_1\), \({\mathcal {L}} = {\mathcal {L}}\setminus \{ B\}\)
    • Step 2: approxSol: compute an approximate solution z(p) at an initial set of parameters \(p\in B_S\)
    • Step 3: inExPoint: compute an inclusion/exclusion region at z
      • if successful \(\rightarrow \) goTo Step 4: inExParam
      • else \(\rightarrow \) do splitBox
    • Step 4: inExParam: compute \(\mu \), \(\widetilde{\mathbf {s}}\), check the resulting solution enclosure \(\hat{\mathbf {x}}\) for feasibility
      • if successful \(\rightarrow \) update \({\mathcal {B}}_{sol} = {\mathcal {B}}_{sol}\cup \bigl (\hat{\mathbf {x}},\,\widetilde{\mathbf {s}}\bigr )\), \(\rightarrow \) do cutBox
      • else \(\rightarrow \) do splitBox

  • cutBox: compute \(\{B_i\} = B\setminus \{B_{sol}\}\), \(\rightarrow \) do boxFeasCheck for the remaining boxes \(B_i\)

  • splitBox:
    • if \(\max {{{\,\mathrm{rad}\,}}(\mathbf {s}_B)} > \epsilon _s\) (or \(\max {{{\,\mathrm{rad}\,}}(\mathbf {x})} > \epsilon _x\)): split the current box according to some heuristics, \(\rightarrow \) do boxFeasCheck for all new boxes \(B_i\)
    • else update \({\mathcal {B}}_{undec} = {\mathcal {B}}_{undec}\cup \{B\}\), \(\rightarrow \) goTo Step 1

  • boxFeasCheck: check all boxes \(B_i\) for feasibility:
    • foreach \(B_i\) do
      • if \(B_i\) feasible \(\rightarrow \) update \({\mathcal {L}} = {\mathcal {L}}\cup \{B_i\}\)
      • else (\(B_i\) infeasible) \(\rightarrow \) update \({\mathcal {B}}_{infeas} = {\mathcal {B}}_{infeas}\cup \{B_i\}\)
    • \(\rightarrow \) goTo Step 1

  • Output: \({\mathcal {B}}_{sol}\), \({\mathcal {B}}_{infeas}\), \({\mathcal {B}}_{undec}\)
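The outline above amounts to a straightforward work-list loop. The following Python skeleton is only a schematic sketch of that control flow, added here for illustration; the callables approx_sol, in_ex_point, in_ex_param, split_box, cut_box and is_feasible are placeholders for the respective routines approxSol, inExPoint, inExParam, splitBox, cutBox and boxFeasCheck and are not implemented here (the actual solver \({{{\texttt {\textsc {rPSE}}}}_{q}}\) is written in Matlab/Intlab).

```python
# Schematic sketch of the rPSE work-list loop described above.  The step
# routines are passed in as callables and are not implemented here.

def rpse(initial_box, approx_sol, in_ex_point, in_ex_param,
         split_box, cut_box, is_feasible):
    work = [initial_box]          # list L of unprocessed boxes (x, s)
    solutions, undecided, infeasible = [], [], []

    def dispatch(boxes):
        # boxFeasCheck: keep feasible boxes in the work list, record infeasible ones
        for b in boxes:
            (work if is_feasible(b) else infeasible).append(b)

    while work:
        box = work.pop(0)                           # Step 1
        z, p = approx_sol(box)                      # Step 2: approxSol
        point_ok = in_ex_point(box, z, p)           # Step 3: inExPoint
        param_result = in_ex_param(box, z, p) if point_ok else None  # Step 4

        if param_result is not None:                # inExParam succeeded
            x_hat, s_tilde = param_result           # solution enclosure over s_tilde
            solutions.append((x_hat, s_tilde))
            dispatch(cut_box(box, (x_hat, s_tilde)))  # cutBox on the remainder
        else:
            new_boxes = split_box(box)              # splitBox; None if box below tolerance
            if new_boxes is None:
                undecided.append(box)
            else:
                dispatch(new_boxes)

    return solutions, infeasible, undecided
```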

Example 5.1

Illustration of the algorithmic concept from Sect. 4 for computing a feasible parameter interval, given an approximate solution z(p) of (1.1) at the center p.

Step 1: Inclusion/exclusion region for fixed parameter p. We consider the system of equations

$$\begin{aligned} H(x,s) = \begin{pmatrix} x_1^2 + x_2^2-26+s^2\\ x_1\cdot x_2 -13+s \end{pmatrix} = 0 \end{aligned}$$
(5.1)

for \(s\in \mathbf {s} = [0,\;2]\), \(x \in \mathbf {x}= [0,\;5]\times [0,\; 5]\). We set \(p = 1\) and compute a corresponding solution \(z = (3,4)^T\). A slope for H with center \((z,p)\) can be computed as

$$\begin{aligned} H(x,s)-H(z,p)&= \underbrace{\begin{pmatrix} x_1+z_1 &{} x_2 + z_2 &{} s+p\\ x_2 &{} z_1 &{} 1 \end{pmatrix}}_{={H}\!\left[ (z,p),(x,s)\right] } \begin{pmatrix} x_1-z_1\\ x_2-z_2\\ s-p \end{pmatrix}. \end{aligned}$$
(5.2)

Substituting the solution from above, we get

$$\begin{aligned} \begin{aligned} {H}\!\left[ (z,p), (x,s)\right]&= \begin{pmatrix} {H_x}\!\left[ (z,p), (x,s)\right] ,&{H_s}\!\left[ (z,p), (x,s)\right] \end{pmatrix} \\&= \begin{pmatrix} \begin{pmatrix} x_1+3 &{} x_2 +4 \\ x_2 &{} 3 \end{pmatrix}, \begin{pmatrix} s+1\\ 1 \end{pmatrix} \end{pmatrix}, \end{aligned} \end{aligned}$$

and for the Jacobian of H wrt. x at \((z,p)\)

$$\begin{aligned} H'_{x} = \begin{pmatrix} 2x_1 &{} 2x_2\\ x_2 &{} x_1 \end{pmatrix}, \quad H'_{x}(z, p) = \begin{pmatrix} 6 &{} 8 \\ 4&{} 3\end{pmatrix}. \end{aligned}$$

For the preconditioning matrix C we take

$$\begin{aligned} C \mathrel {\mathop :}=H'_{x}(z,p)^{-1} = \frac{1}{14}\begin{pmatrix} -3 &{} 8\\ 4 &{} -6 \end{pmatrix}. \end{aligned}$$
(5.3)

The slope from (5.2) can be put in form (2.3) with

$$\begin{aligned} H_1 = \begin{pmatrix} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0\end{pmatrix}, \quad H_2 = \begin{pmatrix} 0 &{} 1&{} 0 \\ 1 &{} 0 &{} 0 \end{pmatrix}, \quad H_3 = \begin{pmatrix} 0 &{} 0 &{} 1\\ 0 &{} 0 &{} 0 \end{pmatrix}. \end{aligned}$$

Thus, we get for the bounds from (3.4)

$$\begin{aligned} \overline{b} = (0,0)^T, \quad B_0 = \begin{pmatrix} 0 &{} 0 \\ 0 &{} 0 \end{pmatrix}, \quad \overline{B}= \begin{pmatrix} \frac{1}{14} \begin{pmatrix} 3 &{} 0 \\ 4 &{} 0 \end{pmatrix}, \; \frac{1}{14} \begin{pmatrix} 8 &{} 3 \\ 6 &{} 4 \end{pmatrix} \end{pmatrix}, \end{aligned}$$
(5.4)

where \(\overline{b}\) and \(B_0\) both vanish, since z happens to be an exact zero of (5.1) and the computation is free of roundoff errors. With \(v=(1,1)^T\) we get from (3.5)

$$\begin{aligned} w = (1,1)^T, \quad a = \left( \overline{B}_1+\overline{B}_2\right) v = (1,1)^T, \end{aligned}$$
(5.5)

and with \(D_1 = D_2 = 1\) we get from (3.6)

$$\begin{aligned} \lambda ^e = 1, \quad \lambda ^i = 0. \end{aligned}$$

Hence, we get

$$\begin{aligned} \mathbf {R}^e_0 = \begin{pmatrix} {[}2,4]\\ {[}3,5] \end{pmatrix}, \quad \mathbf {R}^i_0 = \begin{pmatrix} 3\\ 4\end{pmatrix}. \end{aligned}$$

Step 2: Construction of a feasible parameter interval. Since (5.1) is a quadratic problem, we consider a linear approximation function

$$\begin{aligned} {\hat{x}} :{\mathbb {R}}\rightarrow {\mathbb {R}}^2, \quad {\hat{x}}(s) = z+\varTheta \, (s-p) \end{aligned}$$

with \(\varTheta \in {\mathbb {R}}^{2\times 1}\). In order to indicate the influence of the approximation function on the radius of the resulting feasible parameter box, we compute the parameter interval \(\widetilde{\mathbf {s}}\) from Theorem 4.3 for two different linear approximations,

  1. (i)

a tangent \({\hat{x}}^{tan}\) in \((z,p)\) with

    $$\begin{aligned} \varTheta ^{tan} = -(H'_x(z,\, p))^{-1}\,(H'_s(z, p)) = -\frac{1}{7}\begin{pmatrix}1\\ 1 \end{pmatrix}, \end{aligned}$$
    (5.6)
  2. (ii)

a secant \({\hat{x}}^{sec}\) through the center \((z,p)\) and a second point \(x^1=(\sqrt{13},\sqrt{13})^T\) at \(s^1=0\) with

    $$\begin{aligned} \varTheta ^{sec} = \left( x^1-z\right) \, \left( s^1-p\right) ^{-1} = \begin{pmatrix} 3-\sqrt{13}\\ 4-\sqrt{13} \end{pmatrix}. \end{aligned}$$
    (5.7)

Thus, we have

$$\begin{aligned} g(s) = \begin{pmatrix} z+\varTheta \, (s-p)\\ s \end{pmatrix}\quad \text { with constant slope matrix } {g}\!\left[ s,p\right] =\begin{pmatrix} \varTheta \\ 1\end{pmatrix}. \end{aligned}$$

In order to apply Theorem 4.3 we compute the upper bounds \(\overline{G}_0\) and \(\overline{A}\) from (4.30) and (4.31), respectively. A slope for \(H'_x\) is

$$\begin{aligned} {(H'_x)}\!\left[ g(p), g(s)\right] = \begin{pmatrix} \begin{pmatrix} 2 &{} 0 \\ 0 &{} 1 \end{pmatrix}, \; \begin{pmatrix} 0&{}2\\ 1&{}0 \end{pmatrix}, \; \begin{pmatrix} 0 &{} 0\\ 0 &{} 0 \end{pmatrix} \end{pmatrix}, \end{aligned}$$

since

$$\begin{aligned} H'_x(x,s) - H'_x(z, p)&= \begin{pmatrix} 2 &{} 0 \\ 0&{} 1 \end{pmatrix}(x_1-z_1) + \begin{pmatrix} 0 &{} 2 \\ 1 &{} 0\end{pmatrix}(x_2-z_2). \end{aligned}$$

With the preconditioning matrix C from (5.3) we get

$$\begin{aligned} \overline{G}_0^{tan}&= \frac{1}{98}\begin{pmatrix}51\\ 58\end{pmatrix},&\text { and }\quad \overline{A}^{tan}&= \frac{1}{7}\begin{pmatrix}1&{}1\\ 1&{}1\end{pmatrix},\\ \overline{G}_0^{sec}&= \frac{1}{7}\begin{pmatrix}14(\sqrt{13}-3)\\ 58-14\sqrt{13}\end{pmatrix},&\text { and }\quad \overline{A}^{sec}&= \frac{1}{7}\begin{pmatrix}7-\sqrt{13}&{}\sqrt{13}\\ \sqrt{13}&{}7-\sqrt{13}\end{pmatrix} \end{aligned}$$

for the tangent \({\hat{x}}^{tan}\) and the secant \({\hat{x}}^{sec}\), respectively. Here the tensor products in the formula for \(\overline{A}\) are evaluated using the multiplication rules (1.2). Since we have a quadratic problem, the second order slopes are constant for all \(x \in \mathbf {x}\), \(s\in \mathbf {s}\), and thus \(\overline{{\mathfrak {B}}}= \overline{B}\) from (3.4). With these preparations we are able to compute \(\mu \). We take \(y = 1\) and get from (4.26)

  1. (i)

    for tangent \({\hat{x}}^{tan}\):

    $$\begin{aligned} \alpha ^{tan} = \frac{2}{7}\begin{pmatrix}1\\ 1\end{pmatrix},\quad \beta ^{tan}=\frac{1}{49}\begin{pmatrix}65\\ 72\end{pmatrix},\quad \gamma ^{tan}=\begin{pmatrix}1\\ 1\end{pmatrix}, \end{aligned}$$
  2. (ii)

    and for secant \({\hat{x}}^{sec}\)

    $$\begin{aligned} \alpha ^{sec} = \begin{pmatrix}1\\ 1\end{pmatrix},\quad \beta ^{sec}=\frac{1}{7}\begin{pmatrix}28\sqrt{13}-77\\ 123-28\sqrt{13}\end{pmatrix},\quad \gamma ^{sec}=\begin{pmatrix}1\\ 1\end{pmatrix}, \end{aligned}$$

which results in

$$\begin{aligned} \mu ^{tan} \approx 0.343, \quad \quad \mu ^{sec}\approx 0.149. \end{aligned}$$
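These values of \(\mu \) can be reproduced directly from (4.29) and the coefficients \(\alpha \), \(\beta \), \(\gamma \) given above; the following short NumPy check (added here; plain floating point, so not a verified computation) does this for both approximation functions:

```python
import numpy as np

# Reproduce mu^tan and mu^sec from (4.29) using the coefficients alpha, beta,
# gamma computed above (plain floating point, not a verified computation).
def mu_from_coeffs(alpha, beta, gamma):
    mu_up  = (beta + np.sqrt(beta**2 - alpha**2 * gamma)) / alpha**2
    mu_low = gamma / (alpha**2 * mu_up)
    return mu_low.min()

s13 = np.sqrt(13.0)
alpha_tan, beta_tan, gamma_tan = (2/7)*np.ones(2), np.array([65.0, 72.0])/49, np.ones(2)
alpha_sec, beta_sec, gamma_sec = np.ones(2), np.array([28*s13 - 77, 123 - 28*s13])/7, np.ones(2)

print(mu_from_coeffs(alpha_tan, beta_tan, gamma_tan))   # ~0.3435
print(mu_from_coeffs(alpha_sec, beta_sec, gamma_sec))   # ~0.1494
```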
Fig. 1  Two solution curves (solid and dashed lines) for \(x_1\) over \(\mathbf {s}\), approximation function \({\hat{x}}^{tan}\), together with inclusion regions (dash-dotted curves expanding from center z) and exclusion regions (dotted curves, contracting towards the solid solution curve); the vertical dash-dotted lines represent the respective inclusion regions for \({\hat{x}}^{tan}(\underline{s})\) at the lower bound of \(\widetilde{\mathbf {s}}^{tan}\) and for \({\hat{x}}^{tan}(s_1)\) (\(s_1\in \widetilde{\mathbf {s}}^{tan}\)), the vertical dotted lines represent the respective exclusion regions at \(s_1\) and at the center p; the points \(+\) mark the intersection of the inclusion and exclusion regions at the boundaries of \(\widetilde{\mathbf {s}}^{tan}\)

The respective inclusion and exclusion parameters from (4.34) are

$$\begin{aligned} \lambda ^i_{\mu ^{tan}} \approx 0.45092049, \quad&\lambda ^e_{\mu ^{tan}} \approx 0.45092053\\ \lambda ^i_{\mu ^{sec}} \approx 0.42531789, \quad&\lambda ^e_{\mu ^{sec}} \approx 0.42531794, \end{aligned}$$

so, for both approximation functions the monotonicity requirement (4.33) holds for \(\mu \). We still have to check the feasibility condition (4.35) of the inclusion region for the parameter intervals

$$\begin{aligned} \mathbf {s}_{\mu }^{tan}=\left[ 0.657,\, 1.343\right] , \quad \mathbf {s}_{\mu }^{sec} = \left[ 0.851,\, 1.149\right] . \end{aligned}$$
(5.8)

For both intervals the feasibility requirement is met, since

$$\begin{aligned} \begin{aligned} \hat{\mathbf {x}}\bigl (\mathbf {s}_{\mu }^{tan}\bigr ) = {\hat{x}}\bigl (\mathbf {s}_{\mu }^{tan}\bigr )+[-1,\, 1]\,\lambda ^i_{\mu ^{tan}}\, v&= \begin{pmatrix}\\ [3.500,\, 4.500]\end{pmatrix}\subseteq \begin{pmatrix}\\ [0,\, 5]\end{pmatrix} \\ \hat{\mathbf {x}}\bigl (\mathbf {s}_{\mu }^{sec}\bigr ) = {\hat{x}}\bigl (\mathbf {s}_{\mu }^{sec}\bigl )+[-1,\, 1]\,\lambda ^i_{\mu ^{sec}}\, v&= \begin{pmatrix}\\ [3.553,\, 4.447]\end{pmatrix}\subseteq \begin{pmatrix}\\ [0,\, 5]\end{pmatrix}. \end{aligned} \end{aligned}$$
(5.9)

Hence, for both parameter intervals from (5.8) the existence of at least one solution of (5.1) can be guaranteed. The boxes from (5.9) provide a first outer approximation of the solution set. As this low-dimensional example already shows, the choice of the approximation function greatly influences the size of the computed parameter box as well as the quality of the enclosure of the solution set. A good choice of \({\hat{x}}(s)\) is thus important and may require a closer analysis of the problem at hand.
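For readers who want to reproduce the containment test (4.35) numerically, the following minimal Python sketch checks componentwise whether the widened hull \({\hat{x}}(\mathbf {s})+[-1,\, 1]\,\lambda ^i\, v\) lies inside the initial box \(\mathbf {x}\). It uses plain floating-point arithmetic instead of outward-rounded interval arithmetic, and the data in the example call are hypothetical, so it illustrates the structure of the test only and is not a rigorous implementation.

```python
def inclusion_box(x_hat_range, lam_i, v):
    """Widen the hull of x_hat over s (a list of (lo, hi) pairs) by lam_i*|v_k|
    in every component, i.e. form x_hat(s) + [-1, 1] * lam_i * v."""
    return [(lo - lam_i * abs(vk), hi + lam_i * abs(vk))
            for (lo, hi), vk in zip(x_hat_range, v)]

def is_contained(inner, outer):
    """Componentwise containment test for interval vectors."""
    return all(olo <= ilo and ihi <= ohi
               for (ilo, ihi), (olo, ohi) in zip(inner, outer))

# Hypothetical one-component example in the spirit of (5.9):
print(is_contained(inclusion_box([(3.95, 4.05)], 0.45, [1.0]), [(0.0, 5.0)]))  # True
```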

Fig. 2 Inclusion (a) and exclusion (b) regions over \(\widetilde{\mathbf {s}}^{tan}\approx [0.657,\,1.343]\) for approx. function \({\hat{x}}^{tan}\) (computed with upper bounds from (4.21) and (4.22)), and a comparison (c) with the respective regions for approx. function \({\hat{x}}^{sec}\) with \(\widetilde{\mathbf {s}}^{sec}\approx [0.851,\,1.149]\)

In Fig. 1 two solution curves for \(x_1\) over the initial interval \(\mathbf {s}\) are shown together with the tangent approximation \({\hat{x}}^{tan}(s)\) at (z, p) and the corresponding inclusion and exclusion regions. The dash-dotted curves show the development of the inclusion regions expanding from the center z, the dotted curves that of the exclusion regions, both calculated using upper bounds over the initial interval \(\mathbf {s}\), i.e., with \(\overline{G_0}\) and \(\overline{A}\) from (4.21) and (4.22). Since the monotonicity requirement (4.33) is satisfied, the inclusion and exclusion curves intersect at the boundary of \(\widetilde{\mathbf {s}}_{\mu }\) (marked by '\(+\)' in Fig. 1). The vertical dash-dotted and dotted lines represent the respective inclusion and exclusion regions for the center z at p, for \({\hat{x}}(s_1)\) at a parameter \(s_1\in \widetilde{\mathbf {s}}_{\mu }^{tan}\), and for \({\hat{x}}(\underline{s})\) at the lower boundary of the computed parameter interval. Figure 2 depicts these inclusion and exclusion regions as well as a comparison between the respective regions for the two linear approximations (tangent and secant). The parameter interval computed with the secant as approximation function is much smaller than the one computed using the tangent at (z, p).

Close to the boundaries of the feasible parameter region, a phenomenon known as the cluster effect can be observed, i.e., the step size \(\mu \) and thus the radii of the parameter boxes become smaller and smaller (see Fig. 3). The cluster effect occurs because at the boundary of the feasible parameter region small changes in the parameter space may cause large differences in the solution space. Hence, a rigorous numerical proof of feasibility fails at the boundary of the feasible parameter region. Of course, in this case it would be possible to exchange x and s in order to compute a valid inclusion of the one-dimensional manifold and thus avoid the cluster effect. However, in the multidimensional setting this becomes less clear.

Fig. 3 Several parameter intervals \(\widetilde{\mathbf {s}}_i\) (\(i=1,2,\ldots ,13\); bold horizontal lines) computed for (5.1) from various centers together with inclusion (dash-dotted lines) and exclusion (dotted lines) regions over the respective interval; the closer to the boundary of \(\mathbf {s}\), the smaller the radius of the parameter interval gets (cluster effect)

Example 5.2

Numerical results I: 1-dimensional parameter space. We consider the system of equations \(H:{\mathbb {R}}^4\rightarrow {\mathbb {R}}^3\) with

$$\begin{aligned} H(x,s) = \begin{pmatrix} x_1^2 -s^2 -1\\ x_2^2 -s^2 +c_1\cdot s +c_2\\ x_3^2 -s^2 +c_3\cdot s + c_4 \end{pmatrix} = 0 \end{aligned}$$
(5.10)

for \(s\in \mathbf {s} = [5,\; 8]\), \(x \in \mathbf {x}= [5.5, \; 9]\times [2,\; 10]\times [1.5,\;5]\) and coefficients

$$\begin{aligned} c_1&=12 + 12\,\sin ({\varphi }),\\ c_2&= 156\, \cos (\varphi ) -72\,\sin (\varphi )-241,\\ c_3&= 22 + 6\,\sin (\varphi ) - 8\,\cos (\varphi ),\\ c_4&= 106\, \cos (\varphi ) -42\, \sin (\varphi ) -155, \end{aligned}$$

with \(\varphi = \frac{\pi }{9}\), and compute a feasible parameter region using a basic implementation of solver \({{{\texttt {\textsc {rPSE}}}}_{q}}\), which performs the steps described above.
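Because every equation of (5.10) contains only one of the variables, the feasibility of a fixed parameter value s can be cross-checked independently of the solver by solving each equation explicitly and testing whether the root lies in \(\mathbf {x}\). The following Python snippet is such a plain (non-rigorous) check of ours; it is not part of \({{{\texttt {\textsc {rPSE}}}}_{q}}\).

```python
import math

phi = math.pi / 9
c1 = 12 + 12 * math.sin(phi)
c2 = 156 * math.cos(phi) - 72 * math.sin(phi) - 241
c3 = 22 + 6 * math.sin(phi) - 8 * math.cos(phi)
c4 = 106 * math.cos(phi) - 42 * math.sin(phi) - 155

# Bounding box for x = (x1, x2, x3) from the example setup.
x_box = [(5.5, 9.0), (2.0, 10.0), (1.5, 5.0)]

def solve_componentwise(s):
    """Nonnegative roots (x1, x2, x3) of (5.10) for a fixed s, or None if
    some right-hand side s^2 - ... becomes negative."""
    rhs = [s**2 + 1, s**2 - c1 * s - c2, s**2 - c3 * s - c4]
    if any(r < 0 for r in rhs):
        return None
    return [math.sqrt(r) for r in rhs]

def is_feasible(s):
    """True if the componentwise solution exists and lies in x_box."""
    x = solve_componentwise(s)
    return x is not None and all(lo <= xi <= hi
                                 for xi, (lo, hi) in zip(x, x_box))

# e.g. is_feasible(6.5) should be True, while the end points s = 5 and s = 8
# fall into the infeasible (dark grey) bands of Fig. 4.
print(is_feasible(5.0), is_feasible(6.5), is_feasible(8.0))
```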

The three plots in Fig. 4 show an exemplary enclosure for each variable \(x_i\) (\(i=1,2,3\)) computed by \({{{\texttt {\textsc {rPSE}}}}_{q}}\) with accuracy threshold \(\epsilon _s = 0.05\), i.e., boxes with parameter intervals \(\mathbf {s}\) with \({{\,\mathrm{rad}\,}}({\mathbf {s}}) < \epsilon _s\) were marked as undecided and not processed further during the branch-and-bound process. For elements in \({\mathcal {B}}_{undec}\) neither feasibility nor infeasibility can be proven by the algorithm. The computation terminated after 27 iterations (computation time 2.27 s), and the computed feasible parameter region covers about 0.8954 of the true feasible area. Table 1 displays computational results for several accuracy values \(\epsilon _s\). There we denote by \(\overline{\mathbf {s}_{feas}}\) the average ratio of the size of the computed feasible parameter region \(\mathbf {s}^C_{feas}\) to the size of the true feasible parameter region \(\mathbf {s}_{feas}\).
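The branch-and-bound structure described above can be summarized by the following schematic Python sketch of ours; the routines prove_feasible and prove_infeasible are placeholders for the construction and verification of feasible parameter boxes developed in the previous sections, and the sketch is not the actual interface of \({{{\texttt {\textsc {rPSE}}}}_{q}}\).

```python
def radius(s_box):
    """Maximal componentwise radius of a box given as a list of (lo, hi) pairs."""
    return max((hi - lo) / 2 for lo, hi in s_box)

def bisect(s_box):
    """Split the box along its widest parameter direction."""
    k = max(range(len(s_box)), key=lambda i: s_box[i][1] - s_box[i][0])
    lo, hi = s_box[k]
    left, right = list(s_box), list(s_box)
    left[k], right[k] = (lo, (lo + hi) / 2), ((lo + hi) / 2, hi)
    return left, right

def rpse_like(s0, eps_s, prove_feasible, prove_infeasible):
    """Return lists corresponding to B_sol, B_infeas, B_undec in the text."""
    feasible, infeasible, undecided, work = [], [], [], [s0]
    while work:
        s = work.pop()
        if prove_feasible(s):        # existence of x(s) verified for all s in the box
            feasible.append(s)
        elif prove_infeasible(s):    # no solution in the x-box for any s in the box
            infeasible.append(s)
        elif radius(s) < eps_s:      # too small to decide: mark as undecided
            undecided.append(s)
        else:
            work.extend(bisect(s))
    return feasible, infeasible, undecided
```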

Fig. 4 Enclosures of the solution curves (dash-dotted lines) of (5.10) for \(x_i\) (\(i=1,2,3\)) over \(\mathbf {s} = [5,\;8]\); the dark grey areas at the left and right boundary of the interval represent infeasible regions, the grey area in the middle represents the feasible area (steps 1–4 of the algorithm were performed successfully), and the light grey areas in-between display the undecided regions

Table 1 Results for solving (5.10) with \({{{\texttt {\textsc {rPSE}}}}_{q}}\) with accuracy \(\epsilon _s\) (max. radius of parameter intervals \(\mathbf {s}\) with status undecided); the averages for number of iterations (iter), computation time (time, in seconds), and ratio \(\overline{\mathbf {s}_{feas}}\) of the size of the computed feasible parameter region \(\mathbf {s}_{feas}^C\) to the size of the true feasible parameter region \(\mathbf {s}_{feas}\), as well as the average numbers of feasible (\({\mathcal {B}}_{sol}\)), infeasible (\({\mathcal {B}}_{infeas}\)), and undecided (\({\mathcal {B}}_{undec}\)) boxes are computed over 50 repetitions of \({{{\texttt {\textsc {rPSE}}}}_{q}}\)

Example 5.3

Numerical results II: 2-dimensional parameter space. We consider the system of equations \(H:{\mathbb {R}}^5\rightarrow {\mathbb {R}}^3\) with

$$\begin{aligned} H(x,s) = \begin{pmatrix} x_1^2 -s_1^2 -s_2^2\\ x_2^2 -s_1^2 -s_2^2 + c_{21}\cdot s_1 + c_{22}\cdot s_2+c_{23}\\ x_3^2 -s_1^2 -s_2^2 + c_{31}\cdot s_1 + c_{32}\cdot s_2 + c_{33} \end{pmatrix} = 0 \end{aligned}$$
(5.11)

for \(s\in \mathbf {s} = [2,\; 3.5]\times [4.5,\;6.8]\), \(x \in \mathbf {x}= [5.5, \; 7]\times [6,\; 10]\times [1.5,\;5]\), and

$$\begin{aligned} \begin{array}{llll} c_{21} &= 28-12\,\cos ({\varphi }), \quad & c_{31} &= 8- 8\,\sin (\varphi ) - 6\,\cos (\varphi ),\\ c_{22} &=12 + 12\,\sin ({\varphi }), \quad & c_{32} &= 22 + 6\,\sin (\varphi ) - 8\,\cos (\varphi ),\\ c_{23} &=168\,\cos (\varphi ) -72\,\sin (\varphi ) -268, \quad & c_{33} &= 112\,\cos (\varphi )-34\,\sin (\varphi )-162,\\ \end{array} \end{aligned}$$

with \(\varphi = \frac{\pi }{9}\), and compute a feasible parameter region using a basic implementation of solver \({{{\texttt {\textsc {rPSE}}}}_{q}}\). The feasible parameter region is an intersection of annular regions, bounded by arcs that depend on the lower and upper bounds of the bounding boxes for the variables \(x_i\). Figure 5 shows the feasible and infeasible parameter regions computed by \({{{\texttt {\textsc {rPSE}}}}_{q}}\) together with the true bounds of the feasible region. One can also see the boxes with status undecided along the boundary of the feasible region. As in Example 5.1, close to the boundary of the feasible region the cluster effect is visible. The example was computed with accuracy threshold \(\epsilon _s=0.01\). The algorithm terminated after \(168\,s\) and 1478 iterations; the computed feasible area is \(A_{sol}=1.4517\), the infeasible area \(A_{infeas}=1.8486\), and the remaining undecided part \(A_{undec} = 0.1497\). The radius of the largest feasible box computed by \({{{\texttt {\textsc {rPSE}}}}_{q}}\) was \(\mu \approx 0.264\). In Table 2 some results for different accuracy values \(\epsilon _s\) are summarized.
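As in Example 5.2, the exact feasible region of (5.11) can be described explicitly: for each equation, \(x_i^2 = s_1^2+s_2^2-\cdots \) must lie between the squared bounds of the i-th component of \(\mathbf {x}\), which yields the intersection of annular regions mentioned above. The following Python snippet is a plain (non-rigorous) cross-check of ours, with a purely illustrative Monte-Carlo estimate of the true feasible area at the end; it is not part of \({{{\texttt {\textsc {rPSE}}}}_{q}}\).

```python
import math
import random

phi = math.pi / 9
c21 = 28 - 12 * math.cos(phi)
c22 = 12 + 12 * math.sin(phi)
c23 = 168 * math.cos(phi) - 72 * math.sin(phi) - 268
c31 = 8 - 8 * math.sin(phi) - 6 * math.cos(phi)
c32 = 22 + 6 * math.sin(phi) - 8 * math.cos(phi)
c33 = 112 * math.cos(phi) - 34 * math.sin(phi) - 162

x_box = [(5.5, 7.0), (6.0, 10.0), (1.5, 5.0)]

def in_true_region(s1, s2):
    """(s1, s2) lies in the exact feasible region of (5.11) iff every
    x_i^2 = s1^2 + s2^2 - (linear terms) falls into [lo_i^2, hi_i^2]."""
    vals = [s1**2 + s2**2,
            s1**2 + s2**2 - c21 * s1 - c22 * s2 - c23,
            s1**2 + s2**2 - c31 * s1 - c32 * s2 - c33]
    return all(lo**2 <= v <= hi**2 for v, (lo, hi) in zip(vals, x_box))

# Rough Monte-Carlo estimate of the true feasible area in [2, 3.5] x [4.5, 6.8];
# if the enclosures are correct, it should lie between A_sol and A_sol + A_undec.
n = 100_000
hits = sum(in_true_region(random.uniform(2, 3.5), random.uniform(4.5, 6.8))
           for _ in range(n))
print(hits / n * (3.5 - 2) * (6.8 - 4.5))
```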

Fig. 5 Feasible and infeasible regions for (5.11) for parameters \(s\in \mathbf {s} = [2,\; 3.5]\times [4.5,\;6.8]\) together with the true bounds (white dashed line) and an undecided area (accuracy \(\epsilon _s=0.01\)) around the boundary of the feasible region; in (b) the lower right corner of the initial interval is enlarged in order to show a boundary of the feasible region in more detail

Table 2 Results for solving (5.11) for parameters \(s\in \mathbf {s} = [2,\; 3.5]\times [4.5,\;6.8]\) with \({{{\texttt {\textsc {rPSE}}}}_{q}}\) with accuracy \(\epsilon _s\) (max. radius of parameter intervals \(\mathbf {s}\) with status undecided); the averages for number of iterations (iter), and computation time (time, in seconds), as well as the average numbers of feasible (\({\mathcal {B}}_{sol}\)), infeasible (\({\mathcal {B}}_{infeas}\)), and undecided (\({\mathcal {B}}_{undec}\)) boxes, and average covered areas of the initial interval are computed over 20 repetitions of \({{{\texttt {\textsc {rPSE}}}}_{q}}\)

6 Conclusion

Most interval methods for rigorously enclosing the solution set of a system of equations require a reasonable initial estimate of the interval hull of the solution set. The newly proposed method makes it possible to explicitly compute feasible areas \(\widetilde{\mathbf {s}}\) in the parameter space, given an initial approximate solution z(p) for a parameter set p, and provides an initial rigorous enclosure of the solution set of (1.1) for all parameters in the computed parameter boxes. The novelty of the method is the use of first and second order slope approximations for constructing such feasible parameter boxes. The new method is an integral part of the currently developed solver rPSE for rigorously solving nonlinear parameter-dependent systems of equations in a branch-and-bound setting. The numerical results discussed above, obtained with a generic implementation of \({{{\texttt {\textsc {rPSE}}}}_{q}}\), a variant of the solver for quadratic problems, verify the practical applicability of the theoretical results. However, similar to other interval-based branch-and-bound methods, the new approach suffers from some sort of cluster effect (see Fig. 3) when approaching the boundaries of the feasible parameter regions, i.e., the step size \(\mu \) and thus the radii of the parameter boxes become smaller and smaller. This problem may be tackled by extending the method to more general polytopes instead of boxes (as suggested, e.g., in Goldsztejn and Granvilliers [9]), or by using an extension of Miranda's theorem, and is addressed in Ponleitner [26]. The proposed Algorithm \({{{\texttt {\textsc {rPSE}}}}_{q}}\) globally identifies all feasible parameter regions within an initial box, and thus provides an inner approximation of the set of feasible parameters.

The purpose of the basic implementation of the solver demonstrated here is to illustrate the newly proposed method. Hence, the solver (i) does not claim to be competitive, and (ii) has certain limitations in terms of performance. The presented examples are chosen in such a way that, on the one hand, the correctness of the results can easily be verified and, on the other hand, additional algorithmic challenges, such as the separation of multiple solutions in the variable domain, do not arise. Clearly, the problems from Examples 5.2 and 5.3 could be solved using techniques specifically dedicated to single-variable equations, since the equations each depend on one variable only. Since this version of the solver relies solely on the new method for constructing and verifying feasible parameter regions, it is computationally expensive, as can be seen in Tables 1 and 2, respectively. Future implementations will address these limitations and utilize additional parametric interval methods and heuristics in order to reduce computational costs. Despite the simplicity of the problems, the results of Examples 5.2 and 5.3 suggest that the computational effort increases significantly with each additional parameter dimension. Hence, the applicability of the new method seems most promising for problems with low parameter dimensions.

One application of the new method is, for example, the workspace computation of parallel manipulators (see, e.g., [19]). In particular, the computation of the total orientation workspace requires the solution of a parameter-dependent system of nonlinear equations. To the authors' knowledge, there are only a few results addressing this problem with rigorous methods (e.g., [18, 20]). Therefore, future work will, among other things, be concerned with applying the new method to the workspace problem [26, 28].