1 Introduction

Verification methods prove that a given numerical problem is solvable and produce mathematically rigorous error bounds for its solution. For an overview of verification methods cf. [5, 8] and, in Japanese, [6].

When developing a new verification method, it is desirable to have some measure for the quality of an inclusion. We consider an inclusion interval X as error bounds for an unknown real quantity \(\hat{x}\), i.e., \(\hat{x} \in X\). Depending on the situation, we use synonymous notations for an inclusion interval, namely

$$\begin{aligned} X= & {} [\underline{x},\overline{x}]:= \{x \in \mathbb {R}: \underline{x}\le x \le \overline{x}\} \\= & {} \langle m,r \rangle := \{ x \in \mathbb {R}: m-r \le x \le m+r \} \ . \end{aligned}$$

A colloquial notation is \(\langle m,r \rangle = m \pm r\). Consider

$$\begin{aligned} X_1:= [-1,2], \quad X_2:= [-1,1], \quad \text{ and }\quad X_3:= [1,2] \ . \end{aligned}$$

At first sight none of the three intervals seems to give much information; only \(X_3\) at least proves that \(\hat{x}\) is positive. Now let A be a symmetric matrix with \(\Vert A\Vert _2 = 10^{10}\) and let the \(X_\nu \) be inclusions of an eigenvalue. Then each of the three inclusions \(X_\nu \) reveals that the condition number \(\frac{\sigma _{\max }(A)}{\sigma _{\min }(A)}\) of A is at least \(5\cdot 10^9\).
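To make the arithmetic explicit, the following Python sketch (an illustration, not part of the original text) checks the bound: for symmetric A the singular values are the absolute values of the eigenvalues, so an eigenvalue enclosed in \(X_\nu \) bounds \(\sigma _{\min }(A)\) from above by \(\max \{|x|: x \in X_\nu \}\).

```python
def mag(lo, hi):
    """Magnitude of the interval [lo, hi]: max{|x| : lo <= x <= hi}."""
    return max(abs(lo), abs(hi))

norm_A = 1e10  # ||A||_2 = sigma_max(A) for symmetric A

for lo, hi in [(-1, 2), (-1, 1), (1, 2)]:
    # an eigenvalue lambda in [lo, hi] gives sigma_min <= |lambda| <= mag,
    # hence cond(A) = sigma_max/sigma_min >= norm_A / mag
    print((lo, hi), norm_A / mag(lo, hi))
```

Each interval yields a lower bound of at least \(5 \cdot 10^9\) on the condition number.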

The quality of an interval inclusion depends on the context. Having said that, it may nevertheless be desirable to define a measure for the quality of an interval, knowing the pros and cons of such an attempt. There is some folklore about such measures; however, we found only one paper on the subject in the literature, see below.

In this note we develop some criteria for such a measure. We start with some theoretical considerations in the next section, and conclude with some practical remarks.

2 Theoretical considerations

Let \(\varrho : \mathbb {R}\times \mathbb {R}_{\ge 0} \rightarrow \mathbb {R}_{\ge 0}\) be a function measuring the quality \(\varrho (m,r)\) of \(\langle m,r \rangle \). The letter \(\varrho \) may be reminiscent of “relative error”; however, we prefer the wording “quality” because mathematically \(\varrho \) may be interpreted as a relative error only in a certain sense (see below). Note that \(\varrho (m,r)=0\) means best quality. We first list some desirable properties of such a function:

$$\begin{aligned} \begin{array}{rrl} \text{ I) } \; &{} \text{ non-negativity }\quad \; &{} \varrho (m,r) \ge 0 \\ \text{ II) } \;&{} \text{ zero } \text{ value } \quad \; &{} \varrho (m,r)=0 \; \Leftrightarrow \; r = 0 \\ \text{ III) } \;&{} \text{ scaling } \text{ invariance } \quad \; &{} \varrho (X) \; = \; \varrho (\alpha X) \quad \text{ for }\;\; 0 \ne \alpha \in \mathbb {R}\\ \text{ IV) } \;&{} \text{ monotonicity } \text{ for } \text{ fixed } \text{ m } \quad \; &{} r'> r \; \Rightarrow \; \varrho (m,r')> \varrho (m,r) \\ \text{ V) } \;&{} \text{ monotonicity } \text{ for } \text{ fixed } \text{ r } \quad \; &{} |m'| > |m| \; \Rightarrow \; \varrho (m',r) < \varrho (m,r) \\ \end{array} \end{aligned}$$

The rationale is as follows. Properties I) and II) are clear. As for III), the quality of an inclusion interval X may well depend on the scaling for different settings, see the above example. However, without knowing any setting, invariance with respect to scaling seems the only option. For the monotonicity, an interval with constant midpoint but increasing radius gives less information, and with constant radius but increasing absolute value of the midpoint the interval contains, in some sense, more information.

Moreover, we may demand \(\varrho \) to be continuous in m and r except for \(m=r=0\), because II) and III) imply \(\varrho (0,0) = 0 < \varrho (0,r) = \varrho (0,1)\) for all \(r>0\). As for differentiability, note that III) implies \(\varrho (m,r) = \varrho (-m,r)\), so that differentiability at \(m=0\) would force \(\frac{\partial \varrho }{\partial m}(0,r)=0\) for all \(r>0\), at odds with the strict monotonicity V). Therefore we require

$$\begin{aligned} \begin{array}{lll} \text{ VI) } &{} \text{ continuity } \quad &{} \varrho (m,r)\; \text{ is } \text{ everywhere } \text{ continuous } \text{ except } \text{ for }\; m=r=0\\ \text{ VII) } &{} \text{ differentiability } \; &{} \varrho (m,r)\; \text{ is } \text{ everywhere } \text{ differentiable } \text{ except } \text{ for }\; m=0\\ \end{array} \end{aligned}$$

Having listed the desired properties, we look for possible candidates. An obvious choice is to use the midpoint m of \(X = \langle m,r \rangle \) as an approximation and define \(\varrho (X)\) to be the largest relative error of \(x \in X\) with respect to m:

$$\begin{aligned} \varrho _1(m,r) := \max _{x \in X} \left| \dfrac{x-m}{m} \right| \quad \text{ implying }\quad \varrho _1(X) = \left| \dfrac{\overline{x}-\underline{x}}{\underline{x}+\overline{x}} \right| \ . \end{aligned}$$
(1)

All properties I) to VII) are satisfied; however, for a small or zero unknown real quantity \(\hat{x}\) the midpoint may be zero, causing an obvious problem: \(\varrho _1(0,r)\) is infinite no matter how small the radius r is.
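As a quick illustration, a Python sketch of \(\varrho _1\) (a hypothetical helper, not from this note) makes the zero-midpoint problem visible:

```python
# rho_1 from (1): the maximal relative error of x in X = [lo, hi]
# with respect to the midpoint m = (lo + hi)/2.

def rho1(lo, hi):
    if lo + hi == 0:                       # zero midpoint
        return 0.0 if lo == hi else float('inf')
    return abs((hi - lo) / (lo + hi))      # equals r/|m|

print(rho1(1, 2))         # 1/3: midpoint 1.5, radius 0.5
print(rho1(-1e-9, 1e-9))  # inf, however small the radius
```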

A remedy is to take the minimum over \(\tilde{x} \in X\) of the maximal relative error of \(x \in X\) against \(\tilde{x}\), i.e.,

$$\begin{aligned} \varrho _2(X) := \min _{\tilde{x} \in X} \max _{x \in X} \left| \dfrac{\tilde{x}-x}{\tilde{x}} \right| \ . \end{aligned}$$
(2)

This is the definition in [4], the only reference we found on the subject. There it is shown that

$$\begin{aligned} \varrho _2(m,r) = \; \left\{ \begin{array}{ll} \dfrac{r}{|m|} &{} \quad \text{ if } |m|-r \ge 0 \\ \dfrac{2r}{\max (|m-r|,m+r)} &{} \quad \text{ otherwise } \text{. } \end{array}\right. \end{aligned}$$

The properties I) to VI) are satisfied for \(\varrho _2\), however, differentiability VII) is not met:

$$\begin{aligned} \varrho _2(1,1+e) = \; \left\{ \begin{array}{ll} 1+e &{} \quad \text{ if } e \le 0 \\ \dfrac{1+e}{1+e/2} &{} \quad \text{ if } e \ge 0 \ .\\ \end{array}\right. \end{aligned}$$
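A Python sketch of the closed form above (again an illustration, not from this note) exhibits the kink at \(m=r\) numerically via one-sided difference quotients:

```python
# rho_2 from (2) via the closed form; the kink at m = r breaks VII).

def rho2(m, r):
    if abs(m) - r >= 0:                    # <m, r> does not contain 0 in its interior
        return r / abs(m)
    return 2 * r / max(abs(m - r), m + r)

e = 1e-6
slope_left  = (rho2(1, 1) - rho2(1, 1 - e)) / e   # tends to 1
slope_right = (rho2(1, 1 + e) - rho2(1, 1)) / e   # tends to 1/2
print(slope_left, slope_right)
```

The differing one-sided slopes 1 and 1/2 match the two branches of \(\varrho _2(1,1+e)\).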

As has been mentioned, there is some folklore about quality measures, in particular

$$\begin{aligned} \varrho _3(X) := \dfrac{\overline{x}-\underline{x}}{|\underline{x}|+|\overline{x}|} \end{aligned}$$
(3)

with \(0/0:=0\). That avoids the zero midpoint problem, but every interval X containing zero, i.e., \(\underline{x}\le 0 \le \overline{x}\), yields

$$\begin{aligned} 0 \in X: \quad \varrho _3(X) = \dfrac{\overline{x}+|\underline{x}|}{|\underline{x}|+\overline{x}} = 1 \ . \end{aligned}$$

The properties I) to VI) are satisfied, but \(\varrho _3\) is not differentiable if one endpoint is zero. Perturbing, for example, the left endpoint of [0, 1] gives

$$\begin{aligned} \varrho _3([e,1]) = \dfrac{1-e}{|e|+1} = \; \left\{ \begin{array}{ll} \dfrac{1-e}{1+e} &{} \quad \text{ if } e \ge 0 \\ 1 &{} \quad \text{ if } e \le 0 \ ,\\ \end{array}\right. \end{aligned}$$

with one-sided derivatives \(-2\) and 0 at \(e=0\).
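For completeness, a Python sketch of \(\varrho _3\) (an illustration, not from this note) shows the flat behaviour for intervals containing zero:

```python
# The folklore measure rho_3 from (3), with the convention 0/0 := 0.

def rho3(lo, hi):
    den = abs(lo) + abs(hi)
    return 0.0 if den == 0 else (hi - lo) / den

print(rho3(-1e-12, 1e-12))  # 1: any interval containing zero gives 1
print(rho3(1, 2))           # 1/3
```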

In order to find a function \(\varrho \) sharing all properties I) to VII) but avoiding the problems for zero midpoint we use, in view of \(\varrho (m,r)=\varrho (-m,r)\), the ansatz

$$\begin{aligned} \varrho (m,r) = \dfrac{\alpha |m| + \beta r}{\gamma |m| + \delta r} \end{aligned}$$

for constants \(\alpha ,\beta ,\gamma ,\delta \) to be determined. Property II) implies \(\alpha =0\) and \(\gamma \ne 0\), so that using III) and some scaling we can restrict our attention to

$$\begin{aligned} \varrho (m,r) = \psi \; \dfrac{r}{\varphi |m| + r} \end{aligned}$$

with a scaling factor \(\psi \) defining the maximum of \(\varrho \). Rewriting \(\varrho (m,r) = \psi \left( \varphi \frac{|m|}{r} + 1 \right) ^{-1}\) it is easy to verify that this definition satisfies all properties I) to VII) for any \(\varphi >0\). In order to find a suitable choice for \(\varphi \) we look at intervals with fixed left endpoint \(\underline{x}= -1\) and right endpoints \(-1 \le \overline{x}\le 1\), that is \(X_r:= \langle -1+r,r \rangle \) for \(0 \le r \le 1\). Then

$$\begin{aligned} \varrho (X_r) = \dfrac{\psi r}{\varphi (1-r) + r}. \end{aligned}$$

A good choice may be \(\varphi =1\), in which case \(\varrho (X_r) = \psi r\) grows linearly with r. Hence,

$$\begin{aligned} \varrho (m,r):= \dfrac{\psi r}{|m|+r} \ . \end{aligned}$$

Now it is a matter of taste to fix \(\psi \). We may feel that \(\varrho ([0,1])=1\) should hold. That implies \(\psi =2\), so that we define

$$\begin{aligned} \varrho _4(m,r) := \dfrac{2r}{|m|+r} \end{aligned}$$
(4)

implying \(\varrho _4(m,r) \le 2\) for all m, r. For \(X = [ \underline{x}, \overline{x}]\) it follows that

$$\begin{aligned} \varrho _4(X) = \min \left( \left| \dfrac{\overline{x}-\underline{x}}{\underline{x}} \right| , \left| \dfrac{\overline{x}-\underline{x}}{\overline{x}} \right| \right) \end{aligned}$$

with the convention \(\frac{0}{0}:=0\): the minimal relative error of the endpoints against each other. In verification methods \(\text{ mag }(X):= \max \{ |x|: x \in X \}\) is called the magnitude of an interval; hence \(\varrho _4(X) = \text{ diam }(X)/\text{mag }(X)\). An advantage over \(\varrho _3\) is that no case distinction is necessary in the computation. An almost identical formulation

$$\begin{aligned} \varrho _4'(X) = \dfrac{\overline{x}-\underline{x}}{\max (|\underline{x}|,|\overline{x}|,\eta )} \end{aligned}$$

was suggested by Demmel [1]. It equals \(\varrho _4\) except that it is tailored to the binary64 format of the IEEE 754 [3] arithmetic standard by using the gradual underflow unit, i.e., the smallest positive floating-point number \(\eta = 2^{-1074}\). If the endpoints \(\underline{x},\overline{x}\) are binary64 floating-point numbers, then \(\varrho _4(X) = \varrho _4'(X)\).
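Both formulas are easily sketched in Python (an illustration under the stated convention \(0/0:=0\), not INTLAB code); the assertion checks the claimed agreement on a few binary64 endpoints:

```python
# rho_4 = diam(X)/mag(X) from (4), and Demmel's variant rho_4' using the
# gradual underflow unit eta = 2**-1074, the smallest positive binary64 number.

ETA = 2.0 ** -1074

def rho4(lo, hi):
    mag = max(abs(lo), abs(hi))
    return 0.0 if mag == 0 else (hi - lo) / mag

def rho4_prime(lo, hi):
    return (hi - lo) / max(abs(lo), abs(hi), ETA)

for lo, hi in [(0.0, 0.0), (0.0, 1.0), (-1.0, 2.0), (1.0, 2.0)]:
    assert rho4(lo, hi) == rho4_prime(lo, hi)  # agree on binary64 endpoints
    print((lo, hi), rho4(lo, hi))
```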

Fig. 1: The functions \(\varrho _\nu \) for fixed midpoint \(m=1\) (left) and fixed left endpoint \(-1\) (right)

In Fig. 1 the four definitions \(\varrho _\nu \) are compared for fixed midpoint \(m=1\) and for fixed left endpoint \(\underline{x}= -1\).

The first function \(\varrho _1\) [relative error against the midpoint, red] behaves linearly for fixed midpoint and growing radius, and tends to infinity as the midpoint approaches zero. As discussed, the second function \(\varrho _2\) [Kreinovich’s definition, black with circles] is not differentiable at \(m=r\). The “folklore” function \(\varrho _3\) [green] is not differentiable for a zero endpoint and is constant at the maximal value 1 for intervals containing zero, with no discrimination between small and large radii; moreover, it is not concave. Finally, the new definition \(\varrho _4\) [blue] is, like \(\varrho _1\), linear for fixed midpoint and growing radius, and everywhere differentiable except for \(m=0\).

The first three definitions coincide in the left picture for \(X=\langle 1,r \rangle \) with \(r \in [0,1]\), and in the right picture for \(X = [-1,-1+d]\) with \(d \in [0,1]\). In both pictures Kreinovich’s definition \(\varrho _2\) and the proposed \(\varrho _4\) coincide for \(r \ge 1\) and \(d \ge 1\), respectively. So the proposed measure \(\varrho _4\) differs from the other definitions for \(r \in [0,1]\) and \(d \in [0,1]\) in the left and right picture, respectively. This ensures differentiability everywhere except zero midpoint.

The definition \(\varrho _4(X) = \frac{\text{ diam }(X)}{\text{ mag }(X)}\) with the interpretation \(\frac{0}{0}=0\) can be used for complex intervals as well. It replaced the function relerr in the latest Version 13 of INTLAB [7], the Matlab/Octave toolbox for reliable computing. Executable Matlab/INTLAB code is as follows:

[Listing: Matlab/INTLAB code of relerr]

The code works for scalar, vector and matrix input X, full or sparse, real or complex. The “if”-statement takes care of \(\frac{0}{0}\) and, for sparse input, avoids full output.
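The INTLAB listing itself is not reproduced here; as a rough, hypothetical NumPy analogue of the described elementwise behaviour one might write:

```python
import numpy as np

def relerr(lo, hi):
    """Elementwise diam/mag with 0/0 := 0, for real interval bounds."""
    lo, hi = np.asarray(lo, dtype=float), np.asarray(hi, dtype=float)
    mag = np.maximum(np.abs(lo), np.abs(hi))
    diam = hi - lo
    # where mag == 0 the interval is [0, 0] and diam == 0, so return 0 there
    return np.divide(diam, mag, out=np.zeros_like(diam), where=mag != 0)

print(relerr([0, 0, 1], [0, 1, 2]))  # elementwise: 0, 1, 0.5
```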

3 Practical considerations

Our definition \(\varrho _4(X)\) seems a good theoretical measure for the relative error of an interval X. However, from a practical and numerical point of view, there is a drawback. Mathematically a small \(\varrho _4(Y)\) means a small forward error, i.e., a small relative error with respect to the true result. But numerically we can only hope for a small backward error, a concept introduced and popularized by Wilkinson [11, 12], see also [2]. The backward error of an approximation \(\tilde{x}\) is small if \(\tilde{x}\) is the true solution of the original problem after a small perturbation of the input data. Without further measures such as residual iteration, that is about the best we can expect.

Now consider, similar to our introductory problem, an approximation \(\tilde{x} = 1.23456 \cdot 10^{-10}\) to the true singular value \(\hat{x} = 1.23457 \cdot 10^{-10}\) of a matrix A with \(\Vert A\Vert _2 = 1\). Then \(\varrho _4(\tilde{x} \underline{\cup } \hat{x}) = 8.1 \cdot 10^{-6}\). Computing in binary64, which corresponds to some 16 decimal digits of precision, the accuracy of \(\tilde{x}\) might be considered not bad, but far from best possible. With the additional context information \(\Vert A\Vert _2 = 1\), however, we know that this is close to the best approximation we can hope for.

Therefore, from a practical and numerical point of view it seems reasonable to pass information about the context. We therefore propose a relative accuracy defined by

$$\begin{aligned} \alpha (X,\tau ) := \dfrac{\text{ diam }(X)}{\max (\text{ mag }(X),\tau )} \ , \end{aligned}$$
(5)

where \(\tau \) is the context information. That implies \(\alpha (X,\Vert A\Vert _2) = 10^{-15}\), a value we may expect from a practical, numerical point of view. In Version 13 of INTLAB the function relacc computes the relative accuracy. A typical call is
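A Python sketch (hypothetical, mirroring INTLAB's relacc only in spirit) reproduces the numbers of the singular value example:

```python
# Relative accuracy (5); tau carries context information such as ||A||_2.

def relacc(lo, hi, tau):
    return (hi - lo) / max(abs(lo), abs(hi), tau)

# the singular value example from the text, with ||A||_2 = 1:
x_app, x_true = 1.23456e-10, 1.23457e-10
print(relacc(x_app, x_true, 0.0))  # rho_4: about 8.1e-6
print(relacc(x_app, x_true, 1.0))  # relative accuracy: about 1e-15
```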

$$\begin{aligned} \texttt {alpha = relacc(X,'thresh',tau);} \end{aligned}$$

The following Fig. 2 illustrates this definition and compares it to the relative error \(\varrho _4\). We compute approximations \(s_k\) of the singular values of a square matrix with 1000 rows and condition number \(10^{12}\). The well-accepted rule of thumb says that the approximations \(s_k\) of the smallest singular values may be correct to some 4 decimal digits. The dotted green line in Fig. 2 displays the values \(\varrho _4(s_k \underline{\cup } \sigma _k)\), where \(\sigma _k\) are the true singular values of A. As expected, the relative error increases from \(10^{-14}\) for the largest to about \(10^{-6}\) for the smallest singular values. The dotted blue line displays the relative accuracy \(\alpha (X,\Vert A\Vert _2)\) and reflects what we would expect from a numerical point of view.

Fig. 2: Relative error and relative accuracy of singular value inclusions

Additionally we use INTLAB’s routine verifysingvalall to compute inclusions \(X_k\) of all singular values of A. The solid black line shows the relative error \(\varrho _4(X_k)\) of the inclusions, while the solid red line displays the relative accuracy \(\alpha (X_k,\Vert A\Vert _2)\). From the black line we might conclude that the inclusions are of reasonable, but not too good quality for the smallest singular values, whereas the red line shows that the inclusions are of almost best quality for an inclusion method without extra iterative refinement. For other problems the context may be passed similarly.

We want to stress that neither the function relerr nor relacc is a panacea. As noted at the beginning of this note, the judgement of the quality of an inclusion depends on the context. As an example let matrices R, A be given. Then \(\Vert I-RA\Vert < 1\) for any matrix norm proves that both R and A are nonsingular. Typically, a good choice for R is an approximate inverse of A. Denote by \(\textbf{X}\) the stacked columns of an inclusion of the residual \(I-RA\). For illustration, we display the first and last two elements in Table 1.

It is well known that one step of iterative refinement in working precision implies backward stability of the result of Gaussian elimination [9, 10]. A forward stable result, i.e., an approximation with close to maximum accuracy, can be achieved with residuals computed in twice the working precision.

The computed \(\textbf{X}\) may be applied in some iterative refinement. The intervals have relatively wide diameters but are small in magnitude. If that is true for all entries, the wide diameters show that a residual of that quality is not suited for iterative refinement; relerr provides that information. The small magnitudes, however, show that the residuals are good enough to prove that A is nonsingular; relacc provides that information.
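The interplay of the two measures can be sketched in Python with a hypothetical residual entry (the numbers are illustrative, not those of Table 1):

```python
# A residual enclosure entry that is wide relative to its magnitude,
# yet tiny in absolute size: relerr and relacc judge it differently.

def relerr(lo, hi):
    mag = max(abs(lo), abs(hi))
    return 0.0 if mag == 0 else (hi - lo) / mag

def relacc(lo, hi, tau):
    return (hi - lo) / max(abs(lo), abs(hi), tau)

lo, hi = -3e-16, 5e-16        # one entry of an inclusion X of I - R*A
print(relerr(lo, hi))         # 1.6: too wide to help iterative refinement
print(relacc(lo, hi, 1.0))    # about 8e-16: still proves the residual is tiny
```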