Abstract
The computations involving the noncentralF distribution are notoriously difficult to implement properly in floatingpoint arithmetic: Catastrophic loss of precision, floatingpoint underflow and overflow, drastically increasing computation time and program hangups, and instability due to numerical cancellation have all been reported. It is therefore recommended that existing statistical packages are crosschecked, and the present paper proposes a numerical algorithm precisely for this purpose. To the best of our knowledge, the proposed method is the first method that can compute the noncentrality parameter of the noncentralF distribution with guaranteed accuracy over a wide parameter range that spans the range relevant for practical applications. Although the proposed method is limited to cases where the the degree of freedom of the denominator of the F test statistic is even, it does not affect its usefulness significantly: All of those algorithmic failures and inaccuracies that we can still reproduce today could have been prevented by simply crosschecking against the proposed method. Two numerical examples are presented where the intermediate computations went wrong silently, but the final result of the computations seemed nevertheless plausible, and eventually erroneous results were published. Crosschecking against the proposed method would have caught the numerical errors in both cases. The source code of the algorithm is available on GitHub, together with selfcontained commandline executables. These executables can read the data to be crosschecked from plain text files, making it easy to crosscheck any statistical software in an automated fashion.
Introduction
The computational task that this paper deals with, when viewed from a high level of abstraction, is equivalent to solving univariate equations with guaranteed accuracy. Since roots of monotone functions are sought, the challenge is not in the rootfinding, but in computing the function values. These intermediate computations are notoriously difficult to implement properly in floatingpoint arithmetic:

Under and overflow problems were reported by Benton and Krishnamoorthy (2003), Ding (1997), Helstrom and Ritcey (1985) independently of us;

we also reveal in Baharev and Kemény 2008 that the algorithms of Norton (1983) and Lenth (1987) are exposed to over and underflow issues, and that the Appendix 12 of Lorenzen and Anderson (1993, p. 374) is most likely bogus due to overflow;

catastrophic roundoff errors were reported by Frick (1990);

drastically increasing computation time and hangups were observed by Chattamvelli (1995), Benton and Krishnamoorthy (2003);

other noncentral distributions are similarly challenging, see for example Oliveira and Ferreira (2012).
If some of the intermediate computations suffer catastrophic loss of precision, the rootfinding method can still succeed, and the final result presented to the user may nevertheless seem plausible. This was the case with the Appendix 12 of Lorenzen and Anderson (1993, p. 374): The errors in the intermediate computations remained unnoticed, despite the authors of the book being very diligent.
In theory, one could analyze the floatingpoint behavior of all the intermediate computations with pen and paper, and derive mathematically proven bounds on the final numerical error of the entire algorithm when implemented in floatingpoint arithmetic. However, as the above listed failures demonstrate, this task is too errorprone for humans to carry out for a nontrivial numerical algorithm. One way to mitigate this issue is to automate the numerical error analysis as much as possible: Interval arithmetic (to be discussed in Sect. 2) is a way to keep track of numerical errors automatically, and to guarantee that floatingpoint issues such as over and underflow are noticed. However, interval arithmetic still requires that a human analyzes the floatingpoint behavior of all the basic operations and all the involved mathematical functions with pen and paper, and derives rigorous error bounds for all the possible floatingpoint inputs. It is still a huge win because “only” the smallest building blocks of the algorithm (basic operations and mathematical functions) have to be analyzed by a human, but not the entire algorithm as before: The numerical error analysis of algorithms built from the humanverified building blocks happens automatically because interval arithmetic keeps track of the floatingpoint errors as the algorithm is executed. In short, interval arithmetic reduces the amount of analysis that humans have to carry out with pen and paper, but it does not eliminate it. With interval arithmetic we essentially give the rest of the error analysis over to the computer, which then has to finish it as it executes the algorithm. It obviously has a runtime cost; in case of the proposed method, the runtime cost is negligible.
All of those algorithmic failures and inaccuracies that we can still reproduce today could have been prevented by simply crosschecking a sparse but sufficiently wide grid of values against the accurate values provided by the proposed method. By ‘sufficiently wide’ we mean that the grid of values spans the parameter range that is relevant for practical applications. In case of the noncentral beta distribution, we consider this range to be roughly the range spanned by Appendix 12 of Lorenzen and Anderson (1993, p. 374), i.e., shape parameters \(a\in [0.5, 25]\) and \(b\in [0.5, 500]\), type I and type II errors 0.05 and 0.10, respectively. If a numerical algorithm fails or is inaccurate, it typically happens not at isolated tiny regions of the parameter space but over a wide and continuous range, and especially at or near the extremes of the parameter ranges. Therefore, it is not necessary for the grid of crosschecked values to be dense; for crosschecking, it is sufficient if it has points near the extremes of the parameter ranges. We give numerical examples in Sect. 4 to support this claim.
The proposed method is limited to cases where the the degree of freedom of the denominator of the F test statistic is even. Although this is a limitation, it does not affect the usefulness of the proposed method significantly: All of those algorithmic failures and inaccuracies that we can still reproduce today could have been prevented by simply crosschecking against the proposed method.
To the best of our knowledge, the proposed method is the first method that can compute the noncentrality parameter of the noncentralF distribution with guaranteed accuracy over a wide parameter range that spans the range relevant for practical applications. In our literature research we found the related selfvalidating numerical methods by Wang and Kennedy Wang and Kennedy (1990, 1992, 1994, 1995), out of which only Wang and Kennedy (1995) is directly relevant. These methods were published more than 20 years ago, and their source code is not publicly available. As we discuss in Baharev and Kemény (2008), the method in Wang and Kennedy (1995) is susceptible to over and underflow: For example, it would over or underflow due to the large value of the noncentrality parameter if we tried to compute the top right entry of Table 4 in contrast to the proposed method that has no difficulties there.
Outline for the rest of the paper
The paper is structured as follows. The proposed method achieves guaranteed accuracy by applying interval arithmetic. In Sect. 2.1 we give an informal overview of interval arithmetic with an example, and the reader can compare and contrast it with ordinary floatingpoint arithmetic. Interval arithmetic is intentionally treated as a black box in Sect. 2.1: We only present what it delivers, but we do not discuss how. Then, in Sect. 2.2 we give a formal overview of how interval arithmetic guarantees rigorous error bounds. Sect. 3.1 derives the univariate equations that this paper is concerned with, Eqs. (9) and (11). Section 3.2 discusses how interval arithmetic and the interval Newton method can be applied to solve (9) and (11). The formal description of the proposed algorithm is presented in Sect. 3.3 with pseudocode. We finally present in Sect. 4 two examples from the literature, where the intermediate computations involving the noncentralF distribution went wrong silently, but the final result of the computations seemed nevertheless plausible, and erroneous results were published. Crosschecking against the proposed method would have caught the numerical errors in both cases.
Interval arithmetic
Automatic numerical error analysis with interval arithmetic: an example
The goal of this example is to demonstrate that interval arithmetic performs automatic numerical error analysis. The reader can think of interval arithmetic as a computationally cheap way to get guaranteed lower and upper bounds on the range of a function over a given domain of the variables. The obtained bounds are not necessarily sharp, but they are guaranteed to enclose the true range of the function despite the intermediate computations being carried out in floatingpoint arithmetic, and potentially suffer catastrophic loss of precision. Interval arithmetic can safely work with infinity, division by zero, etc., and automatically keeps track of the numerical error propagation throughout the intermediate floatingpoint computations.
We examine the numerical behavior of the following two functions:
It is known, see e.g. Li and Yeh (2013), that f(n) is monotone increasing, g(n) is monotone decreasing, \({\lim \nolimits _{n\rightarrow \infty }}f(n)= {\lim \nolimits _{n\rightarrow \infty }}g(n)=e\), and as a consequence \(f(n)<e<g(n)\). If we carried out the computations in exact arithmetic, we would get tighter and tighter enclosures for e as n increases. However, we get the erratic results shown in Table 1 when the computations are carried out with 64 bit floatingpoint numbers on a computer. The source code of the example is available on GitHub at Baharev (2016) if the reader wishes to reproduce the numerical results, or analyze the implementation. For \(k=9,10,11,12,15\), the supposed lower bounding f(n) values exceed the true value of e, where \(n=10^k\); these are the rows with negative entries in the column for \(ef(n)\). For \(k=9,12,14,15,17\), the supposed upper bounding g(n) values fall below the true value of e; these are the rows with negative entries in the column for \(g(n)e\). For \(k=9,12,15\), the supposed lower bounding values exceed even the supposed upper bounding values, meaning that we did not even get an enclosure. The f(n) values are supposed to be increasing, but for \(k=13,14,16,17\) it is clearly not the case. Similarly, g(n) is supposed to be monotone decreasing, but it is violated, e.g., at \(k=13,16\).
Since the f(n) and g(n) functions are fairly simple, one could carry out a rigorous error analysis of these functions with pen and paper, and figure out the accuracy of the table entries. We will do this automatically with interval arithmetic.
Interval arithmetic takes the function \(f(10^k)\), and the interval for k as input. (The interval for k is just the point interval [k, k] for each row in Table 1.) The output of the interval function evaluation is a rigorous enclosure of the range of \(f(10^k)\) over [k, k]. In exact arithmetic, the range of \(f(10^k)\) over the point interval [k, k] is obviously just a single real number, the value \(f(10^k)\). However, we carry out the computations with 64 bit floatingpoint numbers on a computer. Interval arithmetic automatically keeps track of the numerical errors that occur during these computations, and we do not get a single real number but a pair of floatingpoint numbers as a result, that is, an interval enclosing the value of \(f(10^k)\), the interval \([\underline{f}(10^k),\overline{f}(10^k)]\). These intervals are given in Table 2 for \(k=1,2,\dots ,17\), together with the similarly computed \([\underline{g}(10^k),\overline{g}(10^k)]\).
The wide intervals in Table 2, for example the rows for \(k\ge 15\), are a clear sign of the intermediate computations suffering catastrophic loss of precision; it is guaranteed that with interval arithmetic we always get informed (through wide intervals) when this happens. We consider this one of the biggest advantages of this approach.
Despite the serious numerical difficulties for \(k\ge 9\), the above discussed properties of f(n) and g(n) are still preserved in some form: (i) \(\overline{f}(n)\) is monotone increasing, and \(\underline{g}(n)\) is monotone decreasing, (ii) \(\underline{f}(n)<e<\overline{g}(n)\) holds (unlike in Table 1, there are no negative entries in the last two columns of Table 2). The fact that these properties are preserved is not a coincidence either but the guaranteed behavior of interval arithmetic. However, note that we did not get tighter and tighter enclosures for e as k increased: The enclosure \([\underline{f}(10^k),\overline{g}(10^k)]\) reaches its minimum width at \(k=8\), then the width starts increasing. We cannot blame interval arithmetic for this: Interval arithmetic is implemented on the top of 64 bit floatingpoint numbers, and unless one uses some extended precision arithmetic, e cannot be enclosed better with this simple approach. The tightest verified enclosure we got is [2.71828179, 2.71828186] for \(k=8\); indeed, the correct value is \(2.718281828\ldots \), and it is enclosed.
Let us emphasize again that for this simple example one could have derived bounds on the numerical errors of the entries in Table 1 with pen and paper. The advantage of interval arithmetic is that the numerical error analysis of the computations happens fully automatically, and therefore certain kinds of human errors are completely eliminated.
A formal overview of interval arithmetic
Interval arithmetic is an extension of real arithmetic defined on the set of real intervals, rather than the set of real numbers. According to a survey paper by Kearfott (1996), a form of interval arithmetic perhaps first appeared in Burkill (1924). Modern interval arithmetic was originally invented independently in the late 1950s by several researchers; including Warmus (1956), Sunaga (1958) and finally Moore (1959), who set firm foundations for the field in his many publications, including the foundational book Moore (1966). Since then, interval arithmetic is being used to rigorously solve numerical problems.
Let \(\mathbb {IR}\) be the set of all real intervals, and take \(\mathbf {x}, \mathbf {y}\in \mathbb {IR}\). We set \(\underline{x} := \inf {\mathbf {x}}\) and \(\overline{x} := \sup {\mathbf {x}}\), such that \(\mathbf {x}=[\underline{x},\overline{x}]\). Furthermore, the width of \(\mathbf {x}\) is defined as \({\text {wid}}(\mathbf {x}) := \overline{x}\underline{x}\), the radius of \(\mathbf {x}\) as \({\text {rad}}(\mathbf {x}) := \tfrac{1}{2}{\text {wid}}(\mathbf {x})\), the magnitude of \(\mathbf {x}\) as \(\mathbf {x} := \max (\underline{x},\overline{x})\), and the mignitude of \(\mathbf {x}\) as \(\langle \mathbf {x}\rangle := \min \{x\mid x\in \mathbf {y}\}\). For \(\mathbf {x}\) bounded we set the midpoint of \(\mathbf {x}\) as \(\check{x} := \tfrac{1}{2}(\underline{x}+\overline{x})\). We define the elementary operations for interval arithmetic by the rule \(\mathbf {x}\diamond \mathbf {y}= \Box \{x \diamond y \mid x \in \mathbf {x}, y \in \mathbf {y}\}, \forall \diamond \in \{+, , \times , \div , \hat{}~\}\), where \(\Box S\) denotes the smallest interval containing the set S. (The symbol ‘\(\Box \)’ is a box, and it refers to the tightest interval hull, also called as box hull.) Thus, the ranges of the five elementary interval arithmetic operations are exactly the ranges of their realvalued counterparts. Although this rule characterizes these operations mathematically, the usefulness of interval arithmetic is due to the operational definitions based on interval bounds Hickey et al. (2001). For example, let \(\mathbf {x}= [\underline{x}, \overline{x}]\) and \(\mathbf {y}= [\underline{y}, \overline{y}]\), it can be easily proved that
In addition, for a function \(\varphi :\mathbb {R}\rightarrow \mathbb {R}\) and an interval \(\mathbf {x}\) we define
Moreover, if a function f composed of these elementary arithmetic operations and elementary functions \(\varphi \in \{\sin ,\cos ,\exp ,\log ,\ldots \}\), i.e., factorable function, is given, bounds on the range of f can be obtained by replacing the real arithmetic operations and the real functions by their corresponding interval arithmetic counterparts.
The finite nature of computers precludes an exact representation of the reals. In practice, the real set, \(\mathbb {R}\), is approximated by a finite set \(\bar{\mathbb {F}} = \mathbb {F}\cup \{\infty , +\infty \}\), where \(\mathbb {F}\) is the set of floatingpoint numbers. The set of real intervals is then approximated by the set \(\mathbb {I}\) of intervals with bounds in \(\bar{\mathbb {F}}\). The power of interval arithmetic lies in its implementation on computers. In particular, outwardly rounded interval arithmetic allows rigorous enclosures for the ranges of operations and functions. This makes a qualitative difference in scientific computations, since the results are now intervals in which the exact result is guaranteed to lie. Interval arithmetic can be carried out for virtually every expression that can be evaluated with floatingpoint arithmetic. However, two important points have to be considered: Interval arithmetic is only subdistributive, so expressions that are equivalent in real arithmetic differ in interval arithmetic, giving different amounts of overestimation (the amount by which the real range of the function over an interval and the result computed by interval arithmetic differ). Therefore, computations should be arranged so that overestimation of ranges is minimized. Readers are referred to Alefeld and Herzberger (1983), Neumaier (1990), Hickey et al. (2001), Jaulin et al. (2001) for more details on basic interval methods.
Let \(f:\mathbf {x}\rightarrow \mathbb {R}\) be continuously differentiable, and assume the existence of \(\hat{x}\in \mathbf {x}\) with \(f(\hat{x}) = 0\), and let \(\tilde{x}\in \mathbf {x}\). Then by the mean value theorem we get
for some \(\xi \in \mathbf {x}\). Therefore,
Now let \(\mathfrak f'\) be an interval extension of \(f'\) and \(\mathfrak f\) be an interval extension of f, i.e. a specific expression for computing an enclosure for the range of f over a given input interval \(\mathbf {x}\). Then by the properties of interval arithmetic we get
The operator N is called univariate interval Newton operator. Using this operator we can define the interval Newton iteration as
starting with \(\mathbf {x}^{(0)} = \mathbf {x}\). This iteration has the properties that whenever \(\mathbf {x}^{(k)} = \emptyset \) for some k, then \(\mathbf {x}\) does not contain a zero of f. Otherwise \(\hat{x}\in \mathbf {x}^{(k)}\) for all k, and \({\text {wid}}(\mathbf {x}^{(k+1)}) = O({\text {wid}}(\mathbf {x}^{(k)})^2)\) locally under mild assumptions on f and \(\mathbf {x}^{(0)}\). Furthermore, if for any k we find that \(\mathbf {x}^{(k+1)} \subseteq \text {int}(\mathbf {x}^{(k)})\), i.e., that the interval Newton operator maps the box \(\mathbf {x}^{(k)}\) into its interior, then \(\mathbf {x}^{(k)}\) contains a unique zero of f.
The interval Newton operator requires an interval extension \(\mathfrak {f}'\) of the derivative of f. For every factorable function f such an extension can be constructed using algorithmic differentiation techniques, e.g., see Berz et al. (1996), Griewank and Corliss (1991), Griewank and Walther (2008). For univariate functions the most efficient approach is via the algebra of differential numbers \(\mathcal {D}_1:=\mathbb {R}\times \mathbb {R}\), equipped with the following basic operations: Let \(df:=(f,f'), dg:=(g,g')\in \mathcal {D}_1\), and \(\varphi :\mathbb {R}\rightarrow \mathbb {R}\). Define
The set of real numbers is embedded in \(\mathcal {D}_1\) by \(r\mapsto (r,0)\).
If \(\tilde{f}(x)\) is an expression representing f using arithmetic operations and elementary functions, we can use \(\tilde{f}\) to calculate \((y,y') = \tilde{f}((x,1))\) on \(\mathcal {D}_1\) by replacing the operations and elementary functions in \(\tilde{f}\) by their counterparts on \(\mathcal {D}_1\), and then \(y' = f'(x)\).
This approach can be generalized to compute an interval extension \(\mathfrak {f}'\) of \(f'\) by defining the algebra of interval differential numbers \(\mathbb {I}\mathcal {D}_1:=\mathbb {IR}\times \mathbb {IR}\) and introducing again the operations (3) on \(\mathbb {I}\mathcal {D}_1\) now using interval arithmetic operations in the components of the interval differential numbers. Using this algebra and an expression \(\tilde{f}\) for f, we get by computing \((\mathbf {y},\mathbf {y}') = \tilde{f}((\mathbf {x},1))\) an enclosure \(\mathbf {y}'\supseteq f'(\mathbf {x})\) and thereby an interval extension \(\mathfrak {f}'\) of \(f'\).
The proposed method
Derivation of the formulas
The F test statistic used in the analysis of variance (ANOVA) problems follows the noncentral F distribution when the null hypothesis is false, i.e., \(F_0 = F_{nc}(\nu _1,\nu _2,\lambda )\), where \(\nu _1\) and \(\nu _2\) are the degrees of freedom of the nominator and the denominator, respectively, of the F test statistic; \(\lambda \) is the noncentrality parameter. This noncentrality parameter is related to the size of effects. The probability of the Type II error \(\beta \) (not detecting an effect) is
where \(\alpha \) is the significance level of the test. The power calculation consists of calculating the power (\(Power=1\beta \)) for certain effect sizes. When it is used in the reversed way, the power is fixed (e.g. \(Power=0.9\)), and the size of the effect is calculated; this is considered as effect of detectable size, see Johnson and Leone (1977, p. 170) and Lorenzen and Anderson (1993). Another application where the noncentral Fdistribution arises is in coverages of ClopperPearson confidence intervals for binomial proportions with misspecified parameter p (Puza and O’Neill 2006a, b).
The cdf \(F(w,\nu _1,\nu _2,\lambda )\) of the noncentral Fdistribution with \(\nu _1\), \(\nu _2\) degrees of freedom and noncentrality parameter \(\lambda \), and the cdf \(I_x(a,b;\lambda )\) of the noncentral beta distribution with shape parameters a and b and noncentrality parameter \(\lambda \) are related as follows:
where \(a=\frac{\nu _1}{2}\), \(b=\frac{\nu _2}{2}\), and \(x=\frac{\nu _1w}{\nu _1w+\nu _2}\). We will work with the noncentral beta distribution from now on.
The incomplete noncentral beta function ratio \(I_x(a,b;\lambda )\) for \(0\le x\le 1, a>0, b>0, \lambda \ge 0\) is defined as
where \(I_x(a,b)\) is the usual incomplete beta function ratio,
and \(\varGamma (a)\) is the (complete) gamma function
see for example Johnson et al. (1995).
Equation (6) cannot be used directly for actual numerical computations. The proposed method uses the the closed formula
by Singh and Relyea (1992) (misprint corrected by Chattamvelli (1995)) for the computation of the cdf of the central beta distribution, and the closed formula
by Sibuya (1967) (and later published by Johnson et al. (1995)) for the cdf of the noncentral beta distribution. The shape parameter b must be integer. It is very interesting that the evaluation of (10), using (9), is possible in finitely many steps, requiring only the four arithmetic operations, the power function, and the exponential function; the formulas do not depend on any function for computing statistical distributions, or other special functions.
Detectable differences for a specified type II error probability \(\beta \) are determined by the noncentrality parameter \(\lambda \) for which the cdf value of the noncentral beta distribution equals \(\beta \)
where \(x_{1\alpha }\) is the upper \(\alpha \) quantile of the central beta distribution with shape parameters \(a=\nu _1/2\) and \(b=\nu _2/2\); \(\alpha \) denotes the allowed type I error probability. When minimal detectable differences are sought, one solves (11) for \(\lambda \), given \(a,b,\alpha \) and \(\beta \).
Overview of the proposed method
To summarize Sect. 3.1, one can compute minimal detectable differences for a specified type II error probability \(\beta \) in the traditional setting as follows. Given the shape parameters \(a=\nu _1/2\) and \(b=\nu _2/2\), and the allowed type I error probability \(\alpha \), the upper \(\alpha \) quantile \(x_{1\alpha }\) of the central beta distribution is computed first by solving \(I_x(a,b)=1\alpha \) for x, using (9) and the traditional Newton method. Then, (11) is solved for the noncentrality parameter \(\lambda \) with the traditional Newton method, given \(x_{1\alpha }, a, b\), and \(\beta \).
The proposed method follows these steps of the traditional setting but all the computations are carried out with interval arithmetic instead of ordinary floatingpoint arithmetic, and the interval Newton method is used instead of the traditional Newton method. As a consequence, there are notable differences. The traditional Newton method has two outcomes: It either reaches convergence or it does not. In contrast, the interval Newton method has three possible outcomes as we discussed in Sect. 2.2 formally; these outcomes are enumerated below informally as cases a, b, and c.
 (a):

All solutions are rigorously enclosed, and each enclosure contains a unique zero. The result is the list of these enclosures. (In our case, there can be at most one zero, i.e., the list has at most one element.)
 (b):

It is proved with mathematical rigor that the function cannot have any zeros in the initial interval.
 (c):

There is at least one enclosure among the resulting enclosures of zeros which may contain a zero but verification of existence and/or uniqueness of a zero in that particular enclosure failed.
The primary use case of the proposed method is crosschecking correctness of existing statistical software. The user first computes \(x_{1\alpha }\) and \(\lambda \) with the statistical software to be checked, given \(a=\nu _1/2\), \(b=\nu _2/2\), \(\alpha \), and \(\beta \). The \(x_{1\alpha }\) and \(\lambda \) values are floatingpoint numbers, or in other words, zerowidth intervals. The initial intervals for the interval Newton method are then constructed by inflating these point intervals \(x_{1\alpha }\) and \(\lambda \) such that they are centered around \(x_{1\alpha }\) and \(\lambda \), respectively, but they have nonzero widths. (In this context, inflation refers to the width of the interval: The width of the point interval \(x_{1\alpha }\) is increased from zero to a strictly positive value.) If these initial intervals contain the true values, and the interval Newton method succeeds in proving it (case a), then the algorithm under test is at least as accurate as the radius of the initial intervals in that studied case. Analogously, if the initial intervals do not contain the theoretically correct value, and the interval Newton method reliably proves that (case b), then the accuracy of the algorithm under test is less than the radius of the initial intervals. The very rare but unfortunate case c, when reliable conclusion cannot be drawn and further investigation may be needed, can usually be remedied by simply changing (increasing) the radius of the initial intervals. We have not experienced case c when computing in the parameter range relevant for practical applications, e.g., when computing Tables 4 and 5; we only experienced case c when insane parameters were set.
The proposed method also works with, for example, [0, 1] as initial interval for \(x_{1\alpha }\). In other words, the proposed method works even in the complete absence of an approximate value for \(x_{1\alpha }\), however, this is not the anticipated use case.
Formal description of the proposed algorithm
 Input. :

The input data of the proposed algorithm are \(x_{1\alpha }\) and \(\lambda \), that the user wants to crosscheck. In the anticipated use case, \(x_{1\alpha }\) and \(\lambda \) come from an existing statistical software whose correctness is being checked.
 Step 1. :

The initial intervals \(x_0\) and \(\lambda _0\) are obtained from the inputs \(x_{1\alpha }\) and \(\lambda \) by inflating them as follows:
$$\begin{aligned}&x_0 = \left[ (1\epsilon _x)x_{1\alpha },~ (1+\epsilon _x)x_{1\alpha }\right] , \text{ and } \lambda _0 = \left[ (1\epsilon _{\lambda })\lambda ,~ (1+\epsilon _{\lambda })\lambda \right] ,\nonumber \\ \end{aligned}$$(12)where the inflation parameters \(\epsilon _x\) and \(\epsilon _{\lambda }\) are sufficiently small userdefined real numbers, for example \(10^{6}\).
 Step 2. :

A narrow interval containing the theoretically correct value of \(x_{1\alpha }\) is computed with the interval Newton method, using (9). If the interval Newton method proves that \(x_0\) is guaranteed not to have a solution, or the verification of a unique solution in \(x_0\) fails, exit with the corresponding error message.
 Step 3. :

Equation (11) is solved for \(\lambda \) with the interval Newton method, using (10) and (9). The possible outcomes are: A rigorous enclosure of the true value of \(\lambda \) is obtained, or \(\lambda _0\) is proved not to contain the correct value, or the verification of a unique solution in \(\lambda _0\) fails. The algorithm finishes here.
 Output. :

The rigorous enclosures for \(x_{1\alpha }\) and \(\lambda \) are printed if the interval Newton iteration is successful in both Step 2 and Step 3, or the corresponding error message if any of these steps fails.
We implemented this algorithm in C++, the source code is available on GitHub under Baharev (2016).
Implementation details
As for the implementation details, the above algorithm is implemented in C++ using the CXSC module nlfzero. CXSC is available from http://www.xsc.de, and it is documented in the book of Hammer et al. (1995). CXSC implements the interval Newton method in one variable using automatic differentiation. No higher precision internal data format is used. All computations are done using the IEEE double format (64 bit). Evaluation of (9) and (10) could be significantly speeded up since it involves several redundant operations. For example, it is possible to reduce the number of switches between rounding modes considering that \(a>0\), \(b>0\), and \(0\le x\le 1\) always hold. However, (9) and (10) are used directly, and no efforts were made to decrease the computation time because we found it to be satisfactory.
The parameters are assumed to lie in the domain that is relevant for practical applications, roughly: \(a \le 25, b \le 500\), and \(0.01 \le \alpha , \beta \le 0.99\); the inflation parameters are also assumed to be sane, say \(< 10^{4}\). Violating these assumptions may cause performance degradation and the algorithm may start reporting verification failures, but incorrect results will never appear in the output.
Numerical results
As we claimed in the introduction, all of those algorithmic failures and inaccuracies that we can still reproduce today could have been prevented by simply crosschecking against the proposed method; we now give two such examples. In both examples, the intermediate computations suffer significant loss of precision, but the final results presented to the user seem nevertheless plausible, making these kinds of numerical errors particularly harmful. A byproduct of the first example is that the algorithm of Baharev and Kemény (2008), implemented on the top of the builtin functions of R, is proved to be accurate for 6 significant digits in the investigated cases with mathematical certainty.
The entire source code is available on GitHub at Baharev (2016). The computations have been carried out with the following hardware and software configuration. Processor: Intel(R) Core(TM) i53320M CPU at 2.60GHz; operating system: Ubuntu 14.04.3 LTS with 3.13.067generic kernel; compiler: gcc 4.8.4, compiler optimization flag: O3; CXSC 2.5.4 configuration left on the default values given by the install script.
Example 1: minimal detectable differences for general ANOVA models
Appendix 12 of Lorenzen and Anderson (1993, p. 374) tabulates the minimal detectable differences for general ANOVA models as a function of \(\nu _1\) and \(\nu _2\), with the type I and type II error probabilities fixed at \(\alpha =0.05\) and \(\beta =0.10\), respectively. All entries in Appendix 12 seem plausible; there is no obvious sign that some of the entries have no correct significant digits (for example that entry in Appendix 12 that corresponds to the top right entry of Table 4 of the present paper). The entries of Appendix 12 are most likely bogus due to floatingpoint overflow (Baharev and Kemény 2008).
The error could have been caught by simply crosschecking against the proposed method. The corrected form of Appendix 12 of Lorenzen and Anderson (1993, p. 374), i.e., Table 3 of Baharev and Kemény (2008), fully spans the parameter range that is relevant for practical applications. This table (except rows for which b is not integer, i.e., \(b=0.5, 1.5, 2.5, 3.5\)) is recomputed with the proposed method, and it is given as Table 4. The input values of \(x_{0.95}\) and \(\lambda \) are computed by the algorithm of Baharev and Kemény (2008), implemented in R R Development Core Team (2015) and available as the package fpow; the inflation parameter values \(\epsilon _x\) and \(\epsilon _{\lambda }\) are both set to \(10^{6}\).
Table 3 contains the solution of \(I_x(a,b)=1\alpha \) for x, given a, b and \(\alpha =0.05\). Table 4 contains the solution of (11), where the verified \(x_{0.95}\) is obtained from the previous step, that is, from Table 3. The overall computation required less than 9 seconds. The output of the proposed algorithm is \(x_{0.95}\) verified up to 12 significant digits, and \(\lambda \) verified up to 10 significant digits. (Only 6 digits are shown in the Tables 3 and 4.) The algorithm of Baharev and Kemény (2008), implemented on the top of the builtin functions of R, is also proved to be accurate for 6 significant digits in the investigated cases with mathematical certainty.
Example 2: comparing the accuracy of numerical algorithms
One of the goals of Table 1 of Chattamvelli and Shanmugam (1997) was to illustrate that their algorithm gives more accurate results than Frick’s algorithm (Frick 1990) for large values of \(\lambda \). We reproduced selected rows of their table in Table 5, and extended it with an extra column showing the correct values up to 7 significant digits. This extra column was computed with the proposed method. The correct values (correct up to 7 significant digits) enabled us to give the number of correct significant digits in parentheses after each table entry. Table 5 proves, with mathematical rigor, that the algorithm of Chattamvelli and Shanmugam (1997) is actually slightly less accurate than Frick’s algorithm (Frick 1990) for those large values of \(\lambda \) that are shown in Table 5. Since the precise details of the computations are not given in Chattamvelli and Shanmugam (1997), we cannot tell where the algorithm in Chattamvelli and Shanmugam (1997) suffers from a loss of precision.
References
Alefeld G, Herzberger J (1983) Introduction to interval computations. Academic Press, New York
Baharev A (2016) Rigorous enclosures of minimal detectable differences for general anova models. https://github.com/baharev/mindiffver
Baharev A, Kemény S (2008) On the computation of the noncentral F and noncentral beta distribution. Stat Comput 18:333–340
Benton D, Krishnamoorthy K (2003) Computing discrete mixtures of continuous distributions: noncentral chisquare, noncentral \(t\) and the distribution of the square of the sample multiple correlation coefficient. Comput Stat Data Anal 43(2):249–267
Berz M, Bischof C, Corliss G, Griewank A (1996) Computational differentiation: techniques, applications, and tools. SIAM, Philadelphia
Burkill JC (1924) Functions of intervals. Proc Lond Math Soc 22:375–446
Chattamvelli R (1995) On the doubly noncentral \(F\) distribution. Comput Stat Data Anal 20(5):481–489
Chattamvelli R, Shanmugam F (1997) Algorithm AS 310: computing the noncentral beta distribution function. J R Stat Soc: Ser C (Appl Stat) 46(1):146–156
Ding CG (1997) On using Newton’s method for computing the noncentrality parameter of the noncentral \(F\) distribution. Commun StatSimul Comput 26(1):259–268
Frick H (1990) AS R84. A remark on Algorithm AS 226, computing noncentral beta probabilities. Appl Stat 39(2):311–312
Griewank A, Corliss GF (1991) Automatic differentiation of algorithms. SIAM Publications, Philadelphia
Griewank A (2008) Evaluating derivatives: principles and techniques of algorithmic differentiation. SIAM, Philadelphia
Hammer R, Hocks M, Kulisch U, Ratz D (1995) C++ toolbox for verified computing I, basic numerical problems. SpringerVerlag, Berlin
Helstrom CW, Ritcey JA (1985) Evaluation of the noncentral \(F\)distribution by numerical contour integration. SIAM J Sci Stat Comput 6(3):505–514
Hickey TJ, Ju Q, Van Emden MH (2001) Interval arithmetic: from principles to implementation. J ACM (JACM) 48(5):1038–1068
Jaulin L, Kieffer M, Didrit O, Walter E (2001) Applied interval analysis, 1st edn. Springer, Berlin
Johnson NL, Leone FC (1977) Statistics and experimental design: in engineering and the physical science, 2nd edn. John Wiley & Sons, New York
Johnson NL, Kotz S, Balakrishnan N (1995) Continuous univariate distributions, vol 2, 2nd edn. John Wiley & Sons Inc, New York
Kearfott BR (1996) Interval computations: introduction, uses, and resources. Euromath Bull 2(1):95–112
Lenth RV (1987) Algorithm AS 226: computing noncentral beta probabilities. J R Stat Soc Ser C (Appl Stat) 36(2):241–244
Li YC, Yeh CC (2013) Some equivalent forms of Bernoulli’s inequality: a survey. Appl Math 4:1070–1093
Lorenzen TJ, Anderson VL (1993) Design of experiments: a noname approach. Marcel Dekker Inc, New York
Moore RE (1959) Automatic error analysis in digital computation. Technical Report LMSD84821, Missiles and space division, Lockheed Aircraft Corporation, Sunnyvale, California, USA
Moore RE (1966) Interval analysis. Prentice Hall, Englewood Cliffs
Neumaier A (1990) Interval methods for systems of equations. Cambridge University Press, Cambridge
Norton V (1983) A simple algorithm for computing the noncentral \(F\) distribution. Appl Stat 32(1):84–85
Oliveira IRC, Ferreira DF (2012) Computing the noncentral gamma distribution, its inverse and the noncentrality parameter. Comput Stat 28(4):1663–1680
Puza B, O’Neill T (2006a) Generalised clopperpearson confidence intervals for the binomial proportion. J Stat Comput Simul 76(6):489–508. doi:10.1080/10629360500107527
Puza B, O’Neill T (2006b) Interval estimation via tail functions. The Canadian Journal of Statistics / La Revue Canadienne de Statistique 34(2):299–310, http://www.jstor.org/stable/20445202
R Development Core Team (2015) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.Rproject.org, ISBN 3900051070
Sibuya M (1967) On the noncentral beta distribution function, unpublished manuscript, the equation can be found in the book of Johnson, Kotz, and Balakrishnan (1995), p. 485, Eq. (30.12)
Singh KP, Relyea GE (1992) Computation of noncentral \(F\) probabilities: A computer program. Comput Stat Data Anal 13(1):95–102, the misprint on p. 97 was corrected by Chattamvelli (1995)
Sunaga T (1958) Theory of an interval algebra and its applications to numerical analysis. RAAG Memoirs 2:29–46
Wang M, Kennedy WJ (1990) Comparison of algorithms for bivariate normal probability over a rectangle based on selfvalidated results from interval analysis. J Stat Comput Simul 37(1–2):13–25
Wang M, Kennedy WJ (1992) A numerical method for accurately approximating multivariate normal probabilities. Comput Stat Data Anal 13(2):197–210
Wang MC, Kennedy WJ (1994) Selfvalidating computations of probabilities for selected central and noncentral univariate probability functions. J Am Stat Assoc 89(427):878–887
Wang MC, Kennedy WJ (1995) A selfvalidating numerical method for computation of central and noncentral \(F\) probabilities and percentiles. Stat Comput 5(2):155–163
Warmus M (1956) Caculus of Approximations. Bulletin de l’Académie Polonaise des Sciences IV(5):253–259
Acknowledgements
Open access funding provided by Austrian Science Fund (FWF).
Author information
Authors and Affiliations
Corresponding author
Additional information
The research was funded by the Austrian Science Fund (FWF): P27891N32. The authors are grateful to Professor Sándor Kemény and to the anonymous reviewers for reading the draft of this paper and for making valuable comments and suggestions.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Baharev, A., Schichl, H. & Rév, E. Computing the noncentralF distribution and the power of the Ftest with guaranteed accuracy. Comput Stat 32, 763–779 (2017). https://doi.org/10.1007/s0018001607013
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s0018001607013
Keywords
 Minimal detectable differences
 ANOVA
 Noncentrality parameter
 Selfvalidating numerical method
 Interval arithmetic