A refined affine approximation method of multiplication for range analysis in word-length optimization

Sun, Ruiyi; Zhang, Yan; Cui, Aijiao

doi:10.1186/1687-6180-2014-36

Research
Open access
Published: 22 March 2014

Volume 2014, article number 36, (2014)
EURASIP Journal on Advances in Signal Processing

Abstract

Affine arithmetic (AA) is widely used in range analysis in word-length optimization of hardware designs. To reduce the uncertainty in AA and achieve efficient and accurate range analysis of multiplication, this paper presents a novel refined affine approximation method, Approximation Affine based on Space Extreme Estimation (AASEE). The affine form of multiplication is divided into two parts. The first part is the approximate affine form of the operation. The second part is the equivalent affine form of the estimated range of the difference introduced by the approximation, which is represented by an extra noise symbol. In AASEE, it is proven, based on linear geometry, that the proposed approximate affine form is the closest to the result of multiplication. The proposed equivalent affine form of AASEE is more accurate since the extreme value theory of multivariable functions is used to minimize the difference between the result of multiplication and the approximate affine form. The computational complexity of AASEE is the same as that of trivial range estimation (AATRE) and lower than that of Chebyshev approximation (AACHA). The proposed affine form of multiplication is demonstrated with polynomial approximation, B-splines, and multivariate polynomial functions. In experiments, the average of the ranges derived by AASEE is 59% and 89% of that by AATRE and AACHA, respectively. The integer word lengths derived by AASEE are at most 2 bits and 1 bit shorter than those derived by AATRE and AACHA, respectively.

1 Introduction

Floating-point representation supports a wide dynamic range and high precision, so it is commonly used to represent signals in signal processing applications such as image processing, speech processing, and digital signal processing. When these applications are implemented in hardware for high speed and stability, the signals need to be represented in fixed point to optimize the area, power, and speed of the hardware. Hence, floating-point values must be converted to fixed-point values. This process is called word-length optimization. Its goal is to achieve optimal system performance while satisfying the specification on the system output precision. Word-length optimization involves range analysis and precision analysis. The former finds the minimum word length of the integer part of a value, while the latter optimizes the word length of the fractional part.

Word-length optimization has been proven to be an NP-hard problem [1]. Existing approaches are usually classified into dynamic analysis [2–7] and static analysis [8–20]. Dynamic analysis simulates a large set of stimulus signals and is applicable to all types of systems. However, it requires long simulation times to provide sufficient confidence, and the precision cannot be guaranteed for signals that the simulation does not cover. In comparison, static analysis is an automated and efficient word-length optimization method that is more applicable to large designs. Static analysis mainly uses the characteristics of the input signals to estimate the word length conservatively, which can result in overestimation [12] to some extent. As a part of word-length optimization, range analysis can also be classified in the same way.

Affine arithmetic (AA) [21] is often used for range analysis in static analysis. In AA, every signal must be represented in an affine form, which is a first-degree polynomial. As AA tracks the correlations among the range intervals of signals, it can provide a more accurate range. This makes it suitable for range analysis of the results of linear operations. Besides linear operations, nonlinear operations, such as multiplication, are also involved in hardware operations, typically in linear time-invariant (LTI) systems. AA cannot provide an exact affine form for nonlinear operations. To solve this problem, Stolfi and de Figueiredo [22] proposed affine approximation methods for multiplication, which include trivial range estimation (AATRE) and Chebyshev approximation (AACHA). AATRE is computationally efficient, but the range it produces can be up to four times the true range. The accumulation of the uncertainty of all signals in the computational chain may result in an error explosion, which is unacceptable in applications. Such overestimation cannot satisfy the accuracy requirement of the system, which limits the application of AATRE in large systems. The uncertainty of AACHA is less than that of AATRE; however, AACHA is too complex to be used in large systems. Since LTI operations are accurately covered by AA, the proposed method is applied in the field of the range analysis of word-length optimization in this paper.

In this paper, a novel affine approximation method, Approximation Affine based on Space Extreme Estimation (AASEE), is proposed to reduce the uncertainty of multiplication and achieve an accurate and efficient range analysis of multiplication. To analyze the uncertainty conveniently, we divide the affine form produced by each approximation method for multiplication (AATRE, AACHA, and AASEE) into two parts. The first part is called the approximate affine form, which approximates the nonlinear operation. The second part is called the equivalent affine form, which is the equivalent affine form of the estimated range of the difference between the result of multiplication and the approximate affine form. The more accurate the two parts are, the more accurate the approximation method is. Based on linear geometry [23], it is proven that the proposed approximate affine form is the closest to the result of multiplication. To derive the equivalent affine form, we use the extreme value theory of multivariable functions [24] to estimate, in space, the upper and lower bounds of the difference introduced by the approximation of the first part. In this way, the uncertainty of the proposed method is minimized. The accuracy of the resulting affine form by AASEE is higher than that by AATRE and, on average, higher than that by AACHA. Meanwhile, the computational complexity of AASEE is equivalent to that of AATRE and lower than that of AACHA.

The rest of this paper is organized as follows. The background of range analysis for multiplication is presented in Section 2. Section 3 presents the derivation of the two parts for multiplication. The refined affine form of multiplication, AASEE, is presented in Section 4. In Section 5, we compare the computational complexity and accuracy of AASEE with those of AATRE and AACHA. The case studies and experimental results are presented in Section 6. Section 7 concludes the paper.

2 Background

2.1 Related work

Interval arithmetic (IA) and affine arithmetic (AA) have been widely used in range analysis in word-length optimization.

IA [25] is a range arithmetic theory first presented by Moore in 1962. Cmar [2] employs it for range analysis of digital signal processing (DSP) systems. Carreras [20] presents a method based on IA. To reduce the oversized word length, the method provides the probability density functions that can be used when some truncation must be performed due to constraints in the specification. IA is not suitable for most real-world applications, since it can lead to drastic overestimation of the true range.

AA [21] was proposed by Stolfi in 1993 to overcome the weakness of IA. In [8, 9], Fang uses AA to analyze word-length optimization. Both range and precision are represented by the same affine form, which limits the optimization. Pu and Ha [10] also use AA for word-length optimization. They use two different affine forms for range analysis and precision analysis, respectively, and achieve a more refined result of word-length optimization. Similarly, Lee et al. [11] develop an automatic optimization approach, called MiniBit, that produces accuracy-guaranteed solutions in which area is minimized while meeting an error constraint. Osborne [12] uses both IA and AA for range analysis in different situations. Computation using either of the two methods in the design is time-consuming. The problem of overestimation is serious due to the approximation of the nonlinear operations.

Since AA cannot be used in the systems with infinite number of loops, an improved approach, quantized AA (QAA), has been proposed in [13] for linear time-invariant systems with feedback loops. This method can provide fast and tight estimation of the evolution of large sets of numerical inputs, using only an affine-based simulation, but it does not provide the exact bounds.

AATRE [22] is adopted for multiplication in most of the works because of its low computational complexity. However, the uncertainty of the range obtained by AATRE is very large. To adjust the trade-off between the accuracy of approximation and computational complexity, Zhang [14] introduces a new parameter N in the N-level simplified affine approximation (N-SAA). This method is faster than AACHA and more accurate than AATRE, but it is more complex than AATRE. Furthermore, it is troublesome to choose a suitable N. A method of range analysis is proposed by Pang [26]. This method combines IA, AATRE, and arithmetic transform (AT); its result is more accurate than that of AATRE, while its CPU time is longer than that of AATRE. To deal with applications from the scientific computing domain, Kinsman [17, 18] uses computational methods based on Satisfiability Modulo Theory. The search efficiency of this method is improved, leading to tighter bounds and thus smaller word lengths.

For all the existing methods, the accuracy of approximation is improved at the expense of the computational complexity. This paper presents an affine approximation method for multiplication, which achieves better trade-off between accuracy and computational complexity.

2.2 Range analysis

Range analysis involves studying the data range of every signal and minimizing the integer word lengths for signals on the premise that the signals in the design have enough bits to accommodate this range. The range of signal x is represented by x= [x_min, x_max], where the two real numbers, x_min and x_max, denote the lower and upper bounds of x, respectively. The required integer part of the word length for signal x, which is represented as IWL_x, can be derived by:

{IWL}_{x} = \begin{cases} ⌈ {log}_{2} (| x |_{max}) ⌉ + α, & | x |_{max} \geq 1 \\ 1, & | x |_{max} < 1 \end{cases} \quad \text{where } | x |_{max} = max (| x_{min} |, | x_{max} |) \text{ and } α = \begin{cases} 1, & mod ({log}_{2} (x_{max}), 1) \neq 0 \\ 2, & mod ({log}_{2} (x_{max}), 1) = 0 \end{cases} .

(1)

In (1), all the signals in the design are assumed to be expressed as signed numbers, and the sign bit is taken into account in IWL_x. According to (1), once the range of a signal is decided, the integer part of word length of the signal can be derived.
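As an illustration of (1), the following minimal Python sketch computes the integer word length from a range. The function name `integer_word_length` is hypothetical, and the power-of-two test for α is applied to |x|_max rather than x_max (an assumption made here so that the logarithm is always defined; it can only make the result more conservative).

```python
import math


def integer_word_length(x_min: float, x_max: float) -> int:
    """Integer word length (sign bit included) in the spirit of Eq. (1)."""
    abs_max = max(abs(x_min), abs(x_max))
    if abs_max < 1:
        return 1
    log_val = math.log2(abs_max)
    # Assumption: alpha is decided on |x|_max, not on x_max as printed in (1).
    # alpha = 2 when |x|_max is an exact power of two, otherwise alpha = 1.
    alpha = 2 if log_val == math.floor(log_val) else 1
    return math.ceil(log_val) + alpha


# A signal in [-3, 5] needs 4 signed integer bits (representable range [-8, 7]).
bits = integer_word_length(-3.0, 5.0)
```

For a range whose largest magnitude is an exact power of two, such as [-8, 8], the extra bit from α = 2 is what allows the value +8 to be represented.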

2.3 Affine arithmetic

AA is widely applied for range analysis. In AA, an uncertain signal x is represented by an affine form as a first-degree polynomial [22]:

\hat{x} = x_{0} + x_{1} ε_{1} + x_{2} ε_{2} + \dots + x_{n} ε_{n}, where ε_{i} = [- 1, 1] .

(2)

For the signal x, x₀ is the central value, and ε_i is the i th noise symbol. ε_i denotes an independent uncertainty source that contributes to the total uncertainty of the signal x, and x_i is its coefficient.

The upper and lower bounds for the range of x can be represented as

x_{max} = x_{0} + \sum_{i = 1}^{n} | x_{i} |, x_{min} = x_{0} - \sum_{i = 1}^{n} | x_{i} | .

(3)

With x_min and x_max, the input interval $\bar{x} = [x_{min}, x_{max}]$ can be converted into an equivalent affine form as (4), using only one independent noise symbol.

\hat{x} = x_{0} + x_{1} ε_{1}, \quad \text{with } x_{0} = \frac{x_{max} + x_{min}}{2}, \; x_{1} = \frac{x_{max} - x_{min}}{2} .

(4)

AA can keep correlations among the signals of the computational chain by letting the same noise symbol ε_i contribute to every signal that depends on it [22].
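The definitions in (2) to (4) can be sketched in a few lines of Python. The class name `AffineForm` and its methods are hypothetical names chosen for this illustration; the dictionary maps a noise-symbol index to its coefficient. The sketch only covers the linear operations, for which AA is exact.

```python
from dataclasses import dataclass, field


@dataclass
class AffineForm:
    """x0 + sum_i x_i * eps_i, with every eps_i ranging over [-1, 1]."""
    center: float
    coeffs: dict = field(default_factory=dict)  # noise index i -> coefficient x_i

    @staticmethod
    def from_interval(lo, hi, symbol):
        # Eq. (4): a single noise symbol carries the whole input interval.
        return AffineForm((hi + lo) / 2, {symbol: (hi - lo) / 2})

    def bounds(self):
        # Eq. (3): the radius is the sum of the absolute coefficients.
        radius = sum(abs(c) for c in self.coeffs.values())
        return self.center - radius, self.center + radius

    def _combine(self, other, sign):
        keys = set(self.coeffs) | set(other.coeffs)
        merged = {k: self.coeffs.get(k, 0.0) + sign * other.coeffs.get(k, 0.0)
                  for k in keys}
        return AffineForm(self.center + sign * other.center, merged)

    def __add__(self, other):
        return self._combine(other, +1.0)

    def __sub__(self, other):
        return self._combine(other, -1.0)


x = AffineForm.from_interval(2.0, 6.0, symbol=1)   # 4 + 2*eps_1
y = AffineForm.from_interval(-1.0, 1.0, symbol=2)  # 0 + 1*eps_2
```

Because `x - x` shares the noise symbol ε₁, its range collapses to [0, 0], whereas plain interval arithmetic would report [-4, 4]. This is the correlation tracking that the text refers to.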

For multiplication, AATRE and AACHA are typical approximation methods.

The affine form of AATRE is

\begin{array}{l} \hat{x} ŷ = x_{0} y_{0} + \sum_{i = 1}^{n} (x_{0} y_{i} + y_{0} x_{i}) ε_{i} + \sum_{i = 1}^{n} | x_{i} | \sum_{i = 1}^{n} | y_{i} | ε_{n + 1} . \end{array}

(5)

Suppose M₁ = max(n₁, n₂), in which n₁ and n₂ denote the number of noise symbols with nonzero coefficients in $\hat{x}$ and $ŷ$ , respectively. The computational complexity of AATRE is O(M₁).
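A minimal Python sketch of the AATRE product in (5) follows. Affine forms are stored as plain lists `[c0, c1, ..., cn]` over a shared set of noise symbols; this list layout and the function names are assumptions made for the illustration only.

```python
def aatre_multiply(x, y):
    """AATRE product per Eq. (5); returns [z0, z1, ..., zn, z_{n+1}]."""
    assert len(x) == len(y), "both forms must use the same noise symbols"
    z = [x[0] * y[0]]
    # Linear part: (x0*y_i + y0*x_i) for each shared noise symbol.
    z += [x[0] * yi + y[0] * xi for xi, yi in zip(x[1:], y[1:])]
    # New noise symbol eps_{n+1} bounds the quadratic term.
    z.append(sum(abs(c) for c in x[1:]) * sum(abs(c) for c in y[1:]))
    return z


def affine_bounds(z):
    """Lower and upper bounds per Eq. (3)."""
    radius = sum(abs(c) for c in z[1:])
    return z[0] - radius, z[0] + radius


# x = 1 + eps1 + 5*eps2 and y = 3 - 6*eps1 + eps2
z_hat = aatre_multiply([1, 1, 5], [3, -6, 1])
```

For these two operands the coefficient of the new symbol is (1 + 5)(6 + 1) = 42, which dominates the resulting range and illustrates why the AATRE estimate can be loose.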

AACHA provides a better approximation result, but it is more complex. The affine form of AACHA is

\hat{x} ŷ = x_{0} y_{0} + \sum_{i = 1}^{n} (x_{0} y_{i} + y_{0} x_{i}) ε_{i} + \frac{a + b}{2} + \frac{b - a}{2} ε_{n + 1},

(6)

where a and b denote the minimum and the maximum of the range of $(\sum_{i = 1}^{n} x_{i} ε_{i}) (\sum_{i = 1}^{n} y_{i} ε_{i})$ . Suppose M₂ = n₁ + n₂. The complexity of computing both extremal values, a and b, is O(M₂ logM₂). As M₁ ≤ M₂, the computational complexity of AATRE is lower than that of AACHA [22].

2.4 Extreme value theory

The proposed approximation is based on the extreme value theory of multivariable functions [24].

According to the extreme value theory of multivariable functions, the Hessian matrix of the function, H, and Jacobian matrix of the function, J, can be used to find the local maxima and the local minima. Hessian matrix of function f(x₁,x₂, …, x_n) is

H = \begin{bmatrix} \frac{\partial^{2} f}{\partial x_{1}^{2}} & \frac{\partial^{2} f}{\partial x_{1} \partial x_{2}} & \dots & \frac{\partial^{2} f}{\partial x_{1} \partial x_{n}} \\ \frac{\partial^{2} f}{\partial x_{2} \partial x_{1}} & \frac{\partial^{2} f}{\partial x_{2}^{2}} & \dots & \frac{\partial^{2} f}{\partial x_{2} \partial x_{n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^{2} f}{\partial x_{n} \partial x_{1}} & \frac{\partial^{2} f}{\partial x_{n} \partial x_{2}} & \dots & \frac{\partial^{2} f}{\partial x_{n}^{2}} \end{bmatrix} .

(7)

Here we use $H_{f^{α}}$ to represent H at a point $f^{α} = (x_{1}^{α}, x_{2}^{α}, \dots, x_{n}^{α})$ and $J_{f^{α}}$ to represent J at a point f^α.

A stationary point of f, f^α, is a point where $J_{f^{α}} = 0$ . $H_{f^{α}}$ is indefinite when it is neither positive semidefinite nor negative semidefinite. If $H_{f^{α}}$ is positive definite, then f^α is a local minimum point. If $H_{f^{α}}$ is negative definite, then f^α is a local maximum point. If $H_{f^{α}}$ is indefinite, then f^α is neither a local maximum nor a local minimum; it is a saddle point. The remaining cases are not used in this paper.

The principal minor determinants are used to determine if a matrix is positive or negative definite or semidefinite.

It is necessary and sufficient for a positive semidefinite matrix that all the principal minor determinants of the matrix are nonnegative real numbers.

It is necessary and sufficient for a negative semidefinite matrix that all the odd order principal minor determinants of the matrix are non-positive real numbers and all the even order principal minor determinants of the matrix are nonnegative real numbers.
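The two criteria above can be checked mechanically for small symmetric matrices. The sketch below enumerates all principal minors with a cofactor-expansion determinant, which is only practical for small n; the function names are hypothetical. A matrix that passes both tests (for example the zero matrix) is reported as positive semidefinite, which is an arbitrary choice made here.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def principal_minors(m):
    """Yield (order, value) for every principal minor of a square matrix."""
    n = len(m)
    for order in range(1, n + 1):
        for idx in combinations(range(n), order):
            yield order, det([[m[r][c] for c in idx] for r in idx])


def classify(m, tol=1e-12):
    minors = list(principal_minors(m))
    if all(d >= -tol for _, d in minors):
        return "positive semidefinite"
    # Odd-order minors must be non-positive, even-order minors nonnegative.
    if all((d <= tol) if order % 2 else (d >= -tol) for order, d in minors):
        return "negative semidefinite"
    return "indefinite"
```

For instance, `classify([[-12, -29], [-29, 10]])` returns `"indefinite"` because the two first-order minors have opposite signs.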

3 Derivation of the two parts for multiplication

A generic nonlinear operation $z \leftarrow f (\hat{x}, ŷ)$ proposed in [22] can be described by (8):

\begin{aligned} z & = f (x_{0} + x_{1} ε_{1} + \dots + x_{n} ε_{n}, y_{0} + y_{1} ε_{1} + \dots + y_{n} ε_{n}) \\ & = f^{*} (ε_{1}, \dots, ε_{n}) . \end{aligned}

(8)

Since the operation f is nonlinear, f^∗(ε₁, …, ε_n) cannot be expressed exactly as an affine combination of the noise symbols, ε_i. Under this case, an approximate affine form of the operation, which is represented as f_z, must be used to approximate f^∗(ε₁, …, ε_n). The difference introduced by this approximation, d_f = f^∗-f_z, can be expressed by an equivalent affine form of the estimated range of the difference, which is represented as $\hat{d}$ . Hence, the affine form of z can be expressed as

\hat{z} = f_{z} + \hat{d} .

(9)

In (9), f_z is a first-degree function of ε_i and can be expressed as (10)

f_{z} (ε_{1}, \dots, ε_{n}) = z_{0} + \sum_{i = 1}^{n} z_{i} ε_{i} .

(10)

The computational complexity of computing the true range of d_f is very high in a practical application. The estimated range of d_f is utilized instead of the true range. Suppose d_max and d_min denote the upper and lower bounds of the estimated range of d_f, respectively. According to (4), the $\hat{d}$ can be expressed as (11)

\hat{d} = z^{'} + z_{n + 1} ε_{n + 1} = \frac{d_{max} + d_{min}}{2} + \frac{d_{max} - d_{min}}{2} ε_{n + 1} .

(11)

With (10) and (11), the affine form of z can be represented as

\hat{z} = f_{z} + \hat{d} = z_{0} + \sum_{i = 1}^{n} z_{i} ε_{i} + z^{'} + z_{n + 1} ε_{n + 1} .

(12)

For multiplication, z can be expressed as

z = x_{0} y_{0} + x_{0} \sum_{i = 1}^{n} y_{i} ε_{i} + y_{0} \sum_{i = 1}^{n} x_{i} ε_{i} + (\sum_{i = 1}^{n} x_{i} ε_{i}) (\sum_{i = 1}^{n} y_{i} ε_{i}) .

(13)

The first three terms of (13) form an affine form, and the last term is quadratic. The affine form of z can therefore also be represented as (12).

According to the definition of f_z in (10) and $\hat{d}$ in (11), AATRE and AACHA can also be represented by f_z and $\hat{d}$ . For AATRE in (5), the f_z and $\hat{d}$ are defined as

f_{z} = x_{0} y_{0} + \sum_{i = 1}^{n} (x_{0} y_{i} + y_{0} x_{i}) ε_{i},

(14)

\hat{d} = \sum_{i = 1}^{n} | x_{i} | \sum_{i = 1}^{n} | y_{i} | ε_{n + 1} .

(15)

For AACHA in (6), the f_z and $\hat{d}$ are defined as

f_{z} = x_{0} y_{0} + \sum_{i = 1}^{n} (x_{0} y_{i} + y_{0} x_{i}) ε_{i},

(16)

\hat{d} = \frac{a + b}{2} + \frac{b - a}{2} ε_{n + 1} .

(17)

In the existing affine approximation methods of AATRE and AACHA, d_max and d_min are estimated in the XY plane. In these methods, the same noise symbol of different variables is considered to be independent. Hence, the range of $\hat{d}$ is much larger than that of d_f. The difference between $\hat{d}$ and d_f will propagate to $\hat{z}$ and result in uncertainty.

To describe the multiplication accurately, we use ε_i as the input arguments and estimate the range of z in the (n + 1)-dimensional space Eⁿ⁺¹, whose axes are labeled (ε₁, …, ε_n, z). In Eⁿ⁺¹, a first-degree polynomial function is a hyperplane and a nonlinear polynomial function is a curved surface. The approximate affine form in (10) is therefore a hyperplane in Eⁿ⁺¹. Each such hyperplane can be viewed as a parallel translation of the tangent hyperplane at a certain point of the curved surface. Hence, all possible approximate affine forms for z can be regarded as the tangent hyperplanes at all points of the curved surface in Eⁿ⁺¹. The translation amount is taken into account in d_f, which is approximated by $\hat{d}$ . In Eⁿ⁺¹, d_f can be viewed as the function of the distance between the points of the curved surface and the tangent hyperplane.

Figure 1 shows an example of $\hat{x} = 1 + ε_{1} + 5 ε_{2}$ and $ŷ = 3 - 6 ε_{1} + ε_{2}$ . The space is labeled as (ε₁, ε₂, z). The red mesh surface represents the function $z = \hat{x} ŷ = (1 + ε_{1} + 5 ε_{2}) (3 - 6 ε_{1} + ε_{2})$ . The blue plane represents the tangent plane f_z, z = 3 - 3ε₁ + 16ε₂, at the point z^α = (0, 0, 3). All the possible approximate affine forms for z are the tangent planes of all the points. d_f is a function of distance between z and f_z.
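The numbers in this example can be reproduced with a short Python sketch. Affine forms are stored as lists `[c0, c1, ..., cn]`; this layout and the helper names are assumptions made for the illustration. The tangent plane at the origin is built from the linear part of the product, as in (14).

```python
def product(x, y, eps):
    """Exact value of the product of two affine forms at noise values eps."""
    xv = x[0] + sum(c * e for c, e in zip(x[1:], eps))
    yv = y[0] + sum(c * e for c, e in zip(y[1:], eps))
    return xv * yv


def tangent_at_origin(x, y):
    """Coefficients [z0, z1, ..., zn] of the tangent hyperplane at eps = 0."""
    return [x[0] * y[0]] + [x[0] * yi + y[0] * xi for xi, yi in zip(x[1:], y[1:])]


def evaluate_affine(f, eps):
    return f[0] + sum(c * e for c, e in zip(f[1:], eps))


x_hat = [1, 1, 5]    # 1 + eps1 + 5*eps2
y_hat = [3, -6, 1]   # 3 - 6*eps1 + eps2
f_z = tangent_at_origin(x_hat, y_hat)

# Distance between the surface and the plane at the corner (eps1, eps2) = (1, 1).
gap = product(x_hat, y_hat, (1, 1)) - evaluate_affine(f_z, (1, 1))
```

The plane coefficients come out as 3, -3, and 16, matching z = 3 - 3ε₁ + 16ε₂, and the surface and the plane coincide at the tangent point ε = (0, 0).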

Here we use $f_{z^{α}}$ in (18) to represent the tangent hyperplane at the point $z^{α} = (ε_{1}^{α}, ε_{2}^{α}, \dots, ε_{n}^{α})$ . Then, the possible approximate affine form can be represented as $f_{z^{α}}$ , too.

f_{z^{α}} = z^{α} + z_{ε_{1}}^{'} (ε_{1} - ε_{1}^{α}) + z'_{ε_{2}} (ε_{2} - ε_{2}^{α}) + \dots + z'_{ε_{n}} (ε_{n} - ε_{n}^{α}) .

(18)

In (18), $z'_{ε_{i}}$ is the partial derivative of z with respect to the variable ε_i at the point z^α, for i = 1, …, n.

With the estimated range of d_f, the maximum absolute error of d_f can be expressed as

e_{a} = max (| d_{max} |, | d_{min} |) .

(19)

To reduce the uncertainty, f_z must be as close as possible to the result of multiplication. Hence, f_z is the tangent hyperplane whose maximum absolute error is the minimum among all the possible affine forms $f_{z^{α}}$ , that is,

e_{a} (f_{z}) = min (e_{a} (f_{z^{α}})) .

(20)

The geometrical meaning of f_z denotes the tangent hyperplane whose maximum absolute error is minimized.

f_z is derived by the range of d_f, while $\hat{d}$ is the equivalent affine form of d_f. It is very complex to compute the true range of d_f. With $\hat{d}$ in (11), the uncertainty in AA for nonlinear operations is generated due to the difference between the true range of d_f and the estimated range of d_f.

It is much tighter and easier to estimate range of d_f in Eⁿ⁺¹ space than in the XY plane. Based on the extreme value theory of multivariable functions, the estimated range of d_f in AASEE is derived.

With more accurate d_max and d_min, f_z and $\hat{d}$ can be calculated more precisely, and AASEE can achieve a refined affine approximation result.

In the following sections, the estimated range of d_f is derived first, and the two parts are derived afterwards.

4 AASEE for multiplication

4.1 Estimated range of the difference

For multiplication, which is expressed as (13), the value of z at the point z^α is

z^{α} = (x_{0} + \sum_{i = 1}^{n} x_{i} ε_{i}^{α}) (y_{0} + \sum_{i = 1}^{n} y_{i} ε_{i}^{α}) .

(21)

The partial derivatives of z with respect to the variable ε_i at the point z^α are

z'_{ε_{i}} = (x_{i} (y_{0} + \sum_{j = 1}^{n} y_{j} ε_{j}^{α}) + y_{i} (x_{0} + \sum_{j = 1}^{n} x_{j} ε_{j}^{α})) .

(22)

Upon substitution for z^α and $z'_{ε_{i}}$ , the tangent hyperplane $f_{z^{α}}$ can be expressed as

\begin{array}{l} f_{z^{α}} & = (x_{0} + \sum_{i = 1}^{n} x_{i} ε_{i}^{α}) (y_{0} + \sum_{i = 1}^{n} y_{i} ε_{i}^{α}) + (x_{1} (y_{0} + \sum_{i = 1}^{n} y_{i} ε_{i}^{α}) \\ + y_{1} (x_{0} + \sum_{i = 1}^{n} x_{i} ε_{i}^{α})) (ε_{1} - ε_{1}^{α}) + \dots \\ + (x_{n} (y_{0} + \sum_{i = 1}^{n} y_{i} ε_{i}^{α}) + y_{n} (x_{0} + \sum_{i = 1}^{n} x_{i} ε_{i}^{α})) (ε_{n} - ε_{n}^{α}) . \end{array}

(23)

The difference between the tangent hyperplane $f_{z^{α}}$ and (n + 1)-dimensional quadratic surface z is

d_{f} = z - f_{z^{α}} = \sum_{i, j = 1}^{n} x_{i} y_{j} (ε_{i} - ε_{i}^{α}) (ε_{j} - ε_{j}^{α}), \quad \text{where } ε_{i}, ε_{j}, ε_{i}^{α}, ε_{j}^{α} = [- 1, 1] .

(24)
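The identity in (24) can be checked numerically. The sketch below builds the tangent hyperplane from (21) and (22), rearranged into the form constant plus slopes, and compares z minus the hyperplane with the closed form at random points. The list layout `[c0, c1, ..., cn]` and all function names are assumptions made for this illustration.

```python
import random


def affine_value(a, eps):
    return a[0] + sum(c * e for c, e in zip(a[1:], eps))


def tangent_hyperplane(x, y, alpha):
    """Coefficients [c0, c1, ..., cn] of the tangent hyperplane at the point alpha."""
    xa = affine_value(x, alpha)
    ya = affine_value(y, alpha)
    slopes = [xi * ya + yi * xa for xi, yi in zip(x[1:], y[1:])]   # Eq. (22)
    const = xa * ya - sum(s * a for s, a in zip(slopes, alpha))    # from Eq. (18)
    return [const] + slopes


def difference_closed_form(x, y, alpha, eps):
    # Eq. (24): the double sum factors into a product of two single sums.
    dx = sum(xi * (e - a) for xi, e, a in zip(x[1:], eps, alpha))
    dy = sum(yi * (e - a) for yi, e, a in zip(y[1:], eps, alpha))
    return dx * dy


def max_identity_error(x, y, trials=1000, seed=0):
    rng = random.Random(seed)
    n = len(x) - 1
    worst = 0.0
    for _ in range(trials):
        alpha = [rng.uniform(-1, 1) for _ in range(n)]
        eps = [rng.uniform(-1, 1) for _ in range(n)]
        plane = tangent_hyperplane(x, y, alpha)
        direct = affine_value(x, eps) * affine_value(y, eps) - affine_value(plane, eps)
        worst = max(worst, abs(direct - difference_closed_form(x, y, alpha, eps)))
    return worst
```

The largest discrepancy over the random trials should be at the level of floating-point rounding, which supports the observation that d_f contains no linear terms around the tangent point.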

Suppose d_emax and d_emin denote the estimated maximum and minimum of the function value at the domain boundary respectively, and d_fimax and d_fimin denote the local maxima and the local minima, respectively. The estimated maximum and minimum of multivariable function d_f, d_max and d_min, can be expressed as

d_{max} = max (d_{emax}, d_{fimax}),

(25)

d_{min} = min (d_{emin}, d_{fimin}) .

(26)

According to (24), the function value at the domain boundary, d_fe, is represented by

\begin{array}{l} d_{fe} & = \sum_{i, j = 1}^{n} x_{i} y_{j} [ε_{i} ε_{j} - ε_{j} ε_{i}^{α} - ε_{i} ε_{j}^{α} + ε_{i}^{α} ε_{j}^{α}] \\ where \exists ε_{i} = \pm 1, \forall i = 1, 2, \dots, n . \end{array}

(27)

To simplify, we consider the extreme case in which ε_i = ±1 for all i. In this case, the product ε_iε_j in the first term always equals 1 when i = j. Hence, the estimated function value at the domain boundary, d_e, is expressed as

\begin{array}{l} d_{e} & = \sum_{i, j = 1, i = j}^{n} x_{i} y_{j} + \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i}^{α} ε_{j}^{α} + \sum_{i, j = 1, i \neq j}^{n} x_{i} y_{j} ε_{i} ε_{j} \\ - \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{j} ε_{i}^{α} - \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i} ε_{j}^{α} \\ where \forall ε_{i} = \pm 1 . \end{array}

(28)

Hence, the maximum and minimum of d_e, d_emax and d_emin are derived as

\begin{array}{l} d_{emax} & = \sum_{i = 1}^{n} x_{i} y_{i} + \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i}^{α} ε_{j}^{α} + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | \\ + \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{i}^{α} | + \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{j}^{α} | \end{array}

(29)

\begin{array}{l} d_{emin} & = \sum_{i = 1}^{n} x_{i} y_{i} + \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i}^{α} ε_{j}^{α} - \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | \\ - \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{i}^{α} | - \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{j}^{α} | . \end{array}

(30)

For comparison, d_fimax and d_fimin in (25) and (26) can be expressed as

\begin{array}{l} d_{fimax} & = \sum_{i = 1}^{n} x_{i} y_{i} ε_{i}^{2} + \sum_{i, j = 1, i \neq j}^{n} x_{i} y_{j} ε_{i} ε_{j} + \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i} ε_{j}^{α} \\ + \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{j} ε_{i}^{α} + \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i}^{α} ε_{j}^{α}, \end{array}

(31)

\begin{array}{l} d_{fimin} & = \sum_{i = 1}^{n} x_{i} y_{i} ε_{i}^{2} + \sum_{i, j = 1, i \neq j}^{n} x_{i} y_{j} ε_{i} ε_{j} + \sum_{i = 1}^{n} x_{i} y_{i} ε_{i} (ε_{i}^{α} + ε_{j}^{α}) \\ + \sum_{i, j = 1, i \neq j}^{n} x_{i} y_{j} ε_{i} ε_{j}^{α} + \sum_{i, j = 1, i \neq j}^{n} x_{i} y_{j} ε_{j} ε_{i}^{α} + \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i}^{α} ε_{j}^{α}, \\ where ε_{i}, ε_{j} = (- 1, 1), and ε_{i}^{α}, ε_{j}^{α} = [- 1, 1] . \end{array}

(32)

Continuing the example in Section 3, Figure 2 shows the function d_f = -6(ε₁-0.1)²-29(ε₁-0.1)(ε₂-0.1) + 5(ε₂-0.1)² when $ε_{1}^{α} = 0.1$ and $ε_{2}^{α} = 0.1$ . The estimated maximum and minimum of d_f at the domain boundary, d_emax and d_emin, are also marked in the figure. Since the values of ε_i in (27) are replaced by ε_i = ±1 for all i, d_emax is larger than the maximum of d_f and d_emin is smaller than the minimum.
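For this two-symbol example, the boundary estimates in (29) and (30) can be evaluated directly and compared with a grid sample of d_f. The sketch below is an illustration only: the function names are hypothetical, the double sums are written out literally for readability rather than efficiency, and `sampled_range` handles exactly two noise symbols.

```python
def boundary_estimates(xc, yc, alpha):
    """(d_emax, d_emin) per Eqs. (29)-(30); xc, yc hold x_1..x_n and y_1..y_n."""
    n = len(xc)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    diag = sum(xc[i] * yc[i] for i in range(n))
    const = sum(xc[i] * yc[j] * alpha[i] * alpha[j] for i, j in pairs)
    spread = sum(abs(xc[i] * yc[j]) for i, j in pairs if i != j)
    spread += sum(abs(xc[i] * yc[j] * alpha[i]) for i, j in pairs)
    spread += sum(abs(xc[i] * yc[j] * alpha[j]) for i, j in pairs)
    return diag + const + spread, diag + const - spread


def sampled_range(xc, yc, alpha, steps=41):
    """Min and max of d_f from Eq. (24) on a uniform grid over [-1, 1]^2."""
    grid = [-1 + 2 * k / (steps - 1) for k in range(steps)]
    values = []
    for e1 in grid:
        for e2 in grid:
            dx = xc[0] * (e1 - alpha[0]) + xc[1] * (e2 - alpha[1])
            dy = yc[0] * (e1 - alpha[0]) + yc[1] * (e2 - alpha[1])
            values.append(dx * dy)
    return min(values), max(values)


xc, yc, alpha = [1, 5], [-6, 1], [0.1, 0.1]
d_emax, d_emin = boundary_estimates(xc, yc, alpha)
lo, hi = sampled_range(xc, yc, alpha)
```

With these inputs the estimates evaluate to approximately 38.1 and -40.7, while the sampled values of d_f stay within roughly [-36.3, 29.9], so the estimated range encloses the sampled one.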

The extreme value theory of multivariable functions is used to compare d_emax, d_fimax, d_emin, and d_fimin.

Hessian matrix of function $d_{f} = \sum_{i, j = 1}^{n} x_{i} y_{j} (ε_{i} - ε_{i}^{α}) (ε_{j} - ε_{j}^{α})$ is

\begin{aligned} H & = \begin{bmatrix} \frac{\partial^{2} d_{f}}{\partial ε_{1}^{2}} & \frac{\partial^{2} d_{f}}{\partial ε_{1} \partial ε_{2}} & \dots & \frac{\partial^{2} d_{f}}{\partial ε_{1} \partial ε_{n}} \\ \frac{\partial^{2} d_{f}}{\partial ε_{2} \partial ε_{1}} & \frac{\partial^{2} d_{f}}{\partial ε_{2}^{2}} & \dots & \frac{\partial^{2} d_{f}}{\partial ε_{2} \partial ε_{n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^{2} d_{f}}{\partial ε_{n} \partial ε_{1}} & \frac{\partial^{2} d_{f}}{\partial ε_{n} \partial ε_{2}} & \dots & \frac{\partial^{2} d_{f}}{\partial ε_{n}^{2}} \end{bmatrix} \\ & = \begin{bmatrix} 2 x_{1} y_{1} & x_{1} y_{2} + x_{2} y_{1} & x_{1} y_{3} + x_{3} y_{1} & \dots \\ x_{1} y_{2} + x_{2} y_{1} & 2 x_{2} y_{2} & x_{2} y_{3} + x_{3} y_{2} & \dots \\ x_{1} y_{3} + x_{3} y_{1} & x_{2} y_{3} + x_{3} y_{2} & 2 x_{3} y_{3} & \dots \\ \vdots & \vdots & \vdots & \ddots \\ x_{1} y_{n} + x_{n} y_{1} & x_{2} y_{n} + x_{n} y_{2} & x_{3} y_{n} + x_{n} y_{3} & \dots \end{bmatrix} . \end{aligned}

(33)

From (33), we can see that H is independent of ε_i; it is an expression of x_i and y_i only. This means that H is the same for all points in the domain.

To determine if H is positive or negative definite or semidefinite, its principal minor determinants are derived as

D_{0} = 2 x_{i} y_{i}

(34)

D_{1} = \begin{vmatrix} 2 x_{i} y_{i} & x_{i} y_{j} + x_{j} y_{i} \\ x_{i} y_{j} + x_{j} y_{i} & 2 x_{j} y_{j} \end{vmatrix} = - {(x_{i} y_{j} - x_{j} y_{i})}^{2}

(35)

D_{2} = D_{3} = \dots = D_{n} = 0, \quad \text{where } 1 \leq i < j \leq n .

(36)
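The structure of H and the minors in (34) to (36) can be verified on small numeric examples. In the sketch below, the helper names and the three-symbol coefficient vectors are hypothetical choices made for illustration; only the two-symbol vectors come from the example in Section 3.

```python
def hessian(xc, yc):
    """Hessian of d_f per Eq. (33): H[i][j] = x_i*y_j + x_j*y_i."""
    n = len(xc)
    return [[xc[i] * yc[j] + xc[j] * yc[i] for j in range(n)] for i in range(n)]


def second_order_minor(h, i, j):
    """Principal minor built from rows and columns i and j, as in Eq. (35)."""
    return h[i][i] * h[j][j] - h[i][j] * h[j][i]


def det3(m):
    """Determinant of a 3x3 matrix by the rule of Sarrus."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Two-symbol example from Section 3: x_i = (1, 5), y_i = (-6, 1).
h2 = hessian([1, 5], [-6, 1])
d1 = second_order_minor(h2, 0, 1)

# Three symbols (hypothetical coefficients): the third-order minor vanishes.
h3 = hessian([1, 5, 2], [-6, 1, 4])
d2 = det3(h3)
```

Here `d1` equals -(x₁y₂ - x₂y₁)² = -31² = -961, and `d2` is zero, as expected for a matrix that is the sum of two rank-one terms.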

As introduced in Section 2.4, H is a positive semidefinite matrix, iff it satisfies

\forall x_{i} y_{i} \geq 0, \forall x_{i} y_{j} = x_{j} y_{i}, for 1 \leq i < j \leq n .

(37)

H is a negative semidefinite matrix, iff it satisfies

\forall x_{i} y_{i} \geq 0, \exists x_{i} y_{j} \neq x_{j} y_{i}, for 1 \leq i < j \leq n .

(38)

If it satisfies neither (37) nor (38), which means it satisfies (39), H is an indefinite matrix as

\exists x_{i} y_{i} < 0, for 1 \leq i \leq n .

(39)

According to (37), (38), and (39), we can compare d_emax, d_emin, d_fimax, and d_fimin, which are expressed as (29), (30), (31), and (32), respectively. Based on (25) and (26), d_max and d_min can be identified.

Lemma 1.

The estimated maximum of function d_f, d_max, equals the estimated maximum of the function value at the domain boundary, and the estimated minimum of function d_f, d_min, equals the estimated minimum of the function value at the domain boundary. This can be expressed as

d_{max} = d_{emax}, \quad d_{min} = d_{emin} .

(40)

Proof.

There are two cases to consider: ∃x_iy_i < 0 and ∀x_iy_i ≥ 0.

For ∃x_iy_i < 0, (39) is satisfied and H is indefinite. The stationary point is a saddle point, such as the point P in Figure 2. Neither d_fimax nor d_fimin exists in d_f, that is,

d_{max} = d_{emax}, \quad d_{min} = d_{emin} .

(41)

According to (41), Lemma 1 can be proven in this case.

For ∀x_iy_i ≥ 0, H may be positive semidefinite or negative semidefinite. d_f may have local minima or local maxima under this condition.

As ε_i = [-1, 1], the following inequalities are established:

\sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | \geq \pm \sum_{i, j = 1, i \neq j}^{n} x_{i} y_{j} ε_{i} ε_{j},

(42)

\sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{i}^{α} | \geq \pm \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i} ε_{j}^{α},

(43)

\sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{j}^{α} | \geq \pm \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{j} ε_{i}^{α} .

(44)

If a local maximum lies at z^α, the difference between d_emax and d_fimax is

d_{emax} - d_{fimax} \geq \sum_{i = 1}^{n} x_{i} y_{i} (1 - ε_{i}^{2}) .

(45)

Since ∀x_iy_i ≥ 0 and ε_i² ≤ 1, the right-hand side of (45) is nonnegative, so

d_{emax} \geq d_{fimax} .

(46)

According to (25) and (46), we can prove that

d_{max} = d_{emax} .

(47)

Similarly, if a local minimum lies at z^α, the difference between d_emin and d_fimin is

\begin{array}{l} d_{emin} - d_{fimin} & \leq - \sum_{i = 1}^{n} x_{i} y_{i} (ε_{i}^{2} + ε_{i} (ε_{i}^{α} + ε_{j}^{α}) + 1) \\ \leq - \sum_{i = 1}^{n} x_{i} y_{i} {(ε_{i} + 1)}^{2} . \end{array}

(48)

As ∀x_iy_i ≥ 0 in (48), the inequality (49) can be proven:

d_{emin} \leq d_{fimin} .

(49)

According to (26) and (49), we can prove that

d_{min} = d_{emin} .

(50)

As (47) and (50) are established, Lemma 1 is proven in the case of ∀x_iy_i ≥ 0.

Combining these two cases, Lemma 1 is proven.

According to Lemma 1, d_max and d_min at a point z^α can be computed as d_emax and d_emin in (29) and (30).

4.2 Expression of the approximate affine form in AASEE

Lemma 2.

When f_z represents the tangent hyperplane at the point z⁰ = z₀ = (0, 0, …, 0), it satisfies (20).

Proof.

According to Lemma 1, (29), and (30), the maximum absolute error of d_f is

\begin{array}{l} e_{a} & = | \sum_{i = 1}^{n} x_{i} y_{i} | + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | + \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{i}^{α} | \\ + \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{j}^{α} | + | \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i}^{α} ε_{j}^{α} | . \end{array}

(51)

So the maximum absolute error between the tangent hyperplane $f_{z^{0}}$ at the point z⁰ = z₀ = (0, 0, …, 0) and (n + 1)-dimensional quadratic surface z is

\begin{array}{l} e_{a} (z^{0}) = | \sum_{i = 1}^{n} x_{i} y_{i} | + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | . \end{array}

(52)

Suppose that there is another point z^α ≠ z⁰, represented by z^α = (ε₁, ε₂, …, ε_n), where ε_i ∈ [-1, 1] for i = 1 … n and at least one ε_i is nonzero. The maximum absolute error between the tangent hyperplane $f_{z^{α}}$ at the point z^α and the (n + 1)-dimensional quadratic surface $\hat{x} ŷ$ is

\begin{array}{ll} e_{a} (z^{α}) & = | \sum_{i = 1}^{n} x_{i} y_{i} | + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | + \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{i}^{α} | \\ & \quad + \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{j}^{α} | + | \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i}^{α} ε_{j}^{α} | . \end{array}

(53)

e_a(z^α) and e_a(z⁰) can be compared by

\begin{array}{ll} e_{a} (z^{0}) - e_{a} (z^{α}) & = - \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{i}^{α} | \\ & \quad - \sum_{i, j = 1}^{n} | x_{i} y_{j} ε_{j}^{α} | - | \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i}^{α} ε_{j}^{α} | \leq 0 . \end{array}

(54)

Because e_a(z⁰) ≤ e_a(z^α), the tangent hyperplane $f_{z^{0}}$ at the point z⁰ = z₀ = (0, 0, …, 0) is the tangent hyperplane whose maximum absolute error is minimized.

It is proven that the chosen f_z is a tangent hyperplane at the point z⁰ = z₀ = (0, 0, …, 0).

According to Lemma 2, f_z of AASEE denotes the tangent hyperplane at the point z₀ = (0, 0, …, 0) and can be expressed as

f_{z} = x_{0} y_{0} + x_{0} \sum_{i = 1}^{n} y_{i} ε_{i} + y_{0} \sum_{i = 1}^{n} x_{i} ε_{i} .

(55)

This f_z is the same as the f_z used in AATRE and AACHA.

4.3 Expression of the equivalent affine form in AASEE

According to (55), the d_f between the tangent hyperplane $f_{z^{0}}$ and the quadratic surface is

d_{f} = \sum_{i, j = 1}^{n} x_{i} y_{j} ε_{i} ε_{j} .

(56)

According to Lemma 1, (29), and (30), the estimated maximum and estimated minimum of d_f, d_max and d_min can be expressed as

\begin{array}{lll} d_{max} & = & d_{emax} = \sum_{i = 1}^{n} x_{i} y_{i} + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | \\ d_{min} & = & d_{emin} = \sum_{i = 1}^{n} x_{i} y_{i} - \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | . \end{array}

(57)

n = 1 is a special case and d_max and d_min can be optimized as

d_{max} = \begin{cases} x_{1} y_{1}, & \text{for } n = 1,\ x_{1} y_{1} \geq 0 \\ 0, & \text{for } n = 1,\ x_{1} y_{1} \leq 0 \end{cases}

(58)

d_{min} = \begin{cases} 0, & \text{for } n = 1,\ x_{1} y_{1} \geq 0 \\ x_{1} y_{1}, & \text{for } n = 1,\ x_{1} y_{1} \leq 0 . \end{cases}

(59)

By combining the two cases, d_max and d_min are rewritten as

d_{max} = \begin{cases} \sum_{i = 1}^{n} x_{i} y_{i} + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} |, & \text{for } n > 1 \\ x_{1} y_{1}, & \text{for } n = 1,\ x_{1} y_{1} \geq 0 \\ 0, & \text{for } n = 1,\ x_{1} y_{1} < 0 \end{cases}

(60)

d_{min} = \begin{cases} \sum_{i = 1}^{n} x_{i} y_{i} - \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} |, & \text{for } n > 1 \\ 0, & \text{for } n = 1,\ x_{1} y_{1} \geq 0 \\ x_{1} y_{1}, & \text{for } n = 1,\ x_{1} y_{1} < 0 . \end{cases}

(61)

When n > 1, the range of $\hat{d}$ can be expressed as

[\sum_{i = 1}^{n} x_{i} y_{i} - \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} |, \sum_{i = 1}^{n} x_{i} y_{i} + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} |] .

(62)

According to (11), the affine form of $\hat{d}$ can be expressed as

\hat{d} = \sum_{i = 1}^{n} x_{i} y_{i} + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | ε_{n + 1} .

(63)

When n = 1, the range of $\hat{d}$ can be expressed as

[x_{1} y_{1}, 0] or [0, x_{1} y_{1}] .

(64)

The affine form of $\hat{d}$ can be expressed as

\hat{d} = \frac{1}{2} x_{1} y_{1} + \frac{1}{2} | x_{1} y_{1} | ε_{2} .

(65)
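The step from the range (64) to the affine form (65) can be sketched in a few lines. This is a minimal illustration that assumes (11) is the usual midpoint-radius conversion of an interval into an affine form, which is consistent with how (62) leads to (63); the function names `interval_to_affine` and `d_hat_n1` are hypothetical.

```python
def interval_to_affine(lo, hi):
    """Return (center, radius) such that center + radius * eps,
    with eps in [-1, 1], spans exactly the interval [lo, hi]."""
    if lo > hi:
        raise ValueError("expected lo <= hi")
    return (lo + hi) / 2.0, (hi - lo) / 2.0


def d_hat_n1(x1, y1):
    """Affine form of d-hat for n = 1, built from the range in (64)."""
    p = x1 * y1
    lo, hi = min(p, 0.0), max(p, 0.0)  # [x1*y1, 0] or [0, x1*y1]
    return interval_to_affine(lo, hi)


# Illustrative operands: x1 = 0.5, y1 = -0.0275.
center, radius = d_hat_n1(0.5, -0.0275)
# center equals x1*y1/2 and radius equals |x1*y1|/2, as in (65).
```

For either sign of x₁y₁ the result is a center of x₁y₁/2 and a radius of |x₁y₁|/2, which is why (65) needs no case distinction.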

4.4 Formulary of AASEE

According to (12), the affine form of AASEE for multiplication is

\begin{array}{ll} \hat{z} & = f_{z} + \hat{d} = x_{0} y_{0} + x_{0} \sum_{i = 1}^{n} y_{i} ε_{i} + y_{0} \sum_{i = 1}^{n} x_{i} ε_{i} \\ & \quad + \sum_{i = 1}^{n} x_{i} y_{i} + \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | ε_{n + 1}, \quad \text{for } n > 1, \end{array}

(66)

\begin{array}{ll} \hat{z} & = f_{z} + \hat{d} = x_{0} y_{0} + (x_{0} y_{1} + y_{0} x_{1}) ε_{1} + \frac{1}{2} x_{1} y_{1} \\ & \quad + \frac{1}{2} | x_{1} y_{1} | ε_{2}, \quad \text{for } n = 1 . \end{array}

(67)

It is impossible to obtain the exact affine form for multiplication in AA. The result of multiplication must be approximated by an affine form. Using ε_i as the input arguments, the uncertainty of multiplication in AASEE is reduced. The proposed f_z is the closest to the result of multiplication among all the possible approximate affine forms, and the upper and lower bounds of $\hat{d}$ in AASEE are much closer to the true bounds of d_f. Hence, the uncertainty in AASEE is smaller than that in AATRE and AACHA. Formed by such f_z and $\hat{d}$ , AASEE creates a refined affine form of multiplication.
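Formulas (66) and (67) can be sketched as a small routine. This is an illustrative transcription of the formulas as stated, not the authors' implementation: the list layout `[c0, c1, ..., cn]`, the helper `pad`, and the convention of giving a zero coefficient to any noise symbol an operand does not contain are assumptions. The usage lines apply it to the Horner chain of case 1.

```python
def aasee_mul(x, y):
    """Multiply two affine forms following (66) and (67).

    x and y are coefficient lists [c0, c1, ..., cn] over the same noise
    symbols eps_1..eps_n (n >= 1). The result carries one additional
    coefficient, for the new noise symbol eps_{n+1}.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("operands must share n >= 1 noise symbols")
    n = len(x) - 1
    # Tangent hyperplane f_z at the origin, eq. (55).
    z = [x[0] * y[0]] + [x[0] * y[i] + y[0] * x[i] for i in range(1, n + 1)]
    diag = sum(x[i] * y[i] for i in range(1, n + 1))
    if n == 1:
        # Eq. (67): center x1*y1/2, radius |x1*y1|/2.
        z[0] += 0.5 * diag
        z.append(0.5 * abs(diag))
    else:
        # Eq. (66): center sum x_i*y_i, radius sum over i != j of |x_i*y_j|.
        cross = sum(abs(x[i] * y[j])
                    for i in range(1, n + 1)
                    for j in range(1, n + 1) if i != j)
        z[0] += diag
        z.append(cross)
    return z


def pad(form, n):
    """Extend a form with zero coefficients up to n noise symbols (assumed convention)."""
    return form + [0.0] * (n + 1 - len(form))


# Horner chain of case 1 with x = 0.5 + 0.5*eps_1.
x = [0.5, 0.5]
y1 = [-0.0550 * x[0] + 0.2168, -0.0550 * x[1]]
y2 = aasee_mul(y1, x)
y2[0] -= 0.4645
y3 = aasee_mul(y2, pad(x, 2))
y3[0] += 0.9956
y = aasee_mul(y3, pad(x, 3))
y[0] += 0.0001
```

Under these assumptions the final coefficients agree, to the printed precision, with the AASEE form of y reported in Section 6.2. The sketch only reproduces the arithmetic of (66) and (67); it does not by itself check that the resulting range encloses the true one.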

5 Comparison of AASEE to AATRE and AACHA

5.1 Computational complexity

The computational complexity of an expression is determined by its most complex term. For n > 1, the most complex term is the coefficient of ε_{n+1}. To make the analysis convenient, we transform this coefficient:

\begin{array}{ll} \sum_{i, j = 1, i \neq j}^{n} | x_{i} y_{j} | & = \sum_{i, j = 1}^{n} | x_{i} y_{j} | - \sum_{i = 1}^{n} | x_{i} y_{i} | \\ & = \sum_{i = 1}^{n} | x_{i} | \sum_{j = 1}^{n} | y_{j} | - \sum_{i = 1}^{n} | x_{i} y_{i} | . \end{array}

(68)

The computational complexity of the minuend is O(M₁), where M₁ is defined in Section 2.3, while the computational complexity of the subtrahend is less than O(M₁).

Hence, the computational complexity of AASEE is O(M₁). We can see that it is the same as that of AATRE and is lower than that of AACHA.
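The transformation in (68) can be checked numerically. The sketch below compares the double sum over i ≠ j with the factored form; the function names and the sample coefficients are hypothetical, and the operation counts in the comments refer to n, not to the quantity M₁ of Section 2.3.

```python
def cross_sum_direct(xs, ys):
    """Left-hand side of (68): double sum over i != j (about n*n products)."""
    n = len(xs)
    return sum(abs(xs[i] * ys[j])
               for i in range(n) for j in range(n) if i != j)


def cross_sum_factored(xs, ys):
    """Right-hand side of (68): product of two sums minus the diagonal
    (about n products plus a few additions)."""
    sum_x = sum(abs(v) for v in xs)
    sum_y = sum(abs(v) for v in ys)
    diagonal = sum(abs(a * b) for a, b in zip(xs, ys))
    return sum_x * sum_y - diagonal


# Sample partial deviations x_1..x_3 and y_1..y_3 (illustrative values).
xs = [0.5, -2.0, 3.0]
ys = [1.0, 4.0, -0.25]
direct = cross_sum_direct(xs, ys)
factored = cross_sum_factored(xs, ys)
```

Both functions return 19.625 for these inputs. The factored form avoids enumerating all index pairs, which is the point of rewriting the coefficient before counting operations.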

5.2 Accuracy

The accuracy of $\hat{d}$ determines the accuracy of an affine approximation method of multiplication. A more accurate $\hat{d}$ leads to a more accurate affine approximation result.

For AATRE, $\hat{d} = \sum_{i = 1}^{n} | x_{i} | \sum_{i = 1}^{n} | y_{i} | ε_{n + 1}$ . In this method, occurrences of the same noise symbol in different variables are treated as independent. The range of this $\hat{d}$ is

[- \sum_{i = 1}^{n} | x_{i} | \sum_{i = 1}^{n} | y_{i} |, \sum_{i = 1}^{n} | x_{i} | \sum_{i = 1}^{n} | y_{i} |] .

(69)

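This range is generally wider than the range of $\hat{d}$ by AASEE, which is expressed in (62) and (64): by (68), the AASEE radius for n > 1 is smaller than the AATRE radius by $\sum_{i = 1}^{n} | x_{i} y_{i} |$ .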

In AACHA, $\hat{d} = \frac{a + b}{2} + \frac{b - a}{2} ε_{n + 1}$ , where a and b are the lower and upper bounds of the estimated range of $\hat{d}$ . In this method, a polygon in the XY plane is used to find a and b. The domain of $\hat{x} ŷ$ is bounded by the polygon. However, the polygon is larger than the true domain, and occurrences of the same noise symbol in different variables are not all taken into account together.

In $\hat{d}$ of AASEE, all occurrences of the same noise symbol in different variables are considered together. It is more accurate than $\hat{d}$ of AATRE. In most cases, it is more accurate than $\hat{d}$ of AACHA, too.
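The difference between the ranges (69) and (62)/(64) can be made concrete with a short sketch. The function names are hypothetical and the operands are illustrative (they correspond to the partial deviations of y₂ and x in the second multiplication of case 1 under AASEE); AACHA is omitted because its polygon-based bounds are not specified in this section.

```python
def d_hat_range_aatre(xs, ys):
    """Range (69): symmetric around zero, radius sum|x_i| * sum|y_i|."""
    r = sum(abs(v) for v in xs) * sum(abs(v) for v in ys)
    return -r, r


def d_hat_range_aasee(xs, ys):
    """Range (64) for n = 1 and range (62) for n > 1."""
    diag = sum(a * b for a, b in zip(xs, ys))
    if len(xs) == 1:
        return min(diag, 0.0), max(diag, 0.0)
    cross = (sum(abs(v) for v in xs) * sum(abs(v) for v in ys)
             - sum(abs(a * b) for a, b in zip(xs, ys)))
    return diag - cross, diag + cross


# Partial deviations over eps_1, eps_2 (illustrative values).
xs = [0.0809, 0.006875]
ys = [0.5, 0.0]
lo_t, hi_t = d_hat_range_aatre(xs, ys)
lo_a, hi_a = d_hat_range_aasee(xs, ys)
width_aatre = hi_t - lo_t
width_aasee = hi_a - lo_a
```

For these operands the AATRE width is about 0.0878 and the AASEE width is about 0.0069, so the extra noise symbol introduced by AASEE carries a much smaller coefficient.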

6 Case studies

The following nonlinear system cases are used to demonstrate the efficiency of the proposed refined affine form of multiplication. These cases are commonly used in signal processing. The first two cases are univariate and come from [11]. The remaining cases are multivariate polynomial functions and come from [27–29].

6.1 Introduction of the cases

Case 1. Polynomial approximation. The first case study is a degree-four polynomial for the approximation of y = ln(1 + x), where x = [0, 1]. The polynomial is evaluated with Horner’s rule as

y = (((- 0.0550 x + 0.2168) x - 0.4645) x + 0.9956) x + 0.0001,

where the coefficients are obtained by polynomial curve fitting technique.
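Horner’s rule for this polynomial can be sketched as follows. The coefficients are those given above; the helper name `horner`, the 1001-point grid, and the comparison against `math.log1p` are illustrative choices.

```python
import math

# Coefficients of the degree-four polynomial, highest degree first.
COEFFS = [-0.0550, 0.2168, -0.4645, 0.9956, 0.0001]


def horner(coeffs, x):
    """Evaluate a polynomial with Horner's rule: one multiplication
    and one addition per coefficient."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


# Compare the polynomial with ln(1 + x) on a uniform grid over [0, 1].
grid = [k / 1000.0 for k in range(1001)]
max_err = max(abs(horner(COEFFS, x) - math.log1p(x)) for x in grid)
y_at_0 = horner(COEFFS, 0.0)
y_at_1 = horner(COEFFS, 1.0)
```

Each loop iteration corresponds to one of the intermediate signals of the Horner chain, which is why range analysis of this case involves three successive multiplications by x.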

Case 2. B-splines. Uniform cubic B-splines are commonly used for image warping [30]. The basis functions B₀, B₁, B₂, and B₃ of the B-spline are defined as

\begin{array}{l} B_{0} (u) & = \frac{1}{6} {(1 - u)}^{3}, B_{1} (u) = \frac{1}{6} (3 u^{3} - 6 u^{2} + 4), \\ B_{2} (u) & = \frac{1}{6} (- 3 u^{3} + 3 u^{2} + 3 u + 1), B_{3} (u) = \frac{- u^{3}}{6}, \end{array}

where u = [0, 1].

Case 3. Multivariate polynomial functions. In the third case, eight multivariate polynomial functions are examined. They are as follows:

1. Savitzky-Golay filter, with input range $X = {[- 2, 2]}^{2}$:
$\begin{array}{ll} f_{1} (X) & = 7 x_{1}^{3} - 984 x_{2}^{3} - 76 x_{1}^{2} x_{2} + 92 x_{1} x_{2}^{2} + 7 x_{1}^{2} \\ & \quad - 39 x_{1} x_{2} - 46 x_{2}^{2} + 7 x_{1} - 46 x_{2} - 75 \end{array}$
2. Image rejection unit, with input range $X = {[0, 1]}^{2}$:
$\begin{array}{ll} f_{2} (X) & = 16384 (x_{1}^{4} + x_{2}^{4}) + 64767 (x_{1}^{2} - x_{2}^{2}) + x_{1} - x_{2} \\ & \quad + 57344 x_{1} x_{2} (x_{1} - x_{2}) \end{array}$
3. A random function, with input range $X = {[- 2, 2]}^{3}$:
$f_{3} (X) = (x_{1} - 1) (x_{1} + 2) (x_{2} + 1) (x_{2} - 2) x_{3}^{2}$
4. Mitchell function, with input range $X = {[- 2, 2]}^{3}$:
$\begin{array}{ll} f_{4} (X) & = 4 [x_{1}^{4} + {(x_{2}^{2} + x_{3}^{2})}^{2}] + 17 x_{1}^{2} (x_{2}^{2} + x_{3}^{2}) \\ & \quad - 20 (x_{1}^{2} + x_{2}^{2} + x_{3}^{2}) + 17 \end{array}$
5. Matyas function, with input range $X = {[- 100, 100]}^{2}$:
$f_{5} (X) = 0.26 (x_{1}^{2} + x_{2}^{2}) - 0.48 x_{1} x_{2}$
6. Three-hump function, with input range $X = {[- 10, 10]}^{2}$:
$f_{6} (X) = 12 x_{1}^{2} - 6.3 x_{1}^{4} + x_{1}^{6} + 6 x_{2} (x_{2} - x_{1})$
7. Goldstein-Price function, with input range $X = {[- 2, 2]}^{2}$:
$\begin{array}{ll} f_{7} (X) & = [1 + {(x_{1} + x_{2} + 1)}^{2} (19 - 14 x_{1} + 3 x_{1}^{2} - 14 x_{2} \\ & \quad + 6 x_{1} x_{2} + 3 x_{2}^{2})] \times [30 + {(2 x_{1} - 3 x_{2})}^{2} \\ & \quad \times (18 - 32 x_{1} + 12 x_{1}^{2} + 48 x_{2} - 36 x_{1} x_{2} + 27 x_{2}^{2})] \end{array}$
8. Ratscheck function, with input range $X = {[- 100, 100]}^{2}$:
$f_{8} (X) = 4 x_{1}^{2} - 2.1 x_{1}^{4} + \frac{1}{3} x_{1}^{6} + x_{1} x_{2} - 4 x_{2}^{2} + 4 x_{2}^{4}$

6.2 Analysis of case 1

For the input range x = [0, 1], the equivalent affine form is $\hat{x} = 0.5 + 0.5 ε_{1}$ . For case 1, the intermediate and output signals are defined as

\begin{array}{l} y_{1} & = - 0.0550 x + 0.2168, y_{2} = y_{1} x - 0.4645, \\ y_{3} & = y_{2} x + 0.9956, y = y_{3} x + 0.0001 . \end{array}

(70)

Using AATRE, the affine forms of intermediate and output are

\begin{array}{l} y_{1} & = 0.1893 - 0.0275 ε_{1}, \\ y_{2} & = - 0.36985 + 0.0809 ε_{1} + 0.01375 ε_{2}, \\ y_{3} & = 0.81068 - 0.14448 ε_{1} + 0.00688 ε_{2} + 0.04733 ε_{3}, \\ y & = 0.4054 + 0.3331 ε_{1} + 0.0034 ε_{2} + 0.0237 ε_{3} + 0.0993 ε_{4} . \end{array}

Using AACHA, the affine forms of intermediate and output are

\begin{array}{l} y_{1} & = 0.1893 - 0.0275 ε_{1}, \\ y_{2} & = - 0.3768 + 0.0809 ε_{1} + 0.0069 ε_{2}, \\ y_{3} & = 0.8291 - 0.1479 ε_{1} + 0.0034 ε_{2} + 0.0220 ε_{3}, \\ y & = 0.3761 + 0.3406 ε_{1} + 0.0017 ε_{2} + 0.0110 ε_{3} + 0.0436 ε_{4} . \end{array}

Using AASEE, the affine forms of intermediate and output are

\begin{array}{l} y_{1} & = 0.1893 - 0.0275 ε_{1}, \\ y_{2} & = - 0.37673 + 0.0809 ε_{1} + 0.00688 ε_{2}, \\ y_{3} & = 0.84769 - 0.14791 ε_{1} + 0.00344 ε_{2} + 0.00344 ε_{3}, \\ y & = 0.34999 + 0.34989 ε_{1} + 0.00172 (ε_{2} + ε_{3}) + 0.00344 ε_{4} . \end{array}

Table 1 shows the variable ranges and the range intervals, (y_max - y_min), of the intermediates and the output for the three methods. The true range of y lies in [0, 0.6931], so the true range interval of the output is 0.6931. Let R(T), R(C), and R(A) denote the ratios of the range interval obtained by AATRE, AACHA, and AASEE, respectively, to the true range interval. The closer this ratio is to 1, the more accurate the method is. In this case, R(T) = 1.33, R(C) = 1.15, and R(A) = 1.03, so the range by AASEE is closer to the true range than the ranges by AATRE and AACHA.
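The ratios quoted above can be recomputed from the output affine forms listed earlier in this section. The sketch below is illustrative: the helper `affine_range` and the dictionary layout are assumptions, while the coefficients and the true interval width of 0.6931 are taken from the text.

```python
def affine_range(coeffs):
    """Range of an affine form [c0, c1, ..., cn]: c0 -/+ sum of |c_i|."""
    radius = sum(abs(c) for c in coeffs[1:])
    return coeffs[0] - radius, coeffs[0] + radius


# Output affine forms of y for case 1, as listed for the three methods.
forms = {
    "AATRE": [0.4054, 0.3331, 0.0034, 0.0237, 0.0993],
    "AACHA": [0.3761, 0.3406, 0.0017, 0.0110, 0.0436],
    "AASEE": [0.34999, 0.34989, 0.00172, 0.00172, 0.00344],
}
TRUE_WIDTH = 0.6931  # true range interval of y stated in the text

ratios = {}
for name, coeffs in forms.items():
    lo, hi = affine_range(coeffs)
    ratios[name] = (hi - lo) / TRUE_WIDTH
```

Rounded to two decimals, the three ratios are 1.33, 1.15, and 1.03, which matches R(T), R(C), and R(A).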

Table 1 Comparison of ranges and range intervals for every variable of the three methods for case 1

6.3 Comparison of range and computational complexity by the three cases

The output ranges of case 2 and case 3 under the three methods are obtained by following the same process as in case 1.

Table 2 lists the ranges and the integer word lengths obtained by AATRE, AACHA, and AASEE. Column c.fun identifies the case study and the function of each row. The true output ranges, used as reference values, are obtained by numerical methods or nonlinear programming; these techniques are time-consuming and impractical for finding the true bounds of a large number of signals. The table shows that, for all the functions, the ranges derived by AASEE cover the true ranges and are smaller than those by AATRE. Of the thirteen functions, the ranges derived by AASEE are smaller than those by AACHA for nine and equal to those by AACHA for two. According to (1), the integer word length is determined by the range. The integer word length derived by AASEE is at most 2 b less than that by AATRE and at most 1 b less than that by AACHA. Compared with AATRE, AASEE and AACHA can save 0.54 b on average.

Table 2 Comparison of analytical ranges and bits by the three methods

To calculate the estimated range of d_f, the values of ∃ε_i = ±1, ∀i = 1, 2, …, n in (27) are substituted by ∀ε_i = ±1 in AASEE. The difference between the estimated range and the true range of d_f is introduced by this approximation. In most applications, the estimated ranges computed by AASEE are closer to the true ranges than those computed by AACHA. However, the estimated minimum and maximum of $\hat{x} ŷ$ on the boundary of the polygon are independent of the value of ε_i. In some applications, such as functions f₂ and f₈ in Table 2, the results by AASEE are almost the same as those by AACHA.

In Table 3, the ratios of range intervals and the computational complexity are compared among AATRE, AACHA, and AASEE. The computational complexity is calculated from the numbers of multiplications and additions. For AACHA, the extreme value of a quadratic function in one variable on a bounded interval also needs to be calculated. N_m, N_a, and N_e denote the numbers of multiplications, additions, and extreme value computations of each case, respectively.

Table 3 shows that the R(T) values range from 1.04 to 281.2, the R(C) values from 1.03 to 233.7, and the R(A) values from 1.03 to 192.9. The ratios of R(A) to R(T) and of R(A) to R(C) show the accuracy of AASEE compared to AATRE and AACHA, respectively, and their averages can be used to evaluate the accuracy of the affine approximation methods. The ratios of R(A) to R(T) range from 0.18 to 0.99, with an average of 0.59. The ratios of R(A) to R(C) range from 0.33 to 1.17, with an average of 0.89. For these 13 cases, on average, the accuracy of AASEE is 1.69 times that of AATRE and 1.12 times that of AACHA.

The extreme value computation of the quadratic function, which is only necessary for AACHA, is the most complex and time-consuming of the operations. Hence, the computational complexity of AACHA is much higher than that of AATRE and AASEE. The increase rate of the number of multiplications, N_m, of AASEE relative to AATRE ranges from 0.091 to 1.75, with an average of 0.450; relative to AACHA, it ranges from 0.2 to 1.833, with an average of 0.567. The increase rate of the number of additions, N_a, of AASEE relative to AATRE ranges from 0.05 to 3.4, with an average of 0.944; relative to AACHA, it ranges from 0 to 0.985, with an average of 0.157. AASEE thus requires somewhat more multiplications and additions than the other two methods. As shown in Table 3, AACHA is slightly more accurate for functions c₃.f₂ and c₃.f₈, but the computational complexity of AACHA is much higher than that of AASEE.
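The accuracy factors of 1.69 and 1.12 appear to be the reciprocals of the average range-interval ratios 0.59 and 0.89. This reading is an assumption, since the text does not state the formula; the short check below only shows that the numbers are consistent with it.

```python
# Average ratios of range intervals reported for the 13 cases.
avg_ratio_vs_aatre = 0.59  # average of R(A) / R(T)
avg_ratio_vs_aacha = 0.89  # average of R(A) / R(C)

# Assumed definition: accuracy gain = reciprocal of the average ratio.
gain_vs_aatre = 1.0 / avg_ratio_vs_aatre
gain_vs_aacha = 1.0 / avg_ratio_vs_aacha

print(f"{gain_vs_aatre:.2f} {gain_vs_aacha:.2f}")
```

Under this assumption, an average ratio closer to 1 means the two methods produce ranges of similar width, and a smaller ratio means AASEE is tighter by the corresponding factor.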

Table 3 Comparison of range ratios and computational complexity by the three methods

6.4 Comparison of the design cost by the three methods

To compare the design cost of the three methods, measured as the system area, the fractional word lengths are obtained by the precise analysis in [11]. As a representative example, we select the random function of case 3, c₃.f₃, for this section. The design of c₃.f₃ is synthesized on a Xilinx Xc2vp30-7ff896 FPGA device (Xilinx, San Jose, CA, USA).

Figure 3 shows the area variation for c₃.f₃ with increasing target precision. The area obtained with AASEE is less than that with AATRE and AACHA, and the area difference grows from 265 to 729 as the target precision increases. Such optimization of the integer word length can save area.

Figure 4 shows the percentage area saving of AASEE over AATRE at different target precisions for c₃.f₃. The percentage area saving decreases from 14.34% to 5.62% as the target precision increases. In general, the relative saving is larger at lower precision.

7 Conclusions

This paper presents a novel affine approximation method for multiplication, Approximation Affine based on Space Extreme Estimation. In this method, an extra noise symbol is added to an approximated affine form.

To reduce the uncertainty in AA, we derive this method in the (n + 1)-dimensional space Eⁿ⁺¹. In space Eⁿ⁺¹, the approximate affine form can be regarded as the tangent hyperplane at a certain point of an (n + 1)-dimensional curved surface. Using linear geometry, it is proven that the f_z of AASEE is the closest to the result of multiplication among all the possible approximate affine forms. Taking ε_i as the input arguments, all occurrences of the same noise symbol in different variables are taken into account together. Hence, the uncertainty of $\hat{d}$ of AASEE is reduced. Based on the extreme value theory of multivariable functions, we can prove that the range of this $\hat{d}$ covers the true range of the difference introduced by approximation and is much tighter than that by AATRE and AACHA.

The uncertainty in AASEE is much smaller than that in AATRE and AACHA on average. At the same time, the computational complexity of AASEE is the same as that of AATRE and lower than that of AACHA.

In the case studies, the accuracy of AASEE is on average 1.69 times that of AATRE and 1.12 times that of AACHA. The integer word length derived by AASEE is at most 2 b less than that by AATRE and at most 1 b less than that by AACHA. For the case of c₃.f₃, the area obtained with AASEE is less than that with AATRE and AACHA, and the percentage area saving of AASEE over AATRE decreases from 14.34% to 5.62% as the target precision increases.

References

1. Constantinides G, Woeginger G: The complexity of multiple wordlength assignment. Appl. Math. Lett 2002, 15(2):137-140. 10.1016/S0893-9659(01)00107-0
2. Cmar R, Rijnders L, Schaumont P, Vernalde S, Bolsens I: A methodology and design environment for DSP ASIC fixed point refinement. In Proceedings of Design, Automation and Test in Europe. Munich: IEEE Computer Society; 09–12 March 1999:271-276.
3. Kum K, Sung W: Combined word-length optimization and high level synthesis of digital signal processing systems. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2001, 20(8):921-930. 10.1109/43.936374
4. Roy S, Banerjee P: An algorithm for trading off quantization error with hardware resources for MATLAB-based FPGA design. IEEE Trans. Comput 2005, 54(7):886-896. 10.1109/TC.2005.106
5. Mallik A, Sinha D, Zhou H: Low-power optimization by smart bit-width allocation in a SystemC-based ASIC design environment. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2007, 26(3):447-455.
6. Caffarena G, Carreras C, Lopez JA: SQNR estimation of fixed-point DSP algorithms. Eurasip J. Adv. Signal Process 2010, 21: 1-12.
7. Banciu A, Casseau E, Menard D: Stochastic modeling for floating-point to fixed-point conversion. In Proceedings of IEEE Workshop on Signal Processing Systems (SiPS). Beirut: IEEE Computer Society; 4–7 October 2011:180-185.
8. Fang CF, Rutenbar R, Puschel M, Chen T: Toward efficient static analysis of finite-precision effects in DSP applications via affine arithmetic modeling. In Proceedings of Design Automation Conference, Institute of Electrical and Electronics Engineers Inc.. Anaheim; 2–6 June 2003:496-501.
9. Fang CF, Rutenbar R: Fast, accurate static analysis for fixed-point finite-precision effects in DSP designs. In Proceedings of International Conference on Computer-Aided Design, Institute of Electrical and Electronics Engineers Inc.. San Jose; 9–13 November 2003:275-282.
10. Pu Y, Ha Y: An automated, efficient and static bit-width optimization methodology towards maximum bit-width-to-error tradeoff with affine arithmetic model. In Proceedings of Asia and South Pacific Design Automation Conference, Institute of Electrical and Electronics Engineers Inc.. Yokohama; 24–27 January 2006:886-891.
11. Lee DU, Gaffar AA, Cheung RC, Mencer O, Luk W, Constantinides GA: Accuracy guaranteed bit-width optimisation. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2006, 25(10):1990-2000.
12. Osborne WG, Coutinho JGF, Luk W, Mencer O: Instrumented multi-stage word-length optimization. In Proceedings of IEEE International Conference on Field-Programmable Technology, Institute of Electrical and Electronics Engineers Inc.. Kitakyushu; 12–14 December 2007:89-96.
13. Lopez JA, Carreras C, Nieto-Taladriz O: Improved interval-based characterization of fixed-point LTI systems with feedback loops. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2007, 2(11):1923-1933.
14. Zhang L, Zhang Y, Zhou W: Tradeoff between approximation accuracy and complexity for range analysis using affine arithmetic. J. Signal Process. Syst 2010, 61(3):279-291. 10.1007/s11265-010-0452-2
15. Sarbishei O, Radecka K, Zilic Z: Analytical optimization of bit-widths in fixed-point LTI systems. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2012, 31(3):343-355.
16. Rocher R, Menard D, Scalart P: Analytical approach for numerical accuracy estimation of fixed-point systems based on smooth operations. IEEE Trans. Circuits Syst. I, Reg. Papers 2012, 59(10):2326-2339.
17. Kinsman AB, Nicolici N: Bit-width allocation for hardware accelerators for scientific computing using SAT-modulo theory. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2010, 29(3):406-413.
18. Kinsman AB, Nicolici N: Computational vector-magnitude-based range determination for scientific abstract data types. IEEE Trans. Comput 2011, 60(11):1652-1663.
19. Wadekar SA, Parker AC: Accuracy sensitive word-length selection for algorithm optimization. In Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998. ICCD ‘98, Institute of Electrical and Electronics Engineers Inc.. Austin; 5–7 October 1998:54-61.
20. Carreras C, Lopez JA, Nieto-Taladriz O: Bit-width selection for data-path implementations. In Proceedings of the 12th International Symposium on System Synthesis, 1999. Boca Raton: IEEE Computer Society; 1–4 November 1999:114-119.
21. Comba JLD, Stolfi J: Affine arithmetic and its applications to computer graphics. In Proceedings of SIBGRAPI’93 - VI Simposio Brasileiro de Computacao Grafica e Processamento de Imagens. Recife: IEEE Computer Society; 20–22 October 1993:9-18.
22. Stolfi J, de Figueiredo (eds) LH: Affine arithmetic. In Self-Validated Numerical Methods and Applications. Brazil: Monograph for 21st Brazilian Mathematics Colloquium, IMPA, Rio de Janeiro; 1997:70-74.
23. Huang K, Yee H: Improved tangent hyperplane method for transient stability studies [of power systems]. In Proceedings of APSCOM-91 Conference, Institution of Electrical Engineers. Hong Kong; 5–8 November 1991:363-366.
24. Eivind E, Gustavsen TS: GRA6035 Mathematics. Oslo: BI Norwegian Business School; 2010.
25. Moore R: Interval Analysis. New Jersey: Prentice-Hall; 1966.
26. Pang Y, Radecka K: An efficient algorithm of performing range analysis for fixed-point arithmetic circuits based on SAT checking. In Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS). Rio de Janeiro: IEEE Computer Society; 15–18 May 2011:1736-1739.
27. Shekhar N, Kalla P, Enescu F: Equivalence verification of arithmetic datapaths with multiple word-length operands. In Proceedings of Design, Automation and Test in Europe. Munich: IEEE Computer Society; 6–10 March 2006:824-829.
28. Gopalakrishnan S, Kalla P, Meredith MB, Enescu F: Finding linear building-blocks for RTL synthesis of polynomial datapaths with fixed-size bit-vectors. In Proceedings of International Conference on Computer-Aided Design, Institute of Electrical and Electronics Engineers Inc.. San Jose; 5–8 November 2007:143-148.
29. Shou H, Song W, Shen J, Martind R, Wang G: A recursive Taylor method for ray-casting algebraic surfaces. In Proceedings of International Conference on Computer Graphics and Virtual Reality. Las Vegas: CSREA Press; 26–29 June 2006:196-204.
30. Jiang J, Luk W, Rueckert D: FPGA-based computation of free-form deformations in medical image registration. In Proceedings of IEEE International Conference on Field-Programmable Technology 2003. Tokyo: IEEE Computer Society; 15–17 December 2003:234-241.

Author information

Authors and Affiliations

Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Xili, Shenzhen, 518055, China
Ruiyi Sun, Yan Zhang & Aijiao Cui

Corresponding author

Correspondence to Yan Zhang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Sun, R., Zhang, Y. & Cui, A. A refined affine approximation method of multiplication for range analysis in word-length optimization. EURASIP J. Adv. Signal Process. 2014, 36 (2014). https://doi.org/10.1186/1687-6180-2014-36

Received: 08 June 2013
Accepted: 10 March 2014
Published: 22 March 2014
DOI: https://doi.org/10.1186/1687-6180-2014-36
