Generation of test matrices with specified eigenvalues using floating-point arithmetic

Ozaki, Katsuhisa; Ogita, Takeshi

doi:10.1007/s11075-021-01186-7

Generation of test matrices with specified eigenvalues using floating-point arithmetic

Original Paper
Open access
Published: 17 September 2021

Volume 90, pages 241–262, (2022)
Cite this article

Download PDF

You have full access to this open access article

Numerical Algorithms Aims and scope Submit manuscript

Generation of test matrices with specified eigenvalues using floating-point arithmetic

Download PDF

1509 Accesses
Explore all metrics

Abstract

This paper concerns test matrices for numerical linear algebra using an error-free transformation of floating-point arithmetic. For specified eigenvalues given by a user, we propose methods of generating a matrix whose eigenvalues are exactly known based on, for example, Schur or Jordan normal form and a block diagonal form. It is also possible to produce a real matrix with specified complex eigenvalues. Such test matrices with exactly known eigenvalues are useful for numerical algorithms in checking the accuracy of computed results. In particular, exact errors of eigenvalues can be monitored. To generate test matrices, we first propose an error-free transformation for the product of three matrices Y SX. We approximate S by ${S^{\prime }}$ to compute ${YS^{\prime }X}$ without a rounding error. Next, the error-free transformation is applied to the generation of test matrices with exactly known eigenvalues. Note that the exactly known eigenvalues of the constructed matrix may differ from the anticipated given eigenvalues. Finally, numerical examples are introduced in checking the accuracy of numerical computations for symmetric and unsymmetric eigenvalue problems.

Numerical methods for accurate computation of the eigenvalues of Hermitian matrices and the singular values of general matrices

Article 30 August 2020

Solving Polynomial Systems Using Numeric Gröbner Bases

Explicit Solutions of the Singular Yang–Baxter-like Matrix Equation and Their Numerical Computation

Article 18 March 2022

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

This paper concerns test matrices for numerical algorithms. If exact eigenvalues are known in advance, then it is useful for checking accuracy and stability of numerical methods for problems in numerical linear algebra. There are several matrices whose exact eigenvalues are known, for example, Circular, Clement, Frank, and tridiagonal Toeplitz matrices. Test matrices are well summarized in [4, Chapter 28]. This paper aims to generate a matrix with exactly known eigenvalues based on some standard forms, for example, the Schur or the Jordan normal forms, or a block diagonal form. Let $X \in \mathbb {R}^{n \times n}$ be a non-singular matrix. For a matrix $S \in \mathbb {R}^{n \times n}$, we consider a form:

$$ X^{-1} S X. $$

(1)

If (1) is the Schur normal form, X is an orthogonal matrix, and S is an upper triangular matrix. For the Jordan normal form, $S \in \mathbb {R}^{n \times n}$ is a matrix whose diagonal blocks are Jordan matrices. It indicates that S is an upper bidiagonal matrix and the sub-diagonal elements s_i,i+ 1 are either 1 or 0. For both normal forms, the diagonal elements s_ii (1 ≤ i ≤ n) are eigenvalues of A.

Let $\mathbb {F}$ be a set of floating-point numbers as defined in IEEE 754 [5]. If we use floating-point numbers and floating-point arithmetic, then there are following problems to get a matrix with exactly known eigenvalues using (1).

Problem 1:: Even if all elements in a non-singular matrix X can be represented by floating-point numbers ($X \in \mathbb {F}^{n \times n}$), elements in X^− 1 may not be represented by floating-point numbers ($X^{-1} \not \in \mathbb {F}^{n \times n}$).
Problem 2:: Even if $X, S, X^{-1} \in \mathbb {F}^{n \times n}$, $X^{-1} S X \in \mathbb {F}^{n \times n}$ is not always satisfied.

We first prepare a diagonal matrix S and a non-singular matrix X, next obtain the inverse matrix X^− 1 using exact computations such as symbolic or rational arithmetic. Finally, we compute X^− 1SX using exact computations, and the result X^− 1SX is rounded to a floating-point matrix A. One may think s_ii (1 ≤ i ≤ n) are nearly eigenvalues of A, and these are useful as numerical tests. However, the Bauer-Fike theorem [2] states

$$ \underset{\lambda}{\min} |\lambda - \mu| \le \mathtt{cond}(X) \| E \|, $$

(2)

where λ and μ are eigenvalues of A and A + E = X^− 1SX, respectively. Therefore, s_ii can be far from the exact eigenvalues of A even if A is a floating-point matrix nearest to X^− 1SX. In addition, the cost of symbolic or rational arithmetic is much expensive compared to that of numerical computations.

We propose methods that produce a matrix with specified eigenvalues on the basis of an error-free transformation of matrix multiplication. For a matrix $S \in \mathbb {F}^{n \times n}$ given by a user, we approximate S by $S^{\prime } \in \mathbb {F}^{n \times n}$, and obtain $A:=X^{-1} S^{\prime } X \in \mathbb {F}^{n \times n}$ using the proposed method. We prove that no rounding error occurs in evaluating a product of three matrices $X^{-1} S^{\prime } X$ using floating-point arithmetic. Hence, the exact eigenvalue of A can be known from the matrix $S^{\prime }$.

This paper is organized as follows: In Section 2, basic theorems for floating-point numbers are introduced. In Section 3, we propose an error-free transformation for a product of three floating-point matrices. In Section 4, several methods for generating test matrices with exactly known eigenvalues are proposed based on the theory in Section 3, and we introduce numerical examples in checking the accuracy of numerical computations.

2 Basics of floating-point arithmetic

Notation and basic lemmas on floating-point arithmetic are first introduced. Notations (⋅), _▽(⋅), and _△(⋅) indicate a computed result by floating-point arithmetic with rounding to nearest mode (roundTiesToEven), rounding downward mode (roundTowardNegative), and rounding upward mode (roundTowardPositive), respectively. Such rounding modes are defined in IEEE 754 [5]. We omit to write (⋅) for each operation for simplicity of description. For example, if we write ((a + b) + c) for $a,b,c \in \mathbb {F}$, then it means ((a + b) + c). Assume that neither overflow nor underflow occurs in (⋅) and _△(⋅). Let and _S be a roundoff unit and the smallest positive number in $\mathbb {F}$, respectively. A constant is the largest floating-point number in $\mathbb {F}$. For example, $\mathtt {u}=2^{-53}, \mathtt {u}_{S}=2^{-1074}$, and = 2¹⁰²⁴(1 −) for binary64 in IEEE 754. A function (a) for $a \in \mathbb {R}$, the unit in the first place of the binary representation of a, is defined as

$$ \mathtt{ufp}(a) = \begin{cases} 0 & a=0 \\ 2^{\lfloor \log_{2} |a| \rfloor} & \text{otherwise} \\ \end{cases} . $$

From the definition of the function, we have

$$ \mathtt{ufp}(a) > \frac{1}{2} |a| \ \text{for} \ a \not = 0. $$

(3)

We introduce several basic facts of floating-point numbers and floating-point arithmetic for $a,b \in \mathbb {F}$ and $x \in \mathbb {R}$:

$$ \begin{array}{@{}rcl@{}} a & \in & 2 \mathtt{u} \cdot \mathtt{ufp}(a) \mathbb{Z}, \end{array} $$

(4)

$$ \begin{array}{@{}rcl@{}} \mathtt{u}_{S} \mathbb{Z} & \ni & |x| \!\le\! \mathtt{realmax}, \ \exists k = 2^{w}, w \in \mathbb{Z}, \ \mathtt{u} k \mathbb{Z} \ni |x| \!\le\! k \Longrightarrow x \in \mathbb{F}, \end{array} $$

(5)

$$ \begin{array}{@{}rcl@{}} a+b & = & \mathtt{fl}(a+b) + \delta, \ |\delta| \le \mathtt{u} \cdot \mathtt{ufp}(a+b), \end{array} $$

(6)

$$ \begin{array}{@{}rcl@{}} |a| & \ge & |b|, \ c=\mathtt{fl}(a+b) \Longrightarrow \mathtt{fl}(c-a)=c-a, \end{array} $$

(7)

$$ \begin{array}{@{}rcl@{}} \mathtt{ufp}(a+b) & \le & \mathtt{ufp}(\mathtt{fl}(a+b)). \end{array} $$

(8)

The definition of the floating-point numbers by IEEE 754 immediately derives (4) and (5). The relation (6) and the proposition (7) are written in [8] and [3], respectively. In addition, the following lemmas are used in our proof.^{Footnote 1}

Lemma 1

For $\sigma , a \in \mathbb {F}$, if (σ + a) ≥(σ), then $\mathtt {fl}(\sigma +a) \in 2 \mathtt {u} \cdot \mathtt {ufp}(\sigma ) \mathbb {Z}$. In addition, $\mathtt {fl}((\sigma +a)-\sigma ) \in 2 \mathtt {u} \cdot \mathtt {ufp}(\sigma ) \mathbb {Z}$ holds true.

Proof

Using (4), (σ + a) ≤((σ + a)) from (8), and the assumption (σ + a) ≥(σ) in turn, we have

$$ \mathtt{fl}(\sigma + a) \in 2 \mathtt{u} \cdot \mathtt{ufp}(\mathtt{fl}(\sigma + a)) \mathbb{Z} \subseteq 2 \mathtt{u} \cdot \mathtt{ufp}(\sigma + a) \mathbb{Z} \subseteq 2 \mathtt{u} \cdot \mathtt{ufp}(\sigma) \mathbb{Z}. $$

(9)

Because (9) and $\sigma \in 2 \mathtt {u} \cdot \mathtt {ufp}(\sigma ) \mathbb {Z}$ from (4), $\mathtt {fl}((\sigma +a)-\sigma ) \in 2 \mathtt {u} \cdot \mathtt {ufp}(\sigma ) \mathbb {Z}$ is satisfied.

□

Applying Theorem 3.5 in [8] and an induction of binary tree described in [6], we can immediately obtain the following lemma.

Lemma 2

For $p \in \mathbb {F}^{n}$, assume that $p_{i} \in \mathtt {u} \sigma \mathbb {Z}$ for all i, where $\sigma \in \mathbb {F}$ is a power of two. If $\displaystyle \sum \limits _{i=1}^{n} |p_{i}| \le \sigma $, then no rounding error occurs in the sum by any orders.

We assume that divide and conquer methods by, for example, Strassen [9] and Winograd [11] are not applied for dot product, matrix-vector, and matrix-matrix multiplication.

3 General theory for error-free computations

For $Y \in \mathbb {F}^{m \times n}, S \in \mathbb {F}^{n \times p}, X \in \mathbb {F}^{p \times q}$, we obtain a matrix $S^{\prime } \in \mathbb {F}^{n \times p} (S^{\prime } \approx S)$ in order to compute $YS^{\prime }X$ without a rounding error. This technique is applied to generation of test matrices with exactly known eigenvalues.

We set several constants as follows: We define ϕ_ij for x_ij≠ 0 and ψ_ij for y_ij≠ 0 such that

$$ x_{ij} \in \phi_{ij} \mathbb{Z}, \quad x_{ij} \not \in 2 \phi_{ij} \mathbb{Z}, \quad y_{ij} \in \psi_{ij} \mathbb{Z}, \quad y_{ij} \not \in 2 \psi_{ij} \mathbb{Z}, $$

(10)

where all ϕ_ij and ψ_ij are powers of two. The constants ϕ_ij and ψ_ij imply the unit in the last non-zero bit of x_ij and y_ij, respectively. Exceptionally, we set ϕ_ij = 0 for x_ij = 0 and ψ_ij = 0 for y_ij = 0. Next, we set constants β,γ,𝜃,ω as follows:

$$ \begin{array}{@{}rcl@{}} \beta & = & \underset{\phi_{kj}\not=0}{\underset{i,j,k}{\max}} ~~\frac{\phi_{ij}}{\phi_{kj}}, \quad \gamma = \underset{\phi_{ij} \not = 0}{\underset{i,j}{\max}} ~~\frac{\mathtt{ufp}(x_{ij})}{\phi_{ij}}, \end{array} $$

(11)

$$ \begin{array}{@{}rcl@{}} \theta & = & \underset{\psi_{ik}\not=0}{\underset{i,j,k}{\max}}~~ \frac{\psi_{ij}}{\psi_{ik}}, \quad \omega =\underset{\psi_{ij} \not = 0}{\underset{i,j}{\max}}~~ \frac{\mathtt{ufp}(y_{ij})}{\psi_{ij}}. \end{array} $$

(12)

These imply

$$ \phi_{ij} \le \beta ~ \underset{\phi_{wj}\not=0}{\underset{w}{\min}}~~ \phi_{wj}, \quad \psi_{ij} \le \theta~ \underset{\psi_{it}\not=0}{\underset{t}{\min}}~ ~\psi_{it} $$

(13)

for all (i,j) pairs. Note that β,γ,𝜃,ω ≥ 1 and $\displaystyle \underset {k,j}{\max \limits } \phi _{kj}>0, \underset {i,k}{\max \limits } \psi _{ik}>0$, because both X and Y are non-singular matrices.

If s_ij = 0 for all (i,j) pairs, then it is trivial that the matrix A is the zero matrix. Hence, we assume $\displaystyle \max \limits _{i,j} |s_{ij}| > 0$. Let

n_Y: the maximum of the number of non-zero elements in row of Y,
n_S: the maximum of the number of non-zero elements in row of S,
n_X: the maximum of the number of non-zero elements in column of X.

We set $n^{\prime } = {\min \limits } (n_{S}, n_{X})$. We define a constant σ as

$$ \sigma = 12 \mathtt{ufp(\alpha)} \beta \gamma \theta \omega \in \mathbb{F}, $$

(14)

where

$$ \mathbb{F} \ni \alpha \ge n_{Y} \cdot n^{\prime} \max_{i,j} |s_{ij}|. $$

(15)

Then,

$$ \mathtt{ufp}(\sigma) = 8 \mathtt{ufp(\alpha)} \beta \gamma \theta \omega $$

(16)

is satisfied because all of β,γ,𝜃, and ω are powers of two. It is possible to set α as $\displaystyle \mathtt {fl}_{\bigtriangleup } (n \cdot n^{\prime } \cdot \max \limits _{i{,j}} |{s}_{ii}|)$ using floating-point arithmetic. A matrix $S^{\prime }$ is computed by

$$ s_{ij}^{\prime} = \mathtt{fl}((\sigma + s_{ij}) - \sigma). $$

(17)

We introduce several lemmas.

Lemma 3

For $S \in \mathbb {F}^{n \times n}$, σ in (14), β and γ in (11), and ψ and ω in (12),

$$ |s_{ij}| \le \frac{\mathtt{ufp}(\sigma)}{4n_{Y} \cdot n^{\prime} \cdot \beta \gamma \theta \omega}, $$

(18)

and

$$ |x_{ij}| \le 2 \gamma \phi_{ij} \le 2 \beta \gamma \underset{\begin{array}{cc}w \\ \phi_{wj}\not=0 \end{array}}{\min} \phi_{wj}, \quad |y_{ij}| \le 2 \omega \psi_{ij} \le 2 \theta \omega \underset{\begin{array}{cc}t \\ \psi_{it}\not=0 \end{array}}{\min} \psi_{it} $$

(19)

are satisfied for all (i,j) pairs.

Proof

From (16), the inequality (3), and α in (15), we have

$$ \mathtt{ufp}(\sigma) = 8 \mathtt{ufp}(\alpha) \beta \gamma \theta \omega > 4 \alpha \beta \gamma \theta \omega \ge 4n_{Y} \cdot n^{\prime} \cdot |{s}_{ij}| \beta \gamma \theta \omega. $$

Because $n_{Y}, n^{\prime }, \beta , \gamma , \theta , \omega \ge 1$, the inequality (18) can be obtained.

For x_ij = 0 and y_ij = 0, the inequalities (19) are trivially satisfied. Therefore, we assume x_ij≠ 0 (ϕ_ij≠ 0) and y_ij≠ 0 (ψ_ij≠ 0) for the rest of the proof. From (11), (12), and (3), we have

$$ \begin{array}{@{}rcl@{}} \gamma & = & \underset{i,j}{\max} \frac{\mathtt{ufp}(x_{ij})}{\phi_{ij}} \ge \frac{\mathtt{ufp}(x_{ij})}{\phi_{ij}} > \frac{|x_{ij}|}{2 \phi_{ij}}, \\ \omega & = & \underset{i,j}{\max} \frac{\mathtt{ufp}(y_{ij})}{\psi_{ij}} \ge \frac{\mathtt{ufp}(y_{ij})}{\psi_{ij}} > \frac{|y_{ij}|}{2 \psi_{ij}} \end{array} $$

for all (i,j) pairs. Then, using (13),

$$ |x_{ij}| < 2 \gamma \phi_{ij} \le 2 \beta \gamma~ \underset{\phi_{wj}\not=0}{\underset{w}{\min}} ~~\phi_{wj}, \quad |y_{ij}| < 2 \omega \psi_{ij} \le 2 \theta \omega ~ \underset{\psi_{it}\not=0}{\underset{t}{\min}}~~ \psi_{it} $$

are satisfied for all (i,j) pairs. Therefore, (19) is proved.

□

Lemma 4

For $S \in \mathbb {F}^{n \times n}$ and σ in (14),

$$ \mathtt{ufp}(\sigma + s_{ij}) = \mathtt{ufp}(\sigma) $$

is satisfied for all (i,j) pairs.

Proof

Using (14) and (16) gives

$$ \begin{array}{@{}rcl@{}} \sigma + s_{ij} & = & 8 \mathtt{ufp}(\alpha) \beta \gamma \theta \omega + 4 \mathtt{ufp}(\alpha) \beta \gamma \theta \omega + s_{ij} \\ & = & \mathtt{ufp}(\sigma) + 4\mathtt{ufp}(\alpha) \beta \gamma \theta \omega + s_{ij}. \end{array} $$

(20)

From β,γ,𝜃,ω ≥ 1, (3), and α≠ 0 from $\displaystyle \max \limits _{i,j} |s_{ij}| > 0$, we have

$$ 4 \mathtt{ufp}(\alpha) \beta \gamma \theta \omega + s_{ij} > \alpha + s_{ij} \ge 0. $$

(21)

Therefore, application of (20) and (21) yields

$$ \sigma + s_{ij} > \mathtt{ufp}(\sigma). $$

(22)

From (15) and (3), we have

$$ |s_{ij}| \le \alpha < 2 \mathtt{ufp}(\alpha) \beta \gamma \theta \omega $$

(23)

and using (22), (14), (23), and (16) gives

$$ \mathtt{ufp}(\sigma) < \sigma + s_{ij} \le \sigma + |s_{ij}| < 14 \mathtt{ufp}(\alpha) \beta \gamma \theta \omega < 2 \mathtt{ufp}(\sigma). $$

Therefore,

$$ \mathtt{ufp}(\sigma + s_{ij}) = \mathtt{ufp}(\sigma) $$

can be derived.

□

Lemma 5

Assume that $s^{\prime }_{ij}$ is obtained by (17). Then,

$$ |s_{ij}^{\prime}| \le \frac{\mathtt{ufp}(\sigma)}{4n_{Y} \cdot n^{\prime} \cdot \beta \gamma \theta \omega} + \mathtt{u} \cdot \mathtt{ufp}(\sigma) $$

is satisfied.

Proof

Using the definition of σ in (14) gives σ ≥|s_ij| for all (i,j) pairs. From this and (7), we derive

$$ s_{ij}^{\prime} = \mathtt{fl}((\sigma + s_{ij})-\sigma) = \mathtt{fl}(\sigma + s_{ij}) - \sigma = \sigma + s_{ij} + \delta - \sigma = s_{ij} + \delta, $$

where |δ|≤⋅(σ + s_ij) = ⋅(σ) from (6) and Lemma 4. Therefore, from Lemma 3

$$ |s_{ij}^{\prime}| = |s_{ij} + \delta| \le |s_{ij}| + |\delta| \le \frac{\mathtt{ufp}(\sigma)}{4n_{Y} \cdot n^{\prime} \cdot \beta \gamma \theta \omega} + \mathtt{u} \cdot \mathtt{ufp}(\sigma). $$

□

We show that rounding error never occurs in the evaluation of $Y{S}^{\prime }X$.

Theorem 1

Let $\sigma \in \mathbb {F}$ be defined as in (14) from the matrices Y,X and S. Assume that $4 n_{Y} \cdot n^{\prime } \cdot \mathtt {u} \beta \gamma \theta \omega \le 1$. If a matrix $S^{\prime }$ is obtained by (17), then

$$ \mathtt{fl}(Y(S^{\prime}X)) = YS^{\prime}X. $$

Proof

First, we show $\mathtt {fl}(S^{\prime }X)=S^{\prime }X$. If x_ij = 0, 1 ≤ i ≤ p for some j (the j-th column vector of X is the zero vector), then

$$ \sum\limits_{k=1}^{{p}} {s}^{\prime}_{ik} x_{kj} = \mathtt{fl} \left( \sum\limits_{k=1}^{{p}} {s}^{\prime}_{ik} x_{kj} \right) = 0. $$

Hence, we assume there exists x_ij≠ 0 for any j for the rest of the proof. Since $n_{Y}, n^{\prime }, \beta , \theta , \omega \ge 1$ and $4 n_{Y} \cdot n^{\prime } \cdot \mathtt {u} \beta \gamma \theta \omega \le 1$ from the assumption, we have

$$ \frac{1}{4n_{Y} \cdot \theta \omega} + \mathtt{u} n^{\prime} \beta \gamma < 1. $$

(24)

Using Lemma 1 and (10) gives

$$ s_{ik}^{\prime} \in 2 \mathtt{u} \cdot \mathtt{ufp}(\sigma) \mathbb{Z}, \quad x_{kj} \in \phi_{kj} \mathbb{Z} \subseteq \left( \underset{ \phi_{wj}\not=0}{\underset{w}{\min}}~~ \phi_{wj} \right) \mathbb{Z}, $$

(25)

and they yield

$$ s_{ik}^{\prime} x_{kj} \in 2\mathtt{u} \left( \underset{\phi_{wj}\not=0}{\underset{w}{\min}} ~~\phi_{wj} \right) \mathtt{ufp}(\sigma) \mathbb{Z}. $$

(26)

Next, we take an upper bound of $\displaystyle \left | \sum \limits _{k=1}^{{p}} s_{ik}^{\prime } x_{kj} \right |$ using Lemma 5, Lemma 3, and (24):

$$ \begin{array}{@{}rcl@{}} \left| \sum\limits_{k=1}^{{p}} s_{ik}^{\prime} x_{kj} \right| & \le & \sum\limits_{k=1}^{{p}} |s_{ik}^{\prime}| |x_{kj} | \\ & \le & n^{\prime} \left( \frac{\mathtt{ufp}(\sigma)}{4n_{Y} \cdot n^{\prime} \cdot \beta \gamma \theta \omega} + \mathtt{u} \cdot \mathtt{ufp}(\sigma)\right) 2 \beta \gamma \underset{\begin{array}{cc}w \\ \phi_{wj}\not=0 \end{array}}{\min} \phi_{wj} \\ & = & 2 \left( \frac{1}{4n_{Y} \theta \omega} + \mathtt{u} n^{\prime} \beta \gamma \right) \left( \underset{\phi_{wj}\not=0}{\underset{w}{\min}}~~ \phi_{wj} \right) \mathtt{ufp}(\sigma) \end{array} $$

(27)

$$ \begin{array}{@{}rcl@{}} & < & 2 \left( \underset{\phi_{wj}\not=0}{\underset{w}{\min}} \phi_{wj} \right) \mathtt{ufp}(\sigma). \end{array} $$

(28)

Therefore, from (26) and (28), Lemma 2 derives $\mathtt {fl}({S}^{\prime }X)={S}^{\prime }X$.

Next, we focus on $YS^{\prime }X$. The (i,j) element of $YS^{\prime }X$ is

$$ (YS^{\prime}X)_{ij} = \sum\limits_{\ell=1}^{n} y_{i\ell} \left( \sum\limits_{k=1}^{{p}} s_{\ell k}^{\prime} x_{k j} \right). $$

Using (26) and (10) gives

$$ s^{\prime}_{\ell k} x_{kj} \in 2 \mathtt{u} \cdot \mathtt{ufp}(\sigma) \left( \underset{\phi_{wj}\not=0}{\underset{w}{\min}} \phi_{wj} \right) \mathbb{Z}, \quad y_{i\ell} \in \psi_{i\ell} \mathbb{Z} \subseteq \underset{\psi_{it}\not=0}{\underset{t}{\min}} \psi_{it} \mathbb{Z}. $$

Therefore, we derive

$$ (YS^{\prime}X)_{ij} = \sum\limits_{\ell=1}^{n} y_{i\ell} \left( \sum\limits_{k=1}^{{p}} s_{\ell k}^{\prime} x_{k j} \right) \in 2 \mathtt{u} \left( \underset{\psi_{it}\not=0}{\underset{t}{\min}}~ \psi_{it} \right) \left( \underset{\phi_{wj}\not=0}{\underset{w}{\min}} \phi_{wj} \right) \mathtt{ufp}(\sigma) \mathbb{Z}. $$

(29)

Application of (27), Lemma 3, and $\displaystyle 1 + 4 n_{Y} \cdot n^{\prime } \cdot \mathtt {u} \beta \gamma \theta \omega \le 2$ obtained from the assumption yields

$$ \begin{array}{@{}rcl@{}} & & \left| \sum\limits_{\ell=1}^{n} y_{i\ell} \left( \sum\limits_{k=1}^{{p}} s_{\ell k}^{\prime} x_{k j} \right) \right| \le \sum\limits_{\ell=1}^{n} |y_{i\ell}| \left( \sum\limits_{k=1}^{{p}} |s_{\ell k}^{\prime}| \ |x_{k j} | \right) \\ & \le & \sum\limits_{\ell=1}^{n} 2|y_{i\ell}| \left( \frac{1}{4n_{Y} \theta \omega} + \mathtt{u} n^{\prime} \beta \gamma \right) \left( \underset{\phi_{wj}\not=0}{\underset{w}{\min}} ~\phi_{wj} \right) \mathtt{ufp}(\sigma) \\ & \le & 4n_{Y} \theta \omega \left( \underset{\psi_{it}\not=0}{\underset{t}{\min}} ~\psi_{it} \right) \left( \frac{1}{4n_{Y} \theta \omega} + \mathtt{u} n^{\prime} \beta \gamma \right) \left( \underset{\phi_{wj}\not=0}{\underset{w}{\min}} ~\phi_{wj} \right) \mathtt{ufp}(\sigma) \\ & \le & \left( 1 + 4 \mathtt{u} n_{Y} \cdot n^{\prime} \cdot \beta \gamma \theta \omega \right) \left( \underset{\psi_{it}\not=0}{\underset{t}{\min}}~ \psi_{it} \right) \left( \underset{\phi_{wj}\not=0}{\underset{w}{\min}}~ \phi_{wj} \right) \mathtt{ufp}(\sigma) \\ & \le & 2 \left( \underset{\psi_{it}\not=0}{\underset{t}{\min}}~ \psi_{it} \right) \left( \underset{ \phi_{wj}\not=0}{\underset{w}{\min}}~ \phi_{wj} \right) \mathtt{ufp}(\sigma). \end{array} $$

(30)

Hence, from (29) and (30), Lemma 2 verifies $\mathtt {fl}({Y} (S^{\prime } X)) = {Y} S^{\prime } X$.

□

If we change the definition of $n^{\prime }$, it is possible to verify $\mathtt {fl}(({Y} S^{\prime }) X) = {Y} S^{\prime } X$ similarly. Remark that the proposed method has reproducibility,^{Footnote 2} because there is no rounding error in any order of computations in $\mathtt {fl}({Y} (S^{\prime } X))$.

4 Test matrices based on specified matrices

In this section, we introduce how to generate a matrix with exactly known eigenvalues. We prepare $S, X, Y \in \mathbb {F}^{n \times n} \ (Y = X^{-1})$. Assume that exact eigenvalues can be known quickly from the matrix S. For example, S is a diagonal matrix, triangular matrices, and tridiagonal Toeplitz matrix. We use MATLAB R2020a on Windows 10 with Intel(R) Core(TM) i7-8665U in numerical examples in this section.

We introduce a basic algorithm for the generation of test matrices using MATLAB-like code on the basis of the discussion in Section 3. Greek alphabets cannot be used for variables on MATLAB. However, we exceptionally accept to use Greek alphabets in the description of algorithms for consistency with the theories in this paper.

In the algorithm, (⋅) at line 11 is not a built-in function in MATLAB. The calculation of (a) for $a \in \mathbb {F}$ involves only three floating-point operations (see Algorithm 3.6 in [7]). Therefore, the cost for the ufp function is negligible. The function _ at line 6 in Algorithm 1 produces β and γ according to (11). The function _ similarly gives 𝜃 and ω based on (12).

To set multiple eigenvalues can be possible, because if s_ij = s_kℓ, then z_ij = z_kℓ. We can also generate a rank deficient matrix with exactly known eigenvalues. However, even if s_ij≠s_kℓ, z_ij = z_kℓ may be satisfied. It happens for relatively small |s_ij| compared to $\displaystyle \max \limits _{i,j} |s_{ij}|$. If σ ≥|s_ij|, then z_ij = 0. In the worst case, the matrix Z becomes the zero matrix, and an obtained matrix A is useless. In addition, a test matrix with the specified singular values can be generated using Y = X^T.

Note that we can generate a real matrix with complex eigenvalues. Let i be the imaginary unit. The basic of linear algebra derives that a ± bi are eigenvalues of the following skew-symmetric matrix:

$$ \left( \begin{array}{cc} a & b \\ -b & a \end{array} \right) = \frac{1}{2} \left( \begin{array}{rr} i & -i \\ 1 & 1 \end{array} \right) \left( \begin{array}{rr} a-bi & 0 \\ 0 & a+bi \end{array} \right) \left( \begin{array}{rr} -i & 1 \\ i & 1 \end{array} \right), $$

where

$$ \frac{1}{2} \left( \begin{array}{rr} i & -i \\ 1 & 1 \end{array} \right) \left( \begin{array}{rr} -i & 1 \\ i & 1 \end{array} \right) = \left( \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right). $$

For a discussion of complex eigenvalues, we set 2-by-2 matrices to the diagonal block of D. For example, if $a \pm b i, \ c \pm d i, e \ (a,b,c,d,e \in \mathbb {F})$ are expected eigenvalues, then we set a skew-symmetric matrix as

$$ S=\left( \begin{array}{rrrrr} a & b & 0 & 0 & 0\\ -b & a & 0 & 0 & 0\\ 0 & 0 & c & d & 0\\ 0 & 0 & -d & c & 0\\ 0 & 0 & 0 & 0 & e \end{array} \right). $$

If one wants to obtain a matrix with exactly known eigenvectors, then we obtain the matrix as follows.

$$ \frac{1}{2} Y \left( \begin{array}{rrrrr} -i & 1 & 0 & 0 & 0\\ i & 1 & 0 & 0 & 0\\ 0 & 0 & -i & 1 & 0\\ 0 & 0 & i & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{array} \right) $$

Note that

$$ \mathtt{fl}((\sigma + b) - \sigma) \not = -\mathtt{fl}((\sigma - b) - \sigma) $$

in some cases due to a problem of rounding error. Then, $S^{\prime }$ is not a skew-symmetric matrix. To solve the problem, we compute only

$$ s_{ij}^{\prime} = \mathtt{fl}((\sigma + s_{ij}) - \sigma), \quad i \le j $$

and set $s_{ji}^{\prime }=-s_{ij}^{\prime }$ by copy of the computed result. For

$$ [A,Z]=\mathtt{alg\_basic}(Y,S,X), $$

Z is not a skew-symmetric matrix, while S is a skew-symmetric matrix.

The condition $X, Y \in \mathbb {F}^{n \times n} \ (Y = X^{-1})$ is very special. Therefore, we use specified matrices for X in (1) from the following sections.

4.1 Using weighing matrix

Assume that $D \in \mathbb {F}^{n \times n}$ is a diagonal matrix ($n^{\prime }=1$). First, we use a weighing matrix. Let I be the identity matrix. If $W \in \mathbb {R}^{n \times n}$ with weight $c \in \mathbb {N}$ is a weighing matrix, W^TW = cI and w_ij ∈{− 1,0,1} for all (i,j) pairs. The weighing matrix is useful for the candidate of the matrix X, because β = γ = 𝜃 = ω = 1 in (11) and (12), and $\displaystyle Y=X^{-1}=\frac {1}{c}W^{T}$.

Let $W_{n,c} \in \mathbb {F}^{n \times n}$ be a weighing matrix such that $W^{T}_{n,c} W_{n,c} = c I$. The notation ⊗ means the Kronecker product. For $\displaystyle n=\prod \limits _{i=1}^{w} n_{i}$ and $\displaystyle c = \prod \limits _{i=1}^{w} c_{i}$, $n_{i}, \ c_{i} \in \mathbb {N}$, a weighing matrix W_n,c can be obtained by $W_{n_{1},c_{1}} \otimes W_{n_{2},c_{2}} \otimes {\dots } \otimes W_{n_{w},c_{w}}$.

If we set $Y = \displaystyle \frac {1}{c} W_{n,c}^{T} = \frac {1}{c_{1}} W_{n_{1},c_{1}}^{T} \otimes {\dots } \otimes \frac {1}{c_{w}} W_{n_{w},c_{w}}^{T}$, then the matrix Y may not be representable in floating-point numbers. It happens when c is not a power of two. For a given diagonal matrix D, we obtain σ from $\displaystyle \mathtt {fl} \left (\frac {1}{c}D \right )$ instead of D and compute

$$ d_{ii}^{\prime} = \mathtt{fl}((\sigma + d_{ii}/c) - \sigma). $$

Then, we can compute $\displaystyle W_{n,c}^{T} (D^{\prime } W_{n,c})=:A$ without rounding errors. Hence, $c d_{ii}^{\prime }$ is the exact eigenvalue of the matrix A. Note that $c d_{ii}^{\prime }$ may not be representable in the floating-point numbers. An error-free transformation [3] produces floating-point vectors p and q such that

$$ c d^{\prime} = p + q, \quad p=\mathtt{fl}(c d^{\prime}), \quad p,q \in \mathbb{F}^{n}. $$

It is well known that we quickly obtain p and q using fused multiply-add (FMA instruction defined in IEEE 754). The function (a,b,c) for $a,b,c \in \mathbb {F}$ provides a nearest floating-point number of ab + c. Then,

$$ c d^{\prime} = p + q, \quad p=\mathtt{fl}(c d^{\prime}), \quad q=\mathtt{fma}(c,d^{\prime},-p). $$

Hadamard matrix, the special case of weighing matrix (c = n), is very useful for the candidate of X.

The following codes generate a test matrix on MATLAB. Let $d \in \mathbb {F}^{n}$ be a vector with expected eigenvalues. The exact eigenvalue of $A \in \mathbb {F}^{n \times n}$ is p_i + q_i ≈ d_i. Note that the function hadamard on MATLAB R2020a in the following algorithm works when n, n/12 or n/20 is a power of two.^{Footnote 3}

Note that Algorithm 2 produces a real symmetric matrix. For $n \in \mathbb {F}$ and $y \in \mathbb {F}^{n}$, the function at line 12 transforms ny to $p+q, \ p,q \in \mathbb {F}^{n}$. If n is a power of two, q at line 12 is the zero vector.

At line 11, a matrix multiplication is computed. In the special cases, we can reduce the cost of the matrix multiplication. Let H_n be a Hadamard matrix with dimension n. We can generate one of 2n-by-2n Hadamard matrices by the following manner [10]

$$ H_{2n} = \left( \begin{array}{cc} H_{n} & H_{n} \\ H_{n} & -H_{n} \end{array} \right) = H_{n} \otimes \left( \begin{array}{cc} 1 & 1 \\ 1 & -1 \end{array} \right). $$

For a diagonal matrix $D \in \mathbb {F}^{2n \times 2n}$, there exist matrices $M, N \in \mathbb {R}^{n \times n}$ such that

$$ D= \left( \begin{array}{cc} D_{1} & O \\ O & D_{2} \end{array} \right) , \quad D H_{2n}= \left( \begin{array}{cc} M & M \\ N & -N \end{array} \right) , \quad M=D_{1} H_{n}, \quad N=D_{2} H_{n} $$

where D₁ and D₂ are n-by-n diagonal matrices. Then, the matrices M and N can quickly be obtained, because D₁ and D₂ are diagonal matrices. Finally, we compute $H_{2n}^{T} D H_{2n}$ by

$$ H_{2n}^{T} D H_{2n}= \left( \begin{array}{cc} {H_{n}^{T}} & {H_{n}^{T}} \\ {H_{n}^{T}} & -{H_{n}^{T}} \end{array} \right) \left( \begin{array}{cc} M & M \\ N & -N \end{array} \right) = \left( \begin{array}{cc} {H_{n}^{T}} M + {H_{n}^{T}} N & {H_{n}^{T}} M - {H_{n}^{T}} N \\ {H_{n}^{T}} M - {H_{n}^{T}} N & {H_{n}^{T}} M + {H_{n}^{T}} N \end{array} \right). $$

Therefore, we only compute ${H_{n}^{T}} M = {H_{n}^{T}} D_{1} {H_{n}^{T}}$ and ${H_{n}^{T}} N = {H_{n}^{T}} D_{2} {H_{n}^{T}}$ as matrix multiplications to obtain $H^{T}_{2n} D H_{2n}$. The cost of matrix multiplications can be reduced using this technique recursively.

Let r ⋅ n be a matrix size for the output, where $r, n \in \mathbb {N}$ and r is a power of two.

$$ C = \left( \begin{array}{cccc} C_{11} & C_{12} & {\dots} & C_{1r} \\ C_{21} & C_{22} & {\dots} & C_{2r} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ C_{r1} & C_{r2} & {\dots} & C_{rr} \end{array} \right) = H_{rn}^{T} D H_{rn}, \quad D = \left( \begin{array}{cccc} D_{1} & \textbf{O} & \textbf{O} & \textbf{O} \\ \textbf{O} & D_{2} & \textbf{O} & \textbf{O} \\ \textbf{O} & \textbf{O} & {\ddots} & {\vdots} \\ \textbf{O} & \textbf{O} & {\dots} & D_{r} \end{array} \right) $$

where $D_{i} \in \mathbb {F}^{n \times n}$, 1 ≤ i ≤ r are diagonal matrices, and O is the zero matrix.

Then, we only compute the following r matrix multiplications:

$$ {H_{n}^{T}} D_{1} H_{n}, \quad {H_{n}^{T}} D_{2} H_{n}, \quad {\dots} \quad, \quad {H_{n}^{T}} D_{k} H_{n}, $$

and C_ij is obtained by

$$ C_{ij} = \sum\limits_{k=1}^{r} s_{ki} s_{kj} {H_{n}^{T}} D_{k} H_{n}, \quad S=H_{r}. $$

The total cost becomes $2 k n^{3} + \mathcal {O}(n^{2}+k)$ floating-point operations for r-by-r block division. Note that the original cost without block computations is $2 (k n)^{3} + \mathcal {O}((kn)^{2})$, that is much expensive compared to block computations for large k.

We introduce an algorithm producing a matrix with eigenvalues based on the block computations.

We show numerical examples for the performance using this technique. Figures 1 and 2 show computing times with r-by-r block computations. r = 1 means a computing time for Algorithm 2. Computing times for solving linear systems () and eigenvalue problems () using MATLAB are shown for comparison. Note that in this example produces only approximate eigenvalues such that d = (A). The figures indicate that computing time for the generation of test matrices is much shorter than that for solving linear systems and that for eigenvalue problems.

Remark 1

We mentioned that MATLAB’s hadamard function works in the case that n, n/12, or n/20 is the power of two. Even if n does not satisfy that restriction, the test matrices can be generated. Assume that H_k exists and k < n. Let $P \in \mathbb {R}^{(n-k) \times (n-k)}$ be a permutation matrix. We generate

$$ \left( \begin{array}{cc} H_{n} & O \\ O & P \end{array} \right) $$

instead of the Hadamard matrix. Then, $D^{\prime }$ is obtained by

$$ d_{ii}^{\prime} = \left\{ \begin{array}{rrrrr} \mathtt{fl}((\sigma + d_{ii}/k) - \sigma), \quad i \le k \\ \mathtt{fl}((\sigma + d_{ii}) - \sigma), \quad i > k. \end{array} \right. $$

The exact eigenvalue is $k d_{ii}^{\prime }$ for 1 ≤ i ≤ k and $d^{\prime }_{ii}$ for k + 1 ≤ i ≤ n.

We checked the accuracy of approximate eigenvalues obtained by . The vector $d \in \mathbb {F}^{n}, \ n=4096$ in Algorithm 2 is set as follows.

Ex.1::: d₁ = 1, d_n = 10¹⁰, intermediate values d_i, 2 ≤ i ≤ n − 1, are geometrically distributed.
Ex.2::: d_i = 1, d_n = 10¹⁰, 1 ≤ i ≤ n − 1.

We compare accuracy of eigenvalues by d = (A) and [V,D] = (A) in MATLAB. Let $\hat \lambda _{1} \le \hat \lambda _{2} \le {\dots } \le \hat \lambda _{n}$ be computed eigenvalues. Figures 3 and 4 show relative errors

$$ \frac{|p_{i} + q_{i}- \hat \lambda_{i}|}{|p_{i} + q_{i}|} $$

(31)

for all i, where p and q are obtained by Algorithm 2. From Fig. 3 for Ex.1, the accuracy of relatively small eigenvalue tends to be worse. Besides, approximate eigenvalues from [V,D] = (A) are better than those from d = (A) in terms of accuracy. In the case of Ex.2, we can see from Fig. 4 that some of the eigenvalues are not so accurate, and several computed eigenvalues are exact.

Using the Hadamard matrix, we introduce an algorithm generating a test matrix based on the Jordan normal form ($n^{\prime }=2$).

We show numerical examples using Algorithm 4. The vectors $v \in \mathbb {F}^{n}$ and $w \in \mathbb {F}^{n-1}$ for n = 4096 as arguments of Algorithm 4 are set as follows:

Ex.3::: $v_{1}=1, \ v_{i} = 10^{5}, \ 2 \le i \le n, \ w=(1,1,\dots ,1)^{T}$.
Ex.4::: $v_{i}=1, \ 1 \le i \le n-1, \ v_{n} = 10^{5}, \ w=(1,1,\dots ,1)^{T}$.

Then, we execute [A,p,q] = _(v,w). Figures 5 and 6 show the quantity (31) for all i, respectively. Approximate eigenvalues for these numerical examples have small imaginary parts so that we checked the accuracy of the real part of the computed eigenvalues. In both cases, the matrices have n − 1 multiple eigenvalues, but there are differences in the numerical examples. From Fig. 6, computed eigenvalues for n − 1 small eigenvalues are very inaccurate.

4.2 Eigenvector matrix X using random integer matrices

We use the LU decomposition and generate integer matrices^{Footnote 4} for X. Let L and U be a unit lower triangular matrix and a unit upper triangular matrix, respectively. We generate X = LU, where we set L and U as 0-1 matrices in order to obtain small constants β, γ, 𝜃, and ω. Then, Y = X^− 1 is also an integer matrix. Because L and U are 0-1 matrices, there is no rounding error in (LU) up to n = ^− 1. On the other hand, the problem is that the exact Y = X^− 1 can be obtained or not. Let $\tilde Y$ be a numerically obtained result for X^− 1. If $\textbf {O} = \mathtt {fl}_{\bigtriangledown } (\tilde Y X - I) = \mathtt {fl}_{\bigtriangleup } (\tilde Y X - I)$, where I is the identity matrix, then $\tilde Y = X^{-1}$. Using MATLAB notation, L and U are given as

$$ \begin{array}{@{}rcl@{}} L & = & \mathtt{spones}(\mathtt{tril}(\mathtt{sprandn}(n,n,k))); \quad L(1:n+1:n*n)=1; \end{array} $$

(32)

$$ \begin{array}{@{}rcl@{}} U & = & \mathtt{spones}(\mathtt{triu}(\mathtt{sprandn}(n,n,k))); \quad U(1:n+1:n*n)=1; \end{array} $$

(33)

where 0 ≤ k ≤ 1 is density of a matrix. The sparsity of L and U is controlled by $k \in \mathbb {N}$.

The following algorithm generates A using random integer matrices.

We introduce numerical examples for Algorithm 5. We set n = 1,000,5,000, and $d \in \mathbb {F}^{n}$ is generated by (n,1) using MATLAB R2020a. Table 1 shows

The product of four constants βγ𝜃ω
The 2-norm condition number of X, i.e., ∥X^− 1∥₂∥X∥₂ = ∥Y ∥₂∥X∥₂
The relative change between d_i and r_i, i.e., $\displaystyle \max \limits _{i} \frac {|d_{i}-r_{i}|}{|d_{i}|}$
The density of the matrix A, i.e., the number of non-zero elements of A divided by n²

for each k. The numbers in Tables 1 and 2 are averages of 10 examples. If k is small, we can obtain small βγ𝜃ω. It makes the relative change for v small. If k becomes large, then

The relative change for a small absolute value,
The condition number of X, and
Density of A

also become large.

Table 1 Comparison of properties for k for n = 1,000

Full size table

Table 2 Comparison of properties for k for n = 5,000

Full size table

5 Conclusion

We proposed methods generating matrices with specified eigenvalues based on the error-free transformation of floating-point numbers. They are useful for checking the accuracy and stability of numerical algorithms. Finally, we provided several possibilities for an eigenvector matrix and showed how to generate a real matrix with complex eigenvalues. The discussion can be extended to the generation of test matrices with complex eigenvalues and eigenvectors.

Notes

Thanks to Dr. Atsushi Minamihata for the discussion.
Reproducibility means ability to obtain always bit-wise identical results on the same input data on the same or different architectures [1].
This is not only for MATLAB R2020a, but also for more early versions.
We appreciate Prof. Siegfried M. Rump’s advice for this discussion.

References

Ahrens, P., Nguyen, H.D., Demmel, J.: Efficient Reproducible Floating Point Summation and BLAS, EECS Department, UC Berkeley, Technical Report No. UCB/EECS2016-121 Berkeley (2016)
Bauer, F.L., Fike, C.T.: Norms and exclusion theorems. Numer. Math. 2, 137–141 (1960)
Article MathSciNet Google Scholar
Dekker, T.J.: A floating-point technique for extending the available precision. Numer. Math. 18, 224–242 (1971)
Article MathSciNet Google Scholar
Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)
Book Google Scholar
IEEE Standard for Floating-Point Arithmetic: Std 754-2019 (2019)
Jeannerod, C. -P., Rump, S.M.: Improved error bounds for inner products in floating-point arithmetic. SIAM J. Matrix Anal. & Appl. (SIMAX) 34, 338–344 (2013)
Article MathSciNet Google Scholar
Rump, S.M.: Error estimation of floating-point summation and dot product. BIT Numer. Math. 52, 201–220 (2012)
Article MathSciNet Google Scholar
Rump, S.M., Ogita, T., Oishi, S.: Accurate floating-point summation part I: faithful rounding. SIAM J. Sci. Comput. 31, 189–224 (2008)
Article MathSciNet Google Scholar
Strassen, V.: Gaussian elimination is not optimal. Numer. Math. 13, 354–356 (1969)
Article MathSciNet Google Scholar
Sylvester, J.J.: Thoughts on inverse orthogonal matrices, simultaneous sign successions, and tessellated pavements in two or more colours, with applications to Newton’s rule, ornamental tile-work, and the theory of numbers. Philos. Mag. 34, 461–475 (1867)
Article Google Scholar
Winograd, S.: On multiplication of 2x2 matrices. Linear Algebra Its Appl. 4, 381–388 (1971)
Article MathSciNet Google Scholar

Download references

Acknowledgements

This research is partially supported by JST / CREST and JSPS Kakenhi (20H04195). We sincerely express our thanks to the reviewer for the fruitful comments and Dr. Takeshi Terao for his kind advice of simplified codes of MATLAB.

Author information

Authors and Affiliations

307 Fukasaku, Minuma-ku, Saitama-shi, Saitama, 337-8570, Japan
Katsuhisa Ozaki
2-6-1 Zempukuji, Suginami-ku, Tokyo, 167-8585, Japan
Takeshi Ogita

Authors

Katsuhisa Ozaki
View author publications
You can also search for this author in PubMed Google Scholar
Takeshi Ogita
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Katsuhisa Ozaki.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Ozaki, K., Ogita, T. Generation of test matrices with specified eigenvalues using floating-point arithmetic. Numer Algor 90, 241–262 (2022). https://doi.org/10.1007/s11075-021-01186-7

Download citation

Received: 28 February 2021
Accepted: 06 August 2021
Published: 17 September 2021
Issue Date: May 2022
DOI: https://doi.org/10.1007/s11075-021-01186-7

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Generation of test matrices with specified eigenvalues using floating-point arithmetic

Abstract

Similar content being viewed by others

Numerical methods for accurate computation of the eigenvalues of Hermitian matrices and the singular values of general matrices

Solving Polynomial Systems Using Numeric Gröbner Bases

Explicit Solutions of the Singular Yang–Baxter-like Matrix Equation and Their Numerical Computation

1 Introduction

2 Basics of floating-point arithmetic

Lemma 1

Proof

Lemma 2

3 General theory for error-free computations

Lemma 3

Proof

Lemma 4

Proof

Lemma 5

Proof

Theorem 1

Proof

4 Test matrices based on specified matrices

4.1 Using weighing matrix

Remark 1

4.2 Eigenvector matrix X using random integer matrices

5 Conclusion

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Generation of test matrices with specified eigenvalues using floating-point arithmetic

Abstract

Similar content being viewed by others

Numerical methods for accurate computation of the eigenvalues of Hermitian matrices and the singular values of general matrices

Solving Polynomial Systems Using Numeric Gröbner Bases

Explicit Solutions of the Singular Yang–Baxter-like Matrix Equation and Their Numerical Computation

1 Introduction

2 Basics of floating-point arithmetic

Lemma 1

Proof

Lemma 2

3 General theory for error-free computations

Lemma 3

Proof

Lemma 4

Proof

Lemma 5

Proof

Theorem 1

Proof

4 Test matrices based on specified matrices

4.1 Using weighing matrix

Remark 1

4.2 Eigenvector matrix X using random integer matrices

5 Conclusion

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation