1 Introduction

It is well known that Hermitian matrices, skew-Hermitian matrices, and unitary matrices are unitarily diagonalizable. More generally, this is true for normal matrices, i.e., for matrices \(A\in \mathbb {C}^{n,n}\) satisfying \(AA^{H}=A^{H}A\) - a class of matrices that generalizes the classes of Hermitian, skew-Hermitian, and unitary matrices. The fact that unitary matrices can be used as transformation matrices for diagonalizing normal matrices is important because of two fundamental properties: firstly, similarity transformations with unitary matrices preserve the given Hermitian, skew-Hermitian, unitary, or normal structure of the matrix - a fact that is crucial for the development of structure-preserving algorithms for the solution of the given eigenvalue problem - and secondly, unitary matrices are optimally conditioned. Therefore, the normal eigenvalue problem can be considered a well-behaved problem in the sense that a diagonalization can be performed by backward stable and structure-preserving algorithms.

The picture changes drastically if one considers matrices that carry a symmetry structure with respect to an indefinite inner product, i.e., with respect to a nondegenerate Hermitian form that is not necessarily positive definite, or with respect to a nondegenerate skew-Hermitian form. An important example is the class of Hamiltonian matrices, i.e., matrices \(A\in \mathbb {R}^{2n,2n}\) satisfying \(A^{T}J+JA=0\), where J denotes the skew-symmetric matrix

$$ J=\left[\begin{array}{cc} 0&I_{n}\\ -I_{n}&0 \end{array}\right]. $$
(1)

The identity \(A^{T}J=-JA\) can be interpreted as skew-symmetry of the matrix A with respect to the skew-symmetric inner product induced by J. The corresponding Hamiltonian eigenvalue problem arises in many applications, e.g., in system theory and the theory of algebraic Riccati equations, see [1, 5, 11] and the references therein. For practical reasons, one often switches to the complex version of the problem, which leads to the consideration of matrices \(A\in \mathbb {C}^{2n,2n}\) satisfying \(A^{H}J+JA=0\). Typically, these matrices are also called Hamiltonian in the Numerical Linear Algebra community and we will follow this convention in this paper. It should be noted, though, that other communities prefer the terminology J-skew-adjoint for such matrices (e.g., see [2]) in order to avoid confusion with complex matrices \(A\in \mathbb {C}^{2n,2n}\) satisfying \(A^{T}J+JA=0\), which are called Hamiltonian as well.
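For illustration, in the smallest case n = 1 the complex Hamiltonian matrices are exactly the matrices of the form

$$ A=\left[\begin{array}{cc} a&b\\ c&-\overline{a} \end{array}\right],\quad a\in\mathbb{C},\ b,c\in\mathbb{R}, $$

since \(A^{H}J+JA=0\) is equivalent to JA being Hermitian.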

For the solution of the Hamiltonian eigenvalue problem, the so-called Hamiltonian Schur form was suggested in [12] as a target form. This is a Hamiltonian matrix of the block form

$$ \left[\begin{array}{cc} R&B\\ 0&-R^{H} \end{array}\right], $$
(2)

where \(R\in \mathbb {C}^{n,n}\) is upper triangular. It is straightforward to see that this is just a permutation of the classical upper triangular Schur form of a complex matrix and, as a consequence, the eigenvalues can be read off from the diagonal. It was shown in [12] that this form can always be achieved under a unitary symplectic similarity transformation provided that the given Hamiltonian matrix does not have eigenvalues on the imaginary axis. Recall that a matrix \(S\in \mathbb {C}^{2n,2n}\) is called symplectic (following the convention in the Numerical Linear Algebra community) if \(S^{H}JS=J\). It is easily checked that similarity transformations with symplectic matrices preserve the Hamiltonian structure and are therefore the basis of structure-preserving algorithms for the solution of the Hamiltonian eigenvalue problem. However, since the condition number of symplectic matrices may be arbitrarily large, it is favorable to further restrict oneself to the class of unitary symplectic similarity transformations in order to guarantee numerical stability.
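The two facts just mentioned are easy to check numerically. The following NumPy sketch (the random construction via the matrix exponential is ours and only one of several standard ways to produce symplectic matrices) verifies that \(A=J^{-1}W\) with Hermitian W is Hamiltonian, that \(S=e^{A}\) is symplectic, and that the symplectic similarity \(S^{-1}AS\) is again Hamiltonian:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 3
Z = np.zeros((n, n))
J = np.block([[Z, np.eye(n)], [-np.eye(n), Z]])

# A complex Hamiltonian matrix: A = J^{-1} W with W Hermitian gives A^H J + J A = 0.
W = rng.standard_normal((2 * n, 2 * n)) + 1j * rng.standard_normal((2 * n, 2 * n))
W = (W + W.conj().T) / 2
A = np.linalg.solve(J, W) / 4        # scaled so that expm stays well conditioned
assert np.allclose(A.conj().T @ J + J @ A, 0)

# The exponential of a Hamiltonian matrix is symplectic: S^H J S = J ...
S = expm(A)
assert np.allclose(S.conj().T @ J @ S, J)

# ... and a symplectic similarity preserves the Hamiltonian structure.
B = np.linalg.solve(S, A @ S)        # B = S^{-1} A S
assert np.allclose(B.conj().T @ J + J @ B, 0)
```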

Surprisingly, there are Hamiltonian matrices that cannot be transformed to Hamiltonian Schur form via a unitary symplectic similarity transformation. As an obvious example, consider the matrix J, which is both Hamiltonian and symplectic (and even unitary). Clearly, if \(U\in \mathbb {C}^{2n,2n}\) is unitary and symplectic then \(U^{-1}JU=U^{H}JU=J\), so J is invariant under any unitary symplectic similarity transformation and hence cannot be transformed to Hamiltonian Schur form. It is clear from the form in (2) that a necessary condition for the existence of a Hamiltonian Schur form is that the purely imaginary eigenvalues of the given matrix have even multiplicity. Indeed, if λ is an eigenvalue of R then \(-\overline {\lambda }\) is an eigenvalue of \(-R^{H}\), so any purely imaginary eigenvalue of R will also be one of \(-R^{H}\). The example of J however shows that this condition is not sufficient. The long open problem of characterizing all Hamiltonian matrices that can be transformed to Hamiltonian Schur form was finally solved in [6] with the help of a newly developed structured canonical form for Hamiltonian matrices.

At first sight, one may come to the conclusion that the nonexistence of the Hamiltonian Schur form is related to the fact that in contrast to the Euclidean inner product the two fundamental properties of structure-preservation and numerical stability are now partitioned among two sets of transformation matrices instead of only one. Thus, to enable both features, one has to restrict oneself to the set of unitary symplectic matrices which is a much “smaller” subset (in terms of dimension as a manifold) than the sets of symplectic matrices or unitary matrices. This conclusion, however, turns out to be wrong as it is well-known that the Hamiltonian Schur form exists under unitary symplectic similarity transformations if and only if it exists under similarity transformations that are symplectic only. This equivalence can easily be shown by applying a structure-preserving QR decomposition to the transformation matrix, see, e.g., [6] for details. Thus, both in the case of normal matrices with respect to the Euclidean inner product and in the case of Hamiltonian matrices, the actual problem is the computation of a Schur-like form by similarity transformations from the group of matrices that are unitary with respect to the considered inner product and therefore preserve the given symmetry structure of the matrix they are acting on.

We will show in this paper that the diagonal Schur form of normal matrices and the Hamiltonian Schur form of Hamiltonian matrices are two extreme cases of a much more general Schur-like form for matrices carrying a symmetry structure with respect to an indefinite inner product. To treat the problem in full generality, we will consider a generalization of normality of matrices in an indefinite inner product space, the so-called polynomial H-normality which will be introduced in Section 2 where we will also review the basic theory of indefinite inner products. In Section 3 we will formulate the main result of this paper and develop a Schur-like form for polynomially H-normal matrices for the case of a Hermitian and unitary Gram matrix H. Then we will discuss how the Schur form of normal matrices and the Hamiltonian Schur form can be deduced as special cases of this result. The proof of the main result will then be given in Section 4 followed by a short summary in Section 5.

Notation

By \(\mathcal {J}_{n}(\lambda )\) we denote the n × n upper triangular Jordan block associated with the eigenvalue λ. The reverse identity of size n is denoted by \(R_{n}\), i.e.,

$$ R_{n}=\left[\begin{array}{ccc} 0&&1\\ &\iddots&\\ 1&&0 \end{array}\right]\in\mathbb{C}^{n,n}, $$

the matrix with ones on the antidiagonal and zeros elsewhere.

If \(H\in \mathbb {C}^{n,n}\) is a Hermitian matrix, then its inertia index is denoted by (π,ν,ζ), where π, ν, and ζ are the numbers of positive, negative and zero eigenvalues, respectively, each counted with algebraic multiplicities.

2 Indefinite Inner Products and Polynomially H-normal Matrices

Let \(H\in \mathbb {C}^{n,n}\) be Hermitian and invertible. Then H defines an indefinite inner product on \(\mathbb {C}^{n}\) via

$$ {[x,y]}:=x^{H}Hy,\quad x,y\in\mathbb{C}^{n}. $$

As in the case of a positive definite inner product, we will call H the Gram matrix of the indefinite inner product. If \(A\in \mathbb {C}^{n,n}\), then the matrix \(A^{[\ast]}:=H^{-1}A^{H}H\) is called the H-adjoint of A, because it is the unique matrix satisfying the identity \([Ax,y]=[x,A^{[\ast]}y]\) for all \(x,y\in \mathbb {C}^{n}\). These notions are illustrated by a short numerical sketch after the following list. The matrix A is called

  • H-selfadjoint, if \(A^{[\ast]}=A\), or, equivalently, \(A^{H}H=HA\),

  • H-skew-adjoint, if \(A^{[\ast]}=-A\), or, equivalently, \(A^{H}H+HA=0\),

  • and H-unitary, if \(A^{[\ast]}=A^{-1}\), or, equivalently, \(A^{H}HA=H\).
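The following minimal NumPy sketch (the matrices are hypothetical toy examples of ours) computes the H-adjoint and verifies two of the structure classes above:

```python
import numpy as np

def h_adjoint(A, H):
    """H-adjoint A^[*] = H^{-1} A^H H of A."""
    return np.linalg.solve(H, A.conj().T @ H)

H = np.diag([1.0, -1.0])                   # Hermitian, invertible Gram matrix

A = np.array([[1.0, 2.0], [-2.0, 3.0]])    # H-selfadjoint: A^H H = H A
assert np.allclose(h_adjoint(A, H), A)

B = np.array([[0.0, 1.0], [1.0, 0.0]])     # H-skew-adjoint: B^H H + H B = 0
assert np.allclose(h_adjoint(B, H), -B)
```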

In the following, we will restrict ourselves to Hermitian inner products, because any skew-Hermitian inner product can easily be transformed to a Hermitian inner product by multiplying the corresponding Gram matrix with the imaginary unit i. In particular, a matrix \(A\in \mathbb {C}^{2n,2n}\) is Hamiltonian if and only if A is (iJ)-skew-adjoint, where J is the matrix as in (1).

Canonical forms for H-selfadjoint, H-skew-adjoint, and H-unitary matrices are well known, see, e.g., [2, 8]. More generally, one can define the set of H-normal matrices as the set of all matrices \(A\in \mathbb {C}^{n,n}\) satisfying \(A^{[\ast]}A=AA^{[\ast]}\). Unfortunately, this set turns out to be "too big", because it was shown in [3] that the problem of classifying H-normal matrices is a wild problem and hence canonical forms cannot be obtained. Therefore, it was suggested in [10] to consider the set of polynomially H-normal matrices instead. A matrix \(A\in \mathbb {C}^{n,n}\) is called polynomially H-normal if there exists a polynomial p in one variable such that \(A^{[\ast]}=p(A)\). The polynomial p is then called the H-normality polynomial of A. It is easily checked that any polynomially H-normal matrix is H-normal, but the converse is not true, see [10]. Still, the set of polynomially H-normal matrices is large enough to contain the sets of H-selfadjoint, H-skew-adjoint, and H-unitary matrices. Indeed, if the matrix A is H-selfadjoint or H-skew-adjoint, then it is polynomially H-normal with H-normality polynomial p(t) = t or p(t) = −t, respectively, and since \(A^{[\ast]}=A^{-1}\) for an H-unitary matrix A and the inverse of a matrix is always a polynomial in that matrix, any H-unitary matrix is polynomially H-normal as well.
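For instance, by the Cayley-Hamilton theorem any invertible \(A\in\mathbb{C}^{2,2}\) satisfies \(A^{-1}=\frac{1}{\det A}\left((\operatorname{tr} A)I_{2}-A\right)\), so a 2 × 2 H-unitary matrix is polynomially H-normal with the explicit H-normality polynomial

$$ p(t)=\frac{\operatorname{tr} A-t}{\det A}. $$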

The major advantage of polynomially H-normal matrices over H-normal matrices is the fact that a complete classification is available under the following equivalence relation.

Remark 1

Let \(H\in \mathbb {C}^{n,n}\) be Hermitian and invertible and let \(A\in \mathbb {C}^{n,n}\) be polynomially H-normal with H-normality polynomial p. If \(T\in \mathbb {C}^{n,n}\) is invertible, then \(T^{-1}AT\) is polynomially \(T^{H}HT\)-normal with H-normality polynomial p. In particular, the relation

$$ (A_{1},H_{1}) \sim (A_{2},H_{2}):~\Leftrightarrow~\exists T\in GL_{n}(\mathbb{C}):A_{2}=T^{-1}A_{1}T\wedge H_{2}=T^{H}H_{1}T $$
(3)

is an equivalence relation on the set of pairs (A,H), where H is Hermitian and invertible and A is polynomially H-normal with H-normality polynomial p.

The following canonical form for polynomially H-normal matrices was developed in [8, Theorem 6.1]. Here, \(T(\alpha _{1},\dots ,\alpha _{n})\) denotes an upper triangular n × n Toeplitz matrix that has \(\left [\begin {array}{ccc} \alpha _{1}&\dots &\alpha _{n} \end {array}\right ]\) as its first row.
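For example, in the case n = 3 we have

$$ T(\alpha_{1},\alpha_{2},\alpha_{3})=\left[\begin{array}{ccc} \alpha_{1}&\alpha_{2}&\alpha_{3}\\ 0&\alpha_{1}&\alpha_{2}\\ 0&0&\alpha_{1} \end{array}\right]. $$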

Theorem 1 (Canonical form for polynomially H-normal matrices)

Let the matrix \(A\in \mathbb {C}^{n,n}\) be polynomially H-normal with H-normality polynomial p. Then there exists a nonsingular matrix T such that

$$ T^{-1}AT=A_{1}\oplus\cdots\oplus A_{q},\quad T^{H} HT=H_{1}\oplus\cdots\oplus H_{q}, $$
(4)

where Aj is Hj-indecomposable and where Aj and Hj have one of the following forms:

  i)

    blocks associated with eigenvalues \(\lambda _{j}\in \mathbb {C}\) satisfying \(p(\lambda _{j})=\overline {\lambda _{j}}\):

    $$ A_{j}=\lambda_{j} I_{n_{j}} + e^{i\theta_{j}}T(0,1,ir_{j,2},\dots,ir_{j,n_{j}-1}),\quad H_{j}=\sigma_{j} R_{n_{j}}, $$
    (5)

    where \(n_{j}\in \mathbb {N}\), \(\sigma_{j}\in\{1,-1\}\), \(\theta_{j}\in[0,\pi)\), and \(r_{j,2},\dots ,r_{j,n_{j}-1}\in \mathbb {R}\);

  ii)

    blocks associated with a pair (λj,μj) of eigenvalues with \(\mu _{j}=\overline {p(\lambda _{j})} \neq \lambda _{j}\), \(\overline {p(\mu _{j})} = \lambda _{j}\), and Re(λj) > Re(μj) or Im(λj) > Im(μj) if Re(λj) = Re(μj):

    $$ A_{j}=\left[ \begin{array}{cc} \mathcal{J}_{m_{j}}(\lambda_{j})&0\\ 0& p\left( \mathcal{J}_{m_{j}}(\lambda_{j})\right)^{H} \end{array}\right],\quad H_{j}=\left[ \begin{array}{cc} 0 & I_{m_{j}}\\ I_{m_{j}} & 0 \end{array}\right], $$
    (6)

    where \(m_{j}\in \mathbb {N}\).

Moreover, the form (4) is unique up to the permutation of blocks, and the parameters \(\theta_{j}\) and \(r_{j,2},\dots ,r_{j,n_{j}-1}\) in (5) are uniquely determined by \(\lambda_{j}\) and the coefficients of p; they can be computed from the identity

$$ \overline{\lambda_{j}}I_{n_{j}} + e^{-i\theta_{j}}T(0,1,-ir_{j,2},\dots,-ir_{j,n_{j}-1}) = p\!\left( \lambda_{j} I_{n_{j}} + e^{i\theta_{j}} T(0,1,ir_{j,2},\dots,ir_{j,n_{j}-1})\right). $$

(We highlight that the eigenvalues λj in i) are not necessarily pairwise distinct, i.e., the same eigenvalue may occur in different blocks. The same is true for the eigenvalues λj,μj in ii).)

Besides the eigenvalues and their partial multiplicities, the signs σj = ± 1 attached to each Jordan block corresponding to an eigenvalue λj satisfying \(\overline {\lambda }_{j}=p(\lambda _{j})\) are additional invariants under the equivalence relation (3). The list of all signs associated with a fixed eigenvalue λj is referred to as the sign characteristic of the eigenvalue λj extending the terminology in [2] used for H-selfadjoint and H-unitary matrices.

The following values related to the sign characteristic of a fixed eigenvalue will play a crucial role in the characterization of when a structured Schur-like form exists.

Definition 1

Let \(H\in \mathbb {C}^{n,n}\) be Hermitian and invertible, let \(A\in \mathbb {C}^{n,n}\) be polynomially H-normal with H-normality polynomial p, and let \(\lambda \in \mathbb {C}\) be an eigenvalue of A that satisfies \(\overline {\lambda }=p(\lambda )\). Then the sum of all signs σj from the sign characteristic of λ that are attached to blocks of odd size is called the sign sum of λ and is denoted by signsum(λ).

To illustrate Definition 1 consider the matrices

$$ \begin{array}{@{}rcl@{}} A&=&\mathcal{J}_{5}(0)\oplus\mathcal{J}_{4}(0)\oplus\mathcal{J}_{3}(0)\oplus\mathcal{J}_{3}(0)\oplus\mathcal{J}_{3}(0)\oplus\mathcal{J}_{2}(0)\oplus\mathcal{J}_{1}(0),\\ H&=&\sigma_{1}R_{5}\oplus\sigma_{2}R_{4}\oplus\sigma_{3}R_{3}\oplus\sigma_{4}R_{3}\oplus\sigma_{5}R_{3}\oplus\sigma_{6}R_{2}\oplus\sigma_{7}R_{1} \end{array} $$

with σi ∈{1,− 1} for \(i=1,\dots ,7\). Then A is polynomially H-normal with H-normality polynomial p(t) = t (in fact, A is H-selfadjoint), and \(\overline {0}=p(0)\). The sign sum of the eigenvalue λ = 0 is then given by

$$ \operatorname{signsum}(0)=\sigma_{1}+\sigma_{3}+\sigma_{4}+\sigma_{5}+\sigma_{7}. $$

Note that in accordance with Definition 1 the values σ2 and σ6 do not contribute to the sign sum as they are attached to blocks of the even sizes 4 and 2, respectively.

The sign sum has an important impact on the inertia index of the given Hermitian matrix defining the indefinite inner product, as we will show in the following lemma.

Lemma 1

Let \(H\in \mathbb {C}^{n,n}\) be Hermitian with inertia index (π,ν,0) and let \(A\in \mathbb {C}^{n,n}\) be polynomially H-normal with H-normality polynomial p. If \(\lambda _{1},\dots ,\lambda _{r}\in \mathbb {C}\) are the pairwise distinct eigenvalues of A satisfying \(\overline {\lambda _{j}}=p(\lambda _{j})\), then

$$ \pi-\nu={\sum}_{j=1}^{r}\operatorname{signsum}(\lambda_{j}). $$

Proof

The proof immediately follows by inspection from the canonical form given in Theorem 1. Indeed, one easily checks that the matrices Hj from blocks of type ii) and blocks of type i) corresponding to an even size nj contribute equally to the positive and negative eigenvalues of H. On the other hand, the matrix \(H_{j}=\sigma _{j}R_{n_{j}}\) of a block as in type i) that corresponds to an odd size nj = 2k + 1 has inertia index (k + 1,k,0) if σj = 1, and inertia index (k,k + 1,0) if σj = − 1. □
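For instance, for a single block of type i) with \(n_{j}=3\) and \(\sigma_{j}=1\), i.e., \(H_{j}=R_{3}\), the counting in the proof can be made concrete:

$$ R_{3}(e_{1}+e_{3})=e_{1}+e_{3},\quad R_{3}e_{2}=e_{2},\quad R_{3}(e_{1}-e_{3})=-(e_{1}-e_{3}), $$

so \(R_{3}\) has inertia index \((2,1,0)=(k+1,k,0)\) with \(k=1\).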

Finally, we recall the concept of neutral subspaces in indefinite inner product spaces. If \(H\in \mathbb {C}^{n,n}\) is Hermitian and invertible, then a subspace \(\mathcal {V} \subseteq \mathbb {C}^{n}\) is called H-neutral if [v,w] = 0 (or, equivalently, \(v^{H}Hw=0\)) for all \(v,w\in \mathcal {V}\). It is well known that if (π,ν,0) is the inertia index of H, then the maximal possible dimension of an H-neutral subspace is equal to \(k=\frac {n-|\pi -\nu |}{2}=\min \{\pi ,\nu \}\). An H-neutral subspace of this maximal dimension k is called a maximal H-neutral subspace.
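For example, for \(H=\operatorname{diag}(1,1,-1)\) we have \((\pi,\nu,\zeta)=(2,1,0)\) and hence \(k=1\); the subspace \(\operatorname{span}\{e_{1}+e_{3}\}\) is a maximal H-neutral subspace, since \((e_{1}+e_{3})^{H}H(e_{1}+e_{3})=1-1=0\).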

3 Schur-like Forms and Invariant Maximal H-neutral Subspaces

In this section, we will develop the main result of this paper: the introduction of a structured Schur-like form for polynomially H-normal matrices, combined with a characterization of its existence. As pointed out in the introduction, it was shown in [6] that a Hamiltonian matrix can be transformed to Hamiltonian Schur form via a unitary symplectic similarity transformation if and only if the same can be done via a similarity transformation that is only symplectic. An analysis of the corresponding proof in [6] reveals that the unitarity of the matrix J in (1) is crucial for this equivalence. Therefore, we will assume throughout this section that the Gram matrix of the given indefinite inner product is not only Hermitian, but also unitary. Many important examples of Gram matrices such as

$$ R_{n},\quad \left[\begin{array}{cc} 0&I_{n}\\ I_{n}&0 \end{array}\right], \quad \left[ \begin{array}{cc} I_{p}&0\\ 0&-I_{q} \end{array}\right] $$

satisfy this extra condition. (We mention in passing that properties of inner products that are either Hermitian or unitary are discussed in [7].)

Theorem 2

Let \(H\in \mathbb {C}^{n,n}\) be unitary and Hermitian with inertia index (π,ν,0) and let \(A\in \mathbb {C}^{n,n}\) be polynomially H-normal with H-normality polynomial p. Furthermore, let \(m:=|\pi-\nu|\). Then \(n-m\) is even and the following statements are equivalent, where \(k:=\frac {n-m}{2}\).

  1)

    There exists an H-neutral subspace of dimension k that is A-invariant.

  2)

    There exists a unitary matrix \(U\in \mathbb {C}^{n,n}\) such that

    $$ U^{-1}AU=\left[ \begin{array}{ccc} B_{11}&B_{12}&B_{13}\\ 0&p(B_{11})^{H}&0\\ 0&B_{32}&B_{33} \end{array}\right] \quad \text{and}\quad U^{H}HU=\left[ \begin{array}{ccc} 0&I_{k}&0\\ I_{k}&0&0\\ 0&0&sI_{m} \end{array}\right], $$
    (7)

    where \(s=\frac {\pi -\nu }{|\pi -\nu |}\) (if m≠ 0 and thus \(\pi-\nu\neq 0\); otherwise let s := 1), \(B_{11}\in \mathbb {C}^{k,k}\) is upper triangular, and \(B_{33}\in \mathbb {C}^{m,m}\) is diagonal.

  3)

    There exists an invertible matrix \(S\in \mathbb {C}^{n,n}\) such that

    $$ S^{-1}AS=\left[ \begin{array}{ccc} B_{11}&B_{12}&B_{13}\\ 0&p(B_{11})^{H}&0\\ 0&B_{32}&B_{33} \end{array}\right] \quad\text{and}\quad S^{H}HS=\left[ \begin{array}{ccc} 0&I_{k}&0\\ I_{k}&0&0\\ 0&0&sI_{m} \end{array}\right], $$
    (8)

    where \(s=\frac {\pi -\nu }{|\pi -\nu |}\) (if m≠ 0 and thus \(\pi-\nu\neq 0\); otherwise let s := 1), \(B_{11}\in \mathbb {C}^{k,k}\) is upper triangular, and \(B_{33}\in \mathbb {C}^{m,m}\) is diagonal.

  4)

    Let \(\lambda _{1},\dots ,\lambda _{r}\in \mathbb {C}\) be the pairwise distinct eigenvalues of A satisfying the equation \(\overline {\lambda }=p(\lambda )\). Then s ⋅ signsum(λi) ≥ 0 for \(i=1,\dots ,r\) and

    $$ {\sum}_{i=1}^{r}\operatorname{signsum}(\lambda_{i})=sm. $$

Proof

The proof of Theorem 2 is rather long and will therefore be presented in a separate section. □

As mentioned in the introduction, Theorem 2 combines and generalizes two important results from the literature that we will restate below as corollaries. The first result recovers the well-known unitary diagonalizability of (\(I_{n}\)-)normal matrices.

Corollary 1 (Schur-form of normal matrices)

Let \(A\in \mathbb {C}^{n,n}\) be a normal matrix, i.e., \(A^{H}A=AA^{H}\). Then A is unitarily diagonalizable.

Proof

By [4], normality with respect to the Euclidean inner product is equivalent to polynomial \(I_{n}\)-normality and hence Theorem 2 can be applied with \(H=I_{n}\). Since \(I_{n}\) has the inertia index (n,0,0), we find that k = 0. Thus, condition 2) of Theorem 2 states the existence of a unitary matrix \(U\in \mathbb {C}^{n,n}\) such that \(U^{-1}AU=B_{33}\) is diagonal. □

Corollary 2 (Hamiltonian Schur-form of Hamiltonian matrices)

Let \(A\in \mathbb {C}^{2n,2n}\) be a Hamiltonian matrix, i.e., \(A^{H}J+JA=0\). Then the following statements are equivalent:

  1)

    There exists an n-dimensional subspace of \(\mathbb {C}^{2n}\) that is J-neutral and A-invariant.

  2)

    There exists a unitary symplectic matrix \(Q\in \mathbb {C}^{2n,2n}\) such that \(Q^{-1}AQ\) is in Hamiltonian Schur form, i.e.,

    $$ Q^{-1}AQ=\left[ \begin{array}{cc} B&C\\ 0&-B^{H} \end{array}\right], $$

    where \(B\in \mathbb {C}^{n,n}\) is upper triangular.

  3)

    There exists a symplectic matrix \(S\in \mathbb {C}^{2n,2n}\) such that \(S^{-1}AS\) is in Hamiltonian Schur form, i.e.,

    $$ S^{-1}AS=\left[ \begin{array}{cc} B&C\\ 0&-B^{H} \end{array}\right], $$

    where \(B\in \mathbb {C}^{n,n}\) is upper triangular.

  4)

    For any purely imaginary eigenvalue λ of A, the number of odd partial multiplicities corresponding to λ with sign + 1 is equal to the number of odd partial multiplicities corresponding to λ with sign − 1.

Proof

First, we recall that A is Hamiltonian if and only if A is H-skew-adjoint with H = iJ. In particular, A is polynomially H-normal with H-normality polynomial p(t) = −t. We will frequently make use of this fact in the following.

“1) ⇒ 2)”: It is trivial that the J-neutral subspace in 1) is also iJ-neutral. Moreover, the inertia index of iJ is (n,n,0) and thus, Theorem 2 implies the existence of a unitary matrix \(U\in \mathbb {C}^{2n,2n}\) such that

$$ U^{-1}AU=\left[ \begin{array}{cc} B&C\\ 0&-B^{H} \end{array}\right]\quad\text{and}\quad U^{H}(iJ)U=\left[ \begin{array}{cc} 0&I_{n}\\ I_{n}&0 \end{array}\right]. $$

Setting \(Q:=U\cdot \operatorname{diag}(I_{n},iI_{n})\), we obtain that

$$ Q^{-1}AQ=\left[\begin{array}{cc} B&iC\\ 0&-B^{H} \end{array}\right] \quad\text{and}\quad Q^{H}JQ=\left[ \begin{array}{cc} 0&I_{n}\\ -I_{n}&0 \end{array}\right]=J, $$

i.e., Q is unitary and symplectic. This implies 2).
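For completeness, the second identity can be verified directly: from \(U^{H}(iJ)U\) as above we get \(U^{H}JU=-iU^{H}(iJ)U\), and therefore

$$ Q^{H}JQ=-i\left[\begin{array}{cc} I_{n}&0\\ 0&-iI_{n} \end{array}\right] \left[\begin{array}{cc} 0&I_{n}\\ I_{n}&0 \end{array}\right] \left[\begin{array}{cc} I_{n}&0\\ 0&iI_{n} \end{array}\right] =\left[\begin{array}{cc} 0&I_{n}\\ -I_{n}&0 \end{array}\right]=J. $$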

“2) ⇒ 3)” is trivial, and “3) ⇒ 4)” and “4) ⇒ 1)” follow immediately from Theorem 2, taking into account that the eigenvalues satisfying \(\overline {\lambda }=p(\lambda )=-\lambda \) are exactly the purely imaginary eigenvalues of A, and that signsum(λ) = 0 is equivalent to the statement that the number of odd partial multiplicities corresponding to λ with sign + 1 is equal to the number of odd partial multiplicities corresponding to λ with sign − 1. □

The equivalence of 2), 3) and 4) was proved in [6], while the equivalence of 1) and 4) was proved in [13]. Clearly, the equivalence of 1) and 2) - or 1) and 3) - follows immediately from those two results, and it has since been implicitly known to many researchers dealing with Hamiltonian matrices. Nevertheless, it seems that a theorem combining all four equivalent conditions into a single result was explicitly formulated only as late as [9].

Remark 2

The two results in Corollaries 1 and 2 represent the two extreme cases k = 0 and m = 0 in Theorem 2. Whenever \(k,m\neq 0\), as is the case for Gram matrices of the form \(\operatorname{diag}(I_{p},-I_{q})\) with \(p\neq q\), the corresponding Schur-like form will have triangular and diagonal parts on the block diagonal as indicated in (7).

We highlight that the transformation in Theorem 2 changes the inner product, but keeps the symmetry structure of the matrix \(U^{-1}AU\) linked to the transformed Gram matrix \(U^{H}HU\) in the sense of Remark 1. For the practical use of Theorem 2, we advise first transforming the pair (A,H) to a form \((A^{\prime },H^{\prime })\), where \(H^{\prime }\) already has the form

$$ H^{\prime}=\left[ \begin{array}{ccc} 0&I_{k}&0\\ I_{k}&0&0\\ 0&0&sI_{m} \end{array}\right] $$

with s ∈{± 1}. When Theorem 2 is then applied to the pair \((A^{\prime },H^{\prime })\), it yields the existence of a unitary matrix U such that \(U^{-1}A^{\prime }U\) is in the Schur-like form (7) while \(U^{H}H^{\prime }U=H^{\prime }\). The latter condition just means that the matrix U is not only unitary, but also \(H^{\prime }\)-unitary.
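One such preprocessing congruence can be constructed explicitly. The following NumPy sketch (the function and its name are ours, not from the references) exploits that a Hermitian unitary H has only the eigenvalues ±1 and pairs positive with negative eigenvectors to form the hyperbolic pairs that make up the off-diagonal identity blocks of \(H^{\prime}\); the returned T is unitary as well:

```python
import numpy as np

def canonical_gram_transform(H, tol=1e-8):
    """For Hermitian unitary H, return a unitary T with
    T^H H T = [[0, I_k, 0], [I_k, 0, 0], [0, 0, s*I_m]]."""
    w, V = np.linalg.eigh(H)                 # eigenvalues of H are +1 and -1
    pos, neg = V[:, w > tol], V[:, w < -tol]
    p, nu = pos.shape[1], neg.shape[1]
    k, s = min(p, nu), (1 if p >= nu else -1)
    r2 = 1.0 / np.sqrt(2.0)
    # Hyperbolic pairs: x^H H x = y^H H y = 0 and x^H H y = 1.
    X = r2 * (pos[:, :k] + neg[:, :k])
    Y = r2 * (pos[:, :k] - neg[:, :k])
    rest = pos[:, k:] if p >= nu else neg[:, k:]
    return np.hstack([X, Y, rest])

# Example: H = iJ for n = 2 (inertia (2, 2, 0), so k = 2 and m = 0).
J = np.block([[np.zeros((2, 2)), np.eye(2)], [-np.eye(2), np.zeros((2, 2))]])
T = canonical_gram_transform(1j * J)
target = np.block([[np.zeros((2, 2)), np.eye(2)], [np.eye(2), np.zeros((2, 2))]])
assert np.allclose(T.conj().T @ (1j * J) @ T, target)
```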

4 Proof of the Main Result

Before we prove Theorem 2, we start with a technical lemma that will be used frequently in the following.

Lemma 2

Let \(A_{1},H_{1}\in \mathbb {C}^{n_{1},n_{1}}\) and \(A_{2},H_{2}\in \mathbb {C}^{n_{2},n_{2}}\), where H1 and H2 are Hermitian and invertible, and let

$$ A=\left[ \begin{array}{cc} A_{1}&0\\ 0&A_{2} \end{array}\right],\quad H = \left[ \begin{array}{cc} H_{1}&0\\ 0&H_{2} \end{array}\right]. $$

If A1 has an invariant H1-neutral subspace of dimension k1 and A2 has an invariant H2-neutral subspace of dimension k2, then A has an invariant H-neutral subspace of dimension k1 + k2.

Proof

Let the vectors \(v_{1},\dots ,v_{k_{1}}\in \mathbb {C}^{n_{1}}\) and \(w_{1},\dots ,w_{k_{2}}\in \mathbb {C}^{n_{2}}\) form bases of the Ai-invariant Hi-neutral subspaces for i = 1,2, respectively. Then it is straightforward to verify that the vectors

$$ \left[ \begin{array}{c} v_{1}\\ 0 \end{array}\right],\dots,\left[ \begin{array}{c} v_{k_{1}}\\ 0 \end{array}\right],\left[ \begin{array}{c} 0\\ w_{1} \end{array}\right],\dots,\left[ \begin{array}{c} 0\\ w_{k_{2}} \end{array}\right]\in\mathbb{C}^{n_{1}+n_{2}} $$

form a basis of an A-invariant H-neutral subspace which obviously has dimension k1 + k2. □

Proof of Theorem 2

“1) ⇒ 2)”: By switching to an orthonormal basis whose first k vectors span the A-invariant H-neutral subspace, we can assume that A and H have the forms

$$ A=\left[ \begin{array}{cc} A_{11}&A_{12}\\ 0&A_{22} \end{array}\right]\quad\text{and}\quad H=\left[ \begin{array}{cc} 0&H_{12}\\ H_{12}^{H}&H_{22} \end{array}\right], $$

where \(A_{22},H_{22}\in \mathbb {C}^{n-k,n-k}\). Since H is unitary, the rows of \(H_{12}\in \mathbb {C}^{k,n-k}\) are orthonormal and consequently its singular value decomposition takes the form

$$ H_{12}=U_{2}\left[ \begin{array}{cc} I_{k} & 0 \end{array}\right] V_{2}^{H}, $$

where \(U_{2}\in \mathbb {C}^{k,k}\) and \(V_{2}\in \mathbb {C}^{n-k,n-k}\) are unitary. Setting Q := diag(U2,V2) we obtain

$$ Q^{-1}AQ=\left[ \begin{array}{ccc} B_{11}&B_{12}&B_{13}\\ 0&B_{22}&B_{23}\\ 0&B_{32}&B_{33} \end{array}\right] \quad\text{and}\quad Q^{H}HQ=\left[ \begin{array}{ccc} 0&I_{k}&0\\ I_{k}&0&0\\ 0&0&H_{33} \end{array}\right], $$

where B11,B22 are k × k. The zeros in the block positions (2,2), (2,3), and (3,2) are due to the fact that \(Q^{H}HQ\) is still unitary and consequently has orthonormal columns. On the other hand, \(Q^{H}HQ\) is also Hermitian, and keeping in mind that its inertia index is (π,ν,0) and that \(k=\min \{\pi ,\nu \}\), the block structure implies that H33 is positive or negative definite, depending on π − ν being positive or negative, respectively. (If π = ν, then n = 2k and the (3,3)-blocks in \(Q^{-1}AQ\) and \(Q^{H}HQ\) are void.) Since H33 is also unitary, it follows that \(H_{33}=I_{m}\) or \(H_{33}=-I_{m}\). Applying a unitary transformation of the form diag(U1,U1,U3) with \(U_{1}\in \mathbb {C}^{k,k}\), \(U_{3}\in \mathbb {C}^{m,m}\) if necessary, we can assume without loss of generality that B11 and B33 are upper triangular. (Note that this transformation will not change \(Q^{H}HQ\).) Finally, we exploit the fact that A is polynomially H-normal with H-normality polynomial p. This implies

$$ \left[\begin{array}{ccc} B_{22}^{H}&B_{12}^{H}&B_{32}^{H}\\ 0&B_{11}^{H}&0\\ B_{23}^{H}&B_{13}^{H}&B_{33}^{H} \end{array}\right] = Q^{H}H^{-1}A^{H}HQ=p(Q^{-1}AQ)=\left[ \begin{array}{cc} p(B_{11})&\ast\\ 0&p\left( \left[ \begin{array}{cc} B_{22}&B_{23}\\ B_{32}&B_{33} \end{array}\right]\right) \end{array}\right] $$

and hence \(B_{22}=p(B_{11})^{H}\) and B23 = 0. But then we find that \(B_{33}^{H}=p(B_{33})\), which implies that B33 is normal (i.e., with respect to the Euclidean inner product) and hence diagonal, as upper triangular normal matrices are diagonal.
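The singular value claim used in this step - that a matrix with orthonormal rows has all singular values equal to 1 - is easy to confirm numerically; in the following sketch all sizes are hypothetical:

```python
import numpy as np

# A matrix with orthonormal rows has all singular values equal to 1,
# so its full SVD has the form H12 = U2 [I_k 0] V2^H.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2)))
H12 = Q.conj().T                     # 2 x 5 with orthonormal rows
U2, svals, V2h = np.linalg.svd(H12)
assert np.allclose(svals, 1.0)
```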

“2) ⇒ 3)”: This is trivial.

“3) ⇒ 4)”: The proof proceeds by induction on the number r of eigenvalues satisfying \(\overline {\lambda }=p(\lambda )\). If r = 0, then the block B33 in (8) is void, because it is diagonal and satisfies \(B_{33}^{H}=p(B_{33})\), so all its eigenvalues satisfy \(\overline {\lambda }=p(\lambda )\). Thus, we have m = 0 and 4) is satisfied by the definition of the empty sum.

Now assume that r > 0. Let λ := λ1, i.e., we have \(\overline {\lambda }=p(\lambda )\). Starting from 3) we can assume without loss of generality that A and H are in the form (8), where the eigenvalues on the diagonal of B11 and B33 are ordered in such a way that all occurrences of λ come first, i.e., A has the form

$$ A=\left[\begin{array}{cccccc} A_{11}&A_{12}&A_{13}&A_{14}&A_{15}&A_{16}\\ 0&A_{22}&A_{23}&A_{24}&A_{25}&A_{26}\\ 0&0&p(A_{11})^{H}&0&0&0\\ 0&0&A_{43}&p(A_{22})^{H}&0&0\\ 0&0&A_{53}&A_{54}&A_{55}&0\\ 0&0&A_{63}&A_{64}&0&A_{66} \end{array}\right], $$

where the diagonal blocks have the sizes \(k_{1},k_{2},k_{1},k_{2},m_{1},m_{2}\) with \(k_{1}+k_{2}=k\), \(m_{1}+m_{2}=m\), and \(k_{1},m_{1}\geq 0\), and where \(\sigma (A_{11}),\sigma (p({A}_{11})^{H}),\sigma (A_{55})\subseteq \{\lambda \}\) while λ is not contained in σ(A22), \(\sigma (p(A_{22})^{H})\), or σ(A66). If k1 > 0, then letting \(X\in \mathbb {C}^{k_{1},k_{2}}\) be the unique solution of the Sylvester equation \(XA_{22}-A_{11}X=A_{12}\) (which exists, because the spectra of A11 and A22 are disjoint) and defining \(\widehat {S}\) as the matrix that differs from the block-partitioned identity \(\mathcal {I}:=I_{k_{1}}\oplus I_{k_{2}}\oplus I_{k_{1}}\oplus I_{k_{2}}\oplus I_{m_{1}}\oplus I_{m_{2}}\) by adding X in the (1,2)-block position, we obtain that a similarity transformation on A with \(\widehat {S}\) annihilates the block entry A12 of A. The corresponding congruence transformation creates the entry X in the (3,2)-block position and the entry \(X^{H}\) in the (2,3)-block position of H. We can restore H by a congruence transformation with the matrix T which differs from the block-partitioned identity \(\mathcal {I}\) by having \(-X^{H}\) in the (4,3)-block position. The corresponding similarity applied to \(\widehat {S}^{-1}A \widehat {S}\) will change the block entries A13, A23, A43, A53, A63, but not the already established zero in the (1,2)-block position. For ease of notation, let us rename \(T^{-1}\widehat {S}^{-1}A\widehat {S}T\) by A and \(T^{H}\widehat {S}^{H}H\widehat {S}T\) by H. (Similar renaming steps will occur after each of the following steps without further notice.) We illustrate the A12-elimination step in the following diagram, where nonzero block entries that were affected by the current transformation are marked as bullets. In each substep, the pair (i,j) denotes the block entry of the transformation matrix that differs from the one in the identity matrix. Observe that the similarity transformation with such a matrix adds the i-th block column to the j-th block column, but the j-th block row to the i-th block row, while in the corresponding congruence transformation the i-th block column and i-th block row are added to the j-th block column or j-th block row, respectively:

$$ A:~\left[\begin{array}{cccccc} \ast&\ast&\ast&\ast&\ast&\ast\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&\ast&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(1,2)} \left[\begin{array}{cccccc} \ast&0&\bullet&\bullet&\bullet&\bullet\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&\ast&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(4,3)} \left[\begin{array}{cccccc} \ast&0&\bullet&\ast&\ast&\ast\\ 0&\ast&\bullet&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\bullet&\ast&0&0\\ 0&0&\bullet&\ast&\ast&0\\ 0&0&\bullet&\ast&0&\ast \end{array}\right], $$
$$ H:~\left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right] \underset{\leadsto}{(1,2)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&\bullet&\ast&0&0\\ \ast&\bullet&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right] \underset{\leadsto}{(4,3)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right]. $$
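Numerically, such an elimination step can be mirrored with SciPy's Sylvester solver; the following sketch (block sizes and entries are hypothetical) solves \(XA_{22}-A_{11}X=A_{12}\) and verifies that the similarity annihilates the (1,2) block:

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(1)
A11 = np.diag([1.0, 1.0])            # spectrum {1}
A22 = np.diag([2.0, 3.0])            # spectrum disjoint from that of A11
A12 = rng.standard_normal((2, 2))

# X A22 - A11 X = A12 is (-A11) X + X A22 = A12 in SciPy's AX + XB = Q convention.
X = solve_sylvester(-A11, A22, A12)

# Similarity with S = [[I, X], [0, I]] annihilates the (1,2) block of A.
S = np.block([[np.eye(2), X], [np.zeros((2, 2)), np.eye(2)]])
A = np.block([[A11, A12], [np.zeros((2, 2)), A22]])
B = np.linalg.solve(S, A @ S)        # B = S^{-1} A S
assert np.allclose(B[:2, 2:], 0)
```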

In the next step, we eliminate A16 by applying a similarity transformation on A that is obtained from \(\mathcal {I}\) by changing the (1,6)-block to the solution X of the Sylvester equation \(XA_{66}-A_{11}X=A_{16}\). The corresponding congruence transformation on H introduces the matrix X in the (3,6)-block position and \(X^{H}\) in the (6,3)-block position. We can restore H as follows: first, we apply a congruence with a matrix that differs from \(\mathcal {I}\) by \(-sX^{H}\) in the (6,3)-block position. This will annihilate the (3,6)- and (6,3)-block entries, but introduce the block \(-sXX^{H}\) in the (3,3)-block entry. The corresponding similarity transformation on A only affects the block entries A23 and A63. Then, we eliminate the (3,3)-block entry of H by a congruence transformation with the matrix that coincides with \(\mathcal {I}\) except for having \(\frac {s}{2}XX^{H}\) as its (1,3)-block. The corresponding similarity transformation on A only changes the block A13. As before, we illustrate this elimination step with the help of a diagram:

$$ A:~\left[\begin{array}{cccccc} \ast&0&\ast&\ast&\ast&\ast\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&\ast&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(1,6)} \left[\begin{array}{cccccc} \ast&0&\bullet&\bullet&\ast&0\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&\ast&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(6,3)} \left[\begin{array}{cccccc} \ast&0&\ast&\ast&\ast&0\\ 0&\ast&\bullet&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&\ast&\ast&0\\ 0&0&\bullet&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(1,3)} \left[\begin{array}{cccccc} \ast&0&\bullet&\ast&\ast&0\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&\ast&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right], $$
$$ H:\left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right] \underset{\leadsto}{(1,6)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&\bullet\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&\bullet&0&0&\ast \end{array}\right] \underset{\leadsto}{(6,3)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&\bullet&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right] \underset{\leadsto}{(1,3)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right]. $$

We continue by eliminating A54 with a similarity transformation that differs from \(\mathcal {I}\) by the solution of the Sylvester equation \(Xp(A_{22})^{H}-A_{55}X=A_{54}\) in the (5,4)-block position, followed by transformations that restore H. Since this step is analogous to the A12-elimination step, we restrict ourselves to the illustration via a diagram:

$$ A:~\left[\begin{array}{cccccc} \ast&0&\ast&\ast&\ast&0\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&\ast&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(5,4)} \left[\begin{array}{cccccc} \ast&0&\ast&\bullet&\ast&0\\ 0&\ast&\ast&\bullet&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\bullet&0&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(2,5)} \left[\begin{array}{cccccc} \ast&0&\ast&\ast&\ast&0\\ 0&\ast&\bullet&\ast&\bullet&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&0&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right], $$
$$ H:~\left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right] \underset{\leadsto}{(5,4)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&\bullet&0\\ 0&0&0&\bullet&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right] \underset{\leadsto}{(2,5)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right]. $$

Finally, we eliminate A14 with a similarity transformation whose relevant entry, the solution of the Sylvester equation \(Xp(A_{22})^{H}-A_{11}X=A_{14}\), is in the (1,4)-block position. Restoring H imposes another similarity transformation on A with the relevant entry in the (2,3)-block position, which affects the entry A23 only. This step is depicted as follows:

$$ A:~\left[\begin{array}{cccccc} \ast&0&\ast&\ast&\ast&0\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&0&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(1,4)} \left[\begin{array}{cccccc} \ast&0&\bullet&0&\ast&0\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&0&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right] \underset{\leadsto}{(2,3)} \left[\begin{array}{cccccc} \ast&0&\ast&0&\ast&0\\ 0&\ast&\bullet&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&0&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right], $$
$$ H:~\left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right] \underset{\leadsto}{(1,4)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&\bullet&0&0\\ 0&\ast&\bullet&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right] \underset{\leadsto}{(2,3)} \left[\begin{array}{cccccc} 0&0&\ast&0&0&0\\ 0&0&0&\ast&0&0\\ \ast&0&0&0&0&0\\ 0&\ast&0&0&0&0\\ 0&0&0&0&\ast&0\\ 0&0&0&0&0&\ast \end{array}\right]. $$

The key observation is now that the block zero pattern obtained for A is invariant under multiplication and thus, all powers of A as well as p(A) will have the same block zero pattern. But then, the fact that A is polynomially H-normal with H-normality polynomial p implies

$$ \left[\begin{array}{cccccc} p(A_{11})&A_{43}^{H}&A_{13}^{H}&A_{23}^{H}&A_{53}^{H}&A_{63}^{H}\\ 0&p(A_{22})&0&A_{24}^{H}&0&A_{64}^{H}\\ 0&0&A_{11}^{H}&0&0&0\\ 0&0&0&A_{22}^{H}&0&0\\ 0&0&A_{15}^{H}&A_{25}^{H}&A_{55}^{H}&0\\ 0&0&0&A_{26}^{H}&0&A_{66}^{H} \end{array}\right] = H^{-1}A^{H}H=p(A)= \left[\begin{array}{cccccc} \ast&0&\ast&0&\ast&0\\ 0&\ast&\ast&\ast&\ast&\ast\\ 0&0&\ast&0&0&0\\ 0&0&\ast&\ast&0&0\\ 0&0&\ast&0&\ast&0\\ 0&0&\ast&\ast&0&\ast \end{array}\right], $$

and we obtain that A23 = 0, A25 = 0, A43 = 0, and A63 = 0. But then, after applying a block permutation, we may assume that A and H have the forms \(A=A_{1}\oplus A_{2}\) and \(H=H_{1}\oplus H_{2}\) with

$$ A_{1}=\left[\begin{array}{ccc} A_{11}&A_{13}&A_{15}\\ 0&p(A_{11})^{H}&0\\ 0&A_{53}&A_{55} \end{array}\right], \quad H_{1}= \left[\begin{array}{ccc} 0&I_{k_{1}}&0\\ I_{k_{1}}&0&0\\ 0&0&sI_{m_{1}} \end{array}\right] $$

and

$$ A_{2}=\left[\begin{array}{ccc} A_{22}&A_{24}&A_{26}\\ 0&p(A_{22})^{H}&0\\ 0&A_{64}&A_{66} \end{array}\right], \quad H_{2}= \left[\begin{array}{ccc} 0&I_{k_{2}}&0\\ I_{k_{2}}&0&0\\ 0&0&sI_{m_{2}} \end{array}\right]. $$

Note that A1 has only one eigenvalue, namely λ1 = λ. It immediately follows from Lemma 1 that \(m_{1}s=\operatorname{signsum}(\lambda_{1})\), hence \(s\cdot\operatorname{signsum}(\lambda_{1})=m_{1}\geq 0\). On the other hand, the matrix A2 now has precisely r − 1 eigenvalues \(\lambda _{2},\dots ,\lambda _{r}\) satisfying \(\overline {\lambda _{i}}=p(\lambda _{i})\) and we can apply the induction hypothesis to obtain that s ⋅signsum(λi) ≥ 0 for \(i=2,\dots ,r\) and

$$ sm=sm_{1}+sm_{2}=\text{signsum}(\lambda_{1})+{\sum}_{j=2}^{r}\text{signsum}(\lambda_{j}) $$

which proves 4).

“4) ⇒ 1)”: We may assume without loss of generality that A and H are in the forms of Theorem 1. Thus, by Lemma 2, we may consider some of the corresponding diagonal blocks of A and H separately in order to construct an A-invariant H-neutral subspace. We will do this by individually investigating each block in the canonical form of Theorem 1 that is of type ii) or of type i) having even dimension. Concerning the blocks of type i) having odd dimension, we will have to consider all of them together to obtain the desired dimension of the A-invariant H-neutral subspace. We proceed by first considering the following three special cases, before discussing the general case.

Special Case 1: A is a block of type ii). In this case, we have m = 0, \(k=\frac {n}{2}\) and

$$ A=\left[\begin{array}{cc} \mathcal{J}_{k}(\lambda)&0\\ 0&p(\mathcal{J}_{k}(\lambda))^{H} \end{array}\right], \quad H=\left[\begin{array}{cc} 0&I_{k}\\ I_{k}&0 \end{array}\right], $$

where \(p(\lambda )\neq \overline {\lambda }\). Obviously, the first k standard basis vectors of \(\mathbb {C}^{n}\) span a k-dimensional A-invariant subspace that is H-neutral.

Special Case 2: A is a block of type i) of even dimension. In this case, we again have m = 0, \(k=\frac {n}{2}\) and

$$ A=\lambda I_{n}+e^{i\theta}T(0,1,ir_{2},\dots,ir_{n-1}),\quad H=\sigma R_{n}, $$

where \(p(\lambda )=\overline {\lambda }\), σ ∈{1,− 1}, and where \(\theta ,r_{2},\dots ,r_{n-1}\) are as specified in Theorem 1. Again, it is obvious that the first k standard basis vectors of \(\mathbb {C}^{n}\) span a k-dimensional A-invariant subspace that is H-neutral.

Special Case 3: A only consists of blocks of type i) with odd dimension. In this case, we have

$$ A=(\lambda I_{n_{1}}+e^{i\theta_{1}}T(0,1,ir_{1,2},\dots,ir_{1,n_{1}-1}))\oplus\cdots\oplus (\lambda I_{n_{\ell}}+e^{i\theta_{\ell}}T(0,1,ir_{\ell,2},\dots,ir_{\ell,n_{\ell}-1})) $$
$$ \text{and} \quad H=\sigma_{1}R_{n_{1}}\oplus\cdots\oplus\sigma_{\ell} R_{n_{\ell}}, $$

where nj = 2kj + 1 for some nonnegative integers \(k_{1},\dots ,k_{\ell }\). By 4), we have |signsum(λ)| = m. Without loss of generality, we may assume that s = 1, considering − H instead of H otherwise, i.e., we have that signsum(λ) = m. It follows that \(\ell\geq m\). More precisely, there exists a nonnegative integer α such that \(\ell=m+2\alpha \) and such that m + α signs among \(\sigma _{1},\dots ,\sigma _{\ell }\) are positive and α are negative. Without loss of generality, we may assume that among the diagonal blocks the first 2α blocks have alternating signs starting with σ1 = 1, so that the last m blocks all have positive sign. Let us first consider a group of two blocks with signs + 1,− 1, say

$$ A_{j}=\left[\begin{array}{cc} T_{2j-1}&0\\ 0&T_{2j} \end{array}\right], \quad H_{j}=\left[\begin{array}{cc} R_{n_{2j-1}}&0\\ 0&-R_{n_{2j}} \end{array}\right], $$

where \(j\in \{1,\dots ,\alpha \}\) and where \(T_{i}\in \mathbb {C}^{n_{i},n_{i}}\) is an upper triangular matrix having \(\lambda I_{n_{i}}\) as its diagonal. Then it is easy to check that the vectors

$$ e_{1},\dots,e_{k_{2j-1}},e_{n_{2j-1}+1},\dots,e_{n_{2j-1}+k_{2j}},e_{k_{2j-1}+1}+e_{n_{2j-1}+k_{2j}+1} $$
(9)

form a basis of an Aj-invariant Hj-neutral subspace. Indeed, partitioning

$$ T_{2j-1}=\left[\begin{array}{ccc} T_{11}&T_{12}&T_{13}\\ 0&\lambda&T_{23}\\ 0&0&T_{33} \end{array}\right],\quad T_{2j}=\left[\begin{array}{ccc} \widetilde T_{11}&\widetilde T_{12}&\widetilde T_{13}\\ 0&\lambda&\widetilde T_{23}\\ 0&0&\widetilde T_{33} \end{array}\right] $$

with diagonal blocks of the sizes \(k_{2j-1},1,k_{2j-1}\) and \(k_{2j},1,k_{2j}\), respectively,

a simultaneous block permutation on Aj and Hj results in

$$ \left[\begin{array}{cccccc} T_{11} & 0 & T_{12} & 0 & T_{13} & 0\\ 0 & \widetilde T_{11} & 0 & \widetilde T_{12} & 0 & \widetilde T_{13}\\ 0&0&\lambda & 0 & T_{23} & 0\\ 0&0&0&\lambda & 0 & \widetilde T_{23}\\ 0&0&0&0&T_{33}&0\\ 0&0&0&0&0&\widetilde T_{33} \end{array}\right] \quad\text{and}\quad \left[\begin{array}{cccccc} 0&0&0&0&R_{k_{2j-1}}&0\\ 0&0&0&0&0&-R_{k_{2j}}\\ 0&0&1&0&0&0\\ 0&0&0&-1&0&0\\ R_{k_{2j-1}}&0&0&0&0&0\\ 0&-R_{k_{2j}}&0&0&0&0 \end{array}\right]. $$

Then transforming the middle 2 × 2 blocks with the transformation given by the matrix

$$ \left[\begin{array}{cc} 1&1\\ 1&-1 \end{array}\right] $$

produces matrices of the form

$$ \left[\begin{array}{cc} \widehat T_{1}&\widehat T_{2}\\ 0&\widehat T_{3} \end{array}\right] \quad\text{ and }\quad \left[\begin{array}{cc} 0&\widehat H\\ \widehat H^{H}&0 \end{array}\right], $$
(10)

where all blocks have size \(\frac {n_{2j-1}+n_{2j}}{2}\). Observe that the first \(\frac {n_{2j-1}+n_{2j}}{2}\) columns of the transformation matrix that transforms (Aj,Hj) to the pair of matrices in (10) coincide with the vectors in (9), possibly up to scalar multiples.

Next, we consider a block for \(j\in \{2\alpha +1,\dots ,\ell \}\), i.e.,

$$ A_{j}=\lambda I_{n_{j}}+e^{i\theta_{j}}T(0,1,ir_{j,2},\dots,ir_{j,n_{j}-1}),\quad H_{j}=R_{n_{j}}. $$

Here, the first kj standard basis vectors span an Aj-invariant Hj-neutral subspace.

In view of Lemma 2, we obtain the existence of an A-invariant H-neutral subspace of dimension

$$ \sum\limits_{j=1}^{\alpha}(k_{2j-1}+k_{2j}+1)+\sum\limits_{j=2\alpha+1}^{\ell} k_{j}= \sum\limits_{j=1}^{\alpha} \frac{n_{2j-1}+n_{2j}}{2}+\sum\limits_{j=2\alpha+1}^{\ell} \frac{n_{j}-1}{2}=\frac{n-m}{2}, $$

where we used that \(\ell-2\alpha=m\).

The general case: Now let A be general having r eigenvalues \(\lambda _{1},\dots ,\lambda _{r}\in \mathbb {C}\) satisfying \(\overline {\lambda _{i}}=p(\lambda _{i})\), \(i=1,\dots ,r\). Putting together the special cases above in view of our observation, we obtain that there exists an A-invariant H-neutral subspace of dimension

$$ \frac{1}{2}\left( n-{\sum}_{j=1}^{r}|\operatorname{signsum}(\lambda_{j})|\right), $$

and by 4) this dimension is equal to \(\frac {n-m}{2}=k\) as desired. This finishes the proof. □

5 Conclusions

We have developed a Schur-like form for polynomially H-normal matrices, where H is Hermitian and unitary, and we have characterized under which conditions this form can be obtained via structure-preserving (and unitary) similarity transformations. In particular, the result can be applied to all matrices that are selfadjoint, skew-adjoint, or unitary with respect to an indefinite inner product that has a unitary Hermitian Gram matrix. As two extreme special cases, the unitary diagonalizability of normal matrices and equivalent conditions for the existence of the Hamiltonian Schur form of Hamiltonian matrices have been recovered. While structure-preserving and numerically backward stable algorithms for the numerical computation of these forms are well known in the two special cases mentioned, it remains to develop such algorithms for the general case.