1 Introduction

In this paper, we are interested in studying when the quaternionic numerical range is the closed unit ball in the algebra of quaternions. The quaternionic numerical range was introduced in [14] and several basic properties were derived in [2, 3, 14, 19]. More recent papers include [1, 6, 17]. For the case of nilpotent quaternionic matrices, some circularity results appear in [5]. In the case of complex matrices, it was derived (beyond the nilpotent case) in [16, Theorem 4.5] that for a complex matrix \(B\in {\mathbb {C}}^{n\times n},\) its numerical range is the closed unit disk if and only if there exist \(P_0,P_1 \in {\mathbb {C}}^{(n-1)\times n},\) so that \(B=2P_0^*P_1\) and \(P_0^*P_0+P_1^*P_1=I_{n}.\) We will be deriving a quaternionic version of this result. To do this, we need to study Fejér–Riesz factorization in the algebra of complex matrices associated with quaternion matrices. This algebra, which we will call the QRC-subalgebra, consists of matrices of the form

$$\begin{aligned} \begin{bmatrix} A &{} \overline{B} \\ -B &{} \overline{A} \end{bmatrix} \in {\mathbb {C}}^{2n \times 2n}, \end{aligned}$$

where \(\overline{C}= (\overline{c_{ij}})\) denotes the matrix obtained from \(C=(c_{ij})\) by taking entrywise complex conjugates. Here, ‘QRC’ stands for ‘Quaternions Represented as Complexes’. It should be noted that in [9], for instance, these matrices are referred to as ‘symplectic’. However, as the term ‘symplectic’ is also used for other types of matrices, we will use the QRC-subalgebra terminology. The classical Fejér–Riesz factorization result has applications in filter design (see, e.g., [11]), wavelet design (see [7]), and \(H_\infty \)-control (see, e.g., [10]). Factorizations in this context are numerically found via solving Riccati equations (see [15]) or semidefinite programming (see, e.g., [12]).

The paper is organized as follows. In Sect. 2, we will derive the Fejér–Riesz factorization in the QRC-subalgebra. In Sect. 3, we will provide a characterization of when the quaternionic numerical range equals the closed unit ball in the quaternions. Before we start, we will introduce our notational conventions.

1.1 Notation

For notation regarding basic quaternion linear algebra, we refer to the monograph [18]. Every quaternion in \({\mathbb {H}}\) is of the form

$$\begin{aligned} x=x_0+x_1{\textsf {i}}+x_2{\textsf {j}}+x_3{\textsf {k}}, \end{aligned}$$

where \(x_0,x_1,x_2,x_3\in {\mathbb {R}}\) and the elements \(\textsf{i},\textsf{j},\textsf{k}\) satisfy the following formulas:

$$\begin{aligned} {\textsf {i}}^2={\textsf {j}}^2={\textsf {k}}^2=-1,\quad {\textsf {i}}{\textsf {j}}=-{\textsf {j}}{\textsf {i}}={\textsf {k}},\quad {\textsf {j}}{\textsf {k}}=-{\textsf {k}}{\textsf {j}}={\textsf {i}},\quad {\textsf {k}}{\textsf {i}}=-{\textsf {i}}{\textsf {k}}={\textsf {j}}. \end{aligned}$$

Note that multiplication in \({\mathbb {H}}\) is not commutative. Denote the conjugate of the quaternion x by \(\overline{x}=x_0-x_1{\textsf {i}}-x_2{\textsf {j}}-x_3{\textsf {k}}\) and the norm of \(x\in {\mathbb {H}}\) by \(\Vert x\Vert =\sqrt{x^*x}=\sqrt{x_0^2+x_1^2+x_2^2+x_3^2}\in {\mathbb {R}}.\) For a quaternion matrix A, let \(\overline{A}\) denote the matrix obtained from A by taking the conjugate of each entry.

For \(A \in {\mathbb {H}}^{n \times n},\) where

$$\begin{aligned} A=A_0+{\textsf {i}}A_1+{\textsf {j}}A_2+{\textsf {k}}A_3, \quad \text{ with } \ A_i \in {\mathbb {R}}^{n \times n}, i=0,1,2,3, \end{aligned}$$

we write \(A=B+{\textsf {j}}C\) where \(B:=A_0+{\textsf {i}}A_1 \in {\mathbb {C}}^{n \times n} \) and \(C:=A_2-{\textsf {i}}A_3 \in {\mathbb {C}}^{n \times n}\). There exists a complex matrix \(\omega _n(A)\) associated with A,  defined by

$$\begin{aligned} \omega _n(A)=\begin{bmatrix} B &{} \overline{C} \\ -C &{} \overline{B} \end{bmatrix}\in {\mathbb {C}}^{2n\times 2n}. \end{aligned}$$

Then, \(\omega _n\) is an isomorphism of the real algebra \({\mathbb {H}}^{n\times n}\) onto the real unital subalgebra \(\Omega _{2n}\)

$$\begin{aligned} \Omega _{2n}:=\left\{ \begin{bmatrix} B &{} \overline{C} \\ -C &{} \overline{B} \end{bmatrix}: B,C\in {\mathbb {C}}^{n\times n}\right\} \end{aligned}$$

of \({\mathbb {C}}^{2n\times 2n}.\) As mentioned before, we will call this subalgebra the QRC-subalgebra, where ‘QRC’ stands for ‘Quaternions Represented as Complexes’. For a nonsquare matrix \(A\in {\mathbb {H}}^{m\times n}\) written as \(A=B+{\textsf {j}}C,\) \(B,C\in {\mathbb {C}}^{m\times n},\) we use the analogous notation \(\omega _{m,n}(A)\) and define

$$\begin{aligned} \Omega _{2m,2n}:=\left\{ \begin{bmatrix} B &{} \overline{C} \\ -C &{} \overline{B} \end{bmatrix}: B,C\in {\mathbb {C}}^{m\times n}\right\} . \end{aligned}$$

The map \(\omega _{m,n}: {\mathbb {H}}^{m\times n} \rightarrow \Omega _{2m,2n}\) is invertible, and the inverse of \(\omega _{m,n}\) is denoted by \(\omega _{m,n}^{-1}.\) When the sizes are clear from the context, we may drop the subscripts and simply write \(\omega (A).\)
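For readers who wish to experiment numerically with \(\omega _{m,n},\) the following Python sketch (using NumPy) stores a quaternion matrix \(A=B+{\textsf {j}}C\) as the complex pair \((B,C)\) and implements \(\omega \) and its inverse. It is our own illustration with hypothetical helper names and is not part of the formal development.

```python
import numpy as np

def omega(B, C):
    """omega_{m,n}(A) for A = B + jC, with B, C complex m-by-n arrays."""
    return np.block([[B, np.conj(C)],
                     [-C, np.conj(B)]])

def omega_inv(M, m, n):
    """Recover the pair (B, C) from a matrix in Omega_{2m,2n}."""
    return M[:m, :n], -M[m:, :n]

# Quaternion matrix multiplication in the (B, C) representation:
# (B1 + jC1)(B2 + jC2) = (B1 B2 - conj(C1) C2) + j (C1 B2 + conj(B1) C2).
rng = np.random.default_rng(0)
B1, C1, B2, C2 = (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
                  for _ in range(4))
B12 = B1 @ B2 - np.conj(C1) @ C2
C12 = C1 @ B2 + np.conj(B1) @ C2
print(np.allclose(omega(B1, C1) @ omega(B2, C2), omega(B12, C12)))  # True
```

The final check illustrates on a random example the multiplicativity property (ii) listed below.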

The following properties can be found in [18, Section 3.4] and [21]. For arbitrary \(X,Y\in {\mathbb {H}}^{n\times n}\) and \(s,t\in {\mathbb {R}},\) we have

(i) \(\omega _n(I_n)=I_{2n};\)

(ii) \(\omega _n(XY)=\omega _n(X)\omega _n(Y);\)

(iii) \(\omega _n(sX+tY)=s\omega _n(X)+t\omega _n(Y);\)

(iv) \(\omega _n(X^*)=(\omega _n(X))^*;\)

(v) X is invertible if and only if \(\omega _n(X)\) is invertible; if so, then \(\omega _n(X^{-1})=(\omega _n(X))^{-1}.\)

Furthermore, define the matrix

$$\begin{aligned} J_n:=\omega ({\textsf {j}}) \otimes I_n = \begin{bmatrix} 0 &{}{} I_n \\ -I_n &{}{} 0 \end{bmatrix}, \end{aligned}$$

where \(\otimes \) is the Kronecker product.

We denote the numerical range of a quaternion matrix \(A \in {\mathbb {H}}^{n \times n}\) by

$$\begin{aligned} W_{{\mathbb {H}}}(A)=\left\{ x^*Ax: \Vert x\Vert =1, x \in {\mathbb {H}}^{n}\right\} \subset {\mathbb {H}}. \end{aligned}$$

Note that if \(q \in W_{{\mathbb {H}}}(A),\) and \(u \in {\mathbb {H}}\) where \(\Vert u\Vert =1,\) then \(u^*qu \in W_{{\mathbb {H}}}(A).\) In general, \(u^*qu \ne q.\) For basic information on the quaternionic numerical range, see for instance [18, Section 3.3]. The subset of complex elements in \(W_{{\mathbb {H}}}(A)\) is known as the bild of A,  denoted

$$\begin{aligned} B(A):=W_{{\mathbb {H}}}(A)\cap {\mathbb {C}}. \end{aligned}$$

Although the bild may not be convex, the upper bild \(B^+(A)=W_{{\mathbb {H}}}(A)\cap {\mathbb {C}}^+\) is always convex; see [19]. Here, \({\mathbb {C}}^+=\{ z \in {\mathbb {C}}: \textrm{Im} z \ge 0 \}\) is the closed upper half plane. Note that B(A) is symmetric with respect to the real axis. Indeed, if \(\alpha \in B(A),\) then by the above \({\textsf {j}}^*\alpha {\textsf {j}}=\overline{\alpha } \in B(A).\) Thus, knowing just \(B^+(A)\) provides full information on B(A), since \(B(A)=B^+(A) \cup \overline{B^+(A)}\) (where the bar indicates taking complex conjugates).

Let \({\mathbb {D}}=\{z \in {\mathbb {C}}: \Vert z\Vert <1\}\) and \(\overline{\mathbb {D}}=\{z \in {\mathbb {C}}: \Vert z\Vert \le 1\}\) denote the open and closed unit disks in \({\mathbb {C}}\) and \({\mathbb {T}}=\{z \in {\mathbb {C}}: \Vert z\Vert =1\}\) the unit circle. Finally, let \({\mathbb {B}}=\{q \in {\mathbb {H}}: \Vert q\Vert < 1\}\) and \(\overline{\mathbb {B}}=\{q \in {\mathbb {H}}: \Vert q\Vert \le 1\}\) denote the open and closed unit balls in \({\mathbb {H}}.\)
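The quaternionic numerical range and the upper bild can be explored numerically by sampling. The Python sketch below is our own crude Monte Carlo illustration (not an exact computation): it evaluates \(x^*Ax\) through the representation \(\omega \) and maps each value \(q\) to the representative \(\textrm{Re}(q)+\Vert \textrm{Im}(q)\Vert {\textsf {i}},\) which lies in \(B^+(A)\) by [2, Lemma 2].

```python
import numpy as np

rng = np.random.default_rng(1)

def omega(B, C):
    """omega(A) for A = B + jC (same convention as above)."""
    return np.block([[B, np.conj(C)], [-C, np.conj(B)]])

def upper_bild_samples(AC, AH, num=5000):
    """Monte Carlo samples of the upper bild B^+(A) for A = AC + j AH in H^{n x n}."""
    n = AC.shape[0]
    OA, pts = omega(AC, AH), []
    for _ in range(num):
        # random unit vector x = xC + j xH in H^n
        xC = rng.standard_normal(n) + 1j * rng.standard_normal(n)
        xH = rng.standard_normal(n) + 1j * rng.standard_normal(n)
        nrm = np.sqrt(np.linalg.norm(xC) ** 2 + np.linalg.norm(xH) ** 2)
        Ox = omega(xC[:, None] / nrm, xH[:, None] / nrm)
        # omega(x)^* omega(A) omega(x) = omega(x^* A x) = [[qC, conj(qH)], [-qH, conj(qC)]]
        M = Ox.conj().T @ OA @ Ox
        qC, qH = M[0, 0], -M[1, 0]
        # representative Re(q) + ||Im(q)|| i of q = x^* A x in the closed upper half plane
        pts.append(complex(qC.real, np.hypot(qC.imag, abs(qH))))
    return np.array(pts)
```

For instance, for the matrix of Example 3.8 with \(\Vert a_1\Vert ^2+\Vert a_2\Vert ^2=4,\) the samples should fill out the upper half of the closed unit disk.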

2 Fejér–Riesz factorization in the QRC-subalgebra

Lemma 2.1

Let \(C \in {\mathbb {C}}^{2n \times 2n}.\) Then \(C \in \Omega _{2n}\) if and only if \(J_n \overline{C}J_n^*=C.\)

Proof

Take \(C \in {\mathbb {C}}^{2n \times 2n},\) where

$$\begin{aligned} C=\begin{bmatrix} C_{11} &{} C_{12} \\ C_{21} &{} C_{22} \end{bmatrix}. \end{aligned}$$

Then

$$\begin{aligned} J_n\overline{C}J_n^*&=\begin{bmatrix} 0 &{} I_n \\ -I_n &{} 0 \end{bmatrix} \begin{bmatrix} \overline{C}_{11} &{} \overline{C}_{12} \\ \overline{C}_{21} &{} \overline{C}_{22} \end{bmatrix} \begin{bmatrix} 0 &{} -I_n \\ I_n &{} 0 \end{bmatrix} = \begin{bmatrix} \overline{C}_{22} &{} -\overline{C}_{21} \\ -\overline{C}_{12} &{} \overline{C}_{11} \end{bmatrix}, \end{aligned}$$

which is equal to C if and only if \(C_{11}=\overline{C}_{22}\) and \(C_{12}=-\overline{C}_{21}.\) Hence, \(C \in \Omega _{2n}\) if and only if \(J_n\overline{C}J_n^*=C.\) \(\square \)

Lemma 2.2

Let \(C \in {\mathbb {C}}^{2n \times 2n}.\) Then,  \(\frac{1}{2}\left( J_n\overline{C}J_n^*+C\right) \in \Omega _{2n}.\)

Proof

Take \(C \in {\mathbb {C}}^{2n \times 2n},\) where

$$\begin{aligned} C=\begin{bmatrix} C_{11} &{} C_{12} \\ C_{21} &{} C_{22} \end{bmatrix}. \end{aligned}$$

Then

$$\begin{aligned} \frac{1}{2}\left( J_n\overline{C}J_n^*+C\right)&= \frac{1}{2}\left( \begin{bmatrix} 0 &{} I_n \\ -I_n &{} 0 \end{bmatrix} \begin{bmatrix} \overline{C}_{11} &{} \overline{C}_{12} \\ \overline{C}_{21} &{} \overline{C}_{22} \end{bmatrix} \begin{bmatrix} 0 &{} -I_n \\ I_n &{} 0 \end{bmatrix} + \begin{bmatrix} C_{11}&{}C_{12}\\ C_{21} &{} C_{22} \end{bmatrix} \right) \\ {}&= \frac{1}{2}\left( \begin{bmatrix} \overline{C}_{22} &{} -\overline{C}_{21} \\ -\overline{C}_{12} &{} \overline{C}_{11} \end{bmatrix}+\begin{bmatrix} C_{11} &{} C_{12} \\ C_{21} &{} C_{22} \end{bmatrix} \right) \\&=\frac{1}{2} \begin{bmatrix} \overline{C}_{22} +C_{11}&{} -\overline{C}_{21}+C_{12} \\ -\overline{C}_{12}+C_{21} &{} \overline{C}_{11}+C_{22} \end{bmatrix} \in \Omega _{2n}. \\ \end{aligned}$$

\(\square \)

Note that if \(C \in \Omega _{2n},\) then \(J_n\overline{C}J_n^*=C,\) and hence

$$\begin{aligned} \frac{1}{2}\left( J_n\overline{C}J_n^*+C\right) =C. \end{aligned}$$
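Lemmas 2.1 and 2.2 are easy to test numerically. The following sketch (ours, for illustration only) implements the membership test of Lemma 2.1 and the averaging map of Lemma 2.2, and checks that the latter maps \({\mathbb {C}}^{2n\times 2n}\) into \(\Omega _{2n}\) and acts as the identity on \(\Omega _{2n}.\)

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
Jn = np.block([[np.zeros((n, n)), np.eye(n)],
               [-np.eye(n), np.zeros((n, n))]])          # J_n = omega(j) kron I_n

def in_Omega(C, tol=1e-12):
    """Membership test of Lemma 2.1: C lies in Omega_{2n} iff J_n conj(C) J_n^* = C."""
    return np.allclose(Jn @ np.conj(C) @ Jn.T, C, atol=tol)   # Jn is real, so Jn^* = Jn^T

def average(C):
    """The map C -> (J_n conj(C) J_n^* + C)/2 of Lemma 2.2."""
    return 0.5 * (Jn @ np.conj(C) @ Jn.T + C)

C = rng.standard_normal((2 * n, 2 * n)) + 1j * rng.standard_normal((2 * n, 2 * n))
print(in_Omega(C))                                       # generically False
print(in_Omega(average(C)))                              # True, as in Lemma 2.2
print(np.allclose(average(average(C)), average(C)))      # True: elements of Omega_{2n} are fixed
```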

The following result provides Fejér–Riesz factorization in the QRC-subalgebra \(\Omega _{2n}.\) The result is a variation of [12, Proposition 4.6], which was based on [8, Section 3]; see also [4, Theorem 2.4.19].

Theorem 2.3

Let \(Q(z) = Q_{-m}z^{-m} + \cdots + Q_{m}z^{m} \ge 0, z \in {\mathbb {T}},\) where \(Q_{j} \in \Omega _{2n}.\) Let \(Z = \begin{bmatrix} Q_{0} &{} Q_{-1} &{} \ldots &{} Q_{-m}\\ Q_{1} &{} 0 &{} \ldots &{} 0\\ \vdots &{} \vdots &{} &{} \vdots \\ Q_{m} &{} 0 &{} \ldots &{} 0 \end{bmatrix},\) and introduce the convex set

$$\begin{aligned} {\mathcal {G}}=\Bigg \{X \in {\mathbb {C}}^{2nm \times 2nm}\mid A(X):= Z- \begin{bmatrix} X &{} 0\\ 0 &{} 0_{2n} \end{bmatrix} + \begin{bmatrix} 0_{2n} &{} 0\\ 0 &{} X \end{bmatrix} \ge 0 \Bigg \}. \end{aligned}$$

Then,  \({\mathcal {G}}\) has elements \(X_{\textrm{max}}\) and \(X_{\textrm{min}}\) that are maximal and minimal with respect to the Loewner ordering,  respectively;  i.e.,  \(X_{\textrm{max}}, X_{\textrm{min}} \in {\mathcal {G}}\) have the property that \(X \in {\mathcal {G}}\) implies \(X_{\textrm{min}} \le X \le X_{\textrm{max}}.\) Writing \(X_{\textrm{max}}=(X^\textrm{max}_{ij})_{i,j=1}^m\) and \(X_{\textrm{min}}=(X^\textrm{min}_{ij})_{i,j=1}^m,\) we have that \(X^{\textrm{max}}_{ij}, X^\textrm{min}_{ij}\in \Omega _{2n}.\)

Moreover, consider the set

$$\begin{aligned} {{\mathcal {A}}} = \Big \{ A=(A_{ij})_{i,j=0}^m \in {\mathbb {C}}^{2n(m+1)\times 2n(m+1)} :\ A = A(X) \ \text {for some} \ X \in {{\mathcal {G}}} \Big \}. \end{aligned}$$

Then, \(A(X_{\textrm{max}})\) is the unique element in \({{\mathcal {A}}}\) such that \(A_{mm}\) is maximal (or equivalently, \(A_{00}\) is minimal) in the Loewner order. Also, \(A(X_{\textrm{min}})\) is the unique element in \({{\mathcal {A}}}\) such that \(A_{00}\) is maximal (or equivalently, \(A_{mm}\) is minimal) in the Loewner order. Finally, we may factor \(A(X_{\textrm{max}})\) and \(A(X_{\textrm{min}})\) as

$$\begin{aligned} A(X_{\textrm{max}}) = \begin{bmatrix} H^{*}_0 \\ \vdots \\ H^{*}_{m} \end{bmatrix} \begin{bmatrix} H_{0}&\ldots&H_{m} \end{bmatrix}, \; A(X_{\textrm{min}}) = \begin{bmatrix} K^{*}_0 \\ \vdots \\ K^{*}_{m} \end{bmatrix} \begin{bmatrix} K_{0}&\ldots&K_{m} \end{bmatrix} \end{aligned}$$
(2.1)

with \(H_i, K_{i} \in \Omega _{2n}, i=0, \ldots , m,\) and put \(H(z) = \sum _{k=0}^m H_kz^k,\) \(K(z) = \sum _{k=0}^m K_k z^k.\) Then

$$\begin{aligned} Q(z) = H(z)^* H(z) = K(z)^* K(z),\quad z \in {\mathbb {T}}. \end{aligned}$$

Proof

Without the restriction that \(Q_j, H_j, K_j \in \Omega _{2n},\) the statement is covered by [12, Proposition 4.6]. For a block matrix \(B=(B_{ij})_{i,j},\) where \(B_{ij} \in {\mathbb {C}}^{2n\times 2n},\) we define

$$\begin{aligned} {{\mathcal {J}}} (B):= \left( \frac{1}{2}\left( J_n\overline{B_{ij}}J_n^* +B_{ij}\right) \right) _{i,j}. \end{aligned}$$

In other words, we apply the operation \(C\mapsto \frac{1}{2} \left( J_n\overline{C}J_n^*+C\right) \) to each block entry. It is easy to see that the operation \(\mathcal {J}\) preserves positive semidefiniteness. In addition, \({\mathcal {J}}(Z)=Z\) and \({{\mathcal {J}}} (A(X)) = A({{\mathcal {J}}} (X)).\) This now yields that if \(X\in {{\mathcal {G}}},\) then \({{\mathcal {J}}}(X) \in {{\mathcal {G}}}.\) As \({{\mathcal {J}}}\) preserves positive semidefiniteness, it also follows that \(X\le Y\) implies \({\mathcal {J}}(X) \le {{\mathcal {J}}}(Y).\) Thus, \(X_{\textrm{min}} \le X \le X_{\textrm{max}}\) implies \({{\mathcal {J}}}(X_{\textrm{min}}) \le {{\mathcal {J}}}(X) \le \mathcal {J}(X_{\textrm{max}}).\) As \({{\mathcal {J}}}(X_{\textrm{min}}), {\mathcal {J}}(X_{\textrm{max}}) \in {\mathcal {G}},\) we also have \(X_{\textrm{min}} \le {{\mathcal {J}}}(X_{\textrm{min}})\) and \( {{\mathcal {J}}} (X_{\textrm{max}})\le X_{\textrm{max}}.\) Observe that \({{\mathcal {J}}}\) does not change the trace of a Hermitian matrix, and thus, \(\textrm{Tr} (X_{\textrm{min}}) = \textrm{Tr} ({{\mathcal {J}}}(X_{\textrm{min}}))\) and \( \textrm{Tr} ({{\mathcal {J}}} (X_{\textrm{max}}))= \textrm{Tr} (X_{\textrm{max}}).\) But then, since \({{\mathcal {J}}}(X_{\textrm{min}})-X_{\textrm{min}}\) and \(X_{\textrm{max}}-{{\mathcal {J}}}(X_{\textrm{max}})\) are positive semidefinite with trace zero, it follows that \({{\mathcal {J}}}(X_{\textrm{min}})=X_{\textrm{min}} \) and \({{\mathcal {J}}}(X_{\textrm{max}})=X_{\textrm{max}} ,\) and thus \(X^{\textrm{max}}_{ij}, X^{\textrm{min}}_{ij}\in \Omega _{2n}.\) Finally, the spectral theorem for quaternionic matrices [9, Theorem 3.3 and Proposition 3.7] implies that the factorizations in (2.1) can be kept within the QRC-subalgebra.

In fact, we have that H(z) is co-outer and K(z) is outer (see [12, Section 4] or [4, Section 2.4] for the definitions); when \(H_m\) and \(K_0\) are invertible, this means that \(\det H(z) \ne 0\) for \(z \in {\mathbb {C}}{\setminus }\overline{\mathbb {D}}\) and \(\det K(z) \ne 0\) for \(z\in {\mathbb {D}}.\)
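In the degree one case \(m=1,\) which is the case needed in Sect. 3, such a factorization can also be computed numerically through the Riccati-type equation \(X=Q_1^*(Q_0-X)^{-1}Q_1,\) in the spirit of the approaches mentioned in the introduction (see [15] and [12]). The Python sketch below is our own rough illustration of this, under the assumption that \(Q(z)=Q_1^*z^{-1}+Q_0+Q_1z\) is positive definite on \({\mathbb {T}};\) it is not the procedure of the cited references, and the function names are ours.

```python
import numpy as np

def fejer_riesz_degree_one(Q0, Q1, iters=500):
    """Given Q(z) = Q1^* z^{-1} + Q0 + Q1 z positive definite on the unit circle, return
    K0, K1 with Q(z) = (K0 + z K1)^* (K0 + z K1), via successive substitution for the
    minimal solution of X = Q1^* (Q0 - X)^{-1} Q1."""
    X = np.zeros_like(Q0, dtype=complex)
    for _ in range(iters):
        X = Q1.conj().T @ np.linalg.solve(Q0 - X, Q1)
        X = 0.5 * (X + X.conj().T)                      # keep the iterate Hermitian
    K0 = np.linalg.cholesky(Q0 - X).conj().T            # K0^* K0 = Q0 - X
    K1 = np.linalg.solve(K0.conj().T, Q1)               # so that K0^* K1 = Q1
    return K0, K1

# toy test with data of the kind used in the proof of Theorem 3.7: Q0 = I, Q1 = -M/2
rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
M *= 0.9 / np.linalg.norm(M, 2)                         # spectral norm < 1, so Q(z) > 0 on T
Q0, Q1 = np.eye(4), -0.5 * M
K0, K1 = fejer_riesz_degree_one(Q0, Q1)
z = np.exp(0.7j)
print(np.max(np.abs((K0 + z * K1).conj().T @ (K0 + z * K1)
                    - (Q1.conj().T / z + Q0 + z * Q1))))  # ~ 0
```

Note that when \(Q_0,Q_1\in \Omega _{2n},\) the iterates X remain in \(\Omega _{2n}\) by the properties listed in Sect. 1.1, although the Cholesky factor produced above need not.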

3 Circularity of the quaternionic numerical range

For \(A \in {\mathbb {H}}^{n \times n},\) we consider also the numerical range of the complex matrix \(\omega (A)\)

$$\begin{aligned} W\left( \omega (A)\right) =\left\{ y^*\omega (A)y: \Vert y\Vert =1, y \in {\mathbb {C}}^{2n}\right\} . \end{aligned}$$

Since \(W\left( \omega (A)\right) \) is always convex by the famous Toeplitz–Hausdorff Theorem [13, 20], whereas B(A) need not be, we know that the two sets do not coincide in general. However, we will see that B(A) is the unit disk if and only if \(W\left( \omega (A)\right) \) is the unit disk.
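The set \(W(\omega (A))\) is a classical complex numerical range and can be computed numerically by the standard support function approach: for each direction \(e^{{\textsf {i}}\theta },\) the farthest point of W(M) is attained at a top eigenvector of the Hermitian part of \(e^{-{\textsf {i}}\theta }M.\) The sketch below is our own illustration of this; it can be applied to \(M=\omega (A)\) and compared with the Monte Carlo bild samples sketched in Sect. 1.1.

```python
import numpy as np

def numerical_range_boundary(M, num=360):
    """Boundary points of W(M) for a complex square matrix M via the support function:
    max Re(e^{-i theta} x^* M x) over unit x equals the top eigenvalue of the Hermitian
    part of e^{-i theta} M, and is attained at a corresponding eigenvector."""
    pts = []
    for theta in np.linspace(0.0, 2 * np.pi, num, endpoint=False):
        H = 0.5 * (np.exp(-1j * theta) * M + np.exp(1j * theta) * M.conj().T)
        _, V = np.linalg.eigh(H)          # eigenvalues in ascending order
        v = V[:, -1]                      # eigenvector of the largest eigenvalue
        pts.append(v.conj() @ M @ v)
    return np.array(pts)
```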

We start off with some auxiliary results. The following lemma follows from the proof of [14, Theorem 35].

Lemma 3.1

[14] For \(A \in {\mathbb {H}}^{n \times n},\) we have \(B(A) \subseteq W\left( \omega (A)\right) .\)

Proof

Let \(q \in B(A).\) Hence, there exists a vector \(x \in {\mathbb {H}}^n,\) where \(\Vert x\Vert =1,\) such that \(q=x^*Ax \in {\mathbb {C}}.\) Thus, \(\omega (q)=\omega (x)^*\omega (A)\omega (x),\) where

$$\begin{aligned} \omega (q)=\begin{bmatrix} q &{} 0 \\ 0 &{} \overline{q} \end{bmatrix}=\begin{bmatrix} x_C^* &{}-x_H^* \\ \overline{x}_H^* &{} \overline{x}_C^* \end{bmatrix} \begin{bmatrix} A_C &{} \overline{A}_H \\ -A_H &{} \overline{A}_C \end{bmatrix} \begin{bmatrix} x_C &{} \overline{x}_H \\ - x_H &{} \overline{x}_C \end{bmatrix}, \end{aligned}$$

with \(A=A_C+{\textsf {j}}A_H\) and \(x=x_C + {\textsf {j}}x_H\). Using that

$$\begin{aligned} 1=\Vert x\Vert ^2=\Vert x_C+{\textsf {j}}x_H\Vert ^2=\Vert x_C\Vert ^2+\Vert x_H\Vert ^2=\left\| \begin{bmatrix} x_C \\ -x_H \end{bmatrix}\right\| ^2, \end{aligned}$$

it follows that

$$\begin{aligned} q=\begin{bmatrix} x_C^*&-x_H^* \end{bmatrix} \omega (A)\begin{bmatrix} x_C \\ -x_H \end{bmatrix} \in W(\omega (A)). \\ \end{aligned}$$

\(\square \)

In general, the other inclusion will not hold. For example, if \(A \in {\mathbb {H}}^{1\times 1}\) is not real, then \(B(A)=\{\alpha , \overline{\alpha }\}\) where \(\alpha \) and \(\overline{\alpha }\) are the eigenvalues of \(\omega (A)\in {\mathbb {C}}^{2 \times 2};\) and \(W(\omega (A))\) will be the line segment between \(\alpha \) and \(\overline{\alpha }.\)

Next, we show that the two sets B(A) and \(W(\omega (A))\) do coincide when either one is the closed unit disk, with \(B(A)=\overline{\mathbb {D}}\) the stronger condition, since \(W(\omega (A))\) is convex.

Lemma 3.2

For \(A \in {\mathbb {H}}^{n \times n},\) we have \(B(A)=\overline{\mathbb {D}}\) if and only if \(W_{{\mathbb {H}}}(A)=\overline{\mathbb {B}}.\)

Proof

Assume \(B(A)=\overline{\mathbb {D}}.\) Let \(q \in \overline{\mathbb {B}}\) where \(q=q_0+ {\textsf {i}}q_1+ {\textsf {j}}q_2+ {\textsf {k}}q_3\) with \(\Vert q\Vert ^2=\sum _{p=0}^3 |q_p|^2 \le 1.\) Then

$$\begin{aligned} {\tilde{q}}=q_0+{\textsf {i}}\left\| \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix}\right\| \in \overline{\mathbb {D}}=B(A)=W_{{\mathbb {H}}}(A)\cap {\mathbb {C}}. \end{aligned}$$

Hence, \({\tilde{q}} \in W_{{\mathbb {H}}}(A)\) and by [2, Lemma 2] also \(q \in W_{{\mathbb {H}}}(A).\) For the other inclusion, let \(q \in W_{{\mathbb {H}}}(A)\) where \(q=q_0+ {\textsf {i}}q_1+ {\textsf {j}}q_2+ {\textsf {k}}q_3.\) Then

$$\begin{aligned} q_0+{\textsf {i}}\left\| \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix}\right\| \in B(A)=\overline{\mathbb {D}}, \end{aligned}$$

from which follows

$$\begin{aligned} 1\ge |q_0|^2+\left\| \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix}\right\| ^2=\sum _{p=0}^3|q_p|^2=\Vert q\Vert ^2. \end{aligned}$$

Hence, \(q \in \overline{\mathbb {B}}.\)

For the other direction, assume \(W_{{\mathbb {H}}}(A)=\overline{\mathbb {B}}.\) Take \(z \in B(A)=W_{{\mathbb {H}}}(A)\cap {\mathbb {C}}.\) Thus, \(\Vert z\Vert ^2\le 1,\) which proves \(z \in \overline{\mathbb {D}}.\) For the other inclusion, take \(z \in \overline{\mathbb {D}}\) where \(z=x+{\textsf {i}}y\) with \(x,y\in {\mathbb {R}}.\) Thus, \(\Vert z\Vert ^2=x^2+y^2 \le 1.\) Then, \(x+{\textsf {i}}y \in {\mathbb {H}}\) and \(x+{\textsf {i}}y \in \overline{\mathbb {B}}=W_{{\mathbb {H}}}(A).\) Using [2, Lemma 2], it follows that \(z \in B(A).\) \(\square \)

The following lemma follows from [2, Theorem 2], which states that if \(W_{{\mathbb {H}}}(A)\) is convex, then \(B(A)=W(\omega (A)).\) We will provide a proof as it makes the paper more self-contained.

Lemma 3.3

For \(A \in {\mathbb {H}}^{n \times n},\) if \(B(A)=\overline{\mathbb {D}},\) then \(B(A)=W(\omega (A)).\)

Proof

If \(B(A)=\overline{\mathbb {D}},\) then \(\overline{\mathbb {D}} \subseteq W(\omega (A))\) using Lemma 3.1. Hence, it remains to show that \(W(\omega (A)) \subseteq \overline{\mathbb {D}}.\)

Since \(B(A)=\overline{\mathbb {D}},\) it follows that \(W_{{\mathbb {H}}}(A)=\overline{\mathbb {B}}.\) Let \(\alpha \in W(\omega (A)).\) Then, \(\alpha =u^*\omega (A)u,\) where \(u \in {\mathbb {C}}^{2n}\) and \(\Vert u\Vert =1.\) Decompose \(u=\begin{bmatrix} u_1 \\ u_2 \end{bmatrix},\) where \(u_1, u_2 \in {\mathbb {C}}^n,\) and define \(w:=\begin{bmatrix} -\overline{u}_2 \\ \overline{u}_1 \end{bmatrix}.\) Note that

$$\begin{aligned} w^*u&=-\overline{u}_2^*u_1+\overline{u}_1^*u_2\\ {}&= -u_2^Tu_1+u_1^Tu_2\\ {}&=-(u_1^Tu_2)^T +u_1^Tu_2=0. \end{aligned}$$

Define \(x:=u_1-{\textsf {j}}u_2 \in {\mathbb {H}}^n.\) Then, \(\Vert x\Vert =1\) and for some \(\beta \in {\mathbb {C}}\)

$$\begin{aligned} \begin{bmatrix} \alpha &{}{} \overline{\beta } \\ -\beta &{}{} \overline{\alpha } \end{bmatrix}&=\begin{bmatrix} u^* \\ w^* \end{bmatrix}\omega (A)\begin{bmatrix} u&w \end{bmatrix}\\ {}&=\begin{bmatrix} u_1^* &{}{} u_2^* \\ -\overline{u}_2^* &{}{} \overline{u}_1^* \end{bmatrix} \omega (A) \begin{bmatrix} u_1 &{}{} -\overline{u}_2\\ u_2 &{}{} \overline{u}_1 \end{bmatrix} = \omega (x)^*\omega (A)\omega (x). \end{aligned}$$

Hence, \(\alpha +{\textsf {j}}\beta \in W_{{\mathbb {H}}}(A)=\overline{\mathbb {B}}.\) Thus, \( \Vert \alpha \Vert ^2 + \Vert \beta \Vert ^2 =\Vert \alpha + {\textsf {j}}\beta \Vert ^2\le 1,\) and therefore, \(\Vert \alpha \Vert \le 1,\) which proves \(\alpha \in \overline{\mathbb {D}}.\) \(\square \)

Before we can state and prove our main result, we require the following results. For the results that are known, we omit the proofs and provide only a reference.

Lemma 3.4

[2, Corollary 1] Let \(A \in {\mathbb {H}}^{n \times n}.\) Then,  \({\mathbb {R}} \cap W_{{\mathbb {H}}}(A)\) is a closed interval.

Lemma 3.5

[2, Theorem 3] For \(A \in {\mathbb {H}}^{n \times n},\) \(W_{{\mathbb {H}}}(A)\) is convex if and only if \({\mathbb {R}} \cap W_{{\mathbb {H}}}(A)=\left\{ Re (q): \, q \in W_{{\mathbb {H}}}(A) \right\} .\)

Corollary 3.6

Let \(A \in {\mathbb {H}}^{n \times n}\) and \(W_{{\mathbb {H}}}(A) \subseteq \overline{\mathbb {B}}.\) If \(\pm 1 \in W_{{\mathbb {H}}}(A),\) then \(W_{{\mathbb {H}}}(A)\) is convex.

Proof

Since \(W_{{\mathbb {H}}}(A) \subseteq \overline{\mathbb {B}},\) we have

$$\begin{aligned} \left\{ Re (q): \, q \in W_{{\mathbb {H}}}(A) \right\} \subseteq [-1,1]. \end{aligned}$$

As \(\pm 1 \in {\mathbb {R}}\cap W_{{\mathbb {H}}}(A),\) it follows by Lemma 3.4 that:

$$\begin{aligned}{}[-1,1]\subseteq {\mathbb {R}} \cap W_{{\mathbb {H}}}(A). \end{aligned}$$

Then

$$\begin{aligned}{}[-1,1]\subseteq {\mathbb {R}} \cap W_{{\mathbb {H}}}(A)\subseteq \left\{ Re (q): \, q \in W_{{\mathbb {H}}}(A) \right\} \subseteq [-1,1]. \end{aligned}$$

Thus

$$\begin{aligned} {\mathbb {R}} \cap W_{{\mathbb {H}}}(A)=\left\{ Re (q): \, q \in W_{{\mathbb {H}}}(A) \right\} , \end{aligned}$$

and by Lemma 3.5, \(W_{{\mathbb {H}}}(A)\) is convex. \(\square \)

Our main result is the following description of quaternionic matrices whose bild is the unit disk.

Theorem 3.7

Let \(A \in {\mathbb {H}}^{n \times n}.\) Then, \(B(A)=\overline{\mathbb {D}}\) if and only if there exist \(P_0,P_1 \in {\mathbb {H}}^{(n-1) \times n}\) such that

$$\begin{aligned} P_0^*P_0+P_1^*P_1=I_n \quad \text {and} \quad \frac{1}{2}A=P_0^*P_1. \end{aligned}$$

Proof

Assume there exist \(P_0,P_1 \in {\mathbb {H}}^{(n-1) \times n}\) such that

$$\begin{aligned} P_0^*P_0+P_1^*P_1=I_n \quad \text {and} \quad \frac{1}{2}A=P_0^*P_1. \end{aligned}$$

Then

$$\begin{aligned} \omega (P_0)^*\omega (P_0)+\omega (P_1)^*\omega (P_1)=I_{2n} \quad \text {and} \quad \frac{1}{2}\omega (A)=\omega (P_0)^*\omega (P_1). \end{aligned}$$

Since \(\omega (P_0), \omega (P_1) \in {\mathbb {C}}^{2(n-1) \times 2n},\) it follows by [16, Theorem 4.5] that \(W(\omega (A))=\overline{\mathbb {D}},\) and hence, by Lemma 3.1, we have \(B(A) \subseteq \overline{\mathbb {D}}.\) From [2, Theorem 1], it follows:

$$\begin{aligned} W_{{\mathbb {H}}}(A)\subseteq \left\{ a_0+p: \, a_0 \in {\mathbb {R}}, \, p \in {\mathbb {H}},\, Re (p)=0, \, a_0+\Vert p\Vert {\textsf {i}}\in B(A) \right\} \!. \end{aligned}$$

Therefore, it is clear that \(W_{{\mathbb {H}}}(A) \subseteq \overline{\mathbb {B}}.\)

Next, note that

$$\begin{aligned} I_{n} -\frac{1}{2}A - \frac{1}{2}A^*=(P_0-P_1)^*(P_0-P_1) \ge 0, \end{aligned}$$

where \(P_0-P_1 \in {\mathbb {H}}^{(n-1) \times n}.\) Hence, as \(P_0-P_1\) has fewer rows than columns, \(Ker (P_0-P_1)\) is non-trivial. Take \(x \in {\mathbb {H}}^n\) where \(x \in Ker (P_0-P_1)\) and \(\Vert x\Vert =1.\) Then

$$\begin{aligned} 0&=x^*(P_0-P_1)^*(P_0-P_1)x\\&= x^*\left( I_{n} -\frac{1}{2}A - \frac{1}{2}A^*\right) x \\&= x^*x-\frac{1}{2}x^*Ax - \frac{1}{2}x^*A^*x\\&=1-Re \left( x^*Ax\right) . \end{aligned}$$

Since \(x^*Ax \in W_{{\mathbb {H}}}(A),\) it follows that \(\Vert x^*Ax\Vert \le 1,\) and hence, \(1=x^*Ax \in W_{{\mathbb {H}}}(A).\) In a similar way, it can be proven that \(-1 \in W_{{\mathbb {H}}}(A)\) using

$$\begin{aligned} I_{n} +\frac{1}{2}A + \frac{1}{2}A^*=(P_0+P_1)^*(P_0+P_1) \ge 0. \end{aligned}$$

Hence, by Corollary 3.6, it follows that \(W_{{\mathbb {H}}}(A)\) is convex, and therefore, \(B(A)=W(\omega (A))=\overline{\mathbb {D}}\) by Lemma 3.3.

For the other direction, assume \(B(A)=\overline{\mathbb {D}}.\) By Lemma 3.3, it then follows that \(B(A)=W(\omega (A)).\) Thus, \(|x^*\omega (A)x| \le 1\) for all \(x \in {\mathbb {C}}^{2n}\) where \(\Vert x\Vert =1.\) This is equivalent to

$$\begin{aligned} Re \left( e^{\textsf{i}\theta }x^*\omega (A)x\right) \le 1, \quad \text {for all} \ \theta \ \text {and} \ \Vert x\Vert =1, \end{aligned}$$

which can be rewritten as

$$\begin{aligned} \frac{1}{2}x^*\left( e^{\textsf{i}\theta }\omega (A)+e^{-\textsf{i}\theta }\omega (A)^* \right) x\le x^*x, \quad \text {for all} \ \theta \ \text {and} \ \Vert x\Vert =1. \end{aligned}$$

Hence

$$\begin{aligned} I_{2n}-\frac{1}{2}\left( e^{\textsf{i}\theta }\omega (A)+e^{-\textsf{i}\theta } \omega (A)^*\right) \ge 0 \quad \text {for all} \ \theta , \end{aligned}$$

or equivalently

$$\begin{aligned} I_{2n}-\frac{1}{2}\left( z\omega (A)+\frac{1}{z}\omega (A)^*\right) \ge 0 \quad \text {for all} \ z \in {\mathbb {T}}. \end{aligned}$$

Define \(Q_0:=I_{2n}\) and \(Q_1:=-\frac{1}{2}\omega (A).\) Then, \(Q_j \in \Omega _{2n}\) for \(j=-1,0,1,\) where \(Q_{-1}:=Q_1^*,\) and

$$\begin{aligned} Q(z):=\sum _{j=-1}^1 Q_jz^j \ge 0 \quad \text {for all} \ z \in {\mathbb {T}}. \end{aligned}$$

By the Fejér–Riesz factorization (Theorem 2.3), there exist \({\widetilde{P}}_0, {\widetilde{P}}_1 \in \Omega _{2n},\) such that

$$\begin{aligned} Q(z)=\left( {\widetilde{P}}_0+z{\widetilde{P}}_1\right) ^* \left( {\widetilde{P}}_0+z{\widetilde{P}}_1\right) , \quad \text {for all} \ z \in {\mathbb {T}}, \end{aligned}$$

and in fact, we choose the factorization that corresponds to the choice \(X_{\textrm{min}}\) in Theorem 2.3. By equating coefficients, it follows that:

$$\begin{aligned} P_0^*P_0+P_1^*P_1=I_n \quad \text {and} \quad \frac{1}{2}A=P_0^*P_1, \end{aligned}$$

where \(\omega (P_0)={\widetilde{P}}_0\) and \(\omega (P_1)={\widetilde{P}}_1,\) for some \(P_0,P_1 \in {\mathbb {H}}^{n \times n}.\) Hence, it remains to show that we can reduce the size of \(P_0\) and \(P_1.\)

Note that since \(\overline{z} \in W(\omega (A))\) for any \(z \in {\mathbb {T}},\) there exists a \(y \in {\mathbb {C}}^{2n},\) \(\Vert y\Vert =1,\) such that \(\overline{z}=y^*\omega (A)y.\) Then

$$\begin{aligned} y^*\left( {\widetilde{P}}_0+z{\widetilde{P}}_1\right) ^* \left( {\widetilde{P}}_0+z{\widetilde{P}}_1\right) y&=y^*Q(z)y\\&=y^*\left( I_{2n}-\frac{1}{2}z\omega (A)-\frac{1}{2}\frac{1}{z}\omega (A)^*\right) y=0, \end{aligned}$$

which proves that \({\widetilde{P}}_0+z{\widetilde{P}}_1\) is singular for all \(z \in {\mathbb {T}}.\) We argue by contradiction to show that \({\widetilde{P}}_0\) is singular. If \({\widetilde{P}}_0\) is non-singular, then

$$\begin{aligned} {\widetilde{P}}_0+z{\widetilde{P}}_1={\widetilde{P}}_0\left( I_{2n} +z{\widetilde{P}}_0^{-1}{\widetilde{P}}_1\right) ={\widetilde{P}}_0 \left( I_{2n}-z\left( -{\widetilde{P}}_0^{-1}{\widetilde{P}}_1\right) \right) . \end{aligned}$$

Using that

$$\begin{aligned} det \left( I_{2n}-z\left( -{\widetilde{P}}_0^{-1}{\widetilde{P}}_1\right) \right) =0 \end{aligned}$$

only holds when \(\frac{1}{z}\) is an eigenvalue of \(-{\widetilde{P}}_0^{-1}{\widetilde{P}}_1,\) of which there are at most \(2n,\) whereas \({\widetilde{P}}_0+z{\widetilde{P}}_1\) is singular for all \(z \in {\mathbb {T}},\) we arrive at a contradiction. Hence, \({\widetilde{P}}_0\) is singular. By a similar argument, it follows that \({\widetilde{P}}_1\) is singular.

Write

$$\begin{aligned} 0 \le \begin{bmatrix} {\widetilde{P}}_0 \\ {\widetilde{P}}_1 \end{bmatrix}^*\begin{bmatrix} {\widetilde{P}}_0&{\widetilde{P}}_1 \end{bmatrix}=:\begin{bmatrix} P &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-P \end{bmatrix}, \end{aligned}$$

where we recall that we made the choice corresponding to \(X_{\textrm{min}}\) in Theorem 2.3. This results in choosing the largest P in the Loewner order, which we denote by \(P_{max }.\) Furthermore

$$\begin{aligned} {{\,\mathrm{{rank}}\,}}\begin{bmatrix} P_{max } &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-P_{max } \end{bmatrix}={{\,\mathrm{{rank}}\,}}\, P_{max }. \end{aligned}$$

To see this, we use the generalized Schur complement of \(P_{max }\) which equals

$$\begin{aligned} B:=I_{2n}-P_{max } - \frac{1}{2}\omega (A)^*P_{max }^{(-1)}\frac{1}{2}\omega (A)\ge 0, \end{aligned}$$

with \(P_{max }^{(-1)}\) the Moore–Penrose generalized inverse of \(P_{max }.\) From

$$\begin{aligned} {{\,\mathrm{{rank}}\,}}\begin{bmatrix} P_{max } &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-P_{max } \end{bmatrix}={{\,\mathrm{{rank}}\,}}\, P_{max } + {{\,\mathrm{{rank}}\,}}\, B, \end{aligned}$$

it follows that

$$\begin{aligned} {{\,\mathrm{{rank}}\,}}\begin{bmatrix} P_{max } &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-P_{max } \end{bmatrix}\ge {{\,\mathrm{{rank}}\,}}\, P_{max }. \end{aligned}$$

Suppose

$$\begin{aligned} {{\,\mathrm{{rank}}\,}}\begin{bmatrix} P_{max } &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-P_{max } \end{bmatrix}>{{\,\mathrm{{rank}}\,}}\, P_{max }. \end{aligned}$$

Then, \(B \ne 0.\) Furthermore,

$$\begin{aligned} \begin{bmatrix} P_{max } &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-P_{max }-B \end{bmatrix}= \begin{bmatrix} P_{max } &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} \frac{1}{2}\omega (A)^*P_{max }^{(-1)}\frac{1}{2}\omega (A) \end{bmatrix} \ge 0. \end{aligned}$$

Now

$$\begin{aligned} \begin{bmatrix} P_{max } +B &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-\left( P_{max }+B\right) \end{bmatrix} \ge 0, \end{aligned}$$

and since \(P_{max } +B\ge P_{max }\) with \(B\ne 0,\) this contradicts the maximality of \(P_{max }.\)

Let

$$\begin{aligned} k={{\,\mathrm{{rank}}\,}}\, P_{max }={{\,\mathrm{{rank}}\,}}\begin{bmatrix} P_{max } &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-P_{max } \end{bmatrix}. \end{aligned}$$

Since \(P_{\textrm{max}} = {\widetilde{P}}_0^*{\widetilde{P}}_0\) and \({{\,\mathrm{{rank}}\,}}\, {\widetilde{P}}_0 <2n,\) we have that \(k<2n.\) Since 0 as an eigenvalue of \(P_{\textrm{max}}\) has even multiplicity (see, e.g., [21, Corollary 5.1]), we have that \(k\le 2(n-1).\) Then, by the spectral theorem in quaternions [9, Proposition 3.7], there exist \(R_0, R_1 \in \Omega _{2(n-1), 2n},\) such that

$$\begin{aligned} \begin{bmatrix} P_{max } &{} \frac{1}{2}\omega (A) \\ \frac{1}{2}\omega (A)^* &{} I_{2n}-P_{max } \end{bmatrix}=\begin{bmatrix} R_0 \\ R_1 \end{bmatrix}^*\begin{bmatrix} R_0&R_1 \end{bmatrix}. \end{aligned}$$

Now, \(S_0=\omega ^{-1}(R_0)\in {\mathbb {H}}^{(n-1)\times n}, S_1=\omega ^{-1}(R_1) \in {\mathbb {H}}^{(n-1)\times n}\) gives us the desired matrices, so that \(A=2S_0^*S_1\) and \(S_0^*S_0+S_1^*S_1=I_n.\) \(\square \)

Let us illustrate Theorem 3.7 in a simple example.

Example 3.8

Let \(A=\begin{bmatrix} 0 &{} a_1 &{} 0 \\ 0 &{} 0 &{} a_2 \\ 0 &{} 0 &{} 0 \end{bmatrix}\in {\mathbb {H}}^{3\times 3}.\) Then, one easily checks that B(A) is a disk with center 0 (see also [5]). In this case, let

$$\begin{aligned} P_0 = \begin{pmatrix} 1 &{} 0 &{} 0 \\ 0 &{} \frac{\bar{a}_2}{2} &{} 0 \end{pmatrix},\; P_1 = \begin{pmatrix} 0 &{} \frac{a_1}{2} &{} 0 \\ 0 &{} 0 &{} 1 \end{pmatrix}. \end{aligned}$$

Then, \(A=2P_0^*P_1.\) Moreover, when \(\Vert a_1\Vert ^2 + \Vert a_2 \Vert ^2 = 4\) we obtain that \(P_0^*P_0+P_1^*P_1=I_3.\) Therefore, under this condition, we have that the numerical range of A is the closed unit ball in \({\mathbb {H}}.\)
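For complex choices of \(a_1\) and \(a_2\) (general quaternionic entries can be handled through \(\omega \)), the two identities are easily confirmed numerically; the following small check is ours and merely illustrates the example.

```python
import numpy as np

a1, a2 = np.sqrt(2), np.sqrt(2) * 1j          # complex choices with |a1|^2 + |a2|^2 = 4
A = np.array([[0, a1, 0], [0, 0, a2], [0, 0, 0]])
P0 = np.array([[1, 0, 0], [0, np.conj(a2) / 2, 0]])
P1 = np.array([[0, a1 / 2, 0], [0, 0, 1]])
print(np.allclose(A, 2 * P0.conj().T @ P1))                          # True
print(np.allclose(P0.conj().T @ P0 + P1.conj().T @ P1, np.eye(3)))   # True
```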