1 Introduction

Let \({\mathbb {T}}:=\{z\in {\mathbb {C}}:\vert z\vert =1\}\) be the unit circle in \({\mathbb {C}}\). We write \(\sigma \) for the normalized Lebesgue measure \(d\theta /(2\pi )\) on \(([-\pi ,\pi ), {\mathcal {B}}([-\pi ,\pi )))\), where \({\mathcal {B}}([-\pi ,\pi ))\) is the Borel \(\sigma \)-algebra on \([-\pi ,\pi )\); thus we have \(\sigma ([-\pi ,\pi ))=1\). For \(p\in [1,\infty )\), we write \(L_p({\mathbb {T}})\) for the Lebesgue space of measurable functions \(f:{\mathbb {T}}\rightarrow {\mathbb {C}}\) such that \(\Vert f\Vert _p<\infty \), where \(\Vert f\Vert _p:=\{\int _{-\pi }^{\pi }\vert f(e^{i\theta })\vert ^p \sigma (d\theta )\}^{1/p}\). Let \(L_p^{m\times n}({\mathbb {T}})\) be the space of \({\mathbb {C}}^{m\times n}\)-valued functions on \({\mathbb {T}}\) whose entries belong to \(L_p({\mathbb {T}})\).

Let \(d\in {\mathbb {N}}\). For \(n\in {\mathbb {N}}\), we consider the block Toeplitz matrix

$$\begin{aligned} T_n(w) :=\left( \begin{matrix} \gamma (0) &{} \gamma (-1) &{} \cdots &{} \gamma (-n+1)\\ \gamma (1) &{} \gamma (0) &{} \cdots &{} \gamma (-n+2)\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \gamma (n-1) &{} \gamma (n-2) &{} \cdots &{} \gamma (0) \end{matrix} \right) \in {\mathbb {C}}^{dn\times dn}, \end{aligned}$$

where

$$\begin{aligned} \gamma (k):=\int _{-\pi }^{\pi }e^{-ik\theta }w(e^{i\theta })\frac{d\theta }{2\pi } \in {\mathbb {C}}^{d\times d}, \quad k\in {\mathbb {Z}}, \end{aligned}$$
(1.1)

and the symbol w satisfies the following two conditions:

$$\begin{aligned}&\displaystyle w\in L^{d\times d}_1({\mathbb {T}}) \hbox { and }w(e^{i\theta }) \hbox { is a positive Hermitian matrix }\sigma \hbox {-a.e.,} \end{aligned}$$
(1.2)
$$\begin{aligned}&\displaystyle w^{-1}\in L^{d\times d}_1({\mathbb {T}}). \end{aligned}$$
(1.3)
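To fix ideas, the Fourier coefficients (1.1) and the matrix \(T_n(w)\) can be computed numerically by discretizing the integral with a DFT. The sketch below uses a hypothetical scalar example (\(d=1\), \(w(e^{i\theta })=\vert 1+0.5e^{i\theta }\vert ^2\), the spectral density of an MA(1) model) that is not taken from the text.

```python
import numpy as np

# Hypothetical d = 1 example: w(e^{it}) = |1 + 0.5 e^{it}|^2 (MA(1) spectral density).
N = 512
t = 2 * np.pi * np.arange(N) / N
w = np.abs(1.0 + 0.5 * np.exp(1j * t)) ** 2

# gamma(k) = (1/2pi) int e^{-ikt} w(e^{it}) dt, approximated by a DFT average;
# exact here up to roundoff, since w is a trigonometric polynomial of degree 1.
gamma = np.fft.fft(w) / N   # gamma[k] for k >= 0; gamma(-k) corresponds to gamma[N - k]

def T_n(n):
    """Assemble T_n(w): the (s, t) entry is gamma(s - t)."""
    idx = np.subtract.outer(np.arange(n), np.arange(n))  # s - t, possibly negative
    return gamma[idx % N].real  # gamma is real here because w is real and even

print(np.round(T_n(4), 6))
```

For this w one finds \(\gamma (0)=1.25\), \(\gamma (\pm 1)=0.5\), and \(\gamma (k)=0\) otherwise, so \(T_n(w)\) is tridiagonal.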

Let \(\{X_k:k\in {\mathbb {Z}}\}\) be a \({\mathbb {C}}^d\)-valued, centered, weakly stationary process that has spectral density w, hence autocovariance function \(\gamma \). Then the conditions (1.2) and (1.3) imply that \(\{X_k\}\) is minimal (see Sect. 10 of [21, Chapter II]).

In this paper, we present novel explicit formulas for \(T_n(w)^{-1}\) (Theorem 2.1), which are especially useful for large n (see [2]). The formulas are new even for \(d=1\). Their main ingredients are the Fourier coefficients of \(h^*h_{\sharp }^{-1}=h^{-1}h_{\sharp }^*\), where h and \(h_{\sharp }\) are \({\mathbb {C}}^{d\times d}\)-valued outer functions on \({\mathbb {T}}\) such that

$$\begin{aligned} w(e^{i\theta }) = h(e^{i\theta }) h(e^{i\theta })^* = h_{\sharp }(e^{i\theta })^*h_{\sharp }(e^{i\theta }), \qquad \sigma \hbox {-a.e.} \end{aligned}$$
(1.4)

(see [10]; see also Sect. 2). We note that the unitary matrix valued function \(h^*h_{\sharp }^{-1}=h^{-1}h_{\sharp }^*\) on \({\mathbb {T}}\) attached to w is called the phase function of w (see page 428 in [20]).

Let \(\{X_k\}\) be as above, and let \(\{X^{\prime }_k: k\in {\mathbb {Z}}\}\) be the dual process of \(\{X_k\}\) (see [19]; see also Sect. 2 below). The dual process \(\{X^{\prime }_k\}\) plays a central role in the proof of the explicit formulas for \(T_n(w)^{-1}\); in fact, the key to the proof is the following equality (Theorem 3.1):

$$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = \langle X^{\prime }_s, P_{[1,n]}X^{\prime }_t\rangle , \quad s, t \in \{1,\dots ,n\}. \end{aligned}$$
(1.5)

Here, \(\langle \cdot , \cdot \rangle \) stands for the Gram matrix (see Sect. 3) and \(P_{[1,n]}X^{\prime }_t\) denotes the best linear predictor of \(X^{\prime }_t\) based on the observations \(X_{1},\dots ,X_{n}\) (see Sect. 2 for the precise definition). Moreover, for \(n\in {\mathbb {N}}\), \(A \in {\mathbb {C}}^{dn \times dn}\) and \(s, t\in \{1,\dots ,n\}\), we write \(A^{s,t}\in {\mathbb {C}}^{d\times d}\) for the \((s,t)\) block of A; thus \(A = (A^{s,t})_{1\le s, t\le n}\). The equality (1.5) enables us to apply the \(P_{[1,n]}\)-related methods developed in [11, 12, 14, 15, 16] and others to derive the explicit formulas for \(T_n(w)^{-1}\).
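In this block notation, extracting \(A^{s,t}\) from a \(dn\times dn\) matrix is a plain slicing operation. The following sketch (with hypothetical sizes, not from the text) makes the 1-based convention explicit.

```python
import numpy as np

def block(A, s, t, d):
    """Return the (s, t) block A^{s,t} (1-based, blocks of size d x d) of a dn x dn matrix."""
    return A[(s - 1) * d : s * d, (t - 1) * d : t * d]

# Hypothetical example with d = 2, n = 3.
d, n = 2, 3
A = np.arange((d * n) ** 2, dtype=float).reshape(d * n, d * n)
print(block(A, 1, 2, d))  # rows 0:2 and columns 2:4 of A
```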

We illustrate the usefulness of the explicit formulas for \(T_n(w)^{-1}\) by two applications. The first one is a strong convergence result for solutions of block Toeplitz systems. For this application, we assume (1.2) as well as the following condition:

$$\begin{aligned} \sum _{k=-\infty }^{\infty } \Vert \gamma (k)\Vert <\infty \hbox { and } \min _{z\in {\mathbb {T}}}\det w(z)>0. \end{aligned}$$
(1.6)

Here, for \(a\in {\mathbb {C}}^{d\times d}\), \(\Vert a\Vert \) denotes the operator norm of a. The condition (1.6) implies that \(\{X_k\}\) with spectral density w is a short-memory process. We note that (1.3) follows from (1.2) and (1.6) (see Sect. 4). Under (1.2) and (1.6), for \(n\in {\mathbb {N}}\) and a \({\mathbb {C}}^{d\times d}\)-valued sequence \(\{y_k\}_{k=1}^{\infty }\) such that \(\sum _{k=1}^{\infty } \Vert y_k\Vert < \infty \), let

$$\begin{aligned} Z_n=(z_{n,1}^{\top },\dots ,z_{n,n}^{\top })^{\top }\in {\mathbb {C}}^{dn\times d}\ \ \hbox {with}\ \ z_{n,k}\in {\mathbb {C}}^{d\times d},\ k\in \{1,\dots ,n\}, \end{aligned}$$
(1.7)

be the solution to the block Toeplitz system

$$\begin{aligned} T_n(w)Z_n = Y_n, \end{aligned}$$
(1.8)

where

$$\begin{aligned} Y_n := (y_1^{\top },\dots ,y_n^{\top })^{\top }\in {\mathbb {C}}^{dn\times d}. \end{aligned}$$
(1.9)

Also, let

$$\begin{aligned} Z_{\infty }=(z_{1}^{\top },z_{2}^{\top },\dots )^{\top }\ \ \hbox {with}\ \ z_{k}\in {\mathbb {C}}^{d\times d},\ k \in {\mathbb {N}}, \end{aligned}$$
(1.10)

be the solution to the corresponding infinite block Toeplitz system

$$\begin{aligned} T_{\infty }(w)Z_{\infty } = Y_{\infty }, \end{aligned}$$
(1.11)

where

$$\begin{aligned} T_{\infty }(w) := \left( \begin{matrix} \gamma (0) &{} \gamma (-1) &{} \gamma (-2) &{} \cdots \\ \gamma (1) &{} \gamma (0) &{} \gamma (-1) &{} \cdots \\ \gamma (2) &{} \gamma (1) &{} \gamma (0) &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{matrix} \right) \end{aligned}$$
(1.12)

and

$$\begin{aligned} Y_{\infty } := (y_1^{\top },y_2^{\top },\dots )^{\top }. \end{aligned}$$
(1.13)

Then, our result (Theorem 4.1) reads as follows:

$$\begin{aligned} \lim _{n\rightarrow \infty } \sum _{k=1}^n \Vert z_{n,k} - z_k\Vert = 0. \end{aligned}$$
(1.14)
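The convergence (1.14) can be observed numerically. The sketch below solves (1.8) with a dense solver for a hypothetical scalar short-memory example (\(\gamma (0)=1.25\), \(\gamma (\pm 1)=0.5\), \(\gamma (k)=0\) otherwise, and \(y_k=2^{-k}\), none of which is taken from the text); the leading coefficients of \(Z_n\) stabilize as n grows.

```python
import numpy as np

def solve_Zn(n):
    """Solve T_n(w) Z_n = Y_n for a hypothetical tridiagonal (MA(1)) Toeplitz matrix."""
    T = 1.25 * np.eye(n) + 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
    y = 0.5 ** np.arange(1, n + 1)  # absolutely summable right-hand side y_k = 2^{-k}
    return np.linalg.solve(T, y)

z40, z80 = solve_Zn(40), solve_Zn(80)
# The leading coefficients already agree closely, consistent with z_{n,k} -> z_k.
print(np.abs(z40[:10] - z80[:10]).max())
```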

We explain the background of the result (1.14). As above, let \(\{X_k:k\in {\mathbb {Z}}\}\) be a \({\mathbb {C}}^d\)-valued, centered, weakly stationary process that has spectral density w. For \(n\in {\mathbb {N}}\), the finite and infinite predictor coefficients \(\phi _{n,k}\in {\mathbb {C}}^{d\times d}\), \(k\in \{1,\dots ,n\}\), and \(\phi _k\), \(k\in {\mathbb {N}}\), of \(\{X_k\}\) are defined by

$$\begin{aligned} P_{[1,n]} X_{n+1} = \sum _{k=1}^n \phi _{n,k} X_{n+1-k} \quad \hbox {and} \quad P_{(-\infty ,n]} X_{n+1} = \sum _{k=1}^{\infty } \phi _{k} X_{n+1-k}, \end{aligned}$$

respectively; see Sect. 3 for the precise definitions of \(P_{[1,n]}\) and \(P_{(-\infty ,n]}\). We note that \(\sum _{k=1}^{\infty } \Vert \phi _k\Vert < \infty \) holds under (1.2) and (1.6) (see Sect. 4 below and (2.16) in [16]). Baxter’s inequality in [1, 5, 9] states that, under (1.2) and (1.6), there exists \(K\in (0,\infty )\) such that

$$\begin{aligned} \sum _{k=1}^{n}\Vert \phi _{n,k} - \phi _k \Vert \le K\sum _{k=n+1}^{\infty }\Vert \phi _k\Vert ,\quad n\in {\mathbb {N}}. \end{aligned}$$
(1.15)

In particular, we have

$$\begin{aligned} \lim _{n\rightarrow \infty } \sum _{k=1}^n \Vert \phi _{n,k} - \phi _k\Vert = 0. \end{aligned}$$
(1.16)

If we put \({\tilde{w}}(e^{i\theta }):=w(e^{-i\theta })\), then \((\phi _{n,1},\dots ,\phi _{n,n})\) is the solution to the block Toeplitz system

$$\begin{aligned} T_n({\tilde{w}})(\phi _{n,1},\dots ,\phi _{n,n})^* = (\gamma (1),\dots ,\gamma (n))^*, \end{aligned}$$

called the Yule–Walker equation, while \((\phi _{1},\phi _{2}, \dots )\) is the solution to the corresponding infinite block Toeplitz system

$$\begin{aligned} T_{\infty }({\tilde{w}})(\phi _{1},\phi _{2},\dots )^* = (\gamma (1),\gamma (2),\dots )^*. \end{aligned}$$

Clearly, \({\tilde{w}}\) satisfies (1.2) and (1.6) since w does. Therefore, our result (1.14) can be viewed as an extension of (1.16). It should be noted, however, that we prove (1.14) directly, without proving an analogue of Baxter’s inequality (1.15).
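In the scalar case \(d=1\), the finite Yule–Walker system above can be solved by the classical Durbin recursion in \(O(n^2)\) operations. The sketch below (with a hypothetical MA(1) autocovariance, not from the text) cross-checks the recursion against a direct dense solve.

```python
import numpy as np

def durbin(r):
    """Durbin's recursion: given r = (rho_1, ..., rho_n), solve T y = -r for the
    symmetric Toeplitz matrix T with unit diagonal built from (1, rho_1, ..., rho_{n-1});
    the Yule-Walker coefficients are phi = -y."""
    n = len(r)
    y = np.zeros(n)
    y[0] = -r[0]
    alpha, beta = -r[0], 1.0
    for k in range(n - 1):
        beta *= 1.0 - alpha * alpha
        alpha = -(r[k + 1] + r[k::-1] @ y[: k + 1]) / beta
        y[: k + 1] += alpha * y[k::-1]
        y[k + 1] = alpha
    return -y

# Hypothetical MA(1) autocovariance: gamma(0) = 1.25, gamma(1) = 0.5, gamma(k) = 0 otherwise.
n = 8
gamma = np.zeros(n + 1)
gamma[0], gamma[1] = 1.25, 0.5
phi = durbin(gamma[1 : n + 1] / gamma[0])  # solves T_n phi = (gamma(1), ..., gamma(n))

T = 1.25 * np.eye(n) + 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
print(np.abs(T @ phi - gamma[1 : n + 1]).max())
```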

The convergence result (1.16) has various applications in time series analysis, such as the autoregressive sieve bootstrap (see, e.g., [16] and the references therein), while Toeplitz systems of the form (1.8) appear in various fields, such as the filtering of signals. Therefore the extension (1.14), as well as the other results explained below, may prove useful in such fields. We note that Baxter’s inequality (1.15), hence (1.16), has also been proved for univariate and multivariate FARIMA (fractional autoregressive integrated moving-average) processes, which are long-memory processes, in [14] and [16], respectively. FARIMA processes have singular spectral densities w, but our explicit formulas for \(T_n(w)^{-1}\) above cover them as well, since the formulas only assume minimality. Applications of the explicit formulas to univariate and multivariate FARIMA processes will be discussed elsewhere. However, the problem of proving results of the type (1.14) for FARIMA processes remains open.

The second application of the explicit formulas for \(T_n(w)^{-1}\) is closed-form formulas for \(T_n(w)^{-1}\) with rational w that corresponds to a univariate (\(d=1\)) or multivariate (\(d\ge 2\)) ARMA (autoregressive moving-average) process (Theorem 5.2). More precisely, we assume that w is of the form

$$\begin{aligned} w(e^{i\theta })=h(e^{i\theta })h(e^{i\theta })^*,\quad \theta \in [-\pi ,\pi ), \end{aligned}$$
(1.17)

where \(h:{\mathbb {T}}\rightarrow {\mathbb {C}}^{d\times d}\) satisfies the following condition:

$$\begin{aligned} \begin{aligned}&\hbox {the entries of }h(z) \hbox { are rational functions in }z\, \hbox {that have}\\&\hbox {no poles in }{\overline{{\mathbb {D}}}}, \hbox { and }\det h(z) \hbox { has no zeros in }{\overline{{\mathbb {D}}}}. \end{aligned} \end{aligned}$$
(1.18)

Here \({\overline{{\mathbb {D}}}}:=\{z\in {\mathbb {C}}:\vert z\vert \le 1\}\) is the closed unit disk in \({\mathbb {C}}\). The closed-form formulas for \(T_n(w)^{-1}\) consist of several building block matrices that are of fixed sizes independent of n. The significance of the formulas for \(T_n(w)^{-1}\) is that they provide us with a linear-time, or O(n), algorithm to compute the solution \(Z\in {\mathbb {C}}^{dn\times d}\) to the block Toeplitz system

$$\begin{aligned} T_n(w)Z = Y \end{aligned}$$
(1.19)

for \(Y\in {\mathbb {C}}^{dn\times d}\) (see Sect. 6). The famous Durbin–Levinson algorithm solves Eq. (1.19) for more general w in \(O(n^2)\) time. Algorithms for Toeplitz linear systems that run faster than \(O(n^2)\) are called superfast. While our algorithm is restricted to the class of w corresponding to ARMA processes, this class is important in applications, and a linear-time algorithm is superfast in the strongest possible sense: no algorithm can run faster than O(n).

Toeplitz matrices appear in a variety of fields, including operator theory, orthogonal polynomials on the unit circle, time series analysis, engineering, and physics. Therefore, there is a vast amount of literature on Toeplitz matrices. Here, we refer to [2, 3, 6, 8, 22, 23] and [24] as textbook treatments. For example, in [6, III], the Gohberg-Semencul formulas in [7], which express the inverse of a Toeplitz matrix as a difference of products of lower and upper triangular Toeplitz matrices, are explained.

After this work was completed, the author learned of [25] by Subba Rao and Yang, where they also provide an explicit series expansion for \(T_n(w)^{-1}\) that corresponds to a univariate stationary process satisfying some conditions (see [25], Sect. 3.2). The main aim of [25] is to reconcile the Gaussian and Whittle likelihoods, and the series expansion in [25] is tailored to this purpose, using the complete DFT (discrete Fourier transform) introduced in [25]. It should be noted that \(T_n(w)^{-1}\) appears in the Gaussian likelihood, while the Whittle likelihood is based on the ordinary DFT. Since most results of the present paper directly concern \(T_n(w)^{-1}\), some of them may also be useful for studies related to the Gaussian likelihood.

This paper is organized as follows. We state the explicit formulas for \(T_n(w)^{-1}\) in Sect. 2. In Sect. 3, we first prove (1.5) and then use it to prove the explicit formulas for \(T_n(w)^{-1}\). In Sect. 4, we prove (1.14) for w satisfying (1.2) and (1.6), using the explicit formulas for \(T_n(w)^{-1}\). In Sect. 5, we prove the closed-form formulas for \(T_n(w)^{-1}\) with w satisfying (1.18), using the explicit formulas for \(T_n(w)^{-1}\). In Sect. 6, we explain how the results in Sect. 5 give a linear-time algorithm to compute the solution to (1.19). Finally, the Appendix contains the omitted proofs of two lemmas.

2 Explicit formulas

Let \({\mathbb {C}}^{m\times n}\) be the set of all complex \(m\times n\) matrices; we write \({\mathbb {C}}^d\) for \({\mathbb {C}}^{d\times 1}\). Let \(I_n\) be the \(n\times n\) unit matrix. For \(a\in {\mathbb {C}}^{m\times n}\), \(a^{\top }\) denotes the transpose of a, and \({\overline{a}}\) and \(a^*\) the complex and Hermitian conjugates of a, respectively; thus, in particular, \(a^*:={\overline{a}}^{\top }\). For \(a\in {\mathbb {C}}^{d\times d}\), we write \(\Vert a\Vert \) for the operator norm of a:

$$\begin{aligned} \Vert a\Vert :=\sup _{u\in {\mathbb {C}}^d, \vert u\vert \le 1}\vert au\vert . \end{aligned}$$

Here \(\vert u\vert :=(\sum _{i=1}^d\vert u^i\vert ^2)^{1/2}\) denotes the Euclidean norm of \(u=(u^1,\dots ,u^d)^{\top }\in {\mathbb {C}}^d\). For \(p\in [1,\infty )\) and \(K\subset {\mathbb {Z}}\), \(\ell _p^{d\times d}(K)\) denotes the space of \({\mathbb {C}}^{d\times d}\)-valued sequences \(\{a_k\}_{k\in K}\) such that \(\sum _{k\in K}\Vert a_k\Vert ^p<\infty \). We write \(\ell _{p+}^{d\times d}\) for \(\ell _p^{d\times d}({\mathbb {N}}\cup \{0\})\) and \(\ell _{p+}\) for \(\ell _{p+}^{1\times 1}=\ell _p^{1\times 1}({\mathbb {N}}\cup \{0\})\).

Recall \(\sigma \) from Sect. 1. The Hardy class \(H_2({\mathbb {T}})\) on \({\mathbb {T}}\) is the closed subspace of \(L_2({\mathbb {T}})\) consisting of \(f\in L_2({\mathbb {T}})\) such that \(\int _{-\pi }^{\pi }e^{im\theta }f(e^{i\theta })\sigma (d\theta )=0\) for \(m=1,2,\dots \). Let \(H_2^{m\times n}({\mathbb {T}})\) be the space of \({\mathbb {C}}^{m\times n}\)-valued functions on \({\mathbb {T}}\) whose entries belong to \(H_2({\mathbb {T}})\). Let \({\mathbb {D}}:=\{z\in {\mathbb {C}}: \vert z\vert {<}1\}\) be the open unit disk in \({\mathbb {C}}\). We write \(H_2({\mathbb {D}})\) for the Hardy class on \({\mathbb {D}}\), consisting of holomorphic functions f on \({\mathbb {D}}\) such that \(\sup _{r\in [0,1)}\int _{-\pi }^{\pi }\vert f(re^{i\theta })\vert ^2\sigma (d\theta )<\infty \). As usual, we identify each function f in \(H_2({\mathbb {D}})\) with its boundary function \(f(e^{i\theta }):=\lim _{r\uparrow 1}f(re^{i\theta })\), \(\sigma \)-a.e., in \(H_2({\mathbb {T}})\). A function h in \(H_2^{d\times d}({\mathbb {T}})\) is called outer if \(\det h\) is a \({\mathbb {C}}\)-valued outer function, that is, \(\det h\) satisfies \(\log \vert \det h(0)\vert =\int _{-\pi }^{\pi }\log \vert \det h(e^{i\theta })\vert \sigma (d\theta )\) (see Definition 3.1 in [18]).
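For \(d=1\), the outer factor with \(\vert h\vert ^2=w\) on \({\mathbb {T}}\) and \(h(0)>0\) can be computed from the Fourier coefficients of \(\log w\) (the classical Szegő construction; the matrix case \(d\ge 2\) is genuinely harder). The sketch below is a hypothetical check, not from the text: it recovers the Taylor coefficients of \(h(z)=1+0.5z\) from \(w=\vert h\vert ^2\) alone.

```python
import numpy as np

# Hypothetical scalar symbol: w(e^{it}) = |1 + 0.5 e^{it}|^2, outer factor h(z) = 1 + 0.5 z.
N = 1024
t = 2 * np.pi * np.arange(N) / N
w = np.abs(1.0 + 0.5 * np.exp(1j * t)) ** 2

lam = np.fft.fft(np.log(w)) / N  # lambda_k = (1/2pi) int e^{-ikt} log w dt
# h(z) = exp(lambda_0/2 + sum_{k>=1} lambda_k z^k): keep only the analytic part.
analytic = np.zeros(N, dtype=complex)
analytic[0] = lam[0] / 2
analytic[1 : N // 2] = lam[1 : N // 2]
h_boundary = np.exp(np.fft.ifft(analytic * N))  # boundary values of h on the grid
c = np.fft.fft(h_boundary) / N                  # Taylor coefficients c_k of h
print(np.round(c[:3].real, 6))
```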

We assume that w satisfies (1.2) and (1.3). Then \(\log \det w\) is in \(L_1({\mathbb {T}})\) (see Sect. 3 in [16]). Therefore w has the decompositions (1.4) for two outer functions h and \(h_{\sharp }\) belonging to \(H_2^{d\times d}({\mathbb {T}})\), and h and \(h_{\sharp }\) are unique up to constant unitary factors (see Chapter II in [21] and Theorem 11 in [10]; see also Sect. 3 in [16]). We may take \(h_{\sharp }=h\) for the case \(d=1\) but there is no such simple relation between h and \(h_{\sharp }\) for \(d\ge 2\). We define the outer function \({\tilde{h}}\) in \(H_2^{d\times d}({\mathbb {T}})\) by

$$\begin{aligned} {\tilde{h}}(z) := \{h_{\sharp }({\overline{z}})\}^*. \end{aligned}$$
(2.1)

All of \(h^{-1}\), \(h_{\sharp }^{-1}\) and \({\tilde{h}}^{-1}\) also belong to \(H_2^{d\times d}({\mathbb {T}})\) since we have assumed (1.3).

We define four \({\mathbb {C}}^{d\times d}\)-valued sequences \(\{c_k\}\), \(\{a_k\}\), \(\{{\tilde{c}}_k\}\) and \(\{{\tilde{a}}_k\}\) by

$$\begin{aligned} h(z)&=\sum _{k=0}^{\infty }z^kc_k,\quad z\in {\mathbb {D}}, \end{aligned}$$
(2.2)
$$\begin{aligned} -h(z)^{-1}&=\sum _{k=0}^{\infty }z^ka_k,\quad z\in {\mathbb {D}}, \end{aligned}$$
(2.3)
$$\begin{aligned} {\tilde{h}}(z)&=\sum _{k=0}^{\infty }z^k{\tilde{c}}_k,\quad z\in {\mathbb {D}}, \end{aligned}$$
(2.4)

and

$$\begin{aligned} -{\tilde{h}}(z)^{-1}=\sum _{k=0}^{\infty }z^k{\tilde{a}}_k,\quad z\in {\mathbb {D}}, \end{aligned}$$
(2.5)

respectively. By (1.3), all of \(\{c_k\}\), \(\{a_k\}\), \(\{{\tilde{c}}_k\}\) and \(\{{\tilde{a}}_k\}\) belong to \(\ell _{2+}^{d\times d}\).

We define a \({\mathbb {C}}^{d\times d}\)-valued sequence \(\{\beta _k\}_{k=-\infty }^{\infty }\) as minus the Fourier coefficients of the phase function \(h^*h_{\sharp }^{-1}=h^{-1}h_{\sharp }^*\):

$$\begin{aligned} \beta _k= & {} -\int _{-\pi }^{\pi }e^{-ik\theta } h(e^{i\theta })^* h_{\sharp }(e^{i\theta })^{-1} \frac{d\theta }{2\pi }\nonumber \\= & {} -\int _{-\pi }^{\pi }e^{-ik\theta } h(e^{i\theta })^{-1} h_{\sharp }(e^{i\theta })^* \frac{d\theta }{2\pi }, \quad k \in {\mathbb {Z}}. \end{aligned}$$
(2.6)
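For \(d=1\) one may take \(h_{\sharp }=h\), so the phase function is \({\overline{h}}/h\) on \({\mathbb {T}}\), and (2.6) can be approximated by a DFT. The sketch below uses the hypothetical factor \(h(z)=1+\theta z\) with \(\theta =0.5\) (not an example from the text), for which \(\beta _k=-(1-\theta ^2)(-\theta )^k\) for \(k\ge 0\), \(\beta _{-1}=-\theta \), and \(\beta _k=0\) for \(k\le -2\).

```python
import numpy as np

theta, N = 0.5, 1024
t = 2 * np.pi * np.arange(N) / N
h = 1.0 + theta * np.exp(1j * t)  # hypothetical outer factor on the circle
phase = np.conj(h) / h            # unimodular phase function conj(h)/h

beta = -np.fft.fft(phase) / N     # beta_k = -(1/2pi) int e^{-ikt} phase dt
# Closed form for this h: beta_k = -(1 - theta^2) * (-theta)^k for k >= 0.
print(np.round(beta[:4].real, 6))
```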

For \(n\in {\mathbb {N}}\), \(u \in \{1,\dots ,n\}\) and \(k\in {\mathbb {N}}\), we can define the sequences \(\{b_{n,u,\ell }^k\}_{\ell =0}^{\infty }\in \ell _{2+}^{d\times d}\) by the recursion

$$\begin{aligned} \left\{ \begin{aligned} b_{n,u,\ell }^1&=\beta _{u+\ell },\\ b_{n,u,\ell }^{2k}&=\sum _{m=0}^{\infty } b_{n,u,m}^{2k-1} \beta _{n+1+m+\ell }^*,\quad b_{n,u,\ell }^{2k+1} =\sum _{m=0}^{\infty } b_{n,u,m}^{2k} \beta _{n+1+m+\ell } \end{aligned} \right. \end{aligned}$$
(2.7)

(see Sect. 3 below). Similarly, for \(n\in {\mathbb {N}}\), \(u \in \{1,\dots ,n\}\) and \(k\in {\mathbb {N}}\), we can define the sequences \(\{{\tilde{b}}_{n,u,\ell }^k\}_{\ell =0}^{\infty }\in \ell _{2+}^{d\times d}\) by the recursion

$$\begin{aligned} \left\{ \begin{aligned} {\tilde{b}}_{n,u,\ell }^1&=\beta _{n+1-u+\ell }^*,\\ {\tilde{b}}_{n,u,\ell }^{2k}&=\sum _{m=0}^{\infty } {\tilde{b}}_{n,u,m}^{2k-1} \beta _{n+1+m+\ell },\quad {\tilde{b}}_{n,u,\ell }^{2k+1} =\sum _{m=0}^{\infty } {\tilde{b}}_{n,u,m}^{2k} \beta _{n+1+m+\ell }^*. \end{aligned} \right. \end{aligned}$$
(2.8)

Recall from Sect. 1 that \((T_n(w)^{-1})^{s,t}\) denotes the \((s,t)\) block of \(T_n(w)^{-1}\). Since \(T_n(w)\), hence \(T_n(w)^{-1}\), is self-adjoint, we have

$$\begin{aligned} (T_n(w)^{-1})^{s,t} = ((T_n(w)^{-1})^{t,s})^*, \quad s, t \in \{1,\dots ,n\}. \end{aligned}$$
(2.9)

We use the following notation:

$$\begin{aligned} s\vee t:=\max (s, t), \quad s\wedge t:=\min (s, t). \end{aligned}$$

We are ready to state the explicit formulas for \(T_n(w)^{-1}\).

Theorem 2.1

We assume (1.2) and (1.3). Then the following two assertions hold.

  1. (i)

    For \(n\in {\mathbb {N}}\) and \(s, t\in \{1,\dots ,n\}\), we have

    $$\begin{aligned}&\left( T_n(w)^{-1}\right) ^{s,t} = \sum _{\ell = 1}^{s\wedge t} {\tilde{a}}_{s - \ell }^* {\tilde{a}}_{t - \ell } \nonumber \\&\qquad + \sum _{u=1}^t \sum _{k=1}^{\infty } \left\{ \sum _{\ell = 0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k-1} a_{n + 1 - s + \ell } + \sum _{\ell = 0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k} {\tilde{a}}_{s + \ell } \right\} ^* {\tilde{a}}_{t-u}. \end{aligned}$$
    (2.10)
  2. (ii)

    For \(n\in {\mathbb {N}}\) and \(s, t\in \{1,\dots ,n\}\), we have

    $$\begin{aligned}&\left( T_n(w)^{-1}\right) ^{s,t} = \sum _{\ell =s\vee t}^n a_{\ell - s}^* a_{\ell - t} \nonumber \\&\qquad + \sum _{u=t}^n \sum _{k=1}^{\infty } \left\{ \sum _{\ell = 0}^{\infty } b_{n,u,\ell }^{2k-1} {\tilde{a}}_{s + \ell } + \sum _{\ell = 0}^{\infty } b_{n,u,\ell }^{2k} a_{n + 1 - s + \ell } \right\} ^* a_{u-t}. \end{aligned}$$
    (2.11)

The proof of Theorem 2.1 will be given in Sect. 3.
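Although the proof is deferred to Sect. 3, formula (2.11) can already be checked numerically in the scalar case. The sketch below is a hypothetical verification with \(h=h_{\sharp }=1+\theta z\) (so that \({\tilde{a}}_k=a_k\)), \(\theta =0.5\), and ad hoc truncation parameters: it evaluates (2.11) via the recursion (2.7) and compares the result with a direct matrix inverse.

```python
import numpy as np

theta = 0.5
n, L, K = 6, 60, 10  # matrix size, series truncation, recursion depth (ad hoc)
M = n + 2 * L + 2
a = -(-theta) ** np.arange(M)                      # a_k from -1/h(z) = sum_k a_k z^k
beta = -(1 - theta**2) * (-theta) ** np.arange(M)  # beta_k for k >= 0, cf. (2.6)

def entry(s, t):
    """Right-hand side of (2.11) for d = 1, with all infinite sums truncated."""
    val = sum(a[l - s] * a[l - t] for l in range(max(s, t), n + 1))
    for u in range(t, n + 1):
        b_odd = beta[u : u + L].copy()  # b^1_{n,u,l} = beta_{u+l}, cf. (2.7)
        for _ in range(K):
            b_even = np.array([b_odd @ beta[n + 1 + l : n + 1 + l + L] for l in range(L)])
            inner = b_odd @ a[s : s + L] + b_even @ a[n + 1 - s : n + 1 - s + L]
            val += inner * a[u - t]     # all quantities are real in this example
            b_odd = np.array([b_even @ beta[n + 1 + l : n + 1 + l + L] for l in range(L)])
    return val

Tn = (1 + theta**2) * np.eye(n) + theta * (np.eye(n, k=1) + np.eye(n, k=-1))
Tn_inv = np.linalg.inv(Tn)
approx = np.array([[entry(s, t) for t in range(1, n + 1)] for s in range(1, n + 1)])
print(np.abs(approx - Tn_inv).max())
```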

Corollary 2.1

We assume (1.2) and (1.3). Then the following two assertions hold.

  1. (i)

    For \(n\in {\mathbb {N}}\) and \(s, t\in \{1,\dots ,n\}\), we have

    $$\begin{aligned}&\left( T_n(w)^{-1}\right) ^{s,t} = \sum _{\ell = 1}^{s\wedge t} {\tilde{a}}_{s - \ell }^* {\tilde{a}}_{t - \ell } \nonumber \\&\qquad + \sum _{u=1}^s {\tilde{a}}_{s-u}^* \sum _{k=1}^{\infty } \left\{ \sum _{\ell = 0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k-1} a_{n + 1 - t + \ell } + \sum _{\ell = 0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k} {\tilde{a}}_{t + \ell } \right\} . \end{aligned}$$
    (2.12)
  2. (ii)

    For \(n\in {\mathbb {N}}\) and \(s, t\in \{1,\dots ,n\}\), we have

    $$\begin{aligned}&\left( T_n(w)^{-1}\right) ^{s,t} = \sum _{\ell =s\vee t}^n a_{\ell - s}^* a_{\ell - t} \nonumber \\&\qquad + \sum _{u=s}^n a_{u-s}^* \sum _{k=1}^{\infty } \left\{ \sum _{\ell = 0}^{\infty } b_{n,u,\ell }^{2k-1} {\tilde{a}}_{t + \ell } + \sum _{\ell = 0}^{\infty } b_{n,u,\ell }^{2k} a_{n + 1 - t + \ell } \right\} . \end{aligned}$$
    (2.13)

Proof

Thanks to (2.9), we obtain (2.12) and (2.13) from (2.10) and (2.11), respectively. \(\square \)

Remark 2.1

Recall \(T_{\infty }(w)\) from (1.12). For \(n\in {\mathbb {N}}\cup \{0\}\), we have \(\gamma (n)=\sum _{k=0}^{\infty } {\tilde{c}}_k {\tilde{c}}_{n+k}^*\) and \(\gamma (-n)=\sum _{k=0}^{\infty } {\tilde{c}}_{n+k} {\tilde{c}}_{k}^*\) (see (2.13) in [16]), hence \(T_{\infty }(w) = {\tilde{C}}_{\infty } ({\tilde{C}}_{\infty })^*\), where

$$\begin{aligned} {\tilde{C}}_{\infty } := \left( \begin{matrix} {\tilde{c}}_0 &{} {\tilde{c}}_1 &{} {\tilde{c}}_2 &{} \cdots \\ 0 &{} {\tilde{c}}_0 &{} {\tilde{c}}_1 &{} \cdots \\ 0 &{} 0 &{} {\tilde{c}}_0 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{matrix} \right) . \end{aligned}$$

On the other hand, it follows from \({\tilde{h}}(z){\tilde{h}}(z)^{-1}=I_d\) that \(\sum _{k=0}^n {\tilde{c}}_k {\tilde{a}}_{n-k} = -\delta _{n0}I_d\) for \(n\in {\mathbb {N}}\cup \{0\}\), hence \({\tilde{C}}_{\infty } {\tilde{A}}_{\infty } = -I_{\infty }\), where \(I_{\infty }\) denotes the infinite identity matrix and

$$\begin{aligned} {\tilde{A}}_{\infty } := \left( \begin{matrix} {\tilde{a}}_0 &{} {\tilde{a}}_1 &{} {\tilde{a}}_2 &{} \cdots \\ 0 &{} {\tilde{a}}_0 &{} {\tilde{a}}_1 &{} \cdots \\ 0 &{} 0 &{} {\tilde{a}}_0 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{matrix} \right) . \end{aligned}$$

Combining, we have \(T_{\infty }(w)^{-1} = ({\tilde{A}}_{\infty })^* {\tilde{A}}_{\infty }\). Thus, we find that the first term \(\sum _{\ell = 1}^{s\wedge t} {\tilde{a}}_{s - \ell }^* {\tilde{a}}_{t - \ell }\) in (2.10) or (2.12) coincides with the \((s,t)\) block of \(T_{\infty }(w)^{-1}\).

For \(n\in {\mathbb {N}}\), we define

$$\begin{aligned} {\tilde{A}}_n := \left( \begin{matrix} {\tilde{a}}_0 &{} {\tilde{a}}_1 &{} \cdots &{} {\tilde{a}}_{n-1} \\ 0 &{} {\tilde{a}}_0 &{} \cdots &{} {\tilde{a}}_{n-2} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} {\tilde{a}}_0 \end{matrix} \right) \in {\mathbb {C}}^{dn\times dn} \end{aligned}$$
(2.14)

and

$$\begin{aligned} A_n := \left( \begin{matrix} a_0 &{} 0 &{} \cdots &{} 0 \\ a_1 &{} a_0 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ a_{n-1} &{} a_{n-2} &{} \cdots &{} a_0 \end{matrix} \right) \in {\mathbb {C}}^{dn\times dn}. \end{aligned}$$
(2.15)

The next lemma will turn out to be useful in Sect. 6.

Lemma 2.1

For \(n\in {\mathbb {N}}\) and \(s, t \in \{1,\dots ,n\}\), we have the following two equalities:

$$\begin{aligned} \left( {\tilde{A}}_n^* {\tilde{A}}_n\right) ^{s,t}&= \sum _{\ell = 1}^{s\wedge t} {\tilde{a}}_{s - \ell }^* {\tilde{a}}_{t - \ell }, \\ \left( A_n^* A_n\right) ^{s,t}&= \sum _{\ell =s\vee t}^n a_{\ell - s}^* a_{\ell - t}. \end{aligned}$$

The proof of Lemma 2.1 is straightforward and will be omitted.
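Reading \(A_n\) and \({\tilde{A}}_n\) as the triangular block Toeplitz matrices with \((s,t)\) blocks \(a_{s-t}\) (for \(s\ge t\)) and \({\tilde{a}}_{t-s}\) (for \(t\ge s\)), respectively — the structure consistent with Lemma 2.1 — the two equalities can be confirmed numerically. The sketch below does so for scalar blocks with arbitrary coefficients; the triangular structure is an assumption of the sketch.

```python
import numpy as np

# Scalar-block (d = 1) check of Lemma 2.1 with random coefficients.
rng = np.random.default_rng(0)
n = 5
a = rng.standard_normal(n)   # a_0, ..., a_{n-1}
ta = rng.standard_normal(n)  # tilde-a_0, ..., tilde-a_{n-1}

A = np.zeros((n, n))         # lower triangular: A[s, t] = a_{s-t} for s >= t
tA = np.zeros((n, n))        # upper triangular: tA[s, t] = tilde-a_{t-s} for t >= s
for s in range(n):
    for t in range(n):
        if s >= t:
            A[s, t] = a[s - t]
        if t >= s:
            tA[s, t] = ta[t - s]

lhs1, lhs2 = tA.T @ tA, A.T @ A
rhs1 = np.array([[sum(ta[s - l] * ta[t - l] for l in range(1, min(s, t) + 1))
                  for t in range(1, n + 1)] for s in range(1, n + 1)])
rhs2 = np.array([[sum(a[l - s] * a[l - t] for l in range(max(s, t), n + 1))
                  for t in range(1, n + 1)] for s in range(1, n + 1)])
print(np.abs(lhs1 - rhs1).max(), np.abs(lhs2 - rhs2).max())
```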

3 Proof of Theorem 2.1

In this section, we prove Theorem 2.1. We assume (1.2) and (1.3). Let \(\{X_k\}=\{X_k:k\in {\mathbb {Z}}\}\) be a \({\mathbb {C}}^d\)-valued, centered, weakly stationary process, defined on a probability space \((\varOmega , {\mathcal {F}}, P)\), that has spectral density w, hence autocovariance function \(\gamma \). Thus we have \(E[X_k X_0^*] = \gamma (k) = \int _{-\pi }^{\pi }e^{-ik\theta }w(e^{i\theta })(d\theta /(2\pi ))\) for \(k\in {\mathbb {Z}}\).

Write \(X_k=(X^1_k,\dots ,X^d_k)^{\top }\), and let V be the complex Hilbert space spanned by all the entries \(\{X^j_k: k\in {\mathbb {Z}},\ j=1,\dots ,d\}\) in \(L^2(\varOmega , {\mathcal {F}}, P)\), which has inner product \((x, y)_{V}:=E[x{\overline{y}}]\) and norm \(\Vert x\Vert _{V}:=(x,x)_{V}^{1/2}\). For \(J\subset {\mathbb {Z}}\) such as \(\{n\}\), \((-\infty ,n]:=\{n,n-1,\dots \}\), \([n,\infty ):=\{n,n+1,\dots \}\), and \([m,n]:=\{m,\dots ,n\}\) with \(m\le n\), we define the closed subspace \(V_J^X\) of V by

$$\begin{aligned} V_J^X:=\overline{\mathrm {sp}}\{X^j_k: j=1,\dots ,d,\ k\in J\}. \end{aligned}$$

Let \(P_J\) and \(P_J^{\perp }\) be the orthogonal projection operators of V onto \(V_J^X\) and \((V_J^X)^{\perp }\), respectively, where \((V_J^X)^{\bot }\) denotes the orthogonal complement of \(V_J^X\) in V.

By Theorem 3.1 in [11] for \(d=1\) and Corollary 3.6 in [15] for general \(d\ge 1\), the conditions (1.2) and (1.3) imply the following intersection of past and future property:

$$\begin{aligned} V_{(-\infty ,n]}^X\cap V_{[1,\infty )}^X=V_{[1,n]}^X,\quad n \in {\mathbb {N}}. \end{aligned}$$
(3.1)

Let \(V^d\) be the space of \({\mathbb {C}}^d\)-valued random variables on \((\varOmega , {\mathcal {F}}, P)\) whose entries belong to V. The norm \(\Vert x\Vert _{V^d}\) of \(x=(x^1,\dots ,x^d)^{\top }\in V^d\) is given by \(\Vert x\Vert _{V^d}:=(\sum _{i=1}^d \Vert x^i\Vert _V^2)^{1/2}\). For \(J\subset {\mathbb {Z}}\) and \(x=(x^1,\dots ,x^d)^{\top }\in V^d\), we write \(P_Jx\) for \((P_Jx^1, \dots , P_Jx^d)^{\top }\). We define \(P_J^{\perp }x\) in a similar way. For \(x=(x^1,\dots ,x^d)^{\top }\) and \(y=(y^1,\dots ,y^d)^{\top }\) in \(V^d\),

$$\begin{aligned} \langle x,y\rangle :=E[xy^*]=((x^k, y^{\ell })_V)_{1\le k, \ell \le d} \in {\mathbb {C}}^{d\times d} \end{aligned}$$

stands for the Gram matrix of x and y.

Let

$$\begin{aligned} X_k=\int _{-\pi }^{\pi }e^{-ik\theta }\eta (d\theta ),\quad k\in {\mathbb {Z}}, \end{aligned}$$

be the spectral representation of \(\{X_k\}\), where \(\eta \) is a \({\mathbb {C}}^d\)-valued random spectral measure. We define a d-variate stationary process \(\{\varepsilon _k:k\in {\mathbb {Z}}\}\), called the forward innovation process of \(\{X_k\}\), by

$$\begin{aligned} \varepsilon _k:=\int _{-\pi }^{\pi }e^{-ik\theta }h(e^{i\theta })^{-1}\eta (d\theta ),\quad k\in {\mathbb {Z}}. \end{aligned}$$

Then, \(\{\varepsilon _k\}\) satisfies \(\langle \varepsilon _n, \varepsilon _m\rangle = \delta _{n m}I_d\) and \(V_{(-\infty ,n]}^X=V_{(-\infty ,n]}^{\varepsilon }\) for \(n\in {\mathbb {Z}}\), hence

$$\begin{aligned} (V_{(-\infty ,n]}^X)^{\bot } = V_{[n+1, \infty )}^{\varepsilon },\quad n\in {\mathbb {Z}}. \end{aligned}$$

Recall the outer function \(h_{\sharp }\) in \(H_2^{d\times d}({\mathbb {T}})\) from (1.4). We define the backward innovation process \(\{{\tilde{\varepsilon }}_k: k\in {\mathbb {Z}}\}\) of \(\{X_k\}\) by

$$\begin{aligned} {\tilde{\varepsilon }}_k:=\int _{-\pi }^{\pi }e^{ik\theta }\{h_{\sharp } (e^{i\theta })^*\}^{-1} \eta (d\theta ),\quad k\in {\mathbb {Z}}. \end{aligned}$$

Then, \(\{{\tilde{\varepsilon }}_k\}\) satisfies \(\langle {\tilde{\varepsilon }}_n, {\tilde{\varepsilon }}_m\rangle =\delta _{n m}I_d\) and \(V_{[-n,\infty )}^X=V_{(-\infty , n]}^{{\tilde{\varepsilon }}}\) for \(n\in {\mathbb {Z}}\), hence

$$\begin{aligned} (V_{[-n,\infty )}^X)^{\bot } = V_{[n+1,\infty )}^{{\tilde{\varepsilon }}},\quad n\in {\mathbb {Z}}\end{aligned}$$

(see Sect. 2 in [16]). Moreover, by Lemma 4.1 in [16], we have

$$\begin{aligned} \langle \varepsilon _{\ell }, {\tilde{\varepsilon }}_{m}\rangle = -\beta _{\ell +m},\quad \langle {\tilde{\varepsilon }}_{m}, \varepsilon _{\ell }\rangle = -\beta _{\ell +m}^*, \quad \ell , m \in {\mathbb {Z}}. \end{aligned}$$
(3.2)

By (3.2), for \(\{s_{\ell }\}\in \ell _{2+}^{d\times d}\) and \(n\in {\mathbb {N}}\),

$$\begin{aligned} P_{[1,\infty )}^{\perp } \left( \sum _{\ell =0}^{\infty } s_{\ell } \varepsilon _{n+1+\ell }\right)&=-\sum _{\ell =0}^\infty \left( \sum _{m=0}^\infty s_m \beta _{n+1+\ell +m}\right) {\tilde{\varepsilon }}_{\ell }, \end{aligned}$$
(3.3)
$$\begin{aligned} P_{(-\infty ,n]}^{\perp } \left( \sum _{\ell =0}^\infty s_{\ell } {\tilde{\varepsilon }}_{\ell }\right)&=-\sum _{\ell =0}^\infty \left( \sum _{m=0}^\infty s_m \beta _{n+1+\ell +m}^*\right) \varepsilon _{n+1+\ell }. \end{aligned}$$
(3.4)

Therefore,

$$\begin{aligned} \left\{ \sum _{m=0}^\infty s_m \beta _{n+1+\ell +m}\right\} _{\ell =0}^{\infty }, \ \left\{ \sum _{m=0}^\infty s_m \beta _{n+1+\ell +m}^*\right\} _{\ell =0}^{\infty }\ \in \ \ell _{2+}^{d\times d}. \end{aligned}$$

See Lemma 4.2 in [16]. In particular, for \(n\in {\mathbb {N}}\), \(u \in \{1,\dots ,n\}\) and \(k\in {\mathbb {N}}\), we can define the sequences \(\{b_{n,u,\ell }^k\}_{\ell =0}^{\infty }\in \ell _{2+}^{d\times d}\) and \(\{{\tilde{b}}_{n,u,\ell }^k\}_{\ell =0}^{\infty }\in \ell _{2+}^{d\times d}\) by the recursions (2.7) and (2.8), respectively.

By (1.2) and (1.3), \(\{X_k\}\) has the dual process \(\{X^{\prime }_k: k\in {\mathbb {Z}}\}\), which is a \({\mathbb {C}}^d\)-valued, centered, weakly stationary process characterized by the biorthogonality relation

$$\begin{aligned} \langle X_s,X^{\prime }_t\rangle =\delta _{st}I_d, \quad s, t\in {\mathbb {Z}}\end{aligned}$$

(see [19]). Recall \(\{a_k\} \in \ell _{2+}^{d\times d}\) and \(\{{\tilde{a}}_k\} \in \ell _{2+}^{d\times d}\) from (2.3) and (2.5), respectively. The dual process \(\{X^{\prime }_k\}\) admits the following two MA representations (see Sect. 5 in [16]):

$$\begin{aligned} X^{\prime }_n&=-\sum _{\ell =n}^\infty a_{\ell -n}^* \varepsilon _{\ell }, \quad n\in {\mathbb {Z}}, \end{aligned}$$
(3.5)
$$\begin{aligned} X^{\prime }_n&=-\sum _{\ell =-n}^{\infty } {\tilde{a}}_{\ell +n}^* {\tilde{\varepsilon }}_{\ell },\quad n\in {\mathbb {Z}}. \end{aligned}$$
(3.6)

The next theorem is the key to the proof of Theorem 2.1.

Theorem 3.1

Assume (1.2) and (1.3). Then, for \(n\in {\mathbb {N}}\) and \(s,t \in \{1,\dots ,n\}\), we have (1.5).

Proof

Fix \(n\in {\mathbb {N}}\). For \(s\in \{1,\dots ,n\}\), we can write \(P_{[1,n]}X^{\prime }_s = \sum _{k=1}^n q_{s,k} X_k\) for some \(q_{s,k}\in {\mathbb {C}}^{d\times d}\), \(k\in \{1,\dots ,n\}\). For \(s,t \in \{1,\dots ,n\}\), we have

$$\begin{aligned} \begin{aligned} \delta _{st}I_d&= \langle X^{\prime }_s, X_t\rangle = \langle X^{\prime }_s, P_{[1,n]}X_t\rangle = \langle P_{[1,n]}X^{\prime }_s, X_t \rangle = \left\langle \sum _{k=1}^n q_{s,k} X_k, X_t\right\rangle \\&=\sum _{k=1}^n q_{s,k} \left\langle X_k, X_t \right\rangle =\sum _{k=1}^n q_{s,k} \gamma (k-t), \end{aligned} \end{aligned}$$

or \(Q_n T_n(w) = I_{dn}\), where \(Q_n := (q_{s,k})_{1\le s, k\le n}\in {\mathbb {C}}^{dn\times dn}\). Therefore, we have \(Q_n = T_n(w)^{-1}\). On the other hand,

$$\begin{aligned} \langle X^{\prime }_s, P_{[1,n]}X^{\prime }_t\rangle = \langle P_{[1,n]}X^{\prime }_s, X^{\prime }_t\rangle = \left\langle \sum _{k=1}^n q_{s,k} X_k, X^{\prime }_t \right\rangle = \sum _{k=1}^n q_{s,k} \langle X_k, X^{\prime }_t\rangle = q_{s,t}. \end{aligned}$$

Thus, the theorem follows. \(\square \)

Lemma 3.1

Assume (1.2) and (1.3). Then, for \(n\in {\mathbb {N}}\) and \(s,t \in \{1,\dots ,n\}\), the following two equalities hold:

$$\begin{aligned} \langle X^{\prime }_s, P_{[1,n]}X^{\prime }_t\rangle&= \sum _{\ell =s\vee t}^n a_{\ell -s}^* a_{\ell -t} + \sum _{u=t}^n \langle X^{\prime }_s, P_{[1,n]}^{\bot } \varepsilon _u\rangle a_{u-t}, \end{aligned}$$
(3.7)
$$\begin{aligned} \langle X^{\prime }_s, P_{[1,n]}X^{\prime }_t\rangle&= \sum _{\ell =1}^{s\wedge t} {\tilde{a}}_{s-\ell }^* {\tilde{a}}_{t-\ell } + \sum _{u=1}^t \langle X^{\prime }_s, P_{[1,n]}^{\bot }{\tilde{\varepsilon }}_{-u}\rangle {\tilde{a}}_{t-u}. \end{aligned}$$
(3.8)

Proof

First, we prove (3.7). Since \(V_{[1,n]}^{X} \subset V_{(-\infty ,n]}^{X}\), we have

$$\begin{aligned} \begin{aligned} \langle X^{\prime }_s, P_{[1,n]} X^{\prime }_t\rangle&=\langle X^{\prime }_s, P_{[1,n]} P_{(-\infty ,n]} X^{\prime }_t\rangle \\&=\langle X^{\prime }_s, P_{(-\infty ,n]} X^{\prime }_t\rangle - \langle X^{\prime }_s, P_{[1,n]}^{\bot } P_{(-\infty ,n]} X^{\prime }_t\rangle . \end{aligned} \end{aligned}$$

On the other hand, from (3.5), we have \(P_{(-\infty ,n]} X^{\prime }_t = -\sum _{m=t}^n a_{m-t}^* \varepsilon _{m}\), hence

$$\begin{aligned} \langle X^{\prime }_s, P_{(-\infty ,n]} X^{\prime }_t\rangle = \left\langle \sum _{\ell =s}^{\infty } a_{\ell -s}^* \varepsilon _{\ell }, \sum _{m=t}^n a_{m-t}^* \varepsilon _{m}\right\rangle =\sum _{\ell =s\vee t}^n a_{\ell -s}^* a_{\ell -t}, \end{aligned}$$

and \(\langle X^{\prime }_s, P_{[1,n]}^{\bot } P_{(-\infty ,n]} X^{\prime }_t\rangle \) is equal to

$$\begin{aligned} -\left\langle X^{\prime }_s, P_{[1,n]}^{\bot }\left( \sum _{u=t}^n a_{u-t}^* \varepsilon _{u}\right) \right\rangle =-\sum _{u=t}^n \langle X^{\prime }_s, P_{[1,n]}^{\bot }\varepsilon _{u}\rangle a_{u-t}. \end{aligned}$$

Combining, we obtain (3.7).

Next, we prove (3.8). Since \(V_{[1,n]}^{X} \subset V_{[1,\infty )}^{X}\), we have

$$\begin{aligned} \begin{aligned} \langle X^{\prime }_s, P_{[1,n]} X^{\prime }_t\rangle&=\langle X^{\prime }_s, P_{[1,n]} P_{[1,\infty )} X^{\prime }_t\rangle \\&=\langle X^{\prime }_s, P_{[1,\infty )} X^{\prime }_t\rangle - \langle X^{\prime }_s, P_{[1,n]}^{\bot } P_{[1,\infty )} X^{\prime }_t\rangle . \end{aligned} \end{aligned}$$

On the other hand, from (3.6), we have \(P_{[1, \infty )} X^{\prime }_t = -\sum _{m=1}^{t} {\tilde{a}}_{t-m}^* {\tilde{\varepsilon }}_{-m}\), hence

$$\begin{aligned} \langle X^{\prime }_s, P_{[1,\infty )} X^{\prime }_t\rangle = \left\langle \sum _{\ell =-\infty }^{s} {\tilde{a}}_{s-\ell }^* {\tilde{\varepsilon }}_{-\ell }, \sum _{m=1}^{t} {\tilde{a}}_{t-m}^* {\tilde{\varepsilon }}_{-m} \right\rangle =\sum _{\ell =1}^{s\wedge t} {\tilde{a}}_{s-\ell }^* {\tilde{a}}_{t-\ell }, \end{aligned}$$

and \(\langle X^{\prime }_s, P_{[1,n]}^{\bot } P_{[1,\infty )} X^{\prime }_t\rangle \) is equal to

$$\begin{aligned} -\left\langle X^{\prime }_s, P_{[1,n]}^{\bot }\left( \sum _{u=1}^{t} {\tilde{a}}_{t-u}^* {\tilde{\varepsilon }}_{-u} \right) \right\rangle =-\sum _{u=1}^t \langle X^{\prime }_s, P_{[1,n]}^{\bot }{\tilde{\varepsilon }}_{-u}\rangle {\tilde{a}}_{t-u}. \end{aligned}$$

Combining, we obtain (3.8). \(\square \)

For \(n\in {\mathbb {N}}\) and \(u \in \{1,\dots ,n\}\), we define the sequence \(\{W_{n,u}^k\}_{k=1}^{\infty }\) in \(V^d\) by

$$\begin{aligned} W_{n,u}^{2k-1}&= - P_{[1, \infty )}^{\perp } (P_{(-\infty , n]}^\perp P_{[1, \infty )}^{\perp })^{k-1} \varepsilon _u,\quad k \in {\mathbb {N}}, \\ W_{n,u}^{2k}&= (P_{(-\infty ,n]}^{\perp } P_{[1, \infty )}^{\perp })^{k} \varepsilon _u,\quad k \in {\mathbb {N}}. \end{aligned}$$

Lemma 3.2

We assume (1.2) and (1.3). Then, for \(n\in {\mathbb {N}}\) and \(u\in \{1,\dots ,n\}\), we have

$$\begin{aligned} P_{[1,n]}^{\perp } \varepsilon _u = -\sum _{k=1}^{\infty } W_{n,u}^{k}, \end{aligned}$$
(3.9)

the sum converging strongly in \(V^d\).

Proof

Since \(\varepsilon _u\) is in \(V_{(-\infty ,n]}^X\), (3.9) follows from (3.1) and Theorem 3.2 in [16]. \(\square \)
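Lemma 3.2 rests on an alternating-projection series (Theorem 3.2 in [16]). As a finite-dimensional sanity check (a sketch with illustrative subspaces of \(\mathbb{R}^3\) standing in for \(V_{(-\infty,n]}^X\), \(V_{[1,\infty)}^X\) and \(V_{[1,n]}^X\); none of the data comes from the paper), one can verify that, for \(x \in A\), the partial sums of \(-\sum_k W^k\) converge to \(P_{A\cap B}^{\perp} x\):

```python
import numpy as np

def proj(U):  # orthogonal projection onto the column span of U
    Q, _ = np.linalg.qr(U)
    return Q @ Q.T

# A = span{e1, e2} and B = span{e1, e2 + e3} share the direction e1,
# so A ∩ B = span{e1} plays the role of V_{[1,n]}^X.
A = np.array([[1., 0.], [0., 1.], [0., 0.]])
B = np.array([[1., 0.], [0., 1.], [0., 1.]])
PAp = np.eye(3) - proj(A)         # analogue of P_{(-infty,n]}^perp
PBp = np.eye(3) - proj(B)         # analogue of P_{[1,infty)}^perp
PCp = np.eye(3) - proj(A[:, :1])  # P_{A cap B}^perp

x = A @ np.array([2.0, -3.0])     # x in A, the analogue of eps_u
total, w = np.zeros(3), x.copy()
for _ in range(100):
    w1 = -PBp @ w    # W^{2k-1} = -P_B^perp (P_A^perp P_B^perp)^{k-1} x
    w2 = -PAp @ w1   # W^{2k}   = (P_A^perp P_B^perp)^k x
    total += w1 + w2
    w = w2
assert np.allclose(-total, PCp @ x, atol=1e-12)
```

Here the angle between \(A \ominus (A\cap B)\) and \(B \ominus (A\cap B)\) is \(45^{\circ}\), so the series converges geometrically.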

Proposition 3.1

We assume (1.2) and (1.3). Then, for \(n\in {\mathbb {N}}\), \(u\in \{1,\dots ,n\}\) and \(k\in {\mathbb {N}}\), we have

$$\begin{aligned} W_{n,u}^{2k-1}&=\sum _{\ell =0}^\infty b_{n,u,\ell }^{2k-1} {\tilde{\varepsilon }}_{\ell }, \end{aligned}$$
(3.10)
$$\begin{aligned} W_{n,u}^{2k}&=\sum _{\ell =0}^\infty b_{n,u,\ell }^{2k} \varepsilon _{n+1+\ell }. \end{aligned}$$
(3.11)

Proof

Note that, from the definition of \(W_{n,u}^k\),

$$\begin{aligned} W_{n,u}^{2k+1}=-P_{[1,\infty )}^\perp W_{n,u}^{2k}, \quad W_{n,u}^{2k+2}=-P_{(-\infty ,n]}^\perp W_{n,u}^{2k+1}. \end{aligned}$$

We prove (3.10) and (3.11) by induction. First, by (3.2), we have

$$\begin{aligned} W_{n,u}^1=-P_{[1, \infty )}^{\perp } \varepsilon _u = -\sum _{\ell =0}^{\infty } \langle \varepsilon _u, {\tilde{\varepsilon }}_{\ell }\rangle {\tilde{\varepsilon }}_{\ell } =\sum _{\ell =0}^\infty \beta _{u+\ell } {\tilde{\varepsilon }}_{\ell } = \sum _{\ell =0}^\infty b_{n,u,\ell }^1 {\tilde{\varepsilon }}_{\ell }. \end{aligned}$$

For \(k \in {\mathbb {N}}\), assume that \(W_{n,u}^{2k-1}=\sum _{\ell =0}^\infty b_{n,u,\ell }^{2k-1} {\tilde{\varepsilon }}_{\ell }\). Then, by (3.4),

$$\begin{aligned} \begin{aligned} W_{n,u}^{2k}&=-P_{(-\infty ,n]}^{\perp } \left( \sum _{\ell =0}^{\infty } b_{n,u,\ell }^{2k-1} {\tilde{\varepsilon }}_{\ell }\right) =\sum _{\ell =0}^{\infty } \left( \sum _{m=0}^\infty b_{n,u,m}^{2k-1} \beta _{n+1+m+\ell }^* \right) \varepsilon _{n+1+\ell }\\&=\sum _{\ell =0}^\infty b_{n,u,\ell }^{2k} \varepsilon _{n+1+\ell }, \end{aligned} \end{aligned}$$

and, by (3.3),

$$\begin{aligned} \begin{aligned} W_{n,u}^{2k+1}&=-P_{[1, \infty )}^{\perp } \left( \sum _{\ell =0}^\infty b_{n,u,\ell }^{2k} \varepsilon _{n+1+\ell }\right) =\sum _{\ell =0}^{\infty } \left( \sum _{m=0}^{\infty } b_{n,u,m}^{2k} \beta _{n+1+m+\ell } \right) {\tilde{\varepsilon }}_{\ell }\\&=\sum _{\ell =0}^{\infty } b_{n,u,\ell }^{2k+1} {\tilde{\varepsilon }}_{\ell }. \end{aligned} \end{aligned}$$

Thus (3.10) and (3.11) follow. \(\square \)

For \(n\in {\mathbb {N}}\) and \(u \in \{1,\dots ,n\}\), we define the sequence \(\{{\tilde{W}}_{n,u}^k\}_{k=1}^{\infty }\) in \(V^d\) by

$$\begin{aligned} {\tilde{W}}_{n,u}^{2k-1}&= - P_{(-\infty , n]}^\perp (P_{[1, \infty )}^{\perp } P_{(-\infty , n]}^\perp )^{k-1} {\tilde{\varepsilon }}_{-u}, \quad k \in {\mathbb {N}}, \\ {\tilde{W}}_{n,u}^{2k}&= (P_{[1, \infty )}^{\perp } P_{(-\infty ,n]}^{\perp })^{k} {\tilde{\varepsilon }}_{-u}, \quad k \in {\mathbb {N}}. \end{aligned}$$

Lemma 3.3

We assume (1.2) and (1.3). Then, for \(n\in {\mathbb {N}}\) and \(u\in \{1,\dots ,n\}\), we have

$$\begin{aligned} P_{[1,n]}^{\perp } {\tilde{\varepsilon }}_{-u} = -\sum _{k=1}^{\infty } {\tilde{W}}_{n,u}^{k}, \end{aligned}$$
(3.12)

the sum converging strongly in \(V^d\).

Proof

Since \({\tilde{\varepsilon }}_{-u}\) is in \(V_{[1,\infty )}^X\), (3.12) follows from (3.1) and Theorem 3.2 in [16]. \(\square \)

Proposition 3.2

We assume (1.2) and (1.3). Then, for \(n\in {\mathbb {N}}\), \(u\in \{1,\dots ,n\}\) and \(k\in {\mathbb {N}}\), we have

$$\begin{aligned} {\tilde{W}}_{n,u}^{2k-1}&=\sum _{\ell =0}^\infty {\tilde{b}}_{n,u,\ell }^{2k-1} \varepsilon _{n+1+\ell }, \end{aligned}$$
(3.13)
$$\begin{aligned} {\tilde{W}}_{n,u}^{2k}&=\sum _{\ell =0}^\infty {\tilde{b}}_{n,u,\ell }^{2k} {\tilde{\varepsilon }}_{\ell }. \end{aligned}$$
(3.14)

Proof

Note that, from the definition of \({\tilde{W}}_{n,u}^k\),

$$\begin{aligned} {\tilde{W}}_{n,u}^{2k+1}=-P_{(-\infty ,n]}^\perp {\tilde{W}}_{n,u}^{2k}, \quad {\tilde{W}}_{n,u}^{2k+2}=- P_{[1,\infty )}^\perp {\tilde{W}}_{n,u}^{2k+1}. \end{aligned}$$

We prove (3.13) and (3.14) by induction. First, by (3.2), we have

$$\begin{aligned} \begin{aligned} {\tilde{W}}_{n,u}^1&= - P_{(-\infty ,n]}^\perp {\tilde{\varepsilon }}_{-u} = -\sum _{\ell =0}^{\infty } \langle {\tilde{\varepsilon }}_{-u}, \varepsilon _{n+1+\ell }\rangle \varepsilon _{n+1+\ell }\\&=\sum _{\ell =0}^\infty \beta _{n+1-u+\ell }^* \varepsilon _{n+1+\ell } = \sum _{\ell =0}^\infty {\tilde{b}}_{n,u,\ell }^1 \varepsilon _{n+1+\ell }. \end{aligned} \end{aligned}$$

For \(k \in {\mathbb {N}}\), assume that \({\tilde{W}}_{n,u}^{2k-1}=\sum _{\ell =0}^\infty {\tilde{b}}_{n,u,\ell }^{2k-1} \varepsilon _{n+1+\ell }\). Then, by (3.3),

$$\begin{aligned} \begin{aligned} {\tilde{W}}_{n,u}^{2k}&=-P_{[1,\infty )}^\perp \left( \sum _{\ell =0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k-1} \varepsilon _{n+1+\ell }\right) =\sum _{\ell =0}^{\infty } \left( \sum _{m=0}^\infty {\tilde{b}}_{n,u,m}^{2k-1} \beta _{n+1+m+\ell } \right) {\tilde{\varepsilon }}_{\ell }\\&=\sum _{\ell =0}^\infty {\tilde{b}}_{n,u,\ell }^{2k} {\tilde{\varepsilon }}_{\ell }, \end{aligned} \end{aligned}$$

and, by (3.4),

$$\begin{aligned} \begin{aligned} {\tilde{W}}_{n,u}^{2k+1}&=-P_{(-\infty ,n]}^{\perp } \left( \sum _{\ell =0}^\infty {\tilde{b}}_{n,u,\ell }^{2k} {\tilde{\varepsilon }}_{\ell }\right) =\sum _{\ell =0}^{\infty } \left( \sum _{m=0}^{\infty } {\tilde{b}}_{n,u,m}^{2k} \beta _{n+1+m+\ell }^* \right) \varepsilon _{n+1+\ell }\\&=\sum _{\ell =0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k+1} \varepsilon _{n+1+\ell }. \end{aligned} \end{aligned}$$

Thus (3.13) and (3.14) follow. \(\square \)

We are ready to prove Theorem 2.1.

Proof

(i) For \(n\in {\mathbb {N}}\), \(s, u \in \{1,\dots ,n\}\) and \(k\in {\mathbb {N}}\), we see from (3.5) and (3.13) that

$$\begin{aligned} \langle X^{\prime }_s, {\tilde{W}}_{n,u}^{2k-1} \rangle = - \sum _{\ell =0}^{\infty } a_{n+1-s+\ell }^* ({\tilde{b}}_{n,u,\ell }^{2k-1})^*, \end{aligned}$$

and from (3.6) and (3.14) that

$$\begin{aligned} \langle X^{\prime }_s, {\tilde{W}}_{n,u}^{2k} \rangle = - \sum _{\ell =0}^{\infty } {\tilde{a}}_{s+\ell }^* ({\tilde{b}}_{n,u,\ell }^{2k})^*. \end{aligned}$$

Therefore, by Lemma 3.3, \(\langle X^{\prime }_s, P_{[1,n]}^{\perp } {\tilde{\varepsilon }}_{-u} \rangle \) is equal to

$$\begin{aligned} - \sum _{k=1}^{\infty } \langle X^{\prime }_s, {\tilde{W}}_{n,u}^{k} \rangle = \sum _{k=1}^{\infty } \left\{ \sum _{\ell =0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k-1} a_{n+1-s+\ell } + \sum _{\ell =0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k} {\tilde{a}}_{s+\ell } \right\} ^*. \end{aligned}$$

The assertion (i) follows from this, Theorem 3.1 and Lemma 3.1.

(ii) For \(n\in {\mathbb {N}}\), \(s, u \in \{1,\dots ,n\}\) and \(k\in {\mathbb {N}}\), we see from (3.6) and (3.10) that

$$\begin{aligned} \langle X^{\prime }_s, W_{n,u}^{2k-1} \rangle = - \sum _{\ell =0}^{\infty } {\tilde{a}}_{s+\ell }^* (b_{n,u,\ell }^{2k-1})^*, \end{aligned}$$

and from (3.5) and (3.11) that

$$\begin{aligned} \langle X^{\prime }_s, W_{n,u}^{2k} \rangle = - \sum _{\ell =0}^{\infty } a_{n+1-s+\ell }^* (b_{n,u,\ell }^{2k})^*. \end{aligned}$$

Therefore, by Lemma 3.2, \(\langle X^{\prime }_s, P_{[1,n]}^{\perp } \varepsilon _u \rangle \) is equal to

$$\begin{aligned} - \sum _{k=1}^{\infty } \langle X^{\prime }_s, W_{n,u}^{k} \rangle = \sum _{k=1}^{\infty } \left\{ \sum _{\ell =0}^{\infty } b_{n,u,\ell }^{2k-1} {\tilde{a}}_{s+\ell } + \sum _{\ell =0}^{\infty } b_{n,u,\ell }^{2k} a_{n+1-s+\ell } \right\} ^*. \end{aligned}$$

The assertion (ii) follows from this, Theorem 3.1 and Lemma 3.1. \(\square \)

4 Strong convergence result for Toeplitz systems

In this section, we use Theorem 2.1 to prove a strong convergence result for solutions of block Toeplitz systems. We assume (1.2) and (1.6). Then w is continuous on \({\mathbb {T}}\) since \(w(e^{i\theta })=\sum _{k\in {\mathbb {Z}}} e^{ik\theta } \gamma (k)\). In particular, (1.3) is also satisfied. The conditions (1.2) and (1.6) also imply that all of \(\{a_k\}\), \(\{c_k\}\), \(\{{\tilde{a}}_k\}\) and \(\{{\tilde{c}}_k\}\) belong to \(\ell _{1+}^{d\times d}\). See Theorem 3.3 and (3.3) in [17]; see also Theorem 4.1 in [12]. In particular, we have \(h(e^{i\theta })^{-1} = - \sum _{k=0}^{\infty } e^{ik\theta } a_k\) and \(h_{\sharp }(e^{i\theta }) = {\tilde{h}}(e^{-i\theta })^* = \sum _{k=0}^{\infty } e^{ik\theta } {\tilde{c}}_k^*\), hence, by (2.6),

$$\begin{aligned} \beta _k = \sum _{j=0}^{\infty } a_{j+k} {\tilde{c}}_j,\quad k \in {\mathbb {N}}\cup \{0\}. \end{aligned}$$
(4.1)

Under (1.2) and (1.6), we define

$$\begin{aligned} F(n):=\left( \sum _{j=0}^{\infty }\Vert {\tilde{c}}_j\Vert \right) \sum _{\ell =n}^{\infty }\Vert a_{\ell }\Vert , \quad n \in {\mathbb {N}}\cup \{0\}. \end{aligned}$$

Then F(n) decreases to zero as \(n\rightarrow \infty \).

We need the next lemma in the proof of Theorem 4.1 below.

Lemma 4.1

Assume (1.2) and (1.6). Then, for \(n, k\in {\mathbb {N}}\) and \(u\in \{1,\dots ,n\}\), we have

$$\begin{aligned} \sum _{\ell =0}^{\infty } \Vert {\tilde{b}}^{k}_{n,u,\ell }\Vert \le F(n+1)^{k-1}F(n+1-u). \end{aligned}$$
(4.2)

Proof

For \(m\in {\mathbb {N}}\), we see from (4.1) that

$$\begin{aligned} \sum _{\ell =0}^{\infty } \Vert \beta _{m+\ell }\Vert \le \sum _{j=0}^{\infty } \Vert {\tilde{c}}_j\Vert \sum _{\ell =0}^{\infty } \Vert a_{m+j+\ell }\Vert \le \sum _{j=0}^{\infty } \Vert {\tilde{c}}_j\Vert \sum _{\ell =m}^{\infty } \Vert a_{\ell }\Vert , \end{aligned}$$

hence

$$\begin{aligned} \sum _{\ell =0}^{\infty } \Vert \beta _{m+\ell }\Vert \le F(m). \end{aligned}$$
(4.3)

Let \(n\in {\mathbb {N}}\) and \(u\in \{1,\dots ,n\}\). We use induction on k to prove (4.2). Since \({\tilde{b}}^1_{n,u,\ell }=\beta _{n+1-u+\ell }^*\), we see from (4.3) that

$$\begin{aligned} \sum _{\ell =0}^{\infty } \Vert {\tilde{b}}^1_{n,u,\ell } \Vert =\sum _{\ell =0}^{\infty }\Vert \beta _{n+1-u+\ell } \Vert \le F(n+1-u). \end{aligned}$$

We assume (4.2) for \(k\in {\mathbb {N}}\). Then, again by (4.3),

$$\begin{aligned} \begin{aligned} \sum _{\ell =0}^{\infty } \Vert {\tilde{b}}^{k+1}_{n,u,\ell } \Vert&\le \sum _{m=0}^{\infty } \Vert {\tilde{b}}^{k}_{n,u,m} \Vert \sum _{\ell =0}^{\infty } \Vert \beta _{n + 1 + m + \ell }\Vert \\&\le F(n+1) \sum _{m=0}^{\infty } \Vert {\tilde{b}}^k_{n,u,m} \Vert \le F(n+1)^{k}F(n+1-u). \end{aligned} \end{aligned}$$

Thus (4.2) with k replaced by \(k+1\) also holds. \(\square \)
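The induction in Lemma 4.1 only uses the bound (4.3), so it can be checked numerically in a scalar toy model (a sketch with the illustrative choice \(\beta_m = r^m\), for which \(B(m) := \sum_{\ell\ge 0}\beta_{m+\ell} = r^m/(1-r)\) plays the role of \(F(m)\); all sums are truncated at length L, which can only decrease the left-hand sides):

```python
import numpy as np

r, n, u, L = 0.6, 5, 2, 200
B = lambda m: r ** m / (1 - r)   # analogue of F(m) for beta_m = r^m

# kernel of the recursion: b^{k+1}_l = sum_m beta_{n+1+m+l} b^k_m
K = np.array([[r ** (n + 1 + m + l) for m in range(L)] for l in range(L)])
b = np.array([r ** (n + 1 - u + l) for l in range(L)])  # \tilde b^1_{n,u,l}

for k in range(1, 8):
    # the bound (4.2) for step k; truncation only undershoots the sum
    assert b.sum() <= B(n + 1) ** (k - 1) * B(n + 1 - u) + 1e-12
    b = K @ b   # advance one step of the recursion
```

The same kernel serves both half-steps of the recursion because, in this scalar real sketch, conjugation has no effect.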

For \(\{y_k\}_{k=1}^{\infty } \in \ell _1^{d\times d}({\mathbb {N}})\), the solution \(Z_{\infty }\) to (1.11) with (1.12) and (1.13) is given by (1.10) with

$$\begin{aligned} z_s = \sum _{t=1}^{\infty } \sum _{\ell = 1}^{s\wedge t} {\tilde{a}}_{s - \ell }^* {\tilde{a}}_{t - \ell } y_t \in {\mathbb {C}}^{d\times d}, \quad s\in {\mathbb {N}} \end{aligned}$$
(4.4)

(see Remark 2.1 in Sect. 2). Notice that the sum in (4.4) converges absolutely.

Theorem 4.1

We assume (1.2) and (1.6). Let \(\{y_k\}_{k=1}^{\infty } \in \ell _1^{d\times d}({\mathbb {N}})\). Then, for \(Z_n\) in (1.7)–(1.9) and \(Z_{\infty }\) in (1.10)–(1.13), we have (1.14).

Proof

By Theorem 2.1 (i), we have

$$\begin{aligned} \begin{aligned} z_{n,s}&= \sum _{t=1}^{n} \sum _{\ell = 1}^{s\wedge t} {\tilde{a}}_{s - \ell }^* {\tilde{a}}_{t - \ell } y_t + \sum _{t=1}^{n} \sum _{u=1}^t \sum _{\ell = 0}^{\infty } a_{n + 1 - s + \ell }^* \beta _{n+1-u+\ell } {\tilde{a}}_{t-u} y_t\\&\quad + \sum _{t=1}^{n} \sum _{u=1}^t \sum _{k=1}^{\infty } \left\{ \sum _{\ell = 0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k+1} a_{n + 1 - s + \ell } + \sum _{\ell = 0}^{\infty } {\tilde{b}}_{n,u,\ell }^{2k} {\tilde{a}}_{s + \ell } \right\} ^* {\tilde{a}}_{t-u} y_t, \end{aligned} \end{aligned}$$

hence, by (4.4), \(\sum _{s=1}^n \Vert z_{n,s} - z_s\Vert \le S_1(n) + S_2(n) + S_3(n) + S_4(n)\), where

$$\begin{aligned} S_1(n)&:= \sum _{t=n+1}^{\infty } \sum _{s=1}^n \sum _{\ell = 1}^{s} \Vert {\tilde{a}}_{s - \ell }\Vert \Vert {\tilde{a}}_{t - \ell }\Vert \Vert y_t\Vert ,\\ S_2(n)&:= \sum _{s=1}^n \sum _{t=1}^{n} \sum _{u=1}^t \sum _{\ell = 0}^{\infty } \Vert a_{n + 1 - s + \ell }\Vert \Vert \beta _{n+1-u+\ell }\Vert \Vert {\tilde{a}}_{t-u}\Vert \Vert y_t\Vert ,\\ S_3(n)&:= \sum _{s=1}^n \sum _{t=1}^{n} \sum _{u=1}^t \sum _{k=1}^{\infty } \sum _{\ell = 0}^{\infty } \Vert {\tilde{b}}_{n,u,\ell }^{2k+1}\Vert \Vert a_{n + 1 - s + \ell }\Vert \Vert {\tilde{a}}_{t-u}\Vert \Vert y_t\Vert \end{aligned}$$

and

$$\begin{aligned} S_4(n) := \sum _{s=1}^n \sum _{t=1}^{n} \sum _{u=1}^t \sum _{k=1}^{\infty } \sum _{\ell = 0}^{\infty } \Vert {\tilde{b}}_{n,u,\ell }^{2k}\Vert \Vert {\tilde{a}}_{s + \ell }\Vert \Vert {\tilde{a}}_{t-u}\Vert \Vert y_t\Vert . \end{aligned}$$

By the change of variables \(m=s-\ell +1\), we have

$$\begin{aligned} \begin{aligned} S_1(n)&= \sum _{t=n+1}^{\infty } \sum _{s=1}^n \sum _{m = 1}^{s} \Vert {\tilde{a}}_{m - 1}\Vert \Vert {\tilde{a}}_{t + m -s - 1}\Vert \Vert y_t\Vert \\&=\sum _{t=n+1}^{\infty } \Vert y_t\Vert \sum _{m = 1}^{n} \Vert {\tilde{a}}_{m - 1}\Vert \sum _{s=m}^n \Vert {\tilde{a}}_{t + m -s - 1}\Vert \\&\le \left( \sum _{k = 0}^{\infty } \Vert {\tilde{a}}_{k}\Vert \right) ^2 \sum _{t=n+1}^{\infty } \Vert y_t\Vert \quad \rightarrow \quad 0,\quad n\rightarrow \infty . \end{aligned} \end{aligned}$$

By (4.2) with \(k=1\) or (4.3), we have

$$\begin{aligned} \begin{aligned} S_2(n)&= \sum _{t=1}^{n} \sum _{u=1}^t \Vert {\tilde{a}}_{t-u}\Vert \Vert y_t\Vert \sum _{\ell = 0}^{\infty } \Vert \beta _{n+1-u+\ell }\Vert \sum _{s=1}^n \Vert a_{n + 1 - s + \ell } \Vert \\&\le \left( \sum _{s=1}^{\infty } \Vert a_{s}\Vert \right) \sum _{t=1}^{n} \sum _{u=1}^t \Vert {\tilde{a}}_{t-u}\Vert \Vert y_t\Vert F(n+1-u). \end{aligned} \end{aligned}$$

Furthermore, by the change of variables \(v=t-u+1\), we obtain

$$\begin{aligned} \begin{aligned} \sum _{t=1}^{n} \sum _{u=1}^t \Vert {\tilde{a}}_{t-u}\Vert \Vert y_t\Vert F(n+1-u)&=\sum _{t=1}^{\infty } \sum _{u=1}^t \Vert {\tilde{a}}_{t-u}\Vert \Vert y_t\Vert 1_{[0,n]}(t)F(n+1-u)\\&=\sum _{t=1}^{\infty } \sum _{v=1}^t \Vert {\tilde{a}}_{v-1}\Vert \Vert y_t\Vert 1_{[0,n]}(t)F(n-t+v)\\&\le \sum _{t=1}^{\infty } \sum _{v=1}^{\infty } \Vert {\tilde{a}}_{v-1}\Vert \Vert y_t\Vert 1_{[0,n]}(t)F(n-t+v). \end{aligned} \end{aligned}$$

Since

$$\begin{aligned}&\lim _{n\rightarrow \infty } \Vert {\tilde{a}}_{v-1}\Vert \Vert y_t\Vert 1_{[0,n]}(t)F(n-t+v) =0,\quad t, v\in {\mathbb {N}},\\&\Vert {\tilde{a}}_{v-1}\Vert \Vert y_t\Vert 1_{[0,n]}(t)F(n-t+v) \le F(1)\Vert {\tilde{a}}_{v-1}\Vert \Vert y_t\Vert ,\quad t, v\in {\mathbb {N}},\\&\quad \sum _{t=1}^{\infty } \sum _{v=1}^{\infty } \Vert {\tilde{a}}_{v-1}\Vert \Vert y_t\Vert <\infty , \end{aligned}$$

the dominated convergence theorem yields

$$\begin{aligned} \lim _{n\rightarrow \infty } \sum _{t=1}^{\infty } \sum _{v=1}^{\infty } \Vert {\tilde{a}}_{v-1}\Vert \Vert y_t\Vert 1_{[0,n]}(t)F(n-t+v) = 0, \end{aligned}$$

hence \(\lim _{n\rightarrow \infty } S_2(n) = 0\).

Choose \(N\in {\mathbb {N}}\) such that \(F(N+1)<1\). Then, by Lemma 4.1, we have, for \(n\ge N\),

$$\begin{aligned} \begin{aligned} S_3(n)&=\sum _{t=1}^{n} \sum _{u=1}^t \Vert {\tilde{a}}_{t-u}\Vert \Vert y_t\Vert \sum _{k=1}^{\infty } \sum _{\ell = 0}^{\infty } \Vert {\tilde{b}}_{n,u,\ell }^{2k+1}\Vert \sum _{s=1}^n \Vert a_{n + 1 - s + \ell }\Vert \\&\le F(1)\left( \sum _{s=1}^{\infty } \Vert a_{s}\Vert \right) \sum _{t=1}^{n} \Vert y_t\Vert \sum _{u=1}^t \Vert {\tilde{a}}_{t-u}\Vert \sum _{k=1}^{\infty } F(n+1)^{2k}\\&\le F(1)\left( \sum _{s=1}^{\infty } \Vert a_{s}\Vert \right) \left( \sum _{u=0}^{\infty } \Vert {\tilde{a}}_{u}\Vert \right) \left( \sum _{t=1}^{\infty } \Vert y_t\Vert \right) \frac{F(n+1)^2}{1-F(n+1)^2}. \end{aligned} \end{aligned}$$

Thus \(\lim _{n\rightarrow \infty }S_3(n)=0\). Similarly, we have, for \(n\ge N\),

$$\begin{aligned} S_4(n) \le F(1) \left( \sum _{s=0}^{\infty } \Vert {\tilde{a}}_{s}\Vert \right) ^2 \left( \sum _{t=1}^{\infty } \Vert y_t\Vert \right) \frac{F(n+1)}{1-F(n+1)^2}, \end{aligned}$$

hence \(\lim _{n\rightarrow \infty }S_4(n)=0\).

Combining, we obtain (1.14). \(\square \)
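For a concrete illustration of Theorem 4.1, consider a scalar AR(1) sketch with \(h(z) = 1/(1-\varphi z)\) and real \(\varphi\), so that \(a_0 = -1\), \(a_1 = \varphi\), \(a_k = 0\) for \(k\ge 2\), \({\tilde a}_k = a_k\) and \(\gamma (k) = \varphi ^{|k|}/(1-\varphi ^2)\) (see Sect. 5); the value \(\varphi = 1/2\) and the right-hand side \(y_t = 3^{-t}\) are illustrative choices, not from the paper. The \(\ell _1\)-errors in (1.14) then decrease with n:

```python
import numpy as np

phi = 0.5
a = lambda k: {0: -1.0, 1: phi}.get(k, 0.0)       # a_k = ã_k for d = 1
gamma = lambda k: phi ** abs(k) / (1 - phi ** 2)  # AR(1) autocovariance
y = lambda t: 3.0 ** (-t)                         # an l^1 right-hand side

def z_limit(s, T=80):
    # (4.4), truncated far beyond machine precision for this y
    return sum(sum(a(s - l) * a(t - l) for l in range(1, min(s, t) + 1)) * y(t)
               for t in range(1, T + 1))

def l1_error(n):
    Tn = np.array([[gamma(s - t) for t in range(n)] for s in range(n)])
    zn = np.linalg.solve(Tn, np.array([y(t) for t in range(1, n + 1)]))
    return sum(abs(zn[s - 1] - z_limit(s)) for s in range(1, n + 1))

errs = [l1_error(n) for n in (4, 8, 12)]
assert errs[0] > errs[1] > errs[2]   # l^1 convergence as in (1.14)
```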

5 Closed-form formulas

In this section, we use Theorem 2.1 to derive closed-form formulas for \(T_n(w)^{-1}\) with rational symbol w that corresponds to a d-variate ARMA process. We assume that the symbol w of \(T_n(w)\) is of the form (1.17) with \(h:{\mathbb {T}}\rightarrow {\mathbb {C}}^{d\times d}\) satisfying (1.18). Then h is an outer function in \(H_2^{d\times d}({\mathbb {T}})\), and another outer function \(h_{\sharp }\in H_2^{d\times d}({\mathbb {T}})\) that appears in (1.4) also satisfies (1.18); see Sect. 6.2 in [16]. Notice that (1.17) with (1.18) implies (1.2) and (1.3).

We can write \(h(z)^{-1}\) in the form

$$\begin{aligned} h(z)^{-1} = - \rho _{0,0} - \sum _{{\mu }=1}^{K} \sum _{j=1}^{m_{\mu }} \frac{1}{(1-{\overline{p}}_{\mu }z)^j}\rho _{{\mu }, j} - \sum _{j=1}^{m_0} z^j \rho _{0,j}, \end{aligned}$$
(5.1)

where

$$\begin{aligned} \left\{ \begin{aligned}&K\in {\mathbb {N}}\cup \{0\}, \quad m_{\mu }\in {\mathbb {N}}, \quad \mu \in \{1,\dots ,K\}, \quad m_0\in {\mathbb {N}}\cup \{0\},\\&p_{\mu }\in {\mathbb {D}}\setminus \{0\}, \quad \mu \in \{1,\dots ,K\}, \quad p_{\mu }\ne p_{\nu }, \quad \mu \ne \nu ,\\&\rho _{{\mu }, j}\in {\mathbb {C}}^{d\times d}, \quad \mu \in \{0,\dots ,K\},\ j \in \{1,\dots ,m_{\mu }\}, \quad \rho _{0,0} \in {\mathbb {C}}^{d\times d},\\&\rho _{{\mu },m_{\mu }}\ne 0, \quad \mu \in \{1,\dots ,K\},\\&\rho _{0,m_0}\ne 0\quad \hbox {if}\ \ m_0\ge 1. \end{aligned} \right. \end{aligned}$$
(5.2)

Here the convention \(\sum _{k=1}^0=0\) is adopted in the sums on the right-hand side of (5.1). For example, if \(m_0=0\), then

$$\begin{aligned} h(z)^{-1} = - \rho _{0,0} - \sum _{{\mu }=1}^{K} \sum _{j=1}^{m_{\mu }} \frac{1}{(1-{\overline{p}}_{\mu }z)^j}\rho _{{\mu }, j}, \end{aligned}$$

while, if \(K=0\), then

$$\begin{aligned} h(z)^{-1} = - \rho _{0,0} - \sum _{j=1}^{m_0} z^j \rho _{0,j} \end{aligned}$$
(5.3)

and the corresponding stationary process \(\{X_k\}\) is a d-variate AR\((m_0)\) process.

Remark 5.1

It should be noticed that the expression (5.1) with (5.2) is uniquely determined, up to a constant unitary factor, by \(\{X_k\}\) satisfying (1.17) with (1.18), since h in the factorization (1.17) with (1.18) is itself determined only up to such a factor (see Sect. 2). Suppose that we start with a d-variate, causal and invertible ARMA process \(\{X_k\}\) in the sense of [4], that is, a \({\mathbb {C}}^d\)-valued, centered, weakly stationary process described by the ARMA equation

$$\begin{aligned} \varPhi (B) X_n = \varPsi (B) \xi _n,\quad n\in {\mathbb {Z}}, \end{aligned}$$

where, for \(r, s\in {\mathbb {N}}\cup \{0\}\) and \(\varPhi _i, \varPsi _j\in {\mathbb {C}}^{d\times d}\ (i=1,\dots ,r,\ j=1,\dots ,s)\),

$$\begin{aligned} \varPhi (z) = I_d - z\varPhi _1 - \cdots - z^r\varPhi _r \quad \hbox {and} \quad \varPsi (z) = I_d - z\varPsi _1 - \cdots - z^s\varPsi _s \end{aligned}$$

are \({\mathbb {C}}^{d\times d}\)-valued polynomials satisfying \(\det \varPhi (z)\ne 0\) and \(\det \varPsi (z)\ne 0\) on \({\overline{{\mathbb {D}}}}\), B is the backward shift operator defined by \(B X_m=X_{m-1}\), and \(\{\xi _k : k\in {\mathbb {Z}}\}\) is a d-variate white noise, that is, a d-variate, centered process such that \(E[\xi _n \xi _m^*]=\delta _{nm}V\) for some positive-definite \(V\in {\mathbb {C}}^{d\times d}\). Notice that the pair \((\varPhi (z),\varPsi (z))\) is not uniquely determined from \(\{X_k\}\); for example, we can replace \((\varPhi (z),\varPsi (z))\) by \(((2-z)\varPhi (z),(2-z)\varPsi (z))\). However, if we put \(h(z) = \varPhi (z)^{-1} \varPsi (z) V^{1/2}\), then h is an outer function belonging to \(H_2^{d\times d}({\mathbb {T}})\) and satisfies (1.17) for the spectral density w of \(\{X_k\}\). Therefore, h is uniquely determined, up to a constant unitary factor, from \(\{X_k\}\). In particular, the expression (5.1) with (5.2) for h is also uniquely determined, up to a constant unitary factor, from \(\{X_k\}\). From these observations and the results in [13] and this paper, we are led to the idea of parameterizing the ARMA processes by the expression (5.1) with (5.2) (see Remark 8 in [13]). This point will be discussed in future work.

By Theorem 2 in [13], \(h_{\sharp }^{-1}\) has the same \(m_0\) and the same poles with the same multiplicities as \(h^{-1}\), that is, for \(m_0\), K and \((p_1, m_1), \dots , (p_K, m_K)\) in (5.1) with (5.2), \(h_{\sharp }^{-1}\) has the form

$$\begin{aligned} h_{\sharp }(z)^{-1} = - \rho _{0,0}^{\sharp } - \sum _{{\mu }=1}^{K} \sum _{j=1}^{m_{\mu }} \frac{1}{(1-{\overline{p}}_{\mu }z)^j}\rho _{{\mu }, j}^{\sharp } - \sum _{j=1}^{m_0} z^j \rho _{0,j}^{\sharp }, \end{aligned}$$
(5.4)

where

$$\begin{aligned} \left\{ \begin{aligned}&\rho _{{\mu }, j}^{\sharp }\in {\mathbb {C}}^{d\times d}, \quad \mu \in \{0,\dots ,K\},\ j \in \{1,\dots ,m_{\mu }\}, \quad \rho _{0,0}^{\sharp } \in {\mathbb {C}}^{d\times d},\\&\rho _{{\mu },m_{\mu }}^{\sharp }\ne 0, \quad \mu \in \{1,\dots ,K\},\\&\rho _{0,m_0}^{\sharp }\ne 0\quad \hbox {if}\ \ m_0\ge 1. \end{aligned} \right. \end{aligned}$$

Notice that if \(d=1\), then we can take \(h_{\sharp }=h\), hence \(\rho _{0,0}=\rho _{0,0}^{\sharp }\) and \(\rho _{{\mu }, j} = \rho _{{\mu }, j}^{\sharp }\) for \(\mu \in \{1,\dots ,K\}\) and \(j\in \{1,\dots ,m_{\mu }\}\).

Recall \({\tilde{h}}\) from (2.1). From (5.4), we have

$$\begin{aligned} {\tilde{h}}(z)^{-1} = - {\tilde{\rho }}_{0,0} - \sum _{{\mu }=1}^{K} \sum _{j=1}^{m_{\mu }} \frac{1}{(1 - p_{\mu } z)^j} {\tilde{\rho }}_{{\mu }, j} - \sum _{j=1}^{m_0} z^j {\tilde{\rho }}_{0,j}, \end{aligned}$$

where

$$\begin{aligned} {\tilde{\rho }}_{0,0} := (\rho _{0,0}^{\sharp })^*, \quad {\tilde{\rho }}_{\mu ,j} := (\rho _{\mu , j}^{\sharp })^*, \quad \mu \in \{0,\dots ,K\},\ j \in \{1,\dots ,m_{\mu }\}. \end{aligned}$$

Recall the sequences \(\{a_k\}\) and \(\{{\tilde{a}}_k\}\) from (2.3) and (2.5), respectively. We have

$$\begin{aligned} a_n&= \sum _{{\mu }=1}^{K} \sum _{j=1}^{m_{\mu }} \left( {\begin{array}{c}n+j-1\\ j-1\end{array}}\right) {\overline{p}}_{\mu }^n \rho _{{\mu }, j}, \quad n\ge m_0 + 1, \end{aligned}$$
(5.5)
$$\begin{aligned} {\tilde{a}}_n&= \sum _{{\mu }=1}^{K} \sum _{j=1}^{m_{\mu }} \left( {\begin{array}{c}n+j-1\\ j-1\end{array}}\right) p_{\mu }^n {\tilde{\rho }}_{{\mu }, j}, \quad n\ge m_0 + 1 \end{aligned}$$
(5.6)

and

$$\begin{aligned} a_n&= \rho _{0,n} + \sum _{{\mu }=1}^{K} \sum _{j=1}^{m_{\mu }} \left( {\begin{array}{c}n+j-1\\ j-1\end{array}}\right) {\overline{p}}_{\mu }^n \rho _{{\mu }, j}, \quad n \in \{0,\dots ,m_0\}, \end{aligned}$$
(5.7)
$$\begin{aligned} {\tilde{a}}_n&= {\tilde{\rho }}_{0,n} + \sum _{{\mu }=1}^{K} \sum _{j=1}^{m_{\mu }} \left( {\begin{array}{c}n+j-1\\ j-1\end{array}}\right) p_{\mu }^n {\tilde{\rho }}_{{\mu }, j}, \quad n \in \{0,\dots ,m_0\}, \end{aligned}$$
(5.8)

where the convention \(\left( {\begin{array}{c}0\\ 0\end{array}}\right) =1\) is adopted; see Proposition 4 in [13].
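As a numerical sanity check of (5.5)–(5.8) in the scalar case, one can recover the coefficients \(a_n\) of \(h(z)^{-1} = -\sum _k a_k z^k\) by an FFT on the unit circle and compare them with the binomial formulas (the pole p and the coefficients \(\rho _{\mu ,j}\) below are illustrative values with \(d=1\), \(K=1\), \(m_1=2\), \(m_0=1\)):

```python
import numpy as np
from math import comb

p = 0.4 + 0.2j
rho = {(0, 0): 1.0, (0, 1): 0.3, (1, 1): 0.7, (1, 2): -0.2}  # rho_{mu,j}
N = 512
z = np.exp(2j * np.pi * np.arange(N) / N)
hinv = -(rho[(0, 0)] + rho[(1, 1)] / (1 - p.conjugate() * z)
         + rho[(1, 2)] / (1 - p.conjugate() * z) ** 2
         + rho[(0, 1)] * z)                                  # (5.1)
a_fft = -np.fft.fft(hinv) / N    # Taylor coefficients: h^{-1} = -sum a_k z^k

def a_formula(n):
    # (5.7) for n <= m0 = 1, and (5.5) for n >= 2
    out = rho[(0, n)] if n <= 1 else 0.0
    for j in (1, 2):
        out += comb(n + j - 1, j - 1) * p.conjugate() ** n * rho[(1, j)]
    return out

for n_ in range(40):
    assert abs(a_fft[n_] - a_formula(n_)) < 1e-10
```

Since \(|p| < 1\), the coefficients decay geometrically and aliasing in the length-512 FFT is negligible.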

We first consider the case \(K = 0\), which corresponds to a d-variate AR\((m_0)\) process. As the following theorem shows, in this case we have simple closed-form formulas for \(T_n(w)^{-1}\).

Theorem 5.1

We assume (1.17), (1.18) and \(K = 0\) for K in (5.1). Thus we assume (5.3). Then the following four assertions hold.

  1. (i)

    For \(n \ge m_0 + 1\), \(s \in \{1,\dots ,n\}\) and \(t \in \{1, \dots , n-m_0\}\), we have

    $$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = \sum _{\lambda =1}^{s\wedge t} {\tilde{a}}_{s-\lambda }^* {\tilde{a}}_{t-\lambda }. \end{aligned}$$
    (5.9)
  2. (ii)

    For \(n \ge m_0 + 1\), \(s \in \{1, \dots , n-m_0\}\) and \(t \in \{1,\dots ,n\}\), we have

    $$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = \sum _{\lambda =1}^{s\wedge t} {\tilde{a}}_{s-\lambda }^* {\tilde{a}}_{t-\lambda }. \end{aligned}$$
    (5.10)
  3. (iii)

    For \(n \ge m_0 + 1\), \(s \in \{1,\dots ,n\}\) and \(t \in \{m_0+1, \dots , n\}\), we have

    $$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = \sum _{\lambda =s\vee t}^n a_{\lambda -s}^* a_{\lambda -t}. \end{aligned}$$
    (5.11)
  4. (iv)

    For \(n \ge m_0 + 1\), \(s \in \{m_0+1, \dots , n\}\) and \(t \in \{1,\dots ,n\}\), we have

    $$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = \sum _{\lambda =s\vee t}^n a_{\lambda -s}^* a_{\lambda -t}. \end{aligned}$$
    (5.12)

Proof

For w satisfying (1.17), (1.18) and \(K = 0\), let \(\{X_k\}\), \(\{X^{\prime }_k\}\), \(\{\varepsilon _k\}\) and \(\{{\tilde{\varepsilon }}_k\}\) be as in Sect. 3.

(i) By (5.4) with \(K=0\), we have \({\tilde{a}}_0={\tilde{\rho }}_{0,0}\), \({\tilde{a}}_k={\tilde{\rho }}_{0,k}\) for \(k\in \{1,\dots ,m_0\}\) and \({\tilde{a}}_k=0\) for \(k \ge m_0 + 1\). In particular, we have \(\sum _{k=0}^{m_0}{\tilde{a}}_{k}X_{u+k} + {\tilde{\varepsilon }}_{-u} = 0\) for \(u\in {\mathbb {Z}}\); see (2.15) in [16]. This implies \({\tilde{\varepsilon }}_{-u} \in V_{[1,n]}^X\), or \(P_{[1,n]}^{\bot } {\tilde{\varepsilon }}_{-u} = 0\), for \(u\in \{1,\dots ,n-m_0\}\). Therefore, (5.9) follows from Theorem 3.1 and (3.8).

(iii) By (5.3), we have \(a_0=\rho _{0,0}\), \(a_k=\rho _{0,k}\) for \(k\in \{1,\dots ,m_0\}\) and \(a_k=0\) for \(k \ge m_0 + 1\). In particular, \(\sum _{k=0}^{m_0}a_{k}X_{u-k} + \varepsilon _u = 0\) for \(u\in {\mathbb {Z}}\); see (2.15) in [16]. This implies \(\varepsilon _u \in V_{[1,n]}^X\), or \(P_{[1,n]}^{\bot } \varepsilon _u = 0\), for \(u\in \{m_0+1, \dots ,n\}\). Therefore, (5.11) follows from Theorem 3.1 and (3.7).

(ii), (iv) By (2.9), (ii) and (iv) follow from (i) and (iii), respectively. \(\square \)
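For \(d=1\), Theorem 5.1 recovers the classical tridiagonal inverse of an AR(1) covariance matrix. A quick numerical check (illustrative values \(\varphi = 1/2\), \(n = 6\); here \(h(z) = 1/(1-\varphi z)\), so \(m_0 = 1\), \(a_0 = -1\), \(a_1 = \varphi\) and \({\tilde a}_k = a_k\)):

```python
import numpy as np

phi, n = 0.5, 6
a = lambda k: {0: -1.0, 1: phi}.get(k, 0.0)
gamma = lambda k: phi ** abs(k) / (1 - phi ** 2)  # AR(1) autocovariance

T = np.array([[gamma(s - t) for t in range(n)] for s in range(n)])
Tinv = np.linalg.inv(T)

tilde = lambda s, t: sum(a(s - l) * a(t - l)
                         for l in range(1, min(s, t) + 1))       # (5.9)/(5.10)
tail = lambda s, t: sum(a(l - s) * a(l - t)
                        for l in range(max(s, t), n + 1))        # (5.11)/(5.12)

for s in range(1, n + 1):
    for t in range(1, n + 1):
        # (5.9)/(5.10) cover min(s,t) <= n - m0; only (n,n) needs (5.11)
        val = tilde(s, t) if min(s, t) <= n - 1 else tail(s, t)
        assert abs(Tinv[s - 1, t - 1] - val) < 1e-10
```

The resulting inverse is tridiagonal, with interior diagonal \(1+\varphi ^2\), corner entries 1 and off-diagonal entries \(-\varphi\), as expected.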

We turn to the case of \(K \ge 1\). In what follows in this section, for K in (5.1), we assume

$$\begin{aligned} K\ge 1. \end{aligned}$$

For \(m_1,\dots ,m_K\) in (5.1), we define \(M\in {\mathbb {N}}\) by

$$\begin{aligned} M:=\sum _{\mu =1}^K m_{\mu }. \end{aligned}$$
(5.13)

For \(\mu \in \{1,\dots ,K\}\), \(p_{\mu }\) in (5.1) and \(i\in {\mathbb {N}}\), we define \(p_{{\mu }, i}: {\mathbb {Z}}\rightarrow {\mathbb {C}}^{d\times d}\) by

$$\begin{aligned} p_{{\mu },i}(k):= \left( {\begin{array}{c}k\\ i-1\end{array}}\right) p_{\mu }^{k - i+1} I_d, \quad k\in {\mathbb {Z}}. \end{aligned}$$
(5.14)

Notice that

$$\begin{aligned} p_{\mu ,i}(0)=\left( {\begin{array}{c}0\\ i-1\end{array}}\right) p_{\mu }^{-i+1} I_d = \delta _{i, 1} I_d. \end{aligned}$$

For \(n\in {\mathbb {Z}}\), we also define \({\mathbf {p}}_n \in {\mathbb {C}}^{dM\times d}\) by the following block representation:

$$\begin{aligned} \begin{aligned} {\mathbf {p}}_n&:=(p_{1, 1}(n), \dots , p_{1, m_1}(n)\ \vert \ p_{2, 1}(n), \dots , p_{2, m_2}(n)\ \vert \\&\quad \cdots \vert \ p_{K, 1}(n), \dots , p_{K, m_{K}}(n))^{\top }. \end{aligned} \end{aligned}$$

Notice that

$$\begin{aligned} {\mathbf {p}}_0 = (I_d, 0, \dots , 0\ \vert \ I_d, 0, \dots , 0 \vert \cdots \vert \ I_d, 0, \dots , 0)^{\top } \in {\mathbb {C}}^{dM\times d}. \end{aligned}$$

We define \(\varLambda \in {\mathbb {C}}^{dM\times dM}\) by

$$\begin{aligned} \varLambda := \sum _{\ell =0}^{\infty } {\mathbf {p}}_{\ell } {\mathbf {p}}_{\ell }^*. \end{aligned}$$

For \({\mu }, {\nu }\in \{1,2,\dots ,K\}\), we define \(\varLambda ^{{\mu },{\nu }}\in {\mathbb {C}}^{dm_{\mu }\times dm_{\nu }}\) by the block representation

$$\begin{aligned} \varLambda ^{{\mu },{\nu }} := \left( \begin{matrix} \lambda ^{{\mu }, {\nu }}(1, 1) &{} \lambda ^{{\mu }, {\nu }}(1, 2) &{} \cdots &{} \lambda ^{{\mu }, {\nu }}(1, m_{\nu }) \\ \lambda ^{{\mu }, {\nu }}(2, 1) &{} \lambda ^{{\mu }, {\nu }}(2, 2) &{} \cdots &{} \lambda ^{{\mu }, {\nu }}(2, m_{\nu }) \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \lambda ^{\mu ,\nu }(m_{\mu },1) &{} \lambda ^{\mu ,\nu }(m_{\mu },2) &{} \cdots &{} \lambda ^{\mu ,\nu }(m_{\mu },m_{\nu }) \\ \end{matrix} \right) , \end{aligned}$$

where, for \(i \in \{1,\dots ,m_{\mu }\}\) and \(j \in \{1,\dots ,m_{\nu }\}\),

$$\begin{aligned} \lambda ^{{\mu },{\nu }}(i, j) := \sum _{r=0}^{j-1} \left( {\begin{array}{c}i-1\\ r\end{array}}\right) \left( {\begin{array}{c}i+j-r-2\\ i-1\end{array}}\right) \frac{p_{\mu }^{j - r -1}{\overline{p}}_{\nu }^{i-r-1}}{(1-p_{\mu }{\overline{p}}_{\nu })^{i+j-r-1}} I_d \in {\mathbb {C}}^{d\times d}. \end{aligned}$$

Then, by Lemma 3 in [13], the matrix \(\varLambda \) has the following block representation:

$$\begin{aligned} \varLambda =\left( \begin{matrix} \varLambda ^{1, 1} &{} \varLambda ^{1, 2} &{} \cdots &{} \varLambda ^{1, K} \\ \varLambda ^{2, 1} &{} \varLambda ^{2, 2} &{} \cdots &{} \varLambda ^{2, K} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \varLambda ^{K, 1} &{} \varLambda ^{K, 2} &{} \cdots &{} \varLambda ^{K, K} \\ \end{matrix} \right) . \end{aligned}$$
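The closed form of \(\lambda ^{\mu ,\nu }(i,j)\) can be checked against a truncation of the defining series \(\varLambda = \sum _{\ell } {\mathbf {p}}_{\ell }{\mathbf {p}}_{\ell }^*\) (a scalar sketch with \(d=1\), \(K=1\) and one illustrative pole of multiplicity \(m_1 = 3\)):

```python
import numpy as np
from math import comb

p, m, L = 0.5 + 0.3j, 3, 300   # illustrative pole, multiplicity, truncation

def p_vec(l):  # (p_{1,1}(l), ..., p_{1,m}(l)) from (5.14), d = 1
    return np.array([comb(l, i - 1) * p ** (l - i + 1) for i in range(1, m + 1)])

Lam = sum(np.outer(p_vec(l), p_vec(l).conj()) for l in range(L))

def lam(i, j):  # closed form of lambda^{1,1}(i, j)
    return sum(comb(i - 1, r) * comb(i + j - r - 2, i - 1)
               * p ** (j - r - 1) * p.conjugate() ** (i - r - 1)
               / (1 - abs(p) ** 2) ** (i + j - r - 1)
               for r in range(j))

for i in range(1, m + 1):
    for j in range(1, m + 1):
        assert abs(Lam[i - 1, j - 1] - lam(i, j)) < 1e-9
```

Since \(|p| < 1\), the truncated series converges geometrically, so 300 terms are far more than enough.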

We define, for \(\mu \in \{1,\dots ,K\}\) and \(j \in \{1,\dots ,m_{\mu }\}\),

$$\begin{aligned} \theta _{{\mu }, j} := - \lim _{z\rightarrow p_{\mu }} \frac{1}{(m_{\mu } - j)!} \frac{d^{m_{\mu } - j}}{dz^{m_{\mu } - j}} \left\{ (z-p_{\mu })^{m_{\mu }} h_{\sharp }(z) h^{\dagger }(z)^{-1}\right\} \in {\mathbb {C}}^{d\times d}, \end{aligned}$$
(5.15)

where

$$\begin{aligned} h^{\dagger }(z) := h(1/{\overline{z}})^*. \end{aligned}$$
(5.16)

We define \(\varTheta \in {\mathbb {C}}^{dM\times dM}\) by the block representation

$$\begin{aligned} \varTheta := \left( \begin{matrix} \varTheta _1 &{} 0 &{} \cdots &{} 0\\ 0 &{} \varTheta _2 &{} \cdots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} \varTheta _{K} \end{matrix} \right) , \end{aligned}$$

where, for \(\mu \in \{1,\dots ,K\}\), \(\varTheta _{\mu } \in {\mathbb {C}}^{dm_{\mu }\times dm_{\mu }}\) is defined by

using \(\theta _{\mu ,j}\) in (5.15) with (5.16).

For \(n\in {\mathbb {Z}}\), we define \(\varPi _n\in {\mathbb {C}}^{dM\times dM}\) by the block representation

$$\begin{aligned} \varPi _n:= \left( \begin{matrix} \varPi _{1, n} &{} 0 &{} \cdots &{} 0\\ 0 &{} \varPi _{2, n} &{} \cdots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} \varPi _{K, n} \end{matrix} \right) , \end{aligned}$$

where, for \(\mu \in \{1,\dots ,K\}\) and \(n\in {\mathbb {Z}}\), \(\varPi _{\mu , n} \in {\mathbb {C}}^{dm_{\mu }\times dm_{\mu }}\) is defined using \(p_{{\mu }, i}(n)\) in (5.14).

The next lemma slightly extends Lemma 17 in [13].

Lemma 5.1

We assume (1.17), (1.18) and \(K \ge 1\) for K in (5.1). Then, for \(n, k, \ell \in {\mathbb {Z}}\) such that \(n+k+\ell \ge m_0\), we have

$$\begin{aligned} \beta _{n+k+\ell +1}^* = {\mathbf {p}}_{\ell }^{\top } \varPi _n \varTheta {\mathbf {p}}_k, \end{aligned}$$

hence

$$\begin{aligned} \beta _{n+k+\ell +1} = {\mathbf {p}}_k^* (\varPi _n \varTheta )^* {\overline{{\mathbf {p}}}}_{\ell }. \end{aligned}$$

The proof of Lemma 5.1 is almost the same as that of Lemma 17 in [13], hence we omit it.

For \(n\in {\mathbb {Z}}\), we define \(G_n, {\tilde{G}}_n\in {\mathbb {C}}^{dM\times dM}\) by

$$\begin{aligned} G_n := \varPi _n \varTheta \varLambda , \quad {\tilde{G}}_n := (\varPi _n \varTheta )^* \varLambda ^{\top }. \end{aligned}$$

Lemma 5.2

We assume (1.17), (1.18) and \(K \ge 1\) for K in (5.1). Then the following two assertions hold.

  1. (i)

    We assume \(n\ge u\ge m_0+1\). Then, for \(k\in {\mathbb {N}}\) and \(\ell \in {\mathbb {N}}\cup \{0\}\), we have

    $$\begin{aligned} b_{n,u,\ell }^{2k-1}&= {\mathbf {p}}_{u-n-1}^* ( {\tilde{G}}_n G_n )^{k-1} (\varPi _n \varTheta )^* {\overline{{\mathbf {p}}}}_{\ell }, \end{aligned}$$
    (5.17)
    $$\begin{aligned} b_{n,u,\ell }^{2k}&= {\mathbf {p}}_{u-n-1}^* ({\tilde{G}}_n G_n )^{k-1} {\tilde{G}}_n \varPi _n \varTheta {\mathbf {p}}_{\ell }. \end{aligned}$$
    (5.18)
  2. (ii)

    We assume \(1\le u\le n-m_0\). Then, for \(k\in {\mathbb {N}}\) and \(\ell \in {\mathbb {N}}\cup \{0\}\), we have

    $$\begin{aligned} {\tilde{b}}_{n,u,\ell }^{2k-1}&= {\mathbf {p}}_{-u}^{\top } ( G_n {\tilde{G}}_n )^{k-1} \varPi _n \varTheta {\mathbf {p}}_{\ell }, \end{aligned}$$
    (5.19)
    $$\begin{aligned} {\tilde{b}}_{n,u,\ell }^{2k}&= {\mathbf {p}}_{-u}^{\top } (G_n {\tilde{G}}_n )^{k-1} G_n (\varPi _n \varTheta )^* {\overline{{\mathbf {p}}}}_{\ell }. \end{aligned}$$
    (5.20)

The proof of Lemma 5.2 will be given in the Appendix.

For \(n\in {\mathbb {N}}\) and \({\mu }, {\nu }\in \{1,2,\dots ,K\}\), we define \(\varXi _n^{{\mu },{\nu }}\in {\mathbb {C}}^{dm_{\mu }\times dm_{\nu }}\) by the block representation

$$\begin{aligned} \varXi _n^{{\mu },{\nu }} := \left( \begin{matrix} \xi _n^{{\mu }, {\nu }}(1, 1) &{} \xi _n^{{\mu }, {\nu }}(1, 2) &{} \cdots &{} \xi _n^{{\mu }, {\nu }}(1, m_{\nu }) \\ \xi _n^{{\mu }, {\nu }}(2, 1) &{} \xi _n^{{\mu }, {\nu }}(2, 2) &{} \cdots &{} \xi _n^{{\mu }, {\nu }}(2, m_{\nu }) \\ \vdots &{} \vdots &{} &{} \vdots \\ \xi _n^{\mu ,\nu }(m_{\mu },1) &{} \xi _n^{\mu ,\nu }(m_{\mu },2) &{} \cdots &{} \xi _n^{\mu ,\nu }(m_{\mu },m_{\nu }) \\ \end{matrix} \right) , \end{aligned}$$

where, for \(n\in {\mathbb {N}}\), \(i \in \{1,\dots ,m_{\mu }\}\) and \(j \in \{1,\dots ,m_{\nu }\}\), \(\xi _n^{{\mu },{\nu }}(i, j) \in {\mathbb {C}}^{d\times d}\) is defined by

$$\begin{aligned} \xi _n^{{\mu },{\nu }}(i, j) := \sum _{r=0}^{j-1} \left( {\begin{array}{c}n+i+j-2\\ r\end{array}}\right) \left( {\begin{array}{c}i+j-r-2\\ i-1\end{array}}\right) \frac{p_{\mu }^{j - r -1}{\overline{p}}_{\nu }^{n+i+j-r-2}}{(1-p_{\mu }{\overline{p}}_{\nu })^{i+j-r-1}} I_d. \end{aligned}$$

For \(n\in {\mathbb {N}}\), we define \(\varXi _n\in {\mathbb {C}}^{dM\times dM}\) by

$$\begin{aligned} \varXi _n:=\left( \begin{matrix} \varXi _n^{1, 1} &{} \varXi _n^{1, 2} &{} \cdots &{} \varXi _n^{1, K} \\ \varXi _n^{2, 1} &{} \varXi _n^{2, 2} &{} \cdots &{} \varXi _n^{2, K} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \varXi _n^{K, 1} &{} \varXi _n^{K, 2} &{} \cdots &{} \varXi _n^{K, K} \\ \end{matrix} \right) . \end{aligned}$$

We also define \(\rho \in {\mathbb {C}}^{dM\times d}\) and \({\tilde{\rho }} \in {\mathbb {C}}^{dM\times d}\) by the block representations

$$\begin{aligned} \rho :=(\rho _{1, 1}^{\top }, \dots , \rho _{1, m_1}^{\top } \ \vert \ \rho _{2, 1}^{\top }, \dots , \rho _{2, m_2}^{\top } \ \vert \ \cdots \ \vert \ \rho _{K, 1}^{\top }, \dots , \rho _{K, m_{K}}^{\top })^{\top } \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} {\tilde{\rho }}&:=({\tilde{\rho }}_{1, 1}^{\top }, \dots , {\tilde{\rho }}_{1, m_1}^{\top } \ \vert \ {\tilde{\rho }}_{2, 1}^{\top }, \dots , {\tilde{\rho }}_{2, m_2}^{\top } \ \vert \ \cdots \ \vert \ {\tilde{\rho }}_{K, 1}^{\top }, \dots , {\tilde{\rho }}_{K, m_{K}}^{\top })^{\top }\\&=\left( \overline{\rho _{1, 1}^{\sharp }}, \dots , \overline{\rho _{1, m_1}^{\sharp }} \ \vert \ \overline{\rho _{2, 1}^{\sharp }}, \dots , \overline{\rho _{2, m_2}^{\sharp }} \ \vert \ \cdots \ \vert \ \overline{\rho _{K, 1}^{\sharp }}, \dots , \overline{\rho _{K, m_{K}}^{\sharp }}\right) ^{\top }, \end{aligned} \end{aligned}$$

respectively. For \(n\in {\mathbb {N}}\), we define \(v_n, {\tilde{v}}_n \in {\mathbb {C}}^{dM\times d}\) by

$$\begin{aligned} v_n := \sum _{\ell =0}^{\infty } {\mathbf {p}}_{\ell } a_{n+\ell }, \quad {\tilde{v}}_n := \sum _{\ell =0}^{\infty } {\overline{{\mathbf {p}}}}_{\ell } {\tilde{a}}_{n+\ell }. \end{aligned}$$

Then, by Lemma 5 in [13], we have

$$\begin{aligned} v_n = \varXi _n \rho , \quad {\tilde{v}}_n = {\overline{\varXi }}_n {\tilde{\rho }}, \quad n\ge m_0 + 1. \end{aligned}$$

Moreover, if \(m_0\ge 1\), then we have

$$\begin{aligned} v_n = \varXi _n \rho + \sum _{\ell =0}^{m_0-n} {\mathbf {p}}_{\ell } \rho _{0,n+\ell }, \quad {\tilde{v}}_n = {\overline{\varXi }}_n {\tilde{\rho }} + \sum _{\ell =0}^{m_0-n} {\overline{{\mathbf {p}}}}_{\ell } {\tilde{\rho }}_{0,n+\ell }, \quad n \in \{1,\dots ,m_0\}. \end{aligned}$$
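In the scalar case \(d=K=1\), \(m_1=1\), \(m_0=0\) of Example 5.2 below, where \(a_\ell = \rho \,{\overline{p}}^{\,\ell }\), \({\mathbf {p}}_\ell = p^\ell \) and \(\varXi _n = {\overline{p}}^{\,n}/(1-\vert p\vert ^2)\), the identity \(v_n = \varXi _n \rho \) reduces to a geometric series, which can be checked numerically; in the following minimal sketch, the values of p and \(\rho \) are arbitrary illustrative choices.

```python
import numpy as np

# Scalar case d = K = 1, m_1 = 1, m_0 = 0 (cf. Example 5.2); p, rho are
# arbitrary illustrative values with |p| < 1 and rho != 0
p, rho, n = 0.6 + 0.1j, 1.5 - 0.5j, 4

# Truncated series v_n = sum_{ell >= 0} p^ell a_{n+ell} with a_k = rho * conj(p)**k
v_n = sum(p ** l * rho * np.conj(p) ** (n + l) for l in range(200))

# Closed form v_n = Xi_n * rho with Xi_n = conj(p)**n / (1 - |p|^2)
closed = np.conj(p) ** n / (1 - abs(p) ** 2) * rho

assert np.isclose(v_n, closed)
```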

For \(n\in {\mathbb {Z}}\), we define \(w_n, {\tilde{w}}_n \in {\mathbb {C}}^{dM\times d}\) by

$$\begin{aligned} w_n := \sum _{\ell =0}^{\infty } {\mathbf {p}}_{\ell - n} a_{\ell }, \quad {\tilde{w}}_n := \sum _{\ell =0}^{\infty } {\overline{{\mathbf {p}}}}_{\ell - n} {\tilde{a}}_{\ell }. \end{aligned}$$

To give closed-form expressions for \(w_n\) and \({\tilde{w}}_n\), we introduce some matrices. For \(n\in {\mathbb {Z}}\) and \({\mu }, {\nu }\in \{1,2,\dots ,K\}\), we define \(\varPhi _n^{{\mu },{\nu }}\in {\mathbb {C}}^{dm_{\mu }\times dm_{\nu }}\) by the block representation

$$\begin{aligned} \varPhi _n^{{\mu },{\nu }} := \left( \begin{matrix} \varphi _n^{{\mu }, {\nu }}(1, 1) &{} \varphi _n^{{\mu }, {\nu }}(1, 2) &{} \cdots &{} \varphi _n^{{\mu }, {\nu }}(1, m_{\nu }) \\ \varphi _n^{{\mu }, {\nu }}(2, 1) &{} \varphi _n^{{\mu }, {\nu }}(2, 2) &{} \cdots &{} \varphi _n^{{\mu }, {\nu }}(2, m_{\nu }) \\ \vdots &{} \vdots &{} &{} \vdots \\ \varphi _n^{\mu ,\nu }(m_{\mu },1) &{} \varphi _n^{\mu ,\nu }(m_{\mu },2) &{} \cdots &{} \varphi _n^{\mu ,\nu }(m_{\mu },m_{\nu }) \\ \end{matrix} \right) , \end{aligned}$$

where, for \(n\in {\mathbb {Z}}\), \(i=1,\dots ,m_{\mu }\) and \(j = 1,\dots ,m_{\nu }\), \(\varphi _n^{{\mu },{\nu }}(i, j) \in {\mathbb {C}}^{d\times d}\) is defined by

$$\begin{aligned} \varphi _n^{{\mu },{\nu }}(i, j) := \sum _{q=0}^{i-1} \sum _{r=0}^{j-1} \left( {\begin{array}{c}j-1\\ r\end{array}}\right) \left( {\begin{array}{c}r+q\\ q\end{array}}\right) \left( {\begin{array}{c}r - n\\ i-q-1\end{array}}\right) \frac{p_{\mu }^{r+q+1-i - n} {\overline{p}}_{\nu }^{r+q}}{(1-p_{\mu }{\overline{p}}_{\nu })^{r+q+1}} I_d. \end{aligned}$$

For \(n\in {\mathbb {Z}}\), we define \(\varPhi _n\in {\mathbb {C}}^{dM\times dM}\) by

$$\begin{aligned} \varPhi _n:=\left( \begin{matrix} \varPhi _n^{1, 1} &{} \varPhi _n^{1, 2} &{} \cdots &{} \varPhi _n^{1, K} \\ \varPhi _n^{2, 1} &{} \varPhi _n^{2, 2} &{} \cdots &{} \varPhi _n^{2, K} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \varPhi _n^{K, 1} &{} \varPhi _n^{K, 2} &{} \cdots &{} \varPhi _n^{K, K} \\ \end{matrix} \right) . \end{aligned}$$

Here are closed-form expressions for \(w_n\) and \({\tilde{w}}_n\).

Lemma 5.3

We have

$$\begin{aligned} w_n&= \varPhi _n \rho + \sum _{\ell =0}^{m_0} {\mathbf {p}}_{\ell - n} \rho _{0,\ell }, \quad n \in {\mathbb {Z}}, \\ {\tilde{w}}_n&= {\overline{\varPhi }}_n {\tilde{\rho }} + \sum _{\ell =0}^{m_0} {\overline{{\mathbf {p}}}}_{\ell - n} {\tilde{\rho }}_{0,\ell }, \quad n \in {\mathbb {Z}}. \end{aligned}$$

The proof of Lemma 5.3 will be given in the Appendix.

Recall M from (5.13). For \(n \in {\mathbb {N}}\) and \(s\in \{1,\dots ,n\}\), we define

$$\begin{aligned} \ell _{n,s}&:= \{w_{n+1-s} - v_{n+1-s}\}^* (I_{dM} - {\tilde{G}}_n G_n )^{-1} \in {\mathbb {C}}^{d \times dM}, \\ {\tilde{\ell }}_{n,s}&:= \{{\tilde{w}}_{s} - {\tilde{v}}_s\}^* (I_{dM} - G_n {\tilde{G}}_n )^{-1} \in {\mathbb {C}}^{d \times dM}, \\ r_{n,s}&:= (\varPi _n \varTheta )^* {\tilde{v}}_s + {\tilde{G}}_n \varPi _n \varTheta v_{n+1-s} \in {\mathbb {C}}^{dM \times d} \end{aligned}$$

and

$$\begin{aligned} {\tilde{r}}_{n,s} := \varPi _n \varTheta v_{n+1-s} + G_n (\varPi _n \varTheta )^* {\tilde{v}}_s \in {\mathbb {C}}^{dM \times d}. \end{aligned}$$

Here are closed-form formulas for \((T_n(w))^{-1}\) with w satisfying (1.18) and \(K \ge 1\).

Theorem 5.2

We assume (1.17), (1.18) and \(K \ge 1\) for K in (5.1). Then the following four assertions hold.

  1. (i)

    For \(n \ge m_0 + 1\), \(s \in \{1,\dots ,n\}\) and \(t \in \{1, \dots , n-m_0\}\), we have

    $$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = {\tilde{r}}_{n,s}^* {\tilde{\ell }}_{n,t}^* + \sum _{\lambda =1}^{s\wedge t} {\tilde{a}}_{s-\lambda }^* {\tilde{a}}_{t-\lambda }. \end{aligned}$$
  2. (ii)

    For \(n \ge m_0 + 1\), \(s \in \{1, \dots , n-m_0\}\) and \(t \in \{1,\dots ,n\}\), we have

    $$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = {\tilde{\ell }}_{n,s} {\tilde{r}}_{n,t} + \sum _{\lambda =1}^{s\wedge t} {\tilde{a}}_{s-\lambda }^* {\tilde{a}}_{t-\lambda }. \end{aligned}$$
  3. (iii)

    For \(n \ge m_0 + 1\), \(s \in \{1,\dots ,n\}\) and \(t \in \{m_0+1, \dots , n\}\), we have

    $$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = r_{n,s}^* \ell _{n,t}^* + \sum _{\lambda =s\vee t}^n a_{\lambda -s}^* a_{\lambda -t}. \end{aligned}$$
  4. (iv)

    For \(n \ge m_0 + 1\), \(s \in \{m_0+1, \dots , n\}\) and \(t \in \{1,\dots ,n\}\), we have

    $$\begin{aligned} \left( T_n(w)^{-1}\right) ^{s,t} = \ell _{n,s} r_{n,t} + \sum _{\lambda =s\vee t}^n a_{\lambda -s}^* a_{\lambda -t}. \end{aligned}$$

Proof

(i) We assume \(n \ge m_0 + 1\), \(s \in \{1,\dots ,n\}\) and \(t \in \{1, \dots , n-m_0\}\). Then, by Lemma 5.2 (ii) above and Lemma 19 in [13], we have

$$\begin{aligned} \begin{aligned}&\sum _{u=1}^t \sum _{k=1}^{\infty } \left\{ \sum _{\lambda =0}^{\infty } {\tilde{b}}_{n,u,\lambda }^{2k-1} a_{n+1-s+\lambda } \right\} ^* {\tilde{a}}_{t-u}\\&\qquad =\sum _{u=1}^t \sum _{k=1}^{\infty } \left\{ \sum _{\lambda =0}^{\infty } {\mathbf {p}}_{-u}^{\top } ( G_n {\tilde{G}}_n )^{k-1} \varPi _n \varTheta {\mathbf {p}}_{\lambda } a_{n+1-s+\lambda } \right\} ^* {\tilde{a}}_{t-u}\\&\qquad =\sum _{u=1}^t \sum _{k=1}^{\infty } \left\{ {\mathbf {p}}_{-u}^{\top } ( G_n {\tilde{G}}_n )^{k-1} \varPi _n \varTheta v_{n+1-s} \right\} ^* {\tilde{a}}_{t-u}\\&\qquad =\sum _{u=1}^t \left\{ {\mathbf {p}}_{-u}^{\top } ( I_{dM} - G_n {\tilde{G}}_n )^{-1} \varPi _n \varTheta v_{n+1-s} \right\} ^* {\tilde{a}}_{t-u}\\&\qquad =v_{n+1-s}^* (\varPi _n \varTheta )^* (I_{dM} - {\tilde{G}}_n^* G_n^*)^{-1} \sum _{u=1}^t {\overline{{\mathbf {p}}}}_{-u} {\tilde{a}}_{t-u}. \end{aligned} \end{aligned}$$

Similarly, by Lemma 5.2 (ii) above and Lemma 19 in [13],

$$\begin{aligned} \sum _{u=1}^t \sum _{k=1}^{\infty } \left\{ \sum _{\lambda =0}^{\infty } {\tilde{b}}_{n,u,\lambda }^{2k} {\tilde{a}}_{s+\lambda } \right\} ^* {\tilde{a}}_{t-u} = {\tilde{v}}_s^* \varPi _n \varTheta G_n^* (I_{dM} - {\tilde{G}}_n^* G_n^*)^{-1} \sum _{u=1}^t {\overline{{\mathbf {p}}}}_{-u} {\tilde{a}}_{t-u}. \end{aligned}$$

On the other hand, \(\sum _{u=1}^t {\overline{{\mathbf {p}}}}_{-u} {\tilde{a}}_{t-u} =\sum _{\lambda =0}^{\infty } {\overline{{\mathbf {p}}}}_{\lambda -t} {\tilde{a}}_{\lambda } -\sum _{\lambda =0}^{\infty } {\overline{{\mathbf {p}}}}_{\lambda } {\tilde{a}}_{t+\lambda } ={\tilde{w}}_{t} - {\tilde{v}}_t\). Therefore, assertion (i) follows from Theorem 2.1 (i).

(iii) We assume \(n \ge m_0 + 1\), \(s \in \{1,\dots ,n\}\) and \(t \in \{m_0+1, \dots , n\}\). Then, by Lemma 5.2 (i) above and Lemma 19 in [13], we have

$$\begin{aligned} \begin{aligned}&\sum _{u=t}^n \sum _{k=1}^{\infty } \left\{ \sum _{\lambda =0}^{\infty } b_{n,u,\lambda }^{2k-1} {\tilde{a}}_{s+\lambda } \right\} ^* a_{u-t}\\&\qquad =\sum _{u=t}^n \sum _{k=1}^{\infty } \left\{ \sum _{\lambda =0}^{\infty } {\mathbf {p}}_{u-n-1}^* ( {\tilde{G}}_n G_n )^{k-1} (\varPi _n \varTheta )^* {\overline{{\mathbf {p}}}}_{\lambda } {\tilde{a}}_{s+\lambda } \right\} ^* a_{u-t}\\&\qquad =\sum _{u=t}^n \sum _{k=1}^{\infty } \left\{ {\mathbf {p}}_{u-n-1}^* ( {\tilde{G}}_n G_n )^{k-1} (\varPi _n \varTheta )^* {\tilde{v}}_{s} \right\} ^* a_{u-t}\\&\qquad =\sum _{u=t}^n \left\{ {\mathbf {p}}_{u-n-1}^* ( I_{dM} - {\tilde{G}}_n G_n )^{-1} (\varPi _n \varTheta )^* {\tilde{v}}_{s} \right\} ^* a_{u-t}\\&\qquad = {\tilde{v}}_{s}^* \varPi _n \varTheta (I_{dM} - G_n^* {\tilde{G}}_n^* )^{-1}\sum _{u=t}^n {\mathbf {p}}_{u-n-1} a_{u-t}. \end{aligned} \end{aligned}$$

Similarly, by Lemma 5.2 (i) above and Lemma 19 in [13], we have

$$\begin{aligned} \begin{aligned}&\sum _{u=t}^n \sum _{k=1}^{\infty } \left\{ \sum _{\lambda =0}^{\infty } b_{n,u,\lambda }^{2k} a_{n+1-s+\lambda } \right\} ^* a_{u-t}\\&\qquad = v_{n+1-s}^* (\varPi _n \varTheta )^* {\tilde{G}}_n^* (I_{dM} - G_n^* {\tilde{G}}_n^* )^{-1} \sum _{u=t}^n {\mathbf {p}}_{u-n-1} a_{u-t}. \end{aligned} \end{aligned}$$

On the other hand, \(\sum _{u=t}^n {\mathbf {p}}_{u-n-1} a_{u-t} = w_{n+1-t} - v_{n+1-t}\). Therefore, assertion (iii) follows from Theorem 2.1 (ii).

(ii), (iv) By (2.9), (ii) and (iv) follow from (i) and (iii), respectively. \(\square \)

Example 5.1

Suppose that \(K\ge 1\), \(m_{\mu }=1\) for \(\mu \in \{1,\dots ,K\}\) and \(m_0=0\). Then,

$$\begin{aligned} h(z)^{-1} = - \rho _{0,0} - \sum _{{\mu }=1}^{K} \frac{1}{1-{\overline{p}}_{\mu }z}\rho _{{\mu }, 1}, \quad h_{\sharp }(z)^{-1} = - \rho ^{\sharp }_{0,0} - \sum _{{\mu }=1}^{K} \frac{1}{1 - {\overline{p}}_{\mu } z} \rho ^{\sharp }_{{\mu }, 1}. \end{aligned}$$

We have

$$\begin{aligned}&{\mathbf {p}}_n^{\top } = (p_1^n I_d, \dots , p_K^n I_d)\in {\mathbb {C}}^{d \times dK}, \quad n \in {\mathbb {Z}},\\&\rho ^{\top } =(\rho _{1, 1}^{\top }, \rho _{2, 1}^{\top }, \dots , \rho _{K, 1}^{\top }) \in {\mathbb {C}}^{dK\times d}, \quad {\tilde{\rho }}^{\top } =\left( \overline{\rho ^{\sharp }_{1, 1}}, \overline{\rho ^{\sharp }_{2, 1}}, \dots , \overline{\rho ^{\sharp }_{K, 1}}\right) \in {\mathbb {C}}^{dK\times d}. \end{aligned}$$

We also have

$$\begin{aligned} \varTheta&= \left( \begin{array}{cccc} p_1 h_{\sharp }(p_1) \rho _{1, 1}^* &{} 0 &{} \cdots &{} 0 \\ 0 &{} p_2 h_{\sharp }(p_2) \rho _{2, 1}^* &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} p_K h_{\sharp }(p_K) \rho _{K, 1}^* \end{array} \right) \in {\mathbb {C}}^{dK\times dK},\\ \varLambda&= \left( \begin{array}{cccc} \frac{1}{1-p_{1}{\overline{p}}_{1}} I_d &{} \frac{1}{1-p_{1}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{1}{1-p_{1}{\overline{p}}_{K}} I_d \\ \frac{1}{1-p_{2}{\overline{p}}_{1}} I_d &{} \frac{1}{1-p_{2}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{1}{1-p_{2}{\overline{p}}_{K}} I_d \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \frac{1}{1-p_{K}{\overline{p}}_{1}} I_d &{} \frac{1}{1-p_{K}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{1}{1-p_{K}{\overline{p}}_{K}} I_d \\ \end{array} \right) \in {\mathbb {C}}^{dK\times dK},\\ \varPi _n&= \left( \begin{array}{cccc} p_1^n I_d &{} 0 &{} \cdots &{} 0 \\ 0 &{} p_2^n I_d &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} p_K^n I_d \end{array} \right) \in {\mathbb {C}}^{dK\times dK},\quad n \in {\mathbb {Z}},\\ \varXi _n&= \left( \begin{array}{cccc} \frac{{\overline{p}}_{1}^{n}}{1-p_{1}{\overline{p}}_{1}} I_d &{} \frac{{\overline{p}}_{2}^{n}}{1-p_{1}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{{\overline{p}}_{K}^{n}}{1-p_{1}{\overline{p}}_{K}} I_d \\ \frac{{\overline{p}}_{1}^{n}}{1-p_{2}{\overline{p}}_{1}} I_d &{} \frac{{\overline{p}}_{2}^{n}}{1-p_{2}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{{\overline{p}}_{K}^{n}}{1-p_{2}{\overline{p}}_{K}} I_d \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \frac{{\overline{p}}_{1}^{n}}{1-p_{K}{\overline{p}}_{1}} I_d &{} \frac{{\overline{p}}_{2}^{n}}{1-p_{K}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{{\overline{p}}_{K}^{n}}{1-p_{K}{\overline{p}}_{K}} I_d \\ \end{array} \right) \in {\mathbb {C}}^{dK\times dK}, \quad n \in {\mathbb {N}},\\ \varPhi _n&= \left( \begin{array}{cccc} 
\frac{p_{1}^{-n}}{1-p_{1}{\overline{p}}_{1}} I_d &{} \frac{p_{1}^{-n}}{1-p_{1}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{p_{1}^{-n}}{1-p_{1}{\overline{p}}_{K}} I_d \\ \frac{p_{2}^{-n}}{1-p_{2}{\overline{p}}_{1}} I_d &{} \frac{p_{2}^{-n}}{1-p_{2}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{p_{2}^{-n}}{1-p_{2}{\overline{p}}_{K}} I_d \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \frac{p_{K}^{-n}}{1-p_{K}{\overline{p}}_{1}} I_d &{} \frac{p_{K}^{-n}}{1-p_{K}{\overline{p}}_{2}} I_d &{} \cdots &{} \frac{p_{K}^{-n}}{1-p_{K}{\overline{p}}_{K}} I_d \\ \end{array} \right) \in {\mathbb {C}}^{dK\times dK}, \quad n \in {\mathbb {Z}},\\ G_n&= \varPi _n \varTheta \varLambda \in {\mathbb {C}}^{dK\times dK}, \quad {\tilde{G}}_n = (\varPi _n \varTheta )^* \varLambda ^{\top } \in {\mathbb {C}}^{dK\times dK}, \quad n \in {\mathbb {Z}},\\ v_n&= \varXi _n\rho \in {\mathbb {C}}^{dK\times d}, \quad {\tilde{v}}_n = {\overline{\varXi }}_n {\tilde{\rho }} \in {\mathbb {C}}^{dK\times d}, \quad n\in {\mathbb {N}},\\ w_n&= \varPhi _n\rho + {\mathbf {p}}_{-n}\rho _{0,0}\in {\mathbb {C}}^{dK\times d}, \quad {\tilde{w}}_n = {\overline{\varPhi }}_n {\tilde{\rho }} + {\overline{{\mathbf {p}}}}_{-n} {\tilde{\rho }}_{0,0} \in {\mathbb {C}}^{dK\times d}, \quad n\in {\mathbb {Z}}. \end{aligned}$$

Example 5.2

In Example 5.1, we further assume \(d=K=1\). Then, we can write \(h(z) = h_{\sharp }(z) = -(1 - {\overline{p}}z)/\rho \), where \(\rho \in {\mathbb {C}}{\setminus }\{0\}\) and \(p \in {\mathbb {D}}{\setminus }\{0\}\). It follows that

$$\begin{aligned}&c_0=-1/\rho , \quad c_1 = {\overline{p}}/\rho , \quad c_k = 0\quad (k\ge 2),\\&a_k = \rho ({\overline{p}})^k, \quad {\tilde{a}}_k = {\overline{a}}_k, \quad k \in {\mathbb {N}}\cup \{0\}. \end{aligned}$$

Since \(\gamma (k) = \sum _{\ell =0}^{\infty } c_{k+\ell }{\overline{c}}_{\ell }\) and \(\gamma (-k)=\overline{\gamma (k)}\) for \(k\in {\mathbb {N}}\cup \{0\}\), we have

$$\begin{aligned} T_2(w) = \frac{1}{\vert \rho \vert ^2} \left( \begin{matrix} 1 + \vert p \vert ^2 &{} -p \\ -{\overline{p}}&{} 1 + \vert p \vert ^2 \end{matrix} \right) , \end{aligned}$$

hence

$$\begin{aligned} T_2(w)^{-1} = \frac{\vert \rho \vert ^2}{1 + \vert p \vert ^2 + \vert p \vert ^4} \left( \begin{matrix} 1 + \vert p \vert ^2 &{} p \\ {\overline{p}}&{} 1 + \vert p \vert ^2 \end{matrix} \right) . \end{aligned}$$

We also have

$$\begin{aligned} {\tilde{A}}_2 = {\overline{\rho }} \left( \begin{matrix} 1 &{} p \\ 0 &{} 1 \end{matrix} \right) \quad \hbox {and} \quad A_2 = \rho \left( \begin{matrix} 1 &{} 0 \\ {\overline{p}}&{} 1 \end{matrix} \right) \end{aligned}$$

for \({\tilde{A}}_2\) and \(A_2\) in (2.14) and (2.15), respectively. By simple calculations, we have

$$\begin{aligned} (\ell _{2,1}, \ell _{2,2})&= \frac{{\overline{\rho }}}{({\overline{p}})^2(1 - \vert p\vert ^6)} (1 + \vert p\vert ^2, {\overline{p}}), \quad \left( {\tilde{\ell }}_{2,1}, {\tilde{\ell }}_{2,2}\right) = \left( \overline{\ell _{2,2}}, \overline{\ell _{2,1}} \right) ,\\ (r_{2,1}, r_{2,2})&= - \rho {\overline{p}}\vert p\vert ^2 (1 - \vert p\vert ^2) ({\overline{p}}(1+\vert p\vert ^2), \vert p\vert ^2), \quad \left( {\tilde{r}}_{2,1}, {\tilde{r}}_{2,2}\right) = \left( \overline{r_{2,2}}, \overline{r_{2,1}} \right) \end{aligned}$$

hence

$$\begin{aligned} T_2(w)^{-1} = {\tilde{A}}_2^* {\tilde{A}}_2 + \left( \begin{matrix} {\tilde{\ell }}_{2,1} \\ {\tilde{\ell }}_{2,2} \end{matrix} \right) \left( {\tilde{r}}_{2,1}, {\tilde{r}}_{2,2} \right) = A_2^* A_2 + \left( \begin{matrix} \ell _{2,1} \\ \ell _{2,2} \end{matrix} \right) \left( r_{2,1}, r_{2,2} \right) \end{aligned}$$

which agrees with the equalities in Theorem 5.2.
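As a numerical sanity check of the \(2\times 2\) inverse above, one can verify \(T_2(w)\,T_2(w)^{-1}=I_2\) for concrete parameters; in the following minimal sketch, the values of p and \(\rho \) are arbitrary illustrative choices.

```python
import numpy as np

# Arbitrary illustrative values with 0 < |p| < 1 and rho != 0
p, rho = 0.5, 2.0
ap = abs(p) ** 2

# T_2(w) and its closed-form inverse from Example 5.2
T2 = np.array([[1 + ap, -p], [-np.conj(p), 1 + ap]]) / abs(rho) ** 2
T2_inv = abs(rho) ** 2 / (1 + ap + ap ** 2) * np.array(
    [[1 + ap, p], [np.conj(p), 1 + ap]]
)

assert np.allclose(T2 @ T2_inv, np.eye(2))
```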

6 Linear-time algorithm

As in Sect. 5, we assume (1.17) and (1.18). Let K be as in (5.1) with (5.2). In this section, we explain how Theorems 5.1 and 5.2 above provide us with a linear-time algorithm to compute the solution Z to the block Toeplitz system (1.19).

For

$$\begin{aligned} Y = (y_1^{\top },\dots ,y_n^{\top })^{\top }\in {\mathbb {C}}^{dn\times d} \quad \hbox {with} \quad y_s \in {\mathbb {C}}^{d \times d}, \quad s \in \{1,\dots ,n\}, \end{aligned}$$
(6.1)

let

$$\begin{aligned} Z = (z_1^{\top },\dots ,z_n^{\top })^{\top }\in {\mathbb {C}}^{dn\times d} \quad \hbox {with} \quad z_s \in {\mathbb {C}}^{d \times d}, \quad s \in \{1,\dots ,n\}, \end{aligned}$$

be the solution to (1.19), that is, \(Z = T_n(w)^{-1}Y\). For \(m_0\) in (5.1), let \(n \ge 2m_0 + 1\) so that \(n-m_0 \ge m_0+1\) holds.

Recall \({\tilde{A}}_n\) and \(A_n\) from (2.14) and (2.15), respectively. If \(K = 0\), then it follows from Lemma 2.1 and Theorem 5.1 (ii), (iv) that

$$\begin{aligned} z_s&= {\tilde{\alpha }}_{n,s}, \quad s \in \{1,\dots ,n-m_0\}, \\ z_s&= \alpha _{n,s}, \quad s \in \{m_0+1,\dots ,n\}, \end{aligned}$$

where

$$\begin{aligned} ({\tilde{\alpha }}_{n,1}^{\top }, \dots , {\tilde{\alpha }}_{n,n}^{\top })^{\top }&:= {\tilde{A}}_n^* {\tilde{A}}_n Y \quad \hbox {with} \quad {\tilde{\alpha }}_{n,s} \in {\mathbb {C}}^{d \times d}, \quad s \in \{1,\dots ,n\}, \\ (\alpha _{n,1}^{\top }, \dots , \alpha _{n,n}^{\top })^{\top }&:= A_n^* A_n Y \quad \hbox {with} \quad \alpha _{n,s} \in {\mathbb {C}}^{d \times d}, \quad s \in \{1,\dots ,n\}. \end{aligned}$$

On the other hand, if \(K \ge 1\), then we see from Lemma 2.1 and Theorem 5.2 (ii), (iv) that

$$\begin{aligned} z_s&= {\tilde{\ell }}_{n,s} {\tilde{R}}_n + {\tilde{\alpha }}_{n,s}, \quad s \in \{1,\dots ,n-m_0\}, \\ z_s&= \ell _{n,s} R_n + \alpha _{n,s}, \quad s \in \{m_0+1,\dots ,n\}, \end{aligned}$$

where

$$\begin{aligned} {\tilde{R}}_n := \sum _{t=1}^n {\tilde{r}}_{n,t} y_t \in {\mathbb {C}}^{d \times d}, \quad R_n := \sum _{t=1}^n r_{n,t} y_t \in {\mathbb {C}}^{d \times d}. \end{aligned}$$

Therefore, algorithms that compute \({\tilde{A}}_n^* {\tilde{A}}_n Y\) and \(A_n^* A_n Y\) in O(n) operations yield an O(n) algorithm for Z. We present such algorithms below.

For \(n\in {\mathbb {N}}\cup \{0\}\), \(\mu \in \{1,\dots ,K\}\) and \(j\in \{1,\dots ,m_{\mu }\}\), we define \(q_{\mu ,j}(n) \in {\mathbb {C}}^{d\times d}\) by \(q_{\mu ,j}(n):=p_{\mu ,j}(n+j-1)\), that is,

$$\begin{aligned} q_{\mu ,j}(n) = \left( {\begin{array}{c}n+j-1\\ j-1\end{array}}\right) p_{\mu }^nI_d. \end{aligned}$$
(6.2)

For \(n\in {\mathbb {N}}\), \(\mu \in \{1,\dots ,K\}\) and \(j\in \{1,\dots ,m_{\mu }\}\), we define the upper triangular block Toeplitz matrix \(Q_{\mu ,j,n} \in {\mathbb {C}}^{dn\times dn}\) by

$$\begin{aligned} Q_{\mu ,j,n} := \left( \begin{matrix} q_{\mu ,j}(0) &{} q_{\mu ,j}(1) &{} \cdots &{} q_{\mu ,j}(n-1)\\ 0 &{} q_{\mu ,j}(0) &{} \cdots &{} q_{\mu ,j}(n-2)\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} q_{\mu ,j}(0) \end{matrix} \right) . \end{aligned}$$

Notice that \(Q^*_{\mu ,j,n}\) is the lower triangular block Toeplitz matrix

$$\begin{aligned} Q^*_{\mu ,j,n} = \left( \begin{matrix} q^*_{\mu ,j}(0) &{} 0 &{} \cdots &{} 0\\ q^*_{\mu ,j}(1) &{} q^*_{\mu ,j}(0) &{} \cdots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ q^*_{\mu ,j}(n-1) &{} q^*_{\mu ,j}(n-2) &{} \cdots &{} q^*_{\mu ,j}(0) \end{matrix} \right) \end{aligned}$$

with \(q^*_{\mu ,j}(n) = \left( {\begin{array}{c}n+j-1\\ j-1\end{array}}\right) {\overline{p}}_{\mu }^nI_d\). For \(n\in {\mathbb {N}}\), \(\mu \in \{1,\dots ,K\}\) and \(j\in \{1,\dots ,m_{\mu }\}\), we define the block diagonal matrices \({\tilde{D}}_{\mu ,j,n} \in {\mathbb {C}}^{dn\times dn}\) and \(D_{\mu ,j,n} \in {\mathbb {C}}^{dn\times dn}\), and, for \(n\ge m_0+1\), the upper and lower triangular block Toeplitz matrices \({\tilde{\varDelta }}_{n} \in {\mathbb {C}}^{dn\times dn}\) and \(\varDelta _{n} \in {\mathbb {C}}^{dn\times dn}\), from the coefficients in (5.5)–(5.8). Note that both \({\tilde{\varDelta }}_{n}\) and \(\varDelta _{n}\) are sparse matrices in the sense that they have only O(n) nonzero elements.

By (5.5)–(5.8), we have

$$\begin{aligned} {\tilde{A}}_n&= {\tilde{\varDelta }}_{n} + \sum _{\mu =1}^K \sum _{j=1}^{m_{\mu }} Q_{\mu ,j,n} {\tilde{D}}_{\mu ,j,n}, \quad n\ge m_0 + 1, \\ A_n&= \varDelta _{n} + \sum _{\mu =1}^K \sum _{j=1}^{m_{\mu }} Q^*_{\mu ,j,n} D_{\mu ,j,n}, \quad n\ge m_0 + 1. \end{aligned}$$

Therefore, it is enough to compute \(Q_{\mu ,i,n} Y\) and \(Q^*_{\mu ,i,n} Y\) for \(Y \in {\mathbb {C}}^{dn\times d}\) in O(n) operations. The following two propositions provide such linear-time algorithms.

Proposition 6.1

Let \(n\in {\mathbb {N}}\), \(\mu \in \{1,\dots ,K\}\) and Y be as in (6.1). We put \(Z_{\mu ,i}=Q_{\mu ,i,n} Y\) for \(i\in \{1,\dots ,m_{\mu }\}\). Then the component blocks \(z_{\mu ,i}(s)\) of \(Z_{\mu ,i}=(z_{\mu ,i}^{\top }(1),\dots ,z_{\mu ,i}^{\top }(n))^{\top }\) satisfy the following equalities:

$$\begin{aligned}&z_{\mu ,i}(n) = q_{\mu ,i}(0) y_n, \quad i \in \{1,\dots ,m_{\mu }\}, \end{aligned}$$
(6.3)
$$\begin{aligned}&z_{\mu ,1}(s) = p_{\mu } z_{\mu ,1}(s+1) +q_{\mu ,1}(0) y_s, \quad s\in \{1,\dots ,n-1\} \end{aligned}$$
(6.4)
$$\begin{aligned}&\begin{aligned} z_{\mu ,i}(s)&= p_{\mu } z_{\mu ,i}(s+1) + z_{\mu ,i-1}(s) + \{q_{\mu ,i}(0) -q_{\mu ,i-1}(0)\} y_s,\\&\qquad i\in \{2,\dots ,m_{\mu }\}, \ s\in \{1,\dots ,n-1\}. \end{aligned} \end{aligned}$$
(6.5)

Proof

From the definition of \(Q_{\mu ,i,n}\), (6.3) is trivial. For \(q_{\mu ,i}(k)\) in (6.2), Pascal’s rule yields the following recursions:

$$\begin{aligned}&q_{\mu ,1}(k+1) = p_{\mu }q_{\mu ,1}(k), \quad k \in {\mathbb {N}}\cup \{0\}, \end{aligned}$$
(6.6)
$$\begin{aligned}&q_{\mu ,i}(k+1) = p_{\mu }q_{\mu ,i}(k) +q_{\mu ,i-1}(k+1), \quad i \in \{2,\dots ,m_{\mu }\}, \ k \in {\mathbb {N}}\cup \{0\}. \end{aligned}$$
(6.7)

For \(s\in \{1,\dots ,n-1\}\), we see, from (6.6),

$$\begin{aligned} \begin{aligned} z_{\mu ,1}(s)&=q_{\mu ,1}(0) y_s + \sum _{t=0}^{n-s-1}q_{\mu ,1}(t+1) y_{s+t+1}\\&=q_{\mu ,1}(0) y_s + p_{\mu } \sum _{t=0}^{n-s-1}q_{\mu ,1}(t) y_{s+t+1} =q_{\mu ,1}(0) y_s + p_{\mu } z_{\mu ,1}(s+1), \end{aligned} \end{aligned}$$

and, from (6.7),

$$\begin{aligned} \begin{aligned} z_{\mu ,i}(s)&=q_{\mu ,i}(0) y_s + \sum _{t=0}^{n-s-1}q_{\mu ,i}(t+1) y_{s+t+1}\\&= \{q_{\mu ,i}(0) -q_{\mu ,i-1}(0)\} y_s + p_{\mu } \sum _{t=0}^{n-s-1}q_{\mu ,i}(t) y_{s+t+1} + \sum _{t=0}^{n-s}q_{\mu ,i-1}(t) y_{s+t}\\&=\{q_{\mu ,i}(0) -q_{\mu ,i-1}(0)\} y_s + p_{\mu } z_{\mu ,i}(s+1) + z_{\mu ,i-1}(s) \end{aligned} \end{aligned}$$

for \(i\in \{2,\dots ,m_{\mu }\}\). Thus, (6.4) and (6.5) follow. \(\square \)

By Proposition 6.1, we can compute \(z_{\mu ,i}(s)\) in the following order in O(n) operations:

$$\begin{aligned} \begin{aligned}&z_{\mu ,1}(n) \ \rightarrow \ \cdots \rightarrow \ z_{\mu ,1}(1) \ \rightarrow \ z_{\mu ,2}(n) \ \rightarrow \cdots \rightarrow \ z_{\mu ,2}(1)\\&\quad \rightarrow \cdots \rightarrow \ z_{\mu ,m_{\mu }}(n) \ \rightarrow \cdots \rightarrow z_{\mu ,m_{\mu }}(1). \end{aligned} \end{aligned}$$

Proposition 6.2

Let \(n\in {\mathbb {N}}\), \(\mu \in \{1,\dots ,K\}\) and Y be as in (6.1). We put \(W_{\mu ,i}=Q^*_{\mu ,i,n} Y\) for \(i\in \{1,\dots ,m_{\mu }\}\). Then the component blocks \(w_{\mu ,i}(s)\) of \(W_{\mu ,i}=(w_{\mu ,i}^{\top }(1),\dots ,w_{\mu ,i}^{\top }(n))^{\top }\) satisfy the following equalities:

$$\begin{aligned}&w_{\mu ,i}(1) = q_{\mu ,i}^*(0) y_1, \quad i \in \{1,\dots ,m_{\mu }\}, \\&w_{\mu ,1}(s+1) = {\overline{p}}_{\mu } w_{\mu ,1}(s) +q_{\mu ,1}^*(0) y_{s+1}, \quad s\in \{1,\dots ,n-1\} \\&\begin{aligned} w_{\mu ,i}(s+1)&= {\overline{p}}_{\mu } w_{\mu ,i}(s) + w_{\mu ,i-1}(s+1) + \{q_{\mu ,i}^*(0) -q_{\mu ,i-1}^*(0)\} y_{s+1},\\&\quad i\in \{2,\dots ,m_{\mu }\}, \ s\in \{1,\dots ,n-1\}. \end{aligned} \end{aligned}$$

The proof of Proposition 6.2 is similar to that of Proposition 6.1; we omit it.

By Proposition 6.2, we can compute \(w_{\mu ,i}(s)\) in the following order in O(n) operations:

$$\begin{aligned} \begin{aligned}&w_{\mu ,1}(1) \ \rightarrow \ \cdots \rightarrow \ w_{\mu ,1}(n) \ \rightarrow \ w_{\mu ,2}(1) \ \rightarrow \cdots \rightarrow \ w_{\mu ,2}(n)\\&\quad \rightarrow \cdots \rightarrow \ w_{\mu ,m_{\mu }}(1) \ \rightarrow \cdots \rightarrow w_{\mu ,m_{\mu }}(n). \end{aligned} \end{aligned}$$
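Analogously, the forward recursion of Proposition 6.2 can be sketched in the scalar case \(d=1\); in the following minimal sketch, the values of \(p_{\mu }\), \(m_{\mu }\) and n are again arbitrary illustrative choices, and the result is checked against direct multiplication by the lower triangular Toeplitz matrix with entries \(q^*_{\mu ,i}(s-t)\).

```python
import numpy as np
from math import comb

# Scalar case d = 1; p, m, n are illustrative choices (|p| < 1, m = m_mu)
p, m, n = 0.3 - 0.5j, 2, 5
rng = np.random.default_rng(1)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def qc(i, k):
    # q*_{mu,i}(k) = binom(k+i-1, i-1) * conj(p)^k, the adjoint coefficient
    return comb(k + i - 1, i - 1) * np.conj(p) ** k

# O(n) forward recursions of Proposition 6.2: w_1(1) -> ... -> w_m(n)
w = np.zeros((m + 1, n), dtype=complex)  # row 0 is zero padding used when i = 1
for i in range(1, m + 1):
    w[i, 0] = qc(i, 0) * y[0]
    c = qc(i, 0) if i == 1 else qc(i, 0) - qc(i - 1, 0)  # coefficient of y_{s+1}
    for s in range(n - 1):
        w[i, s + 1] = np.conj(p) * w[i, s] + w[i - 1, s + 1] + c * y[s + 1]

# Direct O(n^2) check: Q* has entries qc(i, s - t) for s >= t and 0 otherwise
for i in range(1, m + 1):
    Qc = np.array([[qc(i, s - t) if s >= t else 0.0 for t in range(n)]
                   for s in range(n)])
    assert np.allclose(w[i], Qc @ y)
```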