1 Introduction and problem statement

Discrete Roesser state-space models, introduced in Roesser (1975), are of the form

$$\begin{aligned} \begin{bmatrix} \sigma _1 x_1\\ \sigma _2 x_2 \end{bmatrix}= & {} \begin{bmatrix} A_{11}&A_{12}\\ A_{21}&A_{22}\end{bmatrix}\begin{bmatrix} x_1\\ x_2 \end{bmatrix}+\begin{bmatrix} B_1\\ B_2\end{bmatrix} u\nonumber \\ y= & {} \begin{bmatrix}C_1&C_2 \end{bmatrix} \begin{bmatrix} x_1\\ x_2 \end{bmatrix}+Du\; , \end{aligned}$$
(1)

where \(x_i(k_1,k_2)\in \mathbb {R}^{{\mathtt{n}_i}}\) for all \((k_1,k_2)\in \mathbb {Z}^2\), \(A_{ij}\in \mathbb {R}^{\mathtt{n}_i\times \mathtt{n}_j}\), \(i,j=1,2\); \(u(k_1,k_2)\in \mathbb {R}^{\mathtt{m}}\) and \(y(k_1,k_2)\in \mathbb {R}^{\mathtt{p}}\) for all \((k_1,k_2)\in \mathbb {Z}^2\); and \(B:=\begin{bmatrix} B_1^\top&B_2^\top \end{bmatrix}^\top \in \mathbb {R}^{(\mathtt{n}_1+\mathtt{n}_2)\times \mathtt{m}}\), \(C:=\begin{bmatrix} C_1&C_2\end{bmatrix}\in \mathbb {R}^{\mathtt{p}\times (\mathtt{n}_1+\mathtt{n}_2)}\), \(D\in \mathbb {R}^{\mathtt{p}\times \mathtt{m}}\). Such models are widely used to describe the class of quarter-plane causal 2D systems, whose transfer function matrix consists of entries of the form \(\frac{n(z_1,z_2)}{d(z_1,z_2)}=\frac{\sum _{i=0}^m n_i(z_1) z_2^i}{\sum _{j=0}^n d_j(z_1)z_2^j}\) with \(n_m(z_1), d_n(z_1)\ne 0\), that satisfy the following properties:

  1. \(m\le n\);

  2. \(\deg (d_n(z_1))\ge \deg (n_i(z_1))\), \(i=0,\ldots ,m-1\);

  3. \(\deg (d_n(z_1))\ge \deg (d_i(z_1))\), \(i=0,\ldots ,n\).

In this paper we solve the following identification problem: we are given a finite set consisting of N polynomial vector-geometric input-output trajectories \(w_i{:=\begin{bmatrix} u_i\\y_i\end{bmatrix}:\mathbb {Z}^2\rightarrow \mathbb {C}^{\mathtt{m}+\mathtt{p}}}\), \(i=1,\ldots ,N\), generated by a system (1), whose value at \((k_1,k_2)\in \mathbb {Z}^2\) is

$$\begin{aligned} w_i(k_1 ,k_2):= \sum _{j_1=0}^{L_1^i}\sum _{j_2=0}^{L_2^i} \overline{w}_{j_1,j_2}^i k_1^{j_1} k_2^{j_2} {\lambda _{1,i}^{k_1} }{\lambda _{2,i}^{k_2} }\;, \; {i=1,\ldots ,N} \end{aligned}$$
(2)

where \(\overline{w}_{j_1,j_2}^i={\begin{bmatrix}\overline{u}_{j_1,j_2}^i \\ \overline{y}_{j_1,j_2}^i\end{bmatrix} \in \mathbb {C}^{\mathtt{m}+\mathtt{p}}}\), \(j_\ell =0,\ldots ,L_\ell ^i\), \(\ell =1,2\) and \(\lambda _{j,i}\in \mathbb {C}\), \(i=1,\ldots ,N\), \(j=1,2\). In the following we call \((\lambda _{1,i},\lambda _{2,i})\in \mathbb {C}^2\) the frequency associated with the i-th trajectory, \(i=1,\ldots ,N\). Trajectories such as (2) arise from the response of the system (1) with zero initial state to a polynomial-exponential input \(\sum _{j_1=0}^{L_1^i}\sum _{j_2=0}^{L_2^i} \overline{u}_{j_1,j_2}^i k_1^{j_1} k_2^{j_2} {\lambda _{1,i}^{k_1} }{\lambda _{2,i}^{k_2} }\). If \(L_1^i=L_2^i=0\) for \(i=1,\ldots ,N\), (in the following called the vector-exponential case) the directions \(\overline{u}_{0,0}^i\) and \(\overline{y}_{0,0}^i\) are related to each other by the value of the transfer function \(H(z_1,z_2)\) of (1) at the point \((\lambda _{1,i},\lambda _{2,i})\in \mathbb {C}^2\): \(\overline{y}_{0,0}^i=H(\lambda _{1,i},\lambda _{2,i})\overline{u}_{0,0}^i\), \(i=1,\ldots ,N\).
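For readers who wish to experiment with such data, the following minimal Python sketch (hypothetical, randomly generated system matrices; the transfer function formula is the one appearing in Proposition 6 below) produces a vector-exponential direction pair satisfying \(\overline{y}_{0,0}=H(\lambda _{1},\lambda _{2})\overline{u}_{0,0}\):

```python
import numpy as np

def roesser_tf(A, B, C, D, n1, n2, z1, z2):
    # H(z1, z2) = C (diag(z1*I_n1, z2*I_n2) - A)^{-1} B + D, cf. Prop. 6.
    Z = np.diag([z1] * n1 + [z2] * n2)
    return C @ np.linalg.solve(Z - A, B) + D

rng = np.random.default_rng(0)
n1, n2, m, p = 2, 2, 1, 1                 # hypothetical dimensions
A = rng.standard_normal((n1 + n2, n1 + n2))
B = rng.standard_normal((n1 + n2, m))
C = rng.standard_normal((p, n1 + n2))
D = rng.standard_normal((p, m))

lam1, lam2 = 0.5 + 0.1j, -0.3 + 0.2j      # frequency (lambda_1, lambda_2)
u_bar = rng.standard_normal((m, 1))
y_bar = roesser_tf(A, B, C, D, n1, n2, lam1, lam2) @ u_bar
# w(k1, k2) = col(u_bar, y_bar) * lam1**k1 * lam2**k2 is then a
# vector-exponential trajectory of the system.
```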

We want to find matrices A, B, C, D such that (1) holds for the data (2) and some associated state trajectories \(\widehat{x}_i:=\begin{bmatrix} \widehat{x}_{i,1}\\ \widehat{x}_{i,2}\end{bmatrix}\), \(i=1,\ldots ,N\). Such a quadruple (A, B, C, D) will be called an unfalsified Roesser model for the data (2).

Roesser model system identification has been considered previously, see Cheng et al. (2017), Farah et al. (2014), Ramos (1994), Ramos et al. (2011), Ramos and dos Santos (2011), and Ramos and Mercère (2016, 2017a, b), and it has been applied in modelling the spatial dynamics of deformable mirrors (see Voorsluys 2015), heat exchangers (see Farah et al. 2016), batch processes controlled by iterative learning control (see Wei et al. 2015), and in image processing (see Ramos and Mercère 2017b). Our approach to computing an unfalsified model differs fundamentally from previous work. It is based on an idea pursued in the 2D continuous-time case in Rapisarda and Antoulas (2016), and derived from the 1D Loewner framework, see Antoulas and Rapisarda (2015) and Rapisarda and Antoulas (2015) [and also Rapisarda and Schaft (2013) and Rapisarda and Trentelman (2011) for analogous approaches to 1D identification based on the factorization of “energy” matrices]. Namely, we use the data (2) to compute state trajectories corresponding to it, and subsequently we compute a state representation for the data and such state trajectories by solving linear equations in the unknown matrices A, B, C, D. From a methodological point of view our two-stage procedure is thus analogous to 2D subspace identification algorithms: first compute state trajectories compatible with the data, then fit a state-space model to the input-output and the computed state trajectories. However, our approach is essentially an application of the consequences of duality, rather than of shift-invariance as in subspace identification: in our procedure, state trajectories are computed by factorizing constant matrices built from the data and its dual, rather than Hankel-type matrices consisting of shifts of the data in the two independent variables. This aspect makes our method conceptually simple, and it helps to reduce the amount of bookkeeping necessary for calculations. Moreover, approaching the problem from a frequency-domain and a duality point of view allows us to avoid imposing restrictive assumptions on the data-generating system, such as the separability-in-the-denominator property required by earlier work on 2D subspace identification such as Cheng et al. (2017), Ramos (1994), Ramos et al. (2011) and Ramos and dos Santos (2011). We note that the recent publication Ramos and Mercère (2017b) provides a subspace algorithm for the identification of general, i.e. not necessarily separable-in-the-denominator, Roesser models.

The paper is structured as follows. In Sect. 2 we gather the necessary background material; this section contains several original results in the theory of 2D bilinear- and quadratic difference forms, a tool extensively used in our approach. In Sect. 3 we illustrate some original results on duality of Roesser models, including a “pairing” result crucial for our identification procedure. In Sect. 4 we introduce the data matrix associated with the primal and dual data and relate it to a 2D Stein matrix equation; in Sect. 5 we show how to compute state trajectories from its solutions. In Sect. 6 we illustrate our method for identifying Roesser models. Section 7 contains some concluding remarks.

Notation We denote by \(\mathbb {R}^{m\times n}\) (respectively \(\mathbb {C}^{m\times n}\)) the set of all \(m\times n\) matrices with entries in \(\mathbb {R}\) (respectively \(\mathbb {C}\)). \(\mathbb {C}^{\bullet \times n}\) denotes the set of matrices with n columns and an unspecified (finite) number of rows. Given \(A\in \mathbb {C}^{m\times n}\), we denote by \(A^*\) its conjugate transpose. If A has full column rank, we denote by \(A^\dagger \) a left-inverse of A. If A, B are matrices with the same number of columns (or linear maps acting on the same space), \({{\mathrm{\mathrm col}}}(A,B)\) is the matrix (map) obtained stacking A on top of B.

\(\mathbb {C}[z_1^{-1},z_1,z_2^{-1},z_2]\) is the ring of bivariate Laurent polynomials in the indeterminates \(z_1\), \(z_2\) with complex coefficients, and \(\mathbb {C}^{m\times n}[z_1^{-1},z_1,z_2^{-1},z_2]\) that of \(m\times n\) bivariate Laurent polynomial matrices. The ring of \(m\times n\) Laurent polynomial matrices with real coefficients in the indeterminates \(\zeta _1\), \(\zeta _2\), \(\eta _1\), \(\eta _2\) is denoted by \(\mathbb {R}^{m\times n}[\zeta _1^{-1},\zeta _1,\zeta _2^{-1},\zeta _2,\eta _1^{-1},\eta _1,\eta _2^{-1},\eta _2]\).

We denote by \(\left( \mathbb {C}^\mathtt{w}\right) ^{\mathbb {Z}^2}\) the set \(\left\{ w:\mathbb {Z}^2\rightarrow \mathbb {C}^\mathtt{w}\right\} \) consisting of all sequences on \(\mathbb {Z}^2\) taking their values in \(\mathbb {C}^\mathtt{w}\), and by \(\ell _2(\mathbb {Z}^2,\mathbb {C}^\mathtt{w})\) the set of square-summable sequences in \(\left( \mathbb {C}^\mathtt{w}\right) ^{\mathbb {Z}^2}\). The notation \((\cdot ,\cdot )\) appended to a symbol (e.g. u) is used to denote a trajectory \(u:\mathbb {Z}^2\rightarrow \mathbb {C}^\mathtt{{u}}\). With slight abuse of notation, given \(\lambda \in \mathbb {C}\) we denote by \(\exp _{\lambda }\) the geometric sequence whose value at \(k\in \mathbb {Z}\) is \(\exp _{\lambda }(k):=\lambda ^k\).

We define \({{\mathrm{\text{ vec }}}}\) as the linear map \({{\mathrm{\text{ vec }}}}: \mathbb {R}^{m\times n} \rightarrow \mathbb {R}^{mn}\) with

$$\begin{aligned} {{\mathrm{\text{ vec }}}}\left( \begin{bmatrix} a_{ij}\end{bmatrix}_{i=1,\ldots ,m,j=1,\ldots ,n}\right) :=\begin{bmatrix}a_{11}&\ldots&a_{1n}&\ldots&a_{m1}&\ldots&a_{mn} \end{bmatrix}^\top \; , \end{aligned}$$

and \({{\mathrm{\text{ mat }}}}\) as the linear map \({{\mathrm{\text{ mat }}}}: \mathbb {R}^{mn}\rightarrow \mathbb {R}^{m\times n}\) with

$$\begin{aligned} {{\mathrm{\text{ mat }}}}\left( \begin{bmatrix}a_{11}&\ldots&a_{1n}&\ldots&a_{m1}&\ldots&a_{mn} \end{bmatrix}^\top \right) :=\begin{bmatrix} a_{ij}\end{bmatrix}_{i=1,\ldots ,m,j=1,\ldots ,n}\; . \end{aligned}$$
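As a concrete illustration, here is a minimal Python sketch of the two maps; note the row-major ordering prescribed by the definitions above:

```python
import numpy as np

def vec(X):
    # Stack the rows of X into a single column (row-major ordering).
    return X.reshape(-1, order='C')

def mat(x, m, n):
    # Inverse map: rebuild the m x n matrix from its vectorization.
    return x.reshape(m, n, order='C')

X = np.arange(6).reshape(2, 3)            # [[0, 1, 2], [3, 4, 5]]
assert (mat(vec(X), 2, 3) == X).all()     # mat inverts vec
```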

2 Background material

2.1 Controllable 2D behaviors

Define the shift operator \(\sigma _1\) by

$$\begin{aligned}&\sigma _1 : \left( \mathbb {C}^\mathtt{w}\right) ^{\mathbb {Z}^2} \rightarrow \left( \mathbb {C}^\mathtt{w}\right) ^{\mathbb {Z}^2}\\&\quad (\sigma _1 w)(k_1,k_2):=w(k_1+1,k_2)\qquad (k_1,k_2)\in \mathbb {Z}^2\; , \end{aligned}$$

and define \(\sigma _2\) analogously. The reverse shift operators \(\sigma _i^{-1}: \left( \mathbb {C}^\mathtt{w}\right) ^{\mathbb {Z}^2} \rightarrow \left( \mathbb {C}^\mathtt{w}\right) ^{\mathbb {Z}^2}\), \(i=1,2\), are defined in the obvious way.

A subspace \(\mathfrak {B}\) of \(\left( \mathbb {C}^\mathtt{w}\right) ^{\mathbb {Z}^2}\) is called a 2D linear behavior if it is the solution set of a system of linear, constant-coefficient partial difference equations in two independent variables, i.e.

$$\begin{aligned} \mathfrak {B}=\left\{ w\in \left( \mathbb {C}^\mathtt{w}\right) ^{\mathbb {Z}^2} \text{ s.t. } R\left( \sigma _1^{-1},\sigma _1,\sigma _2^{-1},\sigma _2\right) w=0\right\} \end{aligned}$$
(3)

where R is a Laurent polynomial matrix in the indeterminates \(z_i\), \(i=1,2\). We call (3) a kernel representation of \(\mathfrak {B}\), and we denote the set consisting of all 2D linear behaviors with \(\mathtt{w}\) external variables by \(\mathcal{L}^{\mathtt{w}}_2\).

In this paper we follow the behavioral and algebraic terminology of Rocha (1990) and Rocha and Willems (1991) [see also Rocha and Zerz (2005) for refinements of such behavioral concepts and for the corresponding algebraic characterizations]. A \(\mathtt{g}\times \mathtt{w}\) Laurent polynomial matrix R in the indeterminates \(z_1,z_2\) is called left prime if, whenever \(R=DR^\prime \) for some \(D, R^\prime \in \mathbb {R}^{\bullet \times \bullet }[z_1^{-1},z_1,z_2^{-1},z_2]\), the matrix D is unimodular. The property of right-primeness is defined analogously. The following is a characterization of controllable behaviors [see p. 414 of Rocha and Willems (1991) for the definition].

Proposition 1

The following statements are equivalent:

  1. \(\mathfrak {B}\in \mathcal{L}^{\mathtt{w}}_2\) is controllable;

  2. \(\mathfrak {B}\) is the closure of \({\mathfrak {B}\cap \ell _2\left( \mathbb {Z}^2,\mathbb {R}^\mathtt{w}\right) }\) in the topology of pointwise convergence;

  3. \(\exists \) \(\mathtt{g}\in \mathbb {N}\), \(R\in \mathbb {R}^{\mathtt{g}\times \mathtt{w}}[z_1^{-1},z_1,z_2^{-1},z_2]\) left prime such that \(\mathfrak {B}=\ker ~R(\sigma _1,\sigma _2)\);

  4. \(\exists \) \(\mathtt{l}\in \mathbb {N}\), \(M\in \mathbb {R}^{\mathtt{w}\times \mathtt{l}}[z_1^{-1},z_1,z_2^{-1},z_2]\) right prime such that

    $$\begin{aligned} \mathfrak {B}=\left\{ w \text{ s.t. } \text{ there } \text{ exists } \ell \in \left( \mathbb {C}^\mathtt{l}\right) ^{\mathbb {Z}^2} \text{ s.t. } w=M\left( \sigma _1^{-1},\sigma _1,\sigma _2^{-1},\sigma _2\right) \ell \right\} \end{aligned}$$
    (4)

Proof

See Theorem 1 p. 415 of Rocha and Willems (1991). \(\square \)

It follows from Proposition 2 of Rocha and Willems (1991) that a controllable behavior is uniquely identified by its transfer function [see p. 415 of Rocha and Willems (1991) for the definition]. Controllability is crucial for the definition of the dual of an \(\mathcal {L}_2^{\mathtt{w}}\)-behavior, introduced in the next section.

2.2 Dual discrete 2D behaviors

Let \(\mathfrak {B}\in \mathcal {L}_2^\mathtt{w}\) be a controllable 2D system, and let \(J=J^\top \in \mathbb {R}^{\mathtt{w}\times \mathtt{w}}\) satisfy \(J^2=I_\mathtt{w}\). Define the J-dual of \(\mathfrak {B}\) as the set \(\mathfrak {B}^{\perp _J}\) consisting of all trajectories \(w^\prime :\mathbb {Z}^2\rightarrow \mathbb {R}^\mathtt{w}\) such that for all \(w\in \mathfrak {B}\cap \ell _2(\mathbb {Z}^2,\mathbb {R}^\mathtt{w})\) and \(w^\prime \in \mathfrak {B}^{\perp _J}\cap \ell _2(\mathbb {Z}^2,\mathbb {R}^\mathtt{w})\) the following equality holds:

$$\begin{aligned} \sum _{k_1=-\infty }^{+\infty }\sum _{k_2=-\infty }^{+\infty } w^{\prime }(k_1,k_2)^\top J w(k_1,k_2)=0\; . \end{aligned}$$

The definition of orthogonality on the complexification of \(\mathfrak {B}\) is straightforward by substituting \(w^{\prime }(k_1,k_2)^\top \) with \(w^{\prime }(k_1,k_2)^*\) in the previous equation. \(\mathfrak {B}^{\perp _J}\) is also a controllable 2D linear behavior, and its image- and kernel representations are related to those of \(\mathfrak {B}\) as stated in the following result.

Proposition 2

Let \(\mathfrak {B}\in \mathcal {L}_2^\mathtt{w}\), \(\mathfrak {B}=\ker ~R(\sigma _1,\sigma _2)={{\mathrm{\mathrm im}}}~M(\sigma _1,\sigma _2)\) with \(R\in \mathbb {R}^{\bullet \times \mathtt{w}}[z_1^{-1},z_1,z_2^{-1},z_2]\) left-prime, \(M\in \mathbb {R}^{\mathtt{w}\times \bullet }[z_1^{-1},z_1,z_2^{-1},z_2]\) right-prime. Then \(\mathfrak {B}^{\perp _J}\) is controllable and

$$\begin{aligned} \mathfrak {B}^{\perp _J}=\ker ~M(\sigma _1^{-1},\sigma _2^{-1})^\top J={{\mathrm{\mathrm im}}}~JR(\sigma _1^{-1},\sigma _2^{-1})^\top \; . \end{aligned}$$
(5)

Proof

We first prove the first equality in (5). Without loss of generality (if necessary, changing the latent variable \(\ell \) of the image representation by postmultiplication of M by a suitable unimodular matrix) we can assume that M is polynomial, i.e. \(M(z_1,z_2)=\sum _{i_1=0}^{N_1}\sum _{i_2=0}^{N_2} M_{i_1,i_2} z_1^{i_1} z_2^{i_2}\). Now consider the following chain of equalities:

$$\begin{aligned} 0= & {} \sum _{k_1=-\infty }^{+\infty }\sum _{k_2=-\infty }^{+\infty } w^{\prime }(k_1,k_2)^\top J w(k_1,k_2)\\= & {} \sum _{k_1=-\infty }^{+\infty }\sum _{k_2=-\infty }^{+\infty } w^{\prime }(k_1,k_2)^\top J \left( M(\sigma _1,\sigma _2) \ell \right) (k_1,k_2)\\= & {} \sum _{k_1=-\infty }^{+\infty }\sum _{k_2=-\infty }^{+\infty } w^{\prime }(k_1,k_2)^\top J \left( \sum _{i_1=0}^{N_1}\sum _{i_2=0}^{N_2} M_{i_1,i_2} \ell (k_1+i_1,k_2+i_2)\right) \; , \end{aligned}$$

where in the last equality we have used \(M(z_1,z_2)=\sum _{i_1=0}^{N_1}\sum _{i_2=0}^{N_2} M_{i_1,i_2} z_1^{i_1} z_2^{i_2}\). Now define \(k_j^\prime :=k_j+i_j\), \(j=1,2\), and observe that \(k_j\rightarrow \pm \infty \) if and only if \(k_j^\prime \rightarrow \pm \infty \), \(j=1,2\).

With these substitutions, we can write the last displayed expression as

$$\begin{aligned} 0=\sum _{k_1^\prime =-\infty }^{+\infty }\sum _{k_2^\prime =-\infty }^{+\infty } \left( \sum _{i_1=0}^{N_1}\sum _{i_2=0}^{N_2} w^{\prime }(k_1^\prime -i_1,k_2^\prime -i_2)^\top JM_{i_1,i_2} \ell (k_1^\prime ,k_2^\prime )\right) \; . \end{aligned}$$

In turn the last expression can be rewritten as

$$\begin{aligned} 0= & {} \sum _{k_1^\prime =-\infty }^{+\infty }\sum _{k_2^\prime =-\infty }^{+\infty }\underset{=\left( M(\sigma _1^{-1},\sigma _2^{-1})^\top Jw^\prime \right) (k_1^\prime ,k_2^\prime )}{\underbrace{\left( \sum _{i_1=0}^{N_1}\sum _{i_2=0}^{N_2}w^{\prime }(k_1^\prime -i_1,k_2^\prime -i_2)^\top J M_{i_1,i_2} \right) }}\ell (k_1^\prime ,k_2^\prime )\; . \end{aligned}$$

Now use the arbitrariness of \(\ell (\cdot ,\cdot )\) to conclude that

$$\begin{aligned} \sum _{i_1=0}^{N_1}\sum _{i_2=0}^{N_2}w^{\prime }(k_1^\prime -i_1,k_2^\prime -i_2)^\top J M_{i_1,i_2} =0\; , \end{aligned}$$

for all \((k_1^\prime ,k_2^\prime )\in \mathbb {Z}^2\). This proves that \(\mathfrak {B}^{\perp _J}=\ker ~M(\sigma _1^{-1},\sigma _2^{-1})^\top J\). Right-primeness of \(M(z_1,z_2)\) and \(J^2=I_\mathtt{w}\) imply left-primeness of \(M(z_1,z_2)^\top J\); this yields controllability of \(\mathfrak {B}^{\perp _J}\).

The second equality follows from \(RM=0\), \(J^2=I_\mathtt{w}\), and standard results in behavioral system theory (see Rocha 1990).

\(\square \)

In the rest of this paper we assume that \(\mathtt{m}=\mathtt{p}\), and we denote by J the matrix

$$\begin{aligned} J:=\begin{bmatrix} 0_{\mathtt{m}\times \mathtt{p}}&I_{\mathtt{m}}\\ I_{\mathtt{p}}&0_{\mathtt{p}\times \mathtt{m}} \end{bmatrix}\; . \end{aligned}$$
(6)

We also assume that the external variable of \(\mathfrak {B}\) is partitioned as \(w={{\mathrm{\mathrm col}}}(u,y)\), with u an \(\mathtt{m}\)-dimensional input variable and y a \(\mathtt{p}\)-dimensional output variable. It is a matter of straightforward verification to check that Proposition 2 implies the following result.

Corollary 1

Assume that \(\mathfrak {B}\in \mathcal {L}^\mathtt{w}_2\) is controllable, and let \(J=J^\top \) be such that \(J^2=I_\mathtt{w}\). Denote the transfer function of \(\mathfrak {B}\) by \(H(z_1,z_2)\) and the transfer function of \(\mathfrak {B}^{\perp _J}\) by \(H^\prime (z_1,z_2)\). Then \(H^\prime (z_1,z_2)=-H(z_1^{-1},z_2^{-1})^\top \).

We conclude this section by showing how to compute trajectories of the dual system from those of the primal one with the mirroring technique [see also Kaneko and Rapisarda (2003), Kaneko and Rapisarda (2007) and Rapisarda and Willems (1997) for the use of such idea in the 1D case]. Given the importance of dual trajectories in our identification procedure, such technique is crucial to our approach.

Proposition 3

Let \(\mathfrak {B}\in \mathcal {L}_2^\mathtt{w}\) be controllable, and \(J=J^\top \in \mathbb {R}^{\mathtt{w}\times \mathtt{w}}\) be such that \(J^2=I_\mathtt{w}\). Let \(\overline{w}\in \mathbb {C}^\mathtt{w}\), and denote by \(w(\cdot ,\cdot )\in \mathfrak {B}\) a trajectory whose value at \((k_1,k_2)\) is \(\overline{w} {\lambda _1}^{k_1} {\lambda _2}^{k_2}\). Assume that \(\overline{v}\in \mathbb {C}^\mathtt{w}\) satisfies \(\overline{v}^*\overline{w}=0\). Then the trajectory \(v(\cdot ,\cdot )\) whose value at \((k_1,k_2)\) is \(J\overline{v} \left( \frac{1}{\lambda _1^{*}}\right) ^{k_1} \left( \frac{1}{\lambda _2^{*}}\right) ^{k_2}\) belongs to \(\mathfrak {B}^{\perp _J}\).

Proof

Let \(M\in \mathbb {R}^{\mathtt{w}\times \mathtt{m}}[z_1,z_2]\) and \(R\in \mathbb {R}^{\mathtt{p}\times \mathtt{w}}[z_1,z_2]\) with \(\mathtt{w}=\mathtt{p}+\mathtt{m}\) induce an image, respectively kernel representation of \(\mathfrak {B}\). Since \(R(z_1,z_2)M(z_1,z_2)=0_{\mathtt{p}\times \mathtt{m}}\), for all \((\lambda _1,\lambda _2)\in \mathbb {C}^2\) it holds that \({{\mathrm{\mathrm im}}}~M(\lambda _1,\lambda _2)=\left( {{\mathrm{\mathrm im}}}~J R(\lambda _1,\lambda _2)^*\right) ^{\perp }\), with orthogonality in the Euclidean sense in \(\mathbb {C}^\mathtt{w}\). It follows that

$$\begin{aligned} J\overline{v}\left( \frac{1}{\lambda _1^{*}}\right) ^{\cdot } \left( \frac{1}{\lambda _2^{*}}\right) ^{\cdot } \in {{\mathrm{\mathrm im}}}~JR\left( \sigma _1^{-1},\sigma _2^{-1}\right) ^\top =\mathfrak {B}^{\perp _J}\; . \end{aligned}$$

This concludes the proof. \(\square \)

Example 1

We show how the result of Proposition 3 can be constructively used for computing trajectories of the dual system from trajectories of the primal one. Consider the case

$$\begin{aligned} J:=\begin{bmatrix} 0&I_{\mathtt{m}}\\ I_{\mathtt{m}}&0\end{bmatrix}\; . \end{aligned}$$

Partition w as \(w=:\begin{bmatrix}u\\y \end{bmatrix}\) as in (2) and let \(w(k_1,k_2)=\begin{bmatrix} \overline{u} \\ \overline{y} \end{bmatrix} {\lambda _1}^{k_1} {\lambda _2}^{k_2}\) for some \(\overline{u},\overline{y}\in \mathbb {C}^{\mathtt{m}}\) and \(\lambda _1,\lambda _2\in \mathbb {C}\). It follows from Proposition 3 that the trajectory \(w^\prime \) whose value at \((k_1,k_2)\) is defined by \(w^\prime (k_1,k_2):=\begin{bmatrix}\overline{y} \\ -\overline{u} \end{bmatrix} {\left( \frac{1}{\lambda _1^*}\right) }^{k_1} {\left( \frac{1}{\lambda _2^*}\right) }^{k_2}\) belongs to the dual system. Thus in the case of \(J=\begin{bmatrix} 0&I_{\mathtt{m}}\\ I_{\mathtt{m}}&0\end{bmatrix}\) dual trajectories can be computed from primal ones by inspection. \(\square \)
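In computations the mirroring step of Example 1 is a one-liner; a minimal Python sketch (illustrative names) follows:

```python
import numpy as np

def mirror(u_bar, y_bar, lam1, lam2):
    # Primal direction col(u_bar, y_bar) at frequency (lam1, lam2) gives
    # the dual direction col(y_bar, -u_bar) at (1/lam1*, 1/lam2*).
    return np.concatenate([y_bar, -u_bar]), 1 / np.conj(lam1), 1 / np.conj(lam2)

v_bar, mu1, mu2 = mirror(np.array([1.0 + 0j]), np.array([2.0 - 1j]),
                         0.5 + 0.1j, -0.3 + 0.2j)
```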

2.3 2D bilinear- and quadratic difference forms

Bilinear- and quadratic difference forms (BdF and QdF in the following) have been used in the analysis of stability of 2D discrete systems in Kojima et al. (2007, 2011), Napp-Avelli et al. (2011a, b) and Rapisarda and Rocha (2012). We briefly review them here, and introduce some novel results.

To simplify the notation for elements of the ring

$$\begin{aligned} \mathbb {R}^{\mathtt{w}_1\times \mathtt{w}_2}[\zeta _1^{-1},\zeta _1,\zeta _2^{-1},\zeta _2,\eta _1^{-1},\eta _1,\eta _2^{-1},\eta _2]\; , \end{aligned}$$

we define multi-indices \({\mathbf{j}}:=(j_1,j_2),{\mathbf{i}}:=(i_1,i_2)\in \mathbb {Z}^2\), and the notation \(\zeta :=(\zeta _1,\zeta _2)\) and \(\eta :=(\eta _1,\eta _2)\). Every element of the ring can be written in the form

$$\begin{aligned} \varPhi (\zeta ,\eta )=\sum _{{\mathbf{i}},{\mathbf{j}}} \zeta ^{\mathbf{i}} \varPhi _{{\mathbf{i}},{\mathbf{j}}} \eta ^{\mathbf{j}}\; , \end{aligned}$$

where \(\varPhi _{{\mathbf{i}},{\mathbf{j}}}\in \mathbb {R}^{{\mathtt{w}}_1\times {\mathtt{w}}_2}\), \(\mathbf{i},\mathbf{j}\in \mathbb {Z}^2\); only finitely many of the matrices \(\varPhi _{{\mathbf{i}},{\mathbf{j}}}\) in the expression for \(\varPhi (\zeta ,\eta )\) are nonzero. We associate with \(\varPhi (\zeta ,\eta )\) a bilinear difference form \(L_\varPhi \) defined by

$$\begin{aligned}&L_\varPhi : \left( \mathbb {R}^{\mathtt{w}_1}\right) ^{\mathbb {Z}^2} \times \left( \mathbb {R}^{\mathtt{w}_2}\right) ^{\mathbb {Z}^2} \rightarrow \left( \mathbb {R}\right) ^{\mathbb {Z}^2}\\&L_\varPhi (w_1,w_2)(k_1,k_2):=\sum _{{\mathbf{i}},{\mathbf{j}}} \left( \sigma ^{\mathbf{i}} w_1\right) (k_1,k_2)^\top \varPhi _{{\mathbf{i}},{\mathbf{j}}} \left( \sigma ^{\mathbf{j}} w_2\right) (k_1,k_2) \; , \end{aligned}$$

where \(\sigma ^{{\mathbf{i}}} w_1:=\sigma _1^{i_1} \sigma _2^{i_2} w_1\) and \(\sigma ^{{\mathbf{j}}} w_2:=\sigma _1^{j_1} \sigma _2^{j_2} w_2\). The definition of the BdF \(L_\varPhi \) on the complexification of \(\mathfrak {B}\) is straightforward.

If \({\mathtt{w}_1}=\mathtt{w}_2=\mathtt{w}\) then \(\varPhi (\zeta ,\eta )\) also induces a quadratic difference form \(Q_\varPhi \) defined by

$$\begin{aligned}&Q_\varPhi : \left( \mathbb {R}^{\mathtt{w}}\right) ^{\mathbb {Z}^2} \rightarrow \left( \mathbb {R}\right) ^{\mathbb {Z}^2}\\&Q_\varPhi (w)(k_1,k_2):=\sum _{{\mathbf{i}},{\mathbf{j}}} \left( \sigma ^{\mathbf{i}} w\right) (k_1,k_2)^\top \varPhi _{{\mathbf{i}},{\mathbf{j}}} \left( \sigma ^{\mathbf{j}} w\right) (k_1,k_2) \; . \end{aligned}$$

Without loss of generality when dealing with QdFs we can assume that \(\varPhi _{\mathbf{i},\mathbf{j}}=\varPhi _{\mathbf{j},\mathbf{i}}^\top \) for all \(\mathbf{i}\) and \(\mathbf{j}\), i.e. that \(\varPhi (\zeta ,\eta )=\varPhi (\eta ,\zeta )^\top \) is symmetric.

The increment in the first direction of a BdF \(L_\varPhi \) is defined by

$$\begin{aligned} \left( \nabla _1 L_\varPhi \right) (w_1,w_2)(k_1,k_2):=L_\varPhi (w_1,w_2)(k_1+1,k_2)-L_\varPhi (w_1,w_2)(k_1,k_2)\; , \end{aligned}$$
(7)

for all \((k_1,k_2)\in \mathbb {Z}^2\). The increment of \(L_\varPhi \) in the second direction is defined analogously. Such increments are themselves BdFs; it follows straightforwardly from the definition that their four-variable representations, denoted with the same symbols with slight abuse of notation, are respectively

$$\begin{aligned} (\nabla _1 L_\varPhi )(\zeta ,\eta )= & {} \zeta _1 \eta _1 \varPhi (\zeta ,\eta )-\varPhi (\zeta ,\eta )\\ (\nabla _2 L_\varPhi )(\zeta ,\eta )= & {} \zeta _2 \eta _2 \varPhi (\zeta ,\eta )-\varPhi (\zeta ,\eta )\; . \end{aligned}$$

Given BdFs \(L_{\varPhi _i}\), \(i=1,2\), acting on \(\left( \mathbb {R}^{\mathtt{w}_1} \right) ^{\mathbb {Z}^2} \times \left( \mathbb {R}^{\mathtt{w}_2} \right) ^{\mathbb {Z}^2}\), the vector of BdFs (“v-BdFs” in the following) \(L_{\varPhi }:={{\mathrm{\mathrm col}}}(L_{\varPhi _1},L_{\varPhi _2})\) is defined by

$$\begin{aligned}&L_\varPhi : \left( \mathbb {R}^{\mathtt{w}_1} \right) ^{\mathbb {Z}^2} \times \left( \mathbb {R}^{\mathtt{w}_2} \right) ^{\mathbb {Z}^2} \rightarrow \left( \mathbb {R}\right) ^{\mathbb {Z}^2}\times \left( \mathbb {R}\right) ^{\mathbb {Z}^2}\\&\quad L_\varPhi (w_1,w_2):={{\mathrm{\mathrm col}}}(L_{\varPhi _1}(w_1,w_2),L_{\varPhi _2}(w_1,w_2))\; . \end{aligned}$$

The discrete divergence of a v-BdF \(L_{\varPhi }:={{\mathrm{\mathrm col}}}(L_{\varPhi _1},L_{\varPhi _2})\) is the BdF defined by

$$\begin{aligned}&\nabla L_\varPhi : \left( \mathbb {R}^{\mathtt{w}_1} \right) ^{\mathbb {Z}^2} \times \left( \mathbb {R}^{\mathtt{w}_2} \right) ^{\mathbb {Z}^2} \rightarrow \left( \mathbb {R}\right) ^{\mathbb {Z}^2}\nonumber \\&\quad \nabla L_\varPhi (w_1,w_2):=\nabla _1 L_{\varPhi _1}(w_1,w_2)+\nabla _2 L_{\varPhi _2}(w_1,w_2)\; . \end{aligned}$$
(8)

From the definition it follows that the four-variable polynomial representation of the discrete divergence of \(L_\varPhi \) is

$$\begin{aligned} \left( \nabla L_\varPhi \right) (\zeta ,\eta )=\left( \zeta _1 \eta _1 -1\right) \varPhi _1(\zeta ,\eta )+\left( \zeta _2 \eta _2 -1\right) \varPhi _2(\zeta ,\eta )\; . \end{aligned}$$
(9)

Define the following map:

$$\begin{aligned}&\partial : \mathbb {R}^{\mathtt{w}_1\times \mathtt{w}_2}[\zeta _1^{-1},\zeta _1,\zeta _2^{-1},\zeta _2,\eta _1^{-1},\eta _1,\eta _2^{-1},\eta _2] \rightarrow \mathbb {R}^{\mathtt{w}_1\times \mathtt{w}_2}[z_1^{-1},z_1,z_2^{-1},z_2]\\&\partial \left( \varPhi (\zeta ,\eta )\right) :=\varPhi (z_1^{-1},z_2^{-1},z_1,z_2)\; . \end{aligned}$$

It is straightforward to verify that if (9) holds, then \(\partial \left( \varPhi (\zeta ,\eta )\right) =0_{\mathtt{w}_1\times \mathtt{w}_2}\). The following result states that the converse also holds.

Proposition 4

Let \(\varPhi \in \mathbb {R}[\zeta _1^{-1},\zeta _1,\zeta _2^{-1},\zeta _2,\eta _1^{-1},\eta _1,\eta _2^{-1},\eta _2]\). The following statements are equivalent:

  1. \(\partial \left( \varPhi (\zeta ,\eta )\right) =0_{\mathtt{w}_1\times \mathtt{w}_2}\);

  2. There exists a v-BdF \({{\mathrm{\mathrm col}}}(\varPsi _1,\varPsi _2)\) such that \(\varPhi (\zeta ,\eta )=\nabla \varPsi (\zeta ,\eta )\);

  3. \(\sum _{k_1=-\infty }^{+\infty } \sum _{k_2=-\infty }^{+\infty } Q_\varPhi (w)(k_1,k_2)=0\) for all \(w\in \ell _2(\mathbb {Z}^2,\mathbb {R}^\mathtt{w})\).

Proof

See Proposition 1 p. 1524 of Kojima and Kaneko (2014). \(\square \)
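The implication \((2)\Rightarrow (1)\) of Proposition 4 can also be checked symbolically. The following sympy sketch, with an arbitrarily chosen scalar v-BdF, verifies that \(\partial \) annihilates divergences:

```python
import sympy as sp

z1, z2 = sp.symbols('z1 z2')
zeta1, zeta2, eta1, eta2 = sp.symbols('zeta1 zeta2 eta1 eta2')

# An arbitrary scalar v-BdF col(Psi1, Psi2) and its divergence, cf. (9):
Psi1 = 1 + zeta1 * eta2
Psi2 = zeta2 - eta1 * eta2
Phi = (zeta1 * eta1 - 1) * Psi1 + (zeta2 * eta2 - 1) * Psi2

# The map 'partial' substitutes zeta_i -> z_i^{-1}, eta_i -> z_i:
d_Phi = Phi.subs({zeta1: 1 / z1, zeta2: 1 / z2, eta1: z1, eta2: z2})
assert sp.simplify(d_Phi) == 0   # statement 1 of Proposition 4
```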

The following result states that there are nonzero v-BdFs whose divergence is zero; this is well-known in the continuous case, even considering non-constant fields.

Proposition 5

Let \(\varPsi _i\in \mathbb {R}^{\mathtt{w}\times \mathtt{w}}[\zeta _1^{-1},\zeta _1,\zeta _2^{-1},\zeta _2,\eta _1^{-1},\eta _1,\eta _2^{-1},\eta _2]\), \(i=1,2\). The following three statements are equivalent:

  1. \(\nabla ~{{\mathrm{\mathrm col}}}(L_{\varPsi _1},L_{\varPsi _2})=0\);

  2. \((\zeta _1\eta _1-1) \varPsi _1(\zeta ,\eta )+(\zeta _2\eta _2-1) \varPsi _2(\zeta ,\eta )=0\);

  3. There exists \(\varPsi \in \mathbb {R}^{\mathtt{w}\times \mathtt{w}}[\zeta _1^{-1},\zeta _1,\zeta _2^{-1},\zeta _2,\eta _1^{-1},\eta _1,\eta _2^{-1},\eta _2]\) such that

    $$\begin{aligned} \varPsi _1(\zeta ,\eta )= & {} (\zeta _2\eta _2-1)\varPsi (\zeta ,\eta )\\ \varPsi _2(\zeta ,\eta )= & {} -(\zeta _1\eta _1-1)\varPsi (\zeta ,\eta )\; . \end{aligned}$$

Proof

The equivalence of statements (1) and (2) follows from the relation (9) between the discrete divergence and its four-variable polynomial representation. The implication \((3) \Longrightarrow (2)\) follows from straightforward verification.

To prove the implication \((2) \Longrightarrow (3)\) observe first that if \((\zeta _1\eta _1-1) \varPsi _1(\zeta ,\eta )=-(\zeta _2\eta _2-1) \varPsi _2(\zeta ,\eta )\), then \(\varPsi _1(\zeta ,\eta )\) is divisible by \((\zeta _2\eta _2-1)\), and \(\varPsi _2(\zeta ,\eta )\) by \((\zeta _1\eta _1-1)\). Consequently, there exist \(\varPsi _j^\prime (\zeta ,\eta )\), \(j=1,2\) such that \(\varPsi _j(\zeta ,\eta )=(\zeta _i\eta _i-1)\varPsi _j^\prime (\zeta ,\eta )\), \(i,j=1,2\), \(i\ne j\). Statement (3) follows readily from such equality. \(\square \)

3 Dual Roesser representations and pairing

To the best of the author’s knowledge, and despite the popularity of Roesser models, the questions of what the dual of a system admitting such a representation is, and of whether such a dual admits a Roesser representation itself, have not been investigated before. In this section we prove some statements related to these issues, which moreover are crucial for our system identification approach. Foremost among them is a “pairing” result, showing how bilinear forms on the external variables of the primal and the dual system are related to bilinear forms on the state variables of the two systems.

We associate to (1) a “backward” Roesser model described by the equations

$$\begin{aligned} \begin{bmatrix} \sigma _1^{-1} \tilde{x}_1\\ \sigma _2^{-1} \tilde{x}_2\\ \end{bmatrix}= & {} A^\top \begin{bmatrix} \tilde{x}_1\\ \tilde{x}_2 \end{bmatrix}+\begin{bmatrix} C_1^\top \\ C_2^\top \end{bmatrix} u^\prime \nonumber \\ y^\prime= & {} -\begin{bmatrix}B_1^\top&B_2^\top \end{bmatrix} \begin{bmatrix} \tilde{x}_1\\ \tilde{x}_2 \end{bmatrix}-D^\top u^\prime \; . \end{aligned}$$
(10)

We call (10) a representation of the dual system to (1), for reasons given in Props. 6 and 7 below.

In the following result we show that the transfer function of the dual system (10) is related to that of (1) by a straightforward transformation.

Proposition 6

Define

$$\begin{aligned} H(z_1,z_2):=\begin{bmatrix}C_1&C_2 \end{bmatrix} \begin{bmatrix}z_1 I_{n_1}-A_{11}&-A_{12} \\ -A_{21}&z_2 I_{n_2}-A_{22} \end{bmatrix}^{-1} \begin{bmatrix}B_1\\ B_2 \end{bmatrix}+D\; ; \end{aligned}$$

then the transfer function of the dual system (10) is \(-H(z_1^{-1},z_2^{-1})^\top \).

Proof

Taking the two-variable z-transform of (10) yields the following expression for the transfer function from \(u^\prime \) to \(y^\prime \):

$$\begin{aligned} -\begin{bmatrix}B_1^\top&B_2^\top \end{bmatrix} \begin{bmatrix}z_1^{-1} I_{n_1}-A_{11}^\top&-A_{21}^\top \\ -A_{12}^\top&z_2^{-1} I_{n_2}-A_{22}^\top \end{bmatrix}^{-1} \begin{bmatrix}C_1^\top \\ C_2^\top \end{bmatrix}-D^\top \; , \end{aligned}$$

from which the claim follows.

\(\square \)
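The claim of Proposition 6 is straightforward to check numerically. The following minimal Python sketch (with hypothetical, randomly generated system matrices) evaluates the transfer function of (10) directly from its z-transform and compares it with \(-H(z_1^{-1},z_2^{-1})^\top \):

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, m = 2, 3, 2                  # hypothetical dimensions, m = p
n = n1 + n2
A = rng.standard_normal((n, n)); B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n)); D = rng.standard_normal((m, m))

def H(z1, z2):
    # H(z1, z2) = C (diag(z1*I_n1, z2*I_n2) - A)^{-1} B + D
    Z = np.diag([z1] * n1 + [z2] * n2)
    return C @ np.linalg.solve(Z - A, B) + D

def H_dual(z1, z2):
    # Transfer function of the backward model (10), computed from its
    # z-transform as in the proof of Proposition 6.
    Zi = np.diag([1 / z1] * n1 + [1 / z2] * n2)
    return -B.T @ np.linalg.solve(Zi - A.T, C.T) - D.T

z1, z2 = 0.7 + 0.2j, -0.4 + 0.9j
assert np.allclose(H_dual(z1, z2), -H(1 / z1, 1 / z2).T)
```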

Define J by (6); the result of Prop. 6, the fact that the transfer function uniquely defines a controllable system, and Corollary 1 imply that (10) is a state representation of the J-dual behavior of the external behavior of (1). In the following we find it easier to use a different hybrid representation of such dual, obtained from (10) by defining the latent variables \(x_i^\prime :=\sigma _i^{-1}\tilde{x}_i\), \(i=1,2\); the equations (10) can then be rewritten as

$$\begin{aligned} \begin{bmatrix} x_1^\prime \\ x_2^\prime \end{bmatrix}= & {} A^\top \begin{bmatrix} \sigma _1 x_1^\prime \\ \sigma _2 x_2^\prime \end{bmatrix}+\begin{bmatrix} C_1^\top \\ C_2^\top \end{bmatrix} u^\prime \nonumber \\ y^\prime= & {} -\begin{bmatrix}B_1^\top&B_2^\top \end{bmatrix} \begin{bmatrix} \sigma _1 x_1^\prime \\ \sigma _2 x_2^\prime \end{bmatrix}-D^\top u^\prime \; . \end{aligned}$$
(11)

In the following pairing result we describe a relation between the bilinear form induced by \(-J\) on \(\mathfrak {B}\times \mathfrak {B}^{\perp _J}\) and a discrete divergence on the field generated by the inner product of the state variable x of (1) and the latent variable \(x^\prime \) of (11).

Proposition 7

Consider the latent variable representations (1) of \(\mathfrak {B}\) and (11) of \(\mathfrak {B}^{\perp _J}\). Then for all \({{\mathrm{\mathrm col}}}(u,y,x)\) satisfying (1) and \({{\mathrm{\mathrm col}}}(u^\prime ,y^\prime ,x^\prime )\) satisfying (11) the following equation holds:

$$\begin{aligned} \begin{bmatrix} u^\top&y^\top \end{bmatrix} \left( -J\right) \begin{bmatrix} u^\prime \\ y^\prime \end{bmatrix}= & {} -y^{\top } u^\prime - u^\top y^\prime \nonumber \\= & {} \left( \sigma _1 x_1\right) ^\top \left( \sigma _1 x^\prime _1\right) -x_1^\top x_1^\prime +\left( \sigma _2 x_2\right) ^\top \left( \sigma _2 x^\prime _2\right) -x_2^\top x_2^\prime \nonumber \\= & {} \nabla {{\mathrm{\mathrm col}}}\left( x_1^\top x_1^\prime , x_2^\top x_2^\prime \right) \; . \end{aligned}$$
(12)

Proof

It is a matter of straightforward verification to check that the following chain of equalities holds:

$$\begin{aligned} -y^{\top } u^\prime -u^\top y^\prime= & {} -\left( x^\top C^\top +u^\top D^\top \right) u^\prime -u^\top \left( -B^\top \begin{bmatrix} \sigma _1 x_1^\prime \\ \sigma _2 x_2^\prime \end{bmatrix} -D^\top u^\prime \right) \\= & {} -x^\top \left( C^\top u^\prime \right) + \left( u^\top B^\top \right) \begin{bmatrix} \sigma _1 x_1^\prime \\ \sigma _2 x_2^\prime \end{bmatrix}\\= & {} -x^\top \left( \begin{bmatrix} x_1^\prime \\ x_2^\prime \end{bmatrix}-A^\top \begin{bmatrix} \sigma _1 x_1^\prime \\ \sigma _2 x_2^\prime \end{bmatrix} \right) +\left( \begin{bmatrix}\sigma _1 x_1\\ \sigma _2 x_2 \end{bmatrix}-A\begin{bmatrix} x_1\\ x_2 \end{bmatrix}\right) ^\top \begin{bmatrix} \sigma _1 x_1^\prime \\ \sigma _2 x_2^\prime \end{bmatrix}\; . \end{aligned}$$

The claim follows. \(\square \)

We can reformulate the result of Proposition 7 by saying that the bilinear form induced by \(-J\) on the external trajectories of the primal and the dual system is the divergence of the field induced by the inner products on the first and second state components of the primal and the dual system. Such relation is of paramount importance for our system identification procedure.
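The pairing (12) holds pointwise and is purely algebraic, so it can be verified directly on randomly generated quantities. A minimal Python sketch (hypothetical dimensions and matrices) follows; the primal side fixes \(\sigma x\) and y via (1), the dual side fixes \(x^\prime \) and \(y^\prime \) via (11):

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, m = 2, 3, 2
n = n1 + n2
A = rng.standard_normal((n, n)); B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n)); D = rng.standard_normal((m, m))

x = rng.standard_normal(n); u = rng.standard_normal(m)
sx = A @ x + B @ u                  # col(sigma_1 x_1, sigma_2 x_2)
y = C @ x + D @ u

sxp = rng.standard_normal(n); up = rng.standard_normal(m)
xp = A.T @ sxp + C.T @ up           # x' from (11)
yp = -B.T @ sxp - D.T @ up

lhs = -y @ up - u @ yp              # [u^T y^T] (-J) col(u', y')
rhs = sx @ sxp - x @ xp             # the two divergence increments
assert np.isclose(lhs, rhs)
```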

4 The data matrix and the 2D Stein matrix equation

In the following we assume that, besides the primal data (2), we are also given a set of dual trajectories, whose value at \((k_1,k_2)\in \mathbb {Z}^2\) is

$$\begin{aligned} w^\prime _i(k_1,k_2):=\sum _{j_1=0}^{L_1^{\prime i}}\sum _{j_2=0}^{L_2^{\prime i}} k_1^{j_1} k_2^{j_2}\overline{w^\prime }^i_{j_1,j_2} \mu _{1,i}^{k_1} \mu _{2,i}^{k_2}\; , \end{aligned}$$
(13)

where \(\overline{w^\prime }_{j_1,j_2}^i={\begin{bmatrix}\overline{u^\prime }_{j_1,j_2}^i \\ \overline{y^\prime }_{j_1,j_2}^i\end{bmatrix} \in \mathbb {C}^{\mathtt{p}+\mathtt{m}}}\), \(j_\ell =0,\ldots ,L_\ell ^{\prime i}\), \(\ell =1,2\) and \(\mu _{j,i}\in \mathbb {C}\), \(i=1,\ldots ,N\), \(j=1,2\). Recall from Proposition 3 that such trajectories can also be computed from trajectories obtained from experiments conducted on the primal system; consequently, such an assumption is not restrictive for practical purposes.

In the rest of the paper we only consider the case of primal and dual data with multiplicity one, i.e. \(L_j^i=0=L^{\prime i}_j\), \(j=1,2\), \(i=1,\ldots ,N\) in (2), and vector-geometric trajectories \(\overline{w}^\prime _i \exp _{\mu _{1,i}} \exp _{\mu _{2,i}}\), \(\overline{w}_i \exp _{\lambda _{1,i}} \exp _{\lambda _{2,i}}\), \(i=1,\ldots ,N\); in such case the value of the primal and dual trajectories at \((k_1,k_2)\) is

$$\begin{aligned} w^\prime _i(k_1,k_2)=\overline{w}^\prime _i \mu _{1,i}^{k_1} \mu _{2,i}^{k_2} \text{ and } w_i(k_1,k_2)=\overline{w}_i \lambda _{1,i}^{k_1} \lambda _{2,i}^{k_2}\; . \end{aligned}$$
(14)

In order to compute real models (1), (10), we will also assume that the data sets \(\mathfrak {D}^\prime :=\{w_i^\prime \}_{i=1,\ldots ,N}\) and \(\mathfrak {D}:=\{w_i\}_{i=1,\ldots ,N}\) are closed under conjugation, i.e. that

$$\begin{aligned} {\overline{w}_i \exp _{\lambda _{1,i}} \exp _{\lambda _{2,i}}}\in \mathfrak {D}&\Longrightarrow&{\overline{w}_i^*\exp _{\lambda _{1,i}^{*}} \exp _{\lambda _{2,i}^{*}}}\in \mathfrak {D}\\ {\overline{w}^\prime _i \exp _{\mu _{1,i}} \exp _{\mu _{2,i}}}\in \mathfrak {D}^\prime&\Longrightarrow&{\overline{w}^{\prime *}_i \exp _{\mu _{1,i}^{*}} \exp _{\mu _{2,i}^{*}}} \in \mathfrak {D}^\prime \; . \end{aligned}$$

Such assumption can be made without loss of generality, since it is always possible to close \(\mathfrak {D}\) and \(\mathfrak {D}^\prime \) by adding conjugate trajectories, if necessary.

The following result [which uses the definition of observability on p. 6 of Roesser (1975)] implies that if the external trajectories are vector-geometric, then the corresponding state trajectories of the primal system are also vector-geometric.

Proposition 8

Let \({{\mathrm{\mathrm col}}}(x,u,y)\) satisfy (1). Assume that \(u=\overline{u}~ {\exp _{\lambda _1} \exp _{\lambda _2}}\) and \(y=\overline{y}~ {\exp _{\lambda _1} \exp _{\lambda _2}}\) for some \( \overline{u}\in \mathbb {C}^{\mathtt{m}}\) and \( \overline{y}\in \mathbb {C}^{\mathtt{p}}\), respectively, and some \(\lambda _i\in \mathbb {C}\), \(i=1,2\). Assume that the representation (1) is observable. Then there exists a unique \(\overline{x}\in \mathbb {C}^{\mathtt{n}_1+\mathtt{n}_2}\) such that the state trajectory corresponding to u and y is \(x=\overline{x}~ {\exp _{\lambda _1} \exp _{\lambda _2}}\).

Proof

The fact that the state trajectory x is vector-geometric follows in a straightforward way from the second equation in (1) and the fact that u and y are vector-geometric and associated with the same 2D frequency \((\lambda _1,\lambda _2)\). The uniqueness of the state trajectory, and consequently of the state direction \(\overline{x}\), follows from the observability of (1). \(\square \)

A result analogous to Proposition 8 also holds true for the dual system represented by (10) and equivalently by (11); we will not state it explicitly here.

We associate to the dual and primal vector-geometric sequences \(w^\prime _i(\cdot ,\cdot )\), \(w_j(\cdot ,\cdot )\), \(i,j=1,\ldots ,N\) the data matrix \(\mathcal {D}\) defined by:

$$\begin{aligned} \mathcal {D}:=\begin{bmatrix} \overline{u}^\prime _1&\ldots&\overline{u}_N^\prime \\ \overline{y}_1^\prime&\ldots&\overline{y}_N^\prime \end{bmatrix}^*J \begin{bmatrix} \overline{u}_1&\ldots&\overline{u}_N\\ \overline{y}_1&\ldots&\overline{y}_N\end{bmatrix}\; . \end{aligned}$$
(15)
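Constructing \(\mathcal {D}\) from the data is a single matrix product. A minimal Python sketch (assuming \(\mathtt{m}=\mathtt{p}\) as in (6); the columns of the four input matrices contain the primal and dual directions) is:

```python
import numpy as np

def data_matrix(Up, Yp, U, Y):
    # D = [U'; Y']^* J [U; Y] with J = [[0, I], [I, 0]], i.e.
    # D = U'^* Y + Y'^* U, cf. (15); assumes m = p.
    m = Up.shape[0]
    J = np.block([[np.zeros((m, m)), np.eye(m)],
                  [np.eye(m), np.zeros((m, m))]])
    return np.vstack([Up, Yp]).conj().T @ J @ np.vstack([U, Y])
```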

In the following result we make explicit the connection between \(\mathcal {D}\) and the state trajectories corresponding to the data \(w^\prime _i(\cdot ,\cdot )\), \(w_j(\cdot ,\cdot )\), \(i,j=1,\ldots ,N\).

Proposition 9

Let (1) be a state representation of \(\mathfrak {B}\), and let (11) be a representation of its dual; assume that such representations are observable. Denote the unique state trajectories corresponding to \(w_i^\prime \) and \(w_j\) by \(x^\prime _i\) and \(x_j\), respectively, and the associated state directions by \(\overline{x}_i^\prime =:\begin{bmatrix}\overline{x}_{i,1}^\prime \\ \overline{x}_{i,2}^\prime \end{bmatrix}\) and \(\overline{x}_j=:\begin{bmatrix}\overline{x}_{j,1}\\ \overline{x}_{j,2} \end{bmatrix}\), \(i,j=1,\ldots ,N\). Define for \(k=1,2\) the matrices

$$\begin{aligned} X_k^\prime:= & {} \begin{bmatrix}\overline{x}_{{1,k}}^\prime&\ldots&\overline{x}_{{N,k}}^\prime \end{bmatrix}\in \mathbb {C}^{\mathtt{n}_k\times N}\nonumber \\ X_k:= & {} \begin{bmatrix}\overline{x}_{{1,k}}&\ldots&\overline{x}_{{N,k}} \end{bmatrix}\in \mathbb {C}^{\mathtt{n}_k\times N}\nonumber \\ \overline{\mathcal {S}}_k:= & {} X_k^{\prime *} X_k\in \mathbb {C}^{N\times N}\nonumber \\ M_k:= & {} {{\mathrm{\text{ diag }}}}\left( \mu _{k,i} \right) _{i=1,\ldots ,N}\in \mathbb {C}^{N\times N}\nonumber \\ \varLambda _k:= & {} {{\mathrm{\text{ diag }}}}\left( \lambda _{k,i} \right) _{i=1,\ldots ,N}\in \mathbb {C}^{N\times N}\; . \end{aligned}$$
(16)

The following equation holds:

$$\begin{aligned} \mathcal {D}=M_1^*\overline{\mathcal {S}}_1\varLambda _1-\overline{\mathcal {S}}_1+ M_2^*\overline{\mathcal {S}}_2 \varLambda _2-\overline{\mathcal {S}}_2\; . \end{aligned}$$
(17)

Proof

The claim follows in a straightforward way by applying the pairing equation (12) to the vector-geometric data \(w_i^\prime \), \(w_j\) and the associated state trajectories \(x_i^\prime \), \(x_j\). \(\square \)

By analogy with the classical matrix equation \(\mathcal {X}-M\mathcal {X}\varLambda =\mathcal {Q}\) where M, \(\varLambda \) and \(\mathcal {Q}\) are given and the unknown is the matrix \(\mathcal {X}\), we call (17) the 2D Stein matrix equation, and for future reference we rewrite it as

$$\begin{aligned} \mathcal {Q}=M_1^*{\mathcal {S}}_1\varLambda _1-{\mathcal {S}}_1+ M_2^*{\mathcal {S}}_2 \varLambda _2-{\mathcal {S}}_2\; , \end{aligned}$$
(18)

where \({\mathcal {S}}_k\) are the unknowns, \(k=1,2\). Such equation is of paramount importance in our approach to 2D system identification: from it we compute admissible state trajectories associated with the input-output data, and also the state equations corresponding to such variables.

The solutions of the linear matrix equation (18) can be parametrized in a straightforward way, as we now show. We first show how to compute one pair of solutions, and then consider the homogeneous matrix equation associated with (18).

Using analogous arguments to those available in the theory of matrix Sylvester equations (see e.g. Peeters and Rapisarda 2006) it can be shown that if \(\lambda _k^i \mu _k^j\ne 1\), \(i,j=1,\ldots ,N\), \(k=1,2\), then for every choice of the right-hand side \(\mathcal {Q}^\prime \) there exist unique solutions \(\overline{\mathcal {X}}_k\) to the k-th 1D matrix Stein equation

$$\begin{aligned} M_k^*\mathcal {X}\varLambda _k-\mathcal {X}=\mathcal {Q}^\prime \; , \end{aligned}$$
(19)

\(k=1,2\). If \(\mathcal {Q}^\prime =\frac{1}{2} \mathcal {Q}\) and if \(\overline{\mathcal {S}}_k\) solves the 1D Stein equation (19), \(k=1,2\), then summing up the two 1D Stein equations it follows that the pair \((\overline{\mathcal {S}}_1,\overline{\mathcal {S}}_2)\) solves the 2D Stein equation (18). From such discussion it follows that under the assumption \(\lambda _k^i \mu _k^j\ne 1\), \(i,j=1,\ldots ,N\), \(k=1,2\), one can compute a particular solution of the 2D matrix Stein equation directly from two 1D Stein equations. In many cases (e.g. in experimental settings where the inputs can be arbitrarily chosen and the system can be started at rest), the frequencies \(\lambda _k^i\), \(\mu _k^j\) can be chosen freely from the experimental data, and the assumption \(\lambda _k^i \mu _k^j\ne 1\), \(i,j=1,\ldots ,N\), \(k=1,2\) is not particularly restrictive. We will assume it is valid for the rest of the paper.
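Since \(M_k\) and \(\varLambda _k\) are diagonal, (19) can be solved entrywise. A minimal Python sketch, valid under the standing assumption on the frequencies:

```python
import numpy as np

def solve_1d_stein(mu, lam, Qp):
    # Entrywise solution of M^* X Lambda - X = Q' with M = diag(mu),
    # Lambda = diag(lam); requires conj(mu_i) * lam_j != 1 for all i, j.
    return Qp / (np.conj(mu)[:, None] * lam[None, :] - 1)

# A particular solution of the 2D Stein equation (18) with Q' = Q/2:
#   S1_bar = solve_1d_stein(mu1, lam1, Q / 2)
#   S2_bar = solve_1d_stein(mu2, lam2, Q / 2)
```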

We now study the homogeneous 2D Stein matrix equation:

$$\begin{aligned} 0=M_1^*\mathcal {S}_1^\prime \varLambda _1 -\mathcal {S}_1^\prime +M_2^*\mathcal {S}_2^\prime \varLambda _2 -\mathcal {S}_2^\prime \; . \end{aligned}$$
(20)

A parametrization of all solution pairs \((\mathcal {S}^\prime _1,\mathcal {S}^\prime _2)\) to such equation is straightforward to derive from the following result, whose proof is omitted.

Proposition 10

Denote the \((k,j)\)-th entry of \(\mathcal {S}_i^\prime \) in (20) by \(\mathcal {S}_i^\prime (k,j)\), \(i=1,2\), \(k,j=1,\ldots ,N\). Assume that \(\lambda _i^j \mu _i^k\ne 1\), \(k,j=1,\ldots ,N\), \(i=1,2\). The following statements are equivalent:

  1. The pair \((\mathcal {S}_1^\prime ,\mathcal {S}_2^\prime )\) is a solution of (20);

  2. The following equality holds:

    $$\begin{aligned} \mathcal {S}_1^\prime (k,j)\left( \lambda ^j_1 \mu _1^{k*}-1 \right) +\mathcal {S}_2^\prime (k,j)\left( \lambda ^j_2 \mu _2^{k*} -1\right) =0\; , \end{aligned}$$

    \(k,j=1,\ldots ,N\);

  3. The following equality holds:

    $$\begin{aligned} \mathcal {S}_1^\prime (k,j)=-\mathcal {S}_2^\prime (k,j)\frac{\lambda ^j_2 \mu _2^{k*} -1 }{\lambda ^j_1 \mu _1^{k*}-1 }\; , \end{aligned}$$
    (21)

    \(k,j=1,\ldots ,N\).

On the basis of the result of Proposition 10, we characterize all solutions to (18) as follows.

Corollary 2

Assume that \(\lambda _i^j \mu _i^k\ne 1\), \(k,j=1,\ldots ,N\), \(i=1,2\). Denote by \(\overline{\mathcal {S}}_i\) the unique solution to (19) when \(\mathcal {Q}^\prime =\frac{1}{2}\mathcal {Q}\), \(M=M_i\), \(\varLambda =\varLambda _i\), \(i=1,2\). Then \((\mathcal {S}_1,\mathcal {S}_2)\) is a solution of (18) if and only if there exist \(\mathcal {S}_2^\prime (k,j)\in \mathbb {R}\), \(k,j=1,\ldots ,N\), such that

$$\begin{aligned} \mathcal {S}_1(k,j)= & {} \overline{\mathcal {S}}_1(k,j)-\mathcal {S}_2^\prime (k,j)\frac{\lambda ^j_2 \mu _2^{k*} -1 }{\lambda ^j_1 \mu _1^{k*}-1 }\nonumber \\ \mathcal {S}_2(k,j)= & {} \overline{\mathcal {S}}_2(k,j)+\mathcal {S}_2^\prime (k,j)\; , \end{aligned}$$
(22)

\(k,j=1,\ldots ,N\).
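Corollary 2 translates directly into code. The following sketch assembles a general solution of (18) from the particular 1D-Stein parts and a free matrix \(\mathcal {S}_2^\prime \) (all names illustrative):

```python
import numpy as np

def stein_solutions(mu1, lam1, mu2, lam2, Q, S2p):
    # General solution (22): particular parts from the two 1D Stein
    # equations plus the homogeneous part parametrized by S2p.
    d1 = np.conj(mu1)[:, None] * lam1[None, :] - 1
    d2 = np.conj(mu2)[:, None] * lam2[None, :] - 1
    S1 = (Q / 2) / d1 - (d2 / d1) * S2p
    S2 = (Q / 2) / d2 + S2p
    return S1, S2   # M1^* S1 L1 - S1 + M2^* S2 L2 - S2 = Q
```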

5 Computation of state trajectories

We can restate the result of Proposition 9 by saying that the matrices \(\overline{\mathcal {S}}_k\), \(k=1,2\) defined in (16) are solutions of the 2D Stein matrix equation

$$\begin{aligned} \mathcal {D}=M_1^*\mathcal {S}_1 \varLambda _1 -\mathcal {S}_1+M_2^*\mathcal {S}_2 \varLambda _2 -\mathcal {S}_2\; , \end{aligned}$$
(23)

with \(M_k\), \(\varLambda _k\in \mathbb {C}^{N\times N}\), \(k=1,2\) defined by (16), \(\mathcal {D}\in \mathbb {C}^{N\times N}\) by (15), and \(\mathcal {S}_k\in \mathbb {C}^{N\times N}\), \(k=1,2\) being the unknown matrices. Based on such observation, for pedagogical reasons we propose the following preliminary version of a 2D identification procedure, which we refine in the rest of this section and in Sect. 6.

Algorithm 1

Input:

Primal and dual data as in (14)

Output:

An unfalsified Roesser model for the data.

Step 1:

Construct the matrix \(\mathcal {D}\) defined by (15) from the data (14).

Step 2:

Compute a pair \((\mathcal {S}_1,\mathcal {S}_2)\) of solutions to the matrix equation (23).

Step 3:

Perform a rank-revealing factorization of \(\mathcal {S}_k=F_{k}^{\prime \top } F_k\), i.e.

$$\begin{aligned} {{\mathrm{\mathrm rank}}}(\mathcal {S}_k)={{\mathrm{\mathrm rank}}}(F_k)={{\mathrm{\mathrm rank}}}(F^\prime _k)\; , \; k=1,2\; . \end{aligned}$$
Step 4:

Define

$$\begin{aligned} Y:= & {} \begin{bmatrix} \overline{y}_1&\ldots&\overline{y}_N\end{bmatrix}\in \mathbb {C}^{\mathtt{p}\times N}\nonumber \\ U:= & {} \begin{bmatrix} \overline{u}_1&\ldots&\overline{u}_N\end{bmatrix}\in \mathbb {C}^{\mathtt{m}\times N}\; , \end{aligned}$$
(24)

and solve for \(A_{ij}\), \(B_{i}\), \(C_i\), \(i,j=1,2\) and D in

$$\begin{aligned} \begin{bmatrix} F_1 \varLambda _1\\F_2\varLambda _2\\Y \end{bmatrix}=\begin{bmatrix} A_{11}&A_{12}&B_1\\ A_{21}&A_{22}&B_2\\ C_1&C_2&D \end{bmatrix} \begin{bmatrix} F_1\\ F_2\\ U \end{bmatrix}\; . \end{aligned}$$
(25)
Step 5:

Return A, B, C, D.
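To fix ideas, here is a hypothetical end-to-end Python sketch of Algorithm 1 for vector-geometric data with \(\mathtt{m}=\mathtt{p}\): it uses the particular Stein solution of Sect. 4 in Step 2, an SVD for the rank-revealing factorization in Step 3, and a least-squares solution of (25) in Step 4 (names and tolerances are illustrative):

```python
import numpy as np

def identify_roesser(U, Y, Up, Yp, lam1, lam2, mu1, mu2, tol=1e-9):
    m = U.shape[0]                              # assumes m = p
    J = np.block([[np.zeros((m, m)), np.eye(m)],
                  [np.eye(m), np.zeros((m, m))]])
    Dmat = np.vstack([Up, Yp]).conj().T @ J @ np.vstack([U, Y])     # (15)

    F, L = [], []
    for mu, lam in ((mu1, lam1), (mu2, lam2)):
        S = (Dmat / 2) / (np.conj(mu)[:, None] * lam[None, :] - 1)  # (19)
        u_, s, vh = np.linalg.svd(S)
        r = int((s > tol).sum())
        F.append(np.diag(s[:r]) @ vh[:r])       # S = F'^* F, rank-revealing
        L.append(np.diag(lam))

    lhs = np.vstack([F[0] @ L[0], F[1] @ L[1], Y])
    rhs = np.vstack([F[0], F[1], U])
    M = lhs @ np.linalg.pinv(rhs)    # [A11 A12 B1; A21 A22 B2; C1 C2 D]
    return M, F[0].shape[0], F[1].shape[0]      # model and (n1, n2)
```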

We now discuss several issues related to such procedure.

Identifiability:

Recall from Proposition 5 that \(\overline{\mathcal {S}}_k\), \(k=1,2\) defined in (16) are not the only solutions of the 2D matrix Stein equation (23): if \(\mathcal {S}_k^\prime \), \(k=1,2\) satisfy the homogeneous 2D Stein matrix equation (20), then \(\overline{\mathcal {S}}_k+\mathcal {S}_k^\prime \), \(k=1,2\) are also solutions of (23). The converse also holds: since the matrix equation (23) is linear, every solution pair can be written as \(\overline{\mathcal {S}}_k+\mathcal {S}_k^\prime \) for some pair \((\mathcal {S}_1^\prime ,\mathcal {S}_2^\prime )\) solving (20) and \(\overline{\mathcal {S}}_k\) defined in (16). Such non-uniqueness of the solutions to (23) is an unavoidable consequence of the non-invertibility of the divergence operator, or equivalently of the existence of nonzero solution pairs to (20). Consequently, identifiability is not a well-posed question in the identification approach sketched in Algorithm 1.

Complexity:

Another issue arising from the non-uniqueness of the solutions to (23) is that Algorithm 1 may compute models having larger state dimension than that of the data-generating system. Note that the sum of the ranks of \(\mathcal {S}_k\), \(k=1,2\) coming from a generic solution pair computed in Step 2 will generically be \(2N\), and consequently higher than the minimal state dimension of a Roesser model of the data-generating system. Thus generically the Roesser model computed in Step 4 would be high-dimensional and impractical for use e.g. in simulation, control, and so forth. In Sect. 5.1 we illustrate a procedure using rank-minimization to compute a minimal complexity model; see also Remark 1, where an alternative approach using Gröbner basis computation is sketched.

Computation of A, B, C, D:

Finally, sufficient conditions must be established guaranteeing that solutions A, B, C, D exist to the system of linear equations (25).

The rest of this section is devoted to modifying Algorithm 1 to address the complexity issue; the solvability of the system of linear equations (25) is considered in Sect. 6. We will show that with small modifications our data matrix-based approach to Roesser model identification offers the opportunity to address in a conceptually simple way the problem of deriving a minimal-complexity unfalsified Roesser model.

5.1 Computation of minimal complexity state trajectories

We define the complexity of a model (1) as the dimension \(\mathtt{n}_1+\mathtt{n}_2\) of the state variable. Given a controllable, quarter-plane causal behavior \(\mathfrak {B}\in \mathcal {L}_2^\mathtt{w}\), we define its minimal complexity to be the minimal dimension of the state variable among all possible Roesser representations of \(\mathfrak {B}\). Thus every \(\mathfrak {B}\in \mathcal {L}_2^\mathtt{w}\) can be mapped to a point lying on a line \(\mathtt{n}_1+\mathtt{n}_2=c\) in \(\mathbb {N}\times \mathbb {N}\), where c is its complexity, see Fig. 1. Note that the complexity of a given model obtained in our framework is related to the total rank \({{\mathrm{\mathrm rank}}}(\mathcal {S}_1)+{{\mathrm{\mathrm rank}}}(\mathcal {S}_2)\) of the particular solution to (23) chosen to perform the rank-revealing factorization in Step 3 of Algorithm 1. Assuming the data to be sufficiently informative about the system dynamics, such total rank ranges between the complexity n of the actual data-generating system and 2N; see the previous discussion on complexity. In the light of the characterization (22) in Corollary 2, the problem of computing a minimal total rank solution to (23) can be formulated as a rank minimization problem (see Fazel et al. 2004), as we presently show. In order to do this, we need to state a preliminary result.

Fig. 1: A constant complexity line in the \((\mathtt{n}_1,\mathtt{n}_2)\)-space

Proposition 11

Define \(b_{k,j}:=\frac{\lambda ^j_2 \mu _2^{k*} -1 }{\lambda ^j_1 \mu _1^{k*}-1 }\), \(k,j=1,\ldots ,N\), and

$$\begin{aligned} B:={{\mathrm{\text{ diag }}}}(b_{11},\ldots ,b_{1N},\ldots ,b_{N1},\ldots ,b_{NN})\; . \end{aligned}$$

Moreover, define the map

$$\begin{aligned}&f: \mathbb {R}^{N\times N}\rightarrow \mathbb {R}^{N\times N}\\&f(X):={{\mathrm{\text{ mat }}}}(-B{{\mathrm{\text{ vec }}}}(X))\; . \end{aligned}$$

The subset \(\mathfrak {C}\) of \(\mathbb {R}^{2N\times 2N}\) defined by

$$\begin{aligned} \mathfrak {C}:=\left\{ \begin{bmatrix}\overline{\mathcal {S}}_1+f(\mathcal {S}_2^\prime )&0_{N\times N}\\ 0_{N\times N}&\overline{\mathcal {S}}_2+\mathcal {S}^{\prime }_2\end{bmatrix} \text{ s.t. } \mathcal {S}^\prime _2\in \mathbb {R}^{N\times N}\right\} \; , \end{aligned}$$

is convex.

Proof

Let \(\alpha \in \mathbb {R}\), \(0\le \alpha \le 1\); the claim follows in a straightforward way from the equality

$$\begin{aligned} f(\alpha \mathcal {S}_2^\prime +(1-\alpha ) \mathcal {S}_2^{\prime \prime })=\alpha f( \mathcal {S}_2^\prime )+(1-\alpha ) f(\mathcal {S}_2^{\prime \prime })\; , \end{aligned}$$

which we now prove. The following chain of equalities is a direct consequence of the definition of f and the linearity of the \({{\mathrm{\text{ mat }}}}\) and \({{\mathrm{\text{ vec }}}}\) maps:

$$\begin{aligned} f(\alpha \mathcal {S}_2^\prime +(1-\alpha ) \mathcal {S}_2^{\prime \prime })= & {} {{\mathrm{\text{ mat }}}}(-B{{\mathrm{\text{ vec }}}}(\alpha \mathcal {S}_2^\prime +(1-\alpha )\mathcal {S}_2^{\prime \prime }))\\= & {} {{\mathrm{\text{ mat }}}}(-B\alpha {{\mathrm{\text{ vec }}}}(\mathcal {S}_2^\prime )-B(1-\alpha ){{\mathrm{\text{ vec }}}}(\mathcal {S}_2^{\prime \prime }))\\= & {} \alpha {{\mathrm{\text{ mat }}}}(-B{{\mathrm{\text{ vec }}}}(\mathcal {S}_2^\prime ))+(1-\alpha ){{\mathrm{\text{ mat }}}}(-B{{\mathrm{\text{ vec }}}}(\mathcal {S}_2^{\prime \prime }))\\= & {} \alpha f( \mathcal {S}_2^\prime )+(1-\alpha ) f(\mathcal {S}_2^{\prime \prime })\; , \end{aligned}$$

as was to be proved.

From such equality it follows that the set

$$\begin{aligned} \left\{ \begin{bmatrix}f(\mathcal {S}_2^\prime )&0_{N\times N}\\ 0_{N\times N}&\mathcal {S}^{\prime }_2\end{bmatrix} \text{ s.t. } \mathcal {S}^\prime _2\in \mathbb {R}^{N\times N}\right\} \; , \end{aligned}$$

is convex, from which the claim follows directly. \(\square \)

It follows from Proposition 11 that the optimization problem defined by

$$\begin{aligned} \min&{{\mathrm{\mathrm rank}}}\left( \begin{bmatrix}\mathcal {S}_1&0\\ 0&\mathcal {S}_2\end{bmatrix}\right) \nonumber \\ \text{ s.t. }&\begin{bmatrix}\mathcal {S}_1&0\\ 0&\mathcal {S}_2\end{bmatrix}\in \mathfrak {C}\; ; \end{aligned}$$
(26)

is in the standard form of a rank minimization problem

$$\begin{aligned}&\min {{\mathrm{\mathrm rank}}}(X)\\&\text{ s.t. } X\in \mathcal {C}\;, \end{aligned}$$

where \(\mathcal {C}\) is a convex set, and can be solved by several algorithms implemented on standard platforms. It goes beyond the scope of the present paper to enter into details about which algorithms to use in order to solve (26), and to discuss important issues such as their numerical accuracy and complexity. The interested reader is referred to the growing literature on the subject.
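As an illustration, the following sketch poses (26) via the nuclear norm, the standard convex surrogate for the rank objective (Fazel et al. 2004), using cvxpy; the frequencies and right-hand side here are hypothetical real data, and a production implementation would start from the actual \(\overline{\mathcal {S}}_1\), \(\overline{\mathcal {S}}_2\) and \(b_{k,j}\) of Proposition 11:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)
N = 6
lam1, lam2 = rng.uniform(-2, 2, N), rng.uniform(-2, 2, N)
mu1, mu2 = rng.uniform(-2, 2, N), rng.uniform(-2, 2, N)
Q = rng.standard_normal((N, N))

d1 = mu1[:, None] * lam1[None, :] - 1   # real data: mu^* = mu; mu*lam != 1
d2 = mu2[:, None] * lam2[None, :] - 1
S1_bar, S2_bar, b = (Q / 2) / d1, (Q / 2) / d2, d2 / d1

S2p = cp.Variable((N, N))               # free parameter S2'
S1 = S1_bar + cp.multiply(-b, S2p)      # S1 = S1_bar + f(S2')
S2 = S2_bar + S2p
# Block-diagonal structure makes the nuclear norm separable:
prob = cp.Problem(cp.Minimize(cp.normNuc(S1) + cp.normNuc(S2)))
prob.solve()
```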

We can now refine Algorithm 1 as follows.

Algorithm 2

Input:

Primal and dual data as in (14)

Output:

A minimal complexity unfalsified Roesser model for the data.

Step 1:

Construct the matrix \(\mathcal {D}\) defined by (15) from the data (14).

Step 2:

Define \(\mathfrak {C}\) as in Proposition 11, and solve the optimization problem (26).

Step 3:

Perform a rank-revealing factorization of \(\mathcal {S}_k=F_{k}^{\prime \top } F_k\), i.e.

$$\begin{aligned} {{{\mathrm{\mathrm rank}}}(\mathcal {S}_k)={{\mathrm{\mathrm rank}}}(F_k)={{\mathrm{\mathrm rank}}}(F^\prime _k)\; , \; k=1,2\; .} \end{aligned}$$
Step 4:

Define

$$\begin{aligned} {Y}&{:=}&{\begin{bmatrix} \overline{y}_1&\ldots&\overline{y}_N\end{bmatrix}\in \mathbb {C}^{\mathtt{p}\times N}}\nonumber \\ {U}&{:=}&{\begin{bmatrix} \overline{u}_1&\ldots&\overline{u}_N\end{bmatrix}\in \mathbb {C}^{\mathtt{m}\times N}}\; , \end{aligned}$$
(27)

and solve for \(A_{ij}\), \(B_{i}\), \(C_i\), \(i,j=1,2\) and D in

$$\begin{aligned} {\begin{bmatrix} F_1 \varLambda _1\\F_2\varLambda _2\\Y \end{bmatrix}=\begin{bmatrix} A_{11}&A_{12}&B_1\\ A_{21}&A_{22}&B_2\\ C_1&C_2&D \end{bmatrix} \begin{bmatrix} F_1\\ F_2\\ U \end{bmatrix}}\; . \end{aligned}$$
(28)
Step 5:

Return A, B, C, D.

Remark 1

In Rapisarda (2017) a Gröbner basis approach to solve the rank minimization problem (26) is illustrated. Such approach uses a parametrization similar to that of Proposition 10 in order to transform the problem of finding fixed-rank matrices solving the 2D Sylvester equation (the continuous counterpart of the Stein equation) into a polynomial algebraic problem. In order to compute a minimal complexity Roesser model for the data, beginning with \(c=1\) we check whether there exist solution pairs \((\mathcal {S}_1,\mathcal {S}_2)\) to (23) such that \({{\mathrm{\mathrm rank}}}(\mathcal {S}_k)=\mathtt{n}_k\), \(k=1,2\) and \(\mathtt{n}_1+\mathtt{n}_2=c\). If no such solution pair exists, we increment c by 1 and repeat the check. Note that working under the assumption \(\lambda _k^i \mu _k^j\ne 1\), \(i,j=1,\ldots ,N\), \(k=1,2\), equation (23) is solved by \((\overline{\mathcal {S}}_1,\overline{\mathcal {S}}_2)\), where \(\overline{\mathcal {S}}_i\), \(i=1,2\), are the unique solutions to (19) when \(\mathcal {Q}^\prime =\frac{1}{2}\mathcal {D}\), \(M=M_i\), \(\varLambda =\varLambda _i\), \(i=1,2\). Consequently, such search ends after at most N steps.

Most of the computational effort of this approach lies in the Gröbner basis calculations, which become especially heavy for problems involving more than ten data trajectories. However, the approach has the advantage of yielding a parametrization of all solutions to (23) with a given total rank, whereas a numerical approach based on rank-minimization algorithms produces only one solution among many. Such a parametrization opens up the possibility of exploring the space of unfalsified models of given complexity, with potential application to 2D data-driven model order reduction [see Rapisarda and Schaft (2013) and Rapisarda and Trentelman (2011) for the 1D case]. This procedure also shares with the one sketched in Algorithm 2 a conceptually appealing simplicity that avoids some of the difficulties inherent in approaches based on shift-invariance. \(\square \)
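To make the role of the particular solution pair \((\overline{\mathcal {S}}_1,\overline{\mathcal {S}}_2)\) concrete, the following sketch computes the unique solution of a Stein-type equation by Kronecker vectorization; the form \(M^*\mathcal {S}\varLambda -\mathcal {S}=\mathcal {Q}\) assumed for (19) is our reading, consistent with the rewriting (30) below.

```python
import numpy as np

def stein_solution(M, Lam, Q):
    """Solve M^* S Lam - S = Q (the form assumed here for (19)) by
    vectorization: vec(M^* S Lam) = (Lam^T kron M^*) vec(S), so that
    (Lam^T kron M^* - I) vec(S) = vec(Q). The coefficient matrix is
    invertible under the eigenvalue condition lambda*mu != 1 invoked
    in the remark above."""
    N = Q.shape[0]
    coeff = np.kron(Lam.T, M.conj().T) - np.eye(N * N)
    s = np.linalg.solve(coeff, Q.reshape(-1, order="F"))  # column-major vec
    return s.reshape((N, N), order="F")

# With Q = D/2 and (M, Lam) = (M_i, Lam_i), i = 1, 2, this yields a pair
# solving (23), as noted above.
```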

6 Identification of Roesser models

To set up the systems of linear equations (25), (28) we resort once more to the 2D Stein equation (23). Define, analogously to (24), (27), the input-output matrices of the dual data (recall that the dual system has \(\mathtt{p}\)-dimensional inputs and \(\mathtt{m}\)-dimensional outputs) by

$$\begin{aligned} Y^\prime&:=\begin{bmatrix} \overline{y}^\prime _1&\ldots&\overline{y}^\prime _N\end{bmatrix}\in \mathbb {C}^{\mathtt{m}\times N}\nonumber \\ U^\prime&:=\begin{bmatrix} \overline{u}^\prime _1&\ldots&\overline{u}^\prime _N\end{bmatrix}\in \mathbb {C}^{\mathtt{p}\times N}\; , \end{aligned}$$
(29)

and assume that rank-revealing factorizations \(\mathcal {S}_k=F_{k}^{\prime \top } F_k\), \(k=1,2\) have been computed. Now rewrite (23) as:

$$\begin{aligned} \begin{bmatrix}M_1^*F_1^{\prime *}&M_2^*F_2^{\prime *} \end{bmatrix} \begin{bmatrix} F_1 \varLambda _1 \\ F_2 \varLambda _2\end{bmatrix}=\begin{bmatrix} Y^{\prime *}&U^{\prime *}\end{bmatrix} \begin{bmatrix} U\\ Y\end{bmatrix}+\begin{bmatrix} F_1^{\prime *}&F_2^{\prime *} \end{bmatrix} \begin{bmatrix} F_1\\ F_2 \end{bmatrix}\; . \end{aligned}$$
(30)

Denote the j-th column of \(F_k\) by \(f_{k,j}\), \(j=1,\ldots ,N\), \(k=1,2\). Observe that the columns of the matrix \(\mathrm{col}(F_1, F_2)\) are the values at (0, 0) of the 2D-geometric sequences \(\mathrm{col}( f_{1,j} \exp _{\lambda _{1,j}} \exp _{\lambda _{2,j}} ,f_{2,j}\exp _{\lambda _{1,j}} \exp _{\lambda _{2,j}})\), and those of \(\mathrm{col}(F_1\varLambda _1, F_2\varLambda _2)\) are the values at (0, 0) of the shifted 2D-geometric sequences

$$\begin{aligned} \mathrm{col}(\sigma _1 ( f_{1,j} \exp _{\lambda _{1,j}} \exp _{\lambda _{2,j}}) ,\sigma _2 (f_{2,j}\exp _{\lambda _{1,j}} \exp _{\lambda _{2,j}}))\; . \end{aligned}$$

The idea underlying our computation of an unfalsified model is to left-invert \(\begin{bmatrix}M_1^*F_1^{\prime *}&M_2^*F_2^{\prime *} \end{bmatrix}\) so as to obtain directly from (30) an unfalsified Roesser model.

The following result gives sufficient conditions under which a left inverse

$$\begin{aligned} \begin{bmatrix}M_1^*F_1^{\prime *}&M_2^*F_2^{\prime *} \end{bmatrix}^\dagger \end{aligned}$$

of \(\begin{bmatrix}M_1^*F_1^{\prime *}&M_2^*F_2^{\prime *} \end{bmatrix}\) exists, from which a Roesser model can be computed via (30).

Proposition 12

Let \(\mathfrak {B},\mathfrak {B}^{\perp }\in \mathcal {L}^\mathtt{w}_2\) be controllable. Let data (14) be given and define \(\mathcal {D}\) by (15), \(\varLambda _i,M_i\), \(i=1,2\) by (16), U, Y by (24), and \(U^\prime \), \(Y^\prime \) by (29).

Let \(\mathcal {S}_1,\mathcal {S}_2\in \mathbb {R}^{N\times N}\) solve (23), and let \(\mathcal {S}_i=F^{\prime *}_i F_i\), \(i=1,2\) be rank-revealing factorizations. Assume that:

  1.

    \(\mathrm{im}~M_1^*F^{\prime *}_1\cap \mathrm{im}~M_2^*F^{\prime *}_2=\{0\}\);

  2.

    \(\mathrm{im}~\begin{bmatrix} M_1^*F^{\prime *}_1&M_2^*F^{\prime *}_2\end{bmatrix}\cap \mathrm{im}~ U^{\prime *}=\{0\}\);

  3.

    \(\mathrm{im}~Y^{\prime *}\cap \mathrm{im}~U^{\prime *}=\{0\}\).

Then there exist a left inverse \(K:=\begin{bmatrix} M_1^*F_1^{\prime *}&M_2^*F_2^{\prime *}\end{bmatrix}^\dagger \) of \(\begin{bmatrix} M_1^*F_1^{\prime *}&M_2^*F_2^{\prime *}\end{bmatrix}\) and a matrix \(G\in \mathbb {C}^{\mathtt{p}\times N}\) such that

$$\begin{aligned} K U^{\prime *}=0_{(\mathtt{n}_1+\mathtt{n}_2)\times \mathtt{p}} \text{ and } G \begin{bmatrix} Y^{\prime *}&U^{\prime *} \end{bmatrix}=\begin{bmatrix} 0_{\mathtt{p}\times \mathtt{m}}&I_\mathtt{p}\end{bmatrix}\; . \end{aligned}$$
(31)

Let K, G satisfy (31), and define

$$\begin{aligned} A&:= K\begin{bmatrix} F_1^{\prime *}&F_2^{\prime *}\end{bmatrix}\; , \; B:= K Y^{\prime *}\nonumber \\ C&:= G\left( \begin{bmatrix} M_1^*F_1^{\prime *}&M_2^*F_2^{\prime *}\end{bmatrix} K - I_N \right) \begin{bmatrix} F_1^{\prime *}&F_2^{\prime *}\end{bmatrix} \nonumber \\ D&:= G\begin{bmatrix} M_1^*F_1^{\prime *}&M_2^*F_2^{\prime *}\end{bmatrix} K Y^{\prime *}\; . \end{aligned}$$
(32)

Then A, B, C, D define an unfalsified Roesser model for the data.

Proof

From assumption (1) it follows that \(\begin{bmatrix} M_1^*F^{\prime *}_1&M_2^*F^{\prime *}_2 \end{bmatrix}\) admits a left inverse. From assumption (2) it follows that such a left inverse can be chosen to satisfy the first equation in (31). Now multiply both sides of (30) on the left by this left inverse to conclude that

$$\begin{aligned} \begin{bmatrix} F_1 \varLambda _1\\ F_2 \varLambda _2\end{bmatrix}=A\begin{bmatrix} F_1\\ F_2\end{bmatrix}+B U\; , \end{aligned}$$
(33)

where A and B are defined by the first two equations in (32).

Now use assumption (3) to conclude that a matrix G exists such that the second equation in (31) holds. Multiply both sides of (30) on the left by this G and rearrange terms, using also equation (33), to conclude that

$$\begin{aligned} Y=C \begin{bmatrix} F_1\\ F_2\end{bmatrix}+DU\; , \end{aligned}$$
(34)

where C and D are defined by the last two equations in (32).

That A, B, C and D define an unfalsified model for the primal data now follows from (33) and (34), with

$$\begin{aligned} \mathrm{col}(x_{1,j},x_{2,j}):=\mathrm{col}(f_{1,j},f_{2,j})\exp _{\lambda _{1,j}} \exp _{\lambda _{2,j}}\; , \end{aligned}$$

the state trajectory corresponding to the j-th input-output trajectory, \(j=1,\ldots ,N\). This concludes the proof. \(\square \)
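A minimal numerical sketch of the construction in Proposition 12 follows; the pseudoinverse-based choices of \(K\) and \(G\) below are one possibility among the many satisfying (31), and they require the stacked matrices involved to have full column rank.

```python
import numpy as np

def roesser_from_prop_12(F1, F1p, F2, F2p, M1, M2, Up, Yp):
    """Construct (A, B, C, D) as in (32) (sketch). F1, F1p, F2, F2p come
    from the factorizations S_k = Fkp^* Fk; Up, Yp are the dual data (29)."""
    W = np.hstack([M1.conj().T @ F1p.conj().T, M2.conj().T @ F2p.conj().T])
    N, n = W.shape                      # n = n1 + n2
    p, m = Up.shape[0], Yp.shape[0]     # dual inputs p-dim., outputs m-dim.
    # K: top n rows of the pseudoinverse of [W  Up^*]; if that matrix has
    # full column rank (assumptions 1-2), then K W = I and K Up^* = 0.
    K = np.linalg.pinv(np.hstack([W, Up.conj().T]))[:n]
    # G: solves G [Yp^*  Up^*] = [0  I_p] (assumption 3).
    target = np.hstack([np.zeros((p, m)), np.eye(p)])
    G = target @ np.linalg.pinv(np.hstack([Yp.conj().T, Up.conj().T]))
    Fp = np.hstack([F1p.conj().T, F2p.conj().T])
    A = K @ Fp
    B = K @ Yp.conj().T
    C = G @ (W @ K - np.eye(N)) @ Fp
    D = G @ W @ K @ Yp.conj().T
    return A, B, C, D
```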

Remark 2

The sufficient conditions stated in Proposition 12 fall short of being completely satisfactory, since they involve the matrices arising from the factorizations of \(\mathcal {S}_k\) rather than the matrices \(\mathcal {S}_k\), \(k=1,2\), themselves or, better still, the input-output data itself. We also make no claim about the conservativeness of these sufficient conditions. Deriving tighter conditions expressed only in terms of the input-output data is a pressing issue for further research. \(\square \)

7 Conclusions

We have presented a novel approach to the identification of unfalsified discrete Roesser models from vector-geometric data, based on the idea of first computing state trajectories compatible with the given input-output trajectories, and then using such trajectories together with the data to compute the A, B, C, D matrices of the Roesser equations. Our procedure is based on new results concerning duality of such models (Sect. 3) and on a parametrization of the solutions to the 2D Stein matrix equation (Sect. 4). These results lead to an algorithm for the computation of state trajectories (Sect. 5), which can be refined in a straightforward way to one computing minimal complexity ones (Sect. 5.1). The 2D Stein equation is exploited once again to find an unfalsified model for the input-output data and the computed state trajectories (see Sect. 6).

In several preceding publications concerned with linear time-invariant systems, the author and his collaborators put forward an “energy”-based approach to identification. Given the abundance of powerful methods to solve system identification problems for such class of systems, it can be argued that such results amounted to a relatively minor contribution. The author hopes that the application of such ideas to 2D systems as in Rapisarda and Antoulas (2016) and in the present paper, may shift the balance of judgment more in his favour. The ideas underlying the approach presented here and in the germane publication Rapisarda and Antoulas (2016) can be applied to a wider class of systems, and to more general classes of data than vector-geometric or exponential ones. Their potential lies in the generality of the idea of duality, which we believe can be used to overcome the difficulties (e.g. of bookkeeping) inherent in applying shift-invariance techniques to multidimensional systems, or to bypass them altogether for system classes where such property is not satisfied (e.g. 1D time-varying and nonlinear systems, for which promising results are being obtained as we write).

Limiting ourselves to the class of multidimensional systems considered in this paper, three areas of research are currently being investigated. Firstly, we need to generalize our approach to data other than vector-geometric, through the use of compact-support trajectories and infinite series involving their shifts [as in the 1D case, see Rapisarda and Trentelman (2011)]. Secondly, we want to identify classes of 2D systems other than the Roesser one that are amenable to identification with duality ideas. The main issue to be addressed is to determine which classes of systems admit a "pairing relation" with their dual that can be expressed as the divergence of a field involving the state trajectories, and, if possible, to characterize such a property algebraically. Moreover, it is important to ascertain whether such a divergence is amenable to a computationally straightforward treatment; for example, in Rem. 8, p. 2751 of Rapisarda and Antoulas (2016), (continuous-time) Fornasini-Marchesini models have been shown to admit a pairing relation, but one which does not seem conducive to the direct computation of state trajectories from it. Finally, we plan to investigate whether duality relations can be used to compute minimal Roesser models from non-minimal ones, and in the problem of state-space realisation from transfer functions.