Abstract
This paper considers large-scale linear stochastic systems representing, e.g., spatially discretized stochastic partial differential equations. Since asymptotic stability often cannot be ensured in such a stochastic setting (e.g., due to large noise), the main focus is on establishing model order reduction (MOR) schemes applicable to unstable systems. MOR is vital to reduce the dimension of the problem in order to lower the enormous computational complexity of, for instance, sampling methods in high dimensions. In particular, a new type of Gramian-based MOR approach is proposed in this paper that can be used in very general settings. The considered Gramians are constructed to identify dominant subspaces of the stochastic system, as pointed out in this work. Moreover, they can be computed via Lyapunov equations. However, covariance information of the underlying systems enters these equations, which is not directly available. Therefore, efficient sampling-based methods relying on variance reduction techniques are established to derive the required covariances and hence the Gramians. Alternatively, an ansatz to compute the Gramians by deterministic approximations of covariance functions is investigated. An error bound for the studied MOR methods is proved, yielding an a priori criterion for the choice of the reduced system dimension. This bound is new and beneficial even in the deterministic case. The paper is concluded by numerical experiments showing the efficiency of the proposed MOR schemes.
Introduction
Let \(w=\left( w_1, \ldots , w_q\right) ^\top \) be an \({\mathbb {R}}^q\)-valued mean-zero Wiener process with covariance matrix \({\mathbf {K}}=(k_{ij})\), i.e., \({\mathbb {E}}[w(t)w^\top (t)]={\mathbf {K}} t\) for \(t\in [0, T]\), where \(T>0\) is the terminal time. Suppose that w and all stochastic processes appearing in this paper are defined on a filtered probability space \(\left( \Omega , {\mathcal {F}}, ({\mathcal {F}}_t)_{t\in [0, T]}, {\mathbb {P}}\right) \)^{Footnote 1}. In addition, we assume w to be \(({\mathcal {F}}_t)_{t\in [0, T]}\)-adapted and the increments \(w(t+h)-w(t)\) to be independent of \({\mathcal {F}}_t\) for \(t, h\ge 0\). We consider the following large-scale controlled linear stochastic differential equation
where \(A, N_i\in {\mathbb {R}}^{n\times n}\), \(B\in {\mathbb {R}}^{n\times m}\) and \(C\in {\mathbb {R}}^{p\times n}\). The state dimension n is assumed to be large and the quantity of interest y is often low-dimensional, i.e., \(p\ll n\), but we also discuss the case of a large p. By \(x(t; x_0, u)\), we denote the state in dependence on the initial state \(x_0\) and the control u, which we assume to be \(({\mathcal {F}}_t)_{t\in [0, T]}\)-adapted with \(\left\Vert u\right\Vert _{L^2_T}^2:={\mathbb {E}}\int _0^T \left\Vert u(s)\right\Vert ^2_2 ds<\infty \), where \(\left\Vert \cdot \right\Vert _2\) represents the Euclidean norm.
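To fix ideas, a single sample path of the state equation can be simulated with an Euler–Maruyama scheme. The sketch below is illustrative only (the function name and discretization are ours) and assumes (1a) has the bilinear form \({\text {d}}x = (Ax+Bu)\,{\text {d}}t + \sum _{i=1}^q N_i x\,{\text {d}}w_i\) suggested by the coefficients above; correlated Wiener increments are generated via a Cholesky factor of \({\mathbf {K}}\).

```python
import numpy as np

def simulate_path(A, N_list, B, K, u, x0, T, n_steps, rng):
    """Euler-Maruyama sketch for dx = (A x + B u) dt + sum_i N_i x dw_i,
    where the Wiener increments have covariance K * dt (illustrative)."""
    dt = T / n_steps
    Lc = np.linalg.cholesky(K)  # correlates the standard normal increments
    x = np.array(x0, dtype=float)
    t = 0.0
    for _ in range(n_steps):
        dw = Lc @ rng.standard_normal(len(K)) * np.sqrt(dt)
        noise = sum(dwi * (Ni @ x) for Ni, dwi in zip(N_list, dw))
        x = x + (A @ x + B @ u(t)) * dt + noise
        t += dt
    return x
```

Here `N_list` holds \(N_1,\ldots ,N_q\) and `K` is the \(q\times q\) covariance matrix \({\mathbf {K}}\); the strong order of this scheme is the usual 1/2 for multiplicative noise.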
The goal is to construct a system with state \(\bar{x}\) and quantity of interest \(\bar{y}\) having the same structure as (1) but a much smaller state dimension \(r\ll n\). At the same time, we aim to ensure \(y \approx \bar{y}\). Such a reduced-order model (ROM) is particularly beneficial if many evaluations of (1) for several controls u are required (e.g., in an optimal control problem), combined with the need to generate many samples of y for each individual u. Now, a ROM shall be achieved under very general conditions such as the absence of mean square asymptotic stability, i.e., \({\mathbb {E}}\left\Vert x(t; x_0, 0)\right\Vert _2^2\rightarrow 0\) (as \(t\rightarrow \infty \)) is not given. Methods involving such a stability condition are intensively studied in the literature [3, 4, 13, 16], since it is often guaranteed if (1a) results from a spatial discretization of a stochastic partial differential equation (SPDE) such as
We refer to [6] for more details on the theory of such equations. The solution \({{\mathcal {X}}}(t, \cdot )\) to the heat equation (2) is viewed as a stochastic process taking values in a Hilbert space and shall be approximated by x. In this context, A can be seen as a discretized version of the Laplacian \(\Delta \) and B, \(N_i\) represent discretizations of the linear bounded operators \({\mathcal {B}}\), \({\mathcal {N}}_i\). Moreover, \(w_i\) can be interpreted as Fourier coefficients corresponding to a truncated series of space-time noise. Further explanations on different schemes for a spatial discretization can, for example, be found in [2, 10]. However, even in a setting like in (2), mean square asymptotic stability can be violated since the noise can easily cause instabilities (e.g., if it is sufficiently large).
Such a scenario is of interest in this paper. We establish generalizations of balancing-related model order reduction (MOR) schemes in order to make them applicable to general systems (1). These MOR methods rely on matrices called Gramians that can be used to identify the dominant subspaces of (1). Based on this characterization of the relevance of different state directions, less important information in the dynamics is removed, leading to the desired ROM. This step can be interpreted as an optimization procedure applied to spatially discretized SPDEs. In an unstable setting, Gramians need to be defined that exist in general, in contrast to previous approaches. We consider generalized time-limited Gramians in this work. This type of Gramian has been used in deterministic frameworks [9, 11, 15]. Although such an ansatz is beneficial for the setting we want to cover, the analysis of MOR methods based on generalized time-limited Gramians is much more challenging. Furthermore, the question of how to compute these Gramians in practice is very difficult but vital since they are required to derive the ROM.
In this paper, we introduce time-limited Gramians in the stochastic setting studied here. We point out the relation between these Gramians and the dominant subspaces of (1) and show their relation to matrix (differential) equations. Subsequently, we discuss two different MOR techniques based on these Gramians and analyze the respective errors. In particular, an error bound is established that allows us to identify situations in which the approaches work well. It is important to mention that this bound is more than just a generalization of the deterministic case [15]. The new type of representation links the truncated Hankel singular values of the system or the truncated eigenvalues of the reachability Gramian, respectively, to the error of the approximation without needing asymptotic stability and is hence beneficial also in unstable settings. Moreover, we discuss different strategies that can be used to compute the proposed Gramians. They are solutions to Lyapunov equations. However, in a time-limited scenario, covariance information at the terminal time enters these Lyapunov equations, which is not immediately available. Since direct methods only work in moderately high dimensions, we focus on sampling-based approaches to estimate the required covariances. In order to increase the efficiency of such procedures, we apply variance reduction methods in this context, leading to an efficient way of solving for the time-limited Gramians. Apart from this empirical procedure, a second strategy to approximate covariance functions and hence the Gramians is investigated, where potentially expensive sampling is not required. The paper is concluded by several numerical experiments showing the efficiency of the MOR methods.
Gramian-based MOR
Gramians and characterization of dominant subspaces
Identifying the effective dimensionality of system (1) requires the study of the fundamental solution to the homogeneous stochastic state equation. It is defined as the matrix-valued stochastic process \(\Phi \) solving
where I denotes the identity matrix. Multiplying (3) with \(x_0\) from the right, we obtain the solution to (1a) if \(u\equiv 0\). Based on \(\Phi \) we define two Gramians by
where \(P_T\) and \(Q_T\) are supposed to identify the less relevant states in (1a) and (1b), respectively. \(P_T\) and \(Q_T\) can be viewed as generalizations of deterministic time-limited Gramians, which are obtained by setting \(N_i=0\) for all \(i=1, \ldots , q\), resulting in \(\Phi (t)= \text {e}^{At}\). MOR schemes based on such Gramians in a deterministic framework are investigated, e.g., in [9, 11, 15]. \(P_T\) and \(Q_T\) generally exist, in contrast to their limits \(\lim _{T\rightarrow \infty } P_{T}\) and \(\lim _{T\rightarrow \infty }Q_{T}\), which require mean square asymptotic stability. MOR methods based on these limits are, e.g., considered in [3, 4, 13, 16] and have already been analyzed in detail. However, the necessary stability condition is often not satisfied in practice.
Let us briefly sketch the relation between \(P_T\) and dominant subspaces in (1a) for the case of zero initial data. Suppose that \((p_{k})_{k=1,\ldots , n}\) is an orthonormal basis of \({\mathbb {R}}^n\) consisting of eigenvectors of \(P_T\). We can then write the state as
Given \(x_0=0\), the expansion coefficients can be bounded from above as follows
see [13, Section 3], where \(\lambda _{k}\) is the eigenvalue corresponding to \(p_k\). If \(\lambda _{k}\) is small, the same is true for \(\langle x(\cdot , 0, u), p_{k} \rangle _2\) and hence \(p_k\) is a less relevant direction that can be neglected. This implies that the eigenspaces of \(P_T\) belonging to the small eigenvalues can be removed from the system. On the other hand, we aim to find state directions that have a low impact on the quantity of interest y. We therefore look at the initial state \(x_0\) since it determines the dynamics of the state variable. We expand
where \((q_{k})_{k=1,\ldots , n}\) is an orthonormal basis of eigenvectors of \(Q_T\) with associated eigenvalues \((\mu _{k})_{k=1,\ldots , n}\). Using the solution representation of the state variable, we obtain
with \(t\in [0, T]\) and \( \Phi (t, s):=\Phi (t)\Phi ^{-1}(s)\). Consequently, neglecting \(q_k\) has a low impact on y if \(C\Phi (\cdot )q_{k}\) is small on [0, T]. It now follows that
telling us that the eigenspaces of \(Q_T\) for which the associated eigenvalues \(\mu _k\) are small are unimportant. Knowing the less relevant state directions in (1a) and (1b) from (6) and (7), we aim to remove them. This can be done by diagonalizing \(P_T\) such that less important variables in (1a) can be easily identified and truncated. Another, but computationally more expensive, approach is based on simultaneously diagonalizing \(P_T\) and \(Q_T\), which allows removing more redundant information from the system. Both strategies are discussed in Sect. 2.2.
Below, we point out the relation between the Gramians and linear matrix differential equations. To do so, we introduce two operators \({\mathcal {L}}_A(X)= A X+X A^\top \) and \(\Pi (X) = \sum _{i, j=1}^q N_i X N_j^\top k_{ij}\) on the space of symmetric matrices endowed with the Frobenius inner product \(\langle \cdot , \cdot \rangle _F\). \({\mathcal {L}}_A\) is a Lyapunov operator and \(\Pi \) is positive in the sense that \(\Pi (X)\) is a positive semidefinite matrix if X is positive semidefinite. The corresponding adjoint operators are \({\mathcal {L}}_A^*(X)= A^\top X+X A\) and \(\Pi ^*(X) = \sum _{i, j=1}^q N_i^\top X N_j k_{ij}\).
The equations related to \(P_T\) and \(Q_T\) will be helpful to compute these Gramians, which are needed in order to derive the reduced system. By Itô's product rule [12], we can show that \(F(t) = {\mathbb {E}} [\Phi (t)BB^\top \Phi ^\top (t)]\), \(t\in [0, T]\), solves
Integrating both sides of (8) yields
Remark 2.1
The generalized Lyapunov operator \({\mathcal {L}}_A + \Pi \) is linked to the Kronecker matrix
where \(\cdot \otimes \cdot \) is the Kronecker product between two matrices. Let \(\text {vec}(\cdot )\) be the vectorization of a matrix. Then, it holds that \(\text {vec}\left( ({\mathcal {L}}_A + \Pi )\left( X\right) \right) = {\mathcal {K}} \text {vec}(X)\).
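The identity of Remark 2.1 is easy to verify numerically. The following sketch (our notation; column-stacking convention \(\text {vec}(MXN^\top )=(N\otimes M)\text {vec}(X)\)) assembles \({\mathcal {K}}= I\otimes A + A\otimes I + \sum _{i,j} k_{ij}\, N_j\otimes N_i\) and compares it with the operator form for random data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 4, 2
A = rng.standard_normal((n, n))
N = [rng.standard_normal((n, n)) for _ in range(q)]
Kcov = np.array([[1.0, 0.3], [0.3, 2.0]])          # Wiener covariance (k_ij)
X = rng.standard_normal((n, n)); X = X + X.T       # symmetric test matrix

vec = lambda M: M.flatten(order="F")               # column-stacking vec

# operator form: (L_A + Pi)(X) = A X + X A^T + sum_ij k_ij N_i X N_j^T
LX = A @ X + X @ A.T + sum(Kcov[i, j] * N[i] @ X @ N[j].T
                           for i in range(q) for j in range(q))

# Kronecker form, using vec(M X N^T) = (N kron M) vec(X)
I = np.eye(n)
Kmat = np.kron(I, A) + np.kron(A, I) + sum(Kcov[i, j] * np.kron(N[j], N[i])
                                           for i in range(q) for j in range(q))

assert np.allclose(Kmat @ vec(X), vec(LX))
```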
The link between \(Q_T\) and the corresponding matrix equation is established in a different way. We formulate this result in the following proposition.
Proposition 2.2
Let \(C^\top C\) be contained in the eigenspace of the Lyapunov operator \({\mathcal {L}}_A^* + \Pi ^*\). Then, \(G(t) = {\mathbb {E}} [\Phi ^\top (t)C^\top C \Phi (t)]\), \(t\in [0, T]\), satisfies
Proof
Since \(C^\top C\) is contained in the eigenspace of the Lyapunov operator, there exist \(\alpha _1, \dots , \alpha _{n^2}\in {\mathbb {C}}\) such that \(C^\top C = \sum _{k=1}^{n^2} \alpha _k {\mathcal {V}}_k\), where \(({\mathcal {V}}_k)\) are eigenvectors of \({\mathcal {L}}_A^* + \Pi ^*\) corresponding to the eigenvalues \((\beta _k)\). Then, we have \({\mathbb {E}} [\Phi ^\top (t)C^\top C \Phi (t)]= \sum _{k=1}^{n^2} \alpha _k {\mathbb {E}} [\Phi ^\top (t){\mathcal {V}}_k\Phi (t)]\). Let us apply Itô's product rule, see [12], to \(\Phi ^\top (t){\mathcal {V}}_k\Phi (t)\), resulting in
We insert the stochastic differential of \(\Phi \) above, compare with (3), leading to
We apply the expected value to both sides of the above identity and exploit that Itô integrals have mean zero (see, e.g., [12]). Hence, we obtain
This implies that \({\mathbb {E}} [\Phi ^\top (t){\mathcal {V}}_k\Phi (t)] = \text {e}^{\beta _k t}{\mathcal {V}}_k\) providing \({\mathbb {E}} [\Phi ^\top (t)C^\top C \Phi (t)]= \sum _{k=1}^{n^2} \alpha _k \text {e}^{\beta _k t}{\mathcal {V}}_k\). Consequently, we have
using the linearity of \({\mathcal {L}}_A^*+ \Pi ^*\). This concludes the proof. \(\square \)
Remark 2.3
The assumption of Proposition 2.2 is always satisfied if \({\mathcal {K}}\) is diagonalizable over \({\mathbb {C}}\) because, in that case, there is a basis of \({\mathbb {C}}^{n^2}\) consisting of eigenvectors of \({\mathcal {K}}^\top \). Hence, \(\text {vec}(C^\top C)\) can be written as a linear combination of these eigenvectors, which are of the form \(\text {vec}({\mathcal {V}}_k)\) with \({\mathcal {V}}_k\) being an eigenvector of \({\mathcal {L}}_A^*+ \Pi ^*\), implying that \(C^\top C\) lies in the eigenspace of this operator. Therefore, from the computational point of view, the assumption of Proposition 2.2 does not restrict the generality since the set of diagonalizable \(n^2\times n^2\) matrices is dense in \({\mathbb {C}}^{n^2\times n^2}\).
In fact, we can find a stochastic representation of the solution to (11) different from \({\mathbb {E}} [\Phi ^\top (t)C^\top C \Phi (t)]\), \(t\in [0, T]\). Introducing the fundamental solution \(\Phi _d\) by the equation \(\Phi _d(t)=I+\int _0^t A^\top \Phi _d(s) {\text {d}}s+\sum _{i=1}^q \int _0^t N_i^\top \Phi _d(s){\text {d}}w_i(s)\), we see that \(G(t) = {\mathbb {E}} [\Phi _d(t)C^\top C \Phi _d^\top (t)]\). This is a direct consequence of the relation between \({\mathbb {E}} [\Phi (t)B B^\top \Phi ^\top (t)]\) and the solution of (8) when \((A, B, N_i)\) is replaced by \((A^\top , C^\top , N_i^\top )\). Therefore, \({\mathbb {E}} [\Phi _d(t)C^\top C \Phi _d^\top (t)]\), \(t\in [0, T]\), solves (11) and hence coincides with \({\mathbb {E}} [\Phi ^\top (t)C^\top C \Phi (t)]\), \(t\in [0, T]\), given the assumption of Proposition 2.2.
Generally, we have \(\Phi _d(t)\ne \Phi ^\top (t)\). In case all matrices \(A, N_1, \ldots , N_q\) commute, we know that A and \(N_i\) commute with \(\Phi \) (see, e.g., [14]). Hence, \(\Phi _d(t)= \Phi ^\top (t)\), which can be seen by transposing (3) and subsequently exploiting the commutativity. In particular, this is the case in the deterministic setting, where \(N_i=0\) for all \(i=1, \ldots , q\).
Under the assumption of Proposition 2.2, it holds that
exploiting (11). In fact, we need to compute \(P_T\) and \(Q_T\) within the MOR procedure described later. The Lyapunov equations (9) and (12) are used to do so. However, one needs access to F(T) and G(T), which are the terminal values of the matrix differential equations (8) and (11). This is indeed very challenging in a framework where \(n\gg 100\). We will address possible approaches for computing \(P_T\) and \(Q_T\) in such settings in Sect. 4.
Reducedorder modeling by transformation of Gramians
In this work, we address MOR techniques that rely on a change of basis. In particular, one seeks a suitable regular matrix S that defines \(x_S(t) = S x(t)\). Inserting this into (1) yields
where \((A_S, B_S, C_S, N_{i, S}) = (SAS^{-1},SB, CS^{-1}, SN_iS^{-1})\). System (13) has the same input–output behavior as (1), but the fundamental solution and hence the Gramians are different. The fundamental solution of (13) is \(\Phi _S(t) = S \Phi (t) S^{-1}\), which can be observed by multiplying (3) with S from the left and with \(S^{-1}\) from the right. Consequently, the new Gramians are
The idea is to diagonalize at least one of these Gramians, since in a system with diagonal Gramians, the orthonormal bases \((p_k)\) and \((q_k)\) are canonical unit vectors (columns of the identity matrix). Thus, unimportant directions can be identified easily by (6) and (7) and are associated with the small diagonal entries of the new Gramians. For the first approach, we set \(S=S_1\), where \(S_1\) is part of the eigenvalue decomposition \( P_T= S_1^\top \Sigma ^{(1)}_T S_1\). This leads to \(P_{T, S}=\Sigma ^{(1)}_T\) with \(\Sigma ^{(1)}_T\) being the diagonal matrix of eigenvalues of \(P_T\). Notice that \(S^\top = S^{-1}\) holds in this case. If (1a) is mean square asymptotically stable, \(P_T\) can be replaced by \(\lim _{T\rightarrow \infty } P_T\). This method based on the limit is investigated in [16].
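As a minimal sketch (our function name), \(S_1\) can be obtained from a symmetric eigenvalue decomposition of \(P_T\), sorted so that truncation later removes the directions belonging to the smallest eigenvalues:

```python
import numpy as np

def transform_S1(P_T):
    """First approach: S = S_1 from P_T = S_1^T Sigma^(1) S_1, so that
    S_1 P_T S_1^T = Sigma^(1) is diagonal and S_1^{-1} = S_1^T (sketch)."""
    eigval, eigvec = np.linalg.eigh(P_T)
    order = np.argsort(eigval)[::-1]     # descending eigenvalues
    S1 = eigvec[:, order].T              # orthogonal transformation
    return S1, eigval[order]
```

Truncating the state variables associated with the small trailing eigenvalues then yields the first reduction approach.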
The second approach uses \(S=S_2\), which leads to \(P_{T, S}=Q_{T, S}=\Sigma ^{(2)}_T\), where \(\Sigma ^{(2)}_T\) is the diagonal matrix of the square roots of eigenvalues of \(P_T Q_T\). Those are called Hankel singular values (HSVs). Given \(P_T, Q_T>0\), the transformation \(S_2\) and its inverse are obtained by
where the ingredients of (14) are computed by the factorizations \(P_{T} = K K^\top \), \(Q_T=LL^\top \) and the singular value decomposition of \(K^\top L = V\Sigma ^{(2)}_T U^\top \). The same procedure can be conducted for the limits of the Gramians (as \(T\rightarrow \infty \)) if mean square asymptotic stability is given [4]. However, such a stability condition is generally too restrictive in practice. We introduce the matrix
as the diagonal matrix of either eigenvalues of \(P_T\) or of HSVs of system (1). For \(S=S_1\) or \(S=S_2\) the coefficients of (13) are partitioned as follows
where \(x_1(t)\in \mathbb {R}^{r}\), \(A_{11}\in \mathbb {R}^{r\times r} \), \( B_1\in \mathbb {R}^{r\times m} \), \( C_1\in \mathbb {R}^{p\times r} \), \( N_{i,11}\in \mathbb {R}^{r\times r}\) and \(\Sigma _{T, 1}\in \mathbb {R}^{r\times r}\), etc. The variables \(x_2\) are associated with the matrix \(\Sigma _{T, 2}\) of small diagonal entries of \(\Sigma _{T}\) and are the less relevant ones. A reduced system is now obtained by truncating the equations of \(x_2\) in (13). Additionally, we set \(x_2\equiv 0\) in the equations for \(x_1\) leading to a reduced system
approximating (1). Below, we give another interpretation for (17). Let us decompose the transformation
where \(W^\top \) and V are the first r rows and columns of S and \(S^{-1}\), respectively. Notice that \(W^\top V=I\) and hence \(V W^\top \) is a projection. Furthermore, we have \(W=V\) if \(S=S_1\). Consequently, (17) can be seen as a projection-based model with \({A}_{11}= W^\top A V\), \(B_1= W^\top B\), \(C_1 = CV\) and \({N}_{i,11}= W^\top {N}_{i} V\), which is obtained by the state approximation \(x(t) \approx V \bar{x}(t)\). Inserting this approximation into (1) and subsequently multiplying the state equation with \(W^\top \) to enforce the remainder term to be zero then results in (17).
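A sketch of the computations behind (14) and the projection matrices from (18) (our function names; assumes \(P_T, Q_T>0\) so that Cholesky factors exist):

```python
import numpy as np
from scipy.linalg import cholesky, svd

def balancing_transform(P, Q):
    """Square-root construction of (14): P = K K^T, Q = L L^T and
    K^T L = V Sigma U^T; then S = Sigma^{-1/2} U^T L^T and
    S^{-1} = K V Sigma^{-1/2} give S P S^T = S^{-T} Q S^{-1} = Sigma."""
    K = cholesky(P, lower=True)
    L = cholesky(Q, lower=True)
    V, sig, Uh = svd(K.T @ L)       # K^T L = V diag(sig) Uh, with Uh = U^T
    Sig_is = np.diag(sig ** -0.5)
    S = Sig_is @ Uh @ L.T
    S_inv = K @ V @ Sig_is
    return S, S_inv, sig            # sig: Hankel singular values, descending

def reduce_system(A, B, C, N_list, S, S_inv, r):
    """Petrov-Galerkin reduction (17): W^T = first r rows of S,
    V = first r columns of S^{-1}; A_11 = W^T A V, B_1 = W^T B, etc."""
    Wt, V = S[:r, :], S_inv[:, :r]
    return Wt @ A @ V, Wt @ B, C @ V, [Wt @ Ni @ V for Ni in N_list]
```

The same `reduce_system` applies to the first approach by passing \(S=S_1\) and \(S^{-1}=S_1^\top \).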
Output error bound
In this section, we prove a bound for the error between (1) and (17). Below, we assume zero initial conditions, i.e., \(x_0=0\) and \(\bar{x}_0=0\). We begin with a general bound following the steps of [4, 13]. The solutions x(t) and \(\bar{x}(t)\), \( t\in [0,T] \), to (1) and (17) can be expressed using their fundamental matrices \(\Phi (t)\) and \(\bar{\Phi }(t)\), respectively, see [13]. Therefore, we have
where \( \Phi (t,s)=\Phi (t)\Phi ^{1}(s) \) and \( \bar{\Phi }(t,s)=\bar{\Phi }(t)\bar{\Phi }^{1}(s) \). Consequently, representations for the outputs are
where \( t\in [0,T] \). Then, we find
Here, \( \Vert \cdot \Vert _F \) denotes the Frobenius norm. Using Cauchy’s inequality, it holds that
where \(B^e=\left( {\begin{matrix}B\\ B_1\end{matrix}}\right) \), \(C^e = \left( {\begin{matrix}C&-C_1\end{matrix}}\right) \) and \(\Phi ^e= \left( {\begin{matrix}\Phi & 0\\ 0 &\bar{\Phi }\end{matrix}}\right) \) is the fundamental solution to the system with coefficients \(A^e=\left( {\begin{matrix}A& 0\\ 0 &A_{11}\end{matrix}}\right) \) and \(N_i^e=\left( {\begin{matrix}N_i& 0\\ 0 & N_{i, 11}\end{matrix}}\right) \).
Applying the arguments that are used in [4, 13], we know that
For \(t\in [0, T]\), the identity in (21) yields
with \(F^e(t) = \mathbb {E}\left[ \Phi ^e(t) B^e{B^e}^\top {\Phi ^e}^\top (t)\right] \) exploiting Fubini’s theorem as well as the fact that the trace and \(C^e\) are linear operators. Since \(F(t) = \mathbb {E}\left[ \Phi (t) BB^\top \Phi ^\top (t)\right] \) is a stochastic representation for equation (8), see Sect. 2.1, \(F^e\) satisfies
using the same arguments. From (23), it can be seen that the left upper \(n\times n\) block of \(F^e\) is F which solves (8). On the other hand, the right lower \(r\times r\) block \(\bar{F}\) and the right upper \(n\times r\) block \(\tilde{F}\) of \(F^e\) satisfy
with stochastic representations
Consequently, using (22) with the partition \(F^e= \left( {\begin{matrix}F & \tilde{F}\\ \tilde{F}^\top & \bar{F} \end{matrix}}\right) \), we find
where \(\bar{P}_T=\int _0^T\bar{F}(t)\,\mathrm{d}t\) and \(\tilde{P}_T=\int _0^T\tilde{F}(t)\,\mathrm{d}t\) solve
Summing up, we obtain that
The bound in (29) is very useful in order to assess the quality of a reduced system. Since \(P_T\) has to be computed to obtain (17), the actual cost of determining the bound lies in solving the low-dimensional matrix equations (27) and (28). However, (29) is only an a posteriori estimate which is computed after the reduced-order model has been derived. Therefore, we discuss the role of \(\Sigma _{T, 2}=\text {diag}(\sigma _{T, r+1}, \ldots , \sigma _{T, n})\), which is either the matrix of neglected eigenvalues of \(P_T\) or of neglected HSVs of the system. \(\Sigma _{T, 2}\) is associated with the truncated state variables \(x_2\) of (13), compare with (16). By (6) and (7), it is already known that such variables \(x_2\) are less relevant if \(\sigma _{T, r+1}, \ldots , \sigma _{T, n}\) are small. This makes the values \(\sigma _{T, i}\) a good a priori criterion for the choice of r. In the following, we want to investigate how the truncated values \(\sigma _{T, r+1}, \ldots , \sigma _{T, n}\) characterize the error of the approximation. For that reason, we prove an error bound depending on \(\Sigma _{T, 2}\). As we will see, \(\Sigma _{T, 2}\) is not the only factor having an impact on the bound, which is structurally independent of whether we choose \(S=S_1\) or \(S= S_2\).
Theorem 3.1
Let y be the output of (1) and \(\bar{y}\) be the one of (17). Suppose that \(S=S_1, S_2\), where \(S_1\) is the factor of the eigenvalue decomposition of the Gramian \(P_T\) and \(S_2\) is the balancing transformation defined in (14). Using partition (16) of the realization \((A_S, B_S, C_S, N_{i, S})\), we have
where \(\bar{Q}\) and \(\tilde{Q}=\begin{pmatrix} \tilde{Q}_1&\tilde{Q}_2 \end{pmatrix}\) are the unique solutions to
Moreover, the above bound involves \(F_S(T):=SF(T)S^{\top }=\left( {\begin{matrix}F_{11} & F_{12} \\ F_{21} & F_{22} \end{matrix}}\right) \) and \(\tilde{F}_S(T):= S\tilde{F}(T)=\left( {\begin{matrix}\tilde{F}_1 \\ \tilde{F}_2 \end{matrix}}\right) \), where F(T), \(\bar{F}=\bar{F}(T)\) and \(\tilde{F}(T)\) are the terminal values of (8), (24) and (25), respectively.
The terms in the bound of Theorem 3.1 that do not directly depend on \(\Sigma _{T, 2}\) are related to the covariance error of the dimension reduction at the terminal time T (with \(u\equiv 0\)). To see this, let V be the matrix introduced in (18). As explained below (18), the state of the reduced system (17) can be interpreted as an approximation of the original state in the subspace spanned by the columns of V. By the stochastic representations of F(T), \(\tilde{F}(T)\) and \(\bar{F}(T)\) (see above (8) and (26)), we can view F(T) and \(\bar{F}(T)\) as covariances of the original and reduced model at time T, whereas \(\tilde{F}(T)\) describes the correlations between both systems. Let us now assume that
i.e., the covariance at T is well-approximated in the reduced system. This is, e.g., given if the uncontrolled state is well-approximated in the range of V at time T, i.e., \(\Phi (T) B \approx V \bar{\Phi }(T) B_1\). Now, multiplying (32) with S from the left and with W (defined in (18)) from the right, we obtain that \(\left( {\begin{matrix}\tilde{F}_1-F_{11} \\ \tilde{F}_2-F_{21} \end{matrix}}\right) \) is small. Multiplying (33) with \(W^\top \) from the left and with W from the right provides a low deviation between \(F_{11}= W^\top F(T) W\) and \(\bar{F}\). Although we additionally have these terms related to the covariance error, looking at \(\Sigma _{T, 2}\) is still suitable for getting an intuition concerning the error and hence a first idea for the choice of r. This is because a small \(\Sigma _{T, 2}\) goes along with a small error between \(\Phi (T) B\) and its approximation \(V \bar{\Phi }(T) B_1\) in the range of V. This observation can be made due to
where \(z_T\in \text {ker}P_T\). Since \(t\mapsto \Phi (t)\) is \({\mathbb {P}}\)almost surely continuous, we have \(\left( \Phi (t) B\right) ^\top z_T=0\) \({\mathbb {P}}\)almost surely for all \(t\in [0, T]\). Choosing \(t=T\), we therefore know that the columns of \(\Phi (T) B\) are orthogonal to \(\text {ker}P_T\). This means that \(\Phi (T) B\in \text {im}P_T\) since \(P_T\) is symmetric. Hence, there is a matrix \(Z_T\) such that
i.e., the columns of \( \Phi (T) B\) lie almost in the span of V if \(\Sigma _{T,2}\) is small. Therefore, a good approximation can be expected if one truncates states with associated small values \(\sigma _{T, r+1},\ldots ,\sigma _{T, n}\). This can be confirmed by computing the representation in (29) after a reduced-order dimension r has been chosen based on the values \(\sigma _{T, i}\).
Remark 3.2
Notice that the covariance F(T) vanishes in the limit as \(T\rightarrow \infty \) if (1) is mean square asymptotically stable. In this context, the deviations in (32) and (33) can be expected to be small for sufficiently large T since the covariance error disappears at \(\infty \). If the system is unstable, we have \(\Vert F(T)\Vert \rightarrow \infty \) as \(T\rightarrow \infty \). In this case, the covariance error might be large and dominant if T is very large, so that the approximation quality deteriorates. The role of T is additionally discussed in Sect. 5.
We are now ready to prove the error bound in the following:
Proof of Theorem 3.1
Since \(S= S_1, S_2\) diagonalizes \(P_T\), we have
We set \(\tilde{Y}_T:=S\tilde{P}_T\) and obtain the corresponding equation by multiplying (28) with S from the left resulting in
Now, we analyze the trace expression \(\epsilon ^2 :=\mathrm{tr}(CP_TC^\top )+\mathrm{tr}(C_1\bar{P}_TC_1^\top )-2\,\mathrm{tr}(C\tilde{P}_TC_1^\top ) \) in (29). We see that
Exploiting (31) yields
Comparing (31) and (35), we find that
Using the partition in (16), the first r columns of (34) are
We insert (38) into (37) and obtain
Using the partition of the balanced realization in (16), we observe that the last term of the above equation equals the first r columns of (31). So, we can say that
Inserting (39) into (36), we have
Equation (30) now yields
The combination of (27) and the left upper block of (34) gives
Consequently, we have
So, we obtain that
which concludes the proof of this theorem. \(\square \)
Notice that the estimate in Theorem 3.1 is also beneficial if \(N_i=0\) for all \(i=1, \ldots , q\), since it improves the deterministic bound [15] in the sense that we can generally deduce the relation between the truncated HSVs and the actual approximation error here. It is important to note that, in the deterministic case, "improvement" is not meant in terms of accuracy. The error bound representation in [15] merely has the drawback that it allows similar conclusions only if the underlying system is asymptotically stable. Moreover, the result of Theorem 3.1 is a generalization of the bounds for mean square asymptotically stable stochastic systems [4, 16], where the covariance-related terms vanish as \(T\rightarrow \infty \).
Computation of Gramians
In this section, we discuss how to compute \(P_T\) and \(Q_T\), which allow us to identify redundant information in the system. These matrices are solutions of the Lyapunov equations (9) and (12) with left-hand sides depending on F(T) and G(T), respectively. Given F(T) and G(T), it is therefore required to solve generalized Lyapunov equations
efficiently, where L is a symmetric matrix of suitable dimension. According to Remark 2.1, this can be done by vectorization, i.e., one can try to solve \(\text {vec}\left( L\right) = {\mathcal {K}} \text {vec}(X)\) with the Kronecker matrix \({\mathcal {K}}\) defined in (10). Since \({\mathcal {K}}\) is of order \(n^2\), the complexity of deriving \(\text {vec}(X)\) from this linear system of equations is \(\mathcal {O}(n^6)\), making this procedure infeasible for \(n\gg 100\).
However, more efficient techniques have been developed in order to solve (41), see, e.g., [8], where a sequence of standard Lyapunov equations (\(\Pi =0\)) is solved to find X. Such standard Lyapunov equations can either be tackled by direct methods, such as Bartels–Stewart [1], which cost \(\mathcal {O}(n^3)\) operations, or by iterative methods such as ADI or Krylov subspace methods [17], which have a much smaller complexity than the Bartels–Stewart algorithm, in particular when the left-hand side is of low rank or structured (complexity of \(\mathcal {O}(n^2)\) or less).
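The iterative idea can be sketched as a frozen-coefficient fixed-point iteration (our simplification; the scheme in [8] may differ in detail), where each step solves a standard Lyapunov equation with the \(\Pi \)-part evaluated at the previous iterate:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def solve_generalized_lyapunov(A, N_list, Kcov, L, tol=1e-10, max_iter=200):
    """Fixed-point sketch for A X + X A^T + Pi(X) = L with
    Pi(X) = sum_ij k_ij N_i X N_j^T: freeze the Pi-term at the previous
    iterate and solve a standard Lyapunov equation per step. Converges
    when the Pi-part is a small enough perturbation of the (stable)
    Lyapunov operator."""
    q = len(N_list)
    X = np.zeros_like(L)
    for _ in range(max_iter):
        Pi_X = sum(Kcov[i, j] * N_list[i] @ X @ N_list[j].T
                   for i in range(q) for j in range(q))
        # solve_continuous_lyapunov(A, Q) solves A Y + Y A^H = Q
        X_new = solve_continuous_lyapunov(A, L - Pi_X)
        if np.linalg.norm(X_new - X) <= tol * max(np.linalg.norm(X_new), 1.0):
            return X_new
        X = X_new
    return X
```

Each step costs one \(\mathcal {O}(n^3)\) Bartels–Stewart solve; the low-rank iterative solvers mentioned above replace this inner step for large n.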
Solving for \(P_T\) and \(Q_T\) now relies on having access to F(T) and G(T), which are the terminal values of the matrix differential equations (8) and (11). The remainder of this section deals with strategies to compute these terminal values.
Exact methods
One solution to overcome the issue of unknown F(T) and G(T) is to use vectorizations of (8) and (11) for dimensions n of a few hundred. If we define \(f(t):=\text {vec}(F(t))\) and \(g(t):=\text {vec}(G(t))\), then
where \( {\mathcal {K}}\) is defined in (10). Therefore, obtaining F(T) and G(T) relies on the efficient computation of a matrix exponential, since
One can find a discussion on how to determine a matrix exponential efficiently in [11] and references therein. Alternatively, one might think of discretizing the matrix differential equations (8) and (11) to find an approximation of F(T) and G(T). However, as stated above, these equations are equivalent to ordinary differential equations of dimension \(n^2\). Solving such extremely large-scale systems is usually not feasible. In addition, only implicit schemes would allow for a reasonable step size in the discretization, making the problem even more complex. For that reason, we discuss more suitable numerical approximations in the following.
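For moderate n, the matrix-exponential route can be sketched as follows, with \({\mathcal {K}}\) assembled from the vec identity of Remark 2.1 (column-stacking convention; our function name):

```python
import numpy as np
from scipy.linalg import expm

def terminal_covariance(A, N_list, Kcov, B, T):
    """F(T) via the vectorized form: f(t) = expm(K t) f(0) with
    f(0) = vec(B B^T). Feasible only for moderate n, since the
    Kronecker matrix K is n^2 x n^2 (sketch)."""
    n = A.shape[0]
    q = len(N_list)
    I = np.eye(n)
    Kmat = np.kron(I, A) + np.kron(A, I) + sum(
        Kcov[i, j] * np.kron(N_list[j], N_list[i])
        for i in range(q) for j in range(q))
    f0 = (B @ B.T).flatten(order="F")      # column-stacking vec(B B^T)
    fT = expm(Kmat * T) @ f0
    return fT.reshape((n, n), order="F")
```

G(T) is obtained analogously by replacing \((A, B, N_i)\) with \((A^\top , C^\top , N_i^\top )\), compare with Sect. 2.1.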
Sampling-based approaches
We aim to derive an approximation of the terminal value \(F(T) = {\mathbb {E}} [\Phi (T)BB^\top \Phi ^\top (T)]\) of (8) by different stochastic representations. This alternative approach is required since computing \(\text {e}^{{\mathcal {K}} T}\) is not feasible if \(n\gg 100\), knowing that \({\mathcal {K}}\in {\mathbb {R}}^{n^2\times n^2}\). Therefore, we discuss sampling-based approaches in the following. Let \(\Phi ^i(T)\), \(i\in \{1, \ldots , M\}\), be i.i.d. copies of \(\Phi (T)\). Then, we have \(\frac{1}{M}\sum _{i=1}^M \Phi ^i(T)BB^\top {\Phi ^i}(T)^\top \approx F(T)\) if M is sufficiently large. This requires sampling the random variable \(\Phi (T)B\), possibly many times. \(\Phi (T)B\) is the terminal value of the stochastic differential equation
with \(x_B(t) \in {\mathbb {R}}^{n\times m}\). System (42) can be seen as a matrix-valued homogeneous version of (1a) (\(u\equiv 0\)) with initial state B. If (1) needs to be evaluated for many different controls u and, additionally, a large number of samples is required for each fixed u, it even pays off to generate many samples of the solution to (42). In particular, this is true if the number of columns of B is low. However, we want to avoid evaluating (42) too often. The number of samples M required for a good estimate of F(T) depends on the variance of \(\Phi (T)BB^\top \Phi ^\top (T)\). Therefore, we want to reduce the variance by finding a better stochastic representation than \({\mathbb {E}} [\Phi (T)BB^\top \Phi ^\top (T)]\). In the spirit of variance reduction techniques, we find the zero-variance unbiased estimator first. To do so, we apply Ito’s product rule (see, e.g., [12]) to obtain
This stochastic differential is now exploited to find
using that \(\text {vec}\left( ({\mathcal {L}}_A + \Pi )\left( x_B(t) x_B^\top (t)\right) \right) = {\mathcal {K}} \text {vec}(x_B(t) x_B^\top (t))\). Hence, we have
Devectorizing this equation yields
where the second argument in F represents the initial condition of (8). The right-hand side of (43) is now an unbiased zero-variance estimator of F(T). However, this estimator depends on F, which is not available. Therefore, given a symmetric matrix \(X_0\), we approximate \(F(t, X_0)\) by a computable matrix function \({\mathcal {F}}(t, X_0)\) that we specify later. This leads to the unbiased estimator
for F(T). The hope is that a few samples of \(E_{\mathcal F}(T)\) can give an accurate approximation of F(T). Of course, \(E_{\mathcal F}(T)\) can only be simulated by further discretizing the above Ito integrals, e.g., by a Riemann–Stieltjes sum approximation. The variance of \(E_{\mathcal F}(T)\) is
setting \(X_i(t)= N_i x_B(t)x_B^\top (t) + x_B(t) x_B^\top (t)N_i^\top \) and exploiting Ito’s isometry, see [12]. Consequently, the benefit of the variance reduction depends on the difference \(F(t, X_0)-{\mathcal {F}}(t, X_0)\).
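For illustration, the plain (non-variance-reduced) Monte Carlo estimator of F(T) from the beginning of this subsection can be sketched as follows, sampling \(x_B\) with an Euler–Maruyama discretization of (42). The function name, step count and the assumption of uncorrelated Wiener increments (i.e., \({\mathbf {K}}=I\)) are illustrative choices; for a general \({\mathbf {K}}\), the increments would be multiplied by a Cholesky factor of \({\mathbf {K}}\).

```python
import numpy as np

def sample_F(A, N_list, B, T, M=100, n_steps=500, rng=None):
    """Plain Monte Carlo estimate of F(T) = E[x_B(T) x_B(T)^T], where x_B solves
    dx_B = A x_B dt + sum_i N_i x_B dw_i,  x_B(0) = B   (cf. system (42)).

    Euler-Maruyama with uncorrelated Wiener increments (assumed K = I).
    """
    rng = np.random.default_rng(rng)
    n, m = B.shape
    h = T / n_steps
    q = len(N_list)
    F_hat = np.zeros((n, n))
    for _ in range(M):
        x = B.copy()
        for _ in range(n_steps):
            dw = rng.normal(0.0, np.sqrt(h), size=q)  # Wiener increments
            x = x + h * (A @ x) + sum(dw[i] * (N_list[i] @ x) for i in range(q))
        F_hat += x @ x.T
    return F_hat / M
```

In the noise-free case \(N_i=0\) the scheme collapses to an explicit Euler approximation of \(\text {e}^{AT}B\), so the estimator can be validated deterministically.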
We conclude this section by discussing suitable approximations \({\mathcal {F}}(t, X_0)\) of \(F(t, X_0)\). For that reason, we establish the following theorem.
Theorem 4.1
Let \(F(t, X_0)\), \(t\in [0, T]\), be the solution to
where the initial data \(X_0\) is a symmetric matrix. Then, there exist constants \(\underline{c}\) and \(\overline{c}\) such that
Proof
Exploiting the product rule, it can be seen that F is implicitly given by
The solution \(t\mapsto F(t)\) is continuous and F(t) is a symmetric matrix for all \(t\in [0, T]\). Consequently, exploiting [5, Corollary VI.1.6], there exist continuous real functions \(\lambda _1, \ldots , \lambda _n\) such that \(\lambda _1(t), \ldots , \lambda _n(t)\) represent the eigenvalues of F(t) for each fixed t. We now define continuous functions by \(\underline{\lambda } := \min \{\lambda _1, \ldots , \lambda _n\}\) and \(\overline{\lambda } := \max \{\lambda _1, \ldots , \lambda _n\}\). Symmetric matrices can be bounded from below and above by their smallest and largest eigenvalues, respectively, leading to \(\underline{\lambda }(t)I\le F(t)\le \overline{\lambda }(t)I\). Therefore, given an arbitrary vector \(v\in {\mathbb {R}}^{n}\), we have
resulting in \(\underline{\lambda }(t)\Pi \left( I\right) \le \Pi \left( F(t)\right) \le \overline{\lambda }(t)\Pi \left( I\right) \), where \(e_i\) is the canonical basis of \({\mathbb {R}}^q\). Since \(\underline{\lambda }, \overline{\lambda }\) are continuous on [0, T], they can be bounded from below and above by some suitable constants. Applying this to (45), we obtain the result by substitution. \(\square \)
Of course, the constants in Theorem 4.1 are generally unknown. However, this result gives us the intuition that \(F(t, X_0)\) can be approximated by
where \(c\in [\underline{c}, \overline{c}]\) is a real number. From the proof of Theorem 4.1, we further know that \(\underline{c}, \overline{c}\ge 0\) if \(X_0\) is positive semidefinite. We cannot generally expect a reduction of the variance for all choices of c. However, a good candidate will reduce the computational complexity. A general strategy for finding such a candidate is an interesting question for future research.
Remark 4.2
Besides generating (a few) samples of \(x_B\) from (42), we require the matrix exponentials \(\text {e}^{A t_i}\) on a grid \(0=t_0< t_1< \dots < t_{n_g}=T\) to determine the estimator (44) with \({\mathcal {F}}\) as in (46). Here, \(n_g\) is the number of grid points when discretizing the Ito integral in (44). If the points \(t_i\) are equidistant with step size h, one first computes \(\text {e}^{A h}\). The other exponentials are then powers of \(\text {e}^{A h}\) such that a certain number of matrix multiplications (depending on \(n_g\)) have to be conducted.
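Remark 4.2 can be sketched as follows: one call to a matrix exponential routine for \(\text {e}^{Ah}\) and then one matrix multiplication per additional grid point (the function name is ours).

```python
import numpy as np
from scipy.linalg import expm

def expm_grid(A, T, n_g):
    """Matrix exponentials e^{A t_i} on the equidistant grid t_i = i*T/n_g.

    As in Remark 4.2: compute e^{A h} once, then build the remaining
    exponentials by repeated multiplication, e^{A t_{i+1}} = e^{A h} e^{A t_i}.
    """
    h = T / n_g
    E_h = expm(A * h)
    grid = [np.eye(A.shape[0])]        # e^{A * 0} = I
    for _ in range(n_g):
        grid.append(E_h @ grid[-1])    # one multiplication per grid point
    return grid                         # grid[i] = e^{A t_i}
```

This trades \(n_g\) expensive matrix exponential evaluations for a single one plus \(n_g\) matrix products, at the cost of some floating-point error accumulation for very fine grids.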
The Gramian \(Q_T\) can be computed from (12) requiring to determine G(T). According to Remark 2.3, we know that \(G(T) = {\mathbb {E}} [x_C(T) x_C^\top (T)]\), where
with \(x_C(t)\in {\mathbb {R}}^{n\times p}\). Exploiting the above consideration regarding F(T), we can see that
is a possible unbiased estimator for G(T). The approximation \({\mathcal {G}}\) of G can be chosen as in (46) replacing \((A, N_i)\mapsto (A^\top , N_i^\top )\).
Gramians based on deterministic approximations of F(T) and G(T)
Based on Theorem 4.1, an estimation of F(T) (and also G(T)) is given in (46). Instead of using these approximations in a variance reduction procedure as in Sect. 4.2, we exploit them directly in (9) and (12). This leads to matrices \({\mathcal {P}}_T\) and \({\mathcal {Q}}_T\) solving
where the left-hand sides are defined by
Certainly, the choice of the constants \(c_F\) and \(c_G\) determines how well \(P_T\) and \(Q_T\) are approximated by \({\mathcal {P}}_T\) and \({\mathcal {Q}}_T\), e.g., in terms of the characterization of the respective dominant subspaces of system (1). Notice that for \(N_i=0\), \({\mathcal {F}}(T, BB^\top )\) and \({\mathcal {G}}(T, C^\top C)\) yield the exact values of \(F(T, BB^\top )\) and \(G(T, C^\top C)\). At this point, it is important to mention that the Gramian approximation of this section is computationally less complex than the one in Sect. 4.2. First of all, we do not need to sample from (42) and, secondly, no Ito integral as in (44) has to be discretized. Calculating \({\mathcal {F}}\) and \({\mathcal {G}}\) might also require computing matrix exponentials on a partition of [0, T], compare with Remark 4.2. However, fewer grid points than for the sampled Gramians of Sect. 4.2 have to be considered, since an ordinary integral can be discretized with a larger step size than an Ito integral. Alternatively, the integrals in (50) and (51) can also be determined without a discretization since it holds that
This approach has the advantage that only the matrix exponential \(\text {e}^{A T}\) at the terminal time is needed.
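A minimal sketch of this discretization-free evaluation, assuming the integrals in (50) and (51) are of the form \(\int _0^T \text {e}^{As} M \text {e}^{A^\top s}\, \text {d}s\): differentiating \(\text {e}^{As} M \text {e}^{A^\top s}\) and integrating over [0, T] turns the integral into the solution of a Lyapunov-type equation that only involves \(\text {e}^{AT}\). Solvability requires that A and \(-A^\top \) share no eigenvalues, which does not require stability of A.

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

def time_limited_integral(A, M, T):
    """Evaluate Z = int_0^T e^{As} M e^{A^T s} ds without time discretization.

    Since d/ds (e^{As} M e^{A^T s}) = A(...) + (...)A^T, integration gives
        A Z + Z A^T = e^{AT} M e^{A^T T} - M,
    a Lyapunov-type equation needing only the single exponential e^{AT}.
    Assumes lambda_i(A) + lambda_j(A) != 0 for all i, j (solvability).
    """
    E = expm(A * T)
    return solve_continuous_lyapunov(A, E @ M @ E.T - M)
```

For \(A = aI\) the integral is available in closed form, \(\frac{\text {e}^{2aT}-1}{2a} M\), which serves as a check of the construction.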
Numerical experiments
In order to indicate the benefit of the model reduction method presented in Sect. 2, we consider a linear controlled SPDE as in (2). In addition, we emphasize the applicability to unstable systems by rescaling and shifting the Laplacian. The concrete example of interest is
where \(\alpha , \beta >0\), \(\gamma \in {\mathbb {R}}\) and w is a one-dimensional Wiener process. \({{\mathcal {X}}}(t,\cdot )\), \(t\in [0, T]\), is interpreted as a process taking values in \(H=L^2([0,\pi ]^2)\). The input operator \({\mathcal {B}}\) in (2) is characterized by \(1_{[\frac{\pi }{4},\frac{3\pi }{4}]^2}(\cdot )\) and the noise operator \({\mathcal {N}}_1= {\mathcal {N}}\) is defined through \( {\mathcal {N}}{\mathcal {X}} =\text {e}^{\cdot \frac{\pi }{2}\cdot }{\mathcal {X}}\) for \( {\mathcal {X}}\in L^2([0,\pi ]^2) \). Since the Dirichlet Laplacian generates a \(C_0\)-semigroup and its eigenfunctions \((h_k)_{k\in {\mathbb {N}}}\) represent a basis of H, the same is true for \(\alpha \Delta +\beta I\). Therefore, we interpret the solution of the above SPDE in the mild sense. For more information on SPDEs and the mild solution concept, we refer to [6]. The quantity of interest is the average temperature on the non-controlled area, i.e.,
In order to solve this SPDE numerically, a spatial discretization can be considered as a first step. Here, we choose a spectral Galerkin method relying on the global basis of eigenfunctions \((h_k)_{k\in {\mathbb {N}}}\). The idea is to construct an approximation \({\mathcal {X}}_n\) of \({\mathcal {X}}\) taking values in the subspace \(H_n =\text {span}\{h_1,\ldots , h_n\}\) which converges to the SPDE solution as \(n\rightarrow \infty \). For more detailed information on this discretization scheme, we refer to [10]. The vector of Fourier coefficients \(x(t)= \left( \langle {\mathcal {X}}_n(t),h_1\rangle _H, \ldots , \langle {\mathcal {X}}_n(t),h_n\rangle _H\right) ^\top \) is the solution of a system like (1) with \(q=1\) and discretized operators

\( {A}=\alpha \text {diag}(\lambda _1,\cdots ,\lambda _n) +\beta I\), \( {B}=\left( \langle {\mathcal {B}},h_k\rangle _H\right) _{k=1 \cdots n}\), \({C}=\left( {\mathcal {C}} h_k\right) _{ k=1\cdots n}\),

\( {N}_1=\left( \langle {\mathcal {N}} h_i ,h_k\rangle _H\right) _{k,i=1 \cdots n} \) and \( x_0=0\),
where \((\lambda _k)_{k\in {\mathbb {N}}}\) are the ordered eigenvalues of \(\Delta \). We refer to [4], where a similar example was studied and more details are provided on how this system with its matrices is derived. Now, a small \(\alpha \) and a larger \(\beta \) yield an unstable A, i.e., \(\sigma (A)\not \subset {\mathbb {C}}_-\), which already violates asymptotic mean square stability of (1), i.e., \({\mathbb {E}}\left\| x(t; x_0, 0)\right\| _2^2\nrightarrow 0\) as \(t\rightarrow \infty \). Moreover, a larger \(\gamma \) (larger noise) causes further instabilities. For that reason, we pick \(\alpha = 0.4\), \(\beta =3\) and \(\gamma = 2\) in order to demonstrate the MOR procedure for a relatively unstable system. Notice that enlarging \(\beta \) or \(\gamma \) (or making \(\alpha \) smaller) leads to a higher degree of instability. This affects the approximation quality of the reduced system given T is fixed. The intuition is that the less stable a system is, the more strongly its dominant subspaces expand in time. This is because some variables in unstable systems grow strongly, such that initially redundant directions become more relevant from a certain point in time. This can also be observed in numerical experiments.
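As an illustration of the discretized operators above, the following sketch assembles A and B for the spectral Galerkin basis \(h_{jk}(x,y)=\frac{2}{\pi }\sin (jx)\sin (ky)\) with Laplacian eigenvalues \(-(j^2+k^2)\); the lexicographic ordering of the basis and the closed-form inner products of the indicator \(1_{[\frac{\pi }{4},\frac{3\pi }{4}]^2}\) are our assumptions about the implementation.

```python
import numpy as np

def galerkin_matrices(n_side, alpha, beta):
    """Spectral Galerkin matrices for alpha*Laplace + beta*I on [0, pi]^2.

    Basis: h_{jk}(x, y) = (2/pi) sin(jx) sin(ky), eigenvalues -(j^2 + k^2),
    ordered lexicographically in (j, k); n = n_side^2 basis functions.
    B holds the inner products of the indicator of [pi/4, 3pi/4]^2 with h_{jk}.
    """
    js, ks = np.meshgrid(np.arange(1, n_side + 1),
                         np.arange(1, n_side + 1), indexing="ij")
    lam = -(js**2 + ks**2).reshape(-1)            # Laplacian eigenvalues
    A = alpha * np.diag(lam.astype(float)) + beta * np.eye(n_side**2)

    def I1(j):  # int_{pi/4}^{3pi/4} sin(j x) dx in closed form
        return (np.cos(j * np.pi / 4) - np.cos(3 * j * np.pi / 4)) / j

    B = np.array([(2 / np.pi) * I1(j) * I1(k)
                  for j in range(1, n_side + 1)
                  for k in range(1, n_side + 1)]).reshape(-1, 1)
    return A, B
```

With \(\alpha =0.4\) and \(\beta =3\) as in the text, the leading diagonal entry \(0.4\cdot (-2)+3=2.2>0\) already exhibits the instability discussed above.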
Below, we fix a normalized control \(u(t)= c_u \text {e}^{0.1 t}\), \(t\in [0, T]\), (the constant \(c_u\) ensures \(\left\| u\right\| _{L^2_T}=1\)) and apply the MOR method based on the balancing transformation \(S=S_2\) described in Sect. 2.2 to the spatially discretized SPDE. In Sect. 5.1, we compare the approximation quality of the ROMs using either the exact Gramians or the inexact Gramians introduced in Sect. 4. Subsequently, Sect. 5.2 shows the reduced model accuracy in a higher state space dimension, where solely inexact Gramians are available. We conclude the numerical experiments by discussing the impact of the terminal time T and the covariance matrix K in Sect. 5.3.
Simulations for \(n=100\) and \(T=1\)
We compare the associated ROM (17) with the original system in dimension \(n=100\) first, since this choice allows us to determine F(T), G(T) and hence the Gramians \(P_T, Q_T\) exactly according to Sect. 4.1. As a consequence, we can compare the MOR scheme involving the exact Gramians with the same type of scheme relying on the approximated Gramians computed via the approaches of Sects. 4.2 and 4.3. In particular, we first approximate F(T) and G(T) based on a Monte Carlo simulation using 10 realizations of the estimators (44) and (47), respectively. The functions \({\mathcal {F}}\) and \({\mathcal {G}}\) entering these estimators are chosen as in (46) with \(c=0\). We refer to the resulting matrices as Sect. 4.2 Gramians. At this point, we want to emphasize that these sampling-based Gramians do not necessarily have to be accurate approximations of the exact Gramians in a componentwise sense. It is more important that the dominant subspaces of the system (eigenspaces of the Gramians) are captured in the approximation. Notice that the dominant subspace characterization is not improved if the number of samples is enlarged to 1000. Second, we determine the approximations \({\mathcal {P}}_T\) and \(\mathcal {Q}_T\) according to Sect. 4.3 and call them Sect. 4.3 Gramians. The associated constants are chosen to be \(c_F=c_G=0\).
In Fig. 1, the HSVs \(\sigma _{T, i}\), \(i\in \{1, \ldots , 50\}\), of system (1) are displayed. By Theorem 3.1 and the explanations below this theorem, it is known that small truncated \(\sigma _{T, i}\) go along with a small reduction error of the MOR scheme. Due to the rapid decay of these values, we can therefore conclude that a small error can already be achieved for small reduced dimensions r. For instance, we observe that \(\sigma _{T, i}<3.5\cdot 10^{-6}\) for \(i\ge 8\), indicating a very high accuracy of the ROM for \(r\ge 7\). This is confirmed by the error plot in Fig. 2 and the second column of Table 1. Moreover, Fig. 2 shows the tightness of the error bound in (29) that was specified in Theorem 3.1. The bound differs from the exact error only by a factor between 2.5 and 4.6 for the reduced dimensions considered in Fig. 2 and is hence a good indicator for the expected performance. Notice that the error is only exact up to deviations occurring due to the semi-implicit Euler–Maruyama discretization of (1) and (17) as well as the Monte Carlo approximation of the expected value using \(10\,000\) paths. Besides the MOR error based on \(P_T\) and \(Q_T\), Table 1 states the errors in case the approximating Gramians of Sects. 4.2 and 4.3 are used. It can be seen that both approximations perform roughly the same and that one loses an order of accuracy compared to the exact Gramian approach. However, one can lower the reduction error by an optimization with respect to the constants \(c, c_F, c_G\). Moreover, we see that the accuracy is very good for the estimators of the covariances F(T) and G(T) used here.
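For reference, a generic square-root balancing step from a Gramian pair can be sketched as follows; the HSVs \(\sigma _{T,i}\) arise as the singular values of \(L^\top R\) for factorizations \(P_T=RR^\top \), \(Q_T=LL^\top \). This is the standard construction and may differ in details from the transformation \(S_2\) of Sect. 2.2; the regularization constant is an implementation choice to keep the Cholesky factorizations well defined.

```python
import numpy as np

def balance_and_truncate(P, Q, r):
    """Square-root balancing from a Gramian pair (P_T, Q_T).

    Computes the HSVs sigma_{T,i} as singular values of L^T R, where
    P_T = R R^T and Q_T = L L^T, and returns order-r projection matrices
    V, W with W^T V = I_r (oblique Petrov-Galerkin projection).
    """
    n = P.shape[0]
    R = np.linalg.cholesky(P + 1e-12 * np.eye(n))  # regularized factor of P
    L = np.linalg.cholesky(Q + 1e-12 * np.eye(n))  # regularized factor of Q
    U, s, Vt = np.linalg.svd(L.T @ R)              # s = Hankel singular values
    S_half = np.diag(s[:r] ** -0.5)
    W = L @ U[:, :r] @ S_half                      # left projection matrix
    V = R @ Vt[:r].T @ S_half                      # right projection matrix
    return s, V, W

# Reduced matrices would then read, e.g.,
# A_r = W.T @ A @ V, B_r = W.T @ B, C_r = C @ V, N_r = W.T @ N @ V.
```

Truncating where the returned s drops below a tolerance mirrors the a priori choice of r via the HSV decay discussed above.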
Simulations for \(n=1000\) and \(T=1\)
We repeat the simulations of Sect. 5.1 for \(n=1000\). This is a scenario where the exact Gramians are no longer available. Therefore, we conduct the balancing MOR scheme using the Sects. 4.2 and 4.3 Gramians only. In the context of the Sect. 4.2 Gramians, it is important to mention that in higher dimensions very efficient discretizations of the Ito integrals in (44) and (47) are required. Otherwise, a very small step size is needed, such that from the computational point of view it is better to omit these Ito integrals within the estimators, i.e., only \(x_B\) and \(x_C\) are sampled to approximate F(T) and G(T). Table 2 shows that the balancing-related MOR technique based on the approximated Gramians of Sects. 4.2 and 4.3 is beneficial in high dimensions. A very small reduction error can be observed and, in the majority of cases, the sampling-based approach seems slightly more accurate than the approach of Sect. 4.3, given the same type of approximations of F(T) and G(T) for each ansatz.
Relevance of T and K
As in Sect. 5.1, let us fix \(n=100\) to be able to compute the Gramians exactly. We begin with deriving reduced systems on different intervals [0, T]. Second, we extend our model to a stochastic differential equation with noise dimension \(q=2\) and investigate the effect of different correlations between the two Wiener processes.
Relevance of the terminal time Let us study the scenario of Sect. 5.1 with \(T= 0.5, 1, 2, 3\) using the exact Gramians to illustrate that dominant subspaces change in time. Indeed, we observe in Table 3 that for a fixed reduced dimension r the error gets bigger the larger the interval [0, T] is. This means that with increasing T the reduced dimension has to be enlarged to ensure a certain desired approximation error. This is also intuitive in the sense that it is generally harder to find a good approximation on a larger interval than on a smaller one.
Relevance of the covariance structure Let us extend the SPDE discretization by introducing \(N_2:= N_1^{\frac{6}{5}}\), so that we have a system of the form (1) with \(q=2\) and standard Wiener processes \(w_1\) and \(w_2\). The goal is to investigate how the correlation between \(w_1\) and \(w_2\) influences the MOR error. For that reason, we choose the following three scenarios: \({\mathbb {E}}[w_1(t) w_2(t)] = \rho t\) with \(\rho = 0, 0.5, 1\). Table 4 states the MOR errors for these correlations. In this example, we can observe that a higher correlation between the processes yields a larger error. A different observation was made in the numerical examples studied in [14], where systems with high correlations in the noise processes gave a smaller reduction error. However, [14] studies different types of stochastic differential equations in the context of asset price models, which do not have control inputs.
Notes
The filtration \(({\mathcal {F}}_t)_{t\in [0, T]}\) is assumed to be right-continuous and complete.
References
Bartels RH, Stewart GW (1972) Solution of the matrix equation \(AX + XB = C\). Commun ACM 15(9):820–826
Barth A (2010) A finite element method for martingale-driven stochastic partial differential equations. Commun Stoch Anal 4(3):355–375
Becker S, Hartmann C (2019) Infinite-dimensional bilinear and stochastic balanced truncation with error bounds. Math Control Signals Syst 31:1–37
Benner P, Redmann M (2015) Model reduction for stochastic systems. Stoch PDE Anal Comp 3(3):291–338
Bhatia R (1997) Matrix analysis, vol 169. Springer, Berlin
Da Prato G, Zabczyk J (1992) Stochastic equations in infinite dimensions. Encyclopedia of mathematics and its applications, vol 44. Cambridge University Press, Cambridge
Damm T (2004) Rational matrix equations in stochastic control. Lecture notes in control and information sciences, vol 297. Springer, Berlin
Damm T (2008) Direct methods and ADI-preconditioned Krylov subspace methods for generalized Lyapunov equations. Numer Linear Algebra Appl 15(9):853–871
Gawronski W, Juang J (1990) Model reduction in limited time and frequency intervals. Int J Syst Sci 21(2):349–376
Hausenblas E (2003) Approximation for semilinear stochastic evolution equations. Potential Anal 18(2):141–186
Kürschner P (2018) Balanced truncation model order reduction in limited time intervals for large systems. Adv Comput Math 44(6):1821–1844
Øksendal B (2013) Stochastic differential equations (6th edition): an introduction with applications. Springer, Berlin
Redmann M (2018) Type II singular perturbation approximation for linear systems with Lévy noise. SIAM J Control Optim 56(3):2120–2158
Redmann M, Bayer C, Goyal P (2021) Low-dimensional approximations of high-dimensional asset price models. SIAM J Financ Math 12(1):1–28
Redmann M, Kürschner P (2018) An output error bound for time-limited balanced truncation. Syst Control Lett 121:1–6
Redmann M, Pontes Duff I (2022) Full state approximation by Galerkin projection reduced order models for stochastic and bilinear systems. Appl Math Comput 420:126561
Simoncini V (2016) Computational methods for linear matrix equations. SIAM Rev 58(3):377–441
Funding
Open Access funding enabled and organized by Projekt DEAL.
Cite this article
Redmann, M., Jamshidi, N. Gramian-based model reduction for unstable stochastic systems. Math. Control Signals Syst. 34, 855–881 (2022). https://doi.org/10.1007/s00498-022-00328-z
Keywords
 Model order reduction
 Linear stochastic systems
 Unstable systems
 Stochastic processes
 Error analysis