Abstract
In this paper, the problem of inverse quadratic optimal control over a finite time horizon for discrete-time linear systems is considered. Our goal is to recover the corresponding quadratic objective function using noisy observations. First, the identifiability of the model structure for the inverse optimal control problem is analyzed under a relative degree assumption, and we show that the model structure is strictly globally identifiable. Next, we study the inverse optimal control problem in which the initial state distribution and the observation noise distribution are unknown, yet exact observations of the initial states are available. We formulate the problem as a risk minimization problem and approximate the problem using the empirical average. It is further shown that the solution to the approximated problem is statistically consistent under the assumption of relative degrees. We then study the case where exact observations of the initial states are not available, yet the observation noises are known to be white and Gaussian and the distribution of the initial state is also Gaussian (with unknown mean and covariance). The EM algorithm is used to estimate the parameters in the objective function. The effectiveness of our results is demonstrated by numerical examples.
Introduction
Since it was first proposed by [1], inverse optimal control (IOC) has found numerous applications [2,3,4]. For example, IOC has been widely developed as a powerful tool to help us understand the optimality criteria underlying biological motions, which can then be used for control synthesis of bionic robots. The goal of the “forward” optimal control problem is to find the optimal control input and the corresponding state trajectory given the cost function, the system dynamics and the initial value of the states. In contrast, for a given system dynamics, the IOC problem focuses on recovering the undetermined parameters in the cost function that correspond to the observed system states or output measurements (possibly noisy).
In this paper, the problem of IOC for discrete-time linear quadratic regulators (LQRs) over a finite time horizon is considered. As an extension of our earlier work [5], the identifiability of the corresponding model structure is also fully studied. Specifically, our goal is to reconstruct the unknown parameters in the quadratic objective function using the given discrete-time linear system dynamics and noisy measurements of its output.
The inverse linear quadratic optimal control problem has been widely studied during the past decades, particularly in the continuous-time infinite time-horizon case, e.g., [6,7,8]. In most works, it is assumed that the optimal feedback gain K, which is a constant matrix in the infinite time-horizon case, is measured exactly. A classical method [9] is to formulate the problem as an optimization problem with linear matrix inequality (LMI) constraints and solve for the optimal Q and R when the feedback gain K is known. Nevertheless, in many scenarios, the feedback gain K cannot be known exactly and we can only measure the optimal state sequence or the corresponding system output. In [10], given noisy state observations, the discrete-time infinite time-horizon case is studied, in which the optimal feedback gain K is time-invariant. Their approach is to first identify the feedback matrix K from the noisy observations by a maximum-likelihood estimator; the matrices Q and R can then be solved for in a similar way to the method proposed in [9].
However, note that in the finite time-horizon case the feedback gain \(K_t\) is time-varying. Since identifying a time-varying system is in general not an easy task, such a “two-step” approach, i.e., “first identify the feedback gain using classical system identification methods, then search for the corresponding parameters in the quadratic objective function”, cannot be directly applied. Furthermore, one critical property ignored in the first identification step is that the \(K_t\)’s are actually generated by an LQR. Hence, it is difficult to guarantee the performance of such a method in the finite time-horizon case, and the optimality behind the observations could have been utilized to further improve the performance.
Since linear quadratic regulators can be seen as a special case of nonlinear optimal control problems, more generally speaking, the problem of IOC for quadratic regulators also lies within the scope of IOC problems for nonlinear systems. Usually, the considered objective function is composed of a weighted sum of basis functions, to which various parameter identification methods or learning techniques are applied. For example, in [11, 12], the optimality conditions for the forward optimal control problem are analyzed and the undetermined parameters in the cost function are estimated by minimizing the violation of these conditions. Nevertheless, [13] showed that such methods are not statistically consistent and are sensitive to observation noise. Instead, a statistically consistent formulation is proposed in [13], though the resulting optimization problem is difficult to solve. The discrete finite-time case is also considered in [14, 15] based on Pontryagin’s Maximum Principle (PMP) for the optimal control problem. An optimization problem is then formulated whose constraints are two of the three conditions of PMP, and the residual of the third PMP condition is minimized. It is assumed therein that the optimal control input signal is known exactly, while in our case we only assume that the initial values and noisy system outputs are known. Moreover, though the authors claim that the problem of identifiability is considered, the “identifiability” considered therein does not coincide with the traditional definition of identifiability. In addition, the statistical consistency of the estimation is not established therein, hence the performance under noisy observations cannot be guaranteed. In [16], the authors consider discrete-time IOC for nonlinear systems when some segments of the trajectory and input observations are missing.
However, their analysis of identifiability likewise does not accord with the traditional definition of identifiability, and statistical consistency cannot be guaranteed.
The aforementioned problems of estimating the parameters of a nonlinear optimal control problem do indeed include inverse LQ optimal control as a special case. However, taking advantage of the special structure of the LQR, we can further prove the identifiability of the model structure in Sect. 3 as well as the statistical consistency of the estimation in Sect. 4. Furthermore, under Gaussian assumptions, we use the EM algorithm, exploiting the problem’s special structure, to obtain a maximum-likelihood estimate of the objective function, as shown in Sect. 5. The performance of the algorithms is shown by numerical examples in Sect. 6.
Notations We denote the cone of n-dimensional positive semidefinite matrices by \({\mathbb {S}}^n_+\), while \({\mathbb {S}}^n_{++}\) is the set of positive definite matrices. \(\Vert \cdot \Vert\) denotes the \(l_2\) norm of a vector. We denote the Frobenius norm of a matrix by \(\Vert \cdot \Vert _F\). \(\mathrm{tr}(\cdot )\) denotes the trace of a matrix, and \(\mathrm{tr}(H_1^\mathrm{T}H_1)=\Vert H_1\Vert _F^2\). \(\otimes\) denotes the Kronecker product and \([H]_i\) denotes the ith row of matrix H.
Problem formulation
The “forward” optimal LQ problem reads
where \(S,Q\in {\mathbb {S}}^n_+\), \(R\in {\mathbb {S}}^m_{++}\), \(x_t\in {\mathbb {R}}^n\) and \(u_t\in {\mathbb {R}}^m\).
Suppose the noisy output \(y_t=Cx_t+v_t\) is available, where \(v_t\in {\mathbb {R}}^l\) denotes the noise term, \(y_t\in {\mathbb {R}}^l\) and the matrices (A, B, C) are known. For the sake of simplicity, we only consider the case in which \(R=I\) and \(S=0\) in this paper.
The IOC problem considered in this paper aims to find Q in the objective function using:

1.
exact samples of the initial value \(x_1={\bar{x}}\) and the noisy outputs \(y_{2:N}\), or

2.
the noisy output \(y_{1:N}\) under Gaussian assumption.
We further assume that (A, B) is controllable and B has full column rank. Moreover, a discrete-time system is usually sampled from a continuous linear system \({\dot{x}}={\hat{A}}x+{\hat{B}}u\), which means that \(A=\mathrm{e}^{{\hat{A}}{\Delta } t}\) and \(B=\int \nolimits _0^{{\Delta } t}\mathrm{e}^{{\hat{A}}\tau }{\hat{B}}\mathrm{d}\tau\); since the matrix exponential is invertible, it is reasonable to further suppose that A is invertible.
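For concreteness, the sampling argument and the forward problem can be sketched in code. The following is our own illustrative Python sketch (the continuous-time pair \({\hat{A}}\), \({\hat{B}}\), the horizon and all numerical values are assumptions, not taken from the paper): it discretizes via the matrix exponential and solves the forward LQR with \(R=I\), \(S=0\) by the standard backward Riccati recursion.

```python
import numpy as np

def expm_taylor(M, terms=40):
    # Truncated Taylor series for the matrix exponential (adequate for the
    # small-norm matrices used here; not a production-grade expm).
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

# Hypothetical continuous-time system (our illustrative values).
A_hat = np.array([[0.0, 1.0], [-1.0, -0.5]])
B_hat = np.array([[0.0], [1.0]])
dt, n, m = 0.1, 2, 1

# Van Loan trick: one augmented exponential yields A = e^{A_hat dt} and
# B = (int_0^dt e^{A_hat tau} d tau) B_hat simultaneously.
M_aug = np.zeros((n + m, n + m))
M_aug[:n, :n], M_aug[:n, n:] = A_hat, B_hat
E = expm_taylor(M_aug * dt)
A, B = E[:n, :n], E[:n, n:]

def lqr_gains(A, B, Q, N):
    """Backward DRE with R = I, S = 0:
    P_N = 0, K_t = -(I + B^T P_{t+1} B)^{-1} B^T P_{t+1} A,
    P_t = Q + A^T P_{t+1} (A + B K_t). Returns [K_1, ..., K_{N-1}]."""
    P, Ks = np.zeros_like(A), []
    for _ in range(N - 1):
        K = -np.linalg.solve(np.eye(B.shape[1]) + B.T @ P @ B, B.T @ P @ A)
        Ks.append(K)
        P = Q + A.T @ P @ (A + B @ K)
    return Ks[::-1]

Ks = lqr_gains(A, B, np.eye(n), N=40)
```

Since \(A=\mathrm{e}^{{\hat{A}}{\Delta } t}\) has determinant \(\mathrm{e}^{\mathrm{tr}({\hat{A}}){\Delta } t}\ne 0\), the invertibility of A claimed above can be confirmed numerically on this instance.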
Model identifiability
Before moving on, we would like to investigate a fundamental question: given the observation data, can we uniquely reconstruct the parameter Q in the objective function? This question actually involves two aspects: identifiability of the model structure and persistent excitation. Identifiability is a property of the model structure that determines whether the model structure can generate the same output under two different parameters, while persistent excitation of the input signal ensures that we can actually recover the undetermined parameters of an identifiable model from the measured output. Hence, before further analysis, we need to define the model structure for the IOC problem.
A model structure is nothing but a parameterized candidate model that describes the input–output relationship [17]. In the IOC problem just formulated, we regard the initial value \({\bar{x}}\) as the input signal and the system output \(y_t\) as the output signal. Therefore, the model structure \({\mathcal {M}}_{2:N-1}(Q)\) considered in this paper is defined as follows:
Definition 1
(Model structure) For \({\mathcal {M}}_t:{\mathbb {R}}^n\mapsto {\mathbb {R}}^l,t=2,\ldots ,N-1\), the model structure \({\mathcal {M}}_{2:N-1}(\cdot )\) takes the form:
where \(A_{cl}(k;Q)\) is the closedloop system matrix generated by the “forward problem” (1) and (2).
We further make the following assumption for the system:
Assumption 1
The discretetime linear system defined by (A, B, C) is a square system, i.e., \(l=m\) and has relative degree \((r_1,\ldots ,r_m)\). Namely,
where \(c_i\) denotes the ith row of C.
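Assumption 1 can be checked numerically from the Markov parameters \(c_iA^kB\). A minimal sketch (our own helper, using the common discrete-time convention that output i has relative degree \(r_i\) if \(c_iA^kB=0\) for \(k<r_i-1\) and \(c_iA^{r_i-1}B\ne 0\)):

```python
import numpy as np

def relative_degrees(A, B, C, tol=1e-10):
    """Per-output relative degrees: the smallest r with c_i A^{r-1} B != 0.
    Returns None for an output whose first n Markov parameters all vanish,
    i.e., no relative degree found within n steps (a convention we assume)."""
    n = A.shape[0]
    degrees = []
    for c in C:
        r, Ak = None, np.eye(n)
        for k in range(n):
            if np.max(np.abs(c @ Ak @ B)) > tol:
                r = k + 1
                break
            Ak = Ak @ A
        degrees.append(r)
    return degrees

# Discrete-time double integrator (illustrative): c_1 B = 0 but
# c_1 A B != 0, so the single output has relative degree 2.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
```

This is the kind of screening used later in the numerical section, where randomly generated systems are kept only if the relative degree condition holds.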
We first look into more details of the optimal control sequences.
Theorem 1
Suppose that \(N\geqslant n+1\) and (A, B) is controllable. If \(Q\ne Q^\prime , Q,Q^\prime \in {\mathbb {S}}^n_+\), then \(K_{t}(Q)= K_{t}(Q^\prime )\) cannot happen consecutively for more than \(n-1\) steps.
Proof
Recall that the optimal solution to (2) is derived via the discrete-time Riccati equation (DRE)
whose solution sequence is denoted by DRE(Q, S). We then prove the theorem by contradiction. Let \(P_t=DRE(Q,0)\), \(P^\prime _t=DRE(Q^\prime ,0)\), \({\Delta } Q=Q-Q^\prime\) and \({\Delta } P_t=P_t-P^\prime _t\) for \(t=2:N\). Assume there exists \(t_1\in [1,N-1]\) such that \(K_{t}(Q)= K_{t}(Q^\prime )\) holds for \(t\in [t_1-n+1,t_1] \subset [1,N-1]\). Then it holds that
where \(t=t_1-n+1:t_1\).
Using (4) and the fact that A is invertible, \(B^\mathrm{T} {\Delta } P_{t+1}=0\) can be computed recursively for \(t=t_1-n+1:t_1\) and stacked into the following compact form:
Since (A, B) is controllable, \({\varGamma }\) has full column rank. Hence, \({\Delta } P_{t_1+1}\) is the unique solution of the linear equation \({\varGamma } {\mathscr {P}} = {\varOmega }\).
On the other hand, consider the following algebraic equation:
We can apply the same techniques to (5) and obtain the same equation \({\varGamma } {\mathscr {P}} = {\varOmega }\), of which \({\Delta } P\) is a solution. Since \({\varGamma } {\mathscr {P}} = {\varOmega }\) has a unique solution, it follows that \({\Delta } P_{t_1+1}={\Delta } P\). Moreover, recall that it holds for any \(t_1-n+1\leqslant t\leqslant t_1\) that \(A^\mathrm{T}{\Delta } P_{t+1}A+{\Delta } Q = {\Delta } P_t\) and \(A^\mathrm{T}{\Delta } PA+{\Delta } Q={\Delta } P\); thus we have \({\Delta } P_t = {\Delta } P,\forall t = t_1-n+2:t_1+1\).
Using (5) and \(P^\prime _t=DRE(Q^\prime ,0)\), we can then show that \({\bar{P}}_t\triangleq P_t^\prime +{\Delta } P\) is in fact the solution to \(DRE(Q,{\Delta } P)\) for all \(t\in [1,N-1]\), i.e.,
where \({\bar{P}}_N=P_N^\prime +{\Delta } P={\Delta } P\).
Recall that on the time interval \(t =t_1-n+2:t_1+1\), \(P_t\) and \({\bar{P}}_t\) satisfy the same DRE with weighting matrix Q and boundary \({P}_{t_1+1}={\bar{P}}_{t_1+1}=P_{t_1+1}^\prime +{\Delta } P\). Then, based on the backward recursion of the DRE, it is clear that \(P_t={\bar{P}}_t\) for all \(1 \leqslant t \leqslant t_1+1\). This means that \(P_t\) and \({\bar{P}}_t\) are generated by the same forward Riccati equation with initial condition \(P_1\). We further show uniqueness of the solution to the forward DRE for \(t\in [1,N-1]\) by considering its adjoint system and using the fact that A is nonsingular
whose solution can be uniquely determined for all \(t\geqslant 1\). Recall that \(P_t=DRE(Q,0)\) and let \(\lbrace X_t \rbrace\) be the regular matrix solution of its closed-loop system \(X_{t+1}=(A+BK_t)X_t\) with \(X_1=I\). Since \(A+BK_t\) is always nonsingular [18], the nonsingular \(X_t\) together with \(Y_t=P_t X_t\) composes the unique solution to (6). Therefore, for the given \(P_1\), there exists a unique solution \(Y_tX_t^{-1}\) that satisfies the forward DRE for \(t=1:N\), which contradicts the fact that \({\bar{P}}_N={\Delta } P \ne P_N\). Hence the claim follows. \(\square\)
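Theorem 1 can be illustrated numerically on a toy instance. The sketch below (system matrices and weighting matrices are our own illustrative choices) computes the gain sequences for two different Q and checks that they never coincide on n consecutive steps; note that the terminal gains \(K_{N-1}\) do coincide, since \(P_N=S=0\) makes both of them zero.

```python
import numpy as np

def dre_gains(A, B, Q, N):
    # Backward DRE with R = I, S = 0; returns [K_1, ..., K_{N-1}].
    P, Ks = np.zeros_like(A), []
    for _ in range(N - 1):
        K = -np.linalg.solve(np.eye(B.shape[1]) + B.T @ P @ B, B.T @ P @ A)
        Ks.append(K)
        P = Q + A.T @ P @ (A + B @ K)
    return Ks[::-1]

# Illustrative controllable pair with invertible A (our choice).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
n, N = 2, 40
K_a = dre_gains(A, B, np.diag([1.0, 2.0]), N)
K_b = dre_gains(A, B, np.diag([2.0, 1.0]), N)
```

Scanning every window of n consecutive time steps of `K_a` and `K_b` exhibits at least one differing gain per window, consistent with the theorem's bound of at most \(n-1\) consecutive coincidences.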
Now, we are ready to present the theorem on the identifiability of the model structure \({\mathcal {M}}_{2:N-1}(Q)\).
Theorem 2
Under Assumption 1, suppose that (A, B) is controllable, \(N\geqslant 2n\), A is invertible, and B has full column rank; then the model structure (3) is strictly globally identifiable.
Proof
We prove the statement by contradiction. Suppose two different weighting matrices \(Q_1 \ne Q_2\) could generate the same model structure, namely, \({\mathcal {M}}_{2:N-1}(Q_1)={\mathcal {M}}_{2:N-1}(Q_2)\). Let \({\mathcal {M}}_{t,i}(Q)\) denote the model structure that corresponds to the ith system output at time instant t. By Theorem 1, we may assume \(t^\prime \leqslant n-1\) is the first time instant such that \(K_{t^\prime }(Q_1)\ne K_{t^\prime }(Q_2)\). Therefore, we have
Hence, if it holds for the model structures that \({\mathcal {M}}_t(Q_1)= {\mathcal {M}}_t(Q_2)\) at all \(t=1:N-1\), then it must hold that
Using the fact that \(A_{cl}(t;Q)\) is invertible for all \(t=1:N-1\) [5], we can cancel the product of closed-loop system matrices in the above equation and get
Since \({\mathscr {L}}\) is invertible by definition, it holds that \(K_{t^\prime }(Q_1)=K_{t^\prime }(Q_2)\) and we have a contradiction. Hence the model structure is globally identifiable at \(Q_1\). Since \(Q_1\) is arbitrarily chosen, this means that the model structure is strictly globally identifiable. \(\square\)
Remark 1
From the above theorem, it is easy to see that it is actually sufficient for the time-horizon length to satisfy \(N\geqslant n+\max _i(r_i)\) for the model structure to be strictly globally identifiable. However, we state the theorem as it is to make it easier to comprehend. On the other hand, for the case \(l>m\) (the output dimension is greater than the input dimension), we just need to assume that there is a subset of outputs that makes the system have relative degree \((r_1,\ldots ,r_m)\).
Here, we would like to remark that observability of the system is not assumed in our problem formulation. This is due to the fact that lack of observability can happen in various situations, such as sensor unavailability, data shared under privacy constraints, uncertain environments, or tasks with a limited number of descriptive features, and it is not necessary to impose reconstructability of the state from noisy observations. On the other hand, for the identifiability of the IOC problem, it is required that the influence of different cost functions be reflected sufficiently in the observation sequences. More specifically, from the proof of Theorem 2, we can see that the existence of a relative degree plays a key role in guaranteeing the distinguishability of the system output under different parameter specifications, namely, \({\bar{y}}_t({\bar{Q}},{\bar{x}})={\bar{y}}_t(Q,{\bar{x}})\) for \(t=2,\ldots ,N\) if and only if \(Q={{\bar{Q}}}\). This means that under such an assumption, the LQR problems generate exactly the same output sequences in the noiseless case only when the same weighting matrix Q is used. However, such a property cannot always be guaranteed for systems without a relative degree, as illustrated by the following example.
Example 1
Consider the following controllable square system
which, however, does not have a relative degree.
For the quadratic cost function, we take \(N=5\). We can show that the following weighting matrices generate the same output sequences \({\bar{y}}_t\):
First, we check the first output. Since \(c_1B=0\), \(c_1AB \ne 0\), and \({\bar{y}}_{t+1,i}(Q,{\bar{x}})=c_i\Pi _{j=1}^t(A+BK_j){\bar{x}}\), straightforward computation gives
By computing \(K_t\) from the DRE, one can easily show that \({\bar{y}}_{t,1}({\bar{Q}},{\bar{x}})={\bar{y}}_{t,1}(Q,{\bar{x}})\) for any \({{\bar{x}}}\) and \(t=1,\ldots ,N\), since all the coefficient matrices of \({{\bar{x}}}\) on the right-hand side of the above equations are equal for \(K_t(Q)\) and \(K_t({{\bar{Q}}})\). The same claim also holds for \({\bar{y}}_{t,2}\) and the statement follows. \(\square\)
IOC using exact initial values and noisy output
We first consider the IOC problem in which exact samples of the initial value \(x_1={\bar{x}}\) and the noisy outputs \(y_{2:N}\) are available. Suppose that \({\bar{x}}\in {\mathbb {R}}^n\) and \(\{v_t\in {\mathbb {R}}^l\}_{t=2}^N\) are independent random vectors with unknown distributions. We further make the following assumptions:
Assumption 2
It holds that \(N\geqslant 2n\), \(l\geqslant m\), and there exists a subset of outputs that makes the system have relative degree \((r_1,\ldots ,r_m)\).
Assumption 3
(Persistent excitation) \(\forall \eta \in {\mathbb {R}}^n\), \(\exists r(\eta )\in {\mathbb {R}}\backslash \{0\}\) such that \({\mathbb {P}}\left( {\bar{x}}\in {\mathscr {B}}_\varepsilon (r\eta )\right)>0,\forall \varepsilon >0\), where \({\mathscr {B}}_\varepsilon (r\eta )\) is the open \(\varepsilon\)ball centered at \(r\eta\).
Assumption 4
The “real” \({\bar{Q}}\) belongs to a compact set \(\bar{{\mathbb {S}}}^n_+(\varphi )=\{Q\in {\mathbb {S}}^n_+ \mid \Vert Q\Vert ^2_F\leqslant \varphi \}\).
Assumption 5
\({\mathbb {E}}(\Vert {\bar{x}}\Vert ^2)<+\infty\), \({\mathbb {E}}(v_t)=0\) and \({\mathbb {E}}(\Vert v_t\Vert ^2)<+\infty\).
With the assumptions above, the forward LQR problem (1) and (2) can actually be expressed as
where \({\varOmega }\) is the sample space. Hence the optimal trajectory \(\{x_t^*\}\) as well as the observed system output \(\{y_t^* = Cx_t^*+v_t\}\) are random vectors implicitly determined by the random variable \({\bar{x}}\) and the parameter Q. Based on (7) we can define a risk function as
where \(\xi =[{\bar{x}}^\mathrm{T},Y^\mathrm{T}]^\mathrm{T}\), \(Y=[y_2^\mathrm{T},\ldots ,y_N^\mathrm{T}]^\mathrm{T}\) and \(f:\bar{{\mathbb {S}}}^n_+(\varphi )\times {\mathbb {R}}^{n+(N-1)l}\mapsto {\mathbb {R}}\),
and \(x_{2:N}^*(Q;\,{\bar{x}})\) is the optimal trajectory to (7).
We minimize the risk function to obtain an estimate of the Q matrix, namely,
The main question now is whether the risk minimization problem has a unique solution. For a given closed-loop system matrix \(A+BK_t\) of a finite-horizon LQR problem, the uniqueness of the corresponding quadratic cost function has been studied for discrete-time systems [18] and continuous-time systems [19], respectively. However, additional difficulties arise when we can only measure the system output. The following lemma shows the uniqueness of the solution to (10) under the assumption of relative degrees.
Lemma 1
Under Assumptions 2–5, the risk minimization problem (10) has a unique optimizer \({\bar{Q}}\), where \({\bar{Q}}\) is the true value used in the forward problem (7).
Proof
Since \(y_t=Cx^*_t({\bar{Q}},{\bar{x}})+v_t\) and \({\mathbb {E}}(v_t)=0,t=2:N\), (8) can be simplified as \({\mathscr {R}}(Q)=L(Q)+\textstyle \sum \nolimits _{t=2}^N{\mathbb {E}}(\Vert v_t\Vert ^2)\), where \(L(Q)={\mathbb {E}}( \textstyle \sum \nolimits _{t=2}^N\Vert {\bar{y}}_t({\bar{Q}},{\bar{x}})-{\bar{y}}_t(Q,{\bar{x}})\Vert ^2)\), \({\bar{y}}_t({\bar{Q}},{\bar{x}})=Cx_t^*({\bar{Q}},{\bar{x}})\) and \({\bar{y}}_t(Q,{\bar{x}})=Cx_t^*(Q,{\bar{x}})\). It is clear that \(Q={\bar{Q}}\) minimizes the risk function \({\mathscr {R}}(Q)\). What remains is to show uniqueness.
Recall that by our assumption, there exists a subset of outputs that makes the system have relative degree \((r_1,\ldots ,r_m)\). On the other hand, it holds that \({\bar{y}}_t(Q,{\bar{x}})={\mathcal {M}}_t(Q){\bar{x}}\). Note that
By Theorem 2, we know that given \(N\geqslant 2n\), if \(Q\ne {\bar{Q}}\), then there exist some time instants \(t_i\) such that \({\mathcal {M}}_{t_i}(Q)\ne {\mathcal {M}}_{t_i}({\bar{Q}})\). On the other hand, by Assumption 3, \(\exists r(\eta )\ne 0\) such that \({\mathbb {P}}({\bar{x}}\in {\mathscr {B}}_\varepsilon (r\eta ))>0,\forall \varepsilon >0\). Since \(r\ne 0\), it holds that \({\mathcal {M}}_{t_i}({\bar{Q}})(r\eta )\ne {\mathcal {M}}_{t_i}(Q)(r\eta )\). Furthermore, since \({\mathcal {M}}_t(Q)\eta\) is continuous with respect to \(\eta\) for all \(Q\in {\mathbb {S}}^n_+\), this implies that \(\exists \varepsilon _1\) such that \({\mathcal {M}}_{t_i}(Q){\bar{x}}\ne {\mathcal {M}}_{t_i}({\bar{Q}}){\bar{x}},\forall {\bar{x}}\in {\mathscr {B}}_{\varepsilon _1}(r\eta )\) and \({\mathbb {P}}\left( {\bar{x}}\in {\mathscr {B}}_{\varepsilon _1}(r\eta )\right) >0\). Thus,
Hence, \({\bar{Q}}\) is the unique minimizer to (8) and the statement follows. \(\square\)
The risk minimization problem is expected to be solved numerically. To tackle the issue that the distributions of \({\bar{x}}\) and \(v_{2:N}\) are unknown, we approximate (8) using the empirical mean
where the \(\xi ^{(i)}\) are i.i.d. samples of \(\xi\).
We first consider the optimal control problem (7). By Pontryagin’s Maximum Principle (PMP), if \(u_{1:N-1}^*\) and \(x_{1:N}^*\) are the optimal control and corresponding trajectory, then there exist adjoint variables \(\lambda _{2:N}^*\) such that
Since what we consider is a forward LQR problem, the PMP conditions are also sufficient. Hence, we can express \(x_{2:N}^*\) in (9) using (12). Thus, the approximated risk minimization problem reads
The superscript “star” is omitted for simplicity. Note that the optimizer \((Q_M^*(\omega ),x_{2:N}^{(i)*}(\omega ),\lambda _{2:N}^{(i)*}(\omega ))\) is stochastic and is optimal in the sense that it optimizes (11) for every \(\omega \in {\varOmega }\).
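The risk minimization and its empirical approximation can be illustrated on a toy instance. The sketch below is ours (our own system, a one-parameter family \(Q=qI\), and a plain grid search standing in for a constrained nonlinear solver): it generates noisy outputs from a “true” q and recovers it by minimizing the empirical risk.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
C = np.array([[1.0, 0.0]])
n, N, M = 2, 20, 100

def gains(Q):
    # Backward DRE with R = I, S = 0; returns [K_1, ..., K_{N-1}].
    P, Ks = np.zeros((n, n)), []
    for _ in range(N - 1):
        K = -np.linalg.solve(np.eye(1) + B.T @ P @ B, B.T @ P @ A)
        Ks.append(K)
        P = Q + A.T @ P @ (A + B @ K)
    return Ks[::-1]

def outputs(Ks, x0):
    # Noise-free outputs ybar_{2:N} of the closed loop from x_1 = x0.
    x, ys = x0, []
    for K in Ks:
        x = (A + B @ K) @ x
        ys.append(C @ x)
    return np.concatenate(ys)

# Synthetic data from a "true" Q = q_true * I (our parameterization),
# with i.i.d. initial states and additive output noise.
q_true = 1.5
Ks_true = gains(q_true * np.eye(n))
data = [(x0, outputs(Ks_true, x0) + 0.01 * rng.standard_normal(N - 1))
        for x0 in rng.uniform(-5, 5, size=(M, n))]

def empirical_risk(q):
    Ks = gains(q * np.eye(n))
    return sum(np.sum((y - outputs(Ks, x0)) ** 2) for x0, y in data) / M

grid = np.linspace(0.5, 3.0, 26)
q_hat = grid[np.argmin([empirical_risk(q) for q in grid])]
```

The minimizer of the empirical risk lands at (or next to) the true parameter, which is the behavior the consistency result below formalizes as \(M\rightarrow \infty\).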
Based on Lemma 1, we could then show the statistical consistency of the approximated risk minimization problem (13).
Theorem 3
Suppose \(Q_M^*\in \bar{{\mathbb {S}}}^n_+(\varphi )\), \(\{x_{2:N}^{(i)*}\}\) and \(\{\lambda _{2:N}^{(i)*}\}\) solve (13), then \(Q_M^*\overset{p}{\rightarrow }{\bar{Q}}\), where \({\bar{Q}}\) is the true value used in the “forward” problem (7).
Proof
Denote \(z_t=\left( x_t^\mathrm{T},\lambda _t^\mathrm{T}\right) ^\mathrm{T},t=2:N\), then the first two equations in (12) can be written as the following implicit dynamics
Therefore, we can rewrite (12) in the following compact matrix form as
where
Note that for an arbitrary \(Q\in {\mathbb {S}}^n_+\), since (14) is a necessary and sufficient condition for the optimality of the corresponding forward LQR problem, which in turn has a unique solution, \({\mathscr {F}}(Q)\) must be invertible for all \(Q\in {\mathbb {S}}^n_+\). Thus, it follows that \(Z={\mathscr {F}}(Q)^{-1}b({\bar{x}})={\mathscr {F}}(Q)^{-1}{\tilde{A}}{\bar{x}}\), where \({\tilde{A}}=[A^\mathrm{T},0,\ldots ,0]^\mathrm{T}\). Hence \(f(Q;\xi )\) can be rewritten as
where \(G=I_{N-1}\otimes \left[ C,0_{l\times n}\right]\) and \(Y=[y_2^\mathrm{T},\ldots ,y_N^\mathrm{T}]^\mathrm{T}\).
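The stacked system \({\mathscr {F}}(Q)Z=b({\bar{x}})\) can be assembled explicitly. The sketch below uses one self-consistent convention of our own for the PMP conditions (cost \(\sum _{t=1}^{N-1}(x_t^\mathrm{T}Qx_t+u_t^\mathrm{T}u_t)\) with \(S=0\), and \(u_t\) eliminated via \(u_t=-\frac{1}{2}B^\mathrm{T}\lambda _{t+1}\)); the exact block layout in the paper may differ, but solving the stacked linear system must reproduce the Riccati-based closed-loop trajectory.

```python
import numpy as np

# PMP conditions after eliminating u_t (our convention):
#   x_2 + H lam_2 = A xbar,               H := 0.5 B B^T,
#   x_{t+1} + H lam_{t+1} - A x_t = 0,    t = 2, ..., N-1,
#   lam_t - 2 Q x_t - A^T lam_{t+1} = 0,  t = 2, ..., N-1,
#   lam_N = 0.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.diag([1.0, 2.0])
n, N = 2, 10
H = 0.5 * (B @ B.T)
xbar = np.array([1.0, -2.0])

dim = 2 * n * (N - 1)          # Z = (x_2, lam_2, ..., x_N, lam_N)
F = np.zeros((dim, dim))
b = np.zeros(dim)

def xcol(i):                   # columns of x_{i+2} in Z
    return slice(2 * n * i, 2 * n * i + n)

def lcol(i):                   # columns of lam_{i+2} in Z
    return slice(2 * n * i + n, 2 * n * (i + 1))

r = 0
F[r:r + n, xcol(0)] = np.eye(n); F[r:r + n, lcol(0)] = H
b[r:r + n] = A @ xbar; r += n
for i in range(N - 2):         # couples z_t and z_{t+1}, t = i + 2
    F[r:r + n, xcol(i)] = -A
    F[r:r + n, xcol(i + 1)] = np.eye(n); F[r:r + n, lcol(i + 1)] = H
    r += n
    F[r:r + n, xcol(i)] = -2 * Q; F[r:r + n, lcol(i)] = np.eye(n)
    F[r:r + n, lcol(i + 1)] = -A.T
    r += n
F[r:r + n, lcol(N - 2)] = np.eye(n)   # lam_N = 0
Z = np.linalg.solve(F, b)             # Z = F(Q)^{-1} b(xbar)
```

The cross-check against the backward Riccati recursion confirms that the stacked PMP system and the feedback solution encode the same unique optimal trajectory, which is exactly why \({\mathscr {F}}(Q)\) is invertible.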
It can be shown in a similar way to [5] that the uniform law of large numbers [20] applies, namely,
In addition to (15), by Lemma 1 we have also shown that \({\bar{Q}}\) is the unique optimizer to (10), then \(Q_M^*\overset{p}{\rightarrow }{\bar{Q}}\) follows directly from Theorem 5.7 in [21]. \(\square\)
IOC under Gaussian assumption
In this section, we consider the case in which exact samples of the initial values are not available and we can only observe the noisy measurement \(y_1=Cx_1+v_1\). Recall that in Sect. 3 we regard the initial value as the “input signal” of the model structure and the system output observation as the “output signal”. This means that in this scenario, the problem is actually an errors-in-variables system identification problem, which is hard in general. To tackle this, we make the following assumption in the remainder of this section.
Assumption 6
\({\bar{x}}=x_1\sim {\mathcal {N}}(m_1,{\varSigma} _1)\), \(v_t\sim {\mathcal {N}}(0,{\varSigma} _v)\) and \(v_{1:N}\) are white. \({\varSigma} _v\) is known, but \(m_1\) and \({\varSigma} _1\) are not a priori known.
Recall that in the proof of Theorem 3, we were able to write \(Y=[y_2^\mathrm{T},\ldots ,y_N^\mathrm{T}]^\mathrm{T}=G{\mathscr {F}}(Q)^{-1}{\tilde{A}}{\bar{x}}+[v_2^\mathrm{T},\ldots ,v_N^\mathrm{T}]^\mathrm{T}\). Namely, it holds that
where \(G_t=\left[ G\right] _{l(t-1)+1:lt}\). We can regard \(y_{1:N}\) as the output of the following system:
where \({\tilde{C}}_1(Q)=C\) and \({\tilde{C}}_t(Q)=G_t{\mathscr {F}}(Q)^{-1}{\tilde{A}},t=2:N\). This means that under Assumption 6, the IOC problem can be viewed as a system identification problem for a linear Gaussian model, which can be solved by the maximum-likelihood method. If M sets of output sequences are available, we have M i.i.d. samples \(x_1^{(1:M)}\) of the initial value. Since the \(v_t^{(1:M)}\) are also i.i.d., \(y_t^{(i)} = Cx_t^{(i)}+v_t^{(i)}\) and \(x_t^{(i)}\) depends only on the initial value and Q, \(y_t^{(i)}\) is independent of \(y_t^{(j)}\) for all \(i\ne j\).
The EM algorithm will be used to obtain a maximum-likelihood estimate of Q. Based on (16), \(\eta _1\) is chosen as the latent variable and we use \(\theta\) to parametrize Q, \(m_1\) and \({\varSigma} _1\). We are now ready to compute the auxiliary function \({\mathcal {Q}}(\theta ,\theta _k)\) for the EM algorithm.
Proposition 1
Under Assumption 6, the auxiliary function \({\mathcal {Q}}(\theta ,\theta _k)\) that corresponds to the model (16) is given by
where
and \({\hat{\eta }}_{1\mid N}^{(i)}:={\mathbb {E}}_{\theta _k}[\eta _1^{(i)}\mid y_{1:N}^{(i)}]\), \(P_{1\mid N}^{(i)}={{\,\mathrm{cov}\,}}_{\theta _k}[\eta _1^{(i)}\mid y_{1:N}^{(i)}]\).
Proof
By Bayes’ rule, it follows that
Hence by the definition of the auxiliary function \({\mathcal {Q}}(\theta ,\theta _k)\), we have
Note that \(y_1^{(i)}=C\eta _1^{(i)}+v_1^{(i)}\), hence \(p_\theta (y_1^{(i)}\mid \eta _1^{(i)})\sim {\mathcal {N}}(C\eta _1^{(i)},{\varSigma} _v)\), which does not depend on \(\theta\). By using the cyclic permutation property of the matrix trace and the fact that \({\mathbb {E}}_{\theta _k}[\eta _1^{(i)}\eta _1^{(i){\rm T}}\mid y_{1:N}^{(i)}]={\hat{\eta }}_{1\mid N}^{(i)}{\hat{\eta }}_{1\mid N}^{(i){\rm T}}+P_{1\mid N}^{(i)}\), we have
where the constant and irrelevant terms are ignored. \(\square\)
We can obtain the smoothed mean \({\hat{\eta }}_{1\mid N}^{(i)}\) and covariance \(P_{1\mid N}^{(i)}\) of the initial value by fixed-point smoothing [22]. Each iteration of the EM algorithm involves maximizing \({\mathcal {Q}}(\theta ,\theta _k)\) with respect to \(\theta\). Since \(m_1\) and \({\varSigma} _1\) appear only in \({\mathcal {Q}}_1(\theta ,\theta _k)\) while Q appears only in \({\mathcal {Q}}_2(\theta ,\theta _k)\), they can be optimized separately. The first-order optimality conditions for maximizing \({\mathcal {Q}}_1(\theta ,\theta _k)\) with respect to \(m_1\) and \({\varSigma} _1\) read:
Since (21) and (22) give the unique stationary point of \({\mathcal {Q}}_1(\theta ,\theta _k)\), it is also the global maximizer of \({\mathcal {Q}}_1(\theta ,\theta _k)\). In contrast, an analytic solution for the optimizer of \({\mathcal {Q}}_2(\theta ,\theta _k)\) is not available, thus we have to solve for \(Q^*\) numerically. The problem of maximizing \({\mathcal {Q}}_2(\theta ,\theta _k)\) can be rewritten as
where \(Y^{(i)}=[y_2^{(i)\rm T},\ldots ,y_N^{(i)\rm T}]^\mathrm{T}\) and \({\varSigma} _V=I_{N-1}\otimes {\varSigma} _v\). Using the function \({\hat{f}}_\varepsilon\) proposed in [23] and applying the same “trick” mentioned at the end of Sect. 4, we can use standard nonlinear optimization solvers to solve (23).
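Equations (21) and (22) are referenced but not displayed above; for the Gaussian initial-state term they correspond to the standard EM updates, which we sketch here as our own reconstruction (the sample mean of the smoothed means, and the average smoothed covariance plus the scatter of the smoothed means about the new mean):

```python
import numpy as np

def m_step_initial(eta_hat, P_smooth):
    """Closed-form maximizer of the Gaussian initial-state part of the
    auxiliary function (hedged reconstruction of the unshown (21)-(22)):
      eta_hat:  (M, n) smoothed means  E[eta_1^(i) | y_{1:N}^(i)],
      P_smooth: (M, n, n) smoothed covariances P_{1|N}^(i).
    Returns the updated (m_1, Sigma_1)."""
    m1 = eta_hat.mean(axis=0)
    diffs = eta_hat - m1
    Sigma1 = (P_smooth.sum(axis=0) + diffs.T @ diffs) / eta_hat.shape[0]
    return m1, Sigma1
```

This mirrors the familiar M-step of EM for linear Gaussian state-space models, where the smoothed moments come from the fixed-point smoother of the preceding paragraph.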
Numerical examples
In our numerical simulation, a group of discrete-time linear systems sampled from continuous linear systems \({\dot{x}}={\hat{A}}x+{\hat{B}}u\) with the sampling period \({\Delta } t=0.1\) is generated, where
and \(a_1,a_2,c_1,c_2\) are randomly generated from \({\mathcal {N}}(0,3)\), respectively. However, we need to screen for satisfaction of the relative degree condition. The “real” \({\bar{Q}}\) is generated by letting \({\bar{Q}}={\hat{Q}}{\hat{Q}}^\mathrm{T}\), where each element of \({\hat{Q}}\) is sampled from the uniform distribution on \([-1,1]\), and the time-horizon length N is set to 40. The feasible compact set for Q is set as \(\bar{{\mathbb {S}}}^n_+(5)\), and any randomly generated \({\bar{Q}}\) that does not belong to \(\bar{{\mathbb {S}}}^n_+(5)\) is discarded and regenerated until we have 200 groups of randomly generated “forward” problems. Each element of the initial conditions \({\bar{x}}^{(1:M)}\) is sampled from a uniform distribution supported on \([-5,5]\). For each forward problem, 200 trajectories are generated, i.e., \(M=200\). We add white Gaussian noise at 20 dB SNR to \(Cx_{2:N}^{(1:M)}\) to obtain \(y_{2:N}^{(1:M)}\). The MATLAB function fmincon is used to solve the risk minimization problem. We use \(Q=I\) as the initial iterate when solving the optimization problem in MATLAB for all IOC estimations. Denote the estimate of \({\bar{Q}}\) by \(Q_\mathrm{est}\).
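The 20 dB noise level can be realized by scaling the noise power to the measured signal power. A small sketch of one common convention (similar in spirit to MATLAB's awgn with the 'measured' option; the helper name is ours):

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise at the given SNR (dB), with noise power
    set relative to the empirical power of `signal` (one common 'awgn'
    convention; this helper and its name are our own)."""
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))
    return signal + np.sqrt(p_noise) * rng.standard_normal(signal.shape)
```

At 20 dB, the noise power is thus one hundredth of the signal power.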
From Fig. 1, we see that the relative error \(\Vert Q_\mathrm{est}-{\bar{Q}}\Vert _F/\Vert {{\bar{Q}}}\Vert _F\) decreases markedly as M increases. This empirically corroborates the statistical consistency established in Theorem 3.
Next, the results of IOC estimation under Gaussian assumptions are illustrated. Here, 100 groups of forward problems are generated, with 50 trajectories for each group, i.e., \(M=50\). The initial values are randomly generated by sampling from \({\mathcal {N}}(m_1=0,{\varSigma} _1=10I)\). White Gaussian noise is added to \(Cx_{1:N}^{(1:M)}\) to obtain \(y_{1:N}^{(1:M)}\), with \(v_t\sim {\mathcal {N}}(0,{\varSigma} _v=0.01I)\). We use \(m_1=I\), \({\varSigma} _1=20I\) and \(Q=I\) as the initial guesses for the EM algorithm for all groups. The stopping criterion for the iteration is \(\Vert Q_{k+1}-Q_k\Vert _F\leqslant 10^{-4}\). The results are shown in Fig. 2.
As illustrated in Fig. 2, the relative error \(\Vert Q_\mathrm{est}-{\bar{Q}}\Vert _F/\Vert {{\bar{Q}}}\Vert _F\) is small in most cases, but we do obtain a few estimates with huge relative errors. This is because the EM algorithm only guarantees that the likelihood does not decrease; it guarantees neither finding the global maximizer of the log-likelihood function nor the convergence of \(Q_k\) to a local maximum. Mitigating this issue will be part of our future work.
Conclusions
In this investigation, IOC for finite-horizon discrete-time LQRs is studied. We first investigate the identifiability of the model structure, and conditions for identifiability are given. Next, the problem of IOC is considered under two scenarios. We first consider the case where the distributions of the initial state and the observation noise are completely unknown, yet the initial states and noisy system outputs are available. The problem is formulated as a risk minimization problem. By utilizing the identifiability property, we are able to prove the statistical consistency of this estimation method. We next consider the case where the initial states are not available, but the distribution of the initial state as well as that of the observation noises is Gaussian. A maximum-likelihood estimate is obtained by the EM algorithm. The performance of our methods is illustrated by numerical examples.
References
Kalman, R. E. (1964). When is a linear control system optimal? Journal of Basic Engineering, 86(1), 51–60.
Mombaur, K., Truong, A., & Laumond, J.-P. (2010). From human to humanoid locomotion: an inverse optimal control approach. Autonomous Robots, 28(3), 369–383.
Finn, C., Levine, S., & Abbeel, P. (2016). Guided cost learning: Deep inverse optimal control via policy optimization. In International conference on machine learning, pp. 49–58. New York, NY, USA.
Berret, B., & Jean, F. (2016). Why don’t we move slower? The value of time in the neural control of action. Journal of Neuroscience, 36(4), 1056–1070.
Zhang, H., Li, Y., & Hu, X. (2019). Inverse optimal control for finite-horizon discrete-time linear quadratic regulator under noisy output. In IEEE 58th conference on decision and control (CDC), pp. 6663–6668. Nice, France.
Anderson, B. D. O., & Moore, J. B. (2007). Optimal control: Linear quadratic methods. Courier Corporation.
Jameson, A., & Kreindler, E. (1973). Inverse problem of linear optimal control. SIAM Journal on Control, 11(1), 1–19.
Fujii, T. (1987). A new approach to the LQ design from the viewpoint of the inverse regulator problem. IEEE Transactions on Automatic Control, 32(11), 995–1004.
Boyd, S., El Ghaoui, L., Feron, E., & Balakrishnan, V. (1994). Linear matrix inequalities in system and control theory (vol. 15). SIAM.
Priess, M. C., Conway, R., Choi, J., Popovich, J. M., & Radcliffe, C. (2015). Solutions to the inverse LQR problem with application to biological systems analysis. IEEE Transactions on Control Systems Technology, 23(2), 770–777.
Keshavarz, A., Wang, Y., & Boyd, S. (2011). Imputing a convex objective function. In IEEE international symposium on intelligent control (ISIC), pp. 613–619. Denver, CO, USA.
Bertsimas, D., Gupta, V., & Paschalidis, I. C. (2015). Data-driven estimation in equilibrium using inverse optimization. Mathematical Programming, 153(2), 595–633.
Aswani, A., Shen, Z.-J., & Siddiq, A. (2018). Inverse optimization with noisy data. Operations Research, 66(3), 870–892.
Molloy, T. L., Tsai, D., Ford, J. J., & Perez, T. (2016). Discrete-time inverse optimal control with partial-state information: A soft-optimality approach with constrained state estimation. In IEEE 55th conference on decision and control (CDC), pp. 1926–1932. Las Vegas, NV, USA.
Molloy, T. L., Ford, J. J., & Perez, T. (2018). Finite-horizon inverse optimal control for discrete-time nonlinear systems. Automatica, 87, 442–446.
Jin, W., Kulić, D., Mou, S., & Hirche, S. (2018). Inverse optimal control with incomplete observations. arXiv: 1803.07696 (arXiv Preprint).
Ljung, L., & Chen, T. (2013). Convexity issues in system identification. In The 10th IEEE international conference on control and automation (ICCA), pp. 1–9. Hangzhou, China.
Zhang, H., Umenberger, J., & Hu, X. (2019). Inverse optimal control for discrete-time finite-horizon linear quadratic regulators. Automatica, 110, 108593.
Li, Y., Yao, Y., & Hu, X. (2020). Continuous-time inverse quadratic optimal control problem. Automatica, 117, 108977.
Jennrich, R. I. (1969). Asymptotic properties of nonlinear least squares estimators. The Annals of Mathematical Statistics, 40(2), 633–643.
van der Vaart, A. W. (2000). Asymptotic statistics (vol. 3). Cambridge University Press.
Söderström, T. (2012). Discrete-time stochastic systems: Estimation and control. Springer.
Nesterov, Y. (2007). Smoothing technique and its applications in semidefinite optimization. Mathematical Programming, 110(2), 245–259.
Zhang, H., Li, Y. & Hu, X. Discrete-time inverse linear quadratic optimal control over finite time-horizon under noisy output measurements. Control Theory Technol. 19, 563–572 (2021). https://doi.org/10.1007/s11768021000668
Keywords
 Inverse optimal control
 Linear quadratic regulator
 Statistical consistency
 EM algorithm