1 Introduction

The study of the algebraic Riccati equation (ARE) is motivated by the important role it plays in many applications from different areas, such as optimal filter design and control theory [20], queueing models [23], the numerical solution of transport theory [14], and problems with or without symmetric constraints. The algebraic Riccati equation is given by ([18]):

$$\begin{aligned} {\mathcal {R}}(X):=XDX-XA-BX-C=0, \end{aligned}$$
(1)

where \(X\in {\mathbb {C}}^{m\times n}\) is the unknown, and the coefficients are \(A \in {\mathbb {C}}^{n\times n}\), \(B\in {\mathbb {C}}^{m\times m}\), \(D\in {\mathbb {C}}^{n\times m }\) and \(C\in {\mathbb {C}}^{m\times n}\). Equation (1) is known as the nonsymmetric algebraic Riccati equation (NARE), which is distinguished from the symmetric one:

$$\begin{aligned} {\mathcal {R}}(X):=XDX-XA-A^{*}X-C=0, \end{aligned}$$
(2)

where \(A^{*}\) denotes the conjugate transpose of the matrix A. The term symmetric refers to a matrix equation \({\mathcal {R}}(X)=0\) such that \({\mathcal {R}}(X)^{*}={\mathcal {R}}(X^{*})\). In particular, Eq. (1) becomes a symmetric equation if \(A=B^{*}\), \(D=D^{*}\) and \(C=C^{*}\). Besides, if \(m=n\), then Eq. (2) is known as a continuous-time algebraic Riccati equation (CARE). In this case, the solution X of interest is a Hermitian matrix. A typical problem where a CARE appears is the linear-quadratic optimal control problem.

Different techniques for solving these equations can be found in the literature. Many papers dedicated to solving ARE equations are based on algebraic techniques and the theory of matrices; see for instance [19], where the Schur method was introduced. Other techniques include the use of hierarchical matrices for solving large-scale Riccati equations, see [10]. In fact, in the large and sparse case, a common approach is to use Newton's method to find a solution of (2), see for instance [7, 8, 12, 16, 21, 24] and the references given therein.

For solving a CARE, at every Newton step a Lyapunov equation [18] must be solved:

$$\begin{aligned} {\mathcal {L}}_F (X)=XF+F^{*}X =W, \end{aligned}$$

where F and W are given matrices.
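For reference, equations of this form are available in standard libraries. The following minimal sketch (our own illustration, not code from the paper) uses SciPy's `solve_continuous_lyapunov`, which solves \(AX+XA^{*}=Q\); passing \(A=F^{*}\) recovers \(XF+F^{*}X=W\). The random test matrices are assumptions chosen only to make the snippet self-contained.

```python
import numpy as np
from scipy import linalg

# Solve X F + F^* X = W for a stable F (a minimal self-contained check).
rng = np.random.default_rng(0)
F = -np.eye(4) + 0.1 * rng.standard_normal((4, 4))   # eigenvalues near -1
W = rng.standard_normal((4, 4))
W = W + W.T                                          # Hermitian right-hand side
X = linalg.solve_continuous_lyapunov(F.conj().T, W)  # solves A X + X A^* = Q
assert np.allclose(X @ F + F.conj().T @ X, W)
```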

In this work, we analyze a strategy for solving the symmetric algebraic Riccati equation based on the use of efficient high-order iterative schemes [3, 6]. We propose two iterative two-stage predictor–corrector schemes [17], using an iterative scheme with good accessibility as the predictor iteration and a high-order iterative scheme, which accelerates the convergence, as the corrector iteration. An advantage of this strategy is that, at every step, the coefficient matrix of the Lyapunov equations which must be solved is the same. Thus, the use of this iterative scheme reduces the number of iterations and hence the total amount of work.

Throughout the work, we suppose the pair (A, D) is stabilizable, that is, there exists a (feedback) matrix \(K\in {\mathbb {R}}^{n\times n}\) such that all eigenvalues of \(A-DK\) are in the open left half-plane. Note that the matrix K can be chosen to be symmetric if the matrix D is symmetric (see [18]). Under these conditions, the existence of Hermitian solutions X of (2) can be characterized using spectral properties of the matrix \(\left( \begin{array}{cc} -A&D\\ C&A^{*} \end{array}\right) \), see [18].

For this, we denote by \({\mathcal {H}}\) the set of Hermitian matrices in \({\mathbb {R}}^{n\times n}\). For any matrix norm, \({\mathcal {H}}\) is a Banach space, and \({\mathcal {R}}\) is a mapping from \({\mathcal {H}}\) into itself. From now on, we denote by \(\Vert \cdot \Vert \) a general norm in the set of Hermitian matrices. Then, the first Fréchet derivative of \({\mathcal {R}}\) at a matrix \(X \in {\mathcal {H}}\) is a linear map \({\mathcal {R}}'(X): {\mathcal {H}} \rightarrow {\mathcal {H}}\), given by

$$\begin{aligned} {\mathcal {R}}'(X) E= E(DX-A)+(DX-A)^{*}E, \,\, E \in {\mathcal {H}}. \end{aligned}$$
(3)

Also, the second derivative at \(X \in {\mathcal {H}}\) is a bilinear map \({\mathcal {R}}''(X): {\mathcal {H}} \times {\mathcal {H}} \rightarrow {\mathcal {H}}\), given by

$$\begin{aligned} {\mathcal {R}}''(X) E_1E_2= E_1DE_2 +E_2DE_1, \,\, E_1, E_2 \in {\mathcal {H}}. \end{aligned}$$
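These derivative formulas can be sanity-checked numerically. The sketch below (our verification, with hypothetical names; not part of the paper) compares formula (3) with a central difference of \({\mathcal {R}}\); since \({\mathcal {R}}\) is quadratic, the two agree up to rounding.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
def herm(M): return (M + M.conj().T) / 2   # project onto Hermitian matrices
A = rng.standard_normal((n, n))
D, C, X, E = (herm(rng.standard_normal((n, n))) for _ in range(4))

R = lambda T: T @ D @ T - T @ A - A.conj().T @ T - C
dR = E @ (D @ X - A) + (D @ X - A).conj().T @ E      # formula (3)
h = 1e-6
fd = (R(X + h * E) - R(X - h * E)) / (2 * h)         # central difference
assert np.allclose(dR, fd)
```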

One of the most well-known iterative schemes for solving Eq. (2) is the Newton method [1, 15]:

$$\begin{aligned} \left\{ \begin{array}{ll} X_0\ \text {given},\\[1ex] {\mathcal {R}}'(X_n) L_n= -{\mathcal {R}}(X_n), \\[1ex] X_{n+1}=X_n+L_{n}, \quad n\ge 0. \end{array} \right. \end{aligned}$$
(4)

Thus, taking into account (3), approximating a solution of Eq. (2) via Newton's method is equivalent to solving, at each step, the Lyapunov equation:

$$\begin{aligned} X_{n+1}(A-DX_n) +(A-DX_n)^{*}X_{n+1}= -H(X_{n})-C, \end{aligned}$$

where \(H:{\mathcal {H}} \rightarrow {\mathcal {H}}\) is the operator \(H(X)=XDX\).
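The following sketch (our illustration under the sign conventions above; the name `newton_care` is ours) implements this iteration, solving the Lyapunov equation above with SciPy at each step. Each pass through the loop costs one full Lyapunov solve, which is the count used later in the efficiency comparison of Sect. 2.1.

```python
import numpy as np
from scipy import linalg

def newton_care(A, D, C, X0, tol=1e-12, maxit=50):
    """Newton's method (4) for X D X - X A - A^* X - C = 0: each step solves
    X_{n+1}(A - D X_n) + (A - D X_n)^* X_{n+1} = -(H(X_n) + C)."""
    X = X0
    for _ in range(maxit):
        F = A - D @ X
        # SciPy solves M Y + Y M^* = Q; take M = F^* to get Y F + F^* Y = Q.
        X_new = linalg.solve_continuous_lyapunov(F.conj().T, -(X @ D @ X + C))
        if np.linalg.norm(X_new - X) <= tol * max(1.0, np.linalg.norm(X)):
            return X_new
        X = X_new
    return X
```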

Our main aim is to approximate a solution of Eq. (2), improving the results obtained by the classical Newton method. To do this, we first look for an iterative scheme that improves on the quadratic speed of convergence of Newton's method. But this alone is not enough: we must also consider an iterative scheme with a reduced operational cost, so that it is more efficient than Newton's method. Thus, we consider the M5 method [4], which has local fifth order of convergence and a reduced operational cost. In fact, we prove that it is a more efficient iterative scheme than Newton's method. However, when choosing an iterative scheme, in addition to its speed of convergence and its operational cost, there is another very important aspect: its accessibility. This accessibility is measured by the set of starting points that make the iterative scheme convergent. Therefore, a commonly used procedure is to obtain a local convergence result for the iterative scheme. Such a result provides the well-known convergence ball [2], which indicates a domain of starting points from which the iterative scheme converges. Thus, by comparing the convergence balls of two iterative schemes we compare their accessibility. We check that the M5 method has reduced accessibility, and we therefore modify this scheme. To solve this problem, we consider two-stage predictor–corrector iterative schemes: first, an iterative scheme with a wide accessibility region is applied, and second, the M5 method, with better computational efficiency than Newton's method, is applied. In this way, we obtain two interesting improvements of the M5 method, considering the Newton method and the modified Newton method [15] as predictor schemes.

The paper is organized as follows. First, in Sect. 2, we study the M5 method, proving that it has local fifth order of convergence for solving Eq. (2). Next, in Sect. 3, from a local convergence result given for the iterative scheme M5, we study its domain of parameters and we prove that its accessibility is poor. Then, we justify a first improvement of the M5 method through an iterative two-stage predictor–corrector scheme. In Sect. 4, we obtain a second improvement of the M5 method by means of the modified Newton method, based on a reduction of the operational cost. Finally, in Sect. 5, we use the predictor–corrector iterative schemes considered to approximate a solution of a particular symmetric ARE from a benchmark collection. We show that these iterative schemes are competitive with respect to Newton's method, which is one of the iterative schemes commonly used for this type of problem.

From now on, we denote by \(\overline{B(X, \rho )} = \{Y \in {\mathcal {H}}; \Vert Y-X\Vert \leqslant \rho \}\) and \(B(X, \rho ) = \{Y \in {\mathcal {H}}; \Vert Y-X\Vert < \rho \},\) with \(X \in {\mathcal {H}} \) and \(\rho \in {\mathbb {R}}_{+}\).

2 A High-Order Iterative Scheme: M5 Method

In order to accelerate the speed of convergence, we consider the efficient fifth-order iterative scheme M5:

$$\begin{aligned} \left\{ \begin{array}{ll} \mathrm {Given \, \, an \, \, initial \, \, guess \, \,} X_0\in {\mathcal {H}},\\[1ex] Y_{k}= X_k - [{\mathcal {R}}'(X_k)]^{-1}{\mathcal {R}}(X_k), \\[1ex] Z_{k}= Y_k - 5[{\mathcal {R}}'(X_k)]^{-1}{\mathcal {R}}(Y_k), \\[1ex] X_{k+1} = Z_k - \displaystyle \frac{1}{5} [{\mathcal {R}}'(X_k)]^{-1}({\mathcal {R}}(Z_k)-16 {\mathcal {R}}(Y_k)),\quad k\ge 0. \end{array} \right. \end{aligned}$$
(5)

provided that the operators \({\mathcal {R}}'(X_k)\) are all invertible.

Observe that, from the above, each iteration of (5) is equivalent to solving a single Lyapunov equation, \( {\mathcal {L}}_F(X)=W\), with three different right-hand sides:

$$\begin{aligned} \left\{ \begin{array}{lll} \mathrm {Given \, \, an \, \, initial \, \, guess \, \,} X_0\in {\mathcal {H}}, \\ {\mathcal {L}}_{A-DX_k}(Y_k)= ({\mathcal {P}} _D(X_k) + C), \\ {\mathcal {L}}_{A-DX_k}(Z _k)= ({\mathcal {P}} _D(X_k) + C) + 5 {\mathcal {P}} _D(Y_k-X_k), \\ {\mathcal {L}}_{A-DX_k}(X _{k+1})= ({\mathcal {P}} _D(X_k) + C)+ {\mathcal {P}} _D(Y_k-X_k) \\ - \displaystyle \frac{1}{5} \left[ (Y_k-X_k)D(Z_k-Y_k)+(Z_k-Y_k)D(Y_k-X_k)+{\mathcal {P}} _D(Z_k-Y_k) \right] , k\geqslant 0. \end{array} \right. \end{aligned}$$
(6)

where \( {\mathcal {L}}_F(X):= XF+F^{*}X\), \({\mathcal {P}} _D(X):=XDX\), and each right-hand side W is Hermitian whenever \(X_k\) is. In this section, we prove that method (5), under certain conditions at a solution \({\widehat{X}}\) of (2), converges to \({\widehat{X}}\) with at least local fifth order of convergence.
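In terms of implementation, one M5 step can be sketched as follows (our illustration; the names `m5_step` and `solve_lyap` are ours). Here `solve_lyap(F, V)` is any routine returning E with \(EF+F^{*}E=V\), so that \([{\mathcal {R}}'(X)]^{-1}V=-\texttt{solve\_lyap}(A-DX,\,V)\); the three solves share the coefficient matrix \(A-DX_k\), which is the point exploited in Sect. 2.1.

```python
import numpy as np
from scipy import linalg

def solve_lyap(F, V):
    """Return E with E F + F^* E = V (via SciPy; see Sect. 2.1 for reuse)."""
    return linalg.solve_continuous_lyapunov(F.conj().T, V)

def m5_step(A, D, C, X):
    """One step of (5): three linear solves with the same operator R'(X),
    i.e. one Lyapunov coefficient matrix F = A - D X and three right-hand
    sides (a sketch, not the paper's code)."""
    R = lambda T: T @ D @ T - T @ A - A.conj().T @ T - C
    F = A - D @ X
    inv_Rp = lambda V: -solve_lyap(F, V)          # [R'(X)]^{-1} V
    Y = X - inv_Rp(R(X))
    Z = Y - 5 * inv_Rp(R(Y))
    return Z - (1 / 5) * inv_Rp(R(Z) - 16 * R(Y))
```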

In what follows, we denote by \(E_k=X_k-{\widehat{X}}\) the error at the k-th iteration. The equation \(E_{k+1}=M E_{k}^p\), where M is a p-linear operator, \(M: {\mathcal {H}} \times \overset{p}{\cdots } \times {\mathcal {H}} \rightarrow {\mathcal {H}}\), is called the error equation, and p is the local order of convergence. Notice that \(E_k^p\) stands for \(E_k \times \overset{p}{\cdots } \times E_k\). The next result proves the local order of convergence of the M5 method using Taylor expansions, and obtains the error equation.

Theorem 1

Let \({\mathcal {R}}: {\mathcal {H}} \rightarrow {\mathcal {H}}\) be the operator of the symmetric algebraic Riccati equation given in (2) and \({\widehat{X}}\) a solution of (2). Suppose that \(A-D{\widehat{X}}\) is stable and \({\mathcal {R}}'\) is nonsingular at \({\widehat{X}}\). Then, the sequence \(\{X_k\}\), given by the M5 method, converges to \({\widehat{X}}\) with local fifth order of convergence. Moreover, the error satisfies

$$\begin{aligned} \Vert E_{k+1}\Vert \le 14 \Vert [{\mathcal {R}}'({\widehat{X}})]^{-1}\Vert ^4\Vert D\Vert ^4\Vert E_{k}\Vert ^5. \end{aligned}$$

Proof

We expand \({\mathcal {R}}\) and \({\mathcal {R}}'\) in Taylor series around the solution \({\widehat{X}}\). Denoting \(C_2= \frac{1}{2}[{\mathcal {R}}'({\widehat{X}})]^{-1}{\mathcal {R}}''({\widehat{X}})\), it follows that

$$\begin{aligned} {\mathcal {R}}(X_k)={\mathcal {R}}'({\widehat{X}})\left( E_k+C_2E_k^2\right) \end{aligned}$$

and

$$\begin{aligned} {\mathcal {R}}'(X_k)={\mathcal {R}}'({\widehat{X}})\left( I+ 2C_2E_k\right) . \end{aligned}$$

From \({\mathcal {R}}'(X_k)[{\mathcal {R}}'(X_k)]^{-1}=[{\mathcal {R}}'(X_k)]^{-1}{\mathcal {R}}'(X_k)=I\), we obtain

$$\begin{aligned}{}[{\mathcal {R}}'(X_k)]^{-1}= & {} \left( I-2C_2E_k+ 4(C_2E_k)^2 -8(C_2E_k)^3+16(C_2E_k)^4\right. \\{} & {} \left. -32(C_2E_k)^5\right) [{\mathcal {R}}'({\widehat{X}})]^{-1}+{\mathcal {O}}(E_k^6) \end{aligned}$$

On the other hand, it follows

$$\begin{aligned} Y_k-{\widehat{X}}= & {} C_2E_k^2-2(C_2 E_k)^2E_k+ 4(C_2 E_k)^3E_k\\{} & {} -8 (C_2 E_k)^4E_k+{\mathcal {O}}(E_k^6). \end{aligned}$$

Therefore, since \({\mathcal {R}}(Y_k)={\mathcal {R}}'({\widehat{X}})\left( (Y_k-{\widehat{X}})+C_2(Y_k-{\widehat{X}})^2\right) \), it follows that

$$\begin{aligned} {\mathcal {R}}(Y_k)= & {} {\mathcal {R}}'({\widehat{X}})\left( C_2E_k^2-2(C_2 E_k)^2E_k+4(C_2 E_k)^3E_k\right. \\{} & {} -8(C_2 E_k)^4E_k+C_2(C_2 E_k^2)^2\\{} & {} -2C_2^2(C_2 E_k)^2E_k C_2 E_k^2 \\{} & {} \left. -2C_2^2 E_k^2(C_2 E_k)^2E_k\right) +{\mathcal {O}}(E_k^6), \end{aligned}$$

and

$$\begin{aligned}{}[{\mathcal {R}}'(X_k)]^{-1}{\mathcal {R}}(Y_k)= & {} C_2E_k^2-4(C_2 E_k)^2E_k+12(C_2 E_k)^3E_k\\{} & {} -32 (C_2 E_k)^4E_k+C_2 (C_2 E_k^2)^2 \\{} & {} -2C_2^2(C_2 E_k)^2E_k C_2 E_k^2 -2C_2^2 E_k^2(C_2 E_k)^2E_k\\{} & {} -2C_2 E_kC_2 (C_2 E_k^2)^2 +{\mathcal {O}}(E_k^6). \end{aligned}$$

Notice that \(Z_k-{\widehat{X}} = (Y_k-{\widehat{X}})-5[{\mathcal {R}}'(X_k)]^{-1}{\mathcal {R}}(Y_k)\); thus

$$\begin{aligned} Z_k-{\widehat{X}}= & {} -4 C_2E_k^2+18(C_2 E_k)^2E_k-56 (C_2 E_k)^3E_k\\{} & {} +152(C_2 E_k)^4E_k-5C_2(C_2 E_k^2)^2\\{} & {} +10C_2(C_2 E_k)^2E_k C_2 E_k^2 +10C_2^2 E_k^2(C_2 E_k)^2E_k\\{} & {} +10C_2 E_kC_2 (C_2 E_k^2)^2+{\mathcal {O}}(E_k^6). \end{aligned}$$

Since \({\mathcal {R}}(Z_k)={\mathcal {R}}'({\widehat{X}})\left( (Z_k-{\widehat{X}})+C_2(Z_k-{\widehat{X}})^2\right) \), it follows that

$$\begin{aligned} \begin{array}{lll}{\mathcal {R}}(Z_k)&{}=&{}[{\mathcal {R}}'({\widehat{X}})] \left( -4 C_2E_k^2+18(C_2 E_k)^2E_k-56 (C_2 E_k)^3E_k+152(C_2 E_k)^4E_k\right. \\ &{}&{} \left. +11C_2(C_2 E_k^2)^2-62C_2^2 E_k^2(C_2 E_k)^2E_k-62C_2(C_2 E_k)^2E_kC_2 E_k^2 \right. \\ &{}&{} \left. +10C_2 E_k C_2 (C_2 E_k^2)^2 \right) +{\mathcal {O}}(E_k^6),\end{array} \end{aligned}$$

and

$$\begin{aligned}{}[{\mathcal {R}}'(X_k)]^{-1}{\mathcal {R}}(Z_k)= & {} \left( -4 C_2E_k^2+26(C_2 E_k)^2E_k\right. \\{} & {} -108 (C_2 E_k)^3E_k+349(C_2 E_k)^4E_k\\{} & {} +11C_2(C_2 E_k^2)^2 -62C_2(C_2 E_k)^2E_kC_2 E_k^2\\{} & {} \left. -62 C_2^2 E_k^2 (C_2 E_k)^2 E_k\right. \\ {}{} & {} \left. -11C_2 E_kC_2(C_2 E_k^2)^2 \right) +{\mathcal {O}}(E_k^6). \end{aligned}$$

Next, taking into account that

$$\begin{aligned} X_{k+1}-{\widehat{X}}= Z_{k}-{\widehat{X}}-\frac{1}{5}[{\mathcal {R}}'(X_k)]^{-1}\left( {\mathcal {R}}(Z_k)-16{\mathcal {R}}(Y_k)\right) \end{aligned}$$

and taking norms, it follows

$$\begin{aligned} \Vert X_{k+1}-{\widehat{X}}\Vert \le 14 \Vert [{\mathcal {R}}'({\widehat{X}})]^{-1}\Vert ^4\Vert D\Vert ^4\Vert X_{k }-{\widehat{X}}\Vert ^5. \end{aligned}$$

\(\Box \)

One of the most important aspects to consider when choosing an iterative scheme to solve a nonlinear equation is its efficiency, see [9, 11]. The parameters that are taken into account to analyze the efficiency of an iterative scheme are the order of convergence and the operational cost.

A widely used efficiency index is the computational efficiency (CE), see [25]. This index is defined by \(CE = ord^{1/op}\), where ord is the order of convergence and op is the number of operations per iteration associated with the method.

2.1 Computational Efficiency

In this section, we analyze the computational efficiency of method (6) and compare it with that of Newton's method.

To analyze the computational efficiency, we need to know the number of operations that these methods perform per step.

Newton's method requires solving one Lyapunov equation at each step, whereas applying the M5 method involves one Lyapunov equation with three different right-hand sides.

One of the best-known and most effective algorithms for the sequential solution of a Lyapunov equation with several right-hand sides is the Bartels–Stewart algorithm [5]. The main idea of this algorithm is to reduce the coefficient matrix to Schur or upper Hessenberg form. In this way, it transforms the Lyapunov equation into a triangular system which can be solved efficiently by forward or backward substitutions. Thus, the algorithm transforms the matrix system \(XF+F^{*}X=W\) into the form

$$\begin{aligned} HY+YH^{*}={\widetilde{C}} \end{aligned}$$

where \( H=Q^{T}FQ \) is the orthogonal reduction of F to Schur or upper Hessenberg form. The operational cost of computing the real Schur decomposition is \({\mathcal {O}}(25 n^3)\). Moreover, updating the right-hand side, \(Q^{T}WQ\), the backward substitution for Y and the recovery of the solution, \(X=Q YQ^{T}\), require a total cost of \(4 n^3\) operations per right-hand side.
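Since the Schur factor of F can be reused for every right-hand side, it pays to factor once and expose a solver, as in the following sketch (our illustration; `lyap_solver_factory` is our name, and we use the complex Schur form for simplicity, while the operation counts above refer to the real Schur form).

```python
import numpy as np
from scipy import linalg

def lyap_solver_factory(F):
    """Factor F once so that X F + F^* X = W can be solved cheaply for
    several right-hand sides W (a sketch of the Bartels-Stewart idea)."""
    A = F.conj().T                            # X F + F^* X = W <=> A X + X A^* = W
    T, Q = linalg.schur(A, output="complex")  # A = Q T Q^*, T upper triangular
    n = T.shape[0]

    def solve(W):
        C = Q.conj().T @ W @ Q                # reduced right-hand side
        Y = np.zeros((n, n), dtype=complex)
        for j in range(n - 1, -1, -1):        # back-substitution, column by column
            rhs = C[:, j] - Y[:, j + 1:] @ T[j, j + 1:].conj()
            Y[:, j] = linalg.solve_triangular(T + np.conj(T[j, j]) * np.eye(n), rhs)
        return Q @ Y @ Q.conj().T             # X = Q Y Q^* (imaginary part ~ 0 for real data)
    return solve
```

A quick check of this sketch: `lyap_solver_factory(F)(W)` should agree with `scipy.linalg.solve_continuous_lyapunov(F.conj().T, W)` for a stable F.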

Thus, denoting by CE(M5) and CE(N) the computational efficiencies of the M5 and Newton methods, respectively, and counting \(25n^3+3\cdot 4n^3=37n^3\) operations per M5 step and \(25n^3+4n^3=29n^3\) operations per Newton step, we have

$$\begin{aligned} CE(M5)=5^{\frac{1}{ 37 n^3 }} \end{aligned}$$

and

$$\begin{aligned} CE(N)=2^\frac{1}{29 n^3}. \end{aligned}$$

In any case, since \(\log CE(M5)/\log CE(N) = 29\ln 5/(37\ln 2) \approx 1.82 > 1\) for every n, the M5 method is more efficient than Newton's method; as we can observe in Figs. 1 and 2, this better behavior holds even for large dimensions.

Fig. 1. Log(CE) of the Newton and M5 methods, when approximating a solution of Eq. (2), n = 100:200

Fig. 2. Log(CE) of the Newton and M5 methods, when approximating a solution of Eq. (2), n = 10000:30000

3 A First Improvement for M5 Method

Another interesting aspect when selecting an iterative scheme is its accessibility, that is, the set of starting matrices that make the iterative scheme a convergent process. This is measured by means of a local convergence result, which provides the convergence ball of the iterative scheme.

3.1 Local Convergence

Usually, the local convergence results for iterative schemes require conditions on the operator \({\mathcal {R}}\) and a solution \({\widehat{X}}\) of Eq. (2). Notice that a local result provides the convergence ball which we denote by \(B({\widehat{X}}, S)\). From the value S, the convergence ball gives information about the accessibility of the solution \({\widehat{X}}\), since the convergence of the iterative scheme is ensured from any starting point belonging to the ball \(B({\widehat{X}}, S)\).

We start our study by obtaining some preliminary results.

Lemma 2

If \({\widehat{X}}\) is a Hermitian solution of (2) and \(A-D{\widehat{X}}\) is stable, then there exists \([{\mathcal {R}}'({\widehat{X}})]^{-1}\) with \(\Vert [{\mathcal {R}}'({\widehat{X}})]^{-1} \Vert \le \beta \), \(\beta \in {\mathbb {R}}_{+}\).

Proof

Given a Hermitian matrix \({\widehat{X}}\), to establish the existence of the linear operator \([{\mathcal {R}}'({\widehat{X}})]^{-1}\), we have to solve a Lyapunov equation of the form

$$\begin{aligned} -{\mathcal {R}}'({\widehat{X}})Y=Y(A-D{\widehat{X}}) +(A-D{\widehat{X}})^{*}Y = Z \Leftrightarrow - R'({\widehat{X}})^{-1} (Z)=Y. \end{aligned}$$

It is well known [18] that this equation has a solution if \(A-D{\widehat{X}}\) is stable, that is, if all its eigenvalues have negative real part. In this case, the solution is known:

$$\begin{aligned} Y=-{\mathcal {R}}'({\widehat{X}})^{-1}(Z)= \int _{0}^{\infty }\exp ((A-D{\widehat{X}})t) Z\exp ((A-D{\widehat{X}})^{*}t) {\text {d}}t. \end{aligned}$$

On the other hand, as \(D=D^{*}\), \(C=C^{*}\) and the pair (A, D) is stabilizable, there exist positive constants \(l, m \in {\mathbb {R}}\) such that \(\Vert \exp ((A-D{\widehat{X}})t) \Vert \le m \exp (- l t)\) for all \(t\ge 0\). Therefore, \([{\mathcal {R}}'({\widehat{X}})]^{-1}\) exists and \(\Vert [{\mathcal {R}}'({\widehat{X}})]^{-1}\Vert \le m^2/(2 l) = \beta \). \(\Box \)

Hence, from now on, we consider that \([{\mathcal {R}}'({\widehat{X}})]^{-1}\) exists, with \(\Vert [{\mathcal {R}}'({\widehat{X}})]^{-1}\Vert \le \beta \). Moreover, it is easy to check that

$$\begin{aligned} \Vert {\mathcal {R}}''(X)\Vert \le 2\Vert D \Vert ,\quad X \in {\mathcal {H}}. \end{aligned}$$

To prove that \(\{X_n\}\), given by (5), converges to \({\widehat{X}}\), we first prove that, for each element of the sequence, there exists \([{\mathcal {R}}'(X_n)]^{-1}\) and \(X_n \in B({\widehat{X}},S)\).

Lemma 3

Under the conditions of the previous lemma, if \(X \in B({\widehat{X}},S)\) with \(\beta \Vert D\Vert S \le \dfrac{1}{2 }\), then there exists \([{\mathcal {R}}'(X)]^{-1}\), with

$$\begin{aligned} \Vert [{\mathcal {R}}'(X)]^{-1} \Vert \le \dfrac{\beta }{1 - 2 \beta \Vert D\Vert S }. \end{aligned}$$

Proof

Firstly, we have

$$\begin{aligned} \Vert I - [{\mathcal {R}}'({\widehat{X}})]^{-1}{\mathcal {R}}'(X) \Vert \le \Vert [{\mathcal {R}}'({\widehat{X}})]^{-1} \Vert \Vert {\mathcal {R}}'({\widehat{X}}) - {\mathcal {R}}'(X) \Vert < 2 \beta \Vert D\Vert S \le 1. \end{aligned}$$

Second, applying the Perturbation Lemma [21], we get the thesis. \(\Box \)

From the algorithm (5) and the Taylor expansions, it is easy to check the following technical result.

Lemma 4

If \(X_n, Y_n, Z_n \in {\mathcal {H}}\), then the following equalities are satisfied for \(n\geqslant 0:\)

  1. (i)

    \({\mathcal {R}}(Y_n) = \frac{1}{2} {\mathcal {R}}''(X_n)(Y_n - X_n)^2\)

  2. (ii)

    \({\mathcal {R}}(Z_n) = - 5 {\mathcal {R}}(Y_n) + \frac{1}{2} {\mathcal {R}}''(X_n)(Z_n - X_n)^2.\) \(\Box \)

Lemma 5

Under the conditions of Lemma 2, assuming that the iterates \(X_n \in B({\widehat{X}},S)\) are well defined, the following items are satisfied for \(n\ge 1\):

  • \((i_n)\) \(\Vert Y_n-{\widehat{X}}\Vert < \Psi _0(\Delta (S)) \Vert X_{n}-{\widehat{X}}\Vert \),

  • \((ii_n)\) \(\Vert Z_n-{\widehat{X}}\Vert < \Psi _1 (\Delta (S)) \Vert X_n-{\widehat{X}}\Vert \),

  • \((iii_n)\) \(\Vert X_{n+1}-{\widehat{X}}\Vert < \Psi _2(\Delta (S)) \Vert X_{n}-{\widehat{X}}\Vert ,\)

where \(\Delta (S) = \dfrac{\beta \Vert D\Vert S}{1 - 2 \beta \Vert D\Vert S}\) and the auxiliary real functions: \(\Psi _0(t) = 3 t\), \(\Psi _1 (t)= \Psi _0(t) ( 4+ 10 t + 3 t^2)\) and \(\Psi _2(t) = \dfrac{t}{5} ( 8\,\Psi _0(t) + 4\, \Psi _0(t)^2 + 2\, \Psi _1(t) + \Psi _1(t)^2)\).

Proof

From algorithm (5) and the Taylor expansions, it is easy to check

$$\begin{aligned} Y_n-{\widehat{X}}= & {} X_n - [{\mathcal {R}}'(X_n)]^{-1} {\mathcal {R}}(X_n) - {\widehat{X}} \\= & {} [{\mathcal {R}}'(X_n)]^{-1} ( {\mathcal {R}}'(X_n)(X_n-{\widehat{X}})- {\mathcal {R}}(X_n) )\\= & {} [{\mathcal {R}}'(X_n)]^{-1} \left[ ({\mathcal {R}}'(X_n)-{\mathcal {R}}'({\widehat{X}}) )(X_n-{\widehat{X}}) - \dfrac{1}{2} {\mathcal {R}}''({\widehat{X}})(X_n-{\widehat{X}})^2\right] \end{aligned}$$

Now, taking norms, we obtain

$$\begin{aligned} \Vert Y_n-{\widehat{X}} \Vert\le & {} \dfrac{3}{2} \Vert [{\mathcal {R}}'(X_n)]^{-1}\Vert \, 2 \Vert D\Vert \, \Vert X_n-{\widehat{X}}\Vert ^2 \\< & {} \dfrac{3}{2}\dfrac{\beta }{1 - 2 \beta \Vert D\Vert S}\, 2 \Vert D\Vert S\, \Vert X_n-{\widehat{X}}\Vert , \end{aligned}$$

which proves \((i_n)\).

To prove \((ii_n)\), it follows the previous procedure. Thus,

$$\begin{aligned} Z_n-{\widehat{X}}= & {} Y_n - 5 [{\mathcal {R}}'(X_n)]^{-1} {\mathcal {R}}(Y_n) - {\widehat{X}} \\= & {} [{\mathcal {R}}'(X_n)]^{-1} ( {\mathcal {R}}'(X_n)(Y_n-{\widehat{X}})- 5 {\mathcal {R}}(Y_n) )\\= & {} [{\mathcal {R}}'(X_n)]^{-1} \left[ 5 ({\mathcal {R}}'(X_n)-{\mathcal {R}}'({\widehat{X}}) )(Y_n-{\widehat{X}}) - 4 {\mathcal {R}}'(X_n) (Y_n-{\widehat{X}})\right. \\{} & {} \left. + \dfrac{1}{2} {\mathcal {R}}''({\widehat{X}})(Y_n-{\widehat{X}})^2\right] \\= & {} [{\mathcal {R}}'(X_n)]^{-1} \left[ 5 {\mathcal {R}}''(\Theta )(X_n-{\widehat{X}})(Y_n-{\widehat{X}}) - 4 {\mathcal {R}}'(X_n) (Y_n-{\widehat{X}})\right. \\{} & {} \left. + \dfrac{1}{2} {\mathcal {R}}''({\widehat{X}})(Y_n-{\widehat{X}})^2\right] \\= & {} [{\mathcal {R}}'(X_n)]^{-1} \left[ 5 {\mathcal {R}}''(\Theta )(X_n-{\widehat{X}})(Y_n-{\widehat{X}})+ \dfrac{1}{2} {\mathcal {R}}''({\widehat{X}})(Y_n-{\widehat{X}})^2\right] \\{} & {} -4 (Y_n-{\widehat{X}}), \end{aligned}$$

where \(\Theta = \lambda X_n + (1-\lambda ){\widehat{X}}\) for some \(\lambda \in [0,1]\). Next, taking norms in the previous equality and taking into account \((i_n)\), we get

$$\begin{aligned} \Vert Z_n-{\widehat{X}} \Vert\le & {} \dfrac{\beta }{1 - 2 \beta \Vert D\Vert S} \left[ 10 \Vert D \Vert \Vert X_n-{\widehat{X}}\Vert +\Vert D\Vert \Vert Y_n-{\widehat{X}}\Vert \right] \\{} & {} \Vert Y_n-{\widehat{X}}\Vert + 4 \Vert Y_n-{\widehat{X}}\Vert \\< & {} \left[ \dfrac{\beta }{1 - 2 \beta \Vert D\Vert S} \left( 10 \Vert D \Vert \Vert X_n-{\widehat{X}}\Vert +\Vert D\Vert 3 \Delta (S) \Vert X_n-{\widehat{X}}\Vert \right) +4 \right] \\{} & {} 3 \Delta (S) \Vert X_n-{\widehat{X}}\Vert \\< & {} 3 \Delta (S) \left[ 4 +10 \Delta (S) + 3 \Delta (S)^2 \right] \Vert X_n-{\widehat{X}}\Vert , \end{aligned}$$

then \((ii_n)\) is proved.

To finish, we prove \((iii_n)\). Therefore, applying the algorithm (5) and Taylor expansions, we have

$$\begin{aligned} X_{n+1}-{\widehat{X}}= & {} Z_n - \dfrac{1}{5} [{\mathcal {R}}'(X_n)]^{-1} \left( -16 {\mathcal {R}}(Y_n) + {\mathcal {R}}(Z_n) \right) - {\widehat{X}}. \nonumber \\= & {} [{\mathcal {R}}'(X_n)]^{-1} \left( {\mathcal {R}}'(X_n)(Z_n-{\widehat{X}})+ \dfrac{16}{5} {\mathcal {R}}(Y_n) - \dfrac{1}{5} {\mathcal {R}}(Z_n) \right) \nonumber \\= & {} [{\mathcal {R}}'(X_n)]^{-1} \left( \dfrac{1}{5} {\mathcal {R}}''(\Theta _1)(X_n-{\widehat{X}})(Z_n-{\widehat{X}})+ \dfrac{4}{5} {\mathcal {R}}'(X_n)(Z_n-{\widehat{X}})\right. \nonumber \\{} & {} \left. + \dfrac{16}{5} {\mathcal {R}}'({\widehat{X}})(Y_n-{\widehat{X}})\right) \nonumber \\- & {} [{\mathcal {R}}'(X_n)]^{-1} \left( \dfrac{1}{10} {\mathcal {R}}''({\widehat{X}})(Z_n-{\widehat{X}})^2 - \dfrac{16}{10} {\mathcal {R}}''({\widehat{X}})(Y_n-{\widehat{X}})^2 \right) , \end{aligned}$$
(7)

with \(\Theta _1 = \lambda X_n + (1-\lambda ){\widehat{X}}\) for some \(\lambda \in [0,1]\). However, taking into account that \( Z_n-{\widehat{X}} = Y_n - {\widehat{X}} - 5 [{\mathcal {R}}'(X_n)]^{-1} {\mathcal {R}}(Y_n)\), then

$$\begin{aligned}{} & {} \dfrac{4}{5} {\mathcal {R}}'(X_n)(Z_n-{\widehat{X}})+ \dfrac{16}{5} {\mathcal {R}}'({\widehat{X}})(Y_n-{\widehat{X}})\nonumber \\{} & {} \quad = \dfrac{4}{5} {\mathcal {R}}''(\Theta _2)(X_n-{\widehat{X}})(Y_n-{\widehat{X}}) + 4 {\mathcal {R}}'({\widehat{X}})(Y_n-{\widehat{X}}) - 4 {\mathcal {R}}(Y_n) \nonumber \\{} & {} \quad = \dfrac{4}{5} {\mathcal {R}}''(\Theta _2)(X_n-{\widehat{X}})(Y_n-{\widehat{X}}) - 2 {\mathcal {R}}''({\widehat{X}}) (Y_n-{\widehat{X}})^2, \end{aligned}$$
(8)

with \(\Theta _2 = \lambda X_n + (1-\lambda ){\widehat{X}}\) for some \(\lambda \in [0,1]\). Next, substituting in (7) the expression obtained in (8), we have

$$\begin{aligned} X_{n+1}-{\widehat{X}}= & {} [{\mathcal {R}}'(X_n)]^{-1} \left( \dfrac{1}{5} {\mathcal {R}}''(\Theta _1)(X_n-{\widehat{X}})(Z_n-{\widehat{X}})\right. \\{} & {} \left. + \dfrac{4}{5} {\mathcal {R}}''(\Theta _2)(X_n-{\widehat{X}})(Y_n-{\widehat{X}}) \right) \\{} & {} \quad - [{\mathcal {R}}'(X_n)]^{-1} \left( \dfrac{2}{5} {\mathcal {R}}''({\widehat{X}})(Y_n-{\widehat{X}})^2 + \dfrac{1}{10} {\mathcal {R}}''({\widehat{X}}) (Z_n-{\widehat{X}})^2 \right) . \end{aligned}$$

Next, taking norms in the previous equality and applying the items \((i_n)\) and \((ii_n)\), we obtain

$$\begin{aligned}{} & {} \Vert X_{n+1}-{\widehat{X}} \Vert< \dfrac{\beta }{1 - 2 \beta \Vert D\Vert S}\\{} & {} \qquad \left( \dfrac{2}{5} \Vert D\Vert S \Psi _1(\Delta (S)) \Vert X_n - {\widehat{X}}\Vert + \dfrac{8}{5} \Vert D\Vert S \Psi _0(\Delta (S)) \Vert X_n-{\widehat{X}}\Vert \right) \\{} & {} \qquad + \dfrac{\beta }{1 - 2 \beta \Vert D\Vert S}\\{} & {} \qquad \left( \dfrac{4}{5} \Vert D\Vert S \Psi _0(\Delta (S)) ^2 \Vert X_n-{\widehat{X}}\Vert + \dfrac{2}{10} \Vert D\Vert S \Psi _1(\Delta (S))^2 \Vert X_n-{\widehat{X}}\Vert \right) \\{} & {} \quad < \dfrac{\Delta (S)}{5} \left( 8 \Psi _0(\Delta (S)) + 4 \Psi _0(\Delta (S)) ^2 + 2 \Psi _1(\Delta (S)) + \Psi _1(\Delta (S))^2 \right) \Vert X_n-{\widehat{X}}\Vert \end{aligned}$$

\(\Box \)

Theorem 6

Let \({\widehat{X}}\) be a Hermitian solution of (2) such that \(A-D{\widehat{X}}\) is stable. If \(X_0 \in B({\widehat{X}},S)\) with \( \beta \Vert D\Vert S < 0.14\), then the sequence \(\{X_k\}\), given by the M5 method (5), converges to \({\widehat{X}}\).

Proof

Note that if \(\beta \Vert D\Vert S < 0.14\), then \(\Psi _2(\Delta (S)) <1\), and therefore \(\{\Vert X_k - {\widehat{X}} \Vert \}\) is a strictly decreasing sequence of positive real numbers. Therefore, the thesis is obtained. \(\Box \)
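The constant 0.14 can be checked numerically: the sketch below (our verification, not part of the paper) locates the root of \(\Psi _2(\Delta (s))=1\) in terms of \(s=\beta \Vert D\Vert S\).

```python
from scipy.optimize import brentq

# Auxiliary functions of Lemma 5 and the contraction factor of Theorem 6.
Delta = lambda s: s / (1 - 2 * s)
Psi0 = lambda t: 3 * t
Psi1 = lambda t: Psi0(t) * (4 + 10 * t + 3 * t ** 2)
Psi2 = lambda t: (t / 5) * (8 * Psi0(t) + 4 * Psi0(t) ** 2
                            + 2 * Psi1(t) + Psi1(t) ** 2)

# Largest s with Psi2(Delta(s)) < 1; prints roughly 0.1403.
print(brentq(lambda s: Psi2(Delta(s)) - 1.0, 0.01, 0.25))
```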

3.2 Accessibility

Notice that the local convergence result, Theorem 6, is based on a condition on \(\beta \Vert D\Vert S\). This result provides the so-called domain of parameters [1] corresponding to the required conditions that guarantee the convergence of the sequence \(\{X_n\}\) to a solution \({\widehat{X}}\) of the symmetric ARE (2). In this case, the condition is \(\beta \Vert D\Vert S < 0.14\). Therefore, the domain of parameters

$$\begin{aligned} D_{M5}= \{(x,y) \in {\mathbb {R}}^2: x y < 0.14\} \end{aligned}$$
(9)

measures the accessibility of the M5 method, taking \(x=\beta \Vert D\Vert \) and \(y=S\). That is, the location of starting approximations, \(X_0\), from which the M5 method converges to a solution of symmetric ARE, (2). Thus, the accessibility of the M5 method from Theorem 6 is given by its domain of parameters, that we can see in Fig. 3.

Fig. 3. Domain of parameters of the M5 method given by Theorem 6

It is clear that, to compare the accessibility of two iterative schemes, we must consider two local convergence results under the same conditions on the operator considered. An interesting local result for Newton's method is given in [13]:

Theorem 7

Let \({\widehat{X}}\) be a Hermitian solution of (2) such that \(A-D{\widehat{X}}\) is stable. Then, the sequence \(\{X_n\}\) generated by the Newton method (4) converges to \({\widehat{X}}\) from every matrix \(X_0\in B({\widehat{X}}, S)\), with \( \beta \Vert D\Vert S < \frac{1}{3 }\).

Thus, if we consider the previous result for the Newton method (4), under the same conditions as Theorem 6, its domain of parameters is given by

$$\begin{aligned} D_N= \{(x,y) \in {\mathbb {R}}^2: x y < \frac{1}{3 } \} \end{aligned}$$
(10)

As we can see in Fig. 4, there is an important difference between the domains of parameters associated with these two methods. Therefore, we can affirm that the accessibility of Newton's method is greater than that of the M5 method.

Fig. 4. Domain of parameters of the M5 and Newton methods

3.3 A Predictor–Corrector Iterative Scheme

As we have obtained previously, the iterative scheme M5 has better computational efficiency than Newton's method. However, our analysis of accessibility has shown that the iterative scheme M5 has a reduced accessibility region, contrary to what happens with Newton's method. Next, we build an iterative scheme that uses the best features of both the M5 and Newton iterative schemes. Thus, we consider a two-stage predictor–corrector iterative scheme: first, Newton's method, which has a wide accessibility region, is applied, and second, the M5 method, which has better computational efficiency than Newton's method, is applied:

$$\begin{aligned} \left\{ \begin{array}{l} \left\{ \begin{array}{l} \mathrm {Given \, \, an \, \, initial \, \, guess \, \,} X_0\in {\mathcal {H}},\\[1ex] X_{j+1}= X_j - [{\mathcal {R}}'(X_j)]^{-1}{\mathcal {R}}(X_j), \quad j=0,1,\ldots ,N_{0}, \end{array} \right. \\[3ex] \left\{ \begin{array}{l} W_0 = X_{N_0 + 1},\\[1ex] Y_{k}= W_k - [{\mathcal {R}}'(W_k)]^{-1}{\mathcal {R}}(W_k), \\[1ex] Z_{k}= Y_k - 5[{\mathcal {R}}'(W_k)]^{-1}{\mathcal {R}}(Y_k), \\[1ex] W_{k+1} = Z_k - \displaystyle \frac{1}{5} [{\mathcal {R}}'(W_k)]^{-1}({\mathcal {R}}(Z_k)-16 {\mathcal {R}}(Y_k)), \,\,k\geqslant 0. \end{array} \right. \end{array} \right. \end{aligned}$$
(11)

whose expression, in terms of the Lyapunov equations solved in its practical application, is given as follows:

$$\begin{aligned} \left\{ \begin{array}{l} \left\{ \begin{array}{l} \mathrm {Given \, \, an \, \, initial \, \, guess \, \,} X_0\in {\mathcal {H}},\\[1ex] {\mathcal {L}}_{A-DX_j}(X_{j+1}) = ({\mathcal {P}}_D(X_{j})+C),\quad j=0,1,\ldots ,N_{0}, \end{array} \right. \\[3ex] \left\{ \begin{array}{l} W_0=X_{N_0+1},\\[1ex] {\mathcal {L}}_{A-DW_k}(Y_k)= ({\mathcal {P}} _D(W_k) + C), \\ {\mathcal {L}}_{A-DW_k}(Z _k)= ({\mathcal {P}} _D(W_k) + C) + 5 {\mathcal {P}} _D(Y_k-W_k), \\ {\mathcal {L}}_{A-DW_k}(W _{k+1})= ({\mathcal {P}} _D(W_k) + C)+ {\mathcal {P}} _D(Y_k-W_k) \\ - \displaystyle \frac{1}{5} \left[ (Y_k-W_k)D(Z_k-Y_k)+(Z_k-Y_k)D(Y_k-W_k)+{\mathcal {P}} _D(Z_k-Y_k) \right] , k\geqslant 0. \end{array} \right. \end{array} \right. \end{aligned}$$
(12)

Obviously, the study of the local convergence of this iterative scheme requires establishing the existence of a value \(N_0\) such that we can ensure that the iterate \(W_0\) is in the accessibility region of the iterative scheme M5.

Let \({\widehat{X}}\) be a Hermitian solution of (2) such that \(A-D{\widehat{X}}\) is stable, and let \(X_0 \in B({\widehat{X}},S)\) with \(\beta ||D|| S < 1/3\), so that \(X_0\) corresponds to \(D_N\), given in (10).

First, from Lemma 3, for \( n\geqslant 1\), if \(X_n \in B({\widehat{X}},S)\) with \(\beta ||D|| S < 1/3\), then the operator \([{\mathcal {R}}'(X_n)]^{-1}\) exists. Second, from the Taylor series we have:

$$\begin{aligned} 0={\mathcal {R}} ({\widehat{X}})={\mathcal {R}} (X_n)+{\mathcal {R}}' (X_n)({\widehat{X}}-X_n)+ ({\widehat{X}}-X_n)D ({\widehat{X}}-X_n). \end{aligned}$$

Thirdly, from the algorithm of Newton’s iterative scheme (4), we have:

$$\begin{aligned} X_{n+1}-{\widehat{X}}=(X_n-{\widehat{X}})-[{\mathcal {R}}' (X_n)]^{-1}{\mathcal {R}} (X_n). \end{aligned}$$

Thus, the following decomposition is verified:

$$\begin{aligned} X_{n+1}-{\widehat{X}}= & {} (X_n-{\widehat{X}})+({\widehat{X}}-X_n)+[{\mathcal {R}}' (X_n)]^{-1} ({\widehat{X}}-X_n)D ({\widehat{X}}-X_n)\\= & {} [{\mathcal {R}}' (X_n)]^{-1} ({\widehat{X}}-X_n)D ({\widehat{X}}-X_n). \end{aligned}$$

Hence, taking norms, if \(X_n \in B({\widehat{X}},S)\), we obtain

$$\begin{aligned} \Vert X_{n+1}-{\widehat{X}} \Vert \le \dfrac{\beta ||D|| S}{ 1 - 2 \beta ||D|| S} \Vert {\widehat{X}}-X_n\Vert = \Delta (S) \Vert {\widehat{X}}-X_n\Vert . \end{aligned}$$

Therefore, under the previous conditions, for \(n=0,\) we have

$$\begin{aligned} \Vert X_{1}-{\widehat{X}} \Vert \le \dfrac{\beta ||D|| S}{ 1 - 2 \beta ||D|| S} \Vert {\widehat{X}}-X_0\Vert = \Delta (S) \Vert {\widehat{X}}-X_0\Vert . \end{aligned}$$

As \(\Delta (S) < 1,\) then \(X_1 \in B({\widehat{X}},S)\). To continue, by applying an inductive procedure, it follows that \(X_n \in B({\widehat{X}},S)\) for all \( n\geqslant 1\). Moreover,

$$\begin{aligned} \Vert X_{n+1}-{\widehat{X}} \Vert\le & {} \dfrac{\beta ||D|| S}{ 1 - 2 \beta ||D|| S} \Vert {\widehat{X}}-X_n\Vert = \Delta (S) \Vert {\widehat{X}}-X_n\Vert \\< & {} \Big [\Delta (S)\Big ]^{n+1} \Vert {\widehat{X}}-X_0\Vert . \end{aligned}$$

Now, as \(\Delta (S) < 1,\) there will be a value \(N_0 \in {\mathbb {N}}\) such that

$$\begin{aligned} \Vert X_{N_0+1}-{\widehat{X}} \Vert\le & {} \Delta (S) \Vert {\widehat{X}}-X_{N_0}\Vert< \Big [\Delta (S)\Big ]^{N_0+1} \Vert {\widehat{X}}-X_0\Vert \\< & {} \Big [\Delta (S)\Big ]^{N_0+1} \dfrac{1}{3 \beta ||D||} \le \dfrac{0.14}{\beta ||D||}. \end{aligned}$$

Then \(W_0 = X_{N_0 +1} \in D_{M5},\) given in (9), and then the sequence \(\{ W_n\}\) converges to \({\widehat{X}}\).

In this way, we have obtained the following local convergence result for the iterative scheme given in (11).

Theorem 8

Let \({\widehat{X}}\) be a Hermitian solution of (2) such that \(A-D{\widehat{X}}\) is stable. Then, the sequence \(\{W_n\}\) generated by the iterative scheme (11) converges to \({\widehat{X}}\) from every matrix \(X_0\in B({\widehat{X}}, S)\), with \( \beta \Vert D\Vert S < \frac{1}{3 }\).

As we can see, from this result, the iterative scheme (11) has the same accessibility as Newton’s method, maintaining, except for the first \(N_0\) iterations, the same computational efficiency as the iterative scheme M5.
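Combining the pieces sketched above, the predictor–corrector scheme (11) can be written as follows (our illustration; it reuses `solve_lyap` and `m5_step` from the sketches in Sect. 2, and the name `predictor_corrector` is ours). The number of predictor steps can be estimated from the derivation above, since it suffices that \([\Delta (S)]^{N_0+1}\leqslant 0.42\).

```python
import numpy as np

def predictor_corrector(A, D, C, X0, n0, tol=1e-12, maxit=20):
    """Scheme (11): n0 + 1 Newton predictor steps (j = 0, ..., n0), then
    M5 corrector steps; a sketch, not the paper's code."""
    R = lambda T: T @ D @ T - T @ A - A.conj().T @ T - C
    X = X0
    for _ in range(n0 + 1):                  # predictor: Newton's method
        X = X + solve_lyap(A - D @ X, R(X))  # X - [R'(X)]^{-1} R(X)
    W = X                                    # W_0 = X_{n0 + 1}
    for _ in range(maxit):                   # corrector: M5
        W_new = m5_step(A, D, C, W)
        if np.linalg.norm(W_new - W) <= tol:
            return W_new
        W = W_new
    return W
```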

4 A Second Improvement for M5 Method

As we have already seen in the case of the iterative scheme M5 (5), high-order iterative schemes are known to have a small region of accessibility associated with them. Therefore, locating starting points for them is a difficult problem. For this reason, in this new improvement, we maintain the idea of improving the accessibility of the M5 method in the first stage of the predictor–corrector iterative scheme, but, in this case, we also intend to reduce the operational cost. In the first improvement of the M5 method, established in the previous section, we considered Newton's method as the predictor method. That choice improved accessibility, but its application requires the resolution of a new Lyapunov equation in each of the \(N_0\) iterations that must be carried out. Hence, we propose a second improvement of the M5 method in which we consider the well-known modified Newton method [15]. This method also improves accessibility and only needs to solve one Lyapunov equation, with different right-hand sides, in the \(N_0\) iterations that must be carried out, which significantly reduces the operational cost of the predictor–corrector iterative scheme. Thus, we consider the modified Newton method as the predictor iterative scheme:

$$\begin{aligned} \left\{ \begin{array}{l} X_0\, \text{ given },\\[1ex] X_{n+1} = X_n-[{\mathcal {R}}'(X_0)]^{-1}{\mathcal {R}}(X_n),\quad n\ge 0,\\[1ex] \end{array} \right. \end{aligned}$$
(13)

Obtaining each new iterate of the modified Newton method (13) is equivalent to solving the Lyapunov equation [18]:

$$\begin{aligned} {\mathcal {L}}_{A-DX_0}(X_{n+1})= & {} X_{n+1}(A-DX_0)+(A-DX_0)^{*}X_{n+1}\nonumber \\= & {} C+X_nD\left( \frac{X_n}{2}+X_0\right) +\left( \frac{X_n}{2}+X_0\right) DX_n, \end{aligned}$$
(14)

where, as before, \({\mathcal {P}}_D:{\mathcal {H}} \rightarrow {\mathcal {H}}\) denotes the operator \({\mathcal {P}} _D(X):=XDX\).

Therefore, the modified Newton method given in (13) has a low operational cost: the cost of solving one Lyapunov equation (14) with \(N_0\) different right-hand sides. Below, we show that it has a good accessibility region, although smaller than that of Newton's method. Its drawback is its linear convergence, whose influence depends on the number of iterations performed in the first stage of the predictor–corrector method.
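The computational point is that the coefficient matrix \(A-DX_0\) is fixed, so its factorization can be computed once and reused across all the steps; a sketch (our illustration, building on `lyap_solver_factory` from Sect. 2.1):

```python
import numpy as np

def modified_newton(A, D, C, X0, n_steps):
    """Modified Newton (13): the operator R'(X_0) is fixed, so the Schur
    factorization of A - D X_0 is computed once (via lyap_solver_factory
    from Sect. 2.1) and reused for every step."""
    R = lambda T: T @ D @ T - T @ A - A.conj().T @ T - C
    solve = lyap_solver_factory(A - D @ X0)  # factor once
    X = X0
    for _ in range(n_steps):
        X = X + solve(R(X))                  # X - [R'(X_0)]^{-1} R(X)
    return X
```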

Next, we study the accessibility region of the method.

Theorem 9

Let \({\widehat{X}}\) be a Hermitian solution of (2) such that \(A-D{\widehat{X}}\) is stable. If \(X_0 \in B({\widehat{X}},S)\) with \(\beta \Vert D\Vert S < 0.2\), then the sequence \(\{X_k\}\), given by the modified Newton method (13), converges to \({\widehat{X}}\).

Proof

Notice that, from Lemma 3, if \(X_0 \in B({\widehat{X}},S)\) with \(\beta ||D|| S < 0.2\), then there exists the operator \([{\mathcal {R}}'(X_0)]^{-1}\).

Now, from the algorithm of the modified Newton method (13), we have:

$$\begin{aligned} X_{n+1}-{\widehat{X}}= & {} (X_n-{\widehat{X}})-[{\mathcal {R}}' (X_0)]^{-1}{\mathcal {R}} (X_n)\\= & {} [{\mathcal {R}}' (X_0)]^{-1} \Big ([{\mathcal {R}}' (X_0)] (X_n-{\widehat{X}}) - {\mathcal {R}} (X_n) \Big ). \end{aligned}$$

Next, taking into account the Taylor series, we have:

$$\begin{aligned} {\mathcal {R}} (X_n)={\mathcal {R}} ({\widehat{X}})+{\mathcal {R}}' ({\widehat{X}})(X_n -{\widehat{X}})+ (X_n -{\widehat{X}})D (X_n -{\widehat{X}}). \end{aligned}$$

Thus, the following decomposition is verified:

$$\begin{aligned} X_{n+1}-{\widehat{X}}= & {} [{\mathcal {R}}' (X_0)]^{-1} \Big (\Big ([{\mathcal {R}}' (X_0)] - [{\mathcal {R}}' ({\widehat{X}})]\Big )\\{} & {} (X_n-{\widehat{X}}) - (X_n-{\widehat{X}}) D (X_n-{\widehat{X}}) \Big ). \end{aligned}$$

Thus, taking norms, if \(X_n \in B({\widehat{X}},S)\), we obtain

$$\begin{aligned} \Vert X_{n+1}-{\widehat{X}} \Vert \le \dfrac{ 3\beta ||D|| S}{ 1 - 2 \beta ||D|| S} \Vert X_n - {\widehat{X}}\Vert = 3 \Delta (S) \Vert X_n - {\widehat{X}}\Vert . \end{aligned}$$

Hence, if we consider \(n=0\), we obtain that

$$\begin{aligned} \Vert X_{1}-{\widehat{X}} \Vert \le \dfrac{ 3\beta ||D|| S}{ 1 - 2 \beta ||D|| S} \Vert X_0 - {\widehat{X}}\Vert = 3 \Delta (S) \Vert X_0 - {\widehat{X}}\Vert , \end{aligned}$$

and as \(3 \Delta (S) < 1,\) since \(\beta ||D|| S < 0.2\), then \(X_1 \in B({\widehat{X}},S)\) with \(\Vert X_{1}-{\widehat{X}} \Vert < \Vert X_0 - {\widehat{X}}\Vert \). To continue, by applying an inductive procedure, it follows that \(X_{n+1} \in B({\widehat{X}},S)\) and \(\Vert X_{n+1}-{\widehat{X}} \Vert < \Vert X_n - {\widehat{X}}\Vert \) for all \( n\geqslant 1\). Therefore, \(\{ \Vert X_{n+1}-{\widehat{X}}\Vert \}\) is a strictly decreasing sequence of positive real numbers and then the result is proved. \(\Box \)

From the previous Theorem, we obtain that the domain of parameters for the modified Newton method is given by

$$\begin{aligned} D_{MN}= \{(x,y) \in {\mathbb {R}}^2: x y < 0.2 \}. \end{aligned}$$

Therefore, the domain of parameters for the modified Newton method is smaller than the domain of parameters for Newton’s method given in (10), but it is greater than that of the M5 method. In this case, \(D_{M5} \subset D_{MN} \subset D_N\), see Fig. 5.

Fig. 5. Domains of parameters of the M5, the modified Newton and Newton methods

Therefore, the modified Newton method is an interesting iterative scheme to consider as a predictor scheme: it has better accessibility than the M5 method and a lower operational cost than Newton's method.

4.1 A Predictor–Corrector Iterative Scheme

From the previous comments, to obtain an efficient iterative scheme with a suitable region of accessibility, we consider the predictor–corrector iterative scheme given by

$$\begin{aligned} \left\{ \begin{array}{l} \left\{ \begin{array}{l} \mathrm {Given \, \, an \, \, initial \, \, guess \, \,} X_0\in {\mathcal {H}},\\[1ex] X_{j+1}= X_j - [{\mathcal {R}}'(X_0)]^{-1}{\mathcal {R}}(X_j), \quad j=0,1,\ldots ,N_{0}, \end{array} \right. \\[3ex] \left\{ \begin{array}{l} W_0 = X_{N_0 + 1},\\[1ex] Y_{k}= W_k - [{\mathcal {R}}'(W_k)]^{-1}{\mathcal {R}}(W_k), \\[1ex] Z_{k}= Y_k - 5[{\mathcal {R}}'(W_k)]^{-1}{\mathcal {R}}(Y_k), \\[1ex] W_{k+1} = Z_k - \displaystyle \frac{1}{5} [{\mathcal {R}}'(W_k)]^{-1}({\mathcal {R}}(Z_k)-16 {\mathcal {R}}(Y_k)), \,\,k\geqslant 0. \end{array} \right. \end{array} \right. \end{aligned}$$
(15)

whose expression, in terms of the Lyapunov equations solved in its practical application, is given as follows:

$$\begin{aligned} \left\{ \begin{array}{l} \left\{ \begin{array}{l} \mathrm {Given \, \, an \, \, initial \, \, guess \, \,} X_0\in {\mathcal {H}},\\[1ex] {\mathcal {L}}_{A-DX_0}(X_{j+1}) = C+X_jD\left( \frac{X_j}{2}+X_0\right) \\ \qquad +\left( \frac{X_j}{2}+X_0\right) DX_j,\quad j=0,1,\ldots ,N_{0}, \end{array} \right. \\[3ex] \left\{ \begin{array}{l} W_0=X_{N_0+1},\\[1ex] {\mathcal {L}}_{A-DW_k}(Y_k)= ({\mathcal {P}} _D(W_k) + C), \\ {\mathcal {L}}_{A-DW_k}(Z _k)= ({\mathcal {P}} _D(W_k) + C) + 5 {\mathcal {P}} _D(Y_k-W_k), \\ {\mathcal {L}}_{A-DW_k}(W _{k+1})= ({\mathcal {P}} _D(W_k) + C)+ {\mathcal {P}} _D(Y_k-W_k) \\ - \displaystyle \frac{1}{5} \left[ (Y_k-W_k)D(Z_k-Y_k)+(Z_k-Y_k)D(Y_k-W_k)\right. \\ \left. \qquad \quad +{\mathcal {P}} _D(Z_k-Y_k) \right] , k\geqslant 0. \end{array} \right. \end{array} \right. \end{aligned}$$
(16)

Next, we study the local convergence of the iterative scheme (15). If \(X_0 \in B({\widehat{X}},S)\) satisfies the condition \(\beta ||D|| S < 0.2\), and \(W_{0}=X_{N_0+1} \in B({\widehat{X}},{\widetilde{S}})\) satisfies the condition \(\beta ||D|| {\widetilde{S}} < 0.14\), then the sequence \(\{W_n\}\), given by the M5 method, converges to a solution \({\widehat{X}}\) of the symmetric ARE (2). We formalize this in the following result.

Theorem 10

Let \({\widehat{X}}\) be a Hermitian solution of (2) such that \(A-D{\widehat{X}}\) is stable. Then, the sequence \(\{W_n\}\) generated by the iterative scheme (15) converges to \({\widehat{X}}\) from every matrix \(X_0\in B({\widehat{X}}, S)\), with \( \beta \Vert D\Vert S < 0.2\).

Table 1 Iterations, errors, residuals and operational cost for the Newton method and the predictor–corrector method (12), with \(X_0=\frac{-\Vert A\Vert +\sqrt{\Vert A\Vert ^2-\Vert D\Vert \Vert C\Vert }}{\Vert D\Vert }\)
Table 2 Iterations, errors, residuals and operational cost for the Newton method and predictor–corrector method (16), with \(X_0=3I_9/2\)

Proof

Notice that, from Lemma 3, if \(X_0 \in B({\widehat{X}},S)\), with \(\beta ||D|| S < 0.2\), then there exists the operator \([{\mathcal {R}}'(X_0)]^{-1}\). Moreover, proceeding as in the proof of Theorem 9, we have

$$\begin{aligned} \Vert X_{1}-{\widehat{X}} \Vert \le 3 \Delta (S) \Vert X_0 - {\widehat{X}}\Vert , \end{aligned}$$

then \(X_1 \in B({\widehat{X}},3 \Delta (S) S) \subset B({\widehat{X}},S)\) since \(3 \Delta (S) <1.\)

To continue, by applying an inductive procedure, as in the proof of Theorem 9, it follows that \(X_{n+1} \in B({\widehat{X}}, (3 \Delta (S))^{n+1} S)\) for all \( n\geqslant 1\). Therefore, there exists \(N_0 \in {\mathbb {N}}\) such that \(X_{N_0 +1} \in B({\widehat{X}}, {\widetilde{S}})\) with \( \beta ||D|| {\widetilde{S}} < 0.14\), so that \(W_0 = X_{N_0 +1}\) lies in the accessibility region \(D_{M5}\). Thus, the sequence \(\{W_n\}\), given by the M5 method, converges to \({\widehat{X}}\).

\(\Box \)

5 Numerical Experiments

We show a numerical experiment related to the control of a tubular ammonia reactor, which can be found in the benchmark collection for CAREs [22]. We apply Newton's method and the constructed predictor–corrector methods to approximate a solution of Eq. (2) related to a ninth-order discrete state-space model of a tubular ammonia reactor, where the matrices A and B are the following:

$$\begin{aligned} A= \left( \begin{array}{ccccccccc} -4.019&{} 5.12&{} 0&{} 0&{} -2.082&{} 0&{} 0&{} 0 &{} 0.87\\ -0.346&{} 0.986&{} 0&{} 0&{} -2.34&{} 0&{} 0&{} 0&{} 0.97\\ -7.909&{}15.407&{} -4.069&{} 0&{} -6.45&{} 0&{} 0&{} 0&{}2.68\\ -21.816&{} 35.606&{}-0.339&{} -3.87&{} -17.8&{} 0&{} 0&{} 0&{} 7.39\\ -60.196&{} 98.188&{} -7.907&{}0.34&{} -53.008&{} 0&{} 0&{} 0&{} 20.4\\ 0&{} 0&{} 0&{} 0&{} 94.0&{} -147.2&{}0&{} 53.2&{} 0\\ 0&{} 0&{} 0&{} 0&{} 0&{} 94.0&{} -147.2&{}0&{} 0\\ 0&{} 0&{} 0&{} 0&{} 0&{} 12.8&{} 0&{} -31.6&{} 0\\ 0&{} 0&{} 0&{} 0&{} 12.8&{} 0&{} 0&{} 18.8&{} -31.6 \end{array}\right) , \end{aligned}$$

and

$$\begin{aligned} B^T= \left( \begin{array}{ccccccccc} 0.010&{} 0.003&{} 0.009&{} 0.024&{} 0.068&{} 0&{} 0&{} 0 &{} 0 \\ -0.011&{} -0.021&{} -0.059&{} -0.162&{} -0.445&{} 0&{} 0&{} 0&{} 0 \\ -0.151&{}0&{} 0&{} 0&{} 0&{} 0&{} 0&{} 0&{}0\\ \end{array}\right) , \end{aligned}$$

\(D=BB^T\) and C is the identity matrix of size 9.
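For reproducibility, the benchmark data and a reference solution can be set up as in the sketch below (our illustration; taking SciPy's stabilizing CARE solution as the reference \(X^{*}\) is an assumption on our part). Note that `solve_continuous_are` solves \(A^{T}X+XA-XBR^{-1}B^{T}X+Q=0\), which, for \(Q=C\), \(R=I\) and \(D=BB^{T}\), coincides with \(-{\mathcal {R}}(X)=0\) in (2).

```python
import numpy as np
from scipy import linalg

# Benchmark data transcribed from the matrices above.
A = np.array([
    [-4.019,   5.12,    0,      0,    -2.082,    0,      0,     0,    0.87],
    [-0.346,   0.986,   0,      0,    -2.34,     0,      0,     0,    0.97],
    [-7.909,  15.407,  -4.069,  0,    -6.45,     0,      0,     0,    2.68],
    [-21.816, 35.606,  -0.339, -3.87, -17.8,     0,      0,     0,    7.39],
    [-60.196, 98.188,  -7.907,  0.34, -53.008,   0,      0,     0,   20.4],
    [0,        0,       0,      0,    94.0,   -147.2,    0,    53.2,  0],
    [0,        0,       0,      0,     0,      94.0,  -147.2,   0,    0],
    [0,        0,       0,      0,     0,      12.8,     0,   -31.6,  0],
    [0,        0,       0,      0,    12.8,     0,       0,    18.8, -31.6]])
B = np.array([
    [ 0.010,  0.003,  0.009,  0.024,  0.068, 0, 0, 0, 0],
    [-0.011, -0.021, -0.059, -0.162, -0.445, 0, 0, 0, 0],
    [-0.151,  0,      0,      0,      0,     0, 0, 0, 0]]).T
D, C = B @ B.T, np.eye(9)

# Stabilizing reference solution and its residual in equation (2).
X_ref = linalg.solve_continuous_are(A, B, C, np.eye(3))
print(np.linalg.norm(X_ref @ D @ X_ref - X_ref @ A - A.T @ X_ref - C))
```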

The numerical experiment shows the high precision and accuracy of the constructed predictor–corrector methods compared to the well-known Newton method. Newton's method is the iterative process commonly used to approximate solutions of nonlinear equations, in particular Eq. (2). Tables 1 and 2 show the number of iterations, the errors \(\Vert X^{*}-X_k \Vert _F\), with stopping criterion \(\Vert X^{*}-X_k \Vert _F < 10^{-40}\), and the residuals \(\Vert {\mathcal {R}}(X_k)\Vert _F\) of the iterative processes considered.

First, we consider the starting matrix \(X_0=\frac{-\Vert A\Vert +\sqrt{\Vert A\Vert ^2-\Vert D\Vert \Vert C\Vert }}{\Vert D\Vert }\). This starting matrix lies in the accessibility domain of Newton's method. Thus, performing one step of Newton's method, we already obtain a matrix that lies in the accessibility domain of the M5 method. Therefore, we take \(N_0=1\) for the predictor–corrector method (12). In Table 1, we can verify that the predictor–corrector method (12) obtains greater precision in a smaller number of iterations than Newton's method. In addition, we see that it uses a smaller number of operations. Therefore, we can say that the predictor–corrector method (12) improves on Newton's method.

Second, we consider \(X_0=3I_9/2\), where \(I_9\) is the identity matrix of size 9. This matrix lies in the accessibility domain of the modified Newton method, and, performing two steps of the modified Newton method, we already obtain a matrix that lies in the accessibility domain of the M5 method. Therefore, we take \(N_0=2\) for the predictor–corrector method (16). In Table 2, we can observe that, with fewer iterations, the predictor–corrector method (16) provides greater precision while requiring a smaller number of operations. Therefore, we can say that the predictor–corrector method (16) improves on Newton's method.

Finally, we can state that the constructed predictor–corrector methods, (12) and (16), improve on Newton's method for approximating a solution of Eq. (2).