# Adaptive Horizon Model Predictive Control and Al’brekht’s Method

**DOI:** https://doi.org/10.1007/978-1-4471-5102-9_100071-1


## Abstract

A standard way of finding a feedback law that stabilizes a control system to an operating point is to recast the problem as an infinite horizon optimal control problem. If the optimal cost and the optimal feedback can be found on a large domain around the operating point, then a Lyapunov argument can be used to verify the asymptotic stability of the closed loop dynamics. The problem with this approach is that it is usually very difficult to find the optimal cost and the optimal feedback on a large domain for nonlinear problems with or even without constraints, hence the increasing interest in Model Predictive Control (MPC). In standard MPC a finite horizon optimal control problem is solved in real time, but just at the current state; the first control action is implemented, the system evolves one time step, and the process is repeated. A terminal cost and terminal feedback found by Al’brekht’s method and defined in a neighborhood of the operating point can be used to shorten the horizon and thereby make the nonlinear programs easier to solve because they have fewer decision variables. Adaptive Horizon Model Predictive Control (AHMPC) is a scheme for varying the horizon length of Model Predictive Control as needed. Its goal is to achieve stabilization with horizons as small as possible so that MPC methods can be used on faster and/or more complicated dynamic processes.

## Keywords

Adaptive horizon · Model predictive control · Al’brekht’s method · Optimal stabilization

## Introduction

Model Predictive Control (MPC) is a way to steer a discrete time control system to a desired operating point. We will present an extension of MPC that we call Adaptive Horizon Model Predictive Control (AHMPC) (Krener 2018), which adjusts the length of the horizon in MPC while nearly verifying in real time that stabilization is occurring for a nonlinear system.

We are not the first to consider adaptively changing the horizon length in MPC; see Michalska and Mayne (1993), Polak and Yang (1993a, b, c). In these papers the horizon is changed so that a terminal constraint is satisfied by the predicted state at the end of horizon. In Giselsson (2010) the horizon length is adaptively changed to ensure that the infinite horizon cost of using the finite horizon MPC scheme is not much more than the cost of the corresponding infinite horizon optimal control problem.

Adaptive horizon tracking is discussed in Page et al. (2006) and Droge and Egerstedt (2011), and an algorithmic approach to reducing the horizon in NMPC is given in Pannek and Worthmann (2011). In Kim and Sugie (2008) an adaptive parameter estimation algorithm suitable for MPC was proposed, which uses the available input and output signals to estimate the unknown system parameters. In Gruene et al. (2010) a detailed analysis of the impact of the optimization horizon and the time varying control horizon on stability and performance of the closed loop is given.

## Review of Model Predictive Control

We briefly describe MPC following the definitive treatise of Rawlings and Mayne (2009). We largely follow their notation.

One is given a controlled, nonlinear dynamics in discrete time,

$$x^+ = f(x,u), \qquad\qquad (1)$$

where the state \(x\in \mathbb {R}^n\), the control \(u\in \mathbb {R}^m\), and *x*^{+}(*k*) = *x*(*k* + 1). Typically this is a time discretization of a controlled, nonlinear dynamics in continuous time. The goal is to find a feedback law *u*(*k*) = *κ*(*x*(*k*)) that drives the state of the system to some desired operating point. A pair (*x*^{e}, *u*^{e}) is an operating point if *f*(*x*^{e}, *u*^{e}) = *x*^{e}. We conveniently assume that, after state and control coordinate translations, the operating point of interest is (*x*^{e}, *u*^{e}) = (0, 0).

There may also be state, control, and mixed constraints,

$$x\in \mathbb {X}\subset \mathbb {R}^n, \qquad u\in \mathbb {U}\subset \mathbb {R}^m, \qquad h(x,u)\in \mathbb {Y}. \qquad\qquad (2,3,4)$$

A control *u* is said to be feasible at \(x\in \mathbb {X}\) if \(u\in \mathbb {U}\), \(h(x,u)\in \mathbb {Y}\), and \(f(x,u)\in \mathbb {X}\). The feedback *κ*(*x*) that we seek needs to be feasible, that is, for every \(x\in \mathbb {X}\), *κ*(*x*) must be feasible at *x*.

The standard way to find a stabilizing feedback is to choose a Lagrangian *l*(*x*, *u*) (aka running cost), that is, nonnegative definite in *x*, *u* and positive definite in *u*, and then to solve the infinite horizon optimal control problem of minimizing the quantity

$$\sum_{k=0}^{\infty} l(x(k),u(k)) \qquad\qquad (5)$$

over control sequences **u** = (*u*(0), *u*(1), …) subject to the dynamics (1), the constraints (2, 3, 4), and the initial condition *x*(0) = *x*^{0}. Assuming the minimum exists for each \(x^0\in \mathbb {X}\), we define the optimal cost function

$$V(x^0)=\min_{\mathbf {u}}\ \sum_{k=0}^{\infty} l(x(k),u(k)). \qquad\qquad (6)$$

Let **u**^{∗} = (*u*^{∗}(0), *u*^{∗}(1), …) be a minimizing control sequence with corresponding state sequence **x**^{∗} = (*x*^{∗}(0) = *x*^{0}, *x*^{∗}(1), …). Minimizing control and state sequences need not be unique, but we shall generally ignore this problem because we are using optimization as a path to stabilization. The key question is whether the possibly nonunique solution is stabilizing to the desired operating point. As we shall see, AHMPC nearly verifies stabilization in real time.

If the Bellman Dynamic Programming (BDP) equations

$$V(x) = \min _{u}\left \{ l(x,u)+V(f(x,u))\right \}, \qquad \kappa (x) = \operatorname*{argmin}_{u}\left \{ l(x,u)+V(f(x,u))\right \} \qquad\qquad (7)$$

can be solved on \( \mathbb {X}\), then *V*(*x*) is the optimal cost and *κ*(*x*) is an optimal feedback on \( \mathbb {X}\). Under suitable conditions a Lyapunov argument can be used to show that the feedback *κ*(*x*) is stabilizing.

The difficulty with this approach is that it is generally impossible to solve the BDP equations on a large domain \(\mathbb {X}\) if the state dimension *n* is greater than 2 or 3. So both theorists and practitioners have turned to Model Predictive Control (MPC).

In MPC one chooses a Lagrangian *l*(*x*, *u*), a horizon length *N*, a terminal domain \(\mathbb {X}_f\subset \mathbb {X}\) containing *x* = 0, and a terminal cost *V*_{f}(*x*) defined and positive definite on \(\mathbb {X}_f\). Then one considers the problem of minimizing

$$\sum_{k=0}^{N-1} l(x(k),u(k)) + V_f(x(N)) \qquad\qquad (8)$$

over \({\mathbf {u}}_N=(u(0),\ldots ,u(N-1))\) subject to the dynamics (1), the constraints (2, 3, 4), and the initial condition *x*(0) = *x*^{0}. Assuming this problem is solvable, let

$$V_N(x^0) = \min_{{\mathbf {u}}_N}\ \sum_{k=0}^{N-1} l(x(k),u(k)) + V_f(x(N)) \qquad\qquad (9)$$

denote the optimal cost from *x*^{0} when the horizon length is *N*, and let \({\mathbf {u}}^*_N=(u^*_N(0),\ldots ,u^*_N(N-1))\) be a minimizing control sequence. We then define the MPC feedback law \( \kappa _N(x^0)= u_N^*(0) \).

The terminal set \(\mathbb {X}_f\) is controlled invariant (aka viable) if for each \(x\in \mathbb {X}_f\) there exists a \(u\in \mathbb {U}\) such that \(f(x,u)\in \mathbb {X}_f\) and \(h(x,u) \in \mathbb {Y}\) are satisfied. If *V*_{f}(*x*) is a control Lyapunov function on \(\mathbb {X}_f\), then, under suitable conditions, a Lyapunov argument can be used to show that the feedback *κ*_{N}(*x*) is stabilizing on \(\mathbb {X}_f\). See Rawlings and Mayne (2009) for more details.

AHMPC also requires a terminal feedback *u* = *κ*_{f}(*x*) defined on the terminal set \(\mathbb {X}_f\) that leaves it positively invariant, if \(x\in \mathbb {X}_f\) then \(f(x,\kappa _f(x)) \in \mathbb {X}_f\), and which makes *V*_{f}(*x*) a strict Lyapunov function on \(\mathbb {X}_f\) for the closed loop dynamics, if \(x \in \mathbb {X}_f\) and *x*≠0 then

$$V_f(f(x,\kappa _f(x))) < V_f(x).$$

If \(x\in \mathbb {X}_f\), then AHMPC does not need to solve (9) to get *u*; it just takes *u* = *κ*_{f}(*x*). A similar scheme was called dual mode control in Michalska and Mayne (1993).

The advantage of solving the finite horizon optimal control problem (9) over solving the infinite horizon problem (6) is that it may be possible to solve the former online as the process evolves. If it is known that the terminal set \(\mathbb {X}_f\) can be reached from the current state *x* in *N* or fewer steps, then the finite horizon *N* optimal control problem is a feasible nonlinear program with finite dimensional decision variable \({\mathbf {u}}_N\in \mathbb {R}^{m\times N}\). If the time step is long enough, if *f*, *h*, *l* are reasonably simple, and if *N* is small enough, then this nonlinear program possibly can be solved in a fraction of one time step for \({\mathbf {u}}^*_N\). Then the first element of this sequence \(u_N^*(0)\) is used as the control at the current time. The system evolves one time step, and the process is repeated at the next time. Conceptually MPC computes a feedback law \(\kappa _N(x) =u^*_N(0)\) but only at values of *x* when and where it is needed.
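The receding horizon loop just described can be sketched concretely. The following is a minimal sketch, not the author's code, on a hypothetical scalar linear plant with quadratic costs (assumed values *a* = 1.2, *b* = 1, *q* = *r* = 1, *p*_{f} = 2); for this linear-quadratic toy each horizon-*N* subproblem can be solved exactly by a backward Riccati sweep, which stands in for the nonlinear program of real MPC:

```python
# Receding-horizon MPC sketch on a hypothetical scalar plant x+ = a*x + b*u with
# running cost l(x,u) = (q*x^2 + r*u^2)/2 and terminal cost Vf(x) = pf*x^2/2.
# Each horizon-N subproblem is solved exactly by a backward Riccati sweep,
# standing in for the nonlinear program solved in real MPC.
a, b = 1.2, 1.0              # open-loop unstable dynamics (assumed values)
q, r, pf = 1.0, 1.0, 2.0     # cost weights (assumed values)

def first_control(x, N):
    """Solve the horizon-N problem from state x and return u*_N(0)."""
    p = pf                                       # cost-to-go weight at k = N
    for _ in range(N):                           # backward sweep k = N-1, ..., 0
        gain = -(a * b * p) / (r + b * b * p)
        p = q + a * a * p + a * b * p * gain     # scalar Riccati update
    return gain * x                              # the last gain acts at k = 0

# Receding horizon loop: apply only the first control, advance, and repeat.
x, N = 5.0, 8
traj = [x]
for _ in range(40):
    x = a * x + b * first_control(x, N)
    traj.append(x)
```

The essential MPC structure is the final loop: only \(u^*_N(0)\) is applied before the problem is re-solved from the new state.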

Some authors like Grimm et al. (2005) and Gruene (2012) do away with the terminal cost *V*_{f}(*x*), but there is a theoretical reason and a practical reason to use one. The theoretical reason is that a control Lyapunov terminal cost facilitates a proof of asymptotic stability via a simple Lyapunov argument; see Rawlings and Mayne (2009). But this is not a binding reason because under suitable assumptions, asymptotic stability can be shown even when there is no terminal cost provided the horizon is sufficiently long. The practical reason is more important; when there is a terminal cost one can usually use a shorter horizon *N*. A shorter horizon reduces the dimension *mN* of the decision variables in the nonlinear programs that need to be solved online. Therefore MPC with a suitable terminal cost can be used for faster and more complicated systems.

An ideal terminal cost *V*_{f}(*x*) is the *V* (*x*) of the corresponding infinite horizon optimal control problem, provided that the latter can be accurately computed off-line on a reasonably large terminal set \(\mathbb {X}_f\), for then the infinite horizon cost (6) and the finite horizon cost (9) will be the same. One should not make too much of this fact, as stabilization is our goal; the optimal control problems are just a means to accomplish it. This is in contrast to Economic MPC, where the cost and the associated Lagrangian are chosen to model real-world costs.

## Adaptive Horizon Model Predictive Control

AHMPC requires the following:

- A discrete time dynamics *f*(*x*, *u*) with operating point *x* = 0, *u* = 0.
- A Lagrangian *l*(*x*, *u*), nonnegative definite in (*x*, *u*) and positive definite in *u*.
- State constraints \(x\in \mathbb {X}\) where \(\mathbb {X}\) is a neighborhood of *x* = 0.
- Control constraints \(u\in \mathbb {U}\) where \(\mathbb {U}\) is a neighborhood of *u* = 0.
- Mixed constraints \( h(x,u)\in \mathbb {Y}\) which are not active at the operating point *x* = 0, *u* = 0.
- The dynamics is recursively feasible on \( \mathbb {X}\), that is, for every \(x\in \mathbb {X}\) there is a \(u\in \mathbb {U}\) satisfying \( h(x,u)\in \mathbb {Y}\) and \(f(x,u)\in \mathbb {X}\).
- A terminal cost *V*_{f}(*x*) defined and nonnegative definite on some neighborhood \(\mathbb {X}_f\) of the operating point *x* = 0, *u* = 0. The neighborhood \(\mathbb {X}_f\) need not be known explicitly.
- A terminal feedback *u* = *κ*_{f}(*x*) defined on \(\mathbb {X}_f\) such that the terminal cost is a valid Lyapunov function on \(\mathbb {X}_f\) for the closed loop dynamics using the terminal feedback *u* = *κ*_{f}(*x*).

One way of obtaining a terminal pair *V*_{f}(*x*), *κ*_{f}(*x*) is to approximately solve the infinite horizon dynamic program equations (BDP) on some neighborhood of the origin. For example, if the linear part of the dynamics and the quadratic part of the Lagrangian constitute a linear quadratic regulator (LQR) problem satisfying the standard conditions, then one can let *V*_{f}(*x*) be the quadratic optimal cost and *κ*_{f}(*x*) be the linear optimal feedback of this LQR problem. Of course the problem with such terminal pairs *V*_{f}(*x*), *κ*_{f}(*x*) is that generally there is no way to estimate the terminal set \(\mathbb {X}_f\) on which the feasibility and Lyapunov conditions are satisfied. It is reasonable to expect that they are satisfied on some terminal set but the extent of this terminal set is difficult to estimate.

In the next section we show how higher degree Taylor polynomials for the optimal cost and optimal feedback can be computed by the discrete time extension (Aguilar and Krener 2012) of Al’brekht’s method (Al’brecht 1961) because this can lead to a larger terminal set \(\mathbb {X}_f\) on which the feasibility and the Lyapunov conditions are satisfied. It would be very difficult to determine what this terminal set is, but fortunately in AHMPC we do not need to do this.

AHMPC mitigates this last difficulty just as MPC mitigates the problem of solving the infinite horizon Bellman Dynamic Programming equations BDP (7). MPC does not try to compute the optimal cost and optimal feedback everywhere; instead it computes them just when and where they are needed. AHMPC does not try to compute the set \(\mathbb {X}_f\) on which *κ*_{f}(*x*) is feasible and stabilizing; it just tries to determine if the end state \(x_N^*(N)\) of the currently computed optimal trajectory is in a terminal set \(\mathbb {X}_f\) where the feasibility and Lyapunov conditions are satisfied.

Suppose the current state is *x* and we have solved the horizon *N* optimal control problem for \( {\mathbf {u}}^*_N=(u^*_N(0),\ldots , u^*_N(N-1))\), \({\mathbf {x}}^*_N=(x^*_N(0)=x,\ldots , x^*_N(N))\). The terminal feedback *u* = *κ*_{f}(*x*) is used to compute *M* additional steps of this state trajectory,

$$x^*_N(k+1) = f\left ( x^*_N(k),\kappa _f(x^*_N(k))\right ), \qquad k=N,\ldots ,N+M-1. \qquad\qquad (10)$$

Then one checks that the feasibility and Lyapunov conditions (11–15) hold along the extension; in particular, the extended states and controls must satisfy the constraints, and the terminal cost must strictly decrease,

$$V_f(x^*_N(k+1)) - V_f(x^*_N(k)) \le -\alpha (|x^*_N(k)|),$$

for *k* = *N*, …, *N* + *M* − 1 and some Class K function *α*(*s*). For more on Class K functions, we refer the reader to Khalil (1996).

If (11–15) hold for all *k* = *N*, …, *N* + *M* − 1, then we presume that \(x^*_N(N)\in \mathbb {X}_f\), a set where *κ*_{f}(*x*) is stabilizing, and we use the control \(u_N^*(0)\) to move one time step forward to \(x^+=f(x, u^*_N(0))\). At this next state *x*^{+}, we solve the horizon *N* − 1 optimal control problem and check that the extension of the new optimal trajectory satisfies (11–15).

If (11–15) do not hold for all *k* = *N*, …, *N* + *M* − 1, then we presume that \(x^*_N(N)\notin \mathbb {X}_f\). We extend the current horizon *N* to *N* + *L* where *L* ≥ 1, and if time permits, we solve the horizon *N* + *L* optimal control problem at the current state *x* and then check the feasibility and Lyapunov conditions again. We keep increasing *N* by *L* until these conditions are satisfied on the extension of the trajectory. If we run out of time before they are satisfied, then we use the last computed \(u_N^*(0)\) and move one time step forward to \(x^+=f(x, u^*_N(0))\). At *x*^{+} we solve the horizon *N* + *L* optimal control problem.
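In outline, one AHMPC step can be written as follows. This is a schematic sketch, not the author's code: `solve_horizon`, `f`, `kf`, `Vf`, `alpha`, and `feasible` are hypothetical stand-ins for the nonlinear program solver, the plant, the terminal pair, a Class K function, and the feasibility tests (11–15); the toy demo uses a scalar plant and a stand-in "solver" that simply rolls out the terminal feedback:

```python
# Schematic AHMPC step with hypothetical helper names.

def lyapunov_test(x_end, f, kf, Vf, M, alpha, feasible):
    """Extend the end state M steps under u = kf(x); return True if the
    feasibility and Lyapunov-decrease conditions hold along the extension."""
    x = x_end
    for _ in range(M):
        u = kf(x)
        if not feasible(x, u):
            return False
        xp = f(x, u)
        if Vf(xp) - Vf(x) > -alpha(abs(x)):        # Vf must strictly decrease
            return False
        x = xp
    return True

def ahmpc_step(x, N, solve_horizon, f, kf, Vf, M, alpha, feasible,
               L=1, max_tries=3):
    """One AHMPC move: adapt the horizon N, then apply the first control."""
    for _ in range(max_tries):
        u_seq, x_seq = solve_horizon(x, N)
        if lyapunov_test(x_seq[-1], f, kf, Vf, M, alpha, feasible):
            return f(x, u_seq[0]), max(N - 1, 1)   # success: shrink N by 1
        N += L                                     # failure: lengthen horizon
    return f(x, u_seq[0]), N                       # out of time: advance anyway

# Toy demo: scalar unstable plant with a stabilizing terminal feedback, so the
# Lyapunov test always passes here and N shrinks step by step.
f  = lambda x, u: 1.2 * x + u
kf = lambda x: -0.8 * x                            # closed loop: x+ = 0.4*x
Vf = lambda x: x * x
alpha = lambda s: s * s / 10.0                     # one admissible Class K choice

def solve_horizon(x, N):                           # stand-in "solver"
    xs, us = [x], []
    for _ in range(N):
        us.append(kf(xs[-1]))
        xs.append(f(xs[-1], us[-1]))
    return us, xs

x, N = 1.0, 5
for _ in range(20):
    x, N = ahmpc_step(x, N, solve_horizon, f, kf, Vf, M=3, alpha=alpha,
                      feasible=lambda x, u: True)
```

In a real application `solve_horizon` is the nonlinear program of horizon *N* and the extension is computed with the precompiled closed loop dynamics, as discussed below.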

How does one choose the extended horizon *M* and the Class K function *α*(⋅)? If the extended part of the state sequence is actually in the region where the terminal cost *V*_{f}(*x*) and the terminal feedback *κ*_{f}(*x*) well approximate the solution to the infinite horizon optimal control problem, then the dynamic programming equations (7) should approximately hold; in other words,

$$V_f(x) - V_f(f(x,\kappa _f(x))) \approx l(x,\kappa _f(x))$$

along the extension. If this fails to hold, we can increase the horizon *N*. We can also increase the extended horizon *M*, but this may not be necessary. If the Lyapunov and feasibility conditions are going to fail somewhere on the extension, it is most likely that this will happen at the beginning of the extension. Also we should choose *α*(⋅) so that *α*(|*x*|) < |*l*(*x*, *κ*_{f}(*x*))|∕2.

The nonlinear programming problems generated by employing MPC on a nonlinear system are generally nonconvex, so the solver might return local rather than global minimizers, in which case there is no guarantee that an MPC approach is actually stabilizing. AHMPC mitigates this difficulty by nearly checking that stabilization is occurring. If (11–15) don’t hold even after the horizon *N* has been increased substantially, then this is a strong indication that the solver is returning locally rather than globally minimizing solutions and that these local solutions are not stabilizing. To change this behavior one needs to start the solver at a substantially different initial guess. Just how one does this is an open research question. It is essentially the same question as which initial guess one should pass to the solver at the first step of MPC.

The actual computation of the *M* additional steps (10) can be done very quickly because the closed loop dynamics function *f*(*x*, *κ*_{f}(*x*)) can be computed and compiled beforehand. Similarly the feasibility and Lyapunov conditions (11–15) can be computed and compiled beforehand. The number *M* of additional time steps is a design parameter. One choice is to take *M* a positive integer; another choice is a positive integer plus a fraction of the current *N*.

## Choosing a Terminal Cost and a Terminal Feedback

The simplest way to obtain a terminal cost *V*_{f}(*x*) and a terminal feedback *κ*_{f}(*x*) is to solve the linear quadratic regulator (LQR) problem defined by the quadratic part of the Lagrangian and the linear part of the dynamics around the operating point (*x*^{e}, *u*^{e}) = (0, 0). Suppose

$$f(x,u) = Fx+Gu+O(x,u)^2, \qquad l(x,u) = \frac{1}{2}\left ( x'Qx+2x'Su+u'Ru\right ) +O(x,u)^3.$$

The LQR problem is to find *P*, *K* such that

$$P = Q+F'PF-(F'PG+S)(R+G'PG)^{-1}(G'PF+S') \qquad\qquad (16)$$

$$K = -(R+G'PG)^{-1}(G'PF+S'). \qquad\qquad (17)$$

Under the stabilizability of *F*, *G*, the detectability of *Q*^{1∕2}, *F*, the nonnegative definiteness of [*Q*, *S*; *S′*, *R*], and the positive definiteness of *R*, there exists a unique nonnegative definite *P* satisfying the first equation (16), which is called the discrete time algebraic Riccati equation (DARE); see Antsaklis and Michel (1997). Then *K* given by (17) puts all the poles of the closed loop linear dynamics \(x^+=(F+GK)x\) inside the open unit disk. The Lagrangian *l*(*x*, *u*) is a design parameter, so we choose [*Q*, *S*; *S′*, *R*] to be positive definite, and then *P* will be positive definite.
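In the scalar case the DARE (16) and the gain (17) can be found by simple fixed-point iteration. A minimal sketch with assumed numbers and no cross term (*S* = 0):

```python
# Scalar illustration of (16) and (17) with S = 0 and assumed numbers: iterate
# the Riccati map to its unique nonnegative fixed point P, form the gain K, and
# check that the closed loop F + G*K is Schur stable.
F, G = 1.5, 1.0          # unstable linear dynamics (assumed values)
Q, R = 1.0, 1.0          # positive definite weights (assumed values)

P = 0.0
for _ in range(200):     # fixed-point (value) iteration of the DARE
    P = Q + F * P * F - (F * P * G) ** 2 / (R + G * P * G)

K = -(G * P * F) / (R + G * P * G)     # (17) specialized to scalars, S = 0
closed_loop = F + G * K                # must have magnitude < 1
```

For matrix problems one would instead call a dedicated solver, for example `scipy.linalg.solve_discrete_are` (an assumption about the available toolchain); the fixed-point iteration is shown only to make (16) concrete.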

If we take the terminal cost \(V_f(x)={1\over 2}x'Px\) and the terminal feedback *κ*_{f}(*x*) = *Kx*, then we know by a Lyapunov argument that the nonlinear closed loop dynamics

$$x^+ = f(x,Kx)$$

is locally asymptotically stable around *x*^{e} = 0. The problem is that we don’t know the neighborhood \(\mathbb {X}_f\) of asymptotic stability, and computing it off-line can be very difficult in state dimensions higher than two or three.

There are other possible choices for the terminal cost and terminal feedback. Al’brecht (1961) showed how the Taylor polynomials of the optimal cost and the optimal feedback could be computed for some smooth, infinite horizon optimal control problems in continuous time. Aguilar and Krener (2012) extended this to some smooth, infinite horizon optimal control problems in discrete time. The discrete time Taylor polynomials of the optimal cost and the optimal feedback may be used as the terminal cost and terminal feedback in an AHMPC scheme, so we briefly review (Aguilar and Krener 2012).

Since *f*(*x*, *u*) and *l*(*x*, *u*) are smooth and the constraints are not active at the origin, we can simplify the BDP equations. The simplified Bellman Dynamic Programming equations (sBDP) are obtained by setting the derivative with respect to *u* of the quantity to be minimized in (7) to zero. The result is

$$V(x) = l(x,\kappa (x))+V(f(x,\kappa (x))) \qquad\qquad (18)$$

$$0 = \frac{\partial }{\partial u}\Big ( l(x,u)+V(f(x,u))\Big )\Big |_{u=\kappa (x)}. \qquad\qquad (19)$$

If the quantity to be minimized is strictly convex in *u*, then the BDP equations and sBDP equations are equivalent. But if not, then BDP implies sBDP but not vice versa.

Following Al’brekht, we assume that the optimal cost and the optimal feedback have Taylor polynomial expansions around *x* = 0, *u* = 0 of the form

$$V(x) = V^{[2]}(x)+V^{[3]}(x)+\cdots +V^{[d+1]}(x)+O(x)^{d+2}$$

$$\kappa (x) = \kappa ^{[1]}(x)+\kappa ^{[2]}(x)+\cdots +\kappa ^{[d]}(x)+O(x)^{d+1}$$

for some *d* ≥ 1, where ^{[j]} indicates the homogeneous polynomial terms of degree *j*.

We plug these polynomials into sBDP and collect terms of lowest degree. The lowest degree in (18) is two while in (19) it is one. The result is the discrete time Riccati equations (16, 17).

Collecting the terms of the next degree in (18) and (19) yields equations (20) and (21), which are linear in the unknowns *V*^{[3]}(*x*) and *κ*^{[2]}(*x*), and the right sides of these equations involve only known quantities. Moreover *κ*^{[2]}(*x*) does not appear in the first equation (20). The left side of (20) involves the mapping

$$V^{[3]}(x)\ \mapsto\ V^{[3]}(x) - V^{[3]}((F+GK)x), \qquad\qquad (22)$$

a linear operator on the space of homogeneous cubic polynomials in *x*. Its eigenvalues are of the form of 1 minus the products of three eigenvalues of *F* + *GK*. Since the eigenvalues of *F* + *GK* are all inside the open unit disk, zero is not an eigenvalue of the operator (22), so it is invertible. Having solved (20) for *V*^{[3]}(*x*), we can readily solve (21) for *κ*^{[2]}(*x*) since we have assumed *R* is positive definite.
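The eigenvalue claim for the operator (22) is easy to check numerically on a small instance. The sketch below uses a hypothetical 2 × 2 Schur-stable upper-triangular closed loop matrix (so its eigenvalues sit on the diagonal), builds the matrix of \(p(z)\mapsto p(z)-p(Az)\) on the homogeneous cubic monomials, and confirms that its eigenvalues are 1 minus the products of three eigenvalues of *A*:

```python
from math import comb, isclose

# Hypothetical Schur-stable upper-triangular A = [[l1, c], [0, l2]], so the
# eigenvalues of A are l1 and l2.
l1, l2, c = 0.5, -0.4, 0.3
basis = [(3, 0), (2, 1), (1, 2), (0, 3)]          # z1^i * z2^j with i + j = 3

def compose(i, j):
    """Coefficients of (l1*z1 + c*z2)^i * (l2*z2)^j on the cubic basis."""
    out = {m: 0.0 for m in basis}
    for k in range(i + 1):                        # binomial expansion in z1
        out[(k, i - k + j)] += comb(i, k) * l1**k * c**(i - k) * l2**j
    return out

# Matrix of p -> p - p(Az); ordering the basis by descending z1-degree makes it
# lower triangular, so its eigenvalues are the diagonal entries.
M = [[(1.0 if basis[r] == basis[s] else 0.0) - compose(*basis[s])[basis[r]]
      for s in range(4)] for r in range(4)]

eigs = sorted(M[d][d] for d in range(4))
claim = sorted(1 - l1**p * l2**(3 - p) for p in range(4))
assert all(abs(M[r][s]) < 1e-12 for r in range(4) for s in range(r + 1, 4))
assert all(isclose(a, b) for a, b in zip(eigs, claim))
```

Since each diagonal entry \(1 - \lambda_1^p\lambda_2^{3-p}\) is nonzero whenever both eigenvalues lie in the open unit disk, the operator is invertible, which is exactly the claim in the text.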

The higher degree equations are similar. At degrees *k* + 1, *k* they take the form of linear equations whose right sides involve the Taylor coefficients of *f*, *l* and the previously computed *V*^{[i+1]}, *κ*^{[i]} for 1 ≤ *i* < *k*. Again the equations are linear in the unknowns *V*^{[k+1]}, *κ*^{[k]}, and the first equation does not involve *κ*^{[k]}. The eigenvalues of the linear operator

$$V^{[k+1]}(x)\ \mapsto\ V^{[k+1]}(x) - V^{[k+1]}((F+GK)x)$$

are 1 minus the products of *k* + 1 eigenvalues of *F* + *GK*, so again it is invertible.

We have written MATLAB code to solve these equations to any degree and in any dimensions. The code is quite fast. Later we shall present an example where *n* = 4 and *m* = 2; the code found the Taylor polynomials of the optimal cost to degree 6 and the optimal feedback to degree 5 in 0.12 s on a laptop with a 3.1 GHz Intel Core i5.

## Completing the Squares

The infinite horizon optimal cost is certainly nonnegative definite, and if we choose *Q* > 0 then it is positive definite. That implies that its quadratic part \(V^{[2]}(x)={1\over 2}x'Px\) is positive definite.

But the higher degree Taylor polynomial of the optimal cost, of degrees two through *d* + 1 with *d* > 1, need not be positive or even nonnegative definite. This can lead to problems if we define this Taylor polynomial to be our terminal cost *V*_{f}(*x*), because then the nonlinear program solver might return a negative cost *V*_{N}(*x*) (9). The way around this difficulty is to “complete the squares.”

### Theorem

Suppose a polynomial *V* (*x*) is of degrees two through *d* + 1 in *n* variables *x*_{1}, …, *x*_{n}. If the quadratic part of *V* (*x*) is positive definite, then there exists a nonnegative definite polynomial *W*(*x*) of degrees two through 2*d* such that the part of *W*(*x*) that is of degrees two through *d* + 1 equals *V* (*x*). Moreover, we know that *W*(*x*) is nonnegative definite because it is the sum of *n* squares.

### Proof

Consider the quadratic part of *V*(*x*); because it is positive definite it must be of the form \({1\over 2}x'Px\) where *P* is a positive definite *n* × *n* matrix. We know that there is an orthogonal matrix *T* that diagonalizes *P*,

$$T'PT = \Lambda = \mathrm{diag}(\lambda _1,\ldots ,\lambda _n),$$

with eigenvalues *λ*_{1} ≥ *λ*_{2} ≥… ≥ *λ*_{n} > 0. We make the linear change of coordinates *x* = *Tz*. We shall show that *V*(*z*) = *V*(*Tz*) can be extended with higher degree terms to a polynomial *W*(*z*) of degrees two through 2*d* which is a sum of *n* squares. We do this degree by degree. We have already showed that the degree two part of *V*(*z*) is a sum of *n* squares,

$$\sum_{i=1}^n {\lambda _i\over 2}\, z_i^2.$$

We seek an extension of the form

$$W(z) = \sum_{i=1}^n {\lambda _i\over 2}\left ( z_i+\delta _i(z)\right )^2$$

where each *δ*_{i}(*z*) is a polynomial of degrees two through *d* that does not depend on *z*_{1}, …, *z*_{i−1}. The cubic terms of *V*(*z*) that involve *z*_{1} can be cancelled by choosing the quadratic part of *δ*_{1}(*z*), since the cross term *λ*_{1}*z*_{1}*δ*_{1}(*z*) contains every cubic monomial involving *z*_{1}. The remaining cubic terms involve only *z*_{2}, …, *z*_{n}; those that involve *z*_{2} are cancelled by the quadratic part of *δ*_{2}(*z*), and there are quadratic parts of *δ*_{3}(*z*), …, *δ*_{n}(*z*) such that all the remaining cubic terms are cancelled, because every cubic monomial has a least index *i* and then appears in *λ*_{i}*z*_{i}*δ*_{i}(*z*). The quartic terms of *V*(*z*), corrected by the quartic terms \({\lambda _i\over 2}\delta _i^2(z)\) that have already been generated, are cancelled in the same way: first those involving *z*_{1}, through the factor *z*_{1} in *λ*_{1}*z*_{1}*δ*_{1}(*z*), then those involving *z*_{2} but not *z*_{1}, through the factor *z*_{2} in *λ*_{2}*z*_{2}*δ*_{2}(*z*), and so on.

We continue on in this fashion. The result is a sum of squares whose degree two through four terms equal *V* (*z*).

At degree *d* = 3, we solved a linear equation from the quadratic coefficients of *δ*_{1}(*z*), …, *δ*_{n}(*z*) to the cubic coefficients of *V* (*z*). We restricted the domain of this mapping by requiring that the quadratic part of *δ*_{i}(*z*) does not depend on *z*_{1}, …, *z*_{i−1}. This made the restricted mapping square; the dimensions of the domain and the range of the linear mapping are the same. We showed that the restricted mapping has a unique solution.

At a general degree *d* the overall dimension of the domain, the degree *d* coefficients of *δ*_{1}(*z*), …, *δ*_{n}(*z*), is larger than the dimension of the range, the degree *d* + 1 coefficients of *V*(*z*). So the unrestricted mapping has more unknowns than equations, and hence there are multiple solutions.

Among the multiple solutions we prefer a least squares solution relative to the weights *λ*_{1} ≥ *λ*_{2} ≥… ≥ *λ*_{n} > 0. To see why, consider the coefficient *γ*_{1,1,2} of \(z_1^2z_2\) in *V*(*z*). If we allow *δ*_{2}(*z*) to have a term of the form \(\delta _{2,1,1} z_1^2\), we can also cancel *γ*_{1,1,2} by choosing *δ*_{1,1,2} and *δ*_{2,1,1} so that

$$\lambda _1 \delta _{1,1,2}+\lambda _2 \delta _{2,1,1} = \gamma _{1,1,2}.$$

Since *λ*_{1} ≥ *λ*_{2}, a least squares solution to this equation is \(\delta _{1,1,2}={\gamma _{1,1,2}\over \lambda _1}\) and *δ*_{2,1,1} = 0. Because *T* is orthogonal, *W*(*x*) = *W*(*T*′*x*) is also a least squares solution.
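The construction in the proof is easy to follow in one variable. With *n* = 1, *d* = 2, and hypothetical coefficients, \(V(x)={\lambda \over 2}x^2+\gamma x^3\) is extended to the single square \(W(x)={\lambda \over 2}(x+\delta x^2)^2\) with *λδ* = *γ*, so that *W* is nonnegative and agrees with *V* through degree three:

```python
# Completing the squares for n = 1, d = 2 with hypothetical coefficients:
# extend V(x) = (lam/2)*x^2 + gam*x^3 (which goes negative for large |x|)
# to the nonnegative square W(x) = (lam/2)*(x + delta*x^2)^2.
lam, gam = 2.0, -1.5
delta = gam / lam             # cross term lam*x*(delta*x^2) must equal gam*x^3

V = lambda x: 0.5 * lam * x**2 + gam * x**3
W = lambda x: 0.5 * lam * (x + delta * x**2) ** 2

# W - V is exactly the quartic (lam/2)*delta^2*x^4, so degrees 2-3 agree.
for x in (-0.3, -0.1, 0.05, 0.2):
    assert abs(W(x) - V(x) - 0.5 * lam * delta**2 * x**4) < 1e-12

# V is indefinite while W never goes negative.
assert V(2.0) < 0
assert all(W(x) >= 0 for x in (-5, -1, 0, 1, 5))
```

This is the one-variable shadow of the theorem: the extension is nonnegative because it is a sum of squares, and the added terms all have degree greater than *d* + 1.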

## Example

Suppose we wish to stabilize a double pendulum to straight up. The first two states are the angles between the two links and straight up measured in radians counterclockwise. The other two states are their angular velocities. The controls are the torques applied at the base of the lower link and at the joint between the links. The links are assumed to be massless, the base link is one meter long, and the other link is two meters long. There is a mass of two kilograms at the joint between the links and a mass of one kilogram at the tip of the upper link. There is linear damping at the base and the joint with both coefficients equal to 0.5 s^{−1}. The resulting continuous time dynamics is discretized using Euler’s method with time step 0.1 s.

The first pair \(V^2_f(x), \kappa ^1_f(x)\) was found by solving the infinite horizon LQR problem obtained by taking the linear part of the dynamics around the operating point *x* = 0 and the quadratic Lagrangian (23). Then \(V^2_f(x)\) is quadratic and positive definite and \(\kappa ^1_f(x)\) is linear.

The second pair \(V^6_f(x), \kappa ^5_f(x)\) was found using the discrete time version of Al’brekht’s method. Then \(V^6_f(x)\) is the Taylor polynomial of the optimal cost to degree 6 and \(\kappa ^5_f(x)\) is the Taylor polynomial of the optimal feedback to degree 5. But \(V^6_f(x)\) is not positive definite, so we completed the squares as above to get a degree 10 polynomial \(V^{10}_f(x)\) which is positive definite.

In all the simulations we imposed the control constraint |*u*|_{∞}≤ 4 and started at *x*(0) = (0.9*π*, 0.9*π*, 0, 0) with an initial horizon of *N* = 50 time steps. The extended horizon was kept constant at *M* = 5. The class K function was taken to be *α*(*s*) = *s*^{2}∕10.

If the Lyapunov or feasibility conditions were not satisfied over the extended horizon, then *N* was increased by 5 and the finite horizon nonlinear program was solved again without advancing the system. If after three tries the Lyapunov or feasibility conditions were still not satisfied, then the first value of the control sequence was used, the simulation was advanced one time step, and the horizon was increased by 5 (Fig. 1).

If the Lyapunov and feasibility conditions were satisfied over the extended horizon, then the simulation was advanced one time step, and the horizon *N* was decreased by 1.

Both terminal cost and feedback pairs stabilized the noiseless pendula in about *t* = 80 time steps (8 s). The degree 2*d* = 10 terminal cost and the degree *d* = 5 terminal feedback seem to do it a little more smoothly, with no overshoot and with a shorter maximum horizon, *N* = 65 versus *N* = 75 for LQR (*d* = 1) (Fig. 2).

The total cpu time for the simulation using the degree 2*d* = 10 terminal cost and the degree *d* = 5 terminal feedback was 5.01 s. This is probably too slow to control the pendula in real time because the solver needs to return *u*^{∗}(0) in a fraction of a time step. But by using a faster solver than fmincon.m and coding the objective, the gradient, and the Hessian in *C* and compiling them, we probably could control the double pendula in real time. The cpu time for the LQR terminal cost and terminal feedback was 24.56 s, so it is probably not possible to control the double pendula in real time using the LQR terminal cost and terminal feedback (Fig. 3).

Then we added noise to the simulations. At each advancing step a Gaussian random vector with mean zero and covariance 0.0001 times the identity was added to the state. The next figures show the results using the degree 10 terminal cost and degree 5 terminal feedback. Notice that the horizon starts at *N* = 50 but immediately jumps to *N* = 65, declines to *N* = 64, then jumps to *N* = 69 before decaying monotonically to zero. When the horizon *N* = 0 the terminal feedback is used. The LQR terminal cost and feedback failed to stabilize the noisy pendula; we stopped the simulation when *N* > 1000 (Fig. 4).

We also ran the simulations using Al’brekht approximations of lower degree *d* = 3, so that after completing the squares the terminal cost is of degree 2*d* = 6 and the terminal feedback is of degree *d* = 3. This pair stabilized the noiseless simulation with a maximum horizon of *N* = 80, which is greater than the maximum horizons for both *d* = 1 and *d* = 5. But it did not stabilize the noisy simulation. Perhaps the reason is revealed by the Taylor polynomial approximations to \(\sin x\) shown in Fig. 5. The linear approximation in green overestimates the magnitude of \(\sin x\), so the linear feedback is stronger than it needs to be to overcome gravity. The cubic approximation in blue underestimates the magnitude of \(\sin x\), so the cubic feedback is weaker than it needs to be to overcome gravity. The quintic approximation in orange overestimates the magnitude of \(\sin x\), so the quintic feedback is also stronger than it needs to be, but by a lesser margin than the linear feedback. This may explain why the degree 5 feedback stabilizes the noise free pendula in a smoother fashion than the linear feedback (Fig. 5).

## Conclusion

Adaptive Horizon Model Predictive Control is a scheme for varying the horizon length in Model Predictive Control as the stabilization process evolves. It adapts the horizon in real time by testing Lyapunov and feasibility conditions on extensions of optimal trajectories returned by the nonlinear program solver. In this way it seeks the shortest horizons consistent with stabilization.

AHMPC requires a terminal cost and terminal feedback that stabilizes the plant in some neighborhood of the operating point but that neighborhood need not be known explicitly. Higher degree Taylor polynomial approximations to the optimal cost and the optimal feedback of the corresponding infinite horizon optimal control problems can be found by an extension of Al’brekht’s method (Aguilar and Krener 2012). The higher degree Taylor polynomial approximations to optimal cost need not be positive definite, but they can be extended to nonnegative definite polynomials by completing the squares. These nonnegative definite extensions and the Taylor polynomial approximations to the optimal feedback can be used as terminal costs and terminal feedbacks in AHMPC. We have shown by an example that a higher degree terminal cost and feedback can outperform using LQR to define a degree two terminal cost and a degree one terminal feedback.

## References

- Aguilar C, Krener AJ (2012) Numerical solutions to the dynamic programming equations of optimal control. In: Proceedings of the 2012 American control conference
- Al’brecht EG (1961) On the optimal stabilization of nonlinear systems. PMM-J Appl Math Mech 25:1254–1266
- Antsaklis PJ, Michel AN (1997) Linear systems. McGraw Hill, New York
- Droge G, Egerstedt M (2011) Adaptive time horizon optimization in model predictive control. In: Proceedings of the 2011 American control conference
- Giselsson P (2010) Adaptive nonlinear model predictive control with suboptimality and stability guarantees. In: Proceedings of the 2010 conference on decision and control
- Grimm G, Messina MJ, Tuna SE, Teel AR (2005) Model predictive control: for want of a local control Lyapunov function, all is not lost. IEEE Trans Autom Control 50:546–558
- Gruene L (2012) NMPC without terminal constraints. In: Proceedings of the 2012 IFAC conference on nonlinear model predictive control, pp 1–13
- Gruene L, Pannek J, Seehafer M, Worthmann K (2010) Analysis of unconstrained nonlinear MPC schemes with time varying control horizon. SIAM J Control Optim 48:4938–4962
- Khalil HK (1996) Nonlinear systems, 2nd edn. Prentice Hall, Englewood
- Kim T-H, Sugie T (2008) Adaptive receding horizon predictive control for constrained discrete-time linear systems with parameter uncertainties. Int J Control 81:62–73
- Krener AJ (2018) Adaptive horizon model predictive control. In: Proceedings of the IFAC conference on modeling, identification and control of nonlinear systems, Guadalajara
- Michalska H, Mayne DQ (1993) Robust receding horizon control of constrained nonlinear systems. IEEE Trans Autom Control 38:1623–1633
- Page SF, Dolia AN, Harris CJ, White NM (2006) Adaptive horizon model predictive control based sensor management for multi-target tracking. In: Proceedings of the 2006 American control conference
- Pannek J, Worthmann K (2011) Reducing the predictive horizon in NMPC: an algorithm based approach. Proc IFAC World Congress 44:7969–7974
- Polak E, Yang TH (1993a) Moving horizon control of linear systems with input saturation and plant uncertainty part 1: robustness. Int J Control 58:613–638
- Polak E, Yang TH (1993b) Moving horizon control of linear systems with input saturation and plant uncertainty part 2: disturbance rejection and tracking. Int J Control 58:639–663
- Polak E, Yang TH (1993c) Moving horizon control of nonlinear systems with input saturation, disturbances and plant uncertainty. Int J Control 58:875–903
- Rawlings JB, Mayne DQ (2009) Model predictive control: theory and design. Nob Hill Publishing, Madison