1 Introduction

For a long time, discrete-continuous dynamical interactions have been recognized as a major challenge in the process control area. The emergence of a hybrid systems modelling framework has provided a new perspective on some important problems. The ability to operate hybrid systems in an optimal way remains a challenging task. Indeed, for the general setting of hybrid systems, one has to deal not only with the infinite dimensional optimization problems related to the continuous dynamics, but also with a potential combinatorial explosion related to the discrete part.

In this context, and with focus on particular classes, many schemes have been proposed to tackle the problem. Some are based on newly elaborated conditions of optimality (see e.g., Shaikh and Caines 2003, 2007; Rantzer 2006), while others are closer to semi-classical approaches (see e.g., Attia et al. 2005; Alamir and Attia 2004; Hedlund and Rantzer 2002; Xu and Antsaklis 2003a). In the last few years, there has been a revived interest in gradient-based methods (see e.g., Lu et al. 1993; Cassandras et al. 1998; Azhmyakov and Raisch 2006; Egerstedt et al. 2006; Axelsson et al. 2006), owing to their intuitive interpretation, their reliability and the existence of well-established convergence results. It is the aim of this contribution to extend this approach to a particular class of hybrid systems with autonomous switching. This class is characterized by the fact that the discrete transitions are accompanied by instantaneous changes (jumps) in the continuous states and that these state jumps can be considered as control variables. Discrete transitions, or switches, are called autonomous if they cannot be triggered by a discrete control command but depend on the evolution of the continuous states. Hybrid systems with jumps in the continuous states are often referred to as impulsive hybrid systems. The problem of interest here is formulated as a sequential problem, i.e., for a particular execution the time axis is partitioned into subintervals; in each interval, the dynamics are characterized by a set of ODEs, with transitions being triggered internally (autonomous switches). This is the approach that has been considered since the initial formulation of the corresponding optimal control problem (Clarke and Vinter 1989) and can be seen as a natural way to tackle the problem.
The class of impulsive hybrid systems discussed in this paper arises most frequently in the area of process control and may be a result of adding and processing material, (see e.g., Alamir 2006, for a typical benchmark problem). In particular, we have been motivated by a specific chemical process control application, namely a preferential crystallization process used to separate enantiomers (see Elsner et al. 2005 for the physical aspects and Raisch et al. 2005 for more details on the control problems). This process is often operated in cyclic batch mode, where after each cycle, material to be separated is added. This represents an instantaneous change of the system state. The amount of added material is a degree of freedom and can be considered as an additional control input.

Hybrid systems with autonomous switching and non-controlled state jumps have been considered in Xu and Antsaklis (2003b). The objective there is to find the sequence of jump instants such that a cost functional is minimized. For that purpose, the authors develop a second order scheme that is further specialized to linear systems in Xu and Antsaklis (2003c). Recently, Verriest and coworkers (Verriest et al. 2004) have considered another class of hybrid systems, consisting of systems with controlled switching where some delay is present on the states and where both the jump magnitudes and the switching instants are considered as control variables. Based on variational arguments, necessary conditions of optimality are derived and used in a first order scheme; see Verriest et al. (2005) for an epidemic control application. Another contribution by the same authors is the class of hybrid systems introduced in Verriest (2006). This class includes systems with variable state space dimensions where the transitions are time triggered, and it can thus be viewed as a general class of switched systems. The variation of the state space dimension is captured by introducing fictitious reset maps. The tools developed can be applied as well to the case where the reset maps are real, as exemplified in Mehta et al. (2007), where an optimal control problem is formulated and solved using a variational approach. In this contribution, we consider a related problem where no delay is present on the state and where the switching is autonomous.

The paper is organized as follows: in Section 2 the problem is stated formally. Section 3 is devoted to the statement of the necessary conditions of optimality. In Section 4, the gradient formulas are derived and a conceptual algorithm together with some convergence properties is stated. Finally, some conclusions and suggestions for future work are given in Section 5.

2 Problem formulation

We consider the following class of impulsive hybrid systems with autonomous switching (see e.g., Simic et al. 2009, for the general modelling framework).

Definition 1

An impulsive hybrid system with autonomous switching is a collection

$$ {\mathcal H} = ({\mathcal Q},{\mathcal E},\ {\mathcal X},\ {\mathcal F},\ {\mathcal G},\ {\mathcal R}) $$

where

  • \({\mathcal Q} = \{q_0, q_1, \ldots, q_Q\}\) is a finite set of locations

  • \({\mathcal E} \subseteq {\mathcal Q}\times {\mathcal Q}\) is a set of edges

  • \({\mathcal X} = \{{\mathcal X}_q \}_{q \in {\mathcal Q}}\) is a collection of state spaces where, for all \(q \in {\mathcal Q}\), \({\mathcal X}_q\) is an open subset of ℝn.

  • \({\mathcal F}=\{f_q\}_{q \in {\mathcal Q}}\) is a collection of vector fields. For all \(q \in {\mathcal Q}\), \(f_q : {\mathcal X}_q \rightarrow \mathbb{R}^n\)

  • \({\mathcal G} = \{{\mathcal G}_e\}_{e \in {\mathcal E}}\) is a collection of guards. For all possible transitions \(e =(q_i,\ q_j)\in {\mathcal E}\), \({\mathcal G}_e \subset {\mathcal X}_{q_i}\)

  • \({\mathcal R}=\{{\mathcal R}_e\}_{e \in {\mathcal E}}\) is a collection of reset maps. For all \(e = (q_i,\ q_j) \in {\mathcal E}\), \({\mathcal R}_e : {\mathcal G}_e \rightarrow 2^{{\mathcal X}_{q_j}}\) where \(2^{{\mathcal X}_{q_j}}\) denotes the power set of \({\mathcal X}_{q_j}\)

We assume that the vector fields \(f_q\) are smooth enough (see assumptions below), and that the sets \({\mathcal G}_e\) are nonempty for all \(e\in {\mathcal E}\). An execution proceeds as follows: starting from an initial condition \((x_0, q_{i_0})\), the continuous state evolves according to the autonomous differential equation

$$\dot{x}(t) = f_{q_{i_0}}(x(t)) $$
(1)

The discrete state \(q(\cdot)=q_{i_0}\) remains constant as long as the trajectory does not reach a guard \({\mathcal G}_{(q_{i_0}, q_{i_1})}\). In our setup, once this guard is reached, q(·) switches from \(q_{i_0}\) to \(q_{i_1}\); at the same time, the continuous state is reset to some value according to the map \({\mathcal R}_{(q_{i_0}, q_{i_1})}\), and the whole process is repeated. Next, we suppose that the guards can be described by smooth (n − 1)-dimensional surfaces in ℝn

$${\mathcal G}_e = \{ x \ | \ S_e(x) = 0 \}, \qquad \text{for all} \ e \in {\mathcal E} $$
(2)

and that the reset maps are affine maps of the form, for all \(e \in {\mathcal E}\),

$${\mathcal R}_{(q_{i_{k-1}},q_{i_{k}})}(x) = x + \theta_{k} $$
(3)

with \(\theta_k\) belonging to Θ, a compact subset of ℝn. In our setup, the variables \(\theta_k\) are degrees of freedom that can be selected by the controller.
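To make the execution semantics concrete, the following sketch simulates a hypothetical one-dimensional instance with two locations, vector fields \(f_{q_0}(x) = 1\) and \(f_{q_1}(x) = -0.5x\), guard \(S(x) = x - 1\) and affine reset \(x \mapsto x + \theta\); the dynamics, guard and step size are illustrative choices, not taken from the paper.

```python
import math

def simulate(theta, x0=0.0, dt=1e-4, T=3.0):
    """One execution: location q0 until the guard S(x) = x - 1 is reached,
    then the affine reset x -> x + theta and evolution in location q1."""
    f = [lambda x: 1.0, lambda x: -0.5 * x]    # hypothetical vector fields
    S = lambda x: x - 1.0                      # guard surface for edge (q0, q1)
    x, t, q, t_switch = x0, 0.0, 0, None
    while t < T:
        x_new = x + dt * f[q](x)               # forward Euler step
        if q == 0 and S(x) < 0.0 <= S(x_new):  # sign change: guard reached
            x_new += theta                     # reset map R(x) = x + theta
            q, t_switch = 1, t + dt
        x, t = x_new, t + dt
    return x, t_switch

# A jump of +0.5 applied when the guard x = 1 is reached (at roughly t = 1)
x_T, t_switch = simulate(theta=0.5)
```

With these choices the guard is reached near t = 1, the state jumps to 1.5 and then decays in the second location, so x(T) is close to \(1.5\,e^{-0.5(T - t_{switch})}\).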

The dynamic optimization problem of interest can now be formulated as follows:

Problem 1

Under a fixed switching sequence of locations \(\{q_{i_k}\}_{k}\), solve the following optimization problem

$$\min_{\theta}J[\theta]:= \sum\limits_{k=0}^{K}\int_{t_{k}}^{t_{k+1}} L(x(\tau)) d\tau $$
(4)

such that

$$ \dot{x}(t) = f_{q(t)}(x(t)) $$
(5)
$$ q(t) = q_{i_{k}}, \qquad t \in [t_{k}, t_{k+1})$$
(6)
$$ x(t_0^+)= x(t_0) + \theta_0 $$
(7)
$$ x\left(t_{k+1}^+\right) = x(t_{k+1}) + \theta_{k+1}, \ S_{(q_{i_k},q_{i_{k+1}})}(x(t_{k+1})) = 0 $$
(8)
$$ \dot{x}(t)=f_{q_{i_K}}(x(t)), \qquad t \in [t_K, t_{K+1}] $$
(9)

with \(x(t_0) = x_0\), \(K \in \mathbb{Z}_+\) is the total number of switches, the vector θ denotes the \(n\times(K+1)\) dimensional vector \(\left(\theta_0' \ldots \theta_K'\right)'\), \(L : \mathbb{R}^n \rightarrow \mathbb{R}_+\) is a supplied cost function, and \(t_0\) and \(t_{K+1} = T\) are both finite and given.
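As an illustration of Problem 1, the sketch below evaluates J[θ] by a Riemann sum along the execution of a hypothetical one-dimensional system with a single autonomous switch (vector fields \(f_{q_0}(x)=1\) and \(f_{q_1}(x)=-0.5x\), guard x = 1) and running cost L(x) = x²; all of these are illustrative choices, not from the paper.

```python
def cost(theta, x0=0.0, dt=1e-4, T=3.0):
    """Riemann-sum evaluation of J[theta] = sum_k int L(x) dt, with L(x) = x**2,
    along one execution with a single autonomous switch and state jump theta."""
    f = [lambda x: 1.0, lambda x: -0.5 * x]    # hypothetical vector fields
    S = lambda x: x - 1.0                      # guard surface
    L = lambda x: x * x                        # hypothetical running cost
    x, t, q, J = x0, 0.0, 0, 0.0
    while t < T:
        J += L(x) * dt                         # accumulate the stage cost
        x_new = x + dt * f[q](x)
        if q == 0 and S(x) < 0.0 <= S(x_new):  # guard reached
            x_new += theta                     # controlled state jump
            q = 1
        x, t = x_new, t + dt
    return J
```

Resetting to the origin (θ = −1) removes all cost after the switch, so J(−1) is close to \(\int_0^1 t^2\,dt = 1/3\), whereas θ = 0.5 incurs additional cost from the decaying post-switch trajectory; this is how θ enters the cost as a decision variable.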

Remark 1

Note that in the problem formulation, the sequence of locations is fixed. Although the sequence of locations that the model of Definition 1 exhibits is not fixed a priori and depends on the chosen control inputs (parameters), we concentrate on finding the optimal parameters for a given sequence of locations. This implies that the solution developed in the remainder of this paper preserves the specified sequence of locations. A dynamic programming or heuristic search approach could additionally be used to generate candidate sequences, but this is beyond the scope of the paper. Note also that the switching instants are not defined a priori. This follows from the degree of freedom allowed on the state jumps, i.e., variation of these quantities induces variations of the switching instants; see Fig. 1 for an example of an execution within the framework of Problem 1.

Fig. 1 An example of an execution with K = 4 switches

We assume the following to hold:

  • A1: For all \(q_i\) in \({\mathcal Q}\), \({\mathcal X}_{q_i} = \mathbb{R}^n\)

  • A2: For all \(q_i\) in \({\mathcal Q}\), the functions \(f_{q_i}\) are continuously differentiable

  • A3: L is a twice continuously differentiable function

  • A4: There exists a constant M such that \(\|f_{q_i}(x)\| < M\) for all \(x \in {\mathcal X}_{q_i}\) and \(q_i \in {\mathcal Q}\)

Remark 2

Since the number of switches and the sequence of locations are fixed, the problem is well posed; in particular, sliding and Zeno behaviors are ruled out.

Problem 1 could be seen as a collection of initial value optimization subproblems. Using classical approaches, the subproblems could be solved separately, but nothing guarantees that the trajectory obtained by concatenating the individual solutions is optimal. The link between the subproblems is provided by a set of necessary conditions stated in the next section.

3 Necessary conditions of optimality

In this section, necessary conditions of optimality for a solution to Problem 1 are derived. The arguments used are of variational type (see e.g., Gelfand and Fomin 1963 and Bryson and Ho 1975 for basic material on Euler-Lagrange theory). Let us first define the Hamiltonian associated with location \(q_{i_k}\) as

$$ H_{q_{i_k}}(x,\lambda) = L(x) + \lambda' f_{q_{i_k}}(x) $$
(10)

where λ denotes the adjoint variables. We then have the following result.

Theorem 1

If θ* is an interior optimal solution to Problem 1 under Assumptions A1–A4 and x*(t) is its corresponding state trajectory for \(t \in [t_0, T]\), then there exists a nontrivial adjoint λ*(t) and multipliers \(\pi_k^*\) such that the following equations hold

$$\dot{\lambda}^*(t)' = -\nabla_x H_{q_{i_k}}^*, \qquad t \in [t_{k}, t_{k+1}) $$
(11)

At the switching instant \(t_{k+1}\), the following jump conditions are satisfied

$$ \lambda^*|_{t_{k+1}^+} = \lambda^*|_{t_{k+1}} - \pi_k^* \nabla_x S_{\left(q_{i_k}, q_{i_{k+1}}\right)}'\Big|_{t_{k+1}} $$
(12)
$$ H_{q_{i_{k+1}}}^*|_{t_{k+1}^+} = H_{q_{i_k}}^*|_{t_{k+1}} $$
(13)

for k = 0, ..., K − 1, with

$$ \lambda^*|_{t_{K+1}} = \boldsymbol{0} $$
(14)

and

$$\nabla_{\theta} J[\theta^*]= \boldsymbol{0} $$
(15)

Proof

The augmented Lagrangian can be written as

$${\mathcal L} = \sum\limits_{k=0}^{K} \left[\int_{t_k}^{t_{k+1}} \left(H_{q_{i_k}}(x,\lambda) - \lambda' \dot{x}\right) d\tau + \pi_k S_{(q_{i_k},q_{i_{k+1}})}|_{t_{k+1}} \right] $$
(16)

where, for simplicity, the time dependence is dropped. The increment of \({\mathcal L}\) with respect to x can be written as

$$\Delta_x {\mathcal L} = {\mathcal L}(x+h)-{\mathcal L}(x) $$
(17)

where h is a continuously differentiable function of time. Developing these terms leads to the following

$$\begin{array}{lll} \Delta_x {\mathcal L} &=& \sum\limits_{k=0}^{K} \bigg[\int_{t_k+dt_k}^{t_{k+1}+dt_{k+1}} \Big(H_{q_{i_k}}(x+h,\lambda) -\lambda'\left(\dot{x} + \dot{h}\right)\Big) d\tau \notag \\ && \quad\quad- \int_{t_k}^{t_{k+1}} \left(H_{q_{i_k}}(x,\lambda) - \lambda' \dot{x}\right) d\tau + \pi_k \Delta_x S_{(q_{i_k},q_{i_{k+1}})}|_{t_{k+1}}\bigg] \end{array} $$
(18)

with \(dt_k\) a small time increment (the existence of which follows from the smoothness assumptions). After integration by parts of the second term under the integral sign and rearrangement, Eq. 18 can be written as

$$\begin{array}{lll} \Delta_x {\mathcal L}&=& \sum\limits_{k=0}^{K} \bigg[\int_{t_k}^{t_{k+1}} \left(H_{q_{i_k}}(x+h,\lambda)- H_{q_{i_k}}(x,\lambda) +\dot{\lambda}' h\right) d\tau \notag\\&& \quad\quad\; -\left(H_{q_{i_k}}(x,\lambda)-\lambda'\dot{x}\right)|_{t_k^+} dt_k +\left(H_{q_{i_k}}(x,\lambda)-\lambda'\dot{x}\right)|_{t_{k+1}} dt_{k+1}\notag\\&& \quad\quad\; - \lambda'h|_{t_k^+}^{t_{k+1}} +\pi_k \Delta_x S_{(q_{i_k},q_{i_{k+1}})}|_{t_{k+1}}\bigg] \end{array} $$
(19)

Now, using Taylor’s theorem, we obtain (up to first order) the following expression

$$ \begin{array}{lll} \Delta_x {\mathcal L} &=& \sum\limits_{k=0}^{K} \bigg[\int_{t_k}^{t_{k+1}} \left(\nabla_x H_{q_{i_k}} +\dot{\lambda}'\right)h d\tau -\left(H_{q_{i_k}}(x,\lambda)-\lambda'\dot{x}\right)|_{t_k^+} dt_k \notag \\ && \quad\quad +\left(H_{q_{i_k}}(x,\lambda)-\lambda'\dot{x}\right)|_{t_{k+1}} dt_{k+1} - \lambda'h|_{t_k^+}^{t_{k+1}} +\pi_k \frac{\partial S_{(q_{i_k},q_{i_{k+1}})}}{\partial x}\Bigg|_{t_{k+1}} dx(t_{k+1})\bigg]\notag\\ \end{array} $$
(20)

where \(dx(t_{k+1})\) is the exact state variation at the instant \(t_{k+1}\). Following simple geometric arguments, it can be approximated to first order by the following expression

$$ dx(t_{k+1}) = h(t_{k+1}) + \dot{x}(t_{k+1}) dt_{k+1} $$
(21)

Using Eq. 21, the first order variation of the augmented Lagrangian \({\mathcal L}\) can be written as

$$\begin{array}{lll} \delta_x {\mathcal L} &=& \sum\limits_{k=0}^{K} \bigg[\int_{t_k}^{t_{k+1}} \left(\nabla_x H_{q_{i_k}} +\dot{\lambda}'\right)h d\tau + \left(H_{q_{i_k}}|_{t_{k+1}}dt_{k+1} - H_{q_{i_k}}|_{t_k^+} dt_k \right)\notag \\ && \quad\quad + \lambda'|_{t_k^+} dx(t_k) + \left(\pi_k \nabla_x S_{(q_{i_k},q_{i_{k+1}})} \Big|_{t_{k+1}} - \lambda'|_{t_{k+1}}\right) dx(t_{k+1})\bigg] \end{array} $$
(22)

Along the optimal pair (x*, θ*), the following is satisfied

$$ \delta_x {\mathcal L} = 0 $$
(23)

After rearrangement, and using the fact that the optimization problem has no terminal constraints but fixed initial and final times, the desired result (11)–(14) follows. The necessity of Eq. 15 can be shown analogously.□
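In the scalar case, combining the adjoint jump (12) with the Hamiltonian continuity condition (13) yields the multiplier \(\pi_k\) in closed form. The sketch below illustrates this with hypothetical one-dimensional data (the cost, vector fields, guard and numerical values are illustrative choices, not from the paper).

```python
# Hypothetical scalar data at a switching instant: running cost L, vector
# fields before/after the switch, guard gradient dS, adjoint lam, jump theta.
L = lambda x: x * x
f_pre = lambda x: 1.0
f_post = lambda x: -0.5 * x
dS = lambda x: 1.0                       # gradient of the guard S(x) = x - 1
x, theta, lam = 1.0, 0.5, 0.7
x_plus = x + theta                       # affine reset (3)

# Closed form for pi_k: substitute the adjoint jump (12) into the
# Hamiltonian continuity condition (13) and solve for pi_k.
pi = (L(x_plus) - L(x) + lam * (f_post(x_plus) - f_pre(x))) \
     / (dS(x) * f_post(x_plus))
lam_plus = lam - pi * dS(x)              # adjoint jump (12)

H_pre = L(x) + lam * f_pre(x)                    # Hamiltonian before the switch
H_post = L(x_plus) + lam_plus * f_post(x_plus)   # Hamiltonian after the switch
```

By construction, H_pre and H_post coincide, i.e., condition (13) is enforced by this choice of the multiplier.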

Remark 3

Instead of affine reset maps, necessary conditions of optimality can be derived in a straightforward manner for a nonlinear parametrized map of the form \(x(t_k^+)=\psi(x(t_k), \theta)\), provided that some smoothness requirements are satisfied. In this case, the gradient of ψ would appear in the above conditions. On the other hand, the gradient formulas obtained next would be much harder to derive for this case, which is therefore not pursued further in this contribution.

Remark 4

The proof techniques used here are of variational type, meaning that the derived necessary conditions of optimality are valid for smooth impulsive hybrid systems under the aforementioned assumptions. The characterized minimum is of the weak type, in contrast to the strong minima recently derived for other classes of hybrid systems (see e.g., Sussmann 1999; Shaikh and Caines 2007).

Remark 5

The conditions stated above characterize a local minimum in the sense that the optimal trajectory is compared only to trajectories that have the same switching sequence of locations (Fig. 2).

Fig. 2 An illustration of the optimal solution to Problem 1 under Assumptions A1–A4. The implementable Algorithm 1 in Section 4 shows how to find the optimal parameters, and thus the optimal trajectory, iteratively by solving a set of initial value problems

4 A gradient based approach

The basic idea is to cast Problem 1 into a parameter optimization framework and then to compute, as required by the necessary conditions of optimality, the gradient of the cost functional. Gradient descent techniques can then be used to compute the optimal parameter values. Before the gradient formula is stated in a theorem, a lemma concerning the sensitivity of the states is given.

Lemma 1

The sensitivity \(\Delta_{q_{i_k}}^{\theta_k} x(\cdot)\) of the state trajectory x(·), corresponding to the dynamics (5)–(9) under Assumptions A1–A4, with respect to the vectors \(\theta_k\) can be computed as the solution to the following variational equation

$$\Delta_{q_{i_k}}^{\theta_k} \dot{x}(t) = \frac{\partial f_{q_{i_k}}}{\partial x} \Delta_{q_{i_k}}^{\theta_k} x(t), \qquad t\in [t_k, t_{k+1}) $$
(24)

under the following initial conditions

$$ \Delta_{q_{i_0}}^{\theta_0} x(t_0) = I_{n\times n} $$
(25)
$$ \Delta_{q_{i_{k+1}}}^{\theta_{k+1}} x\left(t_{k+1}^+\right) = \Delta_{q_{i_k}}^{\theta_{k}} x(t_{k+1}) + I_{n\times n} + \left(f_{q_{i_k}}(x(t_{k+1})) - f_{q_{i_{k+1}}}\left(x\left(t_{k+1}^+\right)\right) \right) \Delta^{\theta_{k+1}} t_{k+1}$$
(26)

with

$$\Delta^{\theta_{k+1}} t_{k+1}= -\frac{\left. \nabla_x S_{(q_{i_k},q_{i_{k+1}})} \right|_{t_{k+1}}}{\left. \nabla_x S_{(q_{i_k},q_{i_{k+1}})} \right|_{t_{k+1}} f_{q_{i_k}} (x(t_{k+1}))} \Delta^{\theta_{k}}_{q_{i_k}} x(t_{k+1}) $$
(27)

Proof

See Appendix 1.□

Equations 24–27 allow for the computation of the sensitivity trajectories. Equations 24–26 reflect the sensitivity of the states with respect to the parameters θ and are obtained by introducing smooth parametric variations in the initial conditions corresponding to each location; the perturbations are then carefully propagated through the dynamics of the system. Equation 27 is the sensitivity of the switching instants with respect to the parameters θ and is obtained using a first order Taylor series approximation. If the switching instants were fixed or freely controllable, the above conditions would simplify; specifically, Eq. 27 would be zero.
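As a sanity check on the variational equation (24), the scalar sketch below integrates the sensitivity alongside the state for a single location (no switch, so Eqs. 26–27 play no role) and compares it with a finite-difference perturbation of the initial condition; the vector field f(x) = −0.5x is an illustrative choice.

```python
import math

def flow_and_sensitivity(x0, T=2.0, dt=1e-4):
    """Integrate xdot = f(x) = -0.5*x together with the scalar analogue of the
    variational equation (24): Ddot = (df/dx) D, with D(0) = 1 (cf. Eq. 25)."""
    x, D, t = x0, 1.0, 0.0
    while t < T:
        x, D, t = x + dt * (-0.5 * x), D + dt * (-0.5 * D), t + dt
    return x, D

x1, D = flow_and_sensitivity(1.0)

# Finite-difference check against a trajectory with perturbed initial condition
eps = 1e-6
x1_pert, _ = flow_and_sensitivity(1.0 + eps)
```

For this linear example the sensitivity has the closed form \(e^{-0.5T}\), and the finite-difference quotient (x1_pert − x1)/eps matches the integrated sensitivity D.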

The result in Lemma 1 is used to establish the following theorem.

Theorem 2

The gradient of the cost functional J corresponding to Problem 1, under Assumptions A1–A4, can be computed as follows for k = 0, ..., K − 1

$$ \nabla_{\theta_0} J = \lambda'|_{t_0}$$
(28)
$$\begin{array}{rcl} \nabla_{\theta_{k+1}} J &=& \lambda'|_{t_{k+1}}- \pi_{k+1}^* \nabla_x S_{(q_{i_k},q_{i_{k+1}})} \notag \\ &&\times \left(\Delta_{q_{i_{k+1}}}^{\theta_{k+1}} x\left(t_{k+1}^+\right) + f_{q_{i_{k+1}}}\left(x\left(t_{k+1}^+\right)\right)\Delta^{\theta_{k+1}}t_{k+1}\right)\end{array} $$
(29)

Proof

See Appendix 2.□

Equation 28 is the gradient of the cost functional with respect to the parameter vector \(\theta_0\). The same type of equation is obtained for a classical variational problem (in a non-hybrid setting) where only the initial conditions are allowed to vary; Eq. 28 can thus be used to study the impact of the initial condition on a classical variational problem, and it can also be used on its own in a gradient algorithm to derive optimal initial conditions. The theory developed here recovers this important result. Equation 29 is the gradient of the cost with respect to the parameter \(\theta_{k+1}\) and is one of the main contributions of the paper. It depends on the sensitivity components derived in Lemma 1. The idea behind the proof is again to introduce variations in the parameters and study their impact on the cost; using the necessary conditions of optimality obtained in Theorem 1, Eqs. 28 and 29 follow. As will be described in more detail below, the computational complexity of this type of problem is considerable: the difficulty consists in solving a boundary value problem of a special type. We now state a conceptual algorithm and show that, under some additional assumptions, it converges to the infimum of the optimization Problem 1 under Assumptions A1–A4.

Conceptual Algorithm 1

  1. Choose an admissible parameter vector \(\theta^{(0)}=\left(\theta_0^{(0)}, \theta_1^{(0)}, \ldots, \theta_K^{(0)}\right)\) and set l = 0

  2. Compute the trajectory x(l)(·) and the corresponding adjoint λ(l)(·) such that the conditions (12)–(14) are satisfied

  3. Update the parameter vector θ(l) using the gradient information in Eqs. 28 and 29 in a gradient projection algorithm, set l: = l + 1 and go to step 2

We can now state the following result

Proposition 1

If θ* is an accumulation point of the sequence\(\{\theta^{(l)}\}_l\) generated by the Conceptual Algorithm 1, then it is a stationary point i.e.,\(\nabla J[\theta^*] = 0\).

Proof

The proof of Proposition 1 follows using descent properties (see e.g., Bertsekas 1995; Pshenichny and Danilin 1982).□

In the following paragraph, some important implementation issues are discussed.

Denote by \(P_\Theta\) the projection operator onto the set Θ. An implementable version of the preceding conceptual algorithm can be stated as follows

Algorithm 1

Step 0 :

Choose parameters β, μ as positive real numbers in (0, 1), a small positive real number ε, and an admissible parameter vector θ(0). Set l = 0

Step 1 :

Compute the trajectory x(l)(·) by forward integrating the state equations (5)–(9) under the specified initial condition. Let t(l) be the resulting sequence of switching instants \(\{t_k\}_k\)

Step 2 :

Compute the sensitivity trajectories using Eq. 24 and the corresponding initial conditions (25) and (26) and Eq. 27

Step 3 :

Backward integrate the adjoint λ(l)(·) using Eq. 11 with the terminal condition (14). At the switching instants t(l), compute the multipliers \(\pi_k^{(l)}\) in Eq. 12 such that Eq. 13 is enforced. Update the adjoint at the switching instants t(l) using Eq. 12 and the computed multipliers π(l).

Step 4 :

Compute the gradient using Eqs. 28 and 29. Update the parameter vector θ(l + 1)

$$ \theta^{(l+1)} = P_\mathit{\Theta}\left(\theta^{(l)} - \gamma^{(l)} \nabla_{\theta} J\right) $$
(30)

with \(\gamma^{(l)}=\mu^{j_l}\) and \(j_l\) the smallest nonnegative integer j satisfying the following inequality (Armijo step size rule)

$$J\left[\theta^{(l)}-\mu^{j}\nabla_{\theta} J\right]- J\left[\theta^{(l)}\right] \leq -\beta \mu^{j} \|\nabla_{\theta} J\|^2 $$
(31)
Step 5 :

If J[θ(l)] − J[θ(l + 1)] ≤ ε Then STOP Else set l: = l + 1 and go to Step 1

A good choice of the algorithm parameters β, μ and ε depends on the problem at hand; numerical experience has shown that some values work well across a broad range of problems (see e.g., Polak 1971 for indications). The computation in Step 1 involves the solution of (K + 1) Initial Value Problems (IVPs). Particular attention should be paid to the detection of the switching instants; this can be done using the event location capabilities of Matlab IVP solvers (Shampine and Thompson 2000). Step 2 involves the solution of a linear time varying system, which should pose no major difficulties. In Step 3, the multipliers are computed such that the Hamiltonian continuity condition is enforced; recall that a closed form solution for the multipliers can be found by combining Eqs. 12 and 13. Step 4 is the costliest part of the algorithm: evaluation of the cost functional on the left-hand side of inequality (31) makes internal calls to Step 1–Step 3. However, the number of such calls is finite. In Figs. 3 and 4, some iterations of Algorithm 1 applied to a first-order system are shown schematically. The rate of decay of the cost is governed by Eq. 31; the rate is large whenever the gradient is large, which usually happens during the first iterations of the algorithm (Figs. 5 and 6).
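A minimal sketch of the update in Step 4 is given below. It implements gradient projection with Armijo-type backtracking on a hypothetical box Θ and a quadratic surrogate cost; note that the acceptance test compares the decrease against the linearized decrease along the projected step, a standard variant that coincides with rule (31) when the projection is inactive. All functions and parameter values here are illustrative.

```python
def project(theta, lo=-1.0, hi=1.0):
    """Projection P_Theta onto a (hypothetical) box Theta = [lo, hi]^n."""
    return [min(hi, max(lo, v)) for v in theta]

def gradient_projection(J, gradJ, theta, beta=0.5, mu=0.5, eps=1e-12,
                        max_iter=100):
    """Gradient projection with an Armijo-type step-size rule, cf. (30)-(31)."""
    for _ in range(max_iter):
        g = gradJ(theta)
        step = 1.0
        while True:
            cand = project([t - step * v for t, v in zip(theta, g)])
            # linearized decrease along the projected step
            lin = sum(v * (c - t) for v, c, t in zip(g, cand, theta))
            if J(cand) - J(theta) <= beta * lin or step < 1e-12:
                break
            step *= mu                   # backtrack: gamma = mu**j
        if J(theta) - J(cand) <= eps:    # stopping test of Step 5
            return cand
        theta = cand
    return theta

# Surrogate quadratic cost with unconstrained minimizer (2, 0.5): the
# iterates converge to its projection onto the box, namely (1, 0.5).
J = lambda th: (th[0] - 2.0) ** 2 + (th[1] - 0.5) ** 2
gradJ = lambda th: [2.0 * (th[0] - 2.0), 2.0 * (th[1] - 0.5)]
theta_star = gradient_projection(J, gradJ, [0.0, 0.0])
```

In the full algorithm, J and its gradient would of course be supplied by Steps 1–3 (forward simulation, sensitivities, adjoint) rather than in closed form.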

Fig. 3 Iteration 0 of the algorithm under K = 2 switches for a generic example

Fig. 4 Iteration 1 of the algorithm under K = 2 switches for a generic example. Note the change of the switching instants compared to those in Fig. 3; see Remark 1 for a thorough discussion

Fig. 5 Iteration m of the algorithm under K = 2 switches for a generic example. Note that the initial value of the adjoint converges to 0, a direct consequence of the necessary conditions of optimality (see Eqs. 15 and 28)

Fig. 6 Optimal solution to Problem 1. At the last iteration, all the necessary conditions of optimality (see Eqs. 11–15) are satisfied

5 Conclusions

This paper addresses optimization problems for a class of hybrid systems that arises frequently in the process industries. Using variational principles, necessary conditions of optimality and a gradient formula are derived. A conceptual algorithm is then presented, together with a convergence analysis and a discussion of implementation issues. A topic of current research is the study of cyclic operation of the system, i.e., where the switching sequence of locations is periodic. We expect that a cycle-to-cycle improvement is possible and could be used to further reduce the complexity of the developed approach.