On mixed-integer optimal control with constrained total variation of the integer control

The combinatorial integral approximation (CIA) decomposition suggests solving mixed-integer optimal control problems by solving one continuous nonlinear control problem and one mixed-integer linear program (MILP). Unrealistic frequent switching can be avoided by adding a constraint on the total variation to the MILP. Within this work, we present a fast heuristic way to solve this CIA problem and investigate in which situations optimality of the constructed feasible solution is guaranteed. In the second part of this article, we show tight bounds on the integrality gap between a relaxed continuous control trajectory and an integer feasible one in the case of two controls. Finally, we present numerical experiments to highlight the proposed algorithm’s advantages in terms of run time and solution quality.


Introduction
Mixed-integer optimal control has been established as a useful tool for modeling real-world problems [6,10,22,33]. In practice, an optimal control policy is only realistic if it avoids frequent switching between the system modes. However, it remains an open research question how switching costs or a limited number of switches can be efficiently incorporated into the optimization problem. In this article, we follow a first-discretize-then-optimize approach because, in contrast to indirect methods and dynamic programming, it covers a more generic problem class for which efficient numerical methods are available, in particular the decomposition approach used in this article. In this approach, the control problem is discretized via, e.g., Direct Multiple Shooting [5] or Direct Collocation [29], which leads to a mixed-integer nonlinear program (MINLP). This problem class is NP-hard in general. It has therefore been proposed to reduce complexity by first solving the relaxed problem with dropped integrality constraint, which is a nonlinear program (NLP), and then approximating the relaxed controls with binary controls in a second step as part of a mixed-integer linear program (MILP). The second problem is usually referred to as the combinatorial integral approximation (CIA) problem [27], whereas the whole algorithm is called CIA decomposition [31]. It is common to use the fast Sum-Up Rounding (SUR) heuristic [24] to find a feasible approximate solution for the CIA problem, so that this second step is also called rounding. However, standard SUR does not consider time-coupled combinatorial constraints, which is why solving the CIA problem is necessary in this case. The variable time transformation method [2,11,21] also avoids solving an MINLP by assuming a given sequence of the system modes, so that only their durations have to be computed as part of an NLP. Recently, De Marchi extended this approach by including switching costs with sparse optimization methods [9].
As part of the CIA decomposition, Sager [24], Kirches [16], and Rieck [20] proposed to add penalty terms to the control problem objective in order to account for switching costs or to reduce the number of switches between active controls. Recently, Kirches et al. [17] investigated this approach for the setting with implicit switches and where jumps of the differential state values may occur. One issue of the penalization approach is the appropriate choice of the penalty factor since heavy penalization can in some instances [16] attract solutions involving frequent switching. Bestehorn et al. [3,4] presented an idea to incorporate switching costs into the problem by fixing a small control deviation tolerance in the rounding problem and minimizing switching costs subject to this deviation tolerance.
This work builds on [27], where it has been proposed to solve the CIA problem with constraints that limit the total variation (TV) of the integer control. By this, the original optimal control objective remains untouched, and we require only the number of switches to be less than a desired threshold. Our idea is to generally use a Branch & Bound (BNB) scheme [7] for solving this CIA problem, but, as we will show in this article, this can be replaced for certain instances by a sequence of rounding scheme evaluations. Furthermore, we evaluate upper bounds on the CIA objective subject to these discrete TV constraints. It has been shown that the integer approximation error, i.e., the difference of relaxed optimal control objective value and CIA rounded objective value, can be driven to zero under mild conditions if the grid length is driven to zero [19, 27]. This result does not hold anymore if discrete TV constraints are included. Still, we expect our approach to yield feasible solutions of the MIOCP with an a priori integer gap.

Contribution
To accelerate the CIA solving process, we propose the Maximum Dwell Rounding (MDR) scheme, which is a fast rounding heuristic. It is based on the idea of activating a chosen control mode as long as possible without violating a desired integrality gap $\bar{\theta}$, i.e., a bound on the accumulated deviation of relaxed and binary controls, and then repeating this with the next promising mode. We apply it iteratively as part of the Adaptive Maximum Dwell Rounding (AMDR) algorithm for finding binary controls that satisfy time-coupled combinatorial constraints such as a TV bound, and we derive optimality conditions of the obtained binary control function with respect to the CIA problem. Based on this scheme, we prove the tightest possible upper bound on the integrality gap for equidistant discretization and the case of two binary controls. This bound depends on the number of intervals $N$, the TV bound $\sigma_{\max}$, and the maximum grid length $\Delta$. We also establish further bounds for non-equidistant grids and for more than two binary controls.

Outline
We give a problem definition of the MIOCP of interest in Sect. 2 and describe the proposed CIA decomposition algorithm with (CIA) as subproblem in Sect. 3. Next, we introduce auxiliary CIA problems and derive a lower bound for these problems in Sect. 4. We define the MDR scheme in Sect. 5 and show its usefulness with respect to solving the CIA problem subject to discrete TV constraints. We continue by analyzing the worst-case integrality gap for $n_\omega = 2$ in Sect. 6 and for $n_\omega > 2$ in Sect. 7. Finally, we present numerical experiments in Sect. 8 and conclusions in Sect. 9.

Mixed-integer optimal control problem
Mixed-integer optimal control problems (MIOCPs) can be equivalently reformulated into problems with affinely entering binary controls in the right-hand side of the ordinary differential equation via the (partial) outer convexification method, see [24] for further details. Therefore, we declare this reformulated MIOCP as the problem of interest and provide the corresponding definitions in this section. We consider problems on a given time horizon $T := [t_0, t_f] \subset \mathbb{R}$. Throughout this paper, we assume a problem involving $n_\omega \geq 2$ binary control realizations. We introduce the binary control function after defining the TV of a function and its associated space.

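Definition 1 (Total variation of a function and BV space) The TV of a function $\omega : T \to \mathbb{R}^{n_\omega}$ is defined to be the quantity

$$\mathrm{TV}(\omega) := \sup_{P \in \mathcal{P}} \frac{1}{2} \sum_{i=1}^{n_\omega} \sum_{j=1}^{n_P} \left| \omega_i(t_j) - \omega_i(t_{j-1}) \right|, \tag{2.1}$$

where $P = (t_0, \ldots, t_{n_P})$ is a partition out of the set $\mathcal{P}$ of all partitions of the interval $T$ and $n_P$ denotes the partition-specific number of time points. We group the functions with finite TV into the space of bounded variation $BV$.

Definition 2 (Binary and relaxed control functions) Let $\omega$ denote the vector of binary controls on the simplex and $\alpha$ its corresponding vector of relaxed controls, defined by their function space domains $\Omega$ and $A$, respectively. That is, $\omega(t) \in \{0,1\}^{n_\omega}$ and $\alpha(t) \in [0,1]^{n_\omega}$, with entries summing up to one.

We consider the problem (MIOCP):

$$\begin{aligned} \min_{x,\,\omega} \quad & \Phi(x(t_f)) \\ \text{s.t.} \quad & \dot{x}(t) = f_0(x(t)) + \sum_{i=1}^{n_\omega} \omega_i(t)\, f_i(x(t)) \quad \text{for a.e. } t \in T, && (2.2) \\ & x(t_0) = x_0, && (2.3) \\ & \mathrm{TV}(\omega) \leq \sigma_{\max}, \quad \omega \in \Omega. && (2.4) \end{aligned}$$

We minimize a Mayer term functional $\Phi \in C^1(\mathbb{R}^{n_x}, \mathbb{R})$ over the binary controls $\omega_i$ and differential states $x \in W^{1,\infty}(T, \mathbb{R}^{n_x})$ with fixed initial values $x_0 \in \mathbb{R}^{n_x}$. Constraint (2.2) expresses the dynamical system as a switched system in partial outer convexified form, i.e., as a sum of a drift term $f_0$ and control-specific functions $f_i$, both out of $C^0(\mathbb{R}^{n_x}, \mathbb{R}^{n_x})$. We assume that there exists a solution $x$ for the above problem; for this, we may assume that a uniform Lipschitz estimate on $f_0$ and $f_i$ exists so that the Picard-Lindelöf theorem is applicable. We limit the number of switches between active modes to be at most $\sigma_{\max}$ in the TV constraint (2.4). Finally, we define (OCP) as the canonical relaxation of problem (MIOCP), where we optimize over $\alpha \in A$ instead of $\omega \in \Omega$.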
Remark 1 When we write that control $i$ is active, we indicate that $\omega_i(t) = 1$. In fact, we count the switches in (2.4), respectively in (2.1), twice since we sum up the control that has just been deactivated with the one that has just been activated. That explains the factor of one-half in (2.1). We remark that our study assumes no mode-specific switching limits, but we may also impose them by splitting up (2.4) into $n_\omega$ inequalities and by dropping the first sum in (2.1).
Without loss of generality, we omit further constraints and continuous control functions $u \in L^\infty(T, \mathbb{R}^{n_u})$ in our problem definition. See [28] for how to cope with these and further extensions.
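The switch counting described in Remark 1 can be illustrated with a small sketch. It assumes a binary control that is piecewise constant on a grid and stored as a matrix with one row per control and one column per interval, so that the total variation reduces to half the summed absolute jumps between consecutive columns. The function name `total_variation` and the matrix `w_demo` are hypothetical and chosen only for this illustration.

```python
from typing import Sequence


def total_variation(w: Sequence[Sequence[int]]) -> float:
    """Half the summed absolute jumps over all control rows.

    Assumption: w[i][j] is the value of control i on interval j and exactly
    one control is active per interval, so every switch contributes one
    jump of the deactivated and one jump of the activated control.
    """
    jumps = 0
    for row in w:  # one row per control
        for left, right in zip(row, row[1:]):
            jumps += abs(right - left)
    return 0.5 * jumps


# Hypothetical control with three modes on five intervals.
# The active mode sequence is 1, 1, 2, 3, 1, i.e., three switches.
w_demo = [
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]
tv_demo = total_variation(w_demo)
```

Without the factor of one-half, the same control would be assigned the value six, because each of the three switches is seen once in the row of the deactivated control and once in the row of the activated control.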

Combinatorial integral approximation decomposition
We propose to solve (MIOCP) with the CIA decomposition [27,31], which relies on the use of direct methods (first discretize, then optimize approach). This section explains and defines the problem's temporal discretization and the subproblems that constitute the decomposition algorithm.
Next, we define the matrix sets of the discretized binary and relaxed binary control functions Ω N , A N .

Definition 5 (Convex combination constraint (Conv) and $\Omega_N$, $A_N$)
We express the requirement that the columns of a matrix $(m_{i,j}) \in [0,1]^{n_\omega \times N}$ sum up to one by

$$\sum_{i=1}^{n_\omega} m_{i,j} = 1 \quad \text{for all } j \in [N],$$

and call it convex combination constraint (Conv) in the remainder. Based on this constraint, we define

$$\Omega_N := \{ w \in \{0,1\}^{n_\omega \times N} \mid w \text{ satisfies (Conv)} \}, \qquad A_N := \{ a \in [0,1]^{n_\omega \times N} \mid a \text{ satisfies (Conv)} \}.$$

We introduce a discretized version of the TV constraint (2.4) that relies on $w$:

Definition 1 (Discretized version of (2.4)) Let $G_N$ and $\sigma_{\max} \in \mathbb{N}$ be given. We use auxiliary variables $\sigma_{i,j} \in \mathbb{N}$ for introducing a discretized version of the TV constraint that reads

$$\sigma_{\max} \geq \frac{1}{2} \sum_{i=1}^{n_\omega} \sum_{j=1}^{N-1} \sigma_{i,j}, \tag{3.1}$$

$$\sigma_{i,j} \geq \pm \left( w_{i,j+1} - w_{i,j} \right) \quad \text{for } i \in [n_\omega],\ j \in [N-1]. \tag{3.2}$$

In order to solve the upcoming subproblems efficiently, we have deliberately formulated the above constraints without an absolute value term. This and replacing $w$ with $a$ in (3.2) results in a differentiable TV constraint. We define the discretizations of (MIOCP) and (OCP) below.

Definition 6 ((NLP rel), (NLP bin)) Consider (OCP) with the following modifications:
• We discretize (2.2) with $G_N$ and by using direct collocation or direct multiple shooting together with an appropriate integrator function (e.g., Runge-Kutta methods [18,23]).
• The controls are piecewise constant functions on $G_N$ with $(a_{i,j}) \in A_N$, i.e., $\alpha_i$ takes the constant value $a_{i,j}$ on the $j$th grid interval.
• The TV constraint (2.4) is replaced by the constraints (3.1) and (3.2). In (3.2), we replace $w$ with $a$.
We denote the resulting discretized optimization problem with relaxed binary control functions $a \in A_N$ by (NLP rel) and by (NLP bin) for fixed binary control functions $w \in \Omega_N$.

Definition 7 ((CIA) problem, $\theta(w)$) Let $a \in A_N$ be given. Then, we define the problem (CIA) to be

$$\min_{w \in \Omega_N} \ \max_{i \in [n_\omega],\, j \in [N]} \left| \sum_{l=1}^{j} (a_{i,l} - w_{i,l})\, \Delta_l \right| \quad \text{s.t. TV constraints (3.1) and (3.2)},$$

where $\Delta_l$ denotes the length of the $l$th grid interval.
We denote with $\theta(w)$ the (CIA) objective value for a feasible solution $w \in \Omega_N$.
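To make the (CIA) objective value of a given rounding concrete, the following is a minimal sketch. It assumes the usual form of the objective from the CIA literature, namely the largest absolute accumulated difference between relaxed and binary controls weighted by the interval lengths. The function name `cia_objective` and the demo data are hypothetical.

```python
from typing import Sequence


def cia_objective(
    a: Sequence[Sequence[float]],
    w: Sequence[Sequence[int]],
    dt: Sequence[float],
) -> float:
    """Largest absolute accumulated control deviation over all controls and intervals.

    a[i][j]: relaxed value, w[i][j]: binary value, dt[j]: length of interval j.
    """
    worst = 0.0
    for a_row, w_row in zip(a, w):
        accumulated = 0.0
        for a_ij, w_ij, dt_j in zip(a_row, w_row, dt):
            accumulated += (a_ij - w_ij) * dt_j
            worst = max(worst, abs(accumulated))
    return worst


# Hypothetical data: two controls, four unit intervals, relaxed values 0.5.
a_demo = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
dt_demo = [1.0, 1.0, 1.0, 1.0]
w_alternating = [[1, 0, 1, 0], [0, 1, 0, 1]]  # three switches
w_one_switch = [[1, 1, 0, 0], [0, 0, 1, 1]]   # one switch

theta_alternating = cia_objective(a_demo, w_alternating, dt_demo)
theta_one_switch = cia_objective(a_demo, w_one_switch, dt_demo)
```

In this toy instance, the alternating control attains an objective value of 0.5 with three switches, whereas the control with a single switch attains 1.0. This is the trade-off that the TV constraints (3.1) and (3.2) introduce into (CIA).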
With these subproblem definitions we are able to summarize the CIA decomposition in Algorithm 1. We first solve the relaxed problem (NLP rel ) and approximate the resulting relaxed binary controls with binary values in the (CIA) problem. The last step consists of evaluating (NLP bin ) with a fixed binary control function w CIA in order to obtain the objective value of (MIOCP).
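The three steps summarized above can be sketched as a thin driver. This is only a structural illustration: the solver callables `solve_nlp_rel`, `solve_cia`, and `eval_nlp_bin` are hypothetical placeholders for whatever NLP and (CIA) solvers are used, and the stand-ins in the demo are deliberately trivial.

```python
from typing import Callable, List, Tuple

Matrix = List[List[float]]


def cia_decomposition(
    solve_nlp_rel: Callable[[], Matrix],
    solve_cia: Callable[[Matrix], Matrix],
    eval_nlp_bin: Callable[[Matrix], float],
) -> Tuple[Matrix, float]:
    """Run the three steps of the decomposition with user-supplied solvers."""
    a = solve_nlp_rel()          # step 1: relaxed controls from (NLP rel)
    w = solve_cia(a)             # step 2: binary approximation from (CIA)
    objective = eval_nlp_bin(w)  # step 3: evaluate (NLP bin) with w fixed
    return w, objective


# Trivial stand-ins, assumed only for demonstration purposes.
def demo_relaxed() -> Matrix:
    return [[0.7, 0.2], [0.3, 0.8]]


def demo_rounding(a: Matrix) -> Matrix:
    # Activate, on each interval, the control with the largest relaxed value.
    n_intervals = len(a[0])
    winners = [max(range(len(a)), key=lambda i: a[i][j]) for j in range(n_intervals)]
    return [[1 if winners[j] == i else 0 for j in range(n_intervals)] for i in range(len(a))]


def demo_objective(w: Matrix) -> float:
    return float(sum(w[0]))  # placeholder objective


w_cia, value = cia_decomposition(demo_relaxed, demo_rounding, demo_objective)
```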
We remark that the TV constraints (3.1)-(3.2) in (NLP rel ) and (NLP bin ) may be replaced by other TV reformulations, such as the ones presented in [17] or even dropped because (CIA) guarantees feasibility with respect to bounded TV in any case. The algorithmic focus of this article lies on the (CIA) step, so that we omit further considerations of TV reformulations.
We stress that this algorithm solves two problems, each of which is less hard to solve than the original MINLP, which denotes the discretized (MIOCP). It yields only an approximation of the optimal solution of (MIOCP), but Sager et al. [26] showed that, without TV constraints and under certain regularity assumptions, the difference between the differential states based on relaxed and binary control values depends linearly on the difference of the integrals of their corresponding control functions. In particular, they proved that the so-called integrality gap with feasible solutions constructed by (CIA) is linearly bounded by the maximum grid length $\Delta$ in the sense of

$$\theta(w) \leq C(n_\omega)\, \Delta,$$

where $C(n_\omega)$ is a constant depending on the number of controls. This implies that the differential state trajectories of (MIOCP) and (OCP), without the TV constraint, are arbitrarily close with vanishing grid length $\Delta$, and by assumed Lipschitz continuity of the objective the same also holds for the objective values. In Sects. 6-7, we are going to show that in the presence of discrete TV constraints (3.1)-(3.2) the rounding error does not vanish in general with grid length going to zero. More precisely, Theorem 4 and Corollary 7 will present concrete results on the rounding error.

(CIA-$\bar{\theta}$), (CIA-$\bar{\theta}$-init) and an associated lower bound
In this section, we address a problem that minimizes the number of used switches subject to a given approximation error $\bar{\theta}$ that shall not be exceeded by the accumulated control deviation. Afterward, we aim for a lower bound on its objective that will be useful in the next section and introduce useful auxiliary variables and definitions for this. Apart from the fixed initial active control in (4.3), the problems (CIA-$\bar{\theta}$) and (CIA) from Definition 7 are closely connected with each other because the TV constraints (3.1) and (3.2) are reinterpreted as the objective function subject to a fixed approximation error $\bar{\theta}$. This justifies the naming. We will introduce the MDR algorithm in Sect. 5 to (heuristically) solve (CIA-$\bar{\theta}$-init). By applying this algorithm to all $i \in [n_\omega]$ as initial active controls, we exploit this relationship to solve (CIA-$\bar{\theta}$) as well, which will then be used as part of a bisection algorithm to solve (CIA).
We stress that fixing the initial active control $i_0$ may seem odd. However, this fixing reduces the problem complexity, which later yields an optimality result in Theorem 1, Sect. 5, for the solution constructed by the MDR algorithm concerning (CIA-$\bar{\theta}$-init).
We notice that (CIA-$\bar{\theta}$-init) is very similar to (SCARP) from [3,4]. The latter problem aims at minimizing the switching costs, representing a generalized objective function of (CIA-$\bar{\theta}$-init), whereas in (CIA-$\bar{\theta}$-init) the initial active control is fixed.

Remark 2 (Link to scheduling theory)
On an equidistant grid, (CIA-$\bar{\theta}$) can be reformulated into the following equivalent scheduling problem: On a single machine, minimize the total setup costs (TSC) until the $N$th processed job, $N \leq n$, so that $n$ jobs $(f, k)$ are processed within $f \in [n_\omega]$ job families subject to release times $r_{f,k}$, deadlines $d_{f,k}$, equal processing times $\Delta$, and sequence-independent setup costs; this can be summarized in the scheduling notation of [13]. In the following, we will revert to scheduling-like concepts but explicitly dispense with scheduling notation so as not to distract the reader from the usual MIOCP notation.
Next, we need some definitions to derive a lower bound for (CIA-$\bar{\theta}$-init) at the end of this section. We stress that we establish our results on an equidistant grid but will sometimes drop this assumption in definitions used in later sections.
If the active control of $w \in \Omega_N$ on interval $j \in \{2, \ldots, N\}$ differs from the one on interval $j-1$, then we say $w$ switches on $j$. We introduce the set of switches $S := \{ j \in \{2, \ldots, N\} \mid w \text{ switches on } j \}$ and set $n_s := |S|$. We denote by $\tau_j \in [N]$ the corresponding interval of the $j$th switch of $w$, where we set $\tau_0 := 0$, $\tau_{n_s+1} := N$. On an equidistant grid and if $i \in [n_\omega]$ is active between two consecutive switches or one switch and the first/last interval, we define the set of activations of $i$ between these switches as an activation block $B \subset [n_i]$. On a general grid, we further define the length of the $j$th activation block, which lasts from the $(j-1)$st switch on, i.e., from interval $\tau_{j-1}$, until before the $j$th switch, i.e., until interval $\tau_j - 1$, via an auxiliary variable that sums up the corresponding interval lengths. We notice that the switches actually occur on the grid points; however, we have indexed the variables $w_{i,j}$ according to the intervals, and therefore, for simplicity, we refer to switches on intervals. In the following, we will sometimes abbreviate activation block with block.
In order to keep the number of used switches small and when deciding to set up a new block, it is highly relevant to know how many activations could be at most included in this block beginning with activation $k$. An activation $j > k$ cannot be included in the block if its release interval begins later than the deadline interval of activation $k$ plus the number of activations between $k$ and $j$. We give a definition that formalizes these initial-activation-dependent deadlines of blocks. Based on these block deadlines, it is straightforward to introduce the notion of a block deadline feasible partition of activations into blocks. The constraint (4.3) imposes that the control $i_0$'s first activation has to be executed on the first interval, for which we introduce the definition of fixed initial active control feasibility.
Definition 11 (db i,k , block deadline and fiac feasible partition) Consider an equidistant grid. The deadline of a block for i ∈ [n ] that begins with the kth activation, k ∈ [n i ], is defined by Let P i denote a partition of all activations [n i ] for i ∈ [n ] . We call P i block deadline feasible if for all subsets B ∈ P i , i.e., all blocks, hold: Furthermore, we refer to P i as a fixed initial active control (fiac) feasible partition if for all k ∈ B 1 hold where B 1 ∈ P i denotes the first activation block of P i .
In the last definition, we provided the concept of a control specific partition of all activations.
The $k$th activation of control $i \in [n_\omega]$ does generally not coincide with the $k$th interval. The following example illustrates the introduced concepts and, in particular, that there may be in total more possible but fewer necessary activations than intervals $N$.
Example 1 Let matrices $a \in A_N$ and $w \in \Omega_N$ for equidistant discretization be given, where $n_\omega = 3$, $N = 9$. Consider $i = 1$ to be the fixed initial active control and a rounding threshold of $\bar{\theta} = 1\Delta$. Then, we deal with in total eleven possible activations with their release and deadline intervals, among them

[2,3], [3,9], [9, ∞], [1,6], [6,7], [7,8], [8, ∞], [1,5], [4,6], [6, ∞].

There are 4, 3, and 2 activations in $w$ for the controls $i = 1, 2$, and $3$, respectively. These activations are grouped into in total 4 activation blocks so that $w$ uses 3 switches. For instance, the first block of control $i = 1$ has a length of $3\Delta$; its deadline follows from Definition 11.
As illustrated in Example 1, a feasible solution $w$ of (CIA-$\bar{\theta}$-init) may not use all possible activations. To this end, we define an extension of the set of blocks of $w$ to become a partition of $[n_i]$ for all $i \in [n_\omega]$ in the following lemma. The extension may seem arbitrary, but is necessary to compare any $w \in \Omega_N$ with partitions of $[n_i]$. Thereby, we establish a connection between the above feasibility concepts and a feasible solution $w$ of (CIA-$\bar{\theta}$-init).

Lemma 1
For an equidistant grid, let $w \in \Omega_N$ be feasible for (CIA-$\bar{\theta}$-init) and let $P'_i$ denote the set of blocks of $w$ for control $i \in [n_\omega]$. We define $P_i$ as the extension of $P'_i$ that additionally covers the activations of $[n_i]$ not used by $w$. Then, the partitions $P_i$ are block deadline feasible for all $i \in [n_\omega]$, and $P_{i_0}$ is fiac feasible.
Proof We first argue that $P_i$ is by definition a partition of $[n_i]$. We need to prove that these partitions are block deadline feasible, respectively fiac feasible. If for $i \in [n_\omega]$ the block deadline condition were violated for an activation block $B \in P_i$, this would imply that the $(\max\{k \in B\})$th activation of $i$ has been processed before its release interval because $B$ cannot be interrupted by activations from other controls. Therefore, the block deadline condition holds and block deadline feasibility is established. We apply the same argument for confirming fiac feasibility. By constraint (4.3), the first activation of $i_0$ is scheduled on the first interval. Hence, all activations $k \in B_1$ of the first block $B_1$ must be processed on the $k$th interval and therefore require a release interval that is no later than $k$. ◻
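The bookkeeping of switches and activation blocks used in this section can be sketched as follows. The matrix `w_demo` is a hypothetical binary control and not the matrix of Example 1; it is merely chosen so that its counts agree with the numbers stated there (4, 3, and 2 activations, four blocks, three switches, and a first block of three intervals). The function names are assumptions made for this illustration, and intervals and controls are numbered from one as in the text.

```python
from typing import List, Sequence, Tuple


def active_sequence(w: Sequence[Sequence[int]]) -> List[int]:
    """Return, per interval, the 1-based index of the active control."""
    n_intervals = len(w[0])
    return [
        next(i for i, row in enumerate(w) if row[j] == 1) + 1
        for j in range(n_intervals)
    ]


def switches_and_blocks(
    w: Sequence[Sequence[int]],
) -> Tuple[List[int], List[Tuple[int, int, int]]]:
    """Return the switch intervals and the blocks (control, first interval, length)."""
    active = active_sequence(w)
    switches = [j + 1 for j in range(1, len(active)) if active[j] != active[j - 1]]
    blocks = []
    start = 0
    for j in range(1, len(active) + 1):
        # A block ends at the horizon end or when the active control changes.
        if j == len(active) or active[j] != active[start]:
            blocks.append((active[start], start + 1, j - start))
            start = j
    return switches, blocks


# Hypothetical control with active sequence 1,1,1,2,2,2,3,3,1 on nine intervals.
w_demo = [
    [1, 1, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 0],
]
switches_demo, blocks_demo = switches_and_blocks(w_demo)
activations_demo = [sum(row) for row in w_demo]
```

On an equidistant grid, the third entry of each block tuple multiplied by the grid length gives the block length.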

Remark 3 (Necessary condition for feasibility of (CIA-$\bar{\theta}$-init))
The formation of activations into block deadline feasible and, for $i_0$, fiac feasible partitions is a necessary feasibility criterion of $w \in \Omega_N$ for (CIA-$\bar{\theta}$-init) by virtue of Lemma 1. Nevertheless, it is not a sufficient criterion since the order of the processing of blocks is not clarified. In particular, one might order the blocks such that they contain an activation whose release interval is later than its executed interval.
Next, we formalize specific partitions of the control $i$'s possible activations $[n_i]$ whose blocks are constructed to include as many activations as possible without violating their block deadlines. These quantities serve as tools to derive a lower bound on the number of necessary blocks per control, independent of the other controls' blocks. This will result in a lower bound for (CIA-$\bar{\theta}$-init) in Proposition 2. We distinguish between the case that $i$ is the fixed initial active control, i.e., $i = i_0$, or not.
The MDR scheme from the next section creates switches that resemble the above $k^{(\mathrm{init})}_l$ terms. The latter, though, only express the grouping of activations, while the switches explicitly specify the corresponding intervals as well. It turns out that the partitions $P_{i,\min}$ and $P^{\mathrm{init}}_{i_0,\min}$ are minimal in the number of blocks, as indicated in the following proposition.

Proposition 1 Let the partitions $P_{i,\min}$, $P^{\mathrm{init}}_{i_0,\min}$ be given as in Definition 12. For any partition $P_i$ of $[n_i]$, with $i = i_0$ included, we denote its restriction to the first $\tilde{n}_i \leq n_i$ activations by $P_i|_{\tilde{n}_i}$. Then, the partition $P_{i,\min}$, respectively $P^{\mathrm{init}}_{i_0,\min}$, consists for any $\tilde{n}_i \leq n_i$ of a minimal number of blocks on the first $\tilde{n}_i$ activations compared with all other block deadline feasible, respectively both block deadline and fiac feasible, partitions $P_i$.
Proof We begin with $P_{i,\min}$. It is block deadline feasible because the deadline of the last activation for each block is defined in (4.8) and (4.10) to be less than or equal to the corresponding block deadline. Assume there is a block deadline feasible partition $P_i$ that uses fewer blocks than $P_{i,\min}$ on the first $\tilde{n}_i$ activations. In other words, there exists a subset of the first $j$ blocks of $P_i|_{\tilde{n}_i}$ that includes more activations than the ones included into the first $j$ blocks of $P_{i,\min}|_{\tilde{n}_i}$. We consider the minimal number of blocks $j$ with this property, see (4.12). The block index $j$ is unique since the association of activations to blocks is monotonically increasing, meaning that there are no $k_1$th, $k_2$th activations, $k_1 < k_2$, with $k_1 \in B_{i,l_1}$, $k_2 \in B_{i,l_2}$ and $l_1 > l_2$. We conclude (4.13), so that block $B'_{i,j}$'s first activation $k'$ is smaller than or equal to $k$, which marks the earliest activation of $B_{i,j}$. The definition of release intervals (4.5)-(4.6) implies $r_{i,k'} \leq r_{i,k}$ for $k' \leq k$. Similarly, the definition of block deadlines (4.7) implies $db_{i,k'} \leq db_{i,k}$ for $r_{i,k'} \leq r_{i,k}$, and we find with (4.13) in particular inequality (4.14). On the other hand, the definition of $P_{i,\min}$ in (4.10) implies (4.15). Then, the definition of $j$ yields (4.16), where the last inequality must hold due to the assumption of $P_i$ being block deadline feasible.
Inequality (4.14) contradicts inequality (4.16), or equivalently there is no such partition $P_i$, and $P_{i,\min}$ indeed uses a minimal number of blocks on any restriction to the first $\tilde{n}_i \leq n_i$ activations. The same argumentation for $j \geq 2$ in equation (4.12) can be applied in order to prove the result for $P^{\mathrm{init}}_{i_0,\min}$, as $P_{i_0}$ is also assumed to be block deadline feasible in this case and the same holds for $P^{\mathrm{init}}_{i_0,\min}$ from the second block on. We just need to take care of the case $j = 1$, i.e., if $B_{i_0,j}$, respectively $B'_{i_0,j}$, is the first block of the control $i_0$. Here, $\max\{k \in B_{i_0,1}\} < \max\{k \in B'_{i_0,1}\}$ cannot appear, since $P_{i_0}$ is assumed to be fiac feasible and the construction of the first block of $P^{\mathrm{init}}_{i_0,\min}$ implies that no further activation can be added to $B_{i_0,1}$ without violating fiac feasibility. Thus, $j = 1$ is impossible in (4.12) and $P^{\mathrm{init}}_{i_0,\min}$ is also minimal in the number of blocks. ◻

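Corollary 1 Consider the setting of Proposition 1 and the controls $i \in [n_\omega]$. We define $\tilde{n}_{i,N}$, respectively $\tilde{n}_{i_0,N}$, as the number of activations of control $i$, respectively $i_0$, that are due no later than $N$, and $nb^{N}_{i,\min}$, respectively $nb^{N,\mathrm{init}}_{i_0,\min}$, as the number of blocks of $P_{i,\min}$, respectively $P^{\mathrm{init}}_{i_0,\min}$, restricted to these activations. There is no block deadline feasible partition, respectively block deadline and fiac feasible partition, that uses fewer than $nb^{N}_{i,\min}$, respectively $nb^{N,\mathrm{init}}_{i_0,\min}$, blocks on these activations.
Proof The result follows directly from Proposition 1 with $\tilde{n}_i = \tilde{n}_{i,N}$ and $\tilde{n}_{i_0} = \tilde{n}_{i_0,N}$. ◻
As a final result for this section, we establish a lower bound for (CIA-$\bar{\theta}$-init) that will be useful in Theorem 1.

Proposition 2 (Lower bound for (CIA-$\bar{\theta}$-init)) Every feasible solution of (CIA-$\bar{\theta}$-init) uses at least

$$\sum_{i \in [n_\omega],\, i \neq i_0} nb^{N}_{i,\min} + nb^{N,\mathrm{init}}_{i_0,\min} - 1 \tag{4.17}$$

switches.
Proof By virtue of Lemma 1, a feasible solution of (CIA-$\bar{\theta}$-init) satisfies the necessary condition of generating only block deadline feasible partitions $P_i$, and if $i = i_0$, the activation partition $P_i$ is also fiac feasible. Moreover, all activations are executed no later than their deadline interval. That holds particularly for those that are due no later than $N$. Hence, we can apply Corollary 1 and conclude that the minimum number of blocks of a feasible solution until the in total $N$th activation is $nb^{N}_{i,\min}$, respectively $nb^{N,\mathrm{init}}_{i_0,\min}$. Finally, we obtain the claim (4.17) by summing up over all controls and using that the setup of the first block does not count as a switch. ◻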

Maximum dwell rounding
This section is dedicated to solving (CIA). Generally, we recommend a tailored BNB algorithm that has been proposed by Jung et al. [15,27] and implemented in the open-source software package pycombina [7]. The BNB algorithm outperforms standard MILP solvers in a case study [14] by three orders of magnitude. However, in some instances, the algorithm struggles to find the optimal solution quickly because the node relaxation can be quite weak [8]. We, therefore, present a polynomial-time algorithm that constructs good initial guesses for BNB and, in some situations, solves (CIA) even to optimality. We proceed by giving the necessary definitions of the algorithm itself and its auxiliary variables in the first subsection and investigate beneficial properties in the second subsection.

Definition of the algorithm
Definition 13 For $a \in A_N$, $w \in \Omega_N$, control $i \in [n_\omega]$ and interval $j \in [N]$ we define the accumulated control deviation and the forward control deviation variables as

$$\theta_{i,j} := \sum_{l=1}^{j} (a_{i,l} - w_{i,l})\, \Delta_l, \qquad \gamma_{i,j} := \theta_{i,j-1} + a_{i,j}\, \Delta_j,$$

where $\Delta_l$ denotes the length of the $l$th grid interval, and set $\theta_{i,0} := 0$.
The following lemma is useful for Proposition 3 and Lemma 7.

Lemma 2 Consider $a \in A_N$ and $w \in \Omega_N$. For each $j \in [N]$ holds

$$\sum_{i=1}^{n_\omega} \theta_{i,j} = 0, \qquad \sum_{i=1}^{n_\omega} \gamma_{i,j} = \Delta_j.$$

Proof These equations follow directly from the definition of $\theta$ and $\gamma$ as well as from the convexity property of $a$ and $w$. ◻

Definition 14 (Inadmissible, next forced and forced activation) Consider a rounding threshold $\bar{\theta} > 0$ and $a \in A_N$. Let the values of $w \in \Omega_N$ be given until interval $j - 1$. We call the choice $w_{i,j} = 1$ admissible if $\gamma_{i,j} - \Delta_j \geq -\bar{\theta}$, and we call the control $i$ otherwise inadmissible. Similarly, the choice $w_{i,j} = 1$ is forced if we have that $\gamma_{i,j} > \bar{\theta}$. Let further $N_j(i) \in \{j, \ldots, N\}$ denote the next interval on which control $i$ would become forced without activation after interval $j - 1$. Then, we define a control $i^* \in [n_\omega]$ on interval $j$ to be next forced if and only if $N_j(i^*) \leq N_j(i)$ for all $i \in [n_\omega]$.
The above definition allows more than one control to be next forced or forced for an arbitrary interval $j \in [N]$. This is supposedly not the standard case in our discussion but will also be taken into account in our considerations. The guiding idea behind the above control activations is that we include more and more summands of $w$ into the computation of $\theta$ and can choose the next column of $w$ accordingly. With this definition we have introduced necessary activation properties of feasible solutions for (CIA-$\bar{\theta}$-init), but neglected so far the fixed initial active control constraint (4.3). The following definition fills this gap.
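The activation properties above can be checked numerically with a short sketch. It rests on assumptions that the surrounding text only states in words: the forward control deviation of control $i$ on interval $j$ is taken to be the accumulated deviation up to interval $j-1$ plus the relaxed value on interval $j$ times the interval length, an activation is treated as admissible if the deviation after activating stays at or above the negative threshold, and as forced if the deviation without activating exceeds the threshold. All function names are hypothetical.

```python
from typing import List, Sequence


def forward_deviation(
    a: Sequence[Sequence[float]],
    w: Sequence[Sequence[int]],
    dt: Sequence[float],
    j: int,
) -> List[float]:
    """Forward control deviation of every control on the 1-based interval j.

    Only the first j-1 columns of w are read, i.e., w may be a partial rounding.
    """
    gamma = []
    for a_row, w_row in zip(a, w):
        theta_prev = sum((a_row[l] - w_row[l]) * dt[l] for l in range(j - 1))
        gamma.append(theta_prev + a_row[j - 1] * dt[j - 1])
    return gamma


def is_admissible(gamma_ij: float, dt_j: float, theta_bar: float) -> bool:
    # Assumed reading: activating control i on interval j keeps the deviation >= -theta_bar.
    return gamma_ij - dt_j >= -theta_bar


def is_forced(gamma_ij: float, theta_bar: float) -> bool:
    # Assumed reading: not activating control i on interval j pushes the deviation above theta_bar.
    return gamma_ij > theta_bar


# Two controls, unit grid, first control active on the first interval.
a_demo = [[1.0, 0.5], [0.0, 0.5]]
w_partial = [[1], [0]]
dt_demo = [1.0, 1.0]
theta_bar_demo = 0.4

gamma_demo = forward_deviation(a_demo, w_partial, dt_demo, j=2)
forced_demo = [is_forced(g, theta_bar_demo) for g in gamma_demo]
admissible_demo = [is_admissible(g, dt_demo[1], theta_bar_demo) for g in gamma_demo]
```

In this small instance both controls are forced and both are inadmissible on the second interval, which is the kind of situation that Definition 14 explicitly allows.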

Definition 15 (Initially admissible control)
We define a control $i \in [n_\omega]$ to be initially admissible if it is admissible on the first interval and if there is no other control $i_1 \neq i$ that is forced on the first interval.
Now, we can define the MDR in Algorithm 2. The MDR algorithm assumes a given initial control $i_0$ and activates it until it becomes inadmissible or until there is another forced control. We require the control $i_0$ to be initially admissible because otherwise $w^{\mathrm{MDR}}$ would violate the control accumulation constraint (4.2). Whenever the active control has to be replaced, the control $i$ with the maximum forward control deviation $\gamma_{i,j}$ is set active and remains so until it becomes inadmissible or another control becomes forced. This procedure is performed forward in time until the end of the time horizon $N$ is reached. We named the algorithm "maximum dwell rounding" because it tries to stay in the current mode as long as possible without violating the given rounding threshold.
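The verbal description of MDR can be turned into a compact sketch. This is not a transcription of Algorithm 2 but an illustration under stated assumptions: admissible and forced are interpreted via the forward control deviation as in the rest of this section, the replacement control is chosen among all controls other than the current one, ties are broken by the lowest index, a small tolerance `eps` guards floating-point comparisons, and controls are indexed from zero. The name `mdr` and all parameters are hypothetical.

```python
from typing import List, Sequence


def mdr(
    a: Sequence[Sequence[float]],
    dt: Sequence[float],
    theta_bar: float,
    i0: int,
    eps: float = 1e-12,
) -> List[List[int]]:
    """Maximum-dwell style rounding of relaxed controls a (rows: controls).

    i0 is the 0-based initial control and is assumed to be initially admissible.
    """
    n_controls, n_intervals = len(a), len(dt)
    theta = [0.0] * n_controls  # accumulated deviation up to the previous interval
    w = [[0] * n_intervals for _ in range(n_controls)]
    current = i0
    for j in range(n_intervals):
        gamma = [theta[i] + a[i][j] * dt[j] for i in range(n_controls)]
        current_admissible = gamma[current] - dt[j] >= -theta_bar - eps
        other_forced = any(
            gamma[i] > theta_bar + eps for i in range(n_controls) if i != current
        )
        if j > 0 and (not current_admissible or other_forced):
            # Switch to the other control with the largest forward deviation.
            others = [i for i in range(n_controls) if i != current]
            current = max(others, key=lambda i: gamma[i])
        w[current][j] = 1
        theta = [gamma[i] - w[i][j] * dt[j] for i in range(n_controls)]
    return w


# Relaxed values 0.5 for two controls on four unit intervals, threshold 1.0.
w_dwell = mdr([[0.5] * 4, [0.5] * 4], [1.0] * 4, theta_bar=1.0, i0=0)

# Data of the two-interval instance discussed later in this section.
w_small = mdr([[1.0, 0.5], [0.0, 0.5]], [1.0, 1.0], theta_bar=0.4, i0=0)
```

In the first call, the initial control stays active for two intervals before it becomes inadmissible, so the sketch produces a single switch. In the second call, the active control is inadmissible on the second interval and the sketch switches immediately.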
The AMDR is defined in Algorithm 3 and can be described as a bisection method. We initialize it with a trivial lower bound LB and upper bound UB for (CIA). The algorithm runs MDR iteratively with different thresholds $\bar{\theta}$ and initially admissible controls as long as the difference of lower and upper bound is larger than the chosen tolerance TOL (lines 2-5). If the computed control function satisfies the TV constraint and exhibits a (CIA) objective value that is smaller than the current UB, we update UB, reset the rounding threshold $\bar{\theta}$ via interval halving of UB − LB, and save the current best solution (lines 6-10). The evaluation of $\theta(w)$ is necessary since MDR may construct a control function with a rounding gap larger than the desired gap $\bar{\theta}$, as will be discussed in the next subsection. If no computed control function $w$ with given initial control and $\bar{\theta}$ fulfills the TV constraint, then we increase the LB (lines 11-14).
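The bisection idea can be sketched as follows. The sketch is not Algorithm 3 itself: the rounding routine is passed in as a hypothetical callable `round_fn(a, dt, theta_bar, i0)`, the trivial upper bound is taken as the horizon length, and the lower bound is also raised when a rounding satisfies the switch limit but does not improve the upper bound. That last rule is an assumption added here so that the loop always makes progress.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Binary = List[List[int]]


def cia_objective(a, w, dt) -> float:
    worst = 0.0
    for a_row, w_row in zip(a, w):
        acc = 0.0
        for a_ij, w_ij, dt_j in zip(a_row, w_row, dt):
            acc += (a_ij - w_ij) * dt_j
            worst = max(worst, abs(acc))
    return worst


def num_switches(w: Binary) -> int:
    # Every switch shows up as one jump in two different rows.
    return sum(abs(r[j] - r[j - 1]) for r in w for j in range(1, len(r))) // 2


def initially_admissible(a, dt, theta_bar: float, i: int) -> bool:
    gamma = [row[0] * dt[0] for row in a]
    other_forced = any(gamma[k] > theta_bar for k in range(len(a)) if k != i)
    return gamma[i] - dt[0] >= -theta_bar and not other_forced


def amdr(
    a: Sequence[Sequence[float]],
    dt: Sequence[float],
    sigma_max: int,
    round_fn: Callable[..., Binary],
    tol: float = 1e-6,
) -> Tuple[Optional[Binary], float]:
    lb, ub = 0.0, float(sum(dt))  # trivial bounds on the (CIA) objective
    best = None
    while ub - lb > tol:
        theta_bar = 0.5 * (lb + ub)  # interval halving
        improved = False
        for i0 in range(len(a)):
            if not initially_admissible(a, dt, theta_bar, i0):
                continue
            w = round_fn(a, dt, theta_bar, i0)
            value = cia_objective(a, w, dt)
            if num_switches(w) <= sigma_max and value < ub:
                ub, best, improved = value, w, True
        if not improved:
            lb = theta_bar
    return best, ub


# Stub rounding routine (hypothetical): ignores theta_bar, returns a fixed control.
def fixed_rounding(a, dt, theta_bar, i0) -> Binary:
    first, second = [1, 1, 0, 0], [0, 0, 1, 1]
    return [first, second] if i0 == 0 else [second, first]


best_demo, ub_demo = amdr([[0.5] * 4, [0.5] * 4], [1.0] * 4, 1, fixed_rounding)
```

In practice, `round_fn` would be an MDR implementation; the stub only serves to make the control flow of the bisection visible.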

Solution quality and properties of MDR (Algorithm 2)
Although the MDR algorithm may seem simple, it generates optimal solutions w for (CIA−̄−init) under certain conditions, for which we need the following definition.

Definition 16 (Canonical switch)
We define a switch j ∈ S as defined in Definition 10 to be canonical, if on interval j holds: exactly one control i 1 is inadmissible and exactly one control i 2 ≠ i 1 is forced.
We build our theoretical results of this section mainly on the following assumption.

Assumption 1 (MDR uses only canonical switches)
Suppose w MDR ∈ Ω N has been generated by MDR. We assume that all switches of w MDR are canonical.

Properties of the MDR algorithm
Assumption 1 may seem restrictive, although it is satisfied anyway under certain conditions.

Proposition 3 (MDR with $n_\omega = 2$ and $\bar{\theta} \geq \frac{1}{2}\Delta$ uses canonical switches) Consider $n_\omega = 2$, $a \in A_N$ and any grid $G_N$. If we choose $\bar{\theta} \geq \frac{1}{2}\Delta$, then the control function $w^{\mathrm{MDR}}$ constructed by the MDR scheme uses only canonical switches.
Proof We have to prove two claims. Claim 1 follows from the definition of forced activation and from $\bar{\theta} \geq \frac{1}{2}\Delta$. For proving claim 2, let us assume $i_1$ is forced on $j \in [N]$, i.e., $\gamma_{i_1,j} > \bar{\theta}$. By virtue of Lemma 2 for $\gamma_{i_2,j}$, we derive the required inequality for the second control $i_2$.
Assumption 1 is not necessarily true for a control problem that involves more than two binary controls. It may, however, hold for special cases of such a problem. For instance, if the relaxed values are of bang-bang type, i.e., $a_{i,j} \in \{0, 1\}$, and $\bar{\theta}$ is chosen smaller than the smallest activation block, then the situation resembles the case $n_\omega = 2$ and Assumption 1 may hold (without proof). On the other hand, Example 3 is going to demonstrate that this assumption can indeed be quite restrictive.
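The two steps in the proof of Proposition 3 can be written out as a short worked derivation. It is a sketch under assumptions that are not spelled out in the surrounding text: a control $i$ is taken to be forced on interval $j$ if $\gamma_{i,j} > \bar{\theta}$ and admissible if $\gamma_{i,j} - \Delta_j \geq -\bar{\theta}$, and Lemma 2 is read as $\gamma_{i_1,j} + \gamma_{i_2,j} = \Delta_j$ for two controls, with $\Delta_j \leq \Delta$ the length of interval $j$.

```latex
\begin{align*}
  \gamma_{i_1,j} + \gamma_{i_2,j} = \Delta_j \leq \Delta \leq 2\bar{\theta}
    &\;\Longrightarrow\; \text{not both } \gamma_{i_1,j} > \bar{\theta}
       \text{ and } \gamma_{i_2,j} > \bar{\theta}, \\
  \gamma_{i_1,j} > \bar{\theta}
    &\;\Longrightarrow\; \gamma_{i_2,j} - \Delta_j = -\gamma_{i_1,j} < -\bar{\theta}
    \;\Longrightarrow\; i_2 \text{ is inadmissible on } j.
\end{align*}
```

Under these readings, at most one control can be forced on any interval, and a forced control always comes with an inadmissible counterpart, which is the pattern required of a canonical switch.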
Assumption 1 allows us to prove strong properties of control functions obtained by MDR and AMDR. The first result expresses that the MDR scheme produces indeed control functions which exhibit a (CIA) objective value smaller or equal than ̄.

3
On mixed-integer optimal control with constrained total… Lemma 3 (MDR solution satisfies ̄ bound) Let Assumption 1 hold and let w MDR ∈ Ω N be constructed by MDR with given threshold ̄ . Then, we obtain (w MDR ) ≤̄.
Proof As soon as the activated control becomes inadmissible or there is a forced control on interval $j \geq 2$, $w^{\mathrm{MDR}}$ has a switch by the definition of MDR. By Assumption 1, the newly activated control is both forced and admissible, hence $\theta_{i,j} \geq -\bar{\theta}$, and there is also no other forced control on $j$, thus $\theta_{i,j} \leq \bar{\theta}$. ◻
The following example demonstrates that $\theta(w^{\mathrm{MDR}}) > \bar{\theta}$ may occur in general without Assumption 1.

Example 2
Consider an equidistant discretization and $a \in A_N$ with the first values given as $a_{1,1} = 1$, $a_{2,1} = 0$, $a_{1,2} = 0.5$, $a_{2,2} = 0.5$. For this relaxed value, let $w^{\mathrm{MDR}} \in \Omega_N$ be the corresponding binary control function computed by MDR with given threshold $\bar{\theta} = 0.4\bar{\Delta}$ and initial control $i = 1$. Then, $w^{\mathrm{MDR}}_{1,2} = 0$, $w^{\mathrm{MDR}}_{2,2} = 1$ holds since the second control becomes forced on the second interval. At the same time, control $i = 2$ is inadmissible on the second interval, hence Assumption 1 is violated, and we obtain $\theta_{2,2} = (a_{2,1} - w^{\mathrm{MDR}}_{2,1} + a_{2,2} - w^{\mathrm{MDR}}_{2,2})\bar{\Delta} = -0.5\bar{\Delta} < -\bar{\theta}$.
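The mechanism behind Example 2 can be checked numerically. The following Python sketch is a simplified reading of the MDR rule as it is described in this section, not the authors' implementation: a control counts as forced when its forward control deviation exceeds the threshold, the active control counts as inadmissible when keeping it active would push its accumulated deviation below the negative threshold, and on a switch the forced (otherwise any other) control with the largest forward deviation is activated. The function name `mdr_sketch`, the 0-based indexing and the tie-breaking are assumptions made for illustration.

```python
def mdr_sketch(a, dt, theta_bar, i0):
    """Simplified reading of MDR for n >= 2 controls (hypothetical helper).

    a[i][j]: relaxed value of control i on interval j, dt[j]: interval length.
    Returns the binary controls w and the final accumulated deviations.
    """
    n, N = len(a), len(dt)
    theta = [0.0] * n                      # accumulated control deviation
    w = [[0] * N for _ in range(n)]
    active = i0
    for j in range(N):
        # forward control deviation: value of theta if control i is inactive on j
        gamma = [theta[i] + a[i][j] * dt[j] for i in range(n)]
        if j > 0:
            forced = [i for i in range(n) if i != active and gamma[i] > theta_bar]
            inadmissible = gamma[active] - dt[j] < -theta_bar
            if forced or inadmissible:
                candidates = forced or [i for i in range(n) if i != active]
                active = max(candidates, key=lambda i: gamma[i])
        w[active][j] = 1
        theta = [gamma[i] - w[i][j] * dt[j] for i in range(n)]
    return w, theta

# Data of Example 2 with unit interval length and threshold 0.4 (controls 0-indexed)
a = [[1.0, 0.5], [0.0, 0.5]]
w, theta = mdr_sketch(a, dt=[1.0, 1.0], theta_bar=0.4, i0=0)
violates_bound = max(abs(t) for t in theta) > 0.4
```

With these data the sketch switches to the second control on the second interval, and the resulting deviation violates the threshold, which is the situation the example describes for a non-canonical switch.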
We reuse concepts from the previous chapter, especially activations and their grouping into blocks. For proving optimality we note that w MDR constructs partitions of the activations [n i ], i ∈ [n ] that are due no later than N and let P MDR i denote these partitions. With the notation from Corollary 1 we want to show that these partitions coincide with the partitions constructed in Definition 12 because then Corollary 1 would imply w MDR has ∑ i∈[n ],i≠i 0 nb N i,min + nb N,init i 0 ,min activation blocks or equivalently (5.1) * = (w MDR ).
and the claim follows from inequality (5.2). Consider the first blocks B 1 ∈ P MDR i 0 and B init 1 ∈ P init i 0 ,min . By Assumption 1, the MDR algorithm activates i 0 until it becomes inadmissible on the interval 1 of the first switch: We compare this inequality with the Definition 9 of release intervals and notice that either the next activation 1 of control i 0 has a release interval that is later than 1 or there is no further possible activation. So, if the 1 th activation exists, then its release interval has not yet been reached: Again by the definition of MDR and Assumption 1, i 0 is forced on interval j , which is equivalent to d i 0 ,min{k∈B j } = j . The MDR scheme activates i 0 either until N (then trivially B j = B init j ) or until it becomes inadmissible on interval j+1 (by Assumption 1). With the argumentation for j = 1 , inadmissible means hereby the (max{k ∈ B j } + 1) th activation has a release interval greater than j+1 . Using that the block's first activation j is processed on its deadline interval d i 0 ,min{k∈B j } , this yields The above inequality expresses that B j contains as many activations as possible without violating its block deadline db i 0 ,min{k∈B j } and by construction of P init i 0 ,min this is equivalent to B j = B init j . This settles the case i = i 0 in (5.3). We can reuse the above arguments about forced and inadmissible activation for j ≥ 2 in order to analogously prove the case i ≠ i 0 in (5.3). ◻

Remark 5
Theorem 1 is predicated on the assumption of an equidistant grid. We stress that after grid refinement of the optimal control problem, i.e., after several rounds of applying the CIA decomposition, this might be a restriction.
The following corollary establishes a way to find the optimum of (CIA-$\bar{\theta}$-init) in the setting of Theorem 1.

Corollary 2 (Using MDR to find a control function with minimum number of switches)
Consider the setting of Theorem 1. A control function $w^*$ that uses a minimum number of switches, i.e., $\sigma(w^*) = \sigma^*$, can be found by running MDR.

Proof Let $i$ be the initial control of $w^*$. Execute MDR with $i$ as initial control so that the result follows directly from Theorem 1. ◻
It is not clear a priori which control is the optimal initial active one in order to minimize switches. In practice, MDR must be executed successively for all controls $i \in [n_\omega]$ as initial active control. This is expressed by the following corollary.

Corollary 3 (Link between MDR and (CIA-$\bar{\theta}$))
Consider the setting of Theorem 1. We assume that the MDR algorithm constructs, for all $i \in [n_\omega]$ as initial active controls, control functions $w^{\mathrm{MDR}}$ that use only canonical switches. Then, there is a minimizing control $w^* \in \Omega_N$ for (CIA-$\bar{\theta}$) that only uses canonical switches. Moreover, there exists $i_0 \in [n_\omega]$ such that running MDR with $i_0$ as initial control produces $w^{\mathrm{MDR}} \in \Omega_N$ that minimizes (CIA-$\bar{\theta}$).
Proof If the MDR algorithm produces $w^{\mathrm{MDR}}$ that only uses canonical switches, $w^{\mathrm{MDR}}$ is optimal by Theorem 1 for (CIA-$\bar{\theta}$-init) with the corresponding initial control fixed. Then, the result follows from the fact that the optimal solution of (CIA-$\bar{\theta}$) is contained in the set of optimal solutions of the problems (CIA-$\bar{\theta}$-init) with each control $i \in [n_\omega]$ initially fixed.
Proposition 3 establishes that $w^{\mathrm{MDR}}$ uses only canonical switches for $\bar{\theta} \geq \frac{1}{2}\bar{\Delta}$ and thus the above cases cannot appear for $\bar{\theta} \geq \frac{1}{2}\bar{\Delta}$. Let us focus on $\bar{\theta} < \frac{1}{2}\bar{\Delta}$. In order to create a control function $w$ that does fulfill $\theta(w) \leq \bar{\theta}$, we need to change at least one activation of $w^{\mathrm{MDR}}$ on an earlier interval $l < j$. However, no earlier change of activation is possible:
• We cannot extend an activation block at its end, since the active control is inadmissible.
• If the active control $i_1$ is admissible on $l$, then the other control $i_2$ is not forced on $l$; otherwise it would be active in the MDR scheme. This means … where we applied $\bar{\theta} < \frac{1}{2}\bar{\Delta}$. This indicates that the (CIA) objective value of $w$ is again greater than $\bar{\theta}$.
Hence, no previous activation of $w^{\mathrm{MDR}}$ can be changed, so that there is no $w$ with $\theta(w) \leq \bar{\theta}$. ◻

Properties of the AMDR algorithm
Theorem 2 states that the AMDR algorithm is able to find the optimal solution of (CIA) for $n_\omega = 2$ and equidistant discretization. Otherwise, strict assumptions are required for optimality, and in general, the feasible solution found represents only a promising upper bound.
Proof AMDR is a bisection algorithm that either decreases UB (lines 7-8) or increases LB (lines 12-13) by at least one half of (UB − LB) in every while loop iteration (line 2). From this and because of TOL > 0, we conclude that the while loop and AMDR as a whole terminate after finitely many iterations.
1. The objective of (CIA) cannot be greater than $t_f - t_0$, even with no switches allowed, i.e., $\sigma_{\max} = 0$. Since we initialize the AMDR algorithm with UB $= t_f - t_0$, it finds a feasible solution in any case.
2. Every $w^{\mathrm{MDR}}$ generated by MDR in line 5 that satisfies the TV constraints together with $\theta(w^{\mathrm{MDR}}) <$ UB represents an upper bound on $\theta^*$, i.e., UB $= \theta(w^{\mathrm{MDR}}) \geq \theta^*$. For proving that AMDR constructs valid lower bounds LB on $\theta^*$ we exploit that $w^{\mathrm{MDR}}$ only uses canonical switches for $n_\omega = 2$ and by assumption for $n_\omega > 2$, so that Corollary 3 is applicable. If there is no initial control $i \in [n_\omega]$ for which MDR produces, for given $\bar{\theta}$, a $w^{\mathrm{MDR}}$ that uses at most as many switches as the TV constraints allow, we conclude by Corollary 3 that there exists no such $w \in \Omega_N$ for this specific threshold $\bar{\theta}$ and, hence, LB $= \bar{\theta} \leq \theta^*$ is a true lower bound on the optimal (CIA) objective value. Moreover, if MDR constructs for a given $\bar{\theta}$ and all initial controls $i \in [n_\omega]$ control functions $w^{\mathrm{MDR}}$ with $\theta(w^{\mathrm{MDR}}) > \bar{\theta}$, Lemma 4 and the assumption in (b) guarantee that this $\bar{\theta}$ is also a true lower bound on the optimal (CIA) objective value. Altogether, AMDR iteratively generates valid lower bounds LB and upper bounds UB for $\theta^*$ and produces a feasible solution that is optimal up to the chosen tolerance TOL.
3. MDR runs forward in time and computes solely the control deviations $\theta_{i,j}$ and $\gamma_{i,j}$ for all intervals $j \in [N]$, therefore $C_{\mathrm{MDR}} \in O(N)$. The interval halving in AMDR ensures that we execute the while loop at most $\log_2((t_f - t_0)/\mathrm{TOL})$ times. Inside this loop, we need to run the MDR scheme in the worst case with all $n_\omega$ controls as initial controls. Combining these findings yields the asserted complexity.
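The bisection argument of the proof can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Algorithm 3 itself: `try_threshold` is a hypothetical oracle that stands in for running MDR with all initial controls and checking the TV constraint, the lower bound is initialized with zero, and the function name `bisect_threshold` is made up for this example.

```python
import math

def bisect_threshold(try_threshold, t0, tf, tol):
    """Bisection over the rounding threshold (illustrative sketch).

    try_threshold(theta_bar) returns the objective value of a feasible control
    with objective <= theta_bar, or None if no such control is found.
    """
    lb, ub = 0.0, tf - t0                  # assumed initial bounds
    iterations = 0
    while ub - lb > tol:
        theta_bar = 0.5 * (lb + ub)
        value = try_threshold(theta_bar)
        if value is not None:
            ub = min(value, theta_bar)     # feasible: tighten the upper bound
        else:
            lb = theta_bar                 # infeasible: raise the lower bound
        iterations += 1
    return lb, ub, iterations

# Toy oracle: pretend the smallest attainable objective value is 0.3
lb, ub, its = bisect_threshold(lambda th: th if th >= 0.3 else None, 0.0, 1.0, 1e-4)
max_its = math.ceil(math.log2((1.0 - 0.0) / 1e-4))
```

Since every pass at least halves UB − LB, the loop count stays within the logarithmic bound used in the complexity statement; each pass would cost up to $n_\omega$ MDR runs of $O(N)$ each.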

Remark 6
Several meaningful modifications for the AMDR algorithm are available. We may use it also for finding control functions fulfilling other combinatorial constraints such as minimum dwell time constraints by checking them together with the TV constraint in line 5. As part of the MDR scheme, the control with maximum forward control deviation is activated if the previously active control is inadmissible. Instead, one may choose a less greedy variant. For instance, we could activate the next forced and admissible control. Lastly, the initial upper bound UB can be reduced, as we will point out in the next section.

Remark 7
If we drop the TV constraint on w , the AMDR scheme finds a control function with the same objective value as the one obtained by the control function of Sum-Up Rounding [24] (without proof).
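For reference, the Sum-Up Rounding heuristic mentioned in this remark can be sketched as follows for the case in which exactly one control is active per interval. This is a minimal illustration of the standard scheme; the function name `sum_up_rounding` and the tie-breaking rule (smallest index wins) are assumptions.

```python
def sum_up_rounding(a, dt):
    """Sum-Up Rounding with one active control per interval (illustrative sketch).

    a[i][j]: relaxed value of control i on interval j, dt[j]: interval length.
    """
    n, N = len(a), len(dt)
    theta = [0.0] * n                      # accumulated deviation sum((a - w) * dt)
    w = [[0] * N for _ in range(n)]
    for j in range(N):
        # deviation each control would have if it stayed inactive on interval j
        gamma = [theta[i] + a[i][j] * dt[j] for i in range(n)]
        # activate the control with the largest value; ties go to the smallest index
        i_star = max(range(n), key=lambda i: (gamma[i], -i))
        w[i_star][j] = 1
        theta = [gamma[i] - w[i][j] * dt[j] for i in range(n)]
    return w

w_sur = sum_up_rounding([[0.6, 0.6, 0.6], [0.4, 0.4, 0.4]], [1.0, 1.0, 1.0])
```

The sketch places no limit on the number of switches, which is why a TV-constrained variant such as AMDR is needed when frequent switching must be avoided.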
Most results of this section are based on the assumption of an equidistant discretization, which is common in practice. However, the assumption of dealing only with canonical switches in the produced control function is critical. The following example illustrates that a control function generated by MDR with non-canonical switches may use more switches than needed or may not satisfy the rounding bound $\bar{\theta}$.

Example 3
Consider an equidistant grid. Let the following two relaxed values $a^1, a^2 \in A_N$ be defined as … Then, MDR with $i = 1$ as initial control and $\bar{\theta}_1 = 0.75\bar{\Delta}$, respectively $\bar{\theta}_2 = (0.6 + \varepsilon)\bar{\Delta}$, constructs the following control functions: …
In this section, we use the MDR algorithm and previous results to deduce bounds on (CIA). We consider a given (CIA) problem with grid $G_N$, relaxed value $a \in A_N$ and maximum number of switches $\sigma_{\max} > 0$. The idea in the following is to construct a control function $w^{\mathrm{MDR}}$ that bounds the objective of (CIA). For finding an appropriate initial active control for the MDR scheme, we introduce an auxiliary grid $G_{\tilde{N}}$ which ends at $t_f$ and has $\tilde{N}$ intervals: … In the definition of $G_{\tilde{N}}$ we intersect two sets because we consider given $G_N$ and $\sigma_{\max}$. To specify the rounding of a value $t \geq t_0$ down to the next grid point, we utilize the following bracket notation: … Depending on whether we deal with an equidistant grid or not, we can prove a sharp bound for (CIA). We distinguish between these two cases in the upcoming results and introduce the following constant: (6.1) … We propose to apply the rounding threshold (6.2) … in the MDR scheme and claim that this choice will later be beneficial for proving upper bounds on (CIA). Next, we establish useful properties of the rounding $\lfloor \cdot \rfloor_{G_N}$ to the next grid point.

Lemma 5 (Distance to next grid points)
Consider $\sigma_{\max} > 0$ and the rounding threshold $\bar{\theta}$ defined as above. The following holds true: …

Proof
1. Let us first consider the non-equidistant case. If $t_0 + j\bar{\theta} \leq t_f$, we deduce that the maximum distance of $t_0 + j\bar{\theta}$ to the next smaller or equal grid point is $\bar{\Delta}$. If $t_0 + j\bar{\theta} > t_f$, we have $\lfloor t_0 + j\bar{\theta} \rfloor_{G_N} = t_f$ and obtain … This settles the non-equidistant case: $\lfloor t_0 + j\bar{\theta} \rfloor_{G_N} \geq t_0 + j\bar{\theta} - \bar{\Delta}$. For the equidistant case, we observe … We look at the right fraction and notice that the numerator consists of a product of an integer and $\bar{\Delta}$, whereas the denominator is the integer $3 + 2\sigma_{\max}$. Thus, the maximum cut-off by rounding down to the closest grid point is … which is equal to $\bar{\Delta} - C_1\bar{\Delta}$ and proves the claim.
2. This follows from an argument similar to the one for claim 1. For the non-equidistant case we only need to consider $t_0 + … \leq t_f$; for the equidistant case we again take advantage of $t_f - t_0 = N\bar{\Delta}$.
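The rounding-down operation used in this proof can be illustrated with a short Python sketch. It assumes that the grid is given as a sorted list of grid points starting at $t_0$ and ending at $t_f$, and that values beyond $t_f$ are mapped to $t_f$, as in the case distinction above; the helper name `round_down_to_grid` is hypothetical.

```python
from bisect import bisect_right

def round_down_to_grid(t, grid):
    """Largest grid point <= t; values beyond the last grid point map to it."""
    if t < grid[0]:
        raise ValueError("t lies before the first grid point")
    return grid[bisect_right(grid, t) - 1]

grid = [0.0, 0.5, 1.0, 1.5, 2.0]           # equidistant example with length 0.5
rounded = round_down_to_grid(1.3, grid)    # cut-off 1.3 - rounded is below 0.5
capped = round_down_to_grid(2.7, grid)     # beyond t_f, so t_f is returned
on_grid = round_down_to_grid(1.0, grid)    # grid points are left unchanged
```

In the example, the cut-off never exceeds the grid length, in line with the non-equidistant estimate $\lfloor t_0 + j\bar{\theta} \rfloor_{G_N} \geq t_0 + j\bar{\theta} - \bar{\Delta}$ for points that do not exceed $t_f$.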
◻
We continue with a lemma that quantifies the length of activation blocks in $w^{\mathrm{MDR}}$.

Lemma 6 (Length of activation blocks)
Consider a feasible control solution for (CIA-$\bar{\theta}$) that only uses canonical switches. Then, for the length of its activation block $l$, $2 \leq l \leq \sigma_{\max}$, the following holds: …
Proof Let $i$ be the active control on activation block $l$. We use the assumption regarding canonical switches twice. First, $i$ is forced for the earlier switch $\tau_{l-1}$: … and second, it is inadmissible on interval $\tau_l$: … By definition of activation blocks, the length of block $l$ equals $\sum_{j=\tau_{l-1}}^{\tau_l - 1} \Delta_j$, so that we obtain by rearranging (6.4): … Plugging (6.3) into the above inequality yields … which settles the non-equidistant case. For an equidistant grid, we compute … and because the block length is a multiple of $\bar{\Delta}$, it follows from the block length being greater than $2\bar{\theta} - \bar{\Delta}$ that … ◻
Next, we propose in Algorithm 4 a specification of the initial active control $i_0$. We observe that a small number of switches on $G_{\tilde{N}}$ in terms of $\bar{\theta}$ is sufficient, as quantified in the following lemma.

Lemma 7
The MDR algorithm applied to the auxiliary grid $G_{\tilde{N}}$ with rounding threshold $\bar{\theta}$, $n_\omega = 2$ and $i_0$ from Algorithm 4 as initial control constructs a control function $w^{\mathrm{MDR}}$ that uses at most one switch on $G_{\tilde{N}}$.

Proof We distinguish between the three possibilities of the initial control in Algorithm 4.
1. If MDR is initialized with $i_2$ and $\sum_{j=1}^{\tilde{N}} a_{i_1,j}\Delta_j \leq \bar{\theta}$ holds for $i_1$, the latter does not become forced on $G_{\tilde{N}}$. For this reason there is no switch.
2. If $i_1$ with $\sum_{j=1}^{\tilde{N}} a_{i_1,j}\Delta_j \leq 2\bar{\theta} - \bar{\Delta} + C_1\bar{\Delta}$ is the initial active control, a switch has to occur in case $i_2$ is forced on some interval $\tau_1 \in [\tilde{N}]$. We need to prove that $i_1$ does not become forced after the first switch, which, due to $n_\omega = 2$ and Lemma 2, is equivalent to $i_2$ not becoming inadmissible, because then there is no other switch. For this, we derive a lower bound on the length of the first activation block, where $i_1$ is active. The control $i_1$ becomes inadmissible at the earliest when it has been active on intervals $j$ with $a_{i_1,j} = 0$ whose lengths sum up to more than $\bar{\theta}$, i.e., $\lfloor t_0 + \bar{\theta} \rfloor_{G_N} - t_0$. With this observation and Lemma 5.1 we derive … Note that $\gamma_{i_1,j}$ is monotonically increasing with increasing interval $j > \tau_1$ as long as $i_1$ is inactive, i.e., $w_{i_1,j-1} = 0$. Hence, if we are able to prove $\gamma_{i_1,\tilde{N}} \leq \bar{\theta}$ in case of $w_{i_1,j} = 0$ for $j > \tau_1$, we also have $\gamma_{i_1,j} \leq \bar{\theta}$ for any interval $j > \tau_1$, meaning there is no second switch. Altogether, we get with the above inequality … so that $w^{\mathrm{MDR}}$ switches no more than once on $G_{\tilde{N}}$.
3. Otherwise, we have … in the else case. We can argue similarly as in the previous case, which is why we only have to prove $\gamma_{i_1,\tilde{N}} \leq \bar{\theta}$. Since $i_1$ is a next forced control on the first interval, there is an interval $l \leq \tau_1$ with $\sum_{j=1}^{l} a_{i_1,j}\Delta_j > \bar{\theta}$. This implies the interval $\tau_1$ of the earliest possible switch is given by … from which we find $\sum_{j=1}^{\tau_1} \Delta_j > 2\bar{\theta}$. We conclude for the grid point … Using the (Conv) property in the third equation yields … To conclude, there is also at most one switch. ◻
The above three lemmata are crucial for the following theorem, which provides an upper bound on (CIA).

Theorem 3 Consider any grid $G_N$, relaxed values $a \in A_N$ and a maximum number of switches $\sigma_{\max} > 0$. The objective of (CIA) is bounded by …
Proof We want to prove that the control function $w^{\mathrm{MDR}}$ constructed by MDR with rounding threshold $\bar{\theta}$ from (6.2) and initial control from Algorithm 4 is feasible and satisfies the claimed bound. We observe $\bar{\theta} \geq \frac{1}{2}\bar{\Delta}$ from its definition in (6.2) and the definition of $C_1$ in (6.1). Thus, we can apply Proposition 3 in connection with Lemma 3 so that $w^{\mathrm{MDR}}$ indeed fulfills the claimed bound: … What remains to be shown is that $w^{\mathrm{MDR}}$ is a feasible solution for (CIA-$\bar{\theta}$), i.e., it does not use more than $\sigma_{\max}$ switches. In the sequel, we write $n = \sigma_{\max}$ in variable indices to improve readability. We assume that $\sigma_{\max}$ switches have already been taken in $w^{\mathrm{MDR}}$ and calculate the maximum length of the possibly last activation block, i.e., $t_f - t_{\tau_n - 1}$. In Lemma 7 we have derived that at most one switch is used on the reduced grid $G_{\tilde{N}}$ until $t_f$, but there may follow another switch shortly afterwards, i.e., $\tau_2 \geq \tilde{N} + 1$. For the remaining $\sigma_{\max} - 2$ activation blocks until $t_{\tau_n - 1}$ we can apply Lemma 6, since Proposition 3 states that MDR uses canonical switches for $n_\omega = 2$. Lemma 6 states … Combining these findings and using Lemma 5.2 results in … Let $i$ denote the control that is active after the $\sigma_{\max}$th switch of $w^{\mathrm{MDR}}$. Note that $\theta_{i,j}$ is monotonically decreasing with increasing interval $j \geq \tau_n$ since $i$ is chosen to be active on interval $j$. Hence, if we are able to show that $i$ is admissible on interval $N$, then it is also admissible on earlier intervals. We want to prove the admissibility of control $i$ on interval $N$ as in this case there will be no further switch until $N$. For this, let us assume $i$ is inadmissible on interval $N$. We obtain … In the second inequality we used that control $i$ is forced on the interval $\tau_n$ of the $n$th, respectively $\sigma_{\max}$th, switch. With this contradiction, there cannot be a further switch after $\tau_n$; in other words, $w^{\mathrm{MDR}}$ uses at most $\sigma_{\max}$ switches and is a feasible solution of (CIA). This completes the proof. ◻
In the sequel, we elaborate on how sharp the upper bound from Theorem 3 is, and we thereby exclude the case $\sigma_{\max} \geq N - 1$; otherwise the TV constraint would no longer be restrictive. Before presenting the main result in this context, we again need a technical lemma.
We have that … where $\lceil x \rceil_{0.5}$ denotes the rounding up of $x \in \mathbb{R}$ to the next multiple of 0.5, as defined in Sect. 1.3.
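The half-step ceiling used in this lemma is easy to evaluate. The following Python sketch implements rounding up to the next multiple of 0.5 and checks the relation $\lceil x \rceil \leq \lceil x \rceil_{0.5} + 0.5$ on a few sample values; the helper name `ceil_half` and the sample values are assumptions made for illustration.

```python
import math
from fractions import Fraction

def ceil_half(x):
    """Round x up to the next multiple of 0.5."""
    return math.ceil(2 * x) / 2

# Sample values, including an exact rational with an odd denominator
samples = [1.2, 1.5, 1.6, Fraction(10, 7)]
rounded = [ceil_half(x) for x in samples]

# The integer ceiling exceeds the half-step ceiling by at most 0.5
relation_holds = all(math.ceil(x) <= ceil_half(x) + 0.5 for x in samples)
```

Using `Fraction` for rational arguments avoids floating-point effects when the argument lies exactly on a multiple of 0.5.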

Proof
Since $R$ is a rational number with $3 + 2\sigma_{\max}$ in the denominator, we have … Moreover, using basic properties of floor and ceiling functions yields
(6.10) $\lceil R \rceil \leq \lceil R \rceil_{0.5} + 0.5$.
Next, we calculate … $\in \mathbb{Z}$ are valid, so that we can deduce from the above inequality …
Since there are more intervals $N$ than the maximum number of switches $\sigma_{\max}$ plus one, there is an interval $j$ on which the optimal solution $w$ of (CIA) has the value $w_{i_1,j} = 1$, while $a_{i_1,j} = 0$ holds. This results in … Otherwise, if $N \geq 3 + 2\sigma_{\max}$, we proceed as follows:
1. We construct a specific matrix $a$ that depends on the choice of $\sigma_{\max}$ and $N$.
2. We prove that the MDR scheme constructs, for both initial active controls, for this value of $a$ and with a rounding threshold of …, control functions $w^{\mathrm{MDR}}$ that use more than $\sigma_{\max}$ switches.
Then, we can come back to the idea of the AMDR scheme and Theorem 2.1, which states that $w^{\mathrm{AMDR}}$ is feasible for (CIA), i.e., uses at most $\sigma_{\max}$ switches, resulting in … Theorem 2 also provides in 2. (a) a statement about the relation to the optimal solution of (CIA): Because the tolerance TOL can be arbitrarily small, we conclude that the optimal solution of (CIA) involves an objective value of at least …
1. We reuse the notation of $R$ from Lemma 8 and introduce the auxiliary constant $n_I \in \mathbb{N}$: … Next, we are interested in designing a specific $a \in A_N$ with the property to enforce an improper covering by any $w \in \Omega_N$ that satisfies a (CIA) objective of at most $\bar{\theta}$. By improper covering we indicate that $w \in \Omega_N$ has to use more than $\sigma_{\max}$ switches in order to yield the desired (CIA) objective value of at most $\bar{\theta}$. We create sets of consecutive intervals for $a$ on which either $a_{i_1}$ or $a_{i_2}$ is set to one (and the other control is thereby set to zero). We call these sets of consecutive intervals with the same value index sections. We generate $\sigma_{\max} + 2$ index sections, where the two controls are alternately set to one in $a$, with the idea that a feasible solution $w$ of (CIA) with at most $\sigma_{\max}$ switches contains at most $\sigma_{\max} + 1$ activation blocks.
The first index section will include $\lfloor R \rfloor$ intervals, followed by index sections with $n_I$ intervals, and the last index section arises from the remaining intervals until $N$ is reached. After conveying some intuition of the specific $a \in A_N$, we continue with a technical definition of the index set $J$ that specifies the index sections on which $a_{i_1}$ is set to one: … With these definitions we introduce $a$ by fixing the values of control $i_1$, with a case distinction that depends on whether $\sigma_{\max}$ is odd: (6.14) …

The value of $a_{i_1}$ on the $\lceil R \rceil$th interval in the second and third case may seem unintuitive. The idea of this construction is that it results in $\theta_{i_1,\lceil R \rceil} = \lceil R \rceil_{0.5}$ if control $i_1$ is active neither on the first index section nor on the $\lceil R \rceil$th interval. In this way, control $i_1$ needs to be already active on the first index section in order to maintain a (CIA) objective value of at most $\bar{\theta}$.
2. We want to prove that the MDR scheme with the rounding threshold from (6.13) and with $a$ defined in (6.14) constructs a control function that uses more than $\sigma_{\max}$ switches, independent of the initial active control. For this, we establish the following claim:
a) If $i_1$ is the initial active control, the $k$th switch of $w^{\mathrm{MDR}}$ happens before the $(\lceil R \rceil + k n_I)$th interval, where $k \in [\sigma_{\max} + 1]$.
b) If $i_2$ is the initial active control, the $k$th switch of $w^{\mathrm{MDR}}$ happens before the …
Assuming the claim is true, $w^{\mathrm{MDR}}$ indeed uses more than $\sigma_{\max}$ switches because the $(\lceil R \rceil + (\sigma_{\max} + 1) n_I)$th interval exists, i.e., its index is smaller than or equal to $N$: $\lceil R \rceil + (\sigma_{\max} + 1) n_I = \lceil R \rceil + (\sigma_{\max} + 1)$ …
The inequality above shows that there are indeed $\sigma_{\max} + 2$ index sections for $a$ as described above. With this information we deduce that $\bar{\theta} < \frac{1}{2}\bar{\Delta}$ results directly in more than $\sigma_{\max}$ switches or in control solutions that do not satisfy the claimed optimal (CIA) objective value from (6.12) anyway:
- If $a$ consists only of zeros and ones and $\bar{\theta} < \frac{1}{2}\bar{\Delta}$, the MDR algorithm creates switches on all intervals $j$ for which $a_{\cdot,j} \neq a_{\cdot,j-1}$ holds true. Thus, the activation blocks of $w^{\mathrm{MDR}}$ would match the index sections of $a$, i.e., $w^{\mathrm{MDR}} = a$. As we derived $\sigma_{\max} + 2$ index sections for $a$, there are $\sigma_{\max} + 2$ blocks for $w^{\mathrm{MDR}}$ and therefore $\sigma_{\max} + 1$ switches.
- If $a_{i_1,\lceil R \rceil} = 0.5$, then there is no $w$ with $\theta(w) < \frac{1}{2}\bar{\Delta}$ regardless of which control is active on interval $\lceil R \rceil$ since $a$ is either zero or one on all other intervals.
Hence, we can exclude the case $\bar{\theta} < \frac{1}{2}\bar{\Delta}$ from further consideration.
Thus, we are left with the case $\bar{\theta} \geq \frac{1}{2}\bar{\Delta}$. In this case, we can apply Proposition 3 and conclude that we deal only with canonical switches. We return to proving the claim and proceed via induction.
(a) We consider $k = 1$ and conclude from $N \geq 3 + 2\sigma_{\max}$ that $\lceil R \rceil_{0.5} \geq 1$ holds. Plugging this into inequality (6.8) from Lemma 8 results in $\lceil R \rceil_{0.5} < n_I$, and thus … By construction of $a$, the values $a_{i_1,j}$ are equal to one for $1 \leq j \leq \lfloor R \rfloor$. The value $a_{i_1,\lceil R \rceil}$ is either 0.5 or 1. Therefore, $-0.5 \leq \theta_{i_1,\lceil R \rceil} \leq 0$ holds for the accumulated control deviation of $w^{\mathrm{MDR}}$ with $i_1$ as initial active control. After the $\lceil R \rceil$th interval, $n_I$ intervals follow on which $a_{i_1,j}$ is zero. We conclude that $i_1$ becomes inadmissible by (6.15) before interval $\lceil R \rceil + n_I$ and hence the first switch appears before this interval.
(b) We show the claim for the first two switches because we are interested in a switch that occurs after interval $\lceil R \rceil$ in the induction step. Let $k = 1$. Since … we conclude that control $i_2$ becomes inadmissible on interval $\lceil R \rceil$ at the latest when being the initial active control and, equivalently, $w^{\mathrm{MDR}}$ has a switch on interval $\lceil R \rceil$ at the latest. This is equivalent to at least one activation of $i_1$ up to and including interval $\lceil R \rceil$, which we use for proving the assertion in case of $k = 2$. Let us assume the second switch happens on or after interval $\lceil R \rceil + n_I$. This implies $i_1$ would be admissible on that interval and we derive … Consequently, the second switch happens before the $(\lceil R \rceil + n_I)$th interval.
Induction step Assume the assertion holds for $k - 1 \leq \sigma_{\max}$; we show that it is also true for $k$. At first, we prove an auxiliary result. For $i \in [2]$ and $j \geq \lceil R \rceil$ we have that (6.16) … for some $z \in \mathbb{Z}$. We prove equation (6.16) by computing the accumulated control deviation: … For $j > \lceil R \rceil$ we have defined $a_{i_1,j} \in \{0, 1\}$ so that … and therefore (6.16) is satisfied with $z = $ … In order to make use of the established auxiliary result, we need to argue that the $(k-1)$st switch happens after the interval $\lceil R \rceil$. In case a) the MDR algorithm will not deactivate $i_1$ before the $\lceil R \rceil$th interval due to $a_{i_1,j} = 1$. The same holds on the $\lceil R \rceil$th interval if $a_{i_1,\lceil R \rceil} = 0.5$, because we have established $\bar{\theta} \geq \frac{1}{2}\bar{\Delta}$. In case b) we use the base case for the second switch. We consider the interval $\tau_1$ of the first switch of case a) and compare the two accumulated control deviations for the two cases a) and b) on $\tau_1$ and obtain $\theta^{(b)}_{i_1,\tau_1} \geq \theta^{(a)}_{i_1,\tau_1}$ because $i_2$ has already been activated in case b) in contrast to case a). Since $\tau_1 > \lceil R \rceil$, we are done. Now, without loss of generality, let $i_1$ be the active control after the switch on interval $\tau_{k-1}$. We know that $i_2$ is active and thus admissible on interval $\tau_{k-1} - 1$: … which implies by Lemma 2 for the control $i_1$ … and by equation (6.16) we have for some $z_{i_1} \geq 1$ (6.17) … The control $i_2$ is inadmissible on interval $\tau_{k-1}$ as there are only canonical switches. If $a_{i_2,\tau_{k-1}} = 1$ were true, then $i_2$ would already have been inadmissible on interval $\tau_{k-1} - 1$. Also, $a_{i_2,\tau_{k-1}} = 0.5$ is not possible because we derived $\tau_{k-1} > \lceil R \rceil$. We conclude $a_{i_2,\tau_{k-1}} = 0$. From this and the induction hypothesis, which states that the $(k-1)$st switch appears before the $(\lceil R \rceil + (k-1)n_I)$th interval, it follows that $a_{i_1,j} = 1$ for the intervals $j$ between $\tau_{k-1}$ and $\lceil R \rceil + (k-1)n_I$. Hence, $\theta_{i_1,\lceil R \rceil+(k-1)n_I} \leq (\lceil R \rceil_{0.5} - 1)\bar{\Delta}$ holds due to (6.17). Finally, we assume $i_1$ can stay active up to and including interval $\lceil R \rceil + kn_I$ without becoming inadmissible. This and $a_{i_1,j} = 0$ for $\lceil R \rceil + (k-1)n_I + 1 \leq j \leq \lceil R \rceil + kn_I$ imply … Thus, $i_1$ is not active until the $(\lceil R \rceil + kn_I)$th interval; with an analogous computation for case b), $i_1$ is not active until the $(\lceil R \rceil + (k-1)n_I)$th interval. Thereby, we have shown that the assertion indeed holds for $k$. Altogether, the constructed control function $w^{\mathrm{MDR}}$ uses more than $\sigma_{\max}$ switches for the chosen rounding threshold $\bar{\theta}$ so that the (CIA) objective value is at least $\bar{\theta}$, and we conclude that the claimed theorem is true. ◻
We complete this section by drawing a conclusion from Theorems 3 and 4.

Corollary 4
Consider an equidistant grid. The objective of (CIA) is bounded by (6.18) … which is the tightest possible bound.
Proof The inequality (6.18) is achieved by Theorem 3 applied to the equidistant case and rearranging terms: … It is the tightest possible bound by Theorem 4 and the case …


Upper bounds on (CIA) with $n_\omega > 2$
Deriving bounds for the (CIA) problem with more than two controls is more difficult than in the last chapter as the number of possibilities increases significantly. Let $\theta_{\max}$ denote the maximum possible objective value of (CIA) for any given $a \in A_N$ and $G_N$ in this section. We will first use known results to derive lower and upper bounds for $\theta_{\max}$. Then, we turn to the continuous relaxation of (CIA), which allows us to prove a sharper lower bound. Based on this, we state a conjecture about the actual value of $\theta_{\max}$.
Proof This bound has been established in Theorem 4 and Corollary 4 for the case $n_\omega = 2$. The example provided in the proof of Theorem 4 can also be applied to the case $n_\omega > 2$ by setting the values of the relaxed controls $a_i$, for all $i \in [n_\omega]$, $i > 2$, to zero. ◻
Proof For the (CIA) problem without TV constraints, but with minimum up time constraints, the sharp bound is proven in Theorem 2 in [32], where the constant $C_U \geq 0$ represents the given minimum up time. If we require for (CIA) that an activated control remains active for a time period of at least $\frac{t_f - t_0}{\sigma_{\max} + 1}$, at most $\sigma_{\max}$ switches take place. Thus, the TV constraint serves as a relaxation of the minimum up time constraint. ◻
We tighten the above results by investigating the continuous version of the (CIA) problem.

Definition 17 (CCIA)
Let $\alpha \in A$ and $\sigma_{\max} \in \mathbb{N}$ be given. Then, we define the continuous CIA problem (CCIA) to be … $\sigma_{\max} \geq \mathrm{TV}(\omega)$.

Fig. 1 gives a visualization of this specifically defined control function $\alpha$. We have … so that $\bigcup_{i \in [n_\omega]} I_i = T$ follows and because the intervals $I_i$ are all disjoint, we obtain $\alpha_i(t) = 1$ for exactly one control $i$ and for all $t \in T$. Hence, $\alpha \in A$. The next observation about $\alpha$ is that it consists of $n_\omega + (\sigma_{\max} + 2 - n_\omega) = \sigma_{\max} + 2$ activation blocks (interpreted in this continuous setting), meaning there are $\sigma_{\max} + 1$ changes of the active control. Now, let us assume we can approximate $\alpha$ with a binary control function $\omega \in \Omega$ resulting in a (CCIA) objective value of less than $t$. We have … So, each control $\omega_i$, $i \in [n_\omega]$, needs to be active for some time until $t_0 + it$, resulting in at least $n_\omega - 1$ switches up to and including $t_0 + n_\omega t$. Then, we have that … This, and using that the next activation blocks of $\alpha$ last for a period of $2t$, imply that each control $\omega_i$ needs to be activated again up to and including $t_0 + (n_\omega + 2i)t$. If it were possible for some $i \in [n_\omega]$ to skip the activation of $\omega_i$ without violating the control deviation bound $t$, this would result in …
Consider $\sigma_{\max} = 1$ in the first example. Then, $\theta = \bar{\Delta}$ and $\theta_{\max} \geq \bar{\Delta}$ follows from the above corollary. Any asymmetric modification of $(a^1_{i,j})$ with unequal control accumulation $\sum_{j=1}^{3} a^1_{i_2,j}$ would result in a binary control function $w^{\mathrm{OPT}}$ that activates the controls with highest control accumulation and hence $\theta < \bar{\Delta}$. We conclude that the claimed bound is sharp, i.e., $\theta_{\max} = \bar{\Delta}$.
Finding the exact value of $\theta_{\max}$ is difficult due to the nonconvex objective $\max_a \min_w \max_{i \in [n_\omega], j \in [N]} \left| \sum_{l=1}^{j} (a_{i,l} - w_{i,l})\Delta_l \right|$ and the tremendously increased number of different $\omega \in \Omega$ when $n_\omega > 2$, but we conjecture that the lower bound in Proposition 4 cannot be improved. We recognize the symmetry of the constructed $\alpha$ in the proof: Any modification of $\alpha$ that alters the length of its activation blocks would result either in fewer than $\sigma_{\max} + 2$ activation blocks or in at least one block with a smaller length compared with the previous length. The latter block length would be smaller than $\frac{t_f - t_0}{2\sigma_{\max} + 4 - n_\omega}$ if the block is the first control's activation, respectively smaller than $2 \cdot \frac{t_f - t_0}{2\sigma_{\max} + 4 - n_\omega}$ otherwise. With the argumentation from the proof of Proposition 4, this would allow us to choose a control function $\omega \in \Omega$ with a (CCIA) objective value smaller than $\frac{t_f - t_0}{2\sigma_{\max} + 4 - n_\omega}$. Furthermore, we argue that the optimal objective value of (CCIA) is smaller than that of (CIA) by at most $\frac{1}{2}\bar{\Delta}$ because the switching times of the optimal $\omega \in \Omega$ differ at most by one half of the maximum grid length from the optimal $w \in \Omega_N$. We close this section by summarizing these thoughts in the following conjecture.

Numerical experiments
We test the proposed algorithm with a benchmark example from the https://mintOC.de library [25], with a real-world adsorption cooling machine problem [7] and with generic data. We use the CIA decomposition in order to solve these problems, where we applied CasADi v3.4.5 [1] to parse the nonlinear program (NLP) with efficient derivative calculation to the solver Ipopt 3.12.3 [30]. We implemented the AMDR algorithm as an add-on to the open-source software package pycombina [7] and used its BNB solver for benchmarking. The BNB scheme is based on the idea of branching forward in time and exploits that an evaluation of the objective function up to the current grid point yields a valid lower bound that is extremely cheap to compute, see [15,27] for further details. We set the tolerance parameter of the AMDR algorithm to TOL = 0.0001. All computational experiments are executed on a workstation with 4 Intel i5-4210U CPUs (1.7 GHz) and 7.7 GB RAM.

Multimode MIOCP
We consider the following MIOCP, which is a modified version of the Egerstedt standard problem from https://mintOC.de:

(P1)

We deal with 3 different modes, i.e., n = 3. We use as initial values x 0 ∶= (0.5, 0.5) T. Furthermore, we add the TV constraints (3.1)-(3.2) to (P1), with varying maximum number of switches max. Fig. 2 illustrates the differential state and control trajectories for max = 20 and with relaxed binary controls as well as binary controls based on SUR, BNB and AMDR. We remark that the control function constructed by SUR uses 70 switches and is therefore infeasible with respect to max = 20. The relaxed control values lie strictly between zero and one around t ≈ 0.45 and for t ≥ 0.8, so that the corresponding approximated state trajectories of BNB and AMDR differ slightly from the relaxed one from t ≈ 0.45 onward. We set the BNB iteration limit to 5 ⋅ 10^6, so that it stopped after 15.3 s with (CIA) objective value 9.1 ⋅ 10^−3 and (P1) objective value Φ = 0.991855. The execution of AMDR took 0.2 s and resulted in the improved objective values 4.6 ⋅ 10^−3 and Φ = 0.991509, respectively, which can be explained by more uniformly distributed switches compared with the BNB solution.

Table 1 shows that, for small instances, e.g., N = 200, the BNB algorithm constructs better (CIA) objective values than AMDR if enough time is available. If the BNB scheme finds a good solution, it will usually do so after a few million iterations. While the values of AMDR are close to those of BNB for N = 200, they clearly outperform the latter for bigger instances. The run time of AMDR increases only slightly with grid refinement, from about 0.1 seconds to at most 0.6 seconds. A C++ implementation could further improve the run time, since we have so far used a prototype implementation in Python. It appears that selecting the next-forced control rather than the one with maximum value is beneficial as part of the AMDR algorithm and tends to yield the solution with the smallest (CIA) objective value.
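To illustrate why plain SUR can violate a switching limit, the following Python sketch implements the standard Sum-Up Rounding rule for one-hot (SOS1) controls and counts the resulting mode changes. The helper names `sum_up_rounding` and `count_switches`, the tie-breaking rule, and the convention that a switch is a change of the active mode between consecutive intervals are assumptions made for this example.

```python
def sum_up_rounding(a, dt):
    """Standard SOS1 Sum-Up Rounding (illustrative sketch).

    On each interval, activate the mode whose accumulated relaxed control
    exceeds its accumulated binary control the most.
    Returns the index of the active mode per interval.
    """
    n = len(a[0])
    deficit = [0.0] * n  # integral of (a_i - w_i) over the elapsed intervals
    modes = []
    for a_j, dt_j in zip(a, dt):
        candidate = [deficit[i] + a_j[i] * dt_j for i in range(n)]
        # Ties are broken in favor of the smallest index (an assumption).
        active = max(range(n), key=lambda i: candidate[i])
        modes.append(active)
        for i in range(n):
            deficit[i] = candidate[i] - (dt_j if i == active else 0.0)
    return modes


def count_switches(modes):
    """Number of changes of the active mode between consecutive intervals."""
    return sum(1 for prev, cur in zip(modes, modes[1:]) if prev != cur)


relaxed = [[0.5, 0.5]] * 6          # constant relaxed control on six intervals
rounded = sum_up_rounding(relaxed, [1.0] * 6)
print(rounded, count_switches(rounded))
```

For a constant relaxed value of 0.5 in both modes, this sketch alternates the active mode on every interval, so any switching limit below the number of intervals minus one is violated.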

Dualmode adsorption cooling machine problem
In [6,7], a complex renewable energy system in the form of a solar thermal climate system with nonlinear system behavior is introduced as an MIOCP. The system's core is an adsorption cooling machine, which can be switched on to intensify the cooling of the ambient temperature. The goal is to keep the room temperature in a comfort zone while minimizing the energy costs. We skip a detailed description of the system and refer to [7] instead, and we consider the relaxed binary control values as given, as illustrated in the left plot of Fig. 3. We assume two modes of the adsorption cooling machine, i.e., n = 2, and a time horizon of a whole day with control adjustment every four minutes, i.e., N = 360.

Table 1 Comparison of (CIA) objective values and run time of (P1) for different solving methods and varying max. AMDR corresponds to Algorithm 3, while AMDR-NF represents a modification in which the admissible and next-forced control is selected to be active in line 4 of Algorithm 2. By BNB we refer to pycombina's BNB algorithm with depth-first node selection strategy, where we set an iteration limit of 5 and 50 million nodes.

We use the AMDR scheme to calculate a candidate solution of the (CIA) problem depending on max, which is optimal by virtue of Theorem 2.2. (a). The right plot in Fig. 3 compares these optimal solutions with the (CIA) objective values of BNB solutions with increasing iteration limit. For both a small and a large number of allowed switches, the deviation of the BNB solutions is small. One explanation is the limited degree of freedom for a small max, so that the width of the BNB tree is very limited. With a large max, on the other hand, solutions with a small objective value can be found quickly, which allows many nodes to be pruned. The deviation from the optimal solution is particularly striking for medium-sized max.
For some instances, especially for 10 ≤ max ≤ 20, an increase of the iteration limit hardly leads to an improvement because the BNB algorithm seems to remain in a suboptimal branch. We also compare the optimal solution of (CIA) with the upper bound from Corollary 4 and see that the latter is between 200 and 600 percent larger.

Comparison of optima for (CIA) with upper bounds based on generic data
The two investigated MIOCPs showed a relatively large deviation of the optimal (CIA) objective value from the derived upper bounds. Therefore, we generated uniformly distributed random values a ∈ A N for N = 40 equidistant intervals and n = 2, 3 controls, and examined the ratio of these two values. We illustrate this comparison in Fig. 4, where we use the upper bound from Corollary 4 for n = 2 and the one from Conjecture 1 for n = 3. The objective values and the bounds decrease logarithmically as max increases, as expected. In contrast to the above MIOCPs, the (CIA) objective values come close to the upper bounds, particularly for small max, but a relevant gap remains for larger max. This gap may be further reduced by using a larger sample size; we considered here only 1000 (CIA) instances per max value. We also note that the values generated by the AMDR algorithm are very close to the optimal ones.
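The kind of experiment described above can be mimicked on a very small scale with the following Python sketch. It is not the procedure used for Fig. 4: the sampling by normalizing uniform random numbers, the brute-force enumeration (only viable for tiny N), and all function names are assumptions made for illustration.

```python
from itertools import product
import random


def random_relaxed_controls(N, n, rng):
    """Random relaxed controls whose rows sum to one (assumed sampling scheme)."""
    rows = []
    for _ in range(N):
        raw = [1.0 - rng.random() for _ in range(n)]  # values in (0, 1], never all zero
        total = sum(raw)
        rows.append([r / total for r in raw])
    return rows


def cia_value(a, modes, dt):
    """(CIA) objective of a mode sequence: largest accumulated deviation."""
    n = len(a[0])
    acc = [0.0] * n
    worst = 0.0
    for a_j, m, dt_j in zip(a, modes, dt):
        for i in range(n):
            acc[i] += (a_j[i] - (1.0 if i == m else 0.0)) * dt_j
        worst = max(worst, max(abs(v) for v in acc))
    return worst


def brute_force_cia(a, dt, max_switches):
    """Enumerate all mode sequences with at most max_switches mode changes."""
    n, N = len(a[0]), len(a)
    best = None
    for modes in product(range(n), repeat=N):
        switches = sum(p != c for p, c in zip(modes, modes[1:]))
        if switches <= max_switches:
            val = cia_value(a, modes, dt)
            if best is None or val < best[0]:
                best = (val, modes)
    return best


rng = random.Random(0)
a = random_relaxed_controls(8, 2, rng)
dt = [1.0 / 8] * 8
for s in range(4):
    print(s, brute_force_cia(a, dt, s)[0])
```

Since every sequence that is feasible for a given switching limit remains feasible for a larger one, the enumerated optimum cannot increase when more switches are allowed.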

Discussion
As expected from the polynomial run time complexity, our prototype implementation of AMDR constructs (CIA) feasible solutions very quickly. Their objective values mostly outperform the ones obtained by the BNB algorithm or are at least close to the latter for a problem with more than two binary controls. Consequently, the AMDR solution is itself a promising (CIA) feasible solution or a fast option to initialize the BNB with a competitive upper bound. As stated in Remark 6, the AMDR algorithm may also be used to include combinatorial constraints other than the TV constraints.
Regarding the comparison with the BNB method, we note the limitation that we used only the depth-first node selection strategy and could have tuned it further to achieve more competitive feasible solutions of (CIA). Besides, the BNB algorithm can include a variety of combinatorial conditions of the (CIA) problem, which is a general advantage.
We also note that our calculations mainly examine the (CIA) objective value because it correlates with the (MIOCP) objective value. With very similar or large (CIA) objective values, however, the smaller value may lead to a worse (MIOCP) objective value, and vice versa. There may be several binary control functions with the same (CIA) objective value but different (MIOCP) objective values. In some instances, we observed that the AMDR algorithm generates a control function with suboptimal (MIOCP) objective value since its switches are structurally delayed compared to the switches on bang-bang arcs of the relaxed binary values. In this case, we tested, as a heuristic, shifting the AMDR binary values backward in time by ⌊ ∕Δ⌋ intervals so that the control function is more similar to the relaxed binary values, which worked well.
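The shifting heuristic can be sketched in a few lines of Python. The function name `shift_backward`, the representation of the control as a list of active mode indices, and the choice to fill the freed final intervals with the last mode are assumptions; the number of intervals `k` is passed in as a plain parameter.

```python
def shift_backward(modes, k):
    """Shift a mode sequence k intervals backward in time (illustrative sketch).

    Every switch then occurs k intervals earlier. The last k intervals
    repeat the final mode, which is an assumed padding rule.
    """
    if k <= 0 or not modes:
        return list(modes)
    k = min(k, len(modes))
    return list(modes[k:]) + [modes[-1]] * k


print(shift_backward([0, 0, 0, 1, 1, 2], 2))
```

With this padding rule, the shifted sequence keeps its length and never has more mode changes than the original one, so a switching limit that held before the shift still holds afterwards.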

Conclusions
In this paper, we have devised a fast rounding method for the MIOCP with constrained TV of the integer control. Under certain assumptions, e.g., n = 2, the proposed algorithm constructs an optimal solution of the (CIA) subproblem. Based on this, we have proven bounds on the integrality gap of (CIA) for the constrained TV case. Our numerical results have shown that the quality of the computed control function in many cases exceeds that of the BNB solution obtained under an iteration limit. Owing to the very short run time, we recommend the proposed method especially for the mixed-integer model predictive control setting or for instances with a vast number of binary variables. In the future, this algorithmic proposal could be compared with a penalty alternating direction method [12] or extended to switching costs as in [3].