On Mixed-Integer Optimal Control with Constrained Total Variation of the Integer Control

The combinatorial integral approximation (CIA) decomposition suggests solving mixed-integer optimal control problems (MIOCPs) by solving one continuous nonlinear control problem and one mixed-integer linear program (MILP). Unrealistic frequent switching can be avoided by adding a constraint on the total variation to the MILP. Within this work, we present a fast heuristic way to solve this CIA problem and investigate in which situations optimality of the constructed feasible solution is guaranteed. In the second part of this article, we show tight bounds on the integrality gap between a relaxed continuous control trajectory and an integer feasible one in the case of two controls. Finally, we present numerical experiments to highlight the proposed algorithm’s advantages in terms of run time and solution quality.


Introduction
Mixed-integer optimal control has been established as a useful tool for modeling real-world problems [6,10,22]. In practice, an optimal control policy is only realistic if it avoids frequent switching between the system modes. However, how switching costs or a limit on the number of switches can be incorporated efficiently into the optimization problem remains an open research question.
In this article we follow a first-discretize-then-optimize approach because, in contrast to indirect methods and dynamic programming, it covers a more generic problem class for which efficient numerical methods are available, in particular the decomposition approach used in this article. In this approach, the control problem is discretized via, e.g., Direct Multiple Shooting [5] or Direct Collocation [29], which leads to a mixed-integer nonlinear program (MINLP). This problem class is NP-hard in general, so it has been proposed to reduce complexity by first solving the relaxed problem with dropped integrality constraint, which is a nonlinear program (NLP), and then approximating the relaxed controls with binary controls in a second step as part of an MILP. The second problem is usually referred to as the combinatorial integral approximation (CIA) problem [27], whereas the whole algorithm is called CIA decomposition [32]. It is common to use the fast Sum-Up Rounding (SUR) heuristic [24] to find a feasible approximate solution of the CIA problem, so that this second step is also called rounding. However, standard SUR does not consider time-coupled combinatorial constraints, which is why the use of the CIA problem is necessary in this case. The variable time transformation method [2,11,21] also avoids solving an MINLP by assuming a given sequence of the system modes, so that only their durations have to be computed as part of an NLP. Recently, De Marchi extended this approach by including switching costs with sparse optimization methods [9]. As part of the CIA decomposition, Sager [24], Kirches [16] and Rieck [20] proposed to add penalty terms to the control problem objective in order to account for switching costs or to reduce the number of switches between active controls. Recently, Kirches et al. [17] investigated this approach for the setting with implicit switches and where jumps of the differential state values may occur.
One issue of the penalization approach is the appropriate choice of the penalty factor since heavy penalization can in some instances [16] attract solutions involving frequent switching. Bestehorn et al. [3,4] presented an idea to incorporate switching costs into the problem by fixing a small control deviation tolerance in the rounding problem and minimizing switching costs subject to this deviation tolerance.
This work builds on [27], where it has been proposed to solve the CIA problem with constraints that limit the total variation (TV) of the integer control. By this, the original optimal control objective remains untouched, and we require only the number of switches to be less than a desired threshold. Our idea is to use generally a Branch & Bound (BNB) scheme [7] for solving this CIA problem, but, as we will show in this article, this can be replaced for certain instances by a sequence of rounding scheme evaluations. Furthermore, we evaluate upper bounds on the CIA objective subject to these discrete TV constraints. It has been shown that the integer approximation error, i.e., the difference of relaxed optimal control objective value and CIA rounded objective value, can be driven to zero under mild conditions and if the grid length is driven to zero [19,27]. This result does not hold anymore if discrete TV constraints are included. Still, we expect our approach to yield feasible solutions of the MIOCP with an a priori integer gap.

Contribution
To accelerate the CIA solving process, we propose the Maximum Dwell Rounding (MDR) scheme, which is a fast rounding heuristic. It is based on the idea of activating a chosen control mode as long as possible without violating a desired integrality gap $\theta$, i.e., the accumulated deviation of relaxed and binary controls, and then repeating this with the next promising mode. We apply it iteratively as part of the Adaptive Maximum Dwell Rounding (AMDR) algorithm for finding binary controls that satisfy time-coupled combinatorial constraints such as a TV bound, and we derive optimality conditions of the obtained binary control function with respect to the CIA problem. Based on this scheme, we prove the tightest possible upper bound on the integrality gap for equidistant discretization and the case of two binary controls; this bound is expressed in terms of the number of intervals $N$, the TV bound $\sigma_{\max}$ and the maximum grid length $\bar{\Delta}$. We also establish further bounds for non-equidistant grids and for more than two binary controls.

Outline
We give a problem definition of the MIOCP of interest in Section 2 and describe the proposed CIA decomposition algorithm with (CIA) as subproblem in Section 3. Next, we introduce auxiliary CIA problems and derive a lower bound for these problems in Section 4. We define the MDR scheme in Section 5 and show its usefulness with respect to solving the CIA problem subject to discrete TV constraints. We continue by analyzing the worst-case integrality gap for $n_\omega = 2$ in Section 6 and for $n_\omega > 2$ in Section 7. Finally, we present numerical experiments in Section 8 and conclusions in Section 9.
We write $[n] := \{1, \ldots, n\}$ for $n \in \mathbb{N}$. We use Gauss' bracket notation, i.e., $\lfloor x \rfloor := \max\{k \in \mathbb{Z} \mid k \le x\}$, $x \in \mathbb{R}$, and analogously for $\lceil x \rceil$. We denote by $\lceil x \rceil_{0.5}$ the rounding up of $x \in \mathbb{R}$ to the next multiple of 0.5: $\lceil x \rceil_{0.5} := \min\{y \mid y = n \cdot 0.5,\ n \in \mathbb{N},\ y \ge x\}$.
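As a small illustration of the rounding to the next multiple of 0.5 introduced in the notation above, the following Python sketch evaluates it for nonnegative arguments. The function name `ceil_half` is hypothetical and not part of the article.

```python
import math


def ceil_half(x: float) -> float:
    """Round x >= 0 up to the next multiple of 0.5 (illustrative helper)."""
    # Doubling maps multiples of 0.5 to integers, so an ordinary ceiling suffices.
    return math.ceil(2.0 * x) / 2.0


print(ceil_half(1.2))  # 1.5
print(ceil_half(1.5))  # 1.5
```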

Mixed-Integer Optimal Control Problem
MIOCPs can be equivalently reformulated into problems with affinely entering binary controls in the right-hand side of the ordinary differential equation via the (partial) outer convexification method, see [24] for further details. Therefore, we declare this reformulated MIOCP as the problem of interest and provide the corresponding definitions in this section. We consider problems on a given time horizon $T := [t_0, t_f] \subset \mathbb{R}$. Throughout this paper, we assume a problem involving $n_\omega \ge 2$ binary control realizations. We introduce the binary control function after defining the TV of a function and its associated space.
Definition 1 (Total Variation of a function and BV space) The TV of a function $\omega : T \to \mathbb{R}^{n_\omega}$ is defined to be the quantity
$TV(\omega) := \sup_{P \in \mathcal{P}} \frac{1}{2} \sum_{i=1}^{n_\omega} \sum_{j=0}^{n_P - 1} |\omega_i(t_{j+1}) - \omega_i(t_j)|$, (2.1)
where $P = (t_0, \ldots, t_{n_P})$ is a partition out of the set of all partitions $\mathcal{P}$ of the interval $T$ and $n_P$ denotes the partition specific number of time points. We group the functions $\omega$ with finite TV into the space of bounded variation $BV(T, \mathbb{R}^{n_\omega}) := \{\omega : T \to \mathbb{R}^{n_\omega} \mid TV(\omega) < \infty\}$.
The problem (MIOCP) reads
$\min_{x,\, \omega \in \Omega} \Phi(x(t_f))$ subject to
$\dot{x}(t) = f_0(x(t)) + \sum_{i \in [n_\omega]} \omega_i(t) f_i(x(t))$, for a.e. $t \in T$, (2.2)
$x(t_0) = x_0$, (2.3)
$TV(\omega) \le \sigma_{\max}$. (2.4)
We minimize a Mayer term functional $\Phi \in C^1(\mathbb{R}^{n_x}, \mathbb{R})$ over the binary controls $\omega_i$ and differential states $x$. Equation (2.2) expresses the dynamical system as a switched system in partial outer convexified form, i.e., as a sum of a drift term $f_0$ and control specific functions $f_i$, both out of $C^0(\mathbb{R}^{n_x}, \mathbb{R}^{n_x})$. We assume that there exists a solution $x$ for the above problem; for this we may assume that a uniform Lipschitz estimate on $f_0$ and $f_i$ exists so that the theorem by Picard-Lindelöf is applicable. We limit the number of switches between active modes to be at most $\sigma_{\max}$ in the TV constraint (2.4). Finally, we define (OCP) as the canonical relaxation of problem (MIOCP) where we optimize over $\alpha \in A$ instead of $\omega \in \Omega$.
Remark 1 When we write that control i is active, we indicate that ω i (t) = 1. In fact, we count the switches in (2.4), respectively in (2.1), twice since we sum up the control that has just been deactivated with the one that has just been activated. That explains the factor of one-half in (2.1). We remark that our study assumes no mode-specific switching limits, but we may also impose them by splitting up (2.4) into n ω inequalities and by dropping the first sum in (2.1).
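The double counting described in Remark 1 can be checked numerically. The sketch below assumes a piecewise constant binary control stored as a matrix (rows are controls, columns are grid intervals, exactly one active control per column); the function name `count_switches` is hypothetical. Every switch produces one jump in the deactivated row and one in the activated row, so halving the summed jumps yields the number of switches.

```python
def count_switches(w):
    """Half the summed absolute jumps between neighboring columns of w."""
    n_intervals = len(w[0])
    jumps = sum(
        abs(row[j + 1] - row[j])
        for row in w
        for j in range(n_intervals - 1)
    )
    # Each switch is seen twice (one control off, one control on).
    return jumps / 2


w = [[1, 1, 0, 1],
     [0, 0, 1, 0]]
print(count_switches(w))  # 2.0
```

Dropping the outer loop over rows (and the factor one-half) would give per-control jump counts, which is the mode-specific variant mentioned at the end of Remark 1.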
Without loss of generality, we omit further constraints and continuous control functions $u \in L^\infty(T, \mathbb{R}^{n_u})$ in our problem definition. See [28] for how to cope with these and further extensions.

Combinatorial Integral Approximation Decomposition
We propose to solve (MIOCP) with the CIA decomposition [27,32], which relies on the use of direct methods (first discretize, then optimize approach). This section explains and defines the problem's temporal discretization and the subproblems that constitute the decomposition algorithm.
Next, we define the matrix sets of the discretized binary and relaxed binary control functions Ω N , A N .
Definition 5 (Convex combination constraint (Conv) and $\Omega_N$, $A_N$) Let $N \in \mathbb{N}$. We express the requirement that the columns of a matrix $(m_{i,j}) \in [0,1]^{n_\omega \times N}$ sum up to one by
$\sum_{i \in [n_\omega]} m_{i,j} = 1$, for all $j \in [N]$,
and call it convex combination constraint (Conv) in the remainder. Based on this constraint, we define
$\Omega_N := \{w \in \{0,1\}^{n_\omega \times N} \mid w \text{ satisfies (Conv)}\}$, $A_N := \{a \in [0,1]^{n_\omega \times N} \mid a \text{ satisfies (Conv)}\}$.
We introduce a discretized version of the TV constraint (2.4) that relies on $w$:
Definition 1 (Discretized version of (2.4)) Let $G_N$ and $\sigma_{\max} \in \mathbb{N}$ be given. We use auxiliary variables $\sigma_{i,j} \in \mathbb{N}$ for introducing a discretized version of the TV constraint that reads
$\sigma_{\max} \ge \frac{1}{2} \sum_{i \in [n_\omega]} \sum_{j \in [N-1]} \sigma_{i,j}$, (3.1)
$\sigma_{i,j} \ge \pm (w_{i,j+1} - w_{i,j})$, for $i \in [n_\omega]$, $j \in [N-1]$. (3.2)
In order to solve the upcoming subproblems efficiently, we have deliberately formulated the above constraints without an absolute value term. This and replacing $w$ with $a$ in (3.2) results in a differentiable TV constraint. We define the discretizations of (MIOCP) and (OCP) below.
Definition 6 ((NLP rel), (NLP bin)) Consider (OCP) with the following modifications:
• We discretize (2.2) with $G_N$ by using direct collocation or direct multiple shooting together with an appropriate integrator function (e.g., Runge-Kutta methods [18,23]).
• The controls are piecewise constant functions on $G_N$ with $(a_{i,j}) \in A_N$.
• The TV constraint (2.4) is replaced by the constraints (3.1) and (3.2). In (3.2), we replace $w$ with $a$.
We denote the resulting discretized optimization problem with relaxed binary control functions $a \in A_N$ by (NLP rel) and by (NLP bin) for fixed binary control functions $w \in \Omega_N$.
We denote by $\theta(w)$ the (CIA) objective value for a feasible solution $w \in \Omega_N$.
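For orientation, the following Python sketch evaluates such an objective value. It assumes the commonly used form of the (CIA) objective, namely the largest absolute accumulated deviation $\max_{i,j} |\sum_{l \le j} (a_{i,l} - w_{i,l}) \Delta_l|$; the definition of (CIA) is not reproduced in this section, so this form, the function name `cia_objective` and the argument `dt` for the interval lengths are assumptions.

```python
from itertools import accumulate


def cia_objective(a, w, dt):
    """Largest absolute accumulated deviation between relaxed values a and
    binary values w, weighted by the interval lengths dt (assumed form)."""
    worst = 0.0
    for a_row, w_row in zip(a, w):
        # Running sum of (a - w) * dt over the intervals for one control.
        partial = accumulate((ai - wi) * d for ai, wi, d in zip(a_row, w_row, dt))
        worst = max(worst, max(abs(p) for p in partial))
    return worst


a = [[0.75, 0.75, 0.75, 0.75],
     [0.25, 0.25, 0.25, 0.25]]
w = [[1, 1, 0, 1],
     [0, 0, 1, 0]]
print(cia_objective(a, w, dt=[1.0] * 4))  # 0.5
```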
With these subproblem definitions we are able to summarize the CIA decomposition in Algorithm 1 (CIA decomposition algorithm for error-controlled solution of (MIOCP)). We first solve the relaxed problem (NLP rel) and approximate the resulting relaxed binary controls with binary values in the (CIA) problem. The last step consists of evaluating (NLP bin) with a fixed binary control function $w^{\mathrm{CIA}}$ in order to obtain the objective value of (MIOCP). We remark that the TV constraints (3.1)-(3.2) in (NLP rel) and (NLP bin) may be replaced by other TV reformulations such as the ones presented in [17] or even dropped because (CIA) guarantees feasibility with respect to bounded TV in any case. The algorithmic focus of this article lies on the (CIA) step, so that we omit further considerations of TV reformulations.
We stress that this algorithm solves two problems, each of which is less hard to solve than the original MINLP, i.e., the discretized (MIOCP). It yields only an approximation of the optimal solution of (MIOCP), but Sager et al. [26] showed that, without TV constraints and under certain regularity assumptions, the difference between the differential states based on relaxed and binary control values depends linearly on the difference of the integrals of their corresponding control functions. In particular, they proved that the so-called integrality gap with feasible solutions constructed by (CIA) is linearly bounded by the maximum grid length $\bar{\Delta}$ in the sense of $\theta(w^{\mathrm{CIA}}) \le C(n_\omega)\, \bar{\Delta}$, where $C(n_\omega)$ is a constant depending on the number of controls. This implies that the differential state trajectories of (MIOCP) and (OCP), both without the TV constraint, are arbitrarily close with vanishing grid length $\bar{\Delta}$ and, by assumed Lipschitz continuity of the objective, the same also holds for the objective values. In Sections 6-7, we are going to show that in the presence of discrete TV constraints (3.1)-(3.2) the rounding error does not vanish in general with grid length going to zero. More precisely, Theorem 4 and Corollary 7 will present concrete results on the rounding error.

(CIA−θ ), (CIA−θ −init) and an Associated Lower Bound
In this section, we address a problem that minimizes the used switches subject to a given approximation error θ that shall not be exceeded by the accumulated control deviation. Afterward, we aim for a lower bound on its objective that will be useful in the next section and introduce useful auxiliary variables and definitions for this.
Taking the fixed initial active control in (4.3) aside, the problems (CIA−θ) and (CIA) from Definition 7 are closely connected with each other because the TV constraints (3.1) and (3.2) are reinterpreted as the objective function subject to a fixed approximation error $\bar{\theta}$. This justifies the naming. We will introduce the MDR algorithm in Section 5 to (heuristically) solve (CIA−θ−init). By applying this algorithm to all $i \in [n_\omega]$ as initial active controls, we exploit this relationship to solve (CIA−θ) as well, which will then be used as part of a bisection algorithm to solve (CIA). Fixing the initial active control $i_0$ may seem odd, but it reduces the problem complexity, which later yields, in Theorem 1 of Section 5, an optimality result for the solution constructed by the MDR algorithm with respect to (CIA−θ−init).
We notice that (CIA−θ −init) is very similar to (SCARP) from [3,4]. The latter problem aims at minimizing the switching costs, representing a generalized objective function of (CIA−θ −init), whereas in (CIA−θ −init) the initial active control is fixed.
Remark 2 (Link to scheduling theory) On an equidistant grid, (CIA−θ ) can be reformulated into the following, equivalent, scheduling problem: On a single machine, minimize the total setup costs (TSC) until the Nth processed job, N ≤ n, so that n jobs ( f , k) are processed within f ∈ [n ω ] job families subject to release times r f ,k , deadlines d f ,k , equal processing times∆ and sequence independent setup costs, which can be summarized in scheduling notation [13] as In the following we will revert to scheduling-like concepts, but explicitly dispense with its notation to not distract the reader from the usual MIOCP notation.
Next, we need some definitions to derive a lower bound for (CIA−θ −init) at the end of this section. We stress that we establish our results on an equidistant grid but will sometimes drop this assumption in definitions used in later sections.
Definition 9 (Activations, release r i,k and deadline intervals d i,k ) For each control i ∈ [n ω ] on an equidistant grid G N , we introduce the number of possible activations n i as Each activation k ∈ [n i ] is associated with a release and deadline interval, which are defined by: Finally, we call the kth activation of control i necessary, if d i,k < ∞.
Definition 10 (Switch, activation block) Consider w ∈ Ω N . If we have on interval j ≥ 2 and for any i ∈ [n ω ] w i, j−1 = 0, w i, j = 1, then we say w switches on j. We introduce the set of switches S := { j ∈ {2, . . . , N} | w switches on j} and set n s := |S |. We denote by τ j ∈ [N] the corresponding interval of the jth switch of w, where we set τ 0 := 0, τ n s +1 := N. On an equidistant grid and if i ∈ [n ω ] is active between two consecutive switches or one switch and the first/last interval, we define the set of activations of i between these switches as an activation block B ⊂ [n i ]. On a general grid, we further define the length of the jth activation block between the ( j − 1)st switch on, i.e. τ j−1 , and before the jth switch, i.e. τ j − 1, via the auxiliary variable δ j = ∑ We notice that the switches actually occur on the grid points; however, we have indexed the variables w i, j according to the intervals, and therefore, for simplicity, we refer to switches on intervals. In the following, we will sometimes abbreviate activation block with block. In order to keep the number of used switches small and when deciding to set up a new block, it is highly relevant to know how many activations could be at most included in this block beginning with activation k. An activation j > k cannot be included in the block if its release interval begins later than the deadline interval of activation k plus the number of activations between k and j. We give a definition that formalizes these deadlines for initial activation-dependent deadlines of blocks. Based on these block deadlines, it is straightforward to introduce the notion of a block deadline feasible partition of activations into blocks. The constraint (4.3) imposes that the control i 0 's first activation has to be executed on the first interval, for which we introduce the definition of fixed initial active control feasibility.
Definition 11 (db i,k , block deadline and fiac feasible partition) Consider an equidistant grid. The deadline of a block for i ∈ [n ω ] that begins with the kth activation, k ∈ [n i ], is defined by (4.7) Let P i denote a partition of all activations [n i ] for i ∈ [n ω ]. We call P i block deadline feasible if for all subsets B ∈ P i , i.e., all blocks, hold: Furthermore, we refer to P i as a fixed initial active control (fiac) feasible partition if for all k ∈ B 1 hold where B 1 ∈ P i denotes the first activation block of P i .
In the last definition, we provided the concept of a control specific partition of all activations. The $k$th activation of control $i \in [n_\omega]$ does generally not coincide with the $k$th interval. The following example illustrates the introduced concepts and, in particular, that there may be in total more possible but fewer necessary activations than intervals $N$.
As illustrated in Example 1, a feasible solution w of (CIA−θ −init) may not use all possible activations. To this end, we define an extension of the set of blocks of w to become a partition of [n i ] for all i ∈ [n ω ] in the following lemma. The extension may seem arbitrary, but is necessary to compare any w ∈ Ω N with partitions of [n i ]. Thereby, we establish a connection between the above feasibility concepts and a feasible solution w of (CIA−θ −init).

Lemma 1
For an equidistant grid, let w ∈ Ω N be feasible for (CIA−θ −init) and let P i denote the set of blocks of w for control i ∈ [n ω ]. We define P i := {k ∈ [n i ] | B ∈ P i : k ∈ B} and P i := P i k∈P i {k}. Then, P i is a block deadline feasible partition and if i = i 0 , P i is also fiac feasible.
Proof We first note that $P_i$ is by definition a partition of $[n_i]$. We need to prove that these partitions are block deadline feasible, respectively fiac feasible. If for $i \in [n_\omega]$ and an activation block $B \in P_i$ the inequality $r_{i,\max\{k \in B\}} > d_{i,\min\{k \in B\}} + |B| - 1$ held, this would imply that the $(\max\{k \in B\})$th activation of $i$ has been processed before its release interval because $B$ cannot be interrupted by activations from other controls. Therefore, the above inequality does not hold and block deadline feasibility is established. We apply the same argument for confirming fiac feasibility. By constraint (4.3), the first activation of $i_0$ is scheduled on the first interval. Hence, all activations $k \in B_1$ of the first block $B_1$ must be processed on the $k$th interval and therefore require a release interval that is no later than $k$.
Remark 3 (Necessary condition for feasibility of (CIA−θ−init)) The formation of activations into block deadline and, for $i_0$, fiac feasible partitions is a necessary feasibility criterion of $w \in \Omega_N$ for (CIA−θ−init) by virtue of Lemma 1. Nevertheless, it is not a sufficient criterion since the order of the processing of blocks is not clarified. In particular, one might order the blocks to contain an activation whose release interval is later than its executed interval.
Next, we formalize specific partitions of the control i's possible activations n i whose blocks are constructed to include as many activations as possible without violating their block deadlines. These quantities serve as tools to derive a lower bound of necessary blocks per control independent of the other control's blocks. This will result in a lower bound for (CIA−θ −init) in Proposition 2. We distinguish between the case that i is the fixed initial active control, i.e., i = i 0 , or not.
Definition 12 (P i,min , P init i,min ) Consider an equidistant grid and the controls i 0 , i ∈ [n ω ]. Let We write (·) (init) i to indicate that equations or inequalities each apply to the parameters (·) i and (·) init i 0 . We define the blocks B (init) i,l recursively for l ≥ 2 and while k (init) l < n i by and P (init) i,min the partitions of [n i ] constructed by the latter: The MDR scheme from the next section creates switches that resemble the above k (init) l terms. The latter, though, only expresses the grouping of activations, while the switches explicitly specify the corresponding intervals as well. It turns out that the partitions P i,min and P init i 0 ,min are minimal in the number of blocks as indicated in the following proposition.
Proposition 1 For i 0 , i ∈ [n ω ], let the partitions P i,min , P init i 0 ,min be given as in Definition 12. For any partition P i of [n i ], with i = i 0 included, we define its restriction to the firstñ i ≤ n i activations as Then, the partition P i,min , respectively P init i 0 ,min , consists for anyñ i ≤ n i of a minimal number of blocks on the firstñ i activations compared with all other block deadline feasible, respectively both block deadline and fiac feasible, partitions P i : Proof We consider first P i,min ñ i . It is block deadline feasible because the deadline of the last activation for each block is defined in (4.8) and (4.10) to be less or equal than the corresponding block deadline. Assume there is a block deadline feasible partition P i for the control i ∈ [n ω ] with P i |ñ i < P i,min ñ i . In other words, there exists a subset of the first j blocks of P i |ñ i that includes more activations than the ones included into the first j blocks of P i,min ñ i . We consider the minimal number of blocks j with this property: The block index j is unique since the association of activations to blocks is monotonically increasing, meaning that there are no k 1 th, k 2 th activations, k 1 < k 2 , with k 1 ∈ B i,l 1 , k 2 ∈ B i,l 2 and l 1 > l 2 . We conclude so that block B i, j 's first activation k is smaller or equal than k which marks the earliest activation of B i, j . The definition of release intervals (4.5) implies r i,k ≤ r i,k for k ≤ k. Similarly, the definition of block deadlines (4.7) implies db i,k ≤ db i,k for r i,k ≤ r i,k and we find with (4.13) in particular On the other hand, the definition of P i,min in (4.10) implies Then, the definition of j yields where the last inequality must hold due to the assumption of P i being block deadline feasible. 
Inequality (4.14) contradicts inequality (4.16), or equivalently there is no such partition P i and P i,min uses indeed a minimal number of blocks on any The same argumentation for j ≥ 2 in equation (4.12) can be applied in order to prove the result for P init i 0 ,min as P i 0 is also assumed to be block deadline feasible in this case and the same holds for P init i 0 ,min from the second block on. We just need to take care of the case when j = 1, i.e., if B i 0 , j , respectively B i 0 , j , is the first block of the control i 0 . Here, max{k ∈ B i 0 ,1 } < max{k ∈ B i 0 ,1 } cannot appear, since P i 0 is assumed to be fiac feasible and the construction of the first block of P init i 0 ,min implies that no further activation can be added to B i 0 ,1 without violating fiac feasibility. Thus, j = 1 is impossible in (4.12) and P init i 0 ,min is also minimal in the number of blocks.

Corollary 1 Consider the setting of Proposition 1 and the controls i
There is no block deadline feasible partition, respectively block deadline and fiac feasible partition, that uses less than nb N i,min blocks on As a final result for this section, we establish a lower bound for (CIA−θ −init) that will be useful in Theorem 1.
Proposition 2 (Lower bound for (CIA−θ −init)) Let σ be the objective of (CIA−θ −init) with equidistant discretization and i 0 the fixed initial active control as defined in Definition 8. Let nb N i,min for all i = i 0 and nb N,init i 0 ,min be given as in Corollary 1. It results Proof By virtue of Lemma 1, a feasible solution of (CIA−θ −init) satisfies the necessary condition of generating only block deadline feasible partitions P i and if i = i 0 , the activation partition P i is also fiac feasible. Moreover, all activations are executed no later than their deadline interval. That holds particularly for those that are due no later than N. Hence, we can apply Corollary 1 and conclude the minimum number of blocks of a feasible solution until the in total Nth activation is nb N i,min , respectively nb N,init i 0 ,min . Finally, we obtain the claim (4.17) by summing up over all controls and using that the setup of the first block does not count as switch.

Maximum Dwell Rounding
This section is dedicated to solving (CIA). Generally, we recommend a tailored BNB algorithm that has been proposed by Jung et al. [15,27] and implemented in the open-source software package pycombina [7]. The BNB algorithm outperforms standard MILP solvers in a case study [14] by three orders of magnitude. However, in some instances, the algorithm struggles to find the optimal solution quickly because the node relaxation can be quite weak [8]. We, therefore, present a polynomial-time algorithm that constructs good initial guesses for BNB and, in some situations, solves (CIA) even to optimality. We proceed by giving the necessary definitions of the algorithm itself and its auxiliary variables in the first subsection and investigate beneficial properties in the second subsection.

Definition of the algorithm
Definition 13 (Accumulated control deviation $\theta_{i,j}$, $\gamma_{i,j}$) Let $a \in A_N$ and $w \in \Omega_N$. For control $i \in [n_\omega]$ and interval $j \in [N]$ we define the accumulated control deviation variables as
$\theta_{i,j} := \sum_{l \in [j]} (a_{i,l} - w_{i,l}) \Delta_l$, $\gamma_{i,j} := \theta_{i,j-1} + a_{i,j} \Delta_j$,
and set $\theta_{i,0} := 0$.
The following lemma is useful for Proposition 3 and Lemma 7.
Proof These equations follow directly from the definition of θ and γ as well as from the convexity property of a and w.
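The identities this proof refers to are not reproduced above. A plausible reading, sketched here under stated assumptions, is that the deviation variables sum to fixed values over the controls. The derivation assumes $\theta_{i,j} = \sum_{l \le j} (a_{i,l} - w_{i,l}) \Delta_l$, $\gamma_{i,j} = \theta_{i,j-1} + a_{i,j} \Delta_j$, and that the columns of $a$ and $w$ each sum to one (Conv):

```latex
\begin{align*}
\sum_{i \in [n_\omega]} \theta_{i,j}
  &= \sum_{l=1}^{j} \Delta_l \Bigl( \sum_{i \in [n_\omega]} a_{i,l}
     - \sum_{i \in [n_\omega]} w_{i,l} \Bigr)
   = \sum_{l=1}^{j} \Delta_l \, (1 - 1) = 0, \\
\sum_{i \in [n_\omega]} \gamma_{i,j}
  &= \sum_{i \in [n_\omega]} \theta_{i,j-1}
     + \Delta_j \sum_{i \in [n_\omega]} a_{i,j}
   = 0 + \Delta_j = \Delta_j .
\end{align*}
```

For two controls, the second identity gives $\gamma_{i_2,j} = \Delta_j - \gamma_{i_1,j}$, which is the kind of relation a proof for $n_\omega = 2$ can exploit.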
Definition 14 (Inadmissible, next forced and forced activation) Consider a rounding threshold $\bar{\theta} > 0$ and $a \in A_N$. Let the values of $w \in \Omega_N$ be given until interval $j - 1$. We call a control $i \in [n_\omega]$ on interval $j$ inadmissible if $\gamma_{i,j} - \Delta_j < -\bar{\theta}$, and forced if $\gamma_{i,j} > \bar{\theta}$. For each control $i$, consider the next interval on which $i$ would become forced without activation after interval $j - 1$. Then, we define a control $i \in [n_\omega]$ on interval $j$ to be next forced if and only if this interval is the earliest among all controls.
The above definition allows more than one control to be next forced or forced for an arbitrary interval $j \in [N]$. This is supposedly not the standard case in our discussion but will also be taken into account in our considerations. The guiding idea behind the above control activations is that we include more and more summands of $w$ into the computation of $\theta$ and can choose the next row of $w$ accordingly. With this definition we have introduced necessary activation properties of feasible solutions for (CIA−θ−init), but neglected so far the fixed initial active control constraint (4.3). The following definition fills this gap.
Definition 15 (Initially admissible control) We define a control $i \in [n_\omega]$ to be initially admissible if it is admissible on the first interval and if there is no other control $i_1 \ne i$ that is forced on the first interval.
Now, we can define the Maximum Dwell Rounding (MDR) in Algorithm 2. The MDR algorithm assumes a given initial control $i_0$ and activates it until it becomes inadmissible or until there is another forced control. We require the control $i_0$ to be initially admissible because otherwise $w^{\mathrm{MDR}}$ would violate the control accumulation constraint (4.2). After that, the control $i$ with the maximum forward control deviation $\gamma_{i,j}$ is set active and remains so until it becomes inadmissible or another control becomes forced. This procedure is performed forward in time until the last interval $N$ is reached. We named the algorithm "maximum dwell rounding" because it tries to stay in the current mode as long as possible without violating the given rounding threshold.
The Adaptive Maximum Dwell Rounding (AMDR) is defined in Algorithm 3 and can be described as a bisection method. We initialize it with a trivial lower bound LB and upper bound UB for (CIA). The algorithm runs MDR iteratively with different thresholds $\bar{\theta}$ and initially admissible controls as long as the difference of lower and upper bound is larger than the chosen tolerance TOL (lines 2-5). If the computed control function satisfies the TV constraint and exhibits a (CIA) objective value that is smaller than the current UB, we update UB, reset the rounding threshold $\bar{\theta}$ via interval halving of UB − LB and save the current best solution (lines 6-10). The evaluation $\theta(w)$ is necessary since MDR may construct a control function with a rounding gap larger than the desired gap $\bar{\theta}$, as will be discussed in the next subsection. If no computed control function $w$ with given initial control and $\bar{\theta}$ fulfills the TV constraint, then we increase the LB (lines 11-14).
Algorithm 3 (excerpt, lines 4-6):
4 if $i$ is an initially admissible control then
5 Run MDR with $i$ as initial control and set $w = w^{\mathrm{MDR}(i)}$;
6 if $w$ satisfies TV constraints (3.1), (3.2) and $\theta(w) < UB$ then
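The prose description of MDR can be turned into a short Python sketch. This is not Algorithm 2 itself, which is not reproduced here, but an illustration under assumptions: the forward deviation is taken as $\gamma_{i,j} = \theta_{i,j-1} + a_{i,j} \Delta_j$, a control counts as forced if $\gamma_{i,j} > \bar{\theta}$ and as inadmissible if $\gamma_{i,j} - \Delta_j < -\bar{\theta}$. The names `mdr`, `dt` and `theta_bar` are hypothetical, and the initial control is assumed to be initially admissible.

```python
def mdr(a, dt, theta_bar, i0):
    """Sketch of Maximum Dwell Rounding following the prose description.

    a: relaxed controls (rows: controls, columns: intervals, columns sum to 1)
    dt: interval lengths, theta_bar: rounding threshold, i0: initial control.
    """
    n_controls, n_intervals = len(a), len(dt)
    w = [[0] * n_intervals for _ in range(n_controls)]
    theta = [0.0] * n_controls  # accumulated deviation up to the previous interval
    active = i0
    for j in range(n_intervals):
        # Forward deviation if a control were NOT activated on interval j.
        gamma = [theta[i] + a[i][j] * dt[j] for i in range(n_controls)]
        if j > 0:
            inadmissible = gamma[active] - dt[j] < -theta_bar
            other_forced = any(
                gamma[i] > theta_bar for i in range(n_controls) if i != active
            )
            if inadmissible or other_forced:
                # Switch to the control with the largest forward deviation.
                active = max(range(n_controls), key=lambda i: gamma[i])
        w[active][j] = 1
        theta = [gamma[i] - w[i][j] * dt[j] for i in range(n_controls)]
    return w


a = [[0.75, 0.75, 0.75, 0.75],
     [0.25, 0.25, 0.25, 0.25]]
w_mdr = mdr(a, dt=[1.0] * 4, theta_bar=0.625, i0=0)
print(w_mdr)  # [[1, 1, 0, 1], [0, 0, 1, 0]]
```

In this toy instance the first control stays active for two intervals, becomes inadmissible on the third, and is reactivated on the fourth, so two switches are used and the accumulated deviation never exceeds 0.5.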
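The bisection logic described for AMDR can be sketched as follows. This is an illustration of the prose, not Algorithm 3 itself: the callables `run_mdr`, `objective` and `num_switches` are hypothetical stand-ins supplied by the caller, `run_mdr` is assumed to return `None` when the control is not initially admissible, and a candidate that satisfies the TV bound without improving UB is treated like a failure so that the interval always shrinks.

```python
def amdr(run_mdr, objective, num_switches, controls, sigma_max, ub,
         lb=0.0, tol=1e-3, max_iter=100):
    """Bisection on the rounding threshold theta_bar (illustrative sketch)."""
    best = None
    theta_bar = lb + (ub - lb) / 2
    for _ in range(max_iter):
        if ub - lb <= tol:
            break
        improved = False
        for i in controls:
            w = run_mdr(i, theta_bar)
            if w is None:  # control i is not initially admissible
                continue
            if num_switches(w) <= sigma_max and objective(w) < ub:
                ub, best, improved = objective(w), w, True
        if not improved:
            lb = theta_bar  # no improving TV-feasible candidate at this threshold
        theta_bar = lb + (ub - lb) / 2  # interval halving
    return best, lb, ub


# Toy stand-ins: a "control function" is represented by its threshold only,
# and thresholds below 0.3 are assumed to need too many switches.
best, lb, ub = amdr(
    run_mdr=lambda i, theta_bar: theta_bar,
    objective=lambda w: w,
    num_switches=lambda w: 0 if w >= 0.3 else 5,
    controls=[0, 1],
    sigma_max=2,
    ub=1.0,
)
```

In the toy run, the returned upper bound approaches the smallest threshold that still respects the switch limit, up to the tolerance.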

Solution quality and properties of MDR (Algorithm 2)
Although the MDR algorithm may seem simple, it generates optimal solutions w for (CIA−θ −init) under certain conditions, for which we need the following definition.
Definition 16 (Canonical switch) We define a switch $j \in S$ as defined in Definition 10 to be canonical if on interval $\tau_j$ the following holds: exactly one control $i_1$ is inadmissible and exactly one control $i_2 \ne i_1$ is forced.
We build our theoretical results of this section mainly on the following assumption.
Assumption 1 (MDR uses only canonical switches) Suppose w MDR ∈ Ω N has been generated by MDR. We assume that all switches of w MDR are canonical.

Properties of the MDR algorithm
Assumption 1 may seem restrictive, but it is automatically satisfied under certain conditions.
Proposition 3 (MDR with $n_\omega = 2$ and $\bar{\theta} \ge \frac{1}{2}\bar{\Delta}$ uses canonical switches) Consider $n_\omega = 2$, $a \in A_N$ and any grid $G_N$. If we choose $\bar{\theta} \ge \frac{1}{2}\bar{\Delta}$, then the control function $w^{\mathrm{MDR}}$ constructed by the MDR scheme uses only canonical switches.
Proof We have to prove:
1. If control $i_1$ is forced on interval $j \ge 2$, then it is admissible.
2. For all intervals $j \ge 2$: control $i_1$ is inadmissible if and only if $i_2 \ne i_1$ is forced.
Statement 1 follows from the definition of forced activation and from $\bar{\theta} \ge \frac{1}{2}\bar{\Delta}$. For proving statement 2, let us assume $i_1$ is forced on $j \in [N]$, i.e., $\gamma_{i_1,j} > \bar{\theta}$. By virtue of Lemma 2 for $\gamma_{i_2,j}$, we derive that $i_2$ is inadmissible on $j$; the converse direction follows analogously.
Remark 4 Assumption 1 is not necessarily true for a control problem that involves more than two binary controls. It may, however, hold for special cases of such a problem. For instance, if the relaxed values are of bang-bang type, i.e., $a_{i,j} \in \{0, 1\}$, and $\bar{\theta}$ is chosen smaller than the smallest activation block, then the situation resembles the case $n_\omega = 2$ and Assumption 1 may hold (without proof). On the other hand, Example 3 is going to demonstrate that this assumption can indeed be quite restrictive.
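Returning to the proof of Proposition 3, its two steps can be written out explicitly. The following is a sketch under assumptions that are consistent with the later use of these notions: "forced" means $\gamma_{i,j} > \bar{\theta}$, "inadmissible" means $\gamma_{i,j} - \Delta_j < -\bar{\theta}$, every interval length satisfies $\Delta_j \le \bar{\Delta}$, and for two controls $\gamma_{i_1,j} + \gamma_{i_2,j} = \Delta_j$.

```latex
\begin{align*}
\text{1.}\quad & \gamma_{i_1,j} > \bar{\theta}
  \;\Longrightarrow\;
  \gamma_{i_1,j} - \Delta_j > \bar{\theta} - \Delta_j
  \ge \bar{\theta} - \bar{\Delta}
  \ge \bar{\theta} - 2\bar{\theta} = -\bar{\theta}, \\
\text{2.}\quad & \gamma_{i_1,j} > \bar{\theta}
  \;\Longleftrightarrow\;
  \gamma_{i_2,j} - \Delta_j = -\gamma_{i_1,j} < -\bar{\theta} .
\end{align*}
```

Step 1 shows that a forced control is admissible when $\bar{\theta} \ge \frac{1}{2}\bar{\Delta}$, and step 2 shows that one control is forced exactly when the other is inadmissible.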
Assumption 1 allows us to prove strong properties of control functions obtained by MDR and AMDR. The first result states that the MDR scheme indeed produces control functions whose (CIA) objective value is less than or equal to $\bar{\theta}$.
Lemma 3 (MDR solution satisfies the $\bar{\theta}$ bound) Let Assumption 1 hold and let $w^{MDR} \in \Omega_N$ be constructed by MDR with given threshold $\bar{\theta}$. Then, we obtain $\theta(w^{MDR}) \leq \bar{\theta}$.
Proof As soon as the activated control becomes inadmissible or there is a forced control on interval $j \geq 2$, $w^{MDR}$ has a switch by the definition of MDR. By Assumption 1, the newly activated control is both forced and admissible, hence $\theta_{i,j} \geq -\bar{\theta}$, and there is also no other forced control on $j$, thus $\theta_{i,j} \leq \bar{\theta}$.
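To make the switching rule used in this proof concrete, the following sketch implements a dwell-type rounding under explicit assumptions. Algorithm 2 is not restated in this section, so the rules are inferred from how the proofs here use the terms: a control is taken to be forced on interval $j$ if $\gamma_{i,j} = \theta_{i,j-1} + a_{i,j}\Delta_j > \bar{\theta}$, and admissible if $\gamma_{i,j} - \Delta_j \geq -\bar{\theta}$. The function names `mdr_sketch` and `cia_objective` and the tie-breaking are hypothetical.

```python
from typing import List, Sequence


def mdr_sketch(a: Sequence[Sequence[float]], deltas: Sequence[float],
               theta_bar: float, initial: int) -> List[int]:
    """Return the active control index per interval (assumed MDR rules)."""
    n_controls = len(a)
    theta = [0.0] * n_controls       # accumulated deviation theta_{i,j-1}
    active = initial
    schedule = []
    for j, dt in enumerate(deltas):
        # forward control deviation gamma_{i,j} = theta_{i,j-1} + a_{i,j} * dt
        gamma = [theta[i] + a[i][j] * dt for i in range(n_controls)]
        if j > 0:                    # the initial control is kept on interval 1
            inadmissible = gamma[active] - dt < -theta_bar
            other_forced = any(gamma[i] > theta_bar
                               for i in range(n_controls) if i != active)
            if inadmissible or other_forced:
                # assumption: switch to the other control with largest gamma
                active = max((i for i in range(n_controls) if i != active),
                             key=lambda i: gamma[i])
        schedule.append(active)
        theta = [gamma[i] - (dt if i == active else 0.0)
                 for i in range(n_controls)]
    return schedule


def cia_objective(a: Sequence[Sequence[float]], deltas: Sequence[float],
                  schedule: Sequence[int]) -> float:
    """theta(w) = max over i, j of |sum_{l<=j} (a_{i,l} - w_{i,l}) * Delta_l|."""
    theta = [0.0] * len(a)
    worst = 0.0
    for j, active in enumerate(schedule):
        for i in range(len(a)):
            w_ij = 1.0 if i == active else 0.0
            theta[i] += (a[i][j] - w_ij) * deltas[j]
            worst = max(worst, abs(theta[i]))
    return worst
```

For two controls with constant relaxed values 0.6 and 0.4 on four unit intervals and $\bar{\theta} = 0.7$, this sketch only performs switches where the active control is inadmissible and the other one is forced, and the resulting objective stays below the threshold, in line with Lemma 3.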
The following example demonstrates that $\theta(w^{MDR}) > \bar{\theta}$ may occur when Assumption 1 does not hold.
Example 2 Consider an equidistant discretization and $a \in A_N$ with the first values given as $a_{1,1} = 1$, $a_{2,1} = 0$, $a_{1,2} = 0.5$, $a_{2,2} = 0.5$. For this relaxed value, let $w^{MDR} \in \Omega_N$ be the corresponding binary control function computed by MDR with given threshold $\bar{\theta} = 0.4\bar{\Delta}$ and initial control $i = 1$. Then, $w^{MDR}_{1,2} = 0$, $w^{MDR}_{2,2} = 1$ holds since the second control becomes forced on the second interval. At the same time, control $i = 2$ is inadmissible on the second interval, hence Assumption 1 is violated, and we obtain $\theta_{2,2} = (a_{2,1} - w^{MDR}_{2,1})\bar{\Delta} + (a_{2,2} - w^{MDR}_{2,2})\bar{\Delta} = -0.5\bar{\Delta} < -\bar{\theta}$.
We reuse concepts from the previous chapter, especially activations and their grouping into blocks.
Theorem 1 (Least switches property of MDR) Let Assumption 1 hold. For given a ∈ A N and an equidistant grid, let w MDR be constructed by MDR with i as initial control and anyθ > 0, where we assume that i is initially admissible. Let σ (w MDR ) denote the number of switches used by w MDR . Then, for the optimal objective value σ of (CIA−θ −init) with i 0 = i as initial control holds Proof We can conclude w MDR is a feasible solution of (CIA−θ −init) by Lemma 3. Combining this with Proposition 2 yields For proving optimality we note that w MDR constructs partitions of the activations [n i ], i ∈ [n ω ] that are due no later than N and let P MDR i denote these partitions. With the notation from Corollary 1 we want to show that these partitions coincide with the partitions constructed in Definition 12 and B init 1 ∈ P init i 0 ,min . By Assumption 1, the MDR algorithm activates i 0 until it becomes inadmissible on the interval τ 1 of the first switch: We compare this inequality with the Definition 9 of release intervals and notice that either the next activation τ 1 of control i 0 has a release interval that is later than τ 1 or there is no further possible activation. So, if the τ 1 th activation exists, then its release interval has not yet been reached: By the definition of P init i 0 ,min we conclude B 1 = B init 1 . Let us now consider the jth blocks B j ∈ P MDR Again by the definition of MDR and Assumption 1, i 0 is forced on interval τ j , which is equivalent to d i 0 ,min{k∈B j } = τ j . The MDR scheme activates i 0 either until N (then trivially B j = B init j ) or until it becomes inadmissible on interval τ j+1 (by Assumption 1). With the argumentation for j = 1, inadmissible means hereby the (max{k ∈ B j } + 1)th activation has a release interval greater than τ j+1 . 
Using that the block's first activation τ j is processed on its deadline interval d i 0 ,min{k∈B j } , this yields
The above inequality expresses that B j contains as many activations as possible without violating its block deadline db i 0 ,min{k∈B j } and, by construction of P init i 0 ,min , this is equivalent to B j = B init j . This settles the case i = i 0 in (5.3). We can reuse the above arguments about forced and inadmissible activation for j ≥ 2 in order to analogously prove the case i ≠ i 0 in (5.3).
Remark 5 Theorem 1 is predicated on the assumption of an equidistant grid. We stress that after grid refinement of the optimal control problem, i.e., after several rounds of applying the CIA decomposition, this might be a restriction.
The following corollary establishes a way to find the optimum of (CIA−θ −init) in the setting of Theorem 1.
Corollary 2 (Using MDR to find a control function with minimum number of switches) Consider the setting of Theorem 1. A control function $w^*$ that uses a minimum number of switches, i.e., $\sigma(w^*) = \sigma^*$, can be found by running MDR.
Proof Let $i$ be the initial control of $w^*$. Execute MDR with $i$ as initial control; the result then follows directly from Theorem 1.
It is not clear a priori which initial active control minimizes the number of switches. In practice, MDR must therefore be executed successively with each control $i \in [n_\omega]$ as the initial active control. This is expressed by the following corollary.
Corollary 3 (Link between MDR and (CIA-$\bar{\theta}$)) Consider the setting of Theorem 1. We assume that, for every $i \in [n_\omega]$ as initial active control, the MDR algorithm constructs a control function $w^{MDR}$ that uses only canonical switches. Then, there is a minimizing control $w^* \in \Omega_N$ for (CIA-$\bar{\theta}$) that only uses canonical switches. Moreover, there exists $i_0 \in [n_\omega]$ such that running MDR with $i_0$ as initial control produces $w^{MDR} \in \Omega_N$ that minimizes (CIA-$\bar{\theta}$).
Proof If the MDR algorithm produces $w^{MDR}$ that only uses canonical switches, $w^{MDR}$ is optimal by Theorem 1 for (CIA-$\bar{\theta}$-init) with the corresponding initial control fixed. Then, the result follows from the fact that the optimal solution of (CIA-$\bar{\theta}$) is contained in the set of optimal solutions of the problems (CIA-$\bar{\theta}$-init) with each control $i \in [n_\omega]$ initially fixed.
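The enumeration over initial controls behind Corollary 3 can be sketched as a simple loop. The helper names `count_switches` and `best_initial_control` are hypothetical, and `run_mdr` is an assumed callable that returns the active control index per interval for a given initial control, or `None` if that control is not initially admissible. The sketch only selects the run with the fewest switches; it does not check the canonical-switch assumption of the corollary.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def count_switches(schedule: Sequence[int]) -> int:
    """Number of interval boundaries at which the active control changes."""
    return sum(1 for prev, cur in zip(schedule, schedule[1:]) if prev != cur)


def best_initial_control(
    run_mdr: Callable[[int], Optional[List[int]]],
    n_controls: int,
) -> Tuple[Optional[int], Optional[List[int]]]:
    """Run MDR once per initial control and keep the run with fewest switches."""
    best_i, best_schedule = None, None
    for i in range(n_controls):
        schedule = run_mdr(i)
        if schedule is None:         # i was not initially admissible
            continue
        if (best_schedule is None
                or count_switches(schedule) < count_switches(best_schedule)):
            best_i, best_schedule = i, schedule
    return best_i, best_schedule
```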
Lemma 4 Consider $a \in A_N$ on an equidistant grid and assume $n_\omega = 2$. Let $w^{MDR}$ denote the control function constructed by MDR with $\bar{\theta} > 0$ and given initial control. If $\theta(w^{MDR}) > \bar{\theta}$, then there is no control function $w \in \Omega_N$ with the same initial active control and $\theta(w) \leq \bar{\theta}$.
Proof We consider the first interval $j$ on which the accumulated control deviation of $w^{MDR}$ is greater than $\bar{\theta}$. Let control $i_1$ be active on $j$. By definition of the MDR scheme, $|\theta_{i_1,j}| > \bar{\theta}$ or $|\theta_{i_2,j}| > \bar{\theta}$ can only appear if there is a switch on interval $j$ and
1. $i_1$ is both forced and inadmissible on interval $j$, or
2. both $i_1$ and $i_2$ are inadmissible on interval $j$.
Proposition 3 establishes that $w^{MDR}$ uses only canonical switches for $\bar{\theta} \geq \frac{1}{2}\bar{\Delta}$, and thus the above cases cannot appear for $\bar{\theta} \geq \frac{1}{2}\bar{\Delta}$. Let us focus on $\bar{\theta} < \frac{1}{2}\bar{\Delta}$. In order to create a control function $w$ that does fulfill $\theta(w) \leq \bar{\theta}$, we need to change at least one activation of $w^{MDR}$ on an earlier interval $l < j$. However, no earlier change of activation is possible:
• We cannot extend an activation block at its end, since the active control is inadmissible.
• If the active control $i_1$ is admissible on $l$, then the other control $i_2$ is not forced on $l$; otherwise it would be active in the MDR scheme. This means $\theta_{i_2,l-1} + a_{i_2,l}\bar{\Delta} \leq \bar{\theta}$. Activating $i_2$ on $l$ results in
$$\theta_{i_2,l} = \theta_{i_2,l-1} + a_{i_2,l}\bar{\Delta} - \bar{\Delta} \leq \bar{\theta} - \bar{\Delta} < -\bar{\theta},$$
where we applied $\bar{\theta} < \frac{1}{2}\bar{\Delta}$. This indicates that the (CIA) objective value of $w$ is again greater than $\bar{\theta}$.
Hence, no previous activation of $w^{MDR}$ can be changed, so that there is no $w$ with $\theta(w) \leq \bar{\theta}$.

Properties of the AMDR algorithm
Theorem 2 states that the AMDR algorithm finds the optimal solution of (CIA), up to the tolerance TOL, for $n_\omega = 2$ and an equidistant discretization. Otherwise, strict assumptions are required for optimality, and in general the feasible solution found only provides an upper bound.
Theorem 2 (Properties of Algorithm 3) Algorithm 3 terminates for given $a \in A_N$, TOL > 0 and $\sigma_{\max} \in \mathbb{N}$ after a finite number of iterations. Furthermore, consider an equidistant grid $G_N$. Let $w^{AMDR}$ denote the solution constructed by Algorithm 3. It follows:
1. $w^{AMDR}$ is a feasible solution of (CIA).
2. (a) If $n_\omega = 2$, we have for the optimum $\theta^*$ of (CIA): $\theta(w^{AMDR}) \leq \theta^* + TOL$.
(b) Let $n_\omega > 2$. We assume the MDR scheme uses only canonical switches in every run. Furthermore, suppose the following holds: if the MDR scheme constructs a solution with $\theta(w^{MDR}) > \bar{\theta}$, then there is no control function $w \in \Omega_N$ with the same initial active control and $\theta(w) \leq \bar{\theta}$. With these assumptions we obtain for the optimum $\theta^*$ of (CIA): $\theta(w^{AMDR}) \leq \theta^* + TOL$.
3. AMDR has time complexity $O(n_\omega \cdot C_{MDR} \cdot \log_2((t_f - t_0)/TOL))$, where $C_{MDR} \in O(N)$ denotes the time complexity of the MDR scheme.
Proof AMDR is a bisection algorithm that either decreases UB (lines 7-8) or increases LB (lines 12-13) by at least one half of (UB − LB) in every iteration of the while loop (line 2). From this and because of TOL > 0, we conclude that the while loop, and hence AMDR as a whole, terminates after finitely many iterations.
1. The objective of (CIA) cannot be greater than $t_f - t_0$, even with no switches allowed, i.e., $\sigma_{\max} = 0$. Since we initialize the AMDR algorithm with $UB = t_f - t_0$, it finds a feasible solution in any case.
2. Every $w^{MDR}$ generated by MDR in line 5 that satisfies the TV constraints together with $\theta(w^{MDR}) < UB$ represents an upper bound on $\theta^*$, i.e., $UB = \theta(w^{MDR}) \geq \theta^*$. For proving that AMDR constructs valid lower bounds LB on $\theta^*$ we exploit that $w^{MDR}$ only uses canonical switches for $n_\omega = 2$, and by assumption for $n_\omega > 2$, so that Corollary 3 is applicable. If there is no initial control $i \in [n_\omega]$ for which MDR produces, for the given $\bar{\theta}$, a $w^{MDR}$ that uses at most as many switches as the TV constraints permit, we conclude by Corollary 3 that there exists no such $w \in \Omega_N$ for this specific threshold $\bar{\theta}$ and, hence, $LB = \bar{\theta} \leq \theta^*$ is a true lower bound on the optimal (CIA) objective value. Moreover, if MDR constructs for a given $\bar{\theta}$ and all initial controls $i \in [n_\omega]$ control functions $w^{MDR}$ with $\theta(w^{MDR}) > \bar{\theta}$, Lemma 4 and the assumption in (b) guarantee that this $\bar{\theta}$ is also a true lower bound on the optimal (CIA) objective value. Altogether, AMDR iteratively generates valid lower bounds LB and upper bounds UB for $\theta^*$ and produces a feasible solution that is optimal up to the chosen tolerance TOL.
3. MDR runs forward in time and computes solely the control deviations $\gamma$ and $\theta$ for all intervals $j \in [N]$, therefore $C_{MDR} \in O(N)$. The interval halving in AMDR ensures that we execute the while loop at most $\log_2((t_f - t_0)/TOL)$ times. Inside this loop, we need to run the MDR scheme in the worst case with all $n_\omega$ controls as initial controls. Combining these findings yields the asserted complexity.
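The complexity statement can be turned into a small back-of-the-envelope calculation. The sketch below is only an illustration of the bound $n_\omega \cdot N \cdot \log_2((t_f - t_0)/TOL)$; the function names are hypothetical, the logarithm is rounded up to a whole number of loop iterations, and the cost of one MDR run is counted as one visit per interval.

```python
import math


def max_bisection_iterations(horizon: float, tol: float) -> int:
    """Upper estimate of while-loop iterations: ceil(log2(horizon / tol))."""
    return max(0, math.ceil(math.log2(horizon / tol)))


def worst_case_interval_visits(n_controls: int, n_intervals: int,
                               horizon: float, tol: float) -> int:
    """n_controls MDR runs per iteration, each visiting n_intervals intervals."""
    return n_controls * n_intervals * max_bisection_iterations(horizon, tol)
```

For a unit time horizon and a tolerance of $10^{-4}$, this gives 14 bisection iterations, so two controls on 100 intervals require at most 2800 interval visits in this counting.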
Remark 6 Several meaningful modifications of the AMDR algorithm are available. We may also use it to find control functions fulfilling other combinatorial constraints, such as minimum dwell time constraints, by checking them together with the TV constraint in line 6. As part of the MDR scheme, the control with maximum forward control deviation $\gamma$ is activated if the previously active control is inadmissible. Instead, one may choose a less greedy variant. For instance, we could activate the next forced and admissible control. Lastly, the initial upper bound UB can be reduced, as we will point out in the next section.
Remark 7 If we drop the TV constraint on w, the AMDR scheme finds a control function with the same objective value as the one obtained by the control function of Sum-Up Rounding [24] (without proof).
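For comparison with Remark 7, the following sketch shows Sum-Up Rounding in its commonly stated form for controls that sum to one: on every interval, the control with the largest forward control deviation is activated. The function name `sum_up_rounding` is hypothetical, and breaking ties in favor of the lowest index is an assumption of this sketch.

```python
from typing import List, Sequence


def sum_up_rounding(a: Sequence[Sequence[float]],
                    deltas: Sequence[float]) -> List[int]:
    """Return the active control index per interval chosen by Sum-Up Rounding."""
    n_controls = len(a)
    theta = [0.0] * n_controls       # accumulated deviation up to interval j-1
    schedule = []
    for j, dt in enumerate(deltas):
        # forward control deviation if no control were activated on interval j
        gamma = [theta[i] + a[i][j] * dt for i in range(n_controls)]
        # max() returns the first maximizer, i.e. the lowest index on ties
        active = max(range(n_controls), key=lambda i: gamma[i])
        schedule.append(active)
        theta = [gamma[i] - (dt if i == active else 0.0)
                 for i in range(n_controls)]
    return schedule
```

Because no limit on the number of switches is enforced here, the resulting control may switch on every interval, which is exactly the behavior the TV constraint is meant to prevent.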
Most results of this section are based on the assumption of an equidistant discretization, which is common in practice. However, the assumption of dealing only with canonical switches in the produced control function is critical. The following example illustrates that a control function generated by MDR with non-canonical switches may use more switches than needed or may not satisfy the rounding boundθ .
Example 3 Consider an equidistant grid. Let the following two relaxed values $a^1, a^2 \in A_N$ be defined as

In this section, we use the MDR algorithm and previous results to deduce bounds on (CIA). We consider a given (CIA) problem with grid $G_N$, relaxed value $a \in A_N$ and maximum number of switches $\sigma_{\max} > 0$. The idea in the following is to construct a control function $w^{MDR}$ that bounds the objective of (CIA). For finding an appropriate initial active control for the MDR scheme, we introduce an auxiliary grid $\tilde{G}_N$ which ends at $\tilde{t}_f$ and has $\tilde{N}$ intervals:
In the definition of $\tilde{G}_N$ we intersect two sets because we consider given $G_N$ and $\sigma_{\max}$. To specify the rounding of a value $t \geq t_0$ down to the next grid point, we utilize the following brackets notation
Depending on whether we deal with an equidistant grid or not, we can prove a sharp bound for (CIA). We are going to distinguish between these two cases in the upcoming results and introduce the following constant
We propose to apply the rounding threshold
in the MDR scheme and claim that this choice will later be beneficial for proving upper bounds on (CIA). Next, we establish useful properties of the rounding $\lfloor \cdot \rfloor_{G_N}$ to the next grid point.
Lemma 5 (Distance to next grid points) Consider σ max > 0 and the rounding thresholdθ defined as above.
The following holds true: Proof 1. Let us first consider the non-equidistant case. If t 0 + jθ ≤ t f , we deduce that the maximum distance of t 0 + jθ to the next smaller or equal grid point is∆ . If t 0 + jθ > t f , we have t 0 + jθ G N = t f and obtain This settles the non-equidistant case: t 0 + jθ G N ≥ t 0 + jθ −∆ . For the equidistant case, we observē We look at the right fraction and notice that the numerator consists of a product of an integer and∆ , whereas the denominator is the integer 3 + 2σ max . Thus, the maximum cut-off by rounding down to the closest grid point is 3+2σ max −1 3+2σ max∆ , which is equal to∆ −C 1∆ and proves the claim. 2. This follows from a similar argumentation as for the claim "1.". For the non-equidistant case we need only to consider t 0 + 3+2σ max ≤ t f , for the equidistant case we take again advantage of t f − t 0 = N∆ .
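The rounding down to the next grid point used in Lemma 5 can be illustrated with a short helper. The sketch assumes the grid is given as an ascending list of grid points $t_0 < t_1 < \dots < t_f$ and that values beyond $t_f$ are mapped to $t_f$, as in the proof above; the function name `round_down_to_grid` is hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def round_down_to_grid(t: float, grid: Sequence[float]) -> float:
    """Largest grid point that is smaller than or equal to t."""
    if t < grid[0]:
        raise ValueError("t lies before the first grid point")
    # bisect_right counts the grid points <= t; the last of them is the result
    return grid[bisect_right(grid, t) - 1]
```

On a non-equidistant grid such as `[0.0, 1.0, 2.5, 4.0]`, the cut-off `t - round_down_to_grid(t, grid)` never exceeds the maximum interval length 1.5 for any `t` inside the horizon, which is the non-equidistant case of the first claim.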
We continue with a lemma that quantifies the length of activation blocks in w MDR .
Lemma 6 (Length of activation blocks δ l ) Consider a feasible control solution for (CIA−θ ) that only uses canonical switches. Then, for the length of its activation block δ l , 2 ≤ l ≤ σ max , follows: Proof Let i be the active control on activation block l. We are using the assumption regarding canonical switches twice. First, i is forced for the earlier switch l − 1: and second, it is inadmissible on interval τ l : By definition of activation blocks we have δ l = ∑ τ l −1 j=τ l−1 ∆ j so that we obtain by rearranging (6.4):

Plugging (6.3) into the above inequality yields
which settles the non-equidistant case. For an equidistant grid, we compute and because δ l is a multiple of∆ , it follows from δ l > 2θ −∆ Next, we propose in Algorithm 4 a specification of the initial active control i 0 . We observe that a small number of switches onG N in terms ofθ is sufficient, as quantified in the following lemma.
Algorithm 4: Detecting the initial active control for MDR that results in at most one switch on $\tilde{G}_N$.
4 Set $i_0 = i_1$;
5 else
6 Set $i_0 = i_1$, where $i_1$ is a next forced control on interval $j = 1$;
7 end
8 return: $i_0$ as initial control;

Lemma 7 The MDR algorithm applied to the auxiliary grid $\tilde{G}_N$ with rounding threshold $\bar{\theta}$, $n_\omega = 2$ and $i_0$ from Algorithm 4 as initial control constructs a control function $w^{MDR}$ that uses at most one switch on $\tilde{G}_N$.
Proof We distinguish between the three possibilities of the initial control in Algorithm 4.
1. If MDR is initialized with i 2 and for i 1 holds ∑Ñ j=1 a i 1 , j ∆ j ≤θ , the latter does not become forced onG N .
For this reason there is no switch. 2. If i 1 with ∑Ñ j=1 a i 1 , j ∆ j ≤ 2θ −∆ + C 1∆ is the initial active control, a switch has to occur in case i 2 is forced on some interval τ 1 ∈ [Ñ]. We need to prove that i 1 does not become forced after the first switch, which is equivalent to i 2 does not become inadmissible due to n ω = 2 and Lemma 2, because then there is no other switch. For this, we derive a lower bound on the length of the first activation block δ 1 , where i 1 is active. The control i 1 becomes at the earliest inadmissible when it has been active on intervals j with a i 1 , j = 0 whose lengths sum up to be more thanθ , i.e., t 0 +θ G N − t 0 . With this observation and Lemma 5.1 we derive Note that γ i 1 , j is monotonically increasing with increasing interval j > τ 1 as long as i 1 is inactive, i.e., w i 1 , j−1 = 0. Hence, if we are able to prove γ i 1 ,Ñ ≤θ in case of w i 1 , j = 0 for j > τ 1 , we also have that γ i 1 , j ≤θ for any interval j > τ 1 meaning there is no second switch. Altogether, we get with the above inequality so that w MDR switches no more than once onG N .

Otherwise we haveÑ
in the else case. We can argue similarly as in the previous case, which is why we only have to prove γ i 1 ,Ñ ≤θ . Since i 1 is a next forced control on the first interval, there is an interval l ≤ τ 1 with ∑ l j=1 a i 1 , j ∆ j > θ . This implies the interval τ 1 of the earliest possible switch is given by Using the (Conv) property yields ∑Ñ j=1 a i 1 , j ∆ j = ∑Ñ j=1 ∆ j − ∑Ñ j=1 a i 2 , j ∆ j and therefore where we usedt f = t 0 + in the third equation. To conclude, there is also at most one switch.
The above three lemmata are crucial for the following theorem, which provides an upper bound on (CIA).
Theorem 3 Consider any grid G N , relaxed values a ∈ A N and a maximum number of switches σ max > 0. The objective of (CIA) is bounded by Proof We want to prove that the control function w MDR constructed by MDR with rounding thresholdθ from (6.2) and initial control from Algorithm 4 is feasible and satisfies the claimed bound. We observeθ ≥ 1 2∆ from its definition in (6.2) and the definition of C 1 in (6.1). Thus, we can apply Proposition 3 in connection with Lemma 3 so that w MDR fulfills indeed the claimed bound: What remains to be shown is that w MDR is a feasible solution for (CIA−θ ), i.e., it does not use more than σ max switches. In the sequel, we write n = σ max in variable indices to improve readability of the latter. We assume there are already σ max switches taken in w MDR and calculate the maximum length of the possibly last activation block, i.e., δ n+1 = t f − t τ n −1 . In Lemma 7 we have derived that at most one switch is used on the reduced gridG N untilt f , but there may follow another switch shortly afterwards, i.e., τ 2 ≥Ñ + 1. For the remaining σ max − 2 activation blocks until t τ n −1 we can apply Lemma 6, since Proposition 3 states that MDR uses canonical switches for n ω = 2. Lemma 6 states Combining these findings and using Lemma 5.2 results in Let i denote the control that is active after the σ max th switch of w MDR . Note that θ i, j is monotonically decreasing with increasing interval j ≥ τ n since i is chosen to be active on interval j. Hence, if we are able to show that i is admissible on interval N, then it is also admissible on earlier intervals. We want to prove the admissibility of control i on interval N as in this case there will be no further switch until N. For this, let us assume i is inadmissible on interval N. We obtain In the second inequality we used that control i is forced on the interval τ n of the nth, respectively σ max th, switch. 
With this contradiction, there cannot be a further switch after τ n ; in other words, w MDR uses at most σ max switches and is a feasible solution of (CIA). This completes the proof.
In the sequel, we are going to elaborate how sharp the upper bound from Theorem 3 is, and we will thereby exclude the case σ max ≥ N − 1; otherwise the TV constraint would be no longer restrictive. Before presenting the main result in this context, we need a technical lemma again.
We have that
where $\lceil x \rceil_{0.5}$ denotes the rounding up of $x \in \mathbb{R}$ to the next multiple of 0.5 as defined in Section 1.3.
Proof Since R is a rational number with 3 + 2σ max in the denominator, we have Moreover, using basic properties of floor and ceiling functions yields R ≤ R 0.5 + 0.5, (6.10) Next, we calculate Both 2 R 0.5 − 1 ∈ Z and N− R σ max +1 ∈ Z are valid so that we can deduce from the above inequality there is an equidistant grid G N and an a ∈ A N so that (CIA) has an objective value of (6.12) Proof If σ max + 2 ≤ N < 3 + 2σ max , then we define a by specifying the values of control i 1 for the intervals j ∈ [N] by a i 1 , j = 1, if j odd, 0, if j even.
Since there are more intervals $N$ than the maximum number of switches $\sigma_{\max}$ plus one, there is an interval $j$ on which the optimal solution $w^*$ of (CIA) has the value $w^*_{i_1,j} = 1$, while $a_{i_1,j} = 0$ holds. This results in $\theta^* \geq 1 \cdot \bar{\Delta} = \lceil \frac{N}{3+2\sigma_{\max}} \rceil_{0.5} \bar{\Delta}$. Otherwise, if $N \geq 3 + 2\sigma_{\max}$, we proceed as follows:
1. We construct a specific matrix $a$ that depends on the choice of $\sigma_{\max}$ and $N$.
2. We prove that the MDR scheme constructs for both initial active controls, for this a value and with a rounding threshold of for any 0 < ε < N 3 + 2σ max 0.5∆ , (6.13) control functions w MDR that use more than σ max switches. Then, we can come back to the idea of the AMDR scheme and Theorem 2.1., which states that w AMDR is feasible for (CIA), i.e. uses at most σ max switches, resulting in Theorem 2 provides in 2. (a) also a statement about the relation to the optimal solution of (CIA): Because the tolerance T OL can be arbitrarily small, we conclude the optimal solution of (CIA) involves an objective value of at least N 3+2σ max 0.5∆ . 1. We reuse the notation of R from Lemma 8 and introduce the auxiliary constant n I ∈ N: Next, we are interested in designing a specific a ∈ A N with the property to enforce an improper covering by any w ∈ Ω N that satisfies a (CIA) objective of at mostθ . By improper covering we indicate w ∈ Ω N has to use more than σ max switches in order to yield the desired (CIA) objective value of at mostθ . We create sets of consecutive intervals for a on which either a i 1 or a i 2 is set to one (and the other control is thereby set to zero). We call these sets of consecutive intervals with the same value here index sections. We generate σ max + 2 index sections, where the two controls are alternately set to one in a, with the idea that a feasible solution w of (CIA) with at most σ max switches shall contain at most σ max + 1 activation blocks. The first index section will include R intervals, followed by index sections with n I intervals, and the last index section arises from the remaining intervals until N is reached. After conveying some intuition of the specific a ∈ A N , we continue with a technical definition of the index set J that specifies the index sections on which a i 1 is set to one: if σ max is odd.
With these definitions we introduce a by fixing the values of control i 1 . (6.14) The value of a i 1 on the ( R )th interval in the second and third case may seem unintuitive. The idea of this construction is that it results θ i 1 , R = R 0.5 if control i 1 is neither active on the first index section, nor on the ( R )th interval. In this way, control i 1 needs to be already active on the first index section in order to maintain a (CIA) objective value of at mostθ . 2. We want to prove that the MDR scheme with the rounding threshold from (6.13) and with a defined in (6.14) constructs a control function that uses more than σ max switches, independent of the initial active control. For this, we are going to establish the following claim: a) If i 1 is the initial active control, the kth switch of w MDR happens before the ( R + kn I )th interval, where k ∈ [σ max + 1]. b) If i 2 is the initial active control, the kth switch of w MDR happens before the ( R + (k − 1)n I )th interval, where k ∈ [σ max + 1].
Assuming the claim is true, $w^{MDR}$ indeed uses more than $\sigma_{\max}$ switches because the $(\lceil R \rceil + (\sigma_{\max} + 1) n_I)$th interval exists, i.e., its index is smaller than or equal to $N$:
The inequality above shows that there are indeed $\sigma_{\max} + 2$ index sections for $a$ as described above. With this information we deduce that $\bar{\theta} < \frac{1}{2}\bar{\Delta}$ results directly in more than $\sigma_{\max}$ switches or in control solutions that do not satisfy the claimed optimal (CIA) objective value from (6.12) anyway:
- If $a$ consists only of zeros and ones and $\bar{\theta} < \frac{1}{2}\bar{\Delta}$, the MDR algorithm creates switches on all intervals $j$ for which $a_{\cdot,j} \neq a_{\cdot,j-1}$ holds true. Thus, the activation blocks of $w^{MDR}$ would match the index sections of $a$, i.e., $w^{MDR} = a$. As we derived $\sigma_{\max} + 2$ index sections for $a$, there are $\sigma_{\max} + 2$ blocks for $w^{MDR}$ and therefore $\sigma_{\max} + 1$ switches.
- If $a_{i_1,\lceil R \rceil} = 0.5$, then there is no $w$ with $\theta(w) < \frac{1}{2}\bar{\Delta}$ regardless of which control is active on interval $\lceil R \rceil$, since $a$ is either zero or one on all other intervals. Hence, we can exclude the case $\bar{\theta} < \frac{1}{2}\bar{\Delta}$ from further consideration.
Thus, we are left with the caseθ ≥ 1 2∆ . In this case, we can apply Proposition 3 and conclude that we deal only with canonical switches. We return to prove the claim, and we proceed via induction. Base case: a) We consider k = 1 and conclude from N ≥ 3 + 2σ max that R 0.5 ≥ 1 holds. Plugging this into inequality (6.8) from Lemma 8 results in R 0.5 < n I , and thus θ < n I∆ .
(6.15) By construction of a, the values a i 1 , j are equal to one for 1 ≤ j ≤ R . The value a i 1 , R is either 0.5 or 1. Therefore, −0.5 ≤ θ i 1 , R ≤ 0 holds for the accumulated control deviation of w MDR with i 1 as initial active control. After the ( R )th interval n I intervals follow on which a i 1 , j is zero. We conclude i 1 becomes inadmissible by (6.15) before interval R + n I and hence, the first switch appears before this interval. b) We show the claim for the first two switches because we take an interest in a switch that occurs after interval R in the induction step. Let k = 1. Since we conclude control i 2 becomes inadmissible the latest on interval R when being the initial active control and, equivalently, w MDR has a switch on interval R at the latest. This is equivalent to at least one activation of i 1 up to and including interval R , which we use for proving the assertion in case of k = 2. Let us assume the second switch happens on or after interval R + n I . This implies i 1 would be admissible on that interval and we derive Consequently, the second switch happens before the ( R + n I )th interval.
Induction step: Assume the assertion holds for k − 1 ≤ σ max , we show that it is also true for k. At first, we prove an auxiliary result. For i ∈ [2] and j ≥ R we have that θ i, j = R 0.5∆ + z∆ , for some z ∈ Z. (6.16) We prove the equation (6.16) by computing the accumulated control deviation: For j > R we have defined a i 1 , j ∈ {0, 1} so that (6.16) holds with z = ∑ j l=1+ R a i 1 ,l − ∑ j l=1 w i 1 ,l . On the other hand, for the other control i 2 holds In order to make use of the established auxiliary result, we need to argue that the (k − 1)st switch happens after the interval R . In case a) the MDR algorithm will not deactivate i 1 due to a i 1 , j = 1 before the R th interval. So it does on the R th interval if a i 1 , R = 0.5 because we have establishedθ ≥ 1 2∆ . In case b) we use the base case for the second switch. We consider the interval τ 1 of the first switch of case a) and compare the two accumulated control deviations for the two cases a) and b) on τ 1 and obtain θ i 1 ,τ 1 (b) ≥ θ i 1 ,τ 1 (a) because i 2 has already been activated in case b) in contrast to case a). Since τ 1 > R , we are done. Now, without loss of generality, let i 1 be the active control after the switch on interval τ k−1 . We know that i 2 is active and thus admissible on interval τ k−1 − 1: which implies by Lemma 2 for the control i 1 and by equation (6.16) we have for some z i 1 ≥ 1 The control i 2 is inadmissible on interval τ k−1 as there are only canonical switches. If a i 2 ,τ k−1 = 1 would be true, then i 2 would already have been inadmissible on interval τ k−1 − 1. Also, a i 2 ,τ k−1 = 0.5 is not possible because we derived τ k−1 > R . We conclude a i 2 ,τ k−1 = 0. From this and the induction hypothesis, which states that the (k − 1)st switch appears before the ( R + (k − 1)n I )th interval, follows a i 1 , j = 1 for the intervals j between τ k−1 and ( R + (k − 1)n I ). Hence, θ i 1 , R +(k−1)n I ≤ ( R 0.5 −1)∆ holds due to (6.17). 
Finally, we assume i 1 can stay active up to and including interval R + kn I without becoming inadmissible. This and a i 1 , j = 0 for R + (k − 1)n I + 1 ≤ j ≤ R + kn I imply Thus, i 1 is not active until the ( R + kn I )th interval, respectively with an analogous computation for case b) i 1 is not active until the ( R + (k − 1)n I )th interval. Thereby, we showed that indeed the assertion holds for k. Altogether, the constructed control function w MDR uses more than σ max switches for the chosen rounding thresholdθ so that the (CIA) objective value is at leastθ and we conclude the claimed theorem is true.
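The small-grid case of the proof, $\sigma_{\max} + 2 \leq N < 3 + 2\sigma_{\max}$ with alternating relaxed values, can be checked by brute force for tiny instances. The sketch below enumerates all binary controls for two controls on an equidistant grid; the function name `cia_brute_force_two_controls` is hypothetical, and the enumeration is exponential in $N$, so it only serves as a sanity check.

```python
from itertools import product
from typing import Sequence


def cia_brute_force_two_controls(a1: Sequence[float], delta: float,
                                 sigma_max: int) -> float:
    """Minimal (CIA) objective over all w with at most sigma_max switches."""
    best = float("inf")
    for w1 in product((0, 1), repeat=len(a1)):
        switches = sum(1 for p, c in zip(w1, w1[1:]) if p != c)
        if switches > sigma_max:
            continue                 # violates the TV constraint
        theta, worst = 0.0, 0.0
        for a, w in zip(a1, w1):
            # with two controls, the deviation of control 2 is -theta
            theta += (a - w) * delta
            worst = max(worst, abs(theta))
        best = min(best, worst)
    return best
```

With `a1 = [1, 0, 1, 0]`, unit interval length and `sigma_max = 2`, the minimum is one interval length, because reproducing the alternating pattern exactly would need three switches.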
We complete this section by drawing a conclusion from the Theorems 3 and 4.

Corollary 4 Consider an equidistant grid $G_N$, $a \in A_N$ and $1 \leq \sigma_{\max} \leq N - 2$. The objective of (CIA) is bounded by
$$\theta^* \leq \frac{N + \sigma_{\max} + 1}{3 + 2\sigma_{\max}} \bar{\Delta}, \tag{6.18}$$
which is the tightest possible bound.
Proof The inequality (6.18) is achieved by Theorem 3 applied to the equidistant case and rearranging terms:
It is the tightest possible bound by Theorem 4 and the case $N = k(3 + 2\sigma_{\max}) + 2 + \sigma_{\max}$, $k \in \mathbb{N}_0$:

Upper Bounds on (CIA) with $n_\omega > 2$
Deriving bounds for the (CIA) problem with more than two controls is more difficult than in the previous section, as the number of possibilities increases significantly. Let $\theta^{\max}$ denote the maximum possible objective value of (CIA) for any given $a \in A_N$ and $G_N$ in this section. We will first use known results to derive lower and upper bounds for $\theta^{\max}$. Then, we dedicate ourselves to the continuous relaxation of (CIA), which allows us to prove a sharper lower bound. Based on this, we state a conjecture about the actual value of $\theta^{\max}$.
Corollary 5 Let $1 \leq \sigma_{\max} \leq N - 2$ and $n_\omega > 2$. We have that $\theta^{\max} \geq \frac{N + \sigma_{\max} + 1}{3 + 2\sigma_{\max}} \bar{\Delta}$.
Proof This bound has been established in Theorem 4 and Corollary 4 for the case $n_\omega = 2$. The example provided in the proof of Theorem 4 can also be applied to the case $n_\omega > 2$ by setting the values of the relaxed controls $a_i$, for all $i \in [n_\omega]$, $i > 2$, to zero.
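The relation between the two expressions from the two-control case can be checked with exact rational arithmetic. The sketch assumes that the upper bound reads $\frac{N + \sigma_{\max} + 1}{3 + 2\sigma_{\max}}\bar{\Delta}$, as in the corollary above, and that the worst-case value from Theorem 4 is $\lceil \frac{N}{3 + 2\sigma_{\max}} \rceil_{0.5}\bar{\Delta}$; both are returned in units of $\bar{\Delta}$, and the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def ceil_to_half(x: Fraction) -> Fraction:
    """Round x up to the next multiple of 0.5."""
    return Fraction(ceil(2 * x), 2)


def upper_bound_factor(n: int, sigma_max: int) -> Fraction:
    """(N + sigma_max + 1) / (3 + 2 * sigma_max), in units of the grid length."""
    return Fraction(n + sigma_max + 1, 3 + 2 * sigma_max)


def worst_case_factor(n: int, sigma_max: int) -> Fraction:
    """N / (3 + 2 * sigma_max) rounded up to a multiple of 0.5."""
    return ceil_to_half(Fraction(n, 3 + 2 * sigma_max))
```

For $N = k(3 + 2\sigma_{\max}) + 2 + \sigma_{\max}$ both factors equal $k + 1$, which is the case in which the upper bound is attained; for all other $N$ the worst-case factor does not exceed the upper bound factor.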
Corollary 6 Let 1 ≤ σ max ≤ N − 2 and n ω > 2. We have that θ max ≤ 2n ω −3 Proof For the (CIA) problem without TV constraints, but with minimum up time constraints the sharp bound is proven in Theorem 2 in [31], where the constant C U ≥ 0 represents the given minimum up time. If we require for (CIA) that an activated control remains active for a time period of at least t f −t 0 σ max +1 , at most σ max switches take place. Thus, the TV constraint serves as a relaxation of the minimum up time constraint.
We tighten the above results by investigating the continuous version of the (CIA) problem.
Definition 17 (CCIA) Let α ∈ A and σ max ∈ N be given. Then, we define the continuous combinatorial integral approximation (CCIA) problem to be min θ , ω∈Ω θ (7.1) σ max ≥ TV (ω). We stress that the given data α lives for (CCIA) in A not as usually in A N and analogously we try to find a binary control function ω ∈ Ω , not w ∈ Ω N . We obtain a lower bound for the maximum objective θ max of (CCIA) over all α ∈ A by constructing a specific instance, as indicated in the following theorem.
n ωt + (σ max + 2 − n ω )2t = t f − t 0 so that ∪ i∈[n ω ] I i = T follows and because the intervals I i are all disjoint, we obtain α i (t) = 1 for exactly one control i and for all t ∈ T . Hence, α ∈ A . The next observation about α is that it consists of n ω + (σ max + 2 − n ω ) = σ max + 2 activation blocks (interpreted in this continuous setting), meaning there are σ max + 1 (7.5) Proof (CCIA) is a relaxation of (CIA) since every feasible solution of (CIA) corresponds to a feasible solution of (CCIA). Thus, the claim follows from Proposition 4.
The lower bound in Corollary 7 is generally sharp in the sense that there are combinations of n ω , σ max and G N so that θ max equals the claimed lower bound. The following example illustrates this relationship.
Example 4 Let the grid be equidistant with N = 3 and n ω = 3. Consider the following two instances: Consider σ max = 1 in the first example. Then, θ =∆ and θ max ≥∆ follows from the above corollary. Any asymmetric modification of (a 1 i, j ) with unequal control accumulation ∑ 3 j=1 a 1 i 1 , j = ∑ 3 j=1 a 1 i 2 , j would result in a binary control function w OPT that activates the controls with highest control accumulation and hence θ <∆ . We conclude that the claimed bound is sharp, i.e., θ max =∆ .
Finding the exact value of $\theta^{\max}$ is difficult due to the nonconvex objective $\max_a \min_w \max_{i \in [n_\omega],\, j \in [N]}$ and the tremendously increased number of different $\omega \in \Omega$ when $n_\omega > 2$, but we conjecture that the lower bound in Proposition 4 cannot be improved. We recognize the symmetry of the constructed $\alpha$ in the proof: any modification of $\alpha$ that alters the length of its activation blocks would result either in fewer than $\sigma_{\max} + 2$ activation blocks or in at least one block that is shorter than before. The latter block would be shorter than $\frac{t_f - t_0}{2\sigma_{\max} + 4 - n_\omega}$ if the block is the first control's activation, and shorter than $2 \cdot \frac{t_f - t_0}{2\sigma_{\max} + 4 - n_\omega}$ otherwise. With the argumentation from the proof of Proposition 4, this would allow us to choose a control function $\omega \in \Omega$ with a (CCIA) objective value smaller than $\frac{t_f - t_0}{2\sigma_{\max} + 4 - n_\omega}$. Furthermore, we argue that the optimal objective value of (CCIA) is smaller than that of (CIA) by at most $\frac{1}{2}\Delta$, because the switching times of the optimal $\omega \in \Omega$ differ by at most one half of the maximum grid length from those of the optimal $w \in \Omega_N$. We close this section by summarizing these thoughts in the following conjecture.

Numerical Experiments
We test the proposed algorithm with a benchmark example from the https://mintOC.de library [25], with a real-world adsorption cooling machine problem [7], and with generic data. We use the CIA decomposition to solve these problems, where we applied CasADi v3.4.5 [1] to pass the NLP, with efficient derivative calculation, to the solver Ipopt 3.12.3 [30]. We implemented the AMDR algorithm as an add-on to the open-source software package pycombina [7] and used its BNB solver for benchmarking.
The BNB scheme is based on the idea of branching forward in time and exploits that an evaluation of the objective function up to the current grid point yields a valid lower bound that is extremely cheap to compute; see [15,27] for further details. We set the tolerance parameter of the AMDR algorithm to TOL = 0.0001. All computational experiments are executed on a workstation with 4 Intel i5-4210U CPUs (1.7 GHz) and 7.7 GB RAM.
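The lower bound mentioned above can be illustrated with a minimal sketch. It assumes the usual (CIA) objective, i.e., the largest absolute accumulated deviation between relaxed values a and binary values w over all controls and grid points; this definition is not restated in this section, and the function name and the toy data are hypothetical. Because the objective is a maximum over grid points, its value on any prefix of the horizon cannot exceed its value on the full horizon.

```python
def cia_prefix_objective(a, w, dt):
    """Return the running (CIA) objective for every grid point.

    a, w: lists of n_modes rows with one value per interval
          (relaxed and binary control values, respectively).
    dt:   list of interval lengths.
    Entry j is the largest absolute accumulated deviation over intervals 0..j,
    which is a lower bound on the objective of any completion of w.
    """
    n_modes = len(a)
    acc = [0.0] * n_modes   # accumulated deviation per control
    best = 0.0              # running maximum of |acc|
    prefix = []
    for j in range(len(dt)):
        for i in range(n_modes):
            acc[i] += (a[i][j] - w[i][j]) * dt[j]
            best = max(best, abs(acc[i]))
        prefix.append(best)
    return prefix


# Toy data (assumed): two modes, four equidistant intervals on [0, 1].
a = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
w = [[1, 1, 0, 0], [0, 0, 1, 1]]
dt = [0.25] * 4
prefix = cia_prefix_objective(a, w, dt)
print(prefix)
```

The running values never decrease, so a branch can be pruned as soon as its prefix value reaches the objective value of the best known feasible solution.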

Multimode MIOCP
We consider an MIOCP, referred to as (P1), which is a modified version of the Egerstedt standard problem from https://mintOC.de. It has three modes, i.e., $n_\omega = 3$. We use the initial values $x_0 := (0.5, 0.5)^T$. Furthermore, we add the TV constraints (3.1)-(3.2) to (P1), with varying maximum number of switches $\sigma_{\max}$. Fig. 2 illustrates the differential state and control trajectories for $\sigma_{\max} = 20$ with relaxed binary controls as well as binary controls based on SUR, BNB and AMDR. We remark that the control function constructed by SUR uses 70 switches and is therefore infeasible with respect to $\sigma_{\max} = 20$. The relaxed control values are greater than zero and less than one around $t \approx 0.45$ and for $t \ge 0.8$, so that the corresponding approximated state trajectories of BNB and AMDR differ slightly from the relaxed one from $t \approx 0.45$ on. We set the BNB iteration limit to $5 \cdot 10^6$, so that it stopped after 15.3 s with (CIA) objective value $\theta = 9.1 \cdot 10^{-3}$.

Fig. 2: State trajectories for (P1) and control trajectories by SUR, BNB and AMDR.

Table 1 shows that, for small instances, e.g., $N = 200$, the BNB algorithm constructs better (CIA) objective values than AMDR if enough time is available. If the BNB scheme finds a good solution, it will usually do so after a few million iterations. While the $\theta$ values of AMDR are close to those of BNB for $N = 200$, they clearly outperform the latter for bigger instances. The run time of AMDR increases only slightly with grid refinement, from about 0.1 seconds to at most 0.6 seconds. A C++ implementation could still improve the run time, as we have so far used a prototype implementation in Python. It appears that selecting the next-forced control rather than the one with a maximum $\gamma$ value is beneficial as part of the AMDR algorithm and tends to yield the solution with the smallest (CIA) objective value.
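The remark that SUR can violate the switching limit can be reproduced on a toy instance. The sketch below assumes the standard Sum-Up Rounding rule for one active mode per interval (activate the mode with the largest accumulated relaxed value minus accumulated binary value, ties going to the smallest index) and counts a switch whenever the active mode changes between consecutive intervals. The function names and the data are hypothetical and unrelated to (P1).

```python
def sum_up_rounding(a, dt):
    """Standard Sum-Up Rounding sketch; returns the active mode per interval."""
    n_modes = len(a)
    acc = [0.0] * n_modes  # accumulated a_i * dt minus accumulated w_i * dt
    active = []
    for j in range(len(dt)):
        for i in range(n_modes):
            acc[i] += a[i][j] * dt[j]
        # Mode with the largest accumulated deficit; ties go to the smallest index.
        k = max(range(n_modes), key=lambda i: (acc[i], -i))
        active.append(k)
        acc[k] -= dt[j]
    return active


def count_switches(active):
    """Number of changes of the active mode between consecutive intervals."""
    return sum(1 for p, q in zip(active, active[1:]) if p != q)


# Toy data (assumed): both modes relaxed to 0.5 on eight equidistant intervals.
a = [[0.5] * 8, [0.5] * 8]
dt = [0.125] * 8
active = sum_up_rounding(a, dt)
switches = count_switches(active)
print(active, switches)
```

On this instance SUR alternates between the two modes in every interval, so any limit below seven switches is violated, which mirrors the infeasibility observed for SUR above.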

Dualmode adsorption cooling machine problem
In [6,7], a complex renewable energy system in the form of a solar thermal climate system with nonlinear system behavior is introduced as an MIOCP. The system's core is an adsorption cooling machine, which can be switched on to intensify the cooling down of ambient temperature. The goal is to keep the room temperature in a comfort zone and at the same time to minimize the energy costs. We omit a detailed description of the system and refer to [7]; we consider the relaxed binary control values as given, as illustrated in the left plot of Fig. 3. We assume two modes of the adsorption cooling machine, i.e., $n_\omega = 2$, and a time horizon of a whole day with control adjustment every four minutes, i.e., $N = 360$.
We use the AMDR scheme to calculate a candidate solution of the (CIA) problem depending on $\sigma_{\max}$, which is optimal by virtue of Theorem 2.2. (a). The right plot in Fig. 3 shows the (CIA) objective values of BNB solutions with increasing iteration limit. For a small and for a large number of allowed switches, the deviation of the BNB solutions is small. One explanation for this is the limited degree of freedom for a small $\sigma_{\max}$, so that the width of the BNB tree is very limited. With a large $\sigma_{\max}$, on the other hand, solutions with a small $\theta$ value can be found quickly, with which many nodes can be pruned. The deviation from the optimal solution is particularly striking for medium-sized $\sigma_{\max}$. For some instances, especially for $10 \le \sigma_{\max} \le 20$, an increase of the iteration limit hardly leads to an improvement because the BNB algorithm seems to remain in a suboptimal branch. We also compare the optimal solution of (CIA) with the upper bound from Corollary 4 and see that the latter is between 200 and 600 percent larger.

Table 1: Comparison of (CIA) objective values and run time of (P1) for different solving methods and varying $\sigma_{\max}$. AMDR corresponds to Algorithm 3, while AMDR-NF represents a modification in which the admissible and next-forced control is selected to be active in line 4 of Algorithm 2. By BNB we refer to pycombina's BNB algorithm with depth-first node selection strategy, with iteration limits of 5 and 50 million nodes. We highlight the best objective values in red. The last two columns show the optimal objective values and the upper bounds from Conjecture 1.

Comparison of Optima for (CIA) with Upper Bounds based on Generic Data
The two investigated MIOCPs showed a relatively large deviation of the optimal (CIA) objective value from the derived upper bounds. Therefore, we generated uniformly distributed random values $a \in A_N$ for $N = 40$ equidistant intervals and $n_\omega \in \{2, 3\}$ controls, and examined the ratio of these two values. We illustrate this comparison in Fig. 4, where we use the upper bound from Corollary 4 for $n_\omega = 2$ and the one from Conjecture 1 for $n_\omega = 3$. The objective values θ , θ and bounds $\theta^{\max}$ decrease logarithmically with increasing $\sigma_{\max}$, as expected. In contrast to the above MIOCPs, the (CIA) objective values come close to the upper bounds, particularly for small $\sigma_{\max}$, but a relevant gap remains for larger $\sigma_{\max}$. A larger sample size may reduce this gap further; we considered here only 1000 (CIA) instances per $\sigma_{\max}$ value. We also note that the values generated by the AMDR algorithm are very close to the optimal ones.
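The dependence of the optimal (CIA) objective value on the switching limit can be explored on a very small random instance by plain enumeration. The sketch below is not the procedure used for Fig. 4: it assumes the usual accumulated-deviation objective, counts a switch as a change of the active mode, uses N = 8 instead of N = 40 because the enumeration grows exponentially, and draws the relaxed values by normalizing uniform samples, which is only one possible sampling scheme. All function names are hypothetical.

```python
import random
from itertools import product


def cia_objective(a, active, dt):
    """Largest absolute accumulated deviation between a and the binary control."""
    acc = [0.0] * len(a)
    theta = 0.0
    for j, k in enumerate(active):
        for i in range(len(a)):
            acc[i] += (a[i][j] - (1.0 if i == k else 0.0)) * dt[j]
            theta = max(theta, abs(acc[i]))
    return theta


def count_switches(active):
    """Number of changes of the active mode between consecutive intervals."""
    return sum(1 for p, q in zip(active, active[1:]) if p != q)


def brute_force_cia(a, dt, sigma_max):
    """Enumerate all mode sequences with at most sigma_max switches."""
    best = float("inf")
    for active in product(range(len(a)), repeat=len(dt)):
        if count_switches(active) <= sigma_max:
            best = min(best, cia_objective(a, active, dt))
    return best


def random_relaxed_control(n_modes, n_int, rng):
    """Assumed sampler: uniform draws, normalized to sum to one per interval."""
    cols = []
    for _ in range(n_int):
        u = [rng.random() for _ in range(n_modes)]
        s = sum(u)
        cols.append([x / s for x in u])
    return [[cols[j][i] for j in range(n_int)] for i in range(n_modes)]


rng = random.Random(0)
n_int = 8
dt = [1.0 / n_int] * n_int
a = random_relaxed_control(2, n_int, rng)
thetas = [brute_force_cia(a, dt, s) for s in range(n_int)]
print(thetas)
```

Since every control that is feasible for a given limit stays feasible for a larger one, the enumerated optima cannot increase with the switching limit.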

Discussion
As expected from its polynomial run time complexity, our prototype implementation of AMDR constructs (CIA) feasible solutions very quickly. For a problem with more than two binary controls, their $\theta$ values mostly outperform the ones obtained by the BNB algorithm or are at least close to the latter. Consequently, the AMDR solution is itself a promising (CIA) feasible solution or is a fast option to initialize the BNB with a competitive upper bound. As stated in Remark 6, the AMDR algorithm may also be used to include combinatorial constraints other than the TV constraints.
Regarding the comparison with the BNB method, we note as a limitation that we used only the depth-first node selection strategy; further tuning could have yielded more competitive feasible solutions of (CIA). Besides, the BNB algorithm can include a variety of combinatorial conditions of the (CIA) problem, so it is generally advantageous.
We also note that our calculations mainly examine the (CIA) objective value because it correlates with the (MIOCP) objective value. With very similar or large (CIA) objective values, however, the smaller value may lead to a worse (MIOCP) objective value, and vice versa. There may be several binary control functions with the same (CIA) objective value but different (MIOCP) objective values. In some instances, we observed that the AMDR algorithm generates a control function with suboptimal (MIOCP) objective value since its switches are structurally delayed compared to the switches on bang-bang arcs of the relaxed binary values.
In this case, we tested, as a heuristic, shifting the AMDR binary values backward in time by $\theta / \Delta$ intervals so that the control function is more similar to the relaxed binary values, which worked well.
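The shifting heuristic described above can be sketched as follows. The sketch makes several assumptions that the text does not fix: the control is stored as the active mode per interval on an equidistant grid, the shift is rounded to the nearest whole number of intervals, and the vacated intervals at the end of the horizon keep the last active mode. The function name is hypothetical.

```python
def shift_backward(active, theta, delta):
    """Move every switch earlier by round(theta / delta) intervals.

    active: active mode per interval.
    theta:  (CIA) objective value of the control.
    delta:  grid length (equidistant grid assumed).
    """
    k = int(round(theta / delta))
    if k <= 0 or not active:
        return list(active)
    k = min(k, len(active))
    # Drop the first k intervals and repeat the last mode to keep the length.
    return list(active[k:]) + [active[-1]] * k


# Toy data (assumed): one switch, delayed by two intervals of length 0.125.
active = [0, 0, 0, 0, 1, 1, 1, 1]
shifted = shift_backward(active, theta=0.25, delta=0.125)
print(shifted)
```

Dropping leading intervals and repeating the last mode cannot create additional switches, so the shifted control still respects the same switching limit.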

Conclusions
In this paper, we have devised a fast rounding method for MIOCPs with constrained TV of the integer control. Under certain assumptions, e.g., $n_\omega = 2$, the proposed algorithm constructs an optimal solution of the (CIA) subproblem. Based on this, we have proven bounds on the integrality gap of (CIA) for the constrained TV case. Our numerical results have shown that the computed control function in many cases outperforms the BNB solution obtained under an iteration limit. Because of its very short run time, we recommend the proposed method especially for the mixed-integer model predictive control setting or for instances with a vast number of binary variables. In the future, this algorithmic proposal could be compared with a penalty alternating direction method [12] or extended to switching costs as in [3].