Algorithms for Energy Conservation in Heterogeneous Data Centers

Power consumption is the major cost factor in data centers. It can be reduced by dynamically right-sizing the data center according to the currently arriving jobs. If there is a long period with low load, servers can be powered down to save energy. For identical machines, the problem has already been solved optimally by Lin et al. (2013) and Albers and Quedenfeld (2018). In this paper, we study how a data-center with heterogeneous servers can dynamically be right-sized to minimize the energy consumption. There are $d$ different server types with various operating and switching costs. We present a deterministic online algorithm that achieves a competitive ratio of $2d$ as well as a randomized version that is $1.58d$-competitive. Furthermore, we show that there is no deterministic online algorithm that attains a competitive ratio smaller than $2d$. Hence our deterministic algorithm is optimal. In contrast to related problems like convex body chasing and convex function chasing, we investigate the discrete setting where the number of active servers must be integral, so we gain truly feasible solutions.


Introduction
Energy management is an important issue in data centers. A huge amount of a data center's financial budget is spent on electricity that is needed to operate the servers as well as to cool them [12,20]. However, server utilization is typically low. In fact, there are data centers where the average server utilization is as low as 12% [16]; full processing power is needed on only a few days a year. Unfortunately, idle servers still consume about half of their peak power [29]. Therefore, right-sizing a data center by powering down idle servers can save a significant amount of energy. However, shutting down a server and powering it up immediately afterwards incurs much more cost than holding the server in the active state during this time period. The cost of powering up and down comprises not only the increased energy consumption but also, for example, wear-and-tear costs or the risk that the server does not work properly after restarting [26]. Consequently, algorithms are needed that manage the number of active servers to minimize the total cost, without knowing when new jobs will arrive in the future. Since about 3% of the global electricity production is consumed by data centers [11], a reduction of their energy consumption can also decrease greenhouse gas emissions. Thus, right-sizing data centers is important not only for economic but also for ecological reasons.
Modern data centers usually contain heterogeneous servers. If the capacity of a data center is no longer sufficient, it is extended by including new servers. The old servers are still used, however. Hence, there are different server types with various operating and switching costs in a data center. Heterogeneous data centers may also include different processing architectures. There can be servers that use GPUs to perform massively parallel calculations. However, GPUs are not suitable for all jobs. For example, tasks with many branches can be computed much faster on common CPUs than on GPUs [31].
Problem Formulation
We consider a data center with $d$ different server types. There are $m_j$ servers of type $j$. Each server has an active state, in which it is able to process jobs, and an inactive state, in which no energy is consumed. Powering up a server of type $j$ (i.e., switching from the inactive into the active state) incurs a cost of $\beta_j$ (called switching cost); powering down does not cost anything. We consider a finite time horizon consisting of the time slots $\{1, \dots, T\}$. For each time slot $t \in \{1, \dots, T\}$, jobs of total volume $\lambda_t \in \mathbb{N}_0$ arrive and have to be processed during the time slot. There must be at least $\lambda_t$ active servers to process the arriving jobs. We consider a basic setting where the operating cost of a server of type $j$ is load- and time-independent and denoted by $l_j \in \mathbb{R}_{\ge 0}$. Hence, an active server incurs a constant but type-dependent operating cost per time slot.
A schedule $X$ is a sequence $x_1, \dots, x_T$ with $x_t = (x_{t,1}, \dots, x_{t,d})$ where each $x_{t,j}$ indicates the number of active servers of type $j$ during time slot $t$. At the beginning and the end of the considered time horizon all servers are shut down, i.e., $x_0 = x_{T+1} = (0, \dots, 0)$. A schedule is called feasible if there are enough active servers to process the arriving jobs and if there are not more active servers than available, i.e., $\sum_{j=1}^{d} x_{t,j} \ge \lambda_t$ and $x_{t,j} \in \{0, 1, \dots, m_j\}$ for all $t \in \{1, \dots, T\}$ and $j \in \{1, \dots, d\}$. The cost of a feasible schedule is defined by
$$C(X) := \sum_{t=1}^{T} \left( \sum_{j=1}^{d} l_j x_{t,j} + \sum_{j=1}^{d} \beta_j (x_{t,j} - x_{t-1,j})^+ \right) \qquad (1)$$
where $(x)^+ := \max(x, 0)$. The switching cost is only paid for powering up. However, this is not a restriction, since all servers are inactive at the beginning and end of the workload. Thus the cost of powering down can be folded into the cost of powering up. A problem instance is specified by the tuple $I = (T, d, m, \beta, l, \Lambda)$ where $m = (m_1, \dots, m_d)$, $\beta = (\beta_1, \dots, \beta_d)$, $l = (l_1, \dots, l_d)$ and $\Lambda = (\lambda_1, \dots, \lambda_T)$. The task is to find a schedule with minimum cost.
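As a minimal sketch (not part of the original paper), the feasibility condition and the schedule cost can be written in Python as follows. The function names `is_feasible` and `schedule_cost` and the 0-indexed list layout are assumptions made for illustration. The cost follows the description above: every active server of type $j$ pays $l_j$ per time slot, and every power-up pays $\beta_j$.

```python
from typing import Sequence

Config = Sequence[int]  # x_t = (x_{t,1}, ..., x_{t,d}), stored 0-indexed


def is_feasible(schedule: Sequence[Config], demand: Sequence[int],
                servers: Sequence[int]) -> bool:
    """Check sum_j x_{t,j} >= lambda_t and 0 <= x_{t,j} <= m_j for every slot."""
    if len(schedule) != len(demand):
        return False
    return all(
        sum(x_t) >= lam_t and all(0 <= x <= m for x, m in zip(x_t, servers))
        for x_t, lam_t in zip(schedule, demand)
    )


def schedule_cost(schedule: Sequence[Config], switching: Sequence[float],
                  operating: Sequence[float]) -> float:
    """Operating cost of all active servers plus switching cost of all power-ups."""
    prev = [0] * len(switching)  # x_0 = (0, ..., 0): all servers start inactive
    total = 0.0
    for x_t in schedule:
        for j, x in enumerate(x_t):
            total += operating[j] * x                    # l_j per active server
            total += switching[j] * max(x - prev[j], 0)  # beta_j per power-up only
        prev = list(x_t)
    return total
```

For instance, with $\beta = (1, 4)$, $l = (3, 1)$ and demands $(1, 0, 1)$, keeping one server of type 2 active for all three time slots costs $4 + 3 \cdot 1 = 7$, whereas powering up a server of type 1 twice costs $2 \cdot (1 + 3) = 8$.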
We focus on the central case without inefficient server types. A server type $j$ is called inefficient if there is another server type $j' \ne j$ with both smaller (or equal) operating and switching costs, i.e., $l_j \ge l_{j'}$ and $\beta_j \ge \beta_{j'}$. This assumption is natural because a better server type with a lower operating cost usually has a higher switching cost. An inefficient server of type $j$ is only powered up if all servers of all types $j'$ with $\beta_{j'} \le \beta_j$ and $l_{j'} \le l_j$ are already running. Therefore, excluding inefficient servers is not a relevant restriction in practice. In related work, Augustine et al. [6] exclude inefficient states when operating a single server.
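The definition of an inefficient server type can be checked mechanically. The following sketch is an illustration only; the function name `inefficient_types` and the 0-indexed type numbering are assumptions, not notation from the paper.

```python
from typing import List, Sequence


def inefficient_types(switching: Sequence[float],
                      operating: Sequence[float]) -> List[int]:
    """Return the (0-indexed) types j for which another type j' != j exists
    with l_j >= l_j' and beta_j >= beta_j'.

    Note: two types with identical costs dominate each other under this
    definition, so both are reported.
    """
    d = len(switching)
    return [
        j for j in range(d)
        if any(
            jp != j
            and operating[j] >= operating[jp]
            and switching[j] >= switching[jp]
            for jp in range(d)
        )
    ]
```

With $\beta = (1, 4, 5)$ and $l = (3, 1, 2)$, only the third type is reported, because the second type is at least as cheap in both cost components.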
Our contribution
We analyze the online setting of this problem where the job volumes $\lambda_t$ arrive one by one. The vector of active servers $x_t$ has to be determined without knowledge of future jobs $\lambda_{t'}$ with $t' > t$. A main contribution of our work, compared to previous results, is that we investigate heterogeneous data centers and examine the online setting when truly feasible (integral) solutions are sought.
In Section 2, we present a $2d$-competitive deterministic online algorithm, i.e., the total cost of the schedule calculated by our algorithm is at most $2d$ times the cost of an optimal offline solution. Roughly, our algorithm works as follows. It calculates an optimal schedule for the jobs received so far and ensures that the operating cost of the active servers is at most as large as the operating cost of the active servers in the optimal schedule. If this is not the case, servers with high operating cost are replaced by servers with low operating cost. If a server is not used for a specific duration depending on its switching and operating costs, it is shut down.
In Section 3, we devise a randomized version of our algorithm achieving a competitive ratio of $\frac{e}{e-1} d \approx 1.582d$ against an oblivious adversary.
In Section 4, we show that there is no deterministic online algorithm that achieves a competitive ratio smaller than $2d$. Therefore, our algorithm is optimal. Additionally, for a data center that contains $m$ unique servers (that is, $m_j = 1$ for all $j \in \{1, \dots, d\}$), we show that the best achievable competitive ratio is $2m$.
Related work
The design of energy-efficient algorithms has received considerable research interest in recent years, see e.g. [10,21,3] and references therein. Specifically, data center right-sizing has attracted considerable attention lately. Lin and Wierman [25,26] analyzed the data center right-sizing problem for data centers with identical servers ($d = 1$). The operating cost is load-dependent and modeled by a convex function. In contrast to our setting, continuous solutions are allowed, i.e., the number of active servers $x_t$ can be fractional. This allows for other techniques in the design and analysis of an algorithm, but the created schedules cannot be used directly in practice. They gave a 3-competitive deterministic online algorithm for this problem. Bansal et al. [9] improved this result by randomization and developed a 2-competitive online algorithm. In our previous paper [1], we showed that 2 is a lower bound for randomized algorithms in the continuous setting; this result was independently shown by [4]. Furthermore, we analyzed the discrete setting of the problem where the number of active servers is integral ($x_t \in \mathbb{N}_0$). We presented a 3-competitive deterministic and a 2-competitive randomized online algorithm. Moreover, we proved that these competitive ratios are optimal.
Data center right-sizing of heterogeneous data centers is related to convex function chasing, which is also known as smoothed online convex optimization [15]. At each time slot $t$, a convex function $f_t$ arrives. The algorithm then has to choose a point $x_t$ and pay the cost $f_t(x_t)$ as well as the movement cost $\|x_t - x_{t-1}\|$ where $\|\cdot\|$ is any metric. The problem described by equation (1) is a special case of convex function chasing if fractional schedules are allowed, i.e., $x_{t,j} \in [0, m_j]$ instead of $x_{t,j} \in \{0, \dots, m_j\}$. The operating cost $\sum_{j=1}^{d} l_j x_{t,j}$ in equation (1) together with the feasibility requirements can be modeled as a convex function that is infinite for $\sum_{j=1}^{d} x_{t,j} < \lambda_t$ or $x_{t,j} \notin [0, m_j]$. The switching cost equals the Manhattan metric if the number of servers is scaled appropriately. Sellke [30] gave a $(d+1)$-competitive algorithm for convex function chasing. A similar result was found by Argue et al. [5].
In the discrete setting, convex function chasing has a competitive ratio that is at least exponential in $d$, as the following setting shows. Let $m_j = 1$ and $\beta_j = 1$ for all $j \in \{1, \dots, d\}$, so the possible server configurations are $\{0, 1\}^d$. The arriving convex functions $f_t$ are infinite for the current position $x_{t-1}$ of the online algorithm and 0 for all other positions $\{0, 1\}^d \setminus \{x_{t-1}\}$. After $T := 2^d - 1$ functions have arrived, the switching cost paid by the algorithm is at least $2^d - 1$ (otherwise it has to pay infinite operating costs), whereas the offline schedule can go directly to a position without any operating cost and only pays a switching cost of at most $d$.
Already for the 1-dimensional case (i.e., identical machines), it is not trivial to round a fractional schedule without increasing the competitive ratio (see [26] and [2]). In $d$-dimensional space, it is completely unclear whether continuous solutions can be rounded without arbitrarily increasing the total cost. Simply rounding up can lead to arbitrarily large switching costs, for example if the fractional solution rapidly switches between $1$ and $1 + \epsilon$. Using a randomized rounding scheme like in [2] (that was used for homogeneous data centers) independently for each dimension can result in an infeasible schedule (for example, if $\lambda_t = 1$ and $x_t = (1/d, \dots, 1/d)$ is rounded down to $(0, \dots, 0)$). Therefore, Sellke's result does not help us in analyzing the discrete setting. Other publications handling convex function chasing or convex body chasing are [17,8,13].
Goel and Wierman [19] developed a $(3 + O(1/\mu))$-competitive algorithm called Online Balanced Descent (OBD) for convex function chasing, where the arriving functions were required to be $\mu$-strongly convex. We remark that the operating cost defined by equation (1) is not strongly convex, i.e., $\mu = 0$. Hence their result cannot be used for our problem. A similar result is given by Chen et al. [15], who showed that OBD is $(3 + O(1/\alpha))$-competitive if the arriving functions are locally $\alpha$-polyhedral. In our case, $\alpha = \min_{j \in \{1, \dots, d\}} l_j / \beta_j$, so $\alpha$ can be arbitrarily small depending on the problem instance.
Another similar problem is the Parking Permit Problem by Meyerson [28]. There are $d$ different permits which can be purchased for $\beta_j$ dollars and have a duration of $D_j$ days. Certain days are driving days where at least one parking permit is needed ($\lambda_t \in \{0, 1\}$). The permit cost corresponds to our switching cost. However, the duration of the permit is fixed to $D_j$, whereas in our problem the online algorithm can choose for each time slot whether it wants to power down a server. Furthermore, there is no operating cost. Even if each server type is replaced by an infinite number of permits with the duration $t$ and the cost $\beta_j + l_j \cdot t$, it is still a different problem, because the algorithm has to choose the time slot for powering down in advance (when the server is powered up).

Deterministic Online Algorithm
In this section we present a deterministic $2d$-competitive online algorithm for the problem described in the preceding section. The basic idea of our algorithm is to calculate an optimal schedule for the problem instance that ends at the current time slot. Based on this schedule, we decide when a server is powered up. If a server is idle for a specific time, it is powered down.
Formally, given the original problem instance $I = (T, d, m, \beta, l, \Lambda)$, the shortened problem instance $I^t$ is defined by $I^t := (t, d, m, \beta, l, \Lambda^t)$ with $\Lambda^t = (\lambda_1, \dots, \lambda_t)$. Let $\hat X^t$ denote an optimal schedule for $I^t$ and let $X^A$ be the schedule calculated by our algorithm $A$.
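The paper does not spell out at this point how an optimal schedule for a shortened instance is obtained. For very small instances, one simple possibility (an assumption for illustration, not the authors' method) is a brute-force dynamic program over all server configurations; its running time is exponential in $d$. The name `optimal_schedule` is hypothetical, and the sketch does not implement the additional tie-breaking properties that algorithm $A$ requires of its optimal schedules.

```python
from itertools import product
from typing import List, Sequence, Tuple


def optimal_schedule(demand: Sequence[int], servers: Sequence[int],
                     switching: Sequence[float], operating: Sequence[float]
                     ) -> Tuple[float, List[Tuple[int, ...]]]:
    """Brute-force DP: return (minimum cost, one schedule attaining it)."""
    d = len(servers)
    # All configurations x with 0 <= x_j <= m_j.
    configs = list(product(*(range(m + 1) for m in servers)))

    def step_cost(prev, cur):
        # Operating cost in the current slot plus cost of newly powered-up servers.
        return sum(operating[j] * cur[j] + switching[j] * max(cur[j] - prev[j], 0)
                   for j in range(d))

    # best[x] = (cheapest cost of a feasible prefix ending in x, that prefix)
    best = {(0,) * d: (0.0, [])}
    for lam in demand:
        nxt = {}
        for cur in configs:
            if sum(cur) < lam:  # not enough active servers for this slot
                continue
            cost, path = min(
                ((c + step_cost(prev, cur), p) for prev, (c, p) in best.items()),
                key=lambda cp: cp[0],
            )
            nxt[cur] = (cost, path + [cur])
        if not nxt:
            raise ValueError("demand exceeds the total number of servers")
        best = nxt
    return min(best.values(), key=lambda cp: cp[0])
```

An optimal schedule for the shortened instance with the first `t` time slots is then obtained by passing the prefix `demand[:t]`.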
W.l.o.g. there are no server types with the same operating and switching costs, i.e., $\beta_j = \beta_{j'}$ and $l_j = l_{j'}$ implies $j = j'$. Furthermore, let $l_1 > \dots > l_d$, i.e., the server types are sorted by their operating costs. Since inefficient server types are excluded, this implies that $\beta_1 < \dots < \beta_d$.
We separate a problem instance into $m := \sum_{j=1}^{d} m_j$ lanes. At time slot $t$, there is a single job in lane $k \in [m]$ if and only if $k \le \lambda_t$. We can assume that $\lambda_t \le m$ holds for all $t \in [T]$, because otherwise there is no feasible schedule for the problem instance. Let $X$ be an arbitrary feasible schedule with $x_t = (x_{t,1}, \dots, x_{t,d})$. We define
$$y_{t,k} := \begin{cases} \max\{ j \in [d] : \sum_{j'=j}^{d} x_{t,j'} \ge k \} & \text{if } k \le \sum_{j=1}^{d} x_{t,j} \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$
to be the server type that handles the $k$-th lane during time slot $t$ (see Figure 1). If $y_{t,k} = 0$, then there is no active server in lane $k$ at time slot $t$. By definition, the values $y_{t,1}, \dots, y_{t,m}$ are sorted in descending order, i.e., $y_{t,k} \ge y_{t,k'}$ for $k < k'$. Note that $y_{t,k} = 0$ implies $\lambda_t < k$, because otherwise there are not enough active servers to handle the jobs at time $t$. For the schedule $\hat X^t$, the server type used in lane $k$ at time slot $t'$ is denoted by $\hat y^t_{t',k}$.
Our algorithm works as follows: First, an optimal solution $\hat X^t$ is calculated. If there are several optimal schedules, we choose a schedule that fulfills the inequality $\hat y^t_{t',k} \ge \hat y^{t-1}_{t',k}$ for all time slots $t' \in [t]$ and lanes $k \in [m]$, so $\hat X^t$ never uses smaller server types than the previous schedule $\hat X^{t-1}$. We will see in Lemma 5 that such a schedule exists and how to construct it.
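The lane representation can be illustrated with a short sketch. It assumes, in line with the descending order stated above, that the lanes are filled with the largest server type first; the function name `lanes` is hypothetical.

```python
from typing import List, Sequence


def lanes(x_t: Sequence[int], total_lanes: int) -> List[int]:
    """Map x_t = (x_{t,1}, ..., x_{t,d}) to [y_{t,1}, ..., y_{t,m}].

    Server types are 1-indexed in the result; 0 means "no active server".
    The largest type occupies the lowest lanes, so the result is sorted
    in descending order.
    """
    y: List[int] = []
    for j in range(len(x_t), 0, -1):  # largest server type first
        y.extend([j] * x_t[j - 1])
    y.extend([0] * (total_lanes - len(y)))  # remaining lanes are empty
    return y
```

For example, with $d = 3$, $m = 5$ and $x_t = (1, 0, 2)$, the two servers of type 3 occupy lanes 1 and 2, the server of type 1 occupies lane 3, and lanes 4 and 5 stay empty.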
If there is a server type $j$ with $l_j = 0$, then in an optimal schedule such a server can be powered up before it is needed, although $\lambda_t = 0$ holds for this time slot. Similarly, such a server can run for more time slots than necessary. W.l.o.g. let $\hat X^t$ be a schedule where servers are powered up as late as possible and powered down as early as possible.
Beginning from the lowest lane ($k = 1$), it is ensured that $A$ uses a server type that is not smaller than the server type used by $\hat X^t$, i.e., $y^A_{t,k} \ge \hat y^t_{t,k}$ must be fulfilled. If the server type $y^A_{t-1,k}$ used in the previous time slot is smaller than $\hat y^t_{t,k}$, it is powered down and server type $\hat y^t_{t,k}$ is powered up. A server of type $j$ that is not replaced by a greater server type stays active for $\bar t_j := \lfloor \beta_j / l_j \rfloor$ time slots. If $\hat X^t$ uses a smaller server type $j' \le j$ in the meantime, then server type $j$ will run for at least $\bar t_{j'}$ further time slots (including time slot $t$). Formally, a server of type $j$ in lane $k$ is powered down at time slot $t$ if $\hat y^{t'}_{t',k} \ne j'$ holds for all server types $j' \le j$ and time slots $t' \in [t - \bar t_{j'} + 1 : t]$. The pseudocode below clarifies how algorithm $A$ works. The variables $e_k$ for $k \in [m]$ store the time slot when the server in the corresponding lane will be powered down. Figure 2 visualizes how the schedule $X^A$ changes from time slot $t - 1$ to $t$.
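The per-lane update rule described above can be sketched in Python. This is an illustration under stated assumptions, not the authors' implementation: the callback `opt_lane_type(t, k)` is a hypothetical interface that returns the server type used in lane $k$ at time slot $t$ by the optimal schedule for the instance ending at $t$, and a server type with operating cost 0 is assumed to have an unbounded running time.

```python
import math
from typing import Callable, List, Sequence


def run_algorithm_a(T: int, m: int, switching: Sequence[float],
                    operating: Sequence[float],
                    opt_lane_type: Callable[[int, int], int]) -> List[List[int]]:
    """Return the lane types chosen for t = 1..T (server types are 1-indexed)."""
    # Running time floor(beta_j / l_j) per type; index 0 ("no server") maps to 0.
    run_time = [0] + [math.floor(b / l) if l > 0 else math.inf
                      for b, l in zip(switching, operating)]
    y_prev = [0] * (m + 1)  # lane types of the previous slot (index 0 unused)
    e = [0] * (m + 1)       # e[k]: slot at which the server in lane k is powered down
    schedule = []
    for t in range(1, T + 1):
        y_cur = [0] * (m + 1)
        for k in range(1, m + 1):
            y_hat = opt_lane_type(t, k)
            if y_prev[k] < y_hat or t >= e[k]:
                # Previous type is too small or its running time expired:
                # switch to the type used by the optimal schedule.
                y_cur[k] = y_hat
                e[k] = t + run_time[y_hat]
            else:
                # Keep the current server and possibly extend its running time.
                y_cur[k] = y_prev[k]
                e[k] = max(e[k], t + run_time[y_hat])
        schedule.append(y_cur[1:])
        y_prev = y_cur
    return schedule
```

With $\beta = (1, 4)$ and $l = (3, 1)$, a server of type 2 has a running time of $\lfloor 4/1 \rfloor = 4$, so once it is powered up and no longer used by the optimal schedules, it stays active for four time slots in total before it is shut down.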
Algorithm 1: Algorithm $A$
1: for $t := 1$ to $T$ do
2:   Calculate $\hat X^t$ such that $\hat y^t_{t',k} \ge \hat y^{t-1}_{t',k}$ for all $t' \in [t]$ and $k \in [m]$
3:   for $k := 1$ to $m$ do
4:     if $y^A_{t-1,k} < \hat y^t_{t,k}$ or $t \ge e_k$ then
5:       $y^A_{t,k} := \hat y^t_{t,k}$
6:       $e_k := t + \bar t_{y^A_{t,k}}$
7:     else
8:       $y^A_{t,k} := y^A_{t-1,k}$
9:       $e_k := \max\{e_k, t + \bar t_{\hat y^t_{t,k}}\}$ where $\bar t_0 := 0$
Figure 2: The schedule of $A$ (upper plot) at $t - 1$ is shown in blue, the changes after reacting to $\lambda_t$ are printed in green. The optimal schedule $\hat X^t$ is shown in the lower plot in red. Let $\bar t_j := j$. In the lowest lane $k = 1$, we have $y^A_{t,1} = 6 \ge 5 = \hat y^t_{t,1}$, so server type $y^A_{t,1}$ will run for at least $\bar t_5 = 5$ further time slots (including the current time slot $t$), i.e., $y^A_{t,1}$ will be powered down after time slot $t + 4$. In lane $k = 2$, the server type used at $t - 1$ is powered down and replaced by $\hat y^t_{t,2} = 4$. In lane $k = 3$, algorithm $A$ has no active server during time slot $t - 1$, so server type $\hat y^t_{t,3} = 3$ is powered up.
Structure of optimal schedules
Before we can analyze the competitiveness of algorithm $A$, we have to show that an optimal schedule with the desired properties required by line 2 actually exists. First, we will investigate basic properties of optimal schedules. The following lemma shows that in an optimal schedule, a server of type $j$ that runs in lane $k$ does not change the lane while being active.
Proof. Let $\hat y_{t-1,k} = j$ and $\hat y_{t,k} \ne j$. To get a contradiction, assume that there exists a lane $k' \ne k$ with $\hat y_{t-1,k'} \ne j$ and $\hat y_{t,k'} = j$. We distinguish between the cases (1) $k' < k$ and (2) $k' > k$. In case 1, the server type $\hat y_{t-1,k'}$ must be greater than $j$, since the server types are sorted. Furthermore, at time slot $t - 1$ there are at least $k'$ active servers whose types are greater than $j$, and at time slot $t$ there are at most $k' - 1$ active servers whose types are greater than $j$. Therefore a server of type $j' > j$ is powered down after $t - 1$. Let $t' > t$ be the first time slot where $\hat x_{t',j} < \hat x_{t,j}$. By replacing one server of type $j$ during the time slots $[t : t' - 1]$ by $j'$ (i.e., $j'$ is not powered down at $t$, but instead at $t'$), we reduce the operating cost without increasing the switching cost. Therefore, $\hat X$ cannot be an optimal schedule.
Case 2 works analogously: we have $k' > k$, so the server type $\hat y_{t,k}$ must be greater than $j$. At time slot $t - 1$ there are at most $k - 1$ active servers whose types are greater than $j$, and at time slot $t$ there are at least $k$ active servers whose types are greater than $j$. Therefore a server of type $j' > j$ is powered up after $t - 1$. Let $t' < t$ be the last time slot where $\hat x_{t',j} < \hat x_{t,j}$. We replace server type $j$ during $[t' + 1 : t]$ by $j'$. The total cost is decreased by this transformation, so $\hat X$ cannot be an optimal schedule. Therefore, a lane $k' \ne k$ with $\hat y_{t-1,k'} \ne j$ and $\hat y_{t,k'} = j$ cannot exist.
The next lemma shows that in an optimal schedule, a server is only powered up or powered down if the number of jobs is increased or decreased, respectively. Here, $\lambda_{t,k} = 1$ indicates that there is a job in lane $k$ at time slot $t$ (i.e., $k \le \lambda_t$), and $\lambda_{t,k} = 0$ otherwise.
Lemma 2. Let $\hat X$ be an optimal schedule. If $\hat y_{t-1,k} > 0$ and $\hat y_{t,k} = 0$, then $\lambda_{t-1,k} = 1$ and $\lambda_{t,k} = 0$. Analogously, $\hat y_{t-1,k} = 0$ and $\hat y_{t,k} > 0$ implies $\lambda_{t-1,k} = 0$ and $\lambda_{t,k} = 1$.
Proof. Let $\hat y_{t-1,k} > 0$ and $\hat y_{t,k} = 0$. By Lemma 1, we know that a server of type $j := \hat y_{t-1,k}$ is powered down after time slot $t - 1$. There cannot be a job in lane $k$ at time $t$, because there is no active server in $\hat X$, so $\lambda_{t,k} = 0$. Assume that there is no job in the previous time slot, i.e., $\lambda_{t-1,k} = 0$. Then we get a better schedule by powering down the server in lane $k$ one time slot earlier (i.e., after time slot $t - 2$), because the operating cost is reduced by $l_j$, so $\hat X$ would not be optimal. Therefore, $\lambda_{t-1,k} = 1$ must hold. For $\hat y_{t-1,k} = 0$ and $\hat y_{t,k} > 0$ the proof works analogously.
The following lemma shows that in an optimal schedule, the server type in a given lane $k$ does not change immediately, i.e., there must be at least one time slot where no server is running in lane $k$.
Proof. Assume that this lemma does not hold. Let $t$ be the first time slot and $k$ the lowest lane during this time slot where $\hat y_{t-1,k} > 0$ and $\hat y_{t,k} > 0$, but $\hat y_{t-1,k} \ne \hat y_{t,k}$. To simplify the notation, let $j := \hat y_{t-1,k}$ and $j' := \hat y_{t,k}$. We distinguish between the cases (1) $j < j'$ and (2) $j > j'$. In case 1, let $t' < t$ be the last time slot where the server type $j$ in lane $k$ was powered up. By replacing server type $j$ by $j'$ during $[t' : t - 1]$, we reduce the operating cost without increasing the switching cost. If this violates the condition $\hat x_{t,j'} \le m_{j'}$, we instead choose the last time slot $t'' \in [t' + 1 : t - 1]$ where $j'$ is powered down. By replacing $j$ with $j'$ during $[t'' + 1 : t - 1]$ we reduce the operating cost and save the cost for powering up server type $j'$. It can happen that $j$ has to be powered up one more time; however, the switching cost of $j$ is smaller than the switching cost of $j'$, so the total switching cost is reduced. Case 2 works analogously. We have shown that the total cost can be decreased, so $\hat X$ would not be an optimal schedule. Therefore, the lemma must hold.
Given the optimal schedules Xu and Xv with u < v, we construct a minimum schedule X min(u,v) with y min(u,v) t,k := min{ŷ u t,k , ŷv t,k }.Furthermore, we construct a maximum schedule X max(u,v) as follows.Let z l (t, k) be the last time slot t ′ < t with ŷu t ′ ,k = ŷv t ′ ,k = 0 (no active servers in both schedules) and let z r (t, k) be the first time slot t ′ > t with ŷu Another way to construct X max(u,v) is as follows.First, we take the maximum of both schedules (analogously to X min(u,v) ).However, this can lead to situations where the server type changes immediately, so the necessary condition for optimal schedules would not be fulfilled.Therefore, we replace the lower server type by the greater one until there are no more immediate server changes.This construction is equivalent to equation (3).We will see in Lemma 5 that the maximum schedule is an optimal schedule for I v and fulfills the property required by algorithm A in line 2, which says that the server type used in lane k at time t never decreases when the considered problem instance is expanded.To prove this property, first we have to show that X min(u,v) and X max(u,v) are feasible schedules for the problem instances I u and I v , respectively.Lemma 4. X min(u,v) and X max(u,v) are feasible for I u and I v , respectively.
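For a single lane, the two constructions can be sketched as follows. The sketch follows the verbal description above: the minimum schedule takes the smaller server type per time slot, and the maximum schedule takes the larger type per time slot and then replaces the lower type by the greater one within every uninterrupted run of active time slots. The function names and the convention that the shorter schedule is padded with zeros are assumptions made for illustration.

```python
from typing import List, Sequence


def min_lane(y_u: Sequence[int], y_v: Sequence[int]) -> List[int]:
    """Per-slot minimum of both lanes, restricted to the shorter horizon u."""
    return [min(a, b) for a, b in zip(y_u, y_v)]


def max_lane(y_u: Sequence[int], y_v: Sequence[int]) -> List[int]:
    """Per-slot maximum, with immediate server type changes eliminated."""
    padded_u = list(y_u) + [0] * (len(y_v) - len(y_u))  # no servers after slot u
    raw = [max(a, b) for a, b in zip(padded_u, y_v)]
    result = list(raw)
    start = None
    for t, y in enumerate(raw + [0]):  # the sentinel 0 closes a trailing run
        if y > 0 and start is None:
            start = t                   # a run of active slots begins
        elif y == 0 and start is not None:
            # Use the greatest type of the run in all of its slots.
            result[start:t] = [max(raw[start:t])] * (t - start)
            start = None
    return result
```

For the lanes $(1, 1, 0)$ and $(0, 2, 2, 0, 1)$, the per-slot maximum $(1, 2, 2, 0, 1)$ would change from type 1 to type 2 without an idle slot in between, so the first run is replaced by type 2 throughout.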
Proof.(a) Feasibility of X min (u,v)  First, we will show that the demand requirements are fulfilled, so for all k ∈ [m] and t ∈ Assume that this is not the case, so there exists a time slot t and a server type j with x min(u,v) t,j > m j .Since the server types of Xu and Xv are sorted, the server types of X min(u,v) are sorted too.Thus, there must be at least m j + 1 consecutive lanes with y min(u,v) t,k = j.Let k + be the topmost and k − be the lowermost lane with y min(u,v) t,k = j.W.l.o.g.let ŷu t,k + = j (the case ŷv t,k + = j works analogously), so ŷv t,k + ≥ j.It is not possible that ŷu t,k − = j, because then there would be m j + 1 active servers of type j in Xu .On the other hand, ŷv t,k − = j implies that ŷv t,k + = j, since the server types are sorted, so there would be at least m j + 1 active servers of type j in Xv .Thus, our assumption was wrong and X min(u,v) is a feasible schedule for Consider the schedule X with ỹt,k := max{ŷ u t,k , ŷv t,k } (similar to X max(u,v) , but without eliminating immediate server changes).Analogous to part (a), it can be shown that X is a feasible schedule for I v .Furthermore, we observe that the server types of X are sorted for a given time slot, since the server types of Xu and Xv are sorted.Taking the maximum preserves this order.
The schedule X max(u,v) fulfills the demand requirements of I v , because ỹt,k > 0 implies Assume that there are more active servers in X max(u,v) than available, i.e., there exists a time slot t ∈ [v] and a server type j ∈ [d] with x max(u,v) t,j > m j .Let k + be the topmost lane with There must be a time slot t ′ such that ỹt ′ ,k + = j and y Since the server types in X are sorted and since X is a feasible schedule, ỹt ′ ,k − > j holds, because ỹt ′ ,k − = j would imply that X uses server type j in all lanes k ∈ [k However, for all t ′′ between t and t ′ we have y max(u,v) t ′′ ,k − > 0, since there is an active server in the higher lane k + , so y j which is a contradiction to our assumption.Therefore, X max(u,v) is a feasible schedule for I v .Now, we are able to show that the maximum schedule is optimal for the problem instance Proof.To simplify the notation, let X min := X min(u,v) and X max := X max(u,v) .Since Xu and Xv are optimal schedules for I u and I v , respectively, we know from Lemma 4 that C( Xu ) ≤ C(X min ) and C( Xv ) ≤ C(X max ).In the following we will show that C(X min ) + C(X max ) ≤ C( Xu ) + C( Xv ) which implies that X min must be an optimal schedule for I u and X max must be an optimal schedule for I v .First, we compare the operating cost and afterwards the switching cost of the schedules.
The operating costs of Xu and Xv in lane k at time slot t are with l 0 := 0 (if y = 0, then there is no active server, so the operating cost for this time slot is zero).Note that by definition of X min and l max{ŷ u t,k ,ŷ v t,k } ≥ l y max t,k because max{ŷ u t,k , ŷv t,k } ≤ y max t,k .Inequality (4) indicates that the sum of the operating costs of Xmin and Xmax are smaller than or equal to the sum of the operating costs of Xu and Xv .In the following we will show that the same holds for the switching costs.
Each lane $k$ in the schedule $X^{\max}$ is divided into blocks such that at the beginning of a block a server is powered up and at the end of the block it is powered down. In the following we consider one single block. Let $j$ denote the server type used in that block and let $a$ and $b$ denote the start and end time slot, respectively. Note that in the time slots immediately before the beginning and after the end of the block there is no active server in both $\hat X^u$ and $\hat X^v$, i.e., $\hat y^u_{a-1,k} = \hat y^v_{a-1,k} = 0$ and $\hat y^u_{b+1,k} = \hat y^v_{b+1,k} = 0$. For $t \in [a : b]$, there is always an active server in at least one of the schedules. For the time interval $[a : b]$ we divide lane $k$ of the schedules $\hat X^u$, $\hat X^v$ and $X^{\min(u,v)}$ into blocks $B^w_1, \dots, B^w_{n_w}$ with $w \in \{u, v, \min\}$ such that at the beginning of the block a server is powered up and at the end of the block it is powered down. Let $j^w_i$ denote the server type used in block $B^w_i$ with $w \in \{u, v, \min\}$ and $i \in [n_w]$.
In $\hat X^u$ or $\hat X^v$ (or both) there must be one block $B^{\max}$ with $j^w_i = j$ where $w \in \{u, v\}$ (if there are several blocks that fulfill this property, then we choose an arbitrary one). Let $s^{\max}$ denote the start time slot of $B^{\max}$. Let $B^-$ be the blocks in $\hat X^u$ and $\hat X^v$ that start before $s^{\max}$ and let $B^+$ be the blocks that start after $s^{\max}$. Note that $\{B^-, B^+, \{B^{\max}\}\}$ is a partition of the set of all blocks $B^w_i$ with $w \in \{u, v\}$ and $i \in [n_w]$. Each block $B^{\min}_i$ which starts before $s^{\max}$ is mapped to the block in $B^-$ which has the same end time slot. There must be a block $B^{\min}_i$ which starts at $s^{\max}$. This block is mapped to the last block in $B^-$ (which cannot end before $s^{\max}$, so it was not mapped yet). Each block $B^{\min}_i$ which starts after $s^{\max}$ is mapped to the block in $B^+$ which has the same start time slot. The mapping procedure is visualized in Figure 3. It ensures that all blocks $B^{\min}_i$ with $i \in [n_{\min}]$ are mapped to a block of $\hat X^u$ or $\hat X^v$, but not to the block $B^{\max}$. Since $X^{\min}$ uses the smaller server type of $\hat X^u$ and $\hat X^v$, the switching cost of $B^{\min}_i$ is smaller than or equal to the switching cost of the mapped block $B^w_i$ with $w \in \{u, v\}$.
Let $\beta(B)$ denote the switching cost of block $B$. The switching costs of $\hat X^u$ and $\hat X^v$ in lane $k$ during the time interval $[a : b]$ are equal to $\beta(B^{\max}) + \sum_{B \in B^- \cup B^+} \beta(B)$. The switching cost of $X^{\min}$ in lane $k$ during $[a : b]$ is at most $\sum_{B \in B^- \cup B^+} \beta(B)$ and the switching cost of $X^{\max}$ is exactly $\beta(B^{\max})$, because $X^{\max}$ only consists of one single block. By using this result for all blocks of $X^{\max}$ and with equation (4), we get $C(X^{\min}) + C(X^{\max}) \le C(\hat X^u) + C(\hat X^v)$, which implies that $X^{\max}$ must be an optimal schedule for $I^v$.
Feasibility
In the following, let $\{\hat X^1, \dots, \hat X^T\}$ be optimal schedules that fulfill the inequality $\hat y^t_{t',k} \ge \hat y^{t-1}_{t',k}$ for all $t, t' \in [T]$ and $k \in [m]$ as required by algorithm $A$. Lemma 5 ensures that such a schedule sequence exists (and also shows how to construct it). Before we can prove that algorithm $A$ is $2d$-competitive, we have to show that the computed schedule $X^A$ is feasible.
The following lemma shows that the running times $\bar t_j$ are sorted in ascending order, i.e., $\bar t_1 \le \dots \le \bar t_d$. In other words, the higher the server type is, the longer it stays in the active state.
Lemma 6. For $j < j'$, $\bar t_j \le \bar t_{j'}$ holds.
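The proof of this lemma is not reproduced here. A short derivation, assuming strictly positive operating costs and using only that the server types are sorted by decreasing operating cost and that inefficient types are excluded (so the switching costs increase with the type), could read as follows.

```latex
\begin{align*}
j < j'
  &\;\Longrightarrow\; l_j > l_{j'} > 0 \ \text{ and } \ \beta_j < \beta_{j'} \\
  &\;\Longrightarrow\; \frac{\beta_j}{l_j} \le \frac{\beta_j}{l_{j'}} < \frac{\beta_{j'}}{l_{j'}} \\
  &\;\Longrightarrow\; \bar t_j = \left\lfloor \frac{\beta_j}{l_j} \right\rfloor
       \le \left\lfloor \frac{\beta_{j'}}{l_{j'}} \right\rfloor = \bar t_{j'}
\end{align*}
```

The last step only uses that the floor function is monotonically non-decreasing.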
In an optimal schedule Xt , the values ŷt t ′ ,1 , . . ., ŷt t ′ ,m are sorted in descending order by definition.This also holds for the schedule calculated by our algorithm.(2) Assume that there exist t ∈ [T ] and j ∈ [d] such that x A t,j > m j .Let t be the first time slot where algorithm A wants to use server type j in lane k, although it is used already m j times in the lower lanes during the same time slot.Let K be the set of lanes where j is already used, i.e., y A t,k ′ = j for all k ′ ∈ K ⊆ [k − 1].We differ between case 1 where y A t,k is set in line 5 and case 2 where y A t,k is set in line 8.In the first case (y A t,k is set in line 5), we know that Xt uses j in lane k.Since the server types of Xt are sorted, the server types of Xt in the lower lanes cannot be smaller than k.Formally, we have ŷt t,k ′ ≥ j for all k ′ ∈ [k].In the lanes where A uses server type j, the optimal schedule Xt cannot use a greater server type.Thus, there are exactly m j lanes below lane k where ŷt t,k ′ = j holds, so Xt cannot use j in lane k.In the second case (y A t,k is set in line 8), we know that y A t−1,k = j, but x A t−1,j ≤ m j , so there must be a lane k ′ ∈ K with y A t−1,k ′ > j by Lemma 7. We consider the time slot t ′ when the value of e k has changed for the last time.Formally, let t ′ < t be the last time slot such that t , the runtime of y A t ′ ,k in lane k ′ was extended at time slot t ′ , so it still runs during time slot t.This is a contradiction to y A t,k ′ = j.
Competitiveness
To show the competitiveness of $A$, we divide the schedule $X^A$ into blocks $A_{t,k}$ with $t \in [T]$ and $k \in [m]$. Each block $A_{t,k}$ is described by its creation time $t$, its start time $s_{t,k}$, its end time $e_{t,k}$, the used server type $j_{t,k}$ and the corresponding lane $k$. The start time is the time slot when $j_{t,k}$ is powered up and the end time is the first time slot when $j_{t,k}$ is inactive, i.e., during the time interval $[s_{t,k} : e_{t,k} - 1]$ the server of type $j_{t,k}$ is in the active state.
There are two types of blocks: new blocks and extended blocks. A new block starts when a new server is powered up, i.e., lines 5 and 6 of algorithm $A$ are executed because $y^A_{t-1,k} < \hat y^t_{t,k}$ or $t \ge e_k \wedge y^A_{t-1,k} > \hat y^t_{t,k} \wedge \hat y^t_{t,k} > 0$ (in words: the previous block ends and $\hat X^t$ has an active server in lane $k$, but the server type is smaller than the server type used by $A$ in the previous time slot). It ends after $\bar t_{y^A_{t,k}}$ time slots. Thus $s_{t,k} := t$ and $e_{t,k} := t + \bar t_{y^A_{t,k}}$ (i.e., $e_{t,k}$ equals $e_k$ after executing line 6).
An extended block is created when the running time of a server is extended, i.e., the value of e k is updated, but the server type remains the same (that is     Proof.If A t,k is a new block, its length is tj with j = j t,k .Therefore the total cost of A t,k is β j + l j tj = β j + l j ⌊β j /l j ⌋ ≤ 2β j .If A t,k is an extended block, then the server j = j t,k is already running, so there is no switching cost and C(A t,k ) = l j d t,k .
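The bound on the cost of a new block used in the proof above follows from $\lfloor x \rfloor \le x$. Written out as a single chain (assuming $l_j > 0$, with $\bar t_j = \lfloor \beta_j / l_j \rfloor$ as defined for algorithm $A$):

```latex
\[
C(A_{t,k}) \;=\; \beta_j + l_j \, \bar t_j
\;=\; \beta_j + l_j \left\lfloor \frac{\beta_j}{l_j} \right\rfloor
\;\le\; \beta_j + l_j \cdot \frac{\beta_j}{l_j}
\;=\; 2 \beta_j .
\]
```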
To show the competitiveness of algorithm A, we introduce another variable that will be used in Lemmas 11 and 12. Let ỹu t,k := max be the largest server type used in lane k by the schedule Xt ′ at time slot t ′ for t ′ ∈ [t : u].The next lemma shows that ỹu t,k is monotonically decreasing with respect to t as well as k and increasing with respect to u.
By using equations ( 6), ( 7) and ( 8), we get ỹu The cost of schedule X in lane k during time slot t is denoted by The total cost of X can be written as C(X) = T t=1 m k=1 C t,k (X).The technical lemma below will be needed for our induction proof in Theorem 13.Given the optimal schedules Xu and Xv with u < v, the inequality We consider the schedule Xu which is constructed in two steps.First, we insert the schedule Xv for all lanes k and time slots t where ỹu t,k,j = 1 holds into Xu .Afterwards, we eliminate immediate server changes after ỹu t,k,j switches from 1 to 0 by using the greater server type (equivalent to the construction of the maximum schedule).By Lemma 10, ỹu t,k,j = 1 implies ỹu t ′ ,k ′ ,j = 1 for all j ∈ [d], t ′ ≤ t and k ′ ≤ k, so if Xu uses the schedule Xv for a given time slot t and lane k, then it also uses Xv for the previous time slots t ′ ≤ t and lanes k ′ ≤ k.
The schedule Xu is feasible for I u , because for a given time slot t and lane k, the server type used by Xv is greater than or equal to the server type used by Xu , so in Xu there cannot be more active servers than available.Furthermore the demand requirements are obviously fulfilled.
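The total cost of $\bar{X}^u$ is
$$C(\bar{X}^u) = \sum_{t=1}^{u} \sum_{k=1}^{m} \tilde{y}^u_{t,k,j} C_{t,k}(\bar{X}^u) + \sum_{t=1}^{u} \sum_{k=1}^{m} (1 - \tilde{y}^u_{t,k,j}) C_{t,k}(\bar{X}^u) \le \sum_{t=1}^{u} \sum_{k=1}^{m} \tilde{y}^u_{t,k,j} C_{t,k}(\hat{X}^v) + \sum_{t=1}^{u} \sum_{k=1}^{m} (1 - \tilde{y}^u_{t,k,j}) C_{t,k}(\hat{X}^u) < \sum_{t=1}^{u} \sum_{k=1}^{m} C_{t,k}(\hat{X}^u) = C(\hat{X}^u).$$
In the first step, we simply split $C(\bar{X}^u)$ into two parts (note that $\tilde{y}^u_{t,k,j} \in \{0, 1\}$). The first inequality uses the definition of $\bar{X}^u$: for $\tilde{y}^u_{t,k,j} = 1$, we have $C_{t,k}(\bar{X}^u) = C_{t,k}(\hat{X}^v)$, and for $\tilde{y}^u_{t,k,j} = 0$, we have $C_{t,k}(\bar{X}^u) \le C_{t,k}(\hat{X}^u)$, because $\bar{X}^u$ can use greater server types with lower operating costs than $\hat{X}^u$ due to the elimination of immediate server changes. The last inequality uses our assumption given by equation (11).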
We have shown that $C(\bar{X}^u) < C(\hat{X}^u)$; however, this contradicts the fact that $\hat{X}^u$ is an optimal schedule. Therefore, our assumption was wrong and for all $j \in [d]$,
The following lemma shows how the cost of a single block $A_{v,k}$ can be folded into the right-hand side of equation (10) given in the previous lemma with $u = v - 1$.
Proof. If the block $A_{v,k}$ does not exist, equation (12) holds by Lemma 10 and $C(A_{v,k}) = 0$. If $A_{v,k}$ is a new block, then $C(A_{v,k}) \le 2\beta_j$ with $j := j_{v,k} = \hat{y}^v_{v,k}$ by Lemma 9. Since $A_{v,k}$ is a new block, server type $j$ was not used in the last time slot of the last $\bar{t}_j$ schedules: if, for example, $\hat{y}^{v-\bar{t}_j}_{v-\bar{t}_j,k} = j$ held, then $y^A_{v-1,k} = j$ and there would be an extended block at time slot $v$. By using the facts above and the definition of $\tilde{y}$, we get equation (13).
By using Lemma 10 and equation (13), we can estimate the first sum in (12):
For the second inequality, we add $(\tilde{y}^v_{v,k} - 1) \cdot C_{v,k}(\hat{X}^v) \ge 0$ and use $\sum_{t=v-\bar{t}_j}^{v} C_{t,k}(\hat{X}^v) \ge \beta_j$, which holds because either $j$ was powered up in $\hat{X}^v$ during $[v - \bar{t}_j : v]$ (then there is the switching cost of $\beta_j$) or $j$ runs for $\bar{t}_j + 1$ time slots, resulting in an operating cost of $l_j \cdot (\bar{t}_j + 1) = l_j \cdot (\lfloor \beta_j / l_j \rfloor + 1) \ge \beta_j$. Altogether, we get (beginning from the left-hand side of equation (12) that has to be shown)
If $A_{v,k}$ is an extended block, then $C(A_{v,k}) \le l_j d$ with $j := j_{v,k}$ and $d := d_{v,k}$ by Lemma 9. Let $j' := \hat{y}^v_{v,k}$ be the server type in $\hat{X}^v$ that provoked the extended block.
$\hat{y}^t_{t,k} \le j' - 1$ holds, because otherwise the duration of $A_{v,k}$ would be smaller than $d$. Analogously to new blocks, equation (13) holds for all such time slots $t$. The first sum of equation (12) is at most
The last term in (15) satisfies
because either $j'$ runs for $d$ time slots in $\hat{X}^v$ during $[v - d + 1 : v]$ (then the operating cost is exactly $l_{j'} d$) or $j'$ was powered up during this interval, resulting in a cost of
as the duration $d$ of block $A_{v,k}$ is upper bounded by $\bar{t}_j$. Altogether, we get
The last inequality holds, because
Theorem 13. Algorithm A is $2d$-competitive.
Proof. The feasibility of $X^A$ was already proven in Lemma 8, so we have to show that $C(X^A) \le 2d \cdot C(\hat{X}^T)$. Let $C_v(X^A)$ denote the cost of algorithm A up to time slot $v$. We will show by induction that
holds for all $v \in [T]_0$. For $v = 0$, we have no costs for both $X^A$ and $\hat{X}^v$, so inequality (17) is fulfilled. Assume that inequality (17) holds for $v - 1$. By using the induction hypothesis as well as Lemmas 11 and 12, we get
Since $\tilde{y}^v_{t,k} \le d$, we get $C(X^A) \le 2d \cdot C(\hat{X}^T)$. The schedule $\hat{X}^T$ is optimal for the problem instance $I$, so algorithm A is $2d$-competitive.

Randomized Online Algorithm
The $2d$-competitive algorithm can be randomized to achieve a competitive ratio of $\frac{e}{e-1} d \approx 1.582d$ against an oblivious adversary. The randomized algorithm B chooses $\gamma \in [0, 1]$ according to the probability density function $f_\gamma(x) = e^x / (e - 1)$ for $x \in [0, 1]$. The variables $\bar{t}_j$ are set to $\lfloor \gamma \cdot \beta_j / l_j \rfloor$, so the running time of a server is randomized. Then, algorithm A is executed. Note that $\gamma$ is determined once at the beginning of the algorithm and not for each block.
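The random choice of $\gamma$ can be sketched with inverse transform sampling: the density $e^x/(e-1)$ has the distribution function $F(x) = (e^x - 1)/(e - 1)$, so $\gamma = \ln(1 + U(e-1))$ for uniform $U$. The function names and the example cost vectors below are assumptions made for illustration only.

```python
import math
import random

def sample_gamma(rng):
    """Draw gamma from [0, 1] with density e^x / (e - 1) by inverting
    the distribution function F(x) = (e^x - 1) / (e - 1)."""
    u = rng.random()
    return math.log(1.0 + u * (math.e - 1.0))

def randomized_run_times(beta, l, gamma):
    """Run times floor(gamma * beta_j / l_j) for every server type j."""
    return [math.floor(gamma * b / c) for b, c in zip(beta, l)]

rng = random.Random(0)
gamma = sample_gamma(rng)      # drawn once, before the algorithm starts
run_times = randomized_run_times([10.0, 40.0], [3.0, 2.0], gamma)
```

With `gamma = 1` the run times coincide with those of the deterministic algorithm; smaller values power servers down earlier.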
Lemmas 1-8 as well as 10 and 11 still hold, because they do not depend on the exact value of $\bar{t}_j$. Only Lemmas 9 and 12 have to be adapted. First of all, we have to introduce a new variable. Let $\bar{\tau}_{t,k}$ be the number of time slots we have to go backwards in time to find an optimal schedule $\hat{X}^{t-\tau}$ that uses a server type greater than or equal to $\hat{y}^t_{t,k}$ in its last time slot in lane $k$. The following lemma replaces Lemma 9 and estimates the expected cost of the block $A_{t,k}$ depending on $\bar{\tau}_{t,k}$.
Lemma 14. Let $c = e/(e - 1)$, $j := \hat{y}^t_{t,k}$ and $\tau := \bar{\tau}_{t,k}$. The expected cost of the block $A_{t,k}$ is upper bounded by $E[C(A_{t,k})] \le l_j \tau c$.
Proof. Let $q := \frac{l_j}{\beta_j} \tau$ (note that neither $j$ nor $\tau$ depends on random decisions). We estimate the cost of $A_{t,k}$ depending on $\gamma$.
If $\gamma > q$, then the server $y^B_{t-\tau,k} \ge j$ is still running at time slot $t$. Therefore, $A_{t,k}$ is an extended block with duration at most $\tau$ (or $A_{t,k}$ does not exist, which is equivalent to an extended block with duration 0). Furthermore, the server type $j_{t,k} = y^B_{t-\tau,k}$ used in $A_{t,k}$ is greater than or equal to $j = \hat{y}^t_{t,k}$, so $l_{j_{t,k}} \le l_j$. Thus, for $\gamma > q$, we have $C(A_{t,k}) \le l_j \tau$.
If $\gamma \le q$, then there can be a new block at time slot $t$. Note that this is only a necessary, not a sufficient condition for a new block. If $A_{t,k}$ is a new block, then its cost is given by $\beta_j + l_j \bar{t}_j$. If $y^B_{t-\tau,k}$ still runs at time slot $t$, then $A_{t,k}$ is an extended block whose cost is at most $l_j \bar{t}_j$, since $j \le j_{t,k}$. Thus, for $\gamma \le q$, we have $C(A_{t,k}) \le \beta_j + l_j \bar{t}_j = \beta_j + l_j \lfloor \gamma \cdot \beta_j / l_j \rfloor$. Now, we can estimate the expected cost of $A_{t,k}$ by using the density function $f_\gamma$.
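A sketch of this calculation, assuming $q \in [0, 1]$ and combining the two case bounds above with $l_j \lfloor x \cdot \beta_j / l_j \rfloor \le \beta_j x$ (a reconstruction from the surrounding definitions, not necessarily the exact display of the original):

$$E[C(A_{t,k})] \le \int_0^q \left(\beta_j + l_j \lfloor x \beta_j / l_j \rfloor\right) f_\gamma(x)\,dx + \int_q^1 l_j \tau\, f_\gamma(x)\,dx \le \frac{\beta_j}{e-1} \int_0^q (1 + x) e^x\,dx + \frac{l_j \tau}{e-1} \int_q^1 e^x\,dx = \frac{\beta_j q\, e^q}{e-1} + \frac{l_j \tau\,(e - e^q)}{e-1}.$$

The first integral uses the antiderivative $x e^x$ of $(1 + x) e^x$.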
The last inequality uses $l_j \lfloor x \cdot \beta_j / l_j \rfloor \le \beta_j x$, so the integrals can easily be calculated. By using $\beta_j q = l_j \tau$ (which follows from the definition of $q$), we get $E[C(A_{t,k})] \le l_j \tau \cdot \frac{e}{e-1} = l_j \tau c$.
The following lemma replaces Lemma 12 and shows how the expected cost of block $A_{v,k}$ can be folded into the term $c \cdot \sum_{t=1}^{v-1} \tilde{y}^{v-1}_{t,k} C_{t,k}(\hat{X}^v)$, which is the right-hand side of equation (10).
Lemma 15. For all lanes $k \in [m]$ and time slots
Proof. If $\hat{y}^v_{v,k} = 0$, then $A_{v,k}$ does not exist, so $E[C(A_{v,k})] = 0$ and therefore equation (19) holds by $\tilde{y}^{v-1}_{t,k} \le \tilde{y}^v_{t,k}$ (see Lemma 10). Thus, in the following we consider the case $\hat{y}^v_{v,k} > 0$. For all $t \in [v - \tau + 1 : v - 1]$ with $\tau := \bar{\tau}_{v,k}$, the inequality $\tilde{y}^{v-1}_{t,k} \le \tilde{y}^v_{t,k} - 1$ holds (see equation (13) in the proof of Lemma 12). Therefore, we get
For the last inequality, we add the term $(\tilde{y}^v_{v,k} - 1) C_{v,k}(\hat{X}^v)$, which is nonnegative, since $\hat{y}^v_{v,k} > 0$. The last term in (20)
with $j := \hat{y}^v_{v,k}$, because either $j$ runs for $\tau$ time slots in $\hat{X}^v$ or $j$ is powered up during $[v - \tau + 1 : v]$, resulting in a cost of
as $\tau \le \bar{t}_j$ by definition. By using Lemma 14, we get
Theorem 16. Algorithm B is $\frac{e}{e-1} d$-competitive against an oblivious adversary.
Proof. Lemma 8 still holds for algorithm B, so the schedule $X^B$ is feasible. We have to show that $E[C(X^B)] \le c\,d \cdot C(\hat{X}^T)$. Let $C_v(X^B)$ denote the expected cost of algorithm B up to time slot $v$. We will show by induction that
holds for all $v \in [T]_0$. For $v = 0$, we have no costs for both $X^B$ and $\hat{X}^v$, so inequality (22) is fulfilled. Assume that inequality (22) holds for $v - 1$. By using the induction hypothesis as well as Lemmas 11 and 15, we get
Since $\tilde{y}^v_{t,k} \le d$, we get $E[C(X^B)] \le c\,d \cdot C(\hat{X}^T)$. The schedule $\hat{X}^T$ is optimal for the problem instance $I$, so algorithm B is $cd$-competitive.

Lower bound
In this section, we show that there is no deterministic online algorithm that achieves a competitive ratio better than $2d$. We consider the following problem instance: Let $\beta_j := N^{2j}$ and $l_j := 1/N^{2j}$, where $N$ is a sufficiently large number that depends on the number of server types $d$. The value of $N$ will be determined later. The adversary sends a job for the current time slot if and only if the online algorithm has no active server during the previous time slot. This implies that the online algorithm has to power up a server immediately after powering down any server. Note that $\lambda_t \in \{0, 1\}$, i.e., it is never necessary to power up more than one server. The optimal schedule is denoted by $X^*$. Let A be an arbitrary deterministic online algorithm and let $X^A$ be the schedule computed by A.
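The adversary's rule can be simulated in a few lines. The sketch below is only an illustration: the callback interface `policy(t, job, prev_active)` and the toy policy `make_keep_policy` are assumptions, and for simplicity the simulation tracks only whether some server is active, not its type.

```python
def instance_costs(N, d):
    """Switching costs beta_j = N^(2j) and operating costs l_j = N^(-2j)."""
    beta = [N ** (2 * j) for j in range(1, d + 1)]
    l = [N ** (-2 * j) for j in range(1, d + 1)]
    return beta, l

def run_adversary(policy, horizon):
    """A job arrives at slot t iff no server was active in slot t - 1."""
    jobs, active = [], []
    prev_active = False                  # nothing is active before slot 1
    for t in range(1, horizon + 1):
        job = 0 if prev_active else 1
        now_active = policy(t, job, prev_active)
        if job and not now_active:
            raise ValueError("infeasible: job without an active server")
        jobs.append(job)
        active.append(now_active)
        prev_active = now_active
    return jobs, active

def make_keep_policy(slots):
    """Toy online policy: stay active for `slots` time slots after a job."""
    remaining = 0
    def policy(t, job, prev_active):
        nonlocal remaining
        if job:
            remaining = slots
        is_active = remaining > 0
        remaining = max(0, remaining - 1)
        return is_active
    return policy

jobs, active = run_adversary(make_keep_policy(3), horizon=8)
```

Every power-down of the toy policy is answered by a job one slot later, which forces the next power-up.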
W.l.o.g., in $X^A$ there is no time slot with more than one active server. If this were not the case, we could easily convert the schedule into one where the assumption holds without increasing the cost. Assume that at time slot $t$ a new server of type $k$ is powered up such that there are (at least) two active servers at time $t$. If we power up the server at $t + 1$ instead, the schedule is still feasible, but the total cost is reduced by $l_k$. We can repeat this procedure until there is at most one active server in each time slot.
Proof. We will prove the lemma by induction. For $k = 1$, let $t$ be the length of the schedule $X^A$ and let $n$ denote how often server type 1 is powered up in $X^A$. The cost of $X^A$ is $C(X^A) = n\beta_1 + l_1 (t - n + 1)$. We use two strategies to estimate the cost of an optimal schedule. In the first strategy the server runs for the whole time, so the cost is $\beta_1 + l_1 t$. The second strategy is to power down the server when it is idle, so the cost is $n(\beta_1 + l_1)$.
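The three cost formulas of the base case can be evaluated directly. The following sketch uses a hypothetical helper `base_case_costs` and made-up values for $N$, $t$ and $n$; since both strategies are feasible, their minimum is only an upper bound on the optimal cost, so the resulting quotient is a lower bound on the actual ratio.

```python
def base_case_costs(beta1, l1, t, n):
    """Costs for k = 1, using the formulas stated in the proof."""
    online = n * beta1 + l1 * (t - n + 1)   # cost of the online schedule
    always_on = beta1 + l1 * t              # strategy 1: run the whole time
    on_demand = n * (beta1 + l1)            # strategy 2: power down when idle
    return online, min(always_on, on_demand)

N = 10                                      # made-up example value
beta1, l1 = N ** 2, 1 / N ** 2
online, opt_upper = base_case_costs(beta1, l1, t=1000, n=5)
ratio_lower_bound = online / opt_upper
```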
We distinguish between the cases $n \ge N/8$ (case 1) and $n < N/8$ (case 2). In case 1, the competitive ratio is
For the inequality, we split the cost of $X^A$ into two terms and estimate $C(X^*)$ in the left quotient with the second strategy and $C(X^*)$ in the right quotient with the first strategy. The quotient $\frac{2 l_1 (n-1)}{n(\beta_1 + l_1)}$ can be estimated by using $n(\beta_1 + l_1) \ge (n - 1)\beta_1$, the definitions of $l_1$ and $\beta_1$, as well as the precondition of the lemma that requires $N \ge 6k$.
By using this result in equation (25) as well as $n \ge N/8$, we get
In case 2, we use the fact that $C(X^A) \ge N\beta_1$, so the competitive ratio is at least
In the first inequality, we use the second strategy to estimate $C(X^*)$. The second inequality holds because $n < N/8$ and $n(\beta_1 + l_1) \le 2n\beta_1$.
Next, assume that Lemma 17 holds for k − 1.
We divide the schedule $X^A$ into phases $L_0, K_1, L_1, K_2, \dots, L_n$ such that in the phases $K_1, \dots, K_n$ server type $k$ is used exactly once, while in the intermediate phases $L_0, \dots, L_n$ the other server types $1, \dots, k - 1$ are used. A phase $K_i$ begins when a server of type $k$ is powered up and ends when it is powered down. The phases $L_i$ can have zero length (if the server type $k$ is powered up immediately after it is powered down, an empty phase $L_i$ is inserted between $K_i$ and $K_{i+1}$).
The operating cost during phase $K_i$ is denoted by $\delta_i \beta_k$. The operating and switching costs during phase $L_i$ are denoted by $p_i \beta_k$. We divide the intermediate phases $L_i$ into long phases where $p_i > 1/N$ holds and short phases where $p_i \le 1/N$. Note that we can use the induction hypothesis only for long phases. The index sets of the long and short phases are denoted by $\mathcal{L}$ and $\mathcal{S}$, respectively.
To estimate the cost of an optimal schedule we consider two strategies (see Figure 5): In the first strategy, a server of type $k$ is powered up at the first time slot and runs for the whole time except for phases $K_i$ with $\delta_i > 1$, for which powering down and powering up are cheaper than keeping the server in the active state ($\beta_k$ vs. $\delta_i \beta_k$). The operating cost for the phases $K_i$ is $\delta^*_i \beta_k$ with $\delta^*_i := \min\{1, \delta_i\}$ and the operating cost for the phases $L_i$ is at most $\frac{1}{N^2} p_i \beta_k$, because algorithm A uses servers whose types are lower than $k$ and therefore the operating cost of A is at least $N^2$ times larger. Thus, the total cost of this strategy is upper bounded by
Figure 5: (figure is colored) Visualization of the two strategies to estimate the cost of an optimal schedule. The schedule of algorithm A and the incoming jobs $\lambda_t$ are shown in the middle. Long phases are marked in blue and short phases are marked in green ($L_1$ is a short phase with zero length). Strategy 1 simply uses server type $k$ the whole time. During the short phases, strategy 2 behaves like algorithm A. For the long phases, there is a solution that results in only $1/\alpha$ of the cost of $X^A$ with $\alpha := 2k - 2 - \epsilon_{k-1}$. In the red blocks server type 1 is activated for exactly one time slot.
In the second strategy, for the long phases $\mathcal{L}$ we use the strategy given by our induction hypothesis, while for the short phases $\mathcal{S}$ we behave like algorithm A, and in the phases $K_i$ we run the server type 1 for exactly one time slot (note that in $K_i$ we only have $\lambda_t = 1$ in the first time slot of the phase). Therefore the total cost is at most
with $\alpha := 2k - 2 - \epsilon_{k-1}$. The total cost of A is equal to $\beta_k \left( \sum_{i=1}^{n} (1 + \delta_i) + \sum_{i \in \mathcal{L} \cup \mathcal{S}} p_i \right)$, so the competitive ratio is given by
In the first step, the numerator is separated into two parts. Then $C(X^*)$ in the first fraction is estimated by equation (26) (first strategy). In the next step, we transform the second fraction.
For the last estimation we used the following inequalities:
Theorem 18. There is no deterministic online algorithm for the data-center right-sizing problem with heterogeneous servers and time- and load-independent operating costs whose competitive ratio is smaller than $2d$.
Proof. Assume that there is a $(2d - \epsilon)$-competitive deterministic online algorithm A. Let $N := \max\{6d, \lceil 9k^2/\epsilon + 1 \rceil\}$. We construct a workload as described at the beginning of Section 4 until the cost of A is greater than $N\beta_d$ (note that $l_j > 0$ for all $j \in [d]$, so the cost of A can be arbitrarily large). By using Lemma 17 with $k = d$, we get
which is a contradiction to our assumption that algorithm A is $(2d - \epsilon)$-competitive. Therefore, there is no deterministic online algorithm whose competitive ratio is smaller than $2d$.
The schedule constructed for the lower bound uses at most one job in each time slot, so there is no reason for an online algorithm to utilize more than one server of a specific type. Thus, for a data center with $m$ unique servers (i.e., $m_j = 1$ for all $j \in [d]$), the best achievable competitive ratio is $2d = 2m$.
Corollary 19. There is no deterministic online algorithm for the data-center right-sizing problem with $m$ unique servers and time- and load-independent operating costs whose competitive ratio is smaller than $2m$.

Summary
In this paper, we have settled the competitive ratio of online algorithms for right-sizing heterogeneous data centers with $d$ different server types. We investigated a basic setting where each server type has a constant operating cost per time unit. In contrast to related publications like [25] or [30], we studied the discrete setting where the number of active servers must be integral. Thereby we gain truly feasible solutions. We developed a $2d$-competitive deterministic online algorithm and showed that $2d$ is a lower bound for deterministic algorithms. Hence our algorithm is optimal. Furthermore, we presented a randomized version that achieves a competitive ratio of $\frac{e}{e-1} d \approx 1.582d$ against an oblivious adversary.

A Variables
The following table gives an overview of the variables defined in this paper.

Figure 3: (figure is colored) Visualization of the proof of Lemma 5.The number inside each block refers to the used server type.The blocks of the sets B − and B + are marked in blue and green, respectively.Block B max , which contains the largest server type, is drawn in red.Note that the third block B min 3 in X min is mapped to the second block B v 2 in Xv , but it uses the server type j min 3 = min{j u 2 , j v 2 } = j u 2 = 7 instead of j v 2 = 9.However, since β 7 < β 9 , the switching cost of B min 3

Figure 4: (figure is colored) Visualization of the definition of the blocks A t,k for one specific lane k.The first line shows the values of ŷt t,k for t ∈ [0 : 18] and the second line the resulting schedule of algorithm A. In this example, we have ( t1 , t2 , t3 ) = (2, 3, 5).The blocks A t,k are printed as rectangles that show the start and end time s t,k and e t,k , e.g., s 4,k = 5 and e 4,k = 7 (note that the server is in the active state during [s t,k : e t,k − 1]).The dashed line after A 1,k indicates that e 1,k = 3, since the block is interrupted by A 2,k .New blocks are drawn in green and extended blocks are drawn in blue.The arrows indicate the creation time t of a block.The used server type j t,k is equal to y A t,k .Blocks that are not printed do not exist, e.g., A 3,k does not exist, because t + tŷ t t1 = 3 + 2 = 5 ≤ e 2,k = 5, so e k is not updated at t = 3.
,j C t,k ( Xv ) holds.By summarizing these inequalities for all j ∈ [d] and by using the fact d j=1 ỹu t,k,j = ỹu t,k , we get C t,k ( Xv ).

Lemma 17 .
Let k ∈ [d].If X A only uses servers of type lower than or equal to k and if the cost of

i∈S p i ≤ n + 1 N 2 ( 3 C
(by |S| ≤ n + 1 and p i ≤ 1/N for i ∈ S), by α ≤ 2k, k ≥ 2 andβ k = N 2k ).With N 2 ≥ N , ξ := 6k/N , α ≤ 2k, the definition of α = 2k − 2 − ǫ k−1 and N ≥ 2k, we get C(X A ) C(X * ) ≥ 2k − ǫ k−1 − 10k N − 3 C(X * )/β k .(29) If C(X * ) < N 2k β k holds, then C(X A ) ≥ N β k (a precondition of Lemma 17) implies 2k • C(X * ) < N β k ≤ C(X A ), so equation (24) is fulfilled and Lemma 17 holds.If C(X * ) ≥ N 2k β k , then (X * )/β k ≤ 6kN and inequality(29) gives [u], there must be an active server in lane k at time t, if λ t,k > 0. Since Xu and Xv are feasible schedules, ŷu t,k ≥ λ t,k and ŷv t,k ≥ λ t,k holds for all t ∈ [u] and k ∈ [m].Thus, y } ≥ λ t,k holds.Second, we have to check if there are not more active servers in X min(u,v) than available, i.e. x ′ is powered up at time t, then ŷt t,k ′ = y A t,k ′ holds.By the definition of algorithm A, the server types used during a given time slot are greater than or equal to the server types used by Xt , so y A t,k ≥ ŷt t,k .The server types in Xt are sorted, so we get y A t,k ≥ ŷt t,k ≥ ŷt t,k ′ = y A t,k ′ which contradicts our assumption.If y A t,k ′ is already running at time t, we consider the time slot t ′ < t when the value of e k ′ has changed for the last time.Formally, let t ′ < t be the last time slot such that t ′ + tŷ t ′ t ′ ,k ′ > t.We have ŷt ′ t ′ ,k ≥ ŷt ′ t ′ ,k ′ , so by Lemma 6, y A t ′ ,k runs at least as long as y A t ′ ,k ′ .Therefore, the fact y A t ′ ,k ≥ y A Lemma 7.For all time slots t ∈ [T ], the values y A t,1 , . . 
., y A t,m are sorted in descending order, i.e., y A t,k ≥ y A t,k ′ for k < k ′ .Proof.Assume that Lemma 7 does not hold.Let t be the first time slot with y A t,k < y A t,k ′ .If y A t,k d j=1 x A t,j ≥ λ t ) and (2) there are not more active servers than available (i.e., x A t,j ∈ [m j ] 0 ).(1) By the definition of algorithm A, the server types used during a given time slot are greater than or equal to the server types used by Xt , so there are at least as many active servers as in Xt .Therefore, d j=1 x A t,j ≥ d j=1 xt t,j ≥ λ t holds for all t ∈ [T ].
We have $e_{t,k} := t + \bar{t}_{\hat{y}^t_{t,k}}$ (i.e., the value of $e_k$ after executing line 9 or 6) and $s_{t,k} := e_{t',k}$, where $A_{t',k}$ is the previous block in the same lane. Note that an extended block can be created not only in line 9, but also in line 6, if $t = e_k$ and $y^A_{t-1,k} = \hat{y}^t_{t,k}$. If lines 8 and 9 are executed, but the value of $e_k$ does not change (because $t + \bar{t}_{\hat{y}^t_{t,k}}$ is smaller than or equal to the previous value of $e_k$), then the block $A_{t,k}$ does not exist. Figure 4 visualizes the definition of $A_{t,k}$. Let $d_{t,k} := e_{t,k} - s_{t,k}$ be the duration of the block $A_{t,k}$ and let $C(A_{t,k})$ be the cost caused by $A_{t,k}$ if the block $A_{t,k}$ exists or 0 otherwise. The next lemma describes how the cost of a block can be estimated.
Lemma 9. The cost of the block $A_{t,k}$ is upper bounded by $C(A_{t,k}) \le 2\beta_j$ if $A_{t,k}$ is a new block and by $C(A_{t,k}) \le l_j d_{t,k}$ if $A_{t,k}$ is an extended block, where $j = j_{t,k}$.
$C_{t,k}(\hat{X}^v)$ is obviously fulfilled (because $\hat{X}^u$ is an optimal schedule for $I^u$, so $\hat{X}^v$ cannot be better). The lemma below shows that this inequality still holds if the cost $C_{t,k}(\cdot)$ is scaled by $\tilde{y}^u_{t,k}$.
Lemma 11. Let $u, v \in [T]$ with $u < v$. It holds that
To deduce a contradiction, we assume that there exists a $j \in [d]$ such that
$\tilde{y}^u_{t,k} = \sum_{j=1}^{d} \tilde{y}^u_{t,k,j}$. In other words, $\tilde{y}^u_{t,k,j} = 1$ means that the largest server type in the sequence $(\hat{y}^{t'}_{t',k})_{t' \in [t:u]}$ is at least $j$.
Variable: Description
A: Our deterministic online algorithm (Section 2) or any online algorithm (Section 4)
$A_{t,k}$: Block at time slot $t$ in lane $k$ of the schedule $X^A$
$\beta_j$: Switching cost of server type $j$
$C(X)$: Total cost of the schedule $X$ (see equation (1))
$C_{t,k}(X)$: Switching and operating cost of the schedule $X$ at time $t$ in lane $k$ (see equation (9))
$C(A_{t,k})$: Cost of block $A_{t,k}$ (see Lemma 9)
$d_{t,k}$: Duration of block $A_{t,k}$. Formally, $d_{t,k} := e_{t,k} - s_{t,k}$
$e_k$: Variable in algorithm A that stores the time slot when the server in lane $k$ will be powered down
$e_{t,k}$: Last time slot (exclusive) of block $A_{t,k}$, i.e., $e_{t,k}$ is the first time slot after $A_{t,k}$
$I$: Problem instance. Formally, $I := (T, d, m, \beta, l, \Lambda)$
$I^t$: Problem instance that ends at time slot $t$. Formally, $I^t := (t, d, m, \beta, l, \Lambda^t)$
$j_{t,k}$: Server type used in block $A_{t,k}$
$s_{t,k}$: First time slot of block $A_{t,k}$
$\bar{t}_j$: Number of time slots that a server of type $j$ stays active in algorithm A; $\bar{t}_j := \lfloor \beta_j / l_j \rfloor$
$X$: An arbitrary schedule. Formally, $X = (x_1, \dots, x_T)$ and $x_t = (x_{t,1}, \dots, x_{t,d})$
$X^*$: An optimal schedule
$X^A$: The schedule calculated by our deterministic online algorithm (in Section 2) or by any online algorithm (in Section 4)
$X^B$: The schedule calculated by our randomized online algorithm
$\hat{X}^t$: Optimal schedule for the problem instance $I^t$ that ends at time $t$
$x_{t,j}$: Number of active servers of type $j$ at time $t$ in the schedule $X$
$x^A_{t,j}$: Number of active servers of type $j$ at time $t$ in the schedule $X^A$
$\hat{x}^u_{t,j}$: Number of active servers of type $j$ at time $t$ in the schedule $\hat{X}^u$
$y_{t,k}$: Server type used in the $k$-th lane at time $t$ in the schedule $X$ (see equation (2))
$y^A_{t,k}$: Server type used in the $k$-th lane at time $t$ in the schedule $X^A$ (see equation (2))
$\hat{y}^u_{t,k}$: Server type used in the $k$-th lane at time $t$ in the schedule $\hat{X}^u$ (see equation (2))
$\tilde{y}^u_{t,k}$: Largest server type used in lane $k$ by the schedule $\hat{X}^{t'}$ at time slot $t'$ for $t' \in [t:u]$. Formally, $\tilde{y}^u_{t,k} := \max_{t' \in [t:u]} \hat{y}^{t'}_{t',k}$