Designs of optimal switching feedback decentralized control policies for fluid queueing networks

The paper considers standard fluid models of multi-product multiple-server production systems where setup times are incurred whenever a server changes product. We consider a general approach to the problem of optimizing the long-run average cost per unit time that consists of first determining an optimal steady-state (periodic) behavior and then designing a feedback scheduling protocol that ensures convergence to this behavior as time progresses. In this paper, we focus on the latter part and introduce a systematic approach. This approach gives rise to protocols that are cyclic and distributed: the servers do not need information about the entire system state. Each of them basically proceeds from the local data concerning only the currently served queue, although a fixed finite number of one-bit notification signals should be exchanged between the servers during every cycle. The approach is illustrated by simple instructive examples concerning polling systems, single-server systems with a processor sharing scheme, and the re-entrant two-server manufacturing network with non-negligible setup times introduced by Kumar and Seidman. For the last network, considered in analytical form, some cases of optimal steady-state (periodic) behavior are first recalled. For all examples, based on the desired steady-state behavior and using the presented theory, we design simple distributed feedback switching control laws. These laws not only give rise to the required behaviors but also make them globally attractive, irrespective of the system parameters and initial state.


Introduction
The paper deals with standard fluid models of production systems. We represent the system as a network that receives incoming product flows, interpreted as deterministic fluid streams, and processes them by means of servers. The servers move products (also called work) among internal buffers and ultimately dispatch work into the exterior of the network. The servers can alter their locations, which requires nonzero setup times.
Such models are used to describe certain aspects of flexible manufacturing systems, computer, communication and transport networks, chemical kinetics, etc. [2,22,32].
Recently, a great deal of research was concerned with these models, see e.g., [5,6,8,12,18,35,36,38] and the literature therein. It was shown that they may exhibit unexpectedly complicated and counter-intuitive behavior, especially if decentralized control policies and non-zero setup times are involved. For instance, it was shown via computer simulation in [3] that some standard policies may cause instability: the total amount of work increases without limits even if each server has enough capacity to cope with the incoming flows. In [23], it was rigorously proved that the clearing policy (serve the buffer until emptying) is unstable for very simple networks even if the setup times are zero. In [32], clear a fraction (CAF) policies were introduced and shown to achieve stability for single server systems, as well as for multi-server networks such that under some enumeration of the servers, work visits them in the ascending order. If such enumeration is impossible (which holds for e.g., re-entrant networks), CAF policies may fail to stabilize the system [23]. In some cases, the so-called gated policies proposed in [20,21] are able to overcome this drawback. The main idea behind them is to assign a certain level (gate) to every buffer and switch the servers proceeding from not the entire backlog in the buffer but its excess over the gate. This shortens the time of buffer service, thus reducing the likelihood of the detrimental situation underlying instability: a server wastes its capacity due to deficiency in work supply from another server since the latter is occupied by another activity in a side buffer for a too long time. However, gated policies carry potential for increase of the mean number of jobs in the system, which is undesirable from a performance point of view.
In [34], a universal decentralized switching strategy was proposed and shown to stabilize very general multiple server networks with time-varying rates of the outer inflows. The strategy arranges the system operation in repeated cycles of a fixed duration T; within any cycle, every server visits each of the associated buffers only once in a pre-specified cycle-invariant order. From any buffer, the server removes the amount of work identical to the cumulative network income brought during time T by all inflows that affect, either directly or indirectly, this buffer. If the buffer does not contain enough work to do so, it is drained out. The next setup may be prolonged to fully consume the time reserved for the step. This strategy not only keeps the wip (= 'work in progress' = the total amount of work in the system) bounded but also makes all trajectories eventually periodic in the case of constant arrival rates. However, it does not provide a mechanism for wip reduction. For example, the more work there is in the system initially, the more work remains afterwards. This is undesirable from a performance point of view as well.
The above references display characteristic features of other works on feedback control of fluid networks (see e.g., [11] for a recent reference). They start from more or less heuristically designed policies and proceed with study of the resultant system behavior. Performance optimization, when treated, is typically limited to the choice of the parameters for a pre-specified policy, with a few exceptions [1,7,9,13,19] which focused on two buffer systems. However, the issue of performance becomes one of the major concerns as a result of complication of manufacturing processes and cost increase. Though optimal scheduling of systems with setups has an extensive literature, it is primarily focused on open-loop schedules and basically assumes a static and certain environment (see, e.g., [33] and the literature therein). This approach, being a backbone of production systems planning, is not suited well to deal with real-life dynamic and uncertain environments, which causes a reported gap between the theory and practice [31]. The basic tool to cope with these uncertainties is feedback under which decisions are made on an ongoing basis from the current events in the system.
A systematic approach to bridge this gap between optimal open-loop scheduling and feedback switching control protocols was proposed in [26,27]. Assuming a pre-determined periodic process that represents the desired open-loop schedule, the objective of the approach is to develop a general technique to design a feedback switching policy that not only gives rise to this periodic process but also makes it globally attractive. The last property is of especial interest if the given process is optimal or nearly optimal. Though most schedule optimization problems are NP hard, relatively effective optimization techniques have been developed to treat them [31,33]. The main concern of this paper is not optimization, but is a stable feedback generation of a desired periodic process. This is similar to the standard hard problem in control engineering, i.e., excitation of the oscillations in a technical device: a feedback controller should make the desired cyclic trajectory globally attracting.
A Lyapunov-type technique of the above kind was proposed in [27]. The resultant controllers are central: every server has access to the entire system state. In certain cases, decentralized controllers driven by only local data are needed. In [26], it was shown, by means of an example, that within the new view of the problem introduced in [27], decentralized controllers can be derived as well. The discussed technique relies on computation of a Lyapunov function and suffers much from the 'curse of dimensionality'.
To break the curse, a computationally non-demanding Poincaré-type technique was reported in [10] in a preliminary form. It is aimed at designs of decentralized controllers and offers to partition the required process into relatively simple phases, each associated with a specific combination of activities of the servers. The policy is to periodically repeat the resultant cycle of the phases, each governed by an individual control rule on the basis of local information. When the server completes the task for the phase, it broadcasts one-bit notifications to the other servers and proceeds to the next phase as soon as it collects all notifications. Design of the phase control rules (PCR) is the core occupation. According to [10], the design objective is confined into the phase itself: it should be ensured that a certain set of properties hold for the phase dynamical operator, which maps the system state at the phase beginning into that at its end. This set is elaborated so that these properties are inherited by compositions and so inevitably hold for the monodromy operator (MO). This is the composition of the dynamical operators over the entire cycle of the phases; the system in fact evolves via iterations of the MO. The above properties are such that being established for the MO, they guarantee the global stability of the equilibrium point of the iteration process and as a result that of the required periodic behavior. The curse becomes broken by avoiding computation and analysis of the entire MO, which is typically cumbersome up to intractable. In [10], this technique was probed by application to an example, also treated in [26]. It concerns the Kumar-Seidman system [23] and a periodic process optimal for only particular numerical values of the system parameters. Under them, this process features special properties, unnecessary in general, which were essentially utilized in the design and proofs.
This paper offers an extended and systematic presentation of this technique and demonstrates, by means of examples,¹ that it is suited to handle the entire range of cases encountered in the optimization problem. To the latter end, we start with polling systems, i.e., ones with a single server attending several buffers, typically in a cyclic order. Such systems have a wide range of applications and have been the subject of extensive research; see e.g., [24,37,39] for surveys of this area. This research mostly concerned the performance of specific policies and performance estimates. Only a relatively small body of research dealt with optimal designs under non-zero setup times, with the focus on open-loop schedules and efficient visit orders, where only the latter topic was partly handled with the aid of feedback protocols; see e.g., [4,14-16,24] and the literature therein.
When dealing with polling systems, we mean to highlight the ease of application of the proposed theory, which promises well regarding more complicated networks. Our choice of the desired periodic behavior is based on many studies showing that for a whole variety of performance criteria (e.g., time-averaged wip, maximum wip, throughput, etc.), service at the rates maximal under the current circumstances² is required for optimal system performance; see e.g., [9,24]. So we assume that the pre-specified process has this property, with no other assumptions being imposed. A simple distributed feedback switching control law is designed that not only gives rise to the required steady state behavior but also makes it globally attractive, irrespective of the system parameters.
¹ Application of this technique to the general multiple-server fluid networks is the topic of ongoing research. As compared with the examples, this requires essentially more technical developments, which makes the general case not the best arena for the first presentation of the approach.
² I.e., at the maximal rate if the served buffer is not empty, and at the input rate otherwise.
Although important in their own right, polling systems are not interesting enough for illustration of the general theory: for these systems the phase dynamical operators are affine, whereas the general theory deals with non-affine ones. As the simplest example involving non-affine operators, we extend the above results to a generalization of the polling system where the server can simultaneously serve several buffers. This model is of interest for, e.g., high-speed data and computer networks [40], transport and manufacturing networks [28], etc., with the simplest example being an intersection of two-way roads.
Our third illustration concerns the more complicated two-server re-entrant network introduced by Kumar and Seidman [23] and traditionally employed as a testbed in the area. We first recall the analytical solution to the problem of optimizing the system behavior, as presented in [25]. Next, we design a distributed feedback switching control law that gives rise to the optimal steady state behavior and makes it globally attractive. Compared with [10], this requires new phase control rules, with an emphasis on the concept of the flexible phase. This is a phase that encompasses several activities of every server and within which the servers are given the freedom to proceed to the next activity independently of each other and based on only the local data, thus achieving complete decentralization of control.
The body of the paper is organized as follows. Section 3 presents the proposed general guidelines for switching policy design and the related mathematical background. To make the presentation of these guidelines specific, Sect. 2 introduces a rather general model of a multi-product multiple-server system with setups. Sections 4, 5, and 6 deal with polling systems, single server networks with a processor sharing scheme, and the Kumar-Seidman system, respectively. There are three appendices providing the technical facts and their proofs.

General multi-product multiple server system with setups
We consider a system that receives $F$ product flows, interpreted as fluid streams, and processes them by means of $S$ servers. The servers move products among $N$ internal buffers and ultimately dispatch them into the exterior of the system. We do not consider the case where a job may dynamically choose the server to be processed at, and assume that the production routes are specified a priori. A setup activity is required to switch a product type at any server. This system is represented by a directed graph with the set of nodes $\mathcal{N} := \{1, \ldots, N, \circledast_1, \ldots, \circledast_F, \mathrm{out}\}$. Here $\circledast_i$ is the source of the $i$th flow and $\mathrm{out}$ is the exterior where work is ultimately delivered. Other nodes represent buffers enumerated by $n \in [1:N]$. The graph arcs display the paths along which work is moved. Any server $s$ has its own service area $I_s \subset [1:N]$; these areas form a partition of the set of buffers. The sources $\circledast_j$ have no incoming arcs, and the exterior $\mathrm{out}$ has no outgoing arcs. There is only one arc starting at any node $n \neq \mathrm{out}$; its end is denoted by $\mathrm{next}(n)$. The graph contains no cycles and every buffer $n$ has incoming arcs. The rate $\lambda_i \ge 0$ of the flow from the source $\circledast_i$ is constant. Any server can serve only one buffer at a given time. Service of buffer $n$ consists in withdrawal of its content to $\mathrm{next}(n)$ at a rate $0 \le u_n(t) \le \mu_n$, where $\mu_n > 0$ is given. Switching the server from $n'$ to $n''$ consumes $\sigma_{n' \to n''} > 0$ time units.
The (feasible) state $(X, Q)$ consists of the continuous state $X = \{x_n \ge 0\}_{n=1}^{N}$ and the discrete state $Q = \{q_s \in I_s \cup \{\ominus\}\}_{s=1}^{S}$. Here $x_n$ is the content of buffer $n$ and $q_s$ is the state of server $s$, i.e., $q_s$ either indicates the buffer served or is the 'switching in progress' symbol $\ominus$. A process refers to a feasible evolution of the feasible state $[X(t), Q(t)]$ over time, i.e., an evolution such that
1. any function $q_s[\cdot]$ is piece-wise constant and in the chronological list of its values, any two successive 'buffer' entries $n', n''$ are different ($n' \neq n''$) and separated by the 'switching' one $\ominus$, which is maintained no less than $\sigma_{n' \to n''}$ time units;
2. the function $X(\cdot)$ is absolutely continuous and for any buffer $n \in [1:N]$, the content $x_n$ changes according to the balance between the inflow into the buffer and the current service rate $u_n(t)$.
In practice, the system is usually governed by a switching policy. It endows each server with a rule to determine the current service rate $u_n(t)$ and to decide when this service should be terminated and which buffer should be served next. The problem to be treated in this paper is as follows:
(P) Given a periodic process $\pi^0$, a switching policy should be designed such that
• The process $\pi^0$ is generated by this policy;
• All processes converge to $\pi^0$ as $t \to \infty$.
For the definition of process convergence, we refer the reader to [35,36]. Briefly, this means asymptotic convergence. Notice that a necessary and sufficient condition for existence of periodic processes is provided in [17,34]: all servers should have enough capacity to process the job inflow. Given a switching policy, the process is determined by the initial state. So the first requirement means that there exists an initial state that gives rise to $\pi^0$. By the second requirement, sooner or later, the system evolution closely follows $\pi^0$ irrespective of the initial state. This is of especial interest if $\pi^0$ is optimal or nearly optimal. Then the policy ensures automatic transition to this system behavior. The main concern of this paper is not determining the optimal system behavior but generating stable feedbacks that make all processes converge to a given desired periodic process. So in what follows, the process $\pi^0$ is treated as pre-specified.

Transformation of a periodic process into a switching policy: general guidelines and mathematical background
According to [10], transformation of the periodic process $\pi^0 = [X^0(\cdot), Q^0(\cdot)]$ into a switching policy is arranged along the following lines:
• The periodic process $\pi^0$ is partitioned into finitely many phases, each associated with the discrete state transitions involved:
$P_0 \to P_1 \to \cdots \to P_{c-1}$; (1)
• Every phase is equipped with a phase control rule (PCR) to govern the system within the phase;
• The entire policy is to progress through the periodically repeated sequence of phases (1) while applying the relevant PCR within every phase;
• When a server completes the task for the phase, it broadcasts a one-bit notification to the others and proceeds to the next phase as soon as it collects all notifications.
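The notification mechanism in the last guideline can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the time each server needs for its task in a phase is taken as known in advance, notifications are delivered instantly, and the name `run_cycle` is hypothetical rather than taken from [10]. The sketch shows that a phase ends when the slowest server finishes and that each phase costs $S(S-1)$ one-bit messages for $S$ servers.

```python
def run_cycle(task_durations):
    """Simulate one cycle of phases separated by one-bit notifications.

    task_durations[p][s] is the (assumed known) time server s needs to
    complete its task in phase p. Returns the phase end times and the
    total number of one-bit notifications sent during the cycle.
    """
    num_servers = len(task_durations[0])
    clock = 0.0
    phase_ends = []
    bits_sent = 0
    for durations in task_durations:
        inbox = [set() for _ in range(num_servers)]
        finish_times = [clock + d for d in durations]
        # Servers broadcast in the order in which they finish their tasks.
        for sender in sorted(range(num_servers), key=lambda s: finish_times[s]):
            for receiver in range(num_servers):
                if receiver != sender:
                    inbox[receiver].add(sender)  # one bit: "sender is done"
                    bits_sent += 1
        # Every server has now collected all notifications.
        assert all(len(box) == num_servers - 1 for box in inbox)
        # The next phase starts once the slowest server has finished.
        clock = max(finish_times)
        phase_ends.append(clock)
    return phase_ends, bits_sent


if __name__ == "__main__":
    # Two servers, two phases (illustrative numbers).
    print(run_cycle([[1.0, 3.0], [2.0, 0.5]]))
```

In this toy model a server that finishes early simply waits; in the policies discussed in the paper, the phase control rule decides what the server does in the meantime.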
To promote decentralization, PCR's are encouraged to drive every server on the basis of only its own local data (i.e., the data about the currently served buffer). Then within any phase, the control is completely decentralized and cooperation of the servers reduces to the exchange of finitely many bits at the end of every phase. The dynamical operator $T_{P_i}$ of phase $P_i$ maps the continuous state $X$ at the beginning of $P_i$ into that at the end (for a given PCR). The monodromy operator is the similar map for the entire cycle (1):
$M := T_{P_{c-1}} \circ \cdots \circ T_{P_1} \circ T_{P_0}$. (2)
The problem (P) from Sect. 2 is solved whenever the PCR's ensure the following:
(i) Any PCR generates the related piece of $\pi^0$;
(ii) Any trajectory of the iterated system $X(k+1) = M[X(k)]$, $X(0) \ge 0$ converges to $X^0_0 := X^0(0)$ as $k \to \infty$.
Here (i) guarantees that the entire switching policy does generate the required periodic process $\pi^0$ and also that $X^0_0$ is the equilibrium point of the iterated system. If the phase dynamical operators $T_{P_i}$ are continuous, (ii) ensures convergence of all processes in the original fluid network to the desired periodic behavior $\pi^0$ by the standard argument presented in e.g., [34-36].
To ensure (i), the idea is to design PCR's so that they force the system to replicate the desired process $\pi^0$. Property (ii) brings more trouble, partly due to the curse of dimensionality: computation of the monodromy operator becomes cumbersome up to intractable as the numbers of servers or buffers increase. This burden is especially hard at the stage of design, where there is no specific monodromy operator to compute, and the actual task is to display and employ the relationships between this operator and particular designs of PCR's in order to choose the proper ones. The following new criterion for stability of equilibria of iterative dynamic systems helps to remove this obstacle since it can be verified and ensured 'phase-wise', thus eliminating the need to deal with the entire monodromy operator.

Mathematical background
The inequalities $x \le y$ and $x < y$ for $x, y \in \mathbb{R}^p$ are meant component-wise.

Definition 1 The operator
$S_j$ exists such that each set $S_j$ (called a cell) has an interior point and is described by finitely many linear inequalities (both strict and non-strict), and all restrictions $T|_{S_j}$ are affine, i.e., $T[x] = A_j x + b_j$ for all $x \in S_j$.
The following theorem is the main result of this section.

Theorem 1 Suppose that an iteration T m of a piece-wise affine continuous monotone map T is strictly dominated and this map has a fixed point T[x
The proof of Theorem 1 is given in Appendix A.
With respect to the monodromy operator $T := M$, the assumptions of continuity, monotonicity, and piece-wise affinity can be checked 'phase-wise' since they are evidently inherited by compositions of the maps. As for strict dominance, it can be shown that the composition not only inherits this property but also acquires it even if the composed maps are not strictly dominated.
Piece-wise affinity is usually absolutely clear from the formula of the operator. To check the continuity of a piece-wise affine operator, it is required to establish that the formulas that are active at the different sides of the boundary of any cell produce a common result at any boundary point. Another useful fact is that the composition $f = g \circ h$ of a continuous and piece-wise affine operator $g$ with an affine operator $h$ is continuous and piece-wise affine as well. For example, by taking here $g(y_1, \ldots, y_m) = \max\{y_1, \ldots, y_m\}$, we see that the function $f(x) = \max_{i=1,\ldots,m} \big[ \sum_{j=1}^{n} a_{ij} x_j + b_i \big]$ is continuous and piece-wise affine. Linear combinations of continuous and piece-wise affine functions clearly inherit these properties. Finally, it can be shown that a piece-wise affine continuous operator is monotone if and only if all matrices $A_j$ from (ii) in Definition 1 have nonnegative entries.
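The two facts above are easy to check numerically. The following Python sketch is an illustration only: the representation of a piece-wise affine map as a list of pairs (matrix, offset vector), the function names, and the numbers are assumptions made for the example. It tests the nonnegativity criterion for monotonicity and evaluates a maximum of affine functions.

```python
def is_monotone(pieces):
    """Nonnegativity test for a continuous piece-wise affine map.

    pieces is a list of (A, b) pairs, one per cell, where A is a list of
    rows and b is the offset vector (b plays no role in the test).
    """
    return all(entry >= 0 for A, _ in pieces for row in A for entry in row)


def max_affine(A, b, x):
    """Evaluate f(x) = max_i (sum_j A[i][j] * x[j] + b[i])."""
    return max(
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(A, b)
    )


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 2.0]]
    b = [0.0, -1.0]
    # Each row of A is the linear part of f on one cell.
    print(is_monotone([([row], [b_i]) for row, b_i in zip(A, b)]))
    # Component-wise larger argument, larger value.
    print(max_affine(A, b, [3.0, 1.0]), max_affine(A, b, [4.0, 3.0]))
```

A single negative entry in any cell matrix is enough to make the test fail, which is why the criterion is convenient to verify phase by phase.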

Polling systems
We consider a particular case of the system from Sect. 2: only one server, $N \ge 2$ buffers, and $N$ outer flows (see Fig. 1a, where $N = 6$). The $n$th flow arrives at buffer $n$ at a constant rate $\lambda_n > 0$ and, after service at a rate $0 \le u_n(t) \le \mu_n$, leaves the system. Switching between buffers requires a given nonzero setup time. Consider a $T$-periodic process $\pi^0$ for which the service rate of the currently served buffer $n$ is maximal, $u_n(t) = \mu_n$, if $x_n(t) > 0$ and equals the input rate, $u_n(t) = \lambda_n$, if $x_n(t) = 0$. Without any loss of generality, we assume that $T$ is the end of a switching period. Following the lines of Sect. 3, we decompose $\pi^0$ into the simplest phases (1) by partitioning the period $[0, T]$ into the intervals where the discrete state is constant. Then any phase $P_i$ is associated with a discrete state $q_i$; these states form the sequence
$q_0, q_1, \ldots, q_{c-1}$, (3)
in which service states alternate with the 'switching in progress' symbol $\ominus$ and $q_{c-1} = \ominus$.
Now we introduce the phase control rules (PCR).
PCR for the switching phase $q_i = \ominus$. Switching is implemented for a duration of $\sigma_i^0$ time units, where $\sigma_i^0 > 0$ is its duration along the process $\pi^0$.
PCR for the service phase $q_i = n \neq \ominus$. We first introduce the following (see Fig. 1b):
• $\theta_i^n$: the fraction of the initial content of buffer $n$ at this phase that is retained in the buffer at the phase end for $\pi^0$, i.e., the ratio of the content of buffer $n$ at the end of the phase to that at its beginning along $\pi^0$; (4)
• $\delta_i$: the duration of the service at the rate $\lambda_n$ at this phase for $\pi^0$.
Note that $\theta_i^n \cdot \delta_i = 0$. Let $t_i$ stand for the time when the phase commences.
Phase control rule: Buffer $n = q_i$ is served at the maximal rate $\mu_n$ until its content reduces to the level $\theta_i^n x_n(t_i)$ and then at the input rate for $\delta_i$ more time units.
The entire policy is to progress through the periodically repeated sequence of phases (3) while applying the relevant phase control rule within every phase.
Thus the server is driven by local data about the currently served buffer.
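To make the policy concrete, the following Python sketch iterates the cycle of phases for a small polling system. It is a minimal illustration, not the paper's implementation: the phase maps are derived here directly from the rules stated above (while buffer n is served at the maximal rate, its content falls at the net rate μ_n − λ_n, so the level θ·x_n is reached after (1 − θ)·x_n/(μ_n − λ_n) time units), and the two-buffer parameters, the zero-based buffer indices, and all function names are assumptions.

```python
def service_phase(x, n, theta, delta, lam, mu):
    """State after serving buffer n down to theta*x[n], then delta more
    time units at the input rate. Requires mu[n] > lam[n]."""
    tau = (1.0 - theta) * x[n] / (mu[n] - lam[n])  # time at the maximal rate
    duration = tau + delta
    y = [x_m + lam_m * duration for x_m, lam_m in zip(x, lam)]
    y[n] = theta * x[n]  # the served buffer ends at the prescribed level
    return y


def switching_phase(x, sigma, lam):
    """State after a setup of duration sigma: all buffers only fill up."""
    return [x_m + lam_m * sigma for x_m, lam_m in zip(x, lam)]


def monodromy(x, cycle, lam, mu):
    """Compose the phase maps over one full cycle."""
    for phase in cycle:
        if phase[0] == "serve":
            _, n, theta, delta = phase
            x = service_phase(x, n, theta, delta, lam, mu)
        else:
            _, sigma = phase
            x = switching_phase(x, sigma, lam)
    return x


def iterate(x, cycle, lam, mu, cycles):
    for _ in range(cycles):
        x = monodromy(x, cycle, lam, mu)
    return x


# Illustrative data (assumed): two buffers, each emptied once per cycle.
LAM = [1.0, 1.0]
MU = [4.0, 4.0]
CYCLE = [("serve", 0, 0.0, 0.0), ("switch", 1.0),
         ("serve", 1, 0.0, 0.0), ("switch", 1.0)]

if __name__ == "__main__":
    for x0 in ([0.0, 0.0], [100.0, 50.0]):
        print(x0, "->", iterate(x0, CYCLE, LAM, MU, 30))
```

For these numbers the state at the start of the cycle approaches (3, 1) from both initial conditions, which matches a period of 4 time units: two setups of length 1 and two service intervals of length 1.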

Theorem 2
The proposed policy gives rise to a unique periodic process, which is equal to π 0 and attracts all other processes.
Proof The proposed PCR's trivially meet (i) from Sect. 3. So the entire policy does generate π 0 and X 0 0 := X 0 (0) is the equilibrium of the monodromy operator M. By Theorem 1 and the standard argument presented in e.g., [34][35][36], it suffices to show that the assumptions of Theorem 1 are true for M.
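By elementary computation based on the PCR's, the phase dynamical operators are as follows. For the switching phase, every content $x_m$ is mapped into $x_m + \lambda_m \sigma_i^0$. For the service phase of buffer $n = q_i$, the content $x_n$ is mapped into $\theta_i^n x_n$, whereas $x_m$ with $m \neq n$ is mapped into $x_m + \lambda_m \big[ \frac{(1 - \theta_i^n) x_n}{\mu_n - \lambda_n} + \delta_i \big]$. They are clearly affine, continuous, monotone, and dominated. So evidently is their composition $M$. It is also strictly dominated since so is the operator $T_{P_{c-1}}$ (where $q_{c-1} = \ominus$ by (3)), which is the last to act in the composition (2). Thus, the assumptions of Theorem 1 are satisfied.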
For this example, the phase dynamical operators are affine, which is not the case for the next example, where they are only piece-wise affine.
Let $\pi^0$ be a $T$-periodic process⁸ for which any service of any buffer $n$ starts at the maximal rate $\mu_n$ and proceeds at the input rate $\lambda_n$, where any of these periods may be of zero duration, i.e., may not occur in effect. As in Sect. 4, the interest in such processes is inspired by optimization issues.
As in Sect. 4, it can be assumed that $T$ is the end of a switching period. To transform $\pi^0$ into a switching policy, we still decompose $\pi^0$ into the simplest phases (1) by partitioning $[0, T]$ into the intervals where the discrete state is constant. Then any phase $P_i$ from (1) is associated with either an active mode, $P_i \sim m_i$, or switching, $P_i \sim \ominus$, which are arranged in the sequence (5), where the last phase $P_{c-1}$ is switching.
PCR for the switching phase $P_i \sim \ominus$. Switching is implemented for the duration of $\sigma_i^0$ time units, where $\sigma_i^0 > 0$ is its duration along $\pi^0$.
PCR for the service phase $P_i \sim m_i$. To state this rule, we employ the quantity $\theta_i^n$ from (4) in Sect. 4. Let $\delta_i^n$ denote the duration of the service of buffer $n$ at the input rate at phase $P_i$ for process $\pi^0$. The PCR is as follows:
(1) Every buffer $n \in J_{m_i}$ is served at the maximal rate $\mu_n$ until its content reduces to the level $\theta_i^n x_n(t_i)$, where $t_i$ is the time when $P_i$ has commenced;
(2) When task (1) is completed for a buffer $n \in J_{m_i}$, the server reduces the service rate for this buffer to the input rate $\lambda_n$ and maintains it for no less than $\delta_i^n$ time units and until the phase end;
(3) The phase is terminated as soon as task (1) is accomplished and the compulsory time $\delta_i^n$ of service at the input rate has expired for all buffers $n \in J_{m_i}$.
The entire policy is to progress through the periodically repeated sequence of phases (5) while applying the relevant phase control rule within every phase. Thus the server is driven only by data about the currently served buffers.
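The effect of rules (1)-(3) on the state can be sketched as follows. This Python fragment is an illustration under stated assumptions: the phase duration is computed here from the rules themselves as the largest, over the served buffers, of the draining time plus the compulsory time at the input rate; the function name, the dictionary-based parameters, and the numbers are hypothetical. The maximum is what makes the map piece-wise affine rather than affine.

```python
def shared_service_phase(x, served, theta, delta, lam, mu):
    """State after a service phase in which the buffers in `served` are
    served simultaneously. Requires mu[n] > lam[n] for every served n.

    theta[n], delta[n]: retained fraction and compulsory input-rate time
    for each served buffer n. Returns the new state and the phase duration.
    """
    # Buffer n is done after its draining time plus delta[n]; the phase
    # ends when the last served buffer is done.
    tau = max(
        (1.0 - theta[n]) * x[n] / (mu[n] - lam[n]) + delta[n]
        for n in served
    )
    # Buffers that are not served fill up during the whole phase.
    y = [x_n + lam_n * tau for x_n, lam_n in zip(x, lam)]
    # A served buffer stays at its target level once it gets there.
    for n in served:
        y[n] = theta[n] * x[n]
    return y, tau


if __name__ == "__main__":
    lam = [1.0, 1.0, 1.0]
    mu = [3.0, 3.0, 3.0]
    theta = {0: 0.0, 1: 0.0}
    delta = {0: 0.0, 1: 1.0}
    # Which buffer determines the phase duration depends on the state.
    print(shared_service_phase([6.0, 2.0, 5.0], [0, 1], theta, delta, lam, mu))
    print(shared_service_phase([2.0, 6.0, 5.0], [0, 1], theta, delta, lam, mu))
```

In the first call buffer 0 is the bottleneck and in the second call buffer 1 is, so the two states lie in different regions on which the map is affine.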

Theorem 3
The proposed policy gives rise to a unique periodic process, which is equal to π 0 and attracts all other processes.
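Proof Similarly to the proof of Theorem 2, it suffices to show that the assumptions of Theorem 1 are true for the monodromy operator $M$. By elementary computation based on the PCR's, the phase dynamical operators are as follows: for the switching phase $P_i \sim \ominus$, every content $x_n$ is mapped into $x_n + \lambda_n \sigma_i^0$, and for the service phase $P_i \sim m_i$,
$T_{P_i}[X] = Y + \tau A, \quad Y = \{y_n\}_{n=1}^{N}, \quad A = \{a_n\}_{n=1}^{N}$,
where
$y_n = \begin{cases} \theta_i^n x_n & \text{if } n \in J_{m_i} \\ x_n & \text{otherwise} \end{cases}, \quad a_n = \begin{cases} 0 & \text{if } n \in J_{m_i} \\ \lambda_n & \text{otherwise} \end{cases}, \quad \tau = \max_{n \in J_{m_i}} \Big[ \frac{(1 - \theta_i^n) x_n}{\mu_n - \lambda_n} + \delta_i^n \Big]$.
Note that $T_{P_i}$ is a piece-wise affine monotone continuous function of $X$, whose cells (see Definition 1) are enumerated by $n \in J_{m_i}$ and look as follows:
$\frac{(1 - \theta_i^n) x_n}{\mu_n - \lambda_n} + \delta_i^n \ge \frac{(1 - \theta_i^{n'}) x_{n'}}{\mu_{n'} - \lambda_{n'}} + \delta_i^{n'} \quad \forall n' \in J_{m_i}$.
On this cell, $\tau$ is equal to the expression on the left. It follows that the entire dynamical operator $T_{P_i}$ is not only piece-wise affine, continuous, and monotone but also dominated. So evidently is the composition $M$ of these operators. It is also strictly dominated since so is the operator $T_{P_{c-1}}$ (by (5)), which is the last to act in the composition (2). Thus the assumptions of Theorem 1 are satisfied.
⁷ (continued) modes. The processes in the original system can be identified with those in the auxiliary $k$-server system for which all servers, first, are switched synchronously and, second, always serve buffers from a common set $J_m$. So if a switching policy designed for the auxiliary system gives rise only to processes with these properties, it can be interpreted as a policy for the original system.
⁸ Existence of such a process clearly implies that $\mu_n > \lambda_n\ \forall n$.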
The following lemma addresses existence of a periodic process and shows that such process exists if and only if the system is stabilizable (there is a way to control the system so that the total queue is kept bounded).

Lemma 1 The following statements are equivalent for the system at hand:
(i) There exists a periodic process of the kind that we have considered (i.e., such that any service of any buffer $n$ starts at the maximal rate $\mu_n$ and proceeds at the input rate $\lambda_n$);
(ii) There exists a (not necessarily periodic) process along which the total amount of work in the system remains bounded as time progresses;
(iii) The following inequality holds:
$\sum_{m=1}^{M} \max_{n \in J_m} \frac{\lambda_n}{\mu_n} < 1$; (7)
(iv) There exists a cyclic switching policy that generates a unique periodic process, which attracts all other processes.
An example of such a policy is given by any periodically repeated production cycle (5) equipped with the above PCR's, where the parameters $\theta_i^n \in [0, 1)$, $\delta_i > 0$, $\sigma^0_{2i+1} \ge \sigma_{2i \to 2i+2}$ are arbitrarily chosen, provided that any mode $m$ is encountered in the chain (5) and $\theta_i^n = \theta_i^m\ \forall n \in J_m$.
The proof of this lemma is given in Appendix B. The last requirement from the lemma is in fact not necessary and is imposed to simplify the proof in view of the paper length limitations.
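The inequality in item (iii) of Lemma 1 is straightforward to evaluate. The Python sketch below reads it as the sum over the modes of the largest load ratio λ_n/μ_n among the buffers served in that mode; this reading, the function name, and the sample data are assumptions made for illustration.

```python
def is_stabilizable(modes, lam, mu):
    """Check sum over modes of max_{n in J_m} lam[n]/mu[n] < 1.

    modes is a list of buffer-index lists J_1, ..., J_M (zero-based).
    Returns the verdict together with the computed total load.
    """
    load = sum(max(lam[n] / mu[n] for n in J) for J in modes)
    return load < 1.0, load


if __name__ == "__main__":
    # Illustrative data: an intersection with two modes of two flows each.
    modes = [[0, 1], [2, 3]]
    lam = [1.0, 1.0, 1.0, 1.0]
    print(is_stabilizable(modes, lam, [4.0, 2.0, 4.0, 4.0]))
    # A polling system is the special case of one buffer per mode.
    print(is_stabilizable([[0], [1]], [1.0, 1.0], [4.0, 4.0]))
```

Within a mode only the most heavily loaded buffer matters, since the buffers of the mode are served in parallel.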
Since the system at hand generalizes that from Sect. 4, Lemma 1 extends to polling systems. For them, (7) takes the form $\sum_{n=1}^{N} \frac{\lambda_n}{\mu_n} < 1$ since modes are associated with buffers, $m \sim n$, and every $J_m$ contains only one buffer $n$.
The Kumar-Seidman network consists of four buffers and two servers and processes a single job flow; see Fig. 2. Work arrives at the first buffer at a constant rate $\lambda > 0$, is then processed by server 1, then twice by server 2, and finally by server 1 once more, after which it leaves the system. Any server can serve only one buffer at a given time. Switching between buffers consumes setup times $\sigma_{1 \to 4}, \sigma_{4 \to 1}, \sigma_{2 \to 3}, \sigma_{3 \to 2} > 0$, respectively. The maximal service rate is $\mu_n > 0$ for buffer $n$.
Thus the continuous state is $X = \{x_n\}_{n=1}^{4}$ and the service areas are $I_1 = \{1, 4\}$, $I_2 = \{2, 3\}$. The system is stabilizable, i.e., the total amount of work can be kept bounded provided that the system is properly controlled. This holds if and only if every server has enough capacity to process the job inflow [17]:
$\frac{\lambda}{\mu_1} + \frac{\lambda}{\mu_4} < 1, \quad \frac{\lambda}{\mu_2} + \frac{\lambda}{\mu_3} < 1$. (8)
The model at hand was introduced in [23], and also analyzed in [29], to demonstrate that the clearing policy (serve any buffer until emptying) is inappropriate since it may cause instability: even if (8) holds, the total amount of work may explode. Moreover, this inevitably holds whenever
$\frac{\lambda}{\mu_2} + \frac{\lambda}{\mu_4} > 1$. (9)
It is this case that is examined: (8) and (9) are assumed to be true. Then $\mu_1 > \mu_2$ and $\mu_3 > \mu_4$, i.e., $\dot{x}_2 > 0$ (or $\dot{x}_4 > 0$) if buffers 1 and 2 (or 3 and 4) are simultaneously served at the maximal rates.

Optimal periodic behavior of the Kumar-Seidman system
In [25], the optimal periodic behavior for this network has been determined with respect to the long-run time-averaged weighted wip (work in progress), i.e., the long-run time average of $\sum_{n=1}^{4} c_n x_n(t)$ with given weights $c_n$, under the following technical assumption.

Assumption 1 No downstream buffer is valued more than an upstream one, i.e., the weights satisfy $c_1 \ge c_2 \ge c_3 \ge c_4$.
More precisely, a periodic process is said to be simple if every server processes each of the associated buffers only once during the period. The paper [25] characterizes the optimal simple periodic process in terms of the separate activities of the first and second servers. For switching policy design, we need to know more: how these activities are combined and how this combination evolves over time. The following theorem summarizes the results of [25].
Theorem 4 An optimal simple periodic process exists. For this optimal periodic behavior, server 1 repeatedly goes through the following successive phases:
• Setup from 4 to 1 for a duration of $\sigma_{4 \to 1}$ ($x_2 = 0$ upon completion),
• Serve 1 at rate $\mu_1$ for a duration of $\rho$ …,
• Serve 1 at the input rate $\lambda$ for a duration of …,
• Setup from 1 to 4 for a duration of $\sigma_{1 \to 4}$,
• Serve 4 at rate $\mu_4$ for a duration of $\rho_4 T$,
and server 2 repeatedly goes through the following successive phases:
• Setup from 3 to 2 for a duration of $\sigma_{3 \to 2}$,
• Serve 2 at the maximal rate for a duration of $(1 - \rho_3) T - (\sigma_{2 \to 3} + \sigma_{3 \to 2})$, which is either at rate $\mu_2$ as long as $x_2 > 0$ or at rate 0 when $x_2 = 0$,
• Setup from 2 to 3 for a duration of $\sigma_{2 \to 3}$ ($x_4 = 0$ upon completion),
• Serve 3 at rate $\mu_3$ for a duration of $\rho_3 T$ ($x_3 = 0$ upon completion),
where T denotes the optimal duration of the period (see [25] for the explicit expression). Furthermore, depending on the parameters c i , λ, μ i , σ i→ j , either the setups of duration σ 4→1 and σ 3→2 are finished simultaneously, or the setups of duration σ 1→4 and σ 2→3 are finished simultaneously.
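As a sanity check on the phase list of server 2, the four durations should add up to one period $T$. The sketch below verifies this for given numbers; the function name and the sample values of $T$, $\rho_3$ and the setup times are hypothetical, and $\rho_3$ is simply treated as a given parameter.

```python
import math


def server2_phase_durations(T, rho3, sigma_23, sigma_32):
    """List the four phases of server 2 over one period with their durations."""
    # Time spent at buffer 2: what is left of the period after serving
    # buffer 3 and performing both setups.
    serve2 = (1 - rho3) * T - (sigma_23 + sigma_32)
    if serve2 < 0:
        raise ValueError("period T is too short for the given setups")
    return [
        ("setup 3->2", sigma_32),
        ("serve 2", serve2),
        ("setup 2->3", sigma_23),
        ("serve 3", rho3 * T),
    ]


# Hypothetical numbers for illustration only.
phases = server2_phase_durations(T=10.0, rho3=0.1, sigma_23=1.0, sigma_32=2.0)
total = sum(duration for _, duration in phases)
print(phases, math.isclose(total, 10.0))
```

The same bookkeeping applies to server 1 once its two service durations at buffer 1 are known from [25].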
As explained in [25], eight different time evolutions of buffer contents result, depending on the above parameters. In this paper, we discuss, for three of these cases, how to arrive at a distributed feedback switching control law that gives rise to the optimal steady state behavior and makes it globally attractive. Since two of the cases can be merged, we call the cases Case 1(a), Case 1(b), and Case 2, respectively.
All other cases can be easily treated along the same lines; their discussion is omitted only to meet the paper length limit.
For the considered cases, the optimal periodic behavior is illustrated by Figs. 3 and 4, respectively. For each of them, the buffers are served at the maximal feasible rates.
In Case 1 from Fig. 3, the optimal behavior consists of periodic repetition of the following successive phases:

• $P_0$: The servers simultaneously start services of buffers 1 and 2; during the phase, buffer 1 is drained out and then served at the input rate;
• $P_1$: Server 1 goes to buffer 4 and serves it until emptying; server 2 empties buffer 2 and switches to buffer 3, the switch is completed when buffer 4 is drained out;
• $P_2$: Server 2 empties buffer 3 and then switches to buffer 2, where it idles for some time $\tau_2^0$. Server 1 serves buffer 4 and then switches to buffer 1. This switch is completed synchronously with the end of the idling period of server 2, which is the end of the phase.

Fig. 3 Case 1 of the optimal periodic behavior, panels (a) and (b)

Fig. 4 Case 2 of the optimal periodic behavior
In Case 2 from Fig. 4, the optimal behavior consists of periodic repetition of the following successive phases:

• $P_1$: The servers simultaneously start services of buffers 3 and 4. Server 2 empties buffer 3, then switches to buffer 2 and serves it until emptying. Server 1 serves buffer 4 until emptying and then begins switching to buffer 1. While this switch is in progress, buffer 2 is emptied, which is the end of the phase;
• $P_2$: After emptying buffer 2, server 2 goes to buffer 3. Server 1 completes the switch to buffer 1, serves it until emptying and then even longer, and finally goes to buffer 4. Switches to buffers 3 and 4 are completed synchronously.
The difference between Fig. 3a and b concerns only the evolution of buffer 4 at phase $P_2$. Case 1(b) occurs if and only if $\sigma_{4\to 1} < \sigma_{3\to 2} + \tau_2^0$, where $\tau_2^0$ is the idling time of server 2. Then the content of buffer 4 decreases during some sub-phase of this phase. In Case 1(a), $\sigma_{4\to 1} \ge \sigma_{3\to 2} + \tau_2^0$, and the content of buffer 4 never decreases at this phase.
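The distinction between the two sub-cases reduces to a single comparison of setup and idling times, which can be sketched as follows. The function name and the sample numbers are hypothetical.

```python
def classify_case1(sigma_41, sigma_32, tau_2_0):
    """Return '1(b)' if the 4->1 setup is shorter than the 3->2 setup
    plus the idling time of server 2, and '1(a)' otherwise."""
    if sigma_41 < sigma_32 + tau_2_0:
        return "1(b)"
    return "1(a)"


# Hypothetical setup and idling times for illustration only.
print(classify_case1(sigma_41=1.0, sigma_32=1.0, tau_2_0=0.5))
print(classify_case1(sigma_41=2.0, sigma_32=1.0, tau_2_0=0.5))
```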
Dissimilarities in the two optimal behaviors can be related to the fact that the problem is reducible to that of optimization of a linear function over a polytope. The solution to this optimization problem may abruptly jump from one vertex to another under the continuous change of the parameters.
To design the switching policy, we also need the following parameters of the optimal process:

Notation 1
• $\tau_2^0$: the idle time of server 2 at phase $P_2$, see Fig. 3 and (13a);
• $\tau_1^\lambda$: the duration of the period when server 1 serves the emptied buffer 1 at the input rate $\lambda$ at phases $P_0$ and $P_2$, see Figs. 3, 4 and (13b);
• $\theta$: the fraction of the maximal content of buffer 2 at phase $P_0$ that remains in this buffer at the phase end, see Fig. 3b and (13c);
• $\xi$: the fraction of the buffer 3 initial content $x_3^*$ at phase $P_2$ that remains there at the start of server 1 switching $4 \to 1$, see Fig. 3 and (13d);
• $\zeta$: the fraction of the buffer 4 content $x_4^*$ at the start of server 2 switching $3 \to 2$ at phase $P_2$ that is in this buffer at the first time instant when both servers are involved in switching within the phase, see Fig. 3 and (13d);
• $\nu$: the percentage of the switching period $\sigma_{4\to 1}$ that elapses until buffer 2 is emptied at phase $P_2$, see Fig. 4 and (13e).

For a given value of $T$, these parameters are given by:

Optimal switching policy
Now by following the guidelines from Sect. 3, we propose a simple interactive switching policy that ensures that after a transient and irrespective of the initial state, the system inevitably exhibits the optimal periodic behavior described in Theorem 4. Since there are two qualitatively different optimal behaviors, two switching policies are offered to handle the cases where the first or second behavior occurs, respectively.
The partition (1) of the process into phases is merely borrowed from Theorem 4. The phase control rules are designed so that the system behavior mimics that from Fig. 3 or 4.
Switching policy 1 (to be applied in Case 1 illustrated by Fig. 3)

1. Whenever any buffer $n$ is served, the service is at the maximal feasible rate:
2. The servers are switched so that the discrete state $Q(t) = [q_1(t), q_2(t)]$ periodically repeats the following cycle: (15)
3. Transition (a) is implemented as soon as
   3(a) buffer 1 is emptied
   3(b) and after this the level of buffer 2 is reduced to the value $\theta x_2(\tau)$.
   Here $\tau$ is the time when event 3(a) occurs, and $\theta$ is introduced in Notation 1;
4. Within phase $P_1$,
   • Server 1 switches from buffer 1 to 4 for $\sigma_{1\to 4}$ time units and then serves buffer 4 until emptying and possibly longer, waiting for the switch of server 2 to be completed;
   • Server 2 serves buffer 2 until emptying, then switches to buffer 3 for $\sigma_{2\to 3}$ time units and then possibly idles, waiting for buffer 4 to be emptied.
5. Transition (b) from phase $P_1$ to $P_2$ is implemented as soon as, first, buffer 4 is empty and, second, switching of server 2 from buffer 2 to 3 is completed;
6. Within phase $P_2$,
   • Server 2 empties buffer 3, then switches to buffer 2 for $\sigma_{3\to 2}$ time units, and finally idles for $\tau_2^0$ time units and possibly longer, waiting for the switch of server 1 to be completed.
   • Server 1 serves buffer 4 until the content of buffer 3 decays to $\xi x_3^0$ and after this the level of buffer 4 is reduced to $\zeta x_4(\tau_*)$, where $x_3^0$ is the buffer level at the start of the phase and $\tau_*$ is the time instant when the first of these reductions is completed. After this, server 1 switches to buffer 1 for a duration of $\sigma_{4\to 1}$ time units and then possibly idles, waiting for the compulsory idling time $\tau_2^0$ of server 2 to expire. Here $\tau_2^0$, $\xi$, and $\zeta$ are introduced in Notation 1;
7. Transition (c) is implemented as soon as switching of server 1 is completed and the compulsory idling time $\tau_2^0$ of server 2 has expired.
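Rule 3 of policy 1 is a two-stage trigger: first wait until buffer 1 is empty, remember the level of buffer 2 at that moment, then wait until buffer 2 has dropped to the fraction $\theta$ of the remembered level. A minimal sketch of such a trigger is given below; the class name, the sampled interface, and the tolerance `eps` are assumptions made for illustration, not part of the policy as stated.

```python
class TransitionATrigger:
    """Two-stage detector for transition (a) of switching policy 1 (sketch)."""

    def __init__(self, theta, eps=1e-12):
        self.theta = theta
        self.eps = eps            # numerical tolerance (an assumption)
        self.x2_at_tau = None     # level of buffer 2 when buffer 1 first empties

    def update(self, x1, x2):
        """Feed current levels of buffers 1 and 2; return True once the
        transition should fire."""
        if self.x2_at_tau is None:
            if x1 > self.eps:
                return False      # event 3(a) has not occurred yet
            self.x2_at_tau = x2   # event 3(a): buffer 1 is empty at time tau
        # Event 3(b): buffer 2 reduced to theta * x2(tau).
        return x2 <= self.theta * self.x2_at_tau + self.eps


# Hypothetical sampled trajectory (x1, x2) for illustration only.
trigger = TransitionATrigger(theta=0.5)
samples = [(2.0, 1.0), (0.0, 3.0), (0.0, 2.0), (0.0, 1.5)]
results = [trigger.update(x1, x2) for x1, x2 in samples]
print(results)
```

In this sample run buffer 1 empties at the second sample, where buffer 2 holds 3.0 units, so the trigger fires once buffer 2 has fallen to 1.5.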
Remark 1
(i) Formula (15) displays the longest chains of discrete state transitions that may be observed during phases $P_1$ and $P_2$; the rigorous definitions of these phases are given in 4 and 6, respectively. Some sub-phases, like (4, 2) in $P_1$, may be missed depending on the initial state and the serial number of the cycle (15) at hand. Such phases are said to be flexible.
(ii) To determine the end of the current phase, any server needs the one-bit 'end of mission' notification from the companion server.
(iii) Within the flexible phase $P_2$ in Case 1(b) and phase $P_1$, operation of every server is based on data about the current level of the buffer served. In particular, each server operates with no regard to what is going on with the other server. As for $P_2$ in Case 1(a), server 1 needs a one-bit notification that the required decrease in the level of buffer 3 is achieved.
(iv) Rule 6 is well-defined since buffer 4 should be unloaded only in Case 1(b), where server 2 does not supply work to it from the start of the unload until the phase ends.
(v) For definiteness, we assume that $Q(0) = (1, 2)$. Then given the initial state $X(0)$, policy 1 uniquely determines a process in the system.
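The one-bit 'end of mission' exchange of item (ii) can be pictured as a two-party barrier: each server raises a flag when its own part of the phase is done, and the phase ends when both flags are up. The sketch below is one hypothetical way to realize this; the class and method names are assumptions and the text does not prescribe any particular implementation.

```python
class PhaseBarrier:
    """End-of-phase detection from one-bit notifications of two servers (sketch)."""

    def __init__(self):
        self.done = {1: False, 2: False}

    def notify(self, server):
        """Record that `server` (1 or 2) finished its mission for the phase.
        Return True exactly when this notification completes the phase."""
        self.done[server] = True
        if all(self.done.values()):
            # Both servers are finished: reset the flags for the next phase.
            self.done = {1: False, 2: False}
            return True
        return False


barrier = PhaseBarrier()
first = barrier.notify(2)    # server 2 finishes first and then idles
second = barrier.notify(1)   # server 1 finishes, so the phase ends
third = barrier.notify(1)    # next phase: server 1 is first this time
print(first, second, third)
```

Only the flags cross between the servers; the buffer levels themselves stay local, which is the point of the distributed design.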
Switching policy 2 (to be applied in Case 2 illustrated by Fig. 4)

1. Whenever a buffer is served, the service is at the maximal feasible rate (14);
2. The servers are switched so that the discrete state $Q(t) = [q_1(t), q_2(t)]$ periodically repeats the following cycle:
3. During phase $P_1$,
   • Server 2 serves buffer 3 until emptying, then switches to buffer 2 for $\sigma_{3\to 2}$ time units, then serves buffer 2 until emptying, and finally possibly idles, waiting for server 1 to complete its mission for this phase;
   • Server 1 empties buffer 4, then undergoes the first part of switching to buffer 1 for $\nu\sigma_{4\to 1}$ time units, and finally possibly idles, waiting for server 2 to empty buffer 2. (Here $\nu$ is introduced in Notation 1.)
4. Transition (a) is implemented as soon as buffer 2 is emptied and server 1 completes the required percentage of switching $4 \to 1$.
5. During phase $P_2$,
   • Server 1 completes switching $4 \to 1$ for $(1 - \nu)\sigma_{4\to 1}$ time units, then empties buffer 1, continues to serve it at the input rate for $\tau_1^\lambda$ time units and possibly even longer so that it leaves buffer 1 no sooner than $\sigma_{2\to 3} - \sigma_{1\to 4}$ time units after the phase beginning, and finally switches to buffer 4 for a duration of $\sigma_{1\to 4}$ time units;
   • Server 2 switches from buffer 2 to buffer 3 for $\sigma_{2\to 3}$ time units and then possibly idles, waiting for server 1 to complete its mission. Here $\tau_1^\lambda$ is introduced in Notation 1;
6. Transition (b) is implemented as soon as the switch $1 \to 4$ is completed.
Remark 2
• The duration of service of the emptied buffer 1 at phase $P_2$ is adjusted so that switching $1 \to 4$ is completed at the earliest occasion after the end of the switch $2 \to 3$.
• The servers operate independently and on the basis of data from only the currently served buffer within both flexible phases $P_1$ and $P_2$.
• For definiteness, we assume that $Q(0) = (4, 3)$. Then given the initial state $X(0)$, policy 2 uniquely determines a process in the system.
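The timing of server 1 during phase $P_2$ of policy 2 can be worked out from the level of buffer 1 at the start of the phase. The sketch below follows rule 5 literally under fluid dynamics, assuming $\mu_1 > \lambda$; the function name and the sample numbers are hypothetical.

```python
import math


def policy2_phase2_timing(x1, lam, mu1, nu, sigma_41, sigma_14, sigma_23, tau_1_lam):
    """Return (leave_time, phase_duration, x1_final) for phase P2 of policy 2.
    x1 is the level of buffer 1 at the start of the phase."""
    switch_rest = (1 - nu) * sigma_41              # remaining part of setup 4->1
    level_after_switch = x1 + lam * switch_rest    # buffer 1 fills during the setup
    drain_time = level_after_switch / (mu1 - lam)  # net drain rate is mu1 - lam
    earliest_leave = switch_rest + drain_time + tau_1_lam
    # Server 1 must not leave buffer 1 before sigma_23 - sigma_14.
    leave_time = max(earliest_leave, sigma_23 - sigma_14)
    phase_duration = leave_time + sigma_14         # phase ends when setup 1->4 ends
    x1_final = lam * sigma_14                      # arrivals during setup 1->4
    return leave_time, phase_duration, x1_final


# Hypothetical numbers: long setup 2->3, so server 1 lingers at buffer 1.
slow = policy2_phase2_timing(2.0, 1.0, 5.0, 0.5, 2.0, 1.0, 10.0, 0.5)
# Hypothetical numbers: short setup 2->3, so the lower bound is inactive.
fast = policy2_phase2_timing(2.0, 1.0, 5.0, 0.5, 2.0, 1.0, 2.0, 0.5)
print(slow, fast)
```

In the first call the phase lasts exactly $\sigma_{2\to 3}$, so both setups finish together; in the second call the setup $2 \to 3$ finishes early and server 2 idles until server 1 is done.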
The following theorem shows that the proposed policies ensure asymptotically optimal performance of the closed-loop system.

Theorem 5 Let (8) and (10) hold. Suppose that policy 1 is applied in Case 1 and policy 2 is used in Case 2. Then each of these policies gives rise to a unique periodic process, which attracts all other processes in the Kumar-Seidman system. Moreover, this periodic process represents the optimal behavior described in Theorem 4.

Proof of Theorem 5
The proposed phase control rules trivially meet the requirement (i) from Sect. 3. So the entire switching policy generates the periodic process described in Theorem 4, and $X_0^0 := X^0(0)$ is the equilibrium of the monodromy operator $M$. By Theorem 1 and the standard argument presented in, e.g., [34][35][36], it suffices to show that the assumptions of this theorem are true for this operator.

Policy 1. The phase dynamical operators are easily computed:

They are clearly piece-wise affine, continuous, monotone, and dominated. So evidently is their composition $M = T_{P_2} \circ T_{P_1} \circ T_{1,2}$. As for the strict dominance, we are going to examine $M^2$. It is easy to see that $+\lambda\sigma_{2\to 3}$ or $+\lambda\sigma_{1\to 4}$ in the first line of the formula for $T_{P_1} X$ is converted into $+\mathrm{const}\,(> 0)$ at the first position of $T_{P_2} \circ T_{P_1} X$, in addition to the constant addend $\lambda[\sigma_{3\to 2} + \tau_2^0]$ or $\lambda\sigma_{4\to 1}$. It follows that $T_{P_0} \circ M X$ contains $+\mathrm{const}\,(> 0)$ at the second and third positions; $T_{P_1} \circ T_{P_0} \circ M X$ contains $+\mathrm{const}\,(> 0)$ at the first and third positions; $T_{P_2} \circ T_{P_1} \circ T_{P_0} \circ M X = M^2 X$ contains $+\mathrm{const}\,(> 0)$ at the first and fourth positions. The second and third positions of $M X$ are always zero. So by truncating the state space to $\mathbb{R}^2 = \{\mathrm{col}(x_1, x_4)\}$, we make $M^2$ strictly dominated. So the assumptions of Theorem 1 are satisfied.
Policy 2. In this case, the phase dynamical operators are as follows:

where (b) follows from the computation of the phase duration $\tau = \max\{\sigma_{2\to 3};\ \tau_1^\lambda + \dots\}$ and noting that since $\lambda\tau$ units of work arrive at buffer 1 during the phase, $x_1 + \lambda\tau - \lambda\sigma_{1\to 4}$ units should be removed to buffer 2 to make the final level of buffer 1 equal to $\lambda\sigma_{1\to 4}$. We see that these operators are piece-wise affine, continuous, monotone, and dominated. So evidently is the monodromy operator $M = T_{P_2} \circ T_{P_1}$. As for the strict dominance, we still examine $M^2$. Irrespective of the cell, the expression for $T_{P_2}$ contains +const (

The proof is focused on the case of a strictly dominated operator; the case of a dominated one is considered likewise. Necessity. Let $T$ be strictly dominated, $x \in K$

Lemma A.3 The continuous piece-wise affine operator $T$ is monotone if and only if the entries of the matrix $A_j$ from (ii) of Definition 1 are non-negative for $j = 1, \dots, m$.
Proof Since necessity is obvious, we focus on sufficiency. Let $0 \le x \le y$. Owing to (ii) of Definition 1, a partition $0 = \theta_0 < \theta_1 < \dots < \theta_k = 1$ exists such that

Within any set $S_j$, the operator $T(x) = A_j x + b_j$ is evidently monotone. So

Hence,

Proof (i) and (iii) are obvious; (ii) ⇐ (i).
Lemma A.5 Let the continuous operator $T$ be piece-wise affine, monotone, and strictly dominated, let $x^*$ be its fixed point, $T[x^*] = x^*$, and let $\theta \ge 1$. Then $x_t(\theta x^*) \to x^*$ as $t \to \infty$.
The proof is completed by (20) and the second relation in (19).
i.e., $\lim_{t\to\infty} x_t(0)$ exists. By (iii) of Lemma A.4 and Corollary A.3, this limit is equal to $x^*$. For any $c \ge 0$, there exists $a \ge 0$ such that $a \ge x^*$, $a \ge c$. By invoking (i) of Lemma A.4 once more, we see that $x_t[0] \le x_t[c] \le x_t[a]$ for all $t \ge 0$, where $x_t[0] \to x^*$ and $x_t[a] \to x^*$ as $t \to \infty$ by the foregoing and Corollary A.2, respectively. Hence $x_t[c] \to x^*$ as $t \to \infty$.
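The squeeze argument above can be observed numerically on a toy example. The sketch below iterates a hypothetical scalar map that is continuous, piece-wise affine, and monotone with the unique fixed point $x^* = 2$; the map is chosen only for illustration and is not claimed to satisfy the exact dominance conditions of Definition 1.

```python
def T(x):
    """Hypothetical continuous, monotone, piece-wise affine scalar map."""
    return max(0.5 * x + 1.0, 0.2 * x + 1.2)


def iterate(x0, steps):
    """Apply T to x0 the given number of times."""
    x = x0
    for _ in range(steps):
        x = T(x)
    return x


x_star = 2.0                       # T(2) = max(2.0, 1.6) = 2.0
low = iterate(0.0, 60)             # trajectory started at 0
mid = iterate(1.0, 60)             # trajectory started at some c >= 0
high = iterate(10.0, 60)           # trajectory started at a >= max(c, x_star)
print(low, mid, high)
```

Because the map is monotone, the trajectory started at $c$ stays between the other two at every step, and both outer trajectories approach the fixed point.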
Proof of Theorem 1 The iteration $T^m$ is clearly a piece-wise affine continuous monotone map. Applying Lemma A.6 to this iteration shows that it has a unique fixed point attracting all trajectories. Since any fixed point of $T[\cdot]$ is a fixed point of the iteration, the fixed point of $T[\cdot]$ is also unique. For the trajectory $\{x_k\}$ from Theorem 1 and any $s \in [0 : m-1]$, the sequence $y_t := x_{t\cdot m+s}$, $t = 0, 1, \dots$ is the trajectory of the iterated system $y_{t+1} = T^m[y_t]$. Hence $y_t \to x^*$ as $t \to \infty$ by Lemma A.6. Since $s$ is arbitrary, this completes the proof.

true for any switching phase (with another constant $\varphi_i$) since the time of switching is given.
Based on this scant Lyapunov-like property, we are going to show that the monodromy operator (2) maps the set $C_r := \{X \in \mathbb{R}^N : 0 \le X,\ V[X] \le r\}$ into itself provided that $r > 0$ is large enough. To this end, we start with the analysis of $X \in C_r$ such that $V[X] = r$. By assumption, any mode $m$ is encountered in the chain (5); let $i(m)$ stand for the index of the first phase in this mode, $P_i \sim m$. We also enumerate the modes $m_1, \dots, m_M$ to arrange $i(m)$ in the descending order $i(m_1) > i(m_2) > \dots > i(m_M)$. Then

V [M X]
(2) Now we employ (22) (where $i := i(m_1)$ and $X := X^1$) and note that since $i(m_1)$ is the first phase in mode $m_1$, the contents of all buffers $n \in J_{m_1}$ constantly increase during all previous phases and so $x_n^1 > x_n$ for all $n \in J_{m_1}$. Hence

By continuing likewise, we establish that

To extend this conclusion to all $X \in C_r$, we note that $X \in C_r$ implies $\widetilde{X} := (r/V[X])\,X \in C_r$, $V[\widetilde{X}] = r$, and $X \le \widetilde{X}$. The last inequality implies that $M X \le M \widetilde{X}$ since the operator $M$ is monotone (footnote 11). Since the function $V[\cdot]$ is evidently monotone as well, we have $V[M X] \le V[M \widetilde{X}] \le r$, where the last inequality is given by the foregoing. Thus $X \in C_r \Rightarrow M X \in C_r$.
Overall we see that the continuous (see footnote 11) operator $M$ maps the convex compact subset $C_r \subset \mathbb{R}^N$ into itself. By Brouwer's fixed point theorem [30, p. 117], this operator has a fixed point $X^* = M X^*$. The last equation means that the process that starts at the initial state $X^*$ returns to this state after the entire production cycle is completed; thus it is periodic. The proof is finalized by the arguments from the concluding part of the proof of Theorem 3.
Since the policy employed in this part of the proof clearly generates only processes for which any service of any buffer $n$ starts at the maximal rate $\mu_n$ and proceeds at the input rate $\lambda_n$, we have also proved that (iii) ⇒ (i). To complete the proof, it suffices to show that (iv) ⇒ (ii). However, this is evident.

Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.