Signaling for decentralized routing in a queueing network

Abstract

A discrete-time decentralized routing problem in a service system consisting of two service stations and two controllers is investigated. Each controller is affiliated with one station. Each station has an infinite size buffer. Exogenous customer arrivals at each station occur with rate \(\uplambda \). Service times at each station have rate \(\mu \). At any time, a controller can route one of the customers waiting in its own station to the other station. Each controller knows perfectly the queue length in its own station and observes the exogenous arrivals to its own station as well as the arrivals of customers sent from the other station. At the beginning, each controller has a probability mass function (PMF) on the number of customers in the other station. These PMFs are common knowledge between the two controllers. At each time a holding cost is incurred at each station due to the customers waiting at that station. The objective is to determine routing policies for the two controllers that minimize either the total expected holding cost over a finite horizon or the average cost per unit time over an infinite horizon. In this problem there is implicit communication between the two controllers; whenever a controller decides to send or not to send a customer from its own station to the other station it communicates information about its queue length to the other station. This implicit communication through control actions is referred to as signaling in decentralized control. Signaling results in complex communication and decision problems. In spite of the complexity of signaling involved, it is shown that an optimal signaling strategy is described by a threshold policy which depends on the common information between the two controllers; this threshold policy is explicitly determined.
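
The routing system described in the abstract is easy to simulate. The following minimal sketch is ours, not the paper's: it assumes Bernoulli arrivals with probability \(\uplambda \) and Bernoulli services with probability \(\mu \) per slot, a linear holding cost, and a fixed symmetric integer threshold standing in for the common-information threshold \(\textit{TH}_t\) derived in the paper.

```python
import random

def simulate(T=10_000, lam=0.3, mu=0.5, threshold=2, seed=0):
    """Simulate the two-station system under a symmetric threshold policy:
    after services and arrivals, a station sends one customer to the other
    iff its post-arrival length is at least `threshold`. The fixed integer
    `threshold` is an illustrative stand-in for the common-information
    threshold TH_t of the paper. Returns the average holding cost per slot."""
    rng = random.Random(seed)
    x = [0, 0]                         # queue lengths X^1_t, X^2_t
    total_cost = 0
    for _ in range(T):
        # one possible departure (prob. mu) and one possible arrival (prob. lam)
        xbar = [max(x[i] - (rng.random() < mu), 0) + (rng.random() < lam)
                for i in range(2)]
        # decentralized routing: U^i = 1 iff the local length reaches the threshold
        u = [int(xbar[i] >= threshold) for i in range(2)]
        x = [xbar[0] - u[0] + u[1], xbar[1] - u[1] + u[0]]
        total_cost += x[0] + x[1]      # linear holding cost c(x) = x at each station
    return total_cost / T

print(simulate())
```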

Notes

  1. The expectation in all equations appearing in this paper is with respect to the probability measure induced by the policy \(g \in \mathcal {G}_d\). Furthermore, all probabilities (except \(\pi ^1_0,\pi ^2_0\)) appearing in the analysis depend on the policy \(g \in \mathcal {G}_d\).

  2. We are dealing with a team problem here. Therefore, the agents/controllers can agree at the beginning, before the operation of the queueing system begins, on the routing policies they will employ. Such policies then become common knowledge. Therefore, the conditional PMFs, given by (11)–(12), are common knowledge.

  3. Since the Markov chain \(\{Y^1_t\}\) is irreducible and positive recurrent, it has a unique stationary distribution \(\pi ^{g_0}\), and \(\pi ^{g_0}(x)>0\) for all states \(x\) of the Markov chain \(\{Y^1_t\}\) (see Bremaud, 1999, chap. 3). Consequently, \(\pi ^{g_0}(x)>0\) for all \(x\le M\), and \(\max _{x\le M}\frac{\pi ^{i}_0(x)}{\pi ^{g_0}(x)}\) is finite.

References

  • Abdollahi, F., & Khorasani, K. (2008). A novel \(H_\infty \) control strategy for design of a robust dynamic routing algorithm in traffic networks. IEEE Journal on Selected Areas in Communications, 26(4), 706–718.

  • Akgun, O. T., Righter, R., & Wolff, R. (2012). Understanding the marginal impact of customer flexibility. Queueing Systems, 71(1–2), 5–23.

  • Aumann, R. J. (1976). Agreeing to disagree. The Annals of Statistics, 4(6), 1236–1239.

  • Beutler, F. J., & Teneketzis, D. (1989). Routing in queueing networks under imperfect information: Stochastic dominance and thresholds. Stochastics and Stochastic Reports, 26(2), 81–100.

  • Bremaud, P. (1999). Markov chains: Gibbs fields, Monte Carlo simulation, and queues (Vol. 31). Berlin: Springer.

  • Burnetas, A. N., & Katehakis, M. N. (1993). On sequencing two types of tasks on a single processor under incomplete information. Probability in the Engineering and Informational Sciences, 7(1), 85–119.

  • Cogill, R., Rotkowitz, M., Van Roy, B., & Lall, S. (2006). An approximate dynamic programming approach to decentralized control of stochastic systems. In B. A. Francis, M. C. Smith & J. C. Willems (Eds.), Control of uncertain systems: Modelling, approximation, and design (pp. 243–256). Berlin: Springer.

  • Davis, E. (1977). Optimal control of arrivals to a two-server queueing system with separate queues. Ph.D. dissertation, Program in Operations Research, North Carolina State University, Raleigh, NC.

  • Ephremides, A., Varaiya, P., & Walrand, J. (1980). A simple dynamic routing problem. IEEE Transactions on Automatic Control, 25(4), 690–693.

  • Foley, R. D., & McDonald, D. (2001). Join the shortest queue: Stability and exact asymptotics. The Annals of Applied Probability, 11(3), 569–607.

  • Hajek, B. (1984). Optimal control of two interacting service stations. IEEE Transactions on Automatic Control, 29(6), 491–499.

  • Ho, Y. C. (1980). Team decision theory and information structures. Proceedings of the IEEE, 68(6), 644–654.

  • Hordijk, A., & Koole, G. (1990). On the optimality of the generalized shortest queue policy. Probability in the Engineering and Informational Sciences, 4(4), 477–487.

  • Hordijk, A., & Koole, G. (1992). On the assignment of customers to parallel queues. Probability in the Engineering and Informational Sciences, 6(4), 495–511.

  • Katehakis, M. N. (1985). A note on the hypercube model. Operations Research Letters, 3(6), 319–322.

  • Katehakis, M. N., & Melolidakis, C. (1995). On the optimal maintenance of systems and control of arrivals in queues. Stochastic Analysis and Applications, 13(2), 137–164.

  • Kuri, J., & Kumar, A. (1995). Optimal control of arrivals to queues with delayed queue length information. IEEE Transactions on Automatic Control, 40(8), 1444–1450.

  • Lin, W., & Kumar, P. (1984). Optimal control of a queueing system with two heterogeneous servers. IEEE Transactions on Automatic Control, 29(8), 696–703.

  • Mahajan, A. (2013). Optimal decentralized control of coupled subsystems with control sharing. IEEE Transactions on Automatic Control, 58(9), 2377–2382. doi:10.1109/TAC.2013.2251807.

  • Manfredi, S. (2014). Decentralized queue balancing and differentiated service scheme based on cooperative control concept. IEEE Transactions on Industrial Informatics, 10(1), 586–593.

  • Marshall, A., Olkin, I., & Arnold, B. (2010). Inequalities: Theory of majorization and its applications. Berlin: Springer.

  • Menich, R., & Serfozo, R. F. (1991). Optimality of routing and servicing in dependent parallel processing systems. Queueing Systems, 9(4), 403–418.

  • Nayyar, A., Mahajan, A., & Teneketzis, D. (2013). Decentralized stochastic control with partial history sharing: A common information approach. IEEE Transactions on Automatic Control, 58(7), 1644–1658. doi:10.1109/TAC.2013.2239000.

  • Ouyang, Y., & Teneketzis, D. (2013). A routing problem in a simple queueing system with non-classical information structure. In Proceedings of the 51st annual Allerton conference on communication, control, and computing (Allerton) (pp. 1278–1284). Monticello, IL.

  • Ouyang, Y., & Teneketzis, D. (2014). Balancing through signaling in decentralized routing. In Proceedings of the 53rd conference on decision and control (CDC) (pp. 1675–1680). Los Angeles, CA.

  • Pandelis, D. G., & Teneketzis, D. (1996). A simple load balancing problem with decentralized information. Mathematical Methods of Operations Research, 44(1), 97–113.

  • Petersen, K. E. (1989). Ergodic theory (Vol. 2). Cambridge: Cambridge University Press.

  • Reddy, A. A., Banerjee, S., Gopalan, A., Shakkottai, S., & Ying, L. (2012). On distributed scheduling with heterogeneously delayed network-state information. Queueing Systems, 72(3–4), 193–218.

  • Si, X., Zhu, X. L., Du, X., & Xie, X. (2013). A decentralized routing control scheme for data communication networks. Mathematical Problems in Engineering, Article ID 648267.

  • Weber, R. R. (1978). On the optimal assignment of customers to parallel servers. Journal of Applied Probability, 15(2), 406–413.

  • Weber, R. R., & Stidham, S., Jr. (1987). Optimal control of service rates in networks of queues. Advances in Applied Probability, 19(1), 202–218.

  • Whitt, W. (1986). Deciding which queue to join: Some counterexamples. Operations Research, 34(1), 55–62.

  • Winston, W. (1977). Optimality of the shortest line discipline. Journal of Applied Probability, 14(1), 181–189.

  • Witsenhausen, H. S. (1971). Separation of estimation and control for discrete time systems. Proceedings of the IEEE, 59(11), 1557–1566.

  • Ying, L., & Shakkottai, S. (2011). On throughput optimality with delayed network-state information. IEEE Transactions on Information Theory, 57(8), 5116–5132.

Acknowledgments

This work was partially supported by National Science Foundation (NSF) Grant CCF-1111061 and NASA Grant NNX12AO54G. The authors thank Mark Rudelson and Aditya Mahajan for helpful discussions.

Author information

Correspondence to Yi Ouyang.

Additional information

Preliminary versions of this paper appeared in Ouyang and Teneketzis (2013) and Ouyang and Teneketzis (2014).

Appendices

Appendix 1: Proofs of the results in Sect. 4

Proof (Proof of Lemma 1)

Since there is at most one arrival to and at most one departure from each queue at each time instant, (31) holds.

When \((U^{1,\hat{g}}_t,U^{2,\hat{g}}_t)=(0,0)\), both \(\overline{X}^{1,\hat{g}}_t\) and \(\overline{X}^{2,\hat{g}}_t\) are below the threshold and no customers are routed from either queue. Therefore, the upper bound of the queue lengths at \(t+1\) is

$$\begin{aligned} \textit{UB}^{\hat{g}}_{t+1} =&\left\lceil \textit{TH}_t \right\rceil -1. \end{aligned}$$
(97)

Moreover, the lower bound of the queue lengths at \(t+1\) is the same as the lower bound of \(\overline{X}^{1,\hat{g}}_t,\overline{X}^{2,\hat{g}}_t\). That is,

$$\begin{aligned} \textit{LB}^{\hat{g}}_{t+1} =\overline{\textit{LB}}^{\hat{g}}_t. \end{aligned}$$
(98)

When \((U^{1,\hat{g}}_t,U^{2,\hat{g}}_t)=(1,1)\), both \(\overline{X}^{1,\hat{g}}_t\) and \(\overline{X}^{2,\hat{g}}_t\) are greater than or equal to the threshold. Since the routing merely exchanges two customers between the two queues, the queue lengths remain the same as before routing. As a result, the upper and lower bounds of the queue lengths at \(t+1\) are given by

$$\begin{aligned} \textit{UB}^{\hat{g}}_{t+1} =&\,\overline{\textit{UB}}^{\hat{g}}_t. \end{aligned}$$
(99)
$$\begin{aligned} \textit{LB}^{\hat{g}}_{t+1} =&\left\lceil \textit{TH}_t \right\rceil . \end{aligned}$$
(100)

When \((U^{i,\hat{g}}_t,U^{j,\hat{g}}_t)=(1,0), i\ne j\), \(\overline{X}^{i,\hat{g}}_t\) is greater than or equal to the threshold; \(\overline{X}^{j,\hat{g}}_t\) is below the threshold. Since one customer is routed from \(Q_i\) to \(Q_j\),

$$\begin{aligned}&X^{i,\hat{g}}_{t+1} = \overline{X}^{i,\hat{g}}_t-1, \end{aligned}$$
(101)
$$\begin{aligned}&X^{j,\hat{g}}_{t+1} = \overline{X}^{j,\hat{g}}_t+1. \end{aligned}$$
(102)

Therefore, the upper bound of the queue lengths at \(t+1\) becomes

$$\begin{aligned} \textit{UB}^{\hat{g}}_{t+1} =&\max \left\{ \overline{\textit{UB}}^{i,\hat{g}}_t-1 ,\left\lceil \textit{TH}_t \right\rceil -1 + 1\right\} \nonumber \\ =&\max \left\{ \overline{\textit{UB}}^{i,\hat{g}}_t-1 ,\left\lceil \textit{TH}_t \right\rceil \right\} , \end{aligned}$$
(103)

and the lower bound of the queue lengths at \(t+1\) is given by

$$\begin{aligned} \textit{LB}^{\hat{g}}_{t+1} =&\min \left\{ \left\lceil \textit{TH}_t \right\rceil -1,\overline{\textit{LB}}^{j,\hat{g}}_t+1 \right\} . \end{aligned}$$
(104)

\(\square \)
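
The case analysis above amounts to a simple update rule for the common-information bounds. Below is a compact sketch (the naming is ours; for the mixed actions, `ub_bar` is read as the sending queue's bound \(\overline{\textit{UB}}^{i,\hat{g}}_t\) and `lb_bar` as the receiving queue's bound \(\overline{\textit{LB}}^{j,\hat{g}}_t\)):

```python
import math

def update_bounds(u1, u2, ub_bar, lb_bar, th):
    """Bound updates of Lemma 1, Eqs. (97)-(104): given the joint routing
    action (u1, u2), the pre-routing bounds and the threshold th, return
    (UB_{t+1}, LB_{t+1})."""
    th_ceil = math.ceil(th)
    if (u1, u2) == (0, 0):          # both below threshold: Eqs. (97)-(98)
        return th_ceil - 1, lb_bar
    if (u1, u2) == (1, 1):          # customers exchanged: Eqs. (99)-(100)
        return ub_bar, th_ceil
    # exactly one queue sends: Eqs. (103)-(104)
    return max(ub_bar - 1, th_ceil), min(th_ceil - 1, lb_bar + 1)
```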

Appendix 2: Proofs of the results in Sect. 5

Proof (Proof of Lemma 2)

The proof is done by induction.

At time \(t=0\), \(X^{1,\hat{g}}_0+X^{2,\hat{g}}_0 = X^{1,g}_0+X^{2,g}_0=x_0\).

Suppose the lemma is true at time \(t\).

At time \(t+1\), from the system dynamics (1)–(3) we get, for any \(g\),

$$\begin{aligned} X^{1,g}_{t+1}+X^{2,g}_{t+1} = \left( X^{1,g}_t - D^1_t \right) ^+ +\left( X^{2,g}_t - D^2_t \right) ^+ +A^1_t+A^2_t. \end{aligned}$$
(105)

Therefore, it suffices to show that

$$\begin{aligned} \left( X^{1,\hat{g}}_t - D^1_t \right) ^+ +\left( X^{2,\hat{g}}_t - D^2_t \right) ^+ \le _{st}&\left( X^{1,g}_t - D^1_t \right) ^+ +\left( X^{2,g}_t - D^2_t \right) ^+. \end{aligned}$$
(106)

Consider any realization \((X^{1,g}_t,X^{2,g}_t) = (x^1,x^2)\).

If \(x^1,x^2 >0\), then \(\left\lfloor \frac{1}{2}(x^1+x^2 )\right\rfloor ,\left\lceil \frac{1}{2}(x^1+x^2)\right\rceil >0\). Since \(D^1_t,D^2_t\) take values in \(\{0,1\}\), \(\left( x - D^i_t \right) ^+ = x - D^i_t, i=1,2,\) for any positive integer \(x\). Consequently,

$$\begin{aligned}&\left( X^{1,g}_t - D^1_t \right) ^+ +\left( X^{2,g}_t - D^2_t \right) ^+\nonumber \\&\quad =x^1+x^2- D^1_t- D^2_t\nonumber \\&\quad =\left( \left\lfloor \frac{1}{2}(x^1+x^2 )\right\rfloor - D^1_t\right) ^+ +\left( \left\lceil \frac{1}{2}(x^1+x^2)\right\rceil - D^2_t\right) ^+. \end{aligned}$$
(107)

If \(x^i=0\) and \(x^j \ge 2\) (\(i\ne j\)), then \(\left\lfloor \frac{1}{2}(x^1+x^2 )\right\rfloor >0\) and \(\left\lceil \frac{1}{2}(x^1+x^2)\right\rceil >0\). Using arguments similar to those in (107) we get

$$\begin{aligned}&\left( X^{1,g}_t - D^1_t \right) ^+ +\left( X^{2,g}_t - D^2_t \right) ^+\nonumber \\&\quad =x^j- D^j_t\nonumber \\&\quad \ge x^1+x^2- D^1_t- D^2_t\nonumber \\&\quad =\left( \left\lfloor \frac{1}{2}(x^1+x^2 )\right\rfloor - D^1_t\right) ^+ +\left( \left\lceil \frac{1}{2}(x^1+x^2)\right\rceil - D^2_t\right) ^+. \end{aligned}$$
(108)

If \(x^i=0\) and \(x^j =1 \) (\(i\ne j\)), then \(\left\lfloor \frac{1}{2}(x^1+x^2 )\right\rfloor =0\) and \(\left\lceil \frac{1}{2}(x^1+x^2)\right\rceil =1\). Therefore,

$$\begin{aligned}&\left( X^{1,g}_t - D^1_t \right) ^+ +\left( X^{2,g}_t - D^2_t \right) ^+\nonumber \\&\quad =1- D^j_t\nonumber \\&\quad \ge _{st} 1-D^2_t\nonumber \\&\quad =\left( \left\lfloor \frac{1}{2}(x^1+x^2 )\right\rfloor - D^1_t\right) ^+ +\left( \left\lceil \frac{1}{2}(x^1+x^2)\right\rceil - D^2_t\right) ^+. \end{aligned}$$
(109)

If \(x^1,x^2 =0\), then \(\left\lfloor \frac{1}{2}(x^1+x^2 )\right\rfloor ,\left\lceil \frac{1}{2}(x^1+x^2)\right\rceil =0\). Therefore,

$$\begin{aligned}&\left( X^{1,g}_t - D^1_t \right) ^+ +\left( X^{2,g}_t - D^2_t \right) ^+ =0\nonumber \\&\quad =\left( \left\lfloor \frac{1}{2}(x^1+x^2 )\right\rfloor - D^1_t\right) ^+ +\left( \left\lceil \frac{1}{2}(x^1+x^2)\right\rceil - D^2_t\right) ^+. \end{aligned}$$
(110)

As a result of (107)–(110) (note that if \(X\ge Y\) a.s., then \(X \ge _{st} Y\) by the definition of \(\ge _{st}\)), we obtain

$$\begin{aligned}&\left( X^{1,g}_t - D^1_t \right) ^+ +\left( X^{2,g}_t - D^2_t \right) ^+\nonumber \\&\quad \ge _{st} \left( \left\lfloor \frac{1}{2}(X^{1,g}_t+X^{2,g}_t )\right\rfloor - D^1_t\right) ^+ +\left( \left\lceil \frac{1}{2}(X^{1,g}_t+X^{2,g}_t)\right\rceil - D^2_t\right) ^+. \end{aligned}$$
(111)

Then, from (111), the induction hypothesis and Corollary 2 we obtain

$$\begin{aligned}&\left( X^{1,g}_t - D^1_t \right) ^+ +\left( X^{2,g}_t - D^2_t \right) ^+\nonumber \\&\quad \ge _{st} \left( \left\lfloor \frac{1}{2}(X^{1,g}_t+X^{2,g}_t )\right\rfloor - D^1_t\right) ^+ +\left( \left\lceil \frac{1}{2}(X^{1,g}_t+X^{2,g}_t)\right\rceil - D^2_t\right) ^+\nonumber \\&\quad \ge _{st} \left( \left\lfloor \frac{1}{2}(X^{1,\hat{g}}_t+X^{2,\hat{g}}_t )\right\rfloor - D^1_t\right) ^+ +\left( \left\lceil \frac{1}{2}(X^{1,\hat{g}}_t+X^{2,\hat{g}}_t)\right\rceil - D^2_t\right) ^+ \nonumber \\&\quad = \left( \min (X^{1,\hat{g}}_t,X^{2,\hat{g}}_t)- D^1_t\right) ^++\left( \max (X^{1,\hat{g}}_t,X^{2,\hat{g}}_t)- D^2_t\right) ^+\nonumber \\&\quad \ge _{st} \left( X^{1,\hat{g}}_t- D^1_t\right) ^+ +\left( X^{2,\hat{g}}_t- D^2_t\right) ^+. \end{aligned}$$
(112)

The first and second stochastic inequalities in (112) follow from (111) and the induction hypothesis, respectively. The equality in (112) follows from Corollary 2. The last stochastic inequality in (112) is true because \(D^1_t, D^2_t\) are i.i.d. and independent of \(X^{1,\hat{g}}_t, X^{2,\hat{g}}_t\).

Thus, inequality (106) is true, and the proof of the lemma is complete. \(\square \)
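
The four cases (107)–(110) can also be checked numerically. The sketch below is ours; it fixes an arbitrary service probability and compares the exact PMFs of both sides of the stochastic inequality (111) on a small grid of queue lengths:

```python
from itertools import product

def dist(vals_probs):
    """Collect a PMF from (value, probability) pairs."""
    pmf = {}
    for v, p in vals_probs:
        pmf[v] = pmf.get(v, 0.0) + p
    return pmf

def st_dominates(hi, lo):
    """True iff hi >=_st lo, i.e. P(hi >= k) >= P(lo >= k) for all k."""
    for k in set(hi) | set(lo):
        if sum(p for v, p in hi.items() if v >= k) < \
           sum(p for v, p in lo.items() if v >= k) - 1e-12:
            return False
    return True

mu = 0.5                                    # any service probability works
for x1, x2 in product(range(6), repeat=2):
    s = x1 + x2
    f, c = s // 2, s - s // 2               # balanced split (floor, ceil)
    orig, bal = [], []
    for d1, d2 in product((0, 1), repeat=2):
        p = (mu if d1 else 1 - mu) * (mu if d2 else 1 - mu)
        orig.append((max(x1 - d1, 0) + max(x2 - d2, 0), p))
        bal.append((max(f - d1, 0) + max(c - d2, 0), p))
    assert st_dominates(dist(orig), dist(bal)), (x1, x2)
print("inequality (111) verified on the grid")
```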

Appendix 3: Proofs of the results associated with step 1 of the proof of Theorem 2

Proof (Proof of Lemma 4)

The proof is done by induction. At \(t=0\), (67), (68) and (69) hold if we let \(Y^{i}_0 = X^{i,g_0}_0\) for \(i=1,2\).

Assume the assertion of this lemma is true at time \(t\); we want to show that the assertion is also true at time \(t+1\).

For that matter we claim the following. \(\square \)

Claim 1

$$\begin{aligned} X^{1,\hat{g}}_{t+1}+X^{2,\hat{g}}_{t+1} = \overline{X}^{1,\hat{g}}_t+\overline{X}^{2,\hat{g}}_t \quad a.s., \end{aligned}$$
(113)
$$\begin{aligned} \max _i\left( X^{i,\hat{g}}_{t+1}\right) \le \max _i\left( \overline{X}^{i,\hat{g}}_t\right) \quad a.s. \end{aligned}$$
(114)

Claim 2

There exists \(Y^{i}_{t+1}, i=1,2\) such that

$$\begin{aligned}&\displaystyle \mathbf {P}\left( Y^{i}_{t+1} = y_{t+1} | Y^{i}_{0:t}=y_{0:t}\right) = \mathbf {P}\left( X^{i,g_0}_{t+1} =y_{t+1}| X^{i,g_0}_{0:t}=y_{0:t}\right) \text { for all }y_{0:t}, \end{aligned}$$
(115)
$$\begin{aligned}&\displaystyle \overline{X}^{1,\hat{g}}_t+\overline{X}^{2,\hat{g}}_t \le Y^{1}_{t+1}+Y^{2}_{t+1} \quad a.s., \end{aligned}$$
(116)
$$\begin{aligned}&\displaystyle \max _i\left( \overline{X}^{i,\hat{g}}_t\right) \le \max _i\left( Y^{i}_{t+1}\right) \quad a.s. \end{aligned}$$
(117)

We assume the above claims to be true and prove them after the completion of the proof of the induction step.

For all \(y_{0:t+1}\), from (115) and the induction hypothesis for (67) we get for \(i=1,2\)

$$\begin{aligned} \mathbf {P}\left( Y^{i}_{0:t+1} = y_{0:t+1}\right)&= \mathbf {P}\left( Y^{i}_{t+1} = y_{t+1}| Y^{i}_{0:t}=y_{0:t}\right) \mathbf {P}\left( Y^{i}_{t}=y_t,\dots , Y^{i}_{0}=y_0\right) \nonumber \\&= \mathbf {P}\left( X^{i,g_0}_{t+1} =y_{t+1}| X^{i,g_0}_{0:t}=y_{0:t}\right) \mathbf {P}\left( X^{i,g_0}_{0:t}=y_{0:t}\right) \nonumber \\&=\mathbf {P}\left( X^{i,g_0}_{0:t+1} =y_{0:t+1}\right) . \end{aligned}$$
(118)

From (113) and (116) we obtain

$$\begin{aligned} X^{1,\hat{g}}_{t+1}+X^{2,\hat{g}}_{t+1} = \,&\overline{X}^{1,\hat{g}}_t+\overline{X}^{2,\hat{g}}_t \nonumber \\ \le \,&Y^{1}_{t+1}+Y^{2}_{t+1} \quad a.s. \end{aligned}$$
(119)

Furthermore, combining (114) and (117) gives

$$\begin{aligned} \max _i\left( X^{i,\hat{g}}_{t+1}\right) \le \max _i\left( \overline{X}^{i,\hat{g}}_t\right) \le \max _i\left( Y^{i}_{t+1}\right) \quad a.s. \end{aligned}$$
(120)

Therefore, the assertions (67), (68) and (69) of the lemma are true at \(t+1\) by (118), (119) and (120), respectively.

We now prove claims 1 and 2.

Proof of Claim 1

From the system dynamics (1)–(2)

$$\begin{aligned} X^{1,\hat{g}}_{t+1} = \overline{X}^{1,\hat{g}}_t-U^{1,\hat{g}}_t+U^{2,\hat{g}}_t, \end{aligned}$$
(121)
$$\begin{aligned} X^{2,\hat{g}}_{t+1} = \overline{X}^{2,\hat{g}}_t-U^{2,\hat{g}}_t+U^{1,\hat{g}}_t. \end{aligned}$$
(122)

Therefore, (113) follows by summing (121) and (122).

For (114), consider \(X^{1,\hat{g}}_{t+1}\) (the case of \(X^{2,\hat{g}}_{t+1}\) follows from similar arguments).

When \(U^{2,\hat{g}}_t = 0\),

$$\begin{aligned} X^{1,\hat{g}}_{t+1} = \overline{X}^{1,\hat{g}}_t-U^{1,\hat{g}}_t \le \max _i\left( \overline{X}^{i,\hat{g}}_t\right) . \end{aligned}$$
(123)

When \(U^{1,\hat{g}}_t =U^{2,\hat{g}}_t = 1\),

$$\begin{aligned} X^{1,\hat{g}}_{t+1} = \overline{X}^{1,\hat{g}}_t \le \max _i\left( \overline{X}^{i,\hat{g}}_t\right) . \end{aligned}$$
(124)

When \(U^{1,\hat{g}}_t = 0, U^{2,\hat{g}}_t = 1\), \(\overline{X}^{1,\hat{g}}_t\) is less than the threshold and \(\overline{X}^{2,\hat{g}}_t\) is greater than or equal to the threshold. Therefore, by (121),

$$\begin{aligned} X^{1,\hat{g}}_{t+1} = \overline{X}^{1,\hat{g}}_t +1 \le&\,\left\lceil \textit{TH}_t \right\rceil \nonumber \\ \le&\, \overline{X}^{2,\hat{g}}_t \le \max _i\left( \overline{X}^{i,\hat{g}}_t\right) . \end{aligned}$$
(125)

Therefore, (114) follows from (123)–(125). \(\square \)

Proof of Claim 2

We set

$$\begin{aligned} Y^i_{t+1} := \left( Y^{i}_t -\tilde{D}^i_t\right) ^+ +\tilde{A}^i_t \end{aligned}$$
(126)

where \(Y^i_t\) satisfy the induction hypothesis, and \(\tilde{A}^i_t, \tilde{D}^i_t, i=1,2\) are specified as follows. Let

$$\begin{aligned}&\displaystyle M_x = \text {argmax}_i\{X^{i,\hat{g}}_t\}, \quad m_x = \text {argmin}_i\{X^{i,\hat{g}}_t\} \end{aligned}$$
(127)
$$\begin{aligned}&\displaystyle M_y = \text {argmax}_i\{Y^i_t\}, \quad m_y = \text {argmin}_i\{Y^i_t\}, \end{aligned}$$
(128)

where \(M_x=1, m_x = 2\) (resp. \(M_y=1, m_y = 2\)) when \(\{X^{1,\hat{g}}_t=X^{2,\hat{g}}_t\}\) (resp. \(\{Y^1_t=Y^2_t\}\)); define

$$\begin{aligned} \left( \tilde{A}^{M_y}_t,\tilde{D}^{M_y}_t,\tilde{A}^{m_y}_t,\tilde{D}^{m_y}_t\right) := \left\{ \begin{array}{ll} \left( A^{M_x}_t,D^{m_x}_t,A^{m_x}_t,D^{M_x}_t\right) &{}\quad \text { in case 1}, \\ \left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) &{}\quad \text { in case 2}, \end{array} \right. \end{aligned}$$
(129)

where the two cases are:

Case 1 \(\{Y^{M_y}_{t}-1 = X^{M_x,\hat{g}}_{t}=X^{m_x,\hat{g}}_{t}\) and \(\left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) = (0,1,1,0) \text { or } (0,0,1,1) \}\).

Case 2 All other instances.

Assertion: The random variables \(Y^1_{t+1},Y^2_{t+1}\), defined by (126)–(129), satisfy (115)–(117).

As the proof of this assertion is long, we first provide a sketch of its proof and then we provide a full proof.

Sketch of the proof of the assertion

  • Equation (129) implies the following: In case 2 we associate the arrival to and the departure from the longer queue \(M_x\) with those of the longer queue \(M_y\), i.e. we set \(\tilde{A}^{M_y}_t = A^{M_x}_t, \tilde{D}^{M_y}_t = D^{M_x}_t\). We do the same for the shorter queues \(m_x, m_y\), i.e. \(\tilde{A}^{m_y}_t = A^{m_x}_t, \tilde{D}^{m_y}_t = D^{m_x}_t\).

    In case 1, we have the same association for the arrivals as in case 2, that is \(\tilde{A}^{M_y}_t = A^{M_x}_t, \tilde{A}^{m_y}_t = A^{m_x}_t\), but we reverse the association of the departures, that is \(\tilde{D}^{M_y}_t = D^{m_x}_t, \tilde{D}^{m_y}_t = D^{M_x}_t\). Therefore the arrivals \(\tilde{A}^{i}_t\), and departures \(\tilde{D}^{i}_t\), have the same distribution as the original \(A^{i}_t, D^{i}_t\), respectively, \(i=1,2\). Then (115) follows from (126).

  • To establish (116), we note that, because of (129), the sum of arrivals to (respectively, departures from) queues \(M_y\) and \(m_y\) equals the sum of arrivals to (respectively, departures from) queues \(M_x\) and \(m_x\).

    When \(X^{i,\hat{g}}_t,Y^i_t \ne 0\), \(i=1,2\), the function \((x-d)^+ + a\) is linear in \(x\), since \((x-d)^+ + a = x-d+a\). Then from (126), (129) and the induction hypothesis we obtain

    $$\begin{aligned}&Y^1_{t+1}+Y^2_{t+1} - \overline{X}^{1,\hat{g}}_t-\overline{X}^{2,\hat{g}}_t \nonumber \\&\quad = Y^1_t+Y^2_t - X^{1,\hat{g}}_t-X^{2,\hat{g}}_t \ge 0 \end{aligned}$$
    (130)

    and this establishes (116) when \(X^{i,\hat{g}}_t,Y^i_t \ne 0\), \(i=1,2\). In the full proof of the assertion, we show that (116) is also true when the \(X^{i,\hat{g}}_t,Y^i_t\) are not all nonzero.

  • To establish (117) we consider the maximum of the queue lengths. In case 2, we show that (126)–(129) ensure that

    $$\begin{aligned}&\displaystyle Y^{M_y}_{t+1} \ge \overline{X}^{M_x,\hat{g}}_t,&\end{aligned}$$
    (131)
    $$\begin{aligned}&\displaystyle \max \left( Y^{M_y}_{t+1},Y^{m_y}_{t+1}\right) \ge \overline{X}^{m_x,\hat{g}}_t;&\end{aligned}$$
    (132)

    then (117) follows from (131)–(132).

    In case 1, (117) is verified by direct computation in the full proof.
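
The association (127)–(129) sketched above can also be written as a small function. The sketch below is ours; it maps one realization of the \(x\)-system's arrivals and departures to those assigned to the \(y\)-queues:

```python
def couple(x, y, a, d):
    """Coupling of Eqs. (127)-(129): given realized queue lengths x=(x1,x2)
    and y=(y1,y2), and the x-system's arrivals a=(a1,a2) and departures
    d=(d1,d2), return (A_My, D_My, A_my, D_my) assigned to the longer and
    shorter y-queues. Ties resolve toward index 0, as in the proof."""
    Mx = 0 if x[0] >= x[1] else 1       # longer x-queue
    mx = 1 - Mx                         # shorter x-queue
    My = 0 if y[0] >= y[1] else 1       # longer y-queue
    case1 = (y[My] - 1 == x[Mx] == x[mx]
             and (a[Mx], d[Mx], a[mx], d[mx]) in {(0, 1, 1, 0), (0, 0, 1, 1)})
    if case1:                           # case 1: reverse the departure pairing
        return a[Mx], d[mx], a[mx], d[Mx]
    return a[Mx], d[Mx], a[mx], d[mx]   # case 2: direct pairing
```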

Proof of the assertion

For all \(y_{0:t}\), we denote by \(E_{y_{0:t}}\) the event \(\{Y^{i}_{0:t}=y_{0:t} \}\).

Let \(\tilde{Z}_t=\left( \tilde{A}^{M_y}_t,\tilde{D}^{M_y}_t,\tilde{A}^{m_y}_t,\tilde{D}^{m_y}_t\right) \), then for any realization \(z_t \in \{0,1\}^4 \) of \(\tilde{Z}_t\) we have

$$\begin{aligned} \mathbf {P}\left( \tilde{Z}_t =z_t | E_{y_{0:t}}\right) = \mathbf {P}\left( \tilde{Z}_t =z_t, \text {case 1}| E_{y_{0:t}}\right) + \mathbf {P}\left( \tilde{Z}_t =z_t, \text {case 2}| E_{y_{0:t}}\right) . \end{aligned}$$
(133)

When \(z_t \ne (0,1,1,0)\) and \(z_t \ne (0,0,1,1)\), we get

$$\begin{aligned} \mathbf {P}\left( \tilde{Z}_t =z_t, \text {case 1}| E_{y_{0:t}}\right) =0, \end{aligned}$$
(134)

and

$$\begin{aligned} \mathbf {P}\left( \tilde{Z}_t =z_t, \text {case 2}| E_{y_{0:t}}\right)&= \mathbf {P}\left( \left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) =z_t | E_{y_{0:t}}\right) \nonumber \\&= \mathbf {P}\left( \left( A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\right) =z_t\right) , \end{aligned}$$
(135)

where the last equality in (135) holds because the random variables \(A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\) are independent of \(Y_0,Y_1,\dots ,Y_t\) and have the same distribution as \(A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\).

Therefore, combining (134) and (135) we obtain, for \(z_t \ne (0,1,1,0)\) and \(z_t \ne (0,0,1,1)\),

$$\begin{aligned} \mathbf {P}\left( \tilde{Z}_t =z_t | E_{y_{0:t}}\right) = \mathbf {P}\left( \left( A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\right) =z_t\right) \end{aligned}$$
(136)

When \(z_t = (0,1,1,0)\) or \((0,0,1,1)\), let \(E\) denote the event \(\{Y^{M_y}_{t}-1 = X^{M_x,\hat{g}}_{t}=X^{m_x,\hat{g}}_{t}\}\); then we obtain

$$\begin{aligned} \mathbf {P}\left( \tilde{Z}_t =z_t, \text {case 1}| E_{y_{0:t}}\right)&= \mathbf {P}\left( \left( A^{M_x}_t,D^{m_x}_t,A^{m_x}_t,D^{M_x}_t\right) =z_t, E | E_{y_{0:t}}\right) \nonumber \\&= \mathbf {P}\left( \left( A^{1}_t,D^{2}_t,A^{2}_t,D^{1}_t\right) =z_t\right) \mathbf {P}\left( E | E_{y_{0:t}}\right) , \end{aligned}$$
(137)

and

$$\begin{aligned} \mathbf {P}\left( \tilde{Z}_t =z_t, \text {case 2}| E_{y_{0:t}}\right)&= \mathbf {P}\left( \left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) =z_t, E^c | E_{y_{0:t}}\right) \nonumber \\&= \mathbf {P}\left( \left( A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\right) =z_t\right) \mathbf {P}\left( E^c | E_{y_{0:t}}\right) , \end{aligned}$$
(138)

where the last equalities in (137) and (138) follow from the fact that the random variables \(A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\) are independent of \(Y_0,Y_1,\dots ,Y_t\) (and hence of the event \(E\), which is generated by \(Y_0,Y_1,\dots ,Y_t\)) and have the same distribution as \(A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\).

Therefore, combining (137) and (138) we obtain for \(z_t = (0,1,1,0)\) or \((0,0,1,1)\)

$$\begin{aligned} \mathbf {P}\left( \tilde{Z}_t =z_t | E_{y_{0:t}}\right)&= \mathbf {P}\left( \left( A^{1}_t,D^{2}_t,A^{2}_t,D^{1}_t\right) =z_t\right) \mathbf {P}\left( E | E_{y_{0:t}}\right) \nonumber \\&\quad +\mathbf {P}\left( \left( A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\right) =z_t\right) \mathbf {P}\left( E^c | E_{y_{0:t}}\right) \nonumber \\&= \mathbf {P}\left( \left( A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\right) =z_t\right) , \end{aligned}$$
(139)

where the last equality in (139) is true because \(A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\) are independent and \(D^1_t\) has the same distribution as \(D^{2}_t\).

As a result of (136) and (139), for any \(z_t \in \{0,1\}^4 \) we have

$$\begin{aligned} \mathbf {P}\left( \tilde{Z}_t =z_t | E_{y_{0:t}}\right) =\mathbf {P}\left( \left( A^{1}_t,D^{1}_t,A^{2}_t,D^{2}_t\right) =z_t\right) . \end{aligned}$$
(140)

Now consider any \(y_{0:t+1}\). By (140) we have for \(i=M_y\) or \(m_y\)

$$\begin{aligned} \mathbf {P}\left( Y^{i}_{t+1} = y_{t+1} | E_{y_{0:t}}\right)&= \mathbf {P}\left( \left( y^{i}_{t} - \tilde{D}^i_t \right) ^+ +\tilde{A}^i_t = y_{t+1} | E_{y_{0:t}}\right) \nonumber \\&= \mathbf {P}\left( \left( y^{i}_{t} - D^i_t \right) ^+ +A^i_t = y_{t+1}\right) \nonumber \\&= \mathbf {P}\left( X^{i,g_0}_{t+1} =y_{t+1}| X^{i,g_0}_{0:t}=y_{0:t}\right) , \end{aligned}$$
(141)

which is (115).

Now consider the sum \(Y^1_{t+1}+Y^2_{t+1}\).

From (129), we know that

$$\begin{aligned}&\tilde{A}^{M_y}_t+\tilde{A}^{m_y}_t= A^{M_x}_t+A^{m_x}_t \quad a.s., \end{aligned}$$
(142)
$$\begin{aligned}&\tilde{D}^{M_y}_t+\tilde{D}^{m_y}_t= D^{M_x}_t+D^{m_x}_t \quad a.s. \end{aligned}$$
(143)

Therefore, (142) implies

$$\begin{aligned}&Y^1_{t+1}+Y^2_{t+1} - \overline{X}^{1,\hat{g}}_{t}-\overline{X}^{2,\hat{g}}_{t} \nonumber \\&\quad = \left( Y^{M_y}_{t} - \tilde{D}^{M_y}_t \right) ^++\left( Y^{m_y}_{t} - \tilde{D}^{m_y}_t \right) ^+ - \left( X^{M_x,\hat{g}}_{t} - D^{M_x}_t \right) ^+ -\left( X^{m_x,\hat{g}}_{t} - D^{m_x}_t \right) ^+. \end{aligned}$$
(144)

We proceed to show that the right hand side of (144) is nonnegative. From the induction hypothesis for (68)–(69) we have

$$\begin{aligned}&Y^{m_y}_{t}+Y^{M_y}_{t} \ge X^{m_x,\hat{g}}_{t}+X^{M_x,\hat{g}}_{t}\quad a.s. , \end{aligned}$$
(145)
$$\begin{aligned}&Y^{M_y}_{t} \ge X^{M_x,\hat{g}}_{t}\quad a.s. \end{aligned}$$
(146)

There are three possibilities: \(\{Y^{M_y}_{t} = X^{M_x,\hat{g}}_{t}\}\), \(\{Y^{M_y}_{t} > X^{M_x,\hat{g}}_{t}, X^{m_x,\hat{g}}_{t} = 0\}\) and \(\{Y^{M_y}_{t} > X^{M_x,\hat{g}}_{t}, X^{m_x,\hat{g}}_{t} > 0\}\).

First consider \(\{Y^{M_y}_{t} = X^{M_x,\hat{g}}_{t}\}\). By (145) we have

$$\begin{aligned} Y^{m_y}_{t} \ge X^{m_x,\hat{g}}_{t} \quad a.s. \end{aligned}$$
(147)

Note that \(\{Y^{M_y}_{t} = X^{M_x,\hat{g}}_{t}\}\) belongs to case 2 in (129). From case 2 of (129) we also know that

$$\begin{aligned} D^{M_x}_t=\tilde{D}^{M_y}_t, \quad D^{m_x}_t=\tilde{D}^{m_y}_t. \end{aligned}$$
(148)

Then, because of (146)–(148) we get

$$\begin{aligned}&\left( X^{M_x,\hat{g}}_{t} - D^{M_x}_t \right) ^+ +\left( X^{m_x,\hat{g}}_{t} - D^{m_x}_t \right) ^+ \nonumber \\&\quad \le \left( Y^{M_y}_{t} - D^{M_x}_t \right) ^+ +\left( Y^{m_y}_{t} - D^{m_x}_t \right) ^+ \nonumber \\&\quad = \left( Y^{M_y}_{t} - \tilde{D}^{M_y}_t \right) ^++\left( Y^{m_y}_{t} - \tilde{D}^{m_y}_t \right) ^+ \quad a.s. \end{aligned}$$
(149)

If \(Y^{M_y}_{t} > X^{M_x,\hat{g}}_{t} \) and \(X^{m_x,\hat{g}}_{t} = 0\), then

$$\begin{aligned}&\left( X^{M_x,\hat{g}}_{t} - D^{M_x}_t \right) ^+ +\left( X^{m_x,\hat{g}}_{t} - D^{m_x}_t \right) ^+ \nonumber \\&\quad = \left( X^{M_x,\hat{g}}_{t} - D^{M_x}_t \right) ^+ \nonumber \\&\quad \le X^{M_x,\hat{g}}_{t} \le Y^{M_y}_{t}-1 \nonumber \\&\quad \le \left( Y^{M_y}_{t} - \tilde{D}^{M_y}_t \right) ^++\left( Y^{m_y}_{t} - \tilde{D}^{m_y}_t \right) ^+ \end{aligned}$$
(150)

If \(Y^{M_y}_{t} > X^{M_x,\hat{g}}_{t} \) and \(X^{m_x,\hat{g}}_{t} > 0\), then

$$\begin{aligned}&\left( X^{M_x,\hat{g}}_{t} - D^{M_x}_t \right) ^+ +\left( X^{m_x,\hat{g}}_{t} - D^{m_x}_t \right) ^+ \nonumber \\&\quad = X^{M_x,\hat{g}}_{t} - D^{M_x}_t +X^{m_x,\hat{g}}_{t} - D^{m_x}_t \nonumber \\&\quad = X^{M_x,\hat{g}}_{t} +X^{m_x,\hat{g}}_{t}- \tilde{D}^{M_y}_t - \tilde{D}^{m_y}_t \nonumber \\&\quad \le Y^{M_y}_{t} +Y^{m_y}_{t}- \tilde{D}^{M_y}_t - \tilde{D}^{m_y}_t \nonumber \\&\quad \le \left( Y^{M_y}_{t} - \tilde{D}^{M_y}_t \right) ^++\left( Y^{m_y}_{t} - \tilde{D}^{m_y}_t \right) ^+ \end{aligned}$$
(151)

where the second equality in (151) follows from (143) and the first inequality in (151) follows from the induction hypothesis for (68).

The above results, namely (149)–(151), show that the right hand side of (144) is nonnegative, and the proof of (116) is complete.

It remains to show that (117) is true.

We first consider case 2.

In case 2, we know from (129) that

$$\begin{aligned} \left( \tilde{A}^{M_y}_t,\tilde{D}^{M_y}_t,\tilde{A}^{m_y}_t,\tilde{D}^{m_y}_t\right) = \left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) . \end{aligned}$$
(152)

Then,

$$\begin{aligned} \overline{X}^{M_x,\hat{g}}_{t} =&\left( X^{M_x,\hat{g}}_{t} - D^{M_x}_t \right) ^+ + A^{M_x}_t \nonumber \\ =&\left( X^{M_x,\hat{g}}_{t} - \tilde{D}^{M_y}_t \right) ^+ +\tilde{A}^{M_y}_t \nonumber \\ \le&\left( Y^{M_y}_{t} - \tilde{D}^{M_y}_t \right) ^+ +\tilde{A}^{M_y}_t \nonumber \\ =&\,\, Y^{M_y}_{t+1}, \end{aligned}$$
(153)

where the second equality is a consequence of (152) and the inequality follows from the induction hypothesis for (69).

To proceed further we note that in case 2 there are three possibilities: \(\{Y^{M_y}_{t} = X^{M_x,\hat{g}}_{t} \}\), \(\{Y^{M_y}_{t}-2 \ge X^{m_x,\hat{g}}_{t} \}\) and \(\{Y^{M_y}_{t} > X^{M_x,\hat{g}}_{t}, Y^{M_y}_{t}-2 < X^{m_x,\hat{g}}_{t} \}\).

If \(Y^{M_y}_{t} = X^{M_x,\hat{g}}_{t} \), (147) is also true. Following similar arguments as in (153) we obtain

$$\begin{aligned} \overline{X}^{m_x,\hat{g}}_{t} \le Y^{m_y}_{t+1}. \end{aligned}$$
(154)

If \(Y^{M_y}_{t}-2 \ge X^{m_x,\hat{g}}_{t} \), then

$$\begin{aligned} \overline{X}^{m_x,\hat{g}}_{t} \le X^{m_x,\hat{g}}_{t}+1 \le Y^{M_y}_{t}-1 \le Y^{M_y}_{t+1}. \end{aligned}$$
(155)

If \(Y^{M_y}_{t} >X^{M_x,\hat{g}}_{t} \) and \(Y^{M_y}_{t}-2 < X^{m_x,\hat{g}}_{t} \), then necessarily \(Y^{M_y}_{t}-1 = X^{M_x,\hat{g}}_{t} = X^{m_x,\hat{g}}_{t} \). Since we are in case 2, \(\left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) \ne (0,1,1,0)\). Therefore,

$$\begin{aligned} A^{m_x}_t-D^{m_x}_t \le A^{M_x}_t-D^{M_x}_t+1. \end{aligned}$$
(156)

Then we get

$$\begin{aligned} \overline{X}^{m_x,\hat{g}}_{t} =&\left( Y^{M_y}_{t}-1 - D^{m_x}_t\right) ^+ + A^{m_x}_t \nonumber \\ =&\max \left( A^{m_x}_t, Y^{M_y}_{t}- 1 - D^{m_x}_t + A^{m_x}_t \right) \nonumber \\ \le&\max \left( A^{m_x}_t, Y^{M_y}_{t}- D^{M_x}_t + A^{M_x}_t \right) \nonumber \\ \le&\max \left( A^{m_x}_t, Y^{M_y}_{t+1} \right) \nonumber \\ \le&\max \left( Y^{m_y}_{t+1}, Y^{M_y}_{t+1} \right) . \end{aligned}$$
(157)

Combining (153), (154), (155) and (157) we get (117) when case 2 is true.

Now consider case 1. We have \(Y^{M_y}_{t}-1 = X^{M_x,\hat{g}}_{t} = X^{m_x,\hat{g}}_{t} \).

When \(\left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) = (0,1,1,0)\), then

$$\begin{aligned} \overline{X}^{M_x,\hat{g}}_{t} =&\left( X^{M_x,\hat{g}}_{t} -1 \right) ^+\nonumber \\ \le&\, \overline{X}^{m_x,\hat{g}}_{t} \nonumber \\ =&\, X^{m_x,\hat{g}}_{t}+1 \nonumber \\ =&\, \left( Y^{M_y}_{t}-D^{m_x}_t\right) ^+ + A^{M_x}_t \nonumber \\ =&\, Y^{M_y}_{t+1} \end{aligned}$$
(158)

When \(\left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) = (0,0,1,1)\) we get

$$\begin{aligned} \overline{X}^{M_x,\hat{g}}_{t} =&\, X^{M_x,\hat{g}}_{t}\nonumber \\ \le&\, \overline{X}^{m_x,\hat{g}}_{t} \nonumber \\ =&\max \left( X^{m_x,\hat{g}}_{t}, 1 \right) \nonumber \\ =&\max \left( \left( Y^{M_y}_{t}-D^{m_x}_t\right) ^+ + A^{M_x}_t, A^{m_x}_t \right) \nonumber \\ =&\max \left( Y^{M_y}_{t+1}, A^{m_x}_t \right) \nonumber \\ \le&\max \left( Y^{M_y}_{t+1}, Y^{m_y}_{t+1} \right) . \end{aligned}$$
(159)

Combining (158) and (159) we obtain (117) for case 1.

As a result, (117) holds for both cases 1 and 2.

Remark

We note that we need the two cases described in (129) for the following reasons. If we eliminate case 1 and always associate \(\left( \tilde{A}^{M_y}_t,\tilde{D}^{M_y}_t,\tilde{A}^{m_y}_t,\tilde{D}^{m_y}_t\right) \) with \(\left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) \) as in case 2, then when \(\{Y^{M_y}_{t}-1 = X^{M_x,\hat{g}}_{t}=X^{m_x,\hat{g}}_{t}\) and \(\left( A^{M_x}_t,D^{M_x}_t,A^{m_x}_t,D^{m_x}_t\right) = (0,1,1,0)\}\), the shorter queue \(m_x\) increases by one customer and the longer queue \(M_y\) decreases by one customer; therefore \(\overline{X}^{m_x,\hat{g}}_{t} = Y^{M_y}_{t+1}+1\) and (117) is not satisfied. \(\square \)

Proof (Proof of Lemma 5)

From Lemma 4, at any time \(t\) there exists \(Y^{i}_t\) such that (67)–(69) hold.

Adopting the notation \(M_x,m_x\) and \(M_y,m_y\) from the proof of Lemma 4, we have at every time \(t\)

$$\begin{aligned}&X^{m_x,\hat{g}}_t \le X^{M_x,\hat{g}}_t \quad a.s., \end{aligned}$$
(160)
$$\begin{aligned}&Y^{m_y}_t \le Y^{M_y}_t \quad a.s. \end{aligned}$$
(161)

Furthermore, from (69) we have

$$\begin{aligned} X^{M_x,\hat{g}}_t \le Y^{M_y}_t \quad a.s. \end{aligned}$$
(162)

If \(X^{m_x,\hat{g}}_t \le Y^{m_y}_t\), (162) and the fact that \(c(\cdot )\) is increasing give

$$\begin{aligned} c\left( X^{M_x,\hat{g}}_t\right) +c\left( X^{m_x,\hat{g}}_t\right) \le c\left( Y^{M_y}_t\right) +c\left( Y^{m_y}_t\right) . \end{aligned}$$
(163)

If \(X^{m_x,\hat{g}}_t > Y^{m_y}_t\), then

$$\begin{aligned} Y^{m_y}_t < X^{m_x,\hat{g}}_t \le X^{M_x,\hat{g}}_t \le Y^{M_y}_t . \end{aligned}$$
(164)

Since \(c(\cdot )\) is convex, it follows from (164) that

$$\begin{aligned} \frac{c\left( Y^{M_y}_t\right) -c\left( X^{M_x,\hat{g}}_t\right) }{Y^{M_y}_t-X^{M_x,\hat{g}}_t} \ge \frac{c\left( X^{m_x,\hat{g}}_t\right) -c\left( Y^{m_y}_t\right) }{X^{m_x,\hat{g}}_t-Y^{m_y}_t} . \end{aligned}$$
(165)

From (68) in Lemma 4 we know that

$$\begin{aligned} Y^{M_y}_t - X^{M_x,\hat{g}}_t \ge X^{m_x,\hat{g}}_t - Y^{m_y}_t. \end{aligned}$$
(166)

Combining (165) and (166), and using that \(c(\cdot )\) is increasing (so both difference quotients in (165) are nonnegative), we get

$$\begin{aligned} c\left( Y^{M_y}_t\right) +c\left( Y^{m_y}_t\right) \ge c\left( X^{M_x,\hat{g}}_t\right) +c\left( X^{m_x,\hat{g}}_t\right) . \end{aligned}$$
(167)

\(\square \)
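
The convexity step (164)–(167) is a standard majorization argument: the interval \([X^{m_x,\hat{g}}_t, X^{M_x,\hat{g}}_t]\) is nested inside \([Y^{m_y}_t, Y^{M_y}_t]\) with a sum that is no larger, and spreading a sum further apart cannot decrease an increasing convex cost. A brute-force check of this step (ours, taking the quadratic cost as an example of an increasing convex \(c\)):

```python
def c(x):                 # an increasing convex holding cost, e.g. quadratic
    return x * x

# enumerate ym <= xm <= xM <= yM with ym + yM >= xm + xM, cf. (164) and (68)
bad = [(ym, xm, xM, yM)
       for ym in range(8) for xm in range(ym, 8)
       for xM in range(xm, 8) for yM in range(xM, 8)
       if ym + yM >= xm + xM and c(yM) + c(ym) < c(xM) + c(xm)]
print("counterexamples to (167):", bad)   # expected: []
```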

Proof (Proof of Lemma 6)

Let \(\{Y^{1}_t, t \in \mathbb {Z}_+ \}\) and \(\{Y^{2}_t, t\in \mathbb {Z}_+\}\) be the processes defined in Lemma 4. Then \(\{Y^{i}_t, t\in \mathbb {Z}_+\}\) has the same distribution as \(\{X^{i,g_0}_t, t\in \mathbb {Z}_+\}\) for \(i=1,2\).

Since \(\mu >\uplambda \), the processes \(\{Y^{i}_t, t\in \mathbb {Z}_+\},i=1,2\) are irreducible positive recurrent Markov chains. Moreover, the two processes \(\{Y^{1}_t, t \in \mathbb {Z}_+ \}\) and \(\{Y^{2}_t, t\in \mathbb {Z}_+\}\) have the same stationary distribution, denoted by \(\pi ^{g_0}\). Under Assumption 2, by the ergodic theorem for Markov chains (see Bremaud 1999, chap. 3) we get

$$\begin{aligned} \lim _{T \rightarrow \infty } \frac{1}{T} \sum _{t=0}^{T-1}c(Y^{1}_t) = \lim _{T \rightarrow \infty } \frac{1}{T}\sum _{t=0}^{T-1}c(Y^{2}_t) = \sum _{x=0}^{\infty } \pi ^{g_0}(x) c(x)\quad a.s. \end{aligned}$$
(168)

Let \(W^i_T(Y_{0:T-1}) := \frac{1}{T}\sum _{t=0}^{T-1}c(Y^{i}_t), i=1,2\).

We show that \(\{W^i_T(Y_{0:T-1}) , T=1,2,\dots \}\) is uniformly integrable for \(i=1,2\). That is,

$$\begin{aligned} \sup _{T} \mathbf {E}\left[ W^i_T(Y_{0:T-1}) 1_{\{W^i_T(Y_{0:T-1}) >N\}} \right] \rightarrow 0 \end{aligned}$$
(169)

as \(N\rightarrow \infty \).

Let \(p^{g_0}(x,y),x,y\in \mathbb {Z}_+\) be the transition probabilities of the Markov chain. Note that the initial PMF of the process \(\{Y^{i}_t, t\in \mathbb {Z}_+\},i=1,2\) is \(\pi ^i_0\). From Assumption 2 we know that \(\pi ^i_0(x)=0,i=1,2\) for all \(x>M\).

Letting \(R := \max _{x\le M} \frac{\pi ^i_0(x)}{\pi ^{g_0}(x)}<\infty \) (see Footnote 3), we obtain for \(i=1,2\)

$$\begin{aligned}&\mathbf {E}\left[ W^i_T(Y_{0:T-1}) 1_{\{W^i_T(Y_{0:T-1}) >N\}} \right] \nonumber \\&\quad = \sum _{y_{0:T-1}} W^i_T(y_{0:T-1})1_{\{W^i_T(y_{0:T-1}) >N\}}\mathbf {P}(Y_{0:T-1} = y_{0:T-1}) \nonumber \\&\quad = \sum _{y_{0:T-1}} W^i_T(y_{0:T-1})1_{\{W^i_T(y_{0:T-1}) >N\}} \pi ^{i}_0(y_0)\varPi _{t=1}^{T-1}p^{g_0}(y_{t-1},y_t) \nonumber \\&\quad \le R\sum _{y_{0:T-1}} W^i_T(y_{0:T-1})1_{\{W^i_T(y_{0:T-1}) >N\}} \pi ^{g_0}(y_0)\varPi _{t=1}^{T-1}p^{g_0}(y_{t-1},y_t) \nonumber \\&\quad = R \mathbf {E}\left[ W^{\pi ^{g_0}}_T 1_{\{W^{\pi ^{g_0}}_T >N\}} \right] , \end{aligned}$$
(170)

where \(W^{\pi ^{g_0}}_T = \frac{1}{T}\sum _{t=0}^{T-1}c(Y^{\pi ^{g_0}}_t)\) and \(\{Y^{\pi ^{g_0}}_t, t\in \mathbb {Z}_+\}\) is the chain with transition probabilities \(p^{g_0}(x,y)\) and initial PMF \(\pi ^{g_0}\).

Note that \(\{Y^{\pi ^{g_0}}_t, t\in \mathbb {Z}_+\}\) is stationary because the initial PMF is the stationary distribution \(\pi ^{g_0}\). From Birkhoff’s ergodic theorem we know that \(\{W^{\pi ^{g_0}}_T,T=1,2,\dots \}\) converges \(a.s.\) and in expectation (see Petersen 1989, chap. 2). Therefore, \(\{W^{\pi ^{g_0}}_T,T=1,2,\dots \}\) is uniformly integrable, and the right hand side of (170) goes to zero uniformly as \(N \rightarrow \infty \). Consequently, \(\{W^i_T(Y_{0:T-1}) , T=1,2,\dots \}\) is also uniformly integrable for \(i=1,2\).

Since \(W_T = W^1_T(Y_{0:T-1})+W^2_T(Y_{0:T-1})\) for all \(T=1,2,\dots \), \(\{W_T,T=1,2,\dots \}\) is uniformly integrable. \(\square \)
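
For concreteness, the stationary distribution \(\pi ^{g_0}\) and the constant \(R\) can be computed numerically. The sketch below is ours: it truncates the state space, uses per-slot Bernoulli arrivals and departures, and picks a uniform initial PMF on \(\{0,\dots ,M\}\) purely as an example of an initial condition satisfying Assumption 2.

```python
import numpy as np

def stationary(lam, mu, n=200):
    """Approximate the stationary PMF pi^{g_0} of the single-queue chain
    Y_{t+1} = (Y_t - D_t)^+ + A_t with Bernoulli(lam) arrivals and
    Bernoulli(mu) departures, truncated at n states."""
    P = np.zeros((n, n))
    for x in range(n):
        for a in (0, 1):
            for d in (0, 1):
                p = (lam if a else 1 - lam) * (mu if d else 1 - mu)
                P[x, min(max(x - d, 0) + a, n - 1)] += p
    pi = np.ones(n) / n
    for _ in range(5000):          # power iteration to the fixed point pi = pi P
        pi = pi @ P
    return pi / pi.sum()

lam, mu, M = 0.3, 0.5, 5
pi = stationary(lam, mu)
pi0 = np.zeros_like(pi); pi0[:M + 1] = 1.0 / (M + 1)   # example initial PMF
R = max(pi0[x] / pi[x] for x in range(M + 1))          # the constant of Footnote 3
print(round(R, 3))
```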

Proof (Proof of Corollary 3)

From Lemma 5, there exists \(\{Y^{1}_t,Y^2_t, t \in \mathbb {Z}_+\}\) such that (67) holds and

$$\begin{aligned} c\left( X^{1,\hat{g}}_t\right) +c\left( X^{2,\hat{g}}_t\right) \le c\left( Y^{1}_t\right) +c\left( Y^{2}_t\right) \quad a.s. \end{aligned}$$
(171)

Let

$$\begin{aligned} W_T :=&\, \frac{1}{T}\sum _{t=0}^{T-1} \left( c\left( Y^1_t\right) +c\left( Y^2_t\right) \right) , \end{aligned}$$
(172)
$$\begin{aligned} V_T :=&\, \frac{1}{T}\sum _{t=0}^{T-1} \left( c\left( X^{1,\hat{g}}_t\right) +c\left( X^{2,\hat{g}}_t\right) \right) . \end{aligned}$$
(173)

From (171) it follows that

$$\begin{aligned} V_T \le W_T,\quad T=1,2,\dots \end{aligned}$$
(174)

From Lemma 6, \(\{W_T, T =1,2,\dots \} \) is uniformly integrable; therefore \(\{V_T, T =1,2,\dots \} \), which is bounded above by \(\{W_T, T =1,2,\dots \} \), is also uniformly integrable.

By uniform integrability, if \(\{V_T, T =1,2,\dots \} \) converges a.s., then \(\{V_T, T =1,2,\dots \} \) also converges in expectation. Furthermore,

$$\begin{aligned} J^{\hat{g}}\left( \pi ^1_0,\pi ^2_0\right) =&\limsup _{T\rightarrow \infty }\frac{1}{T} \mathbf {E}\left[ \sum _{t=0}^{T-1} \left( c\left( X^{1,\hat{g}}_t\right) +c\left( X^{2,\hat{g}}_t\right) \right) \right] \nonumber \\&= \limsup _{T\rightarrow \infty } \mathbf {E}\left[ V_T\right] \nonumber \\&\le \limsup _{T\rightarrow \infty } \mathbf {E}\left[ W_T\right] = J^{g_0}. \end{aligned}$$
(175)

\(\square \)

Appendix 4: Proofs of the results associated with step 2 of the proof of Theorem 2

Proof (Proof of Lemma 7)

First we show that \(\{S_t,t\ge T_0+1\}\) is a Markov chain.

For \(s_t \ge 2\),

$$\begin{aligned}&\mathbf {P}\left( S_{t+1}=s_{t+1}|S_{T_0+1:t}=s_{T_0+1:t}\right) \nonumber \\&\quad = \mathbf {P}\left( \left( s_t-D^1_t -D^2_t +A^1_t+A^2_t \right) =s_{t+1}\right. \left. |S_{T_0+1:t}=s_{T_0+1:t}\right) \nonumber \\&\quad = \mathbf {P}\left( \left( s_t-D^1_t -D^2_t +A^1_t+A^2_t\right) =s_{t+1}|S_t=s_t\right) \nonumber \\&\quad =\mathbf {P}\left( S_{t+1}=s_{t+1}|S_t=s_t\right) . \end{aligned}$$
(176)

The first and last equalities in (176) follow from the construction of the process \(\{S_t, t\ge T_0+1\}\). The second equality in (176) is true because \(T_0\) is a stopping time with respect to \(\{X^{1,\hat{g}}_t,X^{2,\hat{g}}_t, t\in \mathbb {Z}_+\}\), and \(A^i_t, D^i_t, i=1,2\) are independent of all random variables before \(t\). For \(s_t =0\) we have, by similar arguments,

$$\begin{aligned}&\mathbf {P}\left( S_{t+1}=s_{t+1}|S_{T_0+1:t}=s_{T_0+1:t}\right) \nonumber \\&\quad = \mathbf {P}\left( A^1_t+A^2_t=s_{t+1}|S_{T_0+1:t-1}=s_{T_0+1:t-1}, S_t=0\right) \nonumber \\&\quad = \mathbf {P}\left( A^1_t+A^2_t=s_{t+1}|S_t=0\right) \nonumber \\&\quad =\mathbf {P}\left( S_{t+1}=s_{t+1}|S_t=0\right) . \end{aligned}$$
(177)

The first and last equality in (177) follow from the construction of the process \(\{S_t, t\ge T_0+1\}\). The second equality in (177) is true because \(A^i_t, D^i_t, i=1,2\) are independent of all variables before \(t\).

For \(s_t =1\),

$$\begin{aligned}&\mathbf {P}\left( S_{t+1}=s_{t+1}|S_{T_0+1:t}=s_{T_0+1:t}\right) \nonumber \\&\quad = \mathbf {P}\left( s_t+1_{\left\{ X^{1,\hat{g}}_t=0\right\} }(D^1_t- D^2_t)-D^1_t +A^1_t+A^2_t =s_{t+1} |S_{T_0+1:t}=s_{T_0+1:t}\right) \nonumber \\&\quad = \mathbf {P}\left( 1 -D^2_t +A^1_t+A^2_t=s_{t+1}, X^{1,\hat{g}}_t=0|S_{T_0+1:t-1}=s_{T_0+1:t-1}, S_t=1\right) \nonumber \\&\qquad + \mathbf {P}\left( 1 -D^1_t +A^1_t+A^2_t=s_{t+1}, X^{1,\hat{g}}_t=1|S_{T_0+1:t-1}=s_{T_0+1:t-1}, S_t=1\right) \nonumber \\&\quad = \mathbf {P}\left( 1 -D^1_t +A^1_t+A^2_t=s_{t+1}, X^{1,\hat{g}}_t=0|S_{T_0+1:t-1}=s_{T_0+1:t-1}, S_t=1\right) \nonumber \\&\qquad + \mathbf {P}\left( 1 -D^1_t +A^1_t+A^2_t=s_{t+1}, X^{1,\hat{g}}_t=1|S_{T_0+1:t-1}=s_{T_0+1:t-1}, S_t=1\right) \nonumber \\&\quad = \mathbf {P}\left( 1 -D^1_t +A^1_t+A^2_t=s_{t+1} |S_{T_0+1:t-1}=s_{T_0+1:t-1}, S_t=1\right) \nonumber \\&\quad = \mathbf {P}\left( 1 -D^1_t +A^1_t+A^2_t=s_{t+1} |S_t=1\right) \nonumber \\&\quad =\mathbf {P}\left( S_{t+1}=s_{t+1}|S_t=s_t\right) . \end{aligned}$$
(178)

The first equality in (178) follows from the construction of the process \(\{S_t, t\ge T_0+1\}\). The second and fourth equalities follow from the fact that \(X^{1,\hat{g}}_t\) can be either 0 or 1.

In the third equality, \(D^2_t\) is replaced by \(D^1_t\) in the first term; this is true because \(D^1_t\) and \(D^2_t\) are identically distributed and independent of \(X^{1,\hat{g}}_t\) and all past random variables. The fifth equality holds because \(T_0\) is a stopping time with respect to \(\{X^{1,\hat{g}}_t,X^{2,\hat{g}}_t, t\in \mathbb {Z}_+\}\) and \(A^i_t, D^i_t, i=1,2\) are independent of all past random variables. The last equality follows from the same arguments that lead to the first through the fifth equalities.

Therefore, the process \(\{S_t, t\ge T_0+1\}\) is a Markov chain.

Since \(\uplambda ,\mu >0\), the Markov chain is irreducible.

We prove that the process \(\{S_t, t\ge T_0+1\}\) is positive recurrent. Note that, for all \(s=0,1,2,\dots \), because of the construction of \(\{S_t, t\ge T_0+1\}\)

$$\begin{aligned} \mathbf {E}\left[ S_{t+1} | S_t=s\right]&\le \mathbf {E}\left[ S_t +A^1_t+A^2_t| S_t=s\right] \nonumber \\&= s +2\uplambda < \infty . \end{aligned}$$
(179)

Moreover, for all \(s\ge 2\),

$$\begin{aligned} \mathbf {E}\left[ S_{t+1} | S_t=s\right]&= \mathbf {E}\left[ s-D^1_t -D^2_t +A^1_t+A^2_t | S_t=s\right] \nonumber \\&= s-2\mu +2\uplambda < s. \end{aligned}$$
(180)

Using Foster’s theorem (see Bremaud 1999, chap. 5), we conclude that the Markov chain \(\{S_t, t\ge T_0+1\}\) is positive recurrent. \(\square \)
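
The drift computation behind (179)–(180) is easily reproduced. A sketch (ours, again with per-slot Bernoulli(\(\uplambda \)) arrivals and Bernoulli(\(\mu \)) departures) that evaluates \(\mathbf {E}[S_{t+1}-S_t | S_t=s]\) and exhibits the negative drift used in Foster's criterion:

```python
from itertools import product

def drift(s, lam, mu):
    """E[S_{t+1} - S_t | S_t = s] for the comparison chain of Lemma 7:
    both departures count when s >= 2, one when s == 1, none when s == 0."""
    e = 0.0
    for a1, a2, d1, d2 in product((0, 1), repeat=4):
        p = ((lam if a1 else 1 - lam) * (lam if a2 else 1 - lam)
             * (mu if d1 else 1 - mu) * (mu if d2 else 1 - mu))
        dep = d1 + d2 if s >= 2 else (d1 if s == 1 else 0)
        e += p * (a1 + a2 - dep)
    return e

lam, mu = 0.3, 0.5
print([round(drift(s, lam, mu), 3) for s in range(5)])
# for s >= 2 the drift is 2*lam - 2*mu = -0.4 < 0, matching Eq. (180)
```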

Proof (Proof of Lemma 8)

Let \((\Omega , \mathcal {F}, \mathbf {P})\) denote the basic probability space for our problem. Define events \(E_t \in \mathcal {F}, t=0,1,\dots \) to be

$$\begin{aligned} E_t =&\{\omega \in \Omega : \left( U^{1,\hat{g}}_{t'}(\omega ),U^{2,\hat{g}}_{t'}(\omega )\right) \ne (0,0) \quad \forall t' \ge t\} \end{aligned}$$
(181)

If the claim of this lemma is not true, we get

$$\begin{aligned} \mathbf {P}\left( \bigcup _{t =0}^{\infty }E_t\right) = 1-\mathbf {P}\left( \left( U^{1,\hat{g}}_{t},U^{2,\hat{g}}_{t}\right) =(0,0) \quad i.o.\right) >0. \end{aligned}$$
(182)

Therefore, there exists some \(t_0\) such that \(\mathbf {P}(E_{t_0}) >0\). Since \(t_0\) is a constant, it is a stopping time with respect to \(\{X^{1,\hat{g}}_t,X^{2,\hat{g}}_t, t \in \mathbb {Z}_+ \}\).

Consider the process \(\{S_t,t=t_0+1,t_0+2,\ldots \}\) defined in Lemma 7 with the stopping time \(t_0\). From Lemma 7 we know that \(\{S_t, t\ge t_0+1 \}\) is an irreducible positive recurrent Markov chain. Furthermore, along the sample path induced by any \(\omega \in E_{t_0}\), we claim that for all \(t \ge t_0+1\)

$$\begin{aligned} S_t(\omega ) =\,X^{1,\hat{g}}_{t}(\omega )+X^{2,\hat{g}}_{t}(\omega ) = \,\overline{X}^{1,\hat{g}}_{t-1}(\omega )+\overline{X}^{2,\hat{g}}_{t-1}(\omega ). \end{aligned}$$
(183)

The claim is shown by induction below.

By the definition of \(\{S_t, t\ge t_0+1 \}\) in Lemma 7, we have at time \(t_0+1\) for any \(\omega \in E_{t_0}\)

$$\begin{aligned} S_{t_0+1}(\omega ) = X^{1,\hat{g}}_{t_0+1}(\omega )+X^{2,\hat{g}}_{t_0+1}(\omega ) = \overline{X}^{1,\hat{g}}_{t_0}(\omega )+\overline{X}^{2,\hat{g}}_{t_0}(\omega ), \end{aligned}$$
(184)

where the last equality in (184) follows from the system dynamics (1)–(3).

Assume Eq. (183) is true at time \(t\) (\(t\ge t_0+1\)). At time \(t+1\) we have, by (1)–(3),

$$\begin{aligned} X^{1,\hat{g}}_{t+1}+X^{2,\hat{g}}_{t+1}&= (X^{1,\hat{g}}_t-D^1_t)^++(X^{2,\hat{g}}_t-D^2_t)^+ +A^1_t+A^2_t\nonumber \\&= X^{1,\hat{g}}_t+X^{2,\hat{g}}_t-D^1_t-D^2_t +A^1_t+A^2_t\nonumber \\&\quad +D^1_t1_{\left\{ X^{1,\hat{g}}_t=0\right\} }+D^2_t1_{\left\{ X^{2,\hat{g}}_t=0\right\} }. \end{aligned}$$
(185)

Since along the sample path induced by \(\omega \in E_{t_0}\), \(\left( U^{1,\hat{g}}_{t-1}(\omega ),U^{2,\hat{g}}_{t-1}(\omega )\right) \ne (0,0)\) and \(X^{i,\hat{g}}_t=\overline{X}^{i,\hat{g}}_{t-1}-U^{i,\hat{g}}_{t-1}+U^{j,\hat{g}}_{t-1}\), the event \(\{X^{i,\hat{g}}_t=0\} \bigcap E_{t_0}\) (\(i=1\) or 2) implies that \(\overline{X}^{i,\hat{g}}_{t-1}=1,\, U^{i,\hat{g}}_{t-1}=1 \text { and }U^{j,\hat{g}}_{t-1}=0\). For this case, \(\overline{X}^{i,\hat{g}}_{t-1}=1\) and \(U^{i,\hat{g}}_{t-1}=1\) further imply that the threshold is at most one. Then, the only possibility for \(U^{j,\hat{g}}_{t-1}=0\) is \(\overline{X}^{j,\hat{g}}_{t-1}=0\). Therefore,

$$\begin{aligned}&\left\{ X^{i,\hat{g}}_t=0\right\} \bigcap E_{t_0} \nonumber \\&\quad \subseteq \left\{ \overline{X}^{i,\hat{g}}_{t-1}=1,\, U^{i,\hat{g}}_{t-1}=1, \, \overline{X}^{j,\hat{g}}_{t-1}=0 \text { and }U^{j,\hat{g}}_{t-1}=0\right\} \nonumber \\&\quad \subseteq \{S_t = 1\}. \end{aligned}$$
(186)

Consequently, from (186), for any \(\omega \in E_{t_0}\)

$$\begin{aligned}&D^1_t(\omega )1_{\left\{ X^{1,\hat{g}}_t(\omega )=0\right\} }+D^2_t(\omega )1_{\left\{ X^{2,\hat{g}}_t(\omega )=0\right\} }\nonumber \\&\quad = 1_{\left\{ S_t(\omega )=1\right\} } \left( D^1_t(\omega )1_{\left\{ X^{1,\hat{g}}_t(\omega )=0\right\} } + D^2_t(\omega )1_{\left\{ X^{2,\hat{g}}_t(\omega )=0\right\} } \right) \nonumber \\&\quad = 1_{\left\{ S_t(\omega )=1\right\} }\left( 1_{\left\{ X^{1,\hat{g}}_t(\omega )=0\right\} }(D^1_t(\omega )-D^2_t(\omega ))+ D^2_t(\omega ) \right) . \end{aligned}$$
(187)

Moreover, \(\left( U^{1,\hat{g}}_{t-1}(\omega ),U^{2,\hat{g}}_{t-1}(\omega )\right) \ne (0,0)\) implies that \(\left( \overline{X}^{1,\hat{g}}_{t-1}(\omega ),\overline{X}^{2,\hat{g}}_{t-1}(\omega )\right) \ne (0,0)\). Hence,

$$\begin{aligned} S_t(\omega ) = \overline{X}^{1,\hat{g}}_{t-1}(\omega )+\overline{X}^{2,\hat{g}}_{t-1}(\omega )\ne 0, \end{aligned}$$
(188)

and

$$\begin{aligned}&X^{1,\hat{g}}_{t+1}(\omega )+X^{2,\hat{g}}_{t+1}(\omega )\nonumber \\&\quad = X^{1,\hat{g}}_t(\omega )+X^{2,\hat{g}}_t(\omega )-D^1_t(\omega )-D^2_t(\omega ) +A^1_t(\omega )+A^2_t(\omega )\nonumber \\&\qquad + 1_{\left\{ S_t(\omega )=1\right\} } \left( 1_{\left\{ X^{1,\hat{g}}_t(\omega )=0\right\} }(D^1_t(\omega )- D^2_t(\omega ))+D^2_t(\omega ) \right) \nonumber \\&\quad = X^{1,\hat{g}}_t(\omega )+X^{2,\hat{g}}_t(\omega )-D^1_t(\omega )-D^2_t(\omega ) +A^1_t(\omega )+A^2_t(\omega )\nonumber \\&\qquad + 1_{\left\{ S_t(\omega )=1\right\} } \left( 1_{\left\{ X^{1,\hat{g}}_t(\omega )=0\right\} }(D^1_t(\omega )- D^2_t(\omega ))+D^2_t(\omega ) \right) \nonumber \\&\qquad +1_{\{S_t(\omega )=0\}}\left( D^1_t(\omega )+D^2_t(\omega ) \right) \nonumber \\&\quad = S_{t+1}(\omega ), \end{aligned}$$
(189)

where the first and second equalities in (189) follow from (187) and (188), respectively. The last equality in (189) follows from the construction of \(\{S_t, t\ge t_0+1\}\).

Furthermore, by the system dynamics (1)–(3) we have

$$\begin{aligned} \overline{X}^{1,\hat{g}}_{t}(\omega )+\overline{X}^{2,\hat{g}}_{t}(\omega ) =&\,X^{1,\hat{g}}_{t+1}(\omega )+X^{2,\hat{g}}_{t+1}(\omega )\nonumber \\ =&\, S_{t+1}(\omega ). \end{aligned}$$
(190)

Thus, Eq. (183) is true for any \(\omega \in E_{t_0}\) for all \(t \ge t_0+1\).

Then, for any \(\omega \in E_{t_0}\)

$$\begin{aligned} S_t(\omega ) = \overline{X}^{1,\hat{g}}_{t-1}(\omega )+\overline{X}^{2,\hat{g}}_{t-1}(\omega ) \ne 0 \text { for all }t \ge t_0+1 \end{aligned}$$
(191)

because \(\left( U^{1,\hat{g}}_{t-1}(\omega ),U^{2,\hat{g}}_{t-1}(\omega )\right) \ne (0,0)\) for all \(t \ge t_0+1\). Since \(\mathbf {P}(E_{t_0})>0 \), (191) contradicts the fact that \(\{S_t, t\ge t_0+1\}\) is recurrent.

Therefore, no such event \(E_{t_0}\in \mathcal {F}\) with positive probability exists, and the proof of this lemma is complete. \(\square \)

Appendix 5: Proofs of the results associated with step 3 of the proof of Theorem 2

Proof (Proof of Lemma 9)

For any fixed centralized policy \(g\in \mathcal {G}_c\), the information \(I^1_t,I^2_t\) available to the centralized controller includes all primitive random variables \(X^i_{0},A^i_{0:t},D^i_{0:t},i=1,2\) up to time \(t\). Since all other random variables are functions of these primitive random variables and \(g\), we have

$$\begin{aligned} U^{i,g}_t=\, g^i_t(I^1_t,I^2_t) =\, g^i_t(X^1_{0},X^2_{0},A^1_{0:t},A^2_{0:t},D^1_{0:t},D^2_{0:t}), \end{aligned}$$
(192)

for \(i=1,2\). For any initial queue lengths \(x^1_{0},x^2_{0}\), we now define a policy \(\tilde{g}\) from \(g\) for the case when both queues are initially empty. Let \(\tilde{g}\) be the policy such that for \(i=1,2\)

$$\begin{aligned} U^{i,\tilde{g}}_t =&\,\tilde{g}^i_t(I^1_t,I^2_t) \nonumber \\ :=&\left\{ \begin{array}{ll} g^i_t(x^1_{0},x^2_{0},A^1_{0:t},A^2_{0:t},D^1_{0:t},D^2_{0:t}) &{} \text { if }\ \overline{X}^{i,\tilde{g}}_t>0\\ 0 &{} \text { if }\ \overline{X}^{i,\tilde{g}}_t=0 \end{array} \right. \nonumber \\ =&\min \left( U^{i,g}_t, \overline{X}^{i,\tilde{g}}_t \right) \le U^{i,g}_t, \end{aligned}$$
(193)

where \(X^{1,\tilde{g}}_t\) and \(X^{2,\tilde{g}}_t\) denote the queue lengths at time \(t\) due to policy \(\tilde{g}\) with initial queue lengths \(X^{1,\tilde{g}}_0=X^{2,\tilde{g}}_0=0\).
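
The transformation (193) can be replayed in code. A sketch (ours): given the action sequence that \(g\) produces from initial lengths \((x^1_0,x^2_0)\) on a fixed realization of the primitive randomness, we rerun the same randomness from empty queues, suppressing any routing action taken at an empty queue; by (194) the resulting trajectory is dominated pointwise by the \(g\)-trajectory.

```python
def replay(g_actions, arrivals, departures):
    """Sketch of the transformation in Eq. (193). g_actions[t] = (U^1_t, U^2_t)
    in {0,1}^2 are the actions of g on this sample path; arrivals[t] and
    departures[t] are the primitive (A^1_t, A^2_t) and (D^1_t, D^2_t).
    Returns the tilde-system trajectory started from empty queues."""
    x = [0, 0]
    traj = [tuple(x)]
    for (u1, u2), (a1, a2), (d1, d2) in zip(g_actions, arrivals, departures):
        xbar = [max(x[0] - d1, 0) + a1, max(x[1] - d2, 0) + a2]
        v1 = min(u1, xbar[0])      # Eq. (193): never route from an empty queue
        v2 = min(u2, xbar[1])      # (valid since U takes values in {0,1})
        x = [xbar[0] - v1 + v2, xbar[1] - v2 + v1]
        traj.append(tuple(x))
    return traj
```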

At time 0 we have \(X^{i,g}_0=x^i_0\ge 0 = X^{i,\tilde{g}}_0\) for \(i=1,2\). We now prove by induction that for all time \(t\)

$$\begin{aligned} X^{i,g}_t \ge X^{i,\tilde{g}}_t, \quad i=1,2. \end{aligned}$$
(194)

Suppose the claim is true at time \(t\). Then, from the system dynamics (1)–(2) and (194) we obtain, for \(i=1,2\),

$$\begin{aligned} \overline{X}^{i,g}_t =&\left( X^{i,g}_t - D^{i}_t \right) ^+ +A^i_t \nonumber \\ \ge&\left( X^{i,\tilde{g}}_t - D^{i}_t \right) ^+ +A^i_t = \overline{X}^{i,\tilde{g}}_t. \end{aligned}$$
(195)

Furthermore from (1)–(2) and (193)

$$\begin{aligned} X^{i,g}_{t+1} =&\, \overline{X}^{i,g}_t-U^{i,g}_t+U^{j,g}_t \nonumber \\ \ge&\,\overline{X}^{i,g}_t-U^{i,g}_t+U^{j,\tilde{g}}_t \end{aligned}$$
(196)

If \(\overline{X}^{i,\tilde{g}}_t>0 \), then, because of (193) and (195)

$$\begin{aligned} \overline{X}^{i,g}_t-U^{i,g}_t =&\,\overline{X}^{i,g}_t-\min \left( U^{i,g}_t, \overline{X}^{i,\tilde{g}}_t \right) \nonumber \\ =&\,\overline{X}^{i,g}_t-U^{i,\tilde{g}}_t \ge \overline{X}^{i,\tilde{g}}_t-U^{i,\tilde{g}}_t . \end{aligned}$$
(197)

If \(\overline{X}^{i,\tilde{g}}_t=0 \), since \(\overline{X}^{i,g}_t-U^{i,g}_t \ge 0\), (193) implies

$$\begin{aligned} \overline{X}^{i,g}_t-U^{i,g}_t \ge 0 = \overline{X}^{i,\tilde{g}}_t-U^{i,\tilde{g}}_t. \end{aligned}$$
(198)

Combining (196)–(198) and (1)–(2) we get

$$\begin{aligned} X^{i,g}_{t+1} \ge&\,\overline{X}^{i,g}_t-U^{i,g}_t+U^{j,\tilde{g}}_t \nonumber \\ \ge&\, \overline{X}^{i,\tilde{g}}_t-U^{i,\tilde{g}}_t+U^{j,\tilde{g}}_t = X^{i,\tilde{g}}_{t+1}. \end{aligned}$$
(199)

This completes the proof of claim (194).

Since the cost function is increasing, (194) implies that for all \(g\in \mathcal {G}_c\) and any initial condition \(X^1_0=x^1_0,X^2_0=x^2_0\),

$$\begin{aligned} \inf _{g\in \mathcal {G}_c} J^g_T(0,0) \le J^{\tilde{g}}_T(0,0) \le J^{g}_T(x^1_0,x^2_0). \end{aligned}$$
(200)

Consequently, for any PMFs \(\pi ^1_0,\pi ^2_0\)

$$\begin{aligned} \inf _{g\in \mathcal {G}_c} J^g_T(0,0) \le \inf _{g\in \mathcal {G}_c} J^{g}_T(\pi ^1_0,\pi ^2_0). \end{aligned}$$
(201)

Moreover, the result of Lemma 3 ensures that \(\hat{g}\) gives the smallest expected cost among policies in \(\mathcal {G}_c\) for any finite horizon when \(X^1_0=X^2_0=0\). It follows that, for any finite \(T\),

$$\begin{aligned} J^{\hat{g}}_T(0,0) = \inf _{g\in \mathcal {G}_c} J^g_T(0,0) \le J^{\tilde{g}}_T(0,0) \le J^{g}_T(x^1_0,x^2_0). \end{aligned}$$
(202)

For the infinite horizon cost, we divide each term in (202) by \(T\) and let \(T\) go to infinity to obtain, for any \(\pi ^1_0,\pi ^2_0\),

$$\begin{aligned} J^{\hat{g}}(0,0) = \inf _{g\in \mathcal {G}_c} J^g(0,0) \le J^{\tilde{g}}(0,0) \le J^{g}(x^1_0,x^2_0). \end{aligned}$$
(203)

\(\square \)

Appendix 6: Proofs of the results associated with step 4 of the proof of Theorem 2

Proof (Proof of the claim in the proof of Theorem 2)

We prove here our claim expressed by Eq. (90) to complete the proof of Theorem 2. By (78),

$$\begin{aligned} S_{T_0+1} = X^{1,\hat{g}}_{T_0+1}+X^{2,\hat{g}}_{T_0+1}. \end{aligned}$$
(204)

We prove by induction that \(X^{1,\hat{g}}_{t}+X^{2,\hat{g}}_{t}= S_t\) for all \(t \ge T_0+1\).

Assume that \(X^{1,\hat{g}}_{t}+X^{2,\hat{g}}_{t}= S_t\) at time \(t\), \(t\ge T_0+1\). Then for time \(t+1\), because of the system dynamics (1)–(3),

$$\begin{aligned} X^{1,\hat{g}}_{t+1}+X^{2,\hat{g}}_{t+1}&= (X^{1,\hat{g}}_t-D^1_t)^++(X^{2,\hat{g}}_t-D^2_t)^+ +A^1_t+A^2_t\nonumber \\&= X^{1,\hat{g}}_t+X^{2,\hat{g}}_t-D^1_t-D^2_t +A^1_t+A^2_t\nonumber \\&\quad +D^1_t1_{\left\{ X^{1,\hat{g}}_t=0\right\} }+D^2_t1_{\left\{ X^{2,\hat{g}}_t=0\right\} }. \end{aligned}$$
(205)

When \(X^{i,\hat{g}}_t=0\) (\(i=1\) or 2), \(U^{j,\hat{g}}_{t-1}\) must be 0 because

$$\begin{aligned} 0=X^{i,\hat{g}}_t=\overline{X}^{i,\hat{g}}_{t-1}-U^{i,\hat{g}}_{t-1}+U^{j,\hat{g}}_{t-1} \end{aligned}$$
(206)

and \(\overline{X}^{i,\hat{g}}_{t-1}-U^{i,\hat{g}}_{t-1}\ge 0\).

We consider the following two cases separately: Case 1, \(U^{i,\hat{g}}_{t-1}=0\); Case 2, \(U^{i,\hat{g}}_{t-1}=1\).

Case 1: When \(U^{i,\hat{g}}_{t-1}=0\), we must have \(\overline{X}^{i,\hat{g}}_{t-1}=0\) by (206). Then \(\overline{X}^{j,\hat{g}}_{t-1}\in \{0,1\}\) for the following reason. When \(U^{i,\hat{g}}_{t-1}=U^{j,\hat{g}}_{t-1}=0\), the sizes of both queues are between the lower bound and the threshold. That is,

$$\begin{aligned} \overline{\textit{LB}}^{\hat{g}}_{t-1} \le&\,\overline{X}^{i,\hat{g}}_{t-1} \le \left\lceil TH_t \right\rceil -1, \end{aligned}$$
(207)
$$\begin{aligned} \overline{\textit{LB}}^{\hat{g}}_{t-1} \le&\,\overline{X}^{j,\hat{g}}_{t-1} \le \left\lceil TH_t \right\rceil -1. \end{aligned}$$
(208)

Combining (207), (208) with \(\overline{X}^{i,\hat{g}}_{t-1}=0\) we obtain

$$\begin{aligned} \overline{X}^{j,\hat{g}}_{t-1} =&\left| \overline{X}^{j,\hat{g}}_{t-1} - \overline{X}^{i,\hat{g}}_{t-1} \right| \nonumber \\ \le&\,\left\lceil TH_t \right\rceil -1 -\overline{\textit{LB}}^{\hat{g}}_{t-1}\nonumber \\ \le&\,\,\frac{1}{2}\left( \overline{\textit{UB}}^{\hat{g}}_{t-1}-\overline{\textit{LB}}^{\hat{g}}_{t-1} \right) \le 1.5, \end{aligned}$$
(209)

where the last inequality in (209) is true because of (31) in Lemma 1, (89), and

$$\begin{aligned} \overline{\textit{UB}}^{\hat{g}}_{t-1}-\overline{\textit{LB}}^{\hat{g}}_{t-1} \le \textit{UB}^{\hat{g}}_{t}+1-\textit{LB}^{\hat{g}}_{t}+1 \le 3 . \end{aligned}$$

Therefore, \(\overline{X}^{j,\hat{g}}_{t-1}\le 1\) because \(\overline{X}^{j,\hat{g}}_{t-1}\) takes integer values.

Case 2: When \(U^{i,\hat{g}}_{t-1}=1\), we must have \(\overline{X}^{i,\hat{g}}_{t-1}=1\) by (206). This implies that the threshold is not more than 1, and the only possible value of \(\overline{X}^{j,\hat{g}}_{t-1}\) less than the threshold is 0.

As a consequence of the above analysis for the cases 1 and 2, \(\{X^{i,\hat{g}}_t=0 \}\) implies

$$\begin{aligned} S_t = \overline{X}^{i,\hat{g}}_{t-1}+\overline{X}^{j,\hat{g}}_{t-1} \le 1. \end{aligned}$$
(210)

Thus, for \(i=1,2\),

$$\begin{aligned} \left\{ X^{i,\hat{g}}_t=0 \right\} = \left\{ X^{i,\hat{g}}_t=0, S_t \le 1 \right\} . \end{aligned}$$
(211)

Then,

$$\begin{aligned}&D^1_t1_{\left\{ X^{1,\hat{g}}_t=0\right\} }+D^2_t1_{\left\{ X^{2,\hat{g}}_t=0\right\} }\nonumber \\&\quad =D^1_t1_{\left\{ X^{1,\hat{g}}_t =0, S_t \le 1\right\} }+D^2_t1_{\left\{ X^{2,\hat{g}}_t=0, S_t \le 1\right\} }\nonumber \\&\quad =D^1_t1_{\left\{ X^{1,\hat{g}}_t=0, S_t = 1\right\} }+D^2_t1_{\left\{ X^{1,\hat{g}}_t\ne 0, S_t = 1\right\} }\nonumber \\&\qquad + D^1_t1_{\left\{ S_t =0\right\} }+D^2_t1_{\left\{ S_t =0\right\} }. \end{aligned}$$
(212)

Combining (205) and (212) we obtain

$$\begin{aligned} X^{1,\hat{g}}_{t+1}+X^{2,\hat{g}}_{t+1}&= X^{1,\hat{g}}_t+X^{2,\hat{g}}_t-D^1_t-D^2_t +A^1_t+A^2_t\nonumber \\&\quad + D^1_t1_{\left\{ X^{1,\hat{g}}_t=0, S_t = 1\right\} }+D^2_t1_{\left\{ X^{1,\hat{g}}_t\ne 0, S_t = 1\right\} }\nonumber \\&\quad + D^1_t1_{\left\{ S_t =0\right\} }+D^2_t1_{\left\{ S_t =0\right\} } = S_{t+1}, \end{aligned}$$
(213)

where the last equality follows by the definition of \(S_{t+1}\).

Therefore, at any time \(t \ge T_0+1\) we have

$$\begin{aligned} X^{1,\hat{g}}_t+X^{2,\hat{g}}_t =S_t. \end{aligned}$$
(214)

The proof of claim (90), and consequently the proof of Theorem 2, is complete. \(\square \)

Cite this article

Ouyang, Y., Teneketzis, D. Signaling for decentralized routing in a queueing network. Ann Oper Res 317, 737–775 (2022). https://doi.org/10.1007/s10479-015-1850-4
