Skip to main content
Log in

Stochastic scheduling: A short history of index policies and new approaches to index generation for dynamic resource allocation

  • Published:
Journal of Scheduling Aims and scope Submit manuscript

Abstract

In the 1970’s John Gittins discovered that multi-armed bandits, an important class of models for the dynamic allocation of a single key resource among a set of competing projects, have optimal solutions of index form. At each decision epoch such policies allocate the resource to whichever project has the largest Gittins index. Since the 1970’s, Gittins’ index result together with a range of developments and reformulations of it have constituted an influential stream of ideas and results contributing to research into the scheduling of stochastic objects. We give a brief account of many of the most important contributions to this work and proceed to describe how index theory has recently been developed to produce strongly performing heuristic policies for the dynamic allocation of a divisible resource to a collection of stochastic projects (or bandits). A limitation on this work concerns the need for the structural requirement of indexability which is notoriously difficult to establish. We introduce a general framework for the development of index policies for dynamic resource allocation which circumvents this difficulty. We utilise this framework to generate index policies for two model classes of independent interest. Their performance is evaluated in an extensive numerical study.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Ansell, P. S., Glazebrook, K. D., Niño Mora, J., & O’Keeffe, M. (2003). Whittle’s index policy for a multi-class queueing system with convex holding costs. Mathematical Methods of Operations Research, 57, 21–39.

    Article  Google Scholar 

  • Archibald, T. W., Black, D. P., & Glazebrook, K. D. (2009). Indexability and index heuristics for a simple class of inventory routing problems. Operations Research, 57, 314–326.

    Google Scholar 

  • Armony, M., & Bambos, N. (2003). Queueing dynamics and maximal throughput scheduling in switched processing systems. QUESTA, 44, 209–252.

    Google Scholar 

  • Bertsimas, D. P., & Niño Mora, J. (1996). Conservation laws, extended polymatroids and multi-armed bandit problems: A polyhedral approach to indexable systems. Mathematics of Operations Research, 21, 257–306.

    Article  Google Scholar 

  • Caro, F., & Gallien, J. (2007). Dynamic assortment with demand learning for seasonal consumer goods. Management Science, 53, 276–292.

    Article  Google Scholar 

  • Dacre, M. J., & Glazebrook, K. D. (2002). The dependence of optimal returns from multi-class queueing systems on their customer base. QUESTA, 40, 93–115.

    Google Scholar 

  • Dacre, M. J., Glazebrook, K. D., & Niño Mora, J. (1999). The achievable region approach to the optimal control of stochastic systems (with discussion). Journal of the Royal Statistical Society, B61, 747–791.

    Article  Google Scholar 

  • Dayanik, S., Powell, W., & Yamazaki, K. (2008). Index policies for discounted bandit problems with availability constraints. Advances in Applied Probability, 49(2), 377–400.

    Article  Google Scholar 

  • Dunn, R. T., & Glazebrook, K. D. (2001). The performance of index-based policies for bandit problems with stochastic machine availability. Advances in Applied Probability, 33, 365–390.

    Article  Google Scholar 

  • Dunn, R. T., & Glazebrook, K. D. (2004). Discounted multi-armed bandit problems on a collection of machines with varying speeds. Mathematics of Operations Research, 29, 266–279.

    Article  Google Scholar 

  • Garbe, R., & Glazebrook, K. D. (1998a). Stochastic scheduling with priority classes. Mathematics of Operations Research, 23, 119–144.

    Article  Google Scholar 

  • Garbe, R., & Glazebrook, K. D. (1998b). Submodular returns and greedy heuristics for queueing scheduling problems. Operations Research, 46, 336–346.

    Google Scholar 

  • Gittins, J. C. (1979). Bandit processes and dynamic allocation indices (with discussion). Journal of the Royal Statistical Society, B41, 148–177.

    Google Scholar 

  • Gittins, J. C., Glazebrook, K. D., & Weber, R. R. (2011). Multi-armed bandit allocation indices (2nd ed.). London: Wiley-Blackwell.

    Book  Google Scholar 

  • Gittins, J. C., & Jones, D. M. (1974) . A dynamic allocation index for the sequential design of experiments. In Progress in statistics, pp. 241–266. Amsterdam: North-Holland.

  • Glazebrook, K. D. (1976). Stochastic scheduling with order constraints. International Journal of Systems Science, 7, 657–666.

    Article  Google Scholar 

  • Glazebrook, K. D., Hodge, D. J., & Kirkbride, C. (2011). General notions of indexability for queueing control and asset management. Annals of Applied Probability, 23, 876–907.

    Article  Google Scholar 

  • Glazebrook, K. D., Kirkbride, C., & Ouenniche, J. (2009). Index policies for the admission control and routing of impatient customers to heterogeneous service stations. Operations Research, 57, 975–989.

    Google Scholar 

  • Glazebrook, K. D., Kirkbride, C., & Ruiz-Hernandez, D. (2006). Spinning plates and squad systems—Policies for bi-directional restless bandits. Advances in Applied Probability, 38, 95–115.

    Article  Google Scholar 

  • Glazebrook, K. D., Mitchell, H. M., & Ansell, P. S. (2005). Index policies for the maintenance of a collection of machines by a set of repairmen. European Journal of Operational Research, 165, 267–284.

    Article  Google Scholar 

  • Glazebrook, K. D., & Niño Mora, J. (2001). Parallel scheduling of multiclass \(M/M/m\) queues: Approximate and heavy-traffic optimization of achievable performance. Operations Research, 49, 609–623.

  • Glazebrook, K. D., & Wilkinson, D. J. (2000). Index-based policies for discounted multi-armed bandits on parallel machines. Annals of Applied Probability, 10, 877–896.

    Article  Google Scholar 

  • Hodge, D. J., & Glazebrook, K. D. (2011). Dynamic resource allocation in a multi-product make-to-stock production system. QUESTA, 67, 333–364.

    Google Scholar 

  • Jacko, P., & Sansò, B. (2012). Optimal anticipative congestion control of flows with time-varying input stream. Performance Evaluation, 69, 86–101.

    Article  Google Scholar 

  • Katehakis, M. N., & Veinott, A. F. (1987). The multi-armed bandit problem—Decomposition and computation. Mathematics of Operations Research, 12, 262–268.

    Article  Google Scholar 

  • Klimov, G. P. (1974). Time sharing systems I. Theory of Probability and Its Applications, 19, 532–551.

    Article  Google Scholar 

  • Nash, P. (1973). Optimal allocation of resources between research projects. Ph.D. Thesis, Cambridge University, Cambridge.

  • Niño Mora, J. (2007). Dynamic priority allocation via restless bandit marginal productivity indices. TOP, 15, 161–198.

    Article  Google Scholar 

  • Niño-Mora, J. (2001). Restless bandits, partial conservation laws and indexability. Advances in Applied Probability, 33, 76–98.

    Article  Google Scholar 

  • Puterman, M. L. (1994). Markov decision processes: Discrete stochastic dynamic programming. New York, NY: Wiley.

    Book  Google Scholar 

  • Robinson, D. R. (1982). Algorithms for evaluating the dynamic allocation index. Operations Research Letters, 1, 72–74.

    Article  Google Scholar 

  • Sonin, I. M. (2008). A generalized Gittins index for a Markov chain and its recursive calculation. Statistics & Probability Letters, 78, 1526–1533.

    Article  Google Scholar 

  • Tsitsiklis, J. N. (1994). A short proof of the Gittins index theorem. Annals of Applied Probability, 4, 194–199.

    Article  Google Scholar 

  • Tsoucas, P. (1991). The region of achievable performance in a model of Klimov. IBM: Technical report.

  • Varaiya, P., Walrand, J., & Buyukkoc, C. (1985). Extensions of the multi-armed bandit problem. IEEE Transactions on Automatic Control, AC–30, 426–439.

    Article  Google Scholar 

  • Weber, R. R. (1992). On the Gittins index for multiarmed bandits. Annals of Applied Probability, 2, 1024–1033.

    Article  Google Scholar 

  • Weber,R. R., Weiss, G. (1990) . On an index policy for restless bandits. Journal of Applied Probability, 27, 637–648, 1990. (Addendum: Advances in Applied Probability, 23:429–430, 1991).

    Google Scholar 

  • Weiss, G. (1988). Branching bandit processes. Probability in the Engineering and Informational Sciences, 2, 269–278.

    Article  Google Scholar 

  • Whittle, P. (1980). Multi-armed bandits and the Gittins index. Journal of the Royal Statistical Society, B42, 142–149.

    Google Scholar 

  • Whittle, P. (1981). Arm-acquiring bandits. Annals of Probability, 9, 284–292.

    Article  Google Scholar 

  • Whittle, P. (1988). Restless bandits: Activity allocation in a changing world. In J. Gani (Ed.), A celebration of applied probability, (J. Appl. Prob. Spec. Vol. 25A, pp. 287–298). Sheffield: Applied Probability Trust.

Download references

Acknowledgments

An earlier version of this paper was given as a plenary address by the first author at MISTA2011 and his thanks go to the conference organisers. The first two authors acknowledge the support for this work by the Engineering and Physical Sciences Research Council (EPSRC) through grant EP/E049265/1. The third author was supported by an RCUK fellowship and the fourth author by an EPSRC doctoral studentship. All authors are grateful to the two referees whose careful reading of the paper and comments have enabled them to strengthen the paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to K. D. Glazebrook.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Glazebrook, K.D., Hodge, D.J., Kirkbride, C. et al. Stochastic scheduling: A short history of index policies and new approaches to index generation for dynamic resource allocation. J Sched 17, 407–425 (2014). https://doi.org/10.1007/s10951-013-0325-1

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10951-013-0325-1

Keywords

Navigation