Stochastic scheduling: A short history of index policies and new approaches to index generation for dynamic resource allocation

Glazebrook, K. D.; Hodge, D. J.; Kirkbride, C.; Minty, R. J.

doi:10.1007/s10951-013-0325-1

Published: 05 April 2013

Journal of Scheduling, Volume 17, pages 407–425 (2014)



Abstract

In the 1970s John Gittins discovered that multi-armed bandits, an important class of models for the dynamic allocation of a single key resource among a set of competing projects, have optimal solutions of index form. At each decision epoch such policies allocate the resource to whichever project has the largest Gittins index. Since the 1970s, Gittins' index result, together with a range of developments and reformulations of it, has constituted an influential stream of ideas and results contributing to research into the scheduling of stochastic objects. We give a brief account of many of the most important contributions to this work and proceed to describe how index theory has recently been developed to produce strongly performing heuristic policies for the dynamic allocation of a divisible resource to a collection of stochastic projects (or bandits). A limitation of this work is its reliance on the structural requirement of indexability, which is notoriously difficult to establish. We introduce a general framework for the development of index policies for dynamic resource allocation which circumvents this difficulty. We utilise this framework to generate index policies for two model classes of independent interest. Their performance is evaluated in an extensive numerical study.
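As an illustration of the index rule described in the abstract, the following is a minimal sketch, not the authors' method. It assumes each project is a finite-state Markov reward chain with a discount factor beta < 1, and it computes the Gittins index through the restart-in-state formulation associated with Katehakis and Veinott (1987), solved here by plain value iteration. The function names `gittins_index` and `choose_project`, the tolerance, and the toy projects are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def gittins_index(P: Matrix, r: Sequence[float], beta: float, state: int,
                  tol: float = 1e-10, max_iter: int = 100_000) -> float:
    """Gittins index of `state` for one project (assumed finite Markov chain).

    Restart-in-state formulation: in every state j we may either continue
    from j or restart the project in `state`. If V is the optimal value of
    that problem, the index is (1 - beta) * V[state].
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    n = len(r)
    V = [0.0] * n
    for _ in range(max_iter):
        # Value of continuing from each state j under the current estimate V.
        cont = [r[j] + beta * sum(P[j][k] * V[k] for k in range(n))
                for j in range(n)]
        restart = cont[state]  # value of restarting in `state`
        new_V = [max(c, restart) for c in cont]
        diff = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if diff < tol:
            break
    return (1.0 - beta) * V[state]


def choose_project(projects: List[Tuple[Matrix, Sequence[float]]],
                   states: Sequence[int], beta: float) -> int:
    """Index policy: pick the project whose current state has the largest index."""
    indices = [gittins_index(P, r, beta, s)
               for (P, r), s in zip(projects, states)]
    return max(range(len(indices)), key=lambda i: indices[i])


# Toy data (hypothetical): project A pays nothing now but 1 per period forever
# after one step; project B pays a constant 0.4 per period.
project_a = ([[0.0, 1.0], [0.0, 1.0]], [0.0, 1.0])
project_b = ([[1.0]], [0.4])
chosen = choose_project([project_a, project_b], states=[0, 0], beta=0.5)
```

With beta = 0.5 the index of project A in its initial state equals beta, which exceeds the constant 0.4 of project B, so the index rule engages A even though its immediate reward is zero. A myopic rule comparing immediate rewards would choose B instead.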




Acknowledgments

An earlier version of this paper was given as a plenary address by the first author at MISTA2011 and his thanks go to the conference organisers. The first two authors acknowledge the support for this work by the Engineering and Physical Sciences Research Council (EPSRC) through grant EP/E049265/1. The third author was supported by an RCUK fellowship and the fourth author by an EPSRC doctoral studentship. All authors are grateful to the two referees whose careful reading of the paper and comments have enabled them to strengthen the paper.

Author information

Authors and Affiliations

Department of Management Science, Lancaster University, Lancaster, LA1 4YX, UK
K. D. Glazebrook & C. Kirkbride
School of Mathematical Sciences, University of Nottingham, Nottingham, NG7 2RD, UK
D. J. Hodge
School of Mathematics, Cardiff University, Cardiff, CF24 4AG, UK
R. J. Minty


Corresponding author

Correspondence to K. D. Glazebrook.


About this article

Cite this article

Glazebrook, K.D., Hodge, D.J., Kirkbride, C. et al. Stochastic scheduling: A short history of index policies and new approaches to index generation for dynamic resource allocation. J Sched 17, 407–425 (2014). https://doi.org/10.1007/s10951-013-0325-1


Received: 28 October 2011
Accepted: 21 March 2013
Published: 05 April 2013
Issue Date: October 2014
DOI: https://doi.org/10.1007/s10951-013-0325-1
