Annals of Operations Research, Volume 134, Issue 1, pp 19–67

A Tutorial on the Cross-Entropy Method

  • Pieter-Tjerk de Boer
  • Dirk P. Kroese
  • Shie Mannor
  • Reuven Y. Rubinstein

Abstract

The cross-entropy (CE) method is a new generic approach to combinatorial and multi-extremal optimization and rare event simulation. The purpose of this tutorial is to give a gentle introduction to the CE method. We present the CE methodology, the basic algorithm and its modifications, and discuss applications in combinatorial optimization and machine learning.

Key words

cross-entropy method, Monte-Carlo simulation, randomized optimization, machine learning, rare events

Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  • Pieter-Tjerk de Boer (1)
  • Dirk P. Kroese (2)
  • Shie Mannor (3)
  • Reuven Y. Rubinstein (4)

  1. Department of Electrical Engineering, Mathematics and Computer Science, University of Twente, Enschede, The Netherlands
  2. Department of Mathematics, The University of Queensland, Brisbane, Australia
  3. Department of Electrical and Computer Engineering, McGill University, Montreal, Canada
  4. Department of Industrial Engineering, Technion, Israel Institute of Technology, Haifa, Israel
