Volume 25, Issue 2, pp 207–236

On learning and branching: a survey

  • Andrea Lodi
  • Giulia Zarpellon
Invited Paper


Abstract

This paper surveys learning techniques for the two most crucial decisions in the branch-and-bound algorithm for Mixed-Integer Linear Programming, namely variable selection and node selection. Because these decisions lack a deep mathematical understanding, the classical and vast literature in the field is inherently based on computational studies and heuristic, often problem-specific, strategies. We both interpret some of those early contributions in the light of modern (machine) learning techniques and detail the recent algorithms that instead explicitly incorporate machine learning paradigms.


Keywords

Branch and bound, Machine learning

Mathematics Subject Classification

90-02 68R-02 68T-02 



Acknowledgements

We would like to thank Yoshua Bengio for his support along our learning curve. Additional thanks go to Laurent Charlin, Mathieu Tanneau, Claudio Sole and François Laviolette for interesting discussions on the topic. A final round of discussion took place at the Bellairs Workshop on “Data, Learning and Optimization”, so we are indebted to all participants for their exchanges on the topic, and especially to Bruce Shepherd for making the workshop happen.



Copyright information

© Sociedad de Estadística e Investigación Operativa 2017

Authors and Affiliations

  1. Canada Excellence Research Chair, École Polytechnique de Montréal, Montréal, Canada
