Skip to main content

On learning and branching: a survey


This paper surveys learning techniques to deal with the two most crucial decisions in the branch-and-bound algorithm for Mixed-Integer Linear Programming, namely variable and node selections. Because of the lack of deep mathematical understanding on those decisions, the classical and vast literature in the field is inherently based on computational studies and heuristic, often problem-specific, strategies. We will both interpret some of those early contributions in the light of modern (machine) learning techniques, and give the details of the recent algorithms that instead explicitly incorporate machine learning paradigms.

This is a preview of subscription content, access via your institution.


  1. 1.

    Note that SB + PC and RB differ on the switch from SB to PC: at a certain fixed depth of the tree for the former method, whereas depending on each variable’s reliability (i.e., past usage) for the latter.

  2. 2.

    Note that Marcos Alvarez et al. (2016) comes later than Marcos Alvarez et al. (2017), where 2017 is the year of journal publication of Marcos Alvarez et al. (2014).


  1. Abdi H, Williams LJ (2010) Principal component analysis. Wiley Interdiscip Rev 2(4):433–459

    Article  Google Scholar 

  2. Achterberg T, Berthold T (2009) Hybrid branching. Springer, Berlin, pp 309–311. doi:10.1007/978-3-642-01929-6_23

  3. Achterberg T, Koch T, Martin A (2005) Branching rules revisited. Oper Res Lett 33(1):42–54. doi:10.1016/j.orl.2004.04.002

    Article  Google Scholar 

  4. Achterberg T, Koch T, Martin A (2006) MIPLIB 2003. Oper Res Lett 34(4):361–372

    Article  Google Scholar 

  5. Ansótegui C, Sellmann M, Tierney K (2009) A gender-based genetic algorithm for the automatic configuration of algorithms. Springer, Berlin, pp 142–157. doi:10.1007/978-3-642-04244-7_14

  6. Applegate D, Bixby R, Chvátal V, Cook W (2007) The traveling salesman problem. A computational study. Princeton University Press, Princeton

    Google Scholar 

  7. Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multiarmed bandit problem. Mach Learn 47(2):235–256. doi:10.1023/A:1013689704352

    Article  Google Scholar 

  8. Bellman R (1961) Adaptive control processes. Princeton University Press, Princeton

    Book  Google Scholar 

  9. Benichou M, Gauthier J, Girodet P, Hentges G (1971) Experiments in mixed-integer programming. Math Program 1:76–94

    Article  Google Scholar 

  10. Bertsekas DP, Tsitsiklis JN (1996) Neuro-dynamic programming, 1st edn. Anthropological field studies. Athena Scientific, Belmont

    Google Scholar 

  11. Bischl B, Lang M, Kotthoff L, Schiffner J, Richter J, Studerus E, Casalicchio G, Jones ZM (2016) mlr: machine learning in R. J Mach Learn Res 17(170):1–5

    Google Scholar 

  12. Bishop CM (2006) Pattern recognition and machine learning. Information science and statistics. Springer, New York

    Google Scholar 

  13. Bixby RE, Ceria S, McZeal CM, Savelsbergh MWP (1998) An updated mixed integer programming library: MIPLIB 3.0

  14. COR@L (2017) Computational Optimization Research at Lehigh.

  15. Cornuéjols G, Karamanov M, Li Y (2006) Early estimates of the size of branch-and-bound trees. INFORMS J Comput 18(1):86–96. doi:10.1287/ijoc.1040.0107

    Article  Google Scholar 

  16. CPLEX (2017)

  17. Di Liberto G, Kadioglu S, Leo K, Malitsky Y (2016) DASH: dynamic approach for switching heuristics. Eur J Oper Res 248(3):943–953. doi:10.1016/j.ejor.2015.08.018

    Article  Google Scholar 

  18. Domingos P (2012) A few useful things to know about machine learning. Commun ACM 55(10):78–87. doi:10.1145/2347736.2347755

    Article  Google Scholar 

  19. Fischetti M, Monaci M (2012a) Backdoor branching. INFORMS J Comput 25(4):693–700. doi:10.1287/ijoc.1120.0531

    Article  Google Scholar 

  20. Fischetti M, Monaci M (2012b) Branching on nonchimerical fractionalities. Oper Res Lett 40(3):159–164

    Article  Google Scholar 

  21. Fischetti M, Monaci M (2014) Exploiting erraticism in search. Oper Res 62(1):114–122. doi:10.1287/opre.2013.1231

    Article  Google Scholar 

  22. Fischetti M, Lodi A, Monaci M, Salvagnin D, Tramontani A (2016) Improving branch-and-cut performance by random sampling. Math Program Comput 8(1):113–132. doi:10.1007/s12532-015-0096-0

    Article  Google Scholar 

  23. Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63(1):3–42. doi:10.1007/s10994-006-6226-1

    Article  Google Scholar 

  24. Gilpin A, Sandholm T (2011) Information-theoretic approaches to branching in search. Discret Optim 8(2):147–159. doi:10.1016/j.disopt.2010.07.001

    Article  Google Scholar 

  25. Glankwamdee W, Linderoth J (2011) Lookahead branching for mixed integer programming. In: Twelfth INFORMS computing society meeting, INFORMS, pp 130–150

  26. Gomory R (1960) An algorithm for the mixed integer problem. Tech. Rep. RM-2597, The Rand Corporation

  27. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press.

  28. Gurobi (2017)

  29. Hamerly G, Elkan C (2003) Learning the k in k-means. In: NIPS, vol. 3, MIT Press, pp 281–288

  30. Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference and prediction, 2nd edn. Springer series in statistics, Springer, New York. doi:10.1007/978-0-387-84858-7

  31. He H, Daume III H, Eisner JM (2014) Learning to search in branch and bound algorithms. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems, vol. 27. Curran Associates, Inc., pp 3293–3301

  32. Hutter F, Xu L, Hoos HH, Leyton-Brown K (2014) Algorithm runtime prediction: methods & evaluation. Artif Intell 206:79–111. doi:10.1016/j.artint.2013.10.003

    Article  Google Scholar 

  33. Joachims T (2006) Training linear SVMs in linear time. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 217–226

  34. JuliaComputing (2017)

  35. Kadioglu S, Malitsky Y, Sellmann M (2012) Non-model-based search guidance for set partitioning problems. In: AAAI

  36. Karzan FK, Nemhauser GL, Savelsbergh MWP (2009) Information-based branching schemes for binary linear mixed integer problems. Math Program Comput 1(4):249–293. doi:10.1007/s12532-009-0009-1

    Article  Google Scholar 

  37. Khalil E (2016) Machine learning for integer programming. In: Proceedings of the doctoral consortium at the twenty-fifth international joint conference on artificial intelligence (IJCAI)

  38. Khalil E, Le Bodic P, Song L, Nemhauser G, Dilkina B (2016) Learning to branch in mixed integer programming. In: Proceedings of the 30th AAAI conference on artificial intelligence

  39. Knuth DE (1975) Estimating the efficiency of backtrack programs. Math Comput 29(129):122–136

    Article  Google Scholar 

  40. Koch T, Achterberg T, Andersen E, Bastert O, Berthold T, Bixby R, Danna E, Gamrath G, Gleixner A, Heinz S, Lodi A, Mittelmann H, Ralphs T, Salvagnin D, Steffy D, Wolter K (2011) MIPLIB 2010. Math Program Comput 3:103–163

  41. Kocsis L, Szepesvári C (2006) Bandit based Monte-Carlo planning. Springer, Berlin, pp 282–293. doi:10.1007/11871842_29

  42. Land A, Doig A (1960) An automatic method of solving discrete programming problems. Econometrica 28:497–520

    Article  Google Scholar 

  43. Le Bodic P, Nemhauser G (2017) An abstract model for branching and its application to mixed integer programming. Math Program. doi:10.1007/s10107-016-1101-8

  44. Linderoth JT, Lodi A (2011) MILP software. In: Cochran J (ed) Wiley encyclopedia of operations research and management science, vol 5. Wiley, pp 3239–3248

  45. Linderoth JT, Savelsbergh MWP (1999) A computational study of search strategies for mixed integer programming. INFORMS J Comput 11(2):173–187. doi:10.1287/ijoc.11.2.173

    Article  Google Scholar 

  46. Lodi A (2009) Mixed integer programming computation. In: Jünger M, Liebling T, Naddef D, Nemhauser G, Pulleyblank W, Reinelt G, Rinaldi G, Wolsey L (eds) 50 Years of Integer Programming 1958–2008. Springer, Berlin Heidelberg, pp 619–645

    Google Scholar 

  47. Lodi A (2013) The heuristic (dark) side of MIP solvers. In: Talbi EG (ed) Hybrid metaheuristics, vol 434. Studies in computational intelligence. Springer, Berlin, pp 273–284

    Chapter  Google Scholar 

  48. Lodi A, Tramontani A (2013) Performance variability in mixed-integer programming. INFORMS, chap 1, pp 1–12. doi:10.1287/educ.2013.0112

  49. MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol. 1, University of California Press, pp 281–297

  50. Marcos Alvarez A (2016) Computational and theoretical synergies between linear optimization and supervised machine learning. PhD thesis, Université de Liège, Liège, Belgique

  51. Marcos Alvarez A, Louveaux Q, Wehenkel L (2014) A supervised machine learning approach to variable branching in branch-and-bound. Tech. rep., Université de Liège.

  52. Marcos Alvarez A, Wehenkel L, Louveaux Q (2015) Machine learning to balance the load in parallel branch-and-bound. Tech. rep., Université de Liège.

  53. Marcos Alvarez A, Wehenkel L, Louveaux Q (2016) Online learning for strong branching approximation in branch-and-bound. Tech. rep., Université de Liège.

  54. Marcos Alvarez A, Louveaux Q, Wehenkel L (2017) A machine learning-based approximation of strong branching. INFORMS J Comput 29(1):185–195. doi:10.1287/ijoc.2016.0723

    Article  Google Scholar 

  55. Nocedal J, Wright S (2006) Numerical optimization, 2nd edn. Springer, New York

    Google Scholar 

  56. Padberg M, Rinaldi G (1991) A branch and cut algorithm for the resolution of large-scale symmetric traveling salesmen problems. SIAM Rev 33(1):60–100. doi:10.1137/1033004

    Article  Google Scholar 

  57. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

    Google Scholar 

  58. Robbins H (1952) Some aspects of the sequential design of experiments. Bull Am Math Soc 58(5):527–535

    Article  Google Scholar 

  59. Sabharwal A, Samulowitz H, Reddy C (2012) Guiding combinatorial optimization with UCT. In: Beldiceanu N, Jussien N, Pinson É (eds) Integration of AI and OR techniques in constraint programming for combinatorial optimization problems: 9th international conference, CPAIOR 2012, Nantes, France, May 28–June 1, 2012. Proceedings. Lecture notes in computer science. Springer, Berlin, pp 356–361. doi:10.1007/978-3-642-29828-8_23

  60. Sammut C (2010) Behavioral cloning. Springer, Boston, pp 93–97. doi:10.1007/978-0-387-30164-8_69

  61. SCIP (2017)

  62. Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423 and 623–656

    Article  Google Scholar 

  63. Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge

  64. Syed U, Schapire RE (2010) A reduction from apprenticeship learning to classification. In: Lafferty JD, Williams CKI, Shawe-Taylor J, Zemel RS, Culotta A (eds) Advances in neural information processing systems, vol. 23. Curran Associates, Inc., pp 2253–2261

  65. Szepesvári C (2010) Algorithms for reinforcement learning, vol 4. Morgan & Claypool Publishers, San Rafael

    Google Scholar 

Download references


We like to thank Yoshua Bengio for his support in our learning curve. Additional thanks go to Laurent Charlin, Mathieu Tanneau, Claudio Sole and François Laviolette for interesting discussions on the topic. A final round of discussion happened at the Bellairs Workshop on “Data, Learning and Optimization”, so we are indebted to all participants for exchanging on the topic and especially to Bruce Shepherd to have made the workshop happening.

Author information



Corresponding author

Correspondence to Andrea Lodi.

Additional information

This invited paper is discussed in the comments available at doi:10.1007/s11750-017-0449-0, doi:10.1007/s11750-017-0453-4, and doi:10.1007/s11750-017-0454-3.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Lodi, A., Zarpellon, G. On learning and branching: a survey. TOP 25, 207–236 (2017).

Download citation


  • Branch and bound
  • Machine learning

Mathematics Subject Classification

  • 90-02
  • 68R-02
  • 68T-02