
Learning to optimize: A tutorial for continuous and mixed-integer optimization

  • Articles
  • AI Methods for Optimization Problems
  • Published in: Science China Mathematics

Abstract

Learning to optimize (L2O) stands at the intersection of traditional optimization and machine learning, using the capabilities of machine learning to enhance conventional optimization techniques. Because real-world optimization problems frequently share common structures, L2O provides a tool to exploit those structures for better or faster solutions. This tutorial dives into L2O techniques, showing how to accelerate optimization algorithms, quickly estimate solutions, or even reshape the optimization problem itself to better fit real-world applications. By considering the prerequisites for successful application of L2O and the structure of the optimization problems at hand, this tutorial provides a comprehensive guide for practitioners and researchers alike.
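As a toy illustration of the unrolling idea behind L2O (not code from the article itself), the sketch below tunes per-iteration step sizes of gradient descent across a family of related least-squares instances, assuming NumPy. A simple hill-climb over the step sizes stands in for the gradient-based meta-training used in practice; all problem sizes and iteration counts are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_problem():
    # A random least-squares instance f(x) = 0.5 * ||Ax - b||^2;
    # the instances share a common structure (same dimensions/distribution)
    A = rng.standard_normal((20, 10))
    b = rng.standard_normal(20)
    return A, b

problems = [make_problem() for _ in range(8)]

def unrolled_gd(steps, A, b):
    # Run K gradient steps with learnable per-iteration step sizes,
    # then report the final objective value
    x = np.zeros(A.shape[1])
    for t in steps:
        x = x - t * A.T @ (A @ x - b)
    return 0.5 * np.linalg.norm(A @ x - b) ** 2

def avg_loss(steps):
    # Meta-objective: average final loss over the problem family
    return float(np.mean([unrolled_gd(steps, A, b) for A, b in problems]))

K = 5
L = max(np.linalg.norm(A, 2) ** 2 for A, _ in problems)  # worst-case smoothness
baseline = np.full(K, 1.0 / L)  # classical "safe" fixed step size

# "Learn" the step sizes by hill-climbing, accepting only improvements,
# so the learned schedule is never worse than the baseline it starts from
steps = baseline.copy()
best = avg_loss(steps)
for _ in range(200):
    cand = steps + 0.1 * rng.standard_normal(K) / L
    loss = avg_loss(cand)
    if loss < best:
        steps, best = cand, loss

print(f"baseline avg loss {avg_loss(baseline):.4f} -> learned {best:.4f}")
```

The learned schedule typically takes larger, instance-family-adapted steps than the conservative 1/L rule, which is exactly the kind of structure exploitation the abstract refers to.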



    Google Scholar 

  44. Chen Z, Liu J, Wang X, et al. On representing mixed-integer linear programs by graph neural networks. In: Proceedings of the 11th International Conference on Learning Representations. New Orleans: OpenReview.net, 2023

    Google Scholar 

  45. Chmiela A, Khalil E, Gleixner A, et al. Learning to schedule heuristics in branch and bound. In: Proceedings of the 35th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2021, 34: 24235–24246

  46. Cohen R, Elad M, Milanfar P. Regularization by Denoising via Fixed-Point Projection (RED-PRO). SIAM J Imaging Sci, 2021, 14: 1374–1406

    Article  MathSciNet  Google Scholar 

  47. Condat L. A primalCdual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J Optim Theo Appl, 2013, 158: 460–479

    Article  Google Scholar 

  48. Corbineau M C, Bertocchi C, Chouzenoux E, et al. Learned image deblurring by unfolding a proximal interior point algorithm. In: Proceedings of the 2019 IEEE International Conference on Image Processing. San Francisco: IEEE, 2019, 4664–4668

    Google Scholar 

  49. Dabov K, Foi A, Katkovnik V, et al. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans Image Process, 2007, 16: 2080–2095

    Article  MathSciNet  Google Scholar 

  50. Davis D, Yin W. A three-operator splitting scheme and its optimization applications. Set-Valued Var Anal, 2017, 25: 829–858

    Article  MathSciNet  Google Scholar 

  51. Deza A, Khalil E B. Machine learning for cutting planes in integer programming: A survey. arXiv:2302.09166, 2023

  52. Ding J Y, Zhang C, Shen L, et al. Accelerating primal solution findings for mixed integer programs based on solution prediction. In Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020, 1452–1459

    Google Scholar 

  53. Donti P, Amos B, Kolter J Z. Task-based end-to-end model learning in stochastic optimization. In: Proceedings of the 31st Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2017, 30

  54. Elmachtoub A N, Grigas P. Smart “predict, then optimize”. Manag Sci, 2022, 68: 9–26

    Article  Google Scholar 

  55. Etheve M, Als Z, Bissuel C, et al. Reinforcement learning for variable selection in a branch and bound algorithm. In: Proceedings of the 17th International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research. Cham: Springer, 2020, 176–185

    Google Scholar 

  56. Falkner J K, Thyssens D, Schmidt-Thieme L. Large neighborhood search based on neural construction heuristics. arXiv:2205.00772, 2022

  57. Fan J, Li R. Variable selection via nonconcave penalized likelihood and its oracle properties. J Amer Statist Assoc, 2001, 96: 1348–1360

    Article  MathSciNet  Google Scholar 

  58. Fischetti M, Lodi A. Local branching. Math Program, 2003, 98: 23–47

    Article  MathSciNet  Google Scholar 

  59. Fung S W, Heaton H, Li Q, et al. JFB: Jacobian-free backpropagation for implicit networks. In Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022, 6648–6656

    Google Scholar 

  60. Gasse M, Chtelat D, Ferroni N, et al. Exact combinatorial optimization with graph convolutional neural networks. In: Proceedings of the 33rd Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2019, 32

  61. Gehring J, Auli M, Grangier D, et al. A convolutional encoder model for neural machine translation. arXiv:1611.02344, 2016

  62. Geng Z, Zhang X Y, Bai S, et al. On training implicit models. In: Proceedings of the 35th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2021, 34: 24247–2460

  63. Giryes R, Eldar Y C, Bronstein A M, Sapiro G. Tradeoffs between convergence speed and reconstruction accuracy in inverse problems. IEEE Trans Signal Process, 2018, 66: 1676–1690

    Article  MathSciNet  Google Scholar 

  64. Gogna A, Tayal A. Metaheuristics: Review and application. J Exp Theoret Artificial Intell, 2013, 25: 503–526

    Article  Google Scholar 

  65. Gomory R E. An Algorithm for Integer Solutions to Lmear Programs. Princeton-IBM Mathematics Research Project Technical Report 1. Princeton: Princeton University, 1958

    Google Scholar 

  66. Gomory R E. Solving linear programming problems in integers. Combin Anal, 1960, 10: 211–215

    Article  MathSciNet  Google Scholar 

  67. Goodfellow I, Bengio Y, Courville A. Deep Learning. Cambridge: MIT Press, 2016

    Google Scholar 

  68. Gregor K, LeCun Y. Learning fast approximations of sparse coding. In: Proceedings of the 27th International Conference on Machine Learning. Ann Arbor: PMLR, 2010, 399–406

    Google Scholar 

  69. Griewank A, Walther A. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Philadelphia: SIAM, 2008

    Book  Google Scholar 

  70. Gupta H, Jin K H, Nguyen H Q, et al. CNN-based projected gradient descent for consistent CT image reconstruction. IEEE Trans Medical Imag, 2018, 37: 1440–1453

    Article  Google Scholar 

  71. Gupta P, Gasse M, Khalil E, et al. Hybrid models for learning to branch. In: Proceedings of the 34th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2020, 33: 18087–18097

  72. Gupta P, Khalil E B, Chetlat D, et al. Lookback for learning to branch. arXiv:2206.14987, 2022

  73. Han S, Fu R, Wang S, et al. Online adaptive dictionary learning and weighted sparse coding for abnormality detection. In: Proceedings of the 2013 IEEE International Conference on Image Processing. San Francisco: IEEE, 2013, 151–155

    Chapter  Google Scholar 

  74. Hauptmann A, Lucka F, Betcke M, et al. Model-based learning for accelerated, limited-view 3-D photoacoustic tomography. IEEE Trans Medical Imag, 2018, 37: 1382–1393

    Article  Google Scholar 

  75. He H, Daume III H, Eisner JM. Learning to search in branch and bound algorithms. In: Proceedings of the 28th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2014, 27

  76. He H, Wen C K, Jin S, et al. Model-driven deep learning for MIMO detection. IEEE Trans Signal Process, 2020, 68: 1702–1715

    Article  MathSciNet  Google Scholar 

  77. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2016, 770–778

    Google Scholar 

  78. Heaton H, Chen X, Wang Z, et al. Safeguarded learned convex optimization. In Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2023, 7848–7855

    Google Scholar 

  79. Heaton H, Fung S W, Lin A T, et al. Wasserstein-based projections with applications to inverse problems. SIAM J Math Data Sci, 2022, 4: 581–603

    Article  MathSciNet  Google Scholar 

  80. Hendel G. Adaptive large neighborhood search for mixed integer programming. Math Program Comput, 2022, 14: 185–221

    Article  MathSciNet  Google Scholar 

  81. Himmich I, El Hachemi N, El Hallaoui I, et al. MPILS: An automatic tuner for MILP solvers. Comput Oper Res, 2023, 159: 106344

    Article  MathSciNet  Google Scholar 

  82. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw, 1989, 2: 359–366

    Article  Google Scholar 

  83. Hosny A, Reda S. Automatic MILP solver configuration by learning problem similarities. Ann Oper Res, 2024, in press

  84. Hottung A, Tierney K. Neural large neighborhood search for the capacitated vehicle routing problem. arXiv:1911.09539, 2019

  85. Huang L, Chen X, Huo W, et al. Improving primal heuristics for mixed integer programming problems based on problem reduction: A learning-based approach. In: Proceedings of the 17th International Conference on Control, Automation, Robotics and Vision. San Francisco: IEEE, 2022, 181–186

    Google Scholar 

  86. Huang T, Ferber A M, Tian Y, et al. Searching large neighborhoods for integer linear programs with contrastive learning. In: Proceedings of the 40th International Conference on Machine Learning. Ann Arbor: PMLR, 2023, 13869–13890

    Google Scholar 

  87. Huang T, Li J, Koenig S, et al. Anytime multi-agent path finding via machine learning-guided large neighborhood search. In Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022, 9368–9376

    Google Scholar 

  88. Huang Z, Wang K, Liu F, et al. Learning to select cuts for efficient mixed-integer programming. Pattern Recog, 2022, 123: 108353

    Article  Google Scholar 

  89. Hutter F, Hoos H H, Leyton-Brown K. Sequential model-based optimization for general algorithm configuration. In: Proceedings of the 5th International Conference on Learning and Intelligent Optimization. Berlin-Heidelberg: Springer, 2011, 507–523

    Chapter  Google Scholar 

  90. Hutter F, Hoos H H, Leyton-Brown K, et al. ParamILS: An automatic algorithm configuration framework. J Artificial Intell Res, 2009, 36: 267–306

    Article  Google Scholar 

  91. Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning 2015 Jun 1 (pp. 448–456). PMLR. In: Proceedings of the 32nd International Conference on Machine Learning. Ann Arbor: PMLR, 2015, 448–456

    Google Scholar 

  92. Jegelka S. Theory of graph neural networks: Representation and learning. In: Proceedings of the 2022 International Congress of Mathematicians.

  93. Jia H, Shen S. Benders cut classification via support vector machines for solving two-stage stochastic programs. INFORMS J Optim, 2021, 3: 278–297

    Article  MathSciNet  Google Scholar 

  94. Joukovsky B, Mukherjee T, Van Luong H, et al. Generalization error bounds for deep unfolding RNNs. In: Uncertainty in Artificial Intelligence PMLR. Ann Arbor: PMLR, 2021, 1515–1524

    Google Scholar 

  95. Kadioglu S, Malitsky Y, Sellmann M, et al. ISACCinstance-specific algorithm configuration. In: Proceedings of the 19th European Conference on Artificial Intelligence. Amsterdam: IOS Press, 2010, 751–756

    Google Scholar 

  96. Kang E, Chang W, Yoo J, et al. Deep convolutional framelet denosing for low-dose CT via wavelet residual network. IEEE Trans Medical Imag, 2018, 37: 1358–1369

    Article  Google Scholar 

  97. Kao Y H, Roy B, Yan X. Directed regression. In: Proceedings of the 23rd Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2009, 22

  98. Khalil E B, Dai H, Zhang Y, et al. Learning combinatorial optimization algorithms over graphs. In: Proceedings of the 31st Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2017, 30

  99. Khalil E B, Le Bodic P, Song L, et al. Learning to branch in mixed integer programming. In Proceedings of the 30th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2016

    Google Scholar 

  100. Khalil E B, Morris C, Lodi A. MIP-GNN: A data-driven framework for guiding combinatorial solvers. In: Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022, 10219–10227

    Google Scholar 

  101. Kingma D P, Ba J. Adam: A method for stochastic optimization. arXiv:1412.6980, 2014

  102. Kouni V, Panagakis Y. DECONET: An unfolding network for analysis-based compressed sensing with generalization error bounds. IEEE Trans Signal Process, 2023, 71: 1938–1951

    Article  MathSciNet  Google Scholar 

  103. Labassi A G, Chtelat D, Lodi A. Learning to compare nodes in branch and bound with graph neural networks. In: Proceedings of the 36th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2022, 35: 32000–32010

  104. LeCun Y, Cortes C, Burges JC C. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/

  105. Li Y, Bar-Shira O, Monga V, et al. Deep algorithm unrolling for biomedical imaging. arXiv:2108.06637, 2021

  106. Lin J, Zhu J, Wang H, et al. Learning to branch with Tree-aware Branching Transformers. Knowledge-Based Syst, 2022, 252: 109455

    Article  Google Scholar 

  107. Liu D, Fischetti M, Lodi A. Learning to search in local branching. In Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022, 3796–3803

    Google Scholar 

  108. Liu J, Chen X, Wang Z, et al. ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA. In: Proceedings of the 6th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

    Google Scholar 

  109. Liu J, Chen X, Wang Z, et al. Towards constituting mathematical structures for learning to optimize. arXiv:2305.18577, 2023

  110. Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2015, 3431–3440

    Google Scholar 

  111. Ma Y, Li J, Cao Z, et al. Efficient neural neighborhood search for pickup and delivery problems. arXiv:2204.11399, 2022

  112. Malitsky Y. Instance-Specific Algorithm Configuration. New York: Springer, 2014

    Book  Google Scholar 

  113. Mandi J, Guns T. Interior point solving for LP-based prediction+optimisation. In: Proceedings of the 34th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2020, 33: 7272–7282

  114. Mao X, Shen C, Yang Y B. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In: Proceedings of the 30th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2016, 29

  115. Marcos Alvarez A, Louveaux Q, Wehenkel L. A supervised machine learning approach to variable branching in branch-and-bound. Technical Report. Liege: Université de Liè?ge, 2014

    Google Scholar 

  116. Mardani M, Sun Q, Donoho D, et al. Neural proximal gradient descent for compressive imaging. In: Proceedings of the 32nd Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2018, 31

  117. McKenzie D, Fung S W, Heaton H. Faster predict-and-optimize with Davis-Yin splitting. arXiv:2301.13395, 2023

  118. Meinhardt T, Moller M, Hazirbas C, et al. Learning proximal operators: Using denoising networks for regularizing inverse imaging problems. In Proceedings of the IEEE International Conference on Computer Vision. San Francisco: IEEE, 2017, 1781–1790

    Google Scholar 

  119. Miyato T, Kataoka T, Koyama M, et al. Spectral normalization for generative adversarial networks. In: Proceedings of the 6th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

    Google Scholar 

  120. Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning. Nature, 2015, 518: 529–533

    Article  Google Scholar 

  121. Monga V, Li Y, Eldar YC. Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing. IEEE Signal Process Mag, 2021, 38: 18–44

    Article  Google Scholar 

  122. Moreau T, Bruna J. Understanding neural sparse coding with matrix factorization. In: Proceedings of the 5th International Conference on Learning Representations. New Orleans: OpenReview.net, 2017

    Google Scholar 

  123. Nair V, Bartunov S, Gimeno F, et al. Solving mixed integer programs using neural networks. arXiv:2012.13349, 2020

  124. Oberman A M, Calder J. Lipschitz regularized deep neural networks converge and generalize. arXiv:1808.09540, 2018

  125. Parsonson C W, Laterre A, Barrett T D. Reinforcement learning for branch-and-bound optimisation using retrospective trajectories. In Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2023, 4061–4069

    Google Scholar 

  126. Pascanu R, Mikolov T, Bengio Y. On the difficulty of training recurrent neural networks. In: Proceedings of the 30th International Conference on Machine Learning. Ann Arbor: PMLR, 2013, 1310–1318

    Google Scholar 

  127. Paulus M, Krause A. Learning to dive in branch and bound. In: Proceedings of the 37th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2024, 36

  128. Paulus M, Zarpellon G, Krause A, et al. Learning to cut by looking ahead: Cutting plane selection via imitation learning. In: Proceedings of the 39th International Conference on Machine Learning. Ann Arbor: PMLR, 2022, 17584–17600

    Google Scholar 

  129. Pramanik A, Aggarwal H K, Jacob M. Deep generalization of structured low-rank algorithms (Deep-SLR). IEEE Trans Medical Imag, 2020, 39: 4186–4197

    Article  Google Scholar 

  130. Prouvost A, Dumouchelle J, Scavuzzo L, et al. Ecole: A gym-like library for machine learning in combinatorial optimization solvers. In: Learning Meets Combinatorial Algorithms at NeurIPS 2020. New Orleans: OpenReview.net, 2020

    Google Scholar 

  131. Qian H, Wegman M N. L2-nonexpansive neural networks. In: Proceedings of the 6th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

    Google Scholar 

  132. Qu Q, Li X, Zhou Y, et al. An improved reinforcement learning algorithm for learning to branch. arXiv:2201.06213, 2022

  133. Rick Chang J H, Li C L, Poczos B, et al. One network to solve them all—solving linear inverse problems using deep projection models. In Proceedings of the IEEE International Conference on Computer Vision. San Francisco: IEEE, 2017, 5888–5897

    Google Scholar 

  134. Rudin L I, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D, 1992, 60: 259–268

    Article  MathSciNet  Google Scholar 

  135. Ryu E, Liu J, Wang S, et al. Plug-and-play methods provably converge with properly trained denoisers. In: Proceedings of the 36th International Conference on Machine Learning. Ann Arbor: PMLR, 2019, 5546–5557

    Google Scholar 

  136. Ryu E, Yin W. Large-scale convex optimization: Algorithms & Analyses via Monotone Operators. Cambridge: Cambridge Univ Press, 2022

    Book  Google Scholar 

  137. Samuel N, Diskin T, Wiesel A. Learning to detect. IEEE Trans Signal Process, 2019, 67: 2554–2564

    Article  MathSciNet  Google Scholar 

  138. Scarlett J, Heckel R, Rodrigues M R, et al. Theoretical perspectives on deep learning methods in inverse problems. IEEE J Sel Area Inform Theo, 2022, 3: 433–453

    Article  Google Scholar 

  139. Scavuzzo L, Chen F, Chtelat D, et al. Learning to branch with tree mdps. In: Proceedings of the 36th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2022, 35: 18514–18526

  140. Schnoor E, Behboodi A, Rauhut H. Generalization error bounds for iterative recovery algorithms unfolded as neural networks. Inform Infer: A J IMA, 2023, 12: 2267–2299

    MathSciNet  Google Scholar 

  141. Shen Y, Sun Y, Eberhard A, et al. Learning primal heuristics for mixed integer programs. In: Proceedings of the 2021 International Joint Conference on Neural Networks. San Francisco: IEEE, 2021, 1–8

    Google Scholar 

  142. Silver D, Huang A, Maddison C J, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 2016, 529: 484–489

    Article  Google Scholar 

  143. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014

  144. Snoek J, Larochelle H, Adams R P. Practical bayesian optimization of machine learning algorithms. In: Proceedings of the 26th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2012, 25

  145. Solomon O, Cohen R, Zhang Y, et al. Deep unfolded robust PCA with application to clutter suppression in ultrasound. IEEE Trans Medical Imag, 2019, 39: 1051–1063

    Article  Google Scholar 

  146. Song J, Yue Y, Dilkina B. A general large neighborhood search framework for solving integer linear programs. In: Proceedings of the 34th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2020, 33: 20012–20023

  147. Song W, Liu Y, Cao Z, et al. Instance-specific algorithm configuration via unsupervised deep graph clustering. Eng Appl Artificial Intell, 2023, 125: 106740

    Article  Google Scholar 

  148. Sonnerat N, Wang P, Ktena I, et al. Learning a large neighborhood search algorithm for mixed integer programs. arXiv:2107.10201, 2021

  149. Sreehari S, Venkatakrishnan S V, Wohlberg B, et al. Plug-and-play priors for bright field electron tomography and sparse interpolation. IEEE Trans Comput Imag, 2016, 2: 408–423

    Article  MathSciNet  Google Scholar 

  150. Sreter H, Giryes R. Learned convolutional sparse coding. In: Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. San Francisco: IEEE, 2018, 219–2195

    Google Scholar 

  151. Sutton R S, McAllester D, Singh S, et al. Policy gradient methods for reinforcement learning with function approximation. In: Proceedings of the 13th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 1999, 12

  152. Takabe S, Wadayama T. Theoretical interpretation of learned step size in deep-unfolded gradient descent. arXiv:2001.05142, 2020

  153. Takabe S, Wadayama T, Eldar Y C. Complex trainable ista for linear and nonlinear inverse problems. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. San Francisco: IEEE, 2020, 5020–5024

    Google Scholar 

  154. Tang Y, Agrawal S, Faenza Y. Reinforcement learning for integer programming: Learning to cut. In: Proceedings of the 37th International Conference on Machine Learning. Ann Arbor: PMLR, 2020, 9367–9376

    Google Scholar 

  155. Teerapittayanon S, McDanel B, Kung H T. Branchynet: Fast inference via early exiting from deep neural networks. In: Proceedings of the 23rd International Conference on Pattern Recognition. San Francisco: IEEE, 2016, 2464–2469

    Google Scholar 

  156. Terris M, Repetti A, Pesquet J C, et al. Enhanced convergent PnP algorithms for image restoration. In: Proceedings of the IEEE International Conference on Image Processing. San Francisco: IEEE, 2021, 1684–1688

  157. Turner M, Koch T, Serrano F, et al. Adaptive cut selection in mixed-integer linear programming. Open J Math Optim, 2023, 4: 1–28

  158. Ulyanov D, Vedaldi A, Lempitsky V. Deep image prior. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2018, 9446–9454

  159. Valentin R, Ferrari C, Scheurer J, et al. Instance-wise algorithm configuration with graph neural networks. arXiv:2202.04910, 2022

  160. Venkatakrishnan S V, Bouman C A, Wohlberg B. Plug-and-play priors for model based reconstruction. In: Proceedings of the IEEE Global Conference on Signal and Information Processing. San Francisco: IEEE, 2013, 945–948

  161. Vu B C. A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv Comput Math, 2013, 38: 667–681

  162. Wadayama T, Takabe S. Deep learning-aided trainable projected gradient decoding for LDPC codes. In: Proceedings of the IEEE International Symposium on Information Theory. San Francisco: IEEE, 2019, 2444–2448

  163. Wang Z, Li X, Wang J, et al. Learning cut selection for mixed-integer linear programming via hierarchical sequence model. In: Proceedings of the 11th International Conference on Learning Representations. New Orleans: OpenReview.net, 2022

  164. Wang Z, Liu D, Chang S, et al. D3: Deep dual-domain based fast restoration of JPEG-compressed images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2016, 2764–2772

  165. Wei K, Aviles-Rivero A, Liang J, et al. Tuning-free plug-and-play proximal algorithm for inverse imaging problems. In: Proceedings of the 37th International Conference on Machine Learning. Ann Arbor: PMLR, 2020, 10158–10169

  166. Weng T W, Zhang H, Chen P Y, et al. Evaluating the robustness of neural networks: An extreme value theory approach. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

  167. Wilder B, Dilkina B, Tambe M. Melding the data-decisions pipeline: Decision-focused learning for combinatorial optimization. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2019, 1658–1665

  168. Wolpert D H, Macready W G. No free lunch theorems for optimization. IEEE Trans Evol Comput, 1997, 1: 67–82

  169. Wolsey L A. Integer Programming. New York: John Wiley & Sons, 2020

  170. Wöllmer M, Kaiser M, Eyben F, et al. LSTM-modeling of continuous emotions in an audiovisual affect recognition framework. Image Vision Comput, 2013, 31: 153–163

  171. Wu K, Guo Y, Li Z, et al. Sparse coding with gated learned ISTA. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans: OpenReview.net, 2019

  172. Wu L, Cui P, Pei J, et al. Graph neural networks: Foundation, frontiers and applications. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery, 2022, 4840–4841

  173. Wu Y, Song W, Cao Z, et al. Learning large neighborhood search policy for integer programming. In: Proceedings of the 35th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2021, 34: 30075–30087

  174. Xie X, Wu J, Liu G, et al. Differentiable linearized ADMM. In: Proceedings of the 36th International Conference on Machine Learning. Ann Arbor: PMLR, 2019, 6902–6911

  175. Xu L, Hutter F, Hoos H H, et al. Hydra-MIP: Automated algorithm configuration and selection for mixed integer programming. In: Proceedings of the RCRA Workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion at the International Joint Conference on Artificial Intelligence. RCRA, 2011, 16–30

  176. Yang C, Gu Y, Chen B, et al. Learning proximal operator methods for nonconvex sparse recovery with theoretical guarantee. IEEE Trans Signal Process, 2020, 68: 5244–5259

  177. Yang L, Shami A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing, 2020, 415: 295–316

  178. Yilmaz K, Yorke-Smith N. A study of learning search approximation in mixed integer branch and bound: Node selection in SCIP. Artificial Intell, 2021, 2: 150–178

  179. Yuan X, Liu Y, Suo J, et al. Plug-and-play algorithms for large-scale snapshot compressive imaging. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2020, 1447–1457

  180. Zarka J, Thiry L, Angles T, et al. Deep network classification by scattering and homotopy dictionary learning. In: Proceedings of the 8th International Conference on Learning Representations. New Orleans: OpenReview.net, 2020

  181. Zarpellon G, Jo J, Lodi A, et al. Parameterizing branch-and-bound search trees to learn branching policies. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2021, 3931–3939

  182. Zhang J, Ghanem B. ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2018, 1828–1837

  183. Zhang K, Li Y, Zuo W, et al. Plug-and-play image restoration with deep denoiser prior. IEEE Trans Pattern Anal Mach Intell, 2021, 44: 6360–6376

  184. Zhang K, Zuo W, Chen Y, et al. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans Image Process, 2017, 26: 3142–3155

  185. Zhang K, Zuo W, Gu S, et al. Learning deep CNN denoiser prior for image restoration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2017, 3929–3938

  186. Zhang K, Zuo W, Zhang L. FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Trans Image Process, 2018, 27: 4608–4622

  187. Zhang M, Yin W, Wang M, et al. MindOpt Tuner: Boost the performance of numerical software by automatic parameter tuning. arXiv:2307.08085, 2023

  188. Zhang T, Banitalebi-Dehkordi A, Zhang Y. Deep reinforcement learning for exact combinatorial optimization: Learning to branch. In: Proceedings of the 26th International Conference on Pattern Recognition. San Francisco: IEEE, 2022, 3105–3111

  189. Zhang X, Lu Y, Liu J, et al. Dynamically unfolding recurrent restorer: A moving endpoint control method for image restoration. In: Proceedings of the 6th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

  190. Zhao B, Li F F. Online detection of unusual events in videos via dynamic sparse coding. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2011, 3313–3320

  191. Zou Y, Zhou Y, Chen X, et al. Proximal gradient-based unfolding for massive random access in IoT networks. arXiv:2212.01839, 2022

Author information

Corresponding author

Correspondence to Wotao Yin.

About this article

Cite this article

Chen, X., Liu, J. & Yin, W. Learning to optimize: A tutorial for continuous and mixed-integer optimization. Sci. China Math. (2024). https://doi.org/10.1007/s11425-023-2293-3
