
Learning to optimize: A tutorial for continuous and mixed-integer optimization

  • Articles
  • AI Methods for Optimization Problems
  • Published in: Science China Mathematics

Abstract

Learning to optimize (L2O) stands at the intersection of traditional optimization and machine learning, using the capabilities of machine learning to enhance conventional optimization techniques. Because real-world optimization problems frequently share common structures, L2O provides a tool to exploit those structures for better or faster solutions. This tutorial dives into L2O techniques, showing how to accelerate optimization algorithms, quickly estimate solutions, or even reshape the optimization problem itself to better fit real-world applications. By considering the prerequisites for successful application of L2O and the structure of the optimization problems at hand, this tutorial provides a comprehensive guide for practitioners and researchers alike.
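As a toy illustration of the unrolling idea behind L2O (not code from the article itself), the sketch below tunes per-iteration step sizes of gradient descent across a family of related least-squares instances, assuming NumPy. A simple hill-climb over the step sizes stands in for the gradient-based meta-training used in practice; all problem sizes and iteration counts are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_problem():
    # A random least-squares instance f(x) = 0.5 * ||Ax - b||^2;
    # the instances share a common structure (same dimensions/distribution)
    A = rng.standard_normal((20, 10))
    b = rng.standard_normal(20)
    return A, b

problems = [make_problem() for _ in range(8)]

def unrolled_gd(steps, A, b):
    # Run K gradient steps with learnable per-iteration step sizes,
    # then report the final objective value
    x = np.zeros(A.shape[1])
    for t in steps:
        x = x - t * A.T @ (A @ x - b)
    return 0.5 * np.linalg.norm(A @ x - b) ** 2

def avg_loss(steps):
    # Meta-objective: average final loss over the problem family
    return float(np.mean([unrolled_gd(steps, A, b) for A, b in problems]))

K = 5
L = max(np.linalg.norm(A, 2) ** 2 for A, _ in problems)  # worst-case smoothness
baseline = np.full(K, 1.0 / L)  # classical "safe" fixed step size

# "Learn" the step sizes by hill-climbing, accepting only improvements,
# so the learned schedule is never worse than the baseline it starts from
steps = baseline.copy()
best = avg_loss(steps)
for _ in range(200):
    cand = steps + 0.1 * rng.standard_normal(K) / L
    loss = avg_loss(cand)
    if loss < best:
        steps, best = cand, loss

print(f"baseline avg loss {avg_loss(baseline):.4f} -> learned {best:.4f}")
```

The learned schedule typically takes larger, instance-family-adapted steps than the conservative 1/L rule, which is exactly the kind of structure exploitation the abstract refers to.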



    Google Scholar 

  44. Chen Z, Liu J, Wang X, et al. On representing mixed-integer linear programs by graph neural networks. In: Proceedings of the 11th International Conference on Learning Representations. New Orleans: OpenReview.net, 2023

    Google Scholar 

  45. Chmiela A, Khalil E, Gleixner A, et al. Learning to schedule heuristics in branch and bound. In: Proceedings of the 35th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2021, 34: 24235–24246

  46. Cohen R, Elad M, Milanfar P. Regularization by Denoising via Fixed-Point Projection (RED-PRO). SIAM J Imaging Sci, 2021, 14: 1374–1406

    Article  MathSciNet  Google Scholar 

  47. Condat L. A primalCdual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J Optim Theo Appl, 2013, 158: 460–479

    Article  Google Scholar 

  48. Corbineau M C, Bertocchi C, Chouzenoux E, et al. Learned image deblurring by unfolding a proximal interior point algorithm. In: Proceedings of the 2019 IEEE International Conference on Image Processing. San Francisco: IEEE, 2019, 4664–4668

    Google Scholar 

  49. Dabov K, Foi A, Katkovnik V, et al. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans Image Process, 2007, 16: 2080–2095

    Article  MathSciNet  Google Scholar 

  50. Davis D, Yin W. A three-operator splitting scheme and its optimization applications. Set-Valued Var Anal, 2017, 25: 829–858

    Article  MathSciNet  Google Scholar 

  51. Deza A, Khalil E B. Machine learning for cutting planes in integer programming: A survey. arXiv:2302.09166, 2023

  52. Ding J Y, Zhang C, Shen L, et al. Accelerating primal solution findings for mixed integer programs based on solution prediction. In Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020, 1452–1459

    Google Scholar 

  53. Donti P, Amos B, Kolter J Z. Task-based end-to-end model learning in stochastic optimization. In: Proceedings of the 31st Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2017, 30

  54. Elmachtoub A N, Grigas P. Smart “predict, then optimize”. Manag Sci, 2022, 68: 9–26

    Article  Google Scholar 

  55. Etheve M, Als Z, Bissuel C, et al. Reinforcement learning for variable selection in a branch and bound algorithm. In: Proceedings of the 17th International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research. Cham: Springer, 2020, 176–185

    Google Scholar 

  56. Falkner J K, Thyssens D, Schmidt-Thieme L. Large neighborhood search based on neural construction heuristics. arXiv:2205.00772, 2022

  57. Fan J, Li R. Variable selection via nonconcave penalized likelihood and its oracle properties. J Amer Statist Assoc, 2001, 96: 1348–1360

    Article  MathSciNet  Google Scholar 

  58. Fischetti M, Lodi A. Local branching. Math Program, 2003, 98: 23–47

    Article  MathSciNet  Google Scholar 

  59. Fung S W, Heaton H, Li Q, et al. JFB: Jacobian-free backpropagation for implicit networks. In Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022, 6648–6656

    Google Scholar 

  60. Gasse M, Chtelat D, Ferroni N, et al. Exact combinatorial optimization with graph convolutional neural networks. In: Proceedings of the 33rd Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2019, 32

  61. Gehring J, Auli M, Grangier D, et al. A convolutional encoder model for neural machine translation. arXiv:1611.02344, 2016

  62. Geng Z, Zhang X Y, Bai S, et al. On training implicit models. In: Proceedings of the 35th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2021, 34: 24247–2460

  63. Giryes R, Eldar Y C, Bronstein A M, Sapiro G. Tradeoffs between convergence speed and reconstruction accuracy in inverse problems. IEEE Trans Signal Process, 2018, 66: 1676–1690

    Article  MathSciNet  Google Scholar 

  64. Gogna A, Tayal A. Metaheuristics: Review and application. J Exp Theoret Artificial Intell, 2013, 25: 503–526

    Article  Google Scholar 

  65. Gomory R E. An Algorithm for Integer Solutions to Lmear Programs. Princeton-IBM Mathematics Research Project Technical Report 1. Princeton: Princeton University, 1958

    Google Scholar 

  66. Gomory R E. Solving linear programming problems in integers. Combin Anal, 1960, 10: 211–215

    Article  MathSciNet  Google Scholar 

  67. Goodfellow I, Bengio Y, Courville A. Deep Learning. Cambridge: MIT Press, 2016

    Google Scholar 

  68. Gregor K, LeCun Y. Learning fast approximations of sparse coding. In: Proceedings of the 27th International Conference on Machine Learning. Ann Arbor: PMLR, 2010, 399–406

    Google Scholar 

  69. Griewank A, Walther A. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Philadelphia: SIAM, 2008

    Book  Google Scholar 

  70. Gupta H, Jin K H, Nguyen H Q, et al. CNN-based projected gradient descent for consistent CT image reconstruction. IEEE Trans Medical Imag, 2018, 37: 1440–1453

    Article  Google Scholar 

  71. Gupta P, Gasse M, Khalil E, et al. Hybrid models for learning to branch. In: Proceedings of the 34th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2020, 33: 18087–18097

  72. Gupta P, Khalil E B, Chetlat D, et al. Lookback for learning to branch. arXiv:2206.14987, 2022

  73. Han S, Fu R, Wang S, et al. Online adaptive dictionary learning and weighted sparse coding for abnormality detection. In: Proceedings of the 2013 IEEE International Conference on Image Processing. San Francisco: IEEE, 2013, 151–155

    Chapter  Google Scholar 

  74. Hauptmann A, Lucka F, Betcke M, et al. Model-based learning for accelerated, limited-view 3-D photoacoustic tomography. IEEE Trans Medical Imag, 2018, 37: 1382–1393

    Article  Google Scholar 

  75. He H, Daume III H, Eisner JM. Learning to search in branch and bound algorithms. In: Proceedings of the 28th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2014, 27

  76. He H, Wen C K, Jin S, et al. Model-driven deep learning for MIMO detection. IEEE Trans Signal Process, 2020, 68: 1702–1715

    Article  MathSciNet  Google Scholar 

  77. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2016, 770–778

    Google Scholar 

  78. Heaton H, Chen X, Wang Z, et al. Safeguarded learned convex optimization. In Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2023, 7848–7855

    Google Scholar 

  79. Heaton H, Fung S W, Lin A T, et al. Wasserstein-based projections with applications to inverse problems. SIAM J Math Data Sci, 2022, 4: 581–603

    Article  MathSciNet  Google Scholar 

  80. Hendel G. Adaptive large neighborhood search for mixed integer programming. Math Program Comput, 2022, 14: 185–221

    Article  MathSciNet  Google Scholar 

  81. Himmich I, El Hachemi N, El Hallaoui I, et al. MPILS: An automatic tuner for MILP solvers. Comput Oper Res, 2023, 159: 106344

    Article  MathSciNet  Google Scholar 

  82. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw, 1989, 2: 359–366

    Article  Google Scholar 

  83. Hosny A, Reda S. Automatic MILP solver configuration by learning problem similarities. Ann Oper Res, 2024, in press

  84. Hottung A, Tierney K. Neural large neighborhood search for the capacitated vehicle routing problem. arXiv:1911.09539, 2019

  85. Huang L, Chen X, Huo W, et al. Improving primal heuristics for mixed integer programming problems based on problem reduction: A learning-based approach. In: Proceedings of the 17th International Conference on Control, Automation, Robotics and Vision. San Francisco: IEEE, 2022, 181–186

    Google Scholar 

  86. Huang T, Ferber A M, Tian Y, et al. Searching large neighborhoods for integer linear programs with contrastive learning. In: Proceedings of the 40th International Conference on Machine Learning. Ann Arbor: PMLR, 2023, 13869–13890

    Google Scholar 

  87. Huang T, Li J, Koenig S, et al. Anytime multi-agent path finding via machine learning-guided large neighborhood search. In Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022, 9368–9376

    Google Scholar 

  88. Huang Z, Wang K, Liu F, et al. Learning to select cuts for efficient mixed-integer programming. Pattern Recog, 2022, 123: 108353

    Article  Google Scholar 

  89. Hutter F, Hoos H H, Leyton-Brown K. Sequential model-based optimization for general algorithm configuration. In: Proceedings of the 5th International Conference on Learning and Intelligent Optimization. Berlin-Heidelberg: Springer, 2011, 507–523

    Chapter  Google Scholar 

  90. Hutter F, Hoos H H, Leyton-Brown K, et al. ParamILS: An automatic algorithm configuration framework. J Artificial Intell Res, 2009, 36: 267–306

    Article  Google Scholar 

  91. Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning 2015 Jun 1 (pp. 448–456). PMLR. In: Proceedings of the 32nd International Conference on Machine Learning. Ann Arbor: PMLR, 2015, 448–456

    Google Scholar 

  92. Jegelka S. Theory of graph neural networks: Representation and learning. In: Proceedings of the 2022 International Congress of Mathematicians.

  93. Jia H, Shen S. Benders cut classification via support vector machines for solving two-stage stochastic programs. INFORMS J Optim, 2021, 3: 278–297

    Article  MathSciNet  Google Scholar 

  94. Joukovsky B, Mukherjee T, Van Luong H, et al. Generalization error bounds for deep unfolding RNNs. In: Uncertainty in Artificial Intelligence PMLR. Ann Arbor: PMLR, 2021, 1515–1524

    Google Scholar 

  95. Kadioglu S, Malitsky Y, Sellmann M, et al. ISACCinstance-specific algorithm configuration. In: Proceedings of the 19th European Conference on Artificial Intelligence. Amsterdam: IOS Press, 2010, 751–756

    Google Scholar 

  96. Kang E, Chang W, Yoo J, et al. Deep convolutional framelet denosing for low-dose CT via wavelet residual network. IEEE Trans Medical Imag, 2018, 37: 1358–1369

    Article  Google Scholar 

  97. Kao Y H, Roy B, Yan X. Directed regression. In: Proceedings of the 23rd Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2009, 22

  98. Khalil E B, Dai H, Zhang Y, et al. Learning combinatorial optimization algorithms over graphs. In: Proceedings of the 31st Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2017, 30

  99. Khalil E B, Le Bodic P, Song L, et al. Learning to branch in mixed integer programming. In Proceedings of the 30th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2016

    Google Scholar 

  100. Khalil E B, Morris C, Lodi A. MIP-GNN: A data-driven framework for guiding combinatorial solvers. In: Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022, 10219–10227

    Google Scholar 

  101. Kingma D P, Ba J. Adam: A method for stochastic optimization. arXiv:1412.6980, 2014

  102. Kouni V, Panagakis Y. DECONET: An unfolding network for analysis-based compressed sensing with generalization error bounds. IEEE Trans Signal Process, 2023, 71: 1938–1951

    Article  MathSciNet  Google Scholar 

  103. Labassi A G, Chtelat D, Lodi A. Learning to compare nodes in branch and bound with graph neural networks. In: Proceedings of the 36th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2022, 35: 32000–32010

  104. LeCun Y, Cortes C, Burges JC C. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/

  105. Li Y, Bar-Shira O, Monga V, et al. Deep algorithm unrolling for biomedical imaging. arXiv:2108.06637, 2021

  106. Lin J, Zhu J, Wang H, et al. Learning to branch with Tree-aware Branching Transformers. Knowledge-Based Syst, 2022, 252: 109455

    Article  Google Scholar 

  107. Liu D, Fischetti M, Lodi A. Learning to search in local branching. In Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022, 3796–3803

    Google Scholar 

  108. Liu J, Chen X, Wang Z, et al. ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA. In: Proceedings of the 6th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

    Google Scholar 

  109. Liu J, Chen X, Wang Z, et al. Towards constituting mathematical structures for learning to optimize. arXiv:2305.18577, 2023

  110. Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2015, 3431–3440

    Google Scholar 

  111. Ma Y, Li J, Cao Z, et al. Efficient neural neighborhood search for pickup and delivery problems. arXiv:2204.11399, 2022

  112. Malitsky Y. Instance-Specific Algorithm Configuration. New York: Springer, 2014

    Book  Google Scholar 

  113. Mandi J, Guns T. Interior point solving for LP-based prediction+optimisation. In: Proceedings of the 34th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2020, 33: 7272–7282

  114. Mao X, Shen C, Yang Y B. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In: Proceedings of the 30th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2016, 29

  115. Marcos Alvarez A, Louveaux Q, Wehenkel L. A supervised machine learning approach to variable branching in branch-and-bound. Technical Report. Liege: Université de Liè?ge, 2014

    Google Scholar 

  116. Mardani M, Sun Q, Donoho D, et al. Neural proximal gradient descent for compressive imaging. In: Proceedings of the 32nd Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2018, 31

  117. McKenzie D, Fung S W, Heaton H. Faster predict-and-optimize with Davis-Yin splitting. arXiv:2301.13395, 2023

  118. Meinhardt T, Moller M, Hazirbas C, et al. Learning proximal operators: Using denoising networks for regularizing inverse imaging problems. In Proceedings of the IEEE International Conference on Computer Vision. San Francisco: IEEE, 2017, 1781–1790

    Google Scholar 

  119. Miyato T, Kataoka T, Koyama M, et al. Spectral normalization for generative adversarial networks. In: Proceedings of the 6th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

    Google Scholar 

  120. Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning. Nature, 2015, 518: 529–533

    Article  Google Scholar 

  121. Monga V, Li Y, Eldar YC. Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing. IEEE Signal Process Mag, 2021, 38: 18–44

    Article  Google Scholar 

  122. Moreau T, Bruna J. Understanding neural sparse coding with matrix factorization. In: Proceedings of the 5th International Conference on Learning Representations. New Orleans: OpenReview.net, 2017

    Google Scholar 

  123. Nair V, Bartunov S, Gimeno F, et al. Solving mixed integer programs using neural networks. arXiv:2012.13349, 2020

  124. Oberman A M, Calder J. Lipschitz regularized deep neural networks converge and generalize. arXiv:1808.09540, 2018

  125. Parsonson C W, Laterre A, Barrett T D. Reinforcement learning for branch-and-bound optimisation using retrospective trajectories. In Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2023, 4061–4069

    Google Scholar 

  126. Pascanu R, Mikolov T, Bengio Y. On the difficulty of training recurrent neural networks. In: Proceedings of the 30th International Conference on Machine Learning. Ann Arbor: PMLR, 2013, 1310–1318

    Google Scholar 

  127. Paulus M, Krause A. Learning to dive in branch and bound. In: Proceedings of the 37th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2024, 36

  128. Paulus M, Zarpellon G, Krause A, et al. Learning to cut by looking ahead: Cutting plane selection via imitation learning. In: Proceedings of the 39th International Conference on Machine Learning. Ann Arbor: PMLR, 2022, 17584–17600

    Google Scholar 

  129. Pramanik A, Aggarwal H K, Jacob M. Deep generalization of structured low-rank algorithms (Deep-SLR). IEEE Trans Medical Imag, 2020, 39: 4186–4197

    Article  Google Scholar 

  130. Prouvost A, Dumouchelle J, Scavuzzo L, et al. Ecole: A gym-like library for machine learning in combinatorial optimization solvers. In: Learning Meets Combinatorial Algorithms at NeurIPS 2020. New Orleans: OpenReview.net, 2020

    Google Scholar 

  131. Qian H, Wegman M N. L2-nonexpansive neural networks. In: Proceedings of the 6th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

    Google Scholar 

  132. Qu Q, Li X, Zhou Y, et al. An improved reinforcement learning algorithm for learning to branch. arXiv:2201.06213, 2022

  133. Rick Chang J H, Li C L, Poczos B, et al. One network to solve them all—solving linear inverse problems using deep projection models. In Proceedings of the IEEE International Conference on Computer Vision. San Francisco: IEEE, 2017, 5888–5897

    Google Scholar 

  134. Rudin L I, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D, 1992, 60: 259–268

    Article  MathSciNet  Google Scholar 

  135. Ryu E, Liu J, Wang S, et al. Plug-and-play methods provably converge with properly trained denoisers. In: Proceedings of the 36th International Conference on Machine Learning. Ann Arbor: PMLR, 2019, 5546–5557

    Google Scholar 

  136. Ryu E, Yin W. Large-scale convex optimization: Algorithms & Analyses via Monotone Operators. Cambridge: Cambridge Univ Press, 2022

    Book  Google Scholar 

  137. Samuel N, Diskin T, Wiesel A. Learning to detect. IEEE Trans Signal Process, 2019, 67: 2554–2564

    Article  MathSciNet  Google Scholar 

  138. Scarlett J, Heckel R, Rodrigues M R, et al. Theoretical perspectives on deep learning methods in inverse problems. IEEE J Sel Area Inform Theo, 2022, 3: 433–453

    Article  Google Scholar 

  139. Scavuzzo L, Chen F, Chtelat D, et al. Learning to branch with tree mdps. In: Proceedings of the 36th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2022, 35: 18514–18526

  140. Schnoor E, Behboodi A, Rauhut H. Generalization error bounds for iterative recovery algorithms unfolded as neural networks. Inform Infer: A J IMA, 2023, 12: 2267–2299

    MathSciNet  Google Scholar 

  141. Shen Y, Sun Y, Eberhard A, et al. Learning primal heuristics for mixed integer programs. In: Proceedings of the 2021 International Joint Conference on Neural Networks. San Francisco: IEEE, 2021, 1–8

    Google Scholar 

  142. Silver D, Huang A, Maddison C J, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 2016, 529: 484–489

    Article  Google Scholar 

  143. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014

  144. Snoek J, Larochelle H, Adams R P. Practical bayesian optimization of machine learning algorithms. In: Proceedings of the 26th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2012, 25

  145. Solomon O, Cohen R, Zhang Y, et al. Deep unfolded robust PCA with application to clutter suppression in ultrasound. IEEE Trans Medical Imag, 2019, 39: 1051–1063

    Article  Google Scholar 

  146. Song J, Yue Y, Dilkina B. A general large neighborhood search framework for solving integer linear programs. In: Proceedings of the 34th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2020, 33: 20012–20023

  147. Song W, Liu Y, Cao Z, et al. Instance-specific algorithm configuration via unsupervised deep graph clustering. Eng Appl Artificial Intell, 2023, 125: 106740

    Article  Google Scholar 

  148. Sonnerat N, Wang P, Ktena I, et al. Learning a large neighborhood search algorithm for mixed integer programs. arXiv:2107.10201, 2021

  149. Sreehari S, Venkatakrishnan S V, Wohlberg B, et al. Plug-and-play priors for bright field electron tomography and sparse interpolation. IEEE Trans Comput Imag, 2016, 2: 408–423

    Article  MathSciNet  Google Scholar 

  150. Sreter H, Giryes R. Learned convolutional sparse coding. In: Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. San Francisco: IEEE, 2018, 219–2195

    Google Scholar 

  151. Sutton R S, McAllester D, Singh S, et al. Policy gradient methods for reinforcement learning with function approximation. In: Proceedings of the 13th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 1999, 12

  152. Takabe S, Wadayama T. Theoretical interpretation of learned step size in deep-unfolded gradient descent. arXiv:2001.05142, 2020

  153. Takabe S, Wadayama T, Eldar Y C. Complex trainable ista for linear and nonlinear inverse problems. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. San Francisco: IEEE, 2020, 5020–5024

    Google Scholar 

  154. Tang Y, Agrawal S, Faenza Y. Reinforcement learning for integer programming: Learning to cut. In: Proceedings of the 37th International Conference on Machine Learning. Ann Arbor: PMLR, 2020, 9367–9376

    Google Scholar 

  155. Teerapittayanon S, McDanel B, Kung H T. Branchynet: Fast inference via early exiting from deep neural networks. In: Proceedings of the 23rd International Conference on Pattern Recognition. San Francisco: IEEE, 2016, 2464–2469

    Google Scholar 

  156. Terris M, Repetti A, Pesquet J C, et al. Enhanced convergent PnP algorithms for image restoration. In: Proceedings of the IEEE International Conference on Image Processing. San Francisco: IEEE, 2021, 1684–1688

  157. Turner M, Koch T, Serrano F, et al. Adaptive cut selection in mixed-integer linear programming. Open J Math Optim, 2023, 4: 1–28

  158. Ulyanov D, Vedaldi A, Lempitsky V. Deep image prior. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2018, 9446–9454

  159. Valentin R, Ferrari C, Scheurer J, et al. Instance-wise algorithm configuration with graph neural networks. arXiv:2202.04910, 2022

  160. Venkatakrishnan S V, Bouman C A, Wohlberg B. Plug-and-play priors for model based reconstruction. In: Proceedings of the IEEE Global Conference on Signal and Information Processing. San Francisco: IEEE, 2013, 945–948

  161. Vu B C. A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv Comput Math, 2013, 38: 667–681

  162. Wadayama T, Takabe S. Deep learning-aided trainable projected gradient decoding for LDPC codes. In: Proceedings of the IEEE International Symposium on Information Theory. San Francisco: IEEE, 2019, 2444–2448

  163. Wang Z, Li X, Wang J, et al. Learning cut selection for mixed-integer linear programming via hierarchical sequence model. In: Proceedings of the 11th International Conference on Learning Representations. New Orleans: OpenReview.net, 2022

  164. Wang Z, Liu D, Chang S, et al. D3: Deep dual-domain based fast restoration of JPEG-compressed images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2016, 2764–2772

  165. Wei K, Aviles-Rivero A, Liang J, et al. Tuning-free plug-and-play proximal algorithm for inverse imaging problems. In: Proceedings of the 37th International Conference on Machine Learning. Ann Arbor: PMLR, 2020, 10158–10169

  166. Weng T W, Zhang H, Chen P Y, et al. Evaluating the robustness of neural networks: An extreme value theory approach. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

  167. Wilder B, Dilkina B, Tambe M. Melding the data-decisions pipeline: Decision-focused learning for combinatorial optimization. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2019, 1658–1665

  168. Wolpert D H, Macready W G. No free lunch theorems for optimization. IEEE Trans Evol Comput, 1997, 1: 67–82

  169. Wolsey L A. Integer Programming. New York: John Wiley & Sons, 2020

  170. Wöllmer M, Kaiser M, Eyben F, et al. LSTM-modeling of continuous emotions in an audiovisual affect recognition framework. Image Vision Comput, 2013, 31: 153–163

  171. Wu K, Guo Y, Li Z, et al. Sparse coding with gated learned ISTA. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans: OpenReview.net, 2019

  172. Wu L, Cui P, Pei J, et al. Graph neural networks: Foundation, frontiers and applications. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery, 2022, 4840–4841

  173. Wu Y, Song W, Cao Z, et al. Learning large neighborhood search policy for integer programming. In: Proceedings of the 35th Conference on Neural Information Processing Systems. Adv Neural Informn Process Syst, 2021, 34: 30075–30087

  174. Xie X, Wu J, Liu G, et al. Differentiable linearized ADMM. In: Proceedings of the 36th International Conference on Machine Learning. Ann Arbor: PMLR, 2019, 6902–6911

  175. Xu L, Hutter F, Hoos H H, et al. Hydra-MIP: Automated algorithm configuration and selection for mixed integer programming. In: Proceedings of the RCRA Workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion at the International Joint Conference on Artificial Intelligence. RCRA, 2011, 16–30

  176. Yang C, Gu Y, Chen B, et al. Learning proximal operator methods for nonconvex sparse recovery with theoretical guarantee. IEEE Trans Signal Process, 2020, 68: 5244–5259

  177. Yang L, Shami A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing, 2020, 415: 295–316

  178. Yilmaz K, Yorke-Smith N. A study of learning search approximation in mixed integer branch and bound: Node selection in SCIP. Artificial Intell, 2021, 2: 150–178

  179. Yuan X, Liu Y, Suo J, et al. Plug-and-play algorithms for large-scale snapshot compressive imaging. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2020, 1447–1457

  180. Zarka J, Thiry L, Angles T, et al. Deep network classification by scattering and homotopy dictionary learning. In: Proceedings of the 8th International Conference on Learning Representations. New Orleans: OpenReview.net, 2020

  181. Zarpellon G, Jo J, Lodi A, et al. Parameterizing branch-and-bound search trees to learn branching policies. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2021, 3931–3939

  182. Zhang J, Ghanem B. ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2018, 1828–1837

  183. Zhang K, Li Y, Zuo W, et al. Plug-and-play image restoration with deep denoiser prior. IEEE Trans Pattern Anal Mach Intell, 2021, 44: 6360–6376

  184. Zhang K, Zuo W, Chen Y, et al. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans Image Process, 2017, 26: 3142–3155

  185. Zhang K, Zuo W, Gu S, et al. Learning deep CNN denoiser prior for image restoration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2017, 3929–3938

  186. Zhang K, Zuo W, Zhang L. FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Trans Image Process, 2018, 27: 4608–4622

  187. Zhang M, Yin W, Wang M, et al. MindOpt Tuner: Boost the performance of numerical software by automatic parameter tuning. arXiv:2307.08085, 2023

  188. Zhang T, Banitalebi-Dehkordi A, Zhang Y. Deep reinforcement learning for exact combinatorial optimization: Learning to branch. In: Proceedings of the 26th International Conference on Pattern Recognition. San Francisco: IEEE, 2022, 3105–3111

  189. Zhang X, Lu Y, Liu J, et al. Dynamically unfolding recurrent restorer: A moving endpoint control method for image restoration. In: Proceedings of the 6th International Conference on Learning Representations. New Orleans: OpenReview.net, 2018

  190. Zhao B, Li F F. Online detection of unusual events in videos via dynamic sparse coding. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco: IEEE, 2011, 3313–3320

  191. Zou Y, Zhou Y, Chen X, et al. Proximal gradient-based unfolding for massive random access in IoT networks. arXiv:2212.01839, 2022

Author information

Corresponding author

Correspondence to Wotao Yin.

About this article

Cite this article

Chen, X., Liu, J. & Yin, W. Learning to optimize: A tutorial for continuous and mixed-integer optimization. Sci. China Math. (2024). https://doi.org/10.1007/s11425-023-2293-3
