Abstract
Multi-objective bi-level optimization (MOBLO) addresses nested multi-objective optimization problems that arise in a wide range of applications. However, the combination of multiple objectives with a hierarchical bi-level structure makes MOBLO notably difficult to solve. Gradient-based MOBLO algorithms have recently grown in popularity because they effectively solve crucial machine learning problems such as meta-learning, neural architecture search, and reinforcement learning. Unfortunately, these algorithms rely on solving a sequence of approximation subproblems to high accuracy, which incurs unfavorable time and memory complexity and lowers their numerical efficiency. To address this issue, we propose a gradient-based algorithm for MOBLO, called gMOBA, which has fewer hyperparameters to tune and is therefore both simple and efficient. We also establish its theoretical validity by showing that it achieves the desirable Pareto stationarity. Numerical experiments confirm the practical efficiency of the proposed method and verify the theoretical results. To accelerate the convergence of gMOBA, we introduce an L2O (learning to optimize) neural network, called L2O-gMOBA, implemented as the initialization phase of the gMOBA algorithm. Comparative numerical results illustrate the performance of L2O-gMOBA.
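To make the setting concrete, below is a minimal PyTorch sketch of the class of methods the abstract describes, not the authors' gMOBA itself: each iteration takes one gradient step on the lower-level variable, differentiates through that step to obtain approximate hypergradients of each upper-level objective, and aggregates them with an MGDA-style min-norm combination into a common descent direction. The toy objectives F1 and F2, the lower-level loss g, and the step sizes are all illustrative assumptions.

```python
# A minimal sketch of gradient-based MOBLO (not the authors' gMOBA implementation).
# Upper level: minimize (F1(x, y*(x)), F2(x, y*(x))); lower level: y*(x) = argmin_y g(x, y).
import torch

def g(x, y):   # lower-level objective (strongly convex in y for this toy example)
    return 0.5 * ((y - x) ** 2).sum()

def F1(x, y):  # first upper-level objective
    return ((x - 1.0) ** 2).sum() + (y ** 2).sum()

def F2(x, y):  # second upper-level objective
    return ((x + 1.0) ** 2).sum() + ((y - 1.0) ** 2).sum()

x = torch.zeros(2, requires_grad=True)
y = torch.zeros(2)
alpha, beta = 0.5, 0.1  # inner (lower-level) and outer (upper-level) step sizes

for _ in range(200):
    # One unrolled inner gradient step; create_graph=True lets hypergradients
    # of the upper-level objectives flow through y_plus back to x.
    y_req = y.detach().requires_grad_(True)
    grad_y = torch.autograd.grad(g(x, y_req), y_req, create_graph=True)[0]
    y_plus = y_req - alpha * grad_y

    # Approximate hypergradient of each upper-level objective w.r.t. x.
    g1 = torch.autograd.grad(F1(x, y_plus), x, retain_graph=True)[0]
    g2 = torch.autograd.grad(F2(x, y_plus), x)[0]

    # MGDA for two objectives: the min-norm convex combination
    # min_{lam in [0,1]} ||lam*g1 + (1-lam)*g2||^2 has a closed form.
    diff = g2 - g1
    lam = torch.clamp((g2 @ diff) / (diff @ diff + 1e-12), 0.0, 1.0)
    d = lam * g1 + (1.0 - lam) * g2  # common descent direction; d = 0 at Pareto stationarity

    with torch.no_grad():
        x -= beta * d
    y = y_plus.detach()

print("x:", x.detach(), "y:", y)
```

In this two-objective case the min-norm weight has a closed form; with more objectives the aggregation step becomes a small quadratic program over the simplex, which is the computational pattern that single-loop methods of this kind aim to keep cheap.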
Acknowledgements
Yang’s work was supported by the Major Program of National Natural Science Foundation of China (Grant Nos. 11991020 and 11991024). Yao’s work was supported by National Natural Science Foundation of China (Grant No. 12371305). Zhang’s work was supported by National Natural Science Foundation of China (Grant No. 12222106), Guangdong Basic and Applied Basic Research Foundation (Grant No. 2022B1515020082) and Shenzhen Science and Technology Program (Grant No. RCYX20200714114700072).
Cite this article
Yang, X., Yao, W., Yin, H. et al. Gradient-based algorithms for multi-objective bi-level optimization. Sci. China Math. 67, 1419–1438 (2024). https://doi.org/10.1007/s11425-023-2302-9