Gradient-based algorithms for multi-objective bi-level optimization


Abstract

Multi-objective bi-level optimization (MOBLO) addresses nested multi-objective optimization problems that arise in a wide range of applications. However, the combination of multiple objectives with a hierarchical bi-level structure makes these problems notably difficult to solve. Gradient-based MOBLO algorithms have recently grown in popularity because they effectively handle important machine learning problems such as meta-learning, neural architecture search, and reinforcement learning. Unfortunately, existing algorithms rely on solving a sequence of approximation subproblems to high accuracy, which incurs unfavorable time and memory complexity and lowers their numerical efficiency. To address this issue, we propose a gradient-based algorithm for MOBLO, called gMOBA, which has fewer hyperparameters to tune, making it both simple and efficient. We establish its theoretical validity by proving convergence to the desirable Pareto stationarity. Numerical experiments confirm the practical efficiency of the proposed method and verify the theoretical results. To accelerate the convergence of gMOBA, we introduce a learning-to-optimize (L2O) neural network, called L2O-gMOBA, which serves as the initialization phase of the gMOBA algorithm. Comparative numerical results are presented to illustrate the performance of L2O-gMOBA.
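
To make the setting concrete, MOBLO problems are typically posed in the following generic form, where the upper level minimizes several objectives jointly in x while y is constrained to solve a lower-level problem parametrized by x. This is the standard formulation consistent with the abstract, not necessarily the exact model analyzed in the paper:

$$
\min_{x \in \mathbb{R}^n} \; \bigl(F_1(x, y^*(x)), \ldots, F_m(x, y^*(x))\bigr)
\quad \text{s.t.} \quad y^*(x) \in \operatorname*{arg\,min}_{y \in \mathbb{R}^p} f(x, y).
$$

A point is Pareto stationary, the optimality notion targeted by gMOBA, when no direction simultaneously decreases all upper-level objectives, or equivalently when some convex combination of their gradients vanishes:

$$
\min_{\lambda \in \Delta^m} \Bigl\| \sum_{i=1}^{m} \lambda_i \nabla F_i \Bigr\|^2 = 0,
\qquad \Delta^m := \Bigl\{ \lambda \in \mathbb{R}^m_{\geq 0} : \textstyle\sum_{i=1}^m \lambda_i = 1 \Bigr\}.
$$

The min-norm subproblem above is the common multiple-gradient building block of gradient-based multi-objective methods (in the style of MGDA); in the bi-level setting, the per-objective gradients are hypergradients that also account for the dependence of y^*(x) on x. The following minimal Python sketch solves the min-norm subproblem with a Frank-Wolfe loop. The function name and interface are illustrative assumptions, and the sketch is a generic building block rather than the paper's gMOBA algorithm:

```python
import numpy as np

def min_norm_direction(grads, iters=200):
    """Frank-Wolfe solver for the MGDA-style subproblem:
    minimize || sum_i lam[i] * g_i ||^2 over lam in the unit simplex.
    grads: sequence of m per-objective gradient vectors.
    Returns (lam, direction) with direction = -sum_i lam[i] * g_i;
    a near-zero norm signals approximate Pareto stationarity, otherwise
    the direction is a common descent direction for all objectives.
    """
    G = np.asarray(grads, dtype=float)   # (m, d) stacked gradients
    M = G @ G.T                          # Gram matrix of inner products
    m = G.shape[0]
    lam = np.full(m, 1.0 / m)            # start at the simplex barycenter
    for _ in range(iters):
        i = int(np.argmin(M @ lam))      # vertex minimizing the linearization
        e = np.zeros(m)
        e[i] = 1.0
        # exact line search on the segment between lam and the vertex e
        a, b, c = lam @ M @ lam, e @ M @ lam, e @ M @ e
        denom = a - 2.0 * b + c
        if denom <= 1e-12:               # flat or concave: pick the better endpoint
            gamma = 1.0 if c < a else 0.0
        else:
            gamma = float(np.clip((a - b) / denom, 0.0, 1.0))
        lam = (1.0 - gamma) * lam + gamma * e
    return lam, -(lam @ G)

# Illustration with three hypothetical gradients in R^2:
lam, direction = min_norm_direction([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
# np.linalg.norm(direction) close to zero would indicate Pareto stationarity.
```

A gradient-based MOBLO step would then move x along this direction with a suitable step size, after forming each hypergradient, e.g., via implicit differentiation of the lower-level optimality condition. The abstract's motivation is precisely that existing methods must solve such inner approximation subproblems to high accuracy at every iteration, which gMOBA is designed to avoid.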

Acknowledgements

Yang’s work was supported by the Major Program of National Natural Science Foundation of China (Grant Nos. 11991020 and 11991024). Yao’s work was supported by National Natural Science Foundation of China (Grant No. 12371305). Zhang’s work was supported by National Natural Science Foundation of China (Grant No. 12222106), Guangdong Basic and Applied Basic Research Foundation (Grant No. 2022B1515020082) and Shenzhen Science and Technology Program (Grant No. RCYX20200714114700072).

Author information

Corresponding author

Correspondence to Jin Zhang.

About this article

Cite this article

Yang, X., Yao, W., Yin, H. et al. Gradient-based algorithms for multi-objective bi-level optimization. Sci. China Math. 67, 1419–1438 (2024). https://doi.org/10.1007/s11425-023-2302-9
