Abstract
This paper presents a decentralized algorithm for non-convex optimization over tree-structured networks. We assume that each node of this network can solve small-scale optimization problems and communicate approximate value functions with its neighbors based on a novel multi-sweep communication protocol. In contrast to existing parallelizable optimization algorithms for non-convex optimization, the nodes of the network are not synchronized and do not assign any central entity. None of the nodes needs to know the whole topology of the network, but all nodes know that the network is tree-structured. We discuss conditions under which locally quadratic convergence rates can be achieved. The method is illustrated by running the decentralized asynchronous multi-sweep protocol on a radial AC power network case study.
Notes
The assumption \(|{\mathcal {N}}| > 1\) ensures that \(\pi _i\) exists and is well-defined for all \(i \in {\mathcal {L}}^\bullet \).
References
Bellman, R.: Dynamic programming. Science 153(3731), 34–37 (1966)
Bernardini, D., Bemporad, A.: Stabilizing model predictive control of stochastic constrained linear systems. IEEE Trans. Autom. Control 57(6), 1468–1480 (2011)
Bertsekas, D.: Convexification procedures and decomposition methods for nonconvex optimization problems. J. Optim. Theory Appl. 29(2), 169–197 (1979)
Bertsekas, D.P.: Dynamic programming and suboptimal control: A survey from ADP to MPC. Eur. J. Control 11(4–5), 310–334 (2005)
Bertsekas, D.P.: Dynamic Programming and Optimal Control, 3rd edn. Athena Scientific, Belmont, MA (2007)
Bertsekas, D.P.: Abstract Dynamic Programming. Athena Scientific, Belmont, MA (2013)
Bertsekas, D.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, Singapore (2014)
Bertsekas, D., Tsitsiklis, J.: Parallel and Distributed Computation: Numerical Methods, vol. 23. Prentice Hall, Englewood Cliffs, NJ (1989)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Braun, P., Grüne, L., Kellett, C.M., Weller, S.R., Worthmann, K.: A distributed optimization algorithm for the predictive control of smart grids. IEEE Trans. Autom. Control 61(12), 3898–3911 (2016)
Du, X., Engelmann, A., Jiang, Y., Faulwasser, T., Houska, B.: Distributed state estimation for AC power systems using Gauss-Newton ALADIN. In: Proceedings of the 58th IEEE Conference on Decision and Control, pp. 1919–1924 (2019)
Engelmann, A., Jiang, Y., Mühlpfordt, T., Houska, B., Faulwasser, T.: Toward distributed OPF using ALADIN. IEEE Trans. Power Syst. 34(1), 584–594 (2018)
Gondzio, J., Grothey, A.: Exploiting structure in parallel implementation of interior point methods for optimization. CMS 6(2), 135–160 (2009)
Grüne, L., Semmler, W.: Using dynamic programming with adaptive grid scheme for optimal control problems in economics. J. Econ. Dyn. Control 28, 2427–2456 (2004)
Hamdi, A.: Two-level primal-dual proximal decomposition technique to solve large scale optimization problems. Appl. Math. Comput. 160(3), 921–938 (2005)
Hamdi, A., Mishra, S.K.: Decomposition methods based on augmented Lagrangians: a survey. In: Mishra, S. (ed.) Topics in nonconvex optimization, pp. 175–203. Springer (2011)
Hong, M., Luo, Z.Q., Razaviyayn, M.: Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM J. Optim. 26(1), 337–364 (2016)
Houska, B., Diehl, M.: Nonlinear robust optimization via sequential convex bilevel programming. Math. Program., Ser. A 142, 539–577 (2013)
Houska, B., Frasch, J., Diehl, M.: An augmented Lagrangian based algorithm for distributed nonconvex optimization. SIAM J. Optim. 26(2), 1101–1127 (2016)
Hult, R., Zanon, M., Gros, S., Falcone, P.: Primal decomposition of the optimal coordination of vehicles at traffic intersections. In: 2016 IEEE 55th Conference on Decision and Control (CDC), pp. 2567–2573 (2016)
Jiang, Y., Zanon, M., Hult, R., Houska, B.: Distributed algorithm for optimal vehicle coordination at traffic intersections. IFAC-PapersOnLine 50(1), 11577–11582 (2017)
Kekatos, V., Giannakis, G.B.: Distributed robust power system state estimation. IEEE Trans. Power Syst. 28(2), 1617–1626 (2012)
Kellerer, A., Steinke, F.: An approximate min-sum algorithm for smart grid dispatch with continuous variables. IFAC-PapersOnLine 49, 307–312 (2016)
Kellerer, A., Steinke, F.: Scalable economic dispatch for smart distribution networks. IEEE Trans. Power Syst. 30, 1739–1746 (2014)
Keshavarz, A., Boyd, S.: Quadratic approximate dynamic programming for input-affine systems. Int. J. Robust Nonlinear Control 24(3), 432–449 (2014)
Khoshfetrat Pakazad, S., Hansson, A., Andersen, M.S., Nielsen, I.: Distributed primal-dual interior-point methods for solving tree-structured coupled convex problems using message-passing. Optim. Methods Softw. 32(3), 401–435 (2017)
Kouzoupis, D., Klintberg, E., Diehl, M., Gros, S.: A dual Newton strategy for scenario decomposition in robust multistage MPC. Int. J. Robust Nonlinear Control 28(6), 2340–2355 (2018)
Kouzoupis, D., Quirynen, R., Garcia, J., Erhard, M., Diehl, M.: A quadratically convergent primal decomposition algorithm with soft coupling for nonlinear parameter estimation. In: 2016 IEEE 55th Conference on Decision and Control (CDC), pp. 1086–1092 (2016)
Kouzoupis, D.: Structure-exploiting numerical methods for tree-sparse optimal control problems. Ph.D. thesis, University of Freiburg (2019)
Lucia, S., Andersson, J.A., Brandt, H., Diehl, M., Engell, S.: Handling uncertainty in economic nonlinear model predictive control: A comparative case study. J. Process Control 24(8), 1247–1259 (2014)
Luss, R.: Optimal control by dynamic programming using systematic reduction in grid size. Int. J. Control 51(5), 995–1013 (1990)
Makhdoumi, A., Ozdaglar, A.: Convergence rate of distributed ADMM over networks. IEEE Trans. Autom. Control 62(10), 5082–5095 (2017)
Molzahn, D.K., Dörfler, F., Sandberg, H., Low, S.H., Chakrabarti, S., Baldick, R., Lavaei, J.: A survey of distributed optimization and control algorithms for electric power systems. IEEE Trans. Smart Grid 8(6), 2941–2962 (2017)
Nedić, A., Olshevsky, A., Shi, W.: Decentralized consensus optimization and resource allocation. In: Large-Scale and Distributed Optimization, pp. 247–287. Springer (2018)
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course, vol. 87. Springer, Berlin (2013)
Nesterov, Y., Polyak, B.T.: Cubic regularization of Newton method and its global performance. Math. Program. 108(1), 177–205 (2006)
Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Berlin (2006)
Pakazad, S., Hansson, A., Andersen, M.: Distributed primal-dual interior-point methods for solving tree-structured coupled problems using message passing. Optim. Methods Softw. 32(3), 401–435 (2017)
Peng, Q., Low, S.: Distributed algorithm for optimal power flow on a radial network. In: 53rd IEEE Conference on Decision and Control, pp. 167–172. IEEE (2014)
Rawlings, J., Mayne, D., Diehl, M.: Model Predictive Control: Theory and Design, 2nd edn. Nob Hill Publishing, Madison, WI (2017)
Robinson, S.: Strongly regular generalized equations. Math. Oper. Res. 5(1), 43–62 (1980)
Shi, W., Ling, Q., Yuan, K., Wu, G., Yin, W.: On the linear convergence of the ADMM in decentralized consensus optimization. IEEE Trans. Signal Process. 62(7), 1750–1761 (2014)
Terelius, H., Topcu, U., Murray, R.M.: Decentralized multi-agent optimization via dual decomposition. IFAC Proc. 44(1), 11245–11251 (2011)
Wächter, A., Biegler, L.T.: On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program. 106(1), 25–57 (2006)
Wang, Y., O’Donoghue, B., Boyd, S.: Approximate dynamic programming via iterated Bellman inequalities. Int. J. Robust Nonlinear Control 25(10), 1472–1496 (2015)
Zavala, V., Laird, C., Biegler, L.: Interior-point decomposition approaches for parallel solution of large-scale nonlinear parameter estimation problems. Chem. Eng. Sci. 63(19), 4834–4845 (2008)
Zimmerman, R.D., Murillo-Sánchez, C.E., Thomas, R.J.: Matpower: Steady-state operations, planning, and analysis tools for power systems research and education. IEEE Trans. Power Syst. 26(1), 12–19 (2011)
Acknowledgements
YJ, HY, and BH acknowledge support by ShanghaiTech University, Grant-Nr. F-0203-14-012. DK and MD acknowledge support by BMWi via eco4wind (0324125B) and DyConPV (0324166B), and by DFG via Research Unit FOR 2401.
Additional information
Communicated by Levent Tunçel.
Appendix A Proof of Theorem 1.1
Let us introduce the shorthands
to denote, respectively, the primal-dual minimizer of (6) at the \(k\)th iteration of the algorithm and the primal-dual minimizer of (5). Due to the regularity of \(x^\star \), the LICQ condition must be satisfied in a neighborhood of \(x^\star \), which implies that the first-order necessary KKT conditions
with shorthands
are satisfied, recalling that \(\varPhi \) is a locally accurate approximation of F. Now, because the derivative of R with respect to its second argument, \(\nabla _z R(x,\cdot )\), is uniformly Lipschitz continuous in a neighborhood of \(z^\star \), the first equation in (24) yields
where we have set \(M(z^k) = \nabla _z R(x^k, z^k) = \nabla _z \widetilde{R}(z^k)\) and used that \(\widetilde{R}(z^k) = R(x^k,z^k)\). Notice that the KKT matrix \(M(z^k)\) is invertible for all \(z^k\) in an open neighborhood of \(z^\star \), as we assume that the LICQ and SOSC conditions are satisfied at \(z^\star \). Consequently, because we have \(\widetilde{R}(z^k) = \mathbf {O}( \Vert z^k - z^\star \Vert )\), the above equation implies that
From here on, the proof is very similar to the standard proof of quadratic convergence of Newton’s method (see, e.g., [37, Thm. 3.5]); that is, we use (27) to establish the inequality
Because the LICQ condition holds, the multiplier sequence \(\kappa ^k\) is uniquely determined by the sequence \(x^k\) (since \(x^{k+1}\) depends only on \(x^k\), but not on \(\kappa ^k\)). Hence, the above equation also implies that
The latter equation corresponds to the statement of the theorem establishing local quadratic convergence.
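The quadratic contraction behind this argument can be observed numerically. The following is a minimal sketch, not the algorithm of this paper: it applies a plain Newton iteration to the exact KKT residual of a hypothetical two-variable non-convex problem, \(\min -x_1 x_2\) subject to \(x_1^2 + x_2^2 = 2\), whose primal-dual solution \((1, 1, 1/2)\) satisfies LICQ and SOSC. The toy problem, the starting point, and all function names are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical toy problem (not from the paper):
#   minimize  -x1 * x2   subject to  x1^2 + x2^2 - 2 = 0.
# Its primal-dual solution is (x1, x2, kappa) = (1, 1, 1/2).
Z_STAR = np.array([1.0, 1.0, 0.5])


def kkt_residual(z):
    """Stationarity of the Lagrangian stacked on the constraint."""
    x1, x2, kappa = z
    return np.array([
        -x2 + 2.0 * kappa * x1,
        -x1 + 2.0 * kappa * x2,
        x1**2 + x2**2 - 2.0,
    ])


def kkt_matrix(z):
    """Jacobian of the KKT residual with respect to z = (x1, x2, kappa)."""
    x1, x2, kappa = z
    return np.array([
        [2.0 * kappa, -1.0, 2.0 * x1],
        [-1.0, 2.0 * kappa, 2.0 * x2],
        [2.0 * x1, 2.0 * x2, 0.0],
    ])


def newton_errors(z0, num_iter=6):
    """Run Newton steps on the KKT system and record ||z^k - z*||."""
    z = np.asarray(z0, dtype=float)
    errors = [np.linalg.norm(z - Z_STAR)]
    for _ in range(num_iter):
        z = z - np.linalg.solve(kkt_matrix(z), kkt_residual(z))
        errors.append(np.linalg.norm(z - Z_STAR))
    return errors


errors = newton_errors([1.2, 0.8, 0.4])

# Ratios e_{k+1} / e_k^2, skipped once e_k is dominated by rounding error.
ratios = [errors[k + 1] / errors[k] ** 2
          for k in range(len(errors) - 1) if errors[k] > 1e-6]

for k, e in enumerate(errors):
    print(f"k = {k}: ||z^k - z*|| = {e:.3e}")
print("e_(k+1) / e_k^2:", [f"{r:.3f}" for r in ratios])
```

If the ratios stay bounded while the errors shrink, the iterates behave as a locally quadratically convergent sequence, which is the property the theorem asserts for the multi-sweep iterates under its own assumptions.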
Cite this article
Jiang, Y., Kouzoupis, D., Yin, H. et al. Decentralized Optimization Over Tree Graphs. J Optim Theory Appl 189, 384–407 (2021). https://doi.org/10.1007/s10957-021-01828-9
DOI: https://doi.org/10.1007/s10957-021-01828-9