Convergence and error estimates for time-discrete consensus-based optimization algorithms

Ha, Seung-Yeal; Jin, Shi; Kim, Doheon

doi:10.1007/s00211-021-01174-y

Convergence and error estimates for time-discrete consensus-based optimization algorithms

Published: 22 January 2021

Volume 147, pages 255–282, (2021)
Cite this article

Numerische Mathematik Aims and scope Submit manuscript

Seung-Yeal Ha^1,2,
Shi Jin³ &
Doheon Kim²

600 Accesses
14 Citations
Explore all metrics

Abstract

We present convergence and error estimates of modified versions of the time-discrete consensus-based optimization (CBO) algorithm proposed in Carrillo et al. (ESAIM: Control Optim Calc Var, 2020) for general non-convex functions. In authors’ recent work (Ha et al. in Math Models Meth Appl Sci 30:2417–2444, 2020), rigorous error analysis of a modified version of the first-order consensus-based optimization algorithm proposed in Carrillo et al. (2020) was studied at the particle level without resorting to the kinetic equation via a mean-field limit. However, the error analysis for the corresponding time- discrete algorithm was not done mainly due to lack of discrete analogue of Itô’s stochastic calculus. In this paper, we provide a simple and elementary convergence and error analysis for a general time-discrete consensus-based optimization algorithm, which includes modifications of the three discrete algorithms in Carrillo et al. (2020), two of which are present in Ha et al. (2020). Our analysis provides numerical stability and convergence conditions for the three algorithms, as well as error estimates to the global minimum.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Convergence of Anisotropic Consensus-Based Optimization in Mean-Field Law

An Adaptive Consensus Based Method for Multi-objective Optimization with Uniform Pareto Front Approximation

Article Open access 10 August 2023

Distributed Fixed-Time Consensus and Optimization for Second-Order Multi-Agent Systems

References

Acebron, J.A., Bonilla, L.L., Pérez Vicente, C.J.P., Ritort, F., Spigler, R.: The Kuramoto model: a simple paradigm for synchronization phenomena. Rev. Mod. Phys. 77, 137–185 (2005)
Article Google Scholar
Albi, G., Bellomo, N., Fermo, L., Ha, S.-L., Pareschi, L., Poyato, D., Soler, J.: Vehicular traffic, crowds, and swarms. On the kinetic theory approach towards research perspectives. Math. Models Methods Appl. Sci. 29, 1901–2005 (2019)
Article MathSciNet Google Scholar
Bertsekas, D.: Convex Analysis and Optimization. Athena Scientific, Belmont (2003)
MATH Google Scholar
Carrillo, J.A., Jin, S., Li, L., Zhu, Y.: A consensus-based global optimization method for high dimensional machine learning problems. ESAIM: Control Optim. Calc. Var. (2020)
Carrillo, J., Choi, Y.-P., Totzeck, C., Tse, O.: An analytical framework for consensus-based global optimization method. Math. Models Methods Appl. Sci. 28, 1037–1066 (2018)
Article MathSciNet Google Scholar
Choi, Y.-P., Ha, S.-Y., Li, Z.: Emergent dynamics of the Cucker–Smale flocking model and its variants. In: Bellomo, N., Degond, P., Tadmor, E. (eds.) Active Particles Vol. I Theory, Models, Applications (Tentative Title), Series: Modeling and Simulation in Science and Technology. Birkhauser, Springer, Berlin (2017)
Google Scholar
Crow, E.L., Shimizu, K. (eds.): Lognormal Distributions: Theory and Applications. Statistics: Textbooks and Monographs. Marcel-Dekker Inc., New York (1988)
Google Scholar
Cucker, F., Smale, S.: On the mathematics of emergence. Jpn. J. Math. 2, 197–227 (2007)
Article MathSciNet Google Scholar
Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer, Heidelberg, Berlin (1998)
Book Google Scholar
Eberhart, R., Kennedy, J.: Particle swarm optimization. Proc. IEEE Int. Conf. Neural Netw. 4, 1942–1948 (1995)
Article Google Scholar
Fang, D., Ha, S.-Y., Jin, S.: Emergent behaviors of the Cucker–Smale ensemble under attractive-repulsive couplings and Rayleigh frictions. Math. Models Methods Appl. Sci. 29, 1349–1385 (2019)
Article MathSciNet Google Scholar
Fornasier, M., Huang, H., Pareschi, L., S$\ddot{u}$nnen, P.: Consensus-based optimization on the sphere I: well-posedness and mean-field limit. Preprint. Available at arXiv:2001.110994v2
Fornasier, M., Huang, H., Pareschi, L., S$\ddot{u}$nnen, P.: Consensus-based optimization on the sphere II: convergence to global minimizer and machine learning. Preprint. Available at arXiv:2001.11988v2
Ha, S.-Y., Jin, S., Kim, D.: Convergence of a first-order consensus-based global optimization algorithm. Math. Models Meth. Appl. Sci. 30, 2417–2444 (2020)
Article MathSciNet Google Scholar
Ha, S.-Y., Liu, J.-G.: A simple proof of Cucker–Smale flocking dynamics and mean-field limit. Commun. Math. Sci. 7, 297–325 (2009)
Article MathSciNet Google Scholar
Ha, S.-Y., Lee, K., Levy, D.: Emergence of time-asymptotic flocking in a stochastic Cucker–Smale system. Commun. Math. Sci. 7, 453–469 (2009)
Article MathSciNet Google Scholar
Holland, J.H.: Genetic algorithms. Sci. Am. 267, 66–73 (1992)
Article Google Scholar
Hsu, L.C.: A theorem on the asymptotic behavior of a multiple integral. Duke Math. J. 15, 623–632 (1948)
Article MathSciNet Google Scholar
Kennedy, J.: Swarm Intelligence. Handbook of Nature-Inspired and Innovative computing, pp. 187–219. Springer, Berlin (2006)
Book Google Scholar
Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220, 671–680 (1983)
Article MathSciNet Google Scholar
Kolokolnikov, T., Carrillo, J.A., Bertozzi, A., Fetecau, R., Lewis, M.: Emergent behavior in a multi-particle systems with non-local interactions. Physica D 260, 1–4 (2013)
Article MathSciNet Google Scholar
Kuramoto, Y.: Chemical Oscillations, Waves and Turbulence. Springer, Berlin (1984)
Book Google Scholar
Kuramoto, Y.: International symposium on mathematical problems in mathematical physics. Lect. Notes Theor. Phys. 30, 420 (1975)
Article Google Scholar
Motsch, S., Tadmor, E.: Heterophilious dynamics enhances consensus. SIAM. Rev. 56, 577–621 (2014)
Article MathSciNet Google Scholar
Peskin, C.S.: Mathematical Aspects of Heart Physiology. Courant Institute of Mathematical Sciences, New York (1975)
MATH Google Scholar
Pinnau, R., Totzeck, C., Tse, O., Martin, S.: A consensus-based model for global optimization and its mean-field limit. Math. Models Methods Appl. Sci. 27, 183–204 (2017)
Article MathSciNet Google Scholar
Pikovsky, A., Rosenblum, M., Kurths, J.: Synchronization: A Universal Concept in Nonlinear Sciences. Cambridge University Press, Cambridge (2001)
Book Google Scholar
Totzeck, C., Pinnau, R., Blauth, S., Schotthöfer, S.: A numerical comparison of consensus-based global optimization to other particle-based global optimization scheme. Proc. Appl. Math. Mech. 18, e201800291 (2018)
Article Google Scholar
van Laarhoven, P.J.M., Aarts, E.H.L.: Simulated Annealing: Theory and Applications. D. Reidel Publishing Co., Dordrecht (1987)
Book Google Scholar
Vicsek, T., Zefeiris, A.: Collective motion. Phys. Rep. 517, 71–140 (2012)
Article Google Scholar
Yang, X.-S.: Nature-Inspired Metaheuristic Algorithms. Luniver Press, Paris (2010)
Google Scholar
Yang, X.-S., Deb, S., Zhao, Y.-X., Fong, S., He, X.: Swarm intelligence: past, present and future. Soft Comput. 22, 5923–5933 (2018)
Article Google Scholar

Download references

Acknowledgements

The work of S.-Y. Ha was supported by National Research Foundation of Korea (NRF-2020R1A2C3A01003881), and the work of S. Jin was supported by NSFC Grant Nos. 11871297 and 3157107, and the work of D. Kim was supported by a KIAS Individual Grant (MG073901) at Korea Institute for Advanced Study. The authors would like to thank Professor Dongnam Ko for his helpful comments on the Laplace principle.

Author information

Authors and Affiliations

Department of Mathematical Sciences and Research Institute of Mathematics, Seoul National University, Seoul, 08826, Korea (Republic of)
Seung-Yeal Ha
School of Mathematics, Korea Institute for Advanced Study, Hoegiro 85, Seoul, 02455, Korea (Republic of)
Seung-Yeal Ha & Doheon Kim
School of Mathematical Sciences, MOE-LSC, and Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai, 200240, China
Shi Jin

Authors

Seung-Yeal Ha
View author publications
You can also search for this author in PubMed Google Scholar
Shi Jin
View author publications
You can also search for this author in PubMed Google Scholar
Doheon Kim
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Doheon Kim.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Error estimate under an alternative framework

In this subsection, we provide an alternative result for an error estimate for time-discrete CBO scheme (1.1) without using Laplace’s principle under a slightly different framework $({{\mathcal {B}}}1) - ({{\mathcal {B}}}2)$. Below, we present a framework for the objective function L, global minimum point $X_*$ and reference random variable ${X_\text {in}}$ as follows:

$({{\mathcal {B}}}1)$: Let $L = L(x)$ be a $C^2$-objective function satisfying the following relations:
$$\begin{aligned} L_m{:}{=}\min _{x \in {\mathbb {R}}^d} L(x) >0 \quad \text{ and } \quad C_L{:}{=}\sup _{x\in {\mathbb {R}}^d}\Vert \nabla ^2L(x)\Vert _2 <\infty , \end{aligned}$$
where $\Vert \cdot \Vert _2$ denotes the spectral norm.
$({{\mathcal {B}}}2)$: Let ${X_\text {in}}$ be a reference random variable associated with a law whose support ${{\tilde{D}}}$ is compact and contains $X_*$.

Note that the condition $({{\mathcal {B}}}1)$ is exactly the same as $({{\mathcal {A}}}1)$, whereas the condition $({{\mathcal {B}}}2)$ is different from $({{\mathcal {A}}}3)$ in which the probability measures associated with ${X_\text {in}}$ is absolutely continuous with respect to Lebesgue measure. Moreover, notice that this new framework does not need any condition on ${\text {det}}\left( \nabla ^2 L(X_*)\right) $ as in $({{\mathcal {A}}}2)$.

Next, we study how the common consensus state $X_\infty $ is close to the global minimum $X_*$ in suitable sense. Now, we are ready to provide an error estimate for the discrete CBO algorithm, which is analogous to Theorem 4.1 in [14] for the continuous case.

Theorem 5.1

Suppose that the framework $({{\mathcal {B}}}1) - ({{\mathcal {B}}}2)$ holds, and parameters $\beta , \gamma , \zeta , \delta $ and the initial data $\{X_0^i \}$ satisfy

$$\begin{aligned} \begin{aligned}&\beta> 0, \quad \delta > 0, \quad (\gamma -1)^2 + \zeta ^2< 1, \quad X_0^i: i,i.d, \quad X_0^i \sim {X_\text {in}}, \quad \sup _{x \in {\tilde{{\mathcal {D}}}}} L(x) - L_m < \delta , \\&(1-\varepsilon ){\mathbb {E}} \Big [ e^{-\beta L({X_\text {in}})} \Big ] \\&\ge \frac{2C_L \sqrt{ \big (1+ (1-\gamma )^2+\zeta ^2\big ) \big ( \gamma ^2+\zeta ^2\big )} \beta e^{-\beta L_m}}{1-e^{-[ 1-(\gamma - 1)^2 - \zeta ^2]}} \sum _{l=1}^d \left( {\mathbb {E}}\max _{1\le i\le N} (x_0^{i,l} -{{\bar{x}}}^l_0)^2\right) , \end{aligned} \end{aligned}$$

(5.1)

for some $0<\varepsilon <1$. Then, for a solution $\{X_n^i\}_{1\le i\le N}$ to (1.1),

$$\begin{aligned} \Big | \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega } L(X_\infty ) - L_m \Big | \le \delta + \Big | \frac{\log \varepsilon }{\beta } \Big |. \end{aligned}$$

(5.2)

Before we provide a proof, we give several comments on the result of this theorem. 1. In the proof of Theorem 5.1, we will first derive the estimate:

$$\begin{aligned} \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega } L(X_\infty )\le \sup \limits _{x\in {{\tilde{D}}}} L(x)-\frac{1}{\beta }\log \varepsilon . \end{aligned}$$

(5.3)

2. For any given $\delta> 0,~\varepsilon > 0$ and $\beta > 0$, the conditions (5.1) can be attained with suitable ${X_\text {in}}$. To see this, we choose the law of ${X_\text {in}}$ such that ${{\tilde{D}}}$ is a small neighborhood of a global minimum $X_*$ satisfying the following two relations:

$$\begin{aligned} \sup \limits _{x\in {{\tilde{D}}}} L(x)-L_m<\delta , \end{aligned}$$

(5.4)

and

$$\begin{aligned} 1-\varepsilon \ge \frac{2C_L \sqrt{ \big (1+ (1-\gamma )^2+\zeta ^2\big ) \big ( \gamma ^2+\zeta ^2\big )} \beta e^{\beta (\sup \limits _{x\in D}L(x)- L_m)}}{1-e^{-[ 1-(\gamma - 1)^2 - \zeta ^2]}} \big ( {\text {diam}}({{\mathcal {R}}}_{in})\big )^2,\nonumber \\ \end{aligned}$$

(5.5)

where ${{\mathcal {R}}}_{in} = [a_1, b_1] \times \cdots \times [a_d, b_d]$ is the smallest closed d-dimensional rectangle containing ${{\tilde{D}}}$ so that

$$\begin{aligned} {\mathbb {E}} \Big [ \max _{1\le i\le N} (x_0^{i,l} -{{\bar{x}}}^l_0)^2 \Big ] \le |b_\ell - a_\ell |^2, \end{aligned}$$

(5.6)

for each $\ell = 1, \ldots , d$. Then, due to (5.6), the relation (5.5) implies the condition (5.1)$_2$. Hence, we can apply the estimate (5.3) and (5.4) to get the desired error estimate:

$$\begin{aligned} \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega } L(X_\infty )\le L_m+\left( \sup \limits _{x\in {{\tilde{D}}}} L(x)-L_m\right) -\frac{1}{\beta }\log \varepsilon <L_m+\delta -\frac{\log \varepsilon }{\beta }. \end{aligned}$$

Now we are ready to provide a proof of Theorem 5.1. Similar to the proof of Theorem 3.2, we obtain

$$\begin{aligned} \mathbb Ee^{-\beta L(X_\infty )}&\ge \frac{1}{N}\sum _{i=1}^N {\mathbb {E}} e^{-\beta L(X_0^i)} \\&\quad -\frac{2C_L \sqrt{ \big (1+ (1-\gamma )^2+\zeta ^2\big ) \big ( \gamma ^2+\zeta ^2\big )} \beta e^{-\beta L_m}}{1-e^{-(2\gamma - \gamma ^2 -\zeta ^2)}} \\&\quad \sum _{l=1}^d \left( {\mathbb {E}}\max _{1\le i\le N} (x_0^{i,l} -{{\bar{x}}}^l_0)^2\right) \\&\ge \varepsilon {\mathbb {E}} e^{-\beta L({X_\text {in}})}. \end{aligned}$$

Hence one has

$$\begin{aligned} e^{-\beta \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega } L(X_\infty )} =\mathbb Ee^{-\beta \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega }L(X_\infty )}\ge \mathbb Ee^{-\beta L(X_\infty )}\ge \varepsilon {\mathbb {E}} e^{-\beta L({X_\text {in}})}\ge \varepsilon e^{-\beta \sup \limits _{x\in {{\tilde{D}}}} L(x)}. \end{aligned}$$

Finally, we take logarithm to the both sides of the above relation to get the desired estimate (5.3).

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ha, SY., Jin, S. & Kim, D. Convergence and error estimates for time-discrete consensus-based optimization algorithms. Numer. Math. 147, 255–282 (2021). https://doi.org/10.1007/s00211-021-01174-y

Download citation

Received: 11 March 2020
Revised: 21 September 2020
Accepted: 24 December 2020
Published: 22 January 2021
Issue Date: February 2021
DOI: https://doi.org/10.1007/s00211-021-01174-y

Mathematics Subject Classification

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Convergence and error estimates for time-discrete consensus-based optimization algorithms

Abstract

Access this article

Similar content being viewed by others

Convergence of Anisotropic Consensus-Based Optimization in Mean-Field Law

An Adaptive Consensus Based Method for Multi-objective Optimization with Uniform Pareto Front Approximation

Distributed Fixed-Time Consensus and Optimization for Second-Order Multi-Agent Systems

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Error estimate under an alternative framework

Theorem 5.1

Rights and permissions

About this article

Cite this article

Mathematics Subject Classification

Navigation

Convergence and error estimates for time-discrete consensus-based optimization algorithms

Abstract

Access this article

Similar content being viewed by others

Convergence of Anisotropic Consensus-Based Optimization in Mean-Field Law

An Adaptive Consensus Based Method for Multi-objective Optimization with Uniform Pareto Front Approximation

Distributed Fixed-Time Consensus and Optimization for Second-Order Multi-Agent Systems

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Error estimate under an alternative framework

Error estimate under an alternative framework

Theorem 5.1

Rights and permissions

About this article

Cite this article

Share this article

Mathematics Subject Classification

Search

Navigation