
Convergence and error estimates for time-discrete consensus-based optimization algorithms


Abstract

We present convergence and error estimates for modified versions of the time-discrete consensus-based optimization (CBO) algorithm proposed in Carrillo et al. (ESAIM: Control Optim Calc Var, 2020) for general non-convex functions. In the authors' recent work (Ha et al. in Math Models Meth Appl Sci 30:2417–2444, 2020), a rigorous error analysis of a modified version of the first-order consensus-based optimization algorithm proposed in Carrillo et al. (2020) was carried out at the particle level, without resorting to the kinetic equation via a mean-field limit. However, the error analysis for the corresponding time-discrete algorithm was not done, mainly due to the lack of a discrete analogue of Itô's stochastic calculus. In this paper, we provide a simple and elementary convergence and error analysis for a general time-discrete consensus-based optimization algorithm, which includes modifications of the three discrete algorithms in Carrillo et al. (2020), two of which appear in Ha et al. (2020). Our analysis provides numerical stability and convergence conditions for the three algorithms, as well as error estimates with respect to the global minimum.
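Scheme (1.1) itself is not reproduced on this page. For orientation, the following is a minimal Python sketch of a generic time-discrete CBO iteration of the kind analyzed here, assuming the common form in which each particle drifts toward a Gibbs-weighted consensus point and is perturbed by componentwise multiplicative noise; the parameter names beta, gamma, zeta mirror those in Theorem 5.1 below, all helper names are ours, and the precise scheme (1.1) may differ in detail.

```python
import numpy as np

def consensus_point(X, L, beta):
    """Gibbs-weighted average of the particle positions X, shape (N, d)."""
    vals = np.apply_along_axis(L, 1, X)
    weights = np.exp(-beta * (vals - vals.min()))  # shift by min for stability
    return (weights[:, None] * X).sum(axis=0) / weights.sum()

def cbo_step(X, L, beta, gamma, zeta, rng):
    """One discrete CBO update: drift toward the consensus point plus
    componentwise multiplicative noise with i.i.d. standard normal eta."""
    xbar = consensus_point(X, L, beta)
    eta = rng.standard_normal(X.shape)
    return X - gamma * (X - xbar) + zeta * eta * (X - xbar)

def run_cbo(L, X0, beta, gamma, zeta, n_steps, seed=0):
    """Iterate the update; (gamma - 1)**2 + zeta**2 < 1, as in Theorem 5.1
    below, makes the ensemble spread contract in mean square."""
    rng = np.random.default_rng(seed)
    X = X0.copy()
    for _ in range(n_steps):
        X = cbo_step(X, L, beta, gamma, zeta, rng)
    return X
```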


References

  1. Acebrón, J.A., Bonilla, L.L., Pérez Vicente, C.J., Ritort, F., Spigler, R.: The Kuramoto model: a simple paradigm for synchronization phenomena. Rev. Mod. Phys. 77, 137–185 (2005)

  2. Albi, G., Bellomo, N., Fermo, L., Ha, S.-Y., Pareschi, L., Poyato, D., Soler, J.: Vehicular traffic, crowds, and swarms. On the kinetic theory approach towards research perspectives. Math. Models Methods Appl. Sci. 29, 1901–2005 (2019)

  3. Bertsekas, D.: Convex Analysis and Optimization. Athena Scientific, Belmont (2003)

  4. Carrillo, J.A., Jin, S., Li, L., Zhu, Y.: A consensus-based global optimization method for high dimensional machine learning problems. ESAIM: Control Optim. Calc. Var. (2020)

  5. Carrillo, J., Choi, Y.-P., Totzeck, C., Tse, O.: An analytical framework for consensus-based global optimization method. Math. Models Methods Appl. Sci. 28, 1037–1066 (2018)

  6. Choi, Y.-P., Ha, S.-Y., Li, Z.: Emergent dynamics of the Cucker–Smale flocking model and its variants. In: Bellomo, N., Degond, P., Tadmor, E. (eds.) Active Particles Vol. I: Theory, Models, Applications (Tentative Title), Series: Modeling and Simulation in Science and Technology. Birkhäuser, Springer, Berlin (2017)

  7. Crow, E.L., Shimizu, K. (eds.): Lognormal Distributions: Theory and Applications. Statistics: Textbooks and Monographs. Marcel-Dekker Inc., New York (1988)

  8. Cucker, F., Smale, S.: On the mathematics of emergence. Jpn. J. Math. 2, 197–227 (2007)

  9. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer, Heidelberg, Berlin (1998)

  10. Eberhart, R., Kennedy, J.: Particle swarm optimization. Proc. IEEE Int. Conf. Neural Netw. 4, 1942–1948 (1995)

  11. Fang, D., Ha, S.-Y., Jin, S.: Emergent behaviors of the Cucker–Smale ensemble under attractive-repulsive couplings and Rayleigh frictions. Math. Models Methods Appl. Sci. 29, 1349–1385 (2019)

  12. Fornasier, M., Huang, H., Pareschi, L., Sünnen, P.: Consensus-based optimization on the sphere I: well-posedness and mean-field limit. Preprint. Available at arXiv:2001.11994v2

  13. Fornasier, M., Huang, H., Pareschi, L., Sünnen, P.: Consensus-based optimization on the sphere II: convergence to global minimizer and machine learning. Preprint. Available at arXiv:2001.11988v2

  14. Ha, S.-Y., Jin, S., Kim, D.: Convergence of a first-order consensus-based global optimization algorithm. Math. Models Methods Appl. Sci. 30, 2417–2444 (2020)

  15. Ha, S.-Y., Liu, J.-G.: A simple proof of Cucker–Smale flocking dynamics and mean-field limit. Commun. Math. Sci. 7, 297–325 (2009)

  16. Ha, S.-Y., Lee, K., Levy, D.: Emergence of time-asymptotic flocking in a stochastic Cucker–Smale system. Commun. Math. Sci. 7, 453–469 (2009)

  17. Holland, J.H.: Genetic algorithms. Sci. Am. 267, 66–73 (1992)

  18. Hsu, L.C.: A theorem on the asymptotic behavior of a multiple integral. Duke Math. J. 15, 623–632 (1948)

  19. Kennedy, J.: Swarm Intelligence. Handbook of Nature-Inspired and Innovative Computing, pp. 187–219. Springer, Berlin (2006)

  20. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220, 671–680 (1983)

  21. Kolokolnikov, T., Carrillo, J.A., Bertozzi, A., Fetecau, R., Lewis, M.: Emergent behavior in multi-particle systems with non-local interactions. Physica D 260, 1–4 (2013)

  22. Kuramoto, Y.: Chemical Oscillations, Waves and Turbulence. Springer, Berlin (1984)

  23. Kuramoto, Y.: International Symposium on Mathematical Problems in Theoretical Physics. Lect. Notes Theor. Phys. 30, 420 (1975)

  24. Motsch, S., Tadmor, E.: Heterophilious dynamics enhances consensus. SIAM Rev. 56, 577–621 (2014)

  25. Peskin, C.S.: Mathematical Aspects of Heart Physiology. Courant Institute of Mathematical Sciences, New York (1975)

  26. Pinnau, R., Totzeck, C., Tse, O., Martin, S.: A consensus-based model for global optimization and its mean-field limit. Math. Models Methods Appl. Sci. 27, 183–204 (2017)

  27. Pikovsky, A., Rosenblum, M., Kurths, J.: Synchronization: A Universal Concept in Nonlinear Sciences. Cambridge University Press, Cambridge (2001)

  28. Totzeck, C., Pinnau, R., Blauth, S., Schotthöfer, S.: A numerical comparison of consensus-based global optimization to other particle-based global optimization schemes. Proc. Appl. Math. Mech. 18, e201800291 (2018)

  29. van Laarhoven, P.J.M., Aarts, E.H.L.: Simulated Annealing: Theory and Applications. D. Reidel Publishing Co., Dordrecht (1987)

  30. Vicsek, T., Zafeiris, A.: Collective motion. Phys. Rep. 517, 71–140 (2012)

  31. Yang, X.-S.: Nature-Inspired Metaheuristic Algorithms. Luniver Press, Paris (2010)

  32. Yang, X.-S., Deb, S., Zhao, Y.-X., Fong, S., He, X.: Swarm intelligence: past, present and future. Soft Comput. 22, 5923–5933 (2018)

Acknowledgements

The work of S.-Y. Ha was supported by the National Research Foundation of Korea (NRF-2020R1A2C3A01003881), the work of S. Jin was supported by NSFC Grant Nos. 11871297 and 3157107, and the work of D. Kim was supported by a KIAS Individual Grant (MG073901) at Korea Institute for Advanced Study. The authors would like to thank Professor Dongnam Ko for his helpful comments on the Laplace principle.

Author information

Corresponding author

Correspondence to Doheon Kim.


Error estimate under an alternative framework

In this appendix, we provide an alternative error estimate for the time-discrete CBO scheme (1.1), obtained without using Laplace's principle, under a slightly different framework \(({{\mathcal {B}}}1)\)–\(({{\mathcal {B}}}2)\). Below, we present the framework for the objective function L, the global minimum point \(X_*\), and the reference random variable \({X_\text {in}}\):

  • \(({{\mathcal {B}}}1)\): Let \(L = L(x)\) be a \(C^2\)-objective function satisfying the following relations:

    $$\begin{aligned} L_m{:}{=}\min _{x \in {\mathbb {R}}^d} L(x) >0 \quad \text{ and } \quad C_L{:}{=}\sup _{x\in {\mathbb {R}}^d}\Vert \nabla ^2L(x)\Vert _2 <\infty , \end{aligned}$$

    where \(\Vert \cdot \Vert _2\) denotes the spectral norm.

  • \(({{\mathcal {B}}}2)\): Let \({X_\text {in}}\) be a reference random variable associated with a law whose support \({{\tilde{D}}}\) is compact and contains \(X_*\).

Note that the condition \(({{\mathcal {B}}}1)\) is exactly the same as \(({{\mathcal {A}}}1)\), whereas the condition \(({{\mathcal {B}}}2)\) differs from \(({{\mathcal {A}}}3)\), in which the probability measure associated with \({X_\text {in}}\) is absolutely continuous with respect to the Lebesgue measure. Moreover, notice that this new framework does not require any condition on \({\text {det}}\left( \nabla ^2 L(X_*)\right) \) as in \(({{\mathcal {A}}}2)\).
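To make the framework \(({{\mathcal {B}}}1)\)–\(({{\mathcal {B}}}2)\) concrete, here is one illustrative choice (ours, not from the paper): a smooth non-convex objective with \(L_m = 1 > 0\) and Hessian of spectral norm at most one, so that \(C_L \le 1\), together with a compactly supported initial law whose support contains a global minimum point \(X_* = 0\).

```python
import numpy as np

def L(x):
    """C^2 objective with global minimum value L_m = 1 > 0, attained at
    x = 0 (and at the other lattice points 2*pi*k, all global); the Hessian
    diag(cos(x_l)) has spectral norm <= 1, so C_L <= 1 in (B1)."""
    return 1.0 + np.sum(1.0 - np.cos(x))

# (B2): X_in ~ Uniform([-r, r]^d) has compact support tilde{D} = [-r, r]^d,
# which contains the global minimum point X_* = 0.
rng = np.random.default_rng(1)
d, N, r = 2, 50, 0.3
X0 = rng.uniform(-r, r, size=(N, d))  # i.i.d. initial samples X_0^i ~ X_in
```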

Next, we study how close the common consensus state \(X_\infty \) is to the global minimum \(X_*\) in a suitable sense. We are now ready to provide an error estimate for the discrete CBO algorithm, which is analogous to Theorem 4.1 in [14] for the continuous case.

Theorem 5.1

Suppose that the framework \(({{\mathcal {B}}}1)\)–\(({{\mathcal {B}}}2)\) holds, and that the parameters \(\beta , \gamma , \zeta , \delta \) and the initial data \(\{X_0^i \}\) satisfy

$$\begin{aligned} \begin{aligned}&\beta> 0, \quad \delta > 0, \quad (\gamma -1)^2 + \zeta ^2< 1, \quad X_0^i \text { i.i.d.}, \quad X_0^i \sim {X_\text {in}}, \quad \sup _{x \in {{\tilde{D}}}} L(x) - L_m < \delta , \\&(1-\varepsilon ){\mathbb {E}} \Big [ e^{-\beta L({X_\text {in}})} \Big ] \ge \frac{2C_L \sqrt{ \big (1+ (1-\gamma )^2+\zeta ^2\big ) \big ( \gamma ^2+\zeta ^2\big )}\, \beta e^{-\beta L_m}}{1-e^{-[ 1-(\gamma - 1)^2 - \zeta ^2]}} \sum _{l=1}^d {\mathbb {E}}\Big [ \max _{1\le i\le N} \big (x_0^{i,l} -{{\bar{x}}}^l_0\big )^2\Big ], \end{aligned} \end{aligned}$$
(5.1)

for some \(0<\varepsilon <1\). Then, for a solution \(\{X_n^i\}_{1\le i\le N}\) to (1.1),

$$\begin{aligned} \Big | \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega } L(X_\infty ) - L_m \Big | \le \delta + \Big | \frac{\log \varepsilon }{\beta } \Big |. \end{aligned}$$
(5.2)
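Condition (5.1) involves only the scheme parameters and the initial law, so it can be estimated numerically before running the scheme. The following is a hedged Monte Carlo sketch of such a check; the helper name check_condition_51 and the sampler interface sample_Xin are ours, and the expectations are replaced by empirical averages, so the output is only an estimate of whether (5.1) holds.

```python
import numpy as np

def check_condition_51(L, sample_Xin, N, d, beta, gamma, zeta, eps,
                       C_L, L_m, n_mc=2000, seed=0):
    """Monte Carlo estimate of condition (5.1); sample_Xin(rng, shape)
    draws i.i.d. copies of X_in with the given shape."""
    rng = np.random.default_rng(seed)
    # Left-hand side: (1 - eps) * E[exp(-beta * L(X_in))].
    Z = sample_Xin(rng, (n_mc, d))
    lhs = (1.0 - eps) * np.mean(np.exp(-beta * np.apply_along_axis(L, 1, Z)))
    # Constant multiplying the sum on the right-hand side of (5.1).
    c = (2.0 * C_L * np.sqrt((1.0 + (1.0 - gamma)**2 + zeta**2)
                             * (gamma**2 + zeta**2))
         * beta * np.exp(-beta * L_m)
         / (1.0 - np.exp(-(1.0 - (gamma - 1.0)**2 - zeta**2))))
    # sum_l E[max_i (x_0^{i,l} - xbar_0^l)^2], averaged over ensembles.
    spread = 0.0
    for _ in range(n_mc):
        X0 = sample_Xin(rng, (N, d))
        spread += ((X0 - X0.mean(axis=0)) ** 2).max(axis=0).sum()
    return lhs >= c * spread / n_mc
```

For the uniform law sketched above, one would pass, e.g., sample_Xin = lambda rng, shape: rng.uniform(-r, r, size=shape).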

Before we provide a proof, we give several comments on the result of this theorem.

1. In the proof of Theorem 5.1, we will first derive the estimate:

$$\begin{aligned} \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega } L(X_\infty )\le \sup \limits _{x\in {{\tilde{D}}}} L(x)-\frac{1}{\beta }\log \varepsilon . \end{aligned}$$
(5.3)

2. For any given \(\delta > 0\), \(\varepsilon > 0\) and \(\beta > 0\), the conditions in (5.1) can be attained with a suitable \({X_\text {in}}\). To see this, we choose the law of \({X_\text {in}}\) such that \({{\tilde{D}}}\) is a small neighborhood of a global minimum \(X_*\) satisfying the following two relations:

$$\begin{aligned} \sup \limits _{x\in {{\tilde{D}}}} L(x)-L_m<\delta , \end{aligned}$$
(5.4)

and

$$\begin{aligned} 1-\varepsilon \ge \frac{2C_L \sqrt{ \big (1+ (1-\gamma )^2+\zeta ^2\big ) \big ( \gamma ^2+\zeta ^2\big )}\, \beta e^{\beta (\sup \limits _{x\in {{\tilde{D}}}}L(x)- L_m)}}{1-e^{-[ 1-(\gamma - 1)^2 - \zeta ^2]}} \big ( {\text {diam}}({{\mathcal {R}}}_{in})\big )^2, \end{aligned}$$
(5.5)

where \({{\mathcal {R}}}_{in} = [a_1, b_1] \times \cdots \times [a_d, b_d]\) is the smallest closed d-dimensional rectangle containing \({{\tilde{D}}}\) so that

$$\begin{aligned} {\mathbb {E}} \Big [ \max _{1\le i\le N} \big (x_0^{i,l} -{{\bar{x}}}^l_0\big )^2 \Big ] \le |b_l - a_l |^2, \end{aligned}$$
(5.6)

for each \(l = 1, \ldots , d\). Then, due to (5.6), the relation (5.5) implies the condition (5.1)\(_2\). Hence, we can apply the estimates (5.3) and (5.4) to get the desired error estimate:

$$\begin{aligned} \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega } L(X_\infty )\le L_m+\left( \sup \limits _{x\in {{\tilde{D}}}} L(x)-L_m\right) -\frac{1}{\beta }\log \varepsilon <L_m+\delta -\frac{\log \varepsilon }{\beta }. \end{aligned}$$

Now we are ready to provide a proof of Theorem 5.1. As in the proof of Theorem 3.2, we obtain

$$\begin{aligned} {\mathbb {E}}\, e^{-\beta L(X_\infty )}&\ge \frac{1}{N}\sum _{i=1}^N {\mathbb {E}}\, e^{-\beta L(X_0^i)} \\&\quad -\frac{2C_L \sqrt{ \big (1+ (1-\gamma )^2+\zeta ^2\big ) \big ( \gamma ^2+\zeta ^2\big )}\, \beta e^{-\beta L_m}}{1-e^{-[ 1-(\gamma - 1)^2 - \zeta ^2]}} \sum _{l=1}^d {\mathbb {E}}\Big [ \max _{1\le i\le N} \big (x_0^{i,l} -{{\bar{x}}}^l_0\big )^2\Big ] \\&\ge \varepsilon \, {\mathbb {E}}\, e^{-\beta L({X_\text {in}})}. \end{aligned}$$

Hence one has

$$\begin{aligned} e^{-\beta \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega } L(X_\infty )} ={\mathbb {E}}\, e^{-\beta \mathop {\mathrm{essinf}}\limits _{\omega \in \Omega }L(X_\infty )}\ge {\mathbb {E}}\, e^{-\beta L(X_\infty )}\ge \varepsilon \, {\mathbb {E}}\, e^{-\beta L({X_\text {in}})}\ge \varepsilon \, e^{-\beta \sup \limits _{x\in {{\tilde{D}}}} L(x)}. \end{aligned}$$

Finally, we take the logarithm of both sides of the above relation to get the desired estimate (5.3).
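As a plausibility check of (5.2) (not a verification: the estimate concerns an essential infimum over \(\omega \in \Omega \), whereas a single run samples only one \(\omega \)), one can combine the illustrative helpers sketched earlier and compare the optimality gap at the consensus state with the bound \(\delta + |\log \varepsilon |/\beta \):

```python
# Parameters with (gamma - 1)**2 + zeta**2 < 1, as required by Theorem 5.1.
beta, gamma, zeta, eps = 30.0, 0.5, 0.4, 0.5
L_m = 1.0
delta = d * (1.0 - np.cos(r)) + 1e-9     # sup_{tilde D} L - L_m < delta

X_final = run_cbo(L, X0, beta, gamma, zeta, n_steps=200)
X_inf = X_final.mean(axis=0)             # the ensemble has contracted by now
gap = L(X_inf) - L_m
bound = delta + abs(np.log(eps)) / beta  # right-hand side of (5.2)
print(f"gap = {gap:.4f}, bound from (5.2) = {bound:.4f}")
```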

About this article

Cite this article

Ha, SY., Jin, S. & Kim, D. Convergence and error estimates for time-discrete consensus-based optimization algorithms. Numer. Math. 147, 255–282 (2021). https://doi.org/10.1007/s00211-021-01174-y

