Abstract
We propose and study a version of simulated annealing (SA) on continuous state spaces based on \((t,s)_R\)-sequences. The parameter \(R\in \bar{\mathbb {N}}\) regulates the degree of randomness of the input sequence, with the case \(R=0\) corresponding to IID uniform random numbers and the limiting case \(R=\infty \) to (t, s)-sequences. Our main result, obtained for rectangular domains, shows that the resulting optimization method, which we refer to as QMC-SA, converges almost surely to the global optimum of the objective function \(\varphi \) for any \(R\in \mathbb {N}\). When \(\varphi \) is univariate, we are in addition able to show that the completely deterministic version of QMC-SA is convergent. A key property of these results is that they do not require objective-dependent conditions on the cooling schedule. As a corollary of our theoretical analysis, we provide a new almost sure convergence result for SA which shares this property under minimal assumptions on \(\varphi \). We further explain how our results in fact apply to a broader class of optimization methods including for example threshold accepting, for which to our knowledge no convergence results currently exist. We finally illustrate the superiority of QMC-SA over SA algorithms in a numerical study.
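To fix ideas, the scheme studied in the paper can be sketched in a deliberately simplified form: one-dimensional state space, uniform proposal kernel (so that \(F_K^{-1}(\mathbf {x},u)=u\)), and the base-2 radical-inverse (van der Corput) sequence standing in for the \((t,s)_R\)-sequence. All function names below are ours and the pairing of a base-2 proposal stream with a base-3 acceptance stream is an illustrative device only, not the exact algorithm analyzed in the paper.

```python
import math

def van_der_corput(n, base=2):
    """Radical inverse of n: the one-dimensional prototype of a (t, s)-sequence."""
    q, denom = 0.0, 1.0
    while n:
        n, rem = divmod(n, base)
        denom *= base
        q += rem / denom
    return q

def qmc_sa(phi, n_iter=2**12):
    """Maximize phi on [0, 1] by annealing driven by a deterministic QMC stream."""
    x = 0.5
    best_x, best_val = x, phi(x)
    for n in range(1, n_iter):
        y = van_der_corput(n, base=2)      # proposal: uniform kernel, F_K^{-1}(x, u) = u
        u = van_der_corput(n, base=3)      # second stream, used for the accept step
        temp = 1.0 / math.log(2.0 + n)     # cooling schedule (no phi-dependent tuning)
        if phi(y) >= phi(x) or u < math.exp((phi(y) - phi(x)) / temp):
            x = y
        if phi(x) > best_val:
            best_x, best_val = x, phi(x)
    return best_x, best_val
```

In the paper's setting the two streams would instead be coordinates of a single \((d+1)\)-dimensional \((t,s)_R\)-point, and the proposal would be obtained by Rosenblatt inversion of a general kernel \(K\).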
References
Alabduljabbar, A., Milanovic, J., Al-Eid, E.: Low discrepancy sequences based optimization algorithm for tuning PSSs. In: Proceedings of the 10th International Conference on Probabilistic Methods Applied to Power Systems, PMAPS’08, pp. 1–9. IEEE (2008)
Althöfer, I., Koschnick, K.-U.: On the convergence of “Threshold accepting”. Appl. Math. Optim. 24(1), 183–195 (1991)
Andrieu, C., Breyer, L.A., Doucet, A.: Convergence of simulated annealing using Foster–Lyapunov criteria. J. Appl. Prob. 38(4), 975–994 (2001)
Andrieu, C., Doucet, A.: Simulated annealing for maximum a posteriori parameter estimation of hidden Markov models. IEEE Trans. Inf. Theory 46(3), 994–1004 (2000)
Bélisle, C.J.P.: Convergence theorems for a class of simulated annealing algorithms on \(\mathbb{R}^d\). J. Appl. Prob. 29(4), 885–895 (1992)
Bornn, L., Shaddick, G., Zidek, J.V.: Modeling nonstationary processes through dimension expansion. J. Am. Stat. Assoc. 107(497), 281–289 (2012)
Chen, J., Suarez, J., Molnar, P., Behal, A.: Maximum likelihood parameter estimation in a stochastic resonate-and-fire neuronal model. In: 2011 IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), pp. 57–62. IEEE (2011)
Chen, S., Luk, B.L.: Adaptive simulated annealing for optimization in signal processing applications. Signal Process. 79(1), 117–128 (1999)
Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)
Dueck, G., Scheuer, T.: Threshold accepting: a general purpose optimization algorithm appearing superior to simulated annealing. J. Comput. Phys. 90(1), 161–175 (1990)
Fang, K., Winker, P., Hickernell, F.J.: Some global optimization algorithms in statistics. In: Du, D.Z., Zhang, Z.S., Cheng, K. (eds.) Operations Research and Its Applications. Lecture Notes in Operations Research, vol. 2. World Publishing Corp, New York (1996)
Fang, K.T.: Some applications of quasi-Monte Carlo methods in statistics. In: Monte Carlo and Quasi-Monte Carlo Methods 2000, pp. 10–26. Springer (2002)
Gelfand, S.B., Mitter, S.K.: Recursive stochastic algorithms for global optimization in \(\mathbb{R}^d\). SIAM J. Control Optim. 29(5), 999–1018 (1991)
Gelfand, S.B., Mitter, S.K.: Metropolis-type annealing algorithms for global optimization in \(\mathbb{R}^d\). SIAM J. Control Optim. 31(1), 111–131 (1993)
Geman, S., Hwang, C.-R.: Diffusions for global optimization. SIAM J. Control Optim. 24(5), 1031–1043 (1986)
Gerber, M., Chopin, N.: Sequential Quasi-Monte Carlo. J. R. Stat. Soc. B 77(3), 509–579 (2015)
Girard, T., Staraj, R., Cambiaggio, E., Muller, F.: A simulated annealing algorithm for planar or conformal antenna array synthesis with optimized polarization. Microw. Opt. Technol. Lett. 28(2), 86–89 (2001)
Goffe, W.L., Ferrier, G.D., Rogers, J.: Global optimization of statistical functions with simulated annealing. J. Econom. 60(1), 65–99 (1994)
Haario, H., Saksman, E.: Simulated annealing process in general state space. Adv. Appl. Probab. 23, 866–893 (1991)
He, Z., Owen, A.B.: Extensible grids: uniform sampling on a space filling curve. J. R. Stat. Soc.: Ser. B (2015)
Hickernell, F.J., Yuan, Y.-X.: A simple multistart algorithm for global optimization. OR Trans. 1(2), 1–12 (1997)
Hong, H.S., Hickernell, F.J.: Algorithm 823: implementing scrambled digital sequences. ACM Trans. Math. Softw. 29(2), 95–109 (2003)
Ingber, L.: Very fast simulated re-annealing. Math. Comput. Model. 12(8), 967–973 (1989)
Ireland, J.: Simulated annealing and Bayesian posterior distribution analysis applied to spectral emission line fitting. Solar Phys. 243(2), 237–252 (2007)
Jiao, Y.-C., Dang, C., Leung, Y., Hao, Y.: A modification to the new version of Price’s algorithm for continuous global optimization problems. J. Global Optim. 36(4), 609–626 (2006)
Lecchini-Visintini, A., Lygeros, J., Maciejowski, J.M.: Stochastic optimization on continuous domains with finite-time guarantees by Markov Chain Monte Carlo methods. IEEE Trans. Autom. Control 55(12), 2858–2863 (2010)
Lei, G.: Adaptive random search in quasi-Monte Carlo methods for global optimization. Comput. Math. Appl. 43(6), 747–754 (2002)
Locatelli, M.: Convergence properties of simulated annealing for continuous global optimization. J. Appl. Prob. 33(4), 1127–1140 (1996)
Locatelli, M.: Convergence of a simulated annealing algorithm for continuous global optimization. J. Global Optim. 18(3), 219–233 (2000)
Locatelli, M.: Simulated annealing algorithms for continuous global optimization. In: Handbook of global optimization, pp. 179–229. Springer (2002)
Moscato, P., Fontanari, J.F.: Stochastic versus deterministic update in simulated annealing. Phys. Lett. A 146(4), 204–208 (1990)
Niederreiter, H.: A quasi-Monte Carlo method for the approximate computation of the extreme values of a function. In: Studies in Pure Mathematics, pp. 523–529. Springer (1983)
Niederreiter, H.: Point sets and sequences with small discrepancy. Monatshefte für Mathematik 104(4), 273–337 (1987)
Niederreiter, H.: Random number generation and quasi-Monte Carlo methods. In: CBMS-NSF Regional Conference Series in Applied Mathematics (1992)
Niederreiter, H., Peart, P.: Localization of search in quasi-Monte Carlo methods for global optimization. SIAM J. Sci. Stat. Comput. 7(2), 660–664 (1986)
Nikolaev, A.G., Jacobson, S.H.: Simulated annealing. In: Handbook of Metaheuristics, pp. 1–39. Springer (2010)
Owen, A.B.: Randomly permuted \((t, m, s)\)-nets and \((t, s)\)-sequences. In: Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing. Lecture Notes in Statististics, vol. 106, pp. 299–317. Springer, New York (1995)
Pistovčák, F., Breuer, T.: Using quasi-Monte Carlo scenarios in risk management. In: Monte Carlo and Quasi-Monte Carlo Methods 2002, pp. 379–392. Springer (2004)
Robert, C.P., Casella, G.: Monte Carlo Statistical Methods, 2nd edn. Springer-Verlag, New York (2004)
Rosenblatt, M.: Remarks on a multivariate transformation. Ann. Math. Stat. 23(3), 470–472 (1952)
Rubenthaler, S., Rydén, T., Wiktorsson, M.: Fast simulated annealing in \(\mathbb{R}^d\) with an application to maximum likelihood estimation in state-space models. Stoch. Process. Appl. 119(6), 1912–1931 (2009)
Winker, P., Maringer, D.: The threshold accepting optimisation algorithm in economics and statistics. In: Optimisation, Econometric and Financial Analysis, pp. 107–125. Springer (2007)
Zhang, H., Bonilla-Petriciolet, A., Rangaiah, G.P.: A review on global optimization methods for phase equilibrium modeling and calculations. Open Thermodyn. J. 5(S1), 71–92 (2011)
Acknowledgments
The authors acknowledge support from DARPA under Grant No. FA8750-14-2-0117. The authors also thank Christophe Andrieu, Pierre Jacob and Art Owen for insightful discussions and useful feedback.
Appendices
Appendix 1: Proof of Lemma 1
Let \(n\in \mathbb {N}\), \((\tilde{\mathbf {x}},\mathbf {x}^{\prime })\in \mathcal {X}^2\), \(\delta _{\mathcal {X}}=0.5\) and \(\delta \in (0,\delta _{\mathcal {X}}]\). Then, by Assumption (A1), \(F_K^{-1}(\tilde{\mathbf {x}},\mathbf {u}_1^n)\in B_{\delta }(\mathbf {x}^{\prime })\) if and only if \( \mathbf {u}_1^n\in F_K(\tilde{\mathbf {x}},B_{\delta }(\mathbf {x}^{\prime }))\). We now show that, for \(\delta \) small enough, there exists a closed hypercube \(W(\tilde{\mathbf {x}},\mathbf {x}^{\prime }, \delta )\subset [0,1)^d\) such that \(W(\tilde{\mathbf {x}},\mathbf {x}^{\prime },\delta )\subseteq F_K(\mathbf {x},B_{\delta }(\mathbf {x}^{\prime }))\) for all \(\mathbf {x}\in B_{v_K(\delta )}(\tilde{\mathbf {x}})\), with \(v_K(\cdot )\) as in the statement of the lemma.
To see this, note that because \(K(\mathbf {x},\mathrm {d}\mathbf {y})\) admits a density \(K(\mathbf {y}|\mathbf {x})\) which is continuous on the compact set \(\mathcal {X}^2\), Assumption (A2) implies that there exists a constant \(\tilde{K}>0\) such that, for \(i\in 1:d\), \(K_i(y_i|y_{1:i-1},\mathbf {x})\ge \tilde{K}\) for all \((\mathbf {x},\mathbf {y})\in \mathcal {X}^2\). Consequently, for any \(\delta \in [0,0.5]\) and \((\mathbf {x},\mathbf {y})\in \mathcal {X}^2\),
where \(F_{K_i}(\cdot {}|\mathbf {x}, y_{1:i-1})\) denotes the CDF of the probability measure \(K_i(\mathbf {x},y_{1:i-1},\mathrm {d}y_i)\), with the convention that \(F_{K_i}(\cdot {}|\mathbf {x}, y_{1:i-1})=F_{K_1}(\cdot {}|\mathbf {x})\) when \(i=1\). Note that the right-hand side of (6) is \(\tilde{K}\delta \) and not \(2\tilde{K}\delta \) to encompass the case where either \(x_i^{\prime }-\delta \not \in [0,1]\) or \(x_i^{\prime }+\delta \not \in [0,1]\). (Note also that because \(\delta \le 0.5\) we cannot have both \(x_i^{\prime }-\delta \not \in [0,1]\) and \(x_i^{\prime }+\delta \not \in [0,1]\).)
For \(i\in 1:d\) and \(\delta ^{\prime }>0\), let
be the (optimal) modulus of continuity of \(F_{K_i}(\cdot |\cdot )\). Since \(F_{K_i}\) is uniformly continuous on the compact set \([0,1]^{d+i}\), the mapping \(\omega _i(\cdot )\) is continuous and \(\omega _i(\delta ^{\prime })\rightarrow 0\) as \(\delta ^{\prime }\rightarrow 0\). In addition, because \(F_{K_i}\left( \cdot |\mathbf {x}, y_{1:i-1}\right) \) is strictly increasing on [0, 1] for all \((\mathbf {x},\mathbf {y})\in \mathcal {X}^2\), \(\omega _i(\cdot )\) is strictly increasing on (0, 1]. Let \(\tilde{K}\) be small enough so that, for \(i\in 1:d\), \(0.25\tilde{K}\delta _{\mathcal {X}}\le \omega _i(1)\) and let \(\tilde{\delta }_i(\cdot )\) be the mapping \( z\in (0,\delta _{\mathcal {X}}]\longmapsto \tilde{\delta }_i(z)=\omega _i^{-1} (0.25\tilde{K}z)\). Note that the function \(\tilde{\delta }_i(\cdot )\) is independent of \((\tilde{\mathbf {x}},\mathbf {x}^{\prime })\in \mathcal {X}^2\), continuous and strictly increasing on \((0,\delta _{\mathcal {X}}]\), and satisfies \(\tilde{\delta }_i(\delta ^{\prime })\rightarrow 0\) as \(\delta ^{\prime }\rightarrow 0\).
For \(\mathbf {x}\in \mathcal {X}\), \(\delta ^{\prime }>0\) and \(\delta ^{\prime }_i>0\), \(i\in 1:d\), let
and
Then, for any \(\delta ^{\prime }>0\) and for all \((\mathbf {x},y_{1:i-1})\in B_{\tilde{\delta }_i(\delta ^{\prime })}(\tilde{\mathbf {x}})\times B^{i-1}_{\tilde{\delta }_i(\delta ^{\prime })}(\mathbf {x}^{\prime })\), we have
For \(i\in 1:d\) and \(\delta ^{\prime }\in (0,\delta _{\mathcal {X}}]\), let \(\delta _i(\delta ^{\prime })=\tilde{\delta }_i(\delta ^{\prime })\wedge \delta ^{\prime }\) and note that the function \(\delta _i(\cdot )\) is continuous and strictly increasing on \((0,\delta _{\mathcal {X}}]\). Let \(\delta _d=\delta _d(\delta )\) and define recursively \(\delta _{i}=\delta _{i}(\delta _{i+1})\), \(i\in 1:(d-1)\), so that \(\delta \ge \delta _d\ge \dots \ge \delta _1>0\). For \(i\in 1:d\), let
and
Then, since \(F_{K_i}(\cdot |\cdot )\) is continuous and the set \(B_{\delta _{1}}(\tilde{\mathbf {x}})\times B_{\delta _{1:i-1}}(\mathbf {x}^{\prime })\) is compact, there exist points \((\underline{\mathbf {x}}^i,\underline{y}^{i}_{1:i-1})\) and \((\bar{\mathbf {x}}^i,\bar{y}^{i}_{1:i-1})\) in \(B_{\delta _{1}}(\tilde{\mathbf {x}})\times B_{\delta _{1:i-1}}(\mathbf {x}^{\prime })\) such that
In addition, by the construction of the \(\delta _i\)’s, \(B_{\delta _{1}}(\tilde{\mathbf {x}})\times B_{\delta _{1:i-1}}(\mathbf {x}^{\prime })\subseteq B_{\tilde{\delta }_i(\delta _{i})}(\tilde{\mathbf {x}})\times B^i_{\tilde{\delta }_i(\delta _{i})}(\mathbf {x}^{\prime })\) for all \(i\in 1:d\). Therefore, using (6)–(8), we have, for all \(i\in 1:d\),
Consequently, for all \(i\in 1:d\) and for all \((\mathbf {x},y_{1:i-1})\in B_{\delta _{1}}(\tilde{\mathbf {x}})\times B_{\delta _{1:i-1}}(\mathbf {x}^{\prime })\),
Let \(\underline{S}_{\delta }=0.5\tilde{K}\delta _1\). Then, this shows that there exists a closed hypercube \(\underline{W}(\tilde{\mathbf {x}},\mathbf {x}^{\prime },\delta )\) of side \(\underline{S}_{\delta }\) such that
where we set \(v_K(\delta )=\delta _1\). Note that \(v_K(\delta )\in (0,\delta ]\) and thus \(v_K(\delta )\rightarrow 0\) as \(\delta \rightarrow 0\), as required. In addition, \(v_K(\cdot )=\delta _1\circ \dots \circ \delta _d(\cdot )\) is continuous and strictly increasing on \((0,\delta _{\mathcal {X}}]\) because the functions \(\delta _i(\cdot )\), \(i\in 1:d\), are continuous and strictly increasing on this set. Note also that \(v_K(\cdot )\) does not depend on \((\tilde{\mathbf {x}},\mathbf {x}^{\prime })\in \mathcal {X}^2\).
To conclude the proof, let
and note that, if \(\delta \) is small enough, \(k_{\delta }\ge t+d\) because \(\underline{S}_{\delta }\rightarrow 0\) as \(\delta \rightarrow 0\). Let \(\bar{\delta }_K\) be the largest value of \(\delta ^{\prime }\le \delta _{\mathcal {X}}\) such that \(k_{\delta ^{\prime }}\ge t+d\). Let \(\delta \in (0,\bar{\delta }_K]\) and \(t_{\delta ,d}\in t:(t+d)\) be such that \((k_{\delta }-t_{\delta ,d})/d\) is an integer. Let \(\{E(j,\delta )\}_{j=1}^{b^{k_{\delta }-t_{\delta ,d}}}\) be the partition of \([0,1)^d\) into elementary intervals of volume \(b^{t_{\delta ,d}-k_{\delta }}\) so that any closed hypercube of side \(\underline{S}_{\delta }\) contains at least one elementary interval \(E(j,\delta )\) for a \(j\in 1:b^{k_{\delta }-t_{\delta ,d}}\). Hence, there exists a \(j_{\tilde{\mathbf {x}},\mathbf {x}^{\prime }}\in 1:b^{k_{\delta }-t_{\delta ,d}}\) such that
Let \(a\in \mathbb {N}\) and note that, by the properties of (t, s)-sequences in base b, the point set \(\{\mathbf {u}^n\}_{n=ab^{k_{\delta }}}^{(a+1)b^{k_{\delta }}-1}\) is a \((t,k_{\delta },d)\)-net in base b because \(k_{\delta }> t\). In addition, since \(k_{\delta }\ge t_{\delta ,d}\ge t\), this point set is also a \((t_{\delta ,d},k_{\delta },d)\)-net in base b ([34], Remark 4.3, p. 48). Thus, since for \(j\in 1:b^{k_{\delta }-t_{\delta ,d}}\) the elementary interval \(E(j,\delta )\) has volume \(b^{t_{\delta ,d}-k_{\delta }}\), the point set \(\{\mathbf {u}^n\}_{n=ab^{k_{\delta }}}^{(a+1)b^{k_{\delta }}-1}\) contains exactly \(b^{t_{\delta ,d}}\ge b^t\) points in \(E(j_{\tilde{\mathbf {x}},\mathbf {x}^{\prime }},\delta )\), and the proof is complete.
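The counting argument at the end of the proof rests on the defining property of \((t,m,s)\)-nets: each elementary interval of volume \(b^{t-m}\) receives exactly \(b^t\) points. In the simplest case (base \(b=2\), \(t=0\), \(d=1\), the radical-inverse sequence), this can be verified directly; the sketch below, with function names of our choosing, counts the points of a block \(\{u^n\}_{n=ab^{k}}^{(a+1)b^{k}-1}\) falling in each elementary interval.

```python
def van_der_corput(n, base=2):
    """Radical inverse of n in the given base."""
    q, denom = 0.0, 1.0
    while n:
        n, rem = divmod(n, base)
        denom *= base
        q += rem / denom
    return q

def elementary_interval_counts(a, k, base=2):
    """Count points of the block {u^n : a*b^k <= n < (a+1)*b^k} in each elementary
    interval [j/b^k, (j+1)/b^k); for a (0, k, 1)-net every count equals b^0 = 1."""
    counts = [0] * base**k
    for n in range(a * base**k, (a + 1) * base**k):
        counts[int(van_der_corput(n, base) * base**k)] += 1
    return counts
```

Every block of \(2^k\) consecutive indices starting at a multiple of \(2^k\) yields exactly one point per dyadic interval, which is the fact exploited for the hypercube \(\underline{W}(\tilde{\mathbf {x}},\mathbf {x}^{\prime },\delta )\) above.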
Appendix 2: Proof of Lemma 2
Using the Lipschitz property of \(F_{K_i}(\cdot |\cdot )\) for all \(i\in 1:d\), conditions (7) and (8) in the proof of Lemma 1 hold with \( \tilde{\delta }_i(\delta )=\delta (0.25\tilde{K}/C_K)\), \(i\in 1:d\). Hence, we can take \(v_K(\delta )=\delta (0.25\tilde{K}/C_K)^{d}\wedge \delta \) and thus \(\underline{S}_{\delta }= \delta 0.5\tilde{K}\big (1 \wedge (0.25\tilde{K}/C_K)^{d}\big )\). Then, the expression for \(k_{\delta }\) follows using (9) while the expression for \(\bar{\delta }_{K}\le 0.5\) results from the condition \(k_{\delta }\ge t+d\) for all \(\delta \in (0,\bar{\delta }_{K}]\).
Appendix 3: Proof of Lemma 3
We first state and prove three technical lemmas.
Lemma 6
Let \(\mathcal {X}=[0,1]^d\) and \(K:\mathcal {X}\rightarrow \mathcal {P}(\mathcal {X})\) be a Markov kernel which verifies Assumptions (A1)-(A2). Then, for any \(\delta \in (0,\bar{\delta }_K]\), with \(\bar{\delta }_K\) as in Lemma 1, and any \((\tilde{\mathbf {x}},\mathbf {x}^{\prime })\in \mathcal {X}^2\), there exists a closed hypercube \(\bar{W}(\tilde{\mathbf {x}},\mathbf {x}^{\prime },\delta )\subset [0,1)^d\) of side \(\bar{S}_{\delta }=2.5\bar{K}\delta \), with \(\bar{K}=\max _{i\in 1:d}\{\sup _{\mathbf {x},\mathbf {y}\in \mathcal {X}}K_i(y_i|y_{1:i-1},\mathbf {x})\}\), such that
where \(v_K(\cdot )\) is as in Lemma 1.
Proof
The proof of Lemma 6 is similar to that of Lemma 1, and below we use the same notation as there.
Let \(\delta \in (0,\bar{\delta }_K]\), \((\tilde{\mathbf {x}},\,\mathbf {x}^{\prime })\in \mathcal {X}^2\) and note that, for any \((\mathbf {x},\mathbf {y})\in \mathcal {X}^2\),
Let \(0<\delta _1\le \dots \le \delta _d\le \delta \) be as in the proof of Lemma 1 and, for \(i\in 1:d\), define
and
Let \(i\in 1:d\) and \((\underline{\mathbf {x}}^i,\underline{\mathbf {y}}^i),(\bar{\mathbf {x}}^i,\bar{\mathbf {y}}^i)\in B_{v_K(\delta )}(\tilde{\mathbf {x}})\times B_{\delta _{1:i-1}}(\mathbf {x}^{\prime })\) be such that
Therefore, using (7), (8) and (10), we have, \(\forall i\in 1:d\),
where \(\tilde{K}\le \bar{K}\) is as in the proof of Lemma 1. (Note that \(\bar{u}_i(\tilde{\mathbf {x}},\mathbf {x}^{\prime },\delta _{1:i})-\underline{u}_i(\tilde{\mathbf {x}},\mathbf {x}^{\prime },\delta _{1:i})\) is indeed strictly positive because \(F_{K_i}(\cdot |\mathbf {x},y_{1:i-1})\) is strictly increasing on [0, 1] for any \((\mathbf {x},\mathbf {y})\in \mathcal {X}^2\) and because \(\delta _i>0\).)
This shows that for all \(\mathbf {x}\in B_{v_K(\delta )}(\tilde{\mathbf {x}})\) and for all \(\mathbf {y}\in B_{\delta _{1:i-1}}(\mathbf {x}^{\prime })\), we have
and thus there exists a closed hypercube \(\bar{W}(\tilde{\mathbf {x}},\mathbf {x}^{\prime },\delta )\) of side \(\bar{S}_{\delta }=2.5\delta \bar{K}\) such that
To conclude the proof of Lemma 6, note that, because \(v_K(\delta )\le \delta _i\) for all \(i\in 1:d\),
Lemma 7
Consider the set-up of Lemma 3 and, for \((p,a,k)\in \mathbb {N}_+^3\), let
Then, for all \(k\in \mathbb {N}\), there exists a \(p^*_k\in \mathbb {N}\) such that \(\mathrm {Pr}\big (\bigcap _{a\in \mathbb {N}}E^{p}_{a,k}\big )=0\) for all \(p\ge p^*_k\).
Proof
Let \(\epsilon >0\), \(a\in \mathbb {N}\) and \(l\in \mathbb {R}\) be such that \(l<\varphi ^*\), and for \(k\in \mathbb {N}\), let \(E(k)=\{E(j,k)\}_{j=1}^{k^d}\) be the splitting of \(\mathcal {X}\) into closed hypercubes of volume \(k^{-d}\).
Let \(p^{\prime }\in \mathbb {N}_+\), \(\delta =2^{-p^{\prime }}\) and \(P^l_{\epsilon ,\delta }\subseteq E(\delta )\) be the smallest covering of \((\mathcal {X}_{l})_{\epsilon }\) by hypercubes in \(E(\delta )\); that is, \(|P^l_{\epsilon ,\delta }|\) is the smallest value in \(1:\delta ^{-d}\) such that \((\mathcal {X}_{l})_{\epsilon }\subseteq \cup _{W\in P^l_{\epsilon ,\delta }}W\). Let \(J^l_{\epsilon ,\delta }\subseteq 1:\delta ^{-d}\) be such that \(j\in J^l_{\epsilon ,\delta }\) if and only if \(E(j,\delta )\in P^l_{\epsilon ,\delta }\). We now bound \(|J^l_{\epsilon ,\delta }|\) following the same idea as in [20].
By assumption, there exists a constant \(\bar{M}<\infty \) independent of l such that \(M(\mathcal {X}_{l})\le \bar{M}\). Hence, for any fixed \(w>1\) there exists an \(\epsilon ^*\in (0,1)\) (independent of l) such that \(\lambda _d\big ((\mathcal {X}_{l})_{\epsilon }\big )\le w M(\mathcal {X}_{l})\epsilon \le w \bar{M}\epsilon \) for all \(\epsilon \in (0,\epsilon ^*]\). Let \(\epsilon =2^{-p}\), with \(p\in \mathbb {N}\) such that \(2^{-p}\le 0.5\epsilon ^*\), and take \(\delta _{\epsilon }=2^{-p-1}\). Then, we have the inclusions \((\mathcal {X}_{l})_{\epsilon }\subseteq \cup _{W\in P^l_{\epsilon ,\delta _{\epsilon }}}W \subseteq (\mathcal {X}_{l})_{2\epsilon }\) and therefore, since \(2\epsilon \le \epsilon ^*\),
where the right-hand side is independent of l.
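The displayed bound referred to here was not reproduced above; a reconstruction consistent with the surrounding definitions (and with the use of (12) for \(d=1\) in the proof of Theorem 2) is the following: the hypercubes in \(P^l_{\epsilon ,\delta _{\epsilon }}\) have pairwise disjoint interiors, each has volume \(\delta _{\epsilon }^d=(\epsilon /2)^d\), and their union lies in \((\mathcal {X}_{l})_{2\epsilon }\), whence

```latex
|J^l_{\epsilon,\delta_{\epsilon}}|\,\delta_{\epsilon}^{d}
  = \lambda_d\Bigl(\,\bigcup_{W\in P^l_{\epsilon,\delta_{\epsilon}}} W\Bigr)
  \le \lambda_d\bigl((\mathcal{X}_{l})_{2\epsilon}\bigr)
  \le 2 w\bar{M}\epsilon,
\qquad\text{so that}\qquad
|J^l_{\epsilon,\delta_{\epsilon}}| \le 2^{d+1} w\bar{M}\,\epsilon^{1-d}.
```

For \(d=1\) the right-hand side is a constant independent of \(\epsilon \), which matches the claim \(|J^{\bar{\varphi }}_{\epsilon ,\epsilon /2}|\le C^*\) made in the proof of Theorem 2.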
Next, for \(j\in J^l_{\epsilon ,\delta _{\epsilon }}\), let \(\bar{\mathbf {x}}^j\) be the center of \(E(j,\delta _{\epsilon })\) and \(W^l(j,\delta _{\epsilon })=\cup _{j^{\prime }\in J^l_{\epsilon ,\delta _{\epsilon }}} \bar{W}(\bar{\mathbf {x}}^j,\bar{\mathbf {x}}^{j^{\prime }},\delta _{\epsilon })\), with \(\bar{W}(\cdot ,\cdot ,\cdot )\) as in Lemma 6. Then, by Lemma 6, a necessary condition to move at iteration \(n+1\) of Algorithm 1 from a point \(\mathbf {x}^{n}\in E(j_{n},\delta _{\epsilon })\), with \(j_{n}\in J^{l}_{\epsilon ,\delta _{\epsilon }}\), to a point \(\mathbf {x}^{n+1}\ne \mathbf {x}^{n}\) such that \(\mathbf {x}^{n+1}\in E(j_{n+1},\delta _{\epsilon })\) for a \(j_{n+1}\in J^{l}_{\epsilon ,\delta _{\epsilon }}\) is that \(\mathbf {u}_R^{n+1}\in W^{l}(j_{n},\delta _{\epsilon })\).
Let \(k^{\delta _{\epsilon }}\) be the largest integer such that (i) \(b^{k}\le \bar{S}_{\delta _{\epsilon }}^{-d}b^t\), with \(\bar{S}_{\delta _{\epsilon }}=2.5\bar{K}\delta _{\epsilon }\), \(\bar{K}<\infty \), as in Lemma 6, and (ii) \((k-t)/d\) is a positive integer (if necessary reduce \(\epsilon \) to fulfil this last condition). Let \(E^{\prime }(\delta _{\epsilon })=\{E^{\prime }(k,\delta _{\epsilon })\}_{k=1}^{b^{k^{\delta _{\epsilon }}-t}}\) be the partition of \([0,1)^d\) into hypercubes of volume \(b^{t-k^{\delta _{\epsilon }}}\). Then, for all \(j\in J^l_{\epsilon ,\delta _{\epsilon }}\), \(W^l(j,\delta _{\epsilon })\) is covered by at most \(2^d|J^l_{\epsilon ,\delta _{\epsilon }}|\) hypercubes of \(E^{\prime }(\delta _{\epsilon })\).
Let \(\epsilon \) be small enough so that \(k^{\delta _{\epsilon }}>t+dR\). Then, using the properties of \((t,s)_R\)-sequences (see Section 3.1), it is easily checked that, for all \(n\ge 0\),
Thus, using (12)–(13) and the definition of \(k^{\delta _{\epsilon }}\), we obtain, for all \(j\in J^l_{\epsilon ,\delta _{\epsilon }}\) and \(n\ge 0\),
Consequently, using the definition of \(\epsilon \) and \(\delta _{\epsilon }\), and the fact that there exist at most \(2^d\) values of \(j\in J^l_{\epsilon ,\delta _{\epsilon }}\) such that, for \(n\in \mathbb {N}\), we have \(\mathbf {x}^{n}\in E(j,\delta _{\epsilon })\), we deduce that, for a \(p^*\in \mathbb {N}\) large enough (i.e. for \(\epsilon =2^{-p^*}\) small enough)
implying that, for \(p\ge p^*\),
Finally, because the uniform random numbers \(\mathbf {z}^n\) in \([0,1)^s\) that enter into the definition of \((t,s)_R\)-sequences are IID, this shows that
To conclude the proof, for \(k\in \mathbb {N}\) let \(\rho _k\in (0,1)\) and \(p^*_k\ge p^*\) be such that
Then, \(\mathrm {Pr}\big (\cap _{a\in \mathbb {N}} E^p_{a,k})=0\) for all \(p\ge p^*_k\), as required.
Lemma 8
Consider the set-up of Lemma 3. For \(k\in \mathbb {N}\), let \(\tilde{E}(dk)=\{\tilde{E}(j,k)\}_{j=1}^{b^{dk}}\) be the partition of \([0,1)^d\) into hypercubes of volume \(b^{-dk}\). Let \(k^{R} \in (dR+t):(dR+t+d)\) be the smallest integer k such that \((k-t)/d\) is an integer and \((k-t)/d\ge R\) and, for \(m\in \mathbb {N}\), let \(I_m=\{mb^{k^R},\dots ,(m+1)b^{k^R}-1\}\). Then, for any \(\delta \in (0,\bar{\delta }_K]\) verifying \(k_\delta > t+d+dR\) (with \(\bar{\delta }_K\) and \(k_\delta \) as in Lemma 1), there exists a \(p(\delta )>0\) such that
where \(t_{\delta ,d}\in t:(t+d)\) is such that \((k_{\delta }-t_{\delta ,d})/d\in \mathbb {N}\).
Proof
Let \(m\in \mathbb {N}\) and note that, by the properties of \((t,s)_R\)-sequences, the point set \(\{\mathbf {u}_{\infty }^{n}\}_{n\in I_m}\) is a \((t,{k^R},d)\)-net in base b. Thus, for all \(j\in 1:b^{k^R-t}\), this point set contains \(b^t\) points in \(\tilde{E}(j, k^{R}-t)\) and, consequently, for all \(j\in 1:b^{dR}\), it contains \(b^tb^{k^R-t-dR}=b^{k^R-dR}\ge 1\) points in \(\tilde{E}(j,dR)\). This implies that, for all \(j\in 1:b^{dR}\), the point set \(\{\mathbf {u}_R^{n}\}_{n\in I_m}\) contains \(b^{k^R-dR}\ge 1\) points in \(\tilde{E}(j,dR)\) where, for all \( n\in I_{m}\), \(\mathbf {u}_R^{n}\) is uniformly distributed in \(\tilde{E}(j_{n},dR)\) for a \(j_n\in 1:b^{dR}\).
In addition, it is easily checked that each hypercube of the set \(\tilde{E}(dR)\) contains
hypercubes of the set \(\tilde{E}(k_{\delta }-t_{\delta ,d})\), where \(k_{\delta }\) and \(t_{\delta ,d}\) are as in the statement of the lemma. Note that the last inequality holds because \(\delta \) is chosen so that \(k_{\delta }> t+d+dR\). Consequently,
and the proof is complete.
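The count elided in the sentence "each hypercube of the set \(\tilde{E}(dR)\) contains … hypercubes of the set \(\tilde{E}(k_{\delta }-t_{\delta ,d})\)" can be recovered, as a reconstruction on our part, by comparing the volumes of the two partitions: cubes of \(\tilde{E}(dR)\) have volume \(b^{-dR}\), cubes of \(\tilde{E}(k_{\delta }-t_{\delta ,d})\) have volume \(b^{t_{\delta ,d}-k_{\delta }}\), and the two partitions nest, so each cube of the former contains

```latex
\frac{b^{-dR}}{b^{\,t_{\delta,d}-k_{\delta}}}
  = b^{\,k_{\delta}-t_{\delta,d}-dR}
  \ge b^{\,k_{\delta}-(t+d)-dR}
  \ge b > 1
```

cubes of the latter, the last inequality using \(t_{\delta ,d}\le t+d\) and the standing assumption \(k_{\delta }> t+d+dR\).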
Proof of Lemma 3:
To prove the lemma we need to introduce some additional notation. Let \(\varOmega =[0,1)^{\mathbb {N}}\), \(\mathcal {B}([0,1))\) be the Borel \(\sigma \)-algebra on \([0,1)\), \(\mathcal {F}= \mathcal {B}([0,1))^{\otimes \mathbb {N}}\) and \(\mathbb {P}\) be the probability measure on \((\varOmega ,\mathcal {F})\) defined by
Next, for \(\omega \in \varOmega \), we denote by \(\big (\mathbf {U}_R^n(\omega )\big )_{n\ge 0}\) the sequence of points in \([0,1)^{d}\) defined, for all \(n\ge 0\), by (using the convention that empty sums are null),
Note that, under \(\mathbb {P}\), \(\big (\mathbf {U}_R^n\big )_{n\ge 0}\) is a \((t,d)_R\)-sequence in base b. Finally, for \(\omega \in \varOmega \), we denote by \(\big (\mathbf {x}^n_\omega \big )_{n\ge 0}\) the sequence of points in \(\mathcal {X}\) generated by Algorithm 1 when the sequence \(\big (\mathbf {U}_R^n(\omega )\big )_{n\ge 0}\) is used as input.
Under the assumptions of the lemma there exists a set \(\varOmega _1\in \mathcal {F}\) such that \(\mathbb {P}(\varOmega _1)=1\) and
Let \(\omega \in \varOmega _1\). Since \(\varphi \) is continuous, for any \(\epsilon >0\) there exists a \(N_{\omega , \epsilon }\in \mathbb {N}\) such that \(\mathbf {x}^n_\omega \in (\mathcal {X}_{\bar{\varphi }_\omega })_{\epsilon }\) for all \(n\ge N_{\omega ,\epsilon }\), where we recall that \((\mathcal {X}_{\bar{\varphi }_\omega })_{\epsilon }=\{\mathbf {x}\in \mathcal {X}:\exists \mathbf {x}^{\prime }\in \mathcal {X}_{\bar{\varphi }_\omega }\text { such that } \Vert \mathbf {x}-\mathbf {x}^{\prime }\Vert _{\infty }\le \epsilon \}\). In addition, because \(\varphi \) is continuous and \(\mathcal {X}\) is compact, there exists an integer \(p_{\omega , \epsilon }\in \mathbb {N}\) such that we have both \(\lim _{\epsilon \rightarrow 0}p_{\omega ,\epsilon }=\infty \) and
Next, let \(\mathbf {x}^*\in \mathcal {X}\) be such that \(\varphi (\mathbf {x}^*)=\varphi ^*\), \(k^R \in (dR+t):(dR+t+d)\) be as in Lemma 8 and, for \((p,a,k)\in \mathbb {N}_+^3\), let
Then, by Lemma 7, there exists a \(p^*\in \mathbb {N}\) such that \(\mathbb {P}\big (\cap _{a\in \mathbb {N}} \tilde{E}^p_{a,k^R}\big )=0\) for all \(p\ge p^*\), and thus the set \(\hat{\varOmega }_2=\cap _{p\ge p^*}\big (\varOmega \setminus \cap _{a\in \mathbb {N}} \tilde{E}^p_{a,k^R}\big )\) verifies \(\mathbb {P}(\hat{\varOmega }_2)=1\). Let \(\varOmega _2=\varOmega _1\cap \hat{\varOmega }_2\) so that \(\mathbb {P}(\varOmega _2)=1\).
For \(\omega \in \varOmega _2\) let \(\epsilon _\omega >0\) be small enough so that, for any \(\epsilon \in (0,\epsilon _\omega ]\), we can take \(p_{\omega ,\epsilon }\ge p^*\) in (14). Then, for any \(\omega \in \varOmega _2\) such that \(\bar{\varphi }_\omega <\varphi ^*\), there exists a subsequence \((m_i)_{i\ge 1}\) of \((m)_{m\ge 1}\) such that, for all \(i\ge 1\), either
or
Assume first that there exist infinitely many \(i\in \mathbb {N}\) such that (15) holds. Then, by (14), this leads to a contradiction with the fact that \(\omega \in \varOmega _2\subseteq \varOmega _1\). Therefore, for any \(\omega \in \varOmega _2\) such that \(\bar{\varphi }_{\omega }<\varphi ^*\) there exists a subsequence \((m_i)_{i\ge 1}\) of \((m)_{m\ge 1}\) such that, for an \(i^*\) large enough,
Let \( \tilde{\varOmega }_2=\{\omega \in \varOmega _2:\,\bar{\varphi }_{\omega }<\varphi ^*\}\subseteq \varOmega _2\). Then, to conclude the proof, it remains to show that \(\mathbb {P}(\tilde{\varOmega }_2)=0\). We prove this result by contradiction and thus, henceforth, we assume \(\mathbb {P}(\tilde{\varOmega }_2)>0\).
To this end, let \(\mathbf {x}^*\in \mathcal {X}\) be such that \(\varphi (\mathbf {x}^*)=\varphi ^*\), \(\mathbf {x}\in \mathcal {X}\) and \(\delta \in (0,\bar{\delta }_K]\), with \(\bar{\delta }_K\) as in Lemma 1. Then, by Lemma 1, a sufficient condition to have \(F_K^{-1}(\mathbf {x},\mathbf {U}_R^{n}(\omega ))\in B_{\delta }(\mathbf {x}^*)\), \(n\ge 1\), is that \(\mathbf {U}_R^{n}(\omega )\in \underline{W}(\mathbf {x},\mathbf {x}^*,\delta )\), with \(\underline{W}(\cdot ,\cdot ,\cdot )\) as in Lemma 1. From the proof of Lemma 1 we know that the hypercube \(\underline{W}(\mathbf {x},\mathbf {x}^*,\delta )\) contains at least one hypercube of the set \(\tilde{E}(k_{\delta }-t_{\delta ,d})\), where \(t_{\delta ,d}\in t:(t+d)\) is such that \((k_{\delta }-t_{\delta ,d})/d\in \mathbb {N}\) and, for \(k\in \mathbb {N}\), \(\tilde{E}(dk)\) is as in Lemma 8. Hence, by Lemma 8, for any \(\delta \in (0,\delta ^*]\), with \(\delta ^*\) such that \(k_{\delta ^*}>t+d+dR\) (where, for \(\delta >0\), \(k_\delta \) is defined in Lemma 1), there exists a \(p(\delta )>0\) such that
and thus, using (16) and under Assumption (A2), it is easily checked that, for any \(\delta \in (0,\delta ^*]\),
where \(\mathbb {P}_2\) denotes the restriction of \(\mathbb {P}\) on \(\tilde{\varOmega }_2\) (recall that we assume \(\mathbb {P}(\tilde{\varOmega }_2)>0\)).
For \(\delta >0\), let
and let \(\tilde{p}^*\in \mathbb {N}\) be such that \(2^{-\tilde{p}^*}\le \delta ^*\). Then, the set \(\varOmega ^{\prime }=\cap _{\tilde{p}\ge \tilde{p}^*}\varOmega ^{\prime }_{2^{-\tilde{p}}}\) verifies \(\mathbb {P}_2(\varOmega ^{\prime })=1\).
To conclude the proof let \(\omega \in \varOmega ^{\prime }\). Then, because \(\varphi \) is continuous and \(\bar{\varphi }_\omega <\varphi ^*\), there exists a \(\tilde{\delta }_{\bar{\varphi }_\omega }>0\) such that \(\varphi (\mathbf {x})>\bar{\varphi }_\omega \) for all \(\mathbf {x}\in B_{\tilde{\delta }_{\bar{\varphi }_\omega }}(\mathbf {x}^*)\). Let \(\delta _{\bar{\varphi }_\omega }:=2^{-\tilde{p}_{\omega ,\epsilon }}\le \tilde{\delta }_{\bar{\varphi }_\omega }\wedge \bar{\delta }_K\) for an integer \(\tilde{p}_{\omega ,\epsilon }\ge \tilde{p}^*\). Next, take \(\epsilon \) small enough so that we have both \(B_{\delta _{\bar{\varphi }_\omega }}(\mathbf {x}^*)\cap (\mathcal {X}_{\bar{\varphi }_\omega })_{\epsilon }=\varnothing \) and \(\varphi (\mathbf {x})\ge \varphi (\mathbf {x}^{\prime })\) for all \((\mathbf {x},\mathbf {x}^{\prime })\in B_{\delta _{\bar{\varphi }_\omega }}(\mathbf {x}^*)\times (\mathcal {X}_{\bar{\varphi }_\omega })_{\epsilon }\).
By the above computations, the set \( B_{\tilde{\delta }_{\bar{\varphi }_\omega }}(\mathbf {x}^*)\) is visited infinitely many times and thus \(\varphi (\mathbf {x}_\omega ^n)>\bar{\varphi }_\omega \) for infinitely many \(n\in \mathbb {N}\), contradicting the fact that \(\varphi (\mathbf {x}_\omega ^n)\rightarrow \bar{\varphi }_\omega \) as \(n\rightarrow \infty \). Hence, the set \(\varOmega ^{\prime }\) is empty. On the other hand, as shown above, under the assumption \(\mathbb {P}(\tilde{\varOmega }_2)>0\) we have \(\mathbb {P}_2(\varOmega ^{\prime })=1\) and, consequently, \(\varOmega ^{\prime }\ne \varnothing \). Therefore, we must have \(\mathbb {P}(\tilde{\varOmega }_2)=0\) and the proof is complete.
Appendix 4: Proof of Theorem 2
Using Lemmas 4 and 5, we know that \(\varphi (x^n)\rightarrow \bar{\varphi }\in \mathbb {R}\) and thus it remains to show that \(\bar{\varphi }=\varphi ^*\).
Assume that \(\bar{\varphi }\ne \varphi ^*\) and, for \(\epsilon =2^{-p}\), \(p\in \mathbb {N}_+\), let \(N_\epsilon \in \mathbb {N}\), \(p_\epsilon \) and \(\delta _{\bar{\varphi }}>0\) be as in the proof of Lemma 3 (with the dependence of \(N_\epsilon \), \(p_\epsilon \) and of \(\delta _{\bar{\varphi }}\) on \(\omega \in \varOmega \) suppressed in the notation for obvious reasons).
Let \(x^*\in \mathcal {X}\) be a global maximizer of \(\varphi \) and \(n=a_nb^{k_{\delta _{\bar{\varphi }}}}-1\) with \(a_n\in \mathbb {N}\) such that \(n>N_{\epsilon }\). For \(k\in \{2^{-p}:\,p\in \mathbb {N}_+\}\), let \(E(k)=\{E(j,k)\}_{j=1}^{k^{-d}}\) be the splitting of \([0,1]^d\) into closed hypercubes of volume \(k^{d}\). Then, by Lemma 6, a necessary condition to have a move at iteration \(n^{\prime }+1\ge 1\) of Algorithm 1 from \(x^{n^{\prime }}\in (\mathcal {X}_{\bar{\varphi }})_{\epsilon }\) to \(x^{n^{\prime }+1}\ne x^{n^{\prime }}\), \(x^{n^{\prime }+1}\in (\mathcal {X}_{\bar{\varphi }})_{\epsilon }\), is that
where, for \(j\in 1:(\epsilon /2)^{-d}\), \(\bar{x}^j\) denotes the center of \(E(j,\epsilon /2)\), \(J^{\bar{\varphi }}_{\epsilon , \epsilon /2}\) is as in the proof of Lemma 7 and \(\bar{W}(\cdot ,\cdot ,\cdot )\) is as in Lemma 6. Note that, using (12) with \(d=1\), \(|J^{\bar{\varphi }}_{\epsilon , \epsilon /2}|\le C^*\) for a constant \(C^*<\infty \) (independent of \(\epsilon \)).
Let \(k^{\delta _{\epsilon }}\) be the largest integer \(k\ge t\) such that \(b^{t-k}\ge \bar{S}_{\epsilon /2}^{d}\), with \(\bar{S}_{\epsilon /2}\) as in Lemma 6, and let \(\epsilon \) be small enough so that \(b^{k^{\delta _{\epsilon }}}>2^dC^*b^t\). The point set \(\{u_{\infty }^{n^{\prime }}\}_{n^{\prime }=a_nb^{k^{\delta _{\epsilon }}}}^{ (a_n+1)b^{k^{\delta _{\epsilon }}}-1}\) is a \((t, k^{\delta _{\epsilon }},d)\)-net in base b and thus the set \(\bar{W}(\epsilon )\) contains at most \(2^dC^* b^t\) points of this point set. Hence, if for \(n>N_{\epsilon }\) only moves inside the set \((\mathcal {X}_{\bar{\varphi }})_{\epsilon }\) occur then, for some \(\tilde{n}\in a_nb^{k^{\delta _{\epsilon }}}:\big ((a_n+1)b^{k^{\delta _{\epsilon }}}-\eta _{\epsilon }-1\big )\), the point set \(\{x^{n^{\prime }}\}_{n^{\prime }=\tilde{n}}^{\tilde{n}+\eta _{\epsilon }}\) is such that \(x^{n^{\prime }}=x^{\tilde{n}}\) for all \(n^{\prime }\in \tilde{n}:(\tilde{n}+\eta _{\epsilon })\), where \(\eta _{\epsilon }\ge \lfloor b^{k^{\delta _{\epsilon }}}/(2^{d+1}C^*b^t)\rfloor \); note that \(\eta _{\epsilon }\rightarrow \infty \) as \(\epsilon \rightarrow 0\).
Let \(k^{\epsilon }_0\) be the largest integer that satisfies \(\eta _{\epsilon }\ge 2b^{k^{\epsilon }_0}\), so that \(\{u_{\infty }^n\}_{n=\tilde{n}}^{\tilde{n}+\eta _{\epsilon }}\) contains at least one \((t,k^{\epsilon }_0,d)\)-net in base b. Note that \(k^{\epsilon }_0\rightarrow \infty \) as \(\epsilon \rightarrow 0\), and let \(\epsilon \) be small enough so that \(k^{\epsilon }_0\ge k_{\delta _{\bar{\varphi }}}\), with \(k_{\delta }\) as in Lemma 1. Then, by Lemma 1, there exists at least one \(n^*\in (\tilde{n}+1):(\tilde{n}+\eta _{\epsilon })\) such that \(\tilde{y}^{n^*}:=F_K^{-1}(x^{\tilde{n}},u_{\infty }^{n^*})\in B_{\delta _{\bar{\varphi }}}(x^*)\). Since, by the definition of \(\delta _{\bar{\varphi }}\) and for \(\epsilon \) small enough, we have \(\varphi (x)>\varphi (x^{\prime })\) for all \((x,x^{\prime })\in B_{\delta _{\bar{\varphi }}}(x^*)\times (\mathcal {X}_{\bar{\varphi }})_{\epsilon }\), it follows that \(\varphi (\tilde{y}^{n^*})>\varphi (x^{\tilde{n}})\). Hence, there exists at least one \(n\in \tilde{n}:(\tilde{n}+\eta _{\epsilon })\) such that \(x^n\ne x^{\tilde{n}}\), which contradicts the fact that \(x^n=x^{\tilde{n}}\) for all \(n\in \tilde{n}:(\tilde{n}+\eta _{\epsilon })\). This contradiction shows that \(\bar{\varphi }=\varphi ^*\), i.e. that \(\bar{\varphi }\) is indeed the global maximum of \(\varphi \).
Appendix 5: Additional figures for the example of Sect. 6.1
Gerber, M., Bornn, L. Improving simulated annealing through derandomization. J Glob Optim 68, 189–217 (2017). https://doi.org/10.1007/s10898-016-0461-1