Restart Strategies in a Continuous Setting

Lorenz, Jan-Hendrik

doi:10.1007/s00224-021-10041-0

Restart Strategies in a Continuous Setting

Open access
Published: 05 May 2021

Volume 65, pages 1143–1164, (2021)
Cite this article

Download PDF

You have full access to this open access article

Theory of Computing Systems Aims and scope Submit manuscript

Restart Strategies in a Continuous Setting

Download PDF

Jan-Hendrik Lorenz ORCID: orcid.org/0000-0002-9554-4347¹

1868 Accesses
4 Citations
Explore all metrics

Abstract

Restarting is a technique frequently employed in randomized algorithms. After some number of computation steps, the state of the algorithm is reinitialized with a new, independent random seed. Luby et al. (Inf. Process. Lett. 47(4), 173–180, 1993) introduced a universal restart strategy. They showed that their strategy is an optimal universal strategy in the worst case. However, the optimality result has only been shown for discrete processes. In this work, it is shown that their result does not translate into a continuous setting. Furthermore, we show that there are no (asymptotically) optimal strategies in a continuous setting. Nevertheless, we obtain an optimal universal strategy on a restricted class of continuous probability distributions. Furthermore, as a side result, we show that the expected value under restarts for the lognormal distribution tends towards 0. Finally, the results are illustrated using simulations.

Restart Strategies

Methods to compare expensive stochastic optimization algorithms with random restarts

Article 14 June 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Numerous algorithms use restarts to boost their performance: If the algorithm fails to find a solution, the algorithm’s state is reset to its initial state. Restarts are usually governed by a discrete quantity. For example, stochastic local search algorithms restart if the number of local search steps exceeds a certain threshold and backtracking algorithms commence a restart after a certain number of backtracking steps. In settings such as these, the results by Luby et al. [12] can be applied. They studied the case of Las Vegas algorithms, where restarts can be triggered at discrete time steps. They proved that restarts potentially help to improve the expected runtime of this type of algorithms. In their work, they introduced two of the most important restart strategies: The fixed-cutoff strategy and Luby’s (universal) strategy. They showed that the fixed-cutoff strategy is optimal w.r.t. the expected runtime for some parameter t^∗. However, in practice, t^∗ is rarely known and hard to determine. Luby’s strategy solves this problem: In expectation, it is only worse by a logarithmic factor as compared to the optimal fixed-cutoff strategy. Gomes et al. [6] use randomization and restarts to increase the performance of SAT and CSP-solvers by several orders of magnitude. Nowadays, virtually every state-of-the-art SAT solver utilizes restarts [1]. In an empirical study, Huang [7] found that Luby’s strategy is the best performing strategy in practice. Luby’s strategy is also beneficial in other contexts, such as (1 + 1) evolutionary algorithms [5].

Albeit both the fixed-cutoff strategy and Luby’s strategy have only been introduced for discrete quantities, the restart paradigm can also be applied to continuous quantities to minimize the expected value of this quantity. Eliazar et al. [2] examined enzymes searching for a target in a DNA strand. One of the search strategies of the enzyme is arbitrarily relocating after some time. This relocation corresponds to a restart in the algorithmic setting. Evans and Majumdar [3] analyzed general diffusion processes and found that random resets can be beneficial in expectation. Both works study the effect of restarts on the expected value of a continuous quantity: time. Generally, restarts can help to minimize continuous resources such as money, amount of fuel, traveled distances, emitted greenhouse gases, and so forth. In such scenarios, it seems natural to use established strategies from algorithmics such as the fixed-cutoff and Luby’s strategy. However, this raises the question of whether these are also good strategies in a continuous setting since their performance guarantees have only been shown for the discrete case.

Since (most) computers are discrete systems working with finite precision, it may seem unreasonable to consider real-valued restart times for computational purposes. Nevertheless, the results presented here are statements about all universal strategies, including those that can be represented by a digital computer. Therefore, when facing a situation involving a computer controlling a real-time system, e.g., transmission times in a network, the optimal restart time may be an irrational value. This kind of scenario is relevant, e.g., for distributed computations.

Our contribution

Luby et al. [12] prove the optimality of Luby’s strategy by upper and lower bounds in their work. In this work, we show that the upper bounds do not apply in a continuous setting. Furthermore, we show that there is no reasonable upper bound for either Luby’s strategy or any other universal strategy. These results show that in a continuous framework, optimal strategies cannot exist.

However, we determine classes of distributions for which optimal universal strategies exist. These classes are defined by the optimal restart times t^∗ of the fixed-cutoff strategy. We prove that if t^∗ is not too close to zero, then there are optimal strategies defined on these classes. Furthermore, we improve the known bounds of Luby’s strategy and, therefore, implicitly show that it is an optimal strategy on those classes.

Finally, the quality of Luby’s strategy is checked by means of simulations. We use Lévy-Smirnov, generalized Pareto, and lognormal distributions. As a spin-off result, we also show that the expected value under restarts of the lognormal distribution approaches zero under certain circumstances.

2 Preliminaries

Let X be a positive, continuous random variable. In the following, F_X denotes the cumulative distribution function, the survival function 1 − F_X(x) is denoted by S_X(x), and f_X denotes the density function of X. If it is evident within the context, then the subscripts are omitted.

2.1 Fundamental Restart Strategies

In this section, we formally define the concept of restarts and universal strategies.

Definition 1

[12] Let $\mathcal {A}$ be a randomized algorithm, and x be its input. Let $L=(t_{1}, t_{2}, \dots )$ be an arbitrary sequence of numbers with $t_{i} \in \mathbb {R}_{+} \cup \{\infty \}$ and consider the following modified algorithm $\mathcal {A}_{L}$: Let $\mathcal {A}(x)$ run for t_i time-units in the i-th try. If a solution is found, return it and quit; otherwise, reset $\mathcal {A}(x)$ and proceed with the (i + 1)-st try with new and independent random choices. Then, L is called a restart strategy. Let X be a random variable that describes the runtime behavior of $\mathcal {A}$. In the following, X_L describes the runtime behavior of $\mathcal {A}_{L}$. If L is not dependent on the algorithm $\mathcal {A}$ and its input x, then L is called a universal strategy.

Luby et al. [12] introduced two essential restart strategies. The first one is called the fixed-cutoff strategy. The fixed-cutoff strategy consists of always restarting after some time t has passed.

Definition 2

[12] Let X be a random variable which is defined on $\mathbb {R}_{+}$ and let $L=(t_{1},t_{2}, \dots )$ be a restart strategy with t_i = t₁ for all $i\in \mathbb {N}$. Then, L is called a fixed-cutoff strategy. Note that t₁ uniquely determines a fixed-cutoff strategy. We often write $X_{t_{1}}$ to denote a random variable which describes the runtime with fixed-cutoff restarts using L.

Luby et al. [12] showed that the fixed-cutoff strategy is optimal w.r.t. the expected runtime for some restart time. In other words, there are no other restart strategies performing better on average. Their claim is originally for discrete runtime distributions. For continuous distributions, it was first conjectured [14] that the fixed-cutoff strategy (also called sharp restarts) is optimal; later, this was proven by Pal and Reuveni [15].

Theorem 1

[12, 15] Let X be any random variable. Then there is a $t^{*} \in \mathbb {R}_{+}$ such that the fixed-cutoff strategy $(t^{*},t^{*},\dots )$ is an optimal strategy w.r.t. the expected value under restarts. Furthermore, the expected value E[X_t] of X with fixed-cutoff restarts at $t \in \mathbb {R}_{+}$ is given by:

$$ E[X_{t}] = \frac{1-F(t)}{F(t)} t + E[X \mid X < t]. $$

(1)

In the following, $\alpha ^{*}_{X} = \inf \limits _{t<\infty }E[X_{t}]$ denotes the expected runtime using the optimal fixed-cutoff strategy. It is worth noting that there are certain universal principles regarding $\alpha ^{*}_{X}$ applying for all distributions X. An example is that $\alpha _{X}^{*}= (1-F_{X}(t^{*}))/f_{X}(t^{*})$ holds, where t^∗ is the optimal restart time [14, 16].

While the fixed-cutoff strategy is theoretically intriguing, it suffers from a significant drawback: Finding the optimal restart time t^∗ is a non-trivial task. In fact, nearly the complete runtime behavior has to be known a priori to find t^∗. Moreover, even if the distribution is known, exactly determining t^∗ exactly is NP-hard [10]. In case the cumulative distribution function is known, there is an appropriate approximation algorithm [11]. However, the distribution is usually dependent on the problem instance, and the runtime behavior is therefore unknown. Thus, the fixed-cutoff strategy usually is not used in practice. Luby et al. [12] introduced a second strategy, referred to as Luby’s strategy. This strategy does not use any knowledge of the distribution.

Definition 3

[12] Let $L=(t_{1}, t_{2}, \dots )$ be a universal restart strategy. If t_i is of the form

$$ t_{i} = \begin{cases} 2^{k-1} & \text{if } i=2^{k} -1 \\ t_{i-2^{k-1}+1} & \text{if } 2^{k-1} \leq i < 2^{k} -1 \end{cases}, $$

(2)

then L is called Luby’s strategy.

We denote the expected value of a random variable X using Luby’s strategy by E[X_Luby]. Luby et al. showed that the expected runtime of Luby’s strategy can be bounded w.r.t. the best fixed-cutoff strategy.

Theorem 2

[12] For all discrete distributions X which are defined on $\mathbb {N}$

$$ E[X_{\text{Luby}}] \leq 192 \alpha^{*} (5+\log_{2}{\alpha^{*}}) $$

(3)

holds.

In their paper, Luby et al. introduced Luby’s strategy for discrete runtime distributions. They also showed that their strategy is asymptotically optimal in the worst case. This universal strategy is also highly relevant in practice [7].

2.2 Properties of the Weibull Distribution

Throughout this work, we often use the Weibull distribution in proofs. Here, we define this distribution and show some of its properties.

Definition 4

[13] Let X be a positive real-valued random variable. The random variable X is Weibull distributed if and only if its cumulative distribution function F_X is given by

$$ F_{X}(x)=1-e^{-(\frac{x}{\lambda})^{k}}, $$

(4)

where $\lambda \in \mathbb {R}_{+}$ is called scale, and $k \in \mathbb {R}_{+}$ is called shape.

It is known that for some Weibull distributions, the expected value with restarts approaches zero.

Lemma 1

[9] Let X be a Weibull distribution with shape k < 1 and arbitrary scale $\lambda \in \mathbb {R}_{+}$. Then,

$$ \lim\limits_{t\rightarrow 0}E[X_{t}] = 0 $$

(5)

holds. Here, E[X_t] is the expected value with fixed-cutoff restarts after t time units.

Furthermore, we show that the expected values with restarts are strictly monotone for Weibull distributions.

Lemma 2

Let X be a Weibull distributed random variable with arbitrary scale λ > 0. Then, the expected value E[X_t] with restarts at t is strictly monotonically increasing for t > 0 if and only if the shape parameter k is less than one. If the shape parameter k is greater than one, then E[X_t] is strictly monotonically decreasing.

Proof

The expected value E[X_t] with restarts after t time units is implicitly given in [9].

$$ E[X_{t}] = \frac{1}{1-\exp{(-(\lambda t)^{k})}}\left( t\exp{(-(\lambda t)^{k})}+\frac{1}{\lambda} \gamma(1+\frac{1}{k}, (\lambda t)^{k})\right), $$

(6)

where $\gamma (s, x) = {{\int \limits }_{0}^{x}}t^{s-1}e^{-t} \mathrm {d} t$ is the lower incomplete gamma function. To determine the derivative of E[X_t], the derivative of the lower incomplete gamma function ${\gamma (1+\frac {1}{k}, (\lambda t)^k)}$ is considered. Using the definition of γ and the chain rule yields:

$$ \begin{array}{@{}rcl@{}} \frac{\mathrm{d}}{\mathrm{d}{}t} {\gamma(1+\frac{1}{k}, (\lambda t)^k)}{} &=& k \cdot \lambda \cdot {\exp{(-(\lambda t)^k)}}{} \cdot \left( (\lambda t)^{k}\right)^{\frac{1}{k}} \cdot (\lambda t)^{k-1} \end{array} $$

(7)

$$ \begin{array}{@{}rcl@{}} & =& k \cdot \lambda \cdot {\exp{(-(\lambda t)^k)}}{} \cdot (\lambda t)^{k} \end{array} $$

(8)

We now calculate the derivative of E[X_t] w.r.t. t and denote it by g(t). The derivative g(t) is then given by:

$$ \frac{\exp{(-(\lambda t)^{k})}}{1-\exp{(-(\lambda t)^{k})}}\left( 1-\frac{\exp{(-(\lambda t)^{k})} k(\lambda t)^{k} + k (\lambda t)^{k-1} \gamma(1+\frac{1}{k},(\lambda t)^{k})}{1-\exp{(-(\lambda t)^{k})}}\right). $$

We show that g(t) > 0 for all t > 0. This inequality can be transformed to:

$$ \frac{1-\exp{(-(\lambda t)^{k})}}{k} - (\lambda t)^{k} \exp{(-(\lambda t)^{k})} - (\lambda t)^{k-1} \gamma\left( 1+\frac{1}{k}, (\lambda t)^{k}\right)> 0. $$

(9)

First, this inequality is observed for the limit $t \rightarrow 0$. The limit of the term $-(\lambda t)^{k-1} \gamma (1+\frac {1}{k},(\lambda t)^{k})$ has to be examined closely for k < 1. The limit of all other terms is clearly zero. Due to L’Hospital’s rule, we have:

$$ \lim\limits_{t\rightarrow 0}- \frac{\gamma(1+\frac{1}{k},(\lambda t)^{k})}{(\lambda t)^{1-k}} = \lim\limits_{t\rightarrow 0}-\frac{\exp(-(\lambda t)^{k}) k (\lambda t)^{2k}}{1-k}=0. $$

Thus, the left side of inequality (9) approaches zero in the limit. We now show that the derivative of the left side of inequality (9) is positive for all t > 0. This inequality can be transformed to

$$ \frac{1-\exp{(-(\lambda t)^{k})}}{k (\lambda t)^{k-1}} - \exp{(-(\lambda t)^{k})}\lambda t - \gamma\left( 1+\frac{1}{k}, (\lambda t)^{k}\right)> 0. $$

(10)

The derivative h(t) of the left side of inequality (10) is given by:

$$ h(t)=\frac{(1-\exp{(-(\lambda t)^{k})})(1-k)\lambda}{k (\lambda t)^{k}} $$

It can be easily verified that for t > 0, h(t) evaluates to:

$$ h(t) \begin{cases} > 0, \text{ for } k < 1, \\ = 0, \text{ for } k = 1, \\ < 0, \text{ for } k > 1. \end{cases} $$

(11)

Thus, for k < 1 we have h(t) > 0 which implies g(t) > 0 for all t. This property means that E[X_t] is strictly monotonically increasing. On the other hand, k > 1 implies that E[X_t] is strictly monotonically decreasing. □

3 Continuous Universal Strategies

Theorem 2 presents a useful performance guarantee for Luby’s strategy w.r.t. the optimal fixed-cutoff strategy. However, the bounds are obtained for discrete runtime distributions. This raises the question of whether the performance guarantee w.r.t. the optimal fixed-cutoff strategy also applies to continuous runtime distributions.

3.1 Nonexistence Results

Here, we prove that, generally, there are no optimal universal strategies in a continuous setting. First, we show a useful property of the fixed-cutoff strategy: The expected value of any strategy can be expressed as a convex combination of expected values of fixed-cutoff strategies.

Lemma 3

Let X be any random variable and let $L=(t_{1}, t_{2}, \dots )$ be an arbitrary restart strategy. Let $E[X_{t_{i}}]$ denote the expected value of X with the fixed-cutoff strategy with restart time t_i. Then, E[X_L] can be expressed by

$$ E[X_{L}] = \sum\limits_{i=1}^{\infty} a_{i} E[X_{t_{i}}], $$

(12)

where $(a_{1}, a_{2}, \dots )$ is a sequence with $a_{i} \in \mathbb {R}_{+}$ and $\sum \limits _{i=1}^{\infty } a_{i} = 1$.

Proof

Luby et al. [12] showed this property for discrete distributions. We prove the continuous case.

Let $L=(t_{1}, t_{2}, \dots )$ be an arbitrary restart strategy. Define the cumulative restart times $T_{i} := {\sum }_{k=1}^{i} t_{k}$ and set T₀ := 0 by convention. Let Y := X_L be a random variable describing the process of restarting X with strategy L. The functions f_Y and F_Y denote the pdf and, respectively, the cumulative distribution function of Y. Furthermore, S_Y(x) is used as abbreviation for $\Pr (Y > x)=1-F_{Y}(x)$. The function S_Y may be specified with respect to F_X. Consider the probability S_Y(T_i + x) with x < t_i+ 1 and $i\in \mathbb {N}$. Only if the first i runs failed and the first x time units of the (i + 1)-th run are not successful, Y is greater than T_i + x. By definition, the runs are mutually independent. Therefore, the probability S_Y(T_i + x) is:

$$ S_{Y}(T_{i} + x) = \left( 1-F_{X}(x)\right) \prod\limits_{k=1}^{i}\left( 1-F_{X}(t_{k})\right). $$

(13)

The pdf f_Y of Y can easily be determined using the appropriate derivative:

$$ f_{Y}(T_{i}+x) = f_{X}(x) \prod\limits_{k=1}^{i}\left( 1-F_{X}(t_{k})\right) = f_{X}(x) S_{Y}(T_{i}) $$

(14)

Next, the following auxiliary identity is shown:

$$ \sum\limits_{i=0}^{\infty} T_{i} S_{Y}(T_{i})F_{X}(t_{i+1}) = \sum\limits_{i=1}^{\infty} t_{i} S_{Y}(T_{i}). $$

(15)

The first term of the left sum can be omitted as its value is zero. Furthermore, it should be noted that S_Y(T_i)F_X(t_i+ 1) is the probability $\Pr (T_{i} < Y \leq T_{i+1})$. The result is then obtained using the following transformations.

$$ \begin{array}{@{}rcl@{}} &&\sum\limits_{i=0}^{\infty} T_{i} S_{Y}(T_{i})F_{X}(t_{i+1}) = \sum\limits_{i=1}^{\infty} T_{i} S_{Y}(T_{i})F_{X}(t_{i+1}) \end{array} $$

(16)

$$ \begin{array}{@{}rcl@{}} &=&\sum\limits_{i=1}^{\infty}\sum\limits_{k=1}^{i} t_{k} S_{Y}(T_{i})F_{X}(t_{i+1}) = \sum\limits_{k=1}^{\infty}t_{k}\sum\limits_{i=k}^{\infty} S_{Y}(T_{i})F_{X}(t_{i+1}) \end{array} $$

(17)

$$ \begin{array}{@{}rcl@{}} &=&\sum\limits_{k=1}^{\infty}t_{k}\sum\limits_{i=k}^{\infty} \Pr(T_{i}<Y\leq T_{i+1})=\sum\limits_{k=1}^{\infty} t_{k} S_{Y}(T_{k}) \end{array} $$

(18)

We now proceed to determine the expected value E[Y ]:

$$ \begin{array}{@{}rcl@{}} E[Y] &=& \int\limits_{0}^{\infty}xf_{Y}(x) \mathrm{d} x = \sum\limits_{i=0}^{\infty} \int\limits_{T_{i}}^{T_{i+1}}xf_{Y}(x) \mathrm{d} x \end{array} $$

(19)

$$ \begin{array}{@{}rcl@{}} &=& \sum\limits_{i=0}^{\infty} \int\limits_{0}^{t_{i+1}}(T_{i} + x)f_{Y}(T_{i} + x) \mathrm{d} x \end{array} $$

(20)

$$ \begin{array}{@{}rcl@{}} & =& \sum\limits_{i=0}^{\infty} T_{i} S_{Y}(T_{i})\int\limits_{0}^{t_{i+1}}f_{X}(x) \mathrm{d} x + S_{Y}(T_{i}) \int\limits_{0}^{t_{i+1}}xf_{X}(x) \mathrm{d} x \end{array} $$

(21)

$$ \begin{array}{@{}rcl@{}} & =& \sum\limits_{i=0}^{\infty} T_{i} S_{Y}(T_{i})F_{X}(t_{i+1}) + S_{Y}(T_{i}) \int\limits_{0}^{t_{i+1}}xf_{X}(x) \mathrm{d} x. \end{array} $$

(22)

The left part of (22) can be transformed with (15). We use Theorem 1 and the definition of the conditional expected value to simplify the right part:

$$ \int\limits_{0}^{t_{i}}x f_{X}(x) \mathrm{d} x=F_{X}(t_{i}) E[X_{t_{i}}] - t_{i} \left( 1-F_{X}(t_{i}) \right), $$

(23)

where $E[X_{t_{i}}]$ is the expected value of X utilizing the fixed-cutoff strategy $(t_{i}, t_{i}, \dots )$. Using this equation and additional transformations result in

$$ \begin{array}{@{}rcl@{}} E[Y] & =& \sum\limits_{i=1}^{\infty} t_{i} S_{Y}(T_{i}) + S_{Y}(T_{i-1}) \left( F_{X}(t_{i}) E[X_{t_{i}}] - t_{i} (1-F_{X}(t_{i})) \right) \end{array} $$

(24)

$$ \begin{array}{@{}rcl@{}} & = &\sum\limits_{i=1}^{\infty} t_{i} S_{Y}(T_{i}) + S_{Y}(T_{i-1}) F_{X}(t_{i}) E[X_{t_{i}}] - t_{i} S_{Y}(T_{i}) \end{array} $$

(25)

$$ \begin{array}{@{}rcl@{}} & =& \sum\limits_{i=1}^{\infty} S_{Y}(T_{i-1}) F_{X}(t_{i}) E[X_{t_{i}}] = \sum\limits_{i=1}^{\infty} \Pr{\left( T_{i-1} < Y \leq T_{i} \right)} E[X_{t_{i}}]. \end{array} $$

(26)

For obtaining (25), (13) is used. The expected value E[Y ] can, therefore, be expressed as a combination of the expected values of fixed-cutoff strategies. Evidently $\Pr {(T_{i-1} < Y \leq T_{i} )}$ is non-negative, and the sum ${\sum }_{i=1}^{\infty } \Pr {(T_{i-1} < Y \leq T_{i} )}$ equals one. □

We will now proceed and show that there is no universal strategy in a continuous setting that allows for any reasonable bound in contrast to the discrete case.

Theorem 3

There is no universal strategy $L=(t_{1}, t_{2}, \dots )$ such that there is a function $f:\mathbb {R}_{+} \rightarrow \mathbb {R}$ with $\lim \limits _{x\rightarrow 0}f(x) < \infty $ and $\exists x \in \mathbb {R}_{+}: f(x) < \infty $ with

$$ E[X_{L}] \leq \alpha^{*} \cdot f(\alpha^{*}) $$

(27)

for all runtime distributions X.

Proof

We consider two cases: Either the restart times of the universal strategy are all equal to zero or there is at least one restart time t_j which is not equal to zero. First, consider the strategy $L_{1} = (t_{1}, t_{2}, \dots )$ such that there is a $j \in \mathbb {N}$ with t_j > 0. Define the cumulative restart times $T_{i} := {\sum }_{k=1}^{i} t_{k}$ and set T₀ := 0 by convention. Let Z be a Weibull distribution with shape k < 1.0 and arbitrary scale $\lambda \in \mathbb {R}_{+}$. According to Lemma 3, $\Pr {\left (T_{j-1} < Z_{L_{1}} \leq T_{j} \right )} E[Z_{t_{j}}]$ bounds the expectation of the universal strategy $E[Z_{L_{1}}]$ from below.

Furthermore, the expected value $E[Z_{t_{j}}]$ is greater than zero due to Lemma 1 and Lemma 2. In addition, $\Pr {\left (T_{j-1} < Z_{L_{1}} \leq T_{j} \right )}$ is also strictly greater than zero. On the other hand, the minimal expected value α^∗ is zero, as stated in Lemma 1. Let $f:\mathbb {R}_{+} \rightarrow \mathbb {R}$ be any function with $\lim _{x\rightarrow 0}f(x)<\infty $. Due to this property, α^∗⋅ f(α^∗) = 0 must be valid. Hence, we conclude that

$$ E[X_{L_{1}}]\geq \Pr{\left( T_{j-1} < Z_{L_{1}} \leq T_{j} \right)} \cdot E[Z_{t_{j}}] > 0 = \alpha^{*} \cdot f(\alpha^{*}) $$

(28)

holds. Put differently, Inequality (27) is not valid.

In the second case, the universal strategy $L_{2} = (0,0, \dots )$ is considered. However, it is easy to construct a distribution such that the expected value with restarts using L₂ diverges to infinity. To this end, we consider the limiting value of the expected value when the restart times approach 0. First, we use (1) for the expected value and modify slightly modify its terms. The following derivation is well-known, compare, for example, Wolter [16].

$$ \begin{array}{@{}rcl@{}} {\lim\limits_{t \rightarrow 0}} E[X_{t}] & =& {\lim\limits_{t \rightarrow 0}} \frac{1-F(t)}{F(t)} t + E[X \mid X < t] \end{array} $$

(29)

$$ \begin{array}{@{}rcl@{}} &=&{\lim\limits_{t \rightarrow 0}} \frac{t}{F(t)} - t + \frac{{{\int}_{0}^{t}} x\cdot f(x) \mathrm{d} x}{F(t)}= {\lim\limits_{t \rightarrow 0}}\frac{1}{f(t)} \end{array} $$

(30)

In (30), L’Hospital’s rule is used. We now define a new Weibull distributed random variable U with shape k > 1.0. The scale parameter λ is discussed later. As can be easily checked by derivation of (4), the pdf f_U is given by

$$ f_{U}(x) = \frac{k}{\lambda} \left( \frac{x}{\lambda} \right)^{k-1} \exp{\left[-\left( \frac{x}{\lambda}\right)^{k} \right]}, $$

(31)

for x ≥ 0. And since k > 1, it follows that f(0) = 0 is valid. Then (30) states that the expected value $E[U_{L_{2}}]$ diverges towards infinity. On the other hand, restarts have a negative effect on the expected value of U, according to Wolter [16]. In this case, this indicates α^∗ = E[U]. The expected value E[U] is given by $\lambda \cdot {\varGamma }(1+\frac {1}{k})$, where ${\varGamma }(x)={\int \limits }_{0}^{\infty } t^{x-1}e^{-t} \mathrm {d} t$ is the gamma function [13].

By assumption there is a $x\in \mathbb {R}_{+}$ with $f(x) < \infty $. Therefore, choose λ such that $\lambda \cdot {\varGamma }(1+\frac {1}{k}) = x$ holds. Then the following observation is valid

$$ \begin{array}{@{}rcl@{}} \alpha^{*} \cdot f(\alpha^{*}) = x \cdot f(x) < \infty. \end{array} $$

(32)

As discussed above, $E[U_{L_{2}}]$ approaches infinity; it follows that Inequality (27) does not hold in this case as well. □

Theorem 3 can be applied to Luby’s strategy. This is because the logarithm approaches minus infinity as its argument approaches zero. This means that the bound in Theorem 2 does not hold in a continuous setting. Of course, this statement also applies to all other bounds, according to Theorem 3. Given that Luby’s strategy is of great importance in practice, we present this finding as a corollary.

Corollary 1

There is no function $f:\mathbb {R}_{+} \rightarrow \mathbb {R}$ with $\lim \limits _{x\rightarrow 0}f(x) < \infty $ such that

$$ E[X_{\text{Luby}}] \leq \alpha^{*} \cdot f(\alpha^{*}) $$

(33)

for all runtime distributions X.

Proof

This statement immediately follows from Theorem 3. □

Since computers are inherently discrete systems, not all real-valued restart times and, accordingly, not all restart strategies can be represented on a computer. Since computers are inherently discrete systems, not all real-valued restart times and, accordingly, not all restart strategies can be represented on a computer. Then, of course, a bound of the form given in Theorem 3 is too strong for the strategies that can actually be represented. This, however, raises the question of whether a suitable bound can be found by introducing an error term depending on the machine precision. As shall be seen, the answer to this question is also negative.

To formalize this result, we use $\mathcal {R} \subset \mathbb {R}$ to denote a subset of the real numbers that are representable by a computer. For example, $\mathcal {R}$ could be floating-point numbers. However, we will not impose any further requirements on $\mathcal {R}$, so $\mathcal {R}$ can be defined arbitrarily. Instead, we use a constant $c_{\mathcal {R}}\in \mathbb {R}_{+}$, which is supposed to describe a precision error with respect to $\mathcal {R}$. We call a restart strategy $(t_{1}, t_{2}, \dots )$ representable with respect to $\mathcal {R}$ if $t_{i}\in \mathcal {R}$ holds for all $i\in \mathbb {N}$.

Corollary 2

Let $\mathcal {R} \subset \mathbb {R}$ be a set of representable numbers and let $c_{\mathcal {R}} \in \mathbb {R}_{+}$ be an associated constant. There is no universal strategy $L=(t_{1}, t_{2}, \dots )$ representable with respect to $\mathcal {R}$ such that there is a function $f:\mathbb {R}_{+} \rightarrow \mathbb {R}$ with $\lim \limits _{x\rightarrow 0}f(x) < \infty $ and $\exists x \in \mathbb {R}_{+}: f(x) < \infty $ with

$$ E[X_{L}] \leq \alpha^{*} \cdot f(\alpha^{*}) + c_{\mathcal{R}} $$

(34)

for all runtime distributions X.

Proof

Once more, we have to distinguish whether the restart strategy L is constantly zero or not. In case L is constantly zero, the corresponding arguments from the proof of Theorem 3 can be applied. We, therefore, focus on the other situation.

Let X be a Weibull distributed random variable with shape parameter k < 1. In the following, we assume that $x\in \mathcal {R}$ is the smallest positive representable number. Since L is a representable strategy, x ≤ t,∀t ∈ L holds. As a consequence of Lemma 2 and 3, the following statement holds true:

$$ \begin{array}{@{}rcl@{}} E[X_{L}] \geq E[X_{x}]. \end{array} $$

(35)

The expected value E[X_x] can be bounded from below by applying Theorem 1:

$$ \begin{array}{@{}rcl@{}} E[X_{x}] = \frac{1-F(x)}{F(x)}x + E[X \mid X < x] \geq \frac{1-F(x)}{F(x)}x. \end{array} $$

(36)

We now use the cdf of X, which is given in Definition 4:

$$ \begin{array}{@{}rcl@{}} \frac{1-F(x)}{F(x)}x = \frac{\exp{\left( - \left( \frac{x}{\lambda}\right)^{k}\right)}}{1-\exp{\left( - \left( \frac{x}{\lambda}\right)^{k}\right)}}x. \end{array} $$

(37)

If the scale parameter λ approaches infinity, the following limiting value is obtained:

$$ \begin{array}{@{}rcl@{}} \lim\limits_{\lambda \rightarrow \infty} \exp{\left( - \left( \frac{x}{\lambda}\right)^{k}\right)} = 1. \end{array} $$

(38)

This means that (37) approaches infinity as λ approaches infinity. Therefore, there is a λ, such that $\frac {1-F(x)}{F(x)}x>c_{\mathcal {R}}$ holds. Finally, according to Lemma 1, α^∗ = 0 for every Weibull distribution with k < 1, so $\alpha ^{*}\cdot f(\alpha ^{*})+c_{\mathcal {R}}=c_{\mathcal {R}}$. By bringing all these findings together, we obtain:

$$ \begin{array}{@{}rcl@{}} E[X_{L}] \geq E[X_{x}] \geq \frac{1-F(x)}{F(x)}x > \alpha^{*} \cdot f(\alpha^{*}) + c_{\mathcal{R}} = c_{\mathcal{R}}. \end{array} $$

(39)

This completes the proof. □

3.2 Existence Results

In this part, we show that there are optimal universal strategies if a more restricted class of distributions is considered. The proofs so far used the fact that if the runtime distribution is Weibull distributed, then both the optimal restart time and the expected runtime approach zero. Generally, the optimal expected value can be bounded from above. Our rationale is based on the reasoning presented in [8] and is adjusted to the continuous case. Primarily, we provide an enhanced analysis of the expected value in Theorem 4 and thus derive, to the best of our knowledge, the currently best known upper bound for the expected value when applying Luby’s strategy.

In the following, $\beta ^{*}_{X}$ denotes $\min \limits _{t\in \mathbb {R}_{+}}{\frac {t}{F(t)}}$ and $k^{*} \in \mathbb {R}_{+}$ is a real number with $\frac {k^{*}}{F(k^{*})} = \beta ^{*}_{X}$.

Lemma 4

[8, 12] The inequality

$$ \beta^{*}_{X} \leq 4 \alpha^{*}_{X} $$

(40)

holds for all distributions X.

Proof

Luby et al. [12] state this property for discrete probability distributions. However, the proof is omitted in their work. A slightly modified version of the proof presented in [8] shows this property for continuous distributions.

Let $t \in \mathbb {R}_{+}$ be an optimal restart time, i.e., E[X_t] = α^∗. Two cases are distinguished: First, it is assumed that $E[X_{t}] < \frac {t}{2F(t)}$ holds. In this case, $\Pr (X \leq 2E[X_{t}]F(t)) \geq 0.5$ follows. An auxiliary random variable $Y := \min \limits {(X,t)}$ is defined to show this. It can be seen that the expected values E[Y ] and E[X_t] are related:

$$ \begin{array}{@{}rcl@{}} E[Y] = F(t)\cdot E[X \mid X < t] + (1-F(t))\cdot t = F(t)\cdot E[X_{t}]. \end{array} $$

(41)

We analyze the probability $\Pr (X > 2 E[X_{t}]F(t))$ in more detail. The following inequality can be derived by applying Markov’s inequality and the assumption 2E[X_t]F(t) < t:

$$ \begin{array}{@{}rcl@{}} \Pr(X > 2 E[X_{t}]F(t)) &=& \Pr(Y > 2 E[X_{t}]F(t)) \end{array} $$

(42)

$$ \begin{array}{@{}rcl@{}} &=& \Pr(Y > 2 E[X_{t}]F(t)) = \Pr(Y > 2 E[Y]) \leq \frac{1}{2}. \end{array} $$

(43)

From this inequality, $\Pr (X\leq 2 E[X_{t}]F(t)) \geq 1/2$ also follows. And thus we can estimate β^∗ as follows:

$$ \begin{array}{@{}rcl@{}} \beta^{*} = \min_{x} \frac{x}{F(x)} \leq \frac{2E[X_{t}]F(t)}{\Pr(X \leq 2E[X_{t}]F(t))}\leq 4E[X_{t}]F(t) \leq 4E[X_{t}] = 4 \alpha^{*}. \end{array} $$

Secondly, we now assume that 2E[X_t]F(t) > t is valid. With this assumption, α^∗ can be estimated as follows:

$$ \begin{array}{@{}rcl@{}} \alpha^{*} = E[X_{t}] > \frac{t}{2F(t)} \geq \frac{1}{2} \min_{x} \frac{x}{F(x)} = \frac{\beta^{*}}{2}. \end{array} $$

In conclusion, β^∗≤ 4α^∗ is true for all distributions. □

This Lemma can be used to show that distributions from a more restricted class admit an upper bound on the expected value of Luby’s strategy. That is, only runtime distributions are considered such that the optimal restart time does not approach zero.

Theorem 4

Let X be a positive real-valued distribution such that there is a k^∗ with $\frac {k^{*}}{F(k^{*})}=\beta ^{*}$ and k^∗ > 2^−d− 1 for some $d \in \mathbb {N}_{0}$. Then,

$$ E[X_{\text{Luby}}] \leq 2^{d+7} \alpha^{*} (d+4+\log_{2}{ \alpha^{*}}), $$

(44)

holds, where α^∗ is the optimal expected runtime under restart.

Proof

The presentation of the proof follows [8].

Let Y := X_Luby denote a random variable describing the process of restarting X with Luby’s strategy. The number $z := 2^{\lceil \log _{2}{k^{*}} \rceil + d}$ is the smallest element of Luby’s strategy guaranteed to be at least as large as k^∗. The elements of Luby’s strategy are those natural numbers being powers of two. This is because all elements in Luby’s strategy are natural numbers, which are powers of two. Furthermore, k^∗ > 2^−d− 1 is valid by assumption, so z ≥ k^∗ and $z\in \mathbb {N}$ is ensured. Two auxiliary values $j_{0} := \lceil \log _{2}{k^{*}} \rceil + d$ and $n_{0} := \lceil - \log _{2} F(k^{*}) \rceil $ are used in the following.

First, the case n₀ = 0 is examined. This case implies F(k^∗) = 1. In addition, let $m\in \mathbb {N}$ and $j \in \mathbb {N}_{0}$ be two natural numbers with j < m. As can be verified using Definition 3 there are at least 2^m− 1−j runs with length of 2^j if Y > m2^m− 1. Accordingly, for $Y>(j_{0}+1)2^{j_{0}}$ there is at least one run of length $2^{j_{0}}$. Since F(k^∗) = 1 is valid, $\Pr (Y>(j_{0}+1)2^{j_{0}})=0$ ensues. This allows the expected value E[Y ] to be bounded from above:

$$ \begin{array}{@{}rcl@{}} E[Y] &\leq& (\lceil \log_{2}{k^{*}} \rceil + d + 1) 2^{\lceil \log_{2}{k^{*}} \rceil + d} \leq k^{*}(\log_{2}{k^{*}} + d + 2) 2^{d + 1} \end{array} $$

(45)

$$ \begin{array}{@{}rcl@{}} & =& \beta^{*}(\log_{2}{\beta^{*}} + d + 2) 2^{d + 1} \leq \alpha^{*}(\log_{2}{\alpha^{*}} + d + 4) 2^{d + 3} \end{array} $$

(46)

In (46), we initially apply β^∗ = k^∗/F(k^∗) and F(k^∗) = 1, then we use Lemma 4. The last inequality conforms to the statement of the theorem.

We may now assume n₀ ≥ 1. Furthermore, j₀ ≥ 0 holds due to the definition of j₀. Since, by definition, j₀ ≥ 0, we may also use j₀ + n₀ ≥ 1. Above, we have already indicated that if Y > m2^m− 1, then there are at least 2^m− 1−j runs with a length of 2^j. This fact can be used to bound the probability $\Pr {(Y > m2^{m-1})}$ from above:

$$ \begin{array}{@{}rcl@{}} \Pr{(Y > m2^{m-1})} \leq \Pr{(X > 2^{j})}^{2^{m-1-j}}. \end{array} $$

(47)

The tail probability of Y is estimated by means of Inequality (47). The following holds for $\ell \in \mathbb {N}_{0}$:

$$ \begin{array}{@{}rcl@{}} \Pr{(Y > (j_{0}+n_{0} +\ell +1)2^{j_{0} + n_{0} + \ell})} &\leq& \Pr{(X > 2^{j_{0}})}^{2^{n_{0}+\ell}} \end{array} $$

(48)

$$ \begin{array}{@{}rcl@{}} & \leq &(1-F(k^{*}))^{\frac{2^{\ell}}{F(k^{*})}} \leq \exp{\left( -2^{\ell}\right)}. \end{array} $$

(49)

Let $d(\ell ) := (j_{0}+n_{0} +\ell +1)2^{j_{0} + n_{0} + \ell }$ for ℓ ≥ 0 and d(− 1) := 0. The expected value E[Y ] can be bounded from above by using Inequality (49).

$$ \begin{array}{@{}rcl@{}} E[Y] & =& \sum\limits_{\ell=0}^{\infty} {\int}_{d(\ell-1)}^{d(\ell)} (1-F_{Y}(x)) \mathrm{d} x \\ & \leq& (j_{0}+n_{0} +1)2^{j_{0} + n_{0}} + \sum\limits_{\ell=0}^{\infty} (j_{0}+n_{0} +\ell + 3)2^{j_{0} + n_{0} + \ell} \exp{\left( -2^{\ell}\right)} \\ & \leq &2(j_{0}+n_{0})2^{j_{0} + n_{0}} + \sum\limits_{\ell=0}^{\infty} (j_{0}+n_{0}) (\ell + 4)2^{j_{0} + n_{0} + \ell} \exp{\left( -2^{\ell}\right)} \end{array} $$

(50)

$$ \begin{array}{@{}rcl@{}} & \leq& (j_{0}+n_{0})2^{j_{0} + n_{0}} \left( 2+ \sum\limits_{\ell=0}^{\infty} (\ell + 4) \left( \frac{2}{\exp{\left( 2\right)}}\right)^{\ell}\right) \end{array} $$

(51)

In Inequality (50), the assumption j₀ + n₀ ≥ 1 is implicitly utilized. Next, we examine the series $\sum \limits _{\ell =0}^{\infty } (\ell + 4) \left (\frac {2}{\exp {\left (2\right )}}\right )^{\ell }$.

$$ \begin{array}{@{}rcl@{}} && \sum\limits_{\ell=0}^{\infty} (\ell + 4) \left( \frac{2}{\exp{\left( 2\right)}}\right)^{\ell} = \sum\limits_{\ell=1}^{\infty} \ell \left( \frac{2}{\exp{\left( 2\right)}}\right)^{\ell-1} + 3 \sum\limits_{\ell=0}^{\infty} \left( \frac{2}{\exp{\left( 2\right)}}\right)^{\ell} \end{array} $$

(52)

$$ \begin{array}{@{}rcl@{}} &= & \left( \frac{e^{2}}{e^{2}-2}\right)^{2} + 3 \frac{e^{2}}{e^{2}-2} \leq 6 \end{array} $$

(53)

The estimation of the expected value E[Y ] is simplified by inserting Inequality (53) into Inequality (51).

$$ \begin{array}{@{}rcl@{}} E[Y] \leq 8(j_{0}+n_{0})2^{j_{0} + n_{0}}. \end{array} $$

(54)

Using the definition of j₀ and n₀ results in:

$$ \begin{array}{@{}rcl@{}} 2^{j_{0} + n_{0}} \leq 2^{2+d+ \log_{2}{k^{*}} - \log_{2} F(k^{*})} = 2^{d+2} \frac{k^{*}}{F(k^{*})}=2^{d+2} \beta^{*}. \end{array} $$

(55)

In the final step, we insert Inequality (55) into Inequality (54) and apply Lemma 4.

$$ \begin{array}{@{}rcl@{}} E[Y] \leq 2^{d+5} \beta^{*}(d+2+\log_{2}{\beta^{*}}) \leq 2^{d+7} \alpha^{*} (d+4+\log_{2}{\alpha^{*}}). \end{array} $$

(56)

Inequality (56) describes the desired property, and therefore the proof is complete. □

Note that for d = 0, the upper bound of the expected runtime is $128 \alpha ^{*} (4+\log _{2}{\alpha ^{*}})$, which is, to the best of our knowledge, the best known bound for Luby’s strategy. This proof can also be transferred to discrete distributions, and thus this bound can be achieved in these distributions as well.

Furthermore, Theorem 7 in [12] bounds the performance of any universal strategy from below. While the proof is for discrete probability distributions, the arguments can be directly transferred to the case discussed here. This implies that Luby’s strategy is asymptotically optimal on the class of distributions examined here.

Theorem 4 limits the minimal restart time to 2^−d− 1. However, for some distributions, the optimal restart time approaches zero, and the reasoning from Theorem 4 cannot be applied. Theorem 3 states that for these distributions, there is no guarantee of the form $E[X_{L}] \leq \alpha _{X}^{*} \cdot g(\alpha _{X}^{*})$ with $\lim _{t\rightarrow 0} g(t) < \infty $. The only remaining option is to provide a runtime guarantee for universal strategies with respect to the optimal fixed-cutoff strategy is using a bound function g with $\lim _{t\rightarrow 0}g(t)=\infty $. It is known that the expected runtime E[X_x] approaches 1/f(x) as x approaches zero, where f is the density function. One of the critical distributions is the Weibull distribution, which has the property $f(x) \sim x^{k-1}$ for $x \rightarrow 0$ and k ∈ (0,1). Thus, every possible bound function g(x) has to approach infinity at least as fast as 1/x for $x \rightarrow 0$. It is possible, though, that there are distribution types for which their density function approaches infinity faster than the Weibull distribution in the neighborhood of zero.

4 Simulations

So far, the Weibull distribution has been used to show the negative result. In this section, we take other distributions into account. More specifically, the quality of Luby’s strategy in terms of the optimal expected value under restarts is examined. For this purpose, we consider the Lévy-Smirnov, the generalized Pareto, and the lognormal distribution. These distributions cover three possibilities: The Lévy-Smirnov distribution considered here will be shown to fulfill the conditions of Theorem 4. For the generalized Pareto distribution, it turns out that Theorem 4 is not applicable, but Luby’s strategy often yields good results. Finally, in the case of the lognormal distribution, Theorem 4 can be applied to any given parameter combination. In the limit, however, Theorem 4 is not applicable. Consequently, the results of Luby’s strategy become increasingly worse as the parameters approach the limit.

4.1 Setup of the Simulations

The following simulations are conducted as follows. For each random variable X, we use 1000 samples from X_Luby. In other words, we apply Luby’s strategy 1000 times to X. Using the collected data, we calculate the sample mean $\bar {x}$ as an estimate for E[X_Luby]. Additionally, we provide the 95% confidence interval in each case. The results are graphically illustrated and compared with α^∗. The data and the source code used to obtain the results are freely available online at https://github.com/jahehalo/restart_universal_strategies.

4.2 The Lévy-Smirnov Distribution

We start by defining the Lévy-Smirnov distribution.

Definition 5

[4] Let X be a positve, real-valued random variable. The random variable X is Lévy-Smirnov distributed if and only if its cumulative distribution function F_X is given by

$$ F_{X}(x) = \text{erfc}\left( \sqrt{\frac{c}{2x}}\right), $$

(57)

where $c\in \mathbb {R}_{+}$ is called scale, and erfc is the complementary error function with $\text {erfc}(x)=\frac {2}{\sqrt {\pi }}{\int \limits }_{x}^{\infty } e^{-u^{2}} \mathrm {d} u$.

To examine the effect of the scale parameter c, we will test different values of c. The values of c range between 10^− 3 and 10². In order to estimate the quality of the Luby strategy, we first determine α^∗. For this purpose, we use the methodology presented in [9]. This method involves finding the optimal restart quantile p^∗∈ [0,1] by solving the following equation:

$$ \begin{array}{@{}rcl@{}} &(p-1)\cdot Q(p)+(1-p)\cdot p \cdot Q^{\prime}(p)-{\mathfrak{Q}}(p)+{\mathfrak{Q}}(0) = 0, \end{array} $$

(58)

where Q is the quantile function of X, $Q^{\prime }$ is the derivative of Q and ${\mathfrak {Q}}$ is the antiderivative of Q. In the case of the Lévy-Smirnov distribution, these three functions are given by:

$$ \begin{array}{@{}rcl@{}} &&Q(p)=\frac{c}{2\cdot {\text{erfc}^{-1}(p)}^{2}} \end{array} $$

(59)

$$ \begin{array}{@{}rcl@{}} &&Q^{\prime}(p)=c \cdot \frac{\sqrt{\pi} \cdot \exp{\left[{\text{erfc}^{-1}(p)}^{2}\right]}}{2\cdot {\text{erfc}^{-1}(p)}^{3}} \end{array} $$

(60)

$$ \begin{array}{@{}rcl@{}} &&{\mathfrak{Q}}(p)=c\cdot\left( 1 - p + \frac{\exp{\left[-{\text{erfc}^{-1}(p)}^{2}\right]}}{\sqrt{\pi} \cdot {\text{erfc}^{-1}(p)}}\right). \end{array} $$

(61)

Here, erfc^− 1 is the inverse of the complementary error function erfc. As shown in [9], the optimal restart quantile p^∗ is invariant to the choice of the scale parameter c. Thus, we fix c to 1 and solve (58) numerically.

This approach leads to p^∗≈ 0.2963. Using p^∗ the optimal restart time t^∗ = Q(p^∗) can be determined. This way, the optimal restart time t^∗≈ 0.9170 ⋅ c is obtained, resulting in α^∗≈ 2.6718 ⋅ c.

We also examine whether the Lévy-Smirnov distribution satisfies the conditions of Theorem 4. Therefore k^∗ has to be determined. For this purpose we have to minimize t/F(t). By substituting t = Q(p), we can alternatively consider Q(p)/p, because Q is the quantile function and therefore F(Q(p)) = p. Numerical calculations show that Q(p)/p assumes its minimum at $p^{\prime } \approx 0.23381$. Resubstituting yields $k^{*}=Q(p^{\prime })\approx 0.7055 \cdot c$. Thus, the conditions for Theorem 4 are met for each c > 0. Therefore we can safely assume, that E[X_Luby] performs reasonably well if the value of c is not too small.

The results of the simulation are shown in Fig. 1. The blue line shows α^∗ while the orange dots represent the sample mean $\bar {x}$ of Luby’s strategy. The particularly good performance of the Luby strategy at c = 1 is notable here. In this case, the sample mean is $\bar {x}\approx 2.7835$ with a 95% confidence interval of (2.6043,2.9626). As can be seen on the log-log plot, Luby’s strategy is also well suited for the other values of the scale parameter. Partly based on these results, we suspect that for many distributions, a much tighter bound than the one given in Theorem 4 can be achieved.

4.3 The Generalized Pareto Distribution

Next, we will focus on the generalized Pareto distribution.

Definition 6

[13] Let X be a positive, real-valued random variable. The random variable X is generalized Pareto (GP) distributed if and only if its cumulative distribution function F_X is given by

$$ F_{X}(x)=1-\left( 1+\frac{k x}{\sigma} \right)^{-1/k}, $$

(62)

where $\sigma \in \mathbb {R}_{+}$ is called scale, and $k \in \mathbb {R}$ is called shape.

For the generalized Pareto distribution it is known that restarts have a beneficial effect if and only if k > 0. Therefore we focus on the case k > 0. In this case, it is known [9] that the optimal restart time is zero, and the associated expected value is α^∗ = σ. It is also well known that E[X_t] ≤ t/F(t) holds for all t. To see this, consider the following:

$$ E[X_{t}] = \frac{1-F(t)}{F(t)} t + E[X \mid X < t] \leq \frac{1-F(t)}{F(t)} t + t \leq \frac{t}{F(t)}. $$

(63)

Only the fact E[X∣X < t] ≤ t has been used here.

We consider the limit of t/F(t) for $t\rightarrow 0$. According to L’Hospital’s rule, the limit approaches 1/f(0), where $f(t)=\frac {1}{\sigma }(1+ \frac {k\cdot x}{\sigma })^{-\frac {1}{k}- 1}$ and therefore 1/f(0) = σ = α^∗. In [9] it was implicitly shown that E[X_t] > σ is valid for all t > 0. Together with Inequality (63), it follows that k^∗ = 0 and β^∗ = σ. This makes the GP distribution an interesting case: Since k^∗ = 0, Theorem 4 is not applicable. On the other hand, due to α^∗ > 0, the proof of Theorem 3 is also not directly applicable either. In other words, neither of the two main results apply directly. Instead, we can only rely on numerical simulations to clarify how well suited Luby’s strategy is for the GP distribution.

We particularly want to clarify the role of the two parameters σ and k. For this purpose, in Fig. 2, the scale parameter σ is set to 10, and different values of k are tested. The orange dots indicate the sample mean of Luby’s strategy, while the blue line indicates α^∗. As can be seen from the diagram, Luby’s strategy seems to be quite suitable for this kind of parameter combination. As k increases, the expected value of Luby’s strategy appears to increase approximately linearly. In this scenario, the difference between α^∗ and E[X_Luby] is limited. On the other hand, for $k\rightarrow 0$, it seems that α^∗ and E[X_Luby] approach the same value. This is consistent with the fact that in the limit for $k \rightarrow 0$, the GP distribution becomes an exponential distribution. For exponential distributions, it is known that restarts do not affect the expected value.

Next, we consider the case where the shape parameter k is fixed, and the scale parameter σ is a variable. Figure 3 shows the case with k = 5. As can be seen here, although the absolute difference between α^∗ and E[X_Luby] grows with increasing σ, the log-log plot shows that they have the same order of magnitude for high values of σ. On the other hand, the expected value of Luby’s strategy decreases considerably slower than α^∗ for $\sigma \rightarrow 0$.

4.4 The Lognormal Distribution

Finally, we consider the lognormal distribution.

Definition 7

[4] Let X be a positive, real-valued random variable. The random variable X is lognormal distributed if and only if its cumulative distribution function F_X is given by

$$ F_{X}(x)= \frac{1}{2} + \frac{1}{2} \text{erf}\left( -\frac{\ln{(x)}-\mu}{\sqrt{2}\cdot \sigma}\right), $$

(64)

where erf is the error function $\text {erf}(x)=\frac {2}{\sqrt {\pi }}{{\int \limits }_{0}^{x}} e^{-u^{2}} \mathrm {d} u$. The parameter $\sigma \in \mathbb {R}_{+}$ is called shape, and the parameter $\mu \in \mathbb {R}$ is called scale.

The lognormal distribution has the interesting property that for each parameter combination (μ,σ), k^∗ is greater than zero, and thus Theorem 4 is applicable. However, in the limit for $\sigma \rightarrow \infty $ both, k^∗ and α^∗, approach zero. This is, to the best of our knowledge, a previously unknown property.

Theorem 5

Let X_μ,σ denote a lognormal distributed random variable with scale μ and shape σ. Then the following holds for the optimal expected value under restarts:

$$ \lim\limits_{\sigma \rightarrow \infty} \alpha^{*}_{X_{\mu, \sigma}} = 0. $$

(65)

Proof

In the following, we set the restart time t to $t=\exp {\left (\mu - \sigma ^2\right )}$. It has already been explained in Inequality (63) that E[X_t] ≤ t/F(t) is valid. We, therefore, examine the limit of t/F(t) for $\sigma \rightarrow \infty $ more closely. Two technical details should be noted in advance. First, $\lim _{x\rightarrow -\infty } \text {erf}(x) = -1$ and also the derivative of the error function given by $\frac {\mathrm {d}}{\mathrm {d} x} \text {erf}(x) = \frac {2}{\sqrt {\pi }}\exp {(-x^{2})}$. The limiting value can then be determined by applying L’Hospital’s rule twice.

$$ \begin{array}{@{}rcl@{}} &&{\lim\limits_{\sigma \rightarrow \infty}}{} \frac{t}{F(t)} = {\lim\limits_{\sigma \rightarrow \infty}}{} \frac{2 \exp{\left( \mu - \sigma^2\right)}{}}{1+\text{erf}\left( -\frac{\sigma}{\sqrt{2}}\right)} = {\lim\limits_{\sigma \rightarrow \infty}}{} \frac{-4 \sigma \exp{\left( \mu - \sigma^2\right)}{}}{- \exp{\left( -\frac{\sigma^{2}}{2}\right)}}\sqrt{\frac{\pi}{2}} \end{array} $$

(66)

$$ \begin{array}{@{}rcl@{}} &=&{\lim\limits_{\sigma \rightarrow \infty}}{} \frac{\sqrt{8 \pi} \exp{(\mu)} \sigma}{\exp{\left( \frac{\sigma^{2}}{2}\right)}} = {\lim\limits_{\sigma \rightarrow \infty}}{} \frac{\sqrt{8\pi} \exp{(\mu)}}{\sigma \exp{\left( \frac{\sigma^{2}}{2}\right)}} = 0 \end{array} $$

(67)

Thus, since t/F(t) tends towards zero, we can conclude that E[X_t] also approaches zero. Due to the fact that $\alpha ^{*} = \inf _{x} E[X_{x}] \leq E[X_{t}]$ holds, the statement follows. □

This not only shows that the optimal expected value α^∗ approaches zero. As Theorem 5 is shown by using t/F(t) as an upper bound for E[X_t], this also implies that k^∗ approaches zero. Furthermore, the proof also indicates that α^∗ is approaching zero super exponentially.

As α^∗ tends to zero under the discussed circumstances, the reasoning from the proof of Theorem 3 is applicable. It is, therefore, reasonable to assume that the expected values of the optimal restart strategy and Luby’s strategy are significantly different for large σ values. This behavior can be observed in Fig. 4. Note that the optimal restart time was not used on this occasion, but $t=\exp {(\mu - \sigma ^{2})}$, as in the proof for Theorem 5.

For the sake of completeness, the case in which the shape parameter σ is fixed and the scale parameter μ is a variable illustrated as well. In Fig. 5, σ is 3. In this scenario, it seems that the difference between E[X_Luby] and α^∗ only increases slightly. The relative difference decreases as μ increases.

The main findings of this section are that Luby’s strategy is, in many circumstances, a good alternative to the optimal fixed-cutoff strategy. Especially if the exact probability distribution is unknown, however, it is worth checking if one of the boundary cases is present. Then the relative difference of the expected values may be significant, as has been illustrated by the GP distribution for low σ values and the lognormal distribution for high σ values.

5 Conclusion

Many established algorithms (e.g., [5, 6]) use restarts to optimize their respective expected runtimes. Among those algorithms, the use of Luby’s universal strategy is especially prevalent. Luby’s strategy consists of a slowly exponentially growing sequence of restart times. In their work, Luby et al. [12] showed that their universal strategy is asymptotically optimal in the worst case.

These properties have only been shown for discrete distributions. In this work, Luby’s strategy and universal strategies are studied in a continuous setting. We show that Luby’s strategy does not permit any bounds in a continuous setting.

Theorem 3 shows that in a continuous setting, there are no universal strategies with reasonable worst-case bounds. In other words, every universal strategy, including Luby’s strategy, is arbitrarily worse than an optimal strategy for some distributions.

Universal strategies are attractive to practitioners because knowledge of the underlying distribution is not necessary. Nonetheless, some information about the distribution is often available. Therefore, we study under which circumstances there are worst-case bounds for Luby’s strategy. Theorem 4 shows that if the optimal restart time is not too small, there are useful worst-case bounds. In particular, we improved the original bounds from Luby et al. [12] if the optimal restart time is at least 0.5. The results are also demonstrated with simulations. As can be observed, the results are often significantly better than the worst-case bound of Theorem 4.

In conclusion, there are two extreme cases: On the one hand, if the underlying distribution is completely known, then an optimal restart time can be calculated, resulting in an optimal strategy. On the other hand, if the distribution is entirely unknown, then only universal strategies are available. For future work, it would be interesting whether partial knowledge of the distribution can be exploited to efficiently create an intermediate restart strategy that performs better than Luby’s strategy.

In addition, the generalized Pareto distribution does not satisfy the conditions of Theorem 4 if its parameters are chosen appropriately. On the other hand, the simulations have shown that Luby’s strategy often yields good results in this case. Therefore the question arises whether the statement of Theorem 4 can be extended to cover cases like the generalized Pareto distribution.

References

Biere, A., Heule, M., van Maaren, H.: Handbook of Satisfiability. IOS Press (2009)
Eliazar, I., Koren, T., Klafter, J.: Searching circular DNA strands. J. Phys.: Condens. Matter 19(6), 065140 (2007)
Google Scholar
Evans, M., Majumdar, S.: Diffusion with stochastic resetting. Phys. Rev. Lett. 106(16), 160601 (2011)
Article Google Scholar
Florescu, I., Tudor, C. A.: Handbook of Probability. Wiley, New York (2013)
Book Google Scholar
Friedrich, T., Kötzing, T., Quinzan, F., Sutton, A.: Improving the run time of the (1 + 1) evolutionary algorithm with Luby sequences. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ’18, pp 301–308. ACM, New York (2018)
Gomes, C., Selman, B., Crato, N.: Heavy-tailed distributions in combinatorial search. In: CP’97, pp 121–135. Springer (1997)
Huang, J.: The effect of restarts on the efficiency of clause learning. In: IJCAI, vol. 7, pp 2318–2323 (2007)
König, W.: Stochastische Algorithmen. Universität zu Köln, lecture notes (in German). Available at http://www.wias-berlin.de/people/koenig/www/AlgStoch.pdf(2003)
Lorenz, J.: Runtime distributions and criteria for restarts. In: SOFSEM’18, pp 493–507. Springer International Publishing, Cham (2018)
Lorenz, J. H.: On the complexity of restarting. In: International Computer Science Symposium in Russia, pp 250–261. Springer (2019)
Lorenz, J. H., Schöning, U.: Promise problems on probability distributions. In: Complexity and Approximation, pp 57–66. Springer (2020)
Luby, M., Sinclair, A., Zuckerman, D.: Optimal speedup of Las Vegas algorithms. Inf. Process. Lett. 47(4), 173–180 (1993)
Article MathSciNet Google Scholar
Norman, L., Kotz, S., Balakrishnan, N.: Continuous Univariate Distributions, vol 1 of Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics (1994)
Pal, A., Kundu, A., Evans, M. R.: Diffusion under time-dependent resetting. J. Phys. A: Math. Theor. 49(22), 225001 (2016)
Article MathSciNet Google Scholar
Pal, A., Reuveni, S.: First passage under restart. Phys. Rev. Lett. 118(3), 030603 (2017)
Article Google Scholar
Wolter, K.: Stochastic Models for Fault Tolerance: Restart, Rejuvenation and Checkpointing. Springer Science & Business Media, New York (2010)
Book Google Scholar

Download references

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Institute of Theoretical Computer Science, Ulm University, 89069, Ulm, Germany
Jan-Hendrik Lorenz

Authors

Jan-Hendrik Lorenz
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jan-Hendrik Lorenz.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Lorenz, JH. Restart Strategies in a Continuous Setting. Theory Comput Syst 65, 1143–1164 (2021). https://doi.org/10.1007/s00224-021-10041-0

Download citation

Accepted: 27 March 2021
Published: 05 May 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s00224-021-10041-0

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Restart Strategies in a Continuous Setting

Abstract

Similar content being viewed by others

Restart Strategies

Restart Strategies

Methods to compare expensive stochastic optimization algorithms with random restarts

1 Introduction

Our contribution

2 Preliminaries

2.1 Fundamental Restart Strategies

Definition 1

Definition 2

Theorem 1

Definition 3

Theorem 2

2.2 Properties of the Weibull Distribution

Definition 4

Lemma 1

Lemma 2

Proof

3 Continuous Universal Strategies

3.1 Nonexistence Results

Lemma 3

Proof

Theorem 3

Proof

Corollary 1

Proof

Corollary 2

Proof

3.2 Existence Results

Lemma 4

Proof

Theorem 4

Proof

4 Simulations

4.1 Setup of the Simulations

4.2 The Lévy-Smirnov Distribution

Definition 5

4.3 The Generalized Pareto Distribution

Definition 6

4.4 The Lognormal Distribution

Definition 7

Theorem 5

Proof

5 Conclusion

References

Funding

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation