# A Survey on Direct Search Methods for Blackbox Optimization and Their Applications


## Abstract

Blackbox optimization typically arises when the functions defining the objective and constraints of an optimization problem are computed through a computer simulation. The blackbox is expensive to compute, can have limited precision and can be contaminated with numerical noise. It may also fail to return a valid output, even when the input appears acceptable. Launching the simulation twice from the same input may produce different outputs. These unreliable properties are frequently encountered when dealing with real optimization problems. The term blackbox is used to indicate that the internal structure of the target problem, such as derivatives or their approximations, cannot be exploited as it may be unknown, hidden, unreliable, or nonexistent. There are situations where some structure such as bounds or linear constraints may be exploited, and in some cases a surrogate of the problem is supplied or a model may be constructed and trusted. This chapter surveys algorithms for this class of problems, including a supporting convergence analysis based on the nonsmooth calculus. The chapter also lists numerous published applications of these methods to real optimization problems.

## Keywords

Blackbox optimization · Direct search methods · Derivative-free optimization · Surrogate models · Nonsmooth analysis · Applications

## 1 Introduction

In many situations, one is interested in identifying the values of a set of variables that maximize or minimize some objective function. Furthermore, the variables cannot take arbitrary values, as they are confined to an admissible region and need to satisfy some prescribed requirements. Optimization studies problems of the form:
$$\displaystyle\begin{array}{rcl} \min _{x\in \varOmega }& f(x)\text{,}&{}\end{array}$$
(1)
where x represents the variables which must be taken in the admissible region Ω, a subset of $$\mathbb{R}^{n}$$, and where f(⋅ ) is the objective that we wish to minimize. The goal is to identify x ∈ Ω that has the least objective function value f(x). The nature of f and of Ω dictates the type of optimization methods that should be used to tackle a given problem. In general, properties such as linearity, convexity, monotonicity, and integrality may be used to identify appropriate optimization methods [70].

The chapter focuses on optimization problems for which the nature of f or Ω is difficult or impossible to exploit. These are called blackbox optimization problems. Frequent examples arise when the functions defining the problem are computed through a time-consuming simulation. Some of the constraints defining Ω may be evaluated prior to launching the simulation, and others may only be evaluated a posteriori. In [113, 116] for example, launching a single simulation at a given tentative trial point $$x \in \mathbb{R}^{n}$$ to evaluate the objective function value f(x) and to verify whether x belongs to Ω or not requires as much as 2,000 computational hours. In other situations, the simulation can fail to return a value even when the tentative x belongs to Ω. In [37], 60% of the simulation calls return an error. There are also situations where the simulation is not deterministic. The simulation might involve random numbers and may return different values of f(x) for the same input x. In [21], the objective function value represents the time required to perform a series of tasks, and varies slightly even when the same conditions are prescribed.

There are situations where the evaluation of f is done through a blackbox simulation, but the problem is not as ill-behaved as in the above-mentioned examples. Model-based derivative-free methods target this class of problems by constructing approximations of the objective function [45, 46, 48]. They do not assume explicit knowledge of the derivatives, but assume their existence. The book [47] on derivative-free methods distinguishes between model-based and directional direct-search methods. The present work discusses directional direct-search methods for general blackbox optimization, for which no assumptions are made on the objective function or on the admissible region.

The chapter studies the optimization problem (1) and is structured as follows. Section 2 gives a high level overview of some directional-based methods designed to solve blackbox optimization problems. Section 3 describes typical exploitable specificities of the target problem. In particular, specific types of constraints and variables are discussed. Then, Sect. 4 describes how one can use surrogates and develop a model of the target problem to ease the solution process. These tools are used to guide the optimization so that it consumes less time evaluating the original expensive problem. That section also mentions work on the exploitation of parallelism in direct-search methods. Section 5 presents the theoretical foundations upon which these methods rely. The convergence analysis is hierarchical in the sense that the stronger the hypotheses on the objective and feasible region, the stronger the resulting theoretical guarantees. Finally, Sect. 6 surveys some selected published applications of these directional direct-search methods. Concluding remarks are drawn in the final section.

## 2 Directional Direct Search Methods

Direct-search methods were introduced more than half a century ago. They are named this way because they interact directly with the function values and do not attempt to use or estimate derivatives. Pioneer methods include the famous coordinate search (Cs) method used by Fermi and Metropolis [64] on one of the first digital computers back in 1952, the Nelder–Mead [122] and the Hooke and Jeeves [90] algorithms. This section first gives a general overview of the Cs method, followed by some of its descendants.

### 2.1 The Coordinate Search Algorithm

Part of the structure of some modern direct search methods is present in the Cs method for unconstrained optimization, i.e., problem (1) with $$\varOmega = \mathbb{R}^{n}$$. This iterative method can be simply described as follows.

At iteration k ∈ { 0, 1, 2, …}, the current best known solution is denoted by x k and called the current incumbent solution. At the initial iteration, the starting point $$x_{0} \in \mathbb{R}^{n}$$ is supplied by the user of the method. An initial step size parameter Δ 0 > 0 is also supplied by the user.

Then, at each iteration, a total of 2n trial points are generated in hopes of improving the current incumbent. Each trial point is obtained by varying a single coordinate of the vector x k by a fixed step size of magnitude Δ k . These are called the poll points and belong to the set
$$\displaystyle{P_{k} = \{\, x_{k} \pm \varDelta _{k}e_{i} : i = 1,2,\ldots,n \,\}}$$
where e i is the i-th coordinate vector in $$\mathbb{R}^{n}$$. The objective function f is then evaluated at each of these points, and there are two possible outcomes. One possibility is that a trial point t ∈ P k satisfying f(t) < f(x k ) is identified. In that case the iteration is declared successful and x k+1 is set to t and the step size parameter Δ k+1 to Δ k . The alternate possibility is that all 2n trial points were tested, but none improved the objective function value. In that case the iteration is said to be unsuccessful and x k+1 is set to x k , but the step size parameter Δ k+1 is set to be half of Δ k . The iteration then ends, the counter k is incremented by one and a new iteration is initiated.
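The iteration described above can be sketched in a few lines of code. This is a minimal illustrative implementation, not taken from any published solver: the function and parameter names are ours, and the stopping test on the step size is one common choice among several.

```python
def coordinate_search(f, x0, delta0=1.0, tol=1e-6, max_iter=10_000):
    """Minimal sketch of the Cs method described above.

    At each iteration, polls the 2n points x_k +/- delta_k * e_i.
    On success the step size is kept; on failure it is halved.
    Stops when delta_k falls below tol."""
    x, delta, n = list(x0), delta0, len(x0)
    fx = f(x)
    for _ in range(max_iter):
        if delta < tol:
            break
        success = False
        for i in range(n):
            for sign in (+1.0, -1.0):
                t = x[:]                 # copy the incumbent
                t[i] += sign * delta     # vary a single coordinate
                ft = f(t)
                if ft < fx:              # simple decrease: iteration succeeds
                    x, fx, success = t, ft, True
                    break
            if success:
                break
        if not success:
            delta /= 2.0                 # unsuccessful iteration: halve step
    return x, fx
```

On a smooth convex quadratic the sketch drives the incumbent to the minimizer, polling at most 2n points per iteration.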

This method is simple to implement. Furthermore, it may be applied to any unconstrained optimization problem, without any assumptions on the smoothness of the objective function. The Cs algorithm is also known under the name compass search [98]. Section 5 reports that even if this method may seem naive, it is supported by a convergence analysis showing that it may produce a stationary point when the objective function is locally strictly differentiable.

### 2.2 The Pattern Search Class of Algorithms

Development and analyses of direct-search methods were not intensive for a few decades, but renewed interest occurred in the 1990s. In [120], it was shown that the Nelder–Mead method could fail to converge to a local solution, even on a strictly convex unconstrained two-dimensional minimization problem.

A unified framework was proposed in [140], generalizing the following direct search methods: Cs, Hooke and Jeeves, multidirectional search [57] as well as an evolutionary operation method [38]. The framework was called pattern search, or generalized pattern search (Gps), and one of its main contributions was to provide sufficient conditions on the target problem ensuring convergence to a stationary point for unconstrained optimization.

The Gps framework generalizes Cs by increasing its flexibility. The main algorithmic improvements are the following.
1. The directions used to generate the poll set P k are not restricted to the coordinate vectors.

2. When the iteration succeeds in improving the incumbent, the parameter Δ k+1 is allowed to increase or to remain the same as Δ k .

3. At every iteration, the algorithm allows exploration at a finite number of trial points other than the poll points.

The first improvement allows a richer set of polling directions. Lewis and Torczon [104] propose the use of positive bases [14, 55] to generate the polling directions. Positive bases are not bases, but are minimal sets of directions whose nonnegative linear combinations span $$\mathbb{R}^{n}$$. The set of positive and negative coordinate directions in $$\mathbb{R}^{n}$$ used by Cs is an example of a positive basis. The flexibility of the construction of the polling directions is exploited in [11] for molecular geometry problems.
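A classical minimal positive basis consists of the coordinate vectors e 1, …, e n together with the single extra direction −(e 1 + ⋯ + e n ), for n + 1 directions in total. The sketch below verifies the positive-spanning property by exhibiting explicit nonnegative coefficients; the coefficient formula is a simple construction we supply for illustration, not one taken from the cited references.

```python
def minimal_positive_basis(n):
    """The n + 1 directions e_1, ..., e_n, -(e_1 + ... + e_n):
    a minimal positive basis of R^n (n + 1 is the smallest possible size)."""
    basis = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    basis.append([-1.0] * n)
    return basis

def nonnegative_coefficients(v):
    """Express v as a nonnegative combination of the basis above:
    v = sum_i a_i e_i + c * (-1, ..., -1) with a_i, c >= 0.
    Taking c = max(0, -min_i v_i) and a_i = v_i + c works."""
    c = max(0.0, -min(v))
    return [vi + c for vi in v] + [c]
```

Since every vector admits such nonnegative coefficients, the nonnegative combinations of these n + 1 directions span all of $$\mathbb{R}^{n}$$.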

The second improvement allows the algorithm to dynamically adapt to the nature of the problem. Indeed, if the initial step size Δ 0 was chosen to be too small, then a series of successful iterations will increase Δ k and large steps will soon be taken. This modification was shown to be useful in practice, but the analysis of the behavior of the parameter Δ k gets more technical. Torczon [140] sets some rules on the way in which the parameter is increased or decreased, including a requirement that the factor by which it is modified be a rational number. She then gives sufficient conditions under which the limit inferior of the parameter Δ k goes to zero as the iteration number k goes to infinity. This fundamental result is the first cornerstone of the convergence analysis. The rationality requirement was later shown to be necessary for the analysis [13].

The third generalization is introduced for practical reasons. A user of a pattern search method is often tempted to alter an optimization code to make it more efficient by integrating his knowledge of the target problem. Modifications of the poll set P k could lead to loss of the structure necessary for the theoretical support. So, an additional exploration phase, later called the search step [37], is introduced at every iteration. The search step allows the evaluation of the objective function at finitely many trial points, located on a discretization of the space of variables. The discretization is called the mesh, and its coarseness is parameterized by the step size parameter Δ k . The mesh is formally described in the next subsection.

### 2.3 The Mesh Adaptive Direct Search Class of Algorithms

There are two main practical and theoretical limitations of Gps. First, the directions used to construct the mesh and the poll set need to be chosen from a fixed finite set $$D \subset \mathbb{R}^{n}$$. The algorithm and its convergence analysis heavily rely on this requirement. A negative consequence of limiting the polling directions is exposed in [98], where Cs is applied to a modification of the Dennis–Wood function [59] and the iterates converge to a non-stationary point. The objective function of this problem is simply the maximum of two strictly convex quadratics in $$\mathbb{R}^{2}$$, and the problem is unconstrained. If one knew in advance the polling directions, it would be easy to devise a similar example for Gps.

The second limitation of Gps is that it cannot handle general constraints. In fact, it can only treat explicit linear constraints and bounds on the variables. In [18], the Mesh Adaptive Direct-Search (Mads) class of algorithms is introduced to address these limitations.

In Mads, the role of the step size parameter Δ k is split in two. That parameter is replaced by the mesh and poll size parameters Δ k m and Δ k p . As its name indicates, the poll size parameter is used to construct the poll set P k . The poll points are constructed around the current incumbent solution x k and the distance from the incumbent to each poll point is bounded by Δ k p . In comparison, that distance is systematically equal to Δ k with Cs. The mesh size parameter dictates the coarseness or fineness of the mesh. In Mads, Δ k m is updated so that it converges to zero much faster than Δ k p . The consequence is that the tentative search and poll points can be chosen on a finer mesh than the one defined by the poll size parameter Δ k p .

As in Gps, Mads uses a fixed finite set of directions called D. Typically, D is composed of the 2n positive and negative coordinate directions. At iteration k, the mesh is a discretization of the space of variables on which all tentative search and poll points need to be selected. Formally, the mesh is defined as
$$\displaystyle{ M_{k} = \{\, x + \varDelta _{k}^{m}Dz : x \in V_{k},\ z \in \mathbb{N}^{n_{D}} \,\} \subset \mathbb{R}^{n}, }$$
where V k denotes the set of trial points where the simulation was launched by the start of iteration k. That set is also known as the cache as it contains the history of all evaluated trial points.
The poll set P k is composed of mesh points whose distance (the infinity norm is frequently used) from the current incumbent solution x k is bounded above by a constant c > 0 times the poll size parameter
$$\displaystyle{P_{k} \subseteq \{\, x \in M_{k} : \| x - x_{k}\| \leq c\varDelta _{k}^{p} \,\}.}$$
Notice that if the constant c equals one, if Δ k p  = Δ k m and if D = [I − I], then the set on the right-hand side of this inclusion corresponds exactly to the poll set P k of Sect. 2.1, with the additional point x k .

The poll set P k must contain at least n + 1 points since x k must lie in the strict interior of its convex hull. An equivalent way of stating this last requirement is that the polling directions must form a positive spanning set.

Figure 1 illustrates the effect of different mesh and poll size parameters in $$\mathbb{R}^{2}$$. In all plots, the arrows represent the directions of the positive spanning set D used to construct the mesh. The mesh is represented by the intersection of the lines. The darker lines delimit the region in which the poll points must be chosen. The incumbent solution x k is located at the intersection of the arrows at the center of each subfigure.

The leftmost figure represents the situation in which both mesh and poll size parameters are equal, and in which the mesh directions of D are the positive and negative coordinate directions, i.e., it depicts the Cs poll set. There is only one possibility to define the poll set with Gps.

The central figure represents an instance of Mads in which the mesh size parameter is half the poll size parameter. The directions are once again the positive and negative coordinate directions, but the poll points of P k can be chosen with more flexibility as they do not need to be located at the endpoints of the arrows. Any combination that contains x k in the strict interior of its convex hull can be chosen.

The rightmost figure is conceptually identical to the middle one, except that the mesh directions are not the coordinate directions. Again, there is great flexibility in choosing the poll points inside the hexagonal frame.

The key algorithmic element introduced in Mads is that the polling directions are not restricted to a fixed finite set. As Δ k m and Δ k p converge to zero, the set of normalized directions which may be used to construct the poll set P k becomes dense in the unit sphere. In the leftmost figure, there are $$3^{2} - 1$$ mesh points from which P k can be chosen. In the central figure, with a ratio of $$\frac{1} {2}$$ for $$\frac{\varDelta _{k}^{m}} {\varDelta _{k}^{p}}$$, that number grows to $$5^{2} - 1$$. For a ratio $$\frac{\varDelta _{k}^{p}} {\varDelta _{k}^{m}} =\lambda \in \mathbb{N}$$, that number reaches $$(2\lambda +1)^{2} - 1$$ in a generalization of this example.
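These counts can be verified by direct enumeration. The sketch below assumes, as in the discussion, that D is composed of the positive and negative coordinate directions, that c = 1, and that distances are measured in the infinity norm; mesh points around x k are then x k + Δ k m z with z an integer vector.

```python
from itertools import product

def candidate_poll_points(n, lam):
    """Count the mesh points eligible as poll points when
    delta_p = lam * delta_m, c = 1, and the infinity norm is used.

    Eligible points are x_k + delta_m * z with z integer and
    ||z||_inf <= lam; the incumbent itself (z = 0) is excluded."""
    pts = [z for z in product(range(-lam, lam + 1), repeat=n)
           if any(zi != 0 for zi in z)]
    return len(pts)
```

In $$\mathbb{R}^{2}$$ the function returns 8 for a ratio of 1, 24 for a ratio of 2, matching the $$(2\lambda + 1)^{2} - 1$$ formula.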

Figure 2 gives a high-level description of a Mads algorithm. Aside from the blackboxes defining the problem, the only element that the user must supply is a starting point x 0, at which the simulation runs successfully. As detailed in the next section, this does not mean that the initial point must be feasible, but only requires that the simulation does not fail. This is modeled by requiring that $$f(x_{0}) < \infty$$.

In practice, a typical termination criterion for direct-search algorithms is based on an overall budget of calls to the simulation, or on wall clock time. Another possibility is to terminate when the mesh size parameter drops below a certain threshold.

The first instantiation of Mads, called LtMads [18], uses randomly generated nonsingular lower triangular matrices to generate the directions used to construct the poll set P k . Three years later, the OrthoMads instantiation [8] used Householder matrices to generate orthogonal maximal positive bases.

### 2.4 Sufficient Decrease Methods

As mentioned in the previous section, Mads generalizes Gps by decoupling the roles of the mesh and poll size parameters. An alternate strategy to allow a rich set of directions consists in removing the mesh requirement, but accepting a trial point as the new incumbent solution only if the decrease in the objective function value is sufficiently large. For continuously differentiable functions, such strategies were proposed and analyzed in [112] and [74], as well as in the frame-based method described in [49, 127] and in the generating set search (Gss) methods [98, 100].

For Lipschitz continuous functions, such sufficient decrease methods ensure a minimal displacement in the space of variables when accepting a new incumbent. This displacement plays a role similar to that of the mesh requirement and prevents the iterates from converging prematurely to an undesirable solution.
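A sufficient decrease test of this kind can be sketched as follows. The quadratic forcing function ρ(Δ) = γΔ² and the value of γ are illustrative choices on our part, not prescriptions from the cited works; what matters is that ρ is positive and goes to zero faster than Δ.

```python
def accepts(f_trial, f_incumbent, delta, gamma=1e-4):
    """Sufficient-decrease test: the trial point is accepted only if it
    improves the incumbent by at least rho(delta) = gamma * delta**2.

    rho is a 'forcing function': positive for delta > 0 and o(delta)
    as delta -> 0, so the test vanishes as the step size shrinks."""
    return f_trial < f_incumbent - gamma * delta ** 2
```

A tiny improvement that would pass the simple decrease test of Cs can thus be rejected, which rules out sequences of arbitrarily small accepted steps.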

## 3 Exploitable Specificities of the Target Problem

Even if the target problem is provided as a blackbox, there are situations where some structure is available and may be exploited. This section lists some examples.

### 3.1 Integer and Categorical Variables

Integer variables can easily be handled by mesh-based methods such as Gps and Mads. Indeed, the mesh M k imposes a discrete structure on the space of variables. It suffices to make sure that the mesh points are integers, and the natural stopping criterion consists in terminating as soon as the mesh size parameter drops below the value one. This is illustrated by the first two plots of Fig. 1. Using the coordinate directions to define the mesh and an integral mesh size parameter ensures that all trial points satisfy the integrality requirement.

Discrete optimization variables are said to be categorical if the objective function or the constraints cannot be evaluated unless the variables take one of a prescribed enumerable set of values. Categorical variables differ from integer variables as they do not possess any natural ordering. These variables need to be accompanied by a user-defined notion of neighborhood, specific to the target problem. The Gps and Mads algorithms are extended to allow such variables in [15] and [7], respectively. In these papers, the variables are partitioned into three groups: continuous, integer, and categorical variables. There is no special treatment for the continuous variables. The integer variables are handled by making sure that the mesh only contains integer points. This is done easily by specifying a minimal mesh size for these variables. The user-defined neighborhood is used to define the poll points of the categorical variables.
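Such a user-defined neighborhood might look as follows. The category values (material types) and the neighborhood rule are entirely hypothetical, chosen only to illustrate the interface: since the values have no natural ordering, this user declares every other value a neighbor.

```python
# Hypothetical categorical variable: a material type with no ordering.
CATEGORIES = ("steel", "aluminum", "titanium")

def categorical_neighbors(value):
    """User-defined neighborhood of a categorical variable.

    With no natural ordering to exploit, this (illustrative) rule
    simply declares all other category values to be neighbors; a real
    application would encode problem-specific proximity instead."""
    return [c for c in CATEGORIES if c != value]
```

The algorithm then polls, for each neighbor, the point obtained by swapping the categorical value while keeping the other variables fixed.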

### 3.2 Explicit Bound and Linear Constraints

Gps algorithms are adapted in [105, 106] to handle bound and linear inequality constraints. The extension consists of making sure at each iteration that the poll directions positively span the tangent cone of the nearly active constraints. In bound-constrained optimization, it suffices to take the positive and negative coordinate directions. For linear inequalities and equalities, the explicit knowledge of the constraints is used to generate the tangent cone [108]. Degeneracy issues can be handled by the strategy proposed in [6].

### 3.3 General Constraints

Of the methods enumerated above, only Mads can handle general nonsmooth constraints. The union of normalized Mads poll directions grows asymptotically dense in the unit sphere, and this allows the treatment of constraints by the extreme barrier, in which infeasible trial points are simply rejected from consideration. This aggressive treatment of constraints is necessary when the simulation cannot be launched when a constraint is violated. For example, a simulation that computes logarithms or square roots cannot be trusted when negative values are entered. These are called unrelaxable constraints. There are also practical situations where the simulation fails inexplicably. These are often referred to as hidden constraints [41].
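The extreme barrier amounts to replacing f by a function that takes the value +∞ outside Ω. A minimal sketch, with our own names, is below; the same wrapper also covers hidden constraints by treating a simulation failure as infeasibility.

```python
import math

def extreme_barrier(f, feasible):
    """Wrap f with the extreme barrier: infeasible points, and points
    where the simulation fails, are assigned +infinity, so they can
    never become the incumbent."""
    def f_omega(x):
        try:
            if not feasible(x):
                return math.inf          # unrelaxable constraint violated
            return f(x)
        except Exception:                # hidden constraint: simulation failed
            return math.inf
    return f_omega
```

For instance, wrapping a square-root simulation protects the algorithm from both the explicit constraint x ≥ 0 and the failure that would occur if it were violated.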

The extreme barrier has the merit of being simple to implement, but there are more subtle ways to handle relaxable constraints, for which the amount of violation can be quantified. For continuously differentiable functions, augmented Lagrangian approaches are presented in [99, 107] for Gps and Gss, respectively.

In [68, 69], a filter method is proposed for nonlinear programming. The main component of a filter method is a constraint violation function h that aggregates the violations of each individual constraint. The function h is nonnegative, and equal to zero only when the corresponding trial point is feasible. Filter methods exploit tradeoffs between the reduction of the objective f and the constraint violation h. Filter methods are adapted to Gps [5, 17] and to frame-based methods [61].

More recently, another mechanism called the progressive barrier was proposed [19] to treat nonsmooth quantifiable constraints. The progressive barrier imposes a maximal threshold on the constraint violation h which is progressively reduced. Trial points whose constraint violation value exceeds the threshold are rejected from consideration. Among all infeasible trial points that are not rejected by the progressive barrier, a local exploration is conducted around the one with the best objective function value.
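The two ingredients can be sketched as follows. The squared-violation aggregation is one common choice for h (the names are ours), and the rule by which the threshold h_max is reduced across iterations is deliberately left out of this sketch.

```python
def violation(constraints, x):
    """Aggregate constraint violation h(x) for constraints c_j(x) <= 0,
    here the sum of squared violations: h is nonnegative and equals
    zero exactly at feasible points."""
    return sum(max(0.0, c(x)) ** 2 for c in constraints)

def progressive_barrier_filter(points, constraints, h_max):
    """Keep only the trial points whose violation does not exceed the
    current barrier threshold h_max; the threshold is progressively
    reduced as the run advances (reduction rule not shown)."""
    return [x for x in points if violation(constraints, x) <= h_max]
```

Unlike the extreme barrier, mildly infeasible points survive the filter and can guide the search toward the feasible region.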

A hybrid method is presented in [28] for the situation where the initial point x 0 does not satisfy all of the quantifiable constraints. Under this strategy, the constraints are initially handled by the progressive barrier, and as soon as an individual constraint is satisfied by an incumbent solution, the treatment of that constraint switches to the extreme barrier.

### 3.4 Minimax Optimization Problems

The lack of differentiability may come from a variety of sources. One of them occurs when the objective function of Problem (1) is obtained by taking the maximum of finitely many functions $$f^{i}: X \rightarrow \mathbb{R} \cup \{\infty \}$$ for i = 1, 2, …, q:
$$\displaystyle{f(x) =\max \{ f^{1}(x),f^{2}(x),\ldots,f^{q}(x)\}.}$$
The maximum operator introduces nondifferentiability, even if the finitely many functions are differentiable. Minimax optimization problems can be used to model worst case scenarios: one may wish to minimize the highest possible loss, i.e., to minimize the largest of the f i (x) values.
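This construction is easy to reproduce. The sketch below builds f from a list of component functions; the two quadratics are an illustrative instance of our own choosing whose maximum has a kink where they cross (at x = 0.5), so f is not differentiable there even though each component is smooth.

```python
def minimax(fs):
    """Build the minimax objective f(x) = max_i f_i(x) from a list of
    component functions f_1, ..., f_q."""
    return lambda x: max(fi(x) for fi in fs)

# Illustrative instance: two smooth quadratics; their pointwise maximum
# is nonsmooth at the crossing point x = 0.5.
f = minimax([lambda x: x ** 2, lambda x: (x - 1) ** 2])
```

Minimizing f locates the crossing point, which no individual component identifies on its own.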

This class of problems is studied in [110] in the case where Ω is defined by linear constraints and the functions f i are twice continuously differentiable. The finite minimax structure is exploited by a smoothing technique based on an exponential penalty function.

More recently, [86] studies the unconstrained case where the functions are continuously differentiable. They exploit the structure of the problem by identifying the active manifold and then treat the objective as a smooth function restricted to the manifold.

### 3.5 Multi-Objective Optimization and Trade-Off Studies

There are situations where one is interested in analyzing the tradeoffs between multiple objectives f (p), p = 1, 2, …, q. There is no single objective function that encapsulates the totality of the design process. In such a situation, the goal of the optimization is not to produce a single solution, but to produce the collection of Pareto undominated solutions. The required computational effort increases rapidly with the number of objectives. A feasible solution x is said to be dominated by another x′ ∈ Ω when f (p)(x′) ≤ f (p)(x) for every objective function, with a strict inequality for at least one of them. The Pareto set is defined to be the set of undominated solutions.
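The dominance relation just defined translates directly into code. This is a straightforward sketch over finite lists of objective vectors (minimization), with names of our own choosing; its quadratic cost in the number of points reflects the remark that the effort grows quickly with problem size.

```python
def dominates(fa, fb):
    """True if objective vector fa dominates fb: no worse in every
    objective and strictly better in at least one (minimization)."""
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))

def pareto_front(points):
    """Undominated subset of a finite list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```

Applied to the evaluations accumulated in the cache, this yields the algorithm's current approximation of the Pareto set.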

The first direct search algorithm for biobjective optimization was introduced in [26] and then generalized to more than two objectives in [29] by incorporating the normal-boundary intersection [54] technique. The method approximates the Pareto front by launching a series of optimizations on single-objective reformulations of the problem. A different mechanism is proposed in [52].

In [30], two strategies are developed to analyze the sensitivity of an optimal solution to general constraints, including bounds on variables, with the help of a direct-search solver. A simple method is performed immediately after a single optimization by inspecting the cache, and a detailed one performs a biobjective optimization trading off the objective against the constraint of interest. The resulting analysis helps in identifying the relative importance of the constraints.

## 4 Tools for Dealing with Costly Blackboxes

In addition to the specificities outlined in the previous section, there are situations where additional tools are available. This section discusses surrogates, models, and parallelism.

### 4.1 Static Surrogate Functions

In blackbox optimization, the functions defining the target problem are expensive to evaluate. A frequently used strategy consists in designing a second blackbox optimization problem called the surrogate. A surrogate needs to share some similarities with the expensive optimization problem, but must be cheaper to evaluate.

Static surrogates may be constructed by reducing the number of internal iterations of the simulation, or through a simplified physics model, for example. The surrogate management framework [37, 58] manages the interplay between the surrogate and the true problem to ensure that the optimization process converges to a solution of the original target problem. The variable precision of a surrogate is exploited in [125] to reduce the overall computational effort.

In the context of Gps and Mads algorithms, the surrogate may be used in many places in the algorithm. A first obvious usage consists in solving the surrogate optimization problem and using the best solution(s) as starting points for the optimization of the true problem. Other usages consist in solving subproblems on the surrogate, and evaluating the true function only at the solution of the subproblem. For example, in [15], an extended poll is conducted when categorical variables are modified. The extended poll can be viewed as a descent in a subspace of variables, using the surrogate problem. In [23], the variable neighborhood search (VNS) metaheuristic [85, 121] is used to attempt to escape locally optimal solutions. The descent is performed on the surrogate problem since VNS may be expensive in terms of function evaluations.

In addition, surrogates may be used at every iteration to order the tentative search and poll points so that the most promising ones are treated first. The list of tentative points is sorted with regard to their surrogate values and then, the expensive simulation is launched on the most promising ones first. An ordering in the presence of constraints is proposed in [44]. The process terminates as soon as a new incumbent is generated, thereby reducing the number of expensive function calls.
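This opportunistic ordering can be sketched in a few lines. The function names are ours; the only assumptions are that the surrogate is cheap relative to the true function and that both are to be minimized.

```python
def opportunistic_poll(f_true, f_surrogate, f_incumbent, poll_points):
    """Order the poll points by surrogate value (most promising first),
    then launch the expensive evaluation opportunistically: stop at the
    first point improving the incumbent.

    Returns the improving point (or None) and the number of expensive
    calls actually spent."""
    calls = 0
    for t in sorted(poll_points, key=f_surrogate):   # cheap ranking
        calls += 1
        if f_true(t) < f_incumbent:                  # expensive call
            return t, calls
    return None, calls
```

When the surrogate ranks the poll points well, a successful iteration costs a single expensive call instead of up to 2n.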

Notice that for a surrogate to be efficient, it does not need to be an accurate model of the true problem. The introduction of the chapter mentioned a problem [21] in which the objective function value represents the time required to perform a set of tasks. The surrogate used in that work consists of the time to perform a small number of these tasks. The units for the surrogate values are seconds, and the units for the true simulations are hours. The surrogate value is not at all a good approximation, but it is very useful as it shares some similarities with the true problem.

### 4.2 Dynamic Models

Static surrogates are usually supplied by the developer of the optimization problem. An alternate way to define a surrogate is to construct approximations of the objective and constraints. These models are then used in the search and poll steps of Algorithm 2 as detailed in the previous subsection. As the algorithm is deployed on a problem, more and more simulations are launched, and the newly collected information can be used to recalibrate and improve the fidelity of the models. This dynamic way of constructing models is done in the update step of Algorithm 2.

A natural option is to consider the quadratic models described in [44, 53] which are constructed and updated by considering past evaluations from the cache that are close to the current incumbent solution. For target problems that are not noisy, quadratic models may lead to significant improvements and increase the speed of convergence to a local solution.

Other strategies to dynamically construct models include DACE Kriging [34, 51, 111, 134], treed Gaussian processes [80] and radial basis functions [33, 123, 128, 143].

### 4.3 Parallelism

Most modern machines now have multiple processors. A first parallel synchronous version of Gps is presented in [57]. The Asynchronous Parallel Pattern Search algorithm Apps [82, 91, 96] removes this synchronization barrier. Adaptations to Gss are presented in [83, 84]. The asynchronous versions are especially useful when the blackbox has heterogeneous computing times depending on the trial point where it is evaluated. A convergence analysis is presented in [97] for the smooth case.

Based on a remark of [60] stating that the parallel variable distribution of Ferris and Mangasarian [65] should be paired with Gps, [25] proposes another strategy to exploit parallelism.

## 5 Theoretical Foundations

None of the methods surveyed in this paper can guarantee convergence to a global minimizer of Problem (1). As stated in its title [136], global optimization requires global information. In blackbox optimization there is no information available, even less global information.

The convergence analysis looks at the sequence of trial points, and studies some of its accumulation points as the iteration number goes to infinity. Of course, this is a theoretical analysis since in practice one cannot let k → ∞. But the analysis is useful as it shows limiting behaviors. Based on local properties of the objective and constraints, the analysis ensures that some necessary optimality conditions are met. This section summarizes the analysis for smooth and nonsmooth problems.

### 5.1 Smooth Unconstrained Optimization

Like Newton’s method for unconstrained optimization, the Cs and Gps algorithms may get stuck at saddle points if the polling directions are not properly chosen. For example, if Newton’s method or Cs is applied to the unconstrained minimization of the quadratic function f(x) = x 1 x 2 from the origin, then the sequence of iterates stagnates at the origin, which is a stationary point.

However, [3] shows that Gps and Cs cannot converge to a strict local maximizer, unlike Newton’s method. It may appear surprising that a method using first and second derivatives of a $$C^{2}$$ function ensures weaker convergence results than a method that uses only function values, without using or estimating derivatives. This study is generalized to Mads in [4].

It is shown in [62] that if ∇f(⋅ ) is Lipschitz continuous, then at an unsuccessful iteration k of Gps, the gradient norm $$\|\nabla f(x_{k})\|$$ is bounded above by a constant times the current mesh size parameter, whose limit inferior converges to zero [140]. An interesting consequence of this result is that it provides a theoretical justification for stopping criteria based on a small mesh size parameter.
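The bound can be observed numerically. The sketch below (an illustrative construction, not from [62]) runs the same toy coordinate search on the smooth function $$f(x) = x_{1}^{2} + x_{2}^{2}$$ and records the ratio of the gradient norm to the step size at every unsuccessful iteration. For this function, a failed poll along $$\pm e_{i}$$ forces $$\vert \partial f/\partial x_{i}\vert \leq \varDelta _{k}$$, so the ratio never exceeds $$\sqrt{2}$$.

```python
import math

f = lambda x: x[0] ** 2 + x[1] ** 2        # smooth convex test function
grad = lambda x: [2 * x[0], 2 * x[1]]      # exact gradient, used for checking only

x, step, ratios = [0.9, -1.3], 1.0, []
while step > 1e-8:
    trials = [[x[0] + step, x[1]], [x[0] - step, x[1]],
              [x[0], x[1] + step], [x[0], x[1] - step]]
    better = [y for y in trials if f(y) < f(x)]
    if better:
        x = min(better, key=f)             # successful iteration: move
    else:                                  # unsuccessful: gradient is provably small
        g = grad(x)
        ratios.append(math.hypot(g[0], g[1]) / step)
        step *= 0.5                        # refine the mesh
# Every recorded ratio is at most sqrt(2), so a small step size certifies
# a small gradient norm -- the stopping-test justification in the text.
```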

### 5.2 Nonsmooth Analysis for Unconstrained Optimization

As mentioned in the introduction, the target problems for which the direct-search methods are designed are typically blackbox problems. The function values returned by the blackbox are the result of an expensive simulation. There is no reason to believe that they should be differentiable or even continuous functions. Therefore, the convergence results outlined in the previous subsection, while certainly true, are somewhat incompatible with the target problems.

The paper [16] studies the convergence of the Gps method under less restrictive assumptions for unconstrained optimization. It proposes a hierarchy of convergence results based on the local smoothness of the objective function. To achieve this, it considers the set of unsuccessful iterations, i.e., the iterations where the incumbent solution $$x_{k}$$ is shown to have an objective function value less than or equal to that of the neighboring poll points in $$P_{k}$$. Such an incumbent is called a mesh local optimizer. It then considers subsequences of unsuccessful iterations for which the corresponding incumbent solutions converge to a limit point denoted $$\hat{x}$$ and for which the corresponding subsequence of mesh size parameters converges to zero. Such a subsequence of iterates is called a refining subsequence, and $$\hat{x}$$ a refined point.

The fundamental convergence result does not require any assumption on the objective function. It is called the zero-th order result and states that $$\hat{x}$$ is the limit of mesh local optimizers on meshes that get arbitrarily fine.

The Clarke calculus for nonsmooth functions [43] generalizes notions such as the directional derivative and the gradient to non-differentiable functions. The fundamental convergence theorem of [16] states that the Clarke generalized directional derivative at a refined point $$\hat{x}$$ in a direction d used infinitely often in the refining subsequence,
$$\displaystyle{f^{\circ }(\hat{x};d)\ :=\ \limsup _{y\rightarrow \hat{x},\ t\downarrow 0}\frac{f(y + td) - f(y)}{t},}$$
is nonnegative, provided f is locally Lipschitz near $$\hat{x}$$. The convergence analysis then progressively adds assumptions on f, such as local regularity and strict differentiability [103].
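For illustration (an example added here, not taken from [16]), consider the nondifferentiable function f(x) =  | x | on $$\mathbb{R}$$. At $$\hat{x} = 0$$, the triangle inequality gives $$\vert y + td\vert -\vert y\vert \leq t\vert d\vert$$, and taking y = 0 attains this bound, so
$$\displaystyle{f^{\circ }(0;d)\ =\ \limsup _{y\rightarrow 0,\ t\downarrow 0}\frac{\vert y + td\vert -\vert y\vert } {t}\ =\ \vert d\vert \ \geq \ 0\quad \mbox{ for all }d \in \mathbb{R},}$$
so the minimizer 0 satisfies the Clarke stationarity condition even though the ordinary derivative f′(0) does not exist.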

The convergence analysis of Mads [18, 24] strengthens this result to $$f^{\circ }(\hat{x};d) \geq 0$$ for every direction $$d \in \mathbb{R}^{n}$$. This last result can be stated equivalently as follows: 0 belongs to the generalized gradient of f at $$\hat{x}$$. A similar convergence analysis is developed for the DIRECT algorithm [66] and for sampling methods for perturbed Lipschitz functions [67].

### 5.3 Nonsmooth Stationarity for Constrained Optimization

In smooth optimization, a necessary optimality condition states that if $$\hat{x}$$ is a local minimizer of the function f over the domain $$\varOmega \subseteq \mathbb{R}^{n}$$, then the directional derivative of f at $$\hat{x}$$ in every tangent direction d to Ω is nonnegative:
$$\displaystyle{f^{{\prime}}(\hat{x};d)\ \geq \ 0,\quad \mbox{ for every }d \in T_{\varOmega }(\hat{x}).}$$
Another way to see this optimality condition is to state that there are no feasible descent directions at $$\hat{x}$$.
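As a simple illustration (not from the source), take $$\varOmega =\{ x \in \mathbb{R}: x \geq 0\}$$ and f(x) = x. The minimizer is $$\hat{x} = 0$$, where the tangent cone is $$T_{\varOmega }(0) = [0,\infty )$$, and indeed $$f^{{\prime}}(0;d) = d \geq 0$$ for every tangent direction, even though the unconstrained condition ∇f(0) = 0 fails: there is simply no feasible descent direction at 0.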
Using the Clarke calculus, and generalizations of the tangent cone, [18] shows that the refined point $$\hat{x}$$ generated by the Madsalgorithm under the extreme barrier satisfies
$$\displaystyle{f^{\circ }(\hat{x};d)\ \geq \ 0,\quad \mbox{ for every }d \in T_{\varOmega }^{H}(\hat{x})}$$
under the assumption that f is locally Lipschitz near $$\hat{x}$$, where $$T_{\varOmega }^{H}(\hat{x})$$ is the hypertangent cone [92, 130], a nonsmooth generalization of the tangent cone. The Rockafellar upper subderivative [130] is defined for non-Lipschitz functions and is analyzed in [142] to enrich the convergence hierarchy.
When handling some relaxable quantifiable constraints with the progressive barrier [19] rather than the extreme barrier, the method may generate feasible or infeasible refined points. The same necessary optimality conditions are guaranteed for feasible refined points. However, the analysis shows that an infeasible refined point $$\hat{x}$$ satisfies
$$\displaystyle{h^{\circ }(\hat{x};d)\ \geq \ 0,\quad \mbox{ for every }d \in T_{ X}^{H}(\hat{x})}$$
where h is the constraint violation function mentioned in Sect. 3.3, and X is the domain corresponding to the unrelaxable or non-quantifiable constraints, which are handled by the extreme barrier. Roughly speaking, this means that the algorithm has reached a local minimizer of the constraint violation function. This occurs in particular when the optimization problem has no feasible solutions.
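A minimal sketch of such a constraint violation function is given below, assuming the squared-violation aggregation commonly used with the progressive barrier and relaxable constraints written as $$g_{j}(x) \leq 0$$; the two constraint functions are illustrative assumptions.

```python
def h(x, constraints):
    """Constraint violation: zero iff x satisfies every g_j(x) <= 0,
    otherwise the sum of squared positive parts of the violations."""
    return sum(max(g(x), 0.0) ** 2 for g in constraints)

g1 = lambda x: x[0] + x[1] - 1.0   # encodes x1 + x2 <= 1
g2 = lambda x: -x[0]               # encodes x1 >= 0

h_infeasible = h([2.0, 0.0], [g1, g2])   # g1 violated by 1 -> h = 1.0
h_feasible = h([0.5, 0.25], [g1, g2])    # all satisfied    -> h = 0.0
```

An infeasible refined point of the progressive barrier is, in these terms, a local minimizer of h restricted to X.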

## 6 Applications to Real Blackbox Problems

The methods discussed above were designed to be applied to real blackbox optimization problems. The present section lists some of these applications. They are presented in non-disjoint groups, and the list is not exhaustive. Other applications can be found in the introductory chapter of [47].

Some of these applications were solved with NOMAD [9, 101, 102], an open-source C++ implementation of Mads for single- or biobjective blackbox optimization problems of the form (1). NOMAD integrates the features listed in the present paper, as well as others such as Latin hypercube sampling [139], periodic variables [20], and batch or library modes.

### 6.1 Shape Optimization

The Ph.D. thesis [113] applies the surrogate management framework mentioned in Sect. 4.1 to identify the shape of a hydrofoil trailing edge that minimizes the aerodynamic noise propagated to the far field. The computations require large-eddy simulations. Unconstrained results for laminar flow are presented in [114], and deformations of the upper and lower surfaces of the trailing edge in laminar flow with lift and drag constraints are analyzed in [115], resulting in as much as a 70 % reduction in noise. Reynolds-averaged Navier–Stokes calculations are incorporated for constraint evaluation to make the optimization more efficient in [116], leading to an 89 % noise reduction.

A framework for coupling optimal shape design to time-accurate three-dimensional blood flow simulations in idealized cardiovascular geometries is presented in [117]. Results on idealized Y-shaped baffle for the Fontan surgery for children with congenital heart defects are shown in [144]. Uncertainties in the simulation input parameters as well as shape design variables are accounted for in [131] using the adaptive stochastic collocation technique of [132].

In [126], the evolution and propagation of cracks in two-dimensional elastic domains are studied. The simulation requires a finite element discretization and a set of partial differential equations with nonlinear boundary conditions. The ultimate goal is to determine the optimal shape resulting in a crack path with as much energy as possible without completely destroying the specimen.

### 6.2 Positioning Problems

There are situations in which the blackbox is expensive to evaluate, but some properties of the variables are known and may be exploited. One such family of problems is that in which some or all of the variables represent spatial coordinates. In recent years, direct-search methods have been applied to some nonsmooth positioning problems.

Following the 2004 Indian Ocean tsunami, an effort led by the National Oceanic and Atmospheric Administration’s Pacific Marine Environmental Labs was undertaken to identify the optimal positions at which to deploy tsunami detection buoys in the Pacific Ocean [135]. The question was formulated as an optimization problem, where the variables were the coordinates of the buoys and the objective was to maximize the warning time to coastal cities in the event of a tsunami. Constraints on water depth and bottom roughness were incorporated, and NOMAD would then position a selected number of buoys within a sub-region of the ocean so as to optimize the detection time for a set of unit sources.

In [72], a water supply problem and a hydraulic capture problem are proposed as a challenge to the blackbox optimization community. The objective of the water supply problem is to minimize the cost of supplying a specific quantity of water, subject to a set of constraints on the net extraction rate, pumping rates, and hydraulic head. The decision variables are the two-dimensional locations and pumping rates of the wells, as well as the number of wells. The objective of the hydraulic capture problem is to minimize the cost needed to prevent an initial contaminant plume from spreading, by using wells to control the direction and extent of advective fluid flow. The methods used in the study are APPS [91], Boeing DE [34, 51], DIRECT [73, 93], IFFCO [77, 94], NOMAD, and NSGA-II [56]. Additional tests with IFFCO are conducted in [71].

More recently, researchers [42, 118] have used snow-monitoring devices to estimate the quantity of water stored in snow over a vast domain. When the accumulated snow melts in spring, large quantities of water are released, and a precise estimation is necessary for the efficient management of hydroelectric dams. In [10], the question of identifying the optimal positions of these snow-monitoring devices is studied, with the aim of minimizing the overall kriging approximation error of the quantity of water. Different strategies for exploiting the fact that the variables represent locations are presented.

In [119], the question of positioning antennas in an irregularly shaped domain and assigning radio frequencies in a telecommunications network is studied as a blackbox optimization problem. The proposed methodology combines Mads for the positioning problem with a tabu search for the radio frequencies.

### 6.3 Parameter Estimation

There are situations in which a model is characterized by a set of parameters. The question of assigning good values to the parameters may be formulated as an optimization problem in which the difference between observed and predicted data points needs to be minimized. Below are examples of such applications.

In [109], an automated flow is described for total-ionizing-dose (TID)-aware SPICE model generation that includes the TID response and its dependence on process variability and layout. A differential evolution algorithm is adapted for global exploration, and a modified Gps strategy is introduced for local exploration. The optimizer efficiently reduces the value of different kinds of objective functions in the extraction at reasonable cost and avoids premature convergence in most practical cases.

Facial recognition systems are studied in [39]. Given a database of image samples of known individuals, the task is to design a system that, for any input image, identifies the input with one of the known individuals. The classification problem involves designing a function to map feature vectors to the appropriate class label. The Mads algorithm was shown to outperform heuristics both in accuracy and in processing time.

In [137], a conditional averaging approach to estimate the parameters of a land surface water and energy balance model is presented. The parameters are then used to classify net radiation and precipitation. The paper proposes an objective function that approximates the temperature- and moisture-dependent errors in terms of atmospheric forcing, surface states, and model parameters. Minimizing the approximated error yields parameters for model applications.

A method for determining the fire front positions for optically thin flames and the rate of spread of forest or vegetation fires is presented in [40]. The first step of the method measures the heat fluxes coming from the flame by a specific thermal sensor in four horizontal directions. In the second step, these heat fluxes are approximated by a radiative transfer equation. Then, the positions of the fire front and the flame characteristics are determined by applying an inverse method. The rate of spread is deduced by applying a least-square regression on the position values.

In [87], a method for evaluating the kinetic constants in a rate expression for catalytic combustion applications using experimental light-off curves is presented. The method uses a finite element reactor model to simulate reactor performance. The heat and mass transfer models used account for developing flow in the entrance region. A Gps algorithm is used to determine the best-fit parameters.

### 6.4 Tuning of Algorithms

Many algorithms depend on a set of parameters. As long as the parameters satisfy some prescribed requirements, the algorithm may be trusted to perform adequately. The question of finding good parameter values has been studied under different names, including parameter tuning, software automatic tuning, and parameter optimization. Direct-search methods are used in [88] to optimize computations involving matrices. In [21], the question of adjusting the four trust-region algorithmic parameters so as to minimize the overall computational time to solve a large collection of CUTEr [78] test problems is studied and solved using NOMAD. The sensitivity to the parameters is studied in [79].

A more general blackbox formulation of this question is proposed in [27] through the Opal framework and applied to the DFO algorithm [48]. The user of the Opal framework must supply a target algorithm, a set of metrics defining the notions of acceptable parameter values and of performance of the algorithm, and a collection of representative sets of valid input data for the target algorithm.
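These ingredients can be sketched in a few lines. The toy below is an illustrative assumption (a bisection tolerance tuned over two test problems, not the actual Opal/DFO setup of [27]): a target algorithm plus a performance metric over representative input data yields a blackbox objective that a direct-search method such as Mads could then minimize.

```python
def target_algorithm(tol, problem):
    """Toy target: bisection for g(x) = 0 on [a, b]; cost = iteration count."""
    g, a, b = problem
    iterations = 0
    while b - a > tol:
        m = 0.5 * (a + b)
        a, b = (m, b) if g(m) < 0 else (a, m)
        iterations += 1
    return iterations

# Representative set of valid input data for the target algorithm.
test_set = [(lambda x: x - 0.3, 0.0, 1.0),
            (lambda x: x ** 3 - 2.0, 0.0, 2.0)]

def blackbox(tol):
    """Performance metric: total work over the test set, as a function of the
    algorithmic parameter tol. Piecewise constant, hence nonsmooth."""
    return sum(target_algorithm(tol, p) for p in test_set)
```

The acceptability metric (here, simply tol > 0) and the choice of test set are exactly the user-supplied pieces described above.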

Parallelism may be used at various levels within the Opal framework. It may be used at the blackbox solver level to concurrently test parameter values, or it can be used to assess the quality of a set of parameters on different test problems. These two strategies and a combination of both are studied in [31].

The Opal framework is applied in [124] to a matrix multiplication algorithm, where optimization with respect to blocking, loop unrolling, and compiler flags takes place. This application requires the use of categorical variables.

A methodology based on variable selection and a sensitivity analysis of inputs is applied to several instructive data sets, and an analysis of automatic computer code tuning is presented in [81]. A set of extensible and portable search problems in automatic performance tuning is proposed in [32].

### 6.5 Engineering Design Applications

The papers [35, 36] and the thesis [133] study a design problem of interest to Boeing. It consists of minimizing a vibration measure of a helicopter rotor blade. A simplified surrogate simulation code is used, requiring only a few minutes in contrast with hours for the true simulation. In addition, approximately 60 % of the simulation calls fail to return a value. This work is the first to illustrate the use of the surrogate management framework [37].

Categorical variables are studied in [95] to minimize the power required to maintain heat shields at given temperatures in a thermal insulation system. Variables defining this optimization problem include thicknesses, temperatures, and also the types of materials used as insulators as well as the number of shields. This means that the number of variables defining the problem is itself an optimization variable. Nonlinear constraints are considered in [2]. A formulation of the problem without the use of categorical variables is presented in [1].

The environmental impact of a commercial aircraft departure is defined by noise nuisance in the protected zones near airports, local air quality, and global warming. A multiobjective, constrained, nonlinear optimization problem is formulated to obtain optimal departure procedures in [141].

Spent potliners are a toxic byproduct generated by the aluminum industry. A treatment process for spent potliners is presented in [50], and its seven input parameters are optimized subject to four blackbox constraints in [22]. The simulation requires the Aspen simulation software [12].

The Mads algorithm is applied in [138], in combination with full-field electromagnetic simulations, to tailor the broadband spectral response of gold and silver split-ring resonator metamaterials.

In a series of papers [75, 76, 129], the FactSage thermodynamic software [63] is coupled with the NOMAD software to optimize alloy and process designs. Single- and biobjective constrained problems are studied. The FactSage database contains thermodynamic properties as functions of temperature, pressure, and composition for over 5,000 pure substances and hundreds of multicomponent solid and liquid solutions.

## 7 Discussion

Since the 1990s, there has been a renewed interest in direct search methods for nonsmooth blackbox optimization without derivatives. Recent methods can now handle general constraints, multiple objectives, integer and categorical variables, and can exploit models and surrogates to guide the optimization. An important effort has been deployed to embed the methods into a general framework and to develop a convergence analysis for this framework. Many of these methods are now supported by a hierarchical analysis that ensures necessary stationary conditions based on local properties of the objective and constraints.

Section 6 shows that there are numerous applications of these optimization methods on real problems. With only a few exceptions, most of these blackboxes are not openly distributed to the optimization community. Some of them use proprietary codes, others can only be released internally. Consequently, there are not many real blackbox optimization problems that can be shared and used for benchmarking different methods.

Comparing direct search methods for blackbox optimization on smooth problems from the CUTEr [78] or the Hock and Schittkowski [89] collections (or on some perturbed variants) is not ideal, as these problems do not possess the same kind of difficulties as derivative-free blackbox problems from real applications. Hopefully, more and more real test problems will be shared to help developers design more efficient optimization methods.

## Notes

### Acknowledgements

This work was supported by NSERC grant 239436 and AFOSR FA9550-12-1-0198.

## References

1. 1.
Abhishek, K., Leyffer, S., Linderoth, J.T.: Modeling without categorical variables: a mixed-integer nonlinear program for the optimization of thermal insulation systems. Optim. Eng. 11, 185–212 (2010). doi:10.1007/s11081-010-9109-z
2. 2.
Abramson, M.A.: Mixed variable optimization of a load-bearing thermal insulation system using a filter pattern search algorithm. Optim. Eng. 5(2), 157–177 (2004)
3. 3.
Abramson, M.A.: Second-order behavior of pattern search. SIAM J. Optim. 16(2), 315–330 (2005)
4. 4.
Abramson, M.A., Audet, C.: Convergence of mesh adaptive direct search to second-order stationary points. SIAM J. Optim. 17(2), 606–619 (2006)
5. 5.
Abramson, M.A., Audet, C., Dennis, J.E. Jr.: Filter pattern search algorithms for mixed variable constrained optimization problems. Pac. J. Optim. 3(3), 477–500 (2007)
6. 6.
Abramson, M.A., Brezhneva, O.A., Dennis, J.E. Jr., Pingel, R.L.: Pattern search in the presence of degenerate linear constraints. Optim. Methods Softw. 23(3), 297–319 (2008)
7. 7.
Abramson, M.A., Audet, C., Chrissis, J.W., Walston, J.G.: Mesh adaptive direct search algorithms for mixed variable optimization. Optim. Lett. 3(1), 35–47 (2009)
8. 8.
Abramson, M.A., Audet, C., Dennis, J.E. Jr., Le Digabel, S.: OrthoMADS: A deterministic MADS instance with orthogonal directions. SIAM J. Optim. 20(2), 948–966 (2009)
9. 9.
Abramson, M.A., Audet, C., Couture, G., Dennis, J.E. Jr., Le Digabel, S., Tribes, C.: The NOMAD project (2014). Software available at http://www.gerad.ca/nomad
10. 10.
Alarie, S., Audet, C., Garnier, V., Le Digabel, S., Leclaire, L.A.: Snow water equivalent estimation using blackbox optimization. Pac. J. Optim. 9(1), 1–21 (2013)
11. 11.
Alberto, P., Nogueira, F., Rocha, H., Vicente, L.N.: Pattern search methods for user-provided points: Application to molecular geometry problems. SIAM J. Optim. 14(4), 1216–1236 (2004)
12. 12.
Aspentech (2014). http://www.aspentech.com/
13. 13.
Audet, C.: Convergence results for pattern search algorithms are tight. Optim. Eng. 5(2), 101–122 (2004)
14. 14.
Audet, C.: A short proof on the cardinality of maximal positive bases. Optim. Lett. 5(1), 191–194 (2011)
15. 15.
Audet, C., Dennis, J.E. Jr.: Pattern search algorithms for mixed variable programming. SIAM J. Optim. 11(3), 573–594 (2001)
16. 16.
Audet, C., Dennis, J.E. Jr.: Analysis of generalized pattern searches. SIAM J. Optim. 13(3), 889–903 (2003)
17. 17.
Audet, C., Dennis, J.E. Jr.: A pattern search filter method for nonlinear programming without derivatives. SIAM J. Optim. 14(4), 980–1010 (2004)
18. 18.
Audet, C., Dennis, J.E. Jr.: Mesh adaptive direct search algorithms for constrained optimization. SIAM J. Optim. 17(1), 188–217 (2006)
19. 19.
Audet, C., Dennis, J.E. Jr.: A progressive barrier for derivative-free nonlinear programming. SIAM J. Optim. 20(4), 445–472 (2009)
20. 20.
Audet, C., Le Digabel, S.: The mesh adaptive direct search algorithm for periodic variables. Pac. J. Optim. 8(1), 103–119 (2012)
21. 21.
Audet, C., Orban, D.: Finding optimal algorithmic parameters using derivative-free optimization. SIAM J. Optim. 17(3), 642–664 (2006)
22. 22.
Audet, C., Béchard, V., Chaouki, J.: Spent potliner treatment process optimization using a MADS algorithm. Optim. Eng. 9(2), 143–160 (2008)
23. 23.
Audet, C., Béchard, V., Le Digabel, S.: Nonsmooth optimization through mesh adaptive direct search and variable neighborhood search. J. Glob. Optim. 41(2), 299–318 (2008)
24. 24.
Audet, C., Custódio, A.L., Dennis, J.E. Jr.: Erratum: Mesh adaptive direct search algorithms for constrained optimization. SIAM J. Optim. 18(4), 1501–1503 (2008)Google Scholar
25. 25.
Audet, C., Dennis, J.E. Jr., Le Digabel, S.: Parallel space decomposition of the mesh adaptive direct search algorithm. SIAM J. Optim. 19(3), 1150–1170 (2008)
26. 26.
Audet, C., Savard, G., Zghal, W.: Multiobjective optimization through a series of single-objective formulations. SIAM J. Optim. 19(1), 188–210 (2008)
27. 27.
Audet, C., Dang, C.-K., Orban, D.: Algorithmic parameter optimization of the DFO method with the OPAL framework. In: Naono, K., Teranishi, K., Cavazos, J., Suda, R. (eds.) Software Automatic Tuning: From Concepts to State-of-the-Art Results, Chap. 15, pp. 255–274. Springer, Berlin (2010)Google Scholar
28. 28.
Audet, C., Dennis, J.E. Jr., Le Digabel, S.: Globalization strategies for mesh adaptive direct search. Comput. Optim. Appl. 46(2), 193–215 (2010)
29. 29.
Audet, C., Savard, G., Zghal, W.: A mesh adaptive direct search algorithm for multiobjective optimization. Eur. J. Oper. Res. 204(3), 545–556 (2010)
30. 30.
Audet, C., Dennis, J.E. Jr., Le Digabel, S.: Trade-off studies in blackbox optimization. Optim. Methods Softw. 27(4–5), 613–624 (2012)
31. 31.
Audet, C., Dang, C.-K., Orban, D.: Efficient use of parallelism in algorithmic parameter optimization applications. Optim. Lett. 7(3), 421–433 (2013)
32. 32.
Balaprakash, P. Wild, S.M., Norris, B.: Spapt: Search problems in automatic performance tuning. Proc. Comput. Sci. 9, 1959–1968 (2012). Proceedings of the International Conference on Computational Science, ICCS (2012)Google Scholar
33. 33.
Björkman, M., Holmström, K.: Global optimization of costly nonconvex functions using radial basis functions. Optim. Eng. 1, 373–397 (2000)
34. 34.
Booker, A.J.: Well-conditioned Kriging models for optimization of computer simulations. Technical Report M&CT-TECH-00-002, Boeing Computer Services, Research and Technology, M/S 7L–68, Seattle, Washington 98124 (2000)Google Scholar
35. 35.
Booker, A.J., Dennis, J.E. Jr., Frank, P.D., Moore, D.W., Serafini, D.B.: Managing surrogate objectives to optimize a helicopter rotor design – further experiments. AIAA Paper 1998–4717, Presented at the 8th AIAA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, St. Louis (1998)Google Scholar
36. 36.
Booker, A.J., Dennis, J.E. Jr., Frank, P.D., Serafini, D.B., Torczon, V.:. Optimization using surrogate objectives on a helicopter test example. In: Borggaard, J., Burns, J., Cliff, E., Schreck, S. (eds.) Optimal Design and Control. Progress in Systems and Control Theory, pp. 49–58. Birkhäuser, Cambridge (1998)Google Scholar
37. 37.
Booker, A.J., Dennis, J.E. Jr., Frank, P.D., Serafini, D.B., Torczon, V., Trosset, M.W.: A rigorous framework for optimization of expensive functions by surrogates. Struct. Multidiscip. Optim. 17(1), 1–13 (1999)Google Scholar
38. 38.
Box, G.E.P.: Evolutionary operation: A method for increasing industrial productivity. Appl. Stat. 6, 81–101 (1957)Google Scholar
39. 39.
Caleanu, C.-D., Mao, X., Pradel, G., Moga, S., Xue, Y.: Combined pattern search optimization of feature extraction and classification parameters in facial recognition. Pattern Recognit. Lett. 32(9), 1250–1255 (2011)Google Scholar
40. 40.
Chetehouna, K., Sero-Guillaume, O., Sochet, I., Degiovanni, A.: On the experimental determination of flame front positions and of propagation parameters for a fire. Int. J. Therm. Sci. 47(9), 1148–1157 (2008)Google Scholar
41. 41.
Choi, T.D., Kelley, C.T.: Superlinear convergence and implicit filtering. SIAM J. Optim. 10(4), 1149–1162 (2000)
42. 42.
Choquette, Y., Lavigne, P., Ducharme, P., Houdayer, A., Martin, J.-P.: Apparatus and Method for Monitoring Snow Water Equivalent and Soil Moisture Content Using Natural Gamma Radiation, September 2010. US Patent No. 7800051 B2Google Scholar
43. 43.
Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, New York (1983). Reissued in 1990 by SIAM Publications, Philadelphia, as Vol. 5 in the series Classics in Applied MathematicsGoogle Scholar
44. 44.
Conn, A.R., Le Digabel, S.: Use of quadratic models with mesh-adaptive direct search for constrained black box optimization. Optim. Methods Softw. 28(1), 139–158 (2013)
45. 45.
Conn, A.R., Scheinberg, K., Toint, Ph.L.: On the convergence of derivative-free methods for unconstrained optimization. In: Buhmann, M.D., Iserles, A. (eds.) Approximation Theory and Optimization: Tributes to M.J.D. Powell, pp. 83–108. Cambridge University Press, Cambridge (1997)Google Scholar
46. 46.
Conn, A.R., Scheinberg, K., Vicente, L.N.: Global convergence of general derivative-free trust-region algorithms to first and second order critical points. SIAM J. Optim. 20(1), 387–415 (2009)
47. 47.
Conn, A.R., Scheinberg, K., Vicente, L.N.: Introduction to Derivative-Free Optimization. MOS/SIAM Series on Optimization. SIAM, Philadelphia (2009)
48. 48.
Conn, A.R., Scheinberg, K., Toint, Ph.L.: DFO (derivative free optimization) (2014). Software available at http://www.coin-or.org
49. 49.
Coope, I.D., Price, C.J.: Frame-based methods for unconstrained optimization. J. Optim. Theory Appl. 107(2), 261–274 (2000)
50. 50.
Courbariaux, Y., Chaouki, J., Guy, C.: Update on spent potliners treatments: Kinetics of cyanides destruction at high temperature. Ind. Eng. Chem. Res. 43(18), 5828–5837 (2004)Google Scholar
51. 51.
Cramer, E.J., Gablonsky, J.M.: Effective parallel optimization of complex computer simulations. In: Proceedings of the 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference (2004)Google Scholar
52. 52.
Custódio, A.L., Madeira, J.F.A., Vaz, A.I.F., Vicente, L.N.: Direct multisearch for multiobjective optimization. SIAM J. Optim. 21(3), 1109–1140 (2011)
53. 53.
Custódio, A.L., Rocha, H., Vicente, L.N.: Incorporating minimum Frobenius norm models in direct search. Comput. Optim. Appl. 46(2), 265–278 (2010)
54. 54.
Das, I., Dennis, J.E. Jr.: Normal-boundary intersection: A new method for generating the pareto surface in nonlinear multicriteria optimization problems. SIAM J. Optim. 8(3), 631–657 (1998)
55. 55.
Davis, C.: Theory of positive linear dependence. Am. J. Math. 76, 733–746 (1954)
56. 56.
Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)Google Scholar
57. 57.
Dennis, J.E. Jr., Torczon, V.: Direct search methods on parallel machines. SIAM J. Optim. 1(4), 448–474 (1991)
58. 58.
Dennis, J.E. Jr., Torczon, V.: Managing approximation models in optimization. In: Alexandrov, N.M., Hussaini, M.Y. (eds.) Multidisciplinary Design Optimization: State of the Art, pp. 330–347. SIAM, Philadelphia (1997)Google Scholar
59. 59.
Dennis, J.E. Jr., Woods, D.J.: Optimization on microcomputers: The Nelder–Mead simplex algorithm. In: Wouk, A. (ed.) New Computing Environments: Microcomputers in Large-Scale Computing, pp. 116–122. Society for Industrial and Applied Mathematics, Philadelphia (1987)Google Scholar
60. 60.
Dennis, J.E. Jr., Wu, Z.: Parallel Continuous Optimization. Sourcebook of Parallel Computing, pp. 649–670. Morgan Kaufmann, San Francisco (2003)Google Scholar
61. 61.
Dennis, J.E. Jr., Price, C.J., Coope, I.D.: Direct search methods for nonlinearly constrained optimization using filters and frames. Optim. Eng. 5(2), 123–144 (2004)
62. 62.
Dolan, E.D., Lewis, R.M., Torczon, V.: On the local convergence of pattern search. SIAM J. Optim. 14(2), 567–583 (2003)
63. 63.
FactSage (2014). http://www.factsage.com/
64. Fermi, E., Metropolis, N.: Numerical solution of a minimum problem. Los Alamos Unclassified Report LA-1492, Los Alamos National Laboratory, Los Alamos (1952)
65. Ferris, M.C., Mangasarian, O.L.: Parallel variable distribution. SIAM J. Optim. 4(4), 815–832 (1994)
66. Finkel, D.E., Kelley, C.T.: Convergence analysis of the DIRECT algorithm. Technical Report CRSC-TR04-28, Center for Research in Scientific Computation (2004)
67. Finkel, D.E., Kelley, C.T.: Convergence analysis of sampling methods for perturbed Lipschitz functions. Pac. J. Optim. 5(2), 339–350 (2009)
68. Fletcher, R., Leyffer, S.: Nonlinear programming without a penalty function. Math. Program. Ser. A 91, 239–269 (2002)
69. Fletcher, R., Leyffer, S., Toint, Ph.L.: On the global convergence of an SLP-filter algorithm. Technical Report NA/183, Department of Mathematics, Dundee University (1998)
70. Fourer, R., Orban, D.: Dr. Ampl – a meta-solver for optimization problem analysis. Comput. Manag. Sci. 7(4), 437–463 (2009)
71. Fowler, K.R., Kelley, C.T., Miller, C.T., Kees, C.E., Darwin, R.W., Reese, J.P., Farthing, M.W., Reed, M.S.C.: Solution of a well-field design problem with implicit filtering. Optim. Eng. 5(2), 207–234 (2004)
72. Fowler, K.R., Reese, J.P., Kees, C.E., Dennis, J.E. Jr., Kelley, C.T., Miller, C.T., Audet, C., Booker, A.J., Couture, G., Darwin, R.W., Farthing, M.W., Finkel, D.E., Gablonsky, J.M., Gray, G., Kolda, T.G.: Comparison of derivative-free optimization methods for groundwater supply and hydraulic capture community problems. Adv. Water Resour. 31(5), 743–757 (2008)
73. Gablonsky, J.M., Kelley, C.T.: A locally-biased form of the DIRECT algorithm. J. Glob. Optim. 21, 27–37 (2001)
74. García-Palomares, U.M., Rodríguez, J.F.: New sequential and parallel derivative-free algorithms for unconstrained optimization. SIAM J. Optim. 13(1), 79–96 (2002)
75. Gheribi, A.E., Robelin, C., Le Digabel, S., Audet, C., Pelton, A.D.: Calculating all local minima on liquidus surfaces using the FactSage software and databases and the Mesh Adaptive Direct Search algorithm. J. Chem. Thermodyn. 43(9), 1323–1330 (2011)
76. Gheribi, A.E., Audet, C., Le Digabel, S., Bélisle, E., Bale, C.W., Pelton, A.D.: Calculating optimal conditions for alloy and process design using thermodynamic and properties databases, the FactSage software and the Mesh Adaptive Direct Search algorithm. CALPHAD 36, 135–143 (2012)
77. Gilmore, P., Kelley, C.T.: An implicit filtering algorithm for optimization of functions with many local minima. SIAM J. Optim. 5(2), 269–285 (1995)
78. Gould, N.I.M., Orban, D., Toint, Ph.L.: CUTEr (and SifDec): A constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw. 29(4), 373–394 (2003)
79. Gould, N.I.M., Orban, D., Sartenaer, A., Toint, Ph.L.: Sensitivity of trust-region algorithms to their parameters. 4OR 3(3), 227–241 (2005)
80. Gramacy, R.B., Le Digabel, S.: The mesh adaptive direct search algorithm with treed Gaussian process surrogates. Technical Report G-2011-37, Les cahiers du GERAD (2011)
81. Gramacy, R.B., Taddy, M.A., Wild, S.M.: Variable selection and sensitivity analysis via dynamic trees with an application to computer code performance tuning. Technical Report 1108.4739, arXiv (2011)
82. Gray, G.A., Kolda, T.G.: Algorithm 856: APPSPACK 4.0: Asynchronous parallel pattern search for derivative-free optimization. ACM Trans. Math. Softw. 32(3), 485–507 (2006)
83. Griffin, J.D., Kolda, T.G.: Nonlinearly-constrained optimization using heuristic penalty methods and asynchronous parallel generating set search. Appl. Math. Res. Express 25(5), 36–62 (2010)
84. Griffin, J.D., Kolda, T.G., Lewis, R.M.: Asynchronous parallel generating set search for linearly-constrained optimization. SIAM J. Sci. Comput. 30(4), 1892–1924 (2008)
85. Hansen, P., Mladenović, N.: Variable neighborhood search: principles and applications. Eur. J. Oper. Res. 130(3), 449–467 (2001)
86. Hare, W.L., Macklem, M.: Derivative-free optimization methods for finite minimax problems. Optim. Methods Softw. 28(2), 300–312 (2013)
87. Hayes, R.E., Bertrand, F.H., Audet, C., Kolaczkowski, S.T.: Catalytic combustion kinetics: Using a direct search algorithm to evaluate kinetic parameters from light-off curves. Canad. J. Chem. Eng. 81(6), 1192–1199 (2003)
88. Higham, N.J.: Optimization by direct search in matrix computations. SIAM J. Matrix Anal. Appl. 14, 317–333 (1993)
89. Hock, W., Schittkowski, K.: Test Examples for Nonlinear Programming Codes. Lecture Notes in Economics and Mathematical Systems, vol. 187. Springer, Berlin (1981)
90. Hooke, R., Jeeves, T.A.: Direct search solution of numerical and statistical problems. J. Assoc. Comput. Mach. 8(2), 212–229 (1961)
91. Hough, P.D., Kolda, T.G., Torczon, V.: Asynchronous parallel pattern search for nonlinear optimization. SIAM J. Sci. Comput. 23(1), 134–156 (2001)
92. Jahn, J.: Vector Optimization: Theory, Applications, and Extensions. Springer, Berlin (2004)
93. Jones, D.R., Perttunen, C.D., Stuckman, B.E.: Lipschitzian optimization without the Lipschitz constant. J. Optim. Theory Appl. 79(1), 157–181 (1993)
94. Kelley, C.T.: Iterative Methods for Optimization. Frontiers in Applied Mathematics, vol. 18. SIAM, Philadelphia (1999)
95. Kokkolaras, M., Audet, C., Dennis, J.E. Jr.: Mixed variable optimization of the number and composition of heat intercepts in a thermal insulation system. Optim. Eng. 2(1), 5–29 (2001)
96. Kolda, T.G.: Revisiting asynchronous parallel pattern search for nonlinear optimization. SIAM J. Optim. 16(2), 563–586 (2005)
97. Kolda, T.G., Torczon, V.: On the convergence of asynchronous parallel pattern search. SIAM J. Optim. 14(4), 939–964 (2004)
98. Kolda, T.G., Lewis, R.M., Torczon, V.: Optimization by direct search: New perspectives on some classical and modern methods. SIAM Rev. 45(3), 385–482 (2003)
99. Kolda, T.G., Lewis, R.M., Torczon, V.: A generating set direct search augmented Lagrangian algorithm for optimization with a combination of general and linear constraints. Technical Report SAND2006-5315, Sandia National Laboratories, USA (2006)
100. Kolda, T.G., Lewis, R.M., Torczon, V.: Stationarity results for generating set search for linearly constrained optimization. SIAM J. Optim. 17(4), 943–968 (2006)
101. Le Digabel, S.: Algorithm 909: NOMAD: Nonlinear optimization with the MADS algorithm. ACM Trans. Math. Softw. 37(4), 44:1–44:15 (2011)
102. Audet, C., Le Digabel, S., Tribes, C.: NOMAD user guide. Technical Report G-2009-37, Les cahiers du GERAD (2009)
103. Leach, E.B.: A note on inverse function theorems. Proc. Am. Math. Soc. 12, 694–697 (1961)
104. Lewis, R.M., Torczon, V.: Rank ordering and positive bases in pattern search algorithms. Technical Report 96-71, Institute for Computer Applications in Science and Engineering, NASA Langley Research Center, Hampton, VA 23681-2199 (1996)
105. Lewis, R.M., Torczon, V.: Pattern search algorithms for bound constrained minimization. SIAM J. Optim. 9(4), 1082–1099 (1999)
106. Lewis, R.M., Torczon, V.: Pattern search methods for linearly constrained minimization. SIAM J. Optim. 10(3), 917–941 (2000)
107. Lewis, R.M., Torczon, V.: A globally convergent augmented Lagrangian pattern search algorithm for optimization with general constraints and simple bounds. SIAM J. Optim. 12(4), 1075–1089 (2002)
108. Lewis, R.M., Shepherd, A., Torczon, V.: Implementing generating set search methods for linearly constrained optimization. SIAM J. Sci. Comput. 29(6), 2507–2530 (2007)
109. Li, M., Li, Y.F., Wu, Y.J., Cai, S., Zhu, N.Y., Rezzak, N., Schrimpf, R.D., Fleetwood, D.M., Wang, J.Q., Cheng, X.X., Wang, Y., Wang, D.L., Hao, Y.: Including radiation effects and dependencies on process-related variability in advanced foundry SPICE models using a new physical model and parameter extraction approach. IEEE Trans. Nucl. Sci. 58(6, Part 1), 2876–2882 (2011)
110. Liuzzi, G., Lucidi, S., Sciandrone, M.: A derivative-free algorithm for linearly constrained finite minimax problems. SIAM J. Optim. 16(4), 1054–1075 (2006)
111. Lophaven, S., Nielsen, H., Søndergaard, J.: DACE: A MATLAB kriging toolbox, version 2.0. Technical Report IMM-REP-2002-12, Informatics and Mathematical Modelling, Technical University of Denmark (2002)
112. Lucidi, S., Sciandrone, M.: On the global convergence of derivative-free methods for unconstrained optimization. SIAM J. Optim. 13, 97–116 (2002)
113. Marsden, A.L.: Aerodynamic noise control by optimal shape design. Ph.D. thesis, Stanford University (2004)
114. Marsden, A.L., Wang, M., Dennis, J.E. Jr., Moin, P.: Optimal aeroacoustic shape design using the surrogate management framework. Optim. Eng. 5(2), 235–262 (2004)
115. Marsden, A.L., Wang, M., Dennis, J.E. Jr., Moin, P.: Suppression of airfoil vortex-shedding noise via derivative-free optimization. Phys. Fluids 16(10), L83–L86 (2004)
116. Marsden, A.L., Wang, M., Dennis, J.E. Jr., Moin, P.: Trailing-edge noise reduction using derivative-free optimization and large-eddy simulation. J. Fluid Mech. 572, 13–36 (2007)
117. Marsden, A.L., Feinstein, J.A., Taylor, C.A.: A computational framework for derivative-free optimization of cardiovascular geometries. Comput. Methods Appl. Mech. Eng. 197(21–24), 1890–1905 (2008)
118. Martin, J.-P., Houdayer, A., Lebel, C., Choquette, Y., Lavigne, P., Ducharme, P.: An unattended gamma monitor for the determination of snow water equivalent (SWE) using the natural ground gamma radiation. In: Nuclear Science Symposium Conference Record, pp. 983–988. IEEE (2008)
119. Marty, A.: Optimisation du placement et de l'assignation de fréquence d'antennes dans un réseau de télécommunications. Master's thesis, École Polytechnique de Montréal (2011)
120. McKinnon, K.I.M.: Convergence of the Nelder-Mead simplex method to a nonstationary point. SIAM J. Optim. 9, 148–158 (1998)
121. Mladenović, N., Hansen, P.: Variable neighborhood search. Comput. Oper. Res. 24(11), 1097–1100 (1997)
122. Nelder, J.A., Mead, R.: A simplex method for function minimization. Comput. J. 7(4), 308–313 (1965)
123. Oeuvray, R., Bierlaire, M.: BOOSTERS: A derivative-free algorithm based on radial basis functions. Int. J. Model. Simul. 29(1), 26–36 (2009)
124. Orban, D.: Templating and automatic code generation for performance with Python. Technical Report G-2011-30, Les cahiers du GERAD (2011)
125. Polak, E., Wetter, M.: Precision control for generalized pattern search algorithms with adaptive precision function evaluations. SIAM J. Optim. 16(3), 650–669 (2006)
126. Prechtel, M., Leugering, G., Steinmann, P., Stingl, M.: Towards optimization of crack resistance of composite materials by adjustment of fiber shapes. Eng. Fract. Mech. 78(6), 944–960 (2011)
127. Price, C.J., Coope, I.D.: Frames and grids in unconstrained and linearly constrained optimization: A nonsmooth approach. SIAM J. Optim. 14, 415–438 (2003)
128. Regis, R.G., Shoemaker, C.A.: Constrained global optimization of expensive black box functions using radial basis functions. J. Glob. Optim. 31, 153–171 (2005)
129. Renaud, E., Robelin, C., Gheribi, A.E., Chartrand, P.: Thermodynamic evaluation and optimization of the Li, Na, K, Mg, Ca, Sr // F, Cl reciprocal system. J. Chem. Thermodyn. 43(8), 1286–1298 (2011)
130. Rockafellar, R.T.: Generalized directional derivatives and subgradients of nonconvex functions. Canad. J. Math. 32(2), 257–280 (1980)
131. Sankaran, S., Marsden, A.L.: The impact of uncertainty on shape optimization of idealized bypass graft models in unsteady flow. Phys. Fluids 22(12), 121902 (2010)
132. Sankaran, S., Audet, C., Marsden, A.L.: A method for stochastic constrained optimization using derivative-free surrogate pattern search and collocation. J. Comput. Phys. 229(12), 4664–4682 (2010)
133. Serafini, D.B.: A framework for managing models in nonlinear optimization of computationally expensive functions. Ph.D. thesis, Department of Computational and Applied Mathematics, Rice University (1998)
134. Søndergaard, J.: Optimization using surrogate models – by the space mapping technique. Ph.D. thesis, Informatics and Mathematical Modelling, Technical University of Denmark (2003)
135. Spillane, M.C., Gica, E., Titov, V.V.: Tsunameter network design for the U.S. DART array. AGU Fall Meeting Abstracts, p. A1368 (2009)
136. Stephens, C.P., Baritompa, W.: Global optimization requires global information. J. Optim. Theory Appl. 96, 575–588 (1998)
137. Sun, J., Salvucci, G.D., Entekhabi, D., Farhadi, L.: Parameter estimation of coupled water and energy balance models based on stationary constraints of surface states. Water Resour. Res. 47, 1–16 (2011)
138. Sweatlock, L.A., Diest, K., Marthaler, D.E.: Metamaterials design using gradient-free numerical optimization. J. Appl. Phys. 108(8), 1–5 (2010)
139. Tang, B.: Orthogonal array-based Latin hypercubes. J. Am. Stat. Assoc. 88(424), 1392–1397 (1993)
140. Torczon, V.: On the convergence of pattern search algorithms. SIAM J. Optim. 7(1), 1–25 (1997)
141. Torres, R., Bès, C., Chaptal, J., Hiriart-Urruty, J.-B.: Optimal, environmentally-friendly departure procedures for civil aircraft. J. Aircr. 48(1), 11–22 (2011)
142. Vicente, L.N., Custódio, A.L.: Analysis of direct searches for discontinuous functions. Math. Program. 133(1–2), 299–325 (2012)
143. Wild, S.M., Shoemaker, C.A.: Global convergence of radial basis function trust region derivative-free algorithms. SIAM J. Optim. 21(3), 761–781 (2011)
144. Yang, W., Feinstein, J.A., Marsden, A.L.: Constrained optimization of an idealized Y-shaped baffle for the Fontan surgery at rest and exercise. Comput. Methods Appl. Mech. Eng. 199(33–36), 2135–2149 (2010)