
Metamaterial Design by Mesh Adaptive Direct Search

  • Charles Audet
  • Kenneth Diest
  • Sébastien Le Digabel
  • Luke A. Sweatlock
  • Daniel E. Marthaler
Part of the Topics in Applied Physics book series (TAP, volume 127)

Abstract

In the design of optical metamaterials, some optimization problems require launching a numerical simulation to evaluate each candidate design. The Mesh Adaptive Direct Search algorithm is designed for such problems. The MADS algorithm does not require any derivative information about the problem being optimized, and no continuity or differentiability assumptions are made by MADS on the functions defining the simulation. A detailed discussion of the method is provided in the second section of the chapter, followed by a discussion of the NOMAD implementation of the method and its features. The final section of the chapter presents three instances of combining NOMAD with Finite-Difference Time-Domain electromagnetic simulations to tailor the broadband spectral response and near-field interactions of Split Ring Resonator metamaterials.

Keywords

Variable Neighborhood Search · Trial Point · Incumbent Solution · Poll Point · Library Mode

4.1 Introduction

This chapter describes methods for solving constrained optimization problems using the Mesh Adaptive Direct Search (Mads) algorithm, which belongs to the broader class of Derivative-Free Optimization methods. Because small changes in the geometry of a metamaterial can result in large changes in the overall behavior of the structure, these techniques are well suited for the design of optical metamaterials, and can handle the large discontinuities in the “cost function space” that often arise. The Mads algorithm does not require any derivative information about the problem being optimized, and no continuity or differentiability assumptions are made on the functions defining the simulation. Out of the many applicable techniques that can be used for metamaterial design, the NOMAD implementation of the Mads algorithm discussed in this chapter has many advantages, including built-in capabilities to handle nonlinear constraints, bi-objective optimization, sensitivity analysis, and problems with up to 50 variables. For even larger problems, the Psd-Mads parallel version of the method is able to solve problems with 50–500 variables. Lastly, the NOMAD implementation has the added benefit that it is constantly being updated and improved, with interfaces for a number of programming languages including C++ and Matlab (http://www.gerad.ca/nomad).

4.1.1 Structuring the Optimization Problem

This chapter considers optimization problems that may be written in the following general form
$$\begin{aligned} \min_{x \in\varOmega} f(x), \end{aligned}$$
(4.1)
where f is a single-valued objective function, and Ω is the set of feasible solutions. The direct search methods described here can be applied without making any assumptions on the function f or on the set Ω. However, when analyzing the theoretical behavior of these methods, we will study them under various assumptions. Without any loss of generality, suppose that the set of feasible solutions is written as
$$\varOmega= \bigl\{x \in X: c_j(x) \leq0,\ j \in J\bigr\} \subset {\mathbb{R}}^n , $$
where X is a subset of \({\mathbb{R}}^{n}\) and \(c_{j}: X \rightarrow {\mathbb{R}}\cup\{\infty\}\) for all \(j \in J=\{1,2,\ldots,m\}\) are quantifiable constraints. This means that for any x in X, the real value \(c_j(x)\) provides a measure by which a constraint is violated or satisfied. This notation does not make the problem restrictive, as problems where J=∅ are allowed.

The sets X and Ω define the feasible region, and each of them corresponds to a specific type of constraint for which different treatments are described in Sect. 4.2.2. The quantifiable constraints \(c_j(x) \leq 0\), \(j \in J\), defining Ω provide a distance to feasibility and/or to infeasibility. Violating these constraints is permitted as long as these violations occur only at the intermediate candidates considered by an optimization method. The set X may contain any constraint for which a measure of the violation is not available, and/or constraints that cannot be relaxed. Typically, X contains bound constraints necessary to run the simulation, but can also include hidden constraints [20], which occur when the simulation fails to evaluate. In [5], the objective function failed to return a value on approximately 43 % of the simulations, and in [18], the failure rate climbed to 60 %. Such problems pose a challenge to optimization methods that use function evaluations to estimate derivatives.

Different approaches exist to tackle Problem (4.1), and this chapter discusses Derivative-Free Optimization (DFO) methods. This choice is justified by the fact that these methods are backed by rigorous convergence analyses based on different levels of assumptions on the nature of the functions defining the problem. This type of analysis marks the difference between DFO methods and heuristics. While this chapter focuses on the Mads algorithm to address these types of problems, a review of DFO methods may be found in the recent book [23], which does not focus on the constrained case. The present chapter aims at describing a practical and recent method and software for constrained blackbox optimization.

The chapter is organized as follows. Section 4.2 summarizes the general organization of the Mads algorithm, describes strategies to handle various types of constraints, discusses the use of surrogates and models to guide the optimization, and details the type of nonsmooth convergence analyses on which these methods rely. Section 4.3 describes NOMAD, our C++ implementation of Mads, and highlights some features that make the code versatile. Finally, Sect. 4.4 describes a metamaterial design optimization problem, shows how to formulate it as a blackbox optimization problem, and presents numerical experiments conducted using the NOMAD software.

4.2 The Mesh Adaptive Direct Search Class of Algorithms

The Mesh Adaptive Direct Search (Mads) algorithm is presented in [9] as a generalization of several existing direct search methods.

The name of these methods comes from the fact that they are designed to work directly with the objective function values generated by the blackbox; they do not use or approximate derivatives, nor do they require their existence. Mads was introduced to extend the target class of problems to the constrained problem (4.1) while improving the practical and theoretical convergence results. That paper proposed a first instantiation of Mads called LTMads, which was improved in subsequent work. The non-deterministic nature of LTMads was corrected in the OrthoMads [3] instantiation. These algorithms were initially designed to handle the constraints of Ω by the so-called extreme barrier, which simply consists of rejecting any trial point that does not belong to Ω. The term extreme barrier comes from the fact that this approach can be implemented by solving the unconstrained minimization of
$$f_\varOmega(x) = \left \{ \begin{array}{l@{\quad}l} f(x) & \mbox{if } x \in\varOmega, \\ \infty& \mbox{otherwise} \end{array} \right . $$
instead of the equivalent Problem (4.1). A more subtle way of handling quantifiable constraints is presented in [10], and is summarized in Sect. 4.2.2 below.
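To make the extreme barrier concrete, here is a minimal, self-contained C++ sketch; the objective f and the membership test in_omega are illustrative stand-ins for the user's blackbox, not part of any Mads implementation:

```cpp
#include <iostream>
#include <limits>
#include <vector>

// Toy stand-ins for the user's blackbox: f is the true objective and
// in_omega tests membership in Omega (here, a single constraint x1 + x2 <= 1).
double f(const std::vector<double>& x) { return x[0] * x[0] + x[1] * x[1]; }
bool in_omega(const std::vector<double>& x) { return x[0] + x[1] <= 1.0; }

// Extreme barrier: infeasible points receive +infinity, so the
// unconstrained minimization of f_omega is equivalent to Problem (4.1).
double f_omega(const std::vector<double>& x) {
    return in_omega(x) ? f(x) : std::numeric_limits<double>::infinity();
}

int main() {
    std::cout << f_omega({0.2, 0.3}) << '\n';  // feasible: returns f(x)
    std::cout << f_omega({1.0, 1.0}) << '\n';  // infeasible: returns inf
}
```

In practice the blackbox is an expensive simulation, and the feasibility test is performed without launching it whenever membership in Ω can be checked a priori.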

4.2.1 General Organization of the Mads Algorithm

In order to use a Mads algorithm, the user must provide an initial point denoted \(x_{0} \in {\mathbb{R}}^{n}\). It does not need to be feasible with respect to the quantifiable constraints \(c_j(x) \leq 0\), \(j \in J\), but must belong to the set X. Mads algorithms are iterative, and the iteration counter is denoted by the index k. At each iteration, the algorithm deploys some effort to improve the incumbent solution \(x_k\), i.e., the current best solution. For now, we will remain vague about the word best, as it takes different meanings depending on whether it refers to a feasible or an infeasible solution. This clarification is made in Sect. 4.2.2.

At each iteration, the algorithm tries to improve the incumbent solution by launching the simulation at a finite number of carefully selected trial points. This is done in two steps called the search and the poll. The search is very flexible and allows the user to take advantage of his knowledge of the problem to propose some candidates where the simulation will be launched. Some search strategies are tailored to specific problems, while others are generic (e.g., speculative search [9], Latin hypercube sampling [36], variable neighborhood searches [6], surrogates). The poll needs to satisfy more rigorous restrictions. It consists of a local exploration near the current incumbent solution, and its definition varies from one instantiation of Mads to another. In practice, the search can greatly improve the quality of the final solution, while the poll structure allows a rigorous convergence analysis.

A fundamental requirement of both the search and poll steps is that they must generate trial points belonging to a conceptual mesh M k on the space of variables \({\mathbb{R}}^{n}\). The mesh is defined by a mesh size parameter \(\varDelta ^{m}_{k} >0\), by the set V k of all trial points at which the simulation was launched before the start of iteration k, and by a finite set of positive spanning directions \(D \subset {\mathbb{R}}^{n}\). Of these three elements, only D is fixed throughout the algorithm, while the two others vary from one iteration to another. In practice, the set D is often chosen to be the columns of the n×n identity matrix, together with their negatives: in matrix form, \(D = [ I_{n} -I_{n}] \in {\mathbb{R}}^{n \times2n}\). Formally, the mesh at iteration k is the following enumerable subset of \({\mathbb{R}}^{n}\):
$$ M_k =\bigl\{ x + \varDelta ^m_k D z : x \in V_k, \ z \in {\mathbb{N}}^{n_D} \bigr\} \subset {\mathbb{R}}^n. $$
(4.2)
The set V k is also called the cache as it contains the history of all evaluated trial points. For functions that are expensive to evaluate, the cache allows a reduction in computational time as the simulation at a previously evaluated trial point is not performed.
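To make Definition (4.2) concrete, the following sketch enumerates a few mesh points around a single cache point for n=2 and \(D = [I_n\ {-I_n}]\); the variable names are illustrative only:

```cpp
#include <iostream>
#include <vector>

// Sketch of Eq. (4.2): a mesh point is x + Delta_m * D * z, where z is a
// vector of nonnegative integers, one entry per column of D.
int main() {
    const double delta_m = 0.25;               // mesh size parameter
    const std::vector<double> x = {1.0, 1.0};  // a point of the cache V_k
    const double D[2][4] = {{1.0, 0.0, -1.0,  0.0},   // columns of D
                            {0.0, 1.0,  0.0, -1.0}};  // positively span R^2
    // Enumerate the mesh points obtained from small integer vectors z.
    for (int z0 = 0; z0 <= 1; ++z0)
        for (int z1 = 0; z1 <= 1; ++z1)
            for (int z2 = 0; z2 <= 1; ++z2)
                for (int z3 = 0; z3 <= 1; ++z3) {
                    const int z[4] = {z0, z1, z2, z3};
                    double p0 = x[0], p1 = x[1];
                    for (int j = 0; j < 4; ++j) {
                        p0 += delta_m * D[0][j] * z[j];
                        p1 += delta_m * D[1][j] * z[j];
                    }
                    std::cout << p0 << ' ' << p1 << '\n';  // a point of M_k
                }
}
```

Note that different vectors z may map to the same mesh point; the mesh is a set, and this enumeration is for illustration only.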
Figure 4.1 illustrates the mesh \(M_k\) on a problem with only two variables, where the set of directions used to construct the mesh consists of the positive and negative coordinate directions; in matrix form,
$$D = \left [ \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \end{array} \right ]. $$
The mesh points are represented by the intersections of the horizontal and vertical lines. The mesh \(M_k\) is conceptual as it is never generated, but the method must make sure that the trial points belong to \(M_k\). The remaining elements of the figure are described below.
Fig. 4.1

Example Mads trial points in \({\mathbb{R}}^{2}\) consistent with the ones defined in [3]. The intersections of the thin lines represent the mesh of size \(\varDelta ^{m}_{k}\), and the thick lines the points at distance \(\varDelta _{k}^{p}\) from \(x_k\) in the infinity norm. Examples of search trial points \(\{t_1, t_2, t_3\}\) and poll trial points \(\{t_4, t_5, t_6, t_7\}\) are illustrated

Each Mads iteration goes as follows. Given an incumbent solution \(x_k \in X\), the search step produces a list of tentative trial mesh points. Any mechanism can be used to create the list, as long as it contains a finite number of points located on the mesh. The list may even be empty. Then, the simulation is launched at the trial points until all trial points are tested, or until one trial point is found to be better than the incumbent \(x_k\). In the latter case, the poll step can be skipped, and the algorithm may continue directly to the updates.

Following an unsuccessful search step, the poll step generates a list of mesh points near the incumbent x k . The term near is tied to the so-called poll size parameter \(\varDelta _{k}^{p} >0\). Again, the poll step may be interrupted as soon as an improvement over the incumbent is found. During an iteration, the simulations can be launched sequentially or in parallel. Synchronous and asynchronous strategies are described in Sect. 4.3.3 when multiple processors are available.

Parameters are updated at the end of each iteration. There are two possibilities. If either the search or the poll step generated a mesh point \(t \in M_k\) which is better than \(x_k\), then the next incumbent \(x_{k+1}\) is set to t, and both the mesh size and poll size parameters are increased or kept the same. Otherwise, \(x_{k+1}\) is set to \(x_k\), the poll size parameter is decreased, and the mesh size parameter is decreased or kept the same. For example, one may take \(\varDelta ^{p}_{k+1} \leftarrow 2\varDelta ^{p}_{k}\) after a success and \(\varDelta ^{p}_{k+1} \leftarrow \frac{1}{2}\varDelta ^{p}_{k}\) otherwise, with \(\varDelta ^{m}_{k+1} \leftarrow \min\{1, (\varDelta ^{p}_{k+1})^2\}\) in both cases, which is consistent with the requirement below that the poll size parameter never be smaller than the mesh size parameter.
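As a minimal sketch, this update step can be written as follows, using the doubling/halving rule above together with the coupling \(\varDelta^{m} = \min\{1, (\varDelta^{p})^2\}\); other update rules satisfying the same requirements are possible:

```cpp
#include <algorithm>

// End-of-iteration update: the poll size parameter is doubled after a
// successful iteration and halved otherwise; the mesh size parameter is
// then coupled to it so that delta_m <= delta_p always holds.
void update_sizes(bool success, double& delta_p, double& delta_m) {
    delta_p = success ? 2.0 * delta_p : 0.5 * delta_p;
    delta_m = std::min(1.0, delta_p * delta_p);
}
```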

At any iteration of a Mads algorithm, the poll size parameter \(\varDelta ^{p}_{k}\) must be greater than or equal to the mesh size parameter \(\varDelta ^{m}_{k}\). In Fig. 4.1, \(\varDelta ^{p}_{k} = \frac{1}{2}\) and \(\varDelta ^{m}_{k} = \frac{1}{4}\), and the search points are \(\{t_1, t_2, t_3\}\). In OrthoMads, the poll points are obtained by generating a pseudo-random orthogonal basis \(H_k\) and by completing it to a maximal positive basis \(D_k = [H_k\ {-H_k}]\). The poll points are then obtained from the incumbent \(x_k\) in the directions of the columns of \(D_k\) while remaining in the frame (the shaded region in the figure) defined by the poll size parameter \(\varDelta ^{p}_{k}\).

The iteration concludes by increasing the counter k by one. A new iteration is then initiated. Figure 4.2 summarizes the main steps of a Mads algorithm.
Fig. 4.2

A general Mads algorithm. See Fig. 4.1 for some examples of search and poll points

4.2.2 Handling of Constraints

Mads possesses different techniques to handle constraints. The constraints \(x \in X\) are handled by the extreme barrier discussed in the introduction of Sect. 4.2. The constraints \(c_j(x) \leq 0\) are relaxable and quantifiable, and this supplementary structure allows a potentially more efficient treatment. The progressive barrier [10] exploits this structure and allows the algorithm to explore the solution space around infeasible trial points. This treatment of constraints uses the constraint violation function originally devised for filter methods [25] for nonlinear programming:
$$h(x) = \left \{ \begin{array}{l@{\quad}l} \sum_{j \in J } (\max(c_j(x), 0) )^2 & \mbox{if} \ x \in X,\\ \infty& \mbox{otherwise.} \end{array} \right . $$
The constraint violation function h is nonnegative, and h(x)=0 if and only if \(x \in \varOmega\). It returns a weighted measure of the infeasibility of x.
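The function h translates directly into code; in the sketch below, constraints is a hypothetical list of the \(c_j\), and the caller is assumed to know whether x belongs to X:

```cpp
#include <algorithm>
#include <functional>
#include <limits>
#include <vector>

using Point = std::vector<double>;
using Constraint = std::function<double(const Point&)>;

// Constraint violation function h(x): the sum of squared constraint
// violations if x is in X, +infinity otherwise. h(x) = 0 iff x is in Omega.
double h(const Point& x, const std::vector<Constraint>& constraints,
         bool x_in_X) {
    if (!x_in_X)
        return std::numeric_limits<double>::infinity();
    double total = 0.0;
    for (const auto& c : constraints) {
        const double violation = std::max(c(x), 0.0);  // satisfied: 0
        total += violation * violation;
    }
    return total;
}
```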

The progressive barrier is essentially a mechanism that accepts infeasible trial points whose constraint violation function value is below a threshold \(h_{k}^{\max} > 0\). As the algorithm progresses and the iteration number increases, the threshold is progressively reduced. This is accomplished within the Mads algorithm by having two incumbent solutions around which polling is conducted. One poll center is the feasible incumbent solution \(x_{k}^{F}\), i.e., the feasible solution found so far with the least objective function value. The second poll center is the infeasible incumbent solution \(x_{k}^{I}\), i.e., the infeasible solution found so far with a constraint violation value under the threshold \(h^{\max}_{k}\) having the least objective function value. Under this strategy, the infeasible trial points approach the feasible region by prioritizing the infeasible points with a low objective function value. This strategy differs from the ones in [2, 8, 24], where priority was given to feasibility at the expense of the objective function value.

Figure 4.3 represents an optimization problem as a tradeoff between the objective function f and the constraint violation function h. The left part of the figure depicts the domain X of a two-variable problem, as well as its feasible region Ω. The right part of the figure shows the image of both X and Ω under the mappings h and f. The mapping of X is delimited by the nonlinear curve, and the mapping of Ω is represented by the greyed region located on the f-axis. The optimal solution of the optimization problem corresponds to the feasible point (h=0) with the least value of f, as indicated by the arrows. The figure also shows the feasible and infeasible incumbents, as well as their image.
Fig. 4.3

The feasible region Ω and the domain X of an optimization problem, and their image under the mappings h and f

With the progressive barrier, the iterations are categorized into three types. Dominating iterations are those that either generate a feasible trial point with a lower objective function value than that of the feasible incumbent, or generate an infeasible trial point with better objective and constraint violation function values than those of the infeasible incumbent. Improving iterations are those that are not dominating, but generate an infeasible trial point with a lower constraint violation value. Otherwise, the iteration is said to be unsuccessful.

At the end of an unsuccessful iteration the incumbents remain unchanged, and the poll size parameter is reduced, as this suggests that we are near a locally optimal solution. At the end of a dominating iteration, a new incumbent solution is identified, and the poll size parameter is increased to allow far-reaching explorations in the space of variables. Finally, after an improving iteration, the poll size parameter is kept unchanged, but the constraint violation threshold is reduced in such a way that the next iteration will not have the same infeasible incumbent as the previous one.
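Schematically, this classification can be sketched as follows; the dominance tests are simplified here to comparisons against the two incumbents, which conveys the idea without reproducing all the details of [10]:

```cpp
enum class IterationType { Dominating, Improving, Unsuccessful };

// Simplified classification of an iteration under the progressive barrier,
// given the best trial point found (objective f_t, violation h_t) and the
// incumbent values: f_feas at the feasible incumbent, and (f_inf, h_inf)
// at the infeasible incumbent.
IterationType classify(double f_t, double h_t, double f_feas,
                       double f_inf, double h_inf) {
    if (h_t == 0.0 && f_t < f_feas)
        return IterationType::Dominating;  // better feasible point
    if (h_t > 0.0 && f_t < f_inf && h_t < h_inf)
        return IterationType::Dominating;  // dominates infeasible incumbent
    if (h_t > 0.0 && h_t < h_inf)
        return IterationType::Improving;   // lower violation, worse f
    return IterationType::Unsuccessful;
}
```

After an improving iteration, the threshold would then be set just below the previous infeasible incumbent's violation, for instance to the violation of the improving trial point.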

Figure 4.4 represents the poll step around the feasible incumbent \(x_{k}^{F}\). The four poll points are represented by the light circles in the left part of the figure. The leftmost one is rejected by the extreme barrier, as it does not belong to the domain X. Only one of the poll points is feasible, but as illustrated in the right part of the figure, it is dominated by the feasible incumbent \(x_{k}^{F}\). The two other poll points are infeasible. One of them is rejected by the progressive barrier, as its constraint violation function value exceeds the threshold \(h_{k}^{\max}\); any trial point mapped into the shaded region is rejected in this way. The remaining poll point has a lower constraint violation value than the infeasible incumbent, but a worse objective function value. Therefore, the iteration is neither dominating nor unsuccessful: it is an improving iteration. At the end of the iteration, the mechanism of the progressive barrier updates the infeasible incumbent \(x_{k+1}^{I}\) to be the poll point located to the right of \(x_{k}^{F}\), and the threshold \(h_{k+1}^{\max}\) is reduced to the constraint violation function value of the new infeasible incumbent solution.
Fig. 4.4

Polling around the feasible incumbent \(x_{k}^{F}\) generates a new infeasible incumbent \(x_{k+1}^{I}\)

Another strategy to handle relaxable quantifiable constraints is the progressive-to-extreme barrier [12]. As its name suggests, this strategy first handles a constraint by the progressive barrier; as soon as a trial point satisfying that constraint is generated, its treatment is switched to the extreme barrier. This strategy allows infeasible initial trial points, but forces each constraint to remain satisfied once it has been satisfied for the first time.

4.2.3 Surrogates and Models

Surrogates are functions that can be considered as substitutes for the true functions defining the optimization problem, f and \(c_j\), \(j \in J\). A surrogate function shares some similarities with the original one, but has the advantage of being significantly less expensive to evaluate. Surrogate functions can be classified into static surrogates and dynamic surrogates.

4.2.3.1 Static Surrogates

Static surrogates are approximations that are provided by the user with some knowledge of the problem. Such a surrogate consists of a model of the true function, and is fixed during the optimization process. For example, static surrogates may be obtained from simplified physics models, by allowing more relaxed stopping criteria within the blackbox simulation, or by replacing complicated subproblems with simpler ones. Straightforward uses of static surrogates within an optimization method are described in [15]. A first possibility is to order a list of tentative trial points by their surrogate values, and then to launch the expensive simulation defining the truth on the most promising trial points first. The process terminates as soon as a better point is found. This is called the opportunistic strategy and can save a considerable amount of time.
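The opportunistic strategy with surrogate ordering amounts to a sort followed by an early-exit loop; a sketch, where surrogate and truth are illustrative placeholders for the cheap and expensive functions:

```cpp
#include <algorithm>
#include <functional>
#include <vector>

using Point = std::vector<double>;
using Objective = std::function<double(const Point&)>;

// Opportunistic strategy: order the trial points by their (cheap) surrogate
// value, then launch the (expensive) truth until one point improves on the
// incumbent value f_inc. Remaining candidates are not evaluated.
bool opportunistic_evaluations(std::vector<Point>& candidates,
                               const Objective& surrogate,
                               const Objective& truth,
                               double& f_inc, Point& x_inc) {
    std::sort(candidates.begin(), candidates.end(),
              [&](const Point& a, const Point& b) {
                  return surrogate(a) < surrogate(b);
              });
    for (const Point& t : candidates) {
        const double ft = truth(t);  // expensive simulation call
        if (ft < f_inc) {            // success: interrupt the step
            f_inc = ft;
            x_inc = t;
            return true;
        }
    }
    return false;                    // unsuccessful step
}
```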

A second strategy using static surrogates consists in defining a search step that optimizes the surrogates in order to determine one or two candidates for the true evaluations. In some situations, static surrogates may be parametrized with controllable precision. The use of such surrogates within a Gps framework is described in [39].

4.2.3.2 Dynamic Surrogates

In contrast to static surrogates, dynamic surrogates are not provided by the user. They correspond to models dynamically built within the optimization method, based on past evaluations from the cache. Any interpolation method can be used for this task, as, for example, quadratic models, neural networks, radial basis functions, or statistical methods. The Surrogate Management Framework [18] proposes ways to exploit such surrogates within direct search methods, and Gps in particular, for the unconstrained case. Successful applications of this framework include unsteady fluid mechanics problems [34, 35], helicopter rotor blade design [17] and multi-objective liquid-rocket injector design [40].

Recent developments propose the use of these surrogates for the constrained case within Mads. The first of these developments considers quadratic models and is summarized in the next section. The second approach is ongoing research and currently considers statistical methods [27], namely treed Gaussian processes [26]. Quadratic model and statistical surrogates share some similarities. They can be used to sort a list of trial points before launching the expensive true blackbox simulation, as proposed above for the static surrogates. Dynamic surrogates may also define a search step named the model search, enabled as soon as a sufficient number of true evaluations is available (typically n+1). These points are denoted the data points and are used to build one model for f and m models for the quantifiable constraints \(c_j\), \(j \in J\). The model of the objective function is then optimized subject to the models of the constraints. This provides one or two mesh candidates, one feasible and possibly one infeasible, at which the true functions are evaluated. These are called oracle points. Beyond the nature of the surrogates themselves, some subtleties remain: the quadratic models are kept local, while statistical surrogates consider the whole space and attempt to escape local solutions. The latter can also provide additional candidates based on statistics such as the Expected Improvement (EI) [29].

4.2.3.3 Quadratic Models in Mads

This section discusses the quadratic models described in [22] and currently used in NOMAD. The framework is inspired by the work of Conn et al. summarized in the DFO book [23], where more focus is put on model-based methods.

Quadratic models are employed at two different levels. First, the model search exploits the flexibility of the search step by allowing the generation of trial points anywhere on the mesh. Candidates of the model search are the result of an optimization process in which the model of the objective is minimized subject to the models of the constraints.

The other way of using models is to sort a list of candidates prior to their evaluations (model ordering), so that the most promising points, from the model point of view, are evaluated first. This ordering has an important impact because, under the opportunistic strategy, the evaluations stop at the first success.

To construct a model, a set \(Y=\{y^0, \ldots, y^p\}\) of p+1 data points is collected from the cache. The objective function f(y) and the constraint functions \(c_j(y)\), \(j \in J\), are known and finite at each data point \(y \in Y\). Since quadratic models are more suited for local interpolation, data points are collected in the neighborhood of the current iterate: \(y^{i} \in B_{\infty}(x_{k};\rho \varDelta _{k}^{p})\) with \(B_{\infty}(x;r) = \{ y \in {\mathbb{R}}^{n} : \| y-x \|_{\infty} \leq r \}\), where the poll size parameter \(\varDelta _{k}^{p}\) bounds the distance between \(x_k\) and the poll trial points, and ρ is a parameter called the radius factor, typically set to two.

Then, m+1 models are constructed: one for the objective f and one for each constraint \(c_j \leq 0\), \(j \in J\). These models are denoted \(m_f\) and \(m_{c_{j}}\), \(j \in J\), and are such that
$$m_f(x) \simeq f(x)\quad \mbox{and}\quad m_{c_j}(x) \simeq c_j(x),\quad j \in J, \quad \mbox{for all}\ x \in B_{\infty}\bigl(x_k;\rho \varDelta _k^p\bigr). $$
For one function (f or one of the constraints \(c_j\)), the model \(m_f\) is defined by q+1 parameters \(\alpha\in {\mathbb{R}}^{q+1}\) and is evaluated at x as \(m_f(x)=\alpha^{\top}\phi(x)\), where ϕ is the natural basis of the space of polynomials of degree at most two, which has q+1=(n+1)(n+2)/2 elements:
$$\begin{aligned} \phi(x) =& \bigl( \phi_0(x) , \ldots, \phi_q(x) \bigr)^{\top}\\ =& \biggl(1,x_1,\ldots,x_n, \frac{x_1^2}{2}, \ldots, \frac{x_n^2}{2}, x_1x_2 , x_1x_3, \ldots, x_{n-1}x_n \biggr)^{\top}. \end{aligned}$$
The parameter α is selected in such a way that \(\sum_{y \in Y} (f(y)-m_f(y))^2\) is as small as possible, by solving the system
$$ M(\phi,Y) \alpha= f(Y) $$
(4.3)
with \(f(Y)=(f(y^0), f(y^1), \ldots, f(y^p))^{\top}\) and
$$M(\phi,Y)= \left[ \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} \phi_0(y^0) & \phi_1(y^0) & \ldots & \phi_q(y^0) \\ \phi_0(y^1) & \phi_1 (y^1) & \ldots & \phi_q (y^1) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_0(y^p) & \phi_1 (y^p) & \ldots & \phi_q (y^p) \end{array} \right] \in {\mathbb{R}}^{(p+1)\times(q+1)}. $$
System (4.3) may possess one, several, or no solutions. If p ≥ q, i.e., there are more interpolation points than necessary, the system is overdetermined and regression is used in order to find a solution in the least-squares sense. When p < q, i.e., there are not enough interpolation points, the system is underdetermined and there is an infinite number of solutions. Minimum Frobenius norm (MFN) interpolation is used in that case, which consists in choosing a solution that minimizes the Frobenius norm of the curvature, captured by the quadratic terms of α, subject to the interpolation conditions. Thus, writing \(\alpha = \bigl[\begin{smallmatrix} \alpha_L \\ \alpha_Q \end{smallmatrix}\bigr]\) with \(\alpha_{L} \in {\mathbb{R}}^{n+1}\), \(\alpha_{Q} \in {\mathbb{R}}^{n_{Q}}\), and \(n_Q = n(n+1)/2\), our model at x is given by \(m_{f}(x)=\alpha_{L}^{\top} \phi_{L}(x)+\alpha_{Q}^{\top} \phi_{Q}(x)\) with \(\phi_{L}(x) = (1, x_1, \ldots, x_n)^{\top}\) and \(\phi_{Q}(x) = (\frac{x_{1}^{2}}{2}, \ldots, \frac{x_{n}^{2}}{2}, x_{1}x_{2}, x_{1}x_{3}, \ldots, x_{n-1}x_{n})^{\top}\). The corresponding MFN vector α is then found by solving
$$\min_{\alpha_Q \in {\mathbb{R}}^{n_Q}} \frac{1}{2} \| \alpha_Q\|^2 \quad \mbox{subject to}\ M(\phi_L,Y) \alpha_L + M(\phi_Q,Y) \alpha_Q = f(Y). $$
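For the overdetermined case (p ≥ q), the regression coefficients can be computed directly with a linear-algebra library; a sketch using Eigen for n=2 (the underdetermined MFN case would instead solve the equality-constrained quadratic program above):

```cpp
#include <Eigen/Dense>
#include <vector>

// Build M(phi, Y) for n = 2, so q + 1 = (n+1)(n+2)/2 = 6 basis functions,
// and solve the regression min ||M alpha - f(Y)||_2 for the coefficients.
Eigen::VectorXd quad_model_coeffs(const std::vector<Eigen::Vector2d>& Y,
                                  const Eigen::VectorXd& fY) {
    const int q1 = 6;
    Eigen::MatrixXd M(static_cast<int>(Y.size()), q1);
    for (int i = 0; i < static_cast<int>(Y.size()); ++i) {
        const double x1 = Y[i](0), x2 = Y[i](1);
        // Natural basis phi: (1, x1, x2, x1^2/2, x2^2/2, x1*x2).
        M.row(i) << 1.0, x1, x2, 0.5 * x1 * x1, 0.5 * x2 * x2, x1 * x2;
    }
    // A rank-revealing QR factorization yields the least-squares solution.
    return M.colPivHouseholderQr().solve(fY);
}
```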
Once the m+1 models are available, the model search and the model ordering strategies differ slightly. The model ordering consists in evaluating the models at the candidates, and then sorting the candidates accordingly. The model search is more elaborate because the following optimization problem has to be solved:
$$ \min_{x \in B_{\infty}(x_k;\rho \varDelta _k^p)} m_f(x) \quad \mbox{subject to}\ m_{c_j}(x) \leq 0,\ j \in J. $$
(4.4)

After Problem (4.4) is solved (in practice, heuristically), its feasible and infeasible incumbent solutions define new candidates at which to evaluate the true functions f and \(c_j\), \(j \in J\). In order to satisfy the Mads convergence analysis described in Sect. 4.2.4, these candidates are projected onto the mesh before they are evaluated.

4.2.4 Convergence Analysis

Even though Mads is designed to be applied to the general optimization problem (4.1) without exploiting any of its structure, it is supported by a rigorous hierarchical convergence analysis. The analysis reveals that, depending on the properties of the objective function f and the domain Ω, Mads produces a limit point \(\hat{x}\) at which some necessary optimality conditions are satisfied. Of course, we do not expect our target problems to satisfy any smoothness properties, but the convergence analysis can be seen as a validation of the behavior of the algorithm on smoother problems.

The entire convergence analysis relies on the following assumptions: suppose that Mads is launched on a test problem without any stopping criteria, and that the union of all trial points generated by the algorithm belongs to a bounded subset of \({\mathbb{R}}^{n}\). The assumption that Mads runs indefinitely is not realistic, as in practice it necessarily terminates after a finite amount of time; but for the analysis, we are interested in seeing where the iterates would lead if the algorithm were not stopped. The second assumption, on the bounded subset, can be satisfied in multiple ways. For example, it holds when the variables are bounded, or when the level sets of f are bounded. In practice, real problems rarely have unbounded solutions.

The convergence analysis then focuses on limits of incumbent solutions. Torczon [43] showed for pattern searches that the hypothesis of bounded trial points implies that there are infinitely many unsuccessful iterations and that the limit inferior of the mesh size parameters \(\varDelta ^{m}_{k}\) is zero. These results were adapted in [9] to the context of the Mads algorithm. Let U denote the indices of the unsuccessful iterations and let \(\hat{x}\) be an accumulation point of \(\{x_k\}_{k \in U}\). Such an accumulation point exists because of the assumption that the iterates belong to a bounded set.

An unsuccessful iteration occurs when the poll step was conducted around the incumbent \(x_k\) and no better solution was found. The mesh size parameter is reduced only after an unsuccessful iteration, and we then say that the incumbent solution \(x_k\) is a mesh local optimizer. At the low end of the convergence analysis [7], we have the zeroth-order result: \(\hat{x}\) is the limit of mesh local optimizers on meshes that get infinitely fine. At the other end of the analysis, we have that if f is strictly differentiable near \(\hat{x}\), then \(\nabla f(\hat{x}) = 0\) in the unconstrained case; in the constrained case, the result ensures that the directional derivatives \(f'(\hat{x}; d)\) are nonnegative for every direction \(d \in {\mathbb{R}}^{n}\) in the contingent cone to the feasible region Ω at \(\hat{x}\), provided that Ω is regular [9]. The contingent cone generalizes the notion of tangent cone.

There are several intermediate results in the convergence analysis that involve different assumptions on f, such as lower semicontinuity and regularity, and on the constraints, such as properties of the hypertangent, Clarke tangent, and Bouligand cones. The fundamental theoretical result in the analysis was shown in [7, 9] and relies on Clarke's [21] generalization \(f^{\circ}\) of the directional derivative for nonsmooth functions. The result states that \(f^{\circ}(\hat{x}; d) \geq 0\) for every direction d in the hypertangent cone to Ω at \(\hat{x}\). A generalization of this result for discontinuous functions was recently shown in [45]. In the case where the progressive barrier [10] fails to generate feasible solutions, the analysis ensures that the constraint violation function satisfies \(h^{\circ}(\hat{x}; d) \geq 0\) for every direction d in the hypertangent cone to X at \(\hat{x}\).
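For reference, Clarke's generalized directional derivative of a locally Lipschitz function f at \(\hat{x}\) in the direction d is
$$f^{\circ}(\hat{x}; d) = \limsup_{y \rightarrow \hat{x},\, t \searrow 0} \frac{f(y+td) - f(y)}{t}, $$
and it coincides with the usual directional derivative \(f'(\hat{x}; d)\) when f is strictly differentiable at \(\hat{x}\).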

4.3 NOMAD: A C++ Implementation of the Mads Algorithm

This section describes the NOMAD software [32], which implements the Mads algorithm. We list several of its features, but do not expect to cover all of them in this chapter. NOMAD is a C++ code freely distributed under the LGPL license. The package is found at http://www.gerad.ca/nomad. It includes complete documentation, a doxygen [44] interactive manual, and many examples and tools. As for other derivative-free codes, we expect, as a rule of thumb, that NOMAD will be efficient for problems with up to 50 variables.

4.3.1 Batch and Library Modes

NOMAD can be used in two different modes, each with its own advantages. The user must choose the mode appropriate to the problem at hand.

The first is the batch mode, which launches the NOMAD executable from the command line with the name of a parameter file given as an argument. This text file contains the parameters, which are divided into two categories: problem and algorithmic. Problem parameters are required, while all the algorithmic parameters have default values. A simple parameter file is shown in Fig. 4.5, and the most important parameters are described in Sect. 4.3.2. The batch mode is simpler for beginners and non-programmers. The user must write a parameters file and design a wrapper for the application so that it is compatible with the NOMAD blackbox format. This format requires that the blackbox be callable from the command line with an input file, containing the values of the variables, given as an argument. The resulting outputs must be displayed to the standard output with sufficient precision. The blackbox is disjoint from the NOMAD code, and consequently the application may be coded in any programming language, as long as a command-line version is available. A detailed description of one implementation of this command-line interface is covered in the Appendix. Finally, the batch mode is by definition resilient to the blackbox crashes that may occur when a hidden constraint is violated: NOMAD will simply reject the trial point that made the blackbox crash.
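A minimal sketch of a wrapper satisfying this blackbox format is shown below; the objective and constraints are illustrative placeholders, and the five-variable, three-output layout matches the example of Fig. 4.5:

```cpp
#include <fstream>
#include <iostream>
#include <vector>

// Minimal batch-mode blackbox: reads the variables from the file given as
// first argument, then prints the objective and constraint values to the
// standard output. A nonzero exit status signals a failed evaluation.
int main(int argc, char** argv) {
    if (argc < 2) return 1;
    std::ifstream in(argv[1]);
    std::vector<double> x;
    for (double v; in >> v; ) x.push_back(v);
    if (x.size() != 5) return 1;  // expected dimension

    // Illustrative objective and two quantifiable constraints c1, c2 <= 0.
    double f = 0.0;
    for (double v : x) f += v * v;
    const double c1 = x[0] + x[1] - 1.0;
    const double c2 = -x[2];

    std::cout.precision(15);      // display with sufficient precision
    std::cout << f << ' ' << c1 << ' ' << c2 << std::endl;
    return 0;
}
```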
Fig. 4.5

Example of a basic parameters file. The blackbox executable bb.exe takes five variables as input, and returns three outputs: one objective function value and two constraints. The initial point is the origin and NOMAD terminates after 100 evaluations
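A plausible reconstruction of such a parameters file, using NOMAD 3 keywords (the exact syntax should be checked against the user guide of the installed version):

```
DIMENSION      5                 # number of variables
BB_EXE         bb.exe            # blackbox executable
BB_OUTPUT_TYPE OBJ PB PB         # one objective, two relaxable constraints
X0             ( 0 0 0 0 0 )     # initial point: the origin
MAX_BB_EVAL    100               # stop after 100 blackbox evaluations
```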

The second way to use NOMAD is through the library mode. The user writes a C++ code which is linked to the NOMAD static library included in the package. This way, interactions with NOMAD are performed directly via C++ function calls and object manipulations. The optimization problem is described as a class and is written in C++ or in a compatible language such as C, FORTRAN, R, etc. The problem and algorithmic parameters are given as objects, and no parameters file is necessary. The library mode should be considered only by users with at least basic C++ knowledge. The problem must also be C++-compatible in order to be expressed as a class, and hidden constraints need to be explicitly treated. If these points are addressed, the advantages of the library mode are numerous. First, when the blackbox is not costly, the execution will be much faster than in batch mode, since no temporary files and no system calls are used. Second, numerical precision is not an issue because the communications between the algorithm and the problem occur at memory level. Third, more flexibility is possible with the use of callback functions that are user-defined and automatically called by NOMAD at key events, such as after a new success or at the end of a Mads iteration. Last, the library mode is convenient when NOMAD is repeatedly called as a subroutine. Note finally that a hybrid use of the batch and library modes is possible: for example, one can define the problem as a C++ class and use a parameters file.
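A minimal library-mode sketch, in the spirit of the NOMAD 3 C++ interface, is shown below; the class and method names follow the version 3 documentation, but signatures may differ between releases and should be checked against the user guide:

```cpp
#include "nomad.hpp"

// Sketch of a library-mode evaluator: eval_x() receives a trial point,
// runs the simulation, and stores the outputs in the order declared by
// BB_OUTPUT_TYPE (here: objective first, then two constraints).
class My_Evaluator : public NOMAD::Evaluator {
public:
    explicit My_Evaluator(const NOMAD::Parameters& p)
        : NOMAD::Evaluator(p) {}

    bool eval_x(NOMAD::Eval_Point& x, const NOMAD::Double& h_max,
                bool& count_eval) const {
        NOMAD::Double f = 0.0;
        for (int i = 0; i < 5; ++i) f += x[i] * x[i];  // toy objective
        x.set_bb_output(0, f);                  // objective
        x.set_bb_output(1, x[0] + x[1] - 1.0);  // constraint c1 <= 0
        x.set_bb_output(2, -x[2]);              // constraint c2 <= 0
        count_eval = true;   // this evaluation counts toward the budget
        return true;         // the evaluation succeeded
    }
};
```

A hidden constraint would be treated here by returning false instead of letting the code crash, so that NOMAD rejects the trial point.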

4.3.2 Important Algorithmic Parameters

The objective of this section is not to describe all the parameters, but to discuss the ones that may have a significant influence on the efficiency of an execution. The names of the parameters are not reported here, but can be easily found in the user guide or by using the NOMAD command-line help (option -h).
Starting point(s):

As for every optimization method, the starting point choice is crucial. The user has to provide his/her best guess so far for the method to be as efficient as possible. Within NOMAD, it is possible to define multiple starting points. This might be useful, for example, if in addition to a feasible initial solution, an infeasible one corresponding to a different and promising design is known.

Initial mesh size and scaling:

In our opinion, the second most important algorithmic choice concerns the initial mesh size. Defaults related to the scale of the starting point are used, but the user is encouraged to determine a good problem-related value. Some automatic scaling is performed, but here again, users should sometimes consider changing the scale of their variables and study the impact.

Poll directions:

The default type of poll directions is OrthoMads [3] but sometimes other direction types, such as LTMads [9], may perform well.

Surrogates:

Static and dynamic surrogates can be used with NOMAD. Static surrogates are indicated by the user with the same format as the true blackbox. Concerning dynamic surrogates, the current NOMAD version 3.5 includes only quadratic models, but statistical surrogates will be available in a future release. Dynamic surrogates may be employed at two different levels: as a search, and as a way to sort a list of candidates before they are evaluated. The current NOMAD default is to use quadratic models at both levels.

Projection to bounds:

When generating candidates outside of the hyper-rectangle defined by the bounds on the variables, NOMAD projects these points to the boundary, by default. For some problems, this strategy might not be the appropriate one.

Seeds:

If the context allows multiple executions, changing the random seed for LTMads, or the Halton seed for OrthoMads [3], will lead to different executions.

Termination criteria:

We finish this overview by indicating that many termination criteria are available, in addition to the obvious choice of a budget of evaluations.

4.3.3 Extensions of Mads

This section describes some algorithmic extensions that are not covered in the basic description of Mads given in Sect. 4.2. These features may be useful in practice to approach an optimization problem from different angles.
Parallel and variable decomposition methods:

Three different parallel versions are available, using MPI. These methods are called P-Mads, Coop-Mads, and Psd-Mads. P-Mads simply performs the evaluations in parallel, and two variants are available. First, the synchronous version waits for all ongoing parallel evaluations before iterating. This is opposed to the asynchronous variant, inspired by [28], which iterates as soon as a new success occurs, even if some evaluations are not yet finished. In case one of these evaluations results in an improvement, the current iterate and the current mesh size are adjusted accordingly during a special update step. The two other parallel methods, Coop-Mads and Psd-Mads, are provided as tools in the package. The first executes several Mads instances in parallel with different seeds, and some cooperative actions are performed in order to guide the search. Psd-Mads performs the same collaborative process, but in addition, subgroups of variables are considered for each process. This technique is described in [11] and aims at solving larger problems (50 to 500 variables).

Groups of variables:

The user with some knowledge of the problem can create groups of variables. Directions are then relative to these groups, and variables from separate groups will not vary at the same time. This proved useful for localization problems, in particular the one presented in [4].

Different types of variables:

It is possible to define integer and binary variables, which are treated by special meshes with a minimal size of one. Categorical variables may also be used. They are handled with the extended poll defined in [1, 31]. For such problems, the user must define a neighborhood structure, whose neighbors may contain a different number of variables. NOMAD defines the concept of signature that allows such heterogeneous points. Other types of variables include fixed variables and periodic variables (angles, for example). The strategy used for the periodic variables is described in [14].

Bi-objective optimization:

In some situations, the user is interested in considering the tradeoffs between two conflicting objective functions. The method from [16] executes a series of single-objective optimizations on reformulated versions of the original problem. These reformulations are not linear combinations of the objectives, which ensures that non-convex Pareto fronts can be identified.

Variable Neighborhood Search (VNS):

For problems with many local optima, it is possible to enable the generic VNS search. This strategy has been described in [6] and uses a variable neighborhood search metaheuristic in order to escape local optima.

Sensitivity analysis:

The last feature described here is a tool that uses bi-objective optimization in order to conduct sensitivity analyses on the constraints, including bounds as well as non-relaxable constraints. This is described in [13] with plots illustrating the impact on the objective of changing the right-hand side of a given constraint.

4.4 Metamaterial Design Using NOMAD

This section describes the implementation of the NOMAD optimization routine [32] in combination with full-field electromagnetic simulations to tailor the broadband spectral response of gold and silver split ring resonator metamaterials. By allowing NOMAD to “drive” finite-difference time-domain simulations, the spectral position of resonant reflection peaks and near-field interactions within the metamaterial were tuned over a wide range of the near-infrared spectrum. While this section discusses the design problems studied and the optimized results, a detailed discussion of the implementation used to communicate between the different software packages is provided in the Appendix.

4.4.1 Split Ring Resonator Optimization

The first example of NOMAD driving the design of metamaterial device geometries involves the structure shown in Fig. 4.6. Here, the broad-band reflection spectrum from a single split-ring resonator/cross-bar structure (SRR) surrounded by air was studied as a function of the device dimensions. All of the electromagnetic simulations studied in this chapter were performed using the finite-difference time-domain method and the Lumerical software package.
Fig. 4.6

Panel (a) shows a schematic of the SRR structure that was optimized in Sect. 4.4.1. The electric field intensity at 1500 nm, which corresponds to the resonance of the bar, is plotted in (b) and the electric field intensity at 2500 nm, which corresponds to the resonance of the SRR, is plotted in (c). The specific geometry in (b) and (c) corresponds to the optimized spectrum in Fig. 4.7(c) and Table 4.1

In this section, the SRRs were illuminated with a broadband plane wave source from 1–4 μm, and the structure was parameterized based on the height, width, ring thickness (\(t_1\)), bar thickness (\(t_2\)), and gap width. For all simulations, the thickness of the metal was 100 nm, the E-field was perpendicular to the arms of the SRR, and the width of the bar was kept the same as the width of the SRR. Also, linear constraints were imposed to ensure that the parameters made physical sense (e.g., width − 2\(t_1\) > 0), so that a gap was always present between the two arms of the SRRs.

Because of the general, double-peaked reflection spectrum that comes from the SRR/bar structure, a double Lorentzian was chosen as a plausible initial target for the optimization. Peaks were set at 1500 nm with a reflection intensity of 50 % and at 2500 nm with a reflection intensity of 35 %. Although the particular target wavelengths were chosen arbitrarily, this double Lorentzian spectrum was chosen to correspond to the resonant modes of the bar and SRR shown in Fig. 4.6(b)–(c). This could be considered typical of an application in which the designer wishes to design a nanoantenna which simultaneously matches the center frequencies and line widths of both the absorption and the emission processes in a quasi-three-level optical system, such as coupling to a photoluminescent quantum dot. The target spectrum is shown as the dashed green curve in Fig. 4.7(a)–(d). Simulations were done with multiple starting points, listed in Table 4.1, for both gold and silver SRRs. The upper and lower bounds for each of the five fit parameters are also listed in Table 4.1, and the bounds for \(t_1\) are starred to indicate the imposed linear constraints. For these simulations, the objective function used to drive the optimization is listed in Eq. (4.5), with the first three terms scaled by \(\frac{1}{150}\) to keep the magnitudes of all six terms comparable. Through experience, we have seen that objective functions that focus on a few key points in the broadband spectrum almost always give better results than metrics such as the mean squared error at every point in the reflection spectrum. If every point in the broadband spectrum is weighted equally, the key objectives in the cost function are essentially “swamped out” by the remaining hundreds or thousands of other, less important data points. As a result, the general type of objective function listed in Eq. (4.5) is used throughout the rest of this chapter:
$$\begin{aligned} \mbox{O.F.} =& \frac{\vert \lambda_{P1} - 1500\vert }{150} + \frac {\vert \lambda_{P2} - 2500\vert }{150} + \frac{\vert \lambda_{V} - 2025\vert }{150} + \vert I_{P1} - 0.5\vert \\ &{}+ \vert I_{P2} - 0.35\vert + \vert I_{V} - 0.07\vert . \end{aligned}$$
(4.5)
Fig. 4.7

SRR spectrum optimization using NOMAD with FDTD. Simulations using gold are shown on the left and simulations using silver are shown on the right. The top row corresponds to simulations using the first starting point in Table 4.1, and the bottom row corresponds to simulations using the second starting point

Table 4.1

Starting and optimized dimensions (in nm) for the SRR structures tested in Sect. 4.4.1. The variables correspond to those listed in Fig. 4.6. The bounds for \(t_1\) (starred) were linearly constrained so that width − 2\(t_1\) > 0 for all optimizations

First starting point:

| Parameter | Initial value | Gold optimized value | Silver optimized value | Minimum | Maximum |
|-----------|---------------|----------------------|------------------------|---------|---------|
| Width     | 400           | 450                  | 474                    | 200     | 600     |
| Height    | 400           | 394                  | 412                    | 200     | 600     |
| t_1 *     | 100           | 125                  | 152                    | 50      | 200     |
| t_2       | 100           | 114                  | 193                    | 50      | 200     |
| Gap       | 100           | 192                  | 184                    | 50      | 200     |

Second starting point:

| Parameter | Initial value | Gold optimized value | Silver optimized value | Minimum | Maximum |
|-----------|---------------|----------------------|------------------------|---------|---------|
| Width     | 500           | 455                  | 438                    | 200     | 600     |
| Height    | 500           | 476                  | 480                    | 200     | 600     |
| t_1 *     | 125           | 190                  | 182                    | 50      | 200     |
| t_2       | 125           | 94                   | 105                    | 50      | 200     |
| Gap       | 125           | 190                  | 173                    | 50      | 200     |

Here \(\lambda_{P1}\) is the shorter-wavelength peak position, \(\lambda_{P2}\) is the longer-wavelength peak position, \(\lambda_{V}\) is the position of the reflection minimum between \(\lambda_{P1}\) and \(\lambda_{P2}\), \(I_{P1}\) is the intensity of the shorter-wavelength peak, \(I_{P2}\) is the intensity of the longer-wavelength peak, and \(I_{V}\) is the intensity of the reflection minimum between \(\lambda_{P1}\) and \(\lambda_{P2}\).
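Given these six extracted features, Eq. (4.5) translates directly into code; a sketch (the peak/valley extraction from the simulated spectrum is assumed to be done elsewhere):

```cpp
#include <cmath>

// Objective function of Eq. (4.5). The wavelength terms are scaled by
// 1/150 so that all six terms have comparable magnitudes.
double objective_eq45(double lambda_p1, double lambda_p2, double lambda_v,
                      double i_p1, double i_p2, double i_v) {
    return std::abs(lambda_p1 - 1500.0) / 150.0
         + std::abs(lambda_p2 - 2500.0) / 150.0
         + std::abs(lambda_v  - 2025.0) / 150.0
         + std::abs(i_p1 - 0.50)
         + std::abs(i_p2 - 0.35)
         + std::abs(i_v  - 0.07);
}
```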

Figure 4.7 shows the optimization results for a matrix of initial conditions. The left panels (a & b) are simulations of gold SRRs, while the right panels (c & d) are silver SRRs. Table 4.1 details two sets of initial values for each of the resonators' geometrical parameters. The top row of Fig. 4.7, panels (a & c), corresponds to the first set of initial conditions; the bottom row, panels (b & d), corresponds to the second starting point. For all four panels, the dashed curve represents the double Lorentzian target spectrum, the dotted curve represents the reflection spectrum from the starting point, and the solid curve represents the reflection spectrum from the optimized result. A close look at the results in Table 4.1 shows that, for each metal, there is some variability in the optimized results depending on the starting point. This is not necessarily surprising, considering the degeneracy in potential solutions for this five-parameter optimization and the relatively relaxed convergence tolerance specified during the optimizations. For all four cases, the results are especially encouraging given that the ideal double Lorentzian to which the curves were fit was arbitrarily chosen, and a perfect fit cannot necessarily be obtained with the given geometrical constraints and materials.

4.4.2 Split Ring Resonator Filters

The second example involves an array of individual SRRs with the same unit cell design shown in Fig. 4.6(a). In this case, the array was on top of a sapphire substrate and was used as a “notch filter”. A seven-parameter NOMAD optimization was performed on this array, which included the five parameters from Fig. 4.6(a) as well as the spacing between scattering elements along the x-axis (parallel to the E-field) and the y-axis (perpendicular to the E-field). Here the objective was to minimize the reflectivity within the pass band at a pre-specified wavelength, while maximizing the reflectivity on either side of the pass band. The objective function used to drive the optimization is listed in Eq. (4.6):
$$\begin{aligned} \mbox{O.F.} =& 100 \bigl[(1-I_{P1})+(1-I_{P2})+I_{V} \bigr] + \vert \lambda_{P1} - \lambda_{P2}\vert + \vert \lambda_{V} - \lambda_{T}\vert \end{aligned}$$
(4.6)
where \(\lambda_{T}\) is the pass-band target wavelength, and the remaining terms are identical to those used in Eq. (4.5). Target wavelengths of λ=1310, 1550, and 1800 nm were chosen, and the optimization was run with the same starting conditions each time. The resulting spectra from the three optimizations are shown in Fig. 4.8. All three optimized spectra show a ∼45 % change in reflectivity at the pass band and corresponding linewidths of ∼90 meV. The starting and optimized dimensions for each solution are given in Table 4.2. Figure 4.8 clearly shows successfully optimized designs for all three target wavelengths, and demonstrates the wide range of tunability this technique can offer for metamaterial design.
Fig. 4.8

Optimized reflection spectra for arrays of SRRs on sapphire substrates. The array was designed to act as a notch filter at three target wavelengths of λ=1310,1550, and 1800 nm, respectively. The dimensions that produced each spectrum are given in Table 4.2

Table 4.2

Starting and optimized dimensions (in nm) for the SRR structures in Sect. 4.4.2, designed to act as a notch filter at λ=1310, 1550, and 1800 nm

| Parameter | Initial value | 1310 nm | 1550 nm | 1800 nm | Minimum | Maximum |
|-----------|---------------|---------|---------|---------|---------|---------|
| Width     | 500           | 348     | 424     | 508     | 200     | 600     |
| Height    | 500           | 356     | 420     | 492     | 200     | 600     |
| t_1       | 200           | 100     | 164     | 208     | 50      | 200     |
| t_2       | 100           | 124     | 140     | 104     | 50      | 200     |
| Gap       | 100           | 51      | 60      | 96      | 50      | 200     |
| x spacing | 200           | 128     | 120     | 192     | 50      | 1000    |
| y spacing | 200           | 330     | 392     | 436     | 50      | 1000    |

As a final check of the robustness of the solutions in the previous sections, a systematic variation of each parameter of the 400 nm Au SRR in Sect. 4.4.1 and the 1500 nm SRR filter in Sect. 4.4.2 was performed near the optimized values. Variations of 1–3 % produced a corresponding change in the objective function of <0.3 % for the 400 nm Au SRR and <4 % for the 1500 nm notch filter. We conclude from this that the solutions are robust to local perturbations (which is an attribute of the relationship between the SRR geometry and the objective function, not of the optimization method used).

4.4.3 Coupling Quantum Dots to Split-Ring Resonators

In the third and final example of this chapter, we examine the design requirements involved in coupling the previously analyzed resonances of SRRs to the electronic transition states of quantum dots. The nonlinear nature of metals at optical frequencies makes them strong candidates for nonlinear mixing experiments. As an example, at these frequencies gold exhibits a strong third-order nonlinear susceptibility (\(\chi^{(3)} \sim 1\ \mathrm{nm}^{2}\,\mathrm{V}^{-2}\)) [41]. Further, one of the biggest strengths of these nanoscale resonant structures is their ability to manipulate and couple light in the near-field to nanostructures such as quantum dots. Combining lithography techniques with surface chemistry modification of the quantum dots presents an interesting opportunity to pattern these dots within the high-field regions of an array of resonators, Fig. 4.9(a). This portion of the process has previously been reported in the literature [33, 38, 42, 46], and nonlinear mixing within gold nanostructures has already been demonstrated by Kim et al. [30]. By using the techniques described in this chapter, we can tailor the exact device geometry to have a resonance at the electronic transition energy of a specific batch of dots [19].
Fig. 4.9

The SRR design that was optimized to couple incident light at λ=3600 nm and λ=1800 nm to quantum dots lithographically patterned within the high-field regions of the resonant structure, (a). The plot in (b) shows an idealized resonant spectrum with the 2ω of the gold matched to a typical absorption spectrum of quantum dots. The optimized dimensions for this structure are listed in Table 4.4

While there exists a wide range of energy transitions for this type of system, for this example we consider second harmonic generation studies with resonances at both λ=3600 nm and λ=1800 nm, which would enhance both the absorption of light at λ=3600 nm (\(\omega_{\mathrm{Au}}\)) and the coupling of light between the resonator and quantum dots at λ=1800 nm (\(\sim 2\omega_{\mathrm{Au}}\)), Fig. 4.9(b). For the case of gold resonators patterned on a sapphire substrate, the resonance wavelengths were chosen to closely match those of the lead selenide quantum dots.

The diagram in Fig. 4.9(b) shows a typical broadband absorption spectrum of lead selenide quantum dots (dot-dashed line) overlaid on a target resonance spectrum of a metamaterial tuned to the wavelengths of interest (solid line). While a wide range of resonator geometries would satisfy the performance requirements specified above, variations on the basic SRR geometry have been studied throughout this chapter, and will again be used in this section. Using λ=3600 nm and λ=1800 nm as the target spectrum, initial simulations showed that the basic SRR structure of Sects. 4.4.1 and 4.4.2 was unable to span the wavelength region of interest; however, the SRR variant shown in Fig. 4.9(a) could. In this case, the array was on a sapphire substrate and a seven-parameter NOMAD optimization was performed. While the array optimization in this situation was similar to that in Sect. 4.4.2, here the gap between the two SRRs was a design parameter with the lower bound set by fabrication constraints (50 nm), and the height of each SRR was varied independently. This last parameter is not to be confused with the resonator thickness, which was set at 50 nm for all structures studied in this example. Finally, the total width and arm width for both SRRs were kept equal for all designs. As in Sects. 4.4.1 and 4.4.2, a nonlinear constraint was imposed within NOMAD to maintain a minimum gap between the arms of each SRR, in order to preserve the general shape of the broadband reflection spectra.

In an effort to study the convergence behavior of NOMAD using different objective functions, eight separate multi-objective cost functions were used. These variants are shown in Table 4.3. For all eight functions, the absolute differences between the peak wavelengths and the two target wavelengths were included. In addition, the resonant intensity of one or both of the two broadband peaks was added. Lastly, the first four objective functions take the sum of the individual terms, while the last four take the product of the individual terms. Here again, the variables in Table 4.3 match those in Eq. (4.5).
Table 4.3

Objective functions used for the design optimization of an array of Split Ring Resonators with resonances at λ=3600 nm and λ=1800 nm, coupled to quantum dots. For the above functions, \(I^{*}_{P1,2}=500(1 - \textnormal{Intensity}_{P1,2})\). The resulting broadband resonance spectra are shown in Fig. 4.10

O.F. #   Objective function
1        |λ_P1 − 1800| + |λ_P2 − 3600|/2
2        |λ_P1 − 1800| + |λ_P2 − 3600|/2 + I*_P1
3        |λ_P1 − 1800| + |λ_P2 − 3600|/2 + I*_P2
4        |λ_P1 − 1800| + |λ_P2 − 3600|/2 + I*_P1 + I*_P2
5        |λ_P1 − 1800| ⋅ |λ_P2 − 3600|
6        |λ_P1 − 1800| ⋅ |λ_P2 − 3600| ⋅ I*_P2
7        |λ_P1 − 1800| ⋅ |λ_P2 − 3600| ⋅ I*_P1
8        |λ_P1 − 1800| ⋅ |λ_P2 − 3600| ⋅ I*_P1 ⋅ I*_P2
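The variants of Table 4.3 reduce to a few lines of code once the FDTD post-processing has extracted the two peak wavelengths and their normalized reflection intensities. The sketch below is a plain restatement of the table in C++; the Peaks struct and the function names are introduced here for illustration only.

    // The eight objective-function variants of Table 4.3, given the two
    // extracted peak wavelengths (nm) and normalized peak intensities.
    #include <cmath>

    struct Peaks { double lamP1, lamP2, iP1, iP2; };

    // I*_{P1,2} = 500 (1 - Intensity_{P1,2}), per the Table 4.3 caption.
    static double iStar(double intensity) { return 500.0 * (1.0 - intensity); }

    double objective(int variant, const Peaks& p) {
        const double d1 = std::fabs(p.lamP1 - 1800.0);
        const double d2 = std::fabs(p.lamP2 - 3600.0);
        const double s1 = iStar(p.iP1);
        const double s2 = iStar(p.iP2);
        switch (variant) {
            case 1: return d1 + d2 / 2.0;            // additive forms
            case 2: return d1 + d2 / 2.0 + s1;
            case 3: return d1 + d2 / 2.0 + s2;
            case 4: return d1 + d2 / 2.0 + s1 + s2;
            case 5: return d1 * d2;                  // multiplicative forms
            case 6: return d1 * d2 * s2;
            case 7: return d1 * d2 * s1;
            case 8: return d1 * d2 * s1 * s2;
            default: return 0.0;
        }
    }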

For each optimization, the initial conditions and the upper and lower bounds were kept the same. For all the optimizations studied here, Table 4.4 shows the initial conditions (Column 2), the corresponding dimensions of the optimized designs for all eight objective functions (Columns 3–5), and the lower and upper bounds (Columns 6 and 7).
Table 4.4

Starting and optimized dimensions (in nm) for an array of SRRs shown in Fig. 4.9(a) with dimensions optimized to resonate at both λ=3600 nm and λ=1800 nm. The columns “O.F. 1–6”, “O.F. 7”, and “O.F. 8”, correspond to the optimized dimensions obtained using Objective Functions #1–6, #7, and #8 from Table 4.3. The resulting broadband resonance spectrum is shown in Fig. 4.10

 

SRR initial and optimized values

Parameter    Initial value   O.F. 1–6   O.F. 7   O.F. 8   Minimum   Maximum
Width        500             467±2      579      592      300       600
Length 1     500             494±1      530      573      300       600
t_1          100             134±1      198      196      50        200
Length 2     500             494±1      494      549      300       600
Gap          100             114±1      183      113      50        200
x spacing    100             306±8      374      296      100       500
y spacing    100             299±1      299      298      100       500
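As a concrete illustration of how the starting point and bounds of Table 4.4 would be supplied to the solver, the following is a sketch of a NOMAD-style parameter file using batch-mode keywords of the NOMAD 3 era; the blackbox name srr_bb.exe, the evaluation budget, and the variable ordering (matching the hypothetical wrapper sketched earlier) are placeholders.

    # Hypothetical NOMAD parameter file for the Table 4.4 problem.
    DIMENSION      7                                 # Width, Length 1, t_1, Length 2, Gap, x spacing, y spacing
    BB_EXE         srr_bb.exe                        # blackbox prints: objective, then constraint
    BB_OUTPUT_TYPE OBJ PB                            # one objective, one progressive-barrier constraint
    X0             ( 500 500 100 500 100 100 100 )   # initial values (Column 2)
    LOWER_BOUND    ( 300 300  50 300  50 100 100 )   # "Minimum" column
    UPPER_BOUND    ( 600 600 200 600 200 500 500 )   # "Maximum" column
    MAX_BB_EVAL    500                               # placeholder evaluation budget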

As in Sects. 4.4.1 and 4.4.2, the idea was to maximize the reflection intensity at the two wavelengths of interest while simultaneously placing the peak resonance wavelengths as close to the target wavelengths as possible. From Fig. 4.10 we can see strong resonances from the metamaterial array at both λ=3600 nm and λ=1800 nm; however, there are clearly small differences between the results. These results illustrate a point that was made in Chap. 2: even when the key features of the broadband spectrum are well known, the way in which the individual terms are combined can prove to be one of the most challenging parts of the design optimization. Figure 4.10 shows that while the three optimized spectra produced by the eight objective functions all closely match the intended spectrum, objective function #8 is clearly better than the other two.
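To see, with hypothetical numbers, how the combination rule reshapes the search landscape: suppose a trial design returns λ_P1=1850 nm, λ_P2=3700 nm, and Intensity_P1=Intensity_P2=0.9, so that I*_P1=I*_P2=50. Objective function #4 then evaluates to 50+100/2+50+50=200, while objective function #8 evaluates to 50⋅100⋅50⋅50=1.25×10^7. If a later trial point reduces the λ_P1 error from 50 nm to 5 nm with everything else unchanged, #4 drops only from 200 to 155, whereas #8 falls tenfold to 1.25×10^6. The multiplicative form therefore rewards driving any single residual toward zero far more strongly, which is consistent with the sharper peak placement obtained here with objective function #8.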
Fig. 4.10

(a) The SRR design optimized to couple incident light at λ=3600 nm and λ=1800 nm to quantum dots lithographically patterned within the high-field regions of the resonant structure. (b) An idealized resonance spectrum with the 2ω resonance of the gold matched to the measured absorption spectrum of the quantum dots. The optimized dimensions for these spectra are listed in Table 4.4

While these resonances are close to, but not a perfect match for, the desired resonances, the full-width half-maxima of the O.F. #8 resonances at λ=1800 nm and λ=3600 nm are ∼240 nm and ∼675 nm, respectively, while that of the quantum dots is ∼150 nm. Hence, although the inherently broad nature of the resonances from metamaterial arrays is normally considered a drawback, in this situation it can compensate for some amount of discrepancy between the optimized peaks and the ω and 2ω targets.

Finally, it should be noted that this example is only a first-order result. Here, the resonances of the quantum dots and the metamaterial are considered independently when setting the objective function targets. The addition of quantum dots into the near field of the SRRs will perturb the local dielectric environment and, as a result, shift the actual resonances from those predicted in the simulations. While non-negligible, this change is a second-order effect and, combined with the substantial bandwidth of the individual resonances, should not significantly affect the results of the example.

Footnotes

  1. Here, we note that for the purposes of metamaterial design and this book, the blackbox terminology refers to any electromagnetics solver used to simulate a given metamaterial design.

  2. Figures 4.9 and 4.10 were produced using the resources of MIT Lincoln Laboratory.

  3. Lead selenide quantum dot spectra courtesy of Dr. Seth Taylor.


Acknowledgements

Work of the first author was supported by NSERC Discovery Grant 239436-05 and AFOSR FA9550-09-1-0160.


Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Charles Audet (1)
  • Kenneth Diest (2)
  • Sébastien Le Digabel (1)
  • Luke A. Sweatlock (3)
  • Daniel E. Marthaler (4)

  1. GERAD and Département de Mathématiques et Génie Industriel, École Polytechnique de Montréal, Montréal, Canada
  2. Massachusetts Institute of Technology Lincoln Laboratory, Lexington, USA
  3. Northrop Grumman Aerospace Systems, Redondo Beach, USA
  4. GE Global Research: Industrial Internet Analytics, San Ramon, USA
