Abstract
Two parallelized hybrid methods are presented for single-function optimization problems with side constraints. These problems are difficult not only because of the possible existence of local minima and the nonsmoothness of the functions, but also because objective function and constraint values for a solution vector can only be obtained by querying a black box whose execution requires considerable computational effort. Examples are optimization problems in engineering where objective function and constraint values are computed via complex simulation programs, where local minima may exist, and where smoothness of the functions is not assured. The hybrid methods combine the well-known method NOMAD with two new methods, called DENCON and DENPAR, that are based on the linesearch scheme CS-DFN. For each query, the hybrid methods compute a set of solution vectors that are evaluated in parallel. The hybrid methods have been tested on a set of difficult optimization problems produced by a certain seeding scheme for multiobjective optimization. We compare the computational results with solutions obtained by NOMAD, DENCON, and DENPAR as stand-alone methods. It turns out that among the stand-alone methods, NOMAD is significantly better than DENCON and DENPAR. However, the hybrid methods are definitely better than NOMAD.
References
Abramson MA, Audet C, Couture G, Dennis Jr JE, Le Digabel S, Tribes C (2014) The NOMAD project. http://www.gerad.ca/nomad
Abramson MA, Audet C, Dennis JE Jr, Le Digabel S (2009) OrthoMADS: a deterministic MADS instance with orthogonal directions. SIAM J Optim 20(2):948–966
Audet C, Dennis JE Jr (2006) Mesh adaptive direct search algorithms for constrained optimization. SIAM J Optim 17(1):188–217
Audet C, Dennis JE Jr, Le Digabel S (2008) Parallel space decomposition of the mesh adaptive direct search algorithm. SIAM J Optim 19(3):1150–1170
Bratley P, Fox B (1988) Algorithm 659: implementing Sobol’s quasirandom sequence generator. ACM Trans Math Softw 14(1):88–100
Clarke FH (1983) Optimization and nonsmooth analysis. Wiley, New York
Dennis JE Jr, Torczon V (1991) Direct search methods on parallel machines. SIAM J Optim 1(4):448–474
Di Pillo G, Grippo L, Lucidi S (1993) A smooth method for the finite minimax problem. Math Program 60:187–214
Fasano G, Liuzzi G, Lucidi S, Rinaldi F (2014) A linesearch-based derivative-free approach for nonsmooth constrained optimization. SIAM J Optim 24(3):959–992
García-Palomares UM, Rodríguez JF (2002) New sequential and parallel derivative-free algorithms for unconstrained minimization. SIAM J Optim 13(1):79–96
García-Palomares UM, García-Urrea IJ, Rodríguez-Hernández PS (2013) On sequential and parallel non-monotone derivative-free algorithms for box constrained optimization. Optim Methods Softw 28(6):1233–1261
Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, Boston
Gray GA, Kolda TG (2006) Algorithm 856: APPSPACK 4.0: asynchronous parallel pattern search for derivative-free optimization. ACM Trans Math Softw 32(3):485–507
Griffin JD, Kolda TG, Lewis RM (2008) Asynchronous parallel generating set search for linearly constrained optimization. SIAM J Sci Comput 30(4):1892–1924
Grippo L, Lampariello F, Lucidi S (1986) A nonmonotone line search technique for Newton’s method. SIAM J Numer Anal 23(4):707–716
Halton JH (1960) On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numer Math 2:84–90
Hough PD, Kolda TG, Torczon VJ (2001) Asynchronous parallel pattern search for nonlinear optimization. SIAM J Sci Comput 23(1):134–156
Kolda TG (2005) Revisiting asynchronous parallel pattern search for nonlinear optimization. SIAM J Optim 16(2):563–586
Kolda TG, Torczon V (2004) On the convergence of asynchronous parallel pattern search. SIAM J Optim 14(4):939–964
Laguna M, Molina J, Pérez F, Caballero R, Hernández-Díaz AG (2009) The challenge of optimizing expensive black boxes: a scatter search/rough set theory approach. J Oper Res Soc 61:53–67
Le Digabel S (2011) Algorithm 909: NOMAD: nonlinear optimization with the MADS algorithm. ACM Trans Math Softw 37(4):1–15
Meza JC, Oliva RA, Hough PD, Williams PJ (2007) OPT++: an object-oriented toolkit for nonlinear optimization. ACM Trans Math Softw 33(2):12
Moré JJ, Wild SM (2009) Benchmarking derivative-free optimization algorithms. SIAM J Optim 20(1):172–191
Ponstein J (1967) Seven kinds of convexity. SIAM Rev 9(1):115–119
Shetty CM, Bazaraa MS (1979) Nonlinear programming: theory and algorithms. Wiley, New York
Sobol I (1977) Uniformly distributed sequences with an additional uniform property. USSR Comput Math Math Phys 16:236–242
Truemper K (in review) Simple seeding of evolutionary algorithms for hard multiobjective minimization problems
Acknowledgements
We are thankful to an anonymous reviewer for helpful comments and suggestions.
Additional information
Communicated by Ernesto G. Birgin.
Appendices
Appendix A: More comparisons
We show that the iteration budget of 2000 evaluations generally is a good choice.
To start, we define six candidate iteration budgets: 500, 1000, 2000, 3000, 4000, and 5000 evaluations. That choice is based on earlier runs involving hard engineering problems, where 1000 and 2000 turned out to be good choices.
We construct the graphs showing the data and performance profiles for NOMAD, NOMAD–DENCON, and NOMAD–DENPAR for the selected six iteration budgets. Space restrictions prevent inclusion of the graphs; they are available at http://www.iasi.cnr.it/~liuzzi/hybridDF/.
From the graphs, we derive tables structured like Table 1, which summarize the relative size of the areas under the profile curves of the various graphs. The tables are included in the appendix as Tables 6, 7, 8, 9, 10 and 11. The percentages in the tables show that NOMAD always produces the smallest area and thus has the worst performance, except for two cases in Table 6 where the problems with \(n=20\) variables are solved on 64 processors, and where NOMAD dominates NOMAD–DENCON but is still dominated by NOMAD–DENPAR.
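To make the area comparison concrete, the following minimal Python sketch computes relative areas under performance-profile curves for hypothetical solver costs. The grid resolution, the cost matrix, and the function name performance_profile_areas are illustrative choices, not the exact procedure used for Tables 6–11.

```python
import numpy as np

def performance_profile_areas(costs, tau_max=50.0, num_tau=200):
    """Relative areas under performance-profile curves.

    costs[p, s] is the effort solver s needs on problem p; np.inf marks a
    failure.  Returns each solver's area divided by the largest area, so
    the best solver scores 100%.
    """
    ratios = costs / np.min(costs, axis=1, keepdims=True)   # performance ratios
    taus = np.linspace(1.0, tau_max, num_tau)
    # rho[s, j] = fraction of problems solved within a factor taus[j] of the best
    rho = np.array([[np.mean(ratios[:, s] <= tau) for tau in taus]
                    for s in range(costs.shape[1])])
    areas = rho.mean(axis=1)          # proportional to the area on a uniform grid
    return 100.0 * areas / areas.max()

# Hypothetical evaluation counts for 5 problems and 3 solvers
costs = np.array([[120., 100., 150.],
                  [300., 280., 260.],
                  [np.inf, 900., 800.],
                  [50.,  60.,  55.],
                  [400., 350., 500.]])
print(performance_profile_areas(costs))   # percentages; best solver = 100
```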
The results of Tables 6, 7, 8, 9, 10 and 11 are reassuring in the sense that the dominance of NOMAD–DENCON and NOMAD–DENPAR over NOMAD does not depend on a critical choice of the iteration budget. There is an intuitive explanation for this result. During each iteration, the selected method may terminate due to its convergence conditions and thus may not exhaust the iteration budget. In such a case, a larger iteration budget induces the same behavior. A corollary of this argument is that we should always prefer component methods that have well-justified convergence conditions, i.e., conditions based on a sound convergence analysis, as is the case for the three methods NOMAD, DENCON, and DENPAR selected here.
We interrupt the analysis of Tables 6, 7, 8, 9, 10 and 11 and look at the efficiency of the hybrid methods under parallelization, again considering the six iteration budgets. The relevant results are compiled in Table 3 for NOMAD–DENCON and in Table 4 for NOMAD–DENPAR. The interpretation is analogous to that for Table 2. For each problem subset, the efficiency ratios \(s/(64\cdot c)\) are very similar regardless of the iteration budget. For example, when the entire problem set is solved, the ratio ranges from 18 to 25% for NOMAD–DENCON and from 21 to 27% for NOMAD–DENPAR. These results indicate that the efficiency under parallelization is not very sensitive to the iteration budget.
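For reference, the efficiency ratio \(s/(64\cdot c)\) can be computed as in the following sketch, where we assume that s is the single-processor effort and c the corresponding effort on 64 processors; the numbers used in the example are hypothetical.

```python
def parallel_efficiency(serial_effort, parallel_effort, num_procs=64):
    """Efficiency ratio s / (num_procs * c); 1.0 would mean ideal linear speedup."""
    return serial_effort / (num_procs * parallel_effort)

# Hypothetical wall-clock times (seconds) for one problem subset
print(f"{parallel_efficiency(serial_effort=6400.0, parallel_effort=500.0):.0%}")  # 20%
```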
After this general investigation of the impact of the iteration budget, we turn to the selection of the best iteration budget. For the selection, we compute the profile graphs and tables evaluating the performance of NOMAD–DENCON and NOMAD–DENPAR under the six iteration budgets. The graphs are omitted here; they are available at http://www.iasi.cnr.it/~liuzzi/hybridDF/. The graphs are summarized in Tables 12 and 13 for NOMAD–DENCON and in Tables 14 and 15 for NOMAD–DENPAR.
We analyze the tables. Table 12 has the percentages for NOMAD–DENCON when running on a single processor. The bold numbers, indicating maxima as before, occur mostly in the rows for an iteration budget of 2000. Thus, that budget is a good choice. Table 13, which covers NOMAD–DENCON on 64 processors, leads to the same conclusion.
Tables 14 and 15, which cover NOMAD–DENPAR, do not lead to such a clear-cut choice. Nevertheless, in Table 14 half of the bold entries occur in rows for the iteration budget of 2000. Table 15, however, provides no significant insight. The reason becomes clear when we look at the corresponding graph, provided at http://www.iasi.cnr.it/~liuzzi/hybridDF/. That graph shows the profile curves bunched together, so we accept an iteration budget of 2000 as a reasonable choice, consistent with the conclusions drawn from the other tables.
We conclude that an iteration budget of 2000 is a reasonable choice for both NOMAD–DENCON and NOMAD–DENPAR. For that choice, Table 8 indicates improvement percentages of NOMAD–DENCON over NOMAD ranging from 32 to 68% when the entire problem set is solved on a single processor, and from 33 to 65% on 64 processors. The corresponding percentages for NOMAD–DENPAR are 12 to 33% on a single processor and 29 to 44% on 64 processors. Thus, NOMAD–DENCON is clearly better than NOMAD–DENPAR on a single processor as well as on 64 processors.
Appendix B: Convergence analysis for DENPAR
This section is devoted to the convergence analysis of Algorithm DENPAR. Throughout, we consider problem (3), where the objective function is assumed to be quasi-convex (Shetty and Bazaraa 1979), i.e., for each \(x,y\in \mathbb {R}^n\) and every \(\lambda \in [0,1]\), \(f(\lambda x + (1-\lambda )y) \le \max \{f(x),f(y)\}\).
Algorithm DENPAR is based on DFN\(_{simple}\) from Fasano et al. (2014). We review the latter scheme, beginning with the definition of Clarke stationarity (see, e.g., Clarke 1983).
Definition B.1
(Clarke Stationarity) Given the unconstrained problem \(\min _{x\in \mathbb {R}^n} f(x)\), a point \(\bar{x}\) is a Clarke stationary point if \(0\in \partial f(\bar{x})\), where \(\partial f( x)=\{s\in \mathbb {R}^n : f^{Cl}( x; d)\ge d^Ts,\ \forall d \in \mathbb {R}^n \}\) is the generalized gradient of f at x, and \(f^{Cl}(x;d)=\limsup _{y\rightarrow x,\, t\downarrow 0}\frac{f(y+td)-f(y)}{t}\) is the Clarke directional derivative of f at x along the direction d.
We also need the definition of dense subsequences.
Definition B.2
(Dense subsequence) Let K be an infinite subset of indices (possibly \(K=\{0,1,\dots \}\)). The subsequence of normalized directions \(\{d_k\}_K\) is said to be dense in the unit sphere S(0, 1), if for any \(\bar{D}\in S(0,1)\) and for any \(\epsilon > 0\) there exists an index \(k\in K\) such that \(\Vert d_k-\bar{D}\Vert \le \epsilon \).
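Dense direction sequences of this kind are typically generated from quasi-random sequences such as those of Halton (1960) or Sobol (1977). The sketch below shows one simple way to obtain such a sequence: Halton points in the unit cube are shifted to \([-1,1]^n\) and normalized. The helper names are ours, and the construction is illustrative; it is not the exact generator used by NOMAD, DENCON, or DENPAR.

```python
import numpy as np

def halton_point(k, dim):
    """k-th point of the Halton sequence in [0,1]^dim (radical inverse per prime base)."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # enough bases for this illustration
    point = np.empty(dim)
    for i in range(dim):
        base, f, x, n = primes[i], 1.0, 0.0, k + 1
        while n > 0:
            f /= base
            x += f * (n % base)
            n //= base
        point[i] = x
    return point

def dense_direction(k, dim):
    """k-th direction of a sequence intended to be dense in the unit sphere S(0,1)."""
    d = 2.0 * halton_point(k, dim) - 1.0    # map [0,1]^dim to [-1,1]^dim
    nrm = np.linalg.norm(d)
    return d / nrm if nrm > 0 else dense_direction(k + 1, dim)

directions = [dense_direction(k, 3) for k in range(5)]
```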
Here is a summary of DFN\(_{simple}\) for the solution of Problem (3).

In Algorithm DFN\(_{simple}\), a predefined sequence of search directions \(\{d_k\}\) is used. At iteration k, the behavior of the function f(x) along the direction \(d_k\) is investigated. If the direction (or its opposite) is deemed a good direction, in the sense that sufficient decrease can be obtained along it, then a sufficiently large step size is computed by means of the Expansion Step procedure. On the other hand, if neither \(d_k\) nor \(-d_k\) is a good direction, then the tentative step size is reduced by a constant factor.
Note that, in Algorithm DFN\(_{simple}\), considerable freedom is left in the selection of the next iterate \(x_{k+1}\) once the new point \(\tilde{x}_k\) has been computed. More specifically, the next iterate \(x_{k+1}\) is only required to satisfy the inequality \(f(x_{k+1})\le f(\tilde{x}_k)\). This can trivially be satisfied by setting \(x_{k+1} \leftarrow \tilde{x}_k\). However, more sophisticated selection strategies can be implemented. For instance, \(x_{k+1}\) might be defined by minimizing suitable approximating models of the objective function, thus possibly improving the efficiency of the overall scheme. As we shall see, this freedom offered by DFN\(_{simple}\) is particularly useful for our purposes.
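The following Python sketch paraphrases the loop just described. The sufficient-decrease test, the constants gamma and theta, the simple doubling in expansion_step, and the callable directions supplying the direction sequence are illustrative simplifications of the scheme in Fasano et al. (2014); the optional model-based choice of \(x_{k+1}\) is omitted.

```python
import numpy as np

def dfn_simple_sketch(f, x0, directions, max_iter=1000,
                      alpha0=1.0, gamma=1e-6, theta=0.5):
    """Serial sketch of a DFN_simple-style linesearch loop (simplified)."""
    x, alpha = np.asarray(x0, dtype=float), alpha0
    for k in range(max_iter):
        d = directions(k, x.size)          # predefined (dense) direction sequence
        f_x = f(x)
        x_tilde = x
        for s in (+1.0, -1.0):             # try d_k and then -d_k
            if f(x + s * alpha * d) <= f_x - gamma * alpha**2:   # sufficient decrease
                step = expansion_step(f, x, s * d, alpha, gamma)
                x_tilde = x + step * s * d
                break
        else:
            alpha *= theta                 # neither direction is good: shrink the step
        # any x_{k+1} with f(x_{k+1}) <= f(x_tilde) is admissible; here we keep x_tilde
        x = x_tilde
    return x

def expansion_step(f, x, d, alpha, gamma):
    """Keep doubling the step while the sufficient-decrease test still passes."""
    f_x = f(x)
    while f(x + 2.0 * alpha * d) <= f_x - gamma * (2.0 * alpha)**2:
        alpha *= 2.0
    return alpha
```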
Next we describe a parallelized version of DFN\(_{simple}\) called DEN\(_{check}\).
1.1 Algorithm DEN\(_{check}\)
Here is a summary of the algorithm.

At every iteration of DEN\(_{check}\), an orthonormal basis is formed starting from the given direction \(\hat{d}_k\). First, the behavior of the objective function along the directions \(d_k^1,\ldots ,d_k^n\) is investigated starting from the same point \(x_k\). This produces step sizes \(\alpha _k^i\ge 0\) and \(\tilde{\alpha }_k^i > 0\), \(i=1,\ldots ,n\). Provided that \(\alpha _k^i > 0\) for at least one index i, the index \(j_M\) is computed and \(\tilde{x}_k \leftarrow x_k + \alpha _k^{j_M}d_k^{j_M}\); that is, \(\tilde{x}_k\) is the trial point that yields the smallest (worst) improvement of the objective function.
Additional computation is carried out if \(\sum _{i=1}^n\alpha _k^i >0\). In particular, the step sizes obtained by the n linesearches are combined to define the convex combination point \(x_c\). Then \(f(x_c)\) is compared with \(f(\tilde{x}_k)\). If \(f(x_c)\) improves upon the latter value, then \(x_{k+1}\) is set equal to \(x_c\); otherwise, \(x_{k+1}\) is set equal to the previously computed \(\tilde{x}_k\). The reader may wonder about the choice of \(\tilde{x}_k\). We specify it here to get a theoretical scheme that can be readily converted to the more efficient DENPAR.
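A minimal sketch of one DEN\(_{check}\) iteration is given below. The QR-based construction of the orthonormal basis, the linesearch passed in as a callable, and the equal weights used for the convex combination point \(x_c\) are our illustrative assumptions; the tentative step sizes \(\tilde{\alpha }_k^i\) are omitted, and the n linesearches, carried out in parallel in the actual method, appear here as a plain loop with a comment.

```python
import numpy as np

def orthonormal_basis_from(d_hat):
    """Orthonormal basis whose first column is (up to sign) d_hat / ||d_hat|| (via QR)."""
    n = d_hat.size
    M = np.column_stack([d_hat] + [np.eye(n)[:, i] for i in range(n - 1)])
    Q, _ = np.linalg.qr(M)
    return [Q[:, i] for i in range(n)]

def den_check_iteration(f, x_k, d_hat, linesearch):
    """One DEN_check-style iteration (sketch; equal weights in x_c are an assumption)."""
    basis = orthonormal_basis_from(d_hat)
    n = x_k.size
    # In the actual method these n linesearches are evaluated concurrently.
    alphas = [linesearch(f, x_k, d) for d in basis]          # alpha_k^i >= 0
    if sum(alphas) > 0:
        trial_values = [f(x_k + a * d) if a > 0 else -np.inf
                        for a, d in zip(alphas, basis)]
        j_M = int(np.argmax(trial_values))                   # worst (smallest) improvement
        x_tilde = x_k + alphas[j_M] * basis[j_M]
        # equal-weight convex combination of the n steps (assumed weights)
        x_c = x_k + sum(a * d for a, d in zip(alphas, basis)) / n
        return x_c if f(x_c) <= f(x_tilde) else x_tilde
    return x_k                                               # no direction gave improvement
```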
In the following proposition, we show that Algorithm DEN\(_{check}\) inherits the convergence properties of the sequential code DFN\(_{simple}\) by showing that DEN\(_{check}\) is a particular case of the latter method.
Proposition B.3
Let \(\{x_k\}\) be the sequence produced by Algorithm DEN\(_{check}\). Let \(\bar{x}\) be any limit point of \(\{x_k\}\) and K be the subset of indices such that \(\lim _{k\rightarrow \infty ,\, k\in K} x_k = \bar{x}\).
If the subsequence \(\{d_k\}_K\) is dense in the unit sphere (see Definition B.2), then \(\bar{x}\) is Clarke stationary for problem (3) (see Definition B.1).
Proof
We prove the proposition by showing that DEN\(_{check}\) is an instance of DFN\(_{simple}\). To this end, let us consider the last step of Algorithm DFN\(_{simple}\), namely where the next iterate \(x_{k+1}\) is defined. As can be seen, in Algorithm DFN\(_{simple}\), \(x_{k+1}\) is required to satisfy the condition \(f(x_{k+1})\le f(\tilde{x}_k)\). Note that \(\tilde{x}_k\) of DFN\(_{simple}\) corresponds to the point \(\tilde{x}_k\) of DEN\(_{check}\). Indeed, if \(\sum _{i=1}^n\alpha _k^i > 0\), then \(\tilde{x}_k= x_k + \alpha _k^{j_M} d_k^{j_M}\) and \(d_k = d_k^{j_M}\). Otherwise, \(\tilde{x}_k = x_k\) and \(d_k = \hat{d}_k\).
Now, let us consider an iteration k of Algorithm DEN\(_{check}\). By the instructions of the algorithm, one of the following cases occurs.
(i) \(\sum _{i=1}^n\alpha _k^i > 0\) and \(f(x_c) \le f(\tilde{x}_k)\);
(ii) \(\sum _{i=1}^n\alpha _k^i > 0\) and \(f(x_c) > f(\tilde{x}_k)\);
(iii) \(\sum _{i=1}^n\alpha _k^i = 0\).
In case (i), \(x_{k+1}\leftarrow x_c\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = d_k^{j_M}\) and where \(x_{k+1}\) is chosen as point \(x_c\).
In case (ii), \(x_{k+1}\leftarrow \tilde{x}_k\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = d_k^{j_M}\) and where \(x_{k+1}\) is set equal to \(\tilde{x}_k\).
Finally, in case (iii), \(x_{k+1}\leftarrow x_k\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = \hat{d}_k\) and where sufficient improvement cannot be obtained along either \(d_k\) or \(-d_k\).
Hence, any iteration of DEN\(_{check}\) can be viewed as a particular iteration of DFN\(_{simple}\). This establishes the proposition. \(\square \)
DENPAR is derived from DEN\(_{check}\) by replacing lines 20–23 with \(x_{k+1}\leftarrow x_c\). Effectively, that replacement assumes that the inequality \(f(x_c) \le f(\tilde{x}_k)\) of line 20 is satisfied, which is indeed the case since f(x) is assumed to be quasi-convex.
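In terms of the DEN\(_{check}\) sketch given earlier, the modification amounts to dropping the comparison of \(f(x_c)\) with \(f(\tilde{x}_k)\). The following variant, which reuses orthonormal_basis_from from that sketch and keeps its other assumptions, illustrates this.

```python
def denpar_iteration(f, x_k, d_hat, linesearch):
    """DENPAR-style variant of the DEN_check sketch: accept x_c unconditionally.

    Skipping the check is justified under the quasi-convexity assumption,
    which guarantees f(x_c) <= f(x_tilde).
    """
    basis = orthonormal_basis_from(d_hat)           # from the DEN_check sketch
    alphas = [linesearch(f, x_k, d) for d in basis]
    if sum(alphas) > 0:
        return x_k + sum(a * d for a, d in zip(alphas, basis)) / x_k.size
    return x_k
```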
Keywords
- Derivative-free optimization
- Nonsmooth optimization
- Parallel algorithms
Mathematics Subject Classification
- 90C30
- 90C56
- 49J52
- 68W10