
Machine-Learning Techniques for the Optimal Design of Acoustic Metamaterials


Abstract

Recently, an increasing research effort has been dedicated to analyzing the transmission and dispersion properties of periodic acoustic metamaterials, characterized by the presence of local resonators. Within this context, particular attention has been paid to the optimization of the amplitudes and center frequencies of selected stop and pass bands inside the Floquet–Bloch spectra of acoustic metamaterials featuring a chiral or antichiral microstructure. Novel functional applications of such research are expected in the optimal parametric design of smart tunable mechanical filters and directional waveguides. The present paper deals with the maximization of the amplitude of low-frequency band gaps, by proposing suitable numerical techniques to solve the associated optimization problems. Specifically, the feasibility and effectiveness of Radial Basis Function networks and Quasi-Monte Carlo methods for the interpolation of the objective functions of such optimization problems are discussed, and their numerical application to a specific acoustic metamaterial with tetrachiral microstructure is presented. The discussion is motivated theoretically by the high computational effort often needed for an exact evaluation of the objective functions arising in band gap optimization problems, when iterative algorithms are used for their approximate solution. By replacing such functions with suitable surrogate objective functions constructed by applying machine-learning techniques, well-performing suboptimal solutions can be obtained with a smaller computational effort. Numerical results demonstrate the effective potential of the proposed approach. Current directions of research involving the use of additional machine-learning techniques are also presented.


References

1. Fleck, N.A., Deshpande, V.S., Ashby, M.F.: Micro-architectured materials: past, present and future. Proc. R. Soc. A Math. Phys. Eng. Sci. 466(2121), 2495–2516 (2010)
2. Schaedler, T.A., Carter, W.B.: Architected cellular materials. Annu. Rev. Mater. Res. 46, 187–210 (2016)
3. Meza, L.R., Zelhofer, A.J., Clarke, N., Mateos, A.J., Kochmann, D.M., Greer, J.R.: Resilient 3D hierarchical architected metamaterials. Proc. Natl. Acad. Sci. 112(37), 11502–11507 (2015)
4. Liu, Z., Zhang, X., Mao, Y., Zhu, Y.Y., Yang, Z., Chan, C.T., Sheng, P.: Locally resonant sonic materials. Science 289(5485), 1734–1736 (2000)
5. Lu, M.H., Feng, L., Chen, Y.F.: Phononic crystals and acoustic metamaterials. Mater. Today 12(12), 34–42 (2009)
6. Ma, G., Sheng, P.: Acoustic metamaterials: from local resonances to broad horizons. Sci. Adv. 2(2), e1501595 (2016)
7. Phani, A.S., Hussein, M.I. (eds.): Dynamics of Lattice Materials. Wiley, New York (2017)
8. Zhang, S., Yin, L., Fang, N.: Focusing ultrasound with an acoustic metamaterial network. Phys. Rev. Lett. 102(19), 194301 (2009)
9. Craster, R.V., Guenneau, S. (eds.): Acoustic Metamaterials: Negative Refraction, Imaging, Lensing and Cloaking, vol. 166. Springer, Berlin (2012)
10. Bigoni, D., Guenneau, S., Movchan, A.B., Brun, M.: Elastic metamaterials with inertial locally resonant structures: application to lensing and localization. Phys. Rev. B 87, 174303 (2013)
11. Molerón, M., Daraio, C.: Acoustic metamaterial for subwavelength edge detection. Nat. Commun. 6, 8037 (2015)
12. Bacigalupo, A., Gambarotta, L.: Simplified modelling of chiral lattice materials with local resonators. Int. J. Solids Struct. 83, 126–141 (2016)
13. Diaz, A.R., Haddow, A.G., Ma, L.: Design of band-gap grid structures. Struct. Multidiscip. Optim. 29(6), 418–431 (2005)
14. Meng, H., Wen, J., Zhao, H., Wen, X.: Optimization of locally resonant acoustic metamaterials on underwater sound absorption characteristics. J. Sound Vib. 331(20), 4406–4416 (2012)
15. Bacigalupo, A., Lepidi, M., Gnecco, G., Gambarotta, L.: Optimal design of auxetic hexachiral metamaterials with local resonators. Smart Mater. Struct. 25(5), 054009 (2016)
16. Bacigalupo, A., Gnecco, G., Lepidi, M., Gambarotta, L.: Optimal design of low-frequency band gaps in anti-tetrachiral lattice meta-materials. Compos. B Eng. 115(5), 341–359 (2017)
17. Bacigalupo, A., Lepidi, M., Gnecco, G., Vadalà, F., Gambarotta, L.: Optimal design of the band structure for beam lattice metamaterials. Front. Mater. 6, 1–14 (2019)
18. Bruggi, M., Corigliano, A.: Optimal 2D auxetic micro-structures with band gap. Meccanica 54(13), 2001–2027 (2019)
19. Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms. Wiley, New York (2006)
20. Koziel, S., Leifsson, L. (eds.): Surrogate-Based Modeling and Optimization: Applications in Engineering. Springer, Berlin (2013)
21. Wild, S.M., Shoemaker, C.: Global convergence of radial basis function trust-region algorithms for derivative-free optimization. SIAM Rev. 55, 349–371 (2013)
22. Bacigalupo, A., Gnecco, G., Lepidi, M., Gambarotta, L.: Design of acoustic metamaterials through nonlinear programming. In: 2nd International Workshop on Optimization, Machine Learning and Big Data (MOD 2016). Lecture Notes in Computer Science, vol. 10122, pp. 170–181 (2016)
23. Bacigalupo, A., Gnecco, G.: Metamaterial filter design via surrogate optimization. In: International Conference on Metamaterials and Nanophotonics (METANANO 2018). Journal of Physics: Conference Series, vol. 1092, pp. 1–4 (2018)
24. Krushynska, A.O., Kouznetsova, V.G., Geers, M.G.D.: Towards optimal design of locally resonant acoustic metamaterials. J. Mech. Phys. Solids 71, 179–196 (2014)
25. Lepidi, M., Bacigalupo, A.: Multi-parametric sensitivity analysis of the band structure for tetrachiral acoustic metamaterials. Int. J. Solids Struct. 136, 186–202 (2018)
26. D’Alessandro, L., Zega, V., Ardito, R., Corigliano, A.: 3D auxetic single material periodic structure with ultra-wide tunable bandgap. Sci. Rep. 8(1), 2262 (2018)
27. Schmidt, J., Marques, M.R.G., Botti, S., Marques, M.A.L.: Recent advances and applications of machine learning in solid-state materials science. Nat. Comput. Mater. 5, 1–36 (2019)
28. Fasshauer, G.E.: Meshfree Approximation Methods with MATLAB. World Scientific, Singapore (2007)
29. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. SIAM, Philadelphia (1992)
30. Rippa, S.: An algorithm for selecting a good value for the parameter c in radial basis function interpolation. Adv. Comput. Math. 11, 193–210 (1999)
31. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-Based Methods. Cambridge University Press, Cambridge (2000)
32. Lin, C., Lee, Y.H., Schuh, J.K., Ewoldt, R.H., Allison, J.T.: Efficient optimal surface texture design using linearization. In: Advances in Structural and Multidisciplinary Optimization, pp. 632–647. Springer, New York (2017)
33. Svanberg, K.: The method of moving asymptotes—a new method for structural optimization. Int. J. Numer. Meth. Eng. 24, 359–373 (1987)
34. Svanberg, K.: A class of globally convergent optimization methods based on conservative convex separable approximations. SIAM J. Optim. 12, 555–573 (2002)
35. Wei, P., Ma, H., Wang, M.Y.: The stiffness spreading method for layout optimization of truss structures. Struct. Multidiscip. Optim. 49, 667–682 (2014)
36. Sobol, I.M.: The distribution of points in a cube and the approximate evaluation of integrals. USSR Comput. Math. Math. Phys. 7, 86–112 (1967)
37. Mitchell, M.: An Introduction to Genetic Algorithms. MIT Press, Cambridge (1996)
38. Müller, J., Shoemaker, C.A.: Influence of ensemble surrogate models and sampling strategy on the solution quality of algorithms for computationally expensive black-box global optimization problems. J. Global Optim. 60, 123–144 (2014)
39. Gnecco, G., Bemporad, A., Gori, M., Sanguineti, M.: LQG online learning. Neural Comput. 29, 2203–2291 (2017)
40. Zoppoli, R., Sanguineti, M., Gnecco, G., Parisini, T.: Neural Approximations for Optimal Control and Decision. Springer (2020) (forthcoming)
41. Gaggero, M., Gnecco, G., Sanguineti, M.: Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results. J. Optim. Theory Appl. 156, 380–416 (2013)
42. Girosi, F.: Approximation error bounds that use VC-bounds. In: Proceedings of the International Conference on Artificial Neural Networks, pp. 295–302 (1995)
43. Gnecco, G., Sanguineti, M.: Suboptimal solutions to dynamic optimization problems via approximations of the policy function. J. Optim. Theory Appl. 146, 764–794 (2010)
44. Barron, A.R.: Neural net approximation. In: Proceedings of the 7th Yale Workshop on Adaptive and Learning Systems, pp. 69–72 (1992)
45. Giulini, S., Sanguineti, M.: Approximation schemes for functional optimization problems. J. Optim. Theory Appl. 140, 33–54 (2009)
46. Kůrková, V., Sanguineti, M.: Error estimates for approximate optimization by the extended Ritz method. SIAM J. Optim. 15, 461–487 (2005)
47. Zoppoli, R., Sanguineti, M., Parisini, T.: Approximating networks and extended Ritz method for the solution of functional optimization problems. J. Optim. Theory Appl. 112, 403–439 (2002)
48. Schweidtmann, A.M., Mitsos, A.: Deterministic global optimization with artificial neural networks embedded. J. Optim. Theory Appl. 180, 925–948 (2019)
49. Gnecco, G.: An algorithm for curve identification in the presence of curve intersections. Math. Probl. Eng. 2018, 1–7 (2018)
50. Nakatsukasa, Y.: Absolute and relative Weyl theorems for generalized eigenvalue problems. Linear Algebra Appl. 432, 242–248 (2010)
51. Gnecco, G., Nutarelli, F.: On the trade-off between number of examples and precision of supervision in machine learning problems. Optim. Lett. (2019). https://doi.org/10.1007/s11590-019-01486-x


Acknowledgments

The authors acknowledge financial support of the Italian Department for University and Scientific and Technological Research (MURST) in the framework of the MIUR Prin15 project 2015LYYXA8, “Multi-scale mechanical models for the design and optimization of micro-structured smart materials and metamaterials.” The authors also acknowledge financial support by the National Group of Mathematical Physics (GNFM–INdAM).

Author information

Corresponding author

Correspondence to Giorgio Gnecco.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


Appendices

Some specific mathematical concepts and the related technical notation employed in the paper are reported here in further detail. In particular, “Appendix A” describes mesh-free interpolation methods, “Appendix B” provides details on surrogate optimization, and “Appendix C” is concerned with iterative optimization algorithms.

1.1 A: Mesh-Free Interpolation Methods

A Radial Basis Function (RBF) network interpolation method [28] based on \( N_{\text{c}} \) computational units is characterized by an interpolant \( \widehat{f}:\varOmega \subset {\mathbb{R}}^{p} \to {\mathbb{R}} \) of the form

$$ \widehat{f}({\mathbf{x}}) = \sum\limits_{i = 1}^{{N_{\text{c}} }} {c_{i} \varphi_{i} \left( {{\mathbf{x}} - {\mathbf{x}}_{i} } \right)} , $$
(3)

where, for \( i = 1, \ldots ,N_{c} \), the \( c_{i} \in {\mathbb{R}} \) are the coefficients of the linear combination, the \( \varphi_{i} :{\mathbb{R}}^p \to {\mathbb{R}} \) are radial basis functions (i.e., functions whose values depend only on the distances of their arguments from the origin), and the \( {\mathbf{x}}_{i} \in {\mathbb{R}}^{p} \) are called centers. RBF network interpolation methods are examples of mesh-free methods that, differently from mesh-based ones, do not require the presence of connections between their nodes, but are rather based on the interaction between each node and all its neighbors. As a particular case, the Gaussian RBF network interpolation method exploits an interpolant of the form

$$ \widehat{f}({\mathbf{x}}) = \sum\limits_{i = 1}^{{N_{\text{c}} }} {c_{i} \exp \left( { - \frac{{\left\| {{\mathbf{x}} - {\mathbf{x}}_{i} } \right\|_{2}^{2} }}{{2\sigma_{i}^{2} }}} \right)} , $$
(4)

where \( \left\| {{\mathbf{x}} - {\mathbf{x}}_{i} } \right\|_{2} \) denotes the Euclidean norm of the vector \( {\mathbf{x}} - {\mathbf{x}}_{i} \), and the coefficients \( \sigma_{i} > 0 \) are called widths. In the paper, a Gaussian RBF network interpolation method with fixed centers and the same widths \( \sigma_{i} = \sigma \) is used, where the coefficients \( c_{i} \) are chosen in such a way as to make, for a function \( f:\varOmega \subset {\mathbb{R}}^{p} \to {\mathbb{R}} \) to be interpolated, the error associated with its interpolation by \( \widehat{f} \) equal to zero on a finite set of input training points in \( {\mathbb{R}}^{p} \). In the following, such a set is taken to be coincident with the set of \( N_{\text{c}} \) centers. Consequently, the coefficients \( c_{i} \) are obtained by solving the linear system

$$ {\mathbf{Ac}} = {\mathbf{f}}, $$
(5)

where \( {\mathbf{A}} \in {\mathbb{R}}^{{N_{\text{c}} \times N_{\text{c}} }} \) is called the interpolation matrix, whose generic element, in the case of Gaussian RBF networks, has the form \( A_{ij} = \exp \left( { - \frac{{\left\| {{\mathbf{x}}_{i} - {\mathbf{x}}_{j} } \right\|_{2}^{2} }}{{2\sigma^{2} }}} \right) \), whereas \( {\mathbf{f}} \in {\mathbb{R}}^{{N_{\text{c}} }} \) is a column vector collecting the values assumed by the function \( f \) on the input training points, and \( {\mathbf{c}} \in {\mathbb{R}}^{{N_{\text{c}} }} \) is the column vector collecting the unknown coefficients of the linear combination.
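To fix ideas, the construction of Eqs. (4)–(6) can be sketched in a few lines of Python (a minimal illustration based on NumPy; the test function, the number of centers, and the width value are assumptions made here for concreteness, not taken from the paper):

```python
import numpy as np

def gaussian_interpolation_matrix(centers, sigma):
    """Interpolation matrix of Eq. (5): A_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq_dists = np.sum((centers[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def fit_gaussian_rbf(centers, f_values, sigma):
    """Solve the linear system A c = f for the coefficients of Eq. (4)."""
    return np.linalg.solve(gaussian_interpolation_matrix(centers, sigma), f_values)

def evaluate_gaussian_rbf(x, centers, coeffs, sigma):
    """Evaluate the interpolant of Eq. (4) at a batch of query points x."""
    sq_dists = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2)) @ coeffs

# Illustrative usage with an assumed test function on [0, 1]^2
rng = np.random.default_rng(0)
centers = rng.random((50, 2))                                   # N_c = 50 centers
f_true = lambda X: np.sin(3 * X[:, 0]) * np.cos(2 * X[:, 1])
coeffs = fit_gaussian_rbf(centers, f_true(centers), sigma=0.2)
x_test = rng.random((5, 2))
print(evaluate_gaussian_rbf(x_test, centers, coeffs, sigma=0.2))
print(f_true(x_test))
```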

Typically, radial basis functions used in RBF network interpolation methods are chosen to be strictly positive-definite. It is recalled here that a function \( g:{\mathbb{R}}^p \to {\mathbb{R}} \) is called strictly positive-definite if, for every positive integer \( n \) and every collection of \( n \) distinct elements \( {\mathbf{x}}_{1} , \ldots ,{\mathbf{x}}_{n} \in {\mathbb{R}}^p \), the symmetric matrix \( {\mathbf{G}} \in {\mathbb{R}}^{n \times n} \) having elements of the form \( {G}_{i \, j} = g(\mathbf{x}_{i} - \mathbf{x}_{j} ) \) is positive definite. Since the interpolation matrix \( {\mathbf{A}} \) belongs to this matrix class, strict positive-definiteness of a fixed (i.e., independent from the index \( i \)) radial basis function guarantees the non-singularity of the associated interpolation matrix; hence, existence and uniqueness of the associated interpolant, for every choice of the training set, provided the input training points are distinct (see also [28], Chapter 3). In this way, indeed, the solution of Eq. (5) is

$$ {\mathbf{c}} = {\mathbf{A}}^{ - 1} {\mathbf{f}}\text{.} $$
(6)

When the training points do not coincide with the centers, Eq. (5) may not be solvable and an approximate (typically non-interpolating) solution must be sought: for instance, the solution \( {\mathbf{c}}^{ + } = {\mathbf{A}}^{ + } {\mathbf{f}} \) (where \( {\mathbf{A}}^{ + } \) is the Moore–Penrose pseudo-inverse of \( {\mathbf{A}} \)), or the Tikhonov-regularized solution \( {\mathbf{c}}^{\lambda } = \left( {{\mathbf{A}}^{T} {\mathbf{A}} + \lambda {\mathbf{I}}} \right)^{ - 1} {\mathbf{A}}^{T} {\mathbf{f}} \), where \( \lambda > 0 \) is a suitable regularization parameter. However, for the kind of band gap optimization problems considered in this paper, interpolation is preferred, since there is no noise in the training set.
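For completeness, the two non-interpolating alternatives just mentioned can be sketched as follows (again a hypothetical NumPy illustration; the value of \( \lambda \), the training values, and the sizes of the training and center sets are arbitrary assumptions):

```python
import numpy as np

# Rectangular collocation matrix: entries exp(-||t_i - x_j||^2 / (2 sigma^2)),
# with training points t_i different from the centers x_j (illustrative sizes).
rng = np.random.default_rng(1)
train, centers, sigma = rng.random((80, 2)), rng.random((50, 2)), 0.2
A = np.exp(-np.sum((train[:, None, :] - centers[None, :, :]) ** 2, axis=-1) / (2 * sigma ** 2))
f = np.sin(3 * train[:, 0]) * np.cos(2 * train[:, 1])            # assumed training values

c_pinv = np.linalg.pinv(A) @ f                                   # Moore-Penrose solution
lam = 1e-6                                                       # assumed regularization parameter
c_tikh = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ f)  # Tikhonov-regularized solution
```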

Despite the uniqueness of the interpolant for every choice of the width parameter \( \sigma \), different choices provide different interpolants, with varying quality of the approximation outside the training set. A possible way to select a suitable value for \( \sigma \), aimed at preventing overfitting, is leave-one-out cross-validation [28], which in the context of RBF network interpolation is also called Rippa's method. According to this method, for each choice of \( \sigma \), \( N_{\text{c}} \) different RBF network interpolants are constructed starting from \( N_{\text{c}} \) different—but highly overlapping—training sets of size \( N_{\text{c}} - 1 \), each obtained by removing a different training point from the full training set of size \( N_{\text{c}} \). Then, for the \( k \)-th such interpolant (\( k = 1, \ldots ,N_{\text{c}} \)), the approximation error \( \varepsilon_{k} \) is evaluated on the unique point that has been excluded from its training set. All these errors are collected in a column vector \( {\varvec{\upvarepsilon}} \in {\mathbb{R}}^{{N_{\text{c}} }} \), whose maximum norm is minimized with respect to \( \sigma \) in order to choose the width optimally. Interestingly, the evaluation of each \( \varepsilon_{k} \) does not require the actual training of the \( k \)-th interpolant, since it is possible to prove [30] that such an error has the expression

$$ \varepsilon_{k} = \frac{{c_{k} }}{{\left( {A^{ - 1} } \right)_{{k} \,{k}} }}, $$
(7)

where \( c_{k} \) is the \( k \)-th element of the column vector \( {\mathbf{c}} \) of coefficients of the linear combination defining the interpolant trained on the whole training set, and \( {\mathbf{A}} \) is the associated interpolation matrix (still dependent on the whole training set). Finally, the output of the leave-one-out cross-validation consists of:

  1. (a)

    The optimal choice of \( \sigma \) and

  2. (b)

    The RBF network interpolant trained on the whole training set, for that value of \( \sigma \).

If \( N_{\text{c}} \) is large, the interpolant is expected to be very similar to those trained—for the same choice of \( \sigma \)—on the \( N_{\text{c}} \) smaller training sets of size \( N_{\text{c}} - 1 \), due to their high overlap.
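A minimal sketch of this selection procedure is reported below (NumPy only; the test function and the grid of candidate widths are illustrative assumptions). Note that, for each candidate width, all the leave-one-out errors are obtained from a single inversion of the full interpolation matrix, without retraining:

```python
import numpy as np

def rippa_loo_errors(centers, f_values, sigma):
    """Leave-one-out errors eps_k = c_k / (A^{-1})_kk of Eq. (7), from a single inversion."""
    sq = np.sum((centers[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    A_inv = np.linalg.inv(np.exp(-sq / (2.0 * sigma ** 2)))
    c = A_inv @ f_values
    return c / np.diag(A_inv)

def select_width(centers, f_values, candidate_sigmas):
    """Pick the width minimizing the maximum norm of the leave-one-out error vector."""
    scores = [np.max(np.abs(rippa_loo_errors(centers, f_values, s))) for s in candidate_sigmas]
    return candidate_sigmas[int(np.argmin(scores))]

# Illustrative usage with an assumed test function and an assumed grid of candidate widths
rng = np.random.default_rng(2)
centers = rng.random((60, 2))
f_vals = np.sin(3 * centers[:, 0]) * np.cos(2 * centers[:, 1])
print(select_width(centers, f_vals, candidate_sigmas=np.linspace(0.05, 0.5, 10)))
```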

1.2 B: Surrogate Optimization

Surrogate optimization replaces the objective function of an optimization problem with a suitable interpolant (or more generally, with a suitable approximating function), which is called surrogate objective function [20]. Once a suitable optimization algorithm is chosen to solve the resulting surrogate optimization problem, the performance of surrogate optimization can be measured in terms of:

  1. (a)

    The quality of the suboptimal solution it provides, expressed in terms of the value assumed by the surrogate objective function at the suboptimal solution, and by its comparison with the value assumed by the true objective function;

  2. (b)

    The total computational time needed by the surrogate optimization algorithm to find the suboptimal solution;

  3. (c)

    Its number of evaluations of the true objective function.

In the paper, a fixed surrogate objective function is considered, i.e., the interpolant of the true objective function is constructed in the initialization phase of the surrogate optimization algorithm and remains unchanged. More advanced surrogate optimization algorithms update their interpolants during their iterations, by exploiting additional information coming from the evaluation of the true objective function on points selected adaptively by the algorithm itself [21].
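The fixed-surrogate workflow just described can be sketched as follows (a hypothetical toy example rather than the band gap problem of the paper: the "expensive" objective, the size of the Sobol' training set, the width, and the choice of L-BFGS-B as local optimizer are all assumptions made for illustration):

```python
import numpy as np
from scipy.stats import qmc
from scipy.optimize import minimize

def expensive_objective(x):        # stand-in for the costly true objective (assumed)
    return float(np.sin(3 * x[0]) * np.cos(2 * x[1]) - 0.5 * np.sum((x - 0.3) ** 2))

# 1) Initialization: evaluate the true objective on a Sobol' training set (2^6 points)
X = qmc.Sobol(d=2, scramble=False).random_base2(m=6)
f_vals = np.array([expensive_objective(x) for x in X])

# 2) Fixed surrogate: a Gaussian RBF interpolant, built once and never updated
sigma = 0.2                        # assumed width (it could be chosen by Rippa's method)
A = np.exp(-np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1) / (2 * sigma ** 2))
c = np.linalg.solve(A, f_vals)
surrogate = lambda x: np.exp(-np.sum((x - X) ** 2, axis=-1) / (2 * sigma ** 2)) @ c

# 3) Optimize the cheap surrogate with a standard local method
res = minimize(lambda x: -surrogate(x), x0=np.array([0.5, 0.5]),
               bounds=[(0.0, 1.0), (0.0, 1.0)], method="L-BFGS-B")

# 4) A single further evaluation of the true objective at the suboptimal solution
print("surrogate value:", surrogate(res.x), "true value:", expensive_objective(res.x))
```

In this sketch, the true objective is evaluated only on the training set and once more at the returned suboptimal solution, which is the cost profile targeted by the approach described above.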

Constructing an interpolant requires the choice of a suitable input training set, on which the function to be interpolated is evaluated. In this way, suitable coefficients of the linear combination defining the interpolant are obtained (possibly together with additional coefficients, like the width parameter in Gaussian RBF network interpolation). Two possible choices for the training set are provided by:

  1. (a)

Quasi-Monte Carlo discretization, according to which the input training set is generated as a subsequence of a Quasi-Monte Carlo (deterministic) sequence, and

  2. (b)

    Monte Carlo discretization, which exploits, instead, a realization of a Monte Carlo (random) sequence.

Both kinds of sequences are used extensively in numerical integration [29]. In that context, Quasi-Monte Carlo sequences are often preferred to realizations of Monte Carlo ones, because the former are typically able to cover the domain \( \varOmega \) in a more uniform way than the latter (see, e.g., [17] for an illustrative comparison). In other words, points generated from Monte Carlo sequences tend to form clusters. This issue arises because the Monte Carlo algorithm, once a specific point of the sequence has been generated, keeps no memory of it when generating the next point (such points being realizations of independent vector-valued random variables). An additional feature of Quasi-Monte Carlo sequences is that, since they are deterministic, they generate perfectly reproducible point sets. However, this can also be achieved by Monte Carlo discretization, if pseudo-random (deterministic) sequences, whose statistical properties are similar to the ones of Monte Carlo sequences, are exploited.
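The two discretizations can be compared with a few lines of code (a sketch assuming the quasi-Monte Carlo module of SciPy; the sample size and the crude nearest-neighbour clustering indicator are illustrative choices):

```python
import numpy as np
from scipy.stats import qmc

n, p = 256, 2
sobol_pts = qmc.Sobol(d=p, scramble=False).random_base2(m=8)   # 256 Quasi-Monte Carlo (Sobol') points
mc_pts = np.random.default_rng(0).random((n, p))               # 256 (pseudo-)Monte Carlo points

def nearest_neighbour_distances(P):
    """Distance from each point to its nearest neighbour (a crude clustering indicator)."""
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    return D.min(axis=1)

print("Sobol': min / mean NN distance:",
      nearest_neighbour_distances(sobol_pts).min(), nearest_neighbour_distances(sobol_pts).mean())
print("Monte Carlo: min / mean NN distance:",
      nearest_neighbour_distances(mc_pts).min(), nearest_neighbour_distances(mc_pts).mean())
```

Small nearest-neighbour distances for the Monte Carlo sample signal the clustering mentioned above.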

The use of Quasi-Monte Carlo sequences (and of the point sets associated with their subsequences) also in function optimization and function interpolation can be justified using the concept of dispersion. Let \( X = \{ {\mathbf{x}}_{1} , \ldots ,{\mathbf{x}}_{{N_{\text{c}} }} \} \subset \varOmega \) be a finite subset of the domain. Then, its dispersion (or fill distance) is defined as

$$ h_{X,\varOmega } = \sup_{{{\mathbf{x}} \in \varOmega }} \min_{{{\mathbf{x}}_{j} \in X}} \left\| {{\mathbf{x}} - {\mathbf{x}}_{j} } \right\|_{2} . $$
(8)

Then, when \( f:\varOmega \subset {\mathbb{R}}^{p} \to {\mathbb{R}} \) is Lipschitz continuous with Lipschitz constant \( L_{f,\varOmega } > 0 \), and its domain \( \varOmega \) is replaced by \( X \), the following upper bound holds on the error of approximate global optimization of \( f \):

$$ \left| {\max_{{{\mathbf{x}} \in X}} f({\mathbf{x}}) - \max_{{{\mathbf{x}} \in \varOmega }} f({\mathbf{x}})} \right| \le L_{f,\varOmega } h_{X,\varOmega } $$
(9)

The bound above holds for the specific case of the optimization problem (2), since its objective function \( \Delta \omega_{{\partial B_{1} }} \left( {\varvec{\upmu}} \right) \) is Lipschitz continuous (although its Lipschitz constant may be difficult to compute). The proof of this result (only sketched here) follows from an application of the perturbation bound for generalized eigenvalue problems provided by Theorem 2.2 in [50], combined with the specific expression of \( \Delta \omega_{{\partial B_{1} }} \left( {\varvec{\upmu}} \right) \) (which can be inferred from Eqs. (4), (6), (7) and the Appendix in [22]), and involves a generalized eigenvalue problem parametrized by both \( {\varvec{\upmu}} \) and the wavevector \( {\mathbf{k}} \).
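Since the dispersion of Eq. (8) involves a supremum over \( \varOmega \), for small \( p \) it can be estimated by replacing \( \varOmega \) with a dense evaluation grid, as in the following hedged sketch (here \( \varOmega = [0,1]^2 \); the grid resolution and the point sets are assumptions, and the estimate is a lower bound of the true supremum):

```python
import numpy as np
from scipy.stats import qmc

def dispersion_estimate(X, n_grid=100):
    """Estimate the dispersion h_{X,Omega} of Eq. (8) on Omega = [0, 1]^2 by
    replacing the supremum over Omega with a maximum over a dense grid
    (the result is therefore a lower bound of the true dispersion)."""
    g = np.linspace(0.0, 1.0, n_grid)
    gx, gy = np.meshgrid(g, g)
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    dists = np.linalg.norm(grid[:, None, :] - X[None, :, :], axis=-1)
    return dists.min(axis=1).max()

sobol_pts = qmc.Sobol(d=2, scramble=False).random_base2(m=7)    # 128 Sobol' points
random_pts = np.random.default_rng(0).random((128, 2))          # 128 pseudo-random points
print("estimated dispersion, Sobol':", dispersion_estimate(sobol_pts))
print("estimated dispersion, Monte Carlo:", dispersion_estimate(random_pts))
```

A lower dispersion for the Sobol' set translates, through bound (9), into a smaller worst-case error when the domain is replaced by the point set.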

Moreover, according to Theorem 14.5 in [28], a large class of RBF network interpolants \( \widehat{f}_{X} :\varOmega \subset {\mathbb{R}}^{p} \to {\mathbb{R}} \) with centers and input training points coincident with the elements of \( X \) is characterized by the following upper bound on the error (in the sup norm) in the approximation of a function \( f:\varOmega \subset {\mathbb{R}}^{p} \to {\mathbb{R}} \) satisfying a suitable smoothness condition

$$ \sup_{{\mathbf{x}} \in \varOmega } \left| {f({\mathbf{x}}) - \widehat{f}_{X} ({\mathbf{x}})} \right| \le C\,h_{X,\varOmega }^{k} \left\| f \right\|_{H} , $$
(10)

where \( C \) is a positive constant (which does not depend on \( f \) and on the choice of the fixed radial basis function), \( k \) is a lower bound on the degree of smoothness of the computational units, and \( \left\| f \right\|_{H} \) is the norm of \( f \) in a suitable function space \( H \), which is associated with the specific choice of the fixed radial basis function. Of course, to reduce the value of the upper bound above on the approximation error, one possibility is to choose a low-dispersion sequence.

It has to be observed that the bound (10) requires a smoothness assumption on the function \( f \), namely its belonging to the space \( H \). When this does not hold, the domain \( \varOmega \) can be replaced with a subdomain on which \( f \) is smooth, or \( f \) can be replaced with a smooth approximation (possibly coincident with \( f \) on \( X \)). The bound (10) is then applied to the \( f \)-approximation. In this case, the presence of the error associated with the additional approximation step must be taken into account.

Interestingly, when the domain \( \varOmega \) is the \( p \)-dimensional hypercube \( \left[ {0,1} \right]^{p} \), the dispersion \( h_{{X,\left[ {0,1} \right]^{p} }} \) can be bounded from above in terms of another property of the set \( X \) (which can be also related to the associated sequence \( {\mathbf{x}}_{1} , \ldots ,{\mathbf{x}}_{{N_{\text{c}} }} \)), called discrepancy and defined as

$$ D_{X} = \sup_{G \in \Im } \left| {\frac{S(G)}{{N_{\text{c}} }} - \prod\nolimits_{i = 1}^{p} {\left( {b_{i} - a_{i} } \right)} } \right| $$
(11)

where \( \Im \) is the family of all subsets of \( \left[ {0,1} \right]^{p} \) having the form \( G = \prod\nolimits_{i = 1}^{p} {\left[ {a_{i} ,b_{i} } \right)} \), and \( S(G) \) is the cardinality of the intersection \( X \cap G \). Then, the upper bound on the dispersion (see, for instance, Theorem 6.6 in [29]) is

$$ h_{{X,\left[ {0,1} \right]^{p} }} \le p^{{\frac{1}{2}}} \left( {D_{X} } \right)^{{\frac{1}{p}}} , $$
(12)

hence, it goes to zero when the discrepancy vanishes. Since Quasi-Monte Carlo sequences are characterized by low discrepancy, their dispersions are also low. This justifies the use of Quasi-Monte Carlo sequences (e.g., the Sobol’ sequence) also in function interpolation.
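For illustration, the discrepancy of Eq. (11) can be approximated by brute force for small \( p \), restricting the boxes to corners on a coarse grid (an assumption made here to keep the computation cheap; the result is a lower bound of the supremum), and the right-hand side of Eq. (12) can then be compared with the dispersion estimated in the previous sketch:

```python
import numpy as np
from scipy.stats import qmc

def extreme_discrepancy_estimate(X, n_grid=11):
    """Brute-force lower bound of the discrepancy D_X of Eq. (11) for p = 2,
    restricting the boxes G = [a1, b1) x [a2, b2) to corners on a coarse grid."""
    edges = np.linspace(0.0, 1.0, n_grid)
    n = len(X)
    worst = 0.0
    for a1 in edges:
        for b1 in edges[edges > a1]:
            for a2 in edges:
                for b2 in edges[edges > a2]:
                    inside = ((X[:, 0] >= a1) & (X[:, 0] < b1) &
                              (X[:, 1] >= a2) & (X[:, 1] < b2))
                    worst = max(worst, abs(inside.sum() / n - (b1 - a1) * (b2 - a2)))
    return worst

X = qmc.Sobol(d=2, scramble=False).random_base2(m=7)   # 128 Sobol' points in [0, 1]^2
p = 2
D_X = extreme_discrepancy_estimate(X)
# Right-hand side of Eq. (12), to be compared with the dispersion estimate
# obtained in the previous sketch for the same point set.
print("discrepancy estimate:", D_X, "bound of Eq. (12):", p ** 0.5 * D_X ** (1.0 / p))
```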

To conclude, it is worth remarking that upper bounds on the approximation error of the function \( f \) through its interpolant \( \widehat{f}_{X} \), similar to the one expressed in the sup norm in Eq. (10), also allow one to bound from above the error in the approximation of the maximum value of \( f \) in terms of the maximum value of \( \widehat{f}_{X} \). Indeed, assuming that the respective maxima exist, Eq. (10) implies

$$ \left| {\max_{{\mathbf{x}} \in \varOmega } f({\mathbf{x}}) - \max_{{\mathbf{x}} \in \varOmega } \widehat{f}_{X} ({\mathbf{x}})} \right| \le C\,h_{X,\varOmega }^{k} \left\| f \right\|_{H} . $$
(13)

Similar guarantees cannot be obtained if one starts, instead, from upper bounds on the approximation error expressed in Lebesgue-space norms like the \( L_{2} \) norm. Analogous remarks hold for other optimization problems to which Gaussian RBF networks have been applied [41, 43].

1.3 C: Iterative Optimization Algorithms

Both the original and surrogate band gap optimization problems are characterized by the presence of continuous variables to be optimized, a nonlinear objective function, and linear and/or nonlinear constraints. To solve them approximately, suitable iterative optimization algorithms can be applied. In general, for the same number of iterations, each such algorithm is expected to produce better suboptimal solutions when it is applied to the original optimization problem rather than to the surrogate one. This is motivated by the absence of the additional approximation error associated with the use of the surrogate objective function. Nevertheless, the iterations of the algorithm are faster when it is applied to a surrogate optimization problem with a fixed surrogate objective function, since such a function—once it has been constructed—is cheaper to evaluate. This advantage holds also for its partial derivatives, if they are used during the iterations of the algorithm. For instance, the partial derivatives of a Gaussian RBF network interpolant can be computed in closed form, once the values of the parameters of the interpolant have been chosen. So, for the same total computational time, even better suboptimal solutions may be produced starting from such a surrogate objective function. This possibility is particularly important when there are upper bounds on the available total computational time. It is worth mentioning that these arguments have been proved in [51] for the related problem of regression, whose investigation can be considered as a preliminary step for the analysis of surrogate optimization (indeed, it involves only the construction of the approximating function, but not its successive optimization). Possible choices for the iterative optimization algorithm, examined in this paper, are:

  1. (a)

Method of Moving Asymptotes (MMA) [33] and its Globally Convergent upgrade (GCMMA) [34]. The GCMMA algorithm—which was developed originally with the specific aim of solving optimization problems arising in structural engineering—replaces the initial (either the original, or the surrogate) optimization problem with a sequence of approximating optimization subproblems, which are easier to solve, though still nonlinearly constrained. In each subproblem, the objective and constraining functions of the initial optimization problem are replaced by suitable approximating functions. From a certain viewpoint, this can also be considered as a form of surrogate optimization. However, such approximations are characterized by a much simpler functional form than Gaussian RBF network interpolation. In any case, the resulting subproblems are easier to solve than the initial optimization problem, e.g., because GCMMA applies separable approximators, which are summations of functions depending on a single real argument. Nevertheless, precisely because of their simplicity, the quality of such approximations may be worse than that of Gaussian RBF network interpolation, if a sufficiently large number of basis functions is used. Moreover, Gaussian RBF network interpolation guarantees a zero approximation error on the training set. Still, GCMMA is globally convergent in the sense that, for each possible initialization of the set of variables to be optimized, the algorithm was proved in [34] to converge to a stationary point of the initial optimization problem (assuming twice-continuous differentiability of the objective and constraining functions; in practice, convergence is often observed also for less smooth functions). Finally, it can be recalled that the expression moving asymptotes reflects the fact that, in each iteration, GCMMA exploits approximating functions characterized by suitable asymptotes, which typically change (move) from one iteration to the next;

  2. (b)

Sequential Linear Programming (SLP) [19]. Also in this case, the initial optimization problem is replaced by a sequence of simpler approximating subproblems. However, differently from GCMMA, at each iteration of SLP, linearizations of the objective and constraining functions at the current suboptimal solution are exploited. Moreover, additional box constraints are inserted, which define a so-called trust region inside which the approximation error due to the linearizations above is guaranteed to be small (such a guarantee can be obtained, e.g., by comparing the actual derivatives of the objective and constraining functions at selected points belonging to the trust region with their constant approximations coming from the linearizations above). Each resulting optimization subproblem is even easier to solve than those occurring in the application of GCMMA. Indeed, the simplex algorithm can be applied, or—if the number of variables to be optimized is not too large—the objective function can be evaluated only at the vertices of the admissible region of each subproblem, which is a polytope. Nonetheless, convergence may be slower in practice than for GCMMA, because of the extreme simplicity of the linear approximations, and of the need to choose a sufficiently small trust region, in order to guarantee a small approximation error in the linearizations. However, a possible way to improve the rate of convergence is to exploit an adaptive trust region (i.e., a trust region having an adaptive size), which can be constructed by enlarging (shrinking) the trust region when the current linear approximations are good (respectively, bad). Still, lower and upper bounds on the size of the adaptive trust region are needed, in order to prevent such a size from growing or shrinking indefinitely, possibly generating numerical errors (a minimal sketch of such an SLP scheme is reported right after this list).
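The sketch below illustrates an SLP scheme with an adaptive box trust region for the bound-constrained case (the acceptance test, the resizing rule, and all numerical parameter values are illustrative assumptions, not the exact variant used in the paper):

```python
import numpy as np
from scipy.optimize import linprog

def slp_maximize(f, grad_f, x0, bounds, n_iter=30, delta0=0.1,
                 delta_min=1e-4, delta_max=0.5, shrink=0.5, grow=2.0):
    """Sequential Linear Programming with an adaptive box trust region
    (bound-constrained maximization; all parameter values are illustrative)."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    delta = delta0
    for _ in range(n_iter):
        g = grad_f(x)
        # Trust region intersected with the original bound constraints
        lo = np.maximum([b[0] for b in bounds], x - delta)
        hi = np.minimum([b[1] for b in bounds], x + delta)
        # Maximize the linearized objective f(x) + g.(y - x), i.e., minimize -g.y
        res = linprog(c=-g, bounds=list(zip(lo, hi)), method="highs")
        y = res.x
        f_new = f(y)
        predicted = g @ (y - x)   # increase predicted by the linear model
        actual = f_new - fx       # increase actually achieved
        if actual > 0:
            x, fx = y, f_new
        # Enlarge the trust region if the linear model was accurate, shrink it otherwise
        if predicted > 0 and actual / predicted > 0.75:
            delta = min(grow * delta, delta_max)
        else:
            delta = max(shrink * delta, delta_min)
    return x, fx

# Illustrative usage on a toy concave objective over [0, 1]^2
f = lambda x: -((x[0] - 0.7) ** 2 + (x[1] - 0.2) ** 2)
grad = lambda x: np.array([-2.0 * (x[0] - 0.7), -2.0 * (x[1] - 0.2)])
print(slp_maximize(f, grad, x0=[0.5, 0.5], bounds=[(0.0, 1.0), (0.0, 1.0)]))
```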

Iterative optimization algorithms need suitable termination criteria [19]. The simplest criterion consists in stopping the algorithm after a given number \( N_{\text{i}} \) of iterations. However, if different algorithms characterized by different computational times per iteration are used, then a fair approach to compare their performances consists in stopping each algorithm after the same total computational time. Other termination criteria involve, e.g., the comparison of the values assumed by the (true or surrogate) objective function at the suboptimal solutions generated in consecutive iterations of the optimization algorithm, or the comparison between the values assumed by its gradient and the all-zero vector (limiting the discussion, for simplicity, to the case of unconstrained optimization).

Finally, it is important to recall that iterative optimization algorithms for nonlinearly constrained optimization problems with a nonlinear objective function and continuous variables to be optimized typically find only local maxima (even GCMMA, though globally convergent, is guaranteed to converge only to a stationary point, which may not even coincide with a local maximum point). In order to increase the probability of finding the global maximum or a good local maximum (i.e., one whose objective function value is near the global maximum), several repetitions of each iterative optimization algorithm can be performed, starting from different initializations, possibly generated using a Quasi-Monte Carlo subsequence (this is called a Quasi-Monte Carlo multi-start initialization approach). In this way, indeed, the probability increases that, in at least one repetition, the iterative optimization algorithm is started either from the basin of attraction of a global maximum, or from that associated with a sufficiently good local maximum [40].
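The Quasi-Monte Carlo multi-start strategy can be sketched as follows (the toy objective and the use of L-BFGS-B as the local solver are assumptions made for illustration; in the setting of the paper, the local solver would be, e.g., GCMMA or SLP applied to the surrogate objective):

```python
import numpy as np
from scipy.stats import qmc
from scipy.optimize import minimize

def qmc_multistart_maximize(objective, bounds, n_starts=16):
    """Run a local optimizer from Sobol'-generated initializations, keep the best result."""
    lows = np.array([b[0] for b in bounds])
    highs = np.array([b[1] for b in bounds])
    starts = lows + qmc.Sobol(d=len(bounds), scramble=False).random(n_starts) * (highs - lows)
    best_x, best_val = None, -np.inf
    for x0 in starts:
        # L-BFGS-B is used here only as a stand-in for the local solver
        res = minimize(lambda x: -objective(x), x0=x0, bounds=bounds, method="L-BFGS-B")
        if -res.fun > best_val:
            best_x, best_val = res.x, -res.fun
    return best_x, best_val

# Illustrative multimodal toy objective on [0, 1]^2 (not the band gap objective of the paper)
obj = lambda x: float(np.sin(5 * x[0]) * np.cos(4 * x[1]) - (x[0] - 0.5) ** 2)
print(qmc_multistart_maximize(obj, bounds=[(0.0, 1.0), (0.0, 1.0)]))
```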


About this article


Cite this article

Bacigalupo, A., Gnecco, G., Lepidi, M. et al. Machine-Learning Techniques for the Optimal Design of Acoustic Metamaterials. J Optim Theory Appl 187, 630–653 (2020). https://doi.org/10.1007/s10957-019-01614-8
