MetaNOR: A meta-learnt nonlocal operator regression approach for metamaterial modeling

Abstract

We propose MetaNOR, a meta-learnt approach for transfer-learning operators based on the nonlocal operator regression. The overall goal is to efficiently provide surrogate models for new and unknown material-learning tasks with different microstructures. The algorithm consists of two phases: (1) learning a common nonlocal kernel representation from existing tasks; (2) transferring the learned knowledge and rapidly learning surrogate operators for unseen tasks with a different material, where only a few test samples are required. We apply MetaNOR to model the wave propagation within 1D metamaterials, showing substantial improvements on the sampling efficiency for new materials.

Data availability

The data that support the findings of this study are available from the corresponding author, Yue Yu, upon reasonable request.

References

  1. S. Yao, X. Zhou, G. Hu, Experimental study on negative effective mass in a 1D mass-spring system. N. J. Phys. 10(4), 043020 (2008)

  2. H. Huang, C. Sun, Wave attenuation mechanism in an acoustic metamaterial with negative effective mass density. N. J. Phys. 11(1), 013003 (2009)

  3. J.M. Manimala, H.H. Huang, C. Sun, R. Snyder, S. Bland, Dynamic load mitigation using negative effective mass structures. Eng. Struct. 80, 458–468 (2014)

  4. Q. Du, B. Engquist, X. Tian, Multiscale modeling, homogenization and nonlocal effects: mathematical and computational issues (2019)

  5. L. Barker, A model for stress wave propagation in composite materials. J. Compos. Mater. 5(2), 140–162 (1971)

  6. H. You, Y. Yu, S. Silling, M. D’Elia, Data-driven learning of nonlocal models: from high-fidelity simulations to constitutive laws (MLPS, AAAI Spring Symposium, 2021)

  7. S.A. Silling, Propagation of a stress pulse in a heterogeneous elastic bar. J. Peridyn. Nonlocal Model. 3, 1–21 (2021)

  8. K. Deshmukh, T. Breitzman, K. Dayal, Multiband homogenization of metamaterials in real-space: higher-order nonlocal models and scattering at external surface. Preprint (2021)

  9. H. You, Y. Yu, N. Trask, M. Gulian, M. D’Elia, Data-driven learning of robust nonlocal physics from high-fidelity synthetic data. Comput. Methods Appl. Mech. Eng. 374, 113553 (2021)

  10. H. You, Y. Yu, S. Silling, M. D’Elia, A data-driven peridynamic continuum model for upscaling molecular dynamics. Comput. Methods Appl. Mech. Eng. 389, 114400 (2022)

  11. N. Tripuraneni, C. Jin, M. Jordan, Provable meta-learning of linear representations. In: International Conference on Machine Learning, pp. 10434–10443 (2021). PMLR

  12. M. Beran, J. McCoy, Mean field variations in a statistical sample of heterogeneous linearly elastic solids. Int. J. Solids Struct. 6(8), 1035–1054 (1970)

  13. S.A. Silling, Origin and effect of nonlocality in a composite. J. Mech. Mater. Struct. 9(2), 245–258 (2014)

  14. Q. Du, B. Engquist, X. Tian, Multiscale modeling, homogenization and nonlocal effects: mathematical and computational issues. Contemp. Math. 754, 18 (2020)

  15. Q. Du, Y. Tao, X. Tian, A peridynamic model of fracture mechanics with bond-breaking. J. Elast. 25, 1–22 (2017)

  16. R. Ge, C. Jin, Y. Zheng, No spurious local minima in nonconvex low rank problems: a unified geometric analysis. In: International Conference on Machine Learning, pp. 1233–1242 (2017). PMLR

  17. Y. Liu, T. Zhao, W. Ju, S. Shi, Materials discovery and design using machine learning. J. Mater. 3(3), 159–177 (2017)

  18. S. Lu, Q. Zhou, Y. Ouyang, Y. Guo, Q. Li, J. Wang, Accelerated discovery of stable lead-free hybrid organic-inorganic perovskites via machine learning. Nat. Commun. 9(1), 1–8 (2018)

  19. K.T. Butler, D.W. Davies, H. Cartwright, O. Isayev, A. Walsh, Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)

  20. J. Cai, X. Chu, K. Xu, H. Li, J. Wei, Machine learning-driven new material discovery. Nanoscale Adv. 2(8), 3115–3130 (2020)

  21. Y. Iwasaki, I. Takeuchi, V. Stanev, A.G. Kusne, M. Ishida, A. Kirihara, K. Ihara, R. Sawada, K. Terashima, H. Someya et al., Machine-learning guided discovery of a new thermoelectric material. Sci. Rep. 9(1), 1–7 (2019)

  22. F. Ren, L. Ward, T. Williams, K.J. Laws, C. Wolverton, J. Hattrick-Simpers, A. Mehta, Accelerated discovery of metallic glasses through iteration of machine learning and high-throughput experiments. Sci. Adv. 4(4), 1566 (2018)

  23. K. Kaufmann, D. Maryanovsky, W.M. Mellor, C. Zhu, A.S. Rosengarten, T.J. Harrington, C. Oses, C. Toher, S. Curtarolo, K.S. Vecchio, Discovery of high-entropy ceramics via machine learning. NPJ Comput. Mater. 6(1), 1–9 (2020)

  24. A. Chen, X. Zhang, Z. Zhou, Machine learning: accelerating materials development for energy storage and conversion. InfoMat 2(3), 553–576 (2020)

  25. C. Finn, P. Abbeel, S. Levine, Model-agnostic meta-learning for fast adaptation of deep networks. In: International Conference on Machine Learning, pp. 1126–1135 (2017). PMLR

  26. Y.L. Qiu, H. Zheng, A. Devos, H. Selby, O. Gevaert, A meta-learning approach for genomic survival analysis. Nat. Commun. 11(1), 1–11 (2020)

  27. Y. Chen, C. Guan, Z. Wei, X. Wang, W. Zhu, MetaDelta: a meta-learning system for few-shot image classification. arXiv preprint arXiv:2102.10744 (2021)

  28. W. Yin, Meta-learning for few-shot natural language processing: a survey. arXiv preprint arXiv:2007.09604 (2020)

  29. B. Kailkhura, B. Gallagher, S. Kim, A. Hiszpanski, T.Y.J. Han, Reliable and explainable machine-learning methods for accelerated material discovery. NPJ Comput. Mater. 5(1), 1–9 (2019)

  30. H. Mai, T.C. Le, T. Hisatomi, D. Chen, K. Domen, D.A. Winkler, R.A. Caruso, Use of meta models for rapid discovery of narrow bandgap oxide photocatalysts. iScience 24, 103068 (2021)

Acknowledgments

The authors would like to thank Dr. Stewart Silling for sharing the DNS codes and for helpful discussions.

Funding

The authors are supported by the National Science Foundation under award DMS 1753031 and the AFOSR grant FA9550-22-1-0197. Portions of this research were conducted on Lehigh University’s Research Computing infrastructure, partially supported by NSF Award 2019035.

Author information

Contributions

LZ: methodology, validation, formal analysis, investigation, data curation, visualization; HY: methodology, investigation, data curation; YY: conceptualization, resources, funding acquisition, writing, supervision, project administration.

Corresponding author

Correspondence to Yue Yu.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Appendices

Appendix A: Related works

Material discovery

Using machine learning techniques for material discovery is gaining increasing attention in the scientific community.[17,18,19,20] These techniques have been applied to materials such as thermoelectric materials,[21] metallic glasses,[22] high-entropy ceramics,[23] and so on. Learning models for metamaterials has also gained popularity with recent approaches such as nonlocal operator regression.[9] For a comprehensive review of the application of machine learning techniques to property prediction and materials development in energy-related fields, we refer interested readers to [24].

Meta-learning

Meta-learning seeks to design algorithms that can utilize previous experience to rapidly learn new skills or adapt to new environments. There is a vast literature proposing meta-learning methods,[25] and they have been applied to patient survival analysis,[26] few-shot image classification,[27] and natural language processing,[28] to name a few. Recently, the first provably generalizable algorithms with sharp guarantees in the linear setting were provided in [11].

Transfer and meta-learning for material modeling

Despite its popularity, few works have studied material discovery in a meta- or even transfer-learning setting. Reference [29] proposes a transfer-learning technique that exploits correlations among different material properties, augmenting the features with predicted material properties to improve regression performance. Reference [30] uses an ensemble of models together with a meta-model to help discover candidate water-splitting photocatalysts. To the best of our knowledge, our work is the first application of transfer or meta-learning to heterogeneous material homogenization and discovery.

Appendix B: Detailed proof for the error bounds

In this section we review two main lemmas from [11], which provide a theoretical prediction error bound for the meta-learning of the linear representation model illustrated in (16) and (17). We then employ these two lemmas and detail the proof of Theorem 3.4, which provides the error bound for the meta-learning of kernel representations and the resultant prediction tasks.
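
To make the two-phase structure concrete, the following minimal NumPy sketch illustrates one simple (SVD-based) way to realize linear-representation meta-learning: a shared feature matrix is estimated from per-task least-squares fits on the meta-train tasks, and a new task's low-dimensional coefficients are then fitted from a few test samples. This is an illustrative simplification, not the optimization objectives (18) and (20) analyzed below; the function and variable names are our own.

```python
import numpy as np

def meta_train_features(task_data, M):
    """task_data: list of (S_t, y_t) with S_t of shape (N_t, d) and y_t of shape (N_t,).
    Returns B_hat (d, M): an orthonormal basis approximating the shared kernel subspace."""
    # stack per-task least-squares solutions and extract their leading M-dimensional subspace
    W = np.column_stack([np.linalg.lstsq(S, y, rcond=None)[0] for S, y in task_data])
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :M]

def meta_test_fit(B_hat, S_test, y_test):
    """Few-shot fit of the new task's coefficients on the learned features."""
    C_hat, *_ = np.linalg.lstsq(S_test @ B_hat, y_test, rcond=None)
    return C_hat  # estimated kernel parameters are B_hat @ C_hat
```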

Lemma B.1

([11], Theorem 2) Assume that we are in a uniform task sampling model. If the number of meta-train samples \(N_{\mathrm{{train}}}\) satisfies \(N_{\mathrm{{train}}}\gtrsim \text {polylog}(N_{\mathrm{{train}}},d,H)(\kappa M)^4\max \{H,d\}\), then for any local minimum of the optimization objective (18), the column space of \(\mathbf {V}^*\), spanned by the orthonormal feature matrix \(\mathbf {B}^\circ\), satisfies

$$\begin{aligned} \sin \theta (\mathbf {B}^\circ ,\mathbf {B})\le {O}\left( \sqrt{\dfrac{\max \{H,d\}M\log N_{\mathrm{{train}}}}{\nu N_{\mathrm{{train}}}}}\right) , \end{aligned}$$
(B.1)

with probability at least \(1-1/\text {poly}(d).\)

Note that Assumption 3.2 guarantees that \(\nu \ge \Omega (1/M)\) and the above theorem yields

$$\begin{aligned} \sin \theta (\mathbf {B}^\circ ,\mathbf {B})\le {\tilde{O}}\left( \sqrt{\dfrac{\max \{H,d\} M^2}{N_{\mathrm{{train}}}}}\right) , \end{aligned}$$
(B.2)

with probability at least \(1-1/\text {poly}(d)\).

Lemma B.2

([11], Theorem 4) Suppose the parameter associated with the new task satisfies \({\left| \left| \mathbf {C}_{\mathrm{{test}}} \right| \right| }_{l^2}\le O(1)\). If an estimate \(\mathbf {B}^{\circ }\) of the true feature matrix \(\mathbf {B}\) satisfies \(\sin \theta (\mathbf {B}^{\circ },\mathbf {B})\le \varpi\) and \(N_{\mathrm{{test}}}\gtrsim M\log N_{\mathrm{{test}}}\), then the output parameter \(\mathbf {C}_{\mathrm{{test}}}^\circ\) from (20) satisfies

$$\begin{aligned} {\left| \left| \mathbf {B}^{\circ }\mathbf {C}_{\mathrm{{test}}}^\circ -\mathbf {B}\mathbf {C}_{\mathrm{{test}}} \right| \right| }_{l^2}^2\le {\tilde{O}}\left( \varpi ^2+\dfrac{M}{N_{\mathrm{{test}}}}\right) , \end{aligned}$$
(B.3)

with probability at least \(1-O(N_{\mathrm{{test}}}^{-100}).\)

Combining Lemma B.2 with Lemma B.1, we obtain the following result for applying (18) and (20) as a linear feature meta-learning algorithm:

$$\begin{aligned} {\left| \left| \mathbf {B}^{\circ }\mathbf {C}_{\mathrm{{test}}}^\circ -\mathbf {B}\mathbf {C}_{\mathrm{{test}}} \right| \right| }_{l^2}^2 \le {\tilde{O}}\left( \dfrac{\max \{H,d\} M^2}{N_{\mathrm{{train}}}}+\dfrac{M}{N_{\mathrm{{test}}}}\right) , \end{aligned}$$
(B.4)

with probability at least \(1-O((\text {poly}(d))^{-1}+N_{\mathrm{{test}}}^{-100})\).
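
Explicitly, this follows by taking \(\varpi\) in Lemma B.2 to be the feature-recovery error guaranteed by Lemma B.1 and applying a union bound over the two failure events:

$$\begin{aligned} {\left| \left| \mathbf {B}^{\circ }\mathbf {C}_{\mathrm{{test}}}^\circ -\mathbf {B}\mathbf {C}_{\mathrm{{test}}} \right| \right| }_{l^2}^2\le {\tilde{O}}\left( \varpi ^2+\dfrac{M}{N_{\mathrm{{test}}}}\right) ,\qquad \varpi ^2={\tilde{O}}\left( \dfrac{\max \{H,d\} M^2}{N_{\mathrm{{train}}}}\right) . \end{aligned}$$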

We now proceed to provide the proof for Theorem 3.4. In the following, we use C to denote a generic constant which is independent of \(\Delta x\), \(\Delta t\), M, H, \(N_{\mathrm{{train}}}\) and \(N_{\mathrm{{test}}}\), but might depend on \(\delta\).

Proof

With (B.4), we immediately obtain the \(l^2([0,\delta ])\) error estimate for the learnt kernel \(\gamma ^\circ _{\mathrm{{test}}}\) as

$$\begin{aligned} {\left| \left| \gamma _{\mathrm{{test}}}-\gamma ^\circ _{\theta _{\mathrm{{test}}}} \right| \right| }^2_{l^2([0,\delta ])}=&\Delta x\sum _{\alpha =1}^d \left( \sum _{m=1}^M(\mathbf {C}^\circ _{\mathrm{{test}}}-\mathbf {C}_{\mathrm{{test}}})_m P_m({\left| \alpha \Delta x \right| })\right) ^2\\ =&\Delta x{\left| \left| \mathbf {B}\mathbf {C}_{\mathrm{{test}}}-\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ \right| \right| }^2_{l^2}\\ =&{\tilde{O}}\left( \Delta x\dfrac{\max \{H,d\} M^2}{N_{\mathrm{{train}}}}+\Delta x\dfrac{M}{N_{\mathrm{{test}}}}\right) , \end{aligned}$$

with probability at least \(1-O((\text {poly}(d))^{-1}+N_{\mathrm{{test}}}^{-100})\).

For the error bound in the discretized energy norm, we notice that the ground-truth solution \({\hat{u}}\) satisfies:

$$\begin{aligned} \ddot{{\hat{u}}}(x_i,t^{n})&= {\mathcal {L}}_{\theta _{\mathrm{{test}}},h}[{\hat{u}}](x_i,t^n)+g(x_i,t^n) +\epsilon (x_i,t^n)+\left[ \ddot{{\hat{u}}}(x_i,t^{n})-\dfrac{\partial ^2 {\hat{u}}}{\partial t^2}(x_i,t^n)\right]+\left[ {\mathcal {L}}_{\theta _{\mathrm{{test}}}}[{\hat{u}}](x_i,t^n)-{\mathcal {L}}_{\theta _{\mathrm{{test}}},h}[{\hat{u}}](x_i,t^n)\right] \end{aligned}$$

for all \(x_i\in \chi _{\mathrm{{pred}}}\), \(n=1,\ldots ,\lfloor T_{\mathrm{{pred}}}/\Delta t\rfloor\). Subtracting (22) from this equation and denoting \(e_i^n:={\hat{u}}(x_i,t^n)-{\bar{u}}^n(x_i)\), we then obtain

$$\begin{aligned} \dfrac{e_i^{n+1}-2e_i^{n}+e_i^{n-1}}{\Delta t^2}&= \Delta x\sum _{\alpha =1}^d (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha (e_{i+\alpha }^{n}+e_{i-\alpha }^{n}-2e_{i}^{n})+(\epsilon _{\mathrm{{all}}})_i^n, \end{aligned}$$
(B.5)

where

$$\begin{aligned} (\epsilon _{\mathrm{{all}}})_i^n:&=\epsilon (x_i,t^n)+\Delta x\sum _{\alpha =1}^d (\mathbf {B}\mathbf {C}_{\mathrm{{test}}}-\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha ({\hat{u}}(x_{i+\alpha },t^{n}) +{\hat{u}}(x_{i-\alpha },t^{n})-2{\hat{u}}(x_{i},t^{n}))+\left[ \ddot{{\hat{u}}}(x_i,t^{n})-\dfrac{\partial ^2 {\hat{u}}}{\partial t^2}(x_i,t^n)\right] +\left[ {\mathcal {L}}_{\theta _{\mathrm{{test}}}}[{\hat{u}}](x_i,t^n) \right. \left. -{\mathcal {L}}_{\theta _{\mathrm{{test}}},h}[{\hat{u}}](x_i,t^n)\right] . \end{aligned}$$

With Assumption 3.3, we have the truncation error for the Riemann sum part as \({\left| {\mathcal {L}}_{\theta _{\mathrm{{test}}}}{\hat{u}}(x,t) -{\mathcal {L}}_{\theta _{\mathrm{{test}}},h}{\hat{u}}(x,t) \right| }\le C\Delta x\) for a constant C independent of \(\Delta x\) and \(\Delta t\) but possibly depending on \(\delta\). Similarly, we have the truncation error for the central difference scheme as \({\left| \ddot{{\hat{u}}}(x_i,t^{n})-\dfrac{\partial ^2 {\hat{u}}}{\partial t^2}(x_i,t^n) \right| }\le C(\Delta t)^2\) with the constant C independent of \(\Delta x\), \(\Delta t\), and \(\delta\). Moreover, (B.4) yields

$$\begin{aligned}{\left| \Delta x\sum _{\alpha =1}^d (\mathbf {B}\mathbf {C}_{\mathrm{{test}}}-\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha ({\hat{u}}(x_{i+\alpha },t^{n}) +{\hat{u}}(x_{i-\alpha },t^{n})-2{\hat{u}}(x_{i},t^{n})) \right| }&\le \Delta x{\left| \sum _{\alpha =1}^d (\mathbf {B}\mathbf {C}_{\mathrm{{test}}}-\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha (\alpha \Delta x)^2\max _{(x,t)\in \Omega _{\mathrm{{pred}}}\times [0,T_{\mathrm{{pred}}}]}{\left| \dfrac{\partial ^2 {\hat{u}}}{\partial x^2} \right| } \right| }\\&\le \Delta x\delta ^2\sum _{\alpha =1}^d {\left| (\mathbf {B}\mathbf {C}_{\mathrm{{test}}}-\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha \right| }\max _{(x,t)\in \Omega _{\mathrm{{pred}}}\times [0,T_{\mathrm{{pred}}}]}{\left| \dfrac{\partial ^2 {\hat{u}}}{\partial x^2} \right| }\\& \le \Delta x\delta ^2\sqrt{d}{\left| \left| \mathbf {B}\mathbf {C}_{\mathrm{{test}}}-\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ \right| \right| }_{l^2}\max _{(x,t)\in \Omega _{\mathrm{{pred}}}\times [0,T_{\mathrm{{pred}}}]}{\left| \dfrac{\partial ^2 {\hat{u}}}{\partial x^2} \right| }\\&\le {\tilde{O}}\left( \sqrt{\dfrac{\max \{H,d\} M^2}{N_{\mathrm{{train}}}}+\dfrac{M}{N_{\mathrm{{test}}}}}\right) \sqrt{\Delta x\delta ^5}\max _{(x,t)\in \Omega _{\mathrm{{pred}}}\times [0,T_{\mathrm{{pred}}}]}{\left| \dfrac{\partial ^2 {\hat{u}}}{\partial x^2} \right| }, \end{aligned}$$

with probability at least \(1-O((\text {poly}(d))^{-1}+N_{\mathrm{{test}}}^{-100})\). Hence we have the bound for \(\epsilon _{\mathrm{{all}}}\):

$$\begin{aligned}{\left| (\epsilon _{\mathrm{{all}}})_i^n \right| }\le E+{\tilde{O}}\left( \Delta x+(\Delta t)^2+\sqrt{\left( \dfrac{\max \{H,d\} M^2}{N_{\mathrm{{train}}}}+\dfrac{M}{N_{\mathrm{{test}}}}\right) \Delta x}\right) . \end{aligned}$$

To show the \(l^2({\Omega })\) error for \(e_i^n\), we first derive a bound for its error in the (discretized) energy norm. Multiplying (B.5) by \(\frac{e_i^{n+1}-e_i^{n}}{\Delta t}\) and summing over \(\chi _{\mathrm{{pred}}}=\{x_i\}_{i=1}^{L_{\mathrm{{pred}}}}\) yields:

$$\begin{aligned}\sum _{i=1}^{L_{\mathrm{{pred}}}}\dfrac{(e_i^{n+1}-2e_i^{n}+e_i^{n-1})(e_i^{n+1}-e_i^{n})}{\Delta t^3}&= \dfrac{\Delta x}{\Delta t}\sum _{i=1}^{L_{\mathrm{{pred}}}}\sum _{\alpha =1}^d (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha (e_{i+\alpha }^{n}+e_{i-\alpha }^{n}-2e_{i}^{n})(e_i^{n+1}-e_i^{n}) +\dfrac{1}{\Delta t}\sum _{i=1}^{L_{\mathrm{{pred}}}}(\epsilon _{\mathrm{{all}}})_i^n(e_i^{n+1}-e_i^{n}). \end{aligned}$$

With the identity \(a(a-b)=\frac{1}{2}(a^2-b^2+(a-b)^2)\), we can rewrite the left hand side as

$$\begin{aligned}\sum _{i=1}^{L_{\mathrm{{pred}}}}\dfrac{(e_i^{n+1}-2e_i^{n}+e_i^{n-1})(e_i^{n+1}-e_i^{n})}{\Delta t^3}&\ge \dfrac{1}{2\Delta t}\sum _{i=1}^{L_{\mathrm{{pred}}}}\left[ \left( \dfrac{e_i^{n+1}-e_i^{n}}{\Delta t}\right) ^2-\left( \dfrac{e_i^{n}-e_i^{n-1}}{\Delta t}\right) ^2 \right.\left. +\left( \dfrac{e_i^{n+1}-2e_i^{n}+e_i^{n-1}}{\Delta t}\right) ^2\right] \\& \ge \dfrac{1}{2\Delta t}\sum _{i=1}^{L_{\mathrm{{pred}}}}\left[ \left( \dfrac{e_i^{n+1}-e_i^{n}}{\Delta t}\right) ^2-\left( \dfrac{e_i^{n}-e_i^{n-1}}{\Delta t}\right) ^2\right] . \end{aligned}$$

For the first term on the right hand side, with the identities

$$\begin{aligned}\sum _{i=1-\alpha }^{L} a_i(b_{i+\alpha }-b_i)=\sum _{i=1}^{\alpha }a_{L+i}b_{L+i} -\sum _{i=1}^{\alpha }a_{i-\alpha }b_{i-\alpha }-\sum _{i=1-\alpha }^{L}b_{i+\alpha }(a_{i+\alpha }-a_i), \end{aligned}$$

\(a(b-a)=\frac{1}{2}(b^2-a^2-(a-b)^2)\), Assumption 3.3, and the exact Dirichlet-type boundary condition, i.e., \(e_i^n=0\) for \(i<1\) and \(i>L_{\mathrm{{pred}}}\), we have

$$\begin{aligned}&\dfrac{\Delta x}{\Delta t}\sum _{i=1}^{L_{\mathrm{{pred}}}}\sum _{\alpha =1}^d (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha (e_{i+\alpha }^{n}+e_{i-\alpha }^{n}-2e_{i}^{n})(e_i^{n+1}-e_i^{n})\\&=-\dfrac{\Delta x}{\Delta t}\sum _{\alpha =1}^d\sum _{i=1-\alpha }^{L_{\mathrm{{pred}}}} (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha (e^n_{i+\alpha }-e_i^n) (e^{n+1}_{i+\alpha }-e^n_{i+\alpha }-e_i^{n+1}+e_i^n)\\& =-\dfrac{\Delta x}{2\Delta t}\sum _{\alpha =1}^d\sum _{i=1-\alpha }^{L_{\mathrm{{pred}}}} (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha \left[ (e^{n+1}_{i+\alpha }-e_i^{n+1})^2-(e^n_{i+\alpha }-e_i^n)^2\right.\left. -(e^{n+1}_{i+\alpha }-e^n_{i+\alpha }-e_i^{n+1}+e_i^n)^2\right] \\& \le -\dfrac{\Delta x}{2\Delta t}\sum _{\alpha =1}^d\sum _{i=1-\alpha }^{L_{\mathrm{{pred}}}} (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha \left[ (e^{n+1}_{i+\alpha }-e_i^{n+1})^2-(e^n_{i+\alpha }-e_i^n)^2\right] +\dfrac{\Delta x}{\Delta t}\sum _{\alpha =1}^d\sum _{i=1-\alpha }^{L_{\mathrm{{pred}}}} (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha \left[ (e^{n+1}_{i+\alpha }-e^n_{i+\alpha })^2+(e_i^{n+1}-e_i^n)^2\right] \\& \le -\dfrac{\Delta x}{2\Delta t}\sum _{\alpha =1}^d\sum _{i=1-\alpha }^{L_{\mathrm{{pred}}}} (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha \left[ (e^{n+1}_{i+\alpha }-e_i^{n+1})^2-(e^n_{i+\alpha }-e_i^n)^2\right] +2\dfrac{\Delta x}{\Delta t}\sum _{i=1}^{L_{\mathrm{{pred}}}} {\left| \left| \mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ \right| \right| }_{l^1} (e_i^{n+1}-e_i^n)^2\\& \le -\dfrac{\Delta x}{2\Delta t}\sum _{\alpha =1}^d\sum _{i=1-\alpha }^{L_{\mathrm{{pred}}}} (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha \left[ (e^{n+1}_{i+\alpha }-e_i^{n+1})^2-(e^n_{i+\alpha }-e_i^n)^2\right] +\dfrac{1}{4}\sum _{i=1}^{L_{\mathrm{{pred}}}} \left( \dfrac{e_i^{n+1}-e_i^n}{\Delta t}\right) ^2. \end{aligned}$$

For the second term on the right hand side we have

$$\begin{aligned}\dfrac{1}{\Delta t}\sum _{i=1}^{L_{\mathrm{{pred}}}}(\epsilon _{\mathrm{{all}}})_i^n(e_i^{n+1}-e_i^{n})\le \sum _{i=1}^{L_{\mathrm{{pred}}}}((\epsilon _{\mathrm{{all}}})_i^n)^2+\dfrac{1}{4}\sum _{i=1}^{L_{\mathrm{{pred}}}}\left( \dfrac{e_i^{n+1}-e_i^n}{\Delta t}\right) ^2. \end{aligned}$$

Putting the above three inequalities together, we obtain

$$\begin{aligned}\sum _{i=1}^{L_{\mathrm{{pred}}}}\left[ \left( 1-\Delta t\right) \left( \dfrac{e_i^{n+1}-e_i^{n}}{\Delta t}\right) ^2-\left( \dfrac{e_i^{n}-e_i^{n-1}}{\Delta t}\right) ^2\right] +2\Delta x\sum _{\alpha =1}^d\sum _{i=1-\alpha }^{L_{\mathrm{{pred}}}} (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha \left[ (e^{n+1}_{i+\alpha }-e_i^{n+1})^2-(e^n_{i+\alpha }-e_i^n)^2\right] \le 2\Delta t\sum _{i=1}^{L_{\mathrm{{pred}}}}((\epsilon _{\mathrm{{all}}})_i^n)^2. \end{aligned}$$

With the discrete Gronwall lemma and the bound of \(\Delta t\) in Assumption 3.3, for \(n=1,\ldots ,\lfloor T_{\mathrm{{pred}}}/\Delta t\rfloor\) we have

$$\begin{aligned}\sum _{i=1}^{L_{\mathrm{{pred}}}}\left( \dfrac{e_i^{n}-e_i^{n-1}}{\Delta t}\right) ^2+2\Delta x\sum _{\alpha =1}^d\sum _{i=1-\alpha }^{L_{\mathrm{{pred}}}} (\mathbf {B}^\circ \mathbf {C}_{\mathrm{{test}}}^\circ )_\alpha (e^{n}_{i+\alpha }-e_i^{n})^2 &\le 2L_{\mathrm{{pred}}}((1-\Delta t)^{-n}-1)\max _{i,n}{\left| (\epsilon _{\mathrm{{all}}})_i^n \right| }^2\le 4\exp (T_{\mathrm{{pred}}})L_{\mathrm{{pred}}}\max _{i,n}{\left| (\epsilon _{\mathrm{{all}}})_i^n \right| }^2\\ & \le L_{\mathrm{{pred}}}{\tilde{O}}\left( E^2+(\Delta x)^2+(\Delta t)^4+\left( \dfrac{\max \{H,d\} M^2}{N_{\mathrm{{train}}}}+\dfrac{M}{N_{\mathrm{{test}}}}\right) \Delta x\right) , \end{aligned}$$

with probability at least \(1-O((\text {poly}(d))^{-1}+N_{\mathrm{{test}}}^{-100})\), which provides the error bound in the discrete energy norm. \(\square\)

Appendix C: Reduction of two physics constraints

In this section we further expand the discussion on the physics-based constraints in the “Prediction error bounds” section. The overall strategy is to fix the last two polynomial features,

$$\begin{aligned} P_{M-1}(\xi )=\beta _1:=\left( \sum _{\alpha =1}^d\alpha ^2\Delta x^3\right) ^{-1} \end{aligned}$$

and

$$\begin{aligned} P_{M}(\xi )=\beta _2\xi :=\left( \sum _{\alpha =1}^d\alpha ^3\Delta x^4\right) ^{-1}\xi \end{aligned}$$

into the set of basis polynomials. We note that these two polynomials satisfy

$$\begin{aligned}\sum _{\alpha =1}^d \alpha ^2 \Delta x^3 P_{M-1}(\alpha \Delta x)=1,\,\\ \sum _{\alpha =1}^d \alpha ^2 \Delta x^3 P_{M}(\alpha \Delta x)=1, \end{aligned}$$

and

$$\begin{aligned} \sum _{\alpha =1}^d \alpha ^4 \Delta x^5 P_{M-1}(\alpha \Delta x)=\dfrac{\sum _{\alpha =1}^d\alpha ^4\Delta x^2}{\sum _{\alpha =1}^d\alpha ^2}, \\ \sum _{\alpha =1}^d \alpha ^4 \Delta x^5 P_{M}(\alpha \Delta x)=\dfrac{\sum _{\alpha =1}^d\alpha ^5\Delta x^2}{\sum _{\alpha =1}^d\alpha ^3}. \end{aligned}$$
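
These identities follow by direct substitution of the definitions of \(\beta _1\) and \(\beta _2\); for instance, for the first fourth-moment sum,

$$\begin{aligned} \sum _{\alpha =1}^d \alpha ^4 \Delta x^5 P_{M-1}(\alpha \Delta x)=\beta _1\sum _{\alpha =1}^d \alpha ^4 \Delta x^5 =\dfrac{\sum _{\alpha =1}^d\alpha ^4\Delta x^5}{\sum _{\alpha =1}^d\alpha ^2\Delta x^3} =\dfrac{\sum _{\alpha =1}^d\alpha ^4\Delta x^2}{\sum _{\alpha =1}^d\alpha ^2}. \end{aligned}$$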

Then (24) becomes

$$\begin{aligned} {\bar{\rho }} c_0^2=\sum _{m=1}^{M-2} C_m A_{1m}+C_{M-1}+C_{M}, \end{aligned}$$

and (26) becomes

$$\begin{aligned} -4{\bar{\rho }} c_0^3 R=\sum _{m=1}^{M-2} C_m A_{2m}+\dfrac{\sum _{\alpha =1}^d\alpha ^4\Delta x^2}{\sum _{\alpha =1}^d\alpha ^2}C_{M-1}+\dfrac{\sum _{\alpha =1}^d\alpha ^5\Delta x^2}{\sum _{\alpha =1}^d\alpha ^3}C_{M}. \end{aligned}$$

Denoting

$$\begin{aligned} \Lambda := \begin{bmatrix} 1 &{} 1 \\ \dfrac{\sum _{\alpha =1}^d\alpha ^4\Delta x^2}{\sum _{\alpha =1}^d\alpha ^2}&{}\dfrac{\sum _{\alpha =1}^d\alpha ^5\Delta x^2}{\sum _{\alpha =1}^d\alpha ^3}\\ \end{bmatrix}, \end{aligned}$$

and

$$\begin{aligned} \mathbf {H}:= \begin{bmatrix} \Delta x^2&{}4\Delta x^2&{}\cdots &{}d^2\Delta x^2\\ \Delta x^4&{}16\Delta x^4&{}\cdots &{}d^4\Delta x^4\\ \end{bmatrix}, \end{aligned}$$

then

$$\begin{aligned} \begin{bmatrix} C_{M-1} \\ C_{M} \\ \end{bmatrix} =&\Lambda ^{-1} \begin{bmatrix} {\bar{\rho }} c_0^2 - \sum _{m=1}^{M-2} C_m A_{1m}\\ -4{\bar{\rho }} c_0^3 R-\sum _{m=1}^{M-2} C_m A_{2m} \\ \end{bmatrix}\\ =&\Lambda ^{-1} \begin{bmatrix} {\bar{\rho }} c_0^2\\ -4{\bar{\rho }} c_0^3 R\\ \end{bmatrix} -\Delta x\Lambda ^{-1}\mathbf {H}\mathbf {B}\mathbf {C}. \end{aligned}$$

Substituting this equation into the loss function in (8), for each \(x_i\) we obtain

$$\begin{aligned}\left( {y}_{k,i}^{n}-(\mathbf {s}_{k,i}^{n})^T\mathbf {B}\mathbf {C}\right) ^2 =&\left( {y}_{k,i}^{n}-(\mathbf {s}_{k,i}^{n})^T\left( \sum _{m=1}^{M-2} C_m\mathbf {b}_m +C_{M-1}\mathbf {b}_{M-1}+C_{M}\mathbf {b}_{M}\right) \right) ^2\\ =&\left( y_{k,i}^{n}-(\mathbf {s}_{k,i}^{n})^T\sum _{m=1}^{M-2} C_m\mathbf {b}_m-(\mathbf {s}_{k,i}^{n})^T[\mathbf {b}_{M-1},\mathbf {b}_{M}]\Lambda ^{-1} \begin{bmatrix} {\bar{\rho }} c_0^2\\ -4{\bar{\rho }} c_0^3 R\\ \end{bmatrix}\right. \left. +\Delta x(\mathbf {s}_{k,i}^{n})^T[\mathbf {b}_{M-1},\mathbf {b}_{M}]\Lambda ^{-1}\mathbf {H}\mathbf {B}\mathbf {C}\right) ^2\\ =&\left( y_{k,i}^{n}-(\mathbf {s}_{k,i}^{n})^T[\mathbf {b}_{M-1},\mathbf {b}_{M}]\Lambda ^{-1} \begin{bmatrix} {\bar{\rho }} c_0^2\\ -4{\bar{\rho }} c_0^3 R\\ \end{bmatrix}\right. \left. -(\mathbf {s}_{k,i}^{n})^T (\mathbf {I}-\Delta x[\mathbf {b}_{M-1},\mathbf {b}_{M}]\Lambda ^{-1}\mathbf {H}) \mathbf {B}\mathbf {C} \right) ^2\\ =&\left( {\tilde{y}}_{k,i}^{n}-(\tilde{\mathbf {s}}_{k,i}^{n})^T{\mathbf {B}}\tilde{\mathbf {C}}\right) ^2, \end{aligned}$$

where

$$\begin{aligned} {\tilde{y}}_{k,i}^{n}:=y_{k,i}^{n}-(\mathbf {s}_{k,i}^{n})^T[\mathbf {b}_{M-1},\mathbf {b}_{M}]\Lambda ^{-1} \begin{bmatrix} {\bar{\rho }} c_0^2\\ -4{\bar{\rho }} c_0^3 R\\ \end{bmatrix}, \end{aligned}$$

\(\tilde{\mathbf {s}}_{k,i}^{n}:=(\mathbf {I}-\Delta x[\mathbf {b}_{M-1},\mathbf {b}_{M}]\Lambda ^{-1}\mathbf {H})^T\mathbf {s}_{k,i}^{n}\), \(\tilde{\mathbf {C}}:=[C_1,\ldots ,C_{M-2}]\), and \(\mathbf {I}\) is a \(d\times d\) identity matrix. Therefore, the analysis and algorithm can also be extended to the constrained cases.
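
As a concrete illustration of this reduction, the following NumPy sketch assembles \(\Lambda\) and \(\mathbf {H}\) and forms the shifted targets \({\tilde{y}}_{k,i}^{n}\) and transformed data vectors \(\tilde{\mathbf {s}}_{k,i}^{n}\); the shapes, function name, and variable names are our own assumptions, not the authors' implementation.

```python
import numpy as np

def reduce_constraints(B, s, y, dx, rho_bar, c0, R):
    """Eliminate C_{M-1}, C_M via the two physics constraints (Appendix C).

    B : (d, M) array of features P_m(alpha*dx); the last two columns are b_{M-1}, b_M.
    s : (d,) data vector s_{k,i}^n.   y : scalar target y_{k,i}^n.
    Returns (s_tilde, y_tilde) so that the regression involves only C_1, ..., C_{M-2}.
    """
    d, M = B.shape
    alpha = np.arange(1, d + 1)

    # 2x2 matrix Lambda built from the moments of the two fixed polynomial features
    Lam = np.array([
        [1.0, 1.0],
        [np.sum(alpha**4) * dx**2 / np.sum(alpha**2),
         np.sum(alpha**5) * dx**2 / np.sum(alpha**3)],
    ])

    # H: second and fourth moments of the horizon grid, shape (2, d)
    H = np.vstack([(alpha**2) * dx**2, (alpha**4) * dx**4])

    b_last = B[:, -2:]                                            # [b_{M-1}, b_M], shape (d, 2)
    rhs = np.array([rho_bar * c0**2, -4.0 * rho_bar * c0**3 * R])

    y_tilde = y - s @ b_last @ np.linalg.solve(Lam, rhs)          # shifted target
    s_tilde = (np.eye(d) - dx * b_last @ np.linalg.solve(Lam, H)).T @ s  # transformed data
    return s_tilde, y_tilde
```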

Appendix D: Detailed parameter and experiment settings

Meta-train and meta-test datasets

To demonstrate the performance of MetaNOR on both periodic and disordered materials, in the empirical experiments we generate two types of data from the DNS solver for each microstructure. For each sample, the total training domain is \(\Omega =[-50,50]\) and the training data is generated up to \(T = 2\). The spatial and temporal discretization parameters in the DNS solver are set to \(\Delta t = 0.01\) and \(\max {\left| \Delta x \right| } = 0.01\). The other physical parameters are set as \(L=0.2\), \(E_1=1\), \(\rho _1=\rho _2=1\), and \(\phi =0\). In experiment 1, we fix \(E_2 = 0.25\) and set the disorder parameter \(D\in [0.05,0.50]\). In experiment 2, we set \(E_2 \in [0.2025,0.7225]\) and the disorder parameter \(D=0\). The training and testing data are obtained from the DNS data via linear interpolation with \(\Delta t = 0.02\) and \(\Delta x = 0.05\). The two types of data are chosen to follow a similar setting as in [6]:

  1. Oscillating source. We let \({\hat{u}}(x,0) = \frac{\partial {\hat{u}}}{\partial t}(x,0) = 0\) and \(g(x,t) = \exp \left( -\left( \frac{2x}{5kL}\right) ^2\right) \exp \left( -\left( \frac{t-0.8}{0.8}\right) ^2\right) \cos ^2\left( \frac{2\pi x}{kL}\right) \), where \(k=1,2,\ldots ,20\).

  2. Plane wave. We set \(g(x,t) = 0\), \({\hat{u}}(x,-200) = 0\), and \(\frac{\partial {\hat{u}}}{\partial t}(-50,t) = \cos (\omega t)\). In experiment 1 (random microstructures), we set \(\omega = 0.20, 0.40, \dots , 4.0\). In experiment 2 (periodic microstructures), we set \(\omega = 0.30,0.60,\dots , 6.0\).

In these two types of loading scenarios, the displacement \({\hat{u}}(x,t)\) is mostly zero when \(x>10\), which makes the corresponding datapoints carry very little information. To utilize the sample datapoints more efficiently, for the type 1 data, we only use datapoints from the \(x\in [-10,10]\) region, and for the type 2 data we only use datapoints from the \(x\in [-38,-18]\) region.
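
For concreteness, the two loading types above can be written as the following short functions (a sketch with our own function names, not the authors' DNS code); here L is the unit-cell length, L = 0.2.

```python
import numpy as np

def oscillating_source(x, t, k, L=0.2):
    """Type-1 loading g(x, t) for wavenumber index k = 1, ..., 20."""
    return (np.exp(-(2.0 * x / (5.0 * k * L))**2)
            * np.exp(-((t - 0.8) / 0.8)**2)
            * np.cos(2.0 * np.pi * x / (k * L))**2)

def plane_wave_velocity(t, omega):
    """Type-2 loading: prescribed boundary velocity du/dt(-50, t) = cos(omega*t)."""
    return np.cos(omega * t)
```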

Validation dataset: wave packet

We create a validation dataset, denoted as the wave packet dataset, which considers a much longer bar (\(\Omega _{wp}=[-133.3,133.3]\)) and a 50-times-longer simulation time (\(t\in [0,100]\)). The material is subject to a loading condition different from the training dataset: \(g(x,t) = 0\) and \(\frac{\partial {\hat{u}}}{\partial t}(-133.3, t) = \sin (\omega t) \exp \left( -(t/5 -3)^2\right)\), for \(\omega =1,\, 2, \,3\). To provide a metric for the estimator accuracy, we calculate the averaged displacement error in the discretized energy norm at the last time step. This error metric is referred to as the “validation error”, which checks the stability and generalizability of the estimators.
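
A corresponding sketch of the wave-packet boundary velocity used for validation (again with an assumed function name):

```python
import numpy as np

def wave_packet_velocity(t, omega):
    """Validation loading: du/dt(-133.3, t) = sin(omega*t) * exp(-(t/5 - 3)^2), omega = 1, 2, 3."""
    return np.sin(omega * t) * np.exp(-(t / 5.0 - 3.0)**2)
```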

Application: Projectile impact simulations

To demonstrate the performance of the learnt model in long-term simulation, we simulate the long-term propagation of waves in this material due to the impact of a projectile. In this problem, a projectile hits the left end of the bar at time zero, generating a velocity wave that travels into the microstructure.

To demonstrate the generalization capability of our approach to different domains, boundary conditions, and longer simulation times, we consider a drastically different setting in this simulation task. In particular, a much larger domain, \(\Omega _{impact} = (-267,267)\), and a much longer simulation time, \(t\in [0,600]\), are considered. Since our training dataset is only generated up to \(T=2\), this long-term simulation task is particularly challenging, not only because its boundary condition setting differs from all training samples, but also due to the large ratio between the training and simulation time scales. On the left end of the domain, we prescribe the velocity as \(\frac{\partial {\hat{u}}}{\partial t}(-267,0) = 1\), and zero velocity elsewhere.

Cite this article

Zhang, L., You, H. & Yu, Y. MetaNOR: A meta-learnt nonlocal operator regression approach for metamaterial modeling. MRS Communications 12, 662–677 (2022). https://doi.org/10.1557/s43579-022-00250-0
