Abstract
We propose MetaNOR, a meta-learnt approach for transfer learning of operators based on nonlocal operator regression. The overall goal is to efficiently provide surrogate models for new and unknown material learning tasks with different microstructures. The algorithm consists of two phases: (1) learning a common nonlocal kernel representation from existing tasks; (2) transferring the learned knowledge and rapidly learning surrogate operators for unseen tasks with a different material, where only a few test samples are required. We apply MetaNOR to model wave propagation within 1D metamaterials, showing substantial improvement in sampling efficiency for new materials.
Data availability
The data that support the findings of this study are available from the corresponding author, Yue Yu, upon reasonable request.
References
S. Yao, X. Zhou, G. Hu, Experimental study on negative effective mass in a 1D mass-spring system. New J. Phys. 10(4), 043020 (2008)
H. Huang, C. Sun, Wave attenuation mechanism in an acoustic metamaterial with negative effective mass density. New J. Phys. 11(1), 013003 (2009)
J.M. Manimala, H.H. Huang, C. Sun, R. Snyder, S. Bland, Dynamic load mitigation using negative effective mass structures. Eng. Struct. 80, 458–468 (2014)
Q. Du, B. Engquist, X. Tian, Multiscale modeling, homogenization and nonlocal effects: Mathematical and computational issues (2019)
L. Barker, A model for stress wave propagation in composite materials. J. Compos. Mater. 5(2), 140–162 (1971)
H. You, Y. Yu, S. Silling, M. D’Elia, Data-driven learning of nonlocal models: from high-fidelity simulations to constitutive laws. (MLPS, AAAI Spring Symposium, 2021)
S.A. Silling, Propagation of a stress pulse in a heterogeneous elastic bar. J. Peridyn. Nonlocal Model. 3, 1–21 (2021)
K. Deshmukh, T. Breitzman, K. Dayal, Multiband homogenization of metamaterials in real-space: higher-order nonlocal models and scattering at external surface. Preprint (2021)
H. You, Y. Yu, N. Trask, M. Gulian, M. D’Elia, Data-driven learning of robust nonlocal physics from high-fidelity synthetic data. Comput. Methods Appl. Mech. Eng. 374, 113553 (2021)
H. You, Y. Yu, S. Silling, M. D’Elia, A data-driven peridynamic continuum model for upscaling molecular dynamics. Comput. Methods Appl. Mech. Eng. 389, 114400 (2022)
N. Tripuraneni, C. Jin, M. Jordan, Provable meta-learning of linear representations. In: International Conference on Machine Learning, pp. 10434–10443 (2021). PMLR
M. Beran, J. McCoy, Mean field variations in a statistical sample of heterogeneous linearly elastic solids. Int. J. Solids Struct. 6(8), 1035–1054 (1970)
S.A. Silling, Origin and effect of nonlocality in a composite. J. Mech. Mater. Struct. 9(2), 245–258 (2014)
Q. Du, B. Engquist, X. Tian, Multiscale modeling, homogenization and nonlocal effects: mathematical and computational issues. Contemp. Math. 754, 18 (2020)
Q. Du, Y. Tao, X. Tian, A peridynamic model of fracture mechanics with bond-breaking. J. Elast. 25, 1–22 (2017)
R. Ge, C. Jin, Y. Zheng, No spurious local minima in nonconvex low rank problems: a unified geometric analysis. In: International Conference on Machine Learning, pp. 1233–1242 (2017). PMLR
Y. Liu, T. Zhao, W. Ju, S. Shi, Materials discovery and design using machine learning. J. Mater. 3(3), 159–177 (2017)
S. Lu, Q. Zhou, Y. Ouyang, Y. Guo, Q. Li, J. Wang, Accelerated discovery of stable lead-free hybrid organic-inorganic perovskites via machine learning. Nat. Commun. 9(1), 1–8 (2018)
K.T. Butler, D.W. Davies, H. Cartwright, O. Isayev, A. Walsh, Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
J. Cai, X. Chu, K. Xu, H. Li, J. Wei, Machine learning-driven new material discovery. Nanoscale Adv. 2(8), 3115–3130 (2020)
Y. Iwasaki, I. Takeuchi, V. Stanev, A.G. Kusne, M. Ishida, A. Kirihara, K. Ihara, R. Sawada, K. Terashima, H. Someya et al., Machine-learning guided discovery of a new thermoelectric material. Sci. Rep. 9(1), 1–7 (2019)
F. Ren, L. Ward, T. Williams, K.J. Laws, C. Wolverton, J. Hattrick-Simpers, A. Mehta, Accelerated discovery of metallic glasses through iteration of machine learning and high-throughput experiments. Sci. Adv. 4(4), 1566 (2018)
K. Kaufmann, D. Maryanovsky, W.M. Mellor, C. Zhu, A.S. Rosengarten, T.J. Harrington, C. Oses, C. Toher, S. Curtarolo, K.S. Vecchio, Discovery of high-entropy ceramics via machine learning. NPJ Comput. Mater. 6(1), 1–9 (2020)
A. Chen, X. Zhang, Z. Zhou, Machine learning: accelerating materials development for energy storage and conversion. InfoMat 2(3), 553–576 (2020)
C. Finn, P. Abbeel, S. Levine, Model-agnostic meta-learning for fast adaptation of deep networks. In: International Conference on Machine Learning, pp. 1126–1135 (2017). PMLR
Y.L. Qiu, H. Zheng, A. Devos, H. Selby, O. Gevaert, A meta-learning approach for genomic survival analysis. Nat. Commun. 11(1), 1–11 (2020)
Y. Chen, C. Guan, Z. Wei, X. Wang, W. Zhu, MetaDelta: a meta-learning system for few-shot image classification. arXiv preprint arXiv:2102.10744 (2021)
W. Yin, Meta-learning for few-shot natural language processing: a survey. arXiv preprint arXiv:2007.09604 (2020)
B. Kailkhura, B. Gallagher, S. Kim, A. Hiszpanski, T.Y.J. Han, Reliable and explainable machine-learning methods for accelerated material discovery. NPJ Comput. Mater. 5(1), 1–9 (2019)
H. Mai, T.C. Le, T. Hisatomi, D. Chen, K. Domen, D.A. Winkler, R.A. Caruso, Use of meta models for rapid discovery of narrow bandgap oxide photocatalysts. iScience 24, 103068 (2021)
Acknowledgments
The authors would like to thank Dr. Stewart Silling for sharing the DNS codes and for helpful discussions.
Funding
The authors are supported by the National Science Foundation under award DMS 1753031 and the AFOSR grant FA9550-22-1-0197. Portions of this research were conducted on Lehigh University's Research Computing infrastructure, partially supported by NSF Award 2019035.
Contributions
LZ: methodology, validation, formal analysis, investigation, data curation, visualization; HY: methodology, investigation, data curation; YY: conceptualization, resources, funding acquisition, writing, supervision, project administration.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Appendices
Appendix A: Related works
Material discovery
Using machine learning techniques for material discovery is gaining increasing attention in the scientific community.[17,18,19,20] These techniques have been applied to materials such as thermoelectric materials,[21] metallic glasses,[22] high-entropy ceramics,[23] and so on. Learning models for metamaterials has also gained popularity with recent approaches such as nonlocal operator regression.[9] For a comprehensive review of machine learning techniques for property prediction and materials development in energy-related fields, we refer interested readers to the review.[24]
Meta-learning
Meta-learning seeks to design algorithms that can utilize previous experience to rapidly learn new skills or adapt to new environments. There is a vast literature proposing meta-learning methods,[25] which have been applied to patient survival analysis,[26] few-shot image classification,[27] and natural language processing,[28] to name a few. Recently, provably generalizable algorithms with sharp guarantees in the linear setting were first provided.[11]
Transfer and meta-learning for material modeling
Despite its popularity, few works have studied material discovery in a meta- or even transfer-learning setting. Reference [29] proposes a transfer-learning technique that exploits correlations among different material properties, augmenting the features with predicted material properties to improve regression performance. Reference [30] uses an ensemble of models together with a meta-model to help discover candidate water-splitting photocatalysts. To the best of our knowledge, our work is the first application of transfer or meta-learning to heterogeneous material homogenization and discovery.
Appendix B: Detailed proof for the error bounds
In this section, we review two main lemmas from [11], which provide a theoretical prediction error bound for the meta-learning of linear representation models as illustrated in (16) and (17). We then employ these two lemmas to detail the proof of Theorem 3.4, which provides the error bound for the meta-learning of kernel representations and the resultant prediction tasks.
Lemma B.1
([11], Theorem 2) Assume that we are in the uniform task sampling model. If the number of meta-train samples \(N_{\mathrm{{train}}}\) satisfies \(N_{\mathrm{{train}}}\gtrsim \text {polylog}(N_{\mathrm{{train}}},d,H)(\kappa M)^4\max \{H,d\}\), then for any local minimum of the optimization objective (18), the column space of \(\mathbf {V}^*\), spanned by the orthonormal feature matrix \(\mathbf {B}^\circ\), satisfies
with probability at least \(1-1/\text {poly}(d).\)
Note that Assumption 3.2 guarantees that \(\nu \ge \Omega (1/M)\), and the above lemma yields
with probability at least \(1-1/\text {poly}(d)\).
Lemma B.2
([11], Theorem 4) Suppose the parameter associated with the new task satisfies \({\left| \left| \mathbf {C}_{\mathrm{{test}}} \right| \right| }_{l^2}\le O(1)\). If an estimate \(\mathbf {B}^{\circ }\) of the true feature matrix \(\mathbf {B}\) satisfies \(\sin \theta (\mathbf {B}^{\circ },\mathbf {B})\le \varpi\) and \(N_{\mathrm{{test}}}\gtrsim M\log N_{\mathrm{{test}}}\), then the output parameter \(\mathbf {C}_{\mathrm{{test}}}^\circ\) from (20) satisfies
with probability at least \(1-O(N_{\mathrm{{test}}}^{-100}).\)
Combining Lemma B.2 with Lemma B.1, we obtain the following result for applying (18) and (20) as a linear feature meta-learning algorithm:
with probability at least \(1-O((\text {poly}(d))^{-1}+N_{\mathrm{{test}}}^{-100})\).
We now proceed to provide the proof of Theorem 3.4. In the following, we use \(C\) to denote a generic constant that is independent of \(\Delta x\), \(\Delta t\), M, H, \(N_{\mathrm{{train}}}\), and \(N_{\mathrm{{test}}}\), but may depend on \(\delta\).
Proof
With (4), we immediately obtain the \(l^2([0,\delta ])\) error estimate for the learnt kernel \(\gamma ^\circ _{\mathrm{{test}}}\) as
with probability at least \(1-O((\text {poly}(d))^{-1}+N_{\mathrm{{test}}}^{-100})\).
For the error bound in the discretized energy norm, we notice that the ground-truth solution \({\hat{u}}\) satisfies:
for all \(x\in \chi _{\mathrm{{pred}}}\), \(n=1,\ldots ,\lfloor T_{\mathrm{{pred}}}/\Delta t\rfloor\). Subtracting (22) from this equation and denoting \(e_i^n:={\hat{u}}(x_i,t^n)-{\bar{u}}^n(x_i)\), we obtain
where
With Assumption 3.3, the truncation error of the Riemann sum satisfies \({\left| {\mathcal {L}}_{\theta _{\mathrm{{test}}}}{\hat{u}}(x,t) -{\mathcal {L}}_{\theta _{\mathrm{{test}}},h}{\hat{u}}(x,t) \right| }\le C\Delta x\) for a constant \(C\) independent of \(\Delta x\) and \(\Delta t\) that may depend on \(\delta\). Similarly, the truncation error of the central difference scheme satisfies \({\left| \ddot{{\hat{u}}}(x_i,t^{n})-\dfrac{\partial ^2 {\hat{u}}}{\partial t^2}(x_i,t^n) \right| }\le C(\Delta t)^2\) with the constant \(C\) independent of \(\Delta x\), \(\Delta t\), and \(\delta\). Moreover, (4) yields
with probability at least \(1-O((\text {poly}(d))^{-1}+N_{\mathrm{{test}}}^{-100})\). Hence we have the bound for \(\epsilon _{\mathrm{{all}}}\):
To show the \(l^2({\Omega })\) error for \(e_i^n\), we first derive a bound in the (discretized) energy norm. Multiplying (5) by \(\frac{e_i^{n+1}-e_i^{n}}{\Delta t}\) and summing over \(\chi _{\mathrm{{pred}}}=\{x_i\}_{i=1}^{L_{\mathrm{{pred}}}}\) yields:
Using the identity \(a(a-b)=\frac{1}{2}(a^2-b^2+(a-b)^2)\), we can rewrite the left-hand side as
For the first term on the right-hand side, using the identity \(a(b-a)=\frac{1}{2}(b^2-a^2-(a-b)^2)\), Assumption 3.3, and the exact Dirichlet-type boundary condition, i.e., \(e_i^n=0\) for \(i<1\) and \(i>L_{\mathrm{{pred}}}\), we have
For the second term on the right-hand side, we have
Putting the above three inequalities together, we obtain
With the discrete Gronwall lemma and the bound of \(\Delta t\) in Assumption 3.3, for \(n=1,\ldots ,\lfloor T_{\mathrm{{pred}}}/\Delta t\rfloor\) we have
with probability at least \(1-O((\text {poly}(d))^{-1}+N_{\mathrm{{test}}}^{-100})\), which provides the error bound in the discrete energy norm. \(\square\)
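For the reader's convenience, the discrete Gronwall lemma invoked above can be stated in one standard form (written here with generic symbols \(a_n\), \(b\), \(T\) that do not appear elsewhere in the text, and assuming the hypotheses hold for all steps up to time \(T\)):
\[
a_{n+1} \le (1+C\Delta t)\,a_n + \Delta t\, b \quad \text{for all } n
\quad\Longrightarrow\quad
a_n \le e^{CT}\bigl(a_0 + T b\bigr) \quad \text{for } n\Delta t\le T.
\]
This follows by iterating the recursion and bounding \((1+C\Delta t)^n\le e^{C n\Delta t}\le e^{CT}\).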
Appendix C: Reduction of two physics constraints
In this section we further expand the discussion on the physics-based constraints in the “Prediction error bounds” section. The overall strategy is to fix the last two polynomial features,
and
into the set of basis polynomials. We note that these two polynomials satisfy
and
Then (24) becomes
and (26) becomes
Denoting
and
then
Substituting this equation into the loss function in (8), for each \(x_i\) we obtain
where
\(\tilde{\mathbf {s}}_{k,i}^{n}:=(\mathbf {I}-\Delta x[\mathbf {b}_{M-1},\mathbf {b}_{M}]\Lambda ^{-1}\mathbf {H})^T\mathbf {s}_{k,i}^{n}\), \(\tilde{\mathbf {C}}:=[C_1,\ldots ,C_{M-2}]\), and \(\mathbf {I}\) is the \(d\times d\) identity matrix. Therefore, the analysis and algorithm can also be extended to the “constrained” cases.
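As a concrete, purely illustrative sketch, the feature transformation \(\tilde{\mathbf {s}}_{k,i}^{n}=(\mathbf {I}-\Delta x[\mathbf {b}_{M-1},\mathbf {b}_{M}]\Lambda ^{-1}\mathbf {H})^T\mathbf {s}_{k,i}^{n}\) amounts to a single matrix-vector product. The shapes below (feature vectors of length \(d\), a \(2\times 2\) block \(\Lambda\), and a \(2\times d\) matrix \(\mathbf {H}\)) are our assumptions chosen for dimensional consistency, and all inputs are random placeholders rather than quantities from the paper:

```python
import numpy as np

# Illustrative sketch of the constrained feature transformation
#   s_tilde = (I - dx * B2 @ Lambda^{-1} @ H)^T @ s,
# with hypothetical shapes: s is a d-vector of polynomial features,
# B2 = [b_{M-1}, b_M] is d x 2, Lambda is 2 x 2, H is 2 x d.

def constrained_features(s, B2, Lam, H, dx):
    """Map raw features s to constrained features s_tilde."""
    d = s.shape[0]
    T = np.eye(d) - dx * B2 @ np.linalg.solve(Lam, H)  # I - dx * B2 Lambda^{-1} H
    return T.T @ s

rng = np.random.default_rng(0)
d, dx = 6, 0.05
s = rng.standard_normal(d)
B2 = rng.standard_normal((d, 2))
Lam = np.eye(2) + 0.1 * rng.standard_normal((2, 2))
H = rng.standard_normal((2, d))
s_tilde = constrained_features(s, B2, Lam, H, dx)
print(s_tilde.shape)
```

Note that for \(\Delta x = 0\) the transformation reduces to the identity, which is a convenient sanity check of an implementation.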
Appendix D: Detailed parameter and experiment settings
Meta-train and meta-test datasets
To demonstrate the performance of MetaNOR on both periodic and disordered materials, in the empirical experiments we generate two types of data from the DNS solver for each microstructure. For each sample, the total training domain is \(\Omega =[-50,50]\) and the training data is generated up to \(T = 2\). The spatial and temporal discretization parameters in the DNS solver are set to \(\Delta t = 0.01\) and \(\max {\left| \Delta x \right| } = 0.01\). The other physical parameters are set as \(L=0.2\), \(E_1=1\), \(\rho _1=\rho _2=1\), and \(\phi =0\). In experiment 1, we fix \(E_2 = 0.25\) and set the disorder parameter \(D\in [0.05,0.50]\). In experiment 2, we set \(E_2 \in [0.2025,0.7225]\) and the disorder parameter \(D=0\). The training and testing data are obtained from the DNS data via linear interpolation with \(\Delta t = 0.02\) and \(\Delta x = 0.05\). The two types of data are chosen to follow a similar setting as in [6]:
1. Oscillating source. We let \({\hat{u}}(x,0) = \frac{\partial {\hat{u}}}{\partial t}(x,0) = 0\) and \(g(x,t) = \exp \left(-\left(\frac{2x}{5kL}\right)^2\right)\exp \left(-\left(\frac{t-0.8}{0.8}\right)^2\right)\cos ^2\left(\frac{2\pi x}{kL}\right)\), where \(k=1,2,\ldots ,20\).
2. Plane wave. We set \(g(x,t) = 0\), \({\hat{u}}(x,-200) = 0\), and \(\frac{\partial {\hat{u}}}{\partial t}(-50,t) = \cos (\omega t)\). In experiment 1 (random microstructures), we set \(\omega = 0.20, 0.40, \dots , 4.0\). In experiment 2 (periodic microstructures), we set \(\omega = 0.30,0.60,\dots , 6.0\).
In these two types of loading scenarios, the displacement \({\hat{u}}(x,t)\) is mostly zero when \(x>10\), which makes the corresponding datapoints carry very little information. To utilize the sample datapoints more efficiently, for the type 1 data, we only use datapoints from the \(x\in [-10,10]\) region, and for the type 2 data we only use datapoints from the \(x\in [-38,-18]\) region.
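As an illustration, the type-1 loading and the restriction to the informative region can be reproduced in a few lines. Only the formula for \(g\), the domain, and the interpolated grid spacings (\(\Delta t = 0.02\), \(\Delta x = 0.05\)) are taken from the text; the variable names are our own:

```python
import numpy as np

# Type-1 ("oscillating source") loading:
#   g(x,t) = exp(-(2x/(5kL))^2) * exp(-((t-0.8)/0.8)^2) * cos^2(2*pi*x/(kL)),
# sampled on the interpolated grid, then restricted to x in [-10, 10].

L = 0.2

def g_oscillating(x, t, k):
    envelope_x = np.exp(-(2.0 * x / (5.0 * k * L)) ** 2)
    envelope_t = np.exp(-((t - 0.8) / 0.8) ** 2)
    return envelope_x * envelope_t * np.cos(2.0 * np.pi * x / (k * L)) ** 2

x = np.arange(-50.0, 50.0 + 1e-9, 0.05)
t = np.arange(0.0, 2.0 + 1e-9, 0.02)
G = g_oscillating(x[None, :], t[:, None], k=1)

# keep only the informative region x in [-10, 10]
mask = (x >= -10.0 - 1e-9) & (x <= 10.0 + 1e-9)
G_used = G[:, mask]
print(G.shape, G_used.shape)
```

The masking step mirrors the sampling strategy described above; for the type-2 data one would instead keep \(x\in [-38,-18]\).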
Validation dataset: wave packet
We create a validation dataset, denoted the wave-packet dataset, which considers a much longer bar (\(\Omega _{wp}=[-133.3,133.3]\)) and a 50-times-longer simulation time (\(t\in [0,100]\)). The material is under a loading condition different from the training dataset: \(g(x,t) = 0\) and \(\frac{\partial {\hat{u}}}{\partial t}(-133.3, t) = \sin (\omega t) \exp \left( -(t/5 -3)^2\right)\), for \(\omega =1,\, 2, \,3\). To provide a metric for estimator accuracy, we calculate the averaged displacement error in the discretized energy norm at the last time step. This error metric, referred to as the “validation error”, checks the stability and generalizability of the estimators.
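The inlet velocity signal of the wave-packet dataset can be sketched as follows (only the formula is from the text; the variable names and the sampling rate are ours). The Gaussian envelope centers the packet around \(t = 15\) and makes the signal effectively compactly supported:

```python
import numpy as np

# Wave-packet inlet velocity from the validation dataset:
#   du/dt(-133.3, t) = sin(w t) * exp(-(t/5 - 3)^2),  w in {1, 2, 3}.

def inlet_velocity(t, omega):
    return np.sin(omega * t) * np.exp(-((t / 5.0) - 3.0) ** 2)

t = np.arange(0.0, 100.0, 0.02)
for omega in (1.0, 2.0, 3.0):
    v = inlet_velocity(t, omega)
    print(omega, float(np.abs(v).max()))
```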
Application D: Projectile impact simulations
To demonstrate the performance of the learnt model in long-term simulations, we simulate the long-term propagation of waves in this material due to the impact of a projectile. In particular, a projectile hits the left end of the bar at time zero, generating a velocity wave that travels into the microstructure.
To demonstrate the generalization capability of our approach on different domains, boundary conditions, and longer simulation times, we consider a drastically different setting in this simulation task. In particular, a much larger domain, \(\Omega _{impact} = (-267,267)\), and a much longer simulation time, \(t\in [0,600]\), are considered. Since our training dataset is only generated up to \(T=2\), this long-term simulation task is particularly challenging, not only because its boundary condition differs from all training samples, but also because of the large ratio between the training time scale and the simulation time scale. On the left end of the domain, we prescribe the velocity as \(\frac{\partial {\hat{u}}}{\partial t}(-267,0) = 1\), and zero velocity elsewhere.
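For intuition only, the following is a minimal sketch of how such an impact simulation could be advanced in time, combining a central difference in time with a Riemann-sum discretization of a nonlocal operator (the two ingredients used in Appendix B). The Gaussian kernel, domain size, horizon, and step counts here are placeholders, not the learnt kernel or the authors' solver:

```python
import numpy as np

# Sketch (not the authors' code) of a central-difference-in-time,
# Riemann-sum-in-space update for a nonlocal wave model
#   u_tt(x) = dx * sum_{0<|j|<=r} gamma(|j*dx|) * (u(x + j*dx) - u(x)),
# with a prescribed unit inlet velocity at the left end.

dx, dt, delta = 0.05, 0.02, 1.0
x = np.arange(-10.0, 10.0 + 1e-9, dx)
r = int(round(delta / dx))                      # horizon in grid points
offsets = np.array([j for j in range(-r, r + 1) if j != 0])
gamma = np.exp(-(np.abs(offsets) * dx) ** 2)    # placeholder kernel values

def nonlocal_op(u):
    """Riemann-sum discretization of the nonlocal operator (zero padding outside)."""
    out = np.zeros_like(u)
    for j, g in zip(offsets, gamma):
        shifted = np.roll(u, -j)
        # zero-fill values rolled in from the other side of the array
        if j > 0:
            shifted[-j:] = 0.0
        else:
            shifted[:-j] = 0.0
        out += g * (shifted - u)
    return dx * out

u_prev = np.zeros_like(x)
u = np.zeros_like(x)
for n in range(100):
    u_next = 2 * u - u_prev + dt**2 * nonlocal_op(u)
    u_next[0] = u[0] + dt * 1.0    # prescribe unit velocity at the left end
    u_prev, u = u, u_next
print(u[0])
```

After 100 steps the left-end displacement has grown linearly, \(u(-10)\approx 100\,\Delta t = 2\), while the generated wave propagates into the interior.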
Cite this article
Zhang, L., You, H. & Yu, Y. MetaNOR: A meta-learnt nonlocal operator regression approach for metamaterial modeling. MRS Communications 12, 662–677 (2022). https://doi.org/10.1557/s43579-022-00250-0