Abstract
Identifying unknown differential equations from a given set of discrete time-dependent data is a challenging problem. A small amount of noise can make the recovery unstable; nonlinearity and varying coefficients add further complexity. We assume that the governing partial differential equation (PDE) is a linear combination of a few differential terms in a prescribed dictionary, and the objective of this paper is to find the correct coefficients. We propose a new direction based on the fundamental convergence principle of numerical PDE schemes. We utilize Lasso for efficiency, and a performance guarantee is established based on an incoherence property. The main contribution is to validate and correct the results by the time evolution error (TEE). A new algorithm, called identifying differential equations with numerical time evolution (IDENT), is explored for data with non-periodic boundary conditions, noisy data, and PDEs with varying coefficients. Based on the recovery theory of Lasso, we propose a new definition of the noise-to-signal ratio, which better represents the level of noise in the case of PDE identification. The effects of data generation and downsampling are systematically analyzed and tested. For noisy data, we propose an order-preserving denoising method, called least-squares moving average (LSMA), to preprocess the given data. For the identification of PDEs with varying coefficients, we propose to add base element expansion (BEE) to aid the computation. Various numerical experiments, from basic tests to noisy data, downsampling effects, and varying coefficients, are presented.
References
Barth, T., Frederickson, P.: Higher order solution of the Euler equations on unstructured grids using quadratic reconstruction. In 28th Aerospace Sciences Meeting, p. 13 (1990)
Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, London (2014)
Bongard, J., Lipson, H.: Automated reverse engineering of nonlinear dynamical systems. Proc. Natl. Acad. Sci. 104(24), 9943–9948 (2007)
Bongini, M., Fornasier, M., Hansen, M., Maggioni, M.: Inferring interaction rules from observations of evolutive systems I: the variational approach. Math. Models Methods Appl. Sci. 27(05), 909–951 (2017)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Brunton, S.L., Proctor, J.L., Kutz, J.N.: Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. 113(15), 3932–3937 (2016)
Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489–509 (2006)
Donoho, D.L., Huo, X.: Uncertainty principles and ideal atomic decomposition. IEEE Trans. Inf. Theory 47(7), 2845–2862 (2001)
Fannjiang, A., Liao, W.: Coherence pattern-guided compressive sensing with unresolved grids. SIAM J. Imaging Sci. 5(1), 179–202 (2012)
Fuchs, J.-J.: On sparse representations in arbitrary redundant bases. IEEE Trans. Inf. Theory 50(6), 1341–1344 (2004)
Harten, A., Engquist, B., Osher, S., Chakravarthy, S.R.: Uniformly high order accurate essentially non-oscillatory schemes, III. J. Comput. Phys. 71(2), 231–303 (1987)
Hu, C., Shu, C.-W.: Weighted essentially non-oscillatory schemes on triangular meshes. J. Comput. Phys. 150(1), 97–127 (1999)
Kaiser, E., Kutz, J.N., Brunton, S.L.: Sparse identification of nonlinear dynamics for model predictive control in the low-data limit. Proc. R. Soc. A 474(2219), 20180335 (2018)
Khoo, Y., Ying, L.: Switchnet: a neural network model for forward and inverse scattering problems. arXiv preprint arXiv:1810.09675 (2018)
Liu, Y., Shu, C.-W., Tadmor, E., Zhang, M.: Central discontinuous Galerkin methods on overlapping cells with a non-oscillatory hierarchical reconstruction. SIAM J. Numer. Anal. 45, 2442–2467 (2007)
Loiseau, J.-C., Brunton, S.L.: Constrained sparse Galerkin regression. J. Fluid Mech. 838, 42–67 (2018)
Long, Z., Lu, Y., Ma, X., Dong, B.: PDE-Net: learning PDEs from data. arXiv preprint arXiv:1710.09668 (2017)
Lu, F., Zhong, M., Tang, S., Maggioni, M.: Nonparametric inference of interaction laws in systems of agents from trajectory data. arXiv preprint arXiv:1812.06003 (2018)
Lusch, B., Kutz, J.N., Brunton, S.L.: Deep learning for universal linear embeddings of nonlinear dynamics. Nat. Commun. 9(1), 4950 (2018)
Mangan, N.M., Kutz, J.N., Brunton, S.L., Proctor, J.L.: Model selection for dynamical systems via sparse regression and information criteria. Proc. R. Soc. A 473(2204), 20170009 (2017)
Qin, T., Wu, K., Xiu, D.: Data driven governing equations approximation using deep neural networks. arXiv preprint arXiv:1811.05537 (2018)
Raissi, M.: Deep hidden physics models: deep learning of nonlinear partial differential equations. arXiv preprint arXiv:1801.06637 (2018)
Raissi, M., Karniadakis, G.E.: Hidden physics models: machine learning of nonlinear partial differential equations. J. Comput. Phys. 357, 125–141 (2018)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics informed deep learning (part i): data-driven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561 (2017)
Rudy, S.H., Brunton, S.L., Proctor, J.L., Kutz, J.N.: Data-driven discovery of partial differential equations. Sci. Adv. 3(4), e1602614 (2017)
Schaeffer, H.: Learning partial differential equations via data discovery and sparse optimization. Proc. R. Soc. A Math. Phys. Eng. Sci. 473(2197), 20160446 (2017)
Schaeffer, H., Caflisch, R., Hauck, C.D., Osher, S.: Sparse dynamics for partial differential equations. Proc. Nat. Acad. Sci. 110(17), 6634–6639 (2013)
Schaeffer, H., Tran, G., Ward, R.: Extracting sparse high-dimensional dynamics from limited data. SIAM J. Appl. Math. 78(6), 3279–3295 (2018)
Schmidt, M., Lipson, H.: Distilling free-form natural laws from experimental data. Science 324(5923), 81–85 (2009)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 58, 267–288 (1996)
Tran, G., Ward, R.: Exact recovery of chaotic systems from highly corrupted data. Multiscale Model. Simul. 15(3), 1108–1129 (2017)
Tropp, J.A.: Just relax: convex programming methods for subset selection and sparse approximation. ICES report, 404 (2004)
Tropp, J.A.: Just relax: convex programming methods for identifying sparse signals in noise. IEEE Trans. Inf. Theory 52(3), 1030–1051 (2006)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 68(1), 49–67 (2006)
Zhang, S., Lin, G.: Robust data-driven discovery of governing physical laws with error bars. Proc. R. Soc. A 474(2217), 20180305 (2018)
S. H. Kang: Research is supported in part by Simons Foundation Grants 282311 and 584960.
W. Liao: Research is supported in part by the NSF Grants DMS 1818751 and DMS 2012652.
Y. Liu: Research is supported in part by NSF Grants DMS-1522585 and DMS-CDS&E-MSS-1622453.
Appendix A: Recovery Theory of Lasso with a Weighted \(L^1\) Norm
In the field of compressive sensing, performance guarantees for the recovery of sparse vectors from a small number of noisy linear measurements by Lasso have been established when the sensing matrix satisfies an incoherence property [8] or a restricted isometry property [7]. Here we establish a recovery guarantee of Lasso under an incoherence property in the setting of PDE identification, where a weighted \(L^1\) norm is used.
Given a sensing matrix \(\varPhi \in {\mathbb {R}}^{n \times m}\) and the noisy measurement
$$\begin{aligned} \mathbf {b}= \varPhi \mathbf {x}^{\mathrm{opt}} + \mathbf {e}, \end{aligned}$$
where \(\mathbf {x}^{\mathrm{opt}}\) is s-sparse (\(\Vert \mathbf {x}^{\mathrm{opt}}\Vert _0 =s\)), the goal is to recover \(\mathbf {x}^{\mathrm{opt}}\) in a robust way. Denote the support of \(\mathbf {x}^{\mathrm{opt}}\) by \(\varLambda \) and let \(\varPhi _\varLambda \) be the submatrix of \(\varPhi \) whose columns are restricted to \(\varLambda \). Suppose \(\varPhi = [\phi [1] \ \phi [2] \ \ldots \ \phi [m]]\) where all \(\phi [j]\)’s have unit norm. Let the mutual coherence of \(\varPhi \) be
$$\begin{aligned} \mu (\varPhi ) = \max _{j \ne k} |\langle \phi [j], \phi [k]\rangle |. \end{aligned}$$
The principle of Lasso with a weighted \(L^1\) norm is to solve
$$\begin{aligned} \mathbf {x}(\gamma ) = \arg \min _{\mathbf {x}\in {\mathbb {R}}^m} \frac{1}{2} \Vert \varPhi \mathbf {x}- \mathbf {b}\Vert _2^2 + \gamma \Vert W \mathbf {x}\Vert _1, \qquad \text {(W-Lasso)} \end{aligned}$$
where \(W = \mathrm{diag}(w_1,w_2,\ldots ,w_m), w_j \ne 0, j=1,\ldots ,m\) and \(\gamma \) is a balancing parameter. Let \(w_{\max } = \max _{j}|w_j|\) and \(w_{\min } = \min _{j}|w_j|\). Lasso successfully recovers the support of \(\mathbf {x}^\mathrm{opt}\) when \(\mu (\varPhi )\) is sufficiently small. The following proposition is a generalization of Theorem 8 in [33] from \(L^1\) norm regularization to weighted \(L^1\) norm regularization.
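As a concrete illustration of this setup (not part of the paper), the sketch below builds a random sensing matrix with unit-norm columns, computes its mutual coherence, and solves the weighted Lasso by ISTA, whose proximal step for \(\gamma \Vert W\mathbf {x}\Vert _1\) is per-coordinate soft-thresholding; the problem sizes, weights, noise level, and solver choice are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 200, 20, 3

# Sensing matrix Phi with unit-norm columns phi[j]
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=0)

# Mutual coherence: largest absolute inner product between distinct columns
G = np.abs(Phi.T @ Phi)
mu = np.max(G - np.diag(np.diag(G)))

# s-sparse ground truth x_opt and noisy measurement b = Phi x_opt + e
x_opt = np.zeros(m)
support = rng.choice(m, size=s, replace=False)
x_opt[support] = rng.choice([-1.0, 1.0], size=s) * rng.uniform(1.0, 2.0, size=s)
b = Phi @ x_opt + 1e-3 * rng.standard_normal(n)

# Weighted Lasso via ISTA: gradient step on the quadratic term,
# then soft-thresholding with per-coordinate threshold step * gamma * w_j
w = rng.uniform(0.8, 1.2, size=m)   # diagonal entries of W
gamma = 0.01
step = 1.0 / np.linalg.norm(Phi, 2) ** 2
x = np.zeros(m)
for _ in range(5000):
    z = x - step * (Phi.T @ (Phi @ x - b))
    x = np.sign(z) * np.maximum(np.abs(z) - step * gamma * w, 0.0)

print("mu(Phi) =", round(mu, 3))
print("support recovered:", set(np.flatnonzero(np.abs(x) > 0.1)) == set(support))
```

With small coherence and noise well below the smallest nonzero coefficient, the computed minimizer carries the correct support, which is the behavior Proposition 1 quantifies.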
Proposition 1
Suppose the support of \(\mathbf {x}^{\mathrm{opt}}\), denoted by \(\varLambda \), contains no more than s indices, \(\mu (s-1)<1\) and
$$\begin{aligned} w_{\min }[1-\mu (s-1)] - w_{\max }\,\mu s > 0. \end{aligned}$$
Let
and \(\mathbf {x}(\gamma )\) be the minimizer of (W-Lasso). Then
1) the support of \(\mathbf {x}(\gamma )\) is contained in \(\varLambda \);

2) the distance between \(\mathbf {x}(\gamma )\) and \(\mathbf {x}^{\mathrm{opt}}\) satisfies
$$\begin{aligned} \Vert \mathbf {x}(\gamma ) - \mathbf {x}^{\mathrm{opt}}\Vert _\infty \le \frac{w_{\max }}{w_{\min }[1- \mu (s-1)] - w_{\max } \mu s}\Vert \mathbf {e}\Vert _2; \end{aligned}$$(21)

3) if
$$\begin{aligned} \mathbf {x}^\mathrm{opt}_{\min } := \min _{j \in \varLambda } |x_j^\mathrm{opt}| > \frac{w_{\max }}{w_{\min }[1 -\mu (s-1)] - w_{\max } \mu s}\Vert \mathbf {e}\Vert _2, \end{aligned}$$
then \(\mathrm{supp}(\mathbf {x}(\gamma )) = \varLambda \).
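The bound (21) is meaningful only when the denominator \(w_{\min }[1 -\mu (s-1)] - w_{\max } \mu s\) is positive, i.e. when \(\varPhi \) is sufficiently incoherent relative to the sparsity and the spread of the weights. A quick numerical check of this condition (with assumed, illustrative sizes and nearly orthogonal random columns, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, s = 5000, 10, 3

# Tall random matrix with unit-norm columns: coherence ~ 1/sqrt(n)
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=0)
G = np.abs(Phi.T @ Phi)
mu = np.max(G - np.diag(np.diag(G)))   # mutual coherence mu(Phi)

w = rng.uniform(0.8, 1.2, size=m)      # diagonal of W
w_max, w_min = w.max(), w.min()

# Denominator of (21) and the resulting error-amplification constant
denom = w_min * (1.0 - mu * (s - 1)) - w_max * mu * s
C = w_max / denom
print(f"mu = {mu:.3f}, denominator = {denom:.3f}, constant = {C:.3f}")
```

For this incoherent regime the denominator is comfortably positive, so (21) yields a finite error bound; as \(\mu s\) grows toward \(w_{\min }/w_{\max }\), the constant blows up.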
Proof
Under the condition \(\mu (s-1)<1\), \(\varLambda \) indexes a linearly independent collection of columns of \(\varPhi \). Let \(\mathbf {x}^\star \) be the minimizer of (W-Lasso) over all vectors supported on \(\varLambda \). A necessary and sufficient condition on such a minimizer is that
$$\begin{aligned} \varPhi _\varLambda ^* \left( \varPhi _\varLambda \mathbf {x}^\star _\varLambda - \mathbf {b}\right) + \gamma \, \mathbf {g}_\varLambda = \mathbf {0}, \end{aligned}$$
where \(\mathbf {g}\in \partial \Vert W\mathbf {x}^\star \Vert _1\), meaning \(g_j = w_j\,\mathrm{sign}(x^\star _j)\) whenever \(x^\star _j \ne 0\) and \(|g_j| \le w_j\) whenever \(x^\star _j = 0\). It follows that \(\Vert \mathbf {g}\Vert _\infty \le w_{\max }\) and
$$\begin{aligned} \Vert \mathbf {x}^\star - \mathbf {x}^{\mathrm{opt}}\Vert _\infty \le \Vert (\varPhi _\varLambda ^*\varPhi _\varLambda )^{-1}\Vert _{\infty ,\infty } \left( \Vert \mathbf {e}\Vert _2 + \gamma \, w_{\max } \right) . \end{aligned}$$(23)
Next we prove \(\mathbf {x}^\star \) is also the global minimizer of (W-Lasso) by demonstrating that the objective function increases when we change any other component of \(\mathbf {x}^\star \). Let
$$\begin{aligned} L(\mathbf {x}) = \frac{1}{2} \Vert \varPhi \mathbf {x}- \mathbf {b}\Vert _2^2 + \gamma \Vert W\mathbf {x}\Vert _1 \end{aligned}$$
be the objective function of (W-Lasso).
Choose an index \(\omega \notin \varLambda \) and let \(\delta \) be a nonzero scalar. We will develop a condition which ensures that
$$\begin{aligned} L(\mathbf {x}^\star + \delta \mathbf {e}_\omega ) - L(\mathbf {x}^\star ) > 0, \end{aligned}$$
where \(\mathbf {e}_\omega \) is the \(\omega \)th standard basis vector. Notice that
According to [10, 32], \(\max _{\omega \notin \varLambda } \Vert \varPhi _\varLambda ^\dagger \phi [\omega ]\Vert <\frac{\mu s}{1-\mu (s-1)}\). A sufficient condition to guarantee \(L(\mathbf {x}^\star + \delta \mathbf {e}_\omega ) - L(\mathbf {x}^\star )>0\) is
which gives rise to (20). This establishes that \(\mathbf {x}^\star \) is the global minimizer of (W-Lasso). The bound (21) follows from (23) along with \(\Vert (\varPhi _\varLambda ^*\varPhi _\varLambda )^{-1}\Vert _{\infty ,\infty } \le [1-\mu (s-1)]^{-1}\). \(\square \)
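The two matrix inequalities invoked in this proof, \(\Vert (\varPhi _\varLambda ^*\varPhi _\varLambda )^{-1}\Vert _{\infty ,\infty } \le [1-\mu (s-1)]^{-1}\) and the bound on \(\max _{\omega \notin \varLambda } \Vert \varPhi _\varLambda ^\dagger \phi [\omega ]\Vert \) from [10, 32], can be spot-checked numerically. The sketch below uses an assumed random matrix and interprets the latter norm as the \(\ell ^1\) norm, as in Tropp's exact recovery condition:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, s = 5000, 12, 4

# Unit-norm columns; mutual coherence mu
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=0)
G = np.abs(Phi.T @ Phi)
mu = np.max(G - np.diag(np.diag(G)))

# Random support Lambda of size s and its complement
Lam = np.sort(rng.choice(m, size=s, replace=False))
others = np.setdiff1d(np.arange(m), Lam)
Phi_L = Phi[:, Lam]

# Check || (Phi_L^* Phi_L)^{-1} ||_{inf,inf} <= 1 / (1 - mu (s-1))
inv = np.linalg.inv(Phi_L.T @ Phi_L)
lhs1 = np.max(np.sum(np.abs(inv), axis=1))   # max absolute row sum
rhs1 = 1.0 / (1.0 - mu * (s - 1))

# Check max_{omega not in Lambda} || Phi_L^dagger phi[omega] ||_1 <= mu s / (1 - mu (s-1))
pinv = np.linalg.pinv(Phi_L)
lhs2 = max(np.sum(np.abs(pinv @ Phi[:, om])) for om in others)
rhs2 = mu * s / (1.0 - mu * (s - 1))

print(lhs1 <= rhs1, lhs2 <= rhs2)
```

Both inequalities are theorems whenever \(\mu (s-1) < 1\), so the check passes for any matrix in that regime; the code only makes the constants concrete.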
We prove Theorem 1 based on Proposition 1.
Proof of Theorem 1
Suppose \(\widehat{F}_{\mathrm{unit}}\) is obtained from \(\widehat{F}\) by normalizing each column to unit \(L^2\) norm, and let \(W \in {\mathbb {R}}^{N_3 \times N_3}\) be the diagonal matrix with \(W_{jj} =\Vert \widehat{F}[j]\Vert _\infty \Vert \widehat{F}[j]\Vert _2^{-1}\). The Lasso we solve is equivalent to
where \(\mathbf {z}= W \mathbf {y}\), \(\mathbf {y}^{\mathrm{opt}}_j = \mathbf {a}_j \Vert \widehat{F}[j]\Vert _2\) and \(\mathbf {e}= {\widehat{\mathbf {b}}} - \widehat{F}_{\mathrm{unit}} \mathbf {y}^{\mathrm{opt}} \). Then we apply Proposition 1. The choice of balancing parameters in (20) suggests
which gives rise to (11). The error bound in (21) gives
which implies
which yields (12). \(\square \)
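The change of variables underlying this proof is elementary algebra and can be sanity-checked numerically: with \(y_j = a_j \Vert \widehat{F}[j]\Vert _2\), the data-fidelity term is unchanged and the weighted penalty \(\Vert W\mathbf {y}\Vert _1\) reduces to \(\sum _j \Vert \widehat{F}[j]\Vert _\infty |a_j|\). In the sketch below, \(F\), \(a\), and the sizes are synthetic stand-ins, not the paper's feature matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 50, 8

# Badly scaled dictionary: columns with very different magnitudes
F = rng.standard_normal((n, m)) * rng.uniform(0.1, 10.0, size=m)
a = rng.standard_normal(m)

col2 = np.linalg.norm(F, axis=0)        # ||F[j]||_2
colinf = np.max(np.abs(F), axis=0)      # ||F[j]||_inf
F_unit = F / col2                       # columns normalized to unit L2 norm
W = colinf / col2                       # diagonal of W, as in the proof

y = a * col2                            # change of variables y_j = a_j ||F[j]||_2

# Same data-fidelity term ...
assert np.allclose(F @ a, F_unit @ y)
# ... and the weighted penalty ||W y||_1 equals sum_j ||F[j]||_inf |a_j|
assert np.allclose(np.sum(W * np.abs(y)), np.sum(colinf * np.abs(a)))
print("equivalence holds")
```

This is why the unit-column analysis of Proposition 1 transfers directly to the original, badly scaled dictionary.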
Kang, S.H., Liao, W. & Liu, Y. IDENT: Identifying Differential Equations with Numerical Time Evolution. J Sci Comput 87, 1 (2021). https://doi.org/10.1007/s10915-020-01404-9