Numerical Computation of Partial Differential Equations by Hidden-Layer Concatenated Extreme Learning Machine

Published in: Journal of Scientific Computing

Abstract

Extreme learning machine (ELM) is a type of randomized neural network originally developed for linear classification and regression problems in the mid-2000s, and it has recently been extended to computational partial differential equations (PDEs). This method can yield highly accurate solutions to linear/nonlinear PDEs, but requires the last hidden layer of the neural network to be wide to achieve high accuracy. If the last hidden layer is narrow, the accuracy of the existing ELM method will be poor, irrespective of the rest of the network configuration. In this paper we present a modified ELM method, termed HLConcELM (hidden-layer concatenated ELM), to overcome this drawback of the conventional ELM method. The HLConcELM method can produce highly accurate solutions to linear/nonlinear PDEs irrespective of whether the last hidden layer of the network is narrow or wide. The new method is based on a type of modified feedforward neural network (FNN), termed HLConcFNN (hidden-layer concatenated FNN), which incorporates a logical concatenation of the hidden layers in the network and exposes all the hidden nodes to the output-layer nodes. HLConcFNNs have the interesting property that, given a network architecture, when additional hidden layers are appended to the network or when extra nodes are added to the existing hidden layers, the representation capacity of the HLConcFNN associated with the new architecture is guaranteed to be no smaller than that of the original network architecture. Here representation capacity refers to the set of all functions that can be exactly represented by the neural network of a given architecture. We present ample benchmark tests with linear/nonlinear PDEs to demonstrate the computational accuracy and performance of the HLConcELM method and its superiority to the conventional ELM from previous works.
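As an illustration of the hidden-layer concatenation described above, the sketch below implements a minimal HLConcFNN forward pass in NumPy: the hidden layers feed forward as in a standard FNN, but the output layer acts on the concatenation of all hidden-layer outputs. This is our own sketch under stated assumptions (Gaussian activation, a single output node); it is not the authors' implementation.

```python
import numpy as np

def hlconc_fnn(x, weights, biases, beta, sigma=lambda z: np.exp(-z**2)):
    """Minimal HLConcFNN forward pass (illustrative sketch).

    x       : (n_samples, d_in) input points
    weights : list of hidden-layer weight matrices; weights[i] maps the output
              of layer i-1 (or the input) to layer i
    biases  : list of hidden-layer bias vectors
    beta    : (total number of hidden nodes,) output-layer coefficients
    """
    hidden_outputs, a = [], x
    for W, b in zip(weights, biases):
        a = sigma(a @ W + b)       # standard feedforward propagation
        hidden_outputs.append(a)   # expose this hidden layer to the output layer
    phi = np.concatenate(hidden_outputs, axis=1)  # logical concatenation of all hidden layers
    return phi @ beta              # linear output layer over all hidden nodes
```

In the ELM setting the hidden-layer coefficients `weights` and `biases` are assigned random values and kept fixed, and only the output coefficients `beta` constitute the training parameters.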


Data Availability

The datasets related to this paper are available from the corresponding author on reasonable request.

References

  1. Alaba, P., Popoola, S., Olatomiwa, L., Akanle, M., Ohunakin, O., Adetiba, E., Alex, O., Atayero, A., Daud, W.: Towards a more efficient and cost-sensitive extreme learning machine: a state-of-the-art review of recent trend. Neurocomputing 350, 70–90 (2019)

  2. Basdevant, C., Deville, M., Haldenwang, P., Lacroix, J., Ouazzani, J., Peyret, R., Orlandi, P., Patera, A.: Spectral and finite difference solutions of the Burgers equation. Comput. Fluids 14, 23–41 (1986)

  3. Braake, H., Straten, G.: Random activation weight neural net (RAWN) for fast non-iterative training. Eng. Appl. Artif. Intell. 8, 71–80 (1995)

  4. Branch, M., Coleman, T., Li, Y.: A subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems. SIAM J. Sci. Comput. 21, 1–23 (1999)

  5. Byrd, R., Schnabel, R., Shultz, G.: Approximate solution of the trust region problem by minimization over two-dimensional subspaces. Math. Program. 40, 247–263 (1988)

  6. Calabro, F., Fabiani, G., Siettos, C.: Extreme learning machine collocation for the numerical solution of elliptic PDEs with sharp gradients. Comput. Methods Appl. Mech. Eng. 387, 114188 (2021)

  7. Cortes, C., Gonzalvo, X., Kuznetsov, V., Mohri, M., Yang, S.: Adanet: adaptive structural learning of artificial neural networks. arXiv:1607.01097 (2016)

  8. Cyr, E., Gulian, M., Patel, R., Perego, M., Trask, N.: Robust training and initialization of deep neural networks: an adaptive basis viewpoint. Proc. Mach. Learn. Res. 107, 512–536 (2020)

  9. Dong, S., Li, Z.: Local extreme learning machines and domain decomposition for solving linear and nonlinear partial differential equations. Comput. Methods Appl. Mech. Eng. 387, 114129 (2021)

  10. Dong, S., Li, Z.: A modified batch intrinsic plasticity method for pre-training the random coefficients of extreme learning machines. J. Comput. Phys. 445, 110585 (2021)

  11. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. J. Comput. Phys. 435, 110242 (2021)

  12. Dong, S., Yang, J.: Numerical approximation of partial differential equations by a variable projection method with artificial neural networks. Comput. Methods Appl. Mech. Eng. 398, 115284 (2022)

  13. Dong, S., Yang, J.: On computing the hyperparameter of extreme learning machines: algorithm and application to computational PDEs and comparison with classical and high-order finite elements. J. Comput. Phys. 463, 111290 (2022)

  14. Driscoll, T., Hale, N., Trefethen, L.: Chebfun Guide. Pafnuty Publications, Oxford (2014)

  15. Dwivedi, V., Srinivasan, B.: Physics informed extreme learning machine (pielm) \(-\) a rapid method for the numerical solution of partial differential equations. Neurocomputing 391, 96–118 (2020)

  16. Dwivedi, V., Srinivasan, B.: A normal equation-based extreme learning machine for solving linear partial differential equations. J. Comput. Inf. Sci. Eng. 22, 014502 (2022)

  17. Weinan, E., Yu, B.: The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. Commun. Math. Stat. 6, 1–12 (2018)

  18. Fabiani, G., Calabro, F., Russo, L., Siettos, C.: Numerical solution and bifurcation analysis of nonlinear partial differential equations with extreme learning machines. J. Sci. Comput. 89, 44 (2021)

  19. Fokina, D., Oseledets, I.: Growing axons: greedy learning of neural networks with application to function approximation. arXiv:1910.12686 (2020)

  20. Freire, A., Rocha-Neto, A., Barreto, G.: On robust randomized neural networks for regression: a comprehensive review and evaluation. Neural Comput. Appl. 32, 16931–16950 (2020)

  21. Galaris, E., Fabiani, G., Calabro, F., Serafino, D., Siettos, C.: Numerical solution of stiff ODEs with physics-informed random projection neural networks. arXiv:2108.01584 (2021)

  22. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Cambridge (2016)

  23. Guo, P., Chen, C., Sun, Y.: An exact supervised learning for a three-layer supervised neural network. In: Proceedings of 1995 International Conference on Neural Information Processing, pp. 1041–1044 (1995)

  24. He, J., Xu, J.: MgNet: a unified framework for multigrid and convolutional neural network. Sci. China Math. 62, 1331–1354 (2019)

  25. Hendrycks, D., Gimpel, K.: Gaussian error linear units (GELU). arXiv:1606.08415 (2016)

  26. Huang, G., Chen, L., Siew, C.K.: Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Trans. Neural Netw. 17, 879–892 (2006)

  27. Huang, G., Huang, G., Song, S., You, K.: Trends in extreme learning machines: a review. Neural Netw. 61, 32–48 (2015)

  28. Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.: Densely connected convolutional networks. arXiv:1608.06993 (2018)

  29. Huang, G.B., Zhu, Q.Y., Siew, C.K.: Extreme learning machine: a new learning scheme of feedforward neural networks. In: 2004 IEEE International Joint Conference on Neural Networks, vol. 2, pp. 985–990 (2004)

  30. Huang, G.B., Zhu, Q.Y., Siew, C.K.: Extreme learning machine: theory and applications. Neurocomputing 70, 489–501 (2006)

  31. Igelnik, B., Pao, Y.: Stochastic choice of basis functions in adaptive function approximation and the functional-link net. IEEE Trans. Neural Netw. 6, 1320–1329 (1995)

  32. Jaeger, H., Lukosevicius, M., Popovici, D., Siewert, U.: Optimization and applications of echo state networks with leaky integrator neurons. Neural Netw. 20, 335–352 (2007)

  33. Jagtap, A., Kharazmi, E., Karniadakis, G.: Conservative physics-informed neural networks on discrete domains for conservation laws: applications to forward and inverse problems. Comput. Methods Appl. Mech. Eng. 365, 113028 (2020)

  34. Karniadakis, G., Kevrekidis, G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nat. Rev. Phys. 3, 422–440 (2021)

  35. Karniadakis, G., Sherwin, S.: Spectral/hp Element Methods for Computational Fluid Dynamics, 2nd edn. Oxford University Press, Oxford (2005)

  36. Katuwal, R., Suganthan, P., Tanveer, M.: Random vector functional link neural network based ensemble deep learning. arXiv:1907.00350 (2019)

  37. Krishnapriyan, A., Gholami, A., Zhe, S., Kirby, R., Mahoney, M.: Characterizing possible failure modes in physics-informed neural networks. arXiv:2109.01050 (2021)

  38. Kuramoto, Y.: Diffusion-induced chaos in reaction systems. Prog. Theor. Phys. Suppl. 64, 346–367 (1978)

  39. Li, J.Y., Chow, W., Igelnik, B., Pao, Y.H.: Comments on "Stochastic choice of basis functions in adaptive function approximation and the functional-link net". IEEE Trans. Neural Netw. 8, 452–454 (1997)

  40. Liu, H., Xing, B., Wang, Z., Li, L.: Legendre neural network method for several classes of singularly perturbed differential equations based on mapping and piecewise optimization technology. Neural Process. Lett. 51, 2891–2913 (2020)

  41. Liu, M., Hou, M., Wang, J., Cheng, Y.: Solving two-dimensional linear partial differential equations based on Chebyshev neural network with extreme learning machine algorithm. Eng. Comput. 38, 874–894 (2021)

  42. Lu, L., Meng, X., Mao, Z., Karniadakis, G.: DeepXDE: a deep learning library for solving differential equations. SIAM Rev. 63, 208–228 (2021)

  43. Lukosevicius, M., Jaeger, H.: Reservoir computing approaches to recurrent neural network training. Comput. Sci. Rev. 3, 127–149 (2009)

  44. Maass, W., Markram, H.: On the computational power of recurrent circuits of spiking neurons. J. Comput. Syst. Sci. 69, 593–616 (2004)

  45. Needell, D., Nelson, A., Saab, R., Salanevich, P.: Random vector functional link networks for function approximation on manifolds. arXiv:2007.15776 (2020)

  46. Nocedal, J., Wright, S.: Numerical Optimization, 2nd edn. Springer, Berlin (2006)

  47. Panghal, S., Kumar, M.: Optimization free neural network approach for solving ordinary and partial differential equations. Eng. Comput. 37, 2989–3002 (2021)

  48. Pao, Y., Park, G., Sobajic, D.: Learning and generalization characteristics of the random vector functional-link net. Neurocomputing 6, 163–180 (1994)

  49. Pao, Y., Takefuji, Y.: Functional-link net computing: theory, system architecture, and functionalities. Computer 25, 76–79 (1992)

  50. Rahimi, A., Recht, B.: Weighted sums of random kitchen sinks: replacing minimization with randomization in learning. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems (NIPS), vol. 2, pp. 1316–1323 (2008)

  51. Raissi, M., Perdikaris, P., Karniadakis, G.: Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019)

  52. Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958)

  53. Scardapane, S., Wang, D.: Randomness in neural networks: an overview. WIREs Data Mining Knowl. Discov. 7, e1200 (2017)

  54. Sirignano, J., Spiliopoulos, K.: DGM: a deep learning algorithm for solving partial differential equations. J. Comput. Phys. 375, 1339–1364 (2018)

  55. Sivashinsky, G.: Nonlinear analysis of hydrodynamic instability in laminar flames—I. Derivation of basic equations. Acta Astronautica 4, 1177–1206 (1977)

  56. Suganthan, P., Katuwal, R.: On the origins of randomization-based feedforward neural networks. Appl. Soft Comput. 105, 107239 (2021)

  57. Sun, H., Hou, M., Yang, Y., Zhang, T., Weng, F., Han, F.: Solving partial differential equations based on Bernstein neural network and extreme learning machine algorithm. Neural Process. Lett. 50, 1153–1172 (2019)

  58. Tang, K., Wan, X., Liao, Q.: Adaptive deep density estimation for Fokker–Planck equations. J. Comput. Phys. 457, 111080 (2022)

  59. Verma, B., Mulawka, J.: A modified backpropagation algorithm. In: Proceedings of 1994 IEEE International Conference on Neural Networks, vol. 2, pp. 840–844 (1994)

  60. Wan, X., Wei, S.: VAE-KRnet and its applications to variational Bayes. Commun. Comput. Phys. 31, 1049–1082 (2022)

  61. Wang, S., Yu, X., Perdikaris, P.: When and why PINNs fail to train: a neural tangent kernel perspective. J. Comput. Phys. 449, 110768 (2022)

  62. Wang, Y., Lin, G.: Efficient deep learning techniques for multiphase flow simulation in heterogeneous porous media. J. Comput. Phys. 401, 108968 (2020)

  63. Webster, C.: Alan Turing’s unorganized machines and artificial neural networks: his remarkable early work and future possibilities. Evol. Intell. 5, 35–43 (2012)

  64. Widrow, B., Greenblatt, A., Kim, Y., Park, D.: The no-prop algorithm: a new learning algorithm for multilayer neural networks. Neural Netw. 37, 182–188 (2013)

  65. Wilamowski, B., Yu, H.: Neural network learning without backpropagation. IEEE Trans. Neural Netw. 21, 1793–1803 (2010)

  66. Winovich, N., Ramani, K., Lin, G.: ConvPDE-UQ: convolutional neural networks with quantified uncertainty for heterogeneous elliptic partial differential equations on varied domains. J. Comput. Phys. 394, 263–279 (2019)

  67. Yang, Y., Hou, M., Luo, J.: A novel improved extreme learning machine algorithm in solving ordinary differential equations by Legendre neural network methods. Adv. Differ. Equ. 469, 1–24 (2018)

  68. Yang, Z., Dong, S.: An unconditionally energy-stable scheme based on an implicit auxiliary energy variable for incompressible two-phase flows with different densities involving only precomputable coefficient matrices. J. Comput. Phys. 393, 229–257 (2019)

  69. Yang, Z., Dong, S.: A roadmap for discretely energy-stable schemes for dissipative systems based on a generalized auxiliary variable with guaranteed positivity. J. Comput. Phys. 404, 109121 (2020)

  70. Yang, Z., Lin, L., Dong, S.: A family of second-order energy-stable schemes for Cahn–Hilliard type equations. J. Comput. Phys. 383, 24–54 (2019)

  71. Zhang, L., Suganthan, P.: A comprehensive evaluation of random vector functional link networks. Inf. Sci. 367–368, 1094–1105 (2016)

  72. Zheng, X., Dong, S.: An eigen-based high-order expansion basis for structured spectral elements. J. Comput. Phys. 230, 8573–8602 (2011)

Funding

This work was partially supported by the US National Science Foundation (DMS-2012415).

Author information

Contributions

NN: software, data acquisition, data visualization, data analysis, writing of paper. SD: conceptualization, methodology, software, data acquisition, data analysis, writing of paper.

Corresponding author

Correspondence to Suchuan Dong.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Proofs of Theorems from Sect. 2

Proof of Theorem 1

Consider an arbitrary \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_1,\sigma )\), where \(\varvec{\theta }\in {{\mathbb {R}}}^{N_{h1}}\) and \(\varvec{\beta }\in {{\mathbb {R}}}^{N_{c1}}\), with \(N_{h1}=\sum _{i=1}^{L-1}(m_{i-1}+1)m_i\) and \(N_{c1}=\sum _{i=1}^{L-1}m_i\). Let \(w_{kj}^{(i)}\) (\(1\leqslant i\leqslant L-1\), \(1\leqslant k\leqslant m_{i-1}\), \(1\leqslant j\leqslant m_i\)) and \(b^{(i)}_j\) (\(1\leqslant i\leqslant L-1\), \(1\leqslant j\leqslant m_i\)) denote the hidden-layer weight/bias coefficients of the associated HLConcFNN(\({{\textbf{M}}}_1,\sigma \)), and let \(\beta _{ij}\) (\(1\leqslant i\leqslant L-1\), \(1\leqslant j\leqslant m_i\)) denote the output-layer coefficients of HLConcFNN(\({{\textbf{M}}}_1,\sigma \)). \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\) is given by (7).

Consider a function \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_2,\sigma )\) with \(\varvec{\vartheta }\in {{\mathbb {R}}}^{N_{h2}}\) and \(\varvec{\alpha }\in {{\mathbb {R}}}^{N_{c2}}\), where \(N_{c2}=N_{c1}+n\), and \(N_{h2}=N_{h1}+(m_{L-1}+1)n\). We will choose \(\varvec{\vartheta }\) and \(\varvec{\alpha }\) such that \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}}) = u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\). We construct \(\varvec{\vartheta }\) and \(\varvec{\alpha }\) by setting the hidden-layer and the output-layer coefficients of HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) as follows.

The HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) has L hidden layers. We set the weight/bias coefficients in its last hidden layer (with n nodes) to arbitrary values. We set those coefficients that connect the output node and the n nodes in the last hidden layer to all zeros. For the rest of the hidden-layer coefficients and the output-layer coefficients in HLConcFNN(\({{\textbf{M}}}_2,\sigma \)), we use those corresponding coefficient values from the network HLConcFNN(\({{\textbf{M}}}_1,\sigma \)).

More specifically, let \(\xi _{kj}^{(i)}\) and \(\eta _j^{(i)}\) denote the weight/bias coefficients in the hidden layers, and \(\alpha _{ij}\) denote the output-layer coefficients, of HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) associated with the function \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\). We set these coefficients by,

$$\begin{aligned} \xi _{kj}^{(i)}= & {} \left\{ \begin{array}{ll} w_{kj}^{(i)}, &{} \text {for}\ 1\leqslant i\leqslant L-1,\ 1\leqslant k\leqslant m_{i-1},\ 1\leqslant j\leqslant m_i; \\ \text {arbitrary value}, &{} \text {for}\ i=L,\ 1\leqslant k\leqslant m_{L-1},\ 1\leqslant j\leqslant n; \end{array} \right. \end{aligned}$$
(31)
$$\begin{aligned} \eta _j^{(i)}= & {} \left\{ \begin{array}{ll} b_j^{(i)}, &{} \text {for all}\ 1\leqslant i\leqslant L-1,\ 1\leqslant j\leqslant m_i; \\ \text {arbitrary value}, &{} \text {for}\ i=L,\ 1\leqslant j\leqslant n; \end{array} \right. \end{aligned}$$
(32)
$$\begin{aligned} \alpha _{ij}= & {} \left\{ \begin{array}{ll} \beta _{ij},&{} \text {for}\ 1\leqslant i\leqslant L-1,\ 1\leqslant j\leqslant m_i; \\ 0, &{} \text {for}\ i=L,\ 1\leqslant j\leqslant n. \end{array} \right. \end{aligned}$$
(33)

With the above coefficients, the last hidden layer of the network HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) may output arbitrary fields, which however have no effect on the output field of HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) because \(\alpha _{Lj}=0\) (\(1\leqslant j\leqslant n\)). The rest of the hidden nodes in HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) and the output node of HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) produce fields that are identical to those of the corresponding nodes in the network HLConcFNN(\({{\textbf{M}}}_1,\sigma \)). We thus conclude that \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})=v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\). So \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_2,\sigma )\), and the relation (9) holds. \(\square \)
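The construction (31)–(33) is easy to verify numerically: appending a hidden layer with arbitrary weights but zero output-layer coefficients leaves the network output unchanged. The following self-contained sketch (our own, with a Gaussian activation and illustrative layer widths) checks this for a small HLConcFNN; it is not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = lambda z: np.exp(-z**2)  # Gaussian activation (assumed)

def hlconc_forward(x, Ws, bs, beta):
    # HLConcFNN output: linear combination over the outputs of all hidden layers
    outs, a = [], x
    for W, b in zip(Ws, bs):
        a = sigma(a @ W + b)
        outs.append(a)
    return np.concatenate(outs, axis=1) @ beta

# HLConcFNN(M1): architecture [2, 5, 3, 1] -> hidden widths (5, 3)
x = rng.uniform(-1.0, 1.0, size=(10, 2))
Ws = [rng.normal(size=(2, 5)), rng.normal(size=(5, 3))]
bs = [rng.normal(size=5), rng.normal(size=3)]
beta = rng.normal(size=5 + 3)
u = hlconc_forward(x, Ws, bs, beta)

# HLConcFNN(M2): append a last hidden layer with n = 4 nodes.
# Its weights/biases are arbitrary (eqs. (31)-(32)); its output-layer
# coefficients are zero (eq. (33)).
Ws2 = Ws + [rng.normal(size=(3, 4))]
bs2 = bs + [rng.normal(size=4)]
alpha = np.concatenate([beta, np.zeros(4)])
v = hlconc_forward(x, Ws2, bs2, alpha)

assert np.allclose(u, v)  # the represented function is unchanged
```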

Proof of Theorem 2

We use the same strategy as that in the proof of Theorem 1. Consider an arbitrary \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_1,\sigma )\), where \(\varvec{\theta }\in {{\mathbb {R}}}^{N_{h1}}\) and \(\varvec{\beta }\in {{\mathbb {R}}}^{N_{c1}}\), with \(N_{h1}=\sum _{i=1}^{L-1}(m_{i-1}+1)m_i\) and \(N_{c1}=\sum _{i=1}^{L-1}m_i\). The hidden-layer coefficients of the associated HLConcFNN(\({{\textbf{M}}}_1,\sigma \)) are denoted by \(w_{kj}^{(i)}\) (\(1\leqslant i\leqslant L-1\), \(1\leqslant k\leqslant m_{i-1}\), \(1\leqslant j\leqslant m_i\)) and \(b^{(i)}_j\) (\(1\leqslant i\leqslant L-1\), \(1\leqslant j\leqslant m_i\)), and the output-layer coefficients are denoted by \(\beta _{ij}\) (\(1\leqslant i\leqslant L-1\), \(1\leqslant j\leqslant m_i\)). \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\) is given by (7).

Consider a function \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_2,\sigma )\) with \(\varvec{\vartheta }\in {{\mathbb {R}}}^{N_{h2}}\) and \(\varvec{\alpha }\in {{\mathbb {R}}}^{N_{c2}}\), where \(N_{c2}=N_{c1}+1\), and \(N_{h2}=N_{h1}+(m_{s-1}+1)+m_{s+1}\) if \(1\leqslant s\leqslant L-2\) and \(N_{h2}=N_{h1}+(m_{s-1}+1)\) if \(s=L-1\). We construct \(\varvec{\vartheta }\) and \(\varvec{\alpha }\) by setting the hidden-layer and the output-layer coefficients of HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) as follows.

In HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) we set the weight coefficients that connect the extra node of layer s to those nodes in layer \((s+1)\) to all zeros, and we also set the weight coefficient that connects the extra node of layer s with the output node to zero. We set the weight coefficients that connect the nodes of layer \((s-1)\) to the extra node of layer s to arbitrary values, and also set the bias coefficient corresponding to the extra node of layer s to an arbitrary value. For the rest of the hidden-layer and output-layer coefficients of HLConcFNN(\({{\textbf{M}}}_2,\sigma \)), we use those corresponding coefficient values from the network HLConcFNN(\({{\textbf{M}}}_1,\sigma \)).

Specifically, let \(\xi _{kj}^{(i)}\) and \(\eta _j^{(i)}\) denote the weight/bias coefficients in the hidden layers, and \(\alpha _{ij}\) denote the output-layer coefficients, of the HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) associated with \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\). We set these coefficients by,

$$\begin{aligned} \xi _{kj}^{(i)}= & {} \left\{ \begin{array}{ll} w_{kj}^{(i)}, &{} \text {for all}\ (1\leqslant i\leqslant s-1,\ \text {or}\ s+2\leqslant i\leqslant L-1),\\ &{} \quad 1\leqslant k\leqslant m_{i-1},\ 1\leqslant j\leqslant m_i; \\ w_{kj}^{(s)}, &{} \text {for}\ i=s,\ 1\leqslant k\leqslant m_{s-1},\ 1\leqslant j\leqslant m_s; \\ \text {arbitrary value}, &{} \text {for}\ i=s,\ 1\leqslant k\leqslant m_{s-1},\ j=m_{s}+1; \\ w_{kj}^{(s+1)}, &{} \text {for}\ i=s+1,\ 1\leqslant k\leqslant m_{s},\ 1\leqslant j\leqslant m_{s+1}; \\ 0, &{} \text {for}\ i=s+1,\ k=m_s+1,\ 1\leqslant j\leqslant m_{s+1}; \end{array} \right. \end{aligned}$$
(34)
$$\begin{aligned} \eta _j^{(i)}= & {} \left\{ \begin{array}{ll} b_j^{(i)}, &{} \text {for all}\ 1\leqslant i\leqslant L-1,\ i\ne s,\ 1\leqslant j\leqslant m_i; \\ b_j^{(s)}, &{} \text {for}\ i=s,\ 1\leqslant j\leqslant m_s; \\ \text {arbitrary value}, &{} \text {for}\ i=s,\ j=m_s+1; \end{array} \right. \end{aligned}$$
(35)
$$\begin{aligned} \alpha _{ij}= & {} \left\{ \begin{array}{ll} \beta _{ij},&{} \text {for all}\ 1\leqslant i\leqslant L-1,\ i\ne s,\ 1\leqslant j\leqslant m_i; \\ \beta _{sj}, &{} \text {for}\ i=s,\ 1\leqslant j\leqslant m_s; \\ 0, &{} \text {for}\ i=s,\ j=m_s+1. \end{array} \right. \end{aligned}$$
(36)

With the above coefficients, the extra node in layer s of the network HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) may output an arbitrary field, which however has no contribution to the output field of HLConcFNN(\({{\textbf{M}}}_2,\sigma \)). The rest of the hidden nodes and the output node of HLConcFNN(\({{\textbf{M}}}_2,\sigma \)) produce identical fields as the corresponding nodes in the network HLConcFNN(\({{\textbf{M}}}_1,\sigma \)). We thus conclude that \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})=v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\). So \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_2,\sigma )\) and the relation (10) holds. \(\square \)

Proof of Theorem 3

We use the same strategy as that in the proof of Theorem 1. Consider an arbitrary \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_1,\sigma ,\varvec{\theta })\), where \(\varvec{\beta }\in {{\mathbb {R}}}^{N_{c1}}\) with \(N_{c1}=\sum _{i=1}^{L-1}m_i\). We will try to construct an equivalent function from \(U({\varOmega },{{\textbf{M}}}_2,\sigma ,\varvec{\vartheta })\).

We consider another function \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_2,\sigma ,\varvec{\vartheta })\), where \(\varvec{\alpha }\in {{\mathbb {R}}}^{N_{c2}}\) with \(N_{c2}=N_{c1}+n\), and we set the coefficients of the HLConcELM corresponding to \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\) as follows. Since \(\varvec{\vartheta }[1:N_{h1}]=\varvec{\theta }[1:N_{h1}]\), the random coefficients in the first \((L-1)\) hidden layers of the HLConcELM corresponding to \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\) are identical to those corresponding hidden-layer coefficients in the HLConcELM for \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\). We set the weight/bias coefficients in the L-th hidden layer of the HLConcELM for \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\), which contains n nodes, to arbitrary random values. For the output-layer coefficients of the HLConcELM for \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\), we set those coefficients that connect the hidden nodes in the first \((L-1)\) hidden layers and the output node to be identical to those corresponding output-layer coefficients in the HLConcELM for \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\), namely, \(\varvec{\alpha }[1:N_{c1}]=\varvec{\beta }[1:N_{c1}]\). We set those coefficients that connect the hidden nodes of the L-th hidden layer and the output node to be zeros in the HLConcELM for \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})\), namely, \(\varvec{\alpha }[N_{c1}+1:N_{c2}]=0\).

With the above coefficient settings, the output fields of those nodes in the first \((L-1)\) hidden layers of HLConcELM(\({{\textbf{M}}}_2,\sigma ,\varvec{\vartheta }\)) are identical to those corresponding nodes of HLConcELM(\({{\textbf{M}}}_1,\sigma ,\varvec{\theta }\)). The output fields of those n nodes in the L-th hidden layer of HLConcELM(\({{\textbf{M}}}_2,\sigma ,\varvec{\vartheta }\)) are arbitrary, which however have no contribution to the output field of HLConcELM(\({{\textbf{M}}}_2,\sigma ,\varvec{\vartheta }\)). The output field of the HLConcELM(\({{\textbf{M}}}_2,\sigma ,\varvec{\vartheta }\)) is identical to that of the HLConcELM(\({{\textbf{M}}}_1,\sigma ,\varvec{\theta }\)), i.e. \(v(\varvec{\vartheta },\varvec{\alpha },{{\textbf{x}}})=u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\). We thus conclude that \(u(\varvec{\theta },\varvec{\beta },{{\textbf{x}}})\in U({\varOmega },{{\textbf{M}}}_2,\sigma ,\varvec{\vartheta })\) and the relation (13) holds. \(\square \)

Appendix B. Numerical Tests with Several Activation Functions

Table 7 Appendix B (variable-coefficient Poisson equation): the activation functions and the corresponding hidden magnitude vector \({{\textbf{R}}}\) employed
Table 8 Appendix B (variable-coefficient Poisson equation): the max/rms/\(h^1\) errors of HLConcELM obtained with different activation functions on two uniform sets of collocation points
Table 9 Appendix B (variable-coefficient Poisson equation): the max/rms/\(h^1\) errors of HLConcELM obtained with different activation functions on two neural networks with architecture [2, M, 50, 1]

We have employed the Gaussian activation function for all the numerical simulations in Sect. 3. This appendix provides additional HLConcELM results using several other activation functions for solving the variable-coefficient Poisson problem from Sect. 3.1. Table 7 lists the activation functions studied below, including tanh, RePU-8, sinc, GELU and swish (in addition to Gaussian), as well as the hidden magnitude vector \({{\textbf{R}}}\) employed for each activation function. Here “RePU-8” stands for the rectified power unit of degree 8, and “GELU” denotes the Gaussian error linear unit [25].
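For concreteness, the activation functions of Table 7 can be written as follows. These are our own definitions based on the common formulas for these functions (with the convention sinc(x) = sin(πx)/(πx) used later in Appendix E); they are assumptions for illustration, not code from the paper.

```python
import numpy as np
from scipy.special import erf

gaussian = lambda x: np.exp(-x**2)                      # assumed form of the Gaussian activation
tanh     = np.tanh
repu8    = lambda x: np.where(x > 0.0, x, 0.0)**8       # rectified power unit of degree 8
sinc     = np.sinc                                      # numpy convention: sin(pi*x)/(pi*x)
gelu     = lambda x: 0.5*x*(1.0 + erf(x/np.sqrt(2.0)))  # Gaussian error linear unit [25]
swish    = lambda x: x/(1.0 + np.exp(-x))               # x * sigmoid(x)
```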

Table 8 lists the maximum, rms and \(h^1\) errors of the HLConcELM solutions obtained using these activation functions on a neural network [2, 800, 50, 1] with two uniform sets of collocation points \(Q=15\times 15\) and \(30\times 30\). Table 9 lists the maximum, rms and \(h^1\) errors of HLConcELM using these activation functions on two neural networks of the architecture [2, M, 50, 1] (with \(M=400\) and 800) with a fixed uniform set of \(Q=35\times 35\) collocation points. One can observe a general exponential decrease in the errors with these activation functions, except for the RePU-8 function in Table 8 (where the errors seem to saturate). The results with the RePU-8 function appear markedly less accurate than those obtained with the other activation functions studied here.

Appendix C. Additional Comparisons Between HLConcELM and Conventional ELM

Table 10 Appendix C (variable-coefficient Poisson equation): comparison of the maximum and rms errors versus the number of collocation points (Q) obtained by HLConcELM and conventional ELM
Table 11 Appendix C (variable-coefficient Poisson equation): comparison of the maximum and rms errors versus the number of collocation points (Q) obtained by HLConcELM and conventional ELM

This appendix provides additional comparisons between the current HLConcELM method and the conventional ELM method for the variable-coefficient Poisson problem (Sect. 3.1) and the nonlinear Helmholtz problem (Sect. 3.3).

In those comparisons between HLConcELM and conventional ELM presented in Sect. 3, the base neural-network architectures for HLConcELM and conventional ELM are maintained to be the same. HLConcELM is able to harvest the degrees of freedom in all the hidden layers of the neural network, thanks to the logical connections between all the hidden nodes and the output nodes (due to the hidden-layer concatenation). On the other hand, the conventional ELM only exploits the degrees of freedom afforded by the last hidden layer of the network, while those degrees of freedom provided by the preceding hidden layers are essentially "wasted" (see the discussions in Sect. 2.1). This is why the conventional ELM exhibits poor accuracy if the last hidden layer is narrow, irrespective of the rest of the network configuration. It also accounts for why the HLConcELM method can achieve high accuracy irrespective of whether the last hidden layer is narrow or wide.

Note that with HLConcELM the number of training parameters equals the total number of hidden nodes in the neural network, while with conventional ELM it equals the number of nodes in the last hidden layer. Under the same base network architecture (with multiple hidden layers), the number of training parameters in HLConcELM is larger than that in the conventional ELM, because HLConcELM also exploits the hidden nodes from the preceding hidden layers.

In what follows we present several additional numerical tests to compare HLConcELM and conventional ELM, under the configuration that the number of training parameters in both HLConcELM and conventional ELM is maintained to be the same. Because of their different characteristics, the base network architectures for HLConcELM and for conventional ELM in this case will inevitably not be identical. In the comparisons below we try to keep the two architectures close to each other, specifically by using the same depth, and the same width for each hidden layer except the last, for both HLConcELM and conventional ELM. The width of the last hidden layer in the HLConcELM network and in the conventional-ELM network is different, with the conventional ELM being wider (and in some cases considerably wider), while the number of training parameters is kept the same in both.
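This parameter counting can be made concrete with a small helper (our own sketch, not from the paper), with architectures written as [input width, hidden widths..., output width] as in the tables: for HLConcELM the number of training parameters is the total number of hidden nodes, whereas for conventional ELM it is the width of the last hidden layer.

```python
def num_train_params_hlconcelm(arch):
    """Training parameters of HLConcELM = total number of hidden nodes."""
    return sum(arch[1:-1])

def num_train_params_elm(arch):
    """Training parameters of conventional ELM = width of the last hidden layer."""
    return arch[-2]

# The architectures compared in Table 10 (850 training parameters each):
assert num_train_params_hlconcelm([2, 800, 50, 1]) == 850
assert num_train_params_elm([2, 800, 850, 1]) == 850
assert num_train_params_hlconcelm([2, 50, 800, 1]) == 850
assert num_train_params_elm([2, 50, 850, 1]) == 850
```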

Table 12 Appendix C (nonlinear Helmholtz equation): comparison of the maximum and rms errors versus the number of collocation points (Q) obtained by HLConcELM and conventional ELM
Table 13 Appendix C (nonlinear Helmholtz equation): comparison of the maximum and rms errors versus the number of collocation points (Q) obtained by HLConcELM and conventional ELM

Tables 10 and 11 show comparisons of the maximum and rms errors versus the number of collocation points obtained by HLConcELM and by conventional ELM for the variable-coefficient Poisson problem from Sect. 3.1. The results in Table 10 are attained with two hidden layers in the neural network and a total of 850 training parameters. The results in Table 11 correspond to three hidden layers in the neural network with a total of 900 training parameters. The HLConcELM data in Table 10 for the networks [2, 800, 50, 1] and [2, 50, 800, 1] correspond to those in Table 1. The simulation parameter values are listed in the tables or provided in the table captions. The exponential convergence of the errors with respect to the number of collocation points is evident in all test cases. The error levels from HLConcELM and the conventional ELM are close, reaching around \(10^{-8}\) in terms of the maximum error and \(10^{-9}\) in terms of the rms error. The error values resulting from HLConcELM in general appear better than those from the conventional ELM, e.g. by comparing the HLConcELM results (with [2, 800, 50, 1]) and the conventional ELM results (with [2, 800, 850, 1]) in Table 10, or comparing the HLConcELM results (with [2, 800, 50, 50, 1]) and the conventional ELM results (with [2, 800, 50, 900, 1]) in Table 11. But this is not true for every test case; see e.g. the case Q=25\(\times \)25 between HLConcELM (with [2, 50, 800, 1]) and conventional ELM (with [2, 50, 850, 1]) in Table 10, or the cases Q=15\(\times \)15 and 20\(\times \)20 between HLConcELM (with [2, 50, 50, 800, 1]) and conventional ELM (with [2, 50, 50, 900, 1]) in Table 11.

Tables 12 and 13 show the comparisons between HLConcELM and conventional ELM for the nonlinear Helmholtz problem from Sect. 3.3. The results in Table 12 correspond to two hidden layers in the neural network with a total of 530 training parameters, and those in Table 13 correspond to three hidden layers in the neural network with a total of 560 training parameters. The simulation parameter values are provided in the table captions or listed in the tables. Note that the HLConcELM data in Table 12 correspond to those in Table 4 with the networks [2, 500, 30, 1] and [2, 30, 500, 1]. The relative performance between HLConcELM and conventional ELM exhibited by these data is similar to what has been observed from Tables 10 and 11 for the variable-coefficient Poisson equation. The error levels resulting from HLConcELM and conventional ELM are quite close, on the order of \(10^{-6}\) or \(10^{-7}\) in terms of the maximum error and \(10^{-7}\) or \(10^{-8}\) in terms of the rms error. Overall the error values from HLConcELM appear slightly better than those from the conventional ELM; see e.g. those data in Table 12 and the cases between HLConcELM with [2, 30, 30, 500, 1] and conventional ELM with [2, 30, 30, 560, 1] in Table 13. But this is not consistently so for all the test cases; see e.g. the cases between HLConcELM with [2, 500, 30, 30, 1] and conventional ELM with [2, 500, 30, 560, 1] in Table 13.

It is noted that in all these test cases the neural network for the conventional ELM has a wide last hidden layer. This is consistent with the observation that the conventional ELM is only accurate when the last hidden layer is wide.

Appendix D. Laplace Equation Around a Reentrant Corner

Fig. 25

Appendix D (reentrant corner): sketch of the L-shaped domain \({\overline{OABCDEO}}\) with a reentrant corner at O. The sketch shows an example set of \(Q=3\times (10\times 10)\) uniform collocation points, with \(10\times 10\) points in each of the three regions \({\overline{OABF}}\), \({\overline{OFCG}}\) and \({\overline{OGDE}}\)

This appendix provides a test of the HLConcELM method with the Laplace equation around a reentrant corner, where the solution is not smooth. Figure 25 is a sketch of the L-shaped domain \({\varOmega }={\overline{OABCDEO}}\) (with a reentrant corner at O) employed in this test. We consider the following problem on \({\varOmega }\),

$$\begin{aligned}&\frac{\partial ^2 u}{\partial x^2} + \frac{\partial ^2 u}{\partial y^2} = 0, \quad (x,y)\in {\varOmega }, \end{aligned}$$
(37a)
$$\begin{aligned}&u(x,y) = 0, \quad (x,y)\in {\overline{OA}},\ {\overline{OE}},\end{aligned}$$
(37b)
$$\begin{aligned}&u(x,y) = r^{\frac{2k}{3}}\sin \left( \frac{2k}{3}\theta \right) , \quad (x,y)\in {\overline{AB}},\ {\overline{BC}},\ {\overline{CD}},\ {\overline{DE}}, \end{aligned}$$
(37c)

where \(u(x,y)\) is the field to be solved for, \((r,\theta )\) denote the polar coordinates, and \(k\geqslant 1\) is a prescribed integer. This problem has the following solution,

$$\begin{aligned} u(x,y) = r^{\frac{2k}{3}}\sin \left( \frac{2k}{3}\theta \right) , \quad (x,y)\in {\varOmega }. \end{aligned}$$
(38)

The integer k influences the regularity of the solution. If k is a multiple of 3, then the solution \(u(x,y)\) is smooth (\(C^{\infty }\)) on \({\varOmega }\). Otherwise, the solution is non-smooth, with its \(\lceil \frac{2k}{3} \rceil \)-th derivative being singular at the reentrant corner. We solve this problem by the HLConcELM method, and employ a set of uniform grid points in the sub-regions \({\overline{OABF}}\), \({\overline{OFCG}}\) and \({\overline{OGDE}}\) as the collocation points. Figure 25 shows a set of \(Q=3\times (10\times 10)\) uniform collocation points on the domain as an example. The Gaussian activation function is employed in the neural network. We employ a fixed seed value 10 for the random number generators.
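A short sketch of the exact solution (38) and of the uniform collocation points on the three sub-regions is given below. The concrete coordinates (corner O at the origin, three unit-square sub-regions) are our own assumption for illustration, since the domain dimensions are not restated in this excerpt; the angular range \(\theta \in [0,3\pi /2]\) is inferred from the boundary conditions (37b).

```python
import numpy as np

def exact_solution(x, y, k):
    """Exact solution (38): u = r^(2k/3) * sin(2k*theta/3), with theta measured
    from the edge OA so that u vanishes on OA (theta = 0) and OE (theta = 3*pi/2)."""
    r = np.sqrt(x**2 + y**2)
    theta = np.arctan2(y, x)
    theta = np.where(theta < 0.0, theta + 2.0*np.pi, theta)  # map to [0, 2*pi)
    return r**(2.0*k/3.0) * np.sin(2.0*k/3.0*theta)

def collocation_points(n):
    """Q = 3 x (n x n) uniform points, one n x n grid per sub-region
    (assumed sub-regions: [0,1]x[0,1], [-1,0]x[0,1], [-1,0]x[-1,0])."""
    grids = []
    for x0, y0 in [(0.0, 0.0), (-1.0, 0.0), (-1.0, -1.0)]:
        X, Y = np.meshgrid(np.linspace(x0, x0 + 1.0, n),
                           np.linspace(y0, y0 + 1.0, n))
        grids.append(np.column_stack([X.ravel(), Y.ravel()]))
    return np.vstack(grids)
```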

Fig. 26

Appendix D (reentrant corner): Distributions of the exact solution (a, d, g), the HLConcELM solution (b, e, h), and the point-wise absolute error of the HLConcELM solution (c, f, i) to the Laplace equation. a–c \(k=1\) (non-smooth), d–f \(k=3\) (smooth), and g–i \(k=5\) (non-smooth) in the solution field. In HLConcELM, neural network [2, 800, 50, 1], Gaussian activation function, \(Q=3\times (20\times 20)\) uniform collocation points, \({{\textbf{R}}}=(5.0,0.1)\)

Table 14 Appendix D (reentrant corner): the maximum/rms errors of the HLConcELM solution versus the number of nodes in the first hidden layer (M) for solution fields with different regularity (k parameter)
Table 15 Appendix D (reentrant corner): the maximum/rms errors of the HLConcELM solution versus the number of collocation points (Q) for solution fields with different regularity

Figure 26 shows distributions of the exact solution (38), the HLConcELM solution and its point-wise absolute error, corresponding to three different solution fields with \(k=1\), 3 and 5. The values for the simulation parameters are provided in the figure caption. The HLConcELM result is extremely accurate for the case with a smooth solution (\(k=3\)), with the maximum error on the order \(10^{-11}\) in the domain. On the other hand, the HLConcELM solution is much less accurate for the non-smooth cases (\(k=1,5\)), with the maximum error around \(10^{-1}\) for \(k=1\) and around \(10^{-4}\) for \(k=5\). One can note that the computed HLConcELM solution is more accurate for a smoother solution field (larger k).

Tables 14 and 15 illustrate the convergence behavior of the HLConcELM errors with respect to the number of hidden nodes in the neural network and the number of collocation points (Q). Several cases corresponding to smooth and non-smooth solution fields are shown. The simulation parameter values are provided in the captions of these tables. The neural network architecture is given by [2, M, 50, 1], where M is either fixed at \(M=800\) or varied systematically. The set of collocation points is either fixed at \(Q=3\times (20\times 20)\) or varied systematically. For the smooth case (\(k=3\)), the HLConcELM solution exhibits an exponential convergence with respect to M and Q. For the non-smooth cases (\(k=1,2,5\)), the convergence is markedly slower. Nonetheless, for smoother solution fields (larger k) we can generally observe an initial exponential decrease in the HLConcELM errors as M or Q increases, with the error reduction slowing down once M or Q reaches a certain level. For example, with the case \(k=5\) one can observe in Table 14 the initial exponential decrease in the errors with increasing M for \(M\leqslant 300\).

Appendix E. Kuramoto–Sivashinsky Equation

Fig. 27

(Appendix E) Kuramoto–Sivashinsky equation (case #1): Distributions of a the exact solution, b the HLConcELM solution and c its point-wise absolute error. In HLConcELM, 5 uniform time blocks, neural network \({{\textbf{M}}}=[2,400,50,1]\) and \(Q=25\times 25\) uniform collocation points per time block, hidden magnitude vector \({{\textbf{R}}}=(1.64,0.05)\), Gaussian activation function

This appendix provides a test of the HLConcELM method with the Kuramoto–Sivashinsky equation [38, 55]. We consider the domain \((x,t)\in {\varOmega } = [a,b]\times [0,t_f]\), and the Kuramoto–Sivashinsky equation on \({\varOmega }\) with periodic boundary conditions,

$$\begin{aligned}&\frac{\partial u}{\partial t} + \alpha u\frac{\partial u}{\partial x} + \beta \frac{\partial ^2u}{\partial x^2} + \gamma \frac{\partial ^4u}{\partial x^4} = f(x,t), \end{aligned}$$
(39a)
$$\begin{aligned}&u(a,t) = u(b,t), \quad \left. \frac{\partial u}{\partial x}\right| _{(a,t)}= \left. \frac{\partial u}{\partial x}\right| _{(b,t)}, \end{aligned}$$
(39b)
$$\begin{aligned}&\left. \frac{\partial ^2 u}{\partial x^2}\right| _{(a,t)}= \left. \frac{\partial ^2 u}{\partial x^2}\right| _{(b,t)}, \quad \left. \frac{\partial ^3 u}{\partial x^3}\right| _{(a,t)}= \left. \frac{\partial ^3 u}{\partial x^3}\right| _{(b,t)}, \end{aligned}$$
(39c)
$$\begin{aligned}&u(x,0) = g(x). \end{aligned}$$
(39d)

In these equations, \((\alpha ,\beta ,\gamma )\) are constants, \(u(x,t)\) is the field function to be solved for, f is a prescribed source term, and g denotes the initial distribution. The domain parameters a, b and \(t_f\) will be specified below. We solve this problem by the locHLConcELM method (see Remark 6) together with the block time marching scheme (see Remark 5). The seed for the random number generators is set to 100 in the following tests.
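The block time marching scheme referred to above advances the solution one time block at a time, with the solution at the end of a block serving as the initial condition for the next block. A schematic of this loop is sketched below; `solve_on_block` is a hypothetical placeholder for the per-block (loc)HLConcELM solve, not a function from the paper.

```python
import numpy as np

def block_time_marching(t0, tf, n_blocks, u_init, solve_on_block):
    """March over uniform time blocks; each block is solved independently.

    solve_on_block(t_start, t_end, u_start) is assumed to return a callable
    u_block(x, t) representing the solution on that time block.
    """
    block_edges = np.linspace(t0, tf, n_blocks + 1)
    solutions, u_start = [], u_init
    for t_start, t_end in zip(block_edges[:-1], block_edges[1:]):
        u_block = solve_on_block(t_start, t_end, u_start)
        solutions.append((t_start, t_end, u_block))
        # the block's terminal state becomes the next block's initial condition
        u_start = lambda x, ub=u_block, te=t_end: ub(x, te)
    return solutions
```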

Table 16 (Appendix E) Kuramoto–Sivashinsky equation (case #1): the maximum/rms errors of the HLConcELM solution versus the number of collocation points (Q) on two neural networks
Table 17 (Appendix E) Kuramoto–Sivashinsky equation (case #1): the maximum/rms errors of the HLConcELM solution versus the number of nodes (M) on two network architectures [2, M, 50, 1] and [2, 50, M, 1] (M varied)

Case #1: Manufactured Analytic Solution We first consider a manufactured analytic solution to (39) to illustrate the convergence behavior of HLConcELM. We employ the following parameter values,

$$\begin{aligned} a = 0, \quad b = 2, \quad t_f = 1, \quad \alpha = 1, \quad \beta = 1, \quad \gamma = 0.1, \end{aligned}$$

and the analytic solution given by

$$\begin{aligned} \begin{aligned} u(x,t) =&\left[ \frac{3}{2}\cos \left( \pi x + \frac{7\pi }{20} \right) + \frac{27}{20}\cos \left( 2\pi x - \frac{3\pi }{5} \right) \right] \left[ \frac{3}{2}\cos \left( \pi t + \frac{7\pi }{20} \right) \right. \\&\left. + \frac{27}{20}\cos \left( 2\pi t - \frac{3\pi }{5} \right) \right] . \end{aligned} \end{aligned}$$
(40)

The source term f and the initial distribution g are chosen such that the expression (40) satisfies the system (39). The distribution of this solution is shown in Fig. 27a.
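The source term for this manufactured solution can be generated symbolically. The sketch below (our own, using SymPy) substitutes the expression (40) into the left-hand side of (39a) with the parameter values above to obtain f, and evaluates \(g(x)=u(x,0)\).

```python
import sympy as sp

x, t = sp.symbols('x t')
alpha, beta, gamma = 1, 1, sp.Rational(1, 10)  # parameter values of case #1

# Manufactured solution (40)
space = sp.Rational(3, 2)*sp.cos(sp.pi*x + 7*sp.pi/20) \
      + sp.Rational(27, 20)*sp.cos(2*sp.pi*x - 3*sp.pi/5)
time  = sp.Rational(3, 2)*sp.cos(sp.pi*t + 7*sp.pi/20) \
      + sp.Rational(27, 20)*sp.cos(2*sp.pi*t - 3*sp.pi/5)
u = space*time

# Source term so that u satisfies (39a); initial condition g(x) = u(x, 0)
f = sp.diff(u, t) + alpha*u*sp.diff(u, x) + beta*sp.diff(u, x, 2) \
    + gamma*sp.diff(u, x, 4)
g = u.subs(t, 0)
f_num = sp.lambdify((x, t), f, 'numpy')  # callable on the collocation points
```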

The distributions of the HLConcELM solution and its point-wise absolute error are shown in Fig. 27b, c. We have employed 5 uniform time blocks in the HLConcELM simulation, and a neural network architecture [2, 400, 50, 1] with the Gaussian activation function within each time block. The other simulation parameter values are provided in the caption of Fig. 27. The HLConcELM method captures the solution accurately, with the maximum error on the order \(10^{-8}\) in the spatial-temporal domain.

Tables 16 and 17 illustrate the exponential convergence behavior of the HLConcELM accuracy with respect to the collocation points and the network size for the Kuramoto–Sivashinsky equation. Table 16 lists the HLConcELM errors versus the number of collocation points (Q) obtained with two neural networks, with a narrow and wide last hidden layer, respectively. Table 17 shows the HLConcELM errors versus the number of nodes (M) in the first or the last hidden layer of the neural network, obtained with a fixed set of \(Q=25\times 25\) uniform collocation points. The captions of these tables provide the parameter values in these simulations. It can be observed that the HLConcELM errors decrease exponentially as the number of collocation points or the network size increases.

Fig. 28

(Appendix E) Kuramoto–Sivashinsky equation (case #2): Distributions of a the locHLConcELM solution and b the Chebfun solution. In Chebfun, 400 Fourier grid points in x, time step size \({\varDelta } t=10^{-4}\). In locHLConcELM, 20 uniform time blocks, 4 uniform sub-domains along x within each time block, neural network \({{\textbf{M}}}=[2,400,1]\) and \(Q=25\times 25\) uniform collocation points on each sub-domain, hidden magnitude vector \({{\textbf{R}}}=8.0\), sine activation function

Fig. 29

(Appendix E) Kuramoto–Sivashinsky equation (case #2): Comparison of solution profiles between locHLConcELM and Chebfun at \(t=0.2\) (a), \(t=0.5\) (b), and \(t=0.8\) (c). Profiles of the absolute error between the Chebfun and the locHLConcELM solutions at \(t=0.2\) (d), \(t=0.5\) (e), and \(t=0.8\) (f). Simulation settings and parameters follow those of Fig. 28

Case #2: No Exact Solution and Comparison with Chebfun We next consider the following parameter values and settings:

$$\begin{aligned} \begin{aligned}&a = -1, \quad b = 1, \quad t_f = 1, \quad \alpha = 5, \quad \beta = 0.5, \quad \gamma = 0.005, \\&f(x,t) = 0, \quad g(x) = -\sin (\pi x). \end{aligned} \end{aligned}$$

The exact solution for this case is unknown. We will employ the result computed by the software package Chebfun [14], with a sufficient resolution, as the reference solution to compare with HLConcELM.

Figure 28 shows the solution distributions obtained by the locHLConcELM method and by Chebfun in the spatial-temporal domain for this case. With locHLConcELM, we have employed 20 uniform time blocks, 4 uniform sub-domains (along the x direction) within each time block, and a local neural network [2, 400, 1] with \(Q=25\times 25\) uniform collocation points on each sub-domain. The sine activation, \(\sigma (x)=\sin (x)\), has been employed with the local neural networks. The Chebfun solution is obtained with 400 Fourier grid points along the x direction and a time step size \({\varDelta } t=10^{-4}\). The locHLConcELM solution agrees well with the Chebfun solution qualitatively.

Figure 29 provides quantitative comparisons between locHLConcELM and Chebfun for this case. It compares the solution profiles obtained by these two methods at three time instants \(t=0.2\), 0.5 and 0.8 (top row), and also shows the corresponding profiles of the absolute error between these two methods (bottom row). No difference can be discerned from the solution profiles between locHLConcELM and Chebfun. The errors between these two methods generally increase over time, with the maximum error on the order \(10^{-6}\) at \(t=0.2\) and \(10^{-4}\) at \(t=0.5\) and 0.8. These results indicate that the current method has captured the solution quite accurately.

Fig. 30

(Appendix E) Kuramoto–Sivashinsky equation (case #3): Distributions of a the locHLConcELM solution and b the Chebfun solution. In Chebfun, 1000 Fourier grid points in x, time step size \({\varDelta } t=10^{-5}\). In locHLConcELM, 12 time blocks (time block size: 0.025 for the first 8 time blocks, 0.0125 for the last 4 time blocks), 10 uniform sub-domains along x within each time block, neural network \({{\textbf{M}}}=[2,300,1]\) and \(Q=21\times 21\) uniform collocation points within each sub-domain, hidden magnitude vector \({{\textbf{R}}}=2.5\), sinc activation function

Fig. 31

(Appendix E) Kuramoto–Sivashinsky equation (case #3): Comparison of solution profiles between locHLConcELM and Chebfun at a \(t=0.05\), b \(t=0.1\), c \(t=0.15\), and d \(t=0.2\). Profiles of the absolute error between the locHLConcELM solution and the Chebfun solution at e \(t=0.05\), f \(t=0.1\), g \(t=0.15\), and h \(t=0.2\). Simulation settings and parameters follow those of Fig. 30

Case #3: Another Comparison With Chebfun We consider still another set of problem parameters as follows:

$$\begin{aligned} \begin{aligned}&a = -1, \quad b = 1, \quad t_f = 0.25, \quad \alpha = 6, \quad \beta = 0.5, \quad \gamma = 0.001, \\&f(x,t) = 0, \quad g(x) = -\sin (\pi x). \end{aligned} \end{aligned}$$

We again compare the HLConcELM result with the reference solution computed by Chebfun.

Figure 30 compares distributions of the locHLConcELM solution and the Chebfun solution. With locHLConcELM, we have employed 12 time blocks, 10 uniform sub-domains along the x direction within each time block, a local neural network [2, 300, 1] and a uniform set of \(Q=21\times 21\) collocation points on each sub-domain. The random magnitude vector is \({{\textbf{R}}}=2.5\), and the sinc activation function (\(\sigma (x)=\frac{\sin (\pi x)}{\pi x}\)) is employed. With Chebfun, we have employed 1000 Fourier grid points along the x direction and a time step size \({\varDelta } t=10^{-5}\). The distribution of the locHLConcELM solution is qualitatively similar to that of the Chebfun solution.

Figure 31 provides a quantitative comparison of the solution profiles between locHLConcELM and Chebfun at several time instants (top row), and also shows the corresponding profiles of the absolute error between the locHLConcELM solution and the Chebfun solution (bottom row). The locHLConcELM solution agrees very well with the Chebfun solution initially, and the difference between these two solutions grows over time.

Appendix F. Schrodinger Equation

This appendix provides a test of the HLConcELM method with the Schrodinger equation. We consider the domain \((x,t)\in {\varOmega }=[a,b]\times [0,t_f]\), and the Schrodinger equation on \({\varOmega }\) with periodic boundary conditions:

$$\begin{aligned}&i\frac{\partial h}{\partial t} + \frac{1}{2}\frac{\partial ^2h}{\partial x^2} + |h|^2 h = f(x,t), \end{aligned}$$
(41a)
$$\begin{aligned}&h(a,t) = h(b,t), \quad \left. \frac{\partial h}{\partial x}\right| _{(a,t)} = \left. \frac{\partial h}{\partial x}\right| _{(b,t)}, \end{aligned}$$
(41b)
$$\begin{aligned}&h(x,0) = g(x), \end{aligned}$$
(41c)

where \(h(x,t)\) is the complex field function to be solved for, \(f(x,t)\) is a prescribed complex source term, and \(g(x)\) is the initial distribution. Let \(h = u(x,t) + iv(x,t)\), where u and v denote the real and the imaginary parts of h, respectively. The domain parameters a, b and \(t_f\) will be specified below.

We solve this problem by the HLConcELM method, or the locHLConcELM method (see Remark 6), combined with the block time marching scheme (see Remark 5). The input layer of the neural network consists of two nodes, representing x and t, respectively. The output layer also consists of two nodes, representing the real part and the imaginary part of \(h(x,t)\), respectively. Accordingly, the system (41) is re-written into an equivalent form in terms of the real part and the imaginary part of \(h(x,t)\). The reformulated system is employed in the HLConcELM simulation. When multiple sub-domains are employed in locHLConcELM, we impose \(C^1\) continuity conditions along the x direction and \(C^0\) continuity conditions along the t direction across the shared sub-domain boundaries. The seed for the random number generators is set to 100 in the HLConcELM simulations.
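For reference, writing \(h = u + iv\) and splitting the source term as \(f = f_r + if_i\) (the notation \(f_r\), \(f_i\) is introduced here for illustration), the real-valued reformulation of equation (41a) reads

$$\begin{aligned} -\frac{\partial v}{\partial t} + \frac{1}{2}\frac{\partial ^2 u}{\partial x^2} + \left( u^2+v^2\right) u = f_r(x,t), \qquad \frac{\partial u}{\partial t} + \frac{1}{2}\frac{\partial ^2 v}{\partial x^2} + \left( u^2+v^2\right) v = f_i(x,t), \end{aligned}$$

with the periodicity conditions (41b) and the initial condition (41c) imposed on u and v componentwise.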

Fig. 32

(Appendix F) Schrodinger equation (case #1): Distributions of the real part (a) and its point-wise absolute error (d), the imaginary part (b) and its point-wise absolute error (e), the norm (c) and its point-wise absolute error (f), of the HLConcELM solution \(h(x,t)\). Neural network [2, 400, 30, 2], Gaussian activation function, \(Q=25\times 25\) uniform collocation points, hidden magnitude vector \({{\textbf{R}}}=(1.7,0.01)\)

Table 18 (Appendix F) Schrodinger equation (case #1): the maximum/rms errors of the HLConcELM solution versus the number of collocation points (Q) on two neural networks
Table 19 (Appendix F) Schrodinger equation (case #1): the maximum/rms errors of the HLConcELM solution versus the number of nodes (M) on two network architectures [2, M, 30, 2] and [2, 30, M, 2] (M varied)

Case #1: Manufactured Analytic Solution We first illustrate the convergence behavior of HLConcELM using a manufactured analytic solution. We employ the following domain parameters,

$$\begin{aligned} a = -1,\quad b = 1, \quad t_f = 1.5, \end{aligned}$$

and the analytic solution \(h = u+iv\), where

$$\begin{aligned} \left\{ \begin{aligned} u(x,t) =&\left[ \frac{3}{2}\sin \left( \pi x + \frac{7\pi }{20} \right) + \frac{27}{20}\sin \left( 2\pi x - \frac{3\pi }{5} \right) \right] \left[ \frac{3}{2}\sin \left( \pi t + \frac{7\pi }{20} \right) \right. \\&\left. + \frac{27}{20}\sin \left( 2\pi t - \frac{3\pi }{5} \right) \right] , \\ v(x,t) =&\left[ \frac{5}{4}\cos \left( \pi x + \frac{7\pi }{20} \right) + \frac{3}{2}\cos \left( 2\pi x - \frac{3\pi }{5} \right) \right] \left[ \frac{5}{4}\cos \left( \pi t + \frac{7\pi }{20} \right) \right. \\&\left. + \frac{3}{2}\cos \left( 2\pi t - \frac{3\pi }{5} \right) \right] . \end{aligned} \right. \end{aligned}$$
(42)

The source term \(f(x,t)\) and the initial distribution \(g(x)\) are chosen such that the expression (42) satisfies the system (41).

Figure 32 shows distributions of the real part u, the imaginary part v, and the norm |h| of the HLConcELM solution \(h(x,t)\), as well as their point-wise absolute errors when compared with the analytic solution (42), in the spatial-temporal domain. The neural network architecture is given by \({{\textbf{M}}}=[2, 400, 30, 2]\), and the other simulation parameter values are listed in the figure caption. The HLConcELM solution is observed to be highly accurate, with the maximum error on the order of \(10^{-8}\) for all of these quantities.

The exponential convergence of the HLConcELM accuracy is illustrated by the data in Tables 18 and 19. Table 18 lists the maximum and rms errors of the real part, the imaginary part, and the norm of \(h(x,t)\) as a function of the number of collocation points (Q) obtained by HLConcELM on two neural networks with a narrow and a wide last hidden layer, respectively. Table 19 lists the HLConcELM errors for the real/imaginary parts and the norm of \(h(x,t)\) on two network architectures having two hidden layers, with the number of nodes (M) in the first or the last hidden layer varied. The values for the simulation parameters are listed in the captions of these two tables. It is evident that the HLConcELM errors decrease approximately exponentially (before saturation) as the number of collocation points or the number of nodes in the neural network increases.

Fig. 33 (Appendix F) Schrödinger equation (case #2): Distributions of a, b the real part, c, d the imaginary part, and e, f the norm, of h(x, t) obtained by HLConcELM (a, c, e) and by Chebfun (b, d, f). In Chebfun, 1024 Fourier grid points in x and time step size \({\varDelta } t=10^{-4}\). In HLConcELM, 5 uniform time blocks, 3 sub-domains along the x direction within each time block (sub-domain boundary points \({\mathcal {X}}=[-1,-0.35,0.35,1]\)), local neural network [2, 400, 2] and \(Q=25\times 25\) uniform collocation points on each sub-domain, \({{\textbf{R}}}=2.0\), Gaussian activation function

Fig. 34 (Appendix F) Schrödinger equation (case #2): Comparison of the solution profiles between locHLConcELM and Chebfun at a \(t=0.2\), b \(t=0.5\), and c \(t=0.8\). Profiles of the absolute error between the locHLConcELM solution and the Chebfun solution at d \(t=0.2\), e \(t=0.5\), and f \(t=0.8\). Simulation settings and parameters follow those of Fig. 33

Case #2: No Exact Solution and Comparison with Chebfun We next consider a case in which no exact solution is available, so we use the numerical result obtained by Chebfun [14] as the reference for comparison with the HLConcELM solution. We employ the following parameter values for the domain and the system (41),

$$\begin{aligned} a = -1, \quad b = 1, \quad t_f = 1, \quad f(x,t) = 0, \quad g(x) = \frac{7}{4}\left[ \cos (\pi x) + 1 \right] . \end{aligned}$$

Figure 33 illustrates the distributions of the locHLConcELM solution and the Chebfun solution for the real part, the imaginary part, and the norm of h(x, t). The Chebfun solution is obtained on 1024 Fourier grid points along the x direction with a time step size \({\varDelta } t=10^{-4}\). For locHLConcELM with block time marching, we have employed 5 uniform time blocks, 3 sub-domains along the x direction within each time block (interior sub-domain boundaries located at \(x=-0.35\) and 0.35), and a local neural network \({{\textbf{M}}}=[2,400,2]\) with the Gaussian activation function on each sub-domain. The other simulation parameter values are listed in the figure caption. Qualitatively, no apparent difference can be discerned between the locHLConcELM solution and the Chebfun solution.
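To make the decomposition concrete, the following numpy sketch (ours, not the authors' code) lays out the time blocks, x sub-domains, and uniform collocation grids described above; the per-sub-domain networks and the block time marching itself are omitted.

```python
import numpy as np

# Sub-domain layout for case #2: 5 uniform time blocks on [0, 1], 3 sub-domains
# along x within each time block with boundaries X = [-1, -0.35, 0.35, 1], and
# a uniform set of Q = 25 x 25 collocation points on each sub-domain.
time_block_bounds = np.linspace(0.0, 1.0, 6)     # boundaries of the 5 time blocks
x_bounds = [-1.0, -0.35, 0.35, 1.0]              # x sub-domain boundaries
Q1 = 25                                          # collocation points per direction

collocation_grids = []                           # one uniform grid per sub-domain
for k in range(len(time_block_bounds) - 1):      # time blocks (solved sequentially)
    for j in range(len(x_bounds) - 1):           # x sub-domains within the block
        x = np.linspace(x_bounds[j], x_bounds[j + 1], Q1)
        t = np.linspace(time_block_bounds[k], time_block_bounds[k + 1], Q1)
        X, T = np.meshgrid(x, t, indexing="ij")  # 25 x 25 uniform collocation grid
        collocation_grids.append((k, j, X, T))
```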

Figure 34 provides a quantitative comparison between the locHLConcELM solution and the Chebfun solution. Figure 34a–c compares profiles of the locHLConcELM solution and the Chebfun solution for the real part, the imaginary part and the norm of h(x, t) at three time instants, \(t=0.2\), 0.5 and 0.8. The locHLConcELM solution profiles and the Chebfun solution profiles essentially overlap with one another. Figure 34d–f shows profiles of the absolute error between the locHLConcELM solution and the Chebfun solution at the same time instants. The difference between locHLConcELM and Chebfun is on the order of \(10^{-4}\), indicating that the locHLConcELM result agrees well with the Chebfun result for this problem.

Appendix G. Two-Dimensional Advection Equation

This appendix provides a further test of the HLConcELM method with the advection equation in two spatial dimensions plus time. Note that the numerical results in Sect. 3.2 are for the one-dimensional advection equation (plus time). We consider the spatial-temporal domain \((x,y,t)\in {\varOmega }=[0,2]\times [0,2]\times [0,10]\), and the advection equation on \({\varOmega }\) with periodic boundary conditions,

$$\begin{aligned}&\frac{\partial u}{\partial t} - \frac{\partial u}{\partial x} - \frac{\partial u}{\partial y} = 0, \end{aligned}$$
(43a)
$$\begin{aligned}&u(0,y,t) = u(2,y,t), \quad u(x,0,t) = u(x,2,t), \end{aligned}$$
(43b)
$$\begin{aligned}&u(x,y,0) = \cos \left[ \pi \left( x+y-1 \right) \right] . \end{aligned}$$
(43c)

This initial/boundary value problem has the following exact solution,

$$\begin{aligned} u(x,y,t) = \cos \left[ \pi \left( x+y+t-1 \right) \right] . \end{aligned}$$
(44)
Fig. 35 (Appendix G) 2D Advection equation: Distributions of a the exact solution, b the HLConcELM solution, and c its point-wise absolute error in the spatial-temporal domain. In HLConcELM (with block time marching), 20 uniform time blocks, neural network [3, 1000, 1] and \(Q=15\times 15\times 15\) uniform collocation points within each time block, hidden magnitude vector \({{\textbf{R}}}=0.6\), Gaussian activation function

Table 20 (Appendix G) 2D Advection equation: the maximum and rms errors of the HLConcELM solution versus the number of uniform collocation points
Table 21 (Appendix G) 2D Advection equation: the maximum and rms errors of the HLConcELM solution versus the number of nodes in the hidden layer

We solve the problem (43) by the HLConcELM method together with block time marching (see Remark 5). We employ 20 uniform time blocks in the simulation, each time block having a size of 0.5. Within each time block we employ a neural network architecture [3, M, 1] (M varied) with the Gaussian activation function, together with a uniform set of \(Q=Q_1\times Q_1\times Q_1\) collocation points (\(Q_1\) varied). The seed for the random number generators is set to 100 in the numerical tests. After the network is trained, we evaluate the neural network on another fixed set of \(Q_{eval}=51\times 51\times 51\) uniform grid points within each time block to obtain the HLConcELM solution values for all time blocks. We then compare the HLConcELM solution with the exact solution (44) on the same set of \(Q_{eval}\) points within each time block, and compute the maximum (\(l^{\infty }\)) and rms (\(l^2\)) errors over the entire spatial-temporal domain \({\varOmega }\). The error values computed in this way are said to be associated with the neural network [3, M, 1] and the \(Q=Q_1\times Q_1\times Q_1\) collocation points.
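As a minimal sketch of this error computation (our own code, not the authors'; the trained-network evaluator predict_block is a hypothetical placeholder), the global maximum and rms errors over \({\varOmega }\) can be assembled from the per-block evaluation grids as follows:

```python
import numpy as np

def exact_solution(x, y, t):
    """Exact solution (44) of the 2D advection problem, as stated above."""
    return np.cos(np.pi * (x + y + t - 1.0))

def global_errors(predict_block, n_blocks=20, t_final=10.0, q_eval=51):
    """Maximum (l^inf) and rms (l^2) errors over the whole domain Omega.

    predict_block(X, Y, T, k) is a hypothetical placeholder that evaluates the
    trained HLConcELM network of time block k on the given grid points.
    """
    block_size = t_final / n_blocks                  # 10 / 20 = 0.5
    x = np.linspace(0.0, 2.0, q_eval)
    y = np.linspace(0.0, 2.0, q_eval)
    sq_sum, n_pts, max_err = 0.0, 0, 0.0
    for k in range(n_blocks):
        t = np.linspace(k * block_size, (k + 1) * block_size, q_eval)
        X, Y, T = np.meshgrid(x, y, t, indexing="ij")
        err = np.abs(predict_block(X, Y, T, k) - exact_solution(X, Y, T))
        max_err = max(max_err, err.max())            # l^inf over all time blocks
        sq_sum += np.sum(err**2)                     # accumulate for the rms error
        n_pts += err.size
    return max_err, np.sqrt(sq_sum / n_pts)
```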

Figure 35 illustrates the distributions of the exact solution (44), the HLConcELM solution, and the point-wise absolute error of the HLConcELM solution over \({\varOmega }\). The values for the simulation parameters in HLConcELM are provided in the figure caption. The HLConcELM solution is quite accurate, with the maximum error on the order of \(10^{-6}\) over the entire domain \({\varOmega }\).

The exponential convergence of the HLConcELM accuracy is illustrated by Tables 20 and 21. Table 20 lists the maximum and rms errors of HLConcELM (over \({\varOmega }\)) as a function of the number of collocation points (Q), obtained with a neural network [3, 1000, 1]. Table 21 lists the maximum and rms errors of HLConcELM as a function of the number of hidden nodes (M) in the neural network, obtained on a fixed set of \(Q=15\times 15\times 15\) uniform collocation points. The other simulation parameter values are provided in the captions of these tables. It is evident that the HLConcELM errors decrease exponentially (before saturation) with increasing number of collocation points or increasing number of hidden nodes in the network.
