Introduction

Integral equations have recently been investigated extensively, both theoretically and numerically. They occur in a wide variety of physical applications and in fields such as neural science, electrical engineering, economics, elasticity, and plasticity. Since these equations usually cannot be solved explicitly, approximate solutions must be obtained. There are several numerical methods for approximating the solution of Fredholm and Volterra integral equations in one and two dimensions. For example, Tricomi in his book [25] introduced the classical method of successive approximations for integral equations. The variational iteration method [15] has proved effective and convenient for solving integral equations. The homotopy analysis method (HAM) was proposed by Liao [16] and has since been applied in [1]. The Taylor expansion approach for solving integral equations was presented by Kanwal and Liu [14] and was later extended in [17]. In addition, Jafari et al. [12] applied the Legendre wavelets method to find numerical solutions of linear integral equations. In [13], an architecture of artificial neural networks (NNs) was suggested to approximate the solution of systems of linear Fredholm integral equations. To this end, truncated Taylor expansions of the unknown functions were first substituted into the original system; the proposed neural network was then applied to adjust the real coefficients of the expansions in the resulting system. In [9], a numerical method based on feed-forward neural networks was presented for solving Fredholm integral equations of the second kind. Bernstein polynomials have frequently been applied in the solution of integral equations and in approximation theory [57, 19, 20]. There are also many articles dealing with the solution and analysis of two-dimensional Fredholm and Volterra integral equations. Mirzaei and Dehghan [22] described a numerical scheme based on the moving least squares (MLS) method for solving integral equations in one- and two-dimensional spaces. Their method is meshless, since it requires no background interpolation or approximation cells and does not depend on the geometry of the domain. Hadizadeh and Asgary [11] solved linear Volterra–Fredholm integral equations of the second kind using the bivariate Chebyshev collocation method. Alipanah and Esmaeili [2] approximated the solution of the two-dimensional Fredholm integral equation using Gaussian radial basis functions based on Legendre–Gauss–Lobatto nodes and weights. Two-dimensional orthogonal triangular functions were used in [3, 18] as a new set of basis functions to approximate solutions of nonlinear two-dimensional integral equations. Babolian et al. [4] applied two-dimensional rationalized Haar functions to find numerical solutions of nonlinear two-dimensional integral equations of the second kind; they reduced the problem to a nonlinear system of algebraic equations using a bivariate collocation method and Newton–Cotes nodes. Several other methods for solving these kinds of equations have also been developed.

This paper focuses on constructing a new algorithm, based on feed-forward neural networks, for approximating the solution of the linear two-dimensional Fredholm integral equation. For this purpose, the unknown two-variable function in the problem is first replaced by a three-layer perceptron neural network. Once the limits of integration are partitioned into a set of points, this neural network architecture can compute the output corresponding to each input vector. A cost function to be minimized is then defined on these points. The suggested neural network adjusts its parameters (the weights and biases) to any desired degree of accuracy using a learning algorithm based on the gradient descent method. Here is an outline of the paper. In “Preliminaries”, the basic notations and definitions of integral equations and artificial neural networks are briefly presented. “The general method” describes how to find an approximate solution of the given two-dimensional integral equation using the proposed approach. Finally, in “An example”, a numerical example is provided and the results are compared with the analytical solution to demonstrate the validity and applicability of the method.

Preliminaries

In this section, we focus on the basic definitions and introductory concepts of integral equations. In addition, the basic principles of the artificial neural network (ANN) approach are reviewed for solving linear second-kind two-dimensional integral equations (2D-IEs).

Integral equations

Integral equations appear in many scientific and engineering applications, especially when initial value problems or boundary value problems are converted to integral equations. As stated before, we review some basic notions of integral equations as well as linear two-dimensional integral equations of the second kind.

Definition 2.1

Let $f : [a, b] \to \mathbb{R}$. For each partition $P = \{t_0, t_1, \ldots, t_n\}$ of $[a, b]$ and for arbitrary $\xi_i \in [t_{i-1}, t_i]$ ($1 \le i \le n$), set

$$R_P = \sum_{i=1}^{n} f(\xi_i)\,(t_i - t_{i-1}), \qquad \Delta := \max\{\,|t_i - t_{i-1}| : i = 1, \ldots, n\,\}.$$

The definite integral of $f(t)$ over $[a, b]$ is

$$\int_a^b f(t)\,dt = \lim_{\Delta \to 0} R_P,$$

provided that this limit exists in the metric $D$ [25].
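
As an illustration of Definition 2.1, the following minimal Python sketch approximates a definite integral by the Riemann sum $R_P$ on a uniform partition, taking the midpoints as the arbitrary points $\xi_i$; the integrand and partition size are only illustrative choices.

```python
import numpy as np

def riemann_sum(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by the Riemann sum R_P
    of Definition 2.1 on a uniform partition with n subintervals,
    choosing xi_i as the midpoint of [t_{i-1}, t_i]."""
    t = np.linspace(a, b, n + 1)           # partition t_0 < t_1 < ... < t_n
    xi = 0.5 * (t[:-1] + t[1:])            # xi_i in [t_{i-1}, t_i]
    return np.sum(f(xi) * np.diff(t))      # sum of f(xi_i) (t_i - t_{i-1})

# As Delta -> 0 (n -> infinity) the sum tends to the definite integral:
print(riemann_sum(np.sin, 0.0, np.pi))     # approximately 2.0
```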

Definition 2.2

The linear two-dimensional Fredholm integral equation (2D-FIE) of the second kind is given in the form [2]

$$F(x, y) = f(x, y) + \lambda \int_c^d \!\! \int_a^b k(x, y, s, t)\, F(s, t)\, ds\, dt, \qquad (x, y) \in [a, b] \times [c, d],$$
(1)

where $\lambda$ is a constant parameter, and the kernel $k$ and the function $f$ are given analytic functions in $L^2([a, b] \times [c, d])$. The two-variable unknown function $F$, which must be determined, appears both inside and outside the integral sign; this is a characteristic feature of a second-kind integral equation. If the unknown function appears only inside the integral sign, the resulting equation is of the first kind.

If the kernel satisfies $k(x, y, s, t) = 0$ for $s > x$, $t > y$ in Eq. (1), we obtain the linear two-dimensional Volterra integral equation (2D-VIE) [24]

$$F(x, y) = f(x, y) + \lambda \int_c^y \!\! \int_a^x k(x, y, s, t)\, F(s, t)\, ds\, dt, \qquad (x, y) \in [a, b] \times [c, d].$$
(2)

It should be noted that if one of the integrals has a variable limit of integration while the other has fixed limits, the equation is called a Volterra–Fredholm integral equation. Clearly, two-dimensional integral equations appear in many forms; the three distinct types, characterized by their limits of integration, have been briefly introduced above. Notice that if the function $f(x, y)$ in these integral equations is identically zero, the equation is called homogeneous; otherwise it is called inhomogeneous. These concepts play a major role in the structure of the solution.

Artificial neural networks

Artificial neural networks (ANNs) can be considered as simplified computational structures inspired by processes observed in natural networks of biological neurons in the brain. They are nonlinear mapping architectures based on the function of the human brain and can therefore be considered powerful tools for modeling, especially when the underlying data relationship is unknown. A very important feature of these networks is their adaptive nature, in which “learning by example” replaces “programming” when solving problems. In other words, in contrast to conventional methods, which are designed to perform a specific task, most neural networks are more versatile. This makes them a very appealing computational model that can be applied to a wide variety of problems.

The multilayer feed-forward neural network, or multilayer perceptron (MLP), proposed by Rosenblatt [23], is very popular and is used more than any other type of neural network for a wide variety of tasks. This network is trained by the back-propagation algorithm, which is a supervised procedure; in other words, the network constructs a model based on examples of data with known outputs.

In this subsection, the architecture of the MLP model and its learning procedure are briefly reviewed. Consider a three-layer ANN with two input units, $N$ neurons in the hidden layer, and one output unit. A schematic representation of this neural network is given in Fig. 1. Using the figure, the input–output relation of each unit and the computed output $u_N(x, y)$ can be written as follows:

Input units:

The input neurons make no change in their inputs, so:

$$o_1 = x, \qquad o_2 = y.$$
(3)

Hidden units:

The input to a node in the hidden layer is a weighted sum of the outputs of the nodes connected to it. Each unit applies an activation function to its net input. The input/output relation is given as follows:

$$O_p = g(\mathrm{net}(p)), \qquad \mathrm{net}(p) = \sum_{i=1}^{2} w_{pi}\, o_i + b_p, \qquad p = 1, \ldots, N,$$
(4)

where $\mathrm{net}(p)$ denotes the net input to unit $p$ produced by the outputs $o_i$, $w_{pi}$ is the weight connecting neuron $i$ to neuron $p$, and $b_p$ is the bias of neuron $p$. The bias term is the baseline input to a node in the absence of any other input.

Output unit:

$$u_N(x, y) = \sum_{p=1}^{N} W_p\, O_p = \sum_{p=1}^{N} W_p\, g(\mathrm{net}(p)),$$
(5)

where $W_p$ denotes the weight connecting the $p$th hidden neuron to the output unit.
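
The forward pass of Eqs. (3)–(5) can be sketched in a few lines of Python. This is only an illustrative implementation: the paper does not fix the activation function $g$, so the hyperbolic tangent is assumed here, and the parameter shapes ($w$ of shape $(N, 2)$, $W$ and $b$ of shape $(N,)$) are a convention of this sketch.

```python
import numpy as np

def u_net(x, y, w, W, b, g=np.tanh):
    """Forward pass of the three-layer MLP of Eqs. (3)-(5):
    o = (x, y), net(p) = w_{p,1} x + w_{p,2} y + b_p, O_p = g(net(p)),
    and u_N(x, y) = sum_p W_p O_p.  The activation g = tanh is an
    assumption.  Works elementwise when x and y are arrays."""
    out = 0.0
    for p in range(len(W)):
        out = out + W[p] * g(w[p, 0] * x + w[p, 1] * y + b[p])
    return out
```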

The general method

In this section, we use the MLP method described above to derive a new numerical approach for solving the linear two-dimensional Fredholm integral equation of the second kind. In other words, we describe how to apply this method to construct a series approximation of the solution $F(x, y)$ of (1). The output of the three-layer MLP network given in Eq. (5) can be rewritten as follows:

$$u_N(x, y) = \sum_{p=1}^{N} W_p\, g\!\left( \sum_{i=1}^{2} w_{pi}\, o_i + b_p \right).$$
(6)

In order to approximate the solution, the intervals $[a, b]$ and $[c, d]$ are first partitioned into sets of points $x_i$ and $y_j$, respectively. Thus, the following set of equations is obtained:

$$u_N(x_i, y_j) = \sum_{p=1}^{N} W_p\, g\!\left( w_{p1}\, x_i + w_{p2}\, y_j + b_p \right).$$
(7)
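
The following short sketch evaluates Eq. (7) at all partition points, reusing the u_net helper above. The domain $[0, 1] \times [0, 1]$, the grid sizes, and the random initialization are illustrative assumptions.

```python
import numpy as np

# Evaluate Eq. (7) at all partition points (x_i, y_j) of an assumed
# domain [a, b] x [c, d] = [0, 1] x [0, 1], with randomly initialized
# parameters (u_net is the forward-pass sketch above).
rng = np.random.default_rng(0)
N = 3
w = 0.1 * rng.standard_normal((N, 2))
W = 0.1 * rng.standard_normal(N)
b = 0.1 * rng.standard_normal(N)
xs = np.linspace(0.0, 1.0, 11)     # partition points x_i
ys = np.linspace(0.0, 1.0, 11)     # partition points y_j
U = np.array([[u_net(xi, yj, w, W, b) for yj in ys] for xi in xs])
# U[i, j] holds u_N(x_i, y_j)
```
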
Fig. 1 Schematic diagram of the proposed MLP

Cost function

First, suppose that $u_N(x, y)$, with adjustable parameters (weights and biases), is the approximation of the unknown function $F(x, y)$. Substituting this approximation for the unknown function in the given 2D-FIE transforms Eq. (1) into a sum-of-squared-errors minimization problem over the weight and bias space of the proposed neural network. The error function at $x = x_i$ and $y = y_j$ is therefore defined as follows:

$$E_{i,j}(w, W, b) := \frac{1}{2} \left( E_N^{i,j}(w, W, b) \right)^2,$$
(8)

where

$$E_N^{i,j}(w, W, b) = u_N(x_i, y_j) - f(x_i, y_j) - \lambda \int_c^d \!\! \int_a^b k(x_i, y_j, s, t)\, u_N(s, t)\, ds\, dt.$$

Now the total error of the network is defined as:

$$E(w, W, b) = \sum_{i,j} E_{i,j}(w, W, b).$$
(9)

The goal is to minimize this function; therefore, we derive a back-propagation learning algorithm based on the present cost function.
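
A minimal sketch of this cost function is given below, reusing the u_net helper from the previous section. The double integral in $E_N^{i,j}$ is approximated here by a tensor-product Gauss–Legendre rule, and the default domain $[0, 1] \times [0, 1]$ is an assumption; the paper itself does not prescribe a quadrature rule.

```python
import numpy as np

def residual(xi, yj, f, k, lam, w, W, b, g=np.tanh, nq=20,
             a=0.0, bb=1.0, c=0.0, d=1.0):
    """E_N^{i,j} of Eq. (8): the residual of the 2D-FIE (1) at the
    collocation point (x_i, y_j).  The double integral over
    [a, b] x [c, d] is approximated by a tensor-product Gauss-Legendre
    rule with nq^2 nodes (the default domain is an assumption);
    u_net is the forward-pass sketch given earlier."""
    nodes, wts = np.polynomial.legendre.leggauss(nq)
    s = 0.5 * (bb - a) * (nodes + 1.0) + a          # quadrature nodes in s
    t = 0.5 * (d - c) * (nodes + 1.0) + c           # quadrature nodes in t
    S, T = np.meshgrid(s, t, indexing="ij")
    WQ = 0.25 * (bb - a) * (d - c) * np.outer(wts, wts)
    integral = np.sum(WQ * k(xi, yj, S, T) * u_net(S, T, w, W, b, g))
    return u_net(xi, yj, w, W, b, g) - f(xi, yj) - lam * integral

def total_cost(xs, ys, f, k, lam, w, W, b, **kw):
    """E(w, W, b) of Eq. (9): the sum of E_{i,j} = 0.5 (E_N^{i,j})^2
    over all collocation points (x_i, y_j)."""
    return sum(0.5 * residual(xi, yj, f, k, lam, w, W, b, **kw) ** 2
               for xi in xs for yj in ys)
```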

Proposed learning algorithm

The multilayer feed-forward neural network is trained by the back-propagation algorithm, which is a supervised procedure; that is, the MLP network is trained using a supervised learning algorithm that uses the training data to adjust the network weights and biases. Let $w_{p,q}$, $W_p$ and $b_p$ (for $p = 1, \ldots, N$; $q = 1, 2$) be initialized at small random values. The adjustment rule for the parameter $w_{p,q}$ can be written as follows:

$$w_{p,q}(r+1) = w_{p,q}(r) + \Delta w_{p,q}(r), \qquad p = 1, \ldots, N;\; q = 1, 2,$$
(10)
$$\Delta w_{p,q}(r) = -\eta\, \frac{\partial E_{i,j}}{\partial w_{p,q}} + \alpha\, \Delta w_{p,q}(r-1),$$
(11)

where $r$ is the number of adjustments, $\eta$ is the learning rate and $\alpha$ is the momentum constant. Similar adjustment rules hold for the other parameters. Thus, our problem reduces to calculating the derivative $\partial E_{i,j} / \partial w_{p,q}$ in (11). This derivative can be calculated by the chain rule as follows:

$$\frac{\partial E_{i,j}}{\partial w_{p,q}} = \frac{\partial E_{i,j}}{\partial E_N^{i,j}} \cdot \frac{\partial E_N^{i,j}}{\partial w_{p,q}},$$
(12)

where

$$\frac{\partial u_N(x_i, y_j)}{\partial w_{p,q}} = \frac{\partial u_N(x_i, y_j)}{\partial O_p} \cdot \frac{\partial O_p}{\partial \mathrm{net}(p)} \cdot \frac{\partial \mathrm{net}(p)}{\partial w_{p,q}}.$$

Consequently,

$$\frac{\partial E_{i,j}}{\partial w_{p,q}} =
\begin{cases}
E_N^{i,j} \cdot W_p \cdot \left( x_i\, g'(\mathrm{net}(p)) - \lambda \displaystyle\int_c^d \!\! \int_a^b s\, k(x_i, y_j, s, t)\, g'(w_{p,1} s + w_{p,2} t + b_p)\, ds\, dt \right), & q = 1, \\[8pt]
E_N^{i,j} \cdot W_p \cdot \left( y_j\, g'(\mathrm{net}(p)) - \lambda \displaystyle\int_c^d \!\! \int_a^b t\, k(x_i, y_j, s, t)\, g'(w_{p,1} s + w_{p,2} t + b_p)\, ds\, dt \right), & q = 2,
\end{cases}$$
(13)

where $g'$ denotes the derivative of the activation function $g$.

A similar procedure yields the corresponding results for the parameters $W_p$ and $b_p$; we omit the details. Thus, we have:

$$\frac{\partial E_{i,j}}{\partial W_p} = E_N^{i,j} \cdot \left( g(\mathrm{net}(p)) - \lambda \int_c^d \!\! \int_a^b k(x_i, y_j, s, t)\, g(w_{p,1} s + w_{p,2} t + b_p)\, ds\, dt \right),$$
(14)

and

$$\frac{\partial E_{i,j}}{\partial b_p} = E_N^{i,j} \cdot W_p \cdot \left( g'(\mathrm{net}(p)) - \lambda \int_c^d \!\! \int_a^b k(x_i, y_j, s, t)\, g'(w_{p,1} s + w_{p,2} t + b_p)\, ds\, dt \right).$$
(15)
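
The sketch below evaluates these gradients numerically, following Eqs. (13)–(15) and reusing the residual helper above. The activation is taken as $g = \tanh$ (so $g'(z) = 1 - \tanh^2(z)$) and the double integrals are approximated by a tensor-product Gauss–Legendre rule; both choices, as well as the default domain, are assumptions of this sketch rather than prescriptions from the paper.

```python
import numpy as np

def gradients(xi, yj, f, k, lam, w, W, b, nq=20,
              a=0.0, bb=1.0, c=0.0, d=1.0):
    """Gradients of E_{i,j} with respect to w_{p,q}, W_p and b_p,
    following Eqs. (13)-(15); residual() is the sketch given earlier."""
    g = np.tanh
    dg = lambda z: 1.0 - np.tanh(z) ** 2            # g'(z) for g = tanh
    nodes, wts = np.polynomial.legendre.leggauss(nq)
    s = 0.5 * (bb - a) * (nodes + 1.0) + a
    t = 0.5 * (d - c) * (nodes + 1.0) + c
    S, T = np.meshgrid(s, t, indexing="ij")
    WQ = 0.25 * (bb - a) * (d - c) * np.outer(wts, wts)
    K = k(xi, yj, S, T)

    EN = residual(xi, yj, f, k, lam, w, W, b, g=g, nq=nq, a=a, bb=bb, c=c, d=d)
    N = len(W)
    dE_dw = np.zeros_like(w)
    dE_dW = np.zeros(N)
    dE_db = np.zeros(N)
    for p in range(N):
        net_p = w[p, 0] * xi + w[p, 1] * yj + b[p]      # net(p) at (x_i, y_j)
        net_st = w[p, 0] * S + w[p, 1] * T + b[p]       # net(p) at quadrature nodes
        # Eq. (13), cases q = 1 and q = 2
        dE_dw[p, 0] = EN * W[p] * (xi * dg(net_p) - lam * np.sum(WQ * S * K * dg(net_st)))
        dE_dw[p, 1] = EN * W[p] * (yj * dg(net_p) - lam * np.sum(WQ * T * K * dg(net_st)))
        # Eq. (14)
        dE_dW[p] = EN * (g(net_p) - lam * np.sum(WQ * K * g(net_st)))
        # Eq. (15)
        dE_db[p] = EN * W[p] * (dg(net_p) - lam * np.sum(WQ * K * dg(net_st)))
    return dE_dw, dE_dW, dE_db
```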

MLP neural networks can approximate any continuous function on a compact set to arbitrary accuracy [10]. The learning algorithm can now be summarized as follows (a code sketch of the complete training cycle is given after the steps):

Learning process

  • Step 1: $\eta > 0$, $\alpha > 0$ and $E_{\max} > 0$ are chosen. Then the quantities $w_{p,q}$, $W_p$ and $b_p$ ($p = 1, \ldots, N$; $q = 1, 2$) are initialized at small random values.

  • Step 2: Let r : = 0 where r is the number of iterations of the learning algorithm. Then the running error E is set to 0.

  • Step 3: Let $r := r + 1$. Repeat the following procedure for the different values of $i$ and $j$:

    i. Forward calculation: calculate the output $u_N(x_i, y_j)$ by presenting the inputs $x_i$ and $y_j$.

    ii. Back propagation: adjust the parameters $w_{p,q}$, $W_p$ and $b_p$ using the cost function (8).

  • Step 4: The cumulative cycle error is computed by adding the present error to $E$.

  • Step 5: The training cycle is completed. If $E < E_{\max}$, terminate the training session; if $E > E_{\max}$, set $E$ to 0 and initiate a new training cycle by going back to Step 3.
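
The sketch below strings Steps 1–5 together, building on the residual and gradients helpers above. The stopping threshold, the maximum number of cycles, the random seed and the initialization scale are illustrative choices, not values taken from the paper.

```python
import numpy as np

def train(xs, ys, f, k, lam, N=3, eta=0.5, alpha=0.05,
          E_max=1e-4, max_cycles=200, seed=0):
    """Training loop of Steps 1-5 with the momentum update of
    Eqs. (10)-(11); residual() and gradients() are the sketches above."""
    rng = np.random.default_rng(seed)
    w = 0.1 * rng.standard_normal((N, 2))            # Step 1: small random init
    W = 0.1 * rng.standard_normal(N)
    b = 0.1 * rng.standard_normal(N)
    vw, vW, vb = np.zeros_like(w), np.zeros_like(W), np.zeros_like(b)
    for r in range(1, max_cycles + 1):               # Steps 2-3: training cycles
        E = 0.0
        for xi in xs:
            for yj in ys:
                # forward calculation + back propagation at (x_i, y_j)
                dw, dW, db = gradients(xi, yj, f, k, lam, w, W, b)
                vw = -eta * dw + alpha * vw; w = w + vw     # Eqs. (10)-(11)
                vW = -eta * dW + alpha * vW; W = W + vW
                vb = -eta * db + alpha * vb; b = b + vb
                # Step 4: accumulate the cycle error E_{i,j}
                E += 0.5 * residual(xi, yj, f, k, lam, w, W, b) ** 2
        if E < E_max:                                 # Step 5: stopping test
            break
    return w, W, b, E
```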

An example

In this section, in order to investigate the accuracy of the proposed method, we consider an example of a linear two-dimensional integral equation of the second kind. The computed values of the approximate solution are calculated over a number of iterations and the cost function is plotted. Moreover, to show the efficiency of the present method, the results are compared with the exact solution.

Example 4.1

Consider the linear 2D-FIE

$$F(x, y) = f(x, y) + \int_0^1 \!\! \int_0^1 \left( s \sin(t) + 1 \right) F(s, t)\, ds\, dt,$$
(16)

where

$$f(x, y) = x \cos(y) - \frac{1}{6} \sin(1) \left( 3 + \sin(1) \right),$$

with the exact solution $F(x, y) = x \cos(y)$. In this example, we illustrate the use of the FNN technique to approximate the solution of this integral equation. The following specifications are used in the simulations (a usage sketch of the training code appears after the list):

  1. The number of hidden units: $N = 3$,

  2. Learning rate: $\eta = 0.5$,

  3. Momentum constant: $\alpha = 0.05$.
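
Under the assumption that the helpers sketched in the previous sections (u_net, train) are available, Example 4.1 can be set up as follows; the number of collocation points per direction and the evaluation point are illustrative.

```python
import numpy as np

# Data of Example 4.1: kernel, right-hand side and exact solution.
k = lambda x, y, s, t: s * np.sin(t) + 1.0
f = lambda x, y: x * np.cos(y) - (1.0 / 6.0) * np.sin(1.0) * (3.0 + np.sin(1.0))
exact = lambda x, y: x * np.cos(y)

xs = np.linspace(0.0, 1.0, 5)      # collocation points x_i (illustrative)
ys = np.linspace(0.0, 1.0, 5)      # collocation points y_j (illustrative)
w, W, b, E = train(xs, ys, f, k, lam=1.0, N=3, eta=0.5, alpha=0.05)

# Pointwise error of the trained network at an arbitrary point:
print(abs(u_net(0.5, 0.5, w, W, b) - exact(0.5, 0.5)))
```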

Numerical results can be found in Table 1, and Fig. 2 shows the cost function over 20 iterations. Figures 3, 4, 5 and 6 show the convergence behavior of the computed weight parameters $w_{p,q}$ and $W_p$ and of the biases $b_p$ for different numbers of iterations.

Table 1 Numerical results for Example 4.1 by the FNN technique
Fig. 2 The cost function for Example 4.1 versus the number of iterations

Fig. 3 Convergence of the weights $W_r$ for Example 4.1

Fig. 4 Convergence of the weights $w_{1,r}$ for Example 4.1

Fig. 5 Convergence of the weights $w_{2,r}$ for Example 4.1

Fig. 6 Convergence of the biases $b_r$ for Example 4.1

There is no magic formula for selecting the optimum number of hidden neurons; however, some rules of thumb are available. A rough approximation can be obtained from the geometric pyramid rule proposed by Masters [21]: for a three-layer network with $n$ input and $m$ output neurons, the hidden layer would have at least $\lfloor \sqrt{n m} \rfloor + 1$ neurons. For the present network, with $n = 2$ and $m = 1$, this suggests at least $\lfloor \sqrt{2} \rfloor + 1 = 2$ hidden neurons.

To show the convergence of the proposed method, we also solve Example 4.1 using the shifted Legendre collocation method. The reason for choosing this method is its simplicity. The details of the shifted Legendre collocation method are as follows.

Shifted Legendre collocation method

The Legendre polynomials $P_n(x)$, $n = 0, 1, \ldots$, are the eigenfunctions of the singular Sturm–Liouville problem

$$\left( (1 - x^2)\, P_n'(x) \right)' + n(n + 1)\, P_n(x) = 0.$$

They are also orthogonal with respect to the $L^2$ inner product on the interval $[-1, 1]$ with the weight function $w(x) = 1$, that is,

$$\int_{-1}^{1} P_n(x)\, P_m(x)\, dx = \frac{2}{2n + 1}\, \delta_{nm},$$

where $\delta_{nm}$ is the Kronecker delta. The Legendre polynomials satisfy the recursion relation

$$P_{n+1}(x) = \frac{2n + 1}{n + 1}\, x\, P_n(x) - \frac{n}{n + 1}\, P_{n-1}(x),$$

where $P_0(x) = 1$ and $P_1(x) = x$. If $P_n(x)$ is normalized so that $P_n(1) = 1$, then for any $n$ the Legendre polynomial in terms of powers of $x$ is

$$P_n(x) = \frac{1}{2^n} \sum_{m=0}^{\lfloor n/2 \rfloor} (-1)^m \binom{n}{m} \binom{2n - 2m}{n}\, x^{n - 2m},$$

where $\lfloor n/2 \rfloor$ denotes the integer part of $n/2$.
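
The three-term recursion is the simplest way to evaluate $P_n(x)$ in practice; the short sketch below implements it and can be checked against the normalization $P_n(1) = 1$.

```python
import numpy as np

def legendre_P(n, x):
    """Evaluate P_n(x) by the three-term recursion
    P_{m+1}(x) = ((2m+1) x P_m(x) - m P_{m-1}(x)) / (m + 1),
    starting from P_0(x) = 1 and P_1(x) = x."""
    x = np.asarray(x, dtype=float)
    P_prev, P_curr = np.ones_like(x), x              # P_0 and P_1
    if n == 0:
        return P_prev
    for m in range(1, n):
        P_prev, P_curr = P_curr, ((2 * m + 1) * x * P_curr - m * P_prev) / (m + 1)
    return P_curr

# Normalization check: P_n(1) = 1 for every n
print(legendre_P(5, 1.0))                            # 1.0
```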

The Legendre–Gauss–Lobatto (LGL) collocation points $-1 = x_0 < x_1 < \cdots < x_N = 1$ are the roots of $P_N'(x)$ together with the endpoints $-1$ and $1$. Explicit formulas for the LGL points are not known. The LGL points have the property that

$$\int_{-1}^{1} p(x)\, dx = \sum_{i=0}^{N} w_i\, p(x_i)$$

is exact for polynomials of degree at most $2N - 1$, where $w_i$, $0 \le i \le N$, are the LGL quadrature weights. For more details about Legendre polynomials, see [8].
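
A small sketch for computing the LGL points and weights is given below. The interior points are obtained as the roots of $P_N'$ via NumPy's Legendre class, and the weights use the standard formula $w_i = 2 / \big( N (N+1)\, P_N(x_i)^2 \big)$, which is not stated in the text and is quoted here as an assumption from the standard literature.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, legval

def lgl_points_weights(N):
    """Legendre-Gauss-Lobatto points and weights on [-1, 1]: the interior
    points are the roots of P_N'(x), the endpoints are -1 and 1, and the
    weights are w_i = 2 / (N (N + 1) P_N(x_i)^2)."""
    PN = Legendre.basis(N)                           # Legendre polynomial P_N
    interior = np.sort(PN.deriv().roots().real)      # roots of P_N'
    x = np.concatenate(([-1.0], interior, [1.0]))
    c = np.zeros(N + 1); c[N] = 1.0                  # coefficient vector selecting P_N
    w = 2.0 / (N * (N + 1) * legval(x, c) ** 2)      # LGL quadrature weights
    return x, w

# The rule is exact for polynomials of degree <= 2N - 1, e.g.
x, w = lgl_points_weights(4)
print(np.sum(w * x ** 2))                            # ~ 2/3 = integral of x^2 on [-1, 1]
```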

The shifted Legendre polynomials (ShLP) on the interval $t \in [0, 1]$ are defined by

$$\hat{P}_n(t) = P_n(2t - 1), \qquad n = 0, 1, \ldots,$$

and are obtained from the Legendre polynomials by an affine transformation. The set of ShLP is a complete $L^2[0, 1]$-orthogonal system with respect to the weight function $w(t) = 1$. Thus, any function $f \in L^2[0, 1]$ can be expanded in terms of ShLP.

The shifted Legendre–Gauss–Lobatto (ShLGL) collocation points $0 = t_0 < t_1 < \cdots < t_N = 1$ on the interval $[0, 1]$ are obtained from the LGL points $x_i$ by the transformation

$$t_i = \frac{1}{2}\,(x_i + 1), \qquad i = 0, 1, \ldots, N.$$
(17)

Thanks to the exactness of the standard LGL quadrature, it follows that for any polynomial $p$ of degree at most $2N - 1$ on $(0, 1)$,

$$\int_0^1 p(t)\, dt = \frac{1}{2} \int_{-1}^{1} p\!\left( \frac{x + 1}{2} \right) dx = \frac{1}{2} \sum_{i=0}^{N} w_i\, p\!\left( \frac{x_i + 1}{2} \right) = \sum_{i=0}^{N} \hat{w}_i\, p(t_i),$$

where $\hat{w}_i = \frac{1}{2} w_i$, $0 \le i \le N$, are the ShLGL quadrature weights. Analogous results hold for the Legendre–Gauss and Legendre–Gauss–Radau quadrature rules.
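
Shifting the rule to $[0, 1]$ according to Eq. (17) is a one-line affine map, sketched here on top of the lgl_points_weights helper above.

```python
def shlgl_points_weights(N):
    """Shifted LGL points and weights on [0, 1], Eq. (17):
    t_i = (x_i + 1) / 2 and w_hat_i = w_i / 2."""
    x, w = lgl_points_weights(N)
    return 0.5 * (x + 1.0), 0.5 * w

# Sanity check: the weights sum to the length of [0, 1].
# t, w_hat = shlgl_points_weights(4); print(np.sum(w_hat))   # ~ 1.0
```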

The function $F(x, y)$ is approximated by ShLP of degree at most $N$ in each variable as

$$F(x, y) = \sum_{i=0}^{N} \sum_{j=0}^{N} \alpha_{ij}\, \hat{P}_i(x)\, \hat{P}_j(y).$$
(18)

Now, by substituting (18) and the collocation points (17) into (16), we obtain

$$\sum_{i=0}^{N} \sum_{j=0}^{N} \alpha_{ij}\, \hat{P}_i(t_k)\, \hat{P}_j(t_l) = f(t_k, t_l) + \int_0^1 \!\! \int_0^1 \left( s \sin(t) + 1 \right) \sum_{i=0}^{N} \sum_{j=0}^{N} \alpha_{ij}\, \hat{P}_i(s)\, \hat{P}_j(t)\, ds\, dt, \qquad k, l = 0, 1, \ldots, N.$$

By solving this linear system we find $\alpha_{ij}$, $i, j = 0, 1, \ldots, N$, and hence the approximate solution $F(x, y)$.
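
The sketch below assembles and solves this $(N+1)^2 \times (N+1)^2$ linear system, reusing shlgl_points_weights from above. Evaluating the double integrals with a higher-order shifted LGL rule is an implementation choice of this sketch, not something prescribed in the text.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def shifted_legendre(n, t):
    """Shifted Legendre polynomial P_hat_n(t) = P_n(2 t - 1)."""
    c = np.zeros(n + 1); c[n] = 1.0
    return legval(2.0 * np.asarray(t) - 1.0, c)

def solve_collocation(N, f, kernel, lam=1.0, nq=20):
    """Assemble and solve the collocation system obtained by substituting
    (18) into (16) at the ShLGL points (17).  The double integrals of
    kernel * P_hat_i * P_hat_j are evaluated with an nq-point shifted LGL
    rule (an implementation choice).  Returns the (N+1) x (N+1) matrix alpha."""
    tq, wq = shlgl_points_weights(nq)              # quadrature nodes/weights on [0, 1]
    tc, _ = shlgl_points_weights(N)                # collocation points t_0, ..., t_N
    S, T = np.meshgrid(tq, tq, indexing="ij")
    WQ = np.outer(wq, wq)
    pairs = [(i, j) for i in range(N + 1) for j in range(N + 1)]
    A = np.zeros((len(pairs), len(pairs)))
    rhs = np.zeros(len(pairs))
    for row, (kk, ll) in enumerate(pairs):         # one equation per (t_k, t_l)
        tk, tl = tc[kk], tc[ll]
        rhs[row] = f(tk, tl)
        for col, (i, j) in enumerate(pairs):       # one unknown per alpha_{ij}
            integral = np.sum(WQ * kernel(tk, tl, S, T)
                              * shifted_legendre(i, S) * shifted_legendre(j, T))
            A[row, col] = shifted_legendre(i, tk) * shifted_legendre(j, tl) - lam * integral
    return np.linalg.solve(A, rhs).reshape(N + 1, N + 1)

# Example 4.1 with N = 2 (kernel and f as defined earlier in this example):
# alpha = solve_collocation(2, f, lambda x, y, s, t: s * np.sin(t) + 1.0)
# The approximation is then F(x, y) = sum_{i,j} alpha[i, j] P_hat_i(x) P_hat_j(y).
```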

We solved Example 4.1 using the method described above for $N = 2$ and $N = 3$. The results are shown in Table 2. Comparing Tables 1 and 2 shows that the results obtained in Table 1 are in good agreement with those of the collocation method.

Table 2 Numerical results for Example 4.1 by the ShLGL collocation method

Conclusions

This paper has suggested a new computational method for solving two-dimensional Fredholm integral equations. To this end, a feed-forward artificial neural network has been proposed. This network is able to estimate an approximate solution of the given equation using a learning algorithm based on the steepest descent rule. Clearly, in order to obtain an accurate solution, many learning iterations should be performed. The analyzed example illustrates the ability and reliability of the present approach; the obtained solutions, in comparison with the exact solution, show remarkable accuracy. Extensions to more general classes of integral equations are left for future studies.