Introduction and preliminaries

Many complex problems in science and engineering involve nonlinear and transcendental functions in equations of the form \(f(x)=0\). Numerical iterative schemes such as Newton’s method [42] are often used to obtain approximate solutions of such problems, because it is not always possible to obtain the exact solution by the usual algebraic processes. However, the condition \(f'(x)\ne 0\) in a neighborhood of the required root is quite restrictive for the convergence of Newton’s method, which limits its applications in practice. To overcome this difficulty, Steffensen replaced the first derivative in Newton’s iterate by a forward finite-difference approximation [58]. Traub, in his book, classified iterative methods for solving such equations as one-point or multipoint [61]. We classify iterative formulas by the information they require as follows [61]:

  1. One-point iterative method without memory: \(x_{k+1}\) is determined using only new information at \(x_{k}\); no previous information is reused.

    Thus, \(x_{k+1} = \phi (x_{k})\). Then \(\phi\) will be called a one-point iterative formula (I.F.). 

    The most commonly known example is Newton’s I.F. (iterative formula) [42]:

    $$\begin{aligned} x_{k+1}=x_{k}-\frac{f(x_{k})}{f'(x_{k})},\quad k=0,1,\ldots , \end{aligned}$$
    (1)

    and the derivative-free Steffensen method [58]:

    $$\begin{aligned} \left\{ \begin{array}{ll} w_{k}=x_{k}+\beta f(x_{k}),\quad k=0,1,2,\ldots ,\,\\ x_{k+1}=x_{k}-\frac{f(x_{k})}{f[x_{k},w_{k}]}.\\ \end{array} \right. \end{aligned}$$
    (2)
  2. One-point iterative method with memory: \(x_{k+1}\) is determined from new information at \(x_{k}\) and reused information at \(x_{k-1},\ldots ,x_{k-n}\). Thus, \(x_{k+1} = \phi (x_{k};x_{k-1},\ldots ,x_{k-n})\), and \(\phi\) is called a one-point I.F. with memory. The best-known examples of one-point I.F.s with memory are the secant I.F. [47]

    $$\begin{aligned} x_{k+1}=x_{k}-\frac{(x_{k}-x_{k-1})}{f(x_{k})-f(x_{k-1})}f(x_{k}),\quad k=1,2,\ldots , \end{aligned}$$
    (3)

    and Traub’s method [61] (a short numerical sketch of iterations (2) and (4) is given after this list):

    $$\begin{aligned} \left\{ \begin{array}{ll} \lambda _{k}=\frac{-1}{f[x_{k},x_{k-1}]},\quad k=1,2,\ldots ,\\ w_{k}=x_{k}+\lambda _{k}f(x_{k}),\,x_{k+1}=x_{k}-\frac{f(x_{k})}{f[x_{k},w_{k}]},\quad k=0,1,\ldots . \end{array} \right. \end{aligned}$$
    (4)
  3. Multipoint iterative method without memory: \(x_{k+1}\) is determined from new information at \(x_{k},w_{1}(x_{k}),\ldots ,w_{n}(x_{k}),\,n\ge 1\); no old information is reused. Thus, \(x_{k+1} = \phi [x_{k};w_{1}(x_{k}),\ldots , w_{n}(x_{k})]\), and \(\phi\) is called a multipoint I.F. Pioneering examples are Ostrowski’s method [43]

    $$\begin{aligned} \left\{ \begin{array}{ll} y_{k}=x_{k}-\frac{f(x_{k})}{f'(x_{k})},\quad k=0,1,\ldots ,\\ x_{k+1}=y_{k}-\frac{f(x_{k})}{f'(x_{k})}\frac{f(y_{k})}{f(x_{k})-2f(y_{k})},\\ \end{array} \right. \end{aligned}$$
    (5)

    and Jarratt [25]

    $$\begin{aligned} \left\{ \begin{array}{ll} y_{k}=x_{k}-\frac{2}{3}\frac{f(x_{k})}{f'(x_{k})},\quad k=0,1,\ldots ,\\ x_{k+1}=x_{k}-\frac{1}{2}\,\frac{3f'(y_{k})+f'(x_{k})}{3f'(y_{k})-f'(x_{k})}\,\frac{f(x_{k})}{f'(x_{k})},\\ \end{array} \right. \end{aligned}$$
    (6)

    and Neta’s method [41]

    $$\begin{aligned} \left\{ \begin{array}{ll} y_{k}=x_{k}-\frac{f(x_{k})}{f'(x_{k})},\quad k=0,1,\ldots ,\\ z_{k}=y_{k}-\frac{f(y_{k})}{f'(x_{k})}\frac{f(x_{k})+\beta f(y_{k})}{f(x_{k})+(\beta -2)f(y_{k})},\\ x_{k+1}=z_{k}-\frac{f(z_{k})}{f'(x_{k})}\frac{f(x_{k})-f(y_{k})}{f(x_{k})-3f(y_{k})}.\\ \end{array} \right. \end{aligned}$$
    (7)
  4. Multipoint iterative method with memory: Finally, in this category, let us define another iteration function \(\phi\) having arguments \(z_{j}\), where each such argument represents \(n+1\) quantities \(x_{j},w_{1}(x_{j}),\ldots ,w_{n}(x_{j})\) \((n \ge 1)\). Let the iteration mapping be defined by \(x_{k+1}=\phi (z_{k}; z_{k-1},\ldots ,z_{k-n})\). Then \(\phi\) is called a multipoint I.F. with memory. In this mapping, the semicolon separates the points at which new information is used from the points at which old information is reused; that is, at each iterative step, we must preserve the information of the last n approximations \(x_{j}\), and for each approximation we must calculate n expressions \(w_{1}(x_{j}), \ldots ,w_{n}(x_{j})\). Among the researchers who have worked on such methods are Cordero [10,11,12,13], Džunić [15,16,17], Petković [44,45,46,47,48,49], Lotfi [33,34,35,36,37], Soleymani [55, 56], and Wang [63, 64], among others.
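To make the classification above concrete, the following short sketch (ours, not taken from the cited sources) runs the derivative-free Steffensen iteration (2) and Traub’s with-memory iteration (4) on a sample equation. The test function, the choice \(\beta =1\), the starting value of \(\lambda\), and all helper names are illustrative assumptions only.

```python
def steffensen(f, x0, beta=1.0, iters=8):
    """Steffensen's derivative-free scheme (2): w_k = x_k + beta*f(x_k)."""
    x = x0
    for _ in range(iters):
        w = x + beta * f(x)
        if w == x or f(w) == f(x):                  # stop at roundoff level
            break
        x = x - f(x) * (x - w) / (f(x) - f(w))      # x - f(x)/f[x, w]
    return x

def traub_memory(f, x0, lam0=-0.1, iters=8):
    """Traub's one-point method with memory (4): lambda_k is recycled from the
    divided difference f[x_k, x_{k-1}] of the two most recent iterates."""
    x_prev, lam, x = None, lam0, x0
    for _ in range(iters):
        if x_prev is not None and f(x) != f(x_prev):
            lam = -(x - x_prev) / (f(x) - f(x_prev))    # -1 / f[x_k, x_{k-1}]
        w = x + lam * f(x)
        if w == x or f(w) == f(x):
            break
        x_prev, x = x, x - f(x) * (x - w) / (f(x) - f(w))
    return x

f = lambda t: t**3 - t - 2.0   # illustrative test equation, real root near 1.5214
print(steffensen(f, 1.5), traub_memory(f, 1.5))
```

The with-memory variant reuses old information (here, only the previous iterate) at no extra function-evaluation cost, which is exactly the mechanism exploited later in this paper.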

Kung–Traub conjecture [31]: A one-point iterative method without memory using n function evaluations per iteration can achieve order of convergence at most n, and a multipoint method without memory using \(n+1\) function evaluations can achieve the optimal order \(2^{n}\). Abbasbandy [1], Chun [7], Kou [29], and others have worked on one-step methods, while Petković [44], Sharma [53], Thukral [60], and others have worked on multi-step methods.

Efficiency Index (EI) We recall the efficiency index defined by Ostrowski [43] as \(\hbox {EI}=p^{1/n}\), where p is the order of convergence and n is the total number of function evaluations per iteration. Lotfi [33] and Soleymani [62] have studied iterative methods with high efficiency indices.
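For instance, Newton’s method (1) has order \(p=2\) and uses \(n=2\) evaluations per iteration (f and \(f'\)), so \(\hbox {EI}=2^{1/2}\approx 1.414\); Ostrowski’s method (5) has \(p=4\) with \(n=3\) evaluations, giving \(\hbox {EI}=4^{1/3}\approx 1.587\).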

Note 1 We use the symbols \(\rightarrow ,O,\) and \(\sim\) according to the following conventions  [61]. If \(\lim _{n\rightarrow \infty }{g(x_{n})}=C\), we write \(g(x_{n})\rightarrow C\) or \(g\rightarrow C\). If \(\lim _{x \rightarrow a}{g(x)}=C\), we write \(g(x)\rightarrow C\) or \(g\rightarrow C\). If \(f/g \rightarrow C\), where C is a nonzero constant, we write \(f=O(g)\) or \(f\sim g\).

Traub showed that it is possible to increase the order of convergence of a method without memory by reusing information obtained in previous iterations; if one can do so, the method can be developed into a method with memory. To the best of our knowledge, however, there is no method with memory that reuses the information from all previous iterations. This motivated us to focus on this problem. Therefore, in this work, we develop an adaptive method with memory that uses the information not only from the last two steps, but from all previous iterations. This technique enables us to achieve the highest efficiency both theoretically and practically. Indeed, we prove that this adaptive method with memory has efficiency index approaching 2 and hence competes with all the existing methods with and without memory in the literature. We also compare the numerical performance and efficiency index of our proposed method with several significant methods to support these claims. We approximate and update the introduced accelerator parameters in each iteration by Newton interpolating polynomials of suitable degrees. We conclude that, even with this one-step method, there is no need to resort to multipoint methods with more steps, since this adaptive with-memory method achieves an efficiency index near 2 after three iterations; from both the theoretical and the numerical point of view, it is therefore sufficient to use it in practice. This paper is organized as follows:

“A family of two-parameter iterative methods” section deals with modifying the optimal one-point method without memory introduced by Khaksar Haghani  [28], constructed by introducing two iterative parameters that are calculated with the help of Newton interpolating polynomials of different degrees. In “Recursive adaptive method with memory” section, the aim of this work is presented by contributing an adaptive iterative method with memory for solving nonlinear equations; the order of convergence is improved from 3.56 to 4 without additional function evaluations, reaching the maximum efficiency index. In other words, without any new function evaluations, the convergence order is improved by 100% over the basic second-order method. Comparisons of absolute errors and computational efficiencies are given in “Numerical examples” section to illustrate the convergence behavior. In “Conclusion” section, we give the concluding remarks.

A family of two-parameter iterative methods

In this section, we deal with modifying the one-point without-memory method of Khaksar Haghani  [28] so that its error equation contains two accelerating elements. Khaksar Haghani’s method has the iterative expression:

$$\begin{aligned} \left\{ \begin{array}{ll} w_{k}=x_{k}-\beta f(x_{k}),\quad k=0,1,2,\ldots ,\\ x_{k+1}=x_{k}-\frac{f(x_{k})}{f[x_{k},w_{k}]}(1+\xi \frac{f(w_{k})}{f[x_{k},w_{k}]}).\\ \end{array} \right. \end{aligned}$$
(8)

This scheme is denoted by KM, where \(\beta \in \mathfrak {R}-\{0\}\); its error equation is given by

$$\begin{aligned} e_{k+1}=(-1+f'(\alpha )\beta )(\xi -c_{2})e^{2}_{k}+O(e_{k}^{3}). \end{aligned}$$
(9)
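As an illustration only (the code is ours, not from [28]), the following sketch runs the KM iteration (8) with fixed parameters; according to (9), choosing \(\xi\) close to \(c_{2}\) or \(\beta\) close to \(1/f'(\alpha )\) shrinks the leading error term. The test function and the parameter values are assumptions made for the example.

```python
import math

def km_fixed(f, x0, beta=0.3, xi=0.0, iters=6):
    """KM family (8): w_k = x_k - beta*f(x_k), plus the xi-correction factor."""
    x = x0
    for _ in range(iters):
        w = x - beta * f(x)
        if w == x or f(w) == f(x):                 # stop at roundoff level
            break
        dd = (f(x) - f(w)) / (x - w)               # divided difference f[x_k, w_k]
        x = x - f(x) / dd * (1.0 + xi * f(w) / dd)
    return x

# illustrative use: the simple root of x*exp(x) - 1 = 0 near 0.5671
print(km_fixed(lambda t: t * math.exp(t) - 1.0, 0.5))
```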

To transform Eq. (8) into a method with memory with two accelerators, we consider the following modification of (8)  [28]:

$$\begin{aligned} \left\{ \begin{array}{ll} w_{k}=x_{k}-\beta _{k} f(x_{k}),\quad k=0,1,2,\ldots ,\\ x_{k+1}=x_{k}-\frac{f(x_{k})}{f[x_{k},w_{k}]}(1+\xi _{k} \frac{f(w_{k})}{f[x_{k},w_{k}]}),\\ \end{array} \right. \end{aligned}$$
(10)

where \(\beta _{k}\) and \(\xi _{k}\) are nonzero arbitrary parameters. In what follows, we present the error equation of method (10).

Remark 1

It is worth noting that, to the best of our knowledge, although there are many methods with memory, developing adaptive methods with memory has not been considered in the literature.

The next theorem states the error equation of method (10).

Theorem 1

Let \(I \subseteq \mathbf{R}\) be an open interval, let \(f : I \rightarrow \mathbf{R}\) be a scalar function with a simple root \(\alpha\) in I, and let the initial approximation \(x_0\) be sufficiently close to this simple zero. Then the one-step iterative method (10) has order of convergence two and satisfies the following error equation:

$$\begin{aligned} e_{k+1}=(-1+f'(\alpha )\beta )(\xi -c_{2})e^{2}_{k}+O(e_{k}^{3}). \end{aligned}$$
(11)

Proof

Let \(\alpha\) be a simple zero of the equation \(f(x)=0\) and \(x_{k}=\alpha +e_{k}\). By Taylor expansion, we have:

$$\begin{aligned} f(x_{k})=f'(\alpha )(e_{k}+c_{2}e^{2}_{k}+c_{3}e^{3}_{k}), \end{aligned}$$
(12)

where \(c_{k} =\frac{f^{(k)}(\alpha )}{k!f'(\alpha )},\,k = 2, 3, \ldots .\) Hence,

$$\begin{aligned} w_{k}-\alpha =e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e^{2}_{k})+O(e_{k}^{3}). \end{aligned}$$
(13)

Expanding \(f(w_{k})\) about \(\alpha\), we get:

$$\begin{aligned} f(w_{k})& {}= f'(\alpha )\big ((e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e^{2}_{k}))\nonumber \\&\quad +c_{2}{\big (e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e^{2}_{k})\big )}^{2}\big )+O(e_{k}^{3}). \end{aligned}$$
(14)

If \(f[x,y]=\frac{f(x)-f(y)}{x-y}\) is a divided difference, then the expression \(f[x_{k},w_{k}]\) can be written in terms of \(e_{k}\) as:

$$\begin{aligned}&f[x_{k},w_{k}]\nonumber \\&\quad =\frac{f'(\alpha )(e_{k}+c_{2}e_{k}^{2})-f'(\alpha )\big ((e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2}))+c_{2}{\big (e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2})\big )}^{2}\big )}{f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2})}. \end{aligned}$$
(15)

Dividing (14) by (15) gives:

$$\begin{aligned}&\frac{f(w_{k})}{f[x_{k},w_{k}]}\nonumber \\&\quad =-\frac{{f'(\alpha )}^{2} \beta (e_{k}+c_{2}e_{k}^{2})\big ((e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2}))+c_{2}{\big (e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2})\big )}^{2}\big )}{-f'(\alpha )(e_{k}+c_{2}e_{k}^{2})+f'(\alpha )\big ((e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2}))+c_{2}{\big (e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2})\big )}^{2}\big )}. \end{aligned}$$
(16)

We also conclude, by dividing Eq. (12) by (15):

$$\begin{aligned}&\frac{f(x_{k})}{f[x_{k},w_{k}]}\nonumber \\&\quad =\frac{{f'(\alpha )}^{2}\beta {(e_{k}+c_{2}e_{k}^{2})}^{2}}{f'(\alpha )(e_{k}+c_{2}e_{k}^{2})-f'(\alpha )\big ((e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2}))+c_{2}{\big (e_{k}-f'(\alpha ) \beta (e_{k}+c_{2}e_{k}^{2})\big )}^{2}\big )}. \end{aligned}$$
(17)

By substituting (12), (14), (16), and (17) into (10), it is obtained that

$$\begin{aligned} x_{k+1}& {} = \alpha +e_{k}-\frac{f(x_{k})}{f[x_{k},w_{k}]}\left( 1+\xi \frac{f(w_{k})}{f[x_{k},w_{k}]}\right) \nonumber \\ & {} = \alpha +(-1+f'(\alpha ) \beta )(\xi -c_{2})e^{2}_{k}+O(e_{k}^{3}). \end{aligned}$$
(18)

Therefore,

$$\begin{aligned} e_{k+1}=(-1+f'(\alpha ) \beta )(\xi -c_{2})e^{2}_{k}+O(e_{k}^{3}). \end{aligned}$$
(19)

The proof is completed. \(\square\)
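The error equation (19) can also be verified symbolically. The following sketch (an independent check, under the same assumptions as the theorem) expands one step of (10), with fixed \(\beta\) and \(\xi\), around the root; the printed \(e^{2}\) coefficient equals \((-1+f'(\alpha )\beta )(\xi -c_{2})\) in expanded form.

```python
import sympy as sp

e, beta, xi, fp, c2, c3 = sp.symbols('e beta xi fprime c2 c3')

# f expanded about the simple root alpha (argument t = x - alpha), as in (12)
f = lambda t: fp * (t + c2 * t**2 + c3 * t**3)

w = e - beta * f(e)                            # w_k - alpha, as in (13)
dd = (f(e) - f(w)) / (e - w)                   # divided difference f[x_k, w_k]
e_next = e - f(e) / dd * (1 + xi * f(w) / dd)  # x_{k+1} - alpha

# series up to O(e^3); the e**2 coefficient is (-1 + fprime*beta)*(xi - c2)
print(sp.series(e_next, e, 0, 3))
```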

Remark 2

The family of one-point methods mentioned in Eq. (10) requires two function evaluations and has order of convergence two. Therefore, this family is optimal in the sense of the Kung–Traub conjecture and possesses the computational efficiency \(\hbox {EI} = 2^{1/2} \approx 1.4142\).

Recursive adaptive method with memory

This section is concerned with deriving a novel with-memory method from (10) by using two self-accelerating parameters. Theorem 1 states that the modified method (10) has order of convergence 2 if \(\beta \ne \frac{1}{f'(\alpha )}\) and \(\xi \ne c_{2}\). Now, we pose the main question: is it possible to increase the order of convergence? If so, how can it be done, and what is the new convergence order? To answer these questions, we look at the error equation (11). It can be seen that if we set \(\beta =\frac{1}{f'(\alpha )}\) and \(\xi =c_{2}=\frac{f''(\alpha )}{2f'(\alpha )}\), then the coefficient of \(e_{k}^{2}\) vanishes. However, we do not know \(\alpha\), and consequently \(f'(\alpha )\) and \(f''(\alpha )\) cannot be computed. On the other hand, we can approximate these quantities using the available data and thereby improve the order of convergence. Following the usual idea of methods with memory, this issue can be resolved; however, we are going to do it in a more efficient, recursively adaptive way. Let us describe this in more detail. If we use information only from the current and the last iteration, we arrive at the methods introduced in  [34, 36]. Here, instead, we use the best available approximations. Hence, the following approximations are applied

$$\begin{aligned} \left\{ \begin{array}{ll} \beta _{k}=\frac{1}{f'(\alpha )}\approx \frac{1}{N'_{2}(x_{k})} ,\\ \xi _{k}=\frac{f''(\alpha )}{2f'(\alpha )}\approx \frac{N''_{3}(w_{k})}{2N'_{3}(w_{k})}, \end{array} \right. \end{aligned}$$
(20)

where \(k=1,2,\ldots\).

\(N'_{2}(x_{k}),N'_{3}(w_{k})\) and \(N''_{3}(w_{k})\) are Newton interpolating polynomials of degrees two and three, set through the three and four best available approximations (nodes) \((x_{k},x_{k-1},w_{k-1})\) and \((w_{k},x_{k},x_{k-1},w_{k-1}),\) respectively. It should be noted that using Newton interpolation of lower degree yields less accurate accelerators. Replacing the fixed parameters \(\beta\) and \(\xi\) in the iterative formula (10) by the varying \(\beta _{k}\) and \(\xi _{k}\), now computed from all previously available approximations, we propose the following new method with memory, where \(x_{0},\xi _{0},\beta _{0}\) are given and \(w_{0}=x_{0}-\beta _{0}f(x_{0})\):

$$\begin{aligned} \left\{ \begin{array}{ll} \beta _{k}=\frac{1}{N'_{2k}(x_{k})},\,\xi _{k}=\frac{N''_{2k+1}(w_{k})}{2N'_{2k+1}(w_{k})},\quad k=1,2,\ldots ,\\ w_{k}=x_{k}-\beta _{k}f(x_{k}),\,x_{k+1}=x_{k}-\frac{f(x_{k})}{f[x_{k},w_{k}]}(1+\xi _{k} \frac{f(w_{k})}{f[x_{k},w_{k}]}),\quad k=0,1,2,\ldots .\\ \end{array} \right. \end{aligned}$$
(21)

\(N'_{2k}(x_{k}),N'_{2k+1}(w_{k})\) and \(N''_{2k+1}(w_{k})\) are Newton interpolating polynomials of degrees 2k and \(2k+1\), set through the \(2k+1\) and \(2k+2\) best available approximations (nodes) \((x_{k},x_{k-1},w_{k-1},\ldots ,w_{1},x_{1},w_{0},x_{0})\) and \((w_{k},x_{k},x_{k-1},w_{k-1},\ldots ,w_{1}, x_{1},w_{0},x_{0})\), respectively. Here, we address the second question, regarding the order of convergence of the method with memory (21). In what follows, we discuss the general convergence analysis of the recursive adaptive method with memory (21). It should be noted that the convergence order varies as the iterations proceed. First, we need the following lemma:

Lemma 1

If \(\beta _{k}=\frac{1}{N'_{2k}(x_{k})}\) and \(\xi _{k}=\frac{N''_{2k+1}(w_{k})}{2N'_{2k+1}(w_{k})}\), then the estimates

$$\begin{aligned} \left\{ \begin{array}{ll} (-1+\beta _{k}f'(\alpha )) \sim \prod _{s=0}^{k-1}e_{s}e_{s,w},\\ (\xi _{k}-c_{2}) \sim \prod _{s=0}^{k-1}e_{s}e_{s,w},\\ \end{array} \right. \end{aligned}$$
(22)

where \(e_{s}=x_{s}-\alpha\) and \(e_{s,w}=w_{s}-\alpha .\)

Proof

The proof is similar to that of Lemma 1 in  [64]. \(\square\)

The following result determines the order of convergence of the one-point iterative method with memory (21).

Theorem 2

Assume that an initial estimate \(x_{0}\) is close enough to a simple root \(\alpha\) of \(f(x)=0\), that \(\beta _{0}\) and \(\xi _{0}\) are uniformly bounded above, and that f is a sufficiently differentiable real function. Then the R-order of convergence of the one-point adaptive method with memory (21) is obtained from the following system of nonlinear equations:

$$\begin{aligned} \left\{ \begin{array}{ll} r^{k}p-(1+p)(1+r+r^{2}+r^{3}+\cdots +r^{k-1})-r^{k}=0,\\ r^{k+1}-2(1+p)(1+r+r^{2}+r^{3}+\cdots +r^{k-1})-2r^{k}=0, \end{array} \right. \end{aligned}$$
(23)

where r and p are the convergence orders of the sequences \(\lbrace x_{k}\rbrace\) and \(\lbrace w_{k}\rbrace\), respectively. Also, k indicates the number of iterations.

Proof

Let \(\lbrace x_{k}\rbrace\) and \(\lbrace w_{k}\rbrace\) be convergent with orders r and p respectively. Then

$$\begin{aligned} \left\{ \begin{array}{ll} e_{k+1}\sim e_{k}^{r}\sim e_{k-1}^{r^{2}}\sim \ldots \sim e_{0}^{r^{k+1}}, \\ e_{k,w}\sim e_{k}^{p}\sim e_{k-1}^{rp}\sim \ldots \sim e_{0}^{pr^{k}}, \\ \end{array} \right. \end{aligned}$$
(24)

where \(e_{k}=x_{k}-\alpha\) and \(e_{k,w}=w_{k}-\alpha\). Now, by Lemma 1 and Eq. (24), we obtain:

$$\begin{aligned} (-1+\beta _{k}f'(\alpha ))\sim \prod _{s=0}^{k-1}e_{s}e_{s,w}& {} = (e_{0} e_{0,w})\ldots (e_{k-1} e_{k-1,w} )\nonumber \\& {} = (e_{0}e_{0}^{p})(e_{0}^{r}e_{0}^{pr})\ldots (e_{0}^{r^{k-1}} e_{0}^{r^{k-1}p})\nonumber \\& {}= e_{0}^{(1+p)+(1+p)r+\cdots +(1+p)r^{k-1}}\nonumber \\& {} = e_{0}^{(1+p)(1+r+\cdots +r^{k-1})}. \end{aligned}$$
(25)

Similarly, we get:

$$\begin{aligned} (\xi _{k}-c_{2})\sim e_{0}^{(1+p)(1+r+\cdots +r^{k-1})}. \end{aligned}$$
(26)

By considering the errors of \(w_{k}\) and \(x_{k+1}\) in Eq. (21) and Eqs. (25)–(26), we conclude:

$$\begin{aligned}&e_{k,w} \sim (-1+\beta _{k}f'(\alpha ))e_{k}\sim e_{0}^{(1+p)(1+r+\cdots +r^{k-1})}e_{0}^{r^{k}}, \end{aligned}$$
(27)
$$\begin{aligned}&e_{k+1} \sim (-1+\beta _{k}f'(\alpha ))(\xi _{k}-c_{2})e_{k}^{2}\sim e_{0}^{2(1+p)(1+r+\cdots +r^{k-1})}e_{0}^{2r^{k}}. \end{aligned}$$
(28)

Equating the powers of \(e_{0}\) on the right-hand sides of the corresponding relations in (24) and (27), and in (24) and (28), one can obtain:

$$\begin{aligned} \left\{ \begin{array}{ll} r^{k}p-(1+p)(1+r+r^{2}+r^{3}+\cdots +r^{k-1})-r^{k}=0,\\ r^{k+1}-2(1+p)(1+r+r^{2}+r^{3}+\cdots +r^{k-1})-2r^{k}=0. \end{array} \right. \end{aligned}$$
(29)

And thus we prove the result. \(\square\)

Remark 3

It should be kept in mind that the system of equations (23) includes the previous iterations for \(k=0,1,2,\ldots.\) In this way, we obtain the regular methods with memory, in which the information from the current and the previous steps is used.

Remark 4

For \(k=1\), we use the information from the current and the one previous step. In this case, the order of convergence of the method with memory can be computed from the following system of equations

$$\begin{aligned} \left\{ \begin{array}{ll} rp-(1+p)-r=0 ,\\ r^{2}-2(1+p)-2r=0. \end{array} \right. \end{aligned}$$
(30)

This system of equations has the solution \(p=\frac{1}{4}(3+\sqrt{17}) \simeq {1.78078}\), and \(r=\frac{1}{2}(3+\sqrt{17})\simeq {3.56155}.\)
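Indeed, the first equation of (30) gives \(p=\frac{r+1}{r-1}\); substituting this into the second equation and clearing the denominator yields \(r(r^{2}-3r-2)=0\), whose positive root is \(r=\frac{1}{2}(3+\sqrt{17})\), and then \(p=\frac{r+1}{r-1}=\frac{1}{4}(3+\sqrt{17})\).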

This special case recovers the result given by Khaksar Haghani  [28]. The general case, in contrast, is a new kind of adaptive with-memory approach.

Remark 5

For \(k=2\), the system of equations (23) becomes:

$$\begin{aligned} \left\{ \begin{array}{ll} r^{2}p-(1+p+rp+r+r^{2})=0 ,\\ r^{3}-2(1+p+rp+r+r^{2})=0. \end{array} \right. \end{aligned}$$
(31)

This system of equations has the solution: \(p\simeq {1.95029}\) and \(r\simeq {3.90057}\).

Remark 6

If \(k=3\), we get:

$$\begin{aligned} \left\{ \begin{array}{ll} (-1+\beta _{k}f'(\alpha ))\sim e_{k-3}e_{k-3,w}e_{k-2}e_{k-2,w}e_{k-1}e_{k-1,w}\sim e_{k-3}^{1+p+r+rp+r^{2}+r^{2}p}, \\ ( \xi _{k}-c_{2})\sim e_{k-3}e_{k-3,w}e_{k-2}e_{k-2,w}e_{k-1}e_{k-1,w}\sim e_{k-3}^{1+p+r+rp+r^{2}+r^{2}p},\\ \end{array} \right. \end{aligned}$$
(32)

and equating the exponents of \(e_{k+1}\) and \(e_{k,w}\) in the corresponding pairs of relations, as in (24)–(28), we obtain:

$$\begin{aligned} \left\{ \begin{array}{ll} r^{3}p-(1+p+r+rp+r^{2}+r^{2}p+r^{3})=0 ,\\ r^{4}-2(1+p+r+rp+r^{2}+r^{2}p+r^{3})=0. \end{array} \right. \end{aligned}$$
(33)

The positive solution of the system of equations (33) is given by \(p\simeq {1.98804}\) and \(r\simeq {3.97609}\).

Remark 7

Also, if \(k=4\), we conclude from the system of equations (23) (this case is denoted by TLAM):

$$\begin{aligned} \left\{ \begin{array}{ll} r^{4}p-(1+p+r+rp+r^{2}+r^{2}p+r^{3}+r^{3}p+r^{4})=0 ,\\ r^{5}-2(1+p+r+rp+r^{2}+r^{2}p+r^{3}+r^{3}p+r^{4})=0. \end{array} \right. \end{aligned}$$
(34)

Solving these equations, we get: \(p\simeq {1.99705}\) and \(r\simeq {3.9941}.\)
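The orders quoted in Remarks 4–7 can be reproduced numerically. The sketch below (ours; the starting guess for the solver is an illustrative assumption) solves system (23) for \(k=1,\ldots ,4\) with SciPy's fsolve.

```python
from scipy.optimize import fsolve

def system_23(k):
    """Residuals of system (23) for a given number k of remembered iterations."""
    def eqs(v):
        p, r = v
        s = sum(r**j for j in range(k))            # 1 + r + ... + r^(k-1)
        return [r**k * p - (1 + p) * s - r**k,
                r**(k + 1) - 2 * (1 + p) * s - 2 * r**k]
    return eqs

for k in range(1, 5):
    p, r = fsolve(system_23(k), [2.0, 4.0])        # starting guess near (2, 4)
    print(k, round(p, 5), round(r, 5))
# approximately: (1.78078, 3.56155), (1.95029, 3.90057),
#                (1.98804, 3.97609), (1.99705, 3.9941)
```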

Remark 8

As can easily be seen, the improvement in the order of convergence from 2 to 4 (an improvement of 100%) is attained without any additional function evaluations, which points to the very high computational efficiency of the proposed method. Therefore, the efficiency index of the proposed method (21) is \(\hbox {EI}=4^{1/2}=2\) for \(k\ge 4\).
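For completeness, the following double-precision sketch realizes the recursive adaptive scheme (21): at step k the accelerators \(\beta _{k}\) and \(\xi _{k}\) are rebuilt from interpolating polynomials through all previously stored nodes. It is only a model of the method: the test equation, the starting values \(\beta _{0},\xi _{0}\), the tolerance, and the use of numpy's polynomial fitting are our assumptions, and the high-degree fits become ill-conditioned in double precision once the nodes nearly coincide (the experiments in the next section use 2000-digit arithmetic instead).

```python
import numpy as np

def adaptive_memory(f, x0, beta0=0.01, xi0=0.0, iters=5, tol=1e-10):
    """Sketch of (21): beta_k = 1/N'_{2k}(x_k), xi_k = N''_{2k+1}(w_k)/(2 N'_{2k+1}(w_k)),
    where the Newton interpolants N are built on *all* stored nodes."""
    nodes = []                            # x_0, w_0, x_1, w_1, ... (all previous nodes)
    x, beta, xi = x0, beta0, xi0
    for k in range(iters):
        if abs(f(x)) < tol:
            break
        if k > 0:
            pts = nodes + [x]             # 2k+1 nodes -> degree-2k interpolant
            p = np.poly1d(np.polyfit(pts, [f(t) for t in pts], len(pts) - 1))
            beta = 1.0 / p.deriv(1)(x)
        w = x - beta * f(x)
        if k > 0:
            pts = nodes + [x, w]          # 2k+2 nodes -> degree-(2k+1) interpolant
            p = np.poly1d(np.polyfit(pts, [f(t) for t in pts], len(pts) - 1))
            xi = p.deriv(2)(w) / (2.0 * p.deriv(1)(w))
        dd = (f(x) - f(w)) / (x - w)      # divided difference f[x_k, w_k]
        nodes += [x, w]
        x = x - f(x) / dd * (1.0 + xi * f(w) / dd)
    return x

# illustrative use: the simple root of x**3 - 2 = 0 near x0 = 1.5
print(adaptive_memory(lambda t: t**3 - 2.0, 1.5))
```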

Numerical examples

In this section, the proposed derivative-free adaptive methods are applied to solve smooth as well as nonsmooth nonlinear equations and are compared with existing methods without and with memory. The iterative methods without memory and with memory are listed in Tables 1 and 2, respectively. Table 3 lists the exact roots \(\alpha\) and the initial approximations \(x_{0}\), which are computed using the FindRoot command of Mathematica [23]. Table 4 compares the number of function evaluations and the efficiency index of the proposed method with those of schemes with and without memory. Table 5 compares the percentage improvement of the with-memory methods over the corresponding methods without memory. Results of the constructed adaptive iterative method, for given functions f having a simple zero, are reported in Table 6. Tables 7, 8 and 9 compare our proposed method with forty-one methods with and without memory. In recent years, since high-precision computations are applied in practice, schemes with a higher efficiency index have become important. For this reason, all the computations reported here have been performed in the programming package Mathematica 10 using 2000-digit floating-point arithmetic via the “SetAccuracy” command. The errors \(\vert x_{k}-\alpha \vert\) of the approximations to the sought zeros, produced by the different methods at the first three iterations, are given in Table 6, where \(m(-n)\) stands for \(m\times 10^{-n}\). These tables also include, for each test function, the initial estimates and the last value of the computational order of convergence COC  [44] computed by the expression (if it is stable)

$$\begin{aligned} \hbox {COC}=\frac{\log |f(x_n)/f(x_{n-1})|}{\log |f(x_{n-1})/f(x_{n-2})|}\approx p, \end{aligned}$$
(35)

where p is the order of convergence. At least 40 iterative methods with and without memory have been chosen for comparison with our proposed methods, as listed next. The test functions are ones used in many papers on nonlinear equations; for example, the functions \(f_{i}(x),i=1,2,3,\ldots ,12\) are displayed in Figs. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12, respectively. Figure 13 compares methods without memory, with memory (25%, 50% and 75% improvements) and the recursive adaptive method (100% improvement) in terms of the highest possible efficiency index. The complex test function \(f_{10}\) is used to show that the proposed method is applicable in the complex domain too. In these tables, the symbols \(In\) and \(div\) denote infinity and divergence, respectively. It can be observed that our proposed method requires the fewest function evaluations and attains the highest efficiency index. Tables 4 and 5 show that the proposed method competes with the previous methods; in addition, its efficiency index is better than in all previous works, namely \(4^{1/2} =2\). The same can be observed in the second and third columns of Table 5: the proposed scheme requires the fewest function evaluations among the existing methods with and without memory. Some of the iterative methods are divergent for some of the examples. We also applied the developed adaptive method with memory (34) to different test examples and obtained results with the same behavior as above. We can see that the self-accelerating parameters, and consequently the adaptive method, play a key role in increasing the order of convergence of the iterative method.
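For reference, expression (35) translates directly into a small helper (ours); its arguments are any three consecutive iterates produced by a method under test.

```python
import math

def coc(f, x_km2, x_km1, x_k):
    """Computational order of convergence (35) from three consecutive iterates."""
    return (math.log(abs(f(x_k) / f(x_km1)))
            / math.log(abs(f(x_km1) / f(x_km2))))
```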

Fig. 1 \(f_{1}(t),t\in [-\pi , \pi ]\)

Fig. 2 \(f_{2}(t),t\in [-\pi , \pi ]\)

Fig. 3 \(f_{3}(t),t\in [-10, 10]\)

Fig. 4 \(f_{4}(t),t\in [-3, 3]\)

Fig. 5 \(f_{5}(t),t\in [-5, 2]\)

Fig. 6 \(f_{6}(t),t\in [-2, 2]\)

Fig. 7 \(f_{7}(t),t\in [-1, 1]\)

Fig. 8 \(f_{8}(t),t\in [-5, 2]\)

Fig. 9 \(f_{9}(t),t\in [0, 3]\)

Fig. 10 \(f_{10}(t),t\in [-4, 4]\)

Fig. 11 \(f_{11}(t),t\in [-5, 5]\)

Fig. 12 \(f_{12}(t),t\in [-3, 3]\)

Fig. 13 Comparison of methods without memory, with memory (25%, 50%, and 75% of improvements) and recursive adaptive (100% of improvement) in terms of highest possible efficiency index

Algorithms to find an initial approximation

1

An important aspect of implementing iterative methods for the solution of nonlinear equations and systems is the choice of the initial approximation. There are a few known ways in the literature [24] to extract a starting point for the solutions of nonlinear functions. In practice, users need to find robust approximations for all the zeros in an interval. Thus, to respond to this need, we provide a way to extract all the real zeros of a nonlinear function in the interval \(D=[ a, b]\). We use the command Reduce in Mathematica 10  [23]. Hence, we give a hybrid algorithm including two main steps, a predictor and a corrector. In the predictor step, we extract initial approximations for all the zeros in an interval up to 8 decimal places. The corrector step is then used to boost the accuracy of the starting points up to any tolerance. We also give some significant cautions for applying the procedure to different test functions. In what follows, we proceed with the oscillatory function \(f (x)=\frac{1}{10}+\cos (2+x^{2})+\sin (x)\) on the domain \(D=[ 0., 15.]\).

Let us define the function and the domain for imposing the \(Reduce [\, ]\) command as in Algorithm 1.

One may note that \(Reduce [\, ]\) works in exact arithmetic. Hence, if a nonlinear function is given in floating-point arithmetic, that is, has inexact coefficients, we should rewrite it in exact arithmetic when we enter it into the above piece of code. We then store the list of initial approximations in initialValues by the following piece of code, which also sorts the initial points. The parameter tol specifies that each member of the provided sequence is correct to at most tol decimal places (Algorithm 2).
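Algorithms 1 and 2 themselves are Mathematica code built around Reduce. As a language-neutral stand-in (not the paper's listings), the sketch below mimics the same predictor idea: scan D on a fine grid, bracket every sign change, and refine each bracket to the requested tolerance, here with bisection, whereas the corrector step of the hybrid algorithm would instead apply the proposed iterative method. The grid size and the tolerance are illustrative assumptions.

```python
import math

def predictor_zeros(f, a, b, samples=20000, tol=1e-8):
    """Grid-based stand-in for the Reduce-based predictor: bracket every sign
    change of f on [a, b] and refine it by bisection to roughly `tol`."""
    xs = [a + (b - a) * i / samples for i in range(samples + 1)]
    roots = []
    for lo, hi in zip(xs[:-1], xs[1:]):
        flo, fhi = f(lo), f(hi)
        if flo == 0.0:                    # grid point happens to be an exact zero
            roots.append(lo)
            continue
        if flo * fhi < 0.0:               # sign change -> one bracketed zero
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

f = lambda x: 0.1 + math.cos(2 + x**2) + math.sin(x)
starts = predictor_zeros(f, 0.0, 15.0)
print(len(starts))     # the text reports 59 real zeros of this f on [0, 15]
```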

It is obvious that f is highly oscillatory; using the predictor step described above, we find that it has 59 real solutions. Note that the graph of the function f is drawn in Fig. 14.

Fig. 14 The graph of the function f with finitely many zeros in an interval

Note that if a user needs much more accuracy, a higher number of steps should be taken. It should be remarked that, in order to work with such high accuracy, we must then use more than 2000 decimal places of arithmetic in our calculations.

Running the above algorithm captures all the real zeros of the nonlinear function; however, two cautions are in order. First, for highly oscillatory or nonsmooth functions, the best way is to first divide the whole interval into subintervals and then find all the zeros of the function on the subintervals. Second, in the case of a root cluster, that is, when the zeros are concentrated in a very small area, it is better to increase the \(first \,\,tolerance\) of our algorithm in the predictor step, so as to find reliable starting points, and then start the process.

Last, if the nonlinear function has an exact solution, that is to say, an integer is a solution of the nonlinear function, then the first step of our algorithm finds this exact solution, and an error-like message would be generated by applying our second step. For instance, the function \(g(x)=(x^{2}-4) \sin (100x)\) on the interval \(D =[0, 10]\) has 319 real solutions (its plot is given in Fig. 15), one of which, namely 2, is exact. Thus, the first step of Algorithm 1 finds the following list of starting points, in which 2 is the exact solution:

Fig. 15 The graph of the function g with finitely many zeros in an interval

{0.031416, 0.0628318, 0.0942478, 0.125664, 0.15708, 0.188496, 0.219912, 0.251327, 0.282743, 0.314159, 0.345575, 0.376991, 0.408407, 0.439823, 0.471239, 0.502655, 0.534071, 0.565487, 0.596903, 0.628319, 0.659734, 0.69115, 0.722566, 0.753982, 0.785398, 0.816814, 0.84823, 0.879646, 0.911062, 0.942478, 0.973894, 1.00531, 1.03673, 1.06814, 1.09956,1.13097, 1.16239, 1.19381, 1.22522, 1.25664, 1.28805, 1.31947, 1.35088,1.3823, 1.41372, 1.44513, 1.47655, 1.50796, 1.53938, 1.5708, 1.60221, 1.63363, 1.66504, 1.69646,1.72788, 1.75929, 1.79071, 1.82212, 1.85354, 1.88496, 1.91637, 1.94779, 1.9792, 2., 2.01062, 2.04203, 2.07345, 2.10487, 2.13628, 2.1677, 2.19911, 2.23053, 2.26195, 2.29336, 2.32478, 2.35619, 2.38761, 2.41903, 2.45044,2.48186, 2.51327, 2.54469, 2.57611,2.60752,2.63894, 2.67035, 2.70177, 2.73319, 2.7646, 2.79602, 2.82743, 2.85885, 2.89027, 2.92168, 2.9531, 2.98451, 3.01593, 3.04734, 3.07876, 3.11018, 3.14159, 3.17301, 3.20442, 3.23584, 3.26726, 3.29867, 3.33009, 3.3615, 3.39292, 3.42434, 3.45575, 3.48717, 3.51858,3.55, 3.58142, 3.61283, 3.64425, 3.67566, 3.70708, 3.7385, 3.76991, 3.80133, 3.83274, 3.86416, 3.89557, 3.92699, 3.95841, 3.98982, 4.02124, 4.05265, 4.08407, 4.11549, 4.1469, 4.17832, 4.20973, 4.24115, 4.27257, 4.30398, 4.3354, 4.36681, 4.39823, 4.42965, 4.46106, 4.49248, 4.52389,4.55531, 4.58673,4.61814, 4.64956, 4.68097, 4.71239, 4.7438, 4.77522, 4.80664, 4.83805,4.86947, 4.90088, 4.9323, 4.96372, 4.99513, 5.02655, 5.05796, 5.08938, 5.1208, 5.15221, 5.18363, 5.21504, 5.24646, 5.27788, 5.30929, 5.34071, 5.37212, 5.40354, 5.43496, 5.46637, 5.49779, 5.5292, 5.56062, 5.59203, 5.62345, 5.65487, 5.68628, 5.7177, 5.74911, 5.78053, 5.81195, 5.84336, 5.87478, 5.90619, 5.93761, 5.96903, 6.00044, 6.03186, 6.06327, 6.09469, 6.12611, 6.15752, 6.18894, 6.22035, 6.25177, 6.28319, 6.3146, 6.34602, 6.37743, 6.40885, 6.44026, 6.47168, 6.5031, 6.53451, 6.56593, 6.59734, 6.62876, 6.66018, 6.69159, 6.72301, 6.75442, 6.78584, 6.81726, 6.84867, 6.88009, 6.9115, 6.94292, 6.97434, 7.00575, 7.03717, 7.06858, 7.1, 7.13142, 7.16283, 7.19425, 7.22566, 7.25708, 7.28849, 7.31991, 7.35133, 7.38274, 7.41416, 7.44557, 7.47699, 7.50841, 7.53982, 7.57124, 7.60265, 7.63407, 7.66549, 7.6969, 7.72832, 7.75973, 7.79115, 7.82257, 7.85398, 7.8854, 7.91681, 7.94823, 7.97965, 8.01106, 8.04248, 8.07389, 8.13672, 8.16814, 8.19956, 8.23097, 8.26239, 8.2938, 8.32522, 8.35664, 8.38805, 8.41947, 8.45088, 8.4823, 8.51372, 8.54513, 8.57655, 8.60796, 8.63938, 8.6708, 8.70221, 8.73363, 8.76504, 8.79646, 8.82788, 8.85929, 8.89071, 8.92212, 8.95354, 8.98495, 9.01637, 9.04779, 9.0792, 9.11062, 9.14203, 9.17345, 9.20487, 9.23628, 9.2677, 9.29911, 9.33053, 9.36195, 9.39336, 9.42478, 9.45619, 9.48761, 9.51903, 9.55044,9.58186,9.61327, 9.64469, 9.67611, 9.70752, 9.73894, 9.77035, 9.80177, 9.83319, 9.8646, 9.89602, 9.92743, 9.95885, 9.99026}

Now we are able to solve nonlinear equations with finitely many roots in an interval and find all the real zeros in a short amount of time. Finding robust ways to capture the complex solutions when working with complex nonlinear functions can be considered as future work.


2

An important aspect of implementing high-order nonlinear solvers is finding very robust initial guesses to start the process when high-precision computing is needed. As discussed in "Introduction and preliminaries" section, the convergence of our iterative methods is local. To resolve this shortcoming, the best way is to rely on hybrid algorithms, in which the first stage produces a robust initial point and the second stage employs the new iterative methods when high precision is required. There are some ways in the literature to find robust starting points, mostly based on interval mathematics; see, for example,  [3]. But herein we rely on the programming package Mathematica 10  [23], which can be applied efficiently to lists for high-precision computing. In fact, using  [24], we can build a list of initial guesses close enough, with good accuracy, to start the procedure of our optimal derivative-free fourth-order methods. The procedure for finding such a robust list is based on the powerful command NDSolve, applied to the nonlinear function \(f (x)=\frac{1}{10}+\cos (2+x^{2})+\sin (x)\) on the interval \(D=[ a, b]\). Such a procedure can be written as the piece of Mathematica code in Algorithm 3, here with the oscillatory test function above on the domain \(D=[ 0., 15.]\); the output of Algorithm 3 is the plot of the function graph f(x).

Thus, we now have an efficient list of initial approximations for the zeros of a once-differentiable nonlinear function with finitely many zeros in an interval. The number of zeros and the graph of the function, including the positions of the zeros, can be obtained by the following commands (see Fig. 14); see Algorithm 4.

For this test, there are 59 zeros in the considered interval, which can easily be used as the starting points for our proposed high-order derivative-free methods. Note that the vector “initialPoints” contains the initial approximations. We end this part by mentioning that, for very oscillatory functions, it is better to first divide the interval into smaller subintervals and then obtain the solutions. The command NDSolve uses a maximum of 10,000 steps; if needed, this can be changed. In cases where NDSolve fails, this algorithm might fail too. The output of Algorithm 4 is as follows:

{1.1103225, 2.5611445, 2.9496729, 3.4537697, 3.9993453, 4.1889818, 4.7622341, 4.8587772, 5.3502573, 5.5085282, 5.8682448, 6.0980068, 6.3442691, 6.6342307, 6.7876268, 7.1310609, 7.2020570, 8.3675131, 8.3999413, 8.7079140, 8.7949106, 9.0413573, 9.1668305, 9.3646249, 9.5223085, 9.6781865, 9.8636235, 9.9828725, 10.192138, 10.279661, 10.508430, 10.569895, 10.811671, 10.856029, 11.099569, 11.141751, 11.373401, 11.426999, 11.637691, 11.708314, 11.895029, 11.984045, 12.146536, 12.253911, 12.392804, 12.518074, 12.634217, 12.776841, 12.871042, 13.030597, 13.103445, 13.279866, 13.331413, 13.525843, 13.554222, 14.647052, 14.664168, 14.849621, 14.887657 }

59

6.82717


3

Although the choice of good initial approximations is of great importance in the application of iterative methods, including multipoint methods, this task is very seldom considered in the literature. Recall that Steffensen-like methods of the second order have been most frequently used as predictors in the first step of multipoint methods. These methods are of tangent type, and therefore, they are locally convergent, which means that a reasonably close initial approximation to the sought zero should be found. Otherwise, if the chosen initial approximation is too far from the sought zero (say, if it is chosen randomly), then the applied methods, either the ones proposed in this paper or some others with local convergence developed during the last two centuries, will probably find some other (often unwanted) zero or they will diverge.

Therefore, the determination of a reasonably good approximation \(x_{0}\) that guarantees the convergence of the sequence of approximations \(\lbrace x_{k}\rbrace _{k \in N}\) to the zero of f is a significant task. It is interesting to note that initial approximations, chosen randomly in a suitable way, give acceptable results when simultaneous methods for finding all roots of polynomial equations are applied, e.g., employing Aberth’s approach  [2].

There are many methods (mainly of a non-iterative nature) and strategies for finding sufficiently good initial approximations. The well-known bisection method and its modifications belong to the simplest, but not always sufficiently efficient, techniques. There is a vast literature on this subject, so we omit details here. We only note that complete root-finding algorithms often consist of two parts: (1) a slowly convergent search algorithm to isolate distinct real or complex intervals, each containing a single root, and (2) a rapidly convergent iterative method for finding a sufficiently close approximation of the isolated root to the required accuracy. In this paper we concentrate on part (2). Applying computer algebra systems, a typical statement for solving nonlinear equations reads \(FindRoot[equation, \lbrace x, x_{0}\rbrace ]\); see, e.g., Wolfram’s computational software package Mathematica; that is, an initial approximation \(x_{0}\) is required. In finding good initial approximations, a great advance was recently achieved by developing an efficient non-iterative method of significant practical importance, originally proposed by Yun  [65]. Yun’s method is based on numerical integration, briefly referred to as NIM, in which tanh, arctan and signum functions are involved. The NIM requires neither any knowledge of the derivative \(f'(x)\) nor any iterative process. For non-pathological cases it is not necessary to have a close approximation to the zero; instead, a real interval (not necessarily tight) that contains the root (a so-called inclusion interval) is sufficient. For illustration, to find an initial approximation \(x_{0}\) of the zeros \(\alpha = -1.4044916, 1.4044916\) of the function \(h (x)=\sin (x)^{2}-x^{2}+1\) isolated in the interval \([-5, 5]\), we employed Yun’s algorithm taking \(m=250, a = -1, b=2\), and found the very good approximation \(x_{0} =1.40449\). The graph of the function h is plotted in Fig. 16.
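The following sketch is our reconstruction of the tanh-based idea behind the NIM, not Yun's original code: the signum of f is smoothed by \(\tanh (mf)\) and integrated over the inclusion interval, which recovers the location of the single sign change. The quadrature, the number of sample points and the test call are illustrative assumptions.

```python
import numpy as np

def yun_nim(f, a, b, m=250, samples=20000):
    """Tanh-based non-iterative estimate of a root of f bracketed in [a, b];
    f must change sign exactly once on the interval."""
    x = np.linspace(a, b, samples + 1)
    y = np.tanh(m * f(x))                                    # smoothed signum of f
    integral = np.sum(0.5 * (y[:-1] + y[1:]) * np.diff(x))   # trapezoidal rule
    return 0.5 * (a + b + np.sign(f(a)) * integral)

h = lambda x: np.sin(x)**2 - x**2 + 1.0
print(yun_nim(h, -1.0, 2.0, m=250))      # about 1.4045 for the zero 1.4044916
```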

Fig. 16 The graph of the function h with finitely many zeros in an interval


Remark 9

By changing a, b, and m, different values are obtained: if \(a=-1\), \(b=2\), and \(m=6\), the output of the algorithm is 1.40143; if \(a=-2\), \(b=0\), and \(m=16\), the output is \(-1.40408\); if \(a=-2\), \(b=0\), and \(m=16000\), the output is \(-1.40457\); and so on.

4

Here we use the command FindRoot after splitting the equation into two functions and drawing both functions in one combined graph.

The option WorkingPrecision specifies the accuracy of the operation. For example, if we want to find a root of the equation \(f_{2}(x)= \sin (5x)e^{x}-2=0\), we rewrite it as \(f_{20}(x)= \sin (5x)e^{x},\,f_{21}(x)=2\). Then, by first plotting both functions on the interval \([-2, 2]\) and then using the code given below in the package Mathematica, the approximate value of the root can be determined.

$$\begin{aligned} \texttt {FindRoot[Exp[x] Sin[5 x] == 2, \{x, 1\}, WorkingPrecision -> 50]} \end{aligned}$$

Below is the program output and its graph in Fig. 17.

$$\begin{aligned} \texttt {\{ x -> 1.3639731802637126891832999034292974589390644240412 \}} \end{aligned}$$
Fig. 17 The graphs of the functions \(f_{20}(x)= \sin (5x)e^{x}\) and \(f_{21}(x)=2\) on the considered interval

Table 1 Considered methods without memory
Table 2 Studied methods with memory
Table 3 The test functions
Table 4 Numerical results of the proposed method (34) for the test functions \(f_{i}(x),i=1,2,3,\ldots ,20\)
Table 5 Comparison of function evaluations and efficiency index of the proposed method with other schemes
Table 6 Comparison of the improvement in convergence order of the proposed method with other schemes
Table 7 Comparison of function evaluations and efficiency index of the proposed method with other schemes for \(f_{1}\), \(f_{2}\), \(f_{3}\) and \(f_{4}\)
Table 8 Comparison of function evaluations and efficiency index of the proposed method with other schemes for \(f_{5}\), \(f_{6}\), \(f_{7}\) and \(f_{8}\)
Table 9 Comparison of function evaluations and efficiency index of the proposed method with other schemes for \(f_{9}\), \(f_{10}\), \(f_{11}\) and \(f_{12}\)

Conclusion

In this work, we developed a new kind of method with memory for solving nonlinear equations. The convergence analysis proves that the new derivative-free methods achieve the stated orders of convergence; to this end, the accelerating parameters are approximated by Newton interpolating polynomials of different degrees. One should note that the computational accuracy strongly depends on the structure of the iterative method, the sought zero and the test function, as well as on good initial approximations. In general, in Tables 4, 5, 6, 7, 8 and 9 we have examined several methods with different orders of convergence, and it is observed that these methods support their theoretical properties. The last column of the tables shows the computational efficiency index defined by \(\hbox {EI}=\hbox {COC}^{1/n}\), where n is the number of function evaluations per iteration. The numerical results show that the proposed method is very useful for finding an acceptable approximation of the exact solution of nonlinear equations, especially when the function is non-differentiable. In fact, we have contributed further to the development of the theory of iterative processes and proposed a new accurate and efficient higher-order derivative-free method for solving nonlinear equations numerically. In other words, the efficiency index of the proposed family with memory is \(\hbox {EI} = 4^{1/2} =2\), which is better than those of the optimal one-point up to six-point methods without memory, having efficiency indexes \(\hbox {EI} = 2^{1/2}\simeq 1.414,\hbox {EI} = 4^{1/3}\simeq 1.587,\hbox {EI} = 8^{1/4}\simeq 1.681, \hbox {EI} =16^{1/5} \simeq 1.741, \hbox {EI} = 32^{1/6} \simeq 1.781, \hbox {EI} = 64^{1/7} \simeq 1.814\), respectively. It is also better than those of the other methods given in [1, 4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],  [22, 25,26,27,28,29,30, 32,33,34,35,36,37,38,39,40,41],  [44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64, 66]. A comparison between the without-memory, with-memory and adaptive methods in terms of the maximum efficiency index, alongside the number of steps per cycle, is given in Fig. 13. All algorithms are implemented using the symbolic computation capabilities of Mathematica [23]. The adaptive method with memory requires the fewest function evaluations and no derivative evaluations, and hence competes with the existing methods with and without memory.