Introduction

The Langevin function arises in diverse contexts including the classical model of paramagnetism (Langevin 1905) and the ideal freely jointed chain model (e.g., Fiasconaro and Falo 2018). In the ideal freely jointed chain model, as illustrated in Fig. 1, a chain comprises n identical rigid segments and, in an equilibrium energy environment, each segment moves independently with an angle drawn from a uniform distribution on the interval (–π, π]. With an applied point force, the expected chain extension, normalized, is given by the Langevin function (e.g., Iliafar et al. 2013), which is defined according to

$$ {\displaystyle \begin{array}{c}L(x)=\left\{\begin{array}{cc}\coth (x)-\frac{1}{x}& x\in \left(0,\infty \right)\\ {}0& x=0\end{array}\right.\\ {}L\left(-x\right)=-L(x)\end{array}} $$
(1)

and whose graph is shown in Fig. 2 for the case of x > 0. In this equation, the non-descriptive variables of x (representing the normalized force for the freely jointed chain) and L (representing the normalized chain extension) are used. The inverse Langevin function, denoted L–1, is shown in Fig. 2 for the case of y > 0, and is of interest as, for the freely jointed chain case, it defines the normalized force required for a given expected normalized chain extension. The freely jointed chain model, and chain-based network models, have applications, for example, in rubber elasticity (e.g., Ehret 2015), the micromodelling of polymers (e.g., Arruda and Boyce 1993, Boyce and Arruda 2000, Hossain and Steinmann 2012) and the biophysics of macromolecules (e.g., Holzapfel 2005).
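As a concrete illustration, Eq. 1 can be evaluated directly, although the difference coth(x) − 1/x suffers catastrophic cancellation for small x; a truncated Taylor series avoids this. The following is a minimal Python sketch (the paper's computations used Mathematica; the function name and the series cutoff of 10−2 are choices made here, not part of the source):

```python
import math

def langevin(x: float) -> float:
    """Langevin function L(x) = coth(x) - 1/x, with L(0) = 0 (Eq. 1)."""
    if x < 0.0:
        # Antisymmetry: L(-x) = -L(x)
        return -langevin(-x)
    if x < 1e-2:
        # Truncated Taylor series L(x) = x/3 - x^3/45 + 2x^5/945 - ...
        # avoids the cancellation in coth(x) - 1/x near zero
        return x / 3.0 - x**3 / 45.0 + 2.0 * x**5 / 945.0
    return 1.0 / math.tanh(x) - 1.0 / x
```

For example, langevin(1.0) ≈ 0.31304, and L(x) approaches one as x grows, consistent with the graph in Fig. 2.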

Fig. 1

Model of a freely jointed chain with n segments. The left end is anchored whilst the free end is subject to a force

Fig. 2

Graph of the Langevin and inverse Langevin functions

As the Langevin function and the inverse Langevin function are antisymmetric, it is sufficient to consider these functions, respectively, over the intervals [0, ∞) and [0, 1). As is evident in Fig. 2, the inverse Langevin function has a singularity at the end point of the interval, which complicates analysis; an explicit analytical expression for this function remains an unsolved problem. Finding an approximate analytical expression has received attention over a considerable period of time, e.g., Kuhn and Grün (1942), Treloar (1954, 1975), and Cohen (1991), and improved approximations have been detailed in recent years: Itskov et al. (2012), Nguessong et al. (2014), Darabi and Itskov (2015), Jedynak (2015), Kröger (2015), Marchi and Arruda (2015), Rickaby and Scott (2015), Petrosyan (2017), Jedynak (2018) and Marchi and Arruda (2019). These papers provide overviews of the approaches used and these include the use of Taylor series, several custom series, and the dominant approach of Padé approximants including the use of minimax optimization techniques. The Cohen approximation (Cohen 1991) has a maximum relative error of 0.0494 and improved approximations with maximum relative errors approaching 10−4 (Marchi and Arruda 2015, Jedynak 2018) have been reported, with the recent work of Marchi and Arruda (2019) detailing approximations with maximum relative errors less than 10−6. Benitez and Montáns (2018) utilized discretization and interpolation via cubic splines to obtain highly accurate numerical results; a maximum relative error of the order of 10−11 can be achieved with the use of 10,000 points.

This paper details approaches for finding analytical approximations to the inverse Langevin function that can be made arbitrarily accurate. The approach taken is distinctly different from those reported in the literature and is based on determining an intermediate function that linearizes the Langevin function. This approach was initially motivated by the potential of higher order spline functions for function approximation (Howard 2019). The use of an intermediate function facilitates, first, the definition of convergent series for the inverse Langevin function over its complete domain—an unsolved problem. Second, the use of an intermediate function facilitates error approximation and functional iteration. When combined, these lead to approximations for the inverse Langevin function which can be made arbitrarily accurate over its complete domain. For example, a first-order iteration based on a simple approximating function has relative error bounds of better than 6 × 10−8, 3 × 10−16, and 3 × 10−30 depending on the order of error approximation used. Higher-order error approximation and/or higher levels of iteration lead to significantly lower relative error bounds. The error-based approximations can be simplified by using first- and second-order Taylor series.

The structure of the paper is illustrated in Fig. 3. Section “Approximation via an intermediate function” details the theory underpinning the use of an intermediate function for finding approximations to the inverse Langevin function. This section includes a direct linearization approach which results in a non-convergent series.  Section “Convergent series for inverse Langevin function” details convergent series for the inverse Langevin function which are based on utilizing suitable basis sets to approximate the error arising from linearization.  Section “Improved approximation via error approximation” details how error approximation, of different orders, can be used to yield improved approximations. Section “Improved approximation via iteration” details approximations, with lower relative error bounds, that arise from function iteration. Associated results are detailed in Section “Results: Error approximation and function iteration”. Section “Simplified approximation via Taylor series” details how the error-based approximations can be simplified by using the first- and second-order Taylor series. Conclusions are provided in  Section “Conclusion”.

Fig. 3

Structure of paper illustrating the approaches used for approximating the inverse Langevin function

Notation

For a function f defined over the interval [α, β], an approximating function fA has a relative error, at a point x1, defined according to

$$ \mathrm{re}\left({x}_1\right)=1-\frac{f_A\left({x}_1\right)}{f\left({x}_1\right)} $$
(2)

The relative error bound for the approximating function over the interval [α, β] is defined according to

$$ \mathrm{re}=\max \left\{\mathrm{re}\;\left({x}_1\right):{x}_1\in \left[\upalpha, \upbeta \right]\right\} $$
(3)

The notation x ∈ { 0+, 1} is used with the meaning, respectively, of the limits as x approaches zero from above and one from below. The notation \( {f}^{(k)}(x)=\frac{d^k}{d{x}^k}f(x) \) is used in some instances.

Mathematica has been used to facilitate analysis and to obtain numerical results. In general, relative error results for approximations to the inverse Langevin function have been obtained by sampling the interval (0, 1) with a resolution of 0.001.
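The same relative-error sampling can be reproduced outside Mathematica. The following Python sketch computes the relative error bound of Eq. 3 over the interval (0, 1) at the stated resolution of 0.001, using bisection on the (monotonically increasing) Langevin function as a reference inverse; it is applied here to the Cohen (1991) approximation L⁻¹(y) ≈ y(3 − y²)/(1 − y²) quoted above. The bisection tolerance and bracket-growing strategy are implementation choices made here:

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x; a short series near zero avoids cancellation
    if x < 1e-2:
        return x / 3.0 - x**3 / 45.0 + 2.0 * x**5 / 945.0
    return 1.0 / math.tanh(x) - 1.0 / x

def inv_langevin(y, tol=1e-13):
    # Reference numerical inverse via bisection (L is monotonically increasing)
    lo, hi = 0.0, 1.0
    while langevin(hi) < y:      # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(hi, 1.0):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_error_bound(approx):
    # Eq. 3: max |1 - f_A(y)/f(y)|, sampling (0, 1) at resolution 0.001
    return max(abs(1.0 - approx(k / 1000.0) / inv_langevin(k / 1000.0))
               for k in range(1, 1000))

# Cohen (1991) rounded Pade approximant for the inverse Langevin function
cohen = lambda y: y * (3.0 - y * y) / (1.0 - y * y)
```

Running relative_error_bound(cohen) reproduces the quoted maximum relative error of approximately 0.0494.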

Approximation via an intermediate function

The Langevin function is defined over the interval [0, ∞) and the determination of an approximation to the inverse of this function is facilitated if a monotonically increasing intermediate function, denoted f, can be defined which, as illustrated in Fig. 4, changes from zero to infinity as its argument changes from zero to one and creates an approximately linear function L[f(x1)] with respect to x1.

Fig. 4

Illustration of the Langevin function L and the function L[f(x1)] defined by the intermediate function f

Theorem 1 Approximation via an intermediate function

Consider the case of a monotonically increasing intermediate function f which, as illustrated in Fig. 4, is such that \( L(x)=L\left[f\left({x}_1\right)\right] \) for x1 ∈ [0, 1), x ∈ [0, ∞) and where L[f(x1)] is close to being linear with a slope close to unity for x1 ∈ [0, 1). When the linearity is such that

$$ {\displaystyle \begin{array}{c}L\left[f\left({x}_1\right)\right]={x}_1+\varepsilon \left({x}_1\right)\kern0.96em {x}_1\in \left[0,1\right),\left|\varepsilon \left({x}_1\right)\right|\ll {x}_1\\ {}\kern0.96em \mathrm{or}\\ {}L\left[f\left({x}_1\right)\right]={x}_1\left[1+\varepsilon \left({x}_1\right)\right]\kern0.96em {x}_1\in \left[0,1\right),\left|\varepsilon \left({x}_1\right)\right|\ll 1\end{array}} $$
(4)

then the inverse Langevin function can be approximated according to

$$ {\displaystyle \begin{array}{c}x={L}^{-1}(y)=f\left[y-\varepsilon \left({x}_1\right)\right]\approx f\left[y-\varepsilon (y)\right]\kern1.44em y\in \left[0,1\right)\\ {}\mathrm{or}\\ {}x={L}^{-1}(y)=f\left[\frac{y}{1+\varepsilon \left({x}_1\right)}\right]\approx f\left[y\left(1-\varepsilon (y)\right)\right]\kern1.54em y\in \left[0,1\right)\end{array}} $$
(5)

For both cases, a simple as possible approximation for the inverse Langevin function is

$$ x={L}^{-1}(y)\approx f(y)\kern0.24em \;\;\;y\in \left[0,1\right). $$
(6)

As the inverse Langevin function is antisymmetric

$$ x={L}^{-1}(y)\approx -f\left(-y\right)\kern0.24em \;\;\;\;y\in \left(-1,0\right) $$
(7)

Proof

The relationships

$$ y=L(x)=L\left[f\left({x}_1\right)\right]={x}_1+\varepsilon \left({x}_1\right) $$
(8)

imply x1 = yε(x1) and

$$ x={L}^{-1}(y)=f\left({x}_1\right)=f\left[y-\varepsilon \left({x}_1\right)\right] $$
(9)

With the assumption of │ε(x1)│ << x1, it follows that y ≈ x1 and the required relationship

$$ x={L}^{-1}(y)\approx f\left[y-\varepsilon (y)\right] $$
(10)

then follows. The alternative relationships follow in a similar manner with the first order approximation

$$ x={L}^{-1}(y)=f\left({x}_1\right)=f\left[\frac{y}{1+\varepsilon \left({x}_1\right)}\right]\approx f\left[y\left(1-\varepsilon \left({x}_1\right)\right)\right] $$
(11)

being made.

Implication

These results imply that to find approximations to the inverse Langevin function, it is sufficient to find functions f which effectively linearize the Langevin function such that Eq. 4 holds.

Determining intermediate function

As is clear from Theorem 1, the better the intermediate function f is in linearizing L[f(x1)], the better the approximation L–1(y) ≈ f(y) is for the inverse Langevin function. To determine appropriate intermediate functions, a useful starting point is to consider the rates of change of L[f(x1)] at the points x1 ∈ {0+, 1}. The following theorem facilitates this.

Theorem 2 Approximations for transformed Langevin function at 0,1

For the case where f(x1) changes monotonically between zero and infinity as x1 changes between zero and one, the following are valid approximations for the transformed Langevin function L[f(x1)] for, respectively, the right neighbourhood of the point zero and the left neighbourhood of the point one:

$$ L\left[f\left({x}_1\right)\right]\approx \left\{\begin{array}{cc}\frac{f\left({x}_1\right)\left[1-\frac{2^2}{3!}\right]-{f}^2\left({x}_1\right)\left[\frac{2^2}{3!}-\frac{2^3}{4!}\right]+{f}^3\left({x}_1\right)\left[\frac{2^3}{4!}-\frac{2^4}{5!}\right]-{f}^4\left({x}_1\right)\left[\frac{2^4}{5!}-\frac{2^5}{6!}\right]+\dots }{1-f\left({x}_1\right)+\frac{2^2{f}^2\left({x}_1\right)}{3!}-\frac{2^3{f}^3\left({x}_1\right)}{4!}+\dots }& {x}_1\to 0\\ {}1-\frac{1}{f\left({x}_1\right)}\kern0.24em \;\kern1.89em {x}_1\to 1& \end{array}\right. $$
(12)

Proof

The proof is detailed in Appendix 1.

Transformed Langevin function with unity rate of change at interval end points

The goal is for the transformed function L[f(x1)] to be linear with unity slope over the interval [0, 1). A starting requirement is for L[f(x1)] to have unity slope at x1 ∈ {0+, 1−}.

Theorem 3 Transformed Langevin function with unity rate of change

For the case where f(x1) changes monotonically between zero and infinity as x1 changes between zero and one, and in a manner such that the approximations stated in Theorem 2 are valid, it follows that L[f(x1)] has unity slope at x1 = 0+, and at x1 = 1, when

$$ {\left.{f}^{(1)}\left({x}_1\right)\right|}_{x_1={0}^{+}}=3\kern1.36em {\left.{f}^{(1)}\left({x}_1\right)\right|}_{x_1={1}^{-}}={\left.{f}^2\left({x}_1\right)\right|}_{x_1={1}^{-}} $$
(13)

Proof

The proof is detailed in Appendix 2.

Notes

These constraints allow the coefficients in a valid function form for f(x1) to be simultaneously solved (see Kröger 2015 for the Padé approximant case). Petrosyan (2017) solves for a valid function approximation by combining two function forms which satisfy, separately, the two asymptotic constraints. The result that an approximation for the inverse Langevin function should have a slope of 3 at the origin is well known, widely used (e.g., Darabi and Itskov 2015), and consistent with a Taylor series expansion for this function, e.g., Itskov et al. 2012.
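The constraints of Eq. 13 can be checked numerically for a candidate intermediate function. A Python sketch (step sizes and evaluation points are choices made here) applied to the first-order form f(x1) = 3x1(1 − 2x1/3)/(1 − x1) that appears later as Eq. 15:

```python
def f(x):
    # Candidate first-order intermediate function (Eq. 15)
    return 3.0 * x * (1.0 - 2.0 * x / 3.0) / (1.0 - x)

def deriv(g, x, h=1e-6):
    # Central-difference estimate of g'(x)
    return (g(x + h) - g(x - h)) / (2.0 * h)

# Constraint at the origin: f'(0+) = 3
slope_at_zero = deriv(f, 1e-3)

# Constraint at one: f'(x1) = f(x1)^2 as x1 -> 1-, checked via the ratio
x_near_one = 0.9999
ratio_at_one = deriv(f, x_near_one) / f(x_near_one) ** 2
```

Both quantities approach their target values (3 and 1, respectively) as the evaluation points approach the interval end points.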

Low-order forms for intermediate function

There are many choices for a function that goes from zero to infinity as its argument goes from zero to one, for example \( f\left({x}_1\right)=\mathit{\ln}\left[\frac{1}{1-{x}_1}.\left(1+{\alpha}_1{x}_1+\dots \right)\right] \). However, this form does not satisfy the constraints specified in Theorem 3. The following polynomial form

$$ f\left({x}_1\right)=\frac{3{x}_1\left(1+{k}_1{x}_1+{k}_2{x}_1^2+{k}_3{x}_1^3+\dots \right)}{1-{x}_1} $$
(14)

has potential with the first-, second-, and third-order expressions being solved consistent with the constraints specified in Theorem 3 to yield:

$$ f\left({x}_1\right)=\frac{3{x}_1\left(1-2{x}_1/3\right)}{1-{x}_1}\kern0.55em {1}^{st}\kern0.55em \mathrm{order} $$
(15)
$$ f\left({x}_1\right)=\frac{3{x}_1\left(1+{k}_1{x}_1+{k}_2{x}_1^2\right)}{1-{x}_1}\kern0.24em \;\kern0.24em \;\kern0.24em \;{k}_2=-{k}_1-\frac{2}{3}\kern0.55em {2}^{nd}\kern0.55em \mathrm{order} $$
(16)
$$ f\left({x}_1\right)=\frac{3{x}_1\left(1+{k}_1{x}_1+{k}_2{x}_1^2+{k}_3{x}_1^3\right)}{1-{x}_1}\kern0.24em \;\;{k}_3=-{k}_1-{k}_2-\frac{2}{3}\kern0.55em {3}^{rd}\kern0.55em \mathrm{order} $$
(17)

Examples of second- and third-order functions that are close to optimum, in the sense of minimizing the magnitude of the maximum relative error over the interval [0, 1) in the approximation L–1(y) ≈ f(y) for the inverse Langevin function, are

$$ f\left({x}_1\right)=\frac{3{x}_1}{1-{x}_1}.\left[1-\frac{24{x}_1}{25}+\frac{22{x}_1^2}{75}\right] $$
(18)
$$ f\left({x}_1\right)=\frac{3{x}_1}{1-{x}_1}\left[1-\frac{459{x}_1}{500}+\frac{47{x}_1^2}{250}+\frac{19{x}_1^3}{300}\right]. $$
(19)

These two functions lead to maximum relative error magnitudes in the approximation L–1(y) ≈ f(y), respectively, of 0.00969 and 0.00583.
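The first of these quoted bounds can be confirmed numerically. A Python sketch (the paper used Mathematica; the bisection-based reference inverse and iteration count are choices made here) evaluating the second-order function of Eq. 18 on the 0.001-resolution grid used throughout the paper:

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x; series near zero avoids cancellation
    if x < 1e-2:
        return x / 3.0 - x**3 / 45.0 + 2.0 * x**5 / 945.0
    return 1.0 / math.tanh(x) - 1.0 / x

def inv_langevin(y):
    # Reference numerical inverse via bisection
    lo, hi = 0.0, 1.0
    while langevin(hi) < y:
        hi *= 2.0
    for _ in range(200):  # bisect to full float precision
        mid = 0.5 * (lo + hi)
        if langevin(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f18(y):
    # Second-order intermediate function of Eq. 18
    return 3.0 * y / (1.0 - y) * (1.0 - 24.0 * y / 25.0 + 22.0 * y**2 / 75.0)

bound = max(abs(1.0 - f18(k / 1000.0) / inv_langevin(k / 1000.0))
            for k in range(1, 1000))
```

The computed bound is approximately 0.0097, matching the maximum relative error magnitude quoted above for Eq. 18.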

Higher-order forms for intermediate function

It is the case that the functions f defined by Eqs. 15, 18, and 19 have the required properties for the intermediate function, namely, being monotonically increasing between zero and infinity as its argument changes between zero and one and being such that L[f(x1)] has unity rate of change as the points of zero and one are approached. These functions are low-order approximations and higher-order approximations are possible as stated in the following theorem:

Theorem 4 Higher order forms for intermediate function

The coefficients in the general form for the intermediate function f, defined by Eq. 14, can be solved based on imposing the constraints of zero rate of change for higher-order derivatives at the points zero and one according to

$$ {\displaystyle \begin{array}{c}\frac{d}{d{x}_1}L\left[f\left({x}_1\right)\right]=1\kern0.72em \mathrm{for}\kern0.48em {x}_1\in \left\{{0}^{+},{1}^{-}\right\}\\ {}\frac{d^k}{d{x}_1^k}L\left[f\left({x}_1\right)\right]=0\kern0.36em \mathrm{for}\kern0.36em {x}_1\in \left\{{0}^{+},{1}^{-}\right\},k\in \left\{2,3,\dots \right\}\end{array}} $$
(20)

to yield, potentially, increasingly linear forms for L[f(x1)]. The results for first-, second-, and sixth-order approximations, respectively, are:

$$ f\left({x}_1\right)=\frac{3{x}_1}{1-{x}_1}.\left[1-\frac{2{x}_1}{3}\right]\kern0.24em \;\;\frac{d}{d{x}_1}L\left[f\left({x}_1\right)\right]=1,{x}_1\in \left\{{0}^{+},{1}^{-}\right\} $$
(21)
$$ f\left({x}_1\right)=\frac{3{x}_1}{1-{x}_1}.\left[1-{x}_1+\frac{x_1^2}{3}\right]\kern0.24em \;\;\frac{d}{d{x}_1}L\left[f\left({x}_1\right)\right]=1,\frac{d^2}{d{x}_1^2}L\left[f\left({x}_1\right)\right]=0,{x}_1\in \left\{{0}^{+},{1}^{-}\right\} $$
(22)
$$ f\left({x}_1\right)=\frac{3{x}_1}{1-{x}_1}\left[1-{x}_1+\frac{3{x}_1^2}{5}-\frac{3{x}_1^3}{5}+\frac{99{x}_1^4}{175}-\frac{99{x}_1^5}{175}+\frac{123{x}_1^6}{35}-\frac{612{x}_1^7}{35}+\frac{3974{x}_1^8}{105}-\frac{6994{x}_1^9}{175}+\frac{3604{x}_1^{10}}{175}-\frac{146{x}_1^{11}}{35}\right] $$
(23)

The third-, fourth-, fifth-, seventh-, and eighth-order approximations are detailed in Appendix 3.

Proof

Mathematica was used to solve for the coefficients by using the approximations stated in Theorem 2.

Results

The linearity of the function L[f(x1)] is illustrated in Fig. 5 for the first-order case as specified by Eq. 21. Higher-order approximations more closely follow the line with unity slope. The relative errors in the various orders of approximation for the inverse Langevin function, based on L–1(y) ≈ f(y), where f is specified by Eqs. 21 to 23 and Eqs. 122 to 126, are shown in Fig. 6. The relative error bounds, respectively, are 0.13, 0.0264, 0.0137, 0.0106, 6.06 × 10−3, 2.61 × 10−3, 3.15 × 10−3, and 6.18 × 10−3 for the first- to eighth-order approximations. These results indicate that the sequence of approximations does not converge, with the sixth-order approximation yielding the smallest relative error bound of 2.61 × 10−3.

Fig. 5

Graph of L[f(x1)] for the first-order approximating function specified by Eq. 21

Fig. 6

Magnitude of the relative error in the approximation L–1(y) ≈ f(y) based on the first to eighth-order functions that linearize L[f(x1)] and are specified by Eqs. 21 to 23 and Eqs. 122 to 126. A sixth-order approximation yields the lowest relative error bound of 0.00261

Notes

The polynomial-based approximations, as specified by Eqs. 21 to 23 and Eqs. 122 to 126, and resulting from the linearization constraints specified by Eq. 20, show modest convergence and then divergence with respect to providing an approximation L–1(y) ≈ f(y) for the inverse Langevin function. The relative error bound of 2.61 × 10−3 for the optimum sixth-order series, as specified by Eq. 23, lies between that of the inverse Langevin approximation proposed by Kröger (2015), with a bound of 2.8 × 10–3, and the non-linear function proposed by Petrosyan (2017), with a bound of 1.8 × 10−3. Optimized Padé approximants, e.g., Marchi and Arruda (2019), yield better convergence with a [4/4] expression having a relative error bound of 1.8 × 10−4.

Simulation results indicate that the second-, third-, and fourth-order approximations, as specified by Eqs. 22, 122, and 123, represent lower bounds for the inverse Langevin function over the interval [0, 1).

A convergent series for the inverse Langevin function over its complete domain has remained an unsolved problem. The use of an intermediate function allows such convergent series to be defined and this is detailed in the following section.

Convergent series for inverse Langevin function

Consider the approximation for the inverse Langevin function L−1(y) ≈ f(y), which is based on an intermediate function f that satisfies L[f(x1)] = x1 + ε(x1), |ε(x1)| ≪ x1. The relative error in the approximation is

$$ \upvarepsilon (y)=1-\frac{f(y)}{L^{-1}(y)} $$
(24)

Consider an associated relative error function

$$ {\varepsilon}_1(y)=\frac{L^{-1}(y)}{f(y)}-1=\frac{\varepsilon (y)}{1-\varepsilon (y)}\approx \varepsilon (y) $$
(25)

where the approximation is valid when |ε(y)| ≪ 1. When the approximation L−1(y) ≈ f(y) is good over the domain of [0, 1), the associated relative error is well defined and can be approximated arbitrarily accurately using a suitable basis set. This is the basis for a convergent series for the inverse Langevin function as detailed in the following theorem.

Theorem 5 Convergent series for inverse Langevin function

Consider an intermediate function f which leads to the error function \( {\varepsilon}_1(y)=\frac{L^{-1}(y)}{f(y)}-1 \) being a smooth, bounded, and integrable function on the interval [0, 1). For this case, an orthonormal basis set {b0, b1, …} for the interval [0,1] leads to the convergent series

$$ {L}^{-1}(y)=f(y).\left[1+{c}_0{b}_0(y)+{c}_1{b}_1(y)+\dots \right]\kern0.24em \;\;\;y\in \left[0,1\right) $$
(26)

where bi, i ∈ {0, 1, 2, …}, is the ith orthonormal basis function and the ith coefficient, ci, is defined according to

$$ {c}_i={\int}_0^1\left[\frac{L^{-1}(y)}{f(y)}-1\right]{b}_i^{\ast }(y) dy\kern1.08em i\in \left\{0,1,\dots \right\}. $$
(27)

Here \( {b}_i^{\ast } \) is the conjugate of bi.

Proof

Convergence is guaranteed because ε1(y) has been assumed to be a smooth, bounded, and integrable function and the set of functions bi, i ∈ {0, 1, 2, …}, has been assumed to be an orthonormal basis set for the interval [0, 1]. A general reference is Debnath and Mikusinski (1999).

Suitable basis sets and results

Suitable basis sets for the interval [0,1] include the Legendre and standard sinusoidal basis sets; these are detailed in Appendix 4.

To illustrate the potential for a convergent series, consider the function f defined by Eq. 18:

$$ f(y)=\frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right]. $$
(28)

The associated error function ε1 is shown in Fig. 7.

Fig. 7

Graph of ε1(y), the error function to be approximated by a basis set decomposition for the case of f defined by Eq. 28

Results: Legendre basis set

The Legendre basis set leads to a polynomial series approximation for L−1(y) according to

$$ {L}^{-1}(y)\approx \frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right]\left[1+{c}_0{b}_0(y)+{c}_1{b}_1(y)+{c}_2{b}_2(y)+\dots \right] $$
(29)

where c0 = 9.65679 × 10−4, c1 = −2.18413 × 10−3, c2 = −4.19473 × 10−3, c3 = 8.87659 × 10−4, c4 = 3.24574 × 10−3, c5 = 7.98418 × 10−4, c6 = −2.72786 × 10−4, c7 = −3.56513 × 10−4, c8 = −1.38877 × 10−4, c9 = 1.33846 × 10−5 and c10 = 5.20035 × 10−5, etc. Simplification leads to the standard polynomial forms. For example, the approximations of orders 4 and 10 are:

$$ {L}^{-1}(y)\approx \frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right].\left[1+0.0027577-0.11785y+0.74962{y}^2-1.3162{y}^3+0.68161{y}^4\right] $$
(30)
$$ {\displaystyle \begin{array}{l}{L}^{-1}(y)\approx \frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right].\left[1+{a}_0+{a}_1y+{a}_2{y}^2+\dots +{a}_{10}{y}^{10}\right]\\ {}\kern6em \begin{array}{llll}{a}_0=1.114252\times {10}^{-4}& {a}_1=-5.41586\times {10}^{-2}& {a}_2=0.695079& {a}_3=-5.789972\\ {}{a}_4=36.861391& {a}_5=-143.92813& {a}_6=343.76325& {a}_7=-507.39216\\ {}{a}_8=449.12483& {a}_9=-217.30958& {a}_{10}=44.029235& \end{array}\end{array}} $$
(31)

Series of orders zero to three are of insufficient order to approximate the error function ε1(y). The relative error bounds for the fourth- to tenth-order approximations are 2.8 × 10−3, 2.6 × 10−3, 1.6 × 10−3, 5.1 × 10−4, 4.0 × 10−4, 3.5 × 10−4, and 1.2 × 10−4. The relative error bound for the twentieth-order series is 3.2 × 10−6. This error level is better than that of the [6/6] Padé approximation specified by Marchi and Arruda (2019), which has a relative error bound of 6.37 × 10−6. The graphs of selected relative errors are shown in Fig. 8.
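The fourth-order approximation of Eq. 30 is straightforward to evaluate. A Python sketch (bisection-based reference inverse and iteration count are choices made here) checking its relative error bound on the 0.001-resolution grid:

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x; series near zero avoids cancellation
    if x < 1e-2:
        return x / 3.0 - x**3 / 45.0 + 2.0 * x**5 / 945.0
    return 1.0 / math.tanh(x) - 1.0 / x

def inv_langevin(y):
    # Reference numerical inverse via bisection
    lo, hi = 0.0, 1.0
    while langevin(hi) < y:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def approx_order4(y):
    # Eq. 30: intermediate function of Eq. 28 multiplied by the
    # fourth-order Legendre-based polynomial correction
    base = 3.0 * y / (1.0 - y) * (1.0 - 24.0 * y / 25.0 + 22.0 * y**2 / 75.0)
    poly = (1.0 + 0.0027577 - 0.11785 * y + 0.74962 * y**2
            - 1.3162 * y**3 + 0.68161 * y**4)
    return base * poly

bound = max(abs(1.0 - approx_order4(k / 1000.0) / inv_langevin(k / 1000.0))
            for k in range(1, 1000))
```

The computed bound is close to the quoted value of 2.8 × 10−3 for the fourth-order series.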

Fig. 8

Relative error in approximations for L–1(y) based on the second-, fourth-, sixth-, eighth-, and tenth-order Legendre basis set approximations for the error function ε1(y)

Results: sinusoidal basis set

The sinusoidal basis set leads to the series for L−1(y) according to

$$ {L}^{-1}(y)\approx \frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right]\left[1+{c}_1{b}_1(y)+{c}_2{b}_2(y)+\dots \right] $$
(32)

where c1 = 3.91964 × 10−3, c2 = 3.07689 × 10−3, c3 = −6.46390 × 10−3, c4 = 1.78090 × 10−3, c5 = −1.07345 × 10−3, c6 = 5.46701 × 10−5, c7 = −1.24337 × 10−4, c8 = −7.75576 × 10−5, c9 = −4.42517 × 10−5 and c10 = −2.43545 × 10−5, etc. Thus:

$$ {L}^{-1}(y)\approx \frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right].\left[\begin{array}{c}1+3.91964\times {10}^{-3}\sin \left(\uppi y\right)+3.07689\times {10}^{-3}\sin \left(2\uppi y\right)-\\ {}6.46390\times {10}^{-3}\sin \left(3\uppi y\right)+1.78090\times {10}^{-3}\sin \left(4\uppi y\right)-\\ {}1.07345\times {10}^{-3}\sin \left(5\uppi y\right)+5.46701\times {10}^{-5}\sin \left(6\uppi y\right)+\dots \end{array}\right]. $$
(33)

Series of orders zero to two are of insufficient order to approximate ε1(y). The relative error bounds for the third- to tenth-order approximations are 2.9 × 10−3, 1.2 × 10−3, 3.0 × 10−4, 3.4 × 10−4, 2.4 × 10−4, 1.8 × 10−4, 1.5 × 10−4, and 1.3 × 10−4. The relative error bound for a twentieth-order series is 3.3 × 10−5. Selected relative errors are shown in Fig. 9.

Fig. 9

Relative error in approximations for L–1(y) based on the second-, fourth-, sixth-, eighth-, and tenth-order sinusoidal basis set approximations for the error function ε1(y)

Notes

The rate of convergence with both the Legendre and sinusoidal basis sets is modest. In general, a starting approximation for the inverse Langevin function, with a lower relative error bound, is likely to have a more oscillatory error function that needs to be approximated and this leads to more terms in the series approximation before a better approximation for the inverse Langevin function arises. For example, a tenth-order Legendre series approximation based on the initial approximating function given by Kröger (2015)

$$ {L}^{-1}(y)\approx \frac{3y-\frac{y}{5}\left[6{y}^2+{y}^4-2{y}^6\right]}{1-{y}^2}\kern1em \left|\mathrm{re}\right|<2.8\times {10}^{-3} $$
(34)

leads to an improved approximation with a relative error bound of 1.2 × 10−4 for a tenth-order series. This relative error bound is close to that achieved by the approximation specified by Eq. 31, which is based on an initial starting approximation with a higher relative error.

The coefficients in the convergent series for the inverse Langevin function can be computed with arbitrarily high accuracy. Importantly, the series converges over the domain [0, 1). This is in contrast with a Taylor series, e.g., Itskov et al. (2012), where convergence breaks down around 0.904. Dargazany (2013) details an algorithm that facilitates the efficient evaluation of the higher-order derivatives involved in a Taylor series approximation.

The following section details approaches with significantly lower relative error bounds.

Improved approximation via error approximation

Consider the result, consistent with Theorem 1, for an intermediate function f which is such that L[f(x1)] is close to being linear:

$$ y=L(x)=L\left[f\left({x}_1\right)\right]={x}_1+\upvarepsilon \left({x}_1\right)\kern0.5em \Rightarrow \kern0.5em x={L}^{-1}(y)=f\left[y-\upvarepsilon \left({x}_1\right)\right]\kern0.5em {x}_1,y\in \left[0,1\right). $$
(35)

If an approximation to the error function ε can be made, then an improved estimate for the inverse Langevin function results. Specifically, for y fixed, an approximation to the error ε at the point x1, which is in terms of y, is required.

Consider the general case, and the illustration shown in Fig. 10, for a function q which is close to being linear with a slope close to unity such that q(x1) = x1 + ε(x1) is an appropriate model. The goal is to find approximations to the error ε(x1) = q(x1) − x1, at the point x1, which are in terms of q(y1) and y1 where y1 = q(x1), x1 = q−1(y1).

Fig. 10

Illustration of relationships for a function q which is approximately linear such that the model q(x1) = x1 + ε(x1), │ε(x1)│ << x1, is valid

Theorem 6 Zero-, first-, and second-order error approximations

For a function q, as illustrated in Fig. 10, which is such that the model q(x1) = x1 + ε(x1), |ε(x1)| ≪ x1, is valid, approximations for x1 = q−1(y1) and ε(x1) = q(x1) − x1 are, first, for a zero-order error approximation:

$$ {x}_1\approx 2{y}_1-q\left({y}_1\right)\kern0.5em \upvarepsilon \left({x}_1\right)\approx q\left({y}_1\right)-{y}_1. $$
(36)

Second, for a first-order error approximation:

$$ {x}_1\approx {x}_{11}={y}_1-\frac{q\left({y}_1\right)-{y}_1}{q^{(1)}\left({y}_1\right)}\kern0.5em \upvarepsilon \left({x}_1\right)\approx \frac{q\left({y}_1\right)-{y}_1}{q^{(1)}\left({y}_1\right)}. $$
(37)

Third, for a second-order error approximation:

$$ {x}_1\approx {y}_1-\frac{q^{(1)}\left({y}_1\right)}{q^{(2)}\left({y}_1\right)}\left[1-\sqrt{1-\frac{2{q}^{(2)}\left({y}_1\right)\left[q\left({y}_1\right)-{y}_1\right]}{{\left[{q}^{(1)}\left({y}_1\right)\right]}^2}}\right]\approx {y}_1-\left[\frac{q\left({y}_1\right)-{y}_1}{q^{(1)}\left({y}_1\right)}+\frac{q^{(2)}\left({y}_1\right){\left[q\left({y}_1\right)-{y}_1\right]}^2}{2{\left[{q}^{(1)}\left({y}_1\right)\right]}^3}\right] $$
(38)
$$ \upvarepsilon \left({x}_1\right)\approx \frac{q^{(1)}\left({y}_1\right)}{q^{(2)}\left({y}_1\right)}\left[1-\sqrt{1-\frac{2{q}^{(2)}\left({y}_1\right)\left[q\left({y}_1\right)-{y}_1\right]}{{\left[{q}^{(1)}\left({y}_1\right)\right]}^2}}\right]\approx \frac{q\left({y}_1\right)-{y}_1}{q^{(1)}\left({y}_1\right)}+\frac{q^{(2)}\left({y}_1\right){\left[q\left({y}_1\right)-{y}_1\right]}^2}{2{\left[{q}^{(1)}\left({y}_1\right)\right]}^3} $$
(39)

where the latter approximations assume \( \left|\frac{2{q}^{(2)}\left({y}_1\right)\left[q\left({y}_1\right)-{y}_1\right]}{{\left[{q}^{(1)}\left({y}_1\right)\right]}^2}\right|\ll 1. \)

Proof

The proofs for these approximations are detailed in Appendix 5.
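The three error approximations of Theorem 6 can be illustrated on a simple test function satisfying the near-linearity model, e.g. q(x1) = x1 + 0.05x1(1 − x1), a choice made here purely for illustration. Since this q is quadratic, its second-order Taylor expansion about y1 is exact, so the square-root form of the second-order approximation recovers x1 to machine precision:

```python
import math

def q(x):
    # Near-linear test function: q(x) = x + eps(x), eps(x) = 0.05 x (1 - x)
    return x + 0.05 * x * (1.0 - x)

def q1(y):
    return 1.05 - 0.1 * y   # q'(y)

def q2(y):
    return -0.1             # q''(y), constant for a quadratic

x_true = 0.6
y1 = q(x_true)              # the known value; recover x_true from it

# Zero-order approximation (Eq. 36)
x0 = 2.0 * y1 - q(y1)

# First-order approximation (Eq. 37): a Newton-style correction
x1 = y1 - (q(y1) - y1) / q1(y1)

# Second-order approximation (Eq. 38), square-root form
disc = 1.0 - 2.0 * q2(y1) * (q(y1) - y1) / q1(y1) ** 2
x2 = y1 - q1(y1) / q2(y1) * (1.0 - math.sqrt(disc))
```

The errors decrease rapidly with order: approximately 1.3 × 10−4, 7 × 10−6, and machine precision for the zero-, first-, and second-order approximations, respectively.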

Utilizing zero-, first-, and second-order error approximations

For the case where L[f(x1)] is approximately linear according to L[f(x1)] = x1 + ε(x1), it is the case, according to Theorem 1, that L−1(y) = f[y − ε(x1)]. The approximations for ε(x1), detailed in Theorem 6, lead to the following approximations, with improved accuracy, for the inverse Langevin function:

Theorem 7 Zero-, first-, and second-order error-based approximations

Consider an initial approximation to the inverse Langevin function of L−1(y) ≈ f(y) which is based on an intermediate function f that linearizes the Langevin function such that the model L[f(x1)] = x1 + ε(x1) is valid with |ε(x1)| ≪ x1. Approximations with a lower relative error bound are defined by

$$ {L}^{-1}(y)\approx {f}_0(y) $$
(40)

where

$$ {f}_0(y)=f\left[y-{\upvarepsilon}_k(y)\right]\kern0.5em k\in \left\{0,1,2\right\} $$
(41)

and with the zero-, first-, and second-order error approximations defined according to:

$$ {\upvarepsilon}_0(y)=L\left[f(y)\right]-y\kern0.5em \Rightarrow \kern0.5em {f}_0(y)=f\left[2y-L\left[f(y)\right]\right] $$
(42)
$$ {\upvarepsilon}_1(y)=\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]}\kern0.5em \Rightarrow \kern0.5em {f}_0(y)=f\left[y-\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]}\right] $$
(43)
$$ {\displaystyle \begin{array}{c}{\upvarepsilon}_2(y)=\frac{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]}{\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[f(y)\right]}\left[1-\sqrt{1-\frac{2\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[f(y)\right].\left[L\left[f(y)\right]-y\right]}{{\left[\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]\right]}^2}}\right]\\ {}\Rightarrow \kern0.85em {f}_0(y)=f\left[y-{\upvarepsilon}_2(y)\right].\end{array}} $$
(44)

The second-order error function can be approximated leading to

$$ {\upvarepsilon}_{2A}(y)=\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]}+\frac{\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[f(y)\right].{\left[L\left[f(y)\right]-y\right]}^2}{2{\left[\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]\right]}^3}\kern0.5em \Rightarrow \kern0.75em {f}_0(y)=f\left[y-{\upvarepsilon}_{2A}(y)\right]. $$
(45)

In these functions:

$$ {\displaystyle \begin{array}{c}\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]=\frac{\mathrm{d}}{\mathrm{d}y}f(y).\frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right]\\ {}\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[f(y)\right]=\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}f(y).\frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right]+{\left[\frac{\mathrm{d}}{\mathrm{d}y}f(y)\right]}^2.\frac{{\mathrm{d}}^2}{\mathrm{d}{f}^2}L\left[f(y)\right]\end{array}} $$
(46)

and

$$ \frac{\mathrm{d}}{\mathrm{d}f}L(f)=\frac{1}{f^2}-{\operatorname{csch}}^2(f)\kern1.62em \frac{{\mathrm{d}}^2}{\mathrm{d}{f}^2}L(f)=\frac{-2}{f^3}+2\coth (f){\operatorname{csch}}^2(f). $$
(47)
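To make the constructions concrete, the zero- and first-order corrections (Eqs. 42 and 43) can be sketched in Python. This is an illustrative sketch, not code from the paper: the base function is taken to be Eq. 80 (which has the same relative error bound, 9.69 × 10−3, quoted for Eq. 18), and the derivative d/dy L[f(y)] is estimated by a central difference rather than via the analytic forms of Eqs. 46 and 47.

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x with L(0) = 0; a short series avoids
    # catastrophic cancellation near x = 0
    if abs(x) < 1e-4:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def f(y):
    # Assumed base function: Eq. 80 (relative error bound 9.69e-3)
    return 3.0 * y / (1.0 - y) * (1.0 - 24.0 * y / 25.0 + 22.0 * y**2 / 75.0)

def f0_zero_order(y):
    # Eq. 42: eps0(y) = L[f(y)] - y, so f0(y) = f[2y - L[f(y)]]
    return f(2.0 * y - langevin(f(y)))

def f0_first_order(y, h=1e-6):
    # Eq. 43, with d/dy L[f(y)] estimated numerically
    g = lambda t: langevin(f(t))
    dg = (g(y + h) - g(y - h)) / (2.0 * h)
    return f(y - (g(y) - y) / dg)
```

Consistent with the bounds quoted below (9.69 × 10−3 for the base function versus 1.31 × 10−4 for the zero-order correction), a single zero-order correction typically reduces the error by roughly two orders of magnitude.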

Proof

Using the result, consistent with Theorem 1, of

$$ {L}^{-1}(y)=f\left[y-\upvarepsilon \left({x}_1\right)\right]\kern0.5em y\in \left[0,1\right), $$
(48)

these results follow from the approximations for ε(x1) specified in Theorem 6.

Results

Consider the relatively simple intermediate function f defined by Eq. 18 and the approximation L−1(y) ≈ f(y), which has a modest relative error bound of 9.69 × 10−3. The magnitudes of the relative errors in approximations to the inverse Langevin function based on this function, for the cases of zero-, first-, and second-order error approximations as specified by Theorem 7, are shown in Fig. 11. The relative error bounds for these cases are detailed in Table 1, along with those for the function, with a lower relative error bound, specified by Eq. 34.

Fig. 11
figure 11

Magnitude of the relative errors in the approximations for L–1(y) based on the function defined in Eq. 18, and for the cases of zero-, first-, and second-order error approximations as specified by Theorem 7

Table 1 The relative error bounds in approximations to the inverse Langevin function defined by L−1(y)≈f[yεk(y)] where εk(y) is defined by the zero-, first-, and second-order error approximations specified in Theorem 7

Notes

These results show the significant improvement obtained by utilizing accurate approximations for the error function.

The approximation to the second-order error function, as defined by Eq. 45, yields results that are consistent with the precise form as specified by Eq. 44. For example, for the function f defined by Eq. 18, the approximation specified by Eq. 44 yields a relative error bound of 1.61 × 10−8 while the approximation specified by Eq. 45 yields a slightly better error bound of 1.53 × 10−8.

Higher-order error approximation

Given the significant improvement in approximations for the inverse Langevin function that can be obtained by zero-, first-, and second-order error approximations, it is of interest whether higher-order approximations for the error function can be specified. Such approximations are detailed in the following theorem:

Theorem 8 Higher-order approximations for the error function

Consider an initial approximation to the inverse Langevin function of L−1(y) ≈ f(y). Improved approximations, with a lower maximum relative error bound, are defined by L−1(y) ≈ f0(y) where \( {f}_0(y)=f\left[y-{\upvarepsilon}_k^A(y)\right] \) and \( {\upvarepsilon}_k^A \), k ∈ {1, 2, …}, is a kth order approximation for the error function. First-, second-, and third-order approximations are defined according to:

$$ {\upvarepsilon}_1^A(y)=\frac{-{a}_0}{a_1}=\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]} $$
(49)
$$ {\upvarepsilon}_2^A(y)=\frac{-{a}_0}{a_1}.\left[1+\frac{a_0{a}_2}{a_1^2-2{a}_0{a}_2}\right] $$
(50)
$$ {\displaystyle \begin{array}{c}{\upvarepsilon}_3^A(y)=\frac{-{a}_0}{a_1}.{\delta}_2.\left[1+\frac{1-{\delta}_2+\frac{a_0{a}_2{\delta}_2^2}{a_1^2}-\frac{a_0^2{a}_3{\delta}_2^3}{a_1^3}}{\delta_2\left[1-\frac{2{a}_0{a}_2{\delta}_2}{a_1^2}+\frac{3{a}_0^2{a}_3{\delta}_2^2}{a_1^3}\right]}\right]\\ {}\kern0ex {\delta}_2=1+\frac{a_0{a}_2}{a_1^2-2{a}_0{a}_2}\end{array}} $$
(51)

where

$$ {\displaystyle \begin{array}{c}{a}_0=L\left[f(y)\right]-y\kern1.34em {a}_1=\left(-1\right)\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]\\ {}\kern0ex {a}_2=\frac{1}{2}.\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[f(y)\right]\kern1.34em \dots \kern1.34em {a}_k=\frac{{\left(-1\right)}^k}{k!}.\frac{{\mathrm{d}}^k}{\mathrm{d}{y}^k}L\left[f(y)\right]\end{array}} $$
(52)

Higher-order approximations can be specified by iteration according to

$$ {\displaystyle \begin{array}{c}{\upvarepsilon}_k^A={\upvarepsilon}_{k-1}^A\left(1+{\varDelta}_k\right)\kern1.12em {\varDelta}_k=\frac{-{p}_k\left({\upvarepsilon}_{k-1}^A\right)}{\upvarepsilon_{k-1}^A{p}_k^{(1)}\left({\upvarepsilon}_{k-1}^A\right)}\\ {}\kern0ex {\upvarepsilon}_1^A=\frac{-{a}_0}{a_1}\kern1.12em {\varDelta}_1=0\end{array}} $$
(53)

where

$$ {\displaystyle \begin{array}{c}{p}_k(z)={a}_0+{a}_1z+{a}_2{z}^2+\cdots +{a}_k{z}^k\\ {}{p}_k^{(1)}(z)={a}_1+2{a}_2z+\cdots +k{a}_k{z}^{k-1}\end{array}} $$
(54)
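The second-order form ε2A of Eq. 50 admits a similarly short sketch. Again this is illustrative: the coefficients a0, a1, and a2 of Eq. 52 are estimated here by central differences of L[f(y)], and Eq. 80 is assumed as the base function.

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x with L(0) = 0
    if abs(x) < 1e-4:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def f(y):
    # Assumed base function: Eq. 80
    return 3.0 * y / (1.0 - y) * (1.0 - 24.0 * y / 25.0 + 22.0 * y**2 / 75.0)

def f0_second_order(y, h=1e-4):
    # Eq. 50 with a0, a1, a2 from Eq. 52; derivatives by central differences
    g = lambda t: langevin(f(t))
    a0 = g(y) - y
    a1 = -(g(y + h) - g(y - h)) / (2.0 * h)
    a2 = 0.5 * (g(y + h) - 2.0 * g(y) + g(y - h)) / h**2
    eps2 = (-a0 / a1) * (1.0 + a0 * a2 / (a1 * a1 - 2.0 * a0 * a2))
    return f(y - eps2)
```

Because L[f(y)] is close to linear in y, a1 is close to −1 and the second-order correction term in Eq. 50 is a small perturbation of the first-order one.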

Proof

The proofs for these results are detailed in Appendix 6.

Results

Results, for the functions f defined by Eqs. 18 and 34, are tabulated in Table 2 and, as expected, show the significant improvement in accuracy with higher-order error approximations, as well as the improved accuracy obtained by starting with an initial function that has a lower relative error bound. The usual trade-off of complexity for accuracy applies.

Table 2 Relative error bounds in approximations to the inverse Langevin function, defined by \( {L}^{-1}(y)\approx f\left[y-{\upvarepsilon}_{\mathrm{k}}^A(y)\right] \), for the case of the error function approximations, \( {\upvarepsilon}_{\mathrm{k}}^A(y) \), specified in Theorem 8

Improved approximation via iteration

An initial approximating function f for the inverse Langevin function, i.e., L−1(y) ≈ f(y), leads to a better approximation of L−1(y) ≈ f0(y) where f0(y) = f(y − ε(y)) and ε(y) is one of the error approximations specified in Theorem 7 and Theorem 8. Iteration, to yield lower error bounds, is possible as stated in the following theorem.

Theorem 9 Inverse Langevin approximation: first-order iteration

Consider any of the approximations to the inverse Langevin function as specified by L−1(y) ≈ f0(y) where f0 has one of the forms specified in Theorem 7 or Theorem 8. With iteration, each of these functions yields an improved approximation to the inverse Langevin function according to

$$ {L}^{-1}(y)\approx {f}_1(y) $$
(55)

where

$$ {f}_1(y)={f}_0\left[y-{\upvarepsilon}_k(y)\right] $$
(56)

and εk(y) has one of the forms specified by Theorem 7 or Theorem 8. Examples include:

$$ {\upvarepsilon}_0(y)=L\left[{f}_0(y)\right]-y\kern0.5em \Rightarrow \kern0.5em {f}_1(y)={f}_0\left[2y-L\left[{f}_0(y)\right]\right] $$
(57)
$$ {\upvarepsilon}_1(y)=\frac{L\left[{f}_0(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_0(y)\right]}\kern0.5em \Rightarrow \kern0.5em {f}_1(y)={f}_0\left[y-\frac{L\left[{f}_0(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_0(y)\right]}\right] $$
(58)
$$ {\displaystyle \begin{array}{c}{\upvarepsilon}_{2A}(y)=\frac{L\left[{f}_0(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_0(y)\right]}+\frac{\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[{f}_0(y)\right].{\left[L\left[{f}_0(y)\right]-y\right]}^2}{2{\left[\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_0(y)\right]\right]}^3}\\ {}\Rightarrow \kern0.84em {f}_1(y)={f}_0\left[y-{\upvarepsilon}_{2A}(y)\right]\end{array}} $$
(59)
$$ {f}_1(y)={f}_0\left[y-{\upvarepsilon}_3^A(y)\right] $$
(60)

where ε3A is defined by Eqs. 51 and 52 with f(y) replaced by f0(y).

Proof

These results follow directly from Theorem 7 and Theorem 8.

Approximation via second-order iteration

Consider any of the approximations to the inverse Langevin function as specified by L−1(y)≈f1(y) where f1 has one of the forms specified in Theorem 9. Iteration by using the results stated in Theorem 7 or Theorem 8 leads to improved approximations for the inverse Langevin function:

Theorem 10 Inverse Langevin approximation: second-order iteration

Each of the approximations for the inverse Langevin function, as specified in Theorem 9, can be used as a basis for an improved approximation according to

$$ {L}^{-1}(y)\approx {f}_2(y) $$
(61)

where

$$ {f}_2(y)={f}_1\left[y-{\upvarepsilon}_k(y)\right] $$
(62)

and εk(y) has one of the forms specified by Theorem 7 or Theorem 8. Examples include:

$$ {\upvarepsilon}_0(y)=L\left[{f}_1(y)\right]-y\kern0.5em \Rightarrow \kern0.5em {f}_2(y)={f}_1\left[2y-L\left[{f}_1(y)\right]\right] $$
(63)
$$ {\upvarepsilon}_1(y)=\frac{L\left[{f}_1(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_1(y)\right]}\Rightarrow \kern0.84em {f}_2(y)={f}_1\left[y-{\upvarepsilon}_1(y)\right]={f}_1\left[y-\frac{L\left[{f}_1(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_1(y)\right]}\right] $$
(64)
$$ {\displaystyle \begin{array}{c}{\upvarepsilon}_{2A}(y)=\frac{L\left[{f}_1(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_1(y)\right]}+\frac{\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[{f}_1(y)\right].{\left[L\left[{f}_1(y)\right]-y\right]}^2}{2{\left[\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_1(y)\right]\right]}^3}\\ {}\Rightarrow \kern0.84em {f}_2(y)={f}_1\left[y-{\upvarepsilon}_{2A}(y)\right]\end{array}} $$
(65)

Proof

These results follow directly from Theorem 7 and Theorem 9.

Approximation via higher-order iteration

By using the same order of error approximation at each iteration, general expressions for approximations to the inverse Langevin function, based on iteration, can be specified:

Theorem 11 Iteration with set order of error approximation

Direct iteration, with a set error approximation type at each stage, leads to the following general result:

$$ {L}^{-1}(y)\approx {f}_i(y)={f}_{i-1}\left[y-{\upvarepsilon}_k(y)\right] $$
(66)

where, for example:

$$ {\displaystyle \begin{array}{c}{f}_i(y)={f}_{i-1}\left[y-{\upvarepsilon}_0(y)\right]\kern1.34em {\upvarepsilon}_0(y)=L\left[{f}_{i-1}(y)\right]-y\\ {}\kern0ex {f}_0(y)=f\left[2y-L\left[f(y)\right]\right]\end{array}} $$
(67)
$$ {\displaystyle \begin{array}{c}{f}_i(y)={f}_{i-1}\left[y-{\upvarepsilon}_1(y)\right]\kern1.06em {\upvarepsilon}_1(y)=\frac{L\left[{f}_{i-1}(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_{i-1}(y)\right]}\\ {}\kern0ex {f}_0(y)=f\left[y-\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]}\right]\end{array}} $$
(68)
$$ {\displaystyle \begin{array}{c}{f}_i(y)={f}_{i-1}\left[y-{\upvarepsilon}_{2A}(y)\right]\kern1.06em {\upvarepsilon}_{2A}(y)=\frac{L\left[{f}_{i-1}(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_{i-1}(y)\right]}+\frac{\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[{f}_{i-1}(y)\right].{\left[L\left[{f}_{i-1}(y)\right]-y\right]}^2}{2{\left[\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_{i-1}(y)\right]\right]}^3}\\ {}\kern0ex {f}_0(y)=f\left[y-\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]}-\frac{\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[f(y)\right].{\left[L\left[f(y)\right]-y\right]}^2}{2{\left[\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]\right]}^3}\right].\end{array}} $$
(69)

First-order iteration forms can be written as

$$ {L}^{-1}(y)\approx {f}_1(y)=f\left[h\left[{h}_1(y)\right]\right] $$
(70)

where, respectively, for zero- and first-order error approximations:

$$ h(y)=2y-L\left[f(y)\right]\kern1.06em {h}_1(y)=2y-L\left[f\left[h(y)\right]\right] $$
(71)
$$ h(y)=y-\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]}\kern1.34em {h}_1(y)=y-\frac{L\left[f\left[h(y)\right]\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f\left[h(y)\right]\right]} $$
(72)

Second-order iteration forms can be written as

$$ {L}^{-1}(y)\approx {f}_2(y)=f\left[h\left[{h}_1\left[{h}_2(y)\right]\right]\right] $$
(73)

where, respectively, for zero- and first-order error approximation:

$$ {\displaystyle \begin{array}{c}h(y)=2y-L\left[f(y)\right]\kern1.34em {h}_1(y)=2y-L\left[f\left[h(y)\right]\right]\kern0.5em \\ {}\kern0ex {h}_2(y)=2y-L\left[f\left[h\left[{h}_1(y)\right]\right]\right]\end{array}} $$
(74)
$$ {\displaystyle \begin{array}{c}h(y)=y-\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f(y)\right]}\kern1.34em {h}_1(y)=y-\frac{L\left[f\left[h(y)\right]\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f\left[h(y)\right]\right]}\\ {}\kern0ex {h}_2(y)=y-\frac{L\left[f\left[h\left[{h}_1(y)\right]\right]\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[f\left[h\left[{h}_1(y)\right]\right]\right]}\end{array}} $$
(75)

Higher-order iteration forms follow in a consistent manner.
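The nested forms of Eqs. 73 and 74 translate directly into code. A sketch, assuming Eq. 80 as the base function and a zero-order error approximation at each stage:

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x with L(0) = 0
    if abs(x) < 1e-4:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def f(y):
    # Assumed base function: Eq. 80
    return 3.0 * y / (1.0 - y) * (1.0 - 24.0 * y / 25.0 + 22.0 * y**2 / 75.0)

def inv_langevin_second_iter(y):
    # Eqs. 73 and 74: f2(y) = f[h[h1[h2(y)]]], zero-order error at each stage
    h = lambda t: 2.0 * t - langevin(f(t))
    h1 = lambda t: 2.0 * t - langevin(f(h(t)))
    h2 = lambda t: 2.0 * t - langevin(f(h(h1(t))))
    return f(h(h1(h2(y))))
```

Each additional nesting level re-applies the zero-order correction to the previous composite, so the residual L[f2(y)] − y shrinks rapidly with iteration depth.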

Proof

The iteration results follow directly from Theorem 7, Theorem 9, and Theorem 10. The proof for the specific first- and second-order iteration forms is detailed in Appendix 7.

Evaluation of some of the expressions in these equations is facilitated by the use of Eqs. 46, 47 and the results:

$$ {\displaystyle \begin{array}{c}\frac{\mathrm{d}}{\mathrm{d}y}L\left[f\left[h(y)\right]\right]=\frac{\mathrm{d}}{\mathrm{d}y}h(y).\frac{\mathrm{d}}{\mathrm{d}h}f\left[h(y)\right].\frac{\mathrm{d}}{\mathrm{d}f}L\left[f\left[h(y)\right]\right]\\ {}\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}L\left[f\left[h(y)\right]\right]=\frac{{\mathrm{d}}^2}{\mathrm{d}{y}^2}h(y).\frac{\mathrm{d}}{\mathrm{d}h}f\left[h(y)\right].\frac{\mathrm{d}}{\mathrm{d}f}L\left[f\left[h(y)\right]\right]+{\left[\frac{\mathrm{d}}{\mathrm{d}y}h(y)\right]}^2.\frac{{\mathrm{d}}^2}{\mathrm{d}{h}^2}f\left[h(y)\right].\frac{\mathrm{d}}{\mathrm{d}f}L\left[f\left[h(y)\right]\right]+\\ {}\kern5.279997em {\left[\frac{\mathrm{d}}{\mathrm{d}y}h(y)\right]}^2.{\left[\frac{\mathrm{d}}{\mathrm{d}h}f\left[h(y)\right]\right]}^2.\frac{{\mathrm{d}}^2}{\mathrm{d}{f}^2}L\left[f\left[h(y)\right]\right]\end{array}} $$
(76)

Alternative forms for iterative approximations

The following theorem details two alternative forms for iterative approximations to the inverse Langevin function. These functions illustrate the complex nature of iterative approximate expressions for the inverse Langevin function.

Theorem 12 Alternative forms for iterative approximations

First, consider the function f1 defined by a first-order iteration, as specified by Eq. 57, where iteration is based on a zero-order error approximation (Eq. 42). An expanded expression is

$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {f}_1(y)\\ {}=f\left[4y-2L\left[f\left[2y-L\left[f(y)\right]\right]\right]-L\left[f\left[2y-L\left[f\left[2y-L\left[f(y)\right]\right]\right]\right]\right]\right]\end{array}} $$
(77)

Second, consider the function f2 defined by a second-order iteration, as specified by Eq. 63, where iteration is based on a zero-order error approximation (Eq. 42). An expanded expression is

$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {f}_2(y)\\ {}=f\left[4{h}_1(y)-2L\left[f\left[2{h}_1(y)-L\left[f\left[{h}_1(y)\kern0.28em \right]\right]\kern0.28em \right]\right]\kern0.28em -L\left[f\left[2{h}_1(y)-L\left[f\left[2{h}_1(y)-L\left[f\left[{h}_1(y)\right]\right]\kern0.28em \right]\right]\kern0.28em \right]\right]\right]\end{array}} $$
(78)

where

$$ {\displaystyle \begin{array}{c}{h}_1(y)=2y-\\ {}\kern0ex L\left[f\left[4y-2L\left[f\left[2y-L\left[f(y)\right]\kern0.28em \right]\right]-L\left[f\left[2y-L\left[f\left[2y-L\left[f(y)\right]\kern0.28em \right]\right]\kern0.28em \right]\right]\kern0.28em \right]\right]\end{array}} $$
(79)

Proof

The proofs for these results are detailed in Appendix 8.

Results: error approximation and function iteration

Relatively simple approximations for the inverse Langevin function, with modest relative error bounds (greater than 10−4), include:

$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {g}_1(y)=\frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right]\\ {}\kern0ex \hspace{0.5em}\mathrm{this}\kern0.34em \mathrm{paper}\kern1.34em \mid \mathrm{re}\mid <9.69\times {10}^{-3}\end{array}} $$
(80)
$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {g}_2(y)=\frac{3y}{1-y}.\left[1-\frac{459y}{500}+\frac{47{y}^2}{250}+\frac{19{y}^3}{300}\right]\\ {}\kern0ex \hspace{0.5em}\mathrm{this}\kern0.34em \mathrm{paper}\kern1.34em \mid \mathrm{re}\mid <5.83\times {10}^{-3}\end{array}} $$
(81)
$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {g}_3(y)=\frac{3y-\frac{y}{5}\left[6{y}^2+{y}^4-2{y}^6\right]}{1-{y}^2}\\ {}\kern0ex \hspace{0.5em}\mathrm{Kröger}\kern0.28em (2015)\kern1.34em \mid \mathrm{re}\mid <2.8\times {10}^{-3}\end{array}} $$
(82)
$$ {L}^{-1}(y)\approx {g}_4(y)=\frac{3y}{1-y}.\left[1-y+\frac{3{y}^2}{5}-\frac{3{y}^3}{5}+\frac{99{y}^4}{175}-\frac{99{y}^5}{175}+\frac{123{y}^6}{35}-\frac{612{y}^7}{35}+\frac{3974{y}^8}{105}-\frac{6994{y}^9}{175}+\frac{3604{y}^{10}}{175}-\frac{146{y}^{11}}{35}\right] $$
(83)

This paper: 6th-order approx. (Eq. 23): |re| < 2.61 × 10−3

$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {g}_5(y)=3y+\frac{y^2}{5}\sin \left[\frac{7y}{2}\right]+\frac{y^3}{1-y}\\ {}\kern0ex \hspace{1.34em}\mathrm{Petrosyan}\kern0.28em (2017)\kern1.34em \mid \mathrm{re}\mid <1.8\times {10}^{-3}\end{array}} $$
(84)
$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {g}_6(y)=\frac{y\left(3-{y}^2\right)}{1-{y}^2}-\frac{y^{10/3}}{2}+3{y}^5\left(y-\frac{76}{100}\right)\left(y-1\right)\\ {}\kern0ex \hspace{1.06em}\mathrm{Nguessong}\kern0.28em (2014)\kern1.34em \mid \mathrm{re}\mid <7.2\times {10}^{-4}\end{array}} $$
(85)
$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {g}_7(y)=\frac{3y+{a}_2{y}^2+{a}_3{y}^3+{a}_4{y}^4}{1-y+{b}_2\left(y-{y}^2\right)+{b}_3\left({y}^2-{y}^3\right)+{b}_4\left({y}^3-{y}^4\right)}\kern0.5em \left\{\begin{array}{c}\mathrm{Marchi}\ (2019):\left[4/4\right]\;\mathrm{approx}.\\ {}\left|\mathrm{re}\right|<1.8\times {10}^{-4}\end{array}\right.\\ {}{a}_2=-6.98408968,{a}_3=5.69026957,{a}_4=-1.35415696\\ {}{b}_2=-1.33411915,{b}_3=0.0391556,{b}_4=0.64694651\end{array}} $$
(86)
$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {g}_8(y)=f\left[h(y)\right]\kern1.06em h(y)=2y-L\left[f(y)\right]\\ {}\kern0ex f(y)=\frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right]\end{array}} $$
(87)

This paper: zero-order error approx. (Eq. 42): |re| < 1.31 × 10–4.

$$ {\displaystyle \begin{array}{c}{L}^{-1}(y)\approx {g}_9(y)=\frac{3y}{1-y}.\left[1-\frac{24y}{25}+\frac{22{y}^2}{75}\right]\\ {}.\left[1+{a}_0+{a}_1y+{a}_2{y}^2+\cdots +{a}_{10}{y}^{10}\right]\end{array}} $$
(88)

This paper: Legendre series (Eq. 31): |re| < 1.2 × 10–4.
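The quoted error bounds can be spot-checked against a reference inverse computed by bisection (the Langevin function is monotone increasing). This is a verification sketch: the grid density and bracketing strategy are arbitrary choices, and only g1 of Eq. 80 is exercised here.

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x with L(0) = 0
    if abs(x) < 1e-4:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def inv_langevin_ref(y):
    # Reference inverse by bisection; valid for 0 <= y < 1
    lo, hi = 0.0, 1.0
    while langevin(hi) < y:   # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def g1(y):
    # Eq. 80, quoted relative error bound 9.69e-3
    return 3.0 * y / (1.0 - y) * (1.0 - 24.0 * y / 25.0 + 22.0 * y**2 / 75.0)

def max_rel_error(approx, n=500):
    # Maximum relative error of `approx` on a uniform sample of (0, 1)
    worst = 0.0
    for i in range(1, n):
        y = 0.999 * i / n
        ref = inv_langevin_ref(y)
        worst = max(worst, abs(approx(y) - ref) / ref)
    return worst
```

On such a grid the estimated maximum relative error of g1 is consistent with the quoted bound of 9.69 × 10−3, with the maximum attained in the mid-band of the interval.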

Lower error bounds via iterative error approximation

Much lower relative error bounds are possible when error approximation and iteration are utilized. Indicative results are detailed in Table 3 for the case of the function specified by Eq. 80 which has the relative error bound of 9.69 × 10−3.

Table 3 Relative error bounds for approximations to the inverse Langevin function. Iteration based on functions defined by a zero-, first-, second-, and third-order error approximations and with a base function f  defined by Eq. 80

Lower relative error bounds arise by using an initial function approximation with a lower error bound. For example, the function specified by Nguessong (Eq. 85), which has the relative error bound of 7.2 × 10−4, yields the results specified in Table 4.

Table 4 Relative error bounds for approximations to the inverse Langevin function. Iteration based on functions defined by a zero-, first-, and second-order error approximation and with a base function f  defined by Eq. 85

Lower error bounds via better base functions

By utilizing a base function with a lower relative error bound, improved approximations for the inverse Langevin function are possible with error approximation and iteration. Results are detailed in Table 5.

Table 5 Relative error bounds, over the interval [ 0, 1 ), for approximations to the inverse Langevin function based on the functions g1, …, g7 defined by Eqs. 80 to 86

Notes

The variation, with different base functions, of the relative error bound of approximations for the inverse Langevin function is illustrated in Fig. 12 for the case of a first iteration based on a zero-order error approximation (see Eq. 57). In general, the upper bound on the magnitude of the relative error occurs in the mid-band region of the interval [0, 1), the exception being the case of g7(y) defined by Eq. 86. For this function, the coefficients defining the approximation are not specified with sufficient precision to yield high-order approximations with iteration, and a floor in the relative error is evident in Table 5. This problem can be overcome simply by specifying the coefficients with higher precision.

Fig. 12
figure 12

Graphs of the magnitude of the relative error in approximations to L–1(y) based on a first iteration and on a zero-order error approximation (Eq. 57). The base functions, f, are defined by Eqs. 80 to 86

As is evident in the results tabulated in Tables 3, 4, and 5, error approximation and iteration yield a dramatic reduction in the relative error bound. The function specified by Eq. 80 approximates the inverse Langevin function with a relative error bound of 0.00969. As detailed in Table 3, this relative error bound can be reduced to 5.08 × 10−8, 2.66 × 10−16, and 2.06 × 10−30 by a first-order iteration based, respectively, on zero-, first-, and second-order error approximations. A comparison of the results in Tables 3 and 4 shows the expected improvement with error approximation and iteration when a base function with a lower relative error bound is used.

Approximation with high accuracy and modest complexity

The results detailed in Table 5 indicate that approximations for the inverse Langevin function, based on a first-order iteration and a first-order error approximation, yield relative error bounds of the order of 10−16 or better, a level of accuracy higher than that required for most applications. For a chosen base function f, the approximation is

$$ {L}^{-1}(y)\approx {f}_0\left[y-\frac{L\left[{f}_0(y)\right]-y}{f_0^{(1)}(y).\frac{\mathrm{d}}{\mathrm{d}{f}_0}L\left[{f}_0(y)\right]}\right] $$
(89)

where

$$ {f}_0(y)\approx f\left[y-\frac{L\left[f(y)\right]-y}{f^{(1)}(y).\frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right]}\right] $$
(90)
$$ {f}_0^{(1)}(y)=\left[\frac{1}{f^{(1)}(y).\frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right]}+\frac{\left[L\left[f(y)\right]-y\right]\left[{f}^{(2)}(y).\frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right]+{\left[{f}^{(1)}(y)\right]}^2\frac{{\mathrm{d}}^2}{\mathrm{d}{f}^2}L\left[f(y)\right]\right]}{{\left[{f}^{(1)}(y).\frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right]\right]}^2}\right].\kern0.28em {f}^{(1)}\left[y-\frac{L\left[f(y)\right]-y}{f^{(1)}(y).\frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right]}\right] $$
(91)

For fixed y, the evaluation of L−1(y) requires the determination of f(y), f(1)(y), f(2)(y), L[f(y)], \( \frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right] \), \( \frac{{\mathrm{d}}^2}{\mathrm{d}{f}^2}L\left[f(y)\right] \), etc.

Simplified approximation via Taylor series

Consider the initial error-based approximations for the inverse Langevin function

$$ {L}^{-1}(y)\approx {f}_0(y)=f\left[y-{\upvarepsilon}_k(y)\right]\kern0.5em k\in \left\{0,1,2\right\} $$
(92)

as specified by Eqs. 42 to 45, where it is expected that the error term is small, i.e., |εk(y)| ≪ y. This justifies the simplified approximations for the inverse Langevin function stated in the following theorem:

Theorem 13 Taylor series–based approximations

A first- and second-order Taylor series yield the simplified approximations for the inverse Langevin function:

$$ {L}^{-1}(y)\approx {f}_0(y) $$
(93)

where

$$ {\displaystyle \begin{array}{cc}{f}_0(y)=f(y)-{\upvarepsilon}_k(y){f}^{(1)}(y)& {1}^{st}\ \mathrm{order}\kern0.34em \mathrm{Taylor}\\ {}{f}_0(y)=f(y)-{\upvarepsilon}_k(y){f}^{(1)}(y)+\frac{\upvarepsilon_k^2(y)}{2}{f}^{(2)}(y)& {2}^{nd}\ \mathrm{order}\kern0.34em \mathrm{Taylor}\end{array}} $$
(94)

For the zero-order error case (Eq. 42) the respective first- and second-order Taylor series based approximations are:

$$ {\displaystyle \begin{array}{c}{f}_0(y)=f(y)-\left[\mathrm{L}\left[f(y)\right]-y\right].{f}^{(1)}(y)\\ {}\kern0ex {f}_0(y)=f(y)-\left[\mathrm{L}\left[f(y)\right]-y\right].{f}^{(1)}(y)+\frac{{\left[\mathrm{L}\left[f(y)\right]-y\right]}^2}{2}.{f}^{(2)}(y)\end{array}} $$
(95)

For a first-order error-based approximation (Eq. 43) the respective first- and second-order Taylor series based approximations are:

$$ {\displaystyle \begin{array}{c}{f}_0(y)=f(y)-\frac{\mathrm{L}\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}f}\mathrm{L}\left[f(y)\right]}\\ {}\kern0ex {f}_0(y)=f(y)-\frac{\mathrm{L}\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}f}\mathrm{L}\left[f(y)\right]}+\frac{1}{2}{\left[\frac{\mathrm{L}\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}\mathrm{L}\left[f(y)\right]}\right]}^2.{f}^{(2)}(y)\end{array}} $$
(96)

Higher-order error approximations follow in an analogous manner.

Proof

These results arise from the first- and second-order Taylor series based on the point y, i.e. 

$$ {\displaystyle \begin{array}{c}f\left(y+\varDelta \right)\approx f(y)+\varDelta {f}^{(1)}(y)\\ {}\kern0ex \hspace{0.84em}f\left(y+\varDelta \right)\approx f(y)+\varDelta {f}^{(1)}(y)+\frac{\varDelta^2}{2}{f}^{(2)}(y)\end{array}} $$
(97)

where Δ =  − εk(y).
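The theorem's forms can be sketched for the zero-order error case (Eq. 95), with f′ and f″ estimated by central differences; the analytic derivatives of the chosen base function could equally be used. Eq. 80 is assumed as the base function.

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x with L(0) = 0
    if abs(x) < 1e-4:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def f(y):
    # Assumed base function: Eq. 80
    return 3.0 * y / (1.0 - y) * (1.0 - 24.0 * y / 25.0 + 22.0 * y**2 / 75.0)

def taylor_forms(y, h=1e-5):
    # Eq. 95: zero-order error with first- and second-order Taylor series
    eps0 = langevin(f(y)) - y
    f1 = (f(y + h) - f(y - h)) / (2.0 * h)          # f'(y)
    f2 = (f(y + h) - 2.0 * f(y) + f(y - h)) / h**2  # f''(y)
    first = f(y) - eps0 * f1
    second = first + 0.5 * eps0**2 * f2
    return first, second
```

Both truncations improve on the base function; the second-order form adds only one further term, which is consistent with the later observation that it performs comparably to the non-approximated case.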

Explicit error-based approximations

Theorem 14 Explicit error-based approximations

The error-based approximations for the inverse Langevin function, as specified in Theorem 13, have the following form:

$$ {L}^{-1}(y)\approx {f}_0(y) $$
(98)

where, first, for a zero-order error-based approximation and respectively, for a first- and second-order Taylor series approximation:

$$ {\displaystyle \begin{array}{c}{f}_0(y)=f(y)-\left[\coth\ \left[f(y)\right]-\frac{1}{f(y)}-y\right]{f}^{(1)}(y)\\ {}{f}_0(y)=f(y)-\left[\coth\ \left[f(y)\right]-\frac{1}{f(y)}-y\right]{f}^{(1)}(y)+\frac{1}{2}{\left[\coth\ \left[f(y)\right]-\frac{1}{f(y)}-y\right]}^2{f}^{(2)}(y)\end{array}} $$
(99)

Second, for a first order error-based approximation:

$$ {f}_0(y)=\frac{\left[2f(y)+y{f}^2(y)\right]\sinh {\left[f(y)\right]}^2-{f}^3(y)-{f}^2(y)\sinh \left[2f(y)\right]/2}{\sinh {\left[f(y)\right]}^2-{f}^2(y)} $$
(100)
$$ {\displaystyle \begin{array}{c}{f}_0(y)=\frac{\left[2f(y)+y{f}^2(y)\right]\sinh\ {\left[f(y)\right]}^2-{f}^3(y)-{f}^2(y)\sinh\ \left[2f(y)\right]/2}{\sinh\ {\left[f(y)\right]}^2-{f}^2(y)}\\ {}\kern1.68em +\frac{f^2(y)\sinh\ {\left[f(y)\right]}^2.{\left[f(y)\cosh\ \left[f(y)\right]-\sinh\ \left[f(y)\right]- yf(y)\sinh\ \left[f(y)\right]\right]}^2}{2{\left[\sinh\ {\left[f(y)\right]}^2-{f}^2(y)\right]}^2}.\frac{f^{(2)}(y)}{{\left[{f}^{(1)}(y)\right]}^2}\end{array}} $$
(101)

Results for higher-order error approximations follow in a natural manner.

Proof

These results arise from Theorem 13 and the definition of the Langevin function and its derivative.
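The closed form of Eq. 100 can be cross-checked against the direct form f(y) − [L[f(y)] − y]/L′[f(y)], using the derivative from Eq. 47; the two agree analytically. A sketch assuming the base function of Eq. 80:

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x with L(0) = 0
    if abs(x) < 1e-4:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def dlangevin(x):
    # L'(x) = 1/x^2 - csch^2(x)  (Eq. 47)
    return 1.0 / x**2 - 1.0 / math.sinh(x)**2

def f(y):
    # Assumed base function: Eq. 80
    return 3.0 * y / (1.0 - y) * (1.0 - 24.0 * y / 25.0 + 22.0 * y**2 / 75.0)

def f0_closed(y):
    # Eq. 100: first-order error, first-order Taylor, closed form
    x = f(y)
    s = math.sinh(x)
    num = (2.0 * x + y * x * x) * s * s - x**3 - x * x * math.sinh(2.0 * x) / 2.0
    return num / (s * s - x * x)

def f0_direct(y):
    # Equivalent direct form: f(y) - (L[f(y)] - y) / L'(f(y))
    x = f(y)
    return x - (langevin(x) - y) / dlangevin(x)
```

Note that for y close to 1, where f(y) is large, the numerator and denominator of Eq. 100 are differences of terms of order sinh2[f(y)], so a naive floating-point evaluation can lose precision; the direct form with Eq. 47 is then preferable.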

First- and second-order iteration

The approximations detailed in Theorem 14 can be used to specify first-, second-, and higher-order iteration-based approximations for the inverse Langevin function:

Theorem 15 First- and second-order iteration: first-order error and first-order Taylor

Consider the case of a first-order error approximation with either a first- or second-order Taylor series approximation (Eq. 96). A first-order iteration approximation to the inverse Langevin function is defined according to

$$ {L}^{-1}(y)\approx {f}_1(y) $$
(102)

where

$$ {f}_1(y)=\frac{\left[2{f}_0(y)+y{f}_0^2(y)\right]\sinh {\left[{f}_0(y)\right]}^2-{f}_0^3(y)-{f}_0^2(y)\sinh \left[2{f}_0(y)\right]/2}{\sinh {\left[{f}_0(y)\right]}^2-{f}_0^2(y)} $$
(103)

for the case of a first-order Taylor series approximation. Here, f0 is defined by Eq. 100. For a second-order Taylor series approximation

$$ {\displaystyle \begin{array}{c}{f}_1(y)=\frac{\left[2{f}_0(y)+y{f}_0^2(y)\right]\sinh {\left[{f}_0(y)\right]}^2-{f}_0^3(y)-{f}_0^2(y)\sinh \left[2{f}_0(y)\right]/2}{\sinh {\left[{f}_0(y)\right]}^2-{f}_0^2(y)}+\\ {}\kern1.68em \frac{f_0^2(y)\sinh {\left[{f}_0(y)\right]}^2.{\left[{f}_0(y)\cosh \left[{f}_0(y)\right]-\sinh \left[{f}_0(y)\right]-y{f}_0(y)\sinh \left[{f}_0(y)\right]\right]}^2}{2{\left[\sinh {\left[{f}_0(y)\right]}^2-{f}_0^2(y)\right]}^2}.\frac{f_0^{(2)}(y)}{{\left[{f}_0^{(1)}(y)\right]}^2}\end{array}} $$
(104)

where f0 is defined by Eq. 101.

A second-order iteration approximation is

$$ {L}^{-1}(y)\approx {f}_2(y) $$
(105)

where, for a first-order Taylor series approximation:

$$ {f}_2(y)=\frac{\left[2{f}_1(y)+y{f}_1^2(y)\right]\sinh {\left[{f}_1(y)\right]}^2-{f}_1^3(y)-{f}_1^2(y)\sinh \left[2{f}_1(y)\right]/2}{\sinh {\left[{f}_1(y)\right]}^2-{f}_1^2(y)} $$
(106)

Here, f1 is defined by Eq. 103. For a second-order Taylor series approximation:

$$ {\displaystyle \begin{array}{c}{f}_2(y)=\frac{\left[2{f}_1(y)+y{f}_1^2(y)\right]\sinh {\left[{f}_1(y)\right]}^2-{f}_1^3(y)-{f}_1^2(y)\sinh \left[2{f}_1(y)\right]/2}{\sinh {\left[{f}_1(y)\right]}^2-{f}_1^2(y)}+\\ {}\kern1.68em \frac{f_1^2(y)\sinh {\left[{f}_1(y)\right]}^2.{\left[{f}_1(y)\cosh \left[{f}_1(y)\right]-\sinh \left[{f}_1(y)\right]-y{f}_1(y)\sinh \left[{f}_1(y)\right]\right]}^2}{2{\left[\sinh {\left[{f}_1(y)\right]}^2-{f}_1^2(y)\right]}^2}.\frac{f_1^{(2)}(y)}{{\left[{f}_1^{(1)}(y)\right]}^2}\end{array}} $$
(107)

where f1 is defined by Eq. 104.

Proof

Consider first-order iteration based on the first-order Taylor series approximation specified by Eq. 96 of

$$ {f}_0(y)=f(y)-\frac{L\left[f(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}f}L\left[f(y)\right]} $$
(108)

It then follows from Eq. 58 that

$$ {L}^{-1}(y)\approx {f}_1(y)={f}_0\left[y-\frac{L\left[{f}_0(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}y}L\left[{f}_0(y)\right]}\right] $$
(109)

The first-order Taylor series approximation, as specified by Eq. 96, yields the required result of

$$ {L}^{-1}(y)\approx {f}_0(y)-\frac{L\left[{f}_0(y)\right]-y}{\frac{\mathrm{d}}{\mathrm{d}{f}_0}L\left[{f}_0(y)\right]} $$
(110)

The other results follow in an analogous manner.

Results

A first-order Taylor series approximation, as defined in Eq. 94, in general, leads to significantly higher relative errors in approximations to the inverse Langevin function. In contrast, a second-order Taylor series approximation leads to relative errors comparable with the non-approximated case. Illustrative results are shown in Table 6 where the base function defined by Eq. 80 has been used.

Table 6 The relative error bounds, over the interval (0,1), for approximations to the inverse Langevin function based on the function f defined by Eq. 80 and with the use of first- and second-order Taylor series approximations

Example

Consider the relatively simple expression defined by Eq. 100, which corresponds to a first-order Taylor approximation of a first-order error-based approximation, i.e.,

$$ {L}^{-1}(y)\approx {f}_0(y)=\frac{\left[2f(y)+y{f}^2(y)\right]\sinh\ {\left[f(y)\right]}^2-{f}^3(y)-\frac{f^2(y)\sinh\ \left[2f(y)\right]}{2}}{\sinh\ {\left[f(y)\right]}^2-{f}^2(y)} $$
(111)

When the function f is that defined by Petrosyan (Eq. 84 with a relative error bound of 1.8 × 10−3), or Nguessong (Eq. 85 with a relative error bound of 7.2 × 10−4), the relative error bounds, respectively, are 3.20 × 10−6 and 3.81 × 10−7. These bounds are comparable with the [7/7] approximation given in Marchi and Arruda (2019) which has a relative error bound of 6.92 × 10−7. If this function is used iteratively, according to

$$ {L}^{-1}(y)\approx {f}_1(y)=\frac{\left[2{f}_0(y)+y{f}_0^2(y)\right]\sinh {\left[{f}_0(y)\right]}^2-{f}_0^3(y)-{f}_0^2(y)\sinh \left[2{f}_0(y)\right]/2}{\sinh {\left[{f}_0(y)\right]}^2-{f}_0^2(y)} $$
(112)

then the relative error bounds reduce, respectively, to 1.03 × 10−11 and 1.09 × 10−13. As is clear from the results detailed in Table 6, these error bounds can be significantly improved upon by utilizing a second-order Taylor series approximation.
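This example admits a compact sketch: one helper applies the closed-form correction of Eq. 111 at a given point, and applying it twice realizes the iteration of Eq. 112. The Petrosyan base function (Eq. 84, with final term y3/(1 − y)) is assumed, and the residual L[f1(y)] − y is used here as an accuracy proxy rather than the relative error against a reference inverse.

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x with L(0) = 0
    if abs(x) < 1e-4:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def petrosyan(y):
    # Petrosyan (2017) base function, relative error bound 1.8e-3
    return 3.0 * y + (y**2 / 5.0) * math.sin(3.5 * y) + y**3 / (1.0 - y)

def step(x, y):
    # One application of the closed-form correction (Eq. 111) at the point x
    s = math.sinh(x)
    num = (2.0 * x + y * x * x) * s * s - x**3 - x * x * math.sinh(2.0 * x) / 2.0
    return num / (s * s - x * x)

def inv_langevin(y):
    # Eq. 111 followed by Eq. 112: two applications of the correction
    f0 = step(petrosyan(y), y)
    return step(f0, y)
```

Two applications of the same closed-form correction, starting from a base function with a roughly 10−3 error bound, are consistent with the order-10−13 bounds quoted above.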

Computational complexity

For Padé-based approximations, where a measure of the computational complexity involved in the evaluation of the inverse Langevin function for a set argument can be clearly defined, a graph of the relative error bound versus computational complexity can readily be generated, e.g., Kröger 2015 and Jedynak 2018. The results shown in Fig. 13 are indicative of the relative error improvement that is possible with an increase in functional and computational complexity. An unsolved problem is the determination of the computational complexity, in terms of the number of basic operations (addition, subtraction, multiplication, division, ...), for the approximations detailed in the paper, in particular for the Taylor series approximations detailed in Eqs. 100, 101, 103, 104, 106, and 107. There is potential, e.g., Muller (2006) and Brent (2018), for the computational efficiency of specific functions to be enhanced by innovative approaches. Further research is warranted.

Fig. 13
figure 13

Display of the relative error bound for the functions g1 to g7 defined by Eqs. 80 to 86 for the case of first-order error approximation and based on the Taylor series approximations defined by Eqs. 100, 101, and 103. Each group of results represents a distinct order of functional and computational complexity and, within each group, the relative error bounds are offset from one another to provide clarity

Conclusion

In this paper, an analytical framework has been detailed which underpins, first, convergent series approximations for the inverse Langevin function and, second, analytical approximations for this function with potentially arbitrarily small maximum relative error magnitudes. The basis for both approaches is the definition of an intermediate function f such that L[f(y)] is approximately linear, with a slope close to one, over the interval [0, 1). This function allows, first, the definition of an error function for the inverse Langevin function which can be approximated via a basis set decomposition. Second, it allows error approximation and then function iteration, which leads, potentially, to arbitrarily low relative error bounds in the resulting approximations.
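The near-linearity of the composition L[f(y)] can be illustrated numerically. The sketch below uses Petrosyan's approximation as a stand-in for the intermediate function f, since the paper's own Eq. 80 is not reproduced here; the grid and finite-difference slope estimate are likewise choices of this sketch.

```python
import math

def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x, for x > 0."""
    return 1.0 / math.tanh(x) - 1.0 / x

def f(y):
    """Stand-in intermediate function: Petrosyan's (2017) approximation
    to the inverse Langevin function."""
    return 3.0 * y + (y * y / 5.0) * math.sin(3.5 * y) + y ** 3 / (1.0 - y)

ys = [k / 1000.0 for k in range(1, 990)]
# Deviation of L[f(y)] from the identity over (0, 1)
dev = max(abs(langevin(f(y)) - y) for y in ys)
# Finite-difference slope of L[f(y)]; near-linearity means slope close to one
h = 1e-4
slopes = [(langevin(f(y + h)) - langevin(f(y))) / h for y in ys]
print(dev, min(slopes), max(slopes))
```

The deviation from the identity stays well below 10−2 and the slope remains within a few percent of one across the interval, which is the property the error-function decomposition and the subsequent iteration rely on.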

Basis set decomposition of the defined error function underpins convergent series approximations for the inverse Langevin function. A tenth-order Legendre basis set, based on an initial approximating function as specified by Eq. 80, leads to a series approximation for the inverse Langevin function with a relative error bound of 1.2 × 10−4. A twentieth-order series has a relative error bound of 3.2 × 10−6.

A modest approximating function (Eq. 80), with a relative error bound of 0.00969, leads to relative error bounds of 1.31 × 10−4, 2.77 × 10−6 and 1.61 × 10−8 with zero-, first-, and second-order error approximation, respectively. First-order iteration, based on a first-order error approximation, leads to a relative error bound of 2.66 × 10−16. Significantly lower relative error bounds can be obtained through higher-order iteration and through second-, or higher-, order error approximation. These results represent significant improvements on published analytical approximations for the inverse Langevin function.

First- and second-order Taylor series for the error-based approximations to the inverse Langevin function can be used to obtain simplified functional forms. Whilst the first-order Taylor series leads to results of much lower accuracy, the second-order Taylor series yields expressions without compromising accuracy. As usual, there is a trade-off between complexity and accuracy.