1 Introduction

Ever since Kruskal [22] remarked that soliton motion may be thought of as a “parade of poles,” the study of complex pole dynamics in nonlinear wave equations has been an active research field. This paper is an overview of the field, using some of the well-known model problems, including the Korteweg–De Vries equation that prompted Kruskal’s remark. The plan is to take these equations, some of them dissipative and others dispersive, and start them all with the same set of initial and boundary conditions. Using analysis where we can and numerical computation otherwise, we shall then track the evolution of the complex singularities. The singularity dynamics of the various equations will be contrasted, and also connected to the typical nonlinear features associated with these equations such as shock formation, soliton motion, finite time blow-up, and recurrence. Here, a particular interest is the entry of the singularities when the initial condition has no singularities in the finite complex plane.

We consider equations of the form

$$\begin{aligned} u_t + u u_x = L(u), \quad t > 0, \ -\pi \le x < \pi , \end{aligned}$$

and assume \(2\pi \)-periodic solutions in the space variable, x. The linear operator on the right can be any one of

$$\begin{aligned} L(u)= & {} \nu \, u_{xx} \qquad \ \ (\text{ Burgers}), \end{aligned}$$
$$\begin{aligned} L(u)= & {} -\nu \, u_{xxx} \quad \ \ (\text{ Korteweg--De } \text{ Vries}), \end{aligned}$$
$$\begin{aligned} L(u)= & {} \nu \, H\{u_{xx}\} \quad (\text{ Benjamin--Ono}), \end{aligned}$$

where \(\nu \) is a nonnegative constant and H denotes the periodic Hilbert transform, defined below. As initial condition we consider

$$\begin{aligned} u(x,0) = -\sin (x), \end{aligned}$$

the particular form of which allows us to make connections to several works of historical interest, namely papers by Cole [10], Hopf [21], Platzman [27], and Zabusky and Kruskal [39].

The numerical procedure we follow is similar to the one proposed in [35]. The first step involves a Fourier spectral method in space and a numerical integrator in time to compute the solution on \([-\pi ,\pi ]\times [0,T]\). The second step is to continue the solution at any time t in [0, T] into the complex x-plane. For the continuation we use a Fourier–Padé method, although other possibilities are considered as well.

In order to identify and display poles and branch points in the complex plane, we shall plot what is called the “analytical landscape” in [34]. With the solution f(z) expressed in polar form \(r e^{i \theta }\), the software of [34] can be used to generate a 3D plot in which the horizontal axes represent the real and imaginary components of \(z = x+iy\), the height represents the modulus r, and colour represents the phase \(e^{i \theta }\). The two examples in Fig. 1 should clarify this visualization.

Fig. 1
figure 1

Analytical landscapes of the functions \(f(z) = 1/z^2\) (top left), and \(f(z) = z^{1/2}\) (top right). The height represents the modulus and the colour represents the phase, as defined by the NIST standard colour wheel (bottom); see [13]. For details about the software used to produce these figures, see [34]

The outline of the paper is as follows: The inviscid Burgers equation and its viscous counterpart are discussed, respectively, in Sects. 2 and 3. Here, analysis provides the exact locations of the branch point singularities in the inviscid case and approximate locations of the poles in the case of small viscosity. For the other PDEs considered here, namely Benjamin-Ono (BO) in Sect. 4 and Korteweg-de Vries (KdV) in Sect. 5, analytical results are harder to come by and we resort to the numerical procedure mentioned above. The nonlinear Schrödinger equation (NLS) also makes an appearance in our discussion of recurrence in Sect. 6. In the final section we discuss the details of the numerical methods employed in the earlier sections.

Novel results presented here include the pole dynamics of the BO, KdV, and NLS equations. Related studies of KdV were undertaken in [7, 17], but these authors did not consider the Zabusky–Kruskal experiment which is our focus here. Pole behaviour in KdV and NLS was also discussed in the papers [11, 22] and [9, 23], respectively, but those analyses were based on cases where explicit solutions are available. Moreover, in those papers the poles were already present in the initial condition. Here, our interest is in the situation where the singularities are “born” at infinity.

Although this paper focuses only on simple model equations such as (1)–(4), pole dynamics have been studied in more complex models, particularly in the water wave context. Among the many references are [3, 7, 14].

2 The Inviscid Burgers Equation

The inviscid Burgers equation, \(u_t + uu_x = 0\), subject to the initial condition (5), develops a shock at \((x,t) = (0,1)\), as can be verified by the method of characteristics. It also admits an explicit Fourier series solution [27]

$$\begin{aligned} u(x,t) = -2 \sum _{k=1}^{\infty } c_k(t) \sin (kx), \qquad c_k(t) := \frac{J_k(kt)}{kt}, \end{aligned}$$

valid for \(0< t < 1\). The \(J_k\) are the Bessel functions of the first kind. This series is of limited use for numerical purposes, however, particularly for continuation into the complex plane. When truncated, it becomes an entire function and will not reveal much singularity information other than perhaps the location and type of the singularity nearest to the real axis [26, 32].

Instead, for numerical purposes we shall use the implicit solution formula

$$\begin{aligned} u = f(x-ut), \qquad f(x) = -\sin (x). \end{aligned}$$

This transcendental equation can be solved by Newton iteration for values of x in the complex plane. One can start at a small time increment, say \(t = \Delta t\), use \(u = f(x)\) as initial guess, and iterate until convergence. Then t is incremented to \(2\Delta t\), the initial guess is updated to the current solution, and the process is repeated. Figure 2 shows the corresponding solutions in the visualization format described in the introduction.

Fig. 2
figure 2

Solution to the inviscid Burgers equation as computed by applying Newton iteration to the implicit solution formula (7). The four frames correspond to \(t = \frac{1}{4}, \frac{1}{2}, \frac{3}{4}\), and 1 (in the usual order). The thicker black curve is the real-valued solution on the real axis, displaying the typical steepening of the curve until the shock forms in the last frame. The solution in the upper half-plane is displayed in the format of Fig. 1. The solution in the lower half-plane is not shown because of symmetry. The black dot represents a branch point singularity that travels along the imaginary axis according to (9). By referring to the colour wheel of Fig. 1, one can see that on the imaginary axis, there is no jump in phase between the origin and the branch point (in some printed versions the abrupt change in phase may appear to be discontinuous but it is not.) From the branch point to \(+i\infty \), however, there is a phase jump consistent with a singularity of quadratic type

The figure shows one member of a conjugate pair of branch point singularities, born at \(+i \infty \), which travels down the positive imaginary axis and meet its conjugate partner (not shown) at \((x,t) = (0,1)\) when the shock occurs. This behaviour was first reported in [5, 6], where a cubic polynomial was used as initial condition (similar to the first two terms in the Taylor expansion of (5)). In the cubic case, eq. (7) can be solved explicitly by Cardano’s formula, which enabled a complete description of the singularity dynamics as summarized in [5, 6, 28, 29]. In our case, the initial condition is trigonometric and therefore Cardano’s formula is not applicable. It is nevertheless possible to find the singularity locations and their type explicitly.

The singularity location, say \(z=z_s\), and the corresponding solution value, say \(u=u_s\), are defined by the simultaneous equations

$$\begin{aligned} u_s = f(z_s-u_st), \qquad 1 = -t f^{\prime }(z_s-u_st), \end{aligned}$$

the latter equation representing the vanishing Jacobian of the mapping; see for example [26]. With f(x) defined by (5), the solution is, for \(0< t < 1\),

$$\begin{aligned} z_s = \pm i \left( \sqrt{1-t^2} - \tanh ^{-1} \sqrt{1-t^2} \right) , \quad u_s = \pm i \, t^{-1} \sqrt{1-t^2}. \end{aligned}$$

These formulas are consistent with the solution shown in Fig. 2. A graph of the singularity location as a function of time is shown as the dashed curve in Fig. 3 of the next section.

Further analysis shows that the singularity is of quadratic type, consistent with the phase colours in Fig. 2 and in agreement with the analysis of [5, 6, 28, 29] for the cubic initial condition. When \(t=1\), i.e., at the time the shock occurs, the singularity type changes from quadratic to cubic. The Riemann surface structure associated with this is discussed in [5, 6], in connection with the cubic initial condition.

3 The Viscous Burgers Equation

When viscosity is added, i.e., \(\nu >0\) in the Burgers equation (1)–(2), shock formation does not occur. In the complex plane interpretation this means the singularities do not reach the real axis. Moreover, they become strings of poles rather than the branch points observed in the previous section. The poles travel in conjugate pairs from \(\pm i \, \infty \), with rapid approach towards the real axis, before turning around. They retrace their steps along the imaginary axes at a more leisurely pace, and eventually recede back to infinity, which ultimately leads to the zero steady state solution.Footnote 1

Fig. 3
figure 3

Left: Solution of the viscous Burgers equation (2), with \(\nu = 0.1\), \(t=1\), as computed from the series solution formula (10). Right: The locations on the positive imaginary axis of the first four poles as a function of time. The dash-dot curve is the location of the branch-point singularity when \(\nu = 0\), as given by formula (9) (the pole curves approach the dash-dot curve asymptotically as \(t \rightarrow 0^+\) but could not be computed reliably for small values of t because of ill-conditioning, hence the gaps)

Analogously to (6), the Burgers equation subject to the initial condition (5) has an explicit series solution, this time not a Fourier series but a ratio of two such series:

$$\begin{aligned} u(x,t) = - 2\nu \frac{\theta _x}{\theta }, \quad \theta (x,t) := I_0\big (\frac{1}{2\nu }\big ) + 2 \sum _{k=1}^{\infty } (-1)^k I_k\big (\frac{1}{2\nu }\big ) e^{-\nu k^2 t} \cos (kx). \end{aligned}$$

The \(I_k\) are the modified Bessel functions of the first kind. This solution is derived from the famous Hopf–Cole transformation; in fact, the above series is a special case of one of the examples presented in the original paper of Cole [10]. Presumably the solutions (6) and (10) can be connected in the limit \(\nu \rightarrow 0^+\), but we found no such reference in the literature.

The pole locations in Fig. 3 can be computed from the series solution (10). For asymptotic estimates, however, a better representation is the integral form [10, 21]:

$$\begin{aligned} u(x,t) = \frac{\int _{-\infty }^{\infty } \frac{x-s}{t} \exp \Big (\frac{1}{2\nu } F(x,s,t)\Big ) \, ds}{\int _{-\infty }^{\infty } \exp \Big (\frac{1}{2\nu } F(x,s,t)\Big ) \, ds}. \end{aligned}$$

In the case of the initial condition (5) the function F is defined by

$$\begin{aligned} F(x,s,t) = 1-\cos (s)-\frac{(x-s)^2}{2t}. \end{aligned}$$

To estimate the pole locations in the inviscid Burgers equation one can analyze the denominator of the formula in (11). Looking for poles on the positive imaginary axis, we define, for \(y>0\),

$$\begin{aligned} D(y,t) = \int \limits _{-\infty }^{\infty } \exp \Big (\frac{1}{2\nu } F(yi,s,t)\Big ) \, ds. \end{aligned}$$

A saddle point method can be used to estimate this integral when \(0 < \nu \ll 1\). We present an informal analysis here, focussed on an explanation of the situation shown in Fig. 3. A more comprehensive analysis (for the cubic initial condition) can be found in [28].

Figure 4 shows level curves of the real and imaginary parts of F(iyst) in the complex s-plane, with \(y=1\) and \(t=1\). The figure reveals three saddle points, two in the upper half-plane and one in the lower half-plane. The contour of integration in (13) is accordingly deformed into the upper half-plane, in order to pass through the two saddle points.

To estimate the saddle point contributions, we differentiate (13) with respect to s (and suppress the dependence on y and t),

$$\begin{aligned} F^{\prime }(s) = \sin (s)-\frac{(s-yi)}{t}, \qquad F^{\prime \prime }(s) = \cos (s)-\frac{1}{t}. \end{aligned}$$

The saddle points are defined by \(F^{\prime }(s) = 0\), i.e.,

$$\begin{aligned} s - yi - t \sin (s) = 0. \end{aligned}$$

No explicit solution of this equation seems to exist, but it can be checked that for \(t = 1\) and all \(y>0\) there is precisely one root on the negative imaginary axis, and two roots in the upper half-plane, symmetrically located with respect to the imaginary axis. The configuration shown in Fig. 4 can therefore be taken as representative of all \(y>0\), except that the saddle points coalesce at the origin as \(y \rightarrow 0^+\).

We label the roots in the first and second quadrants as \(s_1\) and \(s_2\), respectively, with \(s_2 = -\overline{s}_1\). The corresponding saddle point contributions are \(D_1\) and \(D_2\), where

$$\begin{aligned} D_j = 2\sqrt{\frac{\pi \nu }{|F^{\prime \prime }(s_j)|}} \exp \Big ( \frac{1}{2\nu }F(s_j) - \frac{1}{2} i (\theta _j \pm \pi )\Big ), \end{aligned}$$

where the upper (resp., lower) sign choice refer to \(j = 1\) (resp., \(j=2\)). The quantities \(\theta _j\) are defined by \(F^{\prime \prime }(s_j) = |F^{\prime \prime }(s_j)| e^{i \theta _j}\).

The approximation to the denominator integral (13) is now given by \(D \sim D_1 + D_2\) as \(\nu \rightarrow 0^+\). After using the symmetry relationships between \(s_1\) and \(s_2\) noted above, as well as the fact that \(|F^{\prime \prime }(s_1)|=|F^{\prime \prime }(s_2)|\), this becomes

$$\begin{aligned} D \sim 4 \sqrt{\frac{\pi \nu }{|F^{\prime \prime }(s_1)|}} e^{\frac{1}{2\nu } \lambda _1} \sin \Big ( \frac{\mu _1}{2\nu } - \frac{1}{2} \theta _1\Big ), \quad F(s_1) := \lambda _1+\mu _1 i. \end{aligned}$$

In the second frame of Fig. 4 the graph of this function is shown as a function of y. In comparison with a high-accuracy quadrature approximation of the integral (13), the approximation (17) is seen to be quite accurate. The exception is for small values of y, because of the coalescence of the saddle points mentioned above.

Fig. 4
figure 4

Saddle point analysis for the viscous Burgers equation shown in Fig. 3. Left: The dots are saddle points of F(yist), with \(y=1\), \(t=1\). The colour represents level curves of the real part of F(yist), and the dash-dot curves are level curves of the imaginary part. For the saddle point analysis the path of integration in (13), i.e., the real line, is deformed into the dash-dot curve in the upper half-plane that defines the steepest descent direction. The main contributions to the integral come from the regions in the neighbourhood of the saddle points. Right: The function D(y, 1), computed by numerical integration of (13) (solid curve), in comparison with the saddle point approximation (17) (dash-dot curve). The zeros of this function define the locations of the poles seen in Fig. 3

Approximate pole locations can be computed as the zeros of (17), i.e.,

$$\begin{aligned} \mu _1 - \nu \theta _1 = 2 \nu k \pi , \quad k = 1, 2, \ldots , \end{aligned}$$

which is solved simultaneously with the saddle point equation (15). In Table 1 we compare this estimate with the actual pole locations.

Table 1 Left: Pole locations on the positive imaginary axis for the solution shown in Fig. 3, i.e., \(t=1\) and \(\nu = 0.1\). The ‘exact’ values were computed by numerical quadrature of (13) and root finding, both processes executed to high precision. The estimated values were computed by a numerical solution of the two equations (15) and (18). Right: Turning points of the poles, i.e., the coordinates of the local minima in the right frame of Fig. 3. This was computed by a numerical solution of the two equations (15) and (18) in combination with a minimization procedure with objective function y

The equations (15)–(18) can be used as basis for further analysis, both theoretical and numerical, of the pole locations. For example, by solving these equations numerically and simultaneously minimizing over y, the closest distance any particular pole gets to the real axis can be computed. These results are also summarized in Table 1.

In conclusion of this section on the Burgers equation we mention a lesser known fact, namely, that nonlinear blow-up is possible with complex initial data. For example, Fig. 5 shows the blow-up in the solution corresponding to the complex Fourier mode initial condition

$$\begin{aligned} u(x,0) = -\sin (x)-i \cos (x). \end{aligned}$$

Features such as the blow-up time or the minimum value of \(\nu \) that allows blow-up can be analyzed by the saddle point method outlined above, but we shall not pursue this here.

Fig. 5
figure 5

Finite time blow-up in the Burgers equation (2) with \(\nu =0.1\), subject to the complex initial condition (19). The poles approach the origin from the positive imaginary direction, as can be seen in the left frame, which corresponds to \(t=0.7\). In the right frame the leading pole has reached the real axis, roughly at \(t=1\), which results in a blow-up (note that there is no upper/lower half-plane symmetry as was the case in Fig. 2, so we show both half-planes in this figure)

When dispersion replaces diffusion in (1), the poles drift away from the imaginary axis. The pole behaviour is more complicated than in the Burgers case and the bigger the dispersive effects, the more intricate the behaviour. For this reason we tackle the less famous BO equation first, before getting to the more celebrated KdV equation.

4 The Benjamin-Ono Equation

The periodic Hilbert transform H in (4) can be defined as a convolution integral involving a cotangent kernel [19, Ch. 14], or, equivalently, in terms of Fourier series

$$\begin{aligned} u(x,t) = \sum _{k=-\infty }^{\infty } c_k(t) e^{ikx} \quad \Rightarrow \quad H\{u_{xx}\} = \sum _{k=-\infty }^{\infty } (-i) \text{ sgn }(k) k^2 c_k(t) e^{ikx}. \end{aligned}$$

When the nonlinear term in (1) is absent, both the BO and KdV equations are linear dispersive wave equations. They admit travelling wave solutions \(u(x,t) = e^{i (kx -\omega (k)t) }\) with dispersion relations \(\omega = -\nu \, \text{ sgn }(k) k^2\) and \(\omega = -\nu \, k^3\), respectively. The quadratic vs cubic dependence on the wave number k makes dispersive effects in the BO equation less pronounced than in the KdV equation.

Fig. 6
figure 6

Solutions to the Benjamin-Ono equation (1) and (4), corresponding to the initial condition (5), with \(\nu =0.1\). The pole dynamics of this solution can be seen in Fig. 7

With the nonlinear term in (1) present, both the BO and KdV equations are completely integrable and solvable, in principle, by the inverse scattering transform [1]. For arbitrary initial conditions and particularly with periodic boundary conditions, however, it is unlikely that all steps of the procedure can be completed successfully to obtain explicit solutions. Numerical methods will therefore be used to study singularity dynamics. As mentioned in the introduction, this consists of a standard method of lines procedure to obtain the solution on the real axis, followed by numerical analytical continuation into the complex plane by means of a Fourier-Padé method. Details are postponed to Sect. 7. Our choice of a Padé based method stems from the fact that singularities in both BO and KdV (next section) are expected to be poles. This is related to the complete integrability of these equations and the Painlevé property as discussed in [1, Sect. 2].

Figure 6 shows the solution on the real axis for the BO equation. Like diffusion, dispersion prevents shocks, but the mechanism is different: oscillations appear and separate into travelling wave solutions. In the case of KdV, this behaviour gave rise to the numerical discovery of the soliton, as discussed in Sect. 5. In the present example, about eight such solitons can be seen, perhaps most clearly identifiable in the pole parade shown in Fig. 7.

Fig. 7
figure 7

Pole locations of a subset of the solutions of the BO equation shown in Fig. 6. Each soliton in that figure can be associated with a pair of conjugate simple poles in the complex plane. The poles that exit on the left re-enter on the right because of the periodic boundary conditions

The initial pole behaviour is very similar to that observed in the Burgers equation, namely, the poles are born at infinity and start to travel in conjugate pairs towards the imaginary axes. Unlike the Burgers case, however, the poles do not remain on the imaginary axes but veer off into the left half-plane. Eight pairs can eventually be associated with the solitons shown in Fig. 6.

In the absence of readily computable error estimates for our procedure we have used the following strategy to validate the results. Poles of the BO equation are simple, each with residue \(\pm 2 i \nu \); see for example [8]. The order and residue of each pole can be checked by contour integration on a small circle surrounding its location [35].Footnote 2 Using this technique, spurious poles and other numerical artifacts can be identified (one example of which is the slight irregularity near \(-3+0.8i\) in the third frame of Fig. 7.)

5 The Korteweg-De Vries Equation

In the case of KdV, the qualitative behaviour of the solutions is similar to that of the BO equation. The dispersion prevents shock formation in the solution by breaking it up into a number of solitons, which is the famous discovery of Zabusky and Kruskal [39]. The iconic figure from that paper is reprinted in Fig. 8. In the left frame of Fig. 9 we reproduce that solution, but rescaled to the domain \([-\pi ,\pi ]\) in order to facilitate comparisons with the other solutions shown in this paper.

The initial behaviour is the same as for the other equations we have seen thus far, namely, there are poles that enter from infinity and travel towards the real axis in conjugate pairs, roughly similar to the first two frames in Fig. 7. As was the case for the BO equation, dispersion causes the poles to drift into the left half-plane and eventually re-enter in the right half-plane because of periodicity. The eight solitons marked in the Zabusky–Kruskal figure are clearly identifiable in the pole plot of Fig. 9, with the poles closer to the real axis corresponding to the taller solitons.

We have used the same strategy mentioned at the end of Sect. 4 for validation of Fig. 9. In the case of KdV the poles are locally of the form \(-12\nu /(z-z_0)^2\). The phase information of Fig. 9, when viewed in colour, makes it clear that the computed poles are indeed of order two, and contour integration confirmed the strength coefficient of \(-12\nu \).

It should be noted, however, that numerical analytical continuation is inherently ill-conditioned as one goes further into the complex plane, and that puts some limitations on our investigations. Two examples are as follows:

First, for \(t \ll 1\) we found that the Fourier-Padé based method was not able to produce the theoretical pole information accurately, presumably because of the distance between the real axis and the nearest singularity. Therefore no figures of this initial phase of the evolution are presented here. Second, in the literature the existence of ‘hidden solitons’ in the Zabusky–Kruskal experiment is mentioned; see [12] (and the references therein). In order to investigate these hidden solitons, the solution of Fig. 9 has to be continued much farther into the complex plane. Because of spurious poles and the ill-conditioning alluded to above, our efforts at tracking these hidden solitons were inconclusive. Both of these investigations are offered as a challenge to computational mathematicians.

Here are two suggestions for such investigations. First, for the KdV method it is recommended that the equation be transformed into the potential KdV equation, by the substitution \(u = v_x\); see [22]. This equation has simple poles, which makes it better suited for approximation by Padé methods. Second, the use of multi-precision arithmetic is advisable. Here, everything was done in IEEE double precision, mainly because of the speed if offers to create animations of the pole parades [36].

Fig. 8
figure 8

The iconic figure of soliton formation in the KdV equation. The initial condition is \(u(x,0) = \cos (\pi x)\) on [0, 2], with \(\nu = 0.022^2\). Reprinted, with permission, from [39]. Copyright (1965) by the American Physical Society

Fig. 9
figure 9

Left: the Zabusky–Kruskal solution shown in Fig. 8, after rescaling to \([-\pi , \pi ]\). Right: the corresponding poles in the complex plane

6 Recurrence

Historically, the discovery of the soliton in [39] overshadowed the fact that the objective of that paper was something else entirely, namely, the verification of the recurrence phenomenon previously discovered by Fermi, Pasta, Ulam, and Tsingou (FPUT) in yet another celebrated numerical experiment [16].Footnote 3 In short, this means that if a nonlinear system is started in a low mode configuration such as the initial condition (5), then higher modes are created by the nonlinear interaction, causing an energy cascade from low modes to high. The upshot of the FPUT experiment was that this process is not continued indefinitely, but eventually reverses with most of the energy flowing back to the low modes. The effect of this is that the initial condition is reconstructed briefly—approximately so and with a shift in phase—after a certain period of time.

Numerical experiments with KdV such as those reported in Sect. 5 do not reveal the recurrence behaviour in the pole dynamics. Had true recurrence occurred, the poles would have retraced their steps back along the imaginary axes out to infinity or would have cancelled somehow. The most we could observe at the purported recurrence time was a slight widening of the strip of analyticity around the real axis. This lack of a clear recurrence can be attributed to the fact that the phenomenon is rather weak in KdV, as discussed in detail in [20].

For a more convincing demonstration of recurrence one has to look outside the family (1)–(4). Perhaps the best PDE for this purpose is the NLS equation

$$\begin{aligned} i \, u_t + u_{xx} + \nu |u|^2 u = 0, \end{aligned}$$

where the solution, u(xt), is complex-valued. We shall consider \(\nu >0\) (known as the focussing case) and continue to work with \(2\pi \)-periodic boundary conditions. It will be necessary, however, to modify our initial condition to have nonzero mean, so we consider

$$\begin{aligned} u(x,0) = 1 + \epsilon \cos x. \end{aligned}$$

The corresponding solution is an \(\epsilon \)-perturbation of the x-independent solution \(u = e^{i \nu t}\). Linearisation about this solution shows that the side-bands \(e^{\pm i n x}\) grow exponentially for all integers n satisfying [37, 38]

$$\begin{aligned} 0< n^2 < 2\nu . \end{aligned}$$

That is, for \(\nu < \frac{1}{2}\) there is no instability, for \(\frac{1}{2}< \nu < 2\) a single pair of side-bands is unstable, a double pair for \(2< \nu < \frac{9}{2}\), and so on. The instability is named after Benjamin and Feir, who derived it not via the NLS but directly from the water wave setting [4]. The growth does not continue unboundedly but subsides, and recurrences occur at periodic time intervals. The connection between Benjamin-Feir instability and FPUT recurrence was pointed out in [38].

The growth and recurrence pattern for a special case with two unstable modes can be seen in Fig. 10. In frames 2, 3 and 7, 8 the unstable mode \(e^{\pm ix}\) dominates, while \(e^{\pm 2ix}\) dominates in frames 4, 5, and 6. An almost perfect recurrence occurs in frame 9, after which time the process continues periodically.

Fig. 10
figure 10

Solutions to the nonlinear Schrödinger equation (21) corresponding to the initial condition (22), with \(\nu =3\), \(\epsilon = 0.1\). The unstable modes \(e^{\pm ix}\) and \(e^{\pm 2ix}\) take turns in dominating the solution, with a near perfect recurrence at \(t=5\). The pole dynamics of the first phase of this solution can be seen in Fig. 11

Pole locations of some of the solutions in Fig. 10 can be seen in Fig. 11. The first unstable mode is controlled by a conjugate pair of simple poles on the imaginary axis. The second is controlled by two pairs of conjugate poles, each pair symmetrically located with respect to the imaginary axis. The first frame shows the initial onset, with the poles on the imaginary axis leading the procession. The second frame is roughly where the first mode reaches its maximum growth, which corresponds to the point at which the poles reach their minimum distance to the real axis. In the third frame, these poles are receding back along the imaginary axes and are overtaken by the approaching secondary sets of poles. The last frame shows a situation where the second mode has become dominant. At the recurrence time, all of these poles will have receded back to infinity.

Fig. 11
figure 11

Pole locations of a subset of the solutions of the NLS equation shown in Fig. 10. In the first two frames the unstable mode \(e^{\pm ix}\) dominates, while \(e^{\pm 2ix}\) dominates in the last two frames. This is determined by which pairs of poles are closest to the real axis

7 Numerical Tools

In this final section we review some of the numerical techniques that can be used in this field. Our discussion, which focuses primarily on Padé approximation and its variants, is by no means exhaustive. For other approaches, including tracking the poles through the numerical solution of certain dynamical systems, we refer to [7, 26, 32, 33].

We limit the discussion to \(2\pi \)-periodic solutions that admit a Fourier series expansion of the form

$$\begin{aligned} u(x,t) = \sum _{k=-\infty }^{\infty } c_k(t) e^{ikx}, \quad -\pi \le x < \pi . \end{aligned}$$

In some rare cases the coefficients \(c_k(t)\) are known explicitly; cf. (6). Otherwise, the \(c_k(t)\) can be computed numerically by a Fourier spectral method and the method of lines [35]. In order to do this step as accurately as possible, it is necessary to truncate the Fourier series to a large number of terms (here we used \(|k|\le 256\) or 512), and also use small error tolerances in the time-integration (here on the order of \(10^{-12}\) in the stiff integrator ode15s in MATLAB).

When truncated, the series (24) becomes an entire function and will not reveal much singularity information other than perhaps the width of the strip of analyticity around the real axis [32]. A more suitable representation is obtained by converting the truncated series to Fourier-Padé form. For a fixed value of t (suppressed for now in the notation) we convert the series to Taylor-plus-Laurent form by the substitution \(z = e^{ix}\):

$$\begin{aligned} u(x) = \sum _{k=-\infty }^{\infty } c_k e^{ikx} = \sum _{k=0}^{\infty } c_k z^k + \sum _{k=0}^{\infty } c_{-k} (1/z)^k. \end{aligned}$$

(It is necessary to redefine \(c_0 \rightarrow c_0/2\).) Each term on the right can be converted to a type (NN) rational form as follows. Consider the first term and define

$$\begin{aligned} f(z) = \sum _{k=0}^{\infty } c_k z^k, \quad p(z) = \sum _{k=0}^{N} a_k z^k, \quad q(z) = \sum _{k=0}^{N} b_k z^k. \end{aligned}$$

One then requires that

$$\begin{aligned} f(z) \approx \frac{p(z)}{q(z)} \quad \Rightarrow \quad p(z) - q(z) f(z) = \mathcal {O}(z^{2N+1}). \end{aligned}$$

The latter equation can be set up as a linear system to solve for the coefficients \(a_k\) and \(b_k\) (after fixing one coefficient, typically \(b_0=1\)). The second term on the right in (25) can be converted to rational form in the same way, which then gives the approximation to u(x) as the ratio of two Fourier-series. The pole plots in Sects. 4, 5 and 6, were all computed using this Fourier-Padé approach.

A promising alternative to the Padé approach to rational approximation is the so-called AAA method, recently proposed in [24], with subsequent extensions to the periodic case [25]. It is not implemented in coefficient space like (24)–(26), but rather uses function values, easily obtained from (26) by an inverse discrete Fourier transform. The representation is the barycentric formula for trigonometric functions [18]

$$\begin{aligned} u(x) = \frac{\sum _{k=1}^{M} (-1)^k \csc \big (\frac{1}{2}(x-x_k)\big )u_k}{\sum _{k=1}^{M} (-1)^k \csc \big (\frac{1}{2}(x-x_k)\big )}, \end{aligned}$$

applicable when M is odd (a similar formula holds for even M). When \(x_k = -\pi +(k-1)2\pi /M\) (i.e., evenly spaced nodes in \([-\pi ,\pi )\)) and \(u_k = u(x_k)\), then u(x) is identical to the series (26) when truncated to \(|k| \le N\), where \(2N+1 = M\).

In the AAA algorithm the so-called support points \(x_k\) are not chosen to be equidistant, which changes the formula (28) from a truncated Fourier series to a rational form. The choice of the \(x_k\) proceeds adaptively so as to avoid exponential instabilities.

In preliminary numerical tests the trigonometric AAA algorithm was competitive with the Fourier-Padé method described above. But further experimentation is needed to decide the winner in this particular application field.

Neither of these two methods, however, can give much information on branch point singularities. One way of introducing branches into the approximant is quadratic Padé approximation [30], which is a special case of Hermite-Padé approximation [2]. Define a polynomial r(x) similar to p(x) and q(x) in (26), and in analogy with the rightmost expression in (27) define

$$\begin{aligned} p(z) + q(z) f(z) +r(z) \big ( f(z) \big )^2 = \mathcal {O}(z^{3N+2}). \end{aligned}$$

Dropping the order term on the right yields

$$\begin{aligned} f(z) \approx \frac{-q(z) \pm \sqrt{q(z)^2-4p(z)r(z) }}{2 \, r(z)}, \end{aligned}$$

and when this is used to approximate the two terms on the right of (25) a two-valued approximant to u(x) is obtained. Cubic and higher order approximants can be defined analogously, but will not be considered here.

Recall that Fig. 2 showed a solution of the inviscid Burgers equation with a branch point singularity. To test how accurately this singularity can be approximated by these methods, we solved the equation numerically as described below eq. (24). (We refrained from using the explicit series (6), which is too special.) The numerical solution (24) was then continued into the complex plane using the Fourier-Padé and quadratic Fourier-Padé approximations. Although we have a large number of Fourier coefficients available, we found that best results are obtained if only a fraction of those are used in the Padé approximations. For the results shown here, we used only \(N = 35\) terms in the series for f(z) in (26), which translates into a type (17, 17) linear Fourier-Padé approximant, and type (11, 11, 11) in the quadratic case.

The results are shown in Fig. 12. The middle figure is the reference solution, computed to high accuracy by the Newton iteration described in Sect. 2. On the left is the approximation obtained by the linear Fourier-Padé approximant. Away from the imaginary axis the approximation is good, but it is poor on the axis itself. In the absence of branches in the approximant, a series of poles and zeros (the latter not clearly visible) appears as a proxy for the jump in phase. The fact that alternating poles and zeros ‘fall in the shadow’ of the branch point is a well-known phenomenon in standard Padé approximation [31], and is evidently also present in the trigonometric case.Footnote 4 On the other hand, the quadratic Fourier-Padé approximant shown on the right is virtually indistinguishable from the reference solution.

Fig. 12
figure 12

Approximation of a branch point singularity in the inviscid Burgers equation, at \(t = 0.75\). Left: a type (17, 17) linear Padé approximation. Middle: reference solution computed by Newton iteration from (7). Right: a type (11, 11, 11) quadratic Padé approximation

The relative errors in these two approximations are shown in Fig. 13. The linear approximant has low accuracy near the imaginary axis because of the spurious poles mentioned above. By contrast, the quadratic approximant maintains high accuracy, even on the imaginary axis. If one takes the solution generated by the Newton method as exact, the quadratic approximant yields more than five decimal digits of accuracy in almost the whole domain shown in Fig. 13.

Fig. 13
figure 13

Relative errors in the approximation of the branch point singularity of Fig. 12. Left: the linear Padé approximation. Right: the quadratic Padé approximation. Bottom: the colour bar in a \(\log _{10}\) scale, so each change in shade represents roughly one decimal digit of accuracy

Further discussion of numerical aspects of quadratic Padé approximation, including their computation and conditioning, can be found in [15].