Orthogonal Systems with a Skew-Symmetric Differentiation Matrix

In this paper, we explore orthogonal systems in $\mathrm{L}_2(\mathbb{R})$ which give rise to a real skew-symmetric, tridiagonal, irreducible differentiation matrix. Such systems are important since they are stable by design and, if necessary, preserve Euclidean energy for a variety of time-dependent partial differential equations. We prove that there is a one-to-one correspondence between such an orthonormal system $\{\varphi_n\}_{n\in\mathbb{Z}_+}$ and a sequence of polynomials $\{p_n\}_{n\in\mathbb{Z}_+}$ orthonormal with respect to a symmetric probability measure $\mathrm{d}\mu(\xi)=w(\xi)\,\mathrm{d}\xi$. If $\mathrm{d}\mu$ is supported by the real line, this system is dense in $\mathrm{L}_2(\mathbb{R})$; otherwise, it is dense in a Paley–Wiener space of band-limited functions. The path leading from $\mathrm{d}\mu$ to $\{\varphi_n\}_{n\in\mathbb{Z}_+}$ is constructive, and we provide detailed algorithms to this end. We also prove that the only such orthogonal system consisting of a polynomial sequence multiplied by a weight function is the Hermite functions. The paper is accompanied by a number of examples illustrating our argument.


Differentiation Matrices
The broad theme underlying this paper is the important benefit accrued in the semidiscretisation of time-dependent partial differential equations once space derivatives are approximated in a skew-symmetric (or, in the complex case, skew-Hermitian) manner. It is instructive to commence with three examples where, for simplicity, we assume a single spatial variable: the diffusion equation (1.1), where $a(x)>0$, $x\in[-1,1]$, given with an initial condition for $t=0$ and either periodic or zero Dirichlet boundary conditions at $x=\pm1$; the nonlinear advection equation (1.2), where $vf(v)\le0$ for all $v\in\mathbb{R}$, given with an $\mathrm{L}_2(\mathbb{R})$ initial condition at $t=0$; and the linear Schrödinger equation (1.3) in the semiclassical regime, given with an initial condition at $t=0$ and periodic boundary conditions at $x=\pm1$.
Here $0<\varepsilon\ll1$, while the interaction potential $V$ is real. It is a trivial exercise to prove that the solutions of both (1.1) and (1.2) are nonincreasing in the usual $\mathrm{L}_2$ norm, while the $\mathrm{L}_2$ norm of the solution of (1.3) is conserved. This represents a critical structural feature of the three equations, one which should ideally be preserved under discretisation. Moreover, once the $\mathrm{L}_2$ norm is uniformly bounded under discretisation, the underlying method is stable.
We commence from (1.1) and assume that it is semidiscretised by, say, finite differences, a spectral method or spectral collocation. The outcome is a set of linear ODEs, $\boldsymbol{u}'=DAD\boldsymbol{u}$, where $A$ is positive definite. The matrix $D$ is the result of discretising the space derivative, and we call it the differentiation matrix. Therefore

$$\frac{\mathrm{d}\|\boldsymbol{u}\|^2}{\mathrm{d}t}=2\boldsymbol{u}^{\top}DAD\boldsymbol{u},$$

where $\|\cdot\|$ is the Euclidean ($\ell_2$) norm. Suppose now that the differentiation matrix $D$ is skew-symmetric. Then $D^{\top}=-D$ and, $A$ being positive definite, it follows at once that $\mathrm{d}\|\boldsymbol{u}\|^2/\mathrm{d}t=-2(D\boldsymbol{u})^{\top}A(D\boldsymbol{u})\le0$: the numerical solution is dissipative (and, incidentally, stable!). Likewise, semidiscretising (1.2) with finite differences, where $f_m(\boldsymbol{u})=f(u_m)$, we can again prove, by an argument identical to the above, that $\mathrm{d}\|\boldsymbol{u}\|^2/\mathrm{d}t\le0$ once $D$ is skew-symmetric. Finally, semidiscretising the Schrödinger equation (1.3), we obtain a linear system in which the matrix $V$ is real. Assuming again that $D$ is skew-symmetric, $\mathrm{d}\|\boldsymbol{u}\|^2/\mathrm{d}t=0$, because $\boldsymbol{u}^*V\boldsymbol{u}$ and $\boldsymbol{u}^*D^2\boldsymbol{u}=-(D\boldsymbol{u})^*(D\boldsymbol{u})=-\|D\boldsymbol{u}\|^2$ are both real. Therefore, the semidiscretisation is conservative, mimicking the behaviour of the original equation.
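The dissipativity argument is easy to test numerically. The following sketch is ours, not the paper's: the grid size, the choice of $a(x)$ and the time step are arbitrary, and we assume the semidiscretisation of (1.1) takes the form $\boldsymbol{u}'=DAD\boldsymbol{u}$ with $A$ positive definite. It builds the classical skew-symmetric central-difference matrix $D$ and confirms that the Euclidean norm of the explicit Euler iterates is nonincreasing:

```python
import numpy as np

# Hypothetical discretisation parameters (not from the paper)
n, h, dt = 50, 2.0 / 50, 1e-3

# Second-order central differences: a skew-symmetric differentiation matrix
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * h)
assert np.allclose(D.T, -D)

A = np.diag(1.0 + np.linspace(0.0, 1.0, n) ** 2)  # a(x) > 0 sampled on the grid
M = D @ A @ D                                      # u' = D A D u

# d||u||^2/dt = 2 u^T D A D u = -2 (Du)^T A (Du) <= 0, so the norm decays
u = np.random.default_rng(0).standard_normal(n)
norms = [np.linalg.norm(u)]
for _ in range(200):   # explicit Euler with a step small enough for stability
    u = u + dt * (M @ u)
    norms.append(np.linalg.norm(u))
assert all(b <= a + 1e-12 for a, b in zip(norms, norms[1:]))
```

The same monotonicity test fails in general if $D$ is not skew-symmetric, e.g. for one-sided differences.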
We should perhaps add that, in this context, $|u(x,t)|^2$ is the probability of finding a particle at $x$: preservation of the norm is not an optional extra but a basic requirement once we wish to get the physics right. Skew-symmetric differentiation matrices have already been analysed at some length in the context of finite differences in Hairer and Iserles [10,11] and Iserles [12,13]. The simplest second-order finite difference discretisation of a derivative, $u'(x_m)\approx(u_{m+1}-u_{m-1})/(2h)$ on a grid of spacing $h$, is skew-symmetric, but this is a false dawn: this is the highest-order skew-symmetric finite difference differentiation matrix on a uniform grid [12]. It is possible to construct higher-order skew-symmetric differentiation matrices on special grids, but this is far from easy and high orders become fairly complicated [10,11]. Arguably, this complexity makes them less suitable for efficient computation.

Two Spectral Examples
In this paper, we explore a general mechanism to generate orthogonal systems with skew-symmetric differentiation matrices, since such systems can be implemented in the context of spectral methods. We do not address the issue of time discretisation, noting in passing that it might require a great deal of additional care, cf., for example, [1]. We commence with two examples. Firstly, consider the task of approximating a smooth periodic function on $[-\pi,\pi]$. The obvious choice in this setting is the Fourier basis, which we write in a real setting; note that the basis is orthonormal. The differentiation matrix is indeed skew-symmetric, as well as tridiagonal and reducible. Whenever a Fourier basis can be used, it is an obvious first choice. The problem is, though, that periodic boundary conditions in a compact interval are something of an exception. Most partial differential equations of interest are either Cauchy problems defined on the entire Euclidean space or possess Dirichlet, Neumann or mixed boundary conditions in a compact domain. The focus of this paper is on Cauchy problems on the real line; the motivation, which originates in (1.3) and its multivariate counterparts, is explained in greater detail in Iserles [13]. Briefly, periodic boundary conditions are useful for numerical simulation only in very short-time integration, leading to wrong behaviour once the solution (which originally has, to all intents and purposes, finite support and moves at finite speed) reaches the boundary. Modern computational challenges, not least in quantum control, call for long-time solution of (1.3), and modern time-stepping methods (for example, those described in Bader et al. [1]) can indeed deliver the required stability, provided that there exists a discretisation yielding a skew-symmetric differentiation matrix. Hence, the challenge is to design and explore orthogonal systems on the real line with this property.
One example of such an orthogonal system, the Hermite functions, is familiar from mathematical physics:

$$\varphi_n(x)=\frac{1}{(2^n n!\,\pi^{1/2})^{1/2}}\,\mathrm{e}^{-x^2/2}H_n(x),\qquad n\in\mathbb{Z}_+,\qquad(1.5)$$

where $H_n$ is the $n$th Hermite polynomial. It follows at once from standard theory of orthogonal polynomials that $\{\varphi_n\}_{n\in\mathbb{Z}_+}$ is indeed an orthonormal sequence, dense in $\mathrm{L}_2(\mathbb{R})$. It is well known (and can be confirmed easily using standard mixed recurrence relations for Hermite polynomials) that

$$\varphi_n'=\sqrt{\frac{n}{2}}\,\varphi_{n-1}-\sqrt{\frac{n+1}{2}}\,\varphi_{n+1},\qquad n\in\mathbb{Z}_+,\qquad(1.6)$$

therefore the corresponding differentiation matrix is skew-symmetric; in addition, it is tridiagonal (making computations easier) and irreducible. Hermite functions have many interesting features: they obey the Cramér inequality $|\varphi_n(x)|\le0.816049\ldots$, $n\in\mathbb{Z}_+$, $x\in\mathbb{R}$, are eigenfunctions of the Fourier transform, obey a second-order linear ordinary differential equation and are related to Whittaker functions and parabolic cylinder functions. Insofar as this paper is concerned, (1.6) represents their most germane feature and a paradigm for more general systems of orthogonal functions in $\mathrm{L}_2(\mathbb{R})$.
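The mixed recurrence for Hermite functions is easy to verify numerically. The sketch below is ours: writing $\varphi_n(x)=\mathrm{e}^{-x^2/2}H_n(x)/(2^n n!\sqrt{\pi})^{1/2}$ and using the standard identity $H_n'=2nH_{n-1}$ for the derivative, we check $\varphi_n'=\sqrt{n/2}\,\varphi_{n-1}-\sqrt{(n+1)/2}\,\varphi_{n+1}$ on a grid (SciPy is assumed available):

```python
import numpy as np
from math import factorial, pi, sqrt
from scipy.special import eval_hermite

def phi(n, x):
    # orthonormal Hermite function
    c = 1.0 / sqrt(2.0**n * factorial(n) * sqrt(pi))
    return c * np.exp(-x**2 / 2) * eval_hermite(n, x)

def dphi(n, x):
    # analytic derivative, using H_n' = 2 n H_{n-1}  (valid for n >= 1)
    c = 1.0 / sqrt(2.0**n * factorial(n) * sqrt(pi))
    return c * np.exp(-x**2 / 2) * (2 * n * eval_hermite(n - 1, x)
                                    - x * eval_hermite(n, x))

x = np.linspace(-4.0, 4.0, 201)
for n in range(1, 6):
    lhs = dphi(n, x)
    rhs = sqrt(n / 2) * phi(n - 1, x) - sqrt((n + 1) / 2) * phi(n + 1, x)
    assert np.allclose(lhs, rhs)
```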

Plan of This Paper
The purpose of this paper is to characterise all orthogonal systems in L 2 (R) which possess a real skew-symmetric, tridiagonal, irreducible differentiation matrix.
An obvious approach is to consider different orthogonal systems $\{\varphi_n\}_{n\in\mathbb{Z}_+}$ in $\mathrm{L}_2(\mathbb{R})$ and to check for each whether the differentiation matrix is of the right form, but this hit-and-miss approach is unlikely to take us far. In Sect. 2, we adopt an alternative organising principle: we start from a countable sequence of functions $\{\varphi_n\}_{n\in\mathbb{Z}_+}$ in $\mathrm{L}_2(\mathbb{R})$ which are hardwired to possess a skew-symmetric, tridiagonal, irreducible differentiation matrix and seek conditions for their orthogonality.
On the face of it, we have just replaced one hit-and-miss approach by another, yet there is a crucial difference. In Sect. 3, we demonstrate that such a set of functions can be mapped by means of the Fourier transform, subject to rescaling, into a set $\{p_n\}_{n\in\mathbb{Z}_+}$ of polynomials which are orthogonal with respect to some symmetric measure $\mathrm{d}\mu(x)=w(x)\,\mathrm{d}x$. This argument can be reversed: given a real, symmetric measure $\mathrm{d}\mu$, we can construct an underlying set of orthonormal polynomials. These polynomials obey the familiar three-term recurrence relation, from which we can recover the elements of $\mathcal{D}$. Finally, we recover the functions $\varphi_n$ by inverse Fourier transformation of $\sqrt{w}\,p_n$, $n\in\mathbb{Z}_+$.
It follows from the Parseval theorem that these functions form an orthonormal set. We prove that this set is dense in L 2 (R) if dμ is supported by R; otherwise, it is dense in a Paley-Wiener space. In Sect. 4, we describe practical algorithms for the evaluation of the ϕ n s, in Sect. 5 we explore the structure of the functions {ϕ n } n∈Z + , while in Sect. 6 we present a number of examples. The paper concludes in Sect. 7 with a brief summary and discussion.

The Main Paradigm
The purpose of this paper is to investigate orthogonal systems $\Phi=\{\varphi_n\}_{n\in\mathbb{Z}_+}$ of functions which are orthogonal on the real line with respect to the standard $\mathrm{L}_2(\mathbb{R})$ inner product and whose differentiation matrix $\mathcal{D}$ is skew-symmetric, tridiagonal and irreducible: its only nonzero entries are $\mathcal{D}_{n+1,n}=-\mathcal{D}_{n,n+1}=b_n$, where $b_0,b_1,\ldots$ are real, nonzero constants. In other words, we wish to satisfy the skew-symmetric differentiation matrix relation

$$\varphi_n'=-b_{n-1}\varphi_{n-1}+b_n\varphi_{n+1},\qquad n\in\mathbb{Z}_+,\quad b_{-1}=0.\qquad(2.1)$$

Given a seed function $\varphi_0\in\mathrm{C}^\infty(\mathbb{R})$, the remaining functions can be defined uniquely by recursion from (2.1): $\varphi_1=\varphi_0'/b_0$, $\varphi_2=(\varphi_1'+b_0\varphi_0)/b_1$ and so on. In general, easy induction confirms that

$$\varphi_n=\sum_{\ell=0}^{\lfloor n/2\rfloor}\alpha_{n,\ell}\,\varphi_0^{(n-2\ell)}\qquad(2.2)$$

for suitable real constants $\alpha_{n,\ell}$. The first thing to notice from this formula is that the seed $\varphi_0$, hence every $\varphi_n$, must indeed be a smooth (i.e. infinitely differentiable) function. Also note that, while cumbersome, this formula provides a practical method of generating examples. Note, however, a major problem, which provides the focus of this paper: the above procedure produces a set which is guaranteed to satisfy (2.1), but a priori there is absolutely no reason why it should be orthogonal, whether with respect to the standard or any other inner product. Likewise, even if orthogonal, there is no a priori reason for $\Phi$ to be complete in the separable Hilbert space $\mathrm{L}_2(\mathbb{R})$.
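The recursion is straightforward to drive symbolically. A minimal SymPy sketch (ours): starting from a seed $\varphi_0$ and coefficients $b_n$, we generate $\varphi_{n+1}=(\varphi_n'+b_{n-1}\varphi_{n-1})/b_n$ with $b_{-1}=0$. The seed and coefficients below are the Hermite choices of Example 1, with $b_n=-\sqrt{(n+1)/2}$ under our sign convention:

```python
import sympy as sp

x = sp.symbols('x')

def generate(phi0, b, N):
    """Generate phi_0, ..., phi_N by the recursion phi_{n+1} = (phi_n' + b_{n-1} phi_{n-1}) / b_n."""
    phis = [phi0]
    for n in range(N):
        prev = phis[n - 1] if n >= 1 else 0
        bprev = b(n - 1) if n >= 1 else 0
        phis.append(sp.simplify((sp.diff(phis[n], x) + bprev * prev) / b(n)))
    return phis

# Hermite-type choice: seed pi^(-1/4) exp(-x^2/2), b_n = -sqrt((n+1)/2)
b = lambda n: -sp.sqrt(sp.Rational(n + 1, 2))
phis = generate(sp.pi**sp.Rational(-1, 4) * sp.exp(-x**2 / 2), b, 3)

# phi_1 should be a constant multiple of x exp(-x^2/2)
assert sp.simplify(phis[1] / (x * sp.exp(-x**2 / 2))).is_constant()
```

Swapping in other seeds and coefficients (e.g. constant $b_n$, as in Example 3) generates the corresponding sequences, with no guarantee of orthogonality.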

Example 1 (Hermite functions) Given the seed function and differentiation coefficients

$$\varphi_0(x)=\frac{1}{\pi^{1/4}}\,\mathrm{e}^{-x^2/2},\qquad b_n=-\sqrt{\frac{n+1}{2}},\quad n\in\mathbb{Z}_+$$

(the signs of the $b_n$ correspond to Hermite polynomials with positive leading coefficients), the functions generated form the orthonormal Hermite function basis, which we have already encountered in Sect. 1.

Example 2 (Spherical Bessel functions) Given the seed function and differentiation coefficients

$$\varphi_0(x)=\frac{1}{\sqrt{\pi}}\,\frac{\sin x}{x},\qquad b_n=-\frac{n+1}{\sqrt{4(n+1)^2-1}},\quad n\in\mathbb{Z}_+,$$

the functions generated are

$$\varphi_n(x)=\sqrt{\frac{2n+1}{\pi}}\,\mathrm{j}_n(x),\qquad n\in\mathbb{Z}_+,$$

which are scaled spherical Bessel functions. Note that while the Bessel functions $J_{n+1/2}(z)$ have a branch cut in $(-\infty,0]$, the spherical Bessel function $\mathrm{j}_n(z)$ is entire.
Example 3 (Quasi-Hermite) Suppose we use the Gaussian seed function $\varphi_0(x)=\mathrm{e}^{-x^2/2}$ and constant differentiation coefficients, $b_n\equiv b\neq0$. Then each function in the generated sequence is the same Gaussian multiplied by a linear combination of Hermite polynomials $H_k(x)$. We do not derive the explicit formula since, as will transpire in Sect. 5, such functions cannot be orthogonal, and hence are of little merit.
To recap: whatever the merits of (2.2) (and it is certainly useful for generating the desired function sequence $\{\varphi_n\}_{n\in\mathbb{Z}_+}$), this construction does not make obvious which properties the sequence might have, such as orthogonality, or what its norm is. Indeed, the Hermite functions and the spherical Bessel functions generated in the above examples turn out to be orthonormal in $\mathrm{L}_2(\mathbb{R})$, but it is not obvious from this construction that this would happen. It can be shown that the function sequence generated in Example 3 is not orthogonal with respect to any real-valued measure. Other types of construction will be considered in Sects. 3 and 4, for which the orthogonality of the sequence can be predicted more readily.
We commenced this section specifying a seed ϕ 0 and nonzero real coefficients {b n } n∈Z + . As becomes clear in the next section, once we seek orthogonality, the right procedure is more minimalistic: it is enough to choose a symmetric positive Borel measure dμ on the real line, which determines both the seed and the coefficients in a unique manner while ensuring orthogonality and completeness in a certain Hilbert space.

There and Back Again
We commence our journey from a given sequence $\Phi=\{\varphi_n\}_{n\in\mathbb{Z}_+}$ of real-valued, square-integrable, smooth functions which has a skew-symmetric differentiation matrix with nonzero real constants $\{b_n\}_{n\in\mathbb{Z}_+}$, as in (2.1). For our departure, we require the unitary Fourier transform,

$$\mathcal{F}[\varphi](\xi)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\varphi(x)\,\mathrm{e}^{-\mathrm{i}x\xi}\,\mathrm{d}x.\qquad(3.1)$$

As is well known, the Fourier transform and differentiation obey the commutation relation

$$\mathcal{F}[\varphi'](\xi)=\mathrm{i}\xi\,\mathcal{F}[\varphi](\xi).\qquad(3.2)$$

This motivates our definition of a transformed sequence of functions, $\Psi=\{\psi_n\}_{n\in\mathbb{Z}_+}$, where $\psi_n(\xi)=\mathrm{i}^n\,\mathcal{F}[\varphi_n](\xi)$. Each $\varphi_n$ has a well-defined Fourier transform, being square-integrable on the real line. From (3.2), we have $\mathcal{F}[\varphi_n'](\xi)=\mathrm{i}\xi\,\mathrm{i}^{-n}\psi_n(\xi)$ for all $n\in\mathbb{Z}_+$. Next, using the skew-symmetric differentiation property (2.1) of $\Phi$, we see that the transformed functions satisfy

$$b_n\psi_{n+1}(\xi)=-\xi\,\psi_n(\xi)-b_{n-1}\psi_{n-1}(\xi),\qquad n\in\mathbb{Z}_+.$$

In other words, they satisfy a three-term recurrence relation! A simple consequence of this is $\psi_n(\xi)=p_n(\xi)\psi_0(\xi)$, where $P=\{p_n\}_{n\ge0}$ is a sequence of polynomials satisfying the three-term recurrence

$$p_{n+1}(\xi)=-\frac{\xi}{b_n}\,p_n(\xi)-\frac{b_{n-1}}{b_n}\,p_{n-1}(\xi),\qquad n\in\mathbb{Z}_+,\quad p_{-1}\equiv0,\;p_0\equiv1.\qquad(3.3)$$

We can now use the classical Favard theorem [3,6] to deduce that these polynomials are orthogonal with respect to a real-valued, finite Borel measure.
Theorem 4 (Favard) Let $P=\{p_n\}_{n\ge0}$ be a sequence of real polynomials such that $\deg(p_n)=n$. $P$ is an orthogonal system with respect to the inner product $\langle f,g\rangle_\mu=\int f(\xi)g(\xi)\,\mathrm{d}\mu(\xi)$ for some probability measure $\mathrm{d}\mu$ on the real line if and only if the polynomials satisfy a three-term recurrence

$$p_{n+1}(\xi)=(\alpha_n-\beta_n\xi)\,p_n(\xi)-\gamma_n\,p_{n-1}(\xi),\qquad n\in\mathbb{Z}_+,$$

with real coefficients such that $\beta_n\neq0$ and $\gamma_n/(\beta_{n-1}\beta_n)>0$ for all $n\in\mathbb{N}$.

The polynomials from Eq. (3.3) fall under the umbrella of Favard's theorem, with

$$\alpha_n=0,\qquad\beta_n=\frac{1}{b_n},\qquad\gamma_n=\frac{b_{n-1}}{b_n};\qquad(3.4)$$

note that $\gamma_n/(\beta_{n-1}\beta_n)=b_{n-1}^2>0$. We can now deduce even more about the polynomials, because there exist simple relationships between the polynomials, the coefficients of the three-term recurrence and the measure of orthogonality, as follows.
Lemma 5 Let $P$ be as in Favard's theorem above.
1. The polynomials are orthonormal if and only if $\gamma_n\beta_{n-1}/\beta_n=1$ for all $n\in\mathbb{N}$ and $\|p_0\|_\mu=1$.
2. The measure $\mathrm{d}\mu$ is symmetric if and only if $\alpha_n=0$ for all $n\in\mathbb{Z}_+$, in which case $p_n$ has the same parity as the integer $n$.
3. The leading coefficients $k_n$ of the polynomials satisfy $k_{n+1}=-\beta_nk_n$.

Proof We sketch the proofs because they are elementary but not obvious. Multiplying the three-term recurrence by $p_{n+1}$, integrating, dropping the terms which vanish by orthogonality, and then applying the three-term recurrence to $\xi p_n(\xi)$ (and dropping more terms which vanish by orthogonality) shows that

$$\|p_{n+1}\|_\mu^2=\frac{\gamma_{n+1}\beta_n}{\beta_{n+1}}\,\|p_n\|_\mu^2.$$

This proves part 1 of this lemma. Part 2 of the lemma is proved in Theorem 4.3 of Chihara [3]. Part 3 follows from the fact that the leading coefficient of $p_{n+1}$ is equal to the leading coefficient of $-\beta_n\xi p_n(\xi)$, which equals the leading coefficient of $p_n$ times $-\beta_n$.
By Lemma 5 parts 1 and 2, the polynomials in Eq. (3.3) are always orthonormal with respect to a symmetric probability measure $\mathrm{d}\mu$ on the real line, regardless of the signs and magnitudes of the $b_n$'s. In fact, by part 3 of Lemma 5, the signs of the leading coefficients of these orthonormal polynomials depend on the signs of the $b_n$'s.
Taking stock, we see that our journey has led us from the sequence of functions $\Phi=\{\varphi_n\}_{n\in\mathbb{Z}_+}$ with a real, skew-symmetric, tridiagonal, irreducible differentiation matrix to the transformed functions $\Psi=\{\psi_n\}_{n\in\mathbb{Z}_+}$, which are of the form $\psi_n(\xi)=p_n(\xi)g(\xi)$, where $P=\{p_n\}_{n\in\mathbb{Z}_+}$ are orthonormal polynomials with respect to a symmetric probability measure $\mathrm{d}\mu$ and $g(\xi)=\psi_0(\xi)=\mathcal{F}[\varphi_0](\xi)$. By Eq. (2.2), $\varphi_0$ is infinitely differentiable, which implies that $g(\xi)\to0$ superalgebraically as $|\xi|\to\infty$. Since we assume that $\varphi_0$ is real-valued, the function $g=\mathcal{F}[\varphi_0]$ must have an even real part and an odd imaginary part (although in all specific cases considered in this paper, $g$ is real-valued and even). Furthermore, since $\varphi_0$ is square-integrable, so is $g$.
In the current context, we refer to such functions as mollifiers.
Let us now embark on a return journey. Commencing with a symmetric probability measure dμ on the real line, we let P = {p n } n∈Z + be a family of real orthonormal polynomials for the measure dμ, with p 0 = 1. Then by Favard's theorem and Lemma 5, there exists a sequence of nonzero real numbers {b n } n∈Z + such that the three-term recurrence (3.3) holds. Also by Lemma 5, the sign of each b n depends on the changes in sign of the leading coefficients of the polynomials, which can be arbitrary.
Define the functions $\Phi=\{\varphi_n\}_{n\in\mathbb{Z}_+}$ by

$$\varphi_n=\mathcal{F}^{-1}\bigl[(-\mathrm{i})^n g\,p_n\bigr],\qquad n\in\mathbb{Z}_+,\qquad(3.7)$$

where $g$ is a mollifier as discussed above. By Lemma 5 part 2, $p_n$ has the same parity (as a function) as the integer $n$. Since $g$ has an even real part and an odd imaginary part, so does $(-\mathrm{i})^n g\,p_n$; this implies that $\varphi_n$ is real-valued for all $n\in\mathbb{Z}_+$. Furthermore, since $g$ is square-integrable and decays superalgebraically to zero at infinity, $g\,p_n$ is square-integrable, and hence so is $\varphi_n$ for every $n\in\mathbb{Z}_+$. It is our intent that $\Phi$ have a skew-symmetric differentiation matrix. Indeed, using the commutation relation (3.2) and the three-term recurrence (3.3) for $P$, we have for $n\in\mathbb{N}$

$$\mathcal{F}[\varphi_n'](\xi)=\mathrm{i}\xi(-\mathrm{i})^n g(\xi)p_n(\xi)=(-\mathrm{i})^{n-1}g(\xi)\bigl(-b_np_{n+1}(\xi)-b_{n-1}p_{n-1}(\xi)\bigr)=-b_{n-1}\mathcal{F}[\varphi_{n-1}](\xi)+b_n\mathcal{F}[\varphi_{n+1}](\xi),$$

whence $\varphi_n'=-b_{n-1}\varphi_{n-1}+b_n\varphi_{n+1}$. Similarly, $\varphi_0'=b_0\varphi_1$. The return journey goes off without a hitch! The mollifier $g$ is also uniquely determined by the family $\Phi$: we have $g=\mathcal{F}[\varphi_0]$, because $p_0=1$. This proves the following theorem.
Theorem 6 A sequence $\Phi=\{\varphi_n\}_{n\in\mathbb{Z}_+}$ of real-valued, square-integrable functions obeys the differentiation relation (2.1) with nonzero real constants $\{b_n\}_{n\in\mathbb{Z}_+}$ if and only if

$$\varphi_n=\mathcal{F}^{-1}\bigl[(-\mathrm{i})^n g\,p_n\bigr],\qquad n\in\mathbb{Z}_+,$$

where $P=\{p_n\}_{n\in\mathbb{Z}_+}$ is an orthonormal polynomial system on the real line with respect to a symmetric probability measure $\mathrm{d}\mu$ and $g$ is a mollifier (as defined above). Furthermore, $P$ and $g$ are uniquely determined by $\Phi$ and $\{b_n\}_{n\in\mathbb{Z}_+}$.
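The correspondence can be illustrated with a small numerical sketch of ours: take the probability measure $\mathrm{d}\mu(\xi)=\frac12\,\mathrm{d}\xi$ on $[-1,1]$ (our choice), so that the mollifier is $g=1/\sqrt{2}$ on $[-1,1]$ and zero elsewhere. Then $\varphi_0=\mathcal{F}^{-1}[g]$ should equal $\sin x/(\sqrt{\pi}\,x)$, which the following quadrature check confirms (the powers of $-\mathrm{i}$ are immaterial for $n=0$):

```python
import numpy as np

# Gauss-Legendre nodes and weights on [-1, 1] for the inverse Fourier integral
xi, wq = np.polynomial.legendre.leggauss(200)

def phi0(x):
    # (1/sqrt(2*pi)) * int_{-1}^{1} (1/sqrt(2)) e^{i x xi} d xi, real by symmetry
    return (wq * np.cos(x * xi) / np.sqrt(2)).sum() / np.sqrt(2 * np.pi)

for x in [0.3, 1.7, 5.0]:
    assert abs(phi0(x) - np.sin(x) / (np.sqrt(np.pi) * x)) < 1e-10
```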

Remark 7
Relaxing the conditions on what it means to be a mollifier yields further examples of sequences with a real, skew-symmetric, tridiagonal, irreducible differentiation matrix, but which are not necessarily square-integrable or real-valued. Furthermore, relaxing the symmetry condition on the measure $\mathrm{d}\mu$ leads to skew-Hermitian differentiation matrices. We do not pursue these extensions in this paper.

Orthogonal Bases
It turns out that this characterisation via the Fourier transform lends itself well to the determination of orthogonality. Recall Parseval's theorem, namely that the unitary Fourier transform (3.1) satisfies

$$\langle\mathcal{F}[\varphi],\mathcal{F}[\psi]\rangle=\langle\varphi,\psi\rangle$$

for all $\varphi,\psi\in\mathrm{L}_2(\mathbb{R})$.
Theorem 8 Let $\varphi_n=\mathcal{F}^{-1}[(-\mathrm{i})^n g\,p_n]$ for $n\in\mathbb{Z}_+$, as in Theorem 6. Then $\Phi$ is orthogonal in $\mathrm{L}_2(\mathbb{R})$ if and only if $P$ is orthogonal with respect to the measure $|g(\xi)|^2\,\mathrm{d}\xi$. Furthermore, whenever $\Phi$ is orthogonal, the functions $\varphi_n/\|g\|_2$ are orthonormal.

Proof By Parseval's theorem, $\langle\varphi_m,\varphi_n\rangle=\int p_m(\xi)p_n(\xi)\,|g(\xi)|^2\,\mathrm{d}\xi$ (the powers of $-\mathrm{i}$ are immaterial for orthogonality). Clearly, then, $\Phi$ is orthogonal if and only if $P$ is orthogonal with respect to $|g(\xi)|^2\,\mathrm{d}\xi$. For the final part of the theorem, recall the earlier discussion: $P$ is always orthonormal with respect to the probability measure $\mathrm{d}\mu(\xi)\big/\!\int p_0^2\,\mathrm{d}\mu(\xi)$. Therefore, in the present case, $P$ is orthonormal with respect to $|g(\xi)|^2\,\mathrm{d}\xi/\|g\|_2^2$, whence $\|\varphi_n\|_2=\|g\|_2$ for all $n\in\mathbb{Z}_+$. This completes the proof.
What Theorems 6 and 8 show is that there is a one-to-one correspondence between orthogonal systems $\Phi$ with real, skew-symmetric, tridiagonal, irreducible differentiation matrices and sequences $P$ of orthonormal polynomials with respect to a symmetric probability measure of the form $\mathrm{d}\mu(\xi)=w(\xi)\,\mathrm{d}\xi$. (Note that we allow the leading coefficients of the orthonormal polynomials to have arbitrary signs for this correspondence.) One more question remains: we have characterised when $\Phi$ is an orthogonal basis, but for which Hilbert space is it an orthogonal basis?
Let us go right ahead with the answer. Define the Paley–Wiener space

$$\mathrm{PW}_\Omega(\mathbb{R})=\bigl\{f\in\mathrm{L}_2(\mathbb{R}):\operatorname{supp}\mathcal{F}[f]\subseteq\Omega\bigr\},$$

where $\Omega$ is the support of $\mathrm{d}\mu$ [20]. Restricting a function's Fourier transform to lie within a subset of the real line is known as band-limiting; Paley–Wiener spaces are often referred to as band-limited function spaces and have numerous applications in signal processing. Before proceeding to prove that this answer is the right one, we add a small disclaimer. From here onwards, we assume that the measure $\mathrm{d}\mu$ in Theorems 6 and 8 is such that polynomials are dense in the space $\mathrm{L}_2(\mathbb{R},\mathrm{d}\mu)$, so that $P$ forms a complete orthonormal basis of the whole space. This is a technical condition, and it fails only in esoteric examples [3, p. 73]. We will not dwell upon it here, merely warn the reader.
Consider the mapping $M[p]=\mathcal{F}^{-1}[g\cdot p]$ for any polynomial $p$. Because $g$ is a mollifier, we have $g\cdot p\in\mathrm{L}_2(\mathbb{R})$ and, since the Fourier transform maps $\mathrm{L}_2(\mathbb{R})$ to itself, $M[p]\in\mathrm{L}_2(\mathbb{R})$. Furthermore, $\mathcal{F}[M[p]](\xi)=g(\xi)p(\xi)=0$ for any $\xi\notin\Omega$, where $\Omega$ is the support of $\mathrm{d}\mu$. Therefore, $M$ maps polynomials into $\mathrm{PW}_\Omega(\mathbb{R})$. Now, consider the inner product after applying $M$ to polynomials $p$ and $q$:

$$\langle M[p],M[q]\rangle=\langle g\,p,g\,q\rangle=\int p(\xi)q(\xi)\,|g(\xi)|^2\,\mathrm{d}\xi=\langle p,q\rangle_\mu.$$

We used Parseval's theorem for the first equality. First of all, we see from this that $M$ is a bounded linear operator from polynomials to $\mathrm{PW}_\Omega(\mathbb{R})$. Since we have assumed that polynomials are dense in $\mathrm{L}_2(\mathbb{R},\mathrm{d}\mu)$, we can extend $M$ continuously to the entirety of $\mathrm{L}_2(\mathbb{R},\mathrm{d}\mu)$. Hence, $M$ is in fact an isometry between $\mathrm{L}_2(\mathbb{R},\mathrm{d}\mu)$ and $\mathrm{PW}_\Omega(\mathbb{R})$.
One can readily check that $M$ has a left and right inverse given by $M^{-1}[\varphi](\xi)=\mathcal{F}[\varphi](\xi)/g(\xi)$ for $\xi\in\Omega$ and $M^{-1}[\varphi](\xi)=0$ otherwise. We see that $M$ is an isometric isomorphism between $\mathrm{L}_2(\mathbb{R},\mathrm{d}\mu)$ and $\mathrm{PW}_\Omega(\mathbb{R})$ which maps the orthogonal basis $\{(-\mathrm{i})^np_n\}_{n\in\mathbb{Z}_+}$ to $\{\varphi_n\}_{n\in\mathbb{Z}_+}$. This proves that $\Phi$ is an orthogonal basis for $\mathrm{PW}_\Omega(\mathbb{R})$.

Construction of Orthogonal Systems
It follows from the analysis of the last section that the right starting point for our analysis is choosing a symmetric Borel measure dμ(ξ ) = w(ξ ) dξ in Fourier space, supported either on R or in an interval of the form [−a, a], a > 0. This measure determines the mollifier g, the constants {b n } n∈Z + and ultimately the orthogonal sequence in a unique manner. Alternatively, we may choose a positive sequence {b n } n∈Z + and reconstruct the measure dμ from (3.4) using Theorem 4, but this is more problematic because the Favard theorem does not ensure uniqueness of the measure unless, for example, the Carleman criterion is satisfied [3, p. 75]. Yet another alternative is to commence from the seed ϕ 0 and Fourier transform it, thereby recovering the mollifier and the measure. In the sequel, we restrict ourselves to the first of these three approaches.
We will now demonstrate how to take a symmetric Borel measure $\mathrm{d}\mu(\xi)=w(\xi)\,\mathrm{d}\xi$ (which is not necessarily normalised to be a probability measure) and produce coefficients $\{b_n\}_{n\in\mathbb{Z}_+}$ and a seed function $\varphi_0$ which generate an orthogonal sequence $\Phi$. Consistent with the last section, we let $g(\xi)=\sqrt{w(\xi)}$ for all $\xi$ in $\Omega$, the support of $\mathrm{d}\mu$, and $g(\xi)=0$ otherwise.
To determine the coefficients $\{b_n\}_{n\in\mathbb{Z}_+}$, take any sequence $P=\{1,p_1,p_2,\ldots\}$ of orthonormal polynomials for the normalised measure $\mathrm{d}\mu(\xi)\big/\!\int\mathrm{d}\mu(\xi)$. The leading coefficients of $p_1,p_2$ and so on may be negative. Favard's theorem states that $P$ must satisfy a three-term recurrence,

$$p_{n+1}(\xi)=(\alpha_n-\beta_n\xi)\,p_n(\xi)-\gamma_n\,p_{n-1}(\xi),\qquad n\in\mathbb{Z}_+.\qquad(4.1)$$

By Lemma 5, $\alpha_n=0$ and $\gamma_n\beta_{n-1}/\beta_n=1$, since $\mathrm{d}\mu$ is a symmetric, positive measure and $P$ is orthonormal with respect to a probability measure. Now, if we set $b_n=\beta_n^{-1}$, then (4.1) becomes

$$p_{n+1}(\xi)=-\frac{\xi}{b_n}\,p_n(\xi)-\frac{b_{n-1}}{b_n}\,p_{n-1}(\xi),$$

which is exactly the recurrence (3.3); here, we have used the fact that $\gamma_n\beta_{n-1}/\beta_n=1$, whence $\gamma_n=\beta_n/\beta_{n-1}=b_{n-1}/b_n$. The signs of the $b_n$'s will be negative if the signs of the leading coefficients of $p_n$ and $p_{n+1}$ are the same, and positive otherwise, by Lemma 5. By the discussion in Sect. 3, a seed for these coefficients $\{b_n\}_{n\in\mathbb{Z}_+}$ which leads to an orthogonal system is any scalar multiple of $\mathcal{F}^{-1}[\sqrt{w}]$; the particular choice

$$\varphi_0=\mathcal{F}^{-1}\!\left[\sqrt{w}\,\Big/\Bigl(\int_{-\infty}^{\infty}w(\xi)\,\mathrm{d}\xi\Bigr)^{\!1/2}\right]$$

generates an orthonormal sequence $\Phi$, by the last part of Theorem 8.
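A sketch of this computation for the Legendre measure on $[-1,1]$ (our worked instance): the orthonormal Legendre polynomials satisfy $\xi p_n=a_{n+1}p_{n+1}+a_np_{n-1}$ with $a_n=n/\sqrt{4n^2-1}$, so matching (4.1) with $\alpha_n=0$ and $b_n=\beta_n^{-1}$ gives $|b_n|=a_{n+1}$, the sign depending on the chosen signs of the leading coefficients:

```python
import numpy as np
from numpy.polynomial import legendre as leg

# Recurrence coefficient of the orthonormal Legendre polynomials
# (orthonormal with respect to d mu = d xi / 2 on [-1, 1])
def a(n):
    return n / np.sqrt(4.0 * n**2 - 1)

# p_n = sqrt(2 n + 1) P_n is orthonormal with respect to the probability measure
def p(n, x):
    c = np.zeros(n + 1)
    c[n] = 1.0
    return np.sqrt(2 * n + 1) * leg.legval(x, c)

# verify  xi p_n = a_{n+1} p_{n+1} + a_n p_{n-1},  whence |b_n| = a_{n+1}
x = np.linspace(-0.9, 0.9, 7)
for n in range(1, 5):
    assert np.allclose(x * p(n, x), a(n + 1) * p(n + 1, x) + a(n) * p(n - 1, x))
```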
In some situations, all one has to hand are orthogonal polynomials $\tilde{P}=\{\tilde{p}_n\}_{n\in\mathbb{Z}_+}$ for $\mathrm{d}\mu$ which are not necessarily orthonormal and which obey a three-term recurrence of their own. The orthonormal polynomials are then $p_n=\pm\tilde{p}_n/\|\tilde{p}_n\|_\mu$, and either sign may be used to obtain all possible orthonormal polynomial sequences $P$. Using this relationship and the three-term recurrence for $\tilde{P}$, we can rewrite the three-term recurrence for $P$, read off the coefficients $b_n$ from it and take the seed function to be any scalar multiple of $\mathcal{F}^{-1}[\sqrt{w}]$, exactly as before; the normalised multiple again generates an orthonormal sequence $\Phi$.

Algorithm I
The simplest approach is to follow (4.7) and (4.8) by consecutive differentiation,

$$\varphi_{n+1}=\frac{1}{b_n}\bigl(\varphi_n'+b_{n-1}\varphi_{n-1}\bigr),\qquad n\in\mathbb{Z}_+,$$

where $b_{-1}=0$. The simplest, alas, is not the best: it produces a sequence of functions, but it then depends on ingenuity and a great deal of careful algebra to identify the underlying construct, if possible, with more familiar functions. A good example is that of the Hermite functions (1.5): now $w(x)=\mathrm{e}^{-x^2}$, therefore $g(x)=\mathrm{e}^{-x^2/2}$, and it is possible, using standard formulae for Hermite polynomials, to prove by induction that the outcome is consistent with the orthonormal sequence

$$\varphi_n(x)=\frac{1}{(2^nn!\,\pi^{1/2})^{1/2}}\,\mathrm{e}^{-x^2/2}H_n(x),\qquad n\in\mathbb{Z}_+,$$

but this is neither the most natural nor the simplest way of doing so.

Algorithm II
We commence by constructing an orthonormal polynomial system $\{p_n\}_{n\in\mathbb{Z}_+}$ explicitly. For numerous measures $\mathrm{d}\mu$, such systems are known explicitly; otherwise, we might use the three-term recurrence relation (3.3), perhaps in tandem with the Golub–Welsch algorithm [8]. This is followed by computing (3.7),

$$\varphi_n=\mathcal{F}^{-1}\bigl[(-\mathrm{i})^n g\,p_n\bigr],\qquad n\in\mathbb{Z}_+.$$

Note that, by the discussion in Sect. 3, we need to have these polynomials normalised so that their norms in $\mathrm{L}_2(\mathbb{R},\mathrm{d}\mu)$ are all the same, or else we will not obtain a sequence with a skew-symmetric differentiation matrix. Returning to Example 1, all we need is to recall that Hermite functions are eigenfunctions of the Fourier transform and, wishing to obtain an orthonormal sequence, (3.7) reproduces (1.5) at once. As another example, take the Legendre measure $\mathrm{d}\mu(\xi)=\frac12\chi_{[-1,1]}(\xi)\,\mathrm{d}\xi$, for which $p_n=\sqrt{2n+1}\,\mathrm{P}_n$ are the orthonormal Legendre polynomials: the outcome is

$$\varphi_n(x)=\sqrt{\frac{2}{\pi}}\,\sqrt{n+\tfrac12}\;\mathrm{j}_n(x),\qquad n\in\mathbb{Z}_+.\qquad(4.11)$$

The factor of $n+\frac12$ comes from the normalisation of the Legendre polynomials. Since $\mathrm{d}\mu$ is supported by $[-1,1]$, the above orthonormal sequence is dense in the Paley–Wiener space $\mathrm{PW}_{[-1,1]}(\mathbb{R})$. Interestingly enough, we did not find a reference to the formula which follows from our construction, but note that it can be derived directly from Olver et al. [17, Eq. 10.22.6] with little difficulty.
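The transformed Legendre functions can be checked numerically. Assuming the normalisation $\varphi_n(x)=\sqrt{(2n+1)/\pi}\,\mathrm{j}_n(x)$ and $b_n=-(n+1)/\sqrt{4(n+1)^2-1}$ (our derived constants; the signs depend on the chosen normalisation of the polynomials), the skew-symmetric relation $\varphi_n'=-b_{n-1}\varphi_{n-1}+b_n\varphi_{n+1}$ holds pointwise:

```python
import numpy as np
from scipy.special import spherical_jn

# phi_n(x) = sqrt((2n+1)/pi) j_n(x): transformed Legendre functions
def phi(n, x):
    return np.sqrt((2 * n + 1) / np.pi) * spherical_jn(n, x)

# coefficients of the skew-symmetric differentiation matrix (our signs)
def b(n):
    return -(n + 1) / np.sqrt(4.0 * (n + 1)**2 - 1)

x = np.linspace(0.5, 10.0, 20)
for n in range(1, 5):
    lhs = np.sqrt((2 * n + 1) / np.pi) * spherical_jn(n, x, derivative=True)
    rhs = -b(n - 1) * phi(n - 1, x) + b(n) * phi(n + 1, x)
    assert np.allclose(lhs, rhs)
```

The check boils down to the classical identity $(2n+1)\mathrm{j}_n'=n\,\mathrm{j}_{n-1}-(n+1)\mathrm{j}_{n+1}$.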

Algorithm III
Formula (2.2) is the starting point of an alternative means to derive the sequence $\Phi$. Fourier transforming (2.2) yields an expansion of $\psi_n$ in powers of $\xi$ times $\psi_0$. The polynomial $p_n$, being of the same parity as $n$, can be written in the form

$$p_n(\xi)=\sum_{\ell=0}^{\lfloor n/2\rfloor}p_{n,\ell}\,\xi^{n-2\ell}$$

and, comparing with $\psi_n=\psi_0p_n$, we deduce that

$$\varphi_n=\sum_{\ell=0}^{\lfloor n/2\rfloor}\alpha_{n,\ell}\,\varphi_0^{(n-2\ell)}.\qquad(4.12)$$

This, together with (2.2), implies that the constants $\alpha_{n,\ell}$, $\ell=0,\ldots,\lfloor n/2\rfloor$, are determined by the $p_{n,\ell}$'s and products of the $b_n$'s and, more importantly, provides an explicit formula for $\varphi_n$ in the ubiquitous case when the $p_{n,\ell}$'s are known. Inasmuch as (4.12) looks very attractive, we observe that it trades the computation of Fourier transforms of the $p_n$'s (as in Algorithm II) for the computation of derivatives of $\varphi_0$, which is often fairly complicated. Thus, returning to the example of Legendre polynomials, we deduce from Rainville [18, p. 161] explicit expressions for the $p_{n,\ell}$'s, while the seed is $\varphi_0(x)=\mathrm{j}_0(x)/\sqrt{\pi}$, where $\mathrm{j}_n$ is the (entire) spherical Bessel function. The outcome is a formula (4.13) expressing $\varphi_n$ as a linear combination of derivatives of $\mathrm{j}_0$. It is far from straightforward to prove that (4.13) is identical to (4.11), e.g. by iterating [17, Eq. 10.51.2].

Systems of Quasi-polynomials
The normalised Fourier transforms $\psi_n$ from Sect. 3 were all of the form $g\,p_n$, with a mollifier $g=\sqrt{w}$ and orthogonal polynomials $p_n$, each of degree $n$. A similar pattern is displayed by the functions $\varphi_n$ themselves in the special case of the Hermite functions (Example 1 of Sect. 2). In general, we say that $\{f_n\}_{n\in\mathbb{Z}_+}$ is a sequence of quasi-polynomials if each $f_n$ is a product of a fixed function $G$ and a polynomial $q_n$ of degree $n$; thus, disregarding scaling, for Hermite functions we have $G(x)=\mathrm{e}^{-x^2/2}$ and $q_n=H_n$. Another example of quasi-polynomials is the quasi-Hermite functions of Example 3.
Quasi-polynomials are of a very attractive form; hence, it is of interest to explore which orthogonal sets lend themselves to this representation. We somewhat extend the framework from orthogonality in $\mathrm{L}_2(\mathbb{R})$ to orthogonality in $\mathrm{L}_2(\Omega,\mathrm{d}\nu)$ for an arbitrary measure residing in a real interval $\Omega=[a,b]$; in other words,

$$\int_a^b\varphi_m(x)\varphi_n(x)\,\mathrm{d}\nu(x)=0,\qquad m\neq n,\quad m,n\in\mathbb{Z}_+,$$

and we assume that $\varphi_n(x)=G(x)q_n(x)$, $\deg q_n=n$, $n\in\mathbb{Z}_+$. We further assume that $G^{-1}\in\mathrm{L}_2(\Omega)$ and define $\mathrm{d}\eta(x)=G^2(x)\,\mathrm{d}\nu(x)$: the orthogonality conditions become

$$\int_a^bq_m(x)q_n(x)\,\mathrm{d}\eta(x)=0,\qquad m\neq n,\quad m,n\in\mathbb{Z}_+.$$

In other words, $\{q_n\}_{n\ge0}$ is an orthogonal polynomial system corresponding to the measure $\mathrm{d}\eta$. Being orthogonal, the $q_n$'s obey a three-term recurrence relation

$$q_{n+1}(x)=(\alpha_nx+\beta_n)\,q_n(x)-\gamma_nq_{n-1}(x),\qquad n\in\mathbb{Z}_+,\qquad(5.1)$$

where $\gamma_0=0$ and $\gamma_n>0$, $n\in\mathbb{N}$.

Theorem 10
The only system of quasi-polynomials that is orthogonal on a real interval and whose differentiation matrix is real, skew-symmetric, tridiagonal and irreducible is, up to rescaling, the Hermite system (1.5).
Proof Our first observation is that $\varphi_n=Gq_n$, where $q_n$ is an $n$th-degree polynomial, implies that $\{q_n\}_{n\in\mathbb{Z}_+}$ is an orthogonal polynomial system with respect to the measure $G^2(x)\,\mathrm{d}x$. Therefore, all zeros of $q_n$ are real and distinct.
Since we require a skew-symmetric, tridiagonal, irreducible differentiation matrix, we are within the framework of this paper and the functions $\varphi_n=Gq_n$ obey (2.1). Therefore, assuming $G$ is differentiable, $G'q_n+Gq_n'=-b_{n-1}Gq_{n-1}+b_nGq_{n+1}$ and, dividing by $G$, we deduce that

$$\frac{G'}{G}=\frac{-b_{n-1}q_{n-1}+b_nq_{n+1}-q_n'}{q_n}.\qquad(5.2)$$

We have on the right a rational function: a ratio of a polynomial of degree $n+1$ to a polynomial of degree $n$. The poles are distinct, because $q_n$ is an orthogonal polynomial, and we deduce that there exist constants $c_{n,1},c_{n,2},d_{n,1},d_{n,2},\ldots,d_{n,n}$ such that

$$\frac{G'(x)}{G(x)}=c_{n,1}+c_{n,2}x+\sum_{j=1}^{n}\frac{d_{n,j}}{x-\zeta_{n,j}},$$

where $\zeta_{n,1},\ldots,\zeta_{n,n}\in(a,b)$ are the zeros of $q_n$. Note, however, that the left-hand side of (5.2) is independent of $n$; this implies that $c_{n,1}=c_1$, $c_{n,2}=c_2$ and $d_{n,1}=\cdots=d_{n,n}=0$. In other words, $G'/G=c_1+c_2x$. The solution of this ODE is $G(x)=G(0)\,\mathrm{e}^{c_1x+c_2x^2/2}$ for some $G(0)\neq0$; therefore

$$q_n'(x)+(c_1+c_2x)\,q_n(x)=-b_{n-1}q_{n-1}(x)+b_nq_{n+1}(x).$$

This is the moment to use the recurrence relation (5.1) to replace $q_{n+1}$: we obtain

$$q_n'(x)+(c_1+c_2x)\,q_n(x)=-b_{n-1}q_{n-1}(x)+b_n(\alpha_nx+\beta_n)\,q_n(x)-b_n\gamma_nq_{n-1}(x).$$

We have polynomials of degree $n+1$ on both sides and, comparing the coefficient of $x^{n+1}$, we deduce that $c_2=b_n\alpha_n$; the equality reduces to

$$q_n'(x)+c_1q_n(x)=b_n\beta_nq_n(x)-(b_{n-1}+b_n\gamma_n)\,q_{n-1}(x).$$

We now have on the left and the right polynomials of degree $n$ and compare the coefficients of $x^n$ there; this yields $c_1=b_n\beta_n$, and we are left with

$$q_n'(x)=-(b_{n-1}+b_n\gamma_n)\,q_{n-1}(x).\qquad(5.3)$$

We next recall the classical theorem of Hahn [9], namely that the only orthogonal polynomial systems whose derivatives are also orthogonal are the Jacobi, Laguerre and Hermite polynomials. We need, however, more: to be consistent with (5.3), we want $q_n'$ to be a constant multiple of $q_{n-1}$, therefore orthogonal with respect to the same measure, and this is the case only once (up to a multiplicative constant) $q_n=H_n$. Then, the Hermite measure being determinate [3, p. 58], necessarily $G(x)=\mathrm{e}^{-x^2/2}$ for orthogonality, and the proof is complete.

Examples
In this section, we present a number of examples, grouped into two categories: orthogonality in the Paley–Wiener space PW_{[−1,1]}(R) and orthogonality in L_2(R). Not all these examples are completely worked out; they are often no more than pointers towards interesting instances of orthogonal systems, calling for further research.
A word about terminology: once a system of orthogonal functions originates in a named set of orthogonal polynomials, we call them transformed named functions. For instance, a system originating in the fictitious Bloggs polynomials would be called the transformed Bloggs functions.
Note that for α = 0 this is consistent with (4.10). Very lengthy algebra of little intrinsic interest yields an explicit form of ϕ_1, and this can be extended to other ϕ_n's with Algorithm II. Intriguingly, while ϕ_1 is a scalar multiple of J_3, consistently with the special case of Legendre polynomials, this neat representation is no longer true for ϕ_2 and beyond.
The calculation of ϕ_0 is long and complicated. In the most interesting special case α = 0, the outcome can be simplified: given a, b > −1, substitution in (6.1) and lengthy algebra result in a far simpler formula. Subsequent ϕ_n's (obtained by the brute-force Algorithm I) do not appear to possess any recognisable structure.
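For the Paley–Wiener examples, the brute-force generation behind Algorithm I is easy to sketch from the differentiation relation: since ϕ_0′ = b_0 ϕ_1, the next function follows from the seed by differentiation. The Python sketch below does this for the Legendre weight w ≡ 1/2 on [−1, 1]; the seed ϕ_0(x) = sin x/(√π x) and the orthonormal-Legendre recurrence coefficients b_n = (n+1)/√((2n+1)(2n+3)) are assumptions of this sketch (normalisation and sign conventions may differ from the paper's). The resulting ϕ_1 is proportional to the spherical Bessel function j_1.

```python
import math

def phi0(x):
    """Seed for the Legendre-weight example (assumed normalisation):
    scaled inverse Fourier transform of sqrt(1/2) restricted to [-1, 1]."""
    if x == 0.0:
        return 1.0 / math.sqrt(math.pi)
    return math.sin(x) / (math.sqrt(math.pi) * x)

def b(n):
    # Recurrence coefficients of the orthonormal Legendre polynomials
    # with respect to the probability measure d(xi)/2 on [-1, 1].
    return (n + 1) / math.sqrt((2 * n + 1) * (2 * n + 3))

def phi1_bruteforce(x, eps=1e-5):
    """phi_1 = phi_0' / b_0, the derivative taken by central differences."""
    return (phi0(x + eps) - phi0(x - eps)) / (2 * eps) / b(0)

def phi1_closed(x):
    """Closed form: a multiple of the spherical Bessel function j_1."""
    return math.sqrt(3.0 / math.pi) * (x * math.cos(x) - math.sin(x)) / x ** 2
```

The agreement of the two routes (up to the finite-difference error) illustrates how, given the seed and the b_n, each further ϕ_{n+1} = (ϕ_n′ + b_{n−1}ϕ_{n−1})/b_n can be generated in turn.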

Systems Dense in L_2(R)
While orthogonal systems which are dense in a Paley–Wiener space have, as far as we can see, limited applications, this is hardly the case with systems dense in L_2(R), which have immediate relevance to spectral methods along the lines of Sect. 1.

Generalised Hermite Polynomials
Hermite functions are a familiar tool and the most important example of an orthogonal system that shares all the welcome properties sought in this paper: a skew-symmetric, tridiagonal, irreducible differentiation matrix and density in L_2(R). It is interesting, though, to generalise them in line with the Szegő generalisation of Hermite polynomials [22, Problem 25].
We next compute the seed. Since √(1 + cosh πξ) = √2 cosh(πξ/2), the required Fourier transform follows from Olver et al. [17, Table 1.14(vii)] at once by trivial rescaling. Dropping the factor (2/π)^{1/2}, we thus obtain the first few ϕ_n, and so on. We can certainly discern some structure, although this issue is not pursued further in the current paper. Further structure is revealed in Fig. 2: each ϕ_n appears to have n real zeros, which interlace.
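If, as the half-angle identity suggests, the square root of the weight is proportional to sech(πξ/2) (an assumption of this sketch, since the weight's normalisation is not reproduced here), the seed rests on the classical Fourier pair ∫ sech(πξ/2) cos(xξ) dξ = 2 sech x, i.e. on the self-reciprocality of sech under the Fourier transform. A quick numerical confirmation:

```python
import math

def sech(t):
    return 1.0 / math.cosh(t)

def ft_sech(x, L=40.0, N=8000):
    """Trapezoidal approximation of the cosine transform of sech(pi*xi/2).
    The integrand decays like exp(-pi*|xi|/2), so truncation at |xi| = L
    is harmless and the trapezoidal rule converges very fast."""
    h = 2 * L / N
    total = 0.0
    for k in range(N + 1):
        xi = -L + k * h
        weight = 0.5 if k in (0, N) else 1.0
        total += weight * sech(math.pi * xi / 2) * math.cos(x * xi)
    return h * total
```

The computed transform matches 2 sech x to high accuracy, which is exactly the mechanism producing a hyperbolic-secant-type seed after rescaling.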
Although we do not pursue this theme further, transformed Carlitz functions might well be a good subject for further investigation. As things stand, we just record them as an example of an orthogonal system of an entirely different flavour from, say, the Hermite and transformed generalised Hermite functions.

Freud Polynomials
Given η > −1/2 and an even function Q ∈ C^∞(R) such that Q(x) ≥ 0 for |x| ≫ 1, orthogonal polynomials with respect to |x|^η e^{−Q(x)} dx are called Freud polynomials [15, 23]. Hermite polynomials form the simplest example; other familiar cases are Q(x) = x⁴ − tx² for some t ∈ R and Q(x) = |x| (whereby smoothness fails at the origin without any ill effects). The general rules governing the asymptotic behaviour of recurrence coefficients of Freud polynomials were a major open problem in the theory of orthogonal polynomials, solved in a celebrated paper by Fokas et al. [7]. (See also Deift et al. [5].) This has led to a burgeoning body of work combining the theory of orthogonal polynomials, the Riemann–Hilbert transform, Painlevé equations and random matrix theory. Yet, even in the simplest nontrivial case, dμ(x) = e^{−x⁴} dx, explicit expressions for the recurrence coefficients, the one item of information necessary for the formation of the differentiation matrix D, are unknown.
Yet, the sequence {γ_m}_{m∈Z₊} can in this case be generated recursively from the string relation (6.3), due to Shohat [19]. The string relation (6.3) was extended by Magnus [16] to the measure dμ(x) = e^{−x⁴+tx²} dx, namely

γ_n(γ_{n−1} + γ_n + γ_{n+1}) + 2tγ_n = n,  n ∈ N,    (6.4)

with an initial value expressed in terms of K_ν, the modified Bessel function of the second kind (also known as the Macdonald function). A further generalisation to dμ(x) = |x|^η e^{−x⁴+tx²} dx has been presented by Clarkson and Jordaan [4], who have also given explicit expressions for the recurrence coefficients in terms of Wronskians related to the Painlevé IV equation. At least in principle, we may use either (6.3) or (6.4) to calculate as many coefficients γ_n (therefore also b_n = √γ_{n+1}) as required. The other ingredient in the construction is ϕ_0, the (scaled) inverse Fourier transform of the square root of the weight function; for t = 0 it can be expressed in terms of ₀F₂ hypergeometric functions with argument proportional to x⁴/128, while for t ≠ 0 the explicit form of the seed ϕ_0 is unknown. In Fig. 3, we have plotted the first six functions ϕ_m of the above transformed Freud functions. Although they look quite complicated, being linear combinations with polynomial coefficients of ₀F₂ hypergeometric functions, they display fairly regular behaviour. It is too early to guess the features of this orthogonal system (in particular insofar as approximation theory is concerned) and whether it is of any importance in the context of spectral methods on the real line.
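To make the use of the string relation concrete, here is a sketch generating γ_n for dμ(x) = e^{−x⁴} dx. It adopts the commonly used normalisation 4γ_n(γ_{n−1} + γ_n + γ_{n+1}) = n together with the initial value γ_1 = Γ(3/4)/Γ(1/4), the ratio of the second to the zeroth moment of the weight; both the factor 4 and this initial value are assumptions of the sketch, since normalisations of (6.3) differ between references.

```python
import math

def freud_gammas(nmax):
    """Recurrence coefficients gamma_0, ..., gamma_nmax for the weight
    exp(-x^4), generated forward from the string relation
    4 * g_n * (g_{n-1} + g_n + g_{n+1}) = n   (assumed normalisation).

    Warning: the forward recursion is numerically unstable; rounding errors
    grow geometrically, so only moderate n are trustworthy in floating point."""
    g = [0.0, math.gamma(0.75) / math.gamma(0.25)]  # gamma_0 = 0, gamma_1 = m_2/m_0
    for n in range(1, nmax):
        g.append(n / (4.0 * g[n]) - g[n] - g[n - 1])
    return g
```

The entries b_n = √γ_{n+1} then furnish the differentiation matrix; the computed γ_n indeed settle quickly onto the Freud asymptote √(n/12). For larger n, a stable alternative (e.g. solving the string relation as a boundary-value problem) would be needed.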

Conclusions and Pointers to Future Research
In this paper, we have completely characterised orthogonal systems in L_2(R) whose differentiation matrix is real, skew-symmetric, tridiagonal and irreducible. In essence, to every symmetric probability measure dμ(ξ) = w(ξ) dξ there corresponds an essentially unique system {ϕ_n}_{n∈Z₊}, which is dense in L_2(R) if the support of dμ is the real line, and dense in an appropriate Paley–Wiener space determined by the support of dμ otherwise. We have also presented constructive algorithms for the practical computation of such systems, and a number of examples.
Future research in this area is likely to focus on three general themes.
• Firstly, generalising the basic idea underlying this paper: 'good' approximation of a differentiation matrix in the setting of spectral methods. Are there any other separable Hilbert spaces, besides L_2(R), that give rise to skew-symmetric, tridiagonal, irreducible differentiation matrices? How does one obtain such orthogonal systems? The question is of great relevance insofar as, say, L_2[−1, 1] and L_2[0, ∞) are concerned, where the Fourier transform-based tools of this paper are no longer applicable, at least not in a straightforward manner. Likewise, what about higher-order differentiation matrices, 'encoding' higher derivatives? In particular, approximating the second derivative with a negative semidefinite matrix is of major importance because of the ubiquity of the Laplace operator. Of course, the quindiagonal matrix D² describes the action of the second derivative in the basis {ϕ_n}_{n∈Z₊}, and as long as D is skew-symmetric, D² is negative semidefinite. However, can we find an orthogonal system with a tridiagonal, negative semidefinite, irreducible second-order differentiation matrix?
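The assertion about D² can be checked directly: for any real skew-symmetric D one has xᵀD²x = −‖Dx‖² ≤ 0, and squaring a tridiagonal matrix produces a quindiagonal one. A small self-contained sketch, with arbitrary made-up coefficients b_n in place of any particular orthogonal system:

```python
import random

def make_D(b):
    """Skew-symmetric, tridiagonal, irreducible differentiation matrix:
    D[n][n+1] = b[n], D[n+1][n] = -b[n], zero diagonal."""
    m = len(b) + 1
    D = [[0.0] * m for _ in range(m)]
    for n, bn in enumerate(b):
        D[n][n + 1] = bn
        D[n + 1][n] = -bn
    return D

def matmul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def matvec(A, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]

random.seed(0)
b = [random.uniform(0.5, 2.0) for _ in range(7)]      # made-up coefficients
D = make_D(b)
D2 = matmul(D, D)

# D^2 is quindiagonal: entries vanish outside the second off-diagonals.
m = len(D2)
bandwidth_ok = all(D2[i][j] == 0.0 for i in range(m) for j in range(m)
                   if abs(i - j) > 2)

# Quadratic form: x^T D^2 x equals -||Dx||^2, hence is non-positive.
x = [random.uniform(-1, 1) for _ in range(m)]
quad = sum(xi * yi for xi, yi in zip(x, matvec(D2, x)))
neg_norm = -sum(v * v for v in matvec(D, x))
```

This also makes the open question above precise: D² here is negative semidefinite but quindiagonal, whereas the question asks for a genuinely tridiagonal negative semidefinite second-order differentiation matrix.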
• Secondly, what are the features of orthogonal function systems which act on L_2(R) and possess a skew-symmetric, tridiagonal, irreducible differentiation matrix D? For example, where are their zeros? Brief examination of Figs. 1, 2 and 3 seems to indicate that transformed Chebyshev functions of the second kind and transformed Freud functions have an infinity of real zeros, while each Carlitz function ϕ_n has exactly n real zeros. Intriguingly, in all three cases the zeros of ϕ_n and of ϕ_{n+1} seem to interlace. A feature of central importance, once we wish to harness such a system in a spectral method, is its approximation power. How well do the ϕ_n's approximate an arbitrary L_2(R) function? Many dispersive equations of quantum mechanics have solutions which can be conveniently expressed (for simplicity, in one space dimension) in terms of wave packets e^{−α(x−x₀)²} cos ωx, where α > 0 and ω ≫ 1. How well does such a system approximate wave packets? Initial exploration indicates that there are substantial differences between different orthogonal systems insofar as the number of terms for a fixed error tolerance is concerned, once ω becomes very large and the wave packet oscillates rapidly.
• Finally, how do we implement this entire body of ideas in a practical spectral algorithm? Our ambition is an algorithm applicable to an arbitrary (in the first stage, linear) time-dependent partial differential equation which is stable by design and conserves the Euclidean norm whenever this norm is conserved by the original equation. Even disregarding the entire issue of time discretisation, practical implementation of the body of ideas in this paper requires much further research. How can we rapidly compute expansions of L_2(R) functions in the basis {ϕ_n}_{n∈Z₊}? How do we generate such a basis in a rapid and stable manner? We expect further papers to address these issues. Another issue is how to make the tridiagonal structure of D act fully to our benefit. For example, are there effective ways of approximating the product e^{cD}v, where c ∈ R and v ∈ ℓ_2, to high precision? More problems of this kind are likely to emerge once orthogonal systems, as described in this paper, are used as the kernel of a competitive spectral method.