Algorithm for rigorous integration of Delay Differential Equations and the computer-assisted proof of periodic orbits in the Mackey-Glass equation

We present an algorithm for the rigorous integration of Delay Differential Equations (DDEs) of the form $x'(t)=f(x(t-\tau),x(t))$. As an application, we give a computer-assisted proof of the existence of two attracting periodic orbits (before and after the first period-doubling bifurcation) in the Mackey-Glass equation.


Introduction
The goal of this paper is to present an algorithm for the rigorous integration of Delay Differential Equations (DDEs) of the form

$x'(t) = f(x(t-\tau), x(t)),$  (1)

where $0 < \tau \in \mathbb{R}$. Despite its apparent simplicity, Equation (1) can generate all kinds of dynamical behaviour: from simple stationary solutions to chaotic attractors. For example, this happens for the well-known Mackey-Glass equation

$x'(t) = \beta \dfrac{x(t-\tau)}{1 + x(t-\tau)^n} - \gamma x(t),$  (2)

for which numerical experiments show the existence of a series of period-doubling bifurcations which lead to the creation of an apparent chaotic attractor [17,16]. Later in the paper, we will apply our rigorous integrator to this equation. There are many important works that establish the existence and the shape of a (global) attractor under various assumptions on f in Equation (1). Much is known about systems of the form ẋ = −µx(t) + f(x(t − 1)) when f is strictly monotonic, either positive or negative [9]. Let us mention a few developments in this direction. Mallet-Paret and Sell used discrete Lyapunov functionals to prove a Poincaré-Bendixson type theorem for a special kind of monotone systems [19]. Krisztin, Walther and Wu have conducted a thorough study of systems having a monotone positive feedback, including studies on the conditions needed to obtain the shape of a global attractor; see [11] and references therein. In the case of a monotonic positive feedback f, and under some assumptions on the stationary solutions, Krisztin and Vas proved that there exist large-amplitude slowly oscillatory periodic solutions (LSOPs) which revolve around more than one stationary solution. Together with their unstable manifolds, connecting them with the classical spindle-like structure, they constitute the full global attractor of the system [10]. In a recent work, Vas showed that f may be chosen so that the structure of the global attractor is arbitrarily complicated (containing an arbitrary number of unstable LSOPs) [30].
Lani-Wayda and Walther were able to construct systems of the form ẋ = f (x(t − 1)) for which they proved the existence of a transversal homoclinic trajectory and of a hyperbolic set on which the dynamics is chaotic [13].
Srzednicki and Lani-Wayda proved, by the use of a generalized Lefschetz fixed point theorem, the existence of multiple periodic orbits and of chaos for some periodic, tooth-shaped (piecewise linear) f [12].
The results from [10,12,13,30], while impressive, are established for functions which are close to piecewise affine ones. The authors of these works construct equations in which an interesting behaviour appears; however, it is not clear how to apply their techniques to some well-known equations.
In recent years, many computer-assisted proofs of various dynamical properties for ordinary differential equations and (dissipative) partial differential equations have appeared, obtained by an application of arguments from the geometric theory of dynamical systems combined with rigorous integration; see for example [2,7,20,29,32,36] and references therein. By a computer-assisted proof we understand a computer program which rigorously checks the assumptions of abstract theorems. This paper is an attempt to extend this approach to the case of DDEs by creating a rigorous forward-in-time integration scheme for Equation (1). By rigorous integration we understand a computer procedure which produces rigorous bounds for the true solution. In the case of DDEs, the integrator should reflect the fact that, after an integration time longer than the delay τ, the solution becomes smoother, which gives the compactness of the evolution operator. Having such an integrator, one should be able to directly apply standard tools from dynamics such as Poincaré maps, various fixed point theorems, etc. In this paper, as an application, we present computer-assisted proofs of the existence of two stable periodic orbits for the Mackey-Glass equation; however, we do not prove that these orbits are attracting.
There are several papers that deal with computer-assisted proofs of periodic solutions to DDEs [8,14,33], but the approach used there is very different from our method. These works transform the question of the existence of periodic orbits into a boundary value problem (BVP), which is then solved by using the Newton-Kantorovich theorem [8,14] or the local Brouwer degree [33]. It is clear that rigorous integration may be used to obtain a more diverse spectrum of results. There are also several interesting results that apply rigorous numerical computations to solve problems for DDEs [3,4], but they do not rely on rigorous, forward-in-time integration of DDEs.
The rest of the paper is organized as follows. Section 2 describes the theory and algorithms for the integration of Equation (1). Section 3 defines the notion of the Poincaré map and discusses its computation using the rigorous integrator. Section 4 presents an application of the method to prove the existence of two stable periodic orbits in the Mackey-Glass equation (Equation (2)). Here, we investigate the cases n = 6 (before the first period-doubling bifurcation) and n = 8 (after the first period-doubling bifurcation). To the best of our knowledge, these are the first rigorous proofs of the existence of these orbits. The presented method has also been successfully used by the first author to prove the existence of multiple periodic orbits in some other nonlinear DDEs [25].

Notation
We use the following notation. For a function f : R → R, by f^(k) we denote the k-th derivative of f. By f^[k] we denote the term (1/k!)·f^(k). In the context of piecewise smooth maps, by f^(k)(t−) and f^(k)(t+) we denote the one-sided derivatives of f w.r.t. t.
For a given set A, by cl (A) and int (A) we denote the closure and the interior of A, respectively (in a given topology, e.g. the one defined by the norm in the considered Banach space).
For v ∈ R^n, by π_i v, i ∈ {1, 2, .., n}, we denote the projection of v onto the i-th coordinate. For vectors u, v ∈ R^n, by u • v we denote the standard scalar product u • v = Σ_i π_i u · π_i v. We will often use a symbol in square brackets, e.g. [r], to denote a set in R^m. Usually this will happen in formulas used in algorithms, when we would like to stress the fact that a given variable represents a set. If both variables r and [r] are used simultaneously, then usually r represents a value in [r]; however, this is not implied by default and it will always be stated explicitly. Please note that the notation [r] does not impose that the set [r] is of any particular shape, e.g. an interval box. We will always explicitly state if the set is an interval box.
For any set X by mid(X) we denote the midpoint of hull(X) and by diam(X) the diameter of hull(X).

Basic properties of solutions to DDEs
For the convenience of the reader, we recall (without proofs) several classical results for DDEs [5].
We define the semiflow ϕ associated with Equation (1) by ϕ(t, ψ) = x_t^ψ, where x^ψ is the solution of (1) with the initial function ψ, and a_ψ ∈ R₊ ∪ {∞} is maximal such that the solution exists for all t < a_ψ.
The smoothing of solutions gives rise to some interesting objects in DDEs [31]. Assume for a while that f ∈ C^∞. Then for any n ≥ 0 there exists a set (in fact a manifold) M_n ⊂ C^n such that M_n is forward invariant under ϕ.
It is easy to see that for n = 1 we have M₁ = {ψ ∈ C¹ : ψ'(0) = f(ψ(−τ), ψ(0))}, and the conditions for M_n with n > 1 can be obtained simply by differentiating both sides of (1).
We follow [31] and we call M n a C n solution manifold.

Rigorous integration of DDEs
This section is a reorganized excerpt from the PhD dissertation of the first author (Robert Szczelina).
A detailed analysis of the results of numerical experiments with the proposed methods, a more elaborate description of the algorithms, and detailed pseudo-codes of the routines can be found in the original dissertation [24].

Finite representation of "sufficiently smooth" functions
Here, we would like to present the basic building blocks used in the algorithm for the rigorous integration of Equation (1). The idea is to implement the Taylor method for Equation (1) based on a piecewise polynomial representation of the solutions plus a remainder term. We will work on an equally-spaced grid and we will fix the step size of the Taylor method to match the selected grid.

Remark 3
In this section, for the sake of simplicity of presentation, we assume that τ = 1. All computations can easily be redone for any delay τ. We also assume that the r.h.s. f of Equation (1) is "sufficiently smooth" for the various expressions to make sense. The class of f in (1) restricts the possible order of the Taylor method that can be used in our algorithms: if f is of class C^n, then we can use a Taylor method of order at most n. Therefore, throughout the paper it can be assumed that f ∈ C^∞. This is a reasonable assumption in the case of applications to computer-assisted proofs, where the r.h.s. of equations is usually presented as a composition of elementary functions. The Mackey-Glass equation (2) is a good example (away from x = −1).
We fix two integers n ≥ 0 and p > 0 and we set h = 1/p.
Definition 1 By C^n_p we denote the set of all functions g : [−1, 0] → R such that, for i ∈ {1, .., p}, on the interval I_i = [−i·h, −(i−1)·h) we have: g is of class C^n, with one-sided derivatives g^(k)(−i·h+) at the left endpoint, and g^(n+1) is continuous and bounded on the interior of I_i. From now on, we will abuse the notation and denote the right derivative g^(k)(−i·h+) by g^(k)(−i·h), unless explicitly stated otherwise. The same holds for g^[k](−i·h+). Under this notation, it is clear that we can represent g ∈ C^n_p by a piecewise Taylor expansion on each interval: $g(-i h + \varepsilon) = \sum_{k=0}^{n} g^{[k]}(-i h)\,\varepsilon^k + g^{[n+1]}(-i h + \xi(\varepsilon))\,\varepsilon^{n+1}$, with ξ(ε) ∈ [0, h].
In our approach, we store the piecewise Taylor expansion as a finite collection of coefficients g^[k](−i·h) and interval bounds on g^[n+1](·) over the whole grid intervals. Our algorithm for the rigorous integration of (1) will then produce rigorous bounds on the solutions to (1) for initial functions defined by such piecewise Taylor expansions.
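To make the stored data concrete, here is a minimal sketch of such a container in Python. The names (`PnRep`, `coeffs`, `remainder`) are ours, not taken from the paper's code, and plain `(lo, hi)` pairs stand in for rigorous outward-rounded intervals.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[float, float]  # a poor man's interval [lo, hi]; rigorous code
                                # would use outward-rounded interval arithmetic

@dataclass
class PnRep:
    """Data stored for a (p, n)-representation of g : [-1, 0] -> R, h = 1/p.

    coeffs[i][k] - interval bound on the k-th Taylor coefficient of g at the
                   i-th grid point, k = 0..n, i = 0..p-1,
    remainder[i] - interval bound on g^[n+1] over the whole i-th grid interval,
    at_zero      - interval bound on the value g(0).
    """
    p: int
    n: int
    coeffs: List[List[Interval]]
    remainder: List[Interval]
    at_zero: Interval

    def size(self) -> int:
        # number of stored interval quantities: p*(n+1) coefficients,
        # p remainders and one value at t = 0, i.e. p*(n+2) + 1
        return self.p * (self.n + 2) + 1

# a representation of the constant function g = 1 with p = 4, n = 2
p, n = 4, 2
one = PnRep(p, n,
            coeffs=[[(1.0, 1.0)] + [(0.0, 0.0)] * n for _ in range(p)],
            remainder=[(0.0, 0.0)] * p,
            at_zero=(1.0, 1.0))
```

The `size` method reproduces the count m = p·(n+2)+1 used later as the size of the (p, n)-representation.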
Please note that we use here the word functions instead of a single function: because of the bounds on g^[n+1] over whole intervals, the finite piecewise Taylor expansion in general describes an infinite set of functions. This motivates the following definitions.
Please note that the index function I should simply be understood as the ordering in which, during computations, we store the coefficients in a finite-dimensional vector; its precise definition is only important from the programming point of view, see Section 2.4 for a particular example of I. For the theoretical considerations in this paper we would like to use the following notation instead: we write ḡ_{i,[k]}. We call ḡ_{i,[k]} the (i, k)-th coefficient of the representation and ḡ_{i,[n+1]} the i-th remainder of the representation. The interval set B is called the remainder of the representation. We call the constant m = p·(n+2)+1 the size of the (p, n)-representation. When the parameters n and p are known from the context, we will omit them and call ḡ the minimal representation of g.
for the minimal (p, n)-representation f̄ of f. As the set [ḡ] contains the minimal representation of f for any f ∈ G, we will also say that [ḡ] is a (p, n)-representation of G. We will use G and [ḡ] interchangeably and write f ∈ [ḡ] for short, if the context is clear.
Please note that the minimal (p, n)-representation ḡ of g defines a (p, n)-f-set G ⊂ C^n_p which, in general, contains more functions than the sole function g. Also, in general, for any (p, n)-f-set G there are functions g ∈ G which are discontinuous at the grid points −i·h (see (5)). Sometimes we will need to assume higher regularity, therefore we define: For convenience we also set: It may also happen that Supp^(k)(G) = ∅ for nonempty G, even for k = 0. Now we present three simple facts about the convexity of the support sets. These properties will be important in the context of the computer-assisted proofs and in the application of Theorem 17 to (p, n)-f-sets in Section 4.
We omit the easy proof.
To extract information on g^[k](−i/p + ε) for any i and k, having only the information stored in a (p, n)-representation, we introduce the following definition.
Definition 5 Let a (p, n)-representation ḡ be given. We define: We will omit the subscript when it is clear from the context. The following lemma follows immediately from the Taylor formula, so we skip the proof.
Before proceeding to the presentation of the integration procedure, we would like to discuss the problem of obtaining the Taylor coefficients of a solution x to Equation (1) at a given time t (whenever they exist). From Equation (1) we have (we remind that, at grid points, by the derivative we mean the right derivative) $x'(t) = f(x(t-1), x(t))$. For example, in the case of k = 1 we obviously have $x^{(1)}(t) = f(x(t-1), x(t))$, and in the case k = 2, by applying the chain rule, we get: $x^{(2)}(t) = \partial_1 f(x(t-1), x(t))\, x^{(1)}(t-1) + \partial_2 f(x(t-1), x(t))\, x^{(1)}(t)$. If we define a function F^(1) : R⁴ → R as $F^{(1)}(u_0, u_1, v_0, v_1) = \partial_1 f(u_0, v_0)\, u_1 + \partial_2 f(u_0, v_0)\, v_1$, then we see that $x^{(2)}(t) = F^{(1)}(x(t-1), x^{(1)}(t-1), x(t), x^{(1)}(t))$. Now, by a recursive application of the chain rule, we can obtain a family of functions $F_f = \{ F^{(k)} : \mathbb{R}^{2(k+1)} \to \mathbb{R} \}_{k \in \mathbb{N}}$ such that: $x^{(k+1)}(t) = F^{(k)}\big(x(t-1), \ldots, x^{(k)}(t-1), x(t), \ldots, x^{(k)}(t)\big)$. By setting F^[k] to be the map which produces the k-th Taylor coefficient of t ↦ f(x(t−1), x(t)) out of the Taylor coefficients of its arguments, we can write a similar identity in terms of the Taylor coefficients x^[k]: $x^{[k+1]}(t) = \frac{1}{k+1}\, F^{[k]}\big(x^{[0]}(t-1), \ldots, x^{[k]}(t-1), x^{[0]}(t), \ldots, x^{[k]}(t)\big)$. (7) As we are using Taylor coefficients instead of derivatives to represent our (p, n)-f-sets, this notation is more suitable for describing computer algorithms. From now on, we will also slightly abuse the notation and write F^(k)(x(t−1), x(t)) and F^[k](x(t−1), x(t)) for the expressions above. This is reasonable since, for the function F : R → R defined by F(t) := f(x(t − 1), x(t)), we have F^(k)(t) = F^(k)(x(t−1), x(t)) and F^[k](t) = F^[k](x(t−1), x(t)). Remark 6 The task of obtaining the family F_f by directly and analytically applying the chain rule may seem quite tedious, especially if one is required to supply this family as implementations of computer procedures. It turns out that this is not the case for a wide class of functions. In fact, only the r.h.s. of Equation (1) needs to be implemented, and the derivatives may be obtained by means of automatic differentiation (AD) [26,21]. We use the Taylor coefficients x^[k] to follow the notation and implementation of AD in the CAPD library [1], which provides the rigorous interval arithmetic routines used in our programs.
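The recursion (7) can be illustrated with a plain floating-point sketch of AD on truncated Taylor series, here for a Mackey-Glass type right-hand side. The helper names and the function `mackey_glass_coeffs` are ours; a rigorous implementation (as in CAPD) would operate on intervals with outward rounding instead of floats, and the division requires 1 + u^n ≠ 0.

```python
def series_mul(a, b):
    """First len(a) coefficients of the product of two truncated Taylor series."""
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(len(a))]

def series_div(a, b):
    """Coefficients c with a = b * c as truncated series (requires b[0] != 0)."""
    c = []
    for k in range(len(a)):
        c.append((a[k] - sum(b[k - j] * c[j] for j in range(k))) / b[0])
    return c

def series_pow(a, npow):
    """Truncated series a raised to the integer power npow."""
    r = [1.0] + [0.0] * (len(a) - 1)
    for _ in range(npow):
        r = series_mul(r, a)
    return r

def mackey_glass_coeffs(u, x0, beta, gamma, npow, order):
    """Taylor coefficients x^[0..order] of the local solution of
       x'(t) = beta*u(t)/(1 + u(t)^npow) - gamma*x(t),
    where u holds the Taylor coefficients of the known delayed term x(t-1).
    Mirrors the recursion x^[k+1] = F^[k](...) / (k+1)."""
    x = [x0]
    for k in range(order):
        m = k + 1
        d = series_pow(u[:m], npow)
        d[0] += 1.0                                   # d = 1 + u^npow
        q = series_div([beta * c for c in u[:m]], d)  # q = beta*u / (1 + u^npow)
        x.append((q[k] - gamma * x[k]) / (k + 1))
    return x

# sanity check: with u = 0 the equation reduces to x' = -x, whose solution
# e^{-t} has Taylor coefficients 1, -1, 1/2, -1/6, ...
coeffs = mackey_glass_coeffs([0.0] * 4, 1.0, beta=2.0, gamma=1.0, npow=6, order=3)
```

Only the right-hand side had to be written down; the higher-order coefficients fall out of the series arithmetic, which is the point of Remark 6.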
One step of the integration with a fixed step size h = 1/p

We are given a (p, n)-f-set x̄_0 and the task is to obtain x̄_h, a (p, n)-f-set such that ϕ(h, x̄_0) ⊂ x̄_h. We will denote the procedure of computing x̄_h by I_h. First of all, we consider how x̄_0 and x̄_h = I_h(x̄_0) relate to each other. Their mutual alignment is shown in Figure 1.
We see that most of the data of x̄_h is already present in x̄_0 (shifted by one grid point), so it can simply be copied into the new representation; we call this procedure the Shift Part. The other coefficients need to be estimated using the dynamics generated by Equation (1). We call this procedure the Forward Part. It is divided into three subroutines:

Forward Part -Subroutine 1
This procedure is immediately obtained by a recursive application of Equation (7):

Forward Part -Subroutine 2
This subroutine can be derived from the Mean Value Theorem. We have, for ε < h and some 0 ≤ ξ ≤ ε, the expansion (8). Let us look at the two terms that appear on the r.h.s. of Equation (8). The question is: can we estimate them having only x̄_0 and the coefficients already computed in Subroutine 1? Let us discuss each of these terms separately.

Figure 1: A graphical presentation of the integrator scheme. We set n = 2 and p = 4. A (p, n)-representation is depicted as dots at grid points and rectangles stretching over the whole intervals between consecutive grid points. The dot is used to stress that the corresponding coefficient represents the value at a given grid point. Rectangles are used to stress that remainders are bounds for a derivative over whole intervals. Below the time line we have an initial (p, n)-representation. Above the time line we see the representation after one step of size h = 1/p. Black solid dots and grey rectangles represent the values we do not need to compute; this is the Shift Part. The Forward Part, i.e. the elements to be computed, is presented as empty dots and an empty rectangle. The doubly bordered dot represents the exact value of the solution at the time t = h = 1/p (in practical rigorous computations it is an interval bound on the value). The doubly bordered empty rectangle is an enclosure for the (n+1)-st derivative on the interval.
By Lemma 5, we have bounds for 0 ≤ k ≤ n + 1 on x over [0, h] in terms of x̄_0. Moreover, by Definition 2 we know the remainder terms, so those quantities can easily be obtained. The problem appears when it comes to x(ξ) and x^[k](ξ) for ξ ∈ (0, h). Assume for a moment that we have some a priori estimate for x([0, h]), i.e. a set Z ⊂ R such that x([0, h]) ⊂ Z. We call this set the rough enclosure of x on the interval [0, h]. Having a rough enclosure Z, we could apply Equation (7) (as in the case of Subroutine 1) to obtain estimates on x^[k]([0, h]) for k > 0. So the question is: how to find a candidate Z and prove that x([0, h]) ⊂ Z?
The following lemma gives a procedure to test the latter.
Lemma 7 Let Y ⊂ R be a closed interval and let x 0 be a function defined on [−1, 0].Assume that the following holds true: Then the solution x(t) of Equation (1) with the initial condition x 0 exists on the interval [0, h] and x ([0, h]) ⊂ Z.
Proof: We can treat Equation (1) on the interval [0, h] as a non-autonomous ODE in which the delayed term a(t) := x(t − 1) is a known function, since t − 1 ∈ [−1, −1 + h] for t ∈ [0, h]. Now the conclusion follows from the proof of the analogous theorem for ODEs, which can be found in [35].
Using Lemma 7, a heuristic iterative algorithm may be designed which starts by guessing an initial Y and then applies Equation (8) to obtain Z. In the case of failure of the inclusion, i.e. Z ⊄ Y, a bigger Y is taken in the next iteration. Please note that this iteration may never stop, or may produce an unacceptably big Y, especially when the step size h is large. Finding a rough enclosure is the only place in the algorithm of the integrator that can in fact fail to produce any estimates. In such a case we are not able to proceed with the integration and we signal an error. Now we can summarize the algorithm for Subroutine 2 as follows: Remark 8 Please note that the term a* is computed the same way as the other coefficients in Subroutine 1, and the rough enclosure does not influence this term. In fact, this is the (n+1)-st derivative of the flow w.r.t. time. It is possible to keep track of these coefficients during the integration, and after p steps (a full delay) they may be used to build a (p, n+1)-representation of the solutions; this is a direct reflection of Lemma 2. This fact is also important for the compactness of the evolution operator, an essential property that allows for the application of topological fixed point theorems in infinite-dimensional spaces.
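The heuristic just described can be sketched in a few lines. This is a deliberately simplified version: intervals are `(lo, hi)` pairs without outward rounding, the inflation strategy (doubling plus a small pad) is our own guess, and `f_bound` stands for an interval evaluation of the right-hand side over the known delayed segment and the candidate Y.

```python
def prod_0h(h, a):
    """Interval product [0, h] * a for an interval a = (lo, hi), h > 0."""
    return (min(0.0, h * a[0]), max(0.0, h * a[1]))

def inflate(a, factor, pad):
    """Grow an interval about its midpoint."""
    mid, rad = 0.5 * (a[0] + a[1]), 0.5 * (a[1] - a[0])
    rad = rad * factor + pad
    return (mid - rad, mid + rad)

def rough_enclosure(f_bound, x_at_0, delayed, h, max_iter=30):
    """Heuristic iteration behind Lemma 7: guess a closed interval Y,
    compute the candidate Z = x(0) + [0, h] * f(delayed, Y) and accept
    when Z is contained in Y; otherwise inflate Y and retry.  The loop
    may fail, in which case the integrator cannot proceed."""
    Y = inflate(x_at_0, 2.0, 0.1)
    for _ in range(max_iter):
        flo, fhi = prod_0h(h, f_bound(delayed, Y))
        Z = (x_at_0[0] + flo, x_at_0[1] + fhi)
        if Y[0] <= Z[0] and Z[1] <= Y[1]:   # Z subset of Y: enclosure verified
            return Z
        Y = inflate(Y, 2.0, 0.1)
    raise RuntimeError("rough enclosure not found; decrease the step size h")

# toy example: x'(t) = -x(t) (the delayed term is ignored), x(0) = 1, h = 0.1;
# the true solution e^{-t} stays inside [e^{-0.1}, 1] on [0, h]
Z = rough_enclosure(lambda u, Y: (-Y[1], -Y[0]), (1.0, 1.0), (1.0, 1.0), 0.1)
```

On this example the second iteration succeeds with Z ≈ [0.87, 1.0], which indeed contains the exact range [e^{-0.1}, 1].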

Forward Part -Subroutine 3
The last subroutine of the Forward Part can be obtained simply by using Definition 2 and Equation (5): Notice that the possible influence of the usually over-estimated rough enclosure Z is present only in the last term, of order h^{n+1}, so for small h (large enough p) it should not be a problem.

The integrator -altogether
Strictly speaking, the mapping I_h does not produce a (p, n)-f-set which exactly represents x̄_h = ϕ(h, x̄_0). Instead, it returns some bigger set [x_h] containing it. Of course, we are interested in obtaining a result as close as possible to the set of true solutions ϕ(h, x̄_0). So, for technical reasons which will become apparent in Section 2.3, we decompose I_h as follows. Let R(x) = r and put: The map Φ is called the Taylor part, while the map R is called the Remainder part. This decomposition is important for an efficient reduction of the negative effects of using interval arithmetic, primarily the wrapping effect, but also, to some extent, the dependency problem.

Reducing the wrapping effect
Representing objects in computations as interval sets has its drawbacks. Possibly the most severe of them are the phenomena called the wrapping effect and the dependency problem.
Their influence is so dominant that they are discussed in virtually every paper in the field of rigorous computations (see [35] and references therein). The dependency problem arises in interval arithmetic when two values theoretically representing the same (or dependent) quantities are combined. The most trivial example is the operation x − x, which is always 0, but not so for intervals: applying the operation to the interval x = [−1, 1] gives the interval [−2, 2], which contains 0 but is far bigger than we would like it to be. The wrapping effect arises when one intends to represent the result of some evaluation on sets as a simple interval set; Figure 2 illustrates this for the rotation of a square.

Figure 2: An illustration of the wrapping effect for the classical, idealized mathematical pendulum ODE ẍ = −x. The picture shows a set of solutions in the phase space (x, ẋ). The grey boxes present points of the initial box moved by the flow. The coloured boxes present the wrapping effect occurring at each step when we enclose the moving points in a product of intervals in the basic coordinate system. For example, the blue square on the left encloses the image of the first iteration. Its image is then presented with a blue rhombus, which is enclosed again by an orange square, and the process goes on.
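Both phenomena can be reproduced in a few lines. The following sketch (our own illustration, not code from the paper) shows the x − x dependency problem and mimics the wrapping of Figure 2 by rotating the square [−1, 1]² and re-enclosing it in an axis-aligned box after every step.

```python
import math

# Dependency problem: interval subtraction cannot "see" that both
# arguments are the same variable.
def i_sub(a, b):
    return (a[0] - b[1], a[1] - b[0])

x = (-1.0, 1.0)
dep = i_sub(x, x)     # (-2.0, 2.0), although the true range of x - x is {0}

# Wrapping effect: rotate a box and wrap the exact image (a rhombus)
# back into an axis-aligned box, as a naive interval method would do.
def rotate_and_wrap(box, theta):
    (x0, x1), (y0, y1) = box
    c, s = math.cos(theta), math.sin(theta)
    pts = [(c * u - s * v, s * u + c * v) for u in (x0, x1) for v in (y0, y1)]
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return ((min(xs), max(xs)), (min(ys), max(ys)))

box = ((-1.0, 1.0), (-1.0, 1.0))
for _ in range(4):
    box = rotate_and_wrap(box, math.pi / 4)
# four 45-degree rotations map the original square onto itself, yet the
# wrapped enclosure has grown by a factor of sqrt(2)^4 = 4 in each direction
```

Each wrapping step multiplies the box radius by √2, which is exactly the geometric growth visible in Figure 2.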
One of the most widely used and efficient methods for reducing the impact of the wrapping effect and the dependency problem was proposed by Lohner [15]. In the context of the iteration of maps and the integration of ODEs, he proposed to represent sets as parallelograms, i.e. interval sets in other coordinate systems. In the sequel we follow [35] and sketch the Lohner methods briefly.
By J we denote the computation of I_h(·) using a point-wise evaluation of the Taylor part, i.e.: Let us consider the iteration: By a simple argument based on the Mean Value Theorem [35], it can be shown that: We can reformulate the problem of computing [x_{k+1}] as the following system of equations: Now, a reduction of the wrapping effect can be obtained by choosing suitable representations of the sets [r_k] and a careful evaluation of Equation (16); the terminology used for this in [35] is the rearrangement of computations. We briefly discuss possible methods of handling Eq. (16).

Method 0 (Interval Set): represent [r_k] by an interval box and evaluate (16) directly; this is equivalent to computing I_h([x_{k+1}]) directly. This method is called the interval set method and is the least effective.

Method 1 (Parallelepiped): we require that [r_k] = B_k [r̃_k], with [r̃_k] being an interval box and B_k an invertible matrix. Then (16) becomes: Since it is difficult to obtain the exact matrix inverse in computer calculations, we use interval matrices. If the B_k's are well chosen, then the formula in brackets can be evaluated to produce a matrix close to the identity with a very small diameter, and thus the wrapping effect reduction is achieved. The Parallelepiped method is obtained when B_{k+1} is chosen as a floating-point approximation of the product [A_k]·B_k. This approach is of limited use because of the need to compute the inverse of a general matrix B_{k+1}, which may fail or produce unacceptable results if B_{k+1} is ill-conditioned.

Method 2 (Cuboid): this is a modification of Method 1. In this method we choose U ∈ [A_k][B_{k+1}] and compute a floating-point approximate QR decomposition U = Q·R, where Q is close to an orthogonal matrix. Next, we obtain the matrix [Q] by applying the interval (rigorous) Gram-Schmidt method to Q, so that [Q] contains an orthogonal matrix. This representation is used in our computations, as it proved to be the most efficient in numerical tests [24] and in other applications; see [35] and references therein. The original idea by Lohner is to separate the
errors introduced due to the large size of the initial data from the local errors introduced by the numerical method at every step. Namely, we set: where [r_k] is evaluated by any of the Methods 0-2. To reduce the possible wrapping effect in the product, Lohner proposed the following: Again, [r] is evaluated by any of the Methods 0-2. Please note that there is no need to invert a matrix in the doubleton representation when [r] is evaluated either by Method 0 or by Method 2, so this approach is suitable for the case where A_k may be close to singular. In the computer-assisted proofs presented in this paper we use Method 0 to represent [r], because it is less computationally expensive and, in our current setting, using the other methods has not improved the results. This is puzzling, as it contradicts our experience with ODEs, where Method 2 is preferable; it might be worthwhile to investigate this phenomenon in a later study.
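The bookkeeping of the doubleton representation x̂ + C·[r0] + [r] with Method 0 for [r] can be sketched as follows. This is a schematic, non-rigorous illustration of ours: point matrices stand in for the interval matrices [A_k], boxes are described only by their radii, and a rotation matrix plays the role of the derivative of one step, so the contrast with the naive wrapped box of the previous example is visible.

```python
import numpy as np

def doubleton_step(A, state, err_rad):
    """One step of the doubleton propagation  [x] = xhat + C*[r0] + [r]:
    the point part xhat and the matrix C are moved by A exactly, while the
    local error of the step (a centered box of radii err_rad) is absorbed
    into [r], which is kept as a plain interval box (Method 0)."""
    xhat, C, r_rad = state
    # |A| @ r_rad bounds the radii of the box enclosing A*[r]
    return (A @ xhat, A @ C, np.abs(A) @ r_rad + err_rad)

def hull_radii(state, r0_rad):
    """Radii of the interval hull of xhat + C*[r0] + [r]."""
    _, C, r_rad = state
    return np.abs(C) @ r0_rad + r_rad

theta = np.pi / 4
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

r0_rad = np.ones(2)                              # initial box [-1, 1]^2
state = (np.zeros(2), np.eye(2), np.zeros(2))    # xhat = 0, C = Id, [r] = 0
for _ in range(4):
    state = doubleton_step(A, state, err_rad=np.zeros(2))
# C is now A^4 = -Id, so the hull still has radii (1, 1); the naive wrapped
# box after the same four rotations has radii (4, 4) (cf. Figure 2)
```

The initial uncertainty is carried inside C·[r0] and never wrapped; only the local errors accumulate in the box [r].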

Optimization exploiting the block structure of DΦ(x)
In our setting,

Under this index function, [A_k] is of the form: where A₁₁ is an (n+1) × 1 matrix (a column vector), A₁₃ is (n+1) × (n+2), and Id is the identity matrix of size (p−1)·(n+1). Therefore, we use this index function to define blocks for all matrices and vectors appearing in the methods discussed in Section 2.3 since, with such a block representation, we can program the multiplication by a matrix or a vector so that all operations on zero blocks are avoided. We will refer to this as the optimized algorithm.
If C is an arbitrary matrix, then the cost of computing [A_k] · C by the standard matrix multiplication algorithm is of order O(n³·p³) in both scalar additions and multiplications (we remind that p and n are the parameters of the (p, n)-representation). In the optimized algorithm, the block structure and sparseness of [A_k] reduce the computational cost to O(n²·p²) scalar additions and O(n³) scalar multiplications.
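The principle behind the optimization can be shown on a simplified block layout. The actual [A_k] has a richer structure (the blocks A₁₁, A₁₃ and Id described above); the sketch below, with our own block shapes, only demonstrates that rows belonging to identity and zero blocks cost nothing: they are copied, not multiplied.

```python
import numpy as np

def block_mul(B, C):
    """Compute M @ C for M = [[B, 0], [0, Id]] without forming M:
    only the dense block B is multiplied, the identity rows are copied."""
    k = B.shape[1]
    return np.vstack([B @ C[:k, :], C[k:, :]])

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))     # dense upper-left block
C = rng.standard_normal((7, 5))     # arbitrary matrix to multiply

# reference: the full matrix built explicitly
M = np.block([[B, np.zeros((3, 4))],
              [np.zeros((4, 3)), np.eye(4)]])
```

For a matrix of size m with a dense block of size d, the multiplication cost drops from O(m²·cols) to O(d²·cols) scalar multiplications, which is the same effect the paper exploits for the sparse [A_k].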
The computation times for the computer-assisted proofs discussed in Section 4, on a 2.50 GHz processor (see Section 4 for a detailed specification), are presented in Table 1. We see that the optimized algorithm is much faster than the direct multiplication; the speed-up is especially evident for larger (p, n)-representations.
Table 1: A comparison of the execution times for computer assisted proofs for standard and optimized matrix multiplication on 2.5GHz processor (no multi-threading).

Definition of a Poincaré map
We begin with the definition of a (transversal) section for the semiflow ϕ associated with (1). First, we recall the ODE setting where, for a flow ϕ : R × R^m → R^m, a (local) transversal section S is usually defined as a (subset of a) smooth manifold M of codimension one satisfying the transversality condition: where T_x M denotes the tangent space at x. If M is a hyperplane {x : s • x = a} for some given normal vector s ∈ R^m, s ≠ 0, and a ∈ R, then condition (17) becomes We will use a similar approach in the context of the semiflow ϕ associated with (1). We restrict ourselves to linear sections and use the fact that, by Equation (4) and Lemma 2, d/dt ϕ(t, x_0)|_{t=t_0} = ẋ_{t_0} for any t_0 ≥ τ. Moreover, ẋ_{t_0} is of class C^{n−1} whenever t_0 ≥ n·τ. This observation will be crucial for the definition of a transversal section in the DDE context and, later, for the rigorous computation of Poincaré maps.
Definition 6 Let ϕ be the semiflow associated with system (1). Let n ∈ N and let s : C^n → R be a continuous affine mapping, i.e. s(x) = l(x) + a, where l is a bounded linear functional and a ∈ R.
We define a global C n -section as a hyperplane: Any convex and bounded subset S ⊂ S is called a local C n -section (or simply a section).
A section S is said to be transversal if there exists a convex open set W ⊂ C^n such that where

Under the same assumptions as in Theorem 10, in particular assuming t_1 ≥ τ·(n + 1) = ω, we define the transition map to the section S after (at least) time ω by for t_S defined as in Theorem 10. If V ⊂ S, then the map P^{≥ω} will be called the Poincaré return map on the section S after ω.
Finally, we state the last and most important theorem, which will allow us to apply topological fixed point theorems to P^{≥ω}.
Theorem 11 Consider a Poincaré map (after ω) P^{≥ω} : S ⊃ V → S for some section S under the same assumptions as in Theorem 10; in particular, assume Then the map P^{≥ω} is continuous and compact in

Proof:
By Theorem 10, P^{≥ω} is well defined for any x_0 ∈ V. The continuity follows immediately from the continuity of t_S (Theorem 10) and of ϕ (Lemma 1). Let D = P^{≥ω}(V). From our assumptions it follows that D is bounded in C^n. A known consequence of the Arzelà-Ascoli Theorem is that, if D ⊂ C^n is closed and bounded and there is an M such that, for every x ∈ D, x^{(n+1)} exists and sup_t |x^{(n+1)}(t)| ≤ M, then D is compact (in the C^n-norm). Therefore, to finish the proof it is enough to show that there is a uniform bound on (P^{≥ω}(x))^{(n+1)}. For this, it is sufficient to have a uniform bound on (ϕ(t, x))^{(n+1)} for t ∈ [ω, sup_{x∈V} t_S]. The existence of this bound follows from the boundedness of the derivatives up to order n and formula (7).
The restriction on the transition time may seem a bit unnatural, since each solution eventually becomes C^∞, as discussed in Remark 9. In fact, it should be possible to work directly with the solutions on the C^n solution manifold M_n (i.e. M_n ⊂ C^n and ϕ(t, M_n) ⊂ M_n for all t ≥ 0). When we restrict the semiflow to the solution manifold M_n, we do not need to demand that the transition time to the section be bigger than ω = (n+1)·τ. Instead, to obtain compactness, we only need to shift the set forward by one full delay. Therefore we obtain the following theoretical result: Theorem 12 Consider a Poincaré map (after τ) P^{≥τ} : S ⊃ V → S for some section S, where V ⊂ M_n. Let t_1 and t_2 be as in Theorem 10.
Assume that ϕ([0, t_2], V) is bounded in C^n. Then the map At the present stage of the development of our algorithm we do not have a constructive parametrisation of the manifold M_n; therefore we need to use the "long enough" integration time ω in the rigorous numerical computations.

Rigorous computation of Poincaré maps
The restriction of the integration procedure I_h (Section 2) to the fixed step size h = 1/p is a serious obstacle when we consider the computation of a Poincaré map P^{≥ω} : V → S. Obviously, if we assume for simplicity that ω = q·τ/p, q ∈ N, and t_S(V) ⊂ ω + [ε_1, ε_2] with 0 < ε_1 < ε_2 < τ/p, then we have to find a method to compute the image of the set after a small time ε ∈ [ε_1, ε_2]. The definition of the (p, n)-representation together with Equation (5) gives a hint how to compute the value of the function (and the derivatives up to order n) at some intermediate time 0 < ε < h. But we face yet another obstacle, as computing a (p, n)-f-set representing ϕ(ε, x_0) for all initial functions x_0 in some given (p, n)-f-set turns out to be impossible. This can be seen from the very same example as in Remark 9: there, x_ε would be only C⁰ at t = −ε. So if ε is not a multiple of h, then for any n > 0 there is no (p, n)-representation of x_ε, unless we restrict the computations to the set C^n_p ∩ C^{n+1} (or to the solution manifold M_n). This is again a reason for the appearance of the "long enough" integration time in Definition 7. This discussion motivates the following definition and lemma.
For a given x̄_0 we denote: The function I_ε will be called the shift by ε, or the ε-step integrator.
Remark 13 I_ε(x̄_0) is constructed in such a way that it contains all solutions to (4) with initial functions x_0 ∈ Supp(x̄_0) ∩ M_{n+1} after time t = ε.
for all x ∈ x0 .
Proof: Since q ≥ n·p, we have q·h ≥ n·τ, and the claim follows from Lemma 2, Lemma 5 and Definition 2. Now, the application of I_h and I_ε to compute P^{≥ω} is straightforward.
2. Find q ≥ (n + 1)·p and ε_1, ε_2 < h (for example by the binary search algorithm) such that the assumptions of Theorem 10 are guaranteed for the section S and the transition times involved.

3. By the assumptions and by Theorems 10 and 14, we know that P_{t_1}(x_0) ∈ W ∩ S for each x_0 ∈ V. Moreover, by Theorem 11, the map Please note that the operator I_{[ε_1,ε_2]} should be interpreted as the computation of the union of I_ε(·) over ε ∈ [ε_1, ε_2], or as any reasonable bound on this union. In our program we just evaluate Remark 15 (Controlling the wrapping effect for I_ε) We can use a decomposition of I_ε into Φ_ε and R_ε such that the Lohner algorithm can again be applied in the last step of the integration, as described in Section 2.3. We skip the details and refer to the source code documentation of the library available at [24]. Now the question arises: how to represent the section S in a manner suitable for the computation of P^{≥ω}?
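The binary search mentioned in step 2 can be sketched as a plain bisection that brackets the crossing time of the section functional. This is our own non-rigorous illustration: in the rigorous setting the scalar function below would be an interval bound on the section functional evaluated along the flow, and the returned bracket plays the role of [ε_1, ε_2].

```python
def crossing_time(s, t_lo, t_hi, tol=1e-9):
    """Bracket the zero of a continuous scalar function t -> s(t) by
    bisection, assuming a sign change on [t_lo, t_hi].  Returns an
    interval [a, b] of width at most tol containing the crossing time."""
    a, b, sa = t_lo, t_hi, s(t_lo)
    while b - a > tol:
        m = 0.5 * (a + b)
        if sa * s(m) <= 0:      # sign change in [a, m]: keep the left half
            b = m
        else:                   # otherwise the crossing lies in [m, b]
            a, sa = m, s(m)
    return a, b

# toy example: the "section functional" s(t) = t^2 - 2 crosses zero at sqrt(2)
a, b = crossing_time(lambda t: t * t - 2.0, 1.0, 2.0)
```

Each step halves the bracket, so roughly log2((t_hi − t_lo)/tol) evaluations suffice; tight brackets matter here because the diameter of [ε_1, ε_2] directly inflates the image computed by I_{[ε_1,ε_2]}.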

(p, n)-sections
Since we are using the (p, n)-representations to describe functions in C^n, it is advisable to define sections in such a way that it is easy to rigorously check whether x ∈ S for all functions represented by a given (p, n)-f-set. The straightforward way is to require l in the definition of S to depend only on the representation coefficients x_{i,[k]}.
We assume that at least one l_{i,k} is not equal to zero. Let a ∈ R be given. For x ∈ C^n we define a linear continuous map l:
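The sign test behind such a section condition can be sketched as follows (hypothetical names; a real computation would additionally use outward rounding). The functional l(x) = Σ_{i,k} l_{i,k}·x_{i,[k]} is evaluated in interval arithmetic over the coefficient boxes, so a single sign test covers every function in the (p, n)-f-set:

```python
# Sketch: deciding on which side of a (p,n)-section a whole box of
# representation coefficients lies.  l_coeffs and boxes are illustrative.

def section_value(l_coeffs, boxes):
    """l_coeffs: list of floats; boxes: list of (lo, hi) coefficient ranges.
    Returns (lo, hi) enclosing l(x) for all coefficient vectors x in the box."""
    lo = hi = 0.0
    for c, (blo, bhi) in zip(l_coeffs, boxes):
        p, q = c * blo, c * bhi
        lo += min(p, q)
        hi += max(p, q)
    return lo, hi

l_coeffs = [1.0, -2.0, 0.5]                   # illustrative l_{i,[k]}
boxes = [(0.9, 1.1), (0.1, 0.2), (0.0, 0.4)]  # coefficient enclosures
lo, hi = section_value(l_coeffs, boxes)
a = 2.0
on_one_side = hi < a or lo > a   # the whole set misses the section {l(x) = a}
```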

Choosing an optimal section
Numerical experiments with the rigorous integrator have shown that the choice of a good section is a key factor in obtaining bounds on images of the Poincaré map good enough to be used in the computer-assisted proofs reported in Section 4. The section has to be chosen so that the diameter of the bounds on the transition time tS is as small as possible, see Figure 3.
We will discuss the problem of choosing an optimal section in the ODE case. Later, in Section 4, we will apply a heuristic procedure based on this discussion to obtain a good candidate for an optimal section in the DDE setting.
Let us consider an ODE of the form x' = f(x), x ∈ R^n. (21) Let x0 lie on a periodic orbit of period T of the flow ϕ induced by (21). Then f(x0) is a right eigenvector of the monodromy matrix M = ∂ϕ/∂x(T, x0) with eigenvalue λ = 1. Let l be a row vector which is a left eigenvector of M corresponding to λ = 1. Let us assume that the periodic orbit passing through x0 is hyperbolic. In such a case, the left and right eigenvectors corresponding to the eigenvalue λ = 1 are uniquely defined up to a scalar multiplier, and we have l·f(x0) ≠ 0. For any given row vector v ∈ R^n, let us consider the section Sv = {x ∈ R^n : v·(x − x0) = 0}; hence v⊥ is the tangent space to the section Sv. Under the above assumptions, we have the following lemma.
where tSv is the transition time to the section Sv, defined in some neighborhood of x0. Moreover, (23) holds iff v = αl for some α ≠ 0.

Proof: The transition time to the section Sv is defined by the following implicit equation: From this, we immediately obtain (22).
The second assertion is obtained as follows. First, assume (23). We have Therefore v is proportional to l.
The other direction of the second assertion is obvious.
In simple words, Lemma 16 states that choosing the section normal to be the left eigenvector of the monodromy matrix ∂ϕ/∂x(T, x0) gives a section such that the return time to this section is constant in the first-order approximation.
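A minimal numerical sketch of this recipe, with a toy 2×2 stand-in for the monodromy matrix (an assumption for illustration; in the paper the matrix comes from the finite-dimensional approximation of the Poincaré map). Power iteration on v → vM works here because λ = 1 is the dominant eigenvalue of the toy matrix:

```python
# Sketch: extracting the left eigenvector l of a monodromy matrix M for
# the eigenvalue 1.  Pure stdlib; a left eigenvector of M satisfies lM = l.

def matvec_left(v, M):
    """Row-vector product v M."""
    n = len(M)
    return [sum(v[i] * M[i][j] for i in range(n)) for j in range(n)]

def left_eigvec(M, iters=200):
    """Power iteration with v -> v M; converges to the left eigenvector
    of the dominant eigenvalue of M (here lambda = 1)."""
    v = [1.0] * len(M)
    for _ in range(iters):
        w = matvec_left(v, M)
        s = max(abs(x) for x in w)
        v = [x / s for x in w]
    return v

# toy monodromy matrix with eigenvalues 1 and 0.1; its left eigenvector
# for lambda = 1 is (1, 0)
M = [[1.0, 0.0],
     [0.9, 0.1]]
l = left_eigvec(M)
residual = max(abs(a - b) for a, b in zip(matvec_left(l, M), l))
```

The resulting l is then used as the section normal, as in Lemma 16.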
4 The existence of periodic orbits in the Mackey-Glass equation

The Mackey-Glass system (2) is one of the best known delay differential equations. The original work of Mackey and Glass [17] attracted wide attention, being cited by papers from a broad spectrum of topics: from theoretical mathematical works to neural networks and electrical engineering. Numerical experiments show that, as either the parameter τ [17] or n [16] is increased, the system undergoes a series of period-doubling bifurcations which lead to the creation of an apparent chaotic attractor.
In this section, we present computer-assisted proofs of the existence of attracting periodic orbits in the Mackey-Glass system (2). We use the classical parameter values τ = 2, β = 2 and γ = 1, and we investigate the existence of periodic orbits for n = 6 (before the first period doubling) and n = 8 (after the first period doubling) [16]. We would like to stress that we are not proving that these orbits are attracting; this would require C^1-estimates for the Poincaré map defined by (2).
We will now briefly describe how each of the above steps was implemented. In this description, we refer to non-rigorous computations, that is, to the algorithms Îh, defined as in Section 2, and Îε, defined as in Section 3, but with the remainder terms depending on the rough enclosure ignored and explicitly set to 0. Using the non-rigorous integrators Îh and Îε, we construct a finite-dimensional semiflow φ that approximates ϕ by:

Now, the procedure for finding good initial conditions can be described as follows.

Step 1. Since we are looking for an attracting orbit, we start by non-rigorously integrating forward in time an initial function x0 ≡ 1.1 for some arbitrary long time Titer, until we see that xTiter approaches the apparently stable periodic orbit. Then, we refine xTiter by the Newton algorithm applied to x → P≥ω(x) − x, where the map P≥ω is the non-rigorous version of P≥ω, defined as the first return map of the semiflow φ to the simple section S = {x | x(0) = xTiter(0)}. The output of this step is a numerical candidate for the periodic solution x̄0, given by its (p, n)-representation x̄0, such that x̄0 and P≥ω(x̄0) are close.
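The refinement in Step 1 can be sketched with a scalar stand-in for the non-rigorous Poincaré map (P below is a toy contraction, not the Mackey-Glass return map; the finite-difference step size h is an illustrative choice):

```python
# Sketch of Newton's method applied to g(x) = P(x) - x, with the
# derivative of g approximated by centered finite differences.

import math

def P(x):
    # stand-in for the non-rigorous Poincare map (a contraction)
    return math.cos(x)

def refine_fixed_point(x, steps=20, h=1e-7):
    """Newton iteration for a fixed point of P, starting from x."""
    for _ in range(steps):
        g = P(x) - x
        dg = (P(x + h) - P(x - h)) / (2 * h) - 1.0   # g'(x)
        x -= g / dg
        if abs(g) < 1e-13:
            break
    return x

# start near the attractor, as after the long non-rigorous integration
x_star = refine_fixed_point(1.1)
# x_star satisfies P(x_star) ~ x_star
```

In the paper the same Newton step is performed on the full (p, n)-representation, with the Jacobian of the numerical Poincaré map in place of the scalar difference quotient.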
Step 2. This is an essential step, as numerical experiments with the rigorous integrator have shown that the choice of a good section is the key factor in obtaining tight bounds on the image of the Poincaré map. We use the observation from Section 3.4 and find the left eigenvector l of the matrix ∂φ/∂x(T, x̄0) corresponding to the eigenvalue 1, where T is the apparent period of the approximate periodic orbit for the non-rigorous semiflow φ.
Please note that l may be considered as a (p, n)-representation with the remainder part set to 0; therefore we can define a (p, n)-section by where the dot product is computed using the coordinates of the (p, n)-representation, i.e. in the vector space R^m, where m = (n + 2)·p + 1 is the size of a (p, n)-representation.
Step 3. Having a good candidate for the section S (defined by (24)), we need to introduce coordinates on it. For this, we create the following matrix: Now, let C denote the matrix obtained by orthonormalization of the columns of A. Please note that the matrix C acts as the identity on the variables corresponding to the remainder terms. This follows from the fact that l_{i,[n+1]} ≡ 0. It is easy to see that all (p, n)-representations that lie on the section S are given by y → x̄0 + Cy, for all y such that π1 y = 0. Now, on the section S, using the coordinates defined by the matrix C, we define a candidate set [V], in the form of a (p, n)-f-set, in the following manner. Let [r] ⊂ R^{m−p} (these coordinates correspond to the variables x_{i,[k]} for k ≤ n) and [B] ⊂ R^p (these are the bounds for x_{i,[n+1]}, the remainders) be two interval boxes centered at 0 such that diam(π1 [r]) = 0. We put [r0] := [r] × [B] and we define the (p, n)-f-set [V] by: The diameters of πi [r0] for i ≥ 2 are selected experimentally to follow an exponential law in the parameter k (i.e. diam(π_{I(i,k)} [r0]) ≈ a^k for 1 ≤ i ≤ n), as the periodic solutions to Equation (2) are at least of class C^∞ and, if x(t) > −1 for all t, then they should be analytic [18,22]. The remainder [B] is chosen initially such that diam([B]) ≫ diam([r]). Therefore, the initial selection of [r0] may not be good enough to satisfy the assumptions of Theorem 17 right away. As the dynamics of the system is strongly contracting, we hope to obtain a good initial condition by the following iteration. We start with [V]0 = [V] and compute successive images until the required inclusion is eventually met at some i_stop. Then the initial set for the computer-assisted proof is [Ṽ] = [V]_{i_stop}. Both initial sets used in the computer-assisted proofs in this paper were generated by this procedure (see the source codes).
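The spirit of this refinement loop can be sketched on a one-dimensional toy problem (the map, the sampling-based image bound and the inflation constant are illustrative assumptions; in the paper the images are computed by the rigorous integrator):

```python
# Sketch: push a candidate box through a contracting map, inflating the
# image slightly each time, until the image falls inside the box -- the
# inclusion needed for the Schauder fixed point argument.

def image_box(f, lo, hi, samples=64):
    """Crude enclosure of f([lo, hi]) by dense sampling plus a safety
    margin; a stand-in for a rigorous interval enclosure."""
    vals = [f(lo + (hi - lo) * k / samples) for k in range(samples + 1)]
    pad = 1e-6
    return min(vals) - pad, max(vals) + pad

def refine(f, lo, hi, delta=1e-4, max_iter=50):
    for _ in range(max_iter):
        ilo, ihi = image_box(f, lo, hi)
        if lo <= ilo and ihi <= hi:          # inclusion achieved: f(V) in V
            return lo, hi, True
        lo, hi = ilo - delta, ihi + delta    # inflated image as next candidate
    return lo, hi, False

f = lambda x: 0.5 * x + 0.2                  # toy contraction, fixed point 0.4
lo, hi, ok = refine(f, 0.5, 0.6)
# ok is True and [lo, hi] is a small box around 0.4 mapped into itself
```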
Observe that we are not very careful in the choice of coordinates on the section: we simply choose some basis orthonormal to the normal vector l of the section hyperplane. A definitely better choice would be to use approximate eigenvectors of the Poincaré map, but in the case of strongly attracting periodic orbits it is enough to choose a good section. Observe also that an orthonormal matrix is easy to invert rigorously, which is an important step in the comparison of the initial set and its image under the Poincaré map.

In this section we present two theorems about the existence of periodic orbits in the Mackey-Glass equation. As they depend heavily on the estimates obtained from rigorous numerical computations, we would like to first discuss the textual presentation of the numbers used in this section and how they are related to the input/output values used in the rigorous computations.
In the rigorous numerics we use intervals whose endpoints are representable computer numbers. The representable numbers are implemented as the binary32 or binary64 data types defined in the IEEE Standard for Floating-Point Arithmetic (IEEE 754) [28], so that they are stored (roughly speaking) as s · m · 2^e, where s is the sign bit, m is the mantissa and e is the exponent. Such a representation means that most numbers with a finite representation in the decimal base are not representable (e.g. the number 0.3333). In this paper, for better readability, we use the decimal representation of numbers with fixed precision (usually 4 decimal places), and we have written the computer programs to handle those values rigorously. For example, if we write in the text that a = 0.3333, then we put the following rigorous operation in the code: That is, all numbers presented here in theorems and/or proofs should be regarded by the reader as the real, rigorous values, even if they are not representable in the sense of the IEEE 754 standard.
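A sketch of this idea in Python (using one-ulp outward widening via `math.nextafter`; the paper's C++ code controls the rounding direction instead, but the effect is the same: the true decimal value is certainly inside the machine interval):

```python
# Sketch: turning a decimal constant like a = 0.3333 into a rigorous
# machine interval.  0.3333 has no exact binary representation, so we
# take the nearest double and widen it by one ulp in each direction.

import math
from fractions import Fraction

def decimal_interval(s: str):
    x = float(s)                                  # nearest double to s
    return math.nextafter(x, -math.inf), math.nextafter(x, math.inf)

lo, hi = decimal_interval("0.3333")
exact = Fraction(3333, 10000)                     # the true decimal value
contains = Fraction(lo) <= exact <= Fraction(hi)  # containment is exact
```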
In the proofs we refer to the computer programs mg_stable_n6 and mg_stable_n8. Their source codes, together with instructions on the compilation process, can be downloaded from [23]. The codes were tested on a laptop with an Intel® Core™ i7-2860QM CPU (2.50 GHz) and 16 GB RAM, under a 64-bit Linux operating system (Ubuntu 12.04 LTS), with the C/C++ compiler gcc version 4.6.3.

Case n = 6
Our first result is for the periodic orbit for the parameter value before the first period doubling bifurcation.
For n = 6, numerical experiments clearly show that the minimal period of the periodic orbit is around 5.58. In our proof, however, due to the problem with the loss of regularity at the grid points, and thus the need to use a "long enough" transition time, we consider the second return to the section.
Numerical experiments indicate that the orbit is attracting, with the most significant eigenvalues of the map P≥ω (again, this is the second return to the Poincaré section) estimated to be:

Re λ   -0.0437  -0.0437   0.0030   0.0030  -0.0028   0.0019  -0.0003  -0.0003   0.0005
Im λ    0.0793  -0.0793   0.0097  -0.0097   0        0       0.0018  -0.0018   0
|λ|     0.0905   0.0905   0.0102   0.0102   0.0028   0.0019   0.0019   0.0019   0.0005

Therefore, the contraction appears to be quite strong, so the choice of good coordinates on the section appears not to be crucial.
We obtained the following theorem.
Theorem 18 There exists a T-periodic solution x, with period T ∈ [10.9671, 10.9673], to Equation (2) for the parameters β = 2, γ = 1, τ = 2 and n = 6. Moreover, for x̄ defined by

Proof: Verification of the assumptions of the Schauder theorem is done with computer assistance in the program mg_stable_n6. It uses the (p, n)-representation of the phase space with p = 32 and n = 4. The initial (p, n)-f-set [x0] is provided directly in the source code; it was selected by the procedure described in Section 4.2. For the map P≥ω, ω = (n + 1)·τ = 10, which represents the second return to the section S, we obtained: which guarantees the C^{n+1}-regularity of the solutions and the compactness of the map P≥ω (in the C^k norm for k ≤ n). The inclusion condition P≥ω([x0]) ⊂ [x0] of the Schauder fixed point theorem is checked rigorously; see the output of the program for details. Together, these two facts guarantee the non-emptiness of Supp^{(n+1)}([x0]). The transversality is guaranteed by l(ẋ) ≥ 0.2828 for all x ∈ C^{n+1} ∩ P([x0]). The distance in the C^0 norm is rigorously estimated as ‖x − x̄‖_{C^0} ≤ 0.01902867681.
Similarly, we have verified the other norms; see the output of the program.
The execution of the program realizing this proof took around 12 seconds on a 2.50 GHz machine. The diameter of the estimate of the period T (also of the last step [ε1, ε2]) obtained from the computer-assisted proof is close to 1.15 · 10^−4.
A graphical representation of the estimates obtained in the proof can be found in Figure 4.

Case n = 8
For n = 8 we consider the periodic orbit after the first period doubling. This time the period of the orbit is long enough to overcome the initial loss of regularity, so we consider the first-return Poincaré map. Numerical computations show that the orbit is attracting, with the 10 most significant eigenvalues of the map P≥ω estimated to be:

The proof follows the same lines as that of Theorem 18 (except that this time we consider the first return to the section); therefore we just list the parameters of the proof: ‖x − x̄‖_{C^0} ≤ 0.01138319492 < 0.012.
The diameter of the estimate of the period T (also of the last step [ε1, ε2]) obtained from the computer-assisted proof is close to 3.899 · 10^−6. A graphical representation of the estimates obtained in the proof can be found in Figure 5.
The execution time was around 12 minutes. This increase, when compared to the case n = 6, is due to the much larger representation size, which affects the complexity of the matrix and automatic differentiation algorithms that we use.

Outlook and future directions
The results presented in this work might be improved in several ways:

• An extension of the integration algorithm to systems of delay equations in R^k for k > 1. This is rather straightforward and does not require any new ideas;
• A different representation of function sets. Currently, we use piecewise Taylor expansions, but other approaches, like Chebyshev polynomials, might be better, as they may produce better approximations on longer intervals;
• Avoiding the loss of regularity at the beginning of the integration, which imposes the requirement that the transition time to the section tS be "long enough". The complete solution would be to confine the initial condition to the invariant set M^n ⊂ C^n. We are currently working on this matter.

Another goal would be to apply the integrator to prove the existence of hyperbolic periodic orbits with one or more unstable directions, for example to establish the existence of LSOPs [10] in some general smooth DDEs, or of unstable periodic solutions to the Mackey-Glass equation. Good theorems, suitable for that task, already exist; see [37] for the analogous question in the dissipative PDEs setting.
The ultimate goal is to establish tools to prove chaotic dynamics in general DDEs, such as Mackey-Glass equation.
and for a binary operation ∘ : R × R → R we define A ∘ B = {a ∘ b : a ∈ A, b ∈ B} and a ∘ A = A ∘ a = {a} ∘ A. Analogously, for g : R → R and a set A ⊂ R we define g(A) = {g(a) | a ∈ A}.
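For interval arguments these set operations reduce to endpoint computations, as the following sketch shows (rounding control is omitted here; the endpoint rule is exact for +, − and ·, while a general g additionally needs monotonicity or a dedicated enclosure):

```python
# Minimal illustration of A o B = {a o b : a in A, b in B} for closed
# intervals, represented as (lo, hi) pairs.

def interval_op(A, B, op):
    """Enclosure of A o B via the four endpoint combinations."""
    vals = [op(a, b) for a in A for b in B]
    return min(vals), max(vals)

add = interval_op((1.0, 2.0), (-1.0, 3.0), lambda a, b: a + b)   # (0.0, 5.0)
mul = interval_op((1.0, 2.0), (-1.0, 3.0), lambda a, b: a * b)   # (-2.0, 6.0)
```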

Figure 3: Left: an illustration of the problem of a big difference in the transition time tS for a poorly chosen section S. If tS ∈ q·τ/p + [ε1, ε2] and |ε1 − ε2| is large, then I[ε1,ε2] produces estimates on solutions distant from the section (blue rectangles), so the interval enclosure W of all solutions tends to be very large (green rectangle). Right: if the section is chosen carefully, then all the solutions obtained from I[ε1,ε2] are close to the section, so the set W is small.

Figure 5: Top: the approximate function x̄ (blue) and the estimates on the value of the true solution obtained from the computer-assisted proof (red). Bottom: the solution plotted as the parametric curve r(t) = (x(t), x(t − τ)).
where m is the size of the (p, n)-representation (m = p·(n + 2) + 1); therefore, if we decide to represent the error part [r] in the doubleton by an interval box (Lohner Method 0), then the matrix multiplications involving the matrix [Ak] take most of the execution time of one integration step in the Lohner algorithm, especially for large n and/or p. From Equations (9-12), we see that DΦ(x) has a nice block structure and contains a large number of zero entries, i.e. it is a sparse matrix. This structure is well visible when we use the following index function I: