Reducing the O(3) model as an effective field theory

We consider the O(3) or CP(1) nonlinear sigma model as an effective field theory in a derivative expansion, with the most general Lagrangian that obeys O(3), parity and Lorentz symmetry. We work out the complete list of possible operators (terms) in the Lagrangian and eliminate as many as possible using integrations by parts. We further show that, at the four-derivative level, the theory avoids the Ostrogradsky instability, because the dependence on the d'Alembertian operator (the so-called box) can be eliminated by a field redefinition. Going to six-derivative order in the derivative expansion, we show that this can no longer be done, unless we are willing to sacrifice Lorentz invariance. By doing so, we can eliminate all dependence on double time derivatives and hence the Ostrogradsky instability or ghost. However, we unveil a remaining dynamical instability that takes the form of either a spiral instability or a runaway instability, and we estimate the critical field norm at which the instability sets in.


Introduction and summary
Effective field theories (EFTs) are very useful tools in classical and quantum field theory for capturing the most important effects of the full theory in terms of low-energy variables (fields), valid up to the scale of the lightest field that has been integrated out; see e.g. ref. [1] for a review. Nowadays, the EFT approach is also commonly used for parametrizing our ignorance of new physics beyond the standard model, and in that context it is called the standard model effective field theory (SMEFT) [2].
A more traditional low-energy EFT is the chiral Lagrangian as the low-energy theory of the strong interactions, which is a step more radical than the SMEFT, since the original fields, viz. quark and gluon fields, of quantum chromodynamics (QCD) are absent in the chiral Lagrangian theory and the lightest fields are the Nambu-Goldstone bosons of chiral symmetry breaking, namely the pions [3][4][5]. In chiral perturbation theory, the pions and other vector mesons are coupled to a nucleon field and the expansion is made order-by-order in the so-called chiral order, which counts the number of derivatives or powers of quark masses, where a quark mass counts as two derivatives.
The leading-order term in the chiral Lagrangian, in the mesonic sector, is the kinetic term of the pions, which is written as a kinetic term of an SU(2)-valued field containing three pions and an auxiliary field, traditionally called σ. The latter auxiliary field allows for a nonlinear constraint on the SU(2)-valued field of the chiral Lagrangian, such that its determinant is manifestly equal to unity. This constraint in turn induces a nonvanishing curvature for the metric on the target space of the pions. For historic reasons, and due to the choice of σ as the symbol for the auxiliary field, this kind of theory has since been called a nonlinear sigma model (NLσM).
Due to a low-energy theorem established by Weinberg [6], the effective field theory can be considered a meaningful theory, despite the fact that it is nonrenormalizable; the systematic scheme usually utilized works by fixing a chiral order, $D$, to which calculations are carried out, and only considering $L \leq D/2 - 1$ loops in the theory. A problem, however, naturally occurs at higher orders in a derivative expansion, namely that there may be more than one derivative acting on the same field. The simplest example with Lorentz invariance is the square of the d'Alembertian operator on a field, $(\Box\phi)^2$, whose Hamiltonian, by the theorem of Ostrogradsky [7] (see also ref. [8]), must have a linear dependence on at least one conjugate momentum, since the Lagrangian contains $(\partial_t^2\phi)^2$. This readily implies that the energy is unbounded from below and from above and in turn that the theory is plagued by at least one ghost field. The necessary assumption for the Ostrogradsky theorem to hold is that the Lagrangian is nondegenerate in the double time derivative, i.e. $\partial^2 L/\partial(\partial_t^2\phi)^2 \neq 0$. It is, however, well known that a fundamental theory can be absolutely sane, whereas upon writing down its effective low-energy theory, for example by integrating out a massive field, the resulting EFT turns out to be either nonlocal or in general plagued by the Ostrogradsky ghost, see e.g. ref. [9, sec. II.B]. It can further be argued that the Ostrogradsky ghost is naturally suppressed by the EFT energy scale, which is the energy scale of the lightest massive particle that has been integrated out, and hence can only be dynamically excited at energies of order of the EFT scale, which is by definition the energy scale where the EFT breaks down [9]. In case the theory at hand is in a certain class of asymptotically free theories, the ghost has further been argued to have an effective mass that runs to infinity in the ultraviolet limit [10].
It is thus clear that the Ostrogradsky ghost, in a physically sane theory, should be considered as an artifact of the EFT and not a physically viable excitation (solution). Since the ghost by its nature comes with a kinetic term of the wrong sign, it furthermore implies the loss of unitarity [8], which is unfortunate for a quantum theory.
A well-known condition on a theory with higher derivatives is to fine tune the coefficients of the higher-order derivative terms in such a way that the Euler-Lagrange equation of motion is of second order. This is indeed the way the Ostrogradsky ghost or instability [...] intact O(3) symmetry, parity symmetry and Lorentz (Poincaré) symmetry, and only relax the latter symmetry requirement at the end. The O(3) NLσM is equivalent to the $\mathbb{CP}^1$ NLσM, by changing coordinates from a real vector field $n : \mathbb{R}^{d+1} \to \mathbb{R}^3$ to a complex field $z : \mathbb{R}^{d+1} \to \mathbb{C}$, which is the ordinary Riemann sphere coordinate. We will consider the model order-by-order in a derivative expansion, where each order is suppressed by further powers of a single intrinsic mass scale of the NLσM, taken as the scale up to which the EFT is trustable.
Our main motivation is two-fold. We would like to see whether it is possible to get rid of the Ostrogradsky instability in the O(3) NLσM by simplifying and using field redefinitions, since the Ostrogradsky ghost and corresponding instability is non-physical and should be considered an artifact of the low-energy effective field theory. This is known not to be possible in other theories, like the chiral Lagrangian, but the only glimmer of hope here could be due to the integrability of the O(3) NLσM. The other motivation is to find a minimal formulation of the model, with an explicit and natural basis of operators describing the model. It is important in this regard that we clearly define what symmetries, continuous and discrete, are imposed on the theory for the result, since changing this assumption will allow further (or fewer) operators in the theory.
First we contemplate how to construct the most general Lagrangian of the O(3) NLσM with unbroken O(3), parity and Lorentz symmetry. We establish in Lemma 1 that with the given symmetry assumptions, the number of derivatives must be even and hence so must the suppressing powers of the EFT scale, $\Lambda$. At the leading order in the derivative expansion, we establish the unique Lagrangian, up to an overall constant, in Theorem 2, Lemma 4 and Corollary 5.
At the next-to-leading order in the derivative expansion, which we denote $\Lambda^{-2}$, we exhaust the possibilities of terms or operators compatible with the symmetry requirements and show in Theorem 6 that the most general O(3) NLσM to this order is box-free or d'Alembertian-free, viz. it contains no more than a single derivative operator acting on a field. This means that the most general theory, to this order, is described by a second-order Euler-Lagrange equation of motion and is free from the Ostrogradsky instability or ghost. We utilize integrations by parts and field redefinitions to establish this result.
At the next-to-next-to-leading order in the derivative expansion, which we denote $\Lambda^{-4}$, we first write down the complete list of terms or operators to this order, namely containing six derivative operators and any number of fields (which can easily be shown to be 6, 4 or 2). We then establish in Lemma 9 that the list of operators can be reduced by integrations by parts to a representative set of 10 operators. We take into account the repercussions of the field redefinition used at order $\Lambda^{-2}$ to eliminate boxes or d'Alembertian operators, which generates a flurry of terms at the subsequent order, i.e. $\Lambda^{-4}$. Although we aim at removing all terms with more than one derivative acting on a field, we find that this does not seem possible. Therefore, we propose in Proposition 11 a field redefinition that simplifies the most general theory as much as possible, in the sense that only 4 operators with more than one derivative acting on a field remain, and all the terms induced by the field redefinition at order $\Lambda^{-2}$ have been eliminated. The theory still contains terms quadratic in double time derivatives acting on a field, but we show with Lemma 12 that they can be eliminated with a further field redefinition, leaving the theory with only linear dependence on double time derivatives on a field.
[...] [31], sixth [32], eighth [33] and even higher order [34] in derivatives. The analysis there is somewhat different, due to the phenomenological theory (chiral perturbation theory) being based on SU(2) × SU(2) symmetry and the fact that external gauge and scalar fields are included in the expansion, complicating the analysis.
The linear dependence on double time derivatives, if the term has a constant prefactor, is naturally canceled by the Legendre transform in going to the Hamiltonian. In fact, it is often considered a sufficient condition for a higher-derivative theory to avoid the Ostrogradsky instability or ghost [35][36][37][38][39][40]. This works out for a single real field, or for a real field theory where the prefactor of the double time derivative contains no dependence on single time derivatives of other fields; this is the content of Lemma 14. However, in the present theory the conjugate momentum picks up new dependence on the double time derivative of the complex conjugated field at the linear level, which does not cancel out. The reason is that the complex conjugate of the field is another field; hence the assumption of the Lemma fails and an Ostrogradsky-like instability is not avoided, as pointed out in Corollary 15.
As a last resort, we choose to relax the assumption of intact Lorentz invariance, and make a frame-dependent low-energy EFT suitable for the rest frame, which could be considered reasonable for a gapped theory at low energies. This is the proposal in Proposition 16, where a suitable field redefinition is found that eliminates all dependence on double time derivatives in the theory and hence the Ostrogradsky instability can manifestly be avoided. The reason why these linear terms in double time derivatives cannot be eliminated by field redefinitions when insisting on manifestly intact Lorentz symmetry, like the other problematic terms, is an incompatibility of the form of said terms with the metric on $\mathbb{CP}^1$.
Although we have avoided the Ostrogradsky instability at order Λ −4 in the derivative expansion of the theory, at the cost of sacrificing Lorentz invariance, we calculate the corresponding Hamiltonian of the theory in Lemma 20 and prove in Theorem 21 that the theory still suffers from either a spiral instability or a traditional runaway instability, that is turned on if the norm of the field reaches a critical value. Of course, this instability is simply due to the EFT breaking down at the scale Λ, but the mathematical nature of the spiral instability is different from the instability of the EFT breaking down at the previous (Λ −2 ) order in the derivative expansion.
The obvious extensions of our work that one could contemplate are the extensions from O(3) to O(N) or to $\mathbb{CP}^{N-1}$, which we will leave as future work. In this direction, it may be interesting to see if it is easier (or harder) to eliminate higher-order derivative operators in the $\mathbb{CP}^{N-1}$ case, compared with the O(N) case, since the former enjoys a complex structure and as a target space manifold is Kähler. This work has repeatedly utilized integration by parts and discarded total derivatives, which is sensible on infinite flat Minkowski space $\mathbb{R}^{d+1}$, but it makes the analysis unsuitable if one wishes to apply the theory to condensed matter physics, e.g. anti-ferromagnetism. In such a case, one needs to keep track of every boundary term and analyze them one-by-one. We leave such a possibility for future studies.

Acknowledgments
We thank Lorenzo Bartolini, Johan Bijnens and Chris Halcrow for useful discussions.

The sigma model
The O(3) or $\mathbb{CP}^1$ NLσM is given by the Lagrangian (2.1), where $n = (n_0, n_1, n_2) : \mathbb{R}^{d+1} \to S^2$ is a unit-length 3-vector of real scalar fields and $\lambda$ is a Lagrange multiplier imposing the unit-length or NLσM constraint. The spacetime index $\mu = 0, 1, \ldots, d$ runs over time and $d$-dimensional space and we are using the convention in which the flat Minkowski metric has the mostly-positive signature.
In the corresponding equation of motion, $\Box = \partial_\mu\partial^\mu$ is the d'Alembertian operator and, by using the NLσM constraint, $\lambda$ is given by $n\cdot\Box n$.
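As a minimal sketch of how the equation of motion and the multiplier arise, assume the Lagrangian takes the form below. The normalization and the sign convention of the multiplier term are assumptions, chosen so that the result agrees with the statement that $\lambda$ is given by $n\cdot\Box n$; they are not taken from the text.

```latex
% Assumed form of the Lagrangian (normalization and multiplier sign are assumptions)
\begin{align}
\mathcal{L} &= -\tfrac{1}{2}\,\partial_\mu n\cdot\partial^\mu n
             + \tfrac{1}{2}\,\lambda\,(1 - n\cdot n), \\
% Varying n (discarding a total derivative) and varying \lambda:
0 &= \Box n - \lambda\, n, \qquad n\cdot n = 1, \\
% Dotting the equation of motion with n and using n.n = 1:
\lambda &= n\cdot\Box n, \\
% Eliminating the multiplier:
0 &= \Box n - (n\cdot\Box n)\, n .
\end{align}
```

With the opposite sign convention for the multiplier term, $\lambda$ would change sign, while the final equation for $n$ would be unchanged.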
The coordinates $n$ are called homogeneous coordinates. Another set of variables natural for the O(N) model are the inhomogeneous coordinates $m = (m_1, m_2) : \mathbb{R}^{d+1} \to \mathbb{R}^2$, which is an unconstrained 2-vector (or $(N-1)$-vector in the O(N) case), related to $n$ by stereographic projection, for which the Lagrangian (2.1) takes the form (2.4). This generalizes to O(N) for any $N \geq 2$.
The above Lagrangian is exactly that of the $\mathbb{CP}^1$ model (for $N = 3$) via the identification $\mathbb{C} \ni z = m_1 + i m_2$, for which the Lagrangian takes the form (2.5); $z$ is the Riemann (2-)sphere coordinate, and the map from $n$ to $z$ is the stereographic projection from $S^2$ to $\mathbb{C}$.
The advantage of working with inhomogeneous coordinates is that no constraints need to be taken into account, which will be crucial in the further analysis in this paper. Obviously, the Lagrangians (2.4) and (2.5) are identical, but offer straightforward generalizations to two different theories, namely the O(N) NLσM and the $\mathbb{CP}^{N-1}$ NLσM, which is the reason for spelling them out here.
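The change of coordinates can be sketched explicitly as follows. The ordering of the components of $n$, the choice of projection pole and the overall normalization $-\tfrac12\,\partial_\mu n\cdot\partial^\mu n$ of the kinetic term are assumptions made for illustration.

```latex
% Assumed stereographic parametrization of the unit sphere (illustrative conventions)
\begin{align}
n &= \frac{1}{1+m\cdot m}\,\bigl(1-m\cdot m,\; 2m_1,\; 2m_2\bigr),
  \qquad n\cdot n = 1, \\
% Induced (round) metric on the target space:
\partial_\mu n\cdot\partial_\nu n
  &= \frac{4\,\partial_\mu m\cdot\partial_\nu m}{(1+m\cdot m)^2}, \\
% Kinetic term in inhomogeneous and in complex coordinates, z = m_1 + i m_2:
-\tfrac12\,\partial_\mu n\cdot\partial^\mu n
  &= -\frac{2\,\partial_\mu m\cdot\partial^\mu m}{(1+m\cdot m)^2}
   = -\frac{2\,\partial_\mu z\,\partial^\mu\bar z}{(1+|z|^2)^2} .
\end{align}
```

The last equality only uses $\partial_\mu m\cdot\partial^\mu m = \partial_\mu z\,\partial^\mu\bar z$ and $m\cdot m = |z|^2$, which is why the two Lagrangians coincide for $N = 3$.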

Group invariants and building blocks
Lemma 1 Suppose n is a unit-length 3-vector scalar field with mass-dimension 0, then all O(3) and Lorentz invariant terms must have an even number of derivatives.
Proof : All derivatives contracted by the inverse Minkowski metric come in pairs, so they must contribute an even number to the total number of derivatives, say $2n$. In order to have an odd number of derivatives, we need a Lorentz invariant tensor structure with an odd number of spacetime indices. The only one is the Levi-Civita tensor, so one might naively think that a term contracting it with three derivatives of the field is possible in $d = 2$. However, geometrically there are only two independent tangent vectors on $S^2$ and therefore if two derivatives correspond to orthogonal tangent vectors on $S^2$, the third must be a linear combination of the latter two and hence an anti-symmetric contraction must vanish. We have now ruled out any possible terms for $d = 2$. For $d > 2$, there are no anti-symmetric group structures of O(3) to contract with that can give a nonvanishing term. For $d = 0$, there are no nonvanishing terms with one derivative. This completes the proof.

The dimension-2 operators
Theorem 2 Suppose $n$ is a unit-length 3-vector scalar field with mass-dimension 0, then $\partial_\mu n\cdot\partial^\mu n$ is the unique dimension 2 term with O(3) and Lorentz invariance in $d + 1 \neq 2$ spacetime dimensions, whereas in $d + 1 = 2$ spacetime dimensions there is additionally the topological term $\epsilon^{\mu\nu}\, n\cdot\partial_\mu n\times\partial_\nu n$. (2.11)
Proof : Using Lemma 1, no terms with a single derivative exist. The O(3)-invariant tensor structures are $\delta^{ab}$ and $\epsilon^{abc}$ and two derivatives must act on the term constructed with either tensor, since $\delta^{ab} n^a\partial_\mu n^b$ (2.12) and $\epsilon^{abc} n^a n^b\partial_\mu n^c$ (2.13) vanish identically. The former is the NLσM constraint and vanishes because $\partial_\mu(n\cdot n) = 2\,n\cdot\partial_\mu n = 0$, where $n\cdot n = 1$, and the latter vanishes due to anti-symmetry of the O(3)-invariant tensor and symmetry of the two $n$'s. Hence, no composites can be made out of two O(3)-invariants with a single spacetime derivative each.
Considering first the tensor $\delta^{ab}$, we can perform an integration by parts which vanishes identically due to the NLσM constraint (2.12). Since the left-hand side of eq. (2.15) vanishes, $n\cdot\Box n$ is equal to $-\partial_\mu n\cdot\partial^\mu n$ and there are no other ways of acting with two derivatives on the O(3) invariant $\delta^{ab}$.
Considering now the second tensor $\epsilon^{abc}$, a single derivative vanishes due to eq. (2.13) and three anti-symmetrized derivatives vanish as well, see the proof of Lemma 1. The unique nonvanishing contraction with O(3) and Lorentz invariance, with dimension less than 4, is thus $\epsilon^{\mu\nu}\epsilon^{abc} n^a\partial_\mu n^b\partial_\nu n^c = \epsilon^{\mu\nu}\, n\cdot\partial_\mu n\times\partial_\nu n$, (2.16) and hence the theorem follows.
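The identities used in this proof can be collected in a short worked form. The equations below follow directly from $n\cdot n = 1$; their correspondence with the numbered equations (2.12), (2.13) and (2.15) is an assumption.

```latex
\begin{align}
% First derivative of the constraint n.n = 1:
0 &= \tfrac12\,\partial_\mu (n\cdot n) = n\cdot\partial_\mu n, \\
% Antisymmetric epsilon^{abc} contracted with the symmetric product n^a n^b:
0 &= \epsilon^{abc}\, n^a n^b\, \partial_\mu n^c, \\
% Divergence of the first identity:
0 &= \partial^\mu (n\cdot\partial_\mu n)
   = \partial_\mu n\cdot\partial^\mu n + n\cdot\Box n
   \;\;\Rightarrow\;\;
   n\cdot\Box n = -\,\partial_\mu n\cdot\partial^\mu n .
\end{align}
```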

Remark 3
The topological term (2.16) once integrated over spacetime, measures the topological degree of the mapping from (one-point compactified) (1 + 1)-dimensional spacetime (more precisely an Euclidean two-dimensional space after a Wick rotation) to S 2 . More commonly used is the Lorentz vector whose time-component, when integrated over space, represents the static topological degree from (one-point compactified) 2-dimensional space to S 2 . In some literature, this is known as the baby-Skyrme charge and in other it is known as the vortex charge or vorticity.

Lemma 4 The O(3) and Lorentz invariant term
does not contribute to the equations of motion. $F = \bar F$ is a real function of the field $z$ and its complex conjugate.
Proof : The Lagrangian has the corresponding equation of motion for $\bar z$: which is identically zero due to the antisymmetry of $\epsilon^{\mu\nu}$. Absorbing the metric factor into $F(z,\bar z)$ completes the proof.
Corollary 5 Suppose $n$ is a unit-length 3-vector scalar field with mass-dimension 0, then $\partial_\mu n\cdot\partial^\mu n$ is the unique dimension 2 term with O(3) and Lorentz invariance that contributes to the equation of motion.
Proof : Using Theorem 2, the only other term than $\partial_\mu n\cdot\partial^\mu n$ is given by $\epsilon^{\mu\nu}\, n\cdot\partial_\mu n\times\partial_\nu n$ and is a Lorentz invariant only in $d + 1 = 2$ dimensions. The latter term is shown not to contribute to the equations of motion, by setting $F = 1$ (or any constant) in Lemma 4.
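The cancellation in this proof can be sketched as follows, assuming the term has the form $i\,\epsilon^{\mu\nu} F(z,\bar z)\,\partial_\mu z\,\partial_\nu\bar z$ with any metric factor already absorbed into $F$. The explicit factor of $i$ (inserted to make the term real) and the subscript notation $F_z = \partial F/\partial z$, $z_\mu = \partial_\mu z$ are assumptions of this sketch.

```latex
\begin{align}
\mathcal{L}_F &= i\,\epsilon^{\mu\nu} F(z,\bar z)\, z_\mu \bar z_\nu, \\
% Euler-Lagrange expression for \bar z:
\frac{\partial\mathcal{L}_F}{\partial\bar z}
  - \partial_\nu\frac{\partial\mathcal{L}_F}{\partial \bar z_\nu}
  &= i\,\epsilon^{\mu\nu}\Bigl[F_{\bar z}\, z_\mu \bar z_\nu
     - \bigl(F_z\, z_\nu + F_{\bar z}\, \bar z_\nu\bigr) z_\mu
     - F\, z_{\mu\nu}\Bigr] \\
  &= -\,i\,\epsilon^{\mu\nu}\Bigl[F_z\, z_\mu z_\nu + F\, z_{\mu\nu}\Bigr] = 0 .
\end{align}
```

The two $F_{\bar z}$ terms cancel against each other, and the remaining terms are symmetric in $\mu\nu$ and therefore vanish when contracted with $\epsilon^{\mu\nu}$.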

The baby-Skyrme term
The baby-Skyrme term is a special dimension-4 derivative operator, given by By using the identity it is easy to see that only the first term is nonvanishing for the Skyrme term (2.22) due to the NLσM-constraint (2.12) and hence the Skyrme term obeys the identity It is thus clear that the Skyrme term equivalently can be viewed as the special combination of the two latter terms with relative coefficient 1 and −1, respectively. Although the last term in the second line above has a negative coefficient, it is clear from the left-hand side that this specific combination is positive semi-definite on a Euclidean manifold.
It will prove convenient to rewrite the Skyrme term in inhomogeneous coordinates:
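A sketch of the identity and of the rewriting in complex coordinates is given below for the unnormalized combination $(\partial_\mu n\times\partial_\nu n)\cdot(\partial^\mu n\times\partial^\nu n)$. The overall normalization of the Skyrme term (2.22) is not fixed here, and the stereographic conventions ($z = m_1 + i m_2$ on the unit sphere, with $z_\mu = \partial_\mu z$) are assumptions.

```latex
\begin{align}
% Lagrange identity for cross products:
(a\times b)\cdot(c\times d) &= (a\cdot c)(b\cdot d) - (a\cdot d)(b\cdot c), \\
(\partial_\mu n\times\partial_\nu n)\cdot(\partial^\mu n\times\partial^\nu n)
  &= (\partial_\mu n\cdot\partial^\mu n)^2
   - (\partial_\mu n\cdot\partial_\nu n)(\partial^\mu n\cdot\partial^\nu n), \\
% Strain tensor in the complex coordinate:
\partial_\mu n\cdot\partial_\nu n
  &= \frac{2\,(z_\mu \bar z_\nu + z_\nu \bar z_\mu)}{(1+|z|^2)^2}, \\
(\partial_\mu n\times\partial_\nu n)\cdot(\partial^\mu n\times\partial^\nu n)
  &= \frac{8\,\bigl[(z_\mu\bar z^\mu)^2 - (z_\mu z^\mu)(\bar z_\nu\bar z^\nu)\bigr]}{(1+|z|^2)^4} .
\end{align}
```

The second line makes the relative coefficients $1$ and $-1$ of the two quartic operators explicit.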

Further identities
Let us note that acting with derivatives on the nonlinear sigma model constraint, n · n = 1, yields where the first equation is the already well-used identity (2.12) and the following equations are generalizations thereof. The largest tensor structure we will be needing here is a spin-3 tensor (three free Lorentz indices), since the Lorentz-invariant operator that can be built from such a tensor must have mass dimension 6 or higher and that will be the largest mass dimension we will consider in this paper.
Contracting two free Lorentz indices with the inverse Minkowski metric in eqs. (2.27) and (2.28) yields whereas the latter identity is obtained by acting with the d'Alembertian on eq. (2.27).
The similar constraint with two contracted pairs of Lorentz indices is given by $4\,\partial_\mu n\cdot\partial^\mu\Box n + 2\,\partial_\mu\partial_\nu n\cdot\partial^\mu\partial^\nu n + \Box n\cdot\Box n + n\cdot\Box^2 n = 0$, (2.32) which is obtained by acting with the d'Alembertian on eq. (2.29). Two Lorentz contractions yield a minimum mass dimension-4 operator and the dimension-5 operator would have one free Lorentz index, which for making at most dimension-6 operators would vanish, since it can only be contracted with the term of eq. (2.26).
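The step of acting with the d'Alembertian can be written out explicitly. The starting point below is the once-contracted constraint, which is assumed to be the content of eq. (2.29).

```latex
\begin{align}
% Once-contracted constraint (assumed to be eq. (2.29)):
0 &= \partial_\mu n\cdot\partial^\mu n + n\cdot\Box n, \\
% Acting with \Box on each term, using the Leibniz rule:
\Box(\partial_\mu n\cdot\partial^\mu n)
  &= 2\,\partial_\mu n\cdot\partial^\mu\Box n
   + 2\,\partial_\mu\partial_\nu n\cdot\partial^\mu\partial^\nu n, \\
\Box(n\cdot\Box n)
  &= \Box n\cdot\Box n + 2\,\partial_\mu n\cdot\partial^\mu\Box n + n\cdot\Box^2 n, \\
% Sum of the two lines:
0 &= 4\,\partial_\mu n\cdot\partial^\mu\Box n
   + 2\,\partial_\mu\partial_\nu n\cdot\partial^\mu\partial^\nu n
   + \Box n\cdot\Box n + n\cdot\Box^2 n .
\end{align}
```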
The final constraint contains three pairs of Lorentz-contracted indices

The sigma model as an EFT
We will now consider the NLσM as an EFT and make a derivative expansion, but conserving O(3) and Lorentz invariance. The program we will employ here is to use field redefinitions to eliminate as many derivative operators as possible. In order to set up the derivative expansion, we will assume that the NLσM only has one scale $\Lambda$ and hence up to an irrelevant overall constant factor, and using Corollary 5, we have for the theory in $(d + 1)$-dimensional spacetime and the order $\Lambda^{-2}$ term represents fourth-order derivative terms, which thus have to be suppressed by a factor of $\Lambda^2$. Since we have assumed that there is only one energy scale in the theory, it must be proportional to $\Lambda$. $\lambda$ is a Lagrange multiplier enforcing the unit length of the 3-vector field $n$ and we have conveniently written the model in both the vector ($n$) and stereographic ($z$) coordinates.

Order Λ −2
We will now write down the most general O(3) NLσM to order Λ −2 in the EFT expansion. Recalling Lemma 1, there are no O(3) and Lorentz invariant terms with an odd number of derivatives and hence no terms of order Λ −1 , Λ −3 , · · · and so on.
The complete list of dimension 4 operators is given by eqs. (3.2)-(3.9). The identity (2.32) can be used to eliminate one of the four dimension-4 operators (3.4)-(3.7). However, integration by parts (IBP) and discarding total derivatives relates the operators (3.4)-(3.7), which can easily be shown, yielding $(\partial_\mu n\cdot\partial^\mu n)(\partial_\nu n\cdot\partial^\nu n)$, (3.13) $(\partial_\mu n\cdot\partial_\nu n)(\partial^\mu n\cdot\partial^\nu n)$, (3.14) $\Box n\cdot\Box n$, (3.15) where the operator (3.8) has been eliminated from the above list due to its relation to the existing operators (3.2) and (3.3). First using the identity we have and then using the identity (2.23), we have The operator (3.9), on the other hand, has been eliminated due to the following considerations. Consider a field configuration with a non-negative instanton density $\epsilon^{\mu\nu}\, n\cdot\partial_\mu n\times\partial_\nu n \geq 0$; for such a configuration, the operator (3.9) is bounded from below and perturbations do not destabilize the theory. Now consider a parity transformation of the latter configuration $(x^0, x^1) \to (x^0, -x^1)$; this turns instantons into anti-instantons and anti-instantons into instantons. Now the operator (3.9), however, is non-positive and is bounded from above. This has the dire consequence that perturbations of such a field will destabilize the theory via a runaway instability. We have thus justified the assumption that the theory should better be parity invariant, hence eliminating the operator (3.9).
One may wonder if the constraint (2.32) can be used to eliminate the last operator containing boxes, i.e. (3.15). Consider thus If we integrate the first three operators by parts, we get
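One way to see what happens to the constraint (2.32) under integration by parts is sketched below, where $\simeq$ denotes equality up to total derivatives (a notation introduced here for illustration). Every term is reduced to $\Box n\cdot\Box n$, which is a choice made in this sketch and may differ in detail from the steps taken in the text.

```latex
\begin{align}
\partial_\mu n\cdot\partial^\mu\Box n &\simeq -\,\Box n\cdot\Box n, \\
\partial_\mu\partial_\nu n\cdot\partial^\mu\partial^\nu n
  &\simeq -\,\partial_\nu n\cdot\partial^\nu\Box n \simeq \Box n\cdot\Box n, \\
n\cdot\Box^2 n &\simeq -\,\partial_\mu n\cdot\partial^\mu\Box n \simeq \Box n\cdot\Box n, \\
% Inserting into the left-hand side of the constraint (2.32):
4\,\partial_\mu n\cdot\partial^\mu\Box n
  + 2\,\partial_\mu\partial_\nu n\cdot\partial^\mu\partial^\nu n
  + \Box n\cdot\Box n + n\cdot\Box^2 n
  &\simeq (-4 + 2 + 1 + 1)\,\Box n\cdot\Box n = 0 .
\end{align}
```

Under these manipulations the constraint reduces to a total derivative, which suggests that it carries no independent relation that could be used to eliminate $\Box n\cdot\Box n$.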

[...] (3.22) via a field redefinition, where the latter formulation of the theory contains no d'Alembertian operators.
Proof : First we write the theory (3.21) in terms of the stereographic coordinate, $z$, which gives eq. (3.23). Now considering the following field redefinition $z \to z + \frac{1}{\Lambda^2}\psi$, (3.24) to order $\Lambda^{-2}$, the terms generated by $\psi$ can only come from the first term (the kinetic term) in the theory.
It is easily shown that An educated guess is to take $\psi \propto \Box z$, but unfortunately one can only remove either the term with two boxes or the two terms with one box; thus another term in $\psi$ is necessary. Choosing instead a straightforward calculation yields (3.27) Thus setting $\alpha = \tfrac{1}{2}c_4$ and $\gamma = -c_4$, we get which is thus a box-free Lagrangian density. It is now easy to see, by comparing the coefficients of the above equation with those of eq. (3.23), that transforming back to vector coordinates $n$ yields the theory (3.22).
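The mechanism behind this proof can be sketched at first order in $\psi$. The normalization of the leading-order Lagrangian and the symbol $E$ for the bracket appearing in its equation of motion are assumptions of this sketch, and $\simeq$ denotes equality up to total derivatives.

```latex
\begin{align}
% Assumed leading-order Lagrangian in the complex coordinate:
\mathcal{L}^{(0)} &= -\frac{2\,\partial_\mu z\,\partial^\mu\bar z}{(1+|z|^2)^2}, \\
% Its Euler-Lagrange expression with respect to \bar z:
\frac{\delta\mathcal{L}^{(0)}}{\delta\bar z}
  &= \frac{2\,E}{(1+|z|^2)^2},
  \qquad
  E \equiv \Box z - \frac{2\,\bar z\,\partial_\mu z\,\partial^\mu z}{1+|z|^2}, \\
% Shift induced by z -> z + \psi/\Lambda^2 at first order:
\mathcal{L}^{(0)}\bigl[z + \Lambda^{-2}\psi\bigr] - \mathcal{L}^{(0)}[z]
  &\simeq \frac{2}{\Lambda^{2}}\,
    \frac{\bar\psi\,E + \psi\,\bar E}{(1+|z|^2)^2}
    + \mathcal{O}(\Lambda^{-4}) .
\end{align}
```

The induced terms are thus proportional to the leading-order equation of motion. If $\psi$ is taken to be of the form $\alpha\,\Box z + \gamma\,\bar z\,\partial_\mu z\,\partial^\mu z/(1+|z|^2)$, which is an assumption about the elided expression, then the quoted values $\alpha = \tfrac12 c_4$ and $\gamma = -c_4$ have the ratio $\gamma/\alpha = -2$, for which $\psi$ is proportional to $E$.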

Remark 7
Since we have eliminated all boxes from the Lagrangian to this order, the equation of motion is of second order, both in time and space directions. The theory to this order in $1/\Lambda$ is thus free from the Ostrogradsky ghost or related instabilities [8], see app. A.

Remark 8
The constraints c 4 + c 4 ≥ 0, c 4 ≥ 0 and c 4 ≥ 0 are due to constraining the static energy density to be positive definite, which is easier to do by considering the eigenvalues of the strain tensor ∂ µ n · ∂ ν n. It is thus easy to see that the −c 4 term will give a positive energy as long as c 4 < c 4 + c 4 [14], hence yielding the second constraint.

Order Λ −4
We will now consider the next order in the derivative or 1/Λ expansion of the EFT, and thus write down the list of all possible operators to order Λ −4 in the EFT expansion, which is: group 1: group 2:

$F_{64} \equiv \Box n\cdot\Box^2 n$, (3.54) which is a total of 34 operators and we have already eliminated operators that are related by eq. (2.27) to the ones in the above list. We have furthermore eliminated all operators that are related by the relation (3.18) to the above ones.

Performing integrations by parts on operators with four fields
we find 18 total derivatives and hence 18 relations between the dimension-6 operators with four fields. There are 19 such operators (i.e. eqs. (3.32)-(3.50)) and 18 relations, so one would naively think that all but one can be eliminated. However, some of the relations are dependent on others; in fact the rank of the vectors in the space of coefficients is only 13, so we can only eliminate 13 operators, leaving us with 6 independent operators composed of six derivatives and four fields. There is quite a lot of ambiguity in which operators to keep and which to eliminate. Our choice of which to keep is predicated on symmetry in the derivatives and the preference of having a box instead of two uncontracted derivatives acting on the field, because of the field redefinitions that we have in mind, to be applied shortly.
Performing instead integrations by parts on the operators with two fields and six derivatives, we have and hence all but one operator can be eliminated. We will choose to retain F 66 .

Lemma 9
The remaining operators after the integration by parts procedure, are a complete basis of the most general O(3), parity and Lorentz invariant dimension-6 operators, and they are given by: up to total derivatives and using eqs. (3.64), we get  At this point, it will prove convenient to introduce the following short-hand notation: where M , K, F , O, P and U are real Lorentz scalar quantities, whereas H, B, E, G, I, J, Q, R, S, T , and V are complex Lorentz scalars. According to Theorem 6, the order Λ −2 Lagrangian in the above short-hand notation thus neatly reads The NLσM with O(3), parity and Lorentz invariance having a single mass scale Λ to order Λ −4 , written as   where the short-hand notation (3.71) has been used and we have defined the notation We will now perform field redefinitions of the field z up to order Λ −4 : where ψ 1 was already determined in Theorem 6 and in the short-hand notation (3.71) is given by whereas ψ 2 is only determined at the higher order, namely at order Λ −4 . The zeroth order Lagrangian is unchanged under any field redefinition (by definition), whereas the proof of Theorem 6 can be restated in the short-hand notation (3.71) as and hence we obtain nicely which indeed is box free.
The change of the fourth-order Lagrangian due to the field redefinition (3.80) is given by Remark 10 The corrections to L (2) due to field redefinitions are determined only by ψ 1 . ψ 1 in turn will affect L (4) , but since we fixed ψ 1 at the previous order, the impact at order Λ −4 and hence on L (4) is given. ψ 2 does not affect L (2) and should thus be used for simplifying L (4) . The perturbative ordering in the field redefinitions is thus evident.
from which it is clear that all terms proportional to $z_{00}^2$, $\bar z_{00}^2$ and $|z_{00}|^2$ have been canceled out and we have used the short-hand notation $z_\mu = \partial_\mu z$, etc.

Remark 13
Notice that the last five lines of eq. (3.96) are linearly dependent on either $z_{00}$ or $\bar z_{00}$ (and not both of them). Technically, the assumption of non-degeneracy ($\partial^2 L/\partial z_{00}^2 \neq 0$) is violated. The Ostrogradsky theorem implying instability is thus not valid, but unfortunately, that is not necessarily sufficient for ruling out instability.
In order to illustrate the problem of a residual instability, let us consider a sub-Lagrangian of the full theory (3.94), i.e., where on the second line, we have set the overall factor equal to unity and in the last line written out the time and spatial derivatives, explicitly. Defining now we can write down the Ostrogradsky Hamiltonian $\mathcal{H} = \pi_z z_0 + \bar\pi_z \bar z_0 + \pi_w z_{00} + \bar\pi_w \bar z_{00} + \pi_{z_i} z_{0i} + \bar\pi_{z_i} \bar z_{0i} - \mathcal{L}$, where the Legendre transform has canceled out the terms linear in $z_{00}$ and $\bar z_{00}$ between $\pi_w z_{00} + \bar\pi_w \bar z_{00}$ and $\mathcal{L}$. But, unfortunately, there remains linear dependence on $z_{00}$ and $\bar z_{00}$ inside $\pi_z$ which is not canceled out in the resulting Hamiltonian. Their presence in $\pi_z$ is indeed a necessity, since the third-order derivative present in the Euler-Lagrange equation of motion corresponding to the Lagrangian manifests itself in the Hamilton equation as Writing now the Hamiltonian in terms of the phase-space variables $z$, $w$, $w_i$, $\pi_z$, $\pi_w$, $\pi_{z_i}$ and their complex conjugates, we have Allowing for spatial derivatives of the phase-space variables, since this is a Hamiltonian density for a field theory, we can see that the Hamiltonian is indeed still linearly dependent on $\pi_z$ and its complex conjugate, and the Ostrogradsky instability is thus not avoided.
Lemma 14 A sufficient condition for avoiding the Ostrogradsky instability, defined by the presence of a linearly dependent conjugate momentum in the corresponding Ostrogradsky Hamiltonian, can be stated as follows: The conjugate momentum $\pi_w = \partial L/\partial z_{00}$ cannot contain time derivatives of any fields other than $z$.
Proof : The trouble of generating terms linear in $z_{00}$ and $\bar z_{00}$ in the conjugate momentum $\pi_z$ can be traced back to the fact that the prefactor of the linear term in $z_{00}$ contains the time derivative of another field, i.e. $\bar z_0$ and not only $z_0$. That is, consider But consider instead another real term $L = c\,|z_0|^{2p}(z_{00} + \bar z_{00})$, $c \in \mathbb{R}$, $p \in \mathbb{Z}_+$, (3.108) then we have the conjugate momentum and hence we have generated a term linear in $\bar z_{00}$ in $\pi_z$ which is not canceled by anything, as the Hamiltonian gets the induced terms $H \supset -c\,p\,|z_0|^{2(p-1)}\left(z_0^2\,\bar z_{00} + \bar z_0^2\, z_{00}\right)$, (3.110) not present in the Lagrangian. Generalizing the proof to letting $\partial L/\partial z_{00}$ contain functions of $z$ and $\bar z$ does not alter the conclusion and hence the Lemma follows.
Corollary 15 Using Lemma 14, we can see that every term with linear dependence on $z_{00}$ in the Lagrangian (3.96) also has dependence on $\bar z_0$ and hence does not obey the condition of the Lemma to avoid the Ostrogradsky instability.
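The example (3.108) can be worked through explicitly. The sketch assumes the standard Ostrogradsky momenta $\pi_w = \partial L/\partial z_{00}$ and $\pi_z = \partial L/\partial z_0 - \partial_t \pi_w$, with $z$ and $\bar z$ treated as independent variables; the intermediate expression for $\pi_z$ is derived here and is not quoted from the text.

```latex
\begin{align}
L &= c\,|z_0|^{2p}\,(z_{00} + \bar z_{00}), \\
\pi_w &= \frac{\partial L}{\partial z_{00}} = c\,|z_0|^{2p}, \\
\pi_z &= \frac{\partial L}{\partial z_0} - \partial_t \pi_w
       = c\,p\,|z_0|^{2(p-1)}\,(\bar z_0 - z_0)\,\bar z_{00}, \\
% The terms \pi_w z_{00} + \bar\pi_w \bar z_{00} cancel against L:
H &= \pi_z z_0 + \bar\pi_z \bar z_0 + \pi_w z_{00} + \bar\pi_w \bar z_{00} - L
   = \pi_z z_0 + \bar\pi_z \bar z_0 \\
  &= c\,p\,|z_0|^{2(p-1)}\Bigl[\,|z_0|^2\,(z_{00} + \bar z_{00})
     - z_0^2\,\bar z_{00} - \bar z_0^2\, z_{00}\Bigr] .
\end{align}
```

The last two terms in the bracket reproduce the induced terms of eq. (3.110), and the surviving dependence on $\bar z_{00}$ enters only through $\pi_z$, in line with the statement of the Lemma.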

Remark 22
The unveiled instability can be avoided if B = 0, which however would require a precise fine-tuning of the theory, which is not expected generically to be the case.

Remark 23
The instability found in Theorem 21 is due to the nature of the derivative expansion of the effective low-energy theory and is a classical instability. The quantum considerations are beyond the scope of this paper. The instability of spiral type is mathematically different from the instability taking place at order Λ −2 , which however is always of the isotropic type.

A Ostrogradsky's theorem
For convenience, we review Ostrogradsky's theorem, specialized to second-order derivatives: Theorem 24 (Ostrogradsky [7], Woodard [8]) Given a Lagrangian theory with quadratic and non-degenerate dependence on a second-order time derivative of a field, the corresponding Ostrogradsky Hamiltonian possesses a linear dependence on one of the two conjugate momenta, and as a result the corresponding energy is bounded neither from below nor from above.
Proof : Consider the Lagrangian $L(z, z_0, z_{00})$ which is a functional of $z$, as well as its first and second-order time derivatives denoted by one and two zeros as indices, respectively, and the corresponding Ostrogradsky Hamiltonian $H(z, w, \pi_z, \pi_w)$ which is a functional of $z$, $w = z_0$ and their conjugate momenta $\pi_z$ and $\pi_w$. The condition that the Lagrangian depends non-degenerately on $z_{00}$ means that $\partial^2 L/\partial z_{00}^2 \neq 0$ (A.1). Finally, the Hamiltonian can be written as $H = \pi_z w + \bar\pi_z \bar w + \pi_w\, a(z, w, \pi_w) + \bar\pi_w\, \bar a(z, w, \pi_w) - L(z, w, a(z, w, \pi_w))$, (A.6) where the acceleration $a(z, w, \pi_w)$ is defined by $\left.\frac{\partial L}{\partial z_{00}}\right|_{z_0 = w,\, z_{00} = a} = \pi_w$. (A.7) The Hamilton equations $z_0 = \partial H/\partial\pi_z$, $w_0 = \partial H/\partial\pi_w$, $\partial_t \pi_w = -\partial H/\partial w$ simply reproduce the phase space transformation (A.4)-(A.5), whereas $\partial_t \pi_z = -\partial H/\partial z$ reproduces the Euler-Lagrange equation of $L$. The assumption of non-degeneracy (A.1) implies that the phase space transformation (A.4)-(A.5) can be inverted to solve for $z_{00}$ in terms of $z$, $w$ and $\pi_w$, by means of the implicit function theorem, which is the statement (A.7). Crucially, the acceleration $a(z, w, \pi_w)$ does not depend on the conjugate momentum $\pi_z$, which is only needed for the third-order time derivative of $z$. Notice that the third-order time derivative only appears when the assumption of non-degeneracy (A.1) holds true.
Finally, we have that the Ostrogradsky Hamiltonian (A.6) depends only linearly on the conjugate momentum π z . The linear dependence implies the Ostrogradsky instability, since the system can linearly be driven to lower and lower (or higher and higher) energies. This completes the proof.
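As a minimal illustration of the theorem, consider the simplest non-degenerate case. The sketch assumes a single real variable $z(t)$ (so that no complex conjugate terms appear) and the toy Lagrangian below, neither of which is taken from the text.

```latex
\begin{align}
% Toy Lagrangian, non-degenerate in z_{00}:
L &= \tfrac12\, z_{00}^2, \qquad w = z_0, \\
\pi_w &= \frac{\partial L}{\partial z_{00}} = z_{00}
  \;\;\Rightarrow\;\; a(z, w, \pi_w) = \pi_w, \\
\pi_z &= \frac{\partial L}{\partial z_0}
        - \partial_t\frac{\partial L}{\partial z_{00}} = -\,z_{000}, \\
% Ostrogradsky Hamiltonian:
H &= \pi_z\, w + \pi_w\, a - L = \pi_z\, w + \tfrac12\,\pi_w^2 .
\end{align}
```

The Hamiltonian is linear in $\pi_z$, so for fixed $w \neq 0$ it can be made arbitrarily negative or positive by the choice of $\pi_z$, which is the unboundedness stated in the theorem.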

B Field redefinition operators
In order to cancel the unwanted operators, we need a systematic educated guess for the field redefinitions, ψ 2 , at the order Λ −4 and they read