1 Introduction

Higher-derivative dynamics is as good as the conventional one in many principal respects. In particular, the Noether theorem, which connects symmetries and conservation laws, still applies. The Hamiltonian formulation is also known both for nonsingular theories [1] and for the most general higher-derivative Lagrangians with singular Hessian [2]. For many decades, a variety of higher-derivative models have been studied time and again. Well-known examples include the Pais–Uhlenbeck oscillator [3], Podolsky electrodynamics [4–6], various conformal field theories [7, 8], \(R^2\)-gravity [9, 10], and many others. A vast literature exists on various higher-derivative models; we mention the papers [11–43] and references therein.

In many cases, higher-derivative models reveal remarkable properties. They often admit a wider symmetry than their first-derivative analogs. Another typical phenomenon is that the inclusion of higher derivatives in the Lagrangian can improve the convergence in field-theoretical models at both the classical and the quantum level.

A notorious difficulty of higher-derivative models concerns the instability of their dynamics. The Noether energy is typically unbounded for higher-derivative Lagrangians, and this fact is usually considered as evidence of a classical instability. At the quantum level, the instability reveals itself through ghost poles in the propagator and a related problem with the unbounded spectrum of the energy. In turn, the problems of quantum instability are related to the fact that Ostrogradsky’s Hamiltonian, being the phase-space equivalent of Noether’s energy, is unbounded due to the higher derivatives.

For the general acceleration-dependent Lagrangian, the Noether energy

$$\begin{aligned} E_{\mathfrak {N}}\equiv \left( \frac{\partial L}{\partial \dot{\phi }^i} -\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \ddot{\phi }^i} \right) \dot{\phi }^i + \frac{\partial L}{\partial \ddot{\phi }^i} \ddot{\phi }^i - L \end{aligned}$$
(1)

cannot be positive for a simple reason: it is linear in \(\dddot{\phi }{}^i\). The third derivatives are independent initial data for the fourth-order Lagrange equations whenever the Hessian

$$\begin{aligned} \frac{\partial ^2 L(\phi , \dot{\phi }, \ddot{\phi })}{\partial \ddot{\phi }^i\partial \ddot{\phi }^j} \end{aligned}$$

is non-degenerate.
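To see the linearity explicitly, one can expand the total time derivative in (1) by the chain rule. The following is only a sketch, assuming a Lagrangian \(L(\phi ,\dot{\phi },\ddot{\phi })\) with no explicit time dependence:

$$\begin{aligned} E_{\mathfrak {N}} = -\frac{\partial ^2 L}{\partial \ddot{\phi }^i\partial \ddot{\phi }^j}\,\dot{\phi }^i\,\dddot{\phi }{}^j + (\text {terms independent of } \dddot{\phi }). \end{aligned}$$

The coefficient at the third derivatives is the Hessian contracted with the velocities, so for a non-degenerate Hessian and nonzero \(\dot{\phi }\) the energy can be given either sign by the choice of \(\dddot{\phi }\).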

For models with a degenerate Hessian, constraints appear in phase space [2], which can restrict the third derivatives. It is a very special case when the constraints are strong enough to make the linear function positive, though it may happen on some occasions [18, 21]. The known examples of this type include the higher-order theories of gravity [28, 31–33] and some models of higher-spin fields [38, 39, 42]. One more example is given by the relativistic point particle whose Lagrangian depends linearly on the curvature of the world line [43]. Because the Hamiltonian is positive, these models are stable classically and have no ghosts at the quantum level.

The positivity of the canonical Noether energy is a sufficient condition for classical stability, but it is not a necessary one. The simplest example is provided by the Pais–Uhlenbeck oscillator. The Lagrangian is acceleration dependent and nonsingular. Therefore the Noether energy is unbounded in this model, while the classical stability is obvious because the motion is bounded. The point is that the Pais–Uhlenbeck oscillator admits another integral of motion which is positive. It is this integral that provides stability. Various specific reasons can be seen for considering this positive conserved quantity as a natural candidate for the role of energy in this model. We elaborate on the details in the next section.

In this paper, we consider the issue of stability of higher-derivative theories from the viewpoint of the existence of a positive integral of motion. First, we consider a class of linear higher-derivative systems. The fourth-order operator of the equations is supposed to admit factorization into a pair of different second-order operators satisfying a certain (not too restrictive) condition. Many of the known higher-derivative linear models fall into this class, including the Pais–Uhlenbeck oscillator, Podolsky electrodynamics, and linearized conformal gravity. For models of this type we construct an integral of motion which is quadratic in the third derivatives. It can be either bounded or unbounded depending on the signature, in contrast to the Noether energy, which is almost always unbounded unless the theory is strongly constrained. Besides the general method of construction, we explicitly present the positive integral in several higher-derivative models with unbounded Noether energy. As we further demonstrate, the concept of factorization extends beyond the linear level, providing a procedure for the inclusion of stable interactions in higher-derivative theories.

As the next step, we establish a relationship between the conserved positive quantity responsible for the classical stability of the higher-derivative dynamics and the translation invariance. The key tool that connects the integral of motion with the symmetry is the concept of a Lagrange anchor [44]. Originally, the Lagrange anchor (see Footnote 1) was introduced as a tool for extending the BV-BRST quantization procedure beyond the scope of Lagrangian theories [44]. Given not necessarily variational equations of motion, the Lagrange anchor allows one to define the Schwinger–Dyson equation [45] and the path integral representation for the partition function [46]. It was noticed later that the Lagrange anchor maps conservation laws to symmetries [47], thus extending the Noether theorem beyond the class of variational equations. Any Lagrangian system admits a canonical Lagrange anchor, which is given by the identity operator. The same system of equations may admit different inequivalent Lagrange anchors. Inequivalent Lagrange anchors result in inequivalent quantum theories, and different Lagrange anchors assign different symmetries to the same conservation law. It turns out that the higher-derivative Lagrangian dynamics of the considered class always admits a Lagrange anchor which is inequivalent to the canonical one. If the energy is connected to the time-translation invariance by this anchor, we arrive at a positive energy which differs from the unbounded expression (1). Furthermore, the quantization with this anchor will not break the stability, as we explain below.

For first-order unconstrained mechanical systems without gauge symmetries, each Lagrange anchor defines and is defined by a bivector [44, 48, 49]. This means, in particular, that when a nonsingular higher-derivative Lagrangian of a mechanical system (see Footnote 2) is reduced to the first order by introducing auxiliary variables, the first-order system will be bi-Hamiltonian whenever two inequivalent Lagrange anchors are admissible for the higher-derivative equations. The different Hamiltonians represent in the phase space the different conserved quantities connected with the time-shift transformation by different Lagrange anchors in the configuration space. The fact that the Pais–Uhlenbeck oscillator is a bi-Hamiltonian system has been noticed in [16, 17]. The “non-Ostrogradsky Hamiltonian” is positive. As we observe, it corresponds to the integral of motion connected with the time-shift symmetry of the Pais–Uhlenbeck oscillator by an alternative Lagrange anchor. As we will demonstrate, this is not an isolated observation valid only for a particular higher-derivative model. It is part of a broader picture concerning the issue of stability in higher-derivative systems. These systems turn out to be classically stable for the same reason as the first-derivative Lagrangian dynamics: they all have a positive energy that is conserved. The only essential difference is that the definition of energy may involve a more general Lagrange anchor than the canonical one.

In this paper, we also address the problem of including interaction without breaking the stability of higher-derivative dynamics. For Lagrangian equations without higher derivatives and with a positive Noether energy, it would be sufficient to include a translation-invariant interaction into the Lagrangian in a way that keeps the energy bounded. For general higher-derivative systems, where stability can no longer be controlled by Noether’s energy (1), the issue becomes more subtle. As we see, a positive (non-canonical) energy is connected with the translation invariance by a non-canonical Lagrange anchor in the higher-derivative theory. In this regard, sufficient conditions for stability amount to the following requirements, which are automatically satisfied with the canonical anchor. First, the interaction has to be included simultaneously into the equations of motion and into the Lagrange anchor to keep them compatible. When the relevant Lagrange anchor is canonical, it is automatically compatible with the Lagrangian vertices in the equations. For the stability of higher-derivative systems, as we see, a non-canonical Lagrange anchor is typically relevant because it connects the positive integral of motion with translation invariance. Second, the interaction should preserve the positivity of the energy. If the vertex is Lagrangian and translation invariant, the Noether energy is still conserved, though this does not automatically imply the same for a positive energy, which is a different integral of motion. The requirement that the deformed energy be conserved and remain positive is an additional condition imposed on the interaction. Last but not least, the deformed Lagrange anchor should connect the positive energy of the interacting system with the generator of time translations. This is not automatically satisfied either. We demonstrate by examples that all these requirements can be met, though the stability control is not as simple a procedure as it is in theories without higher derivatives.

The paper is organized as follows. In the next, warm-up section we consider the model of the Pais–Uhlenbeck oscillator to illustrate the key general constructs we further use to control the stability of higher-derivative dynamics. Section 3 describes the general structure of the factorizable higher-derivative dynamics, both linear and nonlinear, that allows one to control stability at the classical level and keep it upon quantization. Section 4 illustrates the proposed technique by the examples of a higher-derivative scalar field model and Podolsky’s electrodynamics. We demonstrate the stability of these models. As the paper essentially employs the Lagrange anchor method developed in [44–48], we outline the relevant aspects of this construction in the appendices to make the paper self-contained. The general idea of a Lagrange anchor is explained in Appendix A. This appendix also provides some relations which are used in this work. Appendix B demonstrates how the Lagrange anchor is applied to connect conserved quantities with symmetries. Particular consideration is given to the possibility of connecting different conserved quantities to the translation invariance when the system admits different anchors. Appendix C provides an elementary technique for finding the Lagrange anchors for free field equations. It also explains why the higher-derivative dynamics admits a wider set of Lagrange anchors than the second-order field equations. Appendix D explains how the linear techniques for finding the Lagrange anchors are extended to a certain class of nonlinear higher-derivative systems considered in this paper. The appendices provide the background and techniques for those who wish to apply or further develop the method, while the results of the present paper can be understood by consulting only the relations which are directly referred to in the main text.

2 Stability of the Pais–Uhlenbeck oscillator

In this section, we consider the Pais–Uhlenbeck (PU) oscillator, which has been studied for decades; see [11–17, 19, 20, 22, 23] and references therein. With this simplest model we illustrate the key structures related to the (in)stability problem of higher-derivative dynamics. In the next section these structures are described in general form.

The action of the PU oscillator involves derivatives of a single variable \(\phi (t)\) up to the second order:

$$\begin{aligned}&S[\phi ]= \int \mathrm{d}t L, \nonumber \\&\quad L=\frac{1}{2(\omega _1^2-\omega _2^2)} \left( \ddot{\phi }+\omega _1^2\phi \right) \left( \ddot{\phi }+\omega _2^2\phi \right) ; \end{aligned}$$
(2)

here \(\omega _1\ne \omega _2\) are the frequencies of oscillations. The corresponding equation of motion reads

$$\begin{aligned} \frac{\delta S}{\delta \phi }\equiv \frac{1}{\omega _1^2-\omega _2^2}\left( \frac{\mathrm{d}^2}{\mathrm{d}t^2} +\omega _1^2\right) \left( \frac{\mathrm{d}^2}{\mathrm{d}t^2}+\omega _2^2\right) \phi =0. \end{aligned}$$
(3)
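As a short check of (3), one can expand the product in (2) and apply the Euler–Lagrange operator for an acceleration-dependent Lagrangian. This is a sketch of the intermediate step only:

$$\begin{aligned} L&=\frac{\ddot{\phi }^2+(\omega _1^2+\omega _2^2)\phi \ddot{\phi }+\omega _1^2\omega _2^2\phi ^2}{2(\omega _1^2-\omega _2^2)}, \nonumber \\ \frac{\delta S}{\delta \phi }&=\frac{\partial L}{\partial \phi }-\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{\phi }}+\frac{\mathrm{d}^2}{\mathrm{d}t^2}\frac{\partial L}{\partial \ddot{\phi }} =\frac{\phi ^{(4)}+(\omega _1^2+\omega _2^2)\ddot{\phi }+\omega _1^2\omega _2^2\phi }{\omega _1^2-\omega _2^2}. \end{aligned}$$

The numerator is the expanded form of \(\left( \frac{\mathrm{d}^2}{\mathrm{d}t^2}+\omega _1^2\right) \left( \frac{\mathrm{d}^2}{\mathrm{d}t^2}+\omega _2^2\right) \phi \).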

As is seen, the fourth-order operator of the equation factorizes into the product of the second-order commuting operators. Because of this factorization, the general solution to (3) is given by the sum

$$\begin{aligned} \phi =\xi +\eta , \end{aligned}$$
(4)

where the functions \(\xi \) and \(\eta \) satisfy the second-order equations

$$\begin{aligned} \left( \frac{\mathrm{d}^2}{\mathrm{d}t^2}+\omega ^2_1\right) \xi =0,\quad \left( \frac{\mathrm{d}^2}{\mathrm{d}t^2}+\omega ^2_2\right) \eta =0. \end{aligned}$$
(5)

Conversely, if \(\phi \) is a solution to the original fourth-order equation (3), then the expressions

$$\begin{aligned} \xi = \frac{\ddot{\phi }+\omega _2^2\phi }{\omega _2^2-\omega _1^2}, \quad \eta =\frac{\ddot{\phi }+\omega _1^2\phi }{\omega _1^2-\omega _2^2}\, \end{aligned}$$
(6)

obey the second-order equations (5). The relations (4) and (6) establish a one-to-one correspondence between the solutions to the fourth-order equation (3) and the second-order system (5).
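The correspondence can be probed numerically. The sketch below builds \(\phi =\xi +\eta \) from two harmonic solutions of (5) and recovers the components through (6); the frequencies, amplitudes, phases, and function names are illustrative assumptions, not values from the paper.

```python
import math

# Hypothetical sample values (not from the paper); any omega_1 != omega_2 works.
OMEGA1, OMEGA2 = 2.0, 1.0
A1, T1 = 0.7, 0.3     # amplitude and time shift of the xi-oscillation
A2, T2 = 1.9, -1.1    # amplitude and time shift of the eta-oscillation


def sample(t):
    """Return (xi, eta, phi, phi_ddot) at time t for phi = xi + eta."""
    xi = A1 * math.sin(OMEGA1 * (t - T1))
    eta = A2 * math.sin(OMEGA2 * (t - T2))
    phi = xi + eta
    # Each component is harmonic, so its second derivative is -omega^2 times itself.
    phi_ddot = -OMEGA1**2 * xi - OMEGA2**2 * eta
    return xi, eta, phi, phi_ddot


def split(phi, phi_ddot):
    """Recover (xi, eta) from phi and its second derivative via relations (6)."""
    xi = (phi_ddot + OMEGA2**2 * phi) / (OMEGA2**2 - OMEGA1**2)
    eta = (phi_ddot + OMEGA1**2 * phi) / (OMEGA1**2 - OMEGA2**2)
    return xi, eta


def max_split_error(times):
    """Largest mismatch between the original and the recovered components."""
    worst = 0.0
    for t in times:
        xi, eta, phi, phi_ddot = sample(t)
        xi_rec, eta_rec = split(phi, phi_ddot)
        worst = max(worst, abs(xi - xi_rec), abs(eta - eta_rec))
    return worst
```

Calling `max_split_error` on a grid of times should return a value at the level of floating-point rounding.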

The general solution for \(\phi \) is a linear combination of the two independent harmonic oscillations,

$$\begin{aligned} \xi =A_1\sin {\omega _1 (t-t_1)},\quad \eta =A_2\sin {\omega _2 (t-t_2)}. \end{aligned}$$
(7)

Taking the linear combination of the energies of the oscillations, we get a two-parameter family of integrals of motion for the PU model

$$\begin{aligned} E_{\alpha ,\beta }=\frac{\alpha }{2}\left( \dot{\xi }^2+\omega _1^2\xi ^2\right) + \frac{\beta }{2}\left( \dot{\eta }^2+\omega _2^2\eta ^2\right) , \end{aligned}$$
(8)

with \(\alpha ,\beta \) being arbitrary real constants. Using (6), we can write \(E_{\alpha ,\beta }\) as a quadratic form of \(\phi \) and its derivatives up to the third order:

$$\begin{aligned} E_{\alpha ,\beta }&= \frac{\alpha }{2}\left[ \left( \frac{\dddot{\phi }+\omega _2^2\dot{\phi }}{\omega _2^2-\omega _1^2}\right) ^2+ \omega _1^2\left( \frac{\ddot{\phi }+\omega _2^2\phi }{\omega _2^2-\omega _1^2}\right) ^2\right] \nonumber \\&+\frac{\beta }{2}\left[ \left( \frac{\dddot{\phi }+\omega _1^2\dot{\phi }}{\omega _1^2-\omega _2^2}\right) ^2+ \omega _2^2\left( \frac{\ddot{\phi }+\omega _1^2\phi }{\omega _1^2-\omega _2^2}\right) ^2\right] \nonumber \\&= \frac{\alpha A_1^2\omega _1^2}{2}+\frac{\beta A_2^2\omega _2^2}{2}. \end{aligned}$$
(9)

If \(\alpha \beta \ne 0\), then the only critical point of the function \(E_{\alpha ,\beta }(\phi ,\dot{\phi },\ddot{\phi },\dddot{\phi })\) is zero. The quadratic form \(E_{\alpha ,\beta }\) is positive definite whenever \(\alpha >0\) and \(\beta >0\). The latter fact ensures the boundedness of motion for any choice of initial data (see Footnote 3).
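Conservation of \(E_{\alpha ,\beta }\) can also be observed numerically. The following minimal sketch integrates the fourth-order equation (3) with a classical Runge–Kutta scheme and tracks the drift of (9); the integrator, step size, frequencies, and initial data are assumptions chosen only for illustration.

```python
# Hypothetical squared frequencies omega_1^2 and omega_2^2 (illustration only).
OMEGA1_SQ, OMEGA2_SQ = 4.0, 1.0


def rhs(y):
    """First-order form of (3): y = (phi, phi', phi'', phi''')."""
    phi, d1, d2, d3 = y
    d4 = -(OMEGA1_SQ + OMEGA2_SQ) * d2 - OMEGA1_SQ * OMEGA2_SQ * phi
    return (d1, d2, d3, d4)


def rk4_step(y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(y)
    k2 = rhs(tuple(yi + 0.5 * h * ki for yi, ki in zip(y, k1)))
    k3 = rhs(tuple(yi + 0.5 * h * ki for yi, ki in zip(y, k2)))
    k4 = rhs(tuple(yi + h * ki for yi, ki in zip(y, k3)))
    return tuple(
        yi + h / 6.0 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    )


def energy(y, alpha, beta):
    """Integral of motion (9) as a function of phi and its derivatives."""
    phi, d1, d2, d3 = y
    xi = (d2 + OMEGA2_SQ * phi) / (OMEGA2_SQ - OMEGA1_SQ)
    xi_dot = (d3 + OMEGA2_SQ * d1) / (OMEGA2_SQ - OMEGA1_SQ)
    eta = (d2 + OMEGA1_SQ * phi) / (OMEGA1_SQ - OMEGA2_SQ)
    eta_dot = (d3 + OMEGA1_SQ * d1) / (OMEGA1_SQ - OMEGA2_SQ)
    return (0.5 * alpha * (xi_dot**2 + OMEGA1_SQ * xi**2)
            + 0.5 * beta * (eta_dot**2 + OMEGA2_SQ * eta**2))


def energy_drift(y0, alpha, beta, h=2e-3, steps=5000):
    """Return (largest deviation of E along the trajectory, initial E)."""
    e0 = energy(y0, alpha, beta)
    y, worst = y0, 0.0
    for _ in range(steps):
        y = rk4_step(y, h)
        worst = max(worst, abs(energy(y, alpha, beta) - e0))
    return worst, e0
```

For \(\alpha ,\beta >0\) the initial value is positive and the drift stays at the level of the integration error; the same holds for the indefinite choice \(\alpha =-\beta =1\), which is conserved but not sign-definite.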

In general, we say that the classical dynamics is stable in a vicinity of a phase-space point \(\phi _0\), if \(\phi _0\) provides a local minimum for a conserved quantity \(E\) and the Hessian matrix \(\mathrm{d}^2E\) is positive definite at \(\phi _0\). In this case the level surfaces \(E=E_0\), where \(E_0\) is close enough to the minimum value, are compact and the motion is bounded in the phase space. In the subsequent discussion we will call a conserved quantity \(E\) positive definite (in the vicinity of its extremum point \(\phi _0\)) if its Hessian matrix \(\mathrm{d}^2E\) is.

In the case of PU oscillator we have the two-parameter family (9) of conserved quantities and at least two physically reasonable candidates for the energy. First of all, as we are dealing with the pair of oscillations (7), it is quite natural to define the energy of the PU model as the total energy of two uncoupled harmonic oscillators, namely,

$$\begin{aligned} E_{1,1} = \frac{A_1^2\omega _1^2}{2}+\frac{A_2^2\omega _2^2}{2}. \end{aligned}$$

This energy is positive definite and its conservation ensures the classical stability of the PU oscillator.

Another possibility is suggested by the Noether theorem [51]. In Lagrangian mechanics the canonical energy is defined as the integral of motion corresponding to the invariance of a conservative system under the time translations. This correspondence, being applied to the PU oscillator, leads to an unbounded energy as we explain below.

The time derivative of any integral of motion \(E\) is to be proportional to the l.h.s. of equations of motion, i.e.,

$$\begin{aligned} \frac{\mathrm{d}E}{\mathrm{d}t}=Q \frac{\delta S}{\delta \phi }. \end{aligned}$$
(10)

The coefficient \(Q=Q(\phi ,\dot{\phi },\ddot{\phi }, \dddot{\phi })\) is called the characteristic of the conserved quantity \(E\). The Noether theorem connects the integrals of motion to the symmetries of the action by identifying the characteristic \(Q\) with the infinitesimal symmetry transformation:

$$\begin{aligned} \delta _\varepsilon \phi = \varepsilon Q, \quad \delta _\varepsilon S=0\quad \Leftrightarrow \quad Q\frac{\delta S}{\delta \phi }=\frac{\mathrm{d}E}{\mathrm{d}t} \end{aligned}$$
(11)

for some \(E=E(\phi ,\dot{\phi },\ddot{\phi },\dddot{\phi })\). In this way, the invariance of the action (2) with respect to the time translation \(\delta _\varepsilon \phi = -\dot{\phi }\varepsilon \) gives rise to the Noether energy (1). On the other hand, one can find the following expression for the characteristic of the conserved quantity (9):

$$\begin{aligned} Q_{\alpha ,\beta }=\frac{(\alpha +\beta )\dddot{\phi }+(\alpha \omega _2^2+\beta \omega _1^2)\dot{\phi }}{\omega _1^2-\omega _2^2}. \end{aligned}$$
(12)
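The expression (12) can be checked directly. As a sketch, take \(\xi \) and \(\eta \) to be defined by (6) for an arbitrary (not necessarily on-shell) trajectory \(\phi (t)\); then

$$\begin{aligned} \ddot{\xi }+\omega _1^2\xi&=-\frac{\delta S}{\delta \phi },\qquad \ddot{\eta }+\omega _2^2\eta =\frac{\delta S}{\delta \phi }, \nonumber \\ \frac{\mathrm{d}E_{\alpha ,\beta }}{\mathrm{d}t}&=\alpha \dot{\xi }\left( \ddot{\xi }+\omega _1^2\xi \right) +\beta \dot{\eta }\left( \ddot{\eta }+\omega _2^2\eta \right) =\left( \beta \dot{\eta }-\alpha \dot{\xi }\right) \frac{\delta S}{\delta \phi }, \end{aligned}$$

and substituting (6) into \(Q_{\alpha ,\beta }=\beta \dot{\eta }-\alpha \dot{\xi }\) reproduces (12).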

Thus, the identification \(Q=-\dot{\phi }\) implies that \(\alpha =-\beta =1\) and the corresponding Noether energy reads

$$\begin{aligned} E_{1,-1}&= \frac{2\dddot{\phi }\dot{\phi }-(\ddot{\phi })^2+(\omega _1^2+\omega _2^2)\dot{\phi }^2+ \omega _1^2\omega _2^2\phi ^2}{2(\omega _2^2-\omega _1^2)} \nonumber \\&= \frac{A_1^2\omega _1^2}{2}-\frac{A_2^2\omega _2^2}{2}. \end{aligned}$$
(13)

Unlike \(E_{1,1}\), this energy is not positive definite. The positive definite integrals of motion (9) correspond to \(\alpha >0\), \(\beta >0\) and their characteristics (12) are bound to involve the third derivative of \(\phi \). As a result, the usual Noether theorem cannot connect a positive conserved quantity to the time translation.
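The indefiniteness of the Noether energy is easy to exhibit with exact rational arithmetic. The sketch below compares the closed form (13) with the quadratic form (9) at \(\alpha =-\beta =1\) and evaluates it on two states where only one of the oscillations is excited; the squared frequencies and sample states are hypothetical.

```python
from fractions import Fraction as F

W1, W2 = F(4), F(1)  # hypothetical values of omega_1^2 and omega_2^2


def energy_family(state, alpha, beta):
    """Quadratic form (9) evaluated on state = (phi, phi', phi'', phi''')."""
    phi, d1, d2, d3 = state
    xi, xi_dot = (d2 + W2 * phi) / (W2 - W1), (d3 + W2 * d1) / (W2 - W1)
    eta, eta_dot = (d2 + W1 * phi) / (W1 - W2), (d3 + W1 * d1) / (W1 - W2)
    return (alpha * (xi_dot**2 + W1 * xi**2) / 2
            + beta * (eta_dot**2 + W2 * eta**2) / 2)


def noether_energy(state):
    """Closed-form expression (13) for the Noether energy."""
    phi, d1, d2, d3 = state
    numerator = 2 * d3 * d1 - d2**2 + (W1 + W2) * d1**2 + W1 * W2 * phi**2
    return numerator / (2 * (W2 - W1))


# States with only one oscillation excited (eta = 0 and xi = 0, respectively).
PURE_XI = (F(1), F(0), F(-4), F(0))
PURE_ETA = (F(1), F(0), F(-1), F(0))
```

On `PURE_XI` the Noether energy is positive and on `PURE_ETA` it is negative, while `energy_family(state, 1, 1)` is positive on both.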

A more general correspondence between symmetries and integrals of motion is established by means of the Lagrange anchor [47]; see also Appendix B. The Lagrange anchor is a differential operator that satisfies certain compatibility conditions with the equations of motion; see the definition (6.10). Given equations of motion, the Lagrange anchor is not necessarily unique and the different Lagrange anchors establish different connections between symmetries and conservation laws. In particular, for the PU oscillator we have the two-parameter family of the Lagrange anchors (8.7):

$$\begin{aligned} V_{\rho ,\sigma }= \frac{\rho }{\omega _2^2-\omega _1^2}\left( \frac{\mathrm{d}^2~}{\mathrm{d}t^2}+\omega _2^2\right) + \frac{\sigma }{\omega _1^2-\omega _2^2}\left( \frac{\mathrm{d}^2~}{\mathrm{d}t^2}+\omega _1^2\right) , \end{aligned}$$
(14)

with \(\rho \) and \(\sigma \) being arbitrary real constants. The details about deriving this Lagrange anchor are collected in Appendix C.

Each Lagrange anchor maps characteristics to symmetries by the rule (7.6). Applying the Lagrange anchor (14) to the characteristic (12), we get the following symmetry, which corresponds to the integral of motion (9):

$$\begin{aligned} \displaystyle \delta _\varepsilon \phi&= \varepsilon V_{\rho ,\sigma }(Q_{\alpha ,\beta })=\displaystyle \frac{\varepsilon }{(\omega _1^2-\omega _2^2)^2} \nonumber \\&\quad \times \left[ (\alpha +\beta )(\sigma -\rho )\phi ^{(5)}+ (\omega _1^2(\alpha \sigma +2\beta \sigma -\beta \rho ) \right. \nonumber \\&\quad \left. \displaystyle -\, \omega _2^2(\beta \rho +2\alpha \rho -\alpha \sigma ))\dddot{\phi }+ (\beta \sigma \omega _1^4 \right. \nonumber \\&\quad \left. +\,(\alpha \sigma -\beta \rho )\omega _1^2\omega _2^2- \alpha \rho \omega _2^4)\dot{\phi }\right] . \end{aligned}$$
(15)

Let us consider this relationship from the perspective of having alternative integrals of motion connected with the time translation. To establish the correspondence, we re-arrange (15) to absorb the higher-derivative term with \(\phi ^{(5)}\) by the equation of motion (see Footnote 4):

$$\begin{aligned} \delta _{\varepsilon }\phi&=\varepsilon \frac{(\omega _1^2-\omega _2^2)(\alpha \rho +\beta \sigma )\dddot{\phi }+ (\beta \sigma \omega _1^4+(\alpha \rho -\beta \sigma )\omega _1^2\omega _2^2- \alpha \rho \omega _2^4)\dot{\phi }}{(\omega _1^2-\omega _2^2)^2}\nonumber \\&\quad + \varepsilon \frac{(\alpha +\beta )(\sigma -\rho )}{\omega _1^2-\omega _2^2}\frac{\mathrm{d}}{\mathrm{d}t}\frac{\delta S}{\delta \phi }. \end{aligned}$$
(16)

The anchor connects the general characteristic (12) with the time translation \(\delta _\varepsilon \phi =-\dot{\phi }\varepsilon \) if the coefficient at \(\dddot{\phi }\) vanishes. This leads to the condition \(\alpha \rho +\beta \sigma =0\). The correct coefficient at the first derivative is provided by \(\alpha \rho =1\). Solving these conditions for \(\rho \) and \(\sigma \), we see that the Lagrange anchor \(V_{\frac{1}{\alpha },-\frac{1}{\beta }}\) connects the general non-degenerate integral of motion (9) to the time translation

$$\begin{aligned} \delta _\varepsilon \phi =\varepsilon V_{\frac{1}{\alpha },-\frac{1}{\beta }}(Q_{\alpha ,\beta })=-\varepsilon \dot{\phi }-\frac{(\alpha +\beta )^2}{\alpha \beta } \frac{\varepsilon }{\omega _1^2-\omega _2^2}\frac{\mathrm{d}}{\mathrm{d}t}\frac{\delta S}{\delta \phi }. \end{aligned}$$
(17)
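Since all operators involved have constant coefficients, the identity (17) reduces to an identity between polynomials in \(D=\mathrm{d}/\mathrm{d}t\) and can be checked with exact arithmetic. The sketch below encodes the anchor (14), the characteristic (12), and the operator of (3) as coefficient lists (lowest power first); the helper names and sample parameter values are assumptions made for illustration.

```python
from fractions import Fraction as F


def poly_mul(a, b):
    """Product of two polynomials in D given as coefficient lists."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def anchor(rho, sigma, w1, w2):
    """Lagrange anchor (14); w1, w2 stand for omega_1^2, omega_2^2."""
    c = w1 - w2
    return [(sigma * w1 - rho * w2) / c, F(0), (sigma - rho) / c]


def characteristic(alpha, beta, w1, w2):
    """Characteristic (12) as an operator acting on phi."""
    c = w1 - w2
    return [F(0), (alpha * w2 + beta * w1) / c, F(0), (alpha + beta) / c]


def eom_operator(w1, w2):
    """Fourth-order operator of the equation of motion (3)."""
    c = w1 - w2
    return [w1 * w2 / c, F(0), (w1 + w2) / c, F(0), F(1) / c]


def time_translation_rhs(alpha, beta, w1, w2):
    """Right-hand side of (17): -D plus a multiple of D times the operator of (3)."""
    coeff = -(alpha + beta) ** 2 / (alpha * beta * (w1 - w2))
    out = poly_mul([F(0), coeff], eom_operator(w1, w2))
    out[1] += F(-1)
    return out
```

For any nonzero \(\alpha ,\beta \), the product `poly_mul(anchor(1/alpha, -1/beta, w1, w2), characteristic(alpha, beta, w1, w2))` should coincide with `time_translation_rhs(alpha, beta, w1, w2)`; for \(\alpha =-\beta =1\) the anchor reduces to the identity operator.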

We have observed above that any integral of motion (9) with \(\alpha \beta \ne 0\) can be connected to the time translation by specification of the free parameters in the general Lagrange anchor (14). The Noether energy (13) is mapped to the symmetry by the canonical Lagrange anchor. The positive integrals of motion are mapped to the generator of time translations by the non-canonical Lagrange anchors (14) with \(\rho >0,\sigma <0\).

Let us stress once again that different Lagrange anchors result in different quantizations of one and the same classical system (see Appendix A and [44, 45]). For first-order ODEs, a Lagrange anchor always defines (see Footnote 5) a Poisson bracket on the phase space of the system, while the corresponding energy becomes a Hamiltonian [44, 48]. Once the equations of motion admit several Lagrange anchors, they admit several Poisson brackets and Hamiltonians. If the Hamiltonian is positive, one can expect a bounded spectrum of the energy and quantum stability, while an unbounded energy usually results in quantum instability. Therefore, the choice of the Lagrange anchor and the energy gains importance when quantum stability is concerned.

We do not elaborate here on the generalities of the connection (which is basically one-to-one for ODEs, modulo certain equivalence relations) between the integrable Lagrange anchors and the Poisson brackets; see [44, 48, 49]. We will just explicitly demonstrate that any non-degenerate integral of motion (9) leads to the corresponding Hamiltonian form of dynamics.

Consider the Hamiltonian formulation for the model (2). Following the Ostrogradsky method, we introduce the canonical variables

$$\begin{aligned} q_1&= \phi , \quad q_2=\dot{\phi }, \nonumber \\ p_1&= \frac{\partial L}{\partial \dot{\phi }}-\frac{d}{dt}\frac{\partial L}{\partial \ddot{\phi }} =-\frac{2\dddot{\phi }+(\omega _1^2+\omega _2^2)\dot{\phi }}{2(\omega _1^2-\omega _2^2)} , \nonumber \\ p_2&= \frac{\partial L}{\partial \ddot{\phi }}=\frac{2\ddot{\phi }+(\omega _1^2+\omega _2^2)\phi }{2(\omega _1^2-\omega _2^2)}, \end{aligned}$$
(18)

which have the canonical Poisson brackets

$$\begin{aligned} \{q_i,p_j\}_O\!=\!\delta _{ij}, \quad \{q_i,q_j\}_O=\{p_i,p_j\}_O=0, \quad i,j=1,2. \end{aligned}$$
(19)

Then \(\phi ,\dot{\phi }, \ddot{\phi }, \dddot{\phi }\) can be expressed in terms of the phase-space variables:

$$\begin{aligned}&\phi =q_1, \quad \dot{\phi }\!=\!q_2, \quad \ddot{\phi } =(\omega _1^2-\omega ^2_2)p_2-\frac{1}{2}(\omega _1^2+\omega _2^2)q_1,\nonumber \\&\quad \dddot{\phi }=(\omega _2^2-\omega _1^2)p_1-\frac{1}{2}(\omega _1^2+\omega _2^2)q_2\ . \end{aligned}$$
(20)

The Ostrogradsky Hamiltonian, being the phase-space expression for Noether’s energy (13), reads

$$\begin{aligned} H_O=p_1q_2-\frac{\omega _1^2+\omega _2^2}{2}p_2q_1+ \frac{\omega _1^2-\omega _2^2}{2}\left( p_2^2+\frac{1}{4}{q_1^2}\right) . \end{aligned}$$
(21)

The phase-space variables \(z^I=\{q_1,q_2,p_1,p_2\}\) satisfy the Hamiltonian equations

$$\begin{aligned} \dot{z}^I=\{z^I,H_O\}_O. \end{aligned}$$
(22)

Because of the aforementioned correspondence between the Lagrange anchors in mechanical systems and Poisson structures, the two-parameter set of Lagrange anchors (14) and the energy functions (9) imply the existence of two-parameter sets of Poisson brackets and Hamiltonians. These read

$$\begin{aligned}&\displaystyle \{q_1,q_2\}_{\alpha ,\beta }\!=\!\frac{1}{\alpha }\!+\!\frac{1}{\beta },\quad \displaystyle \{q_1,p_1\}_{\alpha ,\beta }\!=\!\frac{1}{2}\left( \frac{1}{\alpha }-\frac{1}{\beta }\right) , \nonumber \\&\quad \displaystyle \{q_1,p_2\}_{\alpha ,\beta }=0, \nonumber \\&\quad \displaystyle \{q_2,p_1\}_{\alpha ,\beta }=0,\quad \displaystyle \{q_2,p_2\}_{\alpha ,\beta }=\frac{1}{2}\left( \frac{1}{\alpha }-\frac{1}{\beta }\right) , \nonumber \\&\quad \displaystyle \{p_1,p_2\}_{\alpha ,\beta }=\frac{1}{4}\left( \frac{1}{\alpha }+\frac{1}{\beta }\right) , \end{aligned}$$
(23)
$$\begin{aligned}&H_{\alpha ,\beta }=\frac{\alpha }{2}\left[ (p_1+q_2/2)^2+\omega _1^2(p_2-q_1/2)^2\right] \nonumber \\&\qquad \qquad +\, \frac{\beta }{2}\left[ (p_1-q_2/2)^2+\omega _2^2(p_2+q_1/2)^2\right] . \end{aligned}$$
(24)

The Hamiltonians \(H_{\alpha , \beta }\) are obtained from \(E_{\alpha , \beta }\) by expressing \(\phi , \dot{\phi }, \ddot{\phi }, \dddot{\phi }\) in terms of the phase-space variables according to (20). The Ostrogradsky Hamiltonian and bracket correspond to \(\alpha =1, \beta =-1\):

$$\begin{aligned} \{\;\cdot \;,\;\cdot \;\}_O=\{\;\cdot \;,\;\cdot \;\}_{1,-1},\quad H_O=H_{1,-1}. \end{aligned}$$
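The relation between \(H_{\alpha ,\beta }\) and \(E_{\alpha ,\beta }\) can be verified at sample points with exact arithmetic. The sketch below maps \((\phi ,\dot{\phi },\ddot{\phi },\dddot{\phi })\) to the Ostrogradsky variables (18) and compares (24) with (9), and (21) with \(H_{1,-1}\); the squared frequencies and function names are illustrative assumptions.

```python
from fractions import Fraction as F

W1, W2 = F(4), F(1)   # hypothetical values of omega_1^2 and omega_2^2
C = W1 - W2


def to_phase_space(state):
    """Ostrogradsky variables (18) from state = (phi, phi', phi'', phi''')."""
    phi, d1, d2, d3 = state
    p1 = -(2 * d3 + (W1 + W2) * d1) / (2 * C)
    p2 = (2 * d2 + (W1 + W2) * phi) / (2 * C)
    return phi, d1, p1, p2


def energy(state, alpha, beta):
    """Integral of motion (9) in terms of phi and its derivatives."""
    phi, d1, d2, d3 = state
    xi, xi_dot = (d2 + W2 * phi) / (W2 - W1), (d3 + W2 * d1) / (W2 - W1)
    eta, eta_dot = (d2 + W1 * phi) / (W1 - W2), (d3 + W1 * d1) / (W1 - W2)
    return (alpha * (xi_dot**2 + W1 * xi**2) / 2
            + beta * (eta_dot**2 + W2 * eta**2) / 2)


def hamiltonian(z, alpha, beta):
    """Hamiltonian (24) on z = (q1, q2, p1, p2)."""
    q1, q2, p1, p2 = z
    return (alpha * ((p1 + q2 / 2) ** 2 + W1 * (p2 - q1 / 2) ** 2) / 2
            + beta * ((p1 - q2 / 2) ** 2 + W2 * (p2 + q1 / 2) ** 2) / 2)


def ostrogradsky_hamiltonian(z):
    """Ostrogradsky Hamiltonian (21)."""
    q1, q2, p1, p2 = z
    return p1 * q2 - (W1 + W2) / 2 * p2 * q1 + C / 2 * (p2**2 + q1**2 / 4)
```

The inputs are expected to be `Fraction` values so that all comparisons are exact.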

Notice that the brackets and Hamiltonians with different \(\alpha , \beta \) are not obtained from each other by canonical transformations. This is an obvious fact because the brackets between the same variables essentially depend on the parameters. For example, the original coordinate \(q_1=\phi \) Poisson commutes with the velocity \(q_2=\dot{\phi }\) once \(\alpha =-\beta \), while they are conjugate when \(\alpha =\beta \); \(q_1=\phi \) is conjugate to

$$\begin{aligned} p_1=-\frac{2\dddot{\phi }+(\omega _1^2+\omega _2^2)\dot{\phi }}{2(\omega _1^2-\omega _2^2)} \end{aligned}$$

with respect to the bracket (23) once \(\alpha =-\beta \), while they commute when \(\alpha =\beta \). However, for any \(\alpha , \beta \), the corresponding Hamiltonian equations with the brackets \(\{ \cdot , \cdot \}_{\alpha , \beta }\) and the Hamiltonians \(H_{\alpha , \beta }\) coincide with each other, and in particular with the Ostrogradsky system, i.e.,

$$\begin{aligned} \dot{z}^I=\{z^I,H_{\alpha ,\beta }\}_{\alpha ,\beta }\equiv \{z^I, H_O\}_O, \quad \forall \alpha \ne 0,\quad \forall \beta \ne 0. \end{aligned}$$
(25)
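The coincidence of the Hamiltonian flows in (25) can be checked pointwise. In the sketch below, the bracket (23) is stored as an antisymmetric matrix in the ordering \(z=(q_1,q_2,p_1,p_2)\), the gradient of (24) is written out by hand, and the result is compared with the equations generated by (19) and (21); names and sample values are hypothetical.

```python
from fractions import Fraction as F

W1, W2 = F(4), F(1)   # hypothetical values of omega_1^2 and omega_2^2
C = W1 - W2


def poisson_matrix(alpha, beta):
    """Antisymmetric matrix of the brackets (23) for z = (q1, q2, p1, p2)."""
    s = F(1) / alpha + F(1) / beta
    d = F(1) / alpha - F(1) / beta
    upper = {(0, 1): s, (0, 2): d / 2, (0, 3): F(0),
             (1, 2): F(0), (1, 3): d / 2, (2, 3): s / 4}
    m = [[F(0)] * 4 for _ in range(4)]
    for (i, j), value in upper.items():
        m[i][j], m[j][i] = value, -value
    return m


def grad_hamiltonian(z, alpha, beta):
    """Partial derivatives of (24) with respect to (q1, q2, p1, p2)."""
    q1, q2, p1, p2 = z
    a, b = p1 + q2 / 2, p1 - q2 / 2
    u, v = p2 - q1 / 2, p2 + q1 / 2
    return [-alpha * W1 * u / 2 + beta * W2 * v / 2,
            alpha * a / 2 - beta * b / 2,
            alpha * a + beta * b,
            alpha * W1 * u + beta * W2 * v]


def flow(z, alpha, beta):
    """Time derivatives z' = {z, H_(alpha,beta)}_(alpha,beta)."""
    m, g = poisson_matrix(alpha, beta), grad_hamiltonian(z, alpha, beta)
    return [sum(m[i][j] * g[j] for j in range(4)) for i in range(4)]


def ostrogradsky_flow(z):
    """Equations (22) generated by the canonical bracket (19) and (21)."""
    q1, q2, p1, p2 = z
    return [q2, C * p2 - (W1 + W2) * q1 / 2, (W1 + W2) * p2 / 2 - C * q1 / 4, -p1]
```

For any nonzero `alpha` and `beta`, `flow(z, alpha, beta)` should agree with `ostrogradsky_flow(z)`, provided `z` is given in `Fraction` values.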

Thus, the phase-space equations of the PU oscillator admit a two-parameter set of brackets and Hamiltonians.

For \(\alpha > 0\), \(\beta >0\) (which corresponds to \(H_{\alpha , \beta }>0\)) the special coordinates can be introduced by

$$\begin{aligned} \begin{array}{ll} \displaystyle \pi _\xi =\sqrt{\alpha }(p_1+q_2/2),&{}\quad \displaystyle \chi _\xi \equiv \sqrt{\alpha }\xi =\sqrt{\alpha }(q_1/2-p_2),\\ \displaystyle \pi _\eta =\sqrt{\beta }(q_2/2-p_1),&{}\quad \displaystyle \chi _\eta \equiv \sqrt{\beta }\eta =\sqrt{\beta }(p_2+q_1/2). \end{array} \end{aligned}$$
(26)

In these coordinates, the brackets (23) take the canonical form

$$\begin{aligned}&\{\chi _i,\pi _j\}_{\alpha ,\beta }=\delta _{ij}, \quad \{\chi _i,\chi _j\}_{\alpha ,\beta }= \{\pi _i,\pi _j\}_{\alpha ,\beta }=0,\nonumber \\&\quad i,j=\xi ,\eta . \end{aligned}$$
(27)

The Hamiltonian (24) reduces to that of a two-dimensional harmonic oscillator, namely,

$$\begin{aligned} H_{\alpha ,\beta }=\frac{1}{2}\left( {\pi _\xi ^2}+\omega _1^2\chi _\xi ^2\right) + \frac{1}{2}\left( {\pi _\eta ^2}+\omega _2^2\chi _\eta ^2\right) . \end{aligned}$$
(28)
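Because the coordinates (26) are linear in \(z=(q_1,q_2,p_1,p_2)\), the canonical relations (27) reduce to bilinear expressions in the bracket matrix (23). The following sketch evaluates them in floating point; the coefficient vectors are read off from (26), and the parameter values and function names are assumptions for illustration.

```python
import math


def poisson_matrix(alpha, beta):
    """Antisymmetric matrix of the brackets (23) for z = (q1, q2, p1, p2)."""
    s = 1.0 / alpha + 1.0 / beta
    d = 1.0 / alpha - 1.0 / beta
    upper = {(0, 1): s, (0, 2): d / 2, (0, 3): 0.0,
             (1, 2): 0.0, (1, 3): d / 2, (2, 3): s / 4}
    m = [[0.0] * 4 for _ in range(4)]
    for (i, j), value in upper.items():
        m[i][j], m[j][i] = value, -value
    return m


def bracket(f, g, m):
    """Bracket of two linear functions given by their coefficient vectors."""
    return sum(f[i] * m[i][j] * g[j] for i in range(4) for j in range(4))


def canonical_coordinates(alpha, beta):
    """Coefficient vectors of (26); requires alpha > 0 and beta > 0."""
    ra, rb = math.sqrt(alpha), math.sqrt(beta)
    chi = [[ra / 2, 0.0, 0.0, -ra],   # chi_xi = sqrt(alpha) (q1/2 - p2)
           [rb / 2, 0.0, 0.0, rb]]    # chi_eta = sqrt(beta) (p2 + q1/2)
    pi = [[0.0, ra / 2, ra, 0.0],     # pi_xi = sqrt(alpha) (p1 + q2/2)
          [0.0, rb / 2, -rb, 0.0]]    # pi_eta = sqrt(beta) (q2/2 - p1)
    return chi, pi
```

With these vectors, `bracket(chi[i], pi[j], m)` should be close to the Kronecker delta, while the brackets among the `chi` and among the `pi` should vanish up to rounding.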

If the PU oscillator is quantized with the Hamiltonian (24) by imposing the commutation relations according to the corresponding bracket (23) with \(\alpha > 0, \ \beta >0\), this is equivalent to canonical quantization with the canonical bracket (27) and Hamiltonian (28). This means that the quantum theory with the non-canonical Lagrange anchor leads to a positive energy spectrum, while the canonical choice results in a spectrum unbounded from below.

Let us summarize the conclusions made in this section that apply (as we will see in the next sections) to a wide class of higher-derivative dynamics. Once the free higher-derivative system admits factorization, it turns out to be classically stable, because there exists a two-parameter family of conserved quantities that includes bounded functions. The model was shown to admit a two-parameter family of Lagrange anchors that connect the conserved quantities with the symmetry of the system under time translation. This allows one to consider any of the integrals as the energy. As we have seen, the diversity of the Lagrange anchors admitted by the higher-derivative dynamics makes it possible to choose between inequivalent quantizations. It turns out that the classical stability can be retained at the quantum level by an appropriate choice of the Lagrange anchor.

In the next section, we generalize these observations to a broad class of interacting higher-derivative systems. An example of an interaction that does not break the stability of the PU oscillator will be provided. Then, in Sect. 4, we will consider examples of stability in higher-derivative field theories.

3 Nonlinear factorization

In this section, we formulate the general pattern for factorizing not necessarily linear higher-derivative systems. This pattern can be seen in its simplest form already from the example of the PU oscillator. Once the higher-derivative dynamics is factorized in this sense, the stability turns out to be a common occurrence as much as it happens in the usual dynamics without higher derivatives. As we will demonstrate, many of the higher-derivative systems of this class appear to be stable, though their canonical energy is unbounded from below.

Suppose that \(\xi \), \(\eta \), and \(\phi \) are \(n\)-component fields on space-time with local coordinates \(\{x^\mu \}\). Given the \(n\times n\) matrix differential operator \(\mathcal {P}\), define \(\mathcal {Q}\) by the relation (see Footnote 6)

$$\begin{aligned} 1=\mathcal {P}+\mathcal {Q}. \end{aligned}$$
(29)

Clearly, \([\mathcal {P}, \mathcal {Q}]=0\). Using these operators and an arbitrary vector-valued nonlinear differential operator \(\mathcal {F}\), we can define two systems of field equations. The first one includes two groups of equations,

$$\begin{aligned} \mathcal {P}\xi + \mathcal {F}(\xi , \eta )=0,\quad \mathcal {Q}\eta + \mathcal {F}(\xi ,\eta )=0, \end{aligned}$$
(30)

while the second system is given by

$$\begin{aligned} \mathcal {P}\mathcal {Q}\phi + \mathcal {F}(\mathcal {Q}\phi ,\mathcal {P}\phi )=0. \end{aligned}$$
(31)

It is easy to check that the relations

$$\begin{aligned} \xi =\mathcal {Q}\phi ,\quad \eta =\mathcal {P}\phi ,\quad \phi =\xi +\eta \, \end{aligned}$$
(32)

establish a one-to-one correspondence between solutions of both systems. So, the systems (30) and (31) are equivalent and may be thought of as two different representations of one and the same theory. We will refer to them as \(\xi \eta \)- and \(\phi \)-representations. The PU oscillator provides the simplest example of factorization with \(\mathcal {F}=0\), cf. (3), (4), and (5).
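The check behind this equivalence takes only a few lines. The following is a sketch that uses nothing beyond the relation (29) and the linearity of \(\mathcal {P}\) and \(\mathcal {Q}\); the arrangement of the steps is ours.

$$\begin{aligned}&(31)\Rightarrow (30):\quad \xi =\mathcal {Q}\phi ,\ \eta =\mathcal {P}\phi \ \Rightarrow \ \mathcal {P}\xi +\mathcal {F}(\xi ,\eta )=\mathcal {Q}\eta +\mathcal {F}(\xi ,\eta )=\mathcal {P}\mathcal {Q}\phi +\mathcal {F}(\mathcal {Q}\phi ,\mathcal {P}\phi )=0; \nonumber \\&(30)\Rightarrow (31):\quad \phi =\xi +\eta ,\ \mathcal {P}\xi =\mathcal {Q}\eta =-\mathcal {F}(\xi ,\eta ) \ \Rightarrow \ \mathcal {Q}\phi =(1-\mathcal {P})\xi +\mathcal {Q}\eta =\xi ,\quad \mathcal {P}\phi =\mathcal {P}\xi +(1-\mathcal {Q})\eta =\eta , \nonumber \\&\qquad \qquad \qquad \text {so that}\quad \mathcal {P}\mathcal {Q}\phi +\mathcal {F}(\mathcal {Q}\phi ,\mathcal {P}\phi )=\mathcal {P}\xi +\mathcal {F}(\xi ,\eta )=0. \end{aligned}$$

In the first line \([\mathcal {P},\mathcal {Q}]=0\) is used to identify \(\mathcal {Q}\mathcal {P}\phi \) with \(\mathcal {P}\mathcal {Q}\phi \); in the second line the two equations (30) are subtracted from each other to obtain \(\mathcal {P}\xi =\mathcal {Q}\eta \).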

The \(\xi \eta \)-representation (30) may be viewed as a special way to decrease the order of the system (31). For example, if \(\mathcal P\) is of the second order, and \(\mathcal {F}\) is algebraic, then the fourth-order equations (31) are equivalent to the second-order equations (30). The operator \(\mathcal {F}\) can be considered as an interaction included (see Footnote 7) into the free system \(\mathcal {P}\mathcal {Q}\phi =0\). In this way, the factorization can still be efficient for keeping track of stability in the interacting higher-derivative dynamics.

Let us assume that \(\mathcal {P}^\dag =\mathcal {P}\) and construct \(\mathcal {F}(\xi ,\eta )\) in the following way. Given a function \(U(\phi ,\partial \phi , \partial ^2\phi , \ldots ,\partial ^N\phi )\), consider its Euler–Lagrange derivative for brevity denoted by

$$\begin{aligned} U'=\sum _{k=0}^N(-1)^k\frac{\partial ^k}{\partial x^{\mu _{1}}\ldots \partial x^{\mu _{k}}} \frac{\partial U}{\partial (\partial _{\mu _{1}}\ldots \partial _{\mu _{k}}\phi )}. \end{aligned}$$

The nonlinearity \(\mathcal {F}\) in (30) can be chosen as

$$\begin{aligned} \mathcal {F}(\xi ,\eta )=-U'|_{\phi \rightarrow \alpha \xi -\beta \eta }, \end{aligned}$$
(33)

with \(\alpha \) and \(\beta \) being nonzero constants. Then the system (30) comes from the least action principle for

$$\begin{aligned}&S_1 \ [\xi (x),\eta (x)] =\int L_1 \mathrm{d}x, \nonumber \\&\quad L_1=\frac{\alpha }{2}\xi \mathcal {P}\xi -\frac{\beta }{2}\eta \mathcal {Q}\eta -U(\alpha \xi -\beta \eta ), \end{aligned}$$
(34)

while (31) is not necessarily variational. For the special nonlinearity (33), (30) takes the form

$$\begin{aligned}&\frac{\delta S_1}{\delta \xi }\equiv \alpha (\mathcal {P}\xi -U'(\alpha \xi - \beta \eta ))=0, \nonumber \\&\quad \frac{\delta S_1}{\delta \eta }\equiv -\beta (\mathcal {Q}\eta -U'(\alpha \xi - \beta \eta ))=0, \end{aligned}$$
(35)

and (31) reads

$$\begin{aligned} \mathcal {P}\mathcal {Q}\phi - U'(\alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi )=0. \end{aligned}$$
(36)

In some cases, the dynamical equations (35) and (36) should be multiplied by an overall dimensional constant to ensure the proper dimension of the action (34). For example, for the PU oscillator (2), it is convenient to take this factor as \(\omega _2^2-\omega _1^2\). Once the dimensional coefficient is introduced, all the expressions in this section for the actions, the equations of motion, and the conserved currents are to be multiplied by this constant, while the characteristics, symmetries, and Lagrange anchors remain intact. As the dimensional coefficient adds no essential generality but complicates the explicit expressions, it is omitted from most of the expressions.

The least action principle for (35) does not necessarily make (36) Lagrangian. The obvious variational vertex \(\mathcal {F}(\mathcal {Q}\phi ,\mathcal {P}\phi )=-U'(\phi )\) corresponds to the special choice of constants \(\alpha =-\beta =1\). The corresponding action reads

$$\begin{aligned} S_2[\phi (x)]=\int L_2 \mathrm{d}x, \quad L_2=\frac{1}{2}\phi \mathcal {P}\mathcal {Q}\phi -U(\phi ). \end{aligned}$$
(37)
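To see why this particular choice is variational, it may help to write the substitution out. This is a short sketch; it assumes only (29), \(\mathcal {P}^\dag =\mathcal {P}\), and \([\mathcal {P},\mathcal {Q}]=0\), so that \(\mathcal {P}\mathcal {Q}\) is self-adjoint as well.

$$\begin{aligned} \alpha =1,\ \beta =-1:\quad \alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi =(\mathcal {Q}+\mathcal {P})\phi =\phi ,\qquad \frac{\delta S_2}{\delta \phi }=\mathcal {P}\mathcal {Q}\phi -U'(\phi ). \end{aligned}$$

With these values Eq. (36) coincides with \(\delta S_2/\delta \phi =0\). For any other ratio of \(\alpha \) and \(\beta \) the argument of \(U'\) in (36) involves \(\mathcal {P}\phi \) and \(\mathcal {Q}\phi \) separately, and the equation need not follow from an action of the form (37).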

If the action (34) is invariant under the space-time translations \(x^\mu \rightarrow x^\mu -\varepsilon ^\mu \), then (by the Noether theorem (11)) the system of equations (35) admits the conserved current \(J(\xi ,\eta )\) such that

$$\begin{aligned} \partial _\mu J^\mu =-\varepsilon ^\mu \partial _\mu \xi \frac{\delta S_1}{\delta \xi }-\varepsilon ^\mu \partial _\mu \eta \frac{\delta S_1}{\delta \eta }. \end{aligned}$$
(38)

It is expressible through the canonical energy-momentum tensor as

$$\begin{aligned} J^\mu =\Theta ^\mu _{~\nu }\varepsilon ^\nu , \end{aligned}$$
(39)

where

$$\begin{aligned}&\Theta ^\mu _{~\nu }(\xi ,\eta ) \nonumber \\&\quad =\sum _{\phi =\xi ,\eta }\sum _{k=1}^{N} \left[ (\partial _{\mu _1}\ldots \partial _{\mu _{k-1}}\partial _\nu \phi ) \sum _{m=k}^{N}(-1)^{(m-k)}\partial _{\mu _k} \right. \nonumber \\&\left. \qquad \times \ldots \partial _{\mu _{m-1}}\frac{\partial L_1}{\partial (\partial _{\mu _1}\ldots \partial _{\mu _{m-1}}\partial _\mu \phi )}\right] -\delta ^\mu _\nu L_1. \end{aligned}$$
(40)

Here, the sums by \(k\) and \(m\) run up to the maximal order of derivatives \(N\) entering the Lagrangian (34). The energy-momentum tensor is given by the sum

$$\begin{aligned} \Theta ^\mu _{~\nu }(\xi ,\eta )=\alpha (\Theta _\mathcal {P}){}^\mu _{~\nu }(\xi )- \beta (\Theta _\mathcal {Q}){}^\mu _{~\nu }(\eta )+(\Theta _U){}^\mu _{~\nu }(\xi ,\eta ), \end{aligned}$$
(41)

where \((\Theta _\mathcal {P}){}^\mu _{~\nu }\) and \((\Theta _\mathcal {Q}){}^\mu _{~\nu }\) are the energy-momentum tensors for the Lagrangian free theories \(\mathcal {P}\xi =0\) and \(\mathcal {Q}\eta =0\), while the term \((\Theta _U){}^\mu _{~\nu }\) is the energy-momentum tensor of “interaction”. By construction, the component \(\Theta ^0_{~0}\) has the meaning of the energy density of the theory (35), so that the total energy of the system is given by the integral \(E=\int _{\text {space}} \Theta ^0_{~0}\). The stability of the theory (35) is provided by the condition \(\Theta ^0_{~0}\ge 0\).

An alternative analysis of stability can be made by switching to the Hamiltonian formalism for the theory (34). The stability of the theory (35) is guaranteed if the Hamiltonian \(H=E\) is positive definite. This approach may be convenient for the theories whose lower-order Lagrangian formulations (34) are well studied. As an example we can mention the conformal higher-spin fields [37].

Let us now prove that in the \(\phi \)-representation the energy-momentum tensor (40) is also associated with the space-time translations. This tensor can ensure stability of the theory (36) much like the canonical energy-momentum tensor does in the usual theory without higher derivatives. Substituting \(\phi \) into (39) by the rule (32), we find that the tensor \(\Theta ^\mu _{~\nu }(\mathcal {Q}\phi ,\mathcal {P}\phi )\) is conserved,

$$\begin{aligned}&\partial _\mu \Theta ^\mu _{~\nu }(\mathcal {Q}\phi ,\mathcal {P}\phi )= \left[ \partial _{\nu }(\beta \mathcal {P}-\alpha \mathcal {Q})\phi \right] \nonumber \\&\quad \times \left[ \mathcal {P}\mathcal {Q}\phi -U'(\alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi )\right] , \end{aligned}$$
(42)

and the corresponding characteristic reads

$$\begin{aligned} Q_\nu =\partial _\nu (\beta \mathcal {P}-\alpha \mathcal {Q})\phi . \end{aligned}$$
(43)
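The step from the Noether identity (38) to the relation (42) can be spelled out as follows. The sketch uses only (35), (38), (39), and the substitution rule (32); since \(\varepsilon ^\nu \) is an arbitrary constant vector, it can be stripped off on both sides.

$$\begin{aligned} \partial _\mu \Theta ^\mu _{~\nu }(\xi ,\eta )&= -\partial _\nu \xi \,\frac{\delta S_1}{\delta \xi }-\partial _\nu \eta \,\frac{\delta S_1}{\delta \eta } \nonumber \\&= -\alpha \,\partial _\nu \xi \left[ \mathcal {P}\xi -U'(\alpha \xi -\beta \eta )\right] +\beta \,\partial _\nu \eta \left[ \mathcal {Q}\eta -U'(\alpha \xi -\beta \eta )\right] . \end{aligned}$$

For \(\xi =\mathcal {Q}\phi \), \(\eta =\mathcal {P}\phi \) both square brackets become \(\mathcal {P}\mathcal {Q}\phi -U'(\alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi )\), and collecting the prefactors gives \(\partial _\nu (\beta \mathcal {P}-\alpha \mathcal {Q})\phi \), which is the characteristic (43).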

Obviously, \(\Theta ^0_{~0}(\xi ,\eta )\ge 0\) implies \(\Theta ^0_{~0}(\mathcal {Q}\phi ,\mathcal {P}\phi )\ge 0\).

Notice that the order of the variational equations (35) may be lower than the order of equations (36). For this reason, the use of the variational formulation (34) allows one to bypass the obstructions to the existence of a positive definite energy in theories with higher derivatives. For example, if the differential operators \(\mathcal {P}\) and \(\mathcal {Q}\) are of the second order, then a positive definite energy density may exist even if the theory (36) is nonsingular. On the other hand, the use of the Noether theorem for the construction of conservation laws sets a natural upper bound for the order of the action (34). This suggests concentrating on the theories (36) for which the operators \(\mathcal {P}\), \(\mathcal {Q}\) are at most of the second order and \(U=U(\phi ,\partial \phi )\) depends on at most first derivatives of the field. However, if higher-derivative models (34) with positive definite Noether energy are found in the future, our construction will be applicable to them as well.

More information about the stability of the theory (36) may be obtained if the structure of the energy-momentum tensor (40) is taken into account. For example, if the two factors are stable (i.e., \(\alpha (\Theta _\mathcal {P}){}^0_{~0}, -\beta (\Theta _\mathcal {Q}){}^0_{~0}\ge 0\) for some values of \(\alpha \) and \(\beta \)) and \((\Theta _U){}^0_{~0}\ge 0\), the theory (30) is stable. This fact can be used for a systematic construction of stable interacting higher-derivative theories. If both factors are stable, but the interaction term is not positive definite, the energy can still have a local minimum in a neighborhood of the zero solution. Such theories with “locally stable” behavior are also considered as physically acceptable models. They can be studied within perturbation theory. Examples are known of locally stable models with not necessarily positive energy [11, 13, 22, 23]. In such theories with “benign ghosts” we can expect the existence of a (yet unknown) Lagrange anchor and an alternative positive definite conserved energy. In other cases, the stability of a theory cannot be guaranteed even in a small neighborhood of the vacuum solution. The theories of this type are branded as having “malicious ghosts” [11] and cannot be considered as physical.

Whenever the system of equations (36) is not variational, the relationship between the conserved tensor (41) and the space-time translations can be established by the Lagrange anchor. In Appendix D we find that for factorizable systems the Lagrange anchor reads

$$\begin{aligned} V=\frac{1}{\alpha }\mathcal {Q}-\frac{1}{\beta }\mathcal {P}+ \frac{(\alpha +\beta )^2}{\alpha \beta }U''(\alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi ). \end{aligned}$$
(44)

The action of the matrix differential operator \(U''\) on an arbitrary characteristic \(Q(\phi (x))\) is defined by

$$\begin{aligned} U''(\phi )Q=\int \mathrm{d}x\frac{\delta U'(\phi )}{\delta \phi (x)}Q(\phi (x)). \end{aligned}$$
(45)

Verification of the defining property (6.10) for the Lagrange anchor (44) requires some technical details provided in Appendix D. Applying (44) to the characteristic (43), we get the space-time translation symmetry

$$\begin{aligned} \displaystyle \delta _\varepsilon \phi&= \varepsilon ^\nu V (Q_\nu )\nonumber \\&= \varepsilon ^\nu \left( \frac{1}{\alpha }\mathcal {Q}-\frac{1}{\beta }\mathcal {P}+ \frac{(\alpha +\beta )^2}{\alpha \beta }U''(\alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi )\right) \nonumber \\&\times (\beta \mathcal {P}-\alpha \mathcal {Q})\partial _\nu \phi \nonumber \\&= \left( \frac{1}{\alpha }\mathcal {Q}-\frac{1}{\beta }\mathcal {P}\right) (\beta \mathcal {P}-\alpha \mathcal {Q})\varepsilon ^\nu \partial _\nu \phi - \frac{(\alpha +\beta )^2}{\alpha \beta } \nonumber \\&\times \, U''(\alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi ) \varepsilon ^\nu \partial _\nu (\alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi ) \nonumber \\&= -\varepsilon ^\nu \partial _\nu \phi +\frac{(\alpha +\beta )^2}{\alpha \beta }\varepsilon ^\nu \partial _\nu \left( \mathcal {Q}\mathcal {P} \phi \right. \nonumber \\&\left. -\,U'(\alpha \mathcal {Q}\phi -\beta \mathcal {P}\phi )\right) \approx -\varepsilon ^\nu \partial _\nu \phi . \end{aligned}$$
(46)

This relation allows us to identify the conserved current (42) with the energy-momentum current of the theory (36).

Let us illustrate the general construction above by the example of the PU oscillator. The operators \(\mathcal {P}\) and \(\mathcal {Q}\) now take the form

$$\begin{aligned} \mathcal {P}=\frac{1}{\omega _1^2-\omega _2^2}\left( \frac{\mathrm{d}^2}{\mathrm{d}t^2}+\omega ^2_1\right) ,\quad \!\! \mathcal {Q}=\frac{1}{\omega _2^2-\omega _1^2}\left( \frac{\mathrm{d}^2}{\mathrm{d}t^2}+\omega ^2_2\right) \!.\nonumber \\ \end{aligned}$$
(47)

Upon substituting (47) into (36) and multiplying by the overall factor \(\omega ^2_2-\omega ^2_1\), we get the following equation of motion:

$$\begin{aligned} T&\equiv \frac{1}{\omega _1^2-\omega _2^2}\left( \frac{\mathrm{d}^2}{\mathrm{d}t^2} +\omega _1^2\right) \left( \frac{\mathrm{d}^2}{\mathrm{d}t^2}+\omega _2^2\right) \phi \nonumber \\&\quad -\, U'\left( \frac{(\alpha +\beta )\ddot{\phi }+(\alpha \omega _2^2+\beta \omega _1^2)\phi }{\omega _2^2-\omega _1^2}\right) =0. \end{aligned}$$
(48)

For simplicity’s sake we assume the function \(U(\phi )\) to depend on \(\phi \) but not on its derivatives, so that \(U'=\mathrm{d} U(\phi )/\mathrm{d}\phi \). The two-parameter family of integrals of motion reads

$$\begin{aligned} E=E_{\alpha ,\beta }+U\left( \frac{(\alpha +\beta )\ddot{\phi }+(\alpha \omega _2^2+\beta \omega _1^2)\phi }{\omega _2^2-\omega _1^2}\right) , \end{aligned}$$
(49)

where \(E_{\alpha ,\beta }\) is defined by (9). One can easily check that

$$\begin{aligned} \frac{\mathrm{d}E}{\mathrm{d}t}=Q T,\quad Q=\frac{(\alpha +\beta )\dddot{\phi }+(\alpha \omega _2^2+\beta \omega _1^2)\dot{\phi }}{\omega _1^2-\omega _2^2}. \end{aligned}$$
(50)

Expression (49) is positive definite whenever \(\alpha ,\beta >0\) and \(U\ge 0\). In that case the motion is bounded for any initial data. To the best of our knowledge this is the first example of the self-interacting PU oscillator whose classical stability can be proved analytically for all initial data. In the previously known examples of interactions [11, 22] boundedness of the motion has been demonstrated by numerical computations.

To conclude the consideration of the fourth-order formulation (48) let us write out the Lagrange anchor

$$\begin{aligned} \displaystyle V&= \displaystyle \frac{1}{\alpha }\frac{1}{\omega _2^2-\omega _1^2}\left( \frac{\mathrm{d}^2~}{\mathrm{d}t^2}+\omega _2^2\right) + \frac{1}{\beta }\frac{1}{\omega _2^2-\omega _1^2}\left( \frac{\mathrm{d}^2~}{\mathrm{d}t^2}+\omega _1^2\right) \nonumber \\&\displaystyle +\frac{1}{\omega ^2_2-\omega _1^2}\frac{(\alpha +\beta )^2}{\alpha \beta }U''\left( \frac{(\alpha +\beta )\ddot{\phi }+(\alpha \omega _2^2+\beta \omega _1^2)\phi }{\omega _2^2\!-\!\omega _1^2}\right) , \nonumber \\&\displaystyle U''=\frac{\mathrm{d}^2 U(\phi )}{\mathrm{d}\phi ^2}, \end{aligned}$$
(51)

and the corresponding time-translation symmetry

$$\begin{aligned} \delta _\varepsilon \phi =\varepsilon V(Q)=-\varepsilon \dot{\phi }-\frac{(\alpha +\beta )^2}{\alpha \beta } \frac{\varepsilon }{\omega _1^2-\omega _2^2}\frac{\mathrm{d}T}{\mathrm{d}t}. \end{aligned}$$
(52)
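The free (\(U\)-independent) part of the computation leading to (52) rests on the operator identity \((\mathcal {Q}/\alpha -\mathcal {P}/\beta )(\beta \mathcal {P}-\alpha \mathcal {Q})=-1+\frac{(\alpha +\beta )^2}{\alpha \beta }\mathcal {P}\mathcal {Q}\), which follows from \(\mathcal {P}+\mathcal {Q}=1\). The sketch below checks it for the operators (47) with exact rational arithmetic. It is an illustration only: the operators are represented as polynomials in the symbol \(s=\mathrm{d}^2/\mathrm{d}t^2\) (legitimate because constant-coefficient operators commute), and all function names are hypothetical helpers, not part of the original construction.

```python
from fractions import Fraction


def poly_add(a, b):
    """Add two polynomials given as coefficient lists [c0, c1, ...] in s."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def poly_scale(c, a):
    """Multiply a polynomial by a rational constant."""
    return [Fraction(c) * x for x in a]


def poly_mul(a, b):
    """Multiply two polynomials in s."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def anchor_identity_residual(w1sq, w2sq, alpha, beta):
    """Coefficients of (Q/alpha - P/beta)(beta P - alpha Q) + 1 - k P Q,
    with k = (alpha + beta)^2 / (alpha beta) and P, Q taken from Eq. (47).
    The identity holds if every returned coefficient is zero."""
    w1sq, w2sq, alpha, beta = (Fraction(x) for x in (w1sq, w2sq, alpha, beta))
    # P = (s + w1^2)/(w1^2 - w2^2),  Q = (s + w2^2)/(w2^2 - w1^2)
    P = poly_scale(1 / (w1sq - w2sq), [w1sq, Fraction(1)])
    Q = poly_scale(1 / (w2sq - w1sq), [w2sq, Fraction(1)])
    # Relation (29): P + Q = 1
    assert poly_add(P, Q) == [Fraction(1), Fraction(0)]
    v_free = poly_add(poly_scale(1 / alpha, Q), poly_scale(-1 / beta, P))
    characteristic = poly_add(poly_scale(beta, P), poly_scale(-alpha, Q))
    k = (alpha + beta) ** 2 / (alpha * beta)
    lhs = poly_mul(v_free, characteristic)
    rhs = poly_add([Fraction(-1)], poly_scale(k, poly_mul(P, Q)))
    return poly_add(lhs, poly_scale(-1, rhs))


if __name__ == "__main__":
    print(anchor_identity_residual(1, 4, 2, 3))
```

A vanishing residual means that, acting on \(\dot{\phi }\), the free part of the anchor returns \(-\dot{\phi }\) plus a multiple of \(\mathcal {P}\mathcal {Q}\dot{\phi }\), which is the structure seen in (52).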

The Hamiltonian formulation for the fourth-order theory (48) can be derived with the help of the auxiliary action (34). In our case, it takes the form

$$\begin{aligned} S_1&= \int L_1 \mathrm{d}t,\quad L_1= \frac{\alpha }{2}(\dot{\xi }^2-\omega _1^2\xi ^2)+\frac{\beta }{2}(\dot{\eta }^2-\omega _2^2\eta ^2) \nonumber \\&-\, U(\alpha \xi -\beta \eta ). \end{aligned}$$
(53)

Introducing the canonical momenta

$$\begin{aligned} p_\xi \equiv \frac{\partial L}{\partial \dot{\xi }}=\alpha \dot{\xi },\quad p_\eta \equiv \frac{\partial L}{\partial \dot{\eta }}=\beta \dot{\eta }, \end{aligned}$$
(54)

and performing the Legendre transformation, we obtain the Hamiltonian

$$\begin{aligned} H=\frac{1}{2}\left( \frac{p_\xi ^2}{\alpha }+\alpha \omega _1^2\xi ^2\right) + \frac{1}{2}\left( \frac{p_\eta ^2}{\beta }+\beta \omega _2^2\eta ^2\right) +U(\alpha \xi -\beta \eta ). \end{aligned}$$
(55)

Obviously, the Hamiltonian (55) is positive definite simultaneously with the energy (49). The canonical transformation (26)

$$\begin{aligned} \pi _\xi =\frac{p_\xi }{\sqrt{\alpha }},\quad \pi _\eta =\frac{p_\eta }{\sqrt{\beta }},\quad \chi _\xi =\sqrt{\alpha }\xi ,\quad \chi _\eta =\sqrt{\beta }\eta \end{aligned}$$
(56)

brings the Hamiltonian to the form

$$\begin{aligned} H=H_{\alpha ,\beta }+U\left( \sqrt{\alpha }\chi _\xi -\sqrt{\beta }\chi _\eta \right) . \end{aligned}$$
(57)

As is seen, the Hamiltonian (57) is a deformation of the free Hamiltonian (28). Quantizing this theory in the usual way by introducing creation–annihilation operators, we arrive at a quantum theory with a well-defined ground state and a positive energy spectrum.
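The classical boundedness claimed above can be probed numerically in the second-order formulation. The following is a minimal sketch, not the authors' computation: it integrates the equations of motion that follow from (53) with a fourth-order Runge–Kutta scheme, monitors the energy (55) written in terms of velocities, and compares the amplitudes with the bounds implied by \(H\ge \frac{\alpha }{2}\omega _1^2\xi ^2\) and \(H\ge \frac{\beta }{2}\omega _2^2\eta ^2\). The quartic potential \(U(s)=g s^4/4\), the parameter values, the initial data, and the function name `simulate` are all assumptions made for illustration.

```python
import math


def simulate(alpha=1.0, beta=2.0, w1=1.0, w2=2.0, g=0.5,
             state=(1.0, 0.0, -0.5, 0.3), dt=1e-3, steps=20000):
    """Integrate the xi-eta system of Eq. (53) with U(s) = g*s**4/4 (assumed).

    state = (xi, d(xi)/dt, eta, d(eta)/dt). Returns the initial energy,
    the largest energy deviation, and the largest |xi|, |eta| reached,
    together with the analytic amplitude bounds.
    """
    def potential(s):
        return 0.25 * g * s ** 4

    def d_potential(s):
        return g * s ** 3

    def rhs(y):
        xi, v_xi, eta, v_eta = y
        s = alpha * xi - beta * eta
        # Euler-Lagrange equations of (53), divided by alpha and beta
        return (v_xi, -w1 ** 2 * xi - d_potential(s),
                v_eta, -w2 ** 2 * eta + d_potential(s))

    def energy(y):
        xi, v_xi, eta, v_eta = y
        # Hamiltonian (55) with p_xi = alpha*v_xi, p_eta = beta*v_eta
        return (0.5 * alpha * (v_xi ** 2 + w1 ** 2 * xi ** 2)
                + 0.5 * beta * (v_eta ** 2 + w2 ** 2 * eta ** 2)
                + potential(alpha * xi - beta * eta))

    def shifted(y, k, h):
        return tuple(a + h * b for a, b in zip(y, k))

    y = tuple(state)
    e0 = energy(y)
    max_drift = 0.0
    max_xi, max_eta = abs(y[0]), abs(y[2])
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(shifted(y, k1, dt / 2))
        k3 = rhs(shifted(y, k2, dt / 2))
        k4 = rhs(shifted(y, k3, dt))
        y = tuple(a + dt / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
                  for a, b1, b2, b3, b4 in zip(y, k1, k2, k3, k4))
        max_drift = max(max_drift, abs(energy(y) - e0))
        max_xi = max(max_xi, abs(y[0]))
        max_eta = max(max_eta, abs(y[2]))
    return {
        "e0": e0,
        "max_drift": max_drift,
        "max_xi": max_xi,
        "max_eta": max_eta,
        "xi_bound": math.sqrt(2 * e0 / (alpha * w1 ** 2)),
        "eta_bound": math.sqrt(2 * e0 / (beta * w2 ** 2)),
    }


if __name__ == "__main__":
    print(simulate())
```

The bounds hold only for \(\alpha ,\beta >0\) and \(U\ge 0\); for other sign choices the quantity monitored here is still conserved but no longer confines the motion.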

4 Examples of stable higher-derivative field theories

In this section, we consider two examples of the higher-derivative field theories which are stable despite the fact that their canonical energy is unbounded from below. The consideration follows the general pattern described in the previous section.

4.1 Scalar field with higher derivatives

Consider the Lagrangian of a free scalar field \(\phi \):

$$\begin{aligned} L=\frac{1}{2(m_1^2-m_2^2)}\left( \Box \phi +m_1^2\phi \right) \left( \Box \phi +m_2^2\phi \right) , \end{aligned}$$

where \(\Box =\partial _\mu \partial ^\mu \) is the D’Alembert operator. The equation of motion reads

$$\begin{aligned} \frac{\delta S}{\delta \phi }=\frac{1}{m_1^2-m_2^2}\left( \Box +m_1^2\right) \left( \Box +m_2^2\right) \phi =0. \end{aligned}$$
(58)

If \(m_1\ne m_2\), the theory has the factorizable structure (31) with the following operators \(\mathcal {P}\) and \(\mathcal {Q}\):

$$\begin{aligned} \mathcal {P}=\frac{\Box +m_1^2}{m_1^2-m_2^2},\quad \mathcal {Q}=\frac{\Box +m_2^2}{m_2^2-m_1^2}. \end{aligned}$$

In the second-order formalism the corresponding fields \(\xi \) and \(\eta \) are the usual scalar fields with masses \(m_1\) and \(m_2\), respectively.

Interaction can be included in (58) following the pattern (31), (33) of the previous section:

$$\begin{aligned} {T}&\equiv \frac{(\Box +m_1^2)(\Box +m_2^2)\phi }{(m_1^2-m_2^2)} \nonumber \\&-\, U'\left( \frac{(\alpha +\beta )\Box +(\alpha m_2^2+\beta m_1^2)}{m_2^2-m_1^2}\phi \right) =0. \end{aligned}$$
(59)

The common multiplier \(m^2_2-m_1^2\) provides the correct dimension of energy.

Here we consider a \(U\) which does not depend on derivatives of fields. This allows us to simplify explicit formulas in this section. The general expressions and conclusions, however, hold true even if the interaction depends on the derivatives of fields.

The corresponding energy-momentum tensor reads

$$\begin{aligned} \Theta ^{\mu }_{\nu }&= \alpha \Theta ^{(1)}{}^{\mu }_{\nu }(\mathcal {Q}\phi )+\beta \Theta ^{(2)}{}^{\mu }_{\nu }(\mathcal {P}\phi )\nonumber \\&+\,\delta ^\mu _\nu U\left( \frac{(\alpha +\beta )\Box +(\alpha m_2^2+\beta m_1^2)}{m_2^2-m_1^2}\phi \right) , \end{aligned}$$
(60)

where

$$\begin{aligned} \displaystyle \Theta ^{(1)}{}^{\mu }_{\nu }(\mathcal {Q}\phi )&= \displaystyle \partial ^\mu \left( \frac{\Box \phi +m_2^2\phi }{m_2^2-m_1^2}\right) \partial _\nu \left( \frac{\Box \phi +m_2^2\phi }{m_2^2-m_1^2}\right) \\&\displaystyle -\,\frac{1}{2}\delta ^{\mu }_{\nu }\partial ^\sigma \left( \frac{\Box \phi +m_2^2\phi }{m_2^2-m_1^2}\right) \partial _\sigma \left( \frac{\Box \phi +m_2^2\phi }{m_2^2-m_1^2}\right) \\&+\, \delta ^{\mu }_{\nu }\frac{m_1^2}{2}\left( \frac{\Box \phi +m_2^2\phi }{m_2^2-m_1^2}\right) ^2 \end{aligned}$$

and

$$\begin{aligned} \displaystyle \Theta ^{(2)}{}^{\mu }_{\nu }(\mathcal {P}\phi )&= \displaystyle \partial ^\mu \left( \frac{\Box \phi +m_1^2\phi }{m_1^2-m_2^2}\right) \partial _\nu \left( \frac{\Box \phi +m_1^2\phi }{m_1^2-m_2^2}\right) \\&\displaystyle -\,\frac{1}{2}\delta ^{\mu }_{\nu }\partial ^\sigma \left( \frac{\Box \phi +m_1^2\phi }{m_1^2-m_2^2}\right) \partial _\sigma \left( \frac{\Box \phi +m_1^2\phi }{m_1^2-m_2^2}\right) \\&+\, \delta ^{\mu }_{\nu }\frac{m_2^2}{2}\left( \frac{\Box \phi +m_1^2\phi }{m_1^2-m_2^2}\right) ^2\, \end{aligned}$$

are the energy-momentum tensors of the scalar modes with masses \(m_1\) and \(m_2\), and the last term in (60) has the meaning of interaction energy.

The characteristic of the conserved energy-momentum tensor (60) reads

$$\begin{aligned} Q_\nu \!=\!\partial _\nu \left( \frac{(\alpha +\beta )\Box +(\alpha m_2^2+\beta m_1^2)}{m_1^2-m_2^2}\phi \right) ,\quad \partial _\mu \Theta ^\mu _\nu =Q_\nu {T}. \end{aligned}$$
(61)

The Lagrange anchor, being constructed for (59) by the general recipe (9.2), has the form

$$\begin{aligned} V&= \frac{1}{\alpha }\frac{\Box +m_2^2}{m_2^2-m_1^2}+\frac{1}{\beta }\frac{\Box +m_1^2}{m_2^2-m_1^2} \nonumber \\&+\,\frac{1}{m_2^2-m_1^2}\frac{(\alpha +\beta )^2}{\alpha \beta }U''\left( \frac{(\alpha +\beta )\Box +(\alpha m_2^2+\beta m_1^2)}{m_2^2-m_1^2}\phi \right) ,\quad U''=\frac{\mathrm{d}^2U}{\mathrm{d}\phi ^2}. \end{aligned}$$
(62)

The Lagrange anchor maps characteristics to infinitesimal symmetry transformations; see Appendix B. Applying the anchor (62) to the characteristic (61), we find

$$\begin{aligned} \delta _\varepsilon \phi =\varepsilon ^\mu V (Q_\mu )=-\varepsilon ^\mu \partial _\mu \phi -\frac{(\alpha +\beta )^2}{\alpha \beta }\frac{1}{m_1^2-m_2^2}\varepsilon ^\mu \partial _\mu T, \end{aligned}$$

where \(T\) is the l.h.s. of the field equation (59). The symmetry transformation is a translation along the constant vector \(\varepsilon ^\mu \), as it must be. The stable interaction vertices correspond to \(\alpha ,\beta >0\) and depend on the second derivatives of the scalar field through \(\Box \phi \).

In Ref. [29] higher-derivative self-interactions of the scalar field of a similar form are considered in cosmology as one of the scenarios explaining inflation. In this regard, the suggested stability control method, being based on the conservation of the tensor (60), can be relevant to cosmology, where classical stability is an important selection principle for the models.

Let us mention one more piece of evidence for the stability of scalar fields with higher derivatives. The instability of the theory is usually related to the presence of “ghost states”. These states correspond to the wrong sign of the pole in the propagator. They are responsible for the presence of negative norm states, which is a notorious problem for higher-derivative theories. Below we demonstrate that the correct choice of the Lagrange anchor leads to a ghost-free theory. The procedure of quantization of theories equipped with a Lagrange anchor has been developed in the series of works [44–46]. Here, we use the method based on the generalized Schwinger–Dyson equation (a brief outline of the method can be found in Appendix A; for a more systematic exposition see [45]). We find the generating functional of Green’s functions for the free higher-derivative scalar field with the Lagrange anchor (62) and derive the propagator as the second variational derivative of this generating functional.

For the free equations of motion (58) and the Lagrange anchor (62), the Schwinger–Dyson equation reads

$$\begin{aligned} \left[ \frac{\delta S}{\delta \phi } (\widehat{\phi }) -V(\bar{\phi })\right] Z[\bar{\phi }]=0, \end{aligned}$$
(63)

where \(\widehat{\phi }=i\hbar \delta /\delta \bar{\phi }\), \(\bar{\phi }\) is the source for the scalar field \(\phi \), and \(Z[\bar{\phi }]\) is the generating functional of Green’s functions. The solution to the Schwinger–Dyson equation (63) has the form

$$\begin{aligned} Z[\bar{\phi }]\!=\!\exp \left[ -\frac{i}{2\hbar }\int \mathrm{d}^4x \bar{\phi }\left( \frac{1}{\alpha }\frac{1}{\Box +m_2^2}\!+\! \frac{1}{\beta }\frac{1}{\Box +m_1^2}\right) \bar{\phi }\right] \!. \end{aligned}$$
(64)

Taking the second variational derivative of (64) and setting \(\bar{\phi }=0\), we get the propagator

$$\begin{aligned} G_2(x_1-x_2)&= i\hbar \frac{\delta ^2 Z[\bar{\phi }]}{\delta \bar{\phi }(x_1)\delta \bar{\phi }(x_2)}\Big |_{\bar{\phi }=0} \nonumber \\ \!&= \!\left( \frac{1}{\alpha }\frac{1}{\Box +m_2^2}\!+\! \frac{1}{\beta }\frac{1}{\Box +m_1^2}\right) \delta (x_1\!-\!x_2). \end{aligned}$$
(65)

As one could expect, both terms in (65) have the same sign if \(\alpha , \beta >0\). The canonical Lagrange anchor corresponds to the choice \(\alpha =-\beta =1\), which leads to the theory with ghosts.

Let us note that the presence of derivatives in the Lagrange anchor makes the ultraviolet behavior of the propagator worse. Only the canonical Lagrange anchor (\(\alpha =-\beta \)) provides the ultraviolet asymptotic form \(G_2\sim p^{-4}\) in the momentum representation. In the case of positive definite energy, the propagator behaves like the usual Feynman propagator for the scalar field, \(G_2\sim p^{-2}\). As a result, the use of a Lagrange anchor with derivatives does not allow one to obtain a positive definite energy and improve the renormalization properties of the theory at the same time. This can decrease the potential attractiveness of higher-derivative theories from the viewpoint of overcoming the divergences in quantum theory.
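The two asymptotic regimes can be read off from (65) by bringing the two terms to a common denominator. The sketch below assumes the momentum-space replacement \(\Box \rightarrow -p^2\) (a signature convention not fixed in this section) and drops the delta function.

$$\begin{aligned} \frac{1}{\alpha }\frac{1}{m_2^2-p^2}+\frac{1}{\beta }\frac{1}{m_1^2-p^2} =\frac{1}{\alpha \beta }\,\frac{\beta m_1^2+\alpha m_2^2-(\alpha +\beta )p^2}{(m_1^2-p^2)(m_2^2-p^2)}. \end{aligned}$$

For \(\alpha +\beta \ne 0\) the numerator grows like \(p^2\), and the whole expression falls off as \(p^{-2}\). For \(\alpha =-\beta \) the \(p^2\) term in the numerator cancels, the numerator reduces to the constant \((m_1^2-m_2^2)/\alpha \), and the fall-off becomes \(p^{-4}\).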

As we have seen, at the free level the higher-derivative scalar field model admits a two-parameter family of conserved energy-momentum tensors. The interaction, being included by the recipe (59), explicitly involves these parameters. In the interacting model only one conservation law survives by construction. The conserved tensor (40) has positive density \(\Theta ^0_{~0}\) once \(\alpha ,\beta >0\), while the canonical energy (which is unbounded) corresponds to \(\alpha =-\beta =1\). So, the interaction with \(\alpha , \beta >0\) does not break stability, because a positive quantity is still conserved in this case. A similar phenomenon is seen when the theory is quantized. If the Lagrange anchor is chosen with positive parameters \(\alpha ,\beta \), the theory is stable, while the canonical choice results in ghosts.

4.2 Podolsky’s electrodynamics and its interaction with massive spin \(1/2\)

The free Podolsky electrodynamics is the theory of a vector field \(\phi ^\mu \) with the action

$$\begin{aligned} S\!=\!-\frac{1}{4}\int \mathrm{d}x \left[ (F_\phi )_{\mu \nu }(F_\phi )^{\mu \nu }-\frac{2}{m^2_p}\partial ^\mu (F_\phi )_{\mu \rho }\,\partial _\nu (F_\phi )^{\nu \rho }\right] \!. \end{aligned}$$
(66)

Here, \((F_\phi )_{\mu \nu }=\partial _\mu \phi _\nu -\partial _{\nu }\phi _\mu \) is the field strength and \(m_p>0\) is a parameter of the theory having the dimension of mass.

The equations of motion

$$\begin{aligned} -\frac{1}{m_p^2}\frac{\delta S}{\delta \phi }\equiv \mathcal {P}\mathcal {Q}\phi =0\, \end{aligned}$$

have the factorizable structure (31), where the operators \(\mathcal {P}, \mathcal {Q}\) and \( \mathcal {F} \) read

$$\begin{aligned} \mathcal {P}&= -\frac{1}{m_p^2}(\Box -\partial \partial \cdot ),\quad \mathcal {Q}=\frac{1}{m_p^2}\left( \Box -\partial \partial \cdot +m_p^2\right) ,\nonumber \\&\mathcal {F}=0. \end{aligned}$$
(67)

Obviously, \(\mathcal {P}\) is the Maxwell operator and \(\mathcal Q\) is the Proca operator.

Being a factorizable fourth-order theory, Podolsky electrodynamics can be reduced to the second order by introducing the variables \(\xi \) and \(\eta \) that absorb the second derivatives of \(\phi \) following the general recipe (32): \(\xi =\mathcal {Q}\phi \), \(\eta =\mathcal {P}\phi \). Then the equivalent second-order theory will be given by the Maxwell equations for \(\xi \) and the Proca equations for \(\eta \). The corresponding action has the form

$$\begin{aligned} S_1&= -\frac{1}{4}\int \mathrm{d}x \left[ \alpha (F_\xi )_{\mu \nu }(F_\xi )^{\mu \nu }\right. \nonumber \\&\left. +\, \beta \left( (F_\eta )_{\mu \nu }(F_\eta )^{\mu \nu }-2m^2_p\eta ^\nu \eta _\nu \right) \right] \end{aligned}$$
(68)

with some constants \(\alpha ,\beta \ne 0\). The Lagrangians (66) and (68) enjoy the usual gauge symmetry

$$\begin{aligned} \delta _\chi \phi _\mu =\partial _\mu \chi ,\quad \delta _\chi \xi _\mu =\partial _\mu \chi ,\quad \delta _\chi \eta _\mu =0. \end{aligned}$$
(69)

Let us first discuss the inclusion of interaction in the \(\xi \eta \)-formalism, and then switch to the \(\phi \)-picture, where the equations are of fourth order (see Footnote 8). Introduce the Dirac field \(\psi \) (\(\widetilde{\psi }\) stands for the Dirac conjugate spinor) minimally coupled to the vector field by adding the following term to the action (68):

$$\begin{aligned} S'_1&= S_1-\int \mathrm{d}x U,\quad U(\alpha \xi -\beta \eta ,\psi ,\widetilde{\psi }) \nonumber \\&= -\widetilde{\psi } (i\gamma ^\mu (\partial _\mu -e(\alpha \xi -\beta \eta )_\mu )-m)\psi . \end{aligned}$$
(70)

The equations read

$$\begin{aligned}&\partial ^\nu (F_\xi )_{\nu \mu }-j_\mu =0, \quad \partial ^\nu (F_\eta )_{\nu \mu }+m_p^2\eta _\mu +j_\mu =0, \nonumber \\&\quad j_\mu =e\widetilde{\psi }\gamma _\mu \psi , \end{aligned}$$
(71)
$$\begin{aligned}&(i\gamma ^\mu (\partial _\mu -e(\alpha \xi -\beta \eta )_\mu )-m)\psi =0, \nonumber \\&\quad \widetilde{\psi }(i\gamma ^\mu (\overleftarrow{\partial }_\mu +e(\alpha \xi -\beta \eta )_\mu )+m)=0. \end{aligned}$$
(72)

The consistency of the interaction implies that the gauge transformations (69) are complemented by the standard \(U(1)\)-transformation for the Dirac field

$$\begin{aligned} \delta _\chi \psi = -ie\alpha \chi \psi ,\quad \delta _\chi \widetilde{\psi }=ie\alpha \chi \widetilde{\psi }. \end{aligned}$$
(73)

As is seen, the full theory of (71) and (72) describes propagation of one vector field \(\eta \) of mass \(m_p\) and one massless gauge field \(\xi \), and both vectors are minimally coupled to the spinor field \(\psi \).

If \(\alpha ,\beta >0\), the theory (68) is (perturbatively) stable. The energy-momentum tensor reads

$$\begin{aligned} \displaystyle \Theta ^{\mu }_{~\nu }(\xi ,\eta ,\psi ,\widetilde{\psi })&= \frac{\beta }{4}(\delta ^{\mu }_{\nu }(F_{\eta })^{\rho \sigma }(F_{\eta })_{\rho \sigma }- 4(F_{\eta })^{\mu \rho }(F_{\eta })_{\nu \rho } \nonumber \\&+\, 4m_p^2 \eta ^\mu \eta _\nu -2m_p^2\delta ^{\mu }_{\nu }\eta ^\rho \eta _\rho ) \nonumber \\&\displaystyle +\, \frac{\alpha }{4}(\delta ^{\mu }_{\nu }(F_{\xi })^{\rho \sigma }(F_{\xi })_{\rho \sigma }- 4(F_{\xi })^{\mu \rho }(F_{\xi })_{\nu \rho }) \nonumber \\&+\, \frac{i}{4}\widetilde{\psi }\left[ \gamma ^\mu (\overrightarrow{\partial }_\nu +ie(\alpha \xi -\beta \eta )_\nu ) \right. \nonumber \\&\displaystyle +\,\gamma _\nu (\overrightarrow{\partial }^\mu +ie(\alpha \xi -\beta \eta )^\mu )\nonumber \\&-\, \gamma ^\mu (\overleftarrow{\partial }_\nu -ie(\alpha \xi -\beta \eta )_\nu ) \nonumber \\&\left. -\,\gamma _\nu (\overleftarrow{\partial }^\mu -ie(\alpha \xi -\beta \eta )^\mu )\right] \psi . \end{aligned}$$
(74)

Notice that the stable and unstable models describe different physics. To demonstrate this fact, let us make the field redefinition

$$\begin{aligned} \xi \rightarrow \pm \frac{\xi }{\sqrt{|\alpha |}},\quad \eta \rightarrow \pm \frac{\eta }{\sqrt{|\beta |}}\, \end{aligned}$$
(75)

in the action (70). Substituting (75) into (70), we get the standard action of a theory describing the minimal coupling of massive and massless vector fields to the Dirac field

$$\begin{aligned} \displaystyle S'_1&= \displaystyle -\frac{1}{4}\int \mathrm{d}x\left\{ \frac{\alpha }{|\alpha |}(F_\xi )_{\mu \nu }(F_\xi )^{\mu \nu } \right. \nonumber \\&+\, \frac{\beta }{|\beta |}\left[ (F_\eta )_{\mu \nu }(F_\eta )^{\mu \nu }-2m^2_p\eta ^\nu \eta _\nu \right] \nonumber \\&\left. \displaystyle -\,4 \widetilde{\psi }\left( i\gamma ^\mu (\partial _\mu -e(\pm \frac{\alpha }{|\alpha |}\sqrt{|\alpha |}\xi \mp \frac{\beta }{|\beta |}\sqrt{|\beta |}\eta )_\mu )\!-\!m\right) \psi \right\} .\nonumber \\ \end{aligned}$$
(76)

The parameters \(\alpha \), \(\beta \) define the intensity of this coupling. Notice that, by construction, any model (76) with nonzero \(\alpha ,\beta \) remains equivalent to the Podolsky theory interacting with the Dirac field. For this reason, any theory of massive and massless vector fields minimally interacting with a spinor field has an equivalent description in terms of the interacting Podolsky theory.

It is well known that in a theory of the form (76), two fermions interact by means of massless “photons” producing the Coulomb force and massive “photons” producing the Yukawa force. If the theory is stable, both types of photons mediate the force of repulsion between two particles of the same charge and the force of attraction if the particles have opposite electric charges. In contrast, the unstable theories (because of the “wrong” sign of the action of one (or both) photons in (76)) describe the interactions where one (or both) types of photons mediate the force of attraction between two particles of the same charge and the force of repulsion between particle and antiparticle. For example, in the special case of \(\alpha =-\beta =1\), which corresponds to the inclusion of the minimal interaction \(\phi ^\mu j_\mu \) into the original Lagrangian (66), the Coulomb and Yukawa contributions to the interaction energy have equal coupling strength but opposite signs. This fact was first noticed by Podolsky in [4], and it turned out that this sign cannot be controlled within the Lagrangian formalism. It was long believed that the subtraction of the two forces is the strong side of the theory, because it allows one to improve the short-distance behavior of Green’s functions. Now we see that the minimal interaction of Podolsky theory with the Dirac field is incompatible with the stability condition. The stable interactions with \(\alpha ,\beta >0\) correspond to non-minimal and non-Lagrangian interaction vertices in the Podolsky theory. Below, we explain that the stability of the theory can be controlled immediately in terms of fourth-order equations with any \(\alpha ,\beta \) even though they are not necessarily Lagrangian.
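The sign pattern described above can be made concrete with the static interaction energy of two equal point charges. The sketch below assumes the tree-level static limit and Heaviside–Lorentz natural units, in which the massless and massive exchanges contribute \(\alpha e^2/(4\pi r)\) and \(\beta e^2 \mathrm{e}^{-m_p r}/(4\pi r)\), respectively. This normalization is our assumption for illustration and is not taken from the text; the function name `static_potential` is hypothetical.

```python
import math


def static_potential(r, alpha, beta, m_p, e=1.0):
    """Assumed tree-level static energy of two equal charges e at distance r:

        V(r) = e^2/(4 pi r) * (alpha + beta * exp(-m_p * r)).

    The bracket is rewritten as (alpha + beta) + beta*expm1(-m_p*r) so that
    the cancellation at alpha = -beta stays accurate for small r.
    """
    bracket = (alpha + beta) + beta * math.expm1(-m_p * r)
    return e ** 2 / (4.0 * math.pi * r) * bracket


if __name__ == "__main__":
    m_p = 2.0
    for r in (1e-6, 1e-3, 1.0):
        canonical = static_potential(r, 1.0, -1.0, m_p)  # alpha = -beta = 1
        stable = static_potential(r, 1.0, 1.0, m_p)      # alpha, beta > 0
        print(f"r={r:g}  canonical={canonical:.6f}  stable={stable:.6f}")
```

Under these assumptions the canonical choice \(\alpha =-\beta =1\) gives an energy that stays finite as \(r\rightarrow 0\) and tends to \(e^2 m_p/(4\pi )\), which is the improved short-distance behavior mentioned above, while for \(\alpha ,\beta >0\) both contributions are repulsive and the energy grows like \(1/r\).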

In the \(\phi \)-representation, which corresponds to the original fourth-order formalism, the equations of the nonlinear theory (71) and (72) read

$$\begin{aligned}&\displaystyle (T_\phi )_\mu \equiv \left( \frac{1}{m^2_p}\Box +1\right) \partial ^\nu (F_\phi )_{\mu \nu }-j_\mu =0,\nonumber \\&\displaystyle T_{\widetilde{\psi }}\equiv \left\{ i\gamma ^\mu \left( \partial _\mu -e\alpha \phi _\mu -e\frac{\alpha +\beta }{ m^2_p}\partial ^\nu (F_\phi )_{\nu \mu }\right) -m\right\} \nonumber \\&\quad \times \,\psi =0,\\&\displaystyle T_{\psi } \equiv \widetilde{\psi }\left\{ i\gamma ^\mu \left( -\overleftarrow{\partial }_\mu -e\alpha \phi _\mu -e\frac{\alpha +\beta }{ m^2_p}\partial ^\nu (F_\phi )_{\nu \mu }\right) \!-\!m\right\} \nonumber \\&\quad =0.\nonumber \end{aligned}$$
(77)

The equations (77) are invariant under the usual gauge transformations (69) and (73).

In the \(\phi \)-representation the energy-momentum tensor (74) takes the form

$$\begin{aligned}&\displaystyle \Theta ^{\mu }_{~\nu }(\phi ,\psi ,\widetilde{\psi }) \nonumber \\&\quad = \frac{\alpha +\beta }{4m^4_p}\left[ \delta ^\mu _{\nu }(\Box F_\phi )^{\rho \sigma }(\Box F_\phi )_{\rho \sigma }-4(\Box F_\phi )^{\mu \rho }(\Box F_\phi )_{\nu \rho }\right] \nonumber \\&\qquad \displaystyle +\,\frac{\alpha }{2m_p^2}\left[ \delta ^\mu _{\nu } (F_\phi )^{\rho \sigma }(\Box F_\phi )_{\rho \sigma }-2(F_\phi )^{\mu \rho }(\Box F_\phi )_{\nu \rho } \right. \nonumber \\&\qquad \left. -\,2(F_\phi )_{\nu \rho }(\Box F_\phi )^{\mu \rho }\right] \nonumber \\&\qquad \displaystyle +\,\frac{\beta }{2m_p^2}\left[ 2\partial _\rho (F_\phi )^{\rho \mu }\partial ^\sigma (F_\phi )_{\sigma \nu }\!-\!\delta ^{\mu }_{~\nu }\partial _\rho (F_\phi )^{\rho \tau }\partial ^\sigma (F_\phi )_{\sigma \tau }\!\right] \nonumber \\&\qquad +\, \frac{1}{4}\delta ^\mu _{\nu }(F_\phi )^{\rho \sigma }(F_\phi )_{\rho \sigma }- (F_\phi )^{\mu \rho }(F_{\phi })_{\nu \rho } \nonumber \\&\qquad \displaystyle +\, \frac{i}{4}\widetilde{\psi }\left[ \gamma ^\mu (\overrightarrow{\partial }_\nu +ieb_\nu ) +\gamma _\nu (\overrightarrow{\partial }^\mu +ieb^\mu ) \right. \nonumber \\&\qquad \left. -\,\gamma ^\mu (\overleftarrow{\partial }_\nu -ieb_\nu )- \gamma _\nu (\overleftarrow{\partial }^\mu -ieb^\mu )\right] \psi , \end{aligned}$$
(78)

where

$$\begin{aligned} b_\mu =\alpha \phi _\mu +\frac{\alpha +\beta }{ m^2_p}\partial ^\nu (F_\phi )_{\nu \mu }. \end{aligned}$$

In the limit of free Lagrangian theory (\(\alpha =-\beta =1\), \(\psi =0\)) this conserved tensor reduces to the standard energy-momentum tensor of the Podolsky theory [4], as one could expect.
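For orientation, this limit can be written out by direct substitution into (78). With \(\alpha =-\beta =1\) we have \(\alpha +\beta =0\), so the first bracket drops out and \(b_\mu =\phi _\mu \); with \(\psi =0\) the fermionic term vanishes as well. What follows is only the result of this substitution, not an independent comparison with the expression given in [4]:

$$\begin{aligned} \Theta ^{\mu }_{~\nu }\Big |_{\alpha =-\beta =1,\,\psi =0}&= \frac{1}{2m_p^2}\left[ \delta ^\mu _{\nu } (F_\phi )^{\rho \sigma }(\Box F_\phi )_{\rho \sigma }-2(F_\phi )^{\mu \rho }(\Box F_\phi )_{\nu \rho } -2(F_\phi )_{\nu \rho }(\Box F_\phi )^{\mu \rho }\right] \nonumber \\&\quad -\,\frac{1}{2m_p^2}\left[ 2\partial _\rho (F_\phi )^{\rho \mu }\partial ^\sigma (F_\phi )_{\sigma \nu }-\delta ^{\mu }_{~\nu }\partial _\rho (F_\phi )^{\rho \tau }\partial ^\sigma (F_\phi )_{\sigma \tau }\right] \nonumber \\&\quad +\, \frac{1}{4}\delta ^\mu _{\nu }(F_\phi )^{\rho \sigma }(F_\phi )_{\rho \sigma }- (F_\phi )^{\mu \rho }(F_{\phi })_{\nu \rho }. \end{aligned}$$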

The tensor (78) is conserved,

$$\begin{aligned} \partial _\mu \Theta ^{\mu }_{~\nu }= (Q_\phi )_\nu ^\mu (T_\phi )_\mu +T_{\psi }(Q_\psi )_\nu +(Q_{\widetilde{\psi }})_\nu T_{\widetilde{\psi }}, \end{aligned}$$
(79)

and the corresponding characteristic reads (see footnote 9)

$$\begin{aligned} Q_\nu =((Q_\phi )^\mu _\nu ,(Q_\psi )_\nu ,(Q_{\widetilde{\psi }})_\nu )=(-\partial _\nu b^\mu , -\partial _\nu \psi ,-\partial _\nu \widetilde{\psi }). \end{aligned}$$
(80)

The Lagrange anchor (6.10) for factorizable systems is constructed by the general recipe (9.2). Following this pattern, we arrive at the Lagrange anchor \(V\), whose action on the general characteristic \(Q\) reads

$$\begin{aligned} \displaystyle V(Q)&\equiv \left( V^\mu _\phi (Q),V_{\bar{\psi }}(Q),V_\psi (Q)\right) \nonumber \\&= \left( \left[ \left( \frac{1}{\alpha }+\left( \frac{1}{\alpha }+ \frac{1}{\beta }\right) \frac{\Box -\partial \partial \cdot }{m_p^2}\right) Q\right] ^\mu \right. \nonumber \\&\left. +\,\frac{1}{m_p^2}\frac{(\alpha +\beta )^2}{\alpha \beta } \nonumber \right. \\&\times \,\left. \left[ e\bar{\psi }\gamma ^\mu Q_{\bar{\psi }}+eQ_{\psi }\gamma ^\mu \psi \right] , \ Q_\psi , \, Q_{\bar{\psi }}\right) . \end{aligned}$$
(81)

Substituting (80) into (81), we find the following symmetry transformation corresponding to the characteristic:

$$\begin{aligned}&(\delta _\varepsilon \phi ^\mu ,\delta _\varepsilon \psi ,\delta _\varepsilon \widetilde{\psi })= \varepsilon ^\nu V(Q_\nu ) =\Big (-\varepsilon ^\nu \partial _\nu \phi ^\mu \nonumber \\&\quad -\,\frac{1}{m^2_p}\frac{(\alpha +\beta )^2}{\alpha \beta } \varepsilon ^\nu \partial _\nu (T_\phi )^\mu ,-\varepsilon ^\nu \partial _\nu \psi ,-\varepsilon ^\nu \partial _\nu \widetilde{\psi }\Big ). \end{aligned}$$
(82)

This means that the Lagrange anchor connects the conservation of the tensor (78) with translation invariance of the fourth-order equations (77). Once \(\alpha ,\beta \) are positive, the tensor satisfies the condition \(\Theta ^0_{~0}>0\), and the theory is stable. The corresponding positive, conserved, non-canonical energy-momentum tensor is connected to the translation invariance by the non-canonical Lagrange anchor (81).
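The origin of the coefficient \((\alpha +\beta )^2/(\alpha \beta )\) in (82) can be sketched with a short formal computation. As an assumption made only for this illustration, we treat \(D\equiv (\Box -\partial \partial \cdot )/m_p^2\) as a commuting symbol and take \(b\) to have the structure \(\alpha \phi +(\alpha +\beta )D\phi \), which follows from the definition of \(b_\mu \) if \((F_\phi )_{\nu \mu }=\partial _\nu \phi _\mu -\partial _\mu \phi _\nu \). The vector part of the anchor (81) then acts on \(b\) as

$$\begin{aligned} \left( \frac{1}{\alpha }+\frac{\alpha +\beta }{\alpha \beta }D\right) \left( \alpha +(\alpha +\beta )D\right)&= 1+\left( \frac{\alpha +\beta }{\alpha }+\frac{\alpha +\beta }{\beta }\right) D+\frac{(\alpha +\beta )^2}{\alpha \beta }D^2 \nonumber \\&= 1+\frac{(\alpha +\beta )^2}{\alpha \beta }\,D\,(1+D). \end{aligned}$$

The unit term accounts for the translation \(-\varepsilon ^\nu \partial _\nu \phi ^\mu \), while the combination \(D(1+D)\) is, up to sign and index conventions not fixed by this sketch, the operator \(\frac{1}{m_p^2}\left( \frac{1}{m_p^2}\Box +1\right) \partial ^\nu (F_\phi )_{\mu \nu }\) that enters \((T_\phi )_\mu \) in (77).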

If the fourth-order equations (77) were quantized with the corresponding Lagrange anchor with \(\alpha >0, \beta > 0\) along the lines of the previous section, we would arrive at the stable quantum theory precisely corresponding to the quantization of the second-order Lagrangian (68) and (70). If the fourth-order theory is considered with unstable vertices corresponding to the opposite signs of \(\alpha \) and \(\beta \) in the Lagrange anchor, the theory will be classically unstable, and its quantization will correspond to the standard Feynman rules for the Podolsky Lagrangian with minimal coupling to the Dirac field. The quantum instability problem is well known for the couplings of this type; see e.g. [24–26] and references therein.

In this section, we have studied stability proceeding from the fact that the equations of Podolsky’s free higher-derivative electrodynamics have a factorizable structure. Therefore, the theory admits a bounded conserved energy-momentum tensor, besides the unbounded canonical one. The conservation of the bounded tensor ensures classical stability irrespective of the unboundedness of the canonical tensor. We then considered a not necessarily minimal inclusion of interactions with the massive spin \(1/2\) field such that the bounded tensor, deformed by the interaction (74), remains conserved. The nonlinear higher-derivative theory is both classically and quantum mechanically equivalent to the theory of one massless and one massive vector field, both coupled to the Dirac field. Studying these auxiliary second-order formulations, we showed that the minimal coupling of Podolsky’s theory breaks the stability of the free theory, while the non-minimal interactions (77) keep the dynamics stable.

5 Conclusion

In this paper, we have studied higher-derivative dynamics proceeding from the idea that stability can be ensured by the existence of any bounded conserved quantity, even if it differs from the canonical energy. We have focused on the special class of factorizable higher-derivative systems whose equations (31) include the linear term \(\mathcal {PQ}\phi \) and the nonlinearity \(\mathcal {F}(\mathcal {P}\phi ,\mathcal {Q}\phi )\). By making use of factorization, we can construct a conserved quantity that may be positive both in the linear model and with a variety of interactions \(\mathcal {F}\), while the canonical energy fails to be positive definite already in the linear approximation. The conservation of this positive quantity is by construction connected to translation invariance, so it can be viewed as an alternative definition of energy for higher-derivative systems. As we have demonstrated, the classical stability can be promoted to the quantum level. This class of higher-derivative systems is wide enough to accommodate models of interest for physics, as is seen from the examples of Sect. 4. However, the factorizable structure of the equations seems to us to be a technical tool rather than a genuine restriction on the dynamics related to stability. In any case, we see that higher-derivative systems can have stable classical and quantum dynamics with nontrivial interactions irrespective of the fact that the canonical energy is unbounded.