1 Introduction

General relativity (GR) radically changed our understanding of the universe. The predictions of this elegant theory have been confirmed to date [1, 2]. In order to fit extragalactic and cosmological observational data, however, the presence of a non-vanishing cosmological constant and six times more dark matter than ordinary matter has to be assumed in this framework [3]. In addition, the observed value of this cosmological constant differs greatly from the value expected for the vacuum energy. On the other hand, while the strong and electroweak forces are described by renormalisable gauge theories, that is not the case for GR, and the compatibility of GR with the quantum realm is still a matter of debate. Given this situation, there has been a renewed interest in alternative theories of gravity, which modify the predictions of GR.

A particular approach to formulating alternative theories of gravity involves an extension of the geometrical treatment that covers the microscopic properties of matter [4]. It should be noted that mass alone is not enough to characterize particles at the quantum level, given that they have another independent label, namely the spin. Whereas at macroscopic scales the energy-momentum tensor is enough to describe the source of gravity, a description of the spacetime distribution of the spin density is needed at microscopic scales. Moreover, there are macroscopic configurations that may also need a description of the spin distribution, such as super-massive objects (e.g. black holes or neutron stars with nuclear polarisation). In this spirit, a new geometrical concept should be related to the spin distribution in the same way that spacetime curvature is related to the energy-momentum distribution. Torsion is a natural candidate for this purpose [4, 5] and an important advantage of a theory of gravity with torsion is that it can be formulated as a gauge theory [6,7,8].

Since 1924 many authors have considered theories of gravity in a Riemann–Cartan \(U_4\) spacetime. In this manifold the non-vanishing torsion can be coupled to the intrinsic spin density of matter and, in this way, the spin part of the Poincaré group can change the geometry of the manifold, as the energy-momentum tensor does. The first attempt to introduce torsion in a theory of gravity was the Einstein–Cartan theory, which is a reformulation of GR in a \(U_4\) spacetime. In this theory the scalar curvature of the Einstein–Hilbert action is constructed from a \(U_4\) connection instead of the Christoffel symbols. However, the resulting theory was not completely satisfactory because the field equations relate the torsion and its source in an algebraic way and, therefore, torsion is not dynamical. Hence the torsion field vanishes in vacuum and the Einstein–Cartan theory collapses to GR except for unobservable corrections to the energy-momentum tensor [4]. In order to obtain a theory with propagating torsion, we need to consider an action that is at least quadratic in the curvature tensors [4, 6,7,8,9,10,11]. Moreover, an important advantage of adding quadratic terms \(\mathcal{{R}}^2\) to the Einstein–Hilbert action is the possibility of making the theory renormalisable [9]. In addition, it can be shown [4, 6] that, considering a gauge description, the torsion and curvature tensors correspond to the field-strength tensors of the gauge potentials of the Poincaré group \((e_{\mu }^{\ a }, w_{\mu }^{\ a b })\), which are the vierbein and the local Lorentz connection, respectively. Thus, a pure \(\mathcal{{R}}^2\) gauge theory of gravity has some resemblance to electroweak and strong theories.

From an experimental point of view there have been many attempts to detect torsion or to set an upper bound on its gravitational effects. One of the most debated attempts was the use of the Gravity Probe B experiment to measure torsion effects [12]. Nevertheless, this experiment was criticized because torsion will never couple to the gyroscopes installed in the satellite [13]. Therefore, this probe cannot measure the gravitational effects due to torsion. On the other hand, other unsuccessful experiments aimed to constrain torsion with accurate measurements of the perihelion advance and the orbital geodetic effect of a satellite [14]. The experimental difficulty lies in the need to deal with elementary particles with spin in order to obtain a maximal coupling with torsion.

In this paper we present a self-contained introduction to quadratic theories of gravity with torsion in the geometrical approach (the gauge treatment is not considered). We partly recover well-known results about the stability of these theories using simple methods. In this way, we simplify the existing mathematical treatment and reinforce the critical discussion of some controversial results published in the literature.

The paper is organized as follows: In Sect. 2 we present a general introduction to the basic concepts on general affine geometries and introduce the conventions used throughout the paper. In Sect. 3 we present our main results. In the first place, we consider a Lagrangian density quadratic in the curvature and torsion tensors. In Sect. 3.1 we discuss the different methods presented in the literature to obtain the field equations and explicitly derive them in the Palatini formalism. In Sect. 3.2 we obtain conditions on the parameters of the Lagrangian necessary to avoid large deviations from GR and instabilities. Then, in Sect. 3.3, we analyse the Lagrangian density with the aim of setting necessary conditions for avoiding ghost and tachyon instabilities. The conclusions are summarized in Sect. 4. We relegate some calculations and further comments to the appendices: in Appendix A we include the Gauss–Bonnet term in Riemann–Cartan geometries; in Appendix B we include detailed expressions necessary to obtain the equations of the dynamics using the Palatini formalism; in Appendix C we discuss the source terms of these equations; and, in Appendix D, we include relevant expressions for the study of the vector and pseudo-vector torsion fields around Minkowski.

2 Basic concepts and conventions

The geometric structure of a manifold can be catalogued by the properties of the affine connection. A general affine connection \(\tilde{\varGamma }\) provides three main characteristics: curvature, torsion, and non-metricity. Combinations of these quantities in the affine connection generate the geometric structure [5]. In GR it is assumed that the spacetime geometry is described by a Riemannian manifold, thus the affine connection reduces to the so-called Levi-Civita connection and the gravitational effects are only produced by the consequent curvature in terms of the metric tensor alone. Nevertheless, in a general geometrical theory of gravity the gravitational effects are generated by the whole connection, which involves a post-Riemannian approach described by curvature, torsion and non-metricity. In this scheme, there are many ways to deal with torsion and non-metricity due to different conventions. For that reason, it is important to set the conventions and definitions used throughout this work. Thus, the notation assumed for the symmetric and the antisymmetric part of a tensor A is

$$\begin{aligned}&A_{(\mu _1 \ldots \mu _s)}\equiv \frac{1}{s!}\sum _{\pi \in P(s)} A_{\pi (\mu _1) \cdots \pi (\mu _s )} , \end{aligned}$$
(1)
$$\begin{aligned}&A_{[\mu _1 \ldots \mu _s]}\equiv \frac{1}{s!}\sum _{\pi \in P(s)} \text {sgn}(\pi )A_{\pi (\mu _1) \cdots \pi (\mu _s )} , \end{aligned}$$
(2)

respectively, where P(s) is the set of all the permutations of \(1,\ldots , s\) and \(\text {sgn}(\pi )\) is positive for even permutations whereas it is negative for odd permutations.
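The definitions in Eqs. (1) and (2) can be checked numerically. The following is a minimal sketch, assuming that the tensor components are stored as NumPy arrays; the helper names `perm_sign`, `symmetrize` and `antisymmetrize` are illustrative choices and do not come from the paper.

```python
import itertools
import math

import numpy as np


def perm_sign(perm):
    """Sign of a permutation given as a tuple of 0..s-1 (counts inversions)."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def symmetrize(A):
    """Totally symmetric part A_(mu_1 ... mu_s), as in Eq. (1)."""
    s = A.ndim
    total = sum(np.transpose(A, p) for p in itertools.permutations(range(s)))
    return total / math.factorial(s)


def antisymmetrize(A):
    """Totally antisymmetric part A_[mu_1 ... mu_s], as in Eq. (2)."""
    s = A.ndim
    total = sum(perm_sign(p) * np.transpose(A, p)
                for p in itertools.permutations(range(s)))
    return total / math.factorial(s)


# A rank-2 tensor is the sum of its symmetric and antisymmetric parts.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
print(np.allclose(symmetrize(A) + antisymmetrize(A), A))
```

For rank higher than two the symmetric and antisymmetric parts no longer add up to the full tensor, but the antisymmetric part of a totally symmetric tensor still vanishes.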

In the first place, the Cartan torsion is defined as the antisymmetric part of the affine connection [4, 15,16,17],

$$\begin{aligned} T^{\mu }_{\cdot \nu \sigma }\equiv \tilde{\varGamma }^{\mu }_{\cdot [ \nu \sigma ] } . \end{aligned}$$
(3)

Note that a dot appears below the index \(\mu \) to indicate the position that it takes when it is lowered with the metric. As the difference of two connections transforms as a tensor, the Cartan torsion is a tensor. Thus, from now on we call it just the torsion and emphasise that it cannot be eliminated with a suitable change of coordinates.

In the second place, non-metricity can also be described by a third rank tensor. This is

$$\begin{aligned} Q_{\rho \mu \nu }\equiv \tilde{\nabla }_{\rho }g_{\mu \nu } , \end{aligned}$$
(4)

where \(\tilde{\nabla }\) is the covariant derivative defined from the affine connection \(\tilde{\varGamma }\). The non-metricity tensor is usually split into a trace vector \(\omega _{\rho }\equiv \frac{1}{4}Q^{\ \ \ \nu }_{\rho \nu \cdot }\), called the Weyl vector [18], and a traceless part \(\overline{Q}_{\rho \mu \nu }\),

$$\begin{aligned} Q_{\rho \mu \nu }=\omega _{\rho }g_{\mu \nu }+\overline{Q}_{\rho \mu \nu }. \end{aligned}$$
(5)
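As a quick consistency check of this splitting, and assuming a four-dimensional spacetime so that \(g^{\mu \nu }g_{\mu \nu }=4\), contracting the last two indices of the non-metricity tensor with the metric and using the definition of the Weyl vector \(\omega _{\rho }\) gives

$$\begin{aligned} g^{\mu \nu }Q_{\rho \mu \nu }=4\,\omega _{\rho }+g^{\mu \nu }\overline{Q}_{\rho \mu \nu } \quad \Longrightarrow \quad g^{\mu \nu }\overline{Q}_{\rho \mu \nu }=0 , \end{aligned}$$

so the remaining part is indeed traceless in its last two indices.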

It should be noted that there are manifolds with non-metricity in which either the Weyl vector \(\omega _{\rho }\) or the traceless part of Q is required to vanish.

Since the general connection \(\tilde{\varGamma }\) is asymmetric in the last two indices, a convention is needed for the covariant derivative of a tensor. Let \(A^{\mu _1 \cdots \mu _r}_{\cdot \ \ \cdots \ \cdot \ \nu _1 \cdots \nu _s}\) be the components of a tensor of type (r, s), then

$$\begin{aligned} \tilde{\nabla }_{\rho }A^{\mu _1 \cdots \mu _r}_{\cdot \ \ \cdots \ \cdot \ \nu _1 \cdots \nu _s}\equiv & {} \partial _{\rho }A^{\mu _1 \cdots \mu _r}_{\cdot \ \ \cdots \ \cdot \ \nu _1 \cdots \nu _s} \nonumber \\&+\,\sum _{i=1}^{r}\tilde{\varGamma }^{\mu _i}_{\cdot \lambda \rho }A^{\mu _1 \cdot \lambda \cdot \mu _r}_{\cdot \ \ \cdots \ \cdot \ \nu _1 \cdots \nu _s} \nonumber \\&-\sum _{j=1}^{s}\tilde{\varGamma }^{\lambda }_{\cdot \nu _j \rho }A^{\mu _1 \cdots \mu _r}_{\cdot \ \ \cdots \ \cdot \ \nu _1 \cdot \lambda \cdot \nu _s} . \end{aligned}$$
(6)

It is important to emphasise the ordering of the lower indices in the affine connection: the index \(\rho \) of the derivative is written in the last position.

Using the definitions presented in this section, the general connection \(\tilde{\varGamma }\) is written as [4, 15, 19]

$$\begin{aligned} \tilde{\varGamma }^{\mu }_{\cdot \nu \sigma }= \varGamma ^{\mu }_{\cdot \nu \sigma }+ W^{\mu }_{\cdot \nu \sigma } , \end{aligned}$$
(7)

with \(\varGamma ^{\mu }_{\cdot \nu \sigma }\) the Levi-Civita connection,

$$\begin{aligned} \varGamma ^{\mu }_{\cdot \nu \sigma }=\frac{1}{2}g^{\mu \rho }\varDelta _{\sigma \nu \rho }^{\alpha \beta \gamma }\partial _{\alpha }g_{\beta \gamma } , \end{aligned}$$
(8)

which is expressed in a compact form by the permutation tensor [20]

$$\begin{aligned} \varDelta _{\sigma \nu \rho }^{\alpha \beta \gamma }=\delta _{\sigma }^{\ \alpha }\delta _{\nu }^{\ \beta }\delta _{\rho }^{\ \gamma }+\delta _{\nu }^{\ \alpha }\delta _{\rho }^{\ \beta }\delta _{\sigma }^{\ \gamma }-\delta _{\rho }^{\ \alpha }\delta _{\sigma }^{\ \beta }\delta _{\nu }^{\ \gamma }, \end{aligned}$$
(9)

and the additional tensor \(W^{\mu }_{{}^{.} \nu \sigma }\) defined by the following expression:

$$\begin{aligned} W^{\mu }_{\cdot \nu \sigma }= K^{\mu }_{{}^{.} \nu \sigma }+\frac{1}{2}\left( Q^{\mu }_{\cdot \nu \sigma }-Q^{\ \mu }_{\sigma \cdot \nu }-Q^{\ \mu }_{\nu \cdot \sigma }\right) , \end{aligned}$$
(10)

where \(K^{\mu }_{{}^{.} \nu \sigma }\) is called the contortion tensor,

$$\begin{aligned} K^{\mu }_{{}^{.} \nu \sigma }=T^{\mu }_{{}^{.} \nu \sigma }-T^{ \ \mu }_{\nu {}^{.} \sigma }-T^{ \ \mu }_{\sigma {}^{.} \nu } \ . \end{aligned}$$
(11)

Note that \(Q_{\rho \mu \nu }\) is symmetric in the last two indices, while \(T^{\mu }_{\cdot \nu \sigma }\) is antisymmetric in these indices. However, the contortion, \(K^{\mu }_{{}^{.} \nu \sigma }\), is antisymmetric in the first pair of indices. This property ensures the existence of a metric-compatible connection when the non-metricity tensor vanishes.
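These index symmetries can be verified numerically. The sketch below is only illustrative: it assumes that all indices have been lowered with the metric, so that Eq. (11) reads \(K_{\mu \nu \sigma }=T_{\mu \nu \sigma }-T_{\nu \mu \sigma }-T_{\sigma \mu \nu }\), and it uses random torsion components; the function names `random_torsion` and `contortion` are hypothetical.

```python
import numpy as np


def random_torsion(n=4, seed=0):
    """Random components T_{mu nu sigma}, antisymmetric in the last two indices."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, n, n))
    return 0.5 * (X - np.einsum('mns->msn', X))


def contortion(T):
    """K_{mu nu sigma} = T_{mu nu sigma} - T_{nu mu sigma} - T_{sigma mu nu}."""
    return T - np.einsum('nms->mns', T) - np.einsum('smn->mns', T)


T = random_torsion()
K = contortion(T)

# Antisymmetry of the contortion in its first pair of indices.
antisym_first_pair = np.allclose(K + np.einsum('nms->mns', K), 0.0)

# The antisymmetric part of K in the last pair returns the torsion, as
# required by Eq. (3), since the Levi-Civita connection is symmetric.
torsion_recovered = np.allclose(0.5 * (K - np.einsum('msn->mns', K)), T)

print(antisym_first_pair, torsion_recovered)
```

Both checks follow algebraically from the antisymmetry of the torsion in its last two indices, so both are expected to print `True` for any seed.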

Furthermore, it is useful to write the torsion through its three irreducible components. These are [19]

  (i) the trace vector \( T^{\mu }_{{}^{.} \nu \mu }\equiv T_{\nu }\);

  (ii) the pseudo-trace axial vector \(S^{\nu }\equiv \epsilon ^{\alpha \beta \sigma \nu }T_{\alpha \beta \sigma }\);

  (iii) the tensor \(q^{\alpha }_{{}^{.} \beta \sigma }\), which satisfies \( q^{\alpha }_{{}^{.} \beta \alpha }=0 \) and \( \epsilon ^{\alpha \beta \sigma \nu } q_{\alpha \beta \sigma }=0 \).

Thus, the torsion field can be rewritten as

$$\begin{aligned} T^{\alpha }_{\cdot \beta \mu }=\frac{1}{3}(T_{\beta } \delta ^{\alpha }_{\ \mu }- T_{\mu } \delta ^{\alpha }_{\ \beta })+ \frac{1}{6}g^{\alpha \sigma }\epsilon _{\sigma \beta \mu \nu } S^{\nu } + q^{\alpha }_{\cdot \beta \mu } \ . \end{aligned}$$
(12)
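As a short check of the normalisation of the vector part, and assuming four spacetime dimensions so that \(\delta ^{\alpha }_{\ \alpha }=4\), contracting \(\alpha \) with \(\mu \) in Eq. (12) gives

$$\begin{aligned} T^{\alpha }_{\cdot \beta \alpha }=\frac{1}{3}\left( 4T_{\beta }-T_{\beta }\right) +\frac{1}{6}g^{\alpha \sigma }\epsilon _{\sigma \beta \alpha \nu }S^{\nu }+q^{\alpha }_{\cdot \beta \alpha }=T_{\beta } , \end{aligned}$$

where the second term vanishes because the symmetric metric is contracted with the antisymmetric Levi-Civita symbol, and the third term vanishes because q is traceless. This reproduces the definition of the trace vector in (i).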

The introduction of these new geometrical degrees of freedom leads to the generalisation of the usual definition of the curvature tensor in the Riemann spacetime, \([\nabla _{\rho },\nabla _{\sigma }]V^{\mu }=R^{\mu }_{ \cdot \nu \rho \sigma }V^{\nu } \), by the following commutation relation associated with a connection \(\tilde{\varGamma }\):

$$\begin{aligned}{}[\tilde{\nabla }_{\rho },\tilde{\nabla }_{\sigma }]V^{\mu }= \tilde{R}^{\mu }_{\cdot \nu \rho \sigma }V^{\nu }+2T^{\alpha }_{\cdot \rho \sigma }\tilde{ \nabla }_{\alpha }V^{\mu }, \end{aligned}$$
(13)

where the curvature tensor reads

$$\begin{aligned} \tilde{R}^{\mu }_{\cdot \nu \rho \sigma }= \partial _{\rho }\tilde{\varGamma }^{\mu }_{\cdot \nu \sigma } - \partial _{\sigma }\tilde{\varGamma }^{\mu }_{\cdot \nu \rho } + \tilde{\varGamma }^{\mu }_{\cdot \lambda \rho }\tilde{\varGamma }^{\lambda }_{\cdot \nu \sigma } - \tilde{\varGamma }^{\mu }_{\cdot \lambda \sigma }\tilde{\varGamma }^{\lambda }_{\cdot \nu \rho } \ . \end{aligned}$$
(14)

Using Eq. (7), the curvature tensor can be rewritten as

$$\begin{aligned} \tilde{R}^{\mu }_{\cdot \nu \rho \sigma }= & {} R^{\mu }_{ \cdot \nu \rho \sigma }+ \nabla _{\rho } W^{\mu }_{\cdot \nu \sigma } - \nabla _{\sigma }W^{\mu }_{\cdot \nu \rho } + W^{\mu }_{\cdot \lambda \rho }W^{\lambda }_{\cdot \nu \sigma }\nonumber \\&- W^{\mu }_{\cdot \lambda \sigma }W^{\lambda }_{\cdot \nu \rho } , \end{aligned}$$
(15)

with \(R^{\mu }_{ \cdot \nu \rho \sigma }\) the curvature tensor of the Riemann spacetime, commonly called the Riemann tensor, and \(\nabla \) the covariant derivative constructed from the Levi-Civita connection.

On the other hand, the generalisation of the two Bianchi identities can be computed from Eq. (14). Taking into account Eq. (3), the new Bianchi identities are

$$\begin{aligned}&\tilde{R}^{\mu }_{\cdot [\nu \rho \sigma ]}=2\tilde{\nabla }_{[\rho } T^{\mu }_{\cdot \nu \sigma ]}-4T^{\lambda }_{\cdot [\nu \rho }T^{\mu }_{\cdot \sigma ]\lambda } , \end{aligned}$$
(16)
$$\begin{aligned}&\tilde{\nabla }_{[\mu |}\tilde{R}^{\alpha }_{\cdot \beta | \nu \rho ]}=-2T^{\lambda }_{\cdot [ \mu \nu |}\tilde{R}^{\alpha }_{\cdot \beta | \rho ] \lambda } \ . \end{aligned}$$
(17)

Moreover, it is well known that not all the components of the curvature tensor (14) are independent. By definition, this tensor is antisymmetric in the last pair of indices \(\tilde{R}^{\mu }_{\cdot \nu \rho \sigma }=\tilde{R}^{\mu }_{\cdot \nu [\rho \sigma ]}\). A simple calculation using Eq. (15) shows that

$$\begin{aligned} \tilde{R}_{(\mu \nu ) \rho \sigma }=\nabla _{[ \rho }Q_{\sigma ]\mu \nu }+ T^{\lambda }_{\cdot \rho \sigma }Q_{\lambda \mu \nu }. \end{aligned}$$
(18)

Thus, when the connection is set to be metric-compatible, the curvature tensor is also antisymmetric in the first pair of indices. The symmetry of the curvature tensor under the exchange of pairs of indices depends on the torsion and non-metricity tensors. In general, for non-trivial values of those tensors, this symmetry does not hold. However, there are particular conditions under which the exchange symmetry is recovered for non-trivial values.

From now on we consider a metric-compatible connection, focusing our attention only on curvature and torsion. We denote by a hat the objects constructed from a metric-compatible connection with torsion:

$$\begin{aligned} \widehat{\varGamma } \equiv \left. \tilde{\varGamma }\right| _{Q=0}. \end{aligned}$$
(19)

All the conventions and identities that we have already presented are, of course, still valid. The Ricci tensor and the scalar curvature are obtained with the usual contractions, \( \widehat{R}_{\mu \nu }=\widehat{R}^{\sigma }_{\cdot \mu \sigma \nu } \) and \(\widehat{R}=g^{\mu \nu }\widehat{R}_{\mu \nu }\). However, the absence of symmetry under the exchange of pairs of indices in Eq. (14) allows the Ricci tensor \(\widehat{R}_{\mu \nu }\) to be non-symmetric. Indeed, the antisymmetric part of this tensor is

$$\begin{aligned} \widehat{R}_{[ \mu \nu ]}=\widehat{\nabla }_{\rho }( T^{\rho }_{\cdot \mu \nu } +\delta ^{\rho }_{\ \mu }T_{\nu }-\delta ^{\rho }_{\ \nu }T_{\mu })-2T_{\rho }T^{\rho }_{\cdot \mu \nu }\ . \end{aligned}$$
(20)

In view of this identity, a modified torsion tensor can be defined

$$\begin{aligned} \overset{\star }{T}{}^{\rho }_{\cdot \mu \nu } \equiv T^{\rho }_{\cdot \mu \nu }+\delta ^{\rho }_{\ \mu }T_{\nu }-\delta ^{\rho }_{\ \nu }T_{\mu }, \end{aligned}$$
(21)

and a modified covariant derivative can be introduced,

$$\begin{aligned} \overline{\nabla }_{\rho }\equiv \widehat{\nabla }_{\rho } -2T_{\rho }. \end{aligned}$$
(22)

Hence the antisymmetric part of the Ricci tensor is rewritten as

$$\begin{aligned} \widehat{R}_{[ \mu \nu ]}=\overline{\nabla }_{\rho }\overset{\star }{T}{}^{\rho }_{\cdot \mu \nu }. \end{aligned}$$
(23)
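The equivalence with Eq. (20) can be made explicit. Using the definitions in Eqs. (21) and (22), a short computation gives

$$\begin{aligned} \overline{\nabla }_{\rho }\overset{\star }{T}{}^{\rho }_{\cdot \mu \nu }&=\widehat{\nabla }_{\rho }\overset{\star }{T}{}^{\rho }_{\cdot \mu \nu }-2T_{\rho }\left( T^{\rho }_{\cdot \mu \nu }+\delta ^{\rho }_{\ \mu }T_{\nu }-\delta ^{\rho }_{\ \nu }T_{\mu }\right) \\&=\widehat{\nabla }_{\rho }\left( T^{\rho }_{\cdot \mu \nu }+\delta ^{\rho }_{\ \mu }T_{\nu }-\delta ^{\rho }_{\ \nu }T_{\mu }\right) -2T_{\rho }T^{\rho }_{\cdot \mu \nu } , \end{aligned}$$

since the two terms quadratic in the trace vector, \(-2T_{\mu }T_{\nu }+2T_{\nu }T_{\mu }\), cancel. The last line is the right-hand side of Eq. (20).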

The importance of this modified derivative for vectors should be stressed, since \(\partial _{\mu }(\sqrt{-g}A^{\mu })=\sqrt{-g}\,\overline{\nabla }_{\mu }A^{\mu }\) for any vector \(A^{\mu }\).
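This divergence identity can be sketched as follows, using only the conventions of Sect. 2 and the standard Levi-Civita result \(\partial _{\mu }(\sqrt{-g}A^{\mu })=\sqrt{-g}\,\nabla _{\mu }A^{\mu }\). From Eqs. (6), (7) and (10) with vanishing non-metricity,

$$\begin{aligned} \widehat{\nabla }_{\mu }A^{\mu }&=\nabla _{\mu }A^{\mu }+K^{\mu }_{\cdot \lambda \mu }A^{\lambda } , \\ K^{\mu }_{\cdot \lambda \mu }&=T^{\mu }_{\cdot \lambda \mu }-T_{\lambda }{}^{\mu }{}_{\mu }-T_{\mu }{}^{\mu }{}_{\lambda }=T_{\lambda }-0+T_{\lambda }=2T_{\lambda } , \end{aligned}$$

where the second term vanishes by the antisymmetry of the torsion in its last two indices and the third term equals \(-T^{\mu }_{\cdot \mu \lambda }=T_{\lambda }\) with its sign included. Therefore

$$\begin{aligned} \overline{\nabla }_{\mu }A^{\mu }=\widehat{\nabla }_{\mu }A^{\mu }-2T_{\mu }A^{\mu }=\nabla _{\mu }A^{\mu }=\frac{1}{\sqrt{-g}}\,\partial _{\mu }\left( \sqrt{-g}A^{\mu }\right) . \end{aligned}$$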

3 Quadratic theory of gravity

As we have already argued in the introduction, we are going to consider an action that is quadratic in the curvature tensor, in order to obtain a theory with propagating torsion [4, 6,7,8,9,10,11]. Excluding parity-violating pieces, a total of six independent scalars can be formed from the curvature tensor (14) and its contractions. In addition, three other scalars can be constructed from the torsion tensor (3). On the other hand, the Gauss–Bonnet action is known to lead to a total divergence in a 4-dimensional Riemannian manifold and, therefore, it does not produce any contribution through the variational process of the action. It is worth noting that the Gauss–Bonnet Lagrangian does not contribute to the field equations even in a Riemann–Cartan geometry [6, 21] (see footnote 1). Therefore, the terms \(\widehat{R}^2\), \(\widehat{R}_{\nu \sigma } \widehat{R}^{ \sigma \nu }\), and \(\widehat{R}_{\mu \nu \rho \sigma } \widehat{R}^{\rho \sigma \mu \nu }\) in the Lagrangian density are not independent. Throughout this work, we are going to consider the quadratic Lagrangian density from the Poincaré gauge theory of gravity, as written in Refs. [6, 7, 10, 11]. This is

$$\begin{aligned} \mathcal{{L}}_g= & {} -\lambda \widehat{R}+\frac{1}{12}(4a+b+3\lambda )T_{\mu \nu \rho }T^{\mu \nu \rho }\nonumber \\&+\,\frac{1}{6}(-2a+b-3\lambda )T_{\mu \nu \rho }T^{ \nu \rho \mu }\nonumber \\&+\,\frac{1}{3}(-a+2c-3\lambda )T^{\lambda }_{\cdot \mu \lambda }T_{\rho }^{\cdot \mu \rho } \nonumber \\&+\,\frac{1}{6}(2p+q)\widehat{R}_{\mu \nu \rho \sigma } \widehat{R}^{\mu \nu \rho \sigma } \nonumber \\&+\,\frac{1}{6}(2p+q-6r)\widehat{R}_{\mu \nu \rho \sigma } \widehat{R}^{\rho \sigma \mu \nu }\nonumber \\&+\,\frac{2}{3}(p-q)\widehat{R}_{\mu \nu \rho \sigma }\widehat{R}^{\mu \rho \nu \sigma }\nonumber \\&+\,(s+t)\widehat{R}_{\nu \sigma } \widehat{R}^{\nu \sigma } +(s-t)\widehat{R}_{\nu \sigma } \widehat{R}^{ \sigma \nu } \ , \end{aligned}$$
(24)

with \(\lambda \), a, b, c, p, q, r, s and t the free parameters of the theory. The particular combinations of the parameters that appear in the Lagrangian density have been chosen for convenience without loss of generality. Note that the scalar curvature is also included, which is the only term present in the Einstein–Cartan theory. The procedure to obtain the field equations of this Lagrangian density is summarized in Sect. 3.1. In addition, parity violating pieces can also be assumed in a natural way in the Lagrangian density leading to interesting results; see Refs. [8, 22].
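To make the parametrisation concrete, the following sketch tabulates the coefficient that multiplies each scalar in Eq. (24) for a given choice of parameters. It is plain arithmetic on Eq. (24); the function name `lagrangian_coefficients` and the dictionary labels are illustrative. The example values at the end are an observation obtained here by solving the three torsion-squared combinations for zero, not a statement taken from the references.

```python
from fractions import Fraction as F


def lagrangian_coefficients(lam, a, b, c, p, q, r, s, t):
    """Coefficient of each scalar in Eq. (24), keyed by an informal label."""
    return {
        "R": -lam,
        "T_mnr T^mnr": F(1, 12) * (4 * a + b + 3 * lam),
        "T_mnr T^nrm": F(1, 6) * (-2 * a + b - 3 * lam),
        "T_m T^m": F(1, 3) * (-a + 2 * c - 3 * lam),
        "R_mnrs R^mnrs": F(1, 6) * (2 * p + q),
        "R_mnrs R^rsmn": F(1, 6) * (2 * p + q - 6 * r),
        "R_mnrs R^mrns": F(2, 3) * (p - q),
        "R_ns R^ns": s + t,
        "R_ns R^sn": s - t,
    }


# With a = -lam, b = c = lam and p = q = r = s = t = 0, every quadratic
# coefficient cancels and only the scalar-curvature term survives.
coeffs = lagrangian_coefficients(lam=1, a=-1, b=1, c=1, p=0, q=0, r=0, s=0, t=0)
for label, value in coeffs.items():
    print(label, value)
```

Exact rational arithmetic is used so that cancellations between the parameters are not obscured by rounding.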

In this work we are interested in the stability of theories of gravity with dynamical torsion that avoid large deviations from the predictions of GR where this theory is satisfactory. In this spirit, we focus on quadratic theories, because that is the minimal modification leading to dynamical torsion, and we will not assume that all the components obtained by the irreducible decomposition of the torsion necessarily propagate. In order to study the stability of the theory, we will focus on two regimes where the metric and torsion degrees of freedom are completely decoupled from each other, through the consideration of the following conditions:

  (a) GR must be recovered when the torsion vanishes.

  (b) The theory must be stable in the weak-gravity regime.

Note that condition (a) implies both that the general relativistic predictions will be recovered when the torsion is small and that the theory is stable at least when the torsion vanishes. This condition will be imposed in Sect. 3.2 by means of the geometrical structure of the manifold, whereas the second condition will be investigated in Sect. 3.3 considering the propagation of the torsion modes in a Minkowski space. Both conditions have been studied separately in the literature using different approaches; see Refs. [6,7,8].

3.1 Field equations

The field equations of the Lagrangian density (24) have to be obtained, as usual, from a variational principle where the action is extremised with respect to the dynamical variables. However, different sets of dynamical variables can be chosen and different field equations will be obtained accordingly. On the one hand, the metric and the affine connection can be taken as completely independent variables. Then the field equations are obtained by varying the action with respect to \(g^{\mu \nu }\) and \(\tilde{\varGamma }^{\sigma }_{\cdot \mu \nu }\). This is called the Palatini formalism (see footnote 2). On the other hand, the connection can be taken to be metric-compatible from the beginning. Hence, the field equations are obtained by varying with respect to g and T, or to g and K. This procedure is sometimes called the metric or Hilbert variational method. The Palatini and Hilbert methods are known to differ only in the constraint on the symmetric part of the connection \(\tilde{\varGamma }_{(s)}{}^{\sigma }_{\cdot \mu \nu }=\varGamma ^{\sigma }_{\cdot \mu \nu }-T^{ \ \sigma }_{\mu {}^{.} \nu }-T^{ \ \sigma }_{\nu {}^{.} \mu }\); that is, they differ by a Lagrange multiplier for the metricity condition, see Refs. [23, 24]. Therefore, the two methods coincide, without imposing the Lagrange multiplier, when the related quantity turns out to be zero after solving the field equations. In addition, a third method consists in treating the theory as a gauge theory. This may be seen as being more natural, since the variables are the gauge potentials \((e_{\mu }^{\ a }, w_{\mu }^{\ a b })\). The field equations in this formalism can be found in Refs. [8, 10].

Let us use the Palatini formalism, with the metricity condition implemented as a constraint via a Lagrange multiplier \(\varLambda \), to obtain the field equations. The total Lagrangian density of the theory can be written as

$$\begin{aligned} \mathcal{{L}}=\mathcal{{L}}_g+\mathcal{{L}}_M+\varLambda ^{\ \ \ \rho }_{\nu \mu \cdot }\tilde{\nabla }_{\rho }g^{\mu \nu } , \end{aligned}$$
(25)

with \(\mathcal{{L}}_g\) from Eq. (24), \(\mathcal{{L}}_M\) the Lagrangian density for matter fields minimally coupled to gravity, and \(\varLambda ^{\ \ \ \rho }_{\nu \mu \cdot }\) a Lagrange multiplier. The use of the Lagrange multipliers in theories of gravity has been studied in Refs. [20, 25, 26]. For the sake of simplicity, we rewrite the Lagrangian density \(\mathcal{{L}}_g\) as

$$\begin{aligned} \mathcal {L}_g= & {} -\lambda \,\delta _{\alpha }^{\ \gamma }g^{\beta \delta }\tilde{R}^{\alpha }_{\cdot \beta \gamma \delta } +f_{{}_{{}_T} \lambda \alpha }^{\ \ \eta \rho \beta \gamma }\ T^{\lambda }_{\cdot \eta \rho }T^{\alpha }_{\cdot \beta \gamma }\nonumber \\&+\,f_{{}_{{}_R} \lambda \alpha }^{\ \ \eta \rho \sigma \beta \gamma \delta }\ \tilde{R}^{\lambda }_{\cdot \eta \rho \sigma }\tilde{R}^{\alpha }_{\cdot \beta \gamma \delta }, \end{aligned}$$
(26)

with the permutation tensors \(f_{{}_{{}_T} \lambda \alpha }^{\ \ \eta \rho \beta \gamma }\) and \(f_{{}_{{}_R} \lambda \alpha }^{\ \ \eta \rho \sigma \beta \gamma \delta }\) defined in Appendix B. This decomposition factorizes \(\mathcal{{L}}_g\) into parts that depend purely on the metric (the permutation tensors) and parts that depend on the connection (the curvature and torsion tensors); thus, the application of the Euler–Lagrange equations is straightforward. The field equations for the Lagrangian density (25) are

$$\begin{aligned}&\tilde{\mathcal {E}}_{\mu \nu }-(\tilde{\nabla }_{\kappa }-2T_{\kappa })\varLambda _{\nu \mu \cdot }^{\ \ \ \kappa }-\frac{1}{2}\varLambda _{\mu \nu \cdot }^{\ \ \ \kappa }g^{\alpha \beta }\tilde{\nabla }_{\kappa }g_{\alpha \beta } =\tilde{\tau }_{\mu \nu } , \end{aligned}$$
(27)
$$\begin{aligned}&\tilde{\mathcal {P}}_{\tau }^{\cdot \mu \nu }+2\varLambda _{\tau }^{\cdot \mu \nu }=\tilde{\varSigma }_{\tau }^{\cdot \mu \nu }, \end{aligned}$$
(28)
$$\begin{aligned}&\tilde{\nabla }_{\rho }g^{\mu \nu }=0. \end{aligned}$$
(29)

Note that the metricity condition is obtained as a field equation from the variation of the action with respect to the Lagrange multiplier. The definitions used in the above equations are

$$\begin{aligned} \tilde{\mathcal {E}}_{\mu \nu }\equiv & {} \frac{1}{\sqrt{-g}}\frac{\partial \sqrt{-g}\mathcal {L}_g}{\partial g^{\mu \nu }}, \end{aligned}$$
(30)
$$\begin{aligned} \tilde{\mathcal {P}}_{\tau }^{\cdot \mu \nu }\equiv & {} \frac{\partial \mathcal {L}_g}{\partial \tilde{\varGamma }^{\tau }_{\cdot \mu \nu }}- \frac{1}{\sqrt{-g}}\partial _{\kappa }\left( \sqrt{-g}\frac{\partial \mathcal {L}_g}{\partial (\partial _{\kappa } \tilde{\varGamma }^{\tau }_{\cdot \mu \nu } ) } \right) \ . \end{aligned}$$
(31)

The tensor \(\tilde{\mathcal {E}}_{\mu \nu }\) could be considered as the generalisation of the Einstein tensor for the Lagrangian density \(\mathcal {L}_g\), as it contains the dynamical information of the metric. Analogously, the tensor \(\tilde{\mathcal {P}}_{\tau }^{\cdot \mu \nu }\) is the generalisation of the Palatini tensor. The source tensors are the energy-momentum tensor

$$\begin{aligned} \tilde{\tau }_{\mu \nu } \equiv -\frac{1}{\sqrt{-g}} \frac{\partial \sqrt{-g}\mathcal {L}_M (g, \tilde{\varGamma }, \varPsi ) }{\partial g^{\mu \nu }} \, , \end{aligned}$$
(32)

and the hypermomentum tensor

$$\begin{aligned} \begin{aligned} \tilde{\varSigma }_{\tau }^{\cdot \mu \nu } \equiv&- \frac{\partial \mathcal {L}_M (g, \tilde{\varGamma }, \varPsi ) }{\partial \tilde{\varGamma }_{\cdot \mu \nu }^{\tau }}, \end{aligned} \end{aligned}$$
(33)

as defined in Refs. [20, 27].

Now, taking into account the expression of \(\mathcal {L}_g\) in Eq. (26), the generalized Einstein and Palatini tensors are

$$\begin{aligned} \tilde{\mathcal {E}}_{\mu \nu }= & {} -\lambda \tilde{G}_{(\mu \nu )}+\left( \frac{\partial f_{{}_{{}_T} \lambda \alpha }^{\ \ \eta \rho \beta \gamma } }{\partial g^{\mu \nu }} -\frac{1}{2}g_{\mu \nu }f_{{}_{{}_T} \lambda \alpha }^{\ \ \eta \rho \beta \gamma } \right) T^{\lambda }_{\cdot \eta \rho }T^{\alpha }_{\cdot \beta \gamma } \nonumber \\&+\,\left( \frac{\partial f_{{}_{{}_R} \lambda \alpha }^{\ \ \eta \rho \sigma \beta \gamma \delta } }{\partial g^{\mu \nu }} -\frac{1}{2}g_{\mu \nu } f_{{}_{{}_R} \lambda \alpha }^{\ \ \eta \rho \sigma \beta \gamma \delta } \right) \tilde{R}^{\lambda }_{\cdot \eta \rho \sigma }\tilde{R}^{\alpha }_{\cdot \beta \gamma \delta },\nonumber \\ \end{aligned}$$
(34)

where \(\tilde{G}_{(\mu \nu )}\) is the symmetric part of the Einstein tensor, and

$$\begin{aligned} \tilde{\mathcal {P}}_{\tau }^{\cdot \mu \nu }= & {} -2\lambda \left[ \overset{\star }{T}{}^{\nu \mu \cdot }_{\ \ \tau } +\delta _{\tau }^{\ \nu }\left( \tilde{\nabla }_{\lambda }g^{\mu \lambda }+ \frac{1}{2}g^{\alpha \beta }\tilde{\nabla }^{\mu }g_{\alpha \beta }\right) \right. \nonumber \\&\qquad \qquad -\left. \tilde{\nabla }_{\tau }g^{\mu \nu }-\frac{1}{2} g^{\mu \nu } g^{\alpha \beta } \tilde{\nabla }_{\tau } g_{\alpha \beta } \right] \nonumber \\&+ 2f_{{}_{{}_T} \lambda \alpha }^{\ \ \eta \rho \beta \gamma } T^{\lambda }_{\cdot \eta \rho }\frac{\partial T^{\alpha }_{\cdot \beta \gamma }}{\partial \tilde{\varGamma }^{\tau }_{\cdot \mu \nu }} +2f_{{}_{{}_R} \lambda \alpha }^{\ \ \eta \rho \sigma \beta \gamma \delta } \tilde{R}^{\lambda }_{\cdot \eta \rho \sigma } \frac{\partial \tilde{R}^{\alpha }_{\cdot \beta \gamma \delta }}{\partial \tilde{\varGamma }^{\tau }_{\cdot \mu \nu }}\nonumber \\&- \frac{2}{\sqrt{-g}} \partial _{\kappa } \left( \sqrt{-g} f_{{}_{{}_R} \lambda \alpha }^{\ \ \eta \rho \sigma \beta \gamma \delta }\tilde{R}^{\lambda }_{\cdot \eta \rho \sigma } \frac{\partial \tilde{R}^{\alpha }_{\cdot \beta \gamma \delta }}{\partial \left( \partial _{\kappa } \tilde{\varGamma }^{\tau }_{\cdot \mu \nu }\right) } \right) ,\nonumber \\ \end{aligned}$$
(35)

respectively. The full expressions of these tensors in terms of the free parameters of the Lagrangian density are shown in Appendix B.

As the metricity condition has arisen as a field equation, from now on we can consider a metric-compatible connection \(\widehat{\varGamma }\). Then the field equations (27) and (28) reduce to

$$\begin{aligned} \widehat{\mathcal {E}}_{\mu \nu }-\overline{\nabla }_{\kappa }\varLambda _{\nu \mu \cdot }^{\ \ \ \kappa }= & {} \widehat{\tau }_{\mu \nu }\end{aligned}$$
(36)
$$\begin{aligned} \widehat{\mathcal {P}}_{\tau }^{\cdot \mu \nu }+2\varLambda _{\tau }^{\cdot \mu \nu }= & {} \widehat{\varSigma }_{\tau }^{\cdot \mu \nu }\ . \end{aligned}$$
(37)
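The reduction of Eq. (27) can be spelled out in two steps. Once the connection is metric-compatible, the covariant derivative of the metric vanishes, and the remaining operator acting on the Lagrange multiplier is the modified derivative of Eq. (22):

$$\begin{aligned} \frac{1}{2}\varLambda _{\mu \nu \cdot }^{\ \ \ \kappa }g^{\alpha \beta }\widehat{\nabla }_{\kappa }g_{\alpha \beta }=0 , \qquad \left( \widehat{\nabla }_{\kappa }-2T_{\kappa }\right) \varLambda _{\nu \mu \cdot }^{\ \ \ \kappa }=\overline{\nabla }_{\kappa }\varLambda _{\nu \mu \cdot }^{\ \ \ \kappa } . \end{aligned}$$

Equation (28) keeps its form, with all tilded quantities replaced by their hatted counterparts.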

To obtain the final expression for the field equations, the Lagrange multiplier \(\varLambda \) must be solved from Eqs. (36) and (37). To this end, note that a generic third rank tensor A can always be written as

$$\begin{aligned} A_{\alpha \beta \gamma }=\varDelta _{\beta \alpha \gamma }^{\mu \nu \rho }\left( A_{\mu (\nu \rho )}-A_{[\mu \nu ] \rho } \right) \, \end{aligned}$$
(38)

where \(\varDelta _{\beta \alpha \gamma }^{\mu \nu \rho }\) is defined in Eq. (9). As \(\varLambda ^{\ \ \ \rho }_{\nu \mu \cdot }\) is symmetric in the first two indices, the antisymmetric term in Eq. (38) drops out and we can solve from Eq. (37)

$$\begin{aligned} \varLambda _{\mu \nu \rho }=\frac{1}{2}\varDelta ^{ \alpha \beta \gamma }_{\nu \mu \rho }\left( \widehat{\varSigma }_{\alpha (\beta \gamma )} -\widehat{\mathcal {P}}_{\alpha (\beta \gamma )} \right) \ . \end{aligned}$$
(39)
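The decomposition in Eq. (38), and the way a tensor that is symmetric in its first two indices is recovered from its part symmetrised in the last two, can be checked numerically. This is a minimal sketch with random components, assuming all indices are lowered and the dimension is four; the names `delta3` and `reconstruct` are illustrative.

```python
import numpy as np


def delta3(n=4):
    """Permutation tensor of Eq. (9), stored as D[sigma, nu, rho, alpha, beta, gamma]."""
    I = np.eye(n)
    return (np.einsum('sa,nb,rc->snrabc', I, I, I)
            + np.einsum('na,rb,sc->snrabc', I, I, I)
            - np.einsum('ra,sb,nc->snrabc', I, I, I))


def reconstruct(A):
    """Right-hand side of Eq. (38) for a rank-3 array A[alpha, beta, gamma]."""
    sym_last = 0.5 * (A + np.einsum('mrn->mnr', A))     # A_{mu (nu rho)}
    asym_first = 0.5 * (A - np.einsum('nmr->mnr', A))   # A_{[mu nu] rho}
    D = delta3(A.shape[0])
    return np.einsum('bagmnr,mnr->abg', D, sym_last - asym_first)


rng = np.random.default_rng(1)

# Eq. (38) holds for a generic third-rank tensor.
A = rng.normal(size=(4, 4, 4))
print(np.allclose(reconstruct(A), A))

# For a tensor symmetric in its first two indices, the part symmetrised in
# the last two indices is enough to rebuild it, which is the step to Eq. (39).
X = rng.normal(size=(4, 4, 4))
Lam = 0.5 * (X + np.einsum('nmr->mnr', X))
Lam_sym_last = 0.5 * (Lam + np.einsum('mrn->mnr', Lam))
Lam_back = np.einsum('bagmnr,mnr->abg', delta3(), Lam_sym_last)
print(np.allclose(Lam_back, Lam))
```

Both comparisons follow from expanding the three Kronecker-delta products in Eq. (9), so they are expected to hold for any input.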

Thus, the field equations become

$$\begin{aligned}&\mathcal {\widehat{E}}_{\mu \nu }-\frac{1}{2}\varDelta _{\nu \mu \kappa }^{\alpha \beta \gamma }\overline{\nabla }^{\kappa }\left( \widehat{\varSigma }_{\alpha (\beta \gamma )}-\widehat{\mathcal {P}}_{\alpha (\beta \gamma )} \right) =\widehat{\tau }_{\mu \nu } \ , \end{aligned}$$
(40)
$$\begin{aligned}&\varDelta _{\nu \mu \kappa }^{\alpha \beta \gamma }\left( \widehat{\varSigma }_{[\alpha \beta ]\gamma }-\widehat{\mathcal {P}}_{[\alpha \beta ]\gamma }\right) =0 . \end{aligned}$$
(41)

These are the general expressions of the field equations of any theory of gravity with metricity and torsion. This set of equations is equivalent to the equations obtained from a Hilbert variational principle over the variables (g, K) or (g, T), as can easily be checked. Now, taking into account the calculations shown in Appendix B for the Lagrangian density (24), these equations are

$$\begin{aligned}&-\lambda \left( \widehat{G}_{(\mu \nu )}-2\overline{\nabla }^{\kappa }\overset{\star }{T}{}_{(\mu \nu )\kappa } \right) \nonumber \\&\quad +\,\frac{1}{12}(4a+b+3\lambda )\left( 2T_{\alpha \beta \mu }T^{\alpha \beta \cdot }_{\ \ \ \nu }-T_{\mu \alpha \beta }T^{\cdot \alpha \beta }_{\nu }\right. \nonumber \\&\quad \left. -\,\frac{1}{2}g_{\mu \nu }T_{\alpha \beta \rho }T^{\alpha \beta \rho }\right) \nonumber \\&\quad +\,\frac{1}{6}(-2a+b-3\lambda )\left( T_{\alpha \beta \mu }T^{\beta \alpha \cdot }_{\ \ \ \nu } -\frac{1}{2}g_{\mu \nu }T_{\alpha \beta \rho }T^{\beta \rho \alpha } \right) \nonumber \\&\quad +\,\frac{1}{3}(-a+2c-3\lambda )\left( T_{\mu }T_{\nu } -\frac{1}{2}g_{\mu \nu }T_{\alpha }T^{\alpha } \right) \nonumber \\&\quad +\,\frac{1}{6}(2p+q)\left[ 2\widehat{R}_{\alpha \beta \lambda \mu } \widehat{R}^{\alpha \beta \lambda \cdot }_{\ \ \ \ \nu }-\frac{1}{2}g_{\mu \nu }\widehat{R}_{\alpha \beta \lambda \sigma } \widehat{R}^{\alpha \beta \lambda \sigma } \right. \nonumber \\&\quad \left. -\,4\overline{\nabla }^{\kappa }\left( \overline{\nabla }^{\lambda }\widehat{R}_{\kappa (\mu \nu )\lambda }+T_{(\mu }^{\cdot \ \lambda \beta }\widehat{R}_{\nu )\kappa \lambda \beta } \right) \right] \nonumber \\&\quad +\,\frac{1}{6}(2p+q-6r)\left[ 2\widehat{R}_{\alpha (\mu | \beta \lambda } \widehat{R}^{\beta \lambda \alpha \cdot }_{\ \ \ \ | \nu )}\right. \nonumber \\&\quad -\frac{1}{2}g_{\mu \nu }\widehat{R}_{\alpha \beta \lambda \sigma } \widehat{R}^{\lambda \sigma \alpha \beta }\nonumber \\&\quad \left. -\,4\overline{\nabla }^{\kappa }\left( \overline{\nabla }^{\lambda }\widehat{R}_{\lambda (\mu \nu )\kappa }+T_{(\mu |}^{\cdot \ \ \lambda \beta }\widehat{R}_{\lambda \beta |\nu )\kappa } \right) \right] \nonumber \\&\quad +\,\frac{2}{3}(p-q)\left[ 2\widehat{R}_{\alpha (\mu |\beta \lambda } \widehat{R}^{\alpha \beta \cdot \lambda }_{\ \ \ \ | \nu )}+\widehat{R}_{\alpha \lambda \sigma \mu }\widehat{R}^{\alpha \sigma \lambda }_{\ \ \ \ \nu } \right. 
\nonumber \\&\quad \left. -\,\widehat{R}_{\mu \alpha \lambda \sigma }\widehat{R}^{\cdot \lambda \alpha \sigma }_{\nu } -\frac{1}{2}g_{\mu \nu }\widehat{R}_{\alpha \beta \lambda \sigma } \widehat{R}^{\alpha \beta \lambda \sigma } \right. \nonumber \\&\quad -\, \left. 2\overline{\nabla }^{\kappa }\left( \overline{\nabla }^{\lambda }\widehat{R}_{\kappa (\mu \nu )\lambda }-2T_{\kappa }^{\cdot \lambda \beta }\widehat{R}_{\beta (\mu \nu )\lambda }\right. \right. \nonumber \\&\quad \left. \left. +\,2T_{(\mu }^{\ \cdot \lambda \beta }\widehat{R}_{\nu ) \beta \lambda \kappa }-2T_{(\mu |}^{\ \cdot \lambda \beta }\widehat{R}_{\kappa \beta \lambda |\nu )} \right) \right] \nonumber \\&\quad +\,(s+t) \left[ \widehat{R}_{\mu \cdot }^{\ \lambda }\widehat{R}_{\nu \lambda }+\widehat{R}^{\lambda }_{\cdot \mu }\widehat{R}_{\lambda \nu }-\frac{1}{2}g_{\mu \nu }\widehat{R}_{\alpha \beta }\widehat{R}^{\alpha \beta }\right. \nonumber \\&\quad +\,\overline{\nabla }^{\kappa }\left( g_{\mu \nu }\overline{\nabla }^{\lambda }\widehat{R}_{\kappa \lambda }\right. +\, \overline{\nabla }_{\kappa }\widehat{R}_{(\mu \nu )} - \overline{\nabla }_{(\mu }\widehat{R}_{\nu )\kappa } \nonumber \\&\quad -\, \overline{\nabla }_{(\mu |}\widehat{R}_{\kappa |\nu )}+\frac{1}{2}T_{(\mu |\kappa \cdot }^{\ \ \ \ \lambda }\widehat{R}_{|\nu ) \lambda }-\frac{1}{2}T_{\kappa (\mu \cdot }^{\ \ \ \ \lambda }\widehat{R}_{\nu ) \lambda }\nonumber \\&\quad \left. \left. -\,\frac{1}{2}T_{(\mu \nu )\cdot }^{\ \ \ \ \lambda }\widehat{R}_{\kappa \lambda } \right) \right] \nonumber \\&\quad +\,(s-t)\left[ \widehat{R}_{\mu \cdot }^{\ \lambda }\widehat{R}_{\lambda \nu }+\widehat{R}^{\lambda }_{\cdot \mu }\widehat{R}_{\nu \lambda }-\frac{1}{2}g_{\mu \nu }\widehat{R}_{\alpha \beta }\widehat{R}^{\beta \alpha }\right. \nonumber \\&\quad +\,\overline{\nabla }^{\kappa }\left( g_{\mu \nu }\overline{\nabla }^{\lambda }\widehat{R}_{ \lambda \kappa } \right. 
\nonumber \\&\quad +\, \overline{\nabla }_{\kappa }\widehat{R}_{(\mu \nu )} - \overline{\nabla }_{(\mu }\widehat{R}_{\nu )\kappa } \nonumber \\&\quad -\,\overline{\nabla }_{(\mu |}\widehat{R}_{\kappa |\nu )} +\frac{1}{2}T_{(\mu |\kappa \cdot }^{\ \ \ \ \lambda }\widehat{R}_{\lambda |\nu ) }\nonumber \\&\quad \left. \left. -\,\frac{1}{2}T_{\kappa (\mu | \cdot }^{\ \ \ \ \lambda }\widehat{R}_{ \lambda |\nu )}-\frac{1}{2}T_{(\mu \nu )\cdot }^{\ \ \ \ \lambda }\widehat{R}_{ \lambda \kappa } \right) \right] \nonumber \\&= \widehat{\tau }_{\mu \nu }+\frac{1}{2}\varDelta _{\nu \mu \kappa }^{\alpha \beta \gamma }\overline{\nabla }^{\kappa }\widehat{\varSigma }_{\alpha (\beta \gamma )} \end{aligned}$$
(42)

and

$$\begin{aligned}&-2\lambda \overset{\star }{T}{}_{\nu \mu \tau }+\frac{1}{6}(4a+b+3\lambda )T_{[\tau \mu ]\nu }\nonumber \\&\quad -\,\frac{1}{6}(-2a+b-3\lambda ) \left( T_{[\mu \tau ]\nu }+T_{\nu \mu \tau }\right) \nonumber \\&\quad +\,\frac{1}{3}(-a+2c-3\lambda ) g_{ \nu [\tau }T_{\mu ] } \nonumber \\&\quad +\,\frac{2}{3}(2p+q)\left( \overline{\nabla }^{\kappa } \widehat{R}_{\tau \mu \nu \kappa } -T_{\nu }^{\cdot \lambda \kappa }\widehat{R}_{\tau \mu \lambda \kappa } \right) \nonumber \\&\quad +\,\frac{2}{3}(2p+q-6r)\left( \overline{\nabla }^{\kappa } \widehat{R}_{ \nu \kappa \tau \mu } -T_{\nu }^{\cdot \lambda \kappa }\widehat{R}_{\lambda \kappa \tau \mu } \right) \nonumber \\&\quad +\,\frac{4}{3}(p-q)\left( \overline{\nabla }^{\kappa } \widehat{R}_{\kappa [\tau \mu ]\nu }-\overline{\nabla }^{\kappa } \widehat{R}_{\nu [\tau \mu ]\kappa }- 2T_{\nu }^{\cdot \lambda \kappa }\widehat{R}_{\kappa [\tau \mu ]\lambda } \right) \nonumber \\&\quad +\, (s+t) \left( 2g_{\nu [\tau }\overline{\nabla }^{\kappa } \widehat{R}_{\mu ]\kappa } -2 \overline{\nabla }_{[\tau } \widehat{R}_{\mu ]\nu }+ T_{\nu \cdot [\tau }^{\ \lambda }\widehat{R}_{\mu ]\lambda } \right) \nonumber \\&\quad +\,(s-t) \left( 2g_{\nu [\tau |}\overline{\nabla }^{\kappa } \widehat{R}_{\kappa |\mu ]} -2 \overline{\nabla }_{[\tau |} \widehat{R}_{\nu |\mu ]} + T_{\nu \cdot [\tau |}^{\ \lambda }\widehat{R}_{\lambda |\mu ]} \right) \nonumber \\&= \widehat{\varSigma }_{[\tau \mu ]\nu }\ . \end{aligned}$$
(43)

For an interpretation of the right sides of both field equations, see Appendix C.

3.2 Reduction to GR

We want to obtain a theory which reduces to GR when the torsion vanishes. Thus, the theory will not only be stable in this regime, but it will also deviate only slightly from the predictions of GR when the torsion is small. Note that when the torsion is set to zero, the usual Riemannian structure is recovered. Therefore, the Riemann tensor is now symmetric under the exchange of the first and the second pair of indices and the Ricci tensor is symmetric. From the first Bianchi identity (16), it follows that

$$\begin{aligned} R_{\mu \nu \rho \sigma }\left( R^{\mu \nu \rho \sigma } -2R^{\mu \rho \nu \sigma } \right) =0 \ \quad \text {for}\quad \ T^{\alpha }_{\cdot \beta \gamma }=0 . \end{aligned}$$
(44)

Then, when \(T=0\), the Lagrangian density (24) becomes

$$\begin{aligned} \left. \mathcal{{L}}_g \right| _{T=0}=-\lambda \, R+ (p-r) \, R_{\mu \nu \rho \sigma }R^{\mu \nu \rho \sigma }+2\,s \,R_{\mu \nu } R^{\mu \nu } \ . \end{aligned}$$
(45)

From this expression, it is clear that GR is recovered when \(T=0\) if and only if \(p=r\) and \(s=0\). This is the only choice of parameters that leads to GR when the torsion vanishes.
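
The coefficients in Eq. (45) can be checked quickly under one assumption that is not restated in this section: that the curvature-squared terms of the Lagrangian density (24) carry the same prefactors as the corresponding brackets in Eq. (42), namely \(\frac{1}{6}(2p+q)\), \(\frac{1}{6}(2p+q-6r)\) and \(\frac{2}{3}(p-q)\) for the three Riemann contractions and \(s\pm t\) for the two Ricci contractions. For vanishing torsion the first two Riemann contractions coincide, the third equals one half of them by Eq. (44), and the Ricci tensor is symmetric, so that

$$\begin{aligned} \frac{1}{6}(2p+q)+\frac{1}{6}(2p+q-6r)+\frac{1}{2}\cdot \frac{2}{3}(p-q)&=p-r , \\ (s+t)+(s-t)&=2s \ . \end{aligned}$$

Both combinations vanish only for \(p=r\) and \(s=0\), in line with the statement above.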

Note that the same conclusion can be reached by a different, longer route: from the field equations (42) and (43), one finds that this is the only choice of parameters that produces the Einstein equations of GR when the torsion vanishes. The same conclusion was reached in Ref. [8].

3.3 Stability in Minkowski spacetime

It is well known that the Lagrangian density (24) contains, along with the usual graviton \(2^+\), up to six new modes or torsions. These are \(2^+\), \(2^-\), \(1^+\), \(1^-\), \(0^+\) and \(0^-\), in the notation \(S^P\), where S is the spin and P is the parity of the mode. A physically meaningful restriction is to demand that the theory be stable in all the \(S^P\) sectors; see Refs. [6, 7, 28,29,30]. Theories quadratic in the curvature and torsion tensors are usually treated as gauge theories, so the variables considered are the gauge potentials of the Poincaré group \((e_{\mu }^{\ a }, w_{\mu }^{\ a b })\). The stability analysis is then carried out by constructing the spin projection operators.

In this work, however, we consider the metric formulation. We will examine the decoupling limit between the torsion and curvature degrees of freedom. Thus, in view of Eq. (15), we focus on the case where \(g_{\mu \nu }=\eta _{\mu \nu }\), with \(\eta _{\mu \nu }\) the Minkowski metric. For the sake of simplicity, we do not consider the purely tensor component of the torsion in Eq. (12). As the only torsion components compatible with a Friedmann–Lemaître–Robertson–Walker (FLRW) universe are the vectorial \(T^{i}\) and pseudo-vectorial \(S^{i}\) components [31], we assume that they are the minimum non-vanishing components that should be taken into account in this framework. In the spirit of investigating only slight modifications of GR, we assume that they are the only non-vanishing torsion components for a minimal modification over the FLRW background. Under these considerations, we will now impose the absence of ghost and tachyon instabilities for the theory given by the Lagrangian density (24). The quadratic Riemann and torsion terms that appear in this Lagrangian density are computed in Appendix D.

As we consider only the vector and pseudo-vector torsion components in Minkowski spacetime, the Lagrangian density (24) reduces in this regime to an ordinary vector and pseudo-vector field theory in flat spacetime. A general quadratic Lagrangian density for a vector \(A^{\mu }\) in flat spacetime is [32,33,34]

$$\begin{aligned} \mathcal {L}=\alpha \partial _{\mu }A_{\nu }\partial ^{\mu }A^{\nu }+\beta \partial _{\mu }A_{\nu } \partial ^{\nu }A^{\mu } +\gamma \partial _{\mu }A^{\mu }\partial _{\nu }A^{\nu }-\mathcal {V}, \end{aligned}$$
(46)

where \(\mathcal {V}\) is a possible potential for \(A^{\mu }\). However, not all the kinetic terms are independent from each other. The terms with factors \(\beta \) and \(\gamma \) are related by

$$\begin{aligned} \int \sqrt{-g}\,d^4x \ ({\nabla }_{\mu }A^{\mu })^2= & {} \int \sqrt{-g}\,d^4x \left( {\nabla }_{\mu }A_{\nu } {\nabla }^{\nu }A^{\mu }\right. \nonumber \\&\left. +{R}_{\mu \nu }A^{\mu }A^{\nu }\right) , \end{aligned}$$
(47)

as can be seen from Eq. (13). Thus, in flat spacetime these terms are related by a total derivative. On the other hand, as is well known, the Hamiltonian density of a system is obtained by performing a Legendre transformation. For this vector system, it is

$$\begin{aligned} \mathcal {H}=\pi ^{\mu }\dot{A}_{\mu }-\mathcal {L} , \end{aligned}$$
(48)

where \(\dot{A}_{\mu }\equiv \partial _0 A_{\mu }\) are the generalized velocities and \(\pi ^{\mu }\) are the canonical momenta, defined as \(\pi ^{\mu }\equiv \frac{\partial \mathcal {L}}{\partial \dot{A}_{\mu }}\). The canonical momenta of the Lagrangian density (46) are

$$\begin{aligned} \pi ^{\mu }=2\alpha \dot{A}^{\mu }+2\beta \eta ^{\mu \nu }\partial _{\nu }A^0 +2\gamma \eta ^{\mu 0}\partial _{\alpha }A^{\alpha }, \end{aligned}$$
(49)

or written in terms of the components of the four-vector,

$$\begin{aligned} \pi ^0= & {} 2(\alpha +\beta +\gamma )\dot{A}^0 +2\gamma \partial _{i}A^{i} , \end{aligned}$$
(50)
$$\begin{aligned} \pi ^i= & {} 2\alpha \dot{A}^i-2\beta \delta ^{ij}\partial _jA^0 \ . \end{aligned}$$
(51)
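
Equations (50) and (51) can be cross-checked numerically. The following minimal sketch assumes the signature \((+,-,-,-)\) used later in the text and stores the derivatives \(\partial _{\mu }A_{\nu }\) with lower indices in a hypothetical array `dA`; the function names and test values are illustrative only. Because the kinetic part of Eq. (46) is quadratic, a central difference gives its derivative exactly.

```python
from fractions import Fraction

ETA = (1, -1, -1, -1)  # diagonal Minkowski metric, signature (+, -, -, -)


def lagrangian(dA, alpha, beta, gamma):
    """Kinetic part of Eq. (46); dA[m][n] stands for d_m A_n with lower indices."""
    idx = range(4)
    grad_sq = sum(ETA[m] * ETA[n] * dA[m][n] * dA[m][n] for m in idx for n in idx)
    crossed = sum(ETA[m] * ETA[n] * dA[m][n] * dA[n][m] for m in idx for n in idx)
    div = sum(ETA[m] * dA[m][m] for m in idx)  # d_m A^m
    return alpha * grad_sq + beta * crossed + gamma * div * div


def momentum_numeric(dA, mu, alpha, beta, gamma):
    """pi^mu = dL / d(d_0 A_mu) from a central difference, exact for quadratic L."""
    plus = [list(row) for row in dA]
    minus = [list(row) for row in dA]
    plus[0][mu] += 1
    minus[0][mu] -= 1
    return (lagrangian(plus, alpha, beta, gamma)
            - lagrangian(minus, alpha, beta, gamma)) / 2


def momentum_formula(dA, mu, alpha, beta, gamma):
    """Eqs. (50) and (51), rewritten in terms of lower-index derivatives."""
    if mu == 0:
        spatial_div = -sum(dA[i][i] for i in range(1, 4))  # d_i A^i = -d_i A_i
        return 2 * (alpha + beta + gamma) * dA[0][0] + 2 * gamma * spatial_div
    # dot A^i = -d_0 A_i and delta^{ij} d_j A^0 = d_i A_0
    return -2 * alpha * dA[0][mu] - 2 * beta * dA[mu][0]


# Arbitrary exact test data (illustrative values only).
alpha, beta, gamma = Fraction(2, 3), Fraction(-5, 7), Fraction(1, 4)
dA = [[Fraction(3 * m - 2 * n + m * n + 1, 1 + n) for n in range(4)] for m in range(4)]
checks = [momentum_numeric(dA, mu, alpha, beta, gamma)
          == momentum_formula(dA, mu, alpha, beta, gamma) for mu in range(4)]
print(checks)
```

Exact rational arithmetic is used so that the comparison is an equality rather than a tolerance test.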

Then, performing the Legendre transformation (48), the Hamiltonian density reads

$$\begin{aligned} \mathcal {H}= & {} \frac{(\pi ^0-2\gamma \partial _{i}A^{i})^2}{4(\alpha +\beta +\gamma )}-\frac{(\pi ^i+2\beta \partial _iA_0)^2}{4\alpha }+ \frac{\beta }{2}F_{ij}F^{ij} \nonumber \\&+\, \alpha (\partial _iA_0)^2 -(\alpha +\beta )(\partial _iA_j)^2 -\gamma (\partial _iA^i)^2+\mathcal {V} \ , \end{aligned}$$
(52)

with \(F_{ij}=2\partial _{[i}A_{j]}\). Unfortunately, the kinetic energy of this system is unbounded from below and, therefore, the system suffers from ghost-type instabilities whatever the signs of \(\alpha \), \(\beta \) and \(\gamma \) are. This behaviour confirms that vector theories suffer from ghost-type instabilities if all the degrees of freedom of the four-vector \(A^{\mu }\) propagate (see Refs. [32, 33]). Hence, a necessary condition for the absence of this kind of instability is to make the scalar mode non-dynamical. Alternatively, the vector degrees of freedom can be frozen so that only the scalar mode propagates, but this corresponds to a scalar theory rather than a vectorial one. To remove the scalar mode, the free parameters of the theory must be chosen in such a way that the canonical momentum given in Eq. (50) vanishes. Since \(\partial _0{A}^0\) and \(\partial _{i}A^{i}\) are independent quantities, the only way to cancel the contribution of \(\partial _{i}A^{i}\) to the canonical momentum of the scalar mode is to set \(\gamma =0\). In addition, \(\alpha +\beta =0\) is needed to remove the contributions of the two remaining kinetic terms in the Lagrangian density (46) to the dynamics of the scalar mode. With these conditions, the kinetic terms in the vector Lagrangian density reduce to a Maxwell-type term \( F_{\mu \nu } F^{\mu \nu }\) that only propagates the spatial degrees of freedom of the four-vector \(A^{\mu }\). This conclusion agrees with the well-known fact that the only ghost-free vector theory in flat spacetime is the Maxwell–Proca Lagrangian density. The Hamiltonian density can then be made positive definite with \(\alpha =-\beta <0\). For a more detailed discussion of this point see Ref. [34].
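
To make the last step explicit, here is a short worked identity, with \(F_{\mu \nu }=2\partial _{[\mu }A_{\nu ]}\) assumed as the four-dimensional extension of the \(F_{ij}\) defined above. For \(\gamma =0\) and \(\beta =-\alpha \) the kinetic terms of Eq. (46) combine as

$$\begin{aligned} \alpha \left( \partial _{\mu }A_{\nu }\partial ^{\mu }A^{\nu }-\partial _{\mu }A_{\nu }\partial ^{\nu }A^{\mu }\right)&=\frac{\alpha }{2}F_{\mu \nu }F^{\mu \nu } , \\ \pi ^0&=0 \ , \end{aligned}$$

where the second line follows from Eq. (50). With the illustrative normalisation \(\alpha =-\frac{1}{2}\) this is the standard Maxwell term \(-\frac{1}{4}F_{\mu \nu }F^{\mu \nu }\), consistent with the requirement \(\alpha <0\).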

Returning to the Lagrangian density (24), when the metric is that of Minkowski spacetime the expression reduces to

$$\begin{aligned} \mathcal{{L}}_g= & {} \frac{16}{9} (p+s+t)\partial _{\mu }T_{\nu }\partial ^{\mu }T^{\nu }+\frac{16}{9}(p-2r)\partial _{\mu }T_{\nu }\partial ^{\nu }T^{\mu } \nonumber \\&+\,\frac{16}{9}(p-r+5s-t)\partial _{\mu }T^{\mu } \partial _{\nu }T^{\nu } -\frac{1}{9}t\partial _{\mu }S_{\nu }\partial ^{\nu }S^{\mu }\nonumber \\&+\,\frac{1}{9}(2r+t)\partial _{\mu }S_{\nu }\partial ^{\mu }S^{\nu } +\frac{1}{18}(3q-4r)\partial _{\mu }S^{\mu }\partial _{\nu }S^{\nu } \nonumber \\&+\,\frac{8}{27}(p-q-3t)\varepsilon ^{\mu \nu \rho \sigma }\partial _{\rho }T_{\mu }\partial _{\nu }S_{\sigma } -\mathcal{{V}}(T, S) , \end{aligned}$$
(53)

where \(\mathcal{{V}}(T, S)\) are potential-type terms of the torsion fields; see Appendix D. As discussed previously, the free parameters p, q, r, s and t must be carefully selected to produce ghost-free kinetic terms, i.e. Maxwell-type kinetic terms for the trace four-vector \(T^{\mu }\) and pseudo-trace four-vector \(S^{\mu }\). After suitable integrations by parts the expression above simplifies to

$$\begin{aligned} \mathcal{{L}}_g= & {} \frac{8}{9}(p+s+t)F_{\mu \nu }(T)F^{\mu \nu }(T) \nonumber \\&+\,\frac{1}{18}(2r+t)F_{\mu \nu }(S)F^{\mu \nu }(S)+\frac{1}{6}q\partial _{\mu }S^{\mu }\partial _{\nu }S^{\nu } \nonumber \\&+\,\frac{16}{3}(p-r+2s)\partial _{\mu }T^{\mu }\partial _{\nu }T^{\nu } -\mathcal{{V}}(T, S) \ . \end{aligned}$$
(54)
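
The step from Eq. (53) to Eq. (54) uses the flat-spacetime equivalence, up to total derivatives, \(\alpha \,\partial _{\mu }A_{\nu }\partial ^{\mu }A^{\nu }+\beta \,\partial _{\mu }A_{\nu }\partial ^{\nu }A^{\mu }+\gamma \,(\partial _{\mu }A^{\mu })^2\simeq \frac{\alpha }{2}F_{\mu \nu }F^{\mu \nu }+(\alpha +\beta +\gamma )(\partial _{\mu }A^{\mu })^2\). The sketch below checks the resulting coefficients with exact rational arithmetic; the helper names (`form`, `add`, `to_maxwell_form`) are hypothetical and only serve this illustration.

```python
from fractions import Fraction

PARAMS = ("p", "q", "r", "s", "t")


def form(factor, **coeffs):
    """Linear form factor * sum(coeffs[name] * name), stored as a tuple over PARAMS."""
    return tuple(Fraction(factor) * coeffs.get(name, 0) for name in PARAMS)


def add(*forms):
    """Sum of linear forms, component by component."""
    return tuple(sum(column) for column in zip(*forms))


def to_maxwell_form(alpha, beta, gamma):
    """Return the coefficients of F_{mu nu}F^{mu nu} and (d_mu A^mu)^2."""
    return tuple(x / 2 for x in alpha), add(alpha, beta, gamma)


# Vector sector T: coefficients read off from Eq. (53).
maxwell_T, div_T = to_maxwell_form(
    form(Fraction(16, 9), p=1, s=1, t=1),
    form(Fraction(16, 9), p=1, r=-2),
    form(Fraction(16, 9), p=1, r=-1, s=5, t=-1),
)

# Pseudo-vector sector S: coefficients read off from Eq. (53).
maxwell_S, div_S = to_maxwell_form(
    form(Fraction(1, 9), r=2, t=1),
    form(Fraction(-1, 9), t=1),
    form(Fraction(1, 18), q=3, r=-4),
)

# Compare with the coefficients quoted in Eq. (54).
print(maxwell_T == form(Fraction(8, 9), p=1, s=1, t=1))
print(div_T == form(Fraction(16, 3), p=1, r=-1, s=2))
print(maxwell_S == form(Fraction(1, 18), r=2, t=1))
print(div_S == form(Fraction(1, 6), q=1))
```

The mixed term proportional to \(\varepsilon ^{\mu \nu \rho \sigma }\partial _{\rho }T_{\mu }\partial _{\nu }S_{\sigma }\) is itself a total derivative and therefore does not enter this check.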

Since we have two dynamical fields, there are two canonical momenta. These are

$$\begin{aligned} \pi ^{\mu }_T \equiv \frac{\partial \mathcal {L}_g}{\partial (\partial _0 T_{\mu })}= & {} \frac{32}{9} (p+s+t)F^{0\mu }(T)\nonumber \\&+\,\frac{32}{3}\eta ^{0\mu }(p-r+2s)\partial _{\alpha }T^{\alpha },\end{aligned}$$
(55)
$$\begin{aligned} \pi ^{\mu }_S \equiv \frac{\partial \mathcal {L}_g}{\partial (\partial _0 S_{\mu })}= & {} \frac{2}{9}(2r+t)F^{0\mu }(S)+\frac{1}{3} \eta ^{0\mu }q\partial _{\alpha }S^{\alpha } \ . \end{aligned}$$
(56)

Written in terms of the scalar and vectorial degrees of freedom of the four-vectors we have

$$\begin{aligned} \pi ^{0}_T= & {} \frac{32}{3} (p-r+2s)\partial _{\alpha }T^{\alpha } \ ,\end{aligned}$$
(57)
$$\begin{aligned} \pi ^{i}_T= & {} \frac{32}{9}(p+s+t) \left( \dot{T}^i-\partial ^iT^0\right) ,\end{aligned}$$
(58)
$$\begin{aligned} \pi ^{0}_S= & {} \frac{1}{3}q\, \partial _{\alpha }S^{\alpha } ,\end{aligned}$$
(59)
$$\begin{aligned} \pi ^{i}_S= & {} \frac{2}{9}(2r+t) \left( \dot{S}^i-\partial ^iS^0\right) \ . \end{aligned}$$
(60)

As here we have two fields with their own kinetic terms, we need to ensure that neither of them introduces a ghost. Thus, to remove the scalar \(T^0\) and pseudo-scalar \(S^0\) degrees of freedom, we set \(p-r+2s=0\) and \(q=0\), respectively. Then the Hamiltonian density reads

$$\begin{aligned} \mathcal {H}_g= & {} -\frac{9}{64}\frac{(\pi _T^i)^2}{(p+s+t)} -\frac{8}{9}(p+s+t)F_{ij}(T)F^{ij}(T) \nonumber \\&-\,\frac{9}{4}\frac{(\pi _S^i)^2}{2r+t}-\frac{1}{18}(2r+t)F_{ij}(S)F^{ij}(S)\nonumber \\&+\,\pi ^i_T \partial _i T_0 +\pi ^i_S \partial _i S_0+\mathcal{{V}}(T, S) \ . \end{aligned}$$
(61)
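
The vector part of this Hamiltonian density can be reproduced in a few lines. As a sketch, write \(k\equiv \frac{32}{9}(p+s+t)\) (a shorthand introduced here only for illustration), keep only the \(T\)-dependent kinetic terms, and let repeated spatial indices in the same position denote a plain Euclidean sum. Inverting Eq. (58) and using \(\dot{T}_i=-\dot{T}^i\) and \(\partial ^iT^0=-\partial _iT_0\) gives

$$\begin{aligned} \dot{T}_i&=-\frac{\pi _T^i}{k}+\partial _iT_0 , \\ \left. \mathcal {L}_g\right| _{T}&=-\frac{(\pi _T^i)^2}{2k}+\frac{k}{4}F_{ij}(T)F^{ij}(T)-\mathcal {V} , \\ \pi _T^i\dot{T}_i-\left. \mathcal {L}_g\right| _{T}&=-\frac{(\pi _T^i)^2}{2k}-\frac{k}{4}F_{ij}(T)F^{ij}(T)+\pi _T^i\partial _iT_0+\mathcal {V} \ . \end{aligned}$$

Since \(\frac{1}{2k}=\frac{9}{64(p+s+t)}\) and \(\frac{k}{4}=\frac{8}{9}(p+s+t)\), the two quadratic terms are non-negative only for \(k<0\). The pseudo-vector sector follows in the same way with \(k\rightarrow \frac{2}{9}(2r+t)\).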

The kinetic energy can be bounded from below with the extra conditions of \(p+s+t<0\) and \(2r+t<0\) for the vectorial and pseudo-vectorial torsion fields, respectively. These conditions are summarised in Table 1.

On the other hand, we now require the absence of tachyon instabilities. First, we consider the weak torsion regime, that is, the regime where the terms quadratic in the torsion fields dominate the potential. Thus, the potential in the Lagrangian density (54) takes the form

$$\begin{aligned} \mathcal{{V}}(T, S)= -\frac{2}{3}(c+3\lambda )T_{\mu }T^{\mu }-\frac{1}{24}( b+3\lambda ) S_{\mu }S^{\mu } +\mathcal{{O}}(3) ; \end{aligned}$$
(62)

see Appendix D. Note that the mass term in an action for a vector field comes from a potential of the type \(V(\phi )\propto \frac{1}{2}m^2\phi _{\mu }\phi ^{\mu }\). Hence, the roles of the squared masses \(m^2\) of the vector and pseudo-vector torsion fields are played by combinations of the coupling constants b, c and \(\lambda \). These combinations must have the correct sign for the spatial components to avoid tachyon-like instabilities. In our convention \(\phi _{\mu }\phi ^{\mu }=\phi _0^2-\mathbf {\phi }^{\,\,2}\), so the combinations \(c+3\lambda \) and \(b+3\lambda \) must be positive to ensure a well-behaved vector and pseudo-vector sector, respectively (see Table 1). In summary, these simple arguments yield a set of conditions for the ghost and tachyon stability of the Lagrangian density (24) in the decoupling limit and the weak torsion regime, summarized in Table 1.

Table 1 Conditions over the free parameters of the Lagrangian density (24) for stability and reduction to GR when the torsion vanishes

In Refs. [6, 7], Sezgin and van Nieuwenhuizen provided a detailed analysis of the stability of the Lagrangian density (24) in the weak torsion field regime. These two articles were the first systematic stability analyses of this kind of theory, carried out with the spin projector formalism, and they are a key reference point on this issue. Their conclusions for the \(1^-\) torsions are compatible with those obtained here: their ghost-free condition is the same as ours, and the tachyon-free condition is compatible. For the \(1^+\) sector, however, the conclusions are incompatible. While the condition obtained in this section for a well-defined kinetic term for \(S^{\mu }\) is \(2r+t<0\), they claim that \(2r+t>0\) is needed. It is worth noting that other authors have suggested that the analysis carried out by Sezgin and van Nieuwenhuizen is not restrictive enough to ensure a ghost- and tachyon-free spectrum; see Refs. [28, 29]. In fact, the authors of Ref. [28] pointed out that they even obtain a different expression for the spin projection operator of the pseudo-vector mode. Furthermore, they argue for the relevance of the additional condition of the absence of \(p^{-4}\) poles in all spin sectors, which is not considered in the analysis of Refs. [6, 7]. In Ref. [35], Fabbri analyses the stability of the most general quadratic gravitational action with torsion and Dirac fields by demanding, in addition, a consistent decoupling between curvature and torsion that preserves continuity in the torsionless limit, concluding that the only non-vanishing component of the torsion is the pseudo-vector mode and that parity-violating terms are not allowed in the Lagrangian density. Nevertheless, owing to some lack of clarity in the existing literature, a deeper analysis of the origin of these differences is not yet available.

Let us now go beyond the weak torsion regime when analysing the potential \(\mathcal{{V}}\). Thus, higher orders in the potential can dominate its evolution. The highest order that appears in the potential is quartic, symbolically \(\mathcal{{V}}^{(4)}\),

$$\begin{aligned} \mathcal{{V}}^{(4)}(T, S)= & {} -\frac{64}{27}(p - r + 2 s) T_{\alpha }T^{\alpha }T_{\beta }T^{\beta }\nonumber \\&-\,\frac{1}{108}(p - r + 2 s)S_{\alpha }S^{\alpha }S_{\beta }S^{\beta } \nonumber \\&-\,\frac{8}{81} (2 p + 3 q - 4 r + 2 s)T_{\alpha }S^{\alpha }T_{\beta }S^{\beta } \nonumber \\&-\,\frac{8}{81}(p+r+4s)T_{\alpha }T^{\alpha }S_{\beta }S^{\beta } . \end{aligned}$$
(63)

As there are terms mixing the vector and pseudo-vector fields, we note that the potential can be diagonalized in the following basis:

$$\begin{aligned} \mathcal{{V}}^{(4)}=\left( \begin{array}{ccc} T_{\alpha }T^{\alpha }&S_{\alpha }S^{\alpha }&T_{\alpha }S^{\alpha } \end{array} \right) \mathbb {V}^{(4)} \left( \begin{array}{c} T_{\alpha }T^{\alpha } \\ S_{\alpha }S^{\alpha } \\ T_{\alpha }S^{\alpha } \end{array} \right) \ , \end{aligned}$$
(64)

with \(\mathbb {V}^{(4)}\) a \(3\times 3\) matrix. The eigenvalues of \(\mathbb {V}^{(4)}\) are

$$\begin{aligned} \lambda _1= & {} -\frac{8}{81}(2 p + 3 q - 4 r + 2 s) , \end{aligned}$$
(65)
$$\begin{aligned} \lambda _2= & {} -\frac{257}{216}\left( p-r+2s\right) -\frac{79}{72}\sqrt{ A} ,\end{aligned}$$
(66)
$$\begin{aligned} \lambda _3= & {} -\frac{257}{216}\left( p-r+2s\right) +\frac{79}{72}\sqrt{ A} \, , \end{aligned}$$
(67)

with

$$\begin{aligned} A= & {} \frac{1}{711^2}\left( 586249 p^2 - 1168402 p r + 586249 r^2 \right. \nonumber \\&+\, \left. 2349092 p s - 2332708 r s + 2357284 s^2\right) \ . \end{aligned}$$
(68)

For a positive-definite quadratic form, the three eigenvalues must be positive. Since we are only interested in the vector and pseudo-vector torsion degrees of freedom, we can assume \(p-r+2s=0\) and \(q=0\), which are the conditions found for making the scalar and pseudo-scalar modes non-dynamical, respectively. Then the expressions for the eigenvalues reduce to

$$\begin{aligned}&\lambda _1= \frac{16}{81} (p + 3 s) ,\end{aligned}$$
(69a)
$$\begin{aligned}&\lambda _2= -\frac{8}{81} (p + 3 s) ,\end{aligned}$$
(69b)
$$\begin{aligned}&\lambda _3= \frac{8}{81} (p + 3 s) . \end{aligned}$$
(69c)
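
The reduced eigenvalues can be verified directly from the coefficients of Eq. (63). The sketch below assumes that the \(T_{\alpha }T^{\alpha }S_{\beta }S^{\beta }\) term is split evenly between the two off-diagonal entries of a symmetric matrix in the basis \((T_{\alpha }T^{\alpha }, S_{\alpha }S^{\alpha }, T_{\alpha }S^{\alpha })\); the function names are illustrative. It checks that each value in Eqs. (69a) to (69c) is a root of the characteristic polynomial once \(r=p+2s\) and \(q=0\) are imposed.

```python
from fractions import Fraction


def quartic_matrix(p, q, r, s):
    """Symmetric matrix of the quartic potential, read off from Eq. (63)."""
    x = p - r + 2 * s
    a11 = Fraction(-64, 27) * x
    a22 = Fraction(-1, 108) * x
    a33 = Fraction(-8, 81) * (2 * p + 3 * q - 4 * r + 2 * s)
    a12 = Fraction(-4, 81) * (p + r + 4 * s)  # half of the T.T S.S coefficient
    return [[a11, a12, 0], [a12, a22, 0], [0, 0, a33]]


def char_poly(m, lam):
    """det(m - lam * I) for a 3x3 matrix, by cofactor expansion."""
    a = [[m[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def reduced_eigenvalues(p, s):
    """Eqs. (69a)-(69c), valid for p - r + 2s = 0 and q = 0."""
    k = Fraction(8, 81) * (p + 3 * s)
    return [2 * k, -k, k]


# Illustrative sample values; r is fixed by the constraint p - r + 2s = 0.
p, s = Fraction(5, 2), Fraction(-1, 3)
matrix = quartic_matrix(p, 0, p + 2 * s, s)
roots_ok = [char_poly(matrix, lam) == 0 for lam in reduced_eigenvalues(p, s)]
print(roots_ok)
```

Because \(\lambda _2\) and \(\lambda _3\) differ only by a sign in this limit, the three values cannot all be positive unless they all vanish.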

It is easy to see that these eigenvalues cannot all be positive at the same time for any combination of p and s. Hence, the quartic order of the potential in Eq. (61) is unstable and, therefore, this order must be removed to obtain a stable theory. This can be done by taking \(p+3s=0\). Furthermore, the third order in the potential is absent once we require that GR is recovered when the torsion vanishes. Therefore, when we take \(p=r\), \(s=0\) and \(p+3s=0\), there are only quadratic terms in the potential. Thus, the potential is stable under the same conditions as those obtained in the weak torsion field approximation, with the additional constraint \(p+3s=0\); see Table 1.
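
The conditions collected in this section can be gathered into a small checker. This is only an illustrative sketch: the function names and dictionary labels are hypothetical, the conditions are those stated in the text (not a reproduction of Table 1), and exact numbers (integers or `Fraction`) should be used so that the equalities are meaningful.

```python
from fractions import Fraction


def stability_report(b, c, lam, p, q, r, s, t):
    """Evaluate each condition stated in the text for one choice of parameters."""
    return {
        "GR recovered at T=0: p = r": p == r,
        "GR recovered at T=0: s = 0": s == 0,
        "scalar mode non-dynamical: p - r + 2s = 0": p - r + 2 * s == 0,
        "pseudo-scalar mode non-dynamical: q = 0": q == 0,
        "vector ghost-free: p + s + t < 0": p + s + t < 0,
        "pseudo-vector ghost-free: 2r + t < 0": 2 * r + t < 0,
        "vector tachyon-free: c + 3 lambda > 0": c + 3 * lam > 0,
        "pseudo-vector tachyon-free: b + 3 lambda > 0": b + 3 * lam > 0,
        "no quartic potential: p + 3s = 0": p + 3 * s == 0,
    }


def is_stable(**params):
    """True when every condition in the report holds."""
    return all(stability_report(**params).values())


# Illustrative parameter choices (not taken from the source).
good = dict(b=1, c=1, lam=1, p=0, q=0, r=0, s=0, t=Fraction(-1, 2))
bad = dict(good, t=Fraction(1, 2))
print(is_stable(**good), is_stable(**bad))
failed = [name for name, ok in stability_report(**bad).items() if not ok]
print(failed)
```

Taken together, the equalities force \(p=q=r=s=0\), so both kinetic-sign conditions collapse to \(t<0\).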

On the other hand, we should stress that stability analyses in the literature are usually carried out using a weak curvature approximation for the metric. Our stability analysis, by contrast, is made in the limit where the degrees of freedom of the torsion are completely decoupled from those of the metric. For this purpose, we have required that GR is recovered when \(T=0\) and we have investigated the stability of the torsion in flat Minkowski spacetime, assuming that only the vector and pseudo-vector modes propagate. These conditions are combined and summarized in Table 2. Therefore, we expect that the conditions obtained, which are necessary and sufficient for stability in this regime, are necessary but no longer sufficient for the stability of the theory when both curvature and torsion are present.

Table 2 Compatibility of the stability conditions studied in this paper. In the first column we show necessary conditions for a theory propagating vector or pseudo-vector torsion to be stable. Those conditions have to be supplemented (at least) by the inequality contained in the second column when the vector mode propagates and by the conditions of the last column when the pseudo-vector also propagates

4 Summary

In this work we have investigated a quadratic and parity-preserving action with curvature and torsion [6, 7, 10, 11] in order to obtain a stable theory of gravity with dynamical torsion. For this purpose, we have analysed two regimes where the degrees of freedom of the metric and those of the torsion are completely decoupled. The assumptions made in those regimes are also motivated by the search for theories whose predictions are not expected to be in great disagreement with those of GR.

On the one hand, we have assumed that the theory reduces to GR when the torsion vanishes. This implies the stability of the metric degrees of freedom in the regime where there are no torsion modes. Therefore, we have imposed the requirement that the only term independent of the torsion is contained in the scalar curvature \(\widehat{R}\), obtaining two conditions for the parameters of the general quadratic Lagrangian.

On the other hand, we have investigated the stability of the torsion when the metric is flat, following an approach that differs from the usual techniques used in the literature. We have focused on the stability of the vector and pseudo-vector torsion components in Minkowski spacetime because, in the irreducible decomposition of the torsion, they are the only components that propagate in an FLRW spacetime [31]. Therefore, it is not necessary to consider the purely tensor component if we are interested in “minimal” modifications of the predictions of GR. We have studied the stability of these fields by analysing the Hamiltonian formulation of the theory to ensure a ghost- and tachyon-free spectrum in this regime. Thus, we have obtained several conditions on the parameters of the general quadratic action with propagating torsion, which we have summarized in Table 1. Moreover, we have contrasted the conditions obtained in the weak torsion limit of this regime with those already presented in the literature [6, 7, 28, 29]. As we have discussed in detail, the disagreement with the conclusions of Refs. [6, 7] regarding the pseudo-vector field may be due to the arguments presented in Refs. [28, 29]. It should be stressed that, after this first approach, we have gone beyond the weak torsion approximation, obtaining the general conditions for the stability of the vector and pseudo-vector torsion fields in Minkowski spacetime.

In summary, we have found the most general subfamily of the Lagrangian density (24) that is stable in both decoupling regimes. This is described by

$$\begin{aligned} \mathcal{{L}}_g= & {} -\lambda \widehat{R}+\frac{1}{12}(4a+b+3\lambda )T_{\mu \nu \rho }T^{\mu \nu \rho }\nonumber \\&+\,\frac{1}{6}(-2a+b-3\lambda )T_{\mu \nu \rho }T^{ \nu \rho \mu }\nonumber \\&+\,\frac{1}{3}(-a+2c-3\lambda )T^{\lambda }_{\cdot \mu \lambda }T_{\rho }^{\cdot \mu \rho } +2t\widehat{R}_{\mu \nu }\widehat{R}^{[\mu \nu ]} \ , \end{aligned}$$
(70)

where \(b+3\lambda >0\), \(c+3\lambda >0\), and \(t<0\), and we restrict our study to theories where only the vector and pseudo-vector torsion components of the irreducible decomposition propagate.