1 Introduction

Despite the impressive success of General Relativity (GR) in explaining gravitational phenomena over a wide range of scales [1] (including the direct detection of gravitational waves [2]), there are good reasons to explore gravitation beyond GR. On the observational side, the necessity of incorporating additional components into the universe inventory (e.g. dark matter and dark energy), as well as a satisfactory inflationary mechanism, has triggered a very active quest for explanations originating from a modification of gravity [3, 4]. On the theoretical side, the appearance of classical singularities, such as the Big Bang or black hole singularities, signals the breakdown of GR at high energies. This simply reflects the loss of perturbative unitarity of GR at around the Planck scale (in an optimistic scenario), which calls for a UV completion. Wishful thinking motivates exploring modified gravity scenarios where the classical evolution can be trusted before reaching the cut-off of the theory so that, e.g., the Big Bang is replaced by a stable classical bounce or the central singularity of black holes is smoothed out by a wormhole or some other regular classical structure. This is the hope, for instance, in Born-Infeld-like gravity theories (see e.g. [5] and references therein).

The intimate relation between GR and the geometry of spacetime makes it natural to explore modifications to GR based on extending its geometrical framework, which in turn could be related to carefully accounting for the microscopic nature of spacetime itself. It is worth emphasising that the very possibility of interpreting gravity on geometrical grounds is rooted in the most fundamental property of GR, namely that it describes an interacting massless spin-2 particle. Once the nature of the interaction has been established as such, one is naturally led to the conclusion that this particle must couple universally, thus giving rise to the equivalence principle and, consequently, to the very appealing relation of gravity to the geometry of spacetime. The precise geometrical framework in which GR is to be interpreted is, however, conventional. We are most accustomed to regarding gravity as a manifestation of the spacetime curvature within a pseudo-Riemannian framework where the connection is devoid of any relevant role. However, this is just one possibility. In general, given a metric, the connection can be determined by its torsion (describing its antisymmetric part) and its non-metricity (measuring the failure of the connection to be metric-compatible). It is remarkable that the very same dynamics of GR can in fact be ascribed to these two objects in flat geometries [6,7,8,9,10,11], i.e., geometries with vanishing curvature. In this context, it seems legitimate to grant the connection its rightful place in the geometrical scenario and construct theories where the metric and the connection are treated on an equal footing.

The generality of the metric-affine framework, where the connection introduces \(D^3\) additional degrees of freedom (dof’s) in D dimensions, usually motivates the introduction of some restrictions, either on the class of geometries or on the actions considered, in order to work with manageable theories. Popular restrictions on the geometries are, for instance, the teleparallel framework, where the connection is imposed to be flat, or Weyl geometries and their generalisations, where the connection contains one additional vector field associated with the gauging of scale invariance. Alternatively, the connection can be left completely arbitrary and a judicious choice of the action can be made so that the dynamics is substantially simplified. A paradigmatic example is the Einstein–Hilbert term, which gives the same dynamics regardless of the formalism employed because, in the metric-affine formulation, the connection is an auxiliary field whose equations of motion determine it to be the Levi-Civita connection up to an irrelevant projective mode. Another class of theories that has recently received attention is formed by the so-called Ricci-Based Gravity (RBG) theories, where the action is constructed in terms of the Ricci tensor. Furthermore, most studies restrict this class even further by imposing a projective symmetry so that only the symmetric part of the Ricci tensor is allowed. This practical restriction turns out to be physically meaningful because it deprives the connection of any dynamics and, in fact, the resulting theories are nothing but GR in disguise.

A common argument in favour of general metric-affine theories of gravity including higher order curvature terms is the manifest second order nature of the field equations. This property is sometimes claimed to avoid the presence of the Ostrogradski instabilities that plague their corresponding metric formulation [12, 13]. However, this reasoning is flawed because it overlooks the true origin of the pathologies. A very simple illustration of the incorrectness of this naive expectation is to notice that one can always reduce the order of some field equations by introducing suitable auxiliary fields. A better way of phrasing the criterion to avoid Ostrogradski instabilities is to require second order field equations on the constraint surface in phase space or, in other words, after having solved the constraints of the system and/or having integrated out all the auxiliary fields. In practice, this can lead to non-local actions that can obscure the number and nature of the physical dof’s.
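A minimal sketch of this point, using a toy scalar model that is an assumption made here purely for illustration (it is not part of the theories analysed in this paper): a fourth-order Lagrangian for a scalar \(\phi \) can be traded for second-order equations by introducing an auxiliary field \(\chi \), yet the ghost survives.

$$\begin{aligned} {\mathcal {L}}&=\frac{1}{2}(\Box \phi )^2\quad \longleftrightarrow \quad {\mathcal {L}}'=\chi \Box \phi -\frac{1}{2}\chi ^2\simeq -\partial _\mu \chi \partial ^\mu \phi -\frac{1}{2}\chi ^2,\nonumber \\ \phi&=\frac{u+v}{\sqrt{2}},\quad \chi =\frac{u-v}{\sqrt{2}}\quad \Rightarrow \quad {\mathcal {L}}'\simeq -\frac{1}{2}(\partial u)^2+\frac{1}{2}(\partial v)^2-\frac{1}{4}(u-v)^2. \end{aligned}$$

Here \(\simeq \) denotes equality up to a total derivative. The equations of motion of \({\mathcal {L}}'\) are of second order (\(\Box \phi =\chi \) and \(\Box \chi =0\)), but the kinetic terms of u and v carry opposite signs, so one of the two fields is a ghost whichever signature convention is adopted.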

Thus, in order to properly diagnose the presence of Ostrogradski instabilities and/or other pathologies, a better strategy is desirable. The most direct way of tackling the issue would be to perform an exhaustive Hamiltonian analysis to identify the presence of constraints and work out the corresponding algebra. However, this can be technically difficult and, on many occasions, simpler approaches can be used to show the presence of pathologies. It is especially useful to check the breaking of gauge symmetries and the operators producing it as a way to easily identify ghostly dof’s. This approach was used in [14] to show that generalized RBGs where the projective symmetry is explicitly broken generally contain additional ghostly dof’s. These come in two forms: a spin-1 ghost arising from the absence of a pure kinetic term for the projective mode, and Ostrogradski instabilities associated with non-minimal couplings of the additional 2-form field present in the theory. In [15], the important role played by the projective symmetry in the construction of healthy scalar-tensor theories was explored. These studies highlight the relevance of the projective symmetry for avoiding pathologies. In this work we will provide a more detailed analysis of RBGs and show the pathological character of non-projectively invariant RBG theories from different perspectives. Furthermore, we will argue that our findings naturally extend to general metric-affine theories, thus making it clear that care must be taken when formulating theories with non-linear terms in the curvature even when formulated for a general affine connection. We will also illustrate how to evade these no-go results in different ways.

The paper is organised as follows. To make the paper as self-contained as possible, we start in Sect. 2 by giving a complete and detailed introduction to the metric-affine formalism with all the necessary relations that we will use throughout the paper. In Sect. 3 we introduce the generalized RBG theories with the derivation of the field equations and the transformation to the Einstein frame. After introducing the general case, we briefly discuss the projectively invariant theories in Sect. 4, where we seize the opportunity to clarify some potential misconceptions. We then move to the core of the paper in Sect. 5, where we study the non-projectively invariant theories. We discuss how the corresponding Einstein frame relates to the Non-symmetric Gravity Theory so that the results obtained for those theories are also applicable to generalized RBGs. We study a certain decoupling limit of the theory to show the appearance of ghosts that jeopardise the stability of the theories. We also show the problem by directly solving the connection equations. After diagnosing the sources of instabilities, in Sect. 6 we consider matter interactions that include direct couplings to the connection and show that they cannot, in general, stabilise the theories. In Sect. 7 we suggest several procedures that can lead to the avoidance of the ghosts, essentially by constraining the connection in different ways. We then consider a hybrid framework and show that such theories are even more prone to exhibiting ghosts. We finally give our main conclusions in the discussions.

2 Generalities of the metric-affine framework

Since we are going to deal with metric-affine gravity theories, it will be convenient to introduce the corresponding geometrical framework. A space-time is defined as a manifold endowed with metric and affine structures, determined by a metric \(g_{\mu \nu }\) and a connection \(\Gamma ^\alpha {}_{\mu \nu }\) respectively. Unlike the metric formalism, where the affine connection is assumed to be completely specified by the metric as the unique symmetric and metric-compatible connection, the metric-affine framework does not establish any a priori relation between the connection and the metric, and both are regarded as fundamental fields. Despite the complete independence of the metric and the connection, it is often very useful to exploit the fact that, given a metric, there is a distinguished connection, namely the Levi-Civita connection, so the independent affine connection admits the following decomposition into three distinct pieces:

$$\begin{aligned} \Gamma ^\alpha {}_{\mu \nu }=\bar{\Gamma }^\alpha {}_{\mu \nu }(g)+L^\alpha {}_{\mu \nu }(Q)+K^\alpha {}_{\mu \nu }({\mathcal {T}}). \end{aligned}$$
(1)

These pieces are respectively called Christoffel symbols of \(g_{\mu \nu }\), distortion tensor and contortion tensor and are given by:

$$\begin{aligned}&\bar{\Gamma }^\alpha {}_{\mu \nu }(g)=\frac{1}{2}g^{\alpha \lambda }\left( \partial _{\mu }g_{\lambda \nu }+\partial _{\nu }g_{\mu \lambda }-\partial _{\lambda }g_{\mu \nu }\right) , \end{aligned}$$
(2)
$$\begin{aligned}&L^\alpha {}_{\mu \nu }(Q)=\frac{1}{2}\left( Q^\alpha {}_{\mu \nu }-Q_{\mu \;\;\nu }^{\;\;\alpha }-Q_{\nu \;\;\mu }^{\;\;\alpha }\right) , \end{aligned}$$
(3)
$$\begin{aligned}&K^\alpha {}_{\mu \nu }({\mathcal {T}})=\frac{1}{2}\left( {\mathcal {T}}^\alpha {}_{\mu \nu }+{\mathcal {T}}_{\mu }{}^\alpha {}_\nu +{\mathcal {T}}_\nu {}^\alpha {}_\mu \right) , \end{aligned}$$
(4)

where \(Q_{\alpha \mu \nu }\equiv \nabla _\alpha g_{\mu \nu }\) and \({\mathcal {T}}^\alpha {}_{\mu \nu }\equiv 2\Gamma ^\alpha {}_{[\mu \nu ]}\) are the non-metricity and torsion tensors respectively. Notice that whereas the torsion is a property of the affine connection alone, the non-metricity defines the relation between the metric and the symmetric part of the connection, and thus it is not a property of the affine connection alone. Notice also that all the non-tensorial behaviour of the connection is encoded in the Christoffel symbols, because the distortion and contortion pieces are genuine rank-3 tensors: they can be regarded as the difference of two connections. As a consequence, whereas one could choose coordinates to make the connection, or the Christoffel symbols of \(g_{\mu \nu }\), vanish at a point, this cannot be done for the contortion and distortion pieces. The curvature associated with the connection is given by the Riemann tensor, which reads

$$\begin{aligned} {\mathcal {R}}^\alpha {}_{\beta \mu \nu }(\Gamma )\equiv \partial _{\mu }\Gamma ^\alpha {}_{\nu \beta }-\partial _{\nu }\Gamma ^\alpha {}_{\mu \beta }+\Gamma ^\alpha {}_{\mu \lambda }\Gamma ^{\lambda }{}_{\nu \beta }-\Gamma ^\alpha {}_{\nu \lambda }\Gamma ^{\lambda }{}_{\mu \beta }. \end{aligned}$$
(5)

For an arbitrary affine connection and in the presence of a metric, we can define the following three independent traces of the Riemann tensor:

$$\begin{aligned}&{\mathcal {R}}_{\mu \nu }(\Gamma )={\mathcal {R}}^\alpha {}_{\mu \alpha \nu }(\Gamma ),\end{aligned}$$
(6)
$$\begin{aligned}&{\mathcal {P}}^\mu {}_{\nu }(g,\Gamma )=g^{\alpha \beta }{\mathcal {R}}^\mu {}_{\alpha \nu \beta }(\Gamma ),\end{aligned}$$
(7)
$$\begin{aligned}&{\mathcal {Q}}_{\mu \nu }(\Gamma )={\mathcal {R}}^\alpha {}_{\alpha \mu \nu }(\Gamma ), \end{aligned}$$
(8)

which are called the Ricci, co-Ricci and homothetic tensors respectively. The trace of the Ricci tensor defines the Ricci scalar \({\mathcal {R}}(\Gamma )=g^{\mu \nu }{\mathcal {R}}_{\mu \nu }\), which also coincides with the trace of the co-Ricci tensor. Since the homothetic tensor is antisymmetric by construction, its trace vanishes. Notice that while the Ricci and the homothetic tensors exist in the absence of a metric, the co-Ricci tensor and the Ricci scalar do not. Another relevant property worth mentioning concerns the symmetries of a general Riemann tensor. While the (covariant version of the) Riemann tensor defined from a (symmetric) metric tensor satisfies \({\mathcal {R}}_{\alpha \beta \mu \nu }(g)=-{\mathcal {R}}_{\alpha \beta \nu \mu }(g)=-{\mathcal {R}}_{\beta \alpha \mu \nu }(g)={\mathcal {R}}_{\mu \nu \alpha \beta }(g)\), the Riemann tensor of a general connection is only antisymmetric in its last two indices: \({\mathcal {R}}_{\mu \nu \alpha \beta }(\Gamma )\ne {\mathcal {R}}_{\alpha \beta \mu \nu }(\Gamma )=-{\mathcal {R}}_{\alpha \beta \nu \mu }(\Gamma )\ne -{\mathcal {R}}_{\beta \alpha \mu \nu }(\Gamma )\). This implies that, while the Ricci tensor of a (symmetric) metric is always symmetric, the Ricci tensor of a general connection is not. Even if the connection is symmetric, the Ricci tensor will in general develop an antisymmetric piece.
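The last statement can be made explicit with a short computation from the definitions (5), (6) and (8), sketched here under the assumption of a torsion-free connection, \(\Gamma ^\alpha {}_{[\mu \nu ]}=0\). In that case the terms quadratic in the connection cancel upon antisymmetrisation, and one is left with

$$\begin{aligned} {\mathcal {R}}_{[\mu \nu ]}(\Gamma )=\partial _{[\mu }\Gamma ^\alpha {}_{\nu ]\alpha }=\frac{1}{2}{\mathcal {Q}}_{\mu \nu }(\Gamma ). \end{aligned}$$

For the Levi-Civita connection one has \({\bar{\Gamma }}^\alpha {}_{\nu \alpha }=\partial _\nu \ln \sqrt{\vert g\vert }\), so the right hand side vanishes and the Ricci tensor is symmetric, as expected. For a generic symmetric connection the trace \(\Gamma ^\alpha {}_{\nu \alpha }\) need not be a gradient, and the antisymmetric part is then non-zero.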

Through the parallel transport operation, any affine connection defines a preferred set of paths named pre-geodesics or auto-parallel paths. Any of these paths can be defined as the equivalence class of curves that are a solution of the parallel transport equation

$$\begin{aligned} \frac{\text {d} x^\mu }{\text {d}\lambda }\nabla _\mu \left( \frac{\text {d} x^\nu }{\text {d}\lambda }\right) =f(\lambda )\frac{\text {d}x^\nu }{\text {d}\lambda }, \end{aligned}$$
(9)

where two curves are equivalent if one is a re-parametrization of the other or, put another way, if their image on the manifold is the same set of points. This set of points is the path corresponding to the equivalence class, and the element of the class parametrized in such a way that the right hand side of (9) vanishes is the affinely-parametrized geodesic representative of the class. Auto-parallel paths are occasionally stipulated to describe the trajectories of test particles (see however the paragraph below and Sect. 6.3 for a detailed discussion on the physical relevance of this stipulation), so it is interesting from the physical point of view to ask whether there is a group of transformations acting on the affine connection that is a symmetry of the set of pre-geodesics. Indeed, it is known that projective transformations, which act on the connection as

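$$\begin{aligned} \Gamma ^\alpha {}_{\mu \nu }\mapsto \Gamma ^\alpha {}_{\mu \nu }+\xi _\mu \delta ^\alpha {}_\nu , \end{aligned}$$
(10)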

leave pre-geodesics invariant. Here \(\xi _\mu \) is any smooth 1-form field. Indeed given a pre-geodesic of \(\Gamma ^\alpha {}_{\mu \nu }\), any parametrization of that path is its affine-parametrization for the connection \({\bar{\Gamma }}^\alpha {}_{\mu \nu }=\Gamma ^\alpha {}_{\mu \nu }+A_\mu \delta ^\alpha {}_\nu \) where, as is immediately seen by writing the projectively transformed version of (9), \(A_\mu \) has to satisfy \(A_\mu \frac{\text {d}x^\mu }{\text {d}\lambda }=-f(\lambda )\). The meaning of this fact is that, regarding pre-geodesics, a change in the connection by a projective transformation is equivalent to a reparametrization of the solutions of (9), and therefore the equivalence classes of curves related by parametrization (i.e. pre-geodesics) are left invariant under projective transformations. Moreover, at least locally, it is clear from the properties of smooth 1-forms that the projective transformations form a continuous group.
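The statement above can be checked in one line. As a sketch, assume the convention \(\nabla _\mu V^\nu =\partial _\mu V^\nu +\Gamma ^\nu {}_{\mu \alpha }V^\alpha \) (spelled out here as an assumption, chosen to agree with the index placement used for the curvature) and write \(\dot{x}^\mu \equiv \text {d}x^\mu /\text {d}\lambda \). For a curve satisfying (9), the shifted connection \({\bar{\Gamma }}^\alpha {}_{\mu \nu }=\Gamma ^\alpha {}_{\mu \nu }+A_\mu \delta ^\alpha {}_\nu \) gives

$$\begin{aligned} \dot{x}^\mu {\bar{\nabla }}_\mu \dot{x}^\nu =\dot{x}^\mu \nabla _\mu \dot{x}^\nu +A_\mu \dot{x}^\mu \dot{x}^\nu =\big [f(\lambda )+A_\mu \dot{x}^\mu \big ]\dot{x}^\nu , \end{aligned}$$

so the right hand side vanishes precisely when \(A_\mu \dot{x}^\mu =-f(\lambda )\): the same curve, with the same parametrization, is affinely parametrized with respect to \({\bar{\Gamma }}^\alpha {}_{\mu \nu }\).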

The above discussion implicitly promotes the autoparallel equation to a fundamental level, establishing how particles move in a general affinely connected spacetime. This is not, however, what occurs in standard physics, where the classical equations are derived from an action principle. If we consider a massive test particle, the action is proportional to the line element computed along its trajectory, i.e., \({\mathcal {S}}=-m\int {\mathrm{d}}s\) with \({\mathrm{d}}s=\sqrt{\vert g_{\mu \nu }\dot{x}^\mu \dot{x}^\nu \vert }{\mathrm{d}}\lambda \). This action is obviously only sensitive to the metric and hence the particle cannot see the affine structure. The equations derived from this action for the particle will be the metric geodesic equations, which coincide with the autoparallel equation for the Levi-Civita connection. If we insist on giving a preferred role to the autoparallel equation, we should add an explicit coupling to the affine connection in the equations. At a more fundamental level, this amounts to including direct couplings between the connection and the fields. We will come back to these issues in Sect. 6.3 with a more detailed discussion.
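The mismatch between the two notions of preferred curve can be made explicit by inserting the decomposition (1) into the geodesic equation that follows from this action. A sketch, assuming an affine parameter \(\lambda \) and the convention \(\nabla _\mu V^\nu =\partial _\mu V^\nu +\Gamma ^\nu {}_{\mu \alpha }V^\alpha \) (both stated here as assumptions):

$$\begin{aligned} \ddot{x}^\mu +{\bar{\Gamma }}^\mu {}_{\alpha \beta }(g)\dot{x}^\alpha \dot{x}^\beta =0\quad \Longleftrightarrow \quad \dot{x}^\alpha \nabla _\alpha \dot{x}^\mu =\big [L^\mu {}_{\alpha \beta }(Q)+K^\mu {}_{\alpha \beta }({\mathcal {T}})\big ]\dot{x}^\alpha \dot{x}^\beta . \end{aligned}$$

Metric geodesics are therefore autoparallels of \(\Gamma ^\alpha {}_{\mu \nu }\) only when the contraction of the distortion and contortion pieces with \(\dot{x}^\alpha \dot{x}^\beta \) is at most proportional to \(\dot{x}^\mu \), which is trivially the case for the Levi-Civita connection.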

Since the projective symmetry plays a fundamental role in affine geometries and, by extension, in the dynamics of metric-affine theories (as we will see below), it is useful to unveil the transformation properties of the different tensorial objects associated to the connection under projective transformations. For the sake of generality, let us first write down the transformation law of these connection-related tensorial objects for an arbitrary change in the connection given by

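$$\begin{aligned} \Gamma ^\alpha {}_{\mu \nu }={\bar{\Gamma }}^\alpha {}_{\mu \nu }+\delta \Gamma ^\alpha {}_{\mu \nu }. \end{aligned}$$
(11)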

Due to its traditional importance in gravitational theories, let us start by writing down the corresponding transformation rule satisfied by the Riemann curvature tensor and other associated tensors. We find the relation

$$\begin{aligned} {\mathcal {R}}^\alpha {}_{\beta \mu \nu }(\Gamma )&={\mathcal {R}}^\alpha {}_{\beta \mu \nu }({\bar{\Gamma }})+2{\bar{\nabla }}_{[\mu }\delta \Gamma ^\alpha {}_{\nu ]\beta }+{{\bar{{\mathcal {T}}}}}^\lambda {}_{\mu \nu }\delta \Gamma ^\alpha {}_{\lambda \beta }\nonumber \\&\quad +2\delta \Gamma ^\alpha {}_{[\mu |\lambda |}\delta \Gamma ^\lambda {}_{\nu ]\beta }, \end{aligned}$$
(12)

where the connection-related objects with an over-bar are defined in terms of the background connection \({{\bar{\Gamma }}}^\alpha {}_{\mu \nu }\). By taking the corresponding traces we are led to

$$\begin{aligned}&{\mathcal {R}}_{\mu \nu }(\Gamma )={\mathcal {R}}_{\mu \nu }(\bar{\Gamma })+2{{\bar{\nabla }}}_{[\alpha }\delta \Gamma ^\alpha {}_{\nu ]\mu }+{{\bar{{\mathcal {T}}}}}^\lambda {}_{\alpha \nu }\delta \Gamma ^\alpha {}_{\lambda \mu }\nonumber \\&\quad +2\delta \Gamma ^\alpha {}_{[\alpha |\lambda |}\delta \Gamma ^\lambda {}_{\nu ]\mu },\end{aligned}$$
(13)
$$\begin{aligned}&{\mathcal {P}}^\mu {}_{\nu }(g,\Gamma )={\mathcal {P}}^\mu {}_{\nu }(g,\bar{\Gamma })+{{\bar{\nabla }}}_\nu \delta \Gamma ^{\mu \alpha }{}_{\alpha }-{{\bar{\nabla }}}_\alpha \delta \Gamma ^{\mu \;\;\alpha }_{\;\;\nu }\nonumber \\&\quad +{{\bar{{\mathcal {T}}}}}^\lambda {}_{\nu \alpha }\delta \Gamma ^{\mu }{}_{\lambda }{}^{\alpha } +2\delta \Gamma ^\mu {}_{[\nu |\lambda |}\delta \Gamma ^{\lambda }{}_{\alpha ]}{}^{\alpha },\end{aligned}$$
(14)
$$\begin{aligned}&{\mathcal {Q}}_{\mu \nu }(\Gamma )={\mathcal {Q}}_{\mu \nu }(\bar{\Gamma })+2\partial _{[\mu }\delta \Gamma ^\alpha {}_{\nu ]\alpha }\;. \end{aligned}$$
(15)

Also the changes undergone by the torsion and non-metricity tensors are given by

$$\begin{aligned}&{\mathcal {T}}^\alpha {}_{\mu \nu }(\Gamma )={{\bar{{\mathcal {T}}}}}^\alpha {}_{\mu \nu }+2\delta \Gamma ^\alpha {}_{[\mu \nu ]},\end{aligned}$$
(16)
$$\begin{aligned}&Q_{\alpha \mu \nu }(g,\Gamma )=Q_{\alpha \mu \nu }(g,\bar{\Gamma })-2\delta \Gamma _{(\mu |\alpha |\nu )}, \end{aligned}$$
(17)

and therefore, the contortion and distortion tensors change as

$$\begin{aligned}&K^\alpha {}_{\mu \nu }(\Gamma )=K^\alpha {}_{\mu \nu }(\bar{\Gamma })-\delta \Gamma _{(\mu \nu )}{}^\alpha {}+\delta \Gamma ^\alpha {}_{[\mu \nu ]}+\delta \Gamma _{(\mu }{}^{\alpha }{}_{\nu )},\end{aligned}$$
(18)
$$\begin{aligned}&L^\alpha {}_{\mu \nu }(g,\Gamma )=L^\alpha {}_{\mu \nu }(g,\bar{\Gamma })-\delta \Gamma _{(\mu }{}^{\alpha }{}_{\nu )}+\delta \Gamma ^\alpha {}_{(\mu \nu )}+\delta \Gamma _{(\mu \nu )}{}^\alpha \;. \end{aligned}$$
(19)

Now that the transformation laws under a general shift in the connection (11) are given, notice that a projective transformation (10) is a special case of (11) with \(\delta \Gamma ^\alpha {}_{\mu \nu }=-\xi _\mu \delta ^\alpha {}_{\nu }\). The transformation properties of the different curvature tensors under a projective transformation are then:

$$\begin{aligned}&{\mathcal {R}}^\alpha {}_{\beta \mu \nu }(\Gamma )={\mathcal {R}}^\alpha {}_{\beta \mu \nu }({\bar{\Gamma }})-F_{\mu \nu }\delta ^\alpha {}_\beta \nonumber \\&{\mathcal {R}}_{\mu \nu }(\Gamma )={\mathcal {R}}_{\mu \nu }(\bar{\Gamma })-F_{\mu \nu }\;\nonumber \\&{\mathcal {P}}^\mu {}_{\nu }(\Gamma )={\mathcal {P}}^\mu {}_{\nu }(\bar{\Gamma })+F^{\mu }{}_{\nu } \nonumber \\&{\mathcal {Q}}_{\mu \nu }(\Gamma )={\mathcal {Q}}_{\mu \nu }(\bar{\Gamma })-DF_{\mu \nu }\; \nonumber \\&K^\alpha {}_{\mu \nu }(\Gamma )=K^\alpha {}_{\mu \nu }(\bar{\Gamma })+\xi _\nu \delta ^\alpha {}_\mu -g_{\mu \nu }\xi ^\alpha \nonumber \\&L^\alpha {}_{\mu \nu }(g,\Gamma )=L^\alpha {}_{\mu \nu }(g,\bar{\Gamma })+g_{\mu \nu }\xi ^\alpha -2\xi _{(\mu }\delta ^\alpha {}_{\nu )} \end{aligned}$$
(20)

where \(F_{\mu \nu }=2\partial _{[\mu }\xi _{\nu ]}\) is the field-strength of the projective mode \(\xi \), and D is the number of space-time dimensions. The above transformation laws reveal some interesting properties that will be crucial for the construction of metric-affine theories. Firstly, let us note that projective transformations leave invariant the symmetric parts of the Ricci and co-Ricci tensors, but not their antisymmetric parts. This fact is important for the following reason. It is well known that higher order curvature gravity theories in the metric formalism propagate ghostly degrees of freedom (except Lovelock theories), which can be traced back to the fact that their equations of motion for the metric are of fourth order and present Ostrogradski instabilities. The reason is that, in the metric formalism, the connection contains \(\Gamma \supset \partial g\), so the Riemann and associated curvature tensors contain \({\mathcal {R}}^\alpha {}_{\beta \mu \nu }\supset \partial ^2g\). Therefore, introducing powers of the Riemann tensor of order higher than one in the action gives rise to equations of motion for the metric of differential order higher than two. Remarkably, this is not true in the metric-affine formalism, where the connection is no longer related to derivatives of the metric but is a fundamental field, and therefore arbitrary powers of the Riemann tensor in the action do not produce higher order equations of motion for the metric. Relying on this fact, as commented in the Introduction, it is sometimes argued that metric-affine higher order curvature gravity theories do not propagate ghost-like degrees of freedom. This belief is inaccurate because, as explicitly shown in [14], ghosts arise in theories of gravity whose actions are arbitrary analytic functions of the full Ricci tensor unless further constraints are imposed.
We will rederive these results with different complementary approaches below. However, it is also known that metric-affine theories whose actions are an arbitrary analytic function of the symmetric part of the Ricci tensor do not propagate more than the two degrees of freedom of the graviton, and are in fact ghost-free. These two facts can be understood in light of the projective symmetry. Indeed, as we have seen above, since \({\mathcal {R}}_{(\mu \nu )}\) is invariant under projective transformations, an action which is an arbitrary function only of the symmetric part of the Ricci tensor enjoys a projective symmetry. On the other hand, an action which is a function of the full Ricci tensor does not, because the \({\mathcal {R}}_{[\mu \nu ]}\) part explicitly breaks it. We will see that the breaking of this symmetry unleashes five extra degrees of freedom associated with the presence of the dynamical projective mode, with an unavoidable ghostly sector. This poses a serious drawback for general non-projectively invariant gravity theories. There is a loophole in the argument which we will explore: by introducing additional constraints, even theories which break the projective symmetry can be safe. Let us now introduce the general formulation of RBGs.
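As a consistency check, the first line of (20) follows from (12) in a few steps. The sketch below inserts \(\delta \Gamma ^\alpha {}_{\mu \nu }=-\xi _\mu \delta ^\alpha {}_\nu \) and assumes the convention \({\bar{\nabla }}_\mu \xi _\nu =\partial _\mu \xi _\nu -{\bar{\Gamma }}^\lambda {}_{\mu \nu }\xi _\lambda \) (stated here as an assumption):

$$\begin{aligned} 2{\bar{\nabla }}_{[\mu }\delta \Gamma ^\alpha {}_{\nu ]\beta }&=-2{\bar{\nabla }}_{[\mu }\xi _{\nu ]}\delta ^\alpha {}_\beta =-\big (F_{\mu \nu }-{{\bar{{\mathcal {T}}}}}^\lambda {}_{\mu \nu }\xi _\lambda \big )\delta ^\alpha {}_\beta ,\nonumber \\ {{\bar{{\mathcal {T}}}}}^\lambda {}_{\mu \nu }\delta \Gamma ^\alpha {}_{\lambda \beta }&=-{{\bar{{\mathcal {T}}}}}^\lambda {}_{\mu \nu }\xi _\lambda \delta ^\alpha {}_\beta ,\nonumber \\ 2\delta \Gamma ^\alpha {}_{[\mu |\lambda |}\delta \Gamma ^\lambda {}_{\nu ]\beta }&=2\xi _{[\mu }\xi _{\nu ]}\delta ^\alpha {}_\beta =0. \end{aligned}$$

The torsion terms cancel in the sum, leaving \({\mathcal {R}}^\alpha {}_{\beta \mu \nu }(\Gamma )={\mathcal {R}}^\alpha {}_{\beta \mu \nu }({\bar{\Gamma }})-F_{\mu \nu }\delta ^\alpha {}_\beta \). Taking the traces (6), (7) and (8) of this result reproduces the Ricci, co-Ricci and homothetic lines of (20).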

3 Ricci-based metric-affine theories

Equipped with the general geometrical framework introduced in the previous section, we can turn to the family of theories that will constitute the main focus of this work, namely theories of gravity formulated in the metric-affine approach that only depend on the Ricci tensor. This might give the impression of an unnecessary restriction given the huge freedom permitted by the general metric-affine formalism. Let us recall at this point that we have a plethora of different geometrical objects that could be used and which should indeed enter the action, unless some additional guiding principle is invoked. Since our purpose here is to show the (generically) pathological nature of higher order curvature theories of gravity in the metric-affine formalism, we simply take these theories as a benchmark to illustrate the potential problems suffered by metric-affine theories. It is important, however, to stress that RBG theories have received considerable attention in the literature [5, 16,17,18,19,20,21,22] due to their interesting properties, which make them appealing and more tractable than other more general metric-affine theories, and thus useful as a proxy for better understanding general metric-affine theories.

3.1 Field equations

The family of theories that we will mainly consider throughout our analysis will be described by an action of the following form:

$$\begin{aligned} {\mathcal {S}}[g_{\mu \nu },\Gamma ^\alpha {}_{\mu \nu },\Psi ]=\frac{1}{2}\int {\mathrm{d}}^Dx \sqrt{-g}\,F\big (g^{\mu \nu },{\mathcal {R}}_{\mu \nu }\big )+{\mathcal {S}}_{\mathrm{m}}[g_{\mu \nu },\Psi ], \end{aligned}$$
(21)

where F is an arbitrary scalar function that depends on the (inverse) metric \(g^{\mu \nu }\) and the Ricci tensor \({\mathcal {R}}_{\mu \nu }\) of an arbitrary connection \(\Gamma ^\alpha {}_{\mu \nu }\) that is to be determined by the field equations. We have also included the matter sector through its action \({\mathcal {S}}_{\mathrm{m}}\), where \(\Psi \) stands for all matter fields, i.e., it can carry both internal and Lorentz indices. Unless otherwise stated, we will assume that matter fields are minimally coupled to gravity. In the metric formalism, this is a straightforward prescription free from ambiguities. However, in the metric-affine formalism, even this prescription leads to ambiguities in several respects, which result in the appearance of terms involving the torsion and/or non-metricity tensors (see e.g. [23, 24]). For the moment, we will have in mind matter actions containing scalar fields with up to first derivatives and vector fields whose kinetic terms are gauge invariant. That way the matter sector will not contain the connection and all the dependence on \(\Gamma ^\alpha {}_{\mu \nu }\) will come from the gravitational sector. We will drop these assumptions later and discuss their impact.

The field equations obtained by varying (21) with respect to the metric and the connection are respectively

$$\begin{aligned}&\frac{\partial F}{\partial g^{\mu \nu }}-\frac{1}{2}F g_{\mu \nu }=T_{\mu \nu } \end{aligned}$$
(22)
$$\begin{aligned}&\nabla _\lambda \left[ \sqrt{-q}q^{\nu \mu }\right] -\delta ^\mu {}_\lambda \nabla _\rho \left[ \sqrt{-q}q^{\nu \rho }\right] \nonumber \\&\quad =\sqrt{-q}\left[ {\mathcal {T}}^\mu {}_{\lambda \alpha } q^{\nu \alpha }+{\mathcal {T}}^\alpha {}_{\alpha \lambda } q^{\nu \mu }-\delta ^\mu {}_\lambda {\mathcal {T}}^\alpha {}_{\alpha \beta } q^{\nu \beta }\right] , \end{aligned}$$
(23)

where \(T_{\mu \nu }=-\frac{2}{\sqrt{-g}}\frac{\delta {\mathcal {S}}_{\mathrm{m}}}{\delta g^{\mu \nu }}\) is the usual energy-momentum tensor of the matter sector and we have introduced the object \(\sqrt{-q}q^{\mu \nu }\equiv \sqrt{-g}\frac{\partial F}{\partial {\mathcal {R}}_{\mu \nu }}\). In the usual treatment of RBGs, projective symmetry is assumed, which from (20) restricts the dependence of the action to the symmetric part of the Ricci tensor only. Here, we want to offer a more detailed discussion of the consequences of breaking the projective symmetry in RBGs than that presented in [14]. Thus, in our discussion the full Ricci tensor will enter the action and, therefore, the object \(q^{\mu \nu }\) will carry all its 16 components (in four dimensions) instead of the 10 components of the projectively symmetric case. Turning back to the field equations of the generalised RBGs, the above connection equation can be recast in a more useful form by introducing a new connection \({\hat{\Gamma }}^\alpha {}_{\mu \nu }\) obtained by subtracting a projective mode from the original one

$$\begin{aligned} {\hat{\Gamma }}^\alpha {}_{\mu \nu }=\Gamma ^\alpha {}_{\mu \nu }+\frac{2}{D-1}\Gamma ^\lambda {}_{[\lambda \mu ]}\delta ^\alpha {}_\nu , \end{aligned}$$
(24)

which identically satisfies \({\hat{\Gamma }}^\lambda {}_{[\lambda \mu ]}=0\). In terms of this new connection, (23) can be recast as

$$\begin{aligned}&\partial _\lambda (\sqrt{-q}q^{\mu \nu })+{\hat{\Gamma }}^{\mu }{}_{\lambda \alpha }\sqrt{-q}q^{\alpha \nu }+{\hat{\Gamma }}^{\nu }{}_{\alpha \lambda }\sqrt{-q}q^{\mu \alpha } \nonumber \\&\quad -{\hat{\Gamma }}^{\alpha }{}_{\lambda \alpha }\sqrt{-q}q^{\mu \nu }=0. \end{aligned}$$
(25)
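The property \({\hat{\Gamma }}^\lambda {}_{[\lambda \mu ]}=0\) quoted below (24) can be verified directly by taking the two traces of (24), a short check added here for convenience:

$$\begin{aligned} {\hat{\Gamma }}^\lambda {}_{\lambda \mu }&=\Gamma ^\lambda {}_{\lambda \mu }+\frac{2}{D-1}\Gamma ^\lambda {}_{[\lambda \mu ]},\qquad {\hat{\Gamma }}^\lambda {}_{\mu \lambda }=\Gamma ^\lambda {}_{\mu \lambda }+\frac{2D}{D-1}\Gamma ^\lambda {}_{[\lambda \mu ]},\nonumber \\ 2{\hat{\Gamma }}^\lambda {}_{[\lambda \mu ]}&=2\Gamma ^\lambda {}_{[\lambda \mu ]}+\frac{2(1-D)}{D-1}\Gamma ^\lambda {}_{[\lambda \mu ]}=0. \end{aligned}$$

Note that (24) has the form of the projective transformation (10) with \(\xi _\mu =\frac{2}{D-1}\Gamma ^\lambda {}_{[\lambda \mu ]}\), which is why it can be described as the subtraction of a projective mode.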

We can now remove the different traces of the connection appearing in the field equations by taking the different traces of the above equation. By doing so, we get from the algebraic manipulations the condition

$$\begin{aligned} \partial _\mu \Big (\sqrt{-q} q^{[\mu \nu ]}\Big )=0, \end{aligned}$$
(26)

and then we arrive at the same connection equation found in [25] for the Non-symmetric Gravity Theory (NGT):

$$\begin{aligned} \partial _\lambda q^{\mu \nu }+{\hat{\Gamma }}^\mu {}_{\lambda \alpha }q^{\alpha \nu }+{\hat{\Gamma }}^\nu {}_{\alpha \lambda }q^{\mu \alpha }=0. \end{aligned}$$
(27)

Solving Eq. (27) for the connection is in general quite cumbersome, if possible at all. What is easy to see is that this equation can be algebraically solved for the connection in terms of \(q^{\mu \nu }\). However, \(q^{\mu \nu }\) as defined above itself depends on the connection through its dependence on \({\mathcal {R}}_{\mu \nu }\), so that this does not in general give the solution for the connection. A special case is when the function F is linear in the Ricci tensor, so that \(q^{\mu \nu }\) does not depend on the connection. This is of course the case for the Einstein–Hilbert action. Thus, although it would be possible to work directly with the above equations, and indeed we will solve them perturbatively in the antisymmetric part of \(q^{\mu \nu }\), it is useful to consider other ways of working with generalised RBGs. Indeed, it can be seen that all RBG theories can be described in terms of an Einstein–Hilbert-like term \(q^{\mu \nu }{\mathcal {R}}_{\mu \nu }\) by performing a suitable field re-definition. Let us clarify this in the next section.
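For illustration, the linear case can be worked out explicitly. The following is a sketch for the Einstein–Hilbert choice \(F=g^{\mu \nu }{\mathcal {R}}_{\mu \nu }\), assuming \(D\ne 2\) so that the determinant of the defining relation for \(q^{\mu \nu }\) can be inverted:

$$\begin{aligned}&\sqrt{-q}q^{\mu \nu }=\sqrt{-g}g^{\mu \nu }\quad \Rightarrow \quad q^{\mu \nu }=g^{\mu \nu },\qquad q^{[\mu \nu ]}=0,\nonumber \\&\partial _\lambda g^{\mu \nu }+{\hat{\Gamma }}^\mu {}_{\lambda \alpha }g^{\alpha \nu }+{\hat{\Gamma }}^\nu {}_{\alpha \lambda }g^{\mu \alpha }=0. \end{aligned}$$

Condition (26) is then trivially satisfied, and (27) is solved by \({\hat{\Gamma }}^\alpha {}_{\mu \nu }={\bar{\Gamma }}^\alpha {}_{\mu \nu }(g)\), for which it reduces to the metric-compatibility condition of the Levi-Civita connection, while \({\hat{\Gamma }}^\lambda {}_{[\lambda \mu ]}=0\) holds because that connection is symmetric. By (24), the full connection \(\Gamma ^\alpha {}_{\mu \nu }\) is then the Levi-Civita connection up to a projective mode, in line with the statement made in the Introduction.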

3.2 The Einstein–Hilbert frame

As discussed above, it is possible to obtain the main properties of general RBG theories by working with the field equations. However, it is more illuminating to re-write the action so that the gravitational sector looks more familiar and, consequently, the physical content of the theory is more apparent. We will follow the procedure presented in [5, 16] for the projectively invariant theories, extending it to the general non-projectively invariant case. Let us start by performing a Legendre transformation in order to linearise the action in the Ricci tensor as follows:

$$\begin{aligned} {\mathcal {S}}= & {} \frac{1}{2}\int {\mathrm{d}}^Dx \sqrt{-g}\left[ F(\Sigma _{\mu \nu })+\frac{\partial F}{\partial \Sigma _{\mu \nu }}\big ({\mathcal {R}}_{\mu \nu }-\Sigma _{\mu \nu }\big )\right] \nonumber \\&\quad +{\mathcal {S}}_{\mathrm{m}}[g,\Psi ], \end{aligned}$$
(28)

where \(\Sigma _{\mu \nu }\) is an auxiliary field. Unlike in the projectively invariant case, where \(\Sigma _{\mu \nu }\) is symmetric, this auxiliary field does not have any symmetry and carries both symmetric and antisymmetric parts. We will see below that it is precisely the antisymmetric part that gives rise to the pathological behaviour of these theories. In order to put our action in a more familiar form, we will introduce the following field re-definition:

$$\begin{aligned} \sqrt{-q}q^{\mu \nu }=\sqrt{-g}\frac{\partial F}{\partial \Sigma _{\mu \nu }}. \end{aligned}$$
(29)

This definition will allow us to express the auxiliary field \(\Sigma _{\mu \nu }\) in terms of the spacetime metric and the object \(q^{\mu \nu }\), i.e., we will have an algebraic relation \(\Sigma _{\mu \nu }=\Sigma _{\mu \nu }(q,g)\). The resemblance of this definition to the one introduced in the field equations of Sect. 3.1 is due to the fact that the dynamics of this new auxiliary field is given by the constraint \(\Sigma _{\mu \nu }={\mathcal {R}}_{\mu \nu }\), so that the above field re-definition looks exactly like the definition for \(q^{\mu \nu }\) given in the previous section when the field equations are satisfied. After this field re-definition, we can then express the RBG action in the form

$$\begin{aligned} {\mathcal {S}}=\frac{1}{2}\int {\mathrm{d}}^Dx\Big [\sqrt{-q}q^{\mu \nu }{\mathcal {R}}_{\mu \nu }(\Gamma )+{\mathcal {U}}(q,g)\Big ]+{\mathcal {S}}_{\mathrm{m}}[g,\Psi ], \end{aligned}$$
(30)

where we have introduced the potential term

$$\begin{aligned} {\mathcal {U}}(q,g)=\sqrt{-g}\left[ F-\frac{\partial F}{\partial \Sigma _{\mu \nu }}\Sigma _{\mu \nu }\right] _{\Sigma =\Sigma (q,g)}. \end{aligned}$$
(31)
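As a simple illustration of (29) and (31), and only as a sketch, consider a function that is linear in the auxiliary field, \(F=g^{\mu \nu }\Sigma _{\mu \nu }-2\Lambda _0\), where the constant \(\Lambda _0\) is introduced here purely for illustration. Then

$$\begin{aligned} \frac{\partial F}{\partial \Sigma _{\mu \nu }}=g^{\mu \nu }\quad \Rightarrow \quad \sqrt{-q}q^{\mu \nu }=\sqrt{-g}g^{\mu \nu },\qquad {\mathcal {U}}=\sqrt{-g}\Big [F-g^{\mu \nu }\Sigma _{\mu \nu }\Big ]=-2\Lambda _0\sqrt{-g}. \end{aligned}$$

In this degenerate case the relation (29) cannot be inverted for \(\Sigma _{\mu \nu }\). Instead it fixes \(q^{\mu \nu }=g^{\mu \nu }\) (for \(D\ne 2\)), the auxiliary field drops out of the potential, and the gravitational sector reduces to the first-order Einstein–Hilbert action with a cosmological-constant term, in line with the remark of Sect. 3.1 that \(q^{\mu \nu }\) does not depend on the connection for a linear function.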

The action (30) already features the standard Einstein–Hilbert term in the first order formalism, but for the object \(q^{\mu \nu }\) instead of the spacetime metric \(g_{\mu \nu }\). As a matter of fact, we can notice that \(g_{\mu \nu }\) appears algebraically in the potential \({\mathcal {U}}\) and the matter action so that it is simply an auxiliary field that we can integrate out by solving its equation of motion

$$\begin{aligned} \frac{\partial {\mathcal {U}}}{\partial g^{\mu \nu }}=\sqrt{-g}\,T_{\mu \nu }. \end{aligned}$$
(32)

From this equation we can obtain the spacetime metric \(g_{\mu \nu }\) in terms of the object \(q^{\mu \nu }\) and the energy-momentum tensor of the matter sector, computed as the variation of the matter action w.r.t. \(g_{\mu \nu }\) as usual. We will see below that there is another energy-momentum tensor that we can introduce to make the resemblance with the first-order formulation of GR even more apparent. Once we have obtained the corresponding solution to (32), we can use it to finally express (30) as

$$\begin{aligned} {\mathcal {S}}=\frac{1}{2}\int {\mathrm{d}}^Dx \Big [\sqrt{-q}q^{\mu \nu }{\mathcal {R}}_{\mu \nu }(\Gamma )+{\mathcal {U}}(q,T)\Big ]+{\mathcal {S}}_{\mathrm{m}}[g(q,T),\Psi ]. \end{aligned}$$
(33)

This is the desired appearance of the theory where the gravitational sector reduces to the well-known Einstein–Hilbert action in the first order formalism. It is important to emphasise that the resemblance is purely formal at this point and, in fact, solving for the connection will in general fail to recover GR owing to the lack of any symmetries of \(q^{\mu \nu }\). In the next sections we will explicitly show when GR is recovered and what the differences are when it is not.

In addition to the purely gravitational sector, we also see how we have generated couplings between the object \(q^{\mu \nu }\) and the matter sector. Such couplings arise from two sources after integrating out the spacetime metric, namely: from the potential \({\mathcal {U}}\) generated when linearising in the Ricci tensor and from the explicit couplings of the matter sector to \(g_{\mu \nu }\). Notice that, since the matter sector was assumed to be minimally coupled to gravity, i.e., it only couples to \(g_{\mu \nu }\), matter will only enter Eq. (32) through the energy-momentum tensor obtained as the reaction to variations of the spacetime metric. This further implies that all the newly generated matter couplings will only depend on \(T_{\mu \nu }\), which guarantees the preservation of the symmetries in the original matter sector. Notice that since \(g_{\mu \nu }\) appears in \(T_{\mu \nu }\) not as \(T_{\mu \nu }\propto g_{\mu \nu }\) but in a more involved form, it could be that if we truly want to eliminate \(g_{\mu \nu }\) in favour of \(q_{\mu \nu }\) and the matter fields, the dependence could also be more general than through \(T_{\mu \nu }\) (we have to solve the corresponding equation for \(g_{\mu \nu }\)). However, the new couplings will still surely have the same symmetries as the matter action.

4 Projectively-invariant theories: equivalence to GR

Before proceeding to the general case where the object \(q^{\mu \nu }\) does not exhibit any symmetries, let us consider what happens when a projective symmetry is imposed. This has already been studied in the literature, but it will be useful to discuss these known results here in order to appreciate better the fundamental role played by the projective mode in these theories. As explained in Sect. 2, in Ricci-Based actions, the projective symmetry can be straightforwardly implemented by restricting the action to depend only on the symmetric part of the Ricci tensor. If that is the case, then it is easy to see from the definition of \(q^{\mu \nu }\) that this object will inherit the symmetric character of the Ricci tensor. Being a symmetric rank-2 tensor, \(q^{\mu \nu }\) is then entitled to claim its status as a proper metric tensor so that the gravitational sector in (33) is actually the first order formulation of GR. However, the corresponding solution for the connection will be given by the Christoffel symbols of the metric \(q^{\mu \nu }\) (up to the projective mode entering as a gauge mode [26]) instead of those of the spacetime metric \(g_{\mu \nu }\).

In the Einstein-frame we thus recover the usual form of the Einstein equations, but the right hand side is now given by the energy-momentum tensor describing the reaction to the metric \(q^{\mu \nu }\) of the matter action resulting after integrating out \(g_{\mu \nu }\), i.e.

$$\begin{aligned} {\tilde{T}}_{\mu \nu }=-\frac{2}{\sqrt{-q}}\frac{\delta {\tilde{S}}_{\mathrm{m}}}{\delta q^{\mu \nu }}. \end{aligned}$$
(34)

This energy-momentum tensor is highly non-linearly related to \(T_{\mu \nu }\) [17], and will feature new interactions between all the matter fields in general [21, 22], which are the origin of the different phenomenology and solutions that differ from the usual GR behaviour. Let us stress however that these theories are nothing but standard GR in disguise. The apparent differences between RBGs and GR are simply due to the fact that a matter sector coupled to a projectively invariant RBG corresponds to another matter sector (obtained as a non-linear deformation of the previous one) coupled to GR. The peculiar property of the RBG with projective symmetry is that the interactions in the matter sector present a somewhat universal form (that of course depends on the specific theory, i.e., the function F). As we have discussed above, if we start from minimally coupled matter fields, all the new interactions will be generated through the total energy-momentum tensor [17, 22]. Assuming that the most relevant interactions in the gravitational sector of RBG appear at some specific scale \(\Lambda \), which means that the function F only contains one additional dimensionful parameter, then all the new interactions in the matter sector will not only be universally constructed in terms of \(T_{\mu \nu }\), but they all will in turn have the same coupling constant. This means that, if an effect is seen at a given scale in some sector of the standard model, effects at the same scale will arise in the remaining sectors. Regarded from this perspective in the Einstein frame, we can interpret RBG theories as a procedure to encapsulate a universally interacting matter sector, in the sense explained above, in an auxiliary field that plays the role of a non-dynamical connection. In particular, this property is precisely what permits studying the dynamics in terms of a metric \(g_{\mu \nu }\) for all matter fields at the same time. Let us elaborate on this point a bit more.

The physical meaning of the two metrics is also apparent in the Einstein frame, again assuming minimally coupled fields. The metric \(g_{\mu \nu }\) will determine the trajectories of the particles, which will follow the corresponding geodesics (Footnote 6). One may then wonder why they do not follow the geodesics of \(q_{\mu \nu }\) in the Einstein frame and how to square this with our statement that these theories are GR. The answer is quite simple. Around trivial matter backgrounds, both metrics are the same and therefore there is no possible confusion. In the presence of a matter background, however, both metrics are different and, while matter fields follow the geodesics of \(g_{\mu \nu }\), it is \(q_{\mu \nu }\) that satisfies Einstein equations. There is no contradiction, however, because also in GR, when matter fields propagate on a non-trivial background (and are coupled to it), the propagation does not follow the geodesics of \(g_{\mu \nu }\). Paradigmatic examples of this behaviour are for instance K-essence models of scalar fields or non-linear electrodynamics (see e.g. [27,28,29,30,31]). As a matter of fact, starting from a standard canonical scalar field and usual Maxwellian electrodynamics in the RBG frame, the Einstein frame formulation will precisely be K-essence [18] and non-linear electrodynamics [20], respectively.

5 Generalised RBG theories: the non-symmetric gravity frame

The explicit breaking of projective symmetry in the RBG Lagrangian allows the full Ricci tensor to appear in the action, thus jeopardising the symmetric nature of the corresponding \(q^{\mu \nu }\). This crucially changes the situation and the resulting theory in the Einstein frame representation is no longer GR but resembles the Nonsymmetric Gravity Theory (NGT) introduced by Moffat [32], which has been explored in different versions. Although the non-symmetric frame of generalised RBGs does not exactly reproduce Moffat’s non-symmetric gravity, it does so in certain limits. A crucial difference is the coupling to matter fields, although even this can be made equivalent by ad-hoc choices of the matter couplings in Moffat’s theory. Thus, given the similarities between both theories, it will be instructive to review some of the known results on non-symmetric gravity that can then be straightforwardly applied to the generalised RBGs. In particular, we will review the pathologies that plague Moffat’s theory [25, 33] (see also [34,35,36,37,38,39,40,41,42,43,44]) and how they will then be inherited by generalised RBGs. We will seize the opportunity to provide alternative understandings for the origin of the pathologies. Let us start by considering vacuum solutions so that no matter fields are present (Footnote 7) and the analysis of the gravitational sector becomes cleaner. Thus, the starting action for NGT (or generalised RBGs in the Einstein–Hilbert frame) will be

$$\begin{aligned} {\mathcal {S}}=\frac{1}{2}\int {\mathrm{d}}^Dx \Big [\sqrt{-q}M_{\mathrm{Pl}}^2q^{\mu \nu }{\mathcal {R}}_{\mu \nu }(\Gamma )+{{\bar{{\mathcal {U}}}}}\Big ], \end{aligned}$$
(35)

where \(q^{\mu \nu }\) is a metric with an antisymmetric part (Footnote 8) and \({{\bar{{\mathcal {U}}}}}\) is some potential for the non-symmetric object \(q_{\mu \nu }\). Of course, in the case of a symmetric \(q_{\mu \nu }\), this term can only contribute a cosmological constant by virtue of covariance, but it can have a non-trivial structure for the non-symmetric case with important consequences. In fact, such a term was invoked in [33] to resolve the pathologies of Moffat’s theory. The instabilities that plague this theory around arbitrary backgrounds can be evidenced by different methods that provide complementary views. Let us start with the arguably simplest procedure to show the presence of pathologies.

5.1 Instabilities in the decoupling limit

We will first study a suitable decoupling limit of the theory that already manifests the presence of ghosts. For that, we will consider the antisymmetric sector perturbatively up to quadratic order so that

$$\begin{aligned} q_{\mu \nu }={\bar{q}}_{\mu \nu }+\frac{\sqrt{2}}{M_{\mathrm{Pl}}} \left( B_{\mu \nu }+\alpha B_{\mu \alpha } B^\alpha {}_\nu +\beta B^2{\bar{q}}_{\mu \nu }\right) , \end{aligned}$$
(36)

with \({\bar{q}}_{\mu \nu }\) an arbitrary symmetric metric, \(B_{\mu \nu }\) a 2-form field corresponding to the antisymmetric part of \(q_{\mu \nu }\), and where the parameters \(\alpha \) and \(\beta \) account for the possibility of field re-definitions at quadratic order (see e.g. [33]). The numerical factor and the Planck mass have been introduced for convenience. When expanding around such a background at second order in \(B_{\mu \nu }\) we find (Footnote 9):

$$\begin{aligned} {\mathcal {S}}^{(2)}&=\int {\mathrm{d}}^4x\sqrt{-{\bar{q}}}\Big [\frac{1}{2}M_{\mathrm{Pl}}^2R({\bar{q}})-\frac{1}{12} H_{\mu \nu \rho } H^{\mu \nu \rho }-\frac{1}{4} m^2 B^2 \nonumber \\&\quad -\frac{\sqrt{2}M_{\mathrm{Pl}}}{3}B^{\mu \nu }\partial _{[\mu }\Gamma _{\nu ]}\nonumber \\&\quad +\frac{1-2\alpha +4\beta }{4} R({\bar{q}}) B^2+\alpha R_{\mu \nu }({\bar{q}})B^{\mu \alpha }B^\nu {}_\alpha \nonumber \\&\quad -R_{\mu \nu \alpha \beta }({\bar{q}})B^{\mu \alpha }B^{\nu \beta }\Big ] \end{aligned}$$
(37)

where \(H_{\mu \nu \rho }=3\partial _{[\mu } B_{\nu \rho ]}\) is the field strength of the 2-form field, \(m^2\) is the mass generated from \({{\bar{{\mathcal {U}}}}}\), and \(\Gamma _\mu \) is the projective mode of the connection. In order to make apparent the presence and nature of the instabilities, we will first follow a different approach from those used in analyses of NGT that will allow us to clearly pinpoint the problems, namely we will resort to the Stückelberg trick. Let us first consider a flat background so the couplings to curvature in (37) disappear. Then, we can restore the gauge symmetry of the 2-form by introducing Stückelberg fields \(b_\mu \) via the replacement \(B_{\mu \nu }\rightarrow {\hat{B}}_{\mu \nu }+\frac{2}{m} \partial _{[\mu } b_{\nu ]}\), and take the decoupling limit \(m\rightarrow 0\). There will still be the scalar mode of the gauge invariant 2-form sector described by \({\hat{B}}_{\mu \nu }\) that we do not need to consider to show the presence of a ghost. The relevant sector in the decoupling limit of the action in a flat background is then

$$\begin{aligned}&{\mathcal {S}}^{(2)}_{\mathrm{dec, flat}}\nonumber \\&\quad =-\int {\mathrm{d}}^4x\sqrt{-{\bar{q}}}\left( \frac{1}{12} {\hat{H}}_{\mu \nu \rho } {\hat{H}}^{\mu \nu \rho }{+}\frac{1}{4} {\mathcal {B}}_{\mu \nu } {\mathcal {B}}^{\mu \nu }{+} {\mathcal {B}}_{\mu \nu } \Gamma ^{\mu \nu } \right) \end{aligned}$$
(38)

where \({\hat{H}}_{\mu \nu \rho }=3\partial _{[\mu } {\hat{B}}_{\nu \rho ]}\), \({\mathcal {B}}_{\mu \nu }=2 \partial _{[\mu } b_{\nu ]}\) and \(\Gamma _{\mu \nu } =2 \partial _{[\mu } \Gamma _{\nu ]}\). In order to properly take the decoupling limit, we have re-scaled \(\Gamma _\mu \rightarrow \frac{3m}{\sqrt{2}M_{\mathrm{Pl}}}\Gamma _\mu \) that has been kept finite. We see that the decoupling limit shows the presence of five degrees of freedom, namely: one associated to the massless 2-form \({\hat{B}}_{\mu \nu }\), two associated to the helicity-1 modes described by \(b_\mu \), and two associated to the projective mode. This is of course the expected counting for (37) corresponding to a massive 2-form and a gauge spin-1 field. In this decoupling limit it is then apparent that the theory is plagued by ghost-like instabilities owing to the mixing \({\mathcal {B}}_{\mu \nu } \Gamma ^{\mu \nu }\) that comes in without the diagonal \( \Gamma _{\mu \nu } \Gamma ^{\mu \nu }\) element. This signals the presence of a ghost caused by the indefinite character of the kinetic matrix. More explicitly, if we diagonalise by means of \(b_\mu =A_\mu +\xi _\mu \), \(\Gamma _\mu =\lambda A_\mu -(2+\lambda ) \xi _ \mu \), the action (38) reads

$$\begin{aligned} {\mathcal {S}}^{(2)}_{\mathrm{dec,flat}}= & {} -\int {\mathrm{d}}^4x\sqrt{-{\bar{q}}}\bigg [\frac{1}{12} {\hat{H}}_{\mu \nu \rho } {\hat{H}}^{\mu \nu \rho }+ \frac{1+\lambda }{4}\Big (\partial _{[\mu } A_{\nu ]} \partial ^{[\mu } A^{\nu ]} \nonumber \\&- \partial _{[\mu } \xi _{\nu ]} \partial ^{[\mu } \xi ^{\nu ]}\Big )\bigg ], \end{aligned}$$
(39)

showing that either \(A_\mu \) or \(\xi _\mu \) is necessarily a ghost. We have reproduced the result announced in [14] in the decoupling limit of the theory.
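Two steps of the argument above can be spelled out in a hedged way. First, inserting the Stückelberg replacement into the mass term of (37) gives

$$\begin{aligned} -\frac{1}{4}m^2B_{\mu \nu }B^{\mu \nu }\rightarrow -\frac{1}{4}m^2{\hat{B}}_{\mu \nu }{\hat{B}}^{\mu \nu }-\frac{m}{2}{\hat{B}}_{\mu \nu }{\mathcal {B}}^{\mu \nu }-\frac{1}{4}{\mathcal {B}}_{\mu \nu }{\mathcal {B}}^{\mu \nu }, \end{aligned}$$

which is invariant under \({\hat{B}}_{\mu \nu }\rightarrow {\hat{B}}_{\mu \nu }+2\partial _{[\mu }\theta _{\nu ]}\), \(b_\mu \rightarrow b_\mu -m\theta _\mu \) for an arbitrary \(\theta _\mu \) (a gauge parameter introduced here only for illustration), and which leaves the term \(-\frac{1}{4}{\mathcal {B}}_{\mu \nu }{\mathcal {B}}^{\mu \nu }\) of (38) as \(m\rightarrow 0\). The field strength is unaffected, \(H_{\mu \nu \rho }={\hat{H}}_{\mu \nu \rho }\), because \(\partial _{[\mu }\partial _\nu b_{\rho ]}=0\).

Second, the sign argument relies only on the structure of the vector sector and not on the normalisation of the fields. Writing the quadratic form in the field strengths schematically as \(a\,{\mathcal {B}}^2+2c\,{\mathcal {B}}\Gamma \) with no \(\Gamma \Gamma \) entry, where \(a\) and \(c\ne 0\) are placeholder coefficients rather than values taken from the text, the kinetic matrix is

$$\begin{aligned} K=\begin{pmatrix} a &{} c \\ c &{} 0 \end{pmatrix},\qquad \det K=-c^2<0,\qquad \lambda _\pm =\frac{a\pm \sqrt{a^2+4c^2}}{2}. \end{aligned}$$

The two eigenvalues always have opposite signs, so one combination of the two vector fields necessarily carries the wrong-sign kinetic term, whatever the overall sign convention.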

After showing the presence of a ghost in a flat background, we will turn on the symmetric sector and allow for an arbitrary curved \({\bar{q}}\)-background. It should then be clear that the non-minimal couplings to the curvature in (37) will present additional pathologies. These pathologies have also been discussed for NGT in [33]. Within our approach we can readily see and interpret the nature of these pathologies as Ostrogradski instabilities [13] associated to having higher order equations of motion for the Stückelberg fields (Footnote 10). The appropriate decoupling limit now needs to take into account that the curvature scales as \(R\sim M_{\mathrm{Pl}}^{-2}\) and the appropriate limit to be taken is \(m\rightarrow 0\) and \(M_{\mathrm{Pl}}\rightarrow \infty \) with \(\Lambda \equiv m M_{\mathrm{Pl}}\) fixed. In this limit, the Stückelberg fields \(b_\mu \) will feature non-minimal couplings with the schematic form \(\sim \frac{1}{\Lambda ^2}R{\mathcal {B}}{\mathcal {B}}\). It is known that these derivative couplings generically give rise to higher order equations of motion and hence to Ostrogradski instabilities. An exceptional case is provided by the Horndeski vector-tensor interaction found in [45]. Having the two free parameters \(\alpha \) and \(\beta \) that allow for field redefinitions at quadratic order, one would be tempted to say that the pathology is not physical since the Horndeski interaction could be reached by an appropriate local field redefinition. It is worth noticing that even this Horndeski interaction presents pathologies around relevant backgrounds [46]. Nevertheless, we need to remember that this is the quadratic action and it is expected that going to higher perturbative orders, new higher order non-minimal couplings will be generated. Since there are no healthy such terms beyond the Horndeski interaction in four dimensions, these will need to be trivial modulo field redefinitions to avoid re-introducing the pathologies.
At this point, the pathological character of these theories, taken at face value, should be unequivocal. One could argue that interpreted as effective field theories, there could be a certain regime of validity at low energies. However, the very presence of the ghosts already around a Minkowski background shown above makes this hope difficult to realise. In this respect, this ghost could be stabilised easily by introducing a term \(\Gamma _{\mu \nu } \Gamma ^{\mu \nu }\). Although such a term cannot be generated from RBGs, within an EFT approach, not only should it appear, but so should a number of other terms accompanying it (Footnote 11). The non-minimal couplings however, being (irrelevant) higher dimension operators, should typically be perturbative and, consequently, the associated ghosts would only come at a scale beyond the cut-off. One could tune some coefficients to push the ghosts to higher scales so that the corresponding irrelevant operators could have non-perturbative effects on the low-energy phenomenology. This is clearly beyond the scope of this work, but it would be an interesting study to pursue.

A potential caveat of our analysis (up to now) is that we have neglected the matter sector, but this should not worry us too much since including matter fields will hardly render the theories stable. Rather, one could expect a more pathological behaviour. We will address this point later to show it explicitly.

To summarise, we have seen that the breaking of the projective symmetry results in the appearance of five degrees of freedom, two of which correspond to the projective mode and the remaining three belong to the antisymmetric part of the metric. In both sectors we have clearly identified the root of the problems and we can now understand that it is precisely the trivialisation of the affinity in projectively-invariant RBG theories that makes them viable by reducing their gravitational sector to GR.

5.2 Another view on the problem with additional dofs

In the previous section we have shown how vacuum RBGs without a projective symmetry (or vacuum NGT for that matter) are plagued by ghost-like instabilities arising from two sectors, namely: the dynamical projective mode whose mixing with the 2-form leads to the necessary presence of a spin-1 ghost and the non-minimal couplings of the 2-form field that give rise to Ostrogradski instabilities. This has been neatly shown in the decoupling limit of the theories. Here we will show the appearance of these pathologies in an alternative manner. Let us consider our family of theories described by the action

$$\begin{aligned} {\mathcal {S}}[g_{\mu \nu },\Gamma ]=\frac{1}{2}\int {\mathrm{d}}^Dx \sqrt{-g}\,F\big (g^{\mu \nu },{\mathcal {R}}_{\mu \nu }(\Gamma )\big ), \end{aligned}$$
(40)

where we again consider vacuum generalised RBGs. Let us now separate a metric contribution to the connection from the rest, i.e., let us perform the following field re-definition

$$\begin{aligned} \Gamma ^\alpha {}_{\mu \beta }={\{^{\;{\alpha }}_{{\mu }{\beta }}\}}(h)+\Upsilon ^\alpha {}_{\mu \beta } \end{aligned}$$
(41)

where \({\{^{\;{\alpha }}_{{\mu }{\beta }}\}}(h)\) are the Christoffel symbols of a metric that we have called \(h_{\mu \nu }\) and that we will choose in a convenient manner. After splitting the non-symmetric metric as

$$\begin{aligned} \sqrt{-q}q^{\mu \nu }=\sqrt{-h}h^{\mu \nu }+\sqrt{-h}B^{\mu \nu } \end{aligned}$$
(42)

with \(\sqrt{-h}h^{\mu \nu }=\sqrt{-q}q^{(\mu \nu )}\) and \(\sqrt{-h}B^{\mu \nu }=\sqrt{-q}q^{[\mu \nu ]}\), and using (13) for the field re-definition (41), we can write the generalised RBG action in its Einstein–Hilbert frame (33) as

$$\begin{aligned}&{\mathcal {S}}=\frac{1}{2}\int {\mathrm{d}}^Dx\sqrt{-h}\Big [R(h)-\Upsilon ^{\lambda \alpha \mu }\Upsilon _{\alpha \mu \lambda }+\Upsilon ^\alpha {}_{\alpha \lambda }\Upsilon ^\lambda {}_{\kappa }{}^\kappa \nonumber \\&\quad -\Upsilon ^\alpha {}_{\alpha \lambda }\Upsilon ^\lambda {}_{\mu \nu }B^{\mu \nu }\nonumber \\&\quad -\Upsilon ^\alpha {}_{\nu \lambda }\Upsilon ^\lambda {}_{\alpha \mu }B^{\mu \nu }-B^{\mu \nu }\nabla ^h_\alpha \Upsilon ^\alpha {}_{\mu \nu }-B^{\mu \nu }\nabla ^h_\nu \Upsilon ^\alpha {}_{\alpha \mu }+{\mathcal {U}}(B)\Big ]. \end{aligned}$$
(43)

Here we have used the fact that the connection \({\{^{\;{\alpha }}_{{\mu }{\beta }}\}}\) is torsion-free, \(\nabla ^h\) is the covariant derivative with respect to the Levi-Civita connection of \(h^{\mu \nu }\), and we have dropped a boundary term. Notice that we have used (and will use in the subsequent manipulations) \(h_{\mu \nu }\) as the metric so we will raise and lower indices with \(h^{\mu \nu }\) and its inverse \(h_{\mu \nu }\). The field equations for \(\Upsilon ^\alpha {}_{\mu \nu }\) obtained by variation of the above action are

$$\begin{aligned}&B^{\mu \nu } \Upsilon ^{\beta }{}_{\beta \alpha } - h^{\mu \nu } \Upsilon ^{\beta }{}_{\beta \alpha } + B^{\nu }{}_{\beta } \Upsilon ^{\mu \beta }{}_{\alpha } \nonumber \\&\quad + \Upsilon ^{\mu \nu }{}_{\alpha } - B^{\mu }{}_{\beta } \Upsilon ^{\nu }{}_{\alpha }{}^{\beta } + \Upsilon ^{\nu }{}_{\alpha }{}^{\mu } - \delta _{\alpha }{}^{\mu } \Upsilon ^{\nu \beta }{}_{\beta }\nonumber \\&\quad + B_{\beta \lambda } \delta _{\alpha }{}^{\mu } \Upsilon ^{\nu \beta \lambda } - \nabla ^h_{\alpha }B^{\mu \nu } - \delta _{\alpha }{}^{\mu } \nabla ^h_{\beta }B^{\nu \beta }=0, \end{aligned}$$
(44)

and taking the trace with respect to \(\alpha \) and \(\nu \) of this equation we obtain

$$\begin{aligned} \nabla ^h_\mu B^{\mu \nu }=0 \end{aligned}$$
(45)

which constrains the 2-form field \(B^{\mu \nu }\) to be divergence-free and leaves the connection equation as

$$\begin{aligned}&\nabla ^h_{\alpha }B^{\mu \nu } - B^{\mu \nu } \Upsilon ^{\beta }{}_{\beta \alpha } - B^{\nu }{}_{\beta } \Upsilon ^{\mu \beta }{}_{\alpha } + B^{\mu }{}_{\beta } \Upsilon ^{\nu }{}_{\alpha }{}^{\beta } \nonumber \\&\quad - B_{\beta \lambda } \delta _{\alpha }{}^{\mu } \Upsilon ^{\nu \beta \lambda } + h^{\mu \nu } \Upsilon ^{\beta }{}_{\beta \alpha } - \Upsilon ^{\mu \nu }{}_{\alpha } - \Upsilon ^{\nu }{}_{\alpha }{}^{\mu } + \delta _{\alpha }{}^{\mu } \Upsilon ^{\nu \beta }{}_{\beta } =0. \end{aligned}$$
(46)
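The trace leading to (45) can be verified term by term. Setting \(\alpha =\nu \) in (44), the terms containing \(\Upsilon \) organise into four pairs that cancel, the third one by the antisymmetry of \(B_{\mu \nu }\):

$$\begin{aligned} B^{\mu \nu } \Upsilon ^{\beta }{}_{\beta \nu } - B^{\mu }{}_{\beta } \Upsilon ^{\nu }{}_{\nu }{}^{\beta }&=0, \\ - h^{\mu \nu } \Upsilon ^{\beta }{}_{\beta \nu } + \Upsilon ^{\nu }{}_{\nu }{}^{\mu }&=0, \\ B^{\nu }{}_{\beta } \Upsilon ^{\mu \beta }{}_{\nu } + B_{\beta \lambda } \Upsilon ^{\mu \beta \lambda }&=0, \\ \Upsilon ^{\mu \nu }{}_{\nu } - \Upsilon ^{\mu \beta }{}_{\beta }&=0. \end{aligned}$$

What survives is \(-\nabla ^h_{\nu }B^{\mu \nu }-\nabla ^h_{\beta }B^{\mu \beta }=-2\nabla ^h_{\nu }B^{\mu \nu }=0\), which is equivalent to the divergence-free condition (45).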

This constraint on the 2-form shows that (in \(D=4\)) \(B^{\mu \nu }\) can be expressed as the dual of the field strength of some 1-form \(A_\mu \) so that we can write \(B^{\mu \nu }=-\frac{1}{2\sqrt{-h}}\epsilon ^{\mu \nu \alpha \beta }\partial _{[\alpha }A_{\beta ]}\). Notice that the constraint is exact so that we see that the 2-form can propagate at most the same number of degrees of freedom as a vector field (see Appendix 1). It is also easy to see that a projective mode \(\Upsilon ^\alpha {}_{\mu \nu }=\xi _\mu \delta ^\alpha {}_\nu \) is a solution when \(B_{\mu \nu }=0\). This was indeed expected since for vanishing \(B_{\mu \nu }\) we recover the usual projective-invariant theory whose connection is the Levi-Civita connection of \(h_{\mu \nu }\) up to a projective mode. As a matter of fact, in the generalised RBG case where the projective symmetry is explicitly broken, this projective mode is the only dynamical component of the connection and the remaining components of \(\Upsilon \) can be expressed in terms of \(B_{\mu \nu }\) by solving (44), as we will do later perturbatively up to lowest order in \(B^{\mu \nu }\).
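As a quick consistency check of this parametrisation, one can verify that it solves the constraint identically. The check assumes that \(\epsilon ^{\mu \nu \alpha \beta }\) denotes the constant Levi-Civita symbol, as the explicit factor \(1/\sqrt{-h}\) suggests. Since the divergence of an antisymmetric tensor can be written with partial derivatives,

$$\begin{aligned} \nabla ^h_\mu B^{\mu \nu }=\frac{1}{\sqrt{-h}}\partial _\mu \Big (\sqrt{-h}B^{\mu \nu }\Big )=-\frac{1}{2\sqrt{-h}}\epsilon ^{\mu \nu \alpha \beta }\partial _\mu \partial _{\alpha }A_{\beta }=0, \end{aligned}$$

where the last step follows because \(\epsilon ^{\mu \nu \alpha \beta }\) is antisymmetric in \(\mu \alpha \) while partial derivatives commute. The condition (45) therefore holds for any \(A_\mu \).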

Since the equations are linear in \(\Upsilon ^\alpha {}_{\mu \nu }\), the projective mode can be regarded as a homogeneous solution for \(\Upsilon ^\alpha {}_{\beta \gamma }\) in the general case, i.e., it belongs to the kernel of (46). In order to isolate this projective mode (homogeneous solution) from the remaining non-dynamical part of the connection (non-homogeneous solution), it is common to introduce the shifted connection

$$\begin{aligned} {\hat{\Upsilon }}^\alpha {}_{\mu \nu }=\Upsilon ^\alpha {}_{\mu \nu }+\frac{1}{D-1}\Upsilon _\mu \delta ^\alpha {}_\nu \end{aligned}$$
(47)

with \(\Upsilon _\mu =2\Upsilon ^\alpha {}_{[\alpha \mu ]}\). This shifted connection satisfies \({\hat{\Upsilon }}^\alpha {}_{[\alpha \mu ]}=0\) and it is invariant under a projective transformation of \(\Upsilon ^\alpha {}_{\mu \nu }\). In terms of these variables the action can be written as

$$\begin{aligned} {\mathcal {S}}&=\frac{1}{2}\int {\mathrm{d}}^Dx\sqrt{-h}\Big [R(h)-\frac{2}{D-1}B^{\mu \nu }\partial _{[\mu }\Upsilon _{\nu ]}\nonumber \\&\quad +{\hat{\Upsilon }}^\alpha {}_{\alpha \lambda }{\hat{\Upsilon }}^\lambda {}_{\kappa }{}^\kappa -{\hat{\Upsilon }}^{\alpha \mu \lambda }{\hat{\Upsilon }}_{\lambda \alpha \mu }\nonumber \\&\quad -{\hat{\Upsilon }}^\alpha {}_{\alpha \lambda }{\hat{\Upsilon }}^\lambda {}_{\mu \nu }B^{\mu \nu }-{\hat{\Upsilon }}^\alpha {}_{\nu \lambda }{\hat{\Upsilon }}^\lambda {}_{\alpha \mu }B^{\mu \nu }-B^{\mu \nu }\nabla ^h_\alpha {\hat{\Upsilon }}^\alpha {}_{\mu \nu } \nonumber \\&\quad -B^{\mu \nu }\nabla ^h_\nu {\hat{\Upsilon }}^\alpha {}_{\alpha \mu }+{\mathcal {U}}(B)\Big ]. \end{aligned}$$
(48)
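The two properties of the shifted connection quoted before (48) follow directly from (47), as the following short check shows. It uses the projective transformation in the form \(\Upsilon ^\alpha {}_{\mu \nu }\rightarrow \Upsilon ^\alpha {}_{\mu \nu }+\xi _\mu \delta ^\alpha {}_\nu \), matching the projective mode written above:

$$\begin{aligned} 2{\hat{\Upsilon }}^\alpha {}_{[\alpha \mu ]}&=\Upsilon ^\alpha {}_{\alpha \mu }-\Upsilon ^\alpha {}_{\mu \alpha }+\frac{1-D}{D-1}\Upsilon _\mu =\Upsilon _\mu -\Upsilon _\mu =0, \\ \Upsilon ^\alpha {}_{\mu \nu }\rightarrow \Upsilon ^\alpha {}_{\mu \nu }+\xi _\mu \delta ^\alpha {}_\nu \quad&\Rightarrow \quad \Upsilon _\mu \rightarrow \Upsilon _\mu -(D-1)\xi _\mu ,\quad {\hat{\Upsilon }}^\alpha {}_{\mu \nu }\rightarrow {\hat{\Upsilon }}^\alpha {}_{\mu \nu }. \end{aligned}$$

By inspection of (48), the projective mode \(\Upsilon _\mu \) then enters only through the coupling \(B^{\mu \nu }\partial _{[\mu }\Upsilon _{\nu ]}\), and varying with respect to \(\Upsilon _\nu \) after an integration by parts gives \(\partial _\mu (\sqrt{-h}B^{\mu \nu })=0\), i.e. the constraint (45).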

We then see that the projective mode \(\Upsilon _\mu \) is in fact responsible for the divergence-free constraint on the 2-form field. From this form of the action we can already understand the root of the pathologies. Firstly, the absence of a pure kinetic term for the projective mode will render this sector unstable on arbitrary \(B_{\mu \nu }\) backgrounds. To show this, let us consider a background where the 2-form develops a non-trivial profile. On such a background, and leaving out kinetic terms and/or non-minimal couplings that will not affect our argument here, the relevant sector is described by

$$\begin{aligned} {\mathcal {S}}\supset \int {\mathrm{d}}^Dx\sqrt{-h}\Big (B^{\mu \nu }\partial _{[\mu }\Upsilon _{\nu ]}-m^2M^{\alpha \beta \mu \nu }B_{\alpha \beta }B_{\mu \nu }\Big ), \end{aligned}$$
(49)

where \(m^2\) is some mass parameter and \(M^{\alpha \beta \mu \nu }\) the mass tensor that depends on the background configuration, with the obvious symmetries of being antisymmetric in the first and second pair of indices and symmetric under the exchange \((\alpha \beta )\leftrightarrow (\mu \nu )\). If the background 2-form field is trivial, the mass tensor reduces to \(M^{\alpha \beta \mu \nu }=h^{\alpha [\mu }h^{\nu ]\beta }\) so we have

$$\begin{aligned} {\mathcal {S}}\supset \int {\mathrm{d}}^Dx\sqrt{-h}\Big (B^{\mu \nu }\partial _{[\mu }\Upsilon _{\nu ]}-m^2B_{\mu \nu }B^{\mu \nu }\Big ). \end{aligned}$$
(50)

We can diagonalise this sector by performing the field re-definition \(B^{\mu \nu }={\hat{B}}^{\mu \nu }+\frac{1}{2m^2}\partial ^{[\mu }\Upsilon ^{\nu ]}\), and the above action now reads

$$\begin{aligned} {\mathcal {S}}\supset \int {\mathrm{d}}^Dx\sqrt{-h}\left( \frac{1}{4m^2}\partial _{[\mu }\Upsilon _{\nu ]}\partial ^{[\mu }\Upsilon ^{\nu ]}-m^2{\hat{B}}_{\mu \nu }{\hat{B}}^{\mu \nu }\right) . \end{aligned}$$
(51)
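The step from (50) to (51) can be checked by direct substitution. Writing \(X_{\mu \nu }\equiv \partial _{[\mu }\Upsilon _{\nu ]}\) as a shorthand introduced only for this check, the integrand of (50) becomes

$$\begin{aligned} B^{\mu \nu }X_{\mu \nu }-m^2B_{\mu \nu }B^{\mu \nu }&=\Big ({\hat{B}}^{\mu \nu }X_{\mu \nu }+\frac{1}{2m^2}X_{\mu \nu }X^{\mu \nu }\Big )-\Big (m^2{\hat{B}}_{\mu \nu }{\hat{B}}^{\mu \nu }+{\hat{B}}^{\mu \nu }X_{\mu \nu }+\frac{1}{4m^2}X_{\mu \nu }X^{\mu \nu }\Big ) \\&=\frac{1}{4m^2}X_{\mu \nu }X^{\mu \nu }-m^2{\hat{B}}_{\mu \nu }{\hat{B}}^{\mu \nu }. \end{aligned}$$

The mixing between \({\hat{B}}_{\mu \nu }\) and \(\Upsilon _\mu \) cancels and the coefficient of the \(\Upsilon _\mu \) kinetic term is fixed to \(1/(4m^2)\), whose sign is tied to that of the mass term.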

Once this sector of the gravitational action has been diagonalised, it becomes apparent that the projective mode acquires the usual gauge-invariant Maxwellian kinetic term for a vector field, but with the wrong sign. One could obtain the correct sign by assuming \(m^2<0\), but then the 2-form sector would have the wrong sign for the mass term and, consequently, the ghost would appear there. In either case, we clearly see that the presence of a ghost around a trivial \(B^{\mu \nu }\) background is unavoidable. However, there is the possibility that within a non-trivial \(B^{\mu \nu }\) background the 2-form field behaves as a ghost condensate. In order to see if this is the case, notice that in a general \(B^{\mu \nu }\) background, the diagonalisation requires a field re-definition of the form

$$\begin{aligned} B^{\mu \nu }={\hat{B}}^{\mu \nu }+\frac{1}{2m^2}\Lambda ^{\mu \nu \alpha \beta }\partial _{[\alpha }\Upsilon _{\beta ]} \end{aligned}$$
(52)

with \(\Lambda ^{\mu \nu \alpha \beta }\) satisfying generally

$$\begin{aligned} M^{\alpha \beta \lambda \kappa }\Lambda _{\lambda \kappa }{}^{\mu \nu }=h^{\alpha [\mu }h^{\nu ]\beta }. \end{aligned}$$
(53)

In this case, the relevant sector of the gravitational action can be written as

$$\begin{aligned}&{\mathcal {S}}\supset \int {\mathrm{d}}^Dx\sqrt{-h}\bigg (\frac{1}{4m^2}\Lambda ^{\alpha \beta \mu \nu }\partial _{[\alpha }\Upsilon _{\beta ]}\partial _{[\mu }\Upsilon _{\nu ]}\nonumber \\&\quad -m^2M^{\alpha \beta \mu \nu }{\hat{B}}_{\alpha \beta }{\hat{B}}_{\mu \nu }\bigg ). \end{aligned}$$
(54)

To see whether the ghost persists in general we have to look at the signature of \(\Lambda ^{\alpha \beta \mu \nu }\) and \(M^{\alpha \beta \mu \nu }\). The ghostly nature of the projective mode is avoided if \(\Lambda ^{\alpha \beta \mu \nu }\) is a super-metric with the same signature as \(-h^{\alpha [\mu }h^{\nu ]\beta }\), with \(h^{\mu \nu }\) a Lorentzian metric. On the other hand, stability of the 2-form sector requires a mass tensor with the signature of \(h^{\alpha [\mu }h^{\nu ]\beta }\). These two conditions are, however, inconsistent with each other by virtue of the relation (53), and therefore no ghost condensation can stabilise the theory. Thus, we find that the presence of a ghost in the projective sector of generalised RBGs is unavoidable and occurs in an arbitrary background. This is the ghost found in Sect. 5.1 beyond the decoupling limit.
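The way relation (53) ties the two signs together can be illustrated in the simplest setting. Assume, purely for illustration (this special case is not analysed in the text), a background for which the mass tensor is proportional to the trivial one, \(M^{\alpha \beta \mu \nu }=\mu \,h^{\alpha [\mu }h^{\nu ]\beta }\), with \(\mu \) a non-vanishing constant introduced here. Since \(h^{\alpha [\mu }h^{\nu ]\beta }\) acts as the identity on antisymmetric index pairs, (53) is then solved by \(\Lambda ^{\alpha \beta \mu \nu }=\mu ^{-1}h^{\alpha [\mu }h^{\nu ]\beta }\), and the two operators in (54) become

$$\begin{aligned} \frac{1}{4m^2}\Lambda ^{\alpha \beta \mu \nu }\partial _{[\alpha }\Upsilon _{\beta ]}\partial _{[\mu }\Upsilon _{\nu ]}&=\frac{1}{4m^2\mu }\,\partial _{[\alpha }\Upsilon _{\beta ]}\partial ^{[\alpha }\Upsilon ^{\beta ]},\\ m^2M^{\alpha \beta \mu \nu }B_{\alpha \beta }B_{\mu \nu }&=m^2\mu \,B_{\alpha \beta }B^{\alpha \beta }. \end{aligned}$$

Both coefficients carry the sign of the same factor \(\mu \): for \(\mu >0\) one is back to the situation of (51), while \(\mu <0\) corrects the kinetic term of the projective mode only at the price of a wrong-sign mass term for the 2-form, exactly as for \(m^2<0\) in the trivial background.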

It is interesting to notice that the re-definition of the 2-form field that diagonalises the quadratic action for the trivial background configuration corresponds to a gauge-like transformation for the 2-form; hence, its field strength will be oblivious to such a re-definition. In particular, this means that kinetic terms with the correct gauge-invariant form \(H^2\) will not be affected by the diagonalisation and, therefore, cannot change our conclusion about the presence of a ghost. The same reasoning applies to non-trivial backgrounds that vary weakly as compared to \(m^2\). If this is not the case, one might envision that sufficiently strongly varying backgrounds could give rise to a stabilisation à la ghost condensate. Even without taking into account couplings to gravity, it should be apparent that there will always be UV modes with a sufficiently high frequency for which the background is effectively constant and, therefore, our discussion above will also apply, thus showing the pathological character of these modes. A natural way around this problem is to assume that those modes are beyond the regime of validity of the theory and, consequently, that they do not pose an actual problem. In that case, however, the full EFT approach should be taken from the very beginning. Moreover, there will also be non-minimal couplings to the curvature, which after diagonalisation will introduce yet additional pathologies arising from that sector, so our hopes stand on shaky grounds anyway. To understand this, we must look at the connection equations (44), from which it is apparent that the solution for \({\hat{\Upsilon }}\) will have the schematic form

$$\begin{aligned} {\hat{\Upsilon }}\sim \frac{\nabla ^h B}{1+B}. \end{aligned}$$
(55)

Plugging this solution back into the RBG action written as (48) and integrating out the non-dynamical piece of the connection \({\hat{\Upsilon }}\), additional terms like \((\nabla ^h B)^2\) and \(B(\nabla ^h)^2B\) will arise. The latter can be integrated by parts to be put in the form of the former. Doing this, however, can result in non-gauge-invariant derivative terms and/or non-minimal couplings arising from commuting covariant derivatives. Both kinds of terms are potentially dangerous and a source of ghost-like instabilities. It is remarkable that the quadratic derivative terms generated in the action can be brought into the standard gauge-invariant kinetic term of a 2-form. However, this is an accident of the leading-order solution and it is broken at higher orders. Let us see this explicitly.
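A schematic sketch of how commuting covariant derivatives generates non-minimal couplings is the following. Take, purely as an illustration, the operator \(\nabla _\mu B_{\nu \lambda }\nabla ^\nu B^{\mu \lambda }\), where all covariant derivatives are those of \(h_{\mu \nu }\). Both the choice of this operator and the curvature convention \([\nabla _\mu ,\nabla _\nu ]V^\rho =R^\rho {}_{\sigma \mu \nu }V^\sigma \), \(R_{\sigma \nu }=R^\mu {}_{\sigma \mu \nu }\) are assumptions made here, and the signs of the curvature terms depend on that convention. Integrating by parts twice and dropping boundary terms gives

$$\begin{aligned} \int {\mathrm{d}}^Dx\sqrt{-h}\,\nabla _\mu B_{\nu \lambda }\nabla ^\nu B^{\mu \lambda }&=-\int {\mathrm{d}}^Dx\sqrt{-h}\,B_{\nu \lambda }\Big (\nabla ^\nu \nabla _\mu B^{\mu \lambda }+[\nabla _\mu ,\nabla ^\nu ]B^{\mu \lambda }\Big )\\&=\int {\mathrm{d}}^Dx\sqrt{-h}\,\Big (\nabla ^\nu B_{\nu \lambda }\nabla _\mu B^{\mu \lambda }-R_{\sigma }{}^{\nu }B_{\nu \lambda }B^{\sigma \lambda }-R^{\lambda }{}_{\sigma \mu }{}^{\nu }B_{\nu \lambda }B^{\mu \sigma }\Big ). \end{aligned}$$

The first term is a divergence-squared term, while the last two couple the 2-form directly to the Ricci and Riemann tensors of \(h_{\mu \nu }\), which is the kind of non-minimal coupling referred to above.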

5.3 Solving for the connection

We will illustrate the form of the solutions for the connection by considering vacuum configurations, so that the action is given by

$$\begin{aligned} {\mathcal {S}}=\frac{1}{2}\int {\mathrm{d}}^Dx \Big [\sqrt{-q}q^{\mu \nu }{\mathcal {R}}_{\mu \nu }(\Gamma )+{\mathcal {U}}(q)\Big ]. \end{aligned}$$
(56)

The connection equations for this action are the same as we obtained in (25) or (27), i.e., the connection deprived of its projective mode satisfies

$$\begin{aligned}&\partial _\lambda (\sqrt{-q}q^{\mu \nu })+{\hat{\Gamma }}^{\mu }{}_{\lambda \alpha }\sqrt{-q}q^{\alpha \nu }\nonumber \\&\quad +{\hat{\Gamma }}^{\nu }{}_{\alpha \lambda }\sqrt{-q}q^{\mu \alpha }-{\hat{\Gamma }}^{\alpha }{}_{\lambda \alpha }\sqrt{-q}q^{\mu \nu }=0. \end{aligned}$$
(57)

This equation does allow us, at least formally, to solve algebraically for the connection in terms of \(q^{\mu \nu }\). With this aim, let us again decompose the connection as in (41), so that we extract the Levi-Civita connection of the symmetric component \(h^{\mu \nu }\). The projectively transformed connection is therefore given by

$$\begin{aligned} {{\hat{\Gamma }}}^\alpha {}_{\mu \beta }={\{^{\;{\alpha }}_{{\mu }{\beta }}\}}(h)+{\hat{\Upsilon }}^\alpha {}_{\mu \beta } \end{aligned}$$
(58)

where \({\hat{\Upsilon }}\) is defined as in (47). We can now introduce the above splitting (58) into the connection equations (57). By performing the usual trick of adding and subtracting the resulting equation with suitably permuted indices, we can write a formal solution for the connection as

$$\begin{aligned}&{\hat{\Upsilon }}^{\alpha }{}_{\mu \nu }\nonumber \\&\quad =\left[ \frac{1}{2} h^{\kappa \lambda }\left( \nabla _{\beta }^{h} B_{\gamma \lambda }+\nabla _{\gamma }^{h} B_{\lambda \beta }-\nabla _{\lambda }^{h} B_{\beta \gamma }\right) \right] \left( A^{-1}\right) _{\kappa }{}^{\alpha }{}_{\mu \nu }{}^{\beta \gamma }, \end{aligned}$$
(59)

where by definition \({A}^\kappa {}_{\alpha '}{}^{\mu '\nu '}{}_{\beta \gamma } ({A}^{-1})_\kappa {}^\alpha {}_{\mu \nu }{}^{\beta \gamma }\equiv \delta ^{\alpha '}{}_{\alpha } \delta ^{\mu '}{}_{\mu } \delta ^{\nu '}{}_{\nu }\). Here \({A}^\kappa {}_{\alpha '}{}^{\mu '\nu '}{}_{\beta \gamma }\) is linear in \(B_{\mu \nu }\) and is given by

$$\begin{aligned} \begin{aligned}&{A}^\kappa {}_{\alpha }{}^{\mu \nu }{}_{\beta \gamma } \equiv a^\kappa {}_{\alpha }{}^{\mu \nu }{}_{\beta \gamma }+b^\kappa {}_{\alpha }{}^{\mu \nu }{}_{\beta \gamma }{}^{\rho \sigma }B_{\rho \sigma }\\&a^\kappa {}_{\alpha }{}^{\mu \nu }{}_{\beta \gamma } \equiv \delta ^\kappa {}_\alpha \delta ^\mu {}_\beta \delta ^\nu {}_\gamma +\frac{1}{2} \delta ^\mu {}_{\alpha }\left( h^{\nu \kappa } h_{\beta \gamma }-2 \delta ^{\nu }{}_{(\beta } \delta ^\kappa {}_{\gamma )}\right) \\&b^\kappa {}_{\alpha }{}^{\mu \nu }{}_{\beta \gamma }{}^{\rho \sigma }=\frac{1}{2}\Big [ h_{\alpha \gamma } h^{\mu \sigma } \delta ^\nu {}_{\beta } h^{\rho \kappa }+\delta ^{\rho }{}_\beta h_{\alpha \gamma } h^{\mu \kappa } h^{\nu \sigma }\\&\quad +\delta ^{\rho }{}_\gamma \delta ^\mu {}_{\alpha } \delta ^\nu {}_{\beta } h^{\kappa \sigma }-h^{\rho \kappa } \delta ^{\mu }{}_\gamma h^{\nu \sigma } h_{\alpha \beta }-\delta ^{\rho }{}_ \beta h^{\sigma \kappa } \delta ^\mu {}_{\alpha }\delta ^{\nu }{}_\gamma \\&\quad -\delta ^{\rho }{}_ \beta \delta ^{\sigma }{}_\gamma \delta ^{\mu }{}_\alpha h^{\nu \kappa }-\delta ^{\rho }{}_\gamma h_{\alpha \beta } h^{\mu \sigma } h^{\nu \kappa }\\&\quad -\delta ^{\rho }{}_{\gamma } \delta ^{\kappa }{}_{\alpha } \delta ^{\mu }{}_\beta h^{\nu \sigma }+\delta ^{\rho }{}_\beta \delta ^{\kappa }{}_ \alpha h^{\mu \sigma } \delta ^{\nu }{}_\gamma \Big ]. \end{aligned} \end{aligned}$$
(60)

In order to explicitly show the appearance of problematic couplings, it will suffice to give a perturbative solution to lowest-order in B. To that end, let us consider a trivial 2-form background and expand around it, leaving the symmetric sector \(h^{\mu \nu }\) completely general. The only task then is either to compute the \({{\mathcal {O}}}(B^0)\) term of \(({A}^{-1})_\kappa {}^\alpha {}_{\mu \nu }{}^{\beta \gamma }\) or to directly solve the equations (44) for \({\hat{\Upsilon }}\) expanded as a power series. Let us proceed with the second method by expanding \({\hat{\Upsilon }}\) as a power series of the 2-form B in the form

$$\begin{aligned} {\hat{\Upsilon }}^{\alpha }{}_{\mu \nu }=\sum _{n=0}^{\infty }{\hat{\Upsilon }}_{(n)}{}^\alpha {}_{\mu \nu }, \end{aligned}$$
(61)

where the sub-index n implies that the quantity is of order \({\mathcal {O}}(B^n)\). We can now use (58) to split the connection symbols that appear in (57), and plugging the expansion of \({\hat{\Upsilon }}^{\alpha }{}_{\mu \nu }\) into the resulting equation, we obtain

$$\begin{aligned}&\nabla ^h_\lambda B^{\mu \nu }-B^{\mu \nu }{\hat{\Upsilon }}_{(0)}{}^{\alpha }{}_{\lambda \alpha }-B^\nu {}_\alpha {\hat{\Upsilon }}_{(0)}{}^{\mu \alpha }{}_{\lambda }+B^\mu {}_\alpha {\hat{\Upsilon }}_{(0)}{}^{\nu }{}_\lambda {}^\alpha \nonumber \\&\quad +h^{\mu \nu }{\hat{\Upsilon }}_{(1)}{}^\alpha {}_{\lambda \alpha }-{\hat{\Upsilon }}_{(1)}{}^{\mu \nu }{}_\lambda -{\hat{\Upsilon }}_{(1)}{}^{\nu }{}_\lambda {}^\mu \nonumber \\&\quad -{\hat{\Upsilon }}_{(0)}{}^{\mu \nu }{}_\lambda -{\hat{\Upsilon }}_{(0)}{}^{\nu }{}_\lambda {}^\mu +h^{\mu \nu }{\hat{\Upsilon }}_{(0)}{}^{\alpha }{}_{\lambda \alpha }={\mathcal {O}}(B^2). \end{aligned}$$
(62)

Notice that this equation is consistent with substituting the perturbative series (61) in (46), as it should be (see footnote 12). The zeroth order term gives the equation

$$\begin{aligned} {\hat{\Upsilon }}_{(0)}{}^{\mu \nu }{}_\lambda +{\hat{\Upsilon }}_{(0)}{}^{\nu }{}_\lambda {}^\mu -h^{\mu \nu }{\hat{\Upsilon }}_{(0)}{}^{\alpha }{}_{\lambda \alpha }=0, \end{aligned}$$
(63)

which after contracting with \(h_{\mu \nu }\) gives \({\hat{\Upsilon }}_{(0)}{}^\alpha {}_{\lambda \alpha }=0\) for \(D\ne 2\). This leaves us with the equation \({\hat{\Upsilon }}_{(0)}{}^{\mu \nu }{}_\lambda +{\hat{\Upsilon }}_{(0)}{}^{\nu }{}_\lambda {}^\mu =0\). Performing the permutation trick, we arrive at

$$\begin{aligned} {\hat{\Upsilon }}_{(0)}{}^\alpha {}_{\mu \nu }=0, \end{aligned}$$
(64)
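The two steps leading to (64) can be spelled out as follows. This is a sketch that lowers all indices with \(h_{\mu \nu }\) and uses the identity \({\hat{\Upsilon }}_{(n)}{}^\alpha {}_{[\alpha \beta ]}=0\), which the text quotes later in Sect. 6.2:

$$\begin{aligned}&h_{\mu \nu }\Big ({\hat{\Upsilon }}_{(0)}{}^{\mu \nu }{}_\lambda +{\hat{\Upsilon }}_{(0)}{}^{\nu }{}_\lambda {}^\mu -h^{\mu \nu }{\hat{\Upsilon }}_{(0)}{}^{\alpha }{}_{\lambda \alpha }\Big )={\hat{\Upsilon }}_{(0)}{}^{\mu }{}_{\mu \lambda }+{\hat{\Upsilon }}_{(0)}{}^{\nu }{}_{\lambda \nu }-D\,{\hat{\Upsilon }}_{(0)}{}^{\alpha }{}_{\lambda \alpha }=(2-D)\,{\hat{\Upsilon }}_{(0)}{}^{\alpha }{}_{\lambda \alpha },\\&{\hat{\Upsilon }}_{(0)\mu \nu \lambda }=-{\hat{\Upsilon }}_{(0)\nu \lambda \mu }=+{\hat{\Upsilon }}_{(0)\lambda \mu \nu }=-{\hat{\Upsilon }}_{(0)\mu \nu \lambda }\;\Rightarrow \;{\hat{\Upsilon }}_{(0)\mu \nu \lambda }=0. \end{aligned}$$

The first line shows where the condition \(D\ne 2\) enters. The second line applies the traceless equation \({\hat{\Upsilon }}_{(0)\mu \nu \lambda }+{\hat{\Upsilon }}_{(0)\nu \lambda \mu }=0\) three times with cyclically permuted indices, which forces the tensor to equal minus itself.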

which ensures that the Levi-Civita connection of \(h^{\mu \nu }\) is the solution (up to a projective mode) for the affine connection for a symmetric metric. Note that this was expected since for vanishing \(B^{\mu \nu }\) we are describing GR, which has the Levi-Civita connection of the metric as the only solution (up to a projective mode) [26, 48]. Plugging the zeroth-order result into (62), we arrive at the equation for the \({\mathcal {O}}(B)\) piece of \({\hat{\Upsilon }}^\alpha {}_{\mu \nu }\), which reads

$$\begin{aligned} \nabla ^h_\lambda B^{\mu \nu }-{\hat{\Upsilon }}_{(1)}{}^{\mu \nu }{}_\lambda -{\hat{\Upsilon }}_{(1)}{}^\nu {}_\lambda {}^\mu +h^{\mu \nu }{\hat{\Upsilon }}_{(1)}{}^\alpha {}_{\lambda \alpha }=0. \end{aligned}$$
(65)

Contracting with \(h_{\mu \nu }\) we obtain the condition \({\hat{\Upsilon }}_{(1)}{}^\alpha {}_{\lambda \alpha }=0\) for \(D\ne 2\), which leads to the equation \(\nabla ^h_\lambda B^{\mu \nu }-{\hat{\Upsilon }}_{(1)}{}^{\mu \nu }{}_\lambda -{\hat{\Upsilon }}_{(1)}{}^{\nu }{}_\lambda {}^\mu =0\). This can be solved again by performing the permutation trick, thus obtaining

$$\begin{aligned} {\hat{\Upsilon }}_{(1)}{}^\alpha {}_{\mu \nu }=\frac{1}{2}h^{\alpha \lambda }\left( \nabla ^h_\mu B_{\nu \lambda }+\nabla ^h_\nu B_{\lambda \mu }-\nabla ^h_\lambda B_{\mu \nu }\right) , \end{aligned}$$
(66)

in agreement with previous results in NGT obtained in [25] and the formal solution (59) given above (see footnote 13). As stated at the end of the previous section, and analogously to the results on NGT in [25], the dependence of \({\hat{\Upsilon }}\) on the derivatives of \(B^{\mu \nu }\) will introduce additional pathologies in the 2-form field. As a matter of fact, upon substitution of this solution into (48) and integration by parts, we arrive at the desired action similar to (37) featuring a gauge-invariant kinetic term for the 2-form together with the non-minimal couplings advertised above. Again, the gauge invariance of the derivative operators for the 2-form is an accident of this order, and it is broken at cubic and higher orders. It is possible, although tedious, to obtain the solution for \(\Upsilon \) at arbitrary order by following this perturbative scheme. Obtaining a full solution in closed form appears to be a more challenging task.
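As a cross-check of (66), one can lower all indices with \(h_{\mu \nu }\) and substitute the solution back into the traceless first-order equation \(\nabla ^h_\lambda B^{\mu \nu }-{\hat{\Upsilon }}_{(1)}{}^{\mu \nu }{}_\lambda -{\hat{\Upsilon }}_{(1)}{}^{\nu }{}_\lambda {}^\mu =0\). The following is a sketch of that substitution, using only the antisymmetry of \(B_{\mu \nu }\):

$$\begin{aligned} {\hat{\Upsilon }}_{(1)\mu \nu \lambda }&=\frac{1}{2}\left( \nabla ^h_\nu B_{\lambda \mu }+\nabla ^h_\lambda B_{\mu \nu }-\nabla ^h_\mu B_{\nu \lambda }\right) ,\\ {\hat{\Upsilon }}_{(1)\nu \lambda \mu }&=\frac{1}{2}\left( \nabla ^h_\lambda B_{\mu \nu }+\nabla ^h_\mu B_{\nu \lambda }-\nabla ^h_\nu B_{\lambda \mu }\right) ,\\ {\hat{\Upsilon }}_{(1)\mu \nu \lambda }+{\hat{\Upsilon }}_{(1)\nu \lambda \mu }&=\nabla ^h_\lambda B_{\mu \nu }. \end{aligned}$$

The sum reproduces the source term, so (66) indeed solves the equation. Its trace is \({\hat{\Upsilon }}_{(1)}{}^\alpha {}_{\lambda \alpha }=\nabla _h^\rho B_{\rho \lambda }\), which vanishes once the divergence-free constraint (45) on the 2-form is imposed, consistently with the trace condition used above.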

5.4 On more general actions

So far we have focused on theories constructed in terms of the Ricci tensor alone as a simplified proxy to prove the pathological character of general metric-affine theories described by higher order curvature actions. Our results should suffice to clearly identify the origin of the potential pathologies in more general metric-affine theories where not only the Ricci tensor appears in the action, but arbitrary non-linear terms constructed with the Riemann curvature tensor. In general, if we have an action with an arbitrary dependence on the Riemann tensor formulated in an affine geometry, we can always introduce the splitting of the connection into its Levi-Civita part, the torsion and the non-metricity. That way, it is possible to re-formulate the theory in a pure Riemannian geometry with additional fields. These fields, i.e. the torsion and the non-metricity, can be decomposed into their irreducible representations under some appropriate group, GL(4,\(\mathbb {R}\)) or ISO(1,3) being natural choices (see e.g. [49]), and they will feature non-minimal couplings to the curvature and, quite generically, these will involve either derivatives of the fields or couplings of spin higher than zero. In both cases, as is well known, such interactions are prone to pathologies, especially to Ostrogradski instabilities. In the preceding sections we have explicitly shown how these expected pathologies come about for a particular class of theories, where stable theories arise only when the extra fields drop from the spectrum, in which case they simply reduce to GR. It is clear that the same problems will persist for more general actions.

It is important to emphasise that we are providing a general argument against some commonly quoted statements (see footnote 14) that metric-affine theories avoid instabilities because the field equations remain of second order. This does not mean, however, that all metric-affine theories with higher order curvature terms featuring additional propagating dofs (other than the graviton) will be pathological, but one should be careful about how these theories are constructed and not take for granted that the very fact of using a metric-affine formulation prevents the appearance of ghosts from operators involving arbitrary powers of the Riemann tensor. Of course, non-pathological theories exist and they can be constructed in a variety of manners (some of which we will discuss in Sect. 7), usually by introducing additional symmetries, constraints or geometrical identities. However, it should be clear from our discussion that one should be careful when constructing theories in a metric-affine framework. The general problematic character of metric-affine gravity theories can be seen from the analysis of the perturbative dof’s around Minkowski performed in [50], where it was shown that, already at that level, wise choices of parameters must be made to avoid instabilities. It is important, however, to stress that our analysis above goes beyond the linear regime around Minkowski and, in fact, some of the diagnosed instabilities cannot be seen from such a perturbative analysis. Thus, although the perturbative analysis gives necessary conditions for stability, it is not sufficient to ensure the non-linear stability of the theories. For an example of how the perturbative analysis is not sufficient see e.g. [51,52,53] within the context of Poincaré gauge theories.

Let us finally briefly comment on how our results can be relevant for a pure effective field theory (EFT) approach to the metric-affine theories. This approach has been thoroughly pursued in [47] within the class of Riemann-Cartan geometries including up to dimension 4 operators. We have seen that higher order powers of the Riemann tensor generically introduce ghost-like instabilities in the metric-affine formalism very much like in the metric approach (essentially for the same reasons). It is possible, however, to adopt an EFT approach where these will just be irrelevant operators with perturbative effects below the cut-off of the theory. In this view, the ghosts are not really part of the perturbative spectrum of the theory because their masses are beyond the cut-off scale, so they are harmless. If the gravitational cut-off is assumed to be the Planck mass and the Wilsonian coefficients are \({\mathcal {O}}(1)\) according to naturalness arguments, then the EFT will be similar (with additional fields) to the usual EFT approach to GR. On the other hand, if we assume that the Planck scale only represents the cut-off for the purely metric sector and the metric-affine sector comes in with another scale \(M<M_{\mathrm{Pl}}\), then one would expect the EFT to break down at that scale. This implies for instance that classical solutions where the curvature becomes larger than M cannot be generically trusted.

6 Matter couplings

In the preceding sections we have only considered matter fields which do not couple to the connection. However, our conclusions on the presence of pathological dof’s do not change substantially by coupling the connection to the matter sector. The coupling to matter fields in a metric-affine framework is an interesting issue by itself, especially when it involves spinor fields (see e.g. [23, 24, 49]). It is beyond the scope of this section to carefully go through the different coupling prescriptions to matter or their consistency. Our aim is to show how our results above are not substantially affected in the presence of matter fields with minimal couplings, as well as to discuss some non-minimal couplings that can be safely included.

6.1 The Non-symmetric gravity frame for non-minimally coupled fields

Curvature couplings to the matter sector include derivatives of the affine connection in the matter Lagrangian. This further complicates the connection Eq. (71) by adding extra terms on the right hand side. However, there is a class of couplings for which, while adding technical complications, the qualitative results remain the same with just some minor adjustments with respect to the minimally coupled fields. We will start by considering bosonic fields whose non-minimal couplings are through the Ricci tensor. To illustrate this point, we can consider a scalar field \(\varphi \) as a proxy for the matter sector. If we restrict to only first derivatives of the scalar, we can use for instance \({\mathcal {R}}^{\mu \nu }\partial _\mu \varphi \partial _\nu \varphi \) or \({\mathcal {R}}(\partial \varphi )^2\) in our action. In the usual metric formalism, these two terms are only allowed if they enter through the specific combination \((R^{\mu \nu }-\frac{1}{2} Rg^{\mu \nu })\partial _\mu \varphi \partial _\nu \varphi \) and accompanied by the appropriate second derivative interactions of the scalar field in order to avoid Ostrogradski instabilities. In the metric-affine formalism however, this is not necessary and the dependence on said terms is completely arbitrary. Let us note that these interactions will not break the projective symmetry since they only depend on the symmetric part of the Ricci tensor. Interestingly, it has been suggested in [15] that the projective symmetry could also play a crucial role to guarantee the absence of ghosts for theories containing up to second order covariant derivatives of a scalar field. They also find that in the stable theories the connection is devoid of any propagating mode as a consistency condition as we argued above.

Our reasoning can be straightforwardly extended to other fields such as vector fields \(A_\mu \) where interactions like \({\mathcal {R}}_{\mu \nu }A^\mu A^\nu \) or \({\mathcal {R}}_{\mu \nu } F^{\mu \alpha }F^\nu {}_\alpha \) also respect the projective symmetry and are permitted. The crucial point of all these interactions is that an Einstein frame still exists where it is apparent that the connection remains an auxiliary field [16]. In the absence of the projective symmetry, we will encounter the same pathologies as exposed for the pure gravitational sector and the inclusion of a contrived matter sector cannot remedy it. In Sect. 3.2 we showed how to go to the Einstein frame of RBG theories for minimally coupled matter fields. Let us see here how to proceed in the presence of non-minimally coupled matter fields. In this case the action reads

$$\begin{aligned} {\mathcal {S}}[g_{\mu \nu },\Gamma ,\Psi ]=\frac{1}{2}\int {\mathrm{d}}^D x \sqrt{-g}\,F\big (g^{\mu \nu },{\mathcal {R}}_{\mu \nu }\big )+{\mathcal {S}}_m[g,\Psi ,\Gamma ]. \end{aligned}$$
(67)

In parallel with Sect. 5.2, we now go to the Einstein frame of the above theory, and after splitting the corresponding auxiliary metric as in (42) and the connection as in (41), and also isolating the projective mode from \(\Upsilon ^\alpha {}_{\mu \nu }\) as in (47), we get

$$\begin{aligned} {\mathcal {S}}&=\frac{1}{2}\int {\mathrm{d}}^Dx\sqrt{-h}\Big [R(h)-\frac{2}{D-1}B^{\mu \nu }\partial _{[\mu }\Upsilon _{\nu ]}\nonumber \\&\quad +{\hat{\Upsilon }}^\alpha {}_{\alpha \lambda }{\hat{\Upsilon }}^\lambda {}_{\kappa }{}^\kappa -{\hat{\Upsilon }}^{\alpha \mu \lambda }{\hat{\Upsilon }}_{\lambda \alpha \mu }-{\hat{\Upsilon }}^\alpha {}_{\alpha \lambda }{\hat{\Upsilon }}^\lambda {}_{\mu \nu }B^{\mu \nu }\nonumber \\&\quad -{\hat{\Upsilon }}^\alpha {}_{\nu \lambda }{\hat{\Upsilon }}^\lambda {}_{\alpha \mu }B^{\mu \nu }-B^{\mu \nu }\nabla ^h_\alpha {\hat{\Upsilon }}^\alpha {}_{\mu \nu }\nonumber \\&\quad -B^{\mu \nu }\nabla ^h_\nu {\hat{\Upsilon }}^\alpha {}_{\alpha \mu }+{\mathcal {U}}(B)\Big ]+{{\tilde{{\mathcal {S}}}}}_m[h,B,\Psi ,{{\hat{\Upsilon }}},\Upsilon ]. \end{aligned}$$
(68)

where now \({{\tilde{{\mathcal {S}}}}}_m\) is the matter action in the Einstein frame, and the variables inside square brackets mean that the matter action can depend on those fields and their derivatives in general. Concretely, \(\Upsilon \) stands for the dependence of the matter action on the projective mode, so it will be absent for projectively invariant matter. It is then apparent that the gravitational sector features the same pathological terms. Obviously, a trivial matter sector background will not modify those terms. A non-trivial matter background contributing to the background symmetric part of the metric could help by providing a kinetic term for the projective mode. However, the non-minimal couplings to the curvature for the 2-form that are generated after integrating \({\hat{\Upsilon }}\) out are hardly cured. In any case, this would require very specific choices of the matter sector. To make this statement more explicit, let us consider a particular class of matter sector coupled to the connection.

6.2 Ultra-local matter couplings

For matter actions which do not include curvature couplings (i.e. no derivatives of the connection), we already know that the projective mode will be problematic due to the absence of a proper kinetic term for it. In order to understand whether the inclusion of a general coupling between matter and connection can solve the instability problems, we can now compare the above action (68) to (48). First notice that the divergence-free constraint of the 2-form (45) that came from the field equations of the projective mode gets modified if non-projectively invariant matter actions are taken into account, and the trace of the hypermomentum now acts as a source for B

$$\begin{aligned} \nabla ^h_\mu B^{\mu \nu }=\frac{D-1}{4}\Delta _\alpha {}^{[\nu \alpha ]} \end{aligned}$$
(69)

where \(\Delta _{\lambda }{}^{\mu \nu }\) is the hypermomentum defined as

$$\begin{aligned} \left. \Delta _{\lambda }{}^{\mu \nu } \equiv 2 \frac{\delta {\mathcal {S}}_{m}}{\delta \Gamma ^{\lambda }{}_{\mu \nu }}\right| _{g_{\mu \nu }}=2 \left. \frac{\delta {\mathcal {S}}_{m}}{\delta \Upsilon ^{\lambda }{}_{\mu \nu }}\right| _{g_{\mu \nu }} \end{aligned}$$
(70)

which vanishes for matter fields that do not couple to the connection. Looking at the form of this action, we can see that the projective mode will in general feature the same problems as in the previous case when the matter and connection did not couple. The Ostrogradski instabilities that arise from the couplings between the 2-form \(B^{\mu \nu }\) and the curvature of \(h_{\mu \nu }\) will still be there no matter what matter action we choose. Therefore, we see that allowing for an arbitrary coupling between matter and connection is not helpful in solving any of the instabilities listed above. To explicitly see what kind of couplings arise, we have to solve the connection equation now with a hypermomentum. Since generally an analytic solution is not possible, and even if it is, it is not very illuminating, we will attempt to find a perturbative solution which will already give us a clear picture of the issue. Let us then write down the connection equations when a coupling between matter and connection is present:

$$\begin{aligned} \begin{aligned}&{\nabla _{\lambda }\left[ \sqrt{-q} q ^{\nu \mu }\right] -\delta ^\mu {}_\lambda \nabla _{\rho }\left[ \sqrt{-q} q ^{ \nu \rho }\right] }=\Delta _{\lambda }{}^{\mu \nu }\\&\quad +\sqrt{-q}\left[ {\mathcal {T}}^{\mu }{}_{\lambda \alpha } q ^{\nu \alpha }+{\mathcal {T}}^{\alpha }{}_{\alpha \lambda } q ^{\nu \mu }-\delta _{\lambda }^{\mu } {\mathcal {T}}^{\alpha }{}_{\alpha \beta } q ^{\nu \beta }\right] . \end{aligned} \end{aligned}$$
(71)

In order to remain as close as possible to the previous analysis in Sect. 5.3, it is necessary to use the shifted connection (27) and find the relation between the hypermomentum of the original connection \(\Delta _\alpha {}^{\mu \nu }\) and the shifted hypermomentum \({\hat{\Delta }}_\alpha {}^{\mu \nu }\), which reads

$$\begin{aligned} \Delta _\alpha {}^{\mu \nu }={\hat{\Delta }}_\alpha {}^{\mu \nu } +\frac{2}{D-1}\delta _\alpha {}^{[\mu }{\hat{\Delta }}_\beta {}^{\nu ]\beta }, \end{aligned}$$
(72)

where the shifted hypermomentum is defined in an analogous manner as (70). This implies that the hypermomentum of projectively invariant matter fields satisfies \({\hat{\Delta }}_\beta {}^{\mu \beta }=0\). We can now recast (71) in the form of (25) by doing the same manipulations, thus finding

$$\begin{aligned}&\partial _\lambda (\sqrt{-q}q^{\mu \nu })+{\hat{\Gamma }}^{\mu }{}_{\lambda \alpha }\sqrt{-q}q^{\alpha \nu }+{\hat{\Gamma }}^{\nu }{}_{\alpha \lambda }\sqrt{-q}q^{\mu \alpha } \nonumber \\&\quad -{\hat{\Gamma }}^{\alpha }{}_{\lambda \alpha }\sqrt{-q}q^{\mu \nu }={\hat{\Delta }}_\lambda {}^{\mu \nu } +\frac{2}{D-1}\delta _\lambda {}^{[\mu }{\hat{\Delta }}_\beta {}^{\nu ]\beta }. \end{aligned}$$
(73)

As in the vanishing hypermomentum case, we can obtain a formal solution for the full connection in the case of arbitrary hypermomentum as

$$\begin{aligned}&{\hat{\Upsilon }}^{\alpha }{}_{\mu \nu }=\frac{1}{2} h^{\kappa \lambda }\left[ \left( \nabla _{\beta }^{h} B_{\gamma \lambda }+\nabla _{\gamma }^{h} B_{\lambda \beta }-\nabla _{\lambda }^{h} B_{\beta \gamma }\right) +\frac{1}{\sqrt{-h}} h^{\kappa \lambda }\left( {\hat{\Delta }}_{\beta \gamma \lambda }\right. \right. \nonumber \\&\quad \left. \left. +{\hat{\Delta }}_{\gamma \lambda \beta }+{\hat{\Delta }}_{\lambda \beta \gamma }+\frac{2}{D-1}h_{\lambda [\gamma }{\hat{\Delta }}^\alpha {}_{\beta ]\alpha }\right) \right] \left( A^{-1}\right) _{\kappa }{}^{\alpha }{}_{\mu \nu }{}^{\beta \gamma }, \end{aligned}$$
(74)

where \(\left( A^{-1}\right) _{\kappa }{}^{\alpha }{}_{\mu \nu }{}^{\beta \gamma }\) is the same operator as in the vanishing hypermomentum case, which is specified in (60). Notice that the above formula points to the fact that the addition of hypermomentum does not solve any of the instabilities due to the dependence of \({\hat{\Upsilon }}\) on the derivatives of \(B_{\mu \nu }\). To see that this is the case, let us find a perturbative solution to the connection in an analogous way to that of Sect. 5.3. First we need to write \({\hat{\Delta }}_ \alpha {}^{\mu \nu }={\hat{\Delta }}^{(0)}_ \alpha {}^{\mu \nu }+{\hat{\Delta }}^{(1)}_ \alpha {}^{\mu \nu }+...\) as a power series in \(B^{\mu \nu }\), where the superscript (n) indicates that such a term is of order \({\mathcal {O}}(B^n)\). Then, after splitting the shifted connection as in (58) and then writing \({\hat{\Upsilon }}^\alpha {}_{\mu \nu }\) as a power series in \(B^{\mu \nu }\) as in (61), we can write (71) in an analogous fashion to (62) as

$$\begin{aligned}&\nabla ^h_\lambda B^{\mu \nu }-B^{\mu \nu }{\hat{\Upsilon }}_{(0)}{}^{\alpha }{}_{\lambda \alpha }-B^\nu {}_\alpha {\hat{\Upsilon }}_{(0)}{}^{\mu \alpha }{}_{\lambda }+B^\mu {}_\alpha {\hat{\Upsilon }}_{(0)}{}^{\nu }{}_\lambda {}^\alpha \nonumber \\&\quad +h^{\mu \nu }{\hat{\Upsilon }}_{(1)}{}^\alpha {}_{\lambda \alpha }-{\hat{\Upsilon }}_{(1)}{}^{\mu \nu }{}_\lambda -{\hat{\Upsilon }}_{(1)}{}^{\nu }{}_\lambda {}^\mu -{\hat{\Upsilon }}_{(0)}{}^{\mu \nu }{}_\lambda \nonumber \\&\quad -{\hat{\Upsilon }}_{(0)}{}^{\nu }{}_\lambda {}^\mu +h^{\mu \nu }{\hat{\Upsilon }}_{(0)}{}^{\alpha }{}_{\lambda \alpha }-{\hat{\Delta }}^{(0)}_\lambda {}^{\mu \nu }+\frac{2}{D-1}\delta _\lambda {}^{[\mu }{\hat{\Delta }}^{(0)}_\beta {}^{\nu ]\beta }\nonumber \\&\quad -{\hat{\Delta }}^{(1)}_\lambda {}^{\mu \nu }+\frac{2}{D-1}\delta _\lambda {}^{[\mu }{\hat{\Delta }}^{(1)}_\beta {}^{\nu ]\beta }={\mathcal {O}}(B^2). \end{aligned}$$
(75)

Notice that in general, \({\hat{\Delta }}^{(n)}_\alpha {}^{\mu \nu }\) might have a complicated dependence on the affine connection, and thus on \({\hat{\Upsilon }}^\alpha {}_{\mu \nu }\), which may further complicate the solution of the above equation for \({\hat{\Upsilon }}^\alpha {}_{\mu \nu }\) order by order in \(B^{\mu \nu }\). Thus, in general, one could make a further expansion of each \({\hat{\Delta }}^{(n)}_\alpha {}^{\mu \nu }={\hat{\Delta }}^{(0,n)}_\alpha {}^{\mu \nu }+{\hat{\Delta }}^{(1,n)}_\alpha {}^{\mu \nu }+...\) where the superscript (m, n) denotes a term of order \({\mathcal {O}}({\hat{\Upsilon }}^m)\) and \({\mathcal {O}}(B^n)\). Since the completely general case is rather cumbersome, and is not particularly illuminating, let us focus on the case where the hypermomentum does not depend on the affine connection, where we can expand only in terms of \(B^{\mu \nu }\). Let us mention that this would be the case, for instance, of minimally coupled spin 1/2 fields, which have a well known hypermomentum of the form \(\Delta ^{(\Psi )}_\alpha {}^{\mu \nu }=-i\sqrt{-q}g_{\alpha \rho }\epsilon ^{\rho \sigma \mu \nu }\left[ {\bar{\Psi }}\gamma _\sigma \gamma _5\Psi \right] .\) Assuming thus no dependence of the hypermomentum on the connection (i.e. the matter action is linear in the connection; see footnote 15), we can proceed exactly as in Sect. 5.3 to obtain the following zeroth and first order solutions:

$$\begin{aligned}&{\hat{\Upsilon }}_{(0)}{}^\alpha {}_{\mu \nu }=\frac{1}{\sqrt{-h}}\left. \Bigg [{\hat{\Delta }}^{(0)}{}_{\mu \nu }{}^\alpha +{\hat{\Delta }}^{(0)}_\nu {}^\alpha {}_\mu -{\hat{\Delta }}^{(0)}{}^\alpha {}_{\mu \nu } \right. \nonumber \\&\quad +\frac{2}{D-1}\delta ^\lambda {}_{[\nu }{\hat{\Delta }}^{(0)}{}^\alpha {}_{\mu ]\lambda }\nonumber \\&\quad \left. +\frac{1}{2(D-2)}\left( h_{\mu \nu }{\hat{\Delta }}^{(0)}{}^{\alpha \lambda }{}_\lambda -2\delta ^\alpha {}_{(\mu }{\hat{\Delta }}^{(0)}{}_{\nu )\lambda }{}^\lambda \right) \right] \nonumber \\&{\hat{\Upsilon }}_{(1)}{}^\alpha {}_{\mu \nu }=\frac{1}{2}h^{\alpha \lambda }\Big (\nabla ^h_\lambda B_{\nu \mu }+\nabla ^h_\mu B_{\nu \lambda }+\nabla ^h_\nu B_{\lambda \mu }\Big ) \nonumber \\&\quad -\frac{1}{\sqrt{-h}}\Bigg [\frac{1}{2}\left( {\hat{\Delta }}^{(1)}{}^\alpha {}_{\mu \nu }+{\hat{\Delta }}^{(1)}{}_{\mu \nu }{}^\alpha +{\hat{\Delta }}^{(1)}{}_\nu {}^\alpha {}_{\mu }\right) \nonumber \\&\quad -\frac{1}{D-2}\delta ^\alpha {}_{(\mu }{\hat{\Delta }}^{(0)}{}_{\nu )}{}^{\gamma \sigma }B_{\gamma \sigma }\nonumber \\&\quad +\frac{2}{(D-1)(D-2)}\delta ^\alpha {}_{(\mu }B_{\nu )\gamma }{\hat{\Delta }}^{(0)}{}_\sigma {}^{\sigma \gamma }\frac{1}{D-2}\delta ^\alpha {}_{(\mu }{\hat{\Delta }}^{(1)}{}_{\nu )\sigma }{}^{\sigma }\nonumber \\&\quad +\frac{1}{D-2}\delta ^\alpha {}_{[\mu }B_{\nu ]\gamma }{\hat{\Delta }}^{(0)}{}^{\gamma \sigma }{}_\sigma +\frac{2}{D-1}\delta ^\alpha {}_{[\mu }{\hat{\Delta }}^{(1)}{}^{\sigma }{}_{\nu ]\sigma }\nonumber \\&\quad +\frac{1}{2}\left( B_{\mu \sigma }{\hat{\Delta }}^{(0)}{}^{\sigma }{}_{\nu }{}^{\alpha }-B_{\nu \sigma }{\hat{\Delta }}^{(0)}{}^{\sigma \alpha }{}_{\nu }\right) \nonumber \\&\quad +\frac{1}{2(D-2)}h_{\mu \nu }B_{\gamma \sigma }{\hat{\Delta }}^{(0)}{}^{\alpha \gamma \sigma }\nonumber \\&\quad -\frac{1}{D-2}h_{\mu \nu }B^\alpha {}_\sigma {\hat{\Delta }}^{(0)}{}_{\gamma }{}^{\gamma \sigma }-\frac{1}{2(D-2)}h_{\mu \nu }{\hat{\Delta }}^{(1)}{}^{\alpha \sigma 
}{}_{\sigma }\Bigg ], \end{aligned}$$
(76)

where \({\hat{\Delta }}^{(0)}{}_\alpha {}^{(\alpha \beta )}=0\) and \({\hat{\Upsilon }}_{(0)}{}^{\alpha \beta }{}_{\beta }=0\) must be satisfied, as can be shown from the connection field equations and the identity \({\hat{\Upsilon }}_{(n)}{}^\alpha {}_{[\alpha \beta ]}=0\). As we can see, besides the problematic \({\hat{\Upsilon }}\sim \nabla ^h B+{\mathcal {O}}(B^2)\) terms that we obtained in the vanishing hypermomentum case, we now also obtain a number of terms that non-minimally couple the matter fields with themselves and with the 2-form \(B^{\mu \nu }\) through their hypermomentum. It is apparent that the addition of these new terms cannot by itself heal the problematic behaviour of the \(\nabla ^h B\) terms, which clarifies why adding non-minimal couplings to matter fields would not solve the instability problem. Indeed, the extra couplings between the unstable 2-form and the matter fields potentially reduce the time-scale on which the 2-form instability manifests physically through its decay to lighter particles.

6.3 A digression on metric vs affine geodesic equation

After discussing the consequences of ultra-local couplings between the matter fields and the affine connection, let us resume the discussion about the propagation of test particles initiated in Sect. 2. We indicated there how the projective symmetry is related to the re-parameterisation invariance of the autoparallel or affine geodesic Eq. (9), which for affinely parameterised curves can be written as

$$\begin{aligned} \ddot{x}^\alpha +\Gamma ^\alpha {}_{\mu \nu }\dot{x}^\mu \dot{x}^\nu =0. \end{aligned}$$
(77)

This equation describes the straightest paths defined as those whose acceleration along the tangent direction vanishes, while the shortest paths are described by the metric geodesic equation

$$\begin{aligned} \ddot{x}^\alpha +{\bar{\Gamma }}^\alpha {}_{\mu \nu }(g)\dot{x}^\mu \dot{x}^\nu =0. \end{aligned}$$
(78)

Unlike the autoparallel Eq. (77), the metric geodesic equation is oblivious to the general affine structure and only cares about the Levi-Civita part entirely determined by the metric, as it should because the length of curves only depends on the metric. We can parameterise the difference between both equations by performing a post-Riemannian expansion \(\Gamma ^\alpha {}_{\mu \nu }={\bar{\Gamma }}^\alpha {}_{\mu \nu }+\Upsilon ^\alpha {}_{\mu \nu }\) so the autoparallel equation reads

$$\begin{aligned} \ddot{x}^\alpha +{\bar{\Gamma }}^\alpha {}_{\mu \nu }(g)\dot{x}^\mu \dot{x}^\nu =-\Upsilon ^\alpha {}_{\mu \nu }\dot{x}^\mu \dot{x}^\nu . \end{aligned}$$
(79)
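To make the role of the right-hand side of Eq. (79) concrete, the following minimal sketch integrates it numerically on the flat Euclidean plane in Cartesian coordinates, where the Levi-Civita symbols vanish. Two purely illustrative choices of \(\Upsilon ^\alpha {}_{\mu \nu }\) are compared: a projective mode \(\delta ^\alpha {}_\mu \xi _\nu \), whose force is proportional to \(\dot{x}^\alpha \) and therefore only changes the parameterisation of the straight line, and a Weyl-like piece \(-A^\alpha g_{\mu \nu }\), which bends the path. The constant vectors `XI` and `A`, the step size and the initial data are assumptions made only for this example and are not taken from the text.

```python
# Illustrative sketch (not from the paper): autoparallels of Eq. (79) on the
# flat Euclidean plane in Cartesian coordinates, where the Levi-Civita
# symbols vanish and indices are raised and lowered trivially.

def rk4_path(accel, x0, v0, step=0.01, n_steps=100):
    """Integrate x'' = accel(v) with a classical fourth-order Runge-Kutta scheme."""
    def deriv(state):
        _, _, vx, vy = state
        ax, ay = accel((vx, vy))
        return (vx, vy, ax, ay)

    def shift(state, k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    state = (x0[0], x0[1], v0[0], v0[1])
    path = [state]
    for _ in range(n_steps):
        k1 = deriv(state)
        k2 = deriv(shift(state, k1, step / 2))
        k3 = deriv(shift(state, k2, step / 2))
        k4 = deriv(shift(state, k3, step))
        state = tuple(
            s + step / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        path.append(state)
    return path

XI = (0.3, -0.2)  # hypothetical constant covector for the projective mode
A = (0.0, 0.5)    # hypothetical constant vector for the Weyl-like piece

def projective_accel(v):
    # Upsilon^a_{mn} = delta^a_m xi_n  ->  x''^a = -(xi . v) v^a
    s = XI[0] * v[0] + XI[1] * v[1]
    return (-s * v[0], -s * v[1])

def weyl_like_accel(v):
    # Upsilon^a_{mn} = -A^a g_{mn}  ->  x''^a = +A^a (v . v)
    v2 = v[0] ** 2 + v[1] ** 2
    return (A[0] * v2, A[1] * v2)

def max_deviation(path, x0, v0):
    """Largest distance of the path from the straight line through x0 along v0."""
    norm = (v0[0] ** 2 + v0[1] ** 2) ** 0.5
    return max(
        abs((x - x0[0]) * v0[1] - (y - x0[1]) * v0[0]) / norm
        for x, y, _, _ in path
    )

START, VELOCITY = (0.0, 0.0), (1.0, 0.5)
dev_projective = max_deviation(rk4_path(projective_accel, START, VELOCITY), START, VELOCITY)
dev_weyl_like = max_deviation(rk4_path(weyl_like_accel, START, VELOCITY), START, VELOCITY)
print(dev_projective, dev_weyl_like)
```

In this toy setup the projective mode leaves the particle on the metric geodesic (the straight line) up to rounding errors, while the Weyl-like piece produces a clear departure from it, which is the kind of effect that experiments could in principle constrain.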

Only experiments can tell us whether particles follow metric geodesic paths or their trajectories are in turn auto-parallel curves for the full affine connection. In other words, we can only constrain the \(\Upsilon -\)sector by resorting to experiments. However, we can argue which one seems more natural, with all the caveats that this word might induce, from a theoretical perspective. Let us state our conclusion right away: metric geodesic trajectories seem better aligned with our current understanding of physics. Let us elaborate on why we believe this.

Firstly, the most natural action for a test particle in a gravitational field (that may include a general connection) is given by its line element. If the trajectory of the particle is \(x^{\alpha }=x^{\alpha }(\lambda )\) for some affine parameter \(\lambda \), we can expect its action to be

$$\begin{aligned} {\mathcal {S}}_{\mathrm{pp}}=\int g_{\mu \nu }(x)\dot{x}^\mu \dot{x}^\nu {\mathrm{d}}\lambda , \end{aligned}$$
(80)

which leads to the metric geodesic equation and not to the affine autoparallel one. One might object that this notion of naturalness, and hence our expectation, is crucially biased by our prejudices, so some more motivation seems desirable. That (80) is the natural action for the gravitational interaction of the particle can be motivated by the fact that the particle’s motion should be described by its velocity \(\dot{x}^\alpha \) and, in compliance with the equivalence principle, the action should reduce to \(\eta _{\mu \nu }\dot{x}^\mu \dot{x}^\nu \) in a freely falling frame. Furthermore, once we accept that the particle dof’s are described by \(\dot{x}^\alpha \), (80) can be regarded as the lowest order interaction with the metric tensor from an effective theory perspective. There could be other higher order interactions, but they will be suppressed by some appropriate scale. In fact, we do expect higher order corrections of this type. The same reasoning can be applied to determine the coupling to the affine connection. If we stick to the equivalence principle for gravity, then the connection cannot couple directly to the particle. This could be too restrictive, because the equivalence principle is only required as a consistency prescription for the couplings of the massless spin-2 sector of the theory [54, 55], whereas the connection sector could contain additional propagating dof’s that need not comply with it, so there would be no reason to impose it on the couplings to the connection. Thus, if we let the connection couple to the particle, the lowest order interaction is given by

$$\begin{aligned} {\mathcal {S}}_\mathrm{pp-\Gamma }=\int \Upsilon _\mu {\mathrm{d}}x^\mu =\int \Upsilon _\mu \dot{x}^\mu {\mathrm{d}}\lambda , \end{aligned}$$
(81)

where \(\Upsilon _\mu \) is some arbitrary combination of traces of the connection. The correction to the field equations coming from this coupling is of the form

$$\begin{aligned} \frac{\delta {\mathcal {S}}_{\mathrm{pp}}}{\delta x^\alpha }\supset \big (\partial _\alpha \Upsilon _\mu -\partial _\mu \Upsilon _\alpha \big )\dot{x}^\mu , \end{aligned}$$
(82)

which contributes a Lorentz-like force and certainly does not lead to the affine autoparallel equation. Again, we can expect higher order corrections, but they will be suppressed by some suitable scale and will contain higher powers of the particle’s velocity. Thus, obtaining the autoparallel equation for the full connection from an appropriate action is substantially more contrived than obtaining the metric geodesic equation, which in turn appears quite naturally. In fact, Eq. (77) cannot be obtained from a standard variational principle in general. Within the context of teleparallel theories, where the curvature vanishes identically, one can design an appropriate variational principle to obtain the corresponding autoparallel equation, as suggested in [56, 57]. One can always resort to suitable constraints and more or less involved couplings leading to the desired equations (whenever this is possible), but such a procedure seems an artificial way of producing the equations in a somewhat ad-hoc manner. An objection to the argument could be that there is no fundamental principle stating that physical equations should follow from an action. After all, not all field equations can be derived from an action principle. Thus, we could regard Eq. (79) as Lagrange equations of the second kind with a generalised velocity-dependent force given precisely by \(\Upsilon ^\alpha {}_{\mu \nu }\dot{x}^\mu \dot{x}^\nu \), which goes beyond the usual friction forces that are linear in the velocities and derivable from a Rayleigh dissipation function. However, our current understanding of physics at the most fundamental level can be formulated in terms of the path integral, whose primary ingredient is the action (besides an appropriate measure). Let us recall that the standard model of the fundamental interactions including gravity is indeed described by an action, so it is natural, though not mandatory, to expect that physical equations should follow from an action principle and, in particular, so should the motion of particles in a gravito-affine background field.
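As a short worked check of the statements above, one can vary the sum of (80) and (81) directly. This is only a sketch under the assumptions that \(\Upsilon _\mu \) depends on the position alone and that (80) is normalised as written, so the factor \(1/2\) below is tied to that normalisation. With \(L=g_{\mu \nu }\dot{x}^\mu \dot{x}^\nu +\Upsilon _\mu \dot{x}^\mu \), the Euler–Lagrange expression is

$$\begin{aligned} \frac{\partial L}{\partial x^\alpha }-\frac{{\mathrm{d}}}{{\mathrm{d}}\lambda }\frac{\partial L}{\partial \dot{x}^\alpha }&=\partial _\alpha g_{\mu \nu }\dot{x}^\mu \dot{x}^\nu -2\partial _\mu g_{\alpha \nu }\dot{x}^\mu \dot{x}^\nu -2g_{\alpha \nu }\ddot{x}^\nu +\big (\partial _\alpha \Upsilon _\mu -\partial _\mu \Upsilon _\alpha \big )\dot{x}^\mu \nonumber \\&=-2g_{\alpha \nu }\Big (\ddot{x}^\nu +{\bar{\Gamma }}^\nu {}_{\mu \rho }(g)\dot{x}^\mu \dot{x}^\rho \Big )+\big (\partial _\alpha \Upsilon _\mu -\partial _\mu \Upsilon _\alpha \big )\dot{x}^\mu , \end{aligned}$$

so that the equations of motion read

$$\begin{aligned} \ddot{x}^\alpha +{\bar{\Gamma }}^\alpha {}_{\mu \nu }(g)\dot{x}^\mu \dot{x}^\nu =\frac{1}{2}g^{\alpha \beta }\big (\partial _\beta \Upsilon _\mu -\partial _\mu \Upsilon _\beta \big )\dot{x}^\mu . \end{aligned}$$

The force on the right-hand side is linear in the velocity and orthogonal to it, because it is built from an antisymmetric field strength, whereas the right-hand side of Eq. (79) is quadratic in the velocity and involves the full \(\Upsilon ^\alpha {}_{\mu \nu }\). The two structures therefore cannot be matched for a generic connection.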

We will finalise our digression by noticing that a particle is just an idealisation of some more fundamental (classical or quantum) field. Standard bosonic fields like a scalar or spin-1 fields only couple to the metric, so it is difficult to justify the appearance of the connection (other than its Levi-Civita part) in their field equations and, consequently, on the propagation of the associated point-like particles. Furthermore, the propagation of these fields is usually obtained by applying the eikonal or geometric optics approximation to the corresponding hyperbolic equation describing the dynamics of the fields, which in most cases reduces to a wave equation (or a set of them) with the d’Alembertian associated to the metric [23]. In that approximation, the trajectory arises as the curve whose tangent vector is parallel transported with the Levi-Civita connection. On the other hand, we can include couplings to the connection and these will modify the paths of the associated particles in the corresponding approximation, but ensuring that such modifications will lead to the affine autoparallel Eq. (77) will require a certain amount of artificiality. When considering fermions that do couple to the connection, the conclusion is similar. In that case the eikonal approximation will exhibit additional torsional forces, but they will not mimic the effect of the affine autoparallel propagation [58,59,60,61,62].

Our discussion here is relevant concerning the physical importance of geodesically complete spacetimes in metric-affine theories, meaning spacetimes where the solutions of (77) can be extended to the entire manifold. The incompleteness of these curves can be associated with the existence of singularities. It is then crucial to discern the class of trajectories that carry physical information on the propagation of actual particles. In view of our discussion, it is most natural to consider the solutions of Eq. (78) as the relevant ones in order to draw physical consequences, even if we are in a metric-affine framework. If that is the case and our matter sector couples to the connection directly, then the geodesic Eq. (78) ceases to be valid to describe the dynamics of particles because we will need to include the corresponding affine forces. These will not, in general, be encapsulated in an autoparallel equation, and a case-by-case study would be required since, as commented above, universality is no longer a property of the interactions. It is also important to emphasise that the metric determining the trajectories of different particles could depend on the species around non-trivial backgrounds, as is the case for projectively invariant RBG, where gravitational waves follow the geodesics of the auxiliary metric \(q_{\mu \nu }\), while matter fields travel according to \(g_{\mu \nu }\) (see e.g. [5]).

7 Constrained geometries

In the preceding sections we have seen that abandoning the projective symmetry in the higher order curvature sector of a metric-affine theory results in the appearance of ghost-like pathologies precisely related to the projective mode. We will now discuss the different frameworks where metric-affine theories can be rendered stable, not by imposing additional symmetries, but by enforcing suitable constraints on the connection, i.e., by restricting to some specific geometries. In this respect, it is known that broad families of theories admit stable (ghost-free) higher order curvature theories for some particular classes of geometries. In this Section we will review some known examples where the connection is deprived of specific components of the non-metricity and/or torsion. We will finally show the general result that imposing a vanishing torsion reduces general RBG theories to a theory with an extra interacting massive vector field.

7.1 Torsion-free theories

We will start by showing how imposing a vanishing torsion avoids the presence of ghosts. This general result was already shown in [14], but we will reproduce it here for completeness. The implementation of this constraint can be performed either by only allowing for variations of the symmetric part of the connection (i.e., assuming a symmetric connection from the beginning) or by introducing a set of Lagrange multiplier fields that enforce \({\mathcal {T}}^\alpha {}_{\mu \nu }=0\). Either way, the resulting connection equations now read

$$\begin{aligned} \nabla _\lambda \left[ \sqrt{-q}q^{(\mu \nu )}\right] -\nabla _\rho \left[ \sqrt{-q}q^{\rho (\mu }\right] \delta ^{\nu )}_\lambda =0. \end{aligned}$$
(83)

Notice that the only difference with respect to the equations for the unconstrained connection is precisely the trivialisation of their antisymmetric part. Let us decompose \(q^{\mu \nu }\) again as in (42). Due to the vanishing of the torsion tensor, the general decomposition of the connection (1) lacks the contortion tensor. Thus, the connection can here be split into a Levi-Civita connection of \(h^{\mu \nu }\) and a disformation part that depends on the non-metricity \(N_{\lambda \mu \nu }\equiv \nabla ^h_\lambda h_{\mu \nu }\) as

$$\begin{aligned} \Gamma _{\mu \nu }^\alpha ={\bar{\Gamma }}_{\mu \nu }^\alpha (h)+L^\alpha {}_{\mu \nu }(N) \end{aligned}$$
(84)

without loss of generality, where the disformation tensor is now built with the non-metricity of \(h^{\mu \nu }\). The above splitting allows us to obtain the following relations, which we will use below

$$\begin{aligned} \nabla _\lambda \big (\sqrt{-h}h^{\lambda \nu }\big )&=\sqrt{-h}\tilde{L}^\nu , \end{aligned}$$
(85)
$$\begin{aligned} \nabla _\lambda \big (\sqrt{-h}B^{\lambda \nu }\big )&=\sqrt{-h}\nabla ^h_\lambda B^{\lambda \nu }, \end{aligned}$$
(86)

where \(\tilde{L}^\nu \equiv L^\nu {}_{\alpha \beta } h^{\alpha \beta }\) is one of the two independent traces of the disformation tensor. The trace of the connection Eq. (83) together with (85) yields

$$\begin{aligned} \nabla ^h_\lambda B^{\lambda \nu }=\frac{1-D}{1+D}\tilde{L}^\nu , \end{aligned}$$
(87)

which implies the dynamical constraint

$$\begin{aligned} \nabla ^h_\nu \tilde{L}^\nu =0. \end{aligned}$$
(88)
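For the reader's convenience, the two steps leading to (87) and (88) can be sketched as follows. The sketch assumes that the decomposition (42) takes the form \(\sqrt{-q}q^{\mu \nu }=\sqrt{-h}\big (h^{\mu \nu }+B^{\mu \nu }\big )\) with \(h^{\mu \nu }\) symmetric and \(B^{\mu \nu }\) antisymmetric, which is not restated in this Section. Contracting \(\lambda \) with \(\nu \) in (83) and using (85) and (86) gives

$$\begin{aligned} 0&=\nabla _\nu \big (\sqrt{-h}h^{\mu \nu }\big )-\frac{D+1}{2}\nabla _\rho \Big [\sqrt{-h}\big (h^{\rho \mu }+B^{\rho \mu }\big )\Big ] \nonumber \\&=\sqrt{-h}\left[ \frac{1-D}{2}\tilde{L}^\mu -\frac{1+D}{2}\nabla ^h_\rho B^{\rho \mu }\right] , \end{aligned}$$

which is (87). The constraint (88) then follows because the double Levi-Civita divergence of an antisymmetric tensor vanishes identically,

$$\begin{aligned} \nabla ^h_\nu \nabla ^h_\lambda B^{\lambda \nu }=\frac{1}{\sqrt{-h}}\partial _\nu \partial _\lambda \big (\sqrt{-h}B^{\lambda \nu }\big )=0, \end{aligned}$$

so that taking the divergence of (87) yields \(\nabla ^h_\nu \tilde{L}^\nu =0\) for \(D\ne 1\).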

On the other hand, contracting the connection equation (83), with \(h_{\mu \nu }\) defined as the inverse of \(h^{\mu \nu }\), leads to

$$\begin{aligned} L_\mu =\frac{2}{(2-D)(1+D)}\tilde{L}_\mu , \end{aligned}$$
(89)

where \(L_{\mu }\equiv L^\alpha {}_{\mu \alpha }\) and indices are raised and lowered with \(h_{\mu \nu }\). Thus, we see that there is only one independent trace of the disformation tensor. Using the above relations in the connection Eq. (83), we are led to

$$\begin{aligned} 2h^{\alpha (\mu } L^{\nu )}{}_{\lambda \alpha }=L_\lambda h^{\mu \nu }+(2-D) L_\alpha h^{\alpha (\mu }\delta ^{\nu )}{}_\lambda . \end{aligned}$$
(90)

Given that the non-metricity tensor of the auxiliary metric is given by \(N_\lambda {}^{\mu \nu }\equiv -\nabla _\lambda h^{\mu \nu }=-2h^{\alpha (\mu }L^{\nu )}{}_{\lambda \alpha }\), which implies the identity \(L_\mu =-\frac{1}{2}h_{\alpha \beta }N_\mu {}^{\alpha \beta }\equiv -\frac{1}{2} {\tilde{N}}_\mu \), the connection equation (90) can be re-written as a constraint for the non-metricity tensor

$$\begin{aligned} N_\lambda {}^{\mu \nu }=\frac{1}{2}\Big [ {\tilde{N}}_\lambda h^{\mu \nu }+(2-D){\tilde{N}}_\alpha h^{\alpha (\mu }\delta ^{\nu )}_\lambda \Big ], \end{aligned}$$
(91)

which becomes completely specified by its Weyl component (although it is not Weyl-like). Thus we see that the connection field equations can be fully solved explicitly, and the connection is given by a disformation piece given by the non-metricity tensor (91) added to the Levi-Civita connection of \(h^{\mu \nu }\). Given that this disformation piece is completely determined by \({\tilde{N}}_\mu \) (the Weyl trace of the non-metricity of \(h^{\mu \nu }\)), the connection carries only one additional vector component, instead of a vector field plus a 2-form as in the most general case. Moreover, from the transversality constraint (88) obtained above, this new vectorial component must be a Proca field, thus propagating only three extra degrees of freedom. The corresponding metric equations of the system will allow us to solve algebraically for \(h^{\mu \nu }\) as a function of the matter fields and (possibly) the new vector field \({\tilde{N}}_\mu \), which ensures the absence of the pathologies that were found in the most general case. To illustrate this, let us re-consider a particular example that has already been treated in the literature. Assume a metric-affine gravitational Lagrangian of the form

$$\begin{aligned} {\mathcal {L}}={\mathcal {R}}+c_1 {\mathcal {R}}_{[\mu \nu ]}{\mathcal {R}}^{[\mu \nu ]}. \end{aligned}$$
(92)

As explained above, this theory breaks projective symmetry due to the presence of the antisymmetric part of the Ricci tensor in the action. Therefore pathologies should arise in the general case unless further constraints are imposed. However, as shown in past works [63, 64], the torsion-free version of this model reduces to the Einstein-Proca system, where the Proca field arises from the connection sector. For more general examples that violate projective symmetry but where the torsion-free constraint is imposed, the Proca field will in general develop non-trivial interactions, as was already discussed in [65] for the Ricci-based sub-family \(F(g^{\mu \nu }, {\mathcal {R}}^{\mu \nu }{\mathcal {R}}_{\mu \nu })\) with the torsion-free constraint.

To elucidate the mechanism that renders the torsion-free version of generalised RBG theories ghost-free, let us resort to the Einstein frame of the theory, making the torsion-free constraint explicit. The action of the theory can be written as

$$\begin{aligned} {\mathcal {S}}&=\frac{1}{2}\int {\mathrm{d}}^Dx\sqrt{-g}\Big [f(\Sigma ,A)+\frac{\partial f}{\partial \Sigma _{\mu \nu }}\big ({\mathcal {R}}_{(\mu \nu )}-\Sigma _{\mu \nu }\big )\nonumber \\&\quad +\frac{\partial f}{\partial A_{\mu \nu }}\big ({\mathcal {R}}_{[\mu \nu ]}-A_{\mu \nu }\big ) +\frac{1}{\sqrt{-g}}\lambda _\alpha {}^{\mu \nu }{\mathcal {T}}^\alpha {}_{\mu \nu }\Big ], \end{aligned}$$
(93)

where \(\lambda _\alpha {}^{\mu \nu }\) is a Lagrange multiplier that enforces the torsion-free constraint \({\mathcal {T}}^\alpha {}_{\mu \nu }=0\), and \(A_{\mu \nu }\) and \(\Sigma _{\mu \nu }\) are auxiliary fields that are antisymmetric and symmetric respectively. In a manner analogous to Sect. 3.2, we can perform field re-definitions which allow us to algebraically solve for the space-time metric \(g^{\mu \nu }\) in terms of \(h^{\mu \nu }\), \(B^{\mu \nu }\) and the matter fields, thus integrating \(g^{\mu \nu }\) out. We can then write the Einstein frame action for torsion-free generalised RBGs as

$$\begin{aligned} {\mathcal {S}}&=\frac{1}{2}\int {\mathrm{d}}^Dx\Big [\sqrt{-h}h^{\mu \nu }{\mathcal {R}}_{(\mu \nu )} +\sqrt{-h}B^{\mu \nu }{\mathcal {R}}_{[\mu \nu ]} \nonumber \\&\quad +{\mathcal {U}}(h,B,T)+\lambda _\alpha {}^{\mu \nu }{\mathcal {T}}^\alpha {}_{\mu \nu }\Big ]. \end{aligned}$$
(94)

This action gives the same connection equations (83) that we solved above, so we can take that solution (basically the splitting (84) together with Eq. (91)) and plug it back into the action. As can be seen, the solution for the connection satisfies the relations

$$\begin{aligned} {\mathcal {R}}_{[\mu \nu ]}&=-\frac{1}{2}\partial _{[\mu }{\tilde{N}}_{\nu ]},\nonumber \\ {\mathcal {R}}_{(\mu \nu )}&=R_{\mu \nu }(h)+\frac{(D-2)(D-1)}{16}{\tilde{N}}_\mu {\tilde{N}}_\nu -\frac{(D-1)}{4}h_{\mu \nu }\nabla ^h_\alpha {\tilde{N}}^\alpha \end{aligned}$$
(95)

which, after dropping the surface term \(\nabla ^h_\mu {\tilde{N}}^\mu \), allow us to re-express the action (93) in terms of the metric \(h_{\mu \nu }\), the 2-form \(B_{\mu \nu }\) and the vector field \({\tilde{N}}^\mu \) as

$$\begin{aligned} \begin{aligned} {\mathcal {S}}&=\frac{1}{2}\int {\mathrm{d}}^Dx\Big [\sqrt{-h}\Big (R(h)+\frac{(D-2)(D-1)}{16}{\tilde{N}}^2 \\&\quad -\frac{1}{2}B^{\mu \nu }\partial _{[\mu }{\tilde{N}}_{\nu ]}\Big )+{\mathcal {U}}(h,B,T)\Big ], \end{aligned} \end{aligned}$$
(96)

Notice that this form of the action reproduces the constraint on the 2-form (87) as the field equations of the vector field \({\tilde{N}}^\mu \) (which are in some sense the connection equations in the corresponding RBG frame), which read

$$\begin{aligned} \nabla ^h_\mu B^{\mu \nu }=-\frac{(D-2)(D-1)}{4}{\tilde{N}}^\nu , \end{aligned}$$
(97)

and imply the constraint \(\nabla ^h_\alpha {\tilde{N}}^\alpha =0\). At the same time the 2-form field equations yield a non-linear relation among the 2-form, the field-strength of the vector field \({\tilde{N}}^\mu \), and the matter fields given by

$$\begin{aligned} \partial _{[\mu }{\tilde{N}}_{\nu ]}=\frac{2}{\sqrt{-h}}\frac{\partial {{\mathcal {U}}}}{\partial B^{\mu \nu }}. \end{aligned}$$
(98)

This stems from the fact that our final action (96) is nothing but the first-order form of a self-interacting massive vector field coupled to the matter sector. Going back to the particular case \(F={\mathcal {R}}+c_1 {\mathcal {R}}_{[\mu \nu ]}{\mathcal {R}}^{[\mu \nu ]}\), we can reproduce the above results (found previously in [63,64,65]). For this particular example, the metric \(h^{\mu \nu }\) is exactly \(g^{\mu \nu }\), the 2-form is given by \(B^{\mu \nu }=2c_1{\mathcal {R}}^{[\mu \nu ]}\), and the effective potential reads \({\mathcal {U}}=-(\sqrt{-h}/4c_1) B^{\mu \nu }B_{\mu \nu }\). Thus (98) becomes

$$\begin{aligned} \text {d}{\tilde{N}}={\mathcal {F}} \end{aligned}$$
(99)

showing that (96) is a first-order description of a free Proca field \({\tilde{N}}_\mu \) with field-strength given by \({\mathcal {F}}_{\mu \nu }=(2/c_1)B_{\mu \nu }\).
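As a quick consistency check between the two routes followed in this Section, one can verify that the constraint (87), the trace relation (89) and the identity \(L_\mu =-\frac{1}{2}{\tilde{N}}_\mu \) combine into the same coefficient that appears in the vector field equation (97). The sketch below does this with exact rational arithmetic for a few values of \(D\); the function names are introduced only for this illustration, and \(D=2\) is excluded because (89) is singular there.

```python
from fractions import Fraction

def divergence_coefficient(D):
    """Coefficient c in nabla^h_l B^{l n} = c * N~^n, built from (87), (89)
    and the identity L_mu = -N~_mu / 2."""
    div_b_over_ltilde = Fraction(1 - D, 1 + D)      # Eq. (87)
    l_over_ltilde = Fraction(2, (2 - D) * (1 + D))  # Eq. (89)
    l_over_ntilde = Fraction(-1, 2)                 # identity quoted below Eq. (90)
    ltilde_over_ntilde = l_over_ntilde / l_over_ltilde
    return div_b_over_ltilde * ltilde_over_ntilde

def einstein_frame_coefficient(D):
    """Coefficient read off from the vector field equation (97)."""
    return Fraction(-(D - 2) * (D - 1), 4)

dimensions = range(3, 12)
agreement = {
    D: divergence_coefficient(D) == einstein_frame_coefficient(D)
    for D in dimensions
}
print(agreement)
```

For every dimension tested the two coefficients coincide, in line with the statement that the action (96) reproduces the constraint (87).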

7.2 Weyl geometries

Let us now briefly comment on another paradigmatic extension of the Riemannian framework, introduced by Weyl shortly after the inception of GR, which has been analysed widely in the literature (see e.g. the nice survey in [66]). This geometry is characterised by local scale (gauge) invariance and a torsion-free connection, so the only non-trivial part of the affine connection is the so-called Weyl trace of the non-metricity \(A_\alpha =-\frac{2}{D}g^{\mu \nu }Q_{\alpha \mu \nu }\). This allows us to replace the metric compatibility condition \({{\bar{\nabla }}}_\alpha g_{\mu \nu }=0\) by \(D_\alpha g_{\mu \nu }\equiv ({\bar{\nabla }}_\alpha -A_\alpha )g_{\mu \nu }=0\), which is invariant under the scale transformation \(g_{\alpha \beta }\rightarrow e^{2\alpha (x)} g_{\alpha \beta }\), under which \(A_\mu \) transforms as \(A_\mu \rightarrow A_\mu -\partial _\mu \alpha \) as required by invariance of the affine connection.

Theories whose actions are constructed in terms of quadratic curvature invariants for a Weyl connection trivially admit ghost-free realisations and, consequently, restricting the connection to be of the Weyl form avoids the ghostly pathologies of the general RBG theories. This constraint can be implemented either by imposing the connection to be Weyl-like from the beginning or by adding suitable Lagrange multipliers. In this case we should impose a vanishing torsion and also the vanishing of all the irreducible components of the non-metricity except for the Weyl trace. Since Weyl geometries are a sub-class of the torsion-free ones with additional constraints (the non-metricity is forced to be vectorial), and there are no ghostly degrees of freedom in the torsion-free case, there will be no ghosts in Weyl geometries either. General quadratic theories in Weyl geometries have been studied in e.g. [67], where it was shown that some interesting non-trivial interactions for the Weyl vector can be generated.

7.3 Geometries with vector distortion

The affine connection in Weyl geometries is characterised by a vector field that controls the departure from the Levi-Civita connection. A natural generalisation is to include not only this vector part, but a general vector piece of the connection in both the torsion and the non-metricity. Such a general connection was considered in [68] in the absence of torsion and was extended to include the torsion trace in [69, 70]. The connection in these geometries can be parameterised as

$$\begin{aligned}&\Gamma ^\alpha {}_{\mu \nu }={\bar{\Gamma }}^\alpha {}_{\mu \nu }-b_1A^\alpha g_{\mu \nu }+b_2\delta ^\alpha _{(\mu }A_{\nu )}\nonumber \\&\quad +b_3\delta ^\alpha _{[\mu } A_{\nu ]}+b_4\epsilon ^\alpha {}_{\mu \nu \rho } S^\rho . \end{aligned}$$
(100)

This is the minimal field content to describe the desired geometrical setup. It is necessary to have at least two different vector fields with opposite transformation properties under parity in order to account for the axial part of the torsion. The remaining vector pieces, i.e., the two non-metricity traces and the torsion trace, have been identified (up to some proportionality constant) so that this sector is fully described by one single vector field. It would be interesting to study the geometries where the different vector pieces are not identified and the presence of some internal symmetries in that sector (see [71] related to this point). The present framework, however, allows us to substantially simplify the analysis. Within the framework of curvature-based theories, the general quadratic action can be written as

$$\begin{aligned}&{\mathcal {S}}_{\mathrm{VD}} = M_\mathrm{2} \int {\mathrm{d}}^D x \sqrt{-g}\Big [{\mathcal {R}}^2 + {\mathcal {R}}_{\alpha \beta \gamma \delta }\Big ( d_1 {\mathcal {R}}^{\alpha \beta \gamma \delta } \nonumber \\&\quad + d_2 {\mathcal {R}}^{\gamma \delta \alpha \beta } - d_3 {\mathcal {R}}^{\alpha \beta \delta \gamma }\Big ) \nonumber \\&\quad - 4\Big ( c_1 {\mathcal {R}}_{\mu \nu }{\mathcal {R}}^{\mu \nu } + c_2 {\mathcal {R}}_{\mu \nu }{\mathcal {R}}^{\nu \mu } + {\mathcal {P}}_{\mu \nu }\left( c_3 {\mathcal {P}}^{\mu \nu } \right. \nonumber \\&\quad \left. + c_4 {\mathcal {P}}^{\nu \mu } - c_5 {\mathcal {R}}^{\mu \nu } - c_6 {\mathcal {R}}^{\nu \mu }\right) \nonumber \\&\quad + {\mathcal {Q}}_{\mu \nu }(c_7 {\mathcal {Q}}^{\mu \nu } + c_8 {\mathcal {R}}^{\mu \nu }+ c_9{\mathcal {P}}^{\mu \nu })\Big )\, \Big ]\,. \end{aligned}$$
(101)

where \(c_i\) and \(d_i\) are some dimensionless constants and \(M_2\) some scale. This action will generically lead to instabilities, in line with what one would expect from the detailed discussion above. In order to guarantee a ghost-free pure graviton sector, it is convenient to impose that the theory reduces to a Gauss-Bonnet theory in the Riemannian limit, i.e., when \(A_\mu \rightarrow 0\). It is then remarkable that it is sufficient to restrict the geometrical framework rather than the parameters in the action in order to obtain a ghost-free vector-tensor theory [69, 70]. The ghost-free geometries are characterised by \(2b_1-b_2-b_3=0\) and the resulting action reduces to

$$\begin{aligned}&{\mathcal {S}}_{\mathrm{VD}}=\mu \int {\mathrm{d}}^Dx\sqrt{-g}\left[ \Big (R^2-4R_{\mu \nu }R^{\mu \nu }+R_{\mu \nu \rho \sigma }R^{\mu \nu \rho \sigma }\Big )\right. \nonumber \\&\quad \left. -\frac{\alpha }{4} F_{\mu \nu } F^{\mu \nu }+\xi A^2\nabla \cdot A -\lambda A^4-\beta G^{\mu \nu } A_\mu A_\nu \right] \end{aligned}$$
(102)

where \(\alpha \), \(\xi \), \(\lambda \) and \(\beta \) are some constants that are given in terms of the parameters in (101) and \(F_{\mu \nu }=\partial _\mu A_\nu -\partial _\nu A_\mu \). The noteworthy property of this action is that the vector field features derivative non-gauge-invariant interactions and a non-minimal coupling, which nevertheless belong precisely to the class of ghost-free interactions. Thus, the ghostly pathologies of the general case are avoided in the vector distorted geometries by two conditions, namely: i) imposing the recovery of the safe Gauss-Bonnet quadratic gravity in the absence of distortion and ii) restricting the class of geometries. The singular property of the selected ghost-free geometries is that they generalise the Weyl connection by including a trace torsion piece while maintaining the Weyl invariance of the metric (in)compatibility condition. This can be easily understood by noticing that the non-metricity for this restricted class of geometries is \(Q_{\mu \alpha \beta }=(b_3-b_2) A_\mu g_{\alpha \beta }\), which is of the Weyl type. However, the torsion is non-vanishing and given by \({\mathcal {T}}^\alpha {}_{\mu \nu }=2b_3\delta ^\alpha _{[\mu }A_{\nu ]}\). We refer to [69, 70] for a detailed discussion of the interesting geometrical properties of these geometries, and here we will content ourselves with simply signalling how ghost-free theories can be obtained.
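The Weyl form of the non-metricity quoted above can be checked with a short computation. The following sketch rests on assumptions that are not spelled out in this Section: the free lower indices of the distortion in (100) are read as \(\mu \nu \), the covariant derivative acts as \(\nabla _\lambda g_{\mu \nu }=\partial _\lambda g_{\mu \nu }-\Gamma ^\rho {}_{\lambda \mu }g_{\rho \nu }-\Gamma ^\rho {}_{\lambda \nu }g_{\mu \rho }\), and \(Q_{\lambda \mu \nu }\equiv \nabla _\lambda g_{\mu \nu }\). Writing \(\Upsilon _{\alpha \mu \nu }\) for the distortion with its first index lowered, the axial piece drops out by antisymmetry and one finds

$$\begin{aligned} Q_{\lambda \mu \nu }&=-\Upsilon _{\nu \lambda \mu }-\Upsilon _{\mu \lambda \nu } \nonumber \\&=(b_3-b_2)A_\lambda g_{\mu \nu }+\frac{1}{2}\big (2b_1-b_2-b_3\big )\big (A_\mu g_{\nu \lambda }+A_\nu g_{\mu \lambda }\big ). \end{aligned}$$

Under these assumptions, the condition \(2b_1-b_2-b_3=0\) removes precisely the part of the non-metricity that is not of the Weyl type, leaving \(Q_{\lambda \mu \nu }=(b_3-b_2)A_\lambda g_{\mu \nu }\).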

7.4 Riemann–Cartan geometries

Let us now consider the case of one of the first extensions of GR, namely the extension of the Riemannian framework to the so-called Riemann–Cartan geometry, where the connection is allowed to have a torsion component while keeping a trivial non-metricity. This can be achieved by introducing a suitable Lagrange multiplier in the action (21):

$$\begin{aligned}&{\mathcal {S}}[g_{\mu \nu },\Gamma ,\lambda ]=\frac{1}{2}\int {\mathrm{d}}^Dx \sqrt{-g}\,\left[ F\big (g^{\mu \nu },{\mathcal {R}}_{\mu \nu }(\Gamma )\big ) \right. \nonumber \\&\quad \left. +\lambda ^{\alpha }{}_{\mu \nu }\nabla _\alpha g^{\mu \nu }\right] +{\mathcal {S}}_{\mathrm{m}}[g_{\mu \nu },\Psi ]. \end{aligned}$$
(103)

While the torsion-free constraint heals the instabilities of generalised RBGs, this is not the case for a constraint imposing the vanishing of the non-metricity tensor. Given that the full analysis is rather cumbersome in this case, we will simply highlight the main differences between the vanishing non-metricity and vanishing torsion constraints, emphasising which features improve the pathological behaviour of generalised RBGs in their torsion-free versions but are absent when the non-metricity-free constraint is imposed. First of all, notice that varying the above action with respect to \(\lambda ^\alpha {}_{\mu \nu }\) one gets the constraint \(\nabla _\alpha g^{\mu \nu }=-Q_\alpha {}^{\mu \nu }=0\). Now an infinitesimal variation of the above action with respect to the connection yields

$$\begin{aligned}&\delta _\Gamma {\mathcal {S}}=\frac{1}{2}\int {\mathrm{d}}^Dx \sqrt{-g}\frac{\partial F}{\partial {\mathcal {R}}_{\mu \nu }}\delta _\Gamma {\mathcal {R}}_{\mu \nu }\nonumber \\&\quad =-\frac{1}{2}\int {\mathrm{d}}^Dx \sqrt{-q}q^{\mu \nu }\left( \nabla _\alpha \delta \Gamma ^\alpha {}_{\nu \mu }-\nabla _\nu \delta \Gamma ^\alpha {}_{\alpha \mu }-{\mathcal {T}}^\lambda {}_{\nu \alpha }\delta \Gamma ^\alpha {}_{\lambda \mu }\right) \nonumber \\ \end{aligned}$$
(104)

where the conditions \(Q^\alpha {}_{\mu \nu }=0\) and \(\delta _\Gamma Q_{\alpha }{}^{\mu \nu }=0\rightarrow \delta _\Gamma L^{\alpha }{}_{\mu \nu }=0\) are imposed by the Lagrange multiplier field equation after integrating it out. The root of the difference between the two cases is the term in the variation of the Ricci tensor (13). In the above variation of the action, that term vanishes in the torsion-free case (after integrating out the vanishing torsion field), while this does not occur in the non-metricity case. As a consequence, the connection field equations for the vanishing non-metricity case are

$$\begin{aligned}&\nabla _\lambda \left[ \sqrt{-q}q^{\nu \mu }\right] -\delta ^\mu {}_\lambda \nabla _\rho \left[ \sqrt{-q}q^{\nu \rho }\right] =\sqrt{-q}\left[ {\mathcal {T}}^\mu {}_{\lambda \alpha } q^{\nu \alpha } \right. \nonumber \\&\quad \left. +{\mathcal {T}}^\alpha {}_{\alpha \lambda } q^{\nu \mu }-\delta ^\mu {}_\lambda {\mathcal {T}}^\alpha {}_{\alpha \beta } q^{\nu \beta }\right] , \end{aligned}$$
(105)

thus having the same tensorial structure as the ones in the general case (23), which does not happen in the torsion-free case (83). This difference will have consequences for the number of degrees of freedom propagated in the different cases, as well as for their stability properties. To make this clearer, let us first decompose the non-metricity free connection as \(\Gamma ^\alpha {}_{\mu \nu }={{\bar{\Gamma }}}^\alpha {}_{\mu \nu }(q)+L^\alpha {}_{\mu \nu }+K^\alpha {}_{\mu \nu }\). Notice that although the covariant derivative \(\nabla _\alpha g_{\mu \nu }\) vanishes, this is not true for \(\nabla _\alpha h_{\mu \nu }\), and thus the distortion tensor corresponding to \(h_{\mu \nu }\) in the connection decomposition is non-vanishing. We thus see that the non-metricity free condition does not have an implementation as nice as the torsion-free condition, and the structure of the equations is identical to the general case, having also the constraint

$$\begin{aligned} \nabla ^h_\lambda B^{\lambda \mu }=0. \end{aligned}$$
(106)

In the torsion-free case, we found instead that the divergence (with respect to \(h^{\mu \nu }\)) of the 2-form was proportional to one of the traces of the distortion tensor \(\tilde{L}_\mu \). Thus, in both the torsion-free and the non-metricity-free cases the divergence of the 2-form can be eliminated from the field equations. Another important point is that the absence of \(K^\alpha {}_{\mu \nu }\) in the torsion-free case and the index symmetries of \(B^{\mu \nu }\) and \(L^\alpha {}_{\mu \nu }\) yield the relations (85) and (86). While (85) still holds in this case, the analogue of (86) is

$$\begin{aligned} \nabla _\lambda (\sqrt{-h}B^{\lambda \nu })=\sqrt{-h}\nabla ^h_\lambda B^{\lambda \nu }+\sqrt{-h}\left( t_\alpha B^{\alpha \nu }+\frac{1}{2}{\mathcal {T}}^{\nu }{}_{\alpha \beta }B^{\alpha \beta }\right) , \end{aligned}$$
(107)

where \(t_\alpha \equiv {\mathcal {T}}^\beta {}_{\beta \alpha }\) and the first term on the right hand side vanishes due to (106). Thus, while in the torsion-free case these relations together with the divergence of the two-form (87) allow one to write \(\nabla _\alpha (\sqrt{-h}h^{\alpha \mu })\) and \(\nabla _\alpha (\sqrt{-h}B^{\alpha \mu })\) in terms of the vector field \(\tilde{L}_\mu \), this is not the case in the non-metricity-free scenario.

Having highlighted these differences, which rely only on the decomposition of the connection available in each case, we can now understand why the difference in the tensorial structure of the connection field equations plays a crucial role in the stability properties. After the decomposition of \(q^{\mu \nu }\) into its symmetric and antisymmetric parts, due to the symmetrization of \(\mu \) and \(\nu \) in the torsion-free case only the contraction \(\nabla _\alpha B^{\alpha \mu }\) enters the connection field equations. As explained above, this can be substituted by \(\tilde{L}_\mu \) in the torsion-free case and, together with the relations (85) and (86), it allows one to find a relation between both traces of the distortion tensor. Then, since \(\nabla _\alpha B^{\mu \nu }\) does not appear in the equations, and \(\nabla _\alpha h^{\mu \nu }\) can be written only in terms of \(L^\alpha {}_{\mu \nu }\), the connection equation (83) allows one to solve for the full connection as the Levi-Civita connection of the auxiliary metric plus a distortion part characterized only by the vector field \(\tilde{L}_\mu \). In contrast, since in the vanishing non-metricity case (as in the general one) the symmetrization of \(\mu \) and \(\nu \) does not occur in the connection field equations, not only its trace but also the full covariant derivative of \(B^{\mu \nu }\) enters the connection field equations. This makes \(B_{\mu \nu }\) a propagating field and makes it impossible to solve the connection only in terms of a new vector field. Indeed it can be seen that the torsion tensor has the schematic form \(\nabla B/(1+B)\), as happened with \({\hat{\Upsilon }}\) in Sect. 5, thus potentially introducing Ostrogradski instabilities propagated by the 2-form. Therefore the Einstein frame version of this theory would be formally identical to the one of the general case, since the distortion of \(h^{\mu \nu }\) is not vanishing here. Thus we see that in general the constraint of vanishing non-metricity will not heal the instabilities of the previous theory, as the extra 5 degrees of freedom corresponding to the projective mode and the 2-form will in general also propagate, although there could be fine-tuned Lagrangians in which this does not occur.

We will end this Section by noting that the Poincaré gauge theories [72] are formulated in a Riemann-Cartan geometry. It is known that the general quadratic theories of this class present pathologies and only very specific choices of parameters give rise to healthy theories (see e.g. [51,52,53, 73,74,75,76]). As repeated several times, it is possible to have phenomenologically viable theories by interpreting them as effective field theories, as done in [47, 77].

8 Hybrid theories

So far we have considered RBG in the pure metric-affine formalism so that only the curvature of the full connection enters the action. As we explained in Sect. 2, every spacetime endowed with a metric tensor admits a distinguished connection given by the Christoffel symbols of the metric. Thus, for any spacetime with a general connection there is a coexistent affine structure provided by the Levi-Civita connection. The hybrid formalism [78, 79] steps outside the purely metric-affine framework and embraces these two coexisting affine structures so that the action contains the curvatures of the two connections. As we will see, rather than improving the situation of the pure metric-affine formalism, delving into the hybrid framework generically introduces even more pathologies. This may not be too surprising, since the hybrid formalism inherits from the outset the independent pathologies of the metric and metric-affine formalisms, so it is natural to expect at the very least those same pathologies. The existence of pathologies in the hybrid formalism was analysed in [80] by looking at the propagator on flat spacetime and identifying the presence of ghosts for a class of hybrid theories whose action is an arbitrary function of the two Ricci scalars R(g) and \({\mathcal {R}}(\Gamma )\) and the hybrid Ricci square term \(R_{\mu \nu }(g){\mathcal {R}}^{\mu \nu }(\Gamma )\).

In order to pinpoint the sources of pathologies for the hybrid theories, we will consider the following hybrid action

$$\begin{aligned} {\mathcal {S}}_{\mathrm{hybrid}}=\int {\mathrm{d}}^Dx\sqrt{-g}f({\mathcal {R}}_{\mu \nu },R_{\mu \nu }). \end{aligned}$$
(108)

We will then proceed analogously to the pure metric-affine formalism to write the action as

$$\begin{aligned} {\mathcal {S}}_{\mathrm{hybrid}}=\int {\mathrm{d}}^Dx\Big [\sqrt{-q} q^{\mu \nu }{\mathcal {R}}_{\mu \nu }(\Gamma )+{\mathcal {U}}(R_{\mu \nu },q,g)\Big ] \end{aligned}$$
(109)

where we have defined

$$\begin{aligned} {\mathcal {U}}\equiv \sqrt{-g}\left[ f-\frac{\partial f}{\partial \Sigma _{\mu \nu }} \Sigma _{\mu \nu }\right] ,\quad \text {and}\quad \sqrt{-q}q^{\mu \nu }\equiv \sqrt{-g}\frac{\partial f}{\partial \Sigma _{\mu \nu }}, \end{aligned}$$
(110)

and here f is understood as a function of \(\Sigma _{\mu \nu }\) and \(R_{\mu \nu }\). The general hybrid action written in the form (109) is sufficient to understand the multiple sources of instabilities. Since we have linearised in the Ricci tensor of the connection, that sector alone already reproduces the pathologies associated with the projective mode and the additional 2-form field that we have extensively discussed in the preceding sections. Furthermore, even if we impose a projective symmetry in an attempt to avoid those pathologies, we can then straightforwardly integrate out the connection and obtain the equivalent action

$$\begin{aligned} {\mathcal {S}}_{\mathrm{hybrid}}=\int {\mathrm{d}}^Dx\Big [\sqrt{-q} q^{\mu \nu }{\mathcal {R}}_{(\mu \nu )}(q)+{\mathcal {U}}(R_{\mu \nu },q,g)\Big ],\nonumber \\ \end{aligned}$$
(111)

so we have an Einstein–Hilbert term to describe the dynamics of the (now symmetric) field \(q_{\mu \nu }\). That pure metric-affine sector is then fine. However, the hybrid couplings introduce two further sources of pathologies. On the one hand, if we have an arbitrary dependence on the metric Ricci tensor, the theory will be prone to the usual Ostrogradski instabilities in the metric sector. On the other hand, even if we avoid those problems by utilising only the Ricci scalar of the metric, for instance, which is known to represent a safe higher order curvature term in the metric formalism, the potential \({\mathcal {U}}\) will introduce arbitrary interactions between \(q_{\mu \nu }\) and \(g_{\mu \nu }\), so we will have an interacting bi-metric theory that will again introduce ghostly modes unless much care is taken in the construction of the interactions. We can understand this a bit better by considering a simplified theory where the metric and metric-affine sectors are split as

$$\begin{aligned} {\mathcal {S}}_{\mathrm{hybrid}}=\int {\mathrm{d}}^Dx\sqrt{-g}\left[ \frac{1}{2} R(g)+{\mathcal {F}}({\mathcal {R}}_{(\mu \nu )})\right] \end{aligned}$$
(112)

where we have separated the pure metric sector described by the Einstein–Hilbert action and the metric-affine sector on which we have imposed a projective symmetry. Each of these sectors by itself would seem perfectly fine. However, they can talk to each other through the \(\sqrt{-g}\) factor in the volume element and this will be the source of the problems. In view of our results above and neglecting matter fields for simplicity, we can expect to have two Einstein–Hilbert terms once we integrate the connection out. This is in fact the case, but we also generate a potential so the action reads

$$\begin{aligned} {\mathcal {S}}_{\mathrm{hybrid}}=\int {\mathrm{d}}^Dx\left[ \frac{\sqrt{-q}}{2}q^{\mu \nu }{\mathcal {R}}_{\mu \nu }(\Gamma )+\frac{\sqrt{-g}}{2} R(g)+{\mathcal {U}}(q,g)\right] , \nonumber \\ \end{aligned}$$
(113)

where the dependence of the general potential term in (109) on the metric curvature has been separated out as the R(g) term in the above action. The resulting action is then a bi-metric theory where the two metrics interact through the potential \({\mathcal {U}}\), and it will generically suffer from a Boulware-Deser ghost [81]. Since this potential is determined by the function f, only functions that generate the known ghost-free potentials [82, 83] have a chance to be stable. It is clear that resorting to a hybrid action not only fails to cure the instabilities found in RBG theories, but makes things even worse by introducing new sources of ghosts. A way around this general no-go result for stable hybrid theories is provided by theories where the bi-metric construction fails. This happens for theories where only the Ricci scalars are allowed, i.e., theories described by the action

$$\begin{aligned} {\mathcal {S}}_{\mathrm{hybrid}}=\int {\mathrm{d}}^Dx\sqrt{-g}f({\mathcal {R}},R). \end{aligned}$$
(114)

We can proceed analogously by performing the corresponding Legendre transformations to linearise in R and \({\mathcal {R}}\), but now we only need to introduce two auxiliary scalar fields instead of the tensor \(\Sigma _{\mu \nu }\) so we can rewrite the action as

$$\begin{aligned} {\mathcal {S}}_{\mathrm{hybrid}}= & {} \int {\mathrm{d}}^Dx\Big [\sqrt{-g}\,\chi \, g^{\mu \nu }{\mathcal {R}}_{\mu \nu }(\Gamma )+\sqrt{-g}\varphi g^{\mu \nu } R_{\mu \nu }\nonumber \\&\quad +{\mathcal {U}}(\varphi ,\chi )\Big ]. \end{aligned}$$
(115)
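As a hedged illustration of the Legendre transformation behind (115), one can introduce two auxiliary scalars, here called \(\alpha \) and \(\beta \) (labels assumed only for this sketch), and assume that the Hessian of f with respect to them is non-degenerate. With \({\mathcal {R}}=g^{\mu \nu }{\mathcal {R}}_{\mu \nu }(\Gamma )\) and \(R=g^{\mu \nu }R_{\mu \nu }\), the Lagrangian of (114) can be traded for

$$\begin{aligned} \sqrt{-g}\left[ f(\alpha ,\beta )+\frac{\partial f}{\partial \alpha }\big ({\mathcal {R}}-\alpha \big )+\frac{\partial f}{\partial \beta }\big (R-\beta \big )\right] . \end{aligned}$$

Varying with respect to \(\alpha \) and \(\beta \) gives

$$\begin{aligned} \frac{\partial ^2 f}{\partial \alpha ^2}\big ({\mathcal {R}}-\alpha \big )+\frac{\partial ^2 f}{\partial \alpha \partial \beta }\big (R-\beta \big )=0,\qquad \frac{\partial ^2 f}{\partial \alpha \partial \beta }\big ({\mathcal {R}}-\alpha \big )+\frac{\partial ^2 f}{\partial \beta ^2}\big (R-\beta \big )=0, \end{aligned}$$

which, under the assumed non-degeneracy, imposes \(\alpha ={\mathcal {R}}\) and \(\beta =R\) and so recovers (114). Identifying \(\chi \equiv \partial f/\partial \alpha \), \(\varphi \equiv \partial f/\partial \beta \) and \({\mathcal {U}}(\varphi ,\chi )\equiv \sqrt{-g}\,[f-\chi \alpha -\varphi \beta ]\), with \(\alpha \) and \(\beta \) expressed through \(\varphi \) and \(\chi \), then reproduces the structure of (115).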

From this action we see that now the connection is nothing but the Levi-Civita connection of a metric that is conformally related to \(g_{\mu \nu }\). In other words, the definition of \(q^{\mu \nu }\) in (110) yields \(q_{\mu \nu }={\tilde{\chi }}g_{\mu \nu }\), with \({\tilde{\chi }}=\chi ^{\frac{2}{D-2}}\), so we only introduce an extra scalar instead of the full symmetric \(q_{\mu \nu }\). The action then takes the form

$$\begin{aligned} {\mathcal {S}}_{\mathrm{hybrid}}= & {} \int {\mathrm{d}}^Dx\sqrt{-g}\Big [(\varphi +\chi )R\nonumber \\&+2(1{-}D)\chi \Big (\Box \log {\tilde{\chi }}+\frac{D^2-4D-4}{2}(\partial \log {\tilde{\chi }})^2\Big )\nonumber \\&+{\mathcal {U}}(\varphi ,\chi )\Big ]. \end{aligned}$$
(116)
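The conformal factor \({\tilde{\chi }}=\chi ^{\frac{2}{D-2}}\) used to arrive at (116) can be checked with a short computation. This is a sketch that assumes the relevant definition reduces to \(\sqrt{-q}q^{\mu \nu }=\sqrt{-g}\,\chi \,g^{\mu \nu }\), as suggested by comparing (110) with (115). Inserting the ansatz \(q_{\mu \nu }={\tilde{\chi }}g_{\mu \nu }\) in D dimensions gives

$$\begin{aligned} \sqrt{-q}={\tilde{\chi }}^{D/2}\sqrt{-g},\qquad q^{\mu \nu }={\tilde{\chi }}^{-1}g^{\mu \nu }\quad \Rightarrow \quad \sqrt{-q}q^{\mu \nu }={\tilde{\chi }}^{\frac{D-2}{2}}\sqrt{-g}\,g^{\mu \nu }, \end{aligned}$$

so that matching with \(\sqrt{-g}\,\chi \,g^{\mu \nu }\) requires \({\tilde{\chi }}^{\frac{D-2}{2}}=\chi \), i.e. \({\tilde{\chi }}=\chi ^{\frac{2}{D-2}}\) for \(D\ne 2\).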

It is then apparent that these theories propagate two additional scalars and avoid the Boulware-Deser ghosts of the general case. It was found in [80], however, that even these theories seem to present some tension between the absence of tachyons and the absence of ghosts around a flat Minkowski background, so some kind of instability appears unavoidable.

9 Conclusions

In this work we have addressed the unstable nature of general gravitational theories in a metric-affine formalism constructed with arbitrary curvature invariants. In particular, we showed in Sect. 5 the crucial role played by the projective symmetry in RBGs and how its breaking generically leads to pathologies. Remarkably, we could establish an interesting relation between RBGs without projective symmetry and non-symmetric gravity that allows us to relate the instabilities found in both families of theories and, furthermore, gives a novel interpretation of the pathologies in non-symmetric gravity. We have traced the origin of the pathologies to the absence of a proper kinetic term for the projective mode as well as the presence of non-minimal couplings of the 2-form field associated with the antisymmetric part of the metric. This is shown in two independent ways: in Sect. 5.1 by taking the decoupling limit after applying the Stückelberg trick to restore the gauge symmetry for the 2-form, and in Sect. 5.2 by explicitly writing the action in a pure post-Riemannian form with two extra dynamical fields corresponding to the 2-form and the projective mode. Moreover, in Sect. 5.3 we also present a formal solution that can be used to perturbatively solve for the connection, which makes explicit that the 2-form features unstable couplings as was the case for NGT [33]. The possibility that some couplings between matter and the connection could cure these instabilities is also explored in Sect. 6. First we sketch the construction of the non-symmetric gravity frame for generalized RBGs in the presence of couplings between matter and connection in Sect. 6.1, arguing that the addition of such couplings will generally be oblivious to the ghosts (if it does not worsen the instabilities). We also elaborate in Sect. 6.3 on the distinction between metric geodesics and autoparallel curves and discuss why metric geodesics seem more natural trajectories for freely falling particles than affine ones.

Motivated by the results of [14] showing that constraining torsion to vanish can render generalized RBGs stable even with broken projective symmetry, we have extended the discussion to more general geometrical constraints in Sect. 7, surveying the cases of torsionless spacetimes, Weyl geometries, geometries with vector distortion and Riemann-Cartan geometries. In all cases, although pathologies can still be present, we have explicitly shown particular cases where the theories are ghost-free. Finally, we have shown how theories formulated in the hybrid formalism are also generically prone to ghost-like pathologies (in particular they contain a Boulware-Deser ghost), even in a more harmful manner.

In Sect. 5.4 we have argued why these theories are generically plagued by ghost-like instabilities and the results presented in this work should make it clear. Thus, although there will certainly be healthy metric-affine theories with higher order curvature invariants, it is crucial to guarantee the avoidance of pathologies before reliably using them for any physical application in e.g. cosmology or black hole physics.