1 Introduction

General relativity (GR) is an extremely successful theory of gravity, which agrees with all observations performed so far. Recent tests of GR include the discovery of gravitational waves, whose production is consistent with coalescing black holes [1], and the images of the black holes at the centers of M87 and of our galaxy produced by the Event Horizon Telescope [2,3,4,5,6].

Of course, GR has to be complemented by some matter fields. At least a set of spin-1, spin-1/2 and spin-0 fields are needed to describe all we know about non-gravitational physics, the Standard Model of particles (SM) and its extensions that can account for the evidence of beyond-the-SM physics (neutrino masses and mixings, dark matter, baryon asymmetry, etc.).

Moreover, a UV completion is also necessary because GR is known to be nonrenormalizable by perturbative methods [7, 8] and to be within the regime of validity of perturbation theory at energies much below the Planck scale. However, at those low energies we can construct a consistent theory by adding all possible operators along the lines of effective field theories [9] (see also Refs. [10, 11] for reviews). The lower the dimensionality of a given operator, the more relevant that operator is expected to be at low energies.

The main principle behind these constructions, including GR itself, is general covariance (or the general relativity principle), which essentially states that all laws of physics should be invariant under a general coordinate transformation. This also implies that the field equations can be written in a covariant form and renders the presence of tensors, such as the metric, and a connection necessary. This geometrization of physics is commonly regarded as one of the greatest achievements of Einstein’s theory. In GR and its effective field theory (EFT) extensions, including ordinary matter fields (spin-1, spin-1/2 and spin-0 fields), the connection is typically assumed to be the Levi-Civita one, a functional of the metric. Theories of this sort are thus called metric theories. However, from the geometrical point of view the metric and the connection can be completely independent objects.

Therefore, a natural modification of gravity can be obtained by promoting the connection to an independent degree of freedom, while preserving general covariance. The resulting theories are called metric-affine (see Ref. [12] for a recent discussion on this subject and references to other original articles and Ref. [13] for a classic review). In general, the difference between an arbitrary connection and the Levi-Civita connection is a tensor, known as the distorsion. The distorsion coincides with the contorsion when the theory is metric compatible, i.e. when the covariant derivative of the metric vanishes, which is required by the presence of fermions. The contorsion in turn is a tensor that can be expressed in terms of the torsion and that vanishes if and only if the torsion does. The metric-compatible theories are also known as Poincaré gauge theories because they can always be formulated as theories with a local Poincaré symmetry (see [14] for a recent review with many references to original works).

One of the purposes of the present paper is to identify the general form (Footnote 1) of the action of a metric-affine EFT that is equivalent to a metric EFT in the sense that it does not feature an independent dynamical distorsion: i.e. the distorsion can be exactly integrated out and expressed in terms of the metric and the matter fields that are not of gravitational origin (that do not come from the metric and/or the distorsion). Indeed, even in a metric EFT additional gravitational degrees of freedom besides the massless spin-2 graviton can emerge from the metric because higher powers of the curvature tensors (that can involve higher derivatives) are generically present.

The motivation for finding the general action described in the previous paragraph is that it helps us to tell whether a given metric-affine theory features an independent dynamical distorsion without performing a direct calculation of the dynamical degrees of freedom. Also, with this result in hand, one could automatically identify the complementary set of metric-affine EFTs that can potentially feature an independent dynamical distorsion. This set of theories is particularly interesting as the new distorsion fields can have interesting phenomenological consequences.

Another purpose of this paper is to discuss the validity of the equivalence principle in these EFTs. The equivalence principle is often presented as the starting point in formulating GR. However, in a metric EFT this principle is generically broken by the higher-dimensional operators. Given that GR plus minimally-coupled matter fields anyhow describe the low-energy limit of metric EFTs, the equivalence principle is always recovered at low energies in metric theories. It is then natural to ask whether the same is true in general metric-affine EFTs: is the equivalence principle always an emergent low-energy property in an arbitrary theory?

Let us now give an outline of the paper (a detailed summary of the results will be given in the concluding section). In Sect. 2 we will present the key ingredients that are needed to construct metric-affine EFTs. We will not limit ourselves to the gravitational sector, but we will also include a general matter content, namely an arbitrary number of scalars (or pseudoscalars), gauge fields and fermions. The general action of theories with non-dynamical distorsion will then be the topic of Sect. 3. After that, in Sect. 4, we will discuss theories with dynamical distorsion, studying in detail some explicit examples. The possible breaking of the equivalence principle and its possible emergence at low energies in metric-affine theories will then be investigated in Sect. 5. Finally, in the concluding Sect. 6 we offer a detailed summary of the new results of the paper with some further discussions.

2 Ingredients

In this section we provide the main ingredients that are needed to construct gravitational theories coupled to a generic matter sector. Most of the material in this section is a review of well-known results, but it is all needed to understand the subsequent sections. We also take the opportunity to fix our notation.

To describe gravity we start from the general relativity principle, which states that all laws of physics should be invariant under general coordinate transformations. To implement this principle we introduce a metric \(g_{\mu \nu }\) and a connection \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\) as independent fields. We are therefore in the framework of metric-affine theories.

The metric would be needed even if gravity were absent: indeed, writing the flat metric (Footnote 2) \(ds^2=\eta _{ab}d\xi ^ad\xi ^b\) in general coordinates \(x^\mu \), the metric \(g_{\mu \nu }\) appears: \(ds^2=g_{\mu \nu }dx^\mu dx^\nu \). The general transformation rule of the metric (obtained by requiring \(ds^2\) to be invariant) is

$$\begin{aligned} g'_{\alpha \beta }(x') = \frac{\partial x^\mu }{\partial x^{'\alpha }}\frac{\partial x^\nu }{\partial x^{'\beta }} g_{\mu \nu }(x) \end{aligned}$$
(2.1)

and, generically, in the presence of gravity it is not possible to recover the flat metric with a general coordinate transformation.

The connection, on the other hand, is needed in curved space to introduce covariant derivatives of tensors, which are essential to write the field equations (which contain derivatives) in a covariant form: the covariant derivatives of a generic tensor \(T_{\mu _1\ldots \mu _n}^{\nu _1\ldots \nu _m}\) with n covariant indices and m contravariant (Footnote 3) indices are

$$\begin{aligned}&{{\mathcal {D}}}_\mu T_{\mu _1\ldots \mu _n}^{\nu _1\ldots \nu _m} =\partial _\mu T_{\mu _1\ldots \mu _n}^{\nu _1\ldots \nu _m} +{{\mathcal {A}}}_{\mu ~\beta _1}^{~\,\nu _1}T_{\mu _1\ldots \mu _n}^{\beta _1\ldots \nu _m} +\cdots \nonumber \\&\quad +{{\mathcal {A}}}_{\mu ~\beta _m}^{~\,\nu _m}T_{\mu _1\ldots \mu _n}^{\nu _1\ldots \beta _m} -{{\mathcal {A}}}_{\mu ~\mu _1}^{~\,\alpha _1}T_{\alpha _1\ldots \mu _n}^{\nu _1\ldots \nu _m} -\cdots -{{\mathcal {A}}}_{\mu ~\mu _n}^{~\,\alpha _n}T_{\mu _1\ldots \alpha _n}^{\nu _1\ldots \nu _m}.\nonumber \\ \end{aligned}$$
(2.2)

This calligraphic covariant derivative \({{\mathcal {D}}}\) is generically different from the covariant derivative, which we denote D, computed with the Levi-Civita (LC) connection

$$\begin{aligned} \Gamma _{\mu ~\sigma }^{~\,\rho } = \frac{1}{2} g^{\rho \tau }\left( \partial _\mu g_{\tau \sigma }+\partial _\sigma g_{\tau \mu }-\partial _\tau g_{\mu \sigma }\right) . \end{aligned}$$
(2.3)

In order for the quantity in (2.2) to be a tensor with m contravariant indices and \(n+1\) covariant indices, \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\) should transform under general coordinate transformations precisely as \(\Gamma _{\mu ~\sigma }^{~\,\rho }\). So

$$\begin{aligned} C_{\mu ~\sigma }^{~\,\rho } \equiv {{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }-\Gamma _{\mu ~\sigma }^{~\,\rho }, \end{aligned}$$
(2.4)

which we call the distorsion, transforms as a tensor. Theories where \(C_{\mu ~\sigma }^{~\,\rho }=0\) are called metric theories as the connection can be computed once the metric is known in that case. The torsion \(T_{\mu \nu \rho }\) is defined in terms of the distorsion by

$$\begin{aligned} T_{\mu \nu \rho } \equiv C_{\mu \nu \rho }-C_{\rho \nu \mu }, \end{aligned}$$
(2.5)

which is antisymmetric with respect to the exchange \(\mu \leftrightarrow \rho \). The curvature associated with \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\) is defined by

$$\begin{aligned} {{\mathcal {F}}}_{\mu \nu ~~\sigma }^{~~~\rho } \equiv \partial _\mu {{\mathcal {A}}}_{\nu ~\sigma }^{~\,\rho }-\partial _\nu {{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }+{{\mathcal {A}}}_{\mu ~\lambda }^{~\,\rho }{{\mathcal {A}}}_{\nu ~\sigma }^{~\,\lambda }-{{\mathcal {A}}}_{\nu ~\lambda }^{~\,\rho }{{\mathcal {A}}}_{\mu ~\sigma }^{~\,\lambda },\nonumber \\ \end{aligned}$$
(2.6)

which can be expressed in terms of \(C_{\mu ~\sigma }^{~\,\rho }\) as

$$\begin{aligned} {{\mathcal {F}}}_{\mu \nu ~~\sigma }^{~~~\rho }= & {} R_{\mu \nu ~~\sigma }^{~~~\rho } +D_\mu C_{\nu ~\sigma }^{~\,\rho }-D_\nu C_{\mu ~\sigma }^{~\,\rho }+ C_{\mu ~\lambda }^{~\,\rho } C_{\nu ~\sigma }^{~\,\lambda }\nonumber \\&- C_{\nu ~\lambda }^{~\,\rho } C_{\mu ~\sigma }^{~\,\lambda }, \end{aligned}$$
(2.7)

where \(R_{\mu \nu ~~\sigma }^{~~~\rho }\) is the standard Riemann tensor (Footnote 4), i.e. \({{\mathcal {F}}}_{\mu \nu ~~\sigma }^{~~~\rho }\) evaluated at \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }=\Gamma _{\mu ~\sigma }^{~\,\rho }\). Starting from \({{\mathcal {F}}}_{\mu \nu ~~\sigma }^{~~~\rho }\) we can define a scalar

$$\begin{aligned} {{\mathcal {R}}} \equiv {{\mathcal {F}}}_{\mu \nu }^{~~~\mu \nu } \end{aligned}$$
(2.8)

and a pseudoscalar (see [19,20,21])

$$\begin{aligned} {{\mathcal {R}}'} \equiv \frac{1}{\sqrt{-g}}\epsilon ^{\mu \nu \rho \sigma }{{\mathcal {F}}}_{\mu \nu \rho \sigma }, \end{aligned}$$
(2.9)

where \(\epsilon ^{\mu \nu \rho \sigma }\) is the totally antisymmetric Levi-Civita symbol with \(\epsilon ^{0123}=1\). We will refer to \({{\mathcal {R}}'}\) as the Holst invariant. The pseudoscalar \({{\mathcal {R}}'}\) vanishes for \(C_{\mu ~\sigma }^{~\,\rho }=0\) (that is when the connection is the LC one) because of the cyclicity property \(R_{\mu \nu \rho \sigma }+R_{\nu \sigma \rho \mu }+R_{\sigma \mu \rho \nu }=0\), which is the reason why in standard Riemannian geometry \({{\mathcal {R}}'}\) is absent. Therefore, \({{\mathcal {R}}'}\) can be considered as a direct manifestation of a connection that is independent of the metric. We will study its possible dynamics in Sect. 4.2. By using (2.7) one obtains

$$\begin{aligned} {{\mathcal {R}}}= & {} R +D_\mu C_{\nu }^{~\,\mu \nu }-D_\nu C_{\mu }^{~\,\mu \nu }+ C_{\mu ~\lambda }^{~\,\mu } C_{\nu }^{~\,\lambda \nu }- C_{\nu ~\lambda }^{~\,\mu } C_{\mu }^{~\,\lambda \nu }, \nonumber \\ \end{aligned}$$
(2.10)
$$\begin{aligned} {{\mathcal {R}}'}= & {} \frac{2}{\sqrt{-g}}\epsilon ^{\mu \nu \rho \sigma }\left( D_\mu C_{\nu \rho \sigma }+C_{\mu \rho \lambda } C_{\nu ~\sigma }^{~\,\lambda }\right) . \end{aligned}$$
(2.11)
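The vanishing of \({{\mathcal {R}}'}\) for the LC connection can be made explicit with a short check, given here as a sketch that uses only the cyclicity property quoted above. The cyclic relabeling \(\mu \rightarrow \nu \rightarrow \sigma \rightarrow \mu \) of the dummy indices is an even permutation, so it leaves \(\epsilon ^{\mu \nu \rho \sigma }\) unchanged and each of the three terms of the cyclic sum gives the same contraction:

$$\begin{aligned} \epsilon ^{\mu \nu \rho \sigma }R_{\mu \nu \rho \sigma } = \frac{1}{3}\,\epsilon ^{\mu \nu \rho \sigma }\left( R_{\mu \nu \rho \sigma }+R_{\nu \sigma \rho \mu }+R_{\sigma \mu \rho \nu }\right) = 0. \end{aligned}$$

This agrees with (2.11), whose right-hand side vanishes identically when \(C_{\mu ~\sigma }^{~\,\rho }=0\).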

Note that we can decompose

$$\begin{aligned} {{\mathcal {F}}}_{\mu \nu \rho \sigma }= & {} \frac{1}{16} g_{\mu \rho }g_{\nu \sigma } {{\mathcal {R}}} -\frac{1}{4!\sqrt{-g}}\epsilon _{\mu \nu \rho \sigma }{{\mathcal {R}}}'+\tilde{{\mathcal {F}}}_{\mu \nu \rho \sigma },\nonumber \\&(\tilde{{\mathcal {F}}}_{\mu \nu }^{~~~\mu \nu }=0,~~\epsilon ^{\mu \nu \rho \sigma }\tilde{{\mathcal {F}}}_{\mu \nu \rho \sigma }=0), \end{aligned}$$
(2.12)

where \(\epsilon _{\mu \nu \rho \sigma }\) is the totally antisymmetric tensor with \(\epsilon _{0123}\) equal to the metric determinant g, such that \(g_{\mu \alpha }g_{\nu \beta }g_{\rho \gamma }g_{\sigma \delta }\epsilon ^{\alpha \beta \gamma \delta }=\epsilon _{\mu \nu \rho \sigma }\).
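As a consistency check of the coefficients in (2.12), one can contract both sides with \(g^{\mu \rho }g^{\nu \sigma }\) and with \(\epsilon ^{\mu \nu \rho \sigma }/\sqrt{-g}\). The sketch below uses \(g^{\mu \rho }g_{\mu \rho }=4\), the vanishing of \(g^{\mu \rho }g^{\nu \sigma }\epsilon _{\mu \nu \rho \sigma }\) and of \(\epsilon ^{\mu \nu \rho \sigma }g_{\mu \rho }g_{\nu \sigma }\) by antisymmetry, the two conditions on \(\tilde{{\mathcal {F}}}_{\mu \nu \rho \sigma }\), and \(\epsilon ^{\mu \nu \rho \sigma }\epsilon _{\mu \nu \rho \sigma }=4!\,g\), which follows from \(\epsilon ^{0123}=1\) and \(\epsilon _{0123}=g\):

$$\begin{aligned} g^{\mu \rho }g^{\nu \sigma }{{\mathcal {F}}}_{\mu \nu \rho \sigma }&= \frac{1}{16}\left( g^{\mu \rho }g_{\mu \rho }\right) \left( g^{\nu \sigma }g_{\nu \sigma }\right) {{\mathcal {R}}} = {{\mathcal {R}}}, \\ \frac{1}{\sqrt{-g}}\epsilon ^{\mu \nu \rho \sigma }{{\mathcal {F}}}_{\mu \nu \rho \sigma }&= -\frac{1}{4!\,(-g)}\,\epsilon ^{\mu \nu \rho \sigma }\epsilon _{\mu \nu \rho \sigma }\,{{\mathcal {R}}}' = -\frac{4!\,g}{4!\,(-g)}\,{{\mathcal {R}}}' = {{\mathcal {R}}}'. \end{aligned}$$

Both contractions reproduce the definitions (2.8) and (2.9), so the factors \(1/16\) and \(-1/4!\) are the ones required by those definitions.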

All the ingredients introduced so far are sufficient to describe gravity only. However, we want to include all the other interactions (electroweak, strong, Yukawa interactions, etc.) so we also consider a generic number of real scalars (or pseudoscalars) \(\phi \), gauge fields \(A^I_\mu \) corresponding to an internal gauge group G and fermions, which we represent here with Weyl spinors \(\psi \). Note that massive vector fields can be obtained as usual through the Higgs or Stückelberg mechanisms.

In general the distorsion tensor does not have special properties. However, in the presence of fermions one can show that it should be such that the covariant derivative of the metric vanishes, or, in other words, the theory should be metric compatible.

As we will recover now, this has to do with the fact that in a generic curved spacetime fermion fields belong to the spinorial representation of a local Lorentz group in the tangent space. Indeed in order to define them one introduces a basis \(\{e_a\}\) in the tangent space such that

$$\begin{aligned} \eta _{ab}=e_a^\mu e_b^\nu g_{\mu \nu }, \end{aligned}$$
(2.13)

where the “tetrads” \(e_a^\mu \) are defined by expanding each \(e_a\) in the coordinate basis, \(e_a=e_a^\mu \frac{\partial }{\partial x^\mu }\). We can also define \(e^a_\mu \equiv \eta ^{ab}g_{\mu \nu }e^\nu _b\), which can be considered as the components of some one-form fields \(e^a\) in the one-form basis \(\{dx^\mu \}\), namely \(e^a= e^a_\mu dx^\mu \). Using (2.13) one finds that these quantities satisfy \(e^a_\mu e^\mu _b=\delta ^a_b\) and

$$\begin{aligned} g_{\mu \nu }=e^a_\mu e^b_\nu \eta _{ab}. \end{aligned}$$
(2.14)

It follows that the inverse of the metric exists and is given by \(g^{\mu \nu } = e_a^\mu e_b^\nu \eta ^{ab}\), which implies \(\eta ^{ab}=e^a_\mu e^b_\nu g^{\mu \nu }\). The \(e^a\) (and analogously the \(e_a\)) are defined modulo local Lorentz transformations: if we redefine \(e^{'a} = \Lambda ^a_{~b}e^b\), where \(\Lambda ^a_{~b}\) are the elements of a local Lorentz transformation, we obtain the same metric \(g_{\mu \nu }=e^{'a}_\mu e^{'b}_\nu \eta _{ab}\). Let us consider now a vector \({{\mathcal {V}}}\), which we take to be G-invariant for simplicity, and expand it in the basis \(\{e_a\}\), that is \({{\mathcal {V}}}= {{\mathcal {V}}}^a e_a\). The components \({{\mathcal {V}}}^a\) belong to the vector representation of the local Lorentz group so their covariant derivative

$$\begin{aligned} {{\mathcal {D}}}_\mu {{\mathcal {V}}}^a = \partial _\mu {{\mathcal {V}}}^a+{{\mathcal {A}}}_{\mu ~b}^{~\,a}{{\mathcal {V}}}^b \end{aligned}$$
(2.15)

should feature a connection \({{\mathcal {A}}}_{\mu ~b}^{~\,a}\) (known as the spin connection) whose values belong to the Lorentz algebra: defining \({{\mathcal {A}}}_{\mu }^{~\,ab}\equiv {{\mathcal {A}}}_{\mu ~c}^{~\,a} \eta ^{cb}\), we can impose an antisymmetry with respect to the exchange of the flat indices ab:

$$\begin{aligned} {{\mathcal {A}}}_{\mu }^{~\,ab}=-{{\mathcal {A}}}_{\mu }^{~\,ba}. \end{aligned}$$
(2.16)

The spin connection can be seen as the connection \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\) rewritten using the tetrad basis and we can express one in terms of the other: this can be done by considering the covariant derivative \({{\mathcal {D}}} {{\mathcal {V}}}\) and writing the identities

$$\begin{aligned}&{{\mathcal {D}}}_\mu {{\mathcal {V}}}^\rho ~ dx^\mu \otimes \frac{\partial }{\partial x^\rho }= {{\mathcal {D}}} {{\mathcal {V}}}={{\mathcal {D}}}_\mu {{\mathcal {V}}}^a ~ dx^\mu \otimes e_a \nonumber \\&\quad = e_a^\rho {{\mathcal {D}}}_\mu {{\mathcal {V}}}^a ~ dx^\mu \otimes \frac{\partial }{\partial x^\rho } \end{aligned}$$
(2.17)

which implies \( {{\mathcal {D}}}_\mu {{\mathcal {V}}}^\rho =e_a^\rho {{\mathcal {D}}}_\mu {{\mathcal {V}}}^a\). Using then (2.2) and \({{\mathcal {V}}}^a=e^a_\lambda {{\mathcal {V}}}^\lambda \) one finds

$$\begin{aligned} {{\mathcal {A}}}_{\mu ~b}^{~\,a} = e^a_\nu {{\mathcal {A}}}_{\mu ~\lambda }^{~\,\nu } e^\lambda _b- e^\lambda _b\partial _\mu e^a_\lambda . \end{aligned}$$
(2.18)
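For completeness, the intermediate step leading to (2.18) can be sketched as follows. It uses \(e_a^\rho e^a_\lambda =\delta ^\rho _\lambda \), which holds because the tetrad components form an invertible square matrix:

$$\begin{aligned} e_a^\rho {{\mathcal {D}}}_\mu {{\mathcal {V}}}^a&= e_a^\rho \left[ \partial _\mu \left( e^a_\lambda {{\mathcal {V}}}^\lambda \right) +{{\mathcal {A}}}_{\mu ~b}^{~\,a}e^b_\lambda {{\mathcal {V}}}^\lambda \right] \\&= \partial _\mu {{\mathcal {V}}}^\rho + e_a^\rho \left( \partial _\mu e^a_\lambda +{{\mathcal {A}}}_{\mu ~b}^{~\,a}e^b_\lambda \right) {{\mathcal {V}}}^\lambda . \end{aligned}$$

Comparing with \({{\mathcal {D}}}_\mu {{\mathcal {V}}}^\rho =\partial _\mu {{\mathcal {V}}}^\rho +{{\mathcal {A}}}_{\mu ~\lambda }^{~\,\rho }{{\mathcal {V}}}^\lambda \) from (2.2) for arbitrary \({{\mathcal {V}}}^\lambda \) gives \({{\mathcal {A}}}_{\mu ~\lambda }^{~\,\rho }=e_a^\rho \left( \partial _\mu e^a_\lambda +{{\mathcal {A}}}_{\mu ~b}^{~\,a}e^b_\lambda \right) \), and contracting with \(e^c_\rho e^\lambda _d\) yields (2.18).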

From this result one can show

$$\begin{aligned} {{\mathcal {D}}}_\mu e_\nu ^a \equiv \partial _\mu e_\nu ^a -{{\mathcal {A}}}_{\mu ~\nu }^{~\,\lambda } e^a_\lambda + {{\mathcal {A}}}_{\mu ~b}^{~\,a} e^b_\nu = 0 \end{aligned}$$
(2.19)

and, therefore, using (2.14), the antisymmetry property (2.16) and the Leibniz rule we obtain \({{\mathcal {D}}}_\mu g_{\alpha \beta }=0\).
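The last step can be spelled out as follows. This is a sketch that assumes the lowering convention \({{\mathcal {A}}}_{\mu ab}\equiv \eta _{ac}{{\mathcal {A}}}_{\mu ~b}^{~\,c}\) (so that (2.16) reads \({{\mathcal {A}}}_{\mu ab}=-{{\mathcal {A}}}_{\mu ba}\)) and treats \(\eta _{ab}\) as a tensor with two covariant flat indices:

$$\begin{aligned} {{\mathcal {D}}}_\mu \eta _{ab}&= \partial _\mu \eta _{ab}-{{\mathcal {A}}}_{\mu ~a}^{~\,c}\eta _{cb}-{{\mathcal {A}}}_{\mu ~b}^{~\,c}\eta _{ac} = -\left( {{\mathcal {A}}}_{\mu ba}+{{\mathcal {A}}}_{\mu ab}\right) = 0, \\ {{\mathcal {D}}}_\mu g_{\alpha \beta }&= \left( {{\mathcal {D}}}_\mu e^a_\alpha \right) e^b_\beta \eta _{ab}+e^a_\alpha \left( {{\mathcal {D}}}_\mu e^b_\beta \right) \eta _{ab}+e^a_\alpha e^b_\beta \,{{\mathcal {D}}}_\mu \eta _{ab} = 0. \end{aligned}$$

The first two terms in the second line vanish because of (2.19) and the third because of the antisymmetry of the spin connection.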

The above-mentioned local Lorentz group is precisely the one with respect to which fermions belong to the spinorial representation. Therefore, we recover the well-known result that in the presence of fermions, when this local Lorentz group is compulsory, the theory should be metric compatible. In the absence of fermions, on the other hand, one can have \({{\mathcal {D}}}_\mu g_{\alpha \beta }\ne 0\) and \(T_{\mu \nu \rho }=0\), which is known as Palatini gravity.

The gauge fields \(A^I_\mu \), together with the connection \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\), allow us to define a covariant derivative with respect to both general coordinate transformations and elements of G, whose action on scalars and fermions reads

$$\begin{aligned}&{{\mathcal {D}}}_\mu \phi = \partial _\mu \phi + i \theta ^I A^I_\mu \phi , \nonumber \\&{{\mathcal {D}}}_\mu \psi = \partial _\mu \psi + i t^IA^I_\mu \psi + \frac{1}{2} {{\mathcal {A}}}^{ab}_\mu \sigma _{ab} \psi , \end{aligned}$$
(2.20)

where, recalling that we work with Weyl fermions, \(\sigma ^{ab} \equiv \frac{1}{4} (\sigma ^a{{\bar{\sigma }}}^b-\sigma ^b{{\bar{\sigma }}}^a)\), also \(\sigma ^i\equiv -{{\bar{\sigma }}}^i\) (with \(i=1,2,3\)) are the Pauli matrices and \(\sigma ^0\equiv {{\bar{\sigma }}}^0\equiv 1\) is the \(2\times 2\) identity matrix. The gauge couplings are contained in the matrices \(\theta ^I\) and \(t^I\), which are the generators of G in the scalar and fermion representations, respectively.

We consider now the commutator of two covariant derivatives acting on a scalar field \(\phi \):

$$\begin{aligned}{}[{{\mathcal {D}}}_\mu ,{{\mathcal {D}}}_\nu ]\phi = \left[ iF_{\mu \nu }^I \theta ^I-({{\mathcal {A}}}_{\mu ~\nu }^{~\,\lambda }-{{\mathcal {A}}}_{\nu ~\mu }^{~\,\lambda }){{\mathcal {D}}}_\lambda \right] \phi , \end{aligned}$$
(2.21)

where

$$\begin{aligned} F_{\mu \nu }^I \equiv \partial _\mu A^I_\nu -\partial _\nu A^I_\mu -f^{KJI} A_\mu ^KA_\nu ^J \end{aligned}$$
(2.22)

and the \(f^{KJI}\) are the structure constants of G. Note that both \([{{\mathcal {D}}}_\mu ,{{\mathcal {D}}}_\nu ]\phi \) and \(({{\mathcal {A}}}_{\mu ~\nu }^{~\,\lambda }-{{\mathcal {A}}}_{\nu ~\mu }^{~\,\lambda }){{\mathcal {D}}}_\lambda \phi \) are tensors and so, because of (2.21), also the \(F_{\mu \nu }^I\) must be tensors. This shows that the expression of the field strength tensor of \(A^I_\mu \) in the presence of a generic connection can be taken to be \(F_{\mu \nu }^I\), namely the same as the one in flat space even if the connection is not the LC one.
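The origin of the two terms in (2.21) can be sketched explicitly. The normalization \([\theta ^K,\theta ^J]=i f^{KJI}\theta ^I\) is an assumption made here, chosen so that the result matches (2.22). Since \({{\mathcal {D}}}_\nu \phi \) carries one covariant index, the outer derivative contributes \(-{{\mathcal {A}}}_{\mu ~\nu }^{~\,\lambda }{{\mathcal {D}}}_\lambda \phi \), whose antisymmetrization gives the second term in (2.21). The remaining, gauge part of the commutator is

$$\begin{aligned} i\theta ^I\left( \partial _\mu A^I_\nu -\partial _\nu A^I_\mu \right) \phi -[\theta ^K,\theta ^J]A^K_\mu A^J_\nu \phi&= i\theta ^I\left( \partial _\mu A^I_\nu -\partial _\nu A^I_\mu -f^{KJI}A^K_\mu A^J_\nu \right) \phi \\&= iF^I_{\mu \nu }\theta ^I\phi . \end{aligned}$$

The terms with one derivative of \(\phi \) and one gauge field are symmetric in \(\mu \nu \) and drop out of the commutator, as does \(\partial _\mu \partial _\nu \phi \).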

3 Theories with non-dynamical distorsion

Before going to the general characterization of theories with non-dynamical distorsion it is useful to recall the structure of metric theories.

Einstein’s GR is the leading theory of this type in the low energy limit. Its action is the standard Einstein–Hilbert one

$$\begin{aligned} S_{\mathrm{EH}} = \int d^4x \sqrt{-g}\left( \frac{M_{P}^2}{2} R - \Lambda \right) , \end{aligned}$$
(3.1)

where \(M_{P}\) is the reduced Planck mass and \(\Lambda \) is the cosmological constant. We can also add higher curvature terms to \(S_{\mathrm{EH}} \),

$$\begin{aligned}&\int d^4x \sqrt{-g}\left( a_2 R^2 +b_2 R_{\mu \nu }R^{\mu \nu } +\frac{a_3}{M_{P}^2} R^3 +\cdots \nonumber \right. \\&\left. \quad +\frac{a_4}{M_{P}^4} (R_{\mu \nu }R^{\mu \nu })^2 + \cdots \right) , \end{aligned}$$
(3.2)

where the \(a_i\), \(b_i\), etc. are freely adjustable dimensionless coefficients.

Furthermore, we can also add to the theory a generic matter sector with action \( S_{\mathrm{matter}} = \int d^4x \sqrt{-g}{\mathscr {L}}_{\mathrm{matter}}\), where \({\mathscr {L}}_{\mathrm{matter}}\) can contain (pseudo)scalar fields \(\phi \), fermions \(\psi \) and gauge fields \(A_\mu ^I\). Besides the standard renormalizable terms (which play the leading role in the low energy limit) \({\mathscr {L}}_{\mathrm{matter}}\) can also contain higher-order terms built with \(\phi \), \(\psi \) and \(A_\mu ^I\) as well as \(g_{\mu \nu }\). All these terms, of course, must be compatible with the given symmetries (general coordinate invariance, G and possibly some global symmetries). For example, we can add to \({\mathscr {L}}_{\mathrm{matter}}\) terms of the form \((F_{\mu \nu }^IF^{I\mu \nu })^2\), \(({D}_\mu \phi {D}^\mu \phi )^3\), \(R {D}_\mu \phi {D}^\mu \phi \) etc. with, again, freely adjustable coefficients.

Adopting the EFT point of view, the higher the dimensionality of an added term, the less relevant that term is at low energies. By the same reasoning, we do not add non-local terms either, because at sufficiently low energies any non-locality will be described by a series of local terms.

3.1 General characterization

The purpose of this section is to identify the most general class of (local effective field) theories of the type defined in Sect. 2 where the distorsion \(C_{\mu ~\sigma }^{~\,\rho }\) is not dynamical. These theories are those whose action can be brought into the form

$$\begin{aligned} S_{\mathrm{eq}} = \int d^4x\sqrt{-g}\left( {{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + \Sigma (\Phi , {{\mathcal {D}}}\Phi , C) \right) ,\nonumber \\ \end{aligned}$$
(3.3)

where \(\Phi \) represents the set of fields that are independent of \(C_{\mu ~\sigma }^{~\,\rho }\), namely

$$\begin{aligned} \Phi =\{g_{\mu \nu }, \phi , \psi , F_{\mu \nu }^I,\ldots \}, \end{aligned}$$
(3.4)

the dots are curvatures and covariant derivatives of the previous fields constructed with the LC connection, \({{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi )\) is a rank-four contravariant tensor that depends on \(\Phi \) only (not on its derivatives) and \(\Sigma (\Phi , {{\mathcal {D}}}\Phi , C)\) is a quantity that depends on \(\Phi \), \({{\mathcal {D}}}\Phi \) and \(C_{\mu ~\sigma }^{~\,\rho }\) only, but not on derivatives of \(C_{\mu ~\sigma }^{~\,\rho }\). Note that \({{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi )\) and \(\Sigma (\Phi , {{\mathcal {D}}}\Phi , C)\) should also be invariant under gauge transformations of G and possibly some global symmetries, if any.

The reason why the distorsion is not dynamical for theories of the form (3.3) is that the field equations of \(C_{\mu ~\sigma }^{~\,\rho }\) are purely algebraic in \(C_{\mu ~\sigma }^{~\,\rho }\). Indeed, the derivatives of the distorsion only appear in the first term proportional to \({{\mathcal {T}}}^{\mu \nu \rho \sigma }\) and they are first derivatives, so, after an integration by parts, it is possible to make them act on \({{\mathcal {T}}}^{\mu \nu \rho \sigma }\) instead. Therefore, in principle, these equations can be solved exactly to find \(C_{\mu ~\sigma }^{~\,\rho }\) as a functional of \(\Phi \). Once this is done, the theory with action \(S_{\mathrm{eq}}\) can always be written as a metric theory, whose general form has been described at the beginning of this Sect. 3.
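The integration by parts mentioned above can be sketched as follows, using (2.7) with the index \(\rho \) lowered (the LC derivative commutes with the metric) and the standard identity \(\sqrt{-g}\,D_\mu V^\mu =\partial _\mu (\sqrt{-g}\,V^\mu )\) for the LC connection:

$$\begin{aligned} \sqrt{-g}\,{{\mathcal {T}}}^{\mu \nu \rho \sigma }D_\mu C_{\nu \rho \sigma } = \partial _\mu \left( \sqrt{-g}\,{{\mathcal {T}}}^{\mu \nu \rho \sigma }C_{\nu \rho \sigma }\right) -\sqrt{-g}\left( D_\mu {{\mathcal {T}}}^{\mu \nu \rho \sigma }\right) C_{\nu \rho \sigma }. \end{aligned}$$

Treating the term with \(D_\nu C_{\mu \rho \sigma }\) in the same way, the distorsion-dependent part of the first term in (3.3) becomes, up to total derivatives,

$$\begin{aligned} \sqrt{-g}\left[ -\left( D_\mu {{\mathcal {T}}}^{\mu \nu \rho \sigma }-D_\mu {{\mathcal {T}}}^{\nu \mu \rho \sigma }\right) C_{\nu \rho \sigma }+{{\mathcal {T}}}^{\mu \nu \rho \sigma }\left( C_{\mu \rho \lambda }C_{\nu ~\sigma }^{~\,\lambda }-C_{\nu \rho \lambda }C_{\mu ~\sigma }^{~\,\lambda }\right) \right] , \end{aligned}$$

which contains no derivatives of \(C_{\mu ~\sigma }^{~\,\rho }\): it is at most quadratic in the distorsion, with coefficients fixed by \(\Phi \).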

Note that the theory defined in (3.3) is the most general one with non-dynamical distorsion. The reason is that even setting \(C_{\mu ~\sigma }^{~\,\rho }=0\) one can recover the most general metric theory: this is because, as we have specified, the collective field \(\Phi \) can also contain curvature tensors and covariant derivatives of \(\phi , \psi , {\overline{\psi }}, F_{\mu \nu }^I\) constructed with the LC connection. If one allows now for a non-vanishing distorsion, one can anyhow express it in terms of \(\Phi \) by using its field equations.

We can thus state that the theories with non-dynamical distorsion are those whose action is linear in the curvature \({{\mathcal {F}}}_{\mu \nu ~~\sigma }^{~~~\rho }\) of the full connection \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\) with the “coefficients” of the linear terms, i.e. the tensor \({{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi )\), being independent of the distorsion itself. This class of theories can be regarded as equivalent formulations of the most general metric theories with the given set of matter fields \(\{\phi , \psi , A_\mu ^I\}\) (for examples of equivalent formulations of specific metric theories see Refs. [22,23,24,25]).

3.2 Theories with a falsely-dynamical distorsion

It is important to note that some theories, despite not appearing of the form (3.3), can be brought into that form with appropriate redefinitions.

To illustrate this point let us consider as an example the case where the action is

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( {{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + \Delta (\Phi , \alpha (\Phi ){{\mathcal {R}}}\nonumber \right. \\&\left. +\beta (\Phi ) {{\mathcal {R}}}')+ \Sigma (\Phi , {{\mathcal {D}}}\Phi ) \right) , \end{aligned}$$
(3.5)

with

$$\begin{aligned} {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) = \alpha (\Phi ) g^{\mu \rho }g^{\nu \sigma } + \beta (\Phi )\frac{\epsilon ^{\mu \nu \rho \sigma }}{\sqrt{-g}} \end{aligned}$$
(3.6)

and \(\Delta \) is a function of \(\Phi \) and the specific combination \(\alpha (\Phi ){{\mathcal {R}}}+\beta (\Phi ) {{\mathcal {R}}}'\) only, where \(\alpha \) and \(\beta \) are the same functions of \(\Phi \) that appear in (3.6). Moreover, we take \(\Phi \) independent of the curvature and covariant derivatives built with the LC connection and \(\Sigma \) independent of \({{\mathcal {D}}}g_{\mu \nu }\); also we take \(\alpha \), \(\beta \) and \(\Delta \) independent of the metric and impose the further constraint \(1+\frac{\partial \Delta }{\partial z}(\Phi ,z)>0\). In this specific case, using (3.6), the action reads

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( \alpha (\Phi ){{\mathcal {R}}}+\beta (\Phi ) {{\mathcal {R}}}' + \Delta (\Phi , \alpha (\Phi ){{\mathcal {R}}}\nonumber \right. \\&\left. +\beta (\Phi ) {{\mathcal {R}}}')+ \Sigma (\Phi , {{\mathcal {D}}}\Phi ) \right) . \end{aligned}$$
(3.7)

Theories of this form actually belong to the class of (3.3) and, therefore, feature a non-dynamical distorsion. In order to show that we introduce an auxiliary field z that allows us to write S in the form

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( \alpha (\Phi )\left( 1+\frac{\partial \Delta }{\partial z}(\Phi ,z)\right) {{\mathcal {R}}}\nonumber \right. \\&\left. +\beta (\Phi )\left( 1+\frac{\partial \Delta }{\partial z}(\Phi ,z)\right) {{\mathcal {R}}'} \right. \nonumber \\&\left. + \Delta (\Phi ,z)-z\frac{\partial \Delta }{\partial z}(\Phi ,z)+\Sigma (\Phi ,{{\mathcal {D}}}\Phi )\right) . \end{aligned}$$
(3.8)

The action above is equivalent to the one in (3.7) because of the following argument. First note that we can impose the condition \(\frac{\partial ^2\Delta }{\partial z^2}\ne 0\) without loss of generality given that around any point where \(\frac{\partial ^2\Delta }{\partial z^2} = 0 \) we can write

$$\begin{aligned}&\Delta (\Phi , \alpha (\Phi ){{\mathcal {R}}}+\beta (\Phi ) {{\mathcal {R}}}') \simeq \Delta _0(\Phi ) \nonumber \\&\quad +\Delta _1(\Phi ) ( \alpha (\Phi ){{\mathcal {R}}}+\beta (\Phi ) {{\mathcal {R}}}') \end{aligned}$$
(3.9)

and the functions of \(\Phi \) that we called here \(\Delta _0\) and \(\Delta _1\) can be absorbed in an appropriate redefinition of \(\alpha \), \(\beta \) and \(\Sigma \). Now, by using the field equation of z computed using the action in (3.8), we find

$$\begin{aligned} \frac{\partial ^2\Delta }{\partial z^2}(\Phi ,z) (\alpha (\Phi ){{\mathcal {R}}}+\beta (\Phi ) {{\mathcal {R}}}' - z) = 0, \end{aligned}$$
(3.10)

which implies, using \(\frac{\partial ^2\Delta }{\partial z^2}\ne 0\), that \(z=\alpha (\Phi ){{\mathcal {R}}}+\beta (\Phi ) {{\mathcal {R}}}'\). By inserting this result in (3.8) one recovers exactly (3.7).
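This step can be checked in one line. With the shorthand \(Z\equiv \alpha (\Phi ){{\mathcal {R}}}+\beta (\Phi ){{\mathcal {R}}}'\), introduced here only for illustration, and omitting the argument \(\Phi \), the z-dependent part of the Lagrangian in (3.8) and its derivative are

$$\begin{aligned} {\mathscr {L}}(z)&= \left( 1+\frac{\partial \Delta }{\partial z}(z)\right) Z+\Delta (z)-z\frac{\partial \Delta }{\partial z}(z), \\ \frac{\partial {\mathscr {L}}}{\partial z}&= \frac{\partial ^2\Delta }{\partial z^2}(z)\,(Z-z), \qquad {\mathscr {L}}(z=Z) = Z+\Delta (Z), \end{aligned}$$

so the field equation of z is (3.10) and the on-shell value \(Z+\Delta (Z)\) is the integrand of (3.7) without \(\Sigma \).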

The reason why these theories can be brought into the form (3.3) is that we can absorb the dependence on z in front of both \({{\mathcal {R}}}\) and \({{\mathcal {R}}'}\) in (3.8) through the metric rescaling

$$\begin{aligned} g_{\mu \nu }\rightarrow \Omega ^2 g_{\mu \nu }, \end{aligned}$$
(3.11)

where \(\Omega ^2\) depends only algebraically on z:

$$\begin{aligned} \Omega ^2(\Phi ,z) = \frac{1}{1+\frac{\partial \Delta }{\partial z}(\Phi ,z)} \end{aligned}$$
(3.12)

(this is where we use \(1+\frac{\partial \Delta }{\partial z}(\Phi ,z)>0\)). After this metric rescaling the spacetime derivatives of z do not appear because we do not change at the same time (Footnote 5) \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\), \(\phi \), \(\psi \) and \(A_\mu ^I\) and, as specified, we take \(\Phi \) independent of the curvature and covariant derivatives built with the LC connection and \(\Sigma \) independent of \({{\mathcal {D}}}g_{\mu \nu }\). Therefore, we can easily integrate out z and express it in terms of the other fields \(\Phi \). So in this case z is not dynamical and there are no other degrees of freedom besides the metric and the matter fields \(\{\phi , \psi , A_\mu ^I\}\).
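The effect of the rescaling (3.11) on the terms of (3.8) that contain the curvature can be sketched as follows. Since \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\) is held fixed, \({{\mathcal {F}}}_{\mu \nu ~~\sigma }^{~~~\rho }\) does not change, and the metric enters \({{\mathcal {R}}}\) and \({{\mathcal {R}}}'\) only through the explicit index contractions and the factor \(1/\sqrt{-g}\):

$$\begin{aligned} \sqrt{-g}\rightarrow \Omega ^4\sqrt{-g}, \qquad {{\mathcal {R}}}\rightarrow \Omega ^{-2}{{\mathcal {R}}}, \qquad {{\mathcal {R}}}'\rightarrow \Omega ^{-2}{{\mathcal {R}}}'. \end{aligned}$$

With \(\Omega ^2\) as in (3.12), and recalling that \(\alpha \), \(\beta \) and \(\Delta \) are taken independent of the metric, one then finds

$$\begin{aligned} \sqrt{-g}\left( 1+\frac{\partial \Delta }{\partial z}\right) \left( \alpha {{\mathcal {R}}}+\beta {{\mathcal {R}}}'\right)&\rightarrow \sqrt{-g}\left( \alpha {{\mathcal {R}}}+\beta {{\mathcal {R}}}'\right) , \\ \sqrt{-g}\left( \Delta -z\frac{\partial \Delta }{\partial z}\right)&\rightarrow \sqrt{-g}\,\frac{\Delta -z\frac{\partial \Delta }{\partial z}}{\left( 1+\frac{\partial \Delta }{\partial z}\right) ^2}. \end{aligned}$$

The first line has the form of the first term in (3.3) with \({{\mathcal {T}}}^{\mu \nu \rho \sigma }\) as in (3.6), while z appears in the second line only algebraically.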

3.3 \(f({{\mathcal {R}}})\) theories

A particular form of Eq. (3.7) is

$$\begin{aligned} S = \int d^4x\sqrt{-g} f({{\mathcal {R}}}), \end{aligned}$$
(3.13)

where f is a function with \(\frac{\partial f}{\partial {{\mathcal {R}}}} > 0\) and \(\frac{\partial ^2 f}{\partial {{\mathcal {R}}}^2} \ne 0\). Therefore, we obtain that also \(f({{\mathcal {R}}})\) metric-affine theories (Footnote 6) do not feature a dynamical distorsion.

Also, as a consequence of the calculations we have performed in Sect. 3.2, the \(f({{\mathcal {R}}})\) metric-affine theories can actually be recast in the GR form (3.1). Indeed, by defining the function \(\Delta \) through

$$\begin{aligned} \alpha {{\mathcal {R}}} +\Delta (\alpha {{\mathcal {R}}}) \equiv f({{\mathcal {R}}}), \end{aligned}$$
(3.14)

where \(\alpha \) is an arbitrary positive constant, we obtain (after the metric rescaling in (3.11) and (3.12))

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left( \alpha {{\mathcal {R}}} +\alpha ^2\frac{f({\tilde{z}})-{\tilde{z}} \frac{\partial f}{\partial {\tilde{z}}}({\tilde{z}})}{\frac{\partial f}{\partial {\tilde{z}}}({\tilde{z}})^2} \right) , \end{aligned}$$
(3.15)

where \({\tilde{z}}\equiv z/\alpha \); the field \({\tilde{z}}\) is clearly non-dynamical and in principle we can solve its field equation and plug the solution into the action to obtain

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left( \alpha {{\mathcal {R}}} - \Lambda \right) , \end{aligned}$$
(3.16)

where

$$\begin{aligned} \Lambda =\alpha ^2\frac{{\tilde{z}}_0 \frac{\partial f}{\partial {\tilde{z}}_0}({\tilde{z}}_0)-f({\tilde{z}}_0)}{\frac{\partial f}{\partial {\tilde{z}}_0}({\tilde{z}}_0)^2} \end{aligned}$$
(3.17)

and \({\tilde{z}}_0\) is a solution of the \({\tilde{z}}\) field equation. After that, using the techniques in Appendix A (see also Ref. [25]), we can solve the field equations of the distorsion and insert the solution into the action to obtain precisely (3.1), with the identification \(\alpha =M_{P}^2/2\).
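
As an illustration of how (3.15) leads to (3.17), the \({\tilde{z}}\) field equation can be made explicit. What follows is a sketch rather than part of the original derivation; the sample function f and its parameters \(\lambda \) and c are assumptions introduced only for this example. Writing \(f'\equiv \partial f/\partial {\tilde{z}}\), stationarity of the last term in (3.15) with respect to \({\tilde{z}}\) gives

$$\begin{aligned} \frac{\partial }{\partial {\tilde{z}}}\left( \frac{f-{\tilde{z}} f'}{f'^2}\right) = \frac{f''\left( {\tilde{z}} f'-2f\right) }{f'^3}=0 \quad \Longrightarrow \quad {\tilde{z}}_0 f'({\tilde{z}}_0)=2f({\tilde{z}}_0), \end{aligned}$$

where \(f''\ne 0\) has been used. For instance, taking \(f({{\mathcal {R}}}) = -\lambda +\alpha {{\mathcal {R}}}+c{{\mathcal {R}}}^2\) (with the linear coefficient chosen equal to \(\alpha \) for simplicity) one finds

$$\begin{aligned} {\tilde{z}}_0=\frac{2\lambda }{\alpha },\qquad f'({\tilde{z}}_0)=\alpha \left( 1+\frac{4c\lambda }{\alpha ^2}\right) ,\qquad f({\tilde{z}}_0)=\lambda \left( 1+\frac{4c\lambda }{\alpha ^2}\right) ,\qquad \Lambda =\frac{\lambda }{1+\frac{4c\lambda }{\alpha ^2}}. \end{aligned}$$

In this example \(\Lambda \rightarrow \lambda \) as \(c\rightarrow 0\), as expected for \(f=\alpha {{\mathcal {R}}}-\lambda \), and the condition \(f'({\tilde{z}}_0)>0\) requires \(1+4c\lambda /\alpha ^2>0\).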

This means, among other things, that \(f({{\mathcal {R}}})\) metric-affine theories do not have any other gravitational degrees of freedom besides the ordinary graviton (see also [30] for a previous related discussion, and [31] for the particular case \(f({{\mathcal {R}}}) \propto {{\mathcal {R}}}^2\)). Instead, in f(R) metric theories the gravitational spectrum features, in addition to the ordinary graviton, a dynamical scalar field: technically this happens because, in the metric case (where the connection is the LC one), it is not possible to rescale the metric as in (3.11) without changing the connection.

4 Theories with dynamical distorsion

4.1 General characterization

The general form (3.3) of theories with non-dynamical distorsion is useful, among other things, because it helps us identify the class of theories with a dynamical distorsion: they are those that can never be brought into the form (3.3). Indeed, in this case kinetic terms for some components of the distorsion necessarily appear. In general there can be other components of the distorsion that remain non-dynamical: we say that the distorsion is dynamical when at least some components of this tensor feature kinetic terms.

In the following we provide some examples of (local effective field) theories that cannot be brought into the form (3.3) and, in simple cases, compute explicitly the kinetic terms for the dynamical components of the distorsion.

4.2 Examples: dynamical (pseudo)scalarons

As we have seen, the theories with non-dynamical distorsion are those whose action can be brought into a form that is linear in the curvature of the full connection, with the “coefficients” of the linear terms being independent of the distorsion itself. Therefore, generically, we can have a dynamical distorsion by adding terms that are nonlinear in the curvature. So the first examples of metric-affine theories with dynamical distorsion that we consider have actions of the form

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( {{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + \Delta (\Phi , {{\mathcal {R}}}, {{\mathcal {R}}}')\nonumber \right. \\&\left. + \Sigma (\Phi , {{\mathcal {D}}}\Phi , C) \right) , \end{aligned}$$
(4.1)

where \(\Phi \), \({{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi )\) and \(\Sigma (\Phi , {{\mathcal {D}}}\Phi , C)\) have been defined in Sect. 3.1 and \(\Delta \) is a function of \(\Phi , {{\mathcal {R}}}\) and \({{\mathcal {R}}}'\) only. Note that \(\Delta (\Phi , {{\mathcal {R}}}, {{\mathcal {R}}}')\) should also be invariant under gauge transformations of G and the global symmetries, if any. The function \(\Delta \) can introduce the nonlinearity in the curvature that is crucial to have a dynamical distorsion. Indeed, barring specific choices of the action, such as those described in Sect. 3.2, one has dynamical (pseudo)scalar degrees of freedom coming from the distorsion in this case, as we now show.

Let us start with the case in which \(\Delta \) does not depend on \({{\mathcal {R}}}'\), but can have a generic dependence on \(\Phi \) and \({{\mathcal {R}}}\). This case can be treated by introducing one auxiliary scalar field \(\zeta \). The action S can be equivalently written as follows

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( {{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + \Delta (\Phi , \zeta )\nonumber \right. \\&\left. +\frac{\partial \Delta }{\partial \zeta }(\Phi , \zeta )({{\mathcal {R}}}-\zeta )+ \Sigma (\Phi , {{\mathcal {D}}}\Phi , C) \right) . \end{aligned}$$
(4.2)

To show this we observe that the field equation of \(\zeta \) is

$$\begin{aligned} ({{\mathcal {R}}}-\zeta )\frac{\partial ^2\Delta }{\partial \zeta ^2}=0, \end{aligned}$$
(4.3)

and that we can require without loss of generality \(\frac{\partial ^2\Delta }{\partial \zeta ^2}\ne 0\). Indeed, around any point with \(\frac{\partial ^2\Delta }{\partial \zeta ^2}= 0\) we can have at most a linear dependence of \(\Delta \) on \({{\mathcal {R}}}\), and we can, therefore, absorb \(\Delta \) in a redefinition of \({{\mathcal {T}}}^{\mu \nu \rho \sigma }\) and \(\Sigma \) and go back to the case of non-dynamical distorsion of Sect. 3. From (4.3) it follows that the field equations fix \(\zeta ={{\mathcal {R}}}\) and (4.2) reduces to (4.1). We can now write

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left( {{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\bar{\mathcal {T}}}}^{\mu \nu \rho \sigma }(\Phi ,\zeta ) + {{\bar{\Sigma }}}(\Phi , \zeta , {{\mathcal {D}}}\Phi ,C) \right) ,\nonumber \\ \end{aligned}$$
(4.4)

where

$$\begin{aligned} {{\bar{\mathcal {T}}}}^{\mu \nu \rho \sigma }(\Phi ,\zeta )\equiv & {} {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + g^{\mu \rho }g^{\nu \sigma }\frac{\partial \Delta }{\partial \zeta }(\Phi ,\zeta ), \end{aligned}$$
(4.5)
$$\begin{aligned} {{\bar{\Sigma }}}(\Phi ,\zeta ,{{\mathcal {D}}}\Phi , C)\equiv & {} \Sigma (\Phi ,{{\mathcal {D}}}\Phi , C) \nonumber \\&\quad + \Delta (\Phi ,\zeta )-\zeta \frac{\partial \Delta }{\partial \zeta }(\Phi ,\zeta ). \end{aligned}$$
(4.6)

Therefore, we have come back to the previously studied case \(\Delta =0\), but with an extra scalar \(\zeta \) in addition to the \(\phi \) fields we started with. In deriving the algebraic equations of \(C_{\mu ~\sigma }^{~\,\rho }\), derivatives of \(\zeta \) generically appear when we integrate by parts the terms coming from \({{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\bar{\mathcal {T}}}}^{\mu \nu \rho \sigma }(\Phi ,\zeta )\) that contain one derivative of the variation of \(C_{\mu ~\sigma }^{~\,\rho }\), see Eq. (2.7). This fact can produce a kinetic term for \(\zeta \), barring specific choices of the action. An example of such specific choices is when \({{\mathcal {T}}}^{\mu \nu \rho \sigma }\propto g^{\mu \rho }g^{\nu \sigma }\) as we have seen in Sect. 3.2.
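
The equivalence between (4.2) and (4.1), and the redefinitions (4.5) and (4.6), can be checked explicitly. The following is only an illustrative sketch: the quadratic choice of \(\Delta \) and the constant c are assumptions made here for concreteness. Differentiating the \(\zeta \)-dependent part of the Lagrangian in (4.2) at fixed \(\Phi \) and \({{\mathcal {R}}}\), the two terms proportional to \(\partial \Delta /\partial \zeta \) cancel and one recovers (4.3):

$$\begin{aligned} \frac{\partial }{\partial \zeta }\left[ \Delta (\Phi ,\zeta )+\frac{\partial \Delta }{\partial \zeta }(\Phi ,\zeta )({{\mathcal {R}}}-\zeta )\right] = \frac{\partial \Delta }{\partial \zeta }+\frac{\partial ^2\Delta }{\partial \zeta ^2}({{\mathcal {R}}}-\zeta )-\frac{\partial \Delta }{\partial \zeta } = ({{\mathcal {R}}}-\zeta )\frac{\partial ^2\Delta }{\partial \zeta ^2}. \end{aligned}$$

For \(\Delta (\Phi ,{{\mathcal {R}}}) = c\,{{\mathcal {R}}}^2\) one has \(\partial \Delta /\partial \zeta =2c\zeta \), so that (4.5) and (4.6) become

$$\begin{aligned} {{\bar{\mathcal {T}}}}^{\mu \nu \rho \sigma }(\Phi ,\zeta )= {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + 2c\,\zeta \, g^{\mu \rho }g^{\nu \sigma }, \qquad {{\bar{\Sigma }}}(\Phi ,\zeta ,{{\mathcal {D}}}\Phi , C)= \Sigma (\Phi ,{{\mathcal {D}}}\Phi , C) - c\,\zeta ^2, \end{aligned}$$

and the condition \(\frac{\partial ^2\Delta }{\partial \zeta ^2}=2c\ne 0\) simply requires \(c\ne 0\).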

Whether this new dynamical scalar \(\zeta \) is a manifestation of the dynamics of the distorsion is not clear. This is because \({{\mathcal {R}}}\), which is equal to \(\zeta \) by using the field equations, does not vanish when the distorsion is zero (see Eq. (2.10)) and so a dynamical \(\zeta \) could also correspond just to an extra dynamical scalar from the metric.

Since this section is devoted to theories with a dynamical distorsion we then consider the case where \(\Delta \) depends on both \({\mathcal {R}}\) and \({\mathcal {R}}'\), but for now only through a linear combination

$$\begin{aligned} \rho \equiv a(\Phi ){{\mathcal {R}}}+b(\Phi ){\mathcal {R}}'. \end{aligned}$$
(4.7)

Note that this situation is a generalization of the theories with a falsely-dynamical distorsion that we have analyzed in Sect. 3.2, where \(a=\alpha \), \(b=\beta \) and \({{\mathcal {T}}}^{\mu \nu \rho \sigma }\) was chosen to be of the specific type (3.6). From the technical point of view this case can be treated similarly, but, as we will see soon, generically there is one more dynamical scalar here. Again we introduce an auxiliary field z and we can show that S can be equivalently written as

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( {{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + \Delta (\Phi , z)\nonumber \right. \\&\left. +\frac{\partial \Delta }{\partial z}(\Phi , z)(\rho -z)+ \Sigma (\Phi , {{\mathcal {D}}}\Phi , C) \right) \end{aligned}$$
(4.8)

if the non-restrictive condition \(\frac{\partial ^2\Delta }{\partial z^2}\ne 0\) is imposed. At this point we can again write S as in (4.4) but with different redefined tensors:

$$\begin{aligned} {{\bar{\mathcal {T}}}}^{\mu \nu \rho \sigma }(\Phi ,z)\equiv & {} {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + \left( g^{\mu \rho }g^{\nu \sigma }a(\Phi )\nonumber \right. \\&\left. + \frac{\epsilon ^{\mu \nu \rho \sigma }}{\sqrt{-g}}b(\Phi )\right) \frac{\partial \Delta }{\partial z}(\Phi ,z), \end{aligned}$$
(4.9)
$$\begin{aligned} {{\bar{\Sigma }}}(\Phi ,z,{{\mathcal {D}}}\Phi , C)\equiv & {} \Sigma (\Phi ,{{\mathcal {D}}}\Phi , C) + \Delta (\Phi ,z)-z\frac{\partial \Delta }{\partial z}(\Phi ,z).\nonumber \\ \end{aligned}$$
(4.10)

So also here we have come back to the previously studied case \(\Delta =0\), but with a new scalar z. Again, barring specific choices of the action (e.g. the ones of Sect. 3.2), the kinetic term of z generically emerges when we solve for the distorsion because of the term \({{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\bar{\mathcal {T}}}}^{\mu \nu \rho \sigma }(\Phi ,z)\), which contains one derivative of \(C_{\mu ~\sigma }^{~\,\rho }\). When the kinetic term appears the field z shows its dynamical nature, but again it is not clear whether this dynamics comes from the distorsion or from the metric: using the field equations, \(z=\rho \), and Eqs. (4.7), (2.10) and (2.11) tell us that a part of this dynamical field is sourced by the metric and a part is sourced by the distorsion.

A class of theories where the distorsion is certainly dynamical can be found by considering the generic case where the dependence of \(\Delta \) on \({\mathcal {R}}\) and \({\mathcal {R}}'\) is arbitrary. This case can be treated by introducing an auxiliary scalar field \(\zeta \) and an auxiliary pseudoscalar field \(\zeta '\). The action can be equivalently written as follows

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( {{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + \Delta (\Phi , \zeta ,\zeta ')\right. \nonumber \\&\left. +\frac{\partial \Delta }{\partial \zeta }(\Phi ,\zeta ,\zeta ')({{\mathcal {R}}}-\zeta )+\frac{\partial \Delta }{\partial \zeta '}(\Phi ,\zeta ,\zeta ')({{\mathcal {R}}'}-\zeta ')\nonumber \right. \\&\left. + \Sigma (\Phi , {{\mathcal {D}}}\Phi , C) \right) . \end{aligned}$$
(4.11)

To show this we observe that the field equations of \(\zeta \) and \(\zeta '\) are, respectively,

$$\begin{aligned} ({{\mathcal {R}}}-\zeta )\frac{\partial ^2\Delta }{\partial \zeta ^2} +({{\mathcal {R}}'}-\zeta ')\frac{\partial ^2\Delta }{\partial \zeta '\partial \zeta }= & {} 0 \end{aligned}$$
(4.12)
$$\begin{aligned} ({{\mathcal {R}}}-\zeta )\frac{\partial ^2\Delta }{\partial \zeta '\partial \zeta } +({{\mathcal {R}}'}-\zeta ')\frac{\partial ^2\Delta }{\partial \zeta '^2}= & {} 0. \end{aligned}$$
(4.13)
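
As a compact restatement (a sketch added for clarity, using only the quantities already appearing in (4.12) and (4.13)), the two field equations form a homogeneous linear system whose matrix is the Hessian of \(\Delta \):

$$\begin{aligned} \begin{pmatrix} \frac{\partial ^2\Delta }{\partial \zeta ^2} &{} \frac{\partial ^2\Delta }{\partial \zeta '\partial \zeta } \\ \frac{\partial ^2\Delta }{\partial \zeta '\partial \zeta } &{} \frac{\partial ^2\Delta }{\partial \zeta '^2} \end{pmatrix} \begin{pmatrix} {{\mathcal {R}}}-\zeta \\ {{\mathcal {R}}'}-\zeta ' \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad \frac{\partial ^2\Delta }{\partial \zeta ^2}\frac{\partial ^2\Delta }{\partial \zeta '^2}-\left( \frac{\partial ^2\Delta }{\partial \zeta '\partial \zeta }\right) ^2\ne 0. \end{aligned}$$

When the determinant on the right is non-vanishing, the system admits only the trivial solution.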

Therefore, when the Hessian matrix of \(\Delta \) (with respect to the variables \(\zeta \) and \(\zeta '\)) is not singular these field equations imply \({{\mathcal {R}}}=\zeta \) and \({{\mathcal {R}}'}=\zeta '\) and (4.11) is equivalent to (4.1). We can always require that the Hessian matrix of \(\Delta \) is not singular without loss of generality because around any point where this matrix is singular \(\Delta \) depends at most linearly on a linear combination of \({{\mathcal {R}}}\) and \({{\mathcal {R}}'}\) (with a coefficient independent of the other linearly independent combination) and we can go back to the previously analysed cases with a redefinition of \({{\mathcal {T}}}^{\mu \nu \rho \sigma }\). Now we can again write the action as in (4.4), but with the following redefined tensors that this time depend on both \(\zeta \) and \(\zeta '\):

$$\begin{aligned} {{\bar{\mathcal {T}}}}^{\mu \nu \rho \sigma }(\Phi ,\zeta ,\zeta ')\equiv & {} {{\mathcal {T}}}^{\mu \nu \rho \sigma }(\Phi ) + g^{\mu \rho }g^{\nu \sigma } \frac{\partial \Delta }{\partial \zeta }(\Phi ,\zeta ,\zeta ')\nonumber \\&+ \frac{\epsilon ^{\mu \nu \rho \sigma }}{\sqrt{-g}} \frac{\partial \Delta }{\partial \zeta '}(\Phi ,\zeta ,\zeta '), \end{aligned}$$
(4.14)
$$\begin{aligned} {{\bar{\Sigma }}}(\Phi ,\zeta ,\zeta ',{{\mathcal {D}}}\Phi ,C)\equiv & {} \Sigma (\Phi ,{{\mathcal {D}}}\Phi ,C) + \Delta (\Phi ,\zeta ,\zeta ')\nonumber \\&-\zeta \frac{\partial \Delta }{\partial \zeta }(\Phi ,\zeta ,\zeta ')-\zeta '\frac{\partial \Delta }{\partial \zeta '}(\Phi ,\zeta ,\zeta ').\nonumber \\ \end{aligned}$$
(4.15)

So, again, we have come back to the previously studied case \(\Delta =0\), but with the new scalars \(\zeta \) and \(\zeta '\). When we derive the algebraic equations of \(C_{\mu ~\sigma }^{~\,\rho }\), derivatives of both \(\zeta \) and \(\zeta '\) appear upon integrating by parts the terms coming from \({{\mathcal {F}}}_{\mu \nu \rho \sigma } {{\bar{\mathcal {T}}}}^{\mu \nu \rho \sigma }(\Phi ,\zeta ,\zeta ')\). So, generically, both \(\zeta \) and \(\zeta '\) can be dynamical, barring specific choices of the action (see Footnote 7).

The fields \(\zeta \) and \(\zeta '\) have a purely geometrical origin. We refer to them as the scalaron and the pseudoscalaron, respectively. The pseudoscalaron is particularly interesting for our purposes because it corresponds to a degree of freedom coming essentially from the distorsion: using the field equations \(\zeta '={{\mathcal {R}}'}\) and, according to Eq. (2.11), \({{\mathcal {R}}'}\) can be non zero only if the distorsion is not zero. As discussed above \(\zeta \) and \(\zeta '\) are generically dynamical, but computing explicitly the corresponding kinetic and interaction terms is of course very difficult and not very illuminating in the most general case of (4.1). Therefore, from now on to study the (pseudo)scalaron we focus on a less general class of theories. We take an action of the form

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( \alpha (\Phi ){{\mathcal {R}}}+\beta (\Phi ){{\mathcal {R}}'}\nonumber \right. \\&\left. + \Delta (\Phi ,{{\mathcal {R}}}, {{\mathcal {R}}'})+\Sigma (\Phi ,{{\mathcal {D}}}\Phi )\right) , \end{aligned}$$
(4.16)

where \(\alpha \) and \(\beta \) are functions of \(\Phi \). Also, for simplicity, we take \(\Phi \) independent of the curvature and covariant derivatives built with the LC connection and \(\Sigma \) independent of \({{\mathcal {D}}}g_{\mu \nu }\). This is clearly a particular case of (4.1).

4.2.1 Dynamical pseudoscalaron \(\zeta '\)

Let us now provide explicit examples of the most interesting case where \(\zeta '\) is dynamical and explicitly compute its kinetic and potential terms.

To simplify the calculation of the kinetic and potential terms of \(\zeta '\) here we also assume that \(\Delta \) is independent of \({{\mathcal {R}}}\) and that there are no matter fields \(\{\phi , \psi , A_\mu ^I\}\), so that we can drop \(\Sigma \) and write

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left( \alpha {{\mathcal {R}}}+\beta {{\mathcal {R}}'} + \Delta ({{\mathcal {R}}'})\right) \nonumber \\= & {} \int d^4x\sqrt{-g}\left[ \alpha {{\mathcal {R}}}+\left( \beta +\frac{\partial \Delta }{\partial \zeta '}(\zeta ')\right) {{\mathcal {R}}'} + \Delta (\zeta ') \right. \nonumber \\&\left. -\zeta '\frac{\partial \Delta }{\partial \zeta '}(\zeta ')\right] \end{aligned}$$
(4.17)

having required, again without loss of generality, \(\frac{\partial ^2\Delta }{\partial \zeta '^2}\ne 0\). The quantities \(\alpha \) and \(\beta \) are real parameters here; we will shortly identify \(\alpha = M^2_P/2\) so we also have to assume \(\alpha >0\); the ratio \(M_{P}^2/(4\beta )\) is also known as the Barbero–Immirzi parameter. In this case, unlike those discussed in Sect. 3.2, it is not possible to have the quantities in front of both \({{\mathcal {R}}}\) and \({{\mathcal {R}}'}\) constant after a metric rescaling and \(\zeta '\) becomes dynamical. Indeed, using (2.10) and (2.11) and integrating out \(C_{\mu ~\sigma }^{~\,\rho }\) leads to (see Appendix A)

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left[ \alpha R-K(\zeta ')\frac{(\partial \zeta ')^2}{2} -U(\zeta ') \right] , \end{aligned}$$
(4.18)

where we have defined

$$\begin{aligned}&K(\zeta ') = \frac{24M_{P}^2}{1+16 B^2(\zeta ')}\left( \frac{\partial B}{\partial \zeta '}\right) ^2, \qquad \nonumber \\&B(\zeta ')= \frac{\beta +\frac{\partial \Delta }{\partial \zeta '}(\zeta ')}{M_P^2},\qquad \nonumber \\&U(\zeta ') = \zeta '\frac{\partial \Delta }{\partial \zeta '}(\zeta ')-\Delta (\zeta '). \end{aligned}$$
(4.19)

This is a standard Einstein–Hilbert action plus kinetic and potential terms for an ordinary matter field. So we have to identify

$$\begin{aligned} \alpha =\frac{M_{P}^2}{2}. \end{aligned}$$
(4.20)

Note that B has to depend non-trivially on \(\zeta '\) because of \(\frac{\partial ^2\Delta }{\partial \zeta '^2}\ne 0\). The second term in (4.18) is a kinetic term of \(\zeta '\), which is therefore dynamical. Note that \(K(\zeta ')\) is always positive, so \(\zeta '\) is never a ghost. We can render the kinetic term of this dynamical scalar canonical through the field redefinition

$$\begin{aligned} \omega (\zeta ') = \int _0^{\zeta '} dx \sqrt{K(x)}. \end{aligned}$$
(4.21)

Indeed, calling \(\zeta '(\omega )\) the inverse function, which is uniquely defined because \(\frac{d\omega }{d\zeta '} = \sqrt{K} > 0\), and inserting in (4.18) one obtains

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left[ \frac{M_P^2}{2} R-\frac{(\partial \omega )^2}{2} -U(\zeta '(\omega )) \right] . \end{aligned}$$
(4.22)
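
As a quick check of the step from (4.18) to (4.22), the chain rule applied to (4.21) gives

$$\begin{aligned} \partial _\mu \omega = \frac{d\omega }{d\zeta '}\,\partial _\mu \zeta ' = \sqrt{K(\zeta ')}\,\partial _\mu \zeta ' \quad \Longrightarrow \quad (\partial \omega )^2 = K(\zeta ')\,(\partial \zeta ')^2, \end{aligned}$$

so the non-canonical kinetic term in (4.18) becomes \(-(\partial \omega )^2/2\), while the potential is simply re-expressed through the inverse function \(\zeta '(\omega )\).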

The provided examples where \(\zeta '\) is dynamical are very interesting because, as mentioned above, \({{\mathcal {R}}'}\) is non-vanishing only when \(C_{\mu ~\sigma }^{~\,\rho }\) is present; so in these cases the distorsion has a scalar dynamical component. Given the relevance of this case we look for a general expression for the mass of \(\zeta '\) (defined as the mass of the fluctuations of \(\zeta '\) around a Lorentz invariant solution). First note that a Lorentz invariant stationary point of S with respect to \(\zeta '\) has to be a stationary point of \(\Delta -\zeta '\frac{\partial \Delta }{\partial \zeta '}\), that is a solution of

$$\begin{aligned} \zeta '\frac{\partial ^2 \Delta }{\partial \zeta '^2} = 0. \end{aligned}$$
(4.23)

But \(\frac{\partial ^2\Delta }{\partial \zeta '^2}\ne 0\) so the only Lorentz invariant stationary point is \(\zeta ' =0\). This can be understood observing that the field equations fix \({{\mathcal {R}}'}=\zeta '\) and Lorentz invariance requires \(C_{\mu ~\sigma }^{~\,\rho }=0\), which implies \({{\mathcal {R}}'}=0\) according to Eq. (2.11). Note that Lorentz invariance also requires \(\Delta -\zeta '\frac{\partial \Delta }{\partial \zeta '}=0\) and so, using \(\zeta '=0\), one obtains \(\Delta (0)=0\). To compute the mass of \(\zeta '\) around \(\zeta '=0\) we can focus on the part of the Lagrangian in (4.18) that is quadratic in \(\zeta '\),

$$\begin{aligned} -\frac{24M_{P}^2}{(1+16 B^2(0))}\left( \frac{\partial B}{\partial \zeta '}(0)\right) ^2\frac{(\partial \zeta ')^2}{2} -\frac{1}{2} \frac{\partial ^2 \Delta }{\partial \zeta '^2}(0) \zeta '^2.\nonumber \\ \end{aligned}$$
(4.24)

So the squared mass of \(\zeta '\) is

$$\begin{aligned} m^2_{\zeta '} = \frac{(1+16 B^2(0))\frac{\partial ^2 \Delta }{\partial \zeta '^2}(0)}{24M_{P}^2\left( \frac{\partial B}{\partial \zeta '}(0)\right) ^2}. \end{aligned}$$
(4.25)

We observe that \(m^2_{\zeta '} \ne 0\) as a consequence of \(\frac{\partial ^2\Delta }{\partial \zeta '^2}\ne 0\), which also implies \(\frac{\partial B}{\partial \zeta '}\ne 0\), so that the denominator in (4.25) never vanishes. The requirement that \(\zeta '\) is not a tachyon leads to the condition \(\frac{\partial ^2 \Delta }{\partial \zeta '^2}(0)> 0\).
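
The step from (4.24) to (4.25) can be spelled out as follows (the rescaled field \(\chi \) is an auxiliary symbol introduced here only for this check). Setting \(\chi \equiv \sqrt{K(0)}\,\zeta '\), with \(K(0)\) the constant coefficient of the kinetic term in (4.24), one has

$$\begin{aligned} -K(0)\frac{(\partial \zeta ')^2}{2} -\frac{1}{2} \frac{\partial ^2 \Delta }{\partial \zeta '^2}(0)\, \zeta '^2 = -\frac{(\partial \chi )^2}{2}-\frac{1}{2}\,\frac{\frac{\partial ^2 \Delta }{\partial \zeta '^2}(0)}{K(0)}\,\chi ^2 \quad \Longrightarrow \quad m^2_{\zeta '}=\frac{\frac{\partial ^2 \Delta }{\partial \zeta '^2}(0)}{K(0)}, \end{aligned}$$

which coincides with (4.25) once the expression of K in (4.19) is inserted.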

The potential \(U(\zeta '(\omega ))\) can only be explicitly computed once the function \(\Delta \) is specified. Let us consider, for example, \(\Delta ({{\mathcal {R}}'}) = c {{\mathcal {R}}'}^2\) (see Footnote 8), where c is a positive constant (so that \(\zeta '\) is not a tachyon). In this case we obtain

$$\begin{aligned}&B(\zeta ') = \frac{\beta +2c\zeta '}{M_{P}^2},\quad \frac{\partial B}{\partial \zeta '} =\frac{2c}{M_{P}^2}, \quad \nonumber \\&K(\zeta ') =\frac{96 c^2}{M_{P}^2 \left[ 1+\frac{16 (2c \zeta '+\beta )^2}{M_{P}^4}\right] }, \quad U(\zeta ') =c\zeta '^2 \end{aligned}$$
(4.26)

and so

$$\begin{aligned} m^2_{\zeta '} = \frac{(1+16 \beta ^2/M_{P}^4)}{48 c}M_{P}^2 > 0. \end{aligned}$$
(4.27)
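
For completeness, the intermediate arithmetic behind (4.27) is sketched here: inserting \(B(0)=\beta /M_P^2\), \(\frac{\partial B}{\partial \zeta '}=2c/M_P^2\) and \(\frac{\partial ^2\Delta }{\partial \zeta '^2}=2c\) into (4.25) gives

$$\begin{aligned} m^2_{\zeta '} = \frac{\left( 1+\frac{16 \beta ^2}{M_{P}^4}\right) 2c}{24M_{P}^2\left( \frac{2c}{M_{P}^2}\right) ^2} = \frac{\left( 1+\frac{16 \beta ^2}{M_{P}^4}\right) 2c\,M_{P}^2}{96\,c^2} = \frac{\left( 1+\frac{16 \beta ^2}{M_{P}^4}\right) }{48 c}M_{P}^2. \end{aligned}$$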

In this simple quadratic case, by using the expression of K in (4.26) one obtains

$$\begin{aligned}&\omega (\zeta ')= \sqrt{\frac{3}{2}} M_{P}\left[ \tanh ^{-1}\left( \frac{4 B(\zeta ')}{\sqrt{1+16 B(\zeta ')^2}}\right) \nonumber \right. \\&\left. \quad -\tanh ^{-1}\left( \frac{4 \beta }{\sqrt{M_{P}^4+16 \beta ^2}}\right) \right] . \end{aligned}$$
(4.28)

By inverting this function one then finds

$$\begin{aligned} \zeta '(\omega ) = \frac{1}{2c}\left( \frac{M_{P}^2 \tanh X(\omega )}{4\sqrt{1-\tanh ^2X(\omega )}}-\beta \right) , \end{aligned}$$
(4.29)

where

$$\begin{aligned} X(\omega )\equiv \sqrt{\frac{2}{3}}\frac{\omega }{M_{P}}+\tanh ^{-1}\left( \frac{4 \beta }{\sqrt{16 \beta ^2+M_{P}^4}}\right) \end{aligned}$$
(4.30)

and the potential is

$$\begin{aligned} U(\zeta '(\omega )) = c\zeta '(\omega )^2=\frac{1}{4c}\left( \frac{M_{P}^2 \tanh X(\omega )}{4\sqrt{1-\tanh ^2X(\omega )}}-\beta \right) ^2.\nonumber \\ \end{aligned}$$
(4.31)
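
The integral leading to (4.28) and its inversion can be sketched as follows; this is a reconstruction of the intermediate steps under the assumption \(c>0\) (so that \(\partial B/\partial \zeta '>0\)), not a quotation of the original derivation. Changing the integration variable from \(\zeta '\) to B in (4.21),

$$\begin{aligned} \omega (\zeta ') = \sqrt{24}\,M_{P}\int _{B(0)}^{B(\zeta ')} \frac{dB}{\sqrt{1+16B^2}} = \sqrt{\frac{3}{2}}\, M_{P}\left[ \sinh ^{-1}\left( 4B(\zeta ')\right) -\sinh ^{-1}\left( \frac{4\beta }{M_{P}^2}\right) \right] , \end{aligned}$$

and (4.28) follows from the identity \(\sinh ^{-1}x=\tanh ^{-1}\left( x/\sqrt{1+x^2}\right) \). Conversely, with X defined in (4.30),

$$\begin{aligned} 4B(\zeta ')=\sinh X(\omega ) = \frac{\tanh X(\omega )}{\sqrt{1-\tanh ^2X(\omega )}}, \qquad \zeta ' = \frac{M_{P}^2 B(\zeta ')-\beta }{2c}, \end{aligned}$$

which together reproduce (4.29).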

We see that the condition \(c>0\), which ensures \(m^2_{\zeta '}>0\), also ensures that the potential is bounded from below. The function \(\zeta '(\omega )\) at large field values is (using \((1-\tanh ^2(x))\exp (2x)\rightarrow 4\) as \(x\rightarrow \infty \))

$$\begin{aligned} \zeta ' (\omega )= \frac{M_{P}^2}{16 c} \text {sign}(\omega )\exp \left( \sqrt{\frac{2}{3}}\frac{|\omega |}{M_{P}}\right) , \quad (|\omega |\gg M_{P})\qquad \end{aligned}$$
(4.32)

and one obtains an exponential potential:

$$\begin{aligned} U(\zeta '(\omega )) = \frac{M_{P}^4}{256 c} \exp \left( \sqrt{\frac{8}{3}}\frac{|\omega |}{M_{P}}\right) , \quad (|\omega |\gg M_{P}). \end{aligned}$$
(4.33)

On the other hand, at small field values

$$\begin{aligned} \zeta ' (\omega )= \frac{m_{\omega }\omega }{\sqrt{2c}}, \quad U(\zeta '(\omega )) = \frac{m_{\omega }^2\omega ^2}{2} \quad (|\omega |\ll M_{P}), \end{aligned}$$
(4.34)

where \(m_{\omega } = m_{\zeta '}\). For intermediate values of \(\omega \) the potential is shown in Fig. 1. We note that the behavior in the intermediate region, unlike the one at large field values, depends crucially on the Barbero–Immirzi parameter. The plots also show the invariance of the potential under \(\{\omega , \beta \}\rightarrow \{-\omega , -\beta \}\) which can be analytically understood from Eqs. (4.30) and (4.31).
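
The invariance under \(\{\omega , \beta \}\rightarrow \{-\omega , -\beta \}\) can be made explicit with a short check (a sketch using only (4.30) and (4.31)). Since \(\tanh ^{-1}\) is an odd function, X changes sign under this transformation and so does the quantity in brackets in (4.31):

$$\begin{aligned} X\rightarrow -X \quad \Longrightarrow \quad \frac{M_{P}^2 \tanh X}{4\sqrt{1-\tanh ^2X}}-\beta \;\rightarrow \; -\left( \frac{M_{P}^2 \tanh X}{4\sqrt{1-\tanh ^2X}}-\beta \right) . \end{aligned}$$

The potential depends on this quantity only through its square and is therefore unchanged.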

Fig. 1 Potential of the canonically normalized pseudoscalaron (for \(\Delta ({{\mathcal {R}}'}) = c {{\mathcal {R}}'}^2\)) multiplied by c. Left plots: positive values of \(\beta \). Right plots: negative values of \(\beta \)

4.2.2 Dynamical combination of \(\zeta \) and \(\zeta '\)

In general, for actions of the form (4.16) a combination of \(\zeta \) and \(\zeta '\) can be dynamical. Although a dynamical combination of \(\zeta \) and \(\zeta '\) is not an unambiguous sign of dynamical distorsion (as \(\zeta \) is sourced not only by the distorsion, but by the metric too, see Eq. (2.10)), here we explicitly compute the kinetic and potential terms of such dynamical combination in simple and illuminating cases. We do so in order to compare them with the most interesting case where the distorsion field \(\zeta '\) is clearly dynamical, which we have analyzed in Sect. 4.2.1.

As an example, we first consider the case where \(\Delta \) depends on \({{\mathcal {R}}}\) and \({{\mathcal {R}}'}\) only through a combination \(a(\Phi ) {{\mathcal {R}}} + b(\Phi ) {{\mathcal {R}}'}\) that is linearly independent of \(\alpha (\Phi ) {{\mathcal {R}}} + \beta (\Phi ) {{\mathcal {R}}'}\). This linear independence is assumed in order not to fall into the cases examined in Sect. 3.2, which have been proved not to contain extra degrees of freedom besides the metric and the matter fields \(\{\phi , \psi , A_\mu ^I\}\). Let us assume again, for simplicity, that these matter fields are absent so that

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}(\alpha {{\mathcal {R}}} + \beta {{\mathcal {R}}'} + \Delta (a {{\mathcal {R}}} + b {{\mathcal {R}}'})) \nonumber \\= & {} \int d^4x\sqrt{-g}\left[ \left( \alpha +a\frac{\partial \Delta }{\partial z}(z)\right) {{\mathcal {R}}} + \left( \beta +b\frac{\partial \Delta }{\partial z}(z)\right) {{\mathcal {R}}'} \nonumber \right. \\&\left. + \Delta (z) -z \frac{\partial \Delta }{\partial z}(z)\right] , \end{aligned}$$
(4.35)

where in the second step we introduced the auxiliary field z and we assumed, again without loss of generality, \(\frac{\partial ^2\Delta }{\partial z^2}(z)\ne 0\). The two functions in front of \({{\mathcal {R}}}\) and \({{\mathcal {R}}'}\) can only be proportional to each other when \(\frac{\partial \Delta }{\partial z}\) is constant (which is not compatible with \(\frac{\partial ^2\Delta }{\partial z^2}(z)\ne 0\)) and/or when \(\{a,b\}\) and \(\{\alpha ,\beta \}\) are linearly dependent (which has been excluded in this case). So it is not possible to remove both functions with a rescaling of the metric \(g_{\mu \nu }\rightarrow \Omega ^2 g_{\mu \nu }\). We can, however, convert the function in front of \({{\mathcal {R}}}\) into \(M_{P}^2/2\) by choosing

$$\begin{aligned} \Omega ^2(z) = \frac{M_{P}^2}{2(\alpha +a\frac{\partial \Delta }{\partial z}(z))}, \end{aligned}$$
(4.36)

whenever \(\alpha +a\frac{\partial \Delta }{\partial z}(z)> 0\), which we assume from now on. After this metric rescaling

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left[ \frac{M_{P}^2}{2}{{\mathcal {R}}} +M_{P}^2B(z){{\mathcal {R}}'} - U(z)\right] , \end{aligned}$$
(4.37)

where

$$\begin{aligned} B(z) = \frac{\beta + b\frac{\partial \Delta }{\partial z}(z)}{2(\alpha +a\frac{\partial \Delta }{\partial z}(z))}, \quad U(z) = \frac{M_{P}^4(z\frac{\partial \Delta }{\partial z}(z)-\Delta (z))}{4(\alpha +a\frac{\partial \Delta }{\partial z}(z))^2}.\nonumber \\ \end{aligned}$$
(4.38)
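
The origin of the factors in (4.37) and (4.38) can be sketched as follows. The scaling of \({{\mathcal {R}}'}\) written below is an assumption inferred from consistency with (4.37) and (4.38): at fixed connection, both \({{\mathcal {R}}}\) and \({{\mathcal {R}}'}\) are taken to scale as \(\Omega ^{-2}\) under \(g_{\mu \nu }\rightarrow \Omega ^2 g_{\mu \nu }\), while \(\sqrt{-g}\rightarrow \Omega ^4\sqrt{-g}\). Then

$$\begin{aligned} \sqrt{-g}\,{{\mathcal {R}}}\rightarrow \Omega ^2\sqrt{-g}\,{{\mathcal {R}}},\qquad \sqrt{-g}\,{{\mathcal {R}}'}\rightarrow \Omega ^2\sqrt{-g}\,{{\mathcal {R}}'},\qquad \sqrt{-g}\left( \Delta -z \frac{\partial \Delta }{\partial z}\right) \rightarrow \Omega ^4\sqrt{-g}\left( \Delta -z \frac{\partial \Delta }{\partial z}\right) , \end{aligned}$$

and with \(\Omega ^2\) given in (4.36) one finds

$$\begin{aligned} \Omega ^2\left( \alpha +a\frac{\partial \Delta }{\partial z}\right) =\frac{M_{P}^2}{2},\qquad \Omega ^2\left( \beta +b\frac{\partial \Delta }{\partial z}\right) =M_{P}^2B(z),\qquad \Omega ^4\left( z\frac{\partial \Delta }{\partial z}-\Delta \right) =U(z). \end{aligned}$$

As a consistency check, for \(a=0\), \(b=1\) and \(\alpha =M_P^2/2\) these expressions reduce to B and U in (4.19), with z playing the role of \(\zeta '\).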

By using again (2.10) and (2.11) and integrating out \(C_{\mu ~\sigma }^{~\,\rho }\) as we did in Sect. 4.2.1 we obtain

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left[ \frac{M_P^2}{2} R-K(z)\frac{(\partial z)^2}{2} -U(z) \right] , \end{aligned}$$
(4.39)

where

$$\begin{aligned} K(z) = \frac{24M_{P}^2}{1+16 B^2(z)}\left( \frac{\partial B}{\partial z}\right) ^2. \end{aligned}$$
(4.40)

It is easy to show that \(\frac{\partial B}{\partial z}\ne 0\) when \(\frac{\partial ^2\Delta }{\partial z^2}(z)\ne 0\) and \(\{a,b\}\) and \(\{\alpha ,\beta \}\) are linearly independent. So z has a non-vanishing kinetic term and is thus a dynamical field in this case. Also, K(z) is always positive, so z is never a ghost. As before, we can render the kinetic term of this dynamical scalar canonical through the redefinition \(\omega (z)\) in (4.21) and express the action in terms of \(\omega \) as in (4.22).
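
The statement that \(\frac{\partial B}{\partial z}\) does not vanish can be verified directly by differentiating the first expression in (4.38); the following short computation is added as an explicit check:

$$\begin{aligned} \frac{\partial B}{\partial z} = \frac{b\frac{\partial ^2\Delta }{\partial z^2}\left( \alpha +a\frac{\partial \Delta }{\partial z}\right) -a\frac{\partial ^2\Delta }{\partial z^2}\left( \beta +b\frac{\partial \Delta }{\partial z}\right) }{2\left( \alpha +a\frac{\partial \Delta }{\partial z}\right) ^2} = \frac{(\alpha b-a\beta )\frac{\partial ^2\Delta }{\partial z^2}}{2\left( \alpha +a\frac{\partial \Delta }{\partial z}\right) ^2}. \end{aligned}$$

The numerator is the product of \(\frac{\partial ^2\Delta }{\partial z^2}\) and \(\alpha b-a\beta \), and the latter is non-zero precisely when \(\{a,b\}\) and \(\{\alpha ,\beta \}\) are linearly independent.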

Let us now determine the mass of z (defined as the mass of the fluctuations of z around a Lorentz invariant solution). By construction, on a solution of the field equations \(z = a {{\mathcal {R}}}+b {{\mathcal {R}}'}\), as can be easily checked from (4.35), so at a Lorentz invariant stationary point \(z=0\) (see Eqs. (2.10) and (2.11)). Note that Lorentz invariance also requires \(U(0)=0\) and so, using the second expression in (4.38), also \(\Delta (0)=0\) and

$$\begin{aligned} \frac{\partial U}{\partial z}(0)=0,\qquad \frac{\partial ^2U}{\partial z^2}(0)=\frac{M_{P}^4 \frac{\partial ^2\Delta }{\partial z^2}(0)}{4 \left( \alpha +a \frac{\partial \Delta }{\partial z}(0)\right) ^2}. \end{aligned}$$
(4.41)

Expanding the action in (4.39) at quadratic order in z we then easily obtain the squared mass of z:

$$\begin{aligned} m_z^2 = \frac{M_{P}^2(1+16 B^2(0))\frac{\partial ^2 \Delta }{\partial z^2}(0)}{96\left( \alpha +a\frac{\partial \Delta }{\partial z}(0)\right) ^2\left( \frac{\partial B}{\partial z}(0)\right) ^2}. \end{aligned}$$
(4.42)

Given the assumptions we have made, \(m_z^2\) is always finite and non-vanishing. It is also positive for \(\frac{\partial ^2 \Delta }{\partial z^2}(0)> 0\), which is then the condition for z not to be a tachyon.
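
The derivatives in (4.41) and the mass (4.42) can be obtained as follows; the shorthands \(n(z)\) and \(d(z)\) are introduced here only for this sketch. Writing \(U=M_P^4\, n/(4d^2)\) with \(n\equiv z\frac{\partial \Delta }{\partial z}-\Delta \) and \(d\equiv \alpha +a\frac{\partial \Delta }{\partial z}\), so that \(\frac{\partial n}{\partial z}=z\frac{\partial ^2\Delta }{\partial z^2}\) and \(\frac{\partial d}{\partial z}=a\frac{\partial ^2\Delta }{\partial z^2}\), one finds

$$\begin{aligned} \frac{\partial U}{\partial z} = \frac{M_{P}^4\,\frac{\partial ^2\Delta }{\partial z^2}\left( z\,d-2a\,n\right) }{4d^3}, \qquad \frac{\partial }{\partial z}\left( z\,d-2a\,n\right) = d-a\,z\frac{\partial ^2\Delta }{\partial z^2}. \end{aligned}$$

At \(z=0\), where \(n(0)=-\Delta (0)=0\), the first derivative vanishes and only the term in which the derivative acts on \(z\,d-2a\,n\) survives in the second derivative, giving (4.41). The squared mass then follows from the same rescaling argument used for (4.25):

$$\begin{aligned} m_z^2=\frac{\frac{\partial ^2U}{\partial z^2}(0)}{K(0)}. \end{aligned}$$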

The potential of z can only be computed once we specify the function \(\Delta \). As an example, we take now a quadratic function like we did in Sect. 4.2.1, \(\Delta (z) = c z^2\), where c is a positive constant (so that z is not a tachyon). In this case we obtain

$$\begin{aligned}&B(z) = \frac{\beta +2 b c z}{ 2\alpha +4 a c z},\quad U(z) =\frac{c M_{P}^4 z^2}{4 (\alpha +2 a c z)^2}, \nonumber \\&\frac{\partial B}{\partial z} =\frac{c (\alpha b-a \beta )}{(\alpha +2 a c z)^2}, \quad \nonumber \\&K(z) = \frac{24 c^2 M_{P}^2 (\alpha b-a \beta )^2}{(\alpha +2 a c z )^4 \left[ 1+\frac{4 (\beta +2 b c z)^2}{(\alpha +2 a c z)^2}\right] }, \quad \nonumber \\&\quad m^2_z = \frac{M_{P}^2 \left( \alpha ^2+4 \beta ^2\right) }{48 c (\alpha b-a \beta )^2} > 0. \end{aligned}$$
(4.43)

Note that the quantity \(\alpha b-a \beta \) never vanishes because \(\{a,b\}\) and \(\{\alpha ,\beta \}\) have been assumed to be linearly independent. In this case the potential U(z) is asymptotically flat at large z, unlike the \(U(\zeta ')\) considered in Sect. 4.2.1 at large \(\zeta '\). However, expressing z in terms of B through the first equation in (4.43) to find U as a function of B we obtain

$$\begin{aligned} U = \frac{M_{P}^4 (2 \alpha B-\beta )^2}{16 c (\alpha b-a \beta )^2}, \end{aligned}$$
(4.44)

which is, surprisingly, the same potential as the one in (4.26) once we express \(\zeta '\) in terms of B and we redefine the parameters appropriately. Given that the kinetic term of B is also the same (see the first expression in (4.19) and (4.40)) this scalar–tensor theory is precisely the same as the one of Sect. 4.2.1, which features a dynamical distorsion.
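
The parameter redefinition mentioned above can be made explicit. The symbols \({\tilde{\beta }}\) and \({\tilde{c}}\) are introduced here only for this sketch. In Sect. 4.2.1, inverting the first expression in (4.26) gives \(\zeta '=(M_P^2B-\beta )/(2c)\), so that

$$\begin{aligned} U=c\,\zeta '^2=\frac{(M_{P}^2B-\beta )^2}{4c}. \end{aligned}$$

On the other hand, (4.44) can be rewritten as

$$\begin{aligned} U = \frac{M_{P}^4 (2 \alpha B-\beta )^2}{16 c (\alpha b-a \beta )^2} = \frac{(M_{P}^2B-{\tilde{\beta }})^2}{4{\tilde{c}}}, \qquad {\tilde{\beta }}\equiv \frac{M_{P}^2\beta }{2\alpha },\qquad {\tilde{c}}\equiv \frac{c\,(\alpha b-a \beta )^2}{\alpha ^2}, \end{aligned}$$

which has the same functional form with \(\beta \rightarrow {\tilde{\beta }}\) and \(c\rightarrow {\tilde{c}}\).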

Let us consider now another example. A combination of \(\zeta \) and \(\zeta '\) can be dynamical for actions of the form (4.16) also when the Hessian matrix of \(\Delta \) (with respect to \({{\mathcal {R}}}\) and \({{\mathcal {R}}'}\)) is not singular. This example, as we will see, is a bit more complicated to analyze, but it can be considered as a more generic case: \(\Delta \) can be expected to depend on both \({{\mathcal {R}}}\) and \({{\mathcal {R}}'}\) rather than on a specific linear combination of them. To illustrate how a kinetic term can emerge we take again the simple case where there are no matter fields \(\{\phi , \psi , A_\mu ^I\}\) so that, introducing the two auxiliary fields \(\zeta \) and \(\zeta '\), we can write

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left[ \left( \alpha +\frac{\partial \Delta }{\partial \zeta }(\zeta ,\zeta ')\right) {{\mathcal {R}}}\nonumber \right. \\&+\left( \beta +\frac{\partial \Delta }{\partial \zeta '}(\zeta ,\zeta ')\right) {{\mathcal {R}}'} \nonumber \\&\left. + \Delta (\zeta ,\zeta ')-\zeta \frac{\partial \Delta }{\partial \zeta }(\zeta ,\zeta ')-\zeta '\frac{\partial \Delta }{\partial \zeta '}(\zeta ,\zeta ')\right] . \end{aligned}$$
(4.45)

By performing again a local rescaling of the metric \(g_{\mu \nu }\rightarrow \Omega ^2 g_{\mu \nu }\) with

$$\begin{aligned} \Omega ^2 = \frac{M_{P}^2}{2 \left( \alpha +\frac{\partial \Delta }{\partial \zeta }(\zeta ,\zeta ')\right) } \end{aligned}$$
(4.46)

(having assumed \(\alpha +\frac{\partial \Delta }{\partial \zeta }(\zeta ,\zeta ')>0\)) we obtain

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left[ \frac{M_{P}^2}{2} {{\mathcal {R}}}+ M_{P}^2 B(\zeta ,\zeta ') {{\mathcal {R}}'} -U(\zeta ,\zeta ')\right] ,\nonumber \\ \end{aligned}$$
(4.47)

where this time

$$\begin{aligned} B(\zeta ,\zeta ')= & {} \frac{\beta +\frac{\partial \Delta }{\partial \zeta '}(\zeta ,\zeta ')}{2(\alpha +\frac{\partial \Delta }{\partial \zeta }(\zeta ,\zeta '))},\\ U(\zeta ,\zeta ')= & {} \frac{M_{P}^4}{4\left( \alpha +\frac{\partial \Delta }{\partial \zeta }(\zeta ,\zeta ')\right) ^2}\left( \zeta \frac{\partial \Delta }{\partial \zeta }(\zeta ,\zeta ')\right. \\&\left. +\zeta '\frac{\partial \Delta }{\partial \zeta '}(\zeta ,\zeta ')- \Delta (\zeta ,\zeta ')\right) . \end{aligned}$$

Note that B generically depends on both \(\zeta \) and \(\zeta '\). By using (2.10) and (2.11) and integrating out \(C_{\mu ~\sigma }^{~\,\rho }\) as we did in Sect. 4.2.1 we obtain

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left\{ \frac{M_P^2}{2} R-K(B(\zeta ,\zeta '))\frac{(\partial B)^2}{2}-U(\zeta ,\zeta ') \right\} ,\nonumber \\ \end{aligned}$$
(4.48)

where

$$\begin{aligned} K(B)= \frac{24M_{P}^2}{(1+16 B^2)}. \end{aligned}$$
(4.49)

Therefore, the field \(B(\zeta ,\zeta ')\) is the dynamical combination of \(\zeta \) and \(\zeta '\). Since K(B) is always positive, B is never a ghost.

In order to compute the potential of B we need to integrate out the other independent combination of \(\zeta \) and \(\zeta '\) that is not dynamical. We can do so by imposing that U is stationary with respect to variations of \(\zeta \) and \(\zeta '\) with constant \(B(\zeta ,\zeta ')\). Calling b such constant value, when \(\zeta \) is varied \(\zeta '\) must equal \(\zeta '_b(\zeta )\), which is the function of \(\zeta \) such that \(B(\zeta , \zeta '_b(\zeta )) = b\). Assuming that \(\zeta '_b(\zeta )\) is a single-valued differentiable function, the condition that U is stationary with respect to variations of \(\zeta \) and \(\zeta '\) with constant \(B(\zeta ,\zeta ')\) can be expressed as follows

$$\begin{aligned} \left. \frac{\partial U}{\partial \zeta }+ \frac{\partial U}{\partial \zeta '}\frac{d\zeta '_b}{d\zeta }\right| _{b=B(\zeta ,\zeta ')}=0. \end{aligned}$$
(4.50)

Imposing this constraint on \(\zeta \) and \(\zeta '\) integrates out the other non-dynamical scalar and allows us to express U in terms of B only. The resulting action is

$$\begin{aligned} S = \int d^4x\sqrt{-g}\left\{ \frac{M_P^2}{2} R-K(B)\frac{(\partial B)^2}{2}-U(B) \right\} . \end{aligned}$$
(4.51)

Once again, we can render the kinetic term of this dynamical scalar canonical through the redefinition \(\omega (B)\) in (4.21) and express the action in terms of \(\omega \) as we did in (4.22).
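For orientation, the canonical field can be reconstructed directly from (4.49). This is a sketch rather than a quotation of (4.21): the overall sign and the integration constant, fixed here by \(\omega (0)=0\), are assumptions. Requiring \(d\omega /dB=\sqrt{K(B)}\) gives

$$\begin{aligned} \omega (B) = \int _0^B \frac{\sqrt{24}\, M_{P}}{\sqrt{1+16 {\tilde{B}}^2}}\, d{\tilde{B}} = \sqrt{\frac{3}{2}}\, M_{P}\, \textrm{arcsinh}(4B), \qquad K(B)\frac{(\partial B)^2}{2} = \frac{(\partial \omega )^2}{2}. \end{aligned}$$

Since \(\textrm{arcsinh}\) is monotonic, the map between B and \(\omega \) is invertible for all real B.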

We cannot determine explicitly U(B) until we specify the function \(\Delta \). As an example let us consider the case where \(\Delta \) is a generic quadratic function of \(\zeta \) and \(\zeta '\), namely

$$\begin{aligned} \Delta (\zeta ,\zeta ')=c \zeta ^2+c'\zeta '^2+c_m \zeta \zeta ', \end{aligned}$$

whose Hessian matrix is not singular for \(4 c c'\ne c_m^2\). In this case

$$\begin{aligned}&B(\zeta ,\zeta ')=\frac{\beta +c_m \zeta +2 c' \zeta '}{2 (\alpha +2 c \zeta +c_m \zeta ')}, \qquad \nonumber \\&U(\zeta ,\zeta ')=\frac{M_{P}^4 \left( c \zeta ^2+c' \zeta '^2+c_m\zeta \zeta '\right) }{4 (\alpha +2 c \zeta +c_m\zeta ')^2} \end{aligned}$$
(4.52)

and one finds (for \(b c_m\ne c'\))

$$\begin{aligned} \zeta '_b(\zeta )= & {} \frac{\beta -2 \alpha b+(c_m -4 b c)\zeta }{2 (b c_m-c')}, \end{aligned}$$
(4.53)
$$\begin{aligned}&\left. \frac{\partial U}{\partial \zeta }+ \frac{\partial U}{\partial \zeta '}\frac{d\zeta '_b}{d\zeta }\right| _{b=B(\zeta ,\zeta ')}\nonumber \\= & {} \frac{M_{P}^4 \left( 4 c c'-c_m^2\right) (\alpha \zeta +\beta \zeta ')}{4 (\alpha +2 c \zeta +c_m \zeta ')^2 [2 c' \alpha -c_m \beta +(4 cc'-c_m^2) \zeta ]}. \end{aligned}$$
(4.54)

Since the non-singularity of the Hessian matrix of \(\Delta \) requires \(4 c c'\ne c_m^2\), integrating out the non-dynamical scalar through Eq. (4.50) then gives \(\alpha \zeta =-\beta \zeta '\). This condition, together with \(B=B(\zeta ,\zeta ')\), allows us to express both \(\zeta \) and \(\zeta '\) in terms of B, and the potential of this dynamical scalar reads

$$\begin{aligned} U(B) = \frac{M_{P}^4 (2 \alpha B-\beta )^2}{16 \left[ \beta (\beta c-\alpha c_m)+\alpha ^2 c'\right] }. \end{aligned}$$
(4.55)
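The algebra leading from (4.52) to (4.55) can be cross-checked with exact rational arithmetic. The sketch below is not part of the original derivation: the function names, the choice \(M_P=1\) and the sample parameter values are all assumptions made for illustration. It evaluates B and U from (4.52) on the constraint \(\alpha \zeta =-\beta \zeta '\), compares U with the closed form (4.55), and checks that (4.53) reproduces \(\zeta '\) at fixed B.

```python
from fractions import Fraction as Fr

# Illustrative helpers (names are hypothetical); units with M_P = 1 by default.

def B_of(zeta, zeta_p, alpha, beta, c, c_p, c_m):
    """B(zeta, zeta') for the quadratic Delta, first expression in (4.52)."""
    return (beta + c_m * zeta + 2 * c_p * zeta_p) / (
        2 * (alpha + 2 * c * zeta + c_m * zeta_p))

def U_of(zeta, zeta_p, alpha, beta, c, c_p, c_m, M_P=Fr(1)):
    """U(zeta, zeta') for the quadratic Delta, second expression in (4.52)."""
    delta = c * zeta**2 + c_p * zeta_p**2 + c_m * zeta * zeta_p
    return M_P**4 * delta / (4 * (alpha + 2 * c * zeta + c_m * zeta_p) ** 2)

def zeta_p_at_fixed_B(b, zeta, alpha, beta, c, c_p, c_m):
    """zeta'_b(zeta) from (4.53); requires b * c_m != c'."""
    return (beta - 2 * alpha * b + (c_m - 4 * b * c) * zeta) / (
        2 * (b * c_m - c_p))

def U_closed(B, alpha, beta, c, c_p, c_m, M_P=Fr(1)):
    """Closed-form potential U(B) from (4.55)."""
    denom = 16 * (beta * (beta * c - alpha * c_m) + alpha**2 * c_p)
    return M_P**4 * (2 * alpha * B - beta) ** 2 / denom

def consistent_on_constraint(params, t_values):
    """Check (4.53) and (4.55) at points obeying alpha*zeta = -beta*zeta'."""
    alpha, beta = params[0], params[1]
    for t in t_values:
        zeta_p, zeta = t, -beta * t / alpha
        B = B_of(zeta, zeta_p, *params)
        if U_of(zeta, zeta_p, *params) != U_closed(B, *params):
            return False
        if zeta_p_at_fixed_B(B, zeta, *params) != zeta_p:
            return False
    return True

# Sample values (assumed): alpha, beta, c, c', c_m with 4*c*c' != c_m**2.
SAMPLE = (Fr(1), Fr(2), Fr(1), Fr(3), Fr(1))
print(consistent_on_constraint(SAMPLE, [Fr(1, 4), Fr(1, 10), Fr(-1, 5)]))
```

Exact fractions are used instead of floating point so that the comparison with (4.55) is an equality test rather than a tolerance check. The sample points are chosen so that the denominators in (4.52) and (4.53) do not vanish.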

Again, this is the same potential as the one in (4.26) once we express \(\zeta '\) there in terms of B and redefine the parameters appropriately. As in the previous example, the kinetic term of B is also the same (see the first expression in (4.19) and (4.49)), so this scalar–tensor theory is again precisely the same as the one of Sect. 4.2.1, which features a dynamical distorsion. We then see that this theory is much more general than what we could have imagined from the analysis of Sect. 4.2.1.

One can of course find cases where both \(\zeta \) and \(\zeta '\) are dynamical: for example one can introduce, like in (4.1), a dependence of \(\Sigma \) on C, which is not invariant but transforms inhomogeneously under \(g_{\mu \nu }\rightarrow \Omega ^2g_{\mu \nu }\) for a spacetime-dependent \(\Omega \). But, as observed before, it is only \(\zeta '\) that is directly linked to the distorsion. Since we are interested in a dynamical distorsion we do not explore these further possibilities here and leave them for future work.

4.3 Examples: Poincaré gauge theories coupled to matter

In the most general case the distorsion includes not only scalars and pseudoscalars but also higher-rank tensors, which also lead to spin-3, spin-2 and spin-1 particles (see Ref. [12] for a detailed discussion and a summary of previous works). Here we consider the case of Poincaré gauge theories, also known as Einstein–Cartan theories (see [35, 36] for detailed reviews): the gravitational fields are represented by the tetrads and the connection, which, as we have seen in Sect. 2, has to be metric compatible, i.e. \({{\mathcal {D}}}_\rho g_{\mu \nu } =0\). From the physical point of view this is not a restrictive choice because, as we have seen in Sect. 2, in order to have fermions it is necessary to introduce the tetrads and have a metric-compatible connection. In this case the distorsion coincides with what is known as the contorsion, which can be expressed in terms of the torsion:

$$\begin{aligned} C_{\mu \nu \rho } = \frac{1}{2} (T_{\mu \nu \rho }+T_{\nu \mu \rho }-T_{\mu \rho \nu }), \end{aligned}$$
(4.56)

which is antisymmetric in the second and third indices. From this equation and (2.5) we see that the contorsion vanishes if and only if the torsion does. As we have seen in Sect. 2, the tetrads are defined modulo local Lorentz transformations, which together with the local translations always present in any generally covariant theory, leads to local Poincaré symmetry (hence the name Poincaré gauge theories).
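The two algebraic statements made about (4.56) can be checked numerically. The sketch below assumes that the torsion with all indices lowered is antisymmetric in its first and third indices (this is our reading of the index placement and should be compared with (2.5)); the function names and the random test tensor are purely illustrative. It verifies that the contorsion is then antisymmetric in its second and third indices and that the torsion is recovered as \(T_{\mu \nu \rho }=C_{\mu \nu \rho }-C_{\rho \nu \mu }\), so that one vanishes if and only if the other does.

```python
import itertools
import random
from fractions import Fraction

DIM = 4
TRIPLES = list(itertools.product(range(DIM), repeat=3))

def random_torsion(seed=0):
    """Random T[m, n, r] with T[m, n, r] = -T[r, n, m] (assumed symmetry)."""
    rng = random.Random(seed)
    T = {idx: Fraction(0) for idx in TRIPLES}
    for m, n, r in TRIPLES:
        if m < r:
            value = Fraction(rng.randint(-5, 5))
            T[(m, n, r)] = value
            T[(r, n, m)] = -value
    return T

def contorsion(T):
    """C[m, n, r] = (T[m, n, r] + T[n, m, r] - T[m, r, n]) / 2, as in (4.56)."""
    half = Fraction(1, 2)
    return {(m, n, r): half * (T[(m, n, r)] + T[(n, m, r)] - T[(m, r, n)])
            for m, n, r in TRIPLES}

def torsion_from_contorsion(C):
    """Inverse relation T[m, n, r] = C[m, n, r] - C[r, n, m]."""
    return {(m, n, r): C[(m, n, r)] - C[(r, n, m)] for m, n, r in TRIPLES}

def antisymmetric_in_last_two(C):
    """True if C[m, n, r] = -C[m, r, n] for all index values."""
    return all(C[(m, n, r)] == -C[(m, r, n)] for m, n, r in TRIPLES)

T = random_torsion(seed=1)
C = contorsion(T)
print(antisymmetric_in_last_two(C), torsion_from_contorsion(C) == T)
```

Because the relation between the two tensors is linear and invertible, the check with a single random tensor is only a sanity test; the general statement follows from the same index manipulations done by hand.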

As shown in [37] (see also Refs. [38, 39] for subsequent studies), in Poincaré gauge theories in the absence of matter fields (i.e. without \(\{ \phi , \psi , A_{\mu }^I \}\)) the metric and the connection generically contain three spin-2 fields (one of which corresponds to the ordinary massless graviton), plus four spin-1 and three spin-0 fields (including the fields \(\zeta \) and \(\zeta '\) discussed in Sect. 4.2). The spin-3 field present in the most general case is removed by the condition of metric compatibility. Subsequently, it was shown that the stability of these theories can only occur if the additional spin-2 fields (besides the ordinary graviton) are massive at least in the absence of matter fields [40]. The argument was based on an expansion of the action at the quadratic level in the fluctuations around the flat (Minkowski) spacetime.

If one introduces ordinary matter fields \(\{\phi , \psi , A_\mu ^I\}\) this result does not change as we now show. To see this let us first introduce some scalar or pseudoscalar fields \(\phi \). Since we want to exclude the presence of massless spin-2 fields we take these scalars to be massless because otherwise it would not be possible to construct a quadratic mixing term between them and the massless components of the contorsion. The only possible independent scalar or pseudoscalar terms involving the contorsion and \(\phi \) at the quadratic level and with only one derivative are then

$$\begin{aligned} C_{\mu \nu }^{~~~\mu }\partial ^\nu \phi , \qquad \epsilon ^{\mu \nu \rho \sigma } C_{\mu \nu \rho }\partial _\sigma \phi , \end{aligned}$$
(4.57)

which, of course, can only be constructed with those \(\phi \) fields that are invariant under the gauge group G. The terms in (4.57) are mixing terms between \(\phi \) and a vector field \(C_{\mu \nu }^{~~~\mu }\) and a pseudovector field \(\epsilon ^{\mu \nu \rho \sigma } C_{\mu \nu \rho }\). So they do not affect the spin-2 sector. Actually the quadratic terms in (4.57) even vanish in the massless sector as one can always decompose the above mentioned vector and pseudovector fields into spin-1 fields that are transverse and spin-0 fields whose d’Alembertian is anyhow zero in the massless case. Non-vanishing scalar or pseudoscalar terms with more than one derivative cannot be constructed either as they would unavoidably contain (because \(C_{\mu \nu \rho }\) is antisymmetric in the second and third indices) a d’Alembertian acting on \(\phi \), which vanishes because \(\phi \) are massless fields.
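The vanishing of the first term in (4.57) in the massless sector can be made explicit. The following is a sketch at the quadratic level around flat spacetime; the symbols \(v^T_\nu \) and s are introduced here only for illustration. Writing the vector as a transverse part plus a gradient,

$$\begin{aligned} C_{\mu \nu }^{~~~\mu } = v^T_\nu + \partial _\nu s, \qquad \partial ^\nu v^T_\nu = 0, \end{aligned}$$

and integrating by parts (dropping the boundary term), one finds

$$\begin{aligned} \int d^4x\, C_{\mu \nu }^{~~~\mu }\partial ^\nu \phi = -\int d^4x\, \left( \partial ^\nu C_{\mu \nu }^{~~~\mu }\right) \phi = -\int d^4x\, (\Box s)\, \phi , \end{aligned}$$

which vanishes when \(\Box s = 0\). The pseudovector term can be treated in the same way.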

Similarly, considering gauge fields, one can construct quadratic terms that involve both the contorsion and an Abelian gauge field \(A_\mu \), such as

$$\begin{aligned}&C_{\mu \nu \rho }\partial ^\mu F^{\nu \rho }, \quad C_{\nu \mu \rho } \partial ^\mu F^{\nu \rho },\quad C_{\alpha \nu }^{~~~\alpha } \partial _\mu F^{\mu \nu }, \quad \nonumber \\&\quad \epsilon ^{\mu \nu \rho \sigma }C_{\mu \nu \rho } \partial _\alpha F^\alpha _{~~\sigma }, \quad \epsilon ^{\mu \nu \rho \sigma } C_{\mu \nu \alpha } \partial ^\alpha F_{\rho \sigma }, \quad \ldots , \end{aligned}$$
(4.58)

where \(F_{\mu \nu }\) is the field strength of \(A_\mu \). But it is always possible to choose the gauge in a way that fields with a non-vanishing spin are described by transverse tensors so, recalling that the d’Alembertian of any massless field vanishes, these terms do not modify the spin-2 sector. Of course, with fermions it is not possible to construct terms involving \(C_{\mu \nu \alpha }\) that change the quadratic action because fermions always come in pairs.

We conclude that, even in the presence of matter fields, the argument of [40] holds and the two extra spin-2 fields besides the ordinary graviton must be massive to have a stable theory.

4.3.1 Dark photons from torsion

The vector \(v_\nu \equiv C_{\alpha \nu }^{~~~\alpha }\), and the pseudovector \(p^\sigma \equiv \frac{\epsilon ^{\mu \nu \rho \sigma }}{\sqrt{-g}} C_{\mu \nu \rho }\), that we have already discussed in the previous section, contain spin-1 particles, which can play the role of dark photons of gravitational origin. Dark photons have interesting phenomenology (see e.g. [41, 42]) as they can act as portals to dark sectors.

Note that after integrating by parts the third and fourth terms in (4.58) one obtains mixing kinetic terms between the vector \(v_\mu \) and an Abelian gauge field \(A_\mu \) and between the pseudovector \(p_\mu \) and \(A_\mu \),

$$\begin{aligned} v_{\mu \nu } F^{\mu \nu }, \quad p_{\mu \nu } F^{\mu \nu }, \end{aligned}$$
(4.59)

where

$$\begin{aligned} v_{\mu \nu }\equiv \partial _\mu v_\nu - \partial _\nu v_\mu , \qquad p_{\mu \nu }\equiv \partial _\mu p_\nu - \partial _\nu p_\mu \end{aligned}$$
(4.60)

are the field strengths of \(v_\mu \) and \(p_\mu \). If \(F_{\mu \nu }\) is the electromagnetic field strength the terms (4.59) are mixing terms between the photon and the torsion dark photons. These mixing terms give the possibility of detecting the effect of the dark photons when they are massive [41]. In the massless case interaction terms between the dark photons and the SM fields are necessarily higher dimensional (non-renormalizable) operators [43] that might, however, induce observable effects depending on the size of their coefficients. Such higher dimensional operators are allowed in our EFT approach (generically, the couplings of \(v_\mu \) and \(p_\mu \) in the metric theory depend on the initial metric-affine action [44]).
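The integration by parts that turns the third and fourth terms of (4.58) into (4.59) can be written out as follows. This is a sketch at the quadratic level around flat spacetime, where \(\sqrt{-g}=1\) and boundary terms are dropped; it uses only the antisymmetry of \(F_{\mu \nu }\) and the definitions (4.60):

$$\begin{aligned} \int d^4x\, v_\nu \, \partial _\mu F^{\mu \nu }&= -\int d^4x\, \partial _\mu v_\nu \, F^{\mu \nu } = -\frac{1}{2}\int d^4x\, v_{\mu \nu } F^{\mu \nu }, \\ \int d^4x\, p^\sigma \, \partial ^\alpha F_{\alpha \sigma }&= -\int d^4x\, \partial ^\alpha p^\sigma \, F_{\alpha \sigma } = -\frac{1}{2}\int d^4x\, p_{\alpha \sigma } F^{\alpha \sigma }. \end{aligned}$$

The numerical factors and signs can be absorbed in the coefficients of these operators in the action.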

One sees that theories where the connection carries extra degrees of freedom (besides the metric) generically lead to the existence of (and thus motivate) dark photons. In total there are two dark photons with negative parity and two with positive parity: \(v_\mu \), \(p_\mu \) and two other spin-1 fields (one with positive parity and another one with negative parity) that come from the other independent components of the torsion, as can easily be shown by using the results of [37].

One might think that the torsion spin-1 fields cannot couple to the (pseudo)scalars \(\phi \) because the torsion is part of the full connection (and (pseudo)scalars are invariant under proper orthochronous Poincaré transformations). However, in the most general Poincaré gauge theory we could also include these spin-1 fields in the covariant derivative of \(\phi \) by adding to the action appropriate terms: considering, as an example, \(v_\mu \) such a term would be

$$\begin{aligned}&\int d^4x\sqrt{-g}\left( -\left[ ({{\mathcal {D}}}_\mu +iv_\mu ) \phi \right] ^\dagger ({{\mathcal {D}}}^\mu +iv^\mu ) \phi \nonumber \right. \\&\left. \quad +{{\mathcal {D}}}_\mu \phi ^\dagger {{\mathcal {D}}}^\mu \phi \right) \nonumber \\&\quad =\int d^4x\sqrt{-g}\left( iv_\mu \left[ \phi ^\dagger {{\mathcal {D}}}^\mu \phi -\left( {{\mathcal {D}}}^\mu \phi \right) ^\dagger \phi \right] -v_\mu v^\mu \phi ^\dagger \phi \right) \nonumber \\ \end{aligned}$$
(4.61)

and analogous terms for the other spin-1 fields. It is clear that these terms depend on \(\phi \), \({{\mathcal {D}}}_\mu \phi \) and \(C_{\mu ~\sigma }^{~\,\rho }\) and can, therefore, be included in a function like \(\Sigma (\Phi ,{{\mathcal {D}}}\Phi , C)\) in Eqs. (3.3) and (4.1).

We do not study here cases where the torsion spin-2 fields are dynamical due to standard difficulties when one attempts an extension to a fully covariant theory in the presence of additional spin-2 fields besides the graviton, see e.g. Ref. [40].

4.3.2 Coupling the pseudoscalaron to matter

One of the most interesting components of the distorsion that can be dynamical is the pseudoscalaron \(\zeta '\), which we have discussed in Sect. 4.2. This field is also present in Poincaré gauge theories, because in Sect. 4.2 we have not used that \({{\mathcal {D}}}_\rho g_{\mu \nu }\ne 0\).

In order to illustrate how the pseudoscalaron couples to a generic matter sector let us take an action of the form

$$\begin{aligned}&S = \int d^4x\sqrt{-g}\left[ \alpha (\phi ){{\mathcal {R}}}+\beta (\phi ){{\mathcal {R}}'} +\Delta (\phi ,{{\mathcal {R}}'})\nonumber \right. \\&\left. \quad +\Sigma (\Phi ,{{\mathcal {D}}}\Phi )\right] , \end{aligned}$$
(4.62)

where \(\alpha \), \(\beta \) and \(\Delta \) are generic functions of the (pseudo)scalars \(\phi \), the function \(\Delta \) has an additional dependence on \({{\mathcal {R}}'}\), which has been added to introduce the pseudoscalaron (see Sect. 4.2.1), and

(4.63)

represents the matter Lagrangian, where V is the potential. The coefficients \(Y^a_{ij}\) and \(M_{ij}\) are generic Yukawa couplings and fermion mass parameters. As usual, since we work with Weyl fermions, , where \({{\bar{\sigma }}}^\mu = e^\mu _a {{\bar{\sigma }}}^a\) and \({{\bar{\psi }}}\) represents the (transpose) hermitian conjugate of \(\psi \). All terms are contracted in a gauge-invariant way with respect to both the local Poincaré group and the gauge group G. This action is clearly a particular case of (4.16). This form, despite not being the most general one, is suggested by the structure of the SM (although it also covers, among others, any of its renormalizable extensions) and by the geometrical interpretation of the torsion as part of the full connection: in (4.62) we only use the covariant derivative \({{\mathcal {D}}}\) rather than the one, D, constructed with the Levi-Civita connection or, in other words, \(\Sigma \) does not explicitly depend on the contorsion \(C_{\mu ~\sigma }^{~\,\rho }\). The matter Lagrangian in (4.63) is general enough to accommodate not only all the SM fields but also additional fields needed to describe the current evidence of beyond-the-SM physics (neutrino masses and mixings, dark matter, baryon asymmetry, etc.).

By performing steps similar to those made around Eq. (4.17), the action in (4.62) can be equivalently rewritten as follows

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\left[ \alpha (\phi ){{\mathcal {R}}}+M_{P}^2B(\phi ,\zeta '){{\mathcal {R}}'} + \Delta (\phi ,\zeta ') \right. \nonumber \\&\left. -\zeta '\frac{\partial \Delta }{\partial \zeta '}(\phi ,\zeta ')+\Sigma (\Phi ,{{\mathcal {D}}}\Phi )\right] \end{aligned}$$
(4.64)

having required again, without loss of generality, \(\frac{\partial ^2\Delta }{\partial \zeta '^2}\ne 0\). Here the function \(B(\phi ,\zeta ')\) is

$$\begin{aligned} B(\phi ,\zeta ')=\frac{\beta (\phi )+\frac{\partial \Delta }{\partial \zeta '}(\phi ,\zeta ')}{M_{P}^2}. \end{aligned}$$
(4.65)

In (4.64) the pseudoscalaron \(\zeta '\) appears explicitly, but, as in Sect. 4.2.1, the other torsion components are not dynamical. We can again integrate out the torsion by using the method of Appendix A to find

(4.66)

where the full potential is

$$\begin{aligned} U(\phi ,\zeta ') = V(\phi )-\Delta (\phi ,\zeta ')+\zeta ' \frac{\partial \Delta }{\partial \zeta '}(\phi ,\zeta ') \end{aligned}$$
(4.67)

and \(V_\mu \) is defined by

$$\begin{aligned} V_\mu \equiv M_{P}^2\partial _\mu B(\phi ,\zeta ') +\frac{1}{8} {{\bar{\psi }}}_j {{\bar{\sigma }}}_\mu \psi _j. \end{aligned}$$
(4.68)

Note that \(U(\phi ,\zeta ')\) contains some interactions of \(\zeta '\) with the \(\phi \) fields, e.g. the Higgs. The last line in Eq. (4.66) contains other interactions of \(\zeta '\) as well as its kinetic term, which emerges from the \(V_\mu V^\mu \) term. Note that in this class of theories the pseudoscalaron interacts with \(\phi \) and the fermions \(\psi \), but not with the gauge fields \(A^I_\mu \): this is because the starting action (4.62) does not feature couplings between the torsion and \(A^I_\mu \). The pseudoscalaron here interacts with \(\phi \) through the function \(\alpha \) and the potential U and also has two- and four-fermion interactions.

A commonly encountered case is \(\alpha (\phi ) = M_{P}^2/2+\xi _{kl}\phi _k\phi _l\), where \(\xi _{kl}\) are real coefficients, sometimes called non-minimal couplings. In this case one recovers the standard Einstein–Hilbert action for gravity at small field values, when \(\alpha \simeq M_{P}^2/2\). One can easily compute the interactions in terms of the \(\xi _{kl}\) (including those involving the pseudoscalaron) by expanding \(\alpha (\phi )\) in powers of \(\xi _{kl}\phi _k\phi _l/M_{P}^2\).
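As a sketch of how such an expansion proceeds, define the dimensionless combination x below (a symbol introduced here only for illustration). Inverse powers of \(\alpha \), which typically arise after a rescaling of the metric such as (4.46), then expand as a geometric series, valid for \(|x|<1\):

$$\begin{aligned} x \equiv \frac{2\xi _{kl}\phi _k\phi _l}{M_{P}^2}, \qquad \alpha (\phi ) = \frac{M_{P}^2}{2}(1+x), \qquad \frac{1}{\alpha (\phi )} = \frac{2}{M_{P}^2}\left( 1 - x + x^2 - \cdots \right) . \end{aligned}$$

Each power of x contributes two additional \(\phi \) legs suppressed by \(M_{P}^2\), so at small field values only the first few terms are relevant.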

5 A note on the equivalence principle

In any modification or extension of GR it is natural to ask whether (and to what extent) the equivalence principle holds. It is particularly interesting to answer this question in the context of metric-affine theories as these are gravitational theories constructed starting from the geometrical principle of general covariance.

Let us first recall what the equivalence principle states: for any fixed spacetime point X, it is possible to choose a reference frame (called locally inertial frame) where the laws of physics are those without gravity in a small enough neighbourhood of X.

A first thing one may note is that the equivalence principle is ambiguous if one does not specify what is meant by “the laws of physics without gravity”. In order to eliminate this ambiguity, given our current description of fundamental non-gravitational forces, we understand that the physics without gravity is described by a theory with ordinary matter, such as the one present in the SM and its common extensions. This can feature (pseudo)scalars, gauge fields and fermions, which are enough to account for all matter we observe and address the evidence of beyond-the-SM physics. Massive (pseudo)vector fields, for example, can be modeled by gauge fields and (pseudo)scalars using the Stückelberg or Higgs mechanism. Also note that pseudoscalars and pseudovectors are present in the QCD spectrum and appear in popular SM extensions, such as those featuring an axion. Therefore, the scalar \(\zeta \) and pseudoscalar \(\zeta '\), which we defined in Sect. 4.2, as well as the vector \(v_\mu \) and pseudovector \(p_\mu \) encountered in Sect. 4.3.1 are particular examples of ordinary matter fields. Since, starting from general covariance, gravity is described by the metric \(g_{\mu \nu }\) and the connection \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }\), as discussed in Sect. 2, we conclude that the equivalence principle tells us that in the locally inertial frame \(g_{\mu \nu }(X)=\eta _{\mu \nu }\) and the effect of \({{\mathcal {A}}}_{\mu ~\sigma }^{~\,\rho }(X)\) is indistinguishable from that of such ordinary matter.

Another part of the equivalence principle that calls for clarification is the phrase “small enough”. Following the argument in [18], we interpret it as the requirement that not only \(g_{\mu \nu }(X)=\eta _{\mu \nu }\), but also \(\partial _\rho g_{\mu \nu }(X) = 0\) in the locally inertial frame. With this interpretation the equivalence principle also tells us that \( \Gamma _{\mu ~\sigma }^{~\,\rho }(X) =0\) and the effect of \(C_{\mu ~\sigma }^{~\,\rho }(X)\) is indistinguishable from that of ordinary matter in the locally inertial frame. So any physical effect of this \(C_{\mu ~\sigma }^{~\,\rho }(X)\) that cannot be accounted for by ordinary matter may be interpreted as a violation of the equivalence principle (see also Ref. [45] for a related discussion).

It is important to note that a violation of this principle can even occur in a metric theory, through the presence of higher dimensional terms in the action, which start to be relevant at high energies. An example is the term \(\int d^4x \sqrt{-g}\, R F_{\mu \nu }^IF^{I\mu \nu }\): in a spacetime where \(R\ne 0\) locally, such as the de Sitter spacetime of cosmological relevance, this term would lead to an observable modification of electrodynamics due to gravity even in an arbitrarily small neighbourhood of X. This is not surprising because the equivalence principle is a classical local statement but at very small distances, i.e. at very high energies, we expect quantum gravity effects to show up and these can lead to higher dimensional terms in the EFT description, such as the one we have just mentioned. The (classical) equivalence principle is expected to fail in a quantum gravity framework, while general covariance can survive [46].

On the other hand, as we have seen in Sect. 4.3, starting from the general relativity principle, the dynamical components of the distorsion that can be massless are only spin-1 and spin-0 fields for realistic theories (that must be stable and feature fermions and whose connection is, therefore, metric compatible). So at low enough energies the effect of \(C_{\mu ~\sigma }^{~\,\rho }\) is indistinguishable from that of ordinary matter not only at X in the locally inertial frame, but in any frame and at any point. Furthermore, in the low energy limit metric-affine theories coupled to spin-0, spin-1/2 and spin-1 fields are described by the Einstein–Hilbert term computed with the LC connection, Eq. (3.1), plus the renormalizable action of the matter fields \(\{ \phi ', \psi , A_{\mu }^{'I} \}\) (where \(\phi '\) and \(A_{\mu }^{'I}\) include \(\phi \) and \(A_{\mu }^I\) plus all spin-0 and spin-1 massless dynamical fields from the torsion), which do satisfy the equivalence principle. This result does not change if one also considers other fields with spin 3/2 or higher than or equal to two: the only massless particles with spin higher than or equal to two that can interact with gravity in a Minkowski background are gravitons and massless spin 3/2 particles should interact exactly as gravitinos in supergravity [47, 48]. But supersymmetry must be broken at low energies in order for the theory to be realistic and as soon as this happens the gravitino acquires a mass.

Therefore, we see that, although general covariance does not imply the equivalence principle at all energies, the latter in general emerges at low energies from the former in realistic theories.

6 Conclusions

We conclude by providing a detailed summary of the new results of this paper with some further discussions.

  • After an introduction and some background material in Sects. 1 and 2, in Sect. 3 we have constructed the most general action of metric-affine EFTs that are equivalent to metric ones, namely those theories with a non-dynamical distorsion. We have included a generic matter sector featuring an arbitrary number of spin-1, spin-1/2 and spin-0 fields. We have pointed out, however, that in some specific cases the action can be brought into that form with appropriate redefinitions although it might not look so initially. The bottom line of that section is that the actions with non-dynamical distorsion are those that can be recast in a form linear in the curvature of the full connection with the “coefficients” of the linear terms being independent of the distorsion itself. This class is very vast and includes as a particular case, among many others, \(f({{\mathcal {R}}})\) theories.

  • In Sect. 4 we have studied some examples of theories that have instead a dynamical distorsion. We have investigated in detail a vast class where the parity-odd Holst invariant \({{\mathcal {R}}}'\) is a dynamical pseudoscalar field (pseudoscalaron). This field is supported by the distorsion (it vanishes when the distorsion does) and can, therefore, be regarded as a genuine distorsion field. The pseudoscalaron can coexist with a dynamical scalaron \({{\mathcal {R}}}\) and a generic matter sector. In the simplest cases we have been able to compute explicitly the pseudoscalaron kinetic term, mass and potential. In the same section, we have also discussed general Poincaré gauge theories coupled to matter, where the connection is metric compatible and fermions can be introduced. We have extended a previous result by Neville in a pure gravitational theory [40] to the presence of a generic matter sector, showing that the spin-2 fields from the torsion cannot be massless compatibly with the stability requirements and thus cannot appear at low enough energies. Also, we have commented on the possible phenomenology of torsion spin-1 fields, which can play the role of dark photons. At the end of Sect. 4 we have computed interactions of the pseudoscalaron with a generic matter sector and, of course, the metric. These results can be used in the future to study the role of the pseudoscalaron in the early and late universe as well as the possible scattering, production mechanisms and decays of this torsion field.

  • Section 5 presents a proof that in generic realistic, and thus metric compatible, metric-affine EFTs the equivalence principle (appropriately defined) always emerges at low energies, although it is generically violated at high energies. This was possible by means of the extension of Neville’s result to a general matter sector, which we presented in Sect. 4.3: the massless dynamical torsion fields can only have spin 1 or spin 0 and can, therefore, be represented by ordinary matter fields; so at low enough energies the theory can be described by the Einstein–Hilbert action complemented by minimally-coupled ordinary matter fields, which satisfy the equivalence principle.