1 Introduction

At a fundamental level gravity may be regarded as a theory of connections. An example is the “Palatini approach” to gravity due to Einstein [1, 2], hereafter called the EP approach [3,4,5]. In this case the “Palatini connection” \((\tilde{\Gamma })\) is a priori independent of the metric \((g_{\alpha \beta })\) and is actually determined by its equations of motion, from the action considered. For simple actions, \(\tilde{\Gamma }\) plays an auxiliary role only, with no dynamics. For example, for an Einstein action in the EP approach the variational principle gives that \(\tilde{\Gamma }\) is actually equal to the Levi-Civita connection \((\Gamma ).\) With this solution for \(\tilde{\Gamma }\), one then recovers Einstein gravity: the metric formulation and the EP approach are equivalent.

However, this equivalence does not hold in general, e.g. for more complicated actions or with matter present [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]. For example, for quadratic gravity actions of the type studied here in the EP approach, the equations of motion of \(\tilde{\Gamma }\) become complicated second-order differential equations; further, some components of \(\tilde{\Gamma }\) even become dynamical in a sense discussed shortly. The question remains, however, whether such general actions in the EP formalism and in the absence of matter can recover dynamically the Levi-Civita connection and Einstein gravity. If true, this would be similar to the original Weyl quadratic gravity theory [22,23,24,25] as we showed recently in [26, 27]. The main goal of this paper is to answer this question.

To address this question we study a gravity action in the EP approach with gauged scale symmetry, also called Weyl gauge symmetry, see [26, 27] for an example.Footnote 1 This symmetry, first present in Weyl gravity [22,23,24], is important for mass generation, hence our interest. This symmetry requires us to consider quadratic gravity actions, with no dimensionful parameters. For such an action we shall: (1) explain the spontaneous breaking of this symmetry and the emergence of Levi-Civita connection, Einstein gravity and Planck scale in the broken phase, even in the absence of matter. This answers the above question; (2) study the relation of this action to Weyl theory [22,23,24] of similar symmetry; (3) study its inflation predictions.

In Sect. 2 we first review the \(R(\tilde{\Gamma },g)^2\) gravity in the EP approach, where \(R(\tilde{\Gamma },g)\) denotes the scalar curvature in this formalism. This action is local scale invariant. The connection is shown to be conformally related to the Levi-Civita connection. When “fixing the gauge” of this symmetry, the “auxiliary” scalar field \(\phi \) introduced to “linearise” the \(R^2\) term decouples. As a result, one finds that \(\tilde{\Gamma }=\Gamma \) and Einstein action is obtained.

In Sects. 3 and 4 we study the quadratic action \(R(\tilde{\Gamma },g)^2+R_{[\mu \nu ]}(\tilde{\Gamma })^2\) in the EP approach, hereafter called “EP quadratic gravity”. Here we used the notation \(R_{[\mu \nu ]}\equiv (R_{\mu \nu }-R_{\nu \mu })/2\). In this action the trace \(\tilde{\Gamma }_\mu \) of the Palatini connection (assumed symmetric) is dynamical in the sense that \(R_{[\mu \nu ]}(\tilde{\Gamma })^2\) is a gauge kinetic term for \(\tilde{\Gamma }_\mu \) or, more exactly, for the vector fieldFootnote 2\(v_\mu \sim \tilde{\Gamma }_\mu -\Gamma _\mu \), \((\tilde{\Gamma }_\mu \equiv \tilde{\Gamma }^\alpha _{\mu \alpha },\) \(\Gamma _\mu \equiv \Gamma ^\alpha _{\mu \alpha }).\) With \(\tilde{\Gamma }\) independent of \(g_{\mu \nu },\) one notices that the local scale symmetry of this action is actually a gauged scale symmetry, of gauge field \(v_\mu \).

A consequence of the gauged scale symmetry is that EP quadratic gravity is non-metricFootnote 3 i.e. \({\tilde{\nabla }}_\mu g_{\alpha \beta }\not =0\). This is due to a dynamical \(v_\mu \sim \tilde{\Gamma }_\mu \) and \(v_\mu \) is the non-metricity field, also called Weyl gauge field. Further, we find that the equations of motion for \(\tilde{\Gamma }\) are second-order differential equations. In this case the usual EP approach in f(R) theories to solve algebraically for \(\tilde{\Gamma }\) [4, 5] does not work, due to local scale symmetry and non-metricity. Nevertheless, we compute \(\tilde{\Gamma }\) and find that EP quadratic gravity with \(\tilde{\Gamma }\) onshell is equivalent to a ghost-free second-order gauged scale invariant theory with an additional dynamical field (“dilaton”). (Expressed in terms of this field the differential equations of \(\tilde{\Gamma }\) simplify considerably and this is how they are solved).

The main result of this work (Sect. 3) is that the gauged scale invariance of the above action is broken spontaneously by a new mechanism [26, 27] valid even in the absence of matter; in this, the necessary scalar field \((\phi )\) is not added ad hoc for this purpose (as usually done), but is “extracted” from the \(R^2\) term in the action; \(\phi \) is thus of geometric origin. After a Stueckelberg mechanism [74,75,76] the gauge field \(v_\mu \) becomes massive, of mass \(m_v\) near the Planck scale \(M\sim \langle \phi \rangle \), by “absorbing” the derivative \(\partial _\mu \ln \phi \) of the Stueckelberg field (also referred to as “dilaton”). Near the Planck scale we obtain the Einstein–Proca action of \(v_\mu \). Further, below the scale \(m_v\propto M\), the field \(v_\mu \) decouples and we recover metricity, Levi-Civita connection and Einstein gravity; the Planck scale M is then an emergent scale where this symmetry is broken. These results remain true if the theory also has matter fields (Higgs, etc.) non-minimally coupled with Palatini connection, while respecting gauged scale invariance (Sect. 4). Briefly, the EP quadratic gravity is a gauged scale invariant theory broken à la Stueckelberg, even in the absence of matter, to an Einstein–Proca action with a positive cosmological constant and a potential for the scalar fields, if present. This answers the main question of the paper.

Another theory where the connection is not determined by the metric itself is the original Weyl quadratic gravity of gauged scale invariance [22,23,24] (also [25]). With hindsight, it is then not too surprising that the above results are similar to those in [26,27,28] for Weyl theory. This theory came under early criticism from Einstein [22] for its non-metricity implying e.g. changes of the atomic spectral lines, in contrast to experiment; however, if the Weyl “photon” \((v_\mu )\) of non-metricity is actually massive (mass \(\sim M\)) by the same Stueckelberg mechanism, metricity and Einstein gravity are recovered below its decoupling scale (\(\sim \) Planck scale). Non-metricity effects are then strongly suppressed by a large M (their current lower bound seems low [77, 78]). Hence, the long-held criticisms that have implicitly assumed \(v_\mu \) be massless are actually avoided and Weyl gravity is then viable [26,27,28]. As outlined, in this work we obtain similar results in EP quadratic gravity, up to different non-metricity effects.

We also study inflation in EP quadratic gravity (Sect. 5). We consider this theory with an extra scalar field (Higgs-like) with perturbative non-minimal coupling and Palatini connection, that plays the role of the inflaton. We compute the potential after the gauged scale symmetry breaking. With the Planck scale a simple phase transition scale in our theory, field values above M are natural. Interestingly, the inflaton potential is similar to that in Weyl quadratic gravity [28], up to couplings and field redefinitions (due to a different non-metricity of the theory). Inflation in EP quadratic gravity has a specific prediction for the tensor-to-scalar ratio \(0.007\le r \le 0.010\) for the current spectral index \(n_s\) at \(95\%\) CL. This range of r is distinct from that predicted by inflation in Weyl gravity [28, 47] and will soon be reached by CMB experiments [79,80,81]. The conclusions are presented in Sect. 6 followed by an Appendix.

2 Palatini \(R^2\) gravity

For later reference we first review \(R^2\) gravity in the EP formalism [82, 83]. As discussed below, the action is local scale invariant (unlike its Riemannian counterpart):

$$\begin{aligned} L_1=\sqrt{g}\, \frac{\xi _0}{4!}\, R(\tilde{\Gamma },g)^2,\quad \xi _0>0, \end{aligned}$$
(1)

where

$$\begin{aligned} R(\tilde{\Gamma },g)= & {} g^{\mu \nu }\,R_{\mu \nu }(\tilde{\Gamma }),\nonumber \\ R_{\mu \nu }(\tilde{\Gamma })= & {} \partial _\lambda \tilde{\Gamma }^\lambda _{\mu \nu }- \partial _\mu \tilde{\Gamma }^\lambda _{\lambda \nu } +\tilde{\Gamma }_{\rho \lambda }^\lambda \tilde{\Gamma }^\rho _{\mu \nu }-\tilde{\Gamma }^\lambda _{\rho \mu } \tilde{\Gamma }^\rho _{\nu \lambda } \end{aligned}$$
(2)

\(R_{\mu \nu }(\tilde{\Gamma })\) is the metric-independent Ricci tensor in the EP formalism. Our conventions are as in [84] with metric \((+,-,-,-),\) \(g\equiv \vert \det g_{\mu \nu }\vert \) and we assume there is no torsion i.e. \(\tilde{\Gamma }_{\mu \nu }^\rho =\tilde{\Gamma }_{\nu \mu }^\rho \).

There is an equivalent “linearised” version of \(L_1\), found by using an auxiliary field \(\phi \)

$$\begin{aligned} L_1=\sqrt{g}\, \frac{\xi _0}{4!}\, \{-2 \,\phi ^2 R(\tilde{\Gamma },g)-\phi ^4\}. \end{aligned}$$
(3)

Indeed, (1) is recovered if we use in (3) the solution \(\phi ^2=-R(\tilde{\Gamma },g)\) of the equation of motion of the scalar field \(\phi \). With the connection \(\tilde{\Gamma }\) independent of the metric, (3) and (1) have local scale symmetry i.e. are invariant under a Weyl transformation \(\Omega =\Omega (x)\) withFootnote 4

$$\begin{aligned} {\hat{g}}_{\mu \nu }= & {} \Omega ^2 g_{\mu \nu },\quad \sqrt{{\hat{g}}}=\Omega ^4\sqrt{g},\nonumber \\ {\hat{\phi }}= & {} \frac{1}{\Omega }\phi , \quad {\hat{R}}(\tilde{\Gamma },{\hat{g}})=\frac{1}{\Omega ^2} R(\tilde{\Gamma },g). \end{aligned}$$
(4)

Unlike in the metric case, \(R_{\mu \nu }(\tilde{\Gamma })\) is invariant under (4) while \(R(\tilde{\Gamma },g)\) transforms covariantly, hence (1) and (3) are invariant. \(L_1\) has a shift symmetry: \(\ln \phi \rightarrow \ln \phi -\ln \Omega \). In global cases \(\ln \phi \) is the dilaton field generating a mass scale from its vev (assumed to be non-zero); here, \(\ln \phi \) is similar to a would-be Goldstone, as seen if we “gauge” symmetry (4) (see later, Eq. (18)).
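As a quick consistency check of the last two statements, using only Eqs. (1), (3) and (4), one may verify the linearisation and the invariance explicitly. Varying (3) with respect to \(\phi \) and substituting the solution back gives

$$\begin{aligned} \frac{\partial L_1}{\partial \phi }&=\sqrt{g}\,\frac{\xi _0}{4!}\,\big \{-4\,\phi \,R(\tilde{\Gamma },g)-4\,\phi ^3\big \}=0 \quad \Rightarrow \quad \phi ^2=-R(\tilde{\Gamma },g),\\ \big \{-2\,\phi ^2 R(\tilde{\Gamma },g)-\phi ^4\big \}\Big \vert _{\phi ^2=-R}&=2\,R(\tilde{\Gamma },g)^2-R(\tilde{\Gamma },g)^2=R(\tilde{\Gamma },g)^2, \end{aligned}$$

while under (4) each term of (3) is separately invariant, since the powers of \(\Omega \) cancel:

$$\begin{aligned} \sqrt{{\hat{g}}}\,{\hat{\phi }}^2\,{\hat{R}}(\tilde{\Gamma },{\hat{g}})&=\Omega ^{4}\,\Omega ^{-2}\,\Omega ^{-2}\,\sqrt{g}\,\phi ^2\,R(\tilde{\Gamma },g),\\ \sqrt{{\hat{g}}}\,{\hat{\phi }}^4&=\Omega ^{4}\,\Omega ^{-4}\,\sqrt{g}\,\phi ^4. \end{aligned}$$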

Let us solve the equation of motion for \(\tilde{\Gamma }\), then find the action for \(\tilde{\Gamma }\) onshell.Footnote 5 The change of \(R_{\mu \nu }(\tilde{\Gamma })\) under a variation of the connection is \(\delta R_{\mu \nu }(\tilde{\Gamma })={\tilde{\nabla }}_\lambda (\delta \tilde{\Gamma }_{\mu \nu }^\lambda ) - {\tilde{\nabla }}_\nu (\delta \tilde{\Gamma }_{\mu \lambda }^\lambda )\), where the operator \({\tilde{\nabla }}\) is defined with connection \(\tilde{\Gamma }\). Then from (3) the equation of motion of \(\tilde{\Gamma }_{\mu \nu }^\lambda \) gives

$$\begin{aligned} \tilde{\nabla }_\lambda \left( \sqrt{g}\, g^{\mu \nu }\phi ^2 \right) - \frac{1}{2}\,\Big [ {\tilde{\nabla }}_\rho \left( \sqrt{g}\,g^{\rho \mu } \phi ^2\right) \, \delta _\lambda ^\nu +(\mu \leftrightarrow \nu )\Big ] =0.\nonumber \\ \end{aligned}$$
(5)

Setting \(\nu =\lambda \) and summing over this index givesFootnote 6

$$\begin{aligned} {\tilde{\nabla }}_\rho \left( \sqrt{g}\,g^{\rho \mu }\phi ^2\right) =0. \end{aligned}$$
(6)

To simplify notation, introduce an auxiliary dimensionful “metric” \(h_{\mu \nu }\equiv \phi ^2 g_{\mu \nu }\), then

$$\begin{aligned} {\tilde{\nabla }}_\lambda \left( \sqrt{h}\, h^{\mu \nu }\right) =0. \end{aligned}$$
(7)

This means that in terms of \(h_{\mu \nu }\), the connection is Levi-CivitaFootnote 7

$$\begin{aligned} \tilde{\Gamma }^\alpha _{\mu \nu }(h)=(1/2)\, h^{\alpha \lambda } ( \partial _\mu h_{\lambda \nu } +\partial _\nu h_{\lambda \mu }- \partial _\lambda h_{\mu \nu }), \end{aligned}$$
(9)

or, in terms of \(g_{\mu \nu }\)

$$\begin{aligned} \tilde{\Gamma }_{\mu \nu }^\alpha= & {} \Gamma _{\mu \nu }^\alpha (g) + (1/2)\big (\delta _\nu ^\alpha \,u_\mu +\delta _\mu ^\alpha u_\nu -g^{\alpha \lambda } g_{\mu \nu } u_\lambda \big ),\nonumber \\&u_\mu \equiv \partial _\mu \ln \phi ^2, \end{aligned}$$
(10)

with Levi-Civita \(\Gamma ^\alpha _{\mu \nu }(g)=(1/2) g^{\alpha \lambda } (\partial _\mu g_{\lambda \nu }+\partial _\nu g_{\lambda \mu } -\partial _\lambda g_{\mu \nu })\). Next, if we use the equation of motion of \(\phi \) of solution \(\phi ^2=-R(\tilde{\Gamma },g)\), Eq. (10) for \(\tilde{\Gamma }\) (also (5), (6)) becomes a second-order differential equation since \(\partial \phi ^2\sim \partial R\sim \partial ^2\tilde{\Gamma }\), and it is difficult to solve (and since solution \(\tilde{\Gamma }\) of (10) involves \(\partial g_{\mu \nu }\) from \(\Gamma (g)\) then for \(\tilde{\Gamma }\) onshell action (1) is a four-derivative theory in \(g_{\mu \nu }\)). An easy way out is to keep \(\phi \) an independent variable hereafter (no use of its equation of motion), then Eqs. (5) and (6) have solution \(\tilde{\Gamma }\) given by the rhs of (10). For this solution, then

$$\begin{aligned} R(\tilde{\Gamma },g)=R(g)- 3 \nabla _\mu u^\mu -\frac{3}{2} \, g^{\mu \nu } u_\mu \, u_\nu , \end{aligned}$$
(11)

with the Ricci scalar R(g) for \(g_{\mu \nu }\) while \(\nabla \) is defined with the Levi-Civita connection (\(\Gamma \)). Using (11) in (3) of the same metric, we find for \(\tilde{\Gamma }\) onshellFootnote 8

$$\begin{aligned} L_1=\sqrt{g}\, \left\{ \frac{\xi _0}{2}\,\left[ - \frac{1}{6}\, \phi ^2 R(g) - (\partial _\mu \phi )^2 -\frac{1}{12} \,\phi ^4\right] \right\} . \end{aligned}$$
(12)

\(L_1\) is a second order theory with an additional dynamical variable demanded by symmetry (4) and is equivalent to action (1) which for \(\tilde{\Gamma }\) onshell is a four-derivative theory, as noticed.
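The step from (11) to (12) can be made explicit. The following is a sketch, assuming boundary terms are dropped (\(\simeq \) denotes equality up to a total derivative) and using \(u_\mu =\partial _\mu \ln \phi ^2=2\,\partial _\mu \phi /\phi \):

$$\begin{aligned} -2\,\phi ^2 R(\tilde{\Gamma },g)&=-2\,\phi ^2 R(g)+6\,\phi ^2\,\nabla _\mu u^\mu +3\,\phi ^2\,u_\mu u^\mu ,\\ \sqrt{g}\,\,6\,\phi ^2\,\nabla _\mu u^\mu&\simeq -6\sqrt{g}\,(\partial _\mu \phi ^2)\,u^\mu =-24\sqrt{g}\,(\partial _\mu \phi )^2,\\ 3\,\phi ^2\,u_\mu u^\mu&=12\,(\partial _\mu \phi )^2. \end{aligned}$$

Hence \(-2\phi ^2 R(\tilde{\Gamma },g)-\phi ^4\simeq -2\phi ^2 R(g)-12\,(\partial _\mu \phi )^2-\phi ^4\), and multiplying by \(\sqrt{g}\,\xi _0/4!\) reproduces the coefficients \(-1/6\), \(-1\) and \(-1/12\) inside the square bracket of (12).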

Lagrangian (12) has local scale symmetry so one may like to “fix the gauge”. We choose the Einstein (or unitary) gauge, reached by a \(\phi \)-dependent transformation \(\Omega ^2=\phi ^2\!/\langle \phi \rangle ^2\) that fixes \(\phi \) to a constant (vev); in this gauge \(M\), defined by \(M^2=\xi _0\langle \phi \rangle ^2/6\), is the Planck mass. From (12)

$$\begin{aligned} L_1=\sqrt{{\hat{g}}}\, \left\{ \frac{-1}{2} M^2 {\hat{R}}({\hat{g}}) - \frac{3}{2 \xi _0} M^4\right\} . \end{aligned}$$
(13)

Hence Einstein action (13) is recovered as a gauge fixed form of (12); symmetry (4) is now spontaneously broken and \(\phi \) decouplesFootnote 9 [85]; this may be expected since the local scale symmetry current of (12) is vanishing [86,87,88] (this will change in Sect. 3.3). With \(\phi \) “gauge fixed” to a constant, Eqs. (7) and (10) give \(h_{\mu \nu }\!\propto \! g_{\mu \nu }\) and \(\tilde{\Gamma }=\Gamma \) so the theory is metric.Footnote 10\(^,\)Footnote 11
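The coefficients in (13) follow directly from (12). As a short check, with \({\hat{\phi }}=\langle \phi \rangle \) constant (so that the kinetic term of \(\phi \) vanishes) and \(\langle \phi \rangle ^2=6M^2/\xi _0\):

$$\begin{aligned} \frac{\xi _0}{2}\cdot \frac{1}{6}\,\langle \phi \rangle ^2&=\frac{1}{2}\,M^2,\\ \frac{\xi _0}{2}\cdot \frac{1}{12}\,\langle \phi \rangle ^4&=\frac{\xi _0}{24}\cdot \frac{36\,M^4}{\xi _0^2}=\frac{3}{2\,\xi _0}\,M^4. \end{aligned}$$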

3 Palatini quadratic gravity with gauged scale symmetry

3.1 The Lagrangian and its expression for onshell \(\tilde{\Gamma }\)

Consider now the following EP quadratic gravity, with \(\alpha \) = constant and \(R_{[\mu \nu ]}\equiv (R_{\mu \nu }-R_{\nu \mu })/2\)

$$\begin{aligned} L_2=\sqrt{g}\, \left\{ \frac{\xi _0}{4!}\,R(\tilde{\Gamma },g)^2- \frac{1}{4 \alpha ^2} R_{[\mu \nu ]}(\tilde{\Gamma })\, R^{\mu \nu }(\tilde{\Gamma })\right\} . \end{aligned}$$
(14)

With \(R_{\mu \nu }(\tilde{\Gamma })\) from Eq. (2) and \(\tilde{\Gamma }^\alpha _{\mu \nu }\) symmetric in \((\mu ,\nu )\), \(L_2\) has a more intuitive form

$$\begin{aligned} L_2=\sqrt{g}\, \left\{ \frac{\xi _0}{4!}\, R(\tilde{\Gamma },g)^2 -\frac{1}{4 \alpha ^2}\, F_{\mu \nu }(\tilde{\Gamma }) F^{\mu \nu }(\tilde{\Gamma })\right\} . \end{aligned}$$
(15)

This is a natural extension of \(L_1\) of Eq. (1), with the second term above indicating we now have a dynamical trace \((\tilde{\Gamma }_\mu )\) of the Palatini connection, as seen from the notation below:

$$\begin{aligned} F_{\mu \nu }(\tilde{\Gamma })= {\tilde{\nabla }}_\mu v_\nu -{\tilde{\nabla }}_\nu v_\mu ; \quad v_\mu = (1/2)(\tilde{\Gamma }_\mu -\Gamma _\mu (g)),\nonumber \\ \end{aligned}$$
(16)

with \(\tilde{\Gamma }_\mu \equiv \tilde{\Gamma }_{\mu \lambda }^\lambda \) and \(\Gamma _\mu \equiv \Gamma _{\mu \lambda }^\lambda \). Since \(\tilde{\Gamma }^\alpha _{\mu \nu }=\tilde{\Gamma }_{\nu \mu }^\alpha \) and \({\tilde{\nabla }}_\mu v_\nu =\partial _\mu v_\nu - \tilde{\Gamma }_{\mu \nu }^\alpha v_\alpha \), then we have \(F_{\mu \nu }=\partial _\mu v_\nu -\partial _\nu v_\mu =(\partial _\mu \tilde{\Gamma }_\nu -\partial _\nu \tilde{\Gamma }_\mu )/2=-R_{[\mu \nu ]}\), and Eqs. (14) and (15) are equivalent. While \(\Gamma _\mu (g)\) does not contribute to \(F_{\mu \nu }(\tilde{\Gamma })^2\), it is needed to ensure that \(v_\mu \) is a vector under coordinate transformation (which is not true for \(\tilde{\Gamma }_\mu \) or \(\Gamma _\mu \), see Appendix). \(v_\mu \) is the Weyl fieldFootnote 12 and measures the trace of the deviation of the Palatini connection \(\tilde{\Gamma }\) from Levi-Civita connection \(\Gamma (g)\). \(L_2\) is quadratic in R but for \(\tilde{\Gamma }\) offshell resembles a second order theory.
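The relation \(R_{[\mu \nu ]}=-F_{\mu \nu }\) used above can be seen directly from Eq. (2). A short sketch, assuming only a symmetric connection: the terms \(\partial _\lambda \tilde{\Gamma }^\lambda _{\mu \nu }\) and \(\tilde{\Gamma }^\lambda _{\rho \lambda }\tilde{\Gamma }^\rho _{\mu \nu }\) are symmetric in \((\mu ,\nu )\), and so is \(\tilde{\Gamma }^\lambda _{\rho \mu }\tilde{\Gamma }^\rho _{\nu \lambda }\) after relabelling the summed indices \(\lambda \leftrightarrow \rho \). Only the second term of (2) then contributes to the antisymmetric part:

$$\begin{aligned} R_{[\mu \nu ]}(\tilde{\Gamma })&=-\frac{1}{2}\,\big (\partial _\mu \tilde{\Gamma }_\nu -\partial _\nu \tilde{\Gamma }_\mu \big )\\&=-\big (\partial _\mu v_\nu -\partial _\nu v_\mu \big )=-F_{\mu \nu }, \end{aligned}$$

where the second line uses \(\partial _\mu \Gamma _\nu (g)-\partial _\nu \Gamma _\mu (g)=0\), which holds because \(\Gamma _\mu (g)=\partial _\mu \ln \sqrt{g}\) is a gradient.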

As in previous section, write \(L_2\) in an equivalent “linearised” form useful later on

$$\begin{aligned} L_2=\sqrt{g}\, \left\{ - \frac{\xi _0}{12}\, \phi ^2 \,R(\tilde{\Gamma },g) - \frac{1}{4 \alpha ^2}\, F_{\mu \nu }(\tilde{\Gamma })^2 -\frac{\xi _0}{4!}\,\phi ^4 \right\} .\nonumber \\ \end{aligned}$$
(17)

The equation of motion for \(\phi \) has solution \(\phi ^2=-R(\tilde{\Gamma },g)\) which replaced in \(L_2\) recovers (15).

Since \(\tilde{\Gamma }\) does not transform under (4) and with \(\Gamma _\mu (g)=\partial _\mu \ln \sqrt{g}\) that follows from the definition of Levi-Civita connection, then \(L_2\) is invariant under (4) extended by

$$\begin{aligned} {\hat{v}}_\lambda =v_\lambda -\partial _\lambda \ln \Omega ^2. \end{aligned}$$
(18)

The invariance of \(L_2\) under transformations (4) and (18), is referred to as gauged scale invariance or Weyl gauge symmetry, with a (dilatation) group isomorphic to \(\mathbf{R^+}\), as in Weyl gravity.
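The shift of the Weyl field can be derived in two lines from the definition of \(v_\mu \) in (16). A sketch, using only \(\Gamma _\lambda (g)=\partial _\lambda \ln \sqrt{g}\), \(\sqrt{{\hat{g}}}=\Omega ^4\sqrt{g}\) and the fact that \(\tilde{\Gamma }\) does not transform:

$$\begin{aligned} \Gamma _\lambda ({\hat{g}})&=\partial _\lambda \ln \big (\Omega ^4\sqrt{g}\big )=\Gamma _\lambda (g)+2\,\partial _\lambda \ln \Omega ^2,\\ {\hat{v}}_\lambda&=\frac{1}{2}\,\big (\tilde{\Gamma }_\lambda -\Gamma _\lambda ({\hat{g}})\big )=v_\lambda -\partial _\lambda \ln \Omega ^2. \end{aligned}$$

Since the shift is a gradient, \(F_{\mu \nu }\) is invariant, and so is \(\sqrt{g}\,g^{\mu \alpha }g^{\nu \beta }F_{\mu \nu }F_{\alpha \beta }\), in which the factor \(\Omega ^4\) from \(\sqrt{g}\) cancels the factor \(\Omega ^{-4}\) from the two inverse metrics.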

Let us then compute the connection \(\tilde{\Gamma }_{\mu \nu }^\lambda \) from its equation of motion which is

$$\begin{aligned}&{\tilde{\nabla }}_\lambda (\sqrt{g}\,g^{\mu \nu }\phi ^2)\nonumber \\&\quad -\left\{ \frac{1}{2}\,\delta _\lambda ^\nu \left[ {\tilde{\nabla }}_\rho (\sqrt{g}\, g^{\mu \rho } \phi ^2) {-}\frac{6 \sqrt{g}}{\alpha ^2 \xi _0} \nabla _\rho F^{\rho \mu }\right] +(\mu \leftrightarrow \nu )\right\} =0.\nonumber \\ \end{aligned}$$
(19)

Here \({\tilde{\nabla }}_\mu \) and \(\nabla _\mu \) are evaluated with the Palatini (\(\tilde{\Gamma }\)) and Levi-Civita (\(\Gamma \)) connections, respectively. Setting \(\lambda =\nu \) and summing over gives (compare against Eq. (6))

$$\begin{aligned} {\tilde{\nabla }}_\rho \left( \sqrt{g}\,g^{\mu \rho }\,\phi ^2\right) =\frac{10}{ \alpha ^2} \frac{1}{\xi _0} \sqrt{g}\, \nabla _\rho F^{\rho \mu }, \end{aligned}$$
(20)

which is an equation of motion for the trace \(\tilde{\Gamma }_\mu \sim v_\mu \). Replacing (20) back in (19) leads to

$$\begin{aligned} {\tilde{\nabla }}_\lambda (\sqrt{g}\,g^{\mu \nu }\phi ^2) -\frac{1}{5}\, \Big \{ \,\delta _\lambda ^\nu \, {\tilde{\nabla }}_\rho (\sqrt{g}\, g^{\rho \mu }\phi ^2) +(\mu \leftrightarrow \nu )\Big \}=0.\nonumber \\ \end{aligned}$$
(21)

Therefore, the set of Eq. (19) is equivalent to the combined set of Eqs. (21) and (20).Footnote 13
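The numerical factors in (20) and (21) can be checked as follows. In this sketch the shorthand \(X^\mu \) and \(Y^\mu \) is introduced only for bookkeeping and is not used elsewhere:

$$\begin{aligned} X^\mu \equiv {\tilde{\nabla }}_\rho \big (\sqrt{g}\,g^{\mu \rho }\phi ^2\big ),\qquad Y^\mu \equiv \frac{6\sqrt{g}}{\alpha ^2\xi _0}\,\nabla _\rho F^{\rho \mu }. \end{aligned}$$

Setting \(\lambda =\nu \) in (19) and summing, with \(\delta _\nu ^\nu =4\), gives

$$\begin{aligned} X^\mu -\frac{1}{2}\,\big [\,4\,(X^\mu -Y^\mu )+(X^\mu -Y^\mu )\big ]=-\frac{3}{2}\,X^\mu +\frac{5}{2}\,Y^\mu =0 \quad \Rightarrow \quad X^\mu =\frac{5}{3}\,Y^\mu =\frac{10}{\alpha ^2\xi _0}\,\sqrt{g}\,\nabla _\rho F^{\rho \mu }, \end{aligned}$$

which is (20). Then \(X^\mu -Y^\mu =(2/5)\,X^\mu \), so the square bracket in (19) becomes \((2/5)\,X^\mu \) and the overall coefficient \((1/2)\cdot (2/5)=1/5\) is the one appearing in (21).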

Let us find \(\tilde{\Gamma }_{\mu \nu }^\lambda \) from (21). Note that if one used the equation of motion of \(\phi \) of solution \(\phi ^2= -R(\tilde{\Gamma },g)\), then (21) would be a second-order differential equation for \(\tilde{\Gamma }^\alpha _{\mu \nu }\), since \({\tilde{\nabla }}_\lambda \phi ^2\sim \partial \phi ^2\sim \partial R(\tilde{\Gamma },g)\sim \partial ^2\tilde{\Gamma }\), with further complications. It is however easier to simply regard \(\phi \) hereafter as an independent variableFootnote 14 (i.e. no use of its equation of motion) in terms of which one then easily computes \(\tilde{\Gamma }\) algebraically, as we do below. To find a solution to (21) we first introduce, based on an approach of [8]:

$$\begin{aligned} {\tilde{\nabla }}_\lambda \left( \sqrt{g}\,g^{\mu \nu }\phi ^2\right) =(-2)\sqrt{g} \,\phi ^2(\delta _\lambda ^\mu \, V^\nu +\delta _\lambda ^\nu \,V^\mu ), \end{aligned}$$
(22)

where \(V_\mu \) is some arbitrary vector field (to be determined later). \(V_\mu \) is introduced since, due to the underlying symmetry, Eq. (21) with \(\lambda =\nu \) summed over is automatically respected for fixed \(\mu \) \((=0,1,2,3);\) this leaves four undetermined components, accounted for by \(V_\mu \). Further, if in Eq. (21) one replaces \({\tilde{\nabla }} (..)\) terms by the rhs of (22) one easily shows that (21) is indeed satisfied. Hence, instead of finding \(\tilde{\Gamma }\) from (21), it is sufficient to compute \(\tilde{\Gamma }\) from (22),Footnote 15 which is easier. To this end, multiply (22) by \(g_{\mu \nu }\) and use \(g_{\mu \nu }{\tilde{\nabla }}_\lambda g^{\mu \nu }=-2 {\tilde{\nabla }}_\lambda \ln \sqrt{g}\), to find that

$$\begin{aligned} V_\lambda =- (1/2)\,{\tilde{\nabla }}_\lambda \ln \left( \sqrt{g}\,\phi ^4\right) . \end{aligned}$$
(23)
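Both claims made about the ansatz (22) can be verified in a few lines. First, setting \(\lambda =\nu \) in (22) and summing gives \({\tilde{\nabla }}_\rho (\sqrt{g}\,g^{\rho \mu }\phi ^2)=-10\,\sqrt{g}\,\phi ^2\,V^\mu \), so that the lhs of (21) becomes

$$\begin{aligned} -2\sqrt{g}\,\phi ^2\,\big (\delta _\lambda ^\mu V^\nu +\delta _\lambda ^\nu V^\mu \big ) -\frac{1}{5}\,\big (-10\,\sqrt{g}\,\phi ^2\big )\,\big (\delta _\lambda ^\nu V^\mu +\delta _\lambda ^\mu V^\nu \big )=0, \end{aligned}$$

for any \(V_\mu \). Second, contracting (22) with \(g_{\mu \nu }\) and dividing by \(\sqrt{g}\,\phi ^2\):

$$\begin{aligned} \text {lhs}:\quad&4\,{\tilde{\nabla }}_\lambda \ln \big (\sqrt{g}\,\phi ^2\big )-2\,{\tilde{\nabla }}_\lambda \ln \sqrt{g} =2\,{\tilde{\nabla }}_\lambda \ln \big (\sqrt{g}\,\phi ^4\big ),\\ \text {rhs}:\quad&-2\,g_{\mu \nu }\,\big (\delta _\lambda ^\mu V^\nu +\delta _\lambda ^\nu V^\mu \big )=-4\,V_\lambda , \end{aligned}$$

and equating the two lines gives (23).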

From (22) and (23)

$$\begin{aligned} {\tilde{\nabla }}_\lambda \, (\phi ^2 g_{\mu \nu }) =(-2)\,\big (g_{\mu \nu }\,V_\lambda -g_{\mu \lambda } V_\nu - g_{\nu \lambda } V_\mu \big )\, \phi ^2, \end{aligned}$$
(24)

so the theory is non-metric. From (24) we find the solutionFootnote 16\(\tilde{\Gamma }\) to (21) in terms of \(V_\lambda \):

$$\begin{aligned} \tilde{\Gamma }_{\mu \nu }^\alpha= & {} \Gamma _{\mu \nu }^\alpha (\phi ^2 g) -\,\big (3\, g_{\mu \nu }\, V_\lambda -g_{\nu \lambda }\,V_\mu - g_{\lambda \mu } \,V_\nu \,\big )\,g^{\lambda \alpha },\nonumber \\&\quad \mathrm{with}\ \Gamma ^\alpha _{\mu \nu }(\phi ^2 g)=\Gamma ^\alpha _{\mu \nu }(g)\nonumber \\&+1/2\, \big ( \delta _\nu ^\alpha \, \partial _\mu +\delta _\mu ^\alpha \, \partial _\nu -g^{\alpha \lambda } g_{\mu \nu } \,\partial _\lambda ) \ln \phi ^2. \end{aligned}$$
(25)

\(\Gamma _{\mu \nu }^\alpha (g)\) is Levi-Civita connection of \(g_{\mu \nu }\). From (25), \(\tilde{\Gamma }_\lambda =\Gamma _\lambda (\phi ^2 g)+ 2 V_\lambda \) and with (16) and (23)

$$\begin{aligned} v_\lambda =-(1/2)\, {\tilde{\nabla }}_\lambda \ln \sqrt{g}, \end{aligned}$$
(26)

and finally, \(V_\lambda =v_\lambda -\partial _\lambda \ln \phi ^2\). With this relation between \(V_\lambda \) and \(v_\lambda \), the solution \(\tilde{\Gamma }\) in (25) is finally expressed as a function of \(v_\lambda \), \(\phi \), and will be used shortly to compute the action for \(\tilde{\Gamma }\) onshell (see Eq. (29) belowFootnote 17). Notice that solution \(\tilde{\Gamma }\) of (25) and also (24), are invariant under transformations (4) and (18) for any \(\Omega (x)\) since \(\phi ^2 g_{\mu \nu }\), \(V_\lambda \), \(\sqrt{g}\phi ^4\) are invariant.

As expected, \(v_\lambda \) is the Weyl field of non-metricity defined as \(Q_{\lambda \mu \nu }\equiv {\tilde{\nabla }}_\lambda g_{\mu \nu }\), since from (26) the trace \(Q^\mu _{\lambda \mu }=-4\, v_\lambda \). Non-metricity is a consequence of the dynamical \(v_\lambda \), see (20). Equation (26) is similar to that in Weyl quadratic gravity of same symmetry (e.g. [33]).
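The trace quoted above can be obtained from (24) as well. A sketch, expanding \({\tilde{\nabla }}_\lambda (\phi ^2 g_{\mu \nu })=\phi ^2\,{\tilde{\nabla }}_\lambda g_{\mu \nu }+g_{\mu \nu }\,\partial _\lambda \phi ^2\) and using \(V_\lambda =v_\lambda -\partial _\lambda \ln \phi ^2\):

$$\begin{aligned} {\tilde{\nabla }}_\lambda g_{\mu \nu }&=-2\,\big (g_{\mu \nu }V_\lambda -g_{\mu \lambda }V_\nu -g_{\nu \lambda }V_\mu \big )-g_{\mu \nu }\,\partial _\lambda \ln \phi ^2,\\ g^{\mu \nu }\,{\tilde{\nabla }}_\lambda g_{\mu \nu }&=-2\,(4-1-1)\,V_\lambda -4\,\partial _\lambda \ln \phi ^2=-4\,\big (V_\lambda +\partial _\lambda \ln \phi ^2\big )=-4\,v_\lambda . \end{aligned}$$

This agrees with (26), since \(g^{\mu \nu }{\tilde{\nabla }}_\lambda g_{\mu \nu }=2\,{\tilde{\nabla }}_\lambda \ln \sqrt{g}\).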

Finally, from solution (25) and (2) we compute \(R_{\mu \nu }(\tilde{\Gamma })\) and scalar curvatureFootnote 18\(R(\tilde{\Gamma },g)\)

$$\begin{aligned} R(\tilde{\Gamma },g)= & {} R(g)-6 g^{\mu \nu } \nabla _\mu \nabla _\nu \ln \phi -6 (\nabla _\mu \ln \phi )^2\nonumber \\&-12 \big (\nabla _\lambda V^\lambda + V^\lambda \partial _\lambda \ln \phi ^{{2}}\big ) -6 V_\mu \, V^\mu . \end{aligned}$$
(28)

R(g) is here the usual Ricci scalar and \(V_\lambda = v_\lambda -\partial _\lambda \ln \phi ^2\). Using (28) in (17), we finally obtain

$$\begin{aligned} L_2= & {} \sqrt{g}\,\left\{ -\frac{\xi _0}{12} \left[ \phi ^2 R(g) +6 (\partial _\mu \phi )^2\right] \right. \nonumber \\&\left. +\frac{\xi _0}{2} \phi ^2 \,(v_\mu -\partial _\mu \ln \phi ^2)^2 -\frac{1}{4\, \alpha ^2} F_{\mu \nu }^2-\frac{\xi _0}{4!}\,\phi ^4\right\} . \end{aligned}$$
(29)

This is the “onshell” Lagrangian of EP quadratic gravity of Eq. (14) and is gauged scale invariant. \(L_2\) is a second-order scalar–vector–tensor theory of gravity which is ghost-free according to [92] for a torsion-free connection as here (this is also obvious from (30) below). This is relevant since initial action (14) which (offshell) was of second order is actually a four-derivative theory in the metricFootnote 19 for \(\tilde{\Gamma }\) onshell; indeed, \(R(\tilde{\Gamma },g)^2\) in (14) with replacement (28) contains the higher derivative term \(R^2(g)+\cdots \); this four-derivative theory has an equivalent second-order formulation with additional \(\phi \), as shown in Eq. (29). Finally, if \(v_\mu =\partial _\mu \ln \phi ^2\) (“pure gauge”), the model is Weyl integrable and (29) recovers (12).
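The passage from (28) and (17) to (29) can be checked term by term. A sketch, assuming total derivatives are dropped (\(\simeq \)) and multiplying each term of (28) by \(-(\xi _0/12)\,\phi ^2\):

$$\begin{aligned} \frac{\xi _0}{2}\,\phi ^2\,\nabla _\mu \nabla ^\mu \ln \phi +\frac{\xi _0}{2}\,\phi ^2\,(\nabla _\mu \ln \phi )^2&\simeq -\xi _0\,(\partial _\mu \phi )^2+\frac{\xi _0}{2}\,(\partial _\mu \phi )^2=-\frac{\xi _0}{12}\cdot 6\,(\partial _\mu \phi )^2,\\ \xi _0\,\phi ^2\,\big (\nabla _\lambda V^\lambda +V^\lambda \partial _\lambda \ln \phi ^2\big )&=\xi _0\,\nabla _\lambda \big (\phi ^2\,V^\lambda \big )\simeq 0,\\ \frac{\xi _0}{2}\,\phi ^2\,V_\mu V^\mu&=\frac{\xi _0}{2}\,\phi ^2\,\big (v_\mu -\partial _\mu \ln \phi ^2\big )^2. \end{aligned}$$

Together with the \(R(g)\), \(F_{\mu \nu }^2\) and \(\phi ^4\) terms, which are carried over unchanged from (17), these are the terms of (29).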

Lagrangian (29) (also initial (15)) is similar to that of Weyl quadratic gravity [26, 27], up to a Weyl tensor-squared term not included here. However, unlike in Weyl theory, here \(\tilde{\Gamma }\) is \(\phi \)-dependent; also, in Weyl theory non-metricity follows from the underlying Weyl conformal geometry, while here it emerges after we determine \(\tilde{\Gamma }\) from its equation of motion.

3.2 Stueckelberg breaking to Einstein–Proca action

Given \(L_2\) in (29) with gauged scale symmetry we would like to “fix the gauge”. We choose the Einstein gauge obtained from (29) by transformations (4) and (18) of a special \(\Omega ^2=\xi _0\phi ^2/(6 M^2)\) fixing \(\phi \) to a constant (\(\langle \phi \rangle \not =0\)). After removing the hats ( \(\hat{}\) ) on transformed g,\(v_\mu \), R, we find

$$\begin{aligned} L_2= & {} \sqrt{g}\, \left\{ -\frac{1}{2} M^2 R(g) +3\, M^2 \,v_\mu \,v_\nu \, g^{\mu \nu } \right. \nonumber \\&\left. -\frac{1}{4 \alpha ^2} F_{\mu \nu }^2-\frac{3}{2\xi _0} M^4\right\} . \end{aligned}$$
(30)

This is the Einstein–Proca action for the gauge field \(v_\mu \) with a positive cosmological constant, in which M is identified with the Planck scale, as seen from Eq. (29)

$$\begin{aligned} M^2\equiv \xi _0\langle \phi \rangle ^2/6. \end{aligned}$$
(31)

The initial gauged scale invariance is broken by a gravitational Stueckelberg mechanism [74,75,76]: the massless \(\phi \) is not part of the action anymore, but \(v_\mu \) has become massive, after “absorbing” the derivative \(\partial _\mu (\ln \phi )\) of the Stueckelberg field (dilaton) in Eq. (29). Note that \(\partial _\mu (\ln \phi )\) is actually the Goldstone of special conformal symmetry – this Goldstone is not independent but is determined by the derivative of the dilaton [93]. The number of degrees of freedom (dof) other than graviton is conserved in going from (29) to (30), as it should be for spontaneous breaking: massless \(v_\mu \) and dynamical \(\phi \) are replaced by massive \(v_\mu \) (dof = 3). The mass of \(v_\mu \) is \(m_v^2\!=6 \alpha ^2 M^2\) which is near Planck scale M (unless one fine-tunes \(\alpha \ll 1\)).
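The value of \(m_v\) follows from bringing the vector sector of (30) to canonical normalisation. In this sketch the rescaled field \(v'_\mu \) is introduced only for illustration, with \(v_\mu =\alpha \,v'_\mu \) and \(F'_{\mu \nu }=\partial _\mu v'_\nu -\partial _\nu v'_\mu \):

$$\begin{aligned} -\frac{1}{4\,\alpha ^2}\,F_{\mu \nu }^2+3\,M^2\,v_\mu v^\mu =-\frac{1}{4}\,F_{\mu \nu }^{\prime \,2}+\frac{1}{2}\,\big (6\,\alpha ^2 M^2\big )\,v'_\mu v^{\prime \mu }. \end{aligned}$$

Comparing with the Proca form \(-(1/4)F^2+(1/2)\,m^2 v^2\) in the \((+,-,-,-)\) convention used here gives \(m_v^2=6\,\alpha ^2 M^2\).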

Using the same transformation \(\Omega \), from (24)

$$\begin{aligned} {\tilde{\nabla }}_{\lambda } g_{\mu \nu }=(-2)(g_{\mu \nu } v_\lambda -g_{\mu \lambda } v_\nu -g_{\nu \lambda } v_\mu ). \end{aligned}$$
(32)

This has a solution \(\tilde{\Gamma }\) that is immediate from (25) for \(\phi \) constant and \(V_\lambda \) replaced by \(v_\lambda \). Finally, after the massive field \(v_\mu \) decouples, metricity is recovered below \(m_v\), so \({\tilde{\nabla }}_{\lambda } g_{\mu \nu }=0\) and \(\tilde{\Gamma }=\Gamma (g)\). Briefly, Einstein action is a “low energy” limit of Einstein–Palatini quadratic gravity, and the Planck scale \(M\sim \langle \phi \rangle \) is a phase transition scale (up to coupling \(\alpha \)).Footnote 20

For comparison, in Weyl quadratic gravity e.g. [26, 27], non-metricity is differentFootnote 21

$$\begin{aligned} {\tilde{\nabla }}_\lambda g_{\mu \nu }=-g_{\mu \nu }\,v_\lambda . \end{aligned}$$
(33)

Interestingly, the different non-metricity of these theories (giving different \(\tilde{\Gamma }\)) has phenomenological impact, see Sect. 5. In both theories the non-metricity scale is \(m_v\sim \,\)Planck scale and is large enough (current bounds [77, 78] are low \(\sim \) TeV) to suppress unwanted effects, e.g. changes in the spacing of atomic spectral lines. Past critiques of non-metricity assumed a massless \(v_\mu \).

Finally, let us remark that the above spontaneous symmetry breaking mechanism for initial action (14) is special since it takes place in the absence of matter. Indeed, the necessary scalar (Stueckelberg) field \(\ln \phi \) was not added ad hoc for this purpose, as usually done in the literature; instead, this field was “extracted” from the \(R^2\) term in the initial, symmetric action (14) and is thus of geometric origin. This situation is similar to Weyl quadratic gravity where this mechanism was first noticed [26, 27].

3.3 Conserved current

Equations (20) and (22) show there is now a non-trivial current due to dynamical \(v_\mu \sim \tilde{\Gamma }_\mu \)

$$\begin{aligned} J^\mu =\sqrt{g}\, g^{\rho \mu } \,\phi \, (\partial _\rho -1/2\, v_\rho )\,\phi ,\quad \nabla _\mu J^\mu =0. \end{aligned}$$
(34)

This is conserved since \(F_{\mu \nu }\) in (20) is anti-symmetric. To obtain (34) we used that the lhs of (20) and of (22) (with \(\lambda =\nu \)) are equal and replaced \(V_\lambda =v_\lambda -\partial _\lambda \ln \phi ^2\). The current \(J^\mu \) is the same as that in Weyl quadratic gravity [26] (Eq. 18) which has similar symmetry but different non-metricity. The presence of this conserved current extends to the case of the gauged scale symmetry a similar conservation for a global scale symmetry [64]. For a Friedmann–Robertson–Walker metric with \(\phi \) only t-dependent such current conservation in the global case naturally leads to \(\phi \)=constant [64] and a breaking of scale symmetry. In our case, since Eq. (30) has \(\phi \)=constant (assumed \(\langle \phi \rangle \not =0\)), then from (34) one has \(\nabla _\mu v^\mu =0\) which is a condition similar to that for a Proca (massive) gauge field, leaving 3 degrees of freedom for \(v_\mu \) in (30).
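The steps leading to (34) can be spelled out as follows. A sketch, using the trace of (22) over \(\lambda =\nu \), Eq. (20) and \(V_\rho =v_\rho -\partial _\rho \ln \phi ^2\):

$$\begin{aligned} {\tilde{\nabla }}_\rho \big (\sqrt{g}\,g^{\rho \mu }\phi ^2\big )=-10\,\sqrt{g}\,\phi ^2\,V^\mu&=\frac{10}{\alpha ^2\xi _0}\,\sqrt{g}\,\nabla _\rho F^{\rho \mu },\\ -\phi ^2\,V_\rho =2\,\phi \,\partial _\rho \phi -\phi ^2\,v_\rho&=2\,\phi \,\big (\partial _\rho -1/2\,v_\rho \big )\,\phi ,\\ \Rightarrow \quad J^\mu&=\frac{1}{2\,\alpha ^2\xi _0}\,\sqrt{g}\,\nabla _\rho F^{\rho \mu }. \end{aligned}$$

The divergence of the rhs vanishes because \(\nabla _\mu \nabla _\rho F^{\rho \mu }=0\) for antisymmetric \(F^{\rho \mu }\) (the commutator of Levi-Civita covariant derivatives produces only the symmetric Ricci tensor contracted with \(F^{\rho \mu }\)). For \(\phi \) constant, \(J^\mu =-(1/2)\,\sqrt{g}\,\phi ^2\,v^\mu \), so conservation reduces to \(\nabla _\mu v^\mu =0\).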

4 Palatini quadratic gravity: adding matter

In this section we re-do the previous analysis in the presence of a scalar \(\chi \) which can be the SM Higgs, with non-minimal coupling with Palatini connection to the EP quadratic gravity.

The general Lagrangian of the field \(\chi \), with gauged scale invariance, Eqs. (4) and (18) is

$$\begin{aligned} L_3= & {} \sqrt{g}\,\left[ \, \frac{\xi _0}{4!}\,R(\tilde{\Gamma },g)^2 - \frac{1}{4 \alpha ^2}\, F_{\mu \nu }^2 - \frac{1}{12} \,\xi _1\chi ^2\,R(\tilde{\Gamma },g)\right. \nonumber \\&\left. +\frac{1}{2} \,({\tilde{D}}_\mu \chi )^2 - \frac{\lambda _1}{4!}\chi ^4\right] , \end{aligned}$$
(35)

with the potential dictated by this symmetry and with

$$\begin{aligned} {\tilde{D}}_\mu \chi =(\partial _\mu -1/2\,v_\mu )\,\chi . \end{aligned}$$
(36)

Under (4) and (18) the Weyl-covariant derivative transforms as \(\hat{{\tilde{D}}}_\mu {\hat{\chi }}=(1/\Omega )\, {\tilde{D}}_\mu \chi \). As in previous sections, replace \(R(\tilde{\Gamma },g)^2\!\rightarrow \! -2 \phi ^2 R(\tilde{\Gamma },g)-\phi ^4\) to find an equivalent “linearised” \(L_3\)

$$\begin{aligned} L_3&=\sqrt{g}\,\left[ \, - \frac{1}{2} \,\rho ^2\,R(\tilde{\Gamma },g) - \frac{1}{4 \alpha ^2}\,F_{\mu \nu }^2 \right. \nonumber \\&\quad \left. + \frac{1}{2} \,({\tilde{D}}_\mu \chi )^2 -{{\mathcal {V}}}(\chi ,\rho ) \right] , \end{aligned}$$
(37)

where

$$\begin{aligned}&{{\mathcal {V}}}(\chi ,\rho )\equiv \frac{1}{4!}\,\left[ \frac{1}{\xi _0} (6\rho ^2-\xi _1\chi ^2)^2 + \lambda _1\chi ^4\right] ,\nonumber \\&\quad \text {and}\quad \rho ^2=\frac{1}{6}\,(\xi _1\chi ^2+\xi _0\phi ^2). \end{aligned}$$
(38)

Notice that we also replaced the scalar field \(\phi \) by the new, radial direction field \(\rho \); \(\ln \rho \) transforms as \(\ln \rho \rightarrow \ln \rho -\ln \Omega \) and acts as the (would-be) Goldstone of the symmetry.
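As a quick numerical cross-check of (38), the following Python sketch compares the potential written in terms of \(\rho \) with the quartic terms that follow directly from the replacement \(R(\tilde{\Gamma },g)^2\rightarrow -2 \phi ^2 R(\tilde{\Gamma },g)-\phi ^4\) in (35), namely \((\xi _0\phi ^4+\lambda _1\chi ^4)/4!\). The function names and the sampled parameter ranges are illustrative assumptions, not part of the original analysis.

```python
import math
import random

def potential_38(chi, rho, xi0, xi1, lam1):
    """Scalar potential V(chi, rho) as written in Eq. (38)."""
    return ((6.0 * rho**2 - xi1 * chi**2) ** 2 / xi0 + lam1 * chi**4) / 24.0

def potential_in_phi(chi, phi, xi0, lam1):
    """Quartic terms of Eq. (35) after R^2 -> -2 phi^2 R - phi^4, before introducing rho."""
    return (xi0 * phi**4 + lam1 * chi**4) / 24.0

def rho_squared(chi, phi, xi0, xi1):
    """Radial combination rho^2 = (xi1 chi^2 + xi0 phi^2) / 6 from Eq. (38)."""
    return (xi1 * chi**2 + xi0 * phi**2) / 6.0

def max_relative_mismatch(samples=1000, seed=0):
    """Largest relative difference between the two forms over random sample points."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        chi = rng.uniform(0.1, 3.0)
        phi = rng.uniform(0.1, 3.0)
        xi0 = rng.uniform(0.1, 5.0)
        xi1 = rng.uniform(0.1, 5.0)
        lam1 = rng.uniform(0.1, 5.0)
        rho = math.sqrt(rho_squared(chi, phi, xi0, xi1))
        a = potential_38(chi, rho, xi0, xi1, lam1)
        b = potential_in_phi(chi, phi, xi0, lam1)
        worst = max(worst, abs(a - b) / abs(b))
    return worst

if __name__ == "__main__":
    # Should be at the level of floating-point rounding.
    print(max_relative_mismatch())
```

The agreement reflects the identity \(6\rho ^2-\xi _1\chi ^2=\xi _0\phi ^2\), so that trading \(\phi \) for \(\rho \) changes only the parametrisation of the potential.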

The equation of motion for \(\tilde{\Gamma }_{\mu \nu }^\lambda \) is similar to (19) but with a replacement \(\phi \rightarrow \rho \) and with an additional contribution from the kinetic term of \(\chi \). Following the same steps as in the previous section, we eliminate the contributions of the kinetic terms of \(\chi \) and \(v_\mu \) to the equation of \(\tilde{\Gamma }\) and find an equation similar to (21) with \(\phi \rightarrow \rho \):

$$\begin{aligned} {\tilde{\nabla }}_\lambda (\sqrt{g}g^{\mu \nu }\rho ^2) -\frac{1}{5} \Big \{ \delta _\lambda ^\nu \, {\tilde{\nabla }}_\sigma (\sqrt{g}g^{\sigma \mu }\rho ^2)+(\mu \leftrightarrow \nu ) \Big \}=0.\nonumber \\ \end{aligned}$$
(39)

This gives (see previous section):

$$\begin{aligned} {\tilde{\nabla }}_\lambda (\rho ^2 g_{\mu \nu }) =(-2)\rho ^2 (g_{\mu \nu }\,V_\lambda -g_{\mu \lambda } V_\nu - g_{\nu \lambda } V_\mu ), \end{aligned}$$
(40)

where \(V_\mu =(-1/2){\tilde{\nabla }}_\mu \ln (\sqrt{g}\rho ^4)=v_\mu -\partial _\mu \ln \rho ^2\). From (40) one finds the solution for Palatini connection \(\tilde{\Gamma }_{\mu \nu }^\alpha \) in terms of \(v_\mu \sim \tilde{\Gamma }_\mu \), with a result similar to (25) but with \(\phi \rightarrow \rho \). We use this solution for the connection back in the action and find for \(\tilde{\Gamma }\) onshellFootnote 22

$$\begin{aligned} L_3= & {} \sqrt{g}\left\{ \frac{-1}{2} [\rho ^2 R(g) + 6 (\partial _\mu \rho )^2] + 3 \rho ^2 (v_\mu -\partial _\mu \ln \rho ^2)^2 \right. \nonumber \\&\left. -\frac{1}{4 \alpha ^2} \,F_{\mu \nu }^2 +\frac{1}{2} ({\tilde{D}}_\mu \chi )^2 - {{\mathcal {V}}}(\chi ,\rho )\right\} . \end{aligned}$$
(41)

\(L_3\) has a gauged scale symmetry and extends (29) to the case with a scalar field \(\chi \).

Finally, we choose the Einstein gauge by using transformation (4) and (18) of a particular \(\Omega \!=\!\rho /M\) which essentially sets \({\hat{\rho }}\) to a constant (vev). In terms of the new variables (with a hat) we find

$$\begin{aligned} L_3= & {} \sqrt{{\hat{g}}}\, \left\{ -\frac{1}{2} M^2\,R({\hat{g}}) +3 M^2 {\hat{v}}_\mu {\hat{v}}^\mu -\frac{1}{4 \alpha ^2} {\hat{F}}_{\mu \nu }^2\right. \nonumber \\&\left. +\frac{1}{2}(\hat{{\tilde{D}}}_\mu {\hat{\chi }})^2 -{{\mathcal {V}}}({\hat{\chi }},M) \right\} , \end{aligned}$$
(42)

with \({\hat{{\tilde{D}}}}_\mu {\hat{\chi }}=(\partial _\mu -1/2\,\,{\hat{v}}_\mu ){\hat{\chi }}\) and we identify M with the Planck scale (\(M=\langle {\hat{\rho }}\rangle \)). As in the absence of matter, we obtained the Einstein–Proca action of a gauge field that became massive through the Stueckelberg mechanism of “absorbing” the derivative term \(\partial _\mu (\ln \rho )\). A canonical kinetic term of \({\hat{\chi }}\) remained present in the action, since only one degree of freedom (radial direction \(\rho \)) was “eaten” by \(v_\mu \). The mass of \(v_\mu \) is \(m_v^2=6 \alpha ^2 M^2\). The potential becomes

$$\begin{aligned} {{\mathcal {V}}}= \frac{3 M^4}{2\,\xi _0} \left[ 1-\frac{\xi _1{\hat{\chi }}^2}{6 \,M^2} \right] ^2 + \frac{\lambda _1}{4!}\,{\hat{\chi }}^4. \end{aligned}$$
(43)

For a “standard” kinetic term for \({\hat{\chi }}\), similar to a “unitary gauge” in electroweak case, we remove the coupling \({\hat{v}}^\mu \partial _\mu {\hat{\chi }}\) in the Weyl-covariant derivative in (42) by a field redefinition

$$\begin{aligned} {{\hat{v}}_\mu ^\prime }= & {} {\hat{v}}_\mu - \partial _\mu \ln \cosh ^2 \left[ \frac{\sigma }{2 M\sqrt{6}}\right] ,\nonumber \\ {\hat{\chi }}= & {} 2 M\sqrt{6}\,\sinh \left[ \frac{\sigma }{2 M\sqrt{6}}\right] , \end{aligned}$$
(44)

which replaces \({\hat{\chi }}\rightarrow \sigma \). After some algebra, we find the final Lagrangian

$$\begin{aligned} L_3= & {} \sqrt{{\hat{g}}}\, \left\{ -\frac{1}{2} M^2\,\hat{R} + 3 M^2 \cosh ^2 \left[ \frac{\sigma }{2 M\sqrt{6}}\right] {\hat{v}}^\prime _\mu {\hat{v}}^{\prime \,\mu } \right. \nonumber \\&\left. -\frac{1}{4 \alpha ^2} {\hat{F}}^{\prime \, 2}_{\mu \nu } +\frac{{\hat{g}}^{\mu \nu }}{2}\partial _\mu \sigma \partial _\nu \sigma -{{\hat{{{\mathcal {V}}}}}}(\sigma )\right\} \end{aligned}$$
(45)

with

$$\begin{aligned} {{\hat{{{\mathcal {V}}}}}(\sigma )}= & {} {\hat{{{\mathcal {V}}}}}_0\, \left\{ \left[ 1-4 \xi _1\sinh ^2 \frac{\sigma }{2 M\sqrt{6}}\right] ^2 \right. \nonumber \\&\left. + 16 \,\lambda _1 \xi _0 \sinh ^4\frac{\sigma }{2 M\sqrt{6}} \,\right\} ,\quad {\hat{{{\mathcal {V}}}}}_0\equiv \frac{3}{2} \frac{M^4}{\xi _0}. \end{aligned}$$
(46)

In (45) one finally rescales \({\hat{v}}^\prime _\mu \rightarrow \alpha \,{\hat{v}}_\mu ^\prime \) for a canonical gauge kinetic term.
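The algebra leading from (42)–(44) to (45) and (46) can be spot-checked numerically. The Python sketch below tests three statements: that (43) evaluated on the redefinition (44) reproduces (46); that the coefficient of the vector mass term obtained by expanding \(\frac{1}{2}(\hat{{\tilde{D}}}_\mu {\hat{\chi }})^2+3M^2{\hat{v}}_\mu {\hat{v}}^\mu \), which is \(3M^2+{\hat{\chi }}^2/8\), equals \(3M^2\cosh ^2[\sigma /(2M\sqrt{6})]\); and that \(d{\hat{\chi }}/d\sigma =\cosh [\sigma /(2M\sqrt{6})]\), which is what makes the kinetic term of \(\sigma \) canonical. The function names and the default parameter values are arbitrary choices made for illustration.

```python
import math

SQRT6 = math.sqrt(6.0)

def chi_of_sigma(sigma, M):
    """Field redefinition of Eq. (44): chi = 2 M sqrt(6) sinh(sigma / (2 M sqrt(6)))."""
    return 2.0 * M * SQRT6 * math.sinh(sigma / (2.0 * M * SQRT6))

def potential_43(chi, M, xi0, xi1, lam1):
    """Potential of Eq. (43), written in terms of chi."""
    bracket = 1.0 - xi1 * chi**2 / (6.0 * M**2)
    return 1.5 * M**4 / xi0 * bracket**2 + lam1 * chi**4 / 24.0

def potential_46(sigma, M, xi0, xi1, lam1):
    """Potential of Eq. (46), written in terms of sigma."""
    s2 = math.sinh(sigma / (2.0 * M * SQRT6)) ** 2
    v0 = 1.5 * M**4 / xi0
    return v0 * ((1.0 - 4.0 * xi1 * s2) ** 2 + 16.0 * lam1 * xi0 * s2**2)

def redefinition_mismatch(sigma, M=1.0, xi0=2.0, xi1=0.3, lam1=0.1, h=1e-5):
    """Absolute mismatches for the potential, the vector mass coefficient and d chi / d sigma."""
    chi = chi_of_sigma(sigma, M)
    cosh = math.cosh(sigma / (2.0 * M * SQRT6))
    d_pot = abs(potential_43(chi, M, xi0, xi1, lam1)
                - potential_46(sigma, M, xi0, xi1, lam1))
    d_mass = abs(3.0 * M**2 + chi**2 / 8.0 - 3.0 * M**2 * cosh**2)
    # Central finite difference for d chi / d sigma.
    dchi = (chi_of_sigma(sigma + h, M) - chi_of_sigma(sigma - h, M)) / (2.0 * h)
    d_kin = abs(dchi - cosh)
    return d_pot, d_mass, d_kin

if __name__ == "__main__":
    for s in (0.5, 2.0, 5.0):
        print(s, redefinition_mismatch(s))
```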

For small field values, \(\sigma \ll M\), then \({\hat{\chi }}\approx \sigma \) (up to \({{\mathcal {O}}}(\sigma ^3/M^2)\)) and a SM Higgs-like potential is recovered,Footnote 23 see Eq. (43). For \(\xi _1>0\) it has spontaneous breaking of the symmetry carried by \(\sigma \) i.e. electroweak (EW) symmetry if \(\sigma \) is the Higgs; this is triggered by the non-minimal coupling to gravity (\(\xi _1\not = 0\)) and Stueckelberg mechanism. The negative mass term originates in (38) due to the \(\phi ^4\) term (itself induced by \({\tilde{R}}^2\)). The mass \(m_\sigma ^2\propto \xi _1 M^2/\xi _0\) may be small enough, near the EW scale by tuning \(\xi _1\ll \xi _0\). It may be interesting to study if the gauged scale symmetry brings some “protection” to \(m_\sigma \) at the quantum level.
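To make the small-field statement concrete: expanding (46) to quadratic order in \(\sigma \) gives \({\hat{{{\mathcal {V}}}}}\approx {\hat{{{\mathcal {V}}}}}_0-\frac{1}{2}(\xi _1 M^2/\xi _0)\,\sigma ^2\), i.e. a negative mass term of size \(\xi _1M^2/\xi _0\). The following Python sketch checks this with a finite-difference second derivative at the origin; the function names, the step size and the parameter values in the usage lines are illustrative assumptions.

```python
import math

SQRT6 = math.sqrt(6.0)

def potential_46(sigma, M, xi0, xi1, lam1):
    """Potential of Eq. (46), written in terms of sigma."""
    s2 = math.sinh(sigma / (2.0 * M * SQRT6)) ** 2
    v0 = 1.5 * M**4 / xi0
    return v0 * ((1.0 - 4.0 * xi1 * s2) ** 2 + 16.0 * lam1 * xi0 * s2**2)

def curvature_at_origin(M, xi0, xi1, lam1, h=1e-3):
    """Central finite-difference estimate of d^2 V / d sigma^2 at sigma = 0."""
    def v(s):
        return potential_46(s, M, xi0, xi1, lam1)
    return (v(h) - 2.0 * v(0.0) + v(-h)) / h**2

if __name__ == "__main__":
    M, xi0, xi1, lam1 = 1.0, 2.0, 0.3, 0.1
    # Numerical curvature versus the leading-order expectation -xi1 M^2 / xi0.
    print(curvature_at_origin(M, xi0, xi1, lam1), -xi1 * M**2 / xi0)
```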

\(L_3\) of (45) is similar to that in Weyl quadratic gravity with a non-minimally coupled scalar/Higgs field [26,27,28],Footnote 24 up to a rescaling of the couplings (\(\xi _1\), \(\lambda _1\)) and fields (\(\sigma \)). This difference is due to the different non-metricity of the two theories, Eqs. (32) and (33). Both cases provide a gauged scale invariant theory of quadratic gravity coupled to matter. They both recover Einstein gravity in their broken phase, see Eq. (45), and also metricity below the scale \(m_v\sim \alpha M\) (\(\alpha \le 1\)). This result may be more general – it may apply to other theories with this symmetry and can be used for model building.

To conclude, mass generation (Planck scale, \(v_\mu \) mass) and Einstein gravity emerge naturally from spontaneous breaking of gauged scale symmetry in Einstein–Palatini theories, even in the absence of matter. Actions (14) and (35) were inspired by Weyl quadratic gravity, which has a similar breaking [27]; in a more general case, additional operators may be present in (14) and (35); for a list of all quadratic operators and a complementary study see [14]. The mechanism of symmetry breaking should remain valid in their presence if one includes the terms in (14): \(R^2\), which “supplied” the scalar field, and \(R_{[\mu \nu ]}^2\), which generates the symmetry and non-metricity. However, in such a general case it is unclear whether one can still solve algebraically the second-order differential equations of motion of \(\tilde{\Gamma }\) (Eq. (19)) without simplifying assumptions, since these equations acquire new terms of different index structure and new states will be present (ghosts, etc.).

5 Palatini \(R^2\) inflation

In this section we consider an application to inflation of the action in the previous section.

For large field values, the potential in (46) can also be used for inflation (hereafter Palatini \(R^2\) inflation), with \(\sigma \) as the inflaton.Footnote 25 For a Friedmann–Robertson–Walker (FRW) metric \((1,-a^2(t),-a^2(t), -a^2(t))\) and a compatible background \(v_\mu (t)=(v_0(t),0,0,0)\), the gauge fixing condition \(\nabla _\mu v^\mu =0\) implies that \(v_\mu (t)\) redshifts to zero, \(v_\mu (t)\sim 1/a^3(t)\). Then the coupling between \(v_\mu \) and \(\sigma \) in (45) vanishes and therefore \(v_\mu (t)\) cannot affect inflation; this means we have single-field inflation with potential (46) and standard slow-roll formulae can be used. Further, since M is just a phase transition scale, field values \(\sigma \ge M\) are natural. \({\hat{{{\mathcal {V}}}}}(\sigma )\) is similar to that in Weyl gravity \(R^2\)-inflation, see [28, 47] for a detailed analysisFootnote 26; however, as mentioned, the couplings and field normalization in the potential differ (for same initial couplings and non-metricity trace); hence the spectral index \(n_s\) and tensor-to-scalar ratio r are different, too, and need to be analyzed separately.
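The redshifting of the background vector field follows from the FRW form of the divergence: for a homogeneous \(v^\mu =(v^0(t),0,0,0)\) one has \(\nabla _\mu v^\mu =a^{-3}\,d(a^3 v^0)/dt={\dot{v}}^0+3H v^0\), with \(H={\dot{a}}/a\), so that \(\nabla _\mu v^\mu =0\) gives \(v^0\propto 1/a^3\). The Python sketch below verifies this with finite differences. The exponential scale factor \(a(t)=e^{Ht}\) and all numerical values are assumptions chosen only for illustration.

```python
import math

def divergence_residual(t, H=1.0, c=1.0, h=1e-5):
    """Finite-difference value of dv0/dt + 3 (da/dt / a) v0 for v0 = c / a^3.

    The scale factor a(t) = exp(H t) is an illustrative assumption.
    """
    def a(x):
        return math.exp(H * x)

    def v0(x):
        return c / a(x) ** 3

    dv0 = (v0(t + h) - v0(t - h)) / (2.0 * h)
    hubble = (a(t + h) - a(t - h)) / (2.0 * h * a(t))
    return dv0 + 3.0 * hubble * v0(t)

if __name__ == "__main__":
    # The residual should vanish up to finite-difference error.
    print(divergence_residual(0.3))
```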

The potential is shown in Fig. 1 for perturbative values of the couplings relevant for successful inflation. This demands \(\lambda _1 \xi _0\ll \xi _1^2\ll 1\), with the first relation from demanding that the initial energy be larger than at the end of inflation \({\hat{{{\mathcal {V}}}}}_0>{\hat{{{\mathcal {V}}}}}_{\mathrm{min}}\), respected by choosing a small enough \(\lambda _1\) for given \(\xi _{0,1}\). Therefore, we shall work in the leading order in \((\lambda _1\xi _0)\).

The slow-roll parameters are:

$$\begin{aligned} \epsilon= & {} \frac{M^2}{2} \left\{ \frac{{\hat{{{\mathcal {V}}}^\prime }}(\sigma )}{{\hat{{{\mathcal {V}}}}}(\sigma )}\right\} ^2 = \frac{{4}}{3}\, \,\xi _1^2\, \sinh ^2 \frac{\sigma }{M\sqrt{6}} +{{\mathcal {O}}}(\xi _1^3)\, \end{aligned}$$
(47)
$$\begin{aligned} \eta= & {} M^2\,\frac{{\hat{{{\mathcal {V}}}}}^{\prime \prime }(\sigma )}{{\hat{{{\mathcal {V}}}}}(\sigma )}= -\frac{2}{3}\,\xi _1 \,\cosh {\frac{\sigma }{M\sqrt{6}}}+{{\mathcal {O}}}(\xi _1^2). \end{aligned}$$
(48)

Then

$$\begin{aligned} n_s = 1 + 2\,\eta _* - 6\, \epsilon _*= 1 -\frac{4}{3}\,\xi _1 \,\cosh \frac{\sigma _*}{M\sqrt{6}}+{{\mathcal {O}}}(\xi _1^2),\nonumber \\ \end{aligned}$$
(49)

where \(\sigma _*\) is the value of \(\sigma \) at the horizon exit. With \(r=16 \epsilon _*\) we haveFootnote 27

$$\begin{aligned} r={12}\, (1- n_s)^2+{{\mathcal {O}}}(\xi _1^2). \end{aligned}$$
(50)

The contribution of \(\epsilon \) is subleading for small \(\xi _1\) considered here. The slope of the curves in the plane \((n_s,r)\), shown in leading order in (50), is steeper than in Weyl \(R^2\) inflation [28] (or Starobinsky model) where \(r=3 (1-n_s)^2+{{\mathcal {O}}}(\xi _1^2)\).
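The leading-order expressions (47)–(50) can be compared with the slow-roll parameters computed directly from the potential (46). At leading order, \(16\epsilon =12(1-n_s)^2-(64/3)\,\xi _1^2\), because \(\sinh ^2 x=\cosh ^2 x-1\); the last term is the \({{\mathcal {O}}}(\xi _1^2)\) correction in (50). The Python sketch below works in units \(M=1\) and uses analytic derivatives of (46); the function names and the sample values of \(\sigma \), \(\xi _1\) and \(\lambda _1\xi _0\) are illustrative assumptions.

```python
import math

SQRT6 = math.sqrt(6.0)

def slow_roll_exact(sigma, xi1, lam_xi0):
    """epsilon and eta from Eq. (46) in units M = 1; lam_xi0 is the product lambda_1 xi_0."""
    u = sigma / SQRT6
    S = math.sinh(0.5 * u) ** 2           # sinh^2(sigma / (2 sqrt 6))
    dS = math.sinh(u) / (2.0 * SQRT6)     # dS / d sigma
    d2S = math.cosh(u) / 12.0             # d^2 S / d sigma^2
    f = 1.0 - 4.0 * xi1 * S
    V = f * f + 16.0 * lam_xi0 * S * S    # potential in units of V_0
    g = -8.0 * xi1 * f + 32.0 * lam_xi0 * S
    dV = g * dS
    d2V = (32.0 * xi1**2 + 32.0 * lam_xi0) * dS * dS + g * d2S
    return 0.5 * (dV / V) ** 2, d2V / V

def slow_roll_leading(sigma, xi1):
    """Leading-order expressions of Eqs. (47) and (48) in units M = 1."""
    u = sigma / SQRT6
    return (4.0 / 3.0) * xi1**2 * math.sinh(u) ** 2, -(2.0 / 3.0) * xi1 * math.cosh(u)

def consistency_offset(sigma, xi1):
    """(r - 12 (1 - n_s)^2) / xi1^2 for lambda_1 xi_0 = 0; expected near -64/3 for small xi1."""
    eps, eta = slow_roll_exact(sigma, xi1, 0.0)
    one_minus_ns = 6.0 * eps - 2.0 * eta  # from n_s = 1 + 2 eta - 6 epsilon
    return (16.0 * eps - 12.0 * one_minus_ns**2) / xi1**2

if __name__ == "__main__":
    print(slow_roll_exact(3.0, 1e-3, 1e-10), slow_roll_leading(3.0, 1e-3))
    print(consistency_offset(3.0, 1e-5), -64.0 / 3.0)
```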

Fig. 1 Left plot: The potential \({\hat{{{\mathcal {V}}}}}(\sigma )/{\hat{{{\mathcal {V}}}}}_0\) for \(\lambda _1\xi _0=10^{-10}\!\ll \! \xi _1^2\) with different \(\xi _1\!\ll \! 1\). For larger \(\lambda _1\xi _0\) the curves move to the left while the minimum of the rightmost ones is lifted. Larger values of \(\lambda _1\xi _0\) are allowed, but inflation becomes less likely when \(\lambda _1\xi _0\sim \xi _1^2\). The flat region is wide for a large range of \(\sigma \), with the width controlled by \(1/\sqrt{\xi _1}\) while its height is \({\hat{{{\mathcal {V}}}}}_0\propto 1/\xi _0\). We have \({\hat{{{\mathcal {V}}}}}/{\hat{{{\mathcal {V}}}}}_{\mathrm{min}}\propto \xi _1^2/(\lambda _1\xi _0)\). Right plot: The values of \((n_s,r)\) for different values of \(\xi _1\) that enable values of \(n_s=0.9670\pm 0.0037\) at 68% CL (blue band) and 95% CL (light blue region). For each curve \(N=60\) efolds is marked by a red point and the dark blue interval corresponds to \(55\le N\le 65\). Curves of \(\xi _1<10^{-3}\) are degenerate with the red one while those with \(\xi _1>2.5 \times 10^{-2}\) have \(N>65\)

The exact numerical results for \((n_s,r)\) in our model, for different numbers of e-folds N, are shown in Fig. 1. The experimental data are \(n_s=0.9670\pm 0.0037\) (\(68\%\) CL) and \(r<0.07\) (\(95\%\) CL), from Planck 2018 (TT, TE, EE + low E + lensing + BK14 + BAO) [105]. Using these data, Fig. 1 (right plot) shows that a specific, small range for r is predicted in our model for the current range for \(n_s\) at \(95\%\) CL:

$$\begin{aligned} N=60,\quad 0.007 \le r\le 0.010,\quad [\text {Palatini }R^2\text { inflation}]. \end{aligned}$$
(51)

Similar values for r can be read from Fig. 1 for \(55\le N\le 65\). The lower bound on r comes from that for \(n_s\) while the upper one corresponds to a saturation limit, \(\xi _1\rightarrow 0\), with values \(\xi _1<10^{-3}\) having similar \((n_s,r)\). One should also respect the constraint \(\lambda _1\le \xi _1^2/\xi _0\), giving \(\lambda _1\sim 10^{-12}\) or smaller (with the CMB anisotropy constraint \(\xi _0\ge 6.89\times 10^8\)).
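A rough way to reproduce values of this kind is sketched below in Python. It is not the numerical procedure used for Fig. 1, and it rests on several assumptions that should be kept in mind: the potential (46) is taken in the limit \(\lambda _1\xi _0\rightarrow 0\), so that \({\hat{{{\mathcal {V}}}}}={\hat{{{\mathcal {V}}}}}_0 f^2\) with \(f=1-4\xi _1\sinh ^2[\sigma /(2M\sqrt{6})]\); inflation is taken to end at \(\epsilon =1\); the number of e-folds is computed in the slow-roll approximation, \(N=M^{-2}\int _{\sigma _*}^{\sigma _e}({\hat{{{\mathcal {V}}}}}/|{\hat{{{\mathcal {V}}}}}^\prime |)\,d\sigma \); and units \(M=1\) are used. All function names are hypothetical.

```python
import math

SQRT6 = math.sqrt(6.0)
SQRT2 = math.sqrt(2.0)

def shape(sigma, xi1):
    """f = 1 - 4 xi1 sinh^2(sigma / (2 sqrt 6)) and its first two sigma-derivatives (M = 1)."""
    u = sigma / SQRT6
    f = 1.0 - 4.0 * xi1 * math.sinh(0.5 * u) ** 2
    df = -4.0 * xi1 * math.sinh(u) / (2.0 * SQRT6)
    d2f = -4.0 * xi1 * math.cosh(u) / 12.0
    return f, df, d2f

def slow_roll(sigma, xi1):
    """epsilon and eta for V = V0 f^2, i.e. Eq. (46) with lambda_1 xi_0 -> 0."""
    f, df, d2f = shape(sigma, xi1)
    eps = 2.0 * (df / f) ** 2                    # (1/2) (V'/V)^2 with V'/V = 2 f'/f
    eta = 2.0 * (df * df + f * d2f) / (f * f)    # V''/V
    return eps, eta

def sigma_end(xi1):
    """Field value where epsilon = 1, by bisection between sigma = 0 and the zero of f."""
    lo, hi = 0.0, 2.0 * SQRT6 * math.asinh(0.5 / math.sqrt(xi1))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f, df, _ = shape(mid, xi1)
        if SQRT2 * abs(df) < f:   # epsilon < 1: still inflating
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def efolds(sigma_star, sigma_e, xi1, steps=2000):
    """Slow-roll e-folds: Simpson integral of V/|V'| = f / (2 |f'|) from sigma_star to sigma_e."""
    h = (sigma_e - sigma_star) / steps
    total = 0.0
    for i in range(steps + 1):
        f, df, _ = shape(sigma_star + i * h, xi1)
        weight = 1.0 if i in (0, steps) else (4.0 if i % 2 else 2.0)
        total += weight * f / (2.0 * abs(df))
    return total * h / 3.0

def predictions(xi1, n_target=60.0):
    """Return (n_s, r) at horizon exit, n_target e-folds before the end of inflation."""
    s_end = sigma_end(xi1)
    lo, hi = 1e-6, s_end
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if efolds(mid, s_end, xi1) > n_target:
            lo = mid   # too many e-folds: horizon exit lies closer to the end
        else:
            hi = mid
    eps, eta = slow_roll(0.5 * (lo + hi), xi1)
    return 1.0 + 2.0 * eta - 6.0 * eps, 16.0 * eps

if __name__ == "__main__":
    for xi1 in (1e-3, 1e-2):
        print(xi1, predictions(xi1))
```

Under these assumptions, small values of \(\xi _1\) give r close to the upper end of the range in (51), and increasing \(\xi _1\) lowers both \(n_s\) and r, in line with the behaviour described above.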

For comparison, in Weyl \(R^2\)-inflation for same \(n_s\) at \(95\%\) CL one has a smaller r [28, 47]

$$\begin{aligned} N=60, \quad 0.00257\le r\le 0.00303,\quad [\text {Weyl }R^2 \text { inflation}].\nonumber \\ \end{aligned}$$
(52)

The different range for r in Eq. (51) versus Eq. (52) is important since it enables us to distinguish these two inflation models based on gauged scale invariance, and is due to their different non-metricity.Footnote 28\(^,\)Footnote 29 Such values for \(r\sim 10^{-3}\) will soon be reached by various CMB experiments [79,80,81] that will then be able to test both models. This establishes an interesting connection between non-metricity and testable inflation predictions.

Similar values for r were found in other recent inflation models in Palatini \(R^2\) gravity [102,103,104] but these are not gauged scale invariant. In the absence of this symmetry, other successful models (e.g. Starobinsky model [106]) have corrections to r from higher curvature operators (\(R^4\), etc.) of unknown coefficients [108]. Such operators (and their corrections) are not allowed here because they must be suppressed by some effective scale whose presence would violate scale invariance.Footnote 30 Another advantage is that due to the gauged scale symmetry Palatini \(R^2\) inflation is allowed by black-hole physics (similarly for Weyl \(R^2\) inflation [28]), in contrast to models of inflation with global scale symmetry.Footnote 31

6 Conclusions

At a fundamental level gravity may be regarded as a theory of connections. An example is the Einstein–Palatini (EP) approach to gravity where the connection (\(\tilde{\Gamma }\)) is a priori independent of the metric, and is determined by its equation of motion, from the action. For simple actions \(\tilde{\Gamma }\) plays an auxiliary role (no dynamics) and can be solved algebraically. In particular, for the Einstein action in the EP approach one finds that the connection is actually equal to the Levi-Civita connection (of the metric formulation); then Einstein gravity is recovered, so the metric and EP approaches are equivalent. However, this equivalence is not true in general, for complicated actions, etc. In this work we considered quadratic gravity actions in the EP approach, with the goal of showing that, while this equivalence does not hold true, one can still find actions that recover dynamically the Levi-Civita connection, metricity, Einstein gravity and Planck mass in some “low-energy” limit, even in the absence of matter.

We studied EP quadratic gravity given by \(R(\tilde{\Gamma },g)^2+R_{[\mu \nu ]}(\tilde{\Gamma })^2\) which has local scale symmetry. \(R_{[\mu \nu ]}(\tilde{\Gamma })^2\) can be regarded as a gauge kinetic term for the vector field \(v_\mu \sim \tilde{\Gamma }_\mu -\Gamma _\mu \) where \(\tilde{\Gamma }_\mu \) (\(\Gamma _\mu \)) denotes the trace of the Palatini (Levi-Civita) connections, respectively. Hence this theory actually has a gauged scale symmetry, with \(v_\mu \) the Weyl gauge field. A consequence of this symmetry is that the theory is non-metric i.e. \({\tilde{\nabla }}_\mu g_{\alpha \beta }\not =0\) (due to dynamical \(v_\mu \sim \tilde{\Gamma }_\mu \)). At the same time, the equations of motion of the connection (\(\tilde{\Gamma }\)) become complicated second-order differential equations and we showed how to solve them algebraically in terms of an auxiliary scalar \(\phi \) that “linearises” the \(R(\tilde{\Gamma },g)^2\) term. While initially the action appears to be of second order, for \(\tilde{\Gamma }\) onshell it is a higher derivative theory since \(R(\tilde{\Gamma },g)^2\) contains a (four-derivative) metric contribution \(R(g)^2+\cdots \). We showed that for \(\tilde{\Gamma }\) onshell, the action is equivalent to a second-order theory in which the initial auxiliary field \(\phi \) has become dynamical, while preserving the symmetry of the theory.

The main result is that our EP quadratic gravity action has an elegant spontaneous breaking mechanism of gauged scale invariance and mass generation valid even in the absence of matter; in this, the necessary scalar field (\(\phi \)) was not added ad hoc for this purpose (as usually done), but was “extracted” from the \(R^2\) term, as mentioned, being of geometric origin. The derivative \(\partial _\mu \ln \phi \) of this field, acting as a Stueckelberg field, is “eaten” by \(v_\mu \) which becomes massive, of mass \(m_v\) proportional to the Planck scale \(M\sim \langle \phi \rangle \). One obtains the Einstein–Proca action for the gauge field \(v_\mu \) and a positive cosmological constant. This is a “low-energy” broken phase of the initial action. Below the scale \(m_v\sim M\), the Proca field \(v_\mu \) decouples and metricity and the Einstein action are recovered. Non-metricity effects are strongly suppressed by a large scale (\(\propto M\)), which is important for the theory to be viable.

The above results remain valid in the presence of scalar matter (Higgs, etc.) with a (perturbative) non-minimal coupling to this theory with a Palatini connection; in such case and following the Stueckelberg mechanism, the scalar potential also has a breaking of the symmetry under which this scalar is charged, e.g. electroweak symmetry in the Higgs case. This is relevant for building models with this symmetry for physics beyond the SM.

To summarise, Einstein–Palatini quadratic gravity \(R(\tilde{\Gamma },g)^2+R_{[\mu \nu ]}^2(\tilde{\Gamma })\) is a gauged theory of scale invariance that is spontaneously broken to the Einstein–Proca action for the Weyl field with a positive cosmological constant; if the initial action also contains (non-minimally coupled) scalar fields with Palatini connection, a scalar potential is also present.

This picture is similar to a recent analysis for the original Weyl quadratic gravity, despite the different non-metricity of these two theories. With hindsight, this is not too surprising, since in both theories there is a gauged scale symmetry and the connection is not fixed by the metric, except that in Weyl gravity non-metricity is present from the onset (due to underlying Weyl conformal geometry) while here it emerges for \(\tilde{\Gamma }\) onshell. It is worth studying further the relation of these two theories, by including any remaining operators (on the Einstein–Palatini side) that can have this symmetry.

There are also interesting predictions from inflation. While the scalar potential is Higgs-like for small field values (\(\ll M\)), for large field values it can be used for inflation. With the Planck scale M a simple phase transition scale, field values above M are natural. The inflaton potential is similar to that in Weyl quadratic gravity, up to couplings and field redefinitions (due to different non-metricity of the two theories). We find a specific prediction for the tensor-to-scalar ratio, \(0.007\le r \le 0.01\), for the current value of the spectral index at \(95\%\) CL. This value of r is mildly larger than that predicted by inflation in Weyl gravity. This enables us to distinguish and test these two theories by future CMB experiments that will reach such values of r. It also establishes an interesting connection between non-metricity and inflation predictions.