1 Introduction

Standard general relativity (GR) has proven to be a reliable theory of gravitation in many instances (see also [1]). All predictions of the theory in vacuum have been confirmed up to the experimental precision, both locally in the Solar System and in astrophysical regimes such as binary systems.

However, when considering non-vacuum solutions, as one does in galaxies and cosmology, standard GR is still successful, yet at the price of introducing dark sources (dark matter and dark energy) to fit observations (see [2,3,4,5,6,7]). While there is overwhelming evidence of the gravitational effects of such sources, there is still no direct evidence of their fundamental constituents.

For these reasons, researchers have been considering the possibility that the effects which are currently ascribed to dark sources may be in fact purely gravitational effects due to modifications of the gravitational interaction itself. There are a number of candidate models which collectively are called modified gravitational theories or extended theories of gravitation (see [8,9,10,11,12,13,14,15,16,17]). In these models there are no fundamental dark source fields or particles. It is the gravitational interaction that is modified with respect to standard GR, and the modification is designed so that it preserves the local predictions in vacuum, while deviations at different scales justify observations in terms of effective sources.

In this paper we consider two specific classes of modified models: Brans–Dicke models (BD) [18] and Palatini \(f({\mathcal {R}})\)-theories (see [19,20,21,22]), especially in relation to classical Solar System tests, in particular Mercury’s precession. The discussion aims also at highlighting that specifically in gravitational theories, observables are model dependent and when data are available, one needs to make predictions for each test within each extended model, making explicit all choices about observational protocols, since such choices may be different from what one does in standard GR.

The two modified models, Brans–Dicke and Palatini \(f({\mathcal {R}})\)-theories, have been chosen because Palatini \(f({\mathcal {R}})\)-theories are known to be dynamically equivalent, via a conformal transformation, to a subset of Brans–Dicke models. Since Brans–Dicke theories have historically been used as a benchmark for Solar System tests, their parameters have been experimentally constrained, and it has been shown that the preferred values of the parameters correspond to standard GR.

Now it happens that, by dynamical equivalence, Palatini \(f({\mathcal {R}})\)-theories correspond to a subset of Brans–Dicke theories with a specific potential and with a value of the parameter which is not compatible with Solar System tests. Hence, this equivalence has been used to rule out all Palatini \(f({\mathcal {R}})\)-theories. This is not the only argument that has been used against Palatini \(f({\mathcal {R}})\)-theories: it has also been argued (and confuted) that they lead to singularities in polytropic stars (see [23,24,25,26]).

In this paper we shall show in detail that this conclusion is wrong, for several interrelated reasons. First, the dynamical equivalence requires a potential which was not assumed in the original Solar System tests analysis. Secondly, the value of the parameter for which the dynamical equivalence occurs is a singular value for Brans–Dicke models. Since the original analysis of Solar System tests in Brans–Dicke theories has been performed for generic regular values, one cannot even say that the singular value of the parameter has been ruled out (and in fact we shall show it is allowed). Lastly, we highlight that, even in view of the dynamical equivalence, test particles (e.g. Mercury itself) in the two theories are expected to follow different worldlines; in the specific example the worldlines coincide only because one is considering a vacuum solution (so that the conformal factor is constant).

In general, we argue that dynamical equivalence may or may not extend to a complete physical equivalence, in which case the equivalence should preserve the action principle, as well as all the independent choices which define observational protocols (see also [27]).

Test particles are an independent choice: when one uses the eikonal approximation, equations for test particles are obtained, though they are not invariant with respect to redefinitions of fields (e.g. [28, 29]). Accordingly, the choice of test particle equations is just transformed into the choice of which field corresponds to test particles. Moreover, often one does not have a clear Lagrangian description of test particles in terms of fields and still one uses test particles.

Another choice is space–time decomposition. In a relativistic theory there is no time and no space, just space-time. Each observer may split space-time into space and time, though each in a different way. Space and time lengths are thus relative to the choice and conventional. We use them extensively in astrophysics, just because a relativistic theory has no Dirac observables [30]. If we fix a space–time decomposition, which partially breaks general covariance, it conventionally reduces the symmetry group, so that non-trivial relative observables may be allowed.

One can show that, once atomic clocks are defined, a space–time decomposition follows (see [31, 32]), and one can define space and time lengths out of each specific atomic clock. That is very well known in standard GR, though it extends to a Weyl geometry [33], as required in Palatini \(f({\mathcal {R}})\)-theories.

In a previous paper (see [34]; see also [35]) we discussed cosmology in a particular Palatini \(f({\mathcal {R}})\)-theory, based on the function

$$\begin{aligned} f({\mathcal {R}})= \alpha {\mathcal {R}}-\frac{\beta }{2}{\mathcal {R}}^2 -\frac{\gamma }{3}{\mathcal {R}}^{-1}. \end{aligned}$$
(1)

We discussed the SNIa fit and showed that the system is quite strongly degenerate. If we provide \(\alpha \) and \(\beta \), then the SNIa data set allows one to fix \(\gamma \), to a value of about \(\gamma \simeq 10^{-104}\,\mathrm {m}^{-4}\). That calls for independent measurements to fix \(\alpha \) and \(\beta \). Here we use Solar System tests to reduce the degeneracy. It has also been argued (by an anonymous referee) that the best fit value \(\alpha \simeq 0.1\) we found there would fail in Solar System tests; we show here that this is not the case for the Mercury test.

Material is organised as follows: in Sect. 2, we fix notation in BD theories and Palatini \(f({\mathcal {R}})\)-theories. In Sect. 3, we review the dynamical equivalence. In Sect. 4, we consider static, spherically symmetric solutions which will be used to model the Solar System. In Sect. 5, we consider the geodesic equation, first for a generic static, spherically symmetric metric and then for a specific solution.

2 Brans–Dicke and Palatini \(f({\mathcal {R}})\)-theories

A Brans–Dicke (BD) theory is a gravitational theory for a metric g and a scalar field \(\varphi \). The action principle in dimension \(m=\dim (M)=4\) is

$$\begin{aligned} L_\mathrm{BD}= \frac{\sqrt{g}}{2{\kappa }} \left( \varphi R -\frac{\omega }{\varphi } \nabla _\mu \varphi \nabla ^\mu \varphi + U(\varphi )\right) \hbox {d}\sigma , \end{aligned}$$
(2)

where R is the scalar curvature of g and \(U(\varphi )\) is a potential. The parameter \(\omega \) is called the BD parameter.

From this action, one has vacuum field equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \varphi R_{\mu \nu } = \nabla _{\mu \nu }\varphi + \frac{\omega }{\varphi } \nabla _\mu \varphi \nabla _\nu \varphi + \frac{1}{2} \left( \Box \varphi -U \right) g_{\mu \nu } \\ \left( 3 + 2\omega \right) \Box \varphi +\left( \varphi U' -2 U \right) = 0. \\ \end{array}\right. } \end{aligned}$$
(3)

We shall eventually be interested also in the special case \(\omega =-\frac{3}{2}\) in which field equations become

$$\begin{aligned} {\left\{ \begin{array}{ll} \varphi R_{\mu \nu } = \nabla _{\mu \nu }\varphi - \frac{3}{2\varphi } \nabla _\mu \varphi \nabla _\nu \varphi + \frac{1}{2} \left( \Box \varphi -U \right) g_{\mu \nu } \\ \varphi U' =2 U, \\ \end{array}\right. } \end{aligned}$$
(4)

which is of course a degenerate value since for \(\omega =-\frac{3}{2}\) the field equation for \(\varphi \) drops in order and becomes an algebraic equation depending on the potential \(U(\varphi )\). When no potential is assumed in the degenerate case \(\omega =-\frac{3}{2}\), the scalar field \(\varphi \) is left undetermined. The original analysis of Solar System tests (see [36]) was carried out with no potential and for a generic regular value of \(\omega \).

In BD models, test particles go along time-like geodesics of g, which determines the geometry of space-time as well as its metric structure. The scalar field \(\varphi \) is non-minimally coupled, and it modifies the way in which the gravitational field is mediated.

2.1 Palatini \(f({\mathcal {R}})\)-theory

For a Palatini \(f({\mathcal {R}})\)-theory, we start from fields \((g_{\mu \nu }, \tilde{\varGamma }^\epsilon _{\mu \nu })\), where \(\tilde{\varGamma }\) is a (here torsionless) generic connection on the space-time M, a priori independent of g, and a Lagrangian

$$\begin{aligned} L_f= \frac{\sqrt{g}}{2{\kappa }} f({\mathcal {R}}) \hbox {d}\sigma \end{aligned}$$
(5)

for some (regular enough) function \(f({\mathcal {R}})\) of the scalar curvature \({\mathcal {R}}= g^{\mu \nu } \tilde{R}_{\mu \nu }\), where \(\tilde{R}_{\mu \nu }\) is the Ricci tensor of the connection \(\tilde{\varGamma }\) alone. Field equations read

$$\begin{aligned} {\left\{ \begin{array}{ll} f^\prime ({\mathcal {R}}) \tilde{R}_{(\mu \nu )} - \frac{1}{2} f({\mathcal {R}}) g_{\mu \nu }=0\\ \tilde{\nabla }_\epsilon \left( \sqrt{g}f^\prime ({\mathcal {R}}) g^{\mu \nu } \right) =0.\\ \end{array}\right. } \end{aligned}$$
(6)

By tracing the first field equation by \(g^{\mu \nu }\), we obtain the so-called master equation,

$$\begin{aligned} f^\prime ({\mathcal {R}}) {\mathcal {R}}- 2 f({\mathcal {R}}) =0, \end{aligned}$$
(7)

which must be identically satisfied along solutions. The function \(f({\mathcal {R}})\) is called regular enough when the zeros of the master equation are simple and they form a discrete set.

To solve the second equation, one has to set \(\varphi \propto f^\prime ({\mathcal {R}})\). That can be locally inverted as \({\mathcal {R}}\propto r(\varphi )\).

For the model based on (1), we have the master equation

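$$\begin{aligned} f^\prime ({\mathcal {R}}) {\mathcal {R}}- 2 f({\mathcal {R}}) = -\alpha {\mathcal {R}}+ \gamma {\mathcal {R}}^{-1} =0 \quad \Rightarrow \quad {}^\pm {\mathcal {R}}= \pm \sqrt{\frac{ \gamma }{\alpha }}. \end{aligned}$$
(8)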

With these values of the curvature we get

$$\begin{aligned} f_\pm := f({}^\pm {\mathcal {R}}) =\pm \frac{2\sqrt{\alpha \gamma } }{3} -\frac{\beta \gamma }{2\alpha } \end{aligned}$$
(9)

and

$$\begin{aligned} \varphi _\pm =f^\prime _\pm = f^\prime ({}^\pm {\mathcal {R}}) = \frac{ 4\alpha }{3 } \mp \beta \sqrt{\frac{ \gamma }{\alpha }} \end{aligned}$$
(10)

At that point the (vacuum) field equations are

$$\begin{aligned} {\left\{ \begin{array}{ll} R_{\mu \nu } = \tilde{R}_{\mu \nu } = \frac{1}{2\varphi } f({\mathcal {R}}) g_{\mu \nu } = \varLambda _\pm g_{\mu \nu } \quad \Rightarrow \varLambda _\pm = -\frac{1}{4} \frac{{3\beta \gamma } \mp {4\alpha \sqrt{\alpha \gamma } } }{ {4\alpha ^2} \mp 3\beta \sqrt{\alpha \gamma }} \simeq \pm \frac{1}{4} \sqrt{\frac{\gamma }{\alpha }} \\ f^\prime ({\mathcal {R}}) {\mathcal {R}}- 2 f({\mathcal {R}}) =0 \quad \Rightarrow {}^\pm {\mathcal {R}}= \pm \sqrt{\frac{ \gamma }{\alpha }} \\ \end{array}\right. } \end{aligned}$$
(11)

where the last approximations hold for small values of \(\beta \gamma \) with respect to \(\alpha \sqrt{\alpha \gamma }\) (which both have units of an inverse squared length) and small values of \(\beta \sqrt{\alpha \gamma }\) with respect to \(\alpha ^2\) (which are both dimensionless).

Let us notice that, as a consequence of the assumption of being in vacuum, the conformal factor \(\varphi \) is constant and field equations reduce to Einstein equations with a cosmological constant, which is proven in general by the universality theorem (see [37]). The universality theorem guarantees that vacuum solutions of Palatini \(f({\mathcal {R}})\)-theories maintain the successes shown by standard GR solutions provided that the cosmological constant is small enough, as small as it is expected to be from observations.

In the specific example (1), the (effective) cosmological constant \(\varLambda _\pm \) is small enough iff \(\gamma \) is small enough and \(\gamma \) is the parameter which is better constrained by SNIa. We can consider the best value of \(\gamma \)

$$\begin{aligned} \gamma \simeq 2.46^{+3.84}_{-2.24}\times 10^{-104} \,\mathrm {m}^{-4} \end{aligned}$$
(12)

obtained in (1) setting \(\alpha =0.095\), \(\beta =0.25 \,\mathrm {m}^{-2}\).

As a matter of fact, the predicted best fit value of \(\varLambda = 1.27^{+0.58}_{-0.98}\times 10^{-52} \,\mathrm {m}^{-2}\) is not far from the observed value, e.g. the one found by the Planck survey (\(\varLambda _\mathrm{Planck (2018)} = (1.106 \pm 0.023) \times 10^{-52} \,\mathrm {m}^{-2}\)) (see [38]).
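The chain from the master equation to the effective cosmological constant can be retraced numerically. The following is a minimal sketch, assuming only the model function (1), the relation \(\varLambda = f/(2\varphi )\) read off from (11), and the central parameter values quoted above; the variable names are ours and the error bars are not propagated.

```python
import math

# Parameters quoted in the text: alpha is dimensionless, beta in m^-2, gamma in m^-4.
ALPHA = 0.095
BETA = 0.25
GAMMA = 2.46e-104


def f(R):
    """Model function (1)."""
    return ALPHA * R - 0.5 * BETA * R**2 - GAMMA / (3.0 * R)


def f_prime(R):
    """Derivative of (1) with respect to the curvature."""
    return ALPHA - BETA * R + GAMMA / (3.0 * R**2)


# Positive root of the master equation (7) for this model.
R_plus = math.sqrt(GAMMA / ALPHA)
residual = f_prime(R_plus) * R_plus - 2.0 * f(R_plus)

# Constant conformal factor and effective cosmological constant, Lambda = f / (2 phi).
phi_plus = f_prime(R_plus)
lambda_plus = f(R_plus) / (2.0 * phi_plus)
lambda_leading = 0.25 * math.sqrt(GAMMA / ALPHA)

print(f"R_+           = {R_plus:.4e} m^-2")
print(f"phi_+         = {phi_plus:.6f}")
print(f"Lambda_+      = {lambda_plus:.4e} m^-2")
print(f"leading order = {lambda_leading:.4e} m^-2")
```

With these inputs the exact expression and the leading-order approximation \(\frac{1}{4}\sqrt{\gamma /\alpha }\) are indistinguishable at the quoted precision, since the corrections are suppressed by \(\beta \sqrt{\gamma /\alpha }\).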

This value for the (effective) cosmological constant will not be observable (or falsifiable) with experiments in the Solar System. If it were, then \(\varLambda CDM\) would be falsified as well [39, 40].

Thus, let us review the dynamical equivalence and then investigate how this rough though clear result can be compatible with exclusion of the values \(\omega < 4 \times 10^4\) [41, 42], hence including \(\omega =-\frac{3}{2}\), by BD and Mercury’s precession.

3 Equivalence between Palatini \(f({\mathcal {R}})\)-theories and BD

Before going to field equations, one can prove dynamical equivalence at the level of the action. The proof of equivalence is in two steps. In the first step we show equivalence between Palatini \(f({\mathcal {R}})\)-theory and a theory with an extra scalar field governed by the Helmholtz Lagrangian. In the second step we recast Helmholtz Lagrangian as a BD model by a suitable field transformation (see [17, 28]).

Let us consider the definition of the conformal factor \(\varphi =f^\prime ({\mathcal {R}})\) and solve it for the curvature \({\mathcal {R}}\),

$$\begin{aligned} \varphi = \alpha -\beta {\mathcal {R}}+ \frac{\gamma }{3} {\mathcal {R}}^{-2}= \frac{3\alpha {\mathcal {R}}^2-3\beta {\mathcal {R}}^3 +\gamma }{ 3{\mathcal {R}}^2} \quad \Rightarrow \quad 3\beta {\mathcal {R}}^3+ 3(\varphi -\alpha ){\mathcal {R}}^2 -\gamma =0. \end{aligned}$$
(13)

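Let us fix \(\varphi _*= \alpha + \sqrt[3]{\frac{9\beta ^2\gamma }{4}}\), which is the minimum of \(\varphi ({\mathcal {R}})\) on the negative axis (attained where \(\varphi '({\mathcal {R}})= -\beta - \frac{2\gamma }{3}{\mathcal {R}}^{-3}=0\)). For \(\varphi < \varphi _*\), the equation has only one real solution, which is positive, \({\mathcal {R}}={}^+{\mathcal {R}}(\varphi )\).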

For \(\varphi \ge \varphi _*\), the equation has one positive solution \({\mathcal {R}}={}^+{\mathcal {R}}(\varphi )\) as well as two negative solutions \({\mathcal {R}}={}^-{\mathcal {R}}_1(\varphi )\) and \({\mathcal {R}}={}^-{\mathcal {R}}_2(\varphi )\).

When one inverts for \({\mathcal {R}}= r(\varphi )\), there are three branches:

(i) \({\mathcal {R}}>0\) and any \(\varphi \in \mathbb {R}\); \({\mathcal {R}}= r_1(\varphi )\).

(ii) \({\mathcal {R}}_*\le {\mathcal {R}}<0\) and \(\varphi \ge \varphi _*\); \({\mathcal {R}}= r_2(\varphi )\).

(iii) \({\mathcal {R}}\le {\mathcal {R}}_*<0\) and \(\varphi \ge \varphi _*\); \({\mathcal {R}}= r_3(\varphi )\).

where we set \({\mathcal {R}}_*= {}^-{\mathcal {R}}(\varphi _*)\).
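The branch structure can be checked by counting the real roots of the cubic in (13) on either side of the threshold. The sketch below is illustrative only: it assumes the parameter values of Fig. 1, computes \(\varphi _*\) directly as the value of \(\varphi ({\mathcal {R}})\) at its stationary point on the negative axis, and uses a simple grid-plus-bisection root finder (the helper `real_roots` and its search window are our own choices).

```python
# Parameter values of Fig. 1, used here only because they make the branches visible.
ALPHA, BETA, GAMMA = 0.091, 0.25, 0.01


def phi_of_R(R):
    """Conformal factor phi = f'(R) for the model (1)."""
    return ALPHA - BETA * R + GAMMA / (3.0 * R**2)


def cubic(R, phi):
    """Left-hand side of the cubic in (13)."""
    return 3.0 * BETA * R**3 + 3.0 * (phi - ALPHA) * R**2 - GAMMA


# Stationary point of phi(R) on R < 0: d(phi)/dR = -beta - 2 gamma / (3 R^3) = 0.
R_star = -((2.0 * GAMMA / (3.0 * BETA)) ** (1.0 / 3.0))
phi_star = phi_of_R(R_star)


def real_roots(phi, r_min=-5.0, r_max=5.0, steps=20000):
    """Real roots of the cubic in [r_min, r_max]: grid bracketing, then bisection."""
    roots = []
    width = (r_max - r_min) / steps
    for i in range(steps):
        a = r_min + i * width
        b = a + width
        fa = cubic(a, phi)
        if fa * cubic(b, phi) < 0.0:
            for _ in range(60):
                mid = 0.5 * (a + b)
                fm = cubic(mid, phi)
                if fa * fm <= 0.0:
                    b = mid          # root lies in [a, mid]
                else:
                    a, fa = mid, fm  # root lies in [mid, b]
            roots.append(0.5 * (a + b))
    return roots


print(f"R_* = {R_star:.4f}, phi_* = {phi_star:.4f}")
for phi in (0.1, 0.5):
    print(f"phi = {phi}: roots = {[round(R, 4) for R in real_roots(phi)]}")
```

Below the threshold only the positive root survives, while above it two additional negative roots appear, one on each side of \({\mathcal {R}}_*\).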

Then, for each branch we have an ‘inverse’ \({\mathcal {R}}= r(\varphi )\) and we can define a Helmholtz Lagrangian

$$\begin{aligned} L_\mathrm{H}= \sqrt{g} \big ( f(r(\varphi ))+ \left( {\mathcal {R}}- r(\varphi )\right) \varphi \big ) \hbox {d}\sigma \end{aligned}$$
(14)

which depends on \(\varphi \), g, \(\tilde{\varGamma }\) and first derivatives of \(\tilde{\varGamma }\).

Fig. 1 \(\varphi ({\mathcal {R}})\) for \(\alpha =0.091\), \(\beta =0.25 \,\mathrm {m}^{-2}\), \(\gamma =1/100 \,\mathrm {m}^{-4}\) in the plane \(({\mathcal {R}}, \varphi )\). The asymptotic line just depends on \(\alpha \) and \(\beta \). The deviation from it is governed by \(\gamma \) (the smaller \(\gamma \) is, the closer the graph is to the asymptotic line)

The Helmholtz Lagrangians are dynamically equivalent to \(f({\mathcal {R}})\)-theory. By varying Helmholtz Lagrangian with respect to \((g, \tilde{\varGamma }, \varphi )\), one gets field equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \varphi \tilde{R}_{(\mu \nu )} = \frac{1}{2} f(r(\varphi )) g_{\mu \nu }\\ \tilde{\nabla }_\epsilon (\sqrt{g} \varphi g^{\mu \nu })=0 \\ {\mathcal {R}}= r(\varphi )\\ \end{array}\right. } \end{aligned}$$
(15)

the last equation being equivalent to a branch of \(\varphi = f^\prime ({\mathcal {R}})\). Then, we can define \(\tilde{g}= \varphi g\) and solve the second to get \(\tilde{\varGamma }= \{\tilde{g}\}\). Finally, from the first one, one can trace to get the master equation (and get that \({\mathcal {R}}\) and hence \(\varphi \) are constant on shell). Since \(\varphi \) is constant \(\{g\}=\{\tilde{g}\}\) and \(\tilde{R}_{(\mu \nu )} = R_{\mu \nu } \). Then, consequently, the first equation becomes Einstein with cosmological constant, equivalent to (11).

As a matter of fact, this gives us the chance to test the Palatini \(f({\mathcal {R}})\) model directly without passing through BD equivalence. Moreover, the other way around, this also gives us the chance to use the equivalence to test the degenerate BD models which are equivalent to Palatini \(f({\mathcal {R}})\) models. In both cases, we have that the value of the cosmological constant one gets from \(f({\mathcal {R}})\) given by (1), with parameters (12), is compatible with what was found by the Planck survey in 2018. Accordingly, the models are not ruled out by Solar System tests (nor by other tests which are insensitive to the observed value of the cosmological constant).

If we select a specific potential \(U= f(r(\varphi ))- \varphi r(\varphi ) \) in a Brans–Dicke theory we get equations:

$$\begin{aligned} {\left\{ \begin{array}{ll} \varphi R_{\mu \nu } = \nabla _{\mu \nu }\varphi + \frac{\omega }{\varphi } \nabla _\mu \varphi \nabla _\nu \varphi + \frac{1}{2} \left( \Box \varphi - f(r(\varphi ))+ \varphi r(\varphi ) \right) g_{\mu \nu } \\ \left( 3 + 2\omega \right) \Box \varphi +\left( \varphi r(\varphi ) -2 f(r(\varphi )) \right) = 0 .\\ \end{array}\right. } \end{aligned}$$
(16)

With that potential, one has

$$\begin{aligned} \varphi U' -2 U = \varphi r(\varphi ) - 2 f(r(\varphi )), \end{aligned}$$
(17)

which in fact is the left-hand side of the master equation.
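The step behind this statement is that \(U'(\varphi ) = -r(\varphi )\), because \(f'(r(\varphi ))=\varphi \) along each branch. A quick numerical check is sketched below; it assumes the model (1) with the illustrative parameter values of Fig. 1, parameterises a branch by \({\mathcal {R}}\) itself, and approximates derivatives by central differences (all helper names are ours).

```python
ALPHA, BETA, GAMMA = 0.091, 0.25, 0.01  # illustrative values (those of Fig. 1)


def f(R):
    """Model function (1)."""
    return ALPHA * R - 0.5 * BETA * R**2 - GAMMA / (3.0 * R)


def phi_of_R(R):
    """Conformal factor phi = f'(R)."""
    return ALPHA - BETA * R + GAMMA / (3.0 * R**2)


def U_of_R(R):
    """Potential U = f(r(phi)) - phi r(phi), with the branch parameterised by R."""
    return f(R) - phi_of_R(R) * R


def dU_dphi(R, h=1e-6):
    """dU/dphi via the chain rule, (dU/dR) / (dphi/dR), with central differences."""
    dU = U_of_R(R + h) - U_of_R(R - h)
    dphi = phi_of_R(R + h) - phi_of_R(R - h)
    return dU / dphi


def both_sides(R):
    """Return (phi U' - 2 U, phi R - 2 f(R)) at a point of the branch."""
    phi = phi_of_R(R)
    return phi * dU_dphi(R) - 2.0 * U_of_R(R), phi * R - 2.0 * f(R)


for R in (0.5, 1.0, 2.0):
    lhs, rhs = both_sides(R)
    print(f"R = {R}: U' = {dU_dphi(R):+.6f}, lhs = {lhs:+.6f}, rhs = {rhs:+.6f}")
```

The check must stay away from the stationary point of \(\varphi ({\mathcal {R}})\), where the branch cannot be inverted and the chain rule above divides by zero.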

Ricci tensor conformal transformations are given by

$$\begin{aligned} \varphi \tilde{R}_{\mu \nu } +\nabla _{\mu \nu } \varphi +\frac{1}{2}\Box \varphi g_{\mu \nu } - \frac{3}{2\varphi } \nabla _\mu \varphi \nabla _\nu \varphi = \varphi R_{\mu \nu }. \end{aligned}$$
(18)

Thus, the first equation reads as

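$$\begin{aligned}&\varphi \tilde{R}_{\mu \nu } +\nabla _{\mu \nu } \varphi +\frac{1}{2}\Box \varphi g_{\mu \nu } - \frac{3}{2\varphi } \nabla _\mu \varphi \nabla _\nu \varphi = \nabla _{\mu \nu }\varphi + \frac{\omega }{\varphi } \nabla _\mu \varphi \nabla _\nu \varphi + \frac{1}{2} \left( \Box \varphi - f(r(\varphi ))+ \varphi r(\varphi ) \right) g_{\mu \nu }, \end{aligned}$$
(19)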
$$\begin{aligned}&\varphi \tilde{R}_{\mu \nu } = \frac{1}{2\varphi } \left( 3+2\omega \right) \nabla _\mu \varphi \nabla _\nu \varphi + \frac{1}{2} \left( - f(r(\varphi ))+ \varphi r(\varphi ) \right) g_{\mu \nu }. \end{aligned}$$
(20)

By tracing the first equation, we get

$$\begin{aligned} \varphi {\mathcal {R}}= \frac{2\omega +3}{2\varphi } \nabla _\epsilon \varphi \nabla ^\epsilon \varphi + 2 \varphi {\mathcal {R}}- 2f(r(\varphi )) \Rightarrow \varphi {\mathcal {R}}- 2f({\mathcal {R}}) = - \frac{2\omega +3}{2\varphi } \nabla _\epsilon \varphi \nabla ^\epsilon \varphi . \end{aligned}$$
(21)

For \(\omega \not =-\frac{3}{2}\)

$$\begin{aligned} {\left\{ \begin{array}{ll} \varphi \tilde{R}_{\mu \nu } = \frac{3+2\omega }{2\varphi } \nabla _\mu \varphi \nabla _\nu \varphi + \frac{1}{2} \left( \varphi r(\varphi ) - f(r(\varphi )) \right) g_{\mu \nu } \\ \quad \quad = \frac{3+2\omega }{2\varphi } \nabla _\mu \varphi \nabla _\nu \varphi + \frac{1}{2} \left( \varphi r(\varphi ) - 2f(r(\varphi )) \right) g_{\mu \nu } + \frac{1}{2}f(r(\varphi )) g_{\mu \nu } \\ \quad \quad = \frac{3+2\omega }{2\varphi } \left( \nabla _\mu \varphi \nabla _\nu \varphi - \varphi \Box \varphi g_{\mu \nu } \right) + \frac{1}{2}f(r(\varphi )) g_{\mu \nu }\\ \Box \varphi =-\frac{ \varphi r(\varphi ) - 2f(r(\varphi )) }{ (3+2\omega )}. \end{array}\right. } \end{aligned}$$
(22)

There are solutions of the second equation with a non-constant \(\varphi \).

For \(\omega =-\frac{3}{2}\)

$$\begin{aligned} {\left\{ \begin{array}{ll} \varphi \tilde{R}_{\mu \nu } = \frac{1}{2} \left( - f(r(\varphi ))+ \varphi r(\varphi ) \right) g_{\mu \nu } = \frac{1}{2} f(r(\varphi ))g_{\mu \nu }\\ \varphi r(\varphi ) - 2f(r(\varphi )) =0 . \end{array}\right. } \end{aligned}$$
(23)

The second equation implies that \(\varphi \) is constant, and then, \(\tilde{R}_{\mu \nu }= R_{\mu \nu }\). By tracing the first, one gets the master equation, from which \({\mathcal {R}}\) is constant. Then, the first equation becomes Einstein with cosmological constant.

That shows that there is a dynamical equivalence between Palatini \(f({\mathcal {R}})\)-theories and BD theories with \(\omega =-\frac{3}{2}\) and a potential \(U(\varphi )= f(r(\varphi ))- \varphi r(\varphi )\).

For the function \(f({\mathcal {R}})\) given by (1), the function \(r(\varphi )\) is quite complicated, but in fact we do not really need it. In any case, we are currently on the positive branch (the first one), and as far as observational cosmology is concerned we can restrict to that branch.

For a proof at the level of action see [17, 28].

4 Solutions with point-like sources

Let us here consider the static spherically symmetric solutions in Palatini \(f({\mathcal {R}})\)-theory and BD theory, both in the case of generic parameter and no potential and for \(\omega =-\frac{3}{2}\) and the potential \(U(\varphi )= f(r(\varphi ))- \varphi r(\varphi )\) induced by dynamical equivalence.

4.1 Solution in Palatini \(f({\mathcal {R}})\)-theory

In a Palatini \(f({\mathcal {R}})\)-theory, in view of universality theorem, we get a static spherically symmetric solution which is

$$\begin{aligned} \tilde{g} = -A(r) \mathrm {d}t^2 + \frac{ \mathrm {d}r^2}{ A(r)} + r^2 \mathrm {d}\varOmega ^2 \quad \text {with}\quad A(r) = a - \frac{b}{r} - \frac{\varLambda }{3} r^2, \end{aligned}$$
(24)

where we set \(\mathrm {d}\varOmega ^2 := \mathrm {d}\theta ^2 + \sin ^2(\theta ) \mathrm {d}\phi ^2\) for the line element on the unit sphere. Thus,

$$\begin{aligned} g= \varphi ^{-1} \left( -A(r) \mathrm {d}t^2 + \frac{ \mathrm {d}r^2}{ A(r)} + r^2 \mathrm {d}\varOmega ^2\right) . \end{aligned}$$
(25)

In view of the fact that the conformal factor is constant, we can change coordinates by \((\sqrt{\varphi }\tilde{r}= r, \sqrt{\varphi }\tilde{t}= t)\) and obtain

$$\begin{aligned} g= \left( -A \>\mathrm {d}{\tilde{t}}^2 + \frac{ \mathrm {d}{\tilde{r}}^2}{ A} + \tilde{r}^2 \mathrm {d}\varOmega ^2\right) . \end{aligned}$$
(26)

If we want a specific asymptotic behaviour, for example a metric which is asymptotically anti-de-Sitter, we can set \(a=1\) in the function A(r).

Since the solution of BD theory will be given in isotropic coordinates, let us first recast this solution in isotropic coordinates \((t, \rho , \theta ,\phi )\). Any static, spherically symmetric metric

$$\begin{aligned} g = -A(r) \mathrm {d}t^2 +C(r) \mathrm {d}r^2+ r^2 \mathrm {d}\varOmega ^2 \end{aligned}$$
(27)

can be recast in isotropic form

$$\begin{aligned} g = -A(\rho ) \mathrm {d}t^2 +B(\rho ) \left( \mathrm {d}\rho ^2+ \rho ^2 \mathrm {d}\varOmega ^2 \right) \end{aligned}$$
(28)

by a change of radial coordinate \(\rho =\rho (r)\) (hence \(\mathrm {d}\rho = \rho ' \>\mathrm {d}r\)).

One has simply

$$\begin{aligned} g = -A(\rho ) \mathrm {d}t^2 +B(\rho )\> (\rho ')^2 \mathrm {d}r^2+ B \rho ^2 \mathrm {d}\varOmega ^2 \end{aligned}$$
(29)

and comparing with the expression in pseudo-spherical coordinates one gets the conditions

$$\begin{aligned} {\left\{ \begin{array}{ll} B(\rho )\> (\rho ')^2 = C \\ B(\rho )\> \rho ^2= r^2\\ \end{array}\right. } \quad \Rightarrow \left( \frac{\rho ' }{ \rho }\right) ^2 = \frac{C(r)}{r^2}. \end{aligned}$$
(30)

Hence, one can integrate the last condition to get \(\rho (r)\) and then set \(A(\rho ):= A(r(\rho ))\).

For example, the Schwarzschild metric in pseudo-spherical coordinates \((t, r, \theta ,\phi )\) is

$$\begin{aligned} g= -\left( \frac{r-2m}{r}\right) \mathrm {d}t^2 + \frac{ r }{ r-2m} \mathrm {d}r^2 + r^2 \mathrm {d}\varOmega ^2 \end{aligned}$$
(31)

while in isotropic coordinates it reads as

$$\begin{aligned} g= -\left( \frac{2\rho -m}{2\rho +m }\right) ^2 \mathrm {d}t^2 + \left( 1+ \frac{m}{ 2\rho }\right) ^4 \left( \mathrm {d}\rho ^2 + \rho ^2 \mathrm {d}\varOmega ^2\right) \end{aligned}$$
(32)

where the integration constant has been fixed to have \(\lim _{r\rightarrow +\infty } \frac{\rho }{r}=1\). Let us remark that for \(\rho \rightarrow +\infty \) we get Minkowski metric.
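The passage from (27) to (28) can be illustrated on the Schwarzschild case. The sketch below integrates the last condition in (30) numerically, taking the positive root so that \(\rho \) grows with r, and compares the result with the standard closed form \(\rho = \frac{1}{2}\left( r-m+\sqrt{r(r-2m)}\right) \), which is the one normalised so that \(\rho /r\rightarrow 1\) at infinity. The value of m, the integration range and the Simpson rule are illustrative assumptions.

```python
import math

M = 1.0  # mass parameter m in geometric units (illustrative value)


def C(r):
    """Radial coefficient of the Schwarzschild metric (31)."""
    return r / (r - 2.0 * M)


def rho_closed_form(r):
    """Isotropic radius normalised so that rho / r -> 1 at infinity."""
    return 0.5 * (r - M + math.sqrt(r * (r - 2.0 * M)))


def log_rho_ratio(r1, r2, n=2000):
    """Integrate rho'/rho = sqrt(C)/r from r1 to r2 with Simpson's rule (n even)."""
    h = (r2 - r1) / n
    total = math.sqrt(C(r1)) / r1 + math.sqrt(C(r2)) / r2
    for i in range(1, n):
        r = r1 + i * h
        total += (4.0 if i % 2 else 2.0) * math.sqrt(C(r)) / r
    return total * h / 3.0


numeric = log_rho_ratio(3.0, 10.0)
exact = math.log(rho_closed_form(10.0) / rho_closed_form(3.0))
print(f"ln(rho(10)/rho(3)): numeric = {numeric:.10f}, closed form = {exact:.10f}")

# Second condition in (30): B(rho) rho^2 = r^2 with B = (1 + m / (2 rho))^4.
rho = rho_closed_form(5.0)
print(f"B rho^2 at r = 5: {(1.0 + M / (2.0 * rho)) ** 4 * rho**2:.10f} (expected 25)")
```

Only the ratio \(\rho (r_2)/\rho (r_1)\) is fixed by the differential condition; the overall normalisation is the integration constant mentioned above.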

4.2 Solution in BD theory

We can find a static and isotropic solution of BD equations by an ansatz in isotropic coordinates \((t, \rho ,\theta , \phi )\), namely

$$\begin{aligned} g= -A(\rho ) \mathrm {d}t^2 + B(\rho )\left( \mathrm {d}\rho ^2 + \rho ^2 \mathrm {d}\varOmega ^2 \right) \quad \text {with}\quad \varphi = \varphi (\rho ). \end{aligned}$$
(33)

If we fix the potential to be zero and \(\omega \not =-\frac{3}{2}\), we can solve the BD equation (see [43, 44]) as

$$\begin{aligned} A(\rho )&=\alpha _0 \left( \frac{ 2\rho -m }{ 2\rho +m } \right) ^{\frac{2}{\lambda }}, \end{aligned}$$
(34)
$$\begin{aligned} B(\rho )&= \beta _0 \left( \frac{2\rho +m}{2\rho }\right) ^4 \left( \frac{ 2\rho -m }{ 2\rho +m } \right) ^{\frac{2(\lambda -C-1)}{\lambda }} = \frac{\beta _0}{\alpha _0^{\lambda -C-1}} \left( \frac{2\rho +m}{2\rho }\right) ^4 A^{\lambda -C-1}, \end{aligned}$$
(35)
$$\begin{aligned} \varphi (\rho )&=\varphi _0 \left( \frac{ 2\rho -m }{ 2\rho +m } \right) ^{\frac{C}{\lambda }}, \end{aligned}$$
(36)

where we have defined

$$\begin{aligned} \lambda ^2=(C+1)^2-C+\frac{\omega }{2} C^2=\left( 1+\frac{\omega }{2} \right) C^2+ C +1. \end{aligned}$$
(37)

That corresponds to what Weinberg does [36], though with no approximations. It is a solution for any \((\alpha _0, \beta _0, \varphi _0, C)\) and \(\lambda \) is computed with the identity above. If we want to get the Schwarzschild metric at infinity, we need to set \((\alpha _0=\beta _0=1)\). Even considering \(\alpha _0=\beta _0=1\), we see that there is a 1-parameter family of static and spherically symmetric solutions, parameterised by C, unlike what happens in standard GR where the solution is unique.

We see that there are two families of solutions. The first has \(C=0\), and consequently \(\lambda =\pm 1\); here the conformal factor \(\varphi \) is constant and, when \(\lambda =1\), the solution reduces to the Schwarzschild metric:

$$\begin{aligned} A(\rho )= \left( \frac{ 2\rho -\lambda m }{ 2\rho +\lambda m } \right) ^{2}, \quad \quad B(\rho )= \left( \frac{ 2\rho +\lambda m}{2\rho }\right) ^4, \quad \quad \varphi (\rho )=\varphi _0. \end{aligned}$$
(38)

The second family is for \(C\not =0\) which has a non-constant conformal factor whenever \(m\not =0\). Accordingly, we can say that Schwarzschild solution is always there, although as a somehow isolated solution, while BD theory allows a whole family of solutions with a non-constant conformal factor.
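The two families can be compared by evaluating (34)-(37) directly. The following is a small sketch with \(\alpha _0=\beta _0=\varphi _0=1\); the numerical values of m, \(\omega \) and C, as well as the `sign` argument selecting the root of (37), are our own illustrative assumptions.

```python
import math


def lambda_squared(C, omega):
    """Identity (37) relating lambda to C and the BD parameter omega."""
    return (1.0 + 0.5 * omega) * C**2 + C + 1.0


def bd_solution(rho, m, C, omega, sign=1.0, alpha0=1.0, beta0=1.0, phi0=1.0):
    """Functions (34)-(36) in isotropic coordinates, for 2 rho > m.

    sign selects the root lambda = sign * sqrt(lambda^2) (hypothetical argument).
    """
    lam = sign * math.sqrt(lambda_squared(C, omega))
    x = (2.0 * rho - m) / (2.0 * rho + m)
    A = alpha0 * x ** (2.0 / lam)
    B = beta0 * ((2.0 * rho + m) / (2.0 * rho)) ** 4 * x ** (2.0 * (lam - C - 1.0) / lam)
    phi = phi0 * x ** (C / lam)
    return A, B, phi


def schwarzschild_isotropic(rho, m):
    """Functions A and B of the Schwarzschild metric (32)."""
    A = ((2.0 * rho - m) / (2.0 * rho + m)) ** 2
    B = (1.0 + m / (2.0 * rho)) ** 4
    return A, B


m, omega = 1.0, 100.0  # illustrative values
for rho in (2.0, 10.0, 100.0):
    A0, B0, phi_c0 = bd_solution(rho, m, C=0.0, omega=omega)
    _, _, phi_c = bd_solution(rho, m, C=0.5, omega=omega)
    As, Bs = schwarzschild_isotropic(rho, m)
    print(f"rho = {rho:6.1f}  C = 0: A - A_S = {A0 - As:+.1e}, B - B_S = {B0 - Bs:+.1e}, "
          f"phi = {phi_c0:.3f}   C = 0.5: phi = {phi_c:.4f}")
```

For \(C=0\) the output coincides with the Schwarzschild functions and \(\varphi \) stays constant, while for \(C\not =0\) the scalar field visibly depends on \(\rho \).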

For \(\omega = -\frac{3}{2}\) and no potential, one has an arbitrary conformal factor and the Schwarzschild metrics only. If the potential is added as in view of the dynamical equivalence, then the conformal factor is frozen to be constant and the Schwarzschild–de Sitter solution is obtained.

In what follows we shall consider a subfamily of the general solution with \(C=1/n\), and consequently \(\lambda ^2=[2(n^2+n+1)+\omega ]/(2n^2)\). For \(\omega \rightarrow -\frac{3}{2}\) this gives \(\lambda \rightarrow \pm \frac{2n+1 }{ 2n }\), while for \(n\rightarrow \infty \) it gives \(\lambda =\pm 1\), which is the value associated with the Schwarzschild solution. For this subfamily we shall compute an observable, namely the precession rate of Mercury. We shall show that even in the limit \(\omega \rightarrow -\frac{3}{2}\) such an observable is discontinuous and it does not reproduce the result for the Schwarzschild–de Sitter solution obtained by setting \(\omega = -\frac{3}{2}\). That eventually justifies the claim that the value \(\omega = -\frac{3}{2}\) is degenerate: one cannot use a limit procedure to infer the result in the degenerate case, nor, consequently, in the dynamically equivalent Palatini \(f({\mathcal {R}})\)-theory (which in fact will pass the Mercury test).

5 Geodetics and exact relativistic Kepler laws

We shall predict precession of Mercury by solving a more general problem, i.e. writing exact Kepler laws for an arbitrary static spherically symmetric metric in isotropic coordinates restricted to the equatorial plane \(\theta =\frac{\pi }{2}\),

$$\begin{aligned} g= -c^2 A(r) \mathrm {d}t^2 + B(r) \mathrm {d}r^2 + r^2 \mathrm {d}\phi ^2. \end{aligned}$$
(39)

Precession comes from the failure of Kepler's first law, since it measures by how much the orbits fail to be closed. Orbits are g-time-like g-geodesics, in an allowed region where r is bounded from below and above.

We could use two Lagrangians. The first Lagrangian is for geodesics parameterised by proper time, i.e.

$$\begin{aligned} \tilde{L} = \frac{m}{2}\left( c^2 A(r) \left( {\frac{\mathrm {d}t}{\mathrm {d}\tau }}\right) ^2 - B(r) \left( {\frac{\mathrm {d}r}{\mathrm {d}\tau }}\right) ^2 - r^2 \left( {\frac{\mathrm {d}\phi }{\mathrm {d}\tau }}\right) ^2 \right) \hbox {d}\tau . \end{aligned}$$
(40)

This Lagrangian has three d.o.f. \((t, r, \phi )\) and three first integrals (PHK), namely

$$\begin{aligned} P = A {\frac{\mathrm {d}t}{\mathrm {d}\tau }}, \quad K= r^2 {\frac{\mathrm {d}\phi }{\mathrm {d}\tau }}, \quad H= c^2 A \left( {\frac{\mathrm {d}t}{\mathrm {d}\tau }}\right) ^2 - B \left( {\frac{\mathrm {d}r}{\mathrm {d}\tau }}\right) ^2 - r^2 \left( {\frac{\mathrm {d}\phi }{\mathrm {d}\tau }}\right) ^2, \end{aligned}$$
(41)

which we can solve as

$$\begin{aligned}&\displaystyle {\frac{\mathrm {d}t}{\mathrm {d}\tau }} = \frac{P }{ A}, \quad {\frac{\mathrm {d}\phi }{\mathrm {d}t}}= \frac{ K A }{ Pr^2}, \quad \left( {\frac{\mathrm {d}r}{\mathrm {d}t}}\right) ^2 = \frac{\left( c^2 P^2 - HA\right) r^2 - K^2A}{ P^2 AB r^2} A^2,\end{aligned}$$
(42)
$$\begin{aligned}&\displaystyle \left( {\frac{\mathrm {d}r}{\mathrm {d}\tau }}\right) ^2 = \frac{\left( c^2 P^2 - HA \right) r^2 - K^2A}{ AB r^2} . \end{aligned}$$
(43)

The second Lagrangian is invariant with respect to reparameterisations

$$\begin{aligned} L= mc \sqrt{c^2A(r) - B(r) \left( {\frac{\mathrm {d}r}{\mathrm {d}t}}\right) ^2 -r^2 \left( {\frac{\mathrm {d}\phi }{\mathrm {d}t}}\right) ^2 } \> \mathrm {d}t. \end{aligned}$$
(44)

This Lagrangian has two d.o.f. \((r, \phi )\) and two first integrals (EJ): (\([J]= L\), \([E]=TL^{-1}\))

$$\begin{aligned} J=- \frac{ r^2 {\frac{\mathrm {d}\phi }{\mathrm {d}t}} }{\sqrt{c^2 A(r) - B(r) \left( {\frac{\mathrm {d}r}{\mathrm {d}t}}\right) ^2 -r^2 \left( {\frac{\mathrm {d}\phi }{\mathrm {d}t}}\right) ^2 }}, \quad E=- \frac{ A }{ \sqrt{c^2 A(r) - B(r) \left( {\frac{\mathrm {d}r}{\mathrm {d}t}}\right) ^2 -r^2 \left( {\frac{\mathrm {d}\phi }{\mathrm {d}t}}\right) ^2 }}. \end{aligned}$$
(45)

We have

$$\begin{aligned} {\frac{\mathrm {d}\phi }{\mathrm {d}t}} = \frac{ JA }{ Er^2 } , \quad \left( {\frac{\mathrm {d}r}{\mathrm {d}t}}\right) ^2 = \frac{ c^2 A }{ B } - \frac{ A^2 }{ E^2 B } - \frac{ J^2 A^2 }{ E^2 B r^2 } = \frac{ c^2 E^2 r^2 - A (r^2 + J^2 ) }{ E^2 AB r^2} A^2. \end{aligned}$$
(46)

Now we can map the values of the first integrals in the two frameworks

$$\begin{aligned} \frac{K }{ P} = \frac{J }{ E } , \quad E^2 = \frac{ P^2 }{ H}. \end{aligned}$$
(47)
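The map (47) can be checked by evaluating the two expressions for \(\mathrm {d}\phi /\mathrm {d}t\) and \((\mathrm {d}r/\mathrm {d}t)^2\) at the same point. The sketch below does so for arbitrary sample numbers in units with \(c=1\); these values, and the choice of the negative root for E (following the sign in (45)), are illustrative assumptions.

```python
import math


def dr_dt_sq_proper_time(r, A, B, P, H, K, c=1.0):
    """(dr/dt)^2 from the proper-time Lagrangian: (43) divided by (dt/dtau)^2."""
    return ((c**2 * P**2 - H * A) * r**2 - K**2 * A) * A / (P**2 * B * r**2)


def dr_dt_sq_invariant(r, A, B, E, J, c=1.0):
    """(dr/dt)^2 from the reparameterisation-invariant Lagrangian, as in (46)."""
    return (c**2 * E**2 * r**2 - A * (r**2 + J**2)) * A / (E**2 * B * r**2)


def map_first_integrals(P, H, K):
    """Map (47): E^2 = P^2 / H and J / E = K / P, with E taken negative as in (45)."""
    E = -P / math.sqrt(H)
    J = K * E / P
    return E, J


# Sample point and first integrals (illustrative numbers only).
r, A, B = 10.0, 0.8, 1.25
P, H, K = 1.2, 0.9, 3.0
E, J = map_first_integrals(P, H, K)

lhs = dr_dt_sq_proper_time(r, A, B, P, H, K)
rhs = dr_dt_sq_invariant(r, A, B, E, J)
print(f"E = {E:.6f}, J = {J:.6f}")
print(f"(dr/dt)^2: proper-time form = {lhs:.12f}, invariant form = {rhs:.12f}")
print(f"dphi/dt:   proper-time form = {K * A / (P * r**2):.12f}, "
      f"invariant form = {J * A / (E * r**2):.12f}")
```

The agreement holds for any values of A and B, since the map only involves the first integrals and not the metric functions.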

From the invariant Lagrangian we have

$$\begin{aligned} \left( {\frac{\mathrm {d}r}{\mathrm {d}t}}\right) ^2&= \frac{ A }{ B E^2 r^2 } \left( c^2E^2 r^2 -A(r^2+J^2 ) \right) =:\varPhi (r; E, J),\end{aligned}$$
(48)
$$\begin{aligned} \left( {\frac{\mathrm {d}\phi }{\mathrm {d}r}}\right) ^2&= \frac{ J^2 AB }{ r^2 } \frac{ 1 }{ c^2 E^2 r^2 -A (r^2 + J^2) } =:\varPsi (r; E, J), \end{aligned}$$
(49)

where the former equation gives the time evolution and the latter the orbit trajectory.

Having orbits is related to having an allowed region \([r_-, r_+]\) for \(\varPhi \) (as well as for \(\varPsi \)). By integrating \(\sqrt{\varPsi }\) over the allowed region \([r_-, r_+]\), one gets the angle \(\phi \) swept between perihelion and aphelion, with \(2\phi = 2\pi + \delta \), where \(\delta \) is the precession per orbit. When the precession is zero, the orbit is closed, i.e. one recovers the classical first Kepler law. The precession \(\delta \) is expected to become smaller and smaller moving away from the source, though it vanishes in the limit only if the solution is asymptotically flat (namely \(\varLambda =0\)). In the limit to standard GR, Mercury’s precession should approach \(\sim 43\,\mathrm {arcsec/century}\). For the Earth, the precession should be negligible.

The second Kepler law is related to conservation of angular momentum, which is indeed conserved exactly also in the relativistic regime. Finally, we can get the period \(T(r)\) from the integral of \(\varPhi ^{-1/2}\) over the allowed region. In the Newtonian approximation one has \(T^2\propto (r_-+r_+)^3\). The function \(T(r)\) contains the exact law, which in the limit must reproduce the Newtonian prediction.

Let us now make this explicit in some metrics which are relevant for the theories we are considering.

5.1 Schwarzschild

The Schwarzschild solution is defined as

$$\begin{aligned} \mathbb {A}= 1- \frac{b}{r}= \frac{r-b}{r}<1 \quad \text {with}\quad b:=\frac{ 2G }{ c^2} m >0. \end{aligned}$$
(50)

One has \([b]= L\) and b is called the Schwarzschild radius.

Since the mass of the Sun is \(M_\odot =1.9885\times 10^{30} \,\mathrm {kg}\), its Schwarzschild radius is \(b= 2953.29\,\mathrm {m}\).
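The value of b quoted above can be reproduced in a couple of lines. This is only a sketch: the value of G is an assumption (the CODATA 2014 value), since the text does not state which one was used, while c is the exact SI value.

```python
G = 6.67408e-11        # m^3 kg^-1 s^-2 (assumption: CODATA 2014 value)
c = 299_792_458.0      # m/s (exact SI value)
M_sun = 1.9885e30      # kg, as quoted in the text

# Schwarzschild radius b = 2 G m / c^2, Eq. (50)
b_sun = 2.0 * G * M_sun / c**2
print(f"b = {b_sun:.2f} m")
```

With a different value of G the last digits change slightly, which is why the choice is flagged here.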

The function \(\varPhi \) is then given by

$$\begin{aligned} \varPhi = \frac{ (r-b)^2 }{ E^2 r^5 } \left( ( c^2 E^2 -1) r^3 + br^2 -J^2r +bJ^2\right) . \end{aligned}$$
(51)

This function to the power \(-\frac{1}{2}\) will be integrated. In order to have an analytical expression of that integral, it is better to factorise the polynomial, i.e. to express it as a function of perihelion and aphelion \((r_\pm )\) instead of as a function of \((E, J)\). If we want a bounded orbit, the function \(\varPhi \) must be negative at large r, thus \(c^2 E^2 < 1\), i.e. \(-1<cE<0\). The polynomial \(p(r)=( c^2 E^2 -1) r^3 + br^2 -J^2r +bJ^2 \) has a real zero since it is of odd degree. We have \(p(b)=( c^2 E^2 -1) b^3 + b^3 = c^2 E^2 b^3>0\), while \(p(r)\rightarrow -\infty \) for \(r\rightarrow +\infty \) since \(-1<cE<0\); hence p(r) has a root \(r_0>b\).

Hence, we can set initial conditions in the zero \(r=r_0\), where \(\dot{r}_0=0\), and set \(\phi _0=0\), \(\dot{\phi }_0= v_0/r_0\). For these initial conditions, we have

$$\begin{aligned} J=- \frac{ r_0 v_0 }{\sqrt{c^2 \mathbb {A}_0 -v_0^2 }}, \quad E=- \frac{ \mathbb {A}_0 }{ \sqrt{c^2 \mathbb {A}_0 -v_0^2 }}, \end{aligned}$$
(52)

where we set \(\mathbb {A}_0= \mathbb {A}(r_0)\). To have a bounded orbit we must have \(c^2E^2<1\), i.e. \(v_0^2< \frac{ b(r_0-b)}{ r_0^2} c^2=\frac{\xi _0b }{ (\xi _0+ b)^2} c^2= v_M^2\), where \(\xi _0:= r_0-b\).

Let us also notice that \(v_0^2< \frac{ b(r_0-b)}{ r_0^2} c^2=\frac{\xi _0b }{ (\xi _0+ b)^2} c^2< c^2\). Accordingly, if we start from zero initial speed \(v_0=0\) and we increase it, at some point the orbit will become unbounded.

With these values of course we can express \(\varPhi (r, r_0, v_0)\) and factorise out of it \((r-r_0)\):

$$\begin{aligned}&\varPhi (r)= (r-b)^2 \frac{ (c^2 \mathbb {A}_0^2- c^2 \mathbb {A}_0+ v_0^2) r^3 + br^2(c^2 \mathbb {A}_0 -v_0^2) -r_0^2 v_0^2 r +b r_0^2 v_0^2 }{ \mathbb {A}_0^2 r^5 } \quad \nonumber \\&\quad \Rightarrow \varPhi (r_0)=0\end{aligned}$$
(53)
$$\begin{aligned}&\varPhi (r)= (r-b)^2 (r-r_0) \frac{ (c^2 b^2- c^2 r_0b + r_0^2v_0^2) r^2 + ( r_0-b)r_0^2 v_0^2 r -b r_0^3 v_0^2 }{ (r_0-b)^2 r^5 }. \end{aligned}$$
(54)

If \(\varPhi \) has only one zero \(r=r_0>b\), the allowed region is \([b, r_0]\) and the geodesic is falling towards the asymptotic goal at \(r=b\), i.e. the horizon. To have a bounded periodic orbit the allowed region must be \([b, r_0] \cup [r_-, r_+]\). A necessary condition for that to happen is

$$\begin{aligned} \varDelta = ( r_0-b)^2r_0^4 v_0^4 -4 c^2 b^2(r_0-b) r_0^3 v_0^2+ 4b r_0^5 v_0^4>0 \quad \Rightarrow v^2_0 > \frac{4(r_0-b) b^2 c^2}{r_0 (b+r_0)^2}=: v_m^2. \end{aligned}$$
(55)

Let us remark that \(0< v_m^2<v_M^2 < c^2\). In fact, \(v_m^2\) is obviously positive. It is also

$$\begin{aligned} v_m^2= \frac{4(r_0-b) b^2 c^2}{r_0 (b+r_0)^2} = \frac{4br_0 }{ (b+r_0)^2} v_M^2 =\frac{b^2+ b\xi _0 }{ b^2+ b\xi _0+\left( {\frac{\xi _0}{2}}\right) ^2} v_M^2< v_M^2. \end{aligned}$$
(56)

Hence, starting from zero speed and increasing it, the test particle will fall in until it reaches a limit speed \(v_m\) and it will stay bounded until a new limit \(v_M\). Accordingly, one has bounded orbits for \(v_m^2< v_0^2 <v_M^2\).

Then for \(v_0= v_m\), we have a new zero of \(\varPhi \) appearing at \(r_1=\frac{2b r_0 }{ r_0-b}\), i.e. (with \(\xi := r-b\)) at \(\xi _1=\frac{b(\xi _0+2b) }{ \xi _0}\), which is less than \(r_0\) for any \(r_0>3b\), i.e. for \(\xi _0>2b\):

$$\begin{aligned} \frac{b(\xi _0+2b) }{ \xi _0}< \xi _0 \quad \Rightarrow - \xi _0^2+ b\xi _0+2b^2 < 0. \end{aligned}$$
(57)

The left-hand side \(-\xi _0^2+ b\xi _0+2b^2=-(\xi _0-2b)(\xi _0+b)\) vanishes for \(\xi _0=2b\) and it is negative for \(\xi _0>2b\), since it goes to \(-\infty \) as \(\xi _0\rightarrow \infty \). Accordingly, for \(r_0>3b\) one has the first stable orbit (for \(v_0=v_m\)) in \(\xi \in [\xi _1, \xi _0]\). Then, increasing \(v_0\), the perihelion grows: at some point it reaches and passes \(\xi _0\), and then it grows to infinity, which is reached at \(v_0=v_M\).

When the perihelion becomes equal to the aphelion (circular orbit) and it keeps growing, they get exchanged with each other. When we originally choose a zero \(r_0\) of \(\varPhi \), we can always choose the biggest one. We are then interested in finding the speed \(v_c = v_0\) for which we have circular motion. That happens when \(r_0\) is a double zero of \(\varPhi \), i.e. for \(v_c^2= \frac{c^2 b}{2 r_0}\).

When \(r_0>3b\), we have \(v_m<v_c<v_M\). Once again, for \(r_0<3b\) we are too close to the source and Newtonian dynamics is not a good approximation.
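The three threshold speeds can be evaluated directly from the closed forms above. The sketch below is illustrative only: the function name is ours, and the sample radius is Mercury's aphelion as quoted later in the text.

```python
import math

C_LIGHT = 299_792_458.0  # m/s

def speed_thresholds(r0, b, c=C_LIGHT):
    """Return (v_m, v_c, v_M) for initial radius r0 > b and Schwarzschild radius b."""
    v_m = 2.0 * b * c * math.sqrt((r0 - b) / r0) / (b + r0)   # from Eq. (55)
    v_c = c * math.sqrt(b / (2.0 * r0))                        # circular orbit
    v_M = c * math.sqrt(b * (r0 - b)) / r0                     # upper bound for bounded orbits
    return v_m, v_c, v_M

b_sun = 2953.29                      # m, value quoted in the text
r_aphelion = 69_816_900_000.0        # m, Mercury's aphelion quoted in the text
v_m, v_c, v_M = speed_thresholds(r_aphelion, b_sun)
print(v_m, v_c, v_M)

# At r0 = 3b the lower threshold v_m and the circular speed v_c coincide.
v_m_3b, v_c_3b, _ = speed_thresholds(3.0 * b_sun, b_sun)
```

Algebraically, \(v_c^2-v_m^2\) is proportional to \((r_0-3b)^2\), so the two speeds touch only at \(r_0=3b\), while \(v_c<v_M\) requires \(r_0>2b\).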

Let us summarise. We choose the initial position at the aphelion, \(r=r_+>3b\) (see Footnote 1).

(i) For \(0<v_0< v_m\) (thus low angular momentum), the test particle falls in, end of the story.

(ii) For \(v_0=v_m\), we have the first ‘orbit’ with a perihelion of \(r_-= \frac{2b r_+ }{ r_+-b}<r_+\) (which is not an orbit yet, since \(r_-\) is an asymptotic goal).

(iii) For \(v_m^2<v_0^2< v_c^2=\frac{c^2 b}{2 r_+}\), we have elliptic orbits, with perihelion \(\frac{2b r_+ }{ r_+-b}<r_-(v_0)<r_+ \).

(iv) For \(v_0^2= v_c^2\), we have circular orbits, \(r_-=r_+\).

(v) For \(v_c^2<v_0^2<v_M^2\), we have elliptic orbits, though the initial conditions are given in the perihelion, not in the aphelion. These orbits are recovered by giving initial conditions in the aphelion.

(vi) For \(v_0^2=v_M^2\), one has parabolic orbits.

(vii) For \(v_0^2>v_M^2\), one has hyperbolic orbits.

In what follows we are interested in (iii) and (iv) only, since they capture all bounded orbits \(r\in [r_-, r_+]\), parameterised by \(r_+\) and \(r_-\). Incidentally, if we fix \(r_-\) and \(r_+\), the initial velocity at the aphelion needed to obtain them is \(v_0^2= \frac{ c^2 b r_-^2 (r_+-b)}{ r_+^2 (r_++r_- )( r_--b)}\). The first integrals for the bounded orbits are

$$\begin{aligned} cE&= -\sqrt{\frac{(r_+-b)(r_--b)(r_++r_-) }{ r_-^2 r_+ +r_- r_+^2-b r_+ r_- -br_+^2 - br_-^2 }}, \end{aligned}$$
(58)
$$\begin{aligned} J&= -\sqrt{\frac{b r_+^2r_-^2 }{ r_-^2 r_+ +r_- r_+^2-b r_+ r_- -br_+^2 - br_-^2 }}. \end{aligned}$$
(59)

Accordingly, we can express the function \(\varPhi \) in terms of \(r_\pm \) as

$$\begin{aligned} \varPhi (r, r_\pm )&= -bc^2 (r-b)^2\frac{(r_+-r)(r- r_-)\left( (br_- -r_-r_+ +br_+)r +br_+r_-\right) }{ r^5(r_--b)(r_+ +r_-)(r_+-b)}, \end{aligned}$$
(60)
$$\begin{aligned} \varPsi (r, r_\pm )&= - \frac{ r_+^2 r_-^2}{ r (r_+-r)(r-r_-)\left( (br_- -r_-r_+ +br_+)r +br_+r_-\right) }. \end{aligned}$$
(61)

We have thus expressed everything in terms of \(r_\pm \).
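As a consistency check of (58) and (59), one can verify numerically that both \(r_\pm \) are zeros of the cubic factor of \(\varPhi \) in (51), and that the aphelion speed quoted in the text reproduces J through (52). This is a sketch under stated assumptions: the function names are ours and the sample orbit (\(b=1\), \(r_-=10\), \(r_+=20\), units with \(c=1\)) is purely illustrative.

```python
import math

def first_integrals(r_minus, r_plus, b):
    """Return (cE, J) from Eqs. (58)-(59) for turning points r_minus <= r_plus."""
    D = (r_minus**2 * r_plus + r_minus * r_plus**2
         - b * r_plus * r_minus - b * r_plus**2 - b * r_minus**2)
    cE = -math.sqrt((r_plus - b) * (r_minus - b) * (r_plus + r_minus) / D)
    J = -math.sqrt(b * r_plus**2 * r_minus**2 / D)
    return cE, J

def cubic_p(r, cE, J, b):
    """Cubic factor of Phi in Eq. (51)."""
    return (cE**2 - 1.0) * r**3 + b * r**2 - J**2 * r + b * J**2

def aphelion_speed_over_c(r_minus, r_plus, b):
    """v_0 / c at the aphelion, from the formula quoted in the text."""
    return math.sqrt(b * r_minus**2 * (r_plus - b)
                     / (r_plus**2 * (r_plus + r_minus) * (r_minus - b)))

# Illustrative dimensionless orbit (hypothetical values, not a physical system).
b, r_minus, r_plus = 1.0, 10.0, 20.0
cE, J = first_integrals(r_minus, r_plus, b)
beta0 = aphelion_speed_over_c(r_minus, r_plus, b)

# Eq. (52) evaluated at r_0 = r_plus, in units with c = 1.
A0 = (r_plus - b) / r_plus
J_from_52 = -r_plus * beta0 / math.sqrt(A0 - beta0**2)
print(cE, J, J_from_52)
```

The cubic is positive between the two turning points, which is the allowed region of the bounded orbit.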

5.2 Exact Kepler laws in Schwarzschild

For the Earth, we have \(r_-=147095000000\,\mathrm {m}\) and \(r_+=152100000000\,\mathrm {m}\). Then, its period is

(62)

These are computed, theoretical quantities, not measured ones. We do not mean that we can observe the Earth's period up to \(10^{-7}\,\mathrm {s}\). We mean that we can predict its value with arbitrary precision, so that we can compare the observed value with it.

The Earth precession per orbit is

(63)

and consequently, the precession per century (i.e. the cumulative precession for 100 Earth’s periods ) in \(\mathrm {arcsec}= \mathrm {deg}/3600\) is .

On the other hand, for Mercury we have \(r_-=46001200000\,\mathrm {m}\) and \(r_+=69816900000\,\mathrm {m}\). Then, its period is

(64)

The Mercury’s precession per orbit is

(65)

implying a precession per century (i.e. the cumulative precession for 100 Earth’s periods, which corresponds to about 415.21 Mercury’s periods) of about \(42.98\,\mathrm {arcsec}\), i.e. the GR value \(\tilde{P}\) quoted in Sect. 6.1.

Let us stress that comparing with the Earth’s period to define a century allows us to avoid any direct reference to an external clock. In some sense we are using relational time within the system itself.
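The procedure of this subsection can be sketched numerically from (60) and (61). The code below is our own illustration, not the computation used for the values in the text: it removes the endpoint singularities with the substitution \(r=\frac{r_++r_-}{2}-\frac{r_+-r_-}{2}\cos \theta \) (an assumption of this sketch) and uses a plain trapezoid rule in double precision, so only the leading digits are meaningful.

```python
import math

C_LIGHT = 299_792_458.0   # m/s
B_SUN = 2953.29           # m, Schwarzschild radius of the Sun quoted in the text

def trapezoid_0_pi(f, n=4096):
    """Composite trapezoid rule on [0, pi]; very accurate for smooth even periodic f."""
    h = math.pi / n
    terms = [0.5 * f(0.0), 0.5 * f(math.pi)]
    terms.extend(f(k * h) for k in range(1, n))
    return h * math.fsum(terms)

def orbit(r_minus, r_plus, b=B_SUN, c=C_LIGHT):
    """Return (period in s of coordinate time t, precession per orbit in rad)."""
    mid = 0.5 * (r_plus + r_minus)
    half = 0.5 * (r_plus - r_minus)
    # slope * r - const is minus the linear factor of Eqs. (60)-(61); it is positive here.
    slope = r_plus * r_minus - b * (r_plus + r_minus)
    const = b * r_plus * r_minus
    norm = (r_minus - b) * (r_plus + r_minus) * (r_plus - b)

    def radius(theta):
        return mid - half * math.cos(theta)

    def dphi(theta):   # sqrt(Psi) * dr/dtheta
        r = radius(theta)
        return r_plus * r_minus / math.sqrt(r * (slope * r - const))

    def dt(theta):     # (dr/dtheta) / sqrt(Phi)
        r = radius(theta)
        return r**2.5 * math.sqrt(norm / (b * (slope * r - const))) / (c * (r - b))

    period = 2.0 * trapezoid_0_pi(dt)
    precession = 2.0 * trapezoid_0_pi(dphi) - 2.0 * math.pi
    return period, precession

T_earth, delta_earth = orbit(147_095_000_000.0, 152_100_000_000.0)
T_mercury, delta_mercury = orbit(46_001_200_000.0, 69_816_900_000.0)

orbits_per_century = 100.0 * T_earth / T_mercury
arcsec_per_century = math.degrees(delta_mercury) * 3600.0 * orbits_per_century
print(T_earth, T_mercury, orbits_per_century, arcsec_per_century)
```

The result should land close to the familiar 43 arcsec per century. To reproduce many more digits one would need extended-precision arithmetic, since the precession is obtained as a small difference from \(2\pi \).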

To find the third law we can consider \(\varPhi \) and \(\varPsi \), make the substitutions \(r\rightarrow \rho /x\), \(r_\pm \rightarrow (1\pm e)/x\) and expand at \(x=0^+\) (i.e. far away from the central mass). The first term in the series reproduces the corresponding Kepler function. The second term in the series gives corrections. This is a very convenient technique to define the Newtonian limit, since it is done before integration. Of course, it requires that one works out the Kepler case first.

5.3 Post-Newtonian approximation

What one usually does is to expand the metric coefficients in isotropic coordinates in series of \(MG/\rho \) and assume that the metric is not very different from Minkowski, i.e. the weak field approximation, namely

$$\begin{aligned} g= -\left( 1-2\alpha \frac{MG}{\rho } + 2 \beta \frac{M^2G^2}{\rho ^2} \right) \mathrm {d}t^2+ \left( 1+2\gamma \frac{MG}{\rho } \right) (\mathrm {d}\rho ^2 + \rho ^2 \mathrm {d}\varOmega ^2). \end{aligned}$$
(66)

This approximation is good enough for Mercury, though it would fail too close to a black hole. In this approximation the Schwarzschild solution is recovered for \(\alpha =\beta =\gamma =1\), while the BD solution is obtained for

$$\begin{aligned} \alpha&=\beta =1, \end{aligned}$$
(67)
$$\begin{aligned} \gamma&=\frac{\omega +1}{\omega +2}. \end{aligned}$$
(68)

It is clear that, as soon as one expands in series, the ability to spot isolated singular solutions is lost. Of course, for \(\omega \rightarrow \infty \), one gets \(\gamma =1\) as for the Schwarzschild solution, but for \(\omega =-3/2\) one gets \(\gamma = -1\). As a matter of fact, one is assuming a priori to be on the main regular sequence of solutions, which is incorrect (or a partial viewpoint) as we shall see. For this reason it is much better to stick to exact results rather than expanding in series from the start. As an extra bonus, by doing it exactly, we can also test how close we need to be to see the strong field effects which, in principle, are a solid prediction of the theory.
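The two limiting values of \(\gamma \) just mentioned follow directly from (68). A small sketch in exact rational arithmetic (the function name is ours):

```python
from fractions import Fraction

def gamma_bd(omega):
    """PN parameter gamma of the BD solution, Eq. (68): (omega + 1) / (omega + 2)."""
    omega = Fraction(omega)
    if omega == -2:
        raise ZeroDivisionError("gamma is not defined for omega = -2")
    return (omega + 1) / (omega + 2)

gamma_degenerate = gamma_bd(Fraction(-3, 2))   # degenerate value omega = -3/2
gamma_large = gamma_bd(10**4)                  # large omega, close to GR
print(gamma_degenerate, gamma_large, float(1 - gamma_large))
```

For large \(\omega \) the deviation \(1-\gamma =1/(\omega +2)\) decays only as \(1/\omega \), which is why a large lower bound on \(\omega \) is needed in this expansion.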

5.4 Schwarzschild–de Sitter

Let us discuss the Schwarzschild–de Sitter space-time with \(A= 1 -b/r - \lambda r^2\). That is a solution of Einstein equations in vacuum with cosmological constant \(\varLambda = 3\lambda \). We need \(A>0\) at least in a region \(r\in [r_1, r_2]\). Since we have

$$\begin{aligned} A= 1 -\frac{b}{r} - \lambda r^2= \frac{ -\lambda r^3+ r-b }{ r}, \end{aligned}$$
(69)

if \(\lambda >0\), we have \(\lim _{r\rightarrow +\infty } A=-\infty \) and, if \(b>0\), we have \(\lim _{r\rightarrow 0} A=-\infty \). Thus, we would like a region in the middle where \(A>0\), which is easy to find since

$$\begin{aligned} A'= \frac{b}{r^2} - 2\lambda r = \frac{-2\lambda r^3+ b }{ r^2}=0 \quad \iff \quad r_*=\sqrt[3]{\frac{ b }{ 2\lambda }} \end{aligned}$$
(70)

and in \(r=r_*\) we have

$$\begin{aligned} A= \frac{ -\lambda {\frac{ b }{ 2\lambda }}+ \sqrt[3]{\frac{ b }{ 2\lambda }}-b }{\sqrt[3]{\frac{ b }{ 2\lambda }}} =\left( 1 - \frac{3}{2}\> \sqrt[3]{ 2b^2\lambda } \right) >0 \quad \Rightarrow \quad 0<\lambda < \frac{4}{27b^2}. \end{aligned}$$
(71)

If \(\lambda \) is too big there is no region where \(A>0\). That means that as \(\lambda \) grows, sooner or later the cosmological horizon will touch the Schwarzschild one.
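The bound in (71) is easy to probe numerically. In the sketch below the function names and the choice \(b=1\) are illustrative assumptions; the three values of \(\lambda \) sit below, at and above the critical value \(4/(27b^2)\).

```python
def A_sds(r, b, lam):
    """Schwarzschild-de Sitter coefficient A = 1 - b/r - lam * r^2, Eq. (69)."""
    return 1.0 - b / r - lam * r**2

def r_star(b, lam):
    """Location of the maximum of A, Eq. (70)."""
    return (b / (2.0 * lam)) ** (1.0 / 3.0)

def lambda_max(b):
    """Upper bound on lam from Eq. (71)."""
    return 4.0 / (27.0 * b**2)

b = 1.0  # illustrative units
A_at_max = {}
for factor in (0.5, 1.0, 2.0):
    lam = factor * lambda_max(b)
    A_at_max[factor] = A_sds(r_star(b, lam), b, lam)
print(A_at_max)
```

At the critical value the maximum of A touches zero at \(r_*=\frac{3}{2}b\), where the two horizons merge.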

Thus, the cosmological constant has to be small enough. In this case the function A has at least two zeros, hence three (since its numerator is of odd degree), and it can be written as

$$\begin{aligned} \begin{aligned} A&= \frac{ -\lambda (r-r_0)(r-r_1)(r- r_2) }{ r}\\&=\frac{ -\lambda r^3 +\lambda (r_0+ r_1+ r_2)r^2 -\lambda (r_0r_1+ r_0r_2+ r_1r_2)r+\lambda r_0r_1r_2 }{ r}. \end{aligned} \end{aligned}$$
(72)

Hence, we have \(r_0=-( r_1+ r_2)\) and

$$\begin{aligned} A=\frac{ -\lambda r^3 +\lambda (r_1^2 +r_1r_2 +r_2^2)r-\lambda (r_1^2r_2+r_1r_2^2) }{ r} \quad \Rightarrow \quad {\left\{ \begin{array}{ll} \lambda = \frac{ 1}{ r_1^2 +r_1r_2 +r_2^2}\\ b= \frac{ r_1^2r_2+r_1r_2^2}{ r_1^2 +r_1r_2 +r_2^2}\\ \end{array}\right. }, \end{aligned}$$
(73)

which is solved for \((r_1(\lambda , b), r_2(\lambda , b))\).
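The relations (73) between \((\lambda , b)\) and \((r_1, r_2)\) can be verified in exact arithmetic. This is a sketch with hypothetical sample radii \(r_1=1\), \(r_2=100\); the function names are ours.

```python
from fractions import Fraction

def sds_parameters(r1, r2):
    """Return (lam, b) from the two positive horizons (r1, r2), Eq. (73)."""
    r1, r2 = Fraction(r1), Fraction(r2)
    s = r1**2 + r1 * r2 + r2**2
    return 1 / s, (r1**2 * r2 + r1 * r2**2) / s

def A_direct(r, b, lam):
    """A = 1 - b/r - lam * r^2, Eq. (69)."""
    return 1 - b / r - lam * r**2

def A_factorised(r, r1, r2):
    """A written through its zeros, with the negative zero r0 = -(r1 + r2)."""
    s = r1**2 + r1 * r2 + r2**2
    return (r + r1 + r2) * (r - r1) * (r2 - r) / (r * s)

r1, r2 = Fraction(1), Fraction(100)   # illustrative horizons
lam, b = sds_parameters(r1, r2)
print(lam, b)
```

Using `Fraction` avoids rounding, so the two forms of A can be compared with exact equality.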

In what follows, we shall fix the (positive, small) Schwarzschild radius \(r_1\) and the (positive, large) cosmological radius \(r_2\), thus setting \(r_0=-r_1-r_2\). Thus, we can write

$$\begin{aligned} A= \frac{ -\lambda (r-r_0)(r-r_1)(r- r_2) }{ r} = \frac{ \left( r+(r_1+r_2)\right) (r-r_1)(r_2-r) }{ r (r_1^2 +r_1r_2 +r_2^2)}, \end{aligned}$$
(74)

which indeed is positive in the allowed region \(r\in [r_1, r_2]\), and we compute the corresponding \((b, \lambda )\) out of \((r_1, r_2)\). The function \(\varPhi \) is then given by

$$\begin{aligned} \varPhi&= A^2 \left( c^2 - \frac{r^2+J^2 }{ E^2 r^2 } A \right) \end{aligned}$$
(75)
$$\begin{aligned}&= \frac{ c^2E^2 (r_1^2 +r_1r_2 +r_2^2)r^3 -(r^2+J^2)(r+r_1+r_2)(r-r_1)(r_2-r) }{ (r_1^2 +r_1r_2 +r_2^2)^3E^2 r^5}\nonumber \\&\quad (r+r_1+r_2)^2(r-r_1)^2(r_2-r)^2, \end{aligned}$$
(76)

while the function \(\varPsi \) is given by

$$\begin{aligned} \varPsi = \frac{J^2}{r} \cdot \frac{ r_1^2 + r_1r_2 +r_2^2 }{ c^2E^2 (r_1^2 +r_1r_2 +r_2^2)r^3 -(r^2+J^2)(r+r_1+r_2)(r-r_1)(r_2-r) }. \end{aligned}$$
(77)

Thus, within the region \([r_1, r_2]\) the behaviour of \(\varPhi \) (and \(\varPsi \)), in particular the zeros and the allowed regions, are ruled by the fifth degree polynomial

$$\begin{aligned} p(r)=c^2E^2 (r_1^2 +r_1r_2 +r_2^2)r^3 -(r^2+J^2)(r+r_1+r_2)(r-r_1)(r_2-r). \end{aligned}$$
(78)

If we want a bounded orbit, p(r) must have 4 zeros, \(\{r_m, r_-, r_+, r_M\}\), in the region \([r_1, r_2]\); since p(r) has no term of degree four, its zeros sum to zero and the fifth zero is negative, \(r=-(r_m+ r_++r_- + r_M)\). Hence, it is

$$\begin{aligned} p(r)=(r+r_m+ r_++r_- + r_M)(r-r_m)(r-r_-)(r-r_+)(r-r_M). \end{aligned}$$
(79)

If one knows the mass of the star and cosmological constant, then \(r_1\) and \(r_2\) can be computed. Then, one gives a planet with its \(r_\pm \) and can compute out of \((r_1, r_2, r_\pm )\) the value of \((E, J, r_m, r_M)\), i.e. the initial conditions. In other words, while \((r_1, r_2)\) are a convenient way of parameterising the parameters of the system, namely \((m, \lambda )\), \((r_+, r_-)\) are a convenient way of parameterising initial conditions of a specific time-like geodesic.

Although it is good to discuss it once from scratch, one can also cut the discussion short by saying that we want to have an orbit, i.e. a time-like geodesic that is bounded from above and below. That means we need an allowed region \([r_-, r_+]\) for \(\varPhi \) and \(\varPsi \), so that both \(r_\pm \) are simple zeros. Knowing that, at infinity, \(\varPhi \) is definite negative and it is positive around \(r_1<r_m<r_-\le r_+< r_M< r_2\), we directly get

$$\begin{aligned} \varPhi= & {} \frac{ (r+r_m+ r_++r_- + r_M)(r-r_m)(r-r_-)(r-r_+)(r-r_M)}{ (r_1^2 +r_1r_2 +r_2^2)^3E^2 r^5} \nonumber \\&\quad (r+r_1+r_2)^2(r-r_1)^2(r_2-r)^2. \end{aligned}$$
(80)

Note that we also have two allowed regions \([r_1, r_m]\) and \([r_M, r_2]\) corresponding to the test particle falling in and escaping to infinity, respectively. Anyway, here we are interested in solutions in \([r_-, r_+]\).

Let us finally remark that the factorised form of \(\varPhi \) is very convenient for analytical computation. Most of the possibility of treating the problem analytically relies on this factorisation, i.e. on using \((r_1, r_2, r_\pm )\) instead of \((\lambda , m, E, J)\) and expressing the rest in terms of them.

Now that \(\varPhi (r)\) and \(\varPsi (r)\) are fixed, one can compute the planet period and the precession per orbit as we did for Schwarzschild. That can be done for the Earth and for Mercury so as to obtain the ratio between periods. Then, we can compute the precession of Mercury in a century, obtaining

(81)

to be compared with the result in GR

(82)

Accordingly, we have a relative error of \(\varDelta =10^{{-16}}\) if we neglect the cosmological constant.

We can repeat the computation for different values of the cosmological constant to check how it grows when we switch it on. In fact we have

(83)

which shows that the effect of the cosmological constant grows approximately linearly in the \((\log ,\log )\)-graph and that with a cosmological constant \(\varLambda = \varLambda _+\) we are well within the limit in which we cannot observe it in Mercury’s perihelion. We put a bar to highlight on its left-hand side the digits which agree with the value computed with no cosmological constant. That bar also highlights on the right-hand side the digits which are affected by the cosmological constant. Let us notice that we are not simply saying that as long as we consider Mercury’s precession we can neglect the cosmological constant. We are in fact computing the relative error one makes by neglecting it.

5.5 Solution in \(f({\mathcal {R}})\)

We have considered the Schwarzschild–de Sitter metric and computed the precession of Mercury in it. We found that if the cosmological constant is small enough the theoretical precession is compatible with the observed one.

In view of the universality theorem we know that (vacuum) Palatini \(f({\mathcal {R}})\)-theories are in fact equivalent to Einstein gravity with a cosmological constant. However, extra care is needed in this case. The universality theorem claims that \(\tilde{g}\) is a solution of Einstein equations with a cosmological constant, the value of which is dictated by the function \(f({\mathcal {R}})\) via the master equation. Now, from the viewpoint of the Ehlers, Pirani and Schild (EPS) framework (see [45, 46]), one should expect \(\tilde{g}\) to govern geodesic motions, while g is related to distances; in the Schwarzschild–de Sitter model (as above), instead, one has a single metric, namely g above, which dictates both geodesic equations and distances. Although in principle one should discuss whether this aspect plays a relevant role when applying the discussion above, as it does in general, we have to remark that the Solar System is modelled by a vacuum solution of Palatini \(f({\mathcal {R}})\)-theory, in which the conformal factor is hence constant. Accordingly, \(\{g\}= \{\tilde{g}\}\), i.e. the two metrics g and \(\tilde{g}\) actually define the same geodesic trajectories and the same time-like directions, and one has \(\tilde{R}_{\mu \nu }=R_{\mu \nu }\), i.e.

$$\begin{aligned} \tilde{R} \tilde{g}_{\mu \nu } =\tilde{g}^{\rho \sigma } \tilde{R}_{\rho \sigma } \tilde{g}_{\mu \nu } = g^{\rho \sigma } \tilde{R}_{\rho \sigma } g_{\mu \nu } = {\left\{ \begin{array}{ll} {\mathcal {R}}g_{\mu \nu } \\ g^{\rho \sigma } R_{\rho \sigma } g_{\mu \nu } = R g_{\mu \nu }.\\ \end{array}\right. } \end{aligned}$$
(84)

Accordingly, also g obeys the same field equations as \(\tilde{g}\). We can use g only in vacuum, and apply the result above.

If the function \(f({\mathcal {R}})\) determines a small enough cosmological constant (as it happens with the function (1) and parameters (12)), then it actually predicts precession of Mercury compatible with the observed one. Let us remark that Mercury test is passed despite the value \(\alpha \simeq 0.095\). That directly shows that \(\alpha \simeq 1\) is not required to pass this test. In view of dynamical equivalence between Palatini \(f({\mathcal {R}})\) theory and BD (with a potential and \(\omega =-3/2\)) this shows also that such a BD theory passes the test as well.

Also in this case, one should pay attention to the fact that in Palatini \(f({\mathcal {R}})\)-theories geodesics and distances are related to two different metrics, while in BD both are related to the same metric g. However, in vacuum, this is not an issue since g-time-like \(\tilde{g}\)-geodesics are also (and the only) g-time-like g-geodesics. However, in non-vacuum solutions (as galactic dynamics or cosmology) the models would be actually different.

Since classical tests have been performed in BD theories (with no potential and generic \(\omega \), hence different from \(\omega =-3/2\), which is degenerate) and they show that \(\omega \) must be \(\omega >10^4\), this was used to try, erroneously as we discussed above, to rule out all Palatini \(f({\mathcal {R}})\)-theories at once. Although we saw directly that this argument is spurious, we shall review the test in the BD model in an exact formulation. We have two reasons to do it. First, since we will not use the PN approximation for it, we can apply it close to the horizon, in the strong field regime. Secondly, we show that the value \(\omega =-\frac{3}{2}\) is degenerate also with respect to the test and the role of the potential cannot be neglected.

6 Mercury test in BD theory

Let us here consider a BD theory with no potential (and \(\omega \not =-3/2\), as this is used to determine the solution as long as the scalar field is concerned). Test particles (and the planets Earth and Mercury) are assumed to go along geodesics of g. On the equatorial plane, in isotropic coordinates, one has the Lagrangian

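$$\begin{aligned} L= mc \sqrt{c^2A(\rho ) - B(\rho ) \left( {\frac{\mathrm {d}\rho }{\mathrm {d}t}}\right) ^2 -\rho ^2 B(\rho ) \left( {\frac{\mathrm {d}\phi }{\mathrm {d}t}}\right) ^2 } \> \mathrm {d}t, \end{aligned}$$
(85)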

again with total energy and angular momentum as first integrals. In isotropic coordinates, the coefficients are a bit different from what was described so far, so we need to repeat the discussion from scratch.

First integrals are

$$\begin{aligned} J&=- \frac{ B(\rho ) \rho ^2 {\frac{\mathrm {d}\phi }{\mathrm {d}t}} }{\sqrt{c^2 A(\rho ) - B(\rho ) \left( {\frac{\mathrm {d}\rho }{\mathrm {d}t}}\right) ^2 -\rho ^2 B(\rho ) \left( {\frac{\mathrm {d}\phi }{\mathrm {d}t}}\right) ^2 }}, \end{aligned}$$
(86)
$$\begin{aligned} E&= - \frac{ A(\rho ) }{ \sqrt{c^2 A(\rho ) - B(\rho ) \left( {\frac{\mathrm {d}\rho }{\mathrm {d}t}}\right) ^2 -\rho ^2 B(\rho ) \left( {\frac{\mathrm {d}\phi }{\mathrm {d}t}}\right) ^2 }}, \end{aligned}$$
(87)

which can be inverted for velocities as

$$\begin{aligned} \left( {\frac{\mathrm {d}\rho }{\mathrm {d}t}}\right) ^2&= \frac{ A }{ B^2 } \left( c^2 B - A \frac{ B \rho ^2+J^2 }{ E^2 \rho ^2 } \right) =\varPhi (\rho ; E, J; m, C, \omega ), \end{aligned}$$
(88)
$$\begin{aligned} \left( {\frac{\mathrm {d}\phi }{\mathrm {d}t}}\right) ^2&= \frac{ J^2 A^2}{ B^2\rho ^4 E^2 } \quad \Rightarrow \left( {\frac{\mathrm {d}\phi }{\mathrm {d}\rho }}\right) ^2 = \frac{J^2 A}{ { \rho ^4 E^2 \left( c^2 B - A \frac{ B \rho ^2+J^2 }{ E^2 \rho ^2 } \right) } } =\varPsi (\rho ; E, J). \end{aligned}$$
(89)

Considering \(\varPhi \), if \(\rho \rightarrow +\infty \), given that \(A\rightarrow 1\) and \(B\rightarrow 1\), we have

$$\begin{aligned} \varPhi \rightarrow \frac{ c^2 E^2 -1 }{ E^2 }<0 \quad \Rightarrow c^2 E^2< 1 \quad \Rightarrow -1<cE< 0 \end{aligned}$$
(90)

in order to have bounded orbits.

Now that we know how it works, we can either study the function \(\varPhi \) for all parameters and then select parameters which describe bounded orbits, or simply require two solutions \(\varPhi (\rho _\pm )=0\) that bound an allowed region \([\rho _-, \rho _+]\). Indeed, if we consider the equations \(\varPhi (\rho _\pm ; E, J)=0\), we can solve for \(\left( E(\rho _\pm ), J(\rho _\pm )\right) \) and then replace them back into the Weierstrass functions to obtain

$$\begin{aligned} \left( \frac{\mathrm {d}\rho }{ \mathrm {d}t} \right) ^2&= \varPhi (\rho ; \rho _\pm ), \end{aligned}$$
(91)
$$\begin{aligned} \left( \frac{ \mathrm {d}\phi }{ \mathrm {d}\rho } \right) ^2&= \varPsi (\rho ; \rho _\pm ). \end{aligned}$$
(92)

The second issue to discuss is how to use orbital parameters. Since we are in isotropic coordinates, the coordinate \(\rho \) is not endowed with a direct meaning of a distance and it is different from the coordinate r in pseudo-spherical coordinates (see [32]). In view of the form of the metric in spherical coordinates, space-time is foliated into Euclidean spheres (we mean metric spheres, not only topological spheres) parameterised by \((t, r)\), with the coordinate r being the radius each sphere would have if it were embedded into a Euclidean space. That in fact corresponds to the observation protocol for measuring astronomical distances, which uses the Newtonian approximation and classical Kepler laws. Accordingly, the observed orbital parameters have to be related to r, not to \(\rho \). However, we know that

$$\begin{aligned} B(\rho ) \rho ^2= r^2 \end{aligned}$$
(93)

and we can determine the values of \(\rho _\pm \) which correspond to the observed \(r_\pm \) for Mercury or the Earth.
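Inverting (93) is a one-dimensional root-finding problem. The sketch below uses plain bisection and, as a stand-in for the BD coefficient of solution (36), which is not reproduced here, the Schwarzschild coefficient in isotropic coordinates, \(B(\rho )=(1+\frac{b}{4\rho })^4\). That choice, the function names and the bracket are assumptions of this example.

```python
import math

def rho_from_r(r, B, rho_lo, rho_hi, max_iter=200):
    """Solve B(rho) * rho**2 = r**2 by bisection; needs a sign change on the bracket."""
    def f(rho):
        return B(rho) * rho**2 - r**2

    f_lo = f(rho_lo)
    if f_lo * f(rho_hi) > 0.0:
        raise ValueError("no sign change on the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (rho_lo + rho_hi)
        if mid == rho_lo or mid == rho_hi:   # bracket cannot shrink any further
            break
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            rho_lo, f_lo = mid, f_mid
        else:
            rho_hi = mid
    return 0.5 * (rho_lo + rho_hi)

b = 2953.29   # m, Schwarzschild radius of the Sun quoted in the text

def B_schwarzschild_isotropic(rho):
    """Assumed stand-in for B(rho): Schwarzschild in isotropic coordinates."""
    return (1.0 + b / (4.0 * rho)) ** 4

r_minus = 46_001_200_000.0   # Mercury's perihelion, as quoted in the text
rho_minus = rho_from_r(r_minus, B_schwarzschild_isotropic, b / 4.0, r_minus)

# Closed-form inverse of r = rho * (1 + b / (4 rho))^2, used only as a check.
rho_closed = 0.5 * (r_minus - 0.5 * b + math.sqrt(r_minus * (r_minus - b)))
print(rho_minus, rho_closed, r_minus - rho_minus)
```

For a BD solution one would pass the corresponding \(B(\rho )\) instead. At Mercury's orbit the two radial coordinates differ by roughly \(b/2\), i.e. about 1.5 km.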

Concerning the solution (36) in BD theory, we know that standard GR corresponds to a big value of \(\omega \), \(C=0\) and \(\lambda =1\). In view of the constraint among \((\omega , \lambda , C)\), we can consider a subfamily of solutions given by setting \(\lambda =1\) and \(C=-\frac{1}{n}\), which gives \(\omega = 2(n-1)\), and hence tabulate the exact precession of Mercury for each value of n.

This gives us a (partial) insight on how precession of Mercury depends on \((\omega , C)\) in BD theory. We expect to obtain a good agreement with observed values for \(n\rightarrow \infty \) and observe rather incompatible values for small values of n. We can also consider the limit \(n\rightarrow \frac{1}{4}\), which corresponds to \(\omega \rightarrow -\frac{3}{2}\), i.e. the degenerate parameter although with no potential.
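The parameterisation of this subfamily can be written down explicitly. A small sketch in exact arithmetic (the function name and the dictionary layout are ours), covering the values of n considered in this section:

```python
from fractions import Fraction

def bd_subfamily(n):
    """Parameters of the lambda = 1 subfamily: C = -1/n and omega = 2 (n - 1)."""
    n = Fraction(n)
    return {"n": n, "C": -1 / n, "omega": 2 * (n - 1)}

rows = [bd_subfamily(n) for n in (Fraction(1, 4), 1, 4, 10, 50, 100, 500, 1000)]
for row in rows:
    print(row["n"], row["C"], row["omega"])
```

The first row is the degenerate case \(\omega =-\frac{3}{2}\), while \(C\rightarrow 0^-\) and \(\omega \rightarrow +\infty \) as n grows.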

The whole computation of the precession for a specific value of n involves exact calculations up to a finite number of numerical integrations of converging improper integrals. This is not much different from what one does when studying the graph of a transcendental function, in which a finite number of evaluations of the function (e.g. to determine zeros or critical points) are eventually performed numerically. Accordingly, we can say our computation is an analytical exact result or, if you prefer, call it semi-analytical.

For each value of n we computed the orbital periods of the Earth and of Mercury, the precession per orbit of Mercury, and finally obtained the precession P(n) of Mercury during 100 orbits of the Earth:

$$\begin{aligned} \begin{aligned} n \quad \quad&P(n)\>\mathrm{(arcsec/century)}\\ \frac{1}{4} \quad \quad&\ -71.64 \\ 1 \quad \quad&14.33 \\ 4 \quad \quad&35.82 \\ 10 \quad \quad&40.12 \\ 50 \quad \quad&42.41 \\ 100 \quad \quad&42.70 \\ 500 \quad \quad&42.92 \\ 1000 \quad \quad&42.95. \\ \end{aligned} \end{aligned}$$
(94)

These values are theoretical predictions in different solutions of BD theories. They can be computed at arbitrary precision. Here we checked that the shown digits are not affected when the overall precision required is increased.

We see that in fact one can distinguish among different values of n (and \(\omega \)) by means of P, which is definitely observable. Since the observed value is about \(42.98\,\mathrm {arcsec\,century^{-1}}\), observations are able to exclude small values of \(\omega \), including \(\omega =-\frac{3}{2}\), as expected.

For the degenerate value \(\omega =-\frac{3}{2}\) one has two possible solutions: one for \(C=-4\) (i.e. \(n=\frac{1}{4}\)), which sits in the sequence we analysed, and one for \(C=0\), which does not and corresponds to ordinary Schwarzschild. In fact, for the solution with \(C=0\) the value of \(\omega \) is undetermined and the corresponding solution is somehow isolated in the solution space. It passes the tests obviously, since it is the same solution of standard GR.

Accordingly, it is not completely correct to say that BD theories with a small value of \(\omega \), even with no potential, are ruled out by observations. The solutions with \(C\not =0\) are, while of course the solution with \(C=0\) is not.

6.1 Constraints in parameter space

We consider BD solutions for generic \(\omega \) and C. We sample the parameter space, computing the precession of Mercury (in \(\mathrm {arcsec\,century^{-1}}\)) and the relative error \(\varDelta P(\omega , C) =\frac{P-\tilde{P}}{\tilde{P}}\) (\(\times 100\)) with respect to the expected GR value of about \(\tilde{P}=42.98045118132\).

Fig. 2 Relative error between computed precession and observed value (\(\log \)-scale) as a function of \(\log (\omega )\) on x-axis and \(-\log (-C)\) on y-axis. Standard GR corresponds to \(\omega \rightarrow +\infty \) and \(C\rightarrow 0\); hence, it corresponds to the direction \((+\infty , + \infty )\) in the plane

The values of \(\omega \) and C which have interestingly low errors are widely spread in the \((\omega , C)\) plane. Low errors are expected in the limit \(\omega \rightarrow +\infty \) and \(C\rightarrow 0\), which corresponds to standard GR in the non-degenerate sequence. Thus, we plot \(\varDelta P(\omega , C)\) on the axes \(x=\log (\omega )\), \(y=-\log (-C)\), so that standard GR corresponds to the limit \(x\rightarrow +\infty \) and \(y\rightarrow +\infty \). Indeed, we see in Fig. 2 that in the region \(x\rightarrow +\infty \) and \(y\rightarrow +\infty \) one has the smallest errors, as expected.
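As an illustration of the quantity \(\varDelta P\), one can evaluate it on the one-parameter sequence tabulated in (94). The sketch below only reuses the rounded values of that table, so the resulting percentages are accurate to the digits displayed there; the variable and function names are ours.

```python
P_GR = 42.98045118132   # arcsec/century, expected GR value quoted in the text

# Rounded values P(n) from table (94), in arcsec/century.
table_94 = {
    0.25: -71.64, 1: 14.33, 4: 35.82, 10: 40.12,
    50: 42.41, 100: 42.70, 500: 42.92, 1000: 42.95,
}

def relative_error_percent(P, P_ref=P_GR):
    """Relative error (x100) of a predicted precession with respect to P_ref."""
    return 100.0 * (P - P_ref) / P_ref

errors = {n: relative_error_percent(P) for n, P in table_94.items()}
for n, err in errors.items():
    print(f"n = {n:>7}: Delta P = {err:9.3f} %")
```

Along this sequence the absolute error decreases monotonically with n and drops below 0.1% at \(n=1000\).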

6.2 Conformal factor and Weyl transformations

We can still explore one possibility. When we used aphelion and perihelion in isotropic coordinates we used the metric g, as the EPS framework dictates. However, we have two metrics and we should check the difference with what is predicted if we used \(\tilde{g}\) instead. This is relatively easy: all we have to do is to determine \(\rho _\pm \) by using the equation

$$\begin{aligned} \varphi (\rho ) B(\rho ) \rho ^2 = r_\pm ^2 \end{aligned}$$
(95)

instead of (93). In the case \(n=1000\), we obtain the precession prediction to be

$$\begin{aligned} \tilde{P}(1000) = 42.95\,\mathrm {arcsec\,century}^{-1} \end{aligned}$$
(96)

to be compared with the value computed in (94), namely \(P(1000)= 42.95\,\mathrm {arcsec\,century^{-1}}\). Thus, there is a difference, though a tiny one (it does not show up at the precision displayed here), as expected since the conformal factor is very close to 1 at the orbit of Mercury or the Earth.

Still, the difference is there and is expected to become bigger in stronger regimes which, by the way, says that the difference between using g or \(\tilde{g}\) to describe distances is in principle observable.

7 Conclusion and perspectives

We considered the standard test of Mercury in different contexts. Our treatment is analytical, and we do not resort to weak field approximations, so that our framework is still valid for satellites orbiting a black hole at a few Schwarzschild radii; see [47, 48].

It has been argued that, since BD theory is ruled out by observations (for small values of \(\omega \)) and it is dynamically equivalent to Palatini \(f({\mathcal {R}})\)-theories, these are ruled out as well.

We showed that this is not the case for a number of reasons. First of all, dynamical equivalence holds for the degenerate value of the parameter \(\omega =-\frac{3}{2}\), and moreover with a potential, which has not been considered in the original BD test. Furthermore, in BD theory one has no Birkhoff theorem, so one has a three-parameter family of static spherically symmetric solutions.

We considered standard GR, standard GR with a cosmological constant, Palatini \(f({\mathcal {R}})\)-theories and BD theories. These produce the Schwarzschild solution, with or without a cosmological constant \(\varLambda \), as well as a more general family (36) of solutions of BD theory.

We showed that the Schwarzschild and Schwarzschild–de Sitter solutions pass the test provided that the cosmological constant \(\varLambda \) is small enough. For Palatini \(f({\mathcal {R}})\)-theories this imposes constraints on the function \(f({\mathcal {R}})\) which, for example, are met in some of the models based on (1), among which one has the best fit values (12) found in [34]. Among solutions of BD theory, we showed that, besides the Schwarzschild solution (which is a solution of BD theory for any value of \(\omega \) and passes the test), the other solutions of BD theory also pass the test provided that \(\omega \) is big enough.

This shows directly that the Mercury test does not technically rule out BD theory for small \(\omega \); rather, it rules out some of its solutions. Of course, the Schwarzschild solution cannot be ruled out; on the other hand, one does not need BD theory to have a Schwarzschild solution.

It also shows directly that dynamical equivalence is irrelevant for ruling out Palatini \(f({\mathcal {R}})\)-theories, which in fact are ruled out only if they produce too large a value of the effective cosmological constant. In the family (1) considered in [34], where the best fit value (12) of \((\alpha , \beta , \gamma )\) was found to model type Ia supernovae, the parameter fit was found to be strongly degenerate. In particular, the value of \(\beta \) was found to be poorly localised by SNIa data (as can be expected, since the \(\beta \) parameter has a very tiny effect in a universe where supernovae can occur), and for any given value of \(\alpha \) one could find a best value for \(\gamma \) which produces a good agreement with observations.

During the peer review of [34] it was argued that the best fit value of \(\alpha \simeq 0.1\) would fail to model the Solar System. Here we showed that this is not the case. The ruling out of theories has to be carried out at the level of observables, not of actions. Although \(\alpha \simeq 0.1\) produces a vacuum action which does not approximate the standard GR vacuum action, the constant factor has no effect on observables in vacuum, hence in the Solar System tests. This is also implied by the universality theorem for Palatini \(f({\mathcal {R}})\)-theories.

In fact, if SNIa fix a best fit value for \(\gamma \), then the Mercury test produces constraints on the ratio \(\frac{\gamma }{\alpha }\). The two sets of data together remove the degeneracy for \(\alpha \) and \(\gamma \), leaving the one connected to \(\beta \).

As for future perspectives, we need to extend the analysis to the other classical Solar System tests (light deflection and radar delays) to check whether these add constraints. We need to extend our exact approach to these cases so that conclusions are robust against weak field approximations and stay valid in a strong field regime.

Further constraints may arise from tight binary systems, in which the weak field approximations can be at stake, and one can also look for consequences in collapse events relevant to gravitational waves. Since our method is viable for different theories, this would open a way to use gravitational wave phenomenology to reliably distinguish different gravitational theories, especially when much more precise data become available with new experimental surveys; see, for example, [49, 50].

From this viewpoint of distinguishing different theories in terms of observables only, this paper is one of a series aimed at discussing the validity of a specific family of theories [namely in this case (1)]. Before even discussing the validity of a model, one needs to fix parameters. This is not different from what is routinely done in QFT, when a general theory of the electromagnetic field and its interaction with charged fields has to be calibrated by choosing one cross section (the Compton scattering) to fix the renormalised parameters \(\frac{e}{m}\). Only after this calibration can one predict other cross sections and validate or falsify the model.

Gravitational theories are not different, even in the classical regime. They have parameters to be calibrated by choosing some conventional observations. What we are doing is choosing SNIa to fix \(\gamma \) and Mercury’s precession to fix \(\alpha \). Further investigation must be devoted to fixing \(\beta \) (e.g. by using element formation). Only at that point, once a model has survived calibration, can one use the calibrated model to predict expected values (e.g. the power spectrum) and falsify the theory on the basis of observations.

In other words, that is the way to go in modified gravity. We are adding parameters (e.g. in \(f({\mathcal {R}})\)-theories we have the potentially infinitely many parameters which fix the function \(f({\mathcal {R}})\)). In order to stick to finitely many parameters, we consider one family of functions at a time. Then, we need extra experiments to calibrate the theory (in standard GR we only have G, which in fact can be fixed by lab experiments) before we are allowed to say we defined a model. The more parameters we add, the harder the calibration, which is a fair price to pay for an extended model of gravitation.

All other heuristic arguments about the validity of a theory are based on physical intuition, which is often model dependent; moreover, it has been developed in standard GR and is sometimes uncritically applied to different gravitational theories. We provided above a number of such arguments which eventually have been shown to be inconsistent. Sticking to observations is the only robust way, in gravitational theories (and probably in general), to really falsify a theory.

If we take this approach seriously, the current situation is almost desperate for modified gravity models. Let us close by stressing that here, and in the series of investigations to come, we are still trying to falsify a specific family (1) of Palatini \(f({\mathcal {R}})\)-theories. We still have no clue at all about how to do it for a generic Palatini \(f({\mathcal {R}})\)-theory, which depends on potentially infinitely many parameters, let alone for a generic modified gravity theory. However, this is what needs to be done.