1 Introduction

Despite many theoretical and experimental triumphs [1], including the recent detection of gravitational waves [2], general relativity (GR) is not considered a fundamental theory of gravitational interactions; see e.g. [3,4,5,6,7,8]. Based on our current understanding of the workings of Nature, a few arguments for modifying it can be given. First of all, GR cannot be satisfactorily quantized, as attempts to renormalize it have been futile. Secondly, it is not a low-energy limit of theories regarded as fundamental, such as bosonic string theories [9], where dilaton fields couple non-minimally to the spacetime curvature. Another problem concerns the \(\Lambda \)CDM model: the value of \(\Lambda \) responsible for the current acceleration of the expansion of the Universe is customarily regarded as incomprehensibly small (120 orders of magnitude smaller) when compared to the value predicted by quantum field theory. In fact, more realistic estimates taking into account the Pauli–Zeldovich cancellation effect, quantum field theory in a curved background, or supersymmetry make this discrepancy less drastic (for more discussion see [10,11,12]).

As far as the mathematical reasons for modifying Einstein’s gravity are concerned, we can take the so-called Palatini formalism into consideration. In standard gravity, the underlying assumption about the geometric structures defined on spacetime is that the affine connection is the Levi-Civita connection of the metric. In the Palatini approach, however, we consider these two objects as unrelated, since there is no reason to impose a relation between them a priori. In the case of Einstein gravity, introducing the Palatini formalism does not affect the resulting field equations in any way; however, in the case of more complicated theories, such as scalar–tensor or F(R) theories of gravity, the two approaches usually give different results, describing different physics. The Palatini formalism has been investigated especially in the context of cosmological applications [13,14,15,16,17,18,19,20,21].

Scalar–tensor (S–T) theories of gravity are a very promising modification of the Einstein gravity. In these theories, a scalar field is non-minimally coupled to the curvature scalar [22]. Historically, the prototype of all contemporary scalar–tensor theories was the Brans–Dicke theory [23]. An interesting feature of the scalar–tensor theories of gravity is their equivalence to the F(R) theories, which basically means that the latter can be analyzed using the “mathematical machinery” developed for the former [24]. The reason why the scalar–tensor theories deserve some attention is that they can be successfully used to build credible models for cosmic inflation [25] (utilizing the equivalence between the scalar–tensor and F(R) theories of gravity) and dark energy [26].

Hitherto, the scalar–tensor theories of gravity have been considered mostly in a purely metric approach [13, 22, 26,27,28,29,30], and the possible effects of adopting the Palatini approach have been analyzed somewhat less commonly [31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54]. So far, general conditions for a correct formulation of the scalar–tensor theories have been analyzed [34]. The change of formalism from metric to Palatini applied to S–T theories has been investigated in the context of cosmology: to analyze the problem of the cosmological constant [35]; in quintessence, to show that the equation of state in the Palatini formalism can cross the phantom divide line [36]; and in inflation, where it was discovered that in the Palatini approach [37,38,39,40,41,42,43,44,45] an inflationary epoch is naturally provided [37,38,39,40] and almost scale-invariant curvature perturbations are generated with no tensor modes [46]. Some authors generalized scalar–tensor theories and allowed non-minimal derivative coupling as well [47,48,49,50,51,52]. In such theories, one makes extensive use of so-called “disformal transformations”. It was shown that for a special choice of parameters characterizing the theory, adopting the Palatini approach allows one to avoid Ostrogradski ghosts [47]. Also, vector-Horndeski theories were analyzed with the metric structure decoupled from the affine structure. It was proven that in the Palatini formalism, there exist cosmological solutions which can pass through singularities [53].

The main goal of this paper is to introduce the general theory of scalar–tensor gravity analyzed in the Palatini approach and to develop a mathematical formalism enabling us to analyze any S–T theory in a (conformally) frame-independent manner. The outline of this paper is as follows: in the first part, the postulated action functional will be presented and the equations of motion derived. Next, modified conformal transformations in the Palatini approach will be introduced in order to allow the connection to transform independently of the metric tensor. A solution of the equation resulting from varying with respect to the independent connection will be inspected. Then, following the procedure carried out in [26] (see also [27, 29]), invariant quantities defined for the Palatini S–T theory will be obtained. The results will be applied to an analysis of F(R) Palatini gravity. In the last part, general conditions on the possible equivalence between a given S–T theory and some F(R) gravity will be discussed. For the reader’s convenience, some supplementary material is collected in four Appendices.

2 Action functional and equations of motion

The main idea behind the Palatini approach is the following: we no longer consider metric tensor and linear connection to be dependent on each other. This approach was originally analyzed by Einstein [56], but then was attributed to an Italian mathematician Attilio Palatini [57, 58]. In this approach, one decouples causal structure of spacetime from its affine structure (which determines geodesics followed by particles). In practical terms, Palatini formalism amounts to varying the action functional with respect to both the metric tensor and the torsionless (i.e. symmetric) affine connection, resulting in two sets of field equations. One of these sets establishes a relation between the metric and the connection. There is no particular reason to apply the Palatini variation to the standard Einstein–Hilbert action, as in that case the independent connection turns out to be Levi-Civita with respect to the metric tensor, i.e. related to the metric by the standard formula: \(\Gamma ^\alpha _{\mu \nu }=\frac{1}{2}g^{\alpha \beta }(\partial _\mu g_{\beta \nu }+\partial _\nu g_{\beta \mu }-\partial _\beta g_{\mu \nu })\). However, in case of more complicated theories, such as F(R) theories of gravity, where the curvature scalar in the Einstein–Hilbert action is replaced by a function of it, both approaches give physically incompatible results, leading to different field equations describing different physics in the presence of matter sources. Instead, in the vacuum case, the Einstein equations enriched by adding cosmological constant are still valid [59, 60].

Consider a triple \((M, \Gamma , g)\), where M is an n-dimensional (\(n>2\)) manifold equipped with a torsion-free (\(\equiv \) symmetric) connection \(\Gamma =\Gamma _{\mu \nu }^\alpha =\Gamma _{\nu \mu }^\alpha \) and a metric tensor \(g=g_{\mu \nu }\), possibly of Lorentzian signature. The affine connection is used to build the Riemann curvature tensor:

$$\begin{aligned} R^\alpha _{\,\mu \beta \nu }(\Gamma )=\partial _\beta \Gamma ^\alpha _{\mu \nu }-\partial _\nu \Gamma ^\alpha _{\mu \beta }+\Gamma ^\alpha _{\beta \sigma }\Gamma ^\sigma _{\nu \mu }-\Gamma ^\alpha _{\nu \sigma }\Gamma ^\sigma _{\beta \mu }. \end{aligned}$$
(1)

The curvature scalar is a function of both the connection and the metric tensor:

$$\begin{aligned} R(g,\Gamma )=g^{\mu \nu }R_{\mu \nu }(\Gamma ), \end{aligned}$$
(2)

where \(R_{\mu \nu }(\Gamma )=R^\alpha _{\,\mu \alpha \nu }(\Gamma )\).
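
As a quick worked step (a sketch obtained only by setting \(\beta =\alpha \) in Eq. (1) and using the symmetry of \(\Gamma \); it introduces no additional assumption), the Ricci tensor of the independent connection reads

$$\begin{aligned} R_{\mu \nu }(\Gamma )=\partial _\alpha \Gamma ^\alpha _{\mu \nu }-\partial _\nu \Gamma ^\alpha _{\mu \alpha }+\Gamma ^\alpha _{\alpha \sigma }\Gamma ^\sigma _{\nu \mu }-\Gamma ^\alpha _{\nu \sigma }\Gamma ^\sigma _{\alpha \mu }. \end{aligned}$$

Only the second term can fail to be symmetric in \((\mu ,\nu )\), so that \(R_{[\mu \nu ]}(\Gamma )=\partial _{[\mu }\Gamma ^\alpha _{\nu ]\alpha }\) (antisymmetrization taken with a factor 1/2), which need not vanish for a general symmetric connection. The contraction (2) with the symmetric \(g^{\mu \nu }\) involves only the symmetric part \(R_{(\mu \nu )}(\Gamma )\).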

Utilizing the Palatini approach, we want now to write down the most general action functional for scalar–tensor theories, which is consistent with some class of transformations (see explanations below and Appendix B). The action should contain a scalar field \(\Phi \) – or a function thereof – non-minimally coupled to the curvature defined above and possibly to the matter fields. Furthermore, one must include also a kinetic term rendering the scalar field dynamic, and a self-interaction potential of the field. Presence of additional terms resulting from the approach we adopt, absent in the metric version of the theory, cannot be excluded.

Therefore, we postulate the following action functional:

$$\begin{aligned} S[g_{\mu \nu },\Gamma ^\alpha _{\mu \nu },\Phi ]= & {} \frac{1}{2\kappa ^2}\int _{\Omega }d^nx\sqrt{-g}\Big [{\mathcal {A}}(\Phi )R(g,\Gamma )\nonumber \\&-\,{\mathcal {B}}(\Phi )g^{\mu \nu }\nabla _\mu \Phi \nabla _\nu \Phi \nonumber \\&-\,A_1^\mu (g,\Gamma ) {\mathcal {C}}_1(\Phi )\nabla _\mu \Phi \nonumber \\&-\,A_2^\mu (g,\Gamma ){\mathcal {C}}_2(\Phi )\nabla _\mu \Phi -{\mathcal {V}}(\Phi )\Big ]\nonumber \\&+\,S_{\text {matter}}[e^{2\alpha (\Phi )}g_{\mu \nu },\chi ]. \end{aligned}$$
(3)

This action functional contains six arbitrary functions of one real variable: \(\{{\mathcal {A}},{\mathcal {B}},{\mathcal {C}}_1,{\mathcal {C}}_2,{\mathcal {V}},\alpha \}\), which after composing with the scalar field \(\Phi \) become the scalar functions on the spacetime M. They provide, together with the dynamical variables \((\Gamma , g,\Phi )\), the so-called frame for the action (3). A change of frame is governed by a consistent action which will be introduced later on. Some of these coefficients have exactly the same meaning as their metric counterparts (cf. Appendix A), i.e. \({\mathcal {A}}\) describes coupling between curvature and the field, \({\mathcal {B}}\) is the kinetic coupling, \({\mathcal {V}}\) is the potential of self-interaction of the scalar field, while non-zero \(\alpha \) means that the action functional features an anomalous coupling between the scalar and matter fields \(\chi \). One requires \({\mathcal {A}}\) be non-negative, otherwise, gravity would be rendered a repulsive force. The coefficients \({\mathcal {C}}_1\) and \({\mathcal {C}}_2\) do not have a clear interpretation yet. Their inclusion in the functional is a direct consequence of the Palatini approach we adopted; they do not appear in the metric S–T theory.

Two vectors \(A^\mu _1\) and \(A^\mu _2\) are also a novelty. They are constructed purely from metric and linear connection, and their presence is a direct result of lack of a priori established dependence of the connection on the metric tensor. The two vectors are defined to be:

$$\begin{aligned}&A^\mu _1(g,\Gamma )=g^{\mu \nu }g^{\alpha \beta }\nabla _\nu g_{\alpha \beta }=g^{\mu \nu }g^{\alpha \beta }Q_{\nu \alpha \beta }, \end{aligned}$$
(4a)
$$\begin{aligned}&A^\mu _2(g,\Gamma )=-g^{\mu \nu }g^{\alpha \beta }\nabla _\alpha g_{\nu \beta }=-g^{\mu \nu }g^{\alpha \beta }Q_{\alpha \nu \beta }. \end{aligned}$$
(4b)

The \(\nabla \) operator is defined with respect to the independent connection, hence covariant derivative of the metric tensor does not have to vanish in general. The extent to which theory fails to be metric is quantified by the so-called non-metricity tensor \(Q_{\alpha \mu \nu }=\nabla _\alpha g_{\mu \nu }\).
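
To make the definition (4a) more concrete, the following short computation (a sketch that uses only the definition of the covariant derivative and the identity \(g^{\alpha \beta }\partial _\nu g_{\alpha \beta }=\partial _\nu \ln |g|\)) expresses \(A^\mu _1\) through the trace of the connection:

$$\begin{aligned} g^{\alpha \beta }\nabla _\nu g_{\alpha \beta }&=g^{\alpha \beta }\partial _\nu g_{\alpha \beta }-2\Gamma ^\alpha _{\nu \alpha }=2\left( \partial _\nu \ln \sqrt{-g}-\Gamma ^\alpha _{\nu \alpha }\right) ,\\ A^\mu _1(g,\Gamma )&=2g^{\mu \nu }\left( \partial _\nu \ln \sqrt{-g}-\Gamma ^\alpha _{\nu \alpha }\right) . \end{aligned}$$

For the Levi-Civita connection one has \(\Gamma ^\alpha _{\nu \alpha }=\partial _\nu \ln \sqrt{-g}\), so \(A^\mu _1\) vanishes, as expected for a metric-compatible connection.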

The form of the action functional follows necessarily from our requirement that the action remain form-invariant under conformal and almost-geodesic transformations, accompanied by a re-parametrization of the scalar field. This condition states that if one changes the metric tensor, the connection and the scalar field according to the transformation relations given below (we shall call such transformation “changing the frame”, and the choice of particular metric, connection and scalar field – “(conformal) frame”), solutions to the field equations are mapped into corresponding solutions obtained in the transformed frame.

Palatini approach is based on the assumption that the metric and the symmetric connection are independent quantities and thus should transform independently of each other. In the standard approach only the metric tensor is transformed, and the Levi-Civita connection, being a function of the metric, changes accordingly. In our case, one must devise a way to transform these two objects separately, as it should be possible, for instance, to conformally transform the metric while keeping the connection intact. We introduce the following transformations (cf. [32]):

$$\begin{aligned}&{\bar{g}}_{\mu \nu }=e^{2\gamma _1(\Phi )}g_{\mu \nu }, \end{aligned}$$
(5a)
$$\begin{aligned}&{\bar{\Gamma }}^\alpha _{\mu \nu }=\Gamma ^\alpha _{\mu \nu }+2 \delta ^\alpha _{(\mu }\partial _{\nu )}\gamma _2(\Phi )-g_{\mu \nu }g^{\alpha \beta }\partial _\beta \gamma _3(\Phi ), \end{aligned}$$
(5b)
$$\begin{aligned}&{\bar{\Phi }}=f(\Phi ). \end{aligned}$$
(5c)

These transformations are invertible:

$$\begin{aligned}&g_{\mu \nu }=e^{2{\check{\gamma }}_1({\bar{\Phi }})}{\bar{g}}_{\mu \nu }, \end{aligned}$$
(6a)
$$\begin{aligned}&\Gamma ^\alpha _{\mu \nu }={\bar{\Gamma }}^\alpha _{\mu \nu }+2 \delta ^\alpha _{(\mu }\partial _{\nu )}{\check{\gamma }}_2({\bar{\Phi }})-{\bar{g}}_{\mu \nu }{\bar{g}}^{\alpha \beta }\partial _\beta {\check{\gamma }}_3({\bar{\Phi }}), \end{aligned}$$
(6b)
$$\begin{aligned}&\Phi =\check{f}({\bar{\Phi }}), \end{aligned}$$
(6c)

so that the transformations and their inverse are related in the following way:

$$\begin{aligned}&{\check{\gamma }}_i=-\gamma _i \circ \check{f}, \end{aligned}$$
(7a)
$$\begin{aligned}&\check{f} =f^{-1}. \end{aligned}$$
(7b)
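
These relations can be checked directly (a short consistency sketch, evaluated at a fixed spacetime point with \({\bar{\Phi }}=f(\Phi )\)). Substituting (5a) into (6a) gives

$$\begin{aligned} g_{\mu \nu }=e^{2{\check{\gamma }}_1({\bar{\Phi }})}{\bar{g}}_{\mu \nu }=e^{2\left[ {\check{\gamma }}_1({\bar{\Phi }})+\gamma _1(\Phi )\right] }g_{\mu \nu }\quad \Longrightarrow \quad {\check{\gamma }}_1({\bar{\Phi }})=-\gamma _1(\Phi ). \end{aligned}$$

For the connection, the conformal factors cancel in \({\bar{g}}_{\mu \nu }{\bar{g}}^{\alpha \beta }=g_{\mu \nu }g^{\alpha \beta }\), so substituting (5b) into (6b) requires \(\partial _\nu \left[ {\check{\gamma }}_i({\bar{\Phi }})+\gamma _i(\Phi )\right] =0\) for \(i=2,3\), which is satisfied by the same pointwise relation \({\check{\gamma }}_i({\bar{\Phi }})=-\gamma _i(\Phi )\).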

The transformations are governed by three smooth functions of the scalar field: \(\{\gamma _1,\gamma _2,\gamma _3\}\), depending on the space-time position indirectly, through the scalar field \(\gamma _i(\Phi (x))\). Equation (5c) provides the possibility of field re-definition by the diffeomorphism \(f\in \mathtt {Diff}^{}({\mathbb {R}})\) (see Appendix B). Equation (5a) clearly represents the conformal transformation of the metric tensor. It can be further generalized to include the disformal transformations of the metric tensor, given by:

$$\begin{aligned} g_{\mu \nu }=e^{2{\check{\gamma }}_1({\bar{\Phi }})}{\bar{g}}_{\mu \nu } + D({\bar{\Phi }})\partial _\mu {\bar{\Phi }} \partial _\nu {\bar{\Phi }}, \end{aligned}$$

with a disformal factor \(D({\bar{\Phi }})\); for an example of disformal transformation use within the Palatini framework, see [47]. In this paper, however, we limit our attention to the case when \(D({\bar{\Phi }})=0\).

Equation (5b) is called a generalized almost-geodesic transformation of type \(\pi _3\); the word “almost” suggests that one needs to distinguish between the transformation (5b) and a transformation which genuinely preserves geodesics on the space-time (see Appendix D). In fact, if the function \(\gamma _3\) were equal to zero, one would have precisely the geodesic transformation of the affine connection. The new connection also preserves the light cones, leaving the causal structure of spacetime unchanged. If all functions \(\gamma _i\) were equal, one would recover the standard conformal transformation formulae, identical to the case when the connection is Levi-Civita with respect to the metric tensor. One can also think of the transformation as a Weyl transformation, i.e. without assuming that the connection is metric; in particular, setting \(\gamma _1\ne \gamma _2=\gamma _3\).
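
For illustration, two special cases of (5b) can be written out explicitly (a sketch assuming the symmetrization convention \(\delta ^\alpha _{(\mu }\partial _{\nu )}=\frac{1}{2}(\delta ^\alpha _\mu \partial _\nu +\delta ^\alpha _\nu \partial _\mu )\), which is not stated explicitly in the text). For \(\gamma _1=\gamma _2=\gamma _3\equiv \omega \) one finds

$$\begin{aligned} {\bar{\Gamma }}^\alpha _{\mu \nu }=\Gamma ^\alpha _{\mu \nu }+\delta ^\alpha _\mu \partial _\nu \omega +\delta ^\alpha _\nu \partial _\mu \omega -g_{\mu \nu }g^{\alpha \beta }\partial _\beta \omega , \end{aligned}$$

which is the familiar change of the Levi-Civita connection under \({\bar{g}}_{\mu \nu }=e^{2\omega }g_{\mu \nu }\). For \(\gamma _3=0\) one finds instead

$$\begin{aligned} {\bar{\Gamma }}^\alpha _{\mu \nu }=\Gamma ^\alpha _{\mu \nu }+\delta ^\alpha _\mu \psi _\nu +\delta ^\alpha _\nu \psi _\mu ,\qquad \psi _\mu =\partial _\mu \gamma _2(\Phi ), \end{aligned}$$

i.e. a projective change of the connection, which leaves the geodesic paths unchanged up to a re-parametrization.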

One obtains field equations in the standard way, varying with respect to all independent variables entering the action. Unlike in the metric approach, now it is also necessary to vary w.r.t. the linear connection. Three sets of resulting equations are given below:

$$\begin{aligned} \begin{aligned}&\mathbf{Metric: }\\&-\frac{1}{2}g_{\mu \nu }{\mathcal {L}}(\Phi ,g,\Gamma )+{\mathcal {A}}(\Phi )R_{(\mu \nu )}(\Gamma )-{\mathcal {B}}(\Phi )\partial _\mu \Phi \partial _\nu \Phi \\&\qquad +{\mathcal {C}}'_2(\Phi )\partial _\mu \Phi \partial _\nu \Phi -{\mathcal {C}}'_1(\Phi )g_{\mu \nu }g^{\sigma \beta }\partial _\sigma \Phi \partial _\beta \Phi \\&\qquad +{\mathcal {C}}_2(\Phi )\nabla _\mu \nabla _\nu \Phi -{\mathcal {C}}_1(\Phi )g_{\mu \nu }\Box \Phi \\&\qquad +Q_{\beta \lambda \zeta }\partial _\sigma \Phi \left[ \frac{1}{2}{\mathcal {C}}_2(\Phi ) \delta ^\sigma _{(\mu }\delta ^\beta _{\nu )}g^{\lambda \zeta }\right. \\&\qquad -{\mathcal {C}}_1(\Phi ) \left. \left( \frac{1}{2}g_{\mu \nu }g^{\sigma \beta }g^{\lambda \zeta } -g_{\mu \nu }g^{\sigma \lambda }g^{\beta \zeta }+ \delta ^\sigma _{(\mu }\delta ^\beta _{\nu )}g^{\lambda \zeta }\right) \right] \\&\quad =\kappa ^2 T_{\mu \nu }, \end{aligned} \end{aligned}$$
(8)
$$\begin{aligned} \begin{aligned}&\mathbf{Connection: }\\&\nabla _\alpha \left[ \sqrt{-g}\left( g^{\alpha (\zeta } \delta ^{\lambda )}_{\beta }-g^{\lambda \zeta }\delta ^\alpha _{\beta }\right) \right] \\&\quad =\sqrt{-g}\partial _\alpha \Phi \left[ g^{\alpha (\zeta } \delta ^{\lambda )}_{\beta }\left( \frac{{\mathcal {C}}_2(\Phi ) -2{\mathcal {C}}_1(\Phi )-{\mathcal {A}}'(\Phi )}{{\mathcal {A}}(\Phi )}\right) \right. \\&\left. \qquad -\,g^{\lambda \zeta }\delta ^\alpha _{\beta } \left( \frac{-{\mathcal {C}}_2(\Phi )-{\mathcal {A}}'(\Phi )}{{\mathcal {A}}(\Phi )}\right) \right] , \end{aligned} \end{aligned}$$
(9)
$$\begin{aligned} \begin{aligned}&\mathbf{Scalar field: }\\&{\mathcal {A}}'(\Phi )R(g,\Gamma )+{\mathcal {B}}'(\Phi )g^{\mu \nu } \partial _\mu \Phi \partial _\nu \Phi +2{\mathcal {B}}(\Phi )\Box \Phi \\&\quad +2{\mathcal {B}}(\Phi )\partial _\mu \Phi Q_{\nu \alpha \beta } \left( \frac{1}{2}g^{\mu \nu }g^{\alpha \beta }-g^{\alpha \mu }g^{\beta \nu }\right) \\&\quad +\frac{1}{\sqrt{-g}}\left[ {\mathcal {C}}_1(\Phi )\nabla _\mu \left( \sqrt{-g}A^\mu _1(g,\Gamma )\right) \right. \\&\left. \quad +{\mathcal {C}}_2(\Phi )\nabla _\mu \left( \sqrt{-g}A^\mu _2(g,\Gamma )\right) \right] -{\mathcal {V}}'(\Phi )=2\alpha '(\Phi )T, \end{aligned} \end{aligned}$$
(10)

where \(T_{\mu \nu }=-\frac{2}{\sqrt{-g}}\frac{\delta (\sqrt{-g}{\mathcal {L}}_{\text {matter}})}{\delta g^{\mu \nu }}\), \({\mathcal {L}}\) is simply the gravitational part of Lagrangian; furthermore, all primes denote differentiation with respect to the scalar field \(\Phi \).

An analysis of the equations written above will not be particularly illuminating unless one inspects the equation resulting from varying with respect to the affine connection. As it turns out, it is always possible to find a frame in which the independent connection is the Levi-Civita connection of the metric tensor \(g_{\mu \nu }\). One transforms the connection using Eq. (5b), with \({\check{\gamma }}_2\) and \({\check{\gamma }}_3\) specified by the field equations. Denoting the Levi-Civita connection of the metric tensor \(g_{\mu \nu }\) by \(\Big \{\genfrac{}{}{0.0pt}{}{\alpha }{\mu \nu }\Big \}_g\), we find out that it is related to the initial independent affine connection in the following way:

$$\begin{aligned} \begin{aligned} \Gamma ^\alpha _{\mu \nu }=\Big \{\genfrac{}{}{0.0pt}{}{\alpha }{\mu \nu }\Big \}_g+ {\mathcal {F}}_1(\Phi )\delta ^\alpha _{(\mu }\partial _{\nu )}\Phi -{\mathcal {F}}_2(\Phi ) g_{\mu \nu }g^{\alpha \beta }\partial _\beta \Phi , \end{aligned} \end{aligned}$$
(11)

where the functions \({\mathcal {F}}_1, {\mathcal {F}}_2\) of the scalar field \(\Phi \) take the form:

$$\begin{aligned} {\mathcal {F}}_1(\Phi )=\frac{2{\mathcal {C}}_1(\Phi )+(n-3){\mathcal {C}}_2(\Phi )+(n-1){\mathcal {A}}'(\Phi )}{{\mathcal {A}}(\Phi )(n-1)(n-2)} \end{aligned}$$

and

$$\begin{aligned} {\mathcal {F}}_2(\Phi )=\frac{2{\mathcal {C}}_1(\Phi )-{\mathcal {C}}_2(\Phi )+{\mathcal {A}}'(\Phi )}{{\mathcal {A}}(\Phi )(n-2)}. \end{aligned}$$

This result simply means that one can always choose a frame in which the theory is effectively metric, with vanishing vectors \(A^\mu _1,\,A^\mu _2\). More generally, if \({\mathcal {C}}_1={\mathcal {C}}_2\equiv {\mathcal {C}}\), then one has \({\mathcal {F}}_1={\mathcal {F}}_2\equiv {\mathcal {F}}=\frac{{\mathcal {C}}(\Phi )+{\mathcal {A}}'(\Phi )}{{\mathcal {A}}(\Phi )(n-2)}\) and the metric providing the connection has the form \(\exp \left( \int {\mathcal {F}}(\Phi )d\Phi \,\right) g_{\mu \nu }\). This gives a link to the so-called C-theories of gravity studied recently in [61,62,63].
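
The coincidence of \({\mathcal {F}}_1\) and \({\mathcal {F}}_2\) for \({\mathcal {C}}_1={\mathcal {C}}_2\) can be verified by elementary algebra (a sketch using only the two expressions given above, with the argument \(\Phi \) suppressed):

$$\begin{aligned} {\mathcal {F}}_1-{\mathcal {F}}_2&=\frac{2{\mathcal {C}}_1+(n-3){\mathcal {C}}_2+(n-1){\mathcal {A}}'-(n-1)\left( 2{\mathcal {C}}_1-{\mathcal {C}}_2+{\mathcal {A}}'\right) }{{\mathcal {A}}(n-1)(n-2)}\\&=\frac{2({\mathcal {C}}_2-{\mathcal {C}}_1)}{(n-1){\mathcal {A}}}. \end{aligned}$$

Hence \({\mathcal {F}}_1={\mathcal {F}}_2\) if and only if \({\mathcal {C}}_1={\mathcal {C}}_2\), and in that case both functions reduce to \(\frac{{\mathcal {C}}+{\mathcal {A}}'}{{\mathcal {A}}(n-2)}\).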

Since the connection can be always solved in terms of the metric and the scalar field, there are no additional physical degrees of freedom carried by it. The connection always turns out to be an auxiliary field [64].

The relation (11) is defined by two functions, which in general (except the case mentioned above) are not equal. One can identify them as the functions \({\check{\gamma }}_2\) and \({\check{\gamma }}_3\) relating affine connections of two different frames. The frame in which the theory turns out to be fully metric can be obtained by plugging the connection (11) back into the action functional (3). Such a change of frame should not affect the form of the action functional (otherwise solutions of the equations of motion in one frame would not be mapped to solutions in another frame, which would contradict one of our basic assumptions), and the coefficients \(\{{\mathcal {A}},{\mathcal {B}},{\mathcal {C}}_1,{\mathcal {C}}_2,{\mathcal {V}},\alpha \}\) will change in a way that preserves the functional form of the action. Exact transformation relations will be presented in the next section.

Because the transformation (5b) depends on two independent parameters, one cannot in general end up in a frame in which the initial independent connection is Levi-Civita with respect to some metric tensor, as the transformation of the metric is governed by a single function \({\check{\gamma }}_1\). However, if \({\mathcal {C}}_1={\mathcal {C}}_2\), then it is possible to transform the metric tensor in such a way that the initial independent connection becomes a Levi-Civita connection of the transformed, new metric.

3 Transformation formulae

Redefinition of the transformations leads to a modification of conformal mapping formulae for all quantities built from the connection, i.e. Riemann tensor and its contractions. This is an obvious consequence of decoupling metric tensor from the connection. In the metric approach, transformation of the Riemann tensor is fully determined by the way the metric transforms; here, one must take into account the fact that the transformation is governed by the functions \({\check{\gamma }}_2\) and \({\check{\gamma }}_3\). Additionally, covariant derivative of the metric does not vanish in general, and this fact plays an important role in the process of deriving transformation relations. If the calculations are performed in n dimensions, requiring the transformations be defined by Eqs. (5a)–(5c), the formulae relating Riemann tensors of two different conformal frames are the following:

$$\begin{aligned} \begin{aligned} R^\alpha _{\mu \beta \nu }&={\bar{R}}^\alpha _{\mu \beta \nu }+\delta ^\alpha _\nu {\bar{\nabla }}_\beta {\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }})-\delta ^\alpha _\beta {\bar{\nabla }}_\nu {\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }})\\&\quad -\delta ^\alpha _\nu {\bar{\nabla }}_\beta {\check{\gamma }}_2({\bar{\Phi }}){\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }})+\delta ^\alpha _\beta {\bar{\nabla }}_\nu {\check{\gamma }}_2({\bar{\Phi }}){\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }})\\&\quad +{\bar{g}}_{\mu \beta }{\bar{g}}^{\alpha \lambda }{\bar{\nabla }}_\nu {\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }})-{\bar{g}}_{\mu \nu }{\bar{g}}^{\alpha \lambda }{\bar{\nabla }}_\beta {\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }})\\&\quad +\delta ^\alpha _\nu {\bar{g}}_{\mu \beta }{\bar{g}}^{\sigma \lambda }{\bar{\nabla }}_\sigma {\check{\gamma }}_3({\bar{\Phi }}){\bar{\nabla }}_\lambda {\check{\gamma }}_2({\bar{\Phi }})\\&\quad -\delta ^\alpha _\beta {\bar{g}}_{\mu \nu }{\bar{g}}^{\sigma \lambda }{\bar{\nabla }}_\sigma {\check{\gamma }}_3({\bar{\Phi }}){\bar{\nabla }}_\lambda {\check{\gamma }}_2({\bar{\Phi }}) \\&\quad + {\bar{g}}^{\alpha \lambda }{\bar{g}}_{\mu \nu }{\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }}){\bar{\nabla }}_\beta {\check{\gamma }}_3({\bar{\Phi }})\\&\quad -{\bar{g}}^{\alpha \lambda }{\bar{g}}_{\mu \beta }{\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }}){\bar{\nabla }}_\nu {\check{\gamma }}_3({\bar{\Phi }}) \\&\quad + {\bar{g}}^{\alpha \lambda }{\bar{\nabla }}_\nu {\bar{g}}_{\mu \beta }{\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }})-{\bar{g}}^{\alpha \lambda }{\bar{\nabla }}_\beta {\bar{g}}_{\mu \nu }{\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }})\\&\quad +{\bar{g}}_{\mu \beta }{\bar{\nabla }}_\nu {\bar{g}}^{\alpha \lambda }{\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }})-{\bar{g}}_{\mu \nu }{\bar{\nabla }}_\beta {\bar{g}}^{\alpha \lambda }{\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }}). \end{aligned} \end{aligned}$$
(12)

The formula for the (symmetrized) Ricci curvature tensor reads as follows:

$$\begin{aligned} \begin{aligned} R_{(\mu \nu )}&={\bar{R}}_{(\mu \nu )}-(n-1){\bar{\nabla }}_{\mu }{\bar{\nabla }}_\nu {\check{\gamma }}_2({\bar{\Phi }})+{\bar{\nabla }}_{\mu }{\bar{\nabla }}_\nu {\check{\gamma }}_3({\bar{\Phi }})\\&\quad +(n-1){\bar{\nabla }}_\nu {\check{\gamma }}_2({\bar{\Phi }}){\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }})\\&\quad -{\bar{\nabla }}_\nu {\check{\gamma }}_3({\bar{\Phi }}){\bar{\nabla }}_\mu {\check{\gamma }}_3({\bar{\Phi }})-{\bar{g}}_{\mu \nu }{\bar{g}}^{\alpha \beta }{\bar{\nabla }}_{\alpha }{\bar{\nabla }}_\beta {\check{\gamma }}_3({\bar{\Phi }})\\&\quad -(n-1){\bar{g}}_{\mu \nu }{\bar{g}}^{\alpha \beta }{\bar{\nabla }}_\alpha {\check{\gamma }}_3({\bar{\Phi }}){\bar{\nabla }}_\beta {\check{\gamma }}_2({\bar{\Phi }})\\&\quad +{\bar{g}}_{\mu \nu }{\bar{g}}^{\alpha \beta }{\bar{\nabla }}_\alpha {\check{\gamma }}_3({\bar{\Phi }}){\bar{\nabla }}_\beta {\check{\gamma }}_3({\bar{\Phi }}) \\&\quad +\Big [{\bar{g}}_{\mu \nu }{\bar{g}}^{\alpha \beta }{\bar{g}}^{\sigma \lambda }{\bar{\nabla }}_\alpha {\bar{g}}_{\beta \sigma }-{\bar{g}}^{\alpha \lambda }{\bar{\nabla }}_\alpha {\bar{g}}_{\mu \nu }\Big ]{\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }}). \end{aligned} \end{aligned}$$
(13)

Finally, contracting the previous formula with the metric tensor, we get an expression for the Palatini–Ricci scalar:

$$\begin{aligned} R= & {} e^{-2{\check{\gamma }}_1({\bar{\Phi }})}\Big [{\bar{R}}-(n-1){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\bar{\nabla }}_\nu \left( {\check{\gamma }}_2({\bar{\Phi }})+{\check{\gamma }}_3({\bar{\Phi }})\right) \nonumber \\&+{\bar{g}}^{\mu \nu }{\bar{g}}^{\lambda \sigma }\Big (n{\bar{\nabla }}_\mu {\bar{g}}_{\nu \sigma }-{\bar{\nabla }}_\sigma {\bar{g}}_{\nu \mu }\Big ){\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }}) \nonumber \\&+(n-1){\bar{g}}^{\mu \nu }\left( {\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }}){\bar{\nabla }}_\nu {\check{\gamma }}_2({\bar{\Phi }}) \right. \nonumber \\&\left. - n{\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }}){\bar{\nabla }}_\nu {\check{\gamma }}_3({\bar{\Phi }})+{\bar{\nabla }}_\mu {\check{\gamma }}_3({\bar{\Phi }}){\bar{\nabla }}_\nu {\check{\gamma }}_3({\bar{\Phi }})\right) \Big ]. \end{aligned}$$
(14)
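
The contraction can be followed term by term (a bookkeeping sketch using only \(g^{\mu \nu }=e^{-2{\check{\gamma }}_1({\bar{\Phi }})}{\bar{g}}^{\mu \nu }\) and \({\bar{g}}^{\mu \nu }{\bar{g}}_{\mu \nu }=n\); the shorthand \({\bar{\Box }}\equiv {\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\bar{\nabla }}_\nu \) and the suppressed argument \({\bar{\Phi }}\) are notational assumptions of this sketch):

$$\begin{aligned} \text {second derivatives:}\quad&-(n-1){\bar{\Box }}{\check{\gamma }}_2+{\bar{\Box }}{\check{\gamma }}_3-n{\bar{\Box }}{\check{\gamma }}_3=-(n-1){\bar{\Box }}\left( {\check{\gamma }}_2+{\check{\gamma }}_3\right) ,\\ \text {mixed gradients:}\quad&-n(n-1){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\check{\gamma }}_2{\bar{\nabla }}_\nu {\check{\gamma }}_3,\\ \text {squared gradients of }{\check{\gamma }}_3\text {:}\quad&(-1+n){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\check{\gamma }}_3{\bar{\nabla }}_\nu {\check{\gamma }}_3,\\ \text {non-metricity:}\quad&\Big (n{\bar{g}}^{\alpha \beta }{\bar{g}}^{\sigma \lambda }{\bar{\nabla }}_\alpha {\bar{g}}_{\beta \sigma }-{\bar{g}}^{\mu \nu }{\bar{g}}^{\alpha \lambda }{\bar{\nabla }}_\alpha {\bar{g}}_{\mu \nu }\Big ){\bar{\nabla }}_\lambda {\check{\gamma }}_3. \end{aligned}$$

Together with the unchanged term \((n-1){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\check{\gamma }}_2{\bar{\nabla }}_\nu {\check{\gamma }}_2\), these reproduce the right-hand side of (14) after relabelling the dummy indices.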

In the Weyl case \(\gamma _3=\gamma _2+\text {const}\) one gets

$$\begin{aligned} \begin{aligned} R&=e^{-2{\check{\gamma }}_1({\bar{\Phi }})}\Big [{\bar{R}}-2(n-1){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\bar{\nabla }}_\nu {\check{\gamma }}_2({\bar{\Phi }})\\&\quad +{\bar{g}}^{\mu \nu }{\bar{g}}^{\lambda \sigma }\Big (n{\bar{\nabla }}_\mu {\bar{g}}_{\nu \sigma }-{\bar{\nabla }}_\sigma {\bar{g}}_{\nu \mu }\Big ){\bar{\nabla }}_\lambda {\check{\gamma }}_2({\bar{\Phi }}) \\&\quad -(n-1)(n-2){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }}){\bar{\nabla }}_\nu {\check{\gamma }}_2({\bar{\Phi }})\Big ]. \end{aligned} \end{aligned}$$
(15)

When \(\gamma _2+\gamma _3=\text {const}\) the expression (14) reduces instead to

$$\begin{aligned} R= & {} e^{-2{\check{\gamma }}_1({\bar{\Phi }})}\Big [{\bar{R}}+{\bar{g}}^{\mu \nu }{\bar{g}}^{\lambda \sigma }\Big (n{\bar{\nabla }}_\mu {\bar{g}}_{\nu \sigma }-{\bar{\nabla }}_\sigma {\bar{g}}_{\nu \mu }\Big ){\bar{\nabla }}_\lambda {\check{\gamma }}_3({\bar{\Phi }}) \nonumber \\&+(n-1)(n+2){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\check{\gamma }}_2({\bar{\Phi }}){\bar{\nabla }}_\nu {\check{\gamma }}_2({\bar{\Phi }})\Big ]. \end{aligned}$$
(16)
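
The numerical coefficients in both special cases follow from (14) by direct substitution (a sketch restricted to the second-derivative and quadratic terms; the abbreviation \(X\equiv {\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\check{\gamma }}_2{\bar{\nabla }}_\nu {\check{\gamma }}_2\) is introduced here only for brevity):

$$\begin{aligned} {\bar{\nabla }}_\mu {\check{\gamma }}_3={\bar{\nabla }}_\mu {\check{\gamma }}_2:\quad&(n-1)\left( 1-n+1\right) X=-(n-1)(n-2)X,\\ {\bar{\nabla }}_\mu {\check{\gamma }}_3=-{\bar{\nabla }}_\mu {\check{\gamma }}_2:\quad&(n-1)\left( 1+n+1\right) X=(n-1)(n+2)X. \end{aligned}$$

In the first case the second-derivative term becomes \(-2(n-1){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\bar{\nabla }}_\nu {\check{\gamma }}_2\), while in the second case it vanishes identically. The non-metricity term of (14) is linear in \({\bar{\nabla }}_\lambda {\check{\gamma }}_3\) and is carried over by the same substitution.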

Since the functions \({\check{\gamma }}_2\) and \({\check{\gamma }}_3\) do not depend on the spacetime position explicitly, derivatives of these quantities can be cast in the following form:

$$\begin{aligned} {\bar{\nabla }}_\mu {\check{\gamma }}_i({\bar{\Phi }})=\frac{d {\check{\gamma }}_i({\bar{\Phi }})}{d{\bar{\Phi }}}{\bar{\nabla }}_\mu {\bar{\Phi }}\equiv {\check{\gamma }}'_i{\bar{\nabla }}_\mu {\bar{\Phi }}, \end{aligned}$$

where \(i=2,3\).
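
The second derivatives appearing in (12)–(14) follow by applying the same rule once more (a sketch using only the Leibniz rule; double primes denote second derivatives with respect to \({\bar{\Phi }}\)):

$$\begin{aligned} {\bar{\nabla }}_\mu {\bar{\nabla }}_\nu {\check{\gamma }}_i({\bar{\Phi }})={\check{\gamma }}''_i\,{\bar{\nabla }}_\mu {\bar{\Phi }}\,{\bar{\nabla }}_\nu {\bar{\Phi }}+{\check{\gamma }}'_i\,{\bar{\nabla }}_\mu {\bar{\nabla }}_\nu {\bar{\Phi }}. \end{aligned}$$

Every term of the transformed curvature is therefore either quadratic in the gradient of \({\bar{\Phi }}\) or linear in its second covariant derivative.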

Conformal transformation and almost-geodesic mapping, accompanied by a re-definition of the scalar field, applied to the three independent variables should map solutions of the equations of motion in one frame to corresponding solutions in another frame. For this to be true, the way the functions \(\{{\mathcal {A}},\ldots ,\alpha \}\) transform must be governed by equations analogous to (A.6), as the action functional needs to preserve its form. The condition of form-invariance of the action leads to the following transformation equations for the six scalar-field functions:

$$\begin{aligned} \bar{{\mathcal {A}}}({\bar{\Phi }})&=e^{(n-2){\check{\gamma }}_1({\bar{\Phi }})}{\mathcal {A}}(\check{f}({\bar{\Phi }})), \end{aligned}$$
(17a)
$$\begin{aligned} \bar{{\mathcal {B}}}({\bar{\Phi }})&=e^{(n-2){\check{\gamma }}_1({\bar{\Phi }})}\Big [{\mathcal {B}}(\check{f}({\bar{\Phi }}))(\check{f}'({\bar{\Phi }}))^2\nonumber \\&\quad +(n-1) \Big (n{\mathcal {A}}(\check{f}({\bar{\Phi }})){\check{\gamma }}'_2({\bar{\Phi }}){\check{\gamma }}'_3({\bar{\Phi }})\nonumber \\&\quad -{\mathcal {A}}(\check{f}({\bar{\Phi }}))\left( {\check{\gamma }}'_2({\bar{\Phi }})\right) ^2\nonumber \\&\quad -{\mathcal {A}}(\check{f}({\bar{\Phi }}))\left( {\check{\gamma }}'_3({\bar{\Phi }})\right) ^2\nonumber \\&\quad -\frac{d{\mathcal {A}}(\check{f}({\bar{\Phi }}))}{d{\bar{\Phi }}}({\check{\gamma }}'_2({\bar{\Phi }})+{\check{\gamma }}'_3({\bar{\Phi }}))\nonumber \\&\quad -(n-2){\mathcal {A}}(\check{f}({\bar{\Phi }})){\check{\gamma }}'_1({\bar{\Phi }})({\check{\gamma }}'_2({\bar{\Phi }})+{\check{\gamma }}'_3({\bar{\Phi }}))\Big )\nonumber \\&\quad +\check{f}'({\bar{\Phi }})\Big ({\mathcal {C}}_1(\check{f}({\bar{\Phi }}))(2 n{\check{\gamma }}'_1({\bar{\Phi }})\nonumber \\&\quad -2(n+1){\check{\gamma }}'_2({\bar{\Phi }})+2{\check{\gamma }}'_3({\bar{\Phi }}))-{\mathcal {C}}_2(\check{f}({\bar{\Phi }}))(2{\check{\gamma }}'_1({\bar{\Phi }})\nonumber \\&\quad -(n+3){\check{\gamma }}'_2({\bar{\Phi }})+(n+1){\check{\gamma }}'_3({\bar{\Phi }}))\Big )\Big ], \end{aligned}$$
(17b)
$$\begin{aligned} \bar{{\mathcal {C}}}_1({\bar{\Phi }})&=e^{(n-2){\check{\gamma }}_1 ({\bar{\Phi }})}\left[ \check{f}'({\bar{\Phi }}){\mathcal {C}}_1 (\check{f}({\bar{\Phi }}))\right. \nonumber \\&\quad \left. -{\mathcal {A}}(\check{f}({\bar{\Phi }})) \left( \frac{n-1}{2}{\check{\gamma }}'_2({\bar{\Phi }})+ \frac{n-3}{2}{\check{\gamma }}'_3({\bar{\Phi }})\right) \right] , \end{aligned}$$
(17c)
$$\begin{aligned} \bar{{\mathcal {C}}}_2({\bar{\Phi }})&=e^{(n-2){\check{\gamma }}_1({\bar{\Phi }})}\Big [\check{f}'({\bar{\Phi }}){\mathcal {C}}_2(\check{f}({\bar{\Phi }})) \nonumber \\&\quad -{\mathcal {A}}(\check{f}({\bar{\Phi }})) \left( (n-1){\check{\gamma }}'_2({\bar{\Phi }})-{\check{\gamma }}'_3({\bar{\Phi }})\right) \Big ], \end{aligned}$$
(17d)
$$\begin{aligned} \bar{{\mathcal {V}}}({\bar{\Phi }})&=e^{n{\check{\gamma }}_1({\bar{\Phi }})}{\mathcal {V}}(\check{f}({\bar{\Phi }})), \end{aligned}$$
(17e)
$$\begin{aligned} {\bar{\alpha }}({\bar{\Phi }})&=\alpha (\check{f}({\bar{\Phi }}))+{\check{\gamma }}_1({\bar{\Phi }}). \end{aligned}$$
(17f)

These transformations are induced by the transformations (5a)–(5c) of the independent variables, which are invertible. This means that (17a)–(17f) allow us to transform solutions obtained in one frame into another; we have therefore split the theories given by the action (3) into solution-equivalent classes. The next task is to find a typical representative of each class. One choice, mentioned before, is the so-called Einstein frame; another is known as the Jordan frame.

As we can see, some of the transformation relations involve nothing but a simple multiplication of the “old” coefficients by a factor related to the transformation of the metric tensor. These relations do not depend on the approach we adopt: they retain the same form regardless of whether we work within the metric or the Palatini formalism. However, the coefficients \({\mathcal {C}}_1, {\mathcal {C}}_2\) and \({\mathcal {B}}\) transform in a more complicated way, depending on whether the theory is metric or not. The transformation relations preserve the sign of the \({\mathcal {A}}\) coefficient. Similarly, if \({\mathcal {B}}\) is subject to a scalar field re-parametrization only, then its sign does not change either. By the same token, if the potential \({\mathcal {V}}\) vanishes in one frame, it cannot emerge in any other.

Due to our freedom of choice of the three functions \(\{\gamma _1,\gamma _2,\gamma _3\}\) and the re-parametrization of the scalar field \(\Phi =\check{f}({\bar{\Phi }})\), it is always possible to fix four of the above six coefficients. We shall call such fixing “choosing a frame”, as mentioned before. If we also specify the remaining two functions, we choose a theory. For example, the four functions \(\{\gamma _1,\gamma _2,\gamma _3,f\}\) can be chosen in such a way that the four coefficients \(\{{\mathcal {B}},{\mathcal {C}}_1,{\mathcal {C}}_2,\alpha \}\) vanish, simplifying the calculations. Results obtained in a given frame can always be “translated” to another frame if the two frames can be related by a conformal transformation accompanied by a re-parametrization of the scalar field. It must also be noted that the increased number of functions used to change the frame (from two in scalar–tensor theory in the metric approach, see Appendix A, to four in the case of the Palatini formalism) results in additional coefficients appearing in the action functional. However, analogously to the metric case, although we are able to fix four of them, we are always left with two functions defining the particular theory.

Conformal and generalized almost-geodesic transformations establish a mathematical equivalence of two frames. On physical grounds, however, the frames may constitute two very different theories [65,66,67,68,69,70,71,72,73]. The multitude of equivalent theories poses the problem of identifying frames which can be related by the transformations given by Eqs. (5a)–(5c). Such frames may bear no resemblance to one another and yet be two different manifestations of the same theory, written using different variables. This situation suggests that it would be desirable to formulate the general scalar–tensor theory in a frame-independent way, fully analogous to the way GR circumvents the problem of deciding upon the “right” coordinate system to describe physical phenomena by resorting to the language of tensors, which allows one to write equations in a covariant manner. In the case of scalar–tensor gravity in the Palatini approach, we decided to follow [26] and find invariant quantities built from the coefficients \(\{{\mathcal {A}},\ldots ,\alpha \}\), the metric and the connection, whose values are independent of the choice of frame, just as, for instance, the value of \(R^\alpha _{\,\,\mu \beta \nu }R_{\alpha }^{\,\,\mu \beta \nu }\) does not depend on our choice of coordinate frame. This analogy, however, should not be taken too seriously: general covariance in GR is a consequence of the fact that our description of Nature should not depend on the artificial construct of a coordinate frame, whereas such invariance of physical laws is not present when changing conformal frames. For example, geodesic curves, due to the covariant formulation of the geodesic equations, are the same in every coordinate frame; on the other hand, if the mapping (5b) is applied, geodesics are not preserved (unless \(\gamma _3=0\)), which leads to the emergence of an unobserved “fifth force” causing particles to deviate from their standard trajectories; see e.g. [74] for an application to explaining galaxy rotation curves.

4 Invariant quantities and their applications

In order to check whether two frames can be conformally related, we may introduce the notion of invariants [26]. The invariants are quantities built from the functions \(\{{\mathcal {A}},{\mathcal {B}},{\mathcal {C}}_1,{\mathcal {C}}_2,{\mathcal {V}},\alpha \}\) in such a way that their functional dependence on these functions is the same in every frame. Also, their value at a given spacetime point remains unchanged. If the invariants calculated for one theory coincide with those computed for another one, we can always find a conformal transformation relating these two theories (this transformation, however, may not obey the group composition law, and the solutions to the equations in both frames may not be mathematically equivalent). The way the invariants are constructed follows from the transformation properties of the five arbitrary functions. Some of the functions are only multiplied by a factor, while the coefficients \({\mathcal {B}}\), \({\mathcal {C}}_1\) and \({\mathcal {C}}_2\) transform in a more sophisticated manner. Taking this into account, we can find the correct combinations of the functions, giving us quantities expressed in terms of the same coefficients irrespective of the frame we are in. Two exemplary invariants are given below (see footnote 4):

$$\begin{aligned} {\mathcal {I}}_1(\Phi )= & {} \frac{{\mathcal {A}}(\Phi )}{e^{(n-2)\alpha (\Phi )}}, \end{aligned}$$
(18)
$$\begin{aligned} {\mathcal {I}}_2(\Phi )= & {} \frac{{\mathcal {V}}(\Phi )}{({\mathcal {A}}(\Phi ))^{\frac{n}{n-2}}}. \end{aligned}$$
(19)
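
As a quick consistency check, the invariance of (18) and (19) can be verified directly. The sketch below uses (17e) and (17f) as written above and assumes that (17a), which is not reproduced in this part of the text, has the purely multiplicative form \(\bar{{\mathcal {A}}}({\bar{\Phi }})=e^{(n-2){\check{\gamma }}_1({\bar{\Phi }})}{\mathcal {A}}(\check{f}({\bar{\Phi }}))\):

$$\begin{aligned} \bar{{\mathcal {I}}}_1({\bar{\Phi }})&=\frac{\bar{{\mathcal {A}}}({\bar{\Phi }})}{e^{(n-2){\bar{\alpha }}({\bar{\Phi }})}} =\frac{e^{(n-2){\check{\gamma }}_1({\bar{\Phi }})}{\mathcal {A}}(\check{f}({\bar{\Phi }}))}{e^{(n-2){\check{\gamma }}_1({\bar{\Phi }})}e^{(n-2)\alpha (\check{f}({\bar{\Phi }}))}} ={\mathcal {I}}_1(\check{f}({\bar{\Phi }})),\\ \bar{{\mathcal {I}}}_2({\bar{\Phi }})&=\frac{\bar{{\mathcal {V}}}({\bar{\Phi }})}{(\bar{{\mathcal {A}}}({\bar{\Phi }}))^{\frac{n}{n-2}}} =\frac{e^{n{\check{\gamma }}_1({\bar{\Phi }})}{\mathcal {V}}(\check{f}({\bar{\Phi }}))}{e^{n{\check{\gamma }}_1({\bar{\Phi }})}({\mathcal {A}}(\check{f}({\bar{\Phi }})))^{\frac{n}{n-2}}} ={\mathcal {I}}_2(\check{f}({\bar{\Phi }})). \end{aligned}$$

In both cases the factor containing \({\check{\gamma }}_1\) cancels, so that only the re-parametrization \(\Phi =\check{f}({\bar{\Phi }})\) of the argument remains.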

In four dimensions, the invariant \({\mathcal {I}}_1\) characterizes the non-minimal coupling [75]. Apart from the case when \({\mathcal {A}}=e^{2\alpha }\), its constancy means that both \({\mathcal {A}}\) and \(e^{2\alpha }\) are constants, implying that in such a theory the scalar field is entirely decoupled from curvature and matter. The invariant \({\mathcal {I}}_2\) generalizes the notion of the self-interaction potential. Any function of the invariants is itself invariant. Moreover, spacetime derivatives of the invariants are invariant, as are derivatives with respect to other invariants (if we treat an invariant as a function of another invariant quantity) [26]. It is also possible to construct invariant metrics and connections. In the case of the metric there is no unique way of doing so, but in this paper only two possibilities will be considered:

$$\begin{aligned} {\hat{g}}_{\mu \nu }=({\mathcal {A}}(\Phi ))^{\frac{2}{n-2}}g_{\mu \nu }, \end{aligned}$$
(20)

or

$$\begin{aligned} {\tilde{g}}_{\mu \nu }=e^{2\alpha (\Phi )}g_{\mu \nu }. \end{aligned}$$
(21)

As for the affine connection, it is possible to choose the following:

$$\begin{aligned} {\hat{\Gamma }}^\alpha _{\mu \nu }=\Gamma ^\alpha _{\mu \nu }-2{\mathcal {P}}_1(\Phi )\delta ^\alpha _{(\mu }\partial _{\nu )}\Phi +g_{\mu \nu }g^{\alpha \beta }{\mathcal {P}}_2(\Phi )\partial _\beta \Phi \,, \end{aligned}$$
(22)

where:

$$\begin{aligned} {\mathcal {P}}_1(\Phi )=\frac{2 {\mathcal {C}}_1(\Phi ) + (n-3){\mathcal {C}}_2(\Phi )}{ {\mathcal {A}}(\Phi )(n-1)(n-2)} \end{aligned}$$

and

$$\begin{aligned} {\mathcal {P}}_2(\Phi )=\frac{-2 {\mathcal {C}}_1(\Phi ) +{\mathcal {C}}_2(\Phi )}{ {\mathcal {A}}(\Phi )(n-2 )}. \end{aligned}$$

From a purely algebraic point of view, invariance of the quantities given above means that, when changing the frame, the additional terms multiplying the metric or added to the connection transform in a way that balances out the multiplicative or additive terms containing the transformation-defining functions \({\check{\gamma }}_1\), \({\check{\gamma }}_2\) and \({\check{\gamma }}_3\). Their physical invariance is much more profound and can be a subject for various phenomenological speculations (see e.g. [76,77,78]). A conformal transformation of the metric tensor does not preserve the line element on a (pseudo-)Riemannian manifold, because a conformal change is not equivalent to a simple coordinate transformation. Thence, two observers using conformally-related metric tensors will agree only on the causal structure of space-time but will measure distances differently; the same can be said about the affine connections used to determine geodesic curves. Observers in different frames will, in general, disagree on whether a test particle moves along a geodesic, as the general almost-geodesic mapping (or the conformal transformation in the case of the purely metric approach) changes the geodesics (except for the null ones) of a given space-time. The introduction of invariant metric tensors and connections aims at resolving, at least partially, this ambiguity. If two observers in different frames agree on using the same invariant quantity to describe geometry, the measurements they make shall give exactly the same outcome. In the case of the invariant metric, all distances will be the same, while the invariant connection guarantees invariance of geodesic curves. There is, however, more than one invariant metric (and in fact there are also multiple invariant connections, but in this paper we introduce only one), so that no unique way of choosing invariant objects to describe the geometry of space-time exists.

4.1 Integral invariants

Let us define the following quantity (see footnote 5):

$$\begin{aligned} \begin{aligned} {\mathcal {I}}^n_E(\Phi )&=\int \Bigg (\pm \frac{(n-2){\mathcal {A}}(\Phi ) {\mathcal {B}}(\Phi ) + 2{\mathcal {A}}'(\Phi )[ {\mathcal {C}}_2(\Phi ) - n{\mathcal {C}}_1(\Phi )] }{ (n-2){\mathcal {A}}(\Phi )^2} \\&\quad \pm \frac{ (n^2-5) {\mathcal {C}}_2(\Phi )^2 - 4 {\mathcal {C}}_1(\Phi )^2+2(4 + n - n^2){\mathcal {C}}_1(\Phi ){\mathcal {C}}_2(\Phi )}{(n-2) (n-1) {\mathcal {A}}(\Phi )^2}\Bigg )^{\frac{1}{2}}d\Phi . \end{aligned} \end{aligned}$$
(23)

Such a quantity is a genuine invariant under an arbitrary transformation \(\{f, \gamma _1,\gamma _2,\gamma _3\} \in \mathtt {Diff}^{(3)}({\mathbb {R}})\).

In four dimensions, the quantity \({\mathcal {I}}_E\) (see footnote 6) can be written as:

$$\begin{aligned} {\mathcal {I}}_E(\Phi )= \int \Bigg (\pm \frac{{\mathcal {A}}(\Phi ) {\mathcal {B}}(\Phi ) -\frac{2}{3}{\mathcal {C}}_1( \Phi )^2-\frac{8}{3}{\mathcal {C}}_1(\Phi ){\mathcal {C}}_2(\Phi )+\frac{11}{6}{\mathcal {C}}_2( \Phi )^2 -4 {\mathcal {C}}_1(\Phi ) {\mathcal {A}}'(\Phi )}{{\mathcal {A}}(\Phi )^2} \pm \frac{{\mathcal {C}}_2(\Phi ) {\mathcal {A}}'(\Phi )}{{\mathcal {A}}(\Phi )^2}\Bigg )^\frac{1}{2}d\Phi . \end{aligned}$$
(24)
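
The reduction from (23) to (24) can be checked by setting \(n=4\) in the two fractions under the square root. In the sketch below the argument \(\Phi \) is suppressed for brevity:

$$\begin{aligned} \left. \frac{(n-2){\mathcal {A}}{\mathcal {B}}+2{\mathcal {A}}'[{\mathcal {C}}_2-n{\mathcal {C}}_1]}{(n-2){\mathcal {A}}^2}\right| _{n=4}&=\frac{{\mathcal {A}}{\mathcal {B}}+{\mathcal {C}}_2{\mathcal {A}}'-4{\mathcal {C}}_1{\mathcal {A}}'}{{\mathcal {A}}^2},\\ \left. \frac{(n^2-5){\mathcal {C}}_2^2-4{\mathcal {C}}_1^2+2(4+n-n^2){\mathcal {C}}_1{\mathcal {C}}_2}{(n-2)(n-1){\mathcal {A}}^2}\right| _{n=4}&=\frac{\frac{11}{6}{\mathcal {C}}_2^2-\frac{2}{3}{\mathcal {C}}_1^2-\frac{8}{3}{\mathcal {C}}_1{\mathcal {C}}_2}{{\mathcal {A}}^2}. \end{aligned}$$

Taking the same sign in both \(\pm \) terms, the sum of the two lines reproduces the radicand of (24); the only difference is the way the terms are grouped.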

It will be shown later on that in the Einstein-like frame it plays the role of the scalar field.

It can be noticed that one factor of \({\mathcal {A}}(\Phi )\) in the denominator of (23) can be replaced by \(e^{(n-2)\alpha (\Phi )}\) without changing the transformation properties of the integrand. We then arrive at an invariant closely related to \({\mathcal {I}}^n_E\), whose importance shall be revealed while investigating different frame parametrizations of the S–T theories:

$$\begin{aligned} \begin{aligned} {\mathcal {I}}^n_J(\Phi )&=\int e^{-\frac{n-2}{2}\alpha (\Phi )}\Bigg (\pm \frac{(n-2){\mathcal {A}}(\Phi ) {\mathcal {B}}(\Phi ) + 2{\mathcal {A}}'(\Phi )[ {\mathcal {C}}_2(\Phi ) - n{\mathcal {C}}_1(\Phi )] }{ (n-2){\mathcal {A}}(\Phi )} \\&\quad \pm \frac{ (n^2-5) {\mathcal {C}}_2(\Phi )^2 - 4 {\mathcal {C}}_1(\Phi )^2+2(4 + n - n^2){\mathcal {C}}_1(\Phi ){\mathcal {C}}_2(\Phi )}{(n-2) (n-1) {\mathcal {A}}(\Phi )}\Bigg )^{\frac{1}{2}}d\Phi . \end{aligned} \end{aligned}$$
(25)

This invariant was given the subscript “J” to indicate that it arises naturally in the Jordan frame. It is obvious that if \({\mathcal {I}}^n_E\) vanishes, so does \({\mathcal {I}}^n_J\).

5 Einstein and Jordan frames, and their invariant generalizations

So far, we have been using the terms “Jordan frame” and “Einstein frame” without defining them in an unambiguous way. As is widely known, the notion of a (conformal) frame has been applied to the analysis of S–T theories primarily in the metric approach. It is straightforward to extend the concepts of Einstein and Jordan frames to the Palatini theory as well. We define the former in the following way:

Definition 5.1

The Einstein frame in the Palatini theory is characterized by specific values of four out of six arbitrary functions \(\{{\mathcal {A}},\ldots ,\alpha \}\): \({\mathcal {A}}=1,\,{\mathcal {B}}=\epsilon _\text {Palatini},\,{\mathcal {C}}_1={\mathcal {C}}_2=0.\)

The action functional is given by:

$$\begin{aligned} S[g_{\mu \nu }^E,(\Gamma ^E)^\alpha _{\mu \nu },\Phi ]= & {} \frac{1}{2\kappa ^2}\int _{\Omega }d^nx \sqrt{-g^E}\Big (R(g^E,\Gamma ^E)\\&-\,\epsilon _\text {Palatini} (g^E)^{\mu \nu }\nabla _\mu \Phi \nabla _\nu \Phi -{\mathcal {V}}(\Phi )\Big )\\&+\,S_{\text {matter}}\left[ e^{2\alpha (\Phi )}g_{\mu \nu }^E,\chi \right] , \end{aligned}$$

where \(\epsilon _\text {Palatini}\) is a parameter taking one of the three values \(+1\), \(0\) or \(-1\).

It follows from the very definition that there are three types of Einstein frames, depending on the value of the parameter \(\epsilon _\text {Palatini}\), which cannot be transformed into each other by a diffeomorphism (see footnote 7). In the simplest case \(\gamma _1=\gamma _2=\gamma _3=0\), its value can be identified with the sign of \({{{\mathcal {B}}}}\), i.e. \(\epsilon _\text {Palatini}=\text {sign}({{{\mathcal {B}}}})\). In fact, Einstein frames can be labelled by a triple \((\epsilon _\text {Palatini}, {{{\mathcal {V}}}},\alpha )\). They include the original Einstein–Hilbert–Palatini action as a particular case: \(\epsilon _\text {Palatini}={{{\mathcal {V}}}}=\alpha =0\). One should notice that the frames with \(\epsilon _\text {Palatini}=0\) are singular in the following sense: a scalar field re-definition by an arbitrary diffeomorphism \(f\in \mathtt {Diff}({\mathbb {R}})\) transforms one Einstein frame into another (within the same orbit) without changing the value \(\epsilon _\text {Palatini}=0\). This is not the case for \(\epsilon _\text {Palatini}=\pm 1\): such frames are not preserved under diffeomorphisms. In the Einstein frame, the choice \(\epsilon _\text {Palatini}=+1\) suggests that the scalar field has positive energy, whereas for \(\epsilon _\text {Palatini}=-1\) the theory features a ghost (see footnote 8) [22].

Because the transformations (5a) and (5b) act in a self-consistent way, any theory has a mathematically equivalent Einstein frame representation. Therefore, all possible scalar–tensor theories in the Palatini approach can be also labelled by the triple \((\epsilon _\text {Palatini}, {\mathcal {V}}, \alpha )\) in the Einstein frame.

More generally, one can show (cf. (29b)) that the theory written in the Einstein frame becomes effectively metric.

For completeness, let us also write the invariants we have introduced so far for the Einstein frame:

$$\begin{aligned}&{\mathcal {I}}^n_1(\Phi )=e^{-(n-2)\alpha (\Phi )}, \end{aligned}$$
(26a)
$$\begin{aligned}&{\mathcal {I}}^n_2(\Phi )={\mathcal {V}}(\Phi ), \end{aligned}$$
(26b)
$$\begin{aligned}&{\mathcal {I}}^n_E(\Phi )=\sqrt{\pm \epsilon _\text {Palatini}}(\Phi -\Phi _0). \end{aligned}$$
(26c)

As one can see, the quantity \({\mathcal {I}}^n_E\) plays the role of the scalar field in the Einstein frame.
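
The result (26c) follows from a one-line substitution, sketched here under the assumption that the sign under the root is chosen so that the radicand is non-negative. Inserting the Einstein-frame values of Definition 5.1 into (23), all terms containing \({\mathcal {C}}_1\), \({\mathcal {C}}_2\) and \({\mathcal {A}}'\) drop out:

$$\begin{aligned} {\mathcal {I}}^n_E(\Phi )=\int \left( \pm \frac{(n-2)\cdot 1\cdot \epsilon _\text {Palatini}}{(n-2)\cdot 1^2}\right) ^{\frac{1}{2}}d\Phi =\sqrt{\pm \epsilon _\text {Palatini}}\int d\Phi =\sqrt{\pm \epsilon _\text {Palatini}}\,(\Phi -\Phi _0), \end{aligned}$$

where \(\Phi _0\) is the integration constant. For \(\epsilon _\text {Palatini}=0\) the integrand vanishes identically.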

In order to understand better how the invariants can be used to find out whether a given theory is equivalent to some other theory written in the Einstein frame via transformations (5a)–(5c), let us consider the following example: an S–T theory is described by the action functional:

$$\begin{aligned} S[{\bar{g}}_{\mu \nu },{\bar{\Gamma }}^\alpha _{\mu \nu },{\bar{\Phi }}]= & {} \frac{1}{2\kappa ^2}\int _{\Omega }d^nx\sqrt{-{\bar{g}}}\Big [\bar{{\mathcal {A}}}({\bar{\Phi }})R({\bar{g}},{\bar{\Gamma }})\nonumber \\&-\bar{{\mathcal {B}}}({\bar{\Phi }}){\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\bar{\Phi }}{\bar{\nabla }}_\nu {\bar{\Phi }}\nonumber \\&-{\bar{A}}_1^\mu ({\bar{g}},{\bar{\Gamma }})\bar{{\mathcal {C}}}_1({\bar{\Phi }}){\bar{\nabla }}_\mu {\bar{\Phi }}\nonumber \\&-{\bar{A}}_2^\mu ({\bar{g}},{\bar{\Gamma }})\bar{{\mathcal {C}}}_2({\bar{\Phi }}){\bar{\nabla }}_\mu {\bar{\Phi }}-\bar{{\mathcal {V}}}({\bar{\Phi }})\Big ]\nonumber \\&+S_{\text {matter}}[e^{2{\bar{\alpha }}({\bar{\Phi }})}{\bar{g}}_{\mu \nu },\chi ]. \end{aligned}$$
(27)

Such a theory always possesses an Einstein frame representation. Comparing the quantities \({\mathcal {I}}^n_1\) and \({\mathcal {I}}^n_2\) in both frames yields the exact form of the functions \({\mathcal {V}}\) and \(\alpha \) in the transformed frame:

$$\begin{aligned} \alpha (\Phi )= & {} {\bar{\alpha }}({\bar{\Phi }}(\Phi ))-\frac{1}{n-2}\ln \bar{{\mathcal {A}}}({\bar{\Phi }}(\Phi )),\\ {\mathcal {V}}(\Phi )= & {} \frac{\bar{{\mathcal {V}}}({\bar{\Phi }}(\Phi ))}{\big (\bar{{\mathcal {A}}}({\bar{\Phi }}(\Phi ))\big )^{\frac{n}{n-2}}}, \end{aligned}$$

where \(\Phi \) is the scalar field in the new frame; it becomes a function of the “old” scalar field \({\bar{\Phi }}\).
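
The relation for \(\alpha \) can be obtained by equating the invariant (18), evaluated in the frame (27), with its Einstein-frame value (26a). A short sketch, assuming \(\bar{{\mathcal {A}}}>0\) so that the logarithm is defined, and with all barred quantities evaluated at \({\bar{\Phi }}(\Phi )\):

$$\begin{aligned} e^{-(n-2)\alpha (\Phi )}&=\frac{\bar{{\mathcal {A}}}}{e^{(n-2){\bar{\alpha }}}}\quad \Longrightarrow \quad -(n-2)\alpha (\Phi )=\ln \bar{{\mathcal {A}}}-(n-2){\bar{\alpha }}\\&\Longrightarrow \quad \alpha (\Phi )={\bar{\alpha }}-\frac{1}{n-2}\ln \bar{{\mathcal {A}}}. \end{aligned}$$

The relation for \({\mathcal {V}}\) requires no further manipulation: it is the invariant (19) computed in the frame (27), set equal to its Einstein-frame value (26b).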

The Jordan frame is defined as follows:

Definition 5.2

The Jordan frame in the Palatini theory is characterized by specific values of four out of the six arbitrary functions \(\{{\mathcal {A}},\ldots ,\alpha \}\): \({\mathcal {A}}=\Psi ,\,{\mathcal {C}}_1={\mathcal {C}}_2=\alpha =0\).

The action functional is given by:

$$\begin{aligned} S[g_{\mu \nu }^J,(\Gamma ^J)^\alpha _{\mu \nu },\Psi ]= & {} \frac{1}{2\kappa ^2}\int _{\Omega }d^nx \sqrt{-g^J}\Big (\Psi R(g^J,\Gamma ^J)\\&-{\mathcal {B}}(\Psi )(g^J)^{\mu \nu }\nabla _\mu \Psi \nabla _\nu \Psi -{\mathcal {U}}(\Psi )\Big )\\&+S_{\text {matter}}\left[ g_{\mu \nu }^J,\chi \right] . \end{aligned}$$

Therefore, the Jordan frame can be described by two functions \(({{{\mathcal {B}}}}, {\mathcal {U}})\). In the Jordan frame, there is no coupling between the scalar field and matter; the field – or a function of it, but it can always be re-defined appropriately – is coupled directly to the curvature. We impose no conditions on the kinetic coupling \({\mathcal {B}}\) and the potential \({\mathcal {U}}\). It can be shown, varying the action expressed in the Jordan frame w.r.t. all dynamical variables, that the curvature scalar is in fact built from a metric conformally related to the initial one. Thence, the Jordan frame in the Palatini approach is in fact almost identical to its metric counterpart, except for a difference in the kinetic coupling. This difference is simply a Brans–Dicke term \(\frac{\omega }{\Psi }\), where \(\omega \) is a constant and depends on the number of dimensions. This term shall be given explicitly later on when considering the invariant generalizations of the Jordan frame.

We may now attempt to express the action (3) for S–T theories fully in terms of invariant quantities. Such an approach would be advantageous because any computations performed in an invariant – or generalized – frame will become independent of the variables we use. Unfortunately, there is no unique way of choosing an invariant frame, as one needs to choose between two invariant metric tensors that have been introduced. The existence of (at least) two non-equivalent invariant metric tensors forces us to analyze the theory in two distinct invariant frames. In each frame, we shall be using the invariant connection \({\hat{\Gamma }}\) given by (22). If we decide to use the variables \(({\hat{g}},{\hat{\Gamma }},{\mathcal {I}}^{n}_E)\) (assuming that the relation (23) between the invariant \({\mathcal {I}}^n_E\) and the scalar field \(\Phi \) is invertible; see [26]), the action functional (3) will take on the following Einstein frame form:

$$\begin{aligned}&S[{\hat{g}}_{\mu \nu },{\hat{\Gamma }}^\alpha _{\mu \nu },{\mathcal {I}}^n_E]=\frac{1}{2\kappa ^2}\int _{\Omega }d^nx\sqrt{-{\hat{g}}}\Big [R({\hat{g}},{\hat{\Gamma }})\nonumber \\&\quad -\,\epsilon _{\text {Palatini}}{\hat{g}}^{\mu \nu }{\hat{\nabla }}_\mu {\mathcal {I}}^n_E{\hat{\nabla }}_\nu {\mathcal {I}}^n_E-{\mathcal {I}}^n_2\Big ]\nonumber \\&\quad +\,S_{\text {matter}}\Big [({\mathcal {I}}^n_1)^{\frac{-2}{n-2}}{\hat{g}}_{\mu \nu },\chi \Big ], \end{aligned}$$
(28)

where \({\mathcal {I}}^n_1\) and \({\mathcal {I}}^n_2\) are functions of the invariant \({\mathcal {I}}^n_E\).

Let us notice that if the invariant \({\mathcal {I}}^{n}_E\) vanishes, the scalar field has no dynamics, as the kinetic term is not present in the Lagrangian. In this case, the invariant \({\mathcal {I}}^{n}_2\) can be thought of as a function of the invariant \({\mathcal {I}}^{n}_1\) (the case in which \({\mathcal {I}}^{n}_E=0\) and \({\mathcal {I}}^{n}_2=0\) will not be considered, as such a theory is ill-posed). Regardless of which invariant plays the role of the scalar field, at the level of the field equations the relation between the scalar field and the remaining fields will be purely algebraic, so that no additional physical degree of freedom will correspond to the extra scalar field included in the action. Since the transformation group always acts in a self-consistent way, this property must hold for all conformally related frames for which \({\mathcal {I}}^{n}_E=0\). This is the case when \(\epsilon _\text {Palatini}=0\) in the Einstein frame; thence all theories located on its orbit have no additional physical degree of freedom due to the presence of the scalar field. Moreover, at the level of the action functional, a given theory may look as if it featured a dynamical scalar field (e.g. when \({\mathcal {B}}\ne 0\), \({\mathcal {C}}_1\ne 0\) and \({\mathcal {C}}_2\ne 0\)), but this would be just an artifact of poorly chosen independent variables (metric and connection).

As can be seen, it is possible to find a shortcut from the complicated general action functional given by (3) to the surprisingly simple and familiar form written above without using the group transformation rules. In the new frame, the scalar field is coupled only to the matter part of the Lagrangian, which means that the Principle of Equivalence no longer holds. The gravitational part is now free of the terms \({\mathcal {C}}_1\) and \({\mathcal {C}}_2\), which were difficult to handle due to their coupling to the non-metricity tensors. Also, the kinetic coupling \({\mathcal {B}}\) is now equal to \(\epsilon _{\text {Palatini}}\), leading to a further simplification of the field equations.

Variation with respect to all dynamical variables (assuming non-vanishing invariant \({\mathcal {I}}^n_E\)) gives the following field equations:

$$\begin{aligned}&\delta {\hat{g}}: {\hat{G}}_{\mu \nu }=\kappa ^2{\hat{T}}_{\mu \nu }+ \epsilon _{\text {Palatini}}{\hat{\nabla }}_\alpha {\mathcal {I}}^n_E {\hat{\nabla }}_\beta {\mathcal {I}}^n_E \left( \delta ^\alpha _\mu \delta ^\beta _\nu - \frac{1}{2}{\hat{g}}^{\alpha \beta }{\hat{g}}_{\mu \nu }\right) \nonumber \\&\quad -\frac{1}{2}{\hat{g}}_{\mu \nu }{\mathcal {I}}^n_2, \end{aligned}$$
(29a)
$$\begin{aligned}&\delta {\hat{\Gamma }}: {\hat{\nabla }}_\lambda \big (\sqrt{-{\hat{g}}}\,{\hat{g}}^{\mu \nu }\big )=0, \end{aligned}$$
(29b)
$$\begin{aligned}&\delta {\mathcal {I}}^n_E: 2\epsilon _\text {Palatini}{\hat{\Box }}{\mathcal {I}}^n_E-\frac{d{\mathcal {I}}^n_2}{d{\mathcal {I}}^n_E}=\kappa ^2\frac{2-n}{2}\frac{1}{{\mathcal {I}}^n_1}\frac{d{\mathcal {I}}^n_1}{d{\mathcal {I}}^n_E}{\hat{T}}. \end{aligned}$$
(29c)

In Eq. (29c), the box operator is defined as \(\hat{g}^{\mu \nu } \hat{\nabla }_\mu \hat{\nabla }_\nu \). If we consider the second equation, (29b), we immediately recognize the well-known relation between the connection and the metric tensor: if a connection is symmetric and the covariant derivative of the inverse metric multiplied by the square root of (minus) its determinant vanishes, then the connection is necessarily Levi-Civita with respect to the metric. This shows an interesting result: after writing the action functional in terms of invariants, the initially independent invariant connection becomes Levi-Civita with respect to the invariant metric \({\hat{g}}_{\mu \nu }\). Consequently, the curvature scalar is also built from the metric alone. Apart from the presence of the scalar field in the matter part of the action functional, this suggests that the Einstein frame is the simplest one.
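
The step from (29b) to metric compatibility can be sketched as follows, assuming a symmetric connection, \(n\ne 2\), and \(\sqrt{-{\hat{g}}}\) treated as a scalar density of weight one. Contracting (29b) with \({\hat{g}}_{\mu \nu }\), and using \({\hat{g}}^{\mu \nu }{\hat{\nabla }}_\lambda {\hat{g}}_{\mu \nu }=-{\hat{g}}_{\mu \nu }{\hat{\nabla }}_\lambda {\hat{g}}^{\mu \nu }\), one finds:

$$\begin{aligned} 0&={\hat{g}}_{\mu \nu }{\hat{\nabla }}_\lambda \big (\sqrt{-{\hat{g}}}\,{\hat{g}}^{\mu \nu }\big )=n\,{\hat{\nabla }}_\lambda \sqrt{-{\hat{g}}}+\sqrt{-{\hat{g}}}\,{\hat{g}}_{\mu \nu }{\hat{\nabla }}_\lambda {\hat{g}}^{\mu \nu },\\ {\hat{\nabla }}_\lambda \sqrt{-{\hat{g}}}&=\frac{1}{2}\sqrt{-{\hat{g}}}\,{\hat{g}}^{\mu \nu }{\hat{\nabla }}_\lambda {\hat{g}}_{\mu \nu }=-\frac{1}{2}\sqrt{-{\hat{g}}}\,{\hat{g}}_{\mu \nu }{\hat{\nabla }}_\lambda {\hat{g}}^{\mu \nu }. \end{aligned}$$

Combining the two lines gives \((n-2){\hat{\nabla }}_\lambda \sqrt{-{\hat{g}}}=0\), hence \({\hat{\nabla }}_\lambda \sqrt{-{\hat{g}}}=0\), and (29b) reduces to \({\hat{\nabla }}_\lambda {\hat{g}}^{\mu \nu }=0\). Together with the symmetry of the connection, this is the defining property of the Levi-Civita connection of \({\hat{g}}_{\mu \nu }\).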

Alternatively, we can express the action functional in terms of the invariant metric \({\tilde{g}}_{\mu \nu }=e^{2\alpha (\Phi )}g_{\mu \nu }\) and the invariant linear connection \({\hat{\Gamma }}^{\alpha }_{\mu \nu }\). The invariant \({\mathcal {I}}^n_1\) shall now play the role of the scalar field. This gives us an action functional cast in a Jordan frame:

$$\begin{aligned}&S[{\tilde{g}}_{\mu \nu },{\hat{\Gamma }}^\alpha _{\mu \nu },{\mathcal {I}}^n_1] =\frac{1}{2\kappa ^2}\int _{\Omega }d^nx\sqrt{-{\tilde{g}}}\nonumber \\&\quad \times \left[ {\mathcal {I}}^n_1{\hat{R}}({\tilde{g}},{\hat{\Gamma }}) -{\tilde{g}}^{\mu \nu }{\mathcal {I}}^n_1\left( \frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1}\right) ^2{\hat{\nabla }}_\mu {\mathcal {I}}^n_1{\hat{\nabla }}_\nu {\mathcal {I}}^n_1-{\mathcal {I}}^n_3 \right] \nonumber \\&\quad +\,S_\text {matter}[{\tilde{g}}_{\mu \nu },\chi ]. \end{aligned}$$
(30)

For simplicity, we introduced another invariant, \({\mathcal {I}}^n_3\), defined in the following way:

$$\begin{aligned} {\mathcal {I}}^n_3=({\mathcal {I}}_1^n)^{\frac{n}{n-2}}{\mathcal {I}}^n_2, \end{aligned}$$

denoting a modified potential.
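
Written out in terms of the original coefficients via (18) and (19), and assuming \({\mathcal {A}}>0\) so that the fractional powers are well defined, this invariant takes a compact form:

$$\begin{aligned} {\mathcal {I}}^n_3=\left( \frac{{\mathcal {A}}(\Phi )}{e^{(n-2)\alpha (\Phi )}}\right) ^{\frac{n}{n-2}}\frac{{\mathcal {V}}(\Phi )}{({\mathcal {A}}(\Phi ))^{\frac{n}{n-2}}}=e^{-n\alpha (\Phi )}{\mathcal {V}}(\Phi ). \end{aligned}$$

In the Jordan frame of Definition 5.2, where \(\alpha =0\), it therefore coincides with the potential \({\mathcal {U}}\).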

Let us now obtain equations of motion for the theory. Variation with respect to all three dynamical variables yields the following formulae:

$$\begin{aligned}&\delta {\tilde{g}}:{\hat{G}}_{\mu \nu }({\tilde{g}},{\hat{\Gamma }})=\frac{\kappa ^2}{{\mathcal {I}}^n_1}{\tilde{T}}_{\mu \nu }\nonumber \\&\quad +\left( \frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1}\right) ^2 {\hat{\nabla }}_\alpha {\mathcal {I}}^n_1{\hat{\nabla }}_\beta {\mathcal {I}}^n_1\left( \delta ^\alpha _\mu \delta ^\beta _\nu - \frac{1}{2}{\tilde{g}}_{\mu \nu }{\tilde{g}}^{\alpha \beta }\right) -\frac{1}{2}{\tilde{g}}_{\mu \nu }\frac{{\mathcal {I}}^n_3}{{\mathcal {I}}^n_1}, \end{aligned}$$
(31a)
$$\begin{aligned}&\delta {\hat{\Gamma }}: {\hat{\nabla }}_\alpha \big ({\mathcal {I}}^n_1\sqrt{-{\tilde{g}}}{\tilde{g}}^{\mu \nu }\big )=0, \end{aligned}$$
(31b)
$$\begin{aligned}&\delta {\mathcal {I}}^n_1: {\hat{R}}({\tilde{g}},{\hat{\Gamma }})-{\tilde{g}}^{\mu \nu } {\hat{\nabla }}_\mu {\mathcal {I}}^n_1{\hat{\nabla }}_\nu {\mathcal {I}}^n_1 \Bigg [\left( \frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1}\right) ^2\nonumber \\&\quad +2{\mathcal {I}}^n_1\frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1}\frac{d^2{\mathcal {I}}^n_J}{d({\mathcal {I}}^n_1)^2}\Bigg ] -\frac{d{\mathcal {I}}^n_3}{d{\mathcal {I}}^n_1}\nonumber \\&\quad +\frac{2}{\sqrt{-{\tilde{g}}}}{\hat{\nabla }}_\mu \left( \sqrt{-{\tilde{g}}}{\tilde{g}}^{\mu \nu }{\mathcal {I}}^n_1 \left( \frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1} \right) ^2{\hat{\nabla }}_\nu {\mathcal {I}}^n_1\right) =0. \end{aligned}$$
(31c)

Making use of the field equations, we can eliminate the independent invariant connection from (30) and arrive at the action functional dependent on the metric and the scalar field only:

$$\begin{aligned} \begin{aligned}&S[{\tilde{g}}_{\mu \nu },{\mathcal {I}}^n_1]=\frac{1}{2\kappa ^2}\int _{\Omega }d^nx \sqrt{-{\tilde{g}}}\\&\quad \times \left[ {\mathcal {I}}^n_1{\tilde{R}}({\tilde{g}}) -{\tilde{g}}^{\mu \nu }\left( {\mathcal {I}}^n_1 \left( \frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1} \right) ^2-\frac{n-1}{n-2}\frac{1}{{\mathcal {I}}^n_1} \right) {\hat{\nabla }}_\mu {\mathcal {I}}^n_1{\hat{\nabla }}_\nu {\mathcal {I}}^n_1 -{\mathcal {I}}^n_3\right] \\&\quad +S_\text {matter}[{\tilde{g}}_{\mu \nu },\chi ]. \end{aligned} \end{aligned}$$
(32)

For simplicity, let us introduce another invariant \({\mathcal {I}}^n_4\): \({\mathcal {I}}^n_4={\mathcal {I}}^n_1\Big (\frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1}\Big )^2-\frac{n-1}{n-2}\frac{1}{{\mathcal {I}}^n_1}\). As can be seen, if the invariant \({\mathcal {I}}^n_J\) is equal to zero, then \({\mathcal {I}}^n_4\) reduces to \(-\frac{n-1}{n-2}\frac{1}{{\mathcal {I}}^n_1}\), so that the resulting theory in four dimensions is simply the standard Brans–Dicke theory with \(\omega =-\frac{3}{2}\), supplemented by the modified self-interaction potential \({\mathcal {I}}^n_3\).
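
The value of \(\omega \) can be read off explicitly. The sketch below identifies \({\mathcal {I}}^n_1\) with the Brans–Dicke field and assumes the usual convention in which the kinetic coupling is written as \(\frac{\omega }{{\mathcal {I}}^n_1}\):

$$\begin{aligned} \left. {\mathcal {I}}^n_4\right| _{{\mathcal {I}}^n_J=0}=-\frac{n-1}{n-2}\frac{1}{{\mathcal {I}}^n_1}\equiv \frac{\omega }{{\mathcal {I}}^n_1}\quad \Longrightarrow \quad \omega =-\frac{n-1}{n-2}\;\overset{n=4}{=}\;-\frac{3}{2}. \end{aligned}$$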

5.1 Scalar–tensor extension of \(F({\hat{R}})\) gravity

By means of a simple transformation, it can be shown that \(F({\hat{R}})\) gravity is equivalent to special cases of [15], both in the metric and the Palatini approach (see footnote 9). This is achieved by a simple trick, as presented in Appendix C. In fact, the metric F(R) theory is equivalent to the Brans–Dicke (BD) theory with \(\omega _{BD}=0\) (no kinetic term), while the Palatini \(F({\hat{R}})\) theory is equivalent to the Brans–Dicke theory with \(\omega _{BD}=-\frac{n-1}{n-2}\) (with a potential added to the Lagrangian in both cases, and in n dimensions). However, we may invert the problem and ask whether a given scalar–tensor theory of gravity is equivalent to some \(F({{\hat{R}}})\) theory (in the mathematical, not the physical, sense). Answering this question might be much easier thanks to the introduction of invariant quantities, which are the same for different theories related to each other via a conformal transformation. In order to find out whether two arbitrary theories can be linked by a transformation, we need to calculate the invariants and compare them. In this section, we will focus on \(F({\hat{R}})\) gravity and discuss the conditions for its equivalence with an S–T theory. First, let us introduce the notion of Brans–Dicke theory in the Palatini approach, which is a particular case of the Jordan frame (cf. Definition 5.2).

Definition 5.3

Brans–Dicke theory in Palatini approach is given by the following action functional expressed in the Jordan frame:

$$\begin{aligned}&S[g_{\mu \nu },\Gamma ^\alpha _{\mu \nu },\Psi ]=\frac{1}{2\kappa ^2}\int _{\Omega }d^nx \sqrt{-g}\left( \Psi R(g,\Gamma )\phantom {\frac{\omega _\text {Palatini}}{\Psi }}\right. \\&\quad \left. -\frac{\omega _\text {Palatini}}{\Psi }g^{\mu \nu } \nabla _\mu \Psi \nabla _\nu \Psi -{\mathcal {U}}(\Psi )\right) +S_{\text {matter}}\left[ g_{\mu \nu },\chi \right] , \end{aligned}$$

with \(\omega _\text {Palatini}=\text {const}\).

Brans–Dicke theory in the Palatini approach is not to be confused with the (original) BD theory in the metric approach, despite both of them having exactly the same functional form (see Appendix C). These theories are not physically equivalent, although one can show their mathematical equivalence. The proof goes as follows: using the fact that the BD theory in the Palatini approach is effectively metric, as was shown in the previous section, one can express it in a form analogous to (32). Here, the invariants \({\mathcal {I}}^n_1\) and \({\mathcal {I}}^n_2\) have exactly the same form, whereas the invariant \({\mathcal {I}}^n_J\) for this special choice of the function \({\mathcal {B}}\) is now (see footnote 10):

$$\begin{aligned} {\mathcal {I}}^n_J(\Psi )=\sqrt{\pm \, \omega _\text {Palatini}}\ln \left( \frac{\Psi }{\Psi _0}\right) . \end{aligned}$$

Therefore, the (metric) action (32) written for BD theory given initially in the Palatini approach, reads now as follows:

$$\begin{aligned}&S[g_{\mu \nu },\Psi ]=\frac{1}{2\kappa ^2}\int _{\Omega }d^nx \sqrt{-g}\Big (\Psi R(g)\nonumber \\&\quad -\frac{\omega _\text {Palatini}-\frac{n-1}{n-2}}{\Psi }g^{\mu \nu } \nabla _\mu \Psi \nabla _\nu \Psi -{\mathcal {U}}(\Psi )\Big )\nonumber \\&\quad +S_{\text {matter}}\left[ g_{\mu \nu },\chi \right] . \end{aligned}$$
(33)

Let us observe that this action differs from (C.7), as the one written above is already evaluated on-shell, i.e. when the connection is the Levi-Civita connection of the metric tensor. As can be seen, when \(\omega _\text {Palatini}=0\), the only difference is that the functions \({{{\mathcal {C}}}}_1\) and \({{{\mathcal {C}}}}_2\) do not vanish, so that they contribute to the field equations obtained by varying w.r.t. the metric and the independent connection. Therefore, the actions (33) and (C.7) are fully equivalent on-shell.

The action written in the Einstein frame will have the following form (assuming \(\omega _{\text {Palatini}}\ne 0\)):

$$\begin{aligned} \begin{aligned} S[{\bar{g}}_{\mu \nu },{\bar{\Psi }}]&= \frac{1}{2\kappa ^2}\int _{\Omega }d^nx \sqrt{-{\bar{g}}}\Big (R({\bar{g}})\mp {\bar{g}}^{\mu \nu }{\bar{\nabla }}_\mu {\bar{\Psi }}{\bar{\nabla }}_\nu {\bar{\Psi }}-\bar{{\mathcal {U}}}({\bar{\Psi }})\Big )\\&\quad +S_{\text {matter}}\left[ \exp \left( -\frac{2}{n-2}\frac{{\bar{\Psi }}}{\sqrt{\pm \omega _{\text {Palatini}}}}\right) {\bar{g}}_{\mu \nu },\chi \right] . \end{aligned} \end{aligned}$$
(34)
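
As a consistency check of (34), one may rewrite the Weyl factor appearing in the matter action in terms of the Jordan-frame field. The sketch below assumes that the Einstein-frame field \({\bar{\Psi }}\) is identified with the invariant \({\mathcal {I}}^n_J(\Psi )\) given above; this identification is an assumption made here for illustration:

$$\begin{aligned} {\bar{\Psi }}=\sqrt{\pm \, \omega _\text {Palatini}}\ln \left( \frac{\Psi }{\Psi _0}\right) \quad \Longrightarrow \quad \exp \left( -\frac{2}{n-2}\frac{{\bar{\Psi }}}{\sqrt{\pm \omega _{\text {Palatini}}}}\right) =\left( \frac{\Psi }{\Psi _0}\right) ^{-\frac{2}{n-2}}. \end{aligned}$$

Under this assumption, the metric entering the matter action is \(g_{\mu \nu }=\left( \Psi /\Psi _0\right) ^{-\frac{2}{n-2}}{\bar{g}}_{\mu \nu }\), i.e. \({\bar{g}}_{\mu \nu }=\left( \Psi /\Psi _0\right) ^{\frac{2}{n-2}}g_{\mu \nu }\), which is the standard conformal rescaling that removes the non-minimal coupling \(\Psi R\) in n dimensions.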

We may introduce the Brans–Dicke coefficient in the metric approach, given in terms of \(\omega _\text {Palatini}\) by (see footnote 11):

$$\begin{aligned} \omega _{BD}=\omega _\text {Palatini}-\frac{n-1}{n-2}. \end{aligned}$$
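
For orientation, the shift can be evaluated in four dimensions; the choice \(n=4\) is made here only as an illustration:

$$\begin{aligned} n=4:\quad \omega _{BD}=\omega _\text {Palatini}-\frac{3}{2},\qquad \omega _\text {Palatini}=0\ \Longrightarrow \ \omega _{BD}=-\frac{3}{2}. \end{aligned}$$

The second relation agrees with the value \(\omega _{BD}=-\frac{n-1}{n-2}\) quoted earlier for Palatini \(F({\hat{R}})\) gravity.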

Hence, the BD theory in the Palatini approach is equivalent to a BD theory in the metric formalism with a shifted coefficient \(\omega \). Let us now ask a more general question: under what conditions is an arbitrary S–T theory equivalent to the BD theory by means of the transformation (5a)–(5c)? In order to resolve this issue, one needs to observe that for any theory to be equivalent to the BD theory, it must necessarily be expressible in the Jordan frame representation. In the transformed frame, one arrives at an action functional given by (30). For this new action to describe a BD theory, it must possess a kinetic coupling of the form \(\frac{\text {const}}{{\bar{\Psi }}}\), where \({\bar{\Psi }}\) is a function of the “old” scalar field \(\phi \). Therefore, one might write the following equivalence condition:

$$\begin{aligned} {\mathcal {I}}^n_1(\phi )\left( \frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1}\right) ^2=\pm \frac{\omega _\text {Palatini}}{{\bar{\Psi }}(\phi )}. \end{aligned}$$
(35)

From this point on, it is easy to give general conditions for mathematical equivalence between \(F({\hat{R}})\)-Palatini gravity and S–T theories. As has been shown, \(F({\hat{R}})\) gravity can be thought of as a (Palatini) Brans–Dicke theory with \(\omega _\text {Palatini}= 0\) (or, equivalently, \(\omega _{BD}=-\frac{n-1}{n-2}\), cf. Appendix C). Therefore, in order to find out whether a given S–T theory in the Palatini approach arises from some \(F({\hat{R}})\) gravity, one needs to examine the condition (35) for \(\omega _\text {Palatini}=0\). Such a condition is satisfied only when \(\frac{d{\mathcal {I}}^n_J}{d{\mathcal {I}}^n_1}=0\), which means that (up to an additive constant) \({\mathcal {I}}^n_J={\mathcal {I}}^n_E=0\). This reproduces the well-known result that there are only two physical degrees of freedom (those of the graviton) in Palatini \(F({\hat{R}})\) theories of gravity [64].

Once the equivalence is established, one may also wish to determine the exact form of the function \(F({\hat{R}})\). Information about the \(F({\hat{R}})\) theory in the scalar–tensor representation is stored in the potential, defined as \({{{\mathcal {U}}}}(\Psi )=\Psi \,\Xi (\Psi )-F(\Xi (\Psi ))\), with \({\hat{R}}(\Psi )\equiv \Xi (\Psi )=\frac{d{{{\mathcal {U}}}}(\Psi )}{d\Psi }\) (see Appendix C). Assuming that the coefficients defining the “old” frame (the one subject to our inquiry) are \(\{\bar{{\mathcal {A}}},\bar{{\mathcal {B}}},\bar{{\mathcal {C}}}_1,\bar{{\mathcal {C}}}_2,\bar{{\mathcal {V}}},{\bar{\alpha }}\}\), and the variables are \(\{{\bar{g}},{\bar{\Gamma }},{\bar{\Psi }}\}\), we find that:

$$\begin{aligned} \mathcal{U}(\Psi )=\left( {\mathcal {I}}^n_1({\bar{\Psi }}(\Psi ))\right) ^{\frac{n}{n-2}}{\mathcal {I}}^n_2({\bar{\Psi }}(\Psi ))\rightarrow {\hat{R}}(\Psi ), \end{aligned}$$
(36)

where

$$\begin{aligned} {\hat{R}}(\Psi )= & {} \frac{n}{n-2}\big ({\mathcal {I}}^n_1({\bar{\Psi }}(\Psi ))\big )^{\frac{2}{n-2}}{\mathcal {I}}^n_2({\bar{\Psi }}(\Psi ))\nonumber \\&+\left( {\mathcal {I}}^n_1({\bar{\Psi }}(\Psi ))\right) ^{\frac{n}{n-2}}\frac{d}{d\Psi }{\mathcal {I}}^{n}_2({\bar{\Psi }}(\Psi )). \end{aligned}$$
(37)

The resulting equation is a non-linear first-order differential equation, as \(\Psi \) can now be identified with \(\frac{dF}{d{\hat{R}}}\). Solving this equation yields the exact form of the function \(F({\hat{R}})\).
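
To illustrate how the potential encodes \(F({\hat{R}})\), consider the quadratic model \(F({\hat{R}})={\hat{R}}+a{\hat{R}}^2\) with a constant \(a\ne 0\). This model and the symbol \(a\) are hypothetical choices introduced here only as a worked example. Using \(\Psi =\frac{dF}{d{\hat{R}}}\) and the definition of \({\mathcal {U}}(\Psi )\) given above, one obtains:

$$\begin{aligned} \Psi&=\frac{dF}{d{\hat{R}}}=1+2a{\hat{R}}\quad \Longrightarrow \quad \Xi (\Psi )=\frac{\Psi -1}{2a},\\ {\mathcal {U}}(\Psi )&=\Psi \,\Xi (\Psi )-F(\Xi (\Psi ))=a\,\Xi (\Psi )^2=\frac{(\Psi -1)^2}{4a},\\ \frac{d{\mathcal {U}}(\Psi )}{d\Psi }&=\frac{\Psi -1}{2a}=\Xi (\Psi )={\hat{R}}(\Psi ). \end{aligned}$$

Read in the opposite direction, a potential of the form \(\frac{(\Psi -1)^2}{4a}\) gives \({\hat{R}}(\Psi )=\frac{\Psi -1}{2a}\), which can be inverted to \(\Psi =1+2a{\hat{R}}\); then \(F={\Psi }{\hat{R}}-{\mathcal {U}}={\hat{R}}+a{\hat{R}}^2\) recovers the original function.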

6 Conclusions

In this paper, we have combined two frequently used ways of altering general relativity, the Palatini variation and the addition of a scalar field non-minimally coupled to the curvature, into a single theory of gravity. Our motivation for considering such a coalescence of modifications of classical gravity was the lack of a formalism of invariants for S–T theories in the Palatini approach. Although the prevalent approach to the analysis of S–T theories is the metric one, the Palatini formalism has many interesting features to offer.

In the course of the paper, we placed special emphasis on the notion of conformal and almost-geodesic transformations, as it allows us to establish – under well-defined and strict conditions – mathematical equivalence between two different conformal frames. We did not aim to take a stand on the issue of which frame is the physical one; the main purpose of this paper was to obtain solution-equivalent classes of frames and to introduce a proper language enabling one to analyze the theory in a frame-independent manner. The first step towards creating such a language was to recognize that, in the Palatini approach, one must transform the metric and the connection independently. Decoupling the metric from the affine structure of spacetime influenced the action functional defined for a general S–T theory, devised to preserve its form under a conformal change, forcing us to add special terms linear in the scalar field derivatives. These terms do not have any clear interpretation yet.

We singled out two frames most commonly used in the literature – Jordan and Einstein. Quantities behaving as invariants on the orbits of the two frames were also introduced and the role they play when comparing equivalent theories was discussed. In general, the theory possesses three degrees of freedom: one introduced by the scalar field, and the remaining two being a property of the metric. However, the independent scalar field turns out to be an auxiliary field in case the invariant \({\mathcal {I}}^n_E\) vanishes; then, the theory has only two degrees of freedom.

It was discovered that there exists a subclass of conformal frames with \({\mathcal {C}}_1={\mathcal {C}}_2=0\) fully analogous to the metric frames. In such frames, the (initially independent) connection is always Levi-Civita with respect to a metric \({\bar{g}}\) conformally related to the initial metric g. This class is invariant under the action of the subgroup \(\gamma _2=\gamma _3=0\).

If a given theory has the same \(\{{\mathcal {A}},{\mathcal {B}},{\mathcal {V}},\alpha \}\) functions both in the metric and Palatini approach, the latter can be brought to the metric form using the property discussed above. The only difference between two such theories will be the exact form of the kinetic coupling \({\mathcal {B}}\); in the metric formalism resulting from a prior Palatini frame, the coupling will take the form \({\mathcal {B}}-\frac{n-1}{n-2}\frac{1}{\Phi }\). This fact allowed us to establish a correspondence between the Brans–Dicke theories in the metric and Palatini formalisms.

It was also shown that for an arbitrary S–T theory in the Palatini approach there always exists a unique transformation defined for the connection such that it renders the theory effectively metric. This useful property allows us to analyze a specific theory within the metric formalism.

Finally, \(F({\hat{R}})\) theories were analyzed using the language of invariants. We made use of the well-established equivalence of these theories to S–T gravity – to the Brans–Dicke theory, to be precise. Invariants made it possible for us to address the issue of the relation between S–T and \(F({\hat{R}})\) theories; namely, we identified cases in which the two could be related by the transformation (5a)–(5c), meaning that they are mathematically equivalent. It was discovered that the coefficients \(\{{\mathcal {A}},{\mathcal {B}},{\mathcal {C}}_1,{\mathcal {C}}_2,{\mathcal {V}},\alpha \}\), which characterize a specific S–T theory, must fulfil certain relations (given by (35)) in order for the theory to be equivalent to \(F({\hat{R}})\) gravity in the Palatini approach. Furthermore, because the metric and the Palatini formalisms always give two non-equivalent theories, if a given scalar–tensor theory results from some F(R) theory, it cannot simultaneously be derived from both the metric and the Palatini F(R).

The main aim of this paper was to introduce a new class of scalar–tensor theories of gravity and analyze some of its mathematical properties. Due to its introductory nature, it focuses on the formal aspects of the theory, with special emphasis on self-consistency conditions, and lacks direct physical applications. Also, due to adopting the Palatini approach and adding more degrees of freedom to the theory, it will be straightforward to include torsion and/or disformal transformations in order to investigate their impact on the self-consistency of the theory. Analysis of real-world phenomena will be carried out in forthcoming papers. In order to find out whether the predictions of the theory are in agreement with experiment, we plan to compute the post-Newtonian parameters first. Furthermore, topics to be covered in future work will include cosmological applications (cf. [20, 21]), F(R) theories with non-minimal curvature coupling (see e.g. [17, 19]), and the appearance of ghosts and tachyons.