1 Introduction

A central goal of modern cosmology is to understand Dark Matter (DM) and Dark Energy (DE), whose origin and nature remain elusive. These components make up about 95% of the energy-matter content of our Universe (\(\sim 68\%\) as DE and \(\sim 27\%\) as DM) [1]. The existence of DM was first inferred in the 1930s by Zwicky, who noticed a discrepancy between the observed dynamical mass of galaxy clusters and the mass derived from theoretical calculations [2]. Subsequent pioneering studies on the rotation curves of spiral galaxies confirmed this inconsistency at galactic scales [3]. These results challenged the previously held assumption that a galaxy’s mass is concentrated within its central bulge, which contains the majority of its stars and gas. Instead, observations indicated that the mass is distributed far more evenly across the entire gravitational structure. This “hidden” mass, which could not be directly observed, thus acquired the name of “dark matter”.

Since then, more and more evidence has accumulated supporting the need for DM on both cosmological and astrophysical scales. Particle physics considerations have generated numerous models and a variety of dark matter candidates (see [4, 5] for more information). However, to date there has been no evidence supporting the existence of these proposed dark matter particles, and the only means to probe the presence and characteristics of dark matter comes from astrophysical and cosmological observations.

Currently, the dominant cosmological model is the \(\Lambda \)–Cold Dark Matter model (\(\Lambda \)CDM), which is fully based on General Relativity (GR). While the \(\Lambda \)CDM model has been successful in explaining many observations and closely matches the available data, it suffers from some problems [6,7,8,9,10,11,12,13]. In the pursuit of understanding the nature of DM and DE, researchers have proposed exploring extended theories of gravity (ETGs) [14, 15]. By embracing the idea that GR is a special case of a more comprehensive theory, we naturally arrive at the realm of ETGs. In certain ETG scenarios, both geometry and matter can undergo modifications, offering a wider framework in which to understand DM and DE. Over time, a multitude of models have been proposed within the ETG category, each contributing uniquely to our understanding of these phenomena [16,17,18,19].

In this paper, our objective is to investigate the characteristics of the DM fluid when it is Non-Minimally Coupled (NMC) with gravity. One might inquire about the rationale for this choice. In the standard description of fluids in GR, we assume that the transition from individual particles to a fluid takes place on scales small enough for spacetime to be well approximated as flat. However, it is worth exploring the scenario where the fluid’s mean free path is comparable to the scale over which spacetime curvature changes. This scenario is quite likely to apply to DM, which exhibits at most very weak (self-)interactions. As a result, its mean free path can potentially be as large as the Hubble scale (\(l_{\mathrm {{mfp}}} \sim 10^3\, \textrm{Gpc}\)). DM can therefore be NMC and, consequently, the standard Einstein equations will be modified [20,21,22].

Within the framework of GR, there exists an option to contemplate DM as a Bose–Einstein condensate (BEC) [23]; this approach provides a way to satisfy the aforementioned criteria, as a BEC naturally possesses a characteristic length scale.\(^{1}\) Additionally, within the domain of ETGs, modifications to gravity inspired by Born–Infeld theory result in the same modifications [32].

Furthermore, it has been shown that models assuming NMC between DM and gravity can resolve the core-cusp problem of the \(\Lambda \)CDM model [27]. This problem arises from the contrast between cosmological simulations and observations. Simulations predict that the DM density near the center of galaxies follows a cusp-like profile. In contrast, observations of dwarf galaxies show rotation velocities that rise linearly with radius in the inner regions, which implies a central density core (see [33,34,35,36]).

In the Newtonian limit, NMC manifests as a modification of the Poisson equation. Specifically, it introduces an additional term proportional to the Laplacian of the dark matter density \( \rho \), expressed as \( L^2 \nabla ^2 \rho \), so that the modified Poisson equation depends not only on the density but also on its gradients [23].

While corrections to Poisson’s equation have been examined on stellar scales [37, 38], there is currently a gap in the analysis at galactic and cluster scales. Here, we try to address this gap by conducting an investigation of these corrections specifically at the scale of galaxy clusters. We will investigate the implications of our modified Poisson equation without making any initial assumptions about the source or magnitude of L.

The structure of our paper is as follows. In Sect. 2, we first provide a concise overview of the theory underlying the NMC DM model. We then introduce the fundamental principles of gravitational lensing theory, upon which we base our theoretical predictions, and briefly review our chosen mass density profiles. In Sect. 3 we describe the data set from the CLASH program used in our analysis. In Sect. 4 we outline the key aspects of the statistical analysis we have conducted. Finally, in Sect. 5, we provide a comprehensive discussion of our results and present our concluding findings.

2 Theory

We will quickly go over our model’s theoretical foundation in this section (see [20, 39] for more detail).

The general action that describes the NMC case can be written as follows [20]

$$\begin{aligned} \begin{aligned} S&= \frac{M_{\text {Pl}}^2}{2}\int d^4 x\sqrt{-g}\big [R+\alpha _{\textrm{c}} \rho _c(n,s)R \\&\quad +\alpha _{\textrm{d}} \rho _d(n,s) R_{\mu \nu }u^\mu u^\nu \big ] +S_{\textrm{fluid}}, \end{aligned} \end{aligned}$$
(1)

where the action \(S_{\textrm{fluid}}\) describes dark matter, which we model using the action for a perfect fluid. This reads [20, 40]

$$\begin{aligned} S_{\textrm{fluid}} = \int d^{4}x \, \left[ \sqrt{-g}\, \rho (n,s) + J^{\mu }(\psi _{,\mu } + s \theta _{,\mu }+ \beta _{A}\alpha ^{A}_{,\mu }) \right] , \end{aligned}$$
(2)

where n represents the particle number density and s the entropy per particle. In addition, \(\alpha ^{A}\) and \(\beta _{A}\), where A takes on values of 1, 2, 3, represent the Lagrangian coordinates for the fluid. The second term imposes constraints on the perfect fluid’s flow, and \(\psi \) and \(\theta \) can be interpreted as thermodynamic potentials. Furthermore, \(J^{\mu }\) is defined as

$$\begin{aligned} J^{\mu } = n u^{\mu } \sqrt{-g}, \end{aligned}$$
(3)

and is the conserved current representing particle-number conservation, with \(u^{\mu }\) being the four-velocity of the fluid.

In Eq. (1), the term \(\rho _c(n,s)R\) represents a conformal coupling, while \(\rho _d(n,s)R_{\mu \nu }u^\mu u^\nu \) is a disformal one, in which the fluid variables couple to the Ricci tensor contracted with the fluid four-velocity.

As we will explain below, we focus on the disformal coupling only. Hence, we retain only the latter term in the total action

$$\begin{aligned} S=\int d^4x\sqrt{-g}\left[ \frac{M_{\text {Pl}}^2}{2}\left( R+\alpha _{\textrm{d}} \rho _d(n,s)R_{\mu \nu }u^\mu u^\nu \right) \right] +S_{\textrm{fluid}}. \end{aligned}$$
(4)

In order to derive the Newtonian limit of our theory, it is beneficial to employ the fluid approximation [22]. This approach leads to a modification of the Poisson equation of the form [20, 21]

$$\begin{aligned} \mathbf {\nabla }^2\Phi =4\pi G_N\,[(\rho +\rho _{\textrm{bar}}) - \epsilon \,L^2\,\nabla ^2\rho ]~, \end{aligned}$$
(5)

where \(\Phi \) represents the Newtonian potential, and \(\rho _{\textrm{bar}}\) and \(\rho \) denote the mass densities of baryonic matter and dark matter, respectively. The second term in Eq. (5) is the non-minimal coupling term: from [20] it can be seen that \(L^2 \propto \alpha _d\), so that L is the non-minimal coupling length; \(\epsilon = \pm 1\) represents the polarity of the coupling. According to [39], the negative polarity \(\epsilon =-1\) is required. In our chosen scenario of a disformally coupled fluid, there is only one gravitational potential and no anisotropic stress. Additionally, it is important to highlight that Eq. (5) is not the most general expression: extra terms are possible, but most of them have to be close to zero to satisfy the equivalence principle [41, 42].

It is worth mentioning that this kind of modification to the dynamics of GR (Eq. 2) is linked to a “coarse-grained” scenario and does not involve fundamental modifications to gravitational dynamics. Consequently, L is not a fundamental constant of nature, and its value can depend on the local environment.

As can be seen in Eq. (5), the modified Poisson equation has a term that is dependent on gradients of density. As a consequence, the impact of this modification becomes more pronounced as the distribution of dark matter becomes increasingly inhomogeneous [21].
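A worked step may help to make this gradient dependence concrete. It is an illustration added here, not taken from [20, 21]: it assumes that the radial Laplacian \(\Delta _r = \frac{1}{r^2}\frac{d}{dr}\left( r^2 \frac{d}{dr}\right) \) is applied to the NFW profile \(\rho _{NFW}(r) = \rho _s/[x(1+x)^2]\) adopted later in Eq. (16), with \(x = r/r_s\), and it holds only for \(r>0\):

$$\begin{aligned} \frac{d\rho _{NFW}}{dr}&= -\frac{\rho _s}{r_s}\,\frac{1+3x}{x^2(1+x)^3}\,, \\ \Delta _r \rho _{NFW}&= \frac{1}{r^2}\frac{d}{dr}\left( r^2\frac{d\rho _{NFW}}{dr}\right) = \frac{6\,\rho _s}{r_s^2\,x\,(1+x)^4} = \frac{6}{r_s^2\,(1+x)^2}\,\rho _{NFW}\,. \end{aligned}$$

Under this assumption, the dark matter source in Eq. (5) becomes \(\rho _{NFW}\left[ 1 - 6\,\epsilon \,L^2/(r_s^2(1+r/r_s)^2)\right] \). The relative size of the correction is then of order \(6L^2/r_s^2\) in the inner regions and decays as \(6L^2/r^2\) for \(r \gg r_s\), so the comparison between L and \(r_s\) is what controls its importance.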

According to [43], this modification plays an important role in the dynamics of spiral galaxies. That study also showed that NMC DM can provide a better fit to the rotation curves of spiral galaxies than NFW.

2.1 Gravitational lensing

Gravitational lensing emerges as a potent tool for probing the distribution of both dark and baryonic matter within galaxy clusters.

In a gravitational lensing setup, we denote by \(D_s\) the angular diameter distance from the observer to the source, by \(D_{l}\) that from the observer to the lens, and by \(D_{ls}\) that between the lens and the source [44,45,46]. The angular diameter distance \(D_A\), which is a function of redshift, can be defined as

$$\begin{aligned} D_A(z) = \frac{c}{1+z}\int _0^z \frac{dz'}{H(z')}\,, \end{aligned}$$
(6)

where, in the context of a \(\Lambda \)CDM model, the Hubble function H(z) is given by the first Friedmann equation:

$$\begin{aligned} H(z) = H_{0} \sqrt{\Omega _{m} (1+z)^{3} + \Omega _{k} (1+z)^{2} + \Omega _{\Lambda }}\,, \end{aligned}$$

with \(\Omega _{\Lambda } = 1-\Omega _m\) in the case of spatial flatness (\(\Omega _k =0\)). Throughout this work, we take our background cosmological parameters from the Planck baseline model [1]: Hubble constant \(H_0 = 67.89\) km s\(^{-1}\) Mpc\(^{-1}\) and matter density parameter \(\Omega _m = 0.308\). Thus, we are implicitly assuming that the NMC model we are analyzing behaves on cosmological scales, at least effectively, as this chosen \(\Lambda \)CDM one, whatever the (different) cosmological parameters that would effectively describe it.
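As an illustration of Eq. (6), the following minimal sketch evaluates \(D_A(z)\) for the flat \(\Lambda \)CDM background quoted above. The function names, the choice of Simpson integration and the number of steps are assumptions made for this example; they are not taken from the analysis code used in this work.

```python
import math

C_KM_S = 299792.458   # speed of light [km/s]
H0 = 67.89            # Hubble constant [km/s/Mpc], value quoted in the text
OMEGA_M = 0.308       # matter density parameter, value quoted in the text


def hubble(z, h0=H0, omega_m=OMEGA_M):
    """Flat LambdaCDM Hubble function H(z) in km/s/Mpc (Omega_k = 0)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def angular_diameter_distance(z, n_steps=2000):
    """D_A(z) in Mpc from Eq. (6), using composite Simpson integration.

    n_steps must be even (hypothetical default chosen for this sketch).
    """
    if z == 0.0:
        return 0.0
    h = z / n_steps
    total = 1.0 / hubble(0.0) + 1.0 / hubble(z)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble(i * h)
    integral = total * h / 3.0
    return C_KM_S * integral / (1.0 + z)
```

For instance, `angular_diameter_distance(0.352)` would return the distance in Mpc at the median redshift of the sample.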

Additionally, we assume that this system can be treated as approximately two-dimensional, since the distances \(D_l\) and \(D_{ls}\) are much larger than the physical size of the lens (the “thin-lens” approximation).\(^{2}\) In this case, the lens deflects light rays from the source by an angle \(\hat{\vec {\alpha }}\), which is defined as

$$\begin{aligned} \hat{\vec {\alpha }}= \frac{2}{c^{2}} \int _{-\infty }^{+\infty } \vec {\nabla }_{\perp } \Phi \textrm{d}z\,, \end{aligned}$$
(7)

where \(\vec {\nabla }_{\perp }\) represents the two-dimensional gradient operator perpendicular to the light path, and z denotes the coordinate along the direction in which the light propagates.

The deflection angle \(\hat{\vec {\alpha }}\) can be described using the effective lensing potential

$$\begin{aligned} \Phi _\textrm{lens}(R) = \frac{2}{c^2}\frac{D_{ls}}{D_lD_s}\int ^{+\infty }_{-\infty }\Phi (R,z)dz \,, \end{aligned}$$
(8)

where R is the two-dimensional projected radius on the lens plane.

The Laplacian of Eq. (8) gives twice the lensing convergence

$$\begin{aligned} \kappa (R) = \frac{1}{c^2}\frac{D_{ls}D_l}{D_s}\int ^{+\infty }_{-\infty }\Delta _r\Phi (R,z)dz \,, \end{aligned}$$
(9)

where, as mentioned earlier, R is the two-dimensional projected radius in the lens plane, \(r = \sqrt{R^2+z^2}\) is the three-dimensional radius, and \(\Delta _r = \frac{2}{r}\frac{\partial }{\partial r} + \frac{\partial ^2}{\partial r^2}\) is the radial Laplacian in spherical coordinates, where we assume spherical symmetry for simplicity. Now, by using the standard Poisson equation

$$\begin{aligned} \Delta _r\Phi = 4\pi G_N\rho (r), \end{aligned}$$
(10)

we can establish a connection between the convergence \(\kappa \) and the distribution of mass density \(\rho \) within the lens system, ultimately leading us to another expression for convergence

$$\begin{aligned} \kappa (R) = \int _{-\infty }^{+\infty } \frac{4 \pi G_N}{c^{2}} \frac{D_{ls}D_{l}}{D_{s}} \rho (R,z)\textrm{d}z \equiv \frac{\Sigma (R)}{\Sigma _{cr}}\,, \end{aligned}$$
(11)

where \(\Sigma (R)\) is the two-dimensional surface density of the lens and \(\Sigma _{cr}\) is the critical surface density for gravitational lensing, given respectively by

$$\begin{aligned} \Sigma (R) = \int _{-\infty }^{+\infty } \rho (R,z) \textrm{d}z\,, \end{aligned}$$
(12)

$$\begin{aligned} \Sigma _{cr} = \frac{c^{2}}{4\pi G_N} \, \frac{D_{s}}{D_{ls}D_{l}}\,. \end{aligned}$$
(13)
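The critical surface density of Eq. (13) can be sketched numerically as follows. The unit system (Mpc, km/s, solar masses), the function names, and the flat-universe relation used to obtain \(D_{ls}\) from \(D_l\) and \(D_s\) are assumptions of this example; the relation holds only for \(\Omega _k = 0\).

```python
import math

C_KM_S = 299792.458     # speed of light [km/s]
G_NEWTON = 4.30091e-9   # Newton's constant [Mpc (km/s)^2 / M_sun]


def lens_source_distance(z_l, z_s, d_l, d_s):
    """D_ls in Mpc for a spatially flat universe (assumption: Omega_k = 0).

    d_l and d_s are the angular diameter distances to lens and source.
    """
    return d_s - d_l * (1.0 + z_l) / (1.0 + z_s)


def critical_surface_density(d_l, d_s, d_ls):
    """Sigma_cr of Eq. (13) in M_sun / Mpc^2, with all distances in Mpc."""
    return C_KM_S ** 2 / (4.0 * math.pi * G_NEWTON) * d_s / (d_ls * d_l)
```

The convergence then follows as \(\kappa = \Sigma /\Sigma _{cr}\), with \(\Sigma \) expressed in the same units.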

It should be mentioned that, up to this point, our focus has been on the GR case, for which \(\Phi = \Psi \). We can broaden our perspective by considering a more general scenario with non-zero anisotropic stress, in which the gravitational potential (\(\Phi \)) and the metric potential (\(\Psi \)) are not equal (\(\Phi \ne \Psi \)). In this context, the expression for the convergence can be generalized to

$$\begin{aligned} \kappa (R) = \frac{1}{c^2}\frac{D_{ls}D_l}{D_s}\int ^{+\infty }_{-\infty }\Delta _r\biggl \{\frac{\Phi (R,z) + \Psi (R,z)}{2}\biggr \}dz\,. \end{aligned}$$
(14)

For the model we are focusing on, with a disformal coupling, we still have \(\Phi = \Psi \), but the Poisson equation is modified as in Eq. (5). The convergence for our model therefore reads

$$\begin{aligned} \kappa (R) = \frac{1}{\Sigma _{cr}} \int _{-\infty }^{+\infty } \left[ \rho (R,z) - \epsilon L^2 \Delta _r \rho _{NFW}(R,z)\right] \, \textrm{d}z \,, \end{aligned}$$
(15)

where \(\rho = \rho _{NFW} + \rho _{gas}\) is the total density. For DM, as described in the next section, we adopt the Navarro–Frenk–White (NFW) density profile; for the hot intracluster gas component, we use the profile \(\rho _{gas}\) described in Sect. 2.3.

As can be seen, the new convergence depends on \(\rho (R,z)\) and on the radial Laplacian of the DM density only, \(\Delta _r \rho _{NFW}(R,z)\). Consequently, sharp variations in the density profile may affect the value of the convergence. The parameter L, which is distinct for each cluster, further contributes to the overall change in convergence.
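The line-of-sight integral in Eq. (15) can be sketched numerically as below. This is a simplified illustration under stated assumptions: only the DM component is included (the gas term is omitted), the closed form \(\Delta _r \rho _{NFW} = 6\rho _s/[r_s^2\, x (1+x)^4]\) with \(x=r/r_s\) is obtained by applying the radial Laplacian to the NFW profile of Eq. (16) for \(r>0\), and the substitution \(z = R\tan u\) with a midpoint rule is a choice made for this example, not the integration scheme of the actual analysis.

```python
import math


def rho_nfw(r, rho_s, r_s):
    """NFW density of Eq. (16)."""
    x = r / r_s
    return rho_s / (x * (1.0 + x) ** 2)


def laplacian_rho_nfw(r, rho_s, r_s):
    """Radial Laplacian of the NFW density, closed form valid for r > 0."""
    x = r / r_s
    return 6.0 * rho_s / (r_s ** 2 * x * (1.0 + x) ** 4)


def projected_effective_density(R, rho_s, r_s, L, epsilon=-1, n=4000):
    """Integrand of Eq. (15) projected along the line of sight (DM only).

    Uses z = R tan(u), u in (0, pi/2), and the symmetry in z.
    Multiply by 1 / Sigma_cr to obtain the convergence.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h              # midpoint of the i-th sub-interval
        cos_u = math.cos(u)
        r = R / cos_u                  # three-dimensional radius
        source = rho_nfw(r, rho_s, r_s) - epsilon * L ** 2 * laplacian_rho_nfw(r, rho_s, r_s)
        total += source * R / cos_u ** 2   # dz = R du / cos^2(u)
    return 2.0 * total * h
```

With \(L=0\) the function reduces to the standard projected NFW surface density, which provides a simple consistency check at \(R=r_s\), where the analytic value is \(2\rho _s r_s/3\).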

2.2 Navarro–Frenk–White profile

The mass distribution within galaxy clusters is frequently represented using the spherically symmetric Navarro–Frenk–White (NFW) mass density profile [33]. One might argue that such a distribution, although seemingly generally valid [47], emerges from simulations in the context of standard General Relativity. We thus follow a minimally conservative approach, in which we explore whether the NFW profile is compatible with the modified scenario and can still be used as the density profile for the DM distribution in galaxy clusters. We are aware that the only way to check whether a different DM distribution would arise in the modified scenario considered here would be to run cosmological simulations based on it, but this is beyond the scope of this work.

It is important to note that, in this study, we assume that the mass distribution in galaxy clusters is dominated by DM, which follows the NFW profile

$$\begin{aligned} \rho _\textrm{NFW}(r) = \frac{\rho _{s}}{\frac{r}{r_{s}} \Big ( 1 + \frac{r}{r_{s}} \Big )^{2}} , \end{aligned}$$
(16)

where \(\rho _{s}\) represents the characteristic density of the halo, while \(r_{s}\) corresponds to the scale radius. Moreover, \(\rho _{s}\) can be written as

$$\begin{aligned} \rho _{s} = \frac{\Delta }{3} \rho _{c} \frac{c_{\Delta }^{3}}{\textrm{ln}(1+ c_{\Delta }) - \frac{c_{\Delta }}{1 + c_{\Delta }}} \,, \end{aligned}$$
(17)

where

$$\begin{aligned} c_{\Delta } = \frac{r_{\Delta }}{r_{s}}\,, \end{aligned}$$
(18)

is the dimensionless concentration parameter, i.e. the ratio between the halo radius \(r_{\Delta }\) and the scale radius \(r_{s}\). Here \(r_{\Delta }\) represents the spherical radius within which the average density equals \(\Delta \) times the critical density \(\rho _c\) of the Universe at the redshift of the lens (here, the cluster). In addition, \(M_{\Delta }\) is the total mass enclosed within the overdensity radius \(r_{\Delta }\)

$$\begin{aligned} M_{\Delta } = \frac{4}{3} \pi r_{\Delta }^{3} \Delta \rho _{c} = 4 \pi \rho _{s} r_{s}^{3} \bigg [ \textrm{ln}(1+ c_{\Delta }) - \frac{c_{\Delta }}{1 + c_{\Delta }} \bigg ]\,. \end{aligned}$$
(19)

For our analysis, we have fixed the value of \(\Delta \) to be 200. As a result, the free NFW parameters we have utilized in our study are \(\{c_{200}, M_{200}\}\).
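The mapping from the fitted parameters \(\{c_{200}, M_{200}\}\) to the derived quantities \(r_{200}\), \(r_s\) and \(\rho _s\) follows directly from Eqs. (17)–(19), together with the standard definition \(\rho _c = 3H^2/(8\pi G_N)\). A minimal sketch is given below; the function names and the unit system (Mpc, km/s, solar masses) are assumptions of this example.

```python
import math

G_NEWTON = 4.30091e-9  # Newton's constant [Mpc (km/s)^2 / M_sun]


def critical_density(hubble_z):
    """rho_c = 3 H^2 / (8 pi G) in M_sun / Mpc^3, with H(z) in km/s/Mpc."""
    return 3.0 * hubble_z ** 2 / (8.0 * math.pi * G_NEWTON)


def nfw_mass_function(c):
    """The bracket ln(1 + c) - c / (1 + c) appearing in Eqs. (17) and (19)."""
    return math.log(1.0 + c) - c / (1.0 + c)


def nfw_from_c_m(c_delta, m_delta, rho_c, delta=200.0):
    """Return (r_delta, r_s, rho_s) from concentration and mass.

    r_delta comes from the first equality of Eq. (19), r_s from Eq. (18),
    and rho_s from Eq. (17). Masses in M_sun, lengths in Mpc.
    """
    r_delta = (3.0 * m_delta / (4.0 * math.pi * delta * rho_c)) ** (1.0 / 3.0)
    r_s = r_delta / c_delta
    rho_s = delta / 3.0 * rho_c * c_delta ** 3 / nfw_mass_function(c_delta)
    return r_delta, r_s, rho_s
```

The second equality of Eq. (19) can be used as a consistency check: inserting the returned \(\rho _s\) and \(r_s\) should give back the input mass.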

2.3 Hot gas

Although it would be possible to use X-ray observations for the CLASH clusters, all of which have archival data [48], we have decided not to use them directly. As is well known, such observables might be biased by non-gravitational local astrophysical phenomena, in contrast to gravitational lensing, which is a clean gravitational probe. Thus, we have decided to sacrifice a bit of precision (X-ray reconstructed masses are generally more precise than some lensing-based ones) for a more robust and less biased reconstruction.

Despite this, we consider hot gas in our modelling of the clusters, and we include \(\rho _{gas}\) in the total density appearing in Eq. (15). From the data at our disposal (as discussed in the next section), we fit the gas densities with a double (truncated) \(\beta \) model

$$\begin{aligned} \rho _\textrm{gas}(r)&= \rho _{e,0}\biggl (\frac{r}{r_0}\biggr )^{-\alpha }\biggl [1 + \biggl (\frac{r}{r_{e,0}}\biggr )^2\biggr ]^{-3\beta _0/2} \nonumber \\&\quad + \rho _{e,1}\biggl [1 + \biggl (\frac{r}{r_{e,1}}\biggr )^2\biggr ]^{-3\beta _1/2}\, . \end{aligned}$$
(20)

Note that the free parameters in this expression are fixed at a preliminary stage, by independent fits, and are not left free in the global analysis.

3 Data

In this study, we have used the data from the CLASH (Cluster Lensing And Supernova survey with Hubble) program\(^{3}\) [49].

The goal of the survey was, among others, to analyse the gravitational lensing properties of a set of massive galaxy clusters selected in the redshift range \(0.18<z<0.90\), in order to precisely determine their mass distributions. The sample covers a wide range of masses, \(5\lesssim M_{200}/10^{14}M_\odot \lesssim 30\), and each cluster has both weak- and strong-lensing data from the Hubble Space Telescope (HST), focusing on the central regions [50, 51], combined with ground-based weak-lensing shear and magnification data from the Subaru Telescope [52]. The radial convergence profiles for 20 clusters were then reconstructed [53]. Out of these 20 clusters, 16 were selected based on X-ray observations, while 4 were chosen through lensing observations.

Our work focuses on a subset of the CLASH sample, consisting of 15 clusters selected based on X-ray observations and 4 clusters chosen through lensing observations, as described in [53]. One of the X-ray-selected clusters, RXJ1532, was excluded from our analysis because its mass reconstruction was based only on wide-field weak-lensing data resulting in too large errors [51]. The clusters in our analysis sample span a redshift range of \(0.187 \le z \le 0.686\), with a median redshift of \(z_\textrm{med}= 0.352\). The resolution limit of the mass reconstruction, determined by the HST lensing data, is typically around 10 arcseconds (\(\approx 35\) h\(^{-1}\) kpc) at the median redshift [54]. It is worth noting that approximately half of the selected clusters in our sample are anticipated to be unrelaxed [54].

In [53], it is mentioned that the average surface mass density (\(\Sigma (R)\)) of the X-ray-selected subset from the CLASH sample is most accurately described by the NFW profile when considering GR. The NFW model is effective in explaining the distribution of dark matter in clusters, as it dominates the overall cluster scale. On the other hand, cluster baryons, including X-ray-emitting hot gas and BCGs, are influenced by non-gravitational and local astrophysical phenomena. Consequently, estimates of the total mass based on hydrostatic methods using X-ray observations are heavily influenced by the dynamic and physical conditions within the cluster. In comparison, gravitational lensing offers a direct means to investigate the projected mass distribution in galaxy clusters.

4 Statistical analysis

In order to constrain the parameters of the non-minimally coupled model and those describing the NFW profile for each cluster, we need to define a \(\chi ^2\) function. We denote by \(\varvec{\theta } = \{c_{200}, \, M_{200}, \, L \}\) the set of free parameters of our theory. In the GR case, this reduces to \(\varvec{\theta } = \{c_{200}, \, M_{200} \}\). The \(\chi ^2\) function is defined as

$$\begin{aligned} \chi ^{2} = \big ( \varvec{\kappa ^{theo}}(\varvec{\theta }) - \varvec{\kappa ^{obs}} \big ) \cdot \textbf{C}^{-1} \cdot \big ( \varvec{\kappa ^{theo}}(\varvec{\theta }) - \varvec{\kappa ^{obs}} \big )\,, \end{aligned}$$
(21)

where \(\varvec{\kappa ^{obs}}\) refers to the data vector related to the observed convergence values. This vector comprises 15 data elements, each corresponding to the measured value of \(\kappa \) in a specific radial bin. The vector \(\varvec{\kappa ^{theo}}(\varvec{\theta })\) contains the theoretical predictions for the convergence of the model, calculated using Eq. (11) for GR and Eq. (15) for our NMC model. Additionally, \(\textbf{C}\) represents the covariance error matrix [46, 53].
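The quadratic form of Eq. (21) can be evaluated without explicitly inverting the covariance matrix. The sketch below uses a Cholesky factorisation \(\textbf{C} = \textbf{A}\textbf{A}^{T}\) and a forward substitution, so that \(\chi ^2 = |\textbf{A}^{-1}(\varvec{\kappa ^{theo}} - \varvec{\kappa ^{obs}})|^2\). This numerical route and the function names are assumptions of this example; the implementation used in this work is not specified here.

```python
import math


def cholesky_lower(cov):
    """Lower-triangular factor A with A A^T = cov (cov must be symmetric
    and positive definite, as expected for a covariance matrix)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = cov[i][i] - s
                if d <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def chi_squared(kappa_theo, kappa_obs, cov):
    """chi^2 of Eq. (21): residual^T C^{-1} residual."""
    residual = [t - o for t, o in zip(kappa_theo, kappa_obs)]
    low = cholesky_lower(cov)
    y = []
    for i in range(len(residual)):      # forward substitution: A y = residual
        s = sum(low[i][k] * y[k] for k in range(i))
        y.append((residual[i] - s) / low[i][i])
    return sum(v * v for v in y)
```

For a diagonal covariance matrix this reduces to the familiar sum of squared residuals divided by the variances.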

We employed our custom Markov Chain Monte Carlo (MCMC) code to minimize the \(\chi ^{2}\) function. To ensure the convergence of the chains, we followed the approach described in [55]. To compare our NMC model with standard GR in a statistically meaningful way, we calculated the Bayesian Evidence [56], \(\mathcal {E}\), for both models and for each cluster using the nested sampling algorithm explained in [57]. Since the choice of priors may significantly impact the Bayesian Evidence [58], we maintained consistency by always choosing the same uninformative flat priors for the parameters.

The posterior distribution \(\mathcal {P}(\varvec{\theta },\mathcal {M}|{D})\), which we get as output from our MCMCs, is defined as

$$\begin{aligned} \mathcal {P}(\varvec{\theta },\mathcal {M}|{D}) = \frac{\mathcal {L}(D|\varvec{\theta },\mathcal {M}) \pi (\varvec{\theta },\mathcal {M})}{\mathcal {E}(D|\mathcal {M})}, \end{aligned}$$
(22)

where \(\varvec{\theta }\) is the set of parameters of our models \(\mathcal {M}\) (GR and NMC), D is the data, \(\mathcal {L}(D|\varvec{\theta },\mathcal {M}) \propto \exp (-\chi ^2 (\varvec{\theta })/2)\) is the likelihood, and \(\pi (\varvec{\theta },\mathcal {M})\) is the prior distribution. Thus, the Evidence is

$$\begin{aligned} \mathcal {E}(D|\mathcal {M})= \int d\varvec{\theta }\mathcal {L}(D|\varvec{\theta },\mathcal {M})\pi (\varvec{\theta },\mathcal {M})\,. \end{aligned}$$
(23)

We calculate the Bayes Factor (\(\mathcal {B}^{\,i}_{j}\)), defined as the ratio of evidence values between two models

$$\begin{aligned} \mathcal {B}^{\,i}_{j} = \frac{\mathcal {E}(D|\mathcal {M}_{i})}{\mathcal {E}(D|\mathcal {M}_{j})}, \end{aligned}$$
(24)

with \(\mathcal {M}_{j}\) being the reference model (in our case, GR). The comparison of models is then conducted employing the empirically calibrated Jeffreys scale [59], which states that: if \(\ln \mathcal {B}^{\,i}_{j} < 1\), the evidence in favor of model i against model j is weak; if \(1< \ln \mathcal {B}^{\,i}_{j} < 2.5\) the evidence is substantial; if \(2.5< \ln \mathcal {B}^{\,i}_{j} < 5\) it is strong; if \(\ln \mathcal {B}^{\,i}_{j} > 5\) it is decisive.
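The classification above can be written as a small helper. In this sketch the thresholds are those quoted in the text; the treatment of the boundary values (1, 2.5 and 5), which the strict inequalities leave undefined, and the function names are assumptions of this example.

```python
def ln_bayes_factor(ln_evidence_i, ln_evidence_j):
    """ln B^i_j from Eq. (24), given the log-evidences of the two models."""
    return ln_evidence_i - ln_evidence_j


def jeffreys_label(ln_b):
    """Strength of the evidence for model i against model j.

    Boundary values are assigned to the stronger category (assumption).
    """
    if ln_b < 1.0:
        return "weak"
    if ln_b < 2.5:
        return "substantial"
    if ln_b < 5.0:
        return "strong"
    return "decisive"
```

Note that any negative value of \(\ln \mathcal {B}^{\,i}_{j}\) also falls in the "weak" category for model i, since it indicates that the reference model j is preferred.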

Moreover, to minimize the impact of the applied priors on the Bayesian comparison, we have also resorted to the Suspiciousness, \( \mathcal {S}^{i}_{j}\), introduced in [60,61,62] and defined as

$$\begin{aligned} \ln \mathcal {S}^i_j = \ln \mathcal {B}^i_j + \mathcal {D}_{KL,i} - \mathcal {D}_{KL,j}\;, \end{aligned}$$
(25)

where \(\mathcal {D}_{KL}\) is the Kullback–Leibler (KL) divergence [63]. An interpretation of the Suspiciousness, similar to the Jeffreys scale for the Bayes Factor, is provided by Fig. 4 of [62]. Specifically, a negative value of \(\ln \mathcal {S}^i_j\) should be interpreted as a sign of tension, and a positive value as a sign of concordance.
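A minimal sketch of Eq. (25) is given below. It relies on the identity \(\mathcal {D}_{KL} = \langle \ln \mathcal {L}\rangle _{\mathcal {P}} - \ln \mathcal {E}\), which follows from Eq. (22) when the KL divergence is taken between posterior and prior. The assumption that weighted posterior samples with their log-likelihoods are available (for example from a nested sampling run), as well as the function names, are choices made for this example.

```python
def posterior_mean_ln_like(ln_likes, weights):
    """Posterior average of ln L from weighted samples."""
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, ln_likes)) / total


def kl_divergence(ln_likes, weights, ln_evidence):
    """D_KL between posterior and prior: <ln L>_posterior - ln E."""
    return posterior_mean_ln_like(ln_likes, weights) - ln_evidence


def ln_suspiciousness(ln_b_ij, d_kl_i, d_kl_j):
    """ln S^i_j of Eq. (25)."""
    return ln_b_ij + d_kl_i - d_kl_j
```

With this identity, \(\ln \mathcal {S}^i_j\) reduces to the difference between the posterior-averaged log-likelihoods of the two models, which makes explicit why it is less sensitive to the prior volume than the Bayes Factor.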

We anticipate here (the reasons behind our choice are discussed in the next section) that in our statistical analysis we have adopted two different approaches: “standard” marginalisation, which works directly on the MCMC outputs; and the profile distribution (PD, an extension of the profile likelihood) approach [64, 65].

Table 1 CLASH clusters ordered by redshift. Our results regarding \(c_{200}, M_{200}\) and L in GR and Modified version (marginal and PD) for the combination of DM and gas. The units for cluster radii are expressed in kpc
Table 2 CLASH clusters ordered by redshift. Our results regarding \(\log r_{200}, \log r_{s}\) and \(\log L\) in GR and Modified version (marginal and PD) for the combination of DM and gas. The units for cluster radii are expressed in kpc

5 Results and discussion

In Table 1 we report the main results of our analysis. First, in the second and third columns we show the values of the NFW parameters \(c_{200}\) and \(M_{200}\) in the GR case, which serves as our benchmark model; as a cross-check of our codes, we find full agreement with results from the literature [53, 66]. In Table 2 we report some secondary (not directly fitted) quantities which are equally important: the characteristic lengths from GR, namely \(r_{200}\) and the NFW scale radius \(r_s\).

In both tables, regarding our NMC model, the results we report are obtained both from standard marginalisation, namely, simply “reading” the posteriors produced as output by the MCMCs, and after applying a PD procedure. The reason for this double analysis is the presence of clear volume effects, which we notice upon closer inspection of the \(\chi ^2\) (or, equivalently, \(\mathcal {L}\)) landscape. Indeed, one can easily see that the results obtained by standard marginalisation for our NMC model exhibit only a statistically insignificant deviation from GR as far as the NFW parameters, \(c_{200}\) and \(M_{200}\), are concerned. On the other hand, when looking carefully at the posterior distribution of the main NMC parameter, the coupling length L, we note that its peak is generally strongly shifted from the value at which we actually get the minimum \(\chi ^2\), which should serve as the best-fit estimate for this parameter. The region around the minimum is poorly explored with respect to the rest of the parameter space because it is quite narrow; thus, volume effects might obscure the physical information we can infer from our analysis and jeopardize our final assessment of the reliability of the NMC model with respect to GR. The PD approach described in [64] is designed precisely to enable statistical inference beyond such volume effects.
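The difference between the two readings of a chain can be illustrated with a toy sketch. It is a simplified profile-likelihood stand-in, not the PD procedure of [64]: for one parameter, the samples are binned, and in each bin we record both the number of samples (the marginal histogram) and the smallest \(\chi ^2\) found (the profile). The function name, the binning and the inputs are assumptions of this example.

```python
import math


def marginal_and_profile(samples, chi2_values, lo, hi, n_bins):
    """Bin one parameter of a chain over the range [lo, hi).

    Returns (counts, best_chi2): counts[k] is the number of samples in
    bin k (marginalisation), best_chi2[k] is the minimum chi^2 among them
    (profile), or math.inf if the bin is empty.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    best_chi2 = [math.inf] * n_bins
    for value, chi2 in zip(samples, chi2_values):
        if not (lo <= value < hi):
            continue  # ignore samples outside the binned range
        k = int((value - lo) / width)
        counts[k] += 1
        best_chi2[k] = min(best_chi2[k], chi2)
    return counts, best_chi2
```

A narrow region with a low \(\chi ^2\) contributes few samples and is therefore almost invisible in `counts`, while it dominates `best_chi2`; this is the kind of volume effect discussed above.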

Before any conclusion can be drawn, it is important to highlight that the value of \(r_{200}\), as can be checked from Table 2, does not change in a statistically significant way when moving from GR to the NMC case. This is important, because it means that the scale at which the concentration and the mass are estimated is the same in both cases and, thus, any difference can be consistently compared.

The difference between the marginalisation and the PD approach is made clear in our figures. For example, in Fig. 1, we present a comparison between the values of \(c_{200}\) and \(M_{200}\) obtained from GR and from our NMC model. The left panels illustrate the comparison in the marginalisation case, while the right panels show the PD outcomes.

On closer inspection of these figures, a notable trend emerges. In general, the values of \(c_{200}\) and \(M_{200}\) from the marginal analysis align more closely with those derived from GR, while the PD results deviate more markedly from the GR predictions. More specifically, in the PD analysis both the concentration and the mass of the NFW profile are systematically lower than in the GR case. We highlight this trend further in Fig. 2, where we omit error bars for the sake of clarity and connect, for each cluster, GR (solid circles) to NMC marginalisation (empty circles) results with solid lines, and NMC marginalisation results to NMC PD ones (bold empty circles) with dashed lines. Thus, the NMC model, at least within the internal \(r_{200}\) region, requires less massive and less concentrated dark matter haloes in order to explain lensing data.

Fig. 1 Comparison of the constraints on dark matter parameters for \(c_{200}\) and for \(M_{200}\) obtained from GR and from the NMC model considered in this work. In the left panels, we plot results from the marginalisation procedure; in the right ones, we show results from the profile distribution procedure

It is now interesting to look at the parameter which actually characterizes the NMC model, the interaction length L (for numerical reasons, we have chosen to work with \(\log {L}\)). In the case of the marginalized analysis, the estimated value of L appears to be “relatively” small, where the qualifier “relatively” should be clarified. Indeed, we are dealing with clusters whose sizes are of the order of a few Mpc, while L ranges from 0.1 to \(10^2\) kpc, with a typical value \(\sim 10\) kpc. The main consequence which could be drawn from this result is that the correction to the Poisson equation introduced by the NMC model is just a small “perturbation” to the standard one. This result prompts further investigation. As a check, we can compare our results with those of [67], which addresses similar research objectives, albeit with a different data set. Interestingly, their value of L is also very small and of the same order as ours, as shown in their Fig. 4.

When moving to the PD analysis, things change substantially. In some cases the PD results are in full disagreement with the marginalized ones: for example, in the case of MACS0416 we move from \(L \sim 10^{-2}\) kpc in the marginalized case to \(L \sim 1\) Mpc in the PD one. This is a general trend: from the PD analysis, which, we recall, highlights the behaviour of the posterior around the maximum of the likelihood, we get systematically larger values of L than in the marginalized analysis.

To gain further insight, in Fig. 3 we compare L with the NFW parameters, \(c_{200}\) and \(M_{200}\), and with the two characteristic lengths from GR, \(r_{200}\) and \(r_s\); results from the marginalized analysis are shown as solid circles and those from the PD analysis as empty circles.

In the top left plot, we see again the shift towards smaller values of the concentration obtained in the PD analysis but, more strikingly, we see an almost perfect anti-correlation between \(c_{200}\) and L. This is not surprising, however, because the corrective term induced by the NMC model in the Poisson equation, Eq. (5), contains the combination \(L^2\,\rho _s\), with \(\rho _s\) depending mostly on \(c_{200}\), as in Eq. (17). A similar anti-correlation with the mass is visible in the top right panel, although much weaker. We also notice that it goes in the opposite direction with respect to what is shown in Fig. 4 of [67], although that reference does not seem to have performed the PD analysis.
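The origin of the anti-correlation can be sketched numerically. The example below assumes that Eq. (17) is the standard NFW relation \(\rho _s = (200/3)\,\rho _c\, c_{200}^3/[\ln (1+c_{200}) - c_{200}/(1+c_{200})]\); the function names, the reference point (L = 10 at \(c_{200}=5\)) and the list of concentrations are hypothetical. It computes the L needed to keep \(L^2\,\rho _s\) fixed as the concentration changes.

```python
import math


def rho_s_over_rho_crit(c200, delta=200.0):
    """NFW characteristic density in units of the critical density
    (standard relation, assumed here to coincide with Eq. (17))."""
    return (delta / 3.0) * c200 ** 3 / (math.log(1.0 + c200) - c200 / (1.0 + c200))


def length_at_fixed_product(c200, product):
    """Length L that keeps L^2 * rho_s equal to `product` (arbitrary units)."""
    return math.sqrt(product / rho_s_over_rho_crit(c200))


# Hypothetical reference point: L = 10 (arbitrary units) at c200 = 5.
reference_product = 10.0 ** 2 * rho_s_over_rho_crit(5.0)

concentrations = [2.0, 3.0, 5.0, 8.0]
lengths = [length_at_fixed_product(c, reference_product) for c in concentrations]
for c, length in zip(concentrations, lengths):
    print(f"c200 = {c:3.1f}  ->  L = {length:6.2f}")
```

Since \(\rho _s\) grows steeply with \(c_{200}\), a fixed product forces L to decrease as the concentration increases, which is the direction of the trend seen in the top left panel.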

Fig. 2

Comparison of the constraints on dark matter parameters \(\{c_{200},M_{200}\}\) obtained from GR and from the NMC model considered in this work. Solid circles are GR; empty circles are from NMC after marginalisation; bold empty circles are from NMC after profile distribution procedure. Solid lines connect GR and NMC after marginalisation; dashed lines connect NMC after marginalisation with NMC after profile distribution procedure. Error bars are omitted for the sake of clarity

Fig. 3

Comparison between the NMC characteristic parameter L and the NFW parameters. Solid circles represent the marginalized analysis, and empty circles the PD one

In the bottom panels of Fig. 3 we have the most interesting finding. On the left, we compare L with \(r_{200}\), and we notice how moving from the marginalized to the PD analysis leads from \(L \ll r_{200}\) to \(L \sim r_{200}\). The same pattern is even clearer on the right, where the correspondence between L and \(r_s\) seems to be almost perfect, with a tentative weighted fit producing \(\log L \approx (0.964 \pm 0.034) \log r_{s} \). If confirmed by further investigations, these results would provide the NFW scale radius \(r_{s}\) with a sort of “natural” explanation, connecting it to the interaction length L of dark matter described as an NMC fluid.
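A fit of this form can be sketched as a weighted least-squares line through the origin. The details of the fit are not specified in the text, so the example assumes a zero-intercept model \(\log L = a \log r_s\) with Gaussian uncertainties on \(\log L\) only; the function name and the data points are hypothetical placeholders, not the values measured for the clusters.

```python
import math


def weighted_slope_through_origin(x, y, sigma_y):
    """Weighted least-squares slope of y = a * x, with weights 1 / sigma_y**2.

    Returns the slope and its 1-sigma uncertainty.
    """
    weights = [1.0 / s ** 2 for s in sigma_y]
    sxx = sum(w * xi * xi for w, xi in zip(weights, x))
    sxy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    slope_error = 1.0 / math.sqrt(sxx)
    return slope, slope_error


# Hypothetical (log r_s, log L, error on log L) values, for illustration only.
log_rs = [2.3, 2.5, 2.6, 2.8, 2.9]
log_l = [2.20, 2.45, 2.50, 2.70, 2.75]
sigma_log_l = [0.20, 0.15, 0.30, 0.20, 0.25]

slope, slope_error = weighted_slope_through_origin(log_rs, log_l, sigma_log_l)
print(f"log L = ({slope:.3f} +/- {slope_error:.3f}) log r_s")
```

A slope compatible with unity in such a fit corresponds to \(L \propto r_s\) with a proportionality constant of order one.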

While the anti-correlation between \(c_{200}\) and L can be easily explained, the latter correlation is trickier, and more interesting. Expressing the NFW profile and the NMC correction in terms of the dimensionless radius \(x=r/r_s\), we have

$$\begin{aligned} \rho _{NFW}(x) \propto \frac{1}{x \left( 1+x\right) ^2} + \frac{L^2}{r^{2}_s} \frac{6}{x\left( 1+x\right) ^4}\,. \end{aligned}$$
(26)

This would also be expected from dimensional considerations and from the second-derivative nature of the NMC correction. But in no way does it imply the linear correlation between L and \(r_{s}\). If the NMC contributed in a negligible way, it would be more logical, and statistically favoured, to expect small values of L. Moreover, the correction is itself a function of the scale r.
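For reference, the radial shape of the second term in Eq. (26) can be checked by applying the Laplacian to the dimensionless NFW shape. This is a short verification, assuming spherical symmetry and that the NMC term acts on the density as \(L^2 \nabla ^2 \rho \), as in Eq. (5); the auxiliary function f(x) is introduced here only for this check:

$$\begin{aligned} f(x)&= \frac{1}{x\left( 1+x\right) ^2}\,, \\ x^2 \frac{df}{dx}&= -\frac{1}{\left( 1+x\right) ^2} - \frac{2x}{\left( 1+x\right) ^3} = -\frac{1+3x}{\left( 1+x\right) ^3}\,, \\ \frac{1}{x^2}\frac{d}{dx}\left( x^2 \frac{df}{dx}\right)&= \frac{1}{x^2}\,\frac{6x}{\left( 1+x\right) ^4} = \frac{6}{x\left( 1+x\right) ^4}\,. \end{aligned}$$

Since \(\nabla ^2 = r_s^{-2}\, \nabla ^2_x\), the correction \(L^2 \nabla ^2 \rho \) carries the prefactor \(L^2/r_s^2\) relative to the NFW term, so the ratio \(L/r_s\) sets the size of the correction but is not fixed by it.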

Finally, in Fig. 4 we show the variation in the M(r) distribution from GR to the NMC model at the best fit derived from the PD statistical analysis. Note that \(\Delta M_{200}\) is defined as \((M_{200,NMC}-M_{200,GR})/M_{200,GR}\) and that, to account for the change in scale due to the differences in the estimated lengths, we normalize the distances from the center as \(r/r_{200}\). It is evident that the NMC model requires much less, and much less concentrated, matter in most of the cases we have considered: in some cases more than \(70\%\) less dark matter than GR from the inner to the outer regions; in many cases, half of the mass in the inner regions, with a less pronounced difference \((\approx 10\%)\) at outer radii; a few cases seem to be outliers and deviate from this general trend.
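The quantity plotted in the right panel of Fig. 4 can be sketched as follows. The example assumes that both curves are NFW enclosed-mass profiles evaluated with the respective best-fit \(\{M_{200}, c_{200}\}\) and that each radius is expressed in units of the halo's own \(r_{200}\); the function names and the two parameter pairs are hypothetical and do not correspond to any cluster in the sample.

```python
import math


def nfw_shape(x):
    """Dimensionless NFW enclosed-mass function m(x) = ln(1 + x) - x / (1 + x)."""
    return math.log(1.0 + x) - x / (1.0 + x)


def nfw_mass(r_over_r200, m200, c200):
    """NFW mass enclosed within r, with r in units of the halo's own r200."""
    return m200 * nfw_shape(c200 * r_over_r200) / nfw_shape(c200)


def delta_m(r_over_r200, gr_fit, nmc_fit):
    """Relative mass difference (M_NMC - M_GR) / M_GR at a normalized radius."""
    m_gr = nfw_mass(r_over_r200, *gr_fit)
    m_nmc = nfw_mass(r_over_r200, *nmc_fit)
    return (m_nmc - m_gr) / m_gr


# Hypothetical best-fit (M200 in solar masses, c200) pairs, for illustration only.
gr_fit = (1.0e15, 4.0)
nmc_fit = (0.8e15, 2.5)

normalized_radii = [0.1, 0.3, 1.0]
mass_differences = [delta_m(x, gr_fit, nmc_fit) for x in normalized_radii]
for x, dm in zip(normalized_radii, mass_differences):
    print(f"r/r200 = {x:3.1f}  ->  Delta M = {dm:+.3f}")
```

With this normalization the value at \(r/r_{200}=1\) reduces to the relative difference of the two \(M_{200}\) values, while a lower concentration makes the deficit larger in the inner regions than in the outer ones.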

Fig. 4

Difference in the mass distribution between GR and NMC model from the PD statistical analysis. We plot M(r) (left panel; dotted lines for GR, solid lines for the NMC model) and \(\Delta M \equiv (M_{NMC}-M_{GR}) / M_{GR}\) (right panel) as functions of the normalized distance from the center, \(r/r_{200}\)

6 Conclusions

Our investigation centers on exploring the scenario of a non-minimal coupling between dark matter (modeled as a perfect fluid) and gravity. As highlighted earlier, this coupling introduces alterations to the Einstein equations, extending its impact to the Planck mass and the energy-momentum tensor of a fluid, given their reliance on the curvature scale.

By adapting the action and taking the Newtonian limit for the disformal case, one reaches the modified Poisson equation Eq. (5), characterized by an additional term \(L^2 \nabla ^2\rho \). In this equation, the first term represents the density of dark matter and gas, while the additional term involves the coupling length L and the NFW density \(\rho \).

Leveraging robust strong and weak gravitational lensing data from the CLASH program, we tested the NMC model across 19 high-mass galaxy clusters. It is noteworthy that our analysis extends beyond dark matter to include the density of gas (X-ray). While the option to incorporate gas data from the CLASH dataset was available, we exercised caution, opting not to consider it due to potential biases.

Our analytical methodology employs two approaches for presenting findings: Marginalisation and Profile Distribution. Recognizing the influence of volume effects in the posterior distribution, we find that the PD is more suitable when working with data. Applying the PD reveals that dark matter necessitates lower mass and concentration to align with observed lensing data. Furthermore, a noteworthy correlation emerges between the coupling length L and the standard NFW scale parameter \(r_s\), prompting further exploration of the connection between them in future research.

In our forthcoming research, we aim to expand our investigations beyond the exclusive consideration of dark matter and gas. Our focus will encompass additional components, such as galaxies, enhancing our understanding of dark matter characteristics.