Abstract
Weak gravitational lensing of background galaxies provides a direct probe of the projected matter distribution in and around galaxy clusters. Here, we present a self-contained pedagogical review of cluster–galaxy weak lensing, covering a range of topics relevant to its cosmological and astrophysical applications. We begin by reviewing the theoretical foundations of gravitational lensing from first principles, with special attention to the basics and advanced techniques of weak gravitational lensing. We summarize and discuss key findings from recent cluster–galaxy weak-lensing studies on both observational and theoretical grounds, with a focus on cluster mass profiles, the concentration–mass relation, the splashback radius, and implications from extensive mass-calibration efforts for cluster cosmology.
1 Introduction
The propagation of light rays from a distant source to the observer is governed by the gravitational field of local inhomogeneities, as well as by the global geometry of the universe (Schneider et al. 1992). Hence, the images of background sources carry the imprint of gravitational lensing by intervening cosmic structures. Observations of gravitational lensing phenomena can thus be used to study the mass distribution of cosmic objects dominated by dark matter and to test models of cosmic structure formation (Blandford and Narayan 1992).
Galaxy clusters represent the largest class of self-gravitating systems formed in the universe, with typical masses of \(M\sim 10^{14-15}h^{-1}M_\odot \). In the context of the standard structure formation scenario, cluster halos correspond to rare massive local peaks in the primordial density perturbations (e.g., Tinker et al. 2010). Galaxy clusters act as powerful cosmic lenses, producing a variety of detectable lensing effects from strong to weak lensing (Kneib and Natarajan 2011), including deflection, shearing, and magnifying of the images of background sources (e.g., Umetsu et al. 2016). The critical advantage of cluster gravitational lensing is its ability to study the mass distribution of individual and ensemble systems independent of assumptions about their physical and dynamical state (e.g., Clowe et al. 2006).
Weak gravitational lensing is responsible for the weak shape distortion, or shear, and magnification of the images of background sources due to the gravitational field of intervening massive objects and large-scale structure (Bartelmann et al. 2001; Schneider 2005; Umetsu 2010; Hoekstra 2013; Mandelbaum 2018). Weak shear lensing by galaxy clusters gives rise to elliptical distortions of up to a few tens of percent in images of background sources. Thus, the weak shear lensing signal, as measured from small but coherent image distortions in galaxy shapes, can provide a direct measure of the projected mass distribution of galaxy clusters (e.g., Kaiser and Squires 1993; Fahlman et al. 1994; Okabe and Umetsu 2008). On the other hand, lensing magnification can influence the observed surface number density of background galaxies seen behind clusters, by enhancing their apparent fluxes and expanding the area of sky (e.g., Broadhurst et al. 1995, 2005b; Taylor et al. 1998; Umetsu et al. 2011b; Chiu et al. 2020). The former effect increases the source counts above the limiting flux, whereas the latter reduces the effective observing area in the source plane, thus decreasing the observed number of sources per unit solid angle. The net effect, known as magnification bias, depends on the intrinsic faint-end slope of the source luminosity function.
In this paper, we present a self-contained pedagogical review of weak gravitational lensing of background galaxies by galaxy clusters (cluster–galaxy weak lensing), highlighting recent advances in our theoretical and observational understanding of the mass distribution in galaxy clusters. We shall begin by reviewing the theoretical foundations of gravitational lensing (Sect. 2), with special attention to the basics and advanced techniques of cluster–galaxy weak lensing (Sects. 3, 4, and 5). Then, we highlight and discuss key findings from recent cluster–galaxy weak-lensing studies (Sect. 6), with a focus on cluster mass distributions (Sect. 6.1), the concentration–mass relation (Sect. 6.2), the splashback radius (Sect. 6.3), and implications from extensive mass-calibration efforts for cluster cosmology (Sect. 6.4). Finally, conclusions are given in Sect. 7.
There have been a number of reviews of relevant subjects (e.g., Blandford and Narayan 1992; Narayan and Bartelmann 1996; Mellier 1999; Hattori et al. 1999; Umetsu et al. 1999; Van Waerbeke and Mellier 2003; Schneider 2005; Kneib and Natarajan 2011; Hoekstra 2013; Futamase 2015; Mandelbaum 2018). For general treatments of gravitational lensing, we refer the reader to Schneider et al. (1992). For a general review of weak gravitational lensing, see Bartelmann and Schneider (2001) and Schneider (2005). For a comprehensive review of cluster lensing, see Kneib and Natarajan (2011). For a pedagogical review on strong lensing in galaxy clusters, see Hattori et al. (1999a).
Throughout this paper, we denote the present-day density parameters of matter, radiation, and \({\varLambda }\) (a cosmological constant) in critical units as \(\varOmega _\mathrm {m}, \varOmega _\mathrm {r}\), and \(\varOmega _{\varLambda }\), respectively (see, e.g., Komatsu et al. 2009). Unless otherwise noted, we assume a concordance \({\varLambda }\) cold dark matter (\({\varLambda }\hbox {CDM}\)) cosmology with \(\varOmega _\mathrm {m}=0.3\), \(\varOmega _{\varLambda }=0.7\), and a Hubble constant of \(H_0 = 100\,h\,\hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\) with \(h=0.7\). We denote the mean matter density of the universe at a particular redshift z as \(\overline{\rho }(z)\) and the critical density as \(\rho _\mathrm {c}(z)\). The present-day value of the critical density is \(\rho _\mathrm {c,0}=3H_0^2/(8\pi G)\approx 1.88\times 10^{-29}h^2\,\hbox {g cm}^{-3}\approx 2.78\times 10^{11}h^2 M_\odot \,\hbox {Mpc}^{-3}\), with G the gravitational constant. We use the standard notation \(M_{\varDelta _\mathrm {c}}\) or \(M_{\varDelta _\mathrm {m}}\) to denote the mass enclosed within a sphere of radius \(r_{\varDelta _\mathrm {c}}\) or \(r_{\varDelta _\mathrm {m}}\), within which the mean density equals \(\varDelta _\mathrm {c} \times \rho _\mathrm {c}(z)\) or \(\varDelta _\mathrm {m} \times \overline{\rho }(z)\) at a particular redshift z. That is, \(M_{\varDelta _\mathrm {c}}=(4\pi /3)\varDelta _\mathrm {c}\rho _\mathrm {c}(z)r_{\varDelta _\mathrm {c}}^3\) and \(M_{\varDelta _\mathrm {m}}=(4\pi /3)\varDelta _\mathrm {m}\overline{\rho }(z)r_{\varDelta _\mathrm {m}}^3\). We generally denote three-dimensional radial distances as r and reserve the symbol R for projected radial distances. Unless otherwise noted, we use projected densities (e.g., \(\varSigma (R)\)) and distances (e.g., R) both defined in physical (not comoving) units. All quoted errors are \(1\sigma \) confidence levels (CL) unless otherwise stated.
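As a numerical check of these conventions, the following minimal Python sketch evaluates the present-day critical density and a spherical-overdensity mass. The function names and the physical constants (CODATA-like values for G, the IAU nominal solar mass, and the megaparsec in metres) are choices made for this illustration and are not part of the review; a flat cosmology without radiation is assumed for \(\rho _\mathrm {c}(z)\).

```python
import math

G = 6.67430e-11                 # gravitational constant [m^3 kg^-1 s^-2]
MPC_M = 3.0856775814913673e22   # one megaparsec [m]
MSUN_KG = 1.98847e30            # solar mass [kg]

def critical_density_today(h=1.0):
    """rho_c,0 = 3 H0^2 / (8 pi G) in M_sun Mpc^-3 for H0 = 100 h km/s/Mpc."""
    h0_si = 100.0 * h * 1.0e3 / MPC_M                 # H0 in s^-1
    rho_si = 3.0 * h0_si**2 / (8.0 * math.pi * G)     # kg m^-3
    return rho_si * MPC_M**3 / MSUN_KG

def critical_density(z, h=0.7, omega_m=0.3, omega_l=0.7):
    """rho_c(z) in M_sun Mpc^-3 (assumption: flat LCDM, radiation neglected)."""
    e2 = omega_m * (1.0 + z)**3 + omega_l
    return critical_density_today(h) * e2

def overdensity_mass(r_delta_mpc, z, delta_c=200.0, **cosmo):
    """M_Delta_c = (4 pi / 3) Delta_c rho_c(z) r^3, with r in physical Mpc."""
    rho_c = critical_density(z, **cosmo)
    return (4.0 * math.pi / 3.0) * delta_c * rho_c * r_delta_mpc**3

rho_c0 = critical_density_today(h=1.0)   # close to 2.78e11 h^2 M_sun Mpc^-3
m200c = overdensity_mass(2.0, z=0.0)     # mass within r_200c = 2 Mpc at z = 0
```

With \(h=1\) the first function reproduces the quoted \(2.78\times 10^{11}h^2 M_\odot \,\hbox {Mpc}^{-3}\) to within rounding, and a radius of 2 Mpc at \(z=0\) gives a mass of order \(10^{15}M_\odot \), typical of a massive cluster.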
2 Theory of gravitational lensing
The local universe appears to be highly inhomogeneous on a wide range of scales from stars, galaxies, through galaxy groups and clusters, to forming superclusters, large-scale filaments, and cosmic voids. The propagation of light from a far-background source is thus influenced by the gravitational field caused by such local inhomogeneities along the line of sight. In general, a complete description of the light propagation in an arbitrary curved spacetime is a complex theoretical problem. However, a much simpler description is possible under a wide range of astrophysically relevant circumstances, which is referred to as the gravitational lensing theory (e.g., Schneider et al. 1992; Bartelmann and Schneider 2001; Kneib and Natarajan 2011). This section reviews the basics of gravitational lensing theory to provide a basis and framework for cluster lensing studies, with an emphasis on weak gravitational lensing.
2.1 Bending of light in an asymptotically flat spacetime
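To begin with, let us consider the bending of light in the weak-field regime of an asymptotically flat spacetime in the framework of general relativity. Specifically, we assume an isolated and stationary mass distribution (Schneider et al. 1992). Then, the metric tensor \(g_{\mu \nu }\) (\(\mu ,\nu =0,1,2,3\)) of the perturbed spacetime can be written as:
\[ \text {d}s^2 = g_{\mu \nu }\,\text {d}x^\mu \text {d}x^\nu = -\left( 1+\frac{2\varPhi _\mathrm {N}}{c^2}\right) c^2\text {d}t^2 + \left( 1-\frac{2\varPhi _\mathrm {N}}{c^2}\right) \left[ (\text {d}x^1)^2+(\text {d}x^2)^2+(\text {d}x^3)^2\right] , \tag{1} \]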
where \((x^\mu )=(ct,x^1,x^2,x^3)\) are the four spacetime coordinates, \(\varPhi _\mathrm {N}\) is the Newtonian gravitational potential in a weak-field regime \(|\varPhi _\mathrm {N}|/c^2\ll 1\), and c is the speed of light in vacuum. We consider the metric given by Eq. (1) to be the sum of a background metric \(g_{\mu \nu }^{(\mathrm {b})}\) and a small perturbation denoted by \(h_{\mu \nu }\), that is, \(g_{\mu \nu }=g_{\mu \nu }^{(\mathrm {b})} + h_{\mu \nu }\) with \(|h_{\mu \nu }|\ll 1\).
To the first order in \(\varPhi _\mathrm {N}/c^2\), we have \(g_{\mu \nu }^{(\mathrm {b})}=\eta _{\mu \nu }=\mathrm {diag}(-1,1,1,1)\) and \(h_{\mu \nu }=\mathrm {diag}(-2\varPhi _\mathrm {N},-2\varPhi _\mathrm {N},-2\varPhi _\mathrm {N},-2\varPhi _\mathrm {N})/c^2\). The inverse metrics \(g^{\mu \nu }\) and \(g^{(\mathrm {b})\mu \nu }\) are defined by \(g^{\mu \rho } g_{\rho \nu }= \delta ^{\mu }_{\nu }\) and \(g^{(\mathrm {b})\mu \rho } g^{(\mathrm {b})}_{\rho \nu }= \delta ^{\mu }_{\nu }\), with \(\delta ^\mu _\nu \) the Kronecker delta symbol in four dimensions. Then, to the first order in h, we have \(g^{\mu \nu }= g^{(\mathrm {b})\mu \nu }-h^{\mu \nu }\), where \(h^{\mu \nu }\) is defined by \(h^{\mu \nu }\equiv g^{(\mathrm {b})\mu \rho } g^{(\mathrm {b})\nu \sigma } h_{\rho \sigma } =\eta ^{\mu \rho }\eta ^{\nu \sigma }h_{\rho \sigma }\).
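The propagation of light is described by the null geodesic equations:
\[ k^\mu \equiv \frac{\text {d}x^\mu }{\text {d}\lambda }, \qquad g_{\mu \nu }k^\mu k^\nu = 0, \qquad \frac{\text {d}k^\mu }{\text {d}\lambda } + \varGamma ^\mu _{\nu \lambda }k^\nu k^\lambda = 0, \tag{2} \]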
where \(k^\mu \) is the four-momentum, \(\lambda \) is the affine parameter, and \(\varGamma ^\mu _{\nu \lambda }\) denotes the Christoffel symbol, \(\varGamma ^{\mu }_{\nu \rho } = (1/2) g^{\mu \lambda }\left( g_{\lambda \nu ,\rho } + g_{\lambda \rho , \nu } - g_{\nu \rho ,\lambda }\right) \), with \(g_{\mu \nu }^{(\mathrm {b})}=\eta _{\mu \nu }\) and \(\varGamma ^{(\mathrm {b})\mu }_{\nu \rho }=0\) in the background Minkowski spacetime. For a light ray propagating along the \(x^3\)-direction in the background metric, the photon four-momentum \(k^{(\mathrm {b})\mu }\) and the unperturbed orbit \(x^{(\mathrm {b})\mu }\) are given by \(k^{(\mathrm {b})\mu }={\text{d}}x^{(\mathrm {b})\mu }/{\text{d}}\lambda =(1,0,0,1)\) and \(x^{(\mathrm {b})\mu }=(\lambda ,0,0,\lambda )\).
Now, we consider the light ray propagation in a perturbed spacetime. To this end, we express the perturbed orbit \(x^\mu (\lambda )\) as a sum of the unperturbed path \(x^{(\mathrm {b})\mu }(\lambda )\) and the deviation vector \(\delta x^\mu (\lambda )\):
\[ x^\mu (\lambda ) = x^{(\mathrm {b})\mu }(\lambda ) + \delta x^\mu (\lambda ). \tag{3} \]
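Without loss of generality, we can take the deflection angle to lie in the \(x^3\)–\(x^1\) plane with \(x^2=0\), and we denote \((x^1, x^3) = (x^\perp , x^{\parallel })\). In the weak-field limit of \(|\varPhi _\mathrm {N}|/c^2\ll 1\), the impact parameter b of the incoming light ray is much greater than the Schwarzschild radius of the deflector with mass M, that is, \(b\gg 2GM/c^2\). Then, with \(\varGamma ^{(\mathrm {b})\mu }_{\nu \rho }=0\), the linearized null geodesic equations are written as:
\[ k^\mu (\lambda ) = k^{(\mathrm {b})\mu } + \delta k^\mu (\lambda ), \qquad \frac{\text {d}(\delta k^\mu )}{\text {d}\lambda } = -\delta \varGamma ^\mu _{\nu \rho }\,k^{(\mathrm {b})\nu }k^{(\mathrm {b})\rho } + O(h^2). \tag{4} \]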
The perturbed Christoffel symbol is \(\delta \varGamma ^{\mu }_{\nu \rho }= (1/2) \eta ^{\mu \lambda }\left( h_{\lambda \nu ,\rho } + h_{\lambda \rho , \nu } - h_{\nu \rho ,\lambda } \right) + O(h^2)\). Choosing the boundary conditions in the in-state (\(\lambda \rightarrow -\infty \)) as \(\delta k^\mu (-\infty )=0\), we integrate the linearized null geodesic equations (Eq. (4)) to obtain the following equations for the spatial components in the out-state (\(\lambda \rightarrow +\infty \)):
\[ \delta k^\perp (+\infty ) = -\frac{2}{c^2}\int _{-\infty }^{+\infty }\frac{\partial \varPhi _\mathrm {N}}{\partial x^\perp }\,\text {d}\lambda , \qquad \delta k^\parallel (+\infty ) = 0, \tag{5} \]
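where \(\lambda =x^{\parallel }(\lambda )+O(h)\). Taking the unperturbed path, we obtain the bending angle \(\hat{\alpha }\) in the small-angle scattering limit (\(|\hat{\alpha }|\ll 1\)) as
\[ \hat{\alpha } \equiv -\delta k^\perp (+\infty ) \simeq \frac{2}{c^2}\int _{-\infty }^{+\infty }\left. \frac{\partial \varPhi _\mathrm {N}}{\partial x^\perp }\right| _{x^\perp =b}\text {d}x^\parallel , \tag{6} \]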
which is known as the Born approximation. For a point mass M and a light ray passing at a distance r (the impact parameter), this yields an explicit expression for the bending angle of \(\hat{\alpha }\simeq 4GM/(rc^2) = 1.75^{\prime\prime} (M/M_\odot )(r/R_\odot )^{-1}\). General relativity predicts a deflection angle twice as large as that predicted by Newtonian physics. Einstein’s prediction for the solar deflection of light is verified within \(\sim 0.1\%\) (e.g., Lebach et al. 1995).
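The quoted value of \(1.75^{\prime\prime}\) for a light ray grazing the solar limb can be checked with a few lines of Python. This is a minimal sketch: the function name is hypothetical and the constants (CODATA-like G, the IAU nominal solar mass and radius) are assumptions of this illustration.

```python
import math

G = 6.67430e-11      # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8     # speed of light [m/s]
M_SUN = 1.98847e30   # solar mass [kg]
R_SUN = 6.957e8      # nominal solar radius [m]
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi

def point_mass_deflection(mass_kg, impact_m):
    """Bending angle 4GM/(r c^2) in radians for a point mass."""
    return 4.0 * G * mass_kg / (impact_m * C**2)

# Deflection of a ray grazing the Sun, in arcseconds.
alpha_sun = point_mass_deflection(M_SUN, R_SUN) * RAD_TO_ARCSEC
# Half of the relativistic value, i.e. the Newtonian expectation.
alpha_newton = 0.5 * alpha_sun
```

The inverse scaling with r means that a ray passing at twice the solar radius is deflected by half as much.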
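The null geodesic condition leads to \(\delta k^0(\lambda )=-2\varPhi _\mathrm {N}(\lambda )/c^2 + O(h^2)\), or \(c\,\text {d}t/\text {d}\lambda =1-2\varPhi _\mathrm {N}(\lambda )/c^2 + O(h^2)>1\) for \(\varPhi _\mathrm {N}<0\). The gravitational time delay \(\varDelta t_\mathrm {grav}\), with respect to the unperturbed light propagation, is thus given by the following integral along the light path:
\[ \varDelta t_\mathrm {grav} = -\frac{2}{c^3}\int \varPhi _\mathrm {N}\,\text {d}\lambda . \tag{7} \]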
Note that there is an additional time delay due to a change in the geometrical path length caused by gravitational deflection (see Sect. 2.6.1).
2.2 Lens equation
Let us consider the situation illustrated in Fig. 1. A light ray propagates from a far-distant source (S) at the position \(\varvec{\eta }\) in the source plane to an observer (O), passing the position \(\varvec{\xi }\) in the lens plane, in which the light is deflected by a bending angle \(\hat{{\varvec{\alpha }}}\). Here, the source and lens planes are defined as planes perpendicular to the optical axis at the distance of the source and the lens, respectively. The exact definition of the optical axis does not matter, because the angular scales involved are very small. The angle between the optical axis and the unlensed source (S) position is \({\varvec{\beta }}\), and the angle between the optical axis and the image (I) is \(\varvec{\theta }\). The angular diameter distances between the observer and the lens, the observer and the source, and the lens and the source, are denoted by \(D_l, D_s\), and \(D_{ls}\), respectively.
As illustrated in Fig. 1, we have the following geometrical relation: \(\varvec{\eta }=(D_s/D_l)\varvec{\xi }-D_{ls}\hat{{\varvec{\alpha }}}(\varvec{\xi })\). Equivalently, this is translated into the relation between the angular source and image positions, \({\varvec{\beta }}=\varvec{\eta }/D_s\) and \(\varvec{\theta }=\varvec{\xi }/D_l\), as:
\[ {\varvec{\beta }} = \varvec{\theta } - \frac{D_{ls}}{D_s}\hat{{\varvec{\alpha }}} \equiv \varvec{\theta } - {\varvec{\alpha }}(\varvec{\theta }), \tag{8} \]
where we defined the reduced bending angle, or the deflection field (Broadhurst et al. 2005a), \({\varvec{\alpha }}(\varvec{\theta })=(D_{ls}/D_s)\hat{{\varvec{\alpha }}}\) in the last equality. Equation (8) is referred to as the lens equation, or the ray-tracing equation.
In general, the lens equation is nonlinear with respect to the image position \(\varvec{\theta }\), so that it may have multiple solutions \(\varvec{\theta }\) for a given source position \({\varvec{\beta }}\). This corresponds to multiple imaging of a background source (see Hattori et al. 1999a; Kneib and Natarajan 2011). An illustration of the typical circularly symmetric lens system is shown in Fig. 2. We refer to Keeton (2001) for a review of various families of mass models for gravitational lensing.
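To make the multiple-imaging statement concrete, the sketch below solves the lens equation \({\varvec{\beta }}=\varvec{\theta }-{\varvec{\alpha }}(\varvec{\theta })\) along one axis for a point-mass lens. The point-mass model and the Einstein radius \(\theta _\mathrm {E}\), defined here through \(\theta _\mathrm {E}^2=4GMD_{ls}/(c^2D_lD_s)\) so that the reduced deflection is \(\alpha (\theta )=\theta _\mathrm {E}^2/\theta \), are assumptions introduced for this illustration and are not discussed in this section; the function names are hypothetical.

```python
import math

def point_mass_images(beta, theta_e):
    """Solve beta = theta - theta_e**2 / theta for the two image positions.

    All angles share the same (arbitrary) unit. The equation is quadratic
    in theta, so a single source position yields two images, one on each
    side of the lens.
    """
    disc = math.sqrt(beta**2 + 4.0 * theta_e**2)
    return 0.5 * (beta + disc), 0.5 * (beta - disc)

def source_position(theta, theta_e):
    """Map an image position back to the source plane (lens equation)."""
    return theta - theta_e**2 / theta

# A source at half an Einstein radius from the optical axis.
theta_plus, theta_minus = point_mass_images(beta=0.5, theta_e=1.0)
```

Both solutions map back to the same source position when inserted into `source_position`, which is the defining property of multiple images.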
2.3 Cosmological lens equation
Here, we turn to the cosmological lens equation that describes the light propagation in a locally inhomogeneous, expanding universe. There are various approaches to derive a cosmological version of the lens equation (e.g., Schneider 1985; Sasaki 1993; Seitz et al. 1994; Futamase 1995; Dodelson 2003; Sereno 2009). We follow the approach by Futamase (1995) based on perturbed null geodesic equations as introduced in Sect. 2.1.
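Consider a perturbed Friedmann–Lemaître–Robertson–Walker (FLRW) metric in the Newtonian gauge of the form (e.g., Kodama and Sasaki 1984):
\[ \text {d}s^2 = -\left( 1+\frac{2\varPsi }{c^2}\right) c^2\text {d}t^2 + a^2(t)\left( 1-\frac{2\varPsi }{c^2}\right) \left[ \text {d}\chi ^2 + r^2\left( \text {d}\vartheta ^2 + \sin ^2\vartheta \,\text {d}\varphi ^2\right) \right] , \tag{9} \]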
where a(t) is the scale factor of the universe (normalized to unity at present), \(\chi \) is the comoving distance, \(\vartheta \) and \(\varphi \) are the spherical polar and azimuthal angles, respectively, \(\varPsi \) is a scalar metric perturbation, K is the spatial curvature of the universe, and \(r = f_K(\chi )\) is the comoving angular diameter distance:
\[ f_K(\chi ) = \begin{cases} K^{-1/2}\sin \left( \sqrt{K}\chi \right) & (K>0),\\ \chi & (K=0),\\ (-K)^{-1/2}\sinh \left( \sqrt{-K}\chi \right) & (K<0). \end{cases} \tag{10} \]
The spatial curvature K is expressed with the total density parameter at the present epoch, \(\varOmega _0\equiv \sum _X\varOmega _{X}=\varOmega _\mathrm {m}+\varOmega _\mathrm {r}+\varOmega _{\varLambda }\), as \(K=(\varOmega _0-1)H_0^2/c^2\). The evolution of a(t) is determined by the Friedmann equation, \(H(a)\equiv (\text {d}a/\text {d}t)/a=H_0[\varOmega _\mathrm {r}a^{-4} + \varOmega _\mathrm {m}a^{-3} + \varOmega _{\varLambda }+ (1-\varOmega _0)a^{-2}]^{1/2}\). In the line element (9), we have neglected all terms of order higher than \(O(\varPsi /c^2)\), the contributions from vector and tensor perturbations, and the effects due to anisotropic stress. As we will discuss in Sect. 2.5.1, \(\varPsi \) is interpreted as the Newtonian gravitational potential generated by local inhomogeneities of the matter distribution in the universe.
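The background quantities above translate directly into the distances used throughout lensing theory. The following Python sketch integrates the Friedmann equation to obtain the comoving distance \(\chi (z)\), applies \(f_K\), and returns angular diameter distances as \(a\,f_K(\chi )\). The function names, the Simpson integration with a fixed number of steps, and the example redshifts are choices made for this illustration; the default parameters follow the fiducial cosmology of this paper.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]

def hubble_rate(a, h=0.7, omega_m=0.3, omega_l=0.7, omega_r=0.0):
    """H(a) in km/s/Mpc from the Friedmann equation."""
    omega_0 = omega_m + omega_r + omega_l
    e2 = (omega_r * a**-4 + omega_m * a**-3 + omega_l
          + (1.0 - omega_0) * a**-2)
    return 100.0 * h * math.sqrt(e2)

def comoving_distance(z, n=1000, **cosmo):
    """chi(z) = c * int_0^z dz'/H(z') in Mpc (composite Simpson rule, n even)."""
    if z <= 0.0:
        return 0.0
    dz = z / n
    total = 0.0
    for i in range(n + 1):
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        total += weight / hubble_rate(1.0 / (1.0 + i * dz), **cosmo)
    return C_KM_S * total * dz / 3.0

def f_K(chi, K):
    """Comoving angular diameter distance for curvature K [Mpc^-2]."""
    if abs(K) * chi**2 < 1e-12:          # effectively flat
        return chi
    root = math.sqrt(abs(K))
    if K > 0:
        return math.sin(root * chi) / root
    return math.sinh(root * chi) / root

def angular_diameter_distance(z_a, z_b, h=0.7, omega_m=0.3, omega_l=0.7,
                              omega_r=0.0):
    """D(z_a, z_b) = a(z_b) * f_K(chi_b - chi_a) in physical Mpc (z_a <= z_b)."""
    cosmo = dict(h=h, omega_m=omega_m, omega_l=omega_l, omega_r=omega_r)
    K = (omega_m + omega_r + omega_l - 1.0) * (100.0 * h / C_KM_S)**2
    chi = comoving_distance(z_b, **cosmo) - comoving_distance(z_a, **cosmo)
    return f_K(chi, K) / (1.0 + z_b)

# Example (assumed redshifts): a lens at z = 0.3 and a source at z = 1.
D_l = angular_diameter_distance(0.0, 0.3)
D_s = angular_diameter_distance(0.0, 1.0)
D_ls = angular_diameter_distance(0.3, 1.0)
```

Note that angular diameter distances are not additive: \(D_{ls}\ne D_s-D_l\) in general, which is why the lens-to-source distance is computed from the difference of comoving distances.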
Since the structure of a light cone is invariant under the conformal transformation, we work with the conformally related spacetime metric given by \(\text {d}\tilde{s}^2 = a^{-2}\text {d}s^2 \equiv \tilde{g}_{\mu \nu } \text {d}x^\mu \text {d}x^\nu \) with \((x^\mu )=(\eta , \chi , \vartheta , \varphi )\), where \(\eta =c\int ^t\text {d}t'/a(t')\) is the conformal time. The metric \(\tilde{g}_{\mu \nu }\) can be rewritten in the form of \(\tilde{g}_{\mu \nu }=\tilde{g}^{(\mathrm {b})}_{\mu \nu }+h_{\mu \nu }\), as a sum of the background metric and a small perturbation (\(|h|\ll 1\)).
We follow the prescription given in Sect. 2.1 to solve the null geodesic equations in the perturbed spacetime (Eq. (9)). To this end, we consider past-directed null geodesics from the observer. Choosing the spherical coordinate system centered on the observer, we have \(k^{(\mathrm {b})\mu }=(-1,1,0,0)\) in the background metric with \(\varPsi =0\). The unperturbed path is parameterized by the affine parameter \(\lambda \) along the photon path as \(x^{(\mathrm {b})\mu }(\lambda )=(-\lambda ,\lambda -\lambda _o,\vartheta _I,\varphi _I)\), where \(\lambda _o\) is the affine parameter at the observer and \((\vartheta _I,\varphi _I)\) denote the angular direction of the image position on the sky. The comoving angular distance r in the background spacetime can be parameterized by \(\lambda \) as \(r(\lambda )\equiv f_K[\chi (\lambda -\lambda _o)]\) (see Eq. (10)).
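The perturbed null geodesic equations for the angular components (\(\vartheta ,\varphi \)) can be formally solved as:
\[ \delta k^i(\lambda ) = -\frac{2}{c^2}\frac{1}{r^2(\lambda )}\int _{\lambda _o}^{\lambda }\partial ^i\varPsi \,\text {d}\lambda ' \quad (i=\vartheta ,\varphi ), \tag{11} \]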
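where \(\partial ^i \varPsi =(\varPsi _{,\vartheta }, \sin ^{-2}\vartheta \,\varPsi _{,\varphi })\) and we have chosen \(\delta k^\vartheta (\lambda _o)=\delta k^\varphi (\lambda _o)=0\). Inserting this result in Eq. (4) and integrating by parts yield (Futamase 1995; Dodelson 2003):
\[ \vartheta ^i_S - \vartheta ^i_I = -\frac{2}{c^2}\int _{\lambda _o}^{\lambda _s}\frac{f_K(\lambda _s-\lambda )}{r(\lambda _s)\,r(\lambda )}\,\partial ^i\varPsi \,\text {d}\lambda , \qquad (\vartheta ^i)=(\vartheta ,\varphi ), \tag{12} \]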
where \(\lambda _s\) is the affine parameter at the background source, \((\vartheta _S, \varphi _S)\equiv (\vartheta (\lambda _s),\varphi (\lambda _s))\) denote the angular direction of the unlensed source position on the sky, and we set \(\delta \vartheta (\lambda _o)=\delta \varphi (\lambda _o)=0\). Here, the integral is performed along the perturbed trajectory \(x^\mu (\lambda )=x^{(\mathrm {b})\mu }(\lambda )+\delta x^\mu (\lambda )\). Equation (12) relates the observed direction of the image position \((\vartheta _I, \varphi _I)\) to the (unlensed) direction of the source position \((\vartheta _S,\varphi _S)\) for a given background cosmology and metric perturbation \(\varPsi (\varvec{\chi },\eta )\). This is a general expression of the cosmological lens equation obtained by Futamase (1995).
2.4 Flat-sky approximation
Now, we consider a small patch of the sky around a given line of sight (\(\vartheta =0\)), across which the curvature of the sky is negligible (\(|\vartheta |\ll 1\)). Then, we can locally define a flat plane perpendicular to the line of sight. By noting that \(\delta \varvec{\theta }\equiv (\delta \vartheta ,\vartheta \delta \varphi )\) is an angular displacement vector within this sky plane, we can express Eq. (12) as:
\[ {\varvec{\beta }}(\chi _s) = \varvec{\theta } - {\varvec{\alpha }}(\chi _s), \tag{13} \]
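where \({\varvec{\beta }}(\chi _s)\) is the (unlensed) angular position of the source, \(\varvec{\theta }\) is the apparent angular position of the source image, and \({\varvec{\alpha }}(\chi _s)\) is the deflection field given by (Futamase 1995):
\[ {\varvec{\alpha }}(\chi _s) = \frac{2}{c^2}\int _0^{\chi _s}\frac{r(\chi _s-\chi )}{r(\chi _s)}\,\varvec{\nabla }_\perp \varPsi \,\text {d}\chi , \tag{14} \]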
where \(\varvec{\nabla }_\perp \equiv r^{-1}(\lambda )(\partial _\vartheta ,\vartheta ^{-1}\partial _\varphi )\) is the transverse comoving gradient and the integral is performed along the perturbed trajectory \(x^\mu (\lambda )=x^{(\mathrm {b})\mu }(\lambda )+\delta x^\mu (\lambda )\) with \(\lambda =\chi + O(\varPsi /c^2)\). Equation (13) can be applied to a range of lensing phenomena, including multiple deflections of light from a background source (Sect. 2.5), strong and weak gravitational lensing by individual galaxies and clusters (Sect. 2.6), and cosmological weak lensing by the intervening large-scale structure (a.k.a. cosmic shear). Note that the cosmological lens equation is obtained using the standard angular diameter distance in a background FLRW spacetime without employing the thin-lens approximation (see Sect. 2.6).
2.5 Multiple lens equation
We consider a discretized version of the cosmological lens equation (Eq. (13)) by dividing the radial integral between the source (\(\chi =\chi _s\)) and the observer (\(\chi =0\)) into N comoving boxes (\(N-1\) lens planes) separated by a constant comoving distance of \(\varDelta \chi \). The angular position \(\varvec{\theta }^{(n)}\) of a light ray in the nth plane (\(1\leqslant n\leqslant N\)) is then given by (e.g., Schneider et al. 1992; Schneider 2019):
where \(\varvec{\theta }^{(0)} = {\varvec{\beta }}^{(1)}\) is the apparent angular position of the source image and \(\hat{{\varvec{\alpha }}}^{(m)}\) is the bending angle at the mth lens plane (\(m=1,2,\ldots,n-1\)):
The \(2\times 2\) Jacobian matrix of Eq. (15) (\(1\leqslant n \leqslant N\)) is expressed as (e.g., Jain et al. 2000):
where \(\mathscr{I}\) denotes the identity matrix, \(\mathscr{H}^{(m)}\equiv \partial \hat{{\varvec{\alpha }}}^{(m)}/\partial {\varvec{\beta }}^{(m)}\) is a symmetric dimensionless Hessian matrix with \(\mathscr{H}_{ij}^{(m)}=(2/c^2)r(\chi _m)\nabla _{\perp ,i}\nabla _{\perp ,j}\varPsi [\chi _m,r(\chi _m){\varvec{\beta }}^{(m)}]\,\varDelta \chi \) (\(i,j=1,2\)), \(D_n\) is the angular diameter distance between the observer and the nth lens plane, and \(D_{mn}\) is the angular diameter distance between the mth and nth lens planes (\(m<n\)). In general, the Jacobian matrix \(\mathscr{A}^{(n)}\) can be decomposed into the following form:
where \(\kappa \) is the lensing convergence, \((\gamma _1,\gamma _2)\) are the two components of the gravitational shear (see Sect. 2.6.2 for the definitions and further details of the convergence and shear), \(\omega \) is the net rotation (e.g., Cooray and Hu 2002), and \(\sigma _{a}\; (a=1,2,3)\) are the Pauli matrices that satisfy \(\sigma _{a}\sigma _{b}=\delta _{ab}\mathscr{I}+i\epsilon _{abc}\sigma _{c}\), with \(\epsilon _{abc}\) the Levi–Civita symbol in three dimensions. The Born approximation \(\mathscr{A}^{(m)}=\mathscr{I}\) on the right-hand side of Eq. (17) leads to a symmetric Jacobian matrix with \(\omega =0\).
The multiple lens equation has been widely used to study gravitational lensing phenomena by ray-tracing through N-body simulations (e.g., Schneider and Weiss 1988; Hamana et al. 2000; Jain et al. 2000).
2.5.1 Cosmological Poisson equation
We assume here a spatially flat geometry with \(K=0\) motivated by cosmological observations based on cosmic microwave background (CMB) and complementary data sets (e.g., Hinshaw et al. 2013; Planck Collaboration et al. 2015b). The cosmological Poisson equation relates the scalar metric perturbation \(\varPsi \) (see Eq. (9)) to the matter density perturbation \(\delta \rho \) on sub-horizon scales as:
\[ \varvec{\nabla }^2\varPsi = 4\pi G a^2\delta \rho = \frac{3H_0^2\varOmega _\mathrm {m}}{2}\frac{\delta }{a}, \tag{19} \]
where \(\delta =\delta \rho /\overline{\rho }\) is the density contrast with respect to the background matter density \(\overline{\rho }\) of the universe, \(\overline{\rho }=a^{-3}(3H_0^2\varOmega _\mathrm {m})/(8\pi G)\), and \(\varvec{\nabla }\) is the three-dimensional gradient operator in comoving coordinates. A key implication of Eq. (19) is that the amplitude of \(\varPsi \) is related to the amplitude of \(\delta \) as \(|\varPsi |/c^2 \sim (3\varOmega _\mathrm {m}/2)(l/L_H)^2 (\delta /a)\) where l and \(L_H=c/H_0\) denote the characteristic comoving scale of density perturbations and the Hubble radius, respectively. Therefore, assuming the standard matter power spectrum of density fluctuations (e.g., Smith et al. 2003), we can safely conclude that the degree of metric perturbation is always much smaller than unity, i.e., \(|\varPsi |/c^2\ll 1\), even for highly nonlinear perturbations with \(|\delta |\gg 1\) on small scales of \(l\ll L_H\; (\sim 3h^{-1}\,\mathrm {Gpc})\).
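The order-of-magnitude relation above is easy to evaluate. The sketch below is a minimal illustration in Python; the function name is hypothetical and the chosen scale and density contrast (\(l=1h^{-1}\,\mathrm {Mpc}\), \(\delta =200\)) are assumed values meant to represent a strongly nonlinear cluster-scale perturbation.

```python
HUBBLE_RADIUS = 2997.92458  # L_H = c/H0 in h^-1 Mpc (c in km/s divided by 100)

def potential_amplitude(scale, delta, a=1.0, omega_m=0.3):
    """Estimate |Psi|/c^2 ~ (3 Omega_m / 2) (l / L_H)^2 (delta / a).

    scale : characteristic comoving scale l in h^-1 Mpc
    delta : density contrast on that scale
    """
    return 1.5 * omega_m * (scale / HUBBLE_RADIUS)**2 * delta / a

# A nonlinear perturbation with delta = 200 on a 1 h^-1 Mpc scale today.
psi_cluster = potential_amplitude(scale=1.0, delta=200.0)
```

Even with \(\delta \) of a few hundred, the result is of order \(10^{-5}\), which illustrates why the weak-field metric remains an excellent approximation on cluster scales.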
2.6 Thin-lens equation
2.6.1 Thin-lens approximation
Let us turn to the case of gravitational lensing caused by a single cluster-scale halo. Galaxy clusters can produce deep gravitational potential wells, acting as powerful gravitational lenses. In cluster gravitational lensing, it is often assumed that the total deflection angle, \({\varvec{\alpha }}(\varvec{\theta })\), is dominated by the cluster of interest and its surrounding large-scale environment, which becomes important beyond the cluster virial radius, \(r_\mathrm {vir}\) (Cooray and Sheth 2002; Oguri and Hamana 2011; Diemer and Kravtsov 2014).
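Assuming that the light propagation is approximated by a single-lens event due to the cluster and that a light deflection occurs within a sufficiently small region (\(\chi _l-\varDelta \chi /2, \chi _l+\varDelta \chi /2\)) compared to the relevant angular diameter distances, we can write the deflection field by a single cluster as:
\[ {\varvec{\alpha }}(\varvec{\theta }) \simeq \frac{2}{c^2}\frac{D_{ls}}{D_s}\int _{\chi _l-\varDelta \chi /2}^{\chi _l+\varDelta \chi /2}\varvec{\nabla }_\perp \varPsi (\chi , r(\chi _l)\varvec{\theta })\,\text {d}\chi , \tag{20} \]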
where \(D_s=a(\chi _s)r(\chi _s)\) and \(D_{ls}=a(\chi _s)r(\chi _s-\chi _l)\) are the angular diameter distances from the observer to the source and from the deflector to the source, respectively, and \(r(\chi _l)\varvec{\theta }\) is the comoving transverse vector on the lens plane. In a cosmological situation, the angular diameter distances \(D_{mn}\) between the planes m and n (\(z_m < z_n\)) are of the order of the Hubble radius, \(L_H\equiv c/H_0\sim 3h^{-1}\,\mathrm {Gpc}\), while physical extents of clusters are about \(2r_\mathrm {200m} \sim (2-4)h^{-1}\,\mathrm {Mpc}\) in comoving units. Therefore, one can safely adopt the thin-lens approximation in cluster gravitational lensing.
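We then introduce the effective lensing potential \(\psi (\varvec{\theta })\) defined as:
\[ \psi (\varvec{\theta }) = \frac{2}{c^2}\frac{D_{ls}}{D_lD_s}\int _{\chi _l-\varDelta \chi /2}^{\chi _l+\varDelta \chi /2}\varPsi (\chi , r(\chi _l)\varvec{\theta })\,a\,\text {d}\chi , \tag{21} \]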
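where \(D_l\) is the angular diameter distance between the observer and the lens, \(D_l=a(\chi _l)r(\chi _l)\). In terms of \(\psi (\varvec{\theta })\), the deflection field \({\varvec{\alpha }}(\varvec{\theta })\) is expressed as:
\[ {\varvec{\alpha }}(\varvec{\theta }) = \varvec{\nabla }_{\theta }\psi (\varvec{\theta }), \tag{22} \]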
where \(\varvec{\nabla }_{\theta }=r\varvec{\nabla }_{\perp }=(\partial _\theta ,\theta ^{-1}\partial _\phi )\).
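With the Fermat or time-delay potential defined by:
\[ \tau (\varvec{\theta };{\varvec{\beta }}) = \frac{1}{2}(\varvec{\theta }-{\varvec{\beta }})^2 - \psi (\varvec{\theta }), \tag{23} \]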
the lens equation can be equivalently written as \(\varvec{\nabla }_{\theta }\tau (\varvec{\theta };{\varvec{\beta }})=0\) (Blandford and Narayan 1986). Here, the first term on the right-hand side of Eq. (23) is responsible for the geometric delay and the second term for the gravitational time delay. The Fermat potential \(\tau (\varvec{\theta };{\varvec{\beta }})\) is related to the time delay \(\varDelta t\) with respect to the unperturbed path in the observer frame by \(\varDelta t(\varvec{\theta };{\varvec{\beta }})=[(1+z_l)D_lD_s/(cD_{ls})]\,\tau (\varvec{\theta };{\varvec{\beta }})\equiv D_{\varDelta t}\tau (\varvec{\theta };{\varvec{\beta }})/c\), with \(D_{\varDelta t}=(1+z_l)D_lD_s/D_{ls}\propto H_0^{-1}\) the time-delay distance (Refsdal 1964). According to Fermat’s principle, the images for a given source position \({\varvec{\beta }}\) are formed at the stationary points of \(\tau (\varvec{\theta };{\varvec{\beta }})\) with respect to variations of \(\varvec{\theta }\) (Blandford and Narayan 1986).
Note that cluster gravitational lensing is also affected by uncorrelated large-scale structure projected along the line of sight (e.g., Schneider et al. 1998; Hoekstra 2003; Umetsu et al. 2011a; Host 2012). The intervening large-scale structure in the universe perturbs the propagation of light from distant background galaxies, producing small but continuous transverse excursions along the light path. For a given depth of observations, the impact of such cosmic noise is most important in the cluster outskirts where the cluster lensing signal is small (Hoekstra 2003; Becker and Kravtsov 2011; Gruen et al. 2015).
2.6.2 Convergence and shear
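Let us work with local Cartesian coordinates \(\varvec{\theta }=(\theta _1,\theta _2)\) centered on a certain reference point in the image plane. The local properties of the lens mapping are described by the Jacobian matrix defined as:
\[ \mathscr{A}(\varvec{\theta }) \equiv \frac{\partial {\varvec{\beta }}}{\partial \varvec{\theta }} = \left( \begin{array}{cc} 1-\psi _{,11} & -\psi _{,12}\\ -\psi _{,12} & 1-\psi _{,22} \end{array}\right) , \tag{24} \]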
where we have introduced the notation, \(\psi _{,ij}=\partial ^2\psi /\partial \theta _i\partial \theta _j\) (\(i,j=1,2\)). Alternatively, we can write the Jacobian matrix as \(\mathscr{A}_{ij}=\delta _{ij}-\psi _{,ij}\) (\(i,j=1,2\)) with \(\delta _{ij}\) the Kronecker delta in two dimensions. This symmetric \(2\times 2\) Jacobian matrix \(\mathscr{A}\) can be decomposed as:
\[ \mathscr{A} = (1-\kappa )\mathscr{I} - \gamma _1\sigma _3 - \gamma _2\sigma _1, \tag{25} \]
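where \(\sigma _{a} \; (a=1,2,3)\) are the Pauli matrices (Sect. 2.5); \(\kappa (\varvec{\theta })\) is the lensing convergence responsible for the change in the trace part of the Jacobian matrix (\(\mathrm {tr}(\mathscr{A})=2(1-\kappa )\)):
\[ \kappa (\varvec{\theta }) = \frac{1}{2}\left( \psi _{,11}+\psi _{,22}\right) = \frac{1}{2}\triangle \psi (\varvec{\theta }), \tag{26} \]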
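with \(\triangle = \varvec{\nabla }_\theta ^2\), and (\(\gamma _1,\gamma _2\)) are the two components of the complex shear \(\gamma (\varvec{\theta }):=\gamma _1(\varvec{\theta })+i\gamma _2(\varvec{\theta })\):
\[ \gamma _1 = \frac{1}{2}\left( \psi _{,11}-\psi _{,22}\right) , \qquad \gamma _2 = \psi _{,12} = \psi _{,21}. \tag{27} \]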
Note that Eq. (26) can be regarded as a two-dimensional Poisson equation, \(\triangle \psi (\varvec{\theta }) = 2\kappa (\varvec{\theta })\). Then, the Green function in the (hypothetical) infinite domain is \(\triangle ^{-1}(\varvec{\theta },\varvec{\theta }')=\ln |\varvec{\theta }-\varvec{\theta }'|/(2\pi )\), so that the convergence is related to the lensing potential as:
\[ \psi (\varvec{\theta }) = \frac{1}{\pi }\int \ln |\varvec{\theta }-\varvec{\theta }'|\,\kappa (\varvec{\theta }')\,\text {d}^2\theta '. \tag{28} \]
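The Jacobian matrix is expressed in terms of \(\kappa \) and \(\gamma \) as:
\[ \mathscr{A} = \left( \begin{array}{cc} 1-\kappa -\gamma _1 & -\gamma _2\\ -\gamma _2 & 1-\kappa +\gamma _1 \end{array}\right) . \tag{29} \]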
The determinant of the Jacobian matrix (Eq. (29)) is given as \(\mathrm {det}\mathscr{A}=(1-\kappa )^2-|\gamma |^2\). In the weak-lensing limit where \(|\kappa |, |\gamma | \ll 1\), \(\mathrm {det}\mathscr{A}\simeq 1-2\kappa \) to the first order.
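The chain from the second derivatives of the lensing potential to the convergence, shear, and Jacobian determinant can be traced numerically. The sketch below uses NumPy and assumes the standard definitions \(\kappa =(\psi _{,11}+\psi _{,22})/2\), \(\gamma _1=(\psi _{,11}-\psi _{,22})/2\), and \(\gamma _2=\psi _{,12}\), consistent with \(\mathscr{A}_{ij}=\delta _{ij}-\psi _{,ij}\). The function names and the numerical values of the derivatives are arbitrary choices for this illustration.

```python
import numpy as np

def convergence_shear(psi_11, psi_22, psi_12):
    """Return (kappa, gamma) from the second derivatives of psi.

    gamma is returned as the complex shear gamma_1 + i * gamma_2.
    """
    kappa = 0.5 * (psi_11 + psi_22)
    gamma = complex(0.5 * (psi_11 - psi_22), psi_12)
    return kappa, gamma

def jacobian(kappa, gamma):
    """Jacobian matrix A of the lens mapping in terms of kappa and gamma."""
    g1, g2 = gamma.real, gamma.imag
    return np.array([[1.0 - kappa - g1, -g2],
                     [-g2, 1.0 - kappa + g1]])

# Illustrative (assumed) second derivatives in the weak-lensing regime.
psi_11, psi_22, psi_12 = 0.08, 0.02, 0.03
kappa, gamma = convergence_shear(psi_11, psi_22, psi_12)
A = jacobian(kappa, gamma)

det_numeric = np.linalg.det(A)                     # direct determinant
det_formula = (1.0 - kappa)**2 - abs(gamma)**2     # (1 - kappa)^2 - |gamma|^2
det_weak = 1.0 - 2.0 * kappa                       # first-order approximation
```

The matrix built from \(\kappa \) and \(\gamma \) coincides with \(\delta _{ij}-\psi _{,ij}\), and the first-order determinant differs from the exact one only at second order in \(\kappa \) and \(|\gamma |\).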
The deformation of the image of an infinitesimal circular source (\(|{\text{d}}{\varvec{\beta }}|\rightarrow 0\)) behind the lens can be described by the inverse Jacobian matrix \(\mathscr{A}^{-1}\) of the lens equation. In the weak-lensing limit (\(|\kappa |, |\gamma |\ll 1\)), we have:
\[ \left( \mathscr{A}^{-1}\right) _{ij} \simeq (1+\kappa )\delta _{ij} + \varGamma _{ij} \quad (i,j=1,2), \tag{30} \]
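where \(\varGamma _{ij}\) is the symmetric trace-free shear matrix defined by (Bartelmann and Schneider 2001; Crittenden et al. 2002):
\[ \varGamma _{ij} = \left( \partial _i\partial _j - \delta _{ij}\frac{1}{2}\triangle \right) \psi (\varvec{\theta }), \tag{31} \]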
with \(\partial _i:=\partial /\partial \theta _i\) (\(i=1,2\)). The shear matrix can be expressed in terms of the Pauli matrices as \(\varGamma =\sigma _3\gamma _1+\sigma _1\gamma _2\). The first term in Eq. (30) describes the isotropic light focusing or area distortion in the weak-lensing limit, while the second term induces an asymmetry in lens mapping. The shear \(\gamma \) is responsible for image distortion and can be directly observed from image ellipticities of background galaxies in the regime where \(|\kappa |,|\gamma |\ll 1\) (see Sect. 3). Note that both \(\kappa \) and \(\gamma \) contribute to the area and shape distortions in the non-weak-lensing regime.
In Fig. 3, we illustrate the effects of the lensing convergence \(\kappa \) and the gravitational shear \(\gamma \) on the angular shape and size of an infinitesimal circular source. The convergence acting alone causes an isotropic magnification of the image, while the shear deforms it to an ellipse. Note that the magnitude of ellipticity induced by gravitational shear in the weak-lensing regime (\(|\gamma |\lesssim 0.1\)) is much smaller than illustrated here.
2.6.3 Magnification
Gravitational lensing describes the deflection of light by gravity. Lensing conserves the surface brightness of a background source, a consequence of Liouville’s theorem. On the other hand, lensing causes focusing of light rays, resulting in an amplification of the image flux through the local solid-angle distortion. Lensing magnification \(\mu \) is thus given by the ratio of the lensed to the unlensed image solid angle as \(\mu =\delta \varOmega ^I/\delta \varOmega ^S = 1/\mathrm {det}\mathscr{A}\), with:
\[ \mathrm {det}\mathscr{A}(\varvec{\theta }) = [1-\kappa (\varvec{\theta })]^2 - |\gamma (\varvec{\theta })|^2. \tag{32} \]
In the weak-lensing limit (\(\kappa ,\gamma \ll 1\)), the magnification factor to the first order is:
\[ \mu \simeq 1 + 2\kappa . \tag{33} \]
The magnitude change at \(\kappa (\varvec{\theta })=0.1\) is thus \(\varDelta m = -(5/2)\log _{10}(\mu ) \sim -0.2\).
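As a quick numerical check of the quoted magnitude change, the following Python sketch compares the first-order magnification \(1+2\kappa\) with the exact value \(1/[(1-\kappa)^2-|\gamma|^2]\) at \(\kappa=0.1\). The function names are illustrative only, and setting \(\gamma=0\) is an assumption made for simplicity.

```python
import math

def magnification(kappa, gamma_abs=0.0):
    """Exact magnification mu = 1/det(A) = 1/[(1 - kappa)^2 - |gamma|^2]."""
    return 1.0 / ((1.0 - kappa) ** 2 - gamma_abs ** 2)

def magnification_first_order(kappa):
    """First-order weak-lensing magnification, mu ~ 1 + 2*kappa."""
    return 1.0 + 2.0 * kappa

def delta_mag(mu):
    """Magnitude change; negative values mean a brighter image."""
    return -2.5 * math.log10(mu)

kappa = 0.1
dm_lin = delta_mag(magnification_first_order(kappa))
dm_exact = delta_mag(magnification(kappa))  # gamma = 0 assumed here
print(f"first order: {dm_lin:.3f} mag, exact (gamma=0): {dm_exact:.3f} mag")
```

Even at \(\kappa=0.1\), the exact and first-order values differ by a few hundredths of a magnitude, which gives a feel for where the linear approximation starts to degrade.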
2.6.4 Strong- and weak-lensing regimes
The \(\mathscr{A}(\varvec{\theta })\) matrix has two local eigenvalues \(\varLambda _\pm (\varvec{\theta })\) at each image position \(\varvec{\theta }\):
\[ \varLambda _\pm = 1 - \kappa \pm |\gamma |, \tag{34} \]
with \(\varLambda _+ \geqslant \varLambda _-\).
Images with \(\mathrm {det}\mathscr{A}(\varvec{\theta })>0\) have the same parity as the source, while those with \(\mathrm {det}\mathscr{A}(\varvec{\theta })<0\) have the opposite parity to the source. A set of closed curves defined by \(\mathrm {det}\mathscr{A}(\varvec{\theta })=0\) in the image plane are referred to as critical curves, on which lensing magnification formally diverges, and those mapped into the source plane are referred to as caustics (see Hattori et al. 1999a). The critical curves separate the image plane into even- and odd-parity regions with \(\mathrm {det}\mathscr{A}>0\) and \(\mathrm {det}\mathscr{A}<0\), respectively.
An infinitesimal circular source is transformed to an ellipse with a minor-to-major axis ratio (\(\leqslant 1\)) of \(|\varLambda _-/\varLambda _+|\) for \(\kappa <1\) and \(|\varLambda _+/\varLambda _-|\) for \(\kappa > 1\), and it is magnified by the factor \(|\mu |=1/|\varLambda _+\varLambda _-|\) (see Sect. 2.6.3). The gravitational distortion locally disappears along the curve defined by \(\mathrm {tr}[\mathscr{A}(\varvec{\theta })]=0\), i.e., \(\kappa (\varvec{\theta })=1\), which lies in the odd-parity region (Kaiser 1995). This is illustrated in Fig. 4 for a simulated lens with a bimodal mass distribution. Images forming along the outer (tangential) critical curve \(\varLambda _{-}(\varvec{\theta })=0\) are distorted tangentially to this curve, while images forming close to the inner (radial) critical curve \(\varLambda _{+}(\varvec{\theta })=0\) are stretched in the direction perpendicular to the critical curve.
A lens system that has a region with \(\kappa (\varvec{\theta })>1\) can produce multiple images for certain source positions \({\varvec{\beta }}\), and such a system is referred to as being supercritical. Note that being supercritical is a sufficient but not a necessary condition for a general lens to produce multiple images, because the shear can also contribute to multiple imaging. Nevertheless, this provides us with a simple criterion to broadly distinguish the regimes of multiple and single imaging. Keeping this in mind, we refer to the region where \(\kappa (\varvec{\theta }) \gtrsim 1\) as the strong-lensing regime and the region where \(\kappa (\varvec{\theta })\ll 1\) as the weak-lensing regime.
2.6.5 Critical surface mass density
The lensing convergence \(\kappa \) is essentially a distance-weighted mass overdensity projected along the line of sight. We express \(\kappa (\varvec{\theta })\) due to cluster gravitational lensing as:
\[ \kappa (\varvec{\theta }) = \frac{4\pi G}{c^2}\int _0^{\chi _s} (\rho -\overline{\rho })\,\frac{D_l D_{ls}}{D_s}\,a\,{\text{d}}\chi \simeq \frac{4\pi G}{c^2}\frac{D_l D_{ls}}{D_s}\int _0^{\chi _s} (\rho -\overline{\rho })\,a\,{\text{d}}\chi \equiv \frac{\varSigma (\varvec{\theta })}{\varSigma _\mathrm {cr}}, \tag{35} \]
where \(\chi _s\) is the comoving distance to the source plane; \(\varSigma =\int _0^{\chi _s} (\rho -\overline{\rho })\,a\, {\text{d}}\chi \) is the surface mass density field of the lens projected on the sky; and \(\varSigma _\mathrm {cr}\) is the critical surface mass density of gravitational lensing^{Footnote 5}:
\[ \varSigma _\mathrm {cr}(z_l,z_s) = \frac{c^2 D_s}{4\pi G D_l D_{ls}} \tag{36} \]
for \(z_s > z_l\) and \(\varSigma _\mathrm {cr}^{-1}(z_l,z_s) = 0\) (i.e., \(D_{ls}/D_s= 0\)) for an unlensed source with \(z_s\leqslant z_l\). In the second (approximate) equality of Eq. (35), we have explicitly used the thin-lens approximation (Sect. 2.6.1). The critical surface mass density \(\varSigma _\mathrm {cr}\) depends on the geometric configuration (\(z_l, z_s\)) of the lens–source system and the background cosmological parameters, such as (\(\varOmega _\mathrm {m}, \varOmega _{\varLambda }, H_0\)). For example, for \(z_l=0.3\) and \(z_s=1\) in our fiducial cosmology, we have \(\varSigma _\mathrm {cr}\approx 4.0 \times 10^{15}hM_\odot \,\hbox {Mpc}^{-2}\). For a fixed lens redshift \(z_l\), the geometric efficiency of gravitational lensing is determined by the distance ratio \(D_{ls}/D_s\) as a function of \(z_s\) and the background cosmology.
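The quoted value of \(\varSigma _\mathrm {cr}\) can be checked numerically. The sketch below assumes a spatially flat \(\varLambda\)CDM model with \(\varOmega _\mathrm {m}=0.3\) and \(\varOmega _{\varLambda }=0.7\); these parameter values are an assumption of the example, since the fiducial cosmology is not restated in this section. The helper names are hypothetical.

```python
import numpy as np

C_KMS = 299792.458           # speed of light [km/s]
G_MPC = 4.30091e-9           # Newton's constant [Mpc (km/s)^2 / Msun]
HUBBLE_DIST = C_KMS / 100.0  # c/H0 [Mpc/h]

def comoving_distance(z, omega_m=0.3, n=2001):
    """Comoving distance [Mpc/h] in flat LCDM (assumed), trapezoidal rule."""
    zz = np.linspace(0.0, z, n)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    return HUBBLE_DIST * np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))

def sigma_cr(z_l, z_s, omega_m=0.3):
    """Critical surface mass density [h Msun / Mpc^2]; infinite if z_s <= z_l."""
    if z_s <= z_l:
        return np.inf  # Sigma_cr^{-1} = 0 for unlensed sources
    chi_l = comoving_distance(z_l, omega_m)
    chi_s = comoving_distance(z_s, omega_m)
    d_l = chi_l / (1.0 + z_l)
    d_s = chi_s / (1.0 + z_s)
    d_ls = (chi_s - chi_l) / (1.0 + z_s)  # valid only for a flat universe
    return C_KMS ** 2 / (4.0 * np.pi * G_MPC) * d_s / (d_l * d_ls)

sigma = sigma_cr(0.3, 1.0)
print(f"Sigma_cr(z_l=0.3, z_s=1) = {sigma:.2e} h Msun/Mpc^2")
```

Distances are expressed in \(\hbox {Mpc}/h\), so the result carries the same \(h\) scaling as the value quoted in the text.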
To translate the observed lensing signal into surface mass densities, one needs an estimate of \(\varSigma _\mathrm {cr}(z_l,z_s)\) for a given background cosmology. In the regime where \(z_l\ll z_s\) (say, \(z_l\lesssim 0.2\) for background galaxy populations at \(z_s\sim 1\)), \(\varSigma _\mathrm {cr}\) depends weakly on the source redshift \(z_s\), so that a precise knowledge of the source-redshift distribution is less critical (e.g., Okabe and Umetsu 2008; Okabe et al. 2010).
Conversely, this distance dependence of the lensing effects can be used to constrain the cosmological redshift–distance relation by examining the geometric scaling of the lensing signal as a function of the background redshift (Taylor et al. 2007, 2012; Medezinski et al. 2011; Dell’Antonio et al. 2019). Figure 5 compares \(D_{ls}/D_s\) as a function of \(z_s\) for various sets of the lens redshift and the cosmological model.
Note that, in the limit where the lensing matter is continuously distributed along the line of sight, the first equality of Eq. (35) can be formally rewritten as:
\[ \kappa (\varvec{\theta }) = \frac{3H_0^2\varOmega _\mathrm {m}}{2c^2}\int _0^{\chi _s} g(\chi ,\chi _s)\,\frac{\delta }{a}\,{\text{d}}\chi , \tag{37} \]
with \(g(\chi ,\chi _s)=r(\chi )r(\chi _s-\chi )/r(\chi _s)\) and \(\delta =\delta \rho /\overline{\rho }\). Equation (37) coincides with the expression for the cosmic convergence due to intervening cosmic structures (see Jain et al. 2000).
It is interesting to compare the above line-of-sight integral (Eq. (37)) to the thermal Sunyaev–Zel’dovich effect (SZE) in terms of the Compton-y parameter (e.g., Sunyaev and Zeldovich 1972; Rephaeli 1995; Birkinshaw 1999):
\[ y = \frac{\sigma _\mathrm {T}}{m_\mathrm {e}c^2}\int n_\mathrm {e}k_\mathrm {B}(T_\mathrm {e}-T_\mathrm {CMB})\,a\,{\text{d}}\chi \simeq \frac{\sigma _\mathrm {T}}{m_\mathrm {e}c^2}\int P_\mathrm {e}\,a\,{\text{d}}\chi , \tag{38} \]
where \(\sigma _\mathrm {T}\), \(m_\mathrm {e}\), and \(k_\mathrm {B}\) are the Thomson scattering cross-section, the electron mass, and the Boltzmann constant, respectively; \(T_\mathrm {CMB}= T_0(1+z)\) is the temperature of CMB photons with \(T_0=2.725\,\hbox {K}\); and \(T_\mathrm {e}\) and \(n_\mathrm {e}\) are the electron temperature and number density of the intracluster gas, with \(P_\mathrm {e}= n_\mathrm {e} k_\mathrm {B}T_\mathrm {e}\) the electron pressure. In the second (approximate) equality, we have used \(T_\mathrm {e}\gg T_0(1+z)\). The Compton-y parameter is proportional to the electron pressure integrated along the line of sight, thus probing the thermal energy content of thermalized hot plasmas residing in the gravitational potential wells of galaxy clusters. The combination of the thermal SZE and weak lensing thus provides unique astrophysical and cosmological probes (e.g., Doré et al. 2001; Umetsu et al. 2009; Osato et al. 2020).
2.6.6 Einstein radius
Detailed strong-lens modeling using many sets of multiple images with measured spectroscopic redshifts allows us to determine the location of the critical curves (e.g., Zitrin et al. 2015; Meneghetti et al. 2017), which, in turn, provides accurate estimates of the projected total mass enclosed by them. In this context, the term Einstein radius is often used to refer to the size of the outer (tangential) critical curve (i.e., \(\varLambda _{-}(\varvec{\theta })=0\); Sect. 2.6.4). We note, however, that there are several possible definitions of the Einstein radius used in the literature (see Meneghetti et al. 2013). Here, we adopt the effective Einstein radius definition (Redlich et al. 2012; Meneghetti et al. 2013, 2017; Zitrin et al. 2015), \(\vartheta _\mathrm {Ein}=\sqrt{A_\mathrm {c}/\pi }\), where \(A_\mathrm {c}\) is the (angular) area enclosed by the outer critical curve. For an axisymmetric lens, the average surface mass density within the critical area is equal to \(\varSigma _\mathrm {cr}\) (see Hattori et al. 1999; Meneghetti et al. 2013), thus enabling us to directly estimate the enclosed projected mass by \(M_\mathrm {2D}(<\vartheta _\mathrm {Ein})=\pi (D_l\vartheta _\mathrm {Ein})^2\varSigma _\mathrm {cr}\). Even for general non-axisymmetric lenses, the projected enclosed mass profile \(M_\mathrm {2D}(<\vartheta )=\varSigma _\mathrm {cr}D_l^2\int _{\vartheta '\le \vartheta }\kappa (\varvec{\theta }')\,{\text{d}}^2\theta '\) at the location \(\vartheta \sim \vartheta _\mathrm {Ein}\) is less sensitive to modeling assumptions and approaches (e.g., Umetsu et al. 2012, 2016; Meneghetti et al. 2017), thus serving as a fundamental observable quantity in the strong-lensing regime (Coe et al. 2010).
3 Basics of cluster weak lensing
In this section, we review the basics of cluster–galaxy weak lensing based on the thin-lens formalism (Sect. 2.6). Unless otherwise noted, we will focus on subcritical lensing (i.e., outside the critical curves). We consider both linear (\(\kappa \ll 1\)) and mildly nonlinear regimes of weak gravitational lensing.
3.1 Weak-lensing mass reconstruction
3.1.1 Spin operator and lensing fields
For mathematical convenience, we introduce a concept of “spin” for weak-lensing quantities as follows (Bacon et al. 2006; Okura et al. 2007, 2008; Schneider and Er 2008; Bacon and Schäfer 2009): a quantity is said to have spin N if it has the same value after rotation by \(2\pi /N\). The product of spin-A and spin-B quantities has spin (\(A+B\)), and the product of spin-A and spin-\(B^*\) quantities has spin (\(A-B\)), where \(*\) denotes the complex conjugate.
We define a complex spin-1 operator \(\partial :=\partial _1+i\partial _2\) that transforms as a vector, \(\partial '=\partial e^{-i\varphi }\), with \(\varphi \) being the angle of rotation relative to the original basis. Then, the lensing convergence is expressed in terms of \(\psi (\varvec{\theta })\) as:
\[ \kappa = \frac{1}{2}\partial \partial ^*\psi , \tag{39} \]
where \(\partial \partial ^*=\nabla _\theta ^2\) is a scalar or a spin-0 operator. Similarly, the complex shear \(\gamma =\gamma _1+i\gamma _2\equiv |\gamma | e^{2i\phi _\gamma }\) is expressed as:
\[ \gamma = \frac{1}{2}\partial \partial \psi = \hat{\mathcal{D}}\psi , \tag{40} \]
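where
\[ \hat{\mathcal{D}} := \frac{1}{2}\partial \partial = \frac{1}{2}\left( \partial _1^2-\partial _2^2\right) + i\partial _1\partial _2 \tag{41} \]
is a spin-2 operator that transforms such that \(\hat{\mathcal{D}}'=\hat{\mathcal{D}}e^{-2i\varphi }\) under a rotation of the basis axes by \(\varphi \).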
3.1.2 Linear mass reconstruction
Since \(\gamma (\varvec{\theta })\) and \(\kappa (\varvec{\theta })\) are both linear combinations of the second derivatives of \(\psi (\varvec{\theta })\), they are related to each other by (Kaiser 1995; Crittenden et al. 2002; Umetsu 2010)^{Footnote 6}:
\[ \triangle \kappa = \partial ^*\partial ^*\gamma = 2\hat{\mathcal{D}}^*\gamma . \tag{42} \]
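The shear-to-mass inversion can thus be formally expressed as:
\[ \kappa = \triangle ^{-1}\left( \partial ^*\partial ^*\gamma \right) = 2\hat{\mathcal{D}}^*\left( \triangle ^{-1}\gamma \right) . \tag{43} \]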
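Using \(\triangle ^{-1}(\varvec{\theta },\varvec{\theta }')=\ln |\varvec{\theta }-\varvec{\theta }'|/(2\pi )\) (Sect. 2.6.2), Eq. (43) in the flat-sky limit can be solved to yield the following nonlocal relation between \(\kappa \) and \(\gamma \) (Kaiser and Squires 1993, hereafter KS93):
\[ \kappa (\varvec{\theta }) - \kappa _0 = \frac{1}{\pi }\int {\text{d}}^2\theta '\, D^*(\varvec{\theta }-\varvec{\theta }')\,\gamma (\varvec{\theta }'), \tag{44} \]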
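where \(\kappa _0\) is an additive constant and \(D(\varvec{\theta })\) is a complex kernel defined as:
\[ D(\varvec{\theta }) = \frac{\theta _2^2-\theta _1^2-2i\theta _1\theta _2}{|\varvec{\theta }|^4} = -\frac{1}{(\theta _1-i\theta _2)^2}. \tag{45} \]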
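Similarly, the complex shear field can be expressed in terms of the convergence \(\kappa \) as:
\[ \gamma (\varvec{\theta }) = \frac{1}{\pi }\int {\text{d}}^2\theta '\, D(\varvec{\theta }-\varvec{\theta }')\,\kappa (\varvec{\theta }'). \tag{46} \]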
This linear mass inversion formalism is often referred to as the KS93 algorithm.
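It is computationally faster to work in the Fourier domain (Jain et al. 2000) using the fast Fourier transform algorithm. By taking the Fourier transform of Eq. (42), we have a mass inversion relation in the conjugate Fourier space as:
\[ \hat{\kappa }(\varvec{k}) = \frac{k_1^2-k_2^2-2ik_1k_2}{k_1^2+k_2^2}\,\hat{\gamma }(\varvec{k}) \quad (\varvec{k}\ne 0), \tag{47} \]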
where \(\varvec{k}\) is the two-dimensional wave vector conjugate to \(\varvec{\theta }\), and \(\hat{\kappa }(\varvec{k})\) and \(\hat{\gamma }(\varvec{k})\) are the Fourier transforms of \(\kappa (\varvec{\theta })\) and \(\gamma (\varvec{\theta })=\gamma _1(\varvec{\theta })+i\gamma _2(\varvec{\theta })\), respectively. In practical applications, one may assume \(\hat{\kappa }(0)=0\) if the angular size of the observed shear field is sufficiently large, so that the mean convergence across the data field can be approximated as zero. Otherwise, one must explicitly account for the boundary conditions imposed by the observed shear field to perform a mass reconstruction on a finite field (e.g., Kaiser 1995; Seitz and Schneider 1996; Bartelmann et al. 1996; Seitz and Schneider 1997; Umetsu and Futamase 2000).
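The Fourier-space inversion can be illustrated with a few lines of NumPy. This is a minimal sketch on a periodic square grid, assuming \(\hat{\kappa }(0)=0\) and ignoring noise, masks, and field boundaries; the toy Gaussian convergence map and the function names are illustrative and not taken from any published pipeline.

```python
import numpy as np

def ks_kernel(n):
    """Fourier kernel D(k) = (k1^2 - k2^2 + 2i k1 k2)/|k|^2, set to 0 at k = 0."""
    k = np.fft.fftfreq(n)
    k1, k2 = np.meshgrid(k, k, indexing="ij")
    ksq = k1 ** 2 + k2 ** 2
    ksq[0, 0] = 1.0  # avoid 0/0; the numerator vanishes at k = 0
    return (k1 ** 2 - k2 ** 2 + 2j * k1 * k2) / ksq

def kappa_to_gamma(kappa):
    """Forward relation: complex shear map from a convergence map."""
    return np.fft.ifft2(ks_kernel(kappa.shape[0]) * np.fft.fft2(kappa))

def gamma_to_kappa(gamma):
    """Linear inversion; real part is the E mode, imaginary part the B mode."""
    return np.fft.ifft2(np.conj(ks_kernel(gamma.shape[0])) * np.fft.fft2(gamma))

# Toy lens (assumed): a circular Gaussian convergence peak on a 64x64 grid.
n = 64
x = np.arange(n) - n / 2
x1, x2 = np.meshgrid(x, x, indexing="ij")
kappa_true = 0.1 * np.exp(-(x1 ** 2 + x2 ** 2) / (2.0 * 5.0 ** 2))

gamma = kappa_to_gamma(kappa_true)
kappa_rec = gamma_to_kappa(gamma)
print("max |kappa_rec - (kappa_true - mean)| =",
      np.abs(kappa_rec.real - (kappa_true - kappa_true.mean())).max())
```

Because the \(\varvec{k}=0\) mode is set to zero, the recovered map equals the input only up to an additive constant, here the mean of the input map.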
In Fig. 6, we show the shape distortion field in the rich cluster Cl0024+1654 (\(z_l=0.395\)) obtained by Umetsu et al. (2010) from deep weak-lensing observations taken with Suprime-Cam on the 8.2 m Subaru telescope. They accounted and corrected for the effect of the weight function used for calculating noisy galaxy shapes, as well as for the anisotropic and smearing effects of the point spread function (PSF), using an improved implementation of the modified Kaiser et al. (1995, hereafter KSB) method (see Sect. 3.4.2). In the left panel of Fig. 7, we show the \(\kappa (\varvec{\theta })\) field reconstructed from the Subaru weak-lensing data (see Fig. 6). A prominent mass peak is visible in the cluster center, around which the distortion pattern is clearly tangential (Fig. 6). In this study, a variant of the linear KS93 algorithm was used to reconstruct the \(\kappa \) map from the weak shear lensing data. In the right panel of Fig. 7, we show the member galaxy distribution \(\varSigma _n(\varvec{\theta })\) in the cluster. Overall, mass and light are similarly distributed in the cluster.
Figure 8 shows the projected mass distribution in the very nearby Coma cluster (\(z_l=0.0236\)) reconstructed from a \(4\,\hbox {deg}^2\) weak-lensing survey of cluster subhalos based on Subaru Suprime-Cam observations (Okabe et al. 2014). In the figure, the weak-lensing mass map is compared to the luminosity and number density distributions of spectroscopically identified cluster members, as well as to the projected large-scale structure model based on galaxy–galaxy lensing with the light-tracing-mass assumption. The projected mass and galaxy distributions in the Coma cluster are correlated well with each other. Thanks to the large angular extension of the Coma cluster, Okabe et al. (2014) measured the weak-lensing masses of 32 cluster subhalos down to the order of \(10^{-3}\) of the cluster virial mass.
3.1.3 Mass-sheet degeneracy
Adding a constant mass sheet to \(\kappa (\varvec{\theta })\) in the shear-to-mass formula (46) does not change the shear field \(\gamma (\varvec{\theta })\) that is observable in the weak-lensing limit. This leads to a degeneracy of solutions for the weak-lensing mass inversion problem, which is referred to as the mass-sheet degeneracy (Falco et al. 1985; Gorenstein et al. 1988; Schneider and Seitz 1995).
As we shall see in Sect. 3.4, in general, the observable quantity for weak shear lensing is not the shear \(\gamma \), but the reduced shear:
\[ g(\varvec{\theta }) = \frac{\gamma (\varvec{\theta })}{1-\kappa (\varvec{\theta })} \tag{48} \]
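in the subcritical regime where \(\mathrm {det}\mathscr{A}>0\) (or \(1/g^*\) in the negative-parity region with \(\mathrm {det}\mathscr{A}<0\)). We see that the \(g(\varvec{\theta })\) field is invariant under the following global transformation:
\[ \kappa (\varvec{\theta }) \rightarrow \lambda \kappa (\varvec{\theta }) + 1 - \lambda , \quad \gamma (\varvec{\theta }) \rightarrow \lambda \gamma (\varvec{\theta }), \tag{49} \]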
with an arbitrary scalar constant \(\lambda \ne 0\) (Schneider and Seitz 1995). This transformation is equivalent to scaling the Jacobian matrix \(\mathscr{A}(\varvec{\theta })\) with \(\lambda \), \(\mathscr {A}(\varvec{\theta }) \rightarrow \lambda \mathscr{A}(\varvec{\theta })\). It should be noted that this transformation leaves the location of the critical curves (\(\mathrm {det}\mathscr{A}(\varvec{\theta })=0\)) invariant as well. Moreover, the location of the curve defined by \(\kappa (\varvec{\theta })=1\), on which the distortion locally disappears, is left invariant under the transformation (Eq. (49)). A general conclusion is that all mass reconstruction methods based on shape information alone can determine the \(\kappa \) field only up to a one-parameter family (\(\lambda \) or \(\kappa _0\)) of linear transformations (Eq. (49)).
In principle, this degeneracy can be broken or alleviated, for example, by measuring the magnification factor \(\mu \) in the subcritical regime (i.e., outside the critical curves; see Umetsu 2013), because \(\mu \) transforms under the invariance transformation (Eq. (49)) as:
\[ \mu (\varvec{\theta }) \rightarrow \lambda ^{-2}\mu (\varvec{\theta }). \tag{50} \]
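The invariance of the reduced shear and the \(\lambda ^{-2}\) scaling of the magnification can be verified numerically. The following sketch applies the mass-sheet transformation \(\kappa \rightarrow \lambda \kappa +1-\lambda \), \(\gamma \rightarrow \lambda \gamma \) to a few arbitrary sample values; the numbers and function names are made up for illustration.

```python
import numpy as np

def reduced_shear(kappa, gamma):
    """g = gamma / (1 - kappa) in the subcritical regime."""
    return gamma / (1.0 - kappa)

def magnification(kappa, gamma):
    """mu = 1 / [(1 - kappa)^2 - |gamma|^2]."""
    return 1.0 / ((1.0 - kappa) ** 2 - np.abs(gamma) ** 2)

def mass_sheet_transform(kappa, gamma, lam):
    """Apply kappa -> lam*kappa + 1 - lam and gamma -> lam*gamma."""
    return lam * kappa + 1.0 - lam, lam * gamma

# Arbitrary illustrative values (not measurements).
kappa = np.array([0.05, 0.10, 0.30])
gamma = np.array([0.03 + 0.01j, -0.05 + 0.08j, 0.10 - 0.20j])
lam = 0.8

kappa_t, gamma_t = mass_sheet_transform(kappa, gamma, lam)
g_before, g_after = reduced_shear(kappa, gamma), reduced_shear(kappa_t, gamma_t)
mu_before, mu_after = magnification(kappa, gamma), magnification(kappa_t, gamma_t)
print("g unchanged:", np.allclose(g_before, g_after))
print("mu ratio   :", mu_after / mu_before)  # equals 1/lam^2 for every entry
```

Shape data alone therefore cannot distinguish the two lens models, whereas a magnification measurement can.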
3.1.4 Nonlinear mass reconstruction
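Following Seitz and Schneider (1995), we generalize the KS93 algorithm to include the nonlinear but subcritical regime (outside the critical curves). To this end, we express the KS93 inversion formula in terms of the observable reduced shear \(g(\varvec{\theta })\). Substituting \(\gamma =g(1-\kappa )\) in Eq. (44), we have the following integral equation:
\[ \kappa (\varvec{\theta }) - \kappa _0 = \frac{1}{\pi }\int {\text{d}}^2\theta '\, D^*(\varvec{\theta }-\varvec{\theta }')\,g(\varvec{\theta }')\left[ 1-\kappa (\varvec{\theta }')\right] . \tag{51} \]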
For a given \(g(\varvec{\theta })\) field, this nonlinear equation can be solved iteratively, for example, by initially setting \(\kappa (\varvec{\theta })=0\) everywhere (Seitz and Schneider 1995).
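Equivalently, Eq. (51) can be formally expressed as a power series expansion (Umetsu et al. 1999):
\[ \kappa (\varvec{\theta }) - \kappa _0 = (1-\kappa _0)\left( \hat{\mathcal{G}} - \hat{\mathcal{G}}\circ \hat{\mathcal{G}} + \hat{\mathcal{G}}\circ \hat{\mathcal{G}}\circ \hat{\mathcal{G}} - \cdots \right) = (1-\kappa _0)\sum _{n=1}^{\infty }(-1)^{n-1}\hat{\mathcal{G}}^n, \tag{52} \]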
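where \(\hat{\mathcal{G}}\) is the convolution operator defined by:
\[ \hat{\mathcal{G}}(\varvec{\theta },\varvec{\theta }') = \frac{1}{\pi }\int {\text{d}}^2\theta '\, D^*(\varvec{\theta }-\varvec{\theta }')\,g(\varvec{\theta }')\times . \tag{53} \]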
Here, \(\hat{\mathcal{G}}(\mathbf {\theta },\mathbf {\theta }^{\prime})\) acts on a function of \(\varvec{\theta }^{\prime}\). The KS93 algorithm corresponds to the first-order approximation to this power series expansion in the weak-lensing limit. Note that solutions for nonlinear mass reconstructions suffer from the generalized mass-sheet degeneracy, as explicitly shown in Eq. (52).
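The iterative scheme can be sketched as a fixed-point iteration on a periodic grid: starting from \(\kappa =0\), each step applies the linear inversion to \(g(1-\kappa )\). The toy example below is noise free and fixes the mass-sheet constant by construction, since the input map is shifted to zero mean and the \(\varvec{k}=0\) mode is dropped; both choices are assumptions of this illustration rather than part of the published method.

```python
import numpy as np

def ks_kernel(n):
    """Fourier kernel D(k) = (k1^2 - k2^2 + 2i k1 k2)/|k|^2, set to 0 at k = 0."""
    k = np.fft.fftfreq(n)
    k1, k2 = np.meshgrid(k, k, indexing="ij")
    ksq = k1 ** 2 + k2 ** 2
    ksq[0, 0] = 1.0
    return (k1 ** 2 - k2 ** 2 + 2j * k1 * k2) / ksq

def forward(kappa, kernel):
    """Complex shear from convergence."""
    return np.fft.ifft2(kernel * np.fft.fft2(kappa))

def linear_invert(gamma, kernel):
    """Linear (KS93-type) inversion, keeping the E-mode (real) part."""
    return np.fft.ifft2(np.conj(kernel) * np.fft.fft2(gamma)).real

n = 64
kernel = ks_kernel(n)
x = np.arange(n) - n / 2
x1, x2 = np.meshgrid(x, x, indexing="ij")
kappa_true = 0.3 * np.exp(-(x1 ** 2 + x2 ** 2) / (2.0 * 5.0 ** 2))
kappa_true -= kappa_true.mean()          # zero-mean map: fixes kappa_0 = 0

g = forward(kappa_true, kernel) / (1.0 - kappa_true)  # "observed" reduced shear

kappa_lin = linear_invert(g, kernel)     # first-order (linear) estimate
kappa = np.zeros((n, n))                 # initial guess kappa = 0
for _ in range(30):
    kappa = linear_invert(g * (1.0 - kappa), kernel)

print("linear error   :", np.abs(kappa_lin - kappa_true).max())
print("iterative error:", np.abs(kappa - kappa_true).max())
```

The first pass of the loop reproduces the linear estimate, and subsequent passes remove the bias caused by treating \(g\) as \(\gamma \). Convergence here relies on \(|g|<1\) everywhere, i.e., on the subcritical regime.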
Note that there is another class of mass inversion algorithms that uses maximum-likelihood and Bayesian approaches to obtain a mass map solution and its error covariance matrix from weak-lensing data (e.g., Bartelmann et al. 1996; Bradač et al. 2006; Merten et al. 2009).
3.2 E/B decomposition
The shear matrix \(\varGamma (\varvec{\theta })=\gamma _1(\varvec{\theta })\sigma _3+\gamma _2(\varvec{\theta })\sigma _1\) that describes a spin-2 anisotropy can be expressed as a sum of two components corresponding to the number of degrees of freedom. By introducing two scalar fields \(\psi _E(\varvec{\theta })\) and \(\psi _B(\varvec{\theta })\), we decompose the shear matrix \(\varGamma _{ij}\) (\(i,j=1,2\)) into two independent modes as (Crittenden et al. 2002):
\[ \varGamma _{ij} = \varGamma ^{(E)}_{ij} + \varGamma ^{(B)}_{ij}, \]
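with
\[ \varGamma ^{(E)}_{ij} = \left( \partial _i\partial _j - \delta _{ij}\frac{1}{2}\triangle \right) \psi _E, \quad \varGamma ^{(B)}_{ij} = \frac{1}{2}\left( \epsilon _{kj}\partial _i\partial _k + \epsilon _{ki}\partial _j\partial _k\right) \psi _B, \]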
where \(\epsilon _{ij}\) is the Levi–Civita symbol in two dimensions, defined such that \(\epsilon _{11}=\epsilon _{22}=0\), \(\epsilon _{12}=-\epsilon _{21}=1\). Here, the first term associated with \(\psi _E\) is a gradient or scalar E component and the second term with \(\psi _B\) is a curl or pseudoscalar B component.
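The shear components \((\gamma _1,\gamma _2)\) are written in terms of \(\psi _E\) and \(\psi _B\) as:
\[ \gamma _1 = \varGamma _{11} = -\varGamma _{22} = \frac{1}{2}\left( \psi _{E,11}-\psi _{E,22}\right) - \psi _{B,12}, \quad \gamma _2 = \varGamma _{12} = \varGamma _{21} = \psi _{E,12} + \frac{1}{2}\left( \psi _{B,11}-\psi _{B,22}\right) . \]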
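As we have discussed in Sect. 3.1.1, the spin-2 \(\gamma (\varvec{\theta })\) field is coordinate dependent and transforms as \(\gamma ^{\prime}=\gamma e^{-2i\varphi }\) under a rotation of the basis axes by \(\varphi \). The E and B components can be extracted from the shear matrix as:
\[ \partial _i\partial _j\varGamma _{ij} = \triangle \kappa _E, \quad \epsilon _{ij}\partial _i\partial _k\varGamma _{jk} = \triangle \kappa _B, \tag{58} \]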
where we have defined the E and B fields, \(\kappa _E=(1/2)\triangle \psi _E\) and \(\kappa _B=(1/2)\triangle \psi _B\), respectively. This technique is referred to as the E/B-mode decomposition. We see from Eq. (58) that the relations between E/B fields and spin-2 fields are intrinsically nonlocal. Remembering that the shear matrix due to weak lensing is given as \(\varGamma _{ij}=(\partial _i\partial _j-\delta _{ij}\triangle /2)\psi (\varvec{\theta })\) (\(i,j=1,2\)), we identify \(\psi _E(\varvec{\theta })=\psi (\varvec{\theta })\) and \(\psi _B(\varvec{\theta })=0\). Hence, for a lensing-induced shear field, the E-mode signal is related to the convergence \(\kappa \), i.e., the surface mass density of the lens, while the B-mode signal is identically zero.
Figure 9 illustrates characteristic distortion patterns from E-mode (curl-free) and B-mode (divergence-free) fields. Weak lensing only produces curl-free E-mode signals, so that the presence of divergence-free B modes can be used as a null test for systematic effects. In the weak-lensing regime, a tangential E-mode pattern is produced by a positive mass overdensity (e.g., halos), while a radial E-mode pattern is produced by a negative mass overdensity (e.g., cosmic voids).
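Now, we turn to the issue of E/B-mode reconstructions from the spin-2 shear field. Rewriting Eq. (58) in terms of the complex shear \(\gamma \), we find:
\[ \triangle \kappa _E = {\mathfrak {R}}\left( 2\hat{\mathcal{D}}^*\gamma \right) , \quad \triangle \kappa _B = {\mathfrak {I}}\left( 2\hat{\mathcal{D}}^*\gamma \right) , \tag{59} \]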
where \({\mathfrak {R}}(Z)\) and \({\mathfrak {I}}(Z)\) denote the real part and the imaginary part of a complex variable Z, respectively. Defining \(\kappa \equiv \kappa _E+i\kappa _B\), we see that the first of Eq. (59) is identical to the mass inversion formula (Eq. (42)). The B-mode convergence \(\kappa _B\) can thus be simply obtained as the imaginary part of Eq. (44), which is expected to vanish for a purely weak-lensing signal. Moreover, the second of Eq. (59) indicates that the transformation \(\gamma ^{\prime}(\varvec{\theta })=i\gamma (\varvec{\theta })\) (\(\gamma _1^{\prime}=-\gamma _2, \gamma _2'=\gamma _1\)) is equivalent to an interchange operation of the E and B modes of the original maps by \(\kappa _E'(\varvec{\theta })=-\kappa _B(\varvec{\theta })\) and \(\kappa '_B(\varvec{\theta })=\kappa _E(\varvec{\theta })\). Since \(\gamma \) is a spin-2 field that transforms as \(\gamma '=\gamma e^{-2i\varphi }\), this operation is also equivalent to a rotation of each ellipticity by \(\pi /4\) with each position vector fixed.
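The E/B interchange under \(\gamma \rightarrow i\gamma \) is easy to demonstrate numerically. The sketch below builds a pure E-mode shear field from a toy convergence map on a periodic grid, reconstructs \(\kappa _E+i\kappa _B\) with the Fourier-space inversion, and repeats the reconstruction after multiplying the shear by \(i\). The map and function names are illustrative assumptions.

```python
import numpy as np

def ks_kernel(n):
    """Fourier kernel D(k) = (k1^2 - k2^2 + 2i k1 k2)/|k|^2, set to 0 at k = 0."""
    k = np.fft.fftfreq(n)
    k1, k2 = np.meshgrid(k, k, indexing="ij")
    ksq = k1 ** 2 + k2 ** 2
    ksq[0, 0] = 1.0
    return (k1 ** 2 - k2 ** 2 + 2j * k1 * k2) / ksq

def eb_maps(gamma):
    """Return (kappa_E, kappa_B) as the real and imaginary parts of the inversion."""
    kernel = ks_kernel(gamma.shape[0])
    kappa = np.fft.ifft2(np.conj(kernel) * np.fft.fft2(gamma))
    return kappa.real, kappa.imag

n = 64
x = np.arange(n) - n / 2
x1, x2 = np.meshgrid(x, x, indexing="ij")
kappa_in = 0.1 * np.exp(-(x1 ** 2 + x2 ** 2) / 50.0)
kappa_in -= kappa_in.mean()

gamma = np.fft.ifft2(ks_kernel(n) * np.fft.fft2(kappa_in))  # pure E-mode shear

k_e, k_b = eb_maps(gamma)               # lensing signal: B mode vanishes
k_e_rot, k_b_rot = eb_maps(1j * gamma)  # every ellipticity rotated by 45 degrees
print("max |B| (lensing)  :", np.abs(k_b).max())
print("max |E| (rotated)  :", np.abs(k_e_rot).max())
```

After the rotation, the original E-mode map reappears in the B-mode channel, which is the basis of the 45-degree rotation null test.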
Note that gravitational lensing can induce B modes, for example, when multiple deflections of light are involved (Sect. 2.5). However, these B modes can be generated at higher orders and the B-mode contributions coming from multiple deflections are suppressed by a large factor compared to the E-mode contributions (see, e.g., Krause and Hirata 2010). In real observations, intrinsic ellipticities of background galaxies also contribute to weak-lensing shear estimates. Assuming that intrinsic ellipticities have random orientations in projection space, such an isotropic ellipticity distribution will yield statistically identical contributions to the E and B modes. Therefore, the B-mode signal provides a useful null test for systematic effects in weak-lensing observations (Fig. 9).
3.3 Flexion
Flexion is introduced as the next higher order lensing effects responsible for an arc-like and weakly skewed appearance of lensed galaxies (Goldberg and Bacon 2005; Bacon et al. 2006) observed in a regime between weak and strong lensing (i.e., a nonlinear but subcritical regime). Such higher order lensing effects occur when \(\kappa (\varvec{\theta })\) and \(\gamma (\varvec{\theta })\) are not spatially constant across a source galaxy image. By taking higher order derivatives of the lensing potential \(\psi (\varvec{\theta })\), we can work with higher order transformations of galaxy shapes by weak lensing (e.g., Massey et al. 2007b; Okura et al. 2007, 2008; Goldberg and Leonard 2007; Schneider and Er 2008; Viola et al. 2012).
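The third-order derivatives of \(\psi (\varvec{\theta })\) can be combined to form a pair of complex flexion fields as (Bacon et al. 2006):
\[ F := \frac{1}{2}\partial \partial \partial ^*\psi = \partial \kappa , \quad G := \frac{1}{2}\partial \partial \partial \psi = \partial \gamma . \tag{60} \]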
The first flexion F has spin-1 and the second flexion G has spin-3. The two complex flexion fields satisfy the following consistency relation:
\[ \partial ^*\partial G = \partial \partial F. \]
Figure 10 illustrates the characteristic weak-lensing distortions with different spin values for an intrinsically circular Gaussian source (Bacon et al. 2006).
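If the angular size of an image is small compared to the characteristic scale over which \(\psi (\varvec{\theta })\) varies, we can locally expand Eq. (13) to the next higher order as:
\[ {\text{d}}\beta _i \simeq \mathscr{A}_{ij}\,{\text{d}}\theta _j + \frac{1}{2}\mathscr{A}_{ij,k}\,{\text{d}}\theta _j\,{\text{d}}\theta _k, \]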
where \(\mathscr{A}_{ij,k}=-\psi _{,ijk}\) (\(i,j,k=1,2\)). The \(\mathscr{A}_{ij,k}\) matrix can be expressed with a sum of two terms:
\[ \mathscr{A}_{ij,k} = F_{ijk} + G_{ijk}, \]
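with the spin-1 part \({F}_{ijk}\) and the spin-3 part \({G}_{ijk}\) defined by:
\[ F_{ij1} = -\frac{1}{2}\begin{pmatrix} 3F_1 & F_2 \\ F_2 & F_1 \end{pmatrix}, \quad F_{ij2} = -\frac{1}{2}\begin{pmatrix} F_2 & F_1 \\ F_1 & 3F_2 \end{pmatrix}, \]
\[ G_{ij1} = -\frac{1}{2}\begin{pmatrix} G_1 & G_2 \\ G_2 & -G_1 \end{pmatrix}, \quad G_{ij2} = -\frac{1}{2}\begin{pmatrix} G_2 & -G_1 \\ -G_1 & -G_2 \end{pmatrix}. \]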
Flexion has a dimension of inverse (angular) length, so that the flexion effects depend on the angular size of the source image. That is, the smaller the source image, the larger the amplitude of intrinsic flexion contributions (Okura et al. 2008). The shape quantities affected by the first flexion F alone have spin-1 properties, while those by the second flexion G alone have spin-3 properties.
Note that, as in the case of the spin-2 shear field, what is directly observable from higher order image distortions are the reduced flexion effects, \(F/(1-\kappa )\) and \(G/(1-\kappa )\) (Okura et al. 2007, 2008; Goldberg and Leonard 2007; Schneider and Er 2008), a consequence of the mass-sheet degeneracy.
From Eq. (60), the inversion equations from flexion to \(\kappa \) can be obtained as follows (Bacon et al. 2006):
\[ \kappa + iB = \triangle ^{-1}\partial ^*F, \quad \kappa + iB = \triangle ^{-2}\partial ^*\partial ^*\partial ^*G, \]
where the complex part iB describes the B-mode component that can be used to assess the noise properties of weak-lensing data (e.g., Okura et al. 2008). An explicit representation for the inversion equations is obtained in Fourier space as:
for \(\varvec{k}\ne 0\).
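In principle, one can combine independent mass reconstructions \(\widehat{\kappa }_a(\varvec{k})\) \((a=\gamma ,{F}, {G})\) linearly in Fourier space to improve the statistical significance with minimum noise variance weighting as (Okura et al. 2007):
\[ \widehat{\kappa }(\varvec{k}) = \frac{\sum _a \widehat{W}_{\kappa a}(\varvec{k})\,\widehat{\kappa }_a(\varvec{k})}{\sum _a \widehat{W}_{\kappa a}(\varvec{k})}, \tag{69} \]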
where \(\widehat{W}_{\kappa a}(\varvec{k}) = 1/P^{(N)}_{\kappa a}(\varvec{k})\) with \(P^{(N)}_{\kappa a}(\varvec{k})\) the two-dimensional noise power spectrum of \(\kappa \) reconstructed using the observable a:
with \(P^{(N)}_a(\varvec{k})\) the shot noise power, \(\sigma _a\) the shape noise dispersion, and \(n_a\) the mean surface number density of background source galaxies, for the observable a (\(a=\gamma , F, G\)). Assuming that errors in \(\widehat{\kappa }_a(\varvec{k})\) between different observables are independent, the noise power spectrum for the estimator (Eq. (69)) is obtained as (Okura et al. 2007):
\[ P^{(N)}_{\kappa }(\varvec{k}) = \frac{1}{\sum _a \widehat{W}_{\kappa a}(\varvec{k})} = \frac{1}{\sum _a 1/P^{(N)}_{\kappa a}(\varvec{k})}. \tag{71} \]
Figure 11 shows the \(\kappa \) field in the central region of the rich cluster Abell 1689 (\(z_l=0.183\)) reconstructed from the spin-1 flexion alone (Okura et al. 2008) measured with Subaru Suprime-Cam data. Okura et al. (2008) used measurements of higher order lensing image characteristics (HOLICs) introduced by Okura et al. (2007). Their analysis accounted for the effect of the weight function used for calculating noisy shape moments, as well as for higher order PSF effects. One can employ the assumption of random orientations for intrinsic HOLICs of background galaxies to obtain a direct estimate of flexion, in a similar manner to the usual prescription for weak shear lensing. Okura et al. (2008) utilized the Fourier-space relation (Eq. (68)) between \(F(\varvec{\theta })\) and \(\kappa (\varvec{\theta })\) with the linear weak-lensing approximation. The B-mode convergence field was used to monitor the reconstruction error in the \(\kappa \) map. The reconstructed \(\kappa \) map exhibits a bimodal feature in the central region of the cluster. The pronounced main peak is associated with the brightest cluster galaxy (BCG) and central cluster members, while the secondary mass peak is associated with a local concentration of bright galaxies.
Note that, as discussed in Viola et al. (2012), there is a cross-talk between shear and flexion arising from shear–flexion coupling terms, which makes quantitative measurements of flexion challenging.
3.4 Shear observables
Since the pioneering work of Kaiser et al. (1995), numerous methods have been proposed and implemented in the literature to accurately extract the lensing signal from noisy pixelized images of background galaxies (e.g., Kuijken 1999; Bridle et al. 2002; Bernstein and Jarvis 2002; Refregier 2003; Hirata and Seljak 2003; Miller et al. 2007). On the other hand, considerable progress has been made in understanding and controlling systematic biases in noisy shear estimates by relying on realistic galaxy image simulations (e.g., Heymans et al. 2006; Massey et al. 2007a; Refregier et al. 2012; Kacprzak et al. 2012; Mandelbaum et al. 2014, 2015, 2018a).
Here, we will review the basic idea and essential aspects of the moment-based KSB formalism. We refer the reader to Mandelbaum (2018) for a recent exhaustive review on the subject.
3.4.1 Ellipticity transformation by weak lensing
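In a moment-based approach to weak-lensing shape measurements, we use quadrupole moments \(Q_{ij}\) (\(i,j=1,2\)) of the surface brightness distribution \(I(\varvec{\theta })\) of background galaxy images to quantify the shape of the images as (Kaiser et al. 1995):
\[ Q_{ij} := \frac{\int {\text{d}}^2\theta \, q_I[I(\varvec{\theta })]\,\varDelta \theta _i\varDelta \theta _j}{\int {\text{d}}^2\theta \, q_I[I(\varvec{\theta })]}, \tag{72} \]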
where \(q_I[I(\varvec{\theta })]\) is a weight function and \(\varDelta \theta _i = \theta _i-\overline{\theta }_i\) denotes the offset vector from the image centroid. Here, we assume that the weight \(q_I\) does not explicitly depend on \(\varvec{\theta }\), but is set by the local value of the brightness \(I(\varvec{\theta })\) (Bartelmann and Schneider 2001). The trace of \(Q_{ij}\) describes the angular size of the image, while the traceless part describes the shape and orientation of the image. With the quadrupole moments \(Q_{ij}\), we define the complex ellipticity \(e=e_1+ie_2\) as^{Footnote 7}:
\[ e := \frac{Q_{11}-Q_{22}+2iQ_{12}}{Q_{11}+Q_{22}}. \tag{73} \]
For an ellipse with a minor-to-major axis ratio of \(q \; (\leqslant 1)\), \(|e|=(1-q^2)/(1+q^2)\).
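The relation between the axis ratio and the moment-based ellipticity can be checked on a pixel grid. The sketch below measures the quadrupole moments of a noise-free elliptical Gaussian image, using the brightness itself as the weight (\(q_I[I]=I\)), which is one simple choice made here for illustration. The image size, axis ratio, and position angle are arbitrary assumed values.

```python
import numpy as np

def complex_ellipticity(image, x1, x2):
    """e = (Q11 - Q22 + 2i Q12)/(Q11 + Q22) from brightness-weighted moments."""
    w = image / image.sum()
    c1, c2 = (w * x1).sum(), (w * x2).sum()   # image centroid
    d1, d2 = x1 - c1, x2 - c2
    q11 = (w * d1 * d1).sum()
    q22 = (w * d2 * d2).sum()
    q12 = (w * d1 * d2).sum()
    return (q11 - q22 + 2j * q12) / (q11 + q22)

# Elliptical Gaussian with major-axis scale a, axis ratio q, position angle phi.
n = 201
x = np.arange(n) - n // 2
x1, x2 = np.meshgrid(x, x, indexing="ij")
a, q, phi = 12.0, 0.5, np.deg2rad(30.0)
u = x1 * np.cos(phi) + x2 * np.sin(phi)    # coordinate along the major axis
v = -x1 * np.sin(phi) + x2 * np.cos(phi)   # coordinate along the minor axis
image = np.exp(-0.5 * ((u / a) ** 2 + (v / (q * a)) ** 2))

e = complex_ellipticity(image, x1, x2)
print("|e| measured :", abs(e))
print("|e| expected :", (1 - q ** 2) / (1 + q ** 2))
print("position angle [deg]:", np.rad2deg(np.angle(e) / 2))
```

The phase of \(e\) is twice the position angle of the major axis, reflecting the spin-2 nature of the ellipticity.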
The spin-2 ellipticity e (Eq. (73)) transforms under the lens mapping as:
\[ e^{(s)} = \frac{e - 2g + g^2e^*}{1 + |g|^2 - 2{\mathfrak {R}}(ge^*)}, \tag{74} \]
where \(e^{(s)}\) is the unlensed intrinsic ellipticity and \(g = \gamma /(1 - \kappa )\) is the spin-2 reduced shear. Since e is a nonzero spin quantity with a direction dependence, the expectation value of the intrinsic source ellipticity \(e^{(s)}\) is assumed to vanish, i.e., \(\mathcal{E}(e^{(s)})=0\), where \(\mathcal{E}(X)\) denotes the expectation value of X. Schneider and Seitz (1995) showed that Eq. (74) with the condition \(\mathcal{E}(e^{(s)}) = 0\) is equivalent to:
\[ 0 = \sum _n w_n\,\frac{e_n - \delta _g}{1 - {\mathfrak {R}}(e_n\delta _g^*)}, \tag{75} \]
where \(e_n\) is the image ellipticity for the nth object, \(w_n\) is a statistical weight for the nth object, and \(\delta _g\) is the spin-2 complex distortion (Schneider and Seitz 1995):
\[ \delta _g := \frac{2g}{1+|g|^2}. \tag{76} \]
Note that the complex distortion \(\delta _g\) is invariant under the transformation \(g\rightarrow 1/g^*\).
For an intrinsically circular source with \(e^{(s)}=0\), we have:
\[ e = \frac{2g}{1+|g|^2}. \tag{77} \]
On the other hand, in the weak-lensing limit (\(\kappa , \gamma \ll 1\)), Eq. (74) reduces to \(e^{(s)} \simeq e-2g \simeq e-2\gamma \). Assuming random orientations of source galaxies, we average observed ellipticities over a local ensemble of source galaxies to obtain:
\[ g \simeq \gamma \simeq \frac{\langle e\rangle }{2}. \tag{78} \]
For an input signal of \(g=0.1\), Eq. (77) yields \(e\approx 0.198\). Hence, the weak-lensing approximation (Eq. (78)) gives a reduced-shear estimate of \(g^\mathrm {(est)}\approx 0.099\), corresponding to a bias of \(-1\%\). For \(g=0.2\) in the mildly nonlinear regime, Eq. (78) gives \(g^\mathrm {(est)}\approx 0.192\), corresponding to a bias of \(-4\%\).
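These numbers follow directly from \(e=2g/(1+|g|^2)\) for a circular source and the linear estimator \(g^\mathrm {(est)}=e/2\). A minimal sketch of the arithmetic, with illustrative function and variable names:

```python
def distortion(g):
    """Image ellipticity of an intrinsically circular source: e = 2g/(1 + g^2)."""
    return 2.0 * g / (1.0 + g * g)

results = {}
for g_true in (0.1, 0.2):
    e = distortion(g_true)
    g_est = e / 2.0                  # weak-lensing (linear) estimator
    bias = g_est / g_true - 1.0      # fractional bias of the linear estimator
    results[g_true] = (e, g_est, bias)
    print(f"g = {g_true}: e = {e:.3f}, g_est = {g_est:.3f}, bias = {100 * bias:.1f}%")
```

The fractional bias equals \(-g^2/(1+g^2)\), so it grows quadratically with the signal and stays at the percent level only for \(g\lesssim 0.1\).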
In real observations, the reduced shear g may be estimated from a local ensemble average of background galaxies as \(\langle g \rangle \simeq \langle e\rangle /2\). The statistical uncertainty in the reduced-shear estimate \(\langle g\rangle \) decreases with increasing the number of background galaxies N (see Sect. 4.2 for more details) as \(\propto \sigma /\sqrt{N}\), with \(\sigma \) the dispersion of background image ellipticities (dominated by the intrinsic shape noise). Weak-lensing analysis thus requires a large number of background galaxies to increase the statistical significance of the shear measurements.
3.4.2 The KSB algorithm: a moment-based approach
For a practical application of weak shear lensing, we must account for various observational and instrumental effects, such as the impact of noise on the galaxy shape measurement (both statistical and systematic uncertainties), the isotropic smearing component of the PSF, and the effect of instrumental PSF anisotropy. Therefore, one cannot simply use Eq. (78) to measure the shear signal from observational data.
A more robust estimate of the shape moments (Eq. (72)) is obtained using a weight function \(W(\varvec{\theta })\) that depends explicitly on the separation \(\varvec{\theta }\) from the image centroid. In the KSB approach, a circular Gaussian that is matched to the size of each object is used as a weight function (Kaiser et al. 1995). The quadrupole moments obtained with such a weight function \(W(\varvec{\theta })\) suffer from an additional smearing and do not obey the transformation law (Eq. (74)). Therefore, the expectation value \(\mathcal{E}(e)\) of the image ellipticity is different from the distortion \(\delta _g = 2g/(1+|g|^2)\) (see Eq. (77)).
The KSB formalism (Kaiser et al. 1995; Hoekstra et al. 1998) accounts explicitly for the Gaussian weight function used for measuring noisy shape moments, the effect of spin-2 PSF anisotropy, and the effect of isotropic PSF smearing. The KSB formalism and its variants assume that the PSF can be described as an isotropic function convolved with a small anisotropic kernel. In the limit of linear response to lensing and instrumental anisotropies, KSB derived the transformation law between intrinsic (unlensed) and observed (lensed) complex ellipticities, \(e^{(s)}\) and e, respectively. The linear transformation between intrinsic and observed complex ellipticities can be formally expressed as (Kaiser et al. 1995; Hoekstra et al. 1998; Bartelmann and Schneider 2001):
\[ e_i = e^{(s)}_i + (C^g)_{ij}\,g_j + (C^q)_{ij}\,q_j \quad (i,j=1,2), \tag{79} \]
where \(q_i\) denotes the spin-2 PSF anisotropy kernel, \((C^q)_{ij}\) is a linear response matrix for the PSF anisotropy \(q_i\), and \((C^g)_{ij}\) is a linear response matrix for the reduced shear \(g_i\). The PSF anisotropy kernel and the response matrices can be calculated from observable weighted shape moments of galaxies and stellar objects (Kaiser et al. 1995; Bartelmann and Schneider 2001; Erben et al. 2001). In real observations, the PSF anisotropy kernel \(q(\varvec{\theta })\) can be estimated from image ellipticities \(e^*\) observed for a sample of foreground stars for which \(e^{(s)}\) and g vanish, so that \(q_i(\varvec{\theta })=(C^q)^{-1}_{ij}e^*_{j}\).
Assuming that the expectation value of the intrinsic source ellipticity vanishes, \(\mathcal{E}[e^{(s)}]=0\), we find the following linear relation between the reduced shear and the ensemble-averaged image ellipticity:
\[ g_i = \mathcal{E}\left[ (C^g)^{-1}_{ij}\left( e_j - (C^q)_{jk}\,q_k \right) \right]. \tag{80} \]
In the KSB formalism, the shear response matrix \(C^g\) is denoted as \(P^g\) (or \(P^\gamma \)) and dubbed the pre-seeing shear polarizability. Similarly, \(C^q\) is denoted as \(P^\mathrm {sm}\) and dubbed the smear polarizability.
A careful calibration of the signal response \(P^g\) is essential for any weak shear lensing analysis that relies on accurate shape measurements from galaxy images (see Mandelbaum 2018). The levels of shear calibration bias are often quantified in terms of a multiplicative bias factor m and an additive calibration offset c through the following relation between the true input shear signal, \(g^\mathrm {true}\), and the recovered signal, \(g^\mathrm {obs}\) (Heymans et al. 2006; Massey et al. 2007a; Mandelbaum et al. 2014):
\[ g^\mathrm{obs} = (1+m)\,g^\mathrm{true} + c. \tag{81} \]
The original KSB formalism, when applied to noisy observations, is known to suffer from systematic biases that depend primarily on the size and the detection signal-to-noise ratio (S/N) of galaxy images (e.g., Erben et al. 2001; Refregier et al. 2012). Different variants of the Kaiser et al. (1995) method (KSB+) have been developed and implemented in the literature primarily to study mass distributions of high-mass galaxy clusters (e.g., Hoekstra et al. 1998, 2015; Clowe et al. 2004; Umetsu et al. 2010, 2014; Oguri et al. 2012; von der Linden et al. 2014a; Okabe and Smith 2016; Schrabback et al. 2018). Note that KSB+ pipelines calibrated against realistic image simulations of crowded fields can achieve a \(\sim 2\%\) shear calibration accuracy even in the cluster lensing regime (e.g., Herbonnet et al. 2019; Hernández-Martín et al. 2020).
3.5 Tangential and cross-shear components
As we have seen in Sect. 3.1, the spin-2 shear components \(\gamma _1\) and \(\gamma _2\) are coordinate dependent, defined relative to a reference Cartesian coordinate frame (chosen by the observer). It is useful to consider components of the shear that are coordinate independent with respect to a certain reference point, such as the cluster center.
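As a minimal sketch of how the multiplicative and additive biases might be estimated from image simulations, the snippet below fits a straight line to pairs of input and recovered shears. It assumes the commonly used linear form \(g^\mathrm{obs} = (1+m)\,g^\mathrm{true} + c\); the function name `fit_shear_bias` and the toy input values are hypothetical, and a real calibration would also propagate measurement errors.

```python
def fit_shear_bias(g_true, g_obs):
    """Least-squares fit of g_obs = (1 + m) * g_true + c; returns (m, c)."""
    n = len(g_true)
    mean_t = sum(g_true) / n
    mean_o = sum(g_obs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(g_true, g_obs))
    var = sum((t - mean_t) ** 2 for t in g_true)
    slope = cov / var
    return slope - 1.0, mean_o - slope * mean_t

# Toy data with a known bias (m = -0.02, c = 0.001), used only as a check.
g_in = [-0.10, -0.05, 0.0, 0.05, 0.10]
g_out = [(1.0 - 0.02) * g + 0.001 for g in g_in]
m, c = fit_shear_bias(g_in, g_out)
print(round(m, 4), round(c, 4))
```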
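We define a polar coordinate system (\(\vartheta , \varphi \)) centered on an arbitrary point \(\varvec{\theta }_\mathrm {c}\) on the sky, such that \(\varvec{\theta }=(\vartheta \cos \varphi ,\vartheta \sin \varphi )+\varvec{\theta }_\mathrm {c}\). The convergence \(\overline{\kappa }(\vartheta )\) averaged within a circle of radius \(\vartheta \) about \(\varvec{\theta }_\mathrm {c}\) is then expressed as:
\[ \overline{\kappa}(\vartheta) = \frac{2}{\vartheta^2}\int_0^{\vartheta}\text{d}\vartheta'\,\vartheta'\,\kappa(\vartheta') \equiv \varSigma_\mathrm{cr}^{-1}\,\overline{\varSigma}(\vartheta), \qquad \kappa(\vartheta) = \oint\frac{\text{d}\varphi}{2\pi}\,\kappa(\vartheta,\varphi) \equiv \varSigma_\mathrm{cr}^{-1}\,\varSigma(\vartheta), \tag{82} \]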
where \(\overline{\varSigma }(\vartheta )\) is the surface mass density averaged within a circle of radius \(\vartheta \) about \(\varvec{\theta }_\mathrm {c}\) and \(\varSigma (\vartheta )\) is the surface mass density azimuthally averaged over a circle of radius \(\vartheta \) about \(\varvec{\theta }_\mathrm {c}\). The reference point \(\varvec{\theta }_\mathrm {c}\) can be taken to be the cluster center, which can be determined from symmetry of the strong-lensing pattern, the X-ray centroid position, or the BCG position.
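Let us introduce the tangential and \(45^\circ \)-rotated cross-shear components, \(\gamma _+(\varvec{\theta })\) and \(\gamma _\times (\varvec{\theta })\), respectively, defined relative to the position \(\varvec{\theta }_\mathrm {c}\) as:
\[ \gamma_+(\varvec{\theta}) = -\gamma_1(\varvec{\theta})\cos 2\varphi - \gamma_2(\varvec{\theta})\sin 2\varphi, \qquad \gamma_\times(\varvec{\theta}) = +\gamma_1(\varvec{\theta})\sin 2\varphi - \gamma_2(\varvec{\theta})\cos 2\varphi, \tag{83} \]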
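which are directly observable in the weak-lensing limit where \(\kappa ,\gamma \ll 1\) (see Sect. 3.4). Using the two-dimensional version of Gauss’ theorem, we find the following identity for an arbitrary choice of \(\varvec{\theta }_\mathrm {c}\) (Kaiser 1995):
\[ \gamma_+(\vartheta) \equiv \oint\frac{\text{d}\varphi}{2\pi}\,\gamma_+(\vartheta,\varphi) = \varSigma_\mathrm{cr}^{-1}\,\varDelta\varSigma(\vartheta), \qquad \gamma_\times(\vartheta) \equiv \oint\frac{\text{d}\varphi}{2\pi}\,\gamma_\times(\vartheta,\varphi) = 0, \tag{84} \]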
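where we have defined the excess surface mass density \(\varDelta \varSigma (\vartheta )\) around \(\varvec{\theta }_\mathrm {c}\) as a function of \(\vartheta \) by (Miralda-Escudé 1991):
\[ \varDelta\varSigma(\vartheta) = \overline{\varSigma}(\vartheta) - \varSigma(\vartheta). \tag{85} \]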
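From Eqs. (82) and (84), we find:
\[ \gamma_+(\vartheta) = \overline{\kappa}(\vartheta) - \kappa(\vartheta) = -\frac{1}{2}\frac{\text{d}\overline{\kappa}(\vartheta)}{\text{d}\ln\vartheta}. \tag{86} \]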
Equation (84) shows that, given an arbitrary circular loop of radius \(\vartheta \) around the chosen center \(\varvec{\theta }_\mathrm {c}\), the tangential and cross-shear components averaged around the loop extract E-mode and B-mode distortion patterns, respectively (Sect. 3.2).
An important implication of the first of Eq. (84) is that, with tangential shear measurements from individual source galaxies (see Sect. 3.4), one can directly determine the azimuthally averaged \(\varDelta \varSigma (\vartheta )\) profile around lenses in the weak-lensing regime, even if the mass distribution \(\varSigma (\varvec{\theta })\) is not axis-symmetric around \(\varvec{\theta }_\mathrm {c}\). Moreover, the second of Eq. (84) tells us that the azimuthally averaged \(\times \) component, or the B-mode signal, is expected to be statistically consistent with zero if the signal is due to weak lensing. Therefore, a measurement of the B-mode signal \(\langle g_\times (\vartheta )\rangle \) provides a useful null test against systematic errors.
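The rotation of the Cartesian shear components into tangential and cross components can be sketched as follows. The sign convention used here, \(g_+ = -(g_1\cos 2\varphi + g_2\sin 2\varphi )\) and \(g_\times = g_1\sin 2\varphi - g_2\cos 2\varphi \), is the one commonly adopted in the cluster-lensing literature and is an assumption of this sketch; the function name and arguments are hypothetical.

```python
import math

def tangential_cross(g1, g2, dx, dy):
    """Rotate (g1, g2) into tangential and cross components.

    (dx, dy) is the source position relative to the chosen center,
    so that phi = atan2(dy, dx) is the position angle of the source.
    """
    phi = math.atan2(dy, dx)
    c2, s2 = math.cos(2.0 * phi), math.sin(2.0 * phi)
    g_plus = -(g1 * c2 + g2 * s2)
    g_cross = g1 * s2 - g2 * c2
    return g_plus, g_cross

# A source on the +x axis stretched along y (g1 < 0) is tangentially aligned.
print(tangential_cross(-0.1, 0.0, 1.0, 0.0))
```

A purely tangential pattern yields a positive `g_plus` and a vanishing `g_cross`, which is the basis of the null test described above.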
3.6 Reduced tangential shear
3.6.1 Azimuthally averaged reduced tangential shear
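The reduced tangential shear \(g_+(\vartheta )\) averaged over a circle of radius \(\vartheta \) about an arbitrary reference point \(\varvec{\theta }_\mathrm {c}\) is expressed as:
\[ \langle g_+(\vartheta)\rangle = \oint\frac{\text{d}\varphi}{2\pi}\,\frac{\gamma_+(\vartheta,\varphi)}{1-\kappa(\vartheta,\varphi)}. \tag{87} \]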
If the projected mass distribution around a cluster has quasi-circular symmetry (e.g., elliptical symmetry), then the azimuthally averaged reduced tangential shear \(\langle g_+(\vartheta )\rangle \) around the cluster center can be interpreted as:
\[ \langle g_+(\vartheta)\rangle \simeq \frac{\gamma_+(\vartheta)}{1-\kappa(\vartheta)}, \tag{88} \]
where \(\gamma _+(\vartheta )\) and \(\kappa (\vartheta )\) are the tangential shear and the convergence, respectively, averaged over a circle of radius \(\vartheta \) about \(\varvec{\theta }_\mathrm {c}\).
According to N-body simulations in hierarchical \({\varLambda }\hbox {CDM}\) models of cosmic structure formation, dark-matter halos exhibit aspherical mass distributions that can be well approximated by triaxial mass models (e.g., Jing and Suto 2002; Limousin et al. 2013; Despali et al. 2014). Since triaxial halos have elliptical isodensity contours in projection on the sky (Stark 1977), Eq. (88) can give a good approximation to describe the weak-lensing signal for regular clusters with a modest degree of perturbation. However, the approximation is likely to break down for merging and interacting lenses having complex, multimodal mass distributions. To properly model the weak-lensing signal in such a complex merging system, one needs to directly model the two-dimensional reduced-shear field \((g_1(\varvec{\theta }),g_2(\varvec{\theta }))\) with a lens model consisting of multi-component halos (e.g., Watanabe et al. 2011; Okabe et al. 2011; Medezinski et al. 2013). Alternatively, one may attempt to reconstruct the convergence field \(\kappa (\varvec{\theta })\) in a free-form manner from the observed reduced shear field, with additional constraints or assumptions to break the mass-sheet degeneracy (e.g., Jee et al. 2005; Bradač et al. 2006; Merten et al. 2009; Jauzac et al. 2012; Umetsu et al. 2015; Tam et al. 2020).
On the other hand, for a statistical ensemble of galaxy clusters, the average mass distribution around their centers tends to be spherically symmetric if the assumption of statistical isotropy holds (e.g., Okabe et al. 2013). Hence, the stacked weak-lensing signal for a statistical ensemble of clusters can be interpreted using Eq. (88). For more details, see Sects. 3.6.4 and 4.5.
3.6.2 Source-averaged reduced tangential shear
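With the assumption of quasi-circular symmetry in the projected mass distribution around clusters (see Eq. (88)), let us consider the nonlinear effects on the source-averaged cluster lensing profiles. The reduced tangential shear for a given lens–source pair is written as:
\[ g_+(\vartheta|z_l,z_s) = \frac{\varSigma_\mathrm{cr}^{-1}(z_l,z_s)\,\varDelta\varSigma(\vartheta)}{1-\varSigma_\mathrm{cr}^{-1}(z_l,z_s)\,\varSigma(\vartheta)}. \tag{89} \]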
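To begin with, let us consider the expectation value for the reduced tangential shear averaged over an ensemble of source galaxies. For a given cluster, the source-averaged reduced tangential shear is expressed as:
\[ \langle g_+(\vartheta|z_l)\rangle = \left\langle \frac{\varSigma_\mathrm{cr}^{-1}\,\varDelta\varSigma(\vartheta)}{1-\varSigma_\mathrm{cr}^{-1}\,\varSigma(\vartheta)}\right\rangle = \varDelta\varSigma(\vartheta)\sum_{k=0}^{\infty}\langle\varSigma_{\mathrm{cr},l}^{-(k+1)}\rangle\,\varSigma^{k}(\vartheta), \tag{90} \]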
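where \(\langle \cdots \rangle \) denotes the averaging over all sources, defined such that:
\[ \langle\varSigma_{\mathrm{cr},l}^{-n}\rangle = \frac{\sum_s w_s\,\varSigma_\mathrm{cr}^{-n}(z_l,z_s)}{\sum_s w_s}, \tag{91} \]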
where the index s runs over all source galaxies around the lens (l) and \(w_s\) is a statistical weight for each source galaxy. An optimal choice for the statistical weight is \(w_{s} = 1 / \sigma _{g_{+},s}^{2}\), with \(\sigma _{g_{+},s}\) the statistical uncertainty of \(g_+(\vartheta |z_l,z_s)\) estimated for each source galaxy. Note that this choice for the weight assumes that \(\sigma _{g_{+},s}\) is independent of the lensing shear signal (see Schneider and Seitz 1995; Seitz and Schneider 1995). In the continuous limit, Eq. (91) is written as:
\[ \langle\varSigma_{\mathrm{cr},l}^{-n}\rangle = \left[\int_0^\infty \text{d}z\,\frac{\text{d}N(z)}{\text{d}z}\,w(z)\,\varSigma_\mathrm{cr}^{-n}(z_l,z)\right]\left[\int_0^\infty \text{d}z\,\frac{\text{d}N(z)}{\text{d}z}\,w(z)\right]^{-1}, \tag{92} \]
with \(\text {d}N(z)/\text {d}z\) the redshift distribution function of the source sample and w(z) a statistical weight function. For a given cluster lens, \(\varSigma _\mathrm {cr}^{-1}(z_l,z_s)\propto D_{ls}/D_s\), so that \(\langle \varSigma _{\mathrm {cr},l}^{-n}\rangle \propto \langle (D_{ls}/D_s)^{n}\rangle \).
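In the weak-lensing limit, Eq. (90) gives \(\langle g_+\rangle \simeq \langle \gamma _+\rangle \). The next order of approximation is given by (Seitz and Schneider 1997):
\[ \langle g_+(\vartheta|z_l)\rangle \simeq \frac{\langle\gamma_+(\vartheta)\rangle}{1-f_l\,\langle\kappa(\vartheta)\rangle}, \qquad \langle\gamma_+\rangle = \langle\varSigma_{\mathrm{cr},l}^{-1}\rangle\,\varDelta\varSigma, \quad \langle\kappa\rangle = \langle\varSigma_{\mathrm{cr},l}^{-1}\rangle\,\varSigma, \quad f_l = \frac{\langle\varSigma_{\mathrm{cr},l}^{-2}\rangle}{\langle\varSigma_{\mathrm{cr},l}^{-1}\rangle^{2}}. \tag{93} \]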
From Eq. (93), we see that an interpretation of the averaged weak-lensing signal \(\langle g_+(\vartheta |z_l)\rangle \) does not require knowledge of individual source redshifts. Instead, it requires ensemble information regarding the statistical redshift distribution dN(z)/dz of background source galaxies used for weak-lensing measurements.
For a lens at sufficiently low redshift (see Sect. 2.6.5), \(f_l\approx 1\), thus leading to the single source-plane approximation: \(\langle g_+\rangle \simeq \langle \gamma _+\rangle /(1-\langle \kappa \rangle )\). The level of bias introduced by this approximation is \(\varDelta \langle g_+\rangle /\langle g_+\rangle \simeq (f_l-1)\langle \kappa \rangle \). In typical ground-based deep observations of \(z_l\sim 0.4\) clusters, \(\varDelta f_l=f_l-1\) is found to be of the order of several percent (Umetsu et al. 2014), so that the relative error \(\varDelta \langle g_+\rangle /\langle g_+\rangle \) is negligibly small in the mildly nonlinear regime of cluster lensing.
3.6.3 Source-averaged excess surface mass density
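The effect of the source redshift spread on the nonlinear correction can be checked numerically. The sketch below assumes that \(f_l\) is defined as \(\langle \varSigma _\mathrm {cr}^{-2}\rangle /\langle \varSigma _\mathrm {cr}^{-1}\rangle ^2\) and compares three estimates of the source-averaged reduced tangential shear at one radius: the exact weighted average, the approximation \(\langle \gamma _+\rangle /(1-f_l\langle \kappa \rangle )\), and the single source-plane form. The function name and the toy inputs (in arbitrary but mutually consistent units) are hypothetical.

```python
def source_averaged_shear(delta_sigma, sigma, inv_sigma_cr, weights):
    """Return (exact, with_f_l, single_plane) estimates of <g_+>.

    delta_sigma, sigma : excess and local surface mass density
    inv_sigma_cr       : per-source values of 1 / Sigma_cr
    weights            : per-source statistical weights
    """
    w_sum = sum(weights)
    mean1 = sum(w * x for w, x in zip(weights, inv_sigma_cr)) / w_sum
    mean2 = sum(w * x * x for w, x in zip(weights, inv_sigma_cr)) / w_sum
    f_l = mean2 / mean1 ** 2
    gamma, kappa = mean1 * delta_sigma, mean1 * sigma
    exact = sum(w * x * delta_sigma / (1.0 - x * sigma)
                for w, x in zip(weights, inv_sigma_cr)) / w_sum
    return exact, gamma / (1.0 - f_l * kappa), gamma / (1.0 - kappa)

# Three sources at different redshifts, i.e. different lensing efficiencies.
exact, with_f, single = source_averaged_shear(0.3, 0.5, [0.2, 0.3, 0.4], [1.0, 1.0, 1.0])
print(round(exact, 5), round(with_f, 5), round(single, 5))
```

In this toy case the \(f_l\)-corrected estimate lies closer to the exact average than the single source-plane value, consistent with the ordering of approximations described in the text.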
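Next, let us consider the following estimator for the excess surface mass density \(\varDelta \varSigma (\vartheta )\) for a given lens–source pair:
\[ \varDelta\varSigma_+(\vartheta|z_l,z_s) = \varSigma_\mathrm{cr}(z_l,z_s)\,g_+(\vartheta|z_l,z_s) = \frac{\varDelta\varSigma(\vartheta)}{1-\varSigma_\mathrm{cr}^{-1}(z_l,z_s)\,\varSigma(\vartheta)}. \tag{94} \]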
This assumes that an estimate of \(\varSigma _\mathrm {cr}^{-1}(z_l,z_s)\) for each individual source galaxy is available, for example, from photometric-redshift (photo-z) measurements. This estimator is widely used in recent cluster weak-lensing studies thanks to the availability of multi-band imaging data and the advances in photo-z techniques (see Sect. 4.1).
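In real observations, if the photo-z probability distribution function (PDF), \(P_s(z)\), is available for individual source galaxies (s), one can calculate:
\[ \langle\varSigma_{\mathrm{cr},ls}^{-1}\rangle = \frac{\int_0^\infty \text{d}z\,\varSigma_\mathrm{cr}^{-1}(z_l,z)\,P_s(z)}{\int_0^\infty \text{d}z\,P_s(z)}, \tag{95} \]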
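averaged over the PDF for each source galaxy. Similarly to Eq. (90), \(\varDelta \varSigma _+(\vartheta |z_l,z_s)\) averaged over all sources is expressed as:
\[ \langle\varDelta\varSigma_+(\vartheta|z_l)\rangle = \varDelta\varSigma(\vartheta)\sum_{k=0}^{\infty}\langle\varSigma_{\mathrm{cr},l}^{-k}\rangle\,\varSigma^{k}(\vartheta), \tag{96} \]
with
\[ \langle\varSigma_{\mathrm{cr},l}^{-k}\rangle = \frac{\sum_s w_{ls}\,\varSigma_\mathrm{cr}^{-k}(z_l,z_s)}{\sum_s w_{ls}}, \tag{97} \]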
where the index s runs over all source galaxies around the lens (l) and \(w_{ls}\) is a statistical weight for each source galaxy. An optimal choice for the statistical weight is:
\[ w_{ls} = \frac{\varSigma_\mathrm{cr}^{-2}(z_l,z_s)}{\sigma_{g_+,s}^{2}}, \tag{98} \]
where \(\sigma _{g_{+},s}\) is the statistical uncertainty of \(g_+(\vartheta |z_l,z_s)\) estimated for each source galaxy (Sect. 3.6.2).
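In the weak-lensing limit, we thus have \(\langle \varDelta \varSigma _+\rangle \simeq \varDelta \varSigma \). The next order of approximation is:
\[ \langle\varDelta\varSigma_+(\vartheta|z_l)\rangle \simeq \frac{\varDelta\varSigma(\vartheta)}{1-\langle\varSigma_{\mathrm{cr},l}^{-1}\rangle\,\varSigma(\vartheta)}. \tag{99} \]
3.6.4 Lens–source-averaged excess surface mass density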
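Finally, we consider an ensemble of galaxy clusters. Now, let \(\varDelta \varSigma \) be the ensemble mass distribution of these clusters. Then, \(\varDelta \varSigma _+\) averaged over all lens–source (ls) pairs is expressed as (Johnston et al. 2007):
\[ \langle\langle\varDelta\varSigma_+(\vartheta)\rangle\rangle = \varDelta\varSigma(\vartheta)\sum_{k=0}^{\infty}\langle\langle\varSigma_\mathrm{cr}^{-k}\rangle\rangle\,\varSigma^{k}(\vartheta), \tag{100} \]
with
\[ \langle\langle\varSigma_\mathrm{cr}^{-k}\rangle\rangle = \frac{\sum_{l,s} w_{ls}\,\varSigma_\mathrm{cr}^{-k}(z_l,z_s)}{\sum_{l,s} w_{ls}}, \tag{101} \]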
where \(\langle \langle \cdots \rangle \rangle \) denotes the averaging over all lens–source pairs, the double summation is taken over all clusters (l) and all source galaxies (s), and \(w_{ls}\) is a statistical weight for each lens–source pair (ls). An optimal choice for the statistical weight is given by Eq. (98).
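Again, the weak-lensing limit yields \(\langle \langle \varDelta \varSigma _+\rangle \rangle \simeq \varDelta \varSigma \) and the next order of approximation is given by (Umetsu et al. 2014, 2020):
\[ \langle\langle\varDelta\varSigma_+(\vartheta)\rangle\rangle \simeq \frac{\varDelta\varSigma(\vartheta)}{1-\langle\langle\varSigma_\mathrm{cr}^{-1}\rangle\rangle\,\varSigma(\vartheta)}. \tag{102} \]
Equation (102) can be used to interpret the stacked weak-lensing signal including the nonlinear regime of cluster lensing. In Sect. 4.5, we provide more details on the stacked weak-lensing methods.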
3.7 Aperture mass densitometry
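In this subsection, we introduce a nonparametric technique to infer a projected total mass estimate from weak shear lensing observations. Integrating Eq. (86) between two concentric radii \(\vartheta _\mathrm {in}\) and \(\vartheta _\mathrm {out}(>\vartheta _\mathrm {in})\), we obtain an expression for the \(\zeta \) statistic as (Fahlman et al. 1994; Kaiser 1995; Squires and Kaiser 1996):
\[ \zeta(\vartheta_\mathrm{in},\vartheta_\mathrm{out}) \equiv \frac{2}{1-(\vartheta_\mathrm{in}/\vartheta_\mathrm{out})^2}\int_{\vartheta_\mathrm{in}}^{\vartheta_\mathrm{out}}\text{d}\ln\vartheta\,\gamma_+(\vartheta) = \overline{\kappa}(\vartheta_\mathrm{in}) - \overline{\kappa}(\vartheta_\mathrm{in},\vartheta_\mathrm{out}), \tag{103} \]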
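where \(\overline{\kappa }(\vartheta _\mathrm {in},\vartheta _\mathrm {out})\) is the convergence averaged within a concentric annulus between \(\vartheta _\mathrm {in}\) and \(\vartheta _\mathrm {out}\):
\[ \overline{\kappa}(\vartheta_\mathrm{in},\vartheta_\mathrm{out}) = \frac{2}{\vartheta_\mathrm{out}^2-\vartheta_\mathrm{in}^2}\int_{\vartheta_\mathrm{in}}^{\vartheta_\mathrm{out}}\text{d}\vartheta\,\vartheta\,\kappa(\vartheta). \tag{104} \]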
In the weak-lensing regime where \(\gamma _+(\vartheta )\simeq g_+(\vartheta )\), \(\zeta \) can be determined uniquely from the shape distortion field in a finite annular region at \(\vartheta _\mathrm {in}\leqslant \vartheta \leqslant \vartheta _\mathrm {out}\), because additive constants \(\kappa _0\) from the invariance transformation (Eq. (49)) cancel out in Eq. (103). Note that this technique is also referred to as aperture mass densitometry.
Since galaxy clusters are highly biased tracers of the cosmic mass distribution, \(\overline{\kappa }(\vartheta _\mathrm {in},\vartheta _\mathrm {out})\) around a cluster is expected to be positive, so that \(\zeta (\vartheta _\mathrm {in},\vartheta _\mathrm {out})\) yields a lower limit to \(\overline{\kappa }(\vartheta _\mathrm {in})\). That is, the quantity \(M_\zeta \equiv \pi (D_l\vartheta _\mathrm {in})^2 \varSigma _\mathrm {cr}\zeta (\vartheta _\mathrm {in},\vartheta _\mathrm {out})\) yields a lower limit to the projected mass inside a circular aperture of radius \(\vartheta _\mathrm {in}\), \(M_\mathrm {2D}=\pi (D_l\vartheta _\mathrm {in})^2\overline{\varSigma }(\vartheta _\mathrm {in})\). This technique provides a powerful means to estimate the total cluster mass from shear data in the weak-lensing regime \(\gamma \ll 1\).
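The \(\zeta \) statistic can be checked against a case with a closed-form answer. The sketch below assumes the standard form \(\zeta = 2\,[1-(\vartheta _\mathrm {in}/\vartheta _\mathrm {out})^2]^{-1}\int \text {d}\ln \vartheta \,\gamma _+(\vartheta )\) and applies it to a singular isothermal sphere, for which \(\kappa =\gamma _+=\vartheta _\mathrm {E}/(2\vartheta )\). The function names, the integration scheme (midpoint rule in \(\ln \vartheta \)), and the numerical values are illustrative choices.

```python
import math

def zeta_statistic(gamma_t, theta_in, theta_out, n_steps=2000):
    """Integrate gamma_t over ln(theta) in the annulus and normalize to zeta."""
    ln_in, ln_out = math.log(theta_in), math.log(theta_out)
    d_ln = (ln_out - ln_in) / n_steps
    integral = 0.0
    for i in range(n_steps):
        theta = math.exp(ln_in + (i + 0.5) * d_ln)  # midpoint in ln(theta)
        integral += gamma_t(theta) * d_ln
    return 2.0 * integral / (1.0 - (theta_in / theta_out) ** 2)

THETA_E = 0.5  # illustrative Einstein radius, in the same units as theta

def sis_shear(theta):
    """Tangential shear of a singular isothermal sphere."""
    return THETA_E / (2.0 * theta)

zeta = zeta_statistic(sis_shear, 2.0, 10.0)
# Closed form for the SIS: mean kappa inside theta_in minus mean kappa in the annulus.
expected = THETA_E / 2.0 - THETA_E / (2.0 + 10.0)
print(round(zeta, 6), round(expected, 6))
```

Because the annulus term is positive, `zeta` falls below the mean convergence inside the inner radius, illustrating the lower-limit property.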
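We now introduce a variant of aperture mass densitometry defined as (Clowe et al. 2000):
\[ \begin{aligned} \zeta_\mathrm{c}(\vartheta|\vartheta_\mathrm{in},\vartheta_\mathrm{out}) &\equiv \overline{\kappa}(\vartheta) - \overline{\kappa}(\vartheta_\mathrm{in},\vartheta_\mathrm{out}) \\ &= 2\int_{\vartheta}^{\vartheta_\mathrm{in}}\text{d}\ln\vartheta'\,\gamma_+(\vartheta') + \frac{2}{1-(\vartheta_\mathrm{in}/\vartheta_\mathrm{out})^2}\int_{\vartheta_\mathrm{in}}^{\vartheta_\mathrm{out}}\text{d}\ln\vartheta'\,\gamma_+(\vartheta'), \end{aligned} \tag{105} \]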
where the aperture radii \((\vartheta ,\vartheta _\mathrm {in},\vartheta _\mathrm {out})\) satisfy \(\vartheta< \vartheta _\mathrm {in} < \vartheta _\mathrm {out}\), and the first and second terms in the second line of Eq. (105) are equal to \(\overline{\kappa }(\vartheta )-\overline{\kappa }(\vartheta _\mathrm {in})\) and \(\overline{\kappa }(\vartheta _\mathrm {in})-\overline{\kappa }(\vartheta _\mathrm {in},\vartheta _\mathrm {out})\), respectively. In the weak-lensing limit, the quantity
\[ M_\zeta(<\vartheta) \equiv \pi (D_l\vartheta)^2\,\varSigma_\mathrm{cr}\,\zeta_\mathrm{c}(\vartheta|\vartheta_\mathrm{in},\vartheta_\mathrm{out}) \tag{106} \]
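yields a lower limit to the projected mass inside a circular aperture of radius \(\vartheta \), that is:
\[ M_\zeta(<\vartheta) \leqslant M_\mathrm{2D}(<\vartheta) = \pi (D_l\vartheta)^2\,\overline{\varSigma}(\vartheta). \tag{107} \]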
We can regard \(\zeta _\mathrm {c}(\vartheta |\vartheta _\mathrm {in},\vartheta _\mathrm {out})\) as a function of \(\vartheta \) for fixed values of \((\vartheta _\mathrm {in},\vartheta _\mathrm {out})\) and measure \(\zeta _\mathrm {c}(\vartheta |\vartheta _\mathrm {in},\vartheta _\mathrm {out})\) at several independent aperture radii \(\vartheta \). As in the case of the standard \(\zeta \) statistic (Eq. (103)), one may choose the inner and outer annular radii (\(\vartheta _\mathrm {in}, \vartheta _\mathrm {out}\)) to lie in the weak-lensing regime where \(g_+\simeq \gamma _+\). In general, however, \(\vartheta \) may lie in the nonlinear regime where \(\gamma _+\) is not directly observable. In the subcritical regime, we can express \(\gamma _+(\vartheta )\) in terms of the observed reduced tangential shear \(g_+(\vartheta )\) as:
\[ \gamma_+(\vartheta) = g_+(\vartheta)\,[1-\kappa(\vartheta)], \tag{108} \]
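when assuming a quasi-circular symmetry in the projected mass distribution (Sect. 3.6). If these conditions are satisfied, for a given boundary condition \(\overline{\kappa }_0\equiv \overline{\kappa }(\vartheta _\mathrm {in},\vartheta _\mathrm {out})\), Eq. (105) can be solved iteratively as (Umetsu et al. 2010):
\[ \begin{aligned} \zeta_\mathrm{c}^{(n+1)}(\vartheta) &= 2\int_{\vartheta}^{\vartheta_\mathrm{in}}\text{d}\ln\vartheta'\,g_+(\vartheta')\,[1-\kappa^{(n)}(\vartheta')] + \frac{2}{1-(\vartheta_\mathrm{in}/\vartheta_\mathrm{out})^2}\int_{\vartheta_\mathrm{in}}^{\vartheta_\mathrm{out}}\text{d}\ln\vartheta'\,g_+(\vartheta'), \\ \kappa^{(n+1)}(\vartheta) &= \hat{\mathcal{L}}\,\zeta_\mathrm{c}^{(n+1)}(\vartheta) + \overline{\kappa}_0, \end{aligned} \tag{109} \]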
where we have introduced a differential operator defined as \(\hat{\mathcal{L}}(\vartheta ) = \frac{1}{2\vartheta ^2}\frac{{\text{d}}}{\text {d}\ln \vartheta }\vartheta ^2\) that satisfies \(\hat{\mathcal{L}}\overline{\kappa }(\vartheta )=\kappa (\vartheta )\) and \(\hat{\mathcal{L}}1 = 1\), and the quantities indexed by (n) refer to those in the nth iteration (\(n=0,1,2,\ldots \)).
We solve a discretized version of Eq. (109). See Appendix A of Umetsu et al. (2016) for discretized expressions for \(g_+(\vartheta )\) and \(\overline{\kappa }(\vartheta )\). One can start the iteration process with an initial guess of \(\kappa ^{(0)}(\vartheta )=0\) for all \(\vartheta \) bins and repeat it until convergence is reached in all \(\vartheta \) bins. This procedure will yield a solution for the binned mass profile:
\[ \overline{\kappa}(\vartheta) = \zeta_\mathrm{c}(\vartheta|\vartheta_\mathrm{in},\vartheta_\mathrm{out}) + \overline{\kappa}_0 \tag{110} \]
for a given value of \(\overline{\kappa }_0\). Note that the errors for the mass profile solution in different radial bins are correlated. The bin-to-bin error covariance matrix \(C_{bb'}\equiv \mathrm {Cov}[\overline{\kappa }(\vartheta _b),\overline{\kappa }(\vartheta _{b'})]\) (\(b,b'=1,2,\ldots \)) can be calculated with the linear approximation \(\kappa (\vartheta )\ll 1\) in Eq. (109), by propagating the errors for the binned \(g_+(\vartheta )\) profile (e.g., Okabe and Umetsu 2008; Umetsu et al. 2010; Okabe et al. 2010).
Alternatively, one can attempt to determine the boundary term \(\overline{\kappa }_0\) from shear data by incorporating additional iteration loops. Starting with an initial guess of \(\overline{\kappa }_0=0\), one can update the value of \(\overline{\kappa }_0\) in each iteration using a specific mass model (e.g., a power-law profile) that best fits the binned \(\overline{\kappa }(\vartheta )\) profile. This iteration procedure is repeated until convergence is obtained (see Umetsu et al. 2010).
4 Standard shear analysis methods
In this section, we outline procedures to obtain cluster mass estimates from azimuthally averaged reduced tangential shear measurements for a given galaxy cluster.
4.1 Background source selection
A critical source of systematics in weak lensing comes from accurately estimating the redshift distribution of background source galaxies, which is needed to convert the lensing signal into physical mass units (Medezinski et al. 2018b). Contamination of background samples by unlensed foreground and cluster galaxies with \(\varSigma _\mathrm {cr}^{-1}(z_l,z_s)=0\), when not accounted for, leads to a systematic underestimation of the true lensing signal. Inclusion of foreground galaxies produces a dilution of the lensing signal that does not depend on the cluster-centric radius. In contrast, the inclusion of cluster galaxies significantly dilutes the lensing signal at smaller cluster radii and causes an underestimation of the concentration of the cluster mass profile (Broadhurst et al. 2005b), as well as of the halo mass \(M_\varDelta \), especially at higher overdensities \(\varDelta \). The level of contamination by cluster galaxies increases with the cluster mass or richness (see Fig. 12). A secure selection of background galaxies is thus key for obtaining accurate cluster mass estimates from weak gravitational lensing (Medezinski et al. 2007, 2010, 2018b; Umetsu and Broadhurst 2008; Okabe et al. 2013; Gruen et al. 2014).
In real observations, acquiring spectroscopic redshifts for individual source galaxies is not feasible, particularly to the depths of weak-lensing observations. Instead of spectroscopic redshifts, photo-zs can be used when multi-band imaging is available. Cluster weak-lensing studies, however, often rely on two to three optical bands for deep imaging (e.g., Broadhurst et al. 2005b; Medezinski et al. 2010; Oguri et al. 2012; Okabe and Smith 2016), so that reliable photo-zs could not be obtained. Instead, well-calibrated field photo-z catalogs, such as COSMOS (Ilbert et al. 2009; Laigle et al. 2016), were used to determine the redshift distribution \(\text {d}N(z)/\text {d}z\) of background galaxies for a given color–magnitude selection (Medezinski et al. 2010; Okabe et al. 2010). Such field surveys are often limited to deep but small areas and thus subject to cosmic variance.
Dedicated wide-area optical surveys, such as the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP; Miyazaki et al. 2018a; Aihara et al. 2018a, b), the Dark Energy Survey (DES; Abbott et al. 2018), and the upcoming Large Synoptic Survey Telescope (LSST; Ivezic et al. 2008), are designed to observe in several broad bands, so that photo-zs are better determined. These photo-z estimates will still suffer from a large fraction of outliers due to inherent color–redshift degeneracies, as limited by a finite number of broad optical bands. The photo-z uncertainties are folded in by incorporating the full PDF for each source galaxy (Applegate et al. 2014). However, photo-z PDFs are often sensitive to the assumed priors. Moreover, the accuracy of photo-z PDFs will be limited by the representability of spectroscopic-redshift samples used for calibration. Alternative approaches rely on more stringent color cuts to reject objects with biased photo-zs (Medezinski et al. 2010, 2011; Umetsu et al. 2010, 2012, 2014; Okabe et al. 2013), which, however, lead to lower statistical power, because they result in lower source galaxy densities.
Using the first-year CAMIRA (Cluster finding Algorithm based on Multi-band Identification of Red-sequence gAlaxies; Oguri et al. 2018) catalog of \(\sim 900\) clusters (\(0.1<z_l<1.1\)) with richness \(N \geqslant 20\) found in \(\sim 140\,\hbox {deg}^2\) of HSC-SSP survey data, Medezinski et al. (2018b) investigated robust source-selection methods for cluster weak lensing. They compared three different source-selection schemes: (1) relying on photo-z’s and their full PDFs P(z) to correct for dilution (all), (2) selecting background galaxies in color–color space (CC-cut), and (3) selection of robust photo-z’s by applying constraints on their cumulative PDF (P-cut). All three methods use photo-z PDFs of individual source galaxies, P(z), to convert the lensing signals into physical mass units. With perfect P(z) information, all these methods should thus yield consistent, undiluted \(\langle \langle \varDelta \varSigma _+(R)\rangle \rangle \) profiles. After applying basic quality cuts, Medezinski et al. (2018b) found the typical mean unweighted galaxy number density in the HSC shape catalog to be \(n_\mathrm {g}=21.7\,\hbox {arcmin}^{-2}\). Similarly, they found \(n_\mathrm {g}=11.6\,\hbox {arcmin}^{-2}\) and \(n_\mathrm {g}=13.8\,\hbox {arcmin}^{-2}\) for cluster lenses at \(z_l<0.4\) using the CC-cut and P-cut methods, respectively.
Medezinski et al. (2018b) showed that simply relying on the photo-z PDFs (all) results in a \(\langle \langle \varDelta \varSigma _+\rangle \rangle \) profile that suffers from dilution due to residual contamination by cluster galaxies. Using proper limits, the CC- and P-cut methods give consistent \(\langle \langle \varDelta \varSigma _+\rangle \rangle \) profiles with minimal dilution. Differences are only seen for rich clusters with \(N \geqslant 50\), where the P-cut method produces a slightly diluted signal in the innermost radial bin compared to the CC-cuts (see Fig. 12). Employing either the P-cut or CC-cut selection results in cluster contamination consistent with zero to within the 0.5% uncertainties. For more details on the source-selection methods, we refer the reader to Medezinski et al. (2018b) and references therein. An alternative approach to correct for dilution of the lensing signal is to statistically estimate the level of contamination and subtract it off (e.g., Varga et al. 2019), in which the effect of magnification bias must be properly taken into account (see Sect. 5).
4.2 Tangential shear signal
Here, we describe a procedure to derive azimuthally averaged radial profiles of the tangential (\(+\)) and cross (\(\times \)) shear components around a given cluster lens at a certain redshift, \(z_l\). Specifically, we calculate for each cluster the lensing profiles, \(\{\langle g_+(\vartheta _b)\rangle \}_{b=1}^{N_\mathrm {bin}}\) and \(\{\langle g_\times (\vartheta _b)\rangle \}_{b=1}^{N_\mathrm {bin}}\), in \(N_\mathrm {bin}\) discrete cluster-centric bins spanning the range \(\vartheta \in [\vartheta _\mathrm {min},\vartheta _\mathrm {max}]\).
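Since weak shear measurements of individual background galaxies (Eq. (80)) are very noisy, we calculate the weighted average of the source ellipticity components as:
\[ \langle g_+(\vartheta_b)\rangle = \frac{\sum_{s\in b} w_s\,g_{+,s}}{\sum_{s\in b} w_s}, \qquad \langle g_\times(\vartheta_b)\rangle = \frac{\sum_{s\in b} w_s\,g_{\times,s}}{\sum_{s\in b} w_s}, \tag{111} \]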
where the summation is taken over all source galaxies (s) that lie in the bin (b); \(g_{+,s}\) and \(g_{\times ,s}\) represent the tangential and \(45^\circ \)-rotated cross components of the reduced shear (Eq. (83)), respectively, estimated for each source galaxy; and \(w_s\) is its statistical weight. The azimuthally averaged cross component, \(\langle g_\times (\vartheta )\rangle \), is expected to be statistically consistent with zero (see Sect. 3.6.1).
The statistical uncertainty per shear component per source galaxy is denoted by \(\sigma _{g_{+},s} = \sigma _{g_{\times },s} \equiv \sigma _{g,s}\), which is dominated by the shape noise. Here, \(\sigma _{g,s}\) includes both contributions from the shape measurement uncertainty and the intrinsic dispersion of source ellipticities (e.g., Mandelbaum 2018). In general, an optimal choice for weighting is to apply an inverse-variance weighting with \(w_s = 1/\sigma _{g,s}^2\) (Sect. 3.6.2). However, using inverse-variance weights from noisy variance estimates may result in an unbalanced weighting scheme (e.g., sensitive to extreme values). To avoid this, one can employ a variant of inverse-variance weighting, \(w_s = 1/(\sigma _{g,s}^2+\alpha _g^2)\), with \(\alpha _g\) a properly chosen softening constant (see, e.g., Hamana et al. 2003; Umetsu et al. 2009, 2014; Okabe et al. 2010; Oguri et al. 2010; Okabe and Smith 2016). The error variance per shear component for \(\langle g_{+,\times }(\vartheta _b)\rangle \) is given by:
\[ \sigma_\mathrm{shape}^2(\vartheta_b) = \frac{\sum_{s\in b} w_s^2\,\sigma_{g,s}^2}{\left(\sum_{s\in b} w_s\right)^2}, \tag{112} \]
where we have assumed isotropic, uncorrelated shape noise, \(\mathcal{E}(\varDelta g_{i,s}\varDelta g_{j,s'})=\sigma _{g,s}^2\delta _{s s'}\delta _{ij}\) (\(i,j=+,\times \)) with s and \(s'\) running over all source galaxies.
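The weighted average and its shape-noise error in a single radial bin can be sketched as follows. The sketch assumes the softened inverse-variance weights \(w_s = 1/(\sigma _{g,s}^2+\alpha _g^2)\) described above and an error variance of the form \(\sum _s w_s^2\sigma _{g,s}^2/(\sum _s w_s)^2\), which follows from uncorrelated shape noise; the function name and toy values are hypothetical.

```python
import math

def binned_shear(g_plus, sigma_g, alpha_g=0.0):
    """Weighted mean of per-galaxy g_+ in one bin and its shape-noise error.

    g_plus  : per-galaxy tangential reduced-shear estimates in the bin
    sigma_g : per-galaxy uncertainty per shear component
    alpha_g : softening constant added in quadrature to the weights
    """
    weights = [1.0 / (s * s + alpha_g * alpha_g) for s in sigma_g]
    w_sum = sum(weights)
    mean = sum(w * g for w, g in zip(weights, g_plus)) / w_sum
    var = sum(w * w * s * s for w, s in zip(weights, sigma_g)) / (w_sum * w_sum)
    return mean, math.sqrt(var)

# Four galaxies with equal noise: the error reduces to sigma / sqrt(4).
print(binned_shear([0.1, 0.2, 0.0, 0.1], [0.3, 0.3, 0.3, 0.3], alpha_g=0.4))
```

With equal per-galaxy uncertainties the softening constant changes neither the mean nor the error, since it rescales all weights by the same factor.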
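To quantify the significance of the tangential shear profile measurement \(\{\langle g_+(\vartheta _b)\rangle \}_{b=1}^{N_\mathrm {bin}}\), we define a linear S/N estimator by (Sereno et al. 2017; Umetsu et al. 2020):
\[ (\mathrm{S/N})_\mathrm{L} = \frac{\sum_{b=1}^{N_\mathrm{bin}} \langle g_+(\vartheta_b)\rangle/\sigma_\mathrm{shape}^2(\vartheta_b)}{\left[\sum_{b=1}^{N_\mathrm{bin}} 1/\sigma_\mathrm{shape}^2(\vartheta_b)\right]^{1/2}}. \tag{113} \]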
This estimator gives a weak-lensing S/N integrated over the radial range of the data. Equation (113) assumes that the total uncertainty is dominated by the shape noise and ignores the covariance between different radial bins. Note that we shall use the full covariance matrix for cluster mass measurements (Sect. 4.4). This S/N estimator is different from the conventional quadratic estimator defined by (e.g., Umetsu and Broadhurst 2008; Okabe and Smith 2016):
\[ (\mathrm{S/N})_\mathrm{Q} = \left[\sum_{b=1}^{N_\mathrm{bin}} \frac{\langle g_+(\vartheta_b)\rangle^2}{\sigma_\mathrm{shape}^2(\vartheta_b)}\right]^{1/2}. \tag{114} \]
As discussed by Umetsu et al. (2016, 2020), this quadratic definition breaks down and leads to an overestimation of significance in the noise-dominated regime where the actual per-bin S/N is less than unity.
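The different behavior of the two estimators in the noise-dominated regime can be seen in a toy example. The sketch below assumes a linear estimator of the form \(\sum _b (g_b/\sigma _b^2)/[\sum _b \sigma _b^{-2}]^{1/2}\) and a quadratic estimator \([\sum _b (g_b/\sigma _b)^2]^{1/2}\); the function names and the input profile are illustrative.

```python
import math

def sn_linear(g, sigma):
    """Linear S/N: inverse-variance weighted sum of the binned signal."""
    num = sum(gb / (sb * sb) for gb, sb in zip(g, sigma))
    den = math.sqrt(sum(1.0 / (sb * sb) for sb in sigma))
    return num / den

def sn_quadratic(g, sigma):
    """Quadratic S/N: root of the summed squared per-bin S/N."""
    return math.sqrt(sum((gb / sb) ** 2 for gb, sb in zip(g, sigma)))

# A pure-noise-like profile that scatters around zero at the 1-sigma level.
g_noise = [0.1, -0.1, 0.1, -0.1]
sigma_b = [0.1, 0.1, 0.1, 0.1]
print(sn_linear(g_noise, sigma_b), sn_quadratic(g_noise, sigma_b))
```

For this profile the linear estimator averages to zero, whereas the quadratic estimator returns \(\sqrt{N_\mathrm {bin}}=2\) even though no coherent signal is present.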
The observed \(\langle g_+(\vartheta )\rangle \) profile can be interpreted according to Eq. (93). Then, it is important to define the corresponding bin radii \(\vartheta _b\) so as to minimize systematic bias in cluster mass measurements. We define the effective cluster-centric bin radius \(\vartheta _b\) (\(b=1,2,\ldots ,N_\mathrm {bin}\)) using the weighted harmonic mean of lens–source transverse separations as (Okabe and Smith 2016; Sereno et al. 2017):
\[ \vartheta_b = \left(\frac{\sum_{s\in b} w_s\,\vartheta_s^{-1}}{\sum_{s\in b} w_s}\right)^{-1}. \tag{115} \]
If source galaxies are uniformly distributed in the image plane and \(w_s\) is taken to be constant, Eq. (115) in the continuous limit yields \(\vartheta _b = [\int _{\vartheta _{b1}}^{\vartheta _{b2}}\text {d}\vartheta \,\vartheta \,w(\vartheta )][\int _{\vartheta _{b1}}^{\vartheta _{b2}}\text {d}\vartheta \,w(\vartheta )]^{-1} = (\vartheta _{b1}+\vartheta _{b2})/2\) for a single radial bin defined in the range \(\vartheta \in [\vartheta _{b1},\vartheta _{b2}]\).^{Footnote 8}
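The continuous-limit result above can be verified with a short numerical sketch of the weighted harmonic mean. To mimic a uniform surface density of sources, each radius on a regular grid is given a weight proportional to the radius itself (the number of sources in a thin ring scales with its circumference); this construction, like the function name, is an illustrative assumption.

```python
def effective_bin_radius(theta, weights):
    """Weighted harmonic mean of lens-source separations in one bin."""
    return sum(weights) / sum(w / t for w, t in zip(weights, theta))

# Regular grid of radii across the bin [1, 3]; weights proportional to radius
# stand in for a uniform source density with constant statistical weights.
n = 1000
thetas = [1.0 + (i + 0.5) * 2.0 / n for i in range(n)]
print(effective_bin_radius(thetas, thetas))
```

The result equals the arithmetic mean of the bin edges, \((1+3)/2=2\), to numerical precision.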
4.3 Lens mass modeling
4.3.1 NFW model
The radial mass distribution of galaxy clusters is often modeled with a spherical Navarro–Frenk–White (Navarro et al. 1996, hereafter NFW) profile, which has been motivated by cosmological N-body simulations (Navarro et al. 1996, 2004). The radial dependence of the two-parameter NFW density profile is given by:
\[ \rho(r) = \frac{\rho_\mathrm{s}}{(r/r_\mathrm{s})(1+r/r_\mathrm{s})^2}, \tag{116} \]
where \(\rho _\mathrm {s}\) is the characteristic density parameter and \(r_\mathrm {s}\) is the characteristic scale radius at which the logarithmic density slope, \(\gamma _\mathrm {3D}(r)\equiv \text {d}\ln {\rho (r)}/\text {d}\ln {r}\), equals \(-2\). The logarithmic gradient of the NFW profile is \(\gamma _\mathrm {3D}(r)=-[1+3(r/r_\mathrm {s})]/[1+(r/r_\mathrm {s})]\). For \(r/r_\mathrm {s}\ll 1\), \(\gamma _\mathrm {3D}\rightarrow -1\), whereas, for \(r/r_\mathrm {s}\gg 1\), \(\gamma _\mathrm {3D}\rightarrow -3\). The radial range where the logarithmic density slope is close to the “isothermal” value of \(-2\) is particularly important, given that such a mass distribution is needed to explain the flat rotation curves observed in galaxies.
The overdensity mass \(M_\varDelta \) is given by integrating Eq. (116) out to the corresponding overdensity radius \(r_\varDelta \) at which the mean interior density is \(\varDelta \times \rho _\mathrm {c}(z_l)\) (Sect. 1). For a physical interpretation of the cluster lensing signal, it is useful to specify the NFW model by the halo mass, \(M_\mathrm {200c}\), and the concentration parameter, \(c_\mathrm {200c}=r_\mathrm {200c}/r_\mathrm {s}\). The characteristic density \(\rho _\mathrm {s}\) is then given by:
\[ \rho_\mathrm{s} = \frac{200}{3}\,\frac{c_\mathrm{200c}^3}{\ln(1+c_\mathrm{200c}) - c_\mathrm{200c}/(1+c_\mathrm{200c})}\,\rho_\mathrm{c}(z_l). \tag{117} \]
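Analytic expressions for the radial dependence of the projected NFW profiles, \(\varSigma _\mathrm {NFW}(R)=2\rho _\mathrm {s}r_\mathrm {s}\times f_\mathrm {NFW}(R/r_\mathrm {s})\) and \(\overline{\varSigma }_\mathrm {NFW}(R)=2\rho _\mathrm {s}r_\mathrm {s}\times g_\mathrm {NFW}(R/r_\mathrm {s})\) with \(R=D_l\vartheta \), are given by Wright and Brainerd (2000, see also Bartelmann 1996):
\[ f_\mathrm{NFW}(x) = \begin{cases} \dfrac{1}{1-x^2}\left(-1+\dfrac{2}{\sqrt{1-x^2}}\,\mathrm{arctanh}\sqrt{\dfrac{1-x}{1+x}}\right) & (x<1), \\ \dfrac{1}{3} & (x=1), \\ \dfrac{1}{x^2-1}\left(1-\dfrac{2}{\sqrt{x^2-1}}\,\arctan\sqrt{\dfrac{x-1}{x+1}}\right) & (x>1), \end{cases} \tag{118} \]
and
\[ g_\mathrm{NFW}(x) = \begin{cases} \dfrac{2}{x^2}\left[\dfrac{2}{\sqrt{1-x^2}}\,\mathrm{arctanh}\sqrt{\dfrac{1-x}{1+x}}+\ln\left(\dfrac{x}{2}\right)\right] & (x<1), \\ 2\left[1+\ln\left(\dfrac{1}{2}\right)\right] & (x=1), \\ \dfrac{2}{x^2}\left[\dfrac{2}{\sqrt{x^2-1}}\,\arctan\sqrt{\dfrac{x-1}{x+1}}+\ln\left(\dfrac{x}{2}\right)\right] & (x>1). \end{cases} \tag{119} \]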
The excess surface mass density for an NFW halo is then obtained as \(\varDelta \varSigma _\mathrm {NFW}(R)=\overline{\varSigma }_\mathrm {NFW}(R)\varSigma _\mathrm {NFW}(R)\). These projected NFW functionals provide a good approximation for the projected matter distribution around clustersize halos (Oguri and Hamana 2011).
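A compact numerical sketch of the projected NFW profiles may help when implementing the model. It uses the standard closed-form expressions for the dimensionless functions \(f_\mathrm {NFW}(x)\) and \(g_\mathrm {NFW}(x)\) from Wright and Brainerd (2000), written here in terms of a shared helper, and the usual relation between \(\rho _\mathrm {s}\) and \(c_\mathrm {200c}\). The function names and the tolerance used near \(x=1\) are illustrative choices, not part of any published code.

```python
import math

def _h(x):
    """Shared arctanh/arctan term of the projected NFW profiles; h(1) = 1."""
    if x < 1.0:
        return 2.0 / math.sqrt(1.0 - x * x) * math.atanh(math.sqrt((1.0 - x) / (1.0 + x)))
    if x > 1.0:
        return 2.0 / math.sqrt(x * x - 1.0) * math.atan(math.sqrt((x - 1.0) / (x + 1.0)))
    return 1.0

def f_nfw(x):
    """Sigma(R) / (2 rho_s r_s) at x = R / r_s."""
    if abs(x - 1.0) < 1e-6:
        return 1.0 / 3.0  # limiting value at x = 1
    return (1.0 - _h(x)) / (x * x - 1.0)

def g_nfw(x):
    """Mean Sigma inside R, divided by 2 rho_s r_s, at x = R / r_s."""
    return 2.0 / (x * x) * (_h(x) + math.log(x / 2.0))

def delta_sigma_nfw(radius, rho_s, r_s):
    """Excess surface mass density: mean interior Sigma minus local Sigma."""
    x = radius / r_s
    return 2.0 * rho_s * r_s * (g_nfw(x) - f_nfw(x))

def rho_s_from_c200c(c200c, rho_crit):
    """Characteristic density for a halo defined at 200 times rho_crit."""
    m_c = math.log(1.0 + c200c) - c200c / (1.0 + c200c)
    return 200.0 / 3.0 * rho_crit * c200c ** 3 / m_c

# Dimensionless check at the scale radius (rho_s = r_s = 1).
print(f_nfw(1.0), g_nfw(1.0), delta_sigma_nfw(1.0, 1.0, 1.0))
```

Physical units enter only through `rho_s` and `r_s`, so the same functions can be reused for any choice of cosmology once \(\rho _\mathrm {c}(z_l)\) is supplied.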
As an example, we show in Fig. 13 the reduced tangential and \(45^\circ \)rotated shear profiles \(\langle g_+(\vartheta )\rangle \) and \(\langle g_\times (\vartheta )\rangle \), respectively, for two highmass clusters, Abell 2142 and Abell 1689, obtained from Subaru SuprimeCam data (Umetsu et al. 2009). The \(\langle g_+(\vartheta )\rangle \) profiles are compared with their bestfit NFW and singular isothermal sphere (SIS) models. The SIS density profile is given by \(\rho (r)=\sigma _v^2/(2\pi G r^2)\), with \(\sigma _v\) the onedimensional velocity dispersion. For both clusters, the observed \(\langle g_+(\vartheta )\rangle \) profiles are better fitted by the NFW model having an outwardsteepening density profile. Abell 2142 is a nearby cluster at \(z_l=0.091\) perturbed by merging substructures (e.g., Okabe and Umetsu 2008; Umetsu et al. 2009; Liu et al. 2018). The radial curvature observed in the \(\langle g_+\rangle \) profile of Abell 2142 is highly pronounced, so that the powerlaw SIS model is strongly disfavored by the Subaru weaklensing data. From the bestfit NFW model, the mass and concentration parameters of Abell 2142 are constrained as \(M_\mathrm {200c}=(9.1\pm 1.9)\times 10^{14}h^{1}M_\odot \) and \(c_\mathrm {200c}=4.1\pm 0.8\) (Umetsu et al. 2009; Liu et al. 2018).
In contrast, Abell 1689 (\(z_l=0.183\)) is among the best-studied clusters and the most powerful known lenses to date (e.g., Broadhurst et al. 2005a; Limousin et al. 2007; Umetsu and Broadhurst 2008; Lemze et al. 2009; Kawaharada et al. 2010; Coe et al. 2010; Diego et al. 2015; Umetsu et al. 2011b, 2015), characterized by a large Einstein radius (Sect. 2.6.6) of \(\vartheta _\mathrm {Ein}=(47.0\pm 1.2)\,\hbox {arcsec}\) for a fiducial source at \(z_s=2\) (Coe et al. 2010). This indicates a high degree of mass concentration in projection on the sky. In fact, the observed \(\langle g_+(\vartheta )\rangle \) profile of Abell 1689 is well fitted by an NFW profile with a high concentration of \(c_\mathrm {200c}\sim 10\) (Broadhurst et al. 2005b; Umetsu and Broadhurst 2008; Umetsu et al. 2009, 2015; Coe et al. 2010), compared to the theoretical expectation, \(c_\mathrm {200c}\sim 4\) (e.g., Bhattacharya et al. 2013; Diemer and Kravtsov 2015). From full triaxial modeling of two-dimensional weak-lensing, X-ray, and SZE observations, Umetsu et al. (2015) obtained \(M_\mathrm {200c}=(12.1\pm 1.9)\times 10^{14}h^{-1}M_\odot \) and \(c_\mathrm {200c}=7.91\pm 1.41\), which overlaps with the \(1\sigma \) tail of the predicted distribution of halo concentration. Moreover, the multiprobe data set favors a triaxial geometry with a minor-to-major axis ratio of \(c/a=0.39\pm 0.15\) and a major axis closely aligned with the line of sight by \((22\pm 10)^\circ \). Therefore, the superb lensing efficiency of Abell 1689 is likely caused by its intrinsically high mass concentration combined with a chance alignment of its major axis with the line-of-sight direction (see also Oguri et al. 2005).
4.3.2 Halo model
At large cluster-centric distances, the correlated matter around the cluster, namely the 2-halo term (Cooray and Sheth 2002), contributes to the lensing signal. In the standard halo-model prescription (Oguri and Takada 2011; Oguri and Hamana 2011), the total lensing signal \(\varDelta \varSigma (R)\) is given as the sum of the 1-halo and 2-halo terms. The 1-halo term \(\varDelta \varSigma _\mathrm {1h}\) accounts for the mass distribution within the main cluster halo, which can be described by a smoothly truncated NFW profile (Baltz et al. 2009, hereafter BMO):
where the truncation parameter \(r_\mathrm {t}\) is set to a fixed multiple of the halo outer radius (e.g., \(r_\mathrm {t}\approx 2.6r_\mathrm {vir}\) or \(r_\mathrm {t}\approx 3r_\mathrm {200c}\); see Oguri and Hamana 2011; Covone et al. 2014; Umetsu et al. 2014). Analytic expressions for the radial dependence of the projected BMO profiles are given by Baltz et al. (2009) and Oguri and Hamana (2011).
The 2-halo term contribution \(\varDelta \varSigma _\mathrm {2h}\) to the tangential shear signal is expressed as (Oguri and Takada 2011; Oguri and Hamana 2011, see de Putter and Takada 2010 for the full-sky expression):^{Footnote 9}
where \(b_\mathrm {h}(M_\mathrm {200c}; z_l)\) is the linear halo bias (e.g., Tinker et al. 2010), \(k_\ell \equiv \ell /[(1+z_l)D_l(z_l)]\), \(P(k;z_l)\) is the linear matter power spectrum as a function of the comoving wavenumber k evaluated at the cluster redshift \(z_l\), and \(J_n(x)\) is the Bessel function of the first kind and the nth order. We can compute the corresponding radial profile \(\varSigma _\mathrm {2h}(R|M_\mathrm {200c},z_l)\) of the lensing convergence by replacing \(J_2(x)\) in Eq. (121) with the zeroth-order Bessel function \(J_0(x)\). The 2-halo term is proportional to the product \(b_\mathrm {h}\sigma _8^2\), with \(\sigma _8\) the rms amplitude of linear mass fluctuations in a sphere of comoving radius \(8h^{-1}\mathrm {Mpc}\). In the standard \({\varLambda }\hbox {CDM}\) model, the 2-halo term contribution to \(\varDelta \varSigma \) (or \(\varSigma \)) becomes important, on average, at \(R\gtrsim \) several (or two) virial radii (Oguri and Hamana 2011; Becker and Kravtsov 2011). In particular cases where clusters reside in extremely dense environments, the 2-halo contribution to the lensing signal could become more significant (Sereno et al. 2018a).
4.3.3 DK14 model
Diemer and Kravtsov (2014, hereafter DK14) provide a useful fitting function for the spherically averaged density profile \(\rho (r)\) around dark-matter halos, calibrated with a suite of N-body simulations in \({\varLambda }\hbox {CDM}\) cosmologies. The DK14 density profile is given by:
with \(r_\mathrm {piv}=5r_\mathrm {200m}\) and \(\varDelta _\mathrm {max} = 10^3\), which is introduced as a maximum cutoff density of the outer term to avoid a spurious contribution at small halo radii (Diemer 2018). The inner profile \(\rho _\mathrm {inner}(r)\) describes the intrahalo mass distribution in a multistream region, which is modeled by an Einasto profile (Einasto 1965) with \(\rho _{-2}\) and \(r_{-2}\) the scale density and radius at which the logarithmic slope is \(-2\) and \(\alpha _\mathrm {E}\) the shape parameter describing the degree of profile curvature. The transition term \(f_\mathrm {trans}(r)\) characterizes the steepening around a truncation radius, \(r_\mathrm {t}\). The outer term \(\rho _\mathrm {outer}\), given by a softened power law, is responsible for the material infalling toward the cluster in a single-stream region at large halo radii. DK14 found that this fitting function provides a precise description (\(\lesssim 5\%\)) of their simulated DM density profiles at \(r\lesssim 9r_\mathrm {vir}\). At larger radii (\(r\gtrsim 9r_\mathrm {vir}\)), the outer term is expected to follow a shape proportional to the matter correlation function. As in the case of the NFW profile, it is useful to define the halo concentration by \(c_\varDelta =r_\varDelta /r_{-2}\).
The DK14 profile is described by eight parameters, \((\rho _{-2}, r_{-2}, \alpha _\mathrm {E}, \beta , \gamma , r_\mathrm {t}, b_\mathrm {e}, s_\mathrm {e})\), and is sufficiently flexible to reproduce a range of fitting functions, such as the halo model (Oguri and Hamana 2011; Hikage et al. 2013) and density profiles with a sharp steepening feature associated with the splashback radius (see Sect. 6.3). Equation (122) can be used as a fitting function in conjunction with generic priors for some of the shape and structural parameters (see Diemer and Kravtsov 2014; More et al. 2015; Umetsu and Diemer 2017; Chang et al. 2018). By projecting \(\varDelta \rho (r)\) along the line of sight, we obtain the surface mass density responsible for gravitational lensing as:
where the line-of-sight integral is evaluated numerically. The publicly available code, colossus (Diemer 2018), implements a range of calculations relating to three-dimensional and projected halo profiles including the NFW, Einasto, and DK14 models.
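A numerical line-of-sight projection of this kind can be sketched in a few lines. The example below is not the colossus implementation: it is a minimal stand-alone illustration that projects an Einasto profile with a trapezoidal rule after the substitution \(z=s\sinh t\), where the scale \(s\), the cutoff `t_max`, and the step count `n` are assumed tuning parameters, and all function names are hypothetical.

```python
import math

def einasto_density(r, rho_m2=1.0, r_m2=1.0, alpha=0.2):
    """Einasto profile; rho_m2 and r_m2 are the density and radius where the log-slope is -2."""
    return rho_m2 * math.exp(-(2.0 / alpha) * ((r / r_m2) ** alpha - 1.0))

def project_los(rho, R, scale=1.0, t_max=12.0, n=4000):
    """Sigma(R) = 2 * integral_0^inf rho(sqrt(R^2 + z^2)) dz, with z = scale * sinh(t)."""
    h = t_max / n
    total = 0.0
    for i in range(n + 1):
        t = i * h
        z = scale * math.sinh(t)
        dz_dt = scale * math.cosh(t)
        weight = 0.5 if i in (0, n) else 1.0  # trapezoidal end-point weights
        total += weight * rho(math.hypot(R, z)) * dz_dt
    return 2.0 * h * total

# Projected Einasto profile at a few radii (in units of r_m2)
sigma = [project_los(einasto_density, R) for R in (0.1, 0.5, 1.0, 2.0)]
```

The sinh substitution concentrates sampling points near the plane of the sky, where the integrand is largest, while still reaching very large line-of-sight distances.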
4.4 Shear likelihood function
The likelihood function \(\mathcal{L}\) of a mass model for weak shear observations \(\varvec{d}\equiv \{\langle g_+(\vartheta _b)\rangle \}_{b=1}^{N_\mathrm {bin}}\) is written as:
where C is the \(N_\mathrm {bin}\times N_\mathrm {bin}\) error covariance matrix for the binned reduced tangential shear profile \(\varvec{d}\) and \(\widehat{g}_+(\vartheta _b|\varvec{p})\) represents the theoretical expectation for \(\langle g_+(\vartheta _b)\rangle \) (Eq. (93)) predicted by the model parameterized by a set of parameters \(\varvec{p}\). Note that modeling of the cluster lensing signal \(\widehat{g}_+(\vartheta |\varvec{p})\) requires information on the lensing depth \(\langle \varSigma _\mathrm {cr}^{-1}\rangle \) averaged over the source-redshift distribution (Sect. 3.6.2). Similarly, one can define a likelihood function for the lensing convergence profile \(\kappa (\vartheta )\), which can be reconstructed from combined shear and magnification measurements (e.g., Umetsu et al. 2011b, 2014).
A well-characterized inference of the model parameters \(\varvec{p}\) can be obtained within the Bayesian framework by properly choosing the priors (Umetsu et al. 2020). In this context, when interpreting the cluster lensing signal with an NFW profile (Sect. 4.3.1), it is useful to take \(\varvec{p}=(M_\mathrm {200c},c_\mathrm {200c})\) as fitting parameters.^{Footnote 10} Tangential shear fitting with a spherical NFW profile is a standard approach for measuring individual cluster masses from weak lensing (e.g., Okabe et al. 2010; Applegate et al. 2014; Hoekstra et al. 2015). Numerical simulations suggest that mass estimates from tangential shear fitting can be biased low (by \(\sim 5\)–\(10\%\); Meneghetti et al. 2010b; Becker and Kravtsov 2011; Rasia et al. 2012), because local substructures that are abundant in the outskirts of massive clusters dilute the shear tangential to the cluster center. Moreover, systematic deviations of the lensing signal from the assumed NFW form in projection can lead to a substantial level of mass bias, even if the spherically averaged density profiles \(\rho (r)\) in three dimensions are well described by the NFW form (e.g., Sereno et al. 2016; Umetsu et al. 2020). Therefore, it is important to optimize the radial range for tangential shear fitting so as to alleviate the mass bias (von der Linden et al. 2014a; Applegate et al. 2014; Pratt et al. 2019).
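Written out for Gaussian errors, the log-likelihood takes the standard quadratic form. The expression below is a sketch consistent with the definitions of \(\varvec{d}\), \(C\), and \(\widehat{g}_+\) given above, with the additive normalization constant omitted:

\[
\ln \mathcal{L}(\varvec{p})=-\frac{1}{2}\sum _{b,b'=1}^{N_\mathrm {bin}}\left[ \langle g_+(\vartheta _b)\rangle -\widehat{g}_+(\vartheta _b|\varvec{p})\right] \left( C^{-1}\right) _{bb'}\left[ \langle g_+(\vartheta _{b'})\rangle -\widehat{g}_+(\vartheta _{b'}|\varvec{p})\right] .
\]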
To obtain robust constraints on the underlying cluster mass distribution, we need to ensure that the shear likelihood function (Eq. (124)) includes all relevant sources of uncertainty (Gruen et al. 2015). Following Umetsu et al. (2016, 2020), we decompose the error covariance matrix C for \(\varvec{d}=\{\langle g_+(\vartheta _b)\rangle \}_{b=1}^{N_\mathrm {bin}}\) as:
\[
C=C^\mathrm {shape}+C^\mathrm {lss}+C^\mathrm {int},
\]
where \((C^\mathrm {shape})_{bb'}=\sigma ^2_\mathrm {shape}(\vartheta _b)\delta _{bb'}\) is the diagonal statistical uncertainty due to the shape noise (Eq. (112)); \((C^\mathrm {lss})_{bb'}\) is the cosmic noise covariance due to uncorrelated large-scale structure projected along the line of sight (Schneider et al. 1998; Hoekstra 2003); and \((C^\mathrm {int})_{bb'}\) accounts for statistical fluctuations of the projected cluster lensing signal due to intrinsic variations associated with assembly bias and cluster asphericity (Gruen et al. 2015).
Figure 14 shows the diagonal elements of the covariance matrix used in a stacked weak-lensing analysis of Miyatake et al. (2019) and the correlation matrix, defined with the total covariance matrix as \(C_{bb'}/\sqrt{C_{bb}C_{b'b'}}\). A similar figure but for the \(\kappa \) profile was shown in Umetsu et al. (2016), who presented a joint weak and strong lensing analysis of 20 high-mass clusters targeted by the CLASH survey (Cluster Lensing And Supernova survey with Hubble; Postman et al. 2012).
The elements of the \(C^\mathrm {lss}\) matrix are given by (Hoekstra 2003; Oguri and Takada 2011):
where \(\hat{J}_2(\ell \vartheta _b)\) is the Bessel function of the first kind and second order averaged over the bth annulus (for the case of the binned convergence profile, see Umetsu et al. 2011a, 2016; Gruen et al. 2015); and \(P_\kappa (\ell )\) is the two-dimensional convergence power spectrum (see Eq. (37)) as a function of angular multipole \(\ell \) calculated using the flat-sky and Limber’s approximation as (Limber 1953; Kaiser 1992):
with \(\chi \) the comoving coordinate along the line of sight, \(W(\chi ,\chi _s)=r(\chi _s-\chi )/r(\chi _s)\) the ratio of angular diameter distances \(D_{ls}/D_s\), and \(P_\mathrm {NL}(k;\chi )\) the nonlinear matter power spectrum. The convergence power spectrum \(P_\kappa (\ell )\) can be evaluated for a given source population and a cosmological model. In Eq. (127), we have assumed a single comoving distance \(\chi _s\) corresponding to the effective single-plane redshift of source galaxies (i.e., all source galaxies lying at \(\chi =\chi _s\)). Provided that \(\varDelta \vartheta _b/\vartheta _b\ll 1\) with \(\varDelta \vartheta _b\) the radial bin width, we have \(\hat{J}_2(\ell \vartheta _b)\simeq J_2(\ell \vartheta _b)\) (without bin-averaging) in Eq. (126).
The \(C^\mathrm {int}\) matrix describes statistical fluctuations of the projected cluster lensing signal at fixed halo mass due to intrinsic variations in halo concentration, triaxiality and orientation, and correlated secondary structures around the cluster, as well as to deviations from the assumed NFW form (Becker and Kravtsov 2011; Gruen et al. 2015).^{Footnote 11} Gruen et al. (2015) constructed a semi-analytical model of \(C^\mathrm {int}\) that is calibrated to cosmological numerical simulations. Umetsu et al. (2016) found that the diagonal part of the intrinsic covariance for the convergence \(\kappa \) can be well approximated by^{Footnote 12}:
with \(\alpha _\mathrm {int}=0.2\) in the intracluster (1-halo) regime at \(R=D_l\vartheta \lesssim r_\mathrm {200m}\). This suggests that the intrinsic variance is self-similar in the sense that \(\sqrt{(C^\mathrm {int}_\kappa )_{bb}}/\kappa (\vartheta _b)\sim \mathrm {const}\). A further simplification can be made by setting the off-diagonal elements of \(C^\mathrm {int}_\kappa \) to zero, i.e., \((C^\mathrm {int}_{\kappa })_{bb'}=\alpha ^2_\mathrm {int}\kappa ^2(\vartheta _b)\delta _{bb'}\). In general, the diagonal approximation to \(C^\mathrm {int}_\kappa \) can lead to an underestimation of parameter uncertainties (Gruen et al. 2015, see Fig. 5), where the degree of underestimation depends on the binning scheme, depth of weak-lensing observations, and halo mass. The impact of the diagonal approximation is more severe for deeper observations (or higher S/N weak-lensing data).^{Footnote 13} Assuming a representative mass profile, it is possible to convert the intrinsic covariance matrix \(C^\mathrm {int}_\kappa \) for the convergence into that for the tangential shear. This can be done by assuming an NFW density profile together with the concentration–mass relation \(c_\mathrm {200c}(M_\mathrm {200c},z)\) for a given cosmological model (Miyatake et al. 2019; Umetsu et al. 2020). The covariance \(C^\mathrm {int}\) for the \(g_+\) profile obtained in this way thus depends on halo mass. Miyatake et al. (2019) found, however, that the intrinsic covariances for different halo masses remain nearly self-similar in shape once scaled by \(R\rightarrow R/r_\mathrm {200m}\).
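The diagonal approximation and the correlation matrix defined above are simple to assemble. The sketch below does so for a binned convergence profile, combining shape-noise variances, a cosmic-noise covariance, and the diagonal intrinsic term with \(\alpha _\mathrm {int}=0.2\). All numerical inputs are invented toy values and the function names are hypothetical; a real analysis would compute each term from the corresponding equations.

```python
import math

def intrinsic_variance(kappa, alpha_int=0.2):
    """Diagonal intrinsic variance: (C_int)_bb = alpha_int^2 * kappa_b^2."""
    return [(alpha_int * k) ** 2 for k in kappa]

def total_covariance(shape_var, lss_cov, int_var):
    """C = C_shape + C_lss + C_int, with C_shape and C_int taken as diagonal."""
    cov = [row[:] for row in lss_cov]  # copy the (non-diagonal) cosmic-noise term
    for b in range(len(shape_var)):
        cov[b][b] += shape_var[b] + int_var[b]
    return cov

def correlation_matrix(cov):
    """Correlation coefficients C_bb' / sqrt(C_bb * C_b'b')."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]

# Toy inputs for three radial bins (illustrative values only)
kappa = [0.20, 0.10, 0.05]
shape_var = [1.0e-3, 5.0e-4, 2.5e-4]
lss_cov = [[1.0e-4, 5.0e-5, 2.0e-5],
           [5.0e-5, 1.0e-4, 5.0e-5],
           [2.0e-5, 5.0e-5, 1.0e-4]]

int_var = intrinsic_variance(kappa)
cov = total_covariance(shape_var, lss_cov, int_var)
corr = correlation_matrix(cov)
```

In this approximation, only the cosmic-noise term contributes off-diagonal correlations between radial bins.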
4.5 Stacked weak-lensing estimator
Stacking an ensemble of galaxy clusters helps average out large statistical fluctuations inherent in noisy weak-lensing observations of individual clusters. The statistical precision can be greatly improved by stacking together a large number of clusters, allowing for tighter and more robust constraints on the ensemble properties of the cluster mass distribution. A stacked lensing analysis is thus complementary to an alternative approach that relies on individual cluster mass measurements (Sects. 4.2 and 4.4). In particular, a comparison of the two approaches provides a useful consistency check in different S/N regimes (e.g., Okabe et al. 2010; Umetsu et al. 2014, 2016, 2020; Okabe and Smith 2016).
Let us consider an ensemble of N galaxy clusters. We model the ensemble mass distribution of these clusters in terms of the excess surface mass density profile as:
Specifically, our model is a vector of \(N_\mathrm {bin}\) parameters containing the binned \(\varDelta \varSigma (R)\) profile as a function of the projected cluster-centric radius R (see Sect. 3.5). Here, we aim to construct an unbiased estimator for the model \(\varvec{m}\), or the ensemble \(\varDelta \varSigma (R)\) profile, given weak-lensing observations of N individual clusters.
We assume that these clusters are distributed in redshift, having different geometric responses to the lensing signal through \(\varSigma _\mathrm {cr}^{-1}(z_l,z_s)\). We express weak-lensing observations \(\varvec{d}_l=\{\langle g_{+}(R_b|z_l)\rangle \}_{b=1}^{N_\mathrm {bin}}\) for a given cluster (l) as a sum of the signal vector \(\varvec{s}_l\) and the noise vector \(\varvec{n}_l\) as:
with
where the response coefficient \(a_l\) represents the source-averaged inverse critical surface mass density evaluated for the lth cluster (Eq. (91)):
In this expression, we assume that both \(\varvec{d}_l\) and \(a_l\) have been averaged over an ensemble of source galaxies to represent the respective source-averaged quantities for the lth cluster. For simplicity, we have ignored the nonlinearity between the lensing signal \(g_+\) and the surface mass density \(\varDelta \varSigma \) (see Sect. 3.6.2).^{Footnote 14} We refer to Umetsu et al. (2020) for a treatment of the stacked weak-lensing analysis accounting for the nonlinear correction for the source-averaging effect.
Assuming that \(\varvec{n}\) obeys Gaussian statistics and that the noise vectors between different clusters are statistically independent, we can write the total likelihood function of observations \(\varvec{d}=\{\varvec{d}_1,\varvec{d}_2,\ldots ,\varvec{d}_N\}\) as:
where \(C_l=\langle \varvec{n}_l \varvec{n}_l^t\rangle \) is the error covariance matrix (Sect. 4.4) for the lth cluster and \(\mathcal{N}_l=(2\pi )^{N_\mathrm {bin}/2}|C_l|^{1/2}\) is a normalization factor. In ground-based cluster weak-lensing observations, the shear covariance matrix \((C_l)_{bb'}\) per cluster (\(b,b'=1,2,\ldots ,N_\mathrm {bin}\)) is dominated by the statistical uncertainty due to the shape noise. The contribution from cosmic noise (Sect. 4.4) becomes important at large cluster-centric distances (Fig. 14).
The total log-likelihood function \(\ln {P(\varvec{d}|\varvec{m})}\) is expressed as:
According to Bayes’ theorem, the posterior probability distribution of \(\varvec{m}\) given the data \(\varvec{d}\) is:
where \(P(\varvec{m})\) is the prior probability distribution for the model \(\varvec{m}\) and \(P(\varvec{d})\) is the evidence that serves as a normalization factor here. We assume an uninformative uniform prior for our model \(\varvec{m}\), such that \(P(\varvec{m}|\varvec{d}) \propto P(\varvec{d}|\varvec{m})\). By maximizing \(\ln {P(\varvec{m}|\varvec{d})}\) with respect to \(\varvec{m}\), we obtain the desired expression for the stacked weak-lensing estimator \(\widehat{\varvec{m}}\) as (e.g., Umetsu et al. 2011a):
Note that the weight assigned to \(\varDelta \varSigma _+\) of each cluster is proportional to \(a_l^2=\left\langle \varSigma _{\mathrm {cr},l}^{-1}\right\rangle ^2\) (see also Eq. (98)), because \(\varvec{s}_l\propto a_l\). The error covariance matrix \(\mathcal{C}\) for the stacked estimator \(\langle \langle \varvec{\varDelta \varSigma }_+\rangle \rangle \) (Eq. (136)) is given by:
where F is the Fisher information matrix defined as (e.g., Umetsu et al. 2011a):
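Under the Gaussian and independence assumptions stated above, the steps leading from the likelihood to the estimator and its covariance can be sketched as follows. This is a reconstruction from the definitions in this subsection, with additive constants dropped and with the linear signal model \(\varvec{s}_l=a_l\varvec{m}\) assumed:

\[
-2\ln P(\varvec{d}|\varvec{m})=\sum _{l=1}^{N}\left( \varvec{d}_l-a_l\varvec{m}\right) ^t C_l^{-1}\left( \varvec{d}_l-a_l\varvec{m}\right) +\mathrm {const}.
\]

Setting the derivative with respect to \(\varvec{m}\) to zero gives \(\sum _l a_l C_l^{-1}(\varvec{d}_l-a_l\varvec{m})=0\), and hence:

\[
\widehat{\varvec{m}}=\left( \sum _{l=1}^{N}a_l^2C_l^{-1}\right) ^{-1}\sum _{l=1}^{N}a_l^2C_l^{-1}\left( \varvec{d}_l/a_l\right) ,\qquad
F=-\frac{\partial ^2\ln P(\varvec{d}|\varvec{m})}{\partial \varvec{m}\,\partial \varvec{m}^t}=\sum _{l=1}^{N}a_l^2C_l^{-1},\qquad
\mathcal{C}=F^{-1}.
\]

Each cluster thus contributes its own estimate \(\varvec{d}_l/a_l\) of the \(\varDelta \varSigma \) profile, weighted by the matrix \(a_l^2C_l^{-1}\).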
The total S/N for detection is given by (e.g., Umetsu and Broadhurst 2008):
Again, this quadratic S/N estimator breaks down and leads to an overestimation of significance if the actual per-bin S/N is less than unity (see Sect. 4.2).
It is noteworthy that interpreting the effective mass from the stacked lensing signal (Eq. (136)) requires caution, especially when the cluster sample spans a wide range in mass and redshift. This is because the amplitude of the lensing signal is weighted by the redshift-dependent sensitivity and it is not linearly proportional to the cluster mass (e.g., Mandelbaum et al. 2005; Umetsu et al. 2016, 2020; Melchior et al. 2017; Sereno et al. 2017). We refer the reader to Miyatake et al. (2019) and Murata et al. (2019) for further discussion of this issue.
Figure 15 shows the stacked weak-lensing signals around a sample of 136 spectroscopically confirmed X-ray groups and clusters at \(0.033\leqslant z_l\leqslant 1.033\) selected from the XMM-XXL survey (Umetsu et al. 2020). Their weak-lensing analysis is based on HSC-SSP survey data. The figure compares stacked \(\langle \langle \varDelta \varSigma _{+}\rangle \rangle \) profiles of the XXL sample obtained with different source-selection methods (see Sect. 4.1). This comparison shows no significant difference between these profiles within errors in all bins. From a single-mass-bin NFW fit to the stacked shear profile, Umetsu et al. (2020) found \(M_\mathrm {200c}=(8.7\pm 0.8)\times 10^{13}h^{-1}M_\odot \) and \(c_\mathrm {200c}=3.5\pm 0.9\) at a lensing-weighted mean redshift of \(z_l\approx 0.25\). This is in agreement with the mean concentration expected for dark-matter halos in the standard \({\varLambda }\hbox {CDM}\) cosmology, \(c_\mathrm {200}\approx 4.1\) at \(M_\mathrm {200c}=8.7\times 10^{13}h^{-1}M_\odot \) and \(z=0.25\) (e.g., Diemer and Joyce 2019). Figure 15 also displays the best-fit halo model including the effects of surrounding large-scale structure as a 2-halo term. Figure 15 shows that the 2-halo term in the range \(R\in [0.3,3]\,h^{-1}\mathrm {Mpc}\) (comoving) is negligibly small even in low-mass clusters and groups (e.g., Leauthaud et al. 2010; Covone et al. 2014; Sereno et al. 2015), for which the maximum radius corresponds to \(\sim 3r_\mathrm {200c}\). This is because the tangential shear \(\varDelta \varSigma (R)=\overline{\varSigma }(R)-\varSigma (R)\) is insensitive to flattened sheet-like structures (Schneider and Seitz 1995).
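The stacking procedure can be illustrated with a short sketch. The example below assumes, for simplicity, that each cluster's covariance matrix is diagonal (shape-noise dominated), so that all matrix inversions reduce to element-wise divisions. It also assumes the quadratic form \((\mathrm {S/N})^2=\widehat{\varvec{m}}^t\mathcal{C}^{-1}\widehat{\varvec{m}}\) for the detection significance. The function name and the toy inputs are hypothetical.

```python
import math

def stack_clusters(d_list, a_list, var_list):
    """Stack shear profiles d_l with responses a_l and per-bin variances var_l.

    Returns the stacked Delta-Sigma profile, its per-bin variance, and the total S/N,
    assuming diagonal covariance matrices C_l = diag(var_l).
    """
    n_bin = len(d_list[0])
    fisher = [0.0] * n_bin  # F_bb = sum_l a_l^2 / var_l[b]
    rhs = [0.0] * n_bin     # sum_l a_l * d_l[b] / var_l[b]
    for d, a, var in zip(d_list, a_list, var_list):
        for b in range(n_bin):
            fisher[b] += a * a / var[b]
            rhs[b] += a * d[b] / var[b]
    m_hat = [rhs[b] / fisher[b] for b in range(n_bin)]
    m_var = [1.0 / fisher[b] for b in range(n_bin)]
    snr = math.sqrt(sum(m * m * f for m, f in zip(m_hat, fisher)))
    return m_hat, m_var, snr

# Toy example: two clusters sharing one Delta-Sigma profile but with different depths
m_true = [100.0, 50.0, 20.0]   # assumed profile (arbitrary units)
a_list = [1.0e-4, 3.0e-4]      # assumed <Sigma_cr^-1> per cluster (inverse units)
d_list = [[a * m for m in m_true] for a in a_list]  # noise-free shear profiles
var_list = [[1.0e-4] * 3, [4.0e-4] * 3]             # assumed shear variances per bin
m_hat, m_var, snr = stack_clusters(d_list, a_list, var_list)
```

With noise-free inputs, the estimator returns the input profile exactly, which is a convenient check that the \(a_l\) weighting is applied consistently.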
4.6 Quadrupole shear
Halos formed in collisionless CDM simulations are not spherical and can have complex shapes. A more realistic description of individual cluster halos is as triaxial ellipsoids with minor-to-major axis ratios of order \(a/c\sim 0.5\), slowly increasing with halo-centric radius (Jing and Suto 2002; Bonamigo et al. 2015). More massive halos are less spherical and more prolate, as they tend to form later. The projected matter distributions around clusters are thus expected to be anisotropic, with typical axis ratios of \(q\sim 0.6\) (e.g., Okabe et al. 2018). The projected axis ratio of cluster halos varies slowly with cluster-centric distance (e.g., Okabe et al. 2018).
For sufficiently massive clusters at low redshift, deep weak-lensing observations allow us to constrain the halo shape on an individual cluster basis, by forward-modeling the observed two-dimensional shear (or convergence) field with elliptical lens models (Oguri et al. 2010; Okabe et al. 2011; Watanabe et al. 2011; Umetsu et al. 2012; Medezinski et al. 2013, 2016; Wegner et al. 2017). This two-dimensional fitting approach is flexible and can be readily generalized to include multiple halo components (see the discussion in Sect. 3.6.1, e.g., Okabe et al. 2011; Medezinski et al. 2013) and triaxial halo shapes (e.g., Oguri et al. 2005; Sereno and Umetsu 2011; Umetsu et al. 2015; Chiu et al. 2018).
In this subsection, we introduce quadrupole shear estimators for measuring the projected halo shape for a stacked ensemble of galaxy clusters.
4.6.1 Projected halo shape and multipole expansion
Following Adhikari et al. (2015), we introduce a formalism that allows for modeling the effects of halo ellipticity on weak shear observables based on an angular multipole expansion of the lensing fields.^{Footnote 15} Let us write the azimuthally averaged projected mass density profile as \(\varSigma ^{(0)}(R)\propto R^{-\eta _0}\) with \(\eta _0 = -d\ln {\varSigma ^{(0)}(R)}/d\ln {R}>0\). Assuming that q is constant with cluster-centric radius, we can write the surface mass density around clusters as \(\varSigma (x,y)\propto R_q^{-\eta _0}\) (Adhikari et al. 2015; Clampitt and Jain 2016), where \(R_q\) is an elliptical radial coordinate defined as (Evans and Bridle 2009; Oguri et al. 2012; Umetsu et al. 2012, 2018):
with q the minor-to-major axis ratio \((0<q\leqslant 1)\). Here, we have chosen the Cartesian coordinate system (x, y) centered on the halo, such that the x-axis is aligned with the major axis of the projected ellipse. We define the corresponding mass ellipticity by \(e=(1-q^2)/(1+q^2)\).
We express the multipole expansion of \(\varSigma \) as (Adhikari et al. 2015; Clampitt and Jain 2016):
where \(\varphi \) is the azimuthal angle relative to the halo’s major axis, the multipole \(\varSigma ^{(m)}(R)\) is the coefficient of the \(e^{im\varphi }\) component of the azimuthal behavior, and we assume \(e\eta _0/2\ll 1\) to justify the neglect of higher order terms in the expansion. We thus model the projected mass distributions of clusters as the sum of a monopole and a quadrupole. We further assume that:
The quadrupole \(\varSigma ^{(2)}\) can thus be completely determined by the monopole \(\varSigma ^{(0)}\propto R^{-\eta _0}\), up to a multiplicative factor corresponding to the halo ellipticity e.
Similarly, the quadrupole moments of the tangential (\(+\)) and cross (\(\times \)) shear components are given by (Adhikari et al. 2015):
where \(I_1(R)\) and \(I_2(R)\) are defined by (Clampitt and Jain 2016):
Equation (143) suggests an optimal estimator weighted by \(\cos {2\varphi }\) to extract the quadrupole of the excess surface mass density, \(\varDelta \varSigma ^{(2)}(R)\), from tangential shear measurements.
Weighted quadrupole estimators for the tangential and cross-shear components are given by (Natarajan and Refregier 2000; Mandelbaum et al. 2006):^{Footnote 16}
where \(\varDelta \varSigma _{+,\times }(R|z_l,z_s)=\varSigma _\mathrm {cr}(z_l,z_s) g_{+,\times }(R|z_l,z_s)\) (Eq. (94)); \(w_{ls} = \varSigma _{\mathrm {cr},ls}^{-2} / \sigma _{g,ls}^{2}\) is the statistical weight for each lens–source pair (ls), with \(\sigma _{g,ls}\) the statistical uncertainty per shear component (see Eq. (98)); and \(\varphi _{ls}\) is the azimuthal angle of each source galaxy (s) relative to the major axis of each cluster lens (l). In real observations, we must rely on the major axis of the distribution of baryonic tracers (e.g., central galaxies and X-ray gas) to perform aligned, stacked lensing measurements by Eq. (145) (Mandelbaum et al. 2006; van Uitert et al. 2012, 2017; Clampitt and Jain 2016).
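A weighted quadrupole estimator of this type can be sketched as below. The specific form assumed here, a \(\cos {2\varphi }\)-weighted sum of \(\varDelta \varSigma _+\) normalized by the weighted sum of \(\cos ^2{2\varphi }\) (and the analogous \(\sin {2\varphi }\) form for the cross component), is the commonly used construction and is an assumption of this example rather than a transcription of Eq. (145). The function name and the synthetic inputs are hypothetical, and sign conventions for the cross component differ between authors.

```python
import math

def quadrupole_estimators(dsig_t, dsig_x, phi, w):
    """Return (tangential, cross) quadrupole amplitudes for one radial bin.

    dsig_t, dsig_x: per-pair Delta-Sigma_+ and Delta-Sigma_x values
    phi: azimuthal angle of each source relative to the lens major axis
    w: statistical weight of each lens-source pair
    """
    num_t = sum(wi * t * math.cos(2.0 * p) for wi, t, p in zip(w, dsig_t, phi))
    den_t = sum(wi * math.cos(2.0 * p) ** 2 for wi, p in zip(w, phi))
    num_x = sum(wi * x * math.sin(2.0 * p) for wi, x, p in zip(w, dsig_x, phi))
    den_x = sum(wi * math.sin(2.0 * p) ** 2 for wi, p in zip(w, phi))
    return num_t / den_t, num_x / den_x

# Synthetic ring of sources: monopole 5.0 plus a quadrupole of amplitude 2.0 (tangential)
n_src = 360
phi = [2.0 * math.pi * k / n_src for k in range(n_src)]
dsig_t = [5.0 + 2.0 * math.cos(2.0 * p) for p in phi]
dsig_x = [1.5 * math.sin(2.0 * p) for p in phi]
w = [1.0] * n_src
q_t, q_x = quadrupole_estimators(dsig_t, dsig_x, phi, w)
```

For sources distributed uniformly in angle, the monopole cancels in the numerator, so the estimator isolates the quadrupole amplitude.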
As discussed by Hoekstra et al. (2004) and Mandelbaum et al. (2006), in practical applications, Eq. (145) is susceptible to a possible systematic alignment of lens galaxy (e.g., BCGs) and source ellipticities. Such a spurious alignment signal can arise from an incomplete correction of the PSF anisotropy, which tends to affect neighboring objects in a similar manner. On the other hand, when interpreting the quadrupole shear signal, one must take into account possible misalignment between the underlying matter and tracer distributions, which will cause a dilution of the quadrupole shear signal. Moreover, modeling of the quadrupole shear based on the multipole expansion (Eq. (142)) should only be applied to the case with a small halo ellipticity (\(e\eta /2\ll 1\)), so that the higher order terms can be safely ignored (see Eq. (141)).
4.6.2 Cartesian estimator
Now, we introduce the Cartesian estimator of Clampitt and Jain (2016). Compared to the estimator of Natarajan and Refregier (2000), a practical advantage of this estimator is that one of the two Cartesian components (\(\varDelta \varSigma _2^{(\pm )}\) defined below) is insensitive to the spurious alignment of lens–source galaxy ellipticities (Clampitt and Jain 2016) discussed at the end of Sect. 4.6.1. With this estimator, we measure the stacked quadrupole shear signal with respect to a coordinate system with the x-axis aligned with the major axis of the distribution of baryonic tracers (e.g., central galaxies and X-ray gas). The monopole signal is nulled with this Cartesian estimator. We adopt the same sign convention for the Cartesian \(\gamma _1\) and \(\gamma _2\) components as defined in Clampitt and Jain (2016) and use \(\varphi \) to denote the azimuthal angle relative to the x-axis of each cluster. This is illustrated in Fig. 16.
The Cartesian shear components are related to the tangential and cross components (see Eq. (83)) by:
In the framework of Adhikari et al. (2015) based on the multipole expansion, the multipole moments of the Cartesian shear components are written as follows (Clampitt and Jain 2016):
Equation (147) shows that the azimuthal dependence of the Cartesian shear components goes as \(\cos {4\varphi }\) (except for the two terms without \(\varphi \) dependence; see Clampitt and Jain 2016 for more discussion) and \(\sin {4\varphi }\), so that there is a sign change in both components after every angle \(\pi /4\). When moving around the circle, the shear signal from elliptical clusters transitions between regions where \(\gamma _1^{(2)}\) and then \(\gamma _2^{(2)}\) alternately dominate (Clampitt and Jain 2016), as illustrated in Fig. 16.
Following Clampitt and Jain (2016), we group together the first and second shear components \((g_1,g_2)\) of background source galaxies in the regions where \(\cos {4\varphi }\) and \(\sin {4\varphi }\) have the same sign (see Fig. 17), respectively, and define the following estimator:
where we have introduced the notation in analogy to the tangential shear (Eq. (94)):
where \(w_{ls} = \varSigma _{\mathrm {cr},ls}^{-2} / \sigma _{g,ls}^{2}\) is the statistical weight for each lens–source pair (ls), with \(\sigma _{g,ls}\) the statistical uncertainty per shear component (see Eq. (98)); and s runs over all source galaxies that fall in the specified bin \((R,\varphi )\), different for each shear component i and sign (Clampitt and Jain 2016): \(i=1\), \(\mathrm {sign}=-\), \(-\pi /8\leqslant \varphi <\pi /8\); \(i=1\), \(\mathrm {sign}=+\), \(\pi /8\leqslant \varphi <3\pi /8\); \(i=2\), \(\mathrm {sign}=-\), \(0\leqslant \varphi <\pi /4\); \(i=2\), \(\mathrm {sign}=+\), \(\pi /4\leqslant \varphi <\pi /2\). For each case, the summation in Eq. (148) also includes source galaxies lying in symmetrical regions shifted by \(\pi /2\), \(\pi \), and \(3\pi /2\), as illustrated in Fig. 17.
Figure 18 shows the stacked quadrupole shear profiles, \(\langle \langle \varDelta \varSigma _1^{(+)}\rangle \rangle \), \(\langle \langle \varDelta \varSigma _1^{(-)}\rangle \rangle \), \(\langle \langle \varDelta \varSigma _2^{(+)}\rangle \rangle \), and \(\langle \langle \varDelta \varSigma _2^{(-)}\rangle \rangle \), derived for a sample of 20 high-mass CLASH clusters (Umetsu et al. 2018). The quadrupole shear signal was measured with respect to the major axis of the X-ray gas shape of each cluster. Umetsu et al. (2018) modeled the stacked \(\langle \langle \varDelta \varSigma _{1,2}^{(\pm )}\rangle \rangle \) profiles by assuming an elliptical-NFW density profile with the major axis aligned with the X-ray major axis (for an elliptical extension of lensing mass models, see Keeton 2001). Any misalignment would thus lead to a dilution of the quadrupole signal and hence an underestimation of the halo ellipticity. Umetsu et al. (2018) obtained stacked constraints on the projected axis ratio of \(q=0.67\pm 0.10\) (or \(1-q=0.33\pm 0.10\)), which is fully consistent with the median axis ratio \(q=0.67\pm 0.07\) of this sample obtained from their two-dimensional shear and magnification analysis of the 20 individual clusters. Their results suggest that the total matter distribution is closely aligned with the X-ray brightness distribution (with a median misalignment angle of \(\varDelta \mathrm {PA}=21^\circ \pm 7^\circ \)), as expected from cosmological hydrodynamical simulations (see Okabe et al. 2018).
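The angular binning can be made concrete with a small helper that assigns a source at azimuthal angle \(\varphi \) to the "+" or "−" region of each Cartesian component, including the copies shifted by multiples of \(\pi /2\). The region boundaries used here assume the convention in which the "−" regions are \(-\pi /8\leqslant \varphi <\pi /8\) for \(i=1\) and \(0\leqslant \varphi <\pi /4\) for \(i=2\). The function name is hypothetical.

```python
import math

def cartesian_region(i, phi):
    """Return '-' or '+' for shear component i (1 or 2) at azimuthal angle phi (radians).

    Regions repeat every pi/2, so phi is first reduced modulo pi/2.
    """
    quarter = math.pi / 4.0
    if i == 1:
        # Shift by pi/8 so that the '-' region [-pi/8, pi/8) maps onto [0, pi/4)
        t = (phi + math.pi / 8.0) % (math.pi / 2.0)
    elif i == 2:
        # The '-' region is already [0, pi/4)
        t = phi % (math.pi / 2.0)
    else:
        raise ValueError("i must be 1 or 2")
    return '-' if t < quarter else '+'

# A source on the major axis and one at 45 degrees from it
on_axis = (cartesian_region(1, 0.0), cartesian_region(2, 0.1))
diagonal = cartesian_region(1, math.pi / 4.0)
```

Python's `%` operator returns a non-negative result for a positive modulus, so negative angles are handled without special cases.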
5 Magnification bias
In addition to the shape distortions, gravitational lensing can cause focusing of light rays, which results in an amplification of the image flux through the solid-angle distortion (Sect. 2.6.3). Lensing magnification provides complementary and independent observational alternatives to gravitational shear, especially at high redshift where source galaxies are more difficult to resolve (Van Waerbeke et al. 2010; Hildebrandt et al. 2011; Ford et al. 2014; Chiu et al. 2016, 2020).
5.1 Magnified source counts
Let us consider source number counts \(n_0(>F)\) per unit solid angle as a function of the limiting flux F for a given population of background objects (e.g., color–magnitude-selected galaxies, quasars, etc.). In the absence of gravitational lensing, the intrinsic source counts can be written as:
where \({\text {d}}^2V(z)/\text {d}z/\text {d}\varOmega \) is the comoving volume element per redshift interval per unit solid angle, \({\text {d}}^2N(L,z)/\text {d}L/\text {d}V\) is the luminosity function of the background population, \(L(z)=4\pi D_L^2(z)F\) is the luminosity threshold corresponding to the flux limit F at redshift z, with \(D_L(z)\) the luminosity distance, and \({\text{d}}n_0[z|>L(z)]/\text {d}z\) is the redshift distribution function.
Here, we focus on the subcritical regime with \(\mu (\varvec{\theta }) > 0\) (i.e., outside the critical curves). Lensing magnification causes focusing of light rays while conserving the surface brightness (Sect. 2.6.3), resulting in the following two competing effects (Broadhurst et al. 1995; Umetsu 2013):

1. Area distortion: \(\delta \varOmega \rightarrow \mu (\varvec{\theta })\delta \varOmega \);

2. Flux amplification: \(F\rightarrow \mu (\varvec{\theta })F\).
The former effect reduces the geometric area in the source plane, thus decreasing the observed number of background sources per unit solid angle. On the other hand, the latter effect amplifies the flux of background sources, increasing the observed number of sources above the limiting flux.
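In the presence of gravitational lensing, the magnified source counts are given as:
\[ n_\mu (>F) = \int_0^\infty \text{d}z\, \frac{1}{\mu (\varvec{\theta },z)} \frac{\text{d}^2V(z)}{\text{d}z\,\text{d}\varOmega } \int_{L(z)/\mu (\varvec{\theta },z)}^\infty \text{d}L\, \frac{\text{d}^2N(L,z)}{\text{d}L\,\text{d}V} \equiv \int_0^\infty \text{d}z\, \frac{\text{d}n_\mu [z|>L(z)]}{\text{d}z}, \tag{151} \]
where \(\text {d}n_\mu [z|>L(z)]/\text {d}z\) is the magnified redshift distribution function of the source population. Hence, the net change of the magnified source counts \(n_\mu (>F)/n_0(>F)\), known as magnification bias, depends on the intrinsic (unlensed) source luminosity function, \({\text{d}}^2N(L,z)/\text {d}L/\text {d}V\). One can calculate the expectation value for the magnified source counts \(n_\mu (>F)\) for a given background cosmology and a given source luminosity function.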
In real observations, we apply different cuts (e.g., size, magnitude, and color cuts) in source selection for measuring the shear and magnification effects, thus leading to different source-redshift distributions. In contrast to shear measurements, measuring the effect of magnification bias does not require source galaxies to be spatially resolved, but it does require a stringent flux limit against incompleteness effects (Hildebrandt 2016; Chiu et al. 2020).
Equation (151) indicates that, when redshift information of individual source galaxies is available from spectroscopic redshifts, we can directly measure the magnified redshift distribution of background source galaxies for a flux-limited sample (Broadhurst et al. 1995):
\[ \frac{\text{d}n_\mu [z|>L(z)]}{\text{d}z} = \frac{1}{\mu (\varvec{\theta },z)}\, \frac{\text{d}n_0[z|>L(z)/\mu (\varvec{\theta },z)]}{\text{d}z}. \tag{152} \]
Hence, in principle, the lensing-induced distortion of the redshift distribution \(\text {d}n_\mu [z|>L(z)]/\text {d}z\) can be measured from spectroscopic-redshift measurements with respect to the unlensed distribution \(\text {d}n_0[z|>L(z)]/\text {d}z\), which can be measured in random fields. In particular, the integrated magnification-bias effect will translate into an enhancement in mean source redshift of the background sample (i.e., the first moment of Eq. (152)). Using 300,000 BOSS survey galaxies with accurate spectroscopic redshifts, Coupon et al. (2013) measured their mean redshift depth behind four large samples of optically selected clusters from the Sloan Digital Sky Survey (SDSS), totaling 5000–15,000 clusters. They found a \(\gtrsim 1\) percent level of mean redshift increase \(\delta z(R)\) toward the cluster center for SDSS-defined optical clusters with an effective mass of \(M_\mathrm {200c}\sim (1-2)\times 10^{14}M_\odot \).
5.2 Magnification observables
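To simplify the calculations, we discretize Eq. (151) as:
\[ n_\mu (>F) = \sum _s \varDelta z\, \frac{\text{d}n_\mu [z_s|>L(z_s)]}{\text{d}z} \equiv \sum _s n_\mu (z_s|>F), \tag{153} \]
where \(n_\mu (z_s|>F)\) represents a subsample of the background population in the redshift interval \([z_s,z_s+\varDelta z]\). If the change in flux due to magnification is small compared to the range over which the slope of the luminosity function varies, the intrinsic source counts \(n_0[z_s|>L(z_s)]\) can be approximated at \(L(z_s)\) by a power law with a logarithmic slope of^{Footnote 17}:
\[ \alpha \equiv -\left. \frac{\text{d}\log n_0[z_s|>L]}{\text{d}\log L}\right| _{L(z_s)}. \tag{154} \]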
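The magnified source counts \(n_\mu (z_s|>F)\) in the redshift interval \([z_s,z_s+\varDelta z]\) are given by:
\[ n_\mu (z_s|>F) = \mu ^{\alpha -1}\, n_0(z_s|>F). \tag{155} \]
The corresponding magnification bias is given by:
\[ b_\mu \equiv \frac{n_\mu (z_s|>F)}{n_0(z_s|>F)} = \mu ^{\alpha -1}. \tag{156} \]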
Equation (156) implies a positive bias for \(\alpha > 1\) and a negative bias for \(\alpha < 1\). The net magnification effect on the source counts vanishes at \(\alpha =1\). For a depleted sample of background sources with \(\alpha \ll 1\), the effect of magnification bias is dominated by the geometric area distortion (\(b_\mu \rightarrow \mu ^{-1}\) at \(\alpha \rightarrow 0\)) and insensitive to the intrinsic source luminosity function (Umetsu 2013). In the weak-lensing limit (\(|\gamma |,\kappa \ll 1\)), we have:
\[ b_\mu \simeq 1 + 2(\alpha -1)\kappa . \tag{157} \]
Hence, the flux magnification bias \(b_\mu \) in the weak-lensing limit provides a local measure of the surface mass density field, \(\kappa (\varvec{\theta })\). The combination of shear and magnification can thus be used to break or alleviate the mass-sheet degeneracy (Schneider et al. 2000; Broadhurst et al. 2005b; Umetsu and Broadhurst 2008; Umetsu et al. 2011b, 2014, 2018).
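As a rough numerical illustration of Eqs. (156) and (157), the following minimal Python sketch compares the power-law magnification bias \(b_\mu =\mu ^{\alpha -1}\) with its weak-lensing approximation \(1+2(\alpha -1)\kappa \). The function names and the sample values \(\kappa =|\gamma |=0.05\) are illustrative assumptions, not values taken from the studies cited above; the magnification is assumed to be \(\mu =1/[(1-\kappa )^2-|\gamma |^2]\), valid outside the critical curves.

```python
def magnification(kappa, gamma):
    """Magnification mu = 1 / [(1 - kappa)^2 - |gamma|^2], outside the critical curves."""
    det = (1.0 - kappa) ** 2 - gamma ** 2
    if kappa >= 1.0 or det <= 0.0:
        raise ValueError("(kappa, gamma) is not in the subcritical regime")
    return 1.0 / det


def magnification_bias(kappa, gamma, alpha):
    """Power-law magnification bias, b_mu = mu^(alpha - 1)."""
    return magnification(kappa, gamma) ** (alpha - 1.0)


def magnification_bias_linear(kappa, alpha):
    """Weak-lensing limit, b_mu ~ 1 + 2 (alpha - 1) kappa."""
    return 1.0 + 2.0 * (alpha - 1.0) * kappa


# Illustrative (assumed) lensing fields and count slopes.
kappa, gamma = 0.05, 0.05
for alpha in (0.4, 1.0, 2.0):
    exact = magnification_bias(kappa, gamma, alpha)
    linear = magnification_bias_linear(kappa, alpha)
    print(f"alpha={alpha:.1f}  b_mu(exact)={exact:.4f}  b_mu(linear)={linear:.4f}")
```

For slopes below unity the sketch returns a depletion (\(b_\mu <1\)), for slopes above unity an enhancement, and no net effect at \(\alpha =1\), in line with the discussion above.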
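In practical applications, we need to average over a broad range of source redshifts to increase the S/N. The magnification bias averaged over the source-redshift distribution is expressed as:
\[ \langle b_\mu \rangle = \frac{\sum _s n_\mu (z_s|>F)}{\sum _s n_0(z_s|>F)} = \frac{\sum _s n_0(z_s|>F)\,\mu ^{\alpha -1}}{\sum _s n_0(z_s|>F)}. \tag{158} \]
In the continuous limit \(\sum _s n_0(z_s|>F) \rightarrow \int \text {d}z\,\text {d}n_0(z|>F)/\text {d}z\), we have the following equation (Umetsu 2013; Umetsu et al. 2016):
\[ \langle b_\mu \rangle = \frac{n_\mu (>F)}{n_0(>F)} = \left[ \int _0^\infty \text{d}z\, \frac{\text{d}n_0(z|>F)}{\text{d}z}\right] ^{-1} \int _0^\infty \text{d}z\, \frac{\text{d}n_0(z|>F)}{\text{d}z}\,\mu ^{\alpha -1}. \tag{159} \]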
Equation (159) gives a general expression for the flux magnification bias. Deep multiband photometry spanning a wide wavelength range allows us to identify distinct populations of background galaxies (e.g., Medezinski et al. 2010, 2011, 2018b; Umetsu et al. 2012, 2014, 2015). Since a given flux limit (F) corresponds to different intrinsic luminosities \(L(z_s)\) at different source redshifts \(z_s\) (Eq. (150)), source counts of distinctly different background populations probe different regimes of magnification bias. The bias is strongly negative for quiescent galaxies at \(\langle z_s\rangle \sim 1\), with a faint-end slope of \(\alpha \sim 0.4\) at the limiting magnitude \(z'\approx 25.6\,\hbox {ABmag}\) (e.g., Umetsu et al. 2014, 2015). A net count depletion (\(b_\mu <1\)) results for such a source population with \(\alpha \ll 1\) (e.g., Broadhurst 1995; Taylor et al. 1998; Broadhurst et al. 2005b; Umetsu and Broadhurst 2008; Umetsu et al. 2011b, 2012, 2014, 2015; Ford et al. 2012; Coe et al. 2012; Medezinski et al. 2013; Radovich et al. 2015; Ziparo et al. 2016; Wong et al. 2017), because the effect of magnification bias is dominated by the geometric area distortion. In the regime of density depletion, a practical advantage is that the effect is not sensitive to the exact form of the source luminosity function. The S/N for detection of \(b_\mu \) increases progressively as the flux limit F decreases.
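In practice, the count slope is often estimated from cumulative counts as a function of limiting magnitude. Since \(F\propto 10^{-0.4m}\), the logarithmic slope in flux is related to the magnitude slope by \(\alpha =2.5\,\text {d}\log _{10}n_0(<m)/\text {d}m\). The short Python sketch below estimates \(\alpha \) by a finite difference between two magnitude limits and evaluates the resulting power-law bias. The toy counts, the helper names, and the adopted magnification \(\mu =1.2\) are assumptions made purely for illustration.

```python
import math


def count_slope_alpha(m_lo, n_lo, m_hi, n_hi):
    """Estimate alpha = 2.5 * dlog10 n(<m)/dm from cumulative counts at two limits."""
    s = (math.log10(n_hi) - math.log10(n_lo)) / (m_hi - m_lo)
    return 2.5 * s


def toy_cumulative_counts(m, s=0.16, norm=1.0):
    """Hypothetical cumulative counts n(<m) = norm * 10^(s m); not real data."""
    return norm * 10.0 ** (s * m)


# Slope around an assumed limiting magnitude of 25.5 (toy numbers).
m_lo, m_hi = 25.0, 26.0
alpha = count_slope_alpha(m_lo, toy_cumulative_counts(m_lo),
                          m_hi, toy_cumulative_counts(m_hi))

mu = 1.2                      # assumed local magnification
b_mu = mu ** (alpha - 1.0)    # power-law magnification bias
print(f"alpha = {alpha:.2f}, b_mu = {b_mu:.3f}")
```

With these toy counts the slope is below unity, so the sketch yields a net depletion, as in the case of the quiescent background population described above.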
Figure 19 displays weak-lensing radial profiles for the cluster MACS J1206.2−0847 at \(z_l = 0.439\) derived from Subaru Suprime-Cam observations (Umetsu 2013). It is a highly massive X-ray cluster with \(M_\mathrm {200c}=(11.1\pm 2.5)\times 10^{14}h^{-1}M_\odot \) (Umetsu et al. 2014) targeted by the CLASH survey. The black squares in the top panel show the reduced tangential shear profile \(g_+(R)\). The blue and red circles in the bottom panel are positive and negative magnification-bias measurements \(n_\mu (R)\) showing density enhancement and depletion, respectively, as a function of cluster-centric radius R. These weak-lensing measurements yield respective S/N values of 10.2, 2.9, and 4.7. Figure 19 also shows a joint Bayesian reconstruction of each observed profile obtained from combined strong-lensing, weak shear lensing, and positive/negative magnification-bias measurements.
5.3 Nonlinear effects on the source-averaged magnification bias
It is instructive to consider a maximally depleted population of source galaxies with \(\alpha =-\text {d}\log {n_0(>F)}/\text {d}\log {F}=0\) at the limiting flux F. For such a population, the effect of magnification bias is purely geometric, \(\langle b_\mu \rangle =\langle \mu ^{-1}\rangle \), and insensitive to details of the intrinsic source luminosity function, \({\text{d}}^2N(L,z)/{\text{d}}L/{\text{d}}z\). In the nonlinear subcritical regime, the source-averaged inverse magnification factor is expressed as (Umetsu 2013):
\[ \langle \mu ^{-1}\rangle = (1-\langle \kappa \rangle )^2 - \langle \gamma \rangle ^2 + \varDelta \langle \mu ^{-1}\rangle , \qquad \varDelta \langle \mu ^{-1}\rangle = (f_l-1)\left( \langle \kappa \rangle ^2-\langle \gamma \rangle ^2\right) , \tag{160} \]
where \(\langle \cdots \rangle \) denotes the averaging over the source-redshift distribution (see Eq. (159)), \(f_l=\langle \varSigma _{\mathrm {cr},l}^{-2}\rangle /\langle \varSigma _{\mathrm {cr},l}^{-1}\rangle ^2\) is a quantity of the order of unity, and \(\varDelta \langle \mu ^{-1}\rangle \) is the correction with respect to the single source-plane approximation. The error associated with the single source-plane approximation is \(\varDelta \langle \mu ^{-1}\rangle /\langle \mu ^{-1}\rangle \simeq (f_l-1)\left( \langle \kappa \rangle ^2-\langle \gamma \rangle ^2\right) \), which is much smaller than unity for background populations with \(\alpha \sim 0\) in the mildly nonlinear subcritical regime where \(\langle \kappa \rangle \sim \langle \gamma \rangle \sim O(10^{-1})\). It is therefore reasonable to use the single source-plane approximation for calculating the magnification bias of depleted source populations with \(\alpha \ll 1\).
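The size of the correction to the single source-plane approximation can be checked with a small numerical experiment. The Python sketch below assumes that \(\kappa \) and \(\gamma \) of each source scale linearly with a relative lensing strength proportional to \(\varSigma _\mathrm {cr}^{-1}\), and compares the directly averaged \(\langle \mu ^{-1}\rangle \) with the single source-plane value plus the \((f_l-1)\) correction term. The source strengths, weights, and reference values of \(\kappa \) and \(\gamma \) are toy numbers chosen for illustration only, and the helper names are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean over the source population."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def inverse_mu_exact(kappa_ref, gamma_ref, strengths, weights):
    """Directly averaged <mu^-1>, with kappa_s = s * kappa_ref and gamma_s = s * gamma_ref."""
    inv_mu = [(1.0 - s * kappa_ref) ** 2 - (s * gamma_ref) ** 2 for s in strengths]
    return weighted_mean(inv_mu, weights)


def inverse_mu_single_plane(kappa_ref, gamma_ref, strengths, weights):
    """Single source-plane value, its (f_l - 1) correction, and f_l itself."""
    s_mean = weighted_mean(strengths, weights)
    kappa_mean = s_mean * kappa_ref
    gamma_mean = s_mean * gamma_ref
    single = (1.0 - kappa_mean) ** 2 - gamma_mean ** 2
    f_l = weighted_mean([s * s for s in strengths], weights) / s_mean ** 2
    correction = (f_l - 1.0) * (kappa_mean ** 2 - gamma_mean ** 2)
    return single, correction, f_l


# Toy source population: relative lensing strengths (proportional to 1/Sigma_cr) and weights.
strengths = [0.6, 0.8, 1.0, 1.2]
weights = [1.0, 2.0, 2.0, 1.0]
kappa_ref, gamma_ref = 0.10, 0.08   # assumed values at unit strength

exact = inverse_mu_exact(kappa_ref, gamma_ref, strengths, weights)
single, correction, f_l = inverse_mu_single_plane(kappa_ref, gamma_ref, strengths, weights)
print(f"f_l = {f_l:.4f}")
print(f"<mu^-1> exact = {exact:.6f}, single plane = {single:.6f}, correction = {correction:.2e}")
```

For this toy population the correction term is a small fraction of \(\langle \mu ^{-1}\rangle \), consistent with the statement that the single source-plane approximation is adequate in the mildly nonlinear regime.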
In the regime of density enhancement (\(\alpha >1\)), on the other hand, interpreting the observed lensing signal requires detailed knowledge of the intrinsic source luminosity function (see, e.g., Chiu et al. 2016, 2020), especially in the nonlinear regime where the flux amplification factor is correspondingly large (say, \(\mu \gtrsim 1.5\)). For example, a blue distant population of background galaxies is observed to have a well-defined redshift distribution that is fairly symmetric and peaked at a mean redshift of \(\langle z_s\rangle \sim 2\) (e.g., Lilly et al. 2007; Medezinski et al. 2010). Therefore, the majority of these faint blue galaxies are in the far background of typical cluster lenses, so that the lensing signal has a weaker dependence on the source redshift \(z_s\). In such a case, the single source-plane approximation may well be justified (Umetsu 2013).
5.4 Observational systematics and null tests
In real observations, contamination of the background sample by unlensed galaxies is a critical source of systematics in cluster weak lensing, as discussed in Sect. 4.1. In particular, contamination by cluster galaxies has a direct impact on the interpretation of background source counts, because it will cause an apparent density enhancement at small cluster-centric radii. To minimize such contamination, one often relies on a stringent color–color selection (Sect. 4.1) to measure the lensing magnification signal from a distinct population of background galaxies (e.g., Broadhurst et al. 2005b; Umetsu and Broadhurst 2008; Umetsu et al. 2011b, 2014; Ziparo et al. 2016; Chiu et al. 2016, 2020). If well-calibrated photo-z PDFs are available from multiband observations, the impact of cluster contamination can be characterized and assessed by statistically decomposing the photo-z PDF P(z) into the cluster and random-field populations (e.g., Gruen et al. 2014; Chiu et al. 2020).
Moreover, for unbiased magnification-bias measurements, one has to correct for the incomplete area coverage due to masked regions and incomplete measurement annuli. Specifically, masking (or blocking) of background galaxies by foreground objects, cluster members, and saturated or bad pixels needs to be properly accounted for (see Umetsu et al. 2011b; Chiu et al. 2020). Another concern is the impact of blending effects in the crowded regions of cluster environments and the presence of intracluster light (Gruen et al. 2019a), which could bias the photometry and thus photo-z's. The effects of masking and blending on the source counts can be examined and quantified by injecting synthetic galaxies into real images from observations (Huang et al. 2018; Chiu et al. 2020).
Since the net effect of magnification bias is expected to vanish for a flux-limited background sample defined at \(\alpha =1\) (Sect. 5.2), weak-lensing magnification provides a powerful null test, similar to the cross-shear (B-mode) signal in the case of weak shear lensing (Sect. 4.2). By performing a null test, one can empirically assess the level of residual bias that could be present in the measurement for a “lensing-cut” sample defined at \(\alpha > 1\) or \(\alpha < 1\). The only assumption made in this approach is that residual systematics are the same between the lensing-cut and null-test samples defined at different flux (magnitude) limits. This null test allows us to quantify the impact of deblending effects, biased photometry in crowded regions, and any incorrect assumptions about the P(z) decomposition (Chiu et al. 2020). This is demonstrated in Fig. 20. The figure shows the stacked magnification-bias profiles around a sample of 3029 CAMIRA clusters with richness \(N>15\) in the redshift range \(0.2\leqslant z < 1.1\), obtained using flux-limited low-z and high-z background samples, as well as the joint sample, selected in the \(g-i\) versus \(r-z\) diagram from HSC-SSP survey data (Chiu et al. 2020). The magnification-bias signal for the full CAMIRA sample is detected at a significance level of \(9.5\sigma \). On the other hand, the residual bias estimated from the null-test samples was found to be statistically consistent with zero (Chiu et al. 2020).
6 Recent advances in cluster weak-lensing observations
Galaxy clusters provide valuable information on topics ranging from the physics driving cosmic structure formation to the nature of dark matter and dark energy. Their content reflects that of the universe: \(\sim 85\%\) dark matter and \(\sim 15\%\) baryons (cf. \(\varOmega _\mathrm {b}/\varOmega _\mathrm {m}=(15.7\pm 0.4)\%\); Planck Collaboration et al. 2016a), with \(\sim 90\%\) of the baryons residing in a hot, X-ray-emitting phase of the intracluster medium. Massive clusters dominated by dark matter are not expected to be significantly affected by baryonic gas cooling (Blumenthal et al. 1986; Duffy et al. 2010), unlike individual galaxies, because the high temperature and low density prevent efficient cooling and gas contraction. Consequently, for clusters in a state of quasi-equilibrium, the form of their total mass profiles reflects closely the underlying dark-matter distribution. Hence, galaxy clusters offer fundamental tests on the assumed properties of dark matter, as well as on models of nonlinear structure formation.
The \({\varLambda }\hbox {CDM}\) paradigm assumes that dark matter is effectively cold (nonrelativistic) and collisionless on astrophysical and cosmological scales (Bertone and Tait 2018). In this context, the standard CDM model and its variants, such as self-interacting dark matter (SIDM; Spergel and Steinhardt 2000) and wave dark matter (\(\psi \)DM; Peebles 2000; Hu et al. 2000; Schive et al. 2014), can provide a range of testable predictions for the properties of cluster-size halos (Sects. 6.1 and 6.2). A prime example is the “Bullet Cluster”, a merging pair of clusters exhibiting a significant offset between the centers of the gravitational lensing mass and the X-ray peaks of the collisional cluster gas (Clowe et al. 2004, 2006). The data support the view that dark matter in clusters is effectively collisionless, like galaxies, placing a robust upper limit on the self-interacting cross-section for dark matter of \(\sigma _\mathrm {DM}/m<1.25\,\hbox {cm}^2\,\hbox {g}^{-1}\) (Randall et al. 2008).
The abundance of clusters as a function of redshift provides a sensitive probe of the amplitude and growth rate of primordial density perturbations, as well as of the cosmological volume element \({\text{d}}^2V(z)/{\text{d}}z/{\text{d}}\varOmega \). This cosmological sensitivity arises mainly because clusters populate the high-mass exponential tail of the halo mass function (e.g., Haiman et al. 2001). In principle, galaxy clusters can complement other cosmological probes, such as CMB, galaxy clustering, cosmic shear, and distant supernova observations. To place cosmological constraints using clusters, however, it is essential to study large cluster samples with well-characterized selection functions, spanning a wide range in mass and redshift (Allen et al. 2004; Vikhlinin et al. 2009; Mantz et al. 2010). Currently, the ability of galaxy clusters to provide robust cosmological constraints is limited by systematic uncertainties in their mass calibration (Pratt et al. 2019). Since the level of mass bias is sensitive to calibration systematics of the instruments (Donahue et al. 2014; Israel et al. 2015) and is likely mass dependent (Sereno and Ettori 2017; Umetsu et al. 2020), a concerted effort is needed to enable an accurate mass calibration with weak gravitational lensing (see Sect. 6.4).
Substantial progress has been made in constructing statistical samples of galaxy clusters thanks to dedicated wide-field surveys in various wavelengths (Planck Collaboration et al. 2014a, 2015a; Bleem et al. 2015; Miyazaki et al. 2018b; Oguri et al. 2018). Systematic lensing studies of galaxy clusters often target X-ray- or SZE-selected samples (e.g., von der Linden et al. 2014a; Postman et al. 2012; Gruen et al. 2014; Hoekstra et al. 2015; Sereno et al. 2017). This is because the hot intracluster gas provides an excellent tracer of the cluster’s gravitational potential (e.g., Donahue et al. 2014), except for the cases of massive cluster collisions caught in an ongoing phase of dissociative mergers (Clowe et al. 2006; Okabe and Umetsu 2008). Moreover, X-ray and SZE observations provide useful centering information of individual clusters. The effect of off-centered clusters is to dilute and flatten the observed \(\varSigma (R)\) profile at scales smaller than the offset scale \(\sigma _\mathrm {off}\) (Johnston et al. 2007; George et al. 2012; Du and Fan 2014). Since flattened, sheet-like mass distributions produce little shear, the impact of miscentering on \(\varDelta \varSigma (R)\) is much larger. The off-centered \(\varDelta \varSigma (R)\) profile is strongly suppressed by smoothing at scales \(R\lesssim 2.5\sigma _\mathrm {off}\) (Johnston et al. 2007). A comparison of X-ray, SZE, and optical (e.g., BCGs) center positions allows us to empirically assess the level of halo miscentering. It should be noted, however, that a merger can boost the X-ray and SZE signals, and make their peaks off-centered during the compression phase (Molnar et al. 2012). Although the timescale on which this happens is expected to be short (\(\sim 1\,\hbox {Gyr}\); Ricker and Sarazin 2001), it could induce a selection effect and contribute to the scatter in their scaling relations (Umetsu et al. 2020).
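The dilution and flattening of \(\varSigma (R)\) by miscentering described above can be illustrated with a toy calculation. The Python sketch below azimuthally averages a centrally peaked surface density around a point displaced by a single fixed offset \(R_\mathrm {off}\), using \(\varSigma (R|R_\mathrm {off})=(2\pi )^{-1}\int _0^{2\pi }\text {d}\theta \,\varSigma \big (\sqrt{R^2+R_\mathrm {off}^2+2RR_\mathrm {off}\cos \theta }\big )\). The cored power-law profile, the fixed offset (rather than a distribution of offsets with scale \(\sigma _\mathrm {off}\)), and all numerical values are assumptions made for illustration, not a model used in the cited studies.

```python
import math


def sigma_toy(R, core=0.01):
    """Toy centrally peaked surface density (arbitrary units); not a fitted cluster model."""
    return 1.0 / math.sqrt(R * R + core * core)


def sigma_miscentered(R, R_off, sigma=sigma_toy, n_theta=720):
    """Azimuthal average of sigma around a point offset by R_off from the true center."""
    total = 0.0
    for k in range(n_theta):
        theta = 2.0 * math.pi * (k + 0.5) / n_theta   # midpoint rule in angle
        d2 = R * R + R_off * R_off + 2.0 * R * R_off * math.cos(theta)
        total += sigma(math.sqrt(max(d2, 0.0)))
    return total / n_theta


R_off = 0.2   # assumed offset, in the same (arbitrary) length units as R
for R in (0.02, 0.05, 0.2, 0.5, 2.0):
    centered = sigma_toy(R)
    offset = sigma_miscentered(R, R_off)
    print(f"R={R:5.2f}  centered={centered:8.3f}  miscentered={offset:8.3f}")
```

Inside the offset scale the miscentered profile is strongly suppressed and nearly flat, while well outside it the two profiles converge.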
In this section, we review recent advances in our understanding of the distribution and amount of mass in galaxy clusters based on cluster weak-lensing observations.
6.1 Cluster mass distribution
The distribution and concentration of mass in dark-matter-dominated halos depend fundamentally on the properties of dark matter. For the case of collisionless CDM models, cosmological N-body simulations with sufficiently high resolution can provide accurate predictions for the end product of collisionless collapse in an expanding universe. Although the formation of halos is a complex, nonlinear dynamical process and halos are evolving through accretion and mergers, \({\varLambda }\hbox {CDM}\) models predict that the structure of quasi-equilibrium halos characterized in terms of the spherically averaged density profile \(\rho (r)\) is approximately self-similar with a characteristic density cusp in their centers, \(\rho (r)\propto 1/r\) (Navarro et al. 1996, 1997). The density profile \(\rho (r)\) of dark-matter-dominated halos steepens continuously with radius and it is well described by the NFW form out to the virial radius, albeit with large variance associated with the assembly histories of individual halos (Jing and Suto 2000).
Subsequent numerical studies with improved statistics and higher resolution found that the spherically averaged density profiles of \({\varLambda }\hbox {CDM}\) halos are better approximated by the three-parameter Einasto function with an additional degree of freedom (Merritt et al. 2006; Gao et al. 2008), which is closely linked with the mass accretion history of halos (Ludlow et al. 2013). The Einasto profile has a power-law logarithmic slope of \(\gamma _\mathrm {3D}(r) = -2(r/r_{-2})^{\alpha _\mathrm {E}}\) (Sect. 4.3.3). For a given halo concentration, an Einasto profile with \(\alpha _\mathrm {E}\approx 0.18\) closely resembles the NFW profile over roughly two decades in radius (Ludlow et al. 2013). The shape parameter \(\alpha _\mathrm {E}\) of \({\varLambda }\hbox {CDM}\) halos increases gradually with halo mass and redshift (see Gao et al. 2008; Child et al. 2018; \(0.15\lesssim \alpha _\mathrm {E}\lesssim 0.25\) at \(z=0\)), so that the density profiles of \({\varLambda }\hbox {CDM}\) halos are not strictly self-similar (Navarro et al. 2010). By analyzing a large suite of N-body simulations in \({\varLambda }\hbox {CDM}\), Child et al. (2018) found that both Einasto and NFW profiles provide a good description of the stacked mass distributions of cluster-size halos at low redshift, implying that the two fitting functions are nearly indistinguishable for stacked ensembles of low-redshift clusters, in contrast to clusters at higher redshift (\(z\gtrsim 1\)).
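The similarity of the two profiles can be seen by comparing their logarithmic density slopes. The Python sketch below evaluates the Einasto slope \(\gamma _\mathrm {3D}=-2(r/r_{-2})^{\alpha _\mathrm {E}}\) and the NFW slope \(-(1+3x)/(1+x)\) with \(x=r/r_s\), using the fact that the NFW slope equals \(-2\) at \(r=r_s\), so that \(r_s=r_{-2}\). The radial grid and the function names are illustrative assumptions.

```python
def slope_einasto(x, alpha_e=0.18):
    """Einasto logarithmic slope dln(rho)/dln(r) at x = r / r_-2."""
    return -2.0 * x ** alpha_e


def slope_nfw(x):
    """NFW logarithmic slope dln(rho)/dln(r) at x = r / r_s (with r_s = r_-2)."""
    return -(1.0 + 3.0 * x) / (1.0 + x)


# Compare the slopes over two decades in radius around r_-2.
grid = [10.0 ** (-1.0 + 0.1 * k) for k in range(21)]
max_diff = max(abs(slope_einasto(x) - slope_nfw(x)) for x in grid)

for x in (0.1, 0.3, 1.0, 3.0, 10.0):
    print(f"r/r_-2={x:5.1f}  Einasto={slope_einasto(x):6.3f}  NFW={slope_nfw(x):6.3f}")
print(f"largest slope difference on the grid: {max_diff:.3f}")
```

Both slopes equal \(-2\) at \(r=r_{-2}\) and differ only modestly over \(0.1\leqslant r/r_{-2}\leqslant 10\) for \(\alpha _\mathrm {E}=0.18\); the Einasto slope keeps changing as a power law outside this range, whereas the NFW slope tends to \(-1\) and \(-3\) at small and large radii.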
The three-dimensional shape of collisionless halos is predicted to be generally triaxial with a preference for prolate shapes (Warren et al. 1992; Jing and Suto 2002), reflecting the collisionless nature of dark matter (Ostriker and Steinhardt 2003). Older halos tend to be more relaxed and thus to be rounder. Since more massive halos form later on average, cluster-size halos are expected to be more elongated than less massive systems (Shaw et al. 2006; Ho et al. 2006; Despali et al. 2014, 2017; Bonamigo et al. 2015). Accretion of matter from the surrounding large-scale environment also plays a key role in determining the shape and orientation of halos. The halo orientation tends to be in the preferential infall direction of the subhalos and hence aligned along the surrounding filaments (Shaw et al. 2006). The shape and orientation of galaxy clusters thus provide an independent test of models of structure formation (see Sect. 4.6).
Prior to dedicated wide-field optical imaging surveys such as Subaru HSC-SSP and DES, several cluster lensing surveys carried out deep targeted observations toward a few tens to several tens of highly massive galaxy clusters with \(M_\mathrm {200c}\sim 10^{15}M_\odot \) (e.g., Postman et al. 2012; Okabe et al. 2013; von der Linden et al. 2014a; Hoekstra et al. 2015). Since such clusters are extremely rare across the sky, targeted weak-lensing observations with deep multiband imaging currently represent the most efficient approach to study in detail the high-mass population of galaxy clusters each with sufficiently high S/N (see Contigiani et al. 2019).
In the last decade, cluster–galaxy weak-lensing observations have established that the total matter distribution within clusters in projection can be well described by cuspy, outward-steepening density profiles (Umetsu et al. 2011b, 2014, 2016; Beraldo e Silva et al. 2013; Newman et al. 2013b; Okabe et al. 2013; Sereno et al. 2017), such as the NFW and Einasto profiles with a near-universal shape (Niikura et al. 2015; Umetsu and Diemer 2017), as predicted for collisionless halos in quasi-gravitational equilibrium (e.g., Navarro et al. 1996, 1997; Taylor and Navarro 2001; Merritt et al. 2006; Gao et al. 2008; Hjorth and Williams 2010; Williams and Hjorth 2010). Moreover, the shape and orientation of galaxy clusters as constrained by weak-lensing and multiwavelength data sets are found to be in agreement with \({\varLambda }\hbox {CDM}\) predictions (e.g., Oguri et al. 2005; Evans and Bridle 2009; Oguri et al. 2010; Morandi et al. 2012; Sereno et al. 2013, 2018b; Umetsu et al. 2015, 2018; Chiu et al. 2018; Shin et al. 2018), although detailed studies of individual clusters are currently limited to a relatively small number of high-mass clusters with deep multiwavelength observations (see Sereno et al. 2018b; Umetsu et al. 2018). These results are all in support of the standard explanation for dark matter as effectively collisionless and nonrelativistic on sub-megaparsec scales and beyond, with an excellent match with standard \({\varLambda }\hbox {CDM}\) predictions (however, see Meneghetti et al. 2020, for an excess of galaxy–galaxy strong-lensing events in clusters with respect to \({\varLambda }\hbox {CDM}\)).
In Fig. 21, we show the ensemble-averaged \(\langle \langle \varDelta \varSigma _+\rangle \rangle \) profile in the radial range \(R\in [0.1,2.8]\,h^{-1}\mathrm {Mpc}\) obtained for a stacked sample of 50 X-ray clusters (Okabe and Smith 2016) targeted by the LoCuSS Survey (Local Cluster Substructure Survey; Smith et al. 2016). Their weak shear lensing analysis is based on two-band imaging observations with Subaru/Suprime-Cam. Their cluster sample is drawn from the ROSAT All-Sky Survey (RASS; Voges et al. 1999) at \(0.15<z<0.3\) and is approximately X-ray luminosity limited. The stacked shear profile of the LoCuSS sample is in excellent agreement with the NFW profile with \(M_\mathrm {200c}=6.37^{+0.28}_{-0.27}\times 10^{14}h^{-1}M_\odot \) and \(c_\mathrm {200c}=3.69^{+0.26}_{-0.24}\) at \(z_l=0.23\). The 2-halo term contribution to \(\varDelta \varSigma \) for the LoCuSS sample is negligibly small in the radial range \(\lesssim 2r_\mathrm {200c}\). From a single Einasto profile fit to the stacked \(\langle \langle \varDelta \varSigma \rangle \rangle \) profile, Okabe and Smith (2016) obtained the best-fit Einasto shape parameter of \(\alpha _\mathrm {E}=0.161^{+0.042}_{-0.041}\), which is consistent within the errors with the \({\varLambda }\hbox {CDM}\) predictions for cluster-size halos at \(z_l=0.23\) (Gao et al. 2008; Dutton and Macciò 2014).
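To relate the quoted mass and concentration to physical scales, the overdensity radius follows from \(M_\mathrm {200c}=(4\pi /3)\,200\,\rho _\mathrm {c}(z)\,r_\mathrm {200c}^3\), and the NFW scale radius from \(r_s=r_\mathrm {200c}/c_\mathrm {200c}\). The Python sketch below carries out this conversion for the stacked LoCuSS best-fit values. The flat \({\varLambda }\hbox {CDM}\) cosmology with \(\varOmega _\mathrm {m}=0.3\) and the function names are assumptions of this sketch and are not necessarily those adopted by Okabe and Smith (2016); the present-day critical density is taken as \(\rho _\mathrm {c,0}\approx 2.775\times 10^{11}h^2M_\odot \,\mathrm {Mpc}^{-3}\).

```python
import math

RHO_CRIT0 = 2.775e11   # present-day critical density in h^2 Msun / Mpc^3


def e2(z, omega_m0=0.3):
    """E(z)^2 = H(z)^2 / H0^2 for an assumed flat LambdaCDM cosmology."""
    return omega_m0 * (1.0 + z) ** 3 + 1.0 - omega_m0


def r_delta_c(mass, z, delta=200.0, omega_m0=0.3):
    """Radius (h^-1 Mpc) enclosing a mean density of delta * rho_c(z), for mass in h^-1 Msun."""
    rho_c = RHO_CRIT0 * e2(z, omega_m0)
    return (3.0 * mass / (4.0 * math.pi * delta * rho_c)) ** (1.0 / 3.0)


# Stacked LoCuSS best-fit NFW values quoted in the text.
m200c, c200c, z_l = 6.37e14, 3.69, 0.23
r200c = r_delta_c(m200c, z_l)
r_s = r200c / c200c
print(f"r200c = {r200c:.2f} h^-1 Mpc, r_s = {r_s:.2f} h^-1 Mpc")
```

Because the mass is expressed in \(h^{-1}M_\odot \) and the radius in \(h^{-1}\mathrm {Mpc}\), the Hubble parameter h cancels in this conversion.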
Figure 22 shows the ensemble-averaged \(\langle \langle \varSigma \rangle \rangle \) profile of 16 CLASH X-ray-selected clusters (Umetsu et al. 2016) based on a joint strong- and weak-lensing analysis of 16-band Hubble Space Telescope (HST) observations (Zitrin et al. 2015) and wide-field multicolor imaging taken primarily with Subaru/Suprime-Cam (Umetsu et al. 2014). The CLASH survey is an HST Multi-Cycle Treasury program designed to study with 525 assigned orbits the mass distributions of 25 high-mass clusters. In this sample, 20 clusters were selected to have regular X-ray morphologies and X-ray temperatures above 5 keV. Numerical simulations suggest that this X-ray-selected subsample is mostly composed of relaxed clusters (\(\sim 70\%\)) but the rest (\(\sim 30\%\)) are unrelaxed systems (Meneghetti et al. 2014). Another subset of five clusters were selected by their high-magnification lensing properties. Umetsu et al. (2016) studied a subset of 20 CLASH clusters (16 X-ray-selected and 4 high-magnification systems) taken from Umetsu et al. (2014), who presented a joint shear and magnification weak-lensing analysis of these individual clusters. The stacked \(\langle \langle \varSigma \rangle \rangle \) profile over 2 decades in radius, \(R\in [0.02,2]r_\mathrm {200m}\), is well described by a family of density profiles predicted for cuspy dark-matter-dominated halos in gravitational equilibrium, namely, the NFW, Einasto, and DARKexp models (Umetsu et al. 2016).^{Footnote 18} In contrast, the single power-law, cored-isothermal, and Burkert density profiles are statistically disfavored by the data. Cuspy halo models that include the 2-halo term provide improved agreement with the data.
Umetsu et al. (2016) found the best-fit NFW parameters for the stacked CLASH \(\langle \langle \varSigma \rangle \rangle \) profile of \(M_\mathrm {200c}=10.1^{+0.8}_{-0.7}\times 10^{14}h^{-1}M_\odot \) and \(c_\mathrm {200c}=3.76^{+0.29}_{-0.27}\) at a lensing-weighted mean redshift of \(z_l\approx 0.34\). Similarly, the best-fit Einasto shape parameter for the stacked \(\langle \langle \varSigma \rangle \rangle \) profile is \(\alpha _\mathrm {E}=0.232^{+0.042}_{-0.038}\), which is in excellent agreement with predictions from \({\varLambda }\hbox {CDM}\) numerical simulations, \(\alpha _\mathrm {E}=0.21 \pm 0.07\) (Meneghetti et al. 2014; \(\alpha _\mathrm {E}=0.24\pm 0.09\) when fitted to surface mass density profiles of projected halos).
Note that the innermost bin in Fig. 22 represents the mean density interior to \(R_\mathrm {min}=40\,h^{-1}\mathrm {kpc}\) corresponding to the typical resolution limit of their HST strong-lensing analysis, \(\delta \vartheta \approx 10\,\hbox {arcsec}\). This scale \(R_\mathrm {min}\) is about twice the typical half-light radius of the CLASH BCGs (see Tian et al. 2020), within which the stellar baryons dominate the total mass of the clusters (e.g., Caminha et al. 2019). Determinations of the central slope of the dark-matter density profile \(\rho _\mathrm {DM}(r)\) in clusters require additional constraints on the total mass in the innermost region, such as from stellar kinematics of the BCG (Newman et al. 2013a). To constrain \(\rho _\mathrm {DM}(r)\), one needs to carefully model the different contributions to the cluster total mass profile, coming from the stellar mass of member galaxies, the hot gas component, the BCG stellar mass, and dark matter (Sartoris et al. 2020). Moreover, it is important to take into account the velocity anisotropy in the interpretation of the line-of-sight stellar velocity dispersion profile of the BCG (Schaller et al. 2015; Sartoris et al. 2020; He et al. 2020). For these reasons, current measurements and interpretations of the asymptotic central slope of \(\rho _\mathrm {DM}(r)\) in galaxy clusters appear to be controversial (e.g., Newman et al. 2013a; Sartoris et al. 2020; He et al. 2020).
According to cosmological N-body simulations, the spherically averaged density profiles in the halo outskirts are most self-similar when expressed in units of overdensity radii \(r_{\varDelta _\mathrm {m}}\) defined with respect to the mean density of the universe, \(\overline{\rho }(z)\), especially to \(r_\mathrm {200m}\) (Diemer and Kravtsov 2014). This self-similarity indicates that overdensity radii defined with respect to the mean cosmic density are preferred to describe the structure and evolution of the outer density profiles. The structure and dynamics of the infall region are expected to be universal in units of the turnaround radius, according to self-similar infall models (Gunn and Gott 1972; Fillmore and Goldreich 1984; Bertschinger 1985; Shi 2016). In these models, the turnaround radius is a fixed multiple of the radius enclosing a given fixed overdensity with respect to the mean cosmic density. The outer profiles can thus be expected to be self-similar in \(r/r_\mathrm {200m}\). In contrast, the density profiles in the intrahalo (1-halo) region are found to be most self-similar when they are scaled by \(r_\mathrm {200c}\) (or any other critical overdensity radius with a reasonable threshold; Diemer and Kravtsov 2014). That is, density profiles of \({\varLambda }\hbox {CDM}\) halos in N-body simulations prefer different scaling radii in different regions of the density profile (Diemer and Kravtsov 2014). These empirical scalings were confirmed in cosmological hydrodynamical simulations of galaxy clusters (Lau et al. 2015, see also Shi 2016). However, the physical explanation for the self-similarity of the inner density profile when rescaled with \(r_\mathrm {200c}\) is less clear.
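Since the two scaling radii are defined with respect to different reference densities, converting between them requires a density profile and a cosmology. The Python sketch below estimates \(r_\mathrm {200m}/r_\mathrm {200c}\) for an NFW halo by solving \(\overline{\rho }(<r_\mathrm {200m})=200\,\varOmega _\mathrm {m}(z)\,\rho _\mathrm {c}(z)\) with bisection, using the NFW enclosed mass \(M(<r)\propto \ln (1+x)-x/(1+x)\) with \(x=r/r_s\). The assumption that the NFW form holds out to \(r_\mathrm {200m}\), the flat \({\varLambda }\hbox {CDM}\) cosmology with \(\varOmega _\mathrm {m}=0.3\), and the function names are all choices made for this illustration; the example input uses the stacked CLASH concentration and redshift quoted above.

```python
import math


def nfw_m(x):
    """Dimensionless NFW enclosed mass, m(x) = ln(1 + x) - x / (1 + x)."""
    return math.log(1.0 + x) - x / (1.0 + x)


def omega_m_z(z, omega_m0=0.3):
    """Matter density parameter at redshift z for an assumed flat LambdaCDM cosmology."""
    a3 = (1.0 + z) ** 3
    return omega_m0 * a3 / (omega_m0 * a3 + 1.0 - omega_m0)


def r200m_over_r200c(c200c, z, omega_m0=0.3):
    """Ratio r_200m / r_200c for an NFW halo, found by bisection in x = r / r_s."""
    # Mean enclosed density scales as m(x) / x^3; at r_200m it is Omega_m(z) times
    # its value at r_200c.
    target = omega_m_z(z, omega_m0) * nfw_m(c200c) / c200c ** 3
    lo, hi = c200c, 100.0 * c200c
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if nfw_m(mid) / mid ** 3 > target:
            lo = mid      # still too dense: r_200m lies farther out
        else:
            hi = mid
    return 0.5 * (lo + hi) / c200c


ratio = r200m_over_r200c(3.76, 0.34)
print(f"r200m / r200c = {ratio:.2f}")
```

The ratio exceeds unity because the mean matter density is lower than the critical density, and it approaches unity as \(\varOmega _\mathrm {m}(z)\rightarrow 1\) at high redshift.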
In Fig. 23, we show the projected total mass density (\(\varSigma \)) and enclosed mass (\(M_\mathrm {2D}\)) profiles for seven CLASH clusters derived from a detailed strong-lensing analysis of Caminha et al. (2019) based on extensive spectroscopic information, primarily from the Multi Unit Spectroscopic Explorer (MUSE) archival data complemented with CLASH-VLT redshift measurements (Biviano et al. 2013; Rosati et al. 2014). In the figure, the projected mass profiles of individual clusters are rescaled using \(M_\mathrm {200c}\) and \(r_\mathrm {200c}\) obtained from NFW fits to independent ground-based weak-lensing measurements (Umetsu et al. 2018). All clusters have a relatively large number of multiple-image constraints in the region \(10^{-2}\lesssim R/r_\mathrm {200c}\lesssim 10^{-1}\), where the shapes of the rescaled \(\varSigma (R)\) and \(M_\mathrm {2D}(<R)\) profiles are remarkably similar. Even MACS J0416 (Zitrin et al.