1 Introduction

The relative arrival times of multiply lensed sources can be used to measure an absolute distance in the Universe. The method, known today as time-delay cosmography, was proposed by Refsdal (1964) over half a century ago, prior to the discovery of the first extragalactic gravitational lens. Time-delay cosmography provides a one-step measurement of the Hubble constant (\(H_{0}\)), independent of the local distance ladder or probes anchored with sound horizon physics, such as the cosmic microwave background (CMB).

Almost a century after its first measurement, the Hubble constant \(H_{0}\) remains arguably the most debated number in cosmology. In the past few years, a tension has emerged between a number of local measurements and inferences from early-Universe probes such as the CMB and Big Bang Nucleosynthesis, under the assumption of a flat \(\Lambda \) cold dark matter (\(\Lambda \)CDM) cosmology (see, e.g., Verde et al. 2019; Shah et al. 2021; Abdalla et al. 2022, for recent summaries). If this tension is real, and not due to unknown systematic uncertainties in multiple measurements and their analyses, it implies that the standard \(\Lambda \)CDM model is not sufficient and that new physical ingredients beyond this model are required. From a theoretical standpoint, a number of possible solutions – for example, involving early dark energy – have been proposed (e.g., Knox and Millea 2020; Di Valentino et al. 2021; Schöneberg et al. 2022, and references therein), often requiring fine-tuning of free parameters so as not to violate other observational constraints. From an observational standpoint, besides improving the precision of the measurements, significant attention has turned to the investigation of unknown systematic uncertainties (e.g., Riess et al. 2019; Freedman et al. 2019, 2020; Riess et al. 2021; Mörtsell et al. 2022; Dainotti et al. 2021; Riess et al. 2023).

This article details the general methodology developed over the past decades in time-delay cosmography, discusses recent advances and results, and, foremost, provides a foundation and outlook for the next decade of ever more precise and accurate measurements enabled by increased sample sizes and improved observational techniques. We refer throughout this text to other articles of the same collection covering a wide range of strong lensing theory and applications (e.g., Saha et al. 2024; Shajib et al. 2024; Suyu et al. 2024; Vernardos et al. 2024; Lemon et al. 2024). We refer to, e.g., Treu and Marshall (2016) and Suyu et al. (2018) for more in-depth historical perspectives on the early years of the field, to Treu et al. (2022) and Treu and Shajib (2023) for a broader and less technical perspective on the opportunities of time-delay cosmography in this decade, and to Moresco et al. (2022) for a compact overview embedded within other cosmological probes.

We discuss primarily the methodology around lensed quasars as source objects for time-delay measurements, because quasars are currently the most established sources and have produced the most relevant results to date. The use of lensed supernovae for cosmological and astrophysical studies, including cosmographic results up to 2023, is reviewed in detail by Suyu et al. (2024) in the same series. For additional discussions of lensed supernovae, and particularly of other types of lensed transients such as gamma-ray bursts and fast radio bursts, we refer to Oguri (2019) and Liao et al. (2022).

This manuscript is organized as follows: In Sect. 2 we present the general concept and physics needed to turn relative time delays between multiple images of the same source into distance measurements. Section 3 provides an overview of the required analysis ingredients for individual lenses. Subsequent sections go into the details of these ingredients: the time-delay measurement (Sect. 4), the determination of the lensing potential (Sect. 5), and the study of the line of sight (LOS) of the lenses (Sect. 6). Section 7 describes the cosmographic inference and how to utilize a sample of lenses to perform an \(H_{0}\) inference. In Sect. 8 we present an overview of the cosmographic method applied to galaxy clusters as deflectors. We summarize the current status and results in Sect. 9. Lastly, in Sect. 10, we look to the future and discuss the potential and challenges lying ahead for the community.

2 Time Delays and the Time-Delay Distance

2.1 Lensing Formalism and the Fermat Potential

The deflection of light by mass over- or under-densities in the Universe leads to an angular displacement between the angle of the arriving photons, where we see the image, and the angle to the originating source in the absence of lensing.

The lens equation formally describes the lensing distortions

$$ \boldsymbol{\beta} = \boldsymbol{\theta} - \boldsymbol{\alpha}( \boldsymbol{\theta}), $$
(1)

where \(\boldsymbol{\beta}\) is the angular position of the source without the lensing effect, \(\boldsymbol{\theta}\) is the corresponding angular coordinate on the sky as seen when lensed, and \(\boldsymbol{\alpha}\) is the deflection angle that maps the image position to the source position in angular coordinates as seen from the observer. We refer to, e.g., Saha et al. (2024) for further details of the theory, including the fact that this formula is only valid for small angles.

There exists a scalar potential, the lensing potential \(\phi \), whose gradient corresponds to the deflection field

$$ \nabla \phi (\boldsymbol{\theta}) = \boldsymbol{\alpha}( \boldsymbol{\theta}). $$
(2)

The convergence \(\kappa \) is half the Laplacian of the potential \(\phi \)

$$ \kappa ({\boldsymbol{\theta }})={\frac {1}{2}}\nabla ^{2}\phi ({ \boldsymbol{\theta }}) $$
(3)

and is given in the thin lens approximation for small angles as

$$ \kappa ({\boldsymbol{\theta }})={ \frac {\Sigma ({\boldsymbol{\theta }})}{\Sigma _{\mathrm{crit}}}} $$
(4)

with \(\Sigma ({\boldsymbol{\theta }})\) the projected mass over- or under-density with respect to the mean background density and \(\Sigma _{\mathrm{crit}}\) the critical surface density

$$ \Sigma _{\mathrm{crit}}={ \frac {c^{2}D_{\mathrm{s}}}{4\pi G D_{\mathrm{ds}}D_{\mathrm{d}}}}, $$
(5)

where \(c\) is the speed of light and \(G\) the gravitational constant. \(D_{\mathrm{d}}\), \(D_{\mathrm{s}}\), and \(D_{\mathrm{ds}}\) are the angular diameter distances to the lens, to the source, and between the lens and the source, respectively.
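As an illustration, the critical surface density of Eq. (5) can be evaluated numerically. The following minimal sketch assumes a flat \(\Lambda \)CDM background and illustrative redshifts; the use of astropy and all numerical values are our own choices for demonstration.

```python
import numpy as np
import astropy.units as u
from astropy import constants as const
from astropy.cosmology import FlatLambdaCDM

# Illustrative background cosmology and redshifts (assumed values):
cosmo = FlatLambdaCDM(H0=70, Om0=0.3)
z_d, z_s = 0.5, 2.0

D_d = cosmo.angular_diameter_distance(z_d)
D_s = cosmo.angular_diameter_distance(z_s)
D_ds = cosmo.angular_diameter_distance_z1z2(z_d, z_s)

# Critical surface density of Eq. (5):
sigma_crit = (const.c**2 * D_s / (4 * np.pi * const.G * D_d * D_ds)).to(u.Msun / u.pc**2)
print(sigma_crit)  # of order 1e3 Msun / pc^2 for these redshifts
```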

When the luminosity of a strongly lensed background source, such as an active galactic nucleus (AGN), varies over time, the variability pattern manifests in each of the multiple images, delayed in time due to the different light paths of the images (see Fig. 1). The arrival-time difference \(\Delta t_{\mathrm{AB}}\) between two images \(\boldsymbol{\theta}_{\mathrm{A}}\) and \(\boldsymbol{\theta}_{\mathrm{B}}\) that originate from the same source \(\boldsymbol{\beta}\) is

$$ \Delta t_{\mathrm{AB}} = \frac{D_{\Delta t}}{c} \left [\tau ( \boldsymbol{\theta}_{\mathrm{A}}, \boldsymbol{\beta}) - \tau ( \boldsymbol{\theta}_{\mathrm{B}}, \boldsymbol{\beta}) \right ], $$
(6)

where

$$ \tau (\boldsymbol{\theta}, \boldsymbol{\beta}) \equiv \left [ \frac{\left (\boldsymbol{\theta} - \boldsymbol{\beta} \right )^{2}}{2} - \phi (\boldsymbol{\theta})\right ] $$
(7)

is the Fermat potential (Schneider 1985; Blandford and Narayan 1986), and

$$ D_{\Delta t} \equiv \left (1 + z_{\mathrm{d}}\right ) \frac{D_{\mathrm{d}}D_{\mathrm{s}}}{D_{\mathrm{ds}}} $$
(8)

is the time-delay distance (Refsdal 1964; Schneider et al. 1992; Suyu et al. 2010). The Fermat potential (Eq. (7)) consists of two terms: a geometric term reflecting the path-length difference, and a potential term capturing the difference in local spacetime dilation, known as the Shapiro delay. The expressions stated here (such as the Fermat potential) are only valid under the small-angle and thin-lens assumptions. We refer to Saha et al. (2024, this collection) for a discussion of these assumptions and the more general multi-plane formalism.

Fig. 1

Illustration of the light paths of the quadruply imaged lensed quasar HE0435-1223. The different light paths result in different arrival times. The relative time delays between the images are directly proportional to the overall physical distances from the observer to the lens and source. Measuring the time delays and reconstructing the lensing effect allow one to measure an absolute scale in the Universe. Graphics: Martin Millon; image from the Hubble Space Telescope (Wong et al. 2017)

2.2 Angular Diameter Distances and Cosmology

The angular diameter distance between two redshifts \(z_{1}\) and \(z_{2}\) in a Friedmann–Lemaître–Robertson–Walker (FLRW) metric is

$$ D(z_{1},z_{2}) = \frac{1}{1+z_{2}}f_{K}[\chi (z_{1},z_{2})] $$
(9)

where

$$ \chi (z_{1},z_{2})=\frac{c}{H_{0}} \int _{z_{1}}^{z_{2}} \frac{dz'}{E(z')} $$
(10)

is the comoving distance, with \(E(z) \equiv H(z)/H_{0}\) the dimensionless Friedmann equation, and

$$ f_{K}(\chi ) = \begin{cases} K^{-1/2} \sin \left (K^{1/2}\chi \right ) & \text{for } K>0 , \\ \chi & \text{for } K=0 , \\ (-K)^{-1/2}\sinh \left [(-K)^{1/2}\chi \right ] & \text{for } K< 0 \end{cases} $$
(11)

accounts for the spatial curvature \(K\) of the background metric.

In the \(\Lambda \)CDM cosmology with density parameters \(\Omega _{\mathrm{m}}\) for matter, \(\Omega _{k}\) for spatial curvature, and \(\Omega _{\Lambda}\) for dark energy described by the cosmological constant \(\Lambda \), the dimensionless Friedmann equation (\(E(z)\), Eq. (10)) is given by

$$ E(z) = \left (\Omega _{\mathrm{m}}(1+z)^{3} + \Omega _{k} (1+z)^{2} + \Omega _{\Lambda} \right )^{1/2} $$
(12)

and the spatial curvature is \(K=-\Omega _{k}H_{0}^{2}/c^{2}\).

Constraints on the Fermat potential difference \(\Delta \tau _{\mathrm{AB}}\), combined with a measured relative time delay \(\Delta t_{\mathrm{AB}}\) between two images of the same source, can be turned into constraints on the time-delay distance \(D_{\Delta t}\). \(D_{\Delta t}\) is an absolute quantity with units of distance that anchors the scale of the Universe within the lensing configuration. The Hubble constant, the local expansion rate of the cosmological background metric, sets the locally linear relationship between the relative recession velocities and physical separations of objects. For a fixed relative velocity or redshift, the Hubble constant is thus inversely proportional to the absolute physical distance to, or scale of, an object (see e.g., Eq. (10)) and scales with \(D_{\Delta t}\) as

$$ H_{0} \propto D_{\Delta t}^{-1}. $$
(13)

The inverse proportionality between the measurable quantity \(D_{\Delta t}\) and \(H_{0}\) makes \(H_{0}\) the primary cosmological parameter that time-delay cosmography can constrain. Time-delay cosmography thus primarily provides an absolute distance anchor and hence valuable information to shed light on the current tension in cosmology.
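To make this scaling concrete, the following minimal sketch (assuming the same astropy-based setup and illustrative redshifts as above) evaluates \(D_{\Delta t}\) of Eq. (8) for two values of \(H_{0}\); the ratio of the results reflects Eq. (13).

```python
import astropy.units as u
from astropy.cosmology import FlatLambdaCDM

z_d, z_s = 0.5, 2.0  # illustrative deflector and source redshifts

for H0 in (67.0, 73.0):
    cosmo = FlatLambdaCDM(H0=H0, Om0=0.3)
    D_d = cosmo.angular_diameter_distance(z_d)
    D_s = cosmo.angular_diameter_distance(z_s)
    D_ds = cosmo.angular_diameter_distance_z1z2(z_d, z_s)
    D_dt = (1 + z_d) * D_d * D_s / D_ds  # time-delay distance, Eq. (8)
    print(f"H0 = {H0:.0f}: D_dt = {D_dt.to(u.Mpc).value:.0f} Mpc")
# At fixed redshifts and density parameters, D_dt scales exactly as 1/H0 (Eq. (13)).
```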

There is a secondary, mild dependence of the Hubble constant inferred from time-delay cosmography on the relative expansion history from the current time (\(z=0\)) to the redshifts of the deflector and the source (the dependence on \(E(z)\) in Eq. (12)). This mild dependency on cosmological parameters beyond \(H_{0}\) can be compensated with other cosmological probes that are sensitive to the relative expansion history (such as SNIa luminosity distances, e.g., Taubenberger et al. 2019; Arendse et al. 2019; Liao et al. 2019, 2020), or with a large set of gravitational lenses at different lens and source redshifts (e.g., Li et al. 2024).

2.3 Observables and Degeneracies

The time delay between two images, \(\Delta t_{\mathrm{AB}}\), can be measured from light curves and is hence a direct observable (see Sect. 4). The relative Fermat potential \(\Delta \tau _{\mathrm{AB}}\), however, is not a direct observable. The primary observations used to infer \(\Delta \tau _{\mathrm{AB}}\) are positional constraints on the multiply imaged sources and the extended distortions of the lensed arcs. However, there are degeneracies inherent in gravitational lensing that limit the amount of information accessible through positional and distortion effects as observed in imaging data (e.g., Falco et al. 1985; Gorenstein et al. 1988; Kochanek 2002; Saha and Williams 2006; Schneider and Sluse 2013, 2014; Birrer et al. 2016; Unruh et al. 2017; Birrer 2021; see also Saha et al. 2024).

The mass-sheet degeneracy (MSD; Falco et al. 1985) is the most prominent lensing degeneracy impacting the prediction of the Fermat potential and hence time-delay cosmography. The MSD stems formally from a multiplicative transform of the lens equation (Eq. (1)) which preserves image positions under a linear source displacement \(\boldsymbol{\beta} \rightarrow \lambda \boldsymbol{\beta}\) combined with a transformation of the convergence field

$$ \kappa _{\lambda}(\boldsymbol{\theta}) = \lambda \kappa ( \boldsymbol{\theta}) + \left ( 1 - \lambda \right ). $$
(14)

Equation (14) above is known as the mass sheet transform (MST). It is a mathematical transformation in which the term \((1 - \lambda ) \equiv \kappa _{\mathrm{MST}}\) is equivalent to a uniform sheet of convergence (or mass) extending to infinite angular scales, hence the names mass sheet transform and mass sheet degeneracy. \(\kappa _{\mathrm{MST}}\) can be positive or negative, since it is defined relative to the average density of the Universe. The MST, by preserving image positions and being linear, also preserves all higher-order relative differentials of the lens equation. Absolute lensing quantities, such as the absolute magnification or the size of a structure, are however not preserved by the MST. Only observables related either to the unlensed apparent source size (\(\boldsymbol{\beta}\) vs. \(\lambda \boldsymbol{\beta}\)), such as the unlensed apparent brightness, or to the lensing potential are able to break the MSD. For example, the same relative lensing observables are predicted if the mass profile is scaled by a factor \(\lambda \), a sheet of convergence (or mass) \(\kappa _{\mathrm{MST}}(\boldsymbol{\theta}) = (1-\lambda )\) is added, and the source is re-sized by the factor \(\lambda \).

The Fermat potential difference between a pair of images A and B (Eq. (7)) scales with \(\lambda \) as

$$ \Delta \tau _{\mathrm{AB },\mathrm{ \lambda}} = \lambda \Delta \tau _{\mathrm{AB}}, $$
(15)

and so does the relative time delay as

$$ \Delta t_{\mathrm{AB },\mathrm{ \lambda}} = \lambda \Delta t_{\mathrm{AB}}. $$
(16)

When transforming a lens model with an MST, the inference of the time-delay distance (Eq. (8)) from a measured time delay and previously inferred Fermat potential transforms as

$$ D_{\Delta t , \lambda} = \lambda ^{-1}D_{\Delta t}. $$
(17)

In turn, the Hubble constant, when inferred from the time-delay distance \(D_{\Delta t}\), transforms as (from Eq. (13))

$$ H_{0 , \lambda} = \lambda H_{0}. $$
(18)
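The invariance of the image positions and the linear scaling of the Fermat potential under an MST can be verified numerically. Below is a toy one-dimensional check, assuming a singular isothermal sphere (SIS) deflector with illustrative parameter values; it is a sketch for intuition, not part of any analysis pipeline.

```python
import numpy as np

# Toy 1D check of the MST (Eqs. (14)-(18)) for a singular isothermal sphere:
# deflection alpha(t) = theta_E * sign(t), lensing potential phi(t) = theta_E * |t|.
theta_E, beta, lam = 1.0, 0.1, 0.9  # illustrative Einstein radius, source position, MST

alpha = lambda t: theta_E * np.sign(t)
phi = lambda t: theta_E * np.abs(t)

# MST-transformed deflection and potential (from Eq. (14)):
alpha_l = lambda t: lam * alpha(t) + (1 - lam) * t
phi_l = lambda t: lam * phi(t) + 0.5 * (1 - lam) * t**2

# SIS image positions for a source at beta:
images = np.array([beta + theta_E, beta - theta_E])

# The same positions solve the transformed lens equation for the source lam*beta:
print(images - alpha_l(images))  # -> [lam*beta, lam*beta]

# The Fermat potential difference (Eq. (7)) between the images scales by lam (Eq. (15)):
tau = lambda t, b, p: 0.5 * (t - b) ** 2 - p(t)
dtau = tau(images[0], beta, phi) - tau(images[1], beta, phi)
dtau_l = tau(images[0], lam * beta, phi_l) - tau(images[1], lam * beta, phi_l)
print(dtau_l / dtau)  # -> lam = 0.9
```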

An MSD effect relative to a specified deflector model might be associated with the mass distribution of the main deflector, referred to as the internal MSD with \(\lambda _{\mathrm{int}}\), or with inhomogeneities along the line of sight (LOS) of the strong lens system, referred to as the external MSD.

Mass over- or under-densities relative to the mean background density along the LOS of the strong lensing system cause, to first order, shear and convergence lensing perturbations (e.g., Dalal et al. 2005; Hilbert et al. 2007; Puchwein and Hilbert 2009). Reduced shear distortions do have a measurable imprint on the azimuthal structure of the strong lensing arcs (see e.g., Birrer 2021; Hogg et al. 2023). In contrast, the convergence component of the LOS, denoted \(\kappa _{\mathrm{ext}}\), describes the focusing or de-focusing of the light rays, is equivalent to an MST with \(\kappa _{\mathrm{ext}} \equiv (1 - \lambda )\), and is hence not directly measurable from imaging data.

Equivalently to describing the (de-)focusing along specific lines of sight by convergence terms, we can alter the specific angular diameter distances relative to the homogeneous background metric. In our notation, \(D^{\mathrm{lens}}\) is the angular diameter distance along a specific line of sight, including all LOS structure, while \(D^{\mathrm{bkg}}\) is the angular diameter distance of the homogeneous background metric without any perturbative contribution. The relations between \(D^{\mathrm{lens}}\) and \(D^{\mathrm{bkg}}\) are given by the convergence terms as

$$ \begin{aligned} D^{\mathrm{lens}}_{\mathrm{d}} &= (1 - \kappa _{\mathrm{d}})D_{\mathrm{d}}^{\mathrm{bkg}} \\ D^{\mathrm{lens}}_{\mathrm{s}} &= (1 - \kappa _{\mathrm{s}})D_{\mathrm{s}}^{\mathrm{bkg}} \\ D^{\mathrm{lens}}_{\mathrm{ds}} &= (1 - \kappa _{\mathrm{ds}})D_{\mathrm{ds}}^{\mathrm{bkg}}, \end{aligned} $$
(19)

where \(\kappa _{\mathrm{d}}\) is the external convergence from the observer to the deflector, \(\kappa _{\mathrm{s}}\) from the observer to the source, and \(\kappa _{\mathrm{ds}}\) from the deflector to the source, respectively (Birrer et al. 2020). The individual convergence terms can be calculated in the Born approximation along undeflected light paths independent of the strong lensing deflector.

The notation of perturbed angular diameter distances also allows us to directly calculate the impact of line-of-sight structure on the cosmographic inference, and in particular on the measurement of the Hubble constant. The time delay can be described as the product of the three different angular diameter distances entering \(D_{\Delta t}\) in Eq. (8) (Birrer et al. 2020; Fleury et al. 2021a), and hence the effective external convergence \(\kappa _{\mathrm{ext}}\) impacting the time delay and time-delay distance is

$$ 1 - \kappa _{\mathrm{ext}} = \frac{(1 - \kappa _{\mathrm{d}})(1 - \kappa _{\mathrm{s}})}{1 - \kappa _{\mathrm{ds}}}. $$
(20)

We note that, also directly visible from the equation above, the lensing efficiency (see Saha et al. 2024, this collection) impacting the linear distortions for both shear and \(\kappa _{\mathrm{ext}}\) is different from the standard weak lensing efficiency in the absence of a strong lensing deflector (McCully et al. 2014, 2017; Birrer et al. 2017, 2020; Fleury et al. 2021b).
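As a numerical illustration of Eq. (20), the following sketch combines hypothetical per-segment convergence values into the effective external convergence:

```python
# Hypothetical LOS convergences for the three distance segments of Eq. (19):
kappa_d, kappa_s, kappa_ds = 0.03, 0.05, 0.02

# Effective external convergence impacting the time delay (Eq. (20)):
kappa_ext = 1 - (1 - kappa_d) * (1 - kappa_s) / (1 - kappa_ds)
print(f"kappa_ext = {kappa_ext:.3f}")  # ~0.060

# If neglected, a positive kappa_ext biases the inferred D_dt low by the factor
# (1 - kappa_ext) and hence H0 high by 1/(1 - kappa_ext) (cf. Eqs. (17)-(18)).
```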

Uncertainties or biases related to the MSD may also arise in regards to assumptions made in the radial density profile of the main deflector galaxy (see e.g. Kochanek 2002; Saha and Williams 2006; Read et al. 2007; Schneider and Sluse 2013; Coles et al. 2014; Xu et al. 2016; Birrer et al. 2016; Unruh et al. 2017; Sonnenfeld 2018; Kochanek 2020; Blum et al. 2020; Birrer et al. 2020; Kochanek 2021).

We refer to the MSD attributed to uncertainties in the radial density profile as the internal MSD, and describe its effect with the MST parameter \(\lambda _{\mathrm{int}}\) relative to an assumed mass profile. We will further discuss this aspect in Sect. 5.3.

The total MST, the relevant transform to constrain for an accurate Fermat potential determination and \(H_{0}\) measurement, is the product of the internal and external MST (e.g., Schneider and Sluse 2013; Birrer et al. 2016, 2020)

$$ \lambda = (1-\kappa _{\mathrm{ext}}) \times \lambda _{\mathrm{int}}. $$
(21)

To summarize, the prediction of the time delay (Eq. (6)) can be generalized to

$$ \Delta t_{\mathrm{AB}} = (1-\kappa _{\mathrm{ext}}) \lambda _{\mathrm{int}} \frac{D_{\Delta t}}{c} \Delta \tau _{\mathrm{AB}}. $$
(22)

The existence of the MST and its generalizations imply that one has to rely either on (1) non-lensing information that can specifically constrain the shape of the radial mass density profile or (2) assumptions and priors about the functional form of the radial mass density profile that limit the degrees of freedom in the direction of the MST to constrain the lens model with a sufficient level of precision for time-delay cosmography.

The line-of-sight lensing contribution, \(\kappa _{\mathrm{ext}}\), can be estimated from tracers of the large-scale structure using galaxy number counts (e.g., Greene et al. 2013; Rusu et al. 2017) or weak-lensing measurements (Tihhonova et al. 2018, 2020). Galaxy number counts, paired with a cosmological model including a galaxy–halo connection, are able to constrain the probability distribution of \(\kappa _{\mathrm{ext}}\) to a few per cent per sight line. The main uncertainty in this approach arises from the uncertainty with which the luminous matter traces the dominant dark matter structure.

To break the total MSD \(\lambda \), we require observations that are sensitive to it. Stellar kinematics is the most prominent and commonly used such observable. The collective motion of stars is a direct tracer of the three-dimensional gravitational potential and hence provides an independent mass estimate. Joint lensing + dynamics constraints have been used to measure galaxy mass profiles (e.g., Grogin and Narayan 1996; Romanowsky and Kochanek 1999; Treu and Koopmans 2002; Koopmans 2004; Barnabè et al. 2011, 2012). The modelling of the kinematic observables in lensing galaxies ranges in complexity from spherical Jeans modeling (Binney and Tremaine 2008) to Schwarzschild (Schwarzschild 1979) methods.

The prediction of the LOS velocity dispersion \(\sigma _{\mathrm{v}}\) from any model, regardless of the approach, can be decomposed into a cosmology-dependent and cosmology-independent part, as (see e.g., Birrer et al. 2016, 2019)

$$ \sigma _{\mathrm{v}}^{2} = \frac{1-\kappa _{\mathrm{s}}}{1 - \kappa _{\mathrm{ds}}} \frac{D_{\mathrm{s}}}{D_{\mathrm{ds}}}c^{2} J(\boldsymbol{\xi}_{\mathrm{lens}}, \boldsymbol{\beta}_{\mathrm{ani}},\lambda _{\mathrm{int}}), $$
(23)

where \(J\) is a dimensionless quantity dependent on the deflector model parameters (\(\boldsymbol{\xi}_{\mathrm{lens}}\)), \(c\) is the speed of light, and \(\boldsymbol{\beta}_{\mathrm{ani}}\) is the stellar anisotropy distribution. The dimensionless factor \(J\) also incorporates the observational conditions and the luminosity weighting within the aperture of the dispersion measurement (e.g., Binney and Mamon 1982; Treu and Koopmans 2004; Suyu et al. 2010). The internal component \(\lambda _{\mathrm{int}}\) should be physically interpretable as a three-dimensional mass profile and incorporated into the kinematics modeling term \(J\), in particular when multiple aperture measurements are available (Teodori et al. 2022). In the approximate case of a very extended sheet-like perturbation, we can approximate

$$ J(\boldsymbol{\xi}_{\mathrm{lens}}, \boldsymbol{\beta}_{\mathrm{ani}},\lambda _{ \mathrm{int}}) \approx \lambda _{\mathrm{int}} J(\boldsymbol{\xi}_{\mathrm{lens}}, \boldsymbol{\beta}_{\mathrm{ani}}). $$
(24)

Combined lensing + dynamics constraints are sensitive to the combination of terms in Eq. (23). Only this combination, i.e., \(\lambda _{\mathrm{int}}\), \(D_{\mathrm{s}}\), \(D_{\mathrm{ds}}\), \(\boldsymbol{\beta}_{\mathrm{ani}}\), \(\kappa _{\mathrm{s}}\), and \(\kappa _{\mathrm{ds}}\), is constrained, and hence assumptions or priors on some of the terms are required to make a precise statement about the others. For example, when assuming the relative expansion history through the involved angular diameter distance ratio \(D_{\mathrm{s}}/D_{\mathrm{ds}}\), and the LOS contributions \(\kappa _{\mathrm{s}}\) and \(\kappa _{\mathrm{ds}}\), an inference on \(\lambda _{\mathrm{int}}\) is possible. Conversely, when assuming \(\lambda _{\mathrm{int}}\) and the convergence terms, an inference on the angular diameter distance ratio, and hence the relative expansion history, is possible.
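As an illustration of the first case, Eq. (23) with the approximation of Eq. (24) can be inverted for \(\lambda _{\mathrm{int}}\) once the distance ratio, the LOS convergences, and the dynamics-model factor \(J\) are assumed. All numbers below are hypothetical placeholders:

```python
c_kms = 299792.458  # speed of light [km/s]

# Hypothetical measurement and model inputs:
sigma_v = 280.0          # measured aperture velocity dispersion [km/s]
J_model = 4.8e-7         # J(xi_lens, beta_ani) from dynamics modeling (assumed)
Ds_over_Dds = 1.8        # distance ratio from an assumed background cosmology
kappa_s, kappa_ds = 0.02, 0.01  # assumed LOS convergences

# Invert Eq. (23) using Eq. (24):
# sigma_v^2 ≈ lam_int * (1 - k_s)/(1 - k_ds) * Ds/Dds * c^2 * J
lam_int = sigma_v**2 * (1 - kappa_ds) / ((1 - kappa_s) * Ds_over_Dds * c_kms**2 * J_model)
print(f"lambda_int ≈ {lam_int:.2f}")  # ~1.02 for these inputs
```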

When combining time delays with lensing + dynamics, the observations of the time delay and the kinematics need to be simultaneously described by Eq. (22) and Eq. (23) in addition to the imaging data. These two independent equations can be algebraically combined into two-dimensional angular diameter distance constraints (Birrer et al. 2016, 2019). A convenient basis for those constraints is

$$ D_{\Delta t}= \frac{1}{(1-\kappa _{\mathrm{ext}}) \lambda _{\mathrm{int}}} \frac{c \Delta t_{\mathrm{AB}}}{\Delta \tau _{\mathrm{AB}}} $$
(25)

and

$$ D_{\mathrm{d}} = \frac{1}{1 + z_{\mathrm{d}}}\frac{1}{1 - \kappa _{\mathrm{d}}} \frac{c \Delta t_{\mathrm{AB}}}{\Delta \tau _{\mathrm{AB}}} \frac{c^{2} J(\boldsymbol{\xi}_{\mathrm{lens}}, \boldsymbol{\beta}_{\mathrm{ani}}, \lambda _{\mathrm{int}})}{\lambda _{\mathrm{int}}\sigma ^{2}_{\mathrm{v}}}. $$
(26)

When mapped into the \(D_{\Delta t}\)–\(D_{\mathrm{d}}\) plane as outlined above, the constraint on \(D_{\mathrm{d}}\) is invariant under any pure external MSD parameter \(\kappa _{\mathrm{ext}}\) (Paraficz and Hjorth 2009; Jee et al. 2015; Birrer et al. 2019; Yıldırım et al. 2023). If the approximation of Eq. (24) holds, \(D_{\mathrm{d}}\) becomes independent even of \(\lambda _{\mathrm{int}}\), and is overall less susceptible to the internal MSD.
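For concreteness, the sketch below maps a hypothetical set of measurements (time delay, Fermat potential difference, velocity dispersion) into the \(D_{\Delta t}\)–\(D_{\mathrm{d}}\) basis via Eqs. (25)-(26); all input values are illustrative and chosen to be roughly self-consistent for \(z_{\mathrm{d}}=0.5\), \(z_{\mathrm{s}}=2\):

```python
import numpy as np

arcsec2rad = np.pi / (180.0 * 3600.0)
c_kms = 299792.458       # speed of light [km/s]
Mpc_km = 3.0857e19       # km per Mpc
day_s = 86400.0          # seconds per day

# Hypothetical measurements and model terms:
dt_AB = 14.0 * day_s                  # time delay [s]
dtau_AB = 0.21 * arcsec2rad**2        # Fermat potential difference [rad^2]
sigma_v = 260.0                       # velocity dispersion [km/s]
J = 6.0e-7                            # dynamics-model factor (assumed)
lam_int, kappa_ext, kappa_d = 1.0, 0.03, 0.02
z_d = 0.5

cdt_over_dtau = c_kms * dt_AB / dtau_AB / Mpc_km   # c * dt / dtau, in Mpc

D_dt = cdt_over_dtau / ((1 - kappa_ext) * lam_int)                 # Eq. (25)
D_d = cdt_over_dtau * c_kms**2 * J / (lam_int * sigma_v**2) \
      / ((1 + z_d) * (1 - kappa_d))                                # Eq. (26)
print(f"D_dt ≈ {D_dt:.0f} Mpc, D_d ≈ {D_d:.0f} Mpc")  # ~2450 and ~1290 Mpc
```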

3 Overview of Analysis Ingredients

To measure the cosmographic distances, in particular the time-delay distance \(D_{\Delta t}\) (Eq. (8)) or the more general \(D_{\Delta t}\)–\(D_{\mathrm{d}}\) combination (Eqs. (25)-(26)), from a strong lensing system with a time-variable source, the following data products are required:

1. discovery of a lens with a time-variable source that is multiply imaged,

2. spectroscopic redshifts of the source, \(z_{\mathrm{s}}\), and of the lens, \(z_{\mathrm{d}}\),

3. measured time delays between at least one pair of multiple images,

4. a lens mass model to determine the Fermat potential between the multiple images, constrained by imaging of sufficiently high resolution to resolve the positions of the multiply imaged quasars,

5. lens environment studies to constrain external lensing effects.

A complete analysis for an individual lensing system requires the coordination of multiple independent observations. The analysis can be severely limited in its precision and reliability due to a single missing ingredient. For example, without measurements of a time delay, no constraints on absolute distances involved can be achieved, and thus, regardless of the approach or external priors chosen, no direct constraints on the Hubble constant can be made.

The spectroscopic redshifts of the quasar sources, \(z_{\mathrm{s}}\), are often obtained using the frequent emission lines in quasars. The redshift of the lens, \(z_{\mathrm{d}}\), can be challenging to measure since massive elliptical galaxies lack prominent and sharp absorption or emission lines and the bright quasar images can outshine the lens galaxy. Measuring \(z_{\mathrm{d}}\) of a lensed quasar system often requires high signal-to-noise ratio spectra taken under good seeing conditions to deblend the lensing galaxy from the quasar. Technically, the redshifts involved in the lensing system are not directly required for the distance measurement. However, for the cosmological interpretation of the obtained distances, the redshifts are of crucial importance.

We describe in the next sections the remaining three ingredients: time delays (Sect. 4), the lensing potential on galaxy and cluster scales (Sect. 5), and line-of-sight perturbations (Sect. 6).

4 Measuring Time Delays

4.1 Monitoring of Lensed Quasars

Lensed quasars are variable on short timescales, making time-delay measurements possible, and are sufficiently bright to be observed at cosmological distances. They were hence quickly identified as excellent sources for time-delay cosmography. Lensed quasars are also currently much more common than lensed supernovae: around 300 lensed quasars have been discovered at the time of writing, compared to only four lensed supernovae (see Suyu et al. 2024, this collection). Lensed quasars are typically found in the redshift range \(z_{\mathrm{s}} \sim 1\text{-}3\), with massive early-type galaxies acting as the lenses located around redshift \(z_{\mathrm{d}}\sim 0.2\text{-}0.8\) (e.g., Lemon et al. 2023, 2024). This lensing configuration typically produces multiple images separated by a few arcseconds, which is sufficient to be resolved with small ground-based telescopes. Typical time delays in this configuration are of the order of days to a year. The monitoring of lensed quasars and the measurement of their individual brightness fluctuations are thus challenging but possible with 1-m or 2-m class telescopes, provided that regular and long-term access is guaranteed (see e.g. the COSMOGRAIL collaboration; Eigenbrod et al. 2005; Courbin et al. 2011; Tewes et al. 2013a). The relative error on the time delays, which directly propagates to \(H_{0}\), is the absolute error divided by the time delay itself. Therefore, long-delay lenses are more valuable for cosmography as they yield smaller \(H_{0}\) uncertainties from this component. The achievable precision of time-delay measurements is limited by several astrophysical, observational, and instrumental factors, listed below.

Photometric Accuracy:

In the optical, most quasars are variable on timescales of weeks to years, and the longest variations also have the largest amplitudes. This means that either long-duration light curves or high photometric accuracy is required to measure the delay reliably. In one visibility season, variations of the order of 0.2 mag are often observed, which requires a photometric accuracy of a few milli-magnitudes to identify the inflection points precisely. These inflection points are essential features in the light curves, since it is not possible to measure a time delay if the quasar does not display any variations, or if the first derivative remains constant. Reaching a photometric accuracy of a few milli-magnitudes is challenging as the quasar images are often blended with extended sources such as gravitational arcs or the lens galaxy. Consequently, the reconstruction of the Point Spread Function (PSF) and proper treatment of the contaminating light from these extended sources are usually the key to reducing the noise in the light curves.

Monitoring Cadence and Duration of the Monitoring:

A fast and precise temporal sampling of the light curve is necessary if one targets the fast, small-amplitude variations of the quasar. The monitoring cadence then needs to be commensurate with the timescale of the targeted variations. The total time span of the monitoring campaign also needs to be sufficient to cover the lensing time delays and to ensure that enough variations of the quasar are recorded in the multiple images at their relative delayed times. Obtaining light curves to such specifications requires continuous access to the telescope for at least one visibility season, which is typically 6 to 8 months.

Windowing Effects and Correlated Noise:

Seasonal gaps are often unavoidable in optical light curves since only circumpolar targets are observable all year long. The fact that data are missing every year can introduce windowing effects, which should be accounted for when using cross-correlation techniques to measure time delays. The missing data introduce a periodic signal that must be carefully removed or taken into account before attempting to measure the time delays. Additionally, great care should be taken in the presence of correlated noise, which is often present in the light curves due to uncertainties in the assignment of flux coming from the different quasar images. If no evident variations can be matched unambiguously in both light curves, it is unlikely that any statistical method will robustly measure a time delay.

Extrinsic Variability:

Extrinsic variations are often observed in the light curves. They are caused mainly by microlensing of the quasar images, and also by a variety of other astrophysical effects (see e.g. Schechter et al. 2003; Blackburne and Kochanek 2010; Dexter and Agol 2011; Sluse and Tewes 2014; Vernardos et al. 2024). Microlensing is caused by the stars in the lensing galaxy, which add extra time-variable “micro-magnification” on top of the static “macro-magnification” produced by the lensing galaxy. As described in Vernardos et al. (2024), the modulation of the micro-magnification due to the relative motion between the quasar, the lens, and the observer introduces extrinsic variations on top of the quasar intrinsic variations. For this reason, the light curves, even shifted in time and magnitude, rarely match perfectly. These extrinsic variations can severely bias time-delay measurements if not appropriately modelled and marginalized over.

In the past two decades, these difficulties have been progressively dealt with. Advances in photometric instrumentation in the late 1990s allowed the acquisition of accurate and well-sampled light curves, which yielded the first robust time-delay measurements from optical monitoring (e.g., Kundić et al. 1997a; Schechter et al. 1997; Burud et al. 2000, 2002; Hjorth et al. 2002; Colley et al. 2003; Kochanek et al. 2006) and from radio monitoring (Fassnacht et al. 1999, 2002; Biggs et al. 1999; Koopmans et al. 2000). Although some of these measurements already reached an excellent precision of a few percent, the majority had ∼10-15% errors, hence limiting the measurement of \(H_{0}\) to the same precision. These first encouraging results led to a systematic attempt to monitor a sample of lensed quasars in both hemispheres by the COSMOGRAIL program, which started in 2003 (Courbin et al. 2005). The observing strategy then was to follow a dozen lenses at bi-weekly cadence until the time delays could be measured to a few percent precision. This strategy yielded precise measurements for the brightest and most variable objects in about five years (Vuissoz et al. 2007, 2008; Courbin et al. 2011; Tewes et al. 2013b; Eulaers et al. 2013; Rathna Kumar et al. 2013) but required more than a decade of monitoring to obtain the time delays for most of the less variable and fainter targets (Millon et al. 2020a). An example of light curves acquired by the COSMOGRAIL program over the past 15 years is shown in Fig. 2. Thanks to this long-term observing effort and other monitoring campaigns (e.g. Poindexter et al. 2007; Goicoechea and Shalyapin 2016; Giannini et al. 2017; Shalyapin and Goicoechea 2019), about 40 lensed quasars now have known time delays, although with variable precision; the sample is becoming sufficiently large to vastly reduce the random uncertainties and to enable statistical studies of the time-delay lenses. In addition, three cluster-scale lensed quasars have measured time delays (Fohlmeister et al. 2007, 2008, 2013; Dahle et al. 2015; Muñoz et al. 2022), but their modelling is much more complex than that of galaxy-scale lenses (see Sect. 8.2 for details).

Fig. 2

R-band light curves of the lensed quasar HE0435-1223, obtained by the COSMOGRAIL program from the Euler 1.2 m Swiss Telescope, the 1.5 m telescope at the Maidanak Observatory, the Mercator 1.2 m telescope, and the SMARTS 1.3 m telescope. The bottom panels correspond to the differences between pairs of light curves, corrected for the time delays, highlighting the microlensing variability. Figure reproduced from Millon et al. (2020a)

As time-delay cosmography is now entering a new regime with an increasing number of lensed quasars being discovered every year, the time delays now need to be measured rapidly to turn these newly discovered systems into cosmological constraints. Courbin et al. (2018) demonstrated that it is possible to obtain accurate time-delay measurements in only one monitoring season thanks to the small-amplitude variations of the quasars, of the order of 10 to 50 millimag, that happen on timescales of weeks or months. These variations are faster than the microlensing variability, which varies on a typical timescale of several months or years. If detected at a sufficient signal-to-noise ratio (SNR), these features reduce the need for long light curves, as they allow us to disentangle the intrinsic and microlensing variability more easily. However, this change of strategy requires an almost daily cadence to sufficiently sample these small features in the light curves. Their amplitude is of the order of 10 mmag, which requires 2-m class telescopes to obtain a sufficient SNR in 30 minutes of exposure at magnitudes as faint as \(\sim 20\). This is illustrated in Fig. 3 in the case of the bright quadruple quasar WFI 2033-4723 (Bonvin et al. 2019). The technique enabled the measurement of six new time delays in one single season (Millon et al. 2020c), with more to follow.

Fig. 3

Comparison of two observing cadences. The two top panels show 14-year-long light curves of WFI 2033-4723, observed with a 4-day cadence at the 1.2 m Euler telescope at La Silla and at the SMARTS telescope at Las Campanas. During the season indicated with a red rectangle, the object was also observed daily with the MPIA 2.2 m at La Silla (bottom panel), unveiling exquisite small-scale structures that vary faster than the microlensing. Such observations allow time delays to be measured in a single season, with accuracy and precision similar to those of the lower-cadence data spanning 14 years (Bonvin et al. 2019; Millon et al. 2020a)

In the future, the Vera Rubin Observatory will obtain high-SNR multiband data for all southern lensed quasars, opening the possibility of building a sample of a few hundred lensed quasars with known time delays. The cadence will however be limited to one point every few days in each band, which might not be sufficient to obtain the time delays at a few percent precision for the most interesting targets. Complementary observations from a 2-m class telescope at a daily cadence might still be useful to refine the time-delay measurements of the most promising objects.

4.2 Time-Delay Measurements Techniques

Once well-sampled light curves have been acquired, the next step consists of identifying time-variable features that can be matched in the light curves of all the individual images. Shifting the light curves to match these features leads to a measurement of the time delays.

This step is significantly complicated by the presence of extrinsic variations due to microlensing on top of the quasar intrinsic variations. The signature of microlensing can be seen in most lensed quasar light curves. In some cases, it manifests itself as a sharp rise of the luminosity in one of the multiple images, which happens when the source approaches or crosses a micro-caustic. This probably happened, for example, in 2007 for image A of HE0435–1223. In Fig. 2, we show the light curves of HE0435–1223 as well as the differential curves between the images, highlighting the microlensing signal. Caustic-crossing events are not the only signature of microlensing visible in the data. Most quasar light curves also exhibit slow microlensing variations over several years. An example of this phenomenon is image B of HE0435–1223, which slowly rose by ∼ 0.5 mag between 2013 and 2018. This typically happens when the stellar density is high and the quasar is located in regions where the micro-caustics overlap. The net effect is a smooth variation of the microlensing magnification as the quasar moves through these crowded regions. The extrinsic variations introduced by microlensing contain valuable information about the quasar accretion disk structure but are a real source of nuisance when measuring time delays from optical monitoring. However, radio light curves are generally less influenced by microlensing than their optical counterparts. This is because the region from which radio waves are emitted is usually significantly larger than the micro-caustics, so the impact of microlensing tends to be averaged out and less noticeable. In the optical range, the accretion disk, responsible for most of the UV and optical emission, has a size comparable to the micro-caustics. Consequently, optical light curves are more affected by microlensing, leading to important extrinsic variations (for a detailed description of how microlensing impacts various emission regions differently, refer to Vernardos et al. 2024, this collection).

To deal with this issue, several curve-shifting algorithms have been proposed over the years, which can be classified into two categories. On one hand, some methods are based on light-curve cross-correlations (e.g. Pelt et al. 1996), sometimes without attempting to subtract the microlensing variability (e.g. the smoothing and cross-correlation method by Aghamousa and Shafieloo 2015). On the other hand, several techniques rely on the analytical modelling of the intrinsic variability of the quasars and/or the microlensing variations with, for example, splines (Tewes et al. 2013a) or Gaussian Processes (e.g. Hojjati et al. 2013). When explicitly modelled, the microlensing variations are removed from the light curves before attempting to find the optimal time delays. Due to the broad-band nature of the monitored signal, mixing flux arising from multiple emission regions, microlensing is rarely perfectly removed, but this has been shown to have in general a negligible impact on the delay (Sluse and Tewes 2014). One can also mention the recent work by Tak et al. (2016), Donnan et al. (2021) and Meyer et al. (2023), aiming to infer the time delays in a Bayesian framework, including an explicit modelling of the microlensing variations.
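To illustrate the basic principle behind these curve-shifting methods, consider the following minimal sketch on synthetic, unevenly sampled light curves. It deliberately omits microlensing modelling, windowing corrections, and error estimation, which production tools such as PyCS (Tewes et al. 2013a) handle; all values are synthetic.

```python
import numpy as np

# Minimal delay estimation: shift curve B over a grid of trial delays and
# minimize the mismatch with curve A, interpolated onto the shifted epochs.
rng = np.random.default_rng(1)

signal = lambda t: 0.10 * np.sin(2 * np.pi * t / 80.0) + 0.05 * np.sin(2 * np.pi * t / 23.0)
true_delay = 14.0                                    # [days]

t_a = np.sort(rng.uniform(0.0, 300.0, 120))          # epochs of image A [days]
t_b = np.sort(rng.uniform(0.0, 300.0, 120))          # epochs of image B [days]
mag_a = signal(t_a) + rng.normal(0.0, 0.005, t_a.size)
mag_b = signal(t_b - true_delay) + rng.normal(0.0, 0.005, t_b.size)

def mismatch(delay):
    # Compare B to A evaluated at the delay-corrected epochs:
    return np.mean((np.interp(t_b - delay, t_a, mag_a) - mag_b) ** 2)

grid = np.arange(-50.0, 50.0, 0.25)
best = grid[np.argmin([mismatch(d) for d in grid])]
print(f"estimated delay: {best:.2f} d (truth: {true_delay} d)")
```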

These methods were tested in the “Time Delay Challenge” (TDC; Dobler et al. 2015), a blind data challenge aiming at assessing the precision and accuracy of the curve-shifting algorithms on simulated but realistic data, including the microlensing variability. The results and conclusions of the challenge are presented in Liao et al. (2015) as well as in individual papers (Hojjati and Linder 2014; Bonvin et al. 2016). The problem is in fact more complicated than it sounds, since a large fraction of the participating teams did not meet the requirements in terms of precision and accuracy on the first and simplest rung of the challenge. Among the teams that qualified for the more advanced rungs of the TDC, the different proposed techniques showed overall good performance given the actual quality of the data. Several teams reached an accuracy of ≲1% on the most variable light curves. However, it remains to be checked whether this level of performance holds if more realistic accretion disk emission mechanisms and source-size effects are included in the simulations.

Among these source-size effects, the microlensing time delay, which is described in detail in Tie and Kochanek (2018), may be a more subtle manifestation of microlensing acting as a nuisance for time-delay measurements. Although it has never been directly detected in lensed quasar light curves so far, this effect may arise when different emission regions of the accretion disc are differentially magnified. A simple model to explain the UV and optical variability of quasars is the “lamp post” model (e.g. Cackett et al. 2007; Starkey et al. 2017), where the luminosity fluctuations originate close to the supermassive black hole and then illuminate the rest of the disk. This triggers temperature fluctuations in the disk, which result in delayed UV and optical emission due to the light travel time from the center. In the absence of differential magnification caused by microlensing, this time lag cancels out between the multiple images and only the “cosmological” time delay is observed. However, if one of the multiple images is affected by microlensing, the time lag originating from a particular region of the disk might be amplified and a net excess of microlensing time delay can add to the “cosmological” time delay. This effect could reach a few hours to a couple of days, which is negligible for most of the systems with long time delays, but it can significantly increase the uncertainties for systems with short time delays. This effect can however be mitigated with multi-band light curves (Chan et al. 2021; Liao 2020, 2021) or a proper Bayesian treatment of the effect as a source of nuisance (Chen et al. 2018).

Future developments of curve-shifting algorithms might also include time-delay measurements from unresolved light curves (e.g. Hirv et al. 2007; Shu et al. 2021; Biggio et al. 2021; Springer and Ofek 2021; Bag et al. 2023), which will open the possibility of monitoring small-separation (\(<1\)′′) lensed quasars. While precise delays from unresolved light curves have already been measured in the gamma-ray range (Barnacka et al. 2011; Cheung et al. 2014), they cannot be used for cosmography as the location of the gamma-ray emission with respect to the central AGN remains unknown. In the optical range, space-based large sky surveys, such as Euclid, are expected to discover thousands of small-separation systems, which will not be fully resolved from the ground with our current follow-up facilities. Turning this large sample of small-separation lensed quasars into a useful cosmological probe will require the development of these new techniques.

5 Determining Lensing Potential

Determining the lensing potential \(\phi \) is a crucial ingredient in time-delay cosmography as it directly enters the time-delay prediction through the Fermat potential (Eq. (7)). The Fermat potential is generally dominated by a massive elliptical galaxy acting as the main deflector, with contributions from intervening line-of-sight over- and under-densities. To achieve a precise and accurate cosmographic inference, both the line-of-sight structure and the mass distribution within the main deflector need to be known. One of the major limitations in a precise determination of the Fermat potential is the MSD, which imaging data alone cannot break. Thus, either physical assumptions based on what we know from other modelled galaxies, e.g. in the nearby universe, or external data, such as stellar kinematics, are required to constrain the Fermat potential. Measuring the Hubble constant and constraining galaxy density profiles are tightly connected, and most of the questions asked and techniques used in Shajib et al. (2024, this collection) are of the same relevance and applicability for time-delay cosmography.

We first discuss, for historical context, inferences from positional constraints alone in Sect. 5.1, and then inferences from imaging data in Sect. 5.2. We then discuss assumptions on mass profiles in Sect. 5.3 and the external information that provides the necessary constraints in Sect. 5.4.

5.1 Conjugate Point Analysis

Historically, the first lens models constructed for time-delay cosmography were based on the positional constraints of the quasar images alone, in combination with the time delays and, to some extent, the image magnifications. Valid models must satisfy the lens equation (Eq. (1)) such that all images map back to the same source position. In the case of a quadruply lensed system, the four image positions provide constraints on five relative distortion angles (plus the translation and rotation of the system). With such limited information, investigators had to assume a very simple form for the lens mass distribution, such as a singular isothermal sphere (Koopmans and Fassnacht 1999), to obtain meaningful constraints on the model. Different assumptions on the mass profiles lead to vastly different results (see e.g., Kochanek and Schechter 2004).

5.2 Inference from Imaging Data

High-resolution imaging of gravitational lenses, with constraints from hundreds to thousands of high signal-to-noise surface brightness pixels, makes it possible to measure accurate astrometry of the multiple images and to capture the detailed distorted images of the extended source structure. This information is crucial to capture the relative lensing deflection and to achieve a precise determination of the relative Fermat potential between the multiple images of the time-variable source. Modeling the imaging data at the pixel level has become the standard over the last two decades. In this section, we discuss the aspects of the mass distribution that imaging data can constrain, apart from the remaining degeneracies.

To derive constraints on the mass of the gravitational lens and its deflection field from imaging data, models of the imaging data with different deflection fields are compared to the data in a Bayesian way at the likelihood level of individual pixels. Besides a description of the deflection field, all light emission components have to be described, including the light emission of the source and of the deflector. All components that affect the imaging data need to be modeled and accounted for, in particular around the region impacted by lensing features. Required model components include, but are not limited to, the extended source component of the host galaxy of the time-variable source, the image positions of the time-variable source and its resulting approximately point-like flux emission, the surface brightness of the deflector galaxy, differential dust extinction caused by the deflector galaxy on the background source, and any other sources of surface brightness, such as satellite galaxies. In addition, instrument effects, such as the point spread function (PSF), instrumental noise and shot noise, and pixelization, as well as potential data reduction artifacts, need to be accurately taken into account in the modeling and comparison with the data.

The lensing effect distorting an extended surface brightness distribution \(S(\boldsymbol{\beta})\), such as that of the extended host galaxy, can be computed with a method known as ‘backwards ray-tracing’. Making use of the fact that surface brightness is conserved by lensing, the surface brightness at a position in the image plane, \(I(\boldsymbol{\theta})\), is given by the surface brightness in the source plane at the corresponding coordinate \(\boldsymbol{\beta}(\boldsymbol{\theta})\), i.e., \(I(\boldsymbol{\theta}) = S(\boldsymbol{\beta}(\boldsymbol{\theta})) \), where \(\boldsymbol{\beta}(\boldsymbol{\theta}) = \boldsymbol{\theta} - \boldsymbol{\alpha}(\boldsymbol{\theta})\) is the ‘ray-tracing’ term given by the lens equation (Eq. (1)).
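A minimal numerical sketch of backwards ray-tracing, assuming a singular isothermal sphere deflector and a Gaussian source (all parameter values illustrative):

```python
import numpy as np

theta_E = 1.2                                   # Einstein radius [arcsec], illustrative
x = np.linspace(-3.0, 3.0, 200)
tx, ty = np.meshgrid(x, x)                      # image-plane coordinates [arcsec]

r = np.hypot(tx, ty) + 1e-12                    # avoid division by zero at the center
ax, ay = theta_E * tx / r, theta_E * ty / r     # SIS deflection field alpha(theta)

bx, by = tx - ax, ty - ay                       # lens equation: source-plane coordinates

# Surface brightness is conserved: evaluate the source profile at (bx, by).
source = lambda X, Y: np.exp(-((X - 0.1) ** 2 + Y**2) / (2 * 0.1**2))
image = source(bx, by)                          # lensed image (a near-Einstein ring)
```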

For unresolved point-like images, backwards ray-tracing is numerically inefficient. To guarantee that the multiple images precisely map back to the same location in the source plane within the astrometric requirements for an accurate time-delay prediction, the lens equation has to be solved for the point source constraints to the required astrometric precision, or alternatively, solutions not satisfying the astrometric requirement (e.g., Birrer and Treu 2019) need to be discarded.

Given a lens model with parameters \(\boldsymbol{\xi}_{\mathrm{mass}}\) (which include all aspects of the deflection field, including line-of-sight structure and nearby perturbers) and a surface brightness model with parameters \(\boldsymbol{\xi}_{\mathrm{light}}\), a model of the imaging data, \(\boldsymbol{d}_{\mathrm{model}}\), can be constructed. The full process of simulating a modeled image can be cast as a consecutive application of operators as follows: starting with the surface brightness operator \(\mathcal{S}\), the lensing operator ℒ is applied to the source light, followed by a PSF convolution operation \(\mathcal{C}\), and finally an operator \(\mathcal{P}\) matching the pixel resolution of the data, formally an integral of the convolved surface brightness over the size of a pixel. With this notation and ⊙ denoting the consecutive application of operators from left to right, we can write the generation of the modeled data as

$$ \boldsymbol{d}_{\mathrm{model}} = \mathcal{P} \odot \mathcal{C} \odot \left [\mathcal{L}(\boldsymbol{\xi}_{\mathrm{lens}}) \odot \mathcal{S}_{ \mathrm{source}}(\boldsymbol{\xi}_{\mathrm{light}}) + \mathcal{S}_{\mathrm{lens}}( \boldsymbol{\xi}_{\mathrm{light}})\right ]. $$
(27)
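In code, this operator chain might look like the following sketch, where `ray_trace`, `source_sb`, `lens_sb`, and `psf` are hypothetical placeholders for the components described above, and block-averaging stands in for the pixel integration \(\mathcal{P}\):

```python
import numpy as np
from scipy.signal import fftconvolve

def forward_model(ray_trace, source_sb, lens_sb, psf, bin_factor=4):
    """Sketch of Eq. (27): d_model = P applied to C applied to [L(S_source) + S_lens]."""
    bx, by = ray_trace()                              # L: ray-traced source-plane coords
    surface = source_sb(bx, by) + lens_sb()           # lensed source + deflector light
    blurred = fftconvolve(surface, psf, mode="same")  # C: PSF convolution
    n = blurred.shape[0] // bin_factor                # P: block-average to data pixels
    trimmed = blurred[: n * bin_factor, : n * bin_factor]
    return trimmed.reshape(n, bin_factor, n, bin_factor).mean(axis=(1, 3))
```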

The Bayesian analysis to constrain the lens model is performed with the pixel-level likelihood of the imaging data. The likelihood is computed at the individual pixel level, accounting for the background noise properties, other noise sources such as read-out noise, and the Poisson contribution from the sources. In the Gaussian limit, the imaging likelihood is given by

$$ p(\mathcal{D}_{\mathrm{img}} \mid \boldsymbol{\xi}_{\mathrm{mass}}, \boldsymbol{\xi}_{\mathrm{light}}) = \frac{\exp \left [-\frac{1}{2}\left (\boldsymbol{d}_{\mathrm{data}} - \boldsymbol{d}_{\mathrm{model}}\right )^{\mathrm{T}} \boldsymbol{\Sigma}_{\mathrm{pixel}}^{-1}\left (\boldsymbol{d}_{\mathrm{data}} - \boldsymbol{d}_{\mathrm{model}}\right )\right ]}{\sqrt{(2 \pi )^{k} {\mathrm{det}}(\boldsymbol{\Sigma}_{\mathrm{pixel}})}}, $$
(28)

where \(k\) is the number of pixels used in the likelihood and \(\boldsymbol{\Sigma}_{\mathrm{pixel}}\) is the error covariance matrix. We note that, for the flux noise, the error covariance matrix \(\boldsymbol{\Sigma}_{\mathrm{pixel}}\) is a function of the brightness of the model \(\boldsymbol{d}_{\mathrm{model}}\) and hence not independent of the model prediction. Current analyses assume uncorrelated noise in the individual pixels, such that the covariance matrix becomes diagonal.
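A sketch of this Gaussian pixel-level log-likelihood (Eq. (28)) under the diagonal-covariance assumption, with the model-dependent shot-noise term made explicit, might read:

```python
import numpy as np

def log_likelihood(d_data, d_model, sigma_bkg2, exposure):
    # Diagonal pixel variance: background term plus a Gaussian-approximated
    # Poisson (shot) noise term, which depends on the model itself (see text).
    sigma2 = sigma_bkg2 + np.clip(d_model, 0.0, None) / exposure
    chi2 = np.sum((d_data - d_model) ** 2 / sigma2)
    norm = np.sum(np.log(2.0 * np.pi * sigma2))
    return -0.5 * (chi2 + norm)
```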

The primary target of an imaging analysis is to retrieve the lens model parameter posterior marginalized over the other model parameters, in particular the surface brightness and regularization parameters, as

$$ p(\boldsymbol{\xi}_{\mathrm{mass}} \mid \mathcal{D}_{\mathrm{img}}) \propto \int p( \mathcal{D}_{\mathrm{img}} \mid \boldsymbol{\xi}_{\mathrm{mass}}, \boldsymbol{\xi}_{\mathrm{light}}) p(\boldsymbol{\xi}_{\mathrm{mass}}, \boldsymbol{\xi}_{\mathrm{light}}) d \boldsymbol{\xi}_{\mathrm{light}}, $$
(29)

where \(p(\boldsymbol{\xi}_{\mathrm{mass}}, \boldsymbol{\xi}_{\mathrm{light}})\) denotes the prior on the lens and light model parameters.

To jointly marginalize over an unknown, yet possibly complex, source morphology, different techniques have been developed. Such techniques include regularized pixelated source reconstruction (e.g., Warren and Dye 2003; Treu and Koopmans 2004; Koopmans 2005; Suyu et al. 2006, 2009), sets of basis functions, such as shapelets (e.g., Birrer et al. 2015; Birrer and Amara 2018) or wavelets (Joseph et al. 2019; Galan et al. 2021), or simply parameterized surface brightness profiles, such as Sérsic profiles. The methods mentioned above have in common that their surface brightness amplitude components all create a linear response in the pixel values of the data. The optimization of the often numerous linear coefficients to provide the maximum likelihood of the data, given a proposed model for the other parameters (such as surface brightness shape parameters and lensing parameters), can be cast as a linear problem with a solution obtained by a weighted least-squares minimization. The Gaussian covariance matrix of the linear weighted least-squares minimization can be used to analytically marginalize over the prior of the linear coefficients (e.g., Suyu et al. 2006; Vegetti and Koopmans 2009; Birrer et al. 2015).
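For a diagonal noise covariance, this linear amplitude solve reduces to a standard weighted least-squares problem; a minimal sketch (with `A` a hypothetical response matrix of each surface brightness basis function on the data pixels):

```python
import numpy as np

def solve_linear_amplitudes(A, d_data, sigma2):
    # A: (n_pixels, n_components) linear response of each basis function;
    # sigma2: per-pixel variance (diagonal covariance assumed).
    W = 1.0 / sigma2
    M = A.T @ (A * W[:, None])          # normal-equation matrix A^T W A
    b = A.T @ (W * d_data)              # A^T W d
    amps = np.linalg.solve(M, b)        # maximum-likelihood linear coefficients
    cov = np.linalg.inv(M)              # Gaussian covariance used for marginalization
    return amps, cov
```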

The joint sampling of lens and light model parameters to infer the lens model posterior distribution is then a semi-linear process: while the amplitudes of the light model components can be solved for linearly, the remaining parameters, including those pertaining to the lens mass model and other shape-related surface brightness parameters, have to be sampled non-linearly.

Often it is not clear at the beginning of an investigation what level of model complexity is required to describe the data and to guarantee an accurate modeling. The current procedure is to start with a simple model and subsequently increase the complexity of the different model components until a satisfactory fit is achieved. Goodness-of-fit criteria currently in use include the Bayesian Information Criterion (BIC) (Birrer et al. 2019) and the Bayesian evidence (Shajib et al. 2020).

Imaging modeling is primarily performed on high-resolution space-based Hubble Space Telescope (HST) or ground-based adaptive optics (AO; Chen et al. 2016, 2019, 2021b) imaging. Figure 4 illustrates, as an example, the imaging data and models for two quadruply imaged lensed quasars, originally presented by Shajib et al. (2020) and Chen et al. (2019). Figure 5 presents the key lens model posteriors from the imaging modeling fit of the lens HE0435–1223 by Chen et al. (2019), for both HST and AO imaging, for a power-law elliptical mass distribution with an external shear contribution. To enhance the signal in the data set, and to distinguish deflector and source light components, the lens modeling is often performed simultaneously on multiple filters, combined at the likelihood level. With multiple filters, a better differentiation between source and deflector light can be drawn, with the deflector often being bright in the infrared channels and the lensed source often dominant in the optical and ultraviolet channels. The infrared channels are often brighter, and hence contain more signal, while the optical channels can resolve smaller angular scales with more prominent source morphologies. The modeling across bands is performed with an identical lens model, while the surface brightness solutions are flexible to change (i.e., independent linear optimization or even different surface brightness components). Very precise relative astrometric solutions, of order \(\sim 1\) mas across bands, are required for the modeling. Current modeling fits the relative astrometric solutions across bands as part of the modeling process (e.g., Shajib et al. 2020).

Fig. 4

Illustration of imaging modeling for two lenses. From left to right: Imaging data, the reconstructed model, the reduced residuals \((\boldsymbol{d}_{\mathrm{data}}-\boldsymbol{d}_{\mathrm{model}})/\sigma \), reconstructed source. Top row: HST data and model for DES J0408–5354 in three bands, from Shajib et al. (2020), with shapelet and parameterized source reconstruction using the modeling software lenstronomy. Bottom row: Keck adaptive optics imaging and modeling of HE0435–1223, from Chen et al. (2019), with pixelated source reconstruction using the modeling software GLEE

Fig. 5

Key lens model parameter posteriors from the fit to imaging data (HST and Keck adaptive optics). \(q\) is the semi-minor to semi-major axis ratio of the projected mass density profile, \(\theta _{\mathrm{E}}\) is the Einstein radius, \(\gamma \) is the three-dimensional radial power-law density slope, \(\gamma _{\mathrm{ext}}\) is the external shear strength, and \(\theta _{\mathrm{ext}}\) is the shear angle. Figure adapted from Chen et al. (2019)

The PSF needs to be characterized very accurately, both to provide high astrometric precision for the images of the time-variable sources (Birrer and Treu 2019; Chen et al. 2021a) and for the detailed modeling of the extended source structure without spurious signal from the bright quasar images. Current methods to obtain a precise PSF model involve an iterative procedure during the model fitting process to extract improved constraints on the PSF from the data itself (e.g., Chen et al. 2016; Birrer et al. 2016).
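Schematically, such an iterative PSF correction alternates between fitting the extended light at fixed PSF and re-estimating the PSF from the residuals at the quasar image positions. The sketch below is purely conceptual; the two helper functions are hypothetical stand-ins, not functions of any specific modeling package:

```python
import numpy as np

def iterative_psf(data, psf_init, fit_extended_model, extract_point_cutouts,
                  n_iter=5):
    """Conceptual sketch: alternate between a lens-model fit at fixed PSF and
    a PSF update stacked from the point-source residuals.
    `fit_extended_model` and `extract_point_cutouts` are hypothetical."""
    psf = psf_init
    for _ in range(n_iter):
        extended = fit_extended_model(data, psf)    # deflector + arcs model
        residual = data - extended                  # dominated by quasar images
        cutouts = extract_point_cutouts(residual)   # aligned stamps per image
        psf = np.median(cutouts, axis=0)            # robust stack
        psf /= psf.sum()                            # keep the PSF normalized
    return psf
```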

The lens model and Fermat potential posteriors are marginalized over a set of systematic effects, including variations in the choice of source reconstruction and other modeling choices.

5.3 Mass Profile Assumptions of the Main Deflector

Resolved multiply imaged structure is an exquisite data product that constrains the relative deflection field (see Sect. 5.2). Imaging data alone, absent knowledge of the unlensed apparent source size or brightness, cannot break the MST or its generalization, the source position transformation (SPT; Schneider and Sluse 2014). The quantity that is invariant under the MST in the radial direction, and hence can be constrained by imaging data, is (Kochanek 2002; Sonnenfeld 2018; Kochanek 2020; Birrer 2021)

$$ \xi _{\mathrm{rad}} \equiv \frac{\theta _{\mathrm{E}} \alpha _{\mathrm{E}}^{\prime \prime}}{1-\alpha _{\mathrm{E}}^{\prime}} \propto \frac{\theta _{\mathrm{E}} \alpha ^{\prime \prime}_{\mathrm{E}} }{1 - \kappa _{\mathrm{E}}}, $$
(30)

where \(\alpha ^{\prime}_{\mathrm{E}}\) and \(\alpha ^{\prime \prime}_{\mathrm{E}}\) are the first and second derivatives of the deflection angle at the Einstein radius \(\theta _{\mathrm{E}}\), respectively, and \(\kappa _{\mathrm{E}}\) is the convergence at \(\theta _{\mathrm{E}}\). For azimuthal invariances and observable lensing features, we refer to Birrer (2021) and references therein. Constraining a global mass density profile from imaging data alone requires assumptions on the radial profile. For example, under the assumption that the mass density profile follows a single power law, the power-law slope \(\gamma _{\mathrm{pl}}\) has a direct correspondence to \(\xi _{\mathrm{rad}}\) (Eqn. (30)), \(\xi _{\mathrm{rad}} = \gamma _{\mathrm{pl}} - 2\), and a precise (few percent uncertainty) inference on the Fermat potential is possible. Relaxing the assumption on the shape of the density profile significantly widens the constraints. When allowing for a full mass-sheet degree of freedom, the Fermat potential is fully degenerate (i.e., \(\xi _{\mathrm{rad}}\) remains unchanged by an MST). A pure mass sheet is unphysical, as no localized three-dimensional density profile can describe it. However, approximate parameterizations can be found that can be expressed as a three-dimensional density profile and are indistinguishable based on imaging data (Schneider and Sluse 2013; Blum et al. 2020; Birrer et al. 2020; Yıldırım et al. 2023). For example, Blum et al. (2020) introduced a cored density profile with the three-dimensional density distribution

$$ \rho _{c}(r) = \frac{2}{\pi} \Sigma _{c} R_{c}^{3} \left (R_{c}^{2} + r^{2} \right )^{-2} $$
(31)

with the convergence profile as

$$ \kappa _{c}(\theta ) = \left (1 + \frac{\theta ^{2}}{\theta _{c}^{2}} \right )^{-3/2}. $$
(32)

The approximate MST can then be written as

$$ \kappa _{\lambda _{c}}(\theta ) = \lambda _{c} \kappa (\theta ) + (1- \lambda _{c}) \kappa _{c}(\theta ). $$
(33)
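A minimal numerical sketch of this transform (Eqs. (32)-(33)), applied here to an isothermal profile purely for illustration:

```python
import numpy as np

def kappa_cored(theta, theta_c):
    """Cored convergence profile of Eq. (32), normalized to 1 at the center."""
    return (1.0 + (theta / theta_c) ** 2) ** (-1.5)

def kappa_approx_mst(kappa, theta, lam_c, theta_c):
    """Approximate mass-sheet transform of Eq. (33)."""
    return lam_c * kappa + (1.0 - lam_c) * kappa_cored(theta, theta_c)

theta = np.linspace(0.5, 3.0, 6)     # arcsec, around theta_E = 1"
kappa_iso = 1.0 / (2.0 * theta)      # isothermal profile, theta_E = 1"
# for theta << theta_c the transform acts like a pure MST with lambda_c = 0.9
print(kappa_approx_mst(kappa_iso, theta, lam_c=0.9, theta_c=20.0))
```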

Figure 6 illustrates an example of how an approximate MST can be physically interpreted when applied to a composite profile described with a stellar and a dark matter component.

Fig. 6

Illustration of a composite profile consisting of a stellar component (Hernquist profile, dotted lines) and a dark matter component (NFW profile plus a cored component, with \(\lambda _{c}\) acting as an approximate MST (from Blum et al. 2020, Eqns. (32), (33)); dashed lines), which together transform according to an approximate MST (joint profile as solid lines). The stellar component is rescaled by the MST while the cored component transforms the dark matter component. Physically, the profiles of each color differ by a 10% change in the mass-to-light ratio combined with a slightly more extended or contracted dark matter profile, also at the 10% level. Left: profile components in three dimensions. Right: profile components in projection. Each profile results in a 10% difference in the predicted time delay, and hence in the \(H_{0}\) inference. The transforms presented here cannot be distinguished by imaging data alone and require, e.g., stellar kinematics constraints. Figure from Birrer et al. (2020)

Deviations in the mass density profile that do not directly mimic an MST are also possible. Any radial mass profile that satisfies the same constraints on \(\xi _{\mathrm{rad}}\) (Eqn. (30)) provides an equally good fit to the data. Azimuthal assumptions on the mass density profile also matter in the interpretation of the radial components (Birrer 2021; Kochanek 2021), and assumptions on diskiness and boxiness, ellipticity gradients, and isodensity twists of the density profile may also impact the Fermat potential differences (Van de Vyvere et al. 2022b,a; Gomer and Williams 2020, 2021).

From a physics point of view, the matter distribution of the main deflector is made of stellar mass, gas, and dark matter, with the stellar mass dominating the innermost parts. The dark matter fraction within the Einstein radius is about ∼10–60% (e.g., Auger et al. 2010; Ferreras et al. 2005). Invisible substructure in the lens and along the LOS can also perturb the Fermat potential (e.g., Oguri 2007; Keeton and Moustakas 2009). Gilman et al. (2020) showed that omitting dark substructure does not bias inferences of \(H_{0}\); however, perturbations from substructure contribute an additional source of random uncertainty in the inferred value of \(H_{0}\), ranging from 0.7–2.4% depending on the redshift and image configuration. We also highlight that the lensing mass and convergence only account for the mass over-density relative to the cosmological background density.

Different approaches, with different assumptions, have been taken in the literature to describe the deflector lensing potential. Among the assumptions in use are single power-law mass profiles, composite models involving a mass-follows-light component with a separate component describing the dark matter profile, free-form pixelated mass profiles (e.g., Saha and Williams 2004; Coles et al. 2014; Denzel et al. 2021), pixelated lensing potential corrections (e.g., Suyu et al. 2009), or an explicit internal mass profile MST component (Blum et al. 2020; Birrer et al. 2020).

There are multiple considerations in the specific choice for an investigation. On one side, there are physical considerations: What basic assumptions in the modeling are justified? What priors should be chosen in the Bayesian modeling? Then there are also practical considerations: What aspects of the model can be constrained by the data? Is it feasible to perform a posterior inference in a finite amount of time with the given resources?

Among the simplest models employed is the single power-law profile. It has a one-to-one relation to the radial quantity described in Eq. (30) and breaks the MST. A power-law elliptical mass profile is an efficient parameterization to describe the first-order radial and azimuthal features in strong lensing imaging data. Composite models relate to certain physical assumptions of mass following light and assert a stiffness in the profile that implicitly also breaks the MST.
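This one-to-one correspondence can be verified symbolically; a short sympy check, assuming the standard power-law deflection \(\alpha (\theta ) = \theta _{\mathrm{E}}^{\gamma -1}\theta ^{2-\gamma}\):

```python
import sympy as sp

theta, theta_E, gamma = sp.symbols('theta theta_E gamma', positive=True)
alpha = theta_E**(gamma - 1) * theta**(2 - gamma)   # power-law deflection
xi_rad = theta_E * sp.diff(alpha, theta, 2) / (1 - sp.diff(alpha, theta))
print(sp.simplify(xi_rad.subs(theta, theta_E)))     # -> gamma - 2
```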

An explicit parameterization of the MST in the model avoids relying on priors or assumptions to break the MST, and is thus maximally agnostic to the MST with a minimal number of added parameter degrees of freedom.

On the high-complexity end of lens models are free-form methods, such as pixelated mass distributions (Saha and Williams 2004; Coles et al. 2014). Free-form models come with very few restrictions on the lens mass distribution and offer a different modeling strategy compared to the simply parameterized approaches. The ensemble of models allowed by the data can be interpreted as the model posterior distribution, with the regularization scheme, which proposes models in the absence of data constraints, acting as the prior.

Increased flexibility in the parameterization better guarantees that the underlying truth in the mass distribution, and in particular the prediction of the Fermat potential entering the time delays, can be represented by the model. On the other hand, increased flexibility in the model at fixed data constraining power increases the uncertainty in the posterior-predictive model. Less constraining posteriors inevitably put more weight and reliance on the applied priors, whether they are explicit in a parameterized form, or implicit within an over-parameterized, free-form approach. No matter what choices are made in the modeling of lenses, mitigating the dependence on the explicit or implicit priors becomes important when combining a set of multiple lenses, as we will discuss in Sect. 7.2.

We discuss additional data sets that can constrain the lens mass profile in Sect. 5.4.

5.4 Non-lensing Observables

The primary observation currently used to break the MSD is stellar kinematics of the deflector galaxy (Treu and Koopmans 2002; Koopmans et al. 2003; Koopmans 2004). The kinematics of stars, in particular their velocity dispersion, is a direct and lensing-independent tracer of the three-dimensional gravitational potential. The kinematic measurement is performed by targeting stellar absorption lines and measuring their width with spectrographs. Figure 7 shows a Keck/LRIS spectrum of HE0435–1223.

Fig. 7

Top: Keck/LRIS spectrum of HE0435–1223 with the best-fitting model overplotted in red and a polynomial continuum, which accounts for contamination from the lensed QSO images and template mismatch, shown in green. The measurement results in an integrated velocity dispersion \(\sigma _{\mathrm{v}} = 222\pm 15\text{ km}\text{ s}^{-1}\), including systematic uncertainties due to the templates used, the region of the spectrum that was fitted, and the order of the polynomial continuum. The grey vertical band represents a wavelength range that is excluded from the fit due to the presence of a strong Mg II absorption system. Bottom: Residuals from the best fit. Figure adapted from Wong et al. (2017)

The line-of-sight stellar velocity dispersion is an integrated quantity of the radial and tangential components of the velocity dispersion projected along the line of sight. The orbital anisotropy, i.e., the ratio of the radial and tangential velocity dispersion components, is unknown a priori and thus introduces a degeneracy in the predicted line-of-sight velocity dispersion corresponding to the same 3D mass profile. This degeneracy is known as the mass–anisotropy degeneracy. Typically, a prior on the anisotropy profile, e.g., isotropic or Osipkov–Merritt (Osipkov 1979; Merritt 1985a,b), is assumed. The Osipkov–Merritt profile allows the anisotropy to be isotropic near the center and gradually more radial farther away from the center, which is motivated by the observed properties of the stellar orbits in local elliptical galaxies. The isotropic profile is thus a special case of the Osipkov–Merritt profile. The transition scale radius \(r_{\mathrm{ani}}\) from isotropic to radial orbits in the Osipkov–Merritt profile is a priori unknown, which directly manifests in the mass–anisotropy degeneracy. Thus, the prior on \(r_{\mathrm{ani}}\) has a significant impact on the kinematics prediction (e.g., Shajib et al. 2018; Birrer et al. 2020). We note that there are many other forms of the radial anisotropy distribution, and the specific choice of functional model, as well as the priors adopted, might impact the results. One way to mitigate this degeneracy is to obtain spatially resolved measurements of the velocity dispersion instead of an unresolved (or integrated) velocity dispersion (Shajib et al. 2018; Yıldırım et al. 2020, 2023; Birrer and Treu 2021).
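For reference, a minimal sketch of the Osipkov–Merritt anisotropy profile in its standard form, \(\beta _{\mathrm{ani}}(r) = r^{2}/(r^{2} + r_{\mathrm{ani}}^{2})\) (parameter names are ours), illustrating how the prior on \(r_{\mathrm{ani}}\) shifts the predicted anisotropy:

```python
import numpy as np

def beta_ani_osipkov_merritt(r, r_ani):
    """Osipkov-Merritt anisotropy: isotropic (beta = 0) at the center,
    radially anisotropic (beta -> 1) well outside the scale radius r_ani."""
    return r**2 / (r**2 + r_ani**2)

r = np.logspace(-1, 1, 5)        # radii in units of the effective radius
for r_a in (0.5, 1.0, 5.0):      # the prior on r_ani directly shifts beta(r)
    print(r_a, np.round(beta_ani_osipkov_merritt(r, r_a), 3))
```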

Other proposed observations and analysis methods that can break the MST and provide the necessary constraints on the mass density profile are standardizable magnifications (e.g., Kolatt and Bartelmann 1998; Oguri and Kawano 2003; Foxley-Marrable et al. 2018; Birrer et al. 2022), lens population statistics of the appearances and asymmetries of the multiple images (e.g., Sonnenfeld and Cautun 2021; Sonnenfeld 2021), and galaxy-galaxy weak gravitational lensing (Khadka et al. 2024).

We emphasize that these non-lensing observations are primarily sensitive to the total MST, the combination of the LOS and internal mass density profile degeneracies (Eq. (20)). Decoupling the different projected effects in the lensing potential is not necessary for an accurate cosmographic inference, since the time-delay prediction only requires the combined product. However, when combining different lenses with potentially different selections, the priors and assumptions imposed on either of the two components impacting the MST can become important.

6 Estimating Line-of-Sight Contributions

Strong lensing requires a high projected mass density. Strong lenses are hence biased toward more massive galaxies, which are in turn biased toward overdense environments. The contribution of the mass density fluctuations along the line of sight to the lensed source is generally of order a few percent, and commonly lower than 10% of the total convergence of the lens. While this may appear small, it is not negligible when it comes to estimating the Hubble parameter to percent accuracy. A constant effective contribution of a few percent caused by the line of sight is equivalent to an external mass sheet \(\kappa _{\mathrm{ext}}\) (Sect. 2).

The exact impact of the line-of-sight objects depends on whether the dominant-lens approximation is valid, in which case the critical density of the line-of-sight objects is very small compared to that of the main deflector, and on whether the tidal regime is applicable, which happens when the perturber's gravitational field varies only slowly over the extent of the main lens deflection \(\alpha (\theta )\) (e.g., McCully et al. 2014; Birrer et al. 2017; Fleury et al. 2021b). When one of these approximations is invalid, an explicit treatment is needed, potentially requiring one to solve the multi-plane lens equation (see e.g., McCully et al. 2014; Wong et al. 2020; Shajib et al. 2020; Li et al. 2021). We refer to Saha et al. (2024) for a detailed discussion of the multi-plane gravitational lensing formalism. Conversely, when line-of-sight objects can be treated as small perturbations that only introduce a convergence that is constant over the extent of the lensed system, a statistical treatment is sufficient. In practice, a hybrid scheme needs to be followed most of the time: explicitly modelling those objects that modify the Fermat potential differently for each lensed image, and calculating the statistical contribution of the other objects that shift the Fermat potential at linear order.

From an information perspective, there is only limited direct data available on the total matter distribution in the Universe at the scales relevant to constrain \(\kappa _{\mathrm{ext}}\). Hence, any method relies on some assumptions on how mass traces light. These assumptions are well motivated by large-scale structure probes, but are only validated in a statistical way.

The following subsections present the various methods that have been considered to estimate \(\kappa _{\mathrm{ext}}\): Sect. 6.1 presents direct modeling, Sect. 6.2 galaxy number count statistics, Sect. 6.3 weak lensing measurements, and Sect. 6.4 a hybrid approach.

6.1 Direct Modeling

The most direct approach is to collect the positions, redshifts, stellar masses, and potentially even velocity dispersion measurements of the galaxies located in the field of view towards the lens, and to explicitly model the matter distribution of all relevant objects. A complete direct reconstruction is near-impossible. A simple approach that has been developed to estimate \(\kappa _{\mathrm{ext}}\) consists of identifying which galaxies form massive galaxy groups that contribute the most significant impact along the line of sight (e.g., Fassnacht and Lubin 2002; Momcheva et al. 2006; Fassnacht et al. 2006; Sluse et al. 2017). An estimate of \(\kappa _{\mathrm{ext}}\) may then be derived by fitting analytical mass density profiles to those groups (Auger et al. 2007; Wilson et al. 2017). This approach generally yields estimates of \(\kappa _{\mathrm{ext}}\) that are uncertain to a factor of 2-4, depending on the specific assumptions one may reasonably make on the group mass density (e.g., a halo associated with each individual galaxy, or a common halo for the whole system), but sometimes also of low precision due to the uncertainties associated with the group identification (fields of view are never complete, and group finders have their own biases). To overcome this problem, Collett et al. (2013) proposed a simple halo-model prescription to reproduce the mass along the line of sight from a photometric catalogue of galaxies. The convergence \(\kappa _{\mathrm{h}}\) from each halo is then calibrated against the \(\kappa _{\mathrm{{ext}}}\) derived from ray-tracing through numerical simulations. This method does not account explicitly for the convergence due to dark structures and the divergence due to voids, but those effects are included statistically owing to the calibration of \(\kappa _{\mathrm{h}}\) against \(\kappa _{\mathrm{{ext}}}\).

6.2 Number Counts

An alternative to the direct modeling of the line of sight consists of measuring the galaxy number density in the vicinity of the lens, describing it with a summary statistic, and comparing it to reference fields. This procedure determines whether the LOS is over- or under-dense compared to the average (Suyu et al. 2010; Fassnacht et al. 2011; Greene et al. 2013; Rusu et al. 2017; Wells et al. 2023). First, a detailed characterization of the line of sight towards the lens is required, using deep imaging data complemented by spectroscopic data (see Fig. 8 for an illustration). Similar LOS are then searched for in large-volume, high-resolution numerical simulations. With the surface mass density of matter along these lines of sight calculated using a ray-tracing technique (Hilbert et al. 2007, 2008, 2009), it is then possible to derive a probability distribution of the external convergence compatible with the observed lens, matching the summary statistics.

Fig. 8

Example of the result of the spectroscopic characterization of the field of view of the gravitational lens HE0435–1223. The redshifts of the objects measured in the field of view are used to identify which objects (or galaxy groups) need to be explicitly included in the macro-model. The number counts analysis uses the galaxies located at projected distances of <45′′ and <120′′ from the lens to estimate \(P(\kappa _{\mathrm{ext}})\). Courtesy of Sluse et al. (2017)

In practice, summary statistics that deviate from pure number counts can be better-informed statistics of the underlying over- or under-density. For example, if \(N_{\mathrm{{gal}}}\) galaxies are observed in the field of view of a lens, one can calculate a weighted number count \(W_{q} = \sum _{i=1}^{N_{\mathrm{{gal}}}} q_{i}\), with \(q\) being a particular type of weight, such as the inverse of the distance to the lens, i.e., \(1/r\). To calculate a weighted density of galaxies, \(\zeta _{q}\), it is necessary to perform the same measurement over an ensemble of control fields, such that for each control field \(({\mathrm{CF}},\,j)\) one derives a density \(\zeta _{q}^{j} = W_{q} / W_{q}^{{\mathrm{CF}},\,j}\). To avoid introducing any bias through this normalisation procedure, it is important to choose enough control fields, but also to ensure that those fields have characteristics closely matching those of the imaging data of the observed lens system. This allows one to account for sample variance and to verify that galaxy detection biases are similar for the lens and for the control fields. In particular, one should ascertain that the lens and control fields have similar depth, seeing, and pixel scale, the latter quantities being critical for source deblending and identification. Some regions of the control fields, and potentially of the data around the lens target, may be masked due to saturated stars, cosmic rays, or camera defects. To guarantee unbiased estimates of \(\zeta _{q}\), it is important to apply the same mask to the weighted counts of the lens field and the control fields. A large variety of weighting schemes has been explored, some of them involving a proxy for galaxy properties such as the redshift and the stellar mass. Those quantities are derived using a photometric redshift code, such as LEPHARE (Ilbert et al. 2006), which implies the availability of deep multi-band photometric data. A similar depth for the lens and comparison fields is important for matching galaxy properties in photometric surveys. Also, similar set-ups for the magnitude measurements are required to minimize systematic errors caused by aperture and/or object deblending uncertainties. When spectroscopic redshifts are also available, they are generally preferred over the photometric ones.
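As a toy illustration of the weighting scheme (names and synthetic data are ours, and the control-field ensemble is collapsed to its median for brevity; in practice one keeps the full per-field distribution \(p(\zeta _{q})\)):

```python
import numpy as np

def weighted_count(r, q="1/r"):
    """Weighted number count W_q over galaxies at projected separations r
    (arcsec) from the lens; q = 1 reduces to a plain number count."""
    return np.sum(1.0 / r) if q == "1/r" else float(len(r))

def zeta_q(r_lens, r_control_fields, q="1/r"):
    """Relative weighted density of the lens field, normalized here by the
    median control-field count for brevity."""
    w_cf = [weighted_count(r_cf, q) for r_cf in r_control_fields]
    return weighted_count(r_lens, q) / np.median(w_cf)

rng = np.random.default_rng(1)
r_lens = rng.uniform(5, 120, size=40)                       # toy lens field
controls = [rng.uniform(5, 120, size=n) for n in rng.poisson(35, 100)]
print(zeta_q(r_lens, controls))                             # >1: overdense LOS
```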

The conversion of the probability density distribution \(p(\zeta _{q})\) into \(p(\kappa _{\mathrm{{ext}}})\) requires the use of numerical simulations for which \(\kappa _{\mathrm{{ext}}}\) has been derived through ray-tracing. Large simulation volumes are required to minimise the impact of sample variance and cosmic variance. Since the Millennium simulation (Springel et al. 2005) is dark matter only, galaxy photometric properties are painted in using physically motivated prescriptions; a common choice has been the semi-analytic model of De Lucia and Blaizot (2007). Densities \(\zeta _{q}\) can be estimated in those catalogues in a similar way as for the real data, except that the reference field is now the simulation catalogue itself. As explained in Rusu et al. (2017), the probability \(p(\kappa _{\mathrm{{ext}}}\mid \mathbf{d})\), where \(\mathbf{d}\) stands for the available data, is given by:

$$ p(\kappa _{\mathrm{{ext}}}\mid \mathbf{d}) = \int d \zeta _{q} \frac{p(\kappa _{\mathrm{{ext}}},\zeta _{q}) p(\zeta _{q},\mathbf{d})}{p(\zeta _{q}) p(\mathbf{d})} = \int d \zeta _{q} p(\kappa _{\mathrm{{ext}}}\mid \zeta _{q}) p(\zeta _{q} \mid \mathbf{d}) . $$
(34)

Greene et al. (2013) showed that the precision on \(\kappa _{\mathrm{ext}}\) is improved by a factor of 2 when using a combination of weights, the two main ones being the standard number counts (\(q=1\)) and the inverse of the projected distance to the lens (\(q=1/r\)). The addition of a weight based on the modeled amplitude of the shear, \(\gamma _{\mathrm{ext}}\), is also often considered, such that the general expression of \(p(\kappa _{\mathrm{ext}} \mid \mathbf{d})\) becomes:

$$ p(\kappa _{\mathrm{ext}} \mid \mathbf{d}) = \int d\zeta _{1}\, d\zeta _{1/r}\, d\zeta _{q\neq 1,1/r}\, d\zeta _{\gamma _{\mathrm{ext}}}\, p_{\mathrm{MS}}(\kappa _{\mathrm{ext}} \mid \zeta _{1}, \zeta _{1/r}, \zeta _{q\neq 1,1/r}, \zeta _{\gamma _{\mathrm{ext}}})\, p(\zeta _{1}, \zeta _{1/r}, \zeta _{q\neq 1,1/r}, \zeta _{\gamma _{\mathrm{ext}}} \mid \mathbf{d}). $$
(35)

The addition of a fourth weight, \(q\neq 1,1/r\), allows one to evaluate the systematic errors introduced by the specific choice among equally motivated weighting schemes, and to explore which combination of weights yields the best precision on \(\kappa _{\mathrm{{ext}}}\). The first application of this technique in the framework of time-delay cosmography was presented in Suyu et al. (2010). Subsequent time-delay cosmography analyses from H0LICOW, STRIDES and TDCOSMO have used an approach that broadly follows the strategy outlined above (see Greene et al. 2013; Rusu et al. 2017, for a more in-depth description of the method), proposing small variations and tests in terms of weighting schemes and choices of comparison fields (Birrer et al. 2019; Sluse et al. 2019; Buckley-Geer et al. 2020). Figure 9 displays \(p(\kappa _{\mathrm{ext}})\) as derived with different weighting schemes for the lens system HE0435–1223.

Fig. 9

Probability distribution \(P(\kappa _{\mathrm{ext}})\) for HE0435–1223 and an aperture radius of 45′′. The different curves correspond to all lines of sight (dotted red), considering only lines of sight with the same overdensity as the data (dash-dotted blue), using a weighting inversely proportional to the distance (dashed green), and with the additional constraint from the shear (solid black). Courtesy of Rusu et al. (2017)

A novel method for estimating \(\kappa _{\mathrm{{ext}}}\) has recently been proposed by Park et al. (2021), replacing the weighted number count scheme with a machine-learning approach. Specifically, they trained a Bayesian Graph Neural Network on the LSST DESC DC2 sky survey (LSST Dark Energy Science Collaboration (LSST DESC) 2021) to derive a distribution of \(\kappa _{\mathrm{ext}}\) for an arbitrary gravitational lens sightline.

The reliance on cosmological simulations and their cosmological assumptions poses a slight circularity in the inference, as the very goal of time-delay cosmography is to test and challenge cosmological models. However, the line-of-sight correction term that is constrained by relying on cosmological assumptions is perturbative: even if the actual cosmology resulted in a \(\sim 10\%\) relative difference in the line-of-sight characteristics, this would be a sub-percent change to the distance measurements, since the expected effect is only a few percent.

6.3 Weak Lensing

Weak lensing, the linear shape distortion of background galaxies by foreground structure, is a direct probe of the LOS structure. On linear scales, cosmic shear measurements can be translated to convergence in a unique mapping (Kaiser and Squires 1993). Hence, this technique relies neither on priors from numerical simulations nor on a galaxy-halo connection. However, there are also several drawbacks. The angular scale of a weak lensing measurement is limited by the number density of lensed sources, and a high S/N measurement can only be achieved at scales of arcminutes. Thus, weak lensing is an excellent observable for quantifying large-scale cosmic density distributions, but smaller-scale density perturbations, down to the scales of arcseconds, are not well captured. Another limitation is that the weak lensing source population is not at the same redshift as the strongly lensed source. One needs to translate the weak lensing convergence map to a different lensing kernel, which comes with additional statistical uncertainties (e.g., Kuhn et al. 2021).

In the strong lensing context, for example, Fischer and Tyson (1997), Nakajima et al. (2009), Fadely et al. (2010) relied on the weak lensing effect produced by massive structures in the vicinity of the deflector. They constrained the external convergence by integrating the tangential weak gravitational shear in the area around the lens. More recently, Tihhonova et al. (2018, 2020) applied weak lensing techniques to the quadruply lensed quasar systems HE0435–1223 and B1608+656 and performed a convergence map reconstruction based on HST imaging. Kuhn et al. (2021) performed a convergence map reconstruction of the COSMOS field at the positions of discovered strong lenses.

6.4 Hybrid Framework

Given the strengths and weaknesses of the direct modeling and summary statistics approaches, as well as of the weak lensing measurements, a hybrid approach can leverage the complementary methodologies. Summary statistics are most effectively employed for objects whose effect on the lensing convergence is mostly cumulative, while explicit modeling of LOS objects makes a difference for massive or very nearby objects. The specific decision of where to split the analysis between a statistical approach and explicit modeling is primarily driven by two factors. The first is the required accuracy in the deflection properties, both in terms of higher-order lensing distortions and the need for a multi-plane lensing approach. The second is the available information, and the handling of priors in the absence of sufficient information.

A method to accurately account for the line of sight has been proposed by McCully et al. (2014, 2017). It consists of a multi-plane lens equation in which only the planes associated with important perturbing groups/clusters/galaxies are included. The other perturbers along the LOS are treated under the tidal approximation. To identify those objects, McCully et al. (2017) proposed using a threshold based on the value of the flexion shift \(\Delta _{3} x\), whose expression is given by:

$$ \Delta _{3} x = f(\beta ) \, \times \frac{(\theta _{\mathrm{E}} \,\theta _{\mathrm{E},\mathrm{p}})^{2}}{\theta ^{3}}, $$
(36)

where \(\theta _{\mathrm{E}}\) and \(\theta _{\mathrm{E},\mathrm{ p}}\) are the Einstein radii of the main lens and of the perturber, respectively, and \(\theta \) is the angular separation on the sky between the lens and the perturber. The function \(f(\beta ) = (1-\beta )^{2}\) if the perturber is behind the main lens, and \(f(\beta ) = 1\) if the galaxy is in the foreground. In this expression, \(\beta \) is the pre-factor of the lens deflection in the multi-plane lens equation:

$$ \beta = \frac{D_{\mathrm{{dp}}} D_{\mathrm{{os}}} }{D_{\mathrm{{op}}} D_{\mathrm{{ds}}}}, $$
(37)

where \(D_{ij} = D(z_{i}, z_{j})\) are the angular diameter distances between redshifts \(z_{i}\) and \(z_{j}\), corresponding to the observer (\(\mathrm{{o}}\)), deflector (\(\mathrm{{d}}\)), perturber (\(\mathrm{{p}}\)) and source (\(\mathrm{{s}}\)). Failing to account for a foreground perturber may have a stronger impact on the models than missing a background one. The reason is that a background perturber has a multiplicative effect on the source position, while the deflection from a foreground perturber enters the lens equation inside the argument of the deflection of the main lens galaxy. In other words, the foreground perturber modifies the coordinates of the lensed image positions compared to the single-lens case. These non-linear effects require a multi-plane treatment to be properly accounted for. From a set of simulations of time-delay lens systems resembling real ones, and their subsequent modeling based on point-source image positions, McCully et al. (2017) suggest that a value \(\Delta _{3} x < 10^{-4}\) arcsec yields a bias on \(H_{0}\) of less than a percent. Since Sluse et al. (2017), this prescription has been used by the H0LICOW and TDCOSMO collaborations to select the objects that they explicitly include in the lens mass modelling.
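A sketch of the flexion-shift computation (Eqs. (36)-(37)), under an assumed fiducial cosmology, whose output can be compared against the \(\Delta _{3} x < 10^{-4}\) arcsec threshold:

```python
from astropy.cosmology import FlatLambdaCDM

cosmo = FlatLambdaCDM(H0=70, Om0=0.3)  # assumed fiducial cosmology

def beta_pre_factor(z_d, z_p, z_s):
    """Multi-plane pre-factor beta of Eq. (37) for a background perturber."""
    D_dp = cosmo.angular_diameter_distance_z1z2(z_d, z_p)
    D_os = cosmo.angular_diameter_distance(z_s)
    D_op = cosmo.angular_diameter_distance(z_p)
    D_ds = cosmo.angular_diameter_distance_z1z2(z_d, z_s)
    return float(D_dp * D_os / (D_op * D_ds))

def flexion_shift(theta_E, theta_E_p, theta, z_d, z_p, z_s):
    """Flexion shift Delta_3 x of Eq. (36); angles in arcsec."""
    f = (1.0 - beta_pre_factor(z_d, z_p, z_s)) ** 2 if z_p > z_d else 1.0
    return f * (theta_E * theta_E_p) ** 2 / theta ** 3

# a theta_E,p = 0.2" perturber, 30" from a theta_E = 1" lens: well below 1e-4"
print(flexion_shift(1.0, 0.2, 30.0, z_d=0.5, z_p=0.8, z_s=2.0))
```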

Birrer et al. (2017) and Kuhn et al. (2021) combined the study of the environment using a halo-rendering approach, i.e., linking the galaxy stellar masses to the underlying mass distribution, with external shear measurements of the strong lens system. The combined approach yielded tighter constraints on the inferred external convergence compared to a halo-rendering approach alone.

7 Cosmographic Inference

Having established the necessary observational and analysis components in the previous sections, in this section we discuss how an end-to-end combined analysis leads to constraints on \(H_{0}\) and other relevant cosmological parameters. First we discuss the analysis of a single lens (Sect. 7.1) and then the analysis of a set of multiple lenses (Sect. 7.2).

7.1 Single Lens Cosmography

For each individual strong lens, there are preferably four data sets available: (1) imaging data of the strong lensing features and the deflector galaxy, \(\mathcal{D}_{\mathrm{img}}\); (2) time-delay measurements between the multiple images, \(\mathcal{D}_{\mathrm{td}}\); (3) stellar kinematics measurements of the main deflector galaxy, \(\mathcal{D}_{\mathrm{spec}}\); and (4) line-of-sight galaxy counts and weak lensing statistics, \(\mathcal{D}_{\mathrm{los}}\). These data sets are independent, and so are their likelihoods in a joint cosmographic inference. Hence, we can write the likelihood of the joint set of the data

$$ \mathcal{D} =\{\mathcal{D}_{\mathrm{img}}, \mathcal{D}_{\mathrm{td}}, \mathcal{D}_{\mathrm{spec}}, \mathcal{D}_{\mathrm{los}}\} $$
(38)

given the cosmographic parameters \(\{D_{\mathrm{d}}, D_{\mathrm{s}}, D_{\mathrm{ds}} \} \equiv D_{\mathrm{d},\mathrm{ s},\mathrm{ ds}}\) as

$$ \begin{aligned} \mathcal{L}(\mathcal{D}| D_{\mathrm{d},\mathrm{ s},\mathrm{ ds}} ) ={}& \int \mathcal{L}( \mathcal{D}_{\mathrm{img}} | \boldsymbol{\xi}_{\mathrm{mass}}, \boldsymbol{\xi}_{ \mathrm{light}}) \\ &{}\times \mathcal{L}(\mathcal{D}_{\mathrm{td}} | \boldsymbol{\xi}_{\mathrm{mass}}, \boldsymbol{\xi}_{\mathrm{light}}, \lambda , D_{\Delta t}) \\ &{}\times \mathcal{L}(\mathcal{D}_{\mathrm{spec}}| \boldsymbol{\xi}_{\mathrm{mass}}, \boldsymbol{\xi}_{\mathrm{light}}, \beta _{\mathrm{ani}}, \lambda , D_{\mathrm{s}}/D_{ \mathrm{ds}}) \mathcal{L}(\mathcal{D}_{\mathrm{los}}| \kappa _{\mathrm{ext}}) \\ &{}\times p(\boldsymbol{\xi}_{\mathrm{mass}}, \boldsymbol{\xi}_{\mathrm{light}}, \lambda _{\mathrm{int}}, \kappa _{\mathrm{ext}}, \beta _{\mathrm{ani}}) d \boldsymbol{\xi}_{\mathrm{mass}} d\boldsymbol{\xi}_{\mathrm{light}} d\lambda _{ \mathrm{int}} d\kappa _{\mathrm{ext}} d\beta _{\mathrm{ani}}. \end{aligned} $$
(39)

In the expression above we only included the relevant model components of the individual likelihoods. \(\boldsymbol{\xi}_{\mathrm{light}}\) formally includes the source and lens light surface brightness.
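Because the four likelihoods are independent, the joint log-likelihood is simply the sum of the individual terms. A schematic sketch with toy Gaussian stand-ins (all names and numbers are illustrative, not from any actual analysis):

```python
# Hypothetical stand-in log-likelihood terms for the four independent data
# sets of Eq. (39); real implementations wrap the actual data vectors.
def log_like_imaging(xi):    return -0.5 * ((xi["theta_E"] - 1.00) / 0.01) ** 2
def log_like_timedelay(xi):  return -0.5 * ((xi["D_dt"] - 5200.0) / 150.0) ** 2
def log_like_kinematics(xi): return -0.5 * ((xi["sigma_v"] - 250.0) / 15.0) ** 2
def log_like_los(xi):        return -0.5 * ((xi["kappa_ext"] - 0.02) / 0.02) ** 2

def joint_log_likelihood(xi):
    """Independence of the data sets means the joint likelihood factorizes,
    so the log-likelihood terms simply add."""
    return (log_like_imaging(xi) + log_like_timedelay(xi)
            + log_like_kinematics(xi) + log_like_los(xi))

xi = {"theta_E": 1.0, "D_dt": 5200.0, "sigma_v": 250.0, "kappa_ext": 0.02}
print(joint_log_likelihood(xi))  # 0.0 at the fiducial values
```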

The sampling of the cosmographic posterior from the joint likelihood of Eq. (39) can be split into parts to simplify the problem. For example, we can first perform the imaging analysis, providing constraints on \(\boldsymbol{\xi}_{\mathrm{mass}}\) and \(\boldsymbol{\xi}_{\mathrm{light}}\) without sampling the cosmological or distance parameters. In turn, simply sampling the \(\boldsymbol{\xi}_{\mathrm{mass}}\) and \(\boldsymbol{\xi}_{\mathrm{light}}\) posteriors in post-processing when evaluating the time-delay and stellar kinematics likelihoods can translate the posteriors into distance posteriors in \(D_{\Delta t}-D_{\mathrm{d}}\) space. Marginalization over different modeling choices can also be done in the \(D_{\Delta t}-D_{\mathrm{d}}\) posterior space. Figure 10 provides an example of \(D_{\Delta t}-D_{\mathrm{d}}\) posteriors for a set of different modeling choices.

Fig. 10

Median-subtracted and mean-divided relative angular diameter distance posteriors, \(D_{\Delta t}\) and \(D_{\mathrm{d}}\), for a power-law mass density profile with three different source reconstruction settings. \(n_{\mathrm{max}}\) in this figure refers to the polynomial order of the shapelets, with the set of three numbers corresponding to three different sources being simultaneously modelled in the image. Figure adapted from Shajib et al. (2020)

Weighting the posteriors of different models can be done with Bayesian model comparison. This weighting also allows one to combine models into a single posterior, which then includes systematics considerations. Discrete and finite choices made in the models, and scatter in the sampling and BIC calculation, can lead to overly confident model selections. Procedures that take noise and finite model selection in the BIC estimate into account have been developed (Birrer et al. 2019).

7.2 Population Level Analysis

The overarching goal of time-delay cosmography is to provide a robust inference of cosmological parameters, \(\boldsymbol{\pi}\), and in particular the absolute distance scale, the Hubble constant \(H_{0}\), and possibly other parameters describing the expansion history of the Universe (such as \(\Omega _{\Lambda}\) or \(\Omega _{\mathrm{m}}\)), from a sample of gravitational lenses with measured time delays.

In Bayesian language, we want to calculate the probability of the cosmological parameters, \(\boldsymbol{\pi}\), given the strong lensing data set, \(p(\boldsymbol{\pi} | \{\mathcal{D}_{i} \}_{N})\), where \(\mathcal{D}_{i}\) is the data set of an individual lens (including imaging data, time-delay measurements, kinematic observations and line-of-sight galaxy properties) and \(N\) the total number of lenses in the sample.

In addition to \(\boldsymbol{\pi}\), we denote by \(\boldsymbol{\xi}\) all the model parameters that are part of either a single-lens analysis (Sect. 7.1) or the population level. Using Bayes' rule, and considering that the data of each individual lens \(\mathcal{D}_{i}\) are independent, we can write, following Birrer et al. (2020):

$$ p(\boldsymbol{\pi} \mid \{\mathcal{D}_{i} \}_{N}) \propto \mathcal{L}(\{\mathcal{D}_{i} \}_{N} \mid \boldsymbol{\pi})\, p(\boldsymbol{\pi}) = \int \mathcal{L}(\{\mathcal{D}_{i} \}_{N} \mid \boldsymbol{\pi}, \boldsymbol{\xi})\, p(\boldsymbol{\pi}, \boldsymbol{\xi})\, d\boldsymbol{\xi} = \int \prod _{i}^{N} \mathcal{L}(\mathcal{D}_{i} \mid \boldsymbol{\pi}, \boldsymbol{\xi})\, p(\boldsymbol{\pi}, \boldsymbol{\xi})\, d\boldsymbol{\xi}. $$
(40)

In the following, we divide the nuisance parameters, \(\boldsymbol{\xi}\), into a subset of parameters that we constrain independently per lens, \(\boldsymbol{\xi}_{i}\), and a set of parameters that need to be sampled globally across the lens population, \(\boldsymbol{\xi}_{\mathrm{pop}}\). The parameters of each individual lens, \(\boldsymbol{\xi}_{i}\), include the lens model, the source and lens light surface brightness, and any other relevant parameters of the model needed to predict the data. Hence, we can express the hierarchical inference (Eqn. (40)) as

$$ p(\boldsymbol{\pi} \mid \{\mathcal{D}_{i} \}_{N}) \propto \int \prod _{i}^{N} \left [ \mathcal{L}(\mathcal{D}_{i} \mid D_{\mathrm{d},\mathrm{ s},\mathrm{ ds}}(\boldsymbol{\pi}), \boldsymbol{\xi}_{i}, \boldsymbol{\xi}_{\mathrm{pop}})\, p(\boldsymbol{\xi}_{i}) \right ] \times \frac{p(\boldsymbol{\pi}, \{\boldsymbol{\xi}_{i} \}_{N}, \boldsymbol{\xi}_{\mathrm{pop}})}{\prod _{i} p(\boldsymbol{\xi}_{i})}\, d\boldsymbol{\xi}_{\{i\}}\, d\boldsymbol{\xi}_{\mathrm{pop}}, $$
(41)

where \(\{\boldsymbol{\xi}_{i} \}_{N} = \{\boldsymbol{\xi}_{1}, \boldsymbol{\xi}_{2}, \ldots, \boldsymbol{\xi}_{N} \}\) is the set of parameters applied to the individual lenses and \(p( \boldsymbol{\xi}_{i})\) are the interim priors on the model parameters in the inference of an individual lens. The cosmological parameters \(\boldsymbol{\pi}\) are fully encompassed in the set of angular diameter distances, \(\{D_{\mathrm{d}}, D_{\mathrm{s}}, D_{\mathrm{ds}} \} \equiv D_{\mathrm{d},\mathrm{ s},\mathrm{ ds}}\), and thus, instead of stating \(\boldsymbol{\pi}\) in Eq. (41), we now state \(D_{\mathrm{d},\mathrm{ s},\mathrm{ ds}}(\boldsymbol{\pi})\). Up to this point, no approximation has been applied to the full hierarchical expression (Eqn. (40)).
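In practice, the inner integrals are often evaluated by importance sampling: posterior samples of each lens, drawn under the interim prior \(p(\boldsymbol{\xi}_{i})\), are re-weighted by the ratio of the proposed population distribution to the interim prior. A schematic sketch under these assumptions (function and variable names are ours):

```python
import numpy as np

def population_log_like(samples_per_lens, log_pop_pdf, log_interim_pdf):
    """Schematic hierarchical re-weighting: for each lens, average the ratio
    of the (unnormalized) population distribution to the interim prior over
    its posterior samples, then sum the logs across independent lenses."""
    log_like = 0.0
    for s in samples_per_lens:
        log_w = log_pop_pdf(s) - log_interim_pdf(s)
        log_like += np.logaddexp.reduce(log_w) - np.log(len(log_w))
    return log_like

rng = np.random.default_rng(0)
samples = [rng.normal(1.0, 0.1, 2000) for _ in range(7)]  # e.g. lambda_int draws
flat_interim = lambda x: np.zeros_like(x)                 # flat interim prior
pop_model = lambda x: -0.5 * ((x - 1.02) / 0.05) ** 2     # proposed population
print(population_log_like(samples, pop_model, flat_interim))
```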

Key differences among different inferences of \(H_{0}\) from a set of lenses involve, beyond the assumptions on individual lenses, assumptions on the covariant nature of, and the population-level prior on, the governing hyper-parameters. For example, Wong et al. (2020) assumes full independence of the nuisance priors from one lens to another. Formally, within Bayes' theorem, this approach assumes perfect knowledge of the governing population hyper-parameter distribution prior (Eq. (41)). In this approach, the distance posteriors of individual lenses can be interpreted as measurements, and the cosmographic analysis can be done by operating solely in the \(D_{\Delta t}-D_{\mathrm{d}}\) space with a direct, independent, and easily accessible likelihood description.

Millon et al. (2020b) performed an analysis exploring the difference between two radial mass density profile families, assuming that all lenses are of either one type or the other, effectively treating one modeling choice as a covariant nuisance parameter in the inference while keeping all other priors independent with an assumed population.

Denzel et al. (2021) used a free-form approach in the modeling of the individual lenses. The ensemble of models allowed by the data for an individual lens provides the model posterior distribution. The underlying regularization scheme is the implicit prior applied to the individual lenses. The identical regularization scheme is applied to all lenses, assuming independence in the priors without covariances in the choice of the regularization scheme between lenses. Denzel et al. (2021) did not use any external information to break the MST; hence, the specific choice of the regularization scheme, with its underlying physical and regularization priors, is responsible for breaking the MST at the population level.

Birrer et al. (2020) introduced the hierarchical analysis framework into time-delay cosmography and identified a few key parameters that are not sufficiently well constrained on a per-lens basis, such that the population prior can significantly affect the outcome of the analysis. The parameters sampled hierarchically, beyond the cosmological ones, were the population MST parameter \(\lambda _{\mathrm{int}}\) (Eq. (21)) and the stellar anisotropy distribution (see Sect. 5.4).

Park et al. (2023) implemented a Bayesian hierarchical framework to determine the external convergence distribution at the population level for a full sample of lens systems used for time-delay cosmography, and demonstrated how to correct for a selection bias in the population of lenses when there is limited information on an individual-lens basis.

The required population-level description of priors, in particular for parameters that cannot be constrained to high precision (i.e., overcoming the prior in the analysis), also needs to accurately take into account potential differences among subsets of the population. For example, different lens discovery channels might preferentially select different lens and line-of-sight populations.

8 Cosmography with Galaxy Clusters

In this section, we discuss current and past applications of cosmography with galaxy clusters. We first discuss relative expansion history constraints from multiple source redshifts (Sect. 8.1) and then time-delay cosmography applications (Sect. 8.2). This section aims to provide a brief overview of these aspects, and we refer to the specific literature referenced in this section for further details.

8.1 Relative Expansion History with Galaxy Clusters

For a given deflector, changing the source redshift alters the angular diameter distances in Eq. (5), whilst the other terms in Eqs. (1)-(4) are unchanged. Hence, for two photons passing through the same point in the lens plane, but originating from different source planes, the ratio of scaled deflection angles, \(\alpha _{1}\), \(\alpha _{2}\), is given by the cosmological scaling factor, \(\beta \),

$$ \frac{\alpha _{1}}{\alpha _{2}} = \frac{D_{ls1} D_{s2}}{D_{s1}D_{ls2}} \equiv \beta . $$
(42)

It has been realized that lenses with multiple source planes can additionally provide constraints on cosmological distance ratios sensitive to the relative expansion history and geometry of the Universe (e.g., Paczynski and Gorski 1981; Link and Pierce 1998; Cooray 1999; Golse et al. 2002; Sereno 2002; Soucail et al. 2004; Gavazzi et al. 2008; Gilmore and Natarajan 2009). Link and Pierce (1998) showed that the cosmological sensitivity of the angular size-redshift relation could be exploited using sources at distinct redshifts and developed a methodology to simultaneously invert the lens and derive cosmological constraints.
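As an illustration of this geometric sensitivity, the scaling factor \(\beta \) of Eq. (42) can be evaluated with astropy under an assumed flat \(\Lambda \)CDM cosmology (a sketch; the redshifts are arbitrary):

```python
from astropy.cosmology import FlatLambdaCDM

def beta_ratio(z_l, z_s1, z_s2, cosmo):
    """Cosmological scaling factor beta of Eq. (42) between two source planes."""
    D_ls1 = cosmo.angular_diameter_distance_z1z2(z_l, z_s1)
    D_ls2 = cosmo.angular_diameter_distance_z1z2(z_l, z_s2)
    D_s1 = cosmo.angular_diameter_distance(z_s1)
    D_s2 = cosmo.angular_diameter_distance(z_s2)
    return float(D_ls1 * D_s2 / (D_s1 * D_ls2))

# beta shifts with the matter density, illustrating the geometric sensitivity
for om in (0.2, 0.3, 0.4):
    print(om, beta_ratio(0.4, 1.5, 3.0, FlatLambdaCDM(H0=70, Om0=om)))
```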

In particular, galaxy clusters with a large strong lensing cross section typically have multiple sources at different redshifts and are exquisite objects to study the geometrical effect between different source redshifts. In fact, early studies of galaxy clusters already indicated the presence of a cosmological constant (e.g., Paczynski and Gorski 1981; Sereno 2002; Soucail et al. 2004).

The method to probe relative angular diameter distances, in addition to multiple sources at different redshifts, also requires a complete understanding of the lens density profile and other perturbing masses along the line of sight.

Deep multi-colour imaging and spectroscopy of a small subset of galaxy clusters, such as through the Hubble Frontier Fields program (Lotz et al. 2017), has led to the discovery of hundreds of multiple images and thus to a significant improvement of cluster mass estimates (Jauzac et al. 2014; Diego et al. 2016; Lagattuta et al. 2017; Monna et al. 2017). Cluster lenses are complex, and most of the effort has been spent in accurately reconstructing cluster lensing profiles (see e.g., Jullo et al. 2010; D'Aloisio and Natarajan 2011; Magaña et al. 2015; Caminha et al. 2016; Acebron et al. 2017). The mass modelling of strong lensing clusters can be carried out in different manners: parametric and non-parametric methods are equally used, the primary distinction between them being that parametric modelling assumes that luminous cluster galaxies trace the cluster mass whereas non-parametric modelling does not.

The method of using sources at multiple redshifts can also be applied to galaxy-scale lenses, though double source plane lenses are rarer (Biesiada 2006; Gavazzi et al. 2008; Collett et al. 2012). The Einstein radius is a function of the lens mass and the cosmological distances. The ratio of Einstein radii in a lens with sources at two or more redshifts is independent of the deflector mass (e.g., Gavazzi et al. 2008; Collett et al. 2012).

In both cases, galaxy and cluster scales, the method also requires a complete understanding of the lens density profile and additional lensing by the source galaxies and other perturbing masses along the line of sight.

8.2 \(H_{0}\) with Galaxy Clusters

To date, galaxy-scale lenses have dominated the literature on \(H_{0}\) determination, both in the number of measurements and in precision. We have recently witnessed competitive constraints from galaxy clusters in measuring \(H_{0}\) (Kelly et al. 2023a; Napier et al. 2023; Liu et al. 2023; Pascale et al. 2024) and other cosmological quantities (Grillo et al. 2024). Massive clusters are rich in multiple images and have definite advantages over individual galaxies. The main one is that sources at multiple redshifts break the mass-sheet or steepness degeneracy (e.g., Bradač et al. 2004), which is the main degeneracy, and hence source of uncertainty, affecting galaxy-scale determinations of \(H_{0}\) (see also Sect. 5). The much larger image separations in clusters compared to galaxy-scale lenses result in overall longer time delays, of order months to years. Those time delays are relatively easily determined to a few percent precision, rivalling time-delay determinations from quasar sources in galaxy-scale lenses. However, the longer time delays imply years of monitoring to obtain lightcurves with sufficient overlap in the case of lensed quasars (e.g., Fohlmeister et al. 2008, 2013; Muñoz et al. 2022) or dedicated HST follow-up at the time of reappearance in the case of lensed supernovae (Kelly et al. 2023b). The drawback of clusters is that their mass distributions are more complex: they are dynamically younger than galaxies, and their multiple-image regions sample a much larger fraction of the clusters' virial radius than in galaxies. Therefore, the multiple-image region of clusters is expected to be more abundant in substructure, and hence harder to model. These difficulties can be circumvented if there are a few tens or hundreds of multiple images, in which case \(H_{0}\) can be estimated to a 1-few% precision (Ghosh et al. 2020). At present, in a cluster lens like MACS 1149, one can estimate \(H_{0}\) to 6%, assuming a conservative 3% uncertainty on the observed time delay (Grillo et al. 2018).

The first cluster lens to produce a precise estimate of \(H_{0}\) was MACS 1149, where the first confirmed multiply imaged supernova was observed a few years ago (Kelly et al. 2015). The long time delay before the reappearance of the last arriving image (a saddle point in the arrival-time surface of the cluster) allowed the lensing community to make model predictions for the time of the reappearance. Most models agreed reasonably well on a 250-350 day delay (Kelly et al. 2016; Treu et al. 2016). Very recently, Frye et al. (2024) discovered another lensed supernova in a cluster, this time of type Ia, which even allows for a standardization of the lensing magnification (Pierel et al. 2024a), and Pierel et al. (2024b) discovered for the first time a second supernova in the same host galaxy, with the initial supernova found in archival data by Rodney et al. (2021), further demonstrating the future prospects of cluster-scale lenses.

9 Current Status and Results

We go at length through recent results using lensed quasars and also present the recent results using lensed SNe. Figure 11 summarizes a selection of current measurements and a comparison with other probes.

Fig. 11

Comparison of recent \(H_{0}\) measurements in the literature. Presented are the time-delay cosmography constraints from Kelly et al. (2023a) of SN Refsdal, for the full set of eight models (orange) and the subset of the two best models (blue); from the TDCOSMO collaboration (red) of six TDCOSMO time-delay lenses (five H0LiCOW lenses (Wong et al. 2020) and one STRIDES lens by Shajib et al. (2020)), assuming parametric forms of the mass density profile of the deflector, either described as a power-law or stars (constant mass-to-light ratio) plus dark matter halos (Millon et al. 2020b); and from the TDCOSMO collaboration for the same lenses as in Millon et al. (2020b), with virtually no assumption on the radial mass density profile of the lens galaxy and taking into account the covariance between the lenses (green) (Birrer et al. 2020). The TDCOSMO + SLACS measurement comes from the joint analysis of the TDCOSMO sample and 33 SLACS lenses with SDSS spectroscopy. The “free” mass profile assumptions of the two measurements by Birrer et al. (2020) are constrained only by the stellar kinematics and fully account for the uncertainty related to the mass-sheet transformation (MST). Aside from time-delay studies, the local measurements by SH0ES + Gaia (Riess et al. 2022), the Carnegie-Chicago Hubble Program (CCHP) (Freedman et al. 2019), surface brightness fluctuations (SBF) SN (Khetan et al. 2021), SBF Tip of the Red Giant Branch (TRGB) + Cepheids (Blakeslee et al. 2021), the Megamaser Cosmology Project (MCP) (Pesce et al. 2020), the gravitational wave (GW) event 170817 (Dietrich et al. 2020), Planck (Planck Collaboration 2020; dashed grey), and Dark Energy Survey (DES) + Baryon Acoustic Oscillation (BAO) + Big Bang nucleosynthesis (BBN) (Abbott et al. 2018) are presented. Error bars in panel (B) show the 16th, 50th, and 84th percentile confidence levels. The dashed horizontal line separates measurements from observations of the universe early in its evolution from those late in its evolution. \(H_{0}\) measurements bracketed by different vertical gray bars are entirely independent of each other. Figure from Kelly et al. (2023a), which was generated using a previous comparison (Bonvin and Millon 2020)

9.1 Recent Results from Quasars

The independent analysis of six lensed quasar systems (Suyu et al. 2010, 2013; Wong et al. 2017; Bonvin et al. 2017; Birrer et al. 2019; Chen et al. 2019; Rusu et al. 2020) by the H0LiCOW collaboration (Suyu et al. 2017) inferred a Hubble constant value of \(H_{0} = 73.3^{+1.7}_{-1.8}\text{ km}\text{ s}^{-1}{\mathrm{Mpc}}^{-1}\). This measurement uses parametric forms of the mass density profile of the deflector, either described as a power-law or stars (constant mass-to-light ratio) plus dark matter halos following an NFW (Navarro et al. 1997) profile, with priors on the mass and concentration of the halo reflecting the population of haloes in N-body simulations (Wong et al. 2020). The H0LiCOW result is a 2% precision measurement of \(H_{0}\), in excellent agreement with the local distance ladder measurement by the SH0ES team (Riess et al. 2019, 2021). Moreover, the H0LiCOW measurement is in more than 3\(\sigma \) statistical tension with early-Universe probes (e.g., Planck Collaboration 2020; Aiola et al. 2020). An additional lens analyzed by the STRIDES collaboration with the same mass profile assumptions as the H0LiCOW collaboration further provided the most precise single-lens measurement of \(H_{0} = 74.2^{+2.7}_{-3.0}\text{ km}\text{ s}^{-1}{\mathrm{Mpc}}^{-1}\) (Shajib et al. 2020). In summary, if the mass density profiles of the H0LiCOW and STRIDES lenses are well described by a power-law, or by a baryonic component with a constant mass-to-light ratio plus dark matter profiles from standard N-body dark-matter-only simulations, and under the assumption that the covariances are negligible, the tension is significant from the strong lensing measurements alone, corroborating other measurements (e.g., Riess et al. 2021).

Given that incompatibilities between the local value of \(H_{0}\) and the \(\Lambda \)CDM-extrapolated \(H_{0}\) inference from the CMB or other early-Universe anchored probes would inherently break the standard model of cosmology and likely require new physics, several groups, including the H0LiCOW, STRIDES and SHARP collaborations, now unified as the TDCOSMO collaboration, are investigating potential systematics in the \(H_{0}\) measurements.

Combining six lenses from H0LiCOW, SHARP and STRIDES, the TDCOSMO collaboration found that the \(H_{0}\) results, when assuming that all lenses follow either one or the other of the previously assumed forms of the mass density profile, are in good agreement with each other. The good agreement in the \(H_{0}\) results between power-law and composite profiles was interpreted by Millon et al. (2020b) as a consequence of the 'bulge-halo conspiracy', whereby the combined baryonic and dark matter density components form a power-law profile (e.g., Koopmans et al. 2006, 2009; van de Ven et al. 2009). Denzel et al. (2021) analyzed 8 quadruply imaged quasars with a free-form modeling approach and obtained \(H_{0} = 71.8^{+3.9}_{-3.3}\text{ km}\text{ s}^{-1}{ \mathrm{Mpc}}^{-1}\). Gilman et al. (2020) investigated the effect of unaccounted-for subhalos and small undetected line-of-sight halos on the uncertainty budget, and found the residual uncertainties too small to mitigate the tension of the measurements with the CMB and large-scale structure probes. Van de Vyvere et al. (2022b,a) showed that a variety of expected azimuthal structures in the mass distribution (i.e., multipoles, twists and ellipticity gradients) should leave \(H_{0}\) unaffected at the population level unless there are specific selection effects in the galaxy population.

Attention further turned to assessing and relaxing the radial profile assumption (see Sect. 5.3), as well as to introducing population priors, for parameters that cannot be constrained on a lens-by-lens basis, for a covariant treatment of their uncertainties. Birrer et al. (2020) addressed the radial profile assumption by choosing a parametrization of the radial mass density profile that is maximally degenerate with \(H_{0}\), via the MST. This is the most explicit and direct way of addressing the MST effect on the time-delay cosmographic analysis. With this more flexible parametrization, \(H_{0}\) is only constrained if the measured time delays and imaging data are supplemented by stellar kinematics. Applying this extremely conservative choice to the TDCOSMO sample of 7 lenses increases the uncertainty on \(H_{0}\) from 2% to 8%, resulting in \(H_{0} = 74.5^{+5.6}_{-6.1}\text{ km}\text{ s}^{-1}{\mathrm{Mpc}}^{-1}\), without changing the mean inferred value significantly.

Birrer et al. (2020) further introduced a hierarchical framework (see also Sect. 7.2) in which external datasets can be combined with the time-delay lenses to improve the precision, in particular on the MST parameter of the population, and hence on \(H_{0}\). A secondary parameter that must be constrained when using stellar kinematics is the stellar anisotropy, due to the mass-anisotropy degeneracy. External data sets with spatially resolved kinematics measurements can aid in breaking this degeneracy to constrain the MST parameter. Birrer et al. (2020) achieved a 5% precision measurement of \(H_{0}\) by combining the TDCOSMO lenses with imaging modeling and stellar kinematic measurements of a sample of lenses from the Sloan Lens ACS (SLACS) survey with no time-delay information (Bolton et al. 2008; Auger et al. 2009; Shajib et al. 2021), and measured \(H_{0} = 67.4^{+4.1}_{-3.2}\text{ km}\text{ s}^{-1}{\mathrm{Mpc}}^{-1}\). The mean of the TDCOSMO + SLACS measurement is offset with respect to the TDCOSMO-only value, effectively matching the CMB-inferred value, although still statistically consistent with previous assumptions given the uncertainties in the measurement. The Birrer et al. (2020) measurements with and without the SLACS dataset added are in statistical agreement with each other and with the earlier H0LiCOW/SHARP/STRIDES measurements based on the radial mass profile assumptions. The result by Birrer et al. (2020) is also consistent with the work by Shajib et al. (2021) studying more flexible mass density profiles and mass-to-light gradients. The studies by Birrer et al. (2020) and Shajib et al. (2021) share the same measurements for the SLACS lenses, and the consistency is implied by construction. Shajib et al. (2021) concluded that NFW + stars, when using wider priors on mass and concentration than the earlier H0LiCOW/SHARP/STRIDES measurements, is a sufficiently accurate description of the mass density profile of the SLACS lenses. However, a larger flexibility in the mass-concentration relation at the population level, and small departures from those radial forms, are allowed by the data, resulting in the uncertainties reflected in the Birrer et al. (2020) analysis. The shift in the mean of \(H_{0}\) when adding the SLACS lenses could be real, or it could be due to an intrinsic difference between the deflector populations in the TDCOSMO and SLACS samples. Differences in the deflectors might arise from unequal selection effects. For example, the two samples are well matched in stellar velocity dispersion (total mass), but they differ in redshift. Potentially unaccounted-for evolutionary trends in the mass profiles could bias the results when adding samples of lenses at different redshifts. Another example is that the TDCOSMO sample is source-selected, meaning that the main characteristics for the data set to be discovered and selected are properties of the source as seen when lensed, and it is composed mostly of quadruply imaged quasars, while the SLACS sample is deflector-selected, meaning that the primary selection criteria arise from properties of the deflector irrespective of the source and its geometric lensing effect, and it is dominated by doubly imaged galaxies.

More recently, Shajib et al. (2023) performed the first analysis with spatially resolved stellar kinematics measurements under the same conservative assumptions as Birrer et al. (2020) and achieved a \(\sim 9\%\) measurement from a single quadruply lensed quasar, finding results in agreement with the previous analysis based on power-law and composite models (Suyu et al. 2014). This work demonstrates the constraining power of kinematic data in the absence of priors on the shape of the mass density profile.

9.2 Recent Results from Lensed Supernovae

In addition to lensed quasars, the discovery of the first multiply-imaged supernova (SN) (Kelly et al. 2015) in the cluster MACS 1149 allowed \(H_{0}\) to be measured with lensed SNe (Vega-Ferrero et al. 2018; Kelly et al. 2023a). Kelly et al. (2023a) present results from a combination of several models from different independent modeling teams, carried out truly blind, before the measured time delays were known. A Bayesian model selection was performed based on the precision and accuracy of the predicted image positions of the reappearance of SN Refsdal and the predicted magnification ratio (which are independent of \(H_{0}\)). Combining these Bayesian weights with the weighted uncertainties of all eight individual models, they found \(H_{0} = 64.8^{+4.4}_{-4.3}\text{ km}\text{ s}^{-1}{\mathrm{Mpc}}^{-1}\). Kelly et al. (2023a) found that models that assign dark-matter halos to individual galaxies and the overall cluster best reproduce the observations. When combining the two best performing models, which are consistent with each other within their uncertainties, Kelly et al. (2023a) found \(H_{0} = 66.6^{+4.1}_{-3.3}\text{ km}\text{ s}^{-1}{\mathrm{Mpc}}^{-1}\). Very recently, an \(H_{0}\) measurement was made with a second lensed SN, supernova “H0pe” (Frye et al. 2024). Spectroscopic and photometric time-delay measurements (Chen et al. 2024; Pierel et al. 2024a) were compared to the predictions of seven independently constructed cluster lens models to measure a value for the Hubble constant. In combination with the standardizable magnification from the type Ia nature of the SN, this resulted in \(H_{0} = 75.4^{+8.1}_{-5.5}\text{ km}\text{ s}^{-1}{\mathrm{Mpc}}^{-1}\). These are very encouraging and precise measurements. For further discussions of lensed SNe, we refer to Suyu et al. (2024, this collection).
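The combination step can be illustrated schematically: each independent cluster model delivers its own \(H_{0}\) posterior, and model weights derive from how well each model predicts observables that are independent of \(H_{0}\). A minimal sketch of such a weighted mixture, with purely illustrative Gaussian summaries and weights rather than the published ones, follows:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-model H0 posteriors (Gaussian summaries) and Bayesian
# weights from each model's ability to predict H0-independent observables
# (image positions, magnification ratios). All numbers are illustrative.
means = np.array([64.0, 66.0, 69.0, 63.0, 67.0, 65.0, 70.0, 66.0])
sigmas = np.array([5.0, 4.0, 6.0, 7.0, 5.0, 4.0, 6.0, 5.0])
weights = np.array([0.25, 0.30, 0.10, 0.05, 0.12, 0.08, 0.06, 0.04])
weights = weights / weights.sum()

# Draw from the weighted mixture of the model posteriors
model = rng.choice(len(means), size=200_000, p=weights)
samples = rng.normal(means[model], sigmas[model])

lo, mid, hi = np.percentile(samples, [15.9, 50.0, 84.1])
print(f"H0 = {mid:.1f} +{hi - mid:.1f} / -{mid - lo:.1f} km/s/Mpc")
```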

10 Outlook in the (Near) Future

The goal of time-delay cosmography is to provide a robust measurement of the Hubble constant to 1% precision, to decisively arbitrate the currently observed tension between late- and early-time measurements of \(H_{0}\). In the previous Sect. 9 we presented current results. In this section, we discuss the potential of the time-delay method in the near future. To do so, we first provide some details on the error budget of current analyses, which allows us to assess the expected uncertainties and scale them to future samples and data quality (Sect. 10.1). Second, we describe the data and instrumentation that enable us to push ahead (Sect. 10.2). Third, we highlight avenues where continued work is required in assessing the methodology to maintain accuracy while increasing the precision of the measurements (Sect. 10.3). Finally, we leave some concluding remarks about the prospects of time-delay cosmography in Sect. 10.4.

10.1 Error Budget

Table 1 presents an overview of the error budget of time-delay cosmography, divided into the different aspects of the analysis for individual lenses, the current work of 7 lenses, and a forecast of a future analysis of 40 lenses with improved data. The error budget is split between the three different components of time-delay cosmography: the time-delay measurement (Sect. 4), the main deflector profile for the prediction of the Fermat potential (Sect. 5), and the line-of-sight contribution (Sect. 6). For the deflector error budget, we provide an uncertainty on the lens model excluding the potential systematics of the MST, which can be achieved with high-resolution imaging, and a total error budget including the MST, which depends on additional data (currently, stellar kinematics of the deflector). For the analysis of individual lenses, we report uncertainties from Wong et al. (2020), Millon et al. (2020b) excluding the MST, and from Birrer et al. (2020) including the MST in the deflector profile. The uncertainties in the time-delay measurement, deflector model, and line of sight are effectively uncorrelated, and hence the total error budget adds in quadrature. Excluding the MST-related uncertainties, the three components are roughly on equal footing, and the best single lenses alone can provide an \(H_{0}\) uncertainty of \(<4\%\) (e.g., Shajib et al. 2020). With stringent priors on the deflector potential, ignoring additional MST-related uncertainties, the current sample of 7 lenses, assuming uncorrelated uncertainties, results in a \(\sim 2\%\) uncertainty on \(H_{0}\) (Wong et al. 2020; Millon et al. 2020b).

Table 1 Approximate error budget of time-delay cosmography divided into different aspects of the analysis for individual lenses, the current work by Birrer et al. (2020) of 7 lenses, and a forecast of a future analysis of 40 lenses with improved data. The error budget is split between the three different components of time-delay cosmography: the time-delay measurement (see Sect. 4), the main deflector profile for the prediction of the Fermat potential (see Sect. 5), and the line-of-sight contribution (see Sect. 6). For the deflector error budget, we provide an uncertainty on the lens model excluding the potential systematics of the MST (ex MST), which can be achieved with high-resolution imaging, and a total error budget including the MST, which depends on additional data (such as, currently, stellar kinematics of the deflector). The forecast for 40 lenses is based on the same mix of data quality as for the current 7-lens constraints

The MST-related uncertainty for an individual lens depends on the imposed priors and can be constrained to a \(\sim 10\text{--}20\%\) level with current pre-JWST kinematic measurements. The combination of 7 lenses by Birrer et al. (2020) leads to a \(\sim 8\%\) uncertainty on \(H_{0}\), dominated by a \(\sim 7\%\) uncertainty on the MST. Given the uncertainties and their expected scaling (\(\Delta H_{0}/H_{0} \propto 1/\sqrt{N}\) for \(N\) lenses), for 40 lenses the MST-related uncertainty is the dominant and single most relevant uncertainty on the way to a 1% measurement of \(H_{0}\). The required data and methodology improvements are laid out in the following subsections.
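Since the uncorrelated terms add in quadrature and the ensemble uncertainty shrinks as \(1/\sqrt{N}\), the scaling behind Table 1 can be reproduced in a few lines; the numbers below loosely follow the discussion in the text and are illustrative rather than the exact table entries:

```python
import numpy as np

# Illustrative single-lens error terms (percent on H0); values loosely follow
# the discussion in the text, not the exact entries of Table 1.
time_delay = 3.0         # time-delay measurement (Sect. 4)
deflector_ex_mst = 3.0   # deflector model, excluding the MST (Sect. 5)
line_of_sight = 3.0      # external convergence (Sect. 6)
mst = 15.0               # MST-related, from single-lens kinematics

def combined(n_lenses, terms):
    """Independent errors add in quadrature and scale as 1/sqrt(N)."""
    per_lens = np.sqrt(np.sum(np.square(terms)))
    return per_lens / np.sqrt(n_lenses)

terms = [time_delay, deflector_ex_mst, line_of_sight, mst]
for n in (1, 7, 40):
    print(f"N = {n:2d}: {combined(n, terms):.1f}% on H0")
```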

The secondary uncertainty on \(H_{0}\) from the relative expansion history is \(<1\%\); it is not listed in Table 1 but is included in the total error budget on the population, and it can be mitigated either by an informed prior on the relative expansion history or by combination with probes sensitive to it.

10.2 Future Data

We expect there to be several tens of thousands of galaxy-galaxy lenses, several hundred quadruply lensed quasars, and more than a thousand doubly lensed quasars on the full sky (e.g., Oguri and Marshall 2010; Collett 2015). With the upcoming wide (large-area) and deep (sensitive to faint objects) ground- and space-based surveys, such as the Vera C. Rubin Observatory, the Roman Space Telescope, and Euclid, we expect many of those lenses to be discovered within a decade. Compared to the current analyses conducted on a few lenses (e.g., 7 lenses in the case of current TDCOSMO results), these are several e-foldings of the number of lenses possibly suitable for time-delay analyses. The sheer number of lenses will transform the measurements, and new approaches are going to be required in the domain of time-delay cosmography to efficiently and accurately make use of all the available data.

The first step in utilizing these lenses is to discover them in the large data sets. We refer to Lemon et al. (2024, this collection) for an extensive review of techniques, recent successes, and an outlook on the searches for and discovery of strong lenses. The next step is to acquire all the necessary follow-up data products to conduct accurate and precise cosmographic analyses (see Sect. 3 and subsequent sections). The data products range from monitoring data for time-delay measurements and high-resolution imaging to spectroscopic information about the source and lens redshifts, as well as the velocity dispersion of the deflector. This step is very resource-intensive, and there are going to be challenges in allocating these limited resources. Decisions will have to be made about which lenses to follow up. We comment in Sect. 10.3 on developments in methodology that can deal with less constraining or incomplete data for a larger lensing data set. Some lenses might require less substantial monitoring follow-up in cases where LSST light curves are good enough for a time-delay measurement (Liao et al. 2015). Some lenses may also automatically obtain high-resolution and sufficiently high signal-to-noise ratio imaging data from wide-field space surveys, such as Euclid or Roman (Meng et al. 2015). Understanding to what extent the acquired data products impact the precision on \(H_{0}\) is key to assessing the need for follow-up resources and deciding on which lenses to spend them. Besides the limited resources, follow-up decisions are currently also impacted by the limited access to adaptive optics (AO) instruments on ground-based large-diameter telescopes. With the next generation of AO instrumentation commissioned in both hemispheres, we expect full-sky accessibility that allows the community to target every single gravitational lens on the sky.

The dominant uncertainty in the current measurement of the Hubble constant with time-delay cosmography is attributed to uncertainties in the mass profiles of the main deflector galaxies (see, e.g., Sect. 5.3). There are multiple independent avenues available in the near future to approach a 1% measurement of \(H_{0}\) with different data sets. In this section, we focus on these pathways, enabled by improved instrumentation and increased data sets.

Spatially resolved stellar kinematics of the deflector galaxy (see Sect. 5.4 for details on methodology) with the next-generation space-based (James Webb Space Telescope; JWST) and ground-based (extremely large telescopes: E-ELT, GMT, TMT) instruments provide precise measurements of the kinematics of stars. Such two-dimensional observations of the kinematics, paired with the lensing measurements, have the ability to break the mass-anisotropy degeneracy, a currently limiting systematic when interpreting and de-projecting integrated kinematic measurements to measure the three-dimensional gravitational potential. Birrer and Treu (2021) forecast, based on the methods and assumptions used by Birrer et al. (2020) without relying on mass-density profile assumptions to break the MST, that with 40 time-delay lenses with exquisite spatially resolved kinematics and otherwise similar measurements as the 7-lens TDCOSMO sample, a 1.5% precision on \(H_{0}\) can be achieved (Fig. 12, left panel; see also, e.g., Yıldırım et al. 2020, 2023). Such a strategy, with exquisite data on the sample of time-delay lenses, is one way to make progress. Another approach is to infer the mass density profile properties from a larger set of non-time-delay lenses and apply the constraints on the mass density profile and stellar anisotropy distribution to the time-delay lenses (Birrer et al. 2020; Birrer and Treu 2021; Gomer et al. 2022). In particular, resolved spectroscopy can also be employed on non-time-delay lenses without bright and contaminating quasar images, either as prior constraints or through direct incorporation into a hierarchical analysis, to further improve the kinematic measurement precision.

Fig. 12 Forecast for \(H_{0}\) measurements in the near future with upcoming facilities. Left: spatially resolved kinematics measurements with JWST of a sample of 40 time-delay lenses enable a precision on \(H_{0}\) of 1.5% (figure adapted from Birrer and Treu 2021). Right: lensed supernovae with standardizable magnification measurements; an expected yield of \(\sim 144\) gravitationally lensed supernovae over the span of the 10-year LSST survey enables a precision on \(H_{0}\) of 1.5% (figure adapted from Birrer et al. 2022). Both approaches, stellar kinematics and standardized magnifications, provide independent observational constraints on the MST with different systematics

Standardizable magnifications with gravitationally lensed supernovae (glSNe) provide another promising avenue to constrain the mass density profiles and open a path to a percent-level measurement of \(H_{0}\) in the near future with the onset of LSST. Standardizable magnifications are able to constrain the absolute lensing magnification and hence the density profile, including the MST (e.g., Kolatt and Bartelmann 1998; Oguri and Kawano 2003; Foxley-Marrable et al. 2018; Birrer et al. 2022). Birrer et al. (2022) (Fig. 12, right panel) provide a forecast of constraining \(H_{0}\) with glSNe independently of stellar kinematics. They conclude that the standardizable nature of glSNe of type Ia enables a 1.5% \(H_{0}\) measurement with a 10-year LSST survey. This forecast is contingent on a near-optimal discovery and follow-up effort of glSNe. We refer to Suyu et al. (2024) for a detailed review and in-depth discussion of the discovery and expected number of glSNe, the challenges of following them up, and the caveats of micro-lensing.
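The principle can be sketched in a few lines: a standardized SN Ia light curve yields the unlensed apparent magnitude, so an observed image magnitude gives the absolute magnification, and because model magnifications rescale as \(\mu \rightarrow \mu /\lambda ^{2}\) under the MST, the absolute magnification pins down \(\lambda \). All numbers below are hypothetical:

```python
import numpy as np

# A lensed SN Ia is standardizable: the light-curve fit gives the unlensed
# apparent magnitude, so an observed image magnitude yields the absolute
# magnification. Under an MST, model magnifications rescale as
# mu -> mu / lambda**2, so the absolute magnification constrains lambda.
m_unlensed = 24.3   # standardized unlensed apparent magnitude (hypothetical)
m_observed = 22.0   # observed magnitude of one image (hypothetical)
mu_obs = 10 ** (-0.4 * (m_observed - m_unlensed))  # measured magnification

mu_model = 7.0      # magnification predicted by a lambda = 1 lens model
lam = np.sqrt(mu_model / mu_obs)
print(f"mu_obs = {mu_obs:.2f}, inferred lambda = {lam:.2f}")
```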

Another method is to make use of the statistical distribution of images, under the assumption of a known distribution of sources in the source plane, combined statistically over a large sample of time-delay lenses and relying purely on strong lensing data (Sonnenfeld 2021).

Yet another method to constrain the radial density profile is to use galaxy-galaxy weak gravitational lensing for a large sample of deflectors analogous to the strong gravitational lenses (Khadka et al. 2024).

Overall, the trade-off between analysing all (or most) of the lenses, most of them with limited data, and focusing on a few of the best (e.g., “golden”) lenses has yet to be explored in detail. Different approaches have advantages and disadvantages with regard to the precision and accuracy of the \(H_{0}\) measurements.

10.3 Methodology Improvements

With the expected wealth of data and the increase in the number of time-delay and non-time-delay lenses, the prospect of measuring \(H_{0}\) to 1% precision can become a reality. The employed methodology and assumptions must keep pace to meet the accuracy requirement. In the following we discuss methodology improvements and validations in the domains of galaxy density profiles (Sect. 10.3.1), assumptions in the interpretation of non-lensing constraints (Sect. 10.3.2), selection effects (Sect. 10.3.3), automatization (Sect. 10.3.4), and general aspects of methodology verification (Sect. 10.3.5). These sections are not meant to be complete, but to provide guidance on where focused effort is required in the near future.

10.3.1 Galaxy Density Profiles

The model currently employed to mitigate the MST effect by Birrer et al. (2020) is parameterized with a pure MST parameter \(\lambda \). This parameterization is foremost of mathematical nature and leaves the physical interpretation (e.g., Blum et al. 2020) ambiguous. A pure MST parameterization may in certain regimes even become unphysical, e.g., resulting in total mass profiles with negative density in the outskirts. Such a one-parameter extension to the more simple and rigid mass profiles considered previously may also not encompass the necessary flexibility beyond the pure MST that can affect kinematics observations (e.g., Birrer et al. 2020; Yıldırım et al. 2023), or deal with more generalized forms of lensing degeneracies, such as the SPT. To make progress, the full degeneracy inherent in gravitational lensing needs to be folded into flexible, but physically motivated, mass profile parameters. Such an approach was explored by Shajib et al. (2021) in constraining the extended mass density profiles of the SLACS galaxy-galaxy lenses, but has not yet been employed for time-delay cosmography. Quasar microlensing studies might also help to constrain the stellar mass-to-light ratio in massive elliptical galaxies. Ambitious measurements below the 10% level might additionally help to constrain the mass density profiles and would allow the focus to shift to the dark matter portion of the profile. We refer to Vernardos et al. (2024) for techniques and prospects of this methodology.
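The unphysical regime mentioned above is easy to exhibit: applying a pure MST, \(\kappa _{\lambda }(\theta ) = \lambda \kappa (\theta ) + (1-\lambda )\), to a profile whose convergence falls to zero at large radii yields a negative total convergence in the outskirts for \(\lambda > 1\). A minimal sketch with an isothermal profile:

```python
import numpy as np

theta = np.linspace(0.1, 30.0, 300)  # radius in units of the Einstein radius
kappa = 0.5 / theta                  # isothermal convergence, kappa -> 0 outward

for lam in (0.9, 1.0, 1.1):
    kappa_mst = lam * kappa + (1.0 - lam)  # pure MST of the convergence
    r_neg = theta[kappa_mst < 0]
    if r_neg.size:
        print(f"lambda = {lam}: kappa < 0 beyond theta ~ {r_neg[0]:.1f} theta_E")
    else:
        print(f"lambda = {lam}: kappa >= 0 everywhere on this grid")
```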

10.3.2 Non-lensing Constraints

Significant constraints on the MST, and on mass density profiles in general, are expected to come from non-lensing observables. These measurements, as well as their model interpretation, need to be tested to the percent level. For example, for the kinematic measurements, the impact of stellar template fitting needs to be further assessed and validated. For the interpretation of the measurements, de-projection assumptions, rotational structure, and stellar anisotropy need to be rigorously tested and assessed for covariant systematics at the population level. For upcoming magnification measurements with glSNe, micro- and milli-lensing effects need to be assessed and incorporated into the model self-consistently. Furthermore, more knowledge about the structure and size of the variable quasar accretion disc is required to determine the strength of the micro-lensing time-delay effect.

10.3.3 Selection Effects

The phenomenon of strong gravitational lensing is inherently a very specific selection within an otherwise mostly weak lensing field. Quantifying the selection effect of where and in what form strong lensing phenomena occur is going to be a crucial requirement to maintain accuracy while increasing precision on \(H_{0}\) in the years to come. The strong lensing phenomenon is impacted by both the line-of-sight structure and the main deflector. For the line of sight, the convergence either raises or lowers the efficiency of an equal-mass galaxy to act as a strong lensing deflector, and the cosmic shear changes the geometry of the caustic structure, making it more likely to have quadruply imaged sources. Similarly, for the main deflector, a more concentrated mass distribution, or favorable projections along the line of sight, leads to higher lensing efficiency, and more elliptical mass profiles (also in projection) lead to a more extended inner caustic region. Including the differential selection effects among different samples of lenses is required when combining information coming from differently selected populations. For example, quadruply lensed quasars are visible only when the source quasar lies within the diamond caustic of the lensing galaxy. This condition creates a Malmquist-like selection effect in the population of observed quadruply lensed quasars, biasing it toward deflectors with larger caustic areas (Baldwin and Schechter 2024).

Many of these effects are hard or near-impossible to quantify on a lens-by-lens basis. These selection effects need to be modeled and inferred at the population level, with a focus on ensuring that relative selection effects between different sub-populations are understood.

There are two distinct and complementary approaches to understanding and mitigating selection effects in the analysis. First, one can attempt to understand the selection from first principles, which can then be explicitly accounted for in the analysis procedure. This approach requires extensive simulations including all relevant aspects, starting from the a priori full-sky abundance and population of the phenomena, together with a reproducible selection function for the discovery channel and the follow-up decisions that determine whether a lens enters the sample. For example, Collett and Cunnington (2016) simulated a sample of double- and quadruple-image systems and, assuming reasonable thresholds on image separation and flux based on current lens monitoring campaigns, found that the typical density profile slopes of monitorable lenses are significantly shallower than those of the input ensemble. Second, one can empirically determine a relative selection function by comparing a set of observables of a sample of lenses to those of random galaxies or sight lines on the sky. Population-level deviations in these observables then indicate the level of selection bias in the sample. Observables may include, but are not restricted to, central velocity dispersion, stellar mass, size and morphology of the deflector, number of subhaloes and line-of-sight projected galaxies nearby, and redshift of the deflector, among others. Deviations from established scaling relations among the galaxy properties are then indications of selection biases. We refer to Sect. 6 for data and approaches to quantify line-of-sight effects. We also stress that these techniques rely on underlying priors and model assumptions on the population bias, and an explicit de-biasing is required to hierarchically constrain unknown selection effects (see e.g., Park et al. 2023). Currently, neither of the two approaches has been successfully applied.
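The second, empirical approach can be illustrated with a toy comparison of one observable: drawing hypothetical central velocity dispersions for a lens sample and a parent galaxy population and testing whether they are consistent with the same distribution. The numbers and the choice of a two-sample test are illustrative only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical central velocity dispersions (km/s): a lens sample vs. a
# parent population of similar galaxies, illustrating the empirical approach
# of comparing observables to detect relative selection effects.
parent = rng.normal(220.0, 40.0, size=5000)
lenses = rng.normal(245.0, 35.0, size=40)  # lensing favors massive deflectors

stat, pvalue = stats.ks_2samp(lenses, parent)
print(f"KS statistic = {stat:.2f}, p = {pvalue:.1e}")
# A significant offset flags a selection bias that must be modeled at the
# population level before combining the two samples.
```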

With the expected large number of lenses in the next few years, and the more uniform data sets of large and deep surveys, both the theoretical forward modeling and the empirical hierarchical modeling will become feasible. We also advocate for analyses to take into account the specific discovery channel of the lenses when performing population-level inferences. Understanding the selection function may or may not require effectively re-discovering the lenses in the analysis to guarantee a uniform and reproducible selection and analysis.

10.3.4 Automatization

Current state-of-the-art analyses of single lens systems take up more than a year of work, with the involvement of many people, as well as several hundred thousand CPU hours of computational cost. To utilize the upcoming larger lens samples and to achieve a high-precision \(H_{0}\) measurement, the time to analyse a single system has to be reduced significantly. Automated decision-making and model choices (e.g., Schmidt et al. 2023; Ertl et al. 2023), as well as GPU-assisted computations (e.g., Gu et al. 2022), hold promise in this regard. Moreover, it must be possible to repeat analyses multiple times with modifications, to test assumptions that are covariant among all lenses. The faster the entire analysis runs, the more explorations of potential systematics in the choices can be executed. The challenge in finding uniform analysis choices is that every lens differs from the others, and particularities have been noticed that require special attention on an individual basis. The analyses conducted need to be uniform in their choices and approaches such that the impact of assumptions can be tested at the ensemble level. Uniformity of analyses can also reduce human errors and set the analyses on quantifiable priors.

There is currently an effort to homogenize the analysis procedure, for both time-delay lenses (Shajib et al. 2019; Schmidt et al. 2023; Ertl et al. 2023) and non-time-delay lenses (Shajib et al. 2021), and further effort is underway. In parallel, alternative methodologies for modeling and posterior inference are being explored with machine learning techniques, which have the potential to speed up the analysis by orders of magnitude (e.g., Park et al. 2021).

10.3.5 Methodology Verification

Guaranteeing accuracy with ever more precise measurements is a challenge throughout the cosmological community. High-precision measurement of quantities relevant to fundamental physics is a relatively new field, and we dedicate a separate subsection to highlighting different strategies to verify the methodology and to perform to the quality standard necessary to maintain accuracy.

  • Realistic simulations offer a validation of a methodology on a known truth (see e.g., Xu et al. 2016; Tagore et al. 2018). It is important that the complexity of the simulations is realistic, to explore avenues of potential systematics and to gain a deep understanding of which data products are able to constrain which aspects of the model. Simulations eventually need to encompass all aspects of the analysis, including the selection effect and the entire line-of-sight structure within the full cosmological and astrophysical context.

  • Data modeling challenges, such as the Time-Delay Challenge (Liao et al. 2015) and the Time-Delay Lens Modeling Challenge (Ding et al. 2021) offer platforms to validate currently employed methodology on mock data sets, explore new ways of analyzing the data and can provide a transparent overview of the current state of the field.

  • Blind analyses prevent experimenter bias. The analysis should be guided by the assessment of uncertainties regardless of the anticipated result. Blind analyses have regularly been performed by the H0LiCOW and TDCOSMO collaborations.

  • Open-source accessibility of the raw data, processed data products, analysis software, and entire end-to-end analysis pipelines can best guarantee reproducibility, build community trust, and provide the community with access to alter and improve existing methodology.

10.4 Concluding Remarks

Time-delay cosmography has an exciting time ahead. The method has come a long way since its original proposal by Refsdal (1964). Current measurements of the Hubble constant with time-delay cosmography are at the few-percent level, enabled by detailed analyses and precise measurements of the different aspects of the analysis. With the expected increase in the lensing sample and the advances in instrumentation, a percent-precision measurement of \(H_{0}\) comes within reach.

Measuring the Hubble constant to percent-level precision is a challenging endeavor, regardless of the cosmological probe. In this manuscript, we aimed to provide a detailed account of the methodology and measurements, to provide guidance toward achieving a precise and accurate measurement of \(H_{0}\) at the one-percent level. We emphasized the challenges and systematics in the different components of the analysis and strategies to mitigate them. Above all, in Carl Sagan's words: “Extraordinary claims require extraordinary evidence”.