1 Introduction

Microlensing is a phenomenon that allows us to probe very small spatial and mass scales in the Universe that are otherwise inaccessible to current and next-generation instruments. The underlying physical principles of gravitational light deflection are the same as in the case of strong lensing by galaxies and clusters; the main difference is that these deflections happen on such small angular scales (\(\sim 10^{-6}\) arcsec) that the resulting multiple images of a source are unresolved. The only remaining quantity with an observable effect is the total magnification, i.e. the total flux received from all the unresolved microimages (the centroid of the images can be affected too, but this requires higher sensitivity to detect). A demonstration of the phenomenon is shown in Fig. 1. Given our solid theoretical understanding of microlensing, we can use it as a tool that is, in many ways, more powerful than any telescope we could build in the next several decades. With this tool, we can learn both about the background source and the lens itself (the massive galaxy along the line of sight). In this case, the type of sources that we examine are quasars, although microlensing can also affect lensed supernovae (see Suyu et al. 2024) and other more ‘exotic’ sources like Fast Radio Bursts (Lewis 2020), Gamma-Ray Bursts (Mao 1993), and gravitational waves (Diego et al. 2019). In this review, we present the state-of-the-art of the field of quasar microlensing and expand on previous works by Schneider et al. (1992), Schneider et al. (2006), and Schmidt and Wambsganss (2010).

Fig. 1

Illustration of the effect of microlensing on a source, in this case, the coat-of-arms of Bern, Switzerland, the seat of the International Space Science Institute, where the “Strong Gravitational Lensing” workshop that led to the writing of this review was held on 18-22 July 2022. Right: A collection of microlenses on the image plane (the same as the one shown in Fig. 4) produces magnified and distorted micro-images of the source. Left: The corresponding magnification pattern is shown on the source plane (the same as the one shown in Fig. 10). For an interactive version of this figure visit: https://austinpeel.github.io/gladius-microlensing/

The first discovery of a gravitational lens, the doubly imaged quasar Q 0957+561 (Walsh et al. 1979), was followed by the pioneering works of Chang and Refsdal (1979), Gott (1981), Young (1981), and Chang and Refsdal (1984) who suggested that light from the multiple images could be further affected by the presence of stellar mass objects near the line of sight. Indeed, microlensing was confirmed a decade later in the quadruply lensed Q 2237+0305 (Irwin et al. 1989), also known as Huchra’s lens (Huchra et al. 1985) or “The Einstein Cross”, shown in Fig. 2. These, and the studies of Paczynski (1986) and Kayser et al. (1986), set the foundations of the field of quasar microlensing.

Fig. 2

Multiply imaged quasar Q 2237+0305. Left: the lensed quasar seen through the central regions of a foreground spiral galaxy-lens (credits: J. Rhoads, S. Malhotra, I. Dell’Antonio, NOAO/WIYN/NSF). Right: zoom on the bulge of the galaxy-lens seen in the centre, surrounded by multiple images of the background quasar produced by gravitational lensing (credits: NASA, ESA, and STScI)

Although throughout this review we refer to the “quasar” as if it were a single, well-defined object, a quasar, or Active Galactic Nucleus (AGN, we use both terms interchangeably), is a composite of several different structures around a Super-Massive Black Hole (SMBH). Each of these regions produces its own signature radiation and together they sum up to give an astrophysical object that is bright across the whole electromagnetic spectrum. Through decades of dedicated work (Netzer 2015; Padovani et al. 2017), a general picture of the various quasar regions has been inferred. However, given the small physical scales around the black hole (∼ ld) and the large distances to these objects (\(z \gtrsim 0.5\)), the dimensions and exact geometry of these regions remain largely unresolved by any current or planned telescope. With microlensing, we can measure and place direct constraints on the sizes and locations of the emitting regions of a quasar at a cosmological distance on scales from micro- to nano-arcsec. In Fig. 3 we summarize our current understanding of quasar structure and highlight the microlensing effect on its different components.

Fig. 3

Top: Schematic view of the structure of a quasar with the main regions that can be probed by microlensing color-coded as follows, starting from the smallest/innermost (left to right): X-ray corona (purple), accretion disc (blue to red), Broad Line Region (lighter green), dust torus (red), and Narrow Line Emission (darker green). For completeness, the black hole and the jet are also shown. Different characteristic lengths are shown in logarithmic scale at the bottom of this panel: the gravitational radius, \(r_{\mathrm {g}}\), for a \(7 \times 10^{8}\) \(\text{M}_{\odot}\) black hole (see Eq. (35)), the corresponding time it takes to cross this length, \(t_{\mathrm{cross}}\), for an effective velocity of 500 km/s (see Sect. 3.4.1), and the Einstein radius on the source plane, \(R_{\mathrm{E}}\), for lens and quasar located at redshifts of 0.5 and 2 respectively (see Eq. (7)). The latter is given for two different masses of the microlenses. Inspired by Fig. 1 of Moustakas et al. (2019). Middle: Composite UV-optical spectrum of an AGN with a few main emission lines identified. Its different components have been color-coded to match the corresponding regions shown in the top panel. Bottom: Same as the middle panel but for the X-rays

What we can learn about the composition of galaxies from microlensing is as important as it is inaccessible by any other means. While every astronomer knows that a galaxy is “a gravitationally bound system of stars, stellar remnants, interstellar gas, dust, and dark matter”, establishing the relative proportions of each component is one of the great challenges of extragalactic astronomy. In particular, the relative contributions of dark matter and stellar mass are subject to much disagreement and uncertainty (see Sect. 4 in Shajib et al. 2024). For example, there are important works that measure stellar mass-to-light ratios for elliptical galaxies (see Cappellari 2016, and references therein), but buried within these papers the authors invariably caution that uncertainties in the faint end of the stellar mass function, where the stars are practically invisible, render their results uncertain by a factor of two. Because microlensing is sensitive only to mass, it can be used to determine the amount of mass in individual stars, including stellar remnants, brown dwarfs, and red dwarfs that are too faint to produce any photometric or spectroscopic signatures. In other words, microlensing is an independent and direct method to determine the graininess of the gravitational potential.

In order to familiarise the reader with the data and methods used in microlensing studies, in the remainder of this section we summarise some general theoretical and observational properties. In Sect. 2 we set the foundations of the theory of microlensing as well as our best current theoretical understanding of the structure of quasar emission regions. Section 3 presents the different approaches to analyze how microlensing manifests across time and wavelength. The application of these methods to real data leads to measurements of quasar structure and lens galaxy mass, which are reviewed in Sects. 4 and 5 respectively. We conclude in Sect. 6 with the promising future of quasar microlensing, which will be revolutionized by the upcoming all-sky surveys of this decade, and the challenges that need to be addressed in order to deliver groundbreaking scientific results.

1.1 General Theoretical Properties and Facts

Microlensing is caused by compact astrophysical objects with dimensions much smaller than their Einstein radii, \(\theta _{\mathrm{E}}\) (i.e. stars and stellar remnants), as opposed to smoothly varying mass distributions over large scales (e.g. galaxies and clusters). In fact, this radius, which is typically of the order of \(10^{-6}\) arcsec for lensed quasars (hence the prefix “micro”, see also Kayser 1992), is critical because it determines the scale length of the phenomenon. The rule to remember is that the smaller (larger) a quasar emitting region with respect to \(\theta _{\mathrm{E}}\), the stronger (weaker) the amplitude of microlensing.

The optical depth to microlensing is dramatically increased along the line of sight to a quasar that is already strongly lensed by a foreground galaxy (see Fig. 2). Hence, we almost always expect a population of compact objects acting as microlenses and in very few cases single, isolated point masses. Such populations create caustic features of very high magnification, often superimposed in a non-linear fashion, but because flux has to be conserved (e.g. compared to a smooth mass sheet with the same total mass) there need to be large swathes of de-magnified regions on the source plane as well. Depending on where the source lies within this caustic pattern it can be magnified or de-magnified.

Due to the relative velocity of observer, lens, microlenses, and quasar, which is of the order of 1 \(\theta _{\mathrm{E}}\) per decade (Mosquera and Kochanek 2011), we expect to see significant variations (∼1 mag) of the microlensing magnification as the quasar emitting regions cross caustics and areas of demagnification. Moreover, because the quasar region(s) emitting at any given wavelength can vary in size, there is a chromatic dependence of this (de)magnification. It is important to note that quasars are intrinsically variable objects, therefore, in order to obtain the microlensing signal (in a single epoch or as a function of time) one needs to cleanly subtract this intrinsic variability after taking into account the time delays per multiple image due to the presence of the lensing galaxy (see Birrer et al. 2024).

1.2 Observational Considerations

From an observer’s point of view, one may wonder what kind of data are needed in order to use microlensing for one of the above-mentioned science applications. The size of the source with respect to \(\theta _{\mathrm{E}}\), together with the dynamic nature of the phenomenon, may guide the answer to this question. The wavelength range to consider covers almost the whole electromagnetic spectrum, from the innermost X-ray corona and accretion disc to the Broad Line Region (BLR, see Fig. 3), while the time scales when large variations are expected are in general shorter for the most compact regions. When the quasar is crossing a caustic, the magnification increases rapidly with time and therefore nightly cadence is required to capture in detail how the different quasar regions respond. For a “normally” microlensed quasar, i.e. one not undergoing a high magnification event, the daily cadence requirement can be relaxed and weekly observations should suffice for long-term microlensing signals. In practice, the existing observational resources are quite restricted both in wavelength coverage and cadence (e.g. season gaps).

Over the years, widely different observations have been performed to capture the various manifestations of microlensing variability: from decade-long monitoring to multiwavelength snapshots and spectra. The limited resources have resulted in two main strategies, each with its own advantages and drawbacks: either long-term weekly monitoring but, with very few exceptions, in a single band, or multi-wavelength snapshots at a given moment in time. For example, light curves provide more data points to constrain quasar structure and lens mass but require an additional model component (the effective velocity) and a good measurement of the time delay, while snapshots can constrain the accretion disc temperature profile but are prone to other systematic biases and limitations, like contamination from massive substructures (e.g. anomalous flux ratios, see Vegetti et al. 2024) and application to image pairs with negligible, or known, time delays. Although very information-rich on quasar structure, data for high magnification events have so far been scarce due to their rarity and quick evolution. Requiring almost daily cadence, these have been almost exclusively observed in a single system (the Einstein Cross).

2 Background and Theory

In this section, we present the theoretical background that is specific to microlensing, as opposed to strong lensing in general. For an overview of the basic principles and theory of lensing we refer the reader to Saha et al. (2024).

2.1 Light Deflection

To study the lensing effect of an ensemble of compact objects in the lens plane on a background quasar, we start again by setting up the lens equation (Saha et al. 2024), but this time for many objects:

$$ \boldsymbol{\beta }= \boldsymbol{\theta }- \frac{D_{\mathrm{ds}}}{D_{\mathrm{d}}D_{\mathrm{s}}} \sum _{i=1}^{N} \hat{\boldsymbol{\alpha}}_{i} (\boldsymbol{\theta }), $$
(1)

where we are summing over the deflection angles \(\hat{\boldsymbol{\alpha}}_{i}\) due to \(N\) compact objects with masses \(M_{i}\) situated at \(\boldsymbol{\theta }_{i}\) in the lens plane.

In addition to the surface mass density at the quasar image position due to compact objects within a radius \(\theta \):

$$ \kappa _{*}=\Sigma _{\ast }(< \theta )/\Sigma _{\mathrm{cr}}, $$
(2)

with \(\Sigma _{\mathrm{cr}}=\frac{c^{2}}{4 \pi G} \frac{D_{\mathrm{s}}}{D_{\mathrm{ds}}D_{\mathrm{d}}}\) (Saha et al. 2024), we need to include the effect of a constant surface density of matter \(\kappa _{\mathrm{c}}\) in the lens plane to account for a smooth dark matter component (also expressed in units of \(\Sigma _{\mathrm{cr}}\)). The total surface mass density is then:

$$ \kappa = \kappa _{*}+ \kappa _{\mathrm{c}}. $$
(3)

Finally, we need to consider the effect of shear due to the tidal field of the lensing galaxy at the position of the quasar image (Saha et al. 2024). Using Einstein’s formula for the deflection (Saha et al. 2024) the lens equation due to \(N\) compact masses \(M_{i}\) in the presence of a mass sheet \(\kappa _{\mathrm{c}}\) and shear \(\gamma =\sqrt{\gamma _{1}^{2}+\gamma _{2}^{2}}\) becomes:

$$ \boldsymbol{\beta }= \left ( \textstyle\begin{array}{c@{\quad}c} 1 - \kappa _{\mathrm{c}}-\gamma _{1} & -\gamma _{2} \\ -\gamma _{2} & 1- \kappa _{\mathrm{c}}+ \gamma _{1} \end{array}\displaystyle \right ) \boldsymbol{\theta }+ \frac{4 G}{c^{2}} \frac{D_{\mathrm{ds}}}{D_{\mathrm{d}}D_{\mathrm{s}}} \sum _{i=1}^{N} M_{i} \frac{\boldsymbol{\theta }_{i} - \boldsymbol{\theta }}{|\boldsymbol{\theta }_{i} - \boldsymbol{\theta }|^{2}}. $$
(4)

For a homogeneous mass distribution with surface density \(\kappa _{*}\), the second term becomes \(\propto -\kappa _{*}\boldsymbol{\theta }\). This equation can also be written as:

$$ \boldsymbol{\beta }= \left ( \textstyle\begin{array}{c@{\quad}c} 1 - \kappa _{\mathrm{c}}-\gamma _{1} & -\gamma _{2} \\ -\gamma _{2} & 1- \kappa _{\mathrm{c}}+ \gamma _{1} \end{array}\displaystyle \right ) \boldsymbol{\theta }+ \sum _{i=1}^{N} {\theta _{\mathrm{E}}}_{i}^{2} \frac{ \boldsymbol{\theta }_{i} - \boldsymbol{\theta }}{|\boldsymbol{\theta }_{i} - \boldsymbol{\theta }|^{2}} , $$
(5)

where we used the Einstein radii \({\theta _{\mathrm{E}}}_{i}\) of the compact objects in the lens plane (Saha et al. 2024). This is a fundamental quantity for microlensing as it defines the scale length of the phenomenon. Thus, we define it here again in the image plane, as in the equation above:

$$ \theta _{\mathrm{E}}= \sqrt{ \frac{4GM}{c^{2}} \frac{D_{ds}}{D_{d}D_{s}}} $$
(6)

and its projection on the source plane:

$$ R_{\mathrm{E}}= \sqrt{ \frac{4GM}{c^{2}} \frac{D_{s}D_{ds}}{D_{d}}}, $$
(7)

in units of length.
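As a quick numerical check of Eqs. (6) and (7), the following minimal Python sketch evaluates both radii for a one-solar-mass microlens. The function name `einstein_radius` and the three angular diameter distances are assumptions chosen only for illustration (they are round numbers, not values taken from this review); physical constants are rounded.

```python
import math

G = 6.674e-11          # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8            # speed of light [m/s]
M_SUN = 1.989e30       # solar mass [kg]
GPC = 3.086e25         # one gigaparsec [m]
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def einstein_radius(mass_msun, d_d, d_s, d_ds):
    """Return (theta_E [rad], R_E [m]) following Eqs. (6) and (7).

    d_d, d_s, d_ds are angular diameter distances in metres.
    """
    mass_term = 4.0 * G * mass_msun * M_SUN / C**2
    theta_e = math.sqrt(mass_term * d_ds / (d_d * d_s))
    r_e = math.sqrt(mass_term * d_s * d_ds / d_d)
    return theta_e, r_e


# Illustrative (assumed) distances, not taken from the text.
D_D, D_S, D_DS = 1.3 * GPC, 1.8 * GPC, 1.1 * GPC
theta_e, r_e = einstein_radius(1.0, D_D, D_S, D_DS)
theta_e_arcsec = theta_e * RAD_TO_ARCSEC
print(f"theta_E = {theta_e_arcsec:.2e} arcsec, R_E = {r_e:.2e} m")
```

With these inputs the angular Einstein radius comes out at the micro-arcsecond level, and \(R_{\mathrm{E}}\) equals \(\theta _{\mathrm{E}}\) multiplied by the source distance, as expected from comparing the two equations.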

How can we distribute \(N\) masses such that a certain \(\kappa _{*}\) is achieved? From the homogeneous mass case \(\boldsymbol{\beta }\propto - \kappa _{*}\boldsymbol{\theta }\) (Eq. (4)) and \(\boldsymbol{\beta }\propto - ({\sum _{i=1}^{N} {\theta _{\mathrm{E}}}_{i}^{2}}/{ \theta ^{2}})~\boldsymbol{\theta }\) in Eq. (5), it can be derived that the normalized surface density in compact objects \(\kappa _{*}\) equals the summed “Einstein circles” divided by the area in which the point masses are distributed:

$$ \kappa _{*}= \frac{\pi {\theta _{\mathrm{E}}}^{2} }{\pi \theta ^{2}} \sum _{i=1}^{N} m_{i} = \frac{\pi \sum _{i=1}^{N} m_{i}}{A_{enc}} $$
(8)

with \(m_{i} = M_{i}/\text{M}_{\odot}\). In the last term, the area \(A_{enc}\), in which the \(N\) point masses are distributed, has been normalized by the square of the Einstein radius for one solar mass, \(\theta _{\mathrm{E}}^{2}\). To create a mass ensemble of compact objects with surface density \(\kappa _{*}\), one needs to distribute point masses whose masses \(m_{i}\) sum to:

$$ \sum _{i=1}^{N} m_{i} = \frac{\kappa _{*}A_{enc}}{\pi} $$
(9)

in the area \(A_{enc}\).
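Eq. (9) translates directly into a recipe for building a microlens field. The sketch below is one possible way to do it, assuming equal-mass lenses placed uniformly at random in a disc; the helper names `draw_microlenses` and `kappa_star_of` are hypothetical. The example uses a radius of 11.2 Einstein radii, for which \(\kappa _{*}= 0.4\) gives 50 unit-mass lenses.

```python
import math
import random


def draw_microlenses(kappa_star, radius, mass=1.0, seed=0):
    """Place equal-mass point lenses uniformly in a disc.

    radius is in units of the solar-mass Einstein radius, mass in solar masses.
    The number of lenses follows from Eq. (9) with A_enc = pi * radius**2.
    """
    total_mass = kappa_star * radius**2          # kappa_* A_enc / pi
    n = round(total_mass / mass)
    rng = random.Random(seed)
    lenses = []
    for _ in range(n):
        r = radius * math.sqrt(rng.random())     # sqrt gives uniform surface density
        phi = 2.0 * math.pi * rng.random()
        lenses.append((complex(r * math.cos(phi), r * math.sin(phi)), mass))
    return lenses


def kappa_star_of(lenses, radius):
    """Eq. (8): summed Einstein-circle areas divided by the enclosing area."""
    return sum(m for _, m in lenses) / radius**2


lenses = draw_microlenses(kappa_star=0.4, radius=11.2)
print(len(lenses), kappa_star_of(lenses, 11.2))
```

Because the number of lenses is rounded to an integer, the realised \(\kappa _{*}\) differs slightly from the requested one; a range of masses could be drawn instead, provided their sum satisfies Eq. (9).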

It is customary to also rewrite the microlensing mapping from Eq. (5) in a normalized way by dividing by \((1-\kappa _{\mathrm{c}})/\sqrt{|1-\kappa _{\mathrm{c}}|}\) (for \(\kappa _{\mathrm{c}}\neq 1\)). Then, one can define the normalized lens-plane and source-plane coordinates:

$$ \boldsymbol{z} = \frac{\boldsymbol{\theta }}{\theta _{\mathrm{E}}/ \sqrt{|1-\kappa _{\mathrm{c}}|} } \quad \mathrm{and} \quad \boldsymbol{\zeta} = \frac{1}{1-\kappa _{\mathrm{c}}}~\frac{\boldsymbol{\beta }}{\theta _{\mathrm{E}}/\sqrt{|1-\kappa _{\mathrm{c}}|}}. $$
(10)

The normalized lens equation follows as (Paczynski 1986; Kayser et al. 1986):

$$ \boldsymbol{\zeta} = \left ( \textstyle\begin{array}{c@{\quad}c} 1 - g_{1} & -g_{2} \\ -g_{2} & 1 + g_{1} \end{array}\displaystyle \right ) \boldsymbol{z} + \mathit{sign}\left ( \frac{\kappa _{*}}{1-\kappa _{\mathrm{c}}}\right ) \sum _{i=1}^{N} m_{i} \frac{ {\boldsymbol{z}}_{i} - \boldsymbol{z}}{|\boldsymbol{z}_{i} - \boldsymbol{z}|^{2}} , $$
(11)

where \(g_{1}\) and \(g_{2}\) are called the components of the reduced shear (see Saha et al. 2024). Employing the signum function is due to Paczynski (1986), who used it because it shows that only the normalized surface mass density \(\kappa _{*}/(1-\kappa _{\mathrm{c}})\) and the reduced shear \(g\) are needed to describe a quasar microlensing situation (note that \(\mathit{sign}(\kappa _{*})>0\) always). We note that these two quantities, normalized surface mass density and reduced shear, are also known as “effective” convergence and shear and are further discussed in Sect. 2.8.

Often in the literature, the compact-object surface density \(\kappa _{*}\) and the smooth surface density \(\kappa _{\mathrm{c}}\) are both given in addition to the shear \(\gamma \). It should be noted, however, that the above described degeneracy between the three quantities holds (e.g. Saha 2000). In the case of \(\kappa _{\mathrm{c}}>1\) (sometimes called over-focussing) the deflection due to the individual masses is formally counted as repulsive. However, as Paczynski put it, this is just a “trick” to make Eq. (11) simpler. In the following we shall assume \(\kappa _{\mathrm{c}}<1\) for simplicity because this is the most common scenario for observations (but see also Dobler et al. 2007).
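The degeneracy can be made concrete with a few lines of Python. This is a minimal sketch that assumes the normalized quantities are obtained by dividing \(\kappa _{*}\) and \(\gamma \) by \(1-\kappa _{\mathrm{c}}\), as the normalization of Eq. (5) suggests; the function name `effective_parameters` is hypothetical.

```python
def effective_parameters(kappa_star, kappa_c, gamma):
    """Rescale (kappa_*, gamma) by 1 / (1 - kappa_c)."""
    if kappa_c == 1.0:
        raise ValueError("normalization is undefined for kappa_c = 1")
    scale = 1.0 - kappa_c
    return kappa_star / scale, gamma / scale


# Two different splits between compact and smooth matter that lead to
# the same normalized microlensing problem.
a = effective_parameters(kappa_star=0.2, kappa_c=0.5, gamma=0.25)
b = effective_parameters(kappa_star=0.4, kappa_c=0.0, gamma=0.5)
print(a, b)
```

Both parameter sets map onto the same pair of normalized values, so they produce the same magnification pattern up to the rescaling of coordinates in Eq. (10).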

2.2 Magnification for Ensembles of Point Masses

The magnification of a microlensed image is denoted by the symbol \(\mu \). An image can be magnified, \(|\mu |>1\), or demagnified, \(|\mu |<1\). In addition the image can be mirror-inverted, which corresponds to negative magnification \(\mu <0\). The sign of the magnification \(\mu \) is also called parity (see also Saha et al. 2024).

For the lens mapping described by the lens equation introduced above, the magnification of a point source is given by the inverse of the Jacobi determinant (see Saha et al. 2024):

$$ A = {\mathrm{det}}~J = {\mathrm{det}}~\left ( \frac{\partial \zeta}{\partial \boldsymbol{z}} \right ). $$
(12)

The total magnification can be found by taking the sum:

$$ \mu _{\mathrm{tot}} = \sum _{j=1}^{M} \frac{1}{|A|_{\boldsymbol{z}=\boldsymbol{z}_{j}}}, $$
(13)

over all \(M\) images at locations \(\boldsymbol{z}_{j}\), or \(\boldsymbol{\theta}_{j}\), on the lens plane. Calculating the magnification for extended sources, such as a quasar accretion disc, can proceed by splitting up the source into sub-sources (e.g. Witt and Mao 1994). However, in Sect. 2.7 it is shown how to calculate the magnification using the ray-shooting method. This latter technique is the most widely used approach.

As an example, the magnification \(\mu \) for a constant sheet of matter \(\kappa _{\mathrm{c}}>0\) with shear \(\gamma \) is given by (see also Saha et al. 2024):

$$ \mu = \frac{1}{(1-\kappa _{\mathrm{c}})^{2}-\gamma ^{2}}. $$
(14)

We note that this is often referred to as the “macro-magnification” because \(\kappa _{\mathrm{c}}\) and \(\gamma \) can be attributed to the lensing galaxy’s macroscopic potential at the location of the multiple images. It can be seen from this equation that for positive parity images the magnification is always \(|\mu |>1\). For negative parity the image can be magnified, \(|\mu |>1\), or demagnified, \(|\mu |<1\).
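Eq. (14) and the parity statements above can be checked numerically. The following is a small sketch with a hypothetical function name and illustrative parameter values that are not taken from any particular lens system.

```python
def macro_magnification(kappa, gamma):
    """Signed magnification of Eq. (14) for a smooth sheet plus shear."""
    det = (1.0 - kappa) ** 2 - gamma**2
    if det == 0.0:
        raise ZeroDivisionError("image lies on a critical curve")
    return 1.0 / det


mu_min = macro_magnification(0.4, 0.4)     # positive parity
mu_sad = macro_magnification(0.6, 0.6)     # negative parity, magnified
mu_dim = macro_magnification(0.5, 1.5)     # negative parity, demagnified
print(mu_min, mu_sad, mu_dim)
```

The three cases give magnifications of about 5, about -5, and -0.5, illustrating that only negative-parity images can have \(|\mu |<1\) when \(0 \leq \kappa _{\mathrm{c}}<1\).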

2.3 Complex Notation for Microlensing

The normalized lens equation Eq. (11) describes a mapping from the lens plane \(\boldsymbol{z}\) to the source plane \(\boldsymbol{\zeta}\). Starting with Bourassa et al. (1973), Bourassa and Kantowski (1975), complex numbers were used to describe this mapping. While they partly treated real and imaginary parts separately, Witt (1990) introduced a consistent complex notation.

Normalized positions in the lens plane are represented by \(z=x + i y\) and in the source plane by \(\zeta =\xi + i \eta \). The shear can be described as a complex quantity as well, \(g=g_{1}+i g_{2} = |g| e^{2 i \varphi}\). Since for complex numbers \(z/|z|^{2}=1/\bar{z}\) (the bar indicates the complex conjugate), the normalized lens equation Eq. (11) can be re-written as:

$$ \zeta = z - g \bar{z} + \sum _{i=1}^{N} \frac{m_{i}}{\bar{z}_{i}- \bar{z}}. $$
(15)

The complex formulation has the advantage that it can be well treated by a computer language that can deal with complex numbers (e.g. Fortran, C, python). For example, the magnification of an image at \(z\) can be calculated using the inverse of the complex version of Eq. (12) (Witt 1990):

$$\begin{aligned} {\mathrm{det}}~J &= \left (\frac{\partial \zeta}{\partial z} \right )^{2} - \frac{\partial \zeta}{\partial \bar{z}} \overline{\frac{\partial \zeta}{\partial \bar{z}}} \\ & = 1 - \left (-g + \sum _{i=1}^{N} \frac{m_{i}}{(\bar{z}_{i} - \bar{z})^{2}}\right ) \left (-\bar{g} + \sum _{i=1}^{N} \frac{m_{i}}{({z}_{i} - {z})^{2}}\right ). \end{aligned}$$
(16)

In the last step, the derivatives of Eq. (15) were calculated with respect to \(z\) and \(\bar{z}\), treated as independent variables (Wirtinger derivatives), rather than the more usual pair \(x\), \(y\). These derivatives have the same properties as “normal” derivatives (i.e. linearity, product rule, chain rule, conjugation), but can be used more efficiently with the complex lens equation.
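To illustrate how compactly Eqs. (15) and (16) map onto a language with native complex numbers, here is a minimal Python sketch. The function names and the representation of the lenses as a list of `(z_i, m_i)` pairs are assumptions made for this example only.

```python
def lens_map(z, lenses, g=0j):
    """Eq. (15): source position zeta for an image-plane position z.

    lenses is a list of (z_i, m_i) pairs in normalized units.
    """
    zbar = z.conjugate()
    return z - g * zbar + sum(m / (zi.conjugate() - zbar) for zi, m in lenses)


def det_jacobian(z, lenses, g=0j):
    """Eq. (16): det J = 1 - |d zeta / d zbar|^2."""
    zbar = z.conjugate()
    dzeta_dzbar = -g + sum(m / (zi.conjugate() - zbar) ** 2 for zi, m in lenses)
    return 1.0 - abs(dzeta_dzbar) ** 2


def magnification(z, lenses, g=0j):
    """Signed point-source magnification of the image at z."""
    return 1.0 / det_jacobian(z, lenses, g)


# Isolated unit-mass lens at the origin, image at two Einstein radii.
single = [(0j, 1.0)]
zeta = lens_map(2 + 0j, single)
mu = magnification(2 + 0j, single)
print(zeta, mu)
```

For this single-lens case the image at \(z=2\) corresponds to a source at \(\zeta =1.5\) with magnification \(16/15\), in agreement with the classical point-lens formulae.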

2.4 Finding the Micro-Images

The micro-images of a point source at a given position produced by a large number of point mass lenses can be found using explicit search algorithms in the lens plane (e.g. Paczynski 1986; Saha and Williams 2011). This can be done very efficiently using the complex notation.

The maximal number of micro-images that can be observed of a point source at \(\zeta \) has been studied thoroughly (Witt 1990, 1991; Petters 1992; Petters et al. 2001). Witt (1990) has shown that all images of a point source at \(\zeta \) can be calculated by complex conjugation of Eq. (15), by multiplying with \(\prod _{i=1}^{N} (z_{i} - z)\), and by re-inserting Eq. (15) for \(z\). The result is a large polynomial of degree \((N+1)^{2}\) that only depends on \(\bar{z}\). Taking the conjugate of the complex roots of this polynomial yields all possible \((N+1)^{2}\) solutions – for \(g=0\) the maximal number of images is \(N^{2}+1\). Not all of those solutions, however, also satisfy the real-valued lens-equation Eq. (11) and a check needs to be performed. This polynomial-technique to determine the solutions of the normalized lens equation is very successfully used in the field of planetary microlensing, where \(N\) is only a few (binary, triple system etc, e.g. Bozza 2010).

Another elegant procedure to find all images corresponding to a point source is due to Witt (1993) (see also Lewis et al. 1993, for a different implementation of the same idea). Witt (1993) shows that all images of a point source can be found by following a straight source track from far away from the ensemble of point masses to the position of interest \(\zeta \):

  • For a source position \(\zeta _{0}\) very far away from the ensemble of \(N\) point masses there exists one image close to the source and \(N\) images close to the masses \(m_{i}\).

  • To find these images one first needs to solve the lens equation iteratively to find the far image (except for \(|g|>1\), see Witt 1993):

    $$ z_{[k+1]} = \zeta _{0} + g\,\bar{z}_{[k]} - \sum _{i=1}^{N} \frac{m_{i}}{\bar{z}_{i} - \bar{z}_{[k]}}, $$
    (17)

    where \(k\) is the iteration number.

  • The other images near the stars are found from:

    $$ z \approx z_{i} - m_{i}/\left [ \bar{\zeta}_{0} - \bar{z}_{i} + \bar{g}\,z_{i} - \sum _{j=1,j\neq i}^{N} \frac{m_{j}}{z_{j}-z_{i}} \right ]. $$
    (18)
  • After all the initial image positions are found, the paper goes on to show how to find all images of a straight line, and thus effectively all images for all source positions \(\zeta \) on this line.
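The first two steps of this procedure can be sketched in Python as follows. This is an illustration of Eqs. (17) and (18) only, not a reproduction of the track-following algorithm of Witt (1993); the function names and the two-lens test configuration are assumptions. The code conjugates \(g\) in the near-image estimate, as follows from conjugating Eq. (15); for real \(g\) this makes no difference.

```python
def lens_map(z, lenses, g=0j):
    """Eq. (15), used here only to check the recovered images."""
    zbar = z.conjugate()
    return z - g * zbar + sum(m / (zi.conjugate() - zbar) for zi, m in lenses)


def far_image(zeta0, lenses, g=0j, n_iter=50):
    """Fixed-point iteration of Eq. (17), started at the source position."""
    z = zeta0
    for _ in range(n_iter):
        zbar = z.conjugate()
        z = zeta0 + g * zbar - sum(m / (zi.conjugate() - zbar) for zi, m in lenses)
    return z


def near_images(zeta0, lenses, g=0j):
    """First-order estimate of Eq. (18) for the image next to each lens."""
    images = []
    for i, (zi, mi) in enumerate(lenses):
        others = sum(mj / (zj - zi) for j, (zj, mj) in enumerate(lenses) if j != i)
        denom = zeta0.conjugate() - zi.conjugate() + g.conjugate() * zi - others
        images.append(zi - mi / denom)
    return images


# Two unit-mass lenses and a source far away from both (illustrative values).
lenses = [(1 + 0j, 1.0), (-1 + 0j, 1.0)]
zeta0 = 20 + 0j
z_far = far_image(zeta0, lenses)
z_near = near_images(zeta0, lenses)
print(z_far, z_near)
```

The far image satisfies the lens equation to numerical precision, while the near images are approximate starting points that could be refined, for instance with a few Newton steps.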

An example is given in the left panel of Fig. 4, where a straight line on the source plane is lensed into a main image of a “wiggly” line and many small “loops” around the microlenses. All images consist of single or multiply imaged parts of the straight line on the source plane. Depending on the configuration, loops can sometimes combine to form bigger loops and the main line can also be connected to stars, as can be seen on the figure.

Fig. 4

Left: Example images of a straight line lensed by an ensemble of 50 point masses (star symbols) distributed randomly within a circle of radius 11.2 \(\theta _{\mathrm{E}}\) without shear. Right: corresponding source plane and caustic curves. Each image of the line can consist of multiply imaged parts, as indicated by the colored segments. The main image line passes through 2 stars. The image plane is covered by regions of negative (orange) and positive (blue) parity separated by the critical lines. There is at least one image close to every star, but those images with negative parity (micro-saddles) are usually highly demagnified and can get arbitrarily close to the point mass positions when the source is far away. The small gaps in the image lines are numerical artifacts near the critical lines (see also Sect. 2.5). The micro-images for the particular position \(\boldsymbol{\zeta}=(-0.35, 3.20)\) on the source line (right panel) are marked with blue crosses. Approx. 98 per cent of the total magnification of the source, \(\mu =6.7\), is due to the images on the main track (see also Fig. 6). For these images, the symbols are scaled to the corresponding magnification (the other symbol sizes remain fixed for clarity). Coordinates in both panels are in units of the Einstein radius (the dashed rectangle corresponds to Fig. 6 and the labelled ones - A,B,C - as well as the corresponding source plane locations on the right panel, to Fig. 7)

The total magnification is actually dominated by only a few images (e.g. see Rauch et al. 1992; Wambsganss 1992; Schechter and Wambsganss 2002; Granot et al. 2003; Saha and Williams 2011). This is apparent in the left panel of Fig. 4 and illustrated further in Fig. 6. The overwhelming majority of the micro-images are faint negative-parity images (saddle-points of the arrival time) near each star. There are formally also images coincident with the stars; these are maxima of the arrival time, and are so demagnified that they are negligible. The important images are a small number of micro-minima and an equal number of micro-saddle points, whose properties we describe in Sect. 2.6.

2.5 Caustics of the Deflectors

In the case of quasar microlensing, the combination of many point masses and a shear creates a complicated pattern of critical curves. The critical curves can be determined as the places in the lens plane where the determinant Eq. (12) vanishes. A small area in the lens plane would be mapped to a point, identifying a locus of formally infinite magnification in the source plane.

Using the complex formalism above, Witt (1990) has shown that Eq. (16) can be used to work out the location for all critical curves in the lens plane. Because the determinant vanishes, the expression in brackets can be written as:

$$ -g + \sum _{i=1}^{N} \frac{m_{i}}{(\bar{z}_{i}-\bar{z})^{2}} = e^{i \varphi}. $$
(19)

Similar to Sect. 2.4, multiplying by \(\prod _{i=1}^{N} (\bar{z}_{i} - \bar{z})^{2}\) yields a polynomial of degree \(2N\). All critical curves can be found by first solving for all \(2N\) roots at \(\varphi =0\) and then tracing each curve for \(0 \leq \varphi \leq 2\pi \). This can be easily achieved using a simple root-finder like the Newton-Raphson method.
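The tracing step can be sketched with a Newton-Raphson iteration applied directly to Eq. (19), treating \(w=\bar{z}\) as the unknown. In this minimal example the starting root at \(\varphi =0\) is assumed to be known (here it is supplied by hand for an isolated lens rather than obtained from the degree-\(2N\) polynomial), and the function name and step counts are arbitrary choices.

```python
import cmath
import math


def trace_critical_curve(w0, lenses, g=0j, n_steps=400, newton_iter=20):
    """Follow one solution branch of Eq. (19) while phi runs from 0 to 2*pi.

    The unknown is w = conj(z); w0 must be a root for phi = 0.
    Returns the critical-curve points z = conj(w).
    """
    w = w0
    curve = []
    for k in range(n_steps + 1):
        phase = cmath.exp(1j * 2.0 * math.pi * k / n_steps)
        for _ in range(newton_iter):
            # f(w) = -g + sum m_i / (conj(z_i) - w)^2 - exp(i phi)
            f = -g + sum(m / (zi.conjugate() - w) ** 2 for zi, m in lenses) - phase
            df = sum(2.0 * m / (zi.conjugate() - w) ** 3 for zi, m in lenses)
            w = w - f / df
        curve.append(w.conjugate())
    return curve


# Isolated unit-mass lens: the critical curve is the Einstein ring |z| = 1.
curve = trace_critical_curve(1 + 0j, [(0j, 1.0)])
print(curve[0], curve[-1])
```

For the isolated lens each of the two roots traces half of the Einstein ring, so the branch started at \(z=1\) ends at \(z=-1\). Mapping the traced points through Eq. (15) would give the corresponding caustic.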

In the left panel of Fig. 4 the critical lines for the \(N=50\) stars were calculated in this way by starting at the \(2\,N = 100\) complex roots and following the critical curves for \(\varphi \) between 0 and \(2\pi \). The corresponding caustic lines in the source plane can be found by mapping the critical lines using the microlensing equation Eq. (15) (or Eq. (4) and Eq. (5)) and are shown in the right panel of Fig. 4. The caustic lines separate regions of differing image multiplicity; whenever a source crosses a caustic, two images either appear or disappear. The source position denoted by the cross symbol in Fig. 4 is inside the slightly elongated astroid caustic, so that two additional images appear, as seen in the left panel of the same figure and in Fig. 6.

The total magnification of a point source moving along the straight line in the right panel of Fig. 4 can be calculated as a function of position or time. Characteristic spikes of extreme magnification are expected whenever the source is crossing a caustic. Such a plot of magnification against time, also known as a “light curve”, is shown in Fig. 15 and is examined in detail in Sect. 3.4 as it is a very important microlensing observable.

2.6 Properties of the “Swarm” of Microimages

Finding all the microimages, albeit feasible as described in Sect. 2.4, is quite costly. However, there are a number of key properties of the “swarm” of microimages, such as their number, distribution, and individual magnifications, that can be obtained more easily. It is helpful to begin our examination of these properties with the time delay surface:

$$ \tau = \frac{1}{2}(\boldsymbol {\theta }- \boldsymbol {\beta })^{2}-\psi (\boldsymbol {\theta }), $$
(20)

where \(\psi \) is some lensing potential (see Saha et al. 2024). In the absence of any gravitational lensing, this surface is a circular paraboloid with a single extremum and one image is seen at the position of the source with magnification \(\mu = 1\). Under the influence of some background gravitational potential \(\psi _{B}(\boldsymbol {\theta })\) with constant second derivatives (i.e. constant \(\kappa \) and \(\gamma \)), the time delay surface transforms into an elliptical or hyperbolic paraboloid. Whether the image, which corresponds to the observed macro-image, is a minimum or a saddle point then depends on the curvature of the time delay surface.

Point masses introduce logarithmic terms into the gravitational potential (see Saha et al. 2024). If the point masses have some convergence \(\kappa _{*}\) the time delay surface becomes:

$$ \tau = \frac{1}{2}(\boldsymbol {\theta }- \boldsymbol {\beta })^{2}-\psi _{B}(\boldsymbol {\theta }) -\theta _{\mathrm{E}}^{2}\sum _{i=1}^{N} m_{i}\ln |\boldsymbol {\theta }-\boldsymbol {\theta }_{i}| - \frac{1}{2}(-\kappa _{*})\boldsymbol {\theta }^{2}, $$
(21)

where a term due to a negative constant surface mass density equal to the mass in stars is included to conserve the total convergence. In the limit of few deflecting masses, the time delay is largely unperturbed, as shown in Fig. 5. There are scattered logarithmic spikes, but the likelihood of one lying close to the line-of-sight to the source is low.
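As a rough numerical illustration, Eq. (21) can be evaluated directly. The sketch below is not taken from any of the cited codes: it assumes a background potential of the form \(\psi _{B} = \kappa |\boldsymbol {\theta }|^{2}/2 + \gamma (\theta _{1}^{2}-\theta _{2}^{2})/2\) with the shear aligned along the first axis, and the function and argument names are hypothetical.

```python
import numpy as np

def time_delay(theta, beta, lens_pos, lens_mass,
               kappa=0.0, gamma=0.0, kappa_star=0.0, theta_E=1.0):
    """Evaluate the time delay surface of Eq. (21) at one image-plane position.

    Assumption: psi_B = kappa/2 |theta|^2 + gamma/2 (theta_1^2 - theta_2^2),
    i.e. constant convergence and shear with the shear along the first axis.
    """
    theta = np.asarray(theta, dtype=float)
    beta = np.asarray(beta, dtype=float)
    lens_pos = np.asarray(lens_pos, dtype=float).reshape(-1, 2)
    lens_mass = np.asarray(lens_mass, dtype=float).reshape(-1)

    geometric = 0.5 * np.sum((theta - beta) ** 2)
    psi_B = 0.5 * kappa * np.sum(theta**2) + 0.5 * gamma * (theta[0] ** 2 - theta[1] ** 2)

    # Logarithmic terms from the point masses
    r = np.linalg.norm(theta - lens_pos, axis=1)
    psi_stars = theta_E**2 * np.sum(lens_mass * np.log(r))

    # Negative mass sheet that compensates the mass in stars
    sheet = 0.5 * (-kappa_star) * np.sum(theta**2)

    return geometric - psi_B - psi_stars - sheet
```

Microimages are the stationary points of this function; for a single point mass with no background potential, the gradient vanishes at the two well-known image positions.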

Fig. 5
figure 5

Top: The time delay surface for a minimum (left) and saddle point image (right). Bottom: Same but with the addition of a point-mass perturber. The white circles mark the locations of the micro-images

The locations of the microimages satisfy the lens equation, which we re-write here as:

$$ \boldsymbol {\beta }= \boldsymbol {\theta }- \boldsymbol {\alpha }_{B}(\boldsymbol {\theta }) - \boldsymbol {\alpha }_{*}(\boldsymbol {\theta }), $$
(22)

where \(\boldsymbol {\alpha }_{\mathrm{B}}(\boldsymbol {\theta })\) and \(\boldsymbol {\alpha }_{*}(\boldsymbol {\theta })\) are the deflection angles due to the potential \(\psi _{\mathrm{B}}\) and the microlenses (with the negative constant convergence correction) respectively. For constant second derivatives of the potential \(\psi _{\mathrm{B}}\) (constant \(\kappa \) and \(\gamma \)), the first two terms of the lens equation become proportional to \(\boldsymbol {\theta }\). For microimages far away from the source position, \(\boldsymbol {\beta }\), these terms become very large. Consequently, the sum of the deflections due to the microlenses (see Eq. (1)) must also become very large. Because point mass deflections are proportional to \(1/|\boldsymbol {\theta }-\boldsymbol {\theta }_{i}|\), the location of the microimage is required to be very close to some microlens so that a single term in the sum dominates. It can be shown that the magnifications of these far away micro-images behave as \(\mu \propto |\boldsymbol {\theta }|^{-4}\) (Paczynski 1986; Schneider and Weiss 1987), and they are therefore faint saddle points.
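The mapping of Eq. (22) is straightforward to write down numerically. The following is a minimal sketch under stated assumptions (shear aligned with the first axis, \(\kappa \) denoting the total convergence including \(\kappa _{*}\), and hypothetical function and argument names), rather than the implementation used in any published code.

```python
import numpy as np

def lens_mapping(theta, lens_pos, lens_mass, kappa=0.0, gamma=0.0,
                 kappa_star=0.0, theta_E=1.0):
    """Map image-plane positions theta (shape (..., 2)) to the source plane.

    Assumed conventions: alpha_B = ((kappa + gamma) theta_1, (kappa - gamma) theta_2)
    and alpha_* = point-mass deflections minus kappa_star * theta.
    """
    theta = np.asarray(theta, dtype=float)
    lens_pos = np.asarray(lens_pos, dtype=float).reshape(-1, 2)
    lens_mass = np.asarray(lens_mass, dtype=float).reshape(-1)

    # theta - alpha_B(theta) for constant kappa and gamma
    macro = theta * np.array([1.0 - kappa - gamma, 1.0 - kappa + gamma])

    # theta_E^2 * sum_i m_i (theta - theta_i) / |theta - theta_i|^2
    d = theta[..., None, :] - lens_pos          # (..., N, 2)
    r2 = np.sum(d**2, axis=-1)                  # (..., N)
    stars = theta_E**2 * np.sum(lens_mass[:, None] * d / r2[..., None], axis=-2)

    # alpha_* includes the negative constant-convergence correction
    alpha_star = stars - kappa_star * theta
    return macro - alpha_star
```

For a single point mass at the origin and a source at \(\beta = 0.5\,\theta _{\mathrm{E}}\), the two image positions \(\theta _{\pm } = \frac{1}{2}\big(\beta \pm \sqrt{\beta ^{2} + 4\theta _{\mathrm{E}}^{2}}\big)\) both map back to the source position, which provides a simple check of the sketch.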

For larger values of \(\kappa _{*}\), microlenses are more likely to lie closer to the macro-image and affect it. So long as there is external shear, a single point mass close to the macro-image position can split it into four, instead of just two, extra images (counting the infinitely de-magnified micro-maximum) if the source lies within the star’s caustic. As \(\kappa _{*}\) increases, there is no longer a single micro-image that can be associated with the macro-image, which is split into an increasingly complex “swarm” of microimages. Figure 6 shows the microimages along with contours of the light travel time that pass through micro-saddle-points. Close to the macro-image position, the micro-images are not restricted to lie very close to an associated microlens anymore; the deflection term of the macro-potential is small enough to be counterbalanced by the sum of the deflections due to all the microlenses without any specific term dominating. Instead, there is an effective region within which they tend to lie that determines the size of the micro-image swarm.

Fig. 6
figure 6

Zoomed-in region around the brightest images in the left panel of Fig. 4. The gray curves are saddle-point contours of the arrival time, at whose self-intersection points saddle-point micro-images form (red ellipses). The micro-minima (blue ellipses) and positions of the microlenses (black stars) are also shown. The latter coincide with the locations of zero-brightness micro-maxima. The size and shapes of the ellipses are indicative of the magnification tensor

2.6.1 The Size of the Micro-Image Swarm

The magnification of a point source located at \(\boldsymbol {\beta }\) is (Venumadhav et al. 2017):

$$ \mu (\boldsymbol {\beta }) = \int \delta (\boldsymbol {\theta }- \boldsymbol {\alpha }_{B}(\boldsymbol {\theta })- \boldsymbol {\alpha }_{*}(\boldsymbol {\theta }) - \boldsymbol {\beta }) \,\mathrm {d}^{2} \boldsymbol {\theta }. $$
(23)

This can be seen by utilizing properties of the Dirac delta function; Eq. (23) can be rewritten as a sum:

$$ \mu (\boldsymbol {\beta }) = \int \sum _{i} \mu (\boldsymbol {\theta }_{i})\delta (\boldsymbol {\theta }- \boldsymbol {\theta }_{i})\,\mathrm {d}^{2} \boldsymbol {\theta }, $$
(24)

where the \(\boldsymbol {\theta }_{i}\) are the locations of the microimages. The deflection angle \(\boldsymbol {\alpha }_{*}(\boldsymbol {\theta })\) due to the point masses in Eq. (23) can be viewed as a random variable that changes based on different realizations of the random point mass positions. Averaging over all such realizations, we get:

$$ \langle \mu (\boldsymbol {\beta })\rangle = \int \delta (\boldsymbol {\theta }- \boldsymbol {\alpha }_{B}( \boldsymbol {\theta })-\boldsymbol {\alpha }_{*} - \boldsymbol {\beta })p(\boldsymbol {\alpha }_{*})\,\mathrm {d}^{2} \boldsymbol {\alpha }_{*} \,\mathrm {d}^{2} \boldsymbol {\theta }, $$
(25)

where \(p(\boldsymbol {\alpha }_{*})\) is the probability density function of the deflection angle. By transforming coordinates from the image plane to the source plane with the change of variable \(\boldsymbol {\beta }' = \boldsymbol {\theta }- \boldsymbol {\alpha }_{B}(\boldsymbol {\theta }) - \boldsymbol {\beta }\), we arrive at:

$$ \langle \mu (\boldsymbol {\beta })\rangle = \int \delta (\boldsymbol {\beta }' - \boldsymbol {\alpha }_{*})p( \boldsymbol {\alpha }_{*}) \mu _{B}(\boldsymbol {\beta }' + \boldsymbol {\beta })\,\mathrm {d}^{2} \boldsymbol {\alpha }_{*} \,\mathrm {d}^{2} \boldsymbol {\beta }'. $$
(26)

Performing the integral over \(\boldsymbol {\alpha }_{*}\), this simplifies to:

$$ \langle \mu (\boldsymbol {\beta })\rangle = \int p(\boldsymbol {\beta }') \mu _{B}(\boldsymbol {\beta }' + \boldsymbol {\beta }) \,\mathrm {d}^{2} \boldsymbol {\beta }'. $$
(27)

This resulting expression is a convolution of the microlens deflection probability density with the background magnification model. What was once a point source has, on average, been “smeared out” into a source with a profile that looks like \(p(\boldsymbol {\alpha }_{*})\).

The form of the probability density function of \(\boldsymbol {\alpha }_{*}\) was first worked out by Katz et al. (1986), while Schneider et al. (1992) provide it in a slightly different form and Petters et al. (2009a) present a more thorough mathematical treatment. This function is isotropic and depends only on the magnitude of the deflection angle, not the direction. It is a combination of a bivariate normal distribution, whose width depends on the number of point masses \(N\) as:

$$ \sigma _{*} = \theta _{\mathrm{E}}\kappa _{*}^{1/2} \big[ \ln ( 2e^{1- \gamma _{E}} N^{1/2} ) \big]^{1/2}, $$
(28)

where \(\gamma _{E}\approx 0.577\) is the Euler-Mascheroni constant, with a tail that behaves as:

$$ p(\boldsymbol {\alpha }_{*})= \frac{\theta _{\mathrm{E}}^{2}\kappa _{*}}{\pi |\boldsymbol {\alpha }_{*}|^{4}}, $$
(29)

for large values of \(\boldsymbol {\alpha }_{*}\).

To determine the size of the image swarm for a point source, one can integrate \(p(\boldsymbol {\alpha }_{*})\) over a portion of the source plane that contains a large fraction of the flux. If we only integrate over a region with width \(\sigma _{*}\), the probability density function behaves like a normal distribution and the resulting isophote in the image plane contains microimages that on average constitute only \(39\%\) of the flux; in order to include \(90\%\) of the flux (on average), integrating over a region with width \(2.15\sigma _{*}\) is required. We can transform Eq. (28) for \(\sigma _{*}\), which depends on the number of point masses \(N\), into a function of the macro-parameters only. By considering the tail of the probability density function, one finds that \(99\%\) of the average flux is contained within a circle of radius \(10 \kappa _{*}^{1/2} \theta _{\mathrm{E}}\). The resulting size of the image swarm is then given by transforming this circle to the image plane, resulting in an ellipse with axes given by:

$$ r_{\pm }= \frac{10\kappa _{*}^{1/2}\theta _{\mathrm{E}}}{|1-\kappa \pm \gamma |}. $$
(30)
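The numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes a circular bivariate normal for the core of the distribution, whose enclosed fraction within radius \(r\) is \(1-e^{-r^{2}/2\sigma _{*}^{2}}\), and uses the integral of the tail of Eq. (29) outside radius \(R\), which equals \(\kappa _{*}\theta _{\mathrm{E}}^{2}/R^{2}\). The function names are hypothetical.

```python
import math

def gaussian_flux_fraction(radius_in_sigma):
    """Fraction of a circular bivariate normal enclosed within a given radius."""
    return 1.0 - math.exp(-0.5 * radius_in_sigma**2)

def gaussian_radius_for_fraction(fraction):
    """Radius (in units of sigma) of the circle enclosing the given fraction."""
    return math.sqrt(-2.0 * math.log(1.0 - fraction))

def tail_radius_for_fraction(fraction, kappa_star, theta_E=1.0):
    """Radius enclosing `fraction` of the flux when the remainder follows Eq. (29).

    Integrating Eq. (29) over |alpha| > R gives kappa_star * theta_E^2 / R^2.
    """
    return theta_E * math.sqrt(kappa_star / (1.0 - fraction))

def swarm_axes(kappa, gamma, kappa_star, theta_E=1.0, fraction=0.99):
    """Axes (r_plus, r_minus) of the image-swarm ellipse, as in Eq. (30)."""
    radius = tail_radius_for_fraction(fraction, kappa_star, theta_E)
    return radius / abs(1.0 - kappa + gamma), radius / abs(1.0 - kappa - gamma)

print(gaussian_flux_fraction(1.0))         # about 0.39
print(gaussian_radius_for_fraction(0.90))  # about 2.15
print(swarm_axes(0.4, 0.4, 0.4))           # axes in units of theta_E
```

For \(\kappa =\gamma =0.4\) the two axes differ by a factor of five, which illustrates how elongated the swarm can become even for moderate macro-magnification.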

Figure 7 shows various micro-image configurations for the three source positions indicated in Fig. 4. The magnification lying within the marked contours is slightly different from the expected average due to sample variance of the source and microlens positions.

Fig. 7
figure 7

Microimages corresponding to a point source located at the three source positions indicated in the right panel of Fig. 4. The blue (orange) ellipses represent the micro-minima (micro-saddles). The sizes of the ellipses scale logarithmically with magnification in order to show a greater dynamic range, while their orientation shows the micro-shear from the stars. The dashed and dotted ellipses (circles in this case because there is no macro-shear) denote the isophotes of 90 per cent and \(1\sigma _{*}\) respectively, while the numbers indicate the actual fraction of the total magnification that lies within them

The size of the image swarm determines the allowable area where bright microimages can appear. This in turn affects the possible astrometric shifts from microlensing. Large astrometric shifts can come about only due to caustic crossings and the creation or annihilation of pairs of microimages, as otherwise the microimages only change positions smoothly. Treyer and Wambsganss (2004) found that strong astrometric microlensing shifts are seen in systems with \(\kappa =\gamma =0.4\) and \(\kappa =\gamma =0.6\), while weak shifts are seen in systems with \(\kappa =\gamma =0.2\) and \(\kappa =\gamma =0.8\). This is easily explained by the axis ratio of the image swarm ellipse for each set of parameters, which additionally provides the reason for a dependence on the direction of the external shear noted in Treyer and Wambsganss (2004). The maximum allowable astrometric shift is roughly given by Eq. (30) and is therefore on the order of tens of micro-arcseconds. In more extreme cases of magnification from, e.g., cluster critical curves, the astrometric shifts from microlensing may approach more observable milli-arcsecond levels.

For an extended source with some profile \(I(\boldsymbol {\beta })\), the microimages in general simply result in a broadening of the source profile. The width of \(p(\boldsymbol {\alpha }_{*})\), which according to the previous discussion is of order \(\kappa _{*}^{1/2}\theta _{\mathrm{E}}\), is added in quadrature to the width of the source \(\sigma _{\mathrm{I}}\) (Dai and Pascale 2021), resulting in an effective width:

$$ \sigma _{\mathrm{eff}} \approx \sqrt{ \sigma _{\mathrm{I}}^{2} + \kappa _{*}\theta _{\mathrm{E}}^{2} }. $$
(31)

2.6.2 The Number of Micro-Images

While every point mass has an associated saddle point image, the total number of images for a field of \(N\) point masses with external shear cannot exceed \(5N\) (Khavinson and Neumann 2004). This implies there are at most \(2N-1\) micro-minima and \(3N-1\) micro-saddles, plus one additional micro-image of either parity corresponding to the macro-minimum or saddle (Petters et al. 2009b). In practice, for macro-images that are not near a macro-critical curve (i.e. outside the extreme high magnification regime), the number of expected micro-minima (extra image pairs) is fairly low, with some simulations suggesting an approximately Poisson distribution (Granot et al. 2003).

The micro-critical curves split the image plane into regions of positive and negative parity, as shown in Fig. 4. The multiplicity with which regions of positive parity overlap when mapped back to the source plane gives the mean number of positive parity images \(\langle n\rangle \), for which analytic formulas are given in Wambsganss (1992) and Granot et al. (2003). Figure 8 shows the caustic pattern of Fig. 4, color-coded to show the number of micro-minima. The histogram of the resulting distribution of the number of microminima is shown in Fig. 9.

Fig. 8
figure 8

Caustics (black lines, same as in the right panel of Fig. 4) and corresponding number of micro-minima from each region they enclose. The region of the source plane shown here is the same as in Fig. 10

Fig. 9
figure 9

Histogram showing the distribution of the number of micro-minima. The black triangles denote the expected probabilities for a Poisson distribution with a mean \(\langle n\rangle - 1\) (as the image is a macro-minimum)
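The Poisson comparison in Fig. 9 can be sketched as follows. We assume here, as one reading of the caption, that for a macro-minimum there is always at least one micro-minimum and that the number of extra ones is Poisson distributed with mean \(\langle n\rangle - 1\). The function name and the value of \(\langle n\rangle \) used in the example are hypothetical.

```python
import math

def micro_minima_pmf(k, mean_n):
    """Probability of observing k micro-minima for a macro-minimum image.

    Assumption: k - 1 follows a Poisson distribution with mean (mean_n - 1),
    so that k >= 1 and the mean of k equals mean_n.
    """
    if mean_n < 1.0:
        raise ValueError("mean_n must be at least 1 for a macro-minimum")
    if k < 1:
        return 0.0
    lam = mean_n - 1.0
    return math.exp(-lam) * lam ** (k - 1) / math.factorial(k - 1)

# Hypothetical mean number of micro-minima, for illustration only
for k in range(1, 6):
    print(k, micro_minima_pmf(k, 1.8))
```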

Finally, by extending arguments found in Wambsganss (1992) and Granot et al. (2003), it can be shown that the average magnification of a single micro-minimum is:

$$ \langle \mu _{\mathrm{mm}}\rangle = \frac{\mu _{\mathrm{macro}}}{\langle n\rangle} P_{\mu >0}(x_{\mathrm{img}}), $$
(32)

where \(\mu _{\mathrm{macro}}/\langle n \rangle \) is the macro-magnification from Eq. (14) divided by the expected number of micro-minima, and \(P_{\mu >0}(x_{\mathrm{img}})\) is the probability that the location \(x_{\mathrm{img}}\) is a minimum and not a saddle-point.

2.7 Microlensing Maps

A very useful tool for microlensing studies is the magnification field on the source plane induced by an ensemble of microlenses. This can be used together with a model for the source to match simulated light curves and flux ratios with observations, as will be discussed in Sect. 3. To calculate this magnification, we have to transform Eq. (13), the sum of magnifications due to all the microlenses, from the lens to the source plane. This is hardly practical, and quickly becomes intractable as the number of microlenses increases. However, a number of techniques have been developed specifically to optimize this task, which we review here.

Our starting point is Eq. (5), where each image position \(\boldsymbol{\theta}\) maps to a unique position on the source plane, \(\boldsymbol{\beta}\), i.e. the mapping from the image to the source plane is single-valued. As we have seen, the inverse is not true and a source can have multiple images. Finding these solutions to the lens equation is a complex problem, if not impossible (see Schneider et al. 1992, for analytical solutions in a few simple cases). Therefore, an alternative approach is to proceed in the inverse manner and calculate the total magnification as the sum of the intensity of all the microimages in a finite region of the source plane, as shown in Fig. 10. This technique, called inverse ray-shooting, was introduced by Kayser et al. (1986) and comprises propagating a grid of rays backwards from the observer through the lens plane, where each ray is deflected using Eq. (5), to the source plane, where its final position is mapped. By dividing the source plane into pixels, counting the number of rays reaching each pixel, \(N_{ij}\), and comparing it to the number of rays that would have reached that pixel if there were no lensing taking place, \(N_{\mathrm{rays}}\), we obtain an estimate for the magnification, i.e. a pixelated magnification map:

$$ \mu _{ij} = N_{ij}/N_{\mathrm{rays}}. $$
(33)

This relation is an approximation because it does not take into account light rays from outside the defined grid that could be deflected inside the region of the source plane under examination. In practice, by choosing a grid of rays large enough with respect to the size of the source plane, the effect of such extreme deflections can be neglected. The above outlined procedure is schematically shown in Fig. 11.
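The procedure can be condensed into a short, unoptimized sketch. It is not the GPUD or any other published code: we assume a lens equation with a smooth convergence \(\kappa _{\mathrm{c}}\), a shear aligned with the first axis, and point masses in units of \(\theta _{\mathrm{E}}\); the function name, its arguments, and the padding factor `pad` are hypothetical choices made for illustration.

```python
import numpy as np

def magnification_map(lens_pos, lens_mass, kappa_c, gamma, src_half_width,
                      n_pix, rays_per_side=4, pad=1.5, theta_E=1.0):
    """Pixelated magnification map from inverse ray-shooting on a regular grid."""
    lens_pos = np.asarray(lens_pos, dtype=float).reshape(-1, 2)
    lens_mass = np.asarray(lens_mass, dtype=float).reshape(-1)

    # Ray spacing chosen so that, without lensing, each source-plane pixel
    # would receive N_rays = rays_per_side**2 rays.
    pixel = 2.0 * src_half_width / n_pix
    spacing = pixel / rays_per_side
    n_rays_unlensed = rays_per_side**2

    # Shooting region on the image plane: the source region stretched by the
    # macro-model and enlarged by an ad hoc safety factor.
    wx = pad * src_half_width / abs(1.0 - kappa_c - gamma)
    wy = pad * src_half_width / abs(1.0 - kappa_c + gamma)
    x = (np.arange(int(np.ceil(2.0 * wx / spacing))) + 0.5) * spacing - wx
    y = (np.arange(int(np.ceil(2.0 * wy / spacing))) + 0.5) * spacing - wy
    tx, ty = np.meshgrid(x, y, indexing="ij")
    tx, ty = tx.ravel(), ty.ravel()

    # Deflect every ray: macro-model first, then the direct sum over microlenses.
    bx = (1.0 - kappa_c - gamma) * tx
    by = (1.0 - kappa_c + gamma) * ty
    for (lx, ly), m in zip(lens_pos, lens_mass):
        dx, dy = tx - lx, ty - ly
        r2 = dx * dx + dy * dy  # assumes no ray lands exactly on a microlens
        bx -= theta_E**2 * m * dx / r2
        by -= theta_E**2 * m * dy / r2

    # Count the rays in each source-plane pixel and normalise, as in Eq. (33).
    L = src_half_width
    counts, _, _ = np.histogram2d(bx, by, bins=n_pix, range=[[-L, L], [-L, L]])
    return counts / n_rays_unlensed
```

Without any microlenses and with \(\kappa _{\mathrm{c}}=0.5\), \(\gamma =0\), every pixel of the resulting map should be close to the macro-magnification \(1/(1-\kappa _{\mathrm{c}})^{2}=4\). When microlenses are present, `pad` has to be large enough for rays deflected into the map from far away to be included, which is the approximation discussed above.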

Fig. 10
figure 10

Magnification map corresponding to the microlens configuration shown in Fig. 4, generated by the inverse ray-shooting code GPUD (Thompson et al. 2010, 2014). The scale is in magnitudes over the mean magnification, \(\langle \mu \rangle \), which approaches the macroscopic one given by Eq. (14) (a function of the local convergence and shear) as the number of rays increases. The field of view is \(13\times 13 \, \theta _{\mathrm{E}}\) on the source plane and includes the track from the right panel of Fig. 4 (green line) and the locations used to produce Fig. 7 (green crosses). Corresponding light curves for different accretion discs are shown in Fig. 15

Fig. 11
figure 11

Schematic representation of ray-shooting. Microlenses (black dots) are distributed in a circle of area A on the lens plane. A grid of light-rays (grey and red dots) is projected backwards from the observer through the lens plane, where the deflection by each lens is calculated for each ray using Eq. (5), and mapped onto the source plane. A rectangular area on the source plane is selected (red points) away from edge effects, divided into pixels, and used to calculate the magnification from Eq. (33). Taken from Vernardos and Fluke (2014a)

Performing inverse ray-shooting directly is a computationally expensive procedure, with a forbidding cost for large scale applications (see Bate and Fluke 2012; Vernardos and Fluke 2014a, for some benchmarks). Hence, efficient approximate implementations have been developed, as well as direct approaches taking advantage of hardware acceleration, which we outline below.

2.7.1 The Tree Code

One way of reducing the calculations required to produce magnification maps is by approximating the sum of all the deflections due to the microlenses. The key idea here is that the deflection angle of a given light ray due to a given microlens is inversely proportional to the distance between the two, in an analogous way to gravitational N-body problems. In the hierarchical tree code approach developed by Wambsganss (1990, 1999), microlenses that are further away from a light ray are grouped together in cells, or pseudo-particles, whose size and total mass depend on the distance to the light ray. This approximation greatly reduces the final number of deflections that need to be calculated per light ray and, subsequently, increases the speed of the calculations, at the cost of a slight computational overhead and a loss in accuracy. This method has been used extensively in the past to constrain properties of the background quasar and the lens (Rauch and Blandford 1991; Keeton et al. 2006; Bate et al. 2008; Pooley et al. 2012; Dai and Guerras 2018; Hutsemekers et al. 2019). Due to its efficiency and the fact that it was the only widely available code for producing magnification maps for a long period of time, this approach can be considered as the “industry standard” in quasar microlensing.
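The grouping idea can be illustrated with a toy example. The sketch below is not the hierarchical tree code of Wambsganss (1990, 1999); it only replaces one distant group of microlenses by a single pseudo-particle carrying the total mass at the centre of mass (the lowest-order term) and compares the result with the direct sum. The cell geometry, masses, and function names are hypothetical.

```python
import numpy as np

def deflection(theta, lens_pos, lens_mass, theta_E=1.0):
    """Direct sum of point-mass deflections at a single ray position theta."""
    d = theta - lens_pos
    r2 = np.sum(d**2, axis=1)
    return theta_E**2 * np.sum(lens_mass[:, None] * d / r2[:, None], axis=0)

def pseudo_particle(lens_pos, lens_mass):
    """Replace a group of microlenses by their total mass at the centre of mass."""
    total = lens_mass.sum()
    centre = (lens_mass[:, None] * lens_pos).sum(axis=0) / total
    return centre, total

# Hypothetical cell: 100 equal-mass microlenses in a unit square far from the ray
rng = np.random.default_rng(0)
cell_pos = rng.uniform(-0.5, 0.5, size=(100, 2)) + np.array([50.0, 20.0])
cell_mass = np.ones(100)
ray = np.array([0.0, 0.0])

exact = deflection(ray, cell_pos, cell_mass)                   # 100 terms
centre, total = pseudo_particle(cell_pos, cell_mass)
approx = deflection(ray, centre[None, :], np.array([total]))   # 1 term
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(rel_err)
```

The relative error decreases as the ratio of the cell size to its distance from the ray becomes smaller, which is why more distant microlenses can be grouped into larger cells.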

Although the tree code has led to a tremendous speed-up of the calculations compared to direct ray-shooting, there are a few disadvantages. The required accuracy of the code – the level at which a number of microlenses at a given distance from a light ray are grouped into a pseudo-particle or treated individually – is something to be determined empirically by running the code multiple times until the results are acceptable. Hence, a certain amount of expert knowledge is required to implement and use it. The memory overhead of the tree data structure used to group the microlenses into cells may grow too large with the number of microlenses for a single processor to handle. For this reason, a parallel version has been developed (Garsden and Lewis 2010) that overcomes this by distributing the computations across many nodes on a computer cluster. Using this approach, magnification patterns of up to a billion microlenses can be computed in realistic timescales.

2.7.2 Polygon Mapping

Another approach to speed up the solution of Eq. (33) is to reduce the number of rays shot, \(N_{\mathrm{rays}}\), as does the inverse polygon mapping technique developed by Mediavilla et al. (2006, 2011) and Shalyapin et al. (2021). Here, the lens plane is divided into cells that are mapped onto the source plane using the lens equation. By counting the portions of the area of the cells that are within a given source plane pixel, as opposed to the standard inverse ray-shooting that counts rays, an estimate of the magnification is obtained. The main advantage of this method is the increased computational efficiency, i.e. there are no unused rays that are deflected outside the source plane region of interest, while there is an improved resolution around caustics as well. However, it has its own computational overhead due to finding the correct tiling of the lens plane and the large number of cells required near critical lines. Another advantage of this method is that the error on its estimate of the number of rays per pixel follows \(N_{ij}^{-3/4}\), which is smaller than the Poisson error \(N_{ij}^{-1/2}\) of randomly shot rays. In fact, the former is a direct result of using a regular grid of rays (instead of random positions; Kayser et al. 1986). Selected examples of studies employing the inverse polygon mapping technique are Guerras et al. (2013a), Jiménez-Vicente et al. (2014), Rojas et al. (2014), and Esteban-Gutiérrez et al. (2020).
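The area-based estimate at the core of this approach can be sketched for a single cell. The example below is only an illustration of the area-ratio idea, not the full inverse polygon mapping algorithm, which additionally apportions the area of each mapped cell among the source-plane pixels it overlaps. The function names and the single point-mass lens used as an example are hypothetical.

```python
import numpy as np

def polygon_area(vertices):
    """Area of a simple polygon from the shoelace formula."""
    x, y = vertices[:, 0], vertices[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def cell_magnification(centre, size, mapping):
    """Unsigned magnification of a square image-plane cell.

    The cell corners are mapped to the source plane; the magnification is the
    ratio of the cell area to the area of the mapped polygon.
    """
    h = 0.5 * size
    corners = np.asarray(centre, dtype=float) + np.array(
        [[-h, -h], [h, -h], [h, h], [-h, h]])
    mapped = np.array([mapping(c) for c in corners])
    return size**2 / polygon_area(mapped)

def point_lens(theta, theta_E=1.0):
    """Lens equation of an isolated point mass at the origin."""
    return theta - theta_E**2 * theta / np.dot(theta, theta)

# A small cell at twice the Einstein radius from the lens
mu_cell = cell_magnification([2.0, 0.0], 0.01, point_lens)
print(mu_cell)  # close to the point-lens value 1 / (1 - 1/2**4) = 16/15
```

Because the estimate comes from areas rather than ray counts, it carries no shot noise for a smooth mapping; the approximation breaks down for cells that straddle a critical line, which is where the finer tiling mentioned above is needed.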

2.7.3 Direct Approaches

Numerous scientific problems, whose solution has so far been possible only through complex approximations, if available, can now be revisited due to the advent of Graphics Processing Units (GPU) and their design focused on massively parallel processing (see the general brute-force and algorithm analysis techniques described in Barsdell et al. 2010; Fluke et al. 2011). Although originally developed as specialized hardware to accelerate the generation of graphics on the computer screen, particularly for the game industry, the GPU architecture, together with the emergence of general purpose programming libraries, has allowed for speed-ups of \(\mathcal{O}(10)\) to \(\mathcal{O}(100)\) for various algorithms. This is the case for inverse ray-shooting, whose direct implementation is “embarrassingly parallel”: each deflection by a single microlens is independent of any other microlenses and the deflections for each light ray are independent of all the other light rays. Such an algorithm, GPUD, has been implemented by Thompson et al. (2010, 2014), with an obvious advantage being its simplicity and ease of modification and maintenance. Another advantage is the benefit from Moore’s law (Moore 1965) for GPUs, i.e. a doubling of the speed every 1-2 years due to hardware improvements without any software modification (see Fig. 2 in Vernardos and Fluke 2014a), which has ceased since 2005 for CPUs. GPUD was used to carry out GERLUMPH, the largest parameter space exploration of microlensing maps, discussed in Sect. 2.8, which enabled a series of previously unfeasible studies (Vernardos 2018; Foxley-Marrable et al. 2018; Vernardos 2019; Vernardos and Tsagkatakis 2019; Neira et al. 2020). Recent developments by Zheng et al. (2022) can increase the speed of GPUD by a factor of 100 in the regime where it is the slowest, i.e. large numbers of lenses and high resolution maps.

The Fourier-based approach by Kochanek (2004) separates the long- and short-range effects of the microlenses on ray deflections using a particle-particle/particle-mesh (P3M) algorithm, assumes the lens and source planes to be spatially periodic, and approximates the long-range deflections via a Fourier transform. As a consequence, the edge effects in the source plane are removed and more of the magnification map is usable (higher efficiency). However, there is an upper bound on the size of the generated magnification maps (approx. 8192 pixels on a side) due to the increasing amount of memory required by the Fourier transforms. Selected examples of studies employing the Fourier-based method are Poindexter et al. (2008), Chartas et al. (2009), Dai et al. (2010), MacLeod et al. (2015), Morgan et al. (2018), Cornachione et al. (2020a).

2.7.4 Combined Approaches

Apart from GPUD’s direct approach to inverse ray-shooting, the tree and polygon based algorithms can also benefit from the massive GPU parallelization, at the cost of more complex algorithm development. A spectacular example is the Teralens algorithm, which follows the same general principles as the tree code of Wambsganss (1999) but has a dramatically different algorithmic design tuned for maximum parallel efficiency. One can also envisage merging the tree-based method that approximates the lens deflections with shooting polygons instead of rays, since the two approximate different parts of the lens equation. Recently, Jiménez-Vicente and Mediavilla (2022) achieved this by combining the inverse polygon method with the fast multipole method (Greengard and Rokhlin 1987), an algorithm similar in concept to a tree or particle-mesh algorithm. Implementing this approach on the GPU would result in the fastest conceivable magnification map generating code. Finally, alternative GPU/CPU parallel architectures could provide another path forward, requiring minimal modifications for speeding up existing CPU codes (Chen et al. 2017).

2.8 How Caustic Structures Respond to Macro Parameters

The stellar-mass compact object populations that cause microlensing are unobservable: one would need to measure the masses and positions of hundreds to thousands of such objects within galaxies at cosmological distances, orders of magnitude below the resolution of any conceivable telescope. Hence, we have to resort to a statistical description of such populations at the location of the observed quasar multiple images that undergo microlensing. This is based on local properties of the macroscopic mass and light distribution of the lensing galaxy. The three major macro-parameters statistically defining a microlensing population are:

  • the total convergence, \(\kappa \), defined in Eq. (3),

  • its compact matter component, \(\kappa _{*}\), related to the number of microlenses via Eq. (8), and

  • the shear, \(\gamma \).

The \(\kappa \), \(\gamma \) are introduced and linked to the lens potential within the general formalism presented in Sect. 3 of Saha et al. (2024), while specific choices of lens models and partitioning the mass between a smooth/dark and a compact/stellar component are discussed in Sects. 2 and 3 of Shajib et al. (2024). Although the shear is a vector, its direction becomes important only when comparing to other multiple images, for example when generating light curves (e.g. see Fig. 17). Therefore, without loss of generality, we can align the shear with the x-axis, leading to \(\gamma _{2} = 0\) in Eq. (5) – a useful trick to maximize the efficiency in using magnification maps. This leads to all three parameters being scalar fields, functions of the position on the lens plane (an example of how these three parameters define microlensing populations which in turn produce corresponding magnification maps and microlensing observables is shown in Fig. 17).

Figure 12 demonstrates the effect of different macro-parameters on the magnification maps in the case where the whole mass is in the form of microlenses, i.e. \(\kappa _{\mathrm{c}}=0\) in Eq. (5). The density of the caustics is proportional to the total magnification (and consequently the number of microlenses from Eq. (13)), whose contours are shown in the top panel of Fig. 13. The effect of the shear is also striking, stretching the caustics and producing stratified maps with elongated “filaments” of caustics interspaced with demagnification troughs. Interestingly, for super-critical regions where \(\kappa >1\) (these correspond to highly demagnified saddle-point or maximum images), the anisotropic magnification (stretching) due to the shear changes direction by \(\pi /2\) due to the \(1-\kappa \) term being negative (see Eq. (76) of Saha et al. 2024). This is clearly seen as the horizontal caustic structures for \(\kappa <1\) become vertical for \(\kappa >1\).

Fig. 12
figure 12

Tiling of the \(\kappa -\gamma \) parameter space by \(17 \times 10\) magnification maps with \(\kappa _{\mathrm{c}}=0\) from the GERLUMPH database. The central point of each tile corresponds to the \(\kappa -\gamma \) values that were used to generate it. The caustics become denser as the number density of microlenses increases along the x-axis, while they appear increasingly stretched due to higher shear along the y-axis (the stretching always occurs along the x-axis of the maps themselves, see Fig. 17 for the proper alignment of the maps in a real lensed system). The color scale is the same as in Fig. 10

Fig. 13
figure 13

Theory (top), macromodels (middle), and simulations (bottom) shown in the \(\kappa - \gamma \) parameter space. The critical line, i.e. where the magnification given by Eq. (14) diverges, dividing the parameter space into the minimum, saddle-point, and maximum image regions in direct correspondence to the macromodel multiple images, and the \(\kappa =\gamma \) line, i.e. the locus of a Singular Isothermal Ellipsoid (SIE) macromodel, are shown in all panels for reference. Top: Selected locations with \(\kappa _{\mathrm{c}}=0\) (orange points) have the same \(\kappa _{\mathrm{eff}}\), \(\gamma _{\mathrm{eff}}\) as any other location along the lines to (1,0) (grey dashed lines) with \(\kappa _{\mathrm{c}}\) given by Eq. (34). Because locations B, C and E, F lie very close to the lines emanating from A and D respectively, they have very similar \(\kappa _{\mathrm{eff}}\), \(\gamma _{\mathrm{eff}}\) values (grey points) for suitable \(\kappa _{\mathrm{c}}\) (this is illustrated in Fig. 14 in more detail). Magnification contours from Eq. (14) are also indicated. Middle: Selected macromodels from the literature; see https://gerlumph.swin.edu.au/macromodels/ for an interactive version of this plot. Bottom: Two simulated populations of lensed quasars are shown, one from Oguri and Marshall (2010) and an identical one generated with lens potential ellipticity and external shear distributions with lower and higher mean values respectively (0.15 and 0.1 as opposed to 0.3 and 0.05, based on Luhtaru et al. 2021). The footprint of the GERLUMPH database is also shown

These diverse caustic structures have an impact on the microlensing effect present in observations, i.e. through light curves and magnification probability distributions of flux ratios (see the next section). Vernardos and Fluke (2013) have explored the latter case, producing a similar tiling of the parameter space as Fig. 12 but with magnification distributions (see their Fig. 4). In summary, they found that along the critical curve (where magnification diverges), the distributions are centered around the macro-magnification, extend in a narrow area around it, \(0.2 < \mu /\mu _{\mathrm{macro}} < 5\), where \(\mu _{\mathrm{macro}}\) is given by Eq. (14), and appear symmetric (in log space). Away from the critical curve, the distributions generally become wider and asymmetric, mostly due to the appearance of a high magnification tail and/or a secondary peak.

Including an additional convergence component due to smooth matter has fundamental implications that trace back to the mass-sheet degeneracy (Saha et al. 2024). The effect is two-fold: on one hand, the number of microlenses is reduced since now there is less mass in the form of compact objects (see Eq. (8)), while on the other hand, the resulting caustic networks, which are now less dense, are isotropically magnified by the uniform mass-sheet described by \(\kappa _{\mathrm{c}}\) (see Eq. (5)). An example of such maps is shown in Fig. 14.

Fig. 14
figure 14

Magnification maps and histograms corresponding to the locations marked on the top panel of Fig. 13. The \(\kappa \), \(\gamma \), \(\kappa _{\mathrm{c}}\) values are given above each map and correspond to \(\kappa _{\mathrm{eff}},\gamma _{\mathrm{eff}}= (0.6\pm 0.02,0.55\pm 0.02)\) and \((0.5\pm 0.02,0.4\pm 0.02)\) for the saddle-points and minima respectively (grey points on the top panel of Fig. 13). Although the \(\kappa \), \(\gamma \), \(\kappa _{\mathrm{c}}\) of these maps are very different, their magnification histograms are very similar, making them ‘equivalent’ maps (per image type, see Vernardos et al. 2014). The minor differences between the histograms can be attributed to the sample variance of examining a finite source plane region. The same cannot be readily said for light curves resulting from these maps

Figure 12 is actually showing one slice of a three dimensional parameter space – the third dimension being \(\kappa _{*}\). As Paczynski (1986) has shown, the \(\kappa \), \(\gamma \), \(\kappa _{*}\) parameter space can be transformed to an equivalent, two-dimensional parameter space of effective convergence and shear:

$$ \kappa _{\mathrm{eff}}= \frac{\kappa _{*}}{1-\kappa +\kappa _{*}}, \quad \gamma _{\mathrm{eff}}= \frac{\gamma}{1-\kappa +\kappa _{*}}. $$
(34)

In the above equation, \(\kappa _{\mathrm{eff}}\) is entirely due to compact matter. The effect of the above transformation can be understood in the following way, as also illustrated in the top panel of Fig. 13. Starting from a given location \(\kappa _{0}\), \(\gamma _{0}\) with \(\kappa _{\mathrm{c}}=0\) (locations A and D), maps with \(\kappa _{\mathrm{c}}>0\) lie in the third dimension in the traditional \(\kappa \), \(\gamma \), \(\kappa _{*}\) parameter space. In the effective space, however, they lie along the straight line passing through \((\kappa _{0},\gamma _{0})\) and \((1,0)\). In terms of the magnification distribution, the above mass-sheet transformation leads to statistically equivalent maps (Vernardos et al. 2014), i.e. we expect the same microlensing deviations from the macro-magnification, as shown in Fig. 14. Minor differences arise due to sample variance from using a finite field of stars to generate maps for a finite source plane region. Such differences can be minimized by, e.g., averaging over multiple maps or using a larger region in the source plane. In terms of light curves, however, there are other slight differences: we expect fewer peaks because there are fewer caustics to cross, but also longer variability intervals because the caustics now cover a larger area on the source plane.
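As a minimal sketch of Eq. (34), the following Python snippet maps a triplet \(\kappa \), \(\gamma \), \(\kappa _{*}\) to the effective parameters and inverts the relation for a chosen smooth matter convergence \(\kappa _{\mathrm{c}}=\kappa -\kappa _{*}\). The function names are illustrative only, and the macro-magnification is assumed to have the standard form \(1/[(1-\kappa )^{2}-\gamma ^{2}]\).

```python
def effective_parameters(kappa, gamma, kappa_star):
    """Eq. (34): map (kappa, gamma, kappa_*) to (kappa_eff, gamma_eff)."""
    one_minus_kc = 1.0 - kappa + kappa_star  # equals 1 - kappa_c
    if one_minus_kc <= 0.0:
        raise ValueError("requires kappa_c = kappa - kappa_* < 1")
    return kappa_star / one_minus_kc, gamma / one_minus_kc


def equivalent_map(kappa_eff, gamma_eff, kappa_c):
    """Invert Eq. (34): return (kappa, gamma, kappa_*) for a given smooth
    matter convergence kappa_c that shares the same effective parameters."""
    kappa_star = kappa_eff * (1.0 - kappa_c)
    gamma = gamma_eff * (1.0 - kappa_c)
    return kappa_star + kappa_c, gamma, kappa_star


def macro_magnification(kappa, gamma):
    """Standard macro-magnification (assumed form, see lead-in)."""
    return 1.0 / ((1.0 - kappa) ** 2 - gamma ** 2)


# Replace part of the compact matter at a fixed (kappa0, gamma0) by smooth matter.
kappa0, gamma0 = 0.4, 0.3
for kappa_c in (0.0, 0.1, 0.2):
    k_eff, g_eff = effective_parameters(kappa0, gamma0, kappa0 - kappa_c)
    # The slope gamma_eff / (1 - kappa_eff) stays constant, i.e. the points are
    # collinear with (kappa0, gamma0) and (1, 0) in the effective space.
    print(kappa_c, round(k_eff, 4), round(g_eff, 4), round(g_eff / (1.0 - k_eff), 4))
```

Under these assumptions, two maps sharing the same effective parameters have macro-magnifications that differ only by the factor \((1-\kappa _{\mathrm{c}})^{2}\), which is why their distributions of \(\mu /\mu _{\mathrm{macro}}\) can be compared directly.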

2.9 Quasar Structure Overview

Because the microlensing signal depends on the size and structure of the lensed source, we aim in this section at providing an overview of the characteristics of the AGN emission regions over the electromagnetic spectrum, e.g. shape and size of the X-ray corona, size and temperature profile of the accretion disc, geometry of the broad line region, etc. The generally accepted structure of a quasar has been pieced together by painstaking work over many decades to account for all of the various emission and absorption features that have been seen across the electromagnetic spectrum. Figure 3 shows a schematic representation of the different components.

A scale commonly used as the “unit length” when describing the inner regions of AGN is the gravitational radius \(r_{\mathrm {g}}\), also called the Schwarzschild radius:

$$ r_{\mathrm {g}}= \frac{2GM_{\mathrm{BH}}}{c^{2}}, $$
(35)

where \(M_{\mathrm{BH}}\) is the mass of the black hole, and \(G\) and \(c\) are the gravitational constant and the speed of light. This radius defines the event horizon of a (non-spinning) black hole. For a typical mass of \(M_{\mathrm{BH}}=10^{8}~M_{\odot}\), we have \(r_{\mathrm {g}}\sim 3 \times 10^{8}\text{ km}\), which corresponds to \(\sim 2\) AU, \(10^{-2}\) lt-days, or \(\sim 10^{-5}\) pc. For microlensing applications, this should be compared to the typical Einstein radius of a microlens in the source plane (taken as the average from Mosquera and Kochanek 2011, see also Eq. (7)), \(R_{\mathrm{E}}= 2.5 \times 10^{16} \sqrt{ \langle M \rangle /0.3~\mathrm{M}_{\odot}}\text{ cm}\) \(\sim 9.7~\sqrt{\langle M\rangle /0.3~\mathrm{M}_{\odot}} \) lt-days \(\sim 840~\sqrt{\langle M\rangle /0.3~\mathrm{M}_{\odot}} \; r_{\mathrm {g}}\) (for \(M_{{\mathrm{BH}}}=10^{8}~\mathrm{M}_{\odot}\)). Finally, another quantity of interest from general relativity, typically associated with the inner edge of the disc, is the radius of the innermost stable circular orbit (ISCO) of a massive test particle, \(R_{\mathrm{ISCO}} = \alpha r_{\mathrm {g}}\), where \(\alpha =3\) for a non-rotating black hole and \(1/2\) (\(9/2\)) for a prograde (retrograde) orbit around a maximally spinning one.
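The unit conversions quoted above can be checked with a short Python sketch. The physical constants are rounded textbook values, and the function names are illustrative rather than taken from any existing package.

```python
import math

G = 6.674e-11                  # gravitational constant [m^3 kg^-1 s^-2]
C_LIGHT = 2.998e8              # speed of light [m/s]
M_SUN = 1.989e30               # solar mass [kg]
AU = 1.496e11                  # astronomical unit [m]
PARSEC = 3.086e16              # parsec [m]
LIGHT_DAY = C_LIGHT * 86400.0  # light-day [m]


def schwarzschild_radius_m(m_bh_msun):
    """Eq. (35): r_g = 2 G M_BH / c^2, returned in metres."""
    return 2.0 * G * m_bh_msun * M_SUN / C_LIGHT ** 2


def einstein_radius_m(mean_mass_msun=0.3):
    """Typical source-plane Einstein radius quoted in the text
    (2.5e16 cm for <M> = 0.3 M_sun), returned in metres."""
    return 2.5e16 * math.sqrt(mean_mass_msun / 0.3) / 100.0


r_g = schwarzschild_radius_m(1e8)
r_e = einstein_radius_m()
print("r_g [km]      :", r_g / 1e3)
print("r_g [AU]      :", r_g / AU)
print("r_g [lt-days] :", r_g / LIGHT_DAY)
print("r_g [pc]      :", r_g / PARSEC)
print("R_E [lt-days] :", r_e / LIGHT_DAY)
print("R_E / r_g     :", r_e / r_g)
```

Because \(r_{\mathrm {g}}\) scales linearly with \(M_{\mathrm{BH}}\), the ratio \(R_{\mathrm{E}}/r_{\mathrm {g}}\) drops by a factor of ten for a \(10^{9}~\mathrm{M}_{\odot}\) black hole.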

Our discussion will “follow the photons” from their production in the accretion disc through their encounters with other material in the quasar on their way to the observer. The aim is to help the reader associate features in the electromagnetic radiation with physical structures in the quasar. We do not aim to give a complete description of every aspect of quasar emission; instead we focus on those most relevant to microlensing.

The hot material in the accretion disc produces an abundance of optical/UV photons. Some of those encounter a population of high-energy electrons in the innermost regions near the black hole and are inverse-Compton upscattered to X-ray energies; this is the “primary” X-ray emission. Some of this energy will reach the observer directly and result in an X-ray continuum spectrum, while the rest of the emission from this X-ray corona will interact with other parts of the quasar and give rise to reflection features, the most prominent of which being the Fe K\(\alpha \) emission line at 6.4 keV. This line can have two components: a broad component that comes from the inner part of the accretion disc, and a narrow component that comes from material farther out (i.e., the broad-line region or torus).

The accretion of matter onto the central supermassive black hole can reach a rate of a few solar masses per year and radiate with an efficiency ranging between 5.7% in the non-spinning case and 42% for maximal spin (Thorne 1974), much larger than that of nuclear fusion (\(\sim 0.7\)%). This energy release causes the accretion disc to heat up and emit a power-law continuum from the UV to the optical range. This emission arises from the inner part of the disc, at distances from the central black hole ranging from several to tens of astronomical units. The high energy continuum from the inner disc ionizes the gas in the surrounding area, producing the broad emission lines characteristic of quasar optical spectra (Fig. 3).
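The link between black hole spin, the ISCO radius, and the radiative efficiency can be sketched as follows. The snippet assumes the standard Kerr expressions (Bardeen et al. 1972) for a thin disc truncated at the ISCO; these formulas are not given in this review, and the function names are illustrative.

```python
import math


def isco_radius_gm(spin):
    """ISCO radius in units of G M / c^2 for dimensionless spin in [-1, 1].
    Positive spin means a prograde orbit, negative spin a retrograde one."""
    a = abs(spin)
    z1 = 1.0 + (1.0 - a * a) ** (1.0 / 3.0) * (
        (1.0 + a) ** (1.0 / 3.0) + (1.0 - a) ** (1.0 / 3.0))
    z2 = math.sqrt(3.0 * a * a + z1 * z1)
    # max() guards against tiny negative values caused by round-off
    root = math.sqrt(max(0.0, (3.0 - z1) * (3.0 + z1 + 2.0 * z2)))
    return 3.0 + z2 - root if spin >= 0 else 3.0 + z2 + root


def radiative_efficiency(spin):
    """Fraction of rest-mass energy released down to the ISCO."""
    return 1.0 - math.sqrt(1.0 - 2.0 / (3.0 * isco_radius_gm(spin)))


for spin in (0.0, 1.0, -1.0):
    # alpha is the ISCO radius in units of r_g = 2 G M / c^2 (Eq. (35))
    alpha = isco_radius_gm(spin) / 2.0
    print(spin, alpha, round(100.0 * radiative_efficiency(spin), 1), "%")
```

In this sketch, the efficiency rises as the ISCO moves inwards, from about 5.7% without spin to about 42% for a prograde orbit around a maximally spinning black hole.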

Based on the variety of observed line widths and ratios, AGN were originally classified into various types and subtypes. In the nineties, a unification scheme based on orientation was proposed to explain this phenomenology in a coherent, geometric way (Antonucci 1993; Urry and Padovani 1995). The model postulates that obscuring material in the equatorial plane (assumed to be a dust torus\(^{3}\)) and orientation with respect to the line of sight cause this diversity.

There is growing evidence that the physical properties of AGN also play a role in their spectral appearance. In particular, variation in the accretion rate, which is suspected to be related to the launch of radio jets\(^{4}\) (Laor and Behar 2008; White et al. 2015), also impacts the spectral appearance of AGN (Marziani et al. 2001; Shen and Ho 2014; Elitzur et al. 2014).

The following subsections describe more quantitatively our knowledge of the AGN emission regions that are the most relevant for microlensing studies, as shown in Fig. 3. As direct imaging is not possible, we briefly explain the methods used to build up our understanding of the unified model of AGN structure. Our journey into the heart of AGNs starts with the most compact regions, which are also the most susceptible to microlensing, and ends at the interface between the AGN and its host galaxy, namely the torus and the narrow line region.

2.9.1 The Most Compact Emission

After the first X-ray satellites were launched in the 1970s (Uhuru, Ariel 5, SAS-3, OSO-8), the X-ray spectral properties of quasars and (suspected) black hole X-ray binaries became known, and much work was done to understand their features. In particular, the X-ray continuum power-law spectrum seen in quasars and the high energy emission of black hole X-ray binaries were explained as the Compton upscattering (also called “inverse Compton scattering” or just “Compton scattering” in the literature) of lower energy photons by a population of hot, energetic electrons (e.g. Shapiro et al. 1976; Katz 1976; Pozdnyakov et al. 1976; Galeev et al. 1979; Sunyaev and Truemper 1979; Sunyaev and Titarchuk 1980). As more and higher quality observations of quasars and Seyfert galaxies were obtained by the HEAO I and then EXOSAT observatories, the remarkable similarity of the X-ray spectra (a single power-law continuum with spectral index of ∼0.7; Rothschild et al. 1983; Mushotzky 1984; Turner and Pounds 1989) suggested a common origin. After the Ginga satellite detected the Fe emission line and a broad hump reflection feature (Rothschild et al. 1983) that were predicted by Guilbert and Rees (1988) and Lightman and White (1988), Mushotzky (1984), Turner and Pounds (1989), a “two-phase” AGN model was proposed by Haardt and Maraschi (1991, 1993) in which a hot, tenuous corona exists above the accretion disc and Compton upscatters UV and optical photons from the disc to X-ray energies with a power-law continuum spectral shape. The corona is thought to be supplied with energetic particles accelerated by the magnetic fields anchored in the accretion disc (e.g. Haardt et al. 1994; Di Matteo 1998; Merloni and Fabian 2001). For decades, the geometry and extent of the corona were largely unknown, with arguments that it may be patchy, extend over parts of the inner disc, and vary with time (e.g. Gallo et al. 2015; Wilkins and Gallo 2015). However, spectral and timing studies suggest a compact, centrally located corona (e.g. Brenneman and Reynolds 2006; Fabian et al. 2009, 2013; Parker et al. 2015). Microlensing has strongly constrained the X-ray emitting corona to be compact.

2.9.2 The Accretion Disc

The powerhouse of AGN emission originates from within 10-1000 \(r_{\mathrm {g}}\) (i.e. 0.1-10 lt-days for a black hole mass \(M_{BH} \sim 10^{8} M_{\odot}\)) of the central supermassive black hole. This structure is thought to be a geometrically thin (in the vertical direction) but optically thick disc that is heated locally by the dissipation of gravitational binding energy through accretion of matter (the “thin-disc” model, Lynden-Bell 1969; Shakura and Sunyaev 1973). Ignoring relativistic effects, the temperature profile for such a thin-disc can be expressed as (e.g. Zdziarski et al. 2022):

$$ T(R) \propto R^{-\beta} \left ( 1 - \sqrt{\frac{R_{\mathrm{in}}}{R}} \right )^{1/4}, $$
(36)

where \(R_{\mathrm{in}}\) is the inner edge of the disc, and \(\beta = 3/4\). The inner edge effects are commonly ignored (see Zdziarski et al. 2022, regarding the impact of disc truncation and wind), and the dependence \(T\propto R^{-\beta}\) is often kept as a first order generalisation of the thin-disc.
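To illustrate why the inner edge term of Eq. (36) is often dropped, the sketch below evaluates the profile in arbitrary units. The normalisation `t_norm` and the function names are hypothetical; only the functional form comes from Eq. (36).

```python
import math


def inner_edge_factor(r, r_in):
    """The bracketed term of Eq. (36): (1 - sqrt(R_in / R))^(1/4)."""
    if r <= r_in:
        return 0.0
    return (1.0 - math.sqrt(r_in / r)) ** 0.25


def thin_disc_temperature(r, r_in, beta=0.75, t_norm=1.0):
    """Eq. (36) up to an arbitrary normalisation t_norm."""
    return t_norm * r ** (-beta) * inner_edge_factor(r, r_in)


r_in = 1.0
for r in (2.0, 10.0, 100.0, 1000.0):
    # The factor approaches unity away from the inner edge, leaving T ~ R^-beta.
    print(r, round(inner_edge_factor(r, r_in), 3), thin_disc_temperature(r, r_in))
```

With these definitions, the correction is already below 10% at ten times the inner radius, so the pure power law \(T\propto R^{-\beta}\) is a reasonable first approximation over most of the disc.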

For this model, the radius at which the disc temperature matches the rest wavelength of the observations (\(k T = h_{p} c/ \lambda _{\mathrm{rest}}\)) is:

$$ R_{\lambda}^{\mathrm{{flux}}} \simeq \frac{3.4\times 10^{15}}{\sqrt{\cos i}} \frac{D_{s}}{r_{H}} \left ( \frac{\lambda}{\upmu \mathrm{m}} \right )^{3/2} \left ( \frac{\mathrm{{zpt}}}{3631~\mathrm{Jy}} \right )^{1/2} 10^{-0.2 (m-19)} h^{-1}~\mathrm{cm}, $$
(37)

where \(r_{H}\) is the Hubble radius, \(i\) is the disc inclination angle, \(\mathrm{{zpt}}\) is the zero point of the AB magnitude system,\(^{5}\) and \(m\) is the intrinsic magnitude of the source (Morgan et al. 2010). A handy quantity for comparison with microlensing studies is the half-light radius of the disc. Cornachione and Morgan (2020) refer to the latter as the “luminosity size”, and its expression, based on the observed specific flux per wavelength (in a particular band), \(F_{\lambda , \mathrm{{obs}}}\), is given by:

$$\begin{aligned} R_{\lambda , 1/2}^{\mathrm{{flux}}} & = C(\beta )~R_{\lambda}^{\mathrm{{flux}}} \\ & = C(\beta ) \left ( \frac{\lambda ^{5}_{\mathrm{{obs}}} D^{2}_{L} F_{\lambda , \mathrm{{obs}}}}{4\pi h c^{2} \cos (i)(1+z)^{4}} \right )^{1/2} \times \left ({\int _{u_{\mathrm{in}}}^{\infty} \frac{u \, \mathrm{d}u}{\exp{(u^{\beta})} -1}} \right )^{-1/2}, \end{aligned}$$
(38)

where \(D_{L}\) is the luminosity distance to the quasar, \(u = R / R_{\lambda}^{\mathrm{{flux}}}\), and the factor \(\cos (i)\) accounts for disc inclination. The factor \(C(\beta )\) is required to transform the radius \(R_{\lambda}^{\mathrm{{flux}}}\) into a half-light radius and is given in Li et al. (2019) as a numerical solution to

$$ \int _{u_{\mathrm{in}}}^{C(\beta )} \frac{u\, \mathrm{d}u}{\exp{(u^{\beta})} -1} = \frac{1}{2} \int _{u_{\mathrm{in}}}^{\infty} \frac{u\, \mathrm{d}u}{\exp{(u^{\beta})} -1}. $$
(39)
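A minimal numerical sketch of how Eq. (39) can be solved for \(C(\beta )\) is given below. It uses a plain midpoint rule and assumes \(u_{\mathrm{in}}=0\) by default; the integration limits, step count, and function names are illustrative choices rather than the procedure of Li et al. (2019).

```python
import math


def integrand(u, beta):
    """u / (exp(u^beta) - 1), the integrand of Eq. (39)."""
    x = u ** beta
    if x > 700.0:  # exp() would overflow; the contribution is negligible
        return 0.0
    return u / math.expm1(x)


def half_light_factor(beta=0.75, u_in=0.0, u_max=400.0, n_steps=200_000):
    """Upper limit C(beta) that encloses half of the total integral."""
    du = (u_max - u_in) / n_steps
    # Midpoint rule: the integrand is never evaluated at u = 0 exactly.
    steps = [integrand(u_in + (i + 0.5) * du, beta) * du for i in range(n_steps)]
    half_total = 0.5 * sum(steps)
    running = 0.0
    for i, value in enumerate(steps):
        running += value
        if running >= half_total:
            # Linear interpolation inside the step where the crossing occurs.
            return u_in + (i + 1) * du - du * (running - half_total) / value
    raise RuntimeError("half-light radius not bracketed; increase u_max")


c_thin = half_light_factor(beta=0.75)
print(c_thin)
```

The same function can be evaluated for other slopes \(\beta \) to explore how the conversion to a half-light radius changes for temperature profiles that depart from the thin-disc case.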

For the standard disc, we have \(C(\beta =3/4) = 2.44\). On the other hand, knowing the mass of the central black hole, \(M_{\mathrm{{BH}}}\), and the luminosity, \(L\), the disc half-light radius\(^{6}\) can be calculated as:

$$ R_{\lambda , 1/2}^{\mathrm{{BH}}}=2.37\times 10^{16} \left ( \frac{\lambda _{\mathrm{rest}}}{\upmu \mathrm{m}} \right )^{4/3} \left ( \frac{M_{\mathrm{BH}}}{10^{9} \mathrm{M}_{\odot}} \right )^{2/3} \left ( \frac{L}{\eta L_{\mathrm{E}}} \right )^{1/3}~\mathrm{cm}, $$
(40)

where \(L_{\mathrm{E}}\) is the Eddington luminosity and \(\eta \) is the accretion efficiency (Marconi and Hunt 2003; Graham 2007, 2016). Typical values for these parameters are \(L/L_{\mathrm{E}} \sim 1/3\) and \(\eta =0.1\) (Kollmeier et al. 2006; Shen et al. 2008; Edelson et al. 2015). Although both methods assume the same model, \(R_{\lambda , 1/2}^{\mathrm{{flux}}}\) estimates are a factor of \(\sim 2-3\) smaller than \(R_{\lambda , 1/2}^{\mathrm{{BH}}}\) (Collin et al. 2002; Pooley et al. 2007; Morgan et al. 2010), and the two can be reconciled neither through uncertainties in the measured black hole mass nor, in the case of strongly lensed quasars, through lensing magnification (e.g. Mosquera et al. 2011).
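The two size estimates can be evaluated side by side with a short Python sketch of Eqs. (37) and (40). The function and argument names are hypothetical, the distance ratio \(D_{s}/r_{H}\) has to be supplied by the user from a chosen cosmology, and the default \(h=0.7\) is an assumption.

```python
import math

C_THIN_DISC = 2.44  # C(beta = 3/4), converts R_lambda^flux to a half-light radius


def flux_radius_cm(lam_um, mag, dist_ratio, cos_i=1.0, zpt_jy=3631.0, h=0.7):
    """Eq. (37): flux-based radius in cm.
    lam_um     : wavelength in micron, as it enters Eq. (37)
    mag        : intrinsic (magnification-corrected) magnitude
    dist_ratio : D_s / r_H, source distance over the Hubble radius"""
    return (3.4e15 / math.sqrt(cos_i) * dist_ratio * lam_um ** 1.5
            * math.sqrt(zpt_jy / 3631.0) * 10.0 ** (-0.2 * (mag - 19.0)) / h)


def bh_half_light_radius_cm(lam_rest_um, m_bh_msun, eddington_ratio=1.0 / 3.0, eta=0.1):
    """Eq. (40): half-light radius in cm from the black hole mass and L / L_E."""
    return (2.37e16 * lam_rest_um ** (4.0 / 3.0)
            * (m_bh_msun / 1e9) ** (2.0 / 3.0)
            * (eddington_ratio / eta) ** (1.0 / 3.0))


# Illustrative inputs only; they do not describe a specific quasar.
r_flux_half = C_THIN_DISC * flux_radius_cm(lam_um=0.8, mag=19.0, dist_ratio=0.4)
r_bh_half = bh_half_light_radius_cm(lam_rest_um=0.25, m_bh_msun=1e9)
print(r_flux_half, r_bh_half)
```

Note the different wavelength scalings of the two expressions (\(\lambda ^{3/2}\) at fixed magnitude versus \(\lambda ^{4/3}\)), which should be kept in mind when comparing estimates across bands.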

Extended versions of the standard thin-disc model have been formulated (see e.g. Abramowicz and Fragile 2013; Middleton et al. 2015; Lasota et al. 2015) that include general relativistic corrections, radiative transfer in the disc atmosphere, black hole spin, or disc winds (Novikov and Thorne 1973; Thorne 1974; Hubeny et al. 2000; Sądowski et al. 2011; Davis and Laor 2011). While the standard accretion disc remains the dominant paradigm, a growing number of observations challenge at least its universality. A non-exhaustive list of alternative models that have been proposed includes: advection dominated accretion discs (Ichimaru 1977; Narayan and Yi 1994), slim discs (Abramowicz et al. 1988), inhomogeneous accretion (Dexter and Agol 2011), magnetically arrested discs (Zamaninasab et al. 2014), discs with modified viscosity laws to account for magnetization (Grzędzielski et al. 2017), torn discs (Nixon et al. 2012; Hall et al. 2014), and puffy discs (Lančová et al. 2019; Wielgus et al. 2022).

Constraints on the accretion disc temperature profile have been achieved over the last decade by continuum reverberation mapping. This technique consists of measuring how the UV-optical continuum responds to variations of the X-ray emission. The time lag of this response, \(\tau \), measured in multiple bands, can be translated into an increase in the size of the emission region with wavelength. The data accumulated for a growing number of AGN indicate that the disc size varies as expected for the thin-disc model of Shakura and Sunyaev (1973). Uncertainties on the slope of the temperature profile are, however, still too large to robustly rule out alternatives. In addition, the disc size is found to be larger than predicted by the model (e.g. Edelson et al. 2015; Cackett et al. 2020; Guo et al. 2022). Diffuse continuum emission originating from the inner part of the BLR (such as Balmer and Paschen emission; Korista et al. 1995; Korista and Goad 2001; Gardner and Done 2017) may explain the excess UV-optical size in several reverberation mapped systems, but other sources of diffuse emission (e.g. scattering, thick gas clouds, etc.) may also be present in AGN (e.g. Lawrence 2012; Lawther et al. 2023). All in all, our theoretical understanding of accretion discs is under tension. On the one hand, theoretical developments indicate that the accretion disc should not be universally described by a thin-disc model. On the other hand, the thin-disc cannot explain all the observations: the size is found to be larger than expected, but its dependence on wavelength follows the \(R \propto \lambda ^{4/3}\) relation from the standard model. Unfortunately, firm conclusions are hard to draw due to the blending of the disc emission with diffuse pseudo-continuum emission of debated nature.

2.9.3 Intermediate Sizes: The Broad Line Region

One of the most salient features detected in UV-optical quasar spectra is the presence of broad emission lines. They arise mostly from hydrogen and helium recombination lines, but also from permitted and semi-forbidden lines, such as C iv and C iii], and from complex multiplets of Fe ii. These lines, observed over 7 orders of magnitude in AGN luminosity, arise from the Broad Line Region (BLR), a region with a radius ranging from ∼ 4 up to 100 times larger than the accretion disc, possibly originating in the form of a clumpy wind launched from the latter (see Fig. 3 and Elitzur et al. 2014; Czerny et al. 2017). The BLR is an important probe of the physical conditions (e.g. gas density, hydrogen column density, metallicity, …) prevailing in the direct vicinity of the black hole. It is composed of a large number of clouds or of an inhomogeneous/clumpy wind of gas, photoionized by the continuum emission. As is the case for the accretion disc, the BLR is also too compact to be spatially resolved with current telescopes. Therefore, our knowledge of the geometry (e.g. disc-like, bi-conic, bowl-like) and kinematics (Keplerian rotation, inflow, outflow) of BLR gas is still limited despite more than fifty years of investigation. The physical conditions in the BLR are explored through advanced photoionization codes, such as the state-of-the-art code CLOUDY (Ferland et al. 2013). This code enables one to reproduce most of the line ratios observed in AGN and has improved our understanding of the chemical evolution of AGN host galaxies (Nagao et al. 2006). Some challenges remain, however, such as reproducing the iron multiplet or the ratio between optical and UV Fe ii emission, which shows a large scatter at the population level (Ferland et al. 2009; Sarkar et al. 2021). Our inability to accurately reproduce this major component of AGN spectra is likely linked to uncertainties on the distribution and kinematics of the iron emission in the BLR (Ferland et al. 2009).

Most of our knowledge of the structure of the BLR comes from the reverberation mapping technique (Blandford and McKee 1982). This technique has enabled the measurement of the luminosity-weighted distance, \(R_{\mathrm{BLR}}\), of the BLR from the continuum for the H\(\beta \) line in about 70 local AGN (Peterson et al. 1985; Horne et al. 1991; Kaspi et al. 2000; Bentz et al. 2013; Bentz and Katz 2015). This distance scales with the square root of the quasar luminosity, confirming that the BLR gas is mostly photoionized. Time lags for UV emission lines have been harder to obtain. The Mg ii lags have been measured for about 50 AGN and the luminosity-size relation agrees with the one obtained for H\(\beta \) (Homayouni et al. 2020; Yu et al. 2023). Time lags for higher ionization lines, such as C iv or He ii, have been measured for about 50 systems (Peterson et al. 2005; Kaspi et al. 2007; Grier et al. 2019) and found to be systematically shorter than for H\(\beta \), indicating a stratification of the BLR (i.e. high ionization gas is found closer to the continuum). The line broadening of the (optical) Fe ii, as well as reverberation mapping data of a small sample of local Seyfert galaxies, indicate that this blend arises from a region at least as large as the Balmer BLR (e.g. Barth et al. 2013; Hu et al. 2015).

Information on the geometry and kinematics of the BLR is more difficult to obtain. Direct modeling of the lines using radiative transfer codes can successfully reproduce their shapes (Murray and Chiang 1997; Borguet and Hutsemékers 2010; Higginbottom et al. 2014). However, the same line shape can be reproduced by a variety of geometries and kinematics of the emitting region, limiting the usefulness of this approach in constraining BLR structure. Spectro-polarimetry of emission lines has been a useful complementary probe, revealing, for instance, the rotating disc-structure of H\(\alpha \) emission in some Seyfert galaxies (Smith et al. 2005). But such measurements are limited to significantly polarized AGN, and results arguably depend on the (poorly known) location of the scattering region at the origin of the polarization. Velocity-resolved delay maps, i.e. measurements of the reverberation time-lag for various velocity slices in the line, are another indirect probe of BLR structure (Horne et al. 1991; Peterson and Wandel 1999). Results obtained for a few dozen systems are often difficult to interpret due to the unknown transfer function that encodes the time-delay distribution across the broad line as a function of the line-of-sight velocity (Villafaña et al. 2022, and references therein). The more direct method developed by Pancoast et al. (2011) does not require knowledge of the transfer function but has other limitations, especially in the presence of complex variability features (see e.g. Li et al. 2013; Pancoast et al. 2014; Bentz et al. 2021). Overall, the data indicate a disc-like geometry for the region emitting the H\(\beta \) line (which is the best studied line) but suggest a surprising diversity of BLR kinematics, including Keplerian discs, inflows and outflows.

2.9.4 Further Out: The Torus, NLR, and the Radio Domain

Above \(\sim 1.5~\upmu \text{m}\) (rest-frame), the contribution of the accretion disc to the continuum emission becomes subdominant compared to the emission from the dust. Observations of dozens of local AGN indicate that the near/mid-infrared dust emission arises from two regions: a “compact” region characterised by hot/warm dust temperature (\(T \sim 1400\text{ K}\)), and a colder component with \(T \sim 300\text{ K}\) that may be 100 to 1000 times more extended than the hotter emission (Kishimoto et al. 2011a). Interferometric studies of local AGN suggest that the ratio of effective areas between these two components is of the order of 400, such that cold emission dominates above typically \(7~\upmu \text{m}\), while hot emission peaks around \(2~\upmu \text{m}\) (Kishimoto et al. 2011b). The latter component is the only one compact enough to be susceptible to microlensing. Dust reverberation mapping studies show that the \(K\)-band reverberation radius (dominated by dust emission) scales with the square root of the luminosity, establishing a solid dependence of the dust emission on AGN luminosity (e.g. Suganuma et al. 2006; Koshida et al. 2014; Yang et al. 2020). This reverberation radius can be used as the foundation of a ring-like model of the torus by fitting it as a function of luminosity. Kishimoto et al. (2007) derived the scaling of the inner radius of the torus as a function of the luminosity in the V-band, i.e. \(\nu L_{\nu}(5500~\text{\r{A}})\), as:

$$ R_{\mathrm{in}} = 0.47 \left ( \frac{ 6 \nu L_{\nu}(5500~\text{\r{A}})}{10^{46}~\mathrm{erg}/\mathrm{s}} \right )^{1/2}~\mathrm{pc}. $$
(41)
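As a small worked example of this scaling, the sketch below evaluates the inner torus radius, assuming the square-root dependence on luminosity established by the dust reverberation results mentioned above. The function names and the parsec to light-day conversion (rounded) are illustrative.

```python
import math

PC_IN_LIGHT_DAYS = 1191.0  # approximate number of light-days in one parsec


def torus_inner_radius_pc(nu_l_nu_5500):
    """Inner torus radius in pc for a V-band luminosity nu L_nu(5500 A) in erg/s,
    assuming R_in scales with the square root of the luminosity."""
    return 0.47 * math.sqrt(6.0 * nu_l_nu_5500 / 1e46)


def torus_inner_radius_lt_days(nu_l_nu_5500):
    """Same radius expressed in light-days, for comparison with R_E."""
    return torus_inner_radius_pc(nu_l_nu_5500) * PC_IN_LIGHT_DAYS


for lum in (1e44, 1e45, 1e46):
    print(lum, torus_inner_radius_pc(lum), torus_inner_radius_lt_days(lum))
```

Expressing the radius in light-days makes it straightforward to compare with the typical Einstein radius of a few light-days quoted in Sect. 2.9.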

The remaining model components are the surface brightness and outer radius. The surface brightness is observationally unknown but theoretical arguments favour a power-law decrease (Barvainis 1987). Interferometric data suggest that \(R_{\mathrm{out}}/R_{\mathrm{in}} \leq 2\) at 2.2 μm and reach a factor of several at longer wavelengths (Kishimoto et al. 2007).

Finally, we stress that the geometry and clumpiness of the dust emitting region are still debated. A torus-like structure has been favoured for decades as it provides strong support to the orientation-based AGN manifestations within the framework of the unified model. However, a disc-like ring with dust winds launched from it is suggested in some systems instead of a torus (e.g. Pfuhl et al. 2020). Overall, many open questions remain regarding the dust emitting region. Depending on the AGN luminosity and torus properties, the near-/mid-infrared AGN emission region can be small enough to be slightly affected by microlensing.

The narrow emission lines (with FWHM \(\leq 800~\mathrm{km}\,\mathrm{s}^{-1}\)) commonly detected in quasar spectra either arise from the AGN host galaxy or from the Narrow Line Region (NLR). The gas in the NLR is photo-ionised, while narrow emission in the host can be associated, for example, with star formation and may therefore appear to be spatially offset with respect to the NLR (and possibly spatially resolved in high resolution images). The gas in the NLR is exterior to the torus, reaching tens to thousands of parsecs (depending on the luminosity), and can also be spatially resolved (Bennert et al. 2002; Dempsey and Zakamska 2018). The frequent asymmetry in the shape of narrow lines, in particular of [O iii] \(\lambda \lambda 4959,5007~\text{\r{A}}\), indicates the common presence of outflowing material that can regulate stellar activity in the host (e.g. Speranza et al. 2021). Due to these characteristics, the NLR is expected to be too large to undergo any microlensing.

AGN emission at radio wavelengths reveals a dichotomy whose origin (and even existence) is still debated. About 15% of AGN are radio-loud, while the remainder are classified as radio-quiet. The emission from radio-loud quasars is mostly due to synchrotron emission. Interferometric radio data often reveal an unresolved core (at sub-parsec scales) and multiple components of a jet that sometimes extends over kiloparsecs. The strongest radio emission is often observed in blazars and flat spectrum systems, whose (often relativistic) jet is aligned with our line of sight. Quasars classified as radio-quiet can still have some energy emitted in the radio. Thermal free-free emission (i.e. bremsstrahlung) from either a stellar or AGN component is possible. Additionally, magnetic heating of the corona, thin free-free emission from winds, or synchrotron emission from the base of a small jet or particles accelerated in shocks are also plausible mechanisms of this radio emission (see e.g. Silpa et al. 2020, and references therein). The radio emission from AGN is commonly assumed to be insensitive to microlensing, but this complex picture of radio AGN does not fully support that assumption. The existence of radio variability on time scales of a few days (e.g. Biggs and Browne 2018) indicates that regions compact enough to be microlensed should contribute a substantial fraction of the radio emission. Observation of microlensed radio emission of AGN remains, however, elusive (Koopmans and Bruyn 2000; Biggs 2023).

2.10 Microlensing and the Quasar Structure

The main reason a detailed physical picture of AGN structure is not yet in place is our inability to spatially resolve their innermost regions. “Direct” imaging of the vicinity of a supermassive black hole has been possible only for a handful of nearby systems thanks to the Event Horizon Telescope (EHT), which required a tremendous technical and observational effort to turn our planet into a giant radio interferometer. This enabled the reconstruction of an image of the shadow of the black hole in M87 and of the one lying in the center of our own Galaxy (Event Horizon Telescope Collaboration et al. 2019, 2022). While the EHT has been used to observe some distant quasars (\(z > 0.5\)) at 20 \(\mu \)arcsec resolution at 230 GHz (Jorstad et al. 2023), we are still far from reaching such a spatial resolution for a sizeable sample of systems over the whole electromagnetic spectrum. Our most powerful near-infrared interferometers resolve AGN on scales of ∼ 1 pc, corresponding to emission from the “dust torus” (Amorim et al. 2021), but we are still far from getting a comprehensive view of the innermost parsec.

The fortunate match between the microlensing Einstein radius and the size of the otherwise unresolved AGN regions represents an opportunity to shed light on quasar structure. Microlensing (de)magnification directly scales with source size (see next section), turning variable signals in time and wavelength into an astrophysical ruler that enables a sensitive measurement of the heart of quasars at multiple scales. For instance, it is the only technique that enables a measurement of the size of the X-ray continuum emission. Contrary to reverberation mapping, which is hampered by relativistic time dilation at high redshift, microlensing does not rely on observing the intrinsic quasar variability. It is therefore particularly suitable to probe the size of distant quasar emitting regions, nicely complementing reverberation mapping measurements in the local Universe. Differential microlensing between regions of different sizes also provides a tool to study finer properties, like the temperature profile of the disc or the geometry and kinematics of the BLR. Here as well, microlensing nicely complements standard techniques (e.g. reverberation mapping, photoionization, line shape modeling) that rely on different working assumptions. At the scale of \(\gtrsim 0.1\text{ pc}\), the radial structure of the hot torus may also potentially be constrained by microlensing (or its absence). The possibility of using microlensing to zoom in on the radio and sub-millimeter emission regions is less clear, but observations hint that microlensing effects at these wavelengths may not be immediately excluded. A presentation of the main results obtained with microlensing techniques is given in Sect. 4.

3 Methods

Microlensing offers two main methods to probe quasar structure and the partition of matter in the lens. The first one, known as the “single-epoch method” (described in Sect. 3.3), takes advantage of the differential microlensing occurring between regions of different sizes. For instance, our basic understanding that inner (outer) parts of accretion discs are hotter (cooler) and hence emit in bluer (redder) wavelengths (see Fig. 3 and Eqs. (37) and (40)) postulates that microlensing magnification measured at the same epoch will depend on wavelength. As we can anticipate from the structure of AGNs (Sect. 2.9), the wavelength dependence of microlensing is not restricted to the disc. The second method, known as the “light curve method” (described in Sect. 3.4), analyses the amplitude and rate of the time variability of the microlensing effect. The timescale of variability is shorter for the smaller regions, explaining why it is mostly applied to study the X-ray (corona) and optical continuum (disc) regions. Other methods, generally sharing concepts with the single-epoch and light curve methods, exist, but they are usually tailored to a particular type of data or scenario. A non-exhaustive overview of some selected such methods is given in Sect. 3.5.

All these techniques require identifying the presence of microlensing in the data and measuring its amplitude. In order to infer properties of the source or the lens, a forward modeling method is generally followed that simulates microlensing data and compares them to the observations. Section 3.1 presents important aspects that need to be considered upon designing the simulations, while Sect. 3.2 explains how the microlensing signal can be extracted from the data in the most common cases.

3.1 General Modeling Considerations

Microlensing variability depends most crucially on the size of different emitting regions of the quasar with respect to the Einstein radius of the microlenses (see Eq. (7)). In order to simulate the magnification for any size and shape of the source one simply needs to convolve a magnification map with the source’s brightness profile. The magnification range produced by the stars in the lens galaxy increases as the source becomes more compact, which is imprinted on all kinds of microlensing data, i.e. flux ratios, light curves, spectra, and high magnification events (see Table 1). An example for the case of light curves is shown in Fig. 15, where we see a clear dependence primarily on the size of the source – short dramatic changes and smooth extended variations for small and large sources respectively – and to second order on its shape. Another example is illustrated in Fig. 16, where the magnification probability distribution is shown for a “point source” (\(< 0.0025~\theta _{\mathrm{E}}\), see below), and three Gaussian brightness profiles of increasing size. In fact, it has been shown (Mortonson et al. 2005; Vernardos 2019) that it is mainly the half-light radius of the source, \(r_{1/2}\), that determines the extent of microlensing effects, while its detailed projected two-dimensional shape plays a secondary role (as long as \(r_{1/2}\) is kept the same). For this reason, a common choice is a Gaussian luminosity profile for the source (\(r_{1/2}= 1.18 \sigma \)). In a similar way, a common choice of the dependence of size on wavelength, particularly for the accretion disc, is the parametric model:

$$ r_{1/2}= r_{0} \left ( \frac{\lambda _{\mathrm{{rest}}}}{\lambda _{0}} \right )^{1/\beta}, $$
(42)

where \(\lambda _{0}\) and \(r_{0}\) are a reference wavelength and size, and \(\beta \) is the slope of the temperature profile. This allows for more flexibility, as opposed to, for example, the thin-disc model that has a fixed temperature slope of \(\beta = 3/4\) (and depends on other physical parameters like the black hole mass, etc, see Eq. (40)).
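As a minimal sketch of Eq. (42), assuming sizes in units of \(\theta _{\mathrm{E}}\) and rest-frame wavelengths in Å, the half-light radius and the corresponding Gaussian width can be evaluated as follows. The function names and the reference values (\(r_{0} = 0.1~\theta _{\mathrm{E}}\) at 4000 Å) are illustrative assumptions, not values for a specific system.

```python
import numpy as np

def half_light_radius(lam_rest, r0, lam0, beta):
    """Eq. (42): r_half = r0 * (lam_rest / lam0)**(1 / beta)."""
    lam_rest = np.asarray(lam_rest, dtype=float)
    return r0 * (lam_rest / lam0) ** (1.0 / beta)

def gaussian_sigma(r_half):
    """Sigma of a 2D Gaussian profile with the given half-light radius.

    For a 2D Gaussian, r_half = sqrt(2 ln 2) * sigma, i.e. about 1.18 sigma.
    """
    return r_half / np.sqrt(2.0 * np.log(2.0))

# Illustrative (assumed) reference size: 0.1 theta_E at 4000 Angstrom.
lam = np.array([4000.0, 6000.0])
r_thin = half_light_radius(lam, r0=0.1, lam0=4000.0, beta=0.75)  # thin-disc slope
sigma_thin = gaussian_sigma(r_thin)
```

With \(\beta = 3/4\) the size grows as \(\lambda ^{4/3}\), so in this sketch the source is about 1.7 times larger at 6000 Å than at 4000 Å.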

Fig. 15
figure 15

Light curves for sources of different size and shape moving along the trajectory shown in the right panel of Fig. 4 and in Fig. 10 (from bottom to top). When the point source (gray curve) enters and exits the areas bordered by caustic lines, characteristic double-horned profiles can be seen. Larger sources lead to smoother curves without the extremely magnified peaks, as it can be seen by the Gaussian-shaped sources with half-light radii at 5 (blue) and 30 (red) per cent of the Einstein radius. The shape of the source plays a secondary role and only modulates small scale features in the light curves, as demonstrated by a uniform and irregular disc (the latter being simply a greyscale, black background photo of the face of one of the authors) with the same half-light radii as the Gaussian profiles. Statistical errors (noise) introduced by the inverse ray-shooting technique in the magnification map pixels can be seen for the point source, which are averaged out as soon as the source has a finite size of just a few pixels. The source position marked with the blue cross in the right panel of Fig. 4 is also indicated here

Fig. 16
figure 16

Magnification probability distributions from the maps corresponding to the multiple images of RXJ 1131−1231 (shown in Fig. 17) for a point source (\(< 0.0025~R_{\mathrm{E}}\), grey lines) and three Gaussian brightness profiles with sizes of 0.117 (blue), 0.2 (red) and 1.6 (cyan) \(R_{\mathrm{E}}\), respectively (matching the source sizes shown in Fig. 19). The vertical lines indicate the macromagnification \(\langle \mu \rangle \). We can see that the larger the source the smaller the extent of the deviations from the macromagnification due to microlensing. However, in this example we used the same \(25^{2}\)-\(\theta _{\mathrm{E}}\), \(10,000^{2}\)-pixel maps for all the sources, which have a sufficiently large resolution compared to the size of the smallest sources (blue and red circles on Fig. 19) but are probably not wide enough to result in unbiased magnifications for the largest source. The high resolution is not needed in this case because of the smoothness of the chosen Gaussian profiles

Table 1 Summary of the most common types of data sets used depending on the microlensing application. Temperature and time delay are denoted as \(T\) and \(\varDelta t\)

A macroscopic model of the lensing galaxy defines the general properties of microlensing variability. It is the starting point of almost all of the methods used to infer physical quantities from a microlensing signal. The macromodel values of \(\kappa \), \(\gamma \), \(\kappa _{*}\) at the locations of the observed multiple images (see also Shajib et al. 2024) are used to obtain corresponding microlensing magnification maps, as illustrated in Fig. 17 for a quad. Such maps are the indispensable tool to model microlensing variability and are computed with one of the techniques presented in Sect. 2.7. Because the \(\kappa \), \(\gamma \), \(\kappa _{*}\) parameters only define the microlenses at the population level, e.g. they do not define the positions and masses of the microlenses uniquely, each map is one of manyFootnote 7 random and equivalent realizations that can be used interchangeably. This is important because maps have a finite size and resolution that can be limiting depending on the application, hence using equivalent maps can help. In addition, averaging over multiple realizations of maps can help avoid the sample variance in the histogram from using a single map.
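The averaging over equivalent realizations can be sketched as below. This is only an illustration: the "maps" are log-normal random arrays standing in for real magnification maps (which would come from one of the techniques of Sect. 2.7), and the function name and binning are assumptions.

```python
import numpy as np

def mean_magnification_pdf(maps, bins):
    """Average the normalised histogram of log10(mu) over equivalent maps.

    Returns the mean histogram and its map-to-map scatter, which gives a
    rough handle on the sample variance of a single map.
    """
    pdfs = []
    for mag_map in maps:
        hist, _ = np.histogram(np.log10(mag_map).ravel(), bins=bins, density=True)
        pdfs.append(hist)
    pdfs = np.array(pdfs)
    return pdfs.mean(axis=0), pdfs.std(axis=0)

rng = np.random.default_rng(0)
# Stand-in data (assumption): log-normal random fields, NOT real microlensing maps.
maps = [rng.lognormal(mean=0.0, sigma=0.5, size=(200, 200)) for _ in range(5)]
bins = np.linspace(-1.5, 1.5, 31)
pdf_mean, pdf_scatter = mean_magnification_pdf(maps, bins)
```

The scatter between realizations indicates how much a histogram built from a single map could differ from the population average.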

Fig. 17
figure 17

Lens model of quadruply lensed quasar RXJ 1131−1231 taken from Chen et al. (2016) and corresponding magnification maps at the locations of the quasar multiple images. The maps are consistent with the mass model, i.e. have the corresponding \(\kappa \), \(\kappa _{*}\), \(\gamma \) (see Table 1 in Vernardos 2022) and are aligned with the local total shear vector. A trial trajectory for a quasar moving (from right to left) across the source plane is shown on each map (black line with length of \(\approx 0.5~\theta _{\mathrm{E}}\); the corresponding microlensing light curves are shown in Fig. 18). Adapted from Vernardos (2022)

A map constitutes a finite region of the source plane and has to be large enough to be representative, i.e. cover enough “troughs” and clusters of superimposed caustics (see Fig. 12) in order not to be biased towards lower or higher magnification.Footnote 8 The higher the resolution of the map compared to \(\theta _{\mathrm{E}}\), the better the pixels approximate the size of a point source. For pixel sizes of \(< 0.006~\theta _{\mathrm{E}}\) (practically point sources), Vernardos and Fluke (2013) found that the minimum size for unbiased maps is \(24\times 24\) \(\theta _{\mathrm{E}}\). For larger sources, there are two reasons for having an adequately sized map: avoiding convolution edge effects and covering enough area for unbiased magnifications, i.e. mitigating the sample variance (e.g. studies of the BLR usually use maps whose size can be up to 50-200 \(\theta _{\mathrm{E}}\)). Statistical studies need to sample the stochastically varying magnification field in an unbiased way. It should be noted that locations that are within a source-size from each other become correlated via convolution.Footnote 9 This correlation may be desirable when studying time series, whose values are not independent from each other. The number of sources that can fit in a map of given size without overlap decreases with the source size and the more overlap between source positions the more correlated their magnification values become. The issue merely reflects the already mentioned problem of sample variance: if a few, long-range features dominate a magnification map, even after smoothing via convolution with a large source, then larger or statistically equivalent maps should be used in order to obtain unbiased magnifications.
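A hedged sketch of the convolution step and of the edge effects mentioned above is given below. It uses an FFT-based (periodic) convolution with a Gaussian source and then discards a margin of a few \(\sigma \) around the map border, where the wrap-around contaminates the result. The function names and the choice of a 4\(\sigma \) margin are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(shape, sigma_pix):
    """Unit-sum Gaussian kernel centred on pixel (0, 0) with periodic wrapping."""
    ny, nx = shape
    y = np.fft.fftfreq(ny) * ny  # pixel offsets 0, 1, ..., -2, -1
    x = np.fft.fftfreq(nx) * nx
    yy, xx = np.meshgrid(y, x, indexing="ij")
    kernel = np.exp(-0.5 * (xx**2 + yy**2) / sigma_pix**2)
    return kernel / kernel.sum()

def convolve_map(mag_map, r_half_pix, n_sigma=4):
    """Convolve a map with a Gaussian source and crop the wrapped border."""
    sigma = r_half_pix / np.sqrt(2.0 * np.log(2.0))  # r_half ~ 1.18 sigma
    pad = int(np.ceil(n_sigma * sigma))
    if 2 * pad >= min(mag_map.shape):
        raise ValueError("map too small for this source size")
    kernel = gaussian_kernel(mag_map.shape, sigma)
    product = np.fft.rfft2(mag_map) * np.fft.rfft2(kernel)
    conv = np.fft.irfft2(product, s=mag_map.shape)
    # Pixels closer than `pad` to the border mix opposite sides of the map.
    return conv[pad:-pad, pad:-pad]

rng = np.random.default_rng(1)
toy_map = rng.lognormal(size=(64, 64))  # stand-in for a real magnification map
smooth = convolve_map(toy_map, r_half_pix=3.0)
```

The usable area shrinks as the source grows, which is one reason why large sources call for larger maps.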

Once the map size is set, the pixel size should be small enough (i.e. the resolution high enough) that the smallest source is still covered by a few map pixels. This rule of thumb avoids the extreme (de)magnifications that can appear if a source is equal to or smaller than a map pixel.Footnote 10 Similarly, because each pixel is a numerical approximation of the magnification by a finite number of rays, statistical (Poisson) fluctuations average out over a few pixels. Both of these effects can be seen in the light curves shown in Figs. 15 and 18. In fact, the pixel size should be small enough to resolve any part of the source that contains a large fraction of the total flux (e.g. a hotspot within an accretion disc). Conversely, if there are mostly small brightness gradients over the parts of the source that emit most of the flux, then there is no need for high resolution. This is the case for the magnification of the BLR, which is smoothed out due to its extent over several \(\theta _{\mathrm{E}}\). In the examples shown in Fig. 16, the probability distributions for the largest source could be biased; in such cases, multi-scale, multi-resolution maps cannot be avoided, at the cost of having to ensure that they are consistent with each other.
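The rule of thumb on pixel size can be written as a simple check. In this sketch the threshold of three pixels per half-light radius is an assumed reading of "a few pixels", and the function names are hypothetical. The map dimensions are those quoted for Fig. 16.

```python
def pixel_size(map_width, n_pix):
    """Pixel size in the same units as the map width (e.g. theta_E)."""
    return map_width / n_pix

def source_is_resolved(r_half, map_width, n_pix, min_pixels=3):
    """True if the half-light radius spans at least `min_pixels` pixels.

    `min_pixels=3` is an assumed value for 'a few pixels'.
    """
    return r_half / pixel_size(map_width, n_pix) >= min_pixels

# Map quoted for Fig. 16: 25 theta_E on a side, 10,000 pixels on a side.
pix = pixel_size(25.0, 10_000)                      # 0.0025 theta_E per pixel
ok_small = source_is_resolved(0.117, 25.0, 10_000)  # smallest Gaussian source
ok_point = source_is_resolved(0.0025, 25.0, 10_000) # one-pixel "point source"
```

In this example the smallest Gaussian source spans tens of pixels, while a source of one pixel fails the check and is subject to the pixel-level noise described above.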

Fig. 18
figure 18

Realization of magnification as a function of time for each multiple image of RXJ 1131−1231 and corresponding zoomed-in region of the magnification maps shown in Fig. 17. The source trajectory is indicated by the arrow (same as the trajectories in Fig. 17). The horizontal dashed lines correspond to the macromagnification of each image. The shaded areas indicate the Poisson error (after convolution, Vernardos et al. 2015) of the magnification resulting from the numerical inverse ray-shooting technique (Kayser et al. 1986) that was used to generate the maps

3.2 General Observational Considerations

At any given time, there is a high probability that microlensing takes place in at least one image of a lensed quasar due to the high optical depth for microlensing. This probability is the largest for quads, where there is almost always one image that is subject to substantial microlensing (Witt et al. 1995). The presence of microlensing can be assessed by measuring fluxes, in one or more bands, or spectral ratios, at a single or at multiple epochs, as summarized in Table 1, and then comparing these measurements to the microlensing-free case. The latter serves as a “micro-lensing zero-point” or “no-microlensing baseline”, and has to be determined from ancillary data or models. However, without additional information (e.g. the time delay, see below), it is difficult to obtain this baseline for single images or to disentangle the intrinsic variability of the quasar. For this reason, multiple images are considered in pairs.

For an image \(i\), the magnification from the macro-model \(M_{i}\) (Eq. (14)), can be used to derive the baseline magnification ratio between a pair of images (\(i\), \(j\)), i.e. \(M = M_{i} / M_{j}\). There is, however, no a priori guarantee that the macromodel captures the full complexity of the system. For instance, the presence of a nearby satellite galaxy may produce a flux ratio anomaly compared to a standard macro-model (e.g. Vegetti et al. 2024). If the latter is not accounted for, the estimated amplitude of microlensing from the data will be biased. For that reason, one can alternatively derive the amplitude of microlensing directly from the data, as explained in Sect. 3.2.1 and Sect. 3.2.2.

The requirement to compare pairs of lensed images implies that the interpretation of the microlensing signal is in general ambiguous: either one image is magnified or the other one is demagnified. It can also be that microlensing is taking place simultaneously in both images in a pair. The comparison of multiple pairs of images in quads helps to resolve this ambiguity. The combination of different kinds of data, such as time-series and multi-wavelength data, may also be helpful for determining a plausible microlensing scenario in a lensed system.

One can distinguish three main strategies for carrying out observations suitable for microlensing studies (Table 1): snapshot data, long-term monitoring, and targeted monitoring triggered based on specific criteria. The single-epoch method (see Sect. 3.3) is observationally the simplest one, as it consists of one-time observations of a system (snapshots) in many wavelengths. Flux ratios are assumed to have been corrected for time delays, otherwise, it is necessary to simulate the impact of intrinsic variability, either by theoretical models (e.g. Vernardos et al. 2023, submitted) or by means of Monte-Carlo simulations. Long-term monitoring data have commonly been a by-product of campaigns aiming at measuring the time delay between lensed images (e.g. Tewes et al. 2013, and Birrer et al. 2024). Once corrected for the time delay, these light curves can be used to extract microlensing signals. Surveys observing regularly the same region of the sky, like the Rubin Observatory Legacy Survey of Space and Time (LSST), will yield an increase in the availability of such data in the next decade. Finally, there is a category of suitably timed observations targeting specific systems that are undergoing High Magnification Events (HME). These are generally defined as sharp increases of brightness (e.g. \(>1\) mag in the optical) in one of the images in a pair over a period of a few weeks to months or even years, owing to a very compact region of the quasar, most likely very close to the SMBH, going through a very high magnification region. These events are usually attributed to single caustic crossings, although more complex situations can produce similar events (see Neira et al. 2023, submitted, and Fig. 4). Triggering such follow-up observations is not easy due to the large number of systems that have to be monitored in order to capture the onset of an event. 
Although a handful of candidate HMEs have been captured by dedicated campaigns, we expect a few hundred events per year to be observed by surveys like LSST (Neira et al. 2020).

3.2.1 Measuring Microlensing from Imaging Data

Good proxies of the microlensing-free flux ratios can be measured in regions that are considered large enough to be free of microlensing, i.e. tens of \(\theta _{\mathrm{E}}\) (see Refsdal and Stabell 1997, for an estimate of the predicted amplitude of microlensing fluctuations for large sources). As explained in Sect. 2.9.4, a sweet-spot may be the radio and mid-infrared ranges. While those regions may be the ones less affected, it is not guaranteed that they are totally free of microlensing. For instance, microlensing is suspected to occasionally occur at radio wavelengths (Koopmans and Bruyn 2000; Biggs 2023), but also radio emission may be spatially offset from the optical disc or blend multiple sources of emission only discernible with very high resolution interferometry (e.g. Hartley et al. 2019; Pashchenko et al. 2020). In the mid-infrared range, one needs to disentangle between different emission regions to derive flux ratios totally free of microlensing (Stalevski et al. 2012; Sluse et al. 2013). Rest-frame emission with \(\lambda > 11~\upmu \text{m}\) is probably the least susceptible to microlensing (Sect. 2.9.4), but is observationally out of reach for quasars at \(z \gtrsim 2\). When the source size gets too large, one should also account for finite size effects arising because the macro-magnification is not constant over the source profile. In some rare situations, the source may even cross a macro-caustic introducing more dramatic changes of flux ratios but generally accompanied by spatially resolved emission.

When spectra are available, NLR flux ratios may also be used (see Sect. 3.2.2), however, they may sometimes deviate from the baseline and become spatially resolved (e.g. Sluse et al. 2007). Because of small asymmetries in the emission region, there can be a luminosity centroid that is slightly offset compared to the emission from the accretion disc. The differential magnification between the disc and the NLR due to strong lensing may yield a small bias on flux ratio measurements. Even in the case where a NLR brightness peak is in projection coincident with the location of the disc, biases can occur due to the way the flux ratio is obtained. The measurement needs to be performed over the whole resolved emission from the NLR, and a correction for source size effects associated to strong lensing has to be applied.

It is important to keep in mind that other effects can yield a deviation of monochromatic flux ratios from the baseline. The intrinsic quasar variability, as well as the differential extinction caused by different amounts of dust along the path of individual lensed images, inevitably introduce additional uncertainties. The former can be accounted for by observing a system at multiple epochs separated by the time delay, while the latter requires multi-wavelength data to be corrected for (see Sect. 3.2.3).

3.2.2 Measuring Microlensing in Spectra

The spectra of quasars combine emissions arising from regions of different sizes, which are therefore subject to different amounts of microlensing. Consequently, the shape of a microlensed spectrum will be deformed compared to the intrinsic one. In general, at a given epoch, the smallest region will be more affected than more extended regions.Footnote 11 This basic rule helps to anticipate the shape of a spectral deformation.

The simplest diagnostic for the presence of microlensing in spectra consists of calculating spectral ratios between pairs of gravitationally lensed images (see also Fig. 19). Let us consider the spectra of two lensed images of an AGN, \(F_{1}(\lambda )\) and \(F_{2}(\lambda )\), as observed in the optical rest-frame range at two epochs separated by the time delay. In absence of microlensing, those two images are simply magnified by the macro-magnification \(M_{1}\) and \(M_{2}\). The spectra may contain emission from the power-law continuum,Footnote 12 which arises from the compact accretion disc with size of typically \(\sim 0.1~\theta _{\mathrm{E}}\), emission from broad lines originating from the BLR with size \(\sim 2~\theta _{\mathrm{E}}\), and emission from narrow emission lines that arise from regions with size \(\sim 50~\theta _{\mathrm{E}}\) (Fig. 3). In a no-microlensing scenario, we should have \(M(\lambda ) = F_{1} / F_{2} = M_{1} / M_{2}\) being a constant across wavelengths. If extinction due to the lensing galaxy is present, \(M(\lambda )\) is not constant anymore and differs from the macro-magnification M (Sect. 3.2.3).

Fig. 19
figure 19

Illustration of the spectral deformation introduced by microlensing on an AGN spectrum. Panel (a): The rest-frame range 4000-6000 Å is shown. The emission components considered are a power-law continuum, a broad \(H\beta~\lambda \) 4864 Å emission arising from the BLR, and 3 narrow emission lines arising from the NLR and associated to \(H\beta \) and [O iii] \(\lambda\lambda 4959\), 5007. The broad line components (i.e. after continuum subtraction) are shown with colored solid lines. Panel (b): The colored circles correspond to \(r_{1/2}\), of the continuum emission at 4000 Å (blue; \(r_{1/2}= 0.117~R_{\mathrm{E}}\)), 6000 Å (red; \(r_{1/2}= 0.2~R_{\mathrm{E}}\)), and the BLR (turquoise; \(r_{1/2}= 1.6~R_{\mathrm{E}}\)), overlaid on the magnification map of image A of RXJ 1131−1231 (Fig. 17). The NLR is much larger than the map (\(r_{1/2}= 48~R_{\mathrm{E}}\)) and hence not shown. The micro-magnification of the different components are \(\mu _{\mathrm{{blue}}} = 2.46\), \(\mu _{\mathrm{{red}}} = 1.86\), \(\mu _{\mathrm{{BLR}}}=1.16\). The solid black horizontal bar corresponds to \(1~R_{\mathrm{E}}\). Panel (c): The spectra of two lensed images \(F_{1}\) and \(F_{2}\). The macro-magnification ratio between the two images is fixed to \(M = M_{1}/M_{2} = 2\). Panel (e): Spectral ratio for the microlensing situation displayed (black), and in absence of microlensing (grey). The ratios derived from the narrow lines (triangles) and from the broad lines (circles) are also shown. Panels (d & f): Result of the MmD decomposition method, explained in Sect. 3.2.2. This method derives \(F_{M\mu}\) (\(F_{M}\)), which corresponds to the fraction of the flux that is (not) microlensed. Panel (d) shows the decomposition when an incorrect pair (\(M\), \(\mu \)) is used. Panel (f) shows a realistic decomposition achieved by minimizing the appearance of the NEL emission in \(F_{M\mu}\). 
Due to the chromatic microlensing of the continuum, we have \(F_{M} < 0\) for \(\lambda < 4700\) Å and \(F_{M} > 0\) for \(\lambda > 4700\) Å. The retrieved value \(\mu = 2.15\) satisfies \(\mu _{\mathrm{{blue}}} > \mu > \mu _{\mathrm{{red}}}\). It corresponds to \(\mu \) at \(\lambda \sim 4900\) Å. Detrending the chromaticity of the continuum microlensing prior to the MmD can be performed to achieve a more precise decomposition

Figure 19c shows an example of spectral deformation expected for a pair of lensed images due to microlensing. For simplicity, this figure displays the case where \(F_{2}\) is virtually free of microlensing (i.e. \(\mu \sim 1\)), and the continuum of \(F_{1}\) is microlensed by \(\mu \sim 2.15\) at the wavelength of \(H\beta \). The spectral ratio \(F_{1} / F_{2}\) (Fig. 19e) reveals two features. First, a dip is visible at the wavelengths corresponding to the emission lines. This can be easily understood in terms of the expected relative sizes of the emitting regions: since the continuum comes from the smallest region, it is on average more microlensed than the broad and narrow lines, resulting in more continuum than line flux. Second, a chromatic slope is seen because of differential magnification of the continuum itself: bluer emission arises from the inner, smaller part of the disc and is more magnified than redder. The strength of this chromatic effect depends mostly on the amplitude of microlensing and on the relative change of size of the disc as a function of wavelength. The flux ratio in the narrow lines will be the closest to \(M\) due to the large size of the NLR. It is important to realise that an accurate measurement of \(M\) based on the narrow lines (when unresolved by strong lensing) cannot be achieved by simply looking at the spectral ratios due to the presence of both continuum and line flux at the same wavelength. Instead, it should be based on the integrated flux in the lines, i.e. after continuum subtraction [see panel (e) of Fig. 19]. In order to better characterise the potential role of variability, one can also compare flux ratios of the same lensed images at multiple epochs. This method, in combination with quantitative measurements on the spectral ratio of pairs of images, has been used by Popović and Chartas (2005) to disentangle microlensing from intrinsic variability and millilensing in several systems.
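The dip in the spectral ratio at the position of an emission line can be reproduced with a toy spectrum. The sketch below assumes an achromatic continuum microlensing \(\mu \) in image 1, no microlensing of the line, and no microlensing in image 2. The power-law slope, line width and line strength are arbitrary illustrative choices.

```python
import numpy as np

lam = np.linspace(4000.0, 6000.0, 2001)  # rest wavelength grid in Angstrom
lam_line = 4861.0                        # approximate H-beta rest wavelength

# Toy intrinsic components (assumed shapes, arbitrary flux units).
continuum = (lam / 5000.0) ** -1.5
line = 2.0 * np.exp(-0.5 * ((lam - lam_line) / 40.0) ** 2)

M, mu = 2.0, 2.15  # macro-magnification ratio and continuum microlensing

F2 = continuum + line             # image 2: no microlensing
F1 = M * (mu * continuum + line)  # image 1: only the continuum is microlensed

ratio = F1 / F2
ratio_no_micro = M * (continuum + line) / F2  # constant, equal to M
```

Away from the line the ratio tends to \(M \times \mu \), while at the line centre it drops towards \(M\) without reaching it, because continuum and line flux are blended at the same wavelength.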

The spectral ratios mostly provide a simple and quick diagnostic of the presence and amplitude of microlensing. A more quantitative criterion of differential microlensing between the continuum and the lines can be obtained by measuring the line equivalent width:

$$ W = \int _{\lambda _{1}}^{\lambda _{2}} \left ( \frac{F_{\lambda}}{\mathcal{F}} - 1 \right ) \,\mathrm{d}\lambda , $$
(43)

where ℱ is the continuum flux at the level of the line and \(F_{\lambda}\) the total observed spectral emission at wavelength \(\lambda \). Because it involves a normalisation by the continuum, it is easy to verify that \(W\) measured in two lensed images will differ only if the amplitude of microlensing is different in the line and in the continuum (e.g. Lewis and Belle 1998; Sluse et al. 2015, and Vernardos et al. 2023, submitted).
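A short numerical check of Eq. (43) is sketched below, using a toy spectrum in which only the continuum of image 1 is microlensed by an achromatic factor \(\mu \). The spectral shapes and the helper name are assumptions. Under these assumptions the equivalent width of image 1 is reduced by exactly the factor \(\mu \).

```python
import numpy as np

def equivalent_width(lam, flux, continuum):
    """Eq. (43): integral of (F_lambda / continuum - 1) over wavelength.

    The trapezoidal rule is written out explicitly to avoid depending on a
    particular NumPy version.
    """
    integrand = flux / continuum - 1.0
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(lam)))

lam = np.linspace(4600.0, 5100.0, 1001)
cont = (lam / 5000.0) ** -1.5                             # toy continuum
line = 2.0 * np.exp(-0.5 * ((lam - 4861.0) / 40.0) ** 2)  # toy emission line

M, mu = 2.0, 2.15
W2 = equivalent_width(lam, cont + line, cont)                        # unmicrolensed image
W1 = equivalent_width(lam, M * (mu * cont + line), M * mu * cont)    # microlensed continuum
W1_same = equivalent_width(lam, M * (cont + line), M * cont)         # macro-magnification only
```

Here `W1_same` equals `W2`, illustrating that the macro-magnification alone leaves the equivalent width unchanged.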

Another method to derive microlensing signals from spectral deformations consists of modeling the pairs of spectra with the same intrinsic components, such as the sum of a power law continuum and of a series of Gaussian/Lorentzian line profiles that represent the emission (and/or absorption) components, as depicted in panels (d) and (f) of Fig. 19. The comparison of the flux ratios in each of these components may reveal the presence of microlensing (Sluse et al. 2007, 2011). The drawback of this method is that there is no guarantee that each of these components effectively represents a spatially different emission region (e.g. a single-peaked Gaussian line profile could result from the superposition of a double-peaked emission from a Keplerian emission and of a Gaussian profile arising in a wind). It is also possible that the intrinsic emission profile will get deformed due to microlensing, such that a model valid for a non-microlensed line will fail for a microlensed one.

To cope with the limitation of the previous method, Sluse et al. (2007) have proposed isolating the flux arising from regions affected by microlensing from the flux arising from larger, microlensing-free regions by linearly combining pairs of spectra. This method, called Macro-micro decomposition (MmD), is similar to the one proposed by Angonin et al. (1990) and O’Dowd et al. (2015). It assumes that one spectrum is minimally microlensed (hereafter \(F_{2}\)) and is a good template of the intrinsic spectrum. If \(\mu \) is the average microlensing magnification of the continuum in the wavelength range of the line, \(M = M_{1} / M_{2}\) is the macro-magnification ratio, \(F_{M\mu}\) the microlensed part of the flux, and \(F_{M}\) the non-microlensed part, then one can write:

$$\begin{aligned} F_{1} = & M \times (\mu \, F_{M\mu} + F_{M}), \end{aligned}$$
(44)
$$\begin{aligned} F_{2} = & F_{M} + F_{M\mu}. \end{aligned}$$
(45)

These two equations can easily be combined to isolate \(F_{M}\) and \(F_{M\mu}\):

$$\begin{aligned} F_{M} \ = & \frac{-A \;}{A - M} \; \; \left ( \frac{F_{1}}{A} - F_{2} \right ), \end{aligned}$$
(46)
$$\begin{aligned} F_{M\mu} = & \frac{M}{A - M } \; \; \left ( \frac{F_{1}}{M} - F_{2} \right ), \end{aligned}$$
(47)

where we have defined \(A = M \times \mu \). This method is illustrated in panels (d) and (f) of Fig. 19. In addition, any decomposition should satisfy \(F_{M} > 0\) and \(F_{M\mu} > 0\). In general, these two inequalities can be satisfied for a range of (positive) values of (\(M \), \(\mu \)). The procedure described below enables the derivation of the values of those parameters that minimize the amount of microlensing in the broad line. First, one determines the product, \(A = M \times \mu \), by measuring the flux ratio \(F_{1} / F_{2}\) in a region that contains only continuum emission. This implies that \(F_{M} = 0\) in the continuum. Once \(A\) is derived, \(M\) is fine-tuned until the flux \(F_{M\mu}\) mimics as closely as possible a continuum emission. \(F_{M\mu}\) can deviate from a pure continuum emission if other regions are microlensed (e.g. the broad lines). Note that \(M\) can also be set a priori from the macro-model or from ancillary imaging data as explained in Sect. 3.2.1. To first order, this method also generally works when the second lensed image is also microlensed (see appendix C of Sluse et al. 2012). In that case, only the relative microlensing between the pair of images is retrieved. When more than two lensed images are observed, the comparison between multiple pairs of images generally enables one to identify which image is minimally microlensed. There are a number of situations where the above decomposition does not work. For instance, it fails when the line flux is magnified only in some velocity bins while others are demagnified, e.g. the blue wing of the line is magnified while the red wing is demagnified. The method also fails when some velocity bins are more (de-)magnified than the underlying local continuum. A description of the method when both absorption and emission lines are present can be found in Hutsemékers et al. (2010). 
If applied to pairs of spectra that have not been corrected for the time delay, intrinsic variability could mimic microlensing-induced line deformations and/or chromatic microlensing. These effects remain in principle small for time delays ≲ 40 days (Yonehara et al. 2008; Sluse et al. 2012).
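Eqs. (44)–(47) can be verified numerically with a round trip on a toy spectrum. The sketch below is not the published implementation: it assumes an achromatic \(\mu \), a line that is not microlensed, and a known \(M\). The function name `mmd` and the continuum-only window used to estimate \(A\) are illustrative choices.

```python
import numpy as np

def mmd(F1, F2, M, mu):
    """Macro-micro decomposition, Eqs. (46) and (47), with A = M * mu."""
    A = M * mu
    if np.isclose(A, M):
        raise ValueError("mu = 1: no microlensing, the decomposition is degenerate")
    F_M = -A / (A - M) * (F1 / A - F2)
    F_Mmu = M / (A - M) * (F1 / M - F2)
    return F_M, F_Mmu

lam = np.linspace(4000.0, 6000.0, 2001)
cont = (lam / 5000.0) ** -1.5                             # microlensed component
line = 2.0 * np.exp(-0.5 * ((lam - 4861.0) / 40.0) ** 2)  # non-microlensed component

M_true, mu_true = 2.0, 2.15
F1 = M_true * (mu_true * cont + line)  # Eq. (44)
F2 = cont + line                       # Eq. (45)

# Step 1: estimate A = M * mu from a window assumed to be pure continuum.
window = lam > 5500.0
A_est = float(np.median(F1[window] / F2[window]))

# Step 2: with M known (here taken as given), mu follows and the spectra separate.
F_M, F_Mmu = mmd(F1, F2, M_true, A_est / M_true)
```

In this idealized case `F_M` recovers the line and `F_Mmu` the continuum. With real data \(M\) would instead be tuned, or taken from the macro-model, as described above.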

Guerras et al. (2013a) proposed another method to measure microlensing from spectra. It consists of deriving the amount of microlensing present in a line by calculating the magnitude difference between the line wings and core. First, one needs to locally fit the continuum flux based on two regions on each side of the line and free of pseudo-continuum emission. From the continuum subtracted spectrum, one can measure the magnitude \(m_{1,2} = -2.5 \log (F^{\mathrm{{line}}}_{1, 2})\) of the line, either in its core or in the line wings. The differential microlensing between the wing and core of the line is given by:

$$ \varDelta m = (m_{1} - m_{2})_{\mathrm{wings}} - (m_{1} - m_{2})_{\mathrm{core}}. $$
(48)

As shown by Guerras et al. (2013a), \(\varDelta m\) is naturally corrected for differential extinction, which can be assumed to be constant over the wavelength range covered by the line. \(\varDelta m\) exactly corresponds to the microlensing of the line only if the line core is effectively free of microlensing.
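The steps leading to Eq. (48) can be sketched as follows, under simplifying assumptions: a straight-line local continuum, a toy Gaussian line whose wings are magnified by a constant factor in image 1, and a wavelength-independent dimming factor standing in for differential extinction. All function names, windows and numerical values are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def subtract_local_continuum(lam, flux, side_windows):
    """Fit a straight line to the two side windows and subtract it."""
    sel = np.zeros(lam.shape, dtype=bool)
    for lo, hi in side_windows:
        sel |= (lam >= lo) & (lam <= hi)
    slope, intercept = np.polyfit(lam[sel], flux[sel], 1)
    return flux - (slope * lam + intercept)

def line_magnitude(lam, line_flux, windows):
    """m = -2.5 log10 of the line flux integrated over the given windows."""
    total = 0.0
    for lo, hi in windows:
        sel = (lam >= lo) & (lam <= hi)
        total += trapezoid(line_flux[sel], lam[sel])
    return -2.5 * np.log10(total)

def wing_core_delta_m(lam, line1, line2, wings, core):
    """Eq. (48): (m1 - m2)_wings - (m1 - m2)_core."""
    dm_wings = line_magnitude(lam, line1, wings) - line_magnitude(lam, line2, wings)
    dm_core = line_magnitude(lam, line1, [core]) - line_magnitude(lam, line2, [core])
    return dm_wings - dm_core

lam = np.linspace(4700.0, 5000.0, 601)
lam0 = 4861.0
profile = np.exp(-0.5 * ((lam - lam0) / 15.0) ** 2)  # toy line profile

M, mu_wings, dust = 2.0, 1.5, 0.8  # assumed macro ratio, wing microlensing, dimming
boost = np.where(np.abs(lam - lam0) > 30.0, mu_wings, 1.0)
spec2 = profile + 1.0 + 0.001 * lam  # image 2: line plus linear continuum
line2 = subtract_local_continuum(lam, spec2, [(4700.0, 4720.0), (4990.0, 5000.0)])
line1 = dust * M * boost * profile   # image 1 line, already continuum subtracted

core = (lam0 - 20.0, lam0 + 20.0)
wings = [(lam0 - 90.0, lam0 - 40.0), (lam0 + 40.0, lam0 + 90.0)]
delta_m = wing_core_delta_m(lam, line1, line2, wings, core)
```

In this toy case `delta_m` equals \(-2.5 \log 1.5\) regardless of the values of `M` and `dust`, since both cancel in the difference between wings and core.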

Braibant et al. (2017) have introduced several indicators to quantify the effect of differential magnification between the wings and core of a line and/or line asymmetry. They can be calculated using the ratio between the line profiles in pairs of lensed images, are independent of the shape of the line profile, and can be calculated on both simulated and observed emission lines. These indicators are useful in establishing quantitative diagnostics of line deformations (see Braibant et al. 2017), and can be calculated even for moderate signal-to-noise ratios. Calculating such indicators instead of using the full profile may however be seen as a loss or degradation of the signal. Instead, one may use continuum subtracted line profiles in pairs of spectra to derive microlensing as a function of velocity: \(\mu (v) = F^{\mathrm{line}}_{1} / (M \, F^{\mathrm{line}}_{2})\). In the situation where the line is also contaminated by pseudo-continuum emission (such as Fe ii), a model of the latter needs to be subtracted as well and uncertainties propagated to the estimate of \(\mu (v)\). The use of \(\mu (v)\) is only possible for spectra with a sufficient signal-to-noise ratio. Even in that case, \(\mu (v)\) may need to be truncated at high velocity because the line flux becomes too low to enable a reliable estimate of \(\mu \).
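A minimal sketch of the velocity-resolved estimate \(\mu (v)\) is given below. It assumes continuum subtracted line profiles on a common rest-wavelength grid, a non-relativistic Doppler conversion, and a truncation where the line flux of image 2 drops below an assumed fraction of its peak. The function name and the 5 per cent threshold are illustrative choices.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def mu_of_v(lam, line1, line2, M, lam0, min_flux_frac=0.05):
    """mu(v) = F1_line / (M * F2_line), masked where the line is too faint.

    Velocities use the non-relativistic Doppler formula (an approximation).
    Bins where line2 is below `min_flux_frac` of its peak are set to NaN.
    """
    v = C_KMS * (lam - lam0) / lam0
    good = line2 > min_flux_frac * line2.max()
    mu = np.full(lam.shape, np.nan)
    mu[good] = line1[good] / (M * line2[good])
    return v, mu

lam = np.linspace(4700.0, 5000.0, 601)
lam0 = 4861.0
line2 = np.exp(-0.5 * ((lam - lam0) / 15.0) ** 2)  # toy profile, image 2
M = 2.0
line1 = M * 1.3 * line2                            # image 1: line microlensed by 1.3
v, mu_v = mu_of_v(lam, line1, line2, M, lam0)
```

The NaN values at high velocity correspond to the truncation discussed above. With noisy data the threshold would more naturally be set from the flux uncertainties.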

3.2.3 The Role of Extinction

Extinction caused by dust along the line of sight to different lensed images can lead to a different no-microlensing baseline and disguise itself as chromatic microlensing (wavelength dependence). To mitigate this, let us consider that the intrinsic flux of an AGN lensed image at a given wavelength, \(\lambda \) (observer’s frame), and time, \(t\), can be expressed as \(m_{0}(\lambda ,t)\) (in mag). An observed lensed image \(i\) will be magnified by \(M_{i}(\lambda ,t)\) (the total magnification includes the macrolens and microlensing that could vary with wavelength and time), and have a time delay \(\varDelta t_{i}\), resulting in the observed spectrum (Falco et al. 1999):

$$ \begin{aligned} m_{i}(\lambda ,t) = m_{0}\left (\frac{\lambda}{1+z_{S}},t- \varDelta t_{i} \right ) -2.5 \log \left [ M_{i} \left ( \frac{\lambda}{1+z_{L}},t \right ) \right ] \\ + E_{i} R_{i} \left (\frac{\lambda}{1+z_{L}} \right ) + E_{\mathrm{Gal}} R_{ \mathrm{Gal}} (\lambda ) + E_{S} R_{S} \left (\frac{\lambda}{1+z_{S}} \right ), \end{aligned} $$
(49)

where \(E_{i}\), \(E_{\mathrm{Gal}}\), \(E_{S}\) are the image \(i\) extinctions, \(E(B-V)\), produced by the lens galaxy, our Galaxy, and the source host respectively, and \(R_{i}(\lambda )\), \(R_{\mathrm{Gal}}(\lambda )\), and \(R_{S}(\lambda )\) indicate the corresponding extinction laws. This equation can be simplified under several assumptions. First, the terms associated with our Galaxy and the host galaxy extinction can be neglected when comparing images: the separation between rays at the source and at the observer is small, so the amount of extinction should be the same for all images and cancels in the difference. In contrast, the contribution of the lens galaxy should be more important because the images pass through regions separated by kiloparsecs, which are likely to contain different amounts of dust (Cardelli and Savage 1988). Finally, assuming that the magnification is independent of \(\lambda \) and \(t\) (i.e. no microlensing), the source spectrum is independent of \(t\) and time delays, and the lens galaxy extinction curve is the same for all images, we can estimate the magnitude difference between two images \(i\) and \(j\) as (Falco et al. 1999):

$$ m_{i}(\lambda ) - m_{j}(\lambda )= -2.5 \log \left ( \frac{M_{i}}{M_{j}} \right ) + (E_{i}-E_{j}) R \left ( \frac{\lambda}{1+z_{L}} \right ), $$
(50)

which depends only on the constant macro-magnification ratios, \(M_{i}/M_{j}\), the extinction difference, \(E_{i}-E_{j}\), and the extinction curve in the lens rest frame, \(R(\lambda /(1+z_{L}))\). The previous assumptions might not hold for the smallest regions of the AGN (e.g. the continuum could be affected by chromatic microlensing magnification, see Fig. 19), so it is better to use narrow emission lines, which are only affected by the macrolensing magnification and extinction. For example, the doubly imaged SBS0909+523 system was used to estimate the extinction curve of the lens galaxy and confirm the lens redshift by detecting the 2175 Å blue bump (Motta et al. 2002; Mediavilla et al. 2005), while also estimating chromatic microlensing in the system.
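Because Eq. (50) is linear in \(-2.5 \log (M_{i}/M_{j})\) and \(E_{i}-E_{j}\) once the extinction law is fixed, both can be obtained from a linear least-squares fit to magnitude differences measured at several wavelengths. The sketch below illustrates this with synthetic, noise-free data. The \(1/\lambda \) extinction law is a toy stand-in chosen for simplicity, not a real extinction curve, and the wavelengths, lens redshift and function names are assumptions.

```python
import numpy as np

def toy_extinction_law(lam_rest):
    """Hypothetical R(lambda): a 1/lambda law equal to 3.1 at 5500 Angstrom.

    This is a placeholder, not a measured extinction curve.
    """
    return 3.1 * 5500.0 / np.asarray(lam_rest, dtype=float)

def fit_extinction(lam_obs, dm, z_lens, ext_law):
    """Fit Eq. (50): dm = -2.5 log10(Mi/Mj) + (Ei - Ej) * R(lam / (1 + zL))."""
    R = ext_law(np.asarray(lam_obs, dtype=float) / (1.0 + z_lens))
    design = np.column_stack([np.ones_like(R), R])
    (dm0, dE), *_ = np.linalg.lstsq(design, dm, rcond=None)
    return 10.0 ** (-0.4 * dm0), dE  # (Mi / Mj, Ei - Ej)

# Synthetic example (assumed values): Mi/Mj = 2, Ei - Ej = 0.1 mag, zL = 0.5.
lam_obs = np.array([4400.0, 5500.0, 6500.0, 8000.0, 12500.0, 16000.0])
z_lens = 0.5
dm = -2.5 * np.log10(2.0) + 0.1 * toy_extinction_law(lam_obs / (1.0 + z_lens))
ratio_fit, dE_fit = fit_extinction(lam_obs, dm, z_lens, toy_extinction_law)
```

With real measurements one would weight the fit by the uncertainties and adopt a physically motivated extinction law. The structure of the fit stays the same.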

3.3 The Single-Epoch Method

Here we consider the application of the single-epoch method to photometric data. These consist of single or combined exposures in a narrow or broad band, from which flux measurements for point sources, i.e. the quasar multiple images, are extracted. In this case, the magnitude difference (i.e. the flux ratio) between a pair of macroscopically observed multiple images \(A\) and \(B\), \(\varDelta m^{\mathrm{{obs}},\mathrm{k}}_{\mathrm{{AB}}}=m_{\mathrm{{B}}}^{\mathrm{{obs}},\mathrm{k}}-m_{\mathrm{{A}}}^{\mathrm{{obs}},\mathrm{k}}\), where \(k\) denotes the photometric band used, is measured with respect to a baseline that is believed to be unaffected by microlensing, \(\varDelta m^{\mathrm{ref},k}_{\mathrm{{AB}}}=m_{\mathrm{{B}}}^{\mathrm{ref},k}-m_{\mathrm{{A}}}^{ \mathrm{ref},k}\). The microlensing signal can then be defined as:

$$ \varDelta m^{k}_{\mathrm{{AB}}}=\varDelta m^{\mathrm{{obs}},\mathrm{k}}_{\mathrm{{AB}}} - \varDelta m^{ \mathrm{ref},k}_{\mathrm{{AB}}}. $$
(51)

We note that the no-microlensing baseline may or may not depend on the band/wavelength, for example if there is extinction or if it is taken to be the macromodel flux ratio respectively (see Sect. 3.2).

Simulated microlensing flux ratios (or more precisely magnification ratios) can be obtained from the magnification maps corresponding to images \(A\) and \(B\) given a model of the source light profile. The former are defined mainly by the \(\kappa \), \(\gamma \), \(\kappa _{*}\) parameters, but can also depend on other assumptions like the mass function of the microlenses, which we can collectively denote as \(\boldsymbol{\eta}_{m}\). The latter essentially boils down to a way of determining \(r_{1/2}\) of the source as a function of wavelength. This radius may be calculated based on a physical model of the source, for instance, in the case of an accretion disc one can use Eq. (40) that depends on the black hole mass, luminosity and accretion efficiency, or be purely phenomenological (for example Eq. (42)). We can collectively denote all the free parameters of the source as \(\boldsymbol{\eta}_{s}\). Simulated microlensing flux ratios as a function of wavelength are in fact a function of these free model parameters:

$$ \varDelta m^{\mathrm{{mod}}}_{\mathrm{{AB}}}(\lambda _{\mathrm{{rest}}}) \equiv \varDelta m^{ \mathrm{{mod}}}_{\mathrm{{AB}}}(\lambda _{\mathrm{{rest}}} | \boldsymbol{\eta}_{m}, \boldsymbol{\eta}_{s}). $$
(52)

To obtain the microlensing flux ratio in any photometric band one should in principle integrate over a range of wavelengths and take into account the band’s response function. In practice, one can consider only the central wavelength of the band and replace the dependence on \(\lambda _{\mathrm{{rest}}}\) by the index \(k\). Finally, to obtain the ratios between images \(A\) and \(B\) we randomly select locations on the convolved maps and divide their magnification values pairwise. We note that these locations have to be the same across the different bands considered.
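The sampling step described above can be sketched as follows. This is an illustration only: the function name `sample_delta_m` and the random mock maps are hypothetical, and the maps for images \(A\) and \(B\) are assumed to be already convolved with the source profile of each band, to share the same pixel grid, and to be normalised by the macro-magnification.

```python
import numpy as np

def sample_delta_m(maps_a, maps_b, n_trials, rng):
    """Return simulated Delta m_AB with shape (n_trials, K).

    maps_a, maps_b: arrays of shape (K, ny, nx), one convolved map per band.
    """
    _, ny, nx = maps_a.shape
    # One random location per trial on each map, reused for every band k.
    ya, xa = rng.integers(0, ny, n_trials), rng.integers(0, nx, n_trials)
    yb, xb = rng.integers(0, ny, n_trials), rng.integers(0, nx, n_trials)
    mu_a = maps_a[:, ya, xa].T  # shape (n_trials, K)
    mu_b = maps_b[:, yb, xb].T
    # m_B - m_A = -2.5 log10(mu_B / mu_A)
    return -2.5 * np.log10(mu_b / mu_a)

rng = np.random.default_rng(0)
# Mock "convolved maps" for K = 3 bands (random positive values, illustration only).
maps_a = rng.lognormal(0.0, 0.5, size=(3, 64, 64))
maps_b = rng.lognormal(0.0, 0.5, size=(3, 64, 64))
dm_mod = sample_delta_m(maps_a, maps_b, n_trials=1000, rng=rng)
```

Drawing the locations once and indexing all bands with them is what enforces the requirement that the positions are the same across bands.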

The observed and simulated microlensing flux ratios can be compared through a chi-squared statistic:

$$ \chi ^{2}_{n} = \sum _{i} \sum _{j \; > i} \sum _{k}^{K} \left [ \frac{\varDelta m_{i j}^{\mathrm{{obs}},\mathrm{k}} - \varDelta m_{i j}^{\mathrm{ref},k} - \varDelta m^{\mathrm{{mod}},\mathrm{k},\mathrm{n}}_{ij}(\boldsymbol{\eta}_{m},\boldsymbol{\eta}_{s})}{\sigma _{i j}^{k}} \right ]^{2} , $$
(53)

where \(i\) and \(j\) run from 1 to 4 for a quadruply lensed quasar, \(\sigma _{i j}^{k}\) is the error associated with the observed flux ratios, \(K\) is the number of bands, and \(n\) denotes a given pair of locations between two magnification maps. We can thus obtain the final likelihood for a given combination of parameters \(\boldsymbol{\eta}_{m}\), \(\boldsymbol{\eta}_{s}\) as the sum:

$$ \mathcal{L}(\boldsymbol{d}| \boldsymbol{\eta}_{m},\boldsymbol{\eta}_{s}) = \sum _{n}^{N} e^{-\chi ^{2}_{n}/2}, $$
(54)

where \(\boldsymbol{d}\) represents the flux ratio observations between all image pairs across all bands and we use \(N\) trials of simulated flux ratios (i.e. pairs of locations between maps).
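A minimal numerical sketch of Eqs. (53) and (54) is given below. The array layout (image pairs along one axis, bands along another) and the function name `log_likelihood` are assumptions made for illustration. The logarithm of the sum is accumulated with `np.logaddexp` because individual terms \(e^{-\chi ^{2}_{n}/2}\) can easily underflow.

```python
import numpy as np

def log_likelihood(dm_obs, dm_ref, sigma, dm_mod):
    """Log of Eq. (54).

    dm_obs, dm_ref, sigma: shape (P, K) for P image pairs and K bands.
    dm_mod: shape (N, P, K) for N simulated trials.
    """
    resid = (dm_obs - dm_ref - dm_mod) / sigma   # broadcast over the N trials
    chi2 = np.sum(resid**2, axis=(1, 2))         # Eq. (53), shape (N,)
    return np.logaddexp.reduce(-0.5 * chi2)      # log sum_n exp(-chi2_n / 2)

# Toy check: one image pair, two bands, two trials that match the data exactly.
dm_obs = np.array([[0.5, 0.3]])
dm_ref = np.array([[0.1, 0.1]])
sigma = np.array([[0.05, 0.05]])
dm_mod = np.tile(dm_obs - dm_ref, (2, 1, 1))
log_l = log_likelihood(dm_obs, dm_ref, sigma, dm_mod)
```

In this toy case both trials have \(\chi ^{2}_{n}=0\), so the likelihood is simply the number of trials.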

The likelihood above can be used in a Bayesian setup with the probability density of the posterior given by:

$$ \frac{\mathrm{d} P}{\mathrm{d} \boldsymbol{\eta}_{m} \,\mathrm {d}\boldsymbol{\eta}_{s}} \propto \mathcal{L}(\boldsymbol{d} | \boldsymbol{\eta}_{m}, \boldsymbol{\eta}_{s}) \frac{\mathrm{d} p}{\mathrm{d} \boldsymbol{\eta}_{m}} \frac{\mathrm{d} p}{\mathrm{d} \boldsymbol{\eta}_{s}}, $$
(55)

where \(p\) are the priors on the free parameters of the problem – common choices being uniform or logarithmic. In this way, the posterior probabilities for any parameter of the model, e.g. the dependence of size on wavelength (\(\nu \), for an accretion disc) or \(\kappa _{*}\), can be obtained by marginalizing over all the remaining free parameters of the model (e.g. Bate et al. 2018).
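When the likelihood has been evaluated on a regular grid of parameters, the marginalisation amounts to summing the posterior along the unwanted axes. The sketch below assumes one map parameter and one source parameter on such a grid; the function name, the toy Gaussian likelihood, and the grid values are hypothetical. Note that a flat prior on a logarithmically spaced grid corresponds to a logarithmic prior on the parameter itself.

```python
import numpy as np

def marginal_posteriors(log_like, log_prior_m, log_prior_s):
    """Grid version of Eq. (55).

    log_like: shape (n_m, n_s); log_prior_m: (n_m,); log_prior_s: (n_s,).
    Returns the marginal probabilities per grid cell for eta_m and eta_s.
    """
    log_post = log_like + log_prior_m[:, None] + log_prior_s[None, :]
    post = np.exp(log_post - log_post.max())   # subtract the maximum to avoid underflow
    post /= post.sum()
    return post.sum(axis=1), post.sum(axis=0)

# Toy grid: kappa_* (linear) and log10 of the half-light radius (arbitrary units).
kappa_star = np.linspace(0.1, 1.0, 10)
log_r_half = np.linspace(-1.0, 1.0, 21)
log_like = -0.5 * (((kappa_star[:, None] - 0.4) / 0.2) ** 2
                   + ((log_r_half[None, :] - 0.3) / 0.3) ** 2)
p_m, p_s = marginal_posteriors(log_like, np.zeros(10), np.zeros(21))
```

For irregular grids the cell volumes would have to be included as weights before summing.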

Finally, one can extend this method to constrain a common underlying source model with data from multiple systems (e.g. Jiménez-Vicente et al. 2015a, who rescaled the size of the accretion disc by the black hole mass). The final joint posterior for the source parameters is obtained by the product:

$$ \frac{\mathrm{d} P}{\mathrm{d} \boldsymbol{\eta}_{s}} \propto \prod ^{L}_{l} \frac{\mathrm{d} p_{l}}{\mathrm{d} \boldsymbol{\eta}_{s}} \int \mathcal{L}( \boldsymbol{d}_{l} | \boldsymbol{\eta}_{m,l},\boldsymbol{\eta}_{s}) \frac{\mathrm{d} p_{l}}{\mathrm{d} \boldsymbol{\eta}_{m,l}} \,\mathrm {d}\boldsymbol{\eta}_{m,l} \;, $$
(56)

where each of the \(L\) observed lensed systems has its own map parameters \(\boldsymbol{\eta}_{m}\) (e.g. the \(\kappa \), \(\gamma \) for each image) and may have its individual priors \(p_{l}\) for the source parameters \(\boldsymbol{\eta}_{s}\).
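On a grid, Eq. (56) reduces to marginalising each system over its own map parameters and multiplying the results along the shared source-parameter axis. The sketch below assumes a single source parameter sampled on a common grid for all systems; the function name and the toy likelihoods are hypothetical.

```python
import numpy as np

def joint_source_posterior(log_likes, log_priors_m, log_priors_s):
    """Grid version of Eq. (56) for L systems.

    log_likes[l]: shape (n_m_l, n_s); log_priors_m[l]: (n_m_l,); log_priors_s[l]: (n_s,).
    """
    total = 0.0
    for log_like, lp_m, lp_s in zip(log_likes, log_priors_m, log_priors_s):
        # Integral over eta_{m,l}, done as a log-sum-exp along the map axis.
        marginal = np.logaddexp.reduce(log_like + lp_m[:, None], axis=0)
        total = total + marginal + lp_s          # product over systems
    post = np.exp(total - np.max(total))
    return post / post.sum()

eta_s = np.linspace(0.0, 2.0, 41)

def toy_log_like(n_m, centre, width):
    """Toy likelihood that is Gaussian in eta_s and flat in eta_m."""
    return -0.5 * ((eta_s[None, :] - centre) / width) ** 2 * np.ones((n_m, 1))

log_likes = [toy_log_like(5, 0.8, 0.4), toy_log_like(7, 1.2, 0.4)]
log_priors_m = [np.zeros(5), np.zeros(7)]
log_priors_s = [np.zeros(41), np.zeros(41)]
joint = joint_source_posterior(log_likes, log_priors_m, log_priors_s)
```

The two toy systems prefer \(\eta _{s}=0.8\) and \(1.2\) with equal widths, so the joint posterior peaks midway between them.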

3.3.1 Advantages and Drawbacks

The main advantage of the single-epoch method is that both data and simulations are easy to handle. One needs snapshots of a lensed system in one or more bands, which are easy to schedule and obtain, and the method can be applied to several systems simultaneously. Creating pairs of simulated microlensing magnification ratios for different sources is straightforward. The parameter space of the model is usually smooth and sampling the bulk of the probability does not rely heavily on priors. However, the resulting probability distribution can be quite extended and in some cases can lead only to upper limits for some parameters (usually the accretion disc size, see Sect. 4.3). Bate et al. (2018) have shown that this method can have poor constraining power depending on the chromaticity amplitude, i.e. by how much the flux ratio between different bands varies, and the offset with respect to the no-microlensing baseline. Other difficulties include controlling systematic uncertainties like extinction, the no-microlensing baseline (see Sect. 3.2), leakage of broad line emission in the wavelength range covered by a broad band filter (see also Sect. 4.3.3), and quasar intrinsic variability. To mitigate the latter, configurations with small time delays like crosses or close image pairs are preferred. Although one could still apply this method to any image pair with a known time delay, two snapshots correctly spaced in time would be required in this case.

3.4 The Light Curve Method

In a lensed quasar, the observer, lensing galaxy, micro-lenses, and background emission regions are not static. The combined velocity of all the different components makes quasar microlensing a dynamic phenomenon on timescales that vary from weeks to decades. Several lensed quasars have been monitored for more than a decade, mainly for time-delay cosmography applications (see Birrer et al. 2024) and mostly in a single photometric band. Once time delays are measured, it is possible to shift the observed light curves of each image by the corresponding delay and subtract them pair-wise to cancel out the intrinsic variations of the quasar. The resulting difference curves are the observed microlensing signal, i.e. the ratio of microlensing magnification between the pair of multiple images, which contains valuable information on quasar structure and lensing galaxy mass partition and IMF. An example of such data for the doubly lensed quasar Q J0158−4325 is shown in Fig. 20.
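The shift-and-subtract operation can be sketched in a few lines. Linear interpolation is used here as a simple stand-in for the spline models usually fitted to real light curves, and the sign convention for the delay (positive when image \(B\) lags image \(A\)) as well as the mock data are assumptions made for illustration.

```python
import numpy as np

def difference_curve(t_a, m_a, t_b, m_b, delay_ab):
    """Return (t, m_B - m_A) after removing the time delay.

    delay_ab > 0 means that image B lags image A by delay_ab days (assumed convention).
    Times must be sorted in increasing order.
    """
    t_b_shifted = t_b - delay_ab                     # put B on the time axis of A
    lo = max(t_a[0], t_b_shifted[0])
    hi = min(t_a[-1], t_b_shifted[-1])
    keep = (t_a >= lo) & (t_a <= hi)                 # overlapping interval only
    m_b_on_a = np.interp(t_a[keep], t_b_shifted, m_b)
    return t_a[keep], m_b_on_a - m_a[keep]

# Mock data: a common intrinsic signal plus a constant 0.3 mag offset in image B.
t = np.arange(0.0, 1000.0)
delay = 22.7

def intrinsic(time):
    return 0.3 * np.sin(time / 50.0)

m_a = intrinsic(t)
m_b = intrinsic(t - delay) + 0.3
t_d, dm = difference_curve(t, m_a, t, m_b, delay)
```

In this mock case the intrinsic variations cancel and only the constant offset remains; in real data the residual is the microlensing signal plus noise.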

Fig. 20

Luminosity variation of the doubly lensed quasar J 0158−4325 in the r-band, from 13 years of monitoring at the Euler Swiss Telescope in La Silla Observatory. Top: Light curves of the two lensed images, \(A\) (minimum) and \(B\) (saddle). Measurement errors are indicated by the grey vertical lines (mostly smaller than the size of the points). Bottom: A spline model of the \(A\) image light curve subtracted from another spline model of \(B\) (see also Fig. 3 in Paic et al. 2022). Importantly, the \(B\) spline model is shifted by the best time-delay estimate (\(\varDelta t_{AB} = 22.7\) d, image A leading; Millon et al. 2020a). This system displays strong microlensing effects as indicated by the magnitude difference between the two images that steadily increases by \(>1\) mag over the 13 years of the monitoring campaign

3.4.1 The Effective Velocity Model

To describe time-varying microlensing it is necessary to define a model for the effective projected velocity, which results in the background source moving along trajectories or “tracks” on top of a magnification pattern. These tracks can then be used to extract magnification values as a function of time that translate into a light curve. In fact, this is the biggest difference and complication with respect to the single-epoch method from the simulations’ point of view. The two additional degrees of freedom, the length and direction of the simulated trajectories, critically increase the number of simulations required (see next section).

Let us define the effective velocity of the source, \(\boldsymbol{\upsilon}_{e}\), as the vector sum of the transverse velocities of the observer, \(\boldsymbol{\upsilon}_{o}\), microlenses, \(\boldsymbol{\upsilon}_{\star}\), lensing galaxy, \(\boldsymbol{\upsilon}_{l}\), and source, \(\boldsymbol{\upsilon}_{s}\) (see e.g. Kayser et al. 1986; Kochanek 2004; Neira et al. 2020):

$$ \boldsymbol{\upsilon}_{e} = \frac{\boldsymbol{\upsilon}_{o}}{1+z_{l}} \frac{D_{ls}}{D_{ol}} + \frac{\boldsymbol{\upsilon}_{\star}+\boldsymbol{\upsilon}_{l}}{1+z_{l}} \frac{D_{os}}{D_{ol}} + \frac{\boldsymbol{\upsilon}_{s}}{1+z_{s}}, $$
(57)

where time is measured in the observer’s rest frame and length on the source plane. These velocity components are also schematically shown in Fig. 21. The only fully measurable of these vector quantities is \(\boldsymbol{\upsilon}_{o}\), which is determined relative to the Cosmic Microwave Background velocity dipole (\(\boldsymbol{\upsilon}_{ \mathrm{CMB}}\)):

$$ \boldsymbol{\upsilon}_{o} = \boldsymbol{\upsilon}_{\mathrm{CMB}} - ( \boldsymbol{\upsilon}_{\mathrm{CMB}} \cdot \hat{z}) \, \hat{z}, $$
(58)

where \(\hat{z}\) is the direction of the line-of-sight. The unknown peculiar velocities of both the lensing galaxy and the background quasar can be considered as uniformly random in direction with magnitude in the source plane drawn from a normal distribution \(\mathcal{N}(0,\sigma _{g})\) with:

$$ \sigma _{g} = \left [ \left ( \frac{\sigma _{l}^{\mathrm{pec}}}{1+z_{l}}\frac{D_{os}}{D_{ol}} \right )^{2} + \left ( \frac{\sigma _{s}^{\mathrm{pec}}}{1+z_{s}} \right )^{2} \right ]^{1/2}, $$
(59)

where \(\sigma _{l}^{\mathrm{pec}}\) and \(\sigma _{s}^{\mathrm{pec}}\) are the standard deviations of the peculiar velocity distributions of the lens and the source respectively. A proper treatment of the final velocity component, the individual velocities of the microlenses, would require the use of computationally expensive “moving patterns”, where the individual movement of each star is taken into account (see Fig. 1). However, Kundic and Wambsganss (1993) showed that this relative movement was equivalent to an increase in the magnitude of the effective velocity. Wyithe et al. (2000a) showed that the magnitude of this effect can be approximated by a “bulk velocity” of the micro-lenses such that:

$$ \upsilon _{\star }= \sqrt{2} \epsilon \sigma _{\star}, $$
(60)

where \(\sigma _{\star}\) is the velocity dispersion at the lens center and \(\epsilon \) is an efficiency factor that depends on local \(\kappa \) and \(\gamma \) values. Since in the general case only \(\boldsymbol{\upsilon}_{o}\) is known, a direction and magnitude for the final effective velocity vector need to be drawn from a distribution, as defined by equation (57). By construction, this distribution is skewed towards the fixed direction of \(\boldsymbol{\upsilon}_{o}\) and has a minimum magnitude defined by \(\boldsymbol{\upsilon}_{\star}\). As an example, Fig. 22 shows histograms of effective velocity realizations for three well-known lensed quasar systems.
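Drawing realisations of the effective velocity from Eqs. (57) to (60) can be sketched as follows. Several choices here are assumptions made for illustration and are not prescribed by the equations: the bulk microlens velocity is given a uniformly random direction and added vectorially, the transverse observer velocity is assumed to be already expressed in a two-dimensional sky basis, and all numerical values (velocities, redshifts, distances in arbitrary units) are hypothetical.

```python
import numpy as np

def project_transverse(v_cmb, z_hat):
    """Eq. (58): remove the line-of-sight component of a 3D velocity."""
    z_hat = z_hat / np.linalg.norm(z_hat)
    return v_cmb - np.dot(v_cmb, z_hat) * z_hat

def sample_effective_velocity(v_o_2d, sigma_star, eps, sig_l, sig_s,
                              z_l, z_s, d_ol, d_os, d_ls, n, rng):
    """Draw n source-plane effective velocities, shape (n, 2), in the units of the inputs."""
    def random_dirs(size):
        phi = rng.uniform(0.0, 2.0 * np.pi, size)
        return np.column_stack([np.cos(phi), np.sin(phi)])

    v_obs = v_o_2d / (1 + z_l) * d_ls / d_ol                     # first term of Eq. (57)
    sigma_g = np.hypot(sig_l / (1 + z_l) * d_os / d_ol,
                       sig_s / (1 + z_s))                        # Eq. (59)
    v_pec = rng.normal(0.0, sigma_g, n)[:, None] * random_dirs(n)
    v_star = np.sqrt(2) * eps * sigma_star / (1 + z_l) * d_os / d_ol  # Eq. (60), source plane
    return v_obs + v_pec + v_star * random_dirs(n)               # assumed vector sum

rng = np.random.default_rng(1)
v_cmb = np.array([300.0, -150.0, 200.0])   # hypothetical dipole vector in km/s
z_hat = np.array([0.0, 0.0, 1.0])          # line of sight along the third axis
v_o = project_transverse(v_cmb, z_hat)
v_e = sample_effective_velocity(v_o[:2], sigma_star=200.0, eps=1.0,
                                sig_l=300.0, sig_s=300.0, z_l=0.5, z_s=2.0,
                                d_ol=1.0, d_os=1.4, d_ls=1.0, n=5000, rng=rng)
```

A two-dimensional histogram of `v_e` gives a distribution offset towards the direction of the observer term; only ratios of the angular diameter distances enter, so their absolute units are irrelevant.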

Fig. 21

Schematic representation of the components of the relative transverse velocity defined in equation (57). Left: the magnitude and direction of the transverse CMB dipole velocity, i.e. the velocity of the observer. Middle: the individual velocities (arrows) of the microlenses (star symbols) that lie along the line of sight (grey cone). Right: the random peculiar velocities of lensing galaxy and quasar

Fig. 22

Histograms of simulated effective velocities for three known lensed quasar systems. The minimum possible value for the velocity (inner radius of the ring-like structure) is defined by \(\boldsymbol{\upsilon}_{\star}\), which depends on the lensing galaxy’s velocity dispersion. The peaks of the distributions are skewed towards the direction defined by the fixed velocity component \(\boldsymbol{\upsilon}_{o}\). The star symbol denotes the direction and magnitude of this component

Once an effective velocity distribution and the duration of an observing period have been established, one can finally define tracks of the source traveling through the magnification pattern. These tracks have a length given by the duration of the observations times the velocity modulus and a direction defined by the vector sum of the velocity components (see Fig. 17). Extraction of the pixel values by superposition of this track on top of a magnification pattern convolved with the chosen quasar model results in a simulated light curve (see Fig. 18). Unlike in static microlensing, here not only the strength but also the steepness and the timescales of the variations in the light curve depend on the specific model of the source (see Fig. 15).
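Extracting a light curve along a straight track can be sketched as below. The nearest-pixel lookup, the function name, and the toy map with a linear gradient are simplifying assumptions; velocity times time must be expressed in the same length unit as the pixel size.

```python
import numpy as np

def light_curve_from_track(conv_map, pix_size, start_xy, velocity_xy, times):
    """Sample a convolved magnification map along a straight track.

    conv_map: 2D array indexed as [y, x]; pix_size: source-plane length per pixel.
    Returns the magnitude change -2.5 log10(mu) at each time.
    """
    x = (start_xy[0] + velocity_xy[0] * times) / pix_size
    y = (start_xy[1] + velocity_xy[1] * times) / pix_size
    ix, iy = np.rint(x).astype(int), np.rint(y).astype(int)   # nearest pixel
    ny, nx = conv_map.shape
    if ix.min() < 0 or iy.min() < 0 or ix.max() >= nx or iy.max() >= ny:
        raise ValueError("track leaves the magnification map")
    return -2.5 * np.log10(conv_map[iy, ix])

# Toy map whose magnification grows linearly with x: mu = 1 + x.
conv_map = 1.0 + np.tile(np.arange(100.0), (100, 1))
times = np.arange(5.0)
lc = light_curve_from_track(conv_map, pix_size=1.0, start_xy=(10.0, 50.0),
                            velocity_xy=(2.0, 0.0), times=times)
```

For tracks that are short compared to the pixel size, an interpolating lookup would be preferable to the nearest-pixel one used here.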

3.4.2 Fitting Light Curves

The standard approach to analyse light curve data consists of fitting simulated light curves to difference curves and was introduced by Kochanek (2004). To simplify the equations somewhat, we focus on a single photometric band light curve and a single system. In this case, we define the time series that constitutes a difference light curve between a pair of macroscopically observed multiple images A and B as \(\varDelta m^{\mathrm{{obs}},\mathrm{t}}_{\mathrm{{AB}}} = m^{\mathrm{{obs}},\mathrm{t}}_{\mathrm{{B}}} - m^{\mathrm{{obs}},\mathrm{t}}_{ \mathrm{{A}}}\), where \(t\) denotes one of \(T\) measurements from the difference light curve. The no-microlensing baseline is \(\varDelta m^{\mathrm{ref}}_{\mathrm{{AB}}}=m_{\mathrm{{B}}}^{\mathrm{ref}}-m_{\mathrm{{A}}}^{\mathrm{ref}}\) and does not depend on time. The microlensing signal can then be defined as:

$$ \varDelta m^{t}_{\mathrm{{AB}}}=\varDelta m^{\mathrm{{obs}},\mathrm{t}}_{\mathrm{{AB}}} - \varDelta m^{ \mathrm{ref}}_{\mathrm{{AB}}}. $$
(61)

Before generating simulated microlensing light curves, a number of hyper-parameters need to be defined for the magnification maps, \(\boldsymbol{\eta}_{m}\), and the source, \(\boldsymbol{\eta}_{s}\), in the same way as for the single-epoch method. Here, we also need to include the hyper-parameters that are related to the effective velocity, i.e. its magnitude and direction, which we denote as \(\boldsymbol{\eta}_{v}\). Finally, the simulated microlensing light curves that are sampled from trial trajectories on convolved maps are a function of these free model parameters:

$$ \varDelta m^{\mathrm{{mod}}}_{\mathrm{{AB}}}(t) \equiv \varDelta m^{\mathrm{{mod}}}_{\mathrm{{AB}}}(t | \boldsymbol{\eta}_{m}, \boldsymbol{\eta}_{s}, \boldsymbol{\eta}_{v}). $$
(62)

The same notes as for the single-epoch method apply here, i.e. we can replace a broad photometric band by its central wavelength, we can randomly select starting locations for the trajectories on the convolved maps and divide their magnification values pairwise, and these locations have to remain the same if we are considering data in many bands. However, the pair of trajectories must have the same effective velocity and the orientation of each map, or more precisely the orientation of the local shear at the position of each multiple image, has to be taken into account, as illustrated in Fig. 17.

The observed and simulated microlensing light curves can be compared through a chi-squared statistic:

$$ \chi ^{2}_{n} = \sum _{i} \sum _{j \; > i} \sum _{t}^{T} \left [ \frac{\varDelta m_{i j}^{\mathrm{{obs}},\mathrm{t}} - \varDelta m_{i j}^{\mathrm{ref}} - \varDelta m^{\mathrm{{mod}},\mathrm{t},\mathrm{n}}_{ij}(\boldsymbol{\eta}_{m},\boldsymbol{\eta}_{s},\boldsymbol{\eta}_{v})}{\sigma _{i j}^{t}} \right ]^{2} , $$
(63)

where \(i\) and \(j\) run from 1 to 4 for a quadruply lensed quasar, \(\sigma _{i j}^{t}\) is the error associated with each point in the difference light curve, and \(n\) denotes a given pair of trajectories between two magnification maps. The observant reader may have noticed the similarity between this and Eq. (53). However, we note that because \(T \gg K\) and because of the larger parameter space due to \(\boldsymbol{\eta}_{v}\), inference of the target physical parameters becomes much harder in this case. We can define a likelihood function in the same way as in Eq. (54) using \(N\) pairs of simulated light curves, and cast it in a Bayesian framework to obtain:

$$ \frac{\mathrm{d} P}{\mathrm{d} \boldsymbol{\eta}_{m} \,\mathrm {d}\boldsymbol{\eta}_{s} \,\mathrm {d}\boldsymbol{\eta}_{v}} \propto \mathcal{L}(\boldsymbol{d} | \boldsymbol{\eta}_{m}, \boldsymbol{\eta}_{s},\boldsymbol{\eta}_{v}) \frac{\mathrm{d} p}{\mathrm{d} \boldsymbol{\eta}_{m}} \frac{\mathrm{d} p}{\mathrm{d} \boldsymbol{\eta}_{s}} \frac{\mathrm{d} p}{\mathrm{d} \boldsymbol{\eta}_{v}}, $$
(64)

where \(\boldsymbol{d}\) is now difference light curve data, i.e. a set of measurements \(\varDelta m^{\mathrm{{obs}},\mathrm{t}}\) and their associated uncertainties, \(\sigma ^{t}\), for one or more pairs of multiple images. The velocity model presented in Sect. 3.4.1 can serve as the prior for \(\boldsymbol{\eta}_{v}\) here. The posterior probabilities for any parameter of the model can be obtained by marginalizing over all the other remaining free parameters. Extending Eq. (63) to include multiple wavelengths and systems (as in Eq. (56)) is left as an exercise for the reader.

3.4.3 Advantages, Drawbacks, and Other Approaches

The light curve method is quite appealing because of the larger amount of data (hundreds of epochs as opposed to a single one) that, in principle, should result in more constraining power over the source and lens galaxy models. That alone is a sufficiently strong argument to consider this a major advantage over other methods. Of course, the first step for the applicability of this method is the availability of the data, meaning that the required monitoring observations for periods of years should have been scheduled and taken place, a demanding and expensive task. LSST will dramatically improve on the availability of microlensing light curves by monitoring thousands of lensed quasars for at least a decade. However, the potential advantages of the light curve method come at the cost of complicated calculations, additional model degeneracies, and some “hidden” complexity in the data.

The signal contained in the difference light curves can in fact be much more complex than an amplitude modulation as the source crosses the magnification field produced by the microlenses, as shown in Fig. 15. A rapid flickering on a timescale of weeks to months is often observed (Schild 1996; Schechter et al. 2003; Millon et al. 2020b), which is too fast to be explained by the transverse velocity models described in Sect. 3.4.1, as can be seen in Fig. 20. Several explanations have been put forth for these fast microlensing variations: Schild (1996) proposes that a population of planetary mass objects act as microlenses, whereas Blackburne and Kochanek (2010) attribute this flickering to a variation of the accretion disc size over time; Gould and Miralda-Escude (1997), Wyithe and Loeb (2002), Schechter et al. (2003), and Dexter and Agol (2011) invoke inhomogeneities or broad absorption clouds in relativistic motion in the accretion disc being magnified by microlensing; and Millon et al. (2022) propose a secondary black hole in orbit within the disc. A different explanation simply invokes the combination of reverberation of the BLR and microlensing, as introduced by Sluse and Tewes (2014) (see also Eq. (12) in Paic et al. 2022). Because the accretion disc and the BLR have different physical sizes, they are affected differently by microlensing. In addition to this, the BLR is expected to reverberate the intrinsic variability of the quasar with some delay due to its location further away from the SMBH (and some smoothing due to its size, see Cackett et al. 2021). This leads to a difference in contrast between the reverberated and the direct signal that can introduce variations in the difference curves on the same timescale as the quasar intrinsic variations. These variations are much faster than the usual microlensing timescale and could explain the complex signals present on such shorter timescales.

When a source moves through a caustic network, whose characteristic size is described by \(\theta _{\mathrm{E}}\), a three-way degeneracy is introduced between the source size, effective velocity, and the mass of the microlenses. The relation between these parameters is not exact, but can be understood by the following examples. If we disregard any velocity component, then the source size and microlens mass are obviously degenerate through the ratio \(r_{1/2}/R_{\mathrm{E}}\) that determines the extent of microlensing effects. For a given effective velocity, a small source moving in a sparse caustic field (few high-mass microlenses) and a large source moving through a dense field of caustics (many small-mass microlenses) will result in a similar small-amplitude microlensing signal. A similar signal would result from a slow and small or a fast and large source moving through a fixed-mass caustic network. Finally, for a fixed source size, small \(\theta _{\mathrm{E}}\) and low velocity lead to magnification similar to having a large \(\theta _{\mathrm{E}}\) and high velocity. Eventually, if the observations are long enough then these degeneracies are easier to break, but this is not always the case. Therefore, priors on these three parameters are often imposed, with the velocity model presented in Sect. 3.4.1 being a commonly used one (Kochanek 2004). Other sources of systematic bias in resulting measurements have not been adequately or at all explored (see Bate et al. 2018, for a study of such biases for the single-epoch technique). For example, these could be the peak-to-peak amplitude in the light curves as a function of wavelength and the deviation of the mean magnification from the no-microlensing baseline.
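One limiting case of this degeneracy can be illustrated with simple scaling arguments, assuming a single microlens mass so that the Einstein radius scales as \(R_{\mathrm{E}} \propto \sqrt{M}\). The two quantities that then set the appearance of a light curve are the relative source size, \(r_{1/2}/R_{\mathrm{E}}\), and the Einstein-radius crossing time, \(R_{\mathrm{E}}/\upsilon _{e}\). The sketch below uses arbitrary units and hypothetical input values.

```python
import numpy as np

def scaled_observables(mass, velocity, r_half, r_e_unit_mass=1.0):
    """Return (r_half / R_E, R_E / velocity) assuming R_E = r_e_unit_mass * sqrt(mass)."""
    r_e = r_e_unit_mass * np.sqrt(mass)
    return r_half / r_e, r_e / velocity

# Quadrupling the mass while doubling both the velocity and the source size
# leaves the relative source size and the crossing time unchanged.
base = scaled_observables(mass=0.3, velocity=500.0, r_half=0.1)
scaled = scaled_observables(mass=4 * 0.3, velocity=2 * 500.0, r_half=2 * 0.1)
```

Both parameter sets give the same pair of values, so light curves alone cannot distinguish them; this is why external priors on at least one of the three quantities are needed.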

A more practical, and possibly the main, disadvantage of this method is its very high computational cost, as the probability of finding a trajectory that is a good fit to the data decreases with the length of the light curve. To mitigate this, authors have divided full light curves into separately-fitted segments (Poindexter and Kochanek 2010) or averaged data within a season (Morgan et al. 2012). Another common simplifying assumption is that microlensing is taking place in only one of the two images, so that trajectories are simulated on a single map (Kochanek 2004). With the advent of machine learning methods, Vernardos and Tsagkatakis (2019) have investigated the use of Convolutional Neural Networks to infer the source size from the microlensing features seen in the difference curves. Although promising, careful thinking needs to go into the design of the training set for such methods in order to reflect the complexity of real world scenarios as closely as possible.
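Seasonal averaging of a difference curve can be sketched as follows. This is a generic illustration and not the procedure of any of the cited works: seasons are assumed to be separated by gaps longer than a hypothetical threshold of 60 days, and points within a season are combined with inverse-variance weights.

```python
import numpy as np

def season_average(t, dm, sigma, gap=60.0):
    """Average a sorted difference curve within observing seasons.

    A new season starts wherever consecutive epochs are more than `gap` days apart.
    Returns an array with columns (mean time, weighted mean, error on the mean).
    """
    breaks = np.flatnonzero(np.diff(t) > gap) + 1
    rows = []
    for idx in np.split(np.arange(t.size), breaks):
        w = 1.0 / sigma[idx] ** 2
        rows.append((np.sum(w * t[idx]) / w.sum(),
                     np.sum(w * dm[idx]) / w.sum(),
                     w.sum() ** -0.5))
    return np.array(rows)

t = np.array([0.0, 1.0, 2.0, 200.0, 201.0])
dm = np.array([1.0, 1.0, 1.0, 2.0, 2.0])
sigma = np.ones(5)
seasons = season_average(t, dm, sigma)
```

Reducing hundreds of epochs to one point per season shortens the data vector that each trial trajectory has to match, at the price of discarding variability on shorter timescales.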

3.5 Other Microlensing Data Analysis Methods

Apart from the two general methods to analyze microlensing light curve and flux ratio data described until now, there are a number of other more specialized approaches tailored to specific situations and data. Here we give a short summary of these methods and refer the reader to the corresponding papers for more details.

3.5.1 Measuring SMBH Properties from HMEs

Observing a lensed quasar at any random moment, either by obtaining flux ratios or through monitoring, can be used to constrain its overall size (the half-light radius) as a function of wavelength. However, when a microlensing HME takes place there is much more information to be extracted on the immediate vicinity of the SMBH, e.g. its event horizon, mass, and spin. The magnification during such events can often be described analytically, which can make modelling simpler (see also Sect. 3.5.6). Mediavilla et al. (2015) use three such events in Q 2237+0305 whose fine structure at the peak they attribute to the Innermost Stable Circular Orbit (ISCO) at \(\approx 3\) gravitational radii. This result is obtained by fitting both classical and relativistic accretion disc models convolved with an analytic function for the caustic magnification (a straight fold combined with a linear term to factor in long-range magnification gradients due to neighbouring microlenses, i.e. not responsible for this particular caustic). Their relativistic model includes the effect of beaming, Doppler shift, and gravitational redshift, which effectively improves their fit to observed data. Best et al. (2022), on the other hand, adopt a machine learning approach to measuring the ISCO, which they train with thousands of simulated microlensing light curves and accretion disc models that include relativistic beaming, Doppler shifts, and lensing from the central SMBH.
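As an illustration of how an analytic caustic model is convolved with a source, the sketch below uses the standard straight-fold form, \(\mu (x) = a_{0} + a_{1} x + K/\sqrt{x}\) for \(x>0\) (inside the caustic) and \(a_{0} + a_{1} x\) otherwise, together with a Gaussian source profile. The Gaussian is only a stand-in for the disc models used in the works cited above, and the function name and parameter values are hypothetical. The inverse square-root term is averaged analytically over each integration cell to handle the integrable singularity at the fold.

```python
import numpy as np

def fold_light_curve(x_src, sigma, k_fold, a0=1.0, a1=0.0, half_width=8.0, n=4001):
    """Magnification of a Gaussian source centred at distance x_src from a straight fold.

    x_src > 0 means that the source centre is inside the caustic.
    """
    x_src = np.atleast_1d(x_src).astype(float)
    u = np.linspace(-half_width, half_width, n) * sigma   # offsets from the source centre
    h = u[1] - u[0]
    weights = np.exp(-0.5 * (u / sigma) ** 2)
    weights /= weights.sum()                               # 1D projection of the source
    x = x_src[:, None] + u[None, :]
    lo = np.clip(x - h / 2, 0.0, None)
    hi = np.clip(x + h / 2, 0.0, None)
    fold = 2.0 * k_fold * (np.sqrt(hi) - np.sqrt(lo)) / h  # cell average of K / sqrt(x)
    return np.sum(weights * (a0 + a1 * x + fold), axis=1)

# Source crossing the fold from outside (x < 0) to inside (x > 0), in units of sigma.
x_scan = np.linspace(-5.0, 10.0, 151)
mu_scan = fold_light_curve(x_scan, sigma=1.0, k_fold=1.0)
```

Dividing `x_scan` by an effective velocity converts the position axis into time, which yields the characteristic asymmetric peak of a caustic crossing.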

3.5.2 Modelling X-ray Line Deformation

Several authors have studied with increasing sophistication how the shape of the Fe K\(\alpha \) reflection line changes as a result of microlensing. The emitted 6.4 keV photons in the inner regions of the disc are affected by the gravitational redshift, bending, Doppler broadening, and relativistic beaming, resulting in a well-known two-peaked profile (Fabian et al. 1989; Laor 1991). During an HME or caustic crossing, some regions of the disc producing these photons will be magnified more than others, resulting in additional peaks (Heyrovský and Loeb 1997; Popović et al. 2006; Jovanović et al. 2009) and edges (Neronov and Vovk 2016) to the profile. As pointed out by Krawczynski and Chartas (2017), this may explain the Fe K\(\alpha \) line variations observed in RX J1131−1231 (Chartas et al. 2012).

The above-mentioned studies used methods similar to those described below in Sect. 3.5.3. The work by Ledvina et al. (2018) first makes maps at representative inclinations of the Fe line intensity and energy shift (\(g\)-factor) around a quasar (in a \(60\times 60 r_{\mathrm {g}}\) region) and then sweeps a high-magnification, narrow, linear fold caustic across those maps. At specific moments in time, the total observed (microlensed) Fe K\(\alpha \) line profile is constructed. One of the most prominent microlensing-induced spectral features is seen as the caustic sweeps across the inner disc: a sharp and highly magnified peak dominates the spectral profile. The location of the peak changes in time as the caustic moves and different parts of the disc get highly magnified (e.g., their Figs. 2 and 3).

3.5.3 Modelling Broad Emission Line Deformations

Because the broadening of an emission line is due to different velocity components of the emitting material that arise from spatially distinct regions, it is necessary to calculate the microlensing signal for an ensemble of luminosity profiles, each associated to a velocity slice in the BLR, as shown in Fig. 23. These luminosity profiles are subsequently convolved with microlensing maps to calculate the line deformation. This is similar to the procedure followed to model the temperature profile of the accretion disc, but instead of calculating the disc luminosity profile at different wavelengths (corresponding to different temperatures), one calculates the luminosity profile of the BLR for different velocities. A simple power-law decrease of the BLR brightness with distance to the central black hole is assumed, as well as simple geometries, such as a Keplerian disc, a polar or equatorial wind, and various velocity fields able to reproduce emission line profiles (e.g. Murray et al. 1995; Murray and Chiang 1997). To simulate such profiles, one could use analytical models (e.g. Abajas et al. 2002) or radiative transfer simulations that can produce the light profile of the BLR for arbitrary velocity slice(s) (e.g. Braibant et al. 2017). For radiative transfer models, the photo-ionizing source of light is supposed to be the quasar accretion disc, assumed to be smaller than the BLR and to have the same inclination. The microlensing of the continuum is calculated together with the line deformation to ensure that the model can self-consistently reproduce the data. The exact luminosity profile of the disc is not critical (see Sect. 3.1) and a uniform profile has commonly been considered. The need for the simulations to reproduce both the continuum and the BLR emission implies that the resolution of the microlensing maps should be high enough to sample the disc.
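The velocity-slice calculation can be illustrated with a deliberately crude toy model. Everything in the sketch below is an assumption made for illustration: the BLR is a thin Keplerian disc whose pixel grid is taken directly in the disc plane (projection effects on the geometry are ignored), the brightness falls as \(r^{-2}\), and the magnification pattern is a step function standing in for a real microlensing map. The magnification in each velocity bin is the flux-weighted mean magnification of the pixels emitting at that velocity.

```python
import numpy as np

def mu_of_v(intensity, v_los, mag_map, v_edges):
    """Flux-weighted magnification in each line-of-sight velocity bin."""
    flux, _ = np.histogram(v_los, bins=v_edges, weights=intensity)
    flux_lensed, _ = np.histogram(v_los, bins=v_edges, weights=intensity * mag_map)
    with np.errstate(invalid="ignore", divide="ignore"):
        return flux_lensed / flux          # NaN where a bin contains no emission

# Toy Keplerian disc between r = 0.1 and r = 1 (arbitrary units), inclination 45 deg.
n = 201
y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
r = np.hypot(x, y)
emit = (r > 0.1) & (r < 1.0)
r_safe = r.clip(0.1)
intensity = np.where(emit, r_safe ** -2.0, 0.0)
inc = np.deg2rad(45.0)
v_los = np.where(emit, np.sin(inc) * x / r_safe ** 1.5, 0.0)  # v ~ r^-1/2 times cos(phi)

# Step-like magnification: the side of the disc with x > 0 is magnified three times more.
mag_map = np.where(x > 0, 3.0, 1.0)
v_edges = np.linspace(-np.abs(v_los).max(), np.abs(v_los).max(), 9)
mu_v = mu_of_v(intensity, v_los, mag_map, v_edges)
```

In this toy setup one wing of the line is magnified by a factor of 3 and the other is left unchanged, which is the kind of red/blue asymmetry expected for a Keplerian disc.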

Fig. 23

Illustration of broad line deformation produced by microlensing for two different models of the broad emission line region, with \(r_{1/2}\sim 0.3 R_{E}\) (the size is chosen to be smaller than the expected BLR size to highlight line deformations). Two models are sketched: the polar wind (top) and the Keplerian disc (bottom). The leftmost panel shows the two models at intermediate inclination. The blue and red colors indicate the location of the approaching and receding gas, corresponding to blue and red velocity components of the line. The right panel depicts the line deformation corresponding to different locations of the BLR with respect to the caustic, as shown in the middle panel (colored lozenges corresponding to the colored line profiles on the right). For the depicted event, the microlensing of the Keplerian disc yields asymmetric red/blue line profile distortions while microlensing of the biconical outflow is characterized by symmetric wings/core distortions. Adapted from Hutsemékers et al. (2017)

Once deformed line profiles have been calculated for a variety of BLR models, they can be compared to the data in different ways. For example, O’Dowd et al. (2011) used the ratio between the microlensed and un-lensed profiles, which is proportional to the amplitude of microlensing as a function of the gas velocity (i.e. \(\mu (v)\)), Abajas et al. (2007) calculated relative changes of the line FWHM, while Braibant et al. (2017) integrated the flux over a fraction of the line profile to quantify the red-blue asymmetry and its wing-to-core magnification/deformation. In order to minimize the loss of information that is inevitable when using such “integrated” quantities, which may even lead to overfitting, Hutsemékers and Sluse (2021) and Hutsemékers et al. (2023) used the velocity-dependent microlensing amplitude, \(\mu (v)\) (Sect. 3.2.2). This choice minimizes the loss of information but requires high signal-to-noise data and in the case of too simple BLR models the method may be unable to reproduce the observed signal at all.

The comparison of models to data can proceed through a standard \(\chi ^{2}\) statistic. For instance, one can compare the amplitude of the simulated microlensing signal in the continuum, \(\mu _{\mathrm{C}}^{\mathrm{mod}}\), and the profile of the line deformation, \(\mu _{{\mathrm{B}}}^{\mathrm{mod}} (v)\), for an ensemble of velocity bins, \(\varDelta v_{k}\), to their observed counterparts (see e.g. Sect. 3.2.2 for methods to measure those quantities). If \(\boldsymbol{\eta}_{B}\) and \(\boldsymbol{\eta}_{C}\) are the sets of model parameters of the BLR and the disc (i.e. continuum emission), then the total \(\chi _{n}^{2}\) associated with a model realisation \(n\) can be calculated as the sum of the corresponding BLR and continuum terms:

$$ \chi ^{2}_{n} = \sum _{k} \left [ \frac{\mu ^{\mathrm{obs}}_{{\mathrm{B}}}( \varDelta v_{k}) - \mu ^{\mathrm{mod}}_{{\mathrm{B}}} (\varDelta v_{k} | \boldsymbol{\eta}_{B})}{ \sigma ^{\mathrm{obs}}_{\mathrm{B}}(\varDelta v_{k})} \right ]^{2} + \left [ \frac{ \mu ^{\mathrm{obs}}_{\mathrm{C}} - \mu ^{\mathrm{mod}}_{\mathrm{C}}(\boldsymbol{\eta}_{C})}{\sigma ^{\mathrm{obs}}_{\mathrm{C}}} \right ]^{2} . $$
(65)

We note that the first term in the above equation is not unique. Instead of \(\mu _{{\mathrm{B}}} (\varDelta v_{k})\), one can use other measurements of line deformations like the “integrated” indicators described above. The inference on the model parameters can be cast in a Bayesian framework as described in Sects. 3.3 and 3.4.2 for single- and multi-epoch data, respectively.
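As a minimal sketch of how Eq. (65) could be evaluated for one model realisation, the Python function below sums the BLR term over velocity bins and adds the single continuum term. The function name, argument names, and toy numbers are illustrative assumptions and do not come from any of the cited works.

```python
import numpy as np

def chi2_line_plus_continuum(mu_b_obs, sigma_b_obs, mu_b_mod,
                             mu_c_obs, sigma_c_obs, mu_c_mod):
    """Evaluate Eq. (65): BLR term summed over velocity bins plus continuum term."""
    mu_b_obs = np.asarray(mu_b_obs, dtype=float)
    sigma_b_obs = np.asarray(sigma_b_obs, dtype=float)
    mu_b_mod = np.asarray(mu_b_mod, dtype=float)
    blr_term = np.sum(((mu_b_obs - mu_b_mod) / sigma_b_obs) ** 2)
    continuum_term = ((mu_c_obs - mu_c_mod) / sigma_c_obs) ** 2
    return float(blr_term + continuum_term)

# Toy values (hypothetical): five velocity bins and one continuum measurement.
mu_b_obs = np.array([1.30, 1.10, 1.00, 1.05, 1.20])
sigma_b_obs = np.full(5, 0.05)
mu_b_mod = np.array([1.25, 1.10, 1.05, 1.05, 1.25])
chi2 = chi2_line_plus_continuum(mu_b_obs, sigma_b_obs, mu_b_mod,
                                mu_c_obs=2.0, sigma_c_obs=0.1, mu_c_mod=1.9)
```

In a real analysis this function would be called once per model realisation \(n\), and the resulting \(\chi ^{2}_{n}\) values would feed the likelihood of the Bayesian framework mentioned above.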

Existing works show that the largest constraining power arises when the amplitude of microlensing is large. For low amplitudes, the whole range of possible line deformations is very similar for any BLR model (Braibant et al. 2017). On the other hand, there are indications that more complex BLR models than the three listed above (i.e. Keplerian disc, polar and equatorial wind) may be needed to accurately reproduce the line deformations observed in some systems (Hutsemekers et al. 2019; Hutsemékers and Sluse 2021).

3.5.4 Extensions to the Single-Epoch Technique

A single flux ratio measurement obtained from a snapshot of a lensing system in a given band has little constraining power: compared to the degrees of freedom of the microlensing models (source structure and mass in the lens), the problem is under-constrained. To mitigate this, flux ratios obtained at different epochs could be used; however, successful incorporation of such additional data requires attention to both the time separation between observations and the analysis method used. If enough time has passed between observations for the source to have moved to an entirely different region of the magnification map, then the measurements can be considered independent; otherwise, their covariance needs to be estimated and taken into account in, for example, \(\chi ^{2}\) fits. The required time intervals are shorter for smaller sources, because convolving the magnification map with a smaller source profile leaves the magnification correlated only over shorter distances.
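To illustrate what taking the covariance into account can mean in practice, the sketch below evaluates a generalised \(\chi ^{2} = \boldsymbol{r}^{T} C^{-1} \boldsymbol{r}\) for correlated epochs. The exponential form of the covariance and the correlation time `t_corr` are assumptions made purely for illustration; in a real analysis the covariance would be estimated from tracks on magnification maps.

```python
import numpy as np

def chi2_with_covariance(obs, mod, cov):
    """Generalised chi-square r^T C^-1 r for correlated measurements."""
    r = np.asarray(obs, dtype=float) - np.asarray(mod, dtype=float)
    return float(r @ np.linalg.solve(cov, r))

def exponential_covariance(times, sigma, t_corr):
    """Toy covariance (assumption): correlation decays as exp(-|dt| / t_corr)."""
    times = np.asarray(times, dtype=float)
    dt = np.abs(times[:, None] - times[None, :])
    return sigma**2 * np.exp(-dt / t_corr)

# Two epochs separated by one correlation time, same residual in both.
obs, mod = [1.1, 1.1], [1.0, 1.0]
cov = exponential_covariance([0.0, 1.0], sigma=0.1, t_corr=1.0)
chi2_correlated = chi2_with_covariance(obs, mod, cov)
chi2_independent = chi2_with_covariance(obs, mod, np.diag([0.1**2, 0.1**2]))
```

In this toy case the correlated \(\chi ^{2}\) is smaller than the independent one, which shows why treating correlated epochs as independent overstates their constraining power.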

In the X-rays, where the source is expected to be the smallest, Pooley et al. (2012) consider such flux ratios observed in multiple epochs as independent measurements. Guerras et al. (2020) used \(\approx 10\) flux ratio measurements in the X-rays for each of four systems over periods of 5-15 years. This timescale is most likely longer than the source-size crossing time in the X-rays. They did not consider these measurements as independent, but used the mean and a second-order moment of their distribution instead, effectively compressing the X-ray light curve to these two summary statistics. Their simulated flux ratio distributions were obtained along tracks on magnification maps whose length – an important parameter that is proportional to the effective velocity – was arbitrarily chosen and not fitted for.

Fian et al. (2016, 2018, 2021b) have followed a similar approach to measure the accretion disc size by collapsing observed difference light curves to flux ratio probability distributions. However, instead of measuring the mean and width of the distribution, they calculate its distance from the one obtained from magnification maps for different source sizes. While the method has been applied to very long light curve data (between 10 and 20 years), the microlensing variability observed over that period is not guaranteed to be representative of the full magnification map. Indeed, the corresponding tracks on the maps may not be longer than a few \(\theta _{\mathrm{E}}\) – the exact value depending on the unknown effective velocity. More work may be needed to demonstrate that this caveat does not bias the final results of this method.
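The idea of collapsing a light curve to a distribution and comparing it with map-based distributions for several source sizes can be sketched as follows. Everything here is a stand-in: the "map" is a random lognormal field rather than a real magnification map, the source is a Gaussian, and the symmetric chi-square histogram distance is an assumed choice, not necessarily the statistic used in the cited works.

```python
import numpy as np

def smooth_map(mag_map, sigma_pix):
    """Convolve a periodic map with a Gaussian source of width sigma_pix (pixels)."""
    kx = np.fft.fftfreq(mag_map.shape[0])[:, None]
    ky = np.fft.fftfreq(mag_map.shape[1])[None, :]
    kernel_ft = np.exp(-2.0 * np.pi**2 * sigma_pix**2 * (kx**2 + ky**2))
    return np.real(np.fft.ifft2(np.fft.fft2(mag_map) * kernel_ft))

def magnitude_pdf(mu, bins):
    """Normalised histogram of microlensing magnitudes, -2.5 log10(mu)."""
    dm = -2.5 * np.log10(np.clip(mu, 1e-12, None))
    hist, _ = np.histogram(dm, bins=bins)
    return hist / max(hist.sum(), 1)

def histogram_distance(p, q):
    """Symmetric chi-square distance between two normalised histograms (assumed metric)."""
    denom = p + q
    mask = denom > 0
    return 0.5 * np.sum((p[mask] - q[mask]) ** 2 / denom[mask])

rng = np.random.default_rng(0)
toy_map = rng.lognormal(mean=0.0, sigma=1.0, size=(256, 256))  # NOT a real magnification map
bins = np.linspace(-2.0, 1.0, 61)
trial_sizes = [2.0, 4.0, 8.0]  # hypothetical source sizes in pixels
model_pdfs = {s: magnitude_pdf(smooth_map(toy_map, s), bins) for s in trial_sizes}

# "Observed" sample: one short track (a single row) through the size-4 map.
observed_track = smooth_map(toy_map, 4.0)[100, :]
observed_pdf = magnitude_pdf(observed_track, bins)
distances = {s: histogram_distance(observed_pdf, model_pdfs[s]) for s in trial_sizes}
best_size = min(distances, key=distances.get)
```

Because the observed sample comes from a single short track, its histogram need not resemble that of the full map, which is exactly the caveat raised above about tracks only a few \(\theta _{\mathrm{E}}\) long.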

3.5.5 The Microlensing Time-Delay as a Probe for Accretion Disc Size

In addition to the well-known macroscopic time delay in lensed quasars, which can be used as an independent way to perform cosmography and measure the Hubble constant (see Birrer et al. 2024), Tie and Kochanek (2018) introduced the microlensing time delay that can extend or shorten the arrival times of the signal in the different multiple images by a few days in an uncorrelated way. This delay arises from the combination of the driving variability mechanism of the quasar, assumed to be a lamppost, and the way microlensing magnifies different parts of the accretion disc. Based on the fact that this effect depends on the source size, which in turn depends on wavelength, Chan et al. (2021) proposed using measured differences in the time delays between different bands to constrain the accretion disc size. Their method is complementary to traditional curve-shifting techniques, which consider the amplitude of microlensing only; unlike these methods, it takes into account the distortion and delay caused by the convolution of the lamppost variability with the light profile of the accretion disc and the caustic network.

The method requires a set of measured time delays in different bands with respect to some baseline with an uncertainty of 0.1 days, which is about 10 times better than the highest precision delays achieved nowadays. Additional assumptions are that the quasar variability follows a lamppost-like model and that the shape of its accretion disc brightness profile (more precisely, its spatial derivative; Tie and Kochanek 2018) is known. Forward modelling and fitting of the time delays as a function of wavelength, and consequently half-light radius of the disc, follows. The value of the micro-magnification of a multiple image can be used to drastically speed up the calculations by pre-selecting the matching locations on magnification maps. Using multi-band light curves of a fiducial quadruply imaged quasar, the authors were able to forecast that the accretion disc size measured from such data would lie between 0.2 and 2 times the true value. This method provides measurements that are independent of the choice of Initial Mass Function for the microlenses in the lens galaxy (see also Sect. 5.2).

3.5.6 Reconstructing Accretion Disc Profiles

The methods to measure accretion discs discussed until now correspond to the so-called forward problem: microlensing observables are calculated and fitted to data based on disc models prescribed a priori. However, the problem can be approached in an inverse way as well, i.e. reconstructing an unknown disc profile from the measured data. An observed microlensing light curve results from the convolution of a source profile with the magnification field of the microlenses. Hence, the situation is similar to deconvolution which requires solving an ill-posed inverse problem with the use of regularization to avoid overfitting and stabilize against noise.

The first to adopt this treatment were Grieger et al. (1988, 1991), followed by Mineshige and Yonehara (1999), Agol and Krolik (1999), and Bogdanov and Cherepashchuk (2002). These studies made two main assumptions: they described the source with a one-dimensional “strip” profile and focused on HMEs. The first assumption is justified by the fact that the observed magnification depends mostly on the brightness distribution of the source along the direction perpendicular to the caustic (see Fig. 1 in Mineshige and Yonehara 1999). On long time scales, of the order of the \(\theta _{\mathrm{E}}\) crossing time, the source can be affected by several microlenses that lead to superimposed caustics and a complex and non-linear magnification field (see Sect. 3.4). But in the case of short HMEs, the magnification field corresponds to that of a caustic crossing, which is simpler and can be described analytically to a good approximation (Gaudi and Petters 2002a,b). This simplifies the problem of deconvolution considerably but can lead to some problems, e.g. the caustic can be discontinuous in the case of a cusp, it can have significant curvature, and magnification can vary along its direction (Agol and Krolik 1999). It turns out that the latter two can be resolved if the source has a size of 10% of \(\theta _{\mathrm{E}}\) (Grieger et al. 1988).

This method has been demonstrated to be stable on simulations using basic regularization schemes (see Benning and Burger 2018, for a review), showing potential of resolving the accretion disc structure within a few gravitational radii from the central SMBH. When it comes to real observations, the best and only system to provide the necessary HMEs has been the quadruply lensed quasar Q 2237+0305. Its accretion disc was found to have a brightness profile consistent with the thin-disc model (Agol and Krolik 1999; Bogdanov and Cherepashchuk 2002; Koptelova et al. 2007).
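A minimal sketch of such a regularized inversion is given below for a one-dimensional strip profile crossing a straight fold caustic. It assumes the standard fold approximation \(\mu (d) = \mu _{0} + K/\sqrt{d}\) for \(d>0\) (and \(\mu _{0}\) otherwise), a first-difference Tikhonov penalty, and toy values for the grid, noise level, and regularization strength `lam`; none of these choices are taken from the cited studies.

```python
import numpy as np

def fold_kernel_matrix(x_edges, x_caustic, mu0=1.0, K=1.0):
    """A[i, j]: magnification integrated over source cell j for caustic position i."""
    lo = x_edges[None, :-1] - x_caustic[:, None]
    hi = x_edges[None, 1:] - x_caustic[:, None]
    # Analytic cell integral of K / sqrt(d) over the part of the cell with d > 0.
    singular = 2.0 * K * (np.sqrt(np.clip(hi, 0.0, None))
                          - np.sqrt(np.clip(lo, 0.0, None)))
    return mu0 * (hi - lo) + singular

def tikhonov_deconvolve(A, flux, lam):
    """Minimise ||A p - flux||^2 + lam * ||D p||^2, with D the first-difference operator."""
    n = A.shape[1]
    D = np.diff(np.eye(n), axis=0)
    return np.linalg.solve(A.T @ A + lam * (D.T @ D), A.T @ flux)

# Toy setup (hypothetical units): Gaussian strip profile, 80 light-curve epochs.
x_edges = np.linspace(-1.0, 1.0, 41)
x_mid = 0.5 * (x_edges[:-1] + x_edges[1:])
p_true = np.exp(-0.5 * (x_mid / 0.3) ** 2)
x_caustic = np.linspace(-1.5, 1.5, 80)
A = fold_kernel_matrix(x_edges, x_caustic)

rng = np.random.default_rng(1)
flux = A @ p_true + rng.normal(0.0, 0.01, size=x_caustic.size)  # noisy light curve
p_rec = tikhonov_deconvolve(A, flux, lam=1e-2)
```

The regularization strength trades noise suppression against resolution: a larger `lam` gives a smoother recovered profile, a smaller one follows the noise more closely.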

3.5.7 Time Varying Equivalent Widths of Non-lensed Quasars

Looking for signatures of lensing in the variability of non-lensed quasars is an idea that has been explored by several authors over decades (Press and Gunn 1973; Canizares 1982; Schneider 1993; Zackrisson et al. 2003; Bruce et al. 2017; Hawkins 2022). Vernardos et al. (2023, submitted) revisit this in the context of variable emission line equivalent widths. The data, in this case, are two measured equivalent widths of the same line obtained by spectroscopic observations of the same system taken \(\approx 10\) years apart. The authors develop a probabilistic model for these measurements using the magnification of a single compact microlens (similar to Galactic microlensing models). Once the most promising candidates are selected from spectroscopic surveys and re-observed to confirm that lensing is indeed responsible for their varying equivalent widths, the microlens mass and effective velocity can be measured.
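A toy version of the underlying effect can be sketched with the standard point-lens magnification, \(\mu (u) = (u^{2}+2)/(u\sqrt{u^{2}+4})\). The sketch assumes, for illustration only, that the continuum is a point source magnified by a single lens while the line flux is unaffected, so that the equivalent width scales as \(1/\mu \); the function names and event parameters are hypothetical and this is not the probabilistic model of the cited work.

```python
import numpy as np

def point_lens_magnification(u):
    """Total magnification of a point source at separation u (in Einstein radii)."""
    u = np.asarray(u, dtype=float)
    return (u**2 + 2.0) / (u * np.sqrt(u**2 + 4.0))

def equivalent_width_ratio(t1, t2, t0, t_E, u0):
    """EW(t2) / EW(t1), assuming only the continuum is magnified (toy assumption)."""
    u1 = np.hypot(u0, (t1 - t0) / t_E)
    u2 = np.hypot(u0, (t2 - t0) / t_E)
    # EW = F_line / F_cont, so EW is proportional to 1 / mu.
    return float(point_lens_magnification(u1) / point_lens_magnification(u2))

# Hypothetical event: two spectra taken 10 years apart, the second at closest approach.
ratio = equivalent_width_ratio(t1=0.0, t2=10.0, t0=10.0, t_E=5.0, u0=0.5)
```

Under these assumptions the equivalent width drops at the second epoch because the continuum is more strongly magnified there.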

3.5.8 Astrometric Microlensing

The variations in flux due to microlensing are correlated with astrometric shifts due to the individual microimages (Lewis and Ibata 1998). In particular, the creation or annihilation of pairs of microimages during a caustic crossing event can give rise to astrometric shifts on the order of tens of micro-arcseconds (Treyer and Wambsganss 2004). While such shifts have long been predicted, they have yet to be confirmed observationally on extra-galactic scales. They may, however, contaminate studies of proper motions of lensed AGN from Gaia (Makarov and Secrest 2022). Astrometric shifts from stellar microlensing, on the other hand, have been observed; see e.g. McGill et al. (2023).

4 Quasar Results

4.1 Black Hole Tomography with HMEs

Due to the rarity of observations of HMEs in the currently known lensed quasars, there are few results regarding their very promising use for constraining the inner accretion disc and the black hole. HMEs have almost exclusively been observed in one system: the Einstein cross (Fig. 2). This system, due to its exceptionally low lens redshift (\(z = 0.039\)) that results in short variability time scales, has been the target of several intensive monitoring campaigns (Ostensen et al. 1996; Udalski et al. 2006; Goicoechea et al. 2020). A few HMEs in this object have been captured in almost three decades' worth of data, and two indicative examples are shown in Fig. 24 (see Fig. 1 of Eigenbrod et al. 2008b, for another example). The study of such HMEs has made it possible to place the inner edge of the quasar accretion disc at 3 \(R_{\mathrm{g}}\) (Mediavilla et al. 2015) and to detect a possible warped disc geometry (Abolmasov and Shakura 2012b). Several monitoring programs have provided a rich dataset for this system, but such rapid variability is a statistical outlier.

Fig. 24
figure 24

Difference light curve between images B and C of Q 2237+0305 (i.e. the Einstein Cross) observed in the VRI bands. The double-peaked signal is typical of the source entering and exiting a single diamond caustic (e.g. as in Galactic microlensing events). Notice how the brightest band changes from I (red) to V (blue) within the grey-shaded area - a crucial indicator of an imminent HME. Adapted from Goicoechea et al. (2020)

For the vast majority of lensed quasars, a typical HME is expected roughly every 20 years (see footnote 14). Therefore, monitoring of tens of systems for several years would be required in order to observe a statistically meaningful sample of HMEs. Such monitoring has been performed by the COSMOGRAIL program for 30 systems over more than a decade. Although its main goal has been measuring the time delays between the macroscopic quasar images, microlensing signals can be extracted from the observed light curves as a by-product. A few potential HMEs can be spotted in the final data (Millon et al. 2020a), albeit none unambiguously, due to the lack of observations during their peaks. The GLENDAMA archive also provides a compilation of monitoring data from various telescopes and filters for 9 systems (Gil-Merino et al. 2018), where a few events can be identified.

The majority of observations and analyses have employed single-band photometry, which is a cheaper and more viable approach for long-term monitoring. If daily observations are taken during a HME, then such single-band data should be enough to provide valuable insights to quasar structure. However, the real wealth of information on accretion disc structure, which is accessible only during the event’s peak, can be extracted through multi-band, or even better, spectroscopic observations. With existing facilities, such observations are expensive and impractical for long monitoring periods, therefore, predictions and early detection of the onset of HMEs and their peaks are mandatory to trigger multi-wavelength follow-up.

4.1.1 HME Prediction and Triggers

An alert system has been used by OGLE (Woźniak et al. 2000) and a triggering mechanism for multi-band follow-up has been proposed by Wyithe et al. (2000b). However, such prediction algorithms have not been used extensively and as of yet there has not been any clear prediction and detailed follow-up of a HME as it unfolds, i.e. predicting the peak from light curves like those shown in Fig. 24 and scheduling detailed observations accordingly (spectroscopy, X-ray spectra and imaging, etc). This could also be attributed to using only a single band as the base signal from which to predict HMEs. Every HME is preceded by a rise in magnification in the light curve, but such a rise could be due to other reasons too, e.g. improperly subtracted intrinsic variability, reverberated variability from the BLR (Sluse et al. 2013; Paic et al. 2022), or just long-term microlensing (see Sect. 3.2). Microlensing, however, has a distinct signature as a function of wavelength: larger (redder) sources are expected to get magnified sooner than smaller (bluer) ones (Young 1981; Schneider and Weiss 1987), which can be beautifully seen in the VRI band monitoring data of the Einstein cross shown in Fig. 24. Therefore, having such multi-wavelength information – even in two bands – is paramount for obtaining a clear and coherent signal that will lead to early and robust predictions of the onset of HMEs.
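The chromatic signature described above suggests a very simple two-band indicator, sketched below: flag the epoch at which the blue band overtakes the red band in the difference light curves (as in the grey-shaded area of Fig. 24). This is a hypothetical illustration, not the OGLE alert system or the mechanism of Wyithe et al. (2000b); the function name and the toy magnitudes are assumptions, and real data would first need smoothing and error handling.

```python
import numpy as np

def colour_flip_epochs(dm_red, dm_blue):
    """Indices where the blue band becomes at least as bright as the red band.

    Inputs are microlensing difference light curves in magnitudes on a common
    time grid (smaller magnitude means brighter).
    """
    colour = np.asarray(dm_blue, dtype=float) - np.asarray(dm_red, dtype=float)
    # colour > 0: red band brighter; colour <= 0: blue band brighter.
    flips = np.where((colour[:-1] > 0) & (colour[1:] <= 0))[0] + 1
    return flips

# Toy light curves (hypothetical): the red band starts to rise first,
# then the blue band brightens faster as the caustic approaches.
dm_red = np.array([0.00, -0.10, -0.20, -0.30, -0.40])
dm_blue = np.array([0.10, 0.00, -0.15, -0.35, -0.60])
trigger_epochs = colour_flip_epochs(dm_red, dm_blue)
```

A practical trigger would combine such a flag with a requirement that both bands are brightening, to reduce false alarms from noise or intrinsic variability.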

4.2 Corona

The first indication from microlensing that the X-ray emitting corona is fairly compact came from the study of Chandra observations of 10 quadruply lensed quasars by Pooley et al. (2007). They showed that, in almost every case where a flux ratio anomaly is present, it is more extreme in X-rays than in the optical. This indicates that the X-rays must arise from a region much smaller than the optical half-light radius. Pooley et al. (2007) qualitatively conclude that the X-ray emission region size is consistent with the inner edge of the accretion disc or smaller.

Quantitative results first came when the light curve method (Kochanek 2004) was applied to multiple Chandra and HST observations of PG 1115+080 by Morgan et al. (2008). They found that the X-ray half-light radius (rest frame 1.4–21.8 keV) is \(\log{(r_{\mathrm{1/2,X}}/\mathrm{cm})} = 15.6^{+0.6}_{-0.9}\). For a black hole mass of \(1.2\times 10^{9} M_{\odot}\), this is consistent with the inner edge of the disc or smaller. As Chandra has observed and followed up on more lensed quasars and the light curve method has been applied to many of them, it has become clear that a compact size for the X-ray emitting region is common for all systems observed to date, typically 10 \(r_{\mathrm {g}}\) or less. A summary of the results so far is given in Table 2.
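To relate the quoted size in centimetres to gravitational radii, one can use the conversion sketched below. It assumes the definition \(r_{\mathrm{g}} = G M_{\mathrm{BH}}/c^{2}\) and rounded CGS constants; the function names are illustrative.

```python
G_CGS = 6.674e-8    # gravitational constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10    # speed of light, cm s^-1
M_SUN_G = 1.989e33  # solar mass, g

def gravitational_radius_cm(m_bh_msun):
    """r_g = G M / c^2 in cm (assumed definition of the gravitational radius)."""
    return G_CGS * m_bh_msun * M_SUN_G / C_CGS**2

def size_in_rg(log10_r_cm, m_bh_msun):
    """Convert log10(r / cm) to units of r_g for a given black hole mass."""
    return 10.0**log10_r_cm / gravitational_radius_cm(m_bh_msun)

# PG 1115+080 values quoted in the text: log(r/cm) = 15.6 (+0.6, -0.9), M = 1.2e9 Msun.
m_bh = 1.2e9
best = size_in_rg(15.6, m_bh)
lower = size_in_rg(15.6 - 0.9, m_bh)
upper = size_in_rg(15.6 + 0.6, m_bh)
```

With these inputs the central value corresponds to roughly 20 \(r_{\mathrm{g}}\), and the quoted uncertainties span from a few to roughly 90 \(r_{\mathrm{g}}\).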

Table 2 X-ray Microlensing Results on the size of the X-ray corona

4.3 Accretion Disc

4.3.1 Inner Edge of the Disc

The Fe K\(\alpha \) reflection feature is one of the best probes we have of the inner edge of the accretion disc. Detection and possible microlensing of this emission feature has been reported for H1413+117 (Oshima et al. 2001; Chartas et al. 2004, 2007), MG J0414+0534 (Chartas et al. 2002), QSO 2237+0305 (Dai et al. 2003; Chen et al. 2012), SDSS J1004+4112 (Ota et al. 2006; Chen et al. 2012), RXJ 1131−1231 (Chartas et al. 2012), QJ 0158−4325, HE 0435−1223, SDSS J0924+0219, and HE 1104−1805 (Chen et al. 2012). In almost all cases, the equivalent width of the Fe K\(\alpha \) line is much larger for these lensed quasars than it is for comparable, non-lensed AGN. This indicates substantial microlensing and suggests a compact size for that region, comparable to or even smaller than the X-ray continuum region (i.e. the corona, e.g. Chen et al. 2012).

Chartas et al. (2017) apply their \(g\)-distribution method to the Fe lines seen in RX J1131−1231, QJ 0158−4325, and SDSS J1004+4112. In this method, the observed Fe line shifts are due to microlensing caustics near the inner edge of the disc, and the distribution of the shifts can provide estimates of the size of the innermost stable circular orbit (ISCO), spin, and inclination (\(i\), with zero being perpendicular to the line-of-sight) of the disc. To date, results have only been obtained on RX J1131−1231, for which Chartas et al. (2017) report values of \(r_{\mathrm{ISCO}} \lesssim 8.5 r_{\mathrm {g}}\) and \(i \gtrsim 55^{\circ}\).

Strong amplitude of microlensing has been found in the Fe iii multiplet visible in rest-frame UV (i.e. range [2039-2113] Å) suggesting that emission arises in the direct vicinity of the disc and in a region almost as compact as the latter (Guerras et al. 2013b; Shalyapin and Goicoechea 2014; Fian et al. 2021a). By combining the SMBH mass inferred from the redshift and broadening of this line, with microlensing-based estimates of BLR sizes, Mediavilla et al. (2020) constrained the virial factor, \(f\), of several low ionization lines in an ensemble of 10 quasars.

4.3.2 Size

Just a year after the first detection of microlensing-induced variability in Q 2237+0305 (Irwin et al. 1989), Wambsganss et al. (1990) compared the observed variations to simulated light curves and estimated the optical-UV radius of its accretion disc to be \(< 2\times 10^{15}\text{ cm}\). This early result showed consistency with the expectations of the thin-disc theory. However, further analysis of this and ∼20 other systems showed that the sizes measured from microlensing are systematically larger by a factor of 3-5 than the expectations from the thin-disc model (see Cornachione et al. 2020a, for a recent compilation). These results were obtained from the analysis of the microlensing signal in optical light curves (see Sect. 3.4.2), focusing either on a single HME (Anguita et al. 2008; Eigenbrod et al. 2008b; Mediavilla et al. 2015) or using all variations in the difference light curves over a long period of time (Hainline et al. 2012, 2013; MacLeod et al. 2015; Morgan et al. 2008, 2012, 2018; Cornachione and Morgan 2020). Measurements based on single-epoch observations also generally suggest larger disc sizes (e.g. Blackburne and Kochanek 2010; Jiménez-Vicente et al. 2012). Although it seems that microlensing consistently yields larger sizes than predicted by the thin-disc theory, no consensus on the physical implications for quasar accretion discs has been reached.

Before discussing possible explanations for this discrepancy, it is important to remember that single-band multi-epoch analysis can constrain only a combination of the slope of the temperature profile of the disc, \(\beta \), and its size, \(r_{s}\) (Eq. (36)). This means that this tension should be investigated in the \(\beta \)–\(r_{s}\) plane, shown in Fig. 25. In this perspective, a possible solution could be a shallower slope than predicted by the thin-disc theory (Cornachione et al. 2020a). However, slopes are more prone to systematic uncertainties than absolute source sizes, as discussed in the next subsection (Sect. 4.3.3).

Fig. 25
figure 25

Ratio between the luminosity half-light radius \(r_{L} \equiv R_{\lambda , 1/2}^{\mathrm{flux}}\) (Eq. (38)) and the half-light radius measured from microlensing \(r_{1/2}\) (\(\equiv r_{\mu}\) on the figure). Discs described by a power-law temperature profile are represented by the light blue line whereas the dot indicates the particular case of \(\beta = 3/4\), corresponding to the thin-disc model. Alternative disc models including an inner edge or contamination from the BLR or scattered light are also plotted. These alternative models, however, are insufficient to reconcile luminosity and microlensing sizes. For this to happen, measurements should fall in the shaded grey region, which means that the slope of the temperature profile should be reduced drastically compared to the thin-disc model. Figure reproduced from Morgan et al. (2010) and Cornachione and Morgan (2020)

Several alternative explanations have been proposed to reconcile the disc size discrepancy. One suggestion to explain the larger sizes inferred by microlensing is the inhomogeneous accretion model proposed by Dexter and Agol (2011). However, the level of inhomogeneities required by this model to match the observations would need to be unrealistically high. Another plausible explanation could be that part of the UV-continuum emission is actually coming from a much more extended region, hence biasing the microlensing measurement. For example, Hutsemékers et al. (2015) presented spectropolarimetric observations of the BAL lensed quasar H 1413+117 that showed two distinct sources of continuum emission. Coupling the observations with microlensing simulations, they noted that these regions differ significantly in their size, with a compact unpolarized emission coming directly from the disc and a much larger non-microlensed continuum coming from an extended region located along the polar axis. The fraction of total flux arising from that extended emission may, however, be less than ∼40 per cent. This would reduce the microlensing source size by less than a factor of two (Dai et al. 2010; Sluse et al. 2015).

Alternatively, Abolmasov and Shakura (2012a) proposed a super-Eddington accretion rate for these specific systems, leading to the formation of an optically thick envelope scattering the radiation emitted from the disc. This effect would make the apparent disc size larger and independent of wavelength. Similarly, low-density scattering atmospheres could produce non-black-body emission spectra (see the model by Hall et al. 2018), which would result in a flatter spectral energy distribution than a thin-disc. This could result in a broken power-law temperature profile, increasing the apparent half-light radius of the disc. A similar broken power-law profile could also appear from iron opacity (Jiang et al. 2016).

The disc wind model (e.g. Li et al. 2019) with mass outflows could also explain the discrepancy. The principle is again the same: the disc wind flattens the radial temperature profile, which increases the source size measured by microlensing. It might also be that the solution to this problem is related to our detailed understanding of the origin of X-ray emission and its interaction with the disc. Papadakis et al. (2022) calculated that for a disc illuminated by the X-ray corona, energy is injected into the disc, increasing the temperature mainly at larger radii. Therefore, X-ray illuminated discs will have an increased apparent size. Finally, we note that both changes in the disc structure and the presence of unaccounted extended emission could be at play at the same time, as proposed by Zdziarski et al. (2022). These authors showed that a combination of disc truncation at an inner radius larger than the innermost stable circular orbit, disc winds, and local color corrections to the disc black-body emission could explain larger discs.

4.3.3 Thermal Slope

Any measurement of accretion disc sizes as a function of wavelength using multi-band or spectroscopic observations (single-epoch or monitoring) allows one to infer the thermal slope of the accretion disc (because \(r_{\lambda }\propto \lambda ^{1/\beta}\) or, equivalently, \(T\propto R^{-\beta}\); Eq. (36) and Sect. 2.9.2). In 2008, several studies using the techniques described in Sect. 3 obtained the first results for Q 2237+0305 (Anguita et al. 2008; Eigenbrod et al. 2008a; Mosquera et al. 2009) but also for a few other systems (Poindexter et al. 2008; Bate et al. 2008; Floyd et al. 2009). These results showed consistency with the expectation for a thin-disc model, i.e. \(\beta =3/4\) (Sect. 2.9.2). However, the error bars of these measurements were too large to rule out alternative disc models. Thanks to multi-epoch data gathered for a sample of 11 systems, Morgan et al. (2010) pointed out that, even if the size of the disc increased as \(M_{BH}^{2/3}\), in agreement with the thin-disc model, the overly large absolute sizes implied thermal slopes in conflict with that model. The results from Blackburne et al. (2011), who used single-epoch multi-band data from X-ray to NIR for 12 lensed quasar systems, are in agreement with Morgan et al. (2010). They infer a variety of thermal slope measurements, but most of them are significantly steeper than expected (\(\beta > 3/4\)), some even consistent with no size variation with wavelength at all (i.e. \(1/\beta \rightarrow 0\)). Several additional studies thereafter also found steeper than expected slopes (\(\beta \gtrsim 1\), Muñoz et al. 2011; Motta et al. 2012; Jiménez-Vicente et al. 2014; Muñoz et al. 2016). On the other hand, a few studies using either the single-epoch (Rojas et al. 2014; Bate et al. 2018) or the light curve method (Cornachione and Morgan 2020) found slopes shallower than expected (\(\beta \lesssim 0.5\)).
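Since \(r_{\lambda }\propto \lambda ^{1/\beta}\), the thermal slope follows from a straight-line fit in log-log space, as in the sketch below. It is an unweighted least-squares fit on synthetic, noise-free sizes generated for the thin-disc value; the function name, wavelengths, and normalisation are illustrative assumptions, and a real analysis would propagate the (often strongly asymmetric) size uncertainties.

```python
import numpy as np

def fit_thermal_slope(wavelengths, sizes):
    """Fit log r = p * log(lambda) + const and return beta = 1 / p."""
    p, _ = np.polyfit(np.log10(wavelengths), np.log10(sizes), 1)
    return 1.0 / p

# Synthetic sizes (hypothetical) following r ∝ lambda^(4/3), i.e. beta = 3/4.
wavelengths = np.array([0.15, 0.25, 0.40, 0.60])          # rest-frame, micron
sizes = 1.0e15 * (wavelengths / 0.25) ** (4.0 / 3.0)      # cm
beta = fit_thermal_slope(wavelengths, sizes)
```

A fitted exponent \(p = 1/\beta \) close to zero corresponds to the "no size variation with wavelength" case mentioned above, for which \(\beta \) becomes very large and poorly constrained.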

At the time of writing, the tension between different microlensing measurements still exists. There are, however, several possible avenues being explored to explain the contradicting results. A first attempt at explaining this investigates the impact of flux not arising from within the disc. As one may notice by looking at an AGN spectrum (Fig. 3), broad band data can include some fraction of emission from the broad emission lines. The contribution to the total flux from the two regions will vary in each lensed quasar image due to differential microlensing (see Fig. 15). To minimise contamination, Jiménez-Vicente et al. (2014) have specifically used data originating from narrow band imaging, spectroscopy, and broad-band imaging free of salient emission lines. This sound observational strategy does not, however, correct for the contribution arising from less salient lines, like the Fe ii pseudo-continuum, Balmer continuum emission or other pseudo-continua, which are present in different amounts in AGN spectra (see footnote 15). Sluse et al. (2015) measured the disc temperature profile of H1413+117 from single-epoch data and showed that not correcting for the observed extended continuum pushes the measured slope towards steeper temperature profiles. Due to the different amount of non-disc emission among quasars, but also its possible variability as a function of wavelength (and possibly time), it is difficult to predict a generic impact on the measured slope. The latter effectively corresponds to a ratio of sizes, and may thus be prone to any systematic uncertainty affecting the reference source size. To mitigate this effect, Cornachione et al. (2020a) have chosen to explore the inconsistency between the source size and luminosity size \(r_{L}\) (Eq. (38), Fig. 25) as a function of \(\beta \).

A second explanation is systematic biases and uncertainties in the methods. For the single-epoch technique, Bate et al. (2018) showed that low amplitudes of chromaticity, i.e. \(< 0.4\) mag, defined as the difference of microlensing magnification between the reddest and bluest wavelengths, may yield slopes that are steeper than expected. On the other hand, the selection of observations displaying large amplitude of microlensing may systematically favour smaller sizes. Overall, simulations suggest that selection effects need to be carefully assessed when interpreting single-epoch measurements (Bate et al. 2018; Guerras et al. 2020). Monitoring data may be less prone to biases but the impact of sampling, number of bands, and length of monitoring on the results still needs to be quantified. A related source of systematic uncertainty is the one associated with the macro-model and specifically with the exact value of the macro-magnification. This can be uncertain by a factor of a few in each multiple image (see Sect. 3.2). This may impact the theoretical size of the source (see Sect. 2.9.2), which depends on the magnification-corrected luminosity of the source, i.e. \(R_{\lambda}^{\mathrm{BH}} \propto L^{1/3}\) (see e.g. Cornachione et al. 2020b). A biased estimate of the black hole mass may also have an effect when comparing microlensing sizes to theory, as \(R_{\lambda}^{\mathrm{BH}} \propto M_{\mathrm{BH}}^{2/3}\) (Morgan et al. 2018).
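As a rough worked example of the two scalings above (the input factors are illustrative assumptions, not values from the cited works): if the macro-magnification of an image is wrong by a factor of 3, the inferred intrinsic luminosity is wrong by the same factor, and a black hole mass may similarly be wrong by, say, a factor of 2. The theoretical size then shifts by

$$ \frac{R_{\lambda}^{\mathrm{BH}}(3L)}{R_{\lambda}^{\mathrm{BH}}(L)} = 3^{1/3} \approx 1.44 , \qquad \frac{R_{\lambda}^{\mathrm{BH}}(2M_{\mathrm{BH}})}{R_{\lambda}^{\mathrm{BH}}(M_{\mathrm{BH}})} = 2^{2/3} \approx 1.59 . $$

Because of the weak power-law dependence, each of these errors on its own changes the theoretical size by less than a factor of two.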

Finally, a third explanation is that an intrinsic scatter of disc properties among AGN naturally exists. For instance, the accretion efficiency, \(\eta \), directly impacts the source size (Eq. (40)), but depends on the black hole spin (e.g. Thorne 1974). However, as discussed by Morgan et al. (2010), the efficiency alone may be insufficient to explain all the microlensing data. On the other hand, a non-universal disc model is also a possibility, with a slim disc or an advection dominated flow occurring for high or low Eddington accretion rates, respectively (e.g. Lasota 2023). Therefore, it could be that the various disc models invoked to explain over-sized discs (see discussion in Sect. 4.3.2) each apply to different subsamples of systems.

In any case, none of the above solutions may be sufficient on its own to explain all microlensing measurements. Large multi-epoch surveys of the sky, like LSST, but also targeted works through e.g. spectroscopic monitoring, in combination with a careful account of potential sources of systematic uncertainties, may make it possible to design the optimal experiment for revealing the properties of the accretion disc.

4.4 Broad Line Region

Microlensing of the BLR was long thought to be a marginal effect (Schneider and Wambsganss 1990; Lewis et al. 1998). After the early measurements of the size of the BLR with reverberation mapping at the end of the 1990s, several authors suggested that BLR microlensing may be larger than previously thought (Abajas et al. 2002; Lewis and Ibata 2004). This was confirmed soon after with the first striking line deformation observed by Richards et al. (2004) in the large separation lensed quasar J1004+4112. BLR microlensing is now securely detected in more than 30 AGN and suspected to be visible in at least 80% of the microlensed AGN (Sluse et al. 2012; Guerras et al. 2013a). Often, only a small fraction (≲ 10%) of the line flux is found to be microlensed, but extreme cases have been observed where almost the whole line is magnified by microlensing (Richards et al. 2004; Keeton et al. 2006; Sluse et al. 2007, 2011). Except in the situation where some velocities are magnified and others are demagnified (Hutsemékers et al. 2023), the amplitude of microlensing of the broad lines provides a robust estimate of the projected and luminosity-weighted BLR size. Information on the structure and kinematics of the BLR is encoded in the microlensing-induced deformation of the emission line profile (Fig. 23). A variety of line deformations are observed, affecting either each wing of the line differently (Richards et al. 2004; Braibant et al. 2014), or the whole line symmetrically, with, however, the higher velocities being more magnified than the lower ones. That finding supports BLR kinematics in which the highest-velocity gas lies closer to the central engine. Overall, the diversity of broad emission line deformations is qualitatively supported by simulations. The following subsections review important insights on the BLR obtained for various emission lines.

4.4.1 High Ionization Lines

Guerras et al. (2013a) have shown, from the analysis of microlensing in image pairs in 16 strongly lensed systems, that the size of the region emitting high ionization lines (O vi\(\lambda \lambda \)1035, Ly \(\alpha \)+N v \(\lambda \lambda \) 1216, Si iv+O iv \(\lambda \lambda \) 1400 and C iv\(~\lambda \lambda \) 1549) is substantially smaller than that of the lower ionization lines, providing a confirmation of the stratification of the BLR expected from photo-ionization models and observed in reverberation mapping data. This result has recently been confirmed from an expanded sample of systems by Fian et al. (2021a). Large sample studies (Sluse et al. 2012; Guerras et al. 2013a) also reveal the frequent occurrence of microlensing in one of the line wings, indicating that the high ionization BLR may not have a spherically symmetric geometry.

Detailed studies of individual systems have provided further insights on the size and structure of the BLR for specific quasars. To date, the best studied system has been the Einstein Cross. O’Dowd et al. (2011) have shown that the microlensing-induced line deformation observed in that system favours gravitationally dominated dynamics over an accelerating outflow. Furthermore, constraints on the size and geometry of the region emitting C iv have been obtained owing to the spectroscopic monitoring carried out by Eigenbrod et al. (2008a). On the one hand, the microlensing-induced size has been found to be compatible with the \(R_{BLR}-L\) relation measured in local AGN (Sluse et al. 2011; Hutsemékers and Sluse 2021). On the other hand, forward models of the observed line deformations for 3 geometry and kinematics models (and a range of orientations with respect to the observer) ruled out an equatorial wind model, and favoured a keplerian disc over a biconical polar wind (Hutsemékers and Sluse 2021). In addition, an inclination of \(\sim 40^{\circ}\) of the BLR was favoured by these authors, in agreement with the Type 1 properties of this system and with previous constraints on the inclination derived from microlensing of the accretion disc (Poindexter and Kochanek 2010). Another system displaying salient line deformation is the large separation quad J1004+4112. The recurrent asymmetric blue-wing enhancement of C iv observed in that system has been qualitatively reproduced by Abajas et al. (2007) using a biconical geometry. The microlensing interpretation of the atypical signal observed in this system (Lamer et al. 2006) has been challenged by Green (2006), who proposed a non-microlensing explanation of the data, where each strongly-lensed image crosses different regions of warm outflowing material. As shown by several authors (e.g. Lamer et al. 2006; Hutsemékers et al. 2023), the choice of normalisation of the line plays an important role in the apparent recurrence of the blue wing enhancement and in the quantification of the velocity-dependent amplitude of microlensing. Using as baseline the flux ratios from radio data (Hartley et al. 2021), Hutsemékers et al. (2023) have shown that a microlensing-induced deformation of the line, stable over 15 years, exists, with the red wing being demagnified while the blue wing is strongly magnified. This scenario is supported by simulations that enable them to show that a keplerian disc or an equatorial wind provides a good description of the BLR.

4.4.2 Low Ionization Lines

Emission lines requiring lower ionisation energy, such as the Balmer lines (e.g. \(\mathrm{H}\alpha \,\lambda 6562.8 \) and \(\mathrm{H}\beta \,\lambda 4861.3\)) and Mg ii \(\lambda \lambda \)2798, are observed at longer wavelengths than higher ionization lines. Because most lensed AGNs lie at \(z> 1.5\), those lines have been less regularly observed and scrutinized for microlensing than higher ionization ones. Mg ii is found to be on average less microlensed than higher ionization lines, which means that its emitting region is a few times larger than that of the high ionization lines (Guerras et al. 2013a; Fian et al. 2021a). This is in agreement with the expected stratification of the BLR with increasing ionization degree.

Symmetric line deformations are the most common, but asymmetric deformations are occasionally observed. There are no statistical constraints on line deformations of Balmer lines, but both symmetric and asymmetric deformations of these lines have been reported (Sluse et al. 2007; Braibant et al. 2014, 2016). For two systems (Q 2237+0305 and HE 0435−1223), forward modelling of the observed deformation has been performed (cf. Sect. 3.5.3), strongly favouring a keplerian disc geometry (Braibant et al. 2016; Hutsemékers et al. 2019; Hutsemékers and Sluse 2021). A half-light radius \(r_{1/2}(\mathrm{H}\beta ) = 47 \pm 19\) lt-days has been derived for Q 2237+0305, about an order of magnitude below the expectation from the \(R_{BLR} - L\) relation derived from reverberation mapping (Grier et al. 2017). Hutsemékers and Sluse (2021) have proposed that this could be related to the high accretion rate estimated for that system.

The comparison of line deformations based on data obtained at the same epoch (or pairs of data accounting for the time delay) provides further insights on the existence of a common BLR structure for various atomic species. To our knowledge, there are only a handful of systems where both Mg ii and a Balmer line have been scrutinized. In HE 0435−1223, Mg ii and H\(\alpha \) are found to be similarly microlensed (Braibant et al. 2014). For WGD2038−4008, both lines are found to be free of microlensing deformation (Melo et al. 2021). This contrasts with the results obtained for RXJ 1131−1231, where a very broad and spatially compact emission has been identified only in Mg ii (Sluse et al. 2007). While differences are also observed between these two lines in SBS0909+532, the two lines were observed at very different epochs (Mediavilla et al. 2005; Guerras et al. 2013a). Therefore, the microlensing configuration could have varied substantially between the two observations, which precludes robust conclusions.

The C iii]\(~\lambda 1908.73\) line, sometimes classified as an intermediate ionization line, is commonly present in lensed AGN spectra but suffers from strong blending with Al iii and Si iii lines, which complicates the identification of microlensing-induced line deformation. Nevertheless, substantial microlensing is often identified in this line, with blue/red asymmetries similar to those found for C iv (Sluse et al. 2012; Fian et al. 2021a). This line may therefore arise from a region smaller than the one emitting low-ionization lines. In their analysis of the spectro-photometric monitoring data of Q 2237+0305, Sluse et al. (2011) found a size of the C iii] emitting region compatible with that of C iv, but with hints of a different structure in the two regions.

4.4.3 UV-Optical Iron Lines

The iron emission blend covers almost the whole UV to optical range (Sect. 2.9.3). Guerras et al. (2013b) performed the only systematic study of microlensing of UV Fe ii in a sizeable sample of 13 quasars. They focused on the range [2050-2650] Å and identified strong microlensing in 4 systems. They derived a size of the corresponding emitting region of several light-days, comparable to the accretion disc size. Evidence for compact UV Fe ii emission has also been reported in Q 2237+0305 (Sluse et al. 2011), in H1413+117 (in [1590-1680] Å; O’Dowd et al. 2015) and in J1131-1231 (in [3080-3540] Å) by Sluse et al. (2007). The low redshift of the latter system enabled those authors to compare microlensing of the UV and optical Fe ii. They found microlensing of optical Fe ii comparable to that of the Balmer lines, with the exception of the region [4630-4800] Å, which potentially arises from a more compact region.

4.4.4 Broad Absorption Lines

Broad absorption lines (\(\mathit{FWHM} > 2000~\mathrm{km}\,\mathrm{s}^{-1}\)) are present in 10 to 20% of quasars at \(2 < z < 4\), and are predominantly observed in resonance lines of ionized species such as C iv, Si iv or N v (Allen et al. 2011). The blueshifts displayed by the absorber, from \(1000~\mathrm{km}\,\mathrm{s}^{-1}\) up to 20% of the speed of light, may be the signature of outflowing gas. At X-ray wavelengths, BAL quasars are generally weak, probably due to absorption (Gallagher et al. 2006). The idea of using microlensing to probe the structure of the outflowing gas in BAL quasars was proposed very early, but has so far been applied to only a limited number of systems (e.g. Hutsemékers 1993, 1994; Lewis et al. 1998; Chelouche 2005).

The best-studied system is H1413+117, a.k.a. the Cloverleaf. The main insights on the geometry of the outflow come from the decomposition of the line profiles using the MmD decomposition (see Sect. 3.2.2). The analysis of this system (Hutsemékers et al. 2010; O’Dowd et al. 2015) reveals (a) a nearly black absorption of C iv in the microlensed flux, with the underlying line emission unveiled in the non-microlensed component; (b) an onset velocity of the absorption of about \(1500~\mathrm{km}\,\mathrm{s}^{-1}\) for C iv and a larger onset velocity for Si iv and Al iii; (c) a reabsorption of part of the C iv emission over a small wavelength range. These characteristics support radial changes in the absorbing material, which can be interpreted as a two-component outflowing wind: one component which may be co-spatial with the emission, and another one, more distant, that partially re-absorbs the BEL emission (Borguet and Hutsemékers 2010). The ionization dependence of the onset velocity of the BAL can be interpreted in the context of the disc+wind model.
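The idea behind such a decomposition can be illustrated with a short sketch. It assumes the two-component model usually adopted for this kind of analysis, in which the spectra of two images are written as F1 = M μ F_Mμ + M F_M and F2 = F_Mμ + F_M, with M the macro-magnification ratio, μ the microlensing factor of image 1, F_Mμ the microlensed part of the spectrum and F_M the part that is not microlensed. The function name, argument names and synthetic spectra below are illustrative assumptions and are not taken from the cited works.

```python
import numpy as np


def mmd_decompose(f1, f2, macro_ratio, micro_factor):
    """Split two image spectra into microlensed and non-microlensed parts.

    Assumed model, applied independently in every wavelength bin:
        f1 = macro_ratio * micro_factor * f_mmu + macro_ratio * f_m
        f2 = f_mmu + f_m
    """
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    if np.isclose(micro_factor, 1.0):
        raise ValueError("micro_factor = 1: no microlensing, nothing to separate")
    # Dividing f1 by the macro ratio and subtracting f2 cancels f_m.
    f_mmu = (f1 / macro_ratio - f2) / (micro_factor - 1.0)
    f_m = f2 - f_mmu
    return f_mmu, f_m


# Synthetic spectra (hypothetical): a flat, compact continuum that is
# microlensed, and a Gaussian emission line that is not.
velocity = np.linspace(-1.0, 1.0, 11)
continuum = np.full_like(velocity, 2.0)
line = np.exp(-velocity**2 / 0.1)
M, mu = 1.5, 2.0
spec2 = continuum + line
spec1 = M * mu * continuum + M * line

f_mmu, f_m = mmd_decompose(spec1, spec2, M, mu)
```

In this toy case the microlensed component recovers the continuum and the non-microlensed component recovers the line. In practice M and μ are not known in advance and have to be estimated from the data, which this sketch does not attempt.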

4.5 Torus

Due to limited access to the mid-infrared range from the ground and to the low resolution of mid-infrared instrumentation in the last decades, observations of lensed quasars in the mid-infrared have been rather limited. Agol et al. (2000) report the first measurements of the flux ratios in that range for the Einstein Cross. The flux ratios measured at 8.9 and 11.7 μm were found to be in agreement with those measured at radio wavelengths, although the relative uncertainties reach ∼20%. Nevertheless, this observation has been used to rule out the presence of (compact) synchrotron emission in the mid-infrared, and to support the idea that the emission in that range is dominated by dust arising from a region too large to be microlensed (Agol et al. 2000; Wyithe et al. 2002). Several additional observations of the Einstein Cross have been performed over the years (see Vives-Arias et al. 2016, for a census), in broad agreement with the early measurements of Agol et al. (2000). The absence of microlensing yields \(R_{1/2} \gtrsim 200 \sqrt{\langle M \rangle /0.3\,M_{\odot }}\) lt-days at about 11 μm (Vives-Arias et al. 2016). Due to this large size, the few other observations of lensed quasars in the mid-infrared have instead been used to search for flux ratio anomalies, interpreted as microlensing by substructures of mass \(M\sim 10^{6}-10^{8} M_{\odot}\) (e.g. Chiba et al. 2005; Minezaki et al. 2009).

Although microlensing in the mid-infrared range is small, it may not always be negligible. Observations of 6 lensed quasars at \(2.2~\upmu \text{m}\) and \(3.8~\upmu \text{m}\) by Fadely and Keeton (2011) show that microlensing can occur. Since this typically corresponds to \(\sim 1~\upmu \text{m}\) rest-frame, this may be explained by the still substantial contribution of accretion disc emission. Observations of unlensed quasars as well as simulations suggest, however, that for lower luminosity systems the hot component of the torus (peaking around 2.2 μm rest-frame) may be microlensed (Stalevski et al. 2012; Sluse et al. 2013). Upcoming observations of lensed quasars with the James Webb Space Telescope are expected to shed new light on the presence of microlensing in wavelength ranges dominated by dust emission.

4.6 Scattered and Polarized Emission

Polarization of UV-optical light in AGNs is observed at levels ranging between zero and 4-5%, with rare systems reaching polarization levels of 10-20%. Broad absorption line AGNs, radio-loud systems and blazars generally display the largest linear polarization degrees. In the UV-optical, the polarization arises predominantly from scattering of the light by electrons, and possibly dust. The exact location of the scattering region (equatorial/polar, inside/outside the BLR, etc.), and even its ubiquity in all AGN, are still entirely open questions. Because microlensing re-weights the spatial distribution of the emission in the inner regions of AGNs, it is expected to produce (de-)polarization of AGNs, hence modifying both the polarization degree and angle of microlensed images (Belle and Lewis 2000; Hales and Lewis 2007; Kedziora et al. 2011).

The first observational evidence for polarization differences induced by microlensing required the high resolution capabilities of the HST targeting the BAL quasar H1413+117 (Chae et al. 2001). It took more than a decade before resolved polarimetric and spectropolarimetric observations of individual images of lensed quasars were obtained from the ground, again for H1413+117 (Hutsemékers et al. 2010, 2015; Sluse et al. 2015). Those studies have shown a twist of the polarisation angle of image D, which has displayed slowly varying microlensing for about 25 years. This observation can be explained by the presence in that BAL quasar of two spatially separated regions producing orthogonal polarization: an equatorial region which is microlensed, and a more extended polar region. Resolved polarised observations of the two large separation lensed quasars J1004+4112 and Q0957+561 have been presented by Popović et al. (2020, 2021). Similarly, changes in the polarisation degree and angle due to microlensing are observed in those objects. This supports the existence of a compact equatorial scattering region in systems which are not BALs.
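The re-weighting argument can be made concrete with a toy calculation. The sketch below assumes two unresolved emission regions with orthogonal linear polarization, adds their Stokes parameters with a different magnification applied to each, and returns the net polarization degree and angle. The fluxes, polarization degrees and magnifications are hypothetical values chosen only for illustration.

```python
import math


def stokes_from_polarization(flux, degree, angle_deg):
    """Return (I, Q, U) for light with linear polarization `degree` at `angle_deg`."""
    angle = math.radians(angle_deg)
    return (flux,
            flux * degree * math.cos(2.0 * angle),
            flux * degree * math.sin(2.0 * angle))


def net_polarization(components):
    """Combine (magnification, (I, Q, U)) pairs into a net degree and angle.

    Each region is magnified as a whole, so its I, Q and U scale together.
    """
    i_tot = sum(mu * s[0] for mu, s in components)
    q_tot = sum(mu * s[1] for mu, s in components)
    u_tot = sum(mu * s[2] for mu, s in components)
    degree = math.hypot(q_tot, u_tot) / i_tot
    angle_deg = (0.5 * math.degrees(math.atan2(u_tot, q_tot))) % 180.0
    return degree, angle_deg


# Hypothetical regions: compact equatorial scatterer (angle 0 deg) and
# extended polar scatterer (angle 90 deg, slightly more polarized).
equatorial = stokes_from_polarization(1.0, 0.02, 0.0)
polar = stokes_from_polarization(1.0, 0.03, 90.0)

# Without microlensing the polar region sets the net angle.
p_unlensed, theta_unlensed = net_polarization([(1.0, equatorial), (1.0, polar)])
# Magnifying only the compact equatorial region rotates the net angle.
p_lensed, theta_lensed = net_polarization([(3.0, equatorial), (1.0, polar)])
```

With exactly orthogonal components the net angle can only switch between the two directions. A gradual twist requires components that are not exactly orthogonal, or a magnification that varies across the scattering region.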

In addition to an imprint of microlensing in the polarisation signal, spectroscopic data of microlensed BAL quasars have revealed the presence of extended diffuse emission separated from the accretion disc (Sluse et al. 2015; Hutsemékers et al. 2020). This discovery has been facilitated by the presence of almost black absorption covering the emission from the accretion disc but not the most extended continuum. Differential microlensing has revealed non-microlensed continuum emission contributing up to 40% of the observed continuum. Despite contributing a large fraction of the observed continuum, this emission is insufficient to explain the larger size of the accretion disc unveiled by microlensing studies (Sect. 4.3; Sluse et al. 2015). The presence of non-microlensed (pseudo-)continuum emission in non-BAL systems is yet to be demonstrated, as is its exact origin. If this emission is generically present in AGNs, a small flickering of microlensed light curves could be expected at wavelengths free of broad emission lines. While such flickering is observed in numerous systems (see Sect. 3.4.2), it has not yet been demonstrated that it is caused by extended continuum emission.

5 Lensing Galaxy Results

What can be constrained uniquely by microlensing about the lens galaxy is the partition of its mass into baryons (compact) and dark (smooth) matter, and how the baryonic component is further partitioned between the microlenses, i.e. its mass function. As we have already seen in Sect. 2.8 (see also Fig. 12), the values of \(\kappa \), \(\gamma \), \(\kappa _{*}\) at the positions of the multiple images, which are directly linked to the local values of the second derivatives of the lens potential (see Saha et al. 2024), define microlensing variability. The total mass in baryons, defined by \(\kappa _{*}\), can be further distributed in compact objects in different ways. Microlensing magnification is generally more sensitive to the mean microlens mass, but information on a mixture of masses and the IMF can be extracted under specific conditions.

5.1 Measuring the Baryonic/Dark Matter Content Under Different Constraints from Macromodels

In principle, the \(\kappa \), \(\gamma \), \(\kappa _{*}\) could all be constrained as free parameters of a microlensing model fitted to observables, but in practice, only a single free parameter is used. This can be either \(\kappa _{*}\) itself, with \(\kappa \), \(\gamma \) derived separately from a macromodel, or a parameter partitioning the total mass between a stellar and dark mass profile (usually the mass-to-light ratio), leading to a sequence of covariant \(\kappa \), \(\gamma \), \(\kappa _{*}\). A lens macromodel is a mass distribution/lens potential fitted to the imaging data of a lensed quasar. Lens modelling requires some form of the mass profile to be chosen first, e.g. SIE, power-law, etc, and proceeds by making specific assumptions for the (unknown) extended source brightness profile, including the quasar point source, and solving the lens equation to fit the lensed light components to the data (see Shajib et al. 2024). It is straightforward to calculate the values of \(\kappa \), \(\gamma \) at the multiple image positions from the resulting lens potential. One could additionally make use of the fact that light traces mass, fit a two-dimensional brightness profile to the observed lens light (e.g. a Sersic profile, spline polynomials, etc), assume some mass-to-light ratio, and thus compute a stellar-mass component to incorporate in the mass model. In combination with a dark matter component, e.g. a Navarro-Frenk-White profile, \(\kappa _{*}\) can be provided straightforwardly, albeit not in a completely unbiased way (see below).

Both flux ratio and light curve microlensing data can be used to constrain the local graininess of matter, i.e. the value of \(\kappa _{*}\), and both require simulated magnification maps. These maps are generated for given \(\kappa \), \(\gamma \), \(\kappa _{*}\) combinations derived from some macromodel and used to extract observable quantities. In this case, all the other parameters of the model – accretion disc profile, effective velocity, microlens mass – are treated as nuisance parameters and marginalised over. By fitting the model to the data, the most likely \(\kappa _{*}\), or more precisely \(\kappa \), \(\gamma \), \(\kappa _{*}\) combination, can be found. We present the results on \(\kappa _{*}\) from flux ratios and light curves below.
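As a small illustration of the quantities that enter such a simulation, the sketch below computes the macro-magnification, the image type and the smooth matter fraction from a given \(\kappa \), \(\gamma \), \(\kappa _{*}\) combination. It uses the standard thin-lens expression \(\mu = 1/[(1-\kappa )^{2}-\gamma ^{2}]\) and takes \(\gamma \) as the shear magnitude. The function names and numerical values are illustrative assumptions, not those of any particular study.

```python
def macro_magnification(kappa, gamma):
    """Signed macro-magnification 1 / ((1 - kappa)^2 - gamma^2)."""
    det = (1.0 - kappa) ** 2 - gamma ** 2
    if det == 0.0:
        raise ValueError("image lies on a critical line: magnification diverges")
    return 1.0 / det


def image_type(kappa, gamma):
    """Classify an image from the signs of the two eigenvalues 1 - kappa -/+ gamma."""
    lam_minus = 1.0 - kappa - abs(gamma)
    lam_plus = 1.0 - kappa + abs(gamma)
    if lam_minus * lam_plus < 0.0:
        return "saddle"
    return "minimum" if lam_minus > 0.0 else "maximum"


def smooth_fraction(kappa, kappa_star):
    """Fraction of the local convergence in smooth (non-compact) matter."""
    if not 0.0 <= kappa_star <= kappa:
        raise ValueError("kappa_star must lie between 0 and kappa")
    return 1.0 - kappa_star / kappa


# Hypothetical values at two image positions.
mu_min = macro_magnification(0.4, 0.4)      # positive parity image
mu_saddle = macro_magnification(0.6, 0.6)   # negative parity image
s = smooth_fraction(0.4, 0.08)
```

A magnification map for an image would then be generated for the triplet (\(\kappa \), \(\gamma \), \(\kappa _{*}\)) at that position, with the sign of the macro-magnification distinguishing minima from saddle-points.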

5.1.1 Flux Ratios

The idea of constraining \(\kappa _{*}\) from microlensing flux ratios was originally proposed by Schechter and Wambsganss (2002), who found that a smooth mass component in addition to microlenses can increase variability especially for saddle-points. Bate et al. (2007) and Congdon et al. (2007) found that a finite source can in fact cause the same broadening of microlensing magnification distributions, requiring the two parameters – source size and \(\kappa _{*}\) – to be studied simultaneously. Mediavilla et al. (2009) studied flux ratios in 29 image pairs obtaining their \(\kappa \), \(\gamma \) from a SIS plus external shear macromodel and found an overall \(5_{-3}^{+9}\) per cent of matter in stars (at 90 per cent confidence) for a given source size (their result is positively correlated with the source size and probably contains various other systematic effects and biases, see the discussion in Mediavilla et al. 2009). In a later study, Jiménez-Vicente et al. (2015a) and Jiménez-Vicente et al. (2015b) took into account the source size simultaneously with \(\kappa _{*}\) and found consistently 20 per cent of matter in stars at \(1{-}2\) effective radii for a sample of 18 lenses from flux ratios in the optical and X-rays respectively. They used the more realistic SIE plus external shear macromodels from Schechter et al. (2014), which are known to produce smaller magnifications. Bate et al. (2011) found \(20{-}50\) per cent of matter in stars within \(5{-}10\) kpc of the lens center for two systems after marginalizing over the source properties (size and temperature profile) and \(\kappa _{*}\) for Q 2237+0305. This was an expected outcome for this system because the multiple images are located so close to the lens center that the stellar mass component is comparable to, or even dominates over, smoothly distributed dark matter (see references in Bate et al. 2011, for the macromodels they used). Pooley et al. (2012) found 7 per cent of matter in stars at a mean distance of 6.6 kpc from the lens center by analyzing an ensemble of 14 quadruply lensed quasars and a decreasing trend of \(\kappa _{*}\) with distance. They derived their \(\kappa \), \(\gamma \) from a SIS plus external shear macromodel (see also Pooley et al. 2007; Blackburne et al. 2011).
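The observable in these studies is the departure of a measured flux ratio from the macromodel prediction. A minimal sketch of how this is commonly expressed in magnitudes is given below; the function name, the sign convention (negative values mean that image A is brighter than predicted relative to image B) and the input values are assumptions made for illustration.

```python
import math


def microlensing_delta_mag(flux_a, flux_b, mu_a, mu_b):
    """Magnitude offset of an observed flux ratio from the macromodel baseline.

    flux_a, flux_b : observed fluxes of images A and B (same units)
    mu_a, mu_b     : macro-magnifications (only absolute values are used)
    """
    observed = -2.5 * math.log10(flux_a / flux_b)
    baseline = -2.5 * math.log10(abs(mu_a) / abs(mu_b))
    return observed - baseline


# Hypothetical pair: the macromodel predicts A twice as bright as B,
# but A is observed four times as bright.
delta_m = microlensing_delta_mag(4.0, 1.0, 2.0, -1.0)
```

This offset only isolates microlensing if intrinsic variability combined with the time delay, extinction and emission line contamination have been accounted for, which are the systematic effects discussed in this section.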

The results summarized above are prone to systematic errors in measurements and degeneracies in modelling that are specifically associated with flux ratio snapshots. As discussed previously (see Sect. 4.3), analyzing flux ratios with microlensing is more robust for close image pairs that have almost the same macro-magnification and small time delays. But properly deblending the image fluxes can be tricky precisely because of the proximity of the images, while unaccounted-for intrinsic source variability can lead to \(\kappa _{*}\) overestimates (Mediavilla et al. 2009). Any overlap of the observed wavelengths (e.g. for a broad band) with emission lines, especially broad lines that are known to undergo microlensing as well, could contaminate the measured flux ratios. The extent of the chromatic variation and the level of the no-microlensing baseline, which are known to affect source measurements (Bate et al. 2018), could also play a role for measuring \(\kappa _{*}\), but this has not been investigated yet. Finally, there are cases where the flux ratio data are simply not sufficient to lead to any conclusive \(\kappa _{*}\) results (e.g. Bate et al. 2008; Floyd et al. 2009).

5.1.2 Light Curves

Constraining \(\kappa _{*}\) from light curve data has proceeded largely by applying the fitting method of Kochanek (2004). Studies using this approach have adopted a different strategy from calculating \(\kappa \), \(\gamma \) from a macromodel and having \(\kappa _{*}\) as a free parameter. Their mass model consists of a combination of a NFW halo and a concentric de Vaucouleurs component for the mass in stars, obtained by fitting the light of the lensing galaxy and assuming a constant mass-to-light ratio (see also Lehar et al. 2000). The total mass being conserved, a single free parameter, \(f_{\mathrm{{ML}}}\), describes the combination of the two components, from purely stars (\(f_{\mathrm{{ML}}}=1\)) to just a NFW halo (\(f_{\mathrm{{ML}}}=0\)). Varying this parameter leads to different \(\kappa \), \(\gamma \), \(\kappa _{*}\) combinations, not just \(\kappa _{*}\) (and also a varying mass-to-light ratio across the lensing galaxy). Examples of such families of models are shown in Fig. 13 – the points from Morgan et al. (2008), Dai et al. (2010), and MacLeod et al. (2015).

Light curve data analysis favours low values of \(\kappa _{*}\), albeit not decisively. Chartas et al. (2009) and Dai et al. (2010) find \(f_{\mathrm{{ML}}}=0.2\) and 0.3 for lensed quasars HE1104−1805 and RXJ 1131−1231 respectively. Modest trends supporting dark matter dominated models are found for PG1115+080 (Morgan et al. 2008), QJ0158−4325 (Morgan et al. 2012), and WFI2026−4536 (Cornachione et al. 2020a), while measurements for Q0957+561 (Hainline et al. 2012), SBS0909+532 (Hainline et al. 2013), and J0924+0219 (Morgan et al. 2006; MacLeod et al. 2015) have been inconclusive.

One of the biggest caveats of analyzing light curves is the very demanding computations (Kochanek 2004; Poindexter and Kochanek 2010) that have resulted in the analysis of only \(\approx 10\) systems (Cornachione and Morgan 2020). However, the generally weak trends in \(\kappa _{*}\) are probably due to insufficient uncorrelated variability in the light curves to constrain the dark matter fraction (Hainline et al. 2012). Including the time dimension, as opposed to snapshots, comes at the cost of additional parameters, mainly in relation to the effective velocity, but this alone does not seem to affect the measurements of \(\kappa _{*}\).

5.1.3 Impact of Macromodel Uncertainty on Measuring \(\kappa _{*}\)

Apart from the possible systematic biases and statistical uncertainties inherent to the analysis methods described above, any uncertainty of the macromodel itself can potentially have an impact on \(\kappa _{*}\) measurements. Vernardos and Fluke (2014b) have shown that \(\varDelta \kappa \), \(\varDelta \gamma \) as small as 0.02 can lead to significant differences in the properties of microlensing magnification. Macromodel biases and uncertainties of this magnitude are not uncommon and can be due to three reasons that we briefly list here. First, they can be due to missing ingredients of the mass model, like perturbers, disc- or bar-like structures, or higher order moments in the potential (e.g. see Van de Vyvere et al. 2022, for an example of how the latter affect time delays). For example, not accounting for the nearby galaxy G2 and a few close-by perturbers in the model of WFI2033−4723 yields a relative change of \(\kappa \) by up to 60 per cent, and of the absolute macro-magnification by up to a factor of 2 for some of the lensed images (Rusu et al. 2020). The presence of any other massive substructure (\(>10^{6} M_{\odot}\)), whose impact on flux ratios can be dramatic (Mao and Schneider 1998; Dalal and Kochanek 2002; Metcalf and Zhao 2002), can bias the values of \(\kappa \), \(\gamma \) locally, albeit with different characteristics compared to microlensing (Inoue 2016). Second, there is a fundamental degeneracy between the initial mass function (IMF) and the dark matter fraction (e.g. see Oguri et al. 2014; Foxley-Marrable et al. 2018), which, in short, states that the same lens potential and light distribution can be produced by a low dark matter fraction and many low-mass (and less luminous) stars, or a high dark matter fraction and fewer but brighter stars. Finally, the mass-sheet degeneracy (Falco et al. 1985; Gorenstein et al. 1988), i.e. scaling the lens galaxy mass distribution and adding a constant surface mass density (mass-sheet), leaves observables such as the image positions, shapes, and flux ratios, unchanged but affects the \(\kappa \), \(\gamma \). Although there are ways to mitigate these effects, e.g. by breaking the mass-sheet degeneracy through modelling of kinematics data (see Shajib et al. 2024) or explicitly modelling substructures (see Vegetti et al. 2024), the complexity of the true lens potential and the corresponding quality of a macromodel will eventually affect microlensing measurements.
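The effect of the mass-sheet degeneracy on the local lensing parameters can be checked with a few lines of code. The sketch below applies the standard mass-sheet transformation, \(\kappa \rightarrow \lambda \kappa + (1-\lambda )\) and \(\gamma \rightarrow \lambda \gamma \), to two images and compares the flux ratio before and after. The \(\kappa \), \(\gamma \) values and the choice \(\lambda = 0.9\) are hypothetical.

```python
def mass_sheet_transform(kappa, gamma, lam):
    """Rescale the mass distribution by lam and add a sheet of density 1 - lam."""
    return lam * kappa + (1.0 - lam), lam * gamma


def magnification(kappa, gamma):
    """Signed macro-magnification 1 / ((1 - kappa)^2 - gamma^2)."""
    return 1.0 / ((1.0 - kappa) ** 2 - gamma ** 2)


# Hypothetical convergence and shear at two image positions.
image_a = (0.45, 0.40)
image_b = (0.55, 0.60)
lam = 0.9

ratio_before = magnification(*image_a) / magnification(*image_b)

image_a_mst = mass_sheet_transform(*image_a, lam)
image_b_mst = mass_sheet_transform(*image_b, lam)
ratio_after = magnification(*image_a_mst) / magnification(*image_b_mst)

# Every magnification is multiplied by 1 / lam^2, so the ratio is unchanged
# even though kappa and gamma at each image have moved.
scale = magnification(*image_a_mst) / magnification(*image_a)
```

Because \(1-\kappa \) and \(\gamma \) are both multiplied by \(\lambda \), the flux ratio is preserved while the \(\kappa \), \(\gamma \) pair passed to a magnification map simulation changes, which is why this degeneracy matters for microlensing even though it leaves the imaging observables untouched.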

5.2 Microlens Mass and the IMF

Microlensing variability depends on the relative size of micro-caustics on the source plane with respect to the emitting source. A typical value for this size is \(R_{\mathrm{E}}\), which is \(\propto (M/\mathrm{M}_{\odot})^{1/2}\) (Eq. (7)), where M is the microlens mass. The masses for a population of microlenses are, in principle, free to vary in the broad range roughly from small star clusters (or even intermediate mass black holes) to sub-stellar objects like planets and moons. In the latter case, one can speak of ‘nano-lensing’, which is still relevant for quasars as long as there is a wavelength to observe in which their emitting region is accordingly small. It turns out that microlensing variability (at a fixed source size) depends primarily on the microlens mean mass and less on the shape of the mass spectrum. However, the IMF of the lensing galaxy can still be probed by measuring the microlens mean mass across different wavelengths (or by indirectly measuring an overall mass-to-light ratio for the galaxy lens, see below). This is suggested by observations as well, for example, by the lack of correlation between the microlensing variability in optical and X-ray observations found by Mosquera et al. (2013) (see also Pooley et al. 2007), where the former could be due to stellar microlenses and the latter due to planetary mass nanolenses (e.g. Guerras et al. 2020).
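To give a feeling for the scales involved, the sketch below evaluates the Einstein radius projected on the source plane, assuming the standard expression \(R_{\mathrm{E}} = \sqrt{(4GM/c^{2})\,D_{S}D_{LS}/D_{L}}\) with angular diameter distances. The distances used are round hypothetical numbers rather than those of a specific system, and the function name is an illustrative choice.

```python
import math

G = 6.67430e-11            # gravitational constant [m^3 kg^-1 s^-2]
C = 2.99792458e8           # speed of light [m s^-1]
M_SUN = 1.98847e30         # solar mass [kg]
MPC = 3.0856775814913673e22   # megaparsec [m]
LIGHT_DAY = C * 86400.0    # light-day [m]


def einstein_radius_source_plane(mass_msun, d_l_mpc, d_s_mpc, d_ls_mpc):
    """Einstein radius of a point mass projected on the source plane, in metres.

    The three distances are angular diameter distances in Mpc
    (observer-lens, observer-source, lens-source).
    """
    schwarzschild_term = 4.0 * G * mass_msun * M_SUN / C ** 2
    distance_term = d_s_mpc * d_ls_mpc / d_l_mpc * MPC
    return math.sqrt(schwarzschild_term * distance_term)


# Hypothetical distances, for illustration only.
D_L, D_S, D_LS = 1000.0, 1700.0, 1200.0

r_e_solar = einstein_radius_source_plane(1.0, D_L, D_S, D_LS)
r_e_planet = einstein_radius_source_plane(1e-6, D_L, D_S, D_LS)
r_e_solar_ltdays = r_e_solar / LIGHT_DAY
```

The square-root scaling means that a microlens of \(10^{-6}\) solar masses has an Einstein radius a thousand times smaller than a solar-mass one, so only correspondingly compact emitting regions respond to it.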

It has been known from early on (Wambsganss 1992; Lewis and Irwin 1995; Wyithe and Turner 2001; Congdon et al. 2007) that the expected microlensing magnification shows very little dependence on the shape of the mass spectrum of the microlenses. Consequently, a widely adopted approach has been using a fixed-mass microlens population to produce magnification maps. Schechter et al. (2004) found that magnification histograms can be substantially different in the case of a mix of two such populations whose masses differ by more than a factor of ten but have a comparable total mass. Esteban-Gutiérrez et al. (2020) further examine such bi-modal mass distributions and find that they can be replaced by a single population with mass equal to their geometric mean. Both studies admit that they intentionally focus on extreme cases and that for extended sources the low mass component can effectively behave like a mass-sheet – a case equivalent to a single fixed-mass population with a different \(\kappa _{*}\) and the low-mass objects absorbed into a smooth component.

The source appears the smallest in the X-rays and allows one to probe the sub-stellar mass range for the microlenses. The innermost accretion disc is known to emit the relativistically blurred iron Fe K\(\alpha \) line in the X-rays (Reynolds and Nowak 2003). Lensed quasars can display shifts in the peak energy of this line over time, attributed to microlensing (Chartas et al. 2017). Dai and Guerras (2018) and Bhatiani et al. (2019) modelled such shifts in three systems and interpreted them as due to the presence of a population of sub-stellar mass objects (\(10^{-8} - 10^{-3}~\mathrm{M}_{\odot}\)). They base their argument on the fact that the cross section of an X-ray source is too small for microlensing by stellar-mass objects. This can be understood by considering the caustic density resulting from the same total mass density divided into a few stellar-mass microlenses compared to many more smaller mass ones. As the caustic density increases with smaller masses, so does the cross section for microlensing of a (very small) X-ray source (see Fig. 2 in Bhatiani et al. 2019). The study of Guerras et al. (2020) similarly supports planetary masses for the microlenses using the standard deviation of distributions of uncorrelated X-ray flux ratios over time (i.e. a collapsed light curve).
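The counting argument can be sketched numerically. For equal-mass point lenses, \(\kappa _{*}\) equals the number of lenses times the area of one Einstein disc divided by the area considered, so at fixed \(\kappa _{*}\) the number of lenses in a fixed physical area scales inversely with their mass. The function below uses the number of microlenses as a simple proxy for the caustic density; its name, the reference mass and the numerical values are assumptions made for illustration.

```python
import math


def number_of_microlenses(kappa_star, area_ref_units, mass_ratio):
    """Number of equal-mass microlenses needed to produce kappa_star.

    area_ref_units : area of the region in units of the squared Einstein
                     radius of a reference mass M_ref
    mass_ratio     : microlens mass divided by M_ref
    """
    # One lens of mass M covers pi * (M / M_ref) in these units.
    return kappa_star * area_ref_units / (math.pi * mass_ratio)


kappa_star = 0.1
area = 20.0 * 20.0   # a 20 x 20 Einstein radii patch (hypothetical)

n_stellar = number_of_microlenses(kappa_star, area, 1.0)
n_planetary = number_of_microlenses(kappa_star, area, 1e-3)
```

Dividing the same \(\kappa _{*}\) into objects a thousand times less massive gives a thousand times more lenses in the same patch, each with a caustic network smaller by a factor of about thirty in linear size, which is what raises the chance that a very compact source sits close to a caustic.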

Observations in the optical are mostly affected by stellar-mass microlenses. Jiménez-Vicente and Mediavilla (2019) calculate the microlens mean mass of a sample of 24 lensed quasars to be \(0.08 < \mathrm{M/M}_{\odot} < 0.21\) (at 68 per cent confidence) using optical and X-ray flux ratios. They used fixed-mass simulations and marginalized over all the other free parameters of their model. Nevertheless, there was still a degeneracy with the source size that was broken by using a size prior from reverberation mapping (Mediavilla et al. 2017). Using light curve data for Q 2237+0305, Kochanek (2004) and Poindexter and Kochanek (2010) measured \(0.006 < \mathrm{M/M}_{\odot}< 0.2\) and \(0.12 < \mathrm{M/M}_{\odot} < 1.94\) (at 68 per cent confidence) respectively. Both studies highlight the importance of the velocity priors used to get these marginally consistent measurements. The latter uses longer light curves and a velocity model that includes the random motions of the microlenses, which could explain the higher mass values found. Under similar dependence on velocity priors, MacLeod et al. (2015) and Morgan et al. (2018) found \(0.016 < \mathrm{M/M}_{\odot} < 0.8\) and \(0.03 < \mathrm{M/M}_{\odot} < 0.44\) (at 68 per cent confidence) for J0924+0219 and WFI2033−4723 respectively, while Cornachione et al. (2020b) found \(0.1 < \mathrm{M/M}_{\odot} < 1\) for Q0957+561 and a somewhat unexpected order of magnitude smaller mass range for SBS0909+532.

The interest in probing higher masses (up to \(100~\mathrm{M}_{\odot}\)) has been kept high due to observations towards the Galactic bulge by the MACHO experiment (Alcock et al. 2000) and the recent detection of gravitational waves by LIGO/Virgo, attributed to high-mass (\(\approx 40~\mathrm{M}_{\odot}\)) black hole mergers. Interpreting the former observations is arguably model-dependent and can allow for more mass in the form of dark, compact objects (Hawkins 2015). Hawkins (2020) reached this conclusion in order to explain microlensing variability in lensed quasar light curves as well. However, only theoretical arguments were put forward to attribute this to stellar-mass black holes (and not e.g. stars) and microlensing observables were not modelled explicitly using such mass distributions. A re-analysis of the same data by Awad et al. (2023) suggested that a standard explanation, where galaxies are composed of a population of stars with a standard IMF and smoothly distributed dark matter, cannot be ruled out. Similarly, Esteban-Gutiérrez et al. (2022) found a negligible fraction of mass in \(\approx 30~\mathrm{M}_{\odot}\) objects by explicitly using a mass spectrum in their magnification map simulations.

In addition to directly embedding a mass spectrum in microlensing models, there is also an indirect way the IMF can be probed through the total mass in microlenses. This is based on breaking the IMF-dark matter fraction degeneracy (e.g. see Oguri et al. 2014), i.e. the lens total mass (\(\kappa \), coming from the macromodel) and the lens light distribution can be attributed to either a low dark matter fraction and many low-mass (and less luminous) stars, or a high dark matter fraction and fewer but brighter stars. An example is the stellar mass fundamental plane (Hyde and Bernardi 2009), which is obtained by converting surface brightness to stellar mass through an assumption on the IMF (e.g. see Kauffmann et al. 2003). The spectral data used to achieve this light-to-mass conversion suffer from negligible contributions from stellar remnants, brown dwarfs, and red dwarfs that are too faint, hence placing loose constraints on the lower mass part of the IMF. Schechter et al. (2014) recalibrate the mean of the IMF (a Salpeter one in this case) to a higher mass based on microlensing measurements of X-ray flux ratios in 10 lensed quasars. We note that this was an indirect measurement; although the magnification maps used had a microlens mass spectrum (assumed to be a Kroupa IMF, e.g. Blackburne et al. 2011), its form was fixed and the free parameter of the model was in fact \(\kappa _{*}\). Oguri et al. (2014) followed a similar approach to constrain the shape of the IMF, finding a preference for Salpeter over Chabrier (see Schechter et al. 2014, for a detailed comparison of the methodology of the two studies). Finally, Vernardos (2019) presented a theoretical study of how optical flux ratios could be used to constrain the IMF. They found that the IMF could be measured better in a sample of doubly-lensed quasars (due to their higher numbers) that additionally have higher ratios of Einstein to effective radius (i.e. their multiple images appear further away from the lens-galaxy light).

5.3 Convergence and Shear, \(\kappa \), \(\gamma \)

As mentioned at the beginning of Sect. 5.1, the values of \(\kappa \), \(\gamma \) can in principle be constrained alongside \(\kappa _{*}\) and the mean microlens mass from microlensing data. However, in practice this is almost never the case due to the very high computational cost of creating and sampling from many magnification maps. An exception is the study of MG0414+0534 by Vernardos (2018), where \(\kappa \), \(\gamma \) were treated as free parameters and constrained (over an area) from flux ratio data independently from any macromodel. Vernardos and Tsagkatakis (2019) tested a proof-of-concept that the values of \(\kappa \), \(\gamma \) could be measured from microlensing light curves as well. Although promising, these studies have their own limitations and their applicability to a wide range of systems/data remains to be explored. Achieving such measurements will constitute an entirely new probe of matter in galaxies and provide constraints to macromodels – \(\kappa \), \(\gamma \) are directly linked to the second derivatives of the lens potential – with potential application to galaxy evolution, time delay cosmography, etc.
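
To make the link to the lens potential concrete, the following minimal sketch evaluates the convergence and shear from the second derivatives of a potential \(\psi \), together with the resulting macro-magnification. It assumes the standard definitions \(\kappa = (\psi _{11}+\psi _{22})/2\), \(\gamma _{1} = (\psi _{11}-\psi _{22})/2\), \(\gamma _{2} = \psi _{12}\), and \(\mu = 1/[(1-\kappa )^{2}-\gamma ^{2}]\). The symbol \(\psi \) and the function names are illustrative and may differ from the notation used elsewhere in this review.

```python
import math

def convergence_shear(psi_11, psi_22, psi_12):
    """Return (kappa, gamma) from the second derivatives of the lens potential."""
    kappa = 0.5 * (psi_11 + psi_22)
    gamma_1 = 0.5 * (psi_11 - psi_22)
    gamma_2 = psi_12
    gamma = math.hypot(gamma_1, gamma_2)  # shear magnitude
    return kappa, gamma

def macro_magnification(kappa, gamma):
    """Signed magnification; infinite on a critical line, where the determinant vanishes."""
    det = (1.0 - kappa) ** 2 - gamma ** 2
    return math.inf if det == 0.0 else 1.0 / det

if __name__ == "__main__":
    # Assumed example values, not taken from any specific lens model.
    kappa, gamma = convergence_shear(0.8, 0.0, 0.0)
    print(kappa, gamma, macro_magnification(kappa, gamma))
```

A negative value of the magnification returned here corresponds to a saddle-point image, which is why the sign is kept rather than the absolute value.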

6 Future Prospects and Open Questions

From a curiosity applicable to only a handful of systems, microlensing has steadily matured over the years, both in terms of observations and methodology. It can now complement well-established fields like reverberation mapping (for AGN structure e.g. see Cackett et al. 2021) and galaxy-galaxy lensing (for the mass partition between baryons/dark matter and the IMF, e.g. see Shajib et al. 2024 and Vegetti et al. 2024). Microlensing constitutes an additional observational probe based on an “orthogonal” set of assumptions, e.g. it is not sensitive to the brightness profile of AGN emission in the same way as reverberation techniques, nor does it critically depend on the shape of the mass profile of the lens and the macroscopically observed brightness distribution of the lensed source (shape of the arcs, rings, etc). As such, it can be combined with more traditional methods, for example, to study the kinematics and morphology of the BLR (Garsden et al. 2011) and to break the degeneracy between the stellar IMF and the dark matter fraction (Oguri et al. 2014). Moreover, microlensing provides the additional advantage of probing objects that may be hard to study with standard methods. For instance, microlensing measurements are independent of the level of intrinsic variability of the source (a needed prior for the reverberation method to work), naturally probe objects at cosmic noon (i.e. \(1.5 < z < 2.5\)), and, thanks to macro-magnification, enable the study of even faint systems at high redshift.

Currently, several hundred lensed quasars are known (see Lemon et al. 2024) but suitable microlensing data have been obtained for only a few tens of them. In the next decade, the field will be revolutionised by the advent of all-sky surveys, like LSST and Euclid, which are not only expected to discover thousands of new systems but will also provide high-quality data to perform microlensing analyses. Microlensing holds great potential for addressing major science questions, provided it meets the challenge of the upcoming avalanche of data and adapts and refines its analysis techniques accordingly. The following subsections outline these three points in detail.

6.1 Science Potential

There is a treasure trove of information to be mined for the central quasar engine from monitoring data and in particular from HMEs. Once thousands of lensed quasars are discovered and regularly observed, we will be poised to start observing tens or even hundreds of such events. During their peaks, the inner accretion disc and the flow of material through the ISCO are magnified and encoded in the microlensing signal. With high cadence (<nightly) and signal-to-noise ratio data, especially in the X-rays and optical-UV wavelengths (could also include radio, see Sect. 4.6), we will be able to perform a variety of tests on accretion disc theory and the theory of relativity.

Two main microlensing results for the accretion disc are its size and temperature profile, which are respectively found larger and shallower than expected. More insights into these somewhat puzzling results would be gained by simply increasing the number of studied systems from the current 15 or so. Even without new methods better adapted to “big data”, simply turning the crank with existing ones would suffice for this goal.

Microlensing has enabled the measurement of the BLR half-light radius, in general agreement with expectations from the \(R-L\) relations from reverberation mapping. It has additionally confirmed its increase in size with ionization degree for tens of systems. For a small subsample of objects, constraints on the BLR kinematics have been set by modeling the broad line deformations. There is great potential in expanding such works to more systems. The best observational setup for such work consists of spatially resolved spectroscopy, ideally from space or from the ground in combination with adaptive optics. The modeling of the kinematics is computationally more demanding, and may benefit from the development of innovative modeling strategies that can be massively applied to order(s) of magnitude more systems.

Directly measuring the graininess of matter and by extension the mass density in stars can shed light on the interplay of dark matter with baryons throughout galaxy evolution. This critical information is very hard to obtain with other methods, which are prone to degeneracies and assumptions. The goal is to measure \(\kappa _{*}\) as a function of galactic radius within the lens or per galaxy type (e.g. Pooley et al. 2012). Such microlensing results could be combined with other modelling methods as priors, e.g. galaxy-galaxy lens modelling (see Shajib et al. 2024), used to break degeneracies (e.g. Oguri et al. 2014), or used to recalibrate scaling relations like the fundamental plane (e.g. Schechter et al. 2014). Again, just the sheer number of new measurements will be enough for gaining important new knowledge.

The expected large number of new observations will enable slicing into subcategories sharing similar physical properties (e.g. quasar black hole mass and Eddington ratio, lens galaxy type and/or age, etc). This will be crucial for drawing a refined picture of AGN structure and for understanding how their properties depend on key physical characteristics. This will be particularly powerful in combination with improved methods (see below), which would make the most out of every single measurement, i.e. not just counting on large number statistics.

Finally, we would like to make a brief comment on the potential of performing cosmological studies with microlensing. Firstly, understanding their central engine can clarify whether quasars can be used as standard candles, e.g. with relations such as the one between the luminosity in the X-rays and UV light (Risaliti and Lusso 2019). These are exactly the wavelengths where the quasar is the most prone and sensitive to microlensing. Secondly, microlensing is the only probe for peculiar velocities of galaxies at high redshifts (\(z>0.1\), Mediavilla et al. 2016). Such velocities can in turn be used to constrain the growth of structure (e.g. Koda et al. 2014) and shed more light on the \(\sigma _{8}\) tension between early and late Universe (Di Valentino et al. 2021).

6.2 Future Observations

The two main and invaluable sources for microlensing data in the next decade will be the all-sky surveys by the Vera Rubin observatory and the Euclid satellite, both starting to provide data within 2023. The former will provide \(\sim 800\) epochs across 6 bands over a decade for each patch of sky (on average), while the latter will carry out a survey at a much higher spatial resolution (space-based). Both will provide data suitable for microlensing studies for thousands of systems without explicit need for follow-up: LSST will provide 10-year long light curves and Euclid flux ratios in the optical and infrared. The LSST light curves in particular will not only be suitable for long-term microlensing variability studies (see Sect. 3.4) but they could also be used to predict the onset of HMEs.
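
As a rough guide to what these numbers imply for light curve sampling, the short calculation below converts \(\sim 800\) epochs over 10 years into a mean spacing between observations. It assumes uniform sampling with no seasonal gaps and, for the per-band figure, an even split of epochs among the 6 bands. Both are simplifying assumptions made here for illustration and do not describe the actual survey strategy.

```python
DAYS_PER_YEAR = 365.25

def mean_spacing_days(n_epochs, n_years):
    """Mean time between consecutive epochs, assuming uniform sampling."""
    return n_years * DAYS_PER_YEAR / n_epochs

# All bands combined: ~800 epochs over a decade.
all_bands = mean_spacing_days(800, 10)

# Single band, assuming (hypothetically) an even split over 6 bands.
per_band = mean_spacing_days(800 / 6, 10)

print(f"all bands: {all_bands:.1f} d, per band: {per_band:.1f} d")
```

Under these assumptions the mean spacing is roughly 4.6 days when all bands are combined and roughly 27 days in any single band, which is the relevant comparison for the duration of the microlensing features of interest.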

Imaging and high cadence photometry will play a key role in probing the accretion disc, but a lot may be gained by obtaining spatially resolved spectroscopy. This will be possible for systems much fainter than current ones, owing to the upcoming 30-m class telescopes. Such data may be instrumental in deblending the disc emission from any other superimposed source of extended (pseudo)-continuum emission. At the same time, BLR deformations may be studied in detail.

In X-rays, almost all of the progress made in constraining the size of the corona, the inner disc radius, and the dark matter content of the lensing galaxies has been a result of the ability to determine the X-ray fluxes of each image of a lensed quasar. This task requires sub-arcsecond spatial resolution in X-rays, and Chandra has been the only observatory with this capability. Launched in 1999, the mission has made incredible contributions to this field, but the satellite is aging and the main detector has lost most of its effective area below 1 keV. Currently, there is no mission with similar or better spatial resolution in X-rays that has been approved by any of the major space agencies. The Astro2020 Decadal recommended that NASA plan for the launch of the next flagship X-ray mission no earlier than the late 2040s. Such a mission would likely have Chandra-like or better spatial resolution.

In the interim, NASA Astrophysics has introduced a new line of satellites (Probes), the first of which is to be launched in the early 2030s and will be either a far-infrared mission or an X-ray mission. One of the X-ray mission concepts (AXIS) aims to have spatial resolution similar to Chandra but with an order of magnitude more collecting area. If such a mission were to meet these specifications, it would be able to make substantial progress in this field both by observing lensed quasars at higher redshifts / lower fluxes and by extending the X-ray light curves of Chandra-observed lensed quasars to over 30- and likely over 40-year baselines.

The presence of microlensing at radio wavelengths is still debated with only two claimed signatures in existing data. There is a possible confirmation bias in not finding more cases. A dedicated observing program is required to investigate the question in a systematic way. In the near future, the Square Kilometer Array may become an ideal facility to confirm or rule out the existence of microlensing in the radio domain, and probe radio emission at even higher scales than possible with interferometry.

The impact of microlensing on the polarisation of the quasar light has not yet been explored much, but important results on the presence of scattered light have been obtained. A polarimetric imaging survey with a 2-m class telescope may enable a systematic study of the presence of scattered light in a larger number of microlensed quasars, possibly contributing to understanding oversized accretion discs. Spectropolarimetry may naturally complement photometric data, in particular for the study of the structure of BAL quasars. Tools for simulating polarized signal are basically in place, but data are required to develop the field.

6.3 Developing New Methods

For all microlensing applications, the increase in sample size will require novel analysis approaches. New technologies like GPUs have already helped in understanding the magnification map parameter space that would have otherwise been inaccessible due to the high computational cost. This remains true for the study of systematic biases in the analysis methods; a few works have explored this, albeit in a limited fashion, for the single-epoch method, while a similar study for the light curve method is hindered by its very resource-demanding computations. This is especially true when long light curves most likely containing more complex signals are used; one such example could be the role of the dynamic nature of the microlensing map (individual \(\boldsymbol{\upsilon}_{\star}\) terms per microlens in Eq. (57)). Identifying and mitigating such biases will be essential for delivering groundbreaking science with microlensing. One way to achieve this is by understanding how the length of observations, the amount of microlensing amplitude variations within it, and errors in the macromodel \(\kappa \), \(\gamma \) and magnification can affect resulting quasar structure measurements and correct for it, and similarly for determining the stellar matter fraction, \(\kappa _{*}\). The alternative possibility of simply allowing for large number statistics to improve the precision of the results (without a guaranteed improvement on the accuracy due to possible systematic biases), although more straightforward to achieve, would prevent reaching the full potential of microlensing by, for example, studying subcategories of objects with key properties of interest (e.g. as a function of redshift).

Because the BLR covers a larger fraction of the microlensing map than the disc, its modeling might be affected by different systematic uncertainties than the latter. The modeling of the BLR together with the accretion disc limits the range of events reproducing the data. It may be worth investigating if this is associated with a gain in precision and accuracy on the retrieved sizes. A drawback of the BLR modeling strategy is the need for microlensing maps that are sufficiently large for modeling the BLR, and of sufficiently high resolution to constrain the disc. Another challenge that arises with the increase in the number of observational constraints (e.g. the microlensing in many velocity bins in the case of the BLR) is the difficulty for the models to reproduce the full signal down to the noise. It often happens that only a few tens of realisations out of millions of trials successfully reproduce the data. Simulations may help to find out if this is evidence for the need of more complex models, or if the observed events are simply rare. Whatever the answer, the computational cost is large, and alternatives to the existing “Monte-Carlo” sampling are desired. We note that this problem is not specific to BLR modeling; for instance, the models of high photometric precision light curves are generally unable to reproduce the data down to the noise. In the latter case, however, there are good reasons to think that it is not the statistics that are to blame, but the model that ignores non-disc contributions to the broad band flux (e.g. BLR emission blended with disc emission).
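
The cost of this brute-force approach can be appreciated with a toy sketch of Monte-Carlo track sampling: random straight tracks are drawn across a map, compared to the data, and kept only if the fit is good enough. This is an illustration under strong assumptions and not the implementation of any study cited here. In particular, `toy_map` is uncorrelated Gaussian noise standing in for a magnification map in magnitudes, and all function names, the step size, and the acceptance threshold are hypothetical.

```python
import math
import random

def extract_track(mag_map, x0, y0, angle, n_epochs, step):
    """Read map values along a straight track (nearest pixel, periodic edges)."""
    ny, nx = len(mag_map), len(mag_map[0])
    track = []
    for t in range(n_epochs):
        ix = math.floor(x0 + step * t * math.cos(angle)) % nx
        iy = math.floor(y0 + step * t * math.sin(angle)) % ny
        track.append(mag_map[iy][ix])
    return track

def monte_carlo_tracks(mag_map, data, sigma, n_trials, step, chi2_max, rng):
    """Draw random tracks; keep those with chi^2 per data point <= chi2_max."""
    ny, nx = len(mag_map), len(mag_map[0])
    accepted = []
    for _ in range(n_trials):
        x0, y0 = rng.uniform(0, nx), rng.uniform(0, ny)
        angle = rng.uniform(0, 2 * math.pi)
        model = extract_track(mag_map, x0, y0, angle, len(data), step)
        chi2 = sum(((d - m) / sigma) ** 2 for d, m in zip(data, model))
        if chi2 / len(data) <= chi2_max:
            accepted.append((x0, y0, angle, chi2))
    return accepted

rng = random.Random(0)
# Stand-in map: NOT a real magnification map, only uncorrelated noise.
toy_map = [[rng.gauss(0.0, 1.0) for _ in range(128)] for _ in range(128)]
# Mock data: one known track plus Gaussian noise of 0.1 mag.
truth = extract_track(toy_map, 40.0, 80.0, 0.5, 20, 1.5)
data = [m + rng.gauss(0.0, 0.1) for m in truth]
fits = monte_carlo_tracks(toy_map, data, 0.1, 5000, 1.5, 2.0, rng)
print(f"accepted {len(fits)} of 5000 trial tracks")
```

Even in this toy setting, the accepted fraction depends strongly on the number of data points and the noise level, which mirrors the difficulty described above: the more constraints there are, the fewer random trials reproduce them.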

Machine Learning is another novel technology that could in fact help with achieving control over the systematics. However, apart from a handful of exploratory works, machine learning methods applied to microlensing have remained largely untapped. Key applications where such methods could make a difference would be fast generation of magnification maps and/or light curves for any \(\kappa \), \(\gamma \), \(\kappa _{*}\), and source size, or direct inference of the model parameters (source size and geometry, \(\kappa _{*}\), etc) from the data, avoiding expensive Bayesian likelihood estimations.

Last but not least, the unprecedented amount of data could be mined for the target information in novel ways that go beyond standard approaches, e.g. population studies, inclusion of more complex signals like reverberated flux and/or binary black holes, etc. Finally, HMEs were mentioned before in the context of machine learning; however, regardless of implementation, a robust predictor and follow-up trigger for such events should be developed and put to the task – there is a unique chance to capitalize on LSST data for this.

6.4 Closing Remarks

We believe that with this review we have drawn a coherent and detailed picture of the current state of the field of quasar microlensing. We would argue that both data and methods have evolved from circumstantial and basic to targeted and advanced, but not yet fully mature – the field is in its “adolescence”. This means that although first results are coherent and exciting, there is more work needed to control systematic biases and to be able to address the avalanche of data from all-sky surveys in the next decade. We are confident that this will happen and microlensing will achieve its full potential to deliver groundbreaking scientific results.