
1 Introduction

Optical imaging is expanding its boundaries with powerful emerging capacities for super-resolution of subcellular features [1, 2], widefield imaging across scales [3, 4], and achieving high temporal resolution of ultrafast biological processes [5, 6]. An important frontier in this endeavor is imaging at greater depths through scattering media, which is being addressed by multiphoton excitation and adaptive optics [7,8,9,10]. The prospect of recovering high-resolution information over large volumes of tissue is particularly attractive for neuroscience and developmental biology. However, the high spatiotemporal resolutions needed to observe many biological processes are challenging to achieve at present using point-by-point scanning, as in conventional fluorescence microscopy, which is often combined with aberration correction schemes [3]. Recently, a new strategy of refocusing ultrafast laser pulses in the time domain rather than solely in the spatial domain [11, 12] has revitalized the concept of widefield multiphoton imaging at depth [13,14,15].

The premise that one can focus light in time rather than in space has emerged rapidly over the past decades. Spatial focusing, the concentration of the intensity of a light field in space, is ubiquitous in virtually all optical imaging systems. It is well-known that spatial frequencies can be focused in space via the Fourier transforming action of a lens [16]. Similarly, spectral frequencies can be focused in time, with much of the same equivalence in their Fourier transform properties, by introducing phase modulation and spatial dispersion [17]. This has formed the foundation of pulse compression and has led to chirped pulse amplification [18] that won Strickland and Mourou a share of the Nobel Prize in Physics in 2018. Defocusing or time stretching, on the other hand, has enabled the recording of ultrashort phenomena and spectral content in the time domain [19].

More recently, the simultaneous use of spatial and temporal focusing was presented independently in 2005 by Oron et al. [11] and Zhu et al. [12], demonstrating an interesting phenomenon wherein a time-compressed pulse can be made to exist only at the focus of a lens. Away from the focus, the pulse broadens in space and in time, with a concomitant rapid reduction in its peak intensity. This restriction of the pulse to the focal region enables axial confinement of non-linear optical excitation in a scanless, widefield illumination scheme. This development, termed temporal focusing (TF), has had a profound impact on multiphoton microscopy, where previously axial confinement could only be achieved by point scanning a highly focused beam across the image plane, which severely limits temporal resolution. The initial work was followed by a flurry of demonstrations of TF in widefield imaging [20,21,22,23,24,25], excitation [26, 27], harmonic generation [28, 29], super-resolution [30], micromachining [31,32,33,34,35], remote focusing [36, 37], tissue ablation [38], and trapping [39], among others. Unsurprisingly, the capacity for ultrafast widefield excitation has particularly flourished in optogenetics and neuroimaging [40,41,42,43,44,45,46,47,48,49]. This is because the absence of scanning has readily allowed for the simultaneous excitation and measurement of neuronal firing events on the millisecond timescale and over wide fields of view, previously unattainable by point-scanning approaches.

While a direct mathematical correspondence can be made between focusing in space and in time [19], it has become evident that TF behaves differently from spatial focusing in the presence of wavefront aberrations and through scattering media [50,51,52,53,54,55] due to the added angular diversity of the illumination spectra at the focus. For instance, the addition of TF has demonstrated a substantial improvement in propagation through scattering media and a reduction in speckle at the focus [51]. This discovery has enabled precise patterned multiphoton excitation at depth and has led to remarkable progress in optogenetics. This area is reviewed in [56] and in Chapter 10.

However, beyond this improved excitation, the capacity for imaging at depth has been impeded by tissue or sample scattering in the detection arm of the optical system. Specifically, widefield detection with a camera would observe severe spatial cross-talk from depths beyond one scattering mean-free-path length making it difficult to recover signals in a conventional manner. A major development came in 2018 with the addition of spatial demixing via single-pixel detection [13, 14], enabling widefield images to be recovered without the need for spatial coherence in the detected signal. This method, termed TempoRAl Focusing microscopy with single-pixel detection (TRAFIX) [13], works by decomposing the imaging process into a different domain, or coordinate space, in which the spatial information is carried by the scattering-robust illumination, rather than by the detection itself. A hybrid between widefield and single-pixel detection can also be realized [15, 57], trading off speed and the robustness to scattering. These methods have demonstrated a major reduction in photodamage compared to point scanning by spreading the excitation power both over a widefield and in the time domain [13]. Further, the imaging scheme is amenable to novel compressive-sensing techniques [14, 58], i.e., image reconstruction from a few sparse measurements, fundamentally reducing the needed excitation power (and thus photodamage), and imaging time. The capacity for precise all-optical excitation and multiplexed detection further offers great future potential for simultaneous volumetric recording of sparse functional signals, for instance, neuronal firing events.

Widefield multiphoton imaging at depth marries a striking combination of novel physics and a new computationally driven paradigm for multiphoton microscopy. In this chapter, we describe the theory and experimental realizations, with a particular focus on the method of single-pixel imaging with TRAFIX. Importantly, we also note many of the current challenges and prospective advances of these techniques.

2 Methods

Achieving widefield multiphoton imaging at depth requires TF in illumination and spatial demixing, such as single-pixel recording, in the detection. We present, in turn, the theory of TF and single-pixel detection and the combined principle of TRAFIX. We further describe the addition of compressive-sensing and hybrid demixing strategies.

2.1 Temporal Focusing

Focusing of illumination pulses is the route by which we achieve axial sectioning in multiphoton imaging. Axial sectioning is required to record the signal precisely from the focal plane, with limited out-of-focus interference, both essential for forming highly resolved three-dimensional images. Spatial and temporal focusing achieve this by different means. For instance, Fig. 1a shows a spatially focused Gaussian beam, where the drop-off in the axial intensity away from the focus is related to the Rayleigh range (zR), which is proportional to the square of the lateral beam waist. Since the probability of two- or three-photon excitation is related to the square and the cube of the field intensity, respectively, there is a strong confinement of fluorescence at the focus. Typically, a tight focal spot is needed to achieve sectioning at the micrometer scale. Figure 1b illustrates the concept of TF. Rather than confining the pulse intensity in space, the pulse width is broadened out of focus. Since the probability of multiphoton excitation is also inversely proportional to the pulse duration, axial sectioning is achieved over arbitrary spot sizes.

Fig. 1

Illustrations of the pulse shape in space and time in (a) spatial focusing and (b) temporal focusing. Temporal focusing realized using a (c) scattering plate, SP, and a (d) diffraction grating, DG. L: lens; Obj: objective; FP: common Fourier plane; IP: image plane. Adapted from [11]

To describe the realization of TF in this regard, we first consider the formation of ultrashort laser pulses [12]. Ultrashort pulses are characterized by a broadband optical spectrum. The shortest possible pulse is formed when: (1) each spectral component is in phase (such that the pulse width is related to the Fourier transform of the envelope of its spectrum) and (2) the spectral components spatially overlap. Temporal dispersion, which can be described by a relative phase delay between different frequencies in the spectral domain or by a chirp in the time domain (i.e., a change in the instantaneous frequency with time), leads to a broadening of the pulse width. On the other hand, spatial dispersion leads to a separation of spectral components, limiting the available bandwidth in a local region, and thus the minimum pulse width. TF operates by forming an ideal, chirp-free pulse at the focal plane and by deliberately introducing spatial and temporal dispersion away from the focus, such that imaging can solely take place in the focal region.

The experimental realization of TF is well-described in Oron et al. [11]. Let us consider a thin scattering plate that is imaged by a perfect 4f system as illustrated in Fig. 1c. An ultrashort pulse incident onto the plate is scattered, and the individual rays that travel through the system are refocused onto the imaging plane. According to Fermat’s principle, all rays that travel from one point of the scatterer (x1) and arrive at one point in the image (x2) have identical path lengths. As such, the pulse will be reconstructed at the image plane with no relative phase delay. However, at a point, P, away from the focus, rays arriving at a particular angle, θ, will have a path length difference related to \( z\left(1/\cos \theta -1\right)/c \). The maximum phase delay due to the varied path length at P increases with θ, which is limited by the numerical aperture (NA) of the system, and with the distance from focus, z. In this scenario, the capacity for broadening the pulse out of focus is dependent on the ratio between the original pulse width and the path length difference introduced. Alternatively, an angled pulse wavefront can be incident onto the scattering plate, maximizing the phase delays away from the focus [11].

Ultimately, a much more facile configuration can be achieved using a diffraction grating in place of a scatterer (Fig. 1d). The diffraction grating separates the frequencies of the incoming pulse at differing angles. These are then refocused by a 4f system. In the common Fourier plane, the signal may be described as a collection of laterally shifted monochromatic (single-frequency) beams. For an ideal Gaussian beam, the amplitude at the common Fourier plane can be described in the time domain as [12]:

$$ {A}_1\left(x,t\right)={\int}_{-\infty}^{\infty }{e}^{-\frac{{\left(x-\alpha \varDelta \omega \right)}^2}{s^2}}\cdot {e}^{-\frac{\varDelta {\omega}^2}{\varOmega^2}}\cdot {e}^{i\varDelta \omega t}\cdot d\varDelta \omega \kern0.3em , $$
(1)

where s is the spatial width (1∕e2 radius) of each monochromatic beam and Ω is the spectral width. The first exponent represents a spatial Gaussian profile, laterally shifted by αΔω, where Δω is an offset in frequency and α is a proportionality constant set by the diffraction grating and the lens. The second exponent represents the amplitude scaling of each monochromatic beam, which follows a Gaussian spectrum of the original pulse. The last exponent is the phase shift relative to the group velocity.

The amplitude near the focus can be evaluated using Fresnel diffraction. This is done by performing a Fourier transform of Eq. 1, evaluated at spatial frequencies set by the focal length, f. We assume that the wavenumber of each beam is approximately the central wavenumber of the pulse, k0. The amplitude is given as [22]:

$$ {A}_2\left(x,z,t\right)=\kappa \cdot {e}^{-\frac{x^2}{s_2^2}}\cdot {e}^{-\frac{\varOmega^2}{4\left(1+\chi \right)}{\left(t+\gamma x\right)}^2}\kern0.3em , $$
(2)

where

$$ {\displaystyle \begin{aligned}\kappa &=\Omega \sqrt{\frac{i\pi f}{z_R}}{\left[1+i\frac{z}{z_M}\right]}^{-\frac{1}{2}}\kern0.3em ,\qquad \gamma =\frac{k_0\alpha /f}{1+ iz/{z}_M}\kern0.3em ,\qquad \chi =\frac{iz/{z}_B}{1+ iz/{z}_M}\kern0.3em ,\\ {s}_2^2&=\frac{4{f}^2}{k_0^2{s}^2}+i\frac{2z}{k_0}\kern0.3em ,\qquad {z}_M=\frac{2{f}^2}{k_0{s}^2},\qquad {z}_R=\frac{2{f}^2}{k_0\left({s}^2+{\alpha}^2{\Omega}^2\right)},\qquad {z}_B=\frac{2{f}^2}{k_0{\alpha}^2{\Omega}^2}.\end{aligned}} $$

Notably, the spatial profile at the focus is equivalent to that defined by any one of the monochromatic beams, i.e., a Gaussian width of 2f∕(k0s). However, the modified Rayleigh range zR is dependent on both s and αΩ. Typically, αΩ ≫ s for widefield illumination; therefore, zR is defined by the extent to which the spectral dispersion fills the back aperture of the objective. Rayleigh-like coefficients zM and zB, related to the spatial and temporal distributions, respectively, additionally modify the phase evolution with time and away from the focus. Here, z is defined as the distance away from the focus (z = 0). The temporal evolution of intensity in the last exponent defines the pulse shape, and the width can be given as [22]

$$ \tau (z)=\frac{2\sqrt{2\ln 2}}{\varOmega}\cdot {\left[1+\frac{z_M}{z_B}\frac{z^2}{z^2+{z}_M{z}_R}\right]}^{\frac{1}{2}}\kern0.3em . $$
(3)

The pulse is shortest at the focus, reaching the minimum transform-limited pulse width of 1∕Ω (1∕e2 radius). The important conclusion of these equations is that, compared to a spatially focused Gaussian beam, TF decouples axial sectioning (zR) from the lateral beam shape (s2), which in turn allows precise control of the multiphoton excitation profile in 3D.
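To make these scales concrete, the following sketch evaluates zR, zM, zB, and the pulse broadening of Eq. 3 numerically. All parameter values here are illustrative assumptions (an 800-nm pulse and a short-focal-length objective), not values from any cited experiment:

```python
import numpy as np

# Illustrative (assumed) parameters for a temporally focused beam.
lam = 800e-9                      # central wavelength [m]
k0 = 2 * np.pi / lam              # central wavenumber [1/m]
f = 9e-3                          # objective focal length [m]
s = 20e-6                         # 1/e^2 radius of one monochromatic beam [m]
Omega = 2 * np.pi * 5e12          # spectral width [rad/s]
alpha = 4e-3 / Omega              # dispersion constant: alpha*Omega = 4 mm

zc = 2 * f**2 / k0                # common prefactor 2f^2/k0
zM = zc / s**2                    # spatial Rayleigh-like coefficient
zB = zc / (alpha * Omega)**2      # temporal Rayleigh-like coefficient
zR = zc / (s**2 + (alpha * Omega)**2)   # modified Rayleigh range

def tau(z):
    """Pulse FWHM versus defocus z, following Eq. 3."""
    return (2 * np.sqrt(2 * np.log(2)) / Omega) * np.sqrt(
        1 + (zM / zB) * z**2 / (z**2 + zM * zR))

print(f"lateral spot radius: {2 * f / (k0 * s) * 1e6:.0f} um (widefield)")
print(f"zR = {zR * 1e6:.2f} um (axial confinement)")
print(f"pulse broadening at z = 5*zR: {tau(5 * zR) / tau(0):.1f}x")
```

Note how a spot of roughly 100 μm radius is sectioned over a micrometer-scale zR, because zR here is governed by αΩ (the dispersion filling the back aperture) rather than by the spot size.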

An alternative and more intuitive view of TF was offered by Durfee et al. [59] by examining the evolution of the phase delay (chirp) of each frequency with respect to the focus and lateral position. The formulation is detailed in Note 2. Figure 2a visualizes the evolution of the phase front of three selected frequencies of the pulse in positions corresponding to fractions of the Rayleigh range (zR). At the focus, the wavefronts are flat; however, they are tilted with respect to each other. This represents a linear phase delay of each frequency with lateral position (Eq. 9). A purely linear phase delay in the spectrum results in a time shift in the arrival time of the pulse (via the Fourier shifting theorem). Figure 2b shows the time-domain version of this signal. Simply, the pulse will sweep across the focal plane, in a phenomenon termed the pulse front tilt (PFT). Away from the focus, we can see increasing second-order dispersion (Eq. 10), corresponding to a rapid broadening of the pulse shape and a reduction in the peak magnitude. Interestingly, by introducing second-order group velocity dispersion to the original pulse, the focal plane of TF can be shifted within the linear region set by zR [21, 22, 60]. Using this method, TF can scan a 3D volume without physically scanning the focus or the sample. Further, this principle enables simultaneous TF excitation in 3D via holographic means [61,62,63]. These methods are reviewed by Ronzitti et al. [64] and may also be found in Chapters 1 and 7.

Fig. 2

Profile of a temporally focused pulse in the (a) spectral and (b) time domains. Lines in (a) represent the phase front delay of different spectral components. The profiles in (b) represent the pulse shapes and their relative delay with respect to lateral position, x. PFT: pulse front tilt

A recently discovered and very important feature of TF is its ability to robustly propagate through scattering media [51]. This underlies its importance for widefield imaging at depth. Spatial focusing over a widefield constitutes weak focusing, or low numerical aperture (NA) illumination. In this scenario, the field at the entrance pupil of the objective lens is tightly confined in space and is refocused, taking nearly parallel trajectories through the sample to the focal plane. The wavefront propagating through the scattering media is aberrated, leading to speckle from the interference of multiply scattered light. With TF, an equivalent widefield area at the sample is illuminated; however, the spectral dispersion leads to a substantially broader field intensity at the entrance pupil, corresponding to an effectively high-NA illumination scheme. Each spectral beamlet takes diverse angular paths through the sample. This leads to a rearrangement in the speckle patterns of each beamlet and an effective speckle reduction at the focus [51]. Figure 3 shows experimental evidence of this phenomenon. Widefield illumination (Fig. 3a, c, without scattering) remains robust through a scattering medium with TF (Fig. 3b) but exhibits severe speckle without TF (Fig. 3d) [13].

Fig. 3

Point-spread function of (a,b) temporally focused and (c,d) spatially focused beams with (a,c) no scattering and (b,d) through 900 μm of a scattering phantom. Scale bar is 20 μm. Reproduced from [13]

The aspect ratio of spatial dispersion with respect to beam size at the Fourier plane, β = αΩ∕s, is a useful parameter in quantifying the transition between the temporal and spatial focusing regimes (see Note 2). A key consideration is that at high NA, where the beam width at the common Fourier plane exceeds the spectral dispersion, i.e., β → 0, TF is equivalent to purely spatial focusing [59]. This can be validated by examining Eqs. 9 and 10 (see Note 2). In fact, from a purely theoretical perspective, at the focus, we can consider TF to be equivalent to a high-NA beam, scanned point-by-point across the same field of view. However, the PFT in TF sweeps the focal plane with a duration of \( {\tau}_0\sqrt{1+{\beta}^2} \) [22], which can be several picoseconds for typical multiphoton setups. This is not practically achievable by point scanning; further, the sweep in the case of TF is completed by a single pulse. As such, TF offers a mode of illumination unavailable to conventional spatial focusing and will likely lead to the emergence of novel methods for precise non-linear excitation, which we discuss in Subheading 3.

2.2 Single-Pixel Detection

Widefield imaging at depth requires some form of demixing of scattering in the detection that impedes direct recovery of the signal. To explain the imaging process in a practical sense, let us first consider the storage of a two-dimensional image on digital media. An image that is W pixels in width and H pixels in height can be virtually represented as a 2D matrix of values, vwh, with indices w ∈ 0, 1, …, W − 1 and h ∈ 0, 1, …, H − 1. This can be stored in physical memory as a linear sequence of values, vj, that are indexed by j = hW + w. Here, by “value,” we mean an 8-bit, 16-bit, or any other datatype that describes the intensity or color of each pixel in the image. As a matter of fact, any image, volume, or any other digital “information” can be stored in this linear vectorized fashion. This representation will be used throughout this chapter.
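As a minimal illustration of this vectorized indexing (with arbitrary example dimensions), the mapping j = hW + w can be verified directly:

```python
import numpy as np

# A small H x W image stored as a flat vector with j = h*W + w.
W, H = 4, 3
img = np.arange(H * W).reshape(H, W)   # 2D matrix v[h, w]
vec = img.flatten()                    # linear sequence v[j] (row-major)

h, w = 2, 1
j = h * W + w
assert vec[j] == img[h, w]             # same pixel under both index schemes
assert (vec.reshape(H, W) == img).all()  # the mapping is invertible
```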

Figure 4 illustrates various imaging processes through scattering media using this linear vectorial form. Figure 4a shows conventional imaging with a camera. Here, the sample (x = {xj : j = hW + w}) is represented in discretized form, corresponding to its mapping onto the pixels of the camera. Imaging with a camera, in an ideal case, is the mapping of the sample onto an image (y = {yj : j = hW + w}), which can be mathematically represented as y = Ix, where I is the identity matrix. In turbid media, however, light that travels from the sample is scattered and is detected by nearby camera pixels. This cross-talk leads to a loss in spatial information. At depths exceeding one mean-free path length, the contribution to yj from xj is lower than the cross-talk from other locations in the sample, thus obscuring the image.

Fig. 4

Imaging process in scattering media. Imaging performed (a) in parallel with a camera; (b) sequentially using point-scanning microscopy; and (c) by multiplexing with single-pixel detection

Point-scanning methods can overcome this effect of scattering in detection by sequentially illuminating a tight spot in the sample and collecting the total signal using a single-pixel detector, for instance, a photomultiplier tube. Figure 4b illustrates this process. Scattering in the detection is no longer relevant since the total signal is detected. The spatial information is carried by the illumination. Similarly to camera imaging, we can represent the imaging process as y = Ix, with y formed sequentially over time, rather than in one shot.

Figure 4c illustrates the concept of single-pixel imaging, which operates by multiplexing the sample signal. Rather than sequentially probing one location, the signal from all locations is mixed. Scattering in detection is not an issue since the total sum of the signal is measured. The mixing weights are set by a sequence of test functions or structured patterns illuminated onto the sample. Similar to point scanning, a sequence of measurements is needed with different mixing weights. We can describe this arbitrary imaging process as y = Φx, where Φ is a measurement matrix whose rows dictate the pattern of illumination onto the sample (each row is a vector representing a 2D pattern image).

The recovery of the sample, x, from the measurements involves pre-multiplying y by the inverse of Φ, i.e., x = Φ−1y. For camera imaging and point scanning, this is trivial as I = I−1. For single-pixel imaging, the capacity to invert Φ is strongly dependent on the choice and number of the test functions. Let us first consider an orthogonal basis as our measurement matrix, for instance, a Hadamard matrix [65], H, routinely used in single-pixel imaging. Its inverse is well-defined as H−1 = HT∕N, where N = WH is the total number of pixels in an image. Using this measurement matrix, it is simple to recover x by multiplying the detected signal by the transpose of H and dividing by N. All orthogonal measurement matrices share this property, namely, that their inverse is linearly related to their transpose (a trivial mathematical operation). In fact, one could consider point scanning to be a form of single-pixel imaging.
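This measure-and-invert cycle can be simulated in a few lines using `scipy.linalg.hadamard`; this is a sketch of the principle with a synthetic random sample, not a published implementation:

```python
import numpy as np
from scipy.linalg import hadamard

# Single-pixel imaging with a Hadamard measurement matrix:
# y = Phi @ x, recovered via x = Phi.T @ y / N (since H^{-1} = H^T / N).
N = 64                                  # e.g., an 8 x 8 image, vectorized
Phi = hadamard(N)                       # orthogonal matrix with +/-1 entries
rng = np.random.default_rng(0)
x = rng.random(N)                       # synthetic "sample" fluorescence map

y = Phi @ x                             # N sequential single-pixel readings
x_rec = Phi.T @ y / N                   # exact inversion for a Hadamard basis

assert np.allclose(x_rec, x)
```

In an experiment, the ±1 entries cannot be illuminated directly; each row is typically realized as two complementary binary (0/1) patterns whose measurements are subtracted.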

For all these imaging methods, N measurements are required: either with N pixels of a camera or using N sequential recordings on the single-pixel detector. When comparing single-pixel to point-scanning detection, there are several advantages. First, point scanning illuminates a tight spot in the sample with a duty cycle of 1∕N. For a maximum allowable irradiance, the signal at the single-pixel detector is typically weak. Using single-pixel detection, the duty cycle is typically 1∕2, leading to a stronger signal at the detector. This results in superior signal-to-noise in detection and allows for lower excitation power for the same image quality, reducing photobleaching [13]. Second, widefield illumination of patterns may be used, which is amenable to TF illumination schemes. Third, single-pixel detectors can be used at wavelengths for which camera hardware does not exist or is prohibitively expensive, e.g., electron-multiplying CCDs in the infrared range. The major advantage of single-pixel detection is that it allows for compressive sensing, which we describe in the following section. Using compressive sensing, the number of measurements needed to reconstruct x can be significantly reduced by employing a smaller measurement matrix and more sophisticated inversion methods.

2.2.1 Compressive Sensing

Compressive sensing (CS) describes the recovery of a signal from far fewer measurements than required by the Nyquist sampling criterion. The idea that one can accurately reconstruct N independent values of a signal x from M ≪ N measurements is counter-intuitive. However, let us consider the similar topic of image compression, where a one-megapixel raw image occupying 4 MB can routinely be compressed into a ∼400-kB JPEG with imperceptible losses in quality. This process relies on the idea of “sparsity.” Any image x in the Cartesian coordinate space can be represented in a different domain Ψ by coefficients s, such that x = Ψs. Images are considered sparse when there exists a domain where the coefficient vector s possesses only a few non-zero values. The number of non-zero values, K ≪ N, indicates that the image is “K-sparse,” and it follows that the remaining N − K values are zero. In practice, images are not perfectly sparse; however, s possesses a few large coefficients with the rest being close to zero. For example, JPEG compression selects Ψ to be the discrete cosine transform (DCT); very briefly, a DCT is performed on x and the largest ∼10% of coefficients are stored.
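This compression step can be sketched by keeping only the largest 10% of DCT coefficients of a smooth test signal (a synthetic 1D signal here, for simplicity, rather than image data):

```python
import numpy as np
from scipy.fft import dct, idct

# Keep the largest 10% of DCT coefficients of a smooth signal,
# illustrating the sparsity that JPEG-style compression exploits.
n = 256
t = np.linspace(0, 1, n)
x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 7 * t)

s = dct(x, norm='ortho')               # x = Psi s, with Psi the inverse DCT
keep = int(0.1 * n)
idx = np.argsort(np.abs(s))[:-keep]    # indices of the smallest 90%
s_c = s.copy()
s_c[idx] = 0                           # discard the small coefficients
x_c = idct(s_c, norm='ortho')          # reconstruct from the sparse vector

err = np.linalg.norm(x - x_c) / np.linalg.norm(x)
print(f"relative error with 10% of coefficients: {err:.2e}")
```

The smooth signal is near-sparse in the DCT domain, so discarding 90% of the coefficients costs only a small reconstruction error.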

CS, developed independently by Candes et al. [66] and Donoho [67] in 2006, introduced compression directly to the measurement process by leveraging the assumption that the signal to be measured is sparse. This is in contrast to making a full measurement and performing compression at a later time. Consider a case where the imaging process y = Φx is compressed by performing only 10% of the measurements needed for Nyquist sampling (M = N∕10), such that Φ has M rows and N columns. This problem is evidently underdetermined, and the solution space of all possible x that can generate y is infinite. For instance, in point scanning, this would be equivalent to imaging the first 10% of the field of view and leaving the other 90% up to interpretation. CS approaches this problem by a careful design of the measurement process. We realize that x can be decomposed into a different domain with a sparse coefficient vector, modifying the imaging process to y = Φx = ΦΨs = Θs [68] (Fig. 5). A solution to the CS problem lies in finding the most sparse s that satisfies y = Θs. The major challenge in CS is twofold: first, finding an efficient minimization algorithm that can quantify the “sparsity” of s; and, second, selecting an appropriate measurement matrix Φ. We consider these aspects in turn.

Fig. 5

Visual illustration of the compressive-sensing process in matrix form

In early CS work, a minimization algorithm using an l1-norm proved to be effective in finding a sparse representation of s and thus in recovering a compressed signal [66, 67]. Using this approach, the estimated image \( \hat{\boldsymbol{x}} \) is recovered as

$$ \hat{\boldsymbol{x}}=\varPsi \cdot \underset{\boldsymbol{s}}{\arg \min }{\left\Vert \boldsymbol{s}\right\Vert}_1\quad \textrm{s}.\textrm{t}.\quad \varTheta \boldsymbol{s}=\boldsymbol{y}\kern0.3em ,\qquad \textrm{where}\quad {\left\Vert \boldsymbol{s}\right\Vert}_1=\sum \limits_{i=1}^N\mid {s}_i\mid \kern0.3em . $$
(4)

In other words, we assume that s is sparse if its l1-norm is small; thus, we find the s with the smallest l1-norm that still satisfies the imaging process Θs = y. This optimization can be solved using basis pursuit methods [66,67,68]. A convenient MATLAB toolbox is provided by Candes et al. alongside their original publications [66, 69]. Several alternative efficient algorithms have been proposed [70,71,72,73]. Typically, these methods sacrifice some aspect of accuracy to gain a substantial advantage in speed. We examine one such algorithm in a later section.
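To give a flavor of such solvers, the following is a minimal iterative soft-thresholding (ISTA) sketch of l1-regularized recovery on a synthetic sparse signal. ISTA is used here as a simple stand-in for the basis-pursuit and fast solvers cited above, not as the authors' implementation:

```python
import numpy as np

# ISTA sketch: find a sparse s with Theta @ s ~= y by alternating a
# gradient step on the data term with soft thresholding (the l1 proximal map).
rng = np.random.default_rng(1)
N, M, K = 128, 48, 5                   # signal length, measurements, sparsity

s_true = np.zeros(N)
s_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)

Theta = rng.standard_normal((M, N)) / np.sqrt(M)   # random Gaussian matrix
y = Theta @ s_true                     # compressed measurements, M << N

def ista(Theta, y, lam=1e-3, n_iter=5000):
    L = np.linalg.norm(Theta, 2) ** 2  # Lipschitz constant of the gradient
    s = np.zeros(Theta.shape[1])
    for _ in range(n_iter):
        g = s - (Theta.T @ (Theta @ s - y)) / L             # gradient step
        s = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0) # soft threshold
    return s

s_hat = ista(Theta, y)
print("recovered support:", sorted(np.argsort(np.abs(s_hat))[-K:].tolist()))
```

With M well above K, the K-sparse signal is recovered to within the small bias introduced by the l1 penalty.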

Let us now consider the choice of the measurement matrix. The CS problem can be solved with high accuracy using Eq. 4 provided that the number of measurements taken exceeds the sparsity, i.e., M ≥ K [69]. Additionally, the problem is termed well-conditioned if no two coefficients in s are sampled by a similar sequence of test weights, i.e., no two columns of Θ are the same. More formally, the measurement matrix should satisfy the restricted isometry property (RIP) [66]. A more intuitive method to verify the suitability of Φ for CS is using the mutual coherence metric, defined as \( c={\max}_{i\ne j}\mid {\phi}_i^T{\phi}_j\mid \), where ϕi is the i-th column of Φ. A low mutual coherence implies that the columns are nearly orthogonal. Matrices Φ that satisfy RIP are difficult to generate deterministically [68]. Fortuitously, randomly generated matrices, for example, with Gaussian or Bernoulli distributions, are highly likely to satisfy RIP and to exhibit low mutual coherence [74]. Thus, random generation can be found at the heart of many CS methods.
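The mutual coherence of a candidate measurement matrix is straightforward to check numerically; the sketch below uses assumed dimensions and unit-normalized columns:

```python
import numpy as np

# Mutual coherence c = max_{i != j} |phi_i^T phi_j| of a measurement
# matrix with unit-norm columns; random Gaussian matrices give low c.
rng = np.random.default_rng(2)
M, N = 64, 128
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)     # normalize each column

G = np.abs(Phi.T @ Phi)                # all pairwise column inner products
np.fill_diagonal(G, 0)                 # exclude the i == j terms
c = G.max()
print(f"mutual coherence: {c:.3f}")    # below 1 for a random Gaussian Phi
```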

While CS is broadly applicable to many signal processing techniques, its use in microscopy is met with additional constraints and challenges. In particular, s is not ideally sparse, and the imaging process introduces additional noise to the measurements. Thus, CS is typically an accurate but lossy estimation. Additionally, the projection of Φ into the imaging plane is limited by the finite spatial-frequency bandwidth of the optical system, as well as by diffraction and scattering. We discuss this in the context of imaging and propose modified measurement matrices and recovery methods in the following section.

2.3 TRAFIX

2.3.1 Setup

TempoRAl Focusing microscopy with single-pixel detection (TRAFIX) [13] combines the capacity to project widefield depth-sectioned patterns through scattering tissue with multiplexed detection and compressive sensing, realizing widefield imaging at depth. This configuration, in particular, has strong prospects for rapid, low-photodamage imaging and is well-positioned to provide utility in neuroimaging and developmental studies that cannot be offered by conventional point-scanning multiphoton technologies.

The practical implementation of TRAFIX involves a diffraction grating to enable temporal focusing, a single-pixel detector, and a dynamic light shaping element, for instance, a spatial light modulator [13] or a digital micromirror device [58], for the sequential projection of test patterns, ϕi. Figure 6 illustrates a typical TRAFIX configuration. Briefly, an ultrashort laser pulse illuminates a widefield area on the dynamic light shaper (DLS) using a beam expander (BE). The selection of the light source is discussed in Note 1. A sequence of patterns (ϕi) is displayed on the DLS and relayed by a relay lens (RL) onto the diffraction grating (DG). The DG sets up spatial dispersion at the entrance pupil (EP) of the objective (Obj) using the lens (L). The Obj temporally focuses the patterns through the sample (S). The total fluorescence signal is filtered by a dichroic mirror (DM) and collected by the single-pixel detector (SPD) into a measurement vector yi.

Fig. 6

TRAFIX setup, illustrating the projection of a sequence of patterns (ϕi) and single-pixel detection of a series of measurements (yi). BE: beam expander; DLS: dynamic light shaper; RL: relay lens; DG: diffraction grating; L: lens; EP: entrance pupil; Obj: objective; S: sample; DM: dichroic mirror; SPD: single-pixel detector. Numbers (1–3) correspond to the image plane on the DG, the common Fourier plane, and the sample image plane, respectively

A DLS is used to sequentially shape the light field into a series of test patterns. A beam expander is used to illuminate a widefield region of the DLS and to efficiently utilize the available pixels. The DLS can be embodied by a spatial light modulator (SLM) by encoding each pattern with a blazed grating, i.e., multiplying each pattern in the Cartesian space, ϕi → ϕxy, by a wrapped linear phase γ ≡ xdx + ydy (mod 1), where dx and dy are spatial frequencies of the grating. Additionally, assuming ϕ ∈ [0, 1], an orthogonal (90°) dispersion of complementary patterns via ϕxyγ + (1 − ϕxy)γ′, where γ′ ≡ xdx − ydy (mod 1), leads to a clearer separation of the light-field energy in diffracted orders. Following the SLM, the first diffracted order should be selected by spatial filtering implemented using a pinhole. Alternatively, a digital micromirror device (DMD) can be used to directly deflect a binary pattern. DMDs typically have faster projection rates in the 10-kHz range but are limited to binary (on–off) light shaping and may exhibit high loss. Nematic liquid crystal SLMs enable greater control over the light field, including grayscale patterns, albeit at sub-kHz rates. Despite the high maximum speed of these devices, a practical limit exists in the speed with which the test patterns can be generated and sent to the hardware. This throughput limit is set by the efficiency of the software; general-purpose scientific toolsets such as MATLAB or ImageJ are unlikely to reach it.
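The encoding described above can be sketched as follows, with a hypothetical binary test pattern and arbitrary grating frequencies; phase is expressed in units of 2π:

```python
import numpy as np

# Encode a binary test pattern phi in [0, 1] onto an SLM with orthogonal
# blazed gratings: "on" regions carry gamma = x*dx + y*dy (mod 1) and the
# complement carries gamma' = x*dx - y*dy (mod 1), separating the two
# patterns into different diffracted orders.
H, W = 64, 64
dx, dy = 0.25, 0.25                    # grating frequencies [cycles/pixel]
yy, xx = np.mgrid[0:H, 0:W]

phi = np.zeros((H, W))
phi[16:48, 16:48] = 1.0                # example binary test pattern

gamma = (xx * dx + yy * dy) % 1.0      # blazed grating for the pattern
gamma_p = (xx * dx - yy * dy) % 1.0    # orthogonal grating for the complement
slm = phi * gamma + (1.0 - phi) * gamma_p   # phase hologram, units of 2*pi

assert slm.min() >= 0 and slm.max() < 1
```

A uniform illumination of this hologram then diffracts the pattern and its complement along different directions, so a pinhole in the Fourier plane can pass the desired first order alone.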

The plane of the DLS forms the first image plane of the system. This image plane is relayed to a diffraction grating that enables TF. The DG is conjugate to the back aperture of the objective (the common Fourier plane of the 4f system), such that the spatial frequencies of each test pattern are linearly dispersed along one axis according to the finite bandwidth of the laser pulse. In the previous sections, we considered TF of a Gaussian beam (Eq. 1). However, widefield test patterns used for single-pixel imaging possess a breadth of spatial frequencies. Thus, consideration must be given to the finest spatial frequency in the patterns, which relates to lateral resolution, and to the extent of spatial dispersion for TF. Previous work utilized TF dispersion aspect ratios (β) of approximately 4–8 [13, 58].

We consider that a given test pattern comprises a superposition of high-spatial-frequency and low-spatial-frequency light fields. The high-frequency component occupies a large space in the Fourier plane and is spatially dispersed to an effectively low extent, i.e., a small relative β = αΩ/s, where s is large. The low-frequency component occupies a small region of Fourier space and is dispersed to a large extent. At the sample plane, the high-frequency light field is axially sectioned due to its tight spatial focus, in a similar fashion to diffraction-limited point-scanning microscopy; its robustness through scattering is established by the high NA. The low-frequency light field is sectioned via TF and propagates through scattering media due to the effective increase in NA from the spatial dispersion. Thus, if the combination of the spatial-frequency bandwidth of the test patterns and the spatial dispersion from TF fills the entire back aperture of the objective, axial sectioning over a wide field and robust propagation in scattering media will be achieved.

The selection of relative dispersion, or β, is achieved by choosing an appropriate DG period and the objective and lens pair. For instance, a 1200 g/mm reflective blazed grating, a 400-mm focal length lens, and 20x 0.75NA and 100x 0.7NA objectives (Nikon, Japan) were used in the original studies of TRAFIX [13, 58]. Practically, a 4f relay system could be introduced between the DG and Obj to provide an additional degree of freedom in magnification. The RL system between the DLS and DG can be used to independently control the magnification of the test pattern without affecting the TF dispersion.

The detection system is equivalent to a conventional point-scanning system, where an appropriate wavelength filter directs the total fluorescence signal to a widefield single-pixel detector. Typically, a photomultiplier tube (PMT) may be used and then low-pass filtered to double the pattern projection rate [14, 58]. The initial demonstration [13] utilized an electron-multiplying CCD as an SPD by integrating the total signal, which provides the added capacity for conventional widefield imaging. For particularly sensitive applications, photon-counting devices may be implemented.

2.3.2 Imaging

Initial work employed the Hadamard matrix as the set of test patterns (Φ) [13, 14, 75, 76]. The Hadamard matrix [65] (also termed the Walsh–Hadamard matrix) is formed recursively. It is symmetric, and each of its columns is orthogonal to every other. It is routinely used for single-pixel imaging because it is easy to generate and is its own inverse up to normalization. The first-order Hadamard matrix is unity; the second order is given as

$$ H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\,; $$
(5)

and the 2n order is formed from the n order as

$$ H_{2n} = \begin{bmatrix} H_n & H_n \\ H_n & -H_n \end{bmatrix}\,. $$
(6)

The sizes of Hadamard matrices are limited to powers of 2, which set the allowed image sampling sizes. Each row of the Hadamard matrix represents an individual test pattern.
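The recursion in Eqs. 5 and 6 can be implemented directly; each row of the resulting matrix, reshaped into 2D, forms one test pattern (a minimal sketch, with the 4 × 4 image size chosen for illustration):

```python
import numpy as np

def hadamard(n):
    """Build the n x n Walsh-Hadamard matrix recursively (n a power of 2)."""
    assert n >= 1 and (n & (n - 1)) == 0, "size must be a power of 2"
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])  # H_2n from H_n (Eq. 6)
    return H

# Each row is one test pattern; reshape for a 4 x 4 pixel image and
# map {-1, +1} to {0, 1} for intensity-only projection:
H = hadamard(16)
patterns = [(row.reshape(4, 4) + 1) // 2 for row in H]
```

Because H Hᵀ = nI, the image can be demodulated from the measurement sequence by applying the same matrix again and dividing by n.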

Figure 7 shows examples of images generated with TRAFIX using Hadamard test sequences [13]. Figure 7a shows a reference image of a test target (x) fabricated from a 200-nm spun-coated layer of super-yellow polymer; the target was imprinted by photobleaching a negative pattern. When the test target is obscured by a 400-μm section of rat brain tissue (mean free path length, ls = 55 μm), conventional widefield detection using a camera is impossible due to scattering in the detection path (Fig. 7b). Using TRAFIX, the image may be retrieved when obscured by 200-μm (Fig. 7c) and 400-μm (Fig. 7d) sections of rat brain tissue.

Fig. 7

Images recovered using TRAFIX. (a) Reference image of a fluorescent test sample with no scattering, (b) imaged through a 400-μm brain slice using widefield detection, and (d) TRAFIX. (c) TRAFIX through a 200-μm brain slice. (e, g) Reference images of mouse-derived astrocytes compared with (f, h) TRAFIX. Scale bar is 20 μm. (a–d) Adapted from [13]

Figure 7e–h demonstrates the proof of principle of TRAFIX on primary mouse astrocytes. No scattering was introduced; however, the results demonstrate the capacity to reconstruct images from the faint signal in biological samples. Importantly, a long recording time per test pattern, on the order of 0.1–0.5 s, was required due to the low pulse energy density of the fast-repetition-rate (80 MHz) laser used. However, even with a non-ideal laser, three-photon signal recovery was demonstrated with TRAFIX [75]. Wadduwage et al. [15] built on this concept and demonstrated a stronger signal and faster widefield detection of mouse muscle at depth by using a high-pulse-energy laser (see Note 1) and a hybrid demixing method.

The capacity for CS gives TRAFIX an advantage in the ultimate imaging speed and in the reduction of photodamage. However, special consideration has to be given to the implementation of CS in microscopy compared to conventional macroscopic compressive imaging (e.g., photography). Typically, applications of CS in imaging employ Hadamard or randomly generated matrices. The test patterns from those measurement matrices possess a broad spatial-frequency bandwidth that is difficult to propagate through high-NA imaging systems. The sharp, step-wise variations in intensity that are characteristic of these test patterns, even at low pixel sampling N, lead to large diffraction effects and an overfilling of the back aperture of the objective. Figure 8 shows a simulation of the field intensity at the DG (1), the common Fourier plane (2), and the sample (3), corresponding to the locations marked in Fig. 6. The blue circle indicates the entrance pupil size. The Hadamard pattern (Fig. 8a) produces a large structured pattern in the Fourier plane that exceeds the entrance pupil. The random pattern (Fig. 8b) shows a broad spectral bandwidth. In both cases, the sampling pixel size of each pattern was set to the diffraction limit.

Fig. 8

Compressive sensing in microscopy. (a–c) The projected pattern (1), the spectrum in the Fourier plane (2) clipped by the objective pupil (blue circle), and the resulting pattern in the sample plane (3), for the (a) Hadamard, (b) random, and (c) Morlet patterns. (d) Performance of compression (M/N) using each pattern set, compared to the reference image (e). Scale bar is 10 μm. Adapted from [58]

The limit in the spatial-frequency bandwidth leads to two important issues in CS. First, the effective low-pass filtering means that a different pattern is projected onto the sample plane from the one assumed in the CS optimization algorithm. This is clear when comparing locations (1) and (3) in Fig. 8. In this scenario, the CS problem becomes more poorly conditioned; noise in the measurement process may be attributed to higher, non-existent frequencies. Second, diffraction leads to a large portion of the pulse energy being blocked by the entrance pupil. In the widefield regime, pulse energy is an important parameter to maximize, as it directly impacts the signal-to-noise ratio.

An alternative pattern set was proposed to control and selectively probe the spatial-frequency space, generated from Morlet wavelets mathematically convolved with randomly generated matrices [58, 77]. These Morlet patterns feature two important properties. First, the Morlet wavelet, based on real-valued Gabor filters, reaches the limit set by the uncertainty principle, i.e., an optimal trade-off between spatial and spatial-frequency localization. Second, the convolution with a randomly generated, Gaussian-distributed matrix satisfies the mutual incoherence property required by CS.
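A pattern of this kind can be sketched as follows; the wavelet parameters (kernel size, envelope width, carrier frequency) are illustrative assumptions, not values from [58]:

```python
import numpy as np

def morlet_kernel(size=32, sigma=4.0, k0=0.5, theta=0.0):
    """Real-valued Morlet (Gabor) wavelet: Gaussian envelope times a cosine carrier."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    xr = x * np.cos(theta) + y * np.sin(theta)  # carrier direction
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(k0 * xr)

def morlet_pattern(n=64, seed=0, **kw):
    """One test pattern: a random Gaussian matrix convolved with the wavelet."""
    rng = np.random.default_rng(seed)
    r = rng.standard_normal((n, n))
    kern = morlet_kernel(**kw)
    # FFT-based circular convolution; the wavelet band-limits the pattern
    pat = np.fft.ifft2(np.fft.fft2(r) * np.fft.fft2(kern, s=(n, n))).real
    pat -= pat.min()
    return pat / pat.max()  # normalize to [0, 1] for projection by the DLS

p = morlet_pattern()
```

The carrier frequency k0 and envelope width sigma set the pattern's spatial-frequency band, which can be chosen to fit within the entrance pupil.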

Figure 8c shows the propagation of a Morlet pattern [58]. It is evident that the Morlet patterns can be designed to fit the entrance pupil, and the pattern at the sample plane resembles the pattern projected by the DLS. In this demonstration, the sampling pixel size matches the theoretical diffraction limit. Interestingly, while the CS sampling pixel size sets the image resolution, the Nyquist sampling criterion requires a bandwidth of twice the pattern frequency to propagate the pattern through a microscopy system. This corresponds to the observations in Fig. 8a–c (2) and suggests that an image resolution below twice the diffraction limit is not achievable with Hadamard or random measurement matrices. This can be overcome with digital microscanning techniques [78], by taking multiple CS measurements with a series of patterns spatially shifted by half the sampling size. Morlet patterns, due to the independent selection of bandwidth and sampling size, are able to reach the diffraction limit.

CS recovery from microscopy data requires additional consideration. The conventional method of CS recovery, l1-norm minimization via the basis pursuit algorithm, is inefficient for imaging applications for several reasons. Importantly, the CS problem is linearized into vector form (Eq. 4), which makes it broadly applicable to signal processing; however, the 2D/3D nature of imaging can be exploited to improve CS recovery, as we can assume some extent of spatial smoothness in the image. Further, the basis pursuit method scales with the total number of sampling pixels N, making the recovery of large images (e.g., exceeding 64 × 64 pixels) computationally taxing. Finally, l1-norm minimization is constrained to provide a solution that strictly satisfies y = Φx. It is inevitable that some noise is introduced in imaging, especially when the multiphoton signal from the sample is faint; thus, concessions for noise should be made. This is particularly important for the stability of measurement bases, which we discuss in Note 3. Taking the above into account, first-order approximations to the CS problem may be used to achieve satisfactory, and in some instances improved, performance. In particular, we can utilize Nesterov's method [73] (NESTA), which uses a smooth approximation to the l1-norm, minimizing ||s||1 s.t. ||y − Φx||2 ≤ ε, for some estimated measurement noise ε. NESTA was found to be substantially faster, less demanding on hardware, and more resilient to measurement noise [58].
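NESTA itself is beyond the scope of a short example, but the benefit of relaxing the strict constraint y = Φx can be illustrated with a simpler proximal-gradient (ISTA) solver for the noise-tolerant problem min ½‖y − Φx‖² + λ‖x‖₁. All sizes and parameters below are illustrative:

```python
import numpy as np

def ista(Phi, y, lam=0.05, n_iter=500):
    """Iterative soft-thresholding for min 0.5*||y - Phi x||^2 + lam*||x||_1."""
    L = np.linalg.norm(Phi, 2) ** 2           # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - Phi.T @ (Phi @ x - y) / L     # gradient step on the data term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

# Sparse test signal, compressed (M < N) and noisy measurement
rng = np.random.default_rng(0)
N, M = 128, 48
x_true = np.zeros(N)
x_true[rng.choice(N, 5, replace=False)] = 1.0
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x_true + 0.01 * rng.standard_normal(M)
x_hat = ista(Phi, y)
```

Unlike basis pursuit, the penalized formulation tolerates the measurement noise rather than fitting it exactly.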

Figure 8d demonstrates the CS performance of the measurement matrices in TRAFIX [58]. The sample comprises 4.8-μm green fluorescent polystyrene beads (G0500, Thermo Scientific) embedded on a glass microscope slide. A reference image is provided in the top right of Fig. 8d. Various compression ratios (M/N) are evaluated. The Morlet patterns exhibit superior performance at high compression. The pulse energy density is kept constant between measurements; thus, the higher intensity in the Morlet-generated results suggests a higher signal-to-noise ratio. In fact, equivalent image contrast is achieved at 25%, 67%, and 82% compression for the Hadamard, random, and Morlet patterns, respectively [58]. Additionally, the Morlet patterns feature stronger performance through scattering media [58].

2.3.3 Hybrid Demixing

An interesting alternative approach toward the demixing of scattering, combining patterned illumination and widefield camera detection, was presented by Wadduwage et al. [15]. This technique was termed de-scattering with excitation patterning (DEEP). The principle is conceptually similar to another technique, presented by Parot et al. [79], termed compressed Hadamard imaging (CHI). We recognize that camera detection of the TF signal at depth is obscured by the cross-talk of scattering between adjacent pixels. However, the contribution due to scattering of the signal from xi to an imaging pixel yj, where the coordinate j is substantially far from i, is minimal. Therefore, if cross-talk can be removed between adjacent pixels, rather than from the entire imaging process, depth imaging can still be achieved.

DEEP and CHI operate by projecting a sequence of small Hadamard codes (e.g., 16 × 16 pixels), repeated and tessellated over a large-pixel-count image. The mapping of the patterns onto the camera pixels is performed by imaging a calibration phantom (a thin fluorescent layer). The calibration set is then used to demodulate patterned recording sequences. In effect, this method realizes a parallel version of single-pixel imaging and is equivalently amenable to CS [15, 79]. Additionally, it can be performed in a line-scanning configuration [57]. A trade-off exists in choosing the size of the Hadamard code: the length of the code establishes the distance on the camera over which scattering cross-talk can be eliminated; however, a larger length requires more total measurements.
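The per-pixel demodulation can be sketched as follows, assuming ideal differential measurements (each pattern projected together with its complement and subtracted) and a known, regular mapping of code elements to camera pixels; real implementations recover this mapping from the calibration phantom:

```python
import numpy as np

def hadamard(n):
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def demodulate(frames, tile):
    """Demodulate a stack of camera frames recorded under tiled Hadamard codes.

    frames: (n, R, C) array, one differential frame per projected pattern,
    where pixel (r, c) is modulated by code element ((r % tile), (c % tile)).
    """
    n, R, C = frames.shape
    H = hadamard(n)
    r, c = np.mgrid[0:R, 0:C]
    k = (r % tile) * tile + (c % tile)  # code index seen by each pixel
    # x(r, c) = (1/n) * sum_i H[i, k(r, c)] * frames[i, r, c]
    return np.einsum('irc,irc->rc', H[:, k], frames) / n

# Simulate: a ground-truth image modulated by tiled 4 x 4 codes (n = 16)
rng = np.random.default_rng(1)
img = rng.random((16, 16))
H = hadamard(16)
r, c = np.mgrid[0:16, 0:16]
k = (r % 4) * 4 + (c % 4)
frames = H[:, k] * img          # ideal, scattering-free recording
recovered = demodulate(frames, tile=4)
```

With scattering, the demodulation rejects cross-talk between pixels carrying different code elements within each tile, which is the essence of the technique.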

3 Future Prospects

TF offers a new paradigm of widefield, axially confined multiphoton excitation; yet, as we demonstrate in Subheading 2.1, it shares a common mathematical foundation with point-scanning methods. Point-scanning methods sweep the field of view with a series of laser pulses using mechanical mirrors. TF, however, does so elegantly by controlling the spatiotemporal evolution of the phase front. In theory, an equivalent field of view, axial confinement, and depth penetration can be obtained using both methods. In practice, however, TF can do so with a single pulse, including 3D excitation via holographic means [64]. As such, TF has become advantageous for rapid surface imaging and precision optogenetics at depth [56].

The capacity for widefield imaging at depth has been introduced with multiplexed detection schemes. Due to its relatively young stage of technological maturity, TF imaging at depth, particularly in two-photon modes, will struggle to achieve the same speed as point-scanning methods within the next several years (see Note 4). The maturation of high-speed light shaping technology and massively parallel detection schemes can improve the prospects of TF, however. For instance, parallel detection by using each pixel of an EMCCD as a single-pixel detector combined with local, repeating pattern projection, could elevate the speed to beyond 30 fps with short measurement matrices.

Recently, three-photon microscopy has re-emerged due to the availability of near-infrared, high-pulse-energy lasers, three-photon fluorescent markers, and a desire to perform deep, volumetric imaging in vivo and through scattering media, such as a mouse skull [7, 80, 81]. There, due to the low repetition rate of the lasers, the speed of point scanning is on par with multiplexed detection methods (see Note 4). Toward this end, the promise of three-photon microscopy has already been shown with TRAFIX [75]. In vivo, through-skull imaging is an important goal for three-photon technology [3]. The lowered goal posts in speed and the requirement for minimal photodamage present a likely niche for TF imaging. Beyond this, there are several key capabilities that can give TF not only a competitive edge, but also deliver unprecedented performance in select areas, including in neuroscience applications.

A likely key to the success of this technology is the combination of sparse detection and precision excitation. The concept of sparsity allows a fundamentally lower number of measurements to be made to recover the same information. A clear advantage here comes in the form of reduced photobleaching and photodamage [13] compared to point-scanning methods. TF may find immediate utility in applications that emphasize long-term, non-invasive imaging over rapid detection, for instance in developmental studies.

Controlled excitation of arbitrary fields in 3D is possible with TF holographic methods [24, 61,62,63, 82]. The combination of controlled excitation and multiplexed detection lends itself to adaptive sampling schemes. For instance, a low-resolution volume may be formed rapidly; then, based on its features, sparse volumes of interest may be dynamically probed at higher resolutions on the fly. This would be of great advantage for sparsely populated samples, for instance, for the tracking of individual cellular bodies in 3D biomaterials. In neuroscience, in particular, the combination of holography and multiplexing may enable the volumetric tracking of individual sparse neurons and their connectivity at high spatiotemporal resolutions with no a priori information about their locations. In fact, a similar compressive approach was demonstrated using light-field microscopy [83].

Optogenetic excitation with TF has already shown great promise [56]. Naturally, the addition of multiplexed detection may enable all-optical functional imaging. For instance, optogenetic excitation with TF can be combined with TRAFIX recordings of a reporter (e.g., a voltage sensor), given that the absorption and emission spectra are well-separated. The detected temporal response could be demodulated based on the illumination pattern sequence using the CS method, however, with several caveats.

The key consideration in this endeavor is the repeatability of the neuronal response. One approach to performing CS in this scenario would be to project a pattern and record the multiplexed response over a desired duration. This is then repeated for all other patterns. The caveat is that the response from all neurons should be repeatable each time they are excited; otherwise, CS will not be able to accurately demodulate the signal. It is a big ask for biological samples to act in such a repeatable manner. The other approach would be to project all the patterns within the desired temporal resolution timeframe and repeat this for all time points of the experiment. The added temporal domain of video imaging is amenable to further improvements to the compression ratio, even down to a few percent [84]. This approach would remove the need for repeatability; however, fast pattern projection and fluorescent reporter switching rates are needed. This is readily achievable for calcium imaging, where the desired temporal resolution is on the sub-second to second scale [5]. For voltage sensing on the millisecond scale [85,86,87,88], pattern projection and reporter switching rates should be on the microsecond scale. This may be presently achievable for sparse simultaneous measurements on the order of tens of neurons; however, widefield imaging is currently intractable. An ambitious vision could be to map the volumetric structure of a neuronal sample with an adaptable resolution, identify and optically stimulate each neuron, and map the sparse connectivity to its neighbors with functional imaging. In fact, the idea of sparsity and adaptive resolution is well-suited to neuroscience methods, where neuronal connectivity is naturally sparse over a widefield and in vivo behavioral stimuli might only affect a subset of the interrogated volume.

These considerations form a grand challenge for TF imaging in neuroscience. Despite the appreciable challenges on the road ahead, the holy grail of this undertaking is an unprecedented capacity for low-photodamage, highly parallel, adaptive, and rapid interrogation of neuronal signals and their connectivity in turbid 3D structures, potentially in vivo.

4 Summary

Compressive sensing and temporal focusing have each seen remarkable progress in the past decade in their own right. The combination of these two techniques in widefield imaging at depth provides a broad canvas for powerful future advances, both in the novel physics of the spatiotemporal manipulation of light and in the new computationally driven paradigm of imaging. Undoubtedly, given the rapid uptake of and interest in this relatively young field, many developments are still in store over the coming years. However, a grand challenge remains: to fully exploit the multiplexing capacities of the illumination and detection methods, to provide capacity not available to serial point-scanning approaches, and to demonstrate this in biological samples.

5 Notes

  1.

    Ultrashort Pulsed Laser Selection

    An important practical consideration is the choice of the laser source. For comparison, a laser source for two-photon point-scanning microscopy typically features a fast repetition rate (∼80 MHz) with pulse energy densities at the sample in the range of 0.1–0.5 nJ/μm2 [89]. Incorporating the same laser in a widefield configuration leads to a large loss in energy per unit area; for example, at the maximum power setting of a point-scanning laser (Chameleon Ultra II, Coherent Inc., USA), the pulse energy at the sample is 6 nJ, and the pulse energy density is 0.6 pJ/μm2 [13] for a modest 100-μm field of view. This is orders of magnitude lower than point scanning, which makes imaging biological samples with weak multiphoton signals a significant challenge. At the other end of the scale, low-repetition-rate, high-pulse-energy lasers can be employed. In Wadduwage et al. [15], a 10-kHz repetition rate regenerative amplifier provided a remarkable 1.5-μJ pulse energy with 0.06 nJ/μm2 over their 160-μm field of view; however, the laser in that instance was not tunable in wavelength. Point-scanning methods require a high repetition rate to ensure a fast raster scanning speed across the sample. For widefield imaging, this is not required; thus, low-repetition-rate, high-pulse-energy lasers are preferred for a good signal-to-noise ratio.
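These figures follow from simple arithmetic, sketched below; the square field-of-view approximation is our assumption, and the 15-mW average power is inferred from the quoted pulse energy and repetition rate:

```python
def pulse_energy_density(avg_power_w, rep_rate_hz, fov_um):
    """Pulse energy (J) and energy density (J/um^2) over a square field of view."""
    e_pulse = avg_power_w / rep_rate_hz      # energy per pulse
    return e_pulse, e_pulse / fov_um ** 2    # density over fov_um x fov_um

# Regenerative amplifier of [15]: 10 kHz, ~15 mW average power, 160-um FOV
e, rho = pulse_energy_density(15e-3, 10e3, 160)   # 1.5 uJ, ~0.06 nJ/um^2

# Widefield use of a point-scanning laser: 6 nJ at the sample, 100-um FOV
rho_wf = 6e-9 / 100 ** 2                          # ~0.6 pJ/um^2
```

Both results reproduce the densities quoted in the text, illustrating why low-repetition-rate, high-pulse-energy sources are preferred for widefield excitation.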

  2.

    Chirp Evolution in Temporal Focusing

    Following Durfee et al. [59], we examine TF from the perspective of position-dependent spectral chirp. TF is considered as the superposition of paraxially propagating Gaussian “beamlets” modified by ray optics to incorporate the tilt of the spectral components. Consider a Gaussian beamlet that undergoes a tilt from spectral dispersion, transforming the lateral position \( x\to x-z\sin {\theta}_x \), where θx = αΔω/f. The amplitude is given as [59]

    $$ A\left(x,z,\omega\right) = E_0(\omega)\,\frac{s_2}{s_2(z)}\, e^{-\frac{\left(x - z\sin\theta_x\right)^2}{s_2^2(z)}}\,, $$
    (7)

    and the phase as

    $$ \phi\left(x,z,\omega\right) = k_0 x\sin\theta_x + k_0 z\left(1 - \frac{1}{2}\sin^2\theta_x\right) - \eta(z) + k_0\,\frac{\left(x - z\sin\theta_x\right)^2}{2R(z)}\,, $$
    (8)

    where s2(z) is the axially dependent beam radius (equal to s2 at the focus), R is the radius of curvature, and η is the Gouy phase, given with respect to the Rayleigh range, \( z_R = k_0 s_2^2/2 \), as

    $$ s_2(z) = s_2\sqrt{1 + \frac{z^2}{z_R^2}}\,, \qquad R(z) = z\left(1 + \frac{z_R^2}{z^2}\right), \qquad \textrm{and}\qquad \eta(z) = \arctan\left(\frac{z}{z_R}\right)\,. $$

    The position-dependent chirp can be evaluated from the phase in Eq. 8 by taking the derivative with respect to ω = ω0 + Δω, evaluated around the central frequency ω0. The first-order chirp is [59]

    $$ \phi_1\left(x,z\right) = \frac{z}{c} + \frac{x}{s_2}\,\beta\tau_0\left(\frac{1}{1 + z^2/z_R^2}\right) + \frac{x^2}{2cR(z)}\,, $$
    (9)

    where β = αΩ/s represents the aspect ratio of the spatial dispersion with respect to the beam size (s) at the common Fourier plane, and τ0 is the transform-limited pulse width. Notably, the first term is the arrival time of the pulse, and the last term is symmetric in x, representing the curvature of the beam. The middle term is linear in x and represents a characteristic trait of TF: a pulse front tilt (PFT) [90]. Note that a linear phase shift in the spectral domain relates to a temporal shift in the time domain via the Fourier shifting property. As such, the PFT describes the property of TF whereby the pulse rapidly sweeps across the lateral dimension of the focal plane.

    Similarly, the second-order chirp is given as [59]

    $$ \phi_2\left(x,z\right) = \left(\frac{x}{s_2}\frac{\tau_0\beta}{\omega_0} - \frac{z}{z_R}\frac{\tau_0^2\beta^2}{4}\right)\left(\frac{1}{1 + z^2/z_R^2}\right)\,. $$
    (10)

    Notably, ϕ2 is dominated by the quadratic term, which is negligible at the focus and increases with z, with a linear relationship when z is well within the Rayleigh range \( {z}_R={k}_0{s}_2^2/2 \). This second-order chirp leads to pulse broadening from temporal dispersion away from the focus (illustrated in Fig. 2).
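A quick numeric check of Eq. 10 on axis (x = 0) illustrates this behavior; the values of τ0 and β below are arbitrary illustrative choices, and z is expressed in units of zR:

```python
import numpy as np

# Second-order chirp (Eq. 10) on the optical axis (x = 0), z in units of zR
tau0, beta = 100e-15, 6.0           # transform-limited width, dispersion ratio
z = np.linspace(0.0, 3.0, 301)      # axial position / zR
phi2 = -z * (tau0**2 * beta**2 / 4) / (1 + z**2)
# Chirp is zero at the focus, grows ~linearly well within zR,
# and its magnitude peaks at z = zR before rolling off.
```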

  3.

    Stability of Compressive Sensing

    Like many other inversion methods, the efficacy of CS relies on several closely linked parameters: the mutual incoherence and the condition number of the measurement matrix, and the noise in the measurement. The interplay between these parameters dictates the qualities seen in the recovered images. For instance, a compressed Hadamard matrix is not mutually incoherent (in fact, it is perfectly coherent, i.e., there will be at least two columns that are identical); however, it is well-conditioned. As a result, CS with a Hadamard matrix leads to superior performance when the signal-to-noise ratio is low; however, the image may exhibit features that repeat spatially and, overall, will demonstrate poor performance with high compression. A random matrix is mutually incoherent and has a satisfactory condition number with compression. Thus, it will perform well with compression and will fail only when the signal-to-noise ratio is exceptionally low. As a result, the random matrix is often the basis of choice for many CS applications. A Morlet matrix is mutually incoherent; however, it has a poor condition number when the spatial bandwidth is highly constrained. While it is superior in microscopy applications due to the limited spatial bandwidth, it requires a strong signal-to-noise ratio, i.e., a strong fluorescent signal from the sample.
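These properties can be checked numerically. The sketch below measures the mutual coherence (the largest normalized column correlation) of a compressed, i.e., row-subsampled, Hadamard matrix versus a random matrix; the sizes are illustrative:

```python
import numpy as np

def hadamard(n):
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def mutual_coherence(Phi):
    """Largest absolute inner product between distinct normalized columns."""
    A = Phi / np.linalg.norm(Phi, axis=0)
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)
    return G.max()

# Keeping the first M rows of H_16 leaves pairs of identical columns,
# so the compressed Hadamard matrix is perfectly coherent (mu = 1):
H = hadamard(16).astype(float)
mu_hadamard = mutual_coherence(H[:8, :])
mu_random = mutual_coherence(np.random.default_rng(0).standard_normal((8, 16)))
```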

  4.

    Imaging Speed

    Multiplexed widefield detection and CS promise a reduction in the total measurements and a capacity to capture rapid dynamic processes; however, the proof-of-principle demonstrations [13,14,15] have not yet shown a faster imaging speed compared to state-of-the-art point-scanning methods. This is largely due to the discrepancy in the maturity of the hardware and control software that drive these methods. Here, we explore the theoretical speeds that can be achieved by both methods.

    In point-scanning two-photon microscopy, the imaging speed is limited by the repetition rate of the laser, which is 80 MHz in typical embodiments. Another limit is set by the scanning speed. High-speed resonant scanning systems can achieve a line-scan rate of 12 kHz. Let us define the maximum imaging speed based on a 1-megapixel (MP) image; thus, 12 frames per second (fps) can be achieved practically. The speed of multiplexed imaging is limited by the speed of the light shaper. For an SLM, let us consider a conservative high-speed refresh rate of 300 Hz. For a 1-MP image and 10% compression (M/N), 0.003 fps can be achieved. A DMD can reach speeds beyond 15 kHz. Similarly, for a 1-MP image and 10% compression, 0.15 fps can be achieved. This is substantially lower than point scanning. Hybrid methods [15, 79] using electron-multiplying CCDs (EMCCDs) at 25 fps (1-MP image) could reach >1 fps with short Hadamard codes.

    In three-photon microscopy, the requirement for high pulse energy restricts the laser repetition rate to typically sub-MHz regimes. For point scanning, this limits imaging speed to below 0.1 fps (in our 1MP comparison). The speed of multiplexed imaging, however, is still limited by the light shaper. As such, multiplexed imaging may have an advantage in speed in widefield three-photon imaging.
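The back-of-the-envelope figures above can be verified with simple arithmetic (a 1-MP image is assumed throughout):

```python
def point_scan_fps(line_rate_hz, lines):
    """Frame rate of a resonant point scanner: one line per scan period."""
    return line_rate_hz / lines

def multiplexed_fps(shaper_rate_hz, n_pixels, compression):
    """Frame rate of compressive multiplexed imaging: M = compression * N patterns."""
    return shaper_rate_hz / (n_pixels * compression)

fps_resonant = point_scan_fps(12e3, 1000)   # 12-kHz scanner, 1000 lines -> 12 fps
fps_slm = multiplexed_fps(300, 1e6, 0.10)   # 300-Hz SLM -> 0.003 fps
fps_dmd = multiplexed_fps(15e3, 1e6, 0.10)  # 15-kHz DMD -> 0.15 fps
```

The same functions show why the comparison shifts for three-photon imaging: with a sub-MHz laser, the point-scanning rate drops below 0.1 fps, while the multiplexed rate remains set by the light shaper.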