Introduction

Fluorescence lifetime imaging, unlike conventional fluorescence imaging techniques, measures the temporal properties of a fluorophore—its fluorescence lifetime1,2,3. Fluorescence decay can be affected by the environment of the fluorophore such as concentration of oxygen, pH, or protein-protein interactions, among many others3,4,5. Hence, extracted lifetimes can reveal contrast across the sample, which would be otherwise unseen from fluorescence intensity measurements only. Fluorescence lifetime imaging microscopy (FLIM) is widely utilised in biological sciences6,7. For instance, in cancer research FLIM has been used for cancer cell detection8,9,10,11, anti-cancer or chemotherapy drug delivery12,13, and anti-cancer drug efficacy studies14,15. In addition to this, in recent years FLIM has started to play a role in clinical diagnostics15,16,17. However, wide adoption of FLIM in clinical settings is still lacking, partially due to limited imaging speed and/or field of view (FOV) of available FLIM systems17. The challenges arise from the fact that nominal lifetimes of endogenous fluorophores and fluorescent proteins lie in the range of 0.1–7 ns6. Since there are a number of possible quenching interactions3,18,19 that decrease lifetimes even further, detectors with sub-nanosecond temporal resolution are required for FLIM. Typical commercial systems make use of confocal microscopes with detectors suitable for point-scanning (such as photomultiplier tubes) and time-correlated single-photon counting (TCSPC) electronics that can satisfy the temporal resolution requirements20.

However, point-scanning systems can suffer from photo-bleaching due to the high optical energy in the light pulses used in the system, and cannot provide instantaneous full FOV information, which becomes important when imaging dynamic scenes or in vivo applications. We note that having higher laser power in each spot with 2-photon excitation can be useful, however, in our case for 1-photon excitation the samples are already close to bleaching with an average power of a few mW. Therefore, an analytical comparison of raster scanning and wide-field data acquisition for FLIM experiments indicates that for the case of dim/sparse samples, wide-field acquisition can be up to \(N^{2}\) times faster compared to raster scanning, where N is the number of pixel rows in a square array detector21. Therefore, large detectors, such as the SPAD array used in this work, can be important for imaging dim samples which are often encountered in biologically relevant experiments.

Wide-field FLIM is typically realised using TCSPC in a system with microchannel plate-based gated optical intensifiers combined with a sensor capable of resolving signal position, such as a Charge-Coupled Device (CCD) camera22,23. An emerging alternative to aforementioned intensifier based systems are Single Photon Avalanche Diode (SPAD) arrays manufactured with complementary-metal-oxide semiconductor (CMOS) technology24, which can operate with TCSPC25,26,27,28,29,30,31 or time-gated32,33,34,35,36,37 acquisition mode. The main advantages of SPAD arrays over conventional CCD/CMOS cameras are the picosecond temporal resolution, and single-photon sensitivity38, which make them ideal for a broad range of applications in the area of ultrafast time-resolved imaging39. Until recently, SPAD arrays had a relatively limited number of ‘active areas’ (or ‘pixels’) due to the physical constraints imposed by the need to fit complex timing electronics for each individual pixel on the same chip. Nonetheless, recent technological advances have led to SPAD sensors with formats comparable to intensified CCDs. Prior to the development of the SPAD array used in this work34, a \(512 \times 512\) SPAD array (SwissSPAD233) was the largest SPAD array available (recent developments in SPAD detectors are reviewed in40). We note that generally speaking, gated SPAD cameras offer higher pixel fill-factor and therefore better photon detection probability compared to TCSPC SPAD cameras that require more complex on-pixel electronics. On the other hand, gated cameras require digital scanning of the gate to achieve temporal resolution, therefore lengthening the total acquisition times. There is therefore always a trade-off between the requirement to scan the gate (for a gated camera) and low pixel fill-factor (for a TCSPC camera). When building a megapixel SPAD camera, one must also consider the advantage of higher quality readout obtained by simplifying the electronics.

In this work, we demonstrate 0.5 megapixel (500 \(\times\) 1024 pixels) wide-field FLIM microscopy with a SPAD array, while protein fluorescent lifetimes are extracted directly from the data via a bespoke artificial neural network (ANN). We illustrate the applicability of the 0.5 MP SPAD array for FLIM imaging of Convallaria samples at \(1\,{\mathrm{Hz}}\) acquisition rate, and also in experiments with samples relevant in cancer research, such as the human fibrosarcoma (HT1080) cell line. Finally, we show the potential to extend this approach to ultra-wide fields-of-view by acquiring 8 tiles with the SPAD array, thus providing a 3.6 MP lifetime image of Convallaria.

Figure 1
figure 1

Principle of time-gated acquisition and the machine learning model. (a) Fluorescence decay is sampled with a number of gates, each shifted by a minimum of 36 ps. Each exposure corresponds to a ‘time bin’, which samples a different part of the fluorescence decay signal. (b) The ANN architecture (see “Methods”) consists of one input layer (IL), one output layer (OL), and a series of hidden layers (HLi, with \(i = 1,2,3\)). Each of these layers consists of a fully-connected dense layer (dark blue) followed by with rectified linear unit (ReLU) activation function (light blue). The input layer is fed with the fluorescence decay signal recorded by a single pixel of the SPAD array.

0.5 megapixel wide-field SPAD array

The SPAD array is described in detail in Ref.34 and in the “Methods”. The camera operates in gated mode, i.e. the sensor is sensitive to incoming photons for a fixed duration gate of \(\sim 3.8\) ns that has, to good approximation, a super-Gaussian profile (see “Methods”). Figure 1a schematically explains how the 3.8 ns gate is scanned in steps that can be as small as 36 ps. At each step, a binary image (frame) of the spatially resolved photon counts is recorded. Stacking together all of these frames therefore provides a temporally-resolved spatial image of the sample where the data along the temporal axis is the convolution of the lifetime response with the camera temporal gate. The samples are imaged on to the camera with 0.47 \(\upmu\)m/pixel spatial sampling for our all data, with the exception of Fig. 2g–i that has 33 \(\upmu\)m/pixel spatial sampling with (see “Methods”).

Figure 2
figure 2

Wide-field fluorescence lifetime measurements of Convallaria and HT1080 cells. First column: least-squares (LSQ) deconvolution; second column: ANN deconvolution; third column: temporal sum of pile-up and background corrected intensity data clipped to selected intensity values to reveal dimmer structures. (a)–(c): high photon count measurements of Convallaria (100 s acquisition). Mean lifetime measurements for LSQ (processing time, 56 min) and ANN deconvolution (processing time, 2.7 s) yield similar values. Spatial sampling is 0.47 \(\upmu\)m/pixel with a 7% active area fill-factor. (d)–(f) Low photon counts measurements of Convallaria at a total acquisition time of 1 s. LSQ (processing time, 58 min) and ANN (processing time 2.7 s) deconvolution results are similar. Spatial sampling is 0.47 \(\upmu\)m/pixel. (g)–(i) measurements of HT1080 (fibrosarcoma) cells expressing Clover41. As with previous data-sets with LSQ (processing time, 23.2 min) and ANN (processing time, 3.6 s) retrievals. Spatial sampling is 0.33 \(\upmu\)m/pixel. We found the HT1080 cells to be dimmer than the Convallaria cells. HT1080 cells yield around 100 photons per second on average in the brightest region, compared to around 2500 in the brightest region of Convallaria. Scale bars 50 \(\upmu\)m.

Lifetime retrieval

Irrespective of the imaging modality used for FLIM measurements, extracting lifetime information is not a trivial matter42. The measured signal, f(t), is the convolution of the impulse response function (IRF) and the fluorescence decay of the fluorophore, g(t), can be expressed as:

$$\begin{aligned} f(t_{i}) = \sum _{t=0}^{t=t_{i}}g(t_i)\mathrm {IRF}(t-t_{i})\Delta {t}, \end{aligned}$$
(1)

where \(t_{i}\) is the time of the i-th sampling of the signal. A plethora of algorithms have been developed in order to tackle the problem of retrieving fluorescence decay (reviewed in42), with perhaps the most common being least-squares (LSQ) deconvolution (sometimes referred to as ‘reconvolution’). In this approach a model of the fluorescence decay is convolved with the IRF and compared to the measured data using LSQ minimisation. The ‘best fit’ yields a set of parameters, including the lifetime (described in detail in “Methods”). However, LSQ minimisation-based lifetime estimation is typically very demanding computationally, even with the reduction of computational times provided by Graphical Processing Units (GPUs)43. Recently, a rapid non-fitting method called center-of-mass method (CMM) has been improved by accounting the background contribution44. However, CMM relies on the explicit assumption that the IRF is much shorter than the lifetime measured45, which does not apply for example when using gated cameras with gate times that are similar to the expected lifetimes, as in this work.

Alternatively, fast visualisation methods such as phasor analysis have been proposed46, and have been successfully used for time-gated SPAD array FLIM data analysis47. In addition to the above-mentioned numerical approaches, advances in machine learning (ML) methods48 has enabled researchers to utilise deep learning (DL) frameworks to extract the exponential decay time and component fraction information from FLIM data rapidly and without fitting49,50. Here we employ an ANN to retrieve the lifetime at each pixel of the SPAD array. The ANN layout is shown in Fig. 1b: the input layer (IL) is a one-dimensional array corresponding to the time-resolved photon-count signal for a given pixel. This is connected to the output layer (OL), which provides a single value for the fluorescence lifetime decay constant \(\tau\), through 3 fully-connected dense layers of decreasing size. The ANN is trained on computer generated data that is created by taking a range of (mono) exponential lifetime decays (see Fig. 1a) that are chosen in the range \(\tau =0.5-5\) ns and convolved with a super-Gaussian of width that varies in the range 3.6–6 ns. The ANN was then tested on both simulated data not used in the training and also on actual experimental data.

In the latter case, the retrieval was compared to the results from the LSQ deconvolution. We found that in order to retrieve precise values of \(\tau\) from test data provided by the camera it was necessary to also include noise in the training data. The best results were obtained assuming two sources of noise: a Poisson-distributed component that is proportional to the actual signal, as expected for a photon detection process and also used in Ref.50 and a Gaussian component, see “Methods”. As shown in what follows, the ANN is applied to each individual pixel and can provide very similar results to a standard LSQ deconvolution albeit with a retrieval time of \(\sim 8\) s for a megapixel image, corresponding to \(1000\times\) gain in speed, if compared to our LSQ approach (the mean absolute difference between the two methods for the results shown in Fig. 2 was \(0.14\pm 0.12\) ns) and is thus a key component in rendering megapixel FLIM a real-time technique.

Results—0.5 megapixel wide-field FLIM

To illustrate the applicability of our SPAD array for FLIM data acquisition, we imaged a Convallaria sample with ‘high photon counts’ (HPC) at a 10 s acquisition rate (Fig. 2a–c) and with ‘low photon counts’ (LPC) at a 1 second acquisition rate (Fig. 2d–f). The HPC data-set was obtained by using a 504 ps gate shift and exposure of \(\approx 330\) ms. In order to achieve 1 Hz acquisition, we reduced the exposure to \(\approx 33\) ms per frame. Figure 2 shows that lifetime data can be retrieved for both the HPC and LPC data. However, LPC data analysis is more challenging due to the lower signal-to-noise ratio (SNR). In the examples shown here, the total photon count in the LPC data falls below 2700 photons per pixel, whereas the HPC data exceeds 8500 (the intensity values in Fig. 2 are clipped to make dimmer structures more visible). Nonetheless, both the LSQ and ANN methods recovered similar mean lifetime values for both HPC and LPC data. The mean lifetime and standard deviation values for the lifetime images in Fig. 2 are shown in Table 1.

Table 1 Mean and standard deviation of extracted lifetime values of data shown in Fig. 2. The LSQ and ANN lifetime retrieval methods provide similar, compatible results. We note the consistency in the lifetimes for HPC and LPC data, which shows that we can retrieve reliable lifetimes even at relatively low photon counts.

One of the main benefits of the ANN is that it has the potential for being significantly faster than LSQ. Using a pre-trained model (training time \(\approx 38\) min on a training set of \(\approx\) 2 million simulated decay curves), the ANN retrieval requires 2.7–3.6 s to process the full image. This is 388–1288 times faster than the LSQ method, which took 23.3–58 min in our tests (detailed times for each data-set are described in Fig. 2). We emphasise that we fit each pixel independently, and do not rely on ‘global fitting’ schemes, where data is averaged spatially and/or temporally52.

While Convallaria is a popular sample for testing FLIM systems53,54,55,56, the strong signal it yields is not necessarily representative, for example, of the signal level from transfected mamallian cells. To show a more practical example, we provide FLIM data of cancer research relevant samples: fixed HT1080 (fibrosarcoma) cells, transfected with pcDNA3-Clover57 and expressing a protein with a single fluorescence lifetime (Fig. 2g,h). The HT1080 cell data was acquired using a 108 ps gate shift, with a total \(\approx {400}\) s acquisition time.

Similarly to the Convallaria results, the ANN and LSQ results match well quantitatively (absolute difference between LSQ and ANN: \(0.10\pm 0.05\) ns). Our retrieved lifetime for HT1080 cells transfected with Clover is in good agreement with a previously reported value of 2.6 ns58. The large size of the sensor allows imaging multiple cells, at high detail, across a large field of view simultaneously. We note that the acquisition time could be decreased by increasing the gate shift and acquiring fewer time bins at the cost of reduced sampling of the fluorescence decay, and potentially less accurate lifetime recovery.

Results: 3.6 megapixel wide-field FLIM

Finally, we present a 3.6 MP image of our Convallaria sample to showcase that very large field-of-views can be achieved almost trivially using the 0.5 megapixel SPAD array (Fig. 3). The field of view in Fig. 3 is \(618\times 650\) \(\upmu\)m with the same spatial sampling of 0.33 \(\upmu\)m/pixel as in previous figures. We acquired the data using a mosaic acquisition by moving the sample with \(\approx 10\%\) overlap between mosaic tiles. Crucially, our ANN method required only 36 s to recover lifetime information from this data-set. This retrieval time could be shortened by processing each pixel (or batches of pixels) in parallel.

Figure 3
figure 3

Mosaic image of 8 tiles of Convallaria sample stitched together, yielding 3.64 MP data (1875\(\times\)1942 pixels) corresponding to a field of view \(\approx 618 \times 650\) \(\upmu\)m (or, equivalently, a sampling of 0.33 \(\upmu\)m/pixel). The total acquisition time was approximately 16 minutes in HPC mode (that can be reduced to 10–20 s by operating in low photon count mode) with a processing time of \(\approx\) 36 s using ANN deconvolution. Image stitched using BigStitcher ver. 0.3.6 https://imagej.net/BigStitcher51.

Conclusions

We have demonstrated the application of the largest-to-date time-gated SPAD array for fluorescence lifetime imaging of biologically relevant samples. By exploiting the ability to vary the gate-shift size between different exposures of the camera, we showed that wide-field FLIM at 0.5 megapixel resolution is possible at 1 Hz acquisition speed. By performing a spatial mosaic acquisition, 3.6 MP fluorescent lifetime images are readily available. These could be scaled to even larger fields of view and shorter acquisition times by using bespoke and rapid translation stages. In this work we retrieve mono-exponential lifetimes as this has less stringent SNR requirements compared to multi-exponential fitting42,59. However, future work could of course extend this multi-exponential decays.

While fast analysis methods such as the phasor approach46, or methods utilising advances in machine learning49,50, including our own ANN, are important parts of a high-speed FLIM systems, the biggest impact on imaging low-signal biologically relevant structures at sub-micron resolution at large FOV is delivered by the continuous improvement of SPAD array technology with e.g. increases also in fill factor and quantum efficiency60.

Methods

Time-gated imaging

The imaging is based on a 1 MP SPAD image sensor34. In its current version, the megapixel array accommodates multiple functionalities, which leads to only half of the sensor being read out in the gated mode used here34. The camera has 7% fill factor and 10% photon detection probability at 510 nm. To acquire the fluorescence lifetime of a sample, the camera is operated in the time-gated mode. Laser pulses are repeatedly exciting the sample; the photons emitted due to fluorescence will be detected by the camera in multiple frames with shifting gate window, resulting in an image sequence per acquisition. For each frame with a fixed gate position, 255 binary frames are summed to create an 8-bit image (9-bit images are obtained by summing two 8-bit registers). To obtain the full characteristic of a single-exponential fluorescence decay, the starting gate position is tuned to be prior to the laser excitation. The gate with average width of 3.8 ns over all pixels is then shifted finely between each frame by a fixed time. Illustration of gate-shifting is shown in Fig. 1a. We note that we pre-process the data by performing background subtraction and pile-up correction, where the latter is accounted for, following equation 1 in Ref.47 and adopting the same nomenclature:

$$\begin{aligned} I_{\text {corr}} = -I_{\text {max}}\ln \left( {1 - \frac{I_{\text {rec}}}{I_{\text {max}}}}\right) , \end{aligned}$$
(2)

where \(I_{\text {corr}}\) is the pile-up corrected counts, \(I_{\text {max}}\) is the maximum possible photon count (depending on the bit depth e.g. 255 for 8-bit data), and \(I_{\text {rec}}\) is the actual recorded value at a particular pixel. The background is removed by taking the average of first few frames, before the decay signal is observed, and subtracting it from all the frames.

We threshold out any pixels that have fewer than \(N_{tot}\) photon counts in total (i.e. integrated over all time gates) in order to eliminate pixels that have no significant signal. Photon counts drop towards the left hand side of the sensors due to a decay in the strength of the electronic drivers that distribute the signal controlling the time gates. We account for this non-uniform response of the SPAD array34 with a ‘sliding thresholding’ of the data, going from \(N_{tot}=1300\) (right hand edge of the image) to \(N_{tot}=50\) (left hand edge).

We note that despite the 7% fill factor, the small pixel pitch of 9.4 \(\upmu\)m (corresponding to a pixel active circular area of 6.18\(\upmu \text {m}^{2}\)) of the sensor provides sufficient sampling for microscopy at common magnification range.

Data analysis—LSQ deconvolution

As briefly explained in the introduction in the main text, the measured data is a convolution of the impulse response function (IRF) and the underlying fluorescence decay (see Eq. 1). Technically, the IRF of a FLIM system depends on the excitation source and the detector (in our case, the gate length of each SPAD in the array). However, since the nominal pulse width of our laser pulse (<47 ps) is significantly smaller than the gate length of our SPADs (average \(3.8\pm 0.2\) ns), the contribution to the IRF from our laser source is negligible.

We modelled the IRF as a generalised Gaussian function (or super-Gaussian) at each pixel b as

$$\begin{aligned} g_N(t) =\exp \left[ -2\left( \frac{t-t'}{w_N}\right) ^{2N}\right] , \end{aligned}$$
(3)

where t is time, \(t'\) is the position of the gate, N the Gaussian order, and \(w_N\) is given by the full width at half maximum (FWHM) of the gate through the following relation:

$$\begin{aligned} w_N = \frac{\mathrm{FWHM}}{2\left( 0.5 \ln 2\right) ^{1/2N}}. \end{aligned}$$
(4)

We measured that the average 10% to 90% intensity rise time for the gate (evaluated over all the pixels) is \((0.55\pm 0.08)\,\mathrm {ns}\)34. The order N of the super-Gaussian is chosen such that it matches the actual measured profiles34. Namely, \(N=6\) in Eq. (3) yields a rise time of approximately equal to 0.61 ns.

The fluorescence decay model used obeys the following equation:

$$\begin{aligned} d(t) = {\left\{ \begin{array}{ll} A_0\exp \left( -\frac{t-t_0}{\tau } \right) +b &{} t \ge t_0 \\ b &{} t < t_0, \end{array}\right. } \end{aligned}$$
(5)

where \(\tau\) is the decay constant (i.e. lifetime), \(t_0\) is a temporal offset, b is a constant that accounts for a signal offset induced by a non-zero background and \(A_0\) is an amplitude parameter corresponding to the number of photon counts. Following Eqs. (3) and (5), we model the temporal response of our detection system through the function \(f(t)\,=\,g_k(t)\circledast d(t)\), where \(\circledast\) stands for mathematical convolution. We then apply a LSQ optimisation to the measured data that provides an estimate of [\(w,t_0,\tau ,b,A_0,t'\)]. Computations were implemented using MATLAB R2017b using lsqcurvefit function.

Figure 4
figure 4

Lifetime retrieval performance of the LSQ (orange) and ANN (blue). Both methods give similar results compared to the ground truth (zero error indicated by the dashed line), and to each other. Each point corresponds to a shift of 500 ps and with a conservative assumption that we can distinguish lifetimes within 2 standard deviations of the points shown, the data indicates that we have a 200 ps (2 standard deviations) resolution for lifetimes in the 0.5–2 ns region.

Data analysis—artificial neural network

We use a custom ANN consisting of an input layer (IL), an output layer (OL), and three hidden layers (HLi, with \(i = 1,2,3\)) connecting IL with OL, as depicted in Fig. 1b. Each of these layers is formed by a fully-connected dense layer followed by with rectified linear unit (ReLU) activation function. The IL (with \(n_0 = 200\) nodes) is fed with a fluorescence decay signal (normalised to the range [0, 1]), i.e. with a 1D vector with as many elements as the number of gate shifts (200 in our work). Then, the output of the IL is fed in cascade through the ANN, while the number of nodes of each subsequent HLi is decreased to: \(n_1 = 100\), \(n_2 = 50\), and \(n_3 = 25\). Finally, the OL provides an estimation of the lifetime \(\tau\) of the fluorescence decay. Computations were carried out using Tensorflow 1.14.

Standard machine learning techniques require training with data-sets of sufficient size and quality. To this aim, we train our ANN only with synthetic data, i.e. with fluorescence decay curves-lifetime pairs that are generated numerically. Our training set consisted of 2 million datapoints. We trained using mini-batch gradient descent, with a mini-batch batch size of 128 using adaptive moment estimation (Adam) as our gradient descent algorithm. Our loss function was the mean squared error (MSE) between the ground truth lifetime values of our mini-batches, and the corresponding ANN predictions.

In order to prevent overtraining the network, we validated the number of epochs to train for using a validation set of 1 million datapoints. We found that the models overtrained minimally, achieving a training set loss that was <10% smaller than our validation loss for all ANNs, and, upon prediction, we found that our test set loss was approximately equal to our validation loss.

Our test set comprised 1 million datapoints. For our 200 time-bin data, expressing our ground truths and predictions in terms of nanoseconds, we obtained mean squared errors of  0.0053 ns\(^2\) on our test set. Our errors had \(\sim\)0 mean and 0.072 ns standard deviation, and approximately 0 expectation. In other words, our estimator was unbiased, and, averaged over all the test-set lifetimes, gave predictions within error bounds of ±72 ps. For our 30 time-bin data, we obtained a test set MSE of 0.0634 ns\(^2\). Our errors had approximately 0 mean, and 250 ps standard deviation.

Including variability on gate width, centre position, lifetime, and lifetime curve height, allows us to account for the fact that the various SPAD pixels have slightly different underlying electronic properties34. For the two cases (200 time-bin, 108 ps gate shift and 30 time-bin, 504 ps gate shift), we created two distinct neural networks, to match the input sizes of 200 and 30. For both of these neural networks, we created training, validation and test sets, all modelled according to Eqs. (1), (3), (4), and (5) . In order to facilitate good generalisation, our datasets all contained histograms modelled according to a wide range of parameters; this range was kept the same for both ANNs.

The parameter ranges were: decay lifetime \(\tau\): [0.5, 5] ns, decay start \(t_0\): [5, 10] ns, decay amplitude, A: [2, 32], gate width \(w_n\): [3.6, 6] ns. Our range of lifetimes cover the lifetime expression domain of our dye, acridine orange. Variations in decay starting point arise from the signal transmission time difference for different regions of the SPAD array. Decay intensity variations originate from various sources, such as the local concentration of fluoroscent dye caused by cellular structure. The gate width variations are inherent to the SPAD array (see34).

Both LSQ and ANN deconvolution approaches retrieve fluorescence lifetimes one pixel at a time, and can therefore be used on data of any dimension. For the ANN approach we obtained a root mean squared error of 0.0725 on a test set of \(\approx 1\) million synthetic data. The ANN algorithm then takes \((\approx 8.5 \pm 0.5)\) s to predict the lifetimes of a \(1024\times 1024\) synthetic data set with 200 time bins on a Intel Core i7 10510U CPU.

Noise model

We experimentally confirmed that the noise model in our experiments was best described by a mixture of Poissonian (due to photon counting statistics) that scales as the square root of the signal and Gaussian (electronic) noise.

By analysing the first and last ten time bins (in which no fluorescence signal is present) of 3, full 0.5 MP images, containing 135 thousand non-background pixels, we found that the Gaussian noise had an approximately 0 mean and a standard deviation of around 2.3 counts. In order to add flexibility to our model, we added to each histogram Gaussian noise whose mean and standard deviation were randomly drawn from \([-2,2]\) and [0, 5], respectively.

Poissonian noise is constantly present in our training/testing/experimental data, by nature of the measurements. Further, we tested the robustness of our method on varying levels of background (Gaussian) noise for various lifetimes i.e. 1, 2.5 and 4 ns. For each noise distribution scenario, we kept the decay intensity parameter, A, the decay start \(t_0\) and the gate width \(w_n\) constant, and tested on 100 histograms. We found that even for Gaussian noise with a standard deviation of 10 counts/pixel, which is  \(4\times\) larger than the experimentally measured standard deviation, and \(2\times\) larger than the highest training noise level, we still retrieve the correct mean lifetime, with a typical std. dev. of 130 ps and worst case std. dev. of 199 ps. This behaviour was found to be consistent over all the tested lifetimes.

Figure 4 shows the lifetime retrieval performance of both the LSQ and the ANN methods. For this benchmark, we generated 9 sets of synthetic curves following the convolution model described above. Each set of curves had constant lifetime decay \(\tau\) within the range [0.75–4.75] ns, and randomly varying noise from curve to curve within the set. Orange and blue dots represent the mean of the lifetime values retrieved with the LSQ and the ANN, respectively, while error bars correspond to their standard deviation. Both methods provide lifetime estimates in excellent agreement with the ground truth data.

Imaging set-up

We acquired the data on a custom-built epi-fluorescence microscope, with an Olympus 20 \(\times\) 0.4 NA objective with an \(f=250\) mm tube lens (or Nikon 40 \(\times\) 0.75 NA objective, and \(f=100\) mm tube lens for 1 Hz acquisition) air objective, and a FITC emission/excitation filter and dichroic mirror set. For the illumination source, we used HORIBA DeltaDiode laser diode model DD-470L, nominal peak wavelength 470 nm spectral Full Width at Half Maximum (FWHM) 10 nm, nominal pulse width of 47 ps, and a nominal pulse energy of 15 pJ. The repetition rate of the diode was set to 25 MH–7 nsz.

Mammalian cells, culturing conditions and transfections

HT1080 cells were maintained in Dulbecco’s modified eagle’s medium (DMEM), supplemented with 10% foetal bovine serum (FBS), 2 mM L-glutamine and 1\(\times\) PenStrep. Cells were maintained in 10 or 15 cm TC-treated plastic dishes at \(37\,^{\circ }\)C and 5 % \(\text {CO}_{2}\). HT1080 cells were transfected using Amaxa nucleofector Kit T, program L-005. Cells were transfected with 5 \(\upmu\)g DNA (pcDNA3-Clover) following manufacturers guidelines and replated on 6 cm TC-treated plastic dishes overnight at \(37\,^{\circ }\)C, 5% \(\text {CO}_{2}\). 35 mm glass bottom MatTek dishes that were coated with laminin 10 \(\upmu\)g/ml diluted in PBS and left overnight at \(4\,^{\circ }\)C. Cells were then collected and replated onto the dishes and incubated for 24 hours at \(37\,^{\circ }\)C, 5 % \(\text {CO}_{2}\). After, the dishes were washed twice with PBS before fixing with 4 % Paraformaldehyde for 10 min. These were then washed three times with PBS before slight drying. 10\(\upmu\)l Fluromount-G (without DAPI) was added on top of the cells and a 19 mm glass coverslip was added on top to seal the cells in the dish. Dishes were kept in the dark until imaging.