Analysis of Simulated Fluorescence Intensities Decays by a New Maximum Entropy Method Algorithm
Abstract
A new algorithm for the Maximum Entropy Method (MEM) is proposed for recovering the lifetime distribution in time-resolved fluorescence decays. The procedure seeks the distribution that maximizes the Skilling entropy function subject to the chi-squared constraint χ^{2} ~ 1 through iterative linear approximations, LU decomposition of the Hessian matrix of the Lagrangian problem and the Golden Section Search for backtracking. The accuracy of the algorithm has been investigated by applying it to simulated fluorescence decays arising from both narrow and broad lifetime distributions. The proposed approach is capable of analysing datasets of up to 4,096 points with a discretization ranging from 100 to 1,000 lifetimes. Good agreement with nonlinear fitting estimates has been observed when the method is applied to multi-exponential decays. Remarkable results have also been obtained for the broad lifetime distributions, where the position is recovered with high accuracy and the distribution width is estimated within 3 %. These results indicate that the proposed procedure generates MEM lifetime distributions that can be used to quantify the real heterogeneity of lifetimes in a sample.
Keywords
Maximum Entropy Method, Fluorescence lifetime distributions, Synthetic data
Introduction
Fluorescence spectroscopy is a classical method for studying the structural and dynamical aspects of biological systems such as proteins, membranes and living cells [1, 2, 3]. The excited states of fluorophores have lifetimes ranging from a few picoseconds to some tens of nanoseconds. Since this corresponds to the time scale of many important biological processes, such as protein folding dynamics, protein-DNA interactions, the diffusion of small molecules, rotational and internal motions, proton transfer and energy transfer mechanisms [4, 5, 6], time-resolved fluorescence has become an important investigation tool.
Despite the great advances in measurement techniques in recent years, the analysis of fluorescence decays remains a difficult task in data analysis. Usually, a parametric multi-exponential model function is fitted to the experimental data by nonlinear least squares methods [7]. Besides the difficulty of finding the global minimum of the chi-squared function χ^{2} with a parametric minimization procedure, this statistical criterion very rarely allows the analysed decay curve to be fitted with more than three exponential functions [4]. Moreover, there are many situations where the time-resolved fluorescence profile is described by a continuous lifetime distribution rather than by a discrete number of exponential decays. For example, tryptophan residues in the excited state are usually involved in an energy transfer process to intra-molecular acceptors that depends on the relative distance and affects the fluorescence lifetime [4, 8]. This behaviour of tryptophan is used to obtain structural information on the intermediate forms of proteins that characterise their folding process from an extended random polypeptide chain. In complex situations, such as those encountered even in small proteins, the rate of energy transfer, and thus the intra-molecular distances, can be described by a continuous distribution owing to the heterogeneity of the structure, which results in a continuous lifetime distribution for the fluorescence decay [9, 10, 11, 12]. In these cases, the analysis for recovering the distribution relies on a “regularizing function” in addition to the chi-squared statistic to select one member out of the set of feasible distributions [13]. The Skilling entropy S is one of these regularizing functions; the maximum entropy method (MEM) maximizes it subject to the constraint χ^{2} ~ 1, thus determining the desired distribution with no assumptions about its functional form [14, 15, 16, 17, 18, 19, 20, 21].
The MEM is a mathematically complex algorithm that seeks the stationary point of the Lagrange function Λ = S + λ(χ^{2} − 1), where λ is the Lagrange multiplier. This stationary point is the solution of the set of nonlinear equations resulting from \(\nabla \Lambda=0\). Because computer memory and computational speed were once limited, the conjugate gradient method (CG) [22] has been the most widely used iterative method for solving such equation systems [23]. The necessary compromise between the accuracy of the solution and the convergence speed of the algorithm [24] limited both the number of data points and the number of lifetimes used for reconstructing the desired distribution to maximum values of about 1,000 and 200 respectively, with an obvious limitation of the information content of the MEM distribution [25, 26].
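To make the quantities entering the Lagrange function concrete, the following minimal Python sketch evaluates S, χ^{2} and Λ for a trial distribution. It rests on assumptions not stated in this passage: the entropy is written in the commonly quoted Skilling form with a prior model `prior`, χ^{2} is taken as the reduced statistic (normalised by the number of data points), and all function names are hypothetical.

```python
import math

def skilling_entropy(alpha, prior):
    """Assumed Skilling form: S = sum(a - m - a*log(a/m)), with a, m > 0."""
    return sum(a - m - a * math.log(a / m) for a, m in zip(alpha, prior))

def reduced_chi_squared(data, model, variance):
    """Variance-weighted squared residuals, averaged over the data points."""
    terms = [(d - c) ** 2 / v for d, c, v in zip(data, model, variance)]
    return sum(terms) / len(terms)

def lagrange_function(alpha, prior, data, model, variance, lam):
    """Lambda = S + lam * (chi2 - 1), following the expression in the text."""
    chi2 = reduced_chi_squared(data, model, variance)
    return skilling_entropy(alpha, prior) + lam * (chi2 - 1.0)
```

With this form the entropy vanishes when the distribution equals the prior and is negative otherwise, so the constraint term is what pulls the solution away from the prior towards the data.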
Nowadays, the computing power of a standard home computer makes it possible to adopt new iterative approaches to the MEM based on direct methods. In this paper we present a new procedure based on the Newton-Raphson method for solving the Lagrange multiplier problem associated with the MEM constrained problem. A method for solving the equation system \(\nabla \Lambda=0\) through iterative linear approximations and LU decomposition [27] is discussed extensively. The MEM is implemented in a home-built package of routines written in MATLAB (version 7.14, The MathWorks Inc., Natick, MA, 2012). The accuracy of the algorithm is investigated through comparisons with numerical simulations of fluorescence decay data. The proposed MEM algorithm can analyse datasets with up to 4,096 data points, a typical value for an experimental set-up based on the time-correlated single photon counting technique, by considering distributions with a number of lifetimes that ranges from 100 to 1,000.
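The three numerical ingredients named above (a Newton-Raphson linearisation, an LU-based linear solve and a golden section search along the step) can be sketched in a few lines of Python. This is only an illustration on a toy convex function of two variables, not the MEM Lagrangian and not the authors' MATLAB routines; the function names, tolerances and the toy objective are all assumptions.

```python
import math

def lu_solve(a, b):
    """Solve a x = b by LU-style elimination with partial pivoting."""
    n = len(b)
    lu = [row[:] for row in a]
    x = b[:]
    for k in range(n):
        # Bring the largest remaining entry of column k onto the diagonal.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("singular matrix")
        lu[k], lu[p] = lu[p], lu[k]
        x[k], x[p] = x[p], x[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]              # multiplier (entry of L)
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
            x[i] -= lu[i][k] * x[k]           # forward substitution
    for i in reversed(range(n)):              # back substitution with U
        s = sum(lu[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (x[i] - s) / lu[i][i]
    return x

def golden_section(f, lo, hi, tol=1e-6):
    """Minimise a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:                           # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                                 # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

def newton_minimise(f, grad, hess, x0, tol=1e-8, max_iter=50):
    """Newton-Raphson steps, each shortened by a golden section search."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        step = lu_solve(hess(x), [-gi for gi in g])   # H * step = -grad
        line = lambda t: f([xi + t * si for xi, si in zip(x, step)])
        t = golden_section(line, 0.0, 1.0)            # fraction of the full step
        x = [xi + t * si for xi, si in zip(x, step)]
    return x

# Hypothetical toy objective (strictly convex), used only to exercise the solver.
def toy_f(v):
    return math.cosh(v[0]) + math.cosh(v[1]) + 0.5 * v[0] * v[1] - v[0]

def toy_grad(v):
    return [math.sinh(v[0]) + 0.5 * v[1] - 1.0, math.sinh(v[1]) + 0.5 * v[0]]

def toy_hess(v):
    return [[math.cosh(v[0]), 0.5], [0.5, math.cosh(v[1])]]

solution = newton_minimise(toy_f, toy_grad, toy_hess, [0.0, 0.0])
```

Restricting the line search to the interval [0, 1] means the full Newton step is never exceeded, which is one plausible reading of "backtracking"; the actual interval used in the paper is not given in this passage.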
Theory of the Method
In the following, a new algorithm for maximizing the entropy S according to the MEM requirements is described. The procedure works with the negative function −S, which is minimized by the same distribution that maximizes S.
Analysis of the Accuracy of the MEM Algorithm
In the following, the accuracy of the MEM algorithm proposed in the previous section is investigated through comparisons with numerical simulations of fluorescence decay data. Synthetic data \(\lbrace E_m \rbrace\) are generated with a time scale and time resolution that are typical of experimental set-ups for the time-correlated single photon counting technique (TCSPC) [31]. The simulated curves consist of 4,096 data points over a time scale of 25 ns and are generated as the convolution product of the impulse response function R(t), a Gaussian profile with a full width at half maximum (FWHM) of 120 ps, and a decay model function according to Eq. 3. The Poisson noise statistics that affect typical TCSPC measurements [32] are simulated by taking as the intensity at each time a Poisson-distributed random number with a mean equal to the calculated model value E_{m}. The MATLAB routine poissrnd is used for this purpose, and its algorithm is described extensively in [33]. With this procedure for generating simulated data, the number of counts at each point is equal to its variance σ^{2}, and the peak counts can be taken as a measure of the noise content of the decay curve.
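The simulation procedure described above can be sketched as follows. This is a minimal standard-library Python illustration, not the authors' MATLAB code: the position of the impulse response `irf_center_ps`, the rescaling of the model to a chosen peak height, the recursive evaluation of the discrete convolution and the chunked Knuth sampler standing in for poissrnd are all assumptions, and the function names are hypothetical.

```python
import math
import random

def poisson_sample(mean, rng, chunk=500.0):
    """Poisson variate by Knuth's method; large means are split into chunks,
    since a sum of independent Poisson variates is again Poisson."""
    total = 0
    remaining = mean
    while remaining > 0.0:
        lam = min(remaining, chunk)
        remaining -= lam
        limit, product, count = math.exp(-lam), rng.random(), 0
        while product > limit:
            product *= rng.random()
            count += 1
        total += count
    return total

def simulate_decay(amplitudes, lifetimes_ps, n_channels=4096, window_ps=25000.0,
                   irf_fwhm_ps=120.0, irf_center_ps=1000.0, peak_counts=1.0e4,
                   seed=0):
    """Return (model, counts): noise-free convolved decay and its Poisson draw."""
    rng = random.Random(seed)
    dt = window_ps / n_channels
    sigma = irf_fwhm_ps / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    irf = [math.exp(-0.5 * ((i * dt - irf_center_ps) / sigma) ** 2)
           for i in range(n_channels)]
    model = [0.0] * n_channels
    for a, tau in zip(amplitudes, lifetimes_ps):
        decay_per_channel = math.exp(-dt / tau)
        running = 0.0
        for i in range(n_channels):
            # Discrete convolution of the IRF with exp(-t/tau), built recursively.
            running = running * decay_per_channel + irf[i]
            model[i] += a * running
    scale = peak_counts / max(model)          # set the peak channel height
    model = [scale * m for m in model]
    counts = [poisson_sample(m, rng) for m in model]
    return model, counts

# Small example with the three-exponential parameters used later in the text.
model, counts = simulate_decay([0.33, 0.33, 0.33], [100.0, 1000.0, 4000.0],
                               n_channels=256, peak_counts=200.0)
```

The example call uses fewer channels and a lower peak than the 4,096 points quoted in the text only to keep the pure-Python sampler fast.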
Multi-Exponential Fluorescence Decays
The decay parameters recovered by the MEM analysis of a three exponential decay with 10^{4} counts in the peak channel
| α_{1} | α_{2} | α_{3} | τ_{1} (ps) | τ_{2} (ps) | τ_{3} (ps) | χ^{2} |
---|---|---|---|---|---|---|---|
Simulated parameters | 0.33 | 0.33 | 0.33 | 100 | 1,000 | 4,000 | |
MEM results (mean values \(\langle \alpha_i \rangle\), \(\langle \tau_i \rangle\)) | 0.33 ±0.01 | 0.337 ±0.005 | 0.333 ±0.003 | 99 ±7 | 970 ±40 | 3,970 ±40 | 1.01 |
Least-squares fit | 0.33 ±0.01 | 0.332 ±0.006 | 0.340 ±0.004 | 99 ±5 | 970 ±15 | 3,960 ±10 | 1.01 |
Gaussian Lifetime Distributions
The center and the width Δτ of the Gaussian lifetime distributions recovered by the MEM and estimated through a fitting procedure
Parameter | 1-modal: nominal | 1-modal: MEM | 2-modal: nominal, peak 1 | 2-modal: nominal, peak 2 | 2-modal: MEM, peak 1 | 2-modal: MEM, peak 2 |
---|---|---|---|---|---|---|
Amplitude | | | 1 | 1 | 0.92 ±0.02 | 1 |
Center (ps) | 5,000 | 4,990 ±15 | 1,000 | 5,000 | 985 ±34 | 5,016 ±23 |
Width Δτ (ps) | 500 | 498 ±15 | 300 | 1,500 | 319 ±31 | 1,515 ±95 |
Amplitude | | | 0.5 | 1 | 0.45 ±0.03 | 1 |
Center (ps) | 5,000 | 5,008 ±19 | 1,000 | 5,000 | 989 ±42 | 5,012 ±24 |
Width Δτ (ps) | 1,000 | 1,029 ±20 | 300 | 1,500 | 329 ±38 | 1,528 ±54 |
Amplitude | | | 0.25 | 1 | 0.23 ±0.01 | 1 |
Center (ps) | 5,000 | 5,009 ±10 | 1,000 | 5,000 | 990 ±30 | 5,038 ±24 |
Width Δτ (ps) | 1,500 | 1,524 ±15 | 300 | 1,500 | 315 ±24 | 1,546 ±31 |
Amplitude | | | 0.1 | 1 | 0.092 ±0.008 | 1 |
Center (ps) | | | 1,000 | 5,000 | 1,036 ±87 | 5,006 ±17 |
Width Δτ (ps) | | | 300 | 1,500 | 336 ±49 | 1,509 ±31 |
The resolution limit of the algorithm in resolving the intensities of the lifetime spectrum has been investigated by adding a Gaussian distribution with different peak intensities on the tail of the distribution peaked at τ = 5,000 ps. For this purpose, we have considered the worst case of Fig. 4, observed for a width Δτ = 1,500 ps, and added a further distribution centred at τ = 1,000 ps with a width Δτ = 300 ps. Four values for the amplitude of this additional peak have been considered: 100, 50, 25 and 10 % of the main peak at τ = 5,000 ps. As before, a set of 20 fluorescence decay curves has been generated in each case to work out the errors in recovering the characteristic parameters of these bi-modal lifetime distributions through a bi-Gaussian fitting procedure.
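The nominal bi-modal distributions used in this test can be written down explicitly. The short Python sketch below builds them from the centres, widths and relative amplitudes quoted in the text; it assumes that the width Δτ denotes the full width at half maximum of each Gaussian, which this passage does not state, and the function names are hypothetical.

```python
import math

def gaussian_peak(tau, amplitude, center, fwhm):
    """Gaussian of given peak amplitude; the width is ASSUMED to be a FWHM."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amplitude * math.exp(-0.5 * ((tau - center) / sigma) ** 2)

def bimodal_distribution(taus_ps, secondary_amplitude):
    """Main peak at 5,000 ps (width 1,500 ps, amplitude 1) plus a secondary
    peak at 1,000 ps (width 300 ps) with the given relative amplitude."""
    return [gaussian_peak(t, 1.0, 5000.0, 1500.0)
            + gaussian_peak(t, secondary_amplitude, 1000.0, 300.0)
            for t in taus_ps]

# The four relative amplitudes considered in the text: 100, 50, 25 and 10 %.
taus = [10.0 * i for i in range(1, 1001)]           # 10 ps to 10,000 ps
cases = {a: bimodal_distribution(taus, a) for a in (1.0, 0.5, 0.25, 0.1)}
```

Each nominal distribution would then be turned into a simulated decay and analysed, with the recovered MEM distribution compared against it through the bi-Gaussian fit.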
Conclusions
We have described a new algorithm for implementing an MEM analysis of time-resolved fluorescence decays. The proposed procedure seeks the desired lifetime distribution by solving the set of nonlinear equations \(\nabla \Lambda = 0\) through iterative linear approximations, LU decomposition of the Hessian matrix H and the Golden Section Search for backtracking. The algorithm has been tested on complex, analytically simulated data arising from multi-exponential decays and from broad lifetime distributions. A typical inversion with M = 4,096 data points and a discretization of the lifetime spectrum into N = 400 values takes about 60 s in total on an Intel(R) Core(TM) i7-2600 CPU @ 3.40 GHz processor with 8 GB of memory. The computation time is about 150 s when the number of lifetimes considered is N = 1,000.
The analysis of the multi-exponential decays with our computational approach has clearly shown that the sensitivity of the MEM increases as the number N of discretization points in log τ space increases. An accuracy in retrieving the simulated parameters comparable to that of nonlinear regression can be achieved by considering N = 1,000 lifetimes.
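The effect of N on the lifetime grid can be illustrated with a short Python sketch that places N lifetimes at equal intervals in log τ. The grid limits of 10 ps and 20,000 ps are hypothetical values chosen for illustration, as are the function names; the limits actually used in the paper are not given in this passage.

```python
import math

def log_lifetime_grid(n, tau_min_ps=10.0, tau_max_ps=20000.0):
    """N lifetimes equally spaced in log(tau) between the two limits."""
    step = (math.log(tau_max_ps) - math.log(tau_min_ps)) / (n - 1)
    return [tau_min_ps * math.exp(i * step) for i in range(n)]

def neighbour_ratio(grid):
    """Constant ratio between adjacent lifetimes on a log-spaced grid."""
    return grid[1] / grid[0]

coarse = log_lifetime_grid(100)
fine = log_lifetime_grid(1000)
# A ratio closer to 1 means neighbouring lifetimes are more finely resolved.
ratios = (neighbour_ratio(coarse), neighbour_ratio(fine))
```

On such a grid the relative spacing between neighbouring lifetimes is the same everywhere, so increasing N refines short and long lifetimes alike.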
The characteristic parameters of the broad lifetime distributions have been retrieved with high accuracy. In particular, our MEM procedure has retrieved widths Δτ that represent 10 % of the lifetime with a discrepancy lower than 3 % by using a typical value of 5×10^{4} counts at the peak maximum. This result clearly indicates that the proposed procedure generates MEM lifetime distributions that can be used to quantify the real heterogeneity of lifetimes in a sample without the need for time-consuming acquisition of high-statistics data. Moreover, the analysis of bi-modal distributions has demonstrated that our algorithm resolves a secondary peak even when its relative amplitude is 10 % of the amplitude of the neighbouring peak.
It is also important to highlight that the MEM algorithm proposed in this paper extends the analysis to datasets of up to 4,096 data points, thereby raising the limit currently set at about 1,000. This increases the information that can be extracted from the data and thus improves the accuracy in recovering the fast decay times; in fact, short lifetimes affect the leading edge, which contains only a few points. On the other hand, a larger density (N ≥ 1,000) is also beneficial for longer lifetimes, since their spectrum can be determined with higher resolution.
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.