Parameter estimation in fluorescence recovery after photobleaching: quantitative analysis of protein binding reactions and diffusion

Fluorescence recovery after photobleaching (FRAP) is a common experimental method for investigating rates of molecular redistribution in biological systems. Many mathematical models of FRAP have been developed, the purpose of which is usually the estimation of certain biological parameters such as the diffusivity and chemical reaction rates of a protein, this being accomplished by fitting the model to experimental data. In this article, we consider a two-species reaction–diffusion FRAP model. Using asymptotic analysis, we derive new FRAP recovery curve approximation formulae, and formally re-derive existing ones. On the basis of these formulae, invoking the concept of Fisher information, we predict, in terms of biological and experimental parameters, sufficient conditions to ensure that the values of all model parameters can be estimated from data. We verify our predictions with extensive computational simulations. We also use computational methods to investigate cases in which some or all biological parameters are theoretically inestimable. In these cases, we propose methods which can be used to extract the maximum possible amount of information from the FRAP data.


Fluorescence recovery after photobleaching
The development of live cell fluorescence microscopy has revolutionised molecular cell research. Much modern fluorescence microscopy depends upon the use of the green fluorescent protein (GFP) and its variants. GFP, first isolated from the jellyfish Aequorea victoria, absorbs light in the blue to ultraviolet wavelength range and releases the energy by emitting green light (Tsien 1998). By modifying cells to express a fusion of GFP with a particular target protein (tagging or labelling), researchers are able to study gene expression and protein localisation within the living cell (Giepmans et al. 2006). This is done by illuminating the target cell with light of an appropriate wavelength and detecting the green fluorescent emission. However, observation of a cell at steady state reveals little, if anything, about protein mobility. The small size of proteins (well below the resolution limit of light microscopy) and the typically large number of labelled proteins ($10^4$ to $10^6$) mean that in most experiments it is not possible to follow the movement of individual proteins. Reducing the number of labelled proteins can help to address this problem, but then detecting the fluorescent signal becomes increasingly difficult.
In the 1970s, researchers, mainly Axelrod et al. (2018), began to develop experimental methods to study protein mobility by perturbing the cell under observation. The technique they devised, known as fluorescence recovery after photobleaching (FRAP) (Cone 1972; Poo and Cone 1973; Liebman and Entine 1974; Koppel et al. 1976; Wu et al. 1977), is widely used to this day. Although numerous improvements have been made to the FRAP procedure since it was first introduced, the fundamental idea has not changed (Lippincott-Schwartz et al. 2018).
In typical FRAP experiments, a short sequence of images is acquired prior to photobleaching. These serve to document the initial spatial distribution of the fluorescent molecule. The next step is photobleaching: a small defined region of interest is briefly illuminated with high intensity light, usually delivered by a laser. This triggers an irreversible change in the chemistry of the fluorophore (typically GFP) which causes a permanent loss of fluorescence. This creates a high concentration of photobleached (or simply bleached) protein molecules within the region of interest. Next, the laser intensity is attenuated in order to acquire a longer sequence of images, ideally with minimal photobleaching. During this period the motion of both non-bleached and bleached GFP molecules will lead to the spatial redistribution of the fluorescent signal. Passive transport processes, such as Brownian motion, will create a net transfer of bleached molecules out of (and a net transfer of unbleached molecules into) the region of interest, causing the cell to relax towards equilibrium. This is referred to as the fluorescence recovery. The average intensity of fluorescent emission from the region of interest is recorded against time to construct the fluorescence recovery curve (White and Stelzer 1999; Meyvis et al. 1999; Reits and Neefjes 2001; Carrero et al. 2003).
The earliest FRAP experiments (conventional FRAP) were conducted using a static laser that was attenuated by placing neutral density filters in front of the beam (Jacobson et al. 1976). As fluorescence microscopy proliferated during the 1980s, the confocal scanning laser microscope (CSLM) was developed (Amos and White 2003). This type of microscopy relies on raster scanning a laser beam over an area of interest. Modern FRAP apparatus essentially consists of a CSLM and an acousto-optical modulator which is capable of rapidly varying the intensity of the CSLM laser as it scans across the sample. During image acquisition a low laser intensity is used on the whole field of view, whereas during photobleaching the laser intensity is increased, and the scan area restricted to just the region that is being targeted for bleaching. A static beam can only yield a fluorescence recovery curve, but with modern confocal scanning FRAP almost any desired pattern can be bleached into fluorescent samples at high definition and then imaged (Wedekind et al. 1994).

Quantitative analysis of fluorescence recovery after photobleaching
Quantitative analysis of FRAP data is made possible by mathematical modelling. Many different models have been brought forward, beginning in the 1970s with relatively simple analytical models, based on partial differential equations that are solved (typically under idealised conditions) in order to derive an expression for the recovery curve. By fitting a model to experimental data, estimates of the model parameters are produced. The earliest FRAP models were single-species analytic models of protein transport within cellular membranes, due to diffusion or electrophoresis (Soumpasis 1983). Many different models have since been proposed, including simplified one-dimensional models (Ellenberg et al. 1997; Houtsmuller et al. 1999) and more complicated three-dimensional models (Braeckmans et al. 2003; Braga et al. 2004; Mazza et al. 2008).
The principal disadvantage of analytical modelling is that it is almost always necessary to make simplifying assumptions, for example that the system is homogeneous, that the system is infinitely large or that photobleaching is effectively instantaneous. The last of these assumptions is a problem in confocal scanning FRAP, since photobleaching requires repeated scanning of the region of interest which typically takes several seconds (Kang et al. 2009). This has forced analytical modellers to make phenomenological assumptions about the distribution of fluorescent material immediately after photobleaching (Braga et al. 2007; Kang et al. 2009, 2010). There also exists a variety of computational models that need not make any of these simplifying assumptions, of which there are two main types: continuum models in which a partial differential equation is solved numerically (Beaudouin et al. 2006; Blumenthal et al. 2015; Moraru et al. 2008; Bläßle et al. 2018; Röding et al. 2019), and stochastic approaches that track the diffusion and interactions of individual molecules (Nicolau et al. 2007; Vilaseca et al. 2011; Groeneweg et al. 2014). Computational models have the clear advantage over analytical models that they may incorporate greater complexity, yet have the disadvantage that they may be time-consuming to run (particularly Monte Carlo methods).
One of the most significant developments in the history of FRAP modelling was the introduction of models that incorporate binding kinetics, either to immobile interacting partners within the cell or to partners with different diffusion properties (Kaufman and Jain 1990; Carrero et al. 2003; Sprague et al. 2004; Lin and Othmer 2017; Hinow et al. 2006; Phair et al. 2004; Kang et al. 2010; Braga et al. 2007). Knowledge of kinetic properties may yield important biological conclusions about how proteins function (Mueller et al. 2010). To give an example, Ege et al. established quantitative differences in molecular association and dissociation rates of a regulatory protein, YAP1, as evidence of qualitative biological differences between the normal and cancer-associated variants of fibroblasts (Ege et al. 2018).
While there is much value in FRAP mathematical modelling, various problems remain. First, FRAP studies using different kinetic models have been shown to arrive at very different predictions for the same or similar proteins due to technical issues (rather than genuine biological differences) (Mueller et al. 2010). Secondly, fits to FRAP data are not necessarily unique, which diminishes their usefulness (Sadegh Zadeh et al. 2006). In this article, we will seek to provide additional clarity by deriving mathematical conditions, in terms of model parameters and experimental parameters (such as recording frame rate), which guarantee that all model parameters are theoretically estimable from FRAP data. When this is the case, we will say that the model is tractable.
In Sect. 2 we will introduce the two-species reaction–diffusion model that we will use throughout. In Sect. 3 we will present new analytic FRAP formulae and formally re-derive existing ones using asymptotic methods (derivations may be found in appendix A). Invoking the concept of Fisher information, we will infer sufficient conditions to ensure FRAP model tractability. In Sect. 4 we present the computational methods used to test our theoretical predictions from Sect. 3. Further numerical investigation will indicate the best course of action in cases where the tractability conditions do not hold. Computational results are discussed in Sect. 5. In Sect. 6 we propose a general method to determine when full parameter fitting is possible and when extra measures will be required. Finally, possibilities for future work are considered in Sect. 7.

Mathematical model
We assume that a diffusible protein species, A, associates reversibly with homogeneously distributed binding partners, B, to form complex molecules, C. We also assume that the number of molecules involved is large enough for the law of mass action to be applicable so that, in a well-mixed system, the concentrations of A, B and C evolve according to (1), where [X] denotes the concentration of X.
Prior to the FRAP experiment, protein A is tagged with a fluorescent probe. We also assume that system (1) reaches chemical equilibrium before the experiment begins. Let $u(x, t)$ be the concentration of species A at point $x$ and time $t$ that is fluorescent (not photobleached), and $D_u$ be the diffusivity of A (note that photobleaching is assumed not to alter the diffusivity or reactivity of the molecules). Likewise, let $v(x, t)$ be the concentration of the fluorescent C species and $D_v$ its diffusivity. Similarly, let the concentrations of photobleached A and C be $\bar{u}$ and $\bar{v}$ respectively.

Fig. 1 Schematic representation of model (3). Arrows, which indicate chemical reactions and photobleaching, are labelled with the reaction rates derived from the principle of mass action. Note that the bleached and unbleached concentrations of A and C sum to $u_{eq}$ and $v_{eq}$ respectively, since photobleaching does not perturb the overall chemical equilibrium, only the fluorescence equilibrium
As A is the tagged species, only molecules of A, or molecules which contain A, may be fluorescent. Hence a molecule of C is fluorescent only if it contains a fluorescent A, as B is not tagged. Association of fluorescent A with B will always form fluorescent C, and dissociation of fluorescent C will always release fluorescent A. Both A and C can be photobleached by exposure to high intensity light, which we assume has intensity $I(x, t)$ at position $x$ and time $t$. Making the simplifying assumption (Lorén et al. 2015) that photobleaching is a first order process, the rate of bleaching per unit concentration is $\alpha I(x, t)$, where $\alpha$ is the sensitivity of the fluorescent probe to photobleaching. The resulting system of equations is (2) (see also Fig. 1 for a schematic representation).
It is clear by conservation of mass that $u + \bar{u} = u_{eq}$, a constant (likewise $v + \bar{v} = v_{eq}$). Note that $u_{eq}$ and $v_{eq}$ are the pre-bleach equilibrium concentrations of fluorescent A and C respectively, assuming that all material is fluorescent prior to photobleaching.
where $k_{on}$ denotes the association rate constant of (1) multiplied by the binding-site concentration $[B]$, which is a constant as the concentration of binding sites is not altered by photobleaching. Model (3) has appeared previously in several quantitative FRAP studies, some of which assume the immobility of the binding sites (i.e. $D_v = 0$) (Kaufman and Jain 1990; Sprague et al. 2004; Hinow et al. 2006; Beaudouin et al. 2006; Mueller et al. 2008; Tsibidis 2009), while others allow for the possibility of mobile sites ($D_v > 0$) (Braga et al. 2007; Berkovich et al. 2011; Montero Llopis et al. 2012) (the study of Ras by Kang et al. (2010) is an empirical example of a molecule with a non-zero diffusivity in the bound state).
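For orientation, a reaction–diffusion system consistent with the description above can be written as follows. This is only a sketch assembled from the stated assumptions (diffusion of both species, pseudo-first-order binding at rate $k_{on}$, dissociation at rate $k_{off}$, and first-order bleaching at rate $\alpha I$); the arrangement of terms is an assumption and need not match model (3) term for term.

```latex
\begin{align}
\frac{\partial u}{\partial t} &= D_u \nabla^2 u - k_{on}\, u + k_{off}\, v - \alpha I(x,t)\, u, \\
\frac{\partial v}{\partial t} &= D_v \nabla^2 v + k_{on}\, u - k_{off}\, v - \alpha I(x,t)\, v.
\end{align}
```

Adding the two equations in the absence of bleaching and diffusion gives $\partial (u + v)/\partial t = 0$, which is the conservation of total fluorescence that the text relies on.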
For convenience we measure fluorescence in units such that the total pre-bleach fluorescence is 1 ($u_{eq} + v_{eq} = 1$), which implies that $u_{eq} = k_{off}/(k_{on} + k_{off})$ and $v_{eq} = k_{on}/(k_{on} + k_{off})$. Model (3) includes four physical parameters, the diffusivities $D_u$ and $D_v$ and the reaction rates $k_{on}$ and $k_{off}$. In what follows we will seek to determine the reliability with which these four parameters (or combinations thereof) can be measured experimentally by fitting (3) to simulated synthetic data. We assume the system (3) to be radially symmetric. Let $r = \sqrt{x^2 + y^2}$, and nondimensionalise by rescaling $r$ with $r_n$, the characteristic radius of the bleach region of interest, and $t$ with $1/k_{on}$. The resulting equations (5) (given negligible laser intensity) contain the positive dimensionless parameters $\eta$, $\kappa$ and $\delta$, with $\eta = D_u/(k_{on} r_n^2)$ and $\kappa = k_{off}/k_{on}$. We will consider the initial value problem for (5) on an infinite spatial domain with far field conditions (7) and initial conditions (8), where $H$ is the Heaviside step function. Initial conditions (8) are appropriate if all available material inside the region of interest is bleached instantaneously. We will show that the orders of magnitude of the dimensionless parameters $\eta$, $\kappa$ and $\delta$ control the identifiability of the model parameters, $D_u$, $D_v$, $k_{on}$ and $k_{off}$.
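The dimensionless groups can be evaluated directly from the physical parameters. The following is a minimal Python sketch, assuming $\eta = D_u/(k_{on} r_n^2)$ and $\kappa = k_{off}/k_{on}$ as these symbols are used later in the text, and borrowing the regime thresholds on $\log_{10}\eta$ from the sampling procedure of the computational methodology. The function names, the label for the slow regime and the bleach radius are hypothetical choices made for illustration.

```python
import math

def dimensionless_groups(D_u, k_on, k_off, r_n):
    """Return (eta, kappa, u_eq, v_eq) for the two-species FRAP model.

    Assumed definitions (inferred from how the symbols are used in the text):
      eta   = D_u / (k_on * r_n**2)
      kappa = k_off / k_on
      u_eq  = kappa / (1 + kappa), v_eq = 1 - u_eq  (total fluorescence = 1)
    """
    eta = D_u / (k_on * r_n ** 2)
    kappa = k_off / k_on
    u_eq = kappa / (1.0 + kappa)
    return eta, kappa, u_eq, 1.0 - u_eq

def classify_regime(eta):
    """Label the dynamic regime from log10(eta)."""
    log_eta = math.log10(eta)
    if log_eta >= 1.0:
        return "rapid equilibration"
    if log_eta > -1.0:
        return "intermediate"
    return "slow equilibration"  # assumed label for log10(eta) <= -1

# Parameter values from the Fig. 3 caption; r_n = 0.5 um is a hypothetical choice.
eta, kappa, u_eq, v_eq = dimensionless_groups(D_u=20.0, k_on=2.0, k_off=1.0, r_n=0.5)
regime = classify_regime(eta)
print(eta, kappa, round(u_eq, 3), round(v_eq, 3), regime)
```

Because $\eta$ scales as $1/r_n^2$, changing the bleach radius is the most direct experimental handle on which regime a given protein falls into.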

Inverse modelling problem
The inverse modelling problem is the problem of minimising an appropriate objective function in order to obtain a maximum likelihood estimate for the values of the model parameters. If all model parameters are identifiable given some data, we will refer to the inverse modelling problem as tractable.
In keeping with numerous prior studies (Soumpasis 1983; Sprague et al. 2004; Kang et al. 2009) we define the fluorescence recovery curve, $F(t)$, as the average light intensity of fluorescent emission across the region of interest (ROI), as in (9). To be consistent with the initial conditions (8), the region of interest, which is the area covered by the laser, is assumed to be a circular region of radius $r_n$. Although $F(t)$ in (9) is a continuous function of time, in practice only finitely many data points $F(t_i)$ may be acquired at discrete times, $t_i$. Let $\Delta t = t_{i+1} - t_i$ for all $i$ be the time step, so that $1/\Delta t$ is the frame rate of the imaging process; let $Y_i$ be the fluorescence recovery curve data values at the sample times $t_i$; and $F_i(\theta) = F(t_i; \theta)$, with $\theta$ being the vector of model parameters. We assume that empirical data may be described by the sum of the output of the mathematical model and a stochastic variable such that

$$Y_i = F_i(\theta) + \sigma_i \xi_i,$$

where the $\xi_i$ are normally distributed random variables and the $\sigma_i$ account for the scale of the observational uncertainty. This assumption is appropriate if the model accurately captures the underlying dynamics of the system under investigation and experimental errors are normally distributed. The objective function may be defined as follows

$$\phi(\theta) = \frac{1}{2} \sum_i \left( \frac{F_i(\theta) - Y_i}{\sigma_i} \right)^2,$$

where the factor of 2 is purely for notational convenience. The global minimum of the objective function, $\theta^*$, corresponds to a maximum likelihood estimate of model parameters (White et al. 2016). The identifiability of the model parameters is given by the Fisher Information Matrix (FIM) (Rao 1992; Akaike 1998), which in this case is the Hessian matrix of the objective function, which gives

$$\frac{\partial^2 \phi}{\partial \theta_j \partial \theta_k} = \sum_i \frac{1}{\sigma_i^2} \left[ \frac{\partial F_i}{\partial \theta_j} \frac{\partial F_i}{\partial \theta_k} + (F_i - Y_i) \frac{\partial^2 F_i}{\partial \theta_j \partial \theta_k} \right].$$

If the system is well-described by the model, then $F_i(\theta^*) = Y_i$, so the term involving second derivatives vanishes and the FIM simplifies to

$$\mathrm{FIM}_{jk} = \sum_i \frac{1}{\sigma_i^2} \frac{\partial F_i}{\partial \theta_j} \frac{\partial F_i}{\partial \theta_k}.$$

This relation is exact in our case since our data are synthetic.
If an eigenvector of the FIM has a large corresponding eigenvalue, then the combination of parameters given by the eigenvector is identifiable (Gutenkunst et al. 2007). Models containing a mixture of identifiable and non-identifiable parameters (for which the eigenvalues of the FIM are spread across a logarithmic scale) are said to be sloppy (White et al. 2016). Sloppy models characteristically include certain parameters, or combinations of parameters, in which even substantial variations do not significantly affect the behaviour of the dependent variables. In geometric terms, there is a manifold within the space of model parameters which is a flat minimum of the objective function so that the global minimum cannot be easily located. Numerous sloppy models have been identified within the mathematical biology literature, usually those with large numbers of parameters (Gutenkunst et al. 2007; Daniels et al. 2008; Machta et al. 2013; Transtrum et al. 2015). Even though the FRAP model (3) has only four parameters, we will show that under certain circumstances it may be sloppy in the sense of having a mixture of identifiable and unidentifiable parameters.
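The link between FIM eigenvalues and identifiability can be illustrated numerically. The sketch below builds the simplified FIM (products of first derivatives only) by central finite differences and compares two hypothetical two-parameter toy models, neither of which is the FRAP model (3): one in which both parameters shape the curve, and one in which only their product enters, so that one direction in parameter space is flat. The finite-difference step, the sampling times and all function names are assumptions made for illustration.

```python
import math

def fisher_information(model, theta, times, sigma=1.0, rel_step=1e-6):
    """Simplified FIM: sum_i (dF_i/dtheta_j)(dF_i/dtheta_k) / sigma^2.

    Derivatives are approximated by central finite differences.
    """
    n = len(theta)
    jac = []  # jac[j][i] = dF(t_i)/dtheta_j
    for j in range(n):
        h = rel_step * max(abs(theta[j]), 1.0)
        up = list(theta)
        up[j] += h
        dn = list(theta)
        dn[j] -= h
        jac.append([(model(t, up) - model(t, dn)) / (2.0 * h) for t in times])
    return [[sum(a * b for a, b in zip(jac[j], jac[k])) / sigma ** 2
             for k in range(n)] for j in range(n)]

def eigenvalues_sym_2x2(m):
    """Eigenvalues (small, large) of a symmetric 2x2 matrix, computed stably."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    large = 0.5 * (tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0)))
    return det / large, large

# Toy model 1 (hypothetical): amplitude a and rate b both shape the curve.
def amplitude_rate(t, theta):
    a, b = theta
    return 1.0 - a * math.exp(-b * t)

# Toy model 2 (hypothetical): only the product a*b enters the curve.
def product_only(t, theta):
    a, b = theta
    return 1.0 - math.exp(-a * b * t)

times = [15.0 * i / 1023 for i in range(1024)]  # 1024 samples over 15 s
theta = [0.6, 1.0]
small_1, large_1 = eigenvalues_sym_2x2(fisher_information(amplitude_rate, theta, times))
small_2, large_2 = eigenvalues_sym_2x2(fisher_information(product_only, theta, times))
ratio_identifiable = small_1 / large_1
ratio_sloppy = abs(small_2) / large_2
print(ratio_identifiable, ratio_sloppy)
```

In the second toy model the eigenvalue ratio collapses to numerical noise, which is the two-parameter analogue of the flat manifold described above.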

Asymptotic approximations
In this section, we will handle important special cases of (5) analytically under idealised circumstances (i.e. subject to far-field conditions (7) and initial conditions (8)), paying special attention to the effect of varying $\eta$ and $\kappa$ on parameter identifiability. It is useful at this stage to introduce the function $F_S$ defined in (15), which is the recovery curve of a radially symmetric, single species, pure diffusion FRAP model with Heaviside step function initial conditions (Soumpasis 1983), where $I_0$ and $I_1$ are modified Bessel functions of the first kind. The series expression is due to the asymptotic expansion of Abramowitz and Stegun (1972).
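As a point of reference, the classical Soumpasis (1983) recovery curve for a uniformly bleached disk can be evaluated with the standard library alone. The sketch below uses the textbook form $f(t) = e^{-2\tau/t}\,[I_0(2\tau/t) + I_1(2\tau/t)]$ with $\tau = r_n^2/(4D)$, which rises from 0 to 1. The function $F_S$ used in this article is stated to tend to 1 as its argument tends to 0, so it is presumably normalised differently (for instance as the unrecovered fraction); that relationship is an assumption and is not encoded here. The switch-over point between the power series and the large-argument expansion is an arbitrary choice.

```python
import math

def scaled_bessel_sum(x):
    """Return exp(-x) * (I0(x) + I1(x)) for x >= 0 without overflow."""
    if x <= 25.0:
        # Ascending power series for I0 and I1 (all terms are positive).
        q = 0.25 * x * x
        term0, term1 = 1.0, 0.5 * x
        i0, i1 = term0, term1
        k = 0
        while term0 > 1e-17 * i0:
            k += 1
            term0 *= q / (k * k)
            term1 *= q / (k * (k + 1))
            i0 += term0
            i1 += term1
        return math.exp(-x) * (i0 + i1)
    # Large-argument asymptotic expansion, truncated after four terms.
    total = 0.0
    for nu in (0, 1):
        mu, term, series = 4.0 * nu * nu, 1.0, 1.0
        for k in range(1, 4):
            term *= -(mu - (2 * k - 1) ** 2) / (8.0 * k * x)
            series += term
        total += series
    return total / math.sqrt(2.0 * math.pi * x)

def soumpasis_recovery(t, D, r_n):
    """Classical Soumpasis recovery, rising from 0 at t = 0 towards 1."""
    if t <= 0.0:
        return 0.0
    tau = r_n ** 2 / (4.0 * D)
    return scaled_bessel_sum(2.0 * tau / t)

# With D = 0.25 and r_n = 1 the time scale tau equals 1, so t = 0.88 tau here.
f_half = soumpasis_recovery(0.88, D=0.25, r_n=1.0)
print(round(f_half, 3))
```

The value at $t \approx 0.88\,\tau$ lies close to one half, in line with the familiar half-time rule $D \approx 0.22\, r_n^2 / t_{1/2}$ for pure diffusion.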

Rapid equilibration ($\eta \gg 1$)
In the limit as $\eta \to \infty$, the dynamics of the fluorescence recovery arising from system (5) subject to (7) and (8) admit a small parameter $\varepsilon = 1/\eta \ll 1$. The recovery curve is then well approximated (see appendix A.1) by formula (16). From expansion (15) it is clear that $\lim_{z \to 0} F_S(z) = 1$ and that $\lim_{z \to \infty} F_S(z) = 0$, so, if the slow diffusion time scale is sufficiently long ($r_n^2/D_v \gg 1$) and the time scale of rapid diffusion is sufficiently short ($r_n^2/D_u \ll 1$), then for all $t = O(1)$ (16) reduces to the well known formula (17) (Bulinski et al. 2001; Dundr et al. 2002; Phair et al. 2004; Rabut et al. 2004; Sprague et al. 2004) for 'reaction limited' dynamics, or equivalently, formula (18). Formula (18) suggests prima facie that the dissociation rate $k_{off}$ is the only measurable model parameter, yet this is not necessarily true. The eigenvalues of the Fisher information matrix derived from formula (16) are plotted in Fig. 2. One of the eigenvalues (Fig. 2a) is many orders of magnitude smaller than the other three. On this basis we expect that there will be a manifold within the parameter space which represents a flat minimum of the objective function. This is quite clearly visible in Fig. 3b, d and f, showing that the diffusivity, $D_u$, is inestimable. The fluorescence recovery in this case is bi-phasic; there is an early diffusion-dominated phase which occurs imperceptibly quickly unless the time step, $\Delta t$, is much smaller than the time scale of diffusion across the bleach region of interest. Equivalently, $D_u$ is only estimable if condition (19), $\Delta t \ll r_n^2/D_u$, holds. Returning briefly to Fig. 2a, it is quite clear that the magnitude of the smallest eigenvalue is increased as $\Delta t$ decreases and $r_n$ increases. Interestingly, two other eigenvalues (Fig. 2c, d) visibly decline, for fixed $\Delta t$, as $r_n$ increases. Notwithstanding, it is clear that every model parameter is estimable in this case provided condition (19) holds.

Fig. 2 Logarithm of the eigenvalues, $\log(\lambda_i)$, of the Fisher information matrix computed from formula (16) for different values of $r_n$ and $\Delta t$. Each subplot corresponds to one of the four eigenvalues: a $\lambda_1$, b $\lambda_2$, c $\lambda_3$ and d $\lambda_4$

Fig. 3 Logarithm of the numerically constructed objective function, $\log(\phi)$, in the rapid equilibration (reaction limited) regime. Colour indicates the size of the sum of square errors between a single simulated fluorescence recovery curve spanning 15 seconds (1024 data points) generated with parameter values $D_u = 20.0\,\mu\mathrm{m}^2\,\mathrm{s}^{-1}$, $D_v = 0.00\,\mu\mathrm{m}^2\,\mathrm{s}^{-1}$, $k_{on} = 2.00\,\mathrm{s}^{-1}$, $k_{off} = 1.00\,\mathrm{s}^{-1}$, and a secondary simulated recovery curve generated with indicated parameter values. Each subplot displays variation in one of the six possible pairs of parameters, with the remaining two parameters held at the correct value in each case
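The bi-phasic behaviour described above can be made concrete with a short calculation. The sketch below evaluates the reaction-dominant recovery in the form $F(t) = 1 - v_{eq} e^{-k_{off} t}$ (as given by Sprague et al. 2004, and assumed here to be what the 'reaction limited' formula expresses), and checks whether a given frame interval resolves the early diffusive phase. Reading "much smaller" as "smaller by a factor of 10" is an arbitrary illustrative threshold, and the bleach radii are hypothetical.

```python
import math

def reaction_limited_recovery(t, k_on, k_off):
    """Reaction-dominant recovery: F(t) = 1 - v_eq * exp(-k_off * t).

    v_eq = k_on / (k_on + k_off) is the bound fraction at equilibrium.
    """
    v_eq = k_on / (k_on + k_off)
    return 1.0 - v_eq * math.exp(-k_off * t)

def fast_phase_resolved(dt, r_n, D_u, margin=10.0):
    """Check dt << r_n**2 / D_u, reading '<<' as 'smaller by a factor of margin'.

    The default margin of 10 is an illustrative assumption, not from the text.
    """
    return dt * margin <= r_n ** 2 / D_u

# Fig. 3 settings: 1024 frames over 15 s, D_u = 20 um^2/s, k_on = 2 /s, k_off = 1 /s.
dt = 15.0 / 1023
for r_n in (0.5, 2.0, 5.0):  # hypothetical bleach radii in um
    print(r_n, fast_phase_resolved(dt, r_n, D_u=20.0))

f_start = reaction_limited_recovery(0.0, 2.0, 1.0)
f_end = reaction_limited_recovery(15.0, 2.0, 1.0)
print(f_start, f_end)
```

In this approximation the curve starts at the free fraction $u_{eq}$ rather than at zero, because the diffusive phase has already completed; this is why $D_u$ leaves no trace in the data unless the frame interval is short enough to capture that first phase.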

Intermediate equilibration ($\eta = O(1)$)
No known approximations describe the dynamics of the intermediate case in which $\eta = O(1)$, but this does not mean that it is impossible to analyse. In order to derive formula (17) we introduced an asymptotic expansion in terms of a small parameter $\varepsilon = 1/\eta$ to produce an approximation which holds whenever $\eta$ is large. By extending our asymptotics to include first order terms (see appendix A.2), we are able to produce an approximation which is accurate for somewhat smaller values of $\eta$ and so gives some insight into the behaviour of the system as it approaches $\eta = O(1)$. The first-order extension of formula (17) is formula (20), which holds for $r_n^2/D_v \gg 1$ and $t \gg \varepsilon$. In contrast with formula (18), $D_u$ appears explicitly in (20), which implies that it could be estimated if the other model parameters were known. The full significance of this result will be discussed in Sect. 3.3.
By contrasting Figs. 3 and 4, it can be seen quite clearly how reducing the value of the dimensionless parameter $\eta$ changes the shape of the objective function. In particular, in the subplots that involve $D_u$ (Fig. 4b, d, f), there is a clear unique local minimum of the objective function (in contrast with the manifold in Fig. 3b, d, f) which tends to support the prediction that $D_u$ is estimable when $\eta = O(1)$ (at least when other parameters are known).

Slow equilibration ($\eta \ll 1$)
In the limit $\eta \to 0$, we define the small parameter by $\varepsilon = \eta$. The fluorescence recovery approximation is simply (21), where $D_{eff}$, a straightforward generalisation of the effective diffusivity defined by Crank (1975), is given by (22). As the recovery curve (21) depends only on $D_{eff}$, this is the only estimable combination of parameters. As $\eta \to 0$, a manifold within the parameter space, defined by (22), emerges upon which the value of the objective function is approximately zero. This manifold is clearly visible in any subplot of Fig. 5. It is therefore necessary to independently determine three of the parameters in order to determine the fourth. If the diffusivities, $D_u$ and $D_v$, could be independently determined, then at most the ratio of the reaction rates, $\kappa = k_{off}/k_{on}$, could be estimated.

Fig. 4 Objective function in the intermediate equilibration regime. Each subplot displays variation in one of the six possible pairs of parameters, with the remaining two parameters held at the correct value in each case

Fig. 5 Objective function in the slow equilibration regime. Each subplot displays variation in one of the six possible pairs of parameters, with the remaining two parameters held at the correct value in each case
Like the rapid equilibration case, the slow equilibration recovery is bi-phasic. The early phase consists of a rapid convergence to local chemical equilibrium between the bound and unbound species (v and u respectively) which is imperceptible because it does not alter the total concentration, w.
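The degeneracy in the slow equilibration regime is easy to reproduce numerically. The sketch below assumes that the effective diffusivity is the equilibrium-weighted mean $D_{eff} = u_{eq} D_u + v_{eq} D_v$; this form reduces to Crank's $D_u/(1 + k_{on}/k_{off})$ when $D_v = 0$ and has the limits $D_{eff} \to D_u$ as $\kappa \to \infty$ and $D_{eff} \to D_v$ as $\kappa \to 0$, but it is an assumption rather than a transcription of (22). The two parameter sets are hypothetical.

```python
def effective_diffusivity(D_u, D_v, k_on, k_off):
    """Equilibrium-weighted mean diffusivity (assumed form of D_eff).

    D_eff = u_eq * D_u + v_eq * D_v, with u_eq = k_off / (k_on + k_off).
    """
    u_eq = k_off / (k_on + k_off)
    return u_eq * D_u + (1.0 - u_eq) * D_v

# Two different (hypothetical) parameter sets that share the same D_eff.
set_a = dict(D_u=20.0, D_v=0.0, k_on=3.0, k_off=1.0)
set_b = dict(D_u=10.0, D_v=0.0, k_on=2.0, k_off=2.0)
d_a = effective_diffusivity(**set_a)
d_b = effective_diffusivity(**set_b)
print(d_a, d_b)
```

Any two parameter sets with the same $D_{eff}$ would produce indistinguishable recovery curves in this regime, which is why three of the four parameters must be fixed independently before the fourth can be recovered.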

Asymmetric reaction rates ($\kappa \gg 1$)
If we take $\kappa \to \infty$, we find that $u_{eq} = \kappa/(1 + \kappa) \to 1$, so almost all available material will be in the unbound state, and the system will be closely approximated by pure diffusion with diffusivity $D_u$. Indeed, it is clear from (22) that $D_{eff} \to D_u$ as $\kappa \to \infty$. In effect, the $\kappa \gg 1$ case coincides with the $\eta \ll 1$ case, except that $\kappa = k_{off}/k_{on}$ could not be estimated in the $\kappa \gg 1$ case even if $D_u$ and $D_v$ could be independently measured.

Asymmetric reaction rates ($\kappa \ll 1$)
As $\kappa \to 0$ almost all available molecules are in a bound state, such that the recovery curve can be approximated by pure diffusion with diffusivity $D_v$, and $D_v$ is the only measurable parameter. As $\kappa \to 0$, $D_{eff} \to D_v$, so this case coincides with the $\eta \ll 1$ case, except that $\kappa$ itself is always inestimable.

Parameter identifiability
Here we will summarise the conditions which guarantee parameter identifiability in FRAP modelling. Suppose we have a theoretical recovery curve based on the solution to a mathematical model, $F(t; \theta)$, for parameter values $\theta$, and some recovery curve data $F_{Data}(t)$. We can define the objective function, $\phi(\theta)$, to be the residual sum of squared errors (without the scaling with $\sigma_i$ used in (12)). We have four physical model parameters that are unknown, $\theta = (D_u, D_v, k_{on}, k_{off})$, and two experimental parameters: $r_n$, the radius of the bleach region of interest, and $\Delta t = t_{i+1} - t_i$, the time interval between data points.
Of the cases we have considered in Sect. 3.1, the inverse modelling problem was tractable only in the rapid equilibration regime ($\eta \gg 1$) when condition (19) holds. This means that we require conditions (27) on the physical parameters and conditions (28) on the experimental parameters. These results imply that smaller bleach region radius and higher frame rate data acquisition are generally preferable in principle. However, this is not necessarily practical; $r_n$ cannot be reduced arbitrarily as the resolution of an optical system is limited by diffraction. Although conditions (27) and (28) appear quite specific, we expect that systems in which they are satisfied will be relatively common. For example, many different nuclear proteins have been found to have a high mobility (van Royen et al. 2009; Phair and Misteli 2000). Highly mobile proteins such as those found within the cell nucleus will satisfy condition (27) except in extreme cases of highly transient binding interactions.

Confocal scanning FRAP
As we discussed in Sect. 1, confocal scanning FRAP, unlike conventional FRAP, may yield a detailed recording of an entire cell. In this case, we may attempt to fit the total fluorescence $w(x, t)$, not just the recovery curve $F(t)$. Under the assumption of radial symmetry, let $w(r, t; \theta) = u(r, t; \theta) + v(r, t; \theta)$, for some parameter values, $\theta$, and let $w_{Data}(r, t)$ be some appropriate fluorescence microscopy data. The objective function in this case is defined as in (30). It has already been observed that the process of averaging across the bleach region of interest to compute the recovery curve effectively destroys a significant amount of information (Orlova et al. 2011; Seiffert and Oppermann 2005), so we expect that it will be advantageous to define the objective function as in (30). Here we will derive simple conditions to ensure parameter estimability in confocal scanning FRAP.
Once again, we have four physical parameters, $\theta = (D_u, D_v, k_{on}, k_{off})$, though this time we have three experimental parameters: $\Delta r$, the length scale of a pixel of the micrograph; $\Delta t$, the duration of one frame; and $L$, the length scale of the whole field of view.
We could, in principle, construct a recovery curve of radius $r_n$ so that $\eta = D_u/(k_{on} r_n^2) \gg 1$, provided that $\Delta r < r_n$ (clearly we cannot have a recovery curve radius smaller than one pixel). As we saw in Sect. 3.1.1, we could use this recovery curve to estimate $k_{on}$, $k_{off}$ and $D_v$, but not necessarily $D_u$ except for very high frame rate data. Likewise, we could construct a second recovery curve of radius $r_n$ so that $\eta = D_u/(k_{on} r_n^2) = O(1)$ provided that $L > r_n$. From the results of Sect. 3.1.2 (formula (20)) we know that $D_u$ will be estimable if $\eta = O(1)$ or greater, as long as the other model parameters are known, and this is the case because estimates can be obtained from the first recovery curve. Moreover, there is no theoretical reason to suppose that the two recovery curves would actually be necessary, as the objective function (30) contains information about the redistributive dynamics of the system under investigation on all length scales between $\Delta r$ and $L$. In summary, we expect that the inverse modelling problem of confocal scanning FRAP will be fully tractable as long as conditions (31) and (32) hold, the latter being $D_u/(k_{on} \Delta r^2) \gg 1$ and $D_u/(k_{on} L^2) \lesssim 1$.
There is also an extremely weak implicit constraint on $\Delta t$, that the frame rate is not so low that the fluorescence recovery is totally imperceptible.

Computational methodology
The analysis in Sect. 3 has two limitations. First, it is local to the optimal point and does not reveal anything about the viability of global parameter fitting with general initial guesses that may be far from the global minimum. Secondly, it applies only to the idealised case with step function initial conditions. In this section, we will introduce the computational methods by which we aim to test our theoretical predictions from Sect. 3 and extend our results to the global parameter fitting problem with non-ideal initial conditions. We simulate the FRAP model (3) numerically with the laser profile $I(r, t)$ being given in terms of the Heaviside step function as in (33). We impose zero-flux boundary conditions (34) on a disk for $u$, and likewise for $v$. The radially symmetric Laplacian is

$$\nabla^2 u = \begin{cases} \dfrac{\partial^2 u}{\partial r^2} + \dfrac{1}{r} \dfrac{\partial u}{\partial r}, & r > 0, \\ 2\, \dfrac{\partial^2 u}{\partial r^2}, & r = 0, \end{cases} \qquad (35)$$

where the result at $r = 0$ is a consequence of l'Hôpital's rule. Using a central difference approximation of the Laplacian (35) we produce a semi-discretised approximation to (3) to which we apply a stiff ODE solver (MATLAB's ode15s function) to obtain numerical solutions $u_{Data}(r_j, t_i)$, $v_{Data}(r_j, t_i)$, which represent the mobile and bound fluorescent fractions at position $r_j$ and time $t_i$. Then the total fluorescence is given by (36) and the fluorescence recovery curve by (37). We will allow for simultaneous fitting of multiple instances of a fluorescence recovery generated using different bleach region radii. Let each of these instances be indexed by a number, $k = 1, \ldots, n_{exp}$, then let $r_n^k$ be the nominal bleach region radius used in experiment $k$, $w^k_{Data}$ the total fluorescence and $F^k_{Data}$ the fluorescence recovery curve (note the superscript $k$ does not mean 'raised to the power of $k$'). We will attempt to fit generated model solutions to synthetic data simulated using known parameter values to ascertain the accuracy of the parameter fitting in various cases.
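The spatial discretisation described above can be sketched in a few lines. The following Python function applies a central-difference approximation of the radially symmetric Laplacian on a uniform grid $r_j = jh$, using the l'Hôpital limit at the origin and a mirrored ghost node for the zero-flux condition at the outer edge. It is a minimal illustration of the stencil only, under the assumption of a uniform grid; the grid spacing and the test field are hypothetical, and the time integration (done with MATLAB's ode15s in the text) is not reproduced.

```python
def radial_laplacian(u, h):
    """Central-difference Laplacian of a radially symmetric field on r_j = j*h.

    Interior nodes: u_rr + u_r / r.
    Origin: 2 * u_rr (l'Hopital), using the symmetry u(-h) = u(h).
    Outer edge: mirrored ghost node, which imposes zero flux (u_r = 0).
    """
    n = len(u)
    lap = [0.0] * n
    lap[0] = 4.0 * (u[1] - u[0]) / h ** 2
    for j in range(1, n - 1):
        second = (u[j + 1] - 2.0 * u[j] + u[j - 1]) / h ** 2
        first = (u[j + 1] - u[j - 1]) / (2.0 * h)
        lap[j] = second + first / (j * h)
    lap[n - 1] = 2.0 * (u[n - 2] - u[n - 1]) / h ** 2
    return lap

# Check on u = r**2, whose exact radial Laplacian is 4 everywhere.
h = 0.1
u = [(j * h) ** 2 for j in range(11)]
lap = radial_laplacian(u, h)
print(lap[:3])
```

The last node is excluded from the check because $u = r^2$ does not satisfy the zero-flux condition; for a field that does, such as a constant, the stencil returns zero at every node.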
For each instance $k$, with bleach region radius $r_n^k$, we solve (3) numerically to obtain $u^k(r_j, t_i)$, $v^k(r_j, t_i)$. We define the total fluorescence $w^k(r_j, t_i)$ and the fluorescence recovery curve $F^k(t_i)$ as in (36) and (37) respectively. We define the objective functions, $\phi$ and $\phi_{Space}$, as the residual sums of squared errors in the recovery curves $F^k$ and in the total fluorescence $w^k$ respectively, summed over all instances $k$, whose minima we attempt to find with the Nelder-Mead downhill simplex algorithm (Nelder and Mead 1965; Olsson and Nelson 1975) (using the fminsearch function of MATLAB). Since we know the values of the parameters used to generate $w_{Data}$, we can easily measure the accuracy of the fitting procedure. Let $\theta_l$ be any model parameter ($D_u$, $D_v$, $k_{on}$, $k_{off}$) used to generate $w_{Data}$, and $\hat{\theta}_l^k$ be the fitting procedure output in run $k$; then the proportional estimation error is $\mu_l^k = |\hat{\theta}_l^k - \theta_l| / \theta_l$, where once more the superscript $k$ is an index, not a power. The mean error is then simply $\mu_l = \frac{1}{n_{run}} \sum_{k=1}^{n_{run}} \mu_l^k$. In each case we take $n_{run} = 1024$. In this way we are able to condense information about the accuracy of parameter estimation in the various dynamic regimes (rapid equilibration, intermediate and so on) into a single variable. However, since $k_{on}$ and $k_{off}$ may span several orders of magnitude, the mean error $\mu_l$ may be biased by a small number of extreme outlying results. For this reason it may also be of interest to record the number of instances (indexed by $k$), $n_l(\mu)$, which returned values $\hat{\theta}_l^k$ such that $\mu_l^k < \mu$. Then we may define $f_l(\mu) = n_l(\mu) / n_{run}$, where $\mu$ has a chosen value. For example, if $\mu = 0.01$, then $f_l(\mu)$ would be the fraction of instances which returned estimation errors of less than 1%. It is necessary to produce samples of parameter combinations which are used as inputs in generating $w_{Data}$, which is done semi-randomly as follows:
1. Generate a uniformly distributed positive random value for $D_u$. We set $D_u \le 50\,\mu\mathrm{m}^2\,\mathrm{s}^{-1}$ to keep the diffusivity in a biologically realistic range (Kang et al. 2009).
2. Pick a random real number $\hat{\eta} \in [-3, 3]$ from a uniform distribution, and set the dimensionless parameter $\eta = 10^{\hat{\eta}}$. If $\hat{\eta} \ge 1$, we consider the dynamics to be 'rapid equilibration'. If $-1 < \hat{\eta} < 1$ we consider the dynamics to be 'intermediate'.
Initial guesses are generated in two different ways. First, for the data recorded in Table 1, each initial guess $\hat\theta_l^k$ is obtained from the correct parameter value using formula (43), in which $p \in [0, 0.5]$ is a uniform random variable. This ensures that each initial guess is within 50% of the correct parameter value, and is done mainly to test the predictions in Sect. 3. Second, steps 1-7 were repeated to generate more general random initial guesses (these data are recorded in Table 2).
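The two accuracy measures used in this section, the mean proportional error and the fraction of runs below an error threshold, can be illustrated with a short sketch. The function names and the four sample estimates below are hypothetical and chosen only to show why both measures are reported: a single outlying fit dominates the mean, while the fraction within 1% is unaffected by how far out the outlier lies.

```python
def proportional_error(estimate, truth):
    """Proportional estimation error |estimate - truth| / |truth|."""
    return abs(estimate - truth) / abs(truth)


def mean_error(errors):
    """Mean of the proportional errors over all fitting runs."""
    return sum(errors) / len(errors)


def fraction_within(errors, threshold):
    """Fraction of runs whose proportional error is below the threshold."""
    return sum(1 for e in errors if e < threshold) / len(errors)


# Illustrative values only: three accurate fits and one outlier for a parameter whose true value is 1.
truth = 1.0
estimates = [1.001, 0.998, 1.005, 6.0]
errors = [proportional_error(est, truth) for est in estimates]

mu = mean_error(errors)             # dominated by the single outlier
f_1pc = fraction_within(errors, 0.01)  # three of the four runs are within 1%
```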

Computational results
We begin by considering the rapid equilibration case, in which $\eta \gg 1$. On the basis of the analysis in Sect. 3.1.1, we predicted that conventional recovery curve analysis would generally be sufficient to estimate the slow diffusivity, $D_v$, and the reaction rates, $k_{on}$ and $k_{off}$. Furthermore, the fast diffusivity, $D_u$, could be estimated given data of sufficiently high frame rate. Numerical results (see Table 1) confirm this prediction. We also predicted that the use of spatial data in confocal scanning FRAP would enable the estimation of all of the model parameters, even at a relatively low frame rate, and again our simulated data support this prediction. The use of spatial data offers a significant improvement over recovery curves alone. On the basis of Table 2, we expect that all four model parameters can be reliably estimated in the $\eta \gg 1$ case by fitting the model to three spatially dynamic fluorescence recoveries with different bleach region radii. We found that this process returned parameter estimates accurate to within 1% of the correct values in at least 92% of instances, given initial guesses that were also in the $\eta \gg 1$ regime but otherwise uncontrolled.

In the intermediate case ($\eta = O(1)$), we were unable to establish in Sect. 3.1.2 that parameter estimation would be possible unless some parameter values could be determined independently. Numerical results (Table 1) confirm that it is not possible to obtain accurate parameter estimates in most cases, even with high frame rate spatial data. Interestingly, however, we consistently found that the effective diffusivity $D_{eff}$ was strongly estimable, which suggests that in practice the intermediate fluorescence recovery ($\eta = O(1)$) resembles the effective diffusion recovery ($\eta \ll 1$) quite closely. On the basis of the constraints (32), we expect that improving the resolution of spatially dynamic data would improve parameter estimation by increasing the value of $\eta$.
However, since this is not necessarily practical, we also investigated the utility of independently estimating certain parameters, as previous studies have found that fitting multiple fluorescence recoveries with different sized bleach regions (González-Pérez et al. 2011) or independently determining certain model parameters (Sadegh Zadeh et al. 2006) may be beneficial. We therefore investigated the possibility of fitting the reaction rates to data while fixing the diffusivities at independently determined values. We found (Table 2) that this method can be used to produce highly accurate estimates of both $k_{on}$ and $k_{off}$. Supplied with correct values for $D_u$ and $D_v$, and three fluorescence recoveries with different sized bleach regions, we were able to obtain estimates of $k_{on}$ and $k_{off}$ accurate to within 1% in 100% of instances.
In Sect. 3.1.3 we predicted that in the $\eta \ll 1$ case it will not be possible to identify individual parameter values, only to show that they lie within a manifold defined by

$$\frac{k_{off} D_u + k_{on} D_v}{k_{on} + k_{off}} = D_{eff}$$

for constant $D_{eff}$. We found that it is possible to estimate the effective diffusivity $D_{eff}$ accurately, but not any of the parameters individually (see Table 1). In accordance with our predictions, we did not find that increasing the frame rate or using spatial data, unless of extremely high resolution, could improve this (Table 2). As in the $\eta = O(1)$ case, it will be necessary to estimate the diffusivities $D_u$ and $D_v$ separately; however, unlike the $\eta = O(1)$ case, this does not enable us to estimate the reaction rates, $k_{on}$ and $k_{off}$, only the ratio $\kappa = k_{off}/k_{on}$. The estimation accuracy of $\kappa$ depends closely upon the estimation accuracy of $D_u$ and $D_v$, with the relationship between them being

$$\kappa = \frac{D_{eff} - D_v}{D_u - D_{eff}}. \qquad (45)$$

Table 1 (end of caption; the beginning is missing) ... with an average error of 2.2% and 93% of instances returned an error of less than 1%. The initial guesses were generated randomly as outlined in Sect. 4, formula (43) of the main text, so that each initial guess was within 50% of the correct parameter value.

Table 2 (caption) The word 'spatial' in the 'special conditions' column indicates that the fit was performed using fully spatially dynamic data as in confocal scanning FRAP; otherwise the fit was performed using recovery curves. Likewise, 'fast frame rate' indicates that the time resolution was $\Delta t = 10^{-4}$ s; otherwise $\Delta t = 10^{-2}$ s. The first number in each cell is $f_l(0.01)$, defined in (42), that is, the fraction of cases which returned results accurate to within 1% of the correct value. The second number, in brackets, is $\mu_l$, defined in (41), which is the average parameter estimation error. For example, for synthetic FRAP data in the $\eta \gg 1$ regime, in the first row $D_u$ was estimated with an average error of 14% and 92% of instances returned an error of less than 1%. The initial guesses were generated randomly as outlined in Sect. 4, so that only the regime (e.g. $\eta \gg 1$) was known.

Finally, we considered the case of asymmetric reaction rates, $\kappa \gg 1$ and $\kappa \ll 1$. As predicted in Sects. 3.1.4 and 3.1.5, only $D_u$ and $D_v$ respectively are measurable in these cases. Increasing the frame rate, fitting multiple fluorescence recoveries, and using spatial data, regardless of resolution, are not beneficial (Table 1).
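The dependence of the estimated ratio $\kappa$ on the accuracy of the diffusivities can be made concrete with a small worked example. The sketch below assumes that the effective diffusivity takes the local-equilibrium form $D_{eff} = (\kappa D_u + D_v)/(1 + \kappa)$, which follows from the slow equilibration derivation in the appendix, and inverts it for $\kappa$. The function names and numerical values are illustrative assumptions, not taken from the tables.

```python
def effective_diffusivity(D_u, D_v, kappa):
    """Effective diffusivity at local chemical equilibrium (assumed form)."""
    return (kappa * D_u + D_v) / (1.0 + kappa)


def kappa_from_deff(D_eff, D_u, D_v):
    """Invert the effective diffusivity relation for kappa = k_off / k_on."""
    if not (D_v < D_eff < D_u):
        raise ValueError("D_eff must lie strictly between D_v and D_u")
    return (D_eff - D_v) / (D_u - D_eff)


# Illustrative values: D_u = 8, D_v = 1 (um^2/s), true kappa = 0.1.
D_u, D_v, kappa_true = 8.0, 1.0, 0.1
D_eff = effective_diffusivity(D_u, D_v, kappa_true)

kappa_exact = kappa_from_deff(D_eff, D_u, D_v)          # recovers the true ratio
kappa_biased = kappa_from_deff(D_eff, D_u, 1.05 * D_v)  # D_v overestimated by 5%
relative_bias = abs(kappa_biased - kappa_true) / kappa_true
```

In this example a 5% error in $D_v$ produces a larger relative error in $\kappa$, because $D_{eff}$ lies close to $D_v$ when $\kappa$ is small and the numerator $D_{eff} - D_v$ is then sensitive to small changes in $D_v$.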
Our summarised results are as follows:
• When $\eta \gg 1$, all model parameters can be estimated. This may be possible with conventional analysis of recovery curves, but it is more reliable to fit a spatially dynamic model to confocal FRAP data.
• When $\eta = O(1)$, $k_{on}$ and $k_{off}$ can be estimated. To do this, it is necessary to conduct separate experiments in order to measure $D_u$ and $D_v$ accurately.
• When $\eta \ll 1$, the ratio $k_{off}/k_{on}$ can be estimated. As in the previous case, it is necessary to conduct separate experiments in order to measure $D_u$ and $D_v$ accurately.
• When $k_{off}/k_{on} \gg 1$ or $k_{off}/k_{on} \ll 1$, it is only possible to measure $D_u$ or $D_v$ respectively. There is no experiment which could reliably determine $k_{on}$, $k_{off}$ or the ratio of the two.

Regime identification
We have so far determined that the reliability and accuracy of parameter estimation are determined by the parameter regime of the data. However, one does not automatically know the regime of experimental data. The objective of this section is therefore to determine the precise boundary between the regimes and propose a method to determine the regime of arbitrary FRAP data. To this end, we ran numerical experiments in which we attempted fitting on synthetic data with procedurally generated parameter inputs, as described in Sect. 4, but precisely controlling the values of the dimensionless quantities, η and κ. We consider η in Sect. 6.1 and κ in Sect. 6.2.

The effect of varying $\eta$
In order to locate the boundary between the regimes we ran a sample of parameter fitting experiments with $\eta$ values in a set of intervals, $\eta \in [10^{\tilde\eta}, 10^{\tilde\eta + 0.1}]$ with $\tilde\eta \in \{\dots, -0.2, -0.1, 0, 0.1, 0.2, \dots\}$, and recorded the fraction of output parameter estimates with an error of less than 1% relative to the corresponding correct parameter input value ($f_l(0.01)$ as defined in (42)).
Results (Fig. 6) indicated that, as expected, the reliability of the fit generally increased with the value of $\eta$. When $D_u$ and $D_v$ were known, fits of the reaction rates $k_{on}$ and $k_{off}$ were consistently accurate for $\eta > 10^{0.4} \approx 2.51$ (Fig. 6a). Fitting all four parameters reliably, however, required $\eta > 10^{1.7} \approx 50.1$ (Fig. 6b).

Fig. 6 (a) Fraction of parameter estimates within 1% of the correct value ($f_l(0.01)$ as defined in (42)) when attempting to fit just $k_{on}$ and $k_{off}$. Each data point is calculated from $n_{run} = 128$ instances of fitting with $\eta \in [\eta_{min}, \eta_{max}]$, where $\eta_{min}$ is indicated and $\log_{10}(\eta_{max}) = \log_{10}(\eta_{min}) + 0.1$. (b) Identical to (a), except attempting to fit $D_u$, $D_v$, $k_{on}$ and $k_{off}$. (c) Residual sum of squared errors between a simulated fluorescence recovery curve and recovery curves computed by the rapid equilibration formula (16) ($\phi_R$) and the slow equilibration (effective diffusion) formula (21) ($\phi_D$). Parameters were $D_u = 30\ \mu\mathrm{m}^2\,\mathrm{s}^{-1}$, $D_v = 0.01\ \mu\mathrm{m}^2\,\mathrm{s}^{-1}$, $r_n = 0.5\ \mu\mathrm{m}$, $k_{on} = k_{off} = D_u/(r_n^2 \eta)$ for variable $\eta$. The rapid equilibration error, $\phi_R$, decreases as $\eta$ increases, while the effective diffusion error, $\phi_D$, increases.

We would expect that, in the regime where accurate estimation of all model parameters is possible, the rapid equilibration formula (16) ought to approximate the recovery curve well. Accordingly, it is clear in Fig. 6c that the error between formula (16) and simulated data, $\phi_R$, decreases as $\eta$ increases, and is negligible for $\eta > 10^{1.7}$. Similarly, we would expect the slow equilibration (effective diffusion) formula (21) to be a good approximation where estimation of $k_{on}$ and $k_{off}$ is not possible. Although the error, $\phi_D$, decreases as $\eta$ decreases, as we would expect, it does not appear that the effective diffusion formula is a good approximation when $\eta = 10^{0.4}$.
This suggests that for $\eta \approx 1$, neither the effective diffusivity $D_{eff}$, nor $k_{on}$ and $k_{off}$ individually, are estimable with total accuracy.

Table 3 (caption) The leftmost column indicates the regime of the data: rapid equilibration (RE), intermediate (I) and slow equilibration (SE). Each cell displays the percentage of cases in which the AIC in the regime indicated by the column was greater than the AIC in the correct regime indicated by the row.

We can place data into one of three regimes: rapid equilibration ($\eta > 10^{1.7}$), intermediate ($10^{0.4} \le \eta \le 10^{1.7}$), and slow equilibration ($\eta < 10^{0.4}$). If the regime can be determined, then the required course of action is clear: in the rapid equilibration regime full parameter fitting is possible; in the intermediate regime the reaction rates can be estimated after separate experiments to determine the diffusivities have been conducted; and in the slow equilibration regime at most the ratio of the reaction rates can be estimated.
We propose that the regime can be identified by attempting separate fits which are restricted to particular regimes. The best fit corresponds to the correct regime of the data. We measured goodness-of-fit with the Akaike information criterion (Akaike 1998), which is computed from $N_{param}$, the number of model parameters, $N_{data}$, the number of data points, and $\phi$, the objective function (the residual sum of squared errors). The model with the smallest AIC is in general the best fit with the least degree of over-fitting. We tested procedurally generated data by fitting in the three major model regimes, as well as by fitting with a pure diffusion model. The restricted parameter estimation was implemented using MATLAB's constrained optimisation algorithm, fmincon. We define $AIC_{RE}$ as the AIC resulting from a model fit which is limited to the rapid equilibration regime, while $AIC_I$ and $AIC_{SE}$ are defined likewise for the intermediate regime and the slow equilibration regime respectively. $AIC_D$ is the AIC of the pure diffusion model fit. Note that $N_{param} = 4$ for $AIC_{RE}$, $AIC_I$ and $AIC_{SE}$, while $N_{param} = 1$ for $AIC_D$. For this reason, the pure diffusion model yields a lower AIC than the full reaction-diffusion model in cases where the residual sums of squared errors, $\phi$, are equal.
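The model selection step can be sketched as below. The exact AIC expression used for the reported results is not reproduced in this section, so the sketch assumes the common least-squares form $AIC = N_{data}\ln(\phi/N_{data}) + 2N_{param}$; the function names and the residual values are hypothetical.

```python
import math


def aic_least_squares(rss, n_data, n_param):
    """AIC for a least-squares fit (assumed form): n_data * ln(rss / n_data) + 2 * n_param."""
    return n_data * math.log(rss / n_data) + 2 * n_param


def best_model(fits, n_data):
    """Return the name of the fit with the smallest AIC.

    fits maps a model name to a pair (residual sum of squares, number of parameters).
    """
    scores = {name: aic_least_squares(rss, n_data, k) for name, (rss, k) in fits.items()}
    return min(scores, key=scores.get)


# Equal residuals: the one-parameter pure diffusion model is preferred.
tie = best_model({"RE": (0.5, 4), "D": (0.5, 1)}, n_data=100)

# A much smaller residual outweighs the penalty for the three extra parameters.
clear = best_model({"RE": (0.01, 4), "D": (0.5, 1)}, n_data=100)
```

Under this form, each additional parameter must reduce the residual sum of squares by a factor of at least $e^{2/N_{data}}$ to be worthwhile, which is why the pure diffusion model wins whenever the residuals are equal.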
Results in Table 3 indicate that, for both intermediate and rapid equilibration, constrained fitting in the correct regime produced the best fit in all cases, which strongly supports our contention that this method can be used for regime identification. In a minority of cases, the pure diffusion model provided a better fit than the full model in the slow equilibration regime, hence a slow equilibration recovery cannot reliably be distinguished from a purely diffusive recovery. This is to be expected, as the fluorescence recovery in the slow equilibration regime tends to resemble pure diffusion with effective diffusivity $D_{eff}$.

Fig. 7 (a) Fraction of parameter estimates within 1% of the correct value ($f_l(0.01)$ as defined in (42)) when attempting to fit $D_u$, $D_v$, $k_{on}$ and $k_{off}$. Each data point is calculated from $n_{run} = 128$ instances of fitting with $\kappa \in [\kappa_{min}, \kappa_{max}]$, where $\kappa_{min}$ is indicated and $\log_{10}(\kappa_{max}) = \log_{10}(\kappa_{min}) + 0.1$. (b) Similar to (a), except over a different range of values of $\kappa_{min}$. (c) Residual sum of squared errors between a simulated fluorescence recovery curve and recovery curves computed by the pure diffusion formula (23) with diffusivities $D_u$ ($\phi_u$) and $D_v$ ($\phi_v$). Parameters were $D_u = 8\ \mu\mathrm{m}^2\,\mathrm{s}^{-1}$, $D_v = 1\ \mu\mathrm{m}^2\,\mathrm{s}^{-1}$, $r_n = 0.5\ \mu\mathrm{m}$, $k_{on} = 1\ \mathrm{s}^{-1}$ and $k_{off} = \kappa k_{on}$ for variable $\kappa$. The goodness-of-fit of pure diffusion with diffusivity $D_u$ improves as $\kappa$ increases, while the fit with diffusivity $D_v$ improves as $\kappa$ decreases.

The effect of varying $\kappa$
As with $\eta$, we began investigating the effect of varying $\kappa$ on parameter estimation by locating the boundary between the regimes. Computational results (Fig. 7a, b) indicate that parameter estimation deteriorates the further $\kappa$ deviates from 1 in either direction. We found that $10^{-0.9} < \kappa < 10^{0.53}$ ensured reliably accurate estimation of all four model parameters.
As $\kappa \to \infty$, the system asymptotically approaches a pure diffusion recovery with diffusivity $D_u$, and likewise for $D_v$ as $\kappa \to 0$. Yet the pure diffusion model with the appropriate diffusivity is a better approximation for $\kappa = 10^{-0.9}$ than for $\kappa = 10^{0.53}$ (Fig. 7c). We believe that this asymmetry can be explained as follows. Since $D_u > D_v$, the diffusive recovery with diffusivity $D_u$ is faster, hence there are comparatively fewer data points available while the fluorescence recovery is in progress, ultimately leading to less accurate parameter estimates.
Next, we tested whether constrained fitting can identify the magnitude of $\kappa$, as it can for $\eta$ in Sect. 6.1. Again, we computed the Akaike information criterion of various fits limited to different regimes: $AIC_U$ for $\kappa > 10^{0.53}$, $AIC_V$ for $\kappa < 10^{-0.9}$, and $AIC_D$ for the pure diffusion model. For fits where $\kappa$ was of intermediate magnitude ($10^{-0.9} < \kappa < 10^{0.53}$), we also imposed $\eta > 10^{1.7}$ (i.e. the rapid equilibration regime considered in Sect. 6.1). We made this imposition because rapid equilibration is the sole regime in which full parameter estimation is possible, so identifying it is the most important problem.
Results (Table 4) clearly indicate that the $\kappa \gg 1$ and $\kappa \ll 1$ regimes cannot always be distinguished from one another, nor can they always be distinguished from pure diffusion; however, this is unavoidable as both regimes are approximately diffusive.
For rapid equilibration data, the fit constrained to the rapid equilibration regime gave the best fit in all cases, which encouragingly suggests that this regime can be identified. On the other hand, for $\kappa \ll 1$ data, the fit constrained to the rapid equilibration regime gave a better fit in 11.7% of cases. Judging by goodness-of-fit alone, we would erroneously conclude that these data were rapid equilibration, leading to potentially wildly inaccurate parameter estimates. However, in all of these instances we had $\hat\kappa \le 10^{-0.9} + 10^{-3}$, where $\hat\kappa$ is the estimated value of $\kappa$. The algorithm clearly converged towards a point which was as close as possible to the $\kappa \ll 1$ regime (the correct regime). We therefore imposed the additional rule that a regime is not considered viable if the constrained fit in that regime yields parameter estimates at the boundary between regimes. With the addition of this rule, in all of our numerical tests we were able to identify the rapid equilibration regime without any false positives or false negatives.
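The boundary rule described above can be sketched as a filter applied before comparing AIC values. This is an illustrative sketch only: the function names, the dictionary layout and the AIC values are hypothetical, and the absolute tolerance of $10^{-3}$ is an assumption motivated by the observation that the spurious fits all had $\hat\kappa$ within $10^{-3}$ of the lower bound $10^{-0.9}$.

```python
def at_boundary(estimate, lower, upper, tol=1e-3):
    """True if a constrained estimate sits within tol of either bound (None = unbounded)."""
    near_lower = lower is not None and estimate <= lower + tol
    near_upper = upper is not None and estimate >= upper - tol
    return near_lower or near_upper


def identify_regime(candidates, tol=1e-3):
    """Pick the smallest-AIC regime among fits whose kappa estimate is not at a boundary.

    Each candidate is a dict with keys 'name', 'aic', 'kappa_hat' and 'bounds' (lower, upper).
    Returns None if every constrained fit ended at a boundary.
    """
    viable = [c for c in candidates if not at_boundary(c["kappa_hat"], *c["bounds"], tol=tol)]
    if not viable:
        return None
    return min(viable, key=lambda c: c["aic"])["name"]


LOWER, UPPER = 10 ** -0.9, 10 ** 0.53

# Hypothetical fits to data from the small-kappa regime: the rapid equilibration
# fit has the lower AIC, but its estimate is pinned to the lower bound.
fits = [
    {"name": "RE", "aic": -900.0, "kappa_hat": LOWER + 5e-4, "bounds": (LOWER, UPPER)},
    {"name": "V", "aic": -880.0, "kappa_hat": 0.02, "bounds": (None, LOWER)},
]
regime = identify_regime(fits)
```

Without the filter the rapid equilibration fit would be selected on AIC alone; with it, the fit pinned to the boundary is discarded and the small-$\kappa$ regime is returned.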
It is worth noting that, even though the fluorescence recovery approximates pure diffusion as $\kappa \to \infty$ or $\kappa \to 0$, the $\kappa \gg 1$ and $\kappa \ll 1$ regimes could not be reliably identified with model selection alone. For $\kappa \gg 1$ and $\kappa \ll 1$, we found that $AIC_{RE} < AIC_D$ in 63.3% and 74.2% of cases respectively. In other words, the reaction-diffusion model produced a better fit than the pure diffusion model in the majority of cases. It is clear, then, that constrained fitting of the reaction-diffusion model is essential for the purposes of regime identification.

The diffusive regimes, $\eta \ll 1$, $\kappa \gg 1$ and $\kappa \ll 1$
Although the $\eta \gg 1$ and $\eta = O(1)$ regimes can be identified, the $\kappa \gg 1$, $\kappa \ll 1$ and $\eta \ll 1$ regimes cannot be differentiated from one another by model selection, as they all somewhat resemble diffusive recoveries. However, this is not a problem, as these regimes can easily be identified by other means. Suppose that $D$ is the optimal diffusivity obtained from fitting the pure diffusion model to data. If $D \approx D_u$ then $\kappa \gg 1$ and $v_{eq} \approx 0$, while if $D \approx D_v$ then $\kappa \ll 1$ and $v_{eq} \approx 1$. If it is clear that $D_v < D < D_u$, then $D = D_{eff}$ and $\kappa$ can be calculated using formula (45). In summary, it is always possible, in principle, to determine the parameter regime of given FRAP data and, by extension, which parameters are estimable and under what circumstances.

Table 4 Regime identification with constrained parameter fitting, tested on a sample of 128 synthetic experiments for each regime
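The test for the diffusive regimes described in this section can be written as a simple decision rule. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the 5% relative tolerance used to decide whether $D \approx D_u$ or $D \approx D_v$ is an arbitrary choice, and $\kappa$ is computed by inverting the local-equilibrium form $D_{eff} = (\kappa D_u + D_v)/(1 + \kappa)$. It presumes that $D_u$ and $D_v$ are known from separate experiments.

```python
def classify_diffusive(D, D_u, D_v, rel_tol=0.05):
    """Classify a pure-diffusion fit D against independently known D_u > D_v.

    Returns (label, kappa), where kappa is None unless D lies clearly between D_v and D_u.
    The tolerance rel_tol is an illustrative assumption.
    """
    if abs(D - D_u) <= rel_tol * D_u:
        return ("kappa >> 1", None)   # recovery dominated by the mobile species
    if abs(D - D_v) <= rel_tol * D_v:
        return ("kappa << 1", None)   # recovery dominated by the bound species
    if D_v < D < D_u:
        kappa = (D - D_v) / (D_u - D)  # D is interpreted as D_eff
        return ("effective diffusion", kappa)
    return ("unclassified", None)


label_fast, _ = classify_diffusive(7.9, D_u=8.0, D_v=1.0)
label_slow, _ = classify_diffusive(1.02, D_u=8.0, D_v=1.0)
label_mid, kappa_mid = classify_diffusive(4.5, D_u=8.0, D_v=1.0)
```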

Discussion
The application of mathematical modelling to FRAP can improve the understanding of biological systems by enabling researchers to extract quantitative binding information from fluorescence microscopy data. In this article we investigated the feasibility of obtaining such information. On the basis of approximations derived using formal asymptotic methods, we theoretically predicted, in terms of biological and experimental parameters, the conditions under which a FRAP inverse modelling problem (the problem of determining parameter values from data) is tractable. We found that, in all cases, the inverse modelling problem is tractable only if a necessary condition on the model parameters holds. For conventional FRAP recovery curve analysis we predicted sufficient conditions for tractability that involve $\Delta t$, the temporal resolution of the data, and $r_n$, the radius of the bleach region. Since many modern FRAP experiments are carried out using confocal scanning laser microscopy, we also considered the use of spatial information in FRAP fitting, and derived sufficient conditions for tractability that involve $\Delta r$, the length scale of a single pixel, and $L$, the length scale of the whole of the imaged region. Whenever the rates of molecular association and dissociation are of comparable order, all FRAP model parameters may be inferred from either conventional FRAP or confocal scanning FRAP data of sufficient temporal and/or spatial resolution. We expect that this will be the case in many circumstances, but not universally. We found (Sect. 5) that when the tractability conditions are not met, it is still possible to estimate the reaction rates $k_{on}$ and $k_{off}$, or at the very least the ratio $k_{off}/k_{on}$, by estimating the diffusivities $D_u$ and $D_v$ independently. We also proposed simple tests to determine when full parameter fitting is possible and when separate experiments will be required.
Despite the large number of quantitative FRAP studies which have been published, in practice researchers have often preferred to fit recovery curves with a simple exponential formula, even in cases where pure diffusion is likely the best model of the system under investigation (Taylor et al. 2019). Even in rapid equilibration reaction-diffusion systems with $D_v = 0$, where the exponential formula is appropriate, its use is nevertheless an under-utilisation of data, as it yields only an estimate of the dissociation rate, $k_{off}$, where estimates of the association rate, $k_{on}$, and the diffusivity, $D_u$, are also possible. Moreover, the exponential formula is not really applicable to a diffusion-based recovery, and it must be noted that inappropriate model choice may lead to inaccurate parameter estimates and incorrect conclusions (Sprague et al. 2004; Mueller et al. 2010; Mazza et al. 2012). Therefore, it is our belief that a thorough approach to FRAP parameter estimation, incorporating model selection and regime identification, would be beneficial. It is our intention to develop this approach in future work, utilising the theoretical results which we have established here.

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

A Derivations
In this appendix we present the derivations of the formulae in Sect. 3. We begin with the non-dimensionalised FRAP equation (50), which is equation (5) in the main text. Equation (50) is subject to the far-field conditions (51) and the post-bleach initial conditions, which are stated in terms of the Heaviside step function $H$. The system is written in terms of dimensionless variables and dimensionless parameters. By assumption the Laplacian is radially symmetric,

$$\nabla^2 u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r},$$

and likewise for $v$. We will use the asymptotic expansion (56) of $u$ and $v$ in the subsequent sections.

A.1 Rapid equilibration ($\eta \gg 1$)
In the first instance, we will derive the rapid equilibration recovery curve formula, (16) in the main text. We set $\varepsilon = 1/\eta$. Substituting expansion (56) into system (50) yields system (57). We will necessarily be left with $\nabla^2 u_0 = \nabla^2 v_0 = 0$ as we let $\varepsilon \to 0$ unless we also take $\delta \to 0$ (that is, the entire system instantaneously returns to equilibrium in the limiting case where both species are infinitely diffusive). Neglecting this uninteresting case, and taking the limit $\varepsilon, \delta \to 0$ such that $\delta/\varepsilon = O(1)$, we are left with a single equation for $v_0$, where we have already used the fact that $u_0 = u_{eq}$, since this is the unique solution of $\nabla^2 u_0 = 0$ given the boundary conditions (51). Noting that $k_{on} u_{eq} = k_{off} v_{eq}$ and making the substitution $v_0(r, t) = v_{eq} - \tilde v_0(r, t)$, we arrive at

$$\frac{\partial \tilde v_0}{\partial t} = D_v \nabla^2 \tilde v_0 - k_{off}\, \tilde v_0, \qquad (59)$$

where we should note that (59) has been restored to dimensional form. The solution to (59) subject to Dirac delta initial conditions is the two-dimensional heat kernel multiplied by the decay factor $e^{-k_{off} t}$; hence the solution to the general initial value problem of (59) is obtained, in Cartesian coordinates, by convolving this kernel with the initial data (62). At this stage, the derivation becomes virtually identical to that of Soumpasis (1983). The precise form of $v_0$ can only be expressed in integral form, yet the contribution of the bound fraction to the fluorescence recovery, $F_v(t)$, can be evaluated in closed form.

The above approximation breaks down over the short time scale $t = O(\varepsilon)$, so it is helpful to consider the rescaling $t = \varepsilon\tau$, which, in the limit $\varepsilon \to 0$, reduces (57) to a system in which the dynamics of $u$ are governed purely by diffusion. This implies that, if we define $F_u$ analogously to $F_v$, we conclude that

$$F_u(t) = u_{eq}\, F_S\!\left(\frac{r_n^2}{2 D_u t}\right).$$
Finally, noting that the total fluorescence recovery curve is given by $F(t) = F_u(t) + F_v(t)$ and that $u_{eq} = k_{off}/(k_{on} + k_{off})$, $v_{eq} = k_{on}/(k_{on} + k_{off})$, we arrive at formula (16) of the main text.
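The mobile-fraction term $F_u(t) = u_{eq} F_S(r_n^2 / (2 D_u t))$ can be evaluated numerically with the short sketch below. It assumes that $F_S$ is the Soumpasis (1983) recovery function, $F_S(x) = e^{-x}\,(I_0(x) + I_1(x))$ with $I_0$, $I_1$ modified Bessel functions, since $F_S$ is not defined in this appendix. To avoid overflow at early times (large $x$), the sketch uses the integral representation $e^{-x} I_n(x) = \frac{1}{\pi}\int_0^{\pi} e^{-x(1 - \cos\theta)} \cos(n\theta)\, d\theta$ with a midpoint rule. The function names and the parameter values are illustrative.

```python
import math


def soumpasis(x, n=2000):
    """F_S(x) = exp(-x) * (I0(x) + I1(x)), evaluated by a midpoint rule on [0, pi].

    Reasonable for x up to roughly 1e4 with the default n; larger x needs more nodes.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.exp(-x * (1.0 - math.cos(theta))) * (1.0 + math.cos(theta))
    return total * h / math.pi


def mobile_recovery(t, u_eq, r_n, D_u):
    """Contribution of the mobile fraction, u_eq * F_S(r_n^2 / (2 * D_u * t)), for t > 0."""
    return u_eq * soumpasis(r_n**2 / (2.0 * D_u * t))


# Illustrative parameters: u_eq = 0.5, r_n = 0.5 um, D_u = 30 um^2/s.
early = mobile_recovery(1e-4, 0.5, 0.5, 30.0)
late = mobile_recovery(1.0, 0.5, 0.5, 30.0)
```

The function decreases monotonically from $F_S(0) = 1$ towards zero as $x \to \infty$, so $F_u$ rises from zero immediately after the bleach towards $u_{eq}$ at long times.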
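In dimensional form and truncated to first order, we have

$$\begin{cases} u(r, t) = u_{eq}\left[1 + \dfrac{k_{on}}{4 D_u}\, e^{-k_{off} t}\,(r^2 - r_n^2)\right], \\[2ex] v(r, t) = v_{eq}\left[1 - e^{-k_{off} t} + \dfrac{k_{on}}{4 D_u}\, k_{off}\, t\, e^{-k_{off} t}\,(r^2 - r_n^2)\right]. \end{cases}$$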
We are now able to evaluate the fluorescence recovery function,

$$F(t) = \frac{1}{\pi r_n^2} \int_0^{2\pi}\!\!\int_0^{r_n} \big(u(r, t) + v(r, t)\big)\, dr\, d\theta,$$

to obtain

$$F(t) = 1 - v_{eq}\, e^{-k_{off} t} - \frac{k_{on} r_n}{3 D_u}\, e^{-k_{off} t}\,(1 + k_{off} t),$$

as required.

A.3 Slow equilibration ($\eta \ll 1$)
Finally, we will derive the FRAP formula (21) from the main text. Let $\varepsilon = \eta$. We begin by truncating the asymptotic expansions (56) of $u$ and $v$ to zeroth order prior to substitution into (50), yielding system (80). In the limit $\varepsilon \to 0$, there is no net spatial redistribution and the total concentration, $w_0 = u_0 + v_0$, is time-invariant, yet it is clear that the system will converge towards a local chemical equilibrium at each point, given by $u_0 = \kappa w_0/(1 + \kappa)$, $v_0 = w_0/(1 + \kappa)$.
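Adding the two equations in system (80) cancels the zeroth-order reaction terms, leaving

$$\frac{\partial u_0}{\partial \tau} + \frac{\partial v_0}{\partial \tau} = \nabla^2 u_0(r, \tau/\varepsilon) + \delta \nabla^2 v_0(r, \tau/\varepsilon),$$

where we have cancelled $\varepsilon$ throughout. Taking $\varepsilon \to 0$, so that $\tau/\varepsilon \to \infty$ and $u_0$, $v_0$ take their local equilibrium values, gives

$$\frac{\partial w_0}{\partial \tau} = \frac{\kappa + \delta}{1 + \kappa}\, \nabla^2 w_0.$$

At this point it is convenient to re-dimensionalise, finally yielding

$$\frac{\partial w_0}{\partial t} = D_{eff} \nabla^2 w_0, \qquad D_{eff} = \frac{\kappa D_u + D_v}{1 + \kappa};$$

hence the total fluorescence concentration recapitulates the behaviour of a pure diffusion system, and the fluorescence recovery function is given by formula (21) of the main text.

Now that we have explicitly derived the partial derivatives of $F$ with respect to each of the four model parameters, the Fisher information matrix follows from formula (14) in the main text.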
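The link between the Fisher information matrix and parameter estimability can be illustrated numerically. Formula (14) is not reproduced in this appendix, so the sketch below assumes independent Gaussian measurement noise of standard deviation $\sigma$, for which the matrix has entries $\sigma^{-2} \sum_i \partial_{\theta_l} F(t_i)\, \partial_{\theta_m} F(t_i)$, and approximates the derivatives by central finite differences. The recovery curve `toy_effective_diffusion` is a hypothetical stand-in (not formula (21)) chosen only because it depends on $D_u$ and $D_v$ through $D_{eff}$ alone; all names and values are illustrative.

```python
import math


def sensitivities(model, params, times, rel_step=1e-6):
    """Central finite-difference derivatives dF/dtheta_l at each time (one row per time)."""
    rows = []
    for t in times:
        row = []
        for l, value in enumerate(params):
            step = rel_step * abs(value)
            up = list(params)
            down = list(params)
            up[l] = value + step
            down[l] = value - step
            row.append((model(t, up) - model(t, down)) / (2.0 * step))
        rows.append(row)
    return rows


def fisher_information(model, params, times, sigma):
    """Fisher information under an assumed i.i.d. Gaussian noise model."""
    jac = sensitivities(model, params, times)
    n = len(params)
    return [[sum(r[l] * r[m] for r in jac) / sigma**2 for m in range(n)] for l in range(n)]


def toy_effective_diffusion(t, params, kappa=1.0):
    """Stand-in curve depending on (D_u, D_v) only through D_eff (illustration only)."""
    D_u, D_v = params
    D_eff = (kappa * D_u + D_v) / (1.0 + kappa)
    return 1.0 - math.exp(-D_eff * t)


times = [0.1 * i for i in range(1, 51)]
fim = fisher_information(toy_effective_diffusion, [8.0, 1.0], times, sigma=0.01)
det = fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]
```

Because the two sensitivity columns are proportional, the determinant vanishes up to rounding error: the data constrain $D_{eff}$ but not $D_u$ and $D_v$ individually, which mirrors the manifold of inestimable parameters found in the slow equilibration regime.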