Parameter estimation and determinability analysis applied to Drosophila gap gene circuits

Ashyraliyev, Maksat; Jaeger, Johannes; Blom, Joke G

doi:10.1186/1752-0509-2-83

Parameter estimation and determinability analysis applied to Drosophila gap gene circuits

Research article
Open access
Published: 25 September 2008

Volume 2, article number 83, (2008)
Cite this article

Download PDF

You have full access to this open access article

BMC Systems Biology Aims and scope

Parameter estimation and determinability analysis applied to Drosophila gap gene circuits

Download PDF

Maksat Ashyraliyev¹,
Johannes Jaeger² &
Joke G Blom¹

12k Accesses
64 Citations
Explore all metrics

Abstract

Background

Mathematical modeling of real-life processes often requires the estimation of unknown parameters. Once the parameters are found by means of optimization, it is important to assess the quality of the parameter estimates, especially if parameter values are used to draw biological conclusions from the model.

Results

In this paper we describe how the quality of parameter estimates can be analyzed. We apply our methodology to assess parameter determinability for gene circuit models of the gap gene network in early Drosophila embryos.

Conclusion

Our analysis shows that none of the parameters of the considered model can be determined individually with reasonable accuracy due to correlations between parameters. Therefore, the model cannot be used as a tool to infer quantitative regulatory weights. On the other hand, our results show that it is still possible to draw reliable qualitative conclusions on the regulatory topology of the gene network. Moreover, it improves previous analyses of the same model by allowing us to identify those interactions for which qualitative conclusions are reliable, and those for which they are ambiguous.

Analysis of functional importance of binding sites in the Drosophila gap gene network model

Article Open access 16 December 2015

Sequence-based model of gap gene regulatory network

Article Open access 19 December 2014

Dynamic Modeling of Transcriptional Gene Regulatory Networks

Background

Many real-life processes can be modeled by non-linear Ordinary Differential Equations (ODEs) or Partial Differential Equations (PDEs). In developmental biology, for instance, systems of reaction-diffusion equations are used to model spatio-temporal patterns of gene expression [1]. A common difficulty is that the model equations usually have a large number of unknown parameters, such as weights for regulatory interactions, diffusion coefficients, decay and reaction rates, etc. Sometimes, it is feasible to determine the missing parameters experimentally, but in most cases this is difficult or even impossible. However, one can usually measure other quantities involved in the model. For instance, experimentalists can quantify mRNA or protein concentrations using microarrays, quantitative PCR, in situ hybridization or immunofluorescence. Unknown model parameters can then be found by parameter estimation techniques based on fitting the model solution to the measured gene expression data.

Whether the parameters for the mathematical model can be found assuming that sufficient and error-free data is available is the subject of a priori or structural identifiability analysis. Once the parameter estimates have been computed, it is very important to know how reliable they are. An a posteriori or practical identifiability study can show how well the parameters have been determined given a data set that is possibly sparse and noisy. For the subject of structural and practical identifiability we refer to [2–4] and references therein. Ideally, one would wish to determine all parameters accurately enough. In practice, however, this is usually not possible and one has to face an uncertainty in the parameter values. This can be due to several reasons: First, the model could be 'wrong'. In this paper, we do not focus on this possibility assuming that the 'right' model is available (i.e. a model which represents the underlying mechanism of the modeled process accurately and correctly). Second, the data used for fitting could be insufficient or too noisy. Finally, a recent study by Gutenkunst et al. [5] revealed that even if a correct model is used with a comprehensive set of data, many models used in systems biology still exhibit parameter 'sloppiness'. This means that some model parameters can be determined with great certainty ('stiff' parameters), while estimates of other ('sloppy') parameters can vary by orders of magnitude without significantly influencing the quality of the fit. Parameter sloppiness implies that very different sets of estimated parameters can lead to accurate model predictions. Therefore, it is not a serious problem if the main purpose of a model is to predict the dynamical behavior of the system, and little significance is attributed to parameter values. This is the case for all models considered by Gutenkunst et al. [5].

Parameter sloppiness becomes much more problematic, however, when models are used explicitly to extract biological information from estimated parameter values. In particular, this affects attempts at reverse engineering gene regulatory networks underlying cellular or developmental processes, where models are used to infer regulatory interactions – and hence regulatory network topology – from quantitative gene expression data.

Identifiability is a mathematical notion. For biological implications the precise values of parameters are not always important as long as they have certain characteristics, like being (roughly) positive, negative or zero. If a posteriori analysis results in a parameter uncertainty range which lies in the characteristic range we call this parameter determinable. Note that for those parameters which have to be determined quantitatively, i.e. having no characteristics, determinability refers to a posteriori identifiability.

As a case study, we consider the gap gene system of the vinegar fly Drosophila melanogaster. Gap genes constitute the first step in a regulatory cascade that leads to the determination of body segment positions along the major (or anterior-posterior, A-P) body axis during early Drosophila development [6]. The biological function of the gap gene system is to interpret long-range protein gradients implemented by the products of the maternal co-ordinate genes (e.g. bicoid (bcd), hunchback (hb) and caudal (cad); see [7–9] and references therein). Zygotic gap genes, such as hb, Krüppel (Kr), knirps (kni) and giant (gt), are activated or repressed by these maternal gradients, which establishes their expression in broad, overlapping regions of the embryo. These spatial domains of gap gene expression are stabilized and refined by gap-gap cross-repression. In turn, gap genes are involved in regulation of pair-rule and segment-polarity genes, the latter of which establish a segmental pre-pattern of gene expression by the onset of gastrulation.

The gap gene system is one of the best characterized developmental gene networks available today. It has been studied extensively using genetic and molecular approaches (see [7] and references therein). More importantly for our purposes, quantitative expression data are available for all relevant maternal co-ordinate and gap genes [10, 11], and those data have been used to infer regulatory interactions between gap genes using different global and local optimization strategies [7, 8, 12, 13]. In this study, we use parameter values from these earlier studies as starting points for local optimization to obtain a large set of parameter estimates. We then apply a practical identifiability analysis to those parameter sets to establish how well these estimates can be determined based on the available experimental data. We discuss the implications our results have for modeling of the gap gene system and for the biological interpretation of estimated parameter values. Finally, we note that the analysis can easily be adapted to other systems, and we strongly recommend its use to systems biology models in which large emphasis is put on the biological interpretation of estimated parameter values.

Methods

We consider a model given by the system of ODEs of the general form:

{\begin{array}{l} \frac{d y}{d t} = f (t, y, θ), & 0 < t \leq T, \\ y (t, θ) = y_{0} (θ), & t = 0. \end{array}

(1)

Here the m-dimensional vector θ contains all unknown parameters, y is an n-dimensional state vector, and f is a given vector function, differentiable with respect to t, y and θ. When components of the initial state vector y₀ are not known, they are considered as unknown parameters. Thus, y₀ may depend on θ.

As mentioned above, we assume that (1) is the 'right' model for the problem we are interested in, implying that (1) is a sufficiently accurate mathematical description approximating reality. This means that all relevant knowledge about the modeled processes is incorporated correctly in the vector function f. Thus, the only uncertainty in (1) is the vector of unknown parameters θ. Furthermore, it means that there exists a 'true' value θ* for the parameters θ such that (1) represents reality. Therefore, in principle, all unknown parameters can be determined if sufficient and accurate enough data are available.

Quantities that can be experimentally measured are called observables. The theory of identifiability holds in general for observables being a combination of state variables. However, for the sake of simplicity we consider here the particular case when only the components of the state vector are measured. Let us assume that for fitting (1) there are N measurements available. Each measurement, which we denote by ${\tilde{y}}_{i}$ , is specified by the time t_iwhen the c_i-th component of the state vector y is measured. The corresponding model value obtained from (1) is denoted by $y_{c_{i}}$ (t_i, θ). The assumptions outlined above imply that the difference $| {\tilde{y}}_{i} - y_{c_{i}} (t_{i}, θ^{*}) |$ is solely due to experimental error. We denote the vector of discrepancies between the theoretical values and the measured values by Y(θ). Then the least squares estimate $\hat{θ}$ of the parameters is the value of θ that minimizes the sum of squares [14, 15]

S (θ) = \sum_{i = 1}^{N} {(y_{c_{i}} (t_{i}, θ) - {\tilde{y}}_{i})}^{2} = Y^{T} (θ) Y (θ) .

(2)

We note that (2) is an appropriate measure under certain assumptions only, which we will discuss below. Other measures might be used when these assumptions do not hold.

Parameter Estimation by the Levenberg-Marquardt Method

There exist a number of different optimization techniques for parameter estimation. The choice of technique usually depends on the type of model equations (deterministic or stochastic), on the number of unknown parameters (moderate or large), as well as on the dependence of model solutions on parameters (linear or nonlinear, continuous or discontinuous). For a survey on optimization methods in biochemical models we refer to [2, 16]. In general, model (1) – being nonlinear in θ – leads to a least squares problem (2) that has several minima, first because the problem has more than one solution, and second because the fitness function (2) can have several stationary points that do not correspond to the lowest value of the fitness landscape (so-called local minima). Local search methods, like Levenberg-Marquardt (LM), easily get trapped in one of the local minima rather than finding the global minimum. To explore the whole search space one needs global search methods, like the Evolution Strategy used in [12]. Unfortunately, these methods converge very slowly once near a minimum. In contrast, gradient-based methods are efficient optimizers [17] for nonlinear least-squares problems once a sufficiently good initial guess for the parameter values is available. In this paper we use the solutions from the global search in [12] as initial guesses for local optimization by the LM method [18]. In this way, we reduce the chance of missing the global minimum and the determination of all the minima is precise and fast.

In general, any gradient-based optimization procedure seeks a correction δθ for the parameter vector, such that S(θ + δθ) ≤ S(θ) holds. The LM method [18] determines the correction as the solution of the equations

(J^T(θ)J(θ) + λI_m) δθ = -J^T(θ)Y(θ),

where λ ≥ 0 is a control parameter (see below), I_mis the identity matrix of size m and the Jacobian $J (θ) = \frac{\partial Y (θ)}{\partial (θ)}$ is the so-called 'sensitivity' matrix of size N × m. The entry J_{i, j}in J(θ) shows how sensitive the model response is at the i-th data point for a change in the j-th parameter. The LM method can be seen as the combination of two gradient-based approaches: Gauss-Newton and steepest descent [17]. If λ = 0 in (3), it coincides with the Gauss-Newton method. However, when the matrix J^T(θ)J(θ) is (almost) singular, to solve (3), λ has to be positive and for large λ the LM method approaches the steepest descent method. During the optimization λ is adapted such that the algorithm strives to exploit the fast convergence of the Gauss-Newton method whenever this is possible [18, 19].

In order to solve (3), the singular value decomposition (SVD) [20] of the matrix J(θ) can be used, i.e.

J(θ) = U(θ) Σ (θ) V^T(θ),

where U(θ) is an orthogonal matrix of size N × m, such that U^T(θ)U(θ) = I_m, V(θ) is an orthogonal matrix of size m × m, such that V^T(θ)V(θ) = V(θ)V^T(θ) = I_m, and Σ(θ) is a diagonal matrix of size m × m which contains all singular values σ_iin non-increasing order. Then the correction δθ can be found as

δθ = -V(θ) (Σ²(θ) + λI_m)^-1 Σ(θ) U^T(θ) Y(θ).

Later, when we study the reliability of the parameters computed, the SVD will play an important role again.

In order to execute an LM optimization step, the vector of discrepancies Y(θ), the matrix J(θ) and its SVD have to be evaluated for each new estimate of θ. For this purpose, for Y and the entries of J one needs to resolve (1) and the additional system of variational equations (i = 1,2,...,m)

{\begin{array}{l} \frac{\partial}{\partial t} \frac{\partial y}{\partial θ_{i}} = \frac{\partial f}{\partial θ_{i}} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial θ_{i}}, & 0 < t \leq T, \\ \frac{\partial y (t, θ)}{\partial θ_{i}} = \frac{\partial y_{0} (θ)}{\partial θ_{i}}, & t = 0, \end{array}

(6)

respectively. We note that the costs for performing the SVD and computing the correction (5) are negligible in comparison with the computational costs for solving (1) and (6).

Thus, a single LM step requires the numerical solution of m + 1 coupled systems, each one consisting of n ODEs. Fortunately, these systems are coupled in a special way, namely, for each i = 1, 2,...,m, system (6) is a system of linear ODEs for $\frac{\partial y}{\partial θ_{i}}$ , coupled only with (1). The system of equations (6) has the same stiffness as (1), so for numerical stability the same step size can be used for the time integration of (1) and (6) (note that ODE stiffness is determined by the eigenvalues of the Jacobian matrix $\frac{\partial f}{\partial y}$ and is not related to parameter stiffness as described above). Therefore, the one-way coupling can be used to solve (1) and (6) efficiently.

Numerical integration of (1) and (6) requires a fast and reliable ODE solver. Search in the parameter space may lead to some values of θ such that the systems of ODEs become stiff [21]. It is well known that for stiff ODE systems explicit schemes can give rise to numerical instability or, alternatively, extremely small time steps. Therefore, an implicit scheme is the best choice for time integration for stability reasons. Using an implicit scheme allows us to exploit the specific coupling between (1) and (6) in an efficient way. At each time step τ integrating first (1) provides the solution vector y. This requires the LU decomposition of the Jacobian matrix $I_{m} - τ \frac{\partial f}{\partial y}$ . Using this LU decomposition the calculation of $\frac{\partial y}{\partial θ_{i}}$ from (6) reduces to a simple forward substitution and backsubstitution. In our simulations we use a tailor-made code [22] based on the implicit multistep Backward Differentiation Formulas (BDF) [23].

When the unknown parameters have to obey certain constraints – linear or non-linear – some additional work is needed. If the correction δθ found by (5) leads to violation of some constraints, then by the introduction of Lagrange multipliers a modified correction can be found, which fits all constraints. For the constrained minimization problem we refer the reader to [22].

For additional modeling and numerical aspects of this method we refer the reader to Additional file 1 (Section 1).

Statistical Analysis of Parameter Estimates

Above we used θ* to denote the 'true' parameter vector, for which (1) describes reality with sufficient accuracy, and by $\hat{θ}$ we denote the parameter vector which minimizes (2). Even having a 'right' model and an estimate $\hat{θ}$ for the parameter vector which fits the data well, does not mean that the whole modeling problem is resolved successfully. It is important to know how reliable the obtained estimate is. This is the subject of a posteriori identifiability analysis [3, 4, 24]. One way to look at this is inspecting the fitness landscape S(θ) in the neighbourhood of $\hat{θ}$ . Roughly speaking, if it is a sharp trough then the true parameter vector θ* and the obtained minimum $\hat{θ}$ are close. If it is flat in one or more directions, like the surface for a 2-parameter case in Fig. 1(a), then the minimum found can be far apart from the true parameter vector. Near the minimum, where the gradient of S(θ) vanishes, this surface is approximated by the second derivative or Hessian of S(θ). If the model is linear in the parameters the Hessian is equal to J^TJ. This linearity assumption and some statistics underlie the following rigorous analysis [14, 15, 21].

We assume that the measurement errors in ${\tilde{y}}_{i}$ are independent of each other and normally distributed and that the error distributions have zero mean and constant standard deviation σ. Then, $\hat{θ}$ is a maximum likelihood estimate [14, 15]. By assumption the model with the 'true' solution θ* describes reality, thus

\begin{matrix} {\tilde{y}}_{i} \approx y_{c_{i}} (t_{i}, θ^{*}) + ϵ_{i}, & i = 1, 2, ..., N, \end{matrix}

where ϵ_iare the measurement errors, for which

\hat{θ} - θ^{*} ~ N_{m} (0, σ^{2} {(J^{T} (\hat{θ}) J (\hat{θ}))}^{- 1})

(7)

holds approximately [14]. Here N_m(·,·) denotes the m-dimensional multivariate normal distribution. Notice that (7) holds exactly when y is linear in θ. Next we can define a region around $\hat{θ}$ in which the 'true' parameter vector θ* lies with a certain probability 1 - α. This (1 - α)-confidence region is determined by the inequality

{(θ^{*} - \hat{θ})}^{T} (J^{T} (\hat{θ}) J (\hat{θ})) (θ^{*} - \hat{θ}) \leq \frac{m}{N - m} S (\hat{θ}) F_{α} (m, N - m),

(8)

where F_α(m, N - m) is the upper α part of Fisher's distribution with m and N - m degrees of freedom. Geometrically these confidence regions are given by the contours of S( $\hat{θ}$ ) (for linear models), cf. Fig. 1(a).

The ellipsoid defined by (8), is centered at $\hat{θ}$ and has its principal axes directed along the eigenvectors of J^T( $\hat{θ}$ )J( $\hat{θ}$ ). Using the SVD (4) for J( $\hat{θ}$ ), we get

J^{T} (\hat{θ}) J (\hat{θ}) = V (\hat{θ}) Σ^{2} (\hat{θ}) V^{T} (\hat{θ}),

and the eigenvectors of J^T( $\hat{θ}$ )J( $\hat{θ}$ ) are the columns of the matrix V( $\hat{θ}$ ). So, the ellipsoid has its principal axes directed along the column vectors of the matrix V( $\hat{θ}$ ). Moreover, the radii along these principal axes are inversely proportional to the corresponding singular values σ_i, the diagonal elements of Σ( $\hat{θ}$ ). This all can be seen by using the following transformation (rotation)

z = V^{T} (\hat{θ}) (θ^{*} - \hat{θ}),

(9)

yielding

{(θ^{*} - \hat{θ})}^{T} (V (\hat{θ}) Σ^{2} (\hat{θ}) V^{T} (\hat{θ})) (θ^{*} - \hat{θ}) = z^{T} Σ^{2} (\hat{θ}) z = \sum_{i = 1}^{m} σ_{i}^{2} z_{i}^{2} .

(10)

On the other hand, since S( $\hat{θ}$ )/(N - m) is an unbiased estimator of σ², the equation for the ellipsoid can be rewritten as

\sum_{i = 1}^{m} σ_{i}^{2} z_{i}^{2} = r_{σ}^{2},

(11)

where $r_{σ}^{2}$ ≈ mσ²F_α(m, N - m) is proportional to the variance in the measurement errors. This form is more convenient to deal with because z can be considered as a set of uncorrelated variables, and once the conclusion has been drawn for the identifiability of z, the problem can be transformed back, revealing us the quality of $\hat{θ}$ .

Now, we assume that the model (1) is properly scaled, such that all parameter values are of the same order of magnitudes, and that we are interested only in the first few digits of the parameter values. Let us introduce the sphere given by

\sum_{i = 1}^{m} z_{i}^{2} = r_{ϵ}^{2},

where r_ϵdefines the level of accuracy one desires for the parameter estimates. For instance, if the parameters are of order O(1) and one is interested only in the first two digits to the right of the decimal point, then r_ϵ= 0.01. In order to be able to determine z_iaccurately enough, the radius along the ellipsoid's i-th principal axis shouldn't exceed the radius of the sphere, which leads us to the following inequality

σ_{i} \geq \frac{r_{σ}}{r_{ϵ}} .

(12)

A graphical representation of the ellipsoid and the sphere for the 2-dimensional case is given in Figure 1(b).

If only the first k largest singular values satisfy (12), then only the first k entries of z are estimated with the required accuracy and no sufficient information is available for the remaining components of z. Each of the first k entries of z defines a parameter or a linear combination of parameters which is well-determined. If a principal axis of the ellipsoid makes a significant angle with the axes in parameter space (i.e., there exists more than one significant entry in the eigenvector), this implies correlation between parameters in $\hat{θ}$ .

To summarize, the level of noise in the data in combination with the accuracy requirement for the parameter estimates, defines the threshold for significant singular values in the matrix Σ. The number of singular values exceeding this threshold determines the number of parameter relations that can be derived from the experiment. How these relations relate to the individual parameters is described by the corresponding columns in the matrix V. The largest entries in these columns indicate the well-determined parameters. This method is illustrated on the basis of a simple enzymatic reaction in [2].

Finally, (11) indicates that having, for instance, two times more accurate data so that the standard deviation σ is halved, will decrease the radii along the ellipsoid's principal axis by a factor of 2. Therefore, in case of very small singular values σ_i(i.e. strongly elongated ellipsoids) more accurate data obtained by the experimentalist will not improve the quality of the corresponding parameter estimates by much. In such a case, one certainly needs additional measurements of a different type (e.g. different components, different time points, or in the case of PDEs different spatial points).

Another way of assessing the information from the confidence region is by looking at confidence intervals of the parameter estimates ${\hat{θ}}_{i}$ (i = 1, 2,...,m). From (8) one can derive dependent and independent confidence intervals. The dependent confidence interval is the intersection of the ellipsoidal region with the i-th parameter axis

{θ_{i} : | θ_{i} - {\hat{θ}}_{i} | \leq \frac{r_{σ}}{\sqrt{{(V (\hat{θ}) Σ^{2} (\hat{θ}) V^{T} (\hat{θ}))}_{i i}}}},

(13)

i.e. one assumes that all other parameters are exactly determined. The independent confidence interval is the projection of the ellipsoidal region onto the i-th parameter axis

{θ_{i} : | θ_{i} - {\hat{θ}}_{i} | \leq r_{σ} \sqrt{{(V (\hat{θ}) Σ^{- 2} (\hat{θ}) V^{T} (\hat{θ}))}_{i i}}} .

(14)

Clearly, small independent confidence intervals for ${\hat{θ}}_{i}$ indicate that it is well-determined. However, in some cases considering only individual confidence intervals can be misleading. For instance, in the presence of strong correlations between parameters, the dependent confidence intervals underestimate the confidence region while the independent confidence intervals overestimate it.

From (7), the covariance matrix of $\hat{θ}$ is given by

σ^{2} {(J^{T} (\hat{θ}) J (\hat{θ}))}^{- 1} = σ^{2} V (\hat{θ}) Σ^{- 2} (\hat{θ}) V^{T} (\hat{θ}) .

(15)

Then, by denoting $B (\hat{θ}) = V (\hat{θ}) Σ^{- 2} (\hat{θ}) V^{T} (\hat{θ})$ , the correlation coefficient between ${\hat{θ}}_{i}$ and ${\hat{θ}}_{j}$ can be computed by

ρ_{i j} = \frac{B_{i j}}{\sqrt{B_{i i} B_{j j}}} .

(16)

We note that by computing individual confidence intervals and correlations between parameters, one is not able to assess the identifiability of linear combinations of parameters. This can be seen only by using the first approach, i.e. by inspection of the V and Σ matrix.

The Biological Test Problem: Gap Gene Circuits

We apply the methodology described above to assess parameter determinability of gene circuit models for the gap gene network in early Drosophila development. Here, we provide a brief outline of gap gene circuit models. More detailed information can be found in [7, 8, 25].

Segment determination occurs during the blastoderm stage of Drosophila development, between 1.5 and 3 hours after egg laying [26]. During this stage, the embryo consists of a syncytium; there are no cell membranes between nuclei. These nuclei constitute the basic objects of the model. They are arranged in a row along the A-P axis. Nuclei divide rapidly and synchronously [27]. Periods between mitotic divisions are called cleavage cycles, where cycle n occurs between mitoses n - 1 and n. The models considered here run from early cycle 13 (t = 0.0 min) to the onset of gastrulation at the end of cycle 14A (t = 71.1 min). Mitosis occurs at the end of cycle 13, between t = 16.0 min and t = 21.1 min [27].

Gene circuit models describe the change in concentrations of each gap gene product in each nucleus over time by the following system of ODEs

\frac{d g_{i}^{a}}{d t} = R_{a} Φ (\sum_{b = 1}^{N_{g}} W_{a}^{b} g_{i}^{b} + m_{a} g_{i}^{Bcd} + h_{a}) - λ_{a} g_{i}^{a} + D_{a} (g_{i + 1}^{a} - 2 g_{i}^{a} + g_{i - 1}^{a}) .

(17)

a and b denote regulated genes and regulators respectively. a and b are integer indices representing cad, hb, Kr, kni, gt as well as the terminal gap gene tailless (tll). $g_{i}^{a}$ denotes the concentration of the product of gene a in nucleus i. The Bcd gradient remains constant over time, and is not regulated by the other genes in the model. $g_{i}^{Bcd}$ denotes the concentration of Bcd protein in nucleus i. N_g= 6 is the number of genes in the model (excluding Bcd), and the function

Φ (x) = \frac{1}{2} (\frac{x}{\sqrt{x^{2} + 1}} + 1)

(18)

is a sigmoid regulation-expression function.

During mitosis, protein production is shut down. Nuclei divide instantaneously at the end of mitosis and the distance between them is halved. Gap gene circuits cover the region from 35% to 92% A-P position, which includes 30 (cycle 13) and 58 (14A) nuclei. Therefore, system (17) consists of 180 and 348 ODEs during cycles 13 and 14A, respectively. Initial conditions are prescribed by maternal gradients of Bcd, Cad and Hb, and zero levels for all other gene products. We use no-flux boundary conditions at i = 0 and i = i_max.

In system (17) there are m = 66 unknown parameters. These include the genetic interconnection or regulatory weight matrix W of size N_g× N_gwhere the matrix elements $W_{a}^{b}$ represent the regulation of gene a by gene b, while maternal coefficients m_arepresent the regulatory effect of Bcd on gene a. Regulatory parameters represent repression (if < 0), activation (if > 0) or no interaction (if ≈ 0). Other parameters include promoter thresholds h_a, promoter strengths R_a, diffusion coefficients D_a, and decay rates λ_a. Estimates for these parameters have been obtained in previous studies by fits to quantitative expression data [11] using global search methods such as parallelized Lam Simulated Annealing [7, 8] or the Stochastic Ranking Evolution Strategy (followed by downhill simplex direct search) [12] and using a first-improvement local search method with randomized order of examination [13]. In the latter the initial parameter estimates are obtained by using a splitting strategy: parameters λ_aand D_aare estimated by assuming that the protein production is constant within certain spatio-temporal domains which reduces (17) to a system of linear equations uncoupled for each gene (the boundaries of production domains are obtained from data); parameters in the nonlinear part of the model are estimated by fitting the production term in (17) with the data given as input, as closely as possible, to the quadrilateral production regions.

The data set used for model fitting consists of N = 2702 measurements of protein concentrations at nuclear resolution (using multi-channel immunofluorescent antibody assays; available online [11]). Measurements were taken at one time point during cycle 13 (T₀), and eight time points T_i(1 ≤ i ≤ 8) during cycle 14A (Figure 2a). Measurements for the concentrations of all gene products represented in the model at all time points are available, except for Cad at T₇ and T₈, and Tll before T₃. The level of measurement error in the data is less than 5%, see [28]. Each data point represents concentration values which have been averaged across 9–62 embryos. Therefore, from the Central Limit Theorem (CLT) we assume that the experimental errors are approximately normally distributed.

The quality of the parameter estimates is measured by the root mean square (RMS) of the discrepancy vector

R M S (θ) = \sqrt{\frac{1}{N} \sum_{a = 1}^{N_{g}} \sum_{i = 1}^{N_{c}} \sum_{j = 0}^{N_{t}} α_{j}^{a} {(g_{i}^{a} {(T_{j}, θ)}_{m o d e l} - g_{i}^{a} {(T_{j})}_{d a t a})}^{2},}

(19)

where N_t= 8 is the number of time classes, N_cis the number of nuclei and $α_{j}^{a}$ is equal to zero for Tll at j = 0,1, 2 and for Cad at j = 7, 8, and is equal to one otherwise. A solution is considered to be 'good' if RMS < 12.0 and if there are no visible pattern defects in the model response [7, 8, 12, 13]. It is important to note that the RMS only shows the quality of the fit of the model to the data but does not give any information about the quality of the parameter estimates. Our aim is to find the parameter estimates that give a good fit and to apply statistical analysis in order to investigate how reliable these estimates are.

The search space for parameters is defined by the linear constraints

\begin{matrix} 10.0 \leq R_{a} \leq 30.0, & 0.0 < D_{a} \leq 0.3, & 5.0 \leq \frac{l n (2)}{λ_{a}} \leq 20.0, & a = 1, ..., N_{g}, \end{matrix}

(20)

and by the nonlinear constraints

\begin{matrix} \sum_{b = 1}^{N_{g}} {(W_{a}^{b} g_{m a x}^{b})}^{2} + {(m_{a} g_{m a x}^{Bcd})}^{2} + {(h_{a})}^{2} \leq 10^{4}, & a = 1, ..., N_{g}, \end{matrix}

(21)

where $g_{m a x}^{b}$ and $g_{m a x}^{Bcd}$ are the maximum values in the data set for proteins b and Bcd, respectively. Note that in [7, 8, 13] threshold parameters h_afor genes Kr, Kni, gt, and hb are fixed to negative values representing a constitutively repressed state for the corresponding genes [7, 8]. Fixing some parameters to specific values may severely restrict the search space leaving some solutions out of consideration. Contrary to their approach, we include threshold parameters for these genes in the search by putting the constraints -10.0 ≤ h_a≤ 0.0.

In order to make the analysis of parameter estimation easier, we scale all parameters used in (17) in the following way:

\begin{matrix} {\tilde{R}}_{a} = 0.1 R_{a}, & {\tilde{D}}_{a} = 10 D_{a}, & {\tilde{λ}}_{a} = 10 λ_{a}, & {\tilde{W}}_{a}^{b} = 10^{2} W_{a}^{b}, & {\tilde{m}}_{a} = 10^{2} m_{a}, & {\tilde{h}}_{a} = h_{a}, \end{matrix}

for all genes a and b. Note that the choice of the scaling factors for R_a, D_a, and λ_ais based on the search ranges of the corresponding parameters. The choice of the scaling factors for regulatory weights $W_{a}^{b}$ and maternal coefficients m_ais based on the fact that the maximum level of protein concentration for all genes in the data set is of order O(10²). Thus, all scaled parameters are of order O(1).

Results and Discussion

We use 80 different parameter sets, obtained by global search [12], as initial guess for the parameter values and apply the LM method to estimate all 66 unknown parameters of the gap gene circuit model (17), such that the state variables fit the given data (see Figure 2b), subject to (non)linear constraints (20)–(21). Once the parameters are estimated we apply our statistical analysis to assess the quality of the parameter estimates.

Optimization Results

Least squares estimation of the 66 parameters of the gap gene circuit model (full search case) using the LM method yields a significant decrease of the RMS (19) in all simulations (see Table 1). There are only 5 (out of 80) initial parameter sets with RMS < 10.0 (best fit: RMS = 9.56). After using the LM method there are 71 final parameter sets with RMS < 10.0, among which there are 64 with their RMS evenly distributed between 8.37 and 9.43. None of these low-scoring parameter sets show any visible patterning defects (see Figure 2.1 in Additional file 1), while most solutions with larger RMS do. As it is difficult to make a distinction between these 64 parameter sets based on RMS values and expression patterns only, we take all of them into account for our analysis. We note that there is no guarantee that a better solution might have been missed by our parameter estimation procedure. However, since the initial points for the LM search were found by a global search method [12], it is likely that the search space for unknown parameters is explored sufficiently enough.

Table 1 RMS distribution for parameter estimates.

Full size table

Parameter estimates found by the LM method significantly improve solution fits found in previous studies (see Figure 3) [7, 8, 12, 13]. However, there are two problems, mentioned in [7, 8], that remain unsolved with the new parameter estimates. The first one concerns the artificially high level of gap gene expression during early cycle 13. The model responses are much larger than the data values yielding large positive discrepancies. This is probably due to the lack of protein production delays in the model [7]. The second one concerns the incorrect shift of the posterior Hb domain, which is due to the absence of the terminal gap gene huckebein (hkb) from our current models [7, 8].

Many parameters have a broad range of possible values, meaning that they are not uniquely determined (see Figures 2.2-2.3 in Additional file 1). Classification of all parameter estimates for regulatory weights into 'activating', 'repressing' or 'no interaction' categories is shown in Figure 4(a). The resulting network topology is in very good agreement with the results obtained in [7, 8, 12, 13]. Specifically,

(A1) Cad and Bcd activate gap genes hb, Kr, kni, and gt;

(A2) gap genes hb, Kr, kni, and gt show auto-activation;

(A3) Tll represses gap genes Kr, kni, and gt;

(A4) gap genes with mutually exclusive expression domains strongly repress each other; these correspond to weights $W_{g t}^{K r}$ , $W_{K r}^{g t}$ , $W_{h b}^{k n i}$ , and $W_{k n i}^{h b}$ .

Previous results also suggested that pairs of overlapping gap genes (hb and gt, gt and kni, kni and Kr, as well as Kr and hb) either show no or weak repressive interactions among each other. Note that some of these weights differ slightly from earlier analyses [7, 12]. In all of these cases the difference is extremely slight and depends on the threshold chosen to categorize an interaction as 'very weak repression' or 'no interaction' (for example $W_{k n i}^{K r}$ or $W_{K r}^{k n i}$ in Figure 4(a); see also scatter plots in Figure 2.2 in Additional file 1). It is therefore unlikely that such differences are biologically significant. The only activation between overlapping gap genes is predicted for the effect of Gt on hb. In addition, we find that Kni activates gt in a majority of solutions. In both of these cases, the significance of the interactions does not lie in their weak activating effect (which has no discernible biological function), but rather in the absence of repression [7, 8].

Parameter Determinability

We applied the statistical analysis introduced in the Methods section to the 64 parameter sets obtained by the LM method to assess the quality of our estimates. Ellipsoidal confidence regions corresponding to parameter estimates are given by (10). None of the parameter estimates lies in the ellipsoidal confidence regions of all other parameter sets. Note that this does not necessarily imply that there is no unique 'true' solution for the parameter vector, since the ellipsoidal confidence regions – or at least some of them – may still have a non-empty intersection.

For each parameter set $\hat{θ}$ , the SVD (4) of the Jacobian J( $\hat{θ}$ ) yields the matrices V( $\hat{θ}$ ) and Σ( $\hat{θ}$ ). In order to find the number of singular values in Σ( $\hat{θ}$ ) satisfying the accuracy inequality (12), i.e. to determine how many (combinations of) parameters can be determined, we need to quantify r_ϵand r_σ. We are interested only in the first decimal digit of the scaled parameters, and therefore we take r_ϵ= 0.1. For α = 0.05 we obtain r_σ≈ 9.4 RMS( $\hat{θ}$ ) (the choice of α does not make much difference here due to the large value of N).

Investigation of all parameter sets shows that on average, 15 singular values satisfy (12) meaning that at most 15 parameters or linear combinations of them can be determined with one digit accuracy. There is a set of parameters which have significant entries in the first 15 columns of all V matrices. It includes regulatory weights $W_{K r}^{c a d}$ , $W_{g t}^{c a d}$ , $W_{k n i}^{c a d}$ , $W_{t l l}^{c a d}$ , $W_{K r}^{h b}$ , promoter thresholds h_Kr, h_gt, h_tll, decay rate λ_cad, and promoter strength R_Kr. However, inspection of the first 15 columns of the V matrices shows that there is not a single parameter which can be determined individually with the chosen accuracy. Thus, each column has a number of significant entries implying that the principal axis of the confidence ellipsoid is at an angle with the corresponding axes in parameter space. This indicates the presence of correlations between parameters.

Dependent and independent confidence intervals for each parameter set can be computed by (13) and (14), respectively. We check if the corresponding confidence intervals for regulatory weights fall entirely into the 'repression', 'no interaction', or 'activation' categories. Results in Figure 4(a) do not change when only dependent confidence intervals are taken into account. However, including independent confidence intervals one can no longer make correct qualitative conclusions about many of the entries in the regulatory weight matrix.

The sizes of the independent confidence intervals give an indication about the determinability of the corresponding regulatory weights. There is a set of eight regulatory weights which have relatively small confidence intervals for all 64 parameter sets (see Figure 2.4 in Additional file 1). It includes $W_{h b}^{K r}$ , $W_{h b}^{t l l}$ , $W_{K r}^{h b}$ , $W_{K r}^{K r}$ , $W_{K r}^{k n i}$ , $W_{g t}^{h b}$ , $W_{g t}^{g t}$ , and $W_{g t}^{k n i}$ . For instance, Figure 5(a) shows the confidence intervals for $W_{h b}^{K r}$ . This regulatory weight is well determined qualitatively, i.e. the independent confidence intervals fall entirely into one category and therefore the type of the regulation can be concluded. The model predicts that Kr does not regulate hb. Note that the confidence intervals for these eight parameters in the scaled case are of order O(10^-1) and therefore they are not determinable with the chosen accuracy level r_ϵ= 0.1. In fact, they are determinable only if we choose r_ϵ= 1.0.

The remaining regulatory weights have larger independent confidence intervals (see Figure 2.4 in Additional file 1) and therefore they are not determined quantitatively. Among them are some regulatory weights for which qualitative conclusions can be deduced from the results. For example, panels (d) and (e) of Figure 5 show the confidence intervals for regulatory weights $W_{k n i}^{h b}$ and $W_{g t}^{K r}$ , respectively. Although these two regulatory weights can not be determined quantitatively, there is a qualitative difference between them. The independent confidence intervals for $W_{g t}^{K r}$ do not extend significantly into the positive part of the plane. Therefore, one can make a qualitative conclusion for this weight: the model predicts that Kr does not activate gt. Note that this is a weaker conclusion than predicting repression for this weight from Figure 4(a). In contrast, we cannot draw any qualitative conclusions about $W_{k n i}^{h b}$ . Thus, our analysis does not confirm the repression of kni by Hb inferred from Figure 4(a) (but does not contradict it either). To demonstrate that repression of kni by Hb is not strictly necessary to fit the data correctly, we fix this weight to zero while performing parameter estimation. The obtained parameter set has a RMS = 9.24 and produces patterns with no visible defects (see Figure 2.7 in Additional file 1).

Based on the confidence intervals, we summarize the qualitative conclusions for the most important regulatory weights in the gap gene system:

(B1) Cad and Bcd do not repress gap genes hb, Kr, and gt; no conclusions can he made for regulation of kni by Cad and Bcd;

(B2) gap genes hb, Kr, kni, and gt do not show auto-repression;

(B3) Tll does not activate gap gene gt; no conclusions can be made for regulation of Kr and kni by Tll;

(B4) gap genes with mutually exclusive expression domains gt and Kr do not activate each other; no conclusions can be made for regulatory interactions between hb and kni.

Interactions between overlapping gap genes are mostly weakly repressive or absent, and are largely consistent with Figure 4(a): confidence intervals for $W_{h b}^{K r}$ , $W_{K r}^{h b}$ , $W_{K r}^{k n i}$ , $W_{g t}^{h b}$ , and $W_{g t}^{k n i}$ indicate no interaction, while confidence intervals for $W_{k n i}^{g t}$ , and $W_{k n i}^{K r}$ suggest the absence of activation. Finally, confidence intervals for $W_{h b}^{g t}$ indicate the absence of repression.

Obviously, our qualitative conclusions (B1)–(B4) are weaker than the conclusions (Al)–(A4) made from Figure 4(a) by considering only the values of parameter estimates. Note that for all genes, promoter thresholds h, promoter strengths R, diffusion coefficients D, and decay rates λ have extremely large independent confidence intervals (see Figure 2.5 in Additional file 1) meaning that all these parameters are not determinable.

The large difference between dependent and independent confidence intervals indicates the presence of correlations between parameters. Individual confidence intervals are not informative for understanding the reason of poor determinability of parameters when their estimates are correlated. Using (16), we find the correlation matrix for each parameter set. To detect the most significant correlations between parameters present in all correlation matrices, we calculate an averaged matrix – which we call the mean correlation matrix – whose entries are the mean values of the corresponding correlation coefficients in the individual correlation matrices. The obtained mean correlation matrix has a block diagonal structure such that each block corresponds to a given gene and contains the correlation coefficients between parameters for the same gene (see Figure 2.6 in Additional file 1). Panels (a),(b),(c), and (d) of Figure 6 show the blocks corresponding to gap genes hb, Kr, gt, and kni, respectively. Note that the correlations corresponding to the most significant entries in the mean correlation matrix (with absolute values greater than 0.5) are statistically present in all individual correlation matrices because corresponding standard deviations are relatively small (less than 0.2).

Parameter Estimation with Fixed Promoter Thresholds

The main insight from the mean correlation matrix is that we observe strong correlations of regulatory parameters with promoter thresholds. For instance, regulation of hb, Kr, gt, and kni by Bcd and Cad, and auto-regulation are all strongly correlated with their corresponding h_a(see panels (a),(b),(c), and (d) of Figure 6). This may explain the poor determinability for these interactions. We checked this hypothesis by fixing promoter thresholds h_afor gap genes hb, Kr, gt, and kni in (17) to a value of -3.5, similar to the approach used in [7, 8, 12]. We find that also in this case, least squares estimation using the LM method yields a significant decrease of the RMS (see Table 1). There are 63 parameter sets with RMS < 10.0 (best fit: RMS = 8.66). Among these, there are 60 parameter sets which have no visible patterning defects (see Figure 3.1 in Additional file 1) and these were taken into account in the following analysis. The resulting regulatory network topology (see Figure 4(b)) largely corresponds to that obtained without fixing threshold parameters (full search case) with a few minor exceptions. $W_{k n i}^{K r}$ , $W_{K r}^{k n i}$ , and $W_{g t}^{k n i}$ now all fall into the 'no interaction' category while the full search found mutual repression between Kr and kni, and activation of gt by Kni (compare panels (a) and (b) of Figure 4). As discussed above, these changes represent very small quantitative changes in the parameter values and depend on the (somewhat arbitrary) choice of cut-off between regulatory categories (compare scatter plots in Figures 2.2 and 3.2 in Additional file 1). Therefore, they are unlikely to be biologically significant, while all our main qualitative conclusions (Al)–(A4) on gap gene network topology are fully consistent with our results using fixed threshold parameters.

On the other hand, we observe significant improvement in determinability of some regulatory weights when we compute dependent and independent confidence intervals for each parameter set by (13) and (14), respectively (see Figure 3.4 in Additional file 1). As an example, Figure 5(c) shows the confidence intervals for the regulatory weight $W_{h b}^{c a d}$ with fixed promoter thresholds. There is a quantitative improvement in the determinability of this parameter indicated by smaller independent confidence intervals in the case of fixed threshold parameters (compare panels (b) and (c) of Figure 5). But there are also qualitative changes. The model now predicts the activation of hb by Cad. Similarly, Figure 5(f) shows the confidence intervals for the regulatory weight $W_{g t}^{K r}$ with fixed promoter thresholds. Comparison of the panels (e) and (f) of Figure 5 shows that there is no quantitative difference between the two approaches for this weight.

However, we see a qualitative improvement for the case of fixed threshold parameters. The independent confidence intervals in Figure 5(f) lie in the negative part of the plane for almost all parameter estimates and therefore, repression is now predicted for this weight while the plot in Figure 5(e) corresponding to full search case predicts only the absence of activation.

Based on the confidence intervals, we summarize the qualitative conclusions for the essential regulatory weights in the gap gene model in the case of fixed promoter thresholds:

(C1) Cad activates gap genes hb, Kr, kni, and gt;

(C2) Bcd does not repress gap genes hb, Kr, and gt; no conclusions can be made for regulation of kni by Bcd;

(C3) gap genes hb, Kr, and gt have auto-activation; gap gene kni does not have auto-repression;

(C4) Tll does not activate gap gene gt; no conclusions can be made for the regulation of Kr and kni by Tll;

(C5) mutually exclusive gap genes gt and Kr repress each other; no conclusions can be made for regulations between hb and kni.

For interaction among overlapping gap genes, the confidence intervals in the case of fixed promoter thresholds are fully consistent with those for the full search case, even though three of these interactions fall into different categories in the analysis based on parameter values only (compare panels (a) and (b) of Figure 4). This shows that confidence intervals can be used to check the significance of ambiguities in predicted interactions based on parameter classification alone. However, although conclusions (C1)–(C5) show qualitative improvement for some regulations in comparison with (B1)–(B4), they are still weaker than those drawn from classifying parameter values only (A1)–(A4).

Similar to the full search case, we compute the mean correlation matrix to detect the significant correlations between parameters (see Figure 3.6 in Additional file 1). The obtained mean correlation matrix also has a block diagonal structure. However, there is a number of significant entries in off-diagonal blocks. Panels (e),(f),(g), and (h) of Figure 6 show the diagonal blocks corresponding to gap genes hb, Kr, gt, and kni, respectively. In the absence of dominating correlations between regulatory parameters and thresholds h_awe can now identify biologically significant parameter correlations. Here we restrict ourselves to describe some correlations which can be interpreted in biological terms with the emphasis on those for which at least one parameter is 'sloppy':

Strong negative correlation is present between $W_{k n i}^{h b}$ and m_kni. That is, strong repression of kni by Hb needs to be overcome through increased activation by Bcd. Note that both parameters are poorly determined. In the circuit with $W_{k n i}^{h b}$ set to zero, Bcd actually represses kni (see Table 2.1 in Additional file 1). This contradicts genetic and molecular evidence indicating that both repression of kni by Hb and its activation by Bcd are present in the embryo [29, 30].
There are complex correlations between the (very small, or absent) repressive effects of Hb on Kr and gt, and the activation of those two genes by Bcd. This confirms earlier results indicating that the balance between activation and repression from maternal genes is crucial for correct gap gene expression [31].
The importance of the balance between activation and repression is highlighted by the following: repression of kni and gt by Tll can be compensated through increased activation by Cad, repression of kni by Kr can be compensated through increased activation by Bcd, while repression of kni by Gt can be overcome by increased kni auto-activation in the posterior of the embryo.
Increased hb auto-activation is compensated through decreased activation of hb by Bcd indicating that broad maternal activation and auto-regulation are somewhat redundant.
There is a strong positive correlation present between m_Krand m_gt. This correlation is most likely indirect, due to repressive interaction between gt and Kr. Increased activation of Kr by Bcd must be balanced by increased activation of gt by Bcd to maintain balance of mutual repression between Kr and gt.
There are correlations between activation of Kr and gt by Bcd and their respective promoter strengths and decay rates. Such correlations are to be expected as stronger expression or increased protein stability can compensate for weaker activation by Bcd.

We note that some of the 'sloppy' parameters, such as $W_{g t}^{K r}$ , $W_{K r}^{g t}$ , $W_{h b}^{k n i}$ , and $W_{K r}^{t l l}$ are not (strongly) correlated to any of other parameters and their sloppiness remains unclear. The last is completely uncorrelated parameter. Posteriorly Kr is strongly repressed by Gt and somewhat weaker by Hb and Kni. Apparently, due to these interactions repression of Kr by Tll is somewhat redundant in the model.

In summary, the above suggests that complex correlations between regulatory weights as well as correlations between those weights and promoter strength or protein decay rates are an unavoidable property of complex biological networks, as some interactions or changes in expression rate can always compensate for changes in others.

Parameter Correlations: Data vs Model

Poor determinability of most of the parameters in the gap gene model is due to correlations between parameters. Here we investigate whether these correlations are caused by shortcomings of the data or the model.

At first glance, it seems that insufficient accuracy of the data cannot be the reason for correlations. More accurate data would simply make the ellipsoid confidence region shrink but not rotate. Therefore, it cannot significantly improve the determinability of the parameters (see also [5]). We checked this by assuming that a larger data set was available: Say we had measurements for all gene products, in all nuclei, at 71 uniformly distributed time points (instead of 9). With these choices the total number of measurements would be N = 21180. Suppose that we have found that one of our parameter estimates $\hat{θ}$ minimizes the sum of squares (2). Since the Jacobian depends only on the model responses and not on the values of the data, we can generate a new Jacobian $\tilde{J} (\hat{θ})$ including all 'ghost' data points. From the SVD of the corresponding $\tilde{J} (\hat{θ})$ we get the matrices $\tilde{V} (\hat{θ})$ and $\tilde{Σ} (\hat{θ})$ which define new ellipsoidal regions. The ellipsoids are slightly rotated in comparison with the initial ones but not enough to make the principal axes of the ellipsoid get closer to the parameter axes, i.e. the correlations between parameters are not removed.

Each data point is actually a sample mean, obtained by averaging gene concentrations from individual embryos. Therefore, measurement errors most likely have a normal distribution with zero mean. However, their standard deviations may vary for different data points. Assume that for the i-th data point K_imeasurements from individual embryos are used and assume that the standard deviation of this sample s_iis known. Then the normal distribution of the sample mean has a standard deviation which can be estimated by $σ_{i} = \frac{s_{i}}{\sqrt{K_{i}}}$ . For the dataset we used, both s_iand K_iare available from the FlyEx database [11]. Once all σ_iare found, we can use a weighted least squares estimation such that θ minimizes the sum

S (θ) = \sum_{i = 1}^{N} w_{i}^{2} {(y_{c_{i}} (t_{i}, θ) - {\tilde{y}}_{i})}^{2}

instead of (2). We take the weights w_iinversely proportional to σ_isuch that the weighted least squares yields the maximum likelihood estimate. Also in this case, we find that the obtained parameter estimates have the same type of correlations as those obtained with an ordinary least squares fit (data not shown).

Correlations between parameters can be due to hidden dependencies in the data set. To investigate whether this is the case, we conduct an inverse experiment. We choose one of the parameter sets obtained by the LM search, with an RMS = 8.38, and we denote it by θ*. By integrating the model equations with θ* we generate an exact data set at the same data points as the initial data set. To the exact data values we add errors drawn from the normal distribution with zero mean and standard deviation equal to 8.5. From the exact and the perturbed data set, we compute RMS(θ*) = 8.17. The perturbed dataset is used for the parameter estimation by means of the LM search. By constructing this inverse problem, we make sure that the assumption about the independence of the measurement errors is correct. With 40 different initial values of θ from [12] we obtain 34 parameter estimates having RMS between 7.95 and 8.25. Inspection of the corresponding V matrices shows that parameters are not determinable due to the correlations, similar to the original problem.

We conclude that the observed correlations between parameters are a property of the model, not of the data. Since an explicit form of the dependence of the state vector on the parameters is not known, the use of reparametrization techniques is not feasible. Note that the majority of parameters in (17) appear in the argument of the sigmoid regulation-expression function Φ. If the model (17) is used to obtain only qualitative information, such as the signs of regulatory weights, then the particular mathematical form of this function is of no importance [25]. However, it has to be studied if the choice of the sigmoid function affects the determinability of parameters.

Conclusion

In this paper we have applied the Levenberg-Marquardt (LM) optimization method to obtain a set of parameter estimates for gap gene circuit models. We then used statistical analysis to study the quality of these estimates, i.e. how well the parameters are determined with the available experimental data. Our analysis shows that none of the model parameters can be determined individually with reasonable accuracy due to correlations between parameters. Therefore, current gene circuit models cannot be used as a tool to infer quantitative regulatory weights for the gap gene network.

With this caveat in mind, however, it is still possible to draw qualitative conclusions on the regulatory topology of the gap gene network. These conclusions are weaker than, but entirely consistent with those made by only considering the values of parameter estimates [7, 8, 12, 13]. Therefore, they are also fully consistent with genetic and molecular evidence on gap gene regulation (see [7], and references therein). Our analysis allows us to determine exactly which interactions predicted by gene circuit analysis remain ambiguous. If considered in isolation, this ambiguity poses a serious problem for inferring regulatory interactions from expression data as it leaves important aspects of gap gene regulation unresolved. We show that more and better data will not necessarily improve parameter estimates. On the other hand, our results using fixed threshold parameters indicate that at least some of these ambiguous aspects can be resolved by reducing parameter correlations through fixing some parameters in the optimization. Others may disappear if more realistic models are used: for instance, models incorporating protein production delays, or reduced models incorporating cad and tll as time-variable external inputs as these genes are not regulated by gap genes themselves. Further research into parameter correlations within complex network models will be required to explore what kind of improved models or optimization constraints lead to better parameter determinability.

Still it remains doubtful whether an approach can be found which leads to complete parameter determinability. The study by Gutenkunst et al. [5] indicates that parameter sloppiness is a very common phenomenon among models used in systems biology. Our results corroborate this as it is difficult to see how, for example, correlations between regulatory weights could be eliminated from a network model. The situation is not hopeless, however, as genetic evidence can help us clarify these remaining ambiguous interactions. Such evidence is itself ambiguous in many cases, as it is often difficult to interpret mutant phenotypes. But it is also complementary to and completely independent of the evidence gained by reverse engineering approaches such as the one used here [7]. This means that its ambiguities are often complementary to the ones described in this study. For instance, while cross-repressive feedback between hb and kni is not supported (but also not contradicted) by our current models, it is very strongly supported by genetic evidence [30]. Based on this, we conclude that systems biology approaches are most successful if they combine experimental and theoretical insights in a consistent and powerful manner.

Other biological interpretations of parameter sloppiness are possible. Our results on the interactions between hb and kni indicate that although present in the Drosophila embryo, they are not strictly necessary to maintain correct gap gene expression, and may be at least partially redundant with or replaceable by other regulatory interactions in the system. It is interesting to think about this from an evolutionary point of view, as such redundancy or replaceability allows the network to be re-wired while maintaining correct gap gene expression.

References

Murray JD: Mathematical Biology. 2002, Berlin: Springer
Google Scholar
Ashyraliyev M, Nanfack YF, Kaandorp JA, Blom JG: Parameter estimation for biochemical models. FEBS J.
Jaqaman K, Danuser G: Linking data to models: data regression. Nature Reviews Molecular Cell Biology. 2006, 7: 813-819. 10.1038/nrm2030
Article CAS PubMed Google Scholar
Ljung L: System, Identification – Theory For the User. 1999, New Jersey: Prentice Hall
Google Scholar
Gutenkunst RN, Waterfall JJ, Casey FP, Brown KS, Myers CR, Sethna JP: Universally sloppy parameter sensitivities in systems biology models. PLoS Computational Biology. 2007, 3: e189- 10.1371/journal.pcbi.0030189
Article PubMed Central Google Scholar
Akam M: The molecular basis for metameric pattern in the Drosophila embryo. Development. 1987, 101: 1-22.
CAS PubMed Google Scholar
Jaeger J, Blagov M, Kosman D, Kozlov KN, Manu , Myasnikova E, Surkova S, Vanario-Alonso CE, Samsonova M, Sharp DH, Reinitz J: Dynamical analyses of regulatory interactions in the gap gene system of Drosophila melanogaster. Genetics. 2004, 167: 1721-1737. 10.1534/genetics.104.027334
Article PubMed Central CAS PubMed Google Scholar
Jaeger J, Surkova S, Blagov M, Janssens H, Kosman D, Kozlov KN, Manu Myasnikova E, Vanario-Alonso CE, Samsonova M, Sharp DH, Reinitz J: Dynamic control of positional information in the early Drosophila embryo. Nature. 2004, 430: 368-371. 10.1038/nature02678
Article CAS PubMed Google Scholar
Jaeger J, Reinitz J: On the dynamic nature of positional information. BioEssays. 2006, 28: 1102-1111. 10.1002/bies.20494
Article CAS PubMed Google Scholar
Poustelnikova E, Pisarev A, Blagov M, Samsonova M, Reinitz J: A database for management of gene expression data in situ. Bioinformatics. 2004, 20: 2212-2221. 10.1093/bioinformatics/bth222
Article CAS PubMed Google Scholar
FlyEx Database. http://flyex.ams.sunysb.edu/flyex
Nanfack YF, Kaandorp JA, Blom J: Efficient parameter estimation for spatio-temporal models of pattern formation: Case study of Drosophila melanogaster. Bioinformatics. 2007, 23: 3356-3363. 10.1093/bioinformatics/btm433
Article Google Scholar
Perkins TJ, Jaeger J, Reinitz J, Glass L: Reverse Engineering the Gap Gene Network. PLoS Computational Biology. 2006, 2: e51- 10.1371/journal.pcbi.0020051
Article PubMed Central PubMed Google Scholar
Seber GAF, Wild CJ: Nonlinear regression. 1988, New York: John Wiley & Sons. Inc
Google Scholar
Draper NR, Smith H: Applied regression analysis. 1981, New York: John Wiley & Sons, Inc
Google Scholar
Mendes P, Kell DB: Non-linear optimization of biochemical pathways: applications to metabolic engineering and parameter estimation. Bioinformatics. 1998, 14: 869-883. 10.1093/bioinformatics/14.10.869
Article CAS PubMed Google Scholar
Nocedal J, Wright SJ: Numerical Optimization. 1999, New York: Springer
Book Google Scholar
Marquardt DW: An algorithm for least-squares estimation of nonlinear parameters. SIAM J Appl Math. 1963, 11: 431-441. 10.1137/0111030.
Article Google Scholar
Nonlinear least squares estimation (CWI report, NW 17/75 by J C P Bus, B Domselaar and J Kok). , http://repository.cwi.nl:8888/cwi_repository/docs/I/09/9052A.pdf
Golub GH, Loan CF: Matrix computations. 1996, Baltimore: Johns Hopkins UP
Google Scholar
Hemker PW: Numerical methods for differential equations in system simulation and in parameter estimation. Anal Sim Biochem Sys. 1972, 25: 59-80.
Google Scholar
Stortelder W: Parameter Estimation in Nonlinear Dynamical Systems. PhD thesis. 1998, University of Amsterdam. Mathematics and Computer Science Faculty
Google Scholar
Gear CW: Numerical initial value problems in ordinary differential equation. 1971, Englewood Cliff: Prentice Hall
Google Scholar
Aster RC, Borchers B, Thurber CH: Parameter Estimation and Inverse Problems. 2005, USA: Elsevier
Google Scholar
Reinitz J, Sharp DH: Mechanism of eve stripe formation. Mech Dev. 1995, 49: 133-158. 10.1016/0925-4773(94)00310-J
Article CAS PubMed Google Scholar
Simcox AA, Sang JH: When does determination occur in Drosophila embryos?. Developmental Biology. 1983, 97: 212-221. 10.1016/0012-1606(83)90078-7
Article CAS PubMed Google Scholar
Foe VE, Alberts BM: Studies of nuclear and cytoplasmic behaviour during the five mitotic cycles that precede gastrulation in Drosophila embryogenesis. The Journal of Cell Science. 1983, 61: 31-70.
CAS PubMed Google Scholar
Myasnikova E, Samsonova A, Kozlov K, Samsonova M, Reinitz J: Registration of the expression patterns of Drosophila segmentation genes by two independent methods. Bioinformatics. 2001, 17: 3-12. 10.1093/bioinformatics/17.1.3
Article CAS PubMed Google Scholar
Rivera-Pomar R, Lu X, Perrimon N, Taubert H, Jäckle H: Activation of posterior gap gene expression in the Drosophila blastoderm. Nature. 1995, 376: 253-256. 10.1038/376253a0
Article CAS PubMed Google Scholar
Clyde DE, Corado MSG, Wu X, Paré A, Papatsenko D, Small S: A self-organizing system of repressor gradients establishes segmental complexity in Drosophila. Nature. 2003, 426: 849-853. 10.1038/nature02189
Article CAS PubMed Google Scholar
Jaeger J, Sharp DH, Reinitz J: Known maternal gradients are not sufficient for the establishment of gap domains in Drosophila melanogaster. Mechanisms of Development. 2007, 124: 108-128. 10.1016/j.mod.2006.11.001
Article CAS PubMed Google Scholar

Download references

Acknowledgements

MA and JB acknowledge support from the Dutch BSIK/BRICKS project and from NWO's 'Computational Life Science' program, projectnr. 635.100.010. JJ is supported by the UK Biotechnology and Biological Sciences Research Council (grant number BB/D00513). We would like to thank Prof. P. W. Hemker and Prof. J. G. Verwer for their valuable comments and suggestions. We are also grateful to Yves Fomekong Nanfack for providing us with the parameter estimates obtained from the global search. Finally, we thank the anonymous reviewers for their valuable remarks.

Author information

Authors and Affiliations

CWI, Kruislaan 413, Amsterdam, SJ, 1098, The Netherlands
Maksat Ashyraliyev & Joke G Blom
Department of Zoology, Laboratory for Development & Evolution, University Museum of Zoology, University of Cambridge, Cambridge, SJ, CB2 3EJ, UK
Johannes Jaeger

Authors

Maksat Ashyraliyev
View author publications
You can also search for this author in PubMed Google Scholar
Johannes Jaeger
View author publications
You can also search for this author in PubMed Google Scholar
Joke G Blom
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Maksat Ashyraliyev or Joke G Blom.

Additional information

Authors' contributions

MA performed the optimization and the statistical analysis; JB proposed the research and supervised the work; JJ contributed to the biological interpretation of the results. All authors contributed to the manuscript.

Electronic supplementary material

12918_2008_240_MOESM1_ESM.pdf

Additional file 1: Supplementary Material. This file (supplment.pdf) contains the material which is not given in the paper due to the space limitations. This file consists of three parts. Section 1 outlines some technical aspects of the methodology described in the paper. Section 2 presents the results obtained in the full search case. It includes plots for model responses compared to the data, scatter plots of parameters, plots of the confidence intervals for parameter estimates, discussion about the correlations between parameters, and results of parameter estimation with fixed weight for regulation of kni by Hb. Section 3 presents the results in the case of fixed promoter thresholds, including plots for model responses compared to the data, scatter plots of parameters, plots of the confidence intervals for parameter estimates, and discussion about the correlations between parameters. (PDF 2 MB)

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Authors’ original file for figure 5

Authors’ original file for figure 6

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Ashyraliyev, M., Jaeger, J. & Blom, J.G. Parameter estimation and determinability analysis applied to Drosophila gap gene circuits. BMC Syst Biol 2, 83 (2008). https://doi.org/10.1186/1752-0509-2-83

Download citation

Received: 21 May 2008
Accepted: 25 September 2008
Published: 25 September 2008
DOI: https://doi.org/10.1186/1752-0509-2-83

Parameter estimation and determinability analysis applied to Drosophila gap gene circuits

Abstract

Background

Results

Conclusion

Similar content being viewed by others

Background

Methods

Parameter Estimation by the Levenberg-Marquardt Method

Statistical Analysis of Parameter Estimates

The Biological Test Problem: Gap Gene Circuits

Results and Discussion

Optimization Results

Parameter Determinability

Parameter Estimation with Fixed Promoter Thresholds

Parameter Correlations: Data vs Model

Conclusion

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding authors

Additional information

Authors' contributions

Electronic supplementary material

Authors’ original submitted files for images

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation