1 Introduction

All organisms must respond to changes and stresses in their environment to survive and reproduce. Such environmental stresses include changes in nutrient or oxygen availability, changes in osmolarity, salinity, or pH, the presence of reactive oxygen species or other damaging agents, and sudden or large changes in temperature, either an increase (heat shock) or decrease (cold shock). Organisms respond to environmental stresses through characteristic programs of gene expression. Among the most interesting and challenging problems in understanding this environmental stress response is the dynamic behavior of gene expression networks within the cell. The careful regulation of these networks is a fundamental activity of the organism. In this paper, we discuss the development and application of a dynamical systems model for regulation of gene expression during the early response to cold shock in budding yeast.

Our focus on Saccharomyces cerevisiae and cold shock is motivated by a number of factors. These yeast have been studied extensively, especially their response to heat shock, which occurs through the induction of heat shock proteins (Morano et al. 2012). These heat shock proteins are universally conserved across all organisms and have been very well characterized. However, the response to cold shock has been less well studied, although its effects on cellular physiology are known (Thieringer et al. 1998; Al-Fageeh and Smales 2006; Aguilera et al. 2007). Decreases in temperature cause a reduction in membrane fluidity, a reduction in enzymatic activity, the stabilization of DNA and RNA secondary structures, and the impairment of protein synthesis. Similarly to heat shock, cold shock does induce the expression of a set of “cold shock” proteins; however, these proteins are not universally conserved. Much remains to be discovered about the molecular mechanisms and regulation of the response to cold temperatures in yeast. The model we develop provides some new tools for investigating the regulation of this response and provides new biological insight into this phenomenon.

Biologically, computationally, and mathematically, parameter estimation remains a significant challenge for the modeling of gene regulatory dynamics, even for medium-scale networks of just 5–10 interacting genes, (Cao and Zhao 2008; Lillacci and Khammash 2010; Kuwahara et al. 2013; Fan et al. 2015). The large number of parameters, the highly nonlinear dynamics of gene regulation, and the noisiness and relative sparseness of time course microarray data make parametric inference a difficult problem requiring mathematical and numerical care. Our approach integrates numerical solution of the ODE model, state-of-the-art optimization algorithms, and novel use of penalization to infer parameters for a relatively large network with few temporal data points. Our results demonstrate that large-scale parameter estimation can be successfully performed for nonlinear dynamic gene regulatory networks using sparse, noisy microarray data.

Our model involves a few key ingredients. One is a network of transcription factors that activate or repress transcription of genes needed for the cell to respond to the cold shock stress. The network itself can be thought of as a simple qualitative model in its own right, and many investigators have explored the problem of network inference from gene expression data (for a review see Hecker et al. 2009 and references therein). Instead, we start with an experimentally defined network so that we can take the next step of developing quantitative production and degradation dynamics for the transcription factors involved in the cold shock response.

We then develop parameter estimation techniques for extracting rate parameter information from time course microarray data obtained from cold shock experiments to infer the direction (activation or repression) and magnitude of influence that regulatory transcription factors have on their target genes. Other models of this type have either been developed on relatively simple small gene circuits (e.g., Cao and Zhao 2008) or have used data from biological systems that are already well understood (e.g., the yeast cell cycle, Vu and Vohradsky 2007), so little new biological insight is gained. The novelty of our approach is to take a problem where relatively little is known about the biology and create a meaningful dynamical model of the system. A number of methods have been proposed and implemented for fitting differential equation models to data (see, e.g., Cao and Zhao 2008, for an excellent review). In this paper, we discuss a penalized nonlinear least squares approach to parameter estimation, which we have applied with success to a number of problems, ranging from the dynamics of college drinking (Ackleh et al. 2009) and subsurface contaminant transport (Bailey and Fitzpatrick 1997) to inverse interferometry (Fitzpatrick and Keeling 1997) and liquid chromatography (Fitzpatrick 1993). This approach has largely been avoided in gene regulatory models due to its mathematical and numerical complexity. The advantages of our approach over extended Kalman filtering (Lillacci and Khammash 2010; Fan et al. 2015) or profiling methods (Cao and Zhao 2008) is that appropriate treatment of the penalized least squares allows the estimation of a fairly high-dimensional parameter from relatively sparse temporal data, a common challenge with microarrays and other measurement technologies. Here we compare the solution of the differential equations to microarray data from cold shock experiments on S. cerevisiae, using penalized least squares in an innovative way, to extract parameter estimates and determine the regulatory directions (activation or repression) and the strengths of the regulatory relationships of controlling genes on targets in a complex feedback network of 21 genes (nodes) and 31 regulatory relationships (edges).

The paper is organized as follows. In Sect. 2, we describe the model organism S. cerevisiae, the environmental stress of cold shock, and the determination of a regulatory network structure. The nature of the microarray data that we use for parameter estimation is discussed in Sect. 3, while Sect. 4 is devoted to the mathematical model and the estimation problem. Section 5 provides the results of our parameter estimation process. We close the paper in Sect. 6 with some concluding remarks that discuss the results and suggest future directions.

2 Regulation of the Response to Cold Shock in \({\mathbf {S. cerevisiae}}\)

As a single-celled eukaryote, budding yeast, Saccharomyces cerevisiae, must respond to changes and stresses in the environment such as changes in nutrient or oxygen availability, changes in osmolarity, salinity, or pH, the presence of reactive oxygen species or other damaging agents, and sudden or large changes in temperature, either an increase (heat shock) or decrease (cold shock; Dawes 2004). Yeast respond to environmental stresses through characteristic programs of gene expression, called the Environmental Stress Response (ESR; Gasch et al. 2000; Causton et al. 2001). With the advent of high-throughput, whole-genome methods such as DNA microarrays, programs of gene expression, including the ESR, have been elucidated as never before. These data are key to developing a fundamental understanding of cell function. Mechanistic models of gene regulatory networks that have been validated by experiment can then yield additional insights. This paper details modeling and parameter estimation for a gene regulatory network controlling the cold shock response in yeast.

Unlike the response to heat shock and other environmental stresses, the transcriptional response to cold shock has been relatively less well studied in yeast. The previous studies that exist have revealed that the response varies depending on the temperature and the length of time spent at the cold temperature. The cold shock response occurs between the temperatures of 10 and \(18\,^{\circ }\hbox {C}\) (Sahara et al. 2002; Schade et al. 2004; Tai et al. 2007), and the near-freezing response occurs between 0 and \(10\,^{\circ }\hbox {C}\) (Kandror et al. 2004; Murata et al. 2006). The early response occurs after 10 min up to 2 h of cold temperatures, and the late response occurs after 12 h of cold or near-freezing temperatures (Kandror et al. 2004; Schade et al. 2004), although the exact transition time between the early and late responses has not been definitively determined. However, it is clear from these studies that the early and late responses represent two different biological phenomena of first adaptation by the cells to the cold temperature, followed by acclimation. These two distinct processes require the expression of different sets of genes and different sets of regulatory transcription factors to regulate them. Indeed, these studies revealed that the cold shock late response, but not the early response, include the ESR genes induced by many environmental stresses. Through the use of gene deletion experiments, Schade et al. (2004) and Kandror et al. (2004) also determined that the ESR genes in the late response to cold and near-freezing temperatures, respectively, were regulated by the Msn2 and Msn4 transcription factors, as they are during other environmental stresses. However, the transcription factors responsible for the induction of the early response genes and the overall regulatory mechanism governing this early response remain largely unknown. Furthermore, there is ample evidence to suggest that environmental stress response pathways overlap, as is seen by the induction of the same set of ESR genes under multiple stress conditions (Gasch et al. 2000; Causton et al. 2001). Finally, DNA microarray experiments comparing gene expression changes when the Leu3 transcription factor was deleted or overexpressed has revealed that many genes that are not direct targets of that factor were affected in the experiment due to indirect effects (Tang et al. 2006). These indirect effects are most likely due to regulatory relationships between transcription factors. Thus, these questions remain: (1) which transcription factors control the early response to cold shock in S. cerevisiae? (2) what is the extent of ESR pathway overlap? (3) which part of the early transcriptional response to cold shock is due to indirect effects of other transcription factors? To approach these questions, we need complementary types of high-throughput genomic data, the tools of mathematical biology, and the perspective of systems biology.

A great deal of research has focused on the empirical identification of the network structure from microarray or other genomic data. An established method called genome-wide location analysis, which uses chromatin immunoprecipitation with epitope-tagged transcription factors followed by hybridization to DNA microarrays spotted with intergenic sequences (ChIP-chip), has determined the relationships between transcription factors and the target genes they regulate on a global scale in budding yeast (Lee et al. 2002; Harbison et al. 2004). Starting with the network of 106 transcription factors identified by Lee et al. (2002), we considered only those transcription factors that had been previously annotated as involved in the ESR. We also considered the transcription factors that regulated them and those that they regulated, regardless of annotation. The confidence level of these regulatory relationships reported by Lee et al. (2002) was \(p<0.001\). The largest connected cluster of nodes became the gene regulatory network, comprised of 21 nodes and 31 edges. All of the edges were confirmed with the data from a second genome-wide location dataset from Harbison et al. (2004). The standard names for the transcription factors in the gene regulatory network are listed in “Appendix”, along with their systematic names and aliases from the Saccharomyces Genome Database (http://www.yeastgenome.org), and the network structure itself is pictured in Fig. 1. Each node simultaneously represents the gene, the mRNA, and the protein. For the sake of simplicity, in the rest of the paper, we will refer to the nodes as “genes” even though the node represents all three entities. Each directed edge represents the regulatory relationship between two nodes. This means that the transcription factor encoded by the gene at the originating node either activates or represses expression of the gene at the recipient or target node. We emphasize that the arrows do not denote activation here; rather, we are indicating the directionality of regulation.

Fig. 1
figure 1

Cold shock gene regulatory network diagram. The arrows indicate the direction of regulation (transcription factor to target) and do not represent activation here. Each edge is annotated with the weight parameter index from Table 5, referred to in Figs. 9, 10, 11, 12, 13 and 14

This graph contains a total of 21 nodes and 31 edges. Of the 21 nodes, 15 are regulated by at least one gene in the network. The in-degree and out-degree distributions of the nodes are given in Fig. 2.

Fig. 2
figure 2

In-degree (dark) and out-degree (light) distribution of directed edges in the gene regulatory network

One observation from this histogram is that 6 nodes have in-degree 0, meaning that those 6 nodes are not controlled by any of the genes in the network. Furthermore, four of the nodes have out-degree 0, meaning that they do not control any of the genes in the network. One gene, RAP1, has out-degree 5, making it influential to the most genes. The gene YAP6 is influenced by 6 genes. Four genes show autoregulation: AFT1, NRG1, RAP1, and YAP6. The deepest regulatory chain includes 5 nodes (originating at SKN7), with 4-node chains originating at CIN5, MAC1, PHD1, SKN7, and YAP1. Most nodes have a single input or are part of a simple regulatory chain, but several participate in complex feedforward motifs (CIN5, ROX1, and YAP6; SKN7, YAP1, and ROX1). Furthermore, there appears to be two distinct subnetworks (upper left and lower right of Fig. 1) that are only connected through edges originating at ABF1 and PHD1. This complexity of network structure makes it difficult to hypothesize up front what the regulatory dynamics might be and necessitates use of a model to explicate them.

After defining the network topology, the next step in the modeling process is the determination of the dynamics, including the signs (activation/repression) and the influence magnitudes of the regulatory relationships. However, we first describe in more detail the nature of the microarray data that we will use to infer parameters in the model.

3 Cold Shock DNA Microarray Data

We are grateful to Babette Schade for providing the complete microarray dataset for wild type yeast subjected to cold shock as published in Schade et al. (2004). In their experiment, wild type Saccharomyces cerevisiae strain BY4743 grown at \(30\,^{\circ }\hbox {C}\) in rich YEPD medium was shifted to \(10\,^{\circ }\hbox {C}\). Samples were collected before cold shock \((t_{0})\), and after \(10 (t_{10})\), \(30 (t_{30})\), \(120 (t_{120})\) minutes, and 12 and 60 h of cold shock. We restricted our analysis to the first three cold shock timepoints because we are specifically interested in the early response to cold temperatures in yeast. As discussed in Sect. 2, there are substantial biological differences between the early and late cold shock responses which would lead to substantial differences in the dynamics of the early response which occurs on the timescale of minutes to hours and the late response which occurs on the timescale of hours to days. The dataset we obtained had three replicates for the \(t_{0}\) timepoint, seven replicates of the \(t_{10}\) timepoint, six replicates of the \(t_{30}\) timepoint, and four replicates of the \(t_{120}\) timepoint. We assumed that each replicate of the \(t_{0}\) timepoint consisted of a competitive hybridization of Cy3-labeled cDNA derived from one culture grown at \(30\,^{\circ }\hbox {C}\) with Cy5-labeled cDNA derived from a different culture grown at \(30\,^{\circ }\hbox {C}\). We also assumed that the replicates of the \(t_{10}\), \(t_{30}\), and \(t_{120}\) timepoints consisted of competitive hybridizations of labeled cDNA from independently cold shocked cultures to labeled cDNA from control cultures grown at \(30\,^{\circ }\hbox {C}\). The data we obtained had already been subjected to within-chip normalization. We performed the following manipulations on the data. The expression ratios (fold changes) were \(\hbox {log}_{2}\) transformed. Between-chip normalization was carried out (see Stekel 2003 for a detailed discussion of microarray normalization). Each replicated measurement of \(\hbox {log}_{2}\) ratio (that is, each individual microarray chip) was mean removed and scaled by subtracting the average \(\hbox {log}_{2}\) ratio for all of the spots on the microarray from each spot and dividing each spot by the standard deviation of all spots on the microarray. For each gene at each timepoint we computed the average \(\hbox {log}_{2}\) ratio of the replicate measurements to produce one data point, along with the standard deviation. We also computed a modified t statistic to determine whether each average \(\hbox {log}_{2}\) ratio was significantly different than zero and a p value based on the t statistic. We should note that the variability and the small number of replicates make for tests that are not very powerful. Table 1 shows the number and percentage of genes in the dataset with significant changes in gene expression at three different p value cut-offs, \(p<0.05\), \(p<0.01\), and \(p<0.001\). The \(t_{0}\) timepoint has very few genes with significant changes in expression as would be expected when labeled cDNA from two control cultures are hybridized against each other. However, the fact that 2.6 % of the genes did actually meet the \(p<0.05\) criterion for significant differential expression points to the variability, both technical and biological, in this experimental system. The other timepoints all have a greater number of genes showing a significant change in expression than would be expected by chance using that particular p value cut-off, except for the \(t_{30}\) timepoint at \(p<0.001\). This demonstrates that the yeast did indeed respond to the cold shock treatment at \(10\,^{\circ }\hbox {C}\) with changes in gene expression.

Table 1 Number and percentage of genes with significant changes in gene expression at each timepoint for three different p value cut-offs

In Table 2, we provide the average \(\hbox {log}_{2}\) ratios and p values for the 21 genes in our network. Notably, only nine genes in the network show significant changes in gene expression at \(p<0.05\) at any timepoint. ABF1, FHL1, and HSF1 show significant decreases in gene expression at one or more cold shock timepoints, and MAC1, MSN4, RAP1, and RPH1 show significant increases in gene expression at one or more cold shock timepoints. AFT1 and ROX1 have \(p<0.05\) for decreases in expression observed at the \(t_{0}\) timepoint, when no change in expression is expected.

Table 2 Average \(\hbox {log}_{2}\) ratios of expression and p values derived from Schade et al. (2004)

4 Mathematical Modeling of Regulatory Networks

Gene regulation can be modeled with a wide variety of mathematical structures at many levels of resolution. Schlitt and Brazma (2007) review four levels at which gene regulatory networks have been modeled: (1) parts lists, (2) topology models, (3) control logics models, and (4) dynamic models. Karlebach and Shamir (2008) provide a similar breakdown of gene regulatory modeling, into logical models, continuous models, and single-molecule models. In many cases, trade-offs between the number of genes included in the model and the level of detail of the model govern the modeling structure that is chosen and applied. Parts lists and topology models concern themselves with the identity and connectivity of genes in the model on the scale of the entire genome, transcriptome, or proteome, while kinetic models often focus on small systems where detailed experimental data are available (e.g., the \(O_\mathrm{R}\) control system of bacteriophage lambda, Shea and Ackers 1985). In the case of the early cold shock response, we want to scale down from the whole-genome topology model to more closely investigate a smaller gene regulatory network. Because a master regulator for this response, akin to HSF1 for heat shock, has not been identified for cold shock, our network must still be large enough to include all potential regulators annotated as being involved in the ESR. And because we want to discover the relative influence of this set of factors and their activation/repression relationships, we want to investigate the dynamics of the network. In short, to understand the cell’s early response to cold shock, we must combine topology and dynamic models on a medium scale in a way that has predictive power to understand the interactions in gene regulatory networks.

Taking a step in that direction, we build a model of gene regulation that adds the dynamics of transcription factor production onto their interaction network. Research along these lines has applied differential equation structures (e.g., Alon 2007; Wilkinson 2006; Vohradský 2001; Vu and Vohradsky 2007; Kauffman et al. 2003; Climescu-Haulica and Quirk 2007; Chen et al. 2005, 1999; Blossey et al. 2008), typically treating the problem as one of mass balance.

The basic balance concept is one of production and degradation. The equation

$$\begin{aligned} \dot{x}_i (t)=p_i (x(t))-d_i x_i (t) \end{aligned}$$
(1)

in which the function \(p_i\) gives the production rate, and the linear term \(d_i x_i (t)\) is the degradation rate, defines an in-flow, out-flow conservation principle for the level of expression \(x_i (t)\) over time. The functions \(p_{i}\) will of course depend on expression levels of all the genes controlling gene i. Commonly used structures for the production functions include linear (Chen et al. 1999), quadratic (Angeli et al. 2009; Sontag 2007), Michaelis-Menten (Alon 2007; Cao and Zhao 2008), and sigmoidal (Chen et al. 2005; Mendoza and Xenarios 2006; Smolen et al. 2000; Vu and Vohradsky 2007). The form of \(p_i\) is thus a primary modeling issue.

The production function that we adopt here, based on a sigmoidal production model proposed in Vu and Vohradsky (2007), takes the general form

$$\begin{aligned} p_i (x(t),\theta )=\frac{P_i }{1+\exp \left( {-\sum \limits _j {w_{ij} (x_j (t)-\tau _{ij} )} } \right) } \end{aligned}$$
(2)

in which \(P_i\) is the maximal rate of expression (i.e., the production rate at full production activation), \(w_{ij}\) is the interaction weight of gene j in regulating gene i, and \(\tau _{ij}\) is a threshold expression level at which production switches “on” and “off.” In this functional form, the parameter \(\theta \) captures the weights, thresholds, and possibly even the baseline production rates.

We first note that the interaction network is contained in the weight parameters. If the weight \(w_{ij} \) is nonzero, then an edge connects the production of gene or node i with the expression level \(x_j \). For example, the graph of Fig. 1 has 31 edges. We emphasize that the network is a directed graph: the expression of transcription factor j may affect that of i without the converse relationship necessarily holding. We also note that the sign of the weight governs the type of relationship: positive weights correspond to activation, while negative weights correspond to repression.

The functional form of the sigmoid \(S(u;w,\tau )=1/(1+e^{-w(u-\tau )})\) on which Eq. (2) is based is more easily understood with a graph. In Fig. 3, we show the basic shape of repression and activation production functions of the form \(S(w(u-\tau ))\) versus u.

Fig. 3
figure 3

Sigmoidal repression (top) and activation (bottom) functions

Roughly speaking, we think of production as turning on and off, depending on the expression levels of activating and repressing transcription factors. The weight governs the “boundary layer” between on and off states, and the threshold governs the input level at which the switch is thrown. For very large weights, the production function approximates the unit step or Heaviside function with jump positioned at the threshold value. For an activator, expression levels above the threshold lead to production, while expression levels below turn production off. Likewise, repressors turn production off at higher-than-threshold levels and turn production on when expression levels decrease below the threshold.

Generally speaking, the transient behavior of the system (1) must be determined numerically. Long-time behavior issues, such as equilibria and their stability, are quite difficult for systems of the size under study here: the specific example of cold shock in yeast we discuss below involves 21 state variables. Our interest in this paper is in the determination of parameters from data, so we do not undertake any analysis of long-time behavior, other than to note that the work of Angeli et al. (2009) provides an interesting approach to stability through the notion of a coherent system.

With a model of dynamic regulation in hand, we now turn to the determination of parameter values for the model. The system of differential equations we have presented in (1) is a complex model with a large number of parameters. When considered in the context of fitting this model to microarray data, which is expensive and time consuming to collect, we must take great care in our parameter estimation procedures. Here we discuss a number of issues associated with parametric dependence and parameter estimation.

As discussed in Sect. 3, the microarray data we use provides a measurement of the level of gene expression activity at the time of measurement relative to the initial expression at \(t_{0}\). We denote by \(\hat{{x}}_i^r (t_k )\) the rth replicate observation of gene i expression level at time \(t_k \). The parameter identification process then becomes a problem of comparing the model form

$$\begin{aligned} \dot{x}_i (t;\theta )=\frac{P_i }{1+\exp \left( {-\sum \limits _j {w_{ij} (x_j (t;\theta )-\tau _{ij} )} } \right) }-d_i x_i (t;\theta ) \end{aligned}$$

to the observed data. Here we have explicitly included the dependence of the state variable on the rate and network parameters, which comprise the vector \(\theta \).

The most common approach to the estimation of parameters for models such as our gene regulatory network model is that of least squares. A form of nonlinear regression, the least square approach compares model output to observed data and chooses the parameter estimate by minimizing this discrepancy. In particular, the function

$$\begin{aligned} J(\theta )=\sum _{s=1}^R {\sum _{k=1}^{N_T } {\sum _{i=1}^{N_G } {\left| {\log _2 \hat{{x}}_i^s (t_k )-\log _2 x_i (t_k ,\theta )} \right| ^{2}} } } \end{aligned}$$

is to be minimized, in which \(\hat{{x}}\) denotes observed expression levels from the microarray data, and \(x(t,\theta )\) denotes the parameter dependent solution of the differential equation. Here we are assuming R repetitions of the experiment, which is observed at times \(t_k ,k=1,2,\ldots ,N_T\) for all genes in the network \((i=1,2,\ldots ,N_G )\). We also note the use of the \(\hbox {log}_{2}\) transform, which as noted in Sect. 3 is commonly applied to microarray data.

This type of estimation problem has been studied by a number of investigators, including the definitive text (Gallant 1987), the papers (Banks and Fitzpatrick 1990; Fitzpatrick 2008) and the monograph (Huet et al. 2004).

We note that the model requires potentially a very large number of parameters. In the “worst” case, if the regulatory network forms a connected graph with n nodes, then there are \(n^{2}\) weights and \(n^{2}\) thresholds. While the number of parameters is a serious concern, the difficulty in identifying the thresholds is perhaps the most significant problem. Note that

$$\begin{aligned} \sum _j {w_{ij} (x_j (t)-\tau _{ij} )} =\sum _j {w_{ij} x_j (t)} -\sum _j {w_{ij} \tau _{ij} =} \sum _j {w_{ij} x_j (t)} -b_i , \end{aligned}$$

where

$$\begin{aligned} b_i =\sum _j {w_{ij} \tau _{ij} } \end{aligned}$$

defines a new parameter, \(b_{i}\). We note that, for any choice of weights with at least two being non-zero, there are an infinite selection of thresholds that would produce identical model dynamics, making the thresholds non-identifiable. Thus, for the purposes of parameter identification, we reduce the thresholds down to the b parameters. This parameterization was also used by Vu and Vohradsky (2007). While the individual threshold parameterization holds a slightly more intuitive meaning, in terms of the expression level in each controller gene that “turns the switch,” the b parameter represents a “net threshold” at which the combined level of activities leads to switching.

We thus denote by \(\theta \) the parameter vector \(\theta =(w,b,P),\) in which the number of individual w’s is governed by the total number of edges in the network, the number of b’s is governed by the sum of the in-degrees of each node, and the number of P’s is governed by the number of nodes. As noted in Sect. 2, our network involves 31 weights, 15 b’s, and 21 production rates.

We denote by \(\hat{{\theta }}\) the minimizer of the least squares cost. Generally speaking, one must determine this minimizer numerically with an iterative optimization procedure. Some theoretical results pertaining to the estimator, however, are available. For example, statistical results from the references above pertain to modeling the observations. If we assume that

$$\begin{aligned} \hat{{x}}_i^s (t_k )=x_i (t_k ,\theta ^{{*}})+\varepsilon _{ik}^s , \end{aligned}$$

where the errors \(\varepsilon _{ik}^s \) are zero mean, finite variance, independent and identically distributed random variables, then parameter estimator obeys a central limit theorem:

$$\begin{aligned} \sqrt{n}\left( {\hat{{\theta }}-\theta ^{{*}}} \right) \sim N(0,\Sigma ), \end{aligned}$$

as \(n\rightarrow \infty \), where \(\Sigma =\sigma ^{2}V^{-1}\) and \(\sigma ^{2}\) is the noise variance in the observations. The matrix V is the sensitivity matrix, given by

$$\begin{aligned} V=\sum _i { \int \limits _{t_0 }^{t_f } {\frac{\partial x_i }{\partial \theta }(t,\theta ^{{*}})\frac{\partial x_i }{\partial \theta }^{{T}}(t,\theta ^{{*}})\mathrm{d}t} } \end{aligned}$$

in which \(\frac{\partial x_i }{\partial \theta }\) denotes the gradient of gene i expression levels with respect to the parameter vector and with the superscript T as its transpose. The asymptotic as stated involves in-fill sampling in time, but other types of asymptotics are available (see, e.g., Banks and Fitzpatrick 1990; Fitzpatrick 2008; Gallant 1987). This matrix is related not only to the covariance of the parameter estimator but also to the numerical conditioning of the optimization procedure.

A more complex and robust approach to parameter estimation is Bayesian estimation. In Bayesian statistical inference, one begins with a prior distribution, \(\pi \). This distribution quantifies our a priori information concerning the parameters. The second component of the Bayesian approach is the conditional distribution of the measurement, given the parameter, \(p(x|\theta )\). Inference (e.g., estimation, hypothesis testing) is performed through the posterior distribution, computed via Bayes’ formula:

$$\begin{aligned} \pi (\theta |x)=\frac{p(x|\theta )\pi (\theta )}{\int \limits _\Theta {p(x|\theta ^{\prime })\pi (\theta ^{\prime })\mathrm{d}\theta ^{\prime }} },\quad \theta \in \Theta . \end{aligned}$$

An interpretation of Bayesian analysis that is particularly appealing in applications is that the prior and posterior represent quantifications of our uncertainty in parameter values before and after experimental data has been collected. A full coverage of Bayesian analysis, including philosophy, conceptual structure, analysis, and application, is contained in the excellent text of Berger (1993).

Bayesian maximum likelihood, in which one determines the parameter estimator by maximizing the posterior density, corresponds to a type of penalized least squares. If we assume, for example, that the errors \(\varepsilon _{ik}^s\) are zero mean normally distributed random variables and that the prior is of an exponential family, \(\pi (\theta )=C\exp \left( {-G(\theta )} \right) \), then the negative of the log of the posterior is

$$\begin{aligned} -\ln \left( {\pi (\theta |y)} \right)= & {} \frac{\sum \limits _{s=1}^S {\sum \limits _{k=1}^{N_T } {\sum \limits _{i=1}^{N_G } {\left| {\log _2 \hat{{x}}_i^s (t_k )-\log _2 x_i (t_k ,\theta )} \right| ^{2}} } } }{2\sigma ^{2}}+G(\theta )-\ln (C)\nonumber \\&+\,\frac{1}{2}\ln (2\pi \sigma ^{2}). \end{aligned}$$

The last two terms in this expression are independent of the parameter and thus irrelevant to parameter estimation. We may then take as our penalized least squares criterion

$$\begin{aligned} \tilde{J}(\theta )=\sum _{s=1}^S {\sum _{k=1}^{N_T } {\sum _{i=1}^{N_G } {\left| {\log _2 \hat{{x}}_i^s (t_k )-\log _2 x_i (t_k ,\theta )} \right| ^{2}} } } +G(\theta ) \end{aligned}$$

with the function G representing our prior level of uncertainty in the parameter’s value. The form of G is often taken to be a quadratic, an assumption equivalent to using a normal prior. This approach to estimation is also called penalized least squares. In this work, we use a quadratic G with a scaling factor \(\alpha \) to control the relative role of data noise and parameter sensitivity (where \(\theta _{0}\) denotes our best a priori estimate, as well as the prior mean):

$$\begin{aligned} \tilde{J}_\alpha (\theta )=\sum _{s=1}^S {\sum _{k=1}^{N_T } {\sum _{i=1}^{N_G } {\left| {\log _2 \hat{{x}}_i^s (t_k )-\log _2 x_i (t_k ,\theta )} \right| ^{2}} } } +\alpha \left| {\theta -\theta _0 } \right| ^{2}. \end{aligned}$$
Table 3 Degradation rates for transcription factor proteins

The choice of the parameter \(\alpha \) can be challenging, and there are many approaches to its selection, including cross validation (Golub et al. 1979) and the L-curve (Hansen and O’Leary 1993), the technique we examine here. The L-curve method involves the computation of a parametric plot of the least squares residual versus the penalty term, parameterized by \(\alpha \). For each \(\alpha \), we compute the minimizer \(\hat{{\theta }}_\alpha \)of \(\tilde{J}_\alpha \), and then we compute \(\tilde{J}_0 (\hat{{\theta }}_\alpha )\) (the least squares residual error) and \(r(\hat{{\theta }}_\alpha )=\left| {\hat{{\theta }}_\alpha } \right| ^{2}\) (the penalty). In this procedure, we plot \(r(\hat{{\theta }}_\alpha )\) versus \(\tilde{J}_0 (\hat{{\theta }}_\alpha )\) for each \(\alpha \). Typically, this plot takes the shape of an L, the corner of which is used to select an appropriate penalty level. The additional computation required to perform the L-curve analysis pays significant dividends in practice. Working from larger values of \(\alpha \) to smaller ones aids in the numerical optimization, as the output of the more highly penalized optimization provides an improved starting point for the less penalized one to follow.

In Sect. 5 below, we illustrate the penalized least squares and L-curve technique with microarray data as published in Schade et al. (2004). Having reviewed the basic concepts of dynamic modeling and parameter estimation, we turn to the specific problem of interest, inferring the regulatory dynamics of the early response to cold shock in S. cerevisiae.

5 Issues of Parameter Estimation and Model Sensitivity

In considering the particular aspects of our 21-state model, we see that there are 21 production rate parameters, 21 degradation rate parameters, 31 weights, and 15 net thresholds. Such a large number of parameters brings about a major challenge within the context of the microarray data we are using, in which we have 3–7 replicates reporting \(\hbox {log}_{2}\) fold changes in expression for each gene at 4 time points.

First, we will assume that the degradation rates are known or obtainable through other means. To find the degradation rate, we used published protein half-life data from Belle et al. (2006). We converted the half-life data values to the degradation rates by taking the natural log of the half-life and dividing by 2 (Table 3). For several transcription factors, the half-life data were not available, so we computed a median of the half-life values for the other transcription factors, converted it and used that value for those proteins. The median was based on the half-lives reported by Belle et al. (2006) for 142 proteins for which there were data out of 203 proteins annotated as transcription factors by Harbison et al. (2004).

The data we obtain from microarrays are in the form of expression relative to time 0 expression, \(x_i (t)=\textit{mRNA}_i (t)/\textit{mRNA}_i (0)\), leading to theoretical initial values of 1 for all expression levels in the dynamics. In all model simulations, we specify \(x_i (0)=1\) for all genes. Moreover, were the system not cold shocked, we would expect it to be in equilibrium at constant (relative) expression of 1 with no transcriptional regulation occurring, i.e., \(\sum \limits _j {w_{ij} -b_i } =0\). Thus, we would expect the non-cold-shocked system to have threshold values for \(x_{i}(t)\) equal to one, leading to the steady-state equations of \(\frac{P_i }{1+\exp (0)}-d_{i}\, ^{*}1=0\), or \(P_i =2d_i \).

We do not use this approach to estimate production rates for the following reason: several of the equations, associated with genes not receiving activation or repression signals from within the network, are independent of the parameter estimation process. Thus, these genes would be in steady state, and we could then drop them from the dynamical system and estimation. We do find that this estimation approach does give us a reasonable initial guess for any iterative optimization algorithm we apply to minimize the penalized least squares cost. We emphasize that this produces an initial guess for production rate parameters; it is not an initial condition for the dynamical system, nor are any cold shock dynamics assumed or forced to be in steady state.

The data we use for the penalized least squares estimation come from the experiments reported in Schade et al. (2004; see Sect. 3 and Table 2).

The least squares criterion takes the form

$$\begin{aligned} \tilde{J}_\alpha (\theta )=\sum _{k=1}^4 {\sum _{i=1}^{15} {\left| {\log _2 \hat{{x}}_i (t_k )-\log _2 x_i (t_k ,\theta )} \right| ^{2}} } +\alpha \left| w \right| ^{2}+\alpha \left| b \right| ^{2}+\alpha \left| P \right| ^{2}, \end{aligned}$$

in which we apply the L-curve method to determine an appropriate value for \(\alpha \). Our numerical implementation in MATLAB (Release R2010a) uses the optimization toolbox routine fmincon to perform the minimization. We use a constrained minimization algorithm to maintain non-negative production rates. In producing this L-curve, we start with a fairly large value of \(\alpha \), so that the minimization is dominated by the penalty. Initial guesses for the weights are all set to 1, and initial guesses for the net thresholds are set to 0. The production rates are initialized as discussed above. Once the minimization iteration has reached numerical convergence, the resulting optimal parameters are used to initialize the minimization for the next smaller penalty parameter. In Fig. 4, we provide the L-curve obtained through this procedure.

Fig. 4
figure 4

L-curve analysis of Schade et al. (2004) data as fit to model. Values of \(\alpha \) annotate the points

Fig. 5
figure 5

Genes ABF1, ACE1, AFT1, CIN5, CUP9, FHL1, GTS1, HAL9, HSF1 in the regulatory network: best fit model dynamics and data. Relative expression level is plotted as \(\hbox {Log}_{2}\) fold change (ratio) over time. The solid blue curve in each panel gives the model with the best fit parameters. The green circles represent the data, and the red crosses provide a 95 % confidence interval for the data. The upper point of the confidence interval for ABF1 at \(t_{0}\) extends outside of the graphic coordinate limits

Fig. 6
figure 6

Genes MAC1, MSN1, MSN4, NRG1, PHD1, RAP1, REB1, ROX1, RPH1 in the regulatory network: best fit model dynamics and data. Relative expression level is plotted as \(\hbox {Log}_{2}\) fold change (ratio) over time. The solid blue curve in each panel gives the model with the best fit parameters. The green circles represent the data, and the red crosses provide a 95 % confidence interval for the data

Fig. 7
figure 7

Genes SKN7, YAP1, and YAP6 in the regulatory network: best fit model dynamics and data. Relative expression level is plotted as \(\hbox {Log}_{2}\) fold change (ratio) over time. The solid blue curve in each panel gives the model with the best fit parameters. The green circles represent the data, and the red crosses provide a 95 % confidence interval for the data

The L-curve suggests three possible good \(\alpha \) values to select. In Fig. 9 we compare the weight, net threshold, and production parameter values for \(\alpha = 0.02\), 0.01, and 0.005. We selected the value \(\alpha =0.01\) for the remainder of the analyses presented below. In Figs. 5, 6 and 7, we show the dynamics of each gene’s expression. The solid blue curve in each panel gives the model with the best fit parameters. The green circles represent the data, and the red crosses provide a 95 % confidence interval for the data. Genes without significant changes in expression (Table 2; Fig. 8) show little change in dynamics over time.

The parameter estimates derived from the minimization are given in Table 4. The electronic supplementary material is a zipped file containing the corresponding input spreadsheet and output spreadsheet. The MATLAB code is available upon request.

Figure 8 shows the weights and experimental expression data displayed on the network diagram.

We conducted a number of additional computations to explore the quality of these estimates. First, we compared the estimated parameter values for several of the L-curve runs. In Fig. 9, we plot the weights, net thresholds (b’s), and production rates from three different penalty levels.

We see that the magnitudes of the parameters are different, but that the trends and patterns agree for all \(\alpha \) values in the penalized least squares estimation. The signs of the weight and thresholds, in particular, stay the same, and the production rates for a number of the genes are quite close. The parameter index is used for simplicity of plotting: Table 5 connects the indexing with the genes for Fig. 9 and subsequent figures (weight indexes are annotated on the edges of Fig. 1).

Fig. 8
figure 8

Weights and experimental expression data displayed on the network diagram. The sign of the weight (positive for activation and negative for repression) is represented by both the arrowhead type (pointed or blunt, respectively) and edge color (magenta and cyan, respectively, or gray for weights near zero). The magnitude of the weight is represented by the thickness of the edge; larger weights are represented by thicker lines. The weight value is noted next to each edge. Each node is colored based on the Schade et al. (2004) expression data. There are four stripes for the four timepoints, \(t_{0}\), \(t_{10}\), \(t_{30}\), \(t_{120}\). The stripe is gray if there was no significant change in expression at that timepoint, magenta if there was a significant increase in expression, and cyan if there was a significant decrease in expression \((p<0.05)\). An interactive version of this diagram can be viewed online at http://dondi.github.io/GRNsight/index.html

In a second test, we randomized the initial guesses for the iterative optimization scheme. We ran the minimization routine using 10 different initial guesses for each individual parameter. In the cases of the weights and thresholds, we sampled from a standard normal distribution, and for the production rates (which must be nonnegative), we multiplied the optimal production rates by a normal with mean 1 and standard deviation 0.03, truncating to 0 if negative. Using the penalty parameter \(\alpha =0.01\), we found that the resulting optimal parameter values were quite stable. In Tables 6, 7, and 8, we provide the standard deviations of the randomly selected initial guesses from the ten individual computations as well as the standard deviations of the resulting estimated parameters.

As a final test of the estimation routine’s accuracy, we performed some tests using model-generated data. We used the parameters in Table 4 to simulate data by solving the differential equation system (1). From the simulation, we used model-generated data in 5, 10, and 20 min time steps to conduct the penalized least squares fit, again with \(\alpha =0.01\). Figure 10 contains the resulting parameter estimates.

Since we have no a priori knowledge concerning the quality of the model or the parameter values, we cannot say with certainty that our fit, as detailed in Figs. 5, 6, and 7, and Table 4, are “correct” or even “close to the truth.” The additional tests of randomized initial guesses and model-generated data lend confidence, however, to the fit of the Schade et al. (2004) microarray data.

A final topic of interest along these lines is that of the sensitivity matrix. As discussed in Sect. 3 above, the matrix

$$\begin{aligned} V=\sum _i { \int \limits _{t_0 }^{t_f } {\frac{\partial x_i }{\partial \theta }(t,\theta ^{{*}})\frac{\partial x_i }{\partial \theta }^{T}(t,\theta ^{{*}})\mathrm{d}t} } \end{aligned}$$

measures the sensitivity of the least squares minimization to the parameters. With the parameterization under study, this matrix is of dimension \(67 \times 67\). The large sample asymptotic parameter covariance matrix \(\Sigma =\frac{\sigma ^{2}V^{-1}}{\sqrt{n}}\) resulting from the parameter estimation is illustrated in the heat map image of Fig. 11, which shows significant uncertainty in the weight and net threshold parameter estimates. In Fig. 11, the parameters are indexed according to the “Full Index” given in Table 5. Thus, the indices 1–31 count the weight parameters, the indices 32–46 count the net thresholds, and the indices 47–67 count the production rates (which clearly have the smallest uncertainty levels.

A heat map image of the sensitivity matrix is dominated by the production rates, and the image itself is not very illuminating. In Fig. 12, we show the eigenvalues and the eigenvectors of the sensitivity matrix V. Some interesting patterns can be detected.

The eigenvectors in the image are ordered in terms of largest to smallest eigenvalues (that is, from highest to lowest sensitivity). The eigenvectors \(V_i \) are ordered according to decreasing eigenvalues \(\left( {\lambda _i \ge \lambda _{i+1} } \right) \). Note that the first 21 eigenvectors have support concentrated primarily in the production rate parameters (parameter indices 46–67, of the Full Index of Table 5), indicating that the model is most sensitive to changes in those parameters. The magnitude of the eigenvalues decreases dramatically as we move from the first 21 eigenvectors to the next 25. In this group, some interesting relationships can be observed.

Table 4 Network weights, net thresholds, and production rates
Fig. 9
figure 9

Weight (top), b (middle), and production rate (bottom) parameter comparisons for \(\alpha =0.02\) (black), 0.01 (blue), and 0.005 (red) from penalized least squares estimation. Weight, \(w_{ij} \), (top panel), net threshold, b \(_{i}\), (middle panel), and production rate, \(P_i \), (bottom panel)

First, we note that Eigenvector 22 involves the state equation of NRG1. In Fig. 13, we graph Eigenvector 22, labeling the four significant parametric directions it contains.

The sensitivity is strongest with respect to the weight of SKN7 controlling NRG1, slightly dependent on the self-control of NRG1, with opposite sign sensitivity for the net threshold and the production rate. Eigenvector 23 shows a complex connection of sensitivities in the ROX1, YAP1, and YAP6 dynamics (Fig. 14).

The weights corresponding to the indices 19–22, 24–31 are the controlling weights for the dynamics of ROX1, YAP1, and YAP6, while indices 43, 45, and 46 correspond to the net thresholds in those three genes.

To interpret these sensitivities, we note that YAP1, ROX1, and YAP6 form a densely connected core in one sub-network (Fig. 1, upper left). Second, we observe that NRG1 is a controller of YAP6 with a fairly large positive weight (indicating activation), while four other genes controlling YAP6 are repressing, and the final sixth gene (YAP1) slightly activates YAP6 at a near-zero level almost two orders of magnitude below the activation of NRG1. Of these four genes, only ROX1 shows a significant change in gene expression (\(p<0.05\) at \(t_{0}\); see Table 2 and Fig. 8). The others do show fluctuations in expression (Table 2; Figs. 6, 7), but none of the \(\hbox {log}_{2}\) ratios are significantly different than zero. Thus, the regulatory weights must balance each other out to conform to the observed levels of expression.

Table 5 Parameter indexing for Figs. 9, 10, 11, 12, 13 and 14
Table 6 Standard deviations of initial guess and resulting estimates of network weights, \(w_{ij} \), for 10 penalized least squares computations
Table 7 Standard deviations of initial guess and resulting estimates of network net threshold parameters, b \(_{i}\), for 10 penalized least squares computations
Table 8 Standard deviations of initial guess and resulting estimates of production, \(P_i\), rates for 10 penalized least squares computations

6 Concluding Remarks

We have presented a general approach to modeling medium-scale gene regulatory networks, with an emphasis on the ability to extract parameters from data obtained from microarray experiments. Our findings are that a high-dimensional parameter vector in a complex high-dimensional dynamic network model can be reliably inferred from temporally sparse microarray data using a penalized least squares approach. The resulting dynamics are not, however, calibrated to units of concentration in mass balance, due to the relative nature of two-color microarray measurement technology. Furthermore, our model does not separate rates of mRNA and protein production or degradation. The model starts with a network topology and extracts relative strength of relationships, direction (activation/repression) of relationships, and rate of expression. The magnitude of the parametric uncertainties, as measured through the covariance, are large enough to preclude the use of this approach in extracting the network topology from data at this coarse level of time resolution, so the techniques described herein must be used in conjunction with other methods, either statistical clustering approaches or additional experiments, to identify the network connections. We are confident, however, in the utility of this approach to refine the dynamics and directionality of a candidate regulatory graph, which should have general applicability to other biological problems where time course gene expression data are available.

Fig. 10
figure 10

Estimated weights from three different model-generated data fits. Black circles denote the parameters used to generate the data. Blue circles denote the estimated parameters from 5 min time step data; red 10 min time steps; green 20 min time steps

Fig. 11
figure 11

Heat map of parameter estimator vector’s covariance matrix

Fig. 12
figure 12

Eigenvalues (top) and eigenvectors (bottom) of the sensitivity matrix. The heat map is a three-dimensional view of the eigenvectors with intensity of the shading indicating the magnitude of the eigenvectors

Fig. 13
figure 13

Eigenvector 22 shows the relative sensitivities in the NRG1 dynamics. The x axis refers to the parameter Full Index from Table 5

Fig. 14
figure 14

Eigenvector 23 shows the combined sensitivity to Parameters 19–22, 24–31, 43, 45, and 46. The x axis refers to the parameter Full Index from Table 5

Biologically, the estimated model parameters have shed light on the regulation of the early transcriptional response to cold shock in S. cerevisiae for which we had three questions: (1) which transcription factors control the early response to cold shock in S. cerevisiae? (2) what is the extent of ESR pathway overlap? (3) which part of the transcriptional response to cold shock is due to indirect effects of other transcription factors? First, the Schade et al. (2004) expression data and inferred network weights (Tables 2, 4) suggest that the subnetwork of transcription factors centered around RAP1 and including FHL1, MSN4, RPH1, and HSF1 plays a prominent role in the regulation of the cold shock response (Fig. 8, lower right). This makes sense biologically because RAP1 and FHL1 are responsible for activating genes encoding ribosomal proteins, and ribosome biogenesis is a biological process known to be induced by cold shock (Aguilera et al. 2007; Xiao and Grove 2009). RAP1 acts as both an activator and repressor in the model and is known to have both transcriptional activator and repressor activity in the cell (Shore and Nasmyth 1987). RAP1 strongly activates MSN4 and RPH1 in our model, both of which have significant changes in gene expression in the Schade et al. (2004) data. Indeed, all three inputs to MSN4 activate it. Both MSN4 and RPH1 bind to stress response elements (STRE) in approximately 200 genes, the activation of which constitutes the general ESR (Gasch et al. 2000; Causton et al. 2001; Orzechowski et al. 2012). FHL1 is weakly activated by both MSN4 and ABF1. Because ABF1 itself is down-regulated, the main activating influence comes from MSN4. However, FHL1 itself is down-regulated, so there must be another transcription factor outside this network that influences its expression. RAP1 also strongly represses HSF1, which is significantly downregulated in expression. HSF1 is responsible for inducing genes required for the heat shock response (Morano et al. 2012). There is some evidence to suggest that the cold shock response has some “opposite” effects than the heat shock response, so the down-regulation of HSF1 makes sense (Gasch et al. 2000; Schade et al. 2004). Thus, our model indicates that further examination of the roles of RAP1, FHL1, MSN4, RPH1, and HSF1 in regulating the early response to cold shock is warranted.

In contrast, the other subnetwork, (upper left of Fig. 8, including ACE2, CIN5, MSN1, NRG1, ROX1, SKN7, YAP1, and YAP6) appears to play less of a role in controlling the early cold shock response as there are few significant changes in gene expression in that part of the network. If the weights of the incoming edges are summed for each gene, they are all negative except for the weights controlling NRG1. Even though the weights of CIN5 and ABF1 controlling MSN1 are among the largest in magnitude in the entire network, they have opposite effects. CIN5 strongly activates MSN1, while ABF1 strongly represses it with the sum of the weights being negative; however, from the data, we see that the expression of MSN1 is unchanged.

Second, in terms of ESR pathway overlap, RAP1, FHL1, MSN4, RPH1, and HSF1 have all been implicated in controlling the response to other environmental stresses (Gasch et al. 2000; Causton et al. 2001; Morano et al. 2012; Orzechowski et al. 2012; Xiao and Grove 2009). Our model suggests that there is overlap between the general ESR and the early response to cold, not just the late cold response as noted in Schade et al. (2004) and Kandror et al. (2004).

Third, as for the indirect effects of transcription factors, as noted in Sect. 2, our network has regulatory chains that are 4 or 5 nodes deep and two complex feedforward motifs. However, it appears that the influence of transcription factors in a regulatory chain peters out after just one or two nodes. For example, RAP1 strongly influences HSF1 and MSN4, but the influence of HSF1 upon REB1 and MSN4 upon FHL1 are much weaker. Furthermore, as has already been noted, there is evidence to suggest that additional transcription factors not included in our network are necessary to explain the expression of the genes in our network. For example, RAP1 is found to repress itself in the model, even though it shows a significant increase in expression after 120 min of cold exposure, so there must be another transcription factor activating it that was not included in this network. FHL1 is significantly downregulated in expression, but its regulators ABF1 and MSN4 only weakly activate it, suggesting that FHL1, too, is repressed by an additional factor outside the current model. The significant down-regulation in the expression of ABF1 in the data, together with the fact that there are no predicted gene regulators for ABF1 in the current network, suggests that this must be due to some other transcription factors outside this network. Finally, MAC1 also shows a significant increase in gene expression at the \(t_{120}\) timepoint, but is also not regulated by any transcription factors in the current network, necessitating the invocation of other regulators.

The results of this model suggest several lines of future investigation, both experimentally and computationally. The model highlights the role of RAP1, FHL1, MSN4, RPH1, and HSF1 in regulating the early response to cold shock. A natural next experiment would be to investigate how the early response to cold shock is affected by the deletion of those genes. Unfortunately, RAP1, HSF1, and FHL1 are all essential genes in yeast, making the simple knockout experiment impossible (Winzeler et al. 1999). However, MSN4 and RPH1 are not essential and could be investigated in such a way. Although Schade et al. (2004) did perform microarray experiments on a strain deleted for both the MSN2 and MSN4 transcription factors, they only performed two replicates with the double deletion strain, precluding statistical analysis of the data that would indicate its reliability for use in estimating model parameters, and leaving additional experiments to be performed. Such biological knockout experiments could then be complemented by in silico knockouts where parameter estimation and forward simulations are performed using networks with the appropriate transcription factors removed. A comparison of the experimental and computational results could lead to refinements of the model and further biological insights. However, given that it appears that ABF1, FHL1, MAC1, and RAP1 are regulated by transcription factors not included in our network, a new network would need to be defined that includes those potential regulating factors. To our knowledge, genome-wide location analysis has not been performed under cold shock conditions, so important network connections could be missing from the currently available experimental data, necessitating other approaches for defining the regulatory network.

In conclusion, we have successfully estimated model parameters from microarray data for a medium-scale gene regulatory network using a penalized least squares approach. The results accurately model the expression dynamics, have revealed activation and repression relationships between the transcription factors in our network, and suggest which factors are most important to the regulation of the early response to cold shock in S. cerevisiae. Our work provides a firm mathematical foundation and specific biological suggestions with testable hypotheses for future systems biology iterations of modeling and experiment regarding the cold shock response in yeast. Finally, our work has general applicability to other biological systems.