Opinion

The control of gene expression is pivotal in biology. It is accomplished by a large number of regulators, including transcription factors (TFs), that can modulate mRNA synthesis by directly interacting with regulatory genome sequences. The human genome contains about 20,000 genes and an estimated 1,400 TFs [1]. Although much is known about the basic mechanics of transcription, little is known about how TFs function collectively in the context of intricate gene regulatory networks (GRNs) to achieve complex biological outputs during development and in physiology and disease.

Ideally, one would like to comprehensively map the binding of each TF within the genome and understand the effects that such interactions have on its target genes. Conversely, for each gene, one would like to know which TFs contribute to its expression and in which cells and under which circumstances this contribution occurs.

Over the past decade several high-throughput methodologies have been developed, standardized and implemented to map GRNs, including computational reverse engineering (reviewed in [2, 3]), chromatin immunoprecipitation (ChIP) combined with microarrays (ChIP-chip) or next generation sequencing (ChIP-seq) (reviewed in [4, 5]), and yeast one-hybrid (Y1H) assays (reviewed in [6, 7]). Each of these methods has inherent limitations, and therefore GRNs might miss interactions (false negatives) and contain interactions that are not 'biologically meaningful' (false positives; Table 1). Determining the scale of these limitations has proven difficult, in part because it is challenging to define the term 'biologically meaningful'. For instance, interactions between genes and regulators are often deemed biologically meaningful only if the expression of the gene changes following removal or reduction of the regulator, and/or if mutations in the gene and the regulator confer similar phenotypes. Here, I discuss methods that are used to identify interactions between genes and regulators and illustrate different levels of validation that can be used to obtain further support for these interactions. In addition, I argue that lack of validation does not necessarily signify irrelevance because validation assays each come with their own caveats, and their interpretation can be further complicated by mechanisms such as TF redundancy. Instead, results from many assays should be combined to generate increasingly comprehensive, high-quality network models.

Table 1 Overview of commonly used techniques for gene regulatory network mapping and their advantages and limitations

Gene regulatory networks

GRNs are graph diagrams that depict interactions between genes and their regulators (such as TFs). These interactions can indicate a regulatory relationship, and/or can depict a physical interaction between a TF and a genomic DNA region associated with a particular gene. A genome-scale method used for inferring regulatory interactions is to computationally search transcriptomic data for correlations between gene and TF expression. This reverse engineering approach has been pioneered in yeast [8] and has also been applied to mammals [9, 10]. ChIP starts with a protein and is an example of a TF-centered (protein-to-DNA) method, whereas Y1H starts with a DNA fragment and can be referred to as a gene-centered (DNA-to-protein) technique (reviewed in [6, 7]). Both ChIP (for example, [11-16]) and Y1H assays (for example, [17-21]) have been successfully used in various systems and have each led to a wealth of data. Importantly, inferred regulatory relationships are not necessarily a result of direct physical interactions. Conversely, for physical interactions between TFs and DNA, the regulatory consequence (repression or activation) is usually not known. Therefore, for optimal coverage and information content, both types of approaches need to be applied and integrated.
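The correlation search described above can be illustrated with a minimal sketch. It is not the algorithm used in the cited studies; the expression values, the gene and TF names, and the cut-off of 0.9 are all hypothetical and chosen only to show the principle.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)

def infer_edges(tf_expr, gene_expr, cutoff=0.9):
    """Return (tf, gene, r) for every pair with |r| >= cutoff (assumed cut-off)."""
    edges = []
    for tf, tf_profile in tf_expr.items():
        for gene, gene_profile in gene_expr.items():
            r = pearson(tf_profile, gene_profile)
            if abs(r) >= cutoff:
                edges.append((tf, gene, round(r, 3)))
    return edges

# Hypothetical expression values across five samples.
tf_expr = {
    "TF_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "TF_B": [2.0, 2.0, 2.0, 2.0, 2.0],  # does not change in expression
}
gene_expr = {
    "gene_1": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with TF_A
    "gene_2": [9.0, 7.1, 5.0, 3.2, 1.0],  # falls as TF_A rises
    "gene_3": [4.0, 1.0, 5.0, 2.0, 3.0],  # unrelated to TF_A
}
edges = infer_edges(tf_expr, gene_expr)
```

In this toy dataset only TF_A is linked to gene_1 (positive correlation) and gene_2 (negative correlation). TF_B yields no edges because its expression is flat, which mirrors the point that an inferred edge says nothing about physical binding, and that a TF whose expression does not vary cannot be detected this way.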

What is 'biologically meaningful'?

The quality of GRNs depends on the proportion of real interactions that are retrieved and the proportion of retrieved interactions that are real. For many scientists, interactions identified by high-throughput methods are deemed biologically meaningful (real) only if a regulatory and/or functional consequence is demonstrated, after which the interaction is considered 'validated in vivo'. However, a lack of in vivo validation does not necessarily invalidate an interaction because: (i) in vivo validation methods have their own limitations; (ii) a TF binding event might have been attributed to the wrong gene - for instance, when a TF binds an enhancer far from a transcription start site; and (iii) biological safety nets that buffer the loss of individual TFs can mask the effect of genuine DNA-TF interactions on gene expression.
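The two proportions named at the start of this section correspond to what is usually called recall and precision. The sketch below computes both for a hypothetical set of retrieved interactions; note that it assumes the full set of real interactions is known, which is precisely what is not available in practice.

```python
def recall_and_precision(retrieved, real):
    """Recall: share of real interactions retrieved.
    Precision: share of retrieved interactions that are real."""
    retrieved, real = set(retrieved), set(real)
    true_positives = retrieved & real
    recall = len(true_positives) / len(real) if real else 0.0
    precision = len(true_positives) / len(retrieved) if retrieved else 0.0
    return recall, precision

# Hypothetical reference set and hypothetical assay output.
real = {("TF_A", "gene_1"), ("TF_A", "gene_2"), ("TF_B", "gene_3"), ("TF_C", "gene_4")}
retrieved = {("TF_A", "gene_1"), ("TF_B", "gene_3"), ("TF_C", "gene_9")}

recall, precision = recall_and_precision(retrieved, real)
```

Here two of the four real interactions are retrieved, and two of the three retrieved interactions are real. Missed interactions lower recall (false negatives), and retrieved interactions that are not real lower precision (false positives).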

Data quality: false negatives

To obtain a complete picture of gene regulation, it is important to detect all physical and regulatory interactions that occur between genes and TFs. However, it is likely that the interaction networks that have been delineated so far are incomplete because not all interactions can be detected by the method(s) used. There are several reasons why DNA-TF interactions can be missed (false negatives; Table 1). In computationally inferred GRNs, relationships can be missed when the required cut-off for correlation is set too high, when TFs do not change in expression in accordance with their target genes, or when the TF or its target is expressed at very low levels, thereby preventing detection of expression changes. With ChIP, the detection of interactions depends on the expression level, concentration and activity of the TF in the cell or tissue sampled, and the strength and accessibility of its binding sites (Figure 1). With ChIP-seq, precipitated DNA fragments are sequenced, and bound regions are 'called' by compiling all the reads that correspond to particular genomic DNA regions into 'peaks'. Subsequently, cut-offs are selected somewhat arbitrarily to distinguish bound from unbound regions [22]. This will inevitably cause the strongest and/or most well-represented (robust) interactions to be considered at the expense of weaker interactions that might be just as likely to be biologically meaningful. Gene-centered Y1H assays also miss DNA-TF interactions. For instance, interactions with obligatory heterodimers cannot be detected with the current configurations of the assay. In addition, TFs that require post-translational modification or a cofactor in order to bind DNA may not be retrieved. When cDNA libraries are screened, low-abundance TFs have a high likelihood of being missed. This disadvantage has been partially alleviated by using directed Y1H assays in which TFs are tested one by one for their ability to bind to a particular DNA fragment [23].
Such clone-based assays, however, depend on clone resources such as the ORFeome [24, 25], and TFs for which open reading frame clones are not available will obviously not be represented in these assays.
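The effect of cut-off selection on the regions 'called' in ChIP-seq, discussed above, can be shown with a small sketch. The peak scores, region names and both cut-off values are hypothetical, and real peak callers use statistical models rather than a plain score comparison.

```python
def call_bound_regions(peak_scores, cutoff):
    """Return the regions whose peak score meets or exceeds the cut-off."""
    return {region for region, score in peak_scores.items() if score >= cutoff}

# Hypothetical peak scores for four genomic regions.
peak_scores = {
    "region_1": 48.0,
    "region_2": 21.5,
    "region_3": 9.0,   # weaker peak, possibly still meaningful
    "region_4": 3.2,
}

strict = call_bound_regions(peak_scores, cutoff=20.0)
lenient = call_bound_regions(peak_scores, cutoff=5.0)
dropped_by_strict = lenient - strict
```

With the stricter cut-off, region_3 is no longer called even though nothing about the underlying data has changed. Whether that region is a false negative under the strict cut-off or a false positive under the lenient one cannot be decided from the score alone.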

Figure 1

The detection of a physical interaction between a TF and a target gene depends on various parameters, including the expression level or concentration of the TF, the activity of the TF, the affinity of the binding site for the TF and the accessibility of the binding site in the context of chromatin. TFBS, transcription factor binding site.

Data quality: false positives

False positives can be incorporated into GRNs and can be either technical or biological. Technical false positives are interactions that are sporadic in nature and cannot be repeated, even with the same assay with which they were originally retrieved. Any high-throughput method should minimize such spurious interactions through careful optimization and evaluation of the robustness of the assay. Biological false positives are defined as interactions that are robustly detected but that are not biologically meaningful. In computationally derived GRNs, spurious edges (regulatory interactions) can arise if both a TF and its inferred target are regulated by another TF that itself does not change in expression. In ChIP experiments, biological false positives might be obtained when the antibody is not exclusively specific for the TF that is being studied, or when a TF is overexpressed (for instance, from a transgene) and starts binding to lower affinity or non-specific sites. Furthermore, selecting a threshold that is too low when 'calling' interactions from a background of non-interacting fragments may result in the inclusion of false interactions. Y1H assays can retrieve interactions that do not occur in vivo when a TF binding site is available in the context of yeast chromatin but not in the organism from which the DNA fragment was cloned. TF levels in yeast are controlled by a yeast promoter and by the copy number of the TF-expressing plasmid, and it is possible that lower affinity DNA sequences are bound when TF levels are high.

Five levels of validation

There are five conceptual levels of validation of interactions between genes and their regulators.

The first level is retesting interactions detected with the same experimental approach and reagents to minimize technical false positives. This can be done by retesting individual interactions, or by performing larger, genome-scale experiments multiple times. For example, in ChIP assays, the DNA regions deemed bound by a TF are often confirmed by quantitative PCR of the ChIPped DNA. However, this only confirms that the DNA fragment was precipitated; it is not a retest of the ChIP assay itself. In Y1H assays, interactions retrieved can be confirmed in freshly grown yeast cells containing the 'DNA bait', using, for instance, a TF-encoding clone [26].

The second level is confirming an interaction with the same assay but using different reagents. Computationally inferred regulatory interactions can be assessed in an independent dataset, or with a different algorithm. ChIP interactions can be confirmed by using multiple antibodies to the same protein [15, 27]. The signal-to-noise ratio can also be improved by including control experiments of samples in which the TF is removed or reduced [12]. In such experiments, DNA regions that are detected by ChIP both in wild-type and TF mutant or RNA interference (RNAi) samples can be considered false positives, and thresholds can be drawn accordingly. Y1H assays use two reporter genes that are integrated into different locations in the yeast genome, and only interactions that result in the activation of both reporters should be considered, as they are basically detected twice, and therefore confirmed. In addition, independent 'DNA bait' strains and 'TF preys' from different clone resources can be used to confirm interactions.
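Two of the filters described above reduce to simple set operations: removing ChIP regions that are still detected when the TF is removed or reduced, and keeping only Y1H interactions that activate both reporters. The following is a minimal sketch with hypothetical region and TF names; the function names are illustrative and not taken from any published pipeline.

```python
def filter_by_control(wild_type_regions, tf_depleted_regions):
    """Keep regions detected in wild type but not in the TF mutant/RNAi control."""
    return set(wild_type_regions) - set(tf_depleted_regions)

def confirmed_by_both_reporters(reporter_1_hits, reporter_2_hits):
    """Keep interactions that activate both independently integrated reporters."""
    return set(reporter_1_hits) & set(reporter_2_hits)

# Hypothetical ChIP calls.
wild_type = {"region_1", "region_2", "region_3"}
tf_depleted = {"region_3", "region_7"}  # region_3 persists without the TF
specific_regions = filter_by_control(wild_type, tf_depleted)

# Hypothetical Y1H hits for one DNA bait.
reporter_1 = {"TF_A", "TF_B", "TF_C"}
reporter_2 = {"TF_A", "TF_C", "TF_D"}
double_positive = confirmed_by_both_reporters(reporter_1, reporter_2)
```

In this example region_3 is treated as a false positive because it is precipitated even when the TF is depleted, and only TF_A and TF_C are retained as Y1H interactions because they score with both reporters.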

The third level is detecting an interaction with a different assay of the same type. For instance, to validate computationally inferred regulatory interactions one can use RNAi of a TF and examine changes in the expression of its inferred targets in vivo. This approach has recently been used for a subset of regulatory relationships inferred to control the pathogenic response of murine dendritic cells [10]. Physical interactions detected by Y1H assays can be confirmed by ChIP and vice versa. For example, we have confirmed multiple Caenorhabditis elegans and Arabidopsis thaliana Y1H interactions by ChIP [17, 21]. Additional support for physical interactions identified either by Y1H assays or by ChIP can be obtained by identifying a putative TF binding site within the DNA fragment bound, either using motif prediction algorithms (for example, [28-30]) or by interrogating large TF binding site datasets (for example, [31-34]). In Y1H assays, the putative site can be deleted and interactions with the mutant fragment can be examined. Loss of the DNA-TF interaction with the mutant fragment would confirm that the selected binding site is indeed correct [35].

The fourth level is observing an interaction in a different type of assay, that is, a regulatory interaction is confirmed by a physical interaction or vice versa. An example of how the regulatory effect of physical DNA-TF interactions can be examined in C. elegans is the generation of transgenic animals that express green fluorescent protein (GFP) under the control of the DNA fragment with which the physical interaction was detected, and subjection of these transgenic animals to RNAi of the relevant TF or crossing them into TF mutant animals [19, 20, 36]. An increase in GFP expression following reduction of the TF would indicate that the TF represses gene expression, whereas a decrease in GFP expression following TF reduction would indicate that the TF is an activator. Other methods used to determine target gene expression in mutant/RNAi animals include quantitative RT-PCR and expression profiling (for example, [17, 37]).
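The interpretation rule for the GFP reporter experiment can be written down as a small decision function. This is a sketch under stated assumptions: the fold-change threshold of 1.5 and the intensity values are hypothetical, and a real analysis would use replicates and a statistical test.

```python
def classify_tf_effect(gfp_control, gfp_tf_reduced, min_fold_change=1.5):
    """Infer the regulatory role of a TF from reporter GFP levels.

    gfp_control: GFP signal in control animals.
    gfp_tf_reduced: GFP signal after RNAi or in a TF mutant background.
    min_fold_change: assumed threshold for calling a change.
    """
    if gfp_control <= 0 or gfp_tf_reduced < 0:
        raise ValueError("GFP measurements must be positive")
    ratio = gfp_tf_reduced / gfp_control
    if ratio >= min_fold_change:
        return "repressor"   # expression rises when the TF is reduced
    if ratio <= 1 / min_fold_change:
        return "activator"   # expression falls when the TF is reduced
    return "no detectable effect"

# Hypothetical measurements in arbitrary units.
effect_up = classify_tf_effect(100.0, 250.0)
effect_down = classify_tf_effect(100.0, 30.0)
effect_flat = classify_tf_effect(100.0, 110.0)
```

The third outcome is labeled 'no detectable effect' rather than 'no interaction' on purpose: an unchanged reporter does not by itself show that the physical interaction is irrelevant.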

The final level is observing that a TF and its target share functional roles in a biological process, or confer similar phenotypes when mutated and/or overexpressed. Such functional similarities can be uncovered by performing phenotypic experiments, or alternatively, correlations between functions can be investigated in silico using Gene Ontology [38] and KEGG databases [39]. It is likely that a complete correlation between a TF and its target genes in either a biological process or a phenotype occurs only in a minority of cases because networks are highly interconnected, and TFs often have multiple functions according to developmental, physiological or environmental circumstances.
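One simple way to quantify functional similarity in silico is the overlap between the annotation sets of a TF and its target. The sketch below uses the Jaccard index on hypothetical annotation terms; it is only one of several possible measures and is not the method of the cited databases.

```python
def jaccard(terms_a, terms_b):
    """Shared annotations divided by all annotations of either gene."""
    terms_a, terms_b = set(terms_a), set(terms_b)
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0

# Hypothetical annotation sets for a TF and one of its targets.
tf_terms = {"dauer larval development", "regulation of transcription", "lipid storage"}
target_terms = {"dauer larval development", "lipid storage", "response to heat"}

similarity = jaccard(tf_terms, target_terms)  # 2 shared terms out of 4 in total
```

A partial overlap such as this one is the expected outcome for most genuine TF-target pairs, given that TFs often have multiple functions and networks are highly interconnected.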

Limitations of DNA-TF validation: detection limits and interpretation

It is important to note that validation experiments are each subject to their own limitations and that a negative result need not invalidate the original interaction observed. For instance, when RNAi is used to examine regulatory relationships between a gene and its putative regulators, off-target effects can complicate the interpretation. Therefore, it is desirable to observe the same effect on target gene expression with multiple small interfering RNAs (siRNAs) designed to target the same regulator. In C. elegans, not every tissue is equally amenable to RNAi; for instance, most neurons are largely refractory, and interactions occurring there would wrongly be deemed not valid when analyzed by RNAi [40, 41]. When validating Y1H interactions by ChIP, it is important to perform the experiment using samples that not only express the TF of interest, but also under conditions where the TF is active (Figure 1). An example that illustrates this concept is C. elegans DAF-16, a forkhead TF that is located in the cytoplasm in wild-type animals unless they are exposed to nutritional or environmental stress. In daf-2 mutants, however, DAF-16 aberrantly translocates to the nucleus and is constitutively active, and the daf-2 mutant background has therefore been used to identify an initial set of DAF-16-bound regions [12]. Some other TFs could truly be bound to DNA sites in vivo, but be functionally dormant until activation by a ligand or signaling pathway, and regulatory effects would only be detectable under activation conditions. Prior knowledge of TF expression [42, 43] or activity will greatly facilitate the design of an appropriate in vivo experiment. For instance, we focused on animals in the dauer stage (in which their development is arrested) to validate interactions between microRNA promoters and DAF-3, whose expression greatly increases in the dauer stage and which confers a dauer-related phenotype [19]. 
Finally, some TFs may affect the expression of their target genes in a tissue-specific manner, and some TFs can even function as both an activator and a repressor, depending on the circumstances. For instance, using a transgenic GFP approach, we found that the C. elegans nuclear hormone receptor NHR-45 activates the promoter of nhr-178 in some tissues and under some conditions, whereas it represses it in other tissues under other conditions [20]. This example highlights the complexity of gene regulation and the caution one should take when interpreting 'whole animal' validation assays, such as quantitative RT-PCR.

Limitations of DNA-TF validation

In the past few years, it has become clear that many physical interactions - for instance, those identified by ChIP - do not have a detectable regulatory consequence for their predicted target genes (for example, [44, 45]). This could be because the regulatory consequences were tested under conditions in which the TF is not active. Alternatively, the binding event could have been attributed to the wrong gene, or the effect of losing or reducing an individual TF could have been masked by TF combinatorics or redundancy.

Both TF redundancy and combinatorics are prevalent in GRNs (Figure 2). For instance, multiple TFs from the same family can act through a single binding site and function redundantly, so that loss of one TF is compensated by another (Figure 2a). Such redundancy has been shown in various systems and for various TF families, including mammalian ETS proteins that were studied by ChIP [46] and C. elegans FLYWCH-type zinc fingers that were found by Y1H assays [36]. Loss of a TF can also be masked by combinatorics involving TFs from the same family (Figure 2b) or from different families (Figure 2c), in which each TF contributes only a small regulatory effect, and this can lead to apparent redundancy. Such built-in redundancy may be very useful for critical genes that need to be buffered to avoid detrimental phenotypic consequences of TF loss [47].

Figure 2

The regulatory effect of a TF on a target gene can be masked when the TF is mutated (loss of function) or when its levels are reduced by RNAi. Two mechanisms that explain such masking are (a) TF redundancy and (b,c) the combinatorial interactions between multiple TFs, either from (b) one family or (c) from different families. In any of these cases, loss of a single TF would have only modest effects on target gene expression. Similar shapes with different colors indicate members of the same TF family; different shapes indicate members of different TF families.

Physical interactions between TFs and genomic DNA fragments are often inferred to affect the gene that is in closest linear proximity to the binding event. For interactions that occur close to a transcription start site, either in a gene promoter or just downstream in the transcribed region, this is a reasonable assumption. Interactions involving more distant DNA elements (such as enhancers), however, may not necessarily affect a nearby gene because the genome is not organized as a linear polymer, but rather is organized in three dimensions [48]. Studies using chromatin conformation capture techniques [49] are providing insights into which genomic regions contact each other to form loops, potentially to bring together enhancers and gene promoters. Integrating physical and regulatory interaction data with structural genome and DNA looping data will facilitate the further dissection of GRNs.
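The 'closest linear proximity' convention can be sketched as a nearest transcription start site (TSS) lookup. The coordinates and gene names are hypothetical, and the sketch assumes that all positions lie on the same chromosome.

```python
def nearest_gene(binding_position, tss_by_gene):
    """Return (gene, distance) for the TSS closest to the binding event.

    Assumes all coordinates are on the same chromosome.
    """
    gene = min(tss_by_gene, key=lambda g: abs(tss_by_gene[g] - binding_position))
    return gene, abs(tss_by_gene[gene] - binding_position)

# Hypothetical TSS coordinates (base pairs).
tss_by_gene = {"gene_1": 1_000, "gene_2": 50_000}

promoter_hit = nearest_gene(1_200, tss_by_gene)    # close to the gene_1 TSS
distal_hit = nearest_gene(30_000, tss_by_gene)     # far from both genes
```

For the promoter-proximal event the assignment to gene_1 is reasonable. The distal event is assigned to gene_2 only because it is 20 kb away rather than 29 kb, which is exactly the kind of assignment that looping data could overturn.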

Conclusions

When is an interaction between a genomic DNA fragment and a TF biologically meaningful? When the gene located closest to the binding event changes in expression following loss or reduction of the TF? When the gene and the TF have the same phenotype? Clearly, the answers are not simple, and the assays used to validate interactions are not foolproof. However, when the methods used for GRN delineation are robust and technically sound, individual interactions should not be discarded when one or more validation methods fail to detect regulatory or phenotypic consequences, particularly in complex, whole organisms. Rather, multiple methods need to be combined into increasingly integrated networks to attribute different degrees of confidence to each of the interactions observed.
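As an illustration of attributing degrees of confidence by combining methods, the sketch below counts how many distinct assay types support each interaction. The scoring scheme, assay labels and interactions are hypothetical; equal weighting of assays is an assumption made for simplicity, not a recommendation.

```python
from collections import defaultdict

def tally_evidence(observations):
    """Count distinct assay types supporting each (TF, gene) interaction."""
    support = defaultdict(set)
    for tf, gene, assay in observations:
        support[(tf, gene)].add(assay)
    return {pair: len(assays) for pair, assays in support.items()}

# Hypothetical observations collected from several experiments.
observations = [
    ("TF_A", "gene_1", "Y1H"),
    ("TF_A", "gene_1", "ChIP"),
    ("TF_A", "gene_1", "RNAi expression change"),
    ("TF_B", "gene_2", "Y1H"),
    ("TF_B", "gene_2", "Y1H"),  # a repeat with the same assay counts once
]

confidence = tally_evidence(observations)
```

Under this scheme the TF_B-gene_2 interaction is kept in the network with a lower score rather than discarded, in line with the argument that a lack of validation does not signify irrelevance.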