To succeed, all long-term relationships require some degree of compromise from both partners. This is no less true for persistent virus infections and their hosts. Unrestricted replication of the parasite may be to the detriment of the health of the host and shorten its life span, thus depriving the parasite of its niche. Equally, no replication at all is a dead end for the parasite. The pathogen thus constrains its replication, and the host, given that it has effectively lost the battle to eliminate the invader, makes the best of a bad job and controls it when it gets out of hand. Restricting replication quantitatively or temporally so that the virus reproduces just sufficiently, or at particularly strategic times (for example, pregnancy), to achieve transmission while remaining silent at other times (latency), are techniques used by several viral families, of which the herpesviruses are the best studied.

The human immunodeficiency virus (HIV), the causative agent of AIDS, is also postulated to become latent when it infects a T lymphocyte (T cell) that has ceased to divide, and where levels of transcription factors that both cell and virus need for gene expression are declining [1]. This may be an oversimplification, however, as latency may be effected by more than one process and it may occur in cells other than memory T cells. Latency in HIV is of immense practical importance because it provides a reservoir of virus that can reactivate years later and that is protected from immune clearance and the effects of antiviral drugs. Control of gene expression from an integrated retroviral genome, the provirus, also provides an insight into how the chromatin reacts to parasites invading the genome, a process that is thought to have occurred throughout evolution, as evidenced by the abundance of endogenous retroviruses [2, 3] and repeated elements [4] in the human genome.

With knowledge burgeoning about the role of chromatin in the control of gene expression it is timely to review HIV latency. It is the greatest barrier to virus eradication, and understanding it can only enhance our knowledge of cellular gene expression.

After entry into the cell, HIV, like other retroviruses, uncoats and reverse transcribes its dimeric RNA genome into first a complementary DNA (cDNA) and then a double-stranded DNA (dsDNA). The DNA duplex containing the viral genes flanked by non-coding repeat sequences (long terminal repeats, LTRs) is then integrated by the viral integrase enzyme into the host DNA. To complete the viral life cycle, the integrated genome, now termed a provirus, utilizes host machinery, including transcription factors and RNA polymerase II, to activate its genome. The viral 5'-LTR acts as an enhancer and a promoter, directing the transcription of viral mRNAs, which are translated to make viral components.

The proviral genome is, like the host DNA, associated with chromosomal proteins. Here we review some of the information on gene expression from the integrated retroviral genome in the context of gene-silencing mechanisms that might contribute to latency. On the basis of this evidence and the dynamics of proviral gene expression, we propose that the cell uses different forms of proviral silencing: one occurring soon after integration and related to the integration site of the provirus, and a second, delayed, location-independent mechanism affecting proviruses that initially had established transcriptional activity.

Proviral transcription is affected by its chromatin structure

Several observations point to the local state of chromatin being highly influential in the ability of the HIV provirus to overcome a major intracellular hurdle - transcription initiation. HIV-1 superinfecting already latently infected cell lines is expressed [5], implying that local factors and global cellular conditions influence gene expression independently. Cell lines harboring a single copy of a simple retroviral provirus, that of murine leukemia virus (MLV), with varying levels of expression do not differ in their capacity to support the transcription of a transiently transfected LTR-driven reporter gene [6]. This suggests that factors independent of those influencing the resident provirus can affect transcription. Another finding highlighting the influence of local conditions is that silenced, methylated endogenous retrovirus (ERV) DNA becomes infectious after cloning [7].

More recently, a wealth of experiments illustrating the role of chromatin in HIV gene expression has been reported. Protein complexes involved in chromatin remodeling, including histone acetyltransferases (HATs) [8] and late SV40 transcription factor (LSF), which recruits histone deacetylase complex (HDAC-1) [9], can alter HIV-1 gene expression both in vitro and in vivo. HIV-1 expression is also affected by pharmacological modification of histone tails. Agents such as polyamides, which bind to the proviral promoter and block HDAC-1 recruitment [10], and trichostatin A, an HDAC inhibitor [11], have striking effects. Histone modifications involved in HIV gene expression have also been demonstrated by chromatin immunoprecipitation (ChIP) assays. HIV-1 reactivation from latently infected cell lines, whether by the induction of cell-cycle arrest [12] or by the application of phorbol ester [13], requires the recruitment of HATs, accompanied by acetylation of histones at the HIV-1 promoter. ATP-dependent chromatin-remodeling proteins, including members of the SWI/SNF complex, are also recruited during HIV reactivation by phorbol ester application, resulting in the disruption of nucleosomes at the LTRs [14]. Retinoic acid, rather specifically in HIV-1, can interfere with nucleosome remodeling at the LTRs, but not with histone acetylation, and inhibits HIV-1 transcription [15]. Thus, during activation, histones associated with the HIV-1 promoter are first acetylated and chromatin-remodeling complexes are then recruited to disrupt the resident nucleosome (see [16] for a review).

Chromatin remodeling is also involved in repressing proviral gene expression. Protein factors associated with repression, such as c-Myc, occupy the HIV-1 promoter alongside HDAC-1 in a coordinated manner [17]. Recruitment of HDACs, the histone methyltransferase Suv39H1, the protein HP1 (which is typical of heterochromatin) and trimethylation of H3 lysine 9 at the HIV LTR have been shown to correlate with repression of gene expression in microglial cells [18]. Depleting Suv39H1 and the HP1γ subtype by siRNA increases the level of HIV gene expression [19].

The enigma of DNA methylation and viral gene silencing: correlation or causation?

Whereas the weight of evidence for the involvement of chromatin histone remodeling in the control of retroviral gene expression is compelling, the involvement of DNA methylation is more controversial. DNA methylation is associated with the recruitment of HP1, a marker of tight repression [20]. In vitro, the expression levels of transfected methylated LTR-driven reporter constructs based on HIV [21], human T-cell leukemia virus type 1 (HTLV-1) [22] or ERVs [23] have been assessed. Methylation-sensitive restriction enzymes were used to detect DNA methylation in silenced transfected HIV-1 constructs [24] or in ERV LTR-driven reporter constructs [23]. These studies established a clear link between DNA methylation and the absence of transcription.

Data from transiently transfected plasmids may not, however, accurately represent the behavior of integrated proviruses. For example, an integrated HTLV-1 responds differently from a transfected construct in response to extracellular stimuli [25], and in hepatocyte cell lines the promoter activity of a transfected HIV-based construct differs from that of an integrated vector [26]. A further example of episomes and chromosome-associated constructs behaving differently is seen in human papillomavirus gene expression. Associating a matrix-attachment region to a human papillomavirus gene has diametrically opposite effects on gene expression depending on whether the construct is transiently transfected or stably transfected (and presumably integrated) [27].

Despite these caveats, a latent MLV provirus with a methylated genome, confirmed by methylation-sensitive restriction enzymes, can be reactivated by 5-azacytidine, an inhibitor of DNA methylation [6]. In addition, methylation of the HTLV-1 proviral LTR has been assessed using methylation-sensitive restriction enzymes in cells from infected patients [28] or in transformed cell lines [29]. Methylation correlated inversely with the level of viral RNA [29] and the provirus could be reactivated by 5-azacytidine [29, 30]. Mutating the CpG sites (sites of DNA methylation) eliminates integrants that are completely silenced, strongly suggesting a role for DNA methylation in silencing [31].

In contrast, however, a methylated MLV provirus introduced into a defined site of the host-cell genome with Cre recombinase still lacks transcriptional activity after pharmacological inhibition of DNA methylation [32]. More problematically, bisulfite genomic sequencing analysis on HTLV-1-infected cells from patients and transformed cell lines showed that while the 5'-LTR is hypermethylated, the 3'-LTR is hypomethylated. In neither HTLV-1 infection [30] nor in its often-used animal model, bovine leukemia virus (BLV) infection [33], does the pattern of methylation of the LTR correspond to the clinical manifestation of the infection or to disease progression.

In cell lines latently infected with HIV-1, such as ACH-2, the 5'-LTR of the provirus is hypermethylated, whereas the 3'-LTR is hypomethylated. Activating the provirus with the cytokine tumor necrosis factor-α (TNF-α) partially relieves DNA methylation of the 5'-LTR [34]. However, in clones of cells derived after infection with a defective HIV-1 expressing green fluorescent protein (GFP) from the LTR, proviral expression did not correlate with DNA methylation, and bisulfite genome analysis showed most cytosine residues to be unmethylated [35]. Thus, in MLV, HTLV-1 and HIV, the influence of DNA methylation on proviral behavior is controversial. Stability of gene expression may correlate more with the density of DNA methylation, as opposed to there being a binary 'methylated' or 'demethylated' state [36].

What controls the behavior of the chromatin associated with the provirus?

The link between chromatin remodeling and proviral activity seems incontrovertible, but how is it controlled? One possible factor is the site of integration. A study of 35 HIV-1-infected clones found heterogeneity in their individual levels of gene expression. There was no correlation between the expression level of the integrated provirus and a second transfected construct in the same clone, again implicating the local environment in controlling proviral expression [37]. Analyzing the accessibility of restriction enzymes to DNA as a guide to nucleosome remodeling showed that this did correlate with the level of gene expression, despite a lack of correlation between gene expression and promoter methylation [37]. In addition, latent HIV-1 proviruses are frequently found integrated near alphoid repeats, which are characteristic of heterochromatin [38, 39]. This association with heterochromatin was further supported by a large-scale study of 971 HIV-1 integration sites that revealed that proviruses with inducible gene expression - presumably representing latent provirus - had integrated near gene deserts and centromeres, which are rich in alphoid repeats [40]. Others, however, were near very highly expressed genes [40] and a study of 74 HIV-1 integration sites of latent proviruses obtained from resting CD4 T cells from 16 patients found that most of them were in actively expressed regions [41]. Integration sites may, therefore, be influential, but other factors are also in operation. Consistent with the notion that viral gene expression is influenced by local factors set up at the time of integration, gene-therapy vectors containing DNA elements that can shield the provirus from the effects of adjacent chromatin, such as an MLV-based vector with a locus control region [42] or lentiviral vectors with a matrix-attachment region [43], establish high-level, position-independent gene expression. Overall, these studies strongly imply a significant role for the position of integration.

Does the integration site program gene expression indefinitely?

The time frame of silencing in a number of experiments is the major piece of evidence suggesting that factors other than integration site affect proviral expression. The 'site' effect on proviral chromatin configuration and gene expression would presumably be imposed soon after integration and, if it were the overriding influence, be permanent. Attenuation of gene expression has, however, been observed in longer-term culture in several experiments, mostly conducted using long-term cell clones infected with MLV or its derived vectors [32, 44-46], sometimes with the transgene driven by a heterologous promoter rather than the viral LTR.

Even more telling, within the cell clones with provirus integrated at the same sites, variation of gene expression was observed in a number of different retroviral vectors [31, 44, 47-49]. Where variation exists, the level of proviral gene expression between mother and daughter cells shows some degree of correlation [49, 50]. Silencing, where it occurred, was often linked to DNA methylation, as determined by digestion with restriction enzymes [46, 51] or bisulfite genomic sequencing [51]. This was not universally the case, however, and variation was possible even in cells devoid of de novo methyltransferases [47, 48]. Attempts to reactivate these proviruses using 5-azacytidine or trichostatin A after they had been silenced in long-term culture were at best only partially successful [44-47, 49]. Variation in gene expression and the same difficulty in reversing the repression are also observed in HIV-1-infected cell clones [52]. Arguably, the required strength of the reactivating stimulus may be a dose-response effect. Lorincz et al. [51] reported that more efficient reactivation could be achieved by applying trichostatin to cells pretreated with 5-azacytidine, which presumably led to the removal of the methylation mark on DNA and relieved the transgene from tight repression. Thus, the overall picture is that, whereas the chromosomal position certainly contributes to the chromatin configuration of a provirus, the level of gene expression is modulated by other factors. One possibility is that variation emerges as cell division alters existing epigenetic marks on the provirus [50]. Intriguingly, there are at least two reports of upregulation of a DNA methyltransferase after HIV-1 infection [53, 54], suggesting the possibility of an active mechanism that silences integrated provirus.

More than one form of proviral silencing may exist

A model for retroviral gene expression thus has to accommodate several observations. First is the strong evidence of the involvement of chromatin. Second is the correlation between DNA methylation and proviral behavior, which is not consistent in all studies: there may be a gradation of silencing from strongly repressed to unstably expressed. Third, the position of integration is important. Finally, proviral shutdown can and does occur over time. Even within cell clones, variation in gene expression is common.

One possible hypothesis is that the degree of proviral gene expression reflects the permissiveness of the chromatin at the site of integration. Thus, when integration occurs in repressed chromatin, the provirus is heavily repressed, which is probably correlated with DNA methylation. Where the density of DNA methylation is less, the provirus enters a state of unstable gene expression, manifesting as variation of expression from cell to cell (variegation; Figure 1). Although this is an attractive model, it cannot completely explain all the observations listed in Box 1. Another hypothesis is that the behavior of the provirus at an early stage after infection is primarily governed by the site of integration [37, 55] and is only partially related to the local chromatin configuration [35, 38]. Proviruses silenced at this point are probably more amenable to reactivation: indeed, the rate-limiting step in initiation of HIV-1 gene expression is the recruitment of the general transcription factor TFIIH [56], a relatively late step in the derepression of local chromatin. Of the proviruses that are not silenced initially, a proportion undergo shutdown with time. Proviruses silenced in this manner are tightly repressed and can be difficult to reactivate using external stimuli (Figure 2). Indeed, given the complexity of the genome, it is entirely possible that the two models are complementary: depending on the site of integration, one or other of the mechanisms depicted in Figures 1 and 2 is at work.

Figure 1

Variation in the level of proviral expression within cell clones might be accounted for by the degree of methylation. In this model the degree of repression, set up at the time of integration, is critical to the degree of gene expression. Studies supporting this model include [36, 47, 49, 50]. DNA methylation is an attractive candidate as a molecular correlate of repression and is depicted as such here and in Figure 2. There is, however, evidence suggesting that other molecular mechanisms may be involved (see text). (a) Provirus integrated into repressive chromatin is stably repressed, probably correlating with a high degree of proviral DNA methylation. (b) Provirus in partially repressed chromatin is unstable and may proceed to become tightly repressed, or continue to be expressed but could be induced to a higher degree of expression. The change in epigenetic mark could arise from cell division. (c) Integration into permissive chromatin leads to high-level gene expression.

Figure 2

Variation in the expression of proviruses integrated at the same position in different cells might be accounted for by a delayed mechanism leading to proviral silencing. Soon after infection the behavior of the provirus depends on the site of integration. (a) Integration into a chromosome position that is nonpermissive for gene expression results in a silent provirus. Note that although the environment may be nonpermissive for gene expression, the provirus itself need not be tightly repressed and is amenable to reactivation by various stimuli. (b) Integration into permissive chromatin permits viral gene expression. In HIV-1 this stage is prolonged because of the stability conferred by the Tat-TAR positive-feedback axis. (c) With time the provirus is silenced. At present, the trigger leading to the collapse of proviral activity is not known. (d) Once silenced, the provirus is tightly repressed and cannot be easily reactivated.

The additional transcriptional control mechanisms of complex retroviruses like HIV add a further level of control. Unlike simple retroviruses, where attenuation of gene expression is often observed, once transcription of an HIV-1 provirus has been established, it is extremely stable [52]. This is probably due to the viral protein Tat and its response element TAR [39]. Tat exerts positive feedback and enhances proviral gene expression by several mechanisms (reviewed in [57, 58]; see also [59-61]). This positive-feedback axis leads to remarkably durable HIV-1 gene expression, which persists for more than 18 months once established [52]. The virus-encoded transactivator Tat may be an evolutionary development by the virus to counter the cellular silencing mechanisms.

In summary, retroviral gene expression is influenced by more than one mechanism involving chromatin (Figure 2). The location of integration probably crucially affects the initial level of proviral activity. With time, the provirus may be silenced. These silencing mechanisms are likely to affect most integrated constructs, accounting for the silencing observed in simple retroviruses, HIV-1 and retroviral vectors. In HIV-1, however, silencing can be counteracted or delayed by the powerful Tat-TAR positive-feedback axis [39]. The trigger event that leads to silencing is not clear. One possibility is that all proviruses are susceptible to silencing mediated by epigenetic changes through a direct mechanism, possibly via the upregulated DNA methyltransferases [53, 54]. Such a mechanism must act early after infection.

Formation of heterochromatin is known to spread to adjacent genetic regions unless it is stopped by an insulator [62-64]. In the first model (Figure 1), proviral gene silencing observed in long-term cultures is one end of the spectrum of the process that relates the site of integration to proviral activity, the provirus being silenced by spreading heterochromatinization. Alternatively, a microRNA (miRNA)-based mechanism may be involved [65]. Another possibility is that silencing is related to the cyclical expression of transcription factors, such as NFκB for HIV-1 [66]. If the level of NFκB in a cell infected with HIV-1 dips intermittently, the production of Tat would not be maintained and the Tat-TAR positive-feedback axis would collapse. The now vulnerable provirus would be further repressed by chromatin changes that cannot be easily reversed (see Box 1). In simple retroviruses and retroviral vectors, this positive-feedback axis is absent, and therefore silencing is more frequent [32, 44-46, 51, 67].
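The NFκB hypothesis above can be made concrete with a deliberately simple toy simulation. The sketch below is an illustration only, not a model taken from the cited studies: the function name, the saturating feedback term and every parameter value (gain, decay, silence_threshold) are hypothetical assumptions chosen to show the qualitative behavior. Tat production requires both NFκB and existing Tat, and once Tat falls below a threshold the provirus is treated as irreversibly silenced.

```python
def simulate_tat_feedback(nfkb_levels, tat0=1.0, gain=1.5, decay=0.5,
                          silence_threshold=0.05):
    """Toy discrete-time model of the Tat-TAR positive-feedback axis.

    All parameters are illustrative assumptions, not measured values.
    Returns (tat_trace, silenced_at), where silenced_at is the step index
    at which the loop collapsed, or None if it never collapsed.
    """
    tat = tat0
    trace = []
    silenced_at = None
    for step, nfkb in enumerate(nfkb_levels):
        if silenced_at is None:
            # Production needs NFkB and existing Tat (positive feedback),
            # and saturates as Tat accumulates.
            production = gain * nfkb * tat / (1.0 + tat)
            tat = tat + production - decay * tat
            if tat < silence_threshold:
                # Assumed irreversible chromatin repression after collapse.
                silenced_at = step
                tat = 0.0
        trace.append(tat)
    return trace, silenced_at


if __name__ == "__main__":
    scenarios = {
        "steady NFkB": [1.0] * 30,
        "short dip": [1.0] * 10 + [0.0] * 2 + [1.0] * 10,
        "long dip": [1.0] * 10 + [0.0] * 10 + [1.0] * 10,
    }
    for name, levels in scenarios.items():
        trace, silenced_at = simulate_tat_feedback(levels)
        print(name, "final Tat:", round(trace[-1], 3), "silenced at:", silenced_at)
```

Under these assumed parameters, expression recovers from a brief dip in NFκB but not from a prolonged one, which is the qualitative distinction the hypothesis draws. Removing the feedback term would correspond loosely to a simple retrovirus or vector lacking Tat.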

We can begin to draw some tentative conclusions. The 'site' effect, whereby an identical provirus behaves differently according to its point of integration, argues for powerful regional control mechanisms for gene expression independent of the gene itself, providing a general defense against insertional elements. Known influences on chromatin configuration clearly affect proviral gene expression, and studying individual genes in the context of proviral insertions may help to dissect out the individual contributions of methylation, acetylation and so on. Lastly, intrinsic promoter properties of the provirus have an effect that can potentially hold back the effect of regional silencing influences as long as the promoter is functioning. Ironically, the cellular silencing mechanisms actually contribute to the persistence of HIV by facilitating its evasion of drug and immunological attack.

Knowledge about the mechanism of proviral silencing and how to reverse it may lead to clinical applications in HIV infection. There have been attempts at clearing HIV from an infected patient by reactivating latent virus using interleukin-2 [68, 69] or the histone deacetylase inhibitor valproic acid [70], thus rendering the virus susceptible to anti-retroviral therapy and immune clearance. These have been, at best, only partially successful. Understanding proviral silencing will be instrumental in devising further strategies. New insights into the control of gene expression by miRNA, which is possibly another defense against invading molecular parasites [71], will undoubtedly also have an impact [72, 73].

The oldest endogenous retroviruses in the human genome entered the genomes of our mammalian ancestors at around the time of the extinction of the dinosaurs [74]. It might be expected that, with such a long history of coexistence between mammalian genomes and transposable elements, there would exist well developed defenses against these molecular parasites, most probably chromatin dependent. It is tempting to speculate that the silencing of retroviruses, and possibly of cellular genes, originates from such defensive mechanisms.

Box 1 Two contrasting phenotypes of retroviral provirus silencing

Immediately after infection

The site of integration critically affects the level of proviral gene expression of HIV-1 [37].

DNA methylation does not correlate with proviral gene expression in cells infected with HIV-1 [35, 38] or HTLV-1 [29].

HIV-1 gene expression can be augmented by trichostatin A [11] and/or Tat [37].

At late times after infection

Variation in proviral gene expression is observed in clonal populations of cells infected with MLV [51], with MLV-based vectors [44-46] and with HIV-1-based vectors [39, 52].

Local factors (chromatinization) correlate with proviral expression in HIV-1 [52]. The degree of DNA methylation correlates with behavior of the proviruses of MLV [51] and MLV-based vectors [44-46].

Reactivation is variable: by trichostatin A in HIV [39], and by 5-azacytidine and/or trichostatin A in MLV [44-46].

Depending on the study, the time frame required for the 'late times' phenotype to arise can range from weeks [44-46] to months [39, 52].