Introduction

Since histones were first discovered over one hundred years ago a great deal has been learned about the structure and function of these small basic proteins [1]. Importantly, histone molecules interact to form a scaffolding unit capable of compacting DNA within the relatively small confines of the nucleus [2]. Compaction is achieved by wrapping DNA around a globular histone octamer composed of a tetramer of histone H3/H4 and two dimers of H2A/H2B. Together, this DNA-protein structure is called a nucleosome and forms the basic repeated unit of chromatin. This classical view of chromatin as a simple structural entity has recently been superseded by a wide body of evidence demonstrating that nucleosome deposition, subunit composition, and post-translational modification can profoundly affect how chromatin function is regulated. For example, nucleosome placement over defined regulatory elements can impact how transcription factors recognize DNA and regulate transcription [3], incorporation of histone variants into the octamer can define how the nucleosome functions [4], and post-translational modification of histone tails can act as nucleation sites for factors involved in chromatin regulation and metabolism [5, 6].

Over the past decade it has become increasingly clear that post-translational modification of histones plays a central role in regulating information that is stored within DNA. Many of these modifications are found on the N-terminal histone tails that protrude from the globular core of the nucleosome and include: phosphorylation, acetylation, ribosylation, ubiquitylation, and methylation. Of particular interest are the recent advances in our understanding of the histone arginine and lysine methylation systems, which will be discussed in this review. These modifications have been shown to play important roles in maintaining genome integrity, regulating transcription, and contributing to epigenetic memory [7, 8]. Unlike some histone modifications that can affect chromatin function by altering the charge and structural interactions within the nucleosome, histone methylation is a relatively inert modification that appears to function as a nucleation site for ‘effector’ proteins [5]. These effector proteins often elicit their function by recruiting other factors that directly impact chromatin. To regulate post-translational histone modifications, a corresponding class of enzymes has evolved that can directly remove these modifications. For example, a class of histone demethylase enzymes has recently been identified [9] that appear to play central roles in regulating chromatin function.

The emergence of the histone methylation and chromatin modification systems as important regulators of cellular information has garnered much attention and an excellent series of detailed review articles covering this topic has been published [7, 10–13]. Therefore, in this article we will provide a more focussed discussion of the recent advances in our understanding of how histone methylation is placed, interpreted, and dynamically regulated in mammals. We will also highlight a series of recent studies that suggest histone methyltransferase/demethylase systems modify non-histone proteins, many of which have important roles in transcription and cancer.

Histone methylation

Methylation of lysine or arginine residues can occur in several modification states. Lysine residues can house either one (me1), two (me2) or three (me3) methyl moieties on their amine group, whereas arginine residues can carry one (me1) or two (me2) methyl groups on their guanidinyl group. The di-methyl arginine state is further defined by whether the modification exists in the symmetric (me2s) or the asymmetric (me2a) configuration [14]. As will be discussed in more detail in later sections, these defined modification states can have different and profound implications on the function of chromatin. The most thoroughly studied histone lysine methylation marks are found on H3K4, K9, K27, K36, K79 and H4K20. In general, H3K4, K36, and K79 methylation are found near active or poised transcriptional units and H3K9 and H4K20 modifications are hallmarks of silenced or heterochromatic regions. Histone arginine methylation occurs on H3R2, R8, R17 and R26 and H4R3 and has roles in defining both active and repressed chromatin states. Our understanding of the histone residues that are modified by methylation has largely been achieved by examining the in vitro substrate specificity of histone methyltransferase enzymes and by the generation of antibody reagents that specifically recognize modified lysine and arginine residues. Although simple and robust, these two approaches have clear limitations in that they either rely on the knowledge of the enzyme that places the modification to identify the site or are essentially predictive and rely on the costly approach of modification specific antibodies.

A coming of age for mass spectrometry based histone methylation analysis

In an attempt to identify new histone methylation marks and to study the combinatorial state of histone modifications, a great deal of effort has been invested in refining mass spectrometry based analysis techniques. One of the goals of using mass spectrometry for histone modification analysis is to apply relatively unbiased methodologies in gauging and defining cellular chromatin modification patterns, thus eliminating some of the limitations inherent to enzyme and antibody based approaches. Initially, ‘bottom up’ mass spectrometry approaches were used in which histones are fragmented into small peptides prior to ionization and mass spectrometry analysis (Fig. 1, left hand side). This proved an effective tool in characterizing histone methylation marks, some of which were not previously identified using traditional techniques. For example, bottom up sequencing yielded evidence that H3K79 [15–17] and H4R3 [18, 19] were methylated and permitted the characterization of the enzymes responsible for placing these modifications. Subsequently, more in depth histone modification analysis has suggested methylation also occurs on: H2BK47 [20], H2BK57 [20], H2BK108 [20], H3K18 [21, 22], H3K23 [21], H3K56 [22], H3K64 [21, 22], H3K122 [21–23], H4K59 [24], H4K31 [20, 21], H4R55 [20], and H4K77 [20]. Although the biological relevance and the histone methyltransferase enzymes that place these additional modifications remain to be fully established, bottom up strategies have proven successful in yielding novel and potentially interesting methylation sites.
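As a rough illustration of why individual methylation states can be told apart by mass, the short Python sketch below computes the expected monoisotopic mass shift for each lysine methylation state. The constant and function names are hypothetical and are not taken from the studies cited above; the mass values are standard monoisotopic masses, on the assumption that each methyl group replaces a hydrogen and so adds a net CH2.

```python
# Illustrative sketch only: expected monoisotopic mass shifts for lysine marks.
# All names are hypothetical; masses are standard monoisotopic values in daltons.

METHYL_SHIFT = 14.01565  # net addition of CH2 per methyl group
ACETYL_SHIFT = 42.01056  # net addition of C2H2O for acetylation


def methyl_mass_shift(n_methyl: int) -> float:
    """Return the mass shift (Da) for a lysine carrying n_methyl methyl groups."""
    if not 0 <= n_methyl <= 3:
        raise ValueError("a lysine carries between zero and three methyl groups")
    return n_methyl * METHYL_SHIFT


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"Kme{n}: +{methyl_mass_shift(n):.5f} Da")
    # Trimethylation and acetylation are nearly isobaric, so telling them
    # apart by mass alone needs a high mass accuracy measurement.
    print(f"me3 minus ac: {methyl_mass_shift(3) - ACETYL_SHIFT:.5f} Da")
```

The small difference between the me3 and acetyl shifts is one reason accurate mass measurement matters when these two marks may occur on the same residue.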

Figure 1

Mass spectrometry based approaches for studying histone modifications. Histones are extracted from tissue or cultured cells and biochemically purified, usually by means of reverse phase high performance chromatography. As the histones elute from the column they are fractionated and individual histones are isolated (i.e. Histone H4 in the schematic). The purified histone is then analysed using bottom up or top down approaches. Bottom up analysis involves enzymatic digestion of the histone prior to ionization, whereas top down analysis relies on ionization of the intact histone. Both approaches permit interrogation of histone modifications, but top down approaches have the advantage of retaining information about modifications that occur on the same histone. MS - mass spectrometry.

One clear disadvantage of ‘bottom up’ mass spectrometry approaches is that information about modifications that occur on the same histone tail can be lost during the digestion step prior to ionization and mass analysis. To overcome this limitation ‘top down’ approaches have been pioneered where larger fragments of the histone are directly ionized prior to mass analysis (Fig. 1, right hand side). This was recently employed to analyse the 50 amino acid N-terminal tail fragment of histone H3.2 [25]. The results of this analysis demonstrated that the majority of histone H3.2 modification occurs in the form of H3K9, K36, and K27 methylation without an apparent modification hierarchy between these sites, suggesting they are independently regulated. Interestingly, the same work also revealed that H3K4 methylation, which is generally considered a mark of active chromatin, is predominantly found on hyper-acetylated H3 and is never found together with H3K9me3, a hallmark of repressed chromatin. These molecular observations corroborate many years of cell-based work that had suggested this type of arrangement of histone modifications in active chromatin [26–29]. As advanced techniques in mass spectrometry are more routinely applied to understanding histone methylation, their utility is becoming more obvious and applicable. Recently top down techniques were applied to study H4K20 methylation during the cell cycle [30]. This work demonstrated that newly synthesized H4 is progressively methylated to the me2 modification state with a subset of H4K20 being methylated to the me3 state. Furthermore, this study demonstrated that once in place, H4K20 methylation is not dynamically regulated, suggesting its levels are not controlled at the global level by histone demethylase activities.
In contrast to previous suggestions that H4K20 methylation and H4K16 acetylation are antagonistic modifications, mass spectrometry analysis demonstrates that they co-exist on the same histone tail and are independently regulated. The utility of the top down mass spectrometry approach has been further highlighted by a recent study that defined the dynamics of histone H4 modification during human embryonic stem cell differentiation [31]. Although in its infancy as a tool for studying histone modification, the use of top down mass spectrometry based approaches to understand histone methylation, the dynamics of these modifications and their combinatorial states has proven a powerful tool. Looking to the future, mass spectrometry approaches will become a common addition to the chromatin biologist's repertoire of experimental tools.

Histone methyltransferases

Since the discovery of the first histone lysine methyltransferase in 2000 [32], an extended family of histone lysine methyltransferases has been identified (reviewed in [7, 12]). The majority of these methyltransferases share a SET domain as their catalytic core and have very defined residue and modification state specificity. With the exception of a few lysine methylation sites that are of low abundance in chromatin, the enzymes that place lysine methylation are known and their catalytic activities are characterized. In contrast, histone arginine methylation is placed by the protein arginine N-methyltransferase (PRMT) class of histone methyltransferase enzymes. Although the histone residue specificity of the PRMT enzymes is not as well characterized as that of the lysine methyltransferase enzymes, these enzymes are typically more promiscuous and often target multiple arginine residues on the N-terminal tails of histone H3 and H4. Furthermore, PRMT enzymes also target a broad range of other cellular proteins, suggesting they contribute to regulation of additional non-chromatin based processes [8].

A more complex methylation system in mammals with a division of labour

In budding yeast there is a simple histone lysine methylation system with three prominent histone lysine methylation marks that occur on histone H3 in position K4, K36 and K79. The KMT2/SET1 (H3K4), KMT3/SET2 (H3K36), and KMT4/DOT1 (H3K79) methyltransferase enzymes place all three methylation states (me1, me2, me3) on their respective target residues [33]. In stark contrast to budding yeast, mammals have a greatly expanded genomic histone lysine methylation site modification profile, produced by a larger family of enzymes. For example, there are at least three proteins that place H3K36 methylation [34–37]. Among these H3K36 methyltransferases, individual enzymes have evolved to recognize and target defined modification states. This is exemplified in a recent study that demonstrated the human KMT3A (Setd2) enzyme catalyses H3K36me3 and its depletion in cultured cells causes an acute loss of the me3 state but does not affect me2/1 levels [37]. This observation suggests that H3K36me1 and me2 modification states are maintained by additional methyltransferases. A similar division of labour was observed when H3K36 methyltransferases were depleted in Drosophila cells [38], suggesting this diversification in modification state specificity is a feature of higher eukaryotes.

This trend of expansion and diversification among the histone lysine methylation system also applies to enzymes that place other histone methylation marks. For example, H3K4 and H3K9 methylation are catalysed by an extended group of methyltransferase enzymes most of which have distinct modification state preferences in higher eukaryotes [7]. Although there are mechanisms in budding yeast to regulate to some extent how individual methylation states are defined [39–42], the sheer increase in complexity of the enzymes that place histone methylation and their unique modification state specificity in mammals suggest a more important role for this modification system in higher eukaryotes. This may represent an evolutionary adaptation that allows increased capacity for information storage through strict placement and recognition of defined modification states by individual methyltransferase enzymes and effector proteins. Perhaps this expansion and specialization of the histone methylation system helps multicellular organisms to cope with more complex requirements for long term chromatin based memory in controlling gene expression patterns and tissue specificity during development.

Histone methyltransferases as pharmacological targets in cancer and disease

With a firm grasp of the identity of many of the enzymes that place histone methylation, a significant amount of effort has been applied to understand how these enzymes are involved in regulating important nuclear processes. Mouse knockout models for enzymes that place H3K4, H3K9, and H3K27 methylation have been informative in demonstrating that methyltransferases play important roles in regulating transcription, maintaining genome integrity, and development [36, 43–49]. Furthermore, enzymes involved in placing histone methylation are targets of translocations that produce oncogenic fusion proteins [50, 51]. The realization that histone methyltransferases impact so many important aspects of cellular biology has spawned a huge effort both in the academic and industrial setting to identify small molecule compounds that directly and specifically regulate individual histone methyltransferase enzymes. An initial attempt to identify inhibitory molecules used a library of natural compounds to search for molecules that target the KMT1A (Suv39H1) H3K9me3 methyltransferase enzymes [52]. One of the compounds isolated in this screen, a fungal toxin chaetocin, inhibited KMT1A in addition to a related H3K9me1/2 methyltransferase, KMT1C (G9a). Chaetocin also functionally inhibited H3K9 methylation in cell based assays, but its use as a molecular probe appears limited due to complications with cellular toxicity. Furthermore, from a preclinical standpoint the chemical properties of chaetocin make it a poor lead compound for further inhibitor design.

A more recent study took advantage of a high throughput activity based assay to screen a library of 125 000 chemical compounds for their ability to inhibit the histone H3K9me1/me2 methyltransferase enzyme KMT1C [53]. This screen resulted in seven positive hits based on four distinct molecular scaffolds. Of these initial hits one compound, BIX-01294, was shown to specifically inhibit KMT1C. The functionality of this compound was then tested in cell-based assays and efficiently inhibited KMT1C mediated H3K9me1/me2 methylation at known KMT1C target loci. Furthermore, cells treated with BIX-01294 largely recapitulated global changes in histone H3K9 methylation levels observed in KMT1C deficient cells. Although the IC50 of BIX-01294 is relatively modest at 1.7 µM, this base compound will provide an opportunity for lead optimization and synthesis of more potent methyltransferase inhibitors. Clearly from these initial reports reasonable scope exists for the development of enzyme specific inhibitors of the histone methylation system. Compounds that are nontoxic and cell permeable will clearly provide important molecular probes for studying histone methylation in the laboratory and provide a basis for designing compounds that are useful for pharmacological intervention in human disease where the histone methylation system is perturbed.
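To put the reported IC50 in context, the following minimal Python sketch estimates the fraction of enzyme activity remaining at a given inhibitor concentration. It assumes a simple one-site dose-response curve with a Hill slope of 1, which is an assumption made here for illustration and is not stated in the study; only the 1.7 µM value comes from the text, and the function name is hypothetical.

```python
# Illustrative sketch only: one-site inhibition curve with an assumed Hill slope of 1.
# The default IC50 of 1.7 µM is the value quoted for BIX-01294 in the text.


def fractional_activity(inhibitor_um: float, ic50_um: float = 1.7) -> float:
    """Return the fraction of activity remaining at an inhibitor concentration (µM)."""
    if inhibitor_um < 0:
        raise ValueError("inhibitor concentration cannot be negative")
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return 1.0 / (1.0 + inhibitor_um / ic50_um)


if __name__ == "__main__":
    for conc in (0.0, 1.7, 17.0):
        print(f"{conc:5.1f} µM inhibitor: {fractional_activity(conc):.2f} of activity remains")
```

Under these assumptions, half of the activity remains at the IC50 by definition, and roughly a tenfold excess of compound is needed to suppress activity to about a tenth.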

Effectors

Unlike acetylation and phosphorylation that alter the charge properties of a modified residue, methylation of lysine and arginine residues is comparatively inert and does not affect residue charge. Instead, histone methylation acts as a nucleation site for effector proteins that elicit functional outcome. For example, the plant homeodomain (PHD) of the bromodomain PHD transcription factor (BPTF) protein binds H3K4me2/me3 (Fig. 2D) and recruits the NURF chromatin-remodelling complex to target genes in concert with transcriptional activation [54, 55]. In contrast, the chromo domain of heterochromatin protein 1 (HP1) proteins recognizes H3K9 (Fig. 2B) methylation and helps to reinforce the repressed state at silenced genes and heterochromatic regions of the genome [56–58]. As we learn more about the specificity and function of methyl-lysine binding effector proteins it is becoming clear that these proteins also have the capacity to preferentially recognize defined modification states. This molecular selectivity provides the histone methylation system with the ability to encode additional information within each modified residue depending on the modification state specificity of the methyltransferase placing the modification and the effector protein that recognizes it. Furthermore, several PHD domain containing proteins were recently shown to have the unique ability to recognize histone lysine residues only when they lack methylation [59, 60]. Therefore it appears that methylated residues act as modules capable of encoding information that can be preferentially recognized by site and state specific effector proteins (reviewed in [5, 61]). Although the majority of well studied histone lysine methylation marks have corresponding effector proteins, some of the more recently identified histone lysine modifications occur in the core of the nucleosome and this could sterically limit accessibility by effector proteins.
It remains possible that these methylation marks may have effector independent roles in regulating nucleosome structure and function. Nevertheless, over the past year several important structural studies have highlighted unique properties of certain methyl-lysine binding effector proteins (Fig. 2A–C) demonstrating a unique complexity in chromatin recognition by effector proteins.

Figure 2

Effector proteins form an aromatic cage that recognizes methylated lysine residues (A–E). Cartoon representations corresponding to the three dimensional structure of effector protein methyl-lysine binding domains (top half of each section) with a close up view of the aromatic cage in association with a methylated ligand (bottom half of each section). (A) The KMT1D ankyrin repeat in complex with H3K9me2 (PDB 3b95), repeats 3, 4 and 5 shown only. (B) The chromodomain of heterochromatin protein 1 (HP1) in complex with H3K9me2 (PDB 1kna). (C) The MBT repeats of L3MBTL1 in complex with H4K20me2 (PDB 2pqw), repeat 2 shown only. (D) The PHD finger of bromodomain PHD transcription factor (BPTF) in complex with H3K4me3 (PDB 2f6j), PHD finger shown only. (E) The double tudor domain of KDM4A in complex with H4K20me3 (PDB 2qqs).

One cage, two binding modes

The crystal structures of effector proteins bearing PHD, chromo, and tudor domains have been solved (Fig. 2A–E). A recognition motif consisting of two to four aromatic residues that create a cage capable of accommodating the lysine methyl ammonium group appears to be a common chromatin recognition interface for these domains. The KDM4A (JHDM3A/JMJD2A) histone demethylase enzymes contain a double tudor domain that recognizes either H3K4 or H4K20 when modified in the me2 and me3 modification state [62]. Interestingly, the amino acid sequence surrounding the H3K4 and H4K20 share no significant sequence homology suggesting that modification site specificity is inherent to the tudor domain. Whilst the structural basis for H3K4 binding has been elucidated [62], the molecular mechanism by which the same effector could also bind H4K20 was unknown. This was recently clarified by a study reporting the structure of the KDM4A double tudor domain in complex with an H4K20me3 peptide [63](Fig. 3A). In both structures, the aromatic cage engages the methylated H3K4 and H4K20 residues endowing methylation dependent substrate recognition. In contrast, the bi-functional binding site specificity of the hybrid double tudor domain appears to be due to the usage of two binding interfaces. The H3K4 peptide traverses the intra-domain boundary of the two tudor domains, whereas the H4K20 peptide makes contacts with only one of the tudor domains. Therefore the two peptides bind the effector surface in opposite orientations and engage in different binding interactions (Fig. 3A). Despite the clear specificity of the KDM4A tudor domain for methylated histones it remains to be determined whether this chromatin binding motif is required to target histone demethylase activity to genomic loci that also contain methylated histones.

Figure 3

Unique substrate recognition properties of two methyl-lysine recognition domains. (A) Space filling representation of the three dimensional structure of the KDM4A tandem-tudor domain in complex with H3K4me3 (magenta) and H4K20me3 (yellow) histone substrates. The N- and C-termini of the histone peptides are labelled and the lysine side chain is depicted projecting into the aromatic methyl-lysine recognition cage (cyan). The two unique binding faces of KDM4A appear to permit the dual substrate recognition properties of this methyl-lysine binding effector protein. (B) Space filling representation of the RAG2-PHD domain in association with an H3 peptide (green) containing H3K4me3 and H3R2me2s modifications. The methyl-lysine binding aromatic cage is coloured cyan and the Tyr445 residue that interacts with the symmetric dimethyl-arginine is coloured magenta.

RAG2 PHD — a dual-mark reader?

The proximity of lysines and arginines within the histone tails suggests there may be synergistic or antagonistic cross-talk between adjacent modifications. Two recent studies have addressed the relationship between H3K4 trimethylation, a transcription activation mark, and H3R2 methylation. Using in vitro peptide pull-downs, it was demonstrated that H3R2 methylation abrogates H3K4me3 recognition by many known effector proteins including the double tudor domain of KDM4A and the PHD domains of tumour suppressor protein ING2 (inhibitor of growth 2) and BPTF. In contrast, the PHD domain of recombination activation gene RAG2, a component of the VDJ recombinase, binds more efficiently to H3K4me3 when the adjacent H3R2 residue is dimethylated [64, 65]. The structure of RAG2-PHD bound to a dual-modified H3R2me2/H3K4me3 peptide revealed two significant differences when compared to other H3K4me3 binding PHD domains (Fig. 3B). Firstly, the H3K4me3 binding site in RAG2 does not form the typical aromatic cage, but instead has a channel lined by two aromatic residues. Secondly, and more interestingly, RAG2 lacks an acidic residue that forms salt bridges with the unmodified H3R2 residue in other H3K4me3 recognition domains. In the RAG2 PHD domain this acidic residue is replaced by a tyrosine that favours an interaction with symmetrically di-methylated H3R2. Therefore, the RAG2 PHD domain can uniquely bind H3K4me3 in the context of H3R2me2s and may represent the first example of a dual-mark reader.

Ankyrin and MBT repeats — lessons for recognizing lower methylation states

The ankyrin repeats of histone methyltransferases KMT1C and KMT1C-like protein (KMT1D) preferentially bind H3K9me1/me2 and hence represent the newest addition to the already diverse class of methyllysine effectors [66]. The structure of KMT1D bound to an H3K9me2 peptide demonstrates that the ankyrin repeats recognize H3K9me2 using a partially aromatic pocket lined by tryptophans on three faces (Trp839, 844, 877) and an acidic residue on the fourth face (Glu847) (Fig. 2A). Importantly, the acidic residue forms a salt bridge with the methylammonium group sterically constricting the cavity and preventing recognition of the me3 methylation state. The biological implications for the presence of domains in KMT1C and KMT1D which recognize the same modification that their enzymatic domain catalyses is interesting and merits further study, but such an arrangement could represent a mechanism for spreading of H3K9me1/2 from nucleation sites in silenced chromatin.

The KMT1C and KMT1D ankyrin repeats are not alone in preferentially recognizing lower lysine methylation states. Other examples include the 53BP1 tandem tudor domain that selectively recognizes H4K20me1 and me2 [67] and the lethal (3) malignant brain tumour-like protein 1 (L3MBTL1), which binds various me1 and me2 modifications [68, 69](Fig. 2C). Structures of these two effector molecules in complex with methyl-lysine ligands reveal a striking similarity with KMT1C/D ankyrin repeats in the arrangement of the binding pocket. Methyl-lysine recognition occurs in a constricted pocket with three to four aromatic residues and a critical acidic residue. The acidic residue again interacts with the methyl ammonium group precluding recognition of the me3 modification state suggesting this is the determining factor in constricting this group of effectors to the me1 and me2 methylation states. In support of this possibility, if a tyrosine residue in the aromatic cage of the BPTF PHD finger is substituted with glutamic acid, the PHD domain is converted into a me2 recognition domain from its usual preference for the me3 modification state.

WD40 repeats

The WD40 repeat of the WDR5 protein was originally identified as a potential binding domain for methylated H3K4. Subsequently, several structural studies have revealed that histone binding by WDR5 occurs through several peptide side-chain contacts to the WD40 domain without creating a specific hydrophobic or aromatic cage as described for the other methylated lysine binding modules [70–73]. Instead, in the case of WDR5, the methylated lysyl side-chain might be exposed to the solvent, thus allowing further modification by histone methyltransferases [72].

Histone Demethylases

Despite the fact that global histone methylation turnover is relatively low [74, 75] it has recently been demonstrated that histone lysine methylation can be dynamically regulated by histone demethylases [76, 77]. These enzymes are of two general classes. The first class of enzymes are amine oxidase enzymes which are typified by the mammalian lysine specific demethylase 1 (KDM1/LSD1) which uses FAD as a co-factor and targets removal of the me1 and me2 modification states (Fig. 4A, top) [76]. The second class of enzymes belongs to a large family of proteins that contain a Jumonji-C (JmjC) domain as their catalytic core. The JmjC domain-containing proteins are iron and alpha-ketoglutarate dependent oxygenases that target removal of all three histone lysine methylation states (Fig. 4A, bottom). The structures of the KDM1 [78–80] (Fig. 4A, top) and KDM4A [81–84] (Fig. 4A, bottom) histone demethylases have been solved and have provided important insight into the substrate recognition and regulatory properties of these proteins. Interestingly, the crystal structures of peptide substrate complexes from both types of demethylases reveal that the active sites are predominantly polar as opposed to the largely hydrophobic recognition sites described above for the effector domains [81, 85]. Not surprisingly, the identification and characterization of histone lysine demethylases has demonstrated that these enzymes regulate many of the cellular processes that histone methylation has been implicated in including transcriptional regulation, maintenance of genome integrity, and regulation of epigenetic memory [9, 86].

Figure 4

Proposed lysine and arginine demethylase reaction mechanisms. (A) A schematic indicating potential lysine demethylation reaction mechanisms. (Top) Mono-methyl lysine demethylation catalyzed by JmjC domain-containing proteins using 2OG and Fe(II) as cofactors. The reaction produces succinate, CO2, and formaldehyde as by-products. A ribbon structure of the catalytic domain of the KDM4A demethylase enzymes is depicted to the left (PDB 2oq6). (Bottom) The amine oxidase histone demethylase, KDM1, uses FAD as cofactor to demethylate mono-methyl lysine. The reaction produces H2O2, FADH2, and formaldehyde as by products. The ribbon structure of the KDM1 catalytic domain is depicted to the left (PDB 2v1d). (B) A schematic indicating potential arginine deimination and demethylation reaction mechanisms. (Top) Deimination of mono-methyl arginine catalysed by the peptidylarginine PADI4 enzyme. The ribbon structure of the PADI4 enzyme is depicted to the right (PDB 2dex). This reaction antagonizes histone arginine methylation by converting it to citrulline. (Bottom) Demethylation of mono-methyl arginine by the JmjC domain-containing protein JMJD6. The atomic structure of JMJD6 remains to be solved.

Identification of a histone arginine demethylase

The JmjC family of proteins has been extensively analysed for enzymatic activities that target histone lysine methylation (reviewed in [9, 87–89]). Recently, a member of this family of proteins, JMJD6, was shown to apparently reverse histone arginine methylation [90]. In contrast to previously identified histone deiminase enzymes that convert methyl arginine to citrulline and simply antagonize arginine methylation [91, 92] (Fig. 4B, top), JMJD6 utilizes a hydroxylation based reaction to directly reverse arginine methylation. JMJD6 appears to remove both H3R2 and H4R3 methylation and targets both the me1 and me2 modification states. This important discovery suggests that both arginine and lysine methylation can be dynamically regulated, but how JMJD6 contributes to regulation of arginine methylation dependent processes in vivo remains to be examined. Interestingly, mass spectrometry analysis of JMJD6 reacted histone substrate revealed that JMJD6 also catalyzes oxidation of several non-modified lysine residues. The biological relevance of this additional activity remains elusive and warrants further examination.

The by-product of histone demethylation mediates DNA damage dependent gene activation

Histone demethylation can produce reactive by-products including formaldehyde and H2O2 that may be damaging to chromatin and nuclear proteins. Currently, it remains unclear if these by-products are further metabolized to make them less harmful to the cell or whether they have adverse effects in the nucleus due to their reactive nature. In addressing this question, a recent study has revealed that H2O2 produced as a by-product during KDM1 mediated demethylation at estrogen receptor (ER) target genes results in production of 8-oxo-guanine lesions [93] (Fig. 5A–B). This DNA damage event resulted in the mobilization of the 8-oxo-guanine DNA glycosylase-1 (OGG1) and topoisomerase IIb repair enzymes to the regulatory regions of the gene (Fig. 5C). Surprisingly, this DNA damage event appears to be required for efficient transcription of the ER target genes. The authors propose that single stranded breaks induced during the DNA repair process may help to facilitate DNA bending, permitting more efficient RNApolII loading onto the promoter during gene activation (Fig. 5D). Interestingly, a recent report has also implicated a glycosylase dependent process in mediating removal of methylated DNA bases during the same estrogen receptor mediated gene activation event [94, 95]. Clearly, DNA breaks induced by this system could also result in similar transcriptional outcomes as are proposed for an H2O2 damage dependent mechanism. In light of these two independent mechanisms for inducing single stranded DNA breaks during ER induced gene activation, it remains to be determined what the relative contribution of each pathway is to transcriptional activation. Nevertheless, it appears that in some instances the reactive by-products of the demethylation process may contribute in an unexpected way to chromatin regulation.
Given that formaldehyde is produced during JmjC-mediated demethylation reactions and can cause adverse DNA and protein adducts, it is likely that a formaldehyde scavenging system is used to inactivate this reactive by-product. A possible candidate for this type of reaction is the class III alcohol dehydrogenase that uses GSH and NAD to inactivate formaldehyde [96, 97]. Since this enzyme is also found in the nucleus [98], it will be important to determine if this system is targeted to genomic sites undergoing histone demethylation to inactivate formaldehyde.

Figure 5

Hydrogen peroxide produced by KDM1 (LSD1) during demethylation contributes to transcriptional activation. (A) During transcriptional activation of estrogen receptor target genes, KDM1 removes repressive H3K9me2 marks. (B) A by-product of this demethylation reaction is hydrogen peroxide (H2O2), which is reactive and can cause 8-oxo-guanine (8-OG) lesions on DNA. (C) KDM1 mediated 8-OG is targeted for removal by the 8-oxo-guanine DNA glycosylase-1 (OGG1) and perhaps other components of the base excision repair system. This repair process leads to single stranded DNA nicks that are a substrate for topoisomerase IIb (TOPO). Recruitment of topoisomerase IIb can lead to alterations in DNA architecture. (D) Changes in DNA architecture may aid in RNApolII loading onto target genes by promoting chromatin accessibility or DNA bending and therefore contribute to transcriptional activation. ER, estrogen receptor.

Histone demethylases play important roles in germ cell development and embryonic stem cell function

Although JmjC domain-containing histone demethylases were identified only recently, our understanding of the function of this large family of enzymes has advanced rapidly through the use of gene knockdown and knockout strategies [99–112]. A recent study has determined that two histone demethylase enzymes, KDM3A (JMJD1A) and KDM4C (JHDM3C/JMJD2C), are direct transcriptional targets of the pluripotency promoting transcription factor Oct-4 [113]. Based on the central role of Oct-4 in regulating embryonic stem (ES) cell pluripotency, it was proposed that regulation of these demethylase enzymes by Oct-4 may contribute to downstream maintenance of stemness through histone demethylation. To test this hypothesis, KDM3A (an H3K9me2 demethylase) and KDM4C (an H3K9/36me3 demethylase) were knocked down in mouse ES cells using RNAi mediated approaches. Interestingly, when these enzymes were depleted, mouse ES cells lost their characteristic morphology and self renewal was inhibited. Loss of ES cell characteristics appeared to correlate with reduction in H3K9me levels and changes in gene expression. For example, KDM3A knockdown caused reduced expression of the Tcl-1 gene, a known regulator of self renewal in ES cells, and the promoter region of the gene displayed increased levels of H3K9me2. Furthermore, knockdown studies revealed that KDM4C binds to the Nanog gene and potentiates gene expression while counteracting H3K9me3 and HP1 effector protein binding. Together, these observations suggest a central role for H3K9 demethylation in maintaining stem cell function. Surprisingly, it has subsequently been demonstrated that KDM3A hypomorphic and knockout mice are viable and grossly normal with the exception that male mice are infertile [114]. This suggests that although KDM3A is important for ES cell pluripotency in cell culture, it is not essential for normal mouse development.
Nevertheless, in KDM3A deficient mice male sterility appears to be caused by a lack of expression of genes central to the histone to protamine transition during sperm development. Loss of normal gene expression correlated with increases in H3K9me at promoter regions suggesting that histone demethylation contributes to regulated gene expression during germ cell development in males. Given the complex gene expression patterns of demethylase enzymes in mammals, it will be important to examine in more detail how developmental and tissue specific regulation of these interesting enzymes contributes to normal epigenetic regulation of gene expression.

Reversible methylation of non-histone proteins

Although studies on protein methylation have most recently focussed on lysyl- and arginyl-methylation of histones, it should be emphasized that methylation of non-histone proteins also occurs to a significant extent (reviewed in [115]). Many important proteins have been shown to be subject to lysine methylation, including the tumour suppressor protein p53, a kinetochore protein (DAM1) [116], the retinoic acid receptor alpha [117], components of the transcriptional machinery such as TAF10 [118], as well as chloroplast (Rubisco) [119] and mitochondrial proteins (cytochrome c) [120]. Furthermore, many proteins are arginine methylated, including RNA-binding proteins, splicing factors, coactivator p300/CBP, and DNA polymerase β [8]. Given the importance of histone methylation in regulation of important chromatin based processes, it seems likely that protein methylation in general has important regulatory functions and is subject to regulation by demethylase enzymes.

Dynamic and modification state specific methylation of p53

Lysine methylation has recently been identified on the tumor suppressor p53 at three particular sites in the C-terminal domain [45, 121–124] (Fig. 6). Methylation of p53 by the KMT7 (SET7/9) methyltransferase enzyme on Lys372 in the C-terminal region of the protein following DNA damage results in p53 stabilization and trans-activation of p53 target genes like p21 [124]. Conversely, mono-methylation of Lys370 by KMT3C (Smyd2) antagonizes association of p53 with target genes and therefore inhibits p53 mediated trans-activation [121]. Interestingly, di-methylation of p53 on Lys370 creates a binding site for the tudor domain containing co-activator protein 53BP1 and permits activation of p53 target genes. These observations suggest that differing methylation states on the same residue of p53 can have diverse functional outcomes. In analogy to histone methylation, it was recently demonstrated that p53 Lys370me2 can be dynamically regulated by the LSD1 histone demethylase enzyme; removal of Lys370me2 inhibited association of p53 with 53BP1 and repressed p53 function [125]. To further the analogy to histone modification, the same p53 lysine residues that are methylated are also subject to other modifications like acetylation, sumoylation and ubiquitylation [126]. This suggests that cross talk or interference between different protein modification pathways can also occur in the post-translational modification of p53.

Figure 6

p53 is regulated by protein methylation and demethylation. The p53 protein is made up of an N-terminal transcriptional activation domain (TAD), a central DNA binding domain (DBD), and a C-terminal domain (CTD). Methylation of the CTD on Lys372 by SET7/9 occurs following DNA damage and stabilizes p53 leading to transactivation of p53 target genes. Conversely, mono-methylation of Lys370 by KMT3C/Smyd2 antagonizes binding of p53 to target genes and inhibits p53 mediated transactivation. Surprisingly, di-methylation of the same Lys370 residue creates a binding site for 53BP1 leading to transactivation of p53 target genes. Di-methylation of p53 is enzymatically reversed by the LSD1 histone demethylase indicating that this post-translational modification and its effects on p53 function are regulated. Recent evidence also indicates the KMT5/Set8 enzyme methylates Lys382 in the CTD, indicating that p53 is even more broadly regulated by lysine methylation than previously realized.

A complicated and dynamic pattern of histone methylation

Although we now know many of the enzymes that place and remove histone methylation, it is not completely clear how the action of these enzymatic activities translates into genomic histone methylation profiles in mammals. In many ways this understanding has been limited by the technology used to interrogate histone methylation profiles. Chromatin immunoprecipitation (ChIP) is the most widely used technique to define histone methylation profiles. ChIP relies on crosslinking histones to DNA followed by immunoprecipitation using antibodies that recognize specific methylation marks. This DNA is then analysed using quantitative PCR or by hybridization to microarrays to define the relative enrichment of a modification at a given locus. The relatively small size of the yeast genome has made array hybridization techniques feasible as well as cost effective to obtain genome wide profiles of histone methylation in this organism [127, 128]. In mammalian systems, this type of approach has been limited due to costs associated with arrays that interrogate the entire genome. Recent advances in massively parallel sequencing technologies have provided a cost-effective alternative to analysing ChIP DNA (ChIP-Seq) and provided the first comprehensive glance at the genome wide histone methylation profiles [129, 130].
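As a rough illustration of how the relative enrichment of a modification at a given locus is often quantified from ChIP followed by quantitative PCR, the sketch below uses the common percent-of-input convention. The calculation is not taken from the studies cited above; the function names, the 1% input fraction and the Ct values are assumptions chosen purely for illustration, and the sketch assumes perfect amplification efficiency (a doubling per cycle).

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Return ChIP signal as a percentage of input chromatin.

    ct_ip          -- qPCR threshold cycle for the immunoprecipitated DNA
    ct_input       -- qPCR threshold cycle for the saved input sample
    input_fraction -- fraction of chromatin saved as input (assumed 1% here)
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in the interval (0, 1]")
    # Adjust the input Ct to what it would be for 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    # Each cycle of difference corresponds to a two-fold difference in template.
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def relative_enrichment(target_percent, control_percent):
    """Return enrichment at a target locus relative to a control locus."""
    return target_percent / control_percent


if __name__ == "__main__":
    # Hypothetical Ct values for a promoter and a control region.
    promoter = percent_input(ct_ip=24.0, ct_input=23.5)
    control = percent_input(ct_ip=29.0, ct_input=23.5)
    print(f"promoter: {promoter:.3f}% of input")
    print(f"control:  {control:.3f}% of input")
    print(f"relative enrichment: {relative_enrichment(promoter, control):.1f}")
```

Normalizing to input rather than reporting raw Ct values corrects for differences in the amount of chromatin entering each immunoprecipitation, which is why some form of this calculation typically precedes any comparison between loci.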

‘ChIPing away’ at histone methylation profiles in mammals

Several important revelations regarding the methylation state of histones in mammals have arisen from studies using ChIP-array and ChIP-seq technologies. Firstly, histone methylation marks appear in a highly co-ordinated fashion over genic and non-genic regions. Although some of these profiles were inferred from studies analysing isolated genomic regions in mammals and histone methylation in model organisms like budding yeast, the genome-wide view of mammalian histone methylation indicates that there are very specific modification patterns that share uniformity over similar genomic elements. For example, H3K4me3 is a hallmark of regulatory elements at the 5′ end of transcriptionally active genes or of genes poised for transcriptional activation, whereas H3K36me3 is largely restricted to the body and 3′ end of the gene [129–131]. These observations are in close agreement with the targeting properties of the enzymes that place these modifications. For example, it is known that H3K4 methyltransferases are associated with transcriptional regulators that recognize promoters, whereas enzymes that place H3K36 methylation can associate with the elongating form of RNA pol II over the body of genes [132]. Since the profiles of H3K4me3 and H3K36me3 appear to be highly correlated with the initiation and elongation of transcription, these landmarks appear to provide a general tool for identifying novel transcriptional units. In particular, microRNAs and non-coding RNAs may be rapidly processed from longer precursors and not readily detectable at the RNA level, but histone modifications over their regulatory and genic regions highlight their existence on chromatin [129, 130]. In non-genic regions, the histone H3K9me3 and H4K20me3 silencing marks are associated with repetitive or transposable DNA elements, including satellite sequences and long terminal repeats.
It is thought that these marks may be placed in response to the production of double stranded RNAs through mobilization of the RNA interference based transcriptional silencing pathway as has been reported for H3K9 methylation in other organisms.

A second important observation regarding mammalian histone methylation profiles is the existence of a specific bivalent modification pattern containing H3K4me3 and H3K27me3 at the regulatory elements of certain genes in ES cells [129, 133–135]. This observation was initially very puzzling given that H3K4me3 was known to be involved in transcriptional activation whereas the H3K27me3 modification is a part of the polycomb mediated silencing system [136–138]. It turns out that these bivalent domains are largely found at genes with more complex expression patterns, including those encoding key developmental transcription factors [129, 135]. In ES cells, genes with bivalent domains are largely silenced or expressed at low levels, but during differentiation into defined cell lineages their bivalent modification profile is often resolved to contain either H3K4me3 in concert with transcriptional activation, or H3K27me3 with the gene remaining strongly silenced. Interestingly, the recently discovered KDM6 class of histone demethylase enzymes, which remove H3K27me3, were found to be part of an H3K4 methyltransferase complex [103–105, 110, 139, 140]. This suggests that bivalent domains may be resolved by reinforcing the H3K4 methylation state while actively removing the H3K27me3 modification. Based on these observations, bivalent modifications in ES cells appear to play an important role in maintaining genes in a poised epigenetic state from which they can be resolved to either the activated or permanently silenced state following cellular differentiation.
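The notion of a bivalent promoter can be made concrete with a minimal sketch that labels promoters according to whether they overlap H3K4me3 peaks, H3K27me3 peaks, or both. This is not the analysis used in the cited studies: the interval representation, the function names, the gene names and the coordinates are all hypothetical, and real analyses rely on dedicated peak callers and statistical thresholds.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def has_mark(promoter, peaks):
    """True if any peak in the list overlaps the promoter interval."""
    return any(overlaps(promoter, peak) for peak in peaks)


def classify_promoters(promoters, h3k4me3_peaks, h3k27me3_peaks):
    """Label each promoter by the combination of marks that overlap it.

    promoters -- dict mapping a gene name to a (chrom, start, end) window
    Returns a dict mapping each gene name to one of:
    'bivalent', 'H3K4me3-only', 'H3K27me3-only', 'unmarked'.
    """
    labels = {}
    for gene, window in promoters.items():
        k4 = has_mark(window, h3k4me3_peaks)
        k27 = has_mark(window, h3k27me3_peaks)
        if k4 and k27:
            labels[gene] = "bivalent"
        elif k4:
            labels[gene] = "H3K4me3-only"
        elif k27:
            labels[gene] = "H3K27me3-only"
        else:
            labels[gene] = "unmarked"
    return labels


# Hypothetical promoter windows and peak calls, for illustration only.
example_promoters = {
    "geneA": ("chr1", 1000, 3000),
    "geneB": ("chr1", 10000, 12000),
    "geneC": ("chr2", 500, 2500),
    "geneD": ("chr2", 8000, 10000),
}
example_h3k4me3 = [("chr1", 1500, 2500), ("chr1", 10500, 11000)]
example_h3k27me3 = [("chr1", 2000, 4000), ("chr2", 0, 600)]

if __name__ == "__main__":
    result = classify_promoters(example_promoters, example_h3k4me3, example_h3k27me3)
    for gene, label in sorted(result.items()):
        print(gene, label)
```

Running the same classification on profiles from ES cells and from a differentiated lineage would, under these assumptions, show how a promoter labelled bivalent resolves to one mark or the other.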

Although histone arginine methylation is known to play important roles in transcriptional regulation, its genomic profile in mammalian chromatin is poorly understood. In a recent study, the profile of H3R2me2a at a cohort of genes was analysed by ChIP and quantitative PCR [141]. This study revealed that H3R2me2a is generally found over silenced genes and inhibits recruitment of H3K4 methyltransferase enzymes by blocking effector protein recognition of the adjacent H3K4 methylation site. Conversely, H3K4 methylation at active genes appears to block PRMT6 from methylating H3R2 [141, 142]. This interesting observation suggests that there is a specific interplay between methylation at H3R2 and H3K4, with a unique interdependence between histone lysine and arginine methylation. A future challenge will be to profile histone arginine methylation on a genome-wide scale and to understand the functional interplay between these modifications and other covalent histone modifications.

Conclusions and future directions

The importance of the histone methylation system in regulating chromatin based processes is becoming increasingly clear as we understand more about the enzymes that place methylation, the proteins that interpret it, and the enzymes that counteract this modification. A clear challenge for the future is to understand and integrate how additional modifications on the same histone, within the same nucleosome, and in contiguous chromatin domains interact to define local and global epigenetic histone modification patterns. In part this will rely on mass spectrometry based approaches to interrogate combinatorial modifications on individual histones and on comprehensive genome location analysis of histone modifications using ChIP-array or ChIP-seq technologies to understand genomic modification patterns. Armed with a more defined understanding of the histone modification pattern, genetic manipulation of factors involved in defining these arrangements will help to elucidate how modifications like histone lysine and arginine methylation dictate or respond to epigenetic states involved in transcription and DNA repair. A more holistic and integrated view of the histone modification system will be essential in defining how these processes are perturbed and exploited in human disease and may provide novel avenues for pharmacological intervention. Clearly, with the pace at which the histone methylation and chromatin modification fields are advancing and the new technologies being applied to their study, more secrets of this important epigenetic modification system will soon be revealed.