Introduction

Stem cells have long-term self-renewing activity and can commit to multiple cell types upon differentiation signals. Since Yamanaka and colleagues demonstrated that the four DNA-binding transcription factors Oct4, Sox2, c-Myc, and Klf4 transform fibroblasts into a type of pluripotent cells known as induced pluripotent stem cells, the importance of transcription factors in cellular reprogramming has been increasingly recognized [1]. However, because the reprogramming efficiency of these four factors is low, it is evident that additional layers of co-regulatory mechanisms exist besides transcription factor-driven regulation [2]. In fact, a recent study demonstrated that the histone modification and DNA methylation profiles differ in one-third of the genome between human embryonic stem (ES) cells and primary fibroblasts [3], indicating that such a remarkable epigenetic difference may serve as a major molecular mechanism in determining the cellular characteristics of these two cell types. Notably, the functions of epigenetic modifiers in stem cell fate decisions have been intensively studied.

Histone lysine methylation has been widely accepted as a key epigenetic modification. Unlike acetylation, methylation does not change the charge of lysine residues and thus has a minimal direct effect on DNA-histone association. Rather, the different methylation status of specific histone lysines can serve as a unique platform for recruiting methylation “reader” proteins that activate or repress genes’ transcriptional activity. In general, histone H3 lysine 4 (H3K4), H3K36, and H3K79 methylation are gene activation marks, whereas H3K9, H3K27, and H4K20 methylation are gene-repressive modifications [4].

Histone lysine methylation is generated by a battery of histone methyltransferases (HMTs) that transfer the methyl group from S-adenosylmethionine to specific lysine residues. For example, H3K4 methylation is mediated by several SET [Su(var)3-9, Enhancer of zeste, Trithorax] domain-containing methyltransferases, including mixed lineage leukemia 1–5 (MLL1–5), SET1A/B, SET7/9, SET and MYND domain-containing protein 1–3 (SMYD1–3), Absent, Small, or Homeotic 1-like (ASH1L), SET domain and Mariner transposase fusion gene (SETMAR), and PR domain zinc finger protein 9 (PRDM9) [5–24]. Methylated lysines exist in three forms: mono-, di- and tri-methylation (me1, me2, and me3).

Similar to other histone modifications, histone methylation can be reversed by histone demethylases (HDMs). The first identified lysine-specific demethylase 1 [LSD1; also known as FAD-binding protein BRAF35-HDAC complex, 110 kDa subunit (BHC110) and Lysine-specific demethylase 1A (KDM1A)], together with LSD2, belongs to the polyamine oxidase family. LSD1 and LSD2 remove methyl groups from di- and monomethylated H3K4 but are unable to demethylate trimethylated H3K4 [25–28]. LSD1 was reported to also have H3K9 demethylation activity [29]. Subsequently, many Jumonji (JmjC) domain-containing histone demethylases have been discovered. In particular, the JARID1 family of histone demethylases (JARID1A–D) can erase H3K4me3 and H3K4me2 [30–35].

In this review, we summarize the recent progress in understanding the functions of H3K4 methyltransferases and demethylases in modulating stem cells’ fates.

H3K4 methylation

H3K4me3 occupies as many as 75% of all human gene promoters in several cell types (e.g., ES cells), indicating that it plays a critical role in mammalian gene expression [36, 37]. In fact, H3K4me3 is required to induce critical developmental genes in animals, including Drosophila and several mammals, and is important for animal embryonic development [38]. H3K4me3 levels are positively correlated with gene expression levels [39, 40] (Figure 1A).

Figure 1

H3K4me3 marks actively transcribed and poised gene promoters in mammals. (A) The genome-wide correlation of mRNA expression levels (High, Medium, Low, and Silent) with H3K4me3 levels at human gene promoters. Note that a dip in H3K4me3 levels may be associated with the nucleosome-free region around the transcriptional start site (TSS). Adapted from [39]. (B) The Venn diagram showing the percentage of genes that have H3K4me3 and/or H3K27me3 in their promoters in mouse and human ES cells. All percentages are based on a total of about 18,000 genes. The term “bivalent” denotes the promoters that contain both H3K4me3 and H3K27me3 marks. Adapted from [36, 37, 43].

Although H3K4me3 is clearly associated with actively transcribed genes, studies have demonstrated that H3K4me3 is also localized around the transcription initiation sites of numerous unexpressed genes in human ES cells, primary hepatocytes, and several other cell types [36, 37, 41]. In particular, it frequently co-resides with the repressive mark H3K27me3 in the promoters of critical differentiation-specific genes [e.g., Homeobox (HOX) gene clusters] that are transcriptionally inactive in ES cells [36, 37, 42, 43] (Figure 1B). It has been proposed that the “bivalent” domains, composed of H3K4me3 and H3K27me3, may maintain differentiation-specific gene promoters in a repressive status in self-renewing stem cells but keep them poised for prompt gene activation upon differentiation stimuli [42]. Consistent with this, many bivalent genes have increased H3K4me3 levels and decreased H3K27me3 levels while being transcriptionally activated during differentiation. Interestingly, recent studies demonstrated that most bivalent domains are occupied by LSD1 [44, 45], indicating that LSD1 plays a role in maintaining low levels of dimethylated H3K4 (H3K4me2), which is often co-localized with H3K4me3. For these reasons, H3K4me3 is classified as a chromatin landmark for transcriptionally active or poised genes in ES cells [41].

Compared with mouse thymocytes, mouse ES cells contain higher levels of total genomic H3K4me3 and have higher H3K4me3 occupancy at the promoter of the pluripotency gene Oct4 [46]. In agreement with this, global decreases in H3K4me3 levels occur during retinoic acid (RA)-induced differentiation of mouse ES cells [47]. In addition, there are dynamic changes in H3K4me3 profiles at specific sets of genes during ES cell differentiation. Such global and local changes in H3K4me3 profiles occur partly because the levels of H3K4me3-regulatory factors [e.g., WD repeat-containing protein 5 (WDR5), MLL1 and MLL3] are modulated [47]. It is believed that higher H3K4me3 levels allow the ES cell genome to be more open and transcriptionally permissive by recruiting chromatin-modifying factors. Therefore, unique H3K4me3 profiles at pluripotency and differentiation-specific genes may be key determinants of cellular identity.

Most H3K4me3-containing promoters are also occupied by H3K9/H3K14 acetylation [41]. In transcriptionally active genes, H3K36me3 and H3K79me2 are significantly enriched downstream of H3K4me3-containing promoters: H3K36me3 peaks toward the 3′ end of genes in gene bodies, whereas H3K79me2 is located toward the 5′ end [41]. Therefore, H3K4me3 likely cooperates with other histone marks for gene activation. The combinatorial arrangement of H3K4me3 and other histone marks may support, at least in part, the “histone code” hypothesis [48].

H3K4me2 decorates genomic regions independently of H3K4me3, although most of it overlaps with H3K4me3 near the transcription start sites [49]. H3K4me2 may have an antagonistic effect on DNA methylation [50]. Monomethylated H3K4 (H3K4me1) also co-occupies regions near the start sites with H3K4me3. Apart from the transcription start sites, H3K4me1, together with H3K27 acetylation, specifies enhancer regions [51, 52]. In summary, H3K4me1, H3K4me2 and H3K4me3 share a role in gene activation, although each plays distinct roles in modulating chromatin function.

H3K4 methyltransferases

Some H3K4 methyltransferases are well conserved in different species. In yeast, the Set1 complex, also called Complex of Proteins Associated with Set1 (COMPASS), catalyzes the mono-, di- and trimethylation of H3K4 [5, 8]. The protein complex is composed of the catalytic component Set1 and seven other regulatory subunits (Cps60, Cps50, Cps40, Cps35, Cps30, Cps25, and Cps15) that are essential for full enzyme activity [38] (Table 1). In Drosophila, there are three Set1 homologs: dSet1, Trithorax (Trx), and Trithorax-related (Trr). The deletion of any of their genes results in lethality in flies, indicating that their target genes may not be redundant. In particular, loss of dSet1, but not Trx or Trr, leads to a global reduction of H3K4me2/3, suggesting that Trx and Trr have more specialized functions [38]. Human SET1A, SET1B, and MLL1–4 are yeast Set1 homologs and are related to dSet1 (the counterpart of SET1A and SET1B), Trx (the counterpart of MLL1 and MLL2), and Trr (the counterpart of MLL3 and MLL4) in Drosophila. Other SET domain-containing histone methyltransferases that methylate H3K4 but are not closely related to yeast Set1/COMPASS have also been identified and include MLL5, SET7 (also called SET9), SMYD1–3, SETMAR, and PRDM9 [6, 15, 24].

Table 1 Subunit composition of H3K4 methyltransferase complexes in yeast and human

SET1A/1B and MLL1–4 are present in multi-protein complexes and share common core subunits, such as WDR5, Retinoblastoma-binding protein 5 (RBBP5), ASH2L, and Dumpy-30 (DPY-30), which are also highly conserved in yeast and flies [38] (Table 1). Several studies have demonstrated that these core subunits are indispensable for the enzyme activity and biological functions of the methyltransferases [53–55]. In addition to common core subunits, there are unique subunits in the individual H3K4 methyltransferase complexes: WDR82 and CXXC finger protein 1 (CFP1) in the SET1 complex; Multiple endocrine neoplasia type 1 (MENIN) and PC4 and SFRS1-interacting protein 1 (PSIP1) in the MLL1 and MLL2 complexes; Host cell factor 1/2 (HCF1/2) in the SET1, MLL1, and MLL2 complexes; and PAX transcription activation domain interacting protein 1 (PTIP), PTIP-associated protein 1 (PA1), Nuclear receptor coactivator 6 (NCOA6), and Ubiquitously transcribed X chromosome tetratricopeptide repeat protein (UTX) in the MLL3 and MLL4 complexes [12, 16, 19, 22, 56–63] (Table 1). These subunits may play important roles in recruiting H3K4 methyltransferases to specific genes and integrating additional histone-modifying capacities (see below).

MLL1 and MLL2

MLL1 (also known as MLL and KMT2A) was initially cloned from acute myeloid and lymphoid leukemias that contain frequent MLL1 chromosomal fusions and translocations [64–66]. The MLL1 gene encodes a protein of 3,972 amino acids; this protein contains several highly conserved functional domains, including the N-terminal AT-hook DNA binding domains, Plant homeo domains (PHD), a Bromo domain, and the catalytic SET domain (Figure 2). Inside cells, MLL1 protein is cleaved into MLL-N (320 kDa) and MLL-C (180 kDa) by Taspase I; these two large fragments dimerize through FY-rich motifs to form the functional MLL complex in vivo [67, 68].

Figure 2

Protein domain architectures and stem cell function of MLL/SET1 H3K4 methyltransferases. AT: AT-hook DNA binding domain; PHD: Plant Homeo Domain; BRD: Bromodomain; FYR: FY-rich domain; SET: Su(var)3-9, Enhancer of zeste, Trithorax domain; HMG: High Mobility Group domain; RRM: RNA Recognition Motif.

Homozygous deletion of Mll1 is embryonic lethal; Mll1+/− mice display retarded growth and hematopoietic defects [69, 70]. Specifically, the expression of key developmental genes, including Hoxa7 and Hoxc9, was shifted from the anterior boundaries toward the posterior regions in Mll1+/− embryos and was lost in Mll1−/− mice [69]. In addition, recent studies using a tissue-specific knockout mouse model revealed that Mll1 is essential for sustaining adult hematopoiesis [71, 72]. Mll1 is not required for survival, proliferation, and differentiation of subventricular zone neural stem cells but plays an essential role in neurogenesis in the postnatal mouse brain [73]. Mechanistically, Mll1 directly occupies the promoter of Distal-less homeobox 2 (Dlx2), a critical regulator of neurogenesis, and is required to resolve the poised bivalent state to the actively transcribed status with predominant H3K4me3 during neurogenesis of neural stem cells [73].

MLL2 (also called MLL4 and KMT2B) has a similar protein domain structure to that of MLL1 and was found to be the MLL1 paralog [74]. Like Mll1, Mll2 is widely expressed during development and in adult tissues. Mll2-null mice die before embryonic day E11.5, with drastically reduced expression of Hoxb2 and Hoxb5 [75]. However, Mll2 may be required only briefly for development, because it appears to be dispensable for mouse development after E11.5 [76]. Mll2−/− ES cells maintain pluripotency, have increased apoptotic activity, and undergo skewed cellular differentiation along three germ layers [77]. Therefore, Mll1 and Mll2 are unlikely to be redundant for gene regulation during early embryonic development. In support of this notion, the phenotypes of Mll1 and Mll2 knockout mice are different in adult tissues. For example, hematopoietic-specific loss of Mll1 caused defects in hematopoiesis [71, 72], whereas Mll2 loss did not cause any aberrant blood profiles or notable pathology [76].

MLL3 and MLL4

MLL3 (also called HALR/KMT2C) and MLL4 (also called ALR/KMT2D) are mammalian counterparts of Drosophila Trr and were co-purified as transcriptional coactivator complexes [14, 78–80]. MLL3 and MLL4 associate with nuclear hormone receptors in both Drosophila and mammals. For example, the MLL3/MLL4 complex is recruited to the HOXC6 gene and activates its transcription in an estrogen receptor-dependent manner [79]. Frequent somatic loss-of-function mutations have been identified in MLL3 and MLL4 genes in human cancers, including colorectal cancer, non-Hodgkin B-cell lymphoma, and medulloblastoma [81–85]. Consistently, a recent study reported that the trr gene product suppresses cell growth in Drosophila eye imaginal discs. Of interest, trr mutation markedly reduced H3K4 monomethylation levels without significantly changing H3K4 di- and trimethylation levels [86], in agreement with earlier findings that Trr is a major H3K4 mono-methyltransferase for Drosophila enhancers [87]. Mll3 homozygous mutant mice, which have an in-frame deletion of a 61-aa catalytic core of the SET domain, exhibited reduced white adipose tissue, stunted growth, and slow cellular doubling rate [88, 89]. During epidermal differentiation, the MLL4 complex is recruited to differentiation-related genes via the transcription factor GRHL3/GET1 and collaboratively activates the epidermal progenitor differentiation program [90].

Recently, we found that MLL4 is essential for the neuronal differentiation of human NT2/D1 stem cells [91]. Mechanistically, the neuron-specific gene NESTIN and key developmental genes HOXA1–3 are activated by MLL4 during RA-induced differentiation. Intriguingly, the tandem PHD4-6 among the seven PHD motifs in MLL4 (Figure 2) specifically recognizes unmethylated or asymmetrically dimethylated histone H4 Arg 3 (H4R3me0 or H4R3me2a) and is required for MLL4’s nucleosomal methyltransferase activity and MLL4-mediated differentiation. H4R3 symmetric dimethylation (H4R3me2s), a gene-repressive mark, blocks the binding activity of MLL4’s PHD4-6. Consistent with this, knockdown of the protein arginine methyltransferase 7, which is involved in generation of H4R3me2s, increases MLL4 occupancy and H3K4me3 levels at the MLL4 target gene promoters and enhances the MLL4-dependent neural differentiation program. Therefore, these results revealed that the trans-tail regulation of MLL4-catalyzed H3K4me3 by protein arginine methyltransferase 7-controlled H4R3me2s serves as a novel epigenetic mechanism underlying neuronal differentiation of human stem cells.

MLL5

Independent studies have demonstrated that MLL5 is required for hematopoiesis [92–94]. Moreover, MLL5 promotes myogenic differentiation by controlling expression of cell cycle genes (e.g., Cyclin A2) and myogenic regulator genes (e.g., Myogenin) [95]. Mll5 knockout male mice are sterile, at least in part because of deregulated expression of genes that are required for terminal differentiation during spermatogenesis [96]. Of interest, although MLL5 was reported to be catalytically inactive [92, 95], GlcNAcylation of MLL5 greatly increased MLL5’s enzymatic activity towards H3K4me1/2 and facilitated RA-induced granulopoiesis in human HL60 promyelocytes [24].

SET1A and SET1B

Human SET1A and SET1B have an N-terminal RNA recognition motif and a C-terminal enzymatic SET domain (Figure 2). The SET1A complex was purified as a multi-protein complex that associates with CFP1 [19]. CFP1 is required for stem cell differentiation and interacts with unmethylated CpGs via its zinc finger domain CXXC [97]. Interestingly, Cfp1−/− ES cells displayed aberrant H3K4me3 peaks at numerous ectopic sites (i.e., distinct regions outside annotated CpG islands), suggesting that CFP1 recruits the SET1 complex to CpG island-containing promoters and consequently prevents it from generating H3K4me3 at inappropriate chromatin locations [19, 98, 99].

A protein sequence analysis revealed that SET1A shares 39% identity with a SET domain protein named SET1B [22]. Although both proteins associate with a similar set of non-catalytic subunits, a confocal microscopy analysis revealed that SET1A and SET1B exhibit distinct subnuclear localizations in euchromatin regions, suggesting that each protein regulates a unique group of target genes [22].

ASH1L

ASH1L (also called Ash1) is the human homolog of Ash1, a Drosophila Trithorax group protein that is essential for expression of several HOX genes. Some reports have indicated that ASH1L primarily acts as a H3K4 methyltransferase [13, 100, 101], whereas others have reported that human ASH1L specifically mono- and dimethylates H3K36 [102–104]. ASH1L cooperates with MLL1 in HOX gene activation and is required for the myelomonocytic lineage differentiation of hematopoietic stem cells [105]. Of interest, a mutation of the SET domain of ASH1L did not decrease HOX gene expression, suggesting that ASH1L’s catalytic activity is dispensable for hematopoietic stem cell differentiation [105].

SET7/9

SET7 (also called SET9) is an H3K4 mono- and dimethyltransferase [6, 106–108]. SET7 expression is upregulated during myoblast differentiation [109]. Specifically, SET7 interacts with Myoblast determination protein 1 (MyoD), a central transcription factor for myogenic gene expression, and is indispensable for MyoD-mediated muscle differentiation. Knockdown of SET7 impaired the association of MyoD with the promoter and enhancer regions of myogenic genes (e.g., Myogenin) and reduced gene expression by decreasing H3K4me1 levels at its target genes. Intriguingly, SET7 antagonizes Suv39h1-mediated H3K9 methylation at the myogenic differentiation gene promoters [109].

SMYD1−3

Smyd1 (also called Bop) is essential for mouse cardiac differentiation [110]. Consistently, knockdown of Smyd1 in zebrafish embryos results in defective skeletal and cardiac muscle differentiation; this cannot be rescued by the Smyd1 catalytic mutant, which lacks H3K4 methyltransferase activity [21]. SMYD2 methylates H3K4 and H3K36, as well as tumor-suppressor proteins such as p53 and Retinoblastoma protein (pRB) [23, 111–113]. Specifically, SMYD2-mediated monomethylation of p53 K370 attenuates the interaction of p53 with p53 target promoters and consequently antagonizes p53-dependent transcriptional regulation [112]. Unlike SMYD1, cardiac-specific knockout of Smyd2 has no phenotype during mouse heart development [114]. SMYD3 is a methyltransferase for both H3K4 and H4K5 [15, 115]. It is overexpressed in colorectal and hepatocellular cancers and promotes cell proliferation [15]. During zebrafish embryogenesis, SMYD3 appears to be important for cardiac and skeletal muscle development [116].

SETMAR

SETMAR (also called METNASE) encodes a chimeric protein that contains an N-terminal SET domain and a C-terminal mariner transposase domain [117] (Figure 3). The function of SETMAR in stem cells remains unknown. However, SETMAR-catalyzed methylation of H3K4 and H3K36 may lead to an open chromatin structure, which may facilitate its transposase-dependent processes, such as foreign DNA integration and DNA double-strand break repair [20].

Figure 3

Protein domain architectures and stem cell function of other H3K4 methyltransferases and core subunits. AT: AT-hook DNA binding domain; AWS: Associated With SET domain; SET: Su(var)3-9, Enhancer of zeste, Trithorax domain; BRD: Bromodomain; PHD: Plant Homeo Domain; BAH: Bromo Adjacent Homology domain; MYND: Myeloid, Nervy, and DEAF-1 domain; MT: Mariner Transposase domain; KRAB: Krüppel Associated Box domain; C2H2: C2H2-type zinc finger; WD: WD40 repeat; SPRY: SplA and Ryanodine domain.

PRDM9

PRDM9 (also called MEISETZ) is a PR/SET domain-dependent histone methyltransferase that is required for meiotic prophase progression [18]. Deletion of the Prdm9 gene attenuates H3K4me3 levels, resulting in defective chromosome pairing, impaired sex body formation, damaged meiotic progression, and sterility in both sexes of mice [18]. Mechanistically, Prdm9 binds to 13-base pair DNA elements via its C2H2 zinc fingers. During early meiosis, this binding event may link Prdm9-catalyzed H3K4me3 to mammalian meiotic recombination hotspots that contain the 13-nucleotide DNA elements [118–120].

Subunits of H3K4 methyltransferases

WDR5, a core subunit of the SET1 and MLL1–4 complexes, plays an important role in ES cell self-renewal and somatic cell reprogramming [47]. WDR5 is highly expressed in ES cells and downregulated upon differentiation. Knockdown of WDR5 resulted in loss of ES cell self-renewal and decreased the generation of induced pluripotent stem cells [47]. WDR5 interacts with OCT4 and activates transcription of the self-renewal factors, such as OCT4 and NANOG, in ES cells. Moreover, WDR5, together with OCT4, NANOG and SOX2, regulates the self-renewal-regulatory network [47]. Similarly, ASH2L is required for the pluripotency of mouse ES cells. ASH2L knockdown resulted in elevated expression of mesodermal lineage differentiation genes [121].

DPY-30 and RBBP5 are other core components of the SET1/MLL methyltransferases. In contrast to ASH2L and WDR5, DPY-30 and RBBP5 are not required for ES cell self-renewal [53]. Instead, DPY-30 or RBBP5 knockdown reduces global and neuronal gene-specific H3K4me3 levels, resulting in inefficient RA-induced neural differentiation of mouse ES cells.

The differing biological outcomes for ASH2L and WDR5 versus DPY-30 and RBBP5 are surprising because these four proteins are core components of the same SET1/MLL1–4 methyltransferases. These unexpected findings might be explained by the following possibilities. Besides the known SET1/MLL1–4 complexes, some of these subunits may be present in other complexes in the same cells, so that they may exert biological functions different from those of the SET1/MLL1–4 complexes. In fact, gel filtration analysis of ES cell nuclear extracts showed that elution profiles of WDR5/OCT4 did not overlap with those of WDR5/ASH2L/RBBP5, suggesting that WDR5 also belongs to another new complex containing OCT4 [47]. Another possible scenario is that cellular levels of some core subunits and H3K4 methyltransferases may change dynamically between ES cells and differentiated cells. Such changes might allow certain H3K4 methyltransferase complexes to be dominant over the others or lead to formation of new functional complexes, subsequently affecting expression of stemness genes and differentiation-specific genes. In support of this, during ES cell differentiation, ASH2L and WDR5 levels are down-regulated whereas MLL1 and MLL3 are up-regulated [47, 121]. In addition, some H3K4 methyltransferase complexes may have non-redundant cellular functions by regulating their unique target genes in a cell type-specific manner, as mentioned earlier. Future studies are required to further understand the distinct roles of the SET1/MLL complexes.

H3K4 demethylases

The reversibility of histone methylation was not clear until the discovery of the first histone demethylase LSD1 in 2004 [25]. Subsequently, a new class of JmjC-domain-containing proteins was identified that can demethylate methylated lysine residues in histones. The F-box and leucine-rich repeat protein (FBXL11, also known as KDM2A) is the first identified JmjC domain-containing demethylase that removes methyl groups from H3K36me2/1 [122]. The catalytic JmjC domain requires iron and α-ketoglutarate as cofactors to hydroxylate methyl groups [123]. Among this class of demethylases, JARID1A–D (or KDM5A–D) proteins specifically remove the methyl group from H3K4me2/3. NO66, a bifunctional lysine-specific demethylase and histidyl-hydroxylase, can demethylate H3K4me/H3K36me and hydroxylate a histidyl group of the non-histone protein Rpl8 [124, 125]. Not surprisingly, the LSD family (LSD1 and LSD2) and JARID1 family of H3K4 demethylases play important roles in gene transcription in stem cell homeostasis.

LSD1 and LSD2

LSD1 protein contains an N-terminal SWIRM domain and a long C-terminal FAD-dependent amine oxidase domain (AOD). The AOD is divided by an insertion known as the tower domain (Figure 4). LSD1 alone demethylates H3K4me2/1 on histones but not nucleosomes, while the association of Co-REST with LSD1 allows LSD1 to demethylate nucleosomal H3K4 [26, 27, 126].

Figure 4

Protein domain architectures and stem cell function of H3K4 demethylases. SWIRM: SWI3, RSC8 and MOIRA domain; AOD-N: Amine Oxidase Domain-N terminal; TOWER: LSD1 tower domain; AOD-C: Amine Oxidase Domain-C terminal; C4H2C2: C4H2C2-type zinc finger; ZF_CW: CW-type zinc finger; AOD: Amine Oxidase Domain; JmjN: Jumonji N domain; ARID: AT-rich interactive domain; PHD: Plant Homeo Domain; JmjC: Jumonji C domain; C5HC2: C5HC2-type zinc finger.

Numerous studies in ES cells and neural stem cells strongly suggest that LSD1 is a key histone methylation modifier in transcriptional regulation for stem cell fate determination. Lsd1-null mice are embryonic lethal around E6.5, and Lsd1-deficient mouse ES cells demonstrate increased cell death and impaired differentiation, such as embryoid body formation defects [127–129]. As in mouse ES cells, LSD1 is required for neural stem cell proliferation; it is recruited by the nuclear receptor TLX to repress negative cell cycle regulators, including p21, in neural stem cells [130]. Interestingly, LSD1 is indispensable for differentiation of several cell types, including skeletal muscles and adipocytes [131, 132]. In mouse ES cells, LSD1 demethylates and stabilizes DNA methyltransferase 1 (DNMT1), and Lsd1 deletion results in progressive loss of DNA methylation [128]. Moreover, LSD1 and its associated nucleosome remodeling and histone deacetylase (NuRD) complex are recruited to Oct4-occupied enhancers at active stemness genes in ES cells, but the repression activities of LSD1-NuRD may be antagonized by histone acetyltransferases (e.g., p300). During mouse ES cell differentiation, Oct4 and acetyltransferase levels are down-regulated, and LSD1-NuRD decommissions active enhancers by removing H3K4me1 while promoting cellular differentiation [45]. In contrast to the above stem cell studies, seemingly conflicting results regarding the role of LSD1 in ES cells have been reported. Knockdown of LSD1 induces differentiation in human ES cells, which is correlated with de-repression of developmental genes with elevated H3K4me2/3 levels [44]. In addition, Lsd1−/− ES cells had a strong potential to generate extraembryonic tissues from the embryoid body [133].

LSD2 (AOF1 or KDM1B) was recently identified as a homolog of LSD1; it demethylates H3K4me2/1 like LSD1 [28, 134–136]. Interestingly, unlike LSD1, LSD2 has no tower domain in the AOD region, but contains unique N-terminal zinc fingers, including C4H2C2 and CW-type zinc fingers, which are required for demethylase activity [136, 137] (Figure 4). A genome-wide mapping analysis revealed that LSD2 primarily resides in the intragenic regions of actively expressed genes [28]. LSD2 may activate its target genes, possibly via its association with transcriptional elongation factors [28]. Lsd2 is not essential for mouse development. However, the DNA methylation of several imprinted genes is lost in oocytes from Lsd2-deleted females [135]. Consequently, the embryos derived from these oocytes exhibited biallelic expression or silencing (i.e., loss of monoallelic expression) of the affected imprinted genes and died before mid-gestation [135]. The molecular mechanism underlying the functional link between H3K4 demethylation and DNA methylation for expression of imprinted genes remains to be investigated.

JARID1A

JARID1A (RBP2 or KDM5A) was identified as a binding partner of pRB protein in the early 1990s [138]. RBP2 contains a highly conserved JmjC domain and was found to be a specific H3K4me3/2 demethylase [30, 139] (Figure 4). Rbp2−/− mice are viable and display mild phenotypic defects in expansion of hematopoietic stem cells and myeloid progenitors. The weak phenotype of Rbp2−/− mice suggests that other JARID1 family proteins may compensate for the loss of Rbp2 [139].

During ES cell differentiation, RBP2 is dissociated from HOX genes, resulting in increased H3K4me3 levels and gene activation [30]. Consistently, Pasini et al. reported that RBP2 associates with the important Polycomb repressive complex 2 (PRC2), which enzymatically generates the repressive mark H3K27me3 for silencing of many differentiation-specific genes in ES cells [140]. A genome-wide chromatin immunoprecipitation (ChIP)-on-chip analysis revealed that RBP2 colocalizes on a subset of PRC2 target gene promoters in mouse ES cells. However, the interaction of RBP2 with PRC2 may not be strong, because mass spectrometric analysis revealed that affinity eluates of the PRC2 component EED, which were purified from ES cell extracts, did not contain RBP2 [141]. Beshiri et al. recently demonstrated that RBP2 augments the repressive effects of the pRB-related protein p130 and E2F4 on cell cycle genes during stem cell differentiation via H3K4me3 demethylation [142]. Interestingly, RBP2 inhibits osteogenic differentiation of human adipose-derived stromal cells [143]. RBP2 interacts with Runt-related transcription factor 2 (RUNX2), a transcription factor that is required for osteogenic differentiation. Subsequently, RBP2 represses RUNX2 target genes, including Alkaline phosphatase, Osteocalcin, and Osterix [143].

JARID1B

JARID1B (PLU1 or KDM5B) was shown to be overexpressed in breast cancer cell lines [144]. As a member of the JARID1 family, PLU1 catalyzes the demethylation of H3K4me2/3. Its full activity requires JmjN, ARID, PHD1, and C5HC2 zinc finger in addition to the catalytic domain JmjC [30, 34] (Figure 4). Consistent with the result of earlier studies, knockdown of PLU1 reduced MCF7 breast cancer cell proliferation and concomitantly upregulated expression of the Breast cancer 1, early onset (BRCA1), Caveolin 1 (CAV1), and HOXA5 genes as a result of increased H3K4me3 levels on their promoters [34]. However, PLU1’s role in ES cell self-renewal and differentiation is controversial. Xie et al. reported that PLU1 is a downstream target of the pluripotency factor Nanog and is required for ES cell self-renewal [145]. PLU1 interacts with the chromodomain protein MRG15 and is recruited to H3K36me3-containing sites within gene bodies of self-renewal-associated genes via MRG15. Knockdown of PLU1 or MRG15 increased intragenic H3K4me3, which produced cryptic intragenic transcription and inhibited transcriptional elongation [145]. Another study showed that constitutive overexpression of PLU1 blocked neural terminal differentiation [146]. On the contrary, Schmitz et al. have provided evidence that PLU1 is required for the neural differentiation of ES cells but is dispensable for self-renewal [147]. Using a genome-wide ChIP-sequencing analysis, they found that PLU1 predominantly localizes on the transcription start sites of target genes, over 50% of which are also occupied by Polycomb group proteins. PLU1-depleted ES cells fail to differentiate into the neural lineage, which correlates with the inappropriate derepression of stem and germ cell genes [147]. These findings are further supported by their recent research in Plu1 knockout mice, which have the phenotype of neonatal lethality and neural defects [148]. The reasons for the discrepancies in these studies regarding the role of PLU1 in ES cell homeostasis are not entirely clear. However, Schmitz et al. indicated that their PLU1 localization data were obtained using a better PLU1 antibody and that the unimportance of PLU1 in ES cell self-renewal was confirmed by both a lentiviral shRNA knockdown method and a genetic deletion approach.

JARID1C and JARID1D

Compared with RBP2 and PLU1, much less is known about the biological functions of JARID1C (SMCX or KDM5C) and JARID1D (SMCY or KDM5D). Both demethylases have similar domain structures and contain a conserved and functional JmjC domain that is responsible for demethylating H3K4me2/3 [30–32]. SMCX is an X-chromosome gene that escapes X inactivation [149] and is often mutated in renal tumors and X-linked mental retardation (XLMR), suggesting that it has important functions in the human kidneys and brain [150, 151]. Indeed, SMCX is highly expressed in the brain during zebrafish development and is required for neuron survival [31]. Moreover, SMCX knockdown reduces the dendritic length of rat primary neurons, a defect that cannot be rescued by XLMR-patient mutants of SMCX with reduced demethylase activity [31]. Therefore, SMCX may play an important role in neuronal development. In addition, Outchkourov et al. reported that SMCX may interact with the transcription factors c-MYC and ELK1 to regulate gene expression in mouse ES cells [152].

JARID1D requires multiple domains, including ARID, JmjC, and C5HC2 zinc finger, for its full demethylase activity towards H3K4me3/2 [32] (Figure 4). JARID1D interacts with RING6A/MBLR, a polycomb-like protein with homology to the Mel18 and Bmi1 proteins [153]. This interaction stimulates JARID1D's enzyme activity in vitro; the protein complex mediates H3K4me3 demethylation at the Engrailed 2 gene promoter and is required for Engrailed 2 gene repression [32]. However, JARID1D's biological role in stem cells is largely unknown. Given its localization on the Y-chromosome, it will be interesting to determine whether JARID1D plays a role in male-specific gene expression in vivo.

NO66

NO66 has been reported to demethylate H3K4me3/2/1 and H3K36me3/2 [124] and to catalyze histidyl hydroxylation of the 60S ribosomal protein Rpl8 [125]. This enzyme inhibits osteoblast differentiation [124]. Specifically, it directly interacts with Osterix, an osteoblast-specific transcription factor, and represses Osterix target gene expression [124]. In addition, NO66 plays a role in mouse ES cell differentiation [154]. During this process, it is recruited to stemness genes (e.g., Oct4 and Nanog) via the PHD finger protein 19 (PHF19), which interacts with the H3K27 methyltransferase complex PRC2; NO66-PHF19-PRC2 represses gene expression by reducing H3K36me3 and increasing H3K27me3 [154].

Conclusions

Stem cells are indistinguishable from somatic cells at the genomic level. In contrast, there are remarkable differences in their epigenomes, which are represented by covalent and noncovalent modifications of histones and DNA. As reviewed herein, specific epigenetic modifiers, such as H3K4 methylation modifiers, may play fundamental roles in orchestrating the epigenomes of cells whose genomic sequences are identical. Consistent with this, many H3K4 methylation modifiers and their components are required for ES cell self-renewal or differentiation. In addition, some of them cooperate with transcription factors for efficient somatic cell reprogramming. For example, WDR5 is required for the efficient generation of pluripotent stem cells induced by Oct4, Sox2, c-Myc, and Klf4 [47]. Therefore, epigenetic modifiers, together with the transcription factor network, may establish epigenomes in a coordinated manner.

Recently, small molecule inhibitors against specific histone methylation modifiers, including LSD1 inhibitors, have been developed by several pharmaceutical companies, although their specificities and efficacies require improvement [155]. Certain inhibitors, alone or combined, may increase somatic reprogramming efficiency or drive somatic reprogramming, perhaps providing new avenues for personalized therapeutic interventions using stem cells. With regard to the roles of histone modifiers in stem cell maintenance and differentiation, many more exciting findings are expected. We predict that our current and future knowledge about stem cell self-renewal and lineage commitment will be highly relevant to cancer stem cell studies, because stem cells and cancer stem cells share several characteristics, such as high degrees of self-renewal and differentiation [156]. We believe that a new era of stem cell epigenetics has begun.