Introduction

In eukaryotes, the alteration of chromatin structure is one of the main mechanisms for modifying cell phenotypes by regulating specific DNA replication and mRNA transcription1. In addition to DNA methylation, changing the properties of certain amino acid residues of histones is a major means of modifying the structure of chromatin. The enzymes involved in the acetylation, methylation, ubiquitination, and phosphorylation of histones have been identified and extensively studied to define the biological function of each enzyme2. Many studies have provided evidence that histone modification plays a decisive role in cell fates such as carcinogenesis, differentiation, proliferation, and senescence3.

Polycomb group (PcG) proteins were originally identified in fruit flies and are well conserved from invertebrates to mammals. PcG proteins can act as transcriptional repressors by inhibiting mRNA transcription at specific gene loci through histone H3 trimethylation or histone H2A monoubiquitination4. To initiate and maintain such chromatin modifications, two distinct protein complexes, polycomb repressive complex 1 (PRC1) and PRC2, work in coordination with each other. PRC2 exhibits methyltransferase activity that adds methyl groups to specific amino acid residues of histone H3, while PRC1 modifies histone H2A through its E3 ubiquitin-ligase activity4,5. Mammalian PRC complexes display structural plasticity because of the existence of several paralogs of PcG subunit proteins6. In particular, more than 100 different types of mammalian PRC1 complexes may exist based on a simple combinatorial algorithm7.

Although recent progress in biochemical and molecular analyses involving transgenic animal techniques has revealed the functional importance of the core subunits of the PcG proteins that regulate mRNA expression through histone modifications, how the paralogs of each PcG subunit protein interact with each other to orchestrate the fine-tuning of chromatin structure remains elusive. In this review, we summarize current knowledge about the immune regulatory role of PcG proteins in relation to the compositional diversity of each PRC complex. We also introduce therapeutic drugs that target PcG proteins.

Structural heterogeneity related to the function of PcG proteins

PcG genes were initially identified as genes involved in the regulation of homeotic gene expression, which is critical for the body axis plan and segment development in fruit flies8. PcG proteins are present in plants, nematodes, and metazoan species from flies to mammals, indicating that these proteins are transcriptional repressors that act by modifying chromatin structure and that they have been well conserved during evolution9. Each PcG protein is a subunit of a multiprotein complex, and these complexes (PRC1 and PRC2) fall into two functional groups10.

Embryonic ectoderm development (EED), suppressor of zeste 12 (SUZ12), and enhancer of zeste homolog (EZH) are the catalytic core subunits of PRC2. Since EZH has two paralogs (EZH1 and EZH2), two structural variants are found in the catalytic core of PRC2 (Fig. 1a)11. EZH2 is the enzymatic subunit of the PRC2 complex, which acts as an S-adenosyl-l-methionine (SAM)-dependent histone methyltransferase that catalyzes the mono-, di-, or trimethylation of the lysine 27 residue of histone H3 (H3K27me1, H3K27me2, or H3K27me3) (Fig. 1 and Table 1)11,12,13,14,15,16. EZH1 also acts as a methyltransferase, but, compared to EZH2, its enzyme activity is reduced10. The SET domain of EZH1 or EZH2, which contains the catalytic core and SAM-binding site, is indispensable for their methyltransferase activity. However, purified EZH1 or EZH2 monomers alone are unable to efficiently exert enzyme activity in vitro because they must bind to two other noncatalytic subunit proteins, SUZ12 and EED (Fig. 1, Tables 1 and 2)9,11,12,17,18,19,20,21. SUZ12 contains a zinc-finger domain that can bind to DNA or RNA and facilitate protein–protein interactions22. EED contains WD40 repeats that can putatively bind to H3K27me3 (Table 2)23. The fourth member of the PRC2 core subunits is retinoblastoma-binding protein 4 (RBBP4, also known as NURF55) or RBBP7 (Fig. 1, Tables 1 and 2)9,12,18,19,24,25. Whether RBBP4/7 is included in the catalytic core of PRC2 is still controversial because RBBP4/7 is not required for the catalytic activity of PRC2 in vitro26. However, RBBP4/7 also contains WD40 domains that can bind to histones and facilitate the catalytic activity of PRC2 in vivo26.

Fig. 1: The repressive mechanism of specific mRNA transcription by PcG proteins through the modification of chromatin structure.

Schematic representation of transcriptional repression by PcG proteins according to the ‘hierarchical repressive model’ (a) and the ‘reverse-hierarchical repressive model’ (b). a Core subunits of PRC2 (EED, EZH, SUZ12, RBBP) recognize and repress a target locus by introducing H3K27me3. The CBX subunit of canonical PRC1 (PRC1.2 and PRC1.4) then recognizes the H3K27me3 tag, and canonical PRC1 further represses the target locus by introducing H2AK119Ub. b The KDM2B subunit of noncanonical PRC1 (PRC1.1) recognizes CpG, and PRC1.1 represses the target locus by introducing H2AK119Ub. The JARID2 subunit of PRC2.2 then recognizes the H2AK119Ub tag, and PRC2.2 further represses the target locus by introducing H3K27me3.

Table 1 Each subunit of canonical and noncanonical polycomb complexes in mammals.
Table 2 Paralogs of each subunit of PRC2 and canonical PRC1 in mammals.

In addition to the core subunits of PRC2, several other proteins can bind to these subunits and modulate the enzyme activity of PRC2. Two different types of PRC2 complexes (PRC2.1 and PRC2.2) have been identified in humans based on their noncore subunit proteins (Table 1)11,12,27. PRC2.1 contains three other subunits: polycomb-like protein (PCL), PRC2-associated LCOR isoform (PALI), and elongin B/C and PRC2-associated protein (EPOP) (Table 1)11,12,28,29,30. PCL has three paralogs, PCL1, PCL2, and PCL3, which are also known as PHF1, MTF2, and PHF19, respectively. PALI, also known as C10ORF12, has two paralogs (PALI1 and PALI2)28,29,30. These three noncore subunit proteins (PCL, PALI, and EPOP) can act as enhancers that facilitate the catalytic activity of PRC2.1. PCL is essential for H3K27 trimethylation by PRC2.1 because the recognition of H3K36me2/3 by the TUDOR domain of PCL is a prerequisite for PRC2.1 to introduce H3K27me3 marks31. PCL is also required for PRC2.1 to recognize unmethylated CpG islands of DNA32,33. PALI1 can facilitate the catalytic activity of PRC2 both in vitro and in vivo34. Similar to EZH2-deficient mice, PALI1-deficient mice exhibit embryonic lethality34. EPOP can mediate the interaction between PRC2.1 and elongin B/C, which is important for maintaining the transcriptional repression of PRC2 target loci35.

Adipocyte enhancer-binding protein 2 (AEBP2) and Jumonji AT-rich interactive domain 2 (JARID2) are additional subunits that form PRC2.2 along with the PRC2 core subunits (Fig. 1 and Table 1)11,12,27,36. Both AEBP2 and JARID2 are required to recruit PRC2.2 to chromatin by specifically binding to CpG-rich regions of DNA36. Recent studies have indicated that JARID2-containing PRC2.2 can specifically recognize and bind to the monoubiquitinated lysine 119 residue of histone H2A (H2AK119Ub) tagged by the PRC1.1 (noncanonical PRC1) complex (Fig. 1b)37. The binding of H2AK119Ub by JARID2 can further facilitate the methyltransferase activity of PRC2.2 (Fig. 1b)37.

The subunits of PRC1 complexes are much more diverse than those of PRC2 (Fig. 1b, Table 1)11,12. There are two groups of PRC1 complexes, categorized based on the original findings in fruit flies. Canonical PRC1 complexes are composed of subunit proteins conserved from flies to mammals, whereas the subunit proteins of noncanonical PRC1 complexes are less conserved in flies38. Really interesting new gene 1 (RING) and polycomb group ring finger (PCGF) have been found in both canonical and noncanonical PRC1 complexes, suggesting that these proteins are structurally and functionally essential components38. RING proteins have two paralogs (RING1A and RING1B) that, when combined with PCGF proteins, possess the E3 ubiquitin ligase activity that generates H2AK119Ub (Fig. 1, Tables 1 and 2)9,11,12,17,18,19,20,21,38,39. PCGF proteins have six paralogs (PCGF1–PCGF6)20. Upon interaction with RING proteins, PCGF proteins can increase ubiquitin ligase activity by acting as cofactors39,40. Each PCGF paralog (PCGF1 through PCGF6) can be a subunit of a different type of PRC1 complex (PRC1.1 through PRC1.6) (Fig. 1a, Tables 1 and 2)9,11,12.

Among the six different types of PRC1 complexes (PRC1.1–PRC1.6), the PCGF2 (MEL-18)-containing PRC1.2 and PCGF4 (BMI-1)-containing PRC1.4 complexes are classified as canonical PRC1 complexes41. Chromobox homologs (CBX) can form canonical PRC1 complexes with RING proteins and PCGF2 (MEL-18) or PCGF4 (BMI-1) (Fig. 1a)41. Five CBX paralogs (CBX2, CBX4, CBX6, CBX7, CBX8) have been found to act as subunits of the canonical PRC1 complex in mammals (Fig. 1a)41. The proposed role of CBX in the canonical PRC1 complex is to recruit PRC1 to H3K27me3 tags because CBX proteins contain chromodomains that recognize the H3K27me3 tag introduced by PRC2 (Fig. 1a)42,43. Additionally, polyhomeotic homolog (PHC) and sex comb on midleg homolog (SCMH) can interact with core proteins (RING and PCGF) to form canonical PRC1 complexes (Fig. 1a, Table 1)44,45. PHC proteins have three paralogs (PHC1–PHC3)3,7. SCMH proteins have two paralogs (SCMH1 and SCMH2)46. Both types of proteins contain a sterile α motif domain that allows them to bind to other canonical PRC1 complex proteins and participate in the recruitment of PRC1 to chromatin (Table 1)44,45. PHC proteins also have zinc-finger domains that facilitate nucleic acid binding and chromatin compaction46.
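
The combinatorial estimate mentioned in the Introduction can be illustrated with a short enumeration. The Python sketch below is only an illustration and is not the algorithm used in the cited work. It assumes that a canonical PRC1 complex contains exactly one paralog from each subunit class named above and that every combination can assemble, neither of which has been established. The names `CANONICAL_PRC1_PARALOGS` and `enumerate_canonical_prc1` are hypothetical.

```python
from itertools import product

# Paralogs of each canonical PRC1 subunit class, as listed in the text.
# Assumption: one paralog per class, and all combinations are permitted.
CANONICAL_PRC1_PARALOGS = {
    "RING": ["RING1A", "RING1B"],
    "PCGF": ["PCGF2", "PCGF4"],  # MEL-18 and BMI-1
    "CBX": ["CBX2", "CBX4", "CBX6", "CBX7", "CBX8"],
    "PHC": ["PHC1", "PHC2", "PHC3"],
    "SCMH": ["SCMH1", "SCMH2"],
}


def enumerate_canonical_prc1(paralogs):
    """Return every one-paralog-per-class combination as a dict."""
    classes = list(paralogs)
    return [
        dict(zip(classes, combo))
        for combo in product(*(paralogs[c] for c in classes))
    ]


complexes = enumerate_canonical_prc1(CANONICAL_PRC1_PARALOGS)
print(len(complexes))  # 2 * 2 * 5 * 3 * 2 = 120
```

Under these simplifying assumptions, the canonical subunit classes alone yield 120 theoretical combinations, which shows how a figure above 100 can arise from paralog diversity. How many of these combinations actually form in cells is a separate, experimental question.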

The noncanonical PRC1 complex is composed of more protein subunits than the canonical PRC1 complex (Table 1)11,12. In the noncanonical PRC1 complex, the core subunits (RING1 and PCGF) can interact with RING1 and YY1 binding protein (RYBP), YY1-associated factor 2 (YAF2), or CBX8 (Table 1) via C-terminal RING finger and WD40-associated ubiquitin-like (RAWUL) domains12,47. Previous observations have indicated that RYBP can compete with CBX for the binding site of RING1B48. YAF2 and RYBP occur in the noncanonical PRC1 complex in a mutually exclusive manner, since YAF2 is a homolog of RYBP (Table 1).

The function of the noncanonical PRC1 complex is clearly different from that of the canonical PRC1 complex (Fig. 1). According to the ‘hierarchical repressive model’, PRC2 can repress a target locus via an H3K27me3 tag. The canonical PRC1 complex can recognize this methylation tag through CBX and further repress the target locus by introducing an H2AK119Ub mark (Fig. 1a)43. Recently, the RYBP-containing noncanonical PRC1 complex has been found to show higher E3 ligase activity than the PCGF4-RING1B-containing canonical PRC1 complex49. This finding suggests that another pathway for transcriptional repression exists in addition to the ‘hierarchical repressive model’. Indeed, the CxxC DNA-binding domain of KDM2B in the PRC1.1 complex can specifically recognize CpG DNA sequences and recruit PRC1.1 to a target locus50,51. PRC1.1 then suppresses specific mRNA transcription via an H2AK119Ub tag (Fig. 1b)50,51. Thereafter, JARID2-containing PRC2.2 can specifically recognize and bind the H2AK119Ub tag and further modify the structure of chromatin by introducing an H3K27me3 tag (Fig. 1b)37,50,51. This model is known as the ‘reverse hierarchical repressive model’ because PRC1.1, rather than PRC2.2, is the first to repress the transcription of specific mRNAs.

In fruit flies, putative DNA regions recognized by PRCs have been identified, validated, and designated as PcG/trithorax-group response elements (PREs)2,3,4,5,6,7. The existence of vertebrate PRE sites around CpG-rich sequences has also been suggested36,52. However, the conserved DNA-binding motif of mammalian PRCs and the detailed mechanism by which mammalian PRCs recognize specific DNA regions remain elusive. In fruit flies, it has been suggested that the pleiohomeotic (Pho) protein can recognize PREs and guide the core subunits of PRC1 and PRC2 to PREs since the core subunits of PRC2 or PRC1 do not directly bind to DNA53. In vertebrates, YinYang1 (YY1), a Pho homolog, can bind to a conserved DNA region and interact with PRC1 subunits54. Therefore, YY1 may recognize PRE sites and guide noncanonical PRC1 by interacting with RYBP or YAF255.

The role of PcG proteins in immune regulation

Most studies of the function of PcG proteins in immune regulation have used knockout (KO) mouse models or conditional knockout (cKO) mouse models in which PcG genes are deleted in a cell type-specific manner using the Cre-lox system (Table 3)56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82. Except for the gene encoding RBBP, mice deficient in the genes encoding each core subunit of PRC2 have been generated and characterized (Table 3)56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82. Based on animal studies, Ezh1 can partially replace the function of Ezh2 in specific cell types83,84. For example, Ezh2 is not required for the self-renewal activity of long-term hematopoietic stem cells (LT-HSCs) in adult bone marrow64. However, Ezh1-deficient mice exhibit immunodeficiency due to a significant loss of the self-renewal activity of HSCs56. Because the INK4a/Arf locus, which encodes the cell cycle suppressors p16INK4a and p19Arf, is a target of PcG-mediated repression, the deficiency of certain core subunits of PRC2 and canonical PRC1 can cause the loss of self-renewal activity of HSCs85. In addition to EZH1 deficiency, insufficiency of other subunits of PRC2, including EED or SUZ12, can lead to the loss of the self-renewal activity of HSCs64,66. The deficiency of some canonical subunits of PRC1, including BMI-1 and PHC1, can also cause the loss of self-renewal activity of HSCs67,68,75. However, other canonical subunits of PRC1, including MEL-18, CBX2, CBX8, and PHC2, do not influence the self-renewal activity of HSCs72,76,77,79. The phenotypic variation among mice deficient in different PcG subunits reflects structural heterogeneity that depends on the specific stage of the cells or tissues and arises from the redundancy of the paralogs of each PcG subunit (Table 3). Cell type-specific roles of various PRC1 and PRC2 complexes have already been suggested (Fig. 2)12.
In support of these ideas, EZH2 expression in LT-HSCs peaks on embryonic day 14.5 and gradually decreases thereafter until 10 months after birth64. In contrast, EZH1 expression in LT-HSCs gradually increases from embryonic day 14.5 to 10 months after birth64. The expression patterns of BMI-1 and MEL-18 follow the same paradigm as those of EZH1/2. BMI-1 is mainly expressed in specific lineage precursors of immune cells, whereas the expression of MEL-18 is correlated with mature immune cell populations86. In addition to contributing to the self-renewal activity of HSCs, PcG proteins participate in the differentiation of hematopoietic progenitor cells (HPCs) into specific lineages of immune cells. The contributions of PRC2 and canonical PRC1 to immune cell differentiation according to the ‘hierarchical repressive model’ are summarized in Table 3 and Fig. 2.

Table 3 Hematopoietic cell fate decision by PRC2 and canonical PRC1 according to the ‘hierarchical repressive model’ in mice.
Fig. 2: Functional contribution of PcG proteins during immune cell differentiation.

Schematic representation of particular PRC2 or PRC1 complexes involved in hematopoiesis according to the ‘hierarchical repressive model’.

Studies on the importance of PcG proteins in immune cell function are much less common than studies on the influence of PcG proteins during the differentiation of immune cells (Table 3). Most studies on the functional contribution of PcG proteins to immune cell function have focused on T cells (Table 3). The CD8+ T cell-specific deletion of Ezh2 or Eed using the CD4-Cre or granzyme B-Cre system revealed that the antigen-specific activation of CD8+ T cells requires the function of the PRC2 complex (Table 3)61. Interestingly, the contribution of PcG proteins to CD4+ T cell function is controversial because the phenotypes of mice deficient in different PcG proteins differ considerably from each other. For example, CD4+ T cell-specific Ezh2 deletion leads to type 2 helper T cell (Th2)-prone immunity via the accumulation of memory Th2 cells, which exacerbates allergic diseases (Table 3)59. However, Bmi1 and Mel18 knockout mice are defective in Th2 cell differentiation71,73. Furthermore, Bmi1 knockout mice exhibit enhanced apoptosis of memory Th2 cells69.

Current RNA-seq and chromatin immunoprecipitation (ChIP)-seq data for identifying the target loci of PcG proteins

The phenotypic analysis of transgenic mice in combination with RNA-seq and ChIP-seq analyses might be a good approach for identifying additional target loci of PcG proteins or the unique functions of each PcG protein paralog in specific immune cell types. Table 4 summarizes current gene chip and RNA-seq datasets generated from specific cell types of transgenic mice or specific cell lines subjected to the inhibition of PcG function (http://www.ebi.ac.uk/arrayexpress/). Most of the RNA-seq data were acquired from embryonic stem cells (ES cells), HSCs (LSK cells; LSK, Lin−Sca-1+c-Kit+ cells), hematopoietic stem and progenitor cells (HSPCs), and cancer cell lines, including leukemia, multiple myeloma, sarcoma, ovarian tumor, and gastric cancer cell lines, because the initial identification of PcG function emphasized the maintenance of self-renewal activity (Table 4). To expand the collection of differentially expressed gene (DEG) data, RNA-seq analyses need to be performed using a broad range of immune cells, including B cells, monocytes, dendritic cells, mast cells, and polymorphonuclear cells. Not all DEGs identified in PcG-defective cells are necessarily direct targets of PcG proteins. ChIP-seq data might be needed to verify whether these DEGs are direct targets of PcG proteins. Table 5 summarizes the current ChIP-seq datasets for specific cell types (http://www.ebi.ac.uk/arrayexpress/). The DNA-binding sites of most core subunits of PRC2, except for RBBP, and the core subunits of PRC1.2 (RING1B and MEL-18) have been analyzed by ChIP-seq (Table 5). The DNA-binding sites of some paralogs of CBX and of JARID2, a subunit of PRC2.2, have also been analyzed (Table 5). However, with few exceptions, most of the ChIP-seq data were acquired from stem cell lineages (Table 5).
Therefore, a broad range of cells needs to be analyzed by ChIP-seq using antibodies against the remainder of the PcG proteins, including RBBP, BMI-1, and PHC, to identify novel target genes repressed by PcG proteins.
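
The idea of using ChIP-seq peaks to separate direct from indirect DEGs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a published pipeline: a gene is called "bound" if its transcription start site (TSS) lies within a fixed window of a peak, and only genes upregulated after PcG loss are considered, since PcG proteins act as repressors. The gene names, coordinates, the 2-kb window, and the function names `genes_near_peaks` and `split_degs` are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end)


def genes_near_peaks(
    peaks: List[Peak],
    tss: Dict[str, Tuple[str, int]],
    window: int = 2000,  # assumed promoter window in bp
) -> Set[str]:
    """Return genes whose TSS lies within `window` bp of any peak."""
    bound = set()
    for gene, (chrom, pos) in tss.items():
        for p_chrom, start, end in peaks:
            if p_chrom == chrom and start - window <= pos <= end + window:
                bound.add(gene)
                break
    return bound


def split_degs(upregulated: Set[str], bound: Set[str]):
    """Split derepressed genes into candidate direct and indirect targets."""
    return upregulated & bound, upregulated - bound


# Toy data: placeholder genes and coordinates, not real measurements.
tss = {"geneA": ("chr1", 10_000), "geneB": ("chr1", 50_000), "geneC": ("chr2", 7_500)}
peaks = [("chr1", 9_000, 9_500), ("chr2", 100_000, 100_400)]
upregulated = {"geneA", "geneB"}

bound = genes_near_peaks(peaks, tss)
direct, indirect = split_degs(upregulated, bound)
print(sorted(direct), sorted(indirect))
```

In this toy example only `geneA` is both derepressed and located near a peak, so it is the only candidate direct target. A real analysis would also need peak calling, replicate handling, and a justified choice of window, none of which are addressed here.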

Table 4 RNA-seq or gene-chip data from loss or gain of function of each PcG subunit protein.
Table 5 ChIP-seq data of each PcG subunit protein.

Therapeutic agents for treating hematopoietic malignancies by inhibiting the activity of PcG proteins

Since the function of PcG proteins is important for maintaining the self-renewal activity of stem cells, PcG proteins might act as oncogenes that facilitate tumorigenesis. In support of this idea, high expression of EZH2 has been observed in several hematopoietic malignancies, including myelodysplastic syndromes, acute myeloid leukemia, and various types of lymphomas87,88,89. In particular, EZH2 deficiency in mice can inhibit leukemogenesis by decreasing the proliferation rate of leukemia cells90. Consistent with these observations, the expression levels of canonical subunits of PRC1, including BMI-1, CBX7, CBX8, and RING1A, are elevated in many tumors of hematopoietic origin88,91,92. A mouse model involving Bmi1-deficient mice with transformed cells also supports the notion that BMI-1 can act as an oncogene in some hematopoietic malignant cells93. However, the loss of function of PcG proteins by mutation or deletion might also cause hematopoietic malignancies91. In particular, defects in core subunits of PRC2, including EZH2, EED, and SUZ12, have been found in various acute lymphoblastic leukemias and myelodysplastic syndromes94,95,96,97. Therefore, PRC2 at least can act either as an oncogene or as a tumor suppressor depending on the type of hematopoietic malignant cells involved91. Further study is needed to define the mechanisms underlying the dual functions of these proteins in tumorigenesis.

Table 6 summarizes the inhibitors of PcG proteins that have entered clinical trials for hematopoietic malignancies and other types of tumors. The major group of inhibitors targets EZH enzyme activity (Table 6). Most EZH2 inhibitors undergoing clinical trials compete with SAM for binding to the SET domain98. Among the competitive inhibitors of EZH, tazemetostat (EPZ-6438), an orally administered small molecule, has been tested in a broad range of malignancies, including lymphoma, sarcoma, mesothelioma, ovarian cancer, and advanced solid tumors (Table 6)98. Other inhibitors of PcG proteins that are currently undergoing clinical trials target EED and BMI-1 activity (Table 6). MAK683 is an allosteric EED inhibitor that drives conformational changes in the H3K27me3-binding pocket of EED upon binding99. These conformational changes in EED prevent the interaction between EED and EZH2, thus blocking H3K27 trimethylation99. PTC596 is a BMI-1 inhibitor that can facilitate the degradation of BMI-1 by inducing cyclin-dependent kinase 1-mediated biphosphorylation of the BMI-1 N-terminus100.

Table 6 Inhibitors of PcG proteins undergoing current clinical trials in malignant cells.

Conclusion and future prospects

In this review, we highlighted the structural diversity of mammalian PRC2 and PRC1 complexes in relation to their functional contribution to immune regulation. We also described currently available RNA-seq and ChIP-seq data that could be used to mine new target loci of PcG proteins. Finally, we listed the PcG inhibitors currently undergoing clinical trials. Many previous reports, mainly using loss-of-function models, have demonstrated that PcG proteins are major chromatin modifiers that can modulate many biological processes by repressing specific genes.

Unfortunately, we still do not know how many different types of PRCs exist in nature because of the structural heterogeneity caused by the many paralogs and accessory proteins recruited by PRC complexes. We also do not know how each PRC containing a particular paralog as a subunit contributes to the phenotype of a specific cell type. Resolving these open questions might provide novel targets for PcG-mediated gene regulation and expand the range of PcG proteins considered as therapeutic targets for treating human diseases other than cancer.