Evidence supporting a critical contribution of intrinsically disordered regions to the biochemical behavior of full-length human HP1γ
HP1γ, a non-histone chromatin protein, has elicited significant attention because of its role in gene silencing, elongation, splicing, DNA repair, cell growth, differentiation, and many other cancer-associated processes, including therapy resistance. These characteristics make it an ideal target for developing small-molecule drugs for both mechanistic experimentation and potential therapies. While high-resolution structures of the two globular regions of HP1γ, the chromo- and chromoshadow domains, have been solved, little is currently known about the conformational behavior of the full-length protein. Consequently, in the current study, we use threading, homology-based molecular modeling, molecular mechanics calculations, and molecular dynamics simulations to develop models that allow us to infer properties of full-length HP1γ at atomic resolution. HP1γ appears as an elongated molecule in which three Intrinsically Disordered Regions (IDRs, 1, 2, and 3) endow this protein with dynamic flexibility, intermolecular recognition properties, and the ability to integrate signals from various intracellular pathways. Our modeling also suggests that the dynamic flexibility imparted to HP1γ by the three IDRs is important for linking nucleosomes with PXVXL motif-containing proteins in a chromatin environment. The importance of the IDRs in intermolecular recognition is illustrated by the building and study of both IDR2 HP1γ−importin-α and IDR1 and IDR2 HP1γ−DNA complexes. The ability of the three IDRs to integrate cell signals is demonstrated by combined linear motif analyses and molecular dynamics simulations showing that posttranslational modifications can generate a histone mimetic sequence within the IDR2 of HP1γ, which, when bound by the chromodomain, can lead to an autoinhibited state.
Combined, these data underscore the importance of IDRs 1, 2, and 3 in defining the structural and dynamic properties of HP1γ, discoveries that have both mechanistic and potentially biomedical relevance.
Keywords: HP1, HP1γ, CBX3, Molecular modeling, Molecular dynamics, Epigenetics, Chromatin
The heterochromatin protein 1 (HP1) family of histone mark readers, the focus of the current study, was one of the first types of chromatin regulators to be identified [1, 2]. This family of proteins participates in evolutionarily conserved processes in organisms ranging from early eukaryotes to humans [2, 3]. Human cells produce three different HP1 protein isoforms, HP1α (CBX5), HP1β (CBX1), and HP1γ (CBX3), which regulate the expression of entire networks of genes that are critical for normal embryonic development and the maintenance of most homeostatic processes, including cell cycle control, proliferation, apoptosis, differentiation, and DNA damage response [2, 4]. In addition, the expression and deregulation of HP1-mediated processes associate with the development, spreading, and prognosis of several cancers. Consequently, better understanding of the biochemical properties of HP1 proteins has both biological and medical implications.
The current work represents an extension of work in our laboratory, which seeks to understand the biological and pathobiological roles of HP1γ. Early biochemical studies revealed that HP1γ recognizes and binds specific di- and tri-methylated forms of histones (K9H3 and K26H1) and translates this biochemical information into a defined pattern of gene expression [5, 6, 7]. The ability of HP1γ to recognize these marks was subsequently mapped to a small region within the N-terminal domain, known as the chromodomain. In addition, HP1γ uses this chromodomain to recruit the related histone methyltransferases, G9a and GLP, which write dimethylated K9 histone marks as part of a positive-feedback loop that leads to increased concentration of reader–writer complexes on specific genomic regions where they are needed to regulate gene expression. G9a and GLP have the ability to auto-methylate at an internal K-containing peptide, which mimics methylated histones (histone mimicry). HP1γ also recruits an additional histone methyltransferase, SUV39H1, in a manner that is independent of its methylation status and instead relies on a specific linear motif with a PXVXL consensus sequence. To recognize and bind the PXVXL motif, HP1γ must first form homodimers or heterodimers with HP1α or HP1β [3, 10]. Dimerization and PXVXL recognition, which are imparted to HP1γ by its C-terminal chromoshadow domain, recruit additional chromatin regulators that may impart further instructions for the regulation of genomic and epigenomic functions [3, 10]. Thus, due to the functional importance of both the chromo- and chromoshadow domains, structural studies have begun to focus on deciphering the biophysical properties that determine their function, in the hope that this knowledge may aid in the development of drugs for manipulating HP1γ-mediated processes in experimental and therapeutic settings.
Several laboratories have focused on studying the function of less well-characterized regions of the HP1γ molecule, namely the most N- and C-terminal regions and the linker located between the chromo and chromoshadow domains. Unfortunately, no NMR or X-ray crystallographic studies have yet yielded useful information regarding the properties of these less-known regions [11, 12]. Therefore, there is a need for a better understanding of the structure and biophysical behavior of full-length human HP1γ, which requires assigning biophysical properties to those regions for which atomic-resolution data are lacking and establishing their role in molecular connectivity and flexibility as well as intermolecular interactions. Consequently, using a combination of structural bioinformatics, molecular modeling methods, and molecular dynamics approaches, we here report that HP1γ is an elongated molecule, in which three Intrinsically Disordered Regions (IDRs, 1, 2, and 3) endow this protein with dynamic flexibility, intermolecular recognition properties, and the ability to integrate signals from various intracellular pathways. Our models and the inferences derived from them integrate, complement, explain, and extend available experimental data, providing new insights that can serve as the structural rationale for future experimentation and drug design.
Materials and methods
Generation of a structural model for full-length HP1γ
Modeling of HP1 complexes
The HP1γ−HP1γ homodimer and heterodimers with HP1α and HP1β were docked by homology using the structure of the chromoshadow domain of the mouse HP1α and HP1β (PDB: 1GUW and 3Q6S for HP1β; 3KUP and 3DM1 for HP1γ; as well as 3I3C and 3FDT for HP1α). The three-dimensional complex structure of HP1γ bound with α-importin was generated by docking its linker region to a previously solved structure of α-importin (PDB: 1PJN) to achieve maximal intermolecular interactions by the bipartite cluster of basic amino acids as previously described. For this purpose, the IDR2 region was modeled first by homology to the conformation described for the isolated N1N2 NLS (PDB: 1PJN), which is a paradigm for docking homologous peptides to α−importin. Because of its high level of structural similarity (RMSD = 0.3), this peptide was easily docked manually to the respective NLS receptor of α−importin. Intermolecular interactions of the HP1γ-α-importin complex, including salt bridge interactions, hydrogen bonds, electrostatic interactions, and hydrophobic interactions, were calculated with the Receptor-Ligand function of Discovery Studio Client 4.1 using the default parameters. The three-dimensional complex structure of HP1γ bound to B-DNA was generated by using DP-Dock, which has been well validated by our laboratory and others. DP-Dock uses a nonspecific B-DNA model to probe the binding site on a 3D model of a protein that is known to bind DNA, but for which the specific contacts are unknown. Using the structure of a DNA binding protein as input, the method first automatically generates an ensemble of protein–DNA complexes obtained by rigid-body docking with nonspecific canonical B-DNA molecules. Models were subsequently selected by clustering and ranking them according to their DNA–protein interfacial energies.
Molecular dynamics (MD) simulations
The MD simulations of HP1γ and its complexes were performed using the all-atom force field in CHARMm c36b2 at a temperature of 300 K (NVT ensemble). The molecule was first energy-minimized using a two-step protocol of steepest descent followed by conjugate gradient minimization. All of these steps were done using the SHAKE procedure. A distance-dependent dielectric implicit solvent model was used with a dielectric constant of 80 and a pH of 7.4. Using the same procedure, additional MD simulations were performed on models of HP1 complexes and on HP1γ mutants. In order to better approximate experimental conditions, additional simulations were run using generalized Born (GB) implicit solvation with simple switching and a NaCl concentration of 150 mM. Studies on the flexibility of HP1γ required two simulations, one of 100 ps and another of 2 ns.
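To make the effect of the distance-dependent dielectric concrete, the following minimal sketch compares a pairwise Coulomb energy under a constant dielectric with one under a linear distance-dependent dielectric. It assumes the common form ε(r) = ε·r (r in Å), which makes the energy fall off as 1/r²; the exact functional form and constants used by the simulation software are not stated in the text, and the function name, the approximate Coulomb constant, and the example charges are illustrative only.

```python
# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); illustrative value.
COULOMB_KCAL = 332.06


def coulomb_energy(q1: float, q2: float, r: float,
                   eps: float = 80.0,
                   distance_dependent: bool = True) -> float:
    """Pairwise electrostatic energy in kcal/mol for charges q1, q2 (in e).

    ASSUMPTION: the distance-dependent dielectric is linear,
    eps(r) = eps * r with r in Angstrom, so the energy scales as 1/r**2.
    With distance_dependent=False a constant dielectric eps is used (1/r).
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    effective_eps = eps * r if distance_dependent else eps
    return COULOMB_KCAL * q1 * q2 / (effective_eps * r)


# Hypothetical ion pair (+1 / -1) at 2 and 4 Angstrom separation.
e_near = coulomb_energy(1.0, -1.0, 2.0)
e_far = coulomb_energy(1.0, -1.0, 4.0)
e_const = coulomb_energy(1.0, -1.0, 2.0, distance_dependent=False)
```

Under this assumption, doubling the separation reduces the interaction fourfold rather than twofold, which is how the model approximates the screening of long-range electrostatics by solvent without explicit water molecules.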
Linear motif analysis for post-translational modifications, protein–protein interaction domains, protein–protein interaction motifs
The presence of a nuclear localization signal (NLS) was derived by combining linear motif analysis using PsortII with confirmation of the similarity to other NLSs by a virtual peptide display method using Prints. The potential of IDR1 and IDR2 for binding to DNA was predicted using DP-Bind. Prediction of post-translational modification sites on the CBX isoforms was performed by compiling and statistically scoring linear motifs for phosphorylation, acetylation, methylation, ubiquitination, and sumoylation as predicted by 20 different software programs. The programs used to predict phosphorylation were NetPhosK 1.0, NetPhos 2.0, Kinasephos 2.0, DIPHOS, PhosphoSVM, Scansite, Musite, and PPSP. Acetylation sites were predicted using PAIL, ASEB, BRABSB-PHKA, LysAcet, and LAceP. Methylation sites were predicted using BPB-PPMS and MASA. Ubiquitination sites were predicted using BDM-PUB, CKSAAP UbSite, and UbPred. Sumoylation sites were predicted using GPS-SUMO and SUMOplot (http://www.abgent.com/sumoplot/). Results from these predictions were then compiled and statistically scored to assign specificity potential to sites that were predicted to undergo modification in HP1 proteins. Briefly, for each program, we considered sites for which the prediction score was above the cut-off that had been derived using a training set of experimentally validated modified sequences. Subsequently, we developed a meta-prediction score (MPS) by assigning a maximum score of 1 to sites that were predicted by all of the programs cited. Scores for sites predicted by fewer programs were expressed numerically relative to this maximum score. Results of these predictions were then compared to experimentally validated sites listed in the PhosphositePlus and PHOSIDA databases to determine whether the predicted sites have also been found in large-scale OMICs analyses.
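The meta-prediction scoring described above can be illustrated with a minimal sketch. It assumes that the MPS of a site is the unweighted fraction of programs whose score for that site exceeds that program's cut-off, so that a site called by every program scores 1; the exact weighting used in the study is not specified here. The program names, cut-offs, scores, and site labels are hypothetical placeholders, not data from this work.

```python
from typing import Dict


def meta_prediction_scores(
    predictions: Dict[str, Dict[str, float]],
    cutoffs: Dict[str, float],
) -> Dict[str, float]:
    """Return, for each site, the fraction of programs that call it.

    predictions maps program name -> {site label -> raw score}.
    cutoffs maps program name -> threshold for a positive call.
    ASSUMPTION: the MPS is an unweighted fraction of agreeing programs.
    """
    n_programs = len(predictions)
    if n_programs == 0:
        return {}
    counts: Dict[str, int] = {}
    for program, site_scores in predictions.items():
        threshold = cutoffs[program]
        for site, score in site_scores.items():
            # Register every site seen, even if this program rejects it.
            counts.setdefault(site, 0)
            if score > threshold:
                counts[site] += 1
    return {site: hits / n_programs for site, hits in counts.items()}


# Hypothetical example: three programs, three candidate sites.
example_predictions = {
    "program_A": {"site_1": 0.91, "site_2": 0.40, "site_3": 0.75},
    "program_B": {"site_1": 0.80, "site_2": 0.65},
    "program_C": {"site_1": 0.70, "site_3": 0.20},
}
example_cutoffs = {"program_A": 0.5, "program_B": 0.5, "program_C": 0.5}

example_mps = meta_prediction_scores(example_predictions, example_cutoffs)
```

In this toy input, `site_1` is called by all three programs and receives the maximum score of 1, while the other two sites are each called by one program and score one third. A site that a program does not report at all is treated as not predicted by that program.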
Immunoprecipitation of HP1γ complexes and mass spectrometry
Subconfluent HeLa cells were lysed and immunoprecipitation of HP1γ was performed using the Pierce Crosslink Magnetic IP/Co-IP Kit according to the manufacturer’s instructions. HP1γ antibody (Abcam) was cross-linked to the Protein A/G magnetic beads using disuccinimidyl suberate (DSS) to minimize IgG contamination in the final elution. The immunoprecipitated HP1γ complexes were resolved on a 4–15 % Criterion Tris–HCl polyacrylamide gel (Bio-Rad) and stained with Bio-Safe Coomassie Stain (Bio-Rad) according to the manufacturer’s recommendations. Subsequently, bands were selected for excision and processed for nano high-pressure liquid chromatography electrospray tandem mass spectrometry (nano-LC-ESI-MS/MS) by the Mayo Medical Genome Facility Proteomics Core.
For visualizing the shape and contour of the HP1γ dimer, we produced and purified an N-terminal 6×His-tagged recombinant form of this protein using the pET vector system (Novagen, CA). BL21 (DE3) bacterial cells carrying the HP1γ-encoding plasmid were grown overnight and induced with 0.5 mM IPTG for 90 min at 32 °C. The recombinant protein was purified using the Thermo Scientific HisPur Cobalt Resin Kit according to the manufacturer’s instructions. Protein was dialyzed overnight and concentrated to a final concentration of 1 mg/ml. For visualization at the electron microscopy level, 10 μl of the purified protein solution was placed on the surface of glow-discharged formvar carbon-coated grids. After 30 s, the grids were blotted and stained for 30 s in 1 % uranyl acetate. Micrographs were acquired using a JEOL JEM-1400Plus TEM at 80-kV accelerating voltage, equipped with a Gatan Orius 832 camera.
Building a high-resolution molecular model of full-length human HP1γ
We sought to build a model to enhance our understanding of the structure and molecular dynamics of the human full-length HP1γ. The goal of our study was to use Short Linear Motifs (SLiMs) algorithms, homology modeling, threading, in silico mutations, docking, and molecular dynamics simulations to infer biochemical and biophysical information contained particularly within those regions of the protein for which the structure has not been determined. These regions, which together encompass 41.5 % of the protein, correspond to the 31 a.a. N-terminal and 12 a.a. C-terminal tails, as well as the 33 a.a. peptide that links the two known globular domains. Several observations led to modeling these regions of HP1γ as Intrinsically Disordered Regions (IDRs) 1, 2, and 3 (Fig. 1a), an assignment that subsequent MD simulations supported. Initially, hydropathic analyses, shown in Fig. 1b, indicated that these regions display a high polar-to-hydrophobic ratio of residues, a characteristic of Intrinsically Disordered Protein Regions. Furthermore, several order-to-disorder prediction algorithms, such as PrDOS, metaPrDOS, POODLE, DISpro, DisEMBL, IUPred, PONDR-FIT, PreDisorder, OnD-CRF, RONN, FoldIndex, DISOclust, and GlobPlot2, revealed a large propensity for each of the three regions to remain unfolded as an IDR in solution (Fig. 1c; Supplementary Fig. 1a–c). As a negative control, we performed the same disorder meta-prediction on a helical region of the HP1γ chromoshadow domain (PQIVIAFYEER; residues 161–171). The results of this meta-prediction show that most of this region is ordered, as opposed to the IDRs (Supplementary Fig. 1e). Homology-based modeling of the HP1γ IDR2 domain, using the Xenopus laevis N1N2 phosphoprotein structure as a template (PDB: 1PJN), also indicated its tendency to adopt a random coil conformation (Fig. 1e).
We chose N1N2 as a template since it was used in previous structural studies to determine the specificity of α-importin for a variety of nuclear localization signal sequences. The structure of the N-terminal tail (IDR1), also as a random coil, was derived from threading results (Fig. 1d), which were congruent with the predictions of disorder (Fig. 1c; Supplementary Fig. 1a–c). Similar random coil assignments to the structure of the HP1γ linker region (IDR2) and the C-terminal tail (IDR3) were obtained by threading (Fig. 1d–f) and were congruent with predictions of disorder (Supplementary Fig. 1a–c).
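As an illustration of the kind of hydropathic analysis mentioned above, the following minimal sketch computes a mean hydropathy and a polar-to-hydrophobic residue ratio for a sequence segment. It assumes the Kyte-Doolittle scale and, for simplicity, classifies any residue with a positive Kyte-Doolittle value as hydrophobic and all others as polar; neither the scale nor the classification used for Fig. 1b is stated in the text, so both are assumptions. The example input is the ordered control peptide quoted above, since the IDR sequences themselves are not reproduced here.

```python
# Kyte-Doolittle hydropathy values (ASSUMPTION: the study's scale is not stated).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment: str) -> float:
    """Average Kyte-Doolittle value over all residues of the segment."""
    if not segment:
        raise ValueError("segment must not be empty")
    residues = segment.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def polar_to_hydrophobic_ratio(segment: str) -> float:
    """Ratio of polar to hydrophobic residues.

    ASSUMPTION: a residue counts as hydrophobic if its Kyte-Doolittle
    value is positive; every other residue counts as polar.
    """
    residues = segment.upper()
    hydrophobic = sum(1 for aa in residues if KYTE_DOOLITTLE[aa] > 0)
    polar = len(residues) - hydrophobic
    return polar / hydrophobic if hydrophobic else float("inf")


# Ordered control peptide from the chromoshadow domain (residues 161-171).
control = "PQIVIAFYEER"
control_mean = mean_hydropathy(control)
control_ratio = polar_to_hydrophobic_ratio(control)
```

Under these assumptions, a segment dominated by charged and polar residues would give a strongly negative mean and a ratio well above 1, which is the pattern associated with intrinsic disorder, whereas the control helix sits close to neutral.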
Structural and dynamics properties of the full-length HP1γ molecule
The IDR2 domain of HP1γ mediates protein–protein interactions: heterodimerization with α-importin
The IDR1 and IDR2 domains of HP1γ mediate protein–DNA interactions
Post-translational modifications of the intrinsically disordered regions of HP1γ have the ability to influence intermolecular interactions and histone mimicry
In summary, at the onset of the current study, most of the structural considerations related to HP1γ had been confined to its globular chromo and chromoshadow domains. However, little was known about how the rest of the primary sequence influenced the behavior of HP1γ. Using several methodologies germane to structural bioinformatics, modeling, docking, dynamics, and mutational analyses, we have gathered evidence that supports a critical contribution of intrinsically disordered regions in defining the connectivity, dynamic flexibility, and intermolecular interactions of this protein. This new knowledge, therefore, significantly furthers our understanding of the biophysical properties and biochemical behavior of this important epigenetic regulator.
The current work was initiated as a means to extend our understanding of the molecular properties of HP1γ. However, based on homology and evolutionary conservation of this protein to other members of its family, our models are likely to be applicable to isoforms and orthologs of HP1γ. HP1 proteins are among the most widely characterized epigenetic regulators, with many of their functions being conserved throughout evolution. HP1γ associates with the development of human diseases, including many forms of deadly cancers [3, 4]. Recent studies have applied state-of-the-art biophysical methods to solve the structure of this protein so as to advance our understanding of the basic biochemical mechanisms underlying its function, in the hope that these efforts will aid in the future design of small-molecule inhibitors. These studies produced the structure of both the chromodomain and chromoshadow domain [11, 74]. In spite of this useful information, no reliable full-length model for HP1γ yet exists. Notably, however, extensive biochemical studies indicate that other parts of the protein, namely the most N- and C-terminal regions as well as the linker, which joins the chromo and chromoshadow domains, may contribute to its function. Toward this end, the current study provides information on the molecular behavior of HP1γ that did not exist before by building and characterizing a structural model for this protein. In fact, our study underscores the critical role of the HP1γ IDRs in molecular connectivity, flexibility, protein–protein, and protein–DNA interactions as well as post-translational modifications, which include histone mimicry. HP1α and HP1β models were also built; although useful, they were not studied in a dynamic fashion, so as to maintain our focus on HP1γ. We show models for the HP1γ homodimer (Fig. 2b) and the molecule bound to DNA (Fig. 7b–c). We also provide an atomic resolution view of the α-importin-HP1γ complex (Fig. 6).
We perform, for the first time, MD simulations of the full-length HP1γ monomer (Fig. 4), of HP1γ in complex with α-importin (Fig. 6c), and of HP1γ with nucleosomes (Fig. 7e). These studies indicate that the intrinsically disordered parts of the protein make the human HP1γ protein highly dynamic, a characteristic that had never been previously defined for this protein. Dynamic flexibility given by these regions may allow other domains, such as the chromodomain, to more easily sample three-dimensional space in search of binding partners. Thus, we are optimistic that future studies using experimental techniques may test the validity of this interpretation. This dynamic behavior, however, appears to be restricted when HP1γ forms complexes. MD simulations using harmonically restrained nucleosome particles bound by a single HP1γ dimer show that, due to its flexibility, the dimer has the potential to recoil onto the nucleosomes. This activity allows for the recruitment of the HP1-binding domain of SUV39H1 through its contact with nucleosomes. Combined, the building and analyses of these structural models provide a more complete description of the biochemical function of HP1 proteins, as elongated molecules with their two globular domains joined by a flexible linker, which endows them with dynamic flexibility and intermolecular recognition properties. Thus, it becomes important to discuss the accuracy, novelty, and mechanistic contribution of this new information to understanding the biochemical properties of these important epigenetic regulators. Several observations are in agreement with and extend, at a predicted atomic resolution, experimentally derived data, increasing the reliability of the models: (1) HP1γ has the ability to form an NLS-importin complex, which renders it competent to translocate into the nucleus (Fig. 5). (2) Once in the nucleus, HP1γ binds to 3Me-K9H3 and nucleosomal DNA (Fig. 7).
(3) The protein is heavily marked by post-translational modifications, some of them playing a significant role in the regulation of histone mimicry. (4) Similar to its yeast homolog, the histone mimetic peptide within the linker region of HP1γ can be recognized by the chromodomain of this protein, a phenomenon which should inhibit its binding to histone marks. (5) The model suggests that the largest number of post-translational modifications map to the intrinsically disordered regions of the protein, which are more surface exposed. Thus, to our knowledge, when combined, these considerations make the current study novel and important.
Modeling of disordered proteins, such as HP1γ, is challenging as their structure cannot be represented by a single, derived conformation. These highly flexible molecules sample a multitude of conformations, both expanded and collapsed in nature. Thus, several restrictions were applied in the generation of our model. First, the models of both the monomer and the dimer presented here for HP1γ are in agreement, on the basis of homology, with structural NMR and SAXS data recently made available for HP1β. This model for the HP1γ monomer complexed well with α-importin via the IDR2 region, allowing the N-terminal and C-terminal globular domains to protrude out of the complex without steric hindrance (Fig. 6b). Congruently, the model of the dimer was built by docking the chromodomain of individual monomers, rather than stitching domains from docked chromodomains. This method leads the N-terminal IDRs and globular domain to adopt a “lobster claw” configuration, which is in agreement with structural data for the highly homologous protein HP1β and yeast SWI6. It is also true that a single conformation cannot be considered for either the monomer or the dimer. For this reason, we performed conformational sampling using molecular dynamics simulations. Thus, our data are in agreement with the structure of homologous monomers and dimers from human homologues and yeast orthologues, and the numerous conformations derived from MD simulations are intended to represent the range of structures expected for HP1γ. Further modeling studies using longer MD simulations and coarse-grained models may lend more insight into the biochemical behavior of HP1γ and its complexes.
In conclusion, force field-supported molecular mechanics calculations and analyses of molecular dynamics simulations indicate that a significant amount of structure-to-function information is contained within the less studied regions of HP1γ. The intrinsically disordered properties of these regions endow the entire molecule with a highly dynamic behavior, intermolecular recognition properties, and the ability to receive signals from several intracellular signaling cascades. Since HP1γ plays a key role in normal epigenetics and cancers, the data and models reported here have current and future applications for better understanding biological and pathobiological functions of this protein. By analogy, these data on HP1γ may also inspire both experimental and in silico testable hypotheses regarding the function of the closest members of this family of proteins.
This work was supported by funding from the National Institutes of Health (grants R01 CA178627 to GL, R01 DK52913 to RU, R01 AI-089714 to W. A. Faubion, and T32 GM007337 to GV), the Mayo Foundation, as well as the Mayo Clinic Center for Cell Signaling in Gastroenterology (P30DK084567) and the Mayo Clinic SPORE in Pancreatic Cancer (P50 CA102701).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.