Crystal structures of putative phosphoglycerate kinases from B. anthracis and C. jejuni
- Cite this article as:
- Zheng, H., Filippova, E.V., Tkaczuk, K.L. et al. J Struct Funct Genomics (2012) 13: 15. doi:10.1007/s10969-012-9131-9
Phosphoglycerate kinase (PGK) is indispensable during glycolysis for anaerobic glucose degradation and energy generation. Here we present a comprehensive structural analysis of two putative PGKs from Bacillus anthracis str. Sterne and Campylobacter jejuni in the context of their structural homologs. They are the first PGKs from pathogenic bacteria reported in the Protein Data Bank. The crystal structure of PGK from Bacillus anthracis str. Sterne (BaPGK) has been determined at 1.68 Å resolution, while the structure of PGK from Campylobacter jejuni (CjPGK) has been determined at 2.14 Å resolution. The protein monomers are composed of two domains, each containing a Rossmann fold, hinged together by a helix that can adjust the relative position of the two domains. It is also shown that the apo forms of both BaPGK and CjPGK adopt open conformations as compared to the substrate- and ATP-bound forms of PGK from other species.
Keywords: Carbohydrate degradation, Glycolysis, PGK, Phosphoglycerate kinase, Pathogenic organism, Anthrax, Gastroenteritis, Guillain–Barré syndrome, Rossmann fold, Bacillus anthracis, Campylobacter jejuni
Abbreviations
BaPGK: Phosphoglycerate kinase from Bacillus anthracis
CjPGK: Phosphoglycerate kinase from Campylobacter jejuni
SAXS: Small-angle X-ray scattering
ORF: Open reading frame
TEV: Tobacco etch virus
LS-CAT: Life Science Collaborative Access Team
EMBL: European Molecular Biology Laboratory
SAD: Single-wavelength anomalous diffraction
RMSD: Root mean square deviation
NIAID: National Institute of Allergy and Infectious Diseases
CSGID: Center of Structural Genomics for Infectious Disease
The impetus for the molecular investigation of proteins from pathogenic organisms is to move towards understanding and abating their pathogenic effects. Bacillus anthracis is a Gram-positive bacterium which is best known as the causative agent of anthrax and has been prioritized as one of the most lethal human bacterial pathogens (Category A) by the National Institute of Allergy and Infectious Diseases (NIAID). Campylobacter jejuni is a NIAID Category B food- and water-borne Gram-negative pathogen, and the leading cause of gastroenteritis in humans. Certain subspecies have also been associated with Guillain–Barré syndrome (GBS). There are currently three strains of Bacillus anthracis (str. Ames, str. ‘Ames Ancestor’, and str. Sterne) and one subspecies of Campylobacter jejuni (subsp. jejuni NCTC 11168) under investigation as model organisms for the Center of Structural Genomics for Infectious Disease (CSGID). The TargetDB currently contains one entry for phosphoglycerate kinase (PGK) from Bacillus anthracis str. Sterne (IDP04624) and one from Bacillus anthracis str. Ames (OPTIC6700) that encode the same protein, which we shall refer to as BaPGK. The putative PGK from Campylobacter jejuni subsp. jejuni NCTC 11168 (IDP90717) will be referred to as CjPGK.
Phosphoglycerate kinase is a magnesium-dependent kinase involved in the glycolysis regulation pathway which anaerobically degrades glucose into pyruvate. It controls the first ATP-generating step in glycolysis and thus is a crucial target for disrupting anaerobic metabolism in pathogens. PGK is responsible for transferring the phosphate from 1,3-bisphosphoglycerate to ADP, forming ATP for energy storage and preparing 3-phospho-D-glycerate (3PG) for further degradation. The reaction catalyzed by PGK (1,3-bisphosphoglycerate⁴⁻ + ADP-Mg⁻ → 3-phosphoglycerate³⁻ + ATP-Mg²⁻) is driven by the depletion of ATP in the cell and is repressed by high concentrations of ATP (and low concentrations of ADP). This reaction is reversible only in the carbon fixation process in photosynthetic organisms, where the reverse reaction is catalyzed by the same enzyme. The ADP and ATP interacting with PGK are present in magnesium-associated forms that partially neutralize their charge.
Since glycolysis is part of the energy generation pathway in all living organisms, PGK is present in virtually all prokaryotic and eukaryotic organisms, and acts on exactly the same substrate. PGK is highly conserved evolutionarily and there is usually only one PGK encoded per genome. However, in mammalian cells, a second sperm-specific isoform, PGK2, exists in addition to the PGK1 found in somatic cells.
The two domains in the PGK structure are linked by an α-helix hinge. The phosphoglycerate binding site is in the N-terminal domain and the ADP binding site is located in the C-terminal domain. PGK is a good model enzyme for the study of protein flexibility due to its well documented mechanism of domain closure during catalysis. The two domains of PGK adopt a closed, catalytically active conformation upon phosphoglycerate and ADP binding, but assume an open conformation in the absence of substrates, with a significant domain rotation of ~33° between the two conformations. The phosphoglycerate and ADP binding sites are usually more than 10 Å apart from each other in the open conformation. Upon substrate and ADP binding the active site closes in a “hinge-bending” manner to bring these two sites close enough (~4 Å) to each other for the phosphate transfer reaction to proceed [12, 13]. Extensive structural studies on human PGK have recently made a fine-grained illustration of the catalytic mechanism possible [14–17]. These structures include different ligand-bound states of human PGK (3-phospho-D-glycerate, magnesium, ADP, and transition state analogue) in a partially or fully closed conformation. A recent synergistic use of SAXS and X-ray crystallography shows that PGK spends most of its time in the open, “resting” state. Binding of both substrate and nucleotide elicits domain closure, which exposes a substantial hydrophobic surface that serves as the driving force in a “spring-loaded” release mechanism for domain opening.
The ligand-free BaPGK and CjPGK reported in this study both adopt an open conformation, and they are the first PGKs from pathogenic bacteria reported in the PDB. The BaPGK crystal structure was refined at 1.68 Å resolution to an R factor of 17.6% and Rfree of 21.3%, while the CjPGK structure was refined at 2.14 Å to an R factor of 19.0% and Rfree of 23.7%. BaPGK and CjPGK share 77% and 49% sequence identity, respectively, with the closest structural homolog (PGK from Bacillus stearothermophilus). Both crystal structures are monomeric and very similar to other PGK structures determined to date. One of the crystal packing interfaces in the BaPGK structure includes small molecules from the crystallization solution which might assist in crystal growth.
Materials and methods
Cloning, expression and purification
The open reading frames (ORFs) of pgk from Bacillus anthracis and Campylobacter jejuni were amplified by polymerase chain reaction (PCR). These genes were cloned into pMCSG7 plasmids using ligation independent cloning [18–20]. This expression vector encodes an N-terminal tobacco etch virus protease-cleavable hexahistidine tag (MHHHHHHSSGVDLGTENLYFQ/SNA). Both pgk genes were overexpressed in E. coli BL21-CodonPlus(DE3)-RIPL. Cells were grown in selenomethionine media at 37°C, induced with isopropyl-β-D-1-thiogalactopyranoside and grown at 20°C afterwards. Harvested cells were sonicated in lysis buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5% glycerol, 0.5 mM tris(2-carboxyethyl)phosphine-HCl (TCEP), 5 mM imidazole, 0.5 mM phenylmethylsulfonyl fluoride and 1 mM benzamidine), clarified by centrifugation, and the supernatant was applied to a nickel chelate affinity resin (Ni-NTA, Qiagen). The resin was washed with wash buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5% glycerol, 0.5 mM TCEP, 30 mM imidazole) and the tagged protein was eluted using elution buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5% glycerol, 0.5 mM TCEP, 250 mM imidazole).
The hexahistidine tag was cleaved by addition of 1 mg of recombinant His-tagged TEV protease per 15 mg of eluted protein. EDTA, TCEP and arginine were added to final concentrations of 1 mM, 0.5 mM and 0.2 M, respectively. The cleavage was performed at 4°C overnight and continued during dialysis into cleavage buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 0.5 mM TCEP). Proteins were separated from TEV protease by running over nickel-chelating resin, dialyzed into crystallization buffer (300 mM NaCl, 10 mM HEPES pH 7.5, 5% glycerol, 0.5 mM TCEP), and further concentrated to 40 mg/mL (BaPGK) and 5.3 mg/mL (CjPGK).
Crystallization, data collection and processing
Crystallization was performed using hanging drop vapor diffusion at 20°C. Drops were composed of 1 μL of the reservoir solution with an equal volume of the concentrated protein sample. BaPGK (40 mg/mL) was mixed with reservoir solution containing 25% w/v PEG3350, 0.2 M NaCl, 0.1 M Bis–Tris pH 5.5 and 2% w/v 1,6-hexanediol. CjPGK (5.3 mg/mL) was mixed with reservoir solution containing 20% w/v PEG3350 and 0.2 M K citrate pH 7. Immediately after harvesting, crystals were transferred into cryoprotectant solution (paratone-N) and flash cooled in liquid nitrogen.
Data were collected at the Life Science Collaborative Access Team (LS-CAT) at the Advanced Photon Source (Argonne National Laboratory, Argonne, IL, USA). Data collection at beamline 21ID-F was controlled by MD-2 software from the European Molecular Biology Laboratory (EMBL) with LS-CAT developed extensions. Diffraction data were collected at a single wavelength at 100 K and processed with HKL-2000.
Structure determination, refinement and validation
Table 1 Summary of data collection, structure determination, and refinement statistics. Data for the highest-resolution shell are given in parentheses. Bijvoet pairs were merged for Rmerge calculation. Ramachandran plot statistics were calculated by MOLPROBITY
Columns: BaPGK; CjPGK
PDB accession code: 3uwd; 3q3v
Space group: C 1 2 1; P 21 21 2
Unit-cell parameters (Å, °): a = 119.7, b = 45.3, c = 68.6, α = 90.00, β = 97.95, γ = 90.00; a = 110.3, b = 165.2, c = 54.6, α = 90.00, β = 90.00, γ = 90.00
No. of unique reflections
Molecules in asymmetric unit
Matthews coefficient (Å³ Da⁻¹)
Solvent content (%)
Resolution range (Å)
No. of residues/protein atoms
No. of water atoms
Average B factors (Å²)
Ramachandran plot (%)
Bond lengths (Å)
Bond angles (°)
The coordinates and experimental structure factors were deposited in the PDB with the accession codes 3uwd for BaPGK and 3q3v for CjPGK. Diffraction images can be obtained through the CSGID webpage (http://www.csgid.org/csgid/pages/diffraction_images).
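As a rough cross-check of the crystallographic parameters, the unit-cell volume follows directly from the cell edges and angles listed in Table 1, and the Matthews coefficient is that volume divided by the total protein mass in the cell. The following is a minimal sketch under stated assumptions: the function names are illustrative and are not part of any software used in this study, and the protein mass is left as an input because it is not stated in the text.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (Å^3) of a general unit cell; edges in Å, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

def matthews_coefficient(volume, n_sym_ops, n_molecules, mass_da):
    """Matthews coefficient (Å^3/Da).

    n_sym_ops   : number of symmetry-equivalent positions in the cell
                  (4 for both C 1 2 1 and P 21 21 2)
    n_molecules : molecules per asymmetric unit
    mass_da     : mass of one molecule in Da (an input, not taken from the text)
    """
    return volume / (n_sym_ops * n_molecules * mass_da)

# Unit-cell parameters as listed in Table 1, keyed by space group.
v_c2 = unit_cell_volume(119.7, 45.3, 68.6, 90.0, 97.95, 90.0)
v_p21212 = unit_cell_volume(110.3, 165.2, 54.6, 90.0, 90.0, 90.0)

print(f"C 1 2 1 cell volume:    {v_c2:.0f} Å^3")
print(f"P 21 21 2 cell volume:  {v_p21212:.0f} Å^3")
```

For an orthorhombic cell the formula reduces to the product a·b·c, and for the monoclinic cell to a·b·c·sin β, so the general expression covers both crystal forms.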
Structure analysis and visualization
The sequence conservation scores were calculated by the ConSurf server. Multiple sequence alignments for conservation score calculations were done by the ConSurf server using 150 of the most similar proteins, which were restrained by sequence identity in the range of 35–95% and a PSI-BLAST cutoff E-value of 1.0E−04. PITA and PISA servers were used for prediction of quaternary structure. The NEIGHBORHOOD database was used to assist the analysis of 3D motifs. An in-house program was used for the preparation of Table 1 from the CSGID database (http://www.csgid.org/). PyMOL was used for the generation of figures of the structure. TopDrawPrep and TopDraw were used to prepare the topology diagrams.
Structure alignment with PGK homologues
Table 2 Summary of different conformations for known PGK structures compared to BaPGK
pdbid [seq. ident. (%)]
Distance (Å) (R62-D200)
Angle (°) (R62-V177-D200)
1v6s (50)
Binary: PGK, nucleotide
1vjc, 1vjd (47)
Binary: PGK, 3PG
2xe6, 3c39 (47)
Tertiary: PGK, substrate, nucleotideᵈ
2xe7, 3c3a, 3c3c, 2x13 (47)
Tertiary: PGK, 3PG, ADP/ATP/Pi analogueᵇ
2x14, 2ybe, 2y3i, 2xe8 (47)
2wzb, 2wzc, 2wzd (47)
3PG, ADP, Pi analogueᵇ
Results and discussion
Protein structure and domains description
The results of domain comparison clearly confirm that both presented structures belong to the PGK family (cd00318: E-value 2.12e−148; COG0126: E-value 2.85e−160; pfam00162: E-value 0). CjPGK shares a sequence identity of 46% with BaPGK, and its crystal structure exhibits an RMSD of 1.6 Å for 344 C-α atoms after superposition with BaPGK. Even though both PGKs are in their “open” conformation, the relative orientations of the two domains are different. The structure alignment can achieve an RMSD of 0.8 Å for 143 C-α atoms in the N-terminal domain and 0.7 Å for 160 C-α atoms in the C-terminal domain if the superposition is done for each domain individually (Table 2).
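The RMSD values quoted above are root mean square deviations over matched C-α atoms. As a minimal sketch, assuming the two coordinate sets have already been superposed and paired atom by atom (the superposition step itself is not shown, and the function name is illustrative), the calculation is:

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation (Å) between two equally long lists of
    (x, y, z) coordinates that are already superposed and paired."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example with made-up coordinates: every atom is displaced by 1 Å along z.
example = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
               [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
print(example)
```

Restricting the paired lists to the atoms of one domain at a time gives per-domain values of the kind reported here, which are lower than the whole-chain value when the relative domain orientation differs.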
The topologies of both PGK structures within each domain are almost identical, with the only exception being helix 7, which is a seven-residue α-helix in BaPGK but a five-residue 3₁₀ helix in CjPGK (Fig. 1C). The N-terminal domain is composed of a central β-sheet of six parallel strands (A1–A6) surrounded by two α-helices in the front (helices 2 and 4) and three α-helices at the back (helices 5, 7, 8) plus five 3₁₀ helices (helices 1, 3, 6, 9, 10) (Fig. 1). The C-terminal domain is formed by a major central β-sheet of six parallel strands (B1–B6) and a minor β-sheet of three anti-parallel strands (C1–C3). These two β-sheets are surrounded by three α-helices in the front (helices 17, 19, 20) and four α-helices at the back (helices 12, 13, 15, 22) plus four 3₁₀ helices (helices 14, 16, 18, 21).
Two highly conserved motifs around a potential salt bridge
Phosphoglycerate kinase proteins with substrate and nucleotide bound adopt a completely “closed” conformation in the active state as compared to the “open”, resting conformation observed when either reactant is absent. Upon transition to the active, fully closed state, a salt bridge is formed between Arg62 from the N-terminal domain on top of the 3PG binding site and Asp200 from the C-terminal domain on top of the nucleotide binding site. This creates a cover to fully close the cleft and form a catalytically active environment. The BaPGK and CjPGK structures determined here are both apo forms without either substrate or nucleotide bound. They adopt a typical “open” conformation with distances of 15 and 13 Å, respectively, between the ends of the side chains of Arg62 and Asp200, which give opening angles of 37° and 31° relative to the middle of the linker helix.
DALI and FATCAT searches using the BaPGK structure identify more than a dozen PGK structures, including those from bacteria/archaea (thermophilic bacteria/archaea) and eukaryotic organisms (human, mouse, pig, horse, yeast, malaria parasite, and sleeping sickness parasite) (Table 2). While most other known structures share 40–50% sequence identity with BaPGK, the closest homologue to Bacillus anthracis PGK with a known structure is Bacillus stearothermophilus PGK (PDB code: 1php), which shares 75% sequence identity (Fig. 3A). Using this subset of PGK structures, two strongly evolutionarily preserved motifs around the salt-bridge forming residues Arg62 and Asp200 in BaPGK were identified: “SHLGRPK”, which contains Arg62 (BaPGK), and “GG(A)KV(K)DKX” (where X is usually Ile or Leu), which contains Asp200 (BaPGK) (Fig. 2C). Interestingly, although Arg62 is conserved in CjPGK, a glycine is present in the position corresponding to Asp200. The lack of Asp200 may influence the aforementioned domain closure in CjPGK. It is possible that the CjPGK structure adopts another way of stabilizing its closed conformation for the enzymatic reaction, or that it does not possess PGK activity. Further biochemical data are needed to test either hypothesis.
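A simple way to look for the two motifs in a protein sequence is a regular-expression scan. The sketch below is illustrative only and rests on two assumptions that are not stated in the text: the parenthesized residues in “GG(A)KV(K)DKX” are treated as variable positions, and X as any residue. The toy sequence is made up for demonstration and is not a real PGK sequence.

```python
import re

# Motif containing the salt-bridge arginine (Arg is the 5th residue of the motif).
ARG_MOTIF = re.compile(r"SHLGRPK")
# Assumed reading of "GG(A)KV(K)DKX": parenthesized and X positions match any
# residue; the salt-bridge aspartate is the 7th residue of the motif.
ASP_MOTIF = re.compile(r"GG.KV.DK.")

def motif_residue_positions(sequence, pattern, offset):
    """Return 1-based sequence positions of the residue located `offset`
    residues (0-based) into each match of `pattern`."""
    return [m.start() + offset + 1 for m in pattern.finditer(sequence)]

# Hypothetical toy sequence, not taken from any database entry.
toy = "MKSHLGRPKAAGGAKVKDKIL"
arg_positions = motif_residue_positions(toy, ARG_MOTIF, 4)
asp_positions = motif_residue_positions(toy, ASP_MOTIF, 6)
print(arg_positions, asp_positions)
```

Under this reading, a sequence carrying a glycine at the aspartate position, as described for CjPGK, would not match the second pattern, so an empty result for that motif flags exactly the substitution discussed above.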
Conformation changes in PGK between “resting” and “active” states
The set of PGK structures identified in the DALI and FATCAT searches described in the previous section is analyzed further in terms of conformational changes. The RMSD values from the DALI search do not account for different conformation states amongst these structures. These differences can be quite significant and cannot be neglected. Therefore N-terminal and C-terminal domains were superposed separately. PGK structures adopt an almost identical fold amongst all species. For both C-terminal and N-terminal domains, more than 70–80% of the homologues can be aligned with an RMSD of around 1 Å (C-terminal domain) or less (N-terminal domain), which is no more than the flexibility between different structures of the same protein. The difference in length of structure alignment is mostly due to missing residues in incomplete structure models. Exceptionally high RMSDs are observed for only two structures (2pgk: RMSD = 2.6–2.9 Å, 3pgk: RMSD = 1.7–2.3 Å) that were determined almost three decades ago at relatively low resolution. These differences may reflect the quality of the structure and/or the tools available for structure refinement/validation at that time, rather than species-specific variations among PGK structures.
The degree of cleft opening can be evaluated using the distance between Arg62 and Asp200. In a structure with a typical “open” conformation this distance is between 13 and 18 Å, and in structures with a typical “closed” conformation it is less than 5–6 Å (Table 2). The actual distance between Arg62 and Asp200 will be closer to 3 Å upon domain closure due to side chain rearrangements that accompany salt bridge formation. The angle of cleft opening, which is measured between Arg62 and Asp200 relative to the middle of the linker helix, is around 40° and 12° for the “open” and “closed” conformations, respectively. The rotating motion is also measured for the BaPGK N-terminal domain while immobilizing the C-terminal domain, and corresponds to a rotation in the range of 30°–40° upon switching to the “closed” conformation (Table 2). Some domain rotations which are not in the direction of domain closing are also observed in the range of 10°–20°. Conformational changes within the substrate binding site associated with substrate binding are shown by superimposing BaPGK with three other partially closed or fully closed PGK structures (Fig. 3). During the transition from the fully open to the fully closed conformation, the substrate 3PG associated with the N-terminal domain is brought closer to, and eventually contacts, the phosphate moiety from the nucleotide, exemplifying a significant domain rotation of around 37°.
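The two geometric measures used here, the Arg62 to Asp200 distance and the opening angle with its vertex at the middle of the linker helix (R62-V177-D200 in Table 2), can be computed from three atomic positions. The following is a minimal sketch: the choice of which atom represents each residue is left to the caller, the 6 Å cutoff for “closed” is an assumption taken from the upper end of the quoted 5–6 Å range, and the “intermediate” label is introduced here only for values outside both ranges.

```python
import math

def distance(p, q):
    """Euclidean distance (Å) between two (x, y, z) points."""
    return math.dist(p, q)

def angle_at_vertex(a, vertex, b):
    """Angle (degrees) between the vectors vertex->a and vertex->b."""
    va = [ai - vi for ai, vi in zip(a, vertex)]
    vb = [bi - vi for bi, vi in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.hypot(*va) * math.hypot(*vb)
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_theta))

def classify_cleft(dist):
    """Label a cleft by Arg62-Asp200 distance using the ranges in the text."""
    if dist < 6.0:            # assumed cutoff for "closed"
        return "closed"
    if 13.0 <= dist <= 18.0:  # typical "open" range
        return "open"
    return "intermediate"     # label introduced for this sketch only

# Made-up coordinates purely to exercise the functions.
r62, v177, d200 = (10.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 10.0, 0.0)
print(distance(r62, d200), angle_at_vertex(r62, v177, d200))
print(classify_cleft(15.0), classify_cleft(4.0))
```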
Substrate binding site
Crystal packing interfaces
The active biological unit of PGK is most likely a monomer. Analysis of crystal contacts for both the BaPGK and CjPGK structures reveals three crystal packing interfaces with a buried interface area of approximately 600 Å² in each structure. PISA results indicate that none of the three interfaces in the CjPGK structure is stable, but for the BaPGK structure the interface with symmetry operator (−x, y, −z) in the same cell is predicted to be stable in solution and could be treated as a putative dimer interface. It involves four hydrogen bonds and six salt bridges between two C-terminal domains. Modeled small molecules, including a Bis–Tris molecule and an unidentified carbohydrate-like molecule, contribute to this crystal interface. However, when the Bis–Tris and the unidentified molecule from the crystallization condition are removed from the model, this interface is no longer predicted to be stable by PISA. Although this interface does not appear to be biologically relevant, the small molecules may play an important role in crystal formation.
We present herein two new crystal structures of putative PGKs from pathogenic organisms. They both conform to the canonical PGK fold. Their overall structures are consistent with an “open” conformation typical for other PGKs in apo form. Comparison with other substrate- and ATP-bound structures highlights the flexibility of the hinge between the two domains, reaffirming that the cleft closes upon substrate and/or ATP binding, with both domains moving toward each other. Some of the residues around the substrate binding site are less ordered due to the lack of substrate, as compared to the active state seen in other structures. Nevertheless, as shown in our surface conservation analysis, these residues are still highly conserved. We have also identified two PGK-specific motifs not described previously. These two motifs contain residues Arg62 and Asp200, which form a salt bridge participating in the mechanism of domain movements upon substrate binding.
The authors would like to thank David R. Cooper, Ivan G. Shabalin, Jing Hou and the members of the Center of Structural Genomics for Infectious Diseases for valuable comments and discussions. This research was funded with Federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN272200700058C. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. Use of the LS-CAT Sector 21 was supported by the Michigan Economic Development Corporation and the Michigan Technology Tri-Corridor (Grant 085P1000817).
Conflict of interest