Journal of Structural and Functional Genomics, Volume 13, Issue 1, pp 15–26

Crystal structures of putative phosphoglycerate kinases from B. anthracis and C. jejuni


  • Heping Zheng
    • Department of Molecular Physiology and Biological Physics, University of Virginia
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Ekaterina V. Filippova
    • Department of Molecular Pharmacology and Biological Chemistry, Northwestern University Feinberg School of Medicine
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Karolina L. Tkaczuk
    • Department of Molecular Physiology and Biological Physics, University of Virginia
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Piotr Dworzynski
    • Department of Molecular Physiology and Biological Physics, University of Virginia
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Maksymilian Chruszcz
    • Department of Molecular Physiology and Biological Physics, University of Virginia
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Przemyslaw J. Porebski
    • Department of Molecular Physiology and Biological Physics, University of Virginia
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Zdzislaw Wawrzak
    • Center for Structural Genomics of Infectious Diseases (CSGID)
    • Northwestern University, Synchrotron Research Center
  • Olena Onopriyenko
    • Banting and Best Department of Medical Research, University of Toronto
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Marina Kudritska
    • Banting and Best Department of Medical Research, University of Toronto
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Sarah Grimshaw
    • J. Craig Venter Institute
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Alexei Savchenko
    • Banting and Best Department of Medical Research, University of Toronto
    • Center for Structural Genomics of Infectious Diseases (CSGID)
  • Wayne F. Anderson
    • Department of Molecular Pharmacology and Biological Chemistry, Northwestern University Feinberg School of Medicine
    • Center for Structural Genomics of Infectious Diseases (CSGID)
    • Department of Molecular Physiology and Biological Physics, University of Virginia
    • Center for Structural Genomics of Infectious Diseases (CSGID)
Short Communication

DOI: 10.1007/s10969-012-9131-9

Cite this article as:
Zheng, H., Filippova, E.V., Tkaczuk, K.L. et al. J Struct Funct Genomics (2012) 13: 15. doi:10.1007/s10969-012-9131-9


Phosphoglycerate kinase (PGK) is indispensable during glycolysis for anaerobic glucose degradation and energy generation. Here we present comprehensive structure analysis of two putative PGKs from Bacillus anthracis str. Sterne and Campylobacter jejuni in the context of their structural homologs. They are the first PGKs from pathogenic bacteria reported in the Protein Data Bank. The crystal structure of PGK from Bacillus anthracis str. Sterne (BaPGK) has been determined at 1.68 Å while the structure of PGK from Campylobacter jejuni (CjPGK) has been determined at 2.14 Å resolution. The proteins’ monomers are composed of two domains, each containing a Rossmann fold, hinged together by a helix which can be used to adjust the relative position between two domains. It is also shown that apo-forms of both BaPGK and CjPGK adopt open conformations as compared to the substrate and ATP bound forms of PGK from other species.


Keywords: Carbohydrate degradation, Glycolysis, PGK, Phosphoglycerate kinase, Pathogenic organism, Anthrax, Gastroenteritis, Guillain–Barré syndrome, Rossmann fold, Bacillus anthracis, Campylobacter jejuni



Abbreviations

BaPGK: Phosphoglycerate kinase from Bacillus anthracis
CjPGK: Phosphoglycerate kinase from Campylobacter jejuni
PGK: Phosphoglycerate kinase
ATP: Adenosine triphosphate
ADP: Adenosine diphosphate
SAXS: Small-angle X-ray scattering
ORF: Open reading frame
HEPES: 2-[4-(2-Hydroxyethyl)piperazin-1-yl]ethanesulfonic acid
TEV: Tobacco etch virus
EDTA: Ethylenediaminetetraacetic acid
PEG: Polyethylene glycol
LS-CAT: Life Science Collaborative Access Team
EMBL: European Molecular Biology Laboratory
SAD: Single-wavelength anomalous diffraction
PDB: Protein Data Bank
RMSD: Root mean square deviation
GBS: Guillain–Barré syndrome
NIAID: National Institute of Allergy and Infectious Diseases
CSGID: Center for Structural Genomics of Infectious Diseases


The impetus for the molecular investigation of proteins from pathogenic organisms is to inch towards understanding and abating their pathogenic effects. Bacillus anthracis is a Gram-positive bacterium best known as the causative agent of anthrax [1] and has been prioritized as one of the most lethal human bacterial pathogens (Category A) by the National Institute of Allergy and Infectious Diseases (NIAID). Campylobacter jejuni is a NIAID Category B food- and water-borne Gram-negative pathogen and the leading cause of gastroenteritis in humans. Certain subspecies have also been associated with Guillain–Barré syndrome (GBS) [2]. There are currently three strains of Bacillus anthracis (str. Ames, str. ‘Ames Ancestor’, and str. Sterne) and one subspecies of Campylobacter jejuni (subsp. jejuni NCTC 11168) under investigation as model organisms for the Center for Structural Genomics of Infectious Diseases (CSGID) [3]. TargetDB [4] currently contains one entry for phosphoglycerate kinase (PGK) from Bacillus anthracis str. Sterne (IDP04624) and one from Bacillus anthracis str. Ames (OPTIC6700); both encode the same protein, which we shall refer to as BaPGK. The putative PGK from Campylobacter jejuni subsp. jejuni NCTC 11168 (IDP90717) will be referred to as CjPGK.

Phosphoglycerate kinase is a magnesium-dependent kinase involved in the glycolysis regulation pathway, which anaerobically degrades glucose into pyruvate. It controls the first ATP-generating step in glycolysis and thus is a crucial target for disrupting anaerobic metabolism in pathogens [5]. PGK is responsible for transferring the phosphate from 1,3-bisphosphoglycerate to ADP, forming ATP for energy storage and preparing 3-phospho-D-glycerate (3PG) for further degradation [6]. The reaction catalyzed by PGK (1,3-bisphosphoglycerate⁴⁻ + ADP-Mg⁻ → 3-phosphoglycerate³⁻ + ATP-Mg²⁻) is driven by the depletion of ATP in the cell and is repressed by high concentrations of ATP (and low concentrations of ADP). This reaction is reversible only in the carbon fixation process in photosynthetic organisms, where the reverse reaction is catalyzed by the same enzyme [7]. The ADP and ATP interacting with PGK are present in magnesium-associated forms that neutralize their charge [8].

Since glycolysis is part of the energy generation pathway in all living organisms, PGK is present in virtually all prokaryotic and eukaryotic organisms and acts on exactly the same substrate. PGK is highly conserved evolutionarily and there is usually only one PGK encoded per genome [5]. However, in mammalian cells, a second, sperm-specific isoform (PGK2) exists in addition to the PGK1 found in somatic cells [9].

The two domains in the PGK structure are linked by an α-helix hinge. The phosphoglycerate binding site is in the N-terminal domain and the ADP binding site is located in the C-terminal domain. PGK is a good model enzyme for the study of protein flexibility due to its well documented mechanism of domain closure during catalysis [10]. The two domains of PGK adopt a closed, catalytically active conformation upon phosphoglycerate and ADP binding, but assume an open conformation in the absence of substrates [11], with a significant domain rotation of ~33° between the two conformations. The phosphoglycerate and ADP binding sites are usually more than 10 Å apart in the open conformation. Upon substrate and ADP binding, the active site closes in a “hinge-bending” manner to bring these two sites close enough (~4 Å) to each other for the phosphate transfer reaction to proceed [12, 13]. Extensive structural studies on human PGK have recently made a fine-grained illustration of the catalytic mechanism possible [14–17]. These structures include different ligand-bound states of human PGK (3-phospho-D-glycerate, magnesium, ADP, and transition state analogue) in a partially or fully closed conformation. A recent synergistic use of SAXS and X-ray crystallography shows that PGK spends most of its time in the open, “resting” state. Binding of both substrate and nucleotide elicits domain closure, which exposes a substantial hydrophobic surface that serves as the driving force in a “spring-loaded” release mechanism for domain opening [15].

The ligand-free BaPGK and CjPGK reported in this study both adopt an open conformation, and they are the first PGKs from pathogenic bacteria reported in the PDB. The BaPGK crystal structure was refined at 1.68 Å resolution to an R factor of 17.6% and Rfree of 21.3%, while the CjPGK structure was refined at 2.14 Å to an R factor of 19.0% and Rfree of 23.7%. Compared to the closest structural homolog (PGK from Bacillus stearothermophilus), they share 77 and 49% sequence identity, respectively. Both crystal structures are monomeric and very similar to other PGK structures determined to date. One of the crystal packing interfaces in the BaPGK structure includes small molecules from the crystallization solution which might assist in crystal growth.
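For readers less familiar with the refinement statistics quoted above, the R factor is the normalized disagreement between observed and model-derived structure-factor amplitudes, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, and Rfree is the same quantity evaluated on a small set of reflections excluded from refinement. The minimal Python sketch below illustrates the formula only. The amplitudes are made-up illustrative numbers, not data from 3uwd or 3q3v, and the scale factor that refinement programs apply to |Fcalc| is assumed to be 1.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic residual R = sum(||Fo| - |Fc||) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Illustrative amplitudes only (hypothetical, not from the deposited data).
example_f_obs = [100.0, 50.0, 25.0, 25.0]
example_f_calc = [90.0, 55.0, 30.0, 20.0]

r_value = r_factor(example_f_obs, example_f_calc)
print(f"R = {100.0 * r_value:.1f}%")
```

Calling the same function on the working set and on the held-out test set would give the two numbers reported as R and Rfree.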

Materials and methods

Cloning, expression and purification

The open reading frames (ORFs) of pgk from Bacillus anthracis and Campylobacter jejuni were amplified by polymerase chain reaction (PCR). These genes were cloned into pMCSG7 plasmids using ligation independent cloning [18–20]. This expression vector encodes an N-terminal tobacco etch virus protease-cleavable hexahistidine tag (MHHHHHHSSGVDLGTENLYFQ/SNA) [21]. Both pgk genes were overexpressed in E. coli BL21-CodonPlus(DE3)-RIPL. Cells were grown in selenomethionine media at 37°C, induced with isopropyl-β-D-1-thiogalactopyranoside and grown at 20°C afterwards. Harvested cells were sonicated in lysis buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5% glycerol, 0.5 mM tris(2-carboxyethyl)phosphine-HCl (TCEP), 5 mM imidazole, 0.5 mM phenylmethylsulfonyl fluoride and 1 mM benzamidine), clarified by centrifugation, and the supernatant was applied to a nickel chelate affinity resin (Ni-NTA, Qiagen). The resin was washed with wash buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5% glycerol, 0.5 mM TCEP, 30 mM imidazole) and the tagged protein was eluted using elution buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5% glycerol, 0.5 mM TCEP, 250 mM imidazole).

The hexahistidine tag was cleaved by addition of 1 mg of recombinant His-tagged TEV protease per 15 mg of eluted protein. EDTA, TCEP and arginine were added to final concentrations of 1 mM, 0.5 mM and 0.2 M, respectively. The cleavage was performed at 4°C overnight and continued during dialysis into cleavage buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 0.5 mM TCEP). Proteins were separated from TEV protease by running over nickel-chelating resin, dialyzed into crystallization buffer (300 mM NaCl, 10 mM HEPES pH 7.5, 5% glycerol, 0.5 mM TCEP), and further concentrated to 40 mg/mL (BaPGK) and 5.3 mg/mL (CjPGK).

Crystallization, data collection and processing

Crystallization was performed using hanging drop vapor diffusion at 20°C. Drops were composed of 1 μL of the reservoir solution and an equal volume of the concentrated protein sample. BaPGK (40 mg/mL) was mixed with reservoir solution containing 25% w/v PEG3350, 0.2 M NaCl, 0.1 M Bis–Tris pH 5.5 and 2% w/v 1,6-hexanediol. CjPGK (5.3 mg/mL) was mixed with reservoir solution containing 20% w/v PEG3350 and 0.2 M K citrate pH 7. Immediately after harvesting, crystals were transferred into cryoprotectant solution (paratone-N) and flash cooled in liquid nitrogen.

Data were collected at the Life Science Collaborative Access Team (LS-CAT) at the Advanced Photon Source (Argonne National Laboratory, Argonne, IL, USA). Data collection at beamline 21ID-F was controlled by MD-2 software from European Molecular Biology Laboratory (EMBL) with LS-CAT developed extensions. Diffraction data were collected at a single wavelength at 100 K and processed with HKL-2000 [22].

Structure determination, refinement and validation

Structures of the selenomethionine-substituted BaPGK and CjPGK were determined using the single-wavelength anomalous diffraction (SAD) method. The initial model for BaPGK was built with HKL-3000 [23]. HKL-3000 is integrated with SHELXD, SHELXE [24], MLPHARE [25], DM [26], ARP/wARP [27], CCP4 [28], SOLVE [29], RESOLVE [30, 31] and REFMAC5 [32]. The initial phases for CjPGK were determined with PHENIX [33], with subsequent model building in ARP/wARP [27]. Manual model building and refinement were performed in COOT [34]. The TLSMD web server [35] was used to define the TLS groups (six TLS groups for BaPGK and two for CjPGK) for TLS refinement. MOLPROBITY [36] and ADIT [37] were used for structure validation. Data collection, structure determination, and refinement statistics are summarized in Table 1.
Table 1

Summary of data collection, structure determination, and refinement statistics. Data for the highest-resolution shell are given in parentheses. Bijvoet pairs were merged for Rmerge calculation. Ramachandran plot statistics were calculated by MOLPROBITY

| | BaPGK | CjPGK |
|---|---|---|
| PDB accession code | 3uwd | 3q3v |
| Data collection | | |
| Wavelength (Å) | | |
| Space group | C 1 2 1 | P 2₁ 2₁ 2 |
| Unit-cell parameters (Å, °) | a = 119.7, b = 45.3, c = 68.6, α = 90.00, β = 97.95, γ = 90.00 | a = 110.3, b = 165.2, c = 54.6, α = 90.00, β = 90.00, γ = 90.00 |
| Resolution (Å) | 50.00–1.68 (1.69–1.68) | 30.00–2.15 (2.19–2.15) |
| No. of unique reflections | 41,651 (996) | |
| Completeness (%) | 99.9 (99.4) | 99.3 (97.4) |
| Redundancy | 4.1 (4.0) | 5.0 (4.9) |
| Mean I/σ | 24.38 (2.2) | 17.97 (2.4) |
| Molecules in asymmetric unit | 1 | 2 |
| Matthews coefficient (Å³ Da⁻¹) | | |
| Solvent content (%) | | |
| Rmerge (%) | 3.9 (39.8) | 10.3 (63.7) |
| Structure refinement | | |
| Resolution range (Å) | 50.00–1.68 (1.73–1.68) | 91.72–2.15 (2.20–2.15) |
| Rwork/Rfree (%) | 17.6/21.3 | 19.0/23.7 |
| No. of residues/protein atoms | | |
| No. of water atoms | 329 | 324 |
| Average B factors (Å²) | | |
| Main chain | | |
| Side chains | | |
| Ramachandran plot (%) | | |
| Most favoured | | |
| RMS deviations | | |
| Bond lengths (Å) | | |
| Bond angles (°) | | |


The coordinates and experimental structure factors were deposited in the PDB with the accession codes 3uwd for BaPGK and 3q3v for CjPGK. Diffraction images can be obtained through the CSGID webpage.
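The unit-cell parameters in Table 1 are sufficient to compute the cell volume and, from it, the Matthews coefficient V_M = V / (Z × n × M), where Z is the number of asymmetric units per cell (4 for both C 1 2 1 and P 2₁ 2₁ 2), n is the number of molecules per asymmetric unit and M is the molecular mass. The following is a minimal sketch of that arithmetic. The molecular mass used below is a rough placeholder chosen for illustration and is not a value taken from this article.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (Å^3) of a general unit cell; lengths in Å, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    )


def matthews_coefficient(volume, z, molecules_per_asu, mass_da):
    """V_M (Å^3/Da): cell volume divided by the total protein mass in the cell."""
    return volume / (z * molecules_per_asu * mass_da)


# Unit-cell parameters of the BaPGK crystal as listed in Table 1.
v_ba = cell_volume(119.7, 45.3, 68.6, 90.0, 97.95, 90.0)

# ASSUMPTION: 43 kDa is a placeholder mass, not a value from the article.
assumed_mass_da = 43000.0
vm_ba = matthews_coefficient(v_ba, z=4, molecules_per_asu=1, mass_da=assumed_mass_da)
print(f"V = {v_ba:.0f} Å^3, V_M = {vm_ba:.2f} Å^3/Da (with the assumed mass)")
```

With the true molecular mass substituted for the placeholder, the same two functions would reproduce the Matthews coefficient row of Table 1.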

Structure analysis and visualization

The sequence conservation scores were calculated by the ConSurf server [38]. Multiple sequence alignments for the conservation score calculations were built by the ConSurf server from the 150 most similar proteins, restricted to a sequence identity range of 35–95% and a PSI-BLAST E-value cutoff of 1.0E−04. The PITA [39] and PISA [40] servers were used for prediction of quaternary structure. The NEIGHBORHOOD database was used to assist the analysis of 3D motifs [41]. An in-house program was used for the preparation of Table 1 from the CSGID database. PyMOL [42] was used to generate the structure figures. TopDrawPrep and TopDraw [43] were used to prepare the topology diagrams.

Structure alignment with PGK homologues

The structures of BaPGK and CjPGK were superimposed with other known PGK structures using the align function in PyMOL for the C-terminal and N-terminal domains separately. The N-terminal domain rotation angle was used to evaluate the conformational changes, and was measured by fixing the BaPGK C-terminal domain orientation with other PGK homologues, and rotating the N-terminal domain of BaPGK to align with the N-terminal domain of each PGK homologue. The distances and angles between domains were used to evaluate the degree of active site opening. The distance was measured between the Arg62 side chain amino group and the C-terminal domain’s Asp200 side chain carboxyl group of the superposed BaPGK. The same angle was measured in all PGK structures using the superposed BaPGK as a reference to ensure that side chains used for distance calculations have exactly the same orientation. The angle was measured as Arg62NH2-Val177Cα-Asp200Oδ1 (Table 2).
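The two cleft-opening measures defined above reduce to simple vector geometry once the three atom positions are known: a distance between Arg62 NH2 and Asp200 Oδ1, and an angle with Val177 Cα at its vertex. The sketch below shows the calculation in plain Python. The coordinates are hypothetical placeholders chosen so that the result is easy to check by hand, and they are not taken from 3uwd or any other entry.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as (x, y, z) in Å."""
    return math.dist(p, q)


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    v1 = [ai - vi for ai, vi in zip(a, vertex)]
    v2 = [bi - vi for bi, vi in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_theta))


# HYPOTHETICAL coordinates (Å), for illustration only.
arg62_nh2 = (3.0, 0.0, 0.0)
val177_ca = (0.0, 0.0, 0.0)
asp200_od1 = (0.0, 4.0, 0.0)

cleft_distance = distance(arg62_nh2, asp200_od1)
cleft_angle = angle_deg(arg62_nh2, val177_ca, asp200_od1)
print(f"distance = {cleft_distance:.1f} Å, angle = {cleft_angle:.1f}°")
```

In practice the three positions would be read from the superposed coordinate files, which is outside the scope of this sketch.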
Table 2

Summary of different conformations for known PGK structures compared to BaPGK

| Complex status | | pdbid [seq. ident. (%)] | | C-term. RMSDᶜ | N-term. RMSDᶜ | N-term. rotation | Distance (Å) (R62-D200) | Angle (°) (R62-V177-D200) |
|---|---|---|---|---|---|---|---|---|
| | B. anthracis | 3uwd (100) | | 0.0 (170) | 0.0 (184) | | | |
| | T. thermophilus | 1v6s (50) | | 0.9 (160) | 0.7 (122) | | | |
| | T. caldophilus | 2ie8 (50) | | 0.9 (159) | 0.7 (127) | | | |
| | S. cerevisiae | 1fw8 (49) | | 1.1 (159) | 0.6 (132) | | | |
| | M. musculus | 2p9q (48) | | 1.2 (150) | 0.8 (138) | | | |
| | E. coli | 1zmr (47) | | 1.0 (163) | 0.8 (127) | | | |
| | E. caballus | 2pgk (47) | | 2.6 (157) | 2.9 (148) | | | |
| | C. jejuni | 3q3v (46) | | 0.7 (160) | 0.8 (143) | | | |
| Binary: PGK, nucleotide | G. stearotherm | 1php (75) | | 0.5 (159) | 0.4 (154) | | | |
| | P. falciparum | 1ltk (50) | | 1.1 (159) | 0.9 (141) | | | |
| | T. brucei | 16pk (48) | | 0.8 (153) | 0.6 (140) | | | |
| | S. scrofa | 1vjc, 1vjd (47) | | 1.1 (162) | 0.7 (132) | | | |
| | H. sapiens | 2zgv (47) | | 1.0 (150) | 0.7 (133) | | | |
| | | 3c3b (47) | | 1.1 (152) | 0.9 (134) | | | |
| Binary: PGK, 3PG | M. musculus | 2p9t (48) | | 1.2 (164) | 0.8 (140) | | | |
| | H. sapiens | 2xe6, 3c39 (47) | | 1.1 (163) | 0.8 (135) | | | |
| | P. horikoshii | 2cun (37) | | 2.5 (140) | 0.8 (127) | | | |
| Tertiary: PGK, substrate, nucleotideᵈ | S. cerevisiae | 3pgk (49) | | 2.3 (147) | 1.7 (131) | | | |
| | M. musculus | 2paa (48) | | 1.1 (160) | 0.7 (139) | | | |
| | T. brucei | 13pk (48) | | 0.9 (156) | 0.7 (141) | | | |
| | S. scrofa | 1hdi (47) | | 1.1 (161) | 0.9 (137) | | | |
| | H. sapiens | 2xe7, 3c3a, 3c3c, 2x13 (47) | | 1.0–1.3 (152–164) | 0.8–0.9 (133–135) | 15–23, 37 (2xe7) | 12–13, 6 (2xe7) | 29–31, 13 (2xe7) |
| | | 2x15 (47) | X15, ATP | 1.2 (164) | 0.8 (135) | | | |
| Tertiary: PGK, 3PG, ADP/ATP/Pi analogueᵇ | T. maritima | 1vpe (59) | | 0.9 (160) | 0.6 (148) | | | |
| | S. cerevisiae | 1qpg (49) | | 1.1 (161) | 0.6 (137) | | | |
| | S. scrofa | 1kf0 (47) | | 1.1 (162) | 0.9 (135) | | | |
| | H. sapiens | 2x14, 2ybe, 2y3i, 2xe8 (47) | | 1.0–1.3 (160–166) | 0.8–1.0 (133–142) | 37–38, 14 (2xe8) | 5–6, 14 (2xe8) | 12–13, 32 (2xe8) |
| | | 2wzb, 2wzc, 2wzd (47) | 3PG, ADP, Pi analogueᵇ | 1.3 (159–164) | 0.9 (134–136) | | | |


RMSD to BaPGK is calculated for the C-terminal (Pro188-Thr371) and N-terminal (Met1-Leu170) domains individually. The rotation angle is defined by the rotation necessary to superpose N-terminal domains after previously superposing C-terminal domains. The distance is measured between the NH2 atom of Arg62 from the superposed N-terminal domain and the Oδ1 atom of Asp200 from the fixed C-terminal domain. The angle is measured as Arg62NH2-Val177Cα-Asp200Oδ1, where Val177 is at the center of the connecting hinge helix between the N-terminal and C-terminal domains. Both distance and angle are used to indicate the cleft opening

ᵃ 3-letter ligand codes from the PDB are used; only substrate and nucleotide cofactors are listed, and magnesium or other metal cofactors are not indicated

ᵇ ADP analogue: LA8 (L-adenosine-5′-diphosphate); ATP analogues: ACP (phosphomethylphosphonic acid-adenylate ester), ANP (phosphoaminophosphonic acid-adenylate ester), MAP (magnesium-5′-adenyly-imido-triphosphate), BIS (1,1,5,5-tetrafluorophosphopentylphosphonic acid-adenylate ester); Pi (phosphate) analogues: MGF (trifluoromagnesate), ALF (trifluoroaluminate), AF3 (aluminum fluoride)

ᶜ RMSDs for the N-terminal and C-terminal domains are calculated for main-chain Cα atoms only and are given in Å; the number in parentheses indicates the number of Cα atoms aligned for the RMSD calculation. The length of the alignment can differ between models from the same species due to model completeness. If there is more than one model for the same organism and complex status, the RMSD value for the higher number of aligned Cα atoms is reported

ᵈ In glycolysis, the substrate is 3PG (3-phosphoglyceric acid) and the product is X15 (1,3-bisphosphoglyceric acid); substrate and product are reversed for the carbon fixation process in photosynthetic organisms
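The N-terminal rotation column of Table 2 is defined through a superposition, and any such superposition yields a 3×3 rotation matrix R. One common way to turn R into a single rotation angle is θ = arccos((tr R − 1)/2). The sketch below illustrates that conversion; it is not necessarily how the values in Table 2 were obtained, and the helper that builds a test rotation about the z axis is introduced here purely for illustration.

```python
import math


def rotation_angle_deg(r):
    """Rotation angle (degrees) of a 3x3 proper rotation matrix (nested lists)."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(deg):
    """Illustrative helper: rotation matrix for a turn of `deg` about the z axis."""
    t = math.radians(deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]


# A domain rotation of the size discussed in the text (~33°), as a test case.
recovered = rotation_angle_deg(rotation_about_z(33.0))
print(f"rotation angle = {recovered:.1f}°")
```

The trace-based formula returns the magnitude of the rotation only; the axis direction, which distinguishes closing from non-closing rotations, has to be examined separately.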

Results and discussion

Protein structure and domains description

The results of domain comparison in both cases clearly confirm that both presented structures belong to the PGK family (cd00318: E-value 2.12e−148; COG0126: E-value 2.85e−160; pfam00162: E-value 0). CjPGK shares a sequence identity of 46% with BaPGK, and its crystal structure exhibits an RMSD of 1.6 Å for 344 Cα atoms after superposition with BaPGK. Even though both PGKs are in their “open” conformation, the relative orientations of the two domains are different. The structure alignment can achieve an RMSD of 0.8 Å for 143 Cα atoms in the N-terminal domain and 0.7 Å for 160 Cα atoms in the C-terminal domain if the superposition is done for each domain individually (Table 2).
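The RMSD values quoted here are root mean square deviations over matched Cα positions after superposition. As a minimal sketch of the final step only, the function below computes the RMSD for two coordinate lists that are assumed to be already superposed and paired atom by atom; the superposition itself (done with PyMOL in this work) is not reproduced. The coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (Å) between two equally long lists of already superposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# HYPOTHETICAL Cα positions: the second set is the first shifted by 1 Å along z.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(f"RMSD = {rmsd(set_a, set_b):.2f} Å")
```

Because the sum runs only over paired atoms, the number of aligned Cα atoms reported next to each RMSD matters when comparing values across structures.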

The structure of BaPGK contains one molecule per asymmetric unit (Fig. 1A). The final model has 329 water molecules, five chloride ions, one magnesium ion, one Bis–Tris molecule and one unknown ligand molecule (Table 1). The six-water coordinated magnesium ion is observed at a non-physiological location. Two residues in the loop that covers the substrate binding site (Lys27, Glu28) cannot be modeled due to poor electron density, presumably caused by high flexibility. A single residue discrepancy with the UniProt sequence (T190A) is observed in the structure and was confirmed by DNA sequencing. The structure of CjPGK contains two molecules per asymmetric unit (Fig. 1B). The final model has 324 water molecules, two sulfate ions, four potassium ions, four formic acid residues, and two PEG fragments (Table 1).
Fig. 1

Wall-eyed stereo view of both pathogenic PGKs in cartoon representation. A The BaPGK structure (3uwd) with helices colored in green, β-sheets in red and loops in gray. B The CjPGK structure (3q3v) with helices colored in light green, β-sheets in red and loops in gray. C The topology of both PGKs with the same color coding as in A

An overview of the BaPGK and CjPGK structures is presented in Fig. 1. Both adopt very similar overall structures in the open conformation (Figs. 2B, 3A). Both PGK monomers comprise two domains with a linking helix referred to as the hinge region (α-helix 11 in Fig. 1C; BaPGK: Gly168-Asn184, CjPGK: Gly172-Ile186, marked in red in Fig. 2A). Each domain is composed of a central β-sheet sandwiched by two α-helical layers (Rossmann fold) (Fig. 1A). A pre-proline peptide bond is observed in the cis conformation at the border between the N-terminal domain and the C-terminal domain (BaPGK: Arg187-Pro188, CjPGK: Arg191-Pro192) and is believed to be conserved [44]. The C-terminal domain ends with a helix (BaPGK: Leu385-Cys390, CjPGK: Leu388-Ala393, also marked in red in Fig. 2A) followed by a C-terminal loop (α-helix 23 in Fig. 1C; BaPGK: Leu391-Lys394, CjPGK: Leu394-Ser396). This C-terminal loop is in contact with the N-terminal domain and comes into proximity of Met1-Asn2-Lys3-Lys4 (BaPGK) and Met1-Ser2-Asp3-Ile4-Ile5 (CjPGK), and is thus considered a second “hinge” [13].
Fig. 2

Atlas of PGK architecture. A Presents crystal structure of BaPGK, N-terminal domain is coloured in light green, the hinge region in red and C-terminal domain in dark green. The substrate (3PG) binding site in the N-terminal domain and nucleotide (ADP/ATP) binding site in the C-terminal domain are indicated by arrows. The side chains of Arg62 and Asp200 that form a salt bridge upon closing are shown in stick representation. The distance between the ends of the side chains (15 Å) and the opening angle (37°) as defined in the text are shown using black lines. B A superposition of CjPGK (gray) and BaPGK (coloured as on A), with two highly conserved motifs circled. C A multiple sequence alignment of the highly conserved motifs indicated in B, with Arg62 and Asp200 marked by an asterisk. Sequences used for multiple alignment are ordered the same way as in Table 2, with BaPGK and CjPGK highlighted in a red boxed frame
Fig. 3

Conformational change of PGK during catalysis. Structures have been superimposed on the C-terminal domain of BaPGK and separated into A (apo-PGK in open forms) and B (semi-closed and fully closed forms) for clarity. A BaPGK in green, CjPGK in red, and PGK from Bacillus stearothermophilus in yellow (PDB code: 1php). B PGK from Homo sapiens in light blue (PDB code: 3c39), PGK from Thermotoga maritima in navy blue (PDB code: 1vpe), and fully closed PGK from Homo sapiens in orange (PDB code: 2wzc). All semi-closed and fully closed forms of PGKs are shown with nucleotide/substrate or their analogues (in ball and stick representation) bound in the structures

The topologies of both PGK structures within each domain are almost identical, with the only exception being helix 7, which is a seven-residue α-helix in BaPGK but a five-residue 3₁₀ helix in CjPGK (Fig. 1C). The N-terminal domain is composed of a central β-sheet of six parallel strands (A1–A6) surrounded by two α-helices in the front (helices 2 and 4) and three α-helices at the back (helices 5, 7, 8) plus five 3₁₀ helices (helices 1, 3, 6, 9, 10) (Fig. 1). The C-terminal domain is formed by a major central β-sheet of six parallel strands (B1–B6) and a minor β-sheet of three anti-parallel strands (C1–C3). These two β-sheets are surrounded by three α-helices in the front (helices 17, 19, 20) and four α-helices at the back (helices 12, 13, 15, 22) plus four 3₁₀ helices (helices 14, 16, 18, 21).

Two highly conserved motifs around a potential salt bridge

Phosphoglycerate kinase proteins with substrate and nucleotide bound adopt a completely “closed” conformation in the active state as compared to the “open”, resting conformation observed when either reactant is absent. Upon transition to the active, fully closed state, a salt bridge is formed between Arg62 from the N-terminal domain on top of the 3PG binding site and Asp200 from the C-terminal domain on top of the nucleotide binding site [45]. This creates a cover to fully close the cleft and form a catalytically active environment. The BaPGK and CjPGK structures determined here are both apo-forms without either substrate or nucleotide bound. They adopt a typical “open” conformation with distances of 15 and 13 Å, respectively, between the ends of the side chains of Arg62 and Asp200, which gives opening angles of 37° and 31° relative to the middle of the linker helix on the right.

DALI [46] and FATCAT [47] searches using the BaPGK structure identify more than a dozen PGK structures, including those from bacteria/archaea (thermophilic bacteria/archaea) and eukaryotic organisms (human, mouse, pig, horse, yeast, malaria parasite, and sleeping sickness parasite) (Table 2). While most other known structures share 40–50% sequence identity with BaPGK, the closest homologue to Bacillus anthracis PGK with a known structure is Bacillus stearothermophilus PGK (PDB code: 1php) [44], which shares 75% sequence identity (Fig. 3A). Using this subset of PGK structures, two strongly evolutionarily preserved motifs around the salt-bridge forming residues Arg62 and Asp200 in BaPGK were identified, namely “SHLGRPK”, which contains Arg62 (BaPGK), and “GG(A)KV(K)DKX” (where X is usually Ile or Leu), which contains Asp200 (BaPGK) (Fig. 2C). Interestingly, although Arg62 is conserved in CjPGK, a glycine is present in the position corresponding to Asp200. The lack of Asp200 may influence the aforementioned domain closure in CjPGK. It is possible that CjPGK adopts another way of stabilizing its closed conformation for the enzymatic reaction, or that it does not possess PGK activity. Further biochemical data are needed to validate either hypothesis.
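Sequence identity figures such as those quoted above depend on how identity is counted. A minimal sketch of one common convention follows: identical residues divided by the number of alignment columns in which neither sequence has a gap. The choice of denominator is an assumption made here and is not stated in the article. The first test string is the Arg62-containing motif from the text, while the variant compared against it is a made-up example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


motif = "SHLGRPK"                 # motif containing Arg62, as given in the text
hypothetical_variant = "SHLGKPK"  # made-up sequence, for illustration only

print(f"{percent_identity(motif, hypothetical_variant):.1f}% identical")
```

Other conventions divide by the length of the shorter sequence or of the full alignment, so identities computed by different tools are not always directly comparable.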

Conformation changes in PGK between “resting” and “active” states

The set of PGK structures identified in the DALI and FATCAT searches described in the previous section is analyzed further in terms of conformational changes. The RMSD values from the DALI search do not account for different conformation states amongst these structures. These differences can be quite significant and cannot be neglected. Therefore N-terminal and C-terminal domains were superposed separately. PGK structures adopt an almost identical fold amongst all species. For both C-terminal and N-terminal domains, more than 70–80% of the homologues can be aligned with an RMSD of around 1 Å (C-terminal domain) or less (N-terminal domain), which is no more than the flexibility between different structures of the same protein. The difference in length of structure alignment is mostly due to missing residues in incomplete structure models. Exceptionally high RMSDs are observed for only two structures (2pgk: RMSD = 2.6–2.9 Å, 3pgk: RMSD = 1.7–2.3 Å) that were determined almost three decades ago at relatively low resolution. These differences may reflect the quality of the structure and/or the tools available for structure refinement/validation at that time, rather than species-specific variations among PGK structures.

The degree of cleft opening can be evaluated using the distance between Arg62 and Asp200. In a structure with a typical “open” conformation this distance is between 13 and 18 Å, and in structures with a typical “closed” conformation it is less than 5–6 Å (Table 2). The actual distance between Arg62 and Asp200 will be closer to 3 Å upon domain closure due to side chain rearrangements that accompany salt bridge formation. The angle of cleft opening, which is measured between Arg62 and Asp200 relative to the middle of the linker helix, is around 40° and 12° for “open” and “closed” conformation, respectively. The rotating motion is also measured for BaPGK N-terminal domain while immobilizing C-terminal domain, and embodies a rotation in the range of 30°–40° upon switching to the “closed” conformation (Table 2). Some domain rotations which are not in the direction of domain closing are also observed in the range of 10°–20°. Conformational changes within the substrate binding site associated with substrate binding are shown by superimposing BaPGK with three other partially closed or fully closed PGK structures (Fig. 3). During the transition from the fully open to fully closed conformation, the substrate 3PG associated with the N-terminal domain is brought closer to, and eventually contacts, the phosphate moiety from the nucleotide, exemplifying a significant domain rotation of around 37°.
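The distance criterion described above can be written as a small decision rule. In the sketch below, the ranges come from the text (“open” at 13–18 Å, “closed” below 5–6 Å), but the exact cutoffs of 13.0 and 6.0 Å and the “intermediate” label for everything in between are assumptions made for illustration.

```python
# ASSUMED cutoffs, read off the ranges quoted in the text.
OPEN_MIN_DISTANCE = 13.0    # Å, lower end of the typical "open" range
CLOSED_MAX_DISTANCE = 6.0   # Å, upper end of the typical "closed" range


def classify_conformation(arg62_asp200_distance):
    """Label a PGK structure from its Arg62-Asp200 distance in Å."""
    if arg62_asp200_distance <= CLOSED_MAX_DISTANCE:
        return "closed"
    if arg62_asp200_distance >= OPEN_MIN_DISTANCE:
        return "open"
    return "intermediate"


# Distances reported in the text for the two apo structures.
for name, dist in (("BaPGK", 15.0), ("CjPGK", 13.0)):
    print(name, classify_conformation(dist))
```

A rule of this kind is only a coarse summary; the rotation angle and the opening angle in Table 2 carry complementary information about the direction of the domain movement.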

Substrate binding site

The residues along the cleft between the two domains are highly conserved, while other surface exposed regions are highly variable (Fig. 4). The conserved region corresponds well with the substrate binding site in the N-terminal domain and the ADP-Mg/ATP-Mg binding site in the C-terminal domain. The buried residues of the linker helix and its surrounding helices are also highly conserved. These residues form the hydrophobic patch which will be partially exposed to the solvent upon substrate binding, thereby making the closed conformation less energetically favorable, which has to be compensated by strong interactions in the active site during the catalytic process. This agrees with the spring-loaded release mechanism proposed in a recent study [15].
Fig. 4

Sequence conservation analysis for BaPGK (A) and CjPGK (B). Conservation score mapped on both surface and cartoon representations are colored according to ConSurf conservation score (dark red: the most conserved to cyan: variable)

The PGK substrate and ATP binding site is well-defined and resides in the cleft between the two domains [16]. Compared to human PGK in closed conformation (PDB code: 2wzc) the BaPGK model exhibits some different side chain conformations (Arg62, Asp200, Lys197) and main chain conformation (Gly349-351) within its potential binding environment (Fig. 5). Arg62 and Asp200 presumably adopt different orientations upon forming a salt bridge in the closed conformation. Lys197 and Arg36 which form part of the coordination sphere for the transferred phosphate have highly exposed side chains compared to the corresponding residues in the active state of human PGK. The substrate binding site is highly positively charged, with Arg62, Arg118, Arg151 coming close to form a phosphate clamp that precisely defines the location for the phosphate moiety from 3PG. Two main-chain alternative conformations (Gly349-Gly350-Gly351) are also modeled at the beginning of helix Gly351-Phe360. This motif forms the base below the phosphate moiety of the putative nucleotide binding site and is likely to be stabilized upon nucleotide binding.
Fig. 5

Side chain conformations around the active site that are important for substrate binding are shown after independently superposing the N-terminal (magenta) and C-terminal (light blue) domains of BaPGK on the closed conformation of human PGK (white) with 3-phosphoglyceric acid (3PG), trifluoroaluminate (ALF, a phosphate analogue), and ADP bound (PDB code: 2wzc). Side chains are shown in stick representation for residues that form the salt bridge (Arg62, Asp200), form the phosphate clamp (Arg62, Arg118, Arg151), and coordinate the phosphate (Arg36, Lys197). The Gly349-Gly350-Gly351 loop below the phosphate analogue, whose main chain is modeled with alternative conformations, is shown in green
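The salt bridge between Arg62 and Asp200 discussed above can be checked in any coordinate set with a simple distance criterion. The sketch below is illustrative only: it assumes the commonly used convention that a salt bridge is present when a side-chain nitrogen of the basic residue lies within 4.0 Å of a carboxylate oxygen of the acidic residue. The coordinates are hypothetical placeholders and are not taken from the deposited BaPGK or human PGK models, and the function names are introduced here for illustration.

```python
import math

# Hypothetical atom coordinates in Å (placeholders, NOT from the deposited models).
ARG_NITROGENS = {"NE": (10.0, 4.2, 7.1), "NH1": (11.9, 3.1, 7.8), "NH2": (11.6, 5.3, 8.0)}
ASP_OXYGENS = {"OD1": (13.9, 2.2, 9.6), "OD2": (14.4, 4.3, 9.3)}

# Assumed cutoff: a widely used convention, not a value stated in this paper.
SALT_BRIDGE_CUTOFF = 4.0


def salt_bridge_pairs(nitrogens, oxygens, cutoff=SALT_BRIDGE_CUTOFF):
    """Return (N atom, O atom, distance) for every N-O pair within the cutoff."""
    pairs = []
    for n_name, n_xyz in nitrogens.items():
        for o_name, o_xyz in oxygens.items():
            distance = math.dist(n_xyz, o_xyz)
            if distance <= cutoff:
                pairs.append((n_name, o_name, round(distance, 2)))
    return pairs


def has_salt_bridge(nitrogens, oxygens, cutoff=SALT_BRIDGE_CUTOFF):
    """True if at least one N-O pair satisfies the distance criterion."""
    return bool(salt_bridge_pairs(nitrogens, oxygens, cutoff))


if __name__ == "__main__":
    for n_name, o_name, distance in salt_bridge_pairs(ARG_NITROGENS, ASP_OXYGENS):
        print(f"{n_name} - {o_name}: {distance} Å")
```

Running the same check on the open and closed models, with real coordinates substituted for the placeholders, would show whether the Arg62-Asp200 pair satisfies the criterion in only one of the two conformations.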

Crystal packing interfaces

The active biological unit of PGK is most likely a monomer. Analysis of crystal contacts for both the BaPGK and CjPGK structures reveals three crystal packing interfaces in each structure, with buried interface areas of approximately 600 Å². PISA analysis indicates that none of the three interfaces in the CjPGK structure is stable. For the BaPGK structure, however, the interface generated by the symmetry operator (−x, y, −z) in the same cell is predicted to be stable in solution and could be treated as a putative dimer interface. It involves four hydrogen bonds and six salt bridges between two C-terminal domains. Modeled small molecules, including a Bis–Tris molecule and an unidentified carbohydrate-like molecule, contribute to this crystal interface. When the Bis–Tris and the unidentified molecule from the crystallization condition are removed from the model, however, PISA no longer predicts this interface to be stable. Although this interface does not appear to be biologically relevant, the small molecules may play an important role in crystal formation.
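The symmetry operator (−x, y, −z) mentioned above is a two-fold rotation about the b axis, so applying it twice returns the original molecule. This is why the interface it generates pairs exactly two copies. The following minimal sketch applies the operator to fractional coordinates; the matrix form, the function names and the example coordinate are illustrative assumptions, not part of the PISA workflow used in this study.

```python
# Rotation part of the (-x, y, -z) operator, acting on fractional coordinates.
TWO_FOLD_B = (
    (-1, 0, 0),
    (0, 1, 0),
    (0, 0, -1),
)

IDENTITY = (
    (1, 0, 0),
    (0, 1, 0),
    (0, 0, 1),
)


def apply_operator(matrix, frac_xyz):
    """Apply a 3x3 symmetry matrix to one fractional coordinate triple."""
    return tuple(
        sum(element * coord for element, coord in zip(row, frac_xyz))
        for row in matrix
    )


def compose(a, b):
    """Matrix product a*b, i.e. apply b first and then a."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


if __name__ == "__main__":
    atom = (0.25, 0.5, 0.125)  # hypothetical fractional coordinates
    mate = apply_operator(TWO_FOLD_B, atom)
    print("symmetry mate:", mate)
    # Applying the operator twice is the identity, so only two copies are related.
    print("two-fold:", compose(TWO_FOLD_B, TWO_FOLD_B) == IDENTITY)
```

Because the operator is its own inverse, both molecules contribute equivalent surfaces to the interface, consistent with the contacts being formed between two C-terminal domains.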


We present herein two new crystal structures of putative PGKs from pathogenic organisms. Both conform to the canonical PGK fold. Their overall structures are consistent with the “open” conformation typical of other PGKs in apo form. Comparison with other substrate- and ATP-bound structures highlights the flexibility of the hinge between the two domains, reaffirming that the cleft closes upon substrate and/or ATP binding, with both domains moving toward each other. Some of the residues around the substrate binding site are less ordered due to the lack of substrate, as compared to the active state seen in other structures. Nevertheless, as shown in our surface conservation analysis, these residues are still highly conserved. We have also identified two PGK-specific motifs not described previously. These two motifs contain residues Arg62 and Asp200, which form a salt bridge participating in the mechanism of domain movements upon substrate binding.


The authors would like to thank David R. Cooper, Ivan G. Shabalin, Jing Hou and the members of the Center for Structural Genomics of Infectious Diseases for valuable comments and discussions. This research was funded with Federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN272200700058C. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. Use of the LS-CAT Sector 21 was supported by the Michigan Economic Development Corporation and the Michigan Technology Tri-Corridor (Grant 085P1000817).

Conflict of interest

None declared.

Copyright information

© Springer Science+Business Media B.V. 2012