European Biophysics Journal, Volume 41, Issue 5, pp 449–460

Ionizable side chains at catalytic active sites of enzymes

Authors

  • David Jimenez-Morales
    • Department of Bioengineering, University of Illinois at Chicago
  • Jie Liang
    • Department of Bioengineering, University of Illinois at Chicago
  • B. Eisenberg
    • Department of Bioengineering, University of Illinois at Chicago
    • Department of Molecular Biophysics and Physiology, Rush University
Original Paper

DOI: 10.1007/s00249-012-0798-4

Cite this article as:
Jimenez-Morales, D., Liang, J. & Eisenberg, B. Eur Biophys J (2012) 41: 449. doi:10.1007/s00249-012-0798-4

Abstract

Catalytic active sites of enzymes of known structure can be well defined by a modern program of computational geometry. The CASTp program was used to define and measure the volume of the catalytic active sites of 573 enzymes in the Catalytic Site Atlas database. The active sites are identified as catalytic because the amino acids they contain are known to participate in the chemical reaction catalyzed by the enzyme. Acid and base side chains are reliable markers of catalytic active sites. The catalytic active sites have, on average, 4 acid and 5 base side chains, in an average volume of 1,072 Å³. The number density of acid side chains is 8.3 M (in chemical units); the number density of basic side chains is 10.6 M. The catalytic active site of these enzymes is an unusual electrostatic and steric environment in which side chains and reactants are crowded together in a mixture more like an ionic liquid than an ideal infinitely dilute solution. The electrostatics and crowding of reactants and side chains seem likely to be important for catalytic function. In three types of analogous ion channels, simulation of crowded charges accounts for the main properties of selectivity measured in a wide range of solutions and concentrations. It seems wise to use mathematics designed to study interacting complex fluids when making models of the catalytic active sites of enzymes.

Keywords

Computational geometry, Active site, Charge density

Introduction

The biological function of an enzyme usually takes place in infoldings (pockets) of the protein called catalytic active sites. Functional pockets place substrate and catalytic side chains together in the catalytic active site. Substrate and side chains form a selective substrate-enzyme complex in this tiny volume, much as Fischer (1894a, b) originally imagined, according to (Segel 1993), p. 7. Here, we use a modern analysis program CASTp to investigate the catalytic active site of enzymes as identified in the Catalytic Site Atlas database (Porter et al. 2004). CASTp (Liang et al. 1998; Dundas et al. 2006) can define and measure the volume of catalytic active sites and determine the number of acid and base side chains in that site. We are motivated by experiments (Ellinor et al. 1995; Koch et al. 2000; Sather and McCleskey 2003; Wu et al. 2000; Yang et al. 1993) and simulations of ion channels that show the importance of the acid side chains in calcium channels (Boda et al. 2008; Boda et al. 2009; Gillespie et al. 2009; Boda et al. 2011). Models that capture steric exclusion and the special electrostatics of ion channels (and little else) do quite well in describing the selectivity properties of channels and have successfully guided synthesis of artificial selective channels (Miedema et al. 2004; Miedema et al. 2006; Vrouenraets et al. 2006).

Ion channels are specialized proteins not known to Fischer (1894a, b). They are proteins with a hole down their middle that allow the movement of specific solutes across otherwise impermeable membranes. Ion channels ‘catalyze’ (Eisenberg 1990) the selective movement of ions moving through a dielectric barrier—from outside a cell to inside a cell, for example—but they do so without conventional chemistry. The ‘catalysis’ of ion channels does not involve the breaking or making of chemical bonds or the use of chemical energy. The catalytic active sites of ion channel proteins are the selectivity filters of the channel. The selectivity filter distinguishes between ions as the channel protein speeds (i.e., ‘catalyzes’) their movement across cell membranes—without the hydrolysis of ATP. Ion channels are nearly enzymes (Eisenberg 1990) and have been studied extensively in that tradition (Hille 2001).

Selectivity in three types of selectivity filters comes from charged side chains that face into the pore (Ellinor et al. 1995; Koch et al. 2000; Sather and McCleskey 2003; Wu et al. 2000; Yang et al. 1993) and mix with ions in an electrical stew (McCleskey 2000) in the tiny space of the selectivity filter. L-type Ca²⁺ channels (Boda et al. 2009) (CaV1.n; n = 1,2,…), voltage activated sodium channels (Boda et al. 2007) (NaV1.n; n = 1,2,…), and cation selective ryanodine receptors (RyRs) (Gillespie et al. 2009) can be simulated with success in a wide range of ionic conditions using a model of crowded charges in an implicit solvent (Eisenberg 2011a). Ion specific properties of bulk electrolytes have been treated in this tradition with some success for a long time (Friedman 1981; Torrie and Valleau 1982; Patwardhan and Kumar 1993; Durand-Vidal et al. 1996; Barthel et al. 1998; Fawcett 2004; Hansen and McDonald 2006; Lee 2008; Kunz 2009; Li 2009; Fraenkel 2010a, b; Kalyuzhnyi et al. 2010; Vincze et al. 2010; Hünenberger and Reif 2011). A synthetic channel has been built with properties rather like RyRs by mutating an entirely unrelated protein to have a high density of acid side chains (Miedema et al. 2006), according to the prescription of these low resolution models. Water, of course, is not a uniform dielectric and ions are not hard spheres: more atomic detail is clearly needed in treatments of ionic solutions in many cases (Howard et al. 2010). More atomic detail seems to be needed in models of potassium channels like KcsA (1K4C) that do not have charged side chains mixing with permeating ions. Simulations have not yet dealt with the binding found in potassium channels in a range of solutions and varying potassium concentrations. References to this large literature are in (Cannon et al. 2010; Yu et al. 2009; Bostick and Brooks 2009; Varma and Rempe 2010; Varma et al. 2011).

The idea of catalytic active sites has been important in the history of enzymology (Dixon and Webb 1979; Kyte 1995; Segel 1993), but the idea is not as prominent as it once was, perhaps because the notion of an active site seems vague. After all, the image of an active site is rather dim when compared to structures seen in the bright light of modern x-ray sources. The phrases ‘active site’ and ‘catalytic active site’ are not even in the index of one of the more widely used textbooks of biochemistry (Voet and Voet 2004).

Here, we use the computational power of CASTp to define active sites objectively, avoiding vagueness. CASTp identifies and measures all the concavities in enzymes, both pockets and voids, using a computer code involving little human subjectivity. First, we examine these concavities to see if they contain amino acids that participate in the chemical reaction catalyzed by the enzyme. Then, we further examine the concavities that are catalytic active sites to see if they have large densities of acid and base side chains in a small volume, as in calcium and sodium channel proteins.

We find that 573 catalytic active sites of enzymes of known structure and function are easily distinguished by their large numbers of acid and base side chains: acid and base side chains are reliable markers of catalytic active sites. These enzymes have 4 acid and 5 basic side chains, on average, in their catalytic active sites. The volume of the catalytic active sites is tiny, so the total number density (in chemical units) of acid and base side chains is some 20 molar. In comparison, the number density of solid sodium chloride is 37 molar. The phrase number density is used, as it is in mathematics, to make clear that no assumptions about the properties of the system are made. The number density is simply the number of objects found in a region, divided by the volume of that region. We fear (and find) that the use of the word ‘concentration’ causes confusion because ‘concentration’ is often treated as if it is the (thermodynamic) ‘activity’, but concentration does not well approximate activity in the ionic solutions found in biology (Eisenberg 2011b, c).

It seems likely that enzymes use the special properties of such concentrated mixtures of charges to promote catalysis one way or the other, for example, by crowding ions into the special electrostatic environment identified by Warshel (Warshel et al. 2006).

We imagine it will be useful to view catalytic activity of enzymes as a property of an ionic liquid of substrate and (tethered) side chains in the special electrostatic environment of the catalytic active site. Analysis that neglects interactions between ions seems unlikely to be useful, no matter how common in the classical literature of enzymology.

Methods: dataset

Catalytic active sites are called that because they contain amino acids known to be directly involved in the catalytic reactions of the enzyme. We define the (catalytic) active site pocket as (1) an enclosed space formed in the three-dimensional structure of proteins that also (2) contains the amino acids responsible for the catalytic reaction. This definition can be made quantitative and precise because of the enormous amount of structural data now available, along with the tools now available to analyze that data.

The Catalytic Site Atlas database (CSA) (Porter et al. 2004) annotates a subset of enzymes available in the Protein Data Bank (PDB) (Berman et al. 2002). The database contained 966 entries on June 16, 2011. The CSA classifies side chains using both experimental results and computational predictions. We use only the classifications based on experimental results, not those based on computational predictions. Redundant sequences are first removed: if a sequence has more than 95 % (pairwise) sequence identity to a common sequence (in >90 % of the length of each sequence), we select just one ‘at random’. The enzymes are grouped into six main classes (Tipton 1994) according to the chemical reactions they catalyze: EC1, oxidoreductases; EC2, transferases; EC3, hydrolases; EC4, lyases; EC5, isomerases; EC6, ligases.
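The redundancy filter described above can be sketched as a greedy pass over the sequences. This is only an illustration under stated assumptions: the text does not say which clustering procedure or alignment program was used, so the greedy strategy, the function name `remove_redundant`, and the `similarity` callback (which is assumed to return the pairwise identity and the fraction of each sequence covered by the alignment) are all hypothetical.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

# (identity, coverage of first sequence, coverage of second sequence), all as fractions
Similarity = Tuple[float, float, float]


def remove_redundant(
    ids: Sequence[str],
    similarity: Callable[[str, str], Similarity],
    identity_cutoff: float = 0.95,   # "more than 95 % sequence identity"
    coverage_cutoff: float = 0.90,   # "in >90 % of the length of each sequence"
    seed: int = 0,
) -> List[str]:
    """Keep one arbitrary representative from each group of near-identical sequences."""
    rng = random.Random(seed)
    order = list(ids)
    rng.shuffle(order)  # random visiting order makes the kept member arbitrary
    kept: List[str] = []
    for candidate in order:
        redundant = False
        for representative in kept:
            identity, cov_a, cov_b = similarity(candidate, representative)
            if identity > identity_cutoff and min(cov_a, cov_b) > coverage_cutoff:
                redundant = True
                break
        if not redundant:
            kept.append(candidate)
    return kept


# Toy data (hypothetical): A and B are near-identical, C is unrelated.
_toy: Dict[FrozenSet[str], Similarity] = {frozenset(("A", "B")): (0.98, 0.95, 0.97)}


def toy_similarity(a: str, b: str) -> Similarity:
    return _toy.get(frozenset((a, b)), (0.30, 0.50, 0.50))


toy_kept = remove_redundant(["A", "B", "C"], toy_similarity)
```

In the toy example, `toy_kept` contains `C` and exactly one of `A` or `B`; which one survives depends on the seed.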

Characterization of the active site pocket

Binding sites and catalytic active sites of proteins are often associated with structural pockets and cavities. The CASTp program (Dundas et al. 2006) identifies and measures the pockets and cavities of the experimentally determined structures found in the PDB files (Fig. 1). CASTp is based on alpha shape theory of computational geometry. It uses an analytically exact method to compute the metric properties of voids and pockets on models of macromolecules (Liang et al. 1998).
https://static-content.springer.com/image/art%3A10.1007%2Fs00249-012-0798-4/MediaObjects/249_2012_798_Fig1_HTML.gif
Fig. 1

Sketch of the structural elements calculated and measured in this paper. a Catalytic active site (left hand panel). The catalytic active site in this example is in a pocket accessible from outside. Most (93 %) of the active sites in our dataset are accessible. b Molecular surface that defines the catalytic active site volume (unit: Å³). The volume is reported for both pockets and voids, but not for depressions, where it is difficult to define precisely

We use the Molecular Surface (MS) model (Connolly 1985) in CASTp to determine metrics (e.g., the volume) of all pockets. The volume of the surface pockets is the measurement of the space inside the boundary of the pocket that is not occupied by any atom. Details of the pocket geometry calculations can be found in (Liang et al. 1998; Dundas et al. 2006; Edelsbrunner et al. 1998). We use this model to analyze (1) pockets that contain catalytic amino acids, (2) surface pockets (that usually do not contain catalytic amino acids), and (3) interior pockets or voids, whether or not they contain catalytic amino acids. MS represents the protein as a set of intersecting hard spheres (the ‘atoms’). The outer boundary molecular surface is obtained by tracing the distal edge of a spherical ball that is rolled around the protein molecule (see Fig. 1b). This surface is supposed to characterize a spherical solvent molecule rolling around an irregular protein if both were macroscopic uncharged objects. No one knows how to sample the space around an irregular protein the way a solvent or solute molecule actually samples that space in a protein in an ionic solution. Such sampling is needed if the free energy of solvent or solute is to be simulated precisely enough to calculate biological selectivities (Kokubo et al. 2007; Kokubo and Pettitt 2007; Zhang et al. 2010; Eisenberg 2010). We use CASTp with MS to measure the catalytic active site because together they provide computer based objective estimates. These estimates are significantly more reproducible than those that require more human judgment.

We are mostly interested in catalytic active sites but first we must discuss structural features of the enzyme (Liang and Dill 2001) that do not participate in the catalytic reaction of the enzyme. We call some of these ‘craters’. Craters are pockets (1) that do not contain atoms of a catalytic side chain, and also (2) have a volume between 100 and 3,000 Å³. Some craters contain protein–ligand complexes that participate in the substrate chemical reaction. Some do not. The range of volumes for protein–ligand complexes was 100–1,694 Å³ (Saranya and Selvaraj 2009).

The enzymes surveyed here contain, on average, 48 pockets or voids, most of them with a tiny size (less than 100 Å³). Of these, 53 % are pockets, i.e., accessible from the outside, and 47 % are voids (i.e., non-accessible pockets). Craters as we define them are entirely distinct from catalytic active sites. The function of craters is not known despite our speculations later in this paper.

We define the Active Site Pocket (ASP) as the pocket with a volume between 100 and 3,000 Å³ that contains the largest number of atoms of the catalytic side chains. The range of minimum and maximum volume of the substrates was from 81 to 768 Å³. The properties of active sites located in either depressions or convex surfaces are not considered here because volume cannot be measured reliably in those cases.
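The two definitions above (craters and the Active Site Pocket) amount to a simple selection rule over the pockets reported for one structure. The following is a minimal sketch of that rule, not the code used in the study. The `Pocket` record and its field names are hypothetical, the volume bounds are assumed to be inclusive, and the ASP is assumed to require at least one catalytic atom; ties are broken by taking the first pocket encountered.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

MIN_VOLUME = 100.0    # Å^3, lower bound used for both ASPs and craters
MAX_VOLUME = 3000.0   # Å^3, upper bound used for both ASPs and craters


@dataclass
class Pocket:
    pocket_id: int
    volume: float          # pocket volume in Å^3 (e.g. as reported by a pocket finder)
    catalytic_atoms: int   # atoms of annotated catalytic side chains lining the pocket


def in_volume_range(pocket: Pocket) -> bool:
    return MIN_VOLUME <= pocket.volume <= MAX_VOLUME


def active_site_pocket(pockets: Sequence[Pocket]) -> Optional[Pocket]:
    """Pocket in the volume range holding the most catalytic atoms, or None."""
    candidates = [p for p in pockets if in_volume_range(p) and p.catalytic_atoms > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.catalytic_atoms)


def craters(pockets: Sequence[Pocket]) -> List[Pocket]:
    """Pockets in the volume range that contain no catalytic atoms."""
    return [p for p in pockets if in_volume_range(p) and p.catalytic_atoms == 0]


# Hypothetical pockets for one enzyme structure.
example = [
    Pocket(1, 45.0, 0),      # too small to analyze
    Pocket(2, 850.0, 12),    # selected as the ASP
    Pocket(3, 260.0, 0),     # crater
    Pocket(4, 3500.0, 20),   # outside the volume range
    Pocket(5, 400.0, 3),     # catalytic atoms, but fewer than pocket 2
]
asp = active_site_pocket(example)
crater_ids = [p.pocket_id for p in craters(example)]
```

A structure for which `active_site_pocket` returns `None` would have no ASP under this reading, which is one way to understand why only part of the dataset has an ASP.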

Figure 2 shows the distribution of the volume of catalytic active sites. Three-quarters of the selected PDB dataset (573 of 759) has active site pockets, as we define ASPs. Figure 3 shows the distribution of amino acids in the entire enzyme; the distribution of amino acids in the active site pockets; and the distribution of amino acids in the catalytic side chains of the selected proteins. The results are consistent with an earlier study (Dundas et al. 2006). Also see (Porter et al. 2004) and (Gutteridge and Thornton 2005).
https://static-content.springer.com/image/art%3A10.1007%2Fs00249-012-0798-4/MediaObjects/249_2012_798_Fig2_HTML.gif
Fig. 2

Histogram of the distribution of the volume of the active site pocket for a set of 759 enzyme structures (unit: Å³). Pockets with volumes between 100 and 3,000 Å³ were used in the determination of the number of acid and base side chains and the calculation of the density of charge

https://static-content.springer.com/image/art%3A10.1007%2Fs00249-012-0798-4/MediaObjects/249_2012_798_Fig3_HTML.gif
Fig. 3

Amino acid composition in our dataset for the entire protein, all the amino acids in the active site pocket and only the catalytic amino acids. The distribution of amino acids in the entire protein and the catalytic active site are not very different. There is a significant increase of polar charged and uncharged side chains for catalytic residues

One difficulty in measuring the size of surface pockets on proteins is determining the boundary that separates the pocket from the outside solution. In this study, we used the convex hull of the atoms of a protein to define the boundary of the surface pockets. This choice gives an unambiguous measurement, although other definitions may also be possible (Liang et al. 1998; Edelsbrunner et al. 1998). Another difficulty in measuring the size of surface pockets is the significant change of measured volume that is produced by even a small change of the shape of the surface pocket. Here, pocket volume was calculated from the experimentally determined structure in the conditions in which the structure was measured. The effects of substrate (Otyepka et al. 2007b) and ion concentrations on structure and active site volume cannot yet be dealt with quantitatively because of the lack of crystallographic data. The experimentally determined structure is an average of the ensemble of conformations a protein adopts, but overall conclusions are well determined estimators of protein properties because they are based on statistics of the volume measurement gathered from a large number of protein structures. Surface area can be used in our analysis instead of volume without significantly changing our conclusions (data not shown).

Charge densities (CharDen)

The density of charge (of acid and base side chains) is the key variable that determines a biological function (selectivity) of ion channels (Eisenberg 2011a), so we are interested in measuring that variable in the catalytic active site of enzymes of known structure, as best we can.

The calculation of the charge density (CharDen) requires counting the number of acid (‘negative’) and base (‘positive’) side chains and calculation of the volume they surround and occupy. In our calculation, surface exposed atoms are those with non-zero exposed surface area, whether or not the exposed atoms are side-chain atoms or backbone atoms. When counting the number of ionizable residues in an active site, only those with the side chains pointing towards the pocket are considered. We simplify our language by using the word negative to describe the charge of acid side chains and the word positive to describe the charge of basic side chains. ‘Charge density’ refers to the number density of either acid or basic side chains, or both. We are quite aware that the ionization state of the side chains is not known in most cases and is sensitive to the local (and even global) environment. See (Warshel and Russell 1984; Warshel 1981; Davis and McCammon 1990; Honig and Nichols 1995; Antosiewicz et al. 1996), climbing on the shoulders of (Tanford 1957; Tanford and Kirkwood 1957; Tanford and Roxby 1972) and the older references cited on p. 457–463 of (Edsall and Wyman 1958) and p. 117–127 of (Cohen and Edsall 1943). The physics involved is oversimplified by this language, with implications that we discuss later.

Number density (objects/m³) is given in units of molar concentration for easier chemical intuition, e.g., comparison with ionic liquids and solids. No assumptions concerning the activity or activity coefficient are implied. That is why we use the phrase number density. We fear (and find) that ‘concentration’ and activity are often confused, with serious effects when dealing with the ionic mixtures of biology (Eisenberg 2011b, c). Obviously, acid and base side chains at these number densities are not ideal non-interacting particles with activity coefficients of one. Indeed, it is not clear even how to define the activity coefficient of an ion in systems this concentrated (Hünenberger and Reif 2011). The protein creates a special electrostatic environment (Warshel et al. 2006). It also creates a fluid substrate that is more like an ionic liquid than an ideal solution. Theories and simulations that assume ideal properties of reactants or force fields in these conditions are unlikely to be helpful.
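The conversion from a count of side chains in a pocket to a number density in molar units is a one-line calculation, sketched below. The function name is hypothetical. The density and molar mass of sodium chloride are standard handbook values, not taken from this paper. Note that applying the conversion to the average count and the average volume gives a smaller value than the reported mean densities, presumably because the reported values are averages of per-site densities, in which small pockets weigh heavily.

```python
AVOGADRO = 6.02214076e23           # particles per mole
LITERS_PER_CUBIC_ANGSTROM = 1e-27  # 1 Å^3 = 1e-30 m^3 = 1e-27 L


def molar_number_density(count: float, volume_cubic_angstrom: float) -> float:
    """Number density (objects per volume) expressed in mol/L.

    No activity or activity coefficient is implied; this is a plain count per volume.
    """
    if volume_cubic_angstrom <= 0:
        raise ValueError("volume must be positive")
    liters = volume_cubic_angstrom * LITERS_PER_CUBIC_ANGSTROM
    return count / (AVOGADRO * liters)


# Density computed from the *average* counts and the *average* volume.
acid_from_means = molar_number_density(4, 1072.0)   # about 6.2 M
base_from_means = molar_number_density(5, 1072.0)   # about 7.7 M

# Solid NaCl for comparison: handbook density 2.165 g/cm^3, molar mass 58.44 g/mol.
nacl_molar = 2165.0 / 58.44                          # about 37 mol of NaCl per liter
```

For scale, one side chain in a 100 Å³ pocket already corresponds to about 16.6 M under this conversion.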

Results

We analyze the distribution of the different amino acids according to the enzymatic activities of the catalytic active sites (Tipton 1994) as described in Table 1, noting that some amino acids may be included in two or more classes.
Table 1

Group of amino acids according to the chemical properties of their side chain

Group | Amino acids
Hydrophobic (non-polar, uncharged) | Alanine, Leucine, Isoleucine, Methionine, Phenylalanine, Tryptophan, Tyrosine and Valine
Polar (uncharged) | Serine, Threonine, Asparagine and Glutamine
Aromatic | Tryptophan, Phenylalanine and Tyrosine
Basic (positively charged) | Lysine, Arginine and Histidine
Acidic (negatively charged) | Aspartic and Glutamic acid
Special cases | Cysteine, Proline and Glycine

Hydrophobic side chains (i.e., Ala, Val, Leu, Ile) are found less frequently in active site pockets than in the entire protein. Aromatic side chains (Trp, Tyr, Phe), small polar side chains (Ser, Thr), and particularly Glycine, are more common (Fig. 3).

The distribution of amino acids responsible for the catalytic reaction is striking. It is very different from the (distribution of the) overall composition of the active site pocket as well as the (distribution of the) amino acids in the whole protein (Figs. 3, 4), as previously reported by (Porter et al. 2004) and (Gutteridge and Thornton 2005). The catalytic side chains of transferases (EC2), lyases (EC4) and isomerases (EC5) have a similar distribution of base and acid side chains. However, hydrolases (EC3) have a larger fraction of acid side chains (D and E) and also a larger fraction of histidine. Ligases (EC6) have a larger fraction of base side chains (K, R and H). The evolutionary or chemical reasons for this specialization are not known.
https://static-content.springer.com/image/art%3A10.1007%2Fs00249-012-0798-4/MediaObjects/249_2012_798_Fig4_HTML.gif
Fig. 4

Amino acid composition grouped by enzymes (EC1–EC6). All the amino acids in the entire protein, in the catalytic active site pockets and only the catalytic amino acids

Volumes of catalytic active sites

The mean volume of the catalytic active site pocket is 1,072 Å³. The average sequence length of proteins in our dataset is 338 amino acids, with a standard error of the mean of 6.3. The average number of side chains that are part of the active site pocket is 34 ± 0.77 (n = 573) (mean ± SE of the mean).

Different classes of enzymes have somewhat different characteristics. The largest catalytic active sites are found in oxidoreductases (1,568 Å³), ligases (1,233 Å³) and transferases (1,206 Å³). Hydrolases (786 Å³) and isomerases (863 Å³) have the smallest pocket volumes. Oxidoreductases (EC1) have the longest sequence length (average 379.7 ± 18.9, n = 99), and the largest number of amino acids in the catalytic active site (on average 47.6 ± 2.05, n = 99). Isomerases have both the shortest sequence length (296.4 ± 23.3, n = 43) and the lowest number of amino acids in the catalytic active site (29.33 ± 2.60, n = 43).

Charge densities at catalytic active sites

We calculated various densities for each pocket (Table 2), assuming for the purposes of exposition that all acid and base side chains are ionized. We calculated (1) the density of positive charges, (2) the density of negative charges, (3) the density of the absolute value of charges, namely the total density of acid and base side chains. The mean density for the whole dataset of 573 enzymes is 18.9 ± 0.58 M. The distribution of the total density of charge of catalytic active sites (CharDen) is shown in Fig. 5. Isomerases (22.1 M) and hydrolases (22.8 M) have the largest CharDen values. Oxidoreductases have the smallest (12.1 M). For 93 % of the proteins in our data set, catalytic active sites have clear connections to the outside through what we call ‘mouth-opening(s)’. These openings are large enough to allow the access of water molecule(s). For the remaining proteins (7 %) in our data set, active sites are found to be in voids buried and non-accessible according to our definitions. Since these enzymes do in fact catalyze reactions involving substrates outside the protein, it is likely that the structure of the protein fluctuates to allow substrate and ligand access, as seen in cytochrome P450 (Otyepka et al. 2007a; Ludemann et al. 2000b, Ludemann et al. 2000a; Cojocaru et al. 2011).
Table 2

Summary of charge density (CharDen, unit molar) at the catalytic active site, craters and the entire protein

  

EC class | Enzyme group | Active site CD+ | Active site CD− | Active site CDt | Craters CD+ | Craters CD− | Craters CDt | Protein CDt
EC1 | Oxidoreductases (n = 98) | 7.5 | 4.6 | 12.1 | 16.8 | 12.1 | 28.9 | 2.8
EC2 | Transferases (n = 126) | 9.5 | 7.2 | 16.6 | 16.4 | 12.5 | 28.8 | 3.1
EC3 | Hydrolases (n = 214) | 12.1 | 10.7 | 22.8 | 15.2 | 11.9 | 27.1 | 2.7
EC4 | Lyases (n = 72) | 11.2 | 7.3 | 18.5 | 16.6 | 11.6 | 27.8 | 2.8
EC5 | Isomerases (n = 43) | 12.6 | 9.5 | 22.1 | 16.2 | 13.5 | 29.7 | 2.9
EC6 | Ligases (n = 20) | 9.7 | 8.3 | 18.0 | 16.2 | 11.9 | 28.0 | 3.0
- | Total (n = 573) | 10.6 | 8.3 | 18.9 | 16.0 | 12.1 | 28.2 | 2.8

CD+: Molar positive CharDen; CD−: Molar negative CharDen; CDt: Total (positive + negative) molar CharDen
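As a reading aid, the ‘Total’ row of Table 2 for catalytic active sites can be checked against the class rows: weighting each class mean by its number of enzymes should reproduce the overall mean to within rounding. The sketch below does this with the values transcribed from the table; the variable and function names are hypothetical. The same check is not applied to craters because the number of craters per class is not given.

```python
from typing import Dict, Tuple

# class -> (n enzymes, CD+, CD-, CDt) for catalytic active sites, in molar (Table 2)
ACTIVE_SITE_ROWS: Dict[str, Tuple[int, float, float, float]] = {
    "EC1 oxidoreductases": (98, 7.5, 4.6, 12.1),
    "EC2 transferases": (126, 9.5, 7.2, 16.6),
    "EC3 hydrolases": (214, 12.1, 10.7, 22.8),
    "EC4 lyases": (72, 11.2, 7.3, 18.5),
    "EC5 isomerases": (43, 12.6, 9.5, 22.1),
    "EC6 ligases": (20, 9.7, 8.3, 18.0),
}
REPORTED_TOTAL = (573, 10.6, 8.3, 18.9)  # 'Total' row of Table 2


def weighted_means(rows: Dict[str, Tuple[int, float, float, float]]):
    """Return (total n, (CD+, CD-, CDt)) with class means weighted by class size."""
    n_total = sum(n for n, _, _, _ in rows.values())
    sums = [0.0, 0.0, 0.0]
    for n, cd_pos, cd_neg, cd_tot in rows.values():
        sums[0] += n * cd_pos
        sums[1] += n * cd_neg
        sums[2] += n * cd_tot
    return n_total, tuple(s / n_total for s in sums)


n_total, (cd_pos, cd_neg, cd_tot) = weighted_means(ACTIVE_SITE_ROWS)

# Each recomputed value should agree with the reported total to within 0.1 M,
# the precision at which the table is printed.
agrees = all(
    abs(recomputed - reported) < 0.1
    for recomputed, reported in zip((cd_pos, cd_neg, cd_tot), REPORTED_TOTAL[1:])
)
```

The class sizes sum to 573, and the recomputed means agree with the reported totals at the printed precision.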

https://static-content.springer.com/image/art%3A10.1007%2Fs00249-012-0798-4/MediaObjects/249_2012_798_Fig5_HTML.gif
Fig. 5

Density estimation of the fraction of proteins with a given charge density (CharDen). Catalytic active site, craters and the entire protein CharDen

Protein charge density

We also computed the density of charge of the entire protein. This calculation used the volume of the entire protein (Edelsbrunner et al. 1995). The charge density for the entire protein in our dataset (global charge density) is on average 2.82 ± 0.03 M (n = 573), with little dispersion (Fig. 5). The value 2.8 M was smaller than we expected considering that 25 % of the side chains in proteins are charged.

We find that the positive charge density is always larger than the negative charge density, but for some classes of enzymes (hydrolases, ligases) the surplus of positive charge is smaller than for others (oxidoreductases, lyases).

Craters

Craters (as we have defined them above) are smaller (262.2 Å³, 13 amino acids per crater) than catalytic active sites (1,072 Å³, 34 amino acids per site) and are mostly (83.5 %) accessible from the outside. The volume of craters is largest among ligases and smallest in isomerases. The distribution of the volume of craters is quite different from the distribution of the volume of catalytic active sites (two sample Kolmogorov–Smirnov test, p-value = 2.2 × 10⁻¹⁶, Fig. 6). The number of amino acids in craters of different types of enzymes ranges from 12.8 (Transferases) to 13.3 (Oxidoreductases and Ligases).
https://static-content.springer.com/image/art%3A10.1007%2Fs00249-012-0798-4/MediaObjects/249_2012_798_Fig6_HTML.gif
Fig. 6

Density estimation of the volume (Å³) of catalytic active sites and craters

The charge density in craters (28.2 ± 0.34 M) is larger than in catalytic active sites (where it is 18.9 ± 0.58 M: Table 2). Values in craters are different among the different groups of enzymes. They vary from 27.1 M (hydrolases) to 29.7 M (isomerases). The distribution of charge density in craters is very different (Fig. 5; K–S test, p-value = 2.7 × 10⁻¹⁵) from the distribution in the catalytic active sites. We do not know why.
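The two-sample Kolmogorov–Smirnov comparisons reported here measure the largest gap between the empirical cumulative distributions of two samples. The following is a minimal pure-Python sketch of that statistic together with the standard large-sample approximation of its p-value. It is offered only as an illustration: the software actually used for the tests is not stated, the function names are hypothetical, and the asymptotic formula is inaccurate for small samples.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest absolute difference between the two empirical CDFs."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both ECDFs are step functions, so the supremum is reached at a data point.
    for v in xs + ys:
        f_a = bisect_right(xs, v) / n   # fraction of a that is <= v
        f_b = bisect_right(ys, v) / m   # fraction of b that is <= v
        d = max(d, abs(f_a - f_b))
    return d


def ks_asymptotic_pvalue(d: float, n: int, m: int, terms: int = 100) -> float:
    """Large-sample p-value: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.1:
        return 1.0  # Q is indistinguishable from 1 here and the series converges slowly
    q = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, q))


# Hypothetical samples: identical, partly overlapping, and fully separated.
d_same = ks_statistic([1, 2, 3], [1, 2, 3])
d_overlap = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
d_apart = ks_statistic(list(range(50)), list(range(100, 150)))
p_apart = ks_asymptotic_pvalue(d_apart, 50, 50)
```

With real data, `a` and `b` would be the per-pocket volumes (or charge densities) of catalytic active sites and of craters.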

Discussion

A great deal of attention has been paid to the chemical role of acid and base side chains in the catalytic active sites of enzymes, and to the special electrostatic environment of enzymes (Warshel et al. 2006) and channels (Eisenberg 1996a, b), but less attention has been paid to the steric effects of excluded volume. Those effects can be substantial when charge densities are high and crowding results. The steric repulsion of finite size ions produces chemical specificity in bulk solution (Friedman 1981; Torrie and Valleau 1982; Patwardhan and Kumar 1993; Durand-Vidal et al. 1996; Barthel et al. 1998; Fawcett 2004; Hansen and McDonald 2006; Lee 2008; Kunz 2009; Li 2009; Fraenkel 2010a, b; Kalyuzhnyi et al. 2010; Vincze et al. 2010; Hünenberger and Reif 2011) and some ion channels (Eisenberg 2011a; Boda et al. 2007; Boda et al. 2009; Gillespie et al. 2009).

It seems likely to us that the charge densities in catalytic active sites create a special physical environment optimized in some unknown way to help enzymes do their work. The tiny volume of the catalytic active site ensures that even a few acid side chains produce a large density of electric charge. The forces that produce (approximate) electroneutrality ensure that a nearly equal amount of counter charge is near the acid or basic side chains, within a few Debye or Bjerrum lengths. One component of the enzymatic specialization is the electrostatic environment analyzed in detail by Warshel in enzymes (Warshel et al. 2006) and (in significantly less detail) by Eisenberg in channels (Eisenberg 1996a, b). In channels, another component of the specialization is the steric effect of crowded charge. In (some types of) channels, it is the balance between electrostatics and crowding that produces the selectivity that defines channel types.

It seems useful to speculate that enzymes balance electrostatic and steric forces the way some channels do. After all, channels are nearly enzymes (Eisenberg 1990). The tiny volume surrounding the side chains and counter ions guarantees severe crowding and steric repulsion.

In these crowded catalytic active sites, reactants and side chains mix in an environment without much water, very different from the water dominated ionic solutions outside of proteins. The environment does not resemble the infinitely dilute ideal fluid for which the law of mass action is appropriate (Eisenberg 2011c). The catalytic active site seems more like an ionic liquid (Kornyshev 2007; Siegler et al. 2010; Spohr and Patey 2010) than an ideal gas. The ionic liquid of the catalytic active site differs from classical ionic liquids because some of its components are side chains of proteins, ‘tethered’ to a polypeptide backbone, not free to move into the bulk solution. These charged side chains may have as large a role in the function of proteins (Eisenberg 1996a, b) as doping has in transistors (Markowich et al. 1990; Howe and Sodini 1997; Pierret 1996; Sze 1981), although the finite diameter of the side chains adds a strong flavor of chemical selectivity and competition not found in semiconductors (Eisenberg 2005; Eisenberg 2012).

Craters

Our main focus has been on the catalytic active sites and the pockets that surround them, but we also found pockets (which we call craters) that do not contain catalytic amino acids. Proteins in our data set contain 4.5 pockets per protein that are large enough for us to analyze (i.e., are larger than 100 Å³ and are not located in either depressions or convex surfaces). These craters do not contain catalytic residues and are thus not catalytic active sites. Some craters are known to be binding sites for effectors (activators or repressors), i.e., small molecules that change the biological activity of the protein. Craters near the outer surface of a protein are likely to be important in protein–protein interactions because they contain large amounts of permanent (i.e., ‘fixed’) charge.

Craters seem to us to be atomic-scale ion exchangers, i.e., charged reservoirs of mechanical energy. Ion exchangers are Donnan systems that generate substantial internal osmotic and hydrostatic pressure (Helfferich 1962 (1995 reprint); Nonner et al. 2001). The osmotic pressure in craters creates strong mechanical forces in the enzyme. When those forces are unleashed, so they can cause motion, the structure of the enzyme is likely to change, on atomic and also on macroscopic scales. These structural changes might be conformation changes involved in the natural function of the enzyme. The osmotic pressure of craters might be one of the forces that drives the conformational changes of enzyme function.

Charge in the catalytic active site: amount and role

The large densities of acid and base side chains reported here do not automatically imply a large density of charge. The ionization state of most of these side chains is not known. Direct measurements are needed, in our view. Calculations are not reliable given the difficulties in designing force fields and calibrating simulations in the special ionic environment of the catalytic active site, so different from bulk solution. Ionization would, of course, differ from enzyme to enzyme and mutant to mutant. Ionization is expected to depend on the concentrations of reactants and ions near the binding site, as well as in the surrounding baths. Similar charge interactions were considered long ago (pp. 457–463 of (Edsall and Wyman 1958); pp. 117–127 of (Cohen and Edsall 1943)), even before proteins were shown to be well-defined molecules (Linderstrom-Lang 1924), and have been simulated and analyzed with great success more recently (Warshel and Russell 1984; Warshel 1981; Davis and McCammon 1990; Honig and Nichols 1995; Antosiewicz et al. 1996), with (Tanford 1957; Tanford and Kirkwood 1957; Tanford and Roxby 1972) serving as a link between the early and recent literature. The special importance of the electrostatic environment was brought to the attention of modern workers by Warshel (Warshel and Russell 1984), who has particularly emphasized its importance in the active site (Warshel et al. 2006).
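The conversion from a count of side chains in a pocket volume to a number density in molar units can be sketched as follows. This is only an illustration of the unit conversion; the function name `molar_density` and the example site (5 side chains in 1,000 Å3) are hypothetical and are not taken from the data set.

```python
AVOGADRO = 6.02214076e23            # particles per mole
LITERS_PER_CUBIC_ANGSTROM = 1e-27   # 1 Å^3 = 1e-30 m^3 = 1e-27 L

def molar_density(count, volume_A3):
    """Number density in mol/L for `count` groups in a volume given in Å^3."""
    if volume_A3 <= 0:
        raise ValueError("volume must be positive")
    moles = count / AVOGADRO
    liters = volume_A3 * LITERS_PER_CUBIC_ANGSTROM
    return moles / liters

# Hypothetical site: 5 ionizable side chains in a 1,000 Å^3 pocket.
print(f"{molar_density(5, 1000.0):.1f} M")
```

One group per 1,000 Å3 corresponds to about 1.66 M, so a handful of side chains in a pocket of this size is already far more concentrated than physiological salt solutions. As stated above, this is a density of ionizable groups, not necessarily of charge.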

Salt bridges are likely to reduce the net charge of catalytic active sites because the negative charge of one acid side chain balances the positive charge of a basic side chain. Specifically, 73 % of the catalytic active sites contain at least one acid side chain within 4 Å of a basic side chain (44 % of craters).
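A distance criterion of this kind can be sketched in a few lines. The version below is a hedged illustration, not the procedure used to obtain the percentages above: which atoms represent each side chain is an assumption (here, one or more coordinates per charged group, for example carboxylate oxygens and amine nitrogens), and the function names and coordinates are hypothetical.

```python
import math
from itertools import product

def has_salt_bridge(acid_atoms, base_atoms, cutoff=4.0):
    """True if any acid atom lies within `cutoff` Å of any base atom.

    acid_atoms, base_atoms: iterables of (x, y, z) coordinates in Å.
    Choosing which atoms stand for a side chain is an assumption of this sketch.
    """
    return any(math.dist(a, b) <= cutoff
               for a, b in product(acid_atoms, base_atoms))

def fraction_with_salt_bridge(sites, cutoff=4.0):
    """Fraction of sites with at least one acid-base pair within the cutoff.

    sites: list of (acid_atoms, base_atoms) tuples, one per pocket.
    """
    if not sites:
        raise ValueError("no sites given")
    hits = sum(has_salt_bridge(acids, bases, cutoff) for acids, bases in sites)
    return hits / len(sites)

# Hypothetical coordinates for two pockets (not real structures).
site_a = ([(0.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)])   # pair 3 Å apart
site_b = ([(0.0, 0.0, 0.0)], [(6.0, 0.0, 0.0)])   # pair 6 Å apart
print(fraction_with_salt_bridge([site_a, site_b]))
```

The all-pairs loop is quadratic in the number of atoms, which is adequate for single pockets holding a few side chains.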

The leftover charge, not balanced in salt bridges, is still likely to be large, enough to create densities of ions far beyond those found in bulk solutions. These unbalanced charges are, we believe, an important source of the special electrostatic environment in active sites (Warshel et al. 2006).

Large densities of charge obviously have a profound effect on the protonation steps found in many chemical reactions catalyzed by enzymes. Large densities of charge are likely to have other effects beyond shifts in protonation states. The protein creates a charged surface that fits the substrates as a glove fits a hand. Indeed this is a special electrostatic environment.

This special electrostatic structure will have large effects on any step in a chemical reaction that produces changes in charge, or is influenced by the electric field (consider dielectrophoresis (Pohl 1978)). In addition to these effects, it is possible that the large densities of charge produce special physical constraints on orbitals of electrons in the molecules close to the protein. The permanent (i.e., ‘fixed’) charge of the protein must enforce a nearly Neumann boundary condition for the Poisson part of the Schrödinger equation that defines the molecular orbitals of nearby (substrate) electrons.

Whatever the role of the large charge densities in catalysis, their presence produces interactions not present in the law of mass action (Eisenberg 2011c) used universally in models of enzyme kinetics (Dixon and Webb 1979; Segel 1993), with rate constants independent of concentration. That law of mass action is appropriate for an infinitely dilute ideal gas, not for the concentrated solutions (nearly an ionic liquid) in a catalytic active site. ‘Everything’ interacts with everything else in those conditions. The free energy (that drives a chemical reaction) then depends on the concentrations of all species, not just the concentrations of reactants and products (Pytkowicz 1979; Hovarth 1985; Zemaitis et al. 1986; Pitzer 1995; Barthel et al. 1998; Durand-Vidal et al. 2000; Fawcett 2004; Lee 2008; Kunz 2009; Kontogeorgis and Folas 2009; Fraenkel 2010b). In addition, the flow of reactants is coupled to the concentration (and perhaps flow) of all other species near the catalytic active site. ‘Everything’ interacts with everything else in the crowded confines of the catalytic active site. Indeed, the singular single-file behavior seen in some types of ion channels is an extreme example of nonideal behavior. Ions in such systems clearly do not behave as if they are infinitely dilute with activities independent of other ions. It seems wiser to use mathematics designed to handle interactions in complex fluids (Hyon et al. 2010; Eisenberg et al. 2010; Liu 2009; Sheng et al. 2008; Doi 2009) rather than mathematics designed to handle infinitely dilute uncharged ideal gases.
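The dependence of the free energy on all species can be written compactly in standard textbook notation. The symbols here are assumptions of this sketch and do not appear in the original text: c_i is the concentration of species i, \gamma_i its activity coefficient, \nu_i its stoichiometric coefficient (positive for products, negative for reactants), and N the total number of species present, including those that do not take part in the reaction.

```latex
\mu_i = \mu_i^{0} + RT \ln\bigl(\gamma_i c_i\bigr),
\qquad
\gamma_i = \gamma_i(c_1, c_2, \dots, c_N)

\Delta G = \Delta G^{0} + RT \ln \prod_{i} \bigl(\gamma_i c_i\bigr)^{\nu_i}
```

In the ideal limit \gamma_i \to 1 the second expression reduces to the familiar mass-action form, which involves only the concentrations of reactants and products. In a crowded mixture each \gamma_i depends on every concentration, so species that do not appear in the reaction still change \Delta G.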

Coupling between ions is known to be an inevitable product of nonideal properties of ions in solutions (Pytkowicz 1979; Hovarth 1985; Zemaitis et al. 1986; Pitzer 1995; Barthel et al. 1998; Durand-Vidal et al. 2000; Fawcett 2004; Lee 2008; Kunz 2009; Kontogeorgis and Folas 2009; Fraenkel 2010b; Justice 1983; Fuoss and Accascina 1959; Fuoss and Onsager 1955). Ion-ion interactions have not had a prominent role in models of channels, transporters, or enzyme function (Tosteson 1989; Dixon and Webb 1979; Segel 1993). The coupled flows of ions that define transporters (and are characteristic of enzymes) have usually been ascribed entirely to the ion-protein interaction. Perhaps some flows are coupled because of interactions of ions among themselves in the crowded nonideal environments near, if not in, the catalytic active sites.

Conclusion

The catalytic active sites of enzymes can be defined using a modern computational program working with a database of enzyme structure. These active sites have large numbers of acid and base side chains. The volume of the catalytic active sites is well defined by modern computational analysis of protein structure. The volume of catalytic active sites is small. The number density of acid and base side chains is very high. The contents of catalytic active sites do not resemble the infinitely dilute solutions used in classical enzyme kinetics or force fields of modern molecular dynamics. The balance of steric and electrostatic forces in the highly concentrated environment of the catalytic active site is likely to be an evolutionary adaptation that has an important role in enzymatic catalysis, although we do not yet know what that role is. It seems wise to use mathematics designed to handle interactions in complex fluids when studying the catalytic active site of enzymes. It seems wise to seek the reason evolution fits the charged surface of the active site to the substrate as a glove fits a hand.

Acknowledgments

Mr. Jimenez-Morales was supported by Becas Talentia Excellence Grant (Andalusian Ministry of Innovation, Science and Enterprise, Junta de Andalucia, Spain) and funding from Dr. Liang’s laboratory. Dr. Liang is supported by the NIH GM079804 and GM086145, and the NSF DBI 1062328 and DMS-0800257. Dr. Eisenberg was supported by NIH GM076013.

Copyright information

© European Biophysical Societies' Association 2012