Proton transfer during class-A GPCR activation: do the CWxP motif and the membrane potential act in concert?

Correspondence: zhangc@ibp.ac.cn (X. C. Zhang)


INTRODUCTION
G-protein-coupled receptors (GPCRs) form the largest receptor superfamily of eukaryotic cells (Flock et al. 2017). A typical GPCR contains seven transmembrane helices (TMs 1–7) (Palczewski et al. 2000). Upon activation, the GPCR opens a cavity on its cytoplasmic side, between TMs 5–6 and the remainder of the receptor, to interact with downstream effectors such as G-proteins and arrestins (Kang et al. 2015; Rasmussen et al. 2011). Based on phylogenetic analysis, GPCRs have been further categorized into several families. Among them, the rhodopsin-like class A (so named in the A–F classification system; Attwood and Findlay 1994; Kolakowski 1994) is the largest family, including over 700 members in the human genome alone. Class-A GPCRs (hereafter referred to simply as "GPCRs") possess a number of highly conserved fingerprint motifs (Nygaard et al. 2009), which presumably play important functional roles in signaling mechanisms shared by all members of the family.
While both ligand binding on the extracellular side and interactions with effectors on the intracellular side have been extensively studied and their mechanisms are fairly well understood (Kang et al. 2015; Katritch et al. 2012; Rasmussen et al. 2011), the detailed mechanisms of transmembrane signaling remain under debate. Recently, we proposed the electrostatic transmembrane potential (ΔΨ)-driving hypothesis, a mechanism of agonist-induced, proton transfer-mediated activation for GPCRs (Zhang et al. 2014). This hypothesis is based on an extensive range of biochemical studies, on structural data from decades of GPCR studies regarding the most conserved structural features of GPCRs, and on a long-standing belief that these shared features are associated with a common activation mechanism for the entire class-A family (Schwartz and Rosenkilde 1996). Since GPCRs carry electric charges, usually non-uniformly distributed, activation of a GPCR is inevitably affected by the membrane potential (Barchad-Avitzur et al. 2016; Birk et al. 2015; Mahaut-Smith et al. 2008; Rinne et al. 2013; Sahlholm et al. 2008). More specifically, according to our hypothesis, activation of a GPCR is associated with the redistribution of electric charges within the receptor. In the ground state of the GPCR, a key proton-titratable residue, D2.50 (as per B-W numbering (Ballesteros and Weinstein 1995)), maintains a protonated state. This aspartate residue is 92% conserved in all GPCRs (where percentage conservation of a given position is estimated according to the online database GPCRdb (Isberg et al. 2016)). The electrostatic force exerted by ΔΨ on the protonated D2.50 is balanced by hydrophobic mismatch forces as well as by an amphipathic helix-8 (H8) at the C-terminal end of TM7. This amphipathic helix is embedded in the intracellular surface of the membrane bilayer and links TMs 1, 2, and 7 through a hydrophobic core. Upon activation, agonist binding induces a rotation of TM3 relative to TM2.
In turn, a conserved polar residue at position 3.39 triggers deprotonation of D2.50. The released proton is driven by ΔΨ and moves along a conserved proton wire. It enters a cavity, termed the middle cavity (or TM6 clamp (Hulme 2013)), and protonates a water molecule trapped within. Interestingly, the protonated water ion is insulated within a hydrophobic environment. A group of conserved hydrophobic residues (including L/I/V-2.43, L2.46, L/I-3.43, I/L/M/V-3.46, L/V/I/A-6.37, and I/V/M/L-6.40) forms the middle cavity (Zhang et al. 2013), and their importance in GPCR activation was also highlighted by a recent structural survey (Venkatakrishnan et al. 2016). The protonated water ion (H₃O⁺) is then again subjected to an electrostatic force from ΔΨ; this extra force cannot be balanced by H8, thus promoting the conformational change of the receptor required for downstream signaling. In general, amphipathic helices are often located on the cytosolic side of integral membrane proteins that undergo large conformational changes during their functional cycles, and they constrain the terminal movement of associated TM helices to the cytosolic surface of the lipid bilayer (Zhang et al. 2018). In addition to the conserved H8, a class-A GPCR often contains a short amphipathic helix in the intracellular loop-2 region, probably playing an anchoring function similar to that of H8. Taken together, the putative activation process of class-A GPCRs is hereafter called the "D2.50 switch."
It is encouraging that the functions of many fingerprint motifs and conserved residues, as well as of the amphipathic H8 of GPCRs, can be explained with our newly proposed ΔΨ-driving hypothesis (Zhang et al. 2014). One exception, however, is a highly conserved motif in TM6, whose functional roles remain enigmatic. More specifically, TM6 contains a conserved CWxP motif (where "x" stands for any amino acid residue) (Nygaard et al. 2009).
P6.50 (99% conserved) of this motif generates a break in TM6 that acts as a hinge, presumably facilitating the conformational change observed during GPCR activation. However, the functional roles of C6.47 and W6.48 remain controversial at best. Interestingly, Cys and Trp are the two types of amino acid residues showing the most extreme pattern of conservation in a statistical survey; namely, both are frequently located at positions that are either highly conserved or highly degenerate (i.e., poorly conserved) (Marino and Gladyshev 2010). This observation suggests that strong selective pressure exists to keep Cys as well as Trp residues in crucially important locations (e.g., functional sites), whereas in other positions Cys and Trp residues are less likely to be conserved because few codons encode them. In the following, we discuss the unique properties and putative functional roles of C6.47 and W6.48 in the framework of our ΔΨ-driving hypothesis.

DEPROTONATION OF CYS6.47
C6.47 is conserved in nearly 70% of all class-A GPCRs, with Ser being a distant second choice (10%) for this position. Unlike the other conserved pair of Cys residues in the GPCR superfamily, which forms a characteristic disulfide bond on the extracellular side of the receptor, C6.47 is not involved in any disulfide bond formation.
For integral membrane proteins, any existing disulfide bonds are exclusively located on the extracellular side, as oxidative environments are only available in the lumen of the endoplasmic reticulum (ER) and in the extracellular space. In the transmembrane region, in contrast, Cys residues are not involved in disulfide bonds but are often exchangeable with Ser residues. Cys and Ser residues are similar in geometry and are considered to possess similar chemical properties (Marino and Gladyshev 2010). On the one hand, Cys is classified as a polar residue based on its physicochemical properties. On the other hand, Cys residues that are not disulfide bonded are mostly found in buried positions and are therefore considered hydrophobic. This peculiar discrepancy suggests that if Cys residues are positioned inside a protein, it is likely for functional reasons, even though this carries an energetic cost, as it interferes with folding and stability. Why then does position 6.47 predominantly favor Cys over Ser? One clear difference between Cys and Ser is their pKa values: Cys (pKa 8.5 in solution (Poole 2015)) is much easier to deprotonate than Ser (pKa 14.2). In fact, of all the naturally occurring amino acids, Cys has the intrinsic pKa closest to physiological pH. This pKa difference partially explains why Cys pairs are able to form disulfide bonds, while Ser residues never form similar covalent bonds: in disulfide bond formation, one of the Cys residues is deprotonated first, thus permitting a nucleophilic attack on the other Cys residue (Poole 2015). As the most strictly selected amino acid residue in evolution (Marino and Gladyshev 2010), cysteine is among the most frequently used residues in catalytic sites of enzymes, often activated by a neighboring general base, thus initiating a nucleophilic attack (Lu et al. 2017).
Furthermore, analogous to an Asn (Gln) residue mimicking the protonated state of an Asp (Glu) residue, a Ser residue may mimic the protonated state of a Cys residue. Therefore, if a Cys residue is not exchangeable for a Ser residue at a given position in a protein, it is likely that the deprotonation of this Cys residue, and thus the change of its protonation status, is essential for the function of the protein. Similar to an acidic residue, a Cys residue is more likely to be deprotonated when its micro-environment becomes electropositive (Awoonor-Williams and Rowley 2016). Under such conditions, the pKa value of the Cys residue can decrease to as low as 2.9. Therefore, the question arises as to whether C6.47 switches its protonation status during GPCR activation.
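To put the pKa values quoted above in perspective, the following minimal sketch applies the Henderson-Hasselbalch relation to estimate the deprotonated fraction of a titratable group. The pH of 7.4 and the function name are illustrative assumptions, not values from the text, and the relation describes an ideal group in solution; the effective pKa of a buried residue may differ substantially.

```python
def deprotonated_fraction(pka: float, ph: float = 7.4) -> float:
    """Henderson-Hasselbalch: fraction of a titratable group that is deprotonated.

    ph = 7.4 is an assumed physiological pH, used for illustration only.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values quoted in the text; the labels are illustrative.
PKA_VALUES = {
    "Cys, in solution": 8.5,
    "Ser": 14.2,
    "Cys, electropositive micro-environment": 2.9,
}

for label, pka in PKA_VALUES.items():
    fraction = deprotonated_fraction(pka)
    print(f"{label:40s} pKa {pka:4.1f}  deprotonated fraction {fraction:.2e}")
```

Under these assumptions, a few percent of free Cys thiols are deprotonated at pH 7.4, Ser is essentially never deprotonated, and a Cys with a pKa of 2.9 is almost fully deprotonated.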
C6.47 is strategically located on the cytoplasmic side of the helix kink at P6.50 and on the interface between TM6 and TM7. In ground-state crystal structures of GPCRs (e.g., of the A2A adenosine receptor (A2AAR)/4EIY (Liu et al. 2012)), the Sγ atom of C6.47 is located close to the δ-amide group (a hydrogen bond donor) of the N7.45 sidechain (at a distance of ~3.8 Å and with a proper angle for hydrogen bonding). This hydrogen bonding encourages deprotonation of C6.47. Position 7.45 is conserved as a polar residue, mainly Asn (67%), and belongs to the NSxxNPxxY motif in TM7 (Fredriksson et al. 2003). In contrast to the ground state, in the active state (e.g., in the β2 adrenergic receptor (β2AR)/3SN6 (Rasmussen et al. 2011)), C6.47 and N7.45 move slightly apart (to a distance of ~4.3 Å and an unfavorable angle for hydrogen bonding). This separation allows C6.47 to become protonated, and it is accompanied by the movement of TM6 away from the rest of the transmembrane helix bundle. In addition, in the active state capable of binding downstream effectors (e.g., in β2AR/3SN6), the TM6 helix becomes less regular and curved within the region on the cytoplasmic side of the P6.50 kink, a change consistent with the observed outward movement (i.e., opening) of the intracellular end of TM6. This helix curving distorts the backbone hydrogen bonds relative to those of an ideal α-helix, in particular exposing the backbone carbonyl oxygen of residue 6.43. As a consequence, this exposed oxygen atom forms a hydrogen bond with the sidechain thiol (sulfhydryl) group of C6.47. In this case, C6.47 most likely acts as the hydrogen bond donor. Therefore, formation of this potential hydrogen bond favors protonation of C6.47. In summary, the micro-environment of C6.47 promotes its deprotonation in the ground state and its protonation in the active state.
In the context of our ΔΨ-driving hypothesis, such an alteration in protonation status favors conformational change during the activation transition (Zhang et al. 2014). More specifically, the protonated C6.47 is subjected to an electrostatic force exerted by ΔΨ, which points in the intracellular direction. Together with hydrophobic mismatch forces, the electrostatic force generates a mechanical torque and facilitates both the opening of the intracellular side of the GPCR and the formation of the binding site for downstream effectors. In this sense, any ligand that weakens the C6.47-N7.45 interaction between TM6 and TM7 in the ground state should promote protonation of C6.47, subsequently stabilizing the active state. Therefore, in addition to the D2.50 switch, C6.47 likely serves as an auxiliary activation switch in GPCRs (Fig. 1).
Within the electric field of ΔΨ, C6.47 is located on the positive side of the key acidic residue D2.50. Thus, hypothetically, C6.47 becomes deprotonated during establishment of the ground state, probably by forming the hydrogen bond with N7.45, and the released proton moves along a proton path (via N7.45-H₂O), finally protonating D2.50. However, during GPCR activation, the proton is unable to move back from D2.50 to C6.47, because this process would be against the electric field of ΔΨ. Instead, another proton is likely to be recruited from the extracellular space, a process likely mediated by a water molecule located at the kink of P6.50 (see the 1.8-Å crystal structure of A2AAR/4EIY). Furthermore, the proton released from D2.50 during activation moves to a water molecule trapped in the middle cavity before being released into the cytoplasm. Thus, one effective proton apparently moves from the extracellular space to the intracellular side of the GPCR during the activation process. In agreement with this argument, the gating charge of the muscarinic acetylcholine receptor-2 (M2 AChR) was previously estimated to be 0.85 proton charges per GPCR molecule (Ben-Chaim et al. 2006).
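As a rough guide to the energy scale involved, the sketch below computes the electrostatic work done when a charge crosses the full membrane potential. The 100-mV potential is the value used in Box 1; the temperature of 310 K and the function name are assumptions made here for illustration.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def work_in_kT(charge_e: float, delta_psi_v: float, temperature_k: float = 310.0) -> float:
    """Electrostatic work q * delta_psi, expressed in units of k_B * T.

    charge_e is in elementary charges, so charge_e * delta_psi_v is in eV.
    temperature_k = 310 K is an assumed physiological temperature.
    """
    work_ev = charge_e * delta_psi_v
    return work_ev / (K_B_EV_PER_K * temperature_k)


one_proton = work_in_kT(1.0, 0.100)      # one full proton charge across 100 mV
gating_charge = work_in_kT(0.85, 0.100)  # gating charge quoted for M2 AChR

print(f"1.00 e across 100 mV: {one_proton:.2f} kT")
print(f"0.85 e across 100 mV: {gating_charge:.2f} kT")
```

Under these assumptions, moving one effective proton across the membrane releases a few kT, enough to bias a conformational equilibrium noticeably.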
In accordance with the new hypothesis on the functional role of C6.47, the mutational variants C6.47T in β2AR (Shi et al. 2002) and C6.47S/R in the thyrotropin hormone receptor (Biebermann et al. 2012) result in constitutive Gαs activation. These mutations are equivalent to protonation of Cys or to carrying an extra positive charge, thus stabilizing the active state in the presence of ΔΨ according to our hypothesis. Similarly, mutations at C6.47 in the cannabinoid receptors CB1 and CB2 modified both ligand recognition and receptor activation (Pei et al. 2008; Picone et al. 2005). It has been proposed that C6.47 participates in the rearrangement of the TM6 and TM7 interface during GPCR activation (Olivella et al. 2013). However, in this previous hypothesis, the C6.47-N7.45 hydrogen bond was thought to be formed between the thiol group of C6.47 and the δ-carbonyl group of the N7.45 sidechain, and no Cys deprotonation was postulated. Given that the resolutions of currently available GPCR structures are not sufficiently high to determine the atom types at the terminus of the N7.45 sidechain, it is possible that the thiolate group of C6.47 forms a hydrogen bond with the δ-amide group of the N7.45 sidechain (as proposed here). Furthermore, such a sidechain rotamer conformation of N7.45 is in agreement with the original assignment in the 1.8-Å resolution, ground-state crystal structure of A2AAR/4EIY.
In addition, C6.47 of β2AR was found to be more reactive with some sulfhydryl-specific reagents in the active state than in the ground state (Rasmussen et al. 1999). This observation also indicates that GPCR activation alters the micro-environment of the thiol group of C6.47. However, it is generally assumed that, in solution, deprotonated thiol groups are more chemically reactive. Therefore, the change of C6.47 reactivity appears to be opposite to what would be predicted by our hypothesis. The underlying cause of this discrepancy may be a difference in reactivity of the thiol/thiolate group, depending on whether it is surrounded by the solvent phase or by the hydrophobic environment inside a membrane protein. It has been argued that the nucleophilicity of a Cys residue actually increases with pKa (Poole 2015), which is exactly what happens when N7.45 moves away from C6.47. Future experimental studies (e.g., Raman spectroscopic analysis (Rothschild et al. 1993; Saint Clair et al. 2012)) on the C6.47 protonation status in both the ground and active states will help to verify the functional roles of C6.47.

POLARIZATION OF TRP6.48
At position 6.48 within the CWxP motif of GPCRs, tryptophan is the dominant type of amino acid residue (68% conserved), and phenylalanine comes second (16%). W6.48 is in the middle of TM6, "vertically" located at the bottom of the ligand-binding pocket and above the conserved water-trapping "middle cavity". A recent NMR study suggests that W6.48 is involved in the same micro-switch as D2.50 (Eddy et al. 2018), probably through a shared hydrogen bond network (Liu et al. 2012). In addition, based on structures of the ground state, the sidechain rotamer of W6.48 was proposed earlier to assume a toggle role in GPCR activation (Shi et al. 2002). Surprisingly, however, W6.48 does not appear to change its sidechain rotamer in crystal structures of agonist-bound GPCRs reported later (Nygaard et al. 2009; Rasmussen et al. 2011; Rosenbaum et al. 2007). Furthermore, in many cases (for instance, GPR119, GPR39, β2AR, and the NK1 receptor), it was found that Ala substitutions at W6.48 eliminate basal activity and strongly impair agonist-induced activation of downstream G-proteins (i.e., efficacy) without affecting agonist affinity (i.e., potency) (Holst et al. 2010). Thus, although it does not appear to be involved in a rotamer toggle switch induced by agonist binding, W6.48 is likely to function in a common activation mechanism. These observations raise the question as to what renders the Trp residue "irreplaceable" at position 6.48. Is it possible that W6.48 also plays a functional role in the context of ΔΨ?
Statistically, integral membrane proteins contain more Trp residues than their soluble counterparts (Schiffer et al. 1992). Considering that only one codon encodes tryptophan, the disproportionately abundant Trp residues likely play important roles in all classes of membrane proteins. The indole sidechain of Trp has an intrinsic electric dipole moment larger than 2 Debye, approximately pointing from the benzene ring to the pyrrole ring (Callis 1997). Since it contains delocalized electrons, the sidechain of a Trp residue can be further polarized within the indole plane by an external electric field, for example, that of the transmembrane (or local) electrostatic potential. Accordingly, among all natural amino acid residues, tryptophan exhibits the highest polarizability. In other words, the indole moiety is characterized by a large dielectric constant along the direction of its long axis. Although their intrinsic dipoles are close to zero, sidechains of other aromatic amino acid residues may also possess similar but weaker induced dipoles. In general, the interactions between ΔΨ and its embedded membrane protein can be divided into two parts: (1) the electrostatic forces on charged groups and (2) interactions of electric dipoles with the electric field of ΔΨ (Hill 1985). The electrostatic force on each charged group is along the direction of the local electric field (i.e., the electric line of force); the total electrostatic force is balanced by the forces of hydrophobic mismatch, keeping the protein molecule at an equilibrium position inside the lipid bilayer.

[Figure caption, apparently Fig. 1; the beginning is missing] ... and more hydrogen bonds in TM6 become distorted than in the ground state. Consequently, the Sγ atom of C6.47 becomes protonated and forms a hydrogen bond with the backbone O atom of residue 6.43. The extra positive charge of the captured proton (golden sphere) is subjected to an electrostatic force (indicated with the blue arrow), which promotes and/or stabilizes the active state.
Furthermore, if its dipole aligns with the electric field of ΔΨ (corresponding to a low energy state), the Trp residue may stabilize the orientation of the membrane protein immersed in the electric field of ΔΨ. The combination of both intrinsic and induced dipoles of Trp provides a possible explanation for the currently unknown function of the conserved W4.50 (94%), whose sidechain is also aligned with the electric field of ΔΨ. In addition, the energy of the interaction between an electric dipole and its surrounding electric field is given by the negative dot product of the dipole and field vectors. Therefore, in a non-uniform electric field, an electric dipole prefers to move into a region of stronger electric field as well as to align with the electric line of force. As an example of such interactions, multiple Trp residues are often found to form a peripheral belt on the surface of the membrane protein near each of the two membrane-solvent interfacial regions, where the electric field of ΔΨ is strongly non-uniform, diminishing quickly away from the membrane. Such a Trp belt is likely to contribute to stabilization of the protein molecule in its proper position and orientation relative to the lipid bilayer. More generally, for a given ΔΨ, minimization of the total energy of the membrane protein-lipid bilayer complex system requires maximization of the effective dielectric constant(s) inside the membrane protein molecule, consequently increasing the surface charge density on both sides of the protein exposed to the solvent. In other words, in the presence of ΔΨ, embedding polar groups, residues of high polarizability, and hydrogen bond networks inside the transmembrane region of the membrane protein is likely to be energetically favored for the overall stability of the protein-membrane complex system.
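The dipole arguments in this paragraph can be summarized with the standard expressions for a point dipole in an external field. The symbols θ (the angle between dipole and field) and α (the polarizability of the indole group along the field) are introduced here for illustration and do not appear in the original text; the induced-dipole term assumes a simple linear response.

```latex
% Energy of a permanent dipole \vec{\mu} in a field \vec{E}:
% lowest when the dipole is aligned with the field (\theta = 0).
U_{\mathrm{perm}} = -\vec{\mu}\cdot\vec{E} = -\mu E \cos\theta

% Force on a fixed dipole in a non-uniform field:
% an aligned dipole is pulled toward regions of stronger field.
\vec{F} = \nabla\left(\vec{\mu}\cdot\vec{E}\right)

% Induced dipole under linear response, and its energy:
\vec{\mu}_{\mathrm{ind}} = \alpha\,\vec{E}, \qquad
U_{\mathrm{ind}} = -\tfrac{1}{2}\,\alpha E^{2}
```

Both energy terms become more negative as the field grows, which is consistent with the preference of Trp sidechains for the strongly non-uniform field near the membrane-solvent interfaces.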
Thus, the dipole of the indole sidechain of a Trp residue may reshape the surrounding electric field, resulting in a locally "focused" electrostatic field, especially along the direction of the dipole (Fig. 2). The combined electric field from both the Trp dipole(s) and ΔΨ further influences the properties of the host membrane protein.
These physical properties of a Trp residue may partially explain why W6.48 within the core of the 7-TM helix bundle is highly conserved across GPCRs. In reported crystal structures of GPCRs, the W6.48 sidechain is often "vertically" oriented such that both its intrinsic and induced dipoles are aligned with the electrostatic field of the negative-inside ΔΨ. Therefore, the combined dipole of W6.48 is likely to strengthen the electrostatic field underneath (i.e., in the cytoplasmic direction), where a few water molecules are trapped inside the middle pocket (~15 Å from W6.48). In this pocket, the strength of the electric field from the W6.48 dipole is of about the same magnitude as that of ΔΨ (see Box 1). As mentioned above, during activation of the GPCR, a proton released from D2.50 is captured by one of the trapped water molecules, and the strengthened electric field thus exerts an extra force on the protonated water ion (H₃O⁺), favoring the conformational change in the cytoplasmic region of the GPCR. In addition, the conserved F6.44 (76%) is located between W6.48 and the middle pocket, presumably serving as an insulator to prevent electrical leakage across the receptor. In summary, both the ligand-binding cavity on the extracellular side and the G-protein-binding cavity on the cytoplasmic side focus the electric field of ΔΨ onto the middle transmembrane region of the receptor; W6.48 further enhances the electric field on the conserved hydrophobic middle pocket, optimizing the process of proton transfer-mediated activation.

[Figure caption, apparently Fig. 2; the beginning is missing] Electric potentials are contoured as thin lines. In the middle cavity region, the combined electric field is stronger than that of ΔΨ alone. Thus, the protonated water ion (golden sphere) is subjected to a stronger electric force (cyan line), which drives the conformational change of the activation process.

To verify our hypothesis, it is essential to determine experimentally the extent to which W6.48 is polarized by ΔΨ, in combination with quantitative theoretical calculations. In addition, mutations at W6.48 may have compound effects on the buried hydrogen bond network formed around the D2.50 switch, and possibly on the C6.47 switch as well. Thus, interpretation of the data from mutagenesis studies requires caution.
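The order-of-magnitude comparison referred to in Box 1 can be checked numerically. The sketch below is a minimal calculation under the box's own simplifications (vacuum permittivity, a 100-mV potential across 30 Å, a 2-Debye point dipole, and a 15-Å on-axis distance); the function and variable names are hypothetical and chosen here for illustration.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
DEBYE = 3.33564e-30         # 1 Debye in C*m
ANGSTROM = 1e-10            # 1 Angstrom in m


def capacitor_field(delta_psi_v: float, thickness_m: float) -> float:
    """Uniform field of a parallel-plate capacitor, E_c = delta_psi / d (V/m)."""
    return delta_psi_v / thickness_m


def dipole_axial_field(mu_c_m: float, r_m: float) -> float:
    """On-axis field of a point dipole, E_d = mu / (2 * pi * eps0 * r^3) (V/m)."""
    return mu_c_m / (2.0 * math.pi * EPS0 * r_m ** 3)


# Membrane treated as a capacitor: 100 mV across 30 Angstrom.
e_c = capacitor_field(0.100, 30 * ANGSTROM)

# Equivalent surface charge density, sigma = eps0 * E_c, in elementary charges per A^2.
sigma_e_per_A2 = EPS0 * e_c / E_CHARGE * ANGSTROM ** 2

# Indole dipole of 2 Debye, evaluated 15 Angstrom away along its axis.
e_d = dipole_axial_field(2 * DEBYE, 15 * ANGSTROM)

print(f"E_c   = {e_c:.2e} V/m")
print(f"sigma = {sigma_e_per_A2:.2e} e/A^2")
print(f"E_d   = {e_d:.2e} V/m")
print(f"E_d / E_c = {e_d / e_c:.2f}")
```

With these inputs the two fields come out within a factor of about one of each other, in line with the statement that the dipole field in the middle pocket is of about the same magnitude as that of the membrane potential. A relative permittivity greater than one inside the protein would rescale the dipole field, so the result should be read as an estimate only.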

SUMMARY
Here we shed new light on the role of the electrostatic transmembrane potential in the function of membrane proteins. Based on the experimental evidence discussed, we have updated our ΔΨ-driving hypothesis for class-A GPCRs (Fig. 3). The effects of ΔΨ on membrane-embedded tryptophan and cysteine residues are likely to be of general importance for a more detailed understanding of the principles governing membrane protein function, and should attract attention from other fields studying the structures and molecular dynamics of membrane proteins.
Abbreviations
A2AAR  A2A adenosine receptor
β2AR  β2 adrenergic receptor
GPCR  G-protein-coupled receptor
TM  Transmembrane (helix)

Box 1 The electric field
For a parallel-plate capacitor (e.g., the lipid bilayer), the electric field E_c is given as

$$E_c = \frac{\sigma}{\varepsilon_0},$$

where the surface density (σ) of electric charge is ~2 × 10⁻⁵ e₀/Å² for a 100-mV ΔΨ, 30-Å separation, and ε₀ = 9 × 10⁻¹² F/m. The electric field E_d along the direction of a dipole moment is the following:

$$E_d = \frac{\mu}{2\pi\varepsilon_0 r^3}.$$

The dipole moment (μ) of an indole group is 2 Debye (≈0.4 e₀·Å) or larger. The distance (r) between W6.48 and the protonated water ion in the middle cavity is ~15 Å (e.g., in the crystal structure of A2AAR/4EIY