In vitro expression and analysis of the 826 human G protein-coupled receptors
G protein-coupled receptors (GPCRs) are involved in all human physiological systems where they are responsible for transducing extracellular signals into cells. GPCRs signal in response to a diverse array of stimuli including light, hormones, and lipids, where these signals affect downstream cascades to impact both health and disease states. Yet, despite their importance as therapeutic targets, detailed molecular structures of only 30 GPCRs have been determined to date. A key challenge to their structure determination is adequate protein expression. Here we report the quantification of protein expression in an insect cell expression system for all 826 human GPCRs using two different fusion constructs. Expression characteristics are analyzed in aggregate and among each of the five distinct subfamilies. These data can be used to identify trends related to GPCR expression between different fusion constructs and between different GPCR families, and to prioritize lead candidates for future structure determination feasibility.
KEYWORDS G protein-coupled receptors, insect protein expression, surface expression analysis, fusion construct
Table: The families of 826 GPCRs and their structures. Columns: number of receptors; number of structures currently available.
Detailed three-dimensional structural information is of great importance for understanding the physiological functions of GPCRs and for designing new drugs to target them. In recent years, persistent efforts of researchers and implementation of new technologies have contributed to the accelerated development of GPCR structural studies. In 2000, the first mammalian GPCR structure was elucidated (Palczewski et al., 2000). Since then, the structures of 30 different GPCRs (Fig. 1A, 1B and Table 1) have been reported. While this represents real progress, it comprises only a fraction of the almost 300 GPCRs that are known to be involved in psychiatric diseases, cancer, and other maladies, and an even smaller fraction of the 826 GPCRs found in humans (Katritch et al., 2013).
Given the challenges in determining GPCR structures and the large number of structures that remain to be solved, one approach to maintain the recently developed momentum is to prioritize those GPCRs with the highest likelihood of success. As protein expression is the critical first step in the structure determination process, it makes sense to pursue the receptors with high expression levels first, as these are most likely to provide the highest yield after purification. In this study, we applied a comprehensive family-wide approach to express all 826 human GPCRs using two different construct designs. The comprehensive results (Table S1) are provided to facilitate future biochemical, pharmacological, and structural studies.
Constructs were then cloned into a modified pFastBac1 vector for expression in Spodoptera frugiperda (Sf9) cells (see “MATERIALS AND METHODS”). Four types of expression systems have been employed in protein production for structural studies of GPCRs to date: E. coli, yeast, mammalian cells, and insect cells (Zhao and Wu, 2012). We chose the Sf9 expression system because it presently has the most established track record: 25 of the 30 structurally determined GPCRs were expressed in this system (Fig. 1B).
In this study, expression levels were detected using a fluorescent probe that consists of an α-flag FITC-coupled antibody that specifically recognizes a FLAG sequence inserted at the N-terminus of each construct (Fig. 2). Receptor cell surface expression and total receptor expression were determined by flow cytometry using the fluorescence signal detected from cells pre-incubated with the fluorescent probe in the absence (for surface expression % and surface density values) or presence (for total expression % and total density values) of a mild detergent, respectively. This approach allowed us to quantify the percentage of cells expressing GPCRs, as well as the relative receptor expression, at the surface or overall (total).
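The two readouts described above (percentage of expressing cells and receptor density) can be illustrated with a minimal sketch. It assumes that per-cell FITC intensities are available as a plain list and that a fixed threshold (the hypothetical `gate` parameter) separates FLAG-positive cells from negative ones. The function name, the threshold, and all numbers are illustrative and are not taken from the study.

```python
from statistics import mean

def summarize_population(intensities, gate):
    """Return (% of cells above the gate, mean intensity of those cells).

    `gate` is a hypothetical fluorescence threshold; in practice it would be
    set from an unstained or non-expressing control sample.
    """
    positive = [x for x in intensities if x > gate]
    percent_positive = 100.0 * len(positive) / len(intensities)
    # Density is summarized as the mean fluorescence intensity (MFI) of the
    # positive cells; None when no cell passes the gate.
    mfi = mean(positive) if positive else None
    return percent_positive, mfi

# Made-up per-cell intensities in arbitrary units.
surface_cells = [5, 8, 120, 300, 7, 250, 6, 180]     # probe without detergent
total_cells = [90, 110, 400, 650, 8, 500, 95, 420]   # probe with mild detergent

surface_pct, surface_mfi = summarize_population(surface_cells, gate=50)
total_pct, total_mfi = summarize_population(total_cells, gate=50)
print(surface_pct, surface_mfi)  # 50.0 212.5
print(total_pct, total_mfi)
```

The same function is applied to both samples; only the sample preparation (with or without detergent) distinguishes the surface readout from the total readout.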
General GPCR expression levels
Table: Statistics of expression levels among the 1,652 GPCR constructs. Columns: % surface expression; % total expression.
Table: High expressing GPCR constructs by family. Columns include: no current structure.
Comparison of expression between Nt_BRIL and ICL3_BRIL constructs
Table: Statistics of GPCR expression levels by construct. Columns: % surface expression; % total expression.
High expression for the Nt_BRIL construct of a receptor did not always correspond to high expression for the ICL3_BRIL construct. For example, 54 Nt_BRIL and 65 ICL3_BRIL constructs had surface expression levels >80%, yet only 22 receptors expressed at this level for both constructs (Tables S2–S4). Similarly, 164 Nt_BRIL and 309 ICL3_BRIL constructs had surface expression <30%, compared to 94 receptors with low expression for both constructs.
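By inclusion-exclusion, the overlap figures above also give the number of receptors that reached a given level with at least one of the two constructs. The short calculation below uses only the counts quoted in the text; the function name is illustrative.

```python
def either_construct(n_nt, n_icl3, n_both):
    """Receptors meeting a criterion with at least one of the two constructs."""
    return n_nt + n_icl3 - n_both

# Surface expression >80%: 54 Nt_BRIL, 65 ICL3_BRIL, 22 receptors with both.
high_either = either_construct(54, 65, 22)
# Surface expression <30%: 164 Nt_BRIL, 309 ICL3_BRIL, 94 receptors with both.
low_either = either_construct(164, 309, 94)

print(high_either)  # 97
print(low_either)   # 379
```

This works because each receptor has exactly one Nt_BRIL and one ICL3_BRIL construct, so construct counts map directly onto receptor counts.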
Expression between the GPCR families
Table: Median expression levels by family and construct. Columns: % surface expression; % total expression.
In Nt_BRIL constructs, Glutamate and Adhesion families showed the highest surface density. For the total density of ICL3_BRIL constructs, there are notable differences between the different families with expression ranking as: Glutamate > Secretin > Adhesion > Frizzled/Taste2 > Rhodopsin family (Fig. 4). When the fusion partner BRIL is inserted at ICL3, Glutamate family receptors collectively produced the best expression levels, although Rhodopsin family receptors constitute the majority of the receptors whose surface expression levels exceed 80% (Fig. 4).
Table: Representative receptors within GPCR families with high expressing Nt_BRIL constructs and low expressing ICL3_BRIL constructs. Columns: % surface expression; % total expression.
Table: Example of variance in expression despite high sequence similarity in the adrenergic and adenosine receptors within the Rhodopsin family. Columns: surface expression (%); total expression (%).
Table: The expression level of Nt_BRIL and ICL3_BRIL constructs of olfactory receptors and non-olfactory receptors from the Rhodopsin family. Columns: % surface expression; % total expression. Rows: Olfactory Rhodopsin (n = 422); Non-Olfactory Rhodopsin (n = 297).
Identifying trends in the results
The GPCR structures that have been solved with fusion partners did not share the same precise placement of the fusion partner, whereas the constructs in this study did; a lack of positional optimization can therefore be expected when reviewing these results. Yet, we can define some general trends from the large amount of data collected in this study. Overall, the expression levels of the 826 Nt_BRIL constructs were higher than those of the ICL3_BRIL constructs, which suggests that a well-organized N-terminus is helpful for effective trafficking of the receptor to the cell membrane after translation. Another possible explanation is that the N-terminal fusion partner makes the tertiary structure more stable and, as a result, less toxic to the cell.
Within the Rhodopsin family, the β1 and β2 adrenergic receptors have high sequence identity. However, they displayed very different expression levels in this Sf9 expression system. This indicates that the expression level, or other properties of a receptor, can be affected by very few residues. Just as in the construct optimization process, point mutation screening could identify a more stable version of the protein (Zhang et al., 2014). The expression data of the Frizzled/Taste2 family suggest that the expression level is closely related to the protein’s properties. In other words, a good expression level is one of the characteristics of a stable receptor.
The differences between non-olfactory and olfactory receptors within the Rhodopsin family are mainly reflected in longer extracellular loops and in the conservation pattern of the 7TM domain. Analysis of the receptor sequence data from Uniprot showed that extracellular loop 2 (ECL2) and ECL3 in most olfactory receptors are generally longer than 20 and 35 amino acids, respectively. In the non-olfactory receptors, by contrast, either ECL1 or ECL2 is longer than 20 amino acids, or both loops are shorter than 20 amino acids. In GPCRs in general, the 7TM helical bundle is the most conserved component (Katritch et al., 2012). Across the more than 400 odorant receptors (Jiang and Matsunami, 2015), however, the most conserved domains are the intracellular loops and the seventh transmembrane helix (helix VII), while the sequence diversity of helices III, IV, and V, to which the odorant molecules bind, is very high (Gao et al., 2010; de March et al., 2015). These two characteristics may contribute to the low expression level and instability of the olfactory receptors. From the perspective of function, one odorant can stimulate several kinds of odorant receptors, while a single odorant receptor can be activated by numerous different odorants (Sanz et al., 2014). The functional peculiarity of olfactory receptors may therefore reflect their structural particularity.
Glycosylation is also known to affect the ability of the receptor to reach the cell surface. This fact is especially relevant to some of the Glutamate family receptors, like GABAB1 and GPRC6. GABAB1 contains five N-glycosylation sites in the extracellular domain; when all five sites were mutated, low surface expression was seen 24 h post-transfection (Deriu, 2005; Norskov-Lauritsen and Brauner-Osborne, 2015). GPRC6 was shown to be N-glycosylated at seven different sites in the extracellular domain in vitro. Mutation of any two sites was shown to affect the receptor’s surface expression (Norskov-Lauritsen and Brauner-Osborne, 2015; Norskov-Lauritsen et al., 2015). However, not all the Glutamate family receptors require glycosylation to maintain surface expression. For example, inhibition of N-glycosylation of mGlu1R did not change its surface expression level (Mody, 1999; Norskov-Lauritsen and Brauner-Osborne, 2015). In this study, truncation of the extracellular domain, which contains most of the glycosylation sites, contributed to the low expression levels of both GABAB1 and GPRC6A receptors.
Finally, the expression level on the membrane may also be affected by the exogenous environment. If a receptor is co-expressed or interacts with another receptor in its native physiological environment, it may be unstable and poorly expressed in the heterologous experimental system.
The expression study of these 1,652 GPCR constructs identified some familial trends and, importantly, identified several high expressing GPCRs for which no structural data currently exist. Based on these findings, future studies can prioritize work on these high expressing receptors, further optimize the constructs, and identify stabilizing ligands to assist with elucidation of each protein’s three-dimensional structure.
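As an illustration of how such a prioritization might be carried out on the tabulated results, the sketch below filters out receptors that already have a structure and ranks the rest by surface expression. The record keys, the 80% cutoff used as a default, and the receptor names and values are all hypothetical placeholders, not data from this study.

```python
def prioritize(records, min_surface_pct=80.0):
    """Return names of receptors without a structure above a cutoff, best first.

    Each record is a dict with hypothetical keys: 'name', 'surface_pct'
    (best surface expression % across constructs) and 'has_structure'.
    """
    candidates = [
        r for r in records
        if not r["has_structure"] and r["surface_pct"] > min_surface_pct
    ]
    candidates.sort(key=lambda r: r["surface_pct"], reverse=True)
    return [r["name"] for r in candidates]

# Placeholder records; names and values are invented for the example.
records = [
    {"name": "receptor_A", "surface_pct": 91.0, "has_structure": False},
    {"name": "receptor_B", "surface_pct": 95.0, "has_structure": True},
    {"name": "receptor_C", "surface_pct": 84.5, "has_structure": False},
    {"name": "receptor_D", "surface_pct": 40.0, "has_structure": False},
]
ranked = prioritize(records)
print(ranked)  # ['receptor_A', 'receptor_C']
```

Other criteria, such as total expression or surface density, could be added to the sort key in the same way.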
MATERIALS AND METHODS
Design of truncations and BRIL fusion sites was based on similarity with previously solved structures of GPCRs from different families. Unique receptor sequences for 826 GPCRs were derived from Uniprot, and 3D structural models were generated for each receptor’s 7TM domain with the automated ICM Build Model tool (Abagyan et al., 2015) using alignment with the closest homology template (Katritch et al., 2013). Structure-based positional Ballesteros-Weinstein (BW) numbers were assigned from the structural alignments with the templates as described in GPCRDB (Isberg et al., 2015).
The N-terminal truncation sites were designed using predicted structural features in the receptor’s N-termini derived from the corresponding structural templates. For those cases where the N-terminus included important structural elements that were resolved in the 3D template, the truncation site was designed upstream of this structural element. Thus, for Secretin family GPCRs, the N-termini were truncated at the first residues attributed to their 7TM domains (Siu et al., 2013). For chemokine and other Rhodopsin family receptors, which have the N-terminal Cysteine residues predicted to make an important disulfide bond to a Cysteine in ECL3, this prospective disulfide bond was included in the construct (Wu et al., 2010; Hanson et al., 2012). Otherwise, for Rhodopsin family receptors that had a missing or truncated N-terminus in their closest structural template, we used a default truncation upstream of the beginning of helix I at BW position 1.19.
The C-terminal truncation was universally applied at BW position 7.78, which in most receptors corresponds to the site ~10 residues after the end of helix VIII. The constructs thus include potential Cysteine palmitoylation sites in helix VIII residues, when present.
The N-terminal BRIL fusion (Nt_BRIL) constructs placed the BRIL sequence at the truncated position of the receptor N-terminus as described above.
The ICL3 BRIL insertion (ICL3_BRIL) constructs were designed based on truncated sequences using insertion sites in ICL3 as in the construct that was used to solve the crystal structure of 5HT2B (Wacker et al., 2013). According to this design, the BRIL sequence was inserted between BW positions 5.69 and 6.25, replacing ICL3 residues between these positions. In some rare cases when helices V and VI were shorter than in the template, additional residues from ICL3 were added to keep the helical structure in helices V and VI the same as in the 5HT2B construct.
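The construct design rules above can be sketched as simple string operations once each BW position has been mapped to a residue index. The example below is a minimal illustration under several assumptions: `bw_index` is a hypothetical mapping from BW numbers to 0-based indices, the receptor sequence is a toy string rather than a real protein, `BRIL` is a placeholder token rather than the actual fusion sequence, and the residues at the truncation and insertion positions are assumed to be retained. Vector-encoded elements (HA signal sequence, FLAG tag, His tag) are not modeled.

```python
BRIL = "<BRIL>"  # placeholder token; the real BRIL sequence is not given here

def nt_bril(seq, bw_index, n_trunc="1.19", c_trunc="7.78"):
    """N-terminal fusion: BRIL followed by the truncated receptor."""
    start, end = bw_index[n_trunc], bw_index[c_trunc]
    return BRIL + seq[start:end + 1]

def icl3_bril(seq, bw_index, n_trunc="1.19", c_trunc="7.78",
              left="5.69", right="6.25"):
    """ICL3 insertion: residues strictly between `left` and `right`
    are replaced by BRIL in the truncated receptor."""
    start, end = bw_index[n_trunc], bw_index[c_trunc]
    return seq[start:bw_index[left] + 1] + BRIL + seq[bw_index[right]:end + 1]

# Toy 40-character "sequence" and a hypothetical BW-to-index mapping.
toy_seq = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmn"
toy_bw = {"1.19": 3, "5.69": 15, "6.25": 22, "7.78": 35}

nt_construct = nt_bril(toy_seq, toy_bw)
icl3_construct = icl3_bril(toy_seq, toy_bw)
print(nt_construct)    # <BRIL>DEFGHIJKLMNOPQRSTUVWXYZabcdefghij
print(icl3_construct)  # DEFGHIJKLMNOP<BRIL>WXYZabcdefghij
```

Family-specific N-terminal truncation rules (for example, retaining a predicted disulfide-forming cysteine) would change only the `n_trunc` position passed to these functions.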
Gene synthesis and codon optimization were performed by GenScript. Overlap extension PCR cloning, a simple and reliable way to create recombinant plasmids, was used to subclone each gene into the vector. The expression vector was a modified pFastBac1 vector (Invitrogen) containing an expression cassette with a BamHI-flanked HA signal sequence followed by a FLAG tag at the N-terminus and a 10× His tag at the C-terminus. Once the recombinant donor plasmids were obtained, the cloning core transformed them into competent DH10Bac E. coli cells, which contain a bacmid and a helper plasmid that facilitate the combination of the donor plasmid and the bacmid into a recombinant bacmid.
Cell culture and transfection
The baculovirus (BV) expression core is a high-throughput platform supporting biomass production for GPCR structure and function studies. The platform transfects insect (Sf9) cells with the recombinant bacmids provided by the cloning core to produce recombinant baculovirus. Recombinant baculoviruses have been widely used as vectors to express heterologous genes in cultured insect cells. High-titer recombinant baculovirus (>10^8 viral particles per mL) was obtained using the Bac-to-Bac Baculovirus Expression System (Invitrogen). Forty mL of cells were harvested by centrifugation and stored at −80°C until use.
Quantitation of protein expression
The monoclonal ANTI-FLAG® M2-FITC antibody (Sigma-Aldrich: F4049), which is covalently conjugated to fluorescein isothiocyanate (FITC), recognizes the FLAG sequence at the N-terminus (Hanson et al., 2007). Therefore, α-flag FITC (2.5 µg/mL) was added to cells to quantify the percentage of cells with surface-expressing GPCRs and the density (mean fluorescence intensity; MFI) of GPCRs on the surface of those cells. α-flag FITC (2.5 µg/mL) with 1.5% Triton was added to cells to quantify the total expression levels, which include total percentage and total density. For the total and surface FITC expression assays, we mixed 10 µL of FITC working solution (with or without Triton) with 10 µL of cells, incubated the mixture at 4°C for 20 min, added 180 µL of 1× TBS (straight TBS, without BSA), and then ran the assay on a Guava flow cytometer. The Guava Express Plus GRN histogram statistics provide the count, cells/mL, mean signal intensity, and %CV for each population within a marker. Additionally, the % of total shows the percentage of the data displayed in that plot. Here, we use the mean signal intensity and the % of total for both total and surface expression.
The data were analyzed with Statistical Product and Service Solution (SPSS) software, which supports correlation analysis and cluster analysis. According to the Kolmogorov-Smirnov (K-S) test in SPSS, most of the indexes indicated that the expression levels in this study follow a skewed distribution. The distribution of the expression data was analyzed with GraphPad Prism.
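The kind of distribution check described above can be sketched with the Python standard library. This is not the SPSS procedure: it assumes the reference distribution is a normal distribution fitted with the sample mean and standard deviation, it computes only the K-S distance without a p-value (a formal test with estimated parameters would need the Lilliefors correction), and the data are invented for the example. Function names are illustrative.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic_vs_normal(values):
    """K-S distance between the empirical CDF and a normal distribution
    fitted with the sample mean and standard deviation."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

def sample_skewness(values):
    """Moment-based skewness; positive values indicate a long right tail."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Invented example data: one symmetric sample and one right-skewed sample.
symmetric = [10, 20, 30, 40, 50, 60, 70, 80, 90]
skewed = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40, 95]

for label, data in (("symmetric", symmetric), ("skewed", skewed)):
    print(label, ks_statistic_vs_normal(data), sample_skewness(data))
```

For skewed data such as these, medians are a more robust summary than means.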
This work was mainly done by the cores of iHuman Institute at ShanghaiTech University and supported by grants from the National Basic Research Program (973 Program) (Nos. 2014CB910400 and 2015CB910104). The authors thank Michael Hanson, Meihua Chu, and Martin Audet for thoughtful comments on this manuscript.
7TM, seven transmembrane; BW, Ballesteros-Weinstein; GPCR, G protein-coupled receptor; ICL3, intracellular loop 3; ICL3_BRIL, ICL3 insertion with apocytochrome b562 RIL; Nt_BRIL, N-terminal fusion with apocytochrome b562 RIL; Sf9, Spodoptera frugiperda (an insect cell line); SPSS, Statistical Product and Service Solution.
COMPLIANCE WITH ETHICS GUIDELINES
Xuechen Lv, Junlin Liu, Qiaoyun Shi, Qiwen Tan, Wu Dong, Jack Skinner, Angela L. Walker, Lixia Zhao, Xiangxiang Gu, Na Chen, Lu Xue, Pei Si, Lu Zhang, Zeshi Wang, Vsevolod Katritch, Zhi-jie Liu, and Raymond C. Stevens declare that they have no conflict of interest. This article does not contain any studies with human or animal subjects performed by any of the authors.
- Abagyan RA, Orry A, Raush E, Budagyan L, Totrov M (2015) ICM manual. MolSoft LLC, La Jolla
- Hanson MA, Brooun A, Baker KA, Jaakola VP, Roth C, Chien EY, Alexandrov A, Velasquez J, Davis L, Griffith M, Moy K, Ganser-Pornillos BK, Hua Y, Kuhn P, Ellis S, Yeager M, Stevens RC (2007) Profiling of membrane protein variants in baculovirus system coupling cell-surface detection with small-scale parallel expression. Protein Expr Purif 56:85–92
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.