Abstract
Scavenger receptors are a protein superfamily whose members typically consist of one or more repeats of the scavenger receptor cysteine-rich structural domain (SRCRD), which is an ancient and highly conserved protein module. The expression and purification of eukaryotic proteins containing multiple disulfide bonds has always been challenging. The systems commonly used to express SRCRD proteins are mainly eukaryotic. Herein, we established a high-level expression strategy for a Type B SRCRD unit from human salivary agglutinin using the Escherichia coli expression system, followed by a refolding and purification process. The untagged recombinant SRCRD was expressed in E. coli using the pET-32a vector, which was followed by a refolding process using the GSH/GSSG redox system. The SRCRD expressed in E. coli SHuffle T7 showed better solubility after refolding than that expressed in E. coli BL21(DE3), suggesting the importance of the disulfide bond content prior to refolding. The quality of the refolded protein was finally assessed using crystallization and crystal structure analysis. As proteins refolded from inclusion bodies exhibit a high crystal quality and reproducibility, this method is considered a reliable strategy for SRCRD protein expression and purification. To further confirm the structural integrity of the refolded SRCRD protein, the purified protein was subjected to crystallization using the sitting-drop vapor diffusion method. The obtained crystals of SRCRD diffracted X-rays to a resolution of 1.47 Å. The solved crystal structure appeared to be highly conserved, with four disulfide bonds appropriately formed. The surface charge distribution of homologous SRCRD proteins indicates that the negatively charged region at the surface is associated with their calcium-dependent ligand recognition. These results suggest that a high-quality SRCRD protein expressed by E. coli SHuffle T7 can be successfully folded and purified, providing new options for the expression of members of the scavenger receptor superfamily.
1 Introduction
The scavenger receptors are a protein superfamily that has been shown to be widely expressed in a variety of tissues [1] and to recognize diverse ligands [2]. The scavenger receptor family includes a range of secreted and membrane-associated proteins with widespread functions in the immune response [3], cell differentiation [4], apoptosis [5], and tumor suppression [6] that play an important role in host defense, such as sensing and clearing various pathogens [1]. The scavenger receptors usually consist of one or more repeats of the scavenger receptor cysteine-rich domain (SRCRD), which is an ancient and highly conserved protein module comprising about 110 residues [7]. SRCRD proteins have been classified into two types based on the number of cysteine residues [8]; Type A proteins are encoded by two exons and contain six cysteine residues, whereas Type B proteins typically contain eight cysteines and are encoded by one exon [9]. Although the number of cysteines differs, the relative pairing of the cysteines in SRCRDs from different sources is consistent [10].
The expression and purification of active and correctly folded SRCRD proteins are an important basis for crystallography studies of the scavenger receptors. To date, several crystal structures of the SRCRD from various scavenger receptor members have been determined, including Salivary scavenger and agglutinin (SALSA) [11], SCARA1 [12], SCARA5 [1], M2BP [13], MARCO [3], CD163 [14, 15], CD6 [16], CD5 [17], and hepsin [18]. The SRCRDs contain three or four pairs of disulfide bonds that are essential for maintaining the structural stability of the protein, which requires a protein expression system that has the ability to modify the disulfide bonds and fold the protein correctly. The expression systems commonly used in these studies to express SRCRD proteins were mainly an insect baculovirus expression vector system and a mammalian cell expression system, such as S2 cells [11, 14, 15] from Drosophila, High Five cells [1, 12] from Trichoplusia ni, and human HEK 293-EBNA embryonic kidney cells [15, 17]. Yeast expression systems have also been used for SRCRD expression, such as strain X-33 [15] and strain KM71 [18] from Pichia pastoris (Table 1). Although the eukaryotic expression systems offer the ability to express complex multi-disulfide-bonded proteins, such as SRCRD, these systems are slow, have a low yield, and are expensive [19].
Among the various expression systems available, E. coli has become the most widely used host for the production of recombinant proteins because of the advantages of rapid cell multiplication, high yield, and the relative ease of IPTG-induced expression [20]. However, most eukaryotic proteins suffer from insolubility and misfolding when expressed in the E. coli system, thus requiring the solubilization and refolding of the protein inclusion bodies [21]. Therefore, the expression of recombinant proteins containing multiple disulfide bonds in E. coli remains challenging. Some multi-disulfide-bonded proteins have been reported to be expressed in E. coli strains, resulting in the formation of inactive inclusion bodies that require renaturation to gain biological activity [22]. In the SHuffle strain, the normally periplasmic disulfide bond isomerase DsbC is present in the cytoplasm, where it has been reported to shuffle the disulfide bonds within mis-oxidized proteins to produce their native folded state [23, 24], thus greatly enhancing the amount of correctly folded disulfide-bonded proteins [25].
SALSA, which is also known as deleted in malignant brain tumors 1 (DMBT1) and salivary agglutinin (SAG), is a member of the scavenger receptor family [26] that was thought to be involved in the oral clearance of microorganisms because of its bacterial agglutination properties [27]. This protein contains 14 Type B SRCRDs separated by SRCR-interspersed domains (SIDs) [28]. Because of its ability to bind and agglutinate Streptococcus mutans, SALSA has been considered to play an important role in preventing dental caries [29]. In the present study, we developed an expression and purification method for the SRCRD from SALSA utilizing the SHuffle T7 E. coli expression system; moreover, we refolded the protein and obtained diffraction-quality crystals. The strategy developed in this study was considered to be reliable, as further confirmed by structural analysis, thus contributing to the structural and functional study of other scavenger receptor family proteins and cysteine-rich proteins.
2 Materials and Methods
2.1 Expression of the SRCRD Protein in E. coli
The DNA encoding SRCRD11 (1371–1489) of the deleted in malignant brain tumors 1 (DMBT1) protein was cloned into pET-32a in two forms: Trx-His6-TEV-SRCRD and SRCRD only for soluble and insoluble expression, respectively. In the first form, a thioredoxin (Trx) tag was added at the N-terminus for expression in the supernatant, followed by a polyhistidine (His6) tag for affinity purification. A Tobacco Etch Virus (TEV) protease cleavage site was added between the His6 tag and the SRCRD to remove the tags for further purification. The recombinant plasmids were transformed into E. coli SHuffle T7 (New England Biolabs) cells, which were then spread on an LB agar plate (100 µg/mL ampicillin). A single colony was picked from the plate and precultured in 10 mL of LB medium (100 µg/mL ampicillin). The transformant was cultured overnight at 37 °C with shaking at 100 rpm and then transferred into 1 L of LB medium in a 5-L Erlenmeyer flask. The culture solution was incubated at 37 °C with shaking until the OD600 reached 0.6–0.8. Subsequently, the culture solution was rapidly cooled by placing it in ice water. Overnight expression was induced by 0.5 mM isopropyl-β-d-thiogalactopyranoside (IPTG) at 25 °C with shaking at 100 rpm. E. coli cells were harvested by centrifugation at 3500 × g at 4 °C for 30 min. The harvested cells were collected in sampling bags, flash frozen in liquid nitrogen, and stored at −80 °C.
2.2 Refolding and Purification Strategy for SRCRD Expression in the Supernatant
The cell pellets were resuspended with lysis buffer (20 mM Tris–HCl (pH 8.0), 200 mM NaCl, and 10 mM imidazole) and sonicated to break the cells using a Digital Sonifier (Branson). The conditions of sonication were: Time = 7 min, Power = 60%, On = 1.0 s, Off = 1.0 s, max temperature = 7 °C. The lysate was centrifuged at 4 °C at 30,000 × g for 30 min, to pellet the cellular debris. The supernatant was then applied to nickel–nitrilotriacetic acid (Ni–NTA) agarose resin (Fujifilm), for purification according to the manufacturer’s protocol. The Ni–NTA resin was pre-equilibrated with lysis buffer. Next, the supernatant was mixed with resin for 30 min, to immobilize the target protein. After washing with washing buffer (20 mM Tris–HCl (pH 8.0), 200 mM NaCl, and 20 mM imidazole), the protein was eluted with elution buffer (20 mM Tris–HCl (pH 8.0), 200 mM NaCl, and 250 mM imidazole). The eluted protein was then buffer-exchanged with dialysis buffer (20 mM Tris–HCl (pH 8.0) and 200 mM NaCl) and digested with TEV protease at 4 °C overnight, to remove the Trx-His6-tag. A Resource RPC column (Cytiva) was used to separate the pure SRCRD protein from the digestion solution using a linear gradient of 1%–99% acetonitrile (1% trifluoroacetic acid). The target fractions were collected and freeze-dried. A refolding process was carried out at 4 °C in 50 mM Tris–HCl (pH 8.0), 0.2 M NaCl, 0.4 M l-Arg, 2.5 mM CaCl2, 2 mM/0.4 mM glutathione reduced form (GSH)/glutathione oxidized form (GSSG), and 10% glycerol. The refolded protein was then buffer-exchanged in 50 mM Tris–HCl (pH 8.0), 0.2 M NaCl, 2.5 mM CaCl2, and 10% glycerol for 24 h, twice.
2.3 Refolding and Purification Strategy for SRCRD Expression in Inclusion Bodies
The cell pellets were suspended with lysis buffer (20 mM Tris–HCl (pH 8.0) and 200 mM NaCl) and sonicated to break the cells. The lysate was centrifuged at 4 °C at 30,000 × g for 30 min, to collect the inclusion bodies. The inclusion bodies were first washed twice with washing buffer (20 mM Tris–HCl (pH 8.0), 200 mM NaCl, 2 M urea, and 5 mM EDTA) and centrifuged at 4 °C at 8000 × g for 20 min to remove soluble impurities. The inclusion bodies were then solubilized in dissolution buffer (20 mM Tris–HCl (pH 8.0), 200 mM NaCl, and 8 M urea) with shaking at 4 °C overnight, and the supernatant was collected by centrifugation at 30,000 × g for 30 min at 4 °C. The refolding of SRCRD was performed through a dilution process with addition of the solubilized SRCRD sample to eightfold volume of refolding buffer (50 mM Tris–HCl (pH 8.0), 200 mM NaCl, 1 M urea, 0.5 M l-Arg, 2 mM GSH, 0.4 mM GSSG, and 2.5 mM CaCl2) in several batches. The final concentration of the protein was about 0.2 mg/mL. This refolding process was carried out over 1 week at 4 °C. After the refolding procedure, the sample was buffer-exchanged with dialysis buffer 1 (20 mM Tris–HCl (pH 8.0), 200 mM NaCl, and 50 mM Arg), followed by dialysis buffer 2 (20 mM Tris–HCl (pH 8.0)) at 4 °C. Subsequently, the refolded SRCRD was filtered and further loaded onto a Mono Q column (GE Healthcare) to remove protein impurities using an ÄKTA purifier (GE Healthcare) as the final purification. Elution was performed with a linear gradient of 0–1.0 M NaCl in 20 mM Tris–HCl (pH 8.0) buffer.
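The dilution step above fixes the conditions under which refolding actually takes place, and these can be estimated with simple mixing arithmetic. The following is a minimal sketch, assuming that "eightfold volume" means one part solubilized sample added to eight parts refolding buffer, that volumes are additive, and that the solubilized sample contributes no arginine or glutathione; the function and variable names are illustrative only.

```python
def mix_concentration(c_sample, v_sample, c_buffer, v_buffer):
    """Concentration after mixing two solutions, assuming additive volumes."""
    return (c_sample * v_sample + c_buffer * v_buffer) / (v_sample + v_buffer)

# Assumed reading of "eightfold volume": 1 part sample + 8 parts refolding buffer.
V_SAMPLE, V_BUFFER = 1.0, 8.0

# Urea: 8 M in the dissolution buffer, 1 M in the refolding buffer.
urea_final_M = mix_concentration(8.0, V_SAMPLE, 1.0, V_BUFFER)

# Components assumed to be present only in the refolding buffer.
arg_final_M = mix_concentration(0.0, V_SAMPLE, 0.5, V_BUFFER)
gsh_final_mM = mix_concentration(0.0, V_SAMPLE, 2.0, V_BUFFER)
gssg_final_mM = mix_concentration(0.0, V_SAMPLE, 0.4, V_BUFFER)

# The GSH:GSSG ratio is unchanged by dilution.
gsh_to_gssg = gsh_final_mM / gssg_final_mM

# Protein concentration of the solubilized sample implied by ~0.2 mg/mL final.
implied_stock_mg_per_mL = 0.2 * (V_SAMPLE + V_BUFFER) / V_SAMPLE

print(f"urea after dilution: {urea_final_M:.2f} M")
print(f"L-Arg after dilution: {arg_final_M:.2f} M")
print(f"GSH/GSSG after dilution: {gsh_final_mM:.2f}/{gssg_final_mM:.2f} mM "
      f"(ratio {gsh_to_gssg:.1f}:1)")
print(f"implied sample concentration: {implied_stock_mg_per_mL:.1f} mg/mL")
```

Under these assumptions the urea concentration falls from 8 M to a little under 2 M while the GSH:GSSG ratio stays at 5:1. If the sample was instead diluted to eight times its original volume, the figures shift accordingly.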
2.4 MALDI-TOF MS
Sinapinic acid, which was used as the matrix solution, was dissolved in the mixture solution of TA30 (acetonitrile and 0.1% TFA mixed at a ratio of 30:70). Proteins were diluted with 0.1% TFA to a concentration of about 10 μM and mixed with matrix at a ratio of 1:1. The mixed solution was spotted onto a metal plate and dried by air before being loaded onto an AutoFlex Speed MALDI-ToF instrument (Bruker). The analytical mode was set to LP m/z HighMasses. Bovine serum albumin was used as the calibration reagent. The laser power was 70%, and spotting was performed about 20 times for each sample. Spectra were analyzed using the flexAnalysis Software (Bruker).
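When comparing a measured mass with the value calculated from the sequence, it helps to remember that each disulfide bond removes two hydrogen atoms from the fully reduced polypeptide. The sketch below only illustrates that arithmetic; the 12.5 kDa figure is the approximate SDS–PAGE size quoted in this paper, used here as a stand-in for the true calculated mass, and the function names are hypothetical.

```python
H_AVERAGE_MASS_DA = 1.00794  # average atomic mass of hydrogen

def disulfide_mass_shift(n_bonds):
    """Mass change (Da) when 2*n_bonds thiols are oxidized to n_bonds disulfides."""
    if n_bonds < 0:
        raise ValueError("number of disulfide bonds must be non-negative")
    return -2.0 * H_AVERAGE_MASS_DA * n_bonds

# A Type B SRCRD has eight cysteines forming four disulfide bonds.
shift_four_bonds = disulfide_mass_shift(4)

# Approximate protein mass (assumption: SDS-PAGE estimate, not a sequence mass).
APPROX_MASS_DA = 12500.0
relative_shift_percent = 100.0 * abs(shift_four_bonds) / APPROX_MASS_DA

print(f"mass shift for four disulfides: {shift_four_bonds:.2f} Da")
print(f"relative to ~12.5 kDa: {relative_shift_percent:.3f} %")
```

A shift of this size is small compared with the protein mass, so whether it can be resolved depends on the calibration and mode of the instrument; the gel-shift assay described in the Results is a more direct readout of disulfide formation.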
2.5 Crystallization
The purified SRCRD protein was concentrated to 12 mg/mL using a 5000 MWCO VIVASPIN (Sartorius) ultrafiltration device, followed by screening of the crystallization conditions using Wizard I & II (Rigaku Reagents, Inc.), Crystal Screen HT (Hampton Research), and PEG/Ion HT (Hampton Research) kits. The sitting-drop vapor diffusion method was adopted at 20 °C in 96-well VIOLAMO Protein Crystallization Plates (As One); 0.5 µL of the protein solution and 0.5 µL of the reservoir solution were mixed in one drop, for crystal growth. SRCRD expressed in the supernatant formed needle-shaped crystals under the following crystallization conditions: 0.2 M lithium citrate tribasic tetrahydrate, 25% PEG 3350, pH 7.0. SRCRD expressed in inclusion bodies formed rhombic crystals under the following crystallization conditions: 0.2 M ammonium iodide, 20% PEG 3350, pH 6.2. These crystals were used for data collection.
2.6 X-ray Diffraction Data Collection and Processing
The crystals were picked up with a Mounted CryoLoop (Hampton Research) and cryoprotected in a reservoir solution containing 20% glycerol. Subsequently, the crystals were flash frozen in liquid nitrogen and transferred to a UniPuck (crystal positioning system). X-ray diffraction data of the SRCRD crystals were collected on beamline AR-NE3A at Photon Factory (Ibaraki, Japan) at a wavelength of 1.000 Å. The diffraction data were indexed and integrated using the XDS program [30] and then scaled using XSCALE [30]. The crystal structure of the SRCRD protein was solved by molecular replacement with MOLREP [31, 32] from the CCP4 suite, using the structure of GP340 SRCRD 8 (PDB 6SA5; sequence identity: 96.33%) as the search model. Model building was carried out using CCP4i and Coot [33, 34]. The structure was further refined using Refmac5 [35] and phenix.refine [36].
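The relationship between the collection wavelength and the scattering angle needed to record a given resolution follows from Bragg's law, λ = 2d sin θ. The sketch below applies it to the 1.000 Å wavelength stated above and the 1.47 Å resolution quoted in the Abstract; it is a geometric illustration only and says nothing about the detector distance actually used, which the text does not give.

```python
import math

def bragg_two_theta_deg(d_spacing_A, wavelength_A):
    """Scattering angle 2-theta (degrees) for a given d-spacing, from Bragg's law."""
    sin_theta = wavelength_A / (2.0 * d_spacing_A)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))

WAVELENGTH_A = 1.000   # stated collection wavelength
RESOLUTION_A = 1.47    # resolution quoted in the Abstract

two_theta = bragg_two_theta_deg(RESOLUTION_A, WAVELENGTH_A)
print(f"2-theta at {RESOLUTION_A} A resolution: {two_theta:.1f} degrees")
```

The same function gives the smallest d-spacing reachable at a given maximum detector angle when solved the other way round, which is why shorter wavelengths or closer detector positions are chosen for well-diffracting crystals.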
2.7 Size Exclusion Chromatography
The protein sample was concentrated to 1 mL and loaded onto a Superdex 200 10/300 GL column (GE Healthcare) with gel filtration buffer (20 mM Tris–HCl (pH 8.0), and 200 mM NaCl) using an ÄKTA purifier (GE Healthcare).
3 Results
3.1 Refolding and Purification of SRCRD Expressed in the Supernatant of an E. coli Lysate
To express the SRCRD protein in the soluble fraction, we first attempted a strategy to improve the solubility of the expressed proteins by fusing a thioredoxin (Trx) tag [37] to the N-terminal end of the construct using pET-32a (Fig. 1A). A histidine tag for affinity purification followed the Trx-tag. Moreover, a TEV protease cleavage site replaced the enterokinase cleavage site between the histidine tag and the SRCRD sequence to allow subsequent digestion with TEV protease to obtain the SRCRD protein. The target protein (Trx-His6-TEV-SRCRD) was expressed in the soluble fraction of the E. coli SHuffle T7 lysate and observed as a band of about 30.5 kDa on SDS–PAGE (Fig. 2A).
The soluble fraction containing Trx-His6-TEV-SRCRD from the E. coli lysate was initially separated by immobilized Ni–NTA affinity chromatography. Trx-His6-TEV-SRCRD was successfully trapped on the Ni–NTA resin via the binding affinity of the His6 tag to Ni2+ ions. Most contaminant proteins were removed by the washing buffer, and the target protein was eluted using a solution containing 250 mM imidazole. The identity of the Trx-His6-TEV-SRCRD protein was further confirmed by MALDI-TOF–MS (Fig. S1). TEV protease digestion was then performed to cut the protein at the cleavage site and separate the Trx-His6 tag from the SRCRD protein. The SRCRD protein without the tags was confirmed by SDS–PAGE (Fig. 2B). However, this digestion was incomplete, and a large amount of protein retained its tags, which reduced the yield of SRCRD. This incomplete digestion may be attributed to the formation of protein aggregates, resulting in the inability of TEV protease to effectively act at the cleavage site. Non-reducing SDS–PAGE revealed the formation of aggregates, and the addition of DTT, a reducing agent that is often used in protein purification, eliminated this state (Fig. S2). These results suggested that intermolecular disulfide bonds were likely to be the contributing factor in the formation of the aggregates. Although the protein was solubilized with the help of the Trx tag, the disulfide bonds within the recombinant protein molecule remained incorrectly formed.
Reverse-phase chromatography was selected for the purification of SRCRD from the digestion mixture, and the bands were confirmed by SDS–PAGE (Fig. 2C). As the content of the organic phase increased, the SRCRD appeared in peak 2 at approximately 55% acetonitrile (ACN). The Trx-His6 tag and the uncleaved protein were separated into peak 3. The peak 2 fraction samples were then freeze-dried, to remove the organic solvents.
Freeze-dried samples from RPC were prone to precipitation when redissolved in buffer, which may be attributed to incorrect formation of the disulfide bonds. Loading of the RPC-purified SRCRD sample onto the SEC column revealed the presence of many aggregates in the sample (Fig. S3). An incomplete structure with free cysteine residues may have led to this result. To improve its quality, the SRCRD sample needs to be refolded to break the intermolecular disulfide bonds of the protein and form correct intramolecular disulfide bonds. Because DTT is too reductive and may break the otherwise correctly formed disulfide bonds in proteins, glutathione was chosen as the redox system for the refolding process because of its stable and reversible effect. In this study, the disulfide bonds of SRCRD were reshuffled using a refolding buffer containing GSH/GSSG as a redox system together with l-arginine, calcium ion, and glycerol. After a long period of refolding, the number of aggregates in the sample was decreased (Fig. S3). The purified SRCRD after refolding was checked by non-reducing (no DTT) and reducing (0.2 M DTT) SDS–PAGE (Fig. 2D). Proteins containing disulfide bonds may exhibit a loose state when the disulfide bonds are broken by DTT; thus, they move more slowly in SDS–PAGE than do the proteins with intact structures. The addition of DTT caused a change in the position of the SRCRD protein on SDS–PAGE, which demonstrated that disulfide bonds had been formed within the refolded SRCRD.
3.2 Refolding and Purification of SRCRD Expressed from the Inclusion Bodies of an E. coli Lysate
Even if the protein is expressed as inclusion bodies, some methods (such as solubilization and refolding) can be chosen to restore the solubility and natural structure of the protein. To apply this strategy to the expression of SRCRD, a construct containing only SRCRD without tags was used for SRCRD expression (Fig. 1A). In the absence of a Trx tag to enhance the solubility of the expressed protein, the SRCRD proteins were expressed as inclusion bodies in the precipitate of the E. coli lysate after centrifugation; furthermore, the target protein in the inclusion bodies was observed as a band of about 12.5 kDa on SDS–PAGE (Fig. 3A).
The precipitate containing SRCRD inclusion bodies was first washed twice with washing buffer, then centrifuged to remove soluble impurities. The inclusion bodies of SRCRD were retained and then solubilized in dissolution buffer containing 8 M urea with shaking at 4 °C overnight, followed by the removal of the insoluble matter by centrifugation. The solubilized SRCRD sample was added to refolding buffer with or without GSH/GSSG in a stepwise manner and with sufficient stirring to begin the refolding process. This refolding process then lasted for 1 week at 4 °C, and the refolded protein samples were checked by SDS–PAGE (Fig. 3B). Two SRCRD bands appeared on the first day of refolding with GSH/GSSG, indicating that some of the SRCRD had been refolded; however, a fraction of the SRCRD remained in a loose state because the intramolecular disulfide bonds were not all formed. The SRCRD obtained after refolding without GSH/GSSG had only one upper band, indicating that the protein was not efficiently refolded. After refolding for 1 week, with the help of GSH/GSSG, most of the SRCRD shifted to the lower band. In contrast, in the refolding without GSH/GSSG, most of the SRCRD remained in the upper band. Samples were collected daily during the refolding process and checked by SDS–PAGE (Fig. 3C). The upper unfolded bands gradually disappeared over time (days), eventually leaving only the lower bands, representing the refolded protein. The sample was finally dialyzed to remove all chaotropic agents and redox reagents and end the refolding process. The refolded protein was loaded onto the SEC column and a large peak representing the monomer was eluted (Fig. S4), indicating that most of the protein was retained as a monomer after refolding. Together, these results highlight the importance of the GSH/GSSG redox system for the refolding of multi-disulfide-bonded proteins. Another E. coli strain, BL21(DE3), was used for SRCRD expression; however, it failed to yield soluble protein after refolding attempts. Precipitation of proteins occurred when refolding reagents were removed using dialysis (Fig. S5). Thus, SHuffle T7 is more suitable for SRCRD expression than the commonly used BL21(DE3) strain.
For crystallization experiments, further ion exchange chromatography was used for the purification of the SRCRD protein. After the refolding process, the SRCRD protein sample was buffer-exchanged by dialysis at 4 °C for 48 h, to remove all the salt present in the sample. Subsequently, the refolded SRCRD was filtered and further loaded onto a Mono Q column, to remove protein impurities as the final purification (Fig. 3D). The purity of the target protein was confirmed by SDS–PAGE, and the identity of the protein was further confirmed by MALDI-TOF–MS (Fig. S1).
3.3 Crystallization and X-ray Diffraction of SRCRD
The purity and stability of a protein are essential prerequisites for its successful crystallization; thus, crystallization can be used to evaluate the SRCRD purification and refolding strategies proposed here. To further confirm the structural integrity of the SRCRD protein and whether the disulfide bonds in the protein were formed correctly after refolding, crystallization experiments were performed using the sitting-drop vapor diffusion method. The purified SRCRD was concentrated to 12 mg/mL on an ultrafiltration device, and several commercial crystallization screening kits were used in the crystallization screening experiment.
More than 1 month after the SRCRD protein was purified and refolded from supernatant expression, a few protein crystals were finally detected in the reservoir solution containing lithium citrate tribasic tetrahydrate and polyethylene glycol 3,350. Noodle-shaped crystals formed in the condition with 0.2 M lithium citrate tribasic tetrahydrate, 25%–30% w/v polyethylene glycol 3,350, pH 6.0–7.0 (Fig. 4). However, the crystallization of SRCRD prepared using this purification and refolding strategy was not very reproducible, and crystals did not always grow. Therefore, this purification and refolding strategy for SRCRD expressed in supernatants is not adequately effective.
In the case of the SRCRD protein purified and refolded from inclusion body expression, the protein yields were enhanced, and large quantities of crystals could be obtained within a short period (2–3 days) in crystallization experiments. The growth of many crystals could be observed under several conditions using the PEG/ion kit. Most of these crystals were rectangular or rhombic in shape (Fig. 4). The crystallization experiments were repeated, and crystals were obtained every time using the PEG/ion kit (the conditions under which crystals often grew are shown in Fig. S6). Therefore, this strategy of purification and refolding of SRCRD proteins from inclusion bodies was considered to be reliable.
Whether the four pairs of disulfide bonds in SRCRD were correctly formed was further confirmed by X-ray diffraction experiments. The X-ray diffraction data of two types of SRCRD crystals (noodle-shaped and rhombic crystals) were collected at beamline AR-NE3A of Photon Factory at a wavelength of 1.000 Å. The diffraction data of a crystal with a high resolution were used for further analysis. The data collection statistics are listed in Table 2.
The highest resolution of the noodle-shaped crystal obtained here was determined to be 2.54 Å, and one asymmetric unit contained one molecule. The space group of the noodle-shaped crystal was determined to be P3₂21, with unit cell parameters of a = b = 71.08 Å and c = 55.36 Å. The highest resolution of the rhombic crystal obtained was determined to be 1.47 Å, and one asymmetric unit contained one molecule. The space group of the rhombic crystal was determined to be P2₁2₁2, with unit cell parameters of a = 81.80 Å, b = 31.70 Å, and c = 35.22 Å.
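The statement that each asymmetric unit holds one molecule can be cross-checked against the cell dimensions with the Matthews coefficient, V_M = V_cell / (Z × n × M), and the usual solvent-content estimate 1 − 1.23/V_M. The sketch below is illustrative only: it assumes a molecular mass of about 12.5 kDa (the approximate SDS–PAGE size of the untagged protein reported above, not a sequence-derived mass) and uses the number of general positions of each space group (6 for P3₂21, 4 for P2₁2₁2).

```python
import math

def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume (cubic angstroms) from cell lengths and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)

def matthews_coefficient(volume, n_symops, mass_da, n_per_asu=1):
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return volume / (n_symops * n_per_asu * mass_da)

def solvent_fraction(v_m):
    """Approximate solvent fraction, using the conventional constant 1.23."""
    return 1.0 - 1.23 / v_m

MASS_DA = 12500.0  # assumption: approximate mass taken from the SDS-PAGE band

# Noodle-shaped crystal: trigonal P3(2)21, gamma = 120 degrees, 6 general positions.
v_trigonal = cell_volume(71.08, 71.08, 55.36, gamma=120.0)
vm_trigonal = matthews_coefficient(v_trigonal, 6, MASS_DA)

# Rhombic crystal: orthorhombic P2(1)2(1)2, 4 general positions.
v_ortho = cell_volume(81.80, 31.70, 35.22)
vm_ortho = matthews_coefficient(v_ortho, 4, MASS_DA)

for name, vm in (("P3(2)21", vm_trigonal), ("P2(1)2(1)2", vm_ortho)):
    print(f"{name}: V_M = {vm:.2f} A^3/Da, solvent ~ {100 * solvent_fraction(vm):.0f} %")
```

With these inputs the orthorhombic form comes out considerably more tightly packed than the trigonal form, which would be in keeping with its higher resolution, although the exact values depend on the assumed molecular mass.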
A structural analysis of the rhombic crystal was performed because of its higher resolution, and the structure refinement statistics are listed in Table 3. The initial phases of the SRCRD structure were obtained by molecular replacement using the crystal structure with PDB ID 6SA5 as the search model. The resulting initial model was further refined using Refmac5, Coot, and Phenix. After refinement, the Rwork and Rfree of the final SRCRD structure were 15.8% and 17.6%, respectively. The Ramachandran plot indicated that the percentage of residues of SRCRD in preferred regions, allowed regions, and outliers was 96.32%, 3.68%, and 0.00%, respectively.
3.4 Crystal Structure of the SRCRDs from SALSA
All known SRCRDs of scavenger receptor superfamily members share a very high degree of identity at the sequence and structural levels, which is even more evident in the SRCRDs from SALSA. The sequence of SRCRD11 solved in this research exhibited high similarity to another two SRCRDs with reported structures: SRCRD1 (PDB ID: 6SA4, 90.00% identity) and SRCRD8 (PDB ID: 6SA5, 96.33% identity); moreover, the fold was also very highly conserved. The crystal structure revealed a globular SRCR-fold containing four disulfide bridges with typical characteristics of Type B members (Fig. 5). The fold contained one α-helix at the center of the sequence and several β-strands at both the N and C termini. These β-strands linked the N and C termini together, to maintain a compact fold structure. The SRCR11 model spanned 96 residues and lacked the N-terminal “TAGSES” and C-terminal “QSQPTPS” sequences because of the high mobility of these loop tails. The refined electron density map matched the disulfide bonds well, indicating that the eight cysteines formed the correct four pairs of disulfide bonds in SHuffle T7-expressed SRCR11, which confirmed the success of the refolding process. As a Type B scavenger receptor protein, the SRCRD of SALSA has one additional disulfide bond pair compared with Type A domains, i.e., the C1–C4 pair, which corresponded to Cys19–Cys53 in the SRCR11 structure. The relative numbering of the remaining pairs was structurally conserved, with Cys35–Cys99 corresponding to the C2–C7 pair, Cys79–Cys89 corresponding to the C3–C8 pair, and Cys35–Cys99 corresponding to the C5–C6 pair. The C1–C4 and C3–C8 pairs are involved in linking the N- and C-terminal peptide chains and the α-helix, thus maintaining structural closure and compactness. The remaining two pairs of disulfide bonds are responsible for stabilizing the loop-rich flexible region.
3.5 Comparison of Type A and B SRCRD
The disulfide bonds inside the molecules of SRCRDs play an important role in maintaining the structural stability and compactness of these proteins. The C1–C4 disulfide bond pair distinguishes Type A from Type B SRCRDs. This C1–C4 pair is located near the surface of the protein, which may be a factor in the production of inclusion bodies caused by intermolecular disulfide bonds during the expression of Type B SRCRD by SHuffle T7 cells. To understand the role of the C1–C4 pair in the structure of the protein, we compared the structure of Type B SRCR11 to that of Type A (the SRCRD from M2BP was used as a Type A template) regarding the C1–C4 pair and the adjacent β-sheet (Fig. 6A). The C1–C4 pair connects the second β-strand from the N terminus to the end of the α-helix. The disulfide bond formed by the Cys19–Cys53 pair in the structure of SRCR11 renders the structure more compact. In contrast, Asn15 and Phe49 of Type A SRCRD at the same amino acid sequence site are unable to form a bond and render the structure looser. In a close-up view of the C1–C4 pair and adjacent β-strands (Fig. 6B), the second β-strand located at the N-terminal of Type A formed three hydrogen bonds with the β-strand located at the C-terminal. Moreover, Cys105 at the C-terminal formed a C3–C8 pair disulfide bond with Cys44 on the α-helix, thus causing the two ends to establish a compact closure. In the Type B structure, in addition to the hydrogen bonds between the β-strands, Cys19 next to the second β-strand at the N terminus formed a C1–C4 pair with Cys53 at the end of the α-helix, which provides stronger support for the compactness of the terminal structure.
4 Discussion
Tandem repeats of SRCRDs, similar to immunoglobulin (IG) and epidermal growth factor-like (EG) domains, are common among the membrane-bound members of the superfamily [13, 38, 39]. SRCRDs have been identified in numerous cell surface and secreted proteins, most of which are associated with the immune system, mediating protein–protein interactions and ligand binding [8]. In previous studies on SRCRD, eukaryotic expression systems were usually used to express these multi-disulfide-bonded proteins (Table 1) to ensure that the small and compact structure can be folded correctly. However, high costs, harsh culture conditions, susceptibility to contamination, and low protein yields represented obstacles to the use of eukaryotic cells in this context [40, 41], especially in structural studies [19, 42]. In contrast, bacterial cells have the advantage of being cost effective and easy to handle and modify. However, they are limited in the expression of large proteins and lack eukaryotic post-translational modifications [19].
To meet the requirements for a high protein yield and quality of various experiments, such as crystallization and NMR analysis, here, we aimed to identify an efficient and low-cost method to produce SRCRD proteins. An SRCRD unit (1371–1489) from human SALSA, a member of the Type B scavenger receptor subfamily, was targeted in this study. This SRCRD protein contains four pairs of disulfide bonds, which is a challenge for the E. coli expression system. E. coli SHuffle T7 cells, which are particularly adapted to facilitate disulfide bond formation, were used as the expression system for SRCRD. The SHuffle T7 strain is an E. coli K12 strain engineered to form proteins containing disulfide bonds in the cytoplasm and suitable for T7 promoter driven protein expression. In addition to mutations in trxB and gor, DsbC in the cytoplasm facilitates the formation of accurate disulfide bonds of the expressed proteins [23]. To express the SRCRD protein, its cDNA was inserted into the pET-32a expression vector and expressed by SHuffle T7. One Type A SRCRD, the third SRCRD of murine neurotrypsin, has been reported to be expressed in the soluble fraction from SHuffle T7 cells [9]. However, because the Type B SRCRDs contain four pairs of disulfide bonds, the presence of eight cysteines can contribute significantly to the incorrect formation of disulfide bonds in the reducing environment of the E. coli cytoplasm [19]. It is also difficult for the SHuffle strain to support the solubility of the misformed protein. In our case, the SRCRD from human SALSA could not be expressed in the soluble fraction of the SHuffle T7 lysate and remained in the precipitate. The extra pair of disulfide bonds relative to Type A domains is likely to have contributed to this result.
Our initial attempt to produce the protein involved fusing a Trx tag to the N terminus of the sequence to increase the solubility of the recombinant protein, followed by digestion with TEV protease to obtain purified SRCRD. However, the protein obtained contained aggregates and thus required further refolding. To investigate the cause of the aggregation, DTT, a reducing agent that is often used in protein purification, was added to the digested samples, followed by SDS–PAGE using a non-reducing sample buffer (Fig. S1). After the DTT treatment, the aggregate bands disappeared, and the SRCRD band appeared at the expected position. DTT reduces disulfide bonds in proteins and can therefore be employed to block the formation of intermolecular disulfide bonds between cysteines. These results indicated that the aggregation resulted from incorrect disulfide bonding of cysteines between protein molecules and that the addition of a reducing agent is effective in limiting the formation of aggregates. In cysteine-rich proteins, mispairing of disulfide bonds may lead to the formation of protein multimers. During refolding, the oxidation of cysteines to disulfide bonds can be mediated by glutathione [43]. It has been reported that chitinase, a cysteine-rich protein, can be solubilized from E. coli inclusion bodies and correctly folded into an active protein in a glutathione redox system [44]. Therefore, further refolding was performed using the GSH/GSSG redox system to allow the cysteines of SRCRD to form the appropriate disulfide bonds. However, the final yield of the purified protein was not high, probably because of the many steps involved and incomplete digestion. Although crystals were eventually obtained, the crystallization could not be reliably reproduced.
Subsequently, we expressed the protein as inclusion bodies, which were then solubilized with a high concentration of urea and arginine as chaotropic agents. After removing all insoluble impurities, refolding was performed using the GSH/GSSG redox system. The final enrichment and purification of the protein were carried out by ion exchange chromatography. This strategy ensured the quality of the recombinant protein while improving its yield. High-resolution crystals formed within a few days under a variety of crystallization conditions from a commercial PEG/ion kit. Because the construct used in this method carries no added tags, such as a His tag, the flexible regions of the protein are minimized, which likely facilitates crystal formation. Additionally, we tested a His6-SRCRD construct carrying a histidine tag at the N terminus. This tagged construct gave a high yield and purity of protein; however, no crystal growth was observed (Table 4). This may be because the additional flexible terminal segment prevented crystal formation. Moreover, the BL21(DE3) strain was tested for its ability to express SRCRD; however, it failed to yield soluble protein after refolding. This may be due to the absence in BL21(DE3) of the cytoplasmic isomerase that promotes correct disulfide bond formation in the SHuffle T7 strain. Thus, the strategy of purifying and refolding SRCRD proteins from inclusion bodies expressed in the SHuffle T7 strain is considered to be reliable.
Regarding the crystal structure, inspection of the electron density map confirmed the correct formation of all four disulfide bonds. SRCRD exhibits a highly conserved structure. In addition to the cysteines that form the disulfide bonds, several amino acid sequences that constitute the main secondary structure elements are very highly conserved.
The disulfide bond formed by the C1–C4 pair distinguishes Type B from Type A SRCRDs [45]. This pair is located near the protein surface and may promote the generation of inclusion bodies by forming faulty intermolecular bonds. By comparing the position of this extra disulfide bond and the nearby structure in the Type B SRCRD with those of Type A, we found that the C1–C4 pair may help link the terminal peptide chains and stabilize the closure of the structure together with the C3–C8 pair and nearby hydrogen bonds. These bonds link the N-terminal β-strands, the C-terminal β-strand, and the α-helix to each other. In the Type A structure, this function is performed by the C3–C8 pair and the hydrogen bonds between the β-strands alone. Overall, the C1–C4 pair causes almost no change in the overall SRCRD structure; rather, it serves only as additional insurance for the closure of the structure, which is consistent with the structural conservation between Type A and Type B domains. However, the two extra cysteines are likely a factor in the production of inclusion bodies. In any case, this strategy of successfully expressing the protein and refolding all disulfide bonds, including the C1–C4 pair, to obtain high-quality Type B SRCRD proteins has high potential for application to both Type A and Type B SRCRD proteins.
In conclusion, we developed a new and efficient strategy for the production of SRCRD proteins using an E. coli expression system. The recombinant protein containing four disulfide bonds could be refolded using a simple and efficient procedure in a refolding buffer containing glutathione as a redox system. Crystallization and structural analysis confirmed the appropriate formation of the disulfide bonds and the correct folding of the protein. The expression and purification strategy reported in this study may further contribute to the expression and functional study of other scavenger receptor family proteins and Cys-rich proteins.
References
Yu B, Cheng C, Wu Y, Guo L, Kong D, Zhang Z, Wang Y, Zheng E, Liu Y, He Y (2020) Interactions of ferritin with scavenger receptor class A members. J Biol Chem 295:15727–15741
PrabhuDas MR, Baldwin CL, Bollyky PL, Bowdish DME, Drickamer K, Febbraio M, Herz J, Kobzik L, Krieger M, Loike J, McVicker B, Means TK, Moestrup SK, Post SR, Sawamura T, Silverstein S, Speth RC, Telfer JC, Thiele GM, Wang X-Y, Wright SD, El Khoury J (2017) A consensus definitive classification of scavenger receptors and their roles in health and disease. J Immunol 198:3775–3789
Ojala JRM, Pikkarainen T, Tuuttila A, Sandalova T, Tryggvason K (2007) Crystal structure of the cysteine-rich domain of scavenger receptor MARCO reveals the presence of a basic and an acidic cluster that both contribute to ligand recognition. J Biol Chem 282:16654–16666
Ashkenazy H, Abadi S, Martz E, Chay O, Mayrose I, Pupko T, Ben-Tal N (2016) ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Res 44:W344–W350
Purushotham S, Deivanayagam C (2014) The calcium-induced conformation and glycosylation of scavenger-rich cysteine repeat (SRCR) domains of glycoprotein 340 influence the high affinity interaction with antigen I/II homologs. J Biol Chem 289:21877–21887
Reichhardt MP, Holmskov U, Meri S (2017) SALSA—A dance on a slippery floor with changing partners. Mol Immunol 89:100–110
Sarrias MR, Grønlund J, Padilla O, Madsen J, Holmskov U, Lozano F (2004) The Scavenger Receptor Cysteine-Rich (SRCR) domain: an ancient and highly conserved protein module of the innate immune system. Crit Rev Immunol 24:1–37
Resnick D, Pearson A, Krieger M (1994) The SRCR superfamily: a family reminiscent of the Ig superfamily. Trends Biochem Sci 19:5–8
Canciani A, Catucci G, Forneris F (2019) Structural characterization of the third scavenger receptor cysteine-rich domain of murine neurotrypsin. Protein Sci 28:746–755
Resnick D, Chatterton JE, Schwartz K, Slayter H, Krieger M (1996) Structures of class A macrophage scavenger receptors. Electron microscopic study of flexible, multidomain, fibrous proteins and determination of the disulfide bond pattern of the scavenger receptor cysteine-rich domain. J Biol Chem 271:26924–26930
Reichhardt MP, Loimaranta V, Lea SM, Johnson S (2020) Structures of SALSA/DMBT1 SRCR domains reveal the conserved ligand-binding mechanism of the ancient SRCR fold. Life Sci Alliance 3(4):e201900502
Cheng C, Zheng E, Yu B, Zhang Z, Wang Y, Liu Y, He Y (2021) Recognition of lipoproteins by scavenger receptor class A members. J Biol Chem 297(2):100948
Hohenester E, Sasaki T, Timpl R (1999) Crystal structure of a scavenger receptor cysteine-rich domain sheds light on an ancient superfamily. Nat Struct Biol 6:228–232
Ma H, Jiang L, Qiao S, Zhi Y, Chen X-X, Yang Y, Huang X, Huang M, Li R, Zhang G-P (2017) The crystal structure of the fifth scavenger receptor cysteine-rich domain of porcine CD163 reveals an important residue involved in porcine reproductive and respiratory syndrome virus infection. J Virol 91(3):e01897-16
Ma H, Li R, Jiang L, Qiao S, Chen XX, Wang A, Zhang G (2021) Structural comparison of CD163 SRCR5 from different species sheds some light on its involvement in porcine reproductive and respiratory syndrome virus-2 infection in vitro. Vet Res 52:97
Chappell PE, Garner LI, Yan J, Metcalfe C, Hatherley D, Johnson S, Robinson CV, Lea SM, Brown MH (2015) Structures of CD6 and its ligand CD166 give insight into their interaction. Structure 23:1426–1436
Rodamilans B, Muñoz IG, Bragado-Nilsson E, Sarrias MR, Padilla O, Blanco FJ, Lozano F, Montoya G (2007) Crystal structure of the third extracellular domain of CD5 reveals the fold of a group B scavenger cysteine-rich receptor domain. J Biol Chem 282:12669–12677
Somoza JR, Ho JD, Luong C, Ghate M, Sprengeler PA, Mortara K, Shrader WD, Sperandio D, Chan H, McGrath ME, Katz BA (2003) The structure of the extracellular region of human hepsin reveals a serine protease domain and a novel scavenger receptor cysteine-rich (SRCR) domain. Structure 11:1123–1131
Kesidis A, Depping P, Lodé A, Vaitsopoulou A, Bill RM, Goddard AD, Rothnie AJ (2020) Expression of eukaryotic membrane proteins in eukaryotic and prokaryotic hosts. Methods 180:3–18
Marisch K, Bayer K, Cserjan-Puschmann M, Luchner M, Striedner G (2013) Evaluation of three industrial Escherichia coli strains in fed-batch cultivations during high-level SOD protein production. Microb Cell Fact 12:58
Long X, Gou Y, Luo M, Zhang S, Zhang H, Bai L, Wu S, He Q, Chen K, Huang A, Zhou J, Wang D (2015) Soluble expression, purification, and characterization of active recombinant human tissue plasminogen activator by auto-induction in E. coli. BMC Biotechnol 15:13
Fathi-Roudsari M, Akhavian-Tehrani A, Maghsoudi N (2016) Comparison of three Escherichia coli strains in recombinant production of reteplase. Avicenna J Med Biotechnol 8:16–22
Shevchik VE, Condemine G, Robert-Baudouy J (1994) Characterization of DsbC, a periplasmic protein of Erwinia chrysanthemi and Escherichia coli with disulfide isomerase activity. EMBO J 13:2007–2012
Missiakas D, Georgopoulos C, Raina S (1994) The Escherichia coli dsbC (xprA) gene encodes a periplasmic protein involved in disulfide bond formation. EMBO J 13:2013–2020
Lobstein J, Emrich CA, Jeans C, Faulkner M, Riggs P, Berkmen M (2016) Erratum to: SHuffle, a novel Escherichia coli protein expression strain capable of correctly folding disulfide bonded proteins in its cytoplasm. Microb Cell Fact 15(1):124
Holmskov U, Mollenhauer J, Madsen J, Vitved L, Grønlund J, Tornøe I, Kliem A, Reid KBM, Poustka A, Skjødt K (1999) Cloning of gp-340, a putative opsonin receptor for lung surfactant protein D. Proc Natl Acad Sci U S A 96:10794–10799
Rundegren J, Ericson T (1981) Effect of calcium on reactions between a salivary agglutinin and a serotype c strain of Streptococcus mutans. J Oral Pathol 10:269–275
Bikker FJ, Ligtenberg AJM, Nazmi K, Veerman ECI, Hof WVNT, Bolscher JGM, Poustka A, Nieuw Amerongen AV, Mollenhauer J (2002) Identification of the bacteria-binding peptide domain on salivary agglutinin (gp-340/DMBT1), a member of the scavenger receptor cysteine-rich superfamily. J Biol Chem 277:32109–32115
Larson MR, Rajashankar KR, Patel MH, Robinette RA, Crowley PJ, Michalek S, Brady LJ, Deivanayagam C (2010) Elongated fibrillar structure of a streptococcal adhesin assembled by the high-affinity association of alpha- and PPII-helices. Proc Natl Acad Sci U S A 107:5983–5988
Kabsch W (2010) XDS. Acta Crystallogr D Biol Crystallogr 66:125–132
Vagin A, Teplyakov A (2010) Molecular replacement with MOLREP. Acta Crystallogr D Biol Crystallogr 66:22–25
Vagin A, Teplyakov A (1997) MOLREP: an automated program for molecular replacement. J Appl Crystallogr 30:1022–1025
Winn MD, Ballard CC, Cowtan KD, Dodson EJ, Emsley P, Evans PR, Keegan RM, Krissinel EB, Leslie AGW, McCoy A, McNicholas SJ, Murshudov GN, Pannu NS, Potterton EA, Powell HR, Read RJ, Vagin A, Wilson KS (2011) Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr 67:235–242
Emsley P, Lohkamp B, Scott WG, Cowtan K (2010) Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66:486–501
Murshudov GN, Skubák P, Lebedev AA, Pannu NS, Steiner RA, Nicholls RA, Winn MD, Long F, Vagin AA (2011) REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr 67:355–367
Afonine PV, Grosse-Kunstleve RW, Echols N, Headd JJ, Moriarty NW, Mustyakimov M, Terwilliger TC, Urzhumtsev A, Zwart PH, Adams PD (2012) Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr D Biol Crystallogr 68:352–367
LaVallie ER, DiBlasio EA, Kovacic S, Grant KL, Schendel PF, McCoy JM (1993) A thioredoxin gene fusion expression system that circumvents inclusion body formation in the E. coli cytoplasm. Biotechnology (N Y) 11:187–193
Aruffo A, Bowen MA, Starling GC, Gebe JA, Bajorath J, Patel DD, Haynes BF (1997) CD6–ligand interactions: a paradigm for SRCR domain function? Immunol Today 18:498–504
Bork P, Downing AK, Kieffer B, Campbell ID (1996) Structure and distribution of modules in extracellular proteins. Q Rev Biophys 29:119–167
Midgett CR, Madden DR (2007) Breaking the bottleneck: Eukaryotic membrane protein expression for high-resolution structural studies. J Struct Biol 160:265–274
Junge F, Schneider B, Reckel S, Schwarz D, Dötsch V, Bernhard F (2008) Large-scale production of functional membrane proteins. Cell Mol Life Sci 65:1729–1755
Yin J, Li G, Ren X, Herrler G (2007) Select what you need: A comparative evaluation of the advantages and limitations of frequently used expression systems for foreign genes. J Biotechnol 127:335–347
White B, Patterson M, Karnwal S, Brooks CL (2022) Crystal structure of a human MUC16 SEA domain reveals insight into the nature of the CA125 tumor marker. Proteins 90:1210–1218
Moghadam M, Ganji A, Varasteh A, Falak R, Sankian M (2015) Refolding process of cysteine-rich proteins: chitinase as a model. Rep Biochem Mol Biol 4:19
Sarrias MR, Grønlund J, Padilla O, Madsen J, Holmskov U, Lozano F (2004) The Scavenger Receptor Cysteine-Rich (SRCR) domain: an ancient and highly conserved protein module of the innate immune system. Crit Rev Immunol 24:1–37
Acknowledgements
The X-ray diffraction experiments were performed at synchrotron beamlines BL44XU at SPring-8 (Hyogo, Japan) under the Cooperative Research Program of Institute for Protein Research, Osaka University (Proposal No. 2022A6721) and AR-NE3A at Photon Factory (Ibaraki, Japan) under the approval of the Photon Factory Program Advisory Committee (Proposal Nos. 2020G681 and 2022G031). We would like to thank the beamline staff members at SPring-8 BL44XU and Photon Factory AR-NE3A for their help with X-ray data collection and processing.
Funding
Open access funding provided by The University of Tokyo. This work was supported in part by JSPS KAKENHI Grant Numbers JP18H02151 (Grant-in-Aid for Scientific Research (B) to K.N., H.K., and M.S.) and JP19H05771 (Grant-in-Aid for Scientific Research on Innovative Areas IBmS to M.S. and K.N.).
Author information
Contributions
Conceptualization: [HK and KN]; methodology: [CZ, PL, KO, HI, SO, MS, and KN]; formal analysis: [PL]; investigation: [CZ, PL, SW, CH, MM, and KN]; writing – original draft preparation: [CZ]; writing – review and editing: [PL, KO, HI, SO, MS, HK, and KN]; supervision: [KN]; project administration: [KN]; funding acquisition: [MS, HK, and KN].
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Fig. S2 Size-exclusion chromatography of the RPC-purified SRCRD sample before and after refolding. Before refolding, most of the protein was eluted in the void volume as an aggregate. After refolding, most of the protein was eluted as a monomer (TIF 578 KB)
Fig. S3 Effect of DTT on the digestion of Trx-His6-TEV-SRCRD by TEV protease. Samples were prepared for SDS–PAGE using a non-reducing sample buffer. Different concentrations of DTT (0, 5, and 10 mM) were added to the protein samples before TEV protease digestion. The digested products were analyzed with and without heat treatment (TIF 91 KB)
Fig. S4 Size-exclusion chromatography of refolded SRCRD. Most of the protein was eluted as a monomer (TIF 267 KB)
Fig. S5 SDS–PAGE of SRCRD expressed by E. coli BL21(DE3). “S” represents the soluble fraction of the cell lysate. “P” denotes the precipitate fraction of the cell lysate. “R” represents the sample solubilized in refolding buffer. “D” is the precipitate formed during dialysis after refolding (TIF 200 KB)
Fig. S6 The crystals of SRCRD obtained via the inclusion body expression strategy grew in several conditions of the PEG/ion kit. The conditions that favored SRCRD crystal growth are marked by green boxes (TIF 98 KB)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, C., Lu, P., Wei, S. et al. Refolding, Crystallization, and Crystal Structure Analysis of a Scavenger Receptor Cysteine-Rich Domain of Human Salivary Agglutinin Expressed in Escherichia coli. Protein J 43, 283–297 (2024). https://doi.org/10.1007/s10930-023-10173-x
DOI: https://doi.org/10.1007/s10930-023-10173-x