The spectrum of building block conformers sustains the biophysical properties of clinically-oriented self-assembling protein nanoparticles

Histidine-rich peptides confer self-assembling properties to recombinant proteins through supramolecular coordination with divalent cations. This allows the cost-effective, large-scale generation of microscopic and macroscopic protein materials with intriguing biomedical properties. Among such materials, which result from the simple bioproduction of protein building blocks, homomeric nanoparticles are of special value as multivalent interactors and drug carriers. Interestingly, we have identified here that the assembly of a given His-tagged protein might render distinguishable categories of self-assembling protein nanoparticles. This has been scrutinized through the nanobody-containing fusion proteins EM1-GFP-H6 and A3C8-GFP-H6, whose biosynthesis results in two distinguishable populations of building blocks. In one of them, assembly and disassembly are controllable by cations. However, a second population self-assembles immediately upon purification through a non-regulatable pathway, rendering larger nanoparticles with specific biological properties. The structural analyses of both model proteins and nanoparticles revealed important conformational variability in the building blocks. This variability renders different structural and functional categories of the final soft materials, resulting from the participation of energetically unstable intermediates in the oligomerization process. These data illustrate the complexity of His-mediated protein assembly in recombinant proteins, but they also offer clues for a better design and refinement of protein-based nanomedicines, which, resulting from biological fabrication, show an architectonic flexibility unusual among biomaterials.


INTRODUCTION
Histidine-rich peptides, when genetically fused to recombinant proteins, confer on them the capability to self-assemble as different kinds of functional protein-only materials, including fibers, nanoparticles (NPs) and microparticles, through their cross-molecular coordination with divalent cations [1]. This happens because the interactivity exhibited by the hexahistidine tag (H6) and other His-rich tails with Ni2+ during the immobilized metal affinity chromatography (IMAC) purification of recombinant proteins [2,3] can also be exploited for a controlled cross-molecular assembly, by adding defined amounts of Zn2+, Ca2+, Mn2+ or other divalent cations to solutions of pure His-tagged protein [4-6]. In fact, this principle sustains the formation of the secretory granules in the mammalian endocrine system [7,8] and of different types of amyloidal and non-amyloidal protein materials existing in nature [4,7,9-16]. The simplicity of His-rich peptides used as architectonic agents at the nanoscale offers an interesting alternative to more refined approaches to controlling protein self-assembly, which might require more sophisticated protein engineering [17-21].
We have previously generated a family of recombinant modular proteins, based on an N-terminal cationic peptide and a C-terminal polyhistidine (mainly H6), which, when flanking a protein of interest, self-assemble as homomeric NPs of around 15-20 nm [22]. Since such an assembling platform is fully transversal and highly robust, and the resulting NPs are stable upon in vivo administration [23], these materials have been developed as effective nanocarriers for conventional antitumoral drugs in colorectal cancer [24], lymphoma [25,26] and acute myeloid leukemia [27]. In addition, the incorporation of proapoptotic peptides [28], toxins [29] or venoms [30] into the modular protein constructs allows the generation of cytotoxic building blocks that, if targeted through solvent-exposed ligands of tumoral markers, result in NPs with selective, built-in antitumoral activity for precision medicine [31]. The polyhistidine tail, apart from being an architectonic agent, allows the one-step purification of the recombinant building blocks from bacterial cell extracts after mechanical disruption of the cells.
During the production of these types of NPs, we have occasionally observed some structural variability in the oligomers resulting from self-assembly [32]. Although, from a production point of view, this can be simply overcome by selecting the desired material population during chromatography, the extent, causes and biological significance of such variability are not known. In a recent development of two versions of fluorescent and nanostructured nanobody fusions [33], such variability was especially apparent. Therefore, we decided to take these two proteins (A3C8-GFP-H6 and EM1-GFP-H6, Fig. 1a, b) as models to investigate the categories of particle subpopulations generated through the H6-based platform, why they assemble into disparate material versions and how they perform functionally at biological interfaces.
An ÄKTA Pure FPLC system (GE Healthcare, USA) was used for the IMAC in a His-Trap HP column (GE Healthcare, USA). Protein elution was achieved by linearly increasing the concentration of imidazole in the column through an elution buffer (20 mmol L−1 Tris-HCl, 500 mmol L−1 NaCl, 500 mmol L−1 imidazole, pH 8.0). Protein purity and integrity were assessed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), western blot (anti-His, Santa Cruz Biotechnology, USA) and matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry. Proteins were finally dialyzed against sodium bicarbonate with salt buffer (166 mmol L−1 NaHCO3, 333 mmol L−1 NaCl, pH 8.0). Protein concentration was determined by the Bradford assay, and a Nanodrop was used to measure the A280/A260 (absorbance at 280/260 nm) ratio, indicative of DNA presence in the sample.

Three-dimensional protein modeling and visualization
Structure prediction of each nanobody-based modular protein was performed to estimate the size of the building block, using the RosettaCM [35] high-resolution approach via the Robetta webserver [36]. Specifically, EM1, A3C8 and GFP-H6 domains were modelled individually and manually assembled in UCSF Chimera (v1.14) to serve as quality templates for the prediction of the whole modular construct. Parameters were set to 100 sampling models, 1 register shift, and a probability of 0.1 of sampling fragments within template regions. Best candidates were chosen by the highest confidence score and lowest error estimates. UCSF ChimeraX (v1.1) was used to visualize the three-dimensional structures [37].

Assembly and disassembly of NPs
The monomeric populations of A3C8-GFP-H6 and EM1-GFP-H6 were assembled into protein Zn NPs by the addition of 0.22 μm-filtered ZnCl2 at a 1:3 molar ratio with the histidine residues in the H6 tag. Proteins were set to 0.2 mg mL−1 and ZnCl2 was added at a final concentration of 8.5 mmol L−1 for A3C8-GFP-H6 and 8.3 mmol L−1 for EM1-GFP-H6, depending on their molecular weights. Controlled disassembly of protein Zn NPs was subsequently obtained by adding EDTA at the same molar concentration. The spontaneous protein NPs based on A3C8-GFP-H6 and EM1-GFP-H6 were disassembled by the addition of 0.5% v/v Triton X-100. Addition of 2% of the anionic detergent SDS also resulted in NP disassembly.

Physicochemical characterization
A3C8-GFP-H6 and EM1-GFP-H6, at their different oligomerization states, were analyzed by SDS-PAGE and western blot using an anti-His antibody (Santa Cruz Biotechnology, USA) to assess protein purity and integrity. The same molar amount was used for each protein. For characterization experiments, proteins were set to 0.2 mg mL−1 and pH 8.0. Dynamic light scattering (DLS) was used to determine the volume size distribution of A3C8-GFP-H6 and EM1-GFP-H6 at their different oligomerization states. Measurements were carried out in triplicate at 25°C in a Zetasizer Advanced Pro Blue (Malvern Instruments Limited, Malvern, Worcestershire, UK) at 633 nm.
Fluorescence emission spectra of all protein variants were analyzed with a Varian Cary Eclipse Fluorescence Spectrophotometer (Agilent Technologies). For tryptophan fluorescence, the excitation wavelength was set at 295 nm and the excitation and emission slits at 5 nm. Measurements were carried out in triplicate at 20°C. For GFP fluorescence, the excitation wavelength was set at 488 nm and the emission peak was recorded at 511 nm. Measurements were carried out in triplicate at 20 and 37°C, and the excitation and emission slits were set at 5 nm.
Circular dichroism (CD) measurements were made with a JASCO J-715 spectropolarimeter (JASCO, USA) using a 0.2-mm path length quartz cell. Two spectra were acquired for each protein species. Each spectrum was an average of ten scans. The scan speed was set at 50 nm min−1 with a 1 s response time. Measurements were obtained as ellipticity in millidegrees (mdeg) in the 190-260 nm region. The secondary structure content of each protein format was analyzed using the DichroWeb platform [38].

Ultrastructural characterization
Ultrastructural morphology (size and shape) of A3C8-GFP-H6 and EM1-GFP-H6 NPs with and without Zn2+ was visualized with two rapid high-resolution electron microscopy techniques. Drops of 10 μL of the four samples, resuspended in their buffer, were deposited on silicon wafers (Ted Pella Inc.) for 1 min, air dried, and immediately observed without coating in a field emission scanning electron microscope (FESEM) Merlin (Zeiss, Oberkochen, Germany) operating at 1 kV and equipped with a high-resolution in-lens secondary electron detector. Representative images of general fields and nanostructure details were collected at three magnifications (180,000×, 240,000× and 500,000×). For negative staining, drops of 10 μL of the same samples were deposited for 1 min on 200 mesh copper grids coated with carbon, contrasted with 2% uranyl acetate (Polysciences Inc.) for 1 min, air dried and observed with a transmission electron microscope (TEM) JEM-1400 (Jeol Ltd.) operating at 120 kV and equipped with a Gatan Orius SC200 CCD camera (Gatan Inc.). Representative images of general fields and nanostructure details were collected at three magnifications (5000×, 10,000× and 50,000×).

Thermal stability
DLS (using the aforementioned settings and equipment) was used to evaluate the thermal stability of A3C8-GFP-H6 and EM1-GFP-H6 at their different oligomerization states. Temperature was initially set at 25°C and progressively increased up to 90°C. To evaluate the thermal behavior of A3C8-GFP-H6 and EM1-GFP-H6 at their different oligomerization states within a range from 25 to 90°C, the center of spectral mass (CSM) was calculated as previously described [39] for analysis and comparisons. CSM is related to the relative exposure of tryptophan (Trp) residues to the solvent. A maximum red shift in the CSM of Trp is compatible with a large solvent accessibility. In contrast, a blue shift is related to a highly hydrophobic environment for Trp.
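As a hedged illustration of the CSM comparison, the following minimal Python sketch computes CSM as the intensity-weighted mean emission wavelength. The exact formula used in [39] is not reproduced in this text, so this definition is an assumption (wavenumber-based variants are also in use). The function name and the sample spectra are hypothetical and are not data from this study.

```python
def center_of_spectral_mass(wavelengths_nm, intensities):
    """Return the intensity-weighted mean wavelength (assumed CSM definition)."""
    if len(wavelengths_nm) != len(intensities):
        raise ValueError("wavelengths and intensities must have equal length")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total fluorescence intensity must be positive")
    weighted = sum(w * i for w, i in zip(wavelengths_nm, intensities))
    return weighted / total


# Hypothetical Trp emission spectra (illustrative values only).
buried = center_of_spectral_mass([320, 330, 340, 350], [2.0, 4.0, 3.0, 1.0])
exposed = center_of_spectral_mass([330, 340, 350, 360], [1.0, 3.0, 4.0, 2.0])

# A lower CSM (blue shift) suggests a more hydrophobic Trp environment,
# a higher CSM (red shift) suggests greater solvent exposure.
print(f"buried-like spectrum CSM:  {buried:.1f} nm")
print(f"exposed-like spectrum CSM: {exposed:.1f} nm")
```

Under this assumed definition, tracking CSM across the 25 to 90°C ramp would reveal a conformational transition as a shift toward longer wavelengths.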

Functional characterization
CXCR4+ human cervical adenocarcinoma cells (HeLa, ATCC, CCL-2) were used for functional characterization experiments. HeLa cells were maintained in MEM Alpha (Gibco) medium supplemented with fetal bovine serum (Gibco) at 5% CO2 in a humidified atmosphere. HeLa cells were used to test the neutralization capacity of A3C8-GFP-H6 at its different oligomerization states. They were seeded in a 96-well plate at 3.5 × 10^3 cells/well and the proteins were added after 24 h. A3C8-GFP-H6 was pre-incubated for 1 h with T22-mRTA-H6, a recombinant modular protein containing the CXCR4-targeting peptide T22 and the active site of ricin toxin, to allow potential binding and neutralization before exposure to cells. Three molar ratios of A3C8:ricin were tested (1:10, 1:1 and 10:1). The T22-mRTA-H6 concentration was set at 50 nmol L−1 in all cases. Protein mixtures were added at a final volume of 100 μL for 48 h and the CellTiter-Glo Luminescent Cell Viability Assay (Promega, USA) protocol was followed. Viabilities of cells without protein incubation and of cells treated only with 50 nmol L−1 T22-mRTA-H6 were used as controls. A Victor3 (PerkinElmer, USA) microplate reader was used for the measurements. Experiments were performed in triplicate.
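The arithmetic behind the neutralization assay can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis pipeline: the function names are hypothetical, the luminescence readings are invented, and viability is assumed to be expressed relative to untreated cells taken as 100%.

```python
TOXIN_NM = 50.0  # T22-mRTA-H6 concentration stated in the text (nmol/L)

# A3C8:ricin molar ratios from the text, as (nanobody parts, toxin parts).
RATIOS = {"1:10": (1, 10), "1:1": (1, 1), "10:1": (10, 1)}


def nanobody_concentration_nm(ratio, toxin_nm=TOXIN_NM):
    """Nanobody concentration (nmol/L) needed for a given A3C8:ricin ratio."""
    nanobody_parts, toxin_parts = ratio
    return toxin_nm * nanobody_parts / toxin_parts


def percent_viability(sample_signal, untreated_signal):
    """Luminescence of a treated well as a percentage of untreated cells."""
    if untreated_signal <= 0:
        raise ValueError("untreated control signal must be positive")
    return 100.0 * sample_signal / untreated_signal


for label, ratio in RATIOS.items():
    print(f"A3C8:ricin {label} -> {nanobody_concentration_nm(ratio):g} nmol/L A3C8")

# Hypothetical luminescence readings (arbitrary units, not study data).
print(f"viability: {percent_viability(4500.0, 9000.0):.1f}%")
```

With the toxin fixed at 50 nmol L−1, the three ratios correspond to 5, 50 and 500 nmol L−1 of the nanobody fusion.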

Statistical analysis
Data are presented as mean ± standard error of the mean. For the neutralization assay data, a Shapiro-Wilk test was performed to check normality. Significant differences among means were identified through a one-way ANOVA test. GraphPad Prism was used for statistical tests.
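For readers who wish to reproduce the summary statistics outside GraphPad Prism, the sketch below shows how the mean ± standard error and the one-way ANOVA F statistic are obtained, using only the Python standard library. It is an illustrative assumption of the workflow, not the software used in this work. The function names and the triplicate values are hypothetical, and the Shapiro-Wilk test and the p-value are intentionally omitted.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group of replicates."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of replicates around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate viabilities (%) for three conditions.
groups = [[52.0, 55.0, 50.0], [70.0, 73.0, 68.0], [91.0, 95.0, 90.0]]
for g in groups:
    m, s = mean_sem(g)
    print(f"{m:.1f} ± {s:.1f}")
print("F, df_between, df_within:", one_way_anova_f(groups))
```

The F statistic would then be compared against the F distribution with the reported degrees of freedom to obtain a p-value, for instance with a statistics package.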

RESULTS AND DISCUSSION
The IMAC purification of A3C8-GFP-H6 and EM1-GFP-H6 from cell extracts of the producing bacteria resulted, in both cases, in two distinguishable pure protein peaks, sequentially released from the columns because of a differential affinity of the proteins for the immobilized Ni2+. Both proteins showed a similar distribution in the chromatogram, with an initial peak estimated to represent 86%-87% of the total protein amount and a final peak accounting for the remaining 13%-14% of the protein population. The DLS of these samples revealed, for both proteins, that the first eluted protein peak presumably corresponded to monomers (Ms), while the protein sample released at higher imidazole concentrations corresponded to larger materials (Fig. 1c). The monomer-containing samples showed a hydrodynamic size of around 6-7 nm by DLS (Fig. 1d, e), a dimension compatible with those estimated in a plain modeling of both proteins, in which a certain degree of flexibility is expected between the nanobody and the GFP modules (Fig. 1b). The larger materials, eluted at later elution volumes, showed a mean size slightly above 100 nm (Fig. 1e), with a mode peak size around 70 nm (Fig. 1d). The fact that such large entities were eluted from IMAC at high concentrations of imidazole was indicative of a tighter attachment to the column than that of the monomeric versions (Fig. 1c). The polydispersity indexes of the protein in these slowly eluted fractions were relatively low (Fig. 1e), indicating that the materials are not mere protein aggregates but have some extent of regular organization at the nanoscale. Importantly, only negligible traces of nucleic acids were detected associated with the protein in each of these samples according to the A280/A260 ratio [40] (Fig. 1e); abundant nucleic acids might otherwise have accounted for the formation of nanoscale protein-DNA complexes through cationic protein stretches.
Both models, namely A3C8-GFP-H6 and EM1-GFP-H6, showed a similar profile when determining all these parameters.
As expected for an H6-tagged protein [41,42], the addition of equimolar amounts of Zn2+ to the monomeric protein versions promoted a shift in the DLS plots from 6.2-6.7 to 10.2-10.5 nm, indicative of protein assembly into NPs (Zn NPs, Fig. 2a). Also as expected, this size increase was reversed by the addition of EDTA, since metal chelation is able to disassemble multimeric protein materials organized through cross-molecular interactions with metals (Fig. 2b) [1]. In contrast, the pre-formed protein NPs peaking at 70 nm were not disassembled by EDTA or by imidazole (Fig. 2c and data not shown), even at concentrations higher than those used to disassemble the small NPs (Fig. 2b and data not shown). This finding indicated that forces other than divalent cation coordination were supporting the organization of this more complex material. For other types of related protein-only NPs, hydrogen bonds, van der Waals and especially electrostatic interactions were predicted to act, promoting and maintaining protein-protein contacts [23], in a similar way to that in which monomers are kept together in viral capsids [43-45]. In this context, the addition of 0.5% Triton X-100 reduced the size of the materials from more than 100 nm (mean) to 10 nm (Fig. 2a), a value very similar to that exhibited by the small Zn NPs. This disassembly process appeared to be incomplete, since forms with a size compatible with that expected for monomeric versions were not observed in the presence of the detergent (Fig. 2a). Therefore, NPs and Zn NPs were clearly distinguishable materials regarding the interactions that regulate oligomerization.

SCIENCE CHINA Materials
ARTICLES

In all these samples and under the tested assembling and disassembling conditions, the proteins remained proteolytically stable (Fig. 2d, e), with only signs of minor and partial proteolysis in the NP versions of both fusion proteins. The GFP fluorescence emission was similar in all the constructs, and slightly lower in the case of both NP versions (Fig. 2f). The Zn-mediated oligomerization did not disturb the GFP emission; in fact, Zn NPs tended to be more fluorescent than the Ms versions (Fig. 2f).
To comparatively evaluate the ultrastructural morphometry of NPs and Zn NPs, both types of materials made of the two alternative constructs were examined by TEM and FESEM (Fig. 3). These high-resolution imaging techniques showed clear size and shape differences between NPs and Zn NPs. When exploring different magnifications and field broadness, we observed structured materials in both cases, with sizes fully compatible with those determined by DLS. The smaller Zn NPs appeared as more regular, dense structures than the larger NPs, which were still regular but showed important deformability. Apparently, protein building blocks in NPs might be organized into toroidal NPs with empty cores, while the modular proteins clustered by Zn showed a rather spherical organization. The higher amount of bulk material in NPs should result from a higher number of building blocks arranged in each individual item. If so, we should expect a more intense reactivity of the nanobody moiety, provided it is solvent-exposed, when compared with Zn NPs or with Ms. Since the nanobody A3C8, as a modular GFP-containing protein, binds the plant toxin ricin [33], highly loaded A3C8 NPs should more efficiently inactivate this toxin, which has a very strong biological activity on mammalian cells [34,46,47]. To assess this hypothesis, the different versions of A3C8-GFP-H6 generated here were incubated with a biologically active recombinant ricin that had been developed for cancer therapies [34]. When incubating this ricin-based recombinant protein with an equimolar or ten-fold molar amount of the nanostructured nanobody (A3C8 NPs), its toxicity was largely abolished. This effect occurred to a lesser extent when using Zn NPs and even less when using Ms (Fig. 4a). This observation was in agreement with the aforementioned multivalence hypothesis, and it demonstrated that at least part of the nanobody moieties are available for interaction in NPs as well as in Zn NPs.
In addition, these data strongly suggested that the higher multivalency in NPs is more efficient in neutralizing the toxin than that of the smaller Zn NPs and the Ms versions, which would probably occur because of the clustering of a higher number of toxin ligands in NPs, at a higher density than in Zn NPs or Ms.
At this point, the data demonstrated that Zn NPs (with regulatable assembly and disassembly) and NPs (spontaneously formed, non-regulatable) were materials with distinguishable biophysical properties, despite being formed by the same building block polypeptide. This was shown with two model nanobody-containing proteins, but such variability had also been observed internally in the laboratory when developing other protein building blocks (not shown). Since these materials are formed by recombinant proteins bioproduced in bacteria, conformational variability in these proteins could account for dissimilar building block populations, even with the same primary structure. In fact, the conformational variability of recombinant proteins is a well-recognized event in multiple models and production systems [48-55], but the impact of such diversity on self-assembly into protein materials has so far remained unexplored. In this context, intrinsic fluorescence analysis demonstrated that tryptophan residues sense a highly hydrophobic environment in both types of NPs, as the fluorescence peaks move to lower wavelengths with respect to Ms (Fig. 4b). This result confirmed the oligomerization states of both types of oligomers. On the other hand, the CD spectra of the whole set of A3C8-GFP-H6 and EM1-GFP-H6 variants (Ms, Zn NPs and NPs) suggested structural similarities between Ms and Zn NPs and also common differences between this pair and NPs (Fig. 4c). A higher content of β-sheet secondary structure was observed in Ms and the derived Zn NPs in comparison with NPs (Fig. 4d). These data, together with the different interaction types involved in the disassembly (Fig. 2a, b), again strongly supported the concept that NPs and Zn NPs derive from structurally different building blocks resulting from bacterial production, which, despite having the same amino acid sequence, fold into alternative conformations. In this same context, the mild proteolysis observed in these proteins (Fig. 2d) again discriminated between the proteins forming Zn NPs and NPs in terms of sensitivity, especially in the case of the EM1 fusion. Importantly, protein conformation is a well-known determinant of proteolytic susceptibility [56-59].
The DLS and CSM analyses of the proteins revealed that both NPs and Zn NPs were structurally more robust than plain Ms (Fig. 5). While both types of oligomers remained stable over the whole tested temperature range, the monomeric versions of both proteins underwent a conformational conversion (more evident for A3C8-GFP-H6) between 60 and 80°C (Fig. 5a, b). Thus, irrespective of the precise conformational states of the building blocks in the NPs, which result in disparate morphologies and interactivities (Figs 3 and 4a), the oligomeric organization proves to be protective against thermal stress.
Such a stabilizing property, which is linked to the formation of supramolecular complexes, prompts the development of transversal assembling platforms that, like those studied here, would allow the organization of functional proteins with therapeutic potential into nanoscale multimeric materials. In particular, the use of histidine-rich peptides as architectonic agents benefits from the simplicity of the engineering method, which is based on the addition of an H6 or related peptide to the protein of interest. Since H6 also allows simple, one-step purification of any protein by IMAC [2,3,60,61], the dual role of the peptide benefits both the bioproduction and purification side and the nanofabrication process itself, through self-assembly. In contrast to other more refined protein-specific engineering approaches [17-20,62-64], the protein self-assembling process promoted by H6 tails can be universally applied to any protein of clinical interest. The data shown here indicate that the recombinant production of distinct His-tagged nanobody-GFP fusions results in at least two distinguishable conformers of the protein that assemble into two categories of NPs. One of the conformers, enriched in β-sheet secondary structure, is found in a monomeric form upon IMAC protein purification. Its assembly is finely controlled by externally added Zn2+ ions and its disassembly by the addition of a divalent cation chelator such as EDTA (Fig. 2). On the other hand, the alternative conformer, with low β-sheet secondary structure, spontaneously assembles during elution from IMAC columns into larger NPs with a regular toroid-like morphology and an important extent of mechanical flexibility. This NP version, compared with the Zn NP, shows an enhanced capability to neutralize a target ricin (Fig. 4a). This observation demonstrates the full functionality of the nanobodies and also that local ligand concentrations in NPs are higher than in Zn NPs, which favors the neutralization of the toxin. Importantly, the ligand is active in both nanomaterials as well as in the monomeric form.

Figure 4 Functional and structural comparison. (a) Neutralization assays of T22-mRTA-H6 with A3C8-GFP-H6 in its three different oligomerization states. The Y-axis represents HeLa CXCR4+ cell viability as a percentage (100% being the cell viability without protein incubation). T22-mRTA-H6 at 50 nmol L−1 was pre-incubated for 1 h with A3C8-GFP-H6 at three different molar ratios (10:1, 1:1, 1:10) before being administered for 48 h to HeLa cells. The horizontal green line represents HeLa survival when exposed to T22-mRTA-H6 for 48 h at 50 nmol L−1. Significant differences are shown as * (p < 0.05), in black for differences with non-exposed cells (100%), in green for differences with cells exposed to T22-mRTA-H6 alone and in dark red for differences with cells exposed to (T22-mRTA-H6:A3C8-GFP-H6 NPs) at the same molar ratio.
The origin of both populations (monomers and spontaneously formed NPs) is determined by the genetic cell background and culture conditions of the recombinant production in E. coli cells and by a differential folding pattern followed by each protein fraction, rendering different conformers. The structural analysis of the NPs and the unassembled monomer (Fig. 4) strongly suggests that the larger NPs are formed by a conformer that might come from an unstable monomeric intermediate in the oligomerization process, represented as an intermediate structure (IS) in the scheme (Fig. 5c, d). Also, the similarity in size between Zn NPs and Triton X-100-disrupted NPs (Fig. 2) might indicate that Zn NPs are similar or structurally close to an oligomeric intermediate in the fast generation of large NPs (Fig. 5d). Monomers can assemble into NPs when Zn is added. The shift between monomers and the spontaneously formed NPs may be a thermodynamic transition, which is observed for A3C8-GFP-H6 but not for EM1-GFP-H6 in the selected range of temperatures (Fig. 5a).
The simplicity of the oligomerization system based on H6 and related humanized His-rich peptides [65] makes it straightforwardly applicable to the construction of protein materials with full biocompatibility and multiple clinical uses [1], including nonreactive materials [66], drug vehicles [24] or nanoscale drugs with built-in therapeutic activities [67]. However, the data presented here also reveal a biophysical variability in the resulting materials linked to the conformational spectrum of the recombinant proteins, which is specific to the type of soft material that they generate. This concept, irrelevant in the use of more rigid building blocks for material design, might be highly relevant in the emerging field of protein-based materials for clinical applications, which is seen as a broad and promising technology for a diversity of therapeutic applications [6,17-21,62]. Since this is an issue with regulatory implications, awareness of such potentially disparate assembly makes possible a proper quality control of a clinically oriented product by a simple separation of the relevant peaks during the chromatographic step of the production process.