From isolated polyelectrolytes to star-like assemblies: the role of sequence heterogeneity on the statistical structure of the intrinsically disordered neurofilament-low tail domain

Abstract
Intrinsically disordered proteins (IDPs) are a subset of proteins that lack stable secondary structure. Given their polymeric nature, mean-field approximations have previously been used to describe the statistical structure of IDPs. However, amino-acid sequence heterogeneity and complex intermolecular interaction networks have significantly impeded the ability to obtain accurate approximations. One such case is the intrinsically disordered tail domain of neurofilament low (NFLt), which comprises a 50-residue-long uncharged domain followed by a 96-residue-long negatively charged domain. Here, we measure two NFLt variants to identify the impact of the two main NFLt subdomains on its complex interactions and statistical structure. Using synchrotron small-angle X-ray scattering (SAXS), we find that the uncharged domain of the NFLt induces attractive interactions that cause it to self-assemble into star-like polymer brushes. On the other hand, when the uncharged domain is truncated, the remaining charged C-terminal domains remain isolated in solution with typical polyelectrolyte characteristics. We further discuss how competing long- and short-ranged interactions within the polymer brushes dominate their ensemble structure and, in turn, their implications for previously observed phenomena in NFL native and diseased states.

Graphic abstract
Visual schematic of the SAXS measurement results of the Neurofilament-low tail domain IDP (NFLt). NFLts assemble into star-like brushes through their hydrophobic N-terminal domains (marked in blue). With increasing salinity, the brush height (h) first increases and then decreases, while the assembly gains additional tails. Isolating the charged sub-domain of the NFLt (marked in red) results in isolated polyelectrolytes.

Supplementary Information
The online version contains supplementary material available at 10.1140/epje/s10189-024-00409-8.


Introduction
Intrinsically disordered proteins (IDPs) are a subset of proteins that, instead of forming a rigid singular structure, fluctuate between different conformations in their native form [1,2]. Nonetheless, IDPs serve significant biological functions and account for about 44% of the human genome [3]. The lack of fixed structure provides IDPs many advantages in regulatory systems in which they often play a crucial role in mediating protein interaction [4,5]. These roles often come into play from intrinsically disordered regions (IDRs) of folded proteins interacting with other IDRs. For example, in the Neurofilament proteins, tails emanating from the self-assembled filament backbone domains bind together and form a network of filaments [6-10].
The ensemble statistics of IDPs stem from their sequence composition and the surrounding solution [2]. For example, previous studies showed that IDPs comprising mostly negatively charged amino acids (polyelectrolytes) are locally stretched due to electrostatic repulsion between the monomers [11]. Moreover, different properties, such as hydrophobicity, were shown to be linked with local IDP domain collapse [12]. The complex interactions that arise from sequence heterogeneity allow IDPs to form specific complexes without losing their disordered properties [13]. For example, Khatun et al. recently showed how, under limited conditions, the human amylin protein self-assembles into fractal structures [14].
As IDPs are disordered chains, polymer theories are prime candidates to relate the measured structural statistics to known models, which can help link the sequence composition of the IDP to its conformations [15-18]. Specifically, polymer scaling theories allow us to derive the statistical structure of IDPs given sequence-derived parameters, such as charge density and hydrophobicity [11,12,19-21]. However, due to the heterogeneity of the IDP primary structure (i.e., the amino acid sequence), some systems showed contradictions with the behavior theorized by standard heterogeneous polymer physics [17,19,22-24].
The unique biological properties of IDPs have given rise to numerous attempts to use them as building blocks for self-assembled structures [25]. For example, IDPs were proposed as brush-like surface modifiers, due to their enhanced structural plasticity in response to environmental conditions [26,27]. Another example of an IDP brush system is the Neurofilament (NF) protein system [6,28,29], described as interacting bottle-brushes. NF subunit proteins form mature filaments with protruding disordered C-terminal IDRs known as 'tails.' NF tails were shown to mediate NF network formation and act as shock absorbers in high-stress conditions [29]. Moreover, NF aggregates are known to accumulate alongside other proteins in several neurodegenerative diseases, such as Alzheimer's, Parkinson's, etc. [30].
The NF low disordered tail domain (NFLt) sequence can be divided into two unique regions: an uncharged region (residues 1-50) starting from its N terminus and a negatively charged region comprising the remaining 96 residues. The NFLt can be described as a polyelectrolyte with a net charge per residue (NCPR) of -0.24. Furthermore, the statistical structures of segments within the NFLt are influenced by the number, type, and distribution of the charged amino acids within a segment [22]. Nonetheless, other structural constraints, particularly long-range contacts, impact the local statistical structures. Additionally, NFLt was shown to have glassy dynamics in response to tension [31]. Such dynamics were associated with multiple weakly interacting domains and structural heterogeneity.
In this paper, we revisit NFLt as a model system for charged IDPs and focus on the contribution of its neutral and hydrophobic N-terminal domain. We will show that increased salt concentration causes NFLt to form star-like brushes with increased aggregation number (Z). Here, we are motivated by theoretical models, in particular Pincus' model for salted polyelectrolytes [32], that capture key physical properties of IDPs, including the model system presented here [26,29,33]. We will further quantify the competition between hydrophobic attraction and electrostatic and steric repulsion in the formation of the structures of NFLt.

Results
To study the N-terminal domain contribution to the structure of NFLt, we designed two variants and measured them under various buffer conditions. The first construct is the entire 146-residue NFLt chain, which we term WT (NCPR = -0.24), and the second isolates the 104 negatively charged residues at the C-terminal end of NFLt (NCPR = -0.33), termed ∆N42. We expressed the variants in E. coli and purified them to up to 96% (see Methods).
We assessed the variants in solution using small-angle X-ray scattering (SAXS), a technique extensively used to characterize the statistical structures of IDPs [34]. From the raw SAXS data, measured at various salinities, we can already find marked structural differences between the two variants (Fig. 1a). Predominantly in the low wave-vector (q) region, the WT variant scattering (I) rises with added NaCl salt. Such an increase at low q implies high-molecular-mass particles due to aggregation of the WT variant.
In contrast, ∆N42 shows a separated Gaussian polymer profile (Figs. 1a, S1), nearly insensitive to total salinity (C_s = 20-520 mM). Similarly, the data presented in Kratky format (q²I vs. q, Fig. 1b) show that ∆N42 has the signature of a disordered polymer. In contrast, the WT variant, in particular at high salinity, has a combination of a collapsed domain (the peaks below q = 0.25 nm⁻¹) and a disordered polymeric structure (the scattering rise at higher q, Fig. 1b).

Being completely disordered, ∆N42 lacks a stable structure and can be described using a statistical ensemble of polymeric conformations [35] (Eq. 1). Here, I_0 is the scattering at q = 0, ν is the Flory scaling exponent, and R_G is the radius of gyration defined by Eq. 2, where γ = 1.615 and b = 0.55 nm (see [35]), and the analysis is viable up to qR_G ∼ 2 (Figs. S2, S3). In all ∆N42 cases, the scattering profile fits Eq. 1 with ν ranging between 0.63 and 0.69, depending on the buffer salinity (Table S1). In 'infinite dilution' conditions (zero polymer concentration), we find that ν decreases monotonically from 0.73 to 0.62 with added salt (Table S2).
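For orientation, the two relations referred to in the text as Eq. 1 and Eq. 2 can be written as below. This is a sketch that assumes the extended Guinier form of Ref. [35]: the numerical coefficients 0.0479 and 0.212 are taken from that form and are not stated in the text here, and N (the number of residues in the chain) is the chain length used later in the Methods. The symbols I_0, ν, R_G, γ, and b are those defined above.

```latex
\begin{align}
I(q) &\simeq I_0 \exp\!\left[-\tfrac{1}{3}\,(qR_G)^2
        + 0.0479\,(\nu - 0.212)\,(qR_G)^4\right], \tag{1}\\[4pt]
R_G &= \sqrt{\frac{\gamma(\gamma+1)}{2\,(\gamma+2\nu)\,(\gamma+2\nu+1)}}\;\, b\,N^{\nu}. \tag{2}
\end{align}
```

In this form, Eq. 2 ties R_G to ν, so a fit of Eq. 1 has effectively two free parameters, I_0 and ν.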
Given the noticeable aggregation for the WT variant, alternative form factors were considered to match the scattering profiles (lines in Fig. 1). The absence of structural motifs at high q values (q > 0.3 nm⁻¹) indicates a disordered nature for WT at shorter length scales. Conversely, in the lower q region (q < 0.3 nm⁻¹), the scattering suggests stable structural motifs or larger-molecular-weight particles. Such SAXS profiles resemble those of self-assembled decorated spherical micelles [36]. Variations of micelle models are shown to fit the data (Figs. 1, S4-S6). A sufficiently low aggregation number and core size distill the description of the spherical micelle into a 'star-like' brush. Alternative attempts to fit the scattering profiles to other form-factor models, including vesicles and lamellae, were unsuccessful.
For the star-like model, the aggregated variants form a small spherical core of volume V_core made out of n · Z monomers (a comparison with different cores is described in [37] and in Fig. S4), where n denotes the peptide length per polypeptide within the core, and Z is the aggregation number, i.e., the number of polypeptides per 'star.' The remainder of the WT variant then protrudes from the core as the star polymer brush (Figs. 2a, S4-S6). The star-like scattering form factor is described as a combination of four terms [36]: the self-correlation term of the core F_c, the self-correlation term of the tails F_t, the cross-correlation term of the core and the tails S_ct, and the cross-correlation term of the tails S_tt (Eq. 3).

Fig. 2 a. Schematic of the system's structure variation with salinity (Cs). While ∆N42 remains disordered and segregated, the WT variant aggregates into a star-like polymer with a higher aggregation number at higher Cs. b-e. Structural parameters for WT (blue symbols) and ∆N42 (red symbols) variants extracted from fitting the SAXS data. Full and hollow circles represent the spherical and cylindrical core fitted parameters, respectively. d. In all cases, the brush heights (h) are much larger than the corresponding grafting length (ρ), indicative of a brush regime. e. The structurally intrinsically disordered ∆N42 variant compacts with higher Cs values and remains more compact than the projected brushes of the WT variant. All values are the extrapolated 'zero concentration' fitting parameters (see Fig. S7).

Here, β_c and β_t are the excess scattering lengths of the core and the tails, respectively. From fitting the scattering data, we extracted the height of the tails h = 2R_G, the aggregation number Z, and the relevant core parameters (e.g., core radius R for a spherical core, cylinder radius R and length L for a cylindrical core [37]), schematically illustrated in Fig. 2a. All fitting parameters are found in Table S3.
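The four-term combination described in the text as Eq. 3 is commonly written as follows. This is a sketch that assumes the standard micelle form factor of Ref. [36] expressed with the symbols used here; the Z-dependent prefactors are quoted from that standard form and are not spelled out in the text.

```latex
\begin{equation}
I(q) = Z^{2}\beta_c^{2}\,F_c(q) + Z\,\beta_t^{2}\,F_t(q)
     + 2Z^{2}\beta_c\beta_t\,S_{ct}(q) + Z(Z-1)\,\beta_t^{2}\,S_{tt}(q). \tag{3}
\end{equation}
```

Under this form, the core term and the core-tail term scale as Z², while the tail self-term scales as Z, which is why the low-q intensity is so sensitive to the aggregation number.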
To avoid misinterpretation and to minimize intermolecular interaction effects, we present the fitting results at the 'infinitely diluted regime' by extrapolating the relevant parameters measured at various protein concentrations to that at zero protein concentration (Fig. S7, Table S4).The parameters are mostly independent of the concentration unless explicitly mentioned.
At low salinity (20 mM), the aggregation number of the WT variant corresponds to a dimer (Z ≈ 2), and the core's shape is that of a cylinder (with a radius R = 0.89 nm and length L = 1.19 nm). At higher salt conditions (170-520 mM), the form factor fits spherical core aggregates with increasingly higher Z values (Fig. 2a).
Given the relatively small core volume (V_core ≈ 1-2 nm³, Fig. 2c), it is crucial to evaluate the 'grafting' distance between neighboring chains, ρ, on the core surface (S = 4πR² = Zρ²) and the brush extension, h, outside the core. As shown in Fig. 2d, in all cases h/ρ ≫ 1, indicating a 'brush regime' where neighboring chains repel each other while extending the tail's height [38].
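The grafting-distance relation above is simple enough to check by hand; a minimal Python sketch is given here for convenience. The function names and the numerical inputs are hypothetical and chosen only for illustration; they are not fitted values from this work.

```python
import math


def grafting_distance(core_radius_nm: float, aggregation_number: float) -> float:
    """Mean distance rho between grafted chains on a spherical core.

    Uses S = 4*pi*R^2 = Z*rho^2, i.e. rho = sqrt(4*pi*R^2 / Z).
    """
    if core_radius_nm <= 0 or aggregation_number <= 0:
        raise ValueError("core radius and aggregation number must be positive")
    surface_nm2 = 4.0 * math.pi * core_radius_nm ** 2
    return math.sqrt(surface_nm2 / aggregation_number)


def stretching_ratio(brush_height_nm: float, rho_nm: float) -> float:
    """Ratio h/rho; values much larger than 1 indicate the brush regime."""
    return brush_height_nm / rho_nm


# Hypothetical inputs for illustration only (not fitted values from the paper).
R_nm, Z, h_nm = 0.6, 3.0, 8.0
rho_nm = grafting_distance(R_nm, Z)
print(f"rho = {rho_nm:.2f} nm, h/rho = {stretching_ratio(h_nm, rho_nm):.1f}")
```

Because ρ scales as R/√Z, a small core with a growing aggregation number quickly pushes the system deeper into the brush regime.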
The repulsion between the grafted tails is further emphasized when comparing h/2 for WT to the equivalent ∆N42 length scale (R_G), showing a significant extension for WT (Fig. 2e). We notice that the WT tail length (h) increases at low salt (during the transition from a dimer to a trimer), followed by a steady, mild decrease as C_s, and consequently Z, increase. Similar compaction with increasing C_s is shown for ∆N42 and is expected for polyelectrolytes due to the reduction in electrostatic repulsion [39].

To better compare the statistical structure of the two variants' disordered regions, we followed the polymeric scaling notation ν that quantifies the compactness of the chain. For ∆N42, we extracted ν from Eqs. 1 and 2 and found a significant decrease in its value as 50 mM of NaCl is added to the 20 mM Tris buffer (Fig. 3a). The following monotonic decline is in line with polyelectrolyte models and electrostatic screening effects [40], shown as a solid red line in Fig. 3a. Interestingly, previous measurements of segments within the NFLt charged domain were shown to have ν values similar to those of ∆N42. However, the same decline with salinity was not observed (Fig. 3a) [22].

For the WT variant, the scaling exponent (ν) of the 'star-like polymer' brushes is extracted from Eq. 2. Here, we use R_G = h/2, where h is obtained from Eq. 3. For C_s = 20 mM, we find that ν is of a similar scale as for ∆N42. This similarity can be attributed to the nature of the dimer, where the intramolecular electrostatic interactions dominate the expansion of each of the two tails. As C_s increases by 150 mM, ν exhibits a considerable increase, presumably due to neighboring tail repulsion. Above C_s = 170 mM, ν shows a weak decrease. We attribute this weak decline to the salt-brush regime of polyelectrolyte brushes [41], shown as a solid blue line in Fig. 3a. In this regime, h ∝ C_s^(-1/3), and subsequently ν ∝ -(1/3) log(C_s).

We note that the cores of the star-like polymers are relatively small and that each polypeptide aggregates through only a few, most likely hydrophobic, amino acids. From the tabulated amino-acid partial volume, ⟨ϕ_aa⟩ [42], we approximate the constituent amino acids as spheres of volume ⟨ϕ_aa⟩. From here, the average number of amino acids per polypeptide inside the core is estimated by the number of spheres that can fit within the core volume, divided by the aggregation number: n = V_core/(⟨ϕ_aa⟩ · Z). Noticeably, our fits yield small n values, ranging from ∼7 to ∼2 residues on average within the aggregate ensemble, depending on the buffer salinity. Attempting to 'fix' n to a larger constant number of residues per tail results in a poorer fit (Fig. S9). In Fig. 3b, we indeed see that the most significant change occurs in the low-salt regime, where n drops from an average of 7 to 3 amino acids (C_s = 20 and 170 mM, respectively). Such behavior is known to occur within globular proteins [43] and was recently suggested to affect IDPs [44]. The following trend is a further decrease in n, albeit much weaker, which results in a final average n of about two as the salinity reaches C_s = 520 mM.
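The estimate n = V_core/(⟨ϕ_aa⟩ · Z) described above can be sketched in a few lines of Python. The function name and the example inputs are hypothetical; in particular, the residue volume used in the demonstration is an illustrative placeholder, not the tabulated value from Ref. [42].

```python
def residues_per_chain_in_core(core_volume_nm3: float,
                               residue_volume_nm3: float,
                               aggregation_number: float) -> float:
    """Average number of residues each chain contributes to the core.

    Implements n = V_core / (<phi_aa> * Z), treating residues as spheres
    of volume <phi_aa> that fill the core volume.
    """
    if residue_volume_nm3 <= 0 or aggregation_number <= 0:
        raise ValueError("residue volume and aggregation number must be positive")
    return core_volume_nm3 / (residue_volume_nm3 * aggregation_number)


# Hypothetical inputs for illustration only (placeholder residue volume).
n = residues_per_chain_in_core(core_volume_nm3=1.5,
                               residue_volume_nm3=0.13,
                               aggregation_number=3.0)
print(f"n = {n:.1f} residues per chain in the core")
```

At a fixed core volume, n falls inversely with Z, so a nearly constant V_core combined with a growing aggregation number implies fewer anchoring residues per chain.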
Last, in Fig. 4, we quantify the intermolecular interactions by evaluating the second virial coefficient, A_2, using a Zimm analysis [45] (Table S5). Here, A_2 describes the deviation of the statistical ensemble from an ideal gas. In agreement with our previous data, we find that the intermolecular interactions of ∆N42 change from repulsive (A_2 > 0) to weakly attractive (A_2 ≤ 0) as the salinity increases. In contrast, for WT, A_2 changes from a nearly neutral state of intermolecular interactions (i.e., the ideal gas regime) to mildly attractive (A_2 < 0). These findings are reflected in the dependence of each variant's Flory exponent ν on protein concentration. While at the lowest salinity ∆N42 expands as the protein concentration is decreased, for higher salinities and for the WT measurements ν remains primarily unchanged (Fig. S8a).
Combining our results for both variants, we find an exemplary role of long-range electrostatic interactions in tuning the statistical structure of IDPs. Without the uncharged N-terminal domain, the NFLt exhibited a significant change as the electrostatic interactions were screened, causing the chains to condense further. In contrast, the presence of the uncharged domain induced aggregation of the proteins, bringing the tails much closer to each other. The increase in proximity was reflected in a significantly larger expansion than in the truncated variant, together with a much weaker contraction with salinity.

Discussion and Conclusions
We investigated the effects of sequence heterogeneity on the interactions of NFLt, an IDP model system. For NFLt, the N-terminal region consisting of the first ∼50 residues is hydrophobic and charge-neutral, while the remaining chain is highly charged. We found that the sequence heterogeneity differentiates between the structures of the entire WT NFLt and a variant lacking the N-terminal domain. In particular, the WT variant self-assembles into star-like structures while the ∆N42 variant remains isolated in all measured cases.
Since ∆N42 can be regarded as a charged polymer, weakly attractive interactions take center stage as the electrostatic repulsion diminishes with charge screening (Fig. 4). These interactions could be attributed to monomer-monomer attractions that arise from the sequence heterogeneity of the IDP, such as weak hydrophobic attraction from scattered hydropathic sites [22,28,29,46-48].

Fig. 4 The osmotic second virial coefficient A_2 of the two variants as a function of salinity (Cs). ∆N42 intermolecular interactions transition from repulsive to attractive as Cs increases. WT changes from a nearly neutral state of intermolecular interactions to attractive. Inset: A demonstration (WT variant, 20 mM Tris and 500 mM NaCl, pH 8.0) of the Zimm analysis used to extract A_2 from SAXS data measured at various protein concentrations (C). Values shown in the graph are in mg/ml units. The dashed lines show the extrapolation from the measured data (colored lines) to the fitted q → 0 and C → 0 yellow lines, where α = 0.01 is an arbitrary constant used in this analysis.
For the WT variant, the intermolecular interactions started from a near-neutral state and transitioned to weakly attractive. However, as the WT measurements describe self-assembling complexes, the interpretation of these results differs from that for ∆N42. As such, we interpret the intermolecular interactions as the 'aggregation propensity,' i.e., the protein complex's ability to grow. The aggregation propensity grows as the attraction between the complex and the other polypeptides in the solution increases. This behavior can be observed when examining the responsiveness of the aggregation number Z to protein concentration C (Fig. S7). At the lowest measured salinity, the dependence of Z on protein concentration was minimal. As the screening increases, this dependence becomes more substantial. This characterization is also found in folded proteins, where intermolecular interactions were shown to indicate aggregation propensity [49]. The increased intermolecular attraction induced at increasing salinity is indicative of a salting-out phenomenon [50,51], although further investigation at higher salinity is needed.
The stability of the star-like polymer core should be evaluated by the participating residues per polypeptide (n). Indeed, while our fits yield rather small n values, the SAXS signal at low q is dominated by aggregated structures under all salinity conditions. Within the occurring hydrophobic interactions, the release of bound water molecules and ions from the polypeptides is likely to contribute to the core's stability. Such entropy-based effects have been observed in similar processes such as protein flocculation [52,53] and in temperature-specific IDP binding modulation [54].
In our previous study [22], Flory exponents (ν) of shorter segments from the same NFLt were measured independently and in the context of the whole NFLt using SAXS and time-resolved Förster resonance energy transfer (trFRET). There, regardless of the peptide sequence, in the context of the entire NFLt, the segments' structural statistics were more expanded (i.e., with larger ν values) than when measured independently. Similarly, these short segments measured with SAXS have smaller ν values (i.e., a more compact statistical structure) than those measured here for ∆N42 in all salt conditions (Fig. 3a, grey symbols).
The expansion of segments in the context of a longer chain corroborates that long-range contacts contribute to the overall disordered ensemble [22]. Interestingly, at C_s = 520 mM salinity, we found similar ν values for ∆N42 and the previous short-segment measurements, indicating a comparable expansion. We suggest that at higher salinities, the significance of electrostatic long-range contacts diminishes, aligning the expansion 'scaling laws' regardless of the chain length. Importantly, comparing the previous segment measurements to our ∆N42 results (rather than to the WT results) is more suitable, as the chains did not aggregate in those cases.
Compared to ∆N42, WT exhibits a mild contraction with salt, resembling the behavior of the 'salt-brush' regime observed in polyelectrolyte brushes, as demonstrated in Fig. 3. Similar salt-brush behavior was previously observed in Neurofilament high tail domain brushes grafted onto a substrate [26], and in a recent polyelectrolyte brush scaling theory [55]. In the salt-brush regime, Pincus showed that brush mechanics resemble those of neutral brushes, determined by steric inter-chain interactions [32]. In this interpretation, the effective excluded volume per monomer enlarges and is proportional to 1/κ_s², where κ_s is the inverse Debye length attributed to the added salt. Consequently, we suggest that the heightened charge screening in the WT solution allows steric interactions between brushes to play a more significant role in determining the brush ensemble. Additionally, we deduce that the increased prevalence of steric repulsion counteracts the attractive forces responsible for aggregation, thereby preventing brush collapse.
The NFLt contraction aligns with previous studies of native NFL hydrogel networks [28,29].At high osmotic pressure, the NFL network showed weak responsiveness to salinity higher than C s = 100 mM, in agreement with theory [55].With the observed salt-brush behavior for WT, we suggest that weak salt response in NFL hydrogels coincides with the increase in steric repulsion shown for the star-like structures (Fig. 3a, blue line).
Additionally, our measurements show that the hydrophobic N-terminal region of the NFLt domain aggregates. This result is consistent with the findings of Morgan et al. [31], where single-molecule pulling experiments were performed on WT NFLt, and slow aging effects were observed, likely due to collapse (and potential aggregation) of the neutral domain. Indeed, follow-up studies by Truong et al. [56] used single-molecule stretching to show that added denaturant led to a swelling of the chain (increased ν), demonstrating that the WT chain has hydrophobic aggregation that can be disrupted by the denaturant. These observations suggest that at higher salt, the loss of repulsion may lead to attractive hydrophobic interactions growing more prominent in the NFL network. However, the steric repulsion from the remaining NFL tail may shield such an unwanted effect. Nonetheless, such effects may grow more prominent as the native filament assembly is disrupted.
In summary, we showed how the sequence composition of the NFLt IDP caused structural deviation from a disordered polyelectrolyte to a self-assembled star-like polymer brush. Together with the self-regulatory properties of the brushes, such behavior can be exploited to design structures that can resist specific environmental conditions. Additionally, our results showed possible implications for NFL aggregates that could shed light on the underlying correlations between the complex structure and the conditions driving it. While IDPs resemble polymers in many aspects, as we showed here, it is critical to assess their sequence to distinguish where and how to use the appropriate theoretical arguments to describe their statistical properties and structure.

Methods
Protein purification Protein purification followed Koren et al. [22]. Variant ∆N42 included two cysteine residues at the C- and N-termini. After purification, ∆N42 variants were first reduced by 20 mM 2-Mercaptoethanol. Next, 2-Mercaptoethanol was dialyzed out with 1 L of 50 mM HEPES at pH 7.2. To block the cysteine sulfhydryl group, we reacted ∆N42 variants with 2-Iodoacetamide at a molar ratio of 1:20. During the reaction, the variants' concentrations were ∼2 mg/ml. The reaction solution was kept in the dark under slow stirring for 5 hr and stopped by adding 50 mM 2-Mercaptoethanol, followed by overnight dialysis against 1 L of 20 mM Tris at pH 8.0 with 0.1% 2-Mercaptoethanol. Final purity was >95% as determined by SDS-PAGE (Fig. S10).
SAXS measurement and analysis Protein samples were dialyzed overnight in the appropriate solution and measured with a Nanodrop 2000 spectrophotometer (Thermo Scientific) for concentration determination. Buffers were prepared with 1 mM TCEP to reduce radiation damage and 0.2% sodium azide to prevent sample contamination. The samples were prepared at a final concentration of 2 mg/ml and measured in a series of 4 dilutions. Preliminary measurements were performed at Tel-Aviv University with a Xenocs GeniX Low Divergence CuKα radiation source setup with scatterless slits [57] and a Pilatus 300K detector. All samples were measured at three synchrotron facilities: beamline B21, Diamond Light Source, Didcot, UK [58], beamline P12, EMBL, DESY, Hamburg, Germany [59], and beamline BM29, ESRF, Grenoble, France [60]. Measurements at ESRF were done using a robotic sample changer [61].
Integrated SAXS data were obtained from the beamline pipeline and 2D integration using the "pyFAI" Python library [62]. Extended Guinier analyses for the ∆N42 variant were done with the "curve_fit" function from the "SciPy" Python library [63]. To extract R_g and ν, extended Guinier analysis was conducted for 0.7 < qR_g < 2.
Error calculation was done from the covariance of the fitting.
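To illustrate how R_g and ν can be pulled out of a scattering curve, the sketch below fits the extended Guinier form with a plain grid search over ν. It is not the SciPy "curve_fit" procedure used for the analysis: the grid search is swapped in only to keep the example dependency-free. The functional form and the coefficients 0.0479 and 0.212 are assumed from Ref. [35]; γ and b take the values quoted in the Results, all function names are hypothetical, and the data are synthetic and noise-free.

```python
import math

GAMMA = 1.615  # value quoted in the Results
B_NM = 0.55    # nm, value quoted in the Results


def rg_from_nu(nu: float, n_res: int, b: float = B_NM, gamma: float = GAMMA) -> float:
    """Radius of gyration tied to nu (Eq. 2 form, assumed from Ref. [35])."""
    pref = gamma * (gamma + 1.0) / (2.0 * (gamma + 2.0 * nu) * (gamma + 2.0 * nu + 1.0))
    return math.sqrt(pref) * b * n_res ** nu


def log_shape(q: float, nu: float, n_res: int) -> float:
    """ln[I(q)/I_0] in the extended Guinier form (coefficients assumed from Ref. [35])."""
    x2 = (q * rg_from_nu(nu, n_res)) ** 2
    return -x2 / 3.0 + 0.0479 * (nu - 0.212) * x2 ** 2


def fit_extended_guinier(q_values, intensities, n_res,
                         nu_min=0.40, nu_max=0.80, steps=400):
    """Grid search over nu; ln(I_0) is solved in closed form for each trial nu.

    The caller is responsible for restricting the data to q*R_G below about 2.
    Returns (nu, R_G, I_0) for the trial with the smallest squared residual.
    """
    log_i = [math.log(i) for i in intensities]
    best = None
    for k in range(steps + 1):
        nu = nu_min + (nu_max - nu_min) * k / steps
        shape = [log_shape(q, nu, n_res) for q in q_values]
        log_i0 = sum(y - s for y, s in zip(log_i, shape)) / len(log_i)
        cost = sum((y - s - log_i0) ** 2 for y, s in zip(log_i, shape))
        if best is None or cost < best[0]:
            best = (cost, nu, math.exp(log_i0))
    _, nu, i0 = best
    return nu, rg_from_nu(nu, n_res), i0


# Synthetic, noise-free curve with hypothetical parameters (104 residues, nu = 0.65).
q_demo = [0.05 + 0.01 * k for k in range(30)]  # nm^-1
i_demo = [10.0 * math.exp(log_shape(q, 0.65, 104)) for q in q_demo]
nu_fit, rg_fit, i0_fit = fit_extended_guinier(q_demo, i_demo, 104)
print(f"nu = {nu_fit:.3f}, R_G = {rg_fit:.2f} nm, q_max*R_G = {q_demo[-1] * rg_fit:.2f}")
```

With real, noisy data a gradient-based least-squares routine with parameter covariances, as described in the Methods, is the more appropriate tool; the grid search here only makes the structure of the model explicit.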
Model fittings for the WT variant were done using the "lmfit" Python library [64] with the model described in [36,37]. Due to the complexity of the model, cylindrical core fittings were done by binning the data into 100 logarithmic bins to reduce computation time. Within the same model, core parameters (cylinder radius R and cylinder length L) were held constant to offset fitting errors. Initial values of R and L were calculated from the highest measured concentration. Physical boundary conditions were imposed on the fitting, and scattering length (SL) values were kept fixed during the fitting process. SL values of both the core and the tail domains were determined from tabulated values of amino acid SLD in 100% H₂O [65] (Table S3). Fitting parameter errors were evaluated from the covariance of the returned fitting parameters. Error calculation of the volume was done by error propagation. In addition, ν values of WT were found by a recursive search of the corresponding tail height h/2 over Eq. 2. Errors of ν were then found by assuming a simple case of R_g = bN^ν, from which:

$$d\nu \sim \frac{\ln(1 + dR/R)}{\ln N} \sim (\ln N)^{-1}\,\frac{dR}{R}$$

Zimm analysis Zimm analysis was performed as described in [45]. Data normalization was done by first determining I_0 by fitting a linear curve over the Guinier plot (ln I(q) vs. q²). Normalized 1/I(q) linear fitting was done starting with the earliest possible data point until a deviation from the linear behavior occurs. Data points were then binned for visual clarity without impacting the result.
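The two linear fits involved in this kind of analysis can be sketched as follows. This is a simplified illustration, not the full Zimm procedure of Ref. [45]: it assumes the textbook relation K·c/I_0 = 1/M + 2·A_2·c for the q → 0 limit, with a contrast constant K that must be calibrated separately (set to 1 here, so only the sign and relative size of A_2 are meaningful). All function names and the synthetic numbers are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_i0(q_values, intensities):
    """I_0 and R_G from a straight-line fit of ln I(q) versus q^2."""
    slope, intercept = linear_fit([q * q for q in q_values],
                                  [math.log(i) for i in intensities])
    if slope >= 0:
        raise ValueError("non-negative Guinier slope; no R_G can be extracted")
    return math.exp(intercept), math.sqrt(-3.0 * slope)


def second_virial(concentrations, i0_values, contrast_constant=1.0):
    """A_2 and apparent mass from K*c/I_0 = 1/M + 2*A_2*c (K is an assumption)."""
    y = [contrast_constant * c / i0 for c, i0 in zip(concentrations, i0_values)]
    slope, intercept = linear_fit(list(concentrations), y)
    return slope / 2.0, 1.0 / intercept


# Synthetic illustration with hypothetical numbers (arbitrary units).
conc = [0.6, 1.2, 1.7, 2.3]                       # mg/ml
i0 = [c / (0.05 + 2.0 * (-0.005) * c) for c in conc]
a2, mass = second_virial(conc, i0)
print(f"A_2 = {a2:.4f} (negative means net attraction), apparent M = {mass:.1f}")
```

A negative slope of c/I_0 against c corresponds to A_2 < 0, i.e. net attraction between the scatterers, which is the regime reported for both variants at high salinity.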
Brush model fitting The brush height model described in [41] was fitted with a prefactor c = 0.33 to match the data. The resulting heights were converted to ν by h = bN^ν, where b = 0.38 nm and N = 146. To accommodate the change in grafting density, a linear curve was fitted to the grafting density's change with salinity and was used to obtain a continuous plot.
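The length-to-exponent conversions used in these Methods can be sketched as below: the closed-form inversion of h = bN^ν, the error estimate dν ∼ ln(1 + dR/R)/ln N alongside its linearised form, and a numerical inversion of Eq. 2 for a given R_G. The Eq. 2 form is assumed from Ref. [35], and plain bisection is used here as a stand-in for the "recursive search" mentioned above, whose details are not given. Function names are hypothetical.

```python
import math


def nu_from_height(height_nm: float, n_res: int = 146, b_nm: float = 0.38) -> float:
    """Invert h = b * N**nu for nu (b and N as quoted for the brush model)."""
    return math.log(height_nm / b_nm) / math.log(n_res)


def nu_error(rel_error: float, n_res: int = 146):
    """Return (ln(1 + dR/R)/ln N, (dR/R)/ln N) for a relative length error dR/R."""
    exact = math.log(1.0 + rel_error) / math.log(n_res)
    return exact, rel_error / math.log(n_res)


def rg_from_nu(nu: float, n_res: int, b_nm: float = 0.55, gamma: float = 1.615) -> float:
    """Eq. 2 in the form assumed from Ref. [35]; gamma and b as quoted in the Results."""
    pref = gamma * (gamma + 1.0) / (2.0 * (gamma + 2.0 * nu) * (gamma + 2.0 * nu + 1.0))
    return math.sqrt(pref) * b_nm * n_res ** nu


def nu_from_rg(rg_nm: float, n_res: int, lo: float = 0.3, hi: float = 1.0,
               tol: float = 1e-10) -> float:
    """Solve rg_from_nu(nu) = rg_nm by bisection.

    R_G increases monotonically with nu on [lo, hi] for the chain lengths
    considered here, so a single root is bracketed.
    """
    if not rg_from_nu(lo, n_res) <= rg_nm <= rg_from_nu(hi, n_res):
        raise ValueError("R_G lies outside the bracketed range of nu")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rg_from_nu(mid, n_res) < rg_nm:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical tail height, for illustration only.
h_nm = 9.0
print(f"nu from h = b*N^nu: {nu_from_height(h_nm):.3f}")
print(f"nu from Eq. 2 with R_G = h/2: {nu_from_rg(h_nm / 2.0, 146):.3f}")
print("d(nu) for a 5% length error (exact, linearised):", nu_error(0.05))
```

The two conversions use different prefactors and monomer lengths, so they generally return different ν for the same h; which one applies depends on whether the brush model or Eq. 2 is being compared with the data.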
Polyelectrolyte fitting The fitting model was used as described in [40] with a prefactor c = 1.24 to match the data.

Fig. S9 SAXS measurement of WT and its fitting with different core residue numbers n. When the core residue number is fixed to a constant value of 10 (in red), the fit becomes noticeably worse than when n is allowed to vary (in black). Displayed data: WT in 20 mM Tris pH 8.0 and 170 mM NaCl at a concentration of 1.3 mg/ml.

Fig. 1 SAXS measurements of WT and ∆N42 at different salinities (Cs). a. For increasing Cs, the WT variant shows increased small-angle scattering, a signature of aggregation. In contrast, ∆N42 remains structurally intrinsically disordered as Cs varies. Data points are shifted for clarity. Lines are form-factor fittings, as described in the text. b. Normalized Kratky plot of the same SAXS measurements. The ∆N42 variant remains disordered and unchanged with salinity, while the WT variant shows a hump at low q, typical for a collapsed region. With increasing Cs, the hump at the lower q range becomes a sharper peak accompanied by a scattering rise at the higher q range. Such behavior indicates that the aggregation coexists with the WT variant's highly dynamic and disordered regions. Both variants shown are at the highest measured concentration (Tables S1, S3). WT measurements are in 20 mM Tris pH 8.0 with 0, 150, 250, and 500 mM added NaCl (from bottom to top). Likewise, for ∆N42, measurements are in 20 mM Tris pH 8.0 with 0 and 150 mM added NaCl (bottom to top).

Fig. 3 Deduced structural parameters from the SAXS data fitting. a. Flory exponent (ν) of WT tails and ∆N42 variants showing extended disordered scaling. The blue line refers to the theoretical brush model [41], and the red line refers to the theoretical polyelectrolyte model [40]. ∆N42 shows a decrease in the protein extension due to the decline in intermolecular electrostatic repulsion (see also Fig. 4). WT shows an increase in the extension when shifting from a dimer to a trimer, followed by a slight decline with a further increase in salinity. In gray, the average ν is obtained from measuring separate NFLt segments with an NCPR of -0.3 to -0.6 [22]. b. The core (aggregated) peptide length per polypeptide as a function of salinity. At high salinity, each polypeptide aggregates via 2-3 amino acids that form the star-like polymer core. Both panels' values are the extrapolated 'zero concentration' parameters (supplementary Fig. S8).

2 Data
Fig. S2 ∆N42 SAXS measurement with corresponding extended Guinier curve. Bottom: Deviation from fit, $\sigma_{fit} = (Y_{data} - Y_{fit})/\sigma_{data}$. The dashed line represents the maximum analysis point, $qR_G = 2$, from which deviation starts. Displayed data: 20 mM Tris pH 8.0 at 1.1 mg/ml.
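The error-normalized deviation and the qR_G = 2 analysis limit quoted in this caption can be sketched as follows. This is a minimal illustration only: the function names and all numerical values are hypothetical, and the extended Guinier model itself is not reproduced here.

```python
import numpy as np

def normalized_residuals(y_data, y_fit, sigma_data):
    """Deviation from fit: (Y_data - Y_fit) / sigma_data, point by point."""
    y_data = np.asarray(y_data, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    sigma_data = np.asarray(sigma_data, dtype=float)
    return (y_data - y_fit) / sigma_data

def analysis_mask(q, rg, qrg_max=2.0):
    """Boolean mask selecting points with q * Rg <= qrg_max (here 2)."""
    return np.asarray(q, dtype=float) * rg <= qrg_max

# Hypothetical values, for illustration only.
q = np.array([0.1, 0.3, 0.5, 0.7])       # nm^-1
rg = 4.0                                  # nm (assumed radius of gyration)
mask = analysis_mask(q, rg)               # keeps q*Rg = 0.4, 1.2, 2.0

y_data = np.array([10.0, 8.0, 5.0])
y_fit = np.array([9.0, 8.0, 6.0])
sigma = np.array([0.5, 0.5, 0.5])
res = normalized_residuals(y_data, y_fit, sigma)
```

Residuals that scatter within a few units of zero below the cutoff, and drift systematically beyond it, would correspond to the behavior the dashed line marks in the figure.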

Fig. S3 ∆N42 SAXS measurements with corresponding extended Guinier curves. Dashed lines represent the maximum analysis point, $qR_G = 2$, from which deviation starts. Curves for different protein concentrations are offset for clarity, with the lowest curve (blue) corresponding to the highest concentration.

Fig. S5 SAXS measurements and spherical form factor fitting for all salt concentrations (Cs). The Cs = 20 mM data are fitted with a cylindrical core. Dashed lines represent the Gaussian form factor of the structure tails. Curves for different protein concentrations are offset for clarity, with the lowest curve (blue) corresponding to the highest concentration.

Fig. S7 Structural parameters for WT (circles) and ∆N42 (triangles) variants extracted from fitting the SAXS data. Dashed lines show the linear fits of the data used to obtain the zero-concentration extrapolations. a. The dependence of the aggregation number (Z) on protein concentration (C) increases with increasing salt. b. Core volume Vs against protein concentration (C). At Cs = 20 mM, the Vs values are constant due to fitting constraints (see Methods). c. In all cases, the tail heights (h) are much larger than the corresponding grafting length (ρ), indicative of a brush regime. d. The structurally intrinsically disordered ∆N42 variant compacts at higher Cs values and remains more compact than the projected tails of the WT variant. For the ∆N42 variant, $R_G$ changes strongly as a function of the protein concentration (C).
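The zero-concentration extrapolation described in this caption amounts to a linear fit of each structural parameter against protein concentration, with the intercept taken as the extrapolated value. A minimal sketch follows, assuming an unweighted least-squares line; whether the original analysis weighted the points by their uncertainties is not stated here, and the data below are synthetic.

```python
import numpy as np

def extrapolate_to_zero_concentration(conc, values):
    """Fit values = slope * conc + intercept and return (intercept, slope).

    The intercept is the parameter extrapolated to zero protein concentration.
    Assumption: plain unweighted least squares.
    """
    conc = np.asarray(conc, dtype=float)
    values = np.asarray(values, dtype=float)
    slope, intercept = np.polyfit(conc, values, deg=1)
    return intercept, slope

# Synthetic example (not measured data): a parameter that grows
# linearly with protein concentration C.
conc = np.array([0.3, 0.6, 0.9, 1.3])     # mg/ml, hypothetical
values = 3.0 + 0.5 * conc                 # hypothetical parameter values

value_at_zero, slope = extrapolate_to_zero_concentration(conc, values)
```

In this picture, a steeper slope corresponds to a stronger concentration dependence, as reported for Z at higher salt and for the $R_G$ of ∆N42 at low salt.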

Fig. S8 a. Flory exponent (ν) of WT tails and ∆N42 variants as a function of the protein concentration. For ∆N42, ν changes markedly with concentration at the lowest salinities. This effect is reduced as the salt concentration Cs reaches 170 mM. The WT data and the remaining ∆N42 data show little change in ν as a function of the protein concentration. b. The core (aggregated) peptide length per polypeptide as a function of the protein concentration. The large drop observed from Cs = 20 mM to Cs = 170 mM can be attributed to the shift from a dimer to a trimer. The difference in core peptide length diminishes with increasing salinity, although the values remain largely similar.

Fig. S10 a. SDS-PAGE Tris-Glycine 15% of both ∆N42 and NFLt (WT), showing purity above 95%. White dashed lines indicate where image lanes were edited closer for clarity. Both show a higher molecular weight reading in the gel, which is common for IDPs. b-c. Deconvoluted ESI-TOF MS spectra of ∆N42 and NFLt, respectively. Theoretical molecular weight values are 12423.57 and 16233.79 for ∆N42 and NFLt, respectively.

Table S1 ∆N42 extended Guinier analysis data. Analysis parameters (radius of gyration $R_G$, scaling exponent ν, and scattering intensity at q = 0, $I_0$) obtained for different salt concentrations (Cs) and protein concentrations (C).

Table S2 Zero concentration extended Guinier analysis data. Analysis parameters (radius of gyration $R_G$ and scaling exponent ν) were extrapolated to zero protein concentration at various salt concentrations (Cs).

Cylinder length L values are only relevant to Cs = 20 mM, where a cylindrical core fit was used. For the cylindrical core, the same values of L and R were used for all concentrations to alleviate fitting errors (see Methods).

SAXS measurements of WT and its fitting to different form factors. Both form factors are of the same model but use a different core: spherical or ellipsoidal. Spherical core fitting yields a core radius of R = 0.66 ± 0.016 nm, and the ellipsoidal core yields a core radius of R = 1.335 ± 0.23 nm and a secondary radius of ϵR, where ϵ = 0.153 ± 0.08. Both fittings yield close values of aggregation number Z (3.046 ± 0.04 for spherical and 3.562 ± 0.07 for ellipsoidal) and tail height h/2 (9.838 ± 0.04 nm for spherical and 9.584 ± 0.12 nm for ellipsoidal). Below: Fitting error $\sigma_{fit} = (Y_{fit} - Y_{data})/\sigma_{data}$. Both curves show similar error profiles. The spherical core was preferred for describing the data because of its simplicity. Displayed data: WT in 20 mM Tris pH 8.0 and 170 mM NaCl at a concentration of 1.3 mg/ml.