ABC-transporter CFTR folds with high fidelity through a modular, stepwise pathway

The question of how proteins fold is especially pointed for large multi-domain, multi-spanning membrane proteins with complex topologies. We have uncovered the sequence of events that encompass proper folding of the ABC transporter CFTR in live cells by combining kinetic radiolabeling with protease-susceptibility assays. We found that CFTR folds in two clearly distinct stages. The first, co-translational, stage involves folding of the 2 transmembrane domains TMD1 and TMD2, plus one nucleotide-binding domain, NBD1. The second stage is a simultaneous, post-translational increase in protease resistance for both TMDs and NBD2, caused by assembly of these domains onto NBD1. Our assays probe every 2–3 residues (on average) in CFTR. This in-depth analysis at the amino-acid level allows detailed analysis of domain folding and, importantly, also of the next level: assembly of the domains into native, folded CFTR. Defects and changes brought about by medicines, chaperones, or mutations are also amenable to analysis. Here we show that the well-known disease-causing mutation F508del, which established cystic fibrosis as a protein-folding disease, caused co-translational misfolding of NBD1 but not of TMD1 or TMD2 in stage 1, leading to absence of stage-2 folding. Corrector drugs rescued stage 2 without rescuing NBD1. Likewise, we found that the DxD motif in NBD1, which was identified as required for export of CFTR from the ER, is required already upstream of export, as CFTR mutated in this motif phenocopies F508del CFTR. The highly modular and stepwise folding process of such a large, complex protein explains the relatively high fidelity and correctability of its folding.

Supplementary Information: The online version contains supplementary material available at 10.1007/s00018-022-04671-x.


SUPPLEMENTARY TEXT & TABLES

Workflow for identification of domain-specific protease-resistant fragments
To identify the regions in CFTR that are protected or de-protected from the protease during folding we determined the identity of each proteolytic fragment, i.e. the cleavage sites used to generate the domain-specific protease-resistant fragments.
We mapped the presence of antigenic epitopes and glycans in the fragments and the electrophoretic mobility shifts of the fragments in missense-mutated and truncated CFTR constructs, and we considered helical propensity and consensus cleavage sites of the protease to narrow down the cleavage sites even further. With this information we identified each cleavage site, down to just a few amino acids for most fragments, a treasure trove for many future studies.
We set out to determine the identity of each proteolytic fragment and started by mapping the presence of antigenic epitopes (Table S1, Figures 3a, 5a, 7c) and glycans in each fragment. In a next step we truncated CFTR domains or parts and analyzed the resulting electrophoretic mobility shifts in SDS-PAGE, which already allowed determination of several fragment boundaries (Tables 1, S3-6 and Figures 3, 5e, 7d). The last experimental dataset was the mapping of fragment mobility shifts (or lack thereof) in point mutants and in truncated CFTR (Tables 1, S3-6 and Figures 3d, 5, 8b). These results, combined with literature on Proteinase-K consensus cleavage sites (Table S2) and alpha-helical-propensity predictions, allowed determination of the amino-acid sequences (up to a single likely residue) that constituted the N-terminal and C-terminal cleavage sites for each fragment (Tables 1, S3-6 and Figures 4a, 6a and 7a-b).
We found that the electrophoretic mobility of fragments (especially from TMD) was not dictated by size/mass alone, but strongly by charge and hydrophobicity as well.
For instance, the electrophoretic mobility of fragment T1a suggested a size of ~19 kDa, but this was too small to fit within the experimentally determined fragment boundaries (minimally 23 kDa). We therefore did not use electrophoretic mobility as a strict parameter to determine cleavage sites.
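The comparison between apparent size and fragment boundaries made above can be sketched in a few lines. This is only an illustration: the average residue mass of 110 Da is a common rule of thumb and not a value from this study, the function names are hypothetical, and the residue numbers in the usage note are placeholders rather than the T1a boundaries.

```python
# Rule-of-thumb average mass of one amino-acid residue (assumption,
# not a value taken from this study).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(first_residue, last_residue):
    """Estimate the mass (kDa) of a fragment spanning the given residues."""
    n_residues = last_residue - first_residue + 1
    if n_residues <= 0:
        raise ValueError("last_residue must not precede first_residue")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def mobility_is_consistent(apparent_kda, minimal_kda):
    """True if the SDS-PAGE apparent size can accommodate the minimal
    mass implied by the experimentally determined boundaries."""
    return apparent_kda >= minimal_kda
```

With these helpers, a hypothetical fragment spanning residues 1 to 209 comes out at roughly 23 kDa, and `mobility_is_consistent(19, 23)` returns `False`, which is the kind of mismatch described for T1a.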
Our analysis depends fully on the specificity of Proteinase K. Other proteases such as trypsin, chymotrypsin, endoGlu-C, and thermolysin are useful to confirm our findings, but the strength of Proteinase K is its low specificity, with possible cleavage sites all throughout the CFTR sequence. This implies that protection from cleavage can be caused only by conformational or external factors. Proteases select against helical sequences [1,2], which we used by predicting helical propensity of cleavage stretches using Jpred (http://www.compbio.dundee.ac.uk/jpred/). Cleavage sites are shielded by interaction with polypeptide chains, either within the molecule (domain folding and domain assembly), or from another molecule, for instance by assembly with other subunits in an oligomer or by a (transient) interaction with another protein such as a chaperone. Our analysis in Triton X-100 favors intramolecular interactions and tightly associating oligomers and disfavors chaperone associations. We therefore conclude that proteolytic changes are caused by CFTR folding, although we cannot exclude intermolecular interactions.
The consensus cleavage sites of Proteinase K have been well established but unfortunately some ambiguity remains. Proteinase K preferentially cleaves at the carboxyl side of residues A, Q, M, C, W, S, F or L (Table S2) [3][4][5][6][7] located in accessible loops and not in secondary structure, especially α-helices [1,2].
Cleavage of residues that are followed by small amino acids such as alanine or glycine is slightly preferred, whereas cleavage is inhibited when the residue is followed by a proline and slightly reduced when followed by valine [8]. Also, cleavage is more efficient when the cleavage site is flanked by at least 1 to 3 residues N- or C-terminal [4,8].
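The sequence rules stated above can be sketched as a simple scan over a primary sequence. This is a minimal illustration and not the procedure used in the study: the function name and the example peptide are hypothetical, and the sketch ignores helical propensity, the flanking-residue requirement, and shielding by folding, all of which the analysis depends on.

```python
# Residues after which Proteinase K preferentially cleaves, as listed in the text.
PREFERRED_RESIDUES = set("AQMCWSFL")


def candidate_sites(seq, first_residue=1):
    """Return candidate cleavage sites as (position, residue, note) tuples.

    position is the number of the residue whose carboxyl side would be cut;
    first_residue is the sequence number of seq[0].
    """
    sites = []
    # The final residue has no following residue, so it is never a site here.
    for i in range(len(seq) - 1):
        residue, following = seq[i], seq[i + 1]
        if residue not in PREFERRED_RESIDUES:
            continue
        if following == "P":
            note = "inhibited"   # proline after the site inhibits cleavage
        elif following == "V":
            note = "reduced"     # valine after the site slightly reduces it
        elif following in "AG":
            note = "preferred"   # small residue after the site
        else:
            note = "neutral"
        sites.append((first_residue + i, residue, note))
    return sites
```

For the made-up peptide `"KLPSVAGM"`, the scan reports L2 as inhibited (followed by P), S4 as reduced (followed by V) and A6 as preferred (followed by G). Passing `first_residue` lets the positions follow the numbering of a fragment within the full-length protein.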
The antigenic epitopes and truncated constructs provided the primary information on the preferred cleavage sites and are listed in Tables 1, S3-6 (and Figures 4a, 6a and 7a-b). Next are the consensus Proteinase-K sites, and the electrophoretic mobility shifts caused by missense mutations. The reasoning for the N- and C-terminal boundaries of each fragment is in Tables S3-6 and shown in Figures 4a, 6a and 7a-b. A missense mutation can influence proteolysis by three mechanisms: 1. Removal or addition of a consensus Proteinase-K cleavage site, 2. Change of electrophoretic mobility by charge or changed SDS binding, and 3. Change of conformation or interactions with other proteins, leading to different shielding of consensus sites.
Ad 1. Change of a cleavage site may go unnoticed when a cluster of sites is being used by the protease, implying that we cannot conclude that a lack of change equates to a lack of cleavage of the target or mutated residue. Exceptions to this are the few instances where a single residue appears to be the protease target rather than an amino-acid cluster. A change in proteolytic fragment caused by a mutated cleavage site does provide positive information.
Ad 2. Electrophoretic mobility of proteins in SDS-PAGE [9] in principle is determined by their mass because of their inherent capacity to bind 1.4 g of SDS per g polypeptide, independent of amino acid composition. As SDS adds 2 negative charges it in principle overrules any charge the protein may have. These ideal conditions often are not met, for instance in hydrophobic proteins, where SDS binding may be higher or lower, or in negatively charged proteins such as all calcium-binding chaperones in the endoplasmic reticulum, where SDS binding is lower, leading to decreased mobility and overestimation of their masses [10,11]. Analyzing smaller constructs, the individual CFTR domains and proteolytic fragments, we noticed that a charge change in the fragment did contribute to a mobility change, either up or down, by a combination of its own charge and the changed SDS binding. On top of the charge changes we have noticed electrophoretic mobility changes unrelated to charge, such as in I539T [12]. In summary, this parameter, while important to notice and include in the analysis, does not provide rigorous information on proteolytic fragments. Mobility appeared most useful to establish whether the mutated residues were present in a fragment.
Ad 3. Limited proteolysis classically has been used for this parameter: to determine folded state or domain boundaries of proteins [1]. In Cleveland analysis [13] it is used as a unique protein identifier, on the premise that each native protein has its own sequence and hence a unique proteolytic pattern. We have used the approach and found a surprising capacity to zoom into the used cleavage sites. This is fortunate as mass spectrometry so far has had limited success in determining identities of these subpicomole quantities of radiolabeled fragments. Equating proteolysis to the folded state of CFTR is only possible when other proteins are not interacting and shielding protease sites. As Triton X-100 dissociates most chaperone-client and many other complexes, we limited our conclusions here to intramolecular shielding, i.e. folding.
Following this list of parameters for the N-terminus and C-terminus of each proteolytic fragment, described in Tables S3-6 and visualized in Figures 4a, 6a, 7a and 8a, we identified each fragment with rather precise boundaries (Table 1).
• Lh2 (aa46-63) has relatively low helical propensity.
• C-terminal cleavage site is most likely L383.
• Helical propensity higher in N-terminal than C-terminal of ICL3 coupling helix.
• C-terminal cleavage site is most likely M1191.
* N1a arises from cleavages in RI and RE. We cannot exclude that this fragment is shifted towards the N-terminus, with N- and C-terminal cleavage sites moved upstream. This is not relevant for the analysis, however.
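The first two of the three mechanisms by which a missense mutation can influence proteolysis lend themselves to a quick check from the mutation label alone. The sketch below is an illustration under stated assumptions, not the analysis used here: the function name is hypothetical, side-chain charges are approximated at neutral pH with histidine treated as neutral, and only the mutated residue itself is considered (a change in the residue following a site is not modeled). The third mechanism, conformational change, cannot be predicted this way.

```python
# Residues after which Proteinase K preferentially cleaves, as listed in the text.
PREFERRED_RESIDUES = set("AQMCWSFL")

# Approximate side-chain charges at neutral pH (assumption; His taken as neutral).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def describe_missense(mutation):
    """Summarize a missense label such as 'N901K'.

    Reports whether a consensus site is removed or added at the mutated
    residue (mechanism 1) and the net charge change (part of mechanism 2).
    """
    wild_type, mutant = mutation[0], mutation[-1]
    position = int(mutation[1:-1])

    was_site = wild_type in PREFERRED_RESIDUES
    is_site = mutant in PREFERRED_RESIDUES
    if was_site and not is_site:
        site_change = "removed"
    elif is_site and not was_site:
        site_change = "added"
    else:
        site_change = "unchanged"

    charge_change = (SIDE_CHAIN_CHARGE.get(mutant, 0)
                     - SIDE_CHAIN_CHARGE.get(wild_type, 0))
    return {"position": position,
            "site_change": site_change,
            "charge_change": charge_change}
```

Under these assumptions, N901K leaves the consensus site status unchanged and adds one positive charge, L957K removes a preferred residue and adds one positive charge, and I539T changes neither, consistent with its mobility shift being unrelated to charge.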

Figure S1. Glycosylation of TMD2 in vitro and CFTR in cells, related to Figure 1. (a) TMD2 was translated in vitro in the presence of HEK293 microsomes. Membranes were pelleted, digested with Endo H where indicated, and resolved by 12% SDS-PAGE. TMD2gg: core-glycosylated TMD2. (b) HEK293T cells expressing CFTR were pulse-labeled for 15 minutes and lysed immediately or after a chase of 2 hours, immunoprecipitated using MrPink antibody and resolved by 10% SDS-PAGE.

Figure S2. PNGase F deglycosylates core and complex glycosylated CFTR, related to Figure 5. HEK293T cells expressing full-length CFTR were pulse-labeled for 15 minutes and lysed immediately or after a chase of 2 hours. CFTR was immunoprecipitated with MrPink, subjected to PNGase F treatment, and resolved by 7.5% SDS-PAGE.

Figure S3. Protease-resistant fragments shift downwards in N-terminally truncated CFTR, related to Figure 3d-g. (a) HEK293T cells expressing N-terminal CFTR truncations ΔN48 to ΔN76 were pulse-labeled for 15 minutes and lysed immediately. CFTR variants were immunoprecipitated with MrPink and analyzed by 7.5% SDS-PAGE. Digests are shown in Figure 3b. (b) Same as (a) but now with N-terminal CFTR truncations ΔN47 to ΔN51. Digests are shown in Figure 3c. (c) Same as (a) but with C-terminal truncations K381X to E395X. Digests are shown in Figure 3d. (d) Same as (a) but for

Figure S4. Protease-resistant fragments shift in CFTR mutants, related to Figure 5b-e. (a) HEK293T cells expressing CFTR mutants N901K-V905K were pulse-labeled for 15 minutes and lysed after a 2-hour chase. Detergent lysates were immunoprecipitated with MrPink or subjected to limited proteolysis with 25 µg/mL Proteinase K and immunoprecipitated with TMD2C. Top panel represents nondigested material resolved by 7.5% SDS-PAGE and the bottom panel shows the fragments generated by limited proteolysis and resolved by 12% SDS-PAGE. (b) Same as (a) but with mutants L957K to G970K and 0-hour chase. Dotted red line through T2b fragment is for visual aid. (c) HEK293T cells expressing CFTR mutants S1058K to L1065K were pulse-labeled for 15 minutes and lysed immediately. Detergent lysates were immunoprecipitated with MrPink and resolved by 7.5% SDS-PAGE. (d) Same as (c) but with mutants M961K to L967K. (e) Same as (c) but with mutants T908K to S912K and after a chase of 2 h. (f) Same as (bc) with C-terminal truncations N1184X to M1191X. (g) Same as (b) but with mutants V1190X to S1196X. Dotted red line through T2b fragment is for visual aid. Digests of panels c-f are shown in Figure 5b-e.

Table S2. Proteinase-K consensus cleavage studies
Summary of literature describing which amino acids (displayed as one-letter code) Proteinase K preferentially cleaves. Amino acids that were not or barely cleaved are in regular font. This was generally consistent throughout all studies. We chose to use the residues in bold as preferentially cleaved by Proteinase K, mostly based on the study performed by Whittaker et al., which is the most rigorous and complete one. These are colored blue in Figures 4 and 6. Results on tyrosine (Y) were too variable to consider Y a consensus cleavage site.
Q-C-F-L-Y-N-H-S-E-G> V=A=R=T=P=K

Table S3. Cleavage site identification of TMD1
Indirect, supporting results are in italics; conclusion is in bold.

Table S4. Cleavage site identification of TMD2
Indirect results are in italics; conclusion is in bold.

Table S6. Cleavage site identification of NBD2
Indirect results are in italics; conclusion is in bold.