Main

We aimed to elucidate the molecular architecture of exon-defined spliceosomal complexes. To that end, we first determined the three-dimensional (3D) structure of a human CE pre-B complex (previously denoted the 37S exon complex) assembled in HeLa nuclear extract on a MINX exon-containing RNA (Fig. 1a) that was previously shown to undergo exon definition7. To enhance the stability of the CE pre-B complex, we isolated complexes at low salt conditions, which led to the formation of predominantly pre-B dimers. Affinity-purified complexes contained stoichiometric amounts of the MINX exon RNA and U1, U2, U4, U5 and U6 snRNAs (Extended Data Fig. 1a). They also contained essentially all known snRNPs and U1-related and serine-arginine-rich (SR) proteins (Supplementary Table 1).

Fig. 1: 3D structure of human CE pre-B complexes.
figure 1

a, Schematic of the MINX exon RNA. Yn, polypyrimidine tract. b, Left, fit of two monomeric, CE pre-B molecular models into the EM density of the class 1, pre-B dimer. Right, cartoon of the structural organization of the CE pre-B complex dimer. BP, branch point. c, Top, two different views of the molecular architecture of the human CE pre-B complex. Bottom, summary of all modelled proteins and RNAs with colour code.

Cryo-EM structure of CE pre-B complexes

Single-particle cryo-electron microscopy (cryo-EM) followed by 3D reconstructions of the purified CE pre-B complexes revealed that one of the major dimer classes comprised two similar units (protomers) aligned in an antiparallel manner, whereby each of the protomers resembles a CI pre-B complex (Fig. 1b and Extended Data Fig. 1b). We next determined the structure of individual CE pre-B protomers at 3.5 Å resolution at their tri-snRNP core (Extended Data Fig. 1c–h). The molecular architecture of the tri-snRNP in the CE pre-B complex was highly similar if not identical to that in human CI pre-B complexes and isolated tri-snRNPs17,18,19 (Fig. 1c). For example, in both types of pre-B complexes, PRP8 has an open conformation and the BRR2 helicase domain is located close to the PRP8 reverse transcriptase-like (RT) domain (PRP8RT). Moreover, the BRR2 helicase domain has not yet translocated to the PRP8 endonuclease-like (En) domain (PRP8En) nor docked with its U4 snRNA substrate (Fig. 1c). Furthermore, U4/U6 stem III and the so-called U4 snRNA quasi-pseudoknot are present (Fig. 1c and Extended Data Fig. 1i–k). In CE pre-B, the tri-snRNP interacts with the U2 snRNP through three main bridges that are also observed in CI pre-B complexes18,19 (Extended Data Fig. 2a–c). Thus, our structure shows that tri-snRNP recruitment is achieved through similar molecular interactions. Moreover, its 3D structure and orientation relative to U2 are highly similar in both exon-defined and intron-defined pre-B complexes.

We previously showed that the U1 snRNA is base-paired to the 5′ splice site (5′ss) of the MINX exon RNA in CE pre-B complexes7. In the latter, a poorly resolved, globular density (GD) is connected on one side of the U2-SF3B1 HEAT domain (SF3B1HEAT) by a thin density element (Extended Data Fig. 2d) that protrudes from SF3B1HEAT repeats H7–H9. This result indicates that it contains intron or exon nucleotides near the MINX 3′ ss. Consequently, the MINX exon together with U1 snRNP and exon-binding proteins are located in the adjacent GD (Extended Data Fig. 2d–f). Owing to the poor resolution of the CE pre-B structure in this region, U1 could not be clearly localized, and the nature of the molecular bridge linking U2 at the branch site (BS) to U1 snRNP at the downstream 5′ss could not be discerned by cryo-EM. However, protein crosslinking coupled with mass spectrometry (Extended Data Fig. 2g and Supplementary Table 2) indicated that U1 and U1-related proteins communicate indirectly with U2 components, mainly through SR proteins that interact with the U2AF1 and U2AF2 proteins and SF3B1 that are located near the 3′ end of the intron. This result is consistent with the idea that U2 and U1 do not directly interact with one another. Taken together, the molecular architecture of the CE pre-B complex indicates that no major rearrangements in the pre-B complex are required for the tri-snRNP to subsequently interact across an intron with a U1 snRNP bound to an upstream 5′ss during the switch from a CE to CI complex.

3D structure of the B-like complex

CI pre-B complexes are converted to CI B complexes following PRP28 helicase-mediated handover of the 5′ss from U1 to the U6 snRNA, thereby forming the U6/5′ss helix20,21. This helix is present in an extended form (that is, extended U6/5′ss helix) in CI B complexes. This leads to multiple, major structural rearrangements in the CI pre-B complex18,19,22,23 and the recruitment of so-called B-specific proteins (including, SMU1, RED, FBP21, SNU23, MFAP1, PRP38A and UBL5)6. We previously showed that addition of an excess of a 5′ss-containing RNA oligonucleotide (5′ss oligo) to CE pre-B complexes in the presence of nuclear extract and ATP bypasses the requirement for PRP28 (refs. 7,24). The CE pre-B complex is converted into a CE B-like complex that undergoes structural rearrangements that are currently poorly characterized7,24. This simplified system was therefore proposed to mimic the interaction of a bona fide upstream 5′ss with the tri-snRNP during the CE to CI transition. We sought to determine whether CE B-like complexes undergo structural rearrangements similar to those that occur during the conversion of a CI pre-B complex to a CI B complex. We affinity-purified CE B-like complexes (generated as described above) and subsequently determined their cryo-EM structure. Single-particle cryo-EM of the affinity-purified CE B-like complexes revealed dimers comprising two B-like complexes aligned in a parallel manner, similar to the overall organization of human CI B dimers isolated under low salt conditions23 (Extended Data Fig. 3a–i).

We subsequently determined the structure of the B-like protomers at 3.1 Å resolution in their tri-snRNP core (Fig. 2a). In the B-like complex, G+1 and U+2 of the 5′ss of the exogenously added RNA oligonucleotide are exclusively recognized by DIM1 and PRP8 in the same manner as the 5′ss of the pre-mRNA in CI B complexes19,22,23 (Fig. 2b–d and Extended Data Fig. 3j). Nucleotides downstream of U+2 base pair with nucleotides at/or near the U6 ACAGA box, forming a U6/5′ss helix, similar to the situation in CI B complexes (Fig. 2b,e and Extended Data Fig. 3k). Notably, a second copy of the 5′ss oligo (denoted oligo 2) also binds, which extends the U6/5′ss helix through additional non-canonical base pairs and base-stacking interactions. This structure therefore mimics the extended U6/5′ss found in CI B complexes (Fig. 2e,f). The zinc finger (ZnF) regions of the B-specific proteins FBP21 and SNU23 bind side by side to the backbone of 5′ss oligo 1 and 5′ss oligo 2, respectively, stabilizing their base-pairing interactions with the U6 snRNA in the B-like complex, analogous to their spatial organization and function in the human CI B complex (Fig. 2f). Indeed, nearly all of the B-specific proteins localized in the CE B-like complex are spatially organized in an analogous manner to that of the human CI B complex23 (Extended Data Fig. 4a–c).

Fig. 2: 3D structure of human CE B-like complexes.
figure 2

a, Top, two different views of the molecular architecture of the human CE B-like complex. Bottom, summary of all modelled proteins and RNAs with colour code. bd, Interactions of the 5′ss oligo with U5 snRNA loop 1 nucleotides and residues of PRP8 and DIM1. e, Schematic of the base-pairing interactions of the two copies of the 5′ss oligo that are bound in the B-like complex. 5′ss oligo 2 stacks through its 5′ terminal A−3 on the 3′ terminal U+8 of the first 5′ss oligo. f, The ZnF domains of FBP21 and SNU23 stabilize the base-pairing interactions of 5′ss oligo 1 and 5′ss oligo 2, respectively, with U6 and U5 snRNA. Left, coloured EM density. Right, molecular model. Arrowhead indicates missing EM density, which confirms that two distinct 5′ss oligos are bound. g, Schematic of the domain organization of PRP8. HB, helical bundle. h, Repositioning of the PRP8 RT/En and RH domains and BRR2 and PRP6HAT during the conversion of the CE pre-B complex (left) into a B-like complex (right). The 3D structures are aligned through PRP8NTD.

Structural comparisons of the CE B-like complex with the CE pre-B complex revealed that the majority of tri-snRNP remodelling events that occur during the CI pre-B to B complex transition also occur after addition of an excess of the 5′ss oligo to the splicing reaction. For example, in the B-like complex, PRP8 adopts a half-closed conformation and the BRR2 helicase domain is translocated to PRP8En (Fig. 2g,h). In addition, the BRR2 amino-terminal helicase cassette (BRR2NC) is in an active conformation and is now bound to its U4 snRNA substrate. Moreover, essentially all of the remodelling events that have been described for the U4/U6 proteins during the formation of the CI B complex also occur in the B-like complex. These include, among others, the repositioning of the PRP6 HAT domain (PRP6HAT) and PRP31 (Extended Data Fig. 4d,e), and the disruption of U4/U6 stem III and the U4 snRNA quasi-pseudoknot (Extended Data Fig. 3k). Finally, the molecular architecture of the U2 snRNP and its interfaces with the tri-snRNP are also similar between B-like and CI B complexes19,22 (Fig. 2a). Taken together, our data reveal that the tri-snRNP has nearly the same structure in human B-like and B complexes. Thus, addition of an excess of a 5′ss oligo in the presence of nuclear extract and ATP triggers tri-snRNP remodelling events highly similar to those observed during the PRP28-mediated transformation of a CI pre-B complex into a B complex. These results indicate that CE pre-B complexes are not only structurally similar (with the exception of the position of the U1 snRNP) but also functionally equivalent to CI pre-B complexes. Therefore, the CE and CI assembly pathways converge at the pre-B stage. Furthermore, the results support the conclusion that the CE to CI switch does not require the initial conversion of a CE A-like complex into a CI A complex.

Addition of a 5′ss triggers PRP8 remodelling

We next sought to identify the molecular triggers and the order and interdependency of the various RNP remodelling events that lead to the formation of B-like complexes. To that end, we determined cryo-EM structures of complexes formed after the addition of an excess of 5′ss oligo to affinity-purified CE pre-B dimers in the absence or presence of ATP or ATPγS. ATPγS was previously shown to support the formation of human B complexes but prevents their subsequent transformation into activated B (Bact) complexes and/or their dissociation25. The cryo-EM structures of the protomers of affinity-purified CE pre-B dimers incubated with 5′ss oligo (designated pre-B5′ss) or with 5′ss oligo plus ATPγS (designated pre-B5′ss+ATPγS) were determined at 4.2 Å and 3.1 Å, resolution, respectively, at their tri-snRNP cores (Fig. 3a and Extended Data Fig. 5).

Fig. 3: RNP rearrangements in pre-B triggered by the addition of 5′ss oligo alone or in combination with ATPγS.
figure 3

a, Molecular architecture of CE pre-B5′ss complexes (left) and pre-B5′ss+ATPγS complexes (right). Notably, PRP28 helicase is still bound to PRP8 in CE pre-B5′ss in essentially the same manner as in CE pre-B (compare with Fig. 1). This is presumably due to the absence of the PRP38A–SNU23–MFAP1 complex, which helps to displace PRP28 during the formation of B complexes and B-like complexes owing to their overlapping (that is, mutually exclusive) binding sites. b, Top, schematic of the U4/U6 snRNA helices and base-pairing interactions of the 5′ss oligo in pre-B5′ss. Bottom, 3D organization showing the position of RBM42. c, 3D organization of the U4/U6 helices and interactions of the 5′ss oligo with U6 snRNA in pre-B5′ss (left), pre-B5′ss+ATPγS (middle) and pre-B5′ssLNG+ATPγS (right). d, 5′ss oligo binding triggers PRP8RT/En movement towards the PRP8NTD and concomitant movement of the PRP8 RH and HB domains and other bound proteins. An overlay of the indicated PRP8 domains in pre-B (grey) and pre-B5′ss (various colours) is shown. Structures are aligned through PRP8NTD. e, Large-scale translocation of BRR2 requires ATP. Comparison of the location of the BRR2 helicase cassettes and the PRP8 RH, JAB1 and RT/En domains in pre-B5′ss versus pre-B5′ss+ATPγS. Aligned through PRP8NTD.

Owing to the absence of the B-specific protein SNU23, which stabilizes the binding of a second 5′ss oligo in B-like complexes, only one 5′ss oligo is bound in pre-B5′ss in a manner similar to the 5′ss oligo 1 in B-like complexes. Thus, an extended U6/5′ss helix does not form (Fig. 3b,c and Extended Data Fig. 6a). As a consequence, U4/U6 stem III, the formation of which is mutually exclusive of that of an extended U6/5′ss helix, is still present, as well as the U4 quasi-pseudo knot and its associated proteins RBM42 and SNRNP27K (Fig. 3a–c and Extended Data Fig. 6a). In pre-B5′ss complexes, 5′ss oligo binding triggers the movement of the PRP8 RT/En domain (PRP8RT/En) towards the PRP8 N-terminal domain (PRP8NTD). This action generates a half-closed PRP8 conformation that is also found in B-like and B complexes (Fig. 3d, Extended Data Fig. 6b and Supplementary Video 1). However, no large-scale translocation of BBR2 or substantial repositioning of other tri-snRNP components like those in CE pre-B is observed (Extended Data Fig. 6b). However, PRP4 kinase (PRP4K) and the SF3B3 WD40B domain (SF3B3WD40B) are located farther apart in pre-B5′ss complexes compared with pre-B complexes, and PRP4K interacts more extensively with PRP6HAT (Extended Data Fig. 6c–e and Supplementary Video 2), which may potentially contribute to activation of its kinase activity. The movement of PRP8RT/En that is triggered by the formation of the short U6/5′ss helix also destabilizes interactions between SAD1 and the BRR2 PWI domain (BBR2PWI) and between the U4 Sm core and the PRP8 RNase H-like (RH) domain (PRP8RH) (Extended Data Fig. 6c,d,f and Supplementary Videos 1 and 3). These contacts help to tether BRR2 and PRP8RH to their docking sites in the pre-B complex.

BRR2 translocation is driven by ATP

A single 5′ss oligo is also bound in pre-B5′ss+ATPγS complexes (Fig. 3c and Extended Data Fig. 6g,h). Notably, the addition of ATPγS or ATP to pre-B5′ss complexes leads to repositioning of the BRR2 helicase domain and to substantial restructuring of other tri-snRNP components. These include PRP31, PRP6 and PRP8RH, which are also repositioned in B-like complexes (Fig. 3e, Extended Data Fig. 6i,j and Supplementary Videos 4 and 5). Thus, most structural rearrangements within the tri-snRNP that occur during B-like and B complex formation are driven by ATP. Notably, however, they do not require B-specific proteins, which are underrepresented or absent in purified CE pre-B complexes. In pre-B5′ss+ATPγS, BRR2 and the tightly bound PRP8 Jab1/MPN domain (PRP8Jab1) undergo a large-scale translocation (Fig. 3e and Supplementary Video 5). BRR2–PRP8Jab1 is now anchored to its new position at the tip of PRP8En through the PRP8RH–PRP8Jab1 linker, the ends of which are stably bound to PRP8En at the same position as in the B-like complex (Extended Data Fig. 7a–c). However, in pre-B5′ss+ATPγS, BRR2–PRP8Jab1 does not reach the conformational state and position that it adopts in B-like complexes, which would require a rotation of BRR2 around its long axis and substantial structural rearrangement in its helicase domains (Extended Data Fig. 7d,e and Supplementary Video 6). Furthermore, in contrast to the situation in the B-like complex, in pre-B5′ss+ATPγS, BRR2NC is in an inactive conformation, in which the separator loop tightly interacts with the Sec63 domain, thereby blocking access of the U4 snRNA to the RNA-binding region of the BRR2NC RecA domains (Extended Data Fig. 7e). Thus, the ATP-triggered remodelling of BRR2 during the transition of pre-B5′ss to pre-B5′ss+ATPγS and then to the B-like complex is an intricate process that occurs in a stepwise manner. Consistent with the idea that a key event for the restructuring of the tri-snRNP during the pre-B to B-like to B transition is the PRP4K-mediated phosphorylation of PRP6 and PRP31 (ref. 26), both proteins are phosphorylated in pre-B5′ss+ATP, but not in pre-B or pre-B5′ss (Extended Data Fig. 7f).

The U4 snRNA quasi-pseudoknot is dissolved in pre-B5′ss+ATPγS, accompanied by displacement of RBM42 and SNRNP27K, but U4/U6 stem III is still present and, together with the U4 Sm core, adopts a new position (Fig. 3a,c and Extended Data Fig. 7g). A previous model proposed that dissociation of U4/U6 stem III is the driving force for the substantial restructuring of the tri-snRNP during the formation of human B complexes18. Our data indicate that most of these remodelling events do not require disruption of U4/U6 stem III or the formation of an elongated U6/5′ss helix. Instead, U4/U6 stem III is stabilized in pre-B5′ss+ATPγS by both RNA and protein interactions, in particular by the capture of U6 C37 by a newly identified structural element of U2 SF3A1 (Extended Data Fig. 7h,i). In addition, the absence of the B-specific protein SNU23 prevents stable binding of a second 5′ss oligo that competes with stem III formation. To elucidate whether disruption of U4/U6 stem III is a prerequisite for the final positioning of BRR2, we extended the 5′ss oligo (henceforth called 5′ssLNG) by 8 additional nucleotides that are fully complementary to U6 snRNA nucleotides 31–38 that form part of U4/U6 stem III. Cryo-EM of the complexes formed after incubation of affinity-purified CE pre-B with 5′ssLNG, followed by ATPγS (designated pre-B5′ssLNG+ATPγS) (Extended Data Fig. 8a–g) revealed the formation of an extended U6/5′ss and the absence of U4/U6 stem III (Fig. 3c and Extended Data Fig. 7i). In addition, the U4 Sm core has moved upwards by about 5 nm (compared with pre-B5′ss+ATPγS) and now contacts BRR2 (Fig. 3c and Extended Data Fig. 7h,i). However, the position of BRR2 and the conformation of the BRR2NC RecA domains are not substantially different from pre-B5′ss+ATPγS (Extended Data Fig. 8h,i). These results point towards an essential role for one or more of the B-specific proteins in these BRR2 structural rearrangements.

B-specific proteins aid BRR2 positioning

We next compared the structure of pre-B5′ssLNG+ATPγS with that of the B-like complex. The results indicated that the rotational movement of BRR2 and the opening of BRR2NC required for B-like formation are probably facilitated, among others, by the SNU23 loop 62–74 and MFAP1 helix 215–255, which bind in the B-like structure to the BRR2NC Sec63 domain. The binding of the long α-helix of FBP21 and α-helix 18–36 of SNU23, which both interact with RecA2 of BRR2NC, also has an important role (Fig. 4 and Extended Data Fig. 8j–n). In B-like complexes, SNU23 also tethers BRR2NC to PRP8NTD, bridging it to the U6/5′ss helix (Fig. 4 and Supplementary Video 7). The rotation of BRR2 towards PRP8NTD is probably further facilitated by the SMU1–RED complex, which stabilizes the new position of BRR2 in the B-like complex (Fig. 4). In B-like complexes, the BRR2 carboxy-terminal helicase cassette (BRR2CC) also moves to a new position that seems to be stabilized by its interaction with MFAP1 α-helix 141–172 and by SMU1 and RED (Fig. 4 and Extended Data Fig. 8k). Finally, SNU23, promotes the formation of the extended U6/5′ss helix, which frees up the U4 snRNA for its interaction with BRR2NC. Thus, numerous B-specific proteins are required for the final positioning of BRR2 and probably cooperate to facilitate or stabilize the active, more open conformation of BRR2, thereby allowing the docking of U4 snRNA across the BRR2NC RecA domains.

Fig. 4: B-specific proteins are required for the final positioning of BRR2 and the docking of its U4 snRNA substrate.
figure 4

a, Schematic of the domain organization of BRR2. PWI, domain with a PWI tri-peptide located within its N-terminal region; SL, separator loop; WH, winged helix. b,c, Multiple B-specific proteins contact BRR2 and tether it to its activation position. Comparison of the location and conformation of BRR2 and its U4 snRNA substrate in pre-B5′ssLNG+ATPγS (b), which is formed in the absence of the B-specific proteins, and the B-like complex (c). In pre-B5’ssLNG+ATPγS, the BRR2NC has a closed, inactive conformation and is located about 15 Å away from its final position observed in the B-like complex. In the presence of B-specific proteins, BRR2NC has an open, active conformation and is bound to the U4 snRNA.

CE pre-B interacts with a U1-bound 5’ss

The antiparallel orientation of the two CE pre-B complexes in the purified pre-B dimers (Fig. 1b) suggests that U1 snRNP bound to the 5′ss of the exon of one protomer may transiently interact in trans with PRP28 of the adjacent protomer. PRP28 is in the vicinity of the U6 ACAGA box in the pre-B protomers. This configuration therefore raises the possibility that addition of ATP alone might trigger PRP28-mediated transfer of the 5′ss of the MINX exon of one protomer to the U6 ACAGA box of the adjacent protomer. Cryo-EM of purified pre-B complexes incubated solely with ATP (denoted pre-BATP) revealed highly similar molecular architectures between pre-BATP protomers and pre-B5′ss+ATPγS (Fig. 5a–c and Extended Data Fig. 8o–v), which indicated that a U6/5′ss helix has formed. Indeed, in the pre-BATP dimers, nucleotides encompassing the 5′ss directly downstream of the MINX exon RNA base pair with U5 loop 1 and the U6 ACAGA box to form a short U6/5’ss helix (Fig. 5d). Owing to structural constraints, PRP28 cannot interact within the same protomer with the U1 snRNP-bound 5′ss (Extended Data Fig. 2e). Furthermore, stoichiometric amounts of full-length MINX are present in the peak gradient fractions of the pre-BATP dimers analysed by cryo-EM (Extended Data Fig. 8o). This finding supports the conclusion that the MINX 5′ss that is bound by U6 comes from intact MINX RNA present in the adjacent pre-B protomer; therefore, a RNA–RNA network that links both protomers is formed (Fig. 5a,c and Extended Data Fig. 9a). Notably, the PRP28 RecA domains are no longer visible in pre-BATP (Extended Data Fig. 8v,w), which indicates that they have been destabilized. This result is consistent with the idea that the pre-BATP structure is formed through ATP-dependent, PRP28-mediated transfer of the 5′ss of the MINX exon from U1 snRNA in one CE pre-B protomer to the U6 snRNA of the neighbouring protomer.

Fig. 5: ATP-dependent transfer in trans of the 5′ss from U1 snRNP to the U6 ACAGA box by PRP28.
figure 5

a, Structure of purified pre-B complexes incubated with ATP alone (pre-BATP). Fit of the pre-BATP molecular model into the EM density of the pre-BATP dimer. b, Molecular architecture of the pre-BATP complex. The view shown in b is obtained if the structure of protomer 1 in a is rotated 150°. c, Cartoon showing the structural organization of the CE pre-BATP complex dimer. d, The 5′ss of the MINX exon RNA from one pre-B monomer interacts with U5 loop 1 nucleotides and the U6 ACAGA box of the adjacent monomer, forming a short U6/5′ss helix. U4/U6 stem III is still present. Top right, schematic of the base-pairing interactions with the MINX exon RNA near or/at the 5′ss. Nucleotides at the 3′ end of the exon (C−5 to G−1) base pair with nucleotides of U5 loop 1. e, Molecular architecture of a pre-BAMPPNP monomer bound in trans to the U1 snRNP of an adjacent monomer of the pre-BAMPPNP dimeric complex. Right, zoomed-in region that is boxed on the left, showing the molecular architecture of U1 snRNP and its interaction with PRP28. ssRNA, single-stranded RNA. f, Comparison of the corresponding regions shown in e with those of the pre-B complex dimer.

To test the latter conclusion, we incubated purified CE pre-B dimers with the non-hydrolysable ATP analogue AMPPNP and subsequently determined their structure by cryo-EM (Extended Data Fig. 9b–h). The spatial organization of the tri-snRNP proteins in pre-BAMPPNP complexes was the same as in the CE pre-B protomers (Fig. 5e,f and Extended Data Fig. 9i,j). Notably, in the presence of AMPPNP, U1 snRNP, presumably from the adjacent protomer, was now stably docked to the PRP28 helicase domain of the pre-B protomer, as evidenced by the improved resolution in this region of the pre-BAMPPNP complexes compared with pre-B complexes (Fig. 5e,f and Extended Data Fig. 9i,j). This enabled the localization of U1 snRNP components (Fig. 5e and Extended Data Fig. 9i) and further supports the idea that the poorly resolved density element of the CE pre-B complex indeed contains U1. In contrast to pre-B complexes, in which the RecA domains of PRP28 are in an inactive, open conformation (Fig. 5f and Extended Data Fig. 9j, l–o), in pre-BAMPPNP they exhibit a closed conformation (Fig. 5e and Extended Data Fig. 9i,k). Consistent with the absence of additional structural rearrangements in tri-snRNP components, the 5′ss of the adjacent protomer is not base paired to the U6 ACAGA box (Extended Data Fig. 9p,q). Indeed, the release and transfer of the 5′ss to the U6 ACAGA box would require ATP hydrolysis and lead to the release of the RecA domains of PRP28 from U1 snRNP and PRP8, as observed in the pre-BATP complex. Taken together, our data demonstrate that the tri-snRNP of a CE pre-B complex is capable of interacting in trans with a U1 snRNP bound to a 5′ss, which leads to the handover of this 5′ss from U1 to U6 by PRP28. This result further indicates that the latter handover does not require that the CE U2–U1 molecular bridge in a CE pre-B complex is first converted into a CI U1–U2 interaction, as the former bridge is still intact in pre-BATP. Our data indicate that the conversion of a CE pre-B to a CI B complex can occur within the context of a CE pre-B dimer. However, formation of the latter is probably not a pre-requisite for the CE to CI switch within the cell, where CE pre-B complexes probably exist predominantly as monomers and can therefore productively engage with an upstream, U1-bound 5′ss in cis (see below).

Discussion

The cryo-EM structures of the CE pre-B and B-like complexes, together with our cryo-EM coupled, add-back experiments using purified CE pre-B complexes, provide new insights into the switch from a CE to CI organized spliceosome. We also provide new information on the intricate RNP remodelling events that occur during the transformation of a CE pre-B complex into a CI B complex. Given the structural similarities of a CE and CI pre-B complex, the order of the latter rearrangements and the intermediate structural states that our studies uncovered, as well as their molecular triggers, should also apply to the conversion of a CI pre-B complex into a CI B complex. We showed that the formation of a U6/5′ss helix induces a structural change in PRP8 such that it now adopts the PRP8 conformation found in B complexes and B-like complexes (Fig. 6a). This PRP8 conformational change destabilizes the SAD1–BBR2PWI and U4 Sm core–PRP8RH interactions, which are probably prerequisites for BRR2 translocation, and leads to the repositioning of PRP4K that might help activate its kinase activity. Thus, U6/5′ss helix formation seems to be a prerequisite for several subsequent, ATP-dependent RNP rearrangements, and at the same time ensures that the 5′ss is stably bound before the more extensive structural rearrangements that subsequently occur during formation of B complexes.

Fig. 6: Structural model of the conversion of a spliceosomal CE pre-B complex to a CI B complex.
figure 6

a, Schematic of RNA remodelling (bottom) and rearrangement and repositioning of tri-snRNP proteins (top) during the pre-B to B-like to B complex transition. First, binding of a 5′ss to the U6 ACAGA box leads to the formation of a short U6/5′ss helix and rearrangements in PRP8 to produce a half-closed conformation. Following phosphorylation of PRP6 and PRP31 (indicated by stars), major structural remodelling of the tri-snRNP occurs, including the large-scale movement of the BRR2 helicase cassettes, and the U4/U6 quasi-pseudoknot is dissolved. B-specific proteins are recruited to the remodelled complex and tether BBR2 to its activation position. At the same time, they facilitate the formation of the extended U6/5′ss helix, freeing the U4 nucleotides that are subsequently bound by the BRR2 helicase. b, Model of the conversion of a CE to CI spliceosome, including alternative 5′ss/U1 snRNP choices.

Our studies revealed that BRR2 translocation and multiple other tri-snRNP rearrangements that accompany the formation of B complexes and B-like complexes are driven by ATP. Moreover, PRP4K-mediated phosphorylation of PRP6 and PRP31 triggers a cascade of structural rearrangements in the pre-B complex that lead to BRR2 translocation (Fig. 6a). These include, among others, the repositioning of PRP31, PRP6HAT and PRP8RH, and the displacement of SNRNP27K, RBM42 and PRP4K, which in some cases involve a mutually exclusive interaction or location. For example, PRP6HAT can only be repositioned following the translocation of BRR2, and SNRNP27K has to be displaced from the tip of PRP8En to allow BRR2–PRP8Jab1 to dock there following its translocation (Extended Data Fig. 7a,b). Thus, protein phosphorylation coupled with structural rearrangements that generate mutually exclusive positions or protein-binding sites are likely to be the main driving forces for the repositioning of BRR2, a prerequisite for the subsequent spliceosome activation step. Indeed, mutually exclusive RNP interactions are a general mechanism that the spliceosome uses to regulate major structural transitions and to ensure progression of not only spliceosome assembly but also its catalytic activation and splicing catalysis.

B-specific proteins play a key part in stabilizing the extended form of the U6/5′ss helix, as we demonstrated that it does not form in their absence. Our studies also showed that formation of the short U6/5′ss helix is not sufficient to disrupt U4/U6 helix III, which is a prerequisite for binding of U4 snRNA by BRR2 (Fig. 6a). Notably, our add-back experiments showed that the majority of the other RNP remodelling events that occur during the pre-B to B complex transition can occur in the absence of the B-specific proteins. Thus, all of the proteins needed for the release of BRR2 from its pre-B binding site and for its large-scale translocation and docking to its new position at PRP8En are already present in CE and CI pre-B complexes. However, our studies revealed a key role for the B-specific proteins during the final positioning of BRR2 and the opening of the RNA-binding channel of its N-terminal helicase cassette. Most of the binding sites of the B-specific proteins are created during or after the translocation of BRR2 to its pre-activation position, which is consistent with these proteins functioning primarily first after BRR2 translocation, and further indicates that their final positioning and that of BRR2 probably occurs in a coordinated manner. This supports the conclusion that the intermediate conformation of BRR2 that we observed in our purified system in the absence of the B-specific proteins is indeed physiologically relevant.

Finally, our structural data provided new mechanistic insights into the conversion of CE splicing complexes into those assembled across an intron (Fig. 6b). In contrast to previous models of the CE to CI switch, our studies clearly demonstrated that there is no need to establish a CI U1–U2 interaction during the conversion from a CE to CI organized complex. Instead, we showed that the tri-snRNP of a CE pre-B complex has the potential to interact with an upstream U1/5′ss complex, without previous formation of a CI A complex. It subsequently undergoes the extensive rearrangements that generate a B complex, which in the presence of B-specific proteins can then be converted into a catalytically active spliceosome. As the CE and CI pre-B complexes seem to be functionally equivalent, the two assembly pathways mechanistically converge at the pre-B stage. Our structural data further suggest that it may not be necessary to break contacts across the downstream exon to establish a CI interaction with an upstream 5′ss once U2 snRNP and tri-snRNP interactions are in place. This in turn would indicate that the molecular bridge connecting U2 and U1 across an exon has no apparent effect on subsequent spliceosome assembly across the upstream intron. This further supports the idea that one of the primary functions of a CE complex is to promote U2AF–U2 snRNP binding to weak BS or polypyrimidine tracts. The structure of a CE pre-B complex that we reveal here suggests that the pre-B complex can engage in different forms of pre-mRNA splicing depending on which 5′ss it subsequently interacts with. When the upstream U1–5′ss of the same pre-mRNA interacts, canonical splicing that leads to the ligation of the two exons separated by a single intron would occur (Fig. 6b). It is also possible that different U1–5′ss complexes may compete for binding with the U2–tri-snRNP complex, leading to alternative 5′ss usage or exon skipping, or other less frequently occurring forms of splicing, such as trans-splicing or back-splicing. Our studies are consistent with the idea that commitment to the usage of a particular 5′ss and its functional pairing to a given 3′ss occurs primarily during the pre-B to B complex transition when the 5′ss is handed over from U1 to the U6 snRNA. Thus, the regulation of the usage of alternative 5′ss, which would affect the level of exon skipping, could be achieved by modulating the initial interaction with the tri-snRNP and/or productive interaction of PRP28 with the bound U1/5′ss.

Methods

MS2 affinity selection of CE complexes

HeLa S3 cells were obtained from the Helmholtz Zentrum für Infektionsforschung, Braunschweig and tested negative for mycoplasma. Cells were not authenticated. HeLa nuclear extracts were prepared according to a previously published method27 and were dialysed twice for 2.5 h against 50 volumes of Roeder D buffer (20 mM HEPES-KOH, pH 7.9, 0.2 mM EDTA, pH 8.0, 1.5 mM MgCl2, 100 mM KCl, 10% (v/v) glycerol, 0.5 mM DTT and 0.5 mM PMSF). For both pre-B and B-like complexes, 10 nM m7G(5′)ppp(5′)G-capped MINX exon RNA containing 3 MS2 aptamers at its 3′ end7 was pre-incubated with 100 nM MS2–MBP fusion protein for 40 min on ice before addition to the splicing reaction. Splicing reactions were carried out at 30 °C with 50% (v/v) nuclear extract in splicing buffer (1.5 mM MgCl2, 65 mM KCl, 20 mM HEPES-KOH pH 7.9, 2 mM ATP and 20 mM creatine phosphate). For pre-B complexes, splicing was performed for 20 min. To obtain B-like complexes, a 100-fold molar excess of a 5′ss oligo (5′-AAG/GUAAGUAU-3′, where / indicates the exon–intron boundary) was added after allowing pre-B complex formation, and the reaction was incubated for an additional 10 min at 30 °C. Splicing reactions were then chilled on ice for 10 min, centrifuged 15 min at 18,000g to remove aggregates and loaded onto a MBP Trap HP column (GE Healthcare). The column was washed with G-75 buffer (20 mM HEPES-KOH pH 7.9, 1.5 mM MgCl2 and 75 mM NaCl) and complexes were eluted with G-75 buffer containing 15 mM maltose. Eluted complexes were loaded onto a linear 10–30% (v/v) glycerol gradient prepared in G-75 buffer, centrifuged at 17,500 r.p.m. for 18 h at 4 °C in a TST41.14 rotor (Thermo Fisher Scientific), and fractions were collected from the bottom of the gradient. RNA from complexes in peak gradient fractions was separated on a denaturing 4–12% NuPAGE gel (Life Technologies) and visualized by staining with SYBR Gold (Thermo Fisher Scientific). For cryo-EM analysis, eluted complexes were subjected to gradient fixation (GRAFIX)28 and further processed as described below.

Add-back experiments using purified pre-B

CE pre-B complexes were MS2 affinity-purified as described above. To obtain pre-B5′ss complexes, affinity-purified complexes were subsequently incubated for 10 min at 0 °C or 30 °C with a 100-fold molar excess of the 5′ss oligo alone. To generate pre-B5′ss+ATP/ATPγS complexes, after incubation with the 5′ss oligo at 0 °C, complexes were subsequently incubated for 30 min at 30 °C after addition of 2 mM ATP or ATPγS. To generate pre-B5′ssLNG+ATPγS complexes, purified pre-B complexes were incubated with a 100-fold molar excess of an elongated 5′ss oligo (5′-AAG/GUAAGUAUCGUUCCAA-3′) for 10 min at 0 °C, followed by the addition of 2 mM ATPγS and incubation for an additional 30 min at 30 °C. To obtain pre-BATP or pre-BAMPPNP complexes, affinity-purified pre-B complexes were incubated with 2 mM ATP or AMPPNP, respectively, for 30 min at 30 °C. All complexes were then loaded onto a linear 10–30% (v/v) glycerol gradient prepared in G-75 buffer, and centrifuged and analysed as described above. For cryo-EM analyses, eluted complexes were subjected to gradient fixation (GRAFIX) and further processed as described below.

Western blotting

For western blot analysis of purified pre-B, pre-B5′ss, pre-B5′ss+ATP and pre-BATP complexes, 200 fmoles of each complex was separated on 4–12% NuPAGE gels and transferred to a Hybond P membrane. Membranes were first blocked with 5% milk in 1× TBS-T buffer (20 mM Tris-HCl, pH 7.5, 150 mM NaCl and 0.1% Tween 20) and then incubated with rabbit antibodies against human phospho-PRP6 (1:1,000 dilution) and phospho-PRP31 (1:500 dilution), followed by antibodies against human SF3B1 (1:700 dilution), PRP31 (1:500 dilution) and PRP6 (1:1,000 dilution; AB99292, Abcam). Subsequent to incubation with the primary antibodies, membranes were washed with TBS-T buffer and incubated with HRP-conjugated goat anti-rabbit IgG (1:30,000 dilution; 111-035-144, Jackson Immunoresearch). After washing, membranes were immunostained using an enhanced chemiluminescence detection kit (GE Healthcare) and the signal was visualized using an Amersham Imager 680.

Protein–protein crosslinking

CE pre-B complexes were MS2 affinity-purified as described above with the following modifications: after affinity purification, eluted complexes were crosslinked with 400 μM BS3 for 45 min at 18 °C in a total volume of 1.4 ml. Crosslinked complexes were loaded onto a linear 10–30% (v/v) glycerol gradient and subjected to centrifugation at 17,500 r.p.m. for 18 h at 2 °C in a TST41.14 rotor. Four peak fractions containing dimeric pre-B complexes were pooled and ultracentrifuged in a S100-AT4 rotor (Thermo Fisher Scientific). The pelleted, crosslinked dimeric CE pre-B complexes (approximately 20 pmol) were dissolved in 50 mM ammonium bicarbonate buffer containing 4 M urea, reduced with dithiothreitol, alkylated with iodoacetamide and, after diluting the urea to 1 M, in-solution digested with trypsin. Peptides were reverse-phase extracted using Sep-Pak Vac tC18 1cc cartridges (Waters), lyophilized and subsequently dissolved in 40 µl 2% acetonitrile (ACN) and 20 mM ammonium hydroxide. Peptides were separated on an xBridge C18 3.5 µm 1 × 150 mm reverse-phase column (Waters) using a 4–48% gradient of ACN in 10 mM ammonium hydroxide over 45 min at a flow rate of 60 µl min–1. One-minute fractions of 60 µl were collected, pooled in a step of 12 min (resulting in 12 pooled fractions in total), vacuum dried and dissolved in 5% ACN and 0.1% trifluoroacetic acid (TFA) for subsequent uHPLC-ESI–MS/MS analysis that was performed in triplicate on an Orbitrap Exploris 480 (Thermo Scientific). The mass spectrometer was coupled to a Dionex UltiMate 3000 uHPLC system (Thermo Scientific) with a custom 35 cm C18 column (75 µm inner diameter packed with ReproSil-Pur 120 C18-AQ beads, 3 µm pore size (Dr. Maisch)). The MS1 and MS2 resolutions were set to 120,000 and 30,000, respectively. Only precursors with a charge state of 3–8 were selected for MS2. MS data were acquired using Thermo Scientific Xcalibur (v.4.4.16.14) software.

B-like complexes were MS2 affinity-purified as described above with the following modifications: after affinity purification, eluted spliceosomal complexes were crosslinked with 350 μM BS3 for 30 min at 18 °C in a total volume of 2 ml. Crosslinked complexes were loaded onto a linear 10–30% (v/v) glycerol gradient and subjected to centrifugation at 21,000 r.p.m. for 12 h at 2 °C in a TST41.14 rotor. Peak fractions containing dimeric B-like complexes (about 12 pmol) were pooled, pelleted, digested and the peptides were reverse-extracted as described above for the pre-B complexes. Peptides were fractionated by gel filtration using a Superdex Peptide PC3.2/30 column (GE Healthcare) in 30% ACN and 0.1% TFA. Fifty-microlitre fractions corresponding to an elution volume of 1.2–1.8 ml were analysed in duplicate on a Thermo Scientific Orbitrap Fusion Lumos Tribrid mass spectrometer coupled to Ultimate 3000 uHPLC (Thermo Scientific). MS data acquisition was performed using Thermo Scientific Xcalibur (v.4.5.445.18) software.

The protein composition of the spliceosomal complexes was determined in a search with MaxQuant (v.2.4.2.0) against a UniProt human reference proteome using the same samples but before pre-fractionation by either offline reverse phase (pre-B) or size exclusion (B-like) chromatography. Based on the MaxQuant results, restricted protein databases were compiled and used for protein–protein crosslink identification by searching the Thermo raw files with pLink (v.2.3.11) for pre-B and pLink (v.2.3.9) for B-like complexes29. For model building, a maximum distance of 30 Å between the Cα atoms of the crosslinked lysine residues was allowed.

EM sample preparation and imaging

For cryo-EM samples, spliceosomal complexes were loaded onto a linear 10–30% (v/v) glycerol gradient prepared in G-75 buffer containing 0–0.1% glutaraldehyde (GRAFIX) and centrifuged at 17,500 r.p.m. for 18 h at 4 °C in a TST41.14 rotor. Fractions were collected from the bottom of the gradient and were quenched with 120 mM Tris-HCl pH 7.5 on ice. Complexes in the peak gradient fractions were pooled, buffer-exchanged and concentrated in an Amicon 50 kDa cut-off unit. Complexes were then adsorbed for 20 min to a thin layer carbon film that was subsequently attached to R2/2 UltrAuFoil grids (Quantifoil). A volume of 3.8 μl of double-distilled water was applied to the grids and excess water was blotted away using a FEI Vitrobot loaded with pre-wet filter paper, with the following settings: blotting force of 11 and blotting time of 7.5 s at 4 °C and 100% humidity. Samples were subsequently vitrified by plunging into liquid ethane cooled to liquid nitrogen temperature. Cryo-EM grids of the pre-B, pre-B5′ss, pre-B5′ss+ATPγS, pre-B5′ssLNG+ATPγS and pre-BATP were imaged in a Titan Krios 1 (Thermo Fisher Scientific), equipped with a Cs corrector, operated at 300 kV, on a Falcon III detector in linear mode at a calibrated pixel size of 1.16 Å at the specimen level (see Extended Data Table 1 for a summary of EM statistics). Cryo-EM grids of B-like and pre-BAMPPNP were imaged in a Titan Krios 3 (Thermo Fisher Scientific), operated at 300 kV, on a Falcon III detector in linear mode at a calibrated pixel size of 1.35 Å at the specimen level. Krios1 and Krios3 cryo-EM images were acquired using Thermo Fisher EPU2.1 with an exposure time of 1.02 s (40 movie frames), with a total dose of 60 e Å–2 and 48 e Å–2, respectively.

EM data processing

For all of the cryo-EM datasets, frames were aligned, dose-weighted and summed using MotionCor (v.2.0)30. Defocus values were estimated using Gctf31. Particle picking was performed using crYOLO32. For each sample, approximately 800–1,000 particles were manually picked from 30–50 micrographs and used to train a neural network model, which was then used to automatically pick particles for the corresponding dataset. All subsequent processing was performed using RELION 3.1 (http://www2.mrc-lmb.cam.ac.uk/relion/index.php/Main_Page) unless otherwise specified. Cryo-EM data were split randomly into two halves for gold-standard FSC determination in RELION 3.1.

For the pre-B complex, 777,350 particles were picked from 25,904 micrographs, extracted and binned to 200 × 200 pixels (3× binned, pixel size of 3.48 Å). After reference-free two-dimensional (2D) classification, 586,198 particles were retained for further processing, from which 100,000 particles were used for ab initio reconstruction in cryoSPARC33. The ab initio model showed a 3D structure resembling that of a CI pre-B complex but with additional fuzzy densities (which later turned out to be the second protomer). The fuzzy density, as well as the unstable U2 density, was erased using Chimera34, and the resulting 3D structure was low-pass filtered to 40 Å resolution to prevent model bias, which was then used for 3D classification in RELION 3.1. The 586,198 particles were 3D classified into 5 classes that contained 2 major types of particles. In class 1, both protomers were well-defined, whereas in class 2, only one well-defined pre-B protomer was observed. To improve the resolution of the pre-B protomer, the two protomers were separately re-extracted, re-centred in a box of 160 × 160 pixels (3× binned, pixel size of 3.48 Å) using the alignment parameters from the first round of 3D classification. The resulting particles were then 3D classified separately, and the good particles were combined and subjected to a masked 3D classification, focusing on the tri-snRNP density. The 279,781 particles showing a well-defined tri-snRNP density were then re-extracted in the original pixel size in a 480 × 480 box. 3D refinement, CTF refinement and Bayesian polishing were performed in two rounds. In the final round of 3D refinement, soft masks around the tri-snRNP core—encompassing PRP8NTD, the PRP8 Large domain (PRP8Large), U4/U6 stem I and stem III, U5 snRNA, SNU114, SAD1, DIM1, the PRP6 N-terminal domain (PRP6NTD), SF3A1 C-terminal region (SF3A1CT), the PRP28 N-terminal domain (PRP28NTD) and the BRR2 N-terminal domain (BRR2NTD)—and the BRR2 region (encompassing the BRR2 helicase domain and PRP8Jab1) were applied, producing two 3D reconstructions at nominal resolution of 3.5 Å and 4.2 Å, respectively. Focused classification without alignment was applied to improve the U4 core region (encompassing the U4 Sm domain, SNU66, U4/U6 stem I and stem III, RBM42, PRP8RH and PRP8En) and the U2 region (encompassing the U2 5′ domain comprising SF3b proteins, U2/U6 helix II and U6 Lsm proteins). After masked refinement, the U4 core and the U2 region were resolved to nominal resolutions of 6.1 Å and 12 Å, respectively.

For the B-like complex, 488,598 particles from 14,665 micrographs were picked, extracted and binned to 200 × 200 pixels (3× binned, pixel size of 4.05 Å). A total of 389,830 particles were retained after reference-free 2D classification, from which 100,000 particles were used for ab initio reconstruction in cryoSPARC33. The ab initio model showed a dimeric structure. The less well-defined protomer was erased using Chimera34, and the better-defined protomer was low-pass filtered to 40 Å and used as the starting model for 3D classification of the entire dataset. For the first round of 3D classification, a soft mask around one protomer was applied, so that all the particles were forced to align to only one protomer. This separated particles that had at least one well-defined protomer from the bad particles. To investigate whether the good particles contain monomeric B-like complexes, after the first round of 3D classification the particles were further 3D classified into four classes without a mask. All of the 3D classes showed well-defined dimeric complexes, which suggested that all of the good particles are dimeric B-like complexes. To improve the resolution of the B-like protomers, particles were re-centred and re-extracted in the original pixel size in a 480 × 480 pixels box. Two rounds of 3D refinement, CTF refinement and Bayesian polishing were performed. In the final round of 3D refinement, soft masks around the tri-snRNP core (encompassing PRP8, the 5′ss oligo, U4/U6 stem I and stem II, U5 snRNA, PRP3, SNU114, SAD1, DIM1, PRP6NTD, SF3A1CT, PRP28NTD, SNU13, FBP21, SNU23, MFAP1 and PRP38A), the BRR2 region (encompassing the BRR2 helicase domain and PRP8Jab1) and the U4/U6 region (encompassing U4/U6 stem I and stem II, SNU13, PRP3, PRP4, PRP31, PPIH, PRP6HAT and PRP8RH) were applied, producing three 3D reconstructions with nominal resolutions of 3.1 Å, 4.3 Å, and 3.3 Å, respectively. Focused classification without alignment was applied to improve the U2 region (encompassing the U2 5′ region, U2/U6 helix II and SMU1). After masked refinement, the U2 region was improved to about 12 Å resolution.

For the pre-B5′ss complex, 1,283,541 particles from 23,372 micrographs were picked, extracted and binned to 200 × 200 pixels (3× binned, pixel size of 3.48 Å). Overall, 944,381 particles were retained after reference-free 2D classification and subjected to 3D classification using the low-pass filtered tri-snRNP part of the pre-B complex or the tri-snRNP core of the B-like complex (excluding the BRR2 region) as the starting model. Both starting models generated the same result, with one 3D class containing a well-defined protomer. No class resembling the B-like complex was found even when the tri-snRNP core of the B-like complex was used as the starting model. To separate the class 1 and class 2 dimers, the good 3D class was further classified into nine classes. To improve the resolution of the pre-B5′ss protomer, particles were re-centred and re-extracted in the original pixel size in a 480 × 480 pixels box, and another round of 3D classification was performed with a soft mask around the tri-snRNP region. The good class was selected and 3D refined, followed by one round of CTF refinement and Bayesian polishing. The final 176,879 particles were 3D refined with a soft mask around the tri-snRNP core (encompassing 5′ss oligo, PRP8NTD, PRP8Large, U4/U6 stem I and stem III, U5 snRNA, SNU114, SAD1, DIM1, PRP6NTD, SF3A1CT, PRP28NTD and BRR2NTD), resulting in a 3D reconstruction at a nominal resolution of 4.2 Å.

For the pre-B5′ss+ATPγS complex, 791,079 particles were picked, extracted and binned to 200 × 200 pixels (3× binned, pixel size of 3.48 Å). As the ab initio reconstruction from cryoSPARC largely resembles the B-like complex, the tri-snRNP core (excluding BRR2) of the B-like complex was low-pass filtered to 40 Å and used as the starting model for 3D classification. The good classes were combined, re-centred and re-extracted in the original pixel size in a 480 × 480 pixels box. Two rounds of 3D refinement, CTF refinement and Bayesian polishing were performed, and the final 411,185 particles were refined with a soft mask around the tri-snRNP core (encompassing PRP8, 5′ss oligo, U4/U6 stem I and stem II, U5 snRNA, PRP3, SNU114, SAD1, DIM1, PRP6NTD, SF3A1CT, PRP28NTD and SNU13), producing a 3D reconstruction at a nominal resolution of 3.1 Å. Focused classification without alignment followed by a masked refinement was applied to improve the BRR2 region (encompassing the BRR2 helicase domain, PRP8En and PRP8Jab1) to a nominal resolution of 4.0 Å.

For the pre-B5′ssLNG+ATPγS complex, 541,230 particles from 13,740 micrographs were picked, extracted and binned to 200 × 200 pixels (3× binned, pixel size of 3.48 Å). Using the low-pass filtered tri-snRNP core of the pre-B5′ss+ATPγS complex as a starting model, the particles were 3D classified, and particles from the best class were re-centred and re-extracted in the original pixel size in a 480 × 480 pixels box. After two rounds of 3D refinement, CTF refinement and Bayesian polishing, the final 136,333 particles were refined with a soft mask around the tri-snRNP core (encompassing PRP8, the long 5′ss oligo, U4/U6 stem I and stem II, U5 snRNA, PRP3, SNU114, SAD1, DIM1, PRP6NTD, SF3A1CT, PRP28NTD and SNU13), producing a 3D reconstruction at a nominal resolution of 3.7 Å.

For the pre-BATP complex, 757,260 particles from 11,752 micrographs were picked, extracted and binned to 200 × 200 pixels (3× binned, pixel size of 3.48 Å). A total of 499,792 particles were retained after 2D classification. Various starting models were tested for 3D classification, including the tri-snRNP core from the pre-B, pre-B5′ss and pre-B5′ss+ATPγS complexes. The density of BRR2 was erased from all of the starting models to prevent model bias, and the low-pass filtered tri-snRNP core from the pre-B5′ss+ATPγS complex worked best for 3D classification. No class resembling pre-B or pre-B5′ss was detected even when the two complexes were used as the starting model. The particles from the best 3D class (94,460 particles) were further 3D classified with a resolution limit of 30 Å, which showed that 25.4% of the particles contain a well-resolved second protomer. The rest (74.6%) of the particles showed a poorly resolved second protomer, which was due to either the flexibility of the second protomer or the lack of stable tri-snRNP integration in the second protomer (that is, it consists of a CE A-like complex). Given that all of the 94,460 particles contained at least one good protomer, for 3D reconstruction of the high-resolution core, all of these particles were re-centred and re-extracted in the original pixel size in a 480 × 480 pixels box. After two rounds of 3D refinement, CTF refinement and Bayesian polishing, the final 3D refinement was performed with a soft mask around the tri-snRNP core (encompassing PRP8, the 5′ss region of the MINX exon RNA, U4/U6 stem I and stem II, U5 snRNA, PRP3, SNU114, SAD1, DIM1, PRP6NTD, SF3A1CT, PRP28NTD and SNU13), producing a 3D reconstruction at a nominal resolution of 3.7 Å. Subsequent local 3D classification around the tri-snRNP core did not reveal further structural heterogeneity, which suggested that the tri-snRNP core remains identical regardless of the presence or absence of a stable second protomer.

For the pre-BAMPPNP complex, 619,945 particles from 20,337 micrographs were picked, extracted and binned to 200 × 200 pixels (3× binned, pixel size of 4.05 Å). In total, 371,919 particles were retained after 2D classification. The same set of starting models prepared for the pre-BATP complex was used for 3D classification of the pre-BAMPPNP complex, with the low-pass filtered tri-snRNP core from the pre-B complex working best. No class resembling pre-B5′ss+ATPγS was detected even when it was used as the starting model. The particles from the best 3D class were re-centred and re-extracted in the original pixel size in a 480 × 480 pixels box. After one round of 3D refinement, CTF refinement and Bayesian polishing, the final 53,422 particles were refined with a soft mask around the tri-snRNP core (encompassing PRP8NTD, PRP8Large, U4/U6 stem I and stem III, U5 snRNA, SNU114, SAD1, DIM1, PRP6NTD, SF3A1CT, PRP28NTD and BRR2NTD), producing a 3D reconstruction at a nominal resolution of 4.1 Å. Focused classification without alignment followed by a masked refinement was applied to improve the PRP28/U1 snRNP region (encompassing PRP8, U5 snRNA, PRP28 and U1 snRNP) to a nominal resolution of 6.1 Å.

Model building and refinement

Model building was carried out by docking cryo-EM, crystal and AlphaFold2 structures into EM density and adjusting in COOT35. A list of modelled protein and RNA components, as well as their corresponding model templates, is provided in Supplementary Table 3. In brief, the CE pre-B complex was modelled by fitting the tri-snRNP and U2 snRNP parts of the CI pre-B complex (Protein Data Bank (PDB) identifier 6QX9) into the EM density as rigid bodies. The U2 part (including the SF3B core complex, the SF3A core complex, U2 Sm, U2-A′ and U2-B″) was truncated to a polyalanine chain without further adjustment. The SF3B6 protein was modelled based on its position relative to the SF3B1 C-terminal HEAT domain in the A-like complex (PDB 7Q4O) without further adjustment, consistent with crosslinks (Supplementary Table 2 and Extended Data Fig. 2). For the high-resolution tri-snRNP part, each individual component and its side chains were adjusted manually in COOT. The B-like complex was modelled by fitting the B complex model (PDB 8Q7N) into the EM density, and the parts that are absent in B-like (that is, UBL5, an extended U6/5′ss helix, TCERG1 and BUD31) were deleted from the model. Two copies of the 5′ss oligo were modelled de novo, and the high-resolution tri-snRNP part was manually adjusted in COOT. The pre-B5′ss complex was modelled by fitting individual components of the CE pre-B complex into the EM density as rigid bodies. PRP8, along with 5′ss oligo 1, was taken from the B-like complex and fit into the EM density as a rigid body. The side chains were initially truncated to a polyalanine chain owing to the relatively lower resolution, and the carbon backbones were manually adjusted in COOT. The side chains were then added back manually at the positions where the local resolution allows. The pre-B5′ss+ATPγS) complex was modelled by fitting individual components of the B-like complex into the EM density. The high-resolution tri-snRNP part was adjusted in COOT. The crystal structure of the BRR2 helicase (PDB 4F91) was truncated into a polyalanine chain and docked into the density as a rigid body. The BRR2CC was not further adjusted, and the N-terminal cassette was manually adjusted into the density in COOT. The C-terminal part of SF3A1 (amino acids 496–521) was predicted by AlphaFold2 and docked into the density and adjusted in COOT. U6 nucleotides between the U6/5′ss helix and U4/U6 stem III (nucleotides 35–39), and U4 nucleotides between U4/U6 stem I and stem III (nucleotides 63–74) were de novo modelled in COOT. The pre-B5′ssLNG+ATPγS complex was modelled by fitting the pre-B5′ss+ATPγS complex into the EM density, and the flexible U4 snRNA strand (nucleotides 62–85) was deleted. The U4 Sm core was fit into the density as a rigid body. The extended U6/5′ss helix was modelled as a A-form helix and fit into the density. The pre-BATP complex was modelled by fitting the pre-B5′ss+ATPγS complex into the EM density, and changing the 5′ss oligo sequence into the MINX exon sequence. A−4 and C−5 of the MINX exon were de novo modelled into the EM density in COOT. The PRP28 RecA domains were deleted owing to the absence of EM density at the corresponding position. The pre-BAMPPNP complex was modelled by fitting the CE pre-B complex into the EM density. The U1 Sm core, U1 snRNA and U1-70K were taken from the CI pre-B complex (PDB 6QX9) and docked into the EM density as a rigid body. The closed RecA domains of PRP28, together with the unknown single-stranded RNA, were modelled based on the crystal structure of the closed Mss116p DEAD-box helicase bound to AMP-PNP and a single-stranded RNA (PDB 3I5X). Coordinates of the tri-snRNP parts of the various complexes were refined in real space using PHENIX36.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.