Introduction

To meet the continuous challenge from a wide variety of environmental pathogens, the adaptive immune system has evolved processes that diversify the antigen receptors of lymphocytes (Alt et al. 2013). The antigen receptors include the T cell receptor (TCR), which serves as the basis for the cell-mediated immune response, and the B cell receptor (BCR), which can be secreted as antibody or immunoglobulin (Ig) in humoral immunity. The diversification of antigen receptors, or immune repertoires, is initiated by programmed DNA lesions, which the general DNA repair machineries channel into either recombination or mutation events with both common and unique features.

Immune diversity is mainly achieved in two steps, one antigen-independent and one antigen-dependent. The first step of immune diversification involves the joining of the variable (V), diversity (D) and joining (J) gene segments to form the variable region exons of the immunoglobulin heavy (IgH) and light (IgL) chains or the TCR-α/β/γ/δ chains. This antigen-independent process occurs during lymphocyte development and is called V(D)J recombination (Gellert 2002; Helmink and Sleckman 2012; Schatz et al. 1992). Upon antigen stimulation, the BCR undergoes additional diversification processes in mature B cells, namely IgH class switch recombination (CSR), Ig variable (IgV) exon somatic hypermutation (SHM), or gene conversion (GCV). CSR switches the BCR isotype from IgM to other classes, which changes the effector function of a particular BCR and elicits different downstream immune functions (Chaudhuri et al. 2007; Stavnezer et al. 2008). SHM introduces point mutations or small insertions and deletions (indels) into the IgV exon, allowing antibody affinity maturation (Di Noia and Neuberger 2007; Peled et al. 2008). GCV, which exchanges the expressed rearranged V with an upstream V gene segment, is observed in birds and rabbits (Maizels 1987).

V(D)J recombination is initiated by the RAG endonuclease, while CSR, SHM and GCV are all initiated by the same cytidine deaminase, activation-induced deaminase (AID) (Arakawa et al. 2002; Muramatsu et al. 2000; Revy et al. 2000). The activities of RAG and AID are highly regulated in lymphocytes to ensure an ordered diversification of antigen receptors (Teng and Schatz 2015; Yeap and Meng 2019). The programmed DNA lesions caused by RAG or AID, either DNA double strand breaks (DSBs) or a modified base (deoxyuridine, U), activate the general cellular DNA damage response factors and are repaired efficiently by general DNA repair factors. In this context, the RAG-generated ends are exclusively ligated through the classic nonhomologous end-joining (c-NHEJ) pathway (Boboila, Alt, et al. 2012; Deriano and Roth 2013). In SHM, the base excision repair (BER) and mismatch repair (MMR) pathways cooperatively channel the U’s into mutations or indels (Peled et al. 2008). In CSR, by contrast, AID-initiated U’s are first converted into DSBs (Stavnezer et al. 2008). CSR utilizes both the c-NHEJ and alternative end-joining (Alt-EJ) pathways to ligate the ends (Boboila, Alt, et al. 2012; Deriano and Roth 2013; Yan et al. 2007), while GCV uses a homologous recombination (HR)-based mechanism to diversify the antibody repertoire (Arakawa and Buerstedde 2004). Immune diversification thus happens in a physiological context and utilizes the general DNA repair pathways. For this reason, V(D)J recombination and CSR are frequently used as reporter assays to characterize DSB response factors or NHEJ factors, as exemplified by the studies of Shieldin (Dev et al. 2018; Ghezraoui et al. 2018; Noordermeer et al. 2018).

Here we use antibody CSR as an exemplar to discuss the common and unique features in the repair of programmed DNA lesions. Readers are referred to recent reviews on V(D)J recombination (Schatz and Ji 2011), on SHM and CSR (Methot and Di Noia 2017; Yeap and Meng 2019) for the general mechanisms of immune diversity, and on the roles of particular DNA repair pathways in immune diversity (Saha et al. 2020; Wang et al. 2020).

Introduction of programmed DNA lesions by DNA deaminase

CSR, SHM and GCV were considered three distinct biological processes until the discovery of AID (Muramatsu et al. 2000; Revy et al. 2000). AID was first identified by a cDNA subtraction approach that compared cytokine-stimulated and unstimulated cells of the murine B cell line CH12F3 (Muramatsu et al. 1999). Later, data from both a gene knockout mouse model (Muramatsu et al. 2000) and patients with hyper IgM syndrome (a genetic disorder with defective CSR) (Revy et al. 2000) showed that AID is required for CSR and SHM (Honjo et al. 2002), as well as for GCV (Arakawa et al. 2002; Fugmann and Schatz 2002). AID is a member of the AID/APOBEC cytidine deaminase superfamily and was originally considered an RNA editing enzyme based on its homology to the RNA editing enzyme APOBEC1 (Honjo et al. 2002), until several lines of direct evidence showed that AID is an authentic DNA mutator (Chaudhuri et al. 2003; Petersen-Mahrt et al. 2002; Pham et al. 2003). It turns out that, within the AID/APOBEC superfamily, APOBEC1 is the only RNA editing enzyme, while the other catalytically active enzymes all act on single-stranded DNA (ssDNA) (Harris and Dudley 2015; Liu and Meng 2018). AID preferentially converts the cytidine in the WRC (W = A/T; R = G/A) motif to uridine on ssDNA, leading to a U:G mismatch in genomic DNA (Fig. 1a).
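The WRC preference described above can be illustrated with a minimal Python sketch. It only scans a sequence for candidate hotspot cytidines on both strands and models a single C to U conversion; the function names `wrc_cytidines` and `deaminate` are hypothetical helpers introduced here, and the sketch makes no claim about how AID actually selects its targets in vivo.

```python
import re

# WRC (W = A/T, R = G/A) on the top strand. Lookahead allows overlapping hits.
WRC = re.compile(r"(?=([AT][AG]C))")
# GYW is the reverse complement of WRC: a hit marks a WRC on the bottom strand,
# with the target C paired to the leading G of the top-strand match.
GYW = re.compile(r"(?=(G[CT][AT]))")


def wrc_cytidines(seq):
    """Return (top, bottom) lists of 0-based top-strand coordinates.

    top    : positions of the C in each top-strand WRC motif.
    bottom : positions of the G paired with the C of each bottom-strand WRC.
    """
    seq = seq.upper()
    top = [m.start() + 2 for m in WRC.finditer(seq)]
    bottom = [m.start() for m in GYW.finditer(seq)]
    return top, bottom


def deaminate(seq, position):
    """Model deamination of one top-strand C, written as U for illustration."""
    if seq[position] != "C":
        raise ValueError("deamination is modelled only at cytidines")
    return seq[:position] + "U" + seq[position + 1:]


if __name__ == "__main__":
    top, bottom = wrc_cytidines("GAGCTG")
    print(top, bottom)
    print(deaminate("GAGCTG", top[0]))
```

In the palindromic AGCT motif, a candidate cytidine is reported on each strand at adjacent positions, which is the property the text refers to when it notes that C’s on both strands can be deaminated.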

Fig. 1

IgH class switch recombination (CSR). Schematic illustration of IgH constant genes and AID targeting (a), processing of U’s into DSBs by BER and MMR (b), activation of DSBR (c) and end ligation by c-NHEJ and Alt-EJ (d). See text for more details

Biochemical evidence indicates that AID acts on ssDNA that is generated during gene transcription (Chaudhuri et al. 2003; Pham et al. 2003), which correlates with the deamination observed in transcribed regions in vivo (Yoshikawa et al. 2002) (Fig. 1a). Many cis- and trans-factors are involved in the regulation of AID activity in vivo, as we recently discussed (Yeap and Meng 2019). In particular, both the RNA exosome and the ssDNA binding complex RPA are required for AID’s access to its ssDNA substrates (Basu et al. 2011; Chaudhuri et al. 2004). Although the AID C-terminus was proposed to interact with or recruit several DNA repair factors, including BER/MMR factors (Ranjit et al. 2011) and DSB response factors (Zahn et al. 2014), genetic evidence suggests that the introduction and the processing of DNA lesions are separate steps in CSR (Zarrin et al. 2007).

During CSR, AID extensively targets the long repetitive S regions (Hackney et al. 2009; Yu and Lieber 2003), which are located in the first intron of each IgH constant (IgH C) gene (Fig. 1a). During evolution, the AID-preferred substrate motif AGCT has been selectively enriched in S regions; this motif offers an ideal substrate for AID since C’s on both strands can be deaminated (Han et al. 2011). In mammalian cells, S regions harbor additional repetitive G-rich motifs (Hackney et al. 2009; Yu and Lieber 2003). Besides the Ig loci, AID targets many off-target proto-oncogenes, and its ectopic expression or mis-regulation contributes to lymphomagenesis (Casellas et al. 2016). Several high-throughput approaches have been developed to track AID off-target sites in CSR, including high-throughput genome-wide translocation sequencing (HTGTS) (Chiarle et al. 2011; Meng et al. 2014), translocation-capture sequencing (TC-Seq) (Klein et al. 2011), resected ssDNA ChIP-seq (Qian et al. 2014), and DNA capture and sequencing (Alvarez-Prado et al. 2018). The chromatin features associated with AID targeting show that AID prefers divergently or convergently transcribed regions (Meng et al. 2014; Pefanis et al. 2014) where super-enhancers and genes intersect (Meng et al. 2014; Pefanis et al. 2015; Qian et al. 2014). The specific AID target site contributes to the mutagenic outcome, as elegantly demonstrated in germinal center B cells undergoing SHM and/or CSR (Liu et al. 2008). How the DNA sequence or other cis-elements contribute to this differential DNA repair is still an open question in the field. Machine learning has been applied to predict AID targeting based on mutation frequency (Alvarez-Prado et al. 2018) and, more recently, to predict base editing outcomes (Arbab et al. 2020). However, whether chromatin location, in addition to DNA sequence, affects DNA repair is still unclear.

Generation of DSBs through BER and MMR

In CSR, AID is specifically targeted to the IgH S regions and the deamination products are channeled into DSBs through the BER and MMR pathways (Fig. 1b). UNG is the major glycosylase for U excision in S regions. Mouse genetic studies revealed that UNG deficiency severely impairs CSR (Rada et al. 2002) and that additional SMUG1 deficiency further decreases CSR (Di Noia et al. 2006; Dingler et al. 2014), which is consistent with the fact that deleterious UNG mutations cause hyper IgM syndrome type V in human patients (Imai et al. 2003). However, UNG deficiency alone does not completely abolish CSR (Rada et al. 2002, 2004). The U:G mismatch is also a substrate for MMR. Even before the discovery of AID, several studies had already revealed a critical role of MMR proteins in SHM (Phung et al. 1998; Rada et al. 1998; Wiesendanger et al. 2000). The discovery of AID explained the role of MMR proteins in processing AID-initiated lesions, and a panel of MMR proteins was found to be vital to CSR (Li et al. 2004; Martin et al. 2003; Martomo et al. 2004; Schrader et al. 2002, 2003). Later, combined UNG and MMR deficiencies were found to completely abolish CSR in different mouse models (Rada et al. 2004; Shen et al. 2006), suggesting that BER and MMR act together in processing AID lesions.

BER and MMR are important cellular pathways for maintaining genome integrity, and they use two distinct sets of proteins to process a damaged base or a mismatch into a single-strand nick or gap. In BER, the damaged base is removed by a glycosylase to generate an apurinic/apyrimidinic (AP) site, which is further processed into a nick by apurinic/apyrimidinic endonuclease 1 (APE1) (Jacobs and Schar 2012). In MMR, the mismatch is recognized by the MutS complex, which recruits a cascade of scaffold and nuclease proteins to generate a single-strand gap (Li 2008). The nick or gap is then subjected to DNA polymerase fill-in and XRCC1-Lig1/3-mediated ligation. In CSR, however, instead of working in an error-free manner, BER and MMR process the U’s into DSBs. B cells could achieve this by using only the first few steps of BER/MMR. Consistent with this idea, deficiency of XRCC1, which forms a complex with Lig1/3 in the ligation step, enhances CSR in mouse B cells (Han et al. 2012; Saribasak et al. 2011). Although XRCC1 could also act through its role in Alt-EJ, it is tempting to speculate that XRCC1’s role in sealing gaps during single-strand break repair inhibits DSB generation in CSR. Furthermore, the sequence features of S regions could aid DSB generation (Yu et al. 2004). The enriched palindromic AGCT motif offers a convenient way to generate a DSB: UNG-processed AP sites on both strands can be cut by APE1 to form two closely spaced single-strand nicks, which can turn into a DSB (Han et al. 2011). Consistently, APE1 is required for CSR in B cells (Masani et al. 2013). Last but not least, the unique features of the S region sequence also offer other routes to DSB generation. The G-rich S region can form G4 (Dempsey et al. 1999) or R-loop structures (Yu et al. 2003), which could contribute to the genomic breaks observed in S regions in the absence of AID (Chiarle et al. 2011).
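The idea that two nearby nicks on opposite strands can give rise to a DSB can be sketched in a few lines of Python. This is an illustration of the reasoning only: the function name `paired_nicks` is hypothetical, and the default `max_distance` of 10 bp is an arbitrary illustrative value, not a measured threshold for nick-to-DSB conversion.

```python
def paired_nicks(top_nicks, bottom_nicks, max_distance=10):
    """Return (top, bottom) nick pairs that lie on opposite strands and are
    at most `max_distance` bp apart (an assumed, illustrative cutoff).

    Each pair is a candidate site at which two single-strand nicks could
    be converted into a double strand break.
    """
    pairs = []
    for t in sorted(top_nicks):
        for b in sorted(bottom_nicks):
            if abs(t - b) <= max_distance:
                pairs.append((t, b))
    return pairs


if __name__ == "__main__":
    # Nicks at adjacent positions on the two strands, as expected at an
    # AGCT motif, are reported; isolated nicks are not.
    print(paired_nicks([3, 50], [2, 100]))
```

Under this toy model, densely spaced AGCT motifs increase the chance that nicks on the two strands fall within the cutoff, which is the intuition behind the enrichment argument in the text.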

Ligation of DSB ends by end-joining pathways

The AID-initiated DSBs activate the DSB response (DSBR) in activated B cells (Fig. 1c), as NBS1/gamma-H2AX foci are quickly formed at the IgH locus (Petersen et al. 2001). The roles of many DSBR factors in CSR have been dissected in mouse models, including the MRN complex (Rass et al. 2009; Reina-San-Martin et al. 2005; Xie et al. 2009; Yin et al. 2009; Zha et al. 2009), ATM (Lumsden et al. 2004; Reina-San-Martin et al. 2004) and H2AX (Franco et al. 2006; Reina-San-Martin et al. 2003). Among the DSBR factors, 53BP1 plays a unique role in CSR, as its deficiency leads to the most dramatic decrease in CSR (Manis et al. 2004; Ward et al. 2004) compared with deficiencies of its upstream factors, e.g., RNF8/RNF168 (Ramachandran et al. 2010; Santos et al. 2010), or its downstream factors, e.g., Rif1 (Di Virgilio et al. 2013), PTIP (Daniel et al. 2010; Starnes et al. 2016) and Rev7/Shieldin (Boersma et al. 2015; Dev et al. 2018; Ghezraoui et al. 2018; Noordermeer et al. 2018; Xu et al. 2015). Beyond its role in inhibiting end resection, 53BP1 could affect DSB repair at the chromatin level by influencing chromatin synapsis or movement (Difilippantonio et al. 2008; Dimitrova et al. 2008; Kilic et al. 2019; Lukas et al. 2011; Ochs et al. 2019), potentially through its oligomerization domain (Lottersberger et al. 2013; Sundaravinayagam et al. 2019).

The end-joining pathways finally ligate the two DSBs at the upstream and downstream S regions in CSR (Fig. 1d). Unlike end-joining in V(D)J recombination, in which DSB ends are exclusively ligated by the classic NHEJ pathway (Deriano and Roth 2013), the ends in CSR can be ligated by both c-NHEJ and Alt-EJ (Boboila, Alt, et al. 2012). NHEJ comprises several evolutionarily conserved factors, which are termed the “core” subunits (Lieber 2010). In contrast to V(D)J recombination, which is completely abolished, CSR is decreased but not abolished by deficiency of a core NHEJ subunit (Yan et al. 2007), which led to the concept of Alt-EJ pathways. Alt-EJ is defined as end-joining in the absence of certain NHEJ core subunits and may represent many parallel pathways (Boboila, Jankovic, et al. 2010; Boboila, Yan, et al. 2010; Boboila, Oksenych, et al. 2012). Alt-EJ frequently uses microhomology at its ligation junctions, a feature that was also observed in yeast and termed microhomology-mediated end-joining (MMEJ). In CSR, Alt-EJ plays an important role, and several factors have been proposed to act in this pathway, including PARP, XRCC1/Lig1/3, FEN1 and polymerase theta (POLQ) (Audebert et al. 2004; Boboila, Oksenych, et al. 2012; Ceccaldi et al. 2015; Della-Maria et al. 2011; Frit et al. 2014; Han et al. 2012; Mateos-Gomez et al. 2015; Saribasak et al. 2011; Schreiber et al. 2002). It is of note that the end-joining factors work in an iterative and redundant manner (Lieber 2010). The end-joining defect caused by a single gene deficiency in CSR can therefore be compensated by redundant factors. For example, neither the absence of POLQ nor catalytically inactive POLQ reduces the CSR level (Li et al. 2011; Masuda et al. 2006; Zeng et al. 2004). Instead, POLQ deficiency leads to minor changes in CSR junctions (Yousefzadeh et al. 2014).
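Microhomology usage at a junction can be quantified with a short Python sketch. It assumes that the two reference sequences and the breakpoint coordinates are already known from junction mapping; the function name `microhomology_length` and its arguments are hypothetical, and real pipelines handle insertions, mismatches and ambiguous mapping that are ignored here.

```python
def microhomology_length(left_ref, left_break, right_ref, right_break):
    """Count junction-flanking bases shared by both references.

    The joined product is modelled as
        left_ref[:left_break] + right_ref[right_break:]
    Bases immediately upstream of both breakpoints that are identical, and
    bases immediately downstream of both breakpoints that are identical,
    could have come from either end and are counted as microhomology.
    """
    back = 0
    while (back < left_break and back < right_break
           and left_ref[left_break - 1 - back] == right_ref[right_break - 1 - back]):
        back += 1

    forward = 0
    while (left_break + forward < len(left_ref)
           and right_break + forward < len(right_ref)
           and left_ref[left_break + forward] == right_ref[right_break + forward]):
        forward += 1

    return back + forward


if __name__ == "__main__":
    # Both ends share "CC" immediately upstream of the breakpoint.
    print(microhomology_length("AAACCGGTTT", 5, "TTTCCATGCA", 5))
    # No shared bases at the junction: a direct join.
    print(microhomology_length("AAAAATTTTT", 5, "GGGGGCCCCC", 5))
```

Tabulating this value over many junctions gives the microhomology distribution; a shift toward longer microhomology is the kind of signature the text associates with increased Alt-EJ usage.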

Coordination of DNA repair and replication in CSR

Activated B cells undergo extensive proliferation in vivo and ex vivo, and defective cell proliferation results in decreased CSR in several of the abovementioned mouse models. In this context, the replicative helicase Mcm complex is required for optimal CSR (Wiedemann et al. 2016). AID initiates a variety of DNA damage in genomic DNA, including U’s, AP sites, SSBs and DSBs. How B cells tolerate all this damage during DNA replication has been a long-standing question in the field. Several lines of evidence suggest that activated B cells have decreased p53 levels through a BCL6-dependent mechanism and thereby tolerate the physiological DNA breaks (Phan and Dalla-Favera 2004). However, how damaged bases or AP sites are tolerated was not clear. In the course of studying the roles of Rev7 in CSR, we unexpectedly revealed a role for translesion synthesis (TLS) in this process (Yang et al. 2020).

Rev7 is an adaptor protein that plays multiple roles in different cellular processes through protein interactions, including DNA translesion synthesis (Lawrence, Das, et al. 1985; Nelson et al. 1996), APC/C inhibition (Chen and Fang 2001), spindle assembly (Bhat et al. 2015), and inhibition of DNA end resection (Boersma et al. 2015; Xu et al. 2015) as a subunit of Shieldin (Dev et al. 2018; Ghezraoui et al. 2018; Gupta et al. 2018; Mirman et al. 2018; Noordermeer et al. 2018). We found that Rev7-deficient activated B cells undergo cell death in an AID-UNG-dependent manner (Yang et al. 2020), suggesting that Rev7 assists DNA replication in the presence of AID-UNG-generated AP sites. We then tested a panel of DNA repair and replication factors involved in DSBR, end resection and TLS, and concluded that the cell death phenotype is caused by defective function of DNA polymerase zeta (PolZ). PolZ contains Rev7, Rev3l and other subunits, and is the major extender enzyme in TLS (Lange et al. 2016; Yang and Gao 2018). Combined with previous observations in a Rev3l-deficient mouse model (Schenten et al. 2009), our findings show that this TLS enzyme not only diversifies DNA sequence but also ensures B cell proliferation, coordinating the processes of DNA repair and replication.

In this context, CSR serves as a valuable assay to test the functions of many DNA repair pathways. First, the end-joining efficiency can be read out directly by flow cytometric analysis of surface-expressed IgG/IgM ratios, making CSR a robust reporter for physiological NHEJ. Second, end resection can be inferred from the location of switch junctions in the IgH locus by applying HTGTS technology. Along the same lines, the choice between c-NHEJ and Alt-EJ is reflected in the microhomology usage at switch junctions. Third, the mutation spectrum (e.g., C > T transitions, or C > A/G transversions) can be determined by sequencing the 5’S regions. As an exemplar, we characterized Rev7 with CSR assays (Yang et al. 2020) (Fig. 2). Consistent with other reports (Boersma et al. 2015; Dev et al. 2018; Ghezraoui et al. 2018; Gupta et al. 2018; Leland et al. 2018; Mirman et al. 2018; Noordermeer et al. 2018; Tomida et al. 2018; Xu et al. 2015), Rev7 deficiency dramatically increases end resection and impairs optimal CSR (Yang et al. 2020). Moreover, we found that Rev7 promotes C > G transversions in S regions (Yang et al. 2020), consistent with its role in TLS (Lawrence, Das, et al. 1985; Lawrence, Nisson, et al. 1985; Torpey et al. 1994; Zhao et al. 2006).
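The mutation spectrum readout mentioned above amounts to classifying each substitution as a transition or a transversion and tallying the result. The following Python sketch shows that bookkeeping, assuming substitutions have already been called as (reference base, mutant base) pairs; the function names `classify_substitution` and `mutation_spectrum` are hypothetical.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("reference and mutant base are identical")
    bases = {ref, alt}
    if not bases <= PURINES | PYRIMIDINES:
        raise ValueError("expected bases from A, C, G, T")
    same_class = bases <= PURINES or bases <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def mutation_spectrum(substitutions):
    """Tally substitutions as 'C>T'-style labels and by class."""
    by_change = Counter(f"{r.upper()}>{a.upper()}" for r, a in substitutions)
    by_class = Counter(classify_substitution(r, a) for r, a in substitutions)
    return by_change, by_class


if __name__ == "__main__":
    calls = [("C", "T"), ("C", "T"), ("C", "G"), ("C", "A")]
    by_change, by_class = mutation_spectrum(calls)
    print(dict(by_change))
    print(dict(by_class))
```

Comparing the fraction of C > G calls between genotypes is one simple way to express the kind of transversion difference described for Rev7 in the text.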

Fig. 2

Multiple roles of Rev7 in CSR. Rev7 is involved in DSB response, transversion mutation, and toleration of AID-initiated AP sites in CSR. See text for more details

Molecular basis of deletional CSR

CSR is described as a deletional recombination event, as only deletional end-joining yields a productive IgH locus, whereas an inversional event is non-productive. It was taken for granted that productive Ig expression selects for the deletional recombination events in activated B cells, until elegant studies using high-throughput technology revealed that CSR occurs intrinsically in a deletional manner (Dong et al. 2015; Panchakshari et al. 2018). The abovementioned HTGTS was adapted into CSR-HTGTS-seq using a primer annealing to the 5’Sµ region, which captures the endogenous S region junctions in CSR (Dong et al. 2015). The junction numbers in CSR-HTGTS-seq faithfully reflect the CSR level and can serve as a molecular assay to measure CSR (Dong et al. 2015). Strikingly, CSR-HTGTS-seq revealed that deletional recombination occurs nearly ten times more frequently than inversional recombination (Dong et al. 2015).
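The deletion to inversion comparison reduces to counting junctions by orientation. The Python sketch below shows the arithmetic under stated assumptions: each junction is represented only by the strand of its prey sequence, and the argument `deletional_strand` (which strand corresponds to a deletional join) is an assumption that depends on the bait primer and locus orientation in a given experiment. The function name `orientation_ratio` is hypothetical.

```python
from collections import Counter


def orientation_ratio(prey_strands, deletional_strand="+"):
    """Return (deletions, inversions, ratio) for a list of prey strands.

    prey_strands      : iterable of "+" or "-" for each mapped junction.
    deletional_strand : strand assumed to indicate a deletional join.
    The ratio is deletions / inversions, or infinity if no inversions.
    """
    counts = Counter(
        "deletion" if strand == deletional_strand else "inversion"
        for strand in prey_strands
    )
    deletions, inversions = counts["deletion"], counts["inversion"]
    ratio = float("inf") if inversions == 0 else deletions / inversions
    return deletions, inversions, ratio


if __name__ == "__main__":
    # Toy data: ten deletional junctions for every inversional one.
    print(orientation_ratio(["+"] * 10 + ["-"]))
    # Toy data: no orientation bias.
    print(orientation_ratio(["+"] * 5 + ["-"] * 5))
```

A ratio near 10 would correspond to the strong deletional bias reported for wild-type cells, while a ratio near 1 would indicate orientation-independent joining.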

In cis, the chromatin architecture offers a convenient route to deletional recombination. The IgH locus lies in its own topologically associating domain (TAD), within which the CTCF-binding elements (CBEs) are key cis-elements in regulating both V(D)J recombination and CSR (Guo et al. 2011; Lin et al. 2015; Zhang et al. 2021). Cohesin-mediated loop extrusion facilitates chromatin contacts and the formation of TADs (Vian et al. 2018), and assists both V(D)J recombination and CSR (Zhang et al. 2019a, 2019b). In activated B cells, the activation of IgHC promoters drives stepwise cohesin loading on the pre-assembled CSR centre in naïve B cells (Zhang et al. 2019a, ), providing a directional alignment of donor and acceptor S regions for deletional CSR in cis (Zhang et al. 2019a, ). In trans, deletional recombination relies on DSBR and end-joining factors (Dong et al. 2015; Panchakshari et al. 2018), as loss of 53BP1 or Lig4 partially abolishes the dominance of deletional recombination. However, in these cases the loss of directional bias is associated with excessive resection of DSB ends in 53BP1 deficiency (Bunting et al. 2010), large amounts of unrepaired broken ends in Lig4 deficiency (Canela et al. 2016), and a severe block of CSR in both genotypes (Bunting et al. 2010; Han and Yu 2008). The unbiased joining could therefore reflect broken ends that escape from the joining complex and are joined randomly in a manner resembling that of translocation (Dong et al. 2015) and/or in a diffusional manner at low levels (Zhang et al. 2019a, b).

To search for a factor that determines deletional CSR, we applied a combined approach of CRISPR screening and chemical treatment, which yielded a new end-joining factor, ERCC6L2 (Liu et al. 2020). ERCC6L2 belongs to the Snf2-like ERCC6 family, which also includes ERCC6 (CSB, an important factor in transcription-coupled nucleotide excision repair) and ERCC6L (PICH, involved in the spindle assembly checkpoint). Deleterious mutations of ERCC6L2 have been identified in patients with inherited bone marrow failure (BMF) (Bluteau et al. 2018; Jarviaho et al. 2018; Shabanova et al. 2018; Tummala et al. 2014; Zhang et al. 2016) and in a subtype of inherited acute myeloid leukemia (Douglas et al. 2019). We found that ERCC6L2 is quickly recruited to DNA damage sites and that ERCC6L2-deficient cells are sensitive to IR or IR-mimetic treatment (Liu et al. 2020). We also found that ERCC6L2 is required for optimal CSR, and for V(D)J recombination in an XLF-deficient background (Liu et al. 2020). Combining several lines of evidence, we concluded that ERCC6L2 is an authentic NHEJ factor (Liu et al. 2020). This conclusion is further supported by two other contemporaneous reports (Francica et al. 2020; Olivieri et al. 2020).

ERCC6L2 plays a vital role in NHEJ and, as we showed, determines the orientation-specific joining in CSR (Liu et al. 2020) (Fig. 3). Using CSR-HTGTS-seq, we found that in the absence of ERCC6L2 the deletional bias of CSR is completely abolished and S region ends join at a deletion:inversion ratio of 1:1. ERCC6L2 deficiency does not greatly affect end resection and decreases the CSR level only to half of the wild-type level, making the role of ERCC6L2 in deletional CSR distinct from that of 53BP1 or Lig4/XRCC4. Further investigation showed that the catalytic activity of ERCC6L2 is required for orientation-specific joining, leading to a model in which ERCC6L2 clears the DSB end for Lig4/XRCC4 sliding and/or affects long-range chromatin synapsis or DSB end tethering.

Fig. 3

ERCC6L2-dependent deletional CSR. A working model to explain the molecular basis for deletional CSR. See text for more details

Perspective

In antigen receptor diversification, the repair of programmed DNA lesions utilizes the general cellular DNA repair factors but also shows several unique features, including an error-prone manner and higher efficiency. We have used antibody CSR as an example to illustrate the introduction of base damage, the processing of base damage into DSBs, the ligation of DSBs, the coordination of DNA repair and replication, and the molecular basis for the unique deletional recombination. With the development of new technologies, such as CSR-HTGTS-seq, IgH deep sequencing and 3C-HTGTS-seq, it will be possible to discover and elaborate new features of immune diversity. CSR is an ideal assay to examine many aspects of DNA repair, including NHEJ, end resection, BER, MMR and TLS, and we believe that it will continue to aid genome instability studies in the future. Furthermore, AID/APOBEC deaminases have been combined with CRISPR/Cas9 into cytidine base editing (CBE) tools (Rees and Liu 2018). However, the repair of CBE-induced lesions is less explored. Whether pathways similar to those that repair AID lesions function downstream of CBE editing remains to be determined.