Introduction

The CRISPR/Cas9 adaptive immune system of bacteria and archaea has been engineered into a powerful tool for targeted genome editing and has shown tremendous potential for therapeutic translation. A custom-designed single guide RNA (sgRNA) enables CRISPR/Cas9 to target almost any locus and to modify the genome genetically and epigenetically (Cong et al., 2013; Jinek et al., 2014; Ran et al., 2013). In genome editing, the CRISPR/Cas9 system rewrites the genetic code in two ways. Traditionally, Cas9 induces a site-specific double strand break (DSB) in DNA with its HNH nuclease domain and RuvC-like nuclease domain. Repair of a Cas9-induced DSB through the error-prone non-homologous end joining (NHEJ) pathway generates random and unpredictable nucleotide deletions or insertions. In contrast, homology-directed repair (HDR) creates precise DNA fragment integration or sequence replacement in the presence of donor templates (Fig. 1) (Doudna & Charpentier, 2014; Gallagher & Haber, 2018). Alternatively, base editors in which a cytidine deaminase is tethered to a Cas9 nickase (nCas9) can achieve site-specific conversion of nucleotides by inducing a nick rather than a DSB (Fig. 1) (Gaudelli et al., 2017; Komor et al., 2016). Recently, by fusing a reverse transcriptase to nCas9 or a transposase to a deactivated Cas9 (dCas9), nCas9-based prime editors and dCas9-based prime editors have been respectively developed for precise insertion and deletion in genome editing (Anzalone et al., 2019). Genome editing tools that induce no DSBs are expected to cause fewer deleterious chromosomal rearrangements and hold great potential for safer clinical applications. In addition, dCas9 has been explored as a platform to recruit multiple effectors that not only control epigenetic information at relevant loci, but also shape the chromatin structure (Chen et al., 2013; Gilbert et al., 2013; Konermann et al., 2013; Qi et al., 2013; Wang et al., 2019). 
For example, dCas9 fused with DNMT3A and p300 can bind to a target to methylate DNA and acetylate nucleosomal histones, respectively (Hilton et al., 2015; Vojta et al., 2016). These epigenetic modifications can either suppress or activate expression of a target gene (Dominguez et al., 2016). dCas9 can act directly as a transcription blocker that dislodges RNA polymerase, or it can be fused with epigenetic modifiers to reshape the epigenetic landscape (Fig. 1) (Amabile et al., 2016; Perez-Pinera et al., 2013; Qi et al., 2013; Vojta et al., 2016). Changes in epigenetic information, including DNA methylation and histone post-translational modifications, are sufficient to induce transcriptional changes. In these nCas9- or dCas9-based applications, the stronger and longer the binding between Cas9 and its target DNA, the longer and more effectively the effectors fused to nCas9 and dCas9 may act.

Fig. 1

The mechanism of the CRISPR/Cas9 gene editing system. The CRISPR/Cas9 system rewrites genomic information in two ways, genetically and epigenetically. The Cas9 nuclease directly induces a DSB in DNA, and repair of the break by NHEJ or HDR, respectively, causes small insertion or deletion mutations or incorporates desired changes. A deaminase fused with Cas9 nickase conducts cytosine or adenosine deamination at target sites. The dead Cas9-based platform recruits a variety of effectors, including transcriptional activators and repressors, operating as an epigenetic writer without changing the gene coding sequence

Among several distantly related CRISPR/Cas9 systems, Streptococcus pyogenes Cas9 (SpCas9) is widely used for genome editing in mammalian cells. In vitro and in vivo assays have demonstrated that the target residence of SpCas9-sgRNA can last long, even after DNA cleavage (Ma et al., 2016; Richardson et al., 2016; Sternberg et al., 2014). The binding of SpCas9-sgRNA to its DNA target is mediated by several interactions, including the recognition of PAM sequence by SpCas9, the base pairing between sgRNA and target DNA and the non-specific contact between SpCas9 and DNA. Such interactions enable the SpCas9-sgRNA complex to recognize and bind its target site and determine the strength and duration of SpCas9-sgRNA target residence. These interactions are in theory dictated by nucleotide (nt) composition of target DNA and to some extent, neighboring DNA sequence in in vitro assays, but could be affected spatiotemporally by chromosome conformation, chromatin dynamics and local DNA metabolisms in cells. This may explain in part why the target binding and editing activity of SpCas9-sgRNA vary not only between DNA targets but also at a same target between different individual cells. Previous study indicated the single-turnover activity of SpCas9-sgRNA is challenged by local transcription which dislodges SpCas9-sgRNA from targets, allowing the reuse of SpCas9-sgRNA (Clarke et al., 2018). In our study, Cas9-sgRNA residing at cleaved DNA could be dislodged by local DNA replication to generate replication-coupled three-ended DSBs, not only suppressing DNA-PKcs-dependent NHEJ in favor of HDR, but also causing palindromic sister chromatid fusion (Manuscript in submission).

At some targets, SpCas9-sgRNA could be released quickly from its target sites upon DSB induction. Exposure of DNA ends initiates DNA damage response (DDR) followed by engagement of a DSB repair pathway. In mammalian cells, two major pathways, i.e., HDR and NHEJ, compete for DSB repair. NHEJ can be further categorized into two sub-pathways: classical NHEJ (c-NHEJ) that requires DNA-PKcs/Ku70/Ku80 for end recognition and DNA ligase 4/XRCC4/XLF for end ligation, and alternative end joining (alt-EJ) that operates without either one of classical NHEJ factors (Falck et al., 2005; Xie et al., 2007). In CRISPR/Cas9 genome editing, majority of repair products are created by c-NHEJ whereas alt-EJ makes minor contribution with larger deletions and increased use of microhomology at repair junctions (Gallagher & Haber, 2018; Sfeir & Symington, 2015). Nevertheless, the DSB repair pathway choice is determined by cell cycle stage, DNA end configurations, surrounding chromatin context and local DNA metabolism etc. However, at some other targets, the binding of SpCas9-sgRNA can persist, even after DNA cleavage, for several hours. Persistent target binding at cleaved DNA buries the DNA ends within the SpCas9-sgRNA complex, preventing initiation of the DDR. In fact, several studies have indicated that the long lifetime of the Cas9-sgRNA-DNA complex at some DNA targets poses physical barriers for cellular surveillance of DNA damage and access of repair machinery in CRISPR/Cas9 genome editing (Brinkman et al., 2018; Clarke et al., 2018). DSB repair requires exposure of DNA ends from the post-cleavage Cas9-sgRNA-DNA complex for recognition and binding of repair factors. Conformational change of the post-cleavage Cas9-sgRNA-DNA complex may release DSB ends, but local DNA metabolism and chromatin remodeling could also dislodge Cas9-sgRNA from cleaved DNA to reveal DSBs. Varying exposure processes add a layer of regulation into decision of a repair pathway choice. 
In this review, we focus on the property of Cas9 target residence and discuss its effect on DNA repair pathway choice in CRISPR/Cas9 genome editing.

Key interactions within the Cas9-sgRNA-DNA ternary complex

Although SpCas9 in the apo state can bind DNA, this binding exhibits no sequence specificity and apo-Cas9 has no apparent cleavage activity. To search for and bind to DNA targets, SpCas9 needs to form the effector complex by binding an sgRNA created by fusion of the CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA) (Cong et al., 2013; Mali et al., 2013). RNA-bound SpCas9 scans the entire genome for a protospacer adjacent motif (PAM) sequence, and the interaction between the PAM-interacting (PI) domain of SpCas9 and the PAM initiates SpCas9-mediated unwinding of PAM-proximal DNA for pairing of the target DNA strand with the 20-nt spacer sequence of the sgRNA (Anders et al., 2014; Sternberg et al., 2014). A stable R-loop is formed after annealing between the 20-nt spacer of the sgRNA and the target strand of the unwound DNA, allowing cleavage of the target and non-target strands by the HNH domain and the RuvC domain of SpCas9, respectively (Fig. 2). Therefore, the interaction between the sgRNA and the target DNA strand is one of the forces that determine the binding affinity of SpCas9-sgRNA for its targets. Additionally, the target DNA strand is trapped by the bilobed SpCas9 architecture and accommodated within a positively charged channel between the REC lobe and the NUC lobe (Anders et al., 2014; Nishimasu et al., 2014). The non-specific interaction within this channel is dynamic and could differ with different target strands or conformational changes, adding flexibility to the Cas9-RNA–DNA complex formed at different target sites. Weakening these interactions provides an opportunity to engineer Cas9 for broader PAM compatibility or enhanced specificity (Nishimasu et al., 2018; Slaymaker et al., 2016).

Fig. 2

The recognition and dissociation of Cas9-sgRNA from DNA target. Cas9 binds with sgRNA and undergoes conformational changes to interact with target DNA. The Cas9-sgRNA-DNA complex is stabilized by three types of intrinsic interactions. The arginine residues in PI domain of Cas9 directly interact with conserved GG dinucleotide in PAM sequence of the target DNA. The 20-nt spacer of sgRNA anneals to target DNA strand by Watson–Crick base pairing to form a R-loop. The electrostatic and hydrogen-bond interactions between Cas9 and the R-loop enable long time residence of Cas9-sgRNA on target DNA. While Cas9-sgRNA can be released from its target spontaneously due to conformational change in the Cas9-sgRNA-DNA complex, the target binding of Cas9 can also be disrupted by local DNA metabolism, including chromatin remodeling, transcription and DNA replication, leading to forced dissociation of Cas9-sgRNA from its target

Specific recognition of PAM sequence by the PI domain of SpCas9

Recent studies have proposed that SpCas9-sgRNA locates its target via three-dimensional diffusion in combination with lateral sliding (Globyte et al., 2019; Sternberg et al., 2014). This target search is mediated first by the interaction between the PI domain of SpCas9 and a 5’-NGG-3’ PAM (Ran et al., 2015). This interaction requires a structural rearrangement of SpCas9 that repositions the PI domain in the NUC lobe for PAM recognition. Two conserved arginine residues (R1333 and R1335) in the PI domain recognize the G-G dinucleotide on the non-target strand through major groove interaction (Anders et al., 2014; Nishimasu et al., 2014) (Fig. 2). Elucidation of the structural basis underlying PAM recognition could allow protein engineering of novel SpCas9 variants with less restrictive PAM requirements (Kleinstiver et al., 2015; Nishimasu et al., 2018). However, an expanded PAM spectrum may promote DNA cleavage at unwanted sites by SpCas9 variants, exacerbating the off-target effect. It is therefore desirable that structural modifications in SpCas9 variants expand their PAM compatibility without increasing off-target mutations.
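The sequence-level part of this PAM requirement can be illustrated with a minimal Python sketch that scans both strands of a DNA string for a 5’-NGG-3’ PAM preceded by a full-length protospacer. The function names (`reverse_complement`, `find_ngg_sites`) and the 20-nt default are illustrative assumptions, not part of any published tool, and the sketch ignores chromatin context, mismatches and every other factor discussed in this review.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_ngg_sites(seq: str, spacer_len: int = 20):
    """List candidate SpCas9 sites as (strand, protospacer, pam) tuples.

    A site is reported wherever a 5'-NGG-3' PAM is found with at least
    `spacer_len` nucleotides immediately 5' of it on the same strand.
    The protospacer is given 5'->3' as it reads on the PAM-containing
    (non-target) strand. Hypothetical helper for illustration only.
    """
    sites = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        # i is the index of the first PAM base (the "N" of NGG)
        for i in range(spacer_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":
                sites.append((strand, s[i - spacer_len:i], pam))
    return sites


if __name__ == "__main__":
    example = "A" * 20 + "TGG"
    for site in find_ngg_sites(example):
        print(site)
```

Because only the GG dinucleotide is checked, any base is accepted at the first PAM position, which mirrors the "N" in NGG. Engineered variants with altered PAM preferences would need a different matching rule.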

Base pairing of sgRNA with target DNA

PAM recognition by the PI domain of Cas9 is the first step of target location by SpCas9-sgRNA. Afterwards, the target DNA is unwound, allowing base pairing between the spacer of the sgRNA and the target DNA strand to form a triple-stranded R-loop. The 8–12 nucleotides adjacent to the PAM of the target DNA are referred to as the “seed region”, in which any single-base mismatch could dramatically decrease or even abolish the binding affinity, leading to destabilization of the R-loop (Boyle et al., 2017; Doench et al., 2016; Hsu et al., 2013). However, some studies have suggested that the “seed region” may tolerate mismatches, thus causing the widespread off-target effect of Cas9 (Boyle et al., 2017; Wang et al., 2015). In addition, PAM-proximal and PAM-distal nucleotides of a DNA target exert distinct effects: the PAM-proximal region determines the formation of the Cas9-sgRNA-DNA complex, whereas the PAM-distal region may govern dissociation of the complex (Boyle et al., 2017; Singh et al., 2016). The nucleotide composition of a DNA target, including its PAM-proximal and PAM-distal regions, is therefore a major factor that determines the strength of the interactions between the sgRNA and the target DNA. As this composition varies between target sites, the binding affinity between Cas9-sgRNA and the DNA target could change accordingly. Thus, it is possible that rational design of sgRNAs can improve the target specificity and editing efficiency of Cas9-sgRNA by optimizing the dynamics of the interaction between Cas9-sgRNA and the DNA target. In fact, truncating sgRNAs to 17–18 nt restrains off-target recognition without affecting on-target activity and is used for improved gene editing (Fu et al., 2014). However, further shortening of sgRNAs could decrease or eliminate the editing activity of Cas9-sgRNA.
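The distinction between PAM-proximal and PAM-distal mismatches can be made concrete with a short Python sketch that counts mismatches in each region separately. The function name `mismatch_profile` and the default seed length of 10 nt are assumptions chosen for illustration (the text above gives a range of 8–12 nt); the sketch only tallies positions and does not predict binding affinity or cleavage.

```python
def mismatch_profile(spacer: str, protospacer: str, seed_len: int = 10):
    """Count spacer/protospacer mismatches in the seed and PAM-distal regions.

    Both sequences are written 5'->3' with the PAM lying immediately 3'
    of the protospacer, so the last `seed_len` positions are PAM-proximal.
    `seed_len` is an illustrative assumption within the 8-12 nt range.
    """
    if len(spacer) != len(protospacer):
        raise ValueError("spacer and protospacer must have equal length")
    # Treat the RNA spacer as DNA letters so the two strings are comparable
    spacer = spacer.upper().replace("U", "T")
    protospacer = protospacer.upper()
    n = len(spacer)
    counts = {"seed": 0, "distal": 0}
    for i, (a, b) in enumerate(zip(spacer, protospacer)):
        if a != b:
            region = "seed" if i >= n - seed_len else "distal"
            counts[region] += 1
    return counts


if __name__ == "__main__":
    guide = "A" * 20
    off_target = "C" + "A" * 18 + "C"  # one distal and one seed mismatch
    print(mismatch_profile(guide, off_target))
```

The same function accepts truncated guides of 17–18 nt, since the seed is always counted from the PAM-proximal (3') end of whatever length is supplied.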

Non-specific interactions between Cas9 and target DNA

In the Cas9-sgRNA-DNA complex, the DNA target is embedded in the positively charged groove at the REC and NUC interface of Cas9 and surrounded by flexible loops enriched with hydrophobic and cationic residues (Nishimasu et al., 2014; Zhu et al., 2019) (Fig. 2). As the central channel of Cas9 is structured to accommodate DNA, this layout of the DNA substrate provides non-specific but more extensive contacts between Cas9 and DNA in addition to the interaction between Cas9 and the -NGG PAM. Disruption of these interactions not only reduces the target binding affinity of Cas9-sgRNA but also minimizes off-target interactions. However, SpCas9 variants can be generated to retain robust on-target activity but with diminished off-target effect by neutralizing positively charged residues of SpCas9 or disrupting hydrogen bond contacts in these non-specific interactions (Kleinstiver et al., 2016; Slaymaker et al., 2016). This suggests that the non-specific interactions between Cas9 and target DNA are excessive for the on-target activity of SpCas9-sgRNA, but not for off-target activity. The other mechanisms may be also involved as additional Cas9 variants can be generated with minimal off-target effect while retaining on-target activity (Casini et al., 2018; Hu et al., 2018; Lee et al., 2018). Beyond the 20-bp protospacer region, a weaker interaction occurs between Cas9 and nucleotides 14 bp downstream of the PAM and regulates target binding and target dissociation of Cas9 (Zhang, Wen, et al., 2019). Given the fact that apo-Cas9 can bind DNA weakly in vitro, it is possible that additional non-specific interactions may arise between Cas9 and non-target DNA that is spatially close to DNA target in the context of a highly folded genome. These additional interactions may differently influence the target binding, the editing activity and the target residence of SpCas9-sgRNA at different targets.

Cellular factors that challenge target residence of Cas9

In vitro assays indicate that post-cleavage residence of SpCas9-sgRNA at some targets can last for about 5.5 h (Richardson et al., 2016; Sternberg et al., 2014). This residency seems to be more complicated in mammalian genome, as the genome is organized into chromatin and dynamic with active DNA and chromatin metabolism (Friman et al., 2019; Zhang, Emerson, et al., 2019). In fact, while the interactions between DNA and Cas9-sgRNA dictate target binding and dissociation of SpCas9-sgRNA, the stability of the Cas9-sgRNA-DNA complex is actively altered by the forces behind the structural and dynamic changes of local chromatin, including spontaneous nucleosome sliding and breathing, catalyzed remodeling or DNA topology, as well as changes in chromatin structure during DNA transcription or replication. As a result, the target-binding dynamics of Cas9-sgRNAs oscillates constantly between targets or even at a same target, causing significant variations in outcomes of genome editing.

DNA torsion

Mechanical distortion of DNA by stretching and unwinding can influence both on-target and off-target activity of Cas9 including target recognition, target binding and target dissociation in vitro (Newton et al., 2019; Raz et al., 2016). In particular, after the formation of the Cas9-RNA–DNA complex, two nuclease domains of Cas9 respectively induce a break at two strands of DNA, releasing the tension conferred within torsional duplex strands. This tension may also be affected by DNA metabolism and chromatin remodeling adjacent to the target site. Long residency of Cas9-sgRNA at cleaved target DNA suggests strong and persistent post-cleavage interactions between Cas9-sgRNA and DNA. Structural studies have provided some molecular details of post-catalytic conformational rearrangements that delay the dissociation of Cas9-sgRNA from its target. During DNA cleavage, the HNH domain undergoes a conformational change to capture the target strand, while disordered REC2 recognition domain moves outward (Sternberg et al., 2015). After DNA cleavage, the HNH domain returns to a disordered conformation and the interaction between REC2 and cleaved target strand is reestablished (Zhu et al., 2019). During post-catalytic conformational change of the Cas9-RNA–DNA complex, additional interactions between REC3 and the RNA–DNA hybrid, as well as between RuvC and DNA, persist to stabilize the complex, allowing long residence time of Cas9-sgRNA at the cleaved DNA. Considering a more dynamic environment in cells than in structural analysis in vitro, it is expected that the post-catalytic conformation change of the Cas9-RNA–DNA complex could also be altered by spontaneous elements present in cellular activities such as local DNA metabolism and chromatin remodeling. Part of these effects is exerted by mechanical torsions of DNA generated during these cellular activities.

Local chromatin barriers

In mammalian cells, genomic DNA wraps around histone octamers to form nucleosomes that are further organized into chromatin, posing a barrier to target search and target binding of Cas9-sgRNA in the genome (Kallimasioti-Pazi et al., 2018; Knight et al., 2015). For Cas9-sgRNA to locate and bind its target DNA sequence, nucleosomal DNA must be unpeeled from histone octamers by spontaneous nucleosome breathing or catalytic chromatin remodeling (Isaac et al., 2016). It has been indicated that the target binding of Cas9-sgRNA relaxes the condensed chromatin structure, although it remains unclear whether this relaxation requires the helicase activity of Cas9 (Chen et al., 2017). When Cas9-sgRNA binds to the target DNA and forms a stable complex in the context of chromatin, nucleosome sliding or rearrangement still occurs in adjacent chromatin, which would in turn affect the binding and release of Cas9-sgRNA at the pre-cleavage, cleavage or post-cleavage stage. In fact, the histone chaperone FACT was recently found to remove Cas9 from nucleosomal DNA, and this process could be facilitated by other histone chaperones or chromatin remodelers (Wang et al., 2020) (Fig. 2). As a result, the chromatin context of a target should be considered, where possible, in predicting the efficiency and specificity of CRISPR/Cas9 genome editing.

Transcription

Genome-wide analysis shows an enrichment of Cas9-sgRNA off-target sites in open chromatin regions (Kuscu et al., 2014; Nielsen et al., 2014). As transcription is highly active in these regions, the probability is increased for Cas9-sgRNA residing at the cleaved DNA to collide with translocating RNA polymerase II. It has been shown that this collision could displace Cas9-sgRNA from its target if the target strand of Cas9-sgRNA is also the template strand of transcription (Clarke et al., 2018) (Fig. 2). After the translocating RNA polymerase dislodges Cas9-sgRNA from the cleaved target, the RNA polymerase may run off one end of the DSB, leaving that DNA end with a premature mRNA-DNA heteroduplex and a displaced single-strand DNA while the other end remains double-stranded. This DNA end configuration may help decide a particular DSB repair pathway choice by engaging a different set of repair factors and activating a different DDR. Although it is unclear whether the premature mRNA in the RNA–DNA hybrid can assist DSB repair, several studies have indicated that RNA molecules can bridge DNA ends and serve as a repair template or provide a platform for recruitment of repair factors, such as ERCC6, BRCA1 and Rad52, to facilitate DSB repair by HDR and NHEJ (Ouyang et al., 2017; Puget et al., 2019). On the other hand, if the target strand of Cas9-sgRNA is the non-template strand of transcription, it is the translocating RNA polymerase II that would be dislodged from the template strand by target-bound Cas9-sgRNA, disrupting transcription but possibly having little direct effect on repair of Cas9-induced DSBs (Clarke et al., 2018).
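The strand dependence described above reduces to a simple rule, sketched below in Python. The function name `collision_outcome`, the "+"/"-" strand labels and the returned strings are hypothetical conveniences; the rule itself restates the qualitative observation attributed to Clarke et al. (2018) in the text and says nothing about how often such collisions occur.

```python
def collision_outcome(cas9_target_strand: str, template_strand: str) -> str:
    """Predict which complex is displaced when RNA polymerase meets Cas9.

    cas9_target_strand: strand paired with the sgRNA spacer ("+" or "-").
    template_strand: strand read by RNA polymerase II ("+" or "-").
    Illustrative restatement of the strand rule in the text, not a model.
    """
    for strand in (cas9_target_strand, template_strand):
        if strand not in ("+", "-"):
            raise ValueError("strand must be '+' or '-'")
    if cas9_target_strand == template_strand:
        # sgRNA is annealed to the template strand: polymerase displaces Cas9
        return "Cas9-sgRNA displaced"
    # sgRNA is annealed to the non-template strand: Cas9 blocks the polymerase
    return "RNA polymerase dislodged"


if __name__ == "__main__":
    print(collision_outcome("+", "+"))
    print(collision_outcome("+", "-"))
```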

DNA replication

Cas9-sgRNA persistently binds to DNA at some target sites for an extended time, thus creating a time window for collision with a DNA replication fork and target dissociation of Cas9-sgRNA by DNA replication. In Escherichia coli, Cas9-sgRNA is dissociated from target DNA when DNA replication occurs, indicating that DNA replication forks could dislodge Cas9-sgRNA from its target and shorten its target residence duration (Jones et al., 2017). In eukaryotic DNA replication, the coordination between helicases and DNA polymerases at leading strands would increase the rate of DNA unwinding and provide extremely powerful mechanical force in both directions to remove Cas9-sgRNA from its target (Fig. 2). In vitro assays have shown that DNA polymerase Phi29 can dislodge DNA-bound dCas9 from the upstream side of PAM whereas BLM helicase-mediated collision from post-PAM direction enables the displacement of dCas9 from DNA (Zhang, Wen, et al., 2019). After DNA cleavage by Cas9-sgRNA, the DSB remains buried in the Cas9-sgRNA complex. This post-cleavage target residence can also last long at many target sites and provide an opportunity for collision with a DNA replication fork, thus resulting in collapse of the replication fork and generating particular DSB ends with a blunt end on the leading strand and a 3’-overhanging end on the lagging strand (Fig. 3). Such collapsed fork can recover via DSB repair, ensuring complete DNA synthesis before entering mitosis. In yeast, this DSB repair is mediated mainly by a process called break-induced replication (BIR), which requires the PIF1 DNA helicase and the PolD3 DNA polymerase (Deem et al., 2011; Saini et al., 2013; Wilson et al., 2013). PolD3 can tolerate errors during DNA synthesis, thus adding more mutations to the genome (Deem et al., 2011). In mammalian cells, BIR and BIR-like mechanisms may also operate in recovery of collapsed DNA replication forks; however, it is unclear how conserved these pathways are from yeast to mammals.

Fig. 3

The formation and repair of Cas9-induced three-ended DSBs. DNA replication dislodges Cas9-sgRNA from its target and generates a DSB with three ends in which the two replication-associated DSB ends may be unfavorable for binding of c-NHEJ factors. The end at each side of the break can be ligated by NHEJ with a large deletion at repair junction (i). Two replication-associated DSB ends have potential to directly ligate with each other, forming palindromic sister chromatid fusion (ii). The three DSB ends could also join together to form a radial chromosome (iii)

Target residence of Cas9 in DSB repair pathway choices

Upon DSB induction, the “naked” DNA ends of the break site form the “DNA domain” that Ku70/Ku80/DNA-PKcs, MRN/ATM and ATRIP/ATR compete to recognize and bind (Feng et al., 2016; Scully & Xie, 2013). Coordinating with the DNA ends and cell cycle stages, these three phosphatidylinositol 3-kinase-related protein kinases (PI3KKs) are primary effectors in shaping a DSB repair pathway choice. All these kinases are also able to phosphorylate the histone H2A variant H2A.X, inducing a “chromatin domain” response that extends over kilo- or megabases away from the break site (Feng et al., 2016; Scully & Xie, 2013). The range of this chromatin response is partly mediated by cohesin-dependent loop extrusion, adding additional spatial control over association of repair factors on this chromatin domain and eventually a choice of DSB repair pathways (Arnould et al., 2021). Among these repair factors recruited, BRCA1 and 53BP1 antagonize in DNA end resection and Rad51 loading and play an important function in DSB repair pathway choices between HDR and NHEJ (Bunting et al., 2010; Panier & Boulton, 2014). This two-layer regulation (i.e., the DNA domain and chromatin domain) of DSB repair pathway choices is generally applicable to DSBs induced by IR or chemical exposure as well as Cas9-sgRNA followed by its spontaneous dissociation from cleaved DNA. However, as post-cleavage target residence duration of Cas9-sgRNA varies between targets including on-target sites and off-target sites in a same cell or even at a same target between single cells, the decision of DSB repair pathway choices can be influenced by the mechanisms underlying exposure of Cas9-induced DSBs and unique end configurations of exposed DSBs.

On-target repair pathway choices

The post-catalytic conformational change of Cas9-sgRNA has been revealed; however, it remains unclear how Cas9-sgRNA is spontaneously released from cleaved DNA. In vitro assays indicated that Cas9-induced DNA lesions are initially buried within the post-catalytic Cas9-sgRNA complex and this concealment can last for a period of time at some test sites. As it takes more than 15 h to complete repair of Cas9-induced DNA lesions in mammalian cells, it is believed that post-cleavage target residence of Cas9-sgRNA delays the response and repair to such DNA lesions (Brinkman et al., 2018; Kim et al., 2014). While the post-catalytic conformational change of Cas9-sgRNA serves as a primary inducer for the dissociation of Cas9-sgRNA from cleaved DNA, it is reasonable to speculate that this conformational change may be partly controlled by the aforementioned specific and non-specific interactions between Cas9-sgRNA and target DNA. The dissociation may also be facilitated by external factors such as mechanical forces from local DNA torsion, chromatin remodeling activity, transcription and DNA replication. We therefore divide the target dissociation of Cas9-sgRNA into two forms: spontaneous dissociation and forced dissociation. The former is induced solely by intrinsic conformational change of the Cas9-RNA–DNA complex after DNA cleavage whereas the latter is additionally controlled by external forces. Nevertheless, the factors other than transcription and DNA replication do not affect end configuration of Cas9-induced two-ended DSBs, but may change the timing of DSB exposure, allowing the cell cycle-based control of DSB repair pathway choices.

In G1 phase, blunt DNA ends of Cas9-induced DSBs that are exposed spontaneously or by forces other than transcription and DNA replication are largely repaired by c-NHEJ, which is intrinsically accurate for ends that are readily ligatable (Guo et al., 2018). Repair by alt-EJ can also occur for some Cas9-induced DSBs and tends to generate repair products with insertions and deletions (indels) and increased use of microhomology at junctions (Biehs et al., 2017). In S/G2 phase, where sister chromatids are available, repair of many Cas9-induced DSBs is mediated by HDR using sister chromatids as the homologous template (Biehs et al., 2017; Symington, 2016). In addition, due to active end resection during DNA replication, indels could be frequently introduced into repair products even upon c-NHEJ-mediated repair of Cas9-induced DSBs in this stage.

However, persistent post-cleavage residence of Cas9-sgRNA at its target poses a serious obstacle to detection of Cas9-induced DNA lesions and delays DSB repair. It also increases the probability of a collision between target-bound Cas9-sgRNA and the transcription or DNA replication machinery. The collision of translocating RNA polymerases with Cas9-sgRNA on the target strand not only exposes Cas9-induced DSBs by dislodging Cas9-sgRNA from cleaved DNA, but may also generate a DNA end containing a premature mRNA-DNA heteroduplex and a displaced single-strand DNA. This end configuration may favor one particular DSB repair pathway over the others and involve RNA in this pathway. On the other hand, the collision of a DNA replication fork with Cas9-sgRNA exposes Cas9-induced DSBs in S phase and limits repair to this phase, allowing cell cycle-based choices between different repair pathways in repair of Cas9-induced DSBs. Indeed, our study has shown that the stronger the on-target binding of Cas9-sgRNA is, the greater the chance Cas9-sgRNA has of encountering DNA replication, thus biasing DSB repair from c-NHEJ to HDR. It is possible that Cas9-sgRNA is released spontaneously from a cleaved target in one cell but is released from the same cleaved target by transcription or DNA replication in another cell. Thus, the choice of DSB repair pathways and its regulation differ even at the same target. In combination with variable repair products within a particular repair pathway, the varying choice of DSB repair pathways could cause significant heterogeneity even in on-target edit products.

Unique repair pathway choice for Cas9-induced three-ended DSBs

Due to persistent post-cleavage residence of Cas9-sgRNA at its target, DNA replication forks encounter target-bound Cas9-sgRNA more frequently. The CDC45-MCM-GINS (CMG) complex travels along the leading strand to unwind DNA and collides with Cas9-sgRNA at cleaved DNA. As a result, Cas9-sgRNA is dislodged and the CMG complex along with DNA polymerases may run off the DNA breakage point, creating a DSB with three ends (Vrtis et al., 2021). The first is a blunt end arising from leading strand DNA synthesis on one sister chromatid. Owing to discontinuous synthesis on the lagging strand, the second DNA end is located on the other sister chromatid and carries a long 3′ ssDNA, the length of which depends on the position of the very last Okazaki fragment. This staggered end with a long 3’-overhang may not engage c-NHEJ factors such as Ku70/Ku80 properly, but may favor Rad51 assembly for HDR. The third end is the blunt end on the un-replicated DNA, away from the colliding replication fork. The two structurally distinct ends created on the sister chromatids, along with the third end, form a three-ended DSB configuration at the cleaved target (Fig. 3). It is yet to be determined how such a three-ended DSB is repaired. It is possible that the two blunt ends would be preferentially repaired by c-NHEJ, limiting nucleotide loss at the repair junction. The staggered end with a long 3’-overhang is a poor substrate for c-NHEJ, but could be ligated with either of the blunt ends by microhomology-mediated end joining (MMEJ), generating deletions at the repair junction (Fig. 3). In either case, one end is left unrepaired and could be lost during mitosis, giving rise to a chromosome that loses part of a chromosomal arm or even causing loss of the entire chromosome in one of the daughter cells. The unrepaired end could also be ligated to an end of an off-target DNA break or invade another chromosome via homologous pairing, causing translocations. 
Occasionally, repair of three-ended DSBs could form a radial chromosome by direct fusion in a three-armed manner (Fig. 3).

Our study has also indicated the possibility that two ends from sister chromatids in this Cas9-induced three-ended DSB are ligated to generate a giant palindromic chromosome (Fig. 3). It is known that chromosome fusion can occur between the ends of replicated sister chromatids with shortened telomeres, forming either a ring-shaped chromosome or a telomere fusion of two sister chromatids (Kagaya et al., 2020; Liddiard et al., 2016). The end-to-end fusion of two sister chromatids creates a stable dicentric chromosome with two microtubule attachment sites. In anaphase, the dicentric chromosome would be pulled in opposite directions towards the two daughter cells, between which a long chromosomal bridge is established. The chromosomal bridge contains ssDNA that could be attacked by APOBEC enzymes to introduce clusters of base mutations, or could be catastrophically shattered (Umbreit et al., 2020). This type of chromosomal shattering is termed “chromothripsis” and can lead to the characteristic reassembly of chromosomal fragments frequently observed in human cancers (Maciejowski et al., 2015; Tanaka & Yao, 2009). The three-ended DSB configuration induced by on-target collision between DNA replication forks and target-bound Cas9-sgRNA may help explain the loss of chromosomal arms or entire chromosomes, a phenomenon observed in CRISPR/Cas9 genome editing of human embryos (Zuccaro et al., 2020). The genome instability associated with this form of Cas9-induced DSBs also provides a potential mechanism underlying carcinogenic risks in CRISPR-mediated gene therapy.

Off-target repair pathway choices

CRISPR/Cas9 genome editing utilizes both the NGG PAM sequence and the 20-nt spacer of the sgRNA to recognize and pair with a DNA target. The PAM is widely distributed in the genome, and Cas9-sgRNA can, to some extent, tolerate mismatches between the sgRNA spacer and an off-target sequence while retaining target binding and catalytic activity. Thus, CRISPR/Cas9 can bind to numerous off-target sites, cleave DNA and cause mutagenesis at these sites. This off-target effect is a serious problem in CRISPR/Cas9 genome editing. However, the mismatches of the sgRNA with off-target sequences weaken the binding affinity of Cas9-sgRNA at off-target sites and reduce the residence of Cas9-sgRNA at these sites (Kim et al., 2019). Cas9-sgRNA at off-target sites is therefore more likely to dissociate spontaneously from cleaved DNA. In addition, due to weaker binding, Cas9-sgRNA at off-target sites can be more easily dislodged by DNA torsion, transcription or chromatin remodeling. These activities occur frequently at every stage of the cell cycle, lowering the probability of forced dissociation by DNA replication, which occurs only once per cell cycle. Compared to on-target sites, c-NHEJ is more likely to be engaged for repair of Cas9-induced DSBs at off-target sites, generating accurate repair products. The re-cleavage activity of Cas9-sgRNA at off-target sites is also reduced, owing to weaker binding and frequent pre-catalytic dissociation. The combination of preferential c-NHEJ engagement and reduced re-cleavage could account for fewer mutagenic events at off-target sites. While c-NHEJ inactivation is widely used to enhance HDR in CRISPR/Cas9 genome editing, we speculate that this strategy would shift the bias of DSB repair pathway choice toward alt-EJ and induce more frequent mutations at off-target sites, thus exacerbating the off-target effect. Such a deleterious effect at off-target sites is often ignored (Canny et al., 2018; Chu et al., 2015; Maruyama et al., 2015; Yeh et al., 2019).
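
As a purely illustrative aid, the recognition rule described above (a 20-nt spacer match followed by an NGG PAM, with some mismatches tolerated) can be sketched in a few lines of Python. The function name `find_candidate_sites`, the `max_mismatches` threshold and the forward-strand-only scan are assumptions made for this sketch; it simply counts mismatches and does not model binding affinity, residence time or cleavage activity.

```python
def find_candidate_sites(genome, spacer, max_mismatches=3):
    """Scan the forward strand for protospacers followed by an NGG PAM.

    Hypothetical helper for illustration only: a site is reported when the
    3 nt after the protospacer read N-G-G and the protospacer differs from
    the spacer at no more than `max_mismatches` positions.
    Returns a list of (position, protospacer, mismatch_count) tuples.
    """
    genome = genome.upper()
    spacer = spacer.upper()
    n = len(spacer)
    hits = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for i in range(len(genome) - n - 2):
        pam = genome[i + n : i + n + 3]
        if pam[1:] != "GG":  # the first PAM base (N) may be any nucleotide
            continue
        protospacer = genome[i : i + n]
        mismatches = sum(a != b for a, b in zip(spacer, protospacer))
        if mismatches <= max_mismatches:
            hits.append((i, protospacer, mismatches))
    return hits
```

Calling the function with `max_mismatches=0` returns only perfectly matched sites, whereas raising the threshold adds mismatched sites, which correspond to the off-target candidates discussed in this section.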

Perspective on target residence of Cas9

Encounters between resident Cas9-sgRNA and local DNA metabolism contribute specifically to the varying editing spectrum by regulating local repair pathway choices and generating aberrant repair outcomes. Evaluating the target residence of Cas9 at specific sites is a prerequisite for translating Cas9 into therapeutic use, in order to avoid the carcinogenic potential of CRISPR/Cas9 genome editing. A Cas9 variant that can spontaneously dissociate from the DNA target, but retains effective cutting activity, would reduce the mutational heterogeneity in genome editing. As dislodging of Cas9 by cellular factors cannot be completely avoided, we propose that strategies that do not generate DNA breaks may avoid such catastrophic mutational processes. In this circumstance, engineering or evolving a Cas9 variant with enhanced binding affinity, or with a long residence duration on DNA, is an alternative way to expand dCas9-based applications. Overall, developing a toolbox of Cas9 variants with different binding affinities, as desired, is an urgent need for future research.

Contribution to the mutational heterogeneity in CRISPR/Cas9 genome editing

In CRISPR/Cas9 genome editing, the nucleotide composition of a target sequence is a key determinant of the binding affinity between Cas9-sgRNA and the target, but the duration of Cas9-sgRNA target residence adds a layer of regulation over the DSB repair pathway choice in repair of Cas9-induced DSBs. Spontaneous dissociation of Cas9-sgRNA from cleaved DNA exposes two clean ends of Cas9-induced DSBs that preferentially engage c-NHEJ. Persistent target residence of Cas9-sgRNA after DNA cleavage increases the probability of its forced dissociation from the target by chromatin remodeling, transcription and DNA replication. While chromatin remodeling creates similar clean ends, transcription and DNA replication generate unique end configurations in Cas9-induced DSBs. Each form of forced dissociation integrates repair of Cas9-induced DSBs into its own spatiotemporal context of control. In particular, replication-mediated release of Cas9-sgRNA from cleaved DNA not only exposes DSBs specifically in S phase but also generates three-ended DSBs in which the staggered end with a long 3′ ssDNA favors HDR and alt-EJ over c-NHEJ. The availability of three ends in these DSBs could also induce many types of deleterious repair outcomes such as translocation to distant ends, chromosomal loss and palindromic fusion of sister chromatids. As a key step in the breakage-fusion-bridge cycle, sister chromatid fusion could consequently lead to chromothripsis, causing more complex chromosomal rearrangements including translocations, inversions and loss of a chromosome.

Like any other type of DSB, Cas9-induced DSBs are repaired by a pathway chosen according to a number of factors. These factors include the end configuration of DSBs, the nucleotide composition of end sequences, the availability of repair factors and the cell cycle stage, among others. Repair products are thus highly heterogeneous due to the combined but varying action of these factors. Here, we propose that post-cleavage target residence of Cas9-sgRNA is a unique determinant in the DSB repair pathway choice for repair of Cas9-induced DSBs. Variations in this regulation between different sites, including on-target and off-target sites in the same cell, or even at the same site between single cells, could greatly elevate the heterogeneity in the mutation profile of CRISPR/Cas9 genome editing. Therefore, the post-cleavage target residence of Cas9-sgRNA at specific sites should be taken into account in evaluating the efficiency and safety of Cas9-sgRNA in clinical applications. In particular, as forced dissociation of Cas9-sgRNA from cleaved DNA by DNA replication tends to cause more significant chromosomal aberrations than the other forms of dissociation, strategies that reduce the probability of collision between Cas9-sgRNA and DNA replication forks, or that do not generate DNA breaks, may help avoid such mutational processes and benefit applications of CRISPR/Cas9 genome editing (Artegiani et al., 2020; Suzuki et al., 2016). A better understanding of the mechanisms underlying repair of Cas9-induced three-ended DSBs can also provide insight into developing a strategy that prevents undesired repair outcomes.

Requirement in dCas9- or nCas9-based platforms

Beyond DSB-based genome editing, the CRISPR/Cas9 system has also been repurposed as a flexible DNA recognition platform that recruits diverse effectors to achieve transcription regulation, epigenetic modification, chromosome imaging, base editing and prime editing (Chen et al., 2013; Gilbert et al., 2013; Komor et al., 2016; Liu et al., 2017). The catalytically dead Cas9, termed dCas9, which is able to bind a DNA target without cleavage, can be used as a transcriptional roadblock (Qi et al., 2013). By fusion with appropriate effectors, dCas9-based platforms have also been developed for gene expression regulation, live cell imaging of genomic loci and base editing (Chen et al., 2013; Komor et al., 2016; Qi et al., 2013). Similarly, the Cas9 D10A nickase has been used in place of dCas9 to improve base editors (Zhang et al., 2020). Recently, using a reverse transcriptase and an engineered sgRNA containing a template for editing, Cas9 D10A-based prime editors have been established (Anzalone et al., 2019). These applications are presumably enabled by strong interactions between dCas9-sgRNA or Cas9 D10A-sgRNA and their DNA targets, and thereby by the long residence time of dCas9-sgRNA or Cas9 D10A-sgRNA at these targets. Therefore, extending the target residence of dCas9-sgRNA or Cas9 D10A-sgRNA could improve the efficiency of dCas9- or Cas9 D10A-based platforms and promote their applications. Consistent with this idea, the ssDNA-binding domain of Rad51 has been fused to various base editors to improve their binding affinity to target DNA and increase base editing efficiency (Zhang et al., 2020). Enhancing the binding affinity of dCas9 or Cas9 D10A to its target might also lengthen its target residence, providing an alternative means to improve dCas9- and Cas9 D10A-based platforms. Considering post-cleavage target residence of Cas9-sgRNA together with DNA replication, we reason that nicks induced by the Cas9 nickase could have a higher probability of colliding with DNA replication forks and becoming one-ended DSBs.

dCas9-based local inhibition of c-NHEJ

Inactivation of c-NHEJ by either chemical inhibitors or genetic ablation promotes HDR and is widely used in HDR-mediated knock-in of a desired DNA fragment or gene in CRISPR/Cas9 genome editing (Chu et al., 2015; Maruyama et al., 2015). However, because Cas9-induced DSBs at off-target sites are most likely repaired accurately by c-NHEJ, c-NHEJ inactivation promotes error-prone alt-EJ globally, and particularly at off-target sites, thus exacerbating the off-target effect. Therefore, local suppression of c-NHEJ could be a better strategy to stimulate site-specific HDR in CRISPR/Cas9 genome editing without exacerbating the off-target effect. Persistent target residence of dCas9 maintains long-lasting activity of dCas9-based tools at their targets, allowing efficient genetic and epigenetic modifications in a site-specific manner. Similarly, if target-bound dCas9 near a DSB can block recruitment of Ku70/Ku80 to the ends of the DSB, dCas9-sgRNA could be tethered to a site adjacent to the DSB to inhibit c-NHEJ locally. This dCas9-based local inhibitor approach can indeed improve HDR-mediated gene targeting in CRISPR genome editing without exacerbating the off-target effect (manuscript in preparation).