Key Points

Progress in in vivo targeted gene insertion for treating inherited diseases, primarily based on clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein-9 nuclease (Cas9), has been demonstrated in preclinical hemophilia studies.

Technical advancements in targeted gene insertion include strategies to enhance desired DNA repair pathways, developing universal target sites/locations and expression strategies, and promoting the survival of edited cells. Outcomes highlight efficiency, safety, and areas for improvement.

We explore challenges in current technologies and the potential impact of novel advances, such as lipid nanoparticle delivery and ex vivo gene editing, on technology advancement and clinical trial drug development.

1 Introduction

Genome editing technology has revolutionized biomedical research and promoted the development of novel biomedicines [1]. Engineered nucleases can recognize a preselected sequence in a genome and generate double-strand breaks (DSBs) with high specificity in various cell and organism models [2]. Zinc-finger nucleases (ZFNs) [3] and transcription activator-like effector nucleases (TALENs) [4] were early established tools that rely on reprogrammed DNA-binding protein motifs to target a specific sequence. The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein-9 nuclease (Cas9) system has emerged later as a complex of a non-specific Cas9 nuclease and site-specific single guide RNA(s) (sgRNA) [5]. Complementary base-pairing between sgRNA and its target DNA sequence enables the Cas9/sgRNA complex to recognize and cleave the DNA at a preselected site [6]. The CRISPR/Cas9 system has surpassed ZFNs and TALENs owing to its remarkably enhanced specificity and flexibility in reprogramming sgRNA, becoming the most popular tool for genome manipulation [1].

The genomic DSBs introduced by engineered nucleases trigger immediate cellular responses and evoke DNA repair via multiple intrinsic pathways. Non-homologous end joining (NHEJ) [7, 8], microhomology-mediated end joining (MMEJ) [9], and single-strand annealing (SSA) pathways [10] mediate error-prone repair processes, which predispose to indels at the DSB sites, whereas homology-directed repair (HDR) enables accurate repair of a DNA lesion based on homologous sequences from a sister chromatid or an exogenous template [11]. By selectively favoring a particular DNA repair pathway and the corresponding repair outcomes, different gene editing strategies can achieve different forms of genome manipulation [12]. Gene/enhancer disruption, intron/exon deletion, mutation correction, and targeted gene insertion have all been reported to produce therapeutic benefits in preclinical models [13]. Among these advances, gene insertion entails the integration of a large sequence into the host genome to restore normal function. This versatile strategy produces curative benefits regardless of the type, quantity, and location of disease-causing mutations [14]. However, the requirement of a donor DNA for targeted gene insertion poses significant technical hurdles, hampering its development for therapeutic application.

Therapeutic gene editing through in vivo delivery of engineered nucleases has achieved great success using recombinant adeno-associated virus (AAV) [15], a vector system that has demonstrated excellent safety profiles in both preclinical and clinical studies [16, 17] and enabled the approval of eight gene therapy drugs by regulatory authorities (Table 1). AAV-delivered ZFNs were the first targeted gene insertion tools tested in humans (Table 2). Clinical trials for SB-913, SB-318, and SB-FIX were conducted in 2017–2019, inserting a healthy copy of the relevant gene into the albumin (ALB) locus to treat mucopolysaccharidosis (MPS) type I [18], MPS type II [19] and hemophilia B [20], respectively. Although the safety of AAV-ZFNs in humans was largely confirmed, the trials failed to demonstrate the desired therapeutic effectiveness [21]. Further investigations suggested that gene insertion therapy may require much higher AAV doses, which are prohibitively expensive and associated with high risks of toxicity.

Table 1 Approved AAV drugs
Table 2 Clinical trials for gene editing therapies

Subsequent clinical studies for CRISPR/Cas9-based therapeutic gene editing were carried out with extra caution (Table 2). Exagamglogene autotemcel (Exa-cel) [CTX001] was designed for ex vivo genome editing in hematopoietic stem and progenitor cells (HSPCs) to alleviate sickle cell disease (SCD) and transfusion-dependent β-thalassemia (TDT) by disrupting the BCL11A enhancer [22] (Fig. 1). In vivo treatment using AAV-delivered CRISPR/Cas9 was initiated with a retinal injection of EDIT-101 for treating the hereditary blindness Leber congenital amaurosis type 10 (LCA10). It co-delivers two sgRNAs with Cas9 to delete the intronic region containing the IVS26 mutation, and thereby rescues CEP290 function in photoreceptor cells [23]. Exa-cel, showcasing robust efficacy and safety in phase III clinical trials, was submitted to the European Medicines Agency (EMA) and United States (US) Food and Drug Administration (FDA) for regulatory approval in January 2023 [24]. It received marketing authorization in the UK on 16 November 2023, and subsequently gained approval from the FDA in the US on 8 December 2023 [25]. Meanwhile, the phase I/II clinical trial of EDIT-101 has revealed clinically significant improvement of visual acuity in 3/14 participants [26] (Fig. 1). These achievements have spurred interest in CRISPR-mediated targeted gene insertion for disease therapy, which has not yet been tested in humans.

Fig. 1

Overview of CRISPR drugs tested in humans. Left: CTX001 is an ex vivo gene editing therapy in HSPCs for the treatment of SCD and TDT by disrupting the BCL11A enhancer. Right upper: EDIT-101 is an in vivo treatment for LCA10 that deletes the CEP290 intronic region containing the IVS26 mutation; it is delivered by AAV5 and administered through retinal injection. Right lower: NTLA-2001 and NTLA-2002 are both liver-targeted in vivo editing therapies systemically delivered by LNP. NTLA-2001 is intended for the treatment of ATTR by disrupting the TTR gene, while NTLA-2002 targets HAE by disrupting the KLKB1 gene. The status and year of clinical trials are shown. Lower: Schematics of the gene editing strategies applied. Gene/enhancer disruption via NHEJ introduces indels at the target site (blue). Exon/intron deletion employs two sgRNAs to delete the disease-causing mutation (reddish). The label color of each CRISPR drug indicates the type of gene editing strategy applied. (Created with BioRender.com.) SCD sickle cell disease, TDT transfusion-dependent β-thalassemia, LCA10 Leber congenital amaurosis type 10, ATTR transthyretin amyloidosis, HAE hereditary angioedema, AAV adeno-associated virus, LNP lipid nanoparticle, CRISPR/Cas9 clustered regularly interspaced short palindromic repeats/CRISPR-associated protein-9 nuclease, indels insertions and deletions, NHEJ non-homologous end joining, HSPCs hematopoietic stem and progenitor cells, sgRNAs single guide RNAs

Currently, ongoing research conducted in animal models has exhibited robust therapeutic potential of gene insertion strategies. Hemophilia A and B, caused by the deficiency of blood coagulation factor VIII (FVIII) and IX (FIX), respectively, are among the most intensively investigated disease models [27]. In this review, we focus on the relevant preclinical studies that explore in vivo targeting strategies for achieving efficient somatic knock-in of therapeutic payloads via systemic administration. Technical components of different targeting strategies, new findings, remaining challenges, and potential solutions with implications in clinical translation will be discussed.

2 Strategies for Targeted DNA Insertion for in vivo Therapy via Systemic Delivery

Various targeting strategies relying on DSB-activated intrinsic repair pathways, including HDR, NHEJ, MMEJ and SSA, have been reported to introduce DNA insertion at specific loci. Each repair pathway has distinct advantages and limitations for targeted insertions, while donor designs can selectively enhance the targeting specificity and minimize non-specific and potential adverse effects.

2.1 Homology-Directed Repair (HDR)-Based Gene Insertion as a Classical and Broadly Used Targeting Strategy

HDR accurately corrects or removes a DNA lesion through synthesizing new DNA based on existing homology sequences [28]. Targeted gene insertion via the HDR mechanism requires donors carrying flanking homology arms of approximately 0.6–1.4 kb [29,30,31,32,33,34,35]. The long homologies facilitate the HDR repair of not only site-specific DSBs induced by a nuclease but also endogenous DNA lesions within a large region, which can lead to low frequency gene insertion in the absence of nuclease.

Targeted gene insertion via HDR is a widely adopted strategy to introduce therapeutic payloads in preclinical models. The compact size of the human F9 gene (hF9, 1.4 kb) allows the inclusion of long homology arms within the limited capacity of AAV vectors (< 4.7 kb), which renders hemophilia B a desirable model for HDR knock-in studies. In vivo hF9 insertion via HDR-based methods has been broadly achieved through systemic AAV administration of either ZFNs [30, 31] or CRISPR/Cas9 systems [32,33,34], demonstrating therapeutic benefits in both neonatal and adult mice with hemophilia B. The plasma hFIX levels ranged from 3 to 23% of normal in most studies targeting the F9 locus [30,31,32, 34], while Wang et al. observed notably higher plasma hFIX levels by targeting the mAlb locus, which reached 40% of normal in adult mice and 120% of normal in neonates [33]. Among these studies, the insertion rates of the hF9 transgene in the liver varied from 1 to 3.8% in adult mice, and from 11 to 16.1% in neonates, depending primarily on the donor designs and target loci involved, whereas on-target DNA cleavage was detected at much higher levels, with indel rates ranging from 34 to 47% [30, 32, 33]. Researchers have also explored nuclease-free gene correction/insertion, aiming to bypass the use of nucleases and minimize DNA damage; however, the insertional efficiency was relatively low. Barzel et al. delivered a single AAV8 vector carrying an hF9 donor flanked by homology sequences to the Alb locus into hemophilia B mice without any nuclease, and detected only around 0.5% of Alb alleles carrying hF9 in both neonatal and adult mice [29]. In the study by Li et al. targeting the hF9 locus in humanized mice, administration of the AAV8 hF9 donor in the absence of a nuclease failed to produce detectable hFIX [30].

In addition to Cas9, alternative Cas families also hold considerable promise in mediating in vivo gene insertion. The Cas12a (Cpf1) family offers a distinct advantage of producing staggered DSB ends, making it well-suited for precise and long homology-based HDR insertion [36, 37]. Recently, investigations have further highlighted a compact AsCas12f (422 amino acids) [38,39,40]. Hino et al. utilized a hyperactive variant enAsCas12f to facilitate HDR-based hF9 knock-in at the mAlb 3′ untranslated region (UTR) locus in hemophilia B neonatal mice using a single AAV vector, yielding over 40% of normal FIX activity [38].

In vivo gene insertion in the liver via systemic delivery has also been applied to treat diseases beyond hemophilia B. In 2015, Sharma et al. reported HDR-based gene insertion at the mAlb site as a versatile platform for therapeutic gene knock-in to treat liver metabolic diseases, including Fabry and Gaucher diseases, and Hurler and Hunter syndromes (also known as MPS type I and II, respectively) [41]. Meanwhile, in situ correction of disease genes via HDR-based insertion at the original loci has also shown therapeutic efficacy in addressing Crigler–Najjar disease, ornithine transcarbamylase (OTC) deficiency, and hereditary tyrosinemia type 1 (HT-1) [42,43,44,45,46,47,48].

Collectively, these advances support HDR-mediated insertion as a promising approach to treat a broad spectrum of diseases; nevertheless, the insertion rates attained are still limited, restricting it to diseases with low therapeutic thresholds [33]. Furthermore, the requirement of long homology arms in the donor imposes a constraint on AAV packaging capacity, confining the application of HDR-based insertion to small transgene sequences. More investigations are thus warranted to enhance the insertional efficiency and reduce the length of homology without sacrificing insertion precision.

2.2 Targeted Insertion via Non-homologous End Joining (NHEJ) Provides Distinct Alternatives

The NHEJ pathway functions particularly in repairing DSBs rather than other DNA lesions [7]; hence, a nuclease-induced site-specific DSB is essential for NHEJ-based targeted gene insertion. Unlike HDR, NHEJ repairs DSBs by directly ligating the broken DNA ends in a homology-independent and template-free manner [8]. The NHEJ donors are therefore devoid of homology arms and are presented in a linear form to expose the DNA ends [49, 50]. Remarkably, NHEJ insertion exhibited much higher efficiency than HDR-based strategy in cellular assays [49] due to the rapid and dominant nature of NHEJ-based DSB repair in mammalian species [51].

Despite the error-prone and nondirectional features of NHEJ repair, targeted insertion mediated by NHEJ emerges as a distinct alternative to the HDR strategy for therapeutic application, owing to its high efficiency and homology-independent flexibility [50]. In 2016, Suzuki et al. provided the first demonstration of NHEJ-mediated gene insertion in rodents [50]. This study also established a useful design termed homology-independent targeted integration (HITI), which can significantly enhance forward integration by reconstructing the sgRNA target sites and permitting re-cleavages upon reverse integration [50].
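The orientation logic behind HITI can be made concrete with a toy string model. The sketch below is illustrative only and is not taken from Suzuki et al.: the protospacer, PAM, flanking bases, and payload are made-up sequences, the helper names (`revcomp`, `cut_positions`) are hypothetical, Cas9 is assumed to make a blunt cut 3 bp upstream of the PAM, and end joining is assumed to be perfect (no indels). The donor carries the target site in reverse-complement orientation on both sides of the payload, so that forward integration leaves no intact target site, whereas reverse integration rebuilds one at each junction.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def cut_positions(seq: str, protospacer: str, pam: str) -> list[int]:
    """Indices of blunt cuts (3 bp upstream of the PAM) for every intact
    target site found on either strand of seq."""
    site = protospacer + pam
    cuts = []
    # (pattern to search, offset of the cut from the start of the match)
    for pattern, offset in ((site, len(protospacer) - 3),
                            (revcomp(site), len(pam) + 3)):
        start = seq.find(pattern)
        while start != -1:
            cuts.append(start + offset)
            start = seq.find(pattern, start + 1)
    return sorted(cuts)

# Made-up sequences for illustration (assumptions, not real loci).
PROTOSPACER = "GACCTGATCGATTACGGCTA"
PAM = "TGG"
SITE = PROTOSPACER + PAM
PAYLOAD = "ATGAAACCCGGGTTTTAA"

genome = "AAAACCCC" + SITE + "GGGGTTTT"
# HITI-style donor: payload flanked by the target site in reverse orientation.
donor = "TT" + revcomp(SITE) + PAYLOAD + revcomp(SITE) + "TT"

(g_cut,) = cut_positions(genome, PROTOSPACER, PAM)
d_left, d_right = cut_positions(donor, PROTOSPACER, PAM)
insert = donor[d_left:d_right]  # fragment released from the donor

forward = genome[:g_cut] + insert + genome[g_cut:]
reverse = genome[:g_cut] + revcomp(insert) + genome[g_cut:]

print("cleavable sites after forward insertion:",
      len(cut_positions(forward, PROTOSPACER, PAM)))
print("cleavable sites after reverse insertion:",
      len(cut_positions(reverse, PROTOSPACER, PAM)))
```

In this simplified model the forward product contains no cleavable site, while the reverse product contains two and can therefore be cut again, which is the basis for the enrichment of forward integration over repeated rounds of cleavage.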

NHEJ-based insertion has been extensively employed in various in vivo disease models, particularly favoring the knock-in of larger genes such as hF8, as the limited capacity of AAV precludes the HDR vector design. Chen et al. used two AAV8 vectors to deliver SaCas9/sgRNA targeting the mAlb locus and an NHEJ donor carrying B-domain deleted hF8 (BDD-F8), a truncated gene of 4.4 kb to encode functional hFVIII [52]. The hemophilia A mice receiving intervention showed long-term and dose-dependent production of hFVIII, with the plasma hFVIII level and activity reaching around 34% and 13% of the normal, respectively. Droplet digital PCR (ddPCR) analysis detected the insertion of BDD-F8 at the mAlb locus in 0.2–0.3% liver DNA [52]. Meanwhile, Zhang et al. also performed effective somatic knock-in of BDD-F8 in hemophilia A mice using three AAV8 vectors to carry SpCas9, sgRNA targeting mAlb, and the BDD-F8 NHEJ donor. Stable plasma hFVIII was robustly detected within 1 month after the treatment, demonstrating 100–200% of normal activity [53].
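The packaging argument made here and in Sect. 2.1 can be checked with simple arithmetic. The following sketch uses only figures quoted in this review (AAV capacity below about 4.7 kb, hF9 of 1.4 kb, BDD-F8 of 4.4 kb, homology arms of 0.6–1.4 kb each). The function name `fits_in_aav` is hypothetical, and the model deliberately ignores ITRs, splice acceptors, polyadenylation signals, and other elements that take up additional space in a real vector.

```python
AAV_CAPACITY_BP = 4700  # approximate packaging limit quoted in the text

def fits_in_aav(transgene_bp: int, arm_bp: int = 0) -> bool:
    """True if the transgene plus two flanking homology arms stays below
    the packaging limit. arm_bp=0 models an NHEJ donor without arms."""
    return transgene_bp + 2 * arm_bp < AAV_CAPACITY_BP

HF9_BP = 1400      # hF9 size quoted in Sect. 2.1
BDD_F8_BP = 4400   # B-domain deleted hF8 size quoted above

print("hF9 + 0.6-kb arms:", fits_in_aav(HF9_BP, arm_bp=600))
print("hF9 + 1.4-kb arms:", fits_in_aav(HF9_BP, arm_bp=1400))
print("BDD-F8 + 0.6-kb arms:", fits_in_aav(BDD_F8_BP, arm_bp=600))
print("BDD-F8, no arms (NHEJ donor):", fits_in_aav(BDD_F8_BP))
```

Under these assumptions, hF9 fits even with the longest arms, whereas BDD-F8 fits only when the homology arms are omitted.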

Targeted insertion of the small hF9 gene via NHEJ is technically less challenging and is often used as a trailblazer to explore new Cas9 systems, target sites, or donor designs to address unsolved issues. He et al. used triple AAV8 delivery of SpCas9, sgRNA, and the NHEJ hF9 donor targeting the mAlb 3′UTR to treat hemophilia B mice and achieved liver-specific hF9 knock-in via systemic administration [54]. They demonstrated a significant synergy yielded by high Cas9 expression, active sgRNAs, and the hyperactive hFIX Padua variant, which reduced the required AAV input doses by approximately 100-fold in both neonatal and adult mice. RNA-seq analysis detected the mAlb-hF9 chimeric transcripts produced from the desired hF9 insertion at around 0.49% and 0.08% of mAlb mRNA in the adult and neonatal mice, respectively, receiving the lowest effective AAV doses. Lee et al. demonstrated hF9 knock-in using a less commonly used Cas9 from Campylobacter jejuni (CjCas9) and by targeting the APOC3 transgene locus, showing that a bidirectional AAV-trap donor resulted in much higher hFIX production at an insertion rate of approximately 3%, as both forward and reverse insertions could support functional expression [55]. In another study, Chen et al. employed the HITI strategy to treat hemophilia B in a rat model and achieved a notable 7.5-fold increase in forward hF9 insertion [56].

2.3 Other Repair Mechanisms Involved in Gene Insertion

The MMEJ pathway requires 1–16 nt homology flanking the DSB for repair [9]. DSB repair via the MMEJ mechanism is also an error-prone process and mostly contributes to larger deletions that often co-exist with the indels produced by NHEJ [57].

Studies have reported the potential of MMEJ to mediate large gene insertion at DSB sites by adopting linearized donors flanked by 20-bp microhomology sequences [58, 59]. However, the insertion of MMEJ donors was often supported by both MMEJ and NHEJ mechanisms, and evident MMEJ insertion was only observed in mouse embryonic stem cells and fetuses, suggesting that its activity may be cell context-dependent [60]. Using AAV donors and ex vivo cell models, Fu et al. found that the MMEJ pathway competed with NHEJ and HDR in mediating insertions at nuclease-induced DSBs, albeit as a relatively slower process [61]. Zhao et al. utilized a recombinant donor design that featured one microhomology arm and one long homology arm, which facilitated precise insertion in multiple cell models by employing both MMEJ and HDR repair [62]. To date, in vivo gene knock-in via MMEJ has only been tested via electroporation and hydrodynamic injection of plasmids [60, 63]. Additional research is needed to examine the efficacy of AAV-delivered MMEJ-mediated in vivo knock-in, since efficient and precise insertion that circumvents the need for long homology arms is appealing for AAV delivery.

Compared with MMEJ, SSA repair involves longer homology and leads to larger deletions [64]. The potential of SSA in gene knock-in has primarily been validated in single-stranded DNA (ssDNA)-mediated targeted insertion [65]. Co-delivery of Cas9/sgRNA and single-stranded oligonucleotides (ssODN; <200 nt) has previously led to the successful correction of HBB and CYBB mutations in human HSPCs [66, 67] and to efficient knock-in in mouse and rat embryos [68], which supported ssDNA donors as an attractive option because short homology arms of 30–100 nt can mediate effective insertion [69, 70]. To test the potential of ssDNA for somatic gene knock-in, Guan et al. delivered a naked 120-nt ssODN and Cas9/sgRNA plasmids into adult mice by hydrodynamic injection, which indeed corrected a point mutation in the F9 gene, albeit at a low rate of only 0.56%, which was attributed to the plasma instability of the ssODN [71]. Recently, long ssDNA (lssDNA) donors were reported to mediate gene-size knock-in and demonstrated high efficacy and low cytotoxicity in mouse embryos and multiple cell models [72,73,74]. New designs and modifications of ssDNA have further improved its transfection efficiency and in vivo stability [72, 75, 76], actively promoting new research. However, costly production and unstable delivery persist as unsolved obstacles, especially for in vivo targeted insertion via systemic delivery.

3 Choices of Insertion Site and Expression Strategies

3.1 Maximizing Therapeutic Effect and Versatility by Insertion Locus Selection

The transgene expression and therapeutic benefits produced from a gene insertion greatly depend on the choice of integration locus (Table 3). Inserting a therapeutic sequence into the original locus to functionally replace the defective sequence was first tested in hemophilia B mice. Ohmori et al. [34] and Wang et al. [32] targeted the endogenous mF9 locus, while Li et al. targeted the hF9 transgene locus in humanized mice [30], for the integration of mF9 or hF9 exons 2–8. These insertions all achieved stable plasma FIX production and hemostasis correction. However, the efficacy of this substitutive insertion approach relies on intact transcriptional activity of the target locus and hence cannot benefit patients who carry mutations in promoters or regulatory regions.

Table 3 In vivo targeted insertion strategies for hemophilia treatment in preclinical studies

Alternatively, general target loci with high transcription activity and tissue specificity become attractive as versatile platforms for therapeutic insertion. The Alb locus is one of the most popular target sites owing to its exclusive and high expression in the liver. The proof-of-concept study by Sharma et al. in 2015 generated successful transgene knock-in at mAlb intron 1 for the treatment of hemophilia A and B and other liver metabolic diseases [41]. The versatility and efficacy of targeting other sites within the Alb locus were also investigated using hemophilia A or B models [29, 33, 52,53,54, 56]. Thus far, no reduction of ALB production has been reported in mice carrying an insertion at the Alb locus [33], but the concern that such insertions might compromise ALB production has yet to be fully addressed.

More recently, the APOC3 locus in humanized mice was assessed for hF9 insertion, producing 300 ng/mL plasma hFIX after administration of AAVs carrying CjCas9/sgRNA and a bidirectional hF9 donor [55]. APOC3 was selected because of its liver specificity and minimal safety risks, as decreased APOC3 expression is clinically asymptomatic or even beneficial for cholesterol modulation in the human body [55]. In 2023, Lee et al. also directed hF9 knock-in into the Serpinc1 locus, which encodes antithrombin (AT), the most highly expressed anticoagulant factor. Marked improvement in coagulation activity was observed, attributed to the combined effect of hFIX production and AT decrease [77]. Although these results are promising, careful control of AT suppression is necessary to prevent thrombotic adverse events.

Transgene knock-in at safe harbor loci, such as Rosa26 in mice or AAVS1 in humans, has rarely been examined for therapeutic insertion because it requires exogenous promoters, which limit donor delivery via AAV and may also result in unexpected gene activation with a risk of tumorigenesis [31, 50].

3.2 Balancing High Expression and Low Disturbance by Selecting a Target Site

Early research favored intronic targeting to avoid frameshifts caused by NHEJ-produced indels. F9 exons 2–8 from human or mouse have been inserted into intron 1 of multiple loci, such as mF9 [34], hF9 [30] and mAlb [33, 41]; each produced plasma FIX and ameliorated hemophilia B symptoms in mice (Fig. 2). In these studies, a synthetic splice acceptor (SA) site was included in the donors to connect the integrated transgene sequence with the upstream exon 1, yielding the desired mRNA and protein [30, 33, 34, 41]. The mALB-hFIX hybrid proteins carried a short sequence from the endogenous locus at the N-terminus but retained full FIX activity for therapeutic benefit [33, 41]. It is noteworthy that exon 1 of both F9 and Alb encodes a signal peptide, making intron 1 an ideal target site for the expression of secretory proteins such as FIX, FVIII, or some metabolic enzymes in the liver, although it may not be suitable for proteins with cell-autonomous functions.

Fig. 2

Selection of insertion sites and expression strategies. Engineered nucleases, including ZFN and CRISPR/Cas9, have been systemically delivered by AAV for in vivo transgene insertion in mouse models. Intron 1, exon 2, intron 13, and exon 14 are intensively used insertion sites. SA is typically included in the donor when targeting intron 1, to connect the GOI with the upstream exon 1. Targeting exon 2 results in the fusion of GOI with exon 1, generating a chimeric protein. When targeting intron 13, a compensated exon 14 sequence is contained in the donor to avoid impacting endogenous gene expression level. Auxiliary 2A is also added to segregate two proteins. Stop codon and downstream 3′UTR at exon 14 are also tested in combination with 2A-GOI or IRES-GOI for bicistronic expression, respectively. (Created with BioRender.com.) AAV adeno-associated virus, ZFN zinc-finger nuclease, CRISPR/Cas9 clustered regularly interspaced short palindromic repeat-associated 9 nucleases, SA splicing acceptor, GOI gene of interest, 2A self-cleaving peptide, IRES internal ribosome entry site, UTR untranslated region

Recent studies also evaluated mAlb intron 13 for the targeted insertion of BDD-F8 using NHEJ approaches, recording plasma hFVIII production and hemostasis correction in hemophilia A mice [52, 53]. To avoid altering endogenous ALB protein, NHEJ donors were specially designed to compensate mAlb exon 14 coding region and carry auxiliary 2A sequences to separate ALB and hFVIII proteins. Using a hemophilia B rat model, Chen et al. demonstrated successful integration of hF9 Padua into the endogenous rat Alb intron 13 through an SpCas9/sgRNA-induced NHEJ-based HITI targeting approach [56] (Fig. 2).

Targeting sites in exons has been explored in parallel. Wang et al. inserted hF9 exons 2–8 into the endogenous mF9 exon 2 (Fig. 2), producing a chimeric mFIX-hFIX protein that was detectable in mouse plasma and had therapeutic activity [32]. Barzel et al. and De Caneva et al. targeted the stop codon of mAlb at exon 14 to insert 2A-hF9 [29] and 2A-UGT1A1 [48], respectively, by applying HDR-based methods to precisely position the insertions. By adopting an NHEJ-based strategy, He et al. integrated hF9 at mAlb exon 14, but in the 3′UTR downstream of the stop codon [54] (Fig. 2). The donor contained an internal ribosome entry site (IRES) to translate hF9 separately from mAlb. Compared with targeting the mAlb stop codon, insertion in the 3′UTR provided greater flexibility in choosing a highly active sgRNA target site, which reached a higher efficiency and achieved therapeutic effects with much lower AAV doses [54].

The targeted insertions in adult mouse liver were generally detected in the vicinity of 0.5–3.8%, whereas the indels at DSB sites often reached much higher levels (12–50.3%) in both HDR- and NHEJ-based insertion studies [33, 54]. The high indel rates at target sites, driven by the intrinsic potency of NHEJ regardless of targeting strategies, underscore the value of intronic or 3′UTR-based targeting strategies in avoiding massive disruption of the coding sequences at target loci. Nonetheless, extra precautions are still necessary due to the presence of genomic regulatory elements.

3.3 Donor Design and Expression Strategies

Promoterless donors targeting a well-characterized locus, such as Alb, have drawn considerable attention in the field of gene insertion therapy (Table 3). Using a promoterless donor serves the dual purpose of reducing donor size and bypassing exogenous promoters, which mitigates the risk of interfering with endogenous transcription, gene silencing [78], and unintended activation of oncogenes by random integration [79].

Ectopic SA is often used to link the promoterless transgene to the upstream endogenous exons upon intronic targeting [30, 33, 41], while self-cleaving 2A peptides and IRES discovered in viruses are widely used to introduce bicistronic expression [80] (Fig. 2).

The 2A sequences usually encode 18–22 amino acids and mediate simultaneous expression of two proteins through ribosomal skipping during translation [81]. The upstream protein carries the majority of 2A peptide residues at the C terminus, while the downstream protein contains one extra proline at the N terminus [82]. The small size of 2A sequences (54–66 bp) is ideal for AAV delivery, and the efficiency of 2A peptides in mediating bicistronic expression is typically high [80], especially P2A, which reached nearly 100% in a previous study [83]. Preclinical studies employing either HDR- or NHEJ-based insertion strategies have harnessed 2A sequences to separate transgenes from endogenous coding sequences, thereby achieving successful expression and therapeutic benefits [29, 48, 52, 53].
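The quoted size range follows directly from the genetic code, since each amino acid is encoded by three nucleotides. A minimal sketch of that arithmetic is given below; the function name `coding_length_bp` is hypothetical, and the 4.7-kb AAV capacity is the approximate figure quoted earlier in this review.

```python
AAV_CAPACITY_BP = 4700  # approximate AAV packaging limit quoted in Sect. 2.1

def coding_length_bp(n_amino_acids: int) -> int:
    """Nucleotides needed to encode a peptide (3 nt per codon, no stop codon)."""
    return 3 * n_amino_acids

for aa in (18, 22):  # size range of 2A peptides quoted in the text
    bp = coding_length_bp(aa)
    share = 100 * bp / AAV_CAPACITY_BP
    print(f"{aa} aa -> {bp} bp ({share:.1f}% of AAV capacity)")
```

Even the longest 2A sequence in this range occupies well under 2% of the nominal AAV capacity, which illustrates why it is considered a convenient element for size-constrained donors.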

IRES is an RNA element that can recruit ribosomes to initiate translation from an internal region of the mRNA [84], obviating the need for an in-frame connection between two ORFs. It can therefore tolerate indels and serve as a suitable element for NHEJ insertion [85]. IRES-mediated bicistronic expression is less efficient, with the downstream gene generally expressed at 20–50% of the level of the upstream gene [86]. Nevertheless, when a target locus with high transcription activity (such as Alb) was selected, a superior level of plasma hFIX could still be produced with a low input AAV dose using IRES [54]. To date, the efficiencies of bicistronic expression, as well as the side products and their functional implications, have been less thoroughly investigated in preclinical studies, warranting further examination in the future.

4 Remaining Challenges and Potential Solutions

4.1 Challenges Arising from Adeno-Associated Virus-Based Delivery

At present, in vivo gene insertion therapies validated in preclinical models have primarily relied on the AAV system for systemic delivery, and hence they face the same hurdles as AAV gene augmentation therapy [87]. The barriers include pre-existing immunity in the human population [88], limited payload capacity [89, 90], toxicity at high dose [91, 92], and difficulty in mass production [17]. In addition to the extensive research on advancing AAV technologies [93], studies on targeted gene insertion also explored the potential to address the obstacles arising from high AAV dosage via improved targeting designs or other delivery tools [94]. Through combining optimal settings, He et al. demonstrated robust and functional in vivo hF9 insertion with significantly reduced AAV input despite delivering SpCas9, sgRNAs, and donor using three separate AAV vectors [54]. Lee et al. alternatively combined viral and non-viral vectors for in vivo hF9 knock-in and revealed the possibility of replacing AAV for Cas9/sgRNA delivery with lipid nanoparticles (LNPs), which reduced the total AAV dose required [77, 95].

4.2 Relatively Low Efficiency of Somatic Gene Knock-In

In vivo gene insertion targeting somatic tissues including the liver could only achieve modest efficiency (approximately 1–3% of target alleles) with current technologies and tools, thus restricting the preclinical demonstration to diseases with low therapeutic thresholds, such as hemophilia. Although technological innovations can further enhance targeting efficiency, the extent of improvement may be limited by unidentified intrinsic factors. The evidence presented in the dosage analysis by He et al. suggested that the knock-in of hF9 in mouse livers may reach a plateau level, with approximately 4.66% of functional insertions detected in adult mice and 0.96% observed in neonates [54].

Another intriguing avenue to boost the therapeutic potential of somatic knock-in lies in the in vivo selective expansion of target cells. Mature hepatocytes with genetic advantages can expand substantially in vivo under selective conditions, as naturally manifested in HT-1 mice with fumarylacetoacetate hydrolase (FAH) deficiency and Wilson disease (WD) mice with P-type copper-transporting ATPase (ATP7B) deficiency [96, 97]. Nygaard et al. developed a universal selection system by applying a chemical inhibitor to replicate FAH deficiency conditions, while delivering an hF9 donor with short hairpin RNA (shRNA) against upstream enzymes to ameliorate the cytotoxicity specifically in targeted cells [98]. Indeed, hepatocytes with hF9 knock-in significantly repopulated the liver, and the plasma hFIX levels increased by 10- to 1000-fold after selective expansion, robustly reaching therapeutic levels. Recent studies then used acetaminophen (APAP) to induce cytochrome P450 (CYP)-dependent hepatotoxicity and co-delivered sgRNA against the CYP cofactor (Cypor) to confer a growth advantage on target cells, which displayed over 30-fold expansion of transgene-bearing hepatocytes in both CRISPR-mediated [99] and cleavage-free insertions [100].

4.3 Tissue Specificity of Transgene Expression and Somatic Genome Editing

Tissue-specific transgene expression is preferred but is challenging to achieve in AAV-delivered gene augmentation. While naturally occurring or engineered capsids may result in distinct tissue tropism [101], their properties differ considerably between preclinical models and human patients, and exclusive specificity for certain tissues or cell types has not been achieved [102]. Notably, targeted insertion of promoterless sequences places transgene expression under the control of the endogenous target locus [49], which provides an excellent solution for introducing tissue/cell type-specific expression. Targeting hF9 integration to either the mAlb or the Apoc3 locus secured hepatocyte-specific expression in mice [29, 30, 33, 41, 48, 52–56].

Moreover, tissue specificity is a significant challenge when AAVs carrying nucleases are administered systemically for in vivo gene editing. To address this issue, Li et al. and Anguela et al. controlled ZFN expression with a liver-specific ApoE/hAAT1 promoter and restricted hF9 knock-in and expression to the liver, although tissue-specific gene editing was not directly verified [30, 31]. He et al. and Lee et al. applied the liver-specific LP1 and TBG promoters, respectively, and achieved confined Cas9 expression in mouse hepatocytes and liver-specific genome editing [54, 55]. These observations indicate the feasibility of cell type-specific genome editing, which could support a broad range of applications with enhanced safety profiles.

Because germline cells are largely resistant to AAV infection [103], vertical transmission of editing events through the germline has not been observed, although in vivo gene editing and insertion via systemic AAV delivery have been widely reported in various somatic tissues [54, 104]. However, the possibility of germline infection by novel AAV capsids cannot be excluded, which highlights the importance of confining editing activity to disease-relevant tissues/cells for new gene editing or insertion therapies.

4.4 Off-Target Effect

Off-target editing remains a pivotal concern for gene editing-related applications. Extensive efforts have been dedicated to minimizing the risk of unintended targeting, including advances in in silico prediction, restriction of the duration of Cas9 activity, and further engineering of the Cas9 protein and sgRNA [105]. Lee et al. effectively shortened the Cas9 expression window by employing LNP delivery instead of AAV [77, 95]. Additionally, high-fidelity Cas9 variants, such as HypaCas9 [106] and Cas9-HF1 [107], have been rationally designed, albeit with activity compromised in favor of specificity [108]. Chemical modifications of sgRNA, through incorporation of 2′-O-methyl-3′-phosphonoacetate (MP), bridged nucleic acids, or locked nucleic acids, have also been shown to enhance specificity [109]. Optimizing sgRNA length, such as by adding two guanine nucleotides [110] or truncating 2–3 nt at the 5′ end, could also diminish off-target potential [111].
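The two length modifications mentioned above can be sketched in a few lines. This is a minimal illustration rather than a validated guide design tool: the function names and the 20-nt example spacer are hypothetical, the spacer is written in the DNA alphabet, and the two guanines are assumed to be added at the 5′ end. No prediction of on-target or off-target activity is attempted.

```python
VALID_BASES = set("ACGT")


def _check_spacer(spacer: str) -> str:
    """Normalize a spacer to upper case and reject non-ACGT characters."""
    spacer = spacer.upper()
    if not spacer or set(spacer) - VALID_BASES:
        raise ValueError("spacer must be a non-empty string of A, C, G, T")
    return spacer


def truncate_5prime(spacer: str, n_removed: int = 2) -> str:
    """Return the spacer with 2 or 3 nucleotides removed from its 5' end."""
    spacer = _check_spacer(spacer)
    if n_removed not in (2, 3):
        raise ValueError("n_removed is limited to 2 or 3 in this sketch")
    if len(spacer) <= n_removed:
        raise ValueError("spacer is too short to truncate")
    return spacer[n_removed:]


def extend_5prime_gg(spacer: str) -> str:
    """Return the spacer with two extra guanines prepended at the 5' end."""
    return "GG" + _check_spacer(spacer)


if __name__ == "__main__":
    # Hypothetical 20-nt spacer, used only to show the string operations.
    example = "GACGTTACCGGATCAAGCTT"
    print("original :", example, len(example))
    print("truncated:", truncate_5prime(example, 2))
    print("extended :", extend_5prime_gg(example))
```

Either variant would still need experimental confirmation, since changes in guide length can also alter on-target activity.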

4.5 Unintended Integration and Editing Outcomes

Despite efforts to channel the repair process towards a specific pathway, endogenous mechanisms may compete to repair nuclease-induced DSBs and mediate gene insertion, resulting in diverse, unintended modifications at the genomic target site. On-target indels have been broadly confirmed to be prevalent regardless of the targeting strategy. For instance, the study by Li et al. on HDR-based hF9 insertion revealed up to 45% indels generated by NHEJ repair at the DSB site [30], while the study by He et al. on NHEJ-based hF9 insertion detected a comparable level of up to 42% indels [54]. Subsequent studies by Sharma et al. and Wang et al. detected NHEJ-mediated donor integration events, despite the presence of HDR donors with homology arms [33, 41]. Additional undesired editing outcomes, including large deletions, chromosomal rearrangements, and truncated or concatemerized integration, have also been reported [112–114]. Although new methods are being actively developed [114, 115], full-spectrum detection of these heterogeneous and often unpredictable editing outcomes with current deep sequencing platforms remains technically challenging. Further research is required to minimize their occurrence and thoroughly evaluate their functional implications.

5 New Advances for Gene Editing Therapy

5.1 Lipid Nanoparticles Provide an Alternative Vehicle to Introduce In vivo Gene Insertion

LNPs are another promising liver-targeted vehicle for delivering CRISPR/Cas9 through systemic administration, owing to their primary uptake by hepatocytes mediated by the ApoE receptor [116]. Compared with recombinant AAV vectors, LNPs have a relatively short expression window and low immunogenicity, and offer advantages in potency, payload capacity, and design flexibility [117] (Fig. 3).

Fig. 3

Features of AAV and LNP delivery platforms and challenges for in vivo gene insertion. AAV and LNP have emerged as promising delivery vehicles for in vivo gene insertion. AAV exhibits a broad range of tissue tropisms, while LNP naturally targets mainly the liver despite recent advances in targeting other tissues. AAV delivers a DNA vector, in which Cas9 and sgRNA are packaged separately or in the same vector. The donor is also delivered as an AAV DNA vector. LNP facilitates the encapsulation of mRNA or proteins to deliver Cas9/sgRNA as mRNA or RNP. LNPs also allow the delivery of a dsODN donor, while encapsulation of large donors remains a challenge. AAV-mediated in vivo gene insertion enables specific expression owing to the locus-specific insertion of a promoterless donor and restriction of Cas9 expression. In contrast, LNP lacks expression specificity due to the difficulty in delivering large donors and the lack of restriction by cell type/tissue-specific promoters. The remaining challenges are shown in grey and are marked with red crosses. (Created with BioRender.com.) AAV adeno-associated virus, LNP lipid nanoparticle, Cas9 clustered regularly interspaced short palindromic repeat-associated 9 nuclease, sgRNA single guide RNA, RNP ribonucleoprotein, dsODN double-stranded oligodeoxynucleotide

The most common in vivo application of LNP-packaged Cas9 mRNA/sgRNA is gene disruption in the liver. Han et al. targeted the Serpinc1 gene to lower antithrombin (AT) expression in hemophilia mice, resulting in significantly reduced plasma AT and enhanced thrombin generation. Moreover, in vivo bioluminescence imaging and indel analysis confirmed that editing occurred predominantly in the liver [118]. Similarly, Qiu et al. disrupted the Angptl3 locus in mice and observed reduced serum angiopoietin-like 3 (ANGPTL3) and blood lipoprotein levels for more than 100 days after a single dose [119]. These advances have prompted clinical testing of LNP-mediated gene disruption. NTLA-2001 and NTLA-2002 were devised as treatments for transthyretin amyloidosis (ATTR) and hereditary angioedema (HAE) by disrupting the TTR and KLKB1 genes, respectively [120, 121] (Fig. 1). According to the data released thus far, NTLA-2001 has reduced serum TTR levels in all tested subjects with ATTR [122], while patients with HAE receiving NTLA-2002 showed robust declines in plasma kallikrein levels and HAE attack rates [123].

To date, LNP-Cas9-mediated in vivo gene insertion has only been attempted in combination with an AAV or dsODN donor, given the challenges in delivering long DNA sequences via LNPs. Yin et al. first combined LNP-Cas9 mRNA/sgRNA and an AAV donor for in vivo knock-in of the Fah gene in HT1 mice, which achieved gene correction in > 6% of hepatocytes and ameliorated disease symptoms [95]. Another example is the study by Lee et al., in which coadministration of an AAV donor and LNP-Cas9 complex achieved efficient in vivo hF9 knock-in in hemophilia mice [77]. Samanta et al. also co-delivered LNP-Cas9 and a dsODN donor (54 bp) to correct the p.R83C mutation in mice with glycogen storage disease type-Ia (GSD-Ia), which restored 3.6 ± 0.8% of normal G6Pase-α activity [124].

In summary, the efficiency and safety of LNPs for in vivo gene editing, demonstrated in both preclinical studies and clinical trials, illustrate their potential for future gene insertion therapy.

5.2 Ex vivo Gene Editing and Potential of Cell Therapy for Hemophilia Treatment

Ex vivo gene therapy comprises the extraction of particular autologous cells from patients, followed by culturing, expansion, editing and selection in vitro, and eventual transplantation back to the patients. This approach avoids the toxicity associated with AAV systemic administration and non-specific tissue targeting [125]. Successful ex vivo gene insertions have been accomplished in hematologic cell types, achieving the highest efficiency through coupling AAV6 HDR-donor with Cas9/sgRNA ribonucleoprotein (RNP) [126]. Notably, AAV6-RNP delivery has been widely used to introduce the targeted insertion of engineered chimeric antigen receptor (CAR) into human T cells [127], and the generated CAR-T cells have been actively tested in clinical trials [128].

Engineering B cells for cell therapy represents a nascent therapeutic strategy for genetic diseases, including hemophilia [129]. Primary B cells can differentiate into plasma cells, rendering them suitable for long-term expression of secretory factors and feasible for multiple infusions. Therefore, B-cell therapy provides considerable flexibility in adjusting treatment according to patient responses. Furthermore, unlike engineered hematopoietic stem and progenitor cells (HSPCs) and CAR-T cells, autologous B-cell transplants can be efficiently engrafted without lymphodepletion. This gives B-cell-based therapy a unique advantage by obviating the requirement for toxic preconditioning, which is a customary practice in emerging ex vivo approaches targeting hematopoietic stem cells (HSCs) for hemophilia treatment [130, 131]. In 2018, Hung et al. applied an AAV6 HDR-donor alongside Cas9/sgRNA RNP to insert MND promoter-driven hF9 Padua at the CCR5 locus in human primary B cells and successfully generated secretory hFIX with coagulation activity [132]. Recently, Liu et al. further demonstrated the successful infusion of the engineered B cells into immunodeficient mice, revealing rapid engraftment and stable FIX production for up to 20 weeks [133]. Two studies identified the IgH intronic region between the JH segments and the Eμ enhancer as a promising site for targeted insertion in primary human B cells, leading to successful production of antigen-specific antibodies [134, 135]. In conclusion, ex vivo therapy offers distinct advantages and can potentially serve as a treatment option for hemophilia, particularly as a complementary approach when in vivo therapy falls short.

6 Conclusion

The past decade has witnessed breakthroughs in gene editing tools as well as the development of ZFN- and CRISPR/Cas9-based in vivo gene editing, which open new avenues for the treatment of inherited genetic diseases. Recent preclinical advances have focused on optimizing targeted gene insertion by leveraging distinct targeting mechanisms, donor designs, and insertion sites, resulting in promising efficacy and safety profiles. While insertion approaches offer unique advantages and have achieved initial success in treating conditions such as hemophilia in animal models, several challenges need to be addressed before their transition to human clinical applications. The implementation of LNP delivery and ex vivo gene editing may offer feasible directions for addressing these concerns but demands more comprehensive and deliberate consideration, especially regarding practicality. Despite these challenges, ongoing research efforts and continuous improvements suggest that targeted gene insertion is a potent constituent of gene editing therapies and a potential candidate for the clinical treatment of genetic diseases.