Genome editing tools such as the clustered regularly interspaced short palindromic repeat (CRISPR)-associated system (Cas) have been widely used to modify genes in model systems including animal zygotes and human cells, and hold tremendous promise for both basic research and clinical applications. To date, a serious knowledge gap remains in our understanding of DNA repair mechanisms in human early embryos, and in the efficiency and potential off-target effects of using technologies such as CRISPR/Cas9 in human pre-implantation embryos. In this report, we used tripronuclear (3PN) zygotes to further investigate CRISPR/Cas9-mediated gene editing in human cells. We found that CRISPR/Cas9 could effectively cleave the endogenous β-globin gene (HBB). However, the efficiency of homologous recombination directed repair (HDR) of HBB was low and the edited embryos were mosaic. Off-target cleavage was also apparent in these 3PN zygotes as revealed by the T7E1 assay and whole-exome sequencing. Furthermore, the endogenous delta-globin gene (HBD), which is homologous to HBB, competed with exogenous donor oligos to act as the repair template, leading to untoward mutations. Our data also indicated that repair of the HBB locus in these embryos occurred preferentially through the non-crossover HDR pathway. Taken together, our work highlights the pressing need to further improve the fidelity and specificity of the CRISPR/Cas9 platform, a prerequisite for any clinical applications of CRISPR/Cas9-mediated editing.
KEYWORDS: CRISPR/Cas9, β-thalassemia, human tripronuclear zygotes, gene editing, homologous recombination, whole-exome sequencing
The CRISPR/Cas9 RNA-endonuclease complex, consisting of the Cas9 protein and the guide RNA (gRNA) (~99 nt), is based on the adaptive immune system of Streptococcus pyogenes SF370. It targets genomic sequences containing the tri-nucleotide protospacer adjacent motif (PAM) and complementary to the gRNA, and can be programmed to recognize virtually any gene through the manipulation of gRNA sequences (Cho et al., 2013; Cong et al., 2013; Jinek et al., 2012; Jinek et al., 2013; Mali et al., 2013c). Following Cas9 binding and subsequent target site cleavage, the double strand breaks (DSBs) generated are repaired by either non-homologous end joining (NHEJ) or homologous recombination directed repair (HDR), resulting in indels or precise repair, respectively (Jinek et al., 2012; Moynahan and Jasin, 2010). The ease, expedience, and efficiency of the CRISPR/Cas9 system have lent it to a variety of applications, including genome editing, gene function investigation, and gene therapy in animals and human cells (Chang et al., 2013; Cho et al., 2013; Cong et al., 2013; Friedland et al., 2013; Hsu et al., 2014; Hwang et al., 2013; Ikmi et al., 2014; Irion et al., 2014; Jinek et al., 2013; Li et al., 2013a; Li et al., 2013b; Long et al., 2014; Ma et al., 2014; Mali et al., 2013c; Niu et al., 2014; Smith et al., 2014a; Wu et al., 2013; Wu et al., 2014b; Yang et al., 2013).
The specificity of CRISPR/Cas9 is largely dictated by the PAM and the 17–20 nt sequence at the 5′ end of gRNAs (Cong et al., 2013; Hsu et al., 2013; Mali et al., 2013a; Mali et al., 2013c; Pattanayak et al., 2013; Wu et al., 2014a). Up to 5 mismatches may be tolerated for target recognition in human cancer cells (Fu et al., 2013). Unintended mutations in the genome can greatly hinder the application of CRISPR/Cas9, especially in studies of development and gene therapy (Hsu et al., 2014; Mali et al., 2013b; Sander and Joung, 2014). Interestingly, three groups recently found through whole genome sequencing that off-target effects of CRISPR/Cas9 appeared rare in human pluripotent stem cells (Smith et al., 2014b; Suzuki et al., 2014; Veres et al., 2014), raising the possibility that high frequencies of unintended targeting by CRISPR/Cas9 may be more prevalent in cancer cell lines. Additionally, lower rates of off-target effects (compared to human cell lines) have also been reported in mouse zygotes (Wu et al., 2013; Yang et al., 2013). Despite great progress in understanding the utilization of CRISPR/Cas9 in a variety of model organisms, much remains to be learned regarding the efficiency and specificity of CRISPR/Cas9-mediated gene editing in human cells, especially in embryos. Because ethical concerns preclude studies of gene editing in normal embryos, we decided to use tripronuclear (3PN) zygotes, which have one oocyte nucleus and two sperm nuclei.
Extensive studies have shown that polyspermic zygotes such as tripronuclear (3PN) zygotes, discarded in clinics, may serve as an alternative for studies of normal human zygotes (Balakier, 1993). Polyspermic zygotes, which account for ~2%–5% of zygotes during in vitro fertilization (IVF) clinical trials, may generate blastocysts in vitro but invariably fail to develop normally in vivo (Munne and Cohen, 1998), providing an ideal model system to examine the targeting efficiency and off-target effects of CRISPR/Cas9 during early human embryonic development (Bredenoord et al., 2008; Sathananthan et al., 1999).
Here, we report that the CRISPR/Cas9 system can cleave an endogenous gene efficiently in human tripronuclear zygotes, and that the DSBs generated by CRISPR/Cas9 cleavage are repaired by NHEJ and HDR. The repair template for HDR can be either the endogenous homologous gene or an exogenous DNA sequence. This competition between exogenous and endogenous sequences complicates the analysis of possible gene editing outcomes and makes it difficult to predict the consequences of gene editing. Furthermore, mosaicism and mutations at non-target sites are apparent in the edited embryos. Taken together, our data underscore the need to more comprehensively understand the mechanisms of CRISPR/Cas9-mediated genome editing in human cells, and support the notion that clinical applications of the CRISPR/Cas9 system may be premature at this stage.
CRISPR/Cas9-mediated editing of HBB gene in human cells
The human β-globin (HBB) gene encodes a subunit of adult hemoglobin and is mutated in β-thalassemia (Hill et al., 1962). In China, CD14/15, CD17, and CD41/42, which are frame-shift or truncated mutations of β-globin, are three of the most common β-thalassemia mutations (Cao and Galanello, 2010). Located on chromosome 11, HBB is within the β-globin gene cluster that contains four other globin genes with the order of (from 5′ to 3′) HBE, HBG2, HBG1, HBD, and HBB (Schechter, 2008). Because the sequences of HBB and HBD are very similar, HBD may also be used as a template to repair HBB. The HBD footprints left in the repaired HBB locus should enable us to investigate whether and how endogenous homologous sequences may be utilized as HDR templates, information that will prove invaluable to any future endeavors that may employ CRISPR/Cas9 to repair gene loci with repeated sequences.
CRISPR/Cas9-mediated editing of HBB gene in human tripronuclear zygotes
Because of the preference for the error-prone NHEJ pathway, the HBB sequences from Cas9-cleaved embryos showed double peaks near the PAM site on sequencing chromatograms (Fig. 2C). Analysis of 5 of these embryos using the T7E1 assay also confirmed successful cleavage by G1 gRNA and Cas9 (Fig. 2D). In addition, the gene-edited embryos were mosaic. For example, embryo No. 16 contained many different kinds of alleles (Fig. 2E).
CRISPR/Cas9 has off-target effects in human tripronuclear embryos
HDR of double strand breaks at the HBB gene occurs preferentially through the non-crossover pathway
In this study, we used 3PN zygotes to investigate the specificity and fidelity of the CRISPR/Cas9 system. Similar to cultured human cells, most of the DSBs generated by Cas9 in 3PN zygotes were also repaired through NHEJ (Fig. 2A). ssDNA-mediated editing occurred only in 4 embryos (14.3%), and the edited embryos were mosaic, similar to findings in other model systems (Shen et al., 2013; Yang et al., 2013; Yen et al., 2014). Endogenous homologous sequences were also used as HDR templates, with an estimated editing efficiency of 25% (Fig. 2A). This high rate of repair using endogenous sequences presents obvious obstacles to gene therapy strategies using CRISPR/Cas9, as pseudogenes and paralogs may effectively compete with exogenous templates (or endogenous wild-type sequences) during HDR, leading to unwanted mutations (Fig. 2B).
Our whole-exome sequencing result only covered a fraction of the genome and likely underestimated the off-target effects in human 3PN zygotes. In fact, we found that even with an 8 bp mismatch between the G1 gRNA and C1QC gene (Fig. S6), the CRISPR/Cas9 system was still able to target the C1QC locus in human 3PN embryos (Figs. 3B and S5). Such off-target activities are similar to what was observed in human cancer cells. Because the edited embryos are genetically mosaic, it would be impossible to predict gene editing outcomes through pre-implantation genetic diagnosis (PGD). Our study underscores the challenges facing clinical applications of CRISPR/Cas9.
Further investigation of the molecular mechanisms of CRISPR/Cas9-mediated gene editing in human models is sorely needed. In particular, the off-target effects of CRISPR/Cas9 should be investigated thoroughly before any clinical application (Baltimore et al., 2015; Cyranoski, 2015; Lanphier et al., 2015).
MATERIALS AND METHODS
Construction and use of CRISPR plasmids
pX330 (Addgene, #42230) was used for transient transfection and pDR274 (Addgene, #42250) was used for in vitro transcription. We amplified the sequence encoding 3×Flag-tagged hCas9 from pX330 and cloned it into the NotI/AgeI restriction sites of pDR274 to obtain pT7-3×Flag-hCas9. The pT7-3×Flag-hCas9 plasmid was linearized with PmeI and in vitro transcribed using the mMESSAGE mMACHINE T7 ULTRA kit (Life Technologies). The pDR274 vector encoding gRNA sequences was in vitro transcribed using the MEGAshortscript T7 kit (Life Technologies). The Cas9 mRNA and the gRNAs were subsequently purified with the MEGAclear kit (Life Technologies), resuspended in RNase-free water, and quantified using a NanoDrop-1000.
Sequences for cloning the G1, G2, and G3 gRNAs into the pX330 vector are: pX330-G1-FP: CACCGTAACGGCAGACTTCTCCTC; pX330-G1-RP: AAACGAGGAGAAGTCTGCCGTTAC; pX330-G2-FP: CACCGTCTGCCGTTACTGCCCTGT; pX330-G2-RP: AAACACAGGGCAGTAACGGCAGAC; pX330-G3-FP: CACCGGCTGCTGGTGGTCTACCCT; pX330-G3-RP: AAACAGGGTAGACCACCAGCAGCC. Sequences for cloning the G1 gRNA into pDR274 are: pDR274-G1-FP: TAGGTAACGGCAGACTTCTCCTC; pDR274-G1-RP: AAACGAGGAGAAGTCTGCCGTTA.
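As a sanity check on the oligo pairs listed above, the following minimal Python sketch tests whether each forward/reverse pair would anneal into a duplex with 4-nt 5′ overhangs. The overhang length of 4 nt is an assumption read off the listed sequences (CACC/AAAC for pX330, TAGG/AAAC for pDR274), and the helper names `reverse_complement` and `annealed_core` are hypothetical; this is an illustration, not part of the original protocol.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


# Oligo pairs exactly as listed in the text (forward, reverse).
OLIGO_PAIRS = {
    "pX330-G1": ("CACCGTAACGGCAGACTTCTCCTC", "AAACGAGGAGAAGTCTGCCGTTAC"),
    "pX330-G2": ("CACCGTCTGCCGTTACTGCCCTGT", "AAACACAGGGCAGTAACGGCAGAC"),
    "pX330-G3": ("CACCGGCTGCTGGTGGTCTACCCT", "AAACAGGGTAGACCACCAGCAGCC"),
    "pDR274-G1": ("TAGGTAACGGCAGACTTCTCCTC", "AAACGAGGAGAAGTCTGCCGTTA"),
}


def annealed_core(forward, reverse, overhang=4):
    """Return the double-stranded core (read on the forward strand) if the
    two oligos are complementary after removing the assumed 5' overhangs,
    otherwise None."""
    core = forward[overhang:]
    if reverse_complement(reverse[overhang:]) == core:
        return core
    return None


if __name__ == "__main__":
    for name, (fp, rp) in OLIGO_PAIRS.items():
        # Print the two overhangs and the shared core for each pair.
        print(name, fp[:4], rp[:4], annealed_core(fp, rp))
```

A pair for which `annealed_core` returns `None` would indicate a transcription error in one of the two oligos under the stated overhang assumption.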
The sequence for the ssDNA oligo used to repair HBB is: 5′-CAACCTGCCCAGGGCCTCACCACCAACTTCATCCACGTTCACCTTGCCCCACAGGGCAGTGACAGCGGATTTTTCTTCAGGAGTCAGATGCACCATGGTGTCTGTTTGAGGTTGCTAGTGAACAC-3′
Identification and collection of human tripronuclear (3PN) embryos
Mature oocytes were inseminated in fertilization medium (Vitrolife, Sweden) 4 h after retrieval by conventional in vitro fertilization (IVF). Fertilization status was checked 16–19 h after insemination and normal fertilization was assessed by the presence of two clear pronuclei. Abnormally fertilized oocytes with three clear pronuclei were selected for cryopreservation.
Embryo vitrification and recovery
Embryos were selected for cryopreservation using the Cryotop device as reported (Kuwayama et al., 2005). Briefly, embryos were incubated in Vitrification Solution 1 (7.5% (v/v) DMSO and 7.5% (v/v) ethylene glycol) for 5–6 min, and then moved to Vitrification Solution 2 (15% (v/v) DMSO, 15% (v/v) ethylene glycol, and 0.65 mol/L sucrose) for 30 s. The embryos were then quickly placed onto a Cryotop (Kitazato Supply Co., Fujinomiya, Japan), followed by aspiration of excess medium with a fine pipette and quick immersion in liquid nitrogen. The embryos were then stored in liquid nitrogen. For recovery, the embryos were warmed with the polypropylene strip of the Cryotop immersed directly into 3 mL of 1.0 mol/L sucrose at 37°C for 1 min, retrieved and held for 3 min in 1 mL of a dilution solution (0.5 mol/L sucrose in TCM199 medium with 20% serum substitute supplement), and then washed at room temperature before being cultured for subsequent analysis.
Analysis of CRISPR/Cas9 induced cleavages
The T7 endonuclease 1 (T7E1) cleavage assay was performed as described previously (Shen et al., 2014). For verification of indels and mutations, genomic DNA was used for PCR amplification of target sites with primers listed in Supplementary information, Table S1. PCR products were sequenced directly using primers from Supplementary information, Table S1 to confirm the presence of double peaks, and those with double peaks were then TA cloned into the pGEM-T vector (Promega) for sequencing. In general, a total of 45–50 clones were sequenced for each embryo.
To identify potential off-target sites, we used an online tool (http://crispr.mit.edu/). Sequences surrounding these genomic sites were PCR amplified for the T7E1 assay with primers listed in Table S1.
Whole genome amplification using embryos
Whole genome amplification of the embryos was performed using the REPLI-g Midi Kit (Qiagen). Briefly, embryos were transferred into PCR tubes containing reconstituted buffer D2 (7 μL), and then incubated at 65°C for 10 min, before the addition of Stop solution (3.5 μL) and MDA master mix (40 μL) and incubation at 30°C for 8 h. The DNA preparation was diluted with ddH2O (3:100), and 1 μL of the diluted DNA was used for PCR analysis.
Whole-exome sequencing, data processing, and off-target analysis
The exome was captured using the 50 Mb SureSelectXT Human All Exon V5 kit (Agilent). The enriched exome was sequenced on an Illumina HiSeq 2000 as paired-end 100 bp reads (PE100), which were aligned to the human reference genome (UCSC, hg19) by means of BWA with default parameters (v0.7.5a) (Li and Durbin, 2010). Samtools (v0.1.19, http://samtools.sourceforge.net) and Picard tools (version 1.102, http://picard.sourceforge.net) were used to build indices and remove duplicates. Local realignment around indels (RealignerTargetCreator, IndelRealigner) and base quality score recalibration (BaseRecalibrator) were performed with GATK (The Genome Analysis ToolKit, version 3.1-1) (McKenna et al., 2010) to ensure accuracy in identifying indels and single nucleotide variants (SNVs). GATK HaplotypeCaller and Samtools were used to call variants for six samples, and the union of the two call sets, obtained with CombineVariants, was then divided into indels and SNVs with SelectVariants.
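The last step of this pipeline (taking the union of two call sets and splitting it into indels and SNVs) can be illustrated with a small Python sketch. This is a toy stand-in and not GATK's CombineVariants or SelectVariants: variants are represented as hypothetical `(chrom, pos, ref, alt)` tuples, the classification rule (single-base substitution versus length-changing allele) is a simplifying assumption, and the function names are invented for illustration.

```python
def classify(variant):
    """Classify a (chrom, pos, ref, alt) tuple as 'SNV', 'indel' or 'other'."""
    _chrom, _pos, ref, alt = variant
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) != len(alt):
        return "indel"
    # Same-length multi-base substitutions fall outside both categories here.
    return "other"


def union_and_split(calls_a, calls_b):
    """Take the union of two call sets and split it into SNVs and indels."""
    union = set(calls_a) | set(calls_b)
    snvs = sorted(v for v in union if classify(v) == "SNV")
    indels = sorted(v for v in union if classify(v) == "indel")
    return snvs, indels


if __name__ == "__main__":
    # Hypothetical calls from two callers; coordinates are made up.
    caller_1 = [("chr11", 100, "A", "G"), ("chr11", 200, "AT", "A")]
    caller_2 = [("chr11", 100, "A", "G"), ("chr1", 50, "C", "CTT")]
    snvs, indels = union_and_split(caller_1, caller_2)
    print("SNVs:", snvs)
    print("indels:", indels)
```

Taking the union rather than the intersection keeps variants reported by only one caller, which favors sensitivity and leaves the removal of false positives to the downstream filters.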
We first excluded indels and SNVs located outside of exon regions following annotation by ANNOVAR based on RefSeq gene models (hg19) (Wang et al., 2010). A total of 7,463 indels and 188,078 SNVs passed this filter. Next, indels and SNVs with more than two reads were retained by VariantFiltration and Python, discarding low-quality and unlikely indels (8.99%) and SNVs (5.91%).
To avoid false positive calls that overlap with repeat sequences and/or include homopolymers (Bansal and Libiger, 2011), we removed indels and SNVs that overlapped with low-complexity regions as defined by RepeatMasker (UCSC Genome Browser) and filtered out indels and SNVs containing homopolymers (>7 bp) in the low-complexity flanking region (±100 bp), removing 55.58% of potential indels and 17.01% of potential SNVs. To more definitively assign indels, we searched the ±100 bp regions flanking the potential indel sites for potential off-target sites. Bowtie1 (version 0.12.8, http://bowtie-bio.sourceforge.net) was used to align gRNA sequences (20 bp) to the ±100 bp sequences, allowing for ≤6 mismatches or a perfect match to the 10 nt at the 3′ end of the gRNA. Successfully aligned sites with an NRG PAM were deemed on/off-target sites. Of the 12 candidate indels identified by this analysis, there were ten on-target indels in all samples and two off-target indels in samples A and C. Candidate off-target sites were further confirmed by PCR and sequencing. The results are summarized in Table S2.
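The site-matching criteria described above (≤6 mismatches or a perfect match to the 3′-terminal 10 nt, followed by an NRG PAM) can be sketched as a naive scan in Python. This is not Bowtie and is not the authors' code: it is a brute-force illustration under the assumptions that the flanking window is an uppercase A/C/G/T/N string, that both strands are searched, and that the PAM lies immediately 3′ of the 20-nt protospacer. The function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def is_nrg(pam):
    """True if a 3-nt PAM matches NRG (N = any base, R = A or G)."""
    return len(pam) == 3 and pam[1] in "AG" and pam[2] == "G"


def find_sites(window, grna, max_mismatches=6, seed_len=10):
    """Scan both strands of `window` for candidate on/off-target sites.

    A position is reported when the protospacer is followed by an NRG PAM
    and either has at most `max_mismatches` mismatches to the gRNA or
    matches its last `seed_len` nucleotides exactly.
    Returns tuples (strand, offset, protospacer, pam, mismatches); the
    offset is relative to the scanned strand.
    """
    hits = []
    n = len(grna)
    for strand, seq in (("+", window), ("-", reverse_complement(window))):
        for i in range(len(seq) - n - 3 + 1):
            protospacer = seq[i:i + n]
            pam = seq[i + n:i + n + 3]
            if not is_nrg(pam):
                continue
            mismatches = sum(a != b for a, b in zip(protospacer, grna))
            seed_ok = protospacer[-seed_len:] == grna[-seed_len:]
            if mismatches <= max_mismatches or seed_ok:
                hits.append((strand, i, protospacer, pam, mismatches))
    return hits


if __name__ == "__main__":
    g1 = "GTAACGGCAGACTTCTCCTC"  # G1 protospacer from the pX330-G1 oligo
    # Synthetic window: the flanks and the AGG PAM are illustrative only.
    window = "TTTTTTTTTT" + g1 + "AGG" + "TTTTTTTTTT"
    for hit in find_sites(window, g1):
        print(hit)
```

A real analysis would use an indexed aligner over many windows; the point here is only to make the "mismatch budget or perfect seed, plus PAM" decision rule explicit.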
This study was supported by the National Basic Research Program (973 Program) (Nos. 2010CB945401 and 2012CB911201) and the National Natural Science Foundation of China (Grant Nos. 91019020, 81330055, and 31371508).
3PN, tripronuclear; DSB, double strand break; gRNA, guide RNA; IVF, in vitro fertilization; HDR, homologous recombination directed repair; NHEJ, non-homologous end joining; PAM, protospacer adjacent motif; PGD, pre-implantation genetic diagnosis; SDSA, synthesis-dependent strand annealing.
COMPLIANCE WITH ETHICS GUIDELINES
Puping Liang, Yanwen Xu, Xiya Zhang, Chenhui Ding, Rui Huang, Zhen Zhang, Jie Lv, Xiaowei Xie, Yuxi Chen, Yujing Li, Ying Sun, Yaofu Bai, Zhou Songyang, Wenbin Ma, Canquan Zhou, and Junjiu Huang declare that they have no conflict of interest.
This study conformed to the ethical standards of the Declaration of Helsinki and national legislation and was approved by the Medical Ethical Committee of the First Affiliated Hospital, Sun Yat-sen University. The patients donated their tripronuclear (3PN) zygotes for research and signed informed consent forms.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.