Introduction

Since the discovery of the structure of DNA, scientists have endeavored to genetically manipulate organisms. Until recently, most of the genetic engineering tools developed were based on DNA:protein recognition principles, such as restriction enzymes, site-directed zinc finger nucleases (ZFs), and TAL effector nucleases (TALENs) [6, 47]. However, these tools commonly suffer from difficulties in design, synthesis, and efficiency, which together prevented their widespread adoption, e.g. TALENs require repeats of 30–35 amino acids, each recognizing only a single nucleotide (nt) [29]. In contrast, the RNA-programmable CRISPR/Cas9 technology has led to a scientific revolution by addressing the above-mentioned issues [20]. The technology relies on two elements: a protein, the CRISPR associated protein (Cas9), and an RNA molecule, the guide RNA (gRNA) [5, 46]. Cas9, the first Cas protein used in genome editing, is a large multi-domain enzyme that interacts with the gRNA, the target DNA, and the Protospacer Adjacent Motif sequence (PAM) (Fig. 1a). The gRNA is composed of two distinct elements: the spacer, a 20 nt domain that binds to the DNA; and the scaffold, a ~ 79 nt domain that interacts with Cas9 (Fig. 1a). Once guided to the target, Cas9 catalytically cleaves the DNA sequence 3 nt upstream of the 5′-NGG PAM, resulting in the activation of endogenous repair mechanisms, such as homologous recombination (HR) or non-homologous end joining (NHEJ) [2, 71] (Fig. 1b).
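The targeting rule described above (a 20 nt protospacer immediately followed by a 5′-NGG PAM, with the cut placed 3 nt upstream of the PAM) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `find_cas9_sites` and the example sequence are hypothetical, only the given strand is scanned, and no efficiency or off-target scoring is attempted.

```python
import re

def find_cas9_sites(dna, spacer_len=20):
    """List candidate Cas9 sites on the given strand.

    Returns tuples (protospacer, pam, cut_index), where cut_index is the
    0-based position of the first base downstream of the predicted cut,
    i.e. 3 nt upstream of the PAM.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    sites = []
    for match in pattern.finditer(dna):
        pam_start = match.start() + spacer_len
        sites.append((match.group(1), match.group(2), pam_start - 3))
    return sites

# Hypothetical example sequence: one 20 nt protospacer followed by an AGG PAM.
example = "GATTACAGATTACAGATTAC" + "AGG" + "TTT"
for protospacer, pam, cut in find_cas9_sites(example):
    print(protospacer, pam, cut)
```

A complete search would also scan the reverse complement, since Cas9 can target either strand.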

Fig. 1

CRISPR/Cas9 for genome editing and gene regulation. a The gRNA:Cas9 complex binding to the DNA target. In green, the spacer region which interacts with the DNA target. b DNA cut generated from the Cas9 nuclease activity. c Gene regulation with dCas9 physically blocking the RNA polymerase from binding to the promoter region. d CRISPR interference further enhanced with dCas9 fused with transcriptional regulators. e gRNA scaffold extended with stem-loops recruiting regulator elements. f Multiplexing gRNAs from a single transcript through endoribonuclease or self-processing elements

Expanding Cas9 features through enzyme engineering

The structural characterization of Cas9 has led to the development of mutagenized variants with altered catalytic properties, altered specificities through different PAM recognition preferences, and reduced off-targeting [37, 55, 56, 90]. For example, Hirano et al. first characterized FnCas9 from Francisella novicida and, based on the protein structure, created a variant recognizing a 5′-YG PAM instead of the original 5′-NGG [37]. A 5′-YG PAM increases the target space available for genome editing, i.e. any target followed by CG or TG can be targeted by the gRNA:FnCas9 complex. Additionally, other CRISPR nucleases with different PAM preferences can be used to increase the available target space, e.g. FnCpf1, a type V CRISPR system from F. novicida, which recognizes a T-rich 5′-TTTN PAM [26, 101] (Fig. 1b). In another approach, mutation of one of the nuclease domains (RuvC D10A or HNH H840A; Cas9n) was shown to result in a modified Cas9 capable only of single-strand DNA breaks (nicks) instead of the original blunt double-strand break [80]. This feature has been shown to reduce off-targeting and enhance HR in some organisms [12, 70]. By extension, ‘paired nickases’, i.e. two adjacent gRNAs used with Cas9n, can efficiently introduce both indel mutations and HR events with a single-stranded DNA oligonucleotide donor template in mammalian cells [28, 10, 80]. Complete disruption of the endonuclease activities (RuvC D10A together with HNH H840A) results in a catalytically inactive Cas9, or dead-Cas9 (dCas9) [78, 79]. This has been exploited to physically block the transcriptional machinery when dCas9 is targeted to the promoter region of a gene of interest, an approach coined CRISPR interference (CRISPRi) [22, 34] (Fig. 1c). Additionally, repression can be further enhanced by fusing dCas9 with repressive domains, such as the mammalian transcriptional repressor domain Mxi1 [33] (Fig. 1d). Gander et al. have recently exploited the dCas9-Mxi1 repressive mechanism to effectively build up to seven layers of synthetic NOR gate circuits in S. cerevisiae [30] (Figs. 1d, 2b). Likewise, dCas9 can be coupled to activating transcription factor domains, such as the tripartite VP64-p65-Rta (VPR) or the RNAP ω-subunit (rpoZ), which have been characterized as powerful tools for activating genes [4, 7, 44, 91] (Fig. 1d). Similarly, epigenetic regulators, such as methylation, demethylation, acetylation and deacetylation domains, can be fused to dCas9 to influence chromatin structure and thereby alter the transcriptional signature of a promoter [36, 50, 54]. Hilton et al. reported the fusion of dCas9 with the histone acetyltransferase domain of the human E1A-associated protein p300 (dCas9-p300), which significantly modulated the chromatin structure and resulted in a 4000-fold upregulation with a single gRNA [36].
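To make the point about target space concrete, the following sketch counts how often a PAM occurs in a sequence when the PAM is written with IUPAC ambiguity codes (N = any base, Y = C or T). The function `count_pam_sites` and the short example sequence are hypothetical; the count covers PAM motifs on the given strand only and ignores whether a full-length protospacer fits next to each motif.

```python
import re

# Subset of IUPAC nucleotide codes needed for the PAMs discussed here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "Y": "[CT]", "R": "[AG]",
}

def count_pam_sites(dna, pam):
    """Count (possibly overlapping) occurrences of an IUPAC-coded PAM."""
    regex = "".join(IUPAC[base] for base in pam.upper())
    # Zero-width lookahead so that overlapping motifs are each counted once.
    return len(re.findall("(?=%s)" % regex, dna.upper()))

sequence = "ACGTGGCATG"  # hypothetical example sequence
print(count_pam_sites(sequence, "NGG"))  # 1
print(count_pam_sites(sequence, "YG"))   # 3
```

Even in this toy sequence, the more relaxed YG motif matches more positions than NGG, which is the sense in which a relaxed PAM enlarges the target space.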

The gRNA characteristics and extensions

Cas9 can be guided virtually anywhere in the genome where a PAM sequence is present. However, several parameters, such as nucleotide motifs, the particularities of the PAM sequence, and mismatches in the guide, have to be taken into account to obtain correct cleavage of the target DNA [91]. Recently, efforts have been made to improve targeting efficiency through algorithms that predict the ability of a gRNA to direct DNA cleavage by Cas9 at the intended target site, employing refined machine learning methods and large training datasets [18, 19]. The sequence accuracy of the gRNA is also essential to achieve correct base-pairing between the gRNA and the target DNA. Most commonly, RNA polymerase III promoters are used to transcribe gRNAs. These are scarce and, more importantly, some of them have idiosyncratic requirements, e.g. the mammalian U6 promoter requires a G at the 5′ end of the transcript [28]. gRNA expression can be improved by inserting self-processing elements, such as the HDV ribozyme and tRNAs, at the 5′ or 3′ end to prevent potential degradation of the transcript [49, 83]. Processing elements can also be exploited to multiplex several gRNAs in a row by placing such an element between consecutive gRNAs [17] (Fig. 1e). Several systems, such as the type III CRISPR-Csy4 [25, 76] or natural CRISPR arrays [1, 14], have been shown to efficiently generate multiple gRNAs from a single transcript (Fig. 1e). Notably, Cpf1 belongs to the same CRISPR class II as Cas9, i.e. only a single crRNA-guided effector enzyme is required for cutting DNA, and it requires no tracrRNA. It differs from Cas9 by possessing a specific RNA processing domain that allows it to process the crRNA into multiple gRNAs [55, 69, 92, 101].
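The layout of such a multiplexed transcript can be sketched symbolically. In the snippet below, `SCAFFOLD` and `PROC` are placeholder labels, not real sequences, and the function names are hypothetical; the sketch only shows how a processing element placed between consecutive gRNA units lets a single transcript be split back into individual gRNAs.

```python
def build_array(spacers, scaffold="SCAFFOLD", element="PROC"):
    """Join spacer-scaffold units with a processing element between them."""
    units = ["%s-%s" % (spacer, scaffold) for spacer in spacers]
    return ("-%s-" % element).join(units)

def process_array(transcript, element="PROC"):
    """Mimic processing: cut the transcript at every processing element."""
    return transcript.split("-%s-" % element)

transcript = build_array(["spacer1", "spacer2", "spacer3"])
print(transcript)
print(process_array(transcript))
```

In a real construct, the placeholder would be replaced by the chosen element (for example a tRNA, a ribozyme, or an endoribonuclease recognition site), each of which has its own sequence and processing requirements.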

Finally, the gRNA scaffold can be extended to include stem-loops that recruit effector proteins, which has been shown to enhance transcriptional regulation [8, 44, 100] (Fig. 1d). With that strategy, Zalatan et al. were able to design gRNAs that recruit either activator or repressor elements, which ultimately achieved both repression and activation of specific gene targets at the same time [100] (Fig. 1d). This platform offers a considerable advantage over dCas9 fused to a regulator: the regulatory outcome is determined not by the single transcriptional regulator fused to dCas9 but by the stem-loop attached to each gRNA scaffold.

Another high-potential application area for the CRISPR technology is systematic genetic screening employing gRNA libraries. Due to the short length of gRNAs (~ 100 nt), accurate predictability, and easy cloning approaches, genome-wide gRNA libraries have been successfully designed to knock out and regulate genes throughout the entire genome [31]. For example, Shen et al. developed a systematic approach to map synthetic lethal genes by targeting all pairs of 73 cancer genes with dual guide RNAs in three different cancer cell lines (Fig. 2a). Their strategy involved nine gRNA pairs per gene combination. The library comprised 23,652 double-gene-knockout constructs, screened with two replicates in three cell lines, which led to a total of 141,912 interactions and to the discovery of 120 potential drug candidates [86] (Fig. 2a).
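The library size quoted above follows directly from the design parameters given in the text, as the short calculation below shows. The variable names are illustrative only.

```python
from math import comb

genes = 73               # cancer genes in the screen
grna_pairs_per_combo = 9 # gRNA pairs per gene combination
replicates = 2
cell_lines = 3

gene_pairs = comb(genes, 2)                      # all unordered gene pairs
constructs = gene_pairs * grna_pairs_per_combo   # double-knockout constructs
total = constructs * replicates * cell_lines     # interactions across the screen

print(gene_pairs)   # 2628
print(constructs)   # 23652
print(total)        # 141912
```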

Fig. 2

Examples of applications in drug discovery and synthetic biology. a Genome-wide pooled gRNA libraries targeting all pairs of 73 cancer genes with dual guide RNAs in three mammalian cell lines. b Example of logic circuits made with dCas9-Mxi1 NOR gates with a GFP signal used as output, similar to the study of Gander et al.
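The reason NOR gates are a convenient building block for layered circuits such as the one in panel b is that every other Boolean gate can be composed from them. The sketch below illustrates this in software. It is a purely logical abstraction, assuming that an input is "true" when the corresponding repressing gRNA is present; it does not model expression levels, leakiness, or delays, and the function names are hypothetical.

```python
def nor(*inputs):
    """Output is on only if none of the inputs is on."""
    return not any(inputs)

def not_gate(a):
    return nor(a)

def or_gate(a, b):
    # Two layers: a NOR followed by an inverting NOR.
    return nor(nor(a, b))

def and_gate(a, b):
    # Two layers: invert each input, then NOR the results.
    return nor(nor(a), nor(b))

# Print the truth table of the composed gates.
for a in (False, True):
    for b in (False, True):
        print(a, b, nor(a, b), or_gate(a, b), and_gate(a, b))
```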

Industrial applications through metabolic engineering

Genome engineering

There has been increasing interest in improving microbial cell factories through metabolic engineering approaches using CRISPR/Cas9 technology [42]. The efficiency and versatility offered by CRISPR tools have shown great potential in rewiring the metabolic network of host cells to enhance their production of metabolites used in various areas of industrial biotechnology, ranging from biofuels to chemical building blocks and pharmaceuticals (Table 1). Metabolic pathway optimization towards the product of interest commonly requires deletions of multiple genes, e.g. in competing metabolic pathways, which is traditionally performed through iterative cycles of genetic marker integration and removal [15]. In contrast, the CRISPR technology does not require integrative markers, and several efficient marker-free approaches have been developed to perform multiplexed genome editing, e.g. knockouts, point mutations [41, 93] and gene integration [45], which greatly reduced the time and effort required to perform targeted strain engineering. The CRISPR technology has also improved genetic engineering in difficult-to-engineer industrial organisms, such as food crops. Among several examples (Table 1), Li et al. reported site-specific gene replacement of the 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene in rice plants using a pair of gRNAs targeting introns, ultimately converting the crop into a glyphosate-resistant one (Fig. 3c) [62]. More recently, several studies have highlighted significant improvements in genome editing in plants using DNA-free CRISPR/Cas9 ribonucleoproteins [65, 98].

Table 1 Non-exhaustive list of studies applying CRISPR and CRISPRi for metabolic engineering purposes
Fig. 3

CRISPR applications in metabolic engineering. a CRISPR enabled trackable genome engineering (CREATE) strategy for optimal expression of 4 genes involved in the isopropanol biosynthetic pathway, through integration of DNA libraries composed of variant RBS sequences. b Multiple integration of the xylose and BDO pathways into retrotransposon sites in S. cerevisiae. Retrotransposons are composed of similar DNA sequences, which makes it possible to design a promiscuous gRNA able to target several of these sites at once. c Genetic engineering in rice using dual gRNAs targeting EPSPS introns for a double amino acid substitution [T102I + P106S (TIPS)]. d Systematic testing of enzyme perturbation sensitivities (STEPS) approach to iteratively find bottlenecks

Besides its multiplexing qualities, CRISPR has also shown great efficiency in integrating large pathways and libraries [38, 88]. For example, Shi et al. designed gRNAs to target multiple delta sites in the yeast genome, ultimately achieving 18-copy genomic integration of a 24 kb combined xylose utilization and (R,R)-2,3-butanediol (BDO) production pathway in a single step in S. cerevisiae [88]. DNA libraries, such as error-prone PCR-derived fragments or double-stranded fragments obtained from DNA synthesis companies, can be genomically integrated to find variants of a studied enzyme with enhanced catalytic activity or an optimal level of expression [64, 83]. Genomically integrated DNA libraries offer several advantages over plasmid-based strategies, especially in terms of expression stability [83]. Liang et al. used that strategy to integrate 640 ribosome binding sites (RBS) for five different enzymes involved in the production of isopropanol in E. coli [64] (Fig. 3a). After multiple rounds of screening for strains carrying the best RBS variants, i.e. those giving optimal expression of the gene and thus a higher isopropanol titer, a final strain producing 7.1 g L−1 of isopropanol was obtained [64].

dCas9—transcriptional regulation

Fine-tuning of biosynthetic pathways is a key step in the correct and efficient synthesis of a particular target compound (Nielsen and Keasling 2016). Traditional strategies have relied on a limited number of characterized promoters to control gene expression, i.e. strong, weak, and inducible promoters [51]. As mentioned above, besides its efficient endonuclease activity, CRISPR can enable gene expression modulation through the deactivated form of the Cas9 protein, dCas9 [59, 79]. Once bound at, or in the vicinity of, the transcriptional start site (TSS), the gRNA:dCas9 complex can significantly alter transcription by physically interfering with RNA polymerase binding [14, 43, 79]. Wu et al. recently exploited this strategy in E. coli, where they selectively knocked down the expression of enzymes that could divert carbon flux away from the production of 1,4-butanediol (1,4-BDO) [99]. They divided their study into two phases: (1) a heavy strain engineering approach through multiple genome edits such as gene knockouts, knock-ins, and point mutations, and (2) optimization through fine-tuning the expression of three genes competing with the production of 1,4-BDO. This strategy increased the 1,4-BDO titer by 100% from phase (1) to phase (2), resulting in a final titer of 1.8 g L−1 1,4-BDO (Table 1).

In addition, graded transcriptional patterns can be achieved depending on where the dCas9 complex binds in the promoter region, e.g. on the TSS for strong downregulation or farther from it for medium repression. Thus, optimal gene expression can be identified by targeting dCas9 to different positions on the studied promoter [16, 17, 44]. This feature is subject to several parameters, such as the distance to the TSS, the condition-dependent presence of transcription factors, and chromatin accessibility, but how to obtain precise regulation is not yet fully understood and most likely depends on the specific promoter [57, 91]. For example, Deaner et al. recently developed a graded expression platform that can be employed to systematically test enzyme perturbation sensitivities (STEPS) and helps identify potential flux-limiting enzymes in production pathways [16] (Fig. 3d; Table 1). Their strategy relied on targeting dCas9, carrying either a repressor or an activator domain, to different positions in the promoters of several genes and analyzing the effect on the final titer. For example, while optimizing glycerol production, of the seven genes tested, one gRNA targeting GPD1 with dCas9-VPR led to a significant titer increase, highlighting the importance of this gene in the overall production pathway. They then iteratively used STEPS to find a second bottleneck in GPP1, which ultimately led to a final titer of ~ 28 g L−1, a sevenfold increase compared with their original strain.
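The iterative logic of such a bottleneck search can be sketched as a greedy loop: in each round, every remaining gene is perturbed up or down, the perturbation giving the highest titer is kept, and the search stops when nothing improves further. This is a simplified reading of the idea rather than the published protocol. The function `steps_search`, the gene names, and the toy titer model `toy_titer` (flux limited by the slowest step) are all hypothetical and carry no experimental data.

```python
def steps_search(genes, measure_titer, max_rounds=3):
    """Greedy search for successive bottlenecks.

    measure_titer takes a list of (gene, direction) perturbations and
    returns a titer; in practice this would be an experiment.
    """
    applied = []
    best = measure_titer(applied)
    for _ in range(max_rounds):
        used = {gene for gene, _ in applied}
        candidates = [(g, d) for g in genes if g not in used
                      for d in ("up", "down")]
        if not candidates:
            break
        scored = [(measure_titer(applied + [c]), c) for c in candidates]
        top_titer, top = max(scored)
        if top_titer <= best:
            break  # no remaining perturbation improves the titer
        applied.append(top)
        best = top_titer
    return applied, best

def toy_titer(perturbations):
    """Hypothetical model: titer equals the capacity of the slowest step."""
    capacity = {"geneA": 1.0, "geneB": 2.0, "geneC": 5.0}
    for gene, direction in perturbations:
        capacity[gene] *= 3.0 if direction == "up" else 0.3
    return min(capacity.values())

print(steps_search(["geneA", "geneB", "geneC"], toy_titer))
```

In this toy model the first round identifies geneA as the limiting step, and the second round reveals geneB as the next bottleneck once geneA has been relieved.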

Patenting landscape

The patent landscape related to CRISPR/Cas9 technology is complex and constantly changing, with several main actors dominating the field [21]. These include one hospital, five universities, and one researcher, namely: Massachusetts General Hospital, Duke University, the Broad Institute (a joint Harvard and MIT entity), the University of California Berkeley, the University of Vienna, and Emmanuelle Charpentier. These entities have granted broad exclusive licenses to “surrogate” companies such as Caribou Biosciences (Berkeley, University of Vienna, Jennifer Doudna), CRISPR Therapeutics (E. Charpentier; therapeutic field) and ERS Genomics (E. Charpentier; all applications except human therapeutics). Additionally, several spin-out companies have been formed, e.g. Editas Medicine (Broad Institute, Duke University, Massachusetts General Hospital; human therapeutics) and Intellia Therapeutics (Caribou Biosciences; human therapeutics), which focus on their own R&D activities in human therapy and on specific out-licensing in certain areas. Notably, Editas Medicine, CRISPR Therapeutics and Intellia Therapeutics are publicly listed on the NASDAQ Stock Market.

Regarding the different commercialization areas of these patents, three main application fields have formed: (1) CRISPR/Cas9 used in medical applications with focus on human therapeutics and drug discovery, (2) research tool applications, cell line and animal models, and (3) agriculture and food applications (Fig. 4).

Fig. 4

CRISPR companies and licensing agreements. Bold lines represent non-exclusive licensing. Dashed lines represent exclusive licensing. In the middle, the four most important owners of CRISPR patents. In dark blue, companies applying CRISPR for health-related applications. In green, companies applying CRISPR in the crop industry and biotech industry. In black, companies developing tools, cell lines and animal models

In the area of human therapeutics, spin-outs originating from academic institutions and the initial inventors dominate the field, with a focus on R&D, licensing, and commercial partnering. There are specific exclusive licenses to companies in the field, e.g. for chimeric antigen receptor T-cell (CAR-T) therapy (Juno, Novartis and Cellectis) or the treatment of blood, eye and heart diseases (Casebia and Editas Medicine), as well as broader licenses for using CRISPR as a drug for human therapeutics (AstraZeneca, Amri, Oxford Genetics and Evotec) (Figs. 4, 5).

Fig. 5

Map of key CRISPR players

In the area of research tools, non-exclusive licenses, mostly coming from the Broad Institute and Caribou Biosciences, are most prominent. The applications range from licenses for general research tools, e.g. Clontech, Horizon, ATCC, GE-Healthcare, to specific licenses in the field of drug discovery, e.g. Evotec, Novartis, Regeneron, and applications in animal models, e.g. Taconic, Sage Labs, The Jackson Laboratory, and Knudra (Fig. 4).

In the area of agricultural and food applications, larger industry players, such as DowDuPont, control the field with regard to patent holding and licensing. Their strategy included (1) the acquisition in 2011 of Danisco, an agricultural/food ingredient company that made crucial progress in understanding the CRISPR mechanism and the role of Cas9 [2], (2) agreements with Virginijus Siksnys from Vilnius University, one of the founders of CRISPR technology [32, 84], and (3) exclusive cross-licenses from Caribou Biosciences and ERS Genomics specific to the agricultural field. In addition, Monsanto/Bayer Crop Science recently acquired a non-exclusive license from the Broad Institute for sole use in the agricultural sector. Another key player in the field of crop engineering is Calyxt, which acquired exclusive worldwide rights for CRISPR/Cas9 utilization in plants from the University of Minnesota, highlighting the complexity emerging from these patents and the different licensing structures in the field.

In the area of industrial biotechnology, CRISPR licenses have so far been obtained in only a small number of cases, such as Evolva, which acquired a license from ERS Genomics for yeast and fungal engineering for the biotechnological production of chemicals.

Because of the ongoing patent dispute between the Broad Institute and UC Berkeley/Charpentier, the licensing situation remains opaque. Some of the Broad patents were granted at the beginning of 2017, while the UC Berkeley/E. Charpentier patents are still pending. A request for interference filed by UC Berkeley was turned down in the first round, but the case has now gone to a second round with an appeal of the original decision. The uncertainty created by this “battle” has prompted several companies, e.g. Horizon, DowDuPont, Sage Labs, to acquire licenses from several of the main patent owners in order to secure access, in some cases exclusive, to the technology in a given field.

A main area in CRISPR-based drug development is its use in cancer immunotherapy to reprogram enhanced CAR-T receptors for selectively targeting cancer cells [81]. The genetic modifications are done in vitro, making this approach a potential low-hanging fruit for successful approval of CRISPR-based medical therapies. A major milestone was recently achieved with two CAR-T-based treatments approved by the FDA [73, 74]. Large companies and several startups have acquired exclusive licenses from different CRISPR IP holders in the field, e.g. Novartis with Intellia Therapeutics, and Juno with Editas Medicine (Fig. 4).

Across all the patents and patent applications in the field, there are over 90 granted patents and 1300 filed patents, ranging from CRISPR/Cas9 components to delivery systems and applications [21]. Some of the main actors have started to create patent pools to simplify the licensing process for commercial users. As such, agreements were made between CRISPR Therapeutics, Intellia Therapeutics, Caribou Biosciences and ERS Genomics to maintain and coordinate prosecution of particular patent families. On the competing side, a similar alliance has been formed between the Broad Institute, Rockefeller University, Harvard University, and MIT, through the intermediary of the firm MPEG LA, LLC (Sheridan [87]).

Another strategy followed by certain entities in the field is to diversify their IP portfolio in order to have priority for follow-on refinements of previous patent applications. For example, Zhang and colleagues from the Broad Institute have discovered and filed for patent protection on Cpf1, a robust alternative to Cas9 [21, 101].

Conclusion and discussion

Only recently discovered, CRISPR/Cas9 technology has already been enhanced to the point of fulfilling most current genome editing and gene regulation demands, ranging from multiple gene insertions, gene knockouts, and combinatorial libraries to advanced fine-tuning of biosynthetic pathways [23, 39, 47, 95]. However, off-targeting remains an important limitation of the technology, with several studies pointing out unwanted cuts due to the gRNA binding elsewhere than the intended target region [27, 85]. This phenomenon is known to be accentuated in regions with sequences similar to the target sequence, e.g. paralogous genes or retrotransposon regions [24]. So far, this limitation severely hinders the technology from entering advanced clinical phases. Screening every engineered cell for off-target effects after each genetic manipulation poses long-term viability issues for the technology. Consequently, other technologies with a proven track record, such as TALENs, so far offer a safer solution for gene editing therapies. However, while CRISPR struggles as a standalone therapy, several efforts to minimize off-target cleavage have been reported. Recently, the development of an improved Cas9 variant with enhanced proofreading capacity has greatly reduced off-target effects while maintaining high cutting efficiency [9]. Additionally, proteins able to inactivate Cas proteins, named anti-CRISPR proteins, have been reported to significantly reduce off-target edits [89].
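Why similar sequences are a concern can be illustrated with a naive scan that reports every window of a sequence differing from the spacer by at most a given number of mismatches. This is a minimal sketch, not an off-target predictor: the function names and the example sequences are hypothetical, and the scan ignores the PAM requirement, the reverse strand, and position-dependent mismatch tolerance.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def near_matches(spacer, dna, max_mismatches=3):
    """Return (position, window, mismatches) for all near-identical windows."""
    spacer, dna = spacer.upper(), dna.upper()
    n = len(spacer)
    hits = []
    for i in range(len(dna) - n + 1):
        window = dna[i:i + n]
        distance = hamming(spacer, window)
        if distance <= max_mismatches:
            hits.append((i, window, distance))
    return hits

# Hypothetical short example: one perfect site and one site with a single mismatch.
dna = "TT" + "ACGTACGT" + "TT" + "ACGAACGT" + "TT"
print(near_matches("ACGTACGT", dna, max_mismatches=1))
```

A sequence family such as paralogs or retrotransposons would produce many hits of this kind, each a potential unintended cut site.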

In the field of crop engineering, CRISPR techniques are currently having a major impact, facilitating cheaper, faster, and more precise engineering in comparison to laborious and time-consuming traditional methods [3, 82]. However, it has yet to be determined whether CRISPR based gene editing of crops will be regulated the same way traditional genetically engineered crops are, which ultimately will settle its commercial value within this sector.

Currently, the industrial biotechnology field using metabolically engineered microbial cell factories is progressively shifting from studies with few genetic modifications to highly engineered strains. CRISPR has become a near-commodity in the field as a result of the available panoply of engineering tools for these microbial cell factories, as well as the complex tasks these tools can perform. While most CRISPR proofs of concept have been carried out in well-characterized industrial strains, successful CRISPR/Cas9-mediated genome edits are being reported in more and more complex organisms. A particular example concerns secondary metabolites, which are often derived from non-model organisms whose biosynthetic pathways are poorly characterized and which are difficult to genetically engineer with traditional tools. In this case, one could either use CRISPR technology to integrate the large pathway into a well-characterized organism, or directly engineer the native host to further enhance product formation or elucidate its idiosyncrasies [52, 75].