Introduction

CRISPR-Cas9 technology has allowed public institutions and private companies to accelerate the development of crops that are more tolerant to diseases and environmental stresses, or that have enhanced nutritional value, in order to face global issues that threaten agriculture and human life itself. Some genome-edited plants, like the Calyxt soybean with higher oleic acid content, are already cultivated and commercialized in the USA under conventional regulation [1, 2]. In fact, since 2016, the US Department of Agriculture (USDA Animal and Plant Health Inspection Service, APHIS) has exempted more than 35 gene-edited products from GMO legislation. Moreover, in May 2020, USDA issued the SECURE rule, a revision of its biotechnology regulation, according to which crops whose genetic changes could have been produced with conventional breeding are exempted from GMO legislation. Argentina, Brazil, Chile, Colombia, Honduras, Israel, Japan, and Australia have also decided to exempt edited plant varieties from GMO regulation, as long as no residual exogenous DNA remains integrated into the plant genome [3]. Conversely, the European Union takes a more restrictive position on genome editing and considers organisms obtained with this technology to be GMOs even if they do not contain additional exogenous sequences in the genome [3]. For vegetatively propagated tree species, applying systems that either avoid the use of foreign genes or remove them when no longer needed is not an easy task. On the one hand, the use of protoplasts with ribonucleoproteins (RNPs), which avoids the introduction of exogenous DNA, suffers from the scarce availability of protocols for plant regeneration from single cells. On the other hand, classical Agrobacterium-mediated gene transfer leads to T-DNA integration in the plant genome, and the strategies adopted so far have failed to remove it entirely [4]. 
However, although research focused on obtaining transgene-free edited fruit trees is ongoing [5,6,7], in Europe these crops are still considered GMOs, regardless of the worldwide legislative landscape. The legal framework of the European Union (Directive 2001/18/EC; Regulation (EC) No 1829/2003; Regulation (EC) No 1830/2003) requires event-specific methods for the detection and quantification of GMOs prior to their authorization and placing on the market. Several techniques are now available thanks to major progress in the field of molecular diagnostics. GMO analytical control is currently implemented with DNA-based methods, among which real-time or quantitative PCR (qPCR) is the most widely used for quantitative purposes because of its high sensitivity and robustness, even on complex and processed food matrices [8]. Many qPCR-based assays have been developed for the quantification of the transgene copy number (CN) in fruit crops [9,10,11,12,13]. However, qPCR can be affected by many factors, including the presence of inhibitors in the sample matrix and the reliability of the calibration system [14]. An alternative method is digital PCR [15, 16], in particular droplet digital PCR (ddPCR), a technique that relies on partitioning the sample into several thousands or even millions of individual droplets and uses a flow-cytometry-like system to count positive PCR reactions [17, 18]. Unlike qPCR, ddPCR does not require calibration or highly efficient amplification, and it estimates the number of target copies per reaction using Poisson statistics. In addition, methods based on next-generation sequencing (NGS) have recently been used to detect exogenous genes in transgenic plants [19,20,21]. These were semi-quantitative techniques developed mainly for the safety assessment of GMO crops or to screen mutant collections in the framework of forward genetic studies.

In this work, qPCR, ddPCR, and NGS methods were assessed for their performance in quantifying T-DNA integration copies, and the pros and cons of each technique are discussed. To operate under the best experimental conditions, we chose fresh plant tissue as testing material. In particular, we analyzed ten grapevine transgenic lines, which were transformed with Agrobacterium tumefaciens carrying a binary vector for genome editing. Three binary vectors were used for the transformation experiments, each carrying the CRISPR/SpCas9 system with a different sgRNA to target putative disease-susceptibility genes. One sgRNA was directed against VviMLO7 (VIT_13s0019g04060), needed for susceptibility to powdery mildew, caused by the fungus Erysiphe necator [22], while the other two sgRNAs were directed against two different target sites (ts-1 and ts-2) of VviDMR6 (VIT_16s0098g00860), a gene likely involved in the onset of downy mildew triggered by the oomycete Plasmopara viticola [23]. In addition, the expression of SpCas9 and sgRNA and the mutation profile in the genomic target sites were evaluated to correlate the CN information with the activity of the sgRNA/Cas9 complex. Besides providing a technical discussion of the most advanced diagnostic techniques currently available for CN quantification and food control, this study aims primarily to provide developers with tools for selecting the proper transgenic material at an early stage, to be propagated and maintained for research or commercial purposes.

Materials and methods

Plant material and binary vector for gene transfer

Gene transfer was performed via co-cultivation of embryogenic calli from the table grape varieties ‘Crimson seedless’ (CS) and ‘Sugraone’ (S) with the A. tumefaciens strain EHA105 pCH32 containing the binary vector for expression of 35S::Cas9 and Ubi::sgRNA from Addgene (www.addgene.org). In particular, three binary vectors were used, which differ in the 20-mer sgRNA present in the T-DNA cassette: (1) MLO7-sgRNA = ACTTGAAGAGCGTAGTTTGG; (2) DMR6-ts1-sgRNA = GCCGATGCTTGCAGGCTCTA and (3) DMR6-ts2-sgRNA = GTCCTTGCCGAGGTCGATTA. Actively growing Agrobacterium cultures, previously induced with 100 µM acetosyringone (AS) for 3 h, were re-suspended in liquid GS1CA [24] with AS to an OD600 = 0.3–0.45, and then mixed with about 5 g of embryogenic callus in a volume of 20 mL. The co-culture was shaken for 10 min at 25 °C (60 rpm), pelleted, blotted on sterile Whatman paper, transferred to GS1CA solid medium, and then incubated at 25 °C in the dark for 48 h. Embryogenic callus was then washed in liquid GS1CA supplemented with 1 g/L Timentin, spun and blotted on paper, transferred to solid GS1CA medium supplemented with 1 g/L Timentin, and maintained at 25 °C in the dark for 4 weeks. Thereafter, the callus was subcultured monthly on GS1CA medium supplemented with 1 g/L Timentin and 150 mg/L Kanamycin in the dark at 25 °C for 8 months. Developed embryos (torpedo stage) were transferred to NN medium [25] supplemented with 25 mg/L Kanamycin, in the light (16 h photoperiod), to induce embryo differentiation and germination. Transgenic shoots were transferred to WP medium [26] and subcultured every two months. ‘CS’ lines 1, 2, 4, 5, 6 were edited to knock out the VviMLO7 gene, while ‘S’ lines 3, 7, 8, 9, 10 were edited to knock out the VviDMR6 gene; in particular, the sgRNA was directed against target site 1 (ts-1) in lines 3, 7, 9, 10 and against target site 2 (ts-2) in line 8. For each transgenic line, two in vitro biological replicates (A and B) were analyzed. 
For each sample, DNA was extracted from 3 leaves of a 6 cm-tall plantlet using Nucleospin Plant II (Macherey–Nagel, Düren, Germany). DNA concentrations were measured with the Picogreen dye (Thermo Scientific) in the NanoDrop 3300 Fluorospectrometer (Thermo Scientific) following the manufacturer's protocol “PicoGreen Assay for dsDNA”.

Real-time PCR quantification

The real-time PCR quantification of nptII or SpCas9 CNs in grapevine lines was carried out according to the method developed by [10], following the scheme reported in Online Resource 1. Reactions were performed in a 96-well plate on a C1000 thermal cycler (Bio-Rad, Hercules, USA) equipped with a CFX96 real-time PCR detection system (Bio-Rad, Hercules, USA). The real-time PCR singleplex reaction was carried out in a 10 µl final volume containing 1 × SsoAdvanced Universal Probes Supermix (Bio-Rad, Hercules, USA), 40 ng of genomic DNA, 0.3 µM primers (Sigma, Haverhill, UK) and a 0.2 µM specific Taqman probe (Sigma, Haverhill, UK). The thermal protocol was as follows: polymerase activation for 3 min at 95 °C followed by 40 cycles of denaturation of 10 s at 95 °C, annealing of 5 s at 58 °C and 5 s at 60 °C and an elongation of 30 s at 72 °C. Primers and Taqman probes used to amplify the grapevine endogenous gene Chi and the exogenous genes nptII and Cas9 are reported in Online Resource 2. The standard curves (four points, starting from 10^6 plasmid molecules with 1:5 serial dilutions) were built with the plasmid pGEM-T Easy (Promega, Madison, Wisconsin, USA) containing specific fragments of the genes to be quantified (VviChi, nptII and SpCas9). For each sample, the transgene (nptII or SpCas9) CN was calculated using the following formula: (transgene total copies/endogenous gene total copies) × 2. The total copies of the transgene and of the endogenous gene were calculated on the basis of the mean values of the quantification cycles (Cq) of two technical replicates.
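
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis script: the function names, the synthetic Cq values of the standard curve and the sample Cq values are hypothetical, and a simple least-squares fit of Cq against log10(copies) is assumed. The factor of 2 is taken directly from the formula in the text.

```python
import math

def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to get total copies from a Cq value."""
    return 10 ** ((cq - intercept) / slope)

def transgene_copy_number(transgene_copies, endogenous_copies):
    """(transgene total copies / endogenous gene total copies) x 2."""
    return transgene_copies / endogenous_copies * 2

# Four-point curve: 10^6 plasmid molecules, then 1:5 serial dilutions.
standard_copies = [1e6, 2e5, 4e4, 8e3]
# Hypothetical Cq values generated from an ideal curve (slope -3.32).
standard_cq = [-3.32 * math.log10(c) + 38.0 for c in standard_copies]
slope, intercept = fit_standard_curve(standard_copies, standard_cq)

# Hypothetical Cq pairs: mean of two technical replicates per assay.
transgene_cq = (24.10 + 24.30) / 2
endogenous_cq = (23.15 + 23.25) / 2
cn = transgene_copy_number(
    copies_from_cq(transgene_cq, slope, intercept),
    copies_from_cq(endogenous_cq, slope, intercept),
)
print(f"estimated CN: {cn:.2f}")
```

In practice each gene (VviChi, nptII, SpCas9) would have its own standard curve; a single curve is reused here only to keep the sketch short.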

ddPCR quantification

The same DNA samples analyzed by real-time PCR were tested in droplet digital PCR (ddPCR). Each sample (10 lines, 2 biological replicates per line) was analyzed as detailed in Online Resource 1. The 2 technical replicates in a single plate were treated individually, since ddPCR is independent of calibration. The reaction volume was 20 μL containing 1× ddPCR Supermix for probes (No dUTP) (Bio-Rad, Pleasanton, CA, USA), 300 nM of primers, 200 nM of probes and 6 ng of template DNA, for both the VvChi and Cas9 assays. Droplet generation was carried out in DG8 cartridges (Bio-Rad) loaded on a QX100 droplet generator (Bio-Rad Laboratories, Inc.). Droplets were then transferred to 96-well plates and amplified in a GeneAmp PCR System 9700 thermocycler (Thermo Fisher Scientific, Waltham, MA, USA) with the following thermal profile: 10 min DNA polymerase activation at 95 °C, 45 cycles of a two-step thermal profile of 30 s at 94 °C for denaturation and 60 s at 55 °C for annealing and extension, droplet stabilization at 98 °C for 10 min, followed by an infinite hold at 4 °C. After thermal cycling, the 96-well plates were transferred to a QX100 droplet reader (Bio-Rad Laboratories, Inc.) and data were analyzed with QuantaSoft 1.7.4 software (Bio-Rad Laboratories, Inc.). The results obtained from each replicate had to fulfil the following acceptability criteria to be included in the subsequent elaborations: a single peak of amplitude for the positive droplets; a number of accepted droplets above 10,000 per well; an amount of “rain” below 2.5% per well.
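
To make the Poisson-based quantification and the acceptability criteria concrete, the following Python sketch shows one way they could be expressed. It is an illustration only: the analysis in this work was done with QuantaSoft, the function names and droplet counts are hypothetical, and the use of the same "ratio × 2" formula as for qPCR is an assumption.

```python
import math

def copies_per_droplet(positive, accepted):
    """Poisson-corrected mean number of target copies per droplet."""
    if accepted <= 0 or not 0 <= positive < accepted:
        raise ValueError("need 0 <= positive < accepted droplets")
    return -math.log(1 - positive / accepted)

def ddpcr_copy_number(target_pos, target_accepted, ref_pos, ref_accepted):
    """Transgene CN as 2 x (target copies / reference copies).

    The factor 2 mirrors the qPCR formula; applying it here is an assumption.
    """
    target = copies_per_droplet(target_pos, target_accepted)
    reference = copies_per_droplet(ref_pos, ref_accepted)
    return 2 * target / reference

def well_is_acceptable(accepted_droplets, rain_fraction, single_positive_peak):
    """Criteria from the text: one positive peak, >10,000 droplets, rain <2.5%."""
    return (
        single_positive_peak
        and accepted_droplets > 10_000
        and rain_fraction < 0.025
    )

# Hypothetical droplet counts for a Cas9 well and a VvChi well.
if well_is_acceptable(13_500, 0.012, True):
    print(round(ddpcr_copy_number(1_400, 13_500, 2_700, 13_400), 2))
```

The logarithmic correction accounts for droplets that contain more than one target molecule, which is why no standard curve is needed.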

NGS method for the identification of T-DNA integration site and bioinformatics pipeline

The NGS method was performed according to [4]. Genomic DNA (1 µg) was reduced to fragments ranging between 200 and 1000 bp using BIORUPTOR NextGen (Diagenode, Seraing, Belgium) with three cycles of 30 s at low intensity. The DNA fragmentation profile was checked on Tapestation 2200 (Agilent, Santa Clara, CA, USA) using D1000 or Genomic DNA ScreenTape. Fragmented DNA was purified with AMPure XP beads (Beckman, Brea, CA, USA) at 1.8 × ratio, treated with NEBNext End Repair Module E6050S (New England Biolabs, Ipswich, MA, USA) and again purified with 1.8 × AMPure XP beads. Purified end-repaired fragments were quantified with D1000 ScreenTape on Tapestation. Genomic blunt fragments were ligated to the Adaptors of the Universal GenomeWalker 2.0 kit (Takara Bio, Kusatsu, Japan) using T4 Ligase (Thermo Scientific, Waltham, MA, USA), and following the manufacturer’s instructions. A PCR was performed in a 20 µl final volume containing 1 × PCR BIO (Resnova, Rome, Italy), 0.25 µM of the primers ADAP_ill and PNOS_ill (see Online Resource 2) and 20 ng genomic DNA. PCR products were purified with 0.8 × AMPure XP Beads to remove fragments smaller than 200 bp, primers and primer dimers. The library was sequenced by Illumina MiSeq (PE300) in house, at the FEM Sequencing Platform Facility (San Michele all’Adige, Italy). Approximately 100,000 reads per sample were produced. Two datasets of raw sequencing reads (fastQ files for both ends) were analyzed using VSEARCH 2.13.4 and BLAST 2.9.0 (https://blast.ncbi.nlm.nih.gov/Blast.cgi) software. Reads of Dataset 1, amplified with the primer ADAP_ill, were trimmed of 48 bp to remove the GenomeWalker adaptor sequence and then merged with the reads of dataset 2, amplified with the primer PNOS_ill (minimum overlapping = 50 bp); merged sequences were then clustered using an identity threshold (ID) minimum of 0.90. 
To identify exogenous sequences, clusters were mapped to the T-DNA vector sequence using BLAST and filtered for an alignment length > 50 bp and an e value (random background noise) below 0.01. Filtered sequences were then mapped against the reference genome, and hits with fewer than 10 mismatches and an e value above 10^-6 were selected. According to the BLAST output, the T-DNA integration points were identified and confirmed by PCR amplification of the junction regions between genomic DNA and T-DNA. PCR was performed in a 20 µl final volume containing 1 × PCR BIO (Resnova, Rome, Italy), 50 ng of genomic DNA and 0.5 µM of the primers reported in Online Resource 2. Amplification products were checked on agarose gel, purified using PureLink Quick Gel Extraction (Invitrogen, Carlsbad, CA, USA) or PCR Purification Combo Kit (Thermo Scientific, Waltham, MA, USA) and sequenced by Sanger sequencing (FEM Sequencing Platform Facility). Sequencing outputs were analyzed with the BLAST online tool (blast.ncbi.nlm.nih.gov).
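
As an illustration of the first filtering step (clusters mapped to the T-DNA vector), the sketch below parses BLAST tabular output and keeps hits with an alignment length above 50 bp and an e value below 0.01. It assumes that BLAST was run with the standard 12-column tabular format (-outfmt 6), which is not stated in the text; the function names and the example hits are hypothetical.

```python
from typing import Iterable, List, NamedTuple

class BlastHit(NamedTuple):
    query: str
    subject: str
    length: int
    mismatches: int
    evalue: float

def parse_outfmt6(lines: Iterable[str]) -> List[BlastHit]:
    """Parse standard 12-column tabular BLAST output.

    Columns: qseqid sseqid pident length mismatch gapopen
             qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        hits.append(
            BlastHit(
                query=fields[0],
                subject=fields[1],
                length=int(fields[3]),
                mismatches=int(fields[4]),
                evalue=float(fields[10]),
            )
        )
    return hits

def vector_hits(hits, min_length=50, max_evalue=0.01):
    """Keep hits with alignment length > min_length and e value < max_evalue."""
    return [h for h in hits if h.length > min_length and h.evalue < max_evalue]

# Hypothetical cluster-vs-vector hits.
example = [
    "cluster1\tvector\t99.0\t120\t1\t0\t1\t120\t500\t619\t1e-50\t210",
    "cluster2\tvector\t95.0\t40\t2\t0\t1\t40\t10\t49\t1e-10\t60",
    "cluster3\tvector\t90.0\t60\t6\t0\t1\t60\t10\t69\t0.5\t20",
]
for hit in vector_hits(parse_outfmt6(example)):
    print(hit.query, hit.length, hit.evalue)
```

The same parser could feed the second step (mapping against the reference genome) by filtering on the `mismatches` and `evalue` fields with the thresholds given in the text.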

Gene expression analysis

For each line, the two in vitro biological replicates used for DNA measurements were micro-propagated (for a cycle of 1 month), and three leaves were collected from the respective daughter plants and used for RNA extraction. Total RNA was isolated from grapevine leaves using the Spectrum Plant Total RNA Kit (Sigma–Aldrich, St. Louis, MO, USA). RNA was quantified with the spectrophotometer NanoDrop ND-8000 (NanoDrop Technologies, Wilmington, DE, USA) and by gel electrophoresis. Following DNase treatment, 1 µg of RNA was reverse-transcribed into cDNA with the SuperScript III Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA) using random primers according to the manufacturer’s instructions. The real-time PCR was carried out on the CFX96 instrument (Bio-Rad, Hercules, CA, USA) in a 12.5 µl volume containing SsoAdvanced Universal SYBR Green Supermix (Bio-Rad, Hercules, CA, USA), 0.5 µM primers (Online Resource 2) and 1 µl of diluted cDNA (1:10). An initial denaturation step at 95 °C for 5 min was followed by 40 cycles at 95 °C for 10 s and 60 °C for 30 s. Finally, to detect non-specific amplification in cDNA samples, a melting curve analysis was performed as follows: 95 °C for 10 s, 65 °C for 5 s and a stepwise temperature increase (0.5 °C/s) up to 95 °C with continuous detection. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) and Actin (Online Resource 2) were used as housekeeping genes. For each biological replicate of each plant line, three technical replicates were run in a single plate and data were elaborated using the software Bio-Rad CFX Manager 3.0.
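
The text does not detail how CFX Manager normalized the expression data. As a hedged illustration of normalization against two housekeeping genes, the sketch below uses the common 2^-ΔCq approach, assuming 100% amplification efficiency for all assays; the function name and the Cq values are hypothetical.

```python
from statistics import mean

def relative_expression(target_cq, reference_cqs):
    """Relative expression as 2^-(Cq_target - Cq_reference).

    target_cq: Cq values of the technical replicates of the target gene.
    reference_cqs: one list of technical-replicate Cq values per
    housekeeping gene. Averaging reference Cq values is equivalent to
    taking the geometric mean of the reference quantities.
    Assumes 100% amplification efficiency (an assumption of this sketch).
    """
    reference_cq = mean(mean(cqs) for cqs in reference_cqs)
    delta_cq = mean(target_cq) - reference_cq
    return 2 ** (-delta_cq)

# Hypothetical Cq values: SpCas9 against GAPDH and Actin.
spcas9 = [24.1, 24.3, 24.2]
gapdh = [20.0, 20.1, 19.9]
actin = [21.0, 21.2, 21.1]
print(round(relative_expression(spcas9, [gapdh, actin]), 4))
```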

Editing evaluation on the CRISPR/SpCas9 target sites

Regions of the VviMLO7 and VviDMR6 genes containing the target sites were amplified with the primers reported in Online Resource 2. PCR was carried out in 20 µl final volume containing 1 × PCR BIO (Resnova, Rome, Italy), 0.4 µM of each primer (both elongated with overhang Illumina adapters) and 50 ng of genomic DNA. The Illumina library was sequenced on an Illumina MiSeq (PE300) platform at the FEM Sequencing Platform Facility (San Michele all’Adige, Italy). Raw paired-end reads were processed by the CRISPResso pipeline (http://crispresso.rocks/) with default parameters.

Statistics

Statistical analyses were performed using R software (R Core Team 2020).

Results

Transgene copy number quantification: qPCR vs ddPCR

Edited grapevine plantlets were regenerated from single somatic embryos of cv ‘Sugraone’ and cv ‘Crimson seedless’ after Agrobacterium-mediated transformation of embryogenic callus. SpCas9 CN was quantified in two biological replicates for each line by qPCR and ddPCR. In addition, the selection marker gene nptII was quantified by qPCR (Fig. 1). As shown by statistical analysis (Table 1), the CN values of SpCas9 and nptII measured with qPCR were not significantly different. Conversely, significant differences were found when comparing the CN data obtained with the two techniques. As expected, these discrepancies were more pronounced when the comparison was made across different genes (SpCas9_ddPCR vs nptII_qPCR) rather than within the same gene (SpCas9_ddPCR vs SpCas9_qPCR) (Table 1). Because the amplification performance of qPCR is more affected than that of ddPCR by single nucleotide polymorphisms in the endogenous gene VvChi (e.g. somatic mutations caused by in vitro culturing of embryogenic callus cells), we checked the sequence of the amplicons in the edited lines. As reported in Online Resource 3, no differences were found in the sequence of VvChi recognized by the primer and probe set used for qPCR and ddPCR. In general, mean CN values ranged between one and two copies, with values higher than 2 in three lines (lines 3, 4, 7). For those lines, high standard errors were calculated, mainly associated with the SpCas9 CN measured by qPCR (Fig. 1). Tandem or inverted-repeat insertions of the entire T-DNA or of its fragments have frequently been reported during T-DNA integration in the plant genome [4, 27]. In this scenario, ddPCR, unlike qPCR, would probably not distinguish tandem/inverted multiple copies from a single copy. This possibility was further investigated in line 3, for which the SpCas9 CN calculated by ddPCR differed strongly from the qPCR output. 
Both biological replicates of line 3 were subjected to genomic DNA digestion with the BsmBI restriction enzyme, which cuts twice between nptII and SpCas9 (Online Resource 4), thereby separating putative multiple copies in tandem. On the digested gDNA (dDNA), ddPCR measured a SpCas9 CN nearly double the value measured on non-digested gDNA (NdDNA). Mean SpCas9 CN in NdDNA was 3.47 in replicate 3A and 3.59 in replicate 3B, while mean SpCas9 CN in dDNA was 6.06 in replicate 3A and 5.49 in replicate 3B. The latter values are not statistically different from the SpCas9 CN mean values estimated by qPCR (i.e. 6.55 in replicate 3A and 4.79 in replicate 3B).
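
The effect of digestion on line 3 can be checked with a short calculation using only the CN values reported above; the variable names are ours.

```python
# SpCas9 CN values for line 3 as reported in the text.
undigested = {"3A": 3.47, "3B": 3.59}  # ddPCR on non-digested gDNA (NdDNA)
digested = {"3A": 6.06, "3B": 5.49}    # ddPCR on BsmBI-digested gDNA (dDNA)
qpcr = {"3A": 6.55, "3B": 4.79}        # qPCR estimates

# Fold change caused by digestion, and remaining distance from qPCR.
fold_change = {rep: digested[rep] / undigested[rep] for rep in undigested}
gap_to_qpcr = {rep: abs(digested[rep] - qpcr[rep]) for rep in digested}

for rep in sorted(fold_change):
    print(
        f"{rep}: dDNA/NdDNA = {fold_change[rep]:.2f}, "
        f"|dDNA - qPCR| = {gap_to_qpcr[rep]:.2f}"
    )
```

The ratios come out at about 1.75 (3A) and 1.53 (3B), and the digested ddPCR values lie within 0.49 and 0.70 copies of the corresponding qPCR estimates.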

Fig. 1

SpCas9 and nptII CN measured with real-time PCR (qPCR) and droplet digital PCR (ddPCR) in 10 grapevine edited lines (L1-L10). qPCR CN values are the average of six measures calculated by analyzing two biological replicates in three independent PCR sessions. ddPCR CN values are the average of 12 values calculated by analyzing two biological replicates in three independent PCR sessions. Bars represent standard deviation of the mean of the two biological replicates

Table 1 Comparison of ddPCR and qPCR CN data by non-parametric analysis of variance (Kruskal–Wallis test) followed by a pairwise comparison (Wilcoxon test)

Precision of qPCR and ddPCR data

To compare the precision of qPCR and ddPCR, the standard deviations (SD) associated with the mean CN values calculated for the two replicates of each line were plotted (Fig. 2). The SDs of qPCR data were substantially more scattered than those of ddPCR, highlighting the higher precision of ddPCR compared to qPCR. In addition, a bootstrap test on SD values was performed to evaluate precision statistically. The bootstrap output shown in Table 2 indicates that the 95% confidence intervals of ddPCR and qPCR did not overlap for lines 1 through 6, showing that the standard deviations of the two methods were significantly different and that the precision of ddPCR was significantly higher than that of qPCR. For lines 7 to 10, a small degree of overlap was found, which reduced the significance of the differences.
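
A percentile bootstrap of the kind summarized in Table 2 could be sketched as follows. The resampled statistic (the mean SD), the number of resamples, the seed and the SD values are all assumptions made for illustration; the actual analysis was carried out in R, and its exact settings are not given in the text.

```python
import random

def bootstrap_ci(values, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (2.5%-97.5% by default) of the mean."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical SD values for one line (not data from this study).
qpcr_sd = [0.30, 0.45, 0.38, 0.52, 0.41, 0.36]
ddpcr_sd = [0.05, 0.08, 0.06, 0.07, 0.04, 0.09]

ci_qpcr = bootstrap_ci(qpcr_sd)
ci_ddpcr = bootstrap_ci(ddpcr_sd)
print("qPCR CI:", ci_qpcr)
print("ddPCR CI:", ci_ddpcr)
print("overlap:", intervals_overlap(ci_qpcr, ci_ddpcr))
```

Non-overlapping intervals, as for lines 1 through 6, are read as a significant difference in precision between the two methods.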

Fig. 2

Distribution of the standard deviations (SDs) associated with the mean CN values. For each replicate (a or b) of each transgenic line, mean values and SD were calculated on three measurements obtained in three independent qPCR sessions and on six measurements obtained in six independent ddPCR sessions

Table 2 Bootstrap estimation of 95% confidence interval (CI 2.5%–97.5%) applied to the dataset of standard deviations derived by qPCR and ddPCR CN quantifications (see Fig. 2)

T-DNA integration point

T-DNA integration points were assessed in the ten grapevine lines with a method based on high-throughput sequencing [4] and are described in Fig. 3. The genomic position of the T-DNA insertion was identified in 7 lines (lines 1, 2, 3, 4, 5, 9, 10), and in the case of line 9 two integration points were detected. In all cases, integration occurred in intergenic regions. In addition, T-DNA rearrangements were detected. A tandem repeat T-DNA insertion with an LB-head preceded by an RB-tail was observed in lines 3, 7 and 10. Moreover, other unexpected DNA sequences were found upstream of the LB region: a fragment of the binary vector backbone in lines 6 and 8 and a fragment of the SpCas9 gene in lines 3 and 8. In line 3, a truncated copy of SpCas9 was found upstream of the T-DNA LB border, suggesting the probable loss of a substantial region of the T-DNA cassette containing the nptII gene. This result may explain the quite divergent CNs of SpCas9 (i.e. 6.55 and 4.79 for replicates A and B, respectively) and nptII (i.e. 4.09 and 3.48 for replicates A and B, respectively) obtained by qPCR. Another interesting result was the detection of an identical genomic integration point in lines 1, 2 and 5 (Chr. 6, position 6,517,569). This outcome proved that the three plants, which were believed to derive from independent transformation events, were actually clones originating from the same transformation event.
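
The clone detection described above amounts to grouping lines by integration point. A minimal Python sketch is given below; only the shared position of lines 1, 2 and 5 (Chr. 6, position 6,517,569) comes from the text, while the function name and the coordinates of the other lines are hypothetical placeholders.

```python
from collections import defaultdict

def shared_integration_points(integration_points):
    """Map each integration point found in more than one line to those lines.

    integration_points: {line: [(chromosome, position), ...]}
    """
    lines_by_point = defaultdict(list)
    for line, points in integration_points.items():
        for point in points:
            lines_by_point[point].append(line)
    return {
        point: sorted(lines)
        for point, lines in lines_by_point.items()
        if len(lines) > 1
    }

integration_points = {
    1: [("chr6", 6517569)],             # reported in the text
    2: [("chr6", 6517569)],             # reported in the text
    5: [("chr6", 6517569)],             # reported in the text
    4: [("chrA", 100)],                 # hypothetical placeholder
    9: [("chrB", 200), ("chrC", 300)],  # hypothetical: line 9 has two points
}
print(shared_integration_points(integration_points))
```

Lines that share an integration point are candidates for being clones of a single transformation event, as was concluded for lines 1, 2 and 5.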

Fig. 3

Overview of the number and pattern of T-DNA integration in the ten edited lines. IP: integration point

SpCas9 and sgRNA expression

SpCas9 and sgRNA expression was evaluated in two replicates for each plant line. According to the results shown in Fig. 4, SpCas9 expression differed among the 10 lines, with the lowest values in line 4 and the highest values in lines 1, 5, 8, 10. Regarding sgRNA expression, the lowest MLO7-sgRNA expression was found in line 4 (consistent with a similarly low value for SpCas9), while the lowest DMR6-ts1-sgRNA expression was found in line 9 (the trend is similar for SpCas9). No comparison was possible for DMR6-ts2-sgRNA.

Fig. 4

Expression profile of the SpCas9 gene (A) and of the sgRNAs: MLO7 (B), DMR6-ts1 (C), DMR6-ts2 (D). Error bars indicate the SD associated with the mean of three technical replicates

Editing profile in the target site

The editing profile of the three target sites was analyzed by Illumina sequencing (Fig. 5). A fully mutated profile was found in all the lines with the exception of lines 4 and 6, which retained a non-mutated (wild type, wt) allele in 50% and 20% of the sequenced reads, respectively. The editing profiles of lines 1, 2, and 5, which are actually clones, were very similar. This result indicates that editing may have occurred prior to the division of the embryogenic unit into the progenitors of the three clones. Moreover, it is also worth noting that the lines showing a complete and homogeneous mutation pattern (i.e. lines 3, 9, 10) were those edited with the same sgRNA (DMR6 ts-1).

Fig. 5

Mutation profile of the target site detected by Illumina sequencing and CRISPResso software. For each line and biological replicate (A and B), the analyzed DNA was the same used for qPCR and ddPCR quantification

Discussion

In the last few years, gene editing technologies have been widely applied to unveil the function of candidate genes or to improve traits in crops [28,29,30,31,32]. In such experiments, a large number of in vitro plants are typically generated and, therefore, an early selection step is required to maintain only the most interesting lines. Molecular characterization of the transformation events and phenotyping of the trait of interest are crucial for this selection process. In this study, we compared and evaluated a set of methods for the molecular characterization of grapevine CRISPR/Cas9-edited lines, which contain inserted recombinant DNA and are, therefore, considered conventional GMOs. Agrobacterium tumefaciens-mediated transformation, currently the most common method used to engineer crops, produces lines with a number of integrated T-DNA copies ranging from one to several, as well as chimeric tissues in which modified cells are mixed with wt cells, resulting in fractional CN [33]. Both multiple copies and chimerism are undesired, the former because they are often associated with post-transcriptional silencing of the transgene [34, 35] and the latter because it can result in the loss of the trait after many cycles of plant propagation [11, 33, 36, 37]. Thus, accurate and precise CN measurement is critical for line selection, since a single or low CN is generally desired [38]. Many studies have compared the performance of qPCR and ddPCR with a view to improving detection and quantification methods for laboratories engaged in official GMO control [15, 39,40,41,42,43,44]. In general, they argued in favor of ddPCR, because this technique is insensitive to the PCR-inhibiting components often present in complex matrices and does not depend on calibration with standard curves obtained from certified reference materials. 
According to our results, the qPCR and ddPCR outputs are broadly in agreement in terms of accuracy, especially for low CN values (Fig. 1). The most divergent values were those observed for line 3. In this case, however, the ddPCR CN values measured on genomic DNA digested with a restriction enzyme were closer to the qPCR ones. This result indicates that a repeated T-DNA integration pattern, which is a common outcome of Agrobacterium tumefaciens- and biolistic-mediated transformation, cannot be resolved by ddPCR without separating the tandem T-DNA cassettes. However, ddPCR was more precise than qPCR in all cases (Fig. 2). Besides qPCR and ddPCR, another method to assess transgene CN is NGS-based T-DNA integration site sequencing, which identifies and counts unique insert-to-plant junctions at one end of the integration site [4]. A weak point of this method is its relatively low efficiency: the integration points could be identified in only 7 out of 10 lines. Moreover, according to our data, the CNs detected with NGS were often not consistent with those calculated by qPCR and ddPCR. According to [45], NGS (MiSeq) analysis is more sensitive than qPCR for measuring high CNs, but the presence of rearrangements may impair its accuracy. In fact, in the case of tandem or inverted repeats, the NGS method cannot ensure resolution of multiple copies within a single integration point. Likewise, it cannot identify chimeric T-DNA integration. However, although unreliable for accurate CN assessment, this method can clearly demonstrate the presence of T-DNA truncations (due to T-DNA border trimming or to tandem/inverted repeats), thus confirming hypotheses formulated on the basis of qPCR and ddPCR results, as in the case of line 3, where the inconsistency in the CN of two exogenous genes was caused by a T-DNA truncation. 
These kinds of rearrangements may have a profound effect on inactivation or variable expression of the transgene [46, 47], and their detection is, therefore, crucial. Head-to-tail arrays of multiple copies frequently result in stable expression, whereas head-to-head or tail-to-tail arrangements generally result in silencing [48]. Moreover, the location of the integration point may also give indications about the possible influence of the surrounding genomic sequence on recombinant DNA expression, a feature that is known as position effect [49, 50]. In recent studies [47, 51], random transgene integration sites were analyzed in Chinese hamster ovary cells, a mammalian cell line used as a biofactory for the production of therapeutic molecules. These authors found that transgene stability is ensured over time when integration occurs in genomic regions with high transcriptional activity and accessibility to transcription factors (not necessarily within highly transcribed genes). If transgene integration is directed to genomic “landing pads” with such features, even a multicopy integration pattern allows for stable and high-rate expression of the recombinant DNA [52]. Mapping such regions in plant genomes would be of great importance to predict the behavior of a trait of interest over time in a modified plant. In addition, the available biotechnological tools for targeted gene insertion (i.e. the knock-in approach based on Site Directed Nucleases 3) [53] greatly encourage the identification of such regions in the genomes of crops. The empirical analysis of several transformation events (i.e. assessment of the integration point and of expression stability over long periods, together with bioinformatics predictions) may help to reach this goal. Knowledge of plant-T-DNA junctions is also very important to understand whether an integration could be detrimental to the plant, such as those occurring in important coding regions. 
Another interesting aspect is that the NGS method serves to identify lines that derive from the same transformation event and should more correctly be classified as clones (e.g. lines 1, 2, 5). Such information cannot be retrieved by qPCR or ddPCR, nor by analysis of the editing profile, because a specific target site tends to be conservatively repaired, often resulting in the same mutation [54]. In the case of edited plants, the expression level of SpCas9 and sgRNA and the mutation profile of the target site make it possible to assess the activity of the CRISPR system integrated in the plant genome and to correlate the T-DNA integration pattern (genomic position, number of copies, presence of rearrangements) with the efficiency of the editing machinery. In our study, multicopy integration correlated with low transgene expression (in lines 3 and 4), and the lowest SpCas9 and sgRNA expression was associated with partial editing of the target site (line 4).

Conclusions

Based on the results obtained in this work, the integrated use of the three proposed techniques allows the characterization of transgenic plants at an early stage. qPCR and ddPCR can be considered alternative techniques for quantifying the integration copy number of a transgene. However, using them in parallel can provide more complete information, especially when their results diverge. Indeed, while ddPCR is more precise and substantially unaffected by PCR inhibition, it is less accurate in quantifying tandemly repeated sequences unless the DNA is digested with appropriate endonucleases. In parallel, the NGS method should be considered a complementary technique for gathering extensive knowledge about the transgene integration pattern, which may be crucial for plant selection in the early stages of development.