Introduction

Contamination of samples with DNA from other individuals of the same species (intraspecific) or from other species (interspecific), but also degradation, are typical problems one faces working with historical or non-invasively obtained wildlife samples, like hair or swabs, and forensic sample material. Microsatellite analyses are plagued with cross-contamination during sampling, microbial contamination problems resulting from inappropriate storage, as well as degradation processes, which cause additional false alleles or missing peaks, through allelic dropout and null alleles (Bradley and Vigilant 2002; Taberlet et al. 1996). Hence several approaches have been developed to minimize problems during microsatellite analyses (Frantz et al. 2003; Taberlet et al. 1996).

In forensic DNA analyses, interpretation of crime scene traces is controlled by a guideline and includes three important thresholds in order to ensure the reliability of the data and to unravel the number and genotypes of contributors in a sample (Scientific Working Group on DNA Analysis Methods 2010). The first is an analytical threshold, which has to be determined for each automatic sequencer, below which the signal cannot be distinguished from the background. The second is a stochastic threshold, below which signals can be distinguished from the background but are still subjected to stochastic effects. Thus, such peaks cannot be used for reliable quantification (Gilder et al. 2007). The third criterion is the peak height ratio (PHR), the ratio of the smaller peak to the higher peak given in percent, within a locus. For heterozygous individuals, both alleles of a locus should exhibit peaks of almost equal strength (Clayton et al. 1998), whereas contamination will cause a greater difference in peak height (Navidi et al. 1992). False alleles caused by an intraspecific cross-contamination can thus be excluded using the forensic approach in conjunction with the PHR criterion.

For non-invasive genetic tracking of wildlife, a multiple-tubes approach is widely used, whereby a sample is analysed several times to derive a consensus result from these single analyses. There are several designs for a multiple-tubes approach, which vary in the number of DNA isolations and repeated PCRs (Caniglia et al. 2012; Gagneux et al. 1997; Lucchini et al. 2002). The classical approach includes up to seven PCRs from one DNA isolate (Taberlet et al. 1996). However, this requires considerable effort and time. Fernando et al. (2003), in contrast, considered two DNA isolations with one PCR each to be sufficient for reliable construction of consensus genotypes. Stepwise approaches were also developed in order to reduce the number of amplification reactions (Frantz et al. 2003). However, both the forensic and the multiple-tubes approach have limitations. The forensic approach can exclude false alleles due to cross-contamination or stochastic artefacts, but nothing is known about the reproducibility of the results. Using this method, some genotyping errors such as stochastic allelic dropouts may be overlooked. The multiple-tubes approach, on the other hand, considers reproducibility but may not be able to disentangle false microsatellite profiles from cross-contaminated samples. In particular, when two homozygous genotypes are mixed, the resulting profile appears to be a heterozygous genotype, and the contamination would not be detected with this approach. However, the two approaches could complement each other, and a combined approach could overcome these limitations. This applies in particular to wildlife samples (e.g. from monitoring programs), which are likely to be affected by intraspecific cross-contamination.

Within a monitoring program in the course of the EU-LIFE and EU-LIFE+ projects involving the reintroduction of the allis shad (Alosa alosa) in the River Rhine, fish scale samples, inter alia, were collected for molecular analyses to assess the success of the project (Hundt et al. 2015). The allis shad is an anadromous fish within the family Clupeidae, which was formerly widespread throughout Western Europe. Overfishing, poor water quality and migration barriers led to a collapse of allis shad stocks in most Atlantic tributaries by the mid-20th century, including the River Rhine (Stoll and Beeck 2011). Larger spawning populations currently remain only in a few Atlantic tributaries in the south-west of France, the largest population being in the Gironde-Garonne-Dordogne system. For the reintroduction in the River Rhine, larvae have been produced every year since 2007 from newly captured adults of the Gironde-Garonne-Dordogne population (Scharbert and Beeck 2010). Returning allis shad have been recorded every year since 2013, along with indications of the first natural reproduction of the species in the River Rhine since the mid-20th century (Hundt et al. 2015). Furthermore, samples from adults of the donor population, as well as from returning allis shad captured in the River Rhine, were collected in the course of the two successive EU-LIFE projects. To ascertain whether natural reproduction is occurring as a consequence of the stocking actions, and thus to assess the success of the reintroduction measures, parentage assignment analyses were conducted based on microsatellite analyses.

However, a substantial proportion of the microsatellite profiles showed multiple peaks or substantial peak-height imbalance, which is most likely caused by intraspecific cross-contamination during the sampling procedure of the scale samples. The exclusion of these unclear profiles resulted in a low percentage of measurable samples not sufficient for reliable parentage assignment analyses (K. Mäck, unpublished data).

Therefore, the main objective of this study was to evaluate a method combining the forensic and multiple-tubes approaches in order to maximize the proportion of measurable, intraspecifically cross-contaminated allis shad samples within the reintroduction project. Additionally, we aimed to determine whether combining the two approaches results in a higher proportion of measurable profiles than using either approach alone.

Material and methods

Microsatellite analysis

The samples were provided by the EU-Life project Alosa alosa. Scale samples (dried and stored in paper envelopes; n = 242) from individuals captured for breeding during their spawning migration in the Garonne, southwest France, in 2010 were used for this study.

DNA was isolated for each individual from a single scale using a modified salt extraction according to Koester and Gergs (2014), omitting the initial homogenization step. DNA pellets were dissolved in 30 µl 1× TE buffer (10 mM Tris-HCl; 1 mM EDTA, pH 8) and further diluted with sterile ddH2O to obtain a final DNA concentration of 10 ng/µl. Amplification of 11 microsatellite loci, separated into two batches, was conducted in 5 µl multiplex PCR reactions including 1 µl DNA extract (2 ng/µl final concentration) in each reaction (Type-it Microsatellite PCR Kit, QIAGEN, Hilden, Germany; for details see Table 1). The following cycling conditions were used: 95 °C for 5 min, followed by 30 cycles at 94 °C for 30 s, 57 °C for 1.5 min, 72 °C for 1 min, and final elongation at 60 °C for 30 min. Amplified fragments were separated by capillary electrophoresis (CEQ 8000 Genetic Analysis System, Beckman Coulter, Brea, CA, USA). The Genome Lab Genetic Analysis System Software (Beckman Coulter) was used for automated allele scoring, followed by visual inspection by the user for maximum reliability (Pompanon et al. 2005). Several measures were established to prevent contamination during processing in the laboratory. Pre- and post-PCR steps were carried out in separate rooms. Negative controls were included during DNA extraction as well as during PCR. PCRs were set up in a clean bench using barrier tips and UV light to sterilise the equipment.

Table 1 Microsatellite loci and primers used in this study

Data analysis and optimisation steps

Two approaches were used to distinguish between true allis shad peaks and contaminant signals in the microsatellite analyses. First, the forensic approach was used, for which the analytical threshold was defined as three times the standard deviation of the background, corresponding to 450 RFU (relative fluorescence units) in our case (Fig. 1). The stochastic threshold was set to 1500 RFU, i.e. ten times the standard deviation of the background (Gilder et al. 2007). Furthermore, signals below a PHR of 40% were assessed as contamination and excluded from further analyses. The PHR of related peaks of a locus was calculated as follows:

$$\mathrm{PHR}=\frac{\text{height of smaller peak (RFU)}}{\text{height of larger peak (RFU)}}\times 100$$

Fig. 1 Forensic approach, with the three different defined thresholds, as recommended by the Scientific Working Group on DNA Analysis Methods (2010). Peaks smaller than the analytical threshold cannot be distinguished from the background. Peaks below the stochastic threshold were still subjected to stochastic effects. Peaks smaller than the defined peak height ratio were assessed as contamination
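
The three thresholds described above can be illustrated with a minimal Python sketch. The threshold values (450 RFU, 1500 RFU, 40%) are taken from the text; everything else is an assumption made for illustration, in particular the function names, the representation of a locus as a mapping from allele size to peak height, and the rule that a locus is treated as not scorable when its tallest peak lies below the stochastic threshold. It is not the software used in this study.

```python
from typing import Dict, Optional, Tuple

ANALYTICAL_RFU = 450.0    # three times the background SD (value from the text)
STOCHASTIC_RFU = 1500.0   # ten times the background SD (value from the text)
PHR_MIN_PERCENT = 40.0    # peaks below this ratio are treated as contamination


def peak_height_ratio(height_a: float, height_b: float) -> float:
    """Return the ratio of the smaller to the larger peak, in percent."""
    smaller, larger = sorted((height_a, height_b))
    return smaller / larger * 100.0


def filter_locus(peaks: Dict[int, float]) -> Optional[Tuple[int, ...]]:
    """Apply the three thresholds to one locus.

    `peaks` maps allele size (bp) to peak height (RFU); this layout is
    hypothetical. Returns the retained alleles, or None if the locus is
    not scorable.
    """
    # 1. Discard signals that cannot be told apart from the background.
    above = {size: h for size, h in peaks.items() if h >= ANALYTICAL_RFU}
    if not above:
        return None

    # 2. Assumption: if even the tallest peak is below the stochastic
    #    threshold, the locus is too weak for reliable PHR calculation.
    tallest = max(above.values())
    if tallest < STOCHASTIC_RFU:
        return None

    # 3. Keep only peaks whose PHR relative to the tallest peak is >= 40%.
    kept = sorted(size for size, h in above.items()
                  if peak_height_ratio(h, tallest) >= PHR_MIN_PERCENT)

    # More than two remaining alleles cannot belong to one diploid fish.
    return tuple(kept) if len(kept) <= 2 else None
```

In this sketch, a locus with peaks of 4000, 3600 and 900 RFU would retain only the first two alleles, because the third peak has a PHR of 22.5% relative to the tallest peak.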

In addition, we used a stepwise multiple-tubes approach modified from Fernando et al. (2003; see Fig. 2). In the first step, two DNA isolations, each followed by one PCR and fragment analysis, were conducted for each individual. If it was not possible to build a consensus genotype from the results (because of a lack of conformance or more than two alleles in the resulting consensus genotype), two further DNA isolations and subsequent fragment analyses were carried out for the same individual in a second step. Afterwards, a peak was counted as an allele if it was detected in three out of four analyses. The sample was discarded if it was not possible to build a consensus genotype after step two.

Fig. 2 Stepwise multiple-tubes approach modified from Fernando et al. (2003). In the first step, two DNA-isolates were compared. If it was not possible to build a consensus genotype, because of a lack of conformance (unequal) or more than two alleles in the resulting consensus genotype (not measurable), another two DNA-isolates were compared in a second step. The sample was discarded, if it was not possible to build a consensus genotype after step two
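
The stepwise decision rule can be sketched in Python as follows. This is an illustrative reading of the procedure for a single locus, not the authors' implementation: the function names, the status labels and the representation of each replicate as a set of allele sizes are hypothetical, and "three out of four analyses" is interpreted here as at least three.

```python
from collections import Counter
from typing import FrozenSet, Optional, Sequence, Set, Tuple

Genotype = FrozenSet[int]


def step_one(rep1: Set[int], rep2: Set[int]) -> Optional[Genotype]:
    """Step 1: both isolates must agree and show one or two alleles."""
    if rep1 == rep2 and 1 <= len(rep1) <= 2:
        return frozenset(rep1)
    return None


def step_two(replicates: Sequence[Set[int]], min_support: int = 3) -> Optional[Genotype]:
    """Step 2: keep alleles seen in at least `min_support` of four analyses."""
    counts = Counter(allele for rep in replicates for allele in set(rep))
    consensus = frozenset(a for a, n in counts.items() if n >= min_support)
    return consensus if 1 <= len(consensus) <= 2 else None


def consensus_genotype(replicates: Sequence[Set[int]]) -> Tuple[str, Optional[Genotype]]:
    """Return (status, genotype) for two or four replicate analyses.

    Status is "consensus", "repeat" (step 2 still required) or "discard".
    """
    if len(replicates) not in (2, 4):
        raise ValueError("expected two (step 1) or four (step 2) replicates")

    first = step_one(replicates[0], replicates[1])
    if first is not None:
        return "consensus", first
    if len(replicates) == 2:
        return "repeat", None

    second = step_two(replicates)
    if second is not None:
        return "consensus", second
    return "discard", None
```

For example, replicates showing {150, 154}, {150}, {150, 154} and {150, 154} would yield the heterozygous consensus 150/154 under this rule, because allele 154 is supported by three of the four analyses.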

These two approaches were merged into a combined approach, meaning that only peaks meeting the threshold criteria were used to build consensus genotypes. To test the applicability of the different approaches, the expected and observed heterozygosity (Hexp and Hobs) were calculated, and deviations from Hardy–Weinberg equilibrium (HWE) were analysed using a χ2-test with Bonferroni correction in Cervus 3.0.7 (Kalinowski et al. 2007). Data were further checked for genotyping errors, such as large allele dropout, stutter and null alleles, using MICRO-CHECKER 2.2.3 (van Oosterhout et al. 2004); additionally, the frequency of null alleles was calculated (Cervus 3.0.7, Kalinowski et al. 2007).
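
To make the two heterozygosity measures concrete, the following Python sketch computes them for a single locus. It uses the textbook definitions (Hobs as the fraction of heterozygous individuals, Hexp as one minus the sum of squared allele frequencies) and is an assumption-laden illustration only: it does not reproduce Cervus, which may apply a sample-size correction, and the function name and input layout are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def heterozygosity(genotypes: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (Hobs, Hexp) for one locus.

    `genotypes` holds one (allele, allele) pair per individual.
    Hexp is the uncorrected estimate 1 - sum(p_i ** 2).
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")

    n_individuals = len(genotypes)

    # Observed heterozygosity: share of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n_individuals

    # Allele frequencies from all 2N gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_copies = 2 * n_individuals
    h_exp = 1.0 - sum((n / total_copies) ** 2 for n in counts.values())

    return h_obs, h_exp
```

A systematic loss of true alleles would turn heterozygotes into apparent homozygotes, which lowers Hobs relative to Hexp; this is the pattern the HWE test is meant to reveal.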

Results and discussion

Of the 242 samples, we obtained evaluable profiles (i.e. one or two clearly scorable alleles) from 138 (55%) with one DNA isolation and without a refined approach. All other profiles were unclear, i.e. they showed multiple peaks or a substantial peak-height imbalance indicating contamination, and were excluded. Using the forensic approach alone, the proportion of measurable results increased to 67% with a PHR threshold of 40% (for detailed information on the results of single samples see Supplementary Material). This threshold is considerably lower than those usually used in forensic analyses of human DNA (60%; Scientific Working Group on DNA Analysis Methods 2010). However, a threshold cannot be transferred directly from one system to another, and it can be as low as 33% (Navidi et al. 1992). PHRs were estimated on the basis of microsatellite data from non-contaminated allis shad larvae, collected as part of the reintroduction program (K. Mäck, unpublished data). It became apparent that a 60% threshold would be too high in this case, leading to a high proportion of erroneously excluded alleles.

The use of the stepwise multiple-tubes approach led to a proportion of 75% measurable samples. With non-invasive or historical samples, degradation and contamination are known to cause problems. For example, Lucchini et al. (2002), using wolf (Canis lupus) faeces, found only 69% of the samples to be measurable when applying a multiple-tubes approach. Furthermore, the stepwise multiple-tubes approach we applied made it possible to avoid very high costs and effort, especially in comparison with the classical multiple-tubes approach of Taberlet et al. (1996). Combining the forensic and multiple-tubes approaches, we achieved a considerable increase in interpretability of up to 90% (221 samples). In the analysis using MICRO-CHECKER 2.2.3, we found no evidence for large allele dropout or genotyping errors due to stutter in the results of the forensic approach, the multiple-tubes approach or the combined approach, nor in the dataset without a refined approach. This was within our expectations, as the approaches should correct the data for genotyping errors or exclude ambiguous data. Furthermore, we did not find evidence of null alleles using MICRO-CHECKER 2.2.3 in any of the approaches or in the data without a refined approach. Regarding null alleles, these results correspond to the frequencies calculated by Cervus 3.0.7, which showed no value above 0.05 (Table 2), the level regarded as problematic (Kalinowski et al. 2007), and were comparable between the approaches. Since a null allele is an allele which is not amplified during PCR due to mutations in the primer flanking regions, null alleles are highly reproducible errors and cannot be detected or corrected by any of the approaches used in this study. Therefore, a check of the data for null alleles as well as the calculation of their frequency is necessary (Pompanon et al. 2005). Erroneous exclusion of true alleles by one of the approaches would have led to deviations from HWE, i.e. Hexp > Hobs. However, the analysed dataset did not deviate from HWE regardless of which approach was used. Hence, a bias caused by one of the approaches can be excluded.

Table 2 Summary statistics for each locus including number of alleles (k), observed and expected heterozygosity (Hobs, Hexp), χ2-test for deviations from Hardy–Weinberg equilibrium (HW, NS = not significant) and null allele frequencies (F(Null)) for the three different approaches and without a refined approach

Our study illustrates the potential of the suggested approach for other studies with comparable contamination problems or sample material. Mitchell et al. (2008) detected severe cross-contamination on herring (Clupea harengus) scales from a historical collection. Different treatments (washing, bleach, scrubbing etc.) could not remove the contamination, and these samples were therefore no longer usable (Mitchell et al. 2008). In such cases where serious cross-contamination occurs, the combined approach could be applied to substantially enhance the proportion of reliably measurable results. This is of high importance in our case, as the success of the parentage assignment depends on the proportion of parents sampled (Jones and Wang 2010; Kalinowski et al. 2007; Marshall et al. 1998). Another essential factor is the reliability of the data. Mismatches caused by genotyping errors could lead to an erroneous exclusion of a potential parent of the offspring of interest (Marshall et al. 1998). The multiple-tubes approach, and therefore also the combined approach we used in our study, is designed to detect genotyping errors and correct the data accordingly. If this is not possible, the sample is excluded from further analysis. Consequently, the resulting data set should ideally contain hardly any errors (exceptions are null alleles, for example). With our combined approach, the influence of both factors, cross-contamination and genotyping errors, can be minimised, and the subsequent analysis, in our study the parentage assignment, can be done without further constraints.

Our approach could also improve the genotyping success for other markers, such as SNPs, although some adaptations may be necessary. For example, in DNA SNaPshot assays, the signal imbalance due to the different fluorochromes used for labelling (Fondevila et al. 2017) must be considered.