Bridging the gaps between non-invasive genetic sampling and population parameter estimation
- Cite this article as:
- Marucco, F., Boitani, L., Pletscher, D.H. et al. Eur J Wildl Res (2011) 57: 1. doi:10.1007/s10344-010-0477-7
Reliable estimates of population parameters are necessary for effective management and conservation actions. The use of genetic data for capture–recapture (CR) analyses has become an important tool to estimate population parameters for elusive species. Strong emphasis has been placed on the genetic analysis of non-invasive samples, or on the CR analysis; however, little attention has been paid to the simultaneous overview of the full non-invasive genetic CR analysis, and the important insights gained by understanding the interactions between the different parts of the technique. Here, we review the three main steps of the approach: designing the appropriate sampling scheme, conducting the genetic lab analysis, and applying the CR analysis to the genetic results; and present a synthesis of this topic with the aim of discussing the primary limitations and sources of error. We discuss the importance of the integration between these steps, the unique situations which occur with non-invasive studies, the role of ecologists and geneticists throughout the process, the problem of error propagation, and the sources of biases which can be present in the final estimates. We highlight the importance of team collaboration and offer a series of recommendations to wildlife ecologists who are not familiar with this topic yet but may want to use this tool to monitor populations through time.
Keywords: Capture-mark-recapture; Genetic; Molecular tagging; Non-invasive; Population size
Wildlife conservation and research has benefited from the field of molecular genetics (DeYoung and Brennan 2005). Our ability to delineate populations, understand dispersal patterns, detect hybridization, and count and monitor wildlife has improved through the synergy of traditional wildlife biology with molecular ecology (Schwartz et al. 2007). One area of wildlife research, in particular, that has benefited from molecular genetics is the estimation of animal abundance and other demographic parameters (e.g., Boulanger et al. 2004; Kendall et al. 2009; Prugh et al. 2005).
Historically, estimation of demographic parameters for species that are rare, elusive, or difficult or expensive to capture has been limited by small sample sizes. Advances in molecular genetics allow individual identification from non-invasive samples (mainly hairs and faeces), often eliminating the need to capture or handle an animal to uniquely mark it (Kohn et al. 1999; Lucchini et al. 2002). Thus, “molecular tags” can now be used to track individuals throughout their lives, and capture–recapture (CR) analyses can be applied to genetic data if individuals are sampled sufficiently to estimate recapture probabilities (Nichols 1992).
What is the optimal sampling design for collecting faeces or hairs to reduce biases and increase precision in the population estimates?
Given the potential for over-sampling (sampling an individual many times in one sampling session) and the cost per sample, how does a biologist prioritise which samples to analyse to minimise individual capture heterogeneity?
To increase precision of estimates, is it better to reanalyze existing samples to decrease genotyping errors or to analyse additional samples already collected during previous sessions?
How does one resolve disparate results between the laboratory and the field?
In this paper, we examine the three main steps of a genetic CR analysis to estimate demographic parameters, with the aim of discussing the primary limitations and sources of error. We provide a guide to effective integration of these steps to highlight issues that may be unfamiliar to either geneticists or ecologists. Our purpose is to address issues which commonly arise when wildlife biologists are initiating a molecular tagging study for the first time, focusing both on conventional issues (e.g., population closure and individual heterogeneity) and on new issues that arise when genetic data are used in a CR analysis. At the end of each section we offer a series of recommendations to wildlife ecologists who are relatively unpractised with this topic and may want to use this tool to monitor populations through time. We discuss the role of ecologists throughout the process, the importance of the study design, the need for interpretation of the genetic results from an ecological perspective, the problem of error propagation, and future research needs in genetic CR analysis.
Sampling design for genetic “captures”: how to improve accuracy starting with DNA sample collection
As in any other study, strong inference from genetic CR estimates depends on a proper sampling design that reduces bias and increases precision. In non-invasive approaches, sampling designs will differ depending on whether DNA samples are collected using a sampling device which attracts the animal (“active sampling”, usually adopted for hair collections) or simply discovered in the field (“passive sampling”, usually adopted for faeces, feather, regurgitate or urine collections). In this paper, we will focus mainly on hair and faeces collections, as examples of active and passive sampling respectively, as they are the techniques most cited in the literature (e.g., Boulanger et al. 2006; Gleeson et al. 2010; Prugh et al. 2005), although we recognize that other non-invasive samples are being used for estimating abundance (e.g., Jacob et al. 2010). During these collections, bias can be caused by violation of CR assumptions (i.e., closure violation, individual capture heterogeneity), whereas precision can be poor if the sample size (i.e., number of genotyped individuals identified with scats/hairs and recapture rates) is too small.
Reducing biases: population closure and individual capture heterogeneity
In the past, study designs for genetic CR estimates have rarely been defined a priori, and often a sampling occasion was poorly identified (Lukacs and Burnham 2005). With important exceptions (e.g., Boulanger et al. 2006; Mulders et al. 2007), the number of sessions was typically defined a posteriori, either because samples were collected continuously with no formal sampling schemes, or because non-invasive samples are hard to date accurately, introducing uncertainty as to when the animal left the sample (e.g., Kohn et al. 1999; Wilson et al. 2003). This can cause biases in applying CR models because model assumptions can be violated (Boulanger and McLellan 2001). We strongly suggest that the first step, before initiation of non-invasive sample collection for a CR analysis, should be to define the sampling occasions, taking into account the type of DNA collection and the behaviour of the species. The decision to conduct an active or a passive DNA collection depends on the study species. Active hair collections have more often been implemented in studies of bears (e.g., Boulanger et al. 2004; Mowat and Strobeck 2000) or weasels (e.g., Mowat and Paetkau 2002), and passive faeces collections in studies of canids (e.g., Marucco 2009; Prugh et al. 2005), marsupials (Ruibal et al. 2009) or bats (e.g., Puechmaille and Petit 2007). For both approaches, it is fundamental to define sampling strategies that meet the population closure assumption, if required by the CR model used in the analysis, and minimize the problem of unexplained individual capture heterogeneity. Here, we discuss how to limit these two sources of bias through non-invasive sampling design.
Violation of population closure (immigration or emigration, deaths and births) (Pollock et al. 1990) is likely if a lengthy sampling period, relative to the lifespan of the species, is needed to obtain an adequate sample size. Long sampling periods often cause positively biased estimates when applying closed CR models or a robust design, which require closure (Boulanger and McLellan 2001). A robust design is one of the most useful designs in CR studies (Lukacs and Burnham 2005) and should be considered among the first choices of sampling if the objectives of the study are to obtain both accurate abundance estimates and estimates of survival, emigration rates, and trend over time (Pollock et al. 1990); however, a robust design is more difficult to apply, especially in passive faecal sampling, where faecal deposition is continuous.
Several species are best captured by faeces collections due to their behaviour (e.g., wolves, foxes, badgers, and elephants). The sampling design for passive faeces collection is usually organised on transects used to locate faeces on trails, latrines, etc. that can be travelled multiple times. Each of the temporally independent transects is often considered a sampling occasion (Marucco 2009; Matejusová et al. 2008; Ruibal et al. 2009). Closure assumptions are harder to meet in this case, because it is often not logistically feasible to sample transects often enough to collect large numbers of faeces in short timeframes. Moreover, the actual time of deposition is usually undefined, extending the actual timeframe of sampling (Mulders et al. 2007). For instance, a scat sampling period of several consecutive days was too short to obtain a meaningful estimate of population abundance for the spotted-tailed quoll (Dasyurus maculatus), and it took 7 to 8 weeks of scat sampling to achieve re-sampling rates sufficient to obtain population estimates (Ruibal et al. 2009). However, Ruibal et al. (2009) adopted a closed CR model by focusing the sampling in May to June, when the behaviour of the species justified closure assumptions despite the long sampling period. Because well-defined sampling occasions with short time intervals are harder to conduct for passive faeces sampling, open CR models that do not have a closure assumption may be better suited for these situations (e.g., Cubaynes et al. 2010; Marucco 2009; Prugh et al. 2005). The use of faecal detection dogs may improve the efficacy of the collections for some species (MacKay et al. 2008; Smith et al. 2005; Wasser et al. 2004). Regardless, if a closed CR model is used, and short sampling periods are not feasible, it is critical to conduct other field tests (e.g., radiocollared animals, videos and observations) to check for violation of closure assumptions (e.g., Powell et al. 2000) (Fig. 1).
Active collections of hair samples are more suited for meeting the closure assumption: they are usually conducted with hair traps, allowing the researcher to clearly identify the number of occasions (i.e., the hair snags are checked at certain time intervals), the kind of sampling (e.g., random, systematic and adaptive), and the number of grids (i.e., power), with the ultimate goals of the sampling design being robust to capture heterogeneity and maximising geographic closure (Boulanger et al. 2004).
Individual capture heterogeneity
Individual capture heterogeneity is one of the most frequent issues in non-invasive CR studies, and it biases the estimates (Lukacs and Burnham 2005; Prugh et al. 2005). Whenever possible, it is very important to minimize differences in capture probabilities between individuals (i.e., collecting scats or hairs in equal numbers among individuals, avoiding differences in sampling intensity between sex, age, social status, etc.) (Fig. 1). This should be first attempted via the sampling design in a species- and study-specific manner. Here, we first discuss issues of individual capture heterogeneity with faeces and then with hair collections.
For minimising individual capture heterogeneity, passive faeces sampling might have some advantages compared with traditional CR studies. Faeces sampling may be less affected by the “trap-shy” or “trap-happy” behavioural response typical of animals’ responses to bait posts used to collect hairs (e.g., Boulanger et al. 2006) or to capture individuals (Lebreton et al. 1992). However, it is critical to know deposition patterns and home ranges of individuals to calibrate a faeces sampling scheme for minimising capture heterogeneity while at the same time controlling for a “trap-happy artefact”. From a biological perspective, it is unclear if there is a behavioural response to having scats removed for collection, which may induce a trap-happy or trap-shy response. Therefore, problems of individual capture heterogeneity in passive sampling can be related to differing behaviour of individuals with respect to the probability of finding their faeces. For instance, Marucco (2009) found that the strong marking behaviour of dominant wolves (Canis lupus) can increase their capture probabilities, especially if their faecal signs are deposited at marking sites, which are easier to see and collect. To minimize this problem, the authors found that the best sampling design was along wolf snow tracks because they increased the probability of sampling each individual in the pack, thus minimising the effects of differences in individual marking behaviour. Similarly, Eggert et al. (2003) followed elephant trails to collect fresh dung, but this sampling design likely over-represented larger elephant groups, whose trails are more obvious to the human eye, thus possibly causing capture heterogeneity. Ruibal et al. (2009) showed that seasonal differences in scent-marking behaviour of the spotted-tailed quoll can produce sex- and age-sampling biases. They showed that the higher likelihood that subadult individuals had dispersed by May, together with the peak in scat deposition during May and June, suggested that scat sampling at this time of year was optimal for this species.
For active hair sampling, Boulanger et al. (2006) found that one of the main causes of heterogeneity in recaptures of grizzly bears (Ursus arctos) was low capture probabilities for female bears with cubs. A field solution, moving baited sites within each sampling grid, was used to reduce this capture heterogeneity-induced bias. Boulanger et al. (2004) found that calibrating the distance between traps to the grid edge could also minimize heterogeneity in recaptures. Settlage et al. (2008) showed that a hair sampling design developed for grizzly bears was not suitable to yield accurate abundance estimates for black bears in the Southern Appalachians (USA), because the small home ranges of black bears caused individual capture heterogeneity. The same outcome (i.e., heterogeneity in p) has been documented for wild boar (Sus scrofa) non-invasive sampling (Ebert et al. 2009). Ebert et al. (2009) used videos to test if hair traps could sample wild boar randomly for the purpose of non-invasive population estimation. Video analysis revealed differences in the behaviour of adult and subadult wild boar with respect to the baited hair traps, which produced heterogeneous individual sampling probabilities. They suggested that for social species, such as wild boar, which can occur in groups of up to 30 individuals, collections through baited hair traps might create non-representative genetic samples, causing increased bias due to heterogeneity in detection probability.
Therefore, before initiating either a passive or an active non-invasive study, it is fundamental to analyse all possible sources of capture heterogeneity related to the type of DNA samples collected and the behaviour of the species, and to design a sampling scheme to minimize them. If this is not accomplished, it is of critical importance to collect a large sample, representative of a large proportion of the population, to allow high recapture rates (Fig. 1). High recapture rates will allow detection and correction of individual capture heterogeneity. Subsequently, it is important to collect data in the field (e.g., with videos, tracks, etc.) on possible sources of variation which cause differences in capture rates (e.g., indications on the social status, age of the individual, behaviour, location within the home ranges, circumstances of sample collection, etc.). Greater effort in identifying potential sources of heterogeneity is recommended in non-invasive CR studies than in traditional CR studies, because visual covariates (such as body condition, age, size, etc.) are not collected with non-invasive samples, as they would be with captured animals.
Increasing precision: sample size, lab success rates, and money
In traditional CR analysis, researchers often aim to increase sample size to produce higher capture probabilities and subsequently increase the precision of population parameter estimates. In addition, it is common to formally estimate robustness to capture heterogeneity (Boulanger et al. 2004). White et al. (1982) show that accurate population estimates require average capture probabilities of >0.30 when N < 100. Small sample sizes coupled with low capture probabilities tend to negatively bias population estimates (White et al. 1982). In genetic CR studies, the strategy is the same; however, it is important to recognize that an increase in sample size can simultaneously increase the number of genotyping errors, which, in turn, can increase unexplained “apparent individual heterogeneity” in recaptures and introduce biases into the estimates. Therefore, the traditional methods of increasing sample size to increase the ability to detect and model capture heterogeneity should be carefully considered because, unless the genotyping error is zero, this solution could be ineffective and costly. Furthermore, it is unlikely that the genotyping error is ever zero (Bonin et al. 2004). This is one of the greatest dilemmas in genetic CR sampling.
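The dilemma can be illustrated with a two-session sketch. It is an illustration only, not a method used in the studies cited above: it assumes a closed population, Chapman's bias-corrected form of the Lincoln-Petersen estimator, and hypothetical counts. The name `chapman_estimate` and the variable `ghosts` (false individuals created by genotyping errors, each counted once and never recaptured) are assumptions introduced here.

```python
import math

def chapman_estimate(n1, n2, m2):
    """Chapman's bias-corrected Lincoln-Petersen estimate for two sessions.

    n1: individuals genotyped in session 1
    n2: individuals genotyped in session 2
    m2: individuals genotyped in both sessions (recaptures)
    Returns (estimated abundance, standard error).
    """
    n_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
    var = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)) / ((m2 + 1) ** 2 * (m2 + 2))
    return n_hat, math.sqrt(var)

# Hypothetical counts: 30 genotypes per session, 15 shared between sessions.
clean, clean_se = chapman_estimate(30, 30, 15)

# Same field data, but with 3 erroneous genotypes ("ghosts") added per session.
# Ghosts raise n1 and n2 but never add to the recaptures m2.
ghosts = 3
inflated, inflated_se = chapman_estimate(30 + ghosts, 30 + ghosts, 15)

print(f"without errors: {clean:.1f} (SE {clean_se:.1f})")
print(f"with ghosts:    {inflated:.1f} (SE {inflated_se:.1f})")
```

In this simplified setting every ghost inflates the estimate, so analysing more samples improves precision only if the error rate per sample stays low.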
Sample size can be increased with increased sampling effort, as in typical CR sampling. Sample size can also be increased by increasing lab success rates (the percentage of the samples analysed in the lab that give positive multilocus genotyping results), and recapture rates can be increased by decreasing genotyping errors (Fig. 1). Solberg et al. (2006) recommended that studies using non-invasive genetic methods based on faecal samples should aim at collecting 2.5–3 times as many faecal samples as the “assumed” number of animals (considering that in their lab analysis approximately 20–30% of the samples could not be genotyped). Lab success rates for faecal samples can be lower than that, and in such cases more faecal samples are necessary to sample the population (Marucco 2009). Therefore, if laboratory success rates are not known, we recommend a pilot study, a critical step that is often overlooked in non-invasive CR analyses (Schwartz and Monfort 2008; Valiere et al. 2007), which should give average lab success rates and an indication of the best types of samples (i.e., storage method, or timing of sample collection) (Fig. 1). Alternatively, it has been helpful to target for collection faeces or hairs with higher probabilities of amplification (e.g., fresh faeces collected in the winter on snow give a higher lab performance; Lucchini et al. 2002). For example, McKelvey et al. (2006) found that scat and hair sampling for Canada lynx (Lynx canadensis) was more efficient during winter snow-tracking than using hair-snaring techniques in summer. This was because samples were better preserved in the winter, when they were likely frozen upon deposition. In addition, sampling along snow tracks served as an effective screening mechanism for eliminating non-target species, optimising the use of funds for laboratory analysis.
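The relation between lab success rate and field effort is simple arithmetic, sketched below. The function name `samples_to_collect`, the target of 150 genotyped samples, and the three success rates are hypothetical planning values, not figures from the studies cited; a pilot study would supply the real rate.

```python
import math

def samples_to_collect(target_genotypes, lab_success_rate):
    """Field samples needed so that, on average, `target_genotypes` of them
    yield a usable multilocus genotype at the given lab success rate."""
    if not 0 < lab_success_rate <= 1:
        raise ValueError("lab_success_rate must be in (0, 1]")
    return math.ceil(target_genotypes / lab_success_rate)

# Hypothetical planning: 150 successfully genotyped samples are wanted.
for rate in (0.75, 0.5, 0.25):
    print(f"success rate {rate:.2f}: collect {samples_to_collect(150, rate)} samples")
```

Halving the success rate doubles the number of samples that must be collected and paid for, which is why an unknown success rate makes budgeting unreliable.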
Investigators should plan an annual budget based on a desired level of precision and number of recaptures. This should be done taking into account the average lab success rate for a given species in a given area, and considering that over-sampling (sampling an individual many times in the same sampling session) should be minimised (Fig. 1). This procedure will decrease costs and potential biases in genetic CR estimates, because each sample has a risk of creating a new individual if genotyping error occurs, which can inflate the population size estimate (Creel et al. 2003; McKelvey and Schwartz 2004). If the species of interest is characterised by high recapture probabilities during the same sampling session (as is typical for clusters of faeces sampled at latrines, feeding sites, resting sites, etc., which are data that are typically not used in ordinary CR analyses), it is possible to randomly reduce the number of samples collected per session to reduce costs.
Adams et al. (2003) have recommended using the spatial pattern of faecal deposition as an initial criterion for prioritizing samples to analyse. At the same time, we suggest keeping the samples that were collected but not analysed for future analysis, in case the primary samples fail to amplify. In fact, we still suggest collecting a large number of high quality samples during the sampling occasions whenever possible, more than initially needed for analysis in the lab. One of the important differences between genetic CR and traditional CR is that it is always possible to increase the number of recaptures later if the samples have already been collected. This is accomplished by analysing additional samples in the lab from the sampling occasion of interest. However, subselections of samples have to be random so as not to introduce bias in the analysis. In fact, in some cases it could be more important to invest new funding in additional lab analyses to increase recapture rates and decrease standard errors of a sampling occasion, rather than perform new analyses on a new sampling occasion. If there are no biological reasons to suspect heterogeneity problems in the sampling design, the presence of unexplained recapture heterogeneity in the data can be evidence for the presence of genotyping errors. If this is the case, more effort should be invested in additional replicates of current samples rather than in analysing new samples.
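Random subselection with a stored reserve can be sketched as follows. This is a minimal illustration under assumed conventions: samples are `(sample_id, session)` tuples with unique identifiers, and the function name `select_for_lab`, the cap of three samples per session, and the example data are all hypothetical.

```python
import random
from collections import defaultdict

def select_for_lab(samples, per_session, seed=0):
    """Randomly pick up to `per_session` samples from each sampling session.

    samples: list of (sample_id, session) tuples with unique sample_id values.
    Returns (selected, reserve); reserve samples are stored, not discarded.
    """
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    by_session = defaultdict(list)
    for sample_id, session in samples:
        by_session[session].append(sample_id)

    selected, reserve = [], []
    for session in sorted(by_session):
        ids = by_session[session]
        chosen = set(rng.sample(ids, min(per_session, len(ids))))
        selected += [(i, session) for i in ids if i in chosen]
        reserve += [(i, session) for i in ids if i not in chosen]
    return selected, reserve

# Hypothetical collection: five scats in session 1, two in session 2.
field = [("s1", 1), ("s2", 1), ("s3", 1), ("s4", 1), ("s5", 1),
         ("s6", 2), ("s7", 2)]
selected, reserve = select_for_lab(field, per_session=3)
print("to the lab:", selected)
print("kept in storage:", reserve)
```

Because the draw does not depend on anything known about the sample, later additions from the reserve can be drawn the same way without biasing recapture rates.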
Synthesis of recommendations
Based on the biology of the species, determine whether an active or a passive non-invasive genetic sample collection offers greater precision and less bias for less effort.
Clearly define the study design and sampling occasions prior to collecting non-invasive genetic samples. This includes taking into account the biology of the species, its faecal deposition patterns, the assumptions of the CR models to be used, and the number of samples expected to be analysed in the genetic lab to reach the desired level of recaptures while avoiding over-sampling and individual heterogeneity in recaptures.
Investigate the possible violations of CR model assumptions inherent in the non-invasive method of choice, and control them through the sampling design first.
If closure assumptions are required, but long periods of sampling are needed, other field tests (e.g., radiotracking) need to be conducted to check for closure violations.
If capture heterogeneity is hard to avoid, plan to collect data on important covariates to explain individual heterogeneity in the model framework.
Plan your funding for a desired level of precision, considering the field effort to collect samples, the lab success rate, and the issue of minimising genotyping errors.
Laboratory analysis and interpretation of genetic results
The presence of genotyping errors, a phenomenon that can occur when working with low quality DNA samples such as faeces and hairs, causes important biases in genetic CR estimates. In recent years this topic has been heavily reviewed (e.g., Pompanon et al. 2005; Broquet et al. 2007), thus we will only briefly discuss it here. Two main types of genotyping errors occur: allelic dropout which occurs when only one allele of a heterozygous individual is detected creating an erroneous homozygote, and false alleles that can make a homozygote appear to be a heterozygote (Pompanon et al. 2005). Partial null alleles, which occur when there is a mutation at the priming site’s sequence causing primers to irregularly attach to the sequence of certain individuals, are another less common source of genotyping error (O'Connell and Wright 1997), but have been often overlooked (Wagner et al. 2007). In addition to these errors, there are human errors such as scoring and transcription errors. Most commonly, genotyping errors create the appearance of additional “false” individuals in the population that will never be recaptured unless the error is exactly repeated (Waits and Leberg 2000). Partial null alleles also create false individuals, but at least these can be recaptured, because the error is systematic (Marucco 2009).
Genotyping errors cause biases in the estimates leading to overestimation of abundance, because they increase the minimum population size detected and lower the probability of recapture (Lukacs and Burnham 2005; McKelvey and Schwartz 2004). For example, Creel et al. (2003) compared non-invasive faecal DNA results to a known population, and found that genotyping errors caused wolf population estimates to be biased upward by 5.5 times the real population size. Moreover, the presence of genotyping errors can appear as an unexplained “apparent individual heterogeneity” in recaptures and introduce biases into the estimates as the appropriate “true” model will not be supported. This apparent individual heterogeneity, which is generated by the presence of “false” individuals created by random genotyping errors, is hard to handle in a model framework, because there are no covariates related to individuals or sampling design that can be considered to explain this variance.
In addition to genotyping errors, researchers have identified a shadow effect, which can occur when individuals share the same molecular tag because an insufficient number of variable molecular markers was used (Mills et al. 2000). When the shadow effect is present, individuals cannot be distinguished from each other (Mills et al. 2000), leading to underestimation of population size. Fortunately, underestimation of abundance caused by the shadow effect can be identified using population level statistics such as probability of identity (PI) and probability of identity of siblings (PISIB), and eliminated by increasing the number of variable molecular markers used to produce a molecular tag (Mills et al. 2000). One problem with the probability of identity statistics is that they are population level statistics which assume no inbreeding or population structure, whereas most populations have some undetected level of subdivision or structured composition, whether created by landscapes, social behaviour or territorial dynamics (Ayres and Overall 2004). Ayres and Overall (2004) developed a new and more robust probability of identity test (PIAVE) which takes into account these factors, as well as the relatedness structure of the population. However, these statistics are seldom used in the literature and have not been thoroughly evaluated in natural populations. In general, considering the proliferation of molecular markers available for many taxa, the shadow effect is a less serious concern than genotyping error.
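For readers unfamiliar with these statistics, the sketch below computes PI and PISIB from allele frequencies. It assumes the commonly used per-locus expressions under Hardy-Weinberg proportions and independence among loci; it does not implement PIAVE. The allele frequencies and the function names are hypothetical.

```python
from itertools import combinations
from math import prod

def pi_locus(freqs):
    """Probability that two unrelated individuals share a genotype at one locus."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes

def pi_sib_locus(freqs):
    """Probability that two full siblings share a genotype at one locus."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4

def multilocus(per_locus_fn, loci):
    """Multiply per-locus values, assuming the loci are independent."""
    return prod(per_locus_fn(freqs) for freqs in loci)

# Hypothetical allele frequencies at three microsatellite loci.
loci = [
    [0.5, 0.5],
    [0.25, 0.25, 0.25, 0.25],
    [0.6, 0.3, 0.1],
]
pi_all = multilocus(pi_locus, loci)
pi_sib_all = multilocus(pi_sib_locus, loci)
print(f"PI over 3 loci:    {pi_all:.4f}")
print(f"PISIB over 3 loci: {pi_sib_all:.4f}")
```

Adding loci multiplies in further values below one, which is how extra markers reduce the shadow effect; PISIB falls more slowly than PI because relatives share alleles.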
Lab protocols and techniques are constantly improving and this includes developing systems which minimize errors (Kalinowski et al. 2006; Miller et al. 2002; Paetkau 2003; Valiere et al. 2002). Several approaches to minimize errors in the lab have been proposed: the multiple-tube approach (Taberlet et al. 1996), simulations to identify errors and to quantify them (Kalinowski et al. 2006; McKelvey and Schwartz 2004; Miller et al. 2002; Valiere et al. 2002), intensive error checking which involves scrutiny of pairs of genotypes which differ by only one or two alleles (1MM and 2MM pairs) and subsequent discarding of poorly performing samples (McKelvey and Schwartz 2005; Paetkau 2003), quantifying species-specific DNA prior to use (Morin et al. 2001), and several other approaches (reviewed in Bonin et al. 2004; Pompanon et al. 2005). The more approaches used, the more errors will be detected (Fig. 1). Population size estimation will be continually advanced by new DNA genotyping technologies, which will allow minimization of errors (Perkel 2008).
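The screening of near-identical genotypes can be sketched as a pairwise comparison. This is an assumed minimal implementation, not the published procedures: genotypes are stored as tuples of two-allele loci, mismatches are counted per allele as in the description above, and the function names and example genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations

def allele_mismatches(g1, g2):
    """Count alleles that differ between two multilocus genotypes.

    Each genotype is a tuple of loci; each locus is a tuple of two alleles.
    Alleles are compared as multisets, so order within a locus is ignored.
    """
    total = 0
    for locus1, locus2 in zip(g1, g2):
        shared = Counter(locus1) & Counter(locus2)
        total += 2 - sum(shared.values())
    return total

def near_match_pairs(genotypes, max_mismatch=2):
    """Return (id1, id2, mismatches) for pairs differing by 1..max_mismatch alleles."""
    flagged = []
    for (id1, g1), (id2, g2) in combinations(sorted(genotypes.items()), 2):
        d = allele_mismatches(g1, g2)
        if 1 <= d <= max_mismatch:
            flagged.append((id1, id2, d))
    return flagged

# Hypothetical genotypes: B looks like A with one allele missing at locus 1.
genotypes = {
    "A": ((100, 104), (150, 152), (200, 200)),
    "B": ((100, 100), (150, 152), (200, 200)),
    "C": ((102, 106), (154, 156), (202, 204)),
}
flagged = near_match_pairs(genotypes)
print(flagged)
```

A flagged pair is a candidate for re-amplification, not proof of an error: close relatives can also differ by very few alleles.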
An additional way to check laboratory errors is through the use of independent field data and the foresight to preserve some of the laboratory budget to re-examine any samples which produce results incongruous with field information. This is an important and critical step before initiating the CR analysis. For example, wildlife biologists could compare genotyping data to Geographic Information System (GIS) data (Smith et al. 2006), radio tracking information, and the behaviour and ecology of the study species to find incongruities in the data. The genotypes that cause these incongruities should be reanalyzed in the laboratory. Smith et al. (2006) established a series of GIS based rules that examined distances between collected scats and territory sizes on California kit fox (Vulpes macrotis mutica) to check their genetic results. Marucco (2009) used a similar technique for wolves in the Alps.
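A distance-based consistency check of this kind can be sketched as follows. The rule is an assumption made for illustration, not the rule set of Smith et al. (2006) or Marucco (2009): samples carry projected coordinates in metres, and any two samples assigned to the same genotype but separated by more than a chosen threshold are flagged. The function name, the 10 km threshold and the data are hypothetical.

```python
from itertools import combinations
from math import dist

def flag_implausible_recaptures(samples, max_distance_m):
    """Flag same-genotype sample pairs that lie farther apart than expected.

    samples: list of (sample_id, genotype_id, x, y), coordinates in metres.
    Returns a list of (genotype_id, sample_id_1, sample_id_2).
    """
    by_genotype = {}
    for sample_id, genotype, x, y in samples:
        by_genotype.setdefault(genotype, []).append((sample_id, (x, y)))

    flagged = []
    for genotype, records in by_genotype.items():
        for (id1, p1), (id2, p2) in combinations(records, 2):
            if dist(p1, p2) > max_distance_m:
                flagged.append((genotype, id1, id2))
    return flagged

# Hypothetical scats; s3 lies 30 km from the other samples of genotype G1.
samples = [
    ("s1", "G1", 0, 0),
    ("s2", "G1", 1000, 0),
    ("s3", "G1", 30000, 0),
    ("s4", "G2", 500, 500),
]
suspect = flag_implausible_recaptures(samples, max_distance_m=10000)
print(suspect)
```

The threshold has to come from the ecology of the species (for example territory size), and a flagged pair may reflect real dispersal, so the output is a list of samples to reanalyse rather than to delete.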
While all of these error detection techniques are rapidly improving, we can still expect a small percentage of residual errors in the final dataset, and a clear and universal way of estimating the error is not yet available. Generally, the level of dropout errors or false alleles reported in a genetic CR study is estimated from the ongoing lab analysis (Broquet and Petit 2004) and refers to the documented error already removed in the lab. Virtually every final data set obtained by genotyping likely contains some residual errors (Bonin et al. 2004); these errors should not be ignored because they may bias the final results.
In human forensic science, error rates in the final dataset are well discussed. Human forensic labs have extremely strict rules and have developed a series of recommendations on which data should or should not be used in an investigation (e.g., thresholds of peak height, sister peak height ratios, etc.) (Budowle et al. 2009; Penacino et al. 2003). Saks and Koehler (2005), however, extensively discussed how forensic scientists often reject error rate estimates in favour of arguments that theirs is an error-free science. Since its beginnings, forensic DNA testing has been surrounded by an aura of infallibility; nevertheless, errors may also occur in forensic DNA typing (Penacino et al. 2003). To combat this preconception, Saks et al. (2003) suggested that forensic scientists might adopt protocols, such as blind test samples, that minimize the risk that their success rates will be inflated and their conclusions biased by extraneous evidence and assumptions. Saks and Koehler (2005) even suggest having these blind external proficiency tests conducted by an agency unaffiliated with the forensic scientist's laboratory. Externality is important to the integrity of proficiency tests because laboratories have strong incentives to be perceived as error free. They also argued that the best test would be one where the analyst believes the test materials are part of their ordinary case load, so that these samples would be treated as any other samples entering the laboratory. This approach would be appropriate for non-invasive studies of wildlife, to improve the error identification process.
Overall, it is important to acknowledge that genotyping errors might not be completely eliminated by any lab protocol, and then to work with the lab to assess the quality of the final multilocus genotypes so as to have the most reliable results for subsequent CR analysis. This can be achieved by using sample quality quantification methods, such as the one suggested by Miquel et al. (2006), which provides “quality indexes” for each sample and genotype based on the sample’s performance during multiple laboratory analyses. Alternatively, new species-specific quality indices can be derived by using a combination of genetic information and independent field data based on the biology of the species.
When incongruence between the lab and the field data occur, or the quality of the final sample is considered “low”, several solutions exist. The best option may be to perform more replicates in the lab until results are deemed reliable. A second option is to remove the sample from the analysis, but the discarded samples might not be random with respect to identity, thus introducing biases into the population estimate (Creel et al. 2003; Lukacs and Burnham 2005), hence, this issue should be checked. A third approach is to consider it a low quality or uncertain sample and use it in the analysis as such (obtaining a range of genotypes, and thus subsequent CR analysis, with and without uncertain samples).
Synthesis of recommendations
Conduct a pilot study to determine sample success rates.
Select the minimum number of variable molecular markers needed to produce a molecular tag that eliminates the problem that individuals share the same molecular tag while minimising the probability of having genotyping errors.
Conduct blind laboratory tests on ordinary case samples for the species of interest to quantify the quality of the final genetic results and further understand problems of genotyping errors.
Adopt multiple approaches to detect genotyping errors in the laboratory.
Use the wildlife biologist’s knowledge of the study species and field data to pinpoint genetic results incongruous with field information.
Remove poor quality samples or treat them in a model framework as such (see next section), assessing the quality of the final multilocus genotypes.
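The marker-selection recommendation above can be illustrated with a short calculation. The sketch below uses the standard probability-of-identity formulas for unrelated individuals and for full siblings under Hardy-Weinberg proportions; it is one common way of judging whether a molecular tag is unique enough, not necessarily the procedure intended here. The function names, the allele frequencies, and the 0.01 threshold are hypothetical placeholders, and in practice the frequencies would come from a pilot study.

```python
def pid_locus(freqs):
    """Probability that two unrelated individuals share a genotype at one locus."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * freqs[i] * freqs[j]) ** 2
                        for i in range(len(freqs))
                        for j in range(i + 1, len(freqs)))
    return homozygotes + heterozygotes


def pid_sib_locus(freqs):
    """Same probability for full siblings (a more conservative bound)."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def loci_needed(loci, threshold, per_locus=pid_sib_locus):
    """Smallest number of loci, taken in the given order, whose combined
    probability of identity falls below the threshold (None if never reached)."""
    combined = 1.0
    for k, freqs in enumerate(loci, start=1):
        combined *= per_locus(freqs)  # assumes loci are independent
        if combined < threshold:
            return k
    return None


# Hypothetical panel: ten loci, each with four equally frequent alleles.
panel = [[0.25, 0.25, 0.25, 0.25]] * 10
n_sib = loci_needed(panel, 0.01)                            # sibling-based criterion
n_unrelated = loci_needed(panel, 0.01, per_locus=pid_locus)  # unrelated criterion
```

Because every added locus also adds opportunities for genotyping error, a calculation of this kind gives only the lower bound on the number of markers; the upper bound depends on the error rates measured in the lab.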
CR analysis of a non-invasive genetic dataset
There are many reasons to collect non-invasive genetic samples for species identification. Sometimes the simple detection of a rare species is valuable information (Dematteo et al. 2009; Zielinski et al. 2006), while at other times a collection of these detections is used for occupancy modelling (MacKenzie 2006). When non-invasive samples are collected opportunistically or haphazardly, they may prove difficult to use for estimating abundance in a CR analysis because of inadequate recapture rates. Rarefaction curves might be used in this case, but they will be biased under nearly all conditions (e.g., Eggert et al. 2003; Frantz and Roper 2006; Kohn et al. 1999).
Moreover, some unique situations can occur with genetic recaptures that offer opportunities but remain difficult to include in CR analysis: the recapture of an individual (e.g., from collected faeces) after the individual is found dead, or multiple detections of a single individual within the same day (e.g., clusters of faeces at latrines, kill sites, or resting sites are frequently from the same individual). So far, these kinds of data have been discarded or lumped together into a single detection to allow use of standard CR analysis. However, there are opportunities to take advantage of this rich genetic data stream. For example, solutions have been developed for the situation where there are multiple detections within a single sampling occasion, something which is not common in traditional CR analysis. Miller et al. (2005) present ways to take advantage of these extra data. Methods to estimate population size from single sampling sessions have been developed using maximum likelihood or Bayesian estimators (Gazey and Staley 1986; Petit and Valiere 2006; Puechmaille and Petit 2007). Lukacs (2005) developed alternative methods for using multiple encounters of an individual within a sampling occasion to estimate population size. These models still rely on the assumptions that the population is stable, closed, and has no capture heterogeneity. Furthermore, these sampling methods are limited because they can only be used to estimate population size, and do not allow estimation of survival or other demographic parameters that can be readily estimated with a multi-session approach (Lebreton et al. 1992).
If individuals are sampled often enough to estimate recapture probabilities over sessions, CR analysis should be applied. The strengths of CR analyses are that they can account for the sampling design, are designed to study the process that generates the data (Lukacs and Burnham 2005), and can be more efficient, producing robust estimates of abundance with given levels of precision (Nichols 1992). CR analyses can also estimate population parameters other than abundance, which is important for management and conservation. Unfortunately, CR analyses are also data intensive.
Lukacs and Burnham (2005) provided an initial overview of the CR analysis of a non-invasive genetic dataset, considering the pros and cons of using a closed model, open model, or a robust design. We particularly concur with one of their principal assertions: before running analyses of population size, the dataset should be checked for capture heterogeneity and any other assumption violations. In particular, with genetic datasets, other important assumptions, such as no animals having lost their marks and all marked animals being correctly reported (i.e., in genetic studies these mean no genotyping errors), should also be tested. Most CR models (e.g., the Cormack–Jolly–Seber (CJS) model) are tested for assumption violations using goodness of fit tests (Lebreton et al. 1992, 1993). Once fit is assessed and a model is chosen (see Lukacs and Burnham 2005), two main issues unique to genetic CR modelling remain: how to handle the presence of the remaining genotyping errors in the dataset, and how to handle the individual heterogeneity in recaptures, if present.
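The lumping of repeated detections into a single detection per sampling occasion, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the (genotype, date) input layout, and the example identifiers and dates are hypothetical, and samples falling outside every occasion are simply ignored.

```python
from collections import defaultdict
from datetime import date


def build_capture_histories(detections, occasions):
    """Collapse genotyped samples into one 0/1 entry per individual and occasion.

    detections: iterable of (genotype_id, sample_date) pairs.
    occasions:  list of (start_date, end_date) tuples, inclusive, in time order.
    Returns {genotype_id: [0 or 1 for each occasion]}.
    """
    histories = defaultdict(lambda: [0] * len(occasions))
    for genotype_id, sample_date in detections:
        for k, (start, end) in enumerate(occasions):
            if start <= sample_date <= end:
                # Repeated samples within an occasion count as one detection.
                histories[genotype_id][k] = 1
                break
    return dict(histories)


# Hypothetical example: two monthly occasions and four genotyped scats.
occasions = [(date(2010, 1, 1), date(2010, 1, 31)),
             (date(2010, 2, 1), date(2010, 2, 28))]
detections = [("W1", date(2010, 1, 5)),
              ("W1", date(2010, 1, 5)),   # second scat at the same latrine
              ("W1", date(2010, 2, 10)),
              ("W2", date(2010, 2, 11))]
histories = build_capture_histories(detections, occasions)
```

The resulting 0/1 histories are the usual input for standard CR software. The information discarded in the lumping step (here, the duplicate detection of W1) is exactly what the within-occasion methods cited above try to exploit.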
Modelling genotyping error
CR analysis can be the last check of the genetic dataset (Fig. 1). In fact, Paetkau (2003) and Lukacs and Burnham (2005) reported that datasets which have not been heavily scrutinized by both geneticists and ecologists often show either geographical closure violations or heterogeneity in recaptures without biological explanations. The apparent violations of assumptions often disappear when the datasets are heavily scrutinized and errors removed (Paetkau 2003). Marucco (2009) noticed that the probability of detection (i.e., the recapture rates) did not decrease with an increase in population size through time, and the estimates of population size did not increase with increases in sample size. These two indices are a good indirect check that there are no major genotyping errors in the dataset.
We suggest two possible ways to deal with the presence of the remaining genotyping error. If it is believed that the error rate can be considered “negligible” (i.e., it passes the screens such as implemented in McKelvey and Schwartz 2005), then an initial step would be to conduct the analysis without including the genotyping error rate. However, it is advisable to subsequently simulate different levels of error to evaluate the potential impact of errors on the estimates of population size. This simulation allows one to determine if indeed the error rate is negligible on the final population estimate. Moreover, if the quality of samples has been evaluated (e.g., Miquel et al. 2006), it is possible to estimate abundance using different quality levels of samples to identify possible biases. Genotyping errors lead to different patterns of misidentification in CR data (see “Sampling design for genetic “captures”: how to improve accuracy starting with DNA sample collection”), and their effects on estimates differ depending on how and which errors are introduced into the data. Building an appropriate model for CR data with genotyping errors requires a clear understanding of the misidentification mechanism; when different patterns of misidentification occur simultaneously (which is often the case), it is very difficult to build a likelihood-based model to analyse CR data. Simulations of the misidentification patterns are now a good solution to explore the effects of the simultaneous presence of different types of errors. Yoshizaki (2007) has developed models for performing such simulations.
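A simulation of the kind suggested above can be sketched in a few lines. The example below is a deliberately simple illustration, not the models of Yoshizaki (2007): it assumes a closed population sampled in two sessions, a single misidentification pattern in which each misgenotyped sample is recorded as a new "ghost" individual seen only once, and Chapman's form of the two-session Lincoln-Petersen estimator, which is chosen here only for convenience and is not a method named in the text. All function names and parameter values are hypothetical.

```python
import random


def chapman(n1, n2, m2):
    """Chapman's bias-corrected two-session estimator of closed population size."""
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1


def simulate_mean_estimate(true_n, p_capture, error_rate, n_reps, seed=1):
    """Mean Chapman estimate when each captured sample is misgenotyped with
    probability error_rate and then recorded as a new, unique ghost individual."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        n1 = n2 = m2 = 0
        for _ in range(true_n):
            caught1 = rng.random() < p_capture
            caught2 = rng.random() < p_capture
            # A capture keeps its true identity only if it is genotyped correctly.
            correct1 = caught1 and rng.random() >= error_rate
            correct2 = caught2 and rng.random() >= error_rate
            n1 += caught1  # a ghost still counts as one genotype in session 1
            n2 += caught2
            m2 += correct1 and correct2  # a match needs two correct genotypes
        total += chapman(n1, n2, m2)
    return total / n_reps


# Hypothetical scenario: 200 animals, capture probability 0.4 per session.
for error_rate in (0.0, 0.05, 0.10):
    mean_estimate = simulate_mean_estimate(200, 0.4, error_rate, n_reps=300)
    print(error_rate, round(mean_estimate))
```

Under this ghost-only pattern the errors remove true recaptures without reducing the number of genotypes observed, so the estimate is pushed upward. Other patterns (for example the shadow effect, or the same error occurring twice) would need their own simulation rules, which is why the misidentification mechanism has to be understood before the simulation is built.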
A second option is to use CR models that have been developed to include a parameter that estimates the genotyping error rate (Lukacs 2005). In the past, corrections have been developed for errors in identification using markers other than DNA (Stevick et al. 2001), or for tag-misreading (Schwarz and Stobo 1999). These methods account only for false negative errors in identifications, which only decrease the probability of detection, and have not been applied to genotyping errors, which also produce false individuals caught only once, or false individuals caught multiple times. The models developed by Lukacs (2005), which incorporate genotyping errors, are for closed population models and robust designs, and rely on multiple realistic assumptions, such as that the shadow effect does not exist, that two genotyping errors are never the same, and that a genotyping error does not produce the same genotype as an existing “real” individual (Lukacs and Burnham 2005). One potential problem with these models is that they cannot separate the effects of a closure violation from potential genotyping errors: both can lead to an overabundance of individuals captured only once. Unfortunately, closure is often violated to some extent; hence, these models are only useful if a lab-based residual error rate exists (which is often difficult to estimate). Moreover, the models have a problem with the model structure and identifiability of the parameters (Yoshizaki 2007). Recently, other CR models have been developed which incorporate genotype errors for estimating abundance using DNA samples (Knapp et al. 2009; Wright et al. 2009). It will be interesting to test model performance in several real case studies and situations where abundance is known.
Dealing with individual heterogeneity
As noted above, it is important to first try to minimize individual capture heterogeneity issues using an optimal sampling scheme (Fig. 1). If individual heterogeneity exists in the dataset (this should be tested with goodness of fit tests), then the solution is to model it, explaining this variance by using covariates related to individuals, their behaviour, or other important biological and environmental variables. Studies should be designed to ensure that sample sizes (i.e., capture probabilities) are high enough so that models are able to detect heterogeneity if present and be robust to capture heterogeneity (Williams et al. 2002). A flexible framework of likelihood-based models which allows for individual heterogeneity in survival and capture rates has been developed for closed and open CJS models (Pledger et al. 2003). A large literature is available to deal with individual heterogeneity (Williams et al. 2002); the key to this is ensuring adequate encounter rates.
In a simulation study, Roon et al. (2005) found that estimators such as Mh-Jackknife or Mh-Chao are highly sensitive to the probability of recapture and thus may exacerbate the impact of genotyping errors, suggesting that heterogeneity estimators in closed population models should be used with caution in non-invasive genetic studies. In fact, one of the most difficult problems facing estimation of animal abundance is the unexplained individual heterogeneity, or “apparent” individual heterogeneity (Lukacs and Burnham 2005; Pledger and Efford 1998; Prugh et al. 2005), which can be caused by genotyping errors, and/or other issues such as the presence of transient individuals. Models that take into account cases of “apparent” individual heterogeneity resulting from the capture of animals just passing through a population of resident animals (i.e., transients) have been developed (Pradel et al. 1997a, b). In fact, transience, like genotyping errors, can make it appear that there is more individual heterogeneity than there really is. Modelling transience explicitly is a way to explain apparent individual heterogeneity in CR data. Cubaynes et al. (2010) estimated the size of an open population of wolves in France with non-invasive genetic data, where they detected heterogeneity and did not have data on possible covariates which could explain this heterogeneity. Therefore, they developed a multi-event CR model where they considered a two-class mixture model with weakly and highly detectable individuals to account for individual detection heterogeneity. They found an underestimation of population size up to 27% when individual heterogeneity was ignored. This is an interesting solution for modelling heterogeneity, especially when covariates related to age or social status are not available for the captured individuals, which is a typical case when dealing with genetic datasets.
Future studies could include estimates of an individual’s age derived directly from non-invasive DNA samples (Luikart et al. 2010). Such estimates would provide covariates linked to genotypes and so improve non-invasive CR population size estimation.
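The sensitivity of heterogeneity estimators to false singleton genotypes can be seen in a small worked example. The sketch below uses the basic form of Chao's lower-bound estimator, N = S + f1²/(2·f2), where S is the number of observed genotypes and f1 and f2 are the numbers captured exactly once and exactly twice; it is assumed here that this form corresponds to what the text calls Mh-Chao. The capture counts are invented for illustration, and for simplicity the ghosts are added as extra singletons without lowering the counts of the true individuals they came from.

```python
from collections import Counter


def chao_mh(capture_counts):
    """Chao's lower-bound estimate N = S + f1^2 / (2 * f2).

    capture_counts: number of times each observed genotype was captured.
    """
    freq = Counter(capture_counts)
    observed = len(capture_counts)
    f1, f2 = freq[1], freq[2]
    if f2 == 0:
        raise ValueError("undefined when no genotype was captured exactly twice")
    return observed + f1 ** 2 / (2 * f2)


# Hypothetical dataset: 60 genotypes seen once (30), twice (15) or three times (15).
clean = [1] * 30 + [2] * 15 + [3] * 15
# The same dataset with five false genotypes, each seen only once.
with_ghosts = clean + [1] * 5

estimate_clean = chao_mh(clean)         # 60 + 30**2 / 30
estimate_ghosts = chao_mh(with_ghosts)  # 65 + 35**2 / 30
```

In this example five ghosts, about 8% more observed genotypes, raise the estimate from 90 to roughly 106, an increase of about 18%. The singleton count enters the formula squared, so false singletons are amplified rather than merely added.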
Synthesis of recommendations
Assess the fit of the CR model and the extent to which its assumptions are violated.
Acknowledge the possible presence of genotyping error and consider how various levels and types of genotyping error will impact the final demographic estimate through simulations.
Use the newer, more sophisticated models that incorporate genotyping errors, taking into consideration that some error patterns have not been considered yet.
If individual heterogeneity is present, try to model it with respect to the sampling design.
Management implications and future needs
Genetic CR analysis is a highly promising tool to estimate population parameters and monitor populations through time (Schwartz et al. 2007). However, some of the studies mentioned in this paper and the discussion developed above demand attention and follow-up investigations. We suggest that in general it will be important that genetic CR analyses increase candour about performance and create pressure for improvement. Tests of the overall technique in controlled settings such as captive facilities (i.e., genetic CR analysis based on known size populations) are strongly needed.
It is interesting that most studies which applied genetic CR analyses produced population size estimates 30–50% larger than estimates obtained with traditional methods (e.g., Cubaynes et al. 2010; Guschanski et al. 2009; Kendall et al. 2009; Marucco 2009; Solberg et al. 2006; Zhan et al. 2006). For instance, Zhan et al. (2006) reported that the molecular census of the giant panda (Ailuropoda melanoleuca) in China was double that previously estimated with traditional methods. This is likely due to the application of CR analyses, which account for undetected individuals and so provide more accurate estimates. However, as discussed above, the presence of residual genotyping errors in final datasets used for CR analysis is still not well quantified, and such errors can cause overestimation of abundance. It will be important in the future to investigate this aspect further because these genetic CR estimates might be used for management and conservation decisions.
If ecologists want to use this technique for the first time, they need to carefully test for the possible biases cited in this paper, and plan to continue using traditional methods to estimate population size as a baseline for comparison. We emphasised the problem of looking at errors independently at each of the three steps of the approach, because errors accumulate and can propagate. Improving the sampling design to increase accuracy, developing species-specific protocols for specific objectives, improving genetic lab protocols, and improving the ecological interpretation of the genetic results will additively improve CR estimates (Fig. 1). This process requires a strong collaboration between geneticists, wildlife ecologists, and statisticians knowledgeable in capture–recapture. Ecologists must have a solid understanding of genetic techniques and understand that genotyping errors are inherent in the system and cannot be totally removed, while geneticists should fully recognize the important insights which come from the field data and not be upset by being asked to conduct specific blind tests. Overall, close collaboration between ecologists and geneticists is fundamental to correctly amalgamate field data with genetic data and define species-specific protocols for assessing quality in genetic CR analysis.
However, limitations and specific sources of error will always be present when using a non-invasive dataset in a CR analysis. This can have huge implications for management of rare species, especially if biases are present and standard errors of estimates are large (Begon 1983). Future efforts should be focused on quantifying the residual genotyping errors in a standardized way and on making better use of ecological knowledge in the error checking process. Even if the first two parts of the process are conducted properly, modellers should still try to solve individual heterogeneity problems and incorporate the different patterns of residual genotyping errors in models, topics currently of interest to researchers.
As this field of genetic CR advances, it will be possible to apply the technique to more complex situations already well developed in traditional CR studies, such as the estimation of survival rates, movement or transition rates, recruitment, and population growth (Williams et al. 2002). Moreover, it is possible to also apply multistate CR models (Lebreton and Pradel 2002) which are sophisticated models for handling heterogeneity of capture and for investigating spatial aspects of metapopulation dynamics. The great promise with genetic CR analysis is that this will be possible for small or endangered populations, where disturbance to the animal is minimised. The non-invasive genetic method, besides being suited for large-scale monitoring (e.g., Flagstad et al. 2004; Kendall et al. 2009), also has several other advantages. The genetic data obtained from the non-invasive analysis contain information that could be used for additional purposes not related to estimating population size, such as estimating effective population size (Luikart et al. 2010) and other genetic parameters (i.e., structure, gene flow, or relatedness), although the number of markers required for this type of analysis might be higher than the number required for individual identification. In this way, with just one monitoring approach, a population can be demographically and genetically monitored over time and at a large scale.
We thank K. Griffin, P. Ciucci, and J. Boulanger for helpful comments on the first draft of this paper. F. Marucco was supported by the Regione Piemonte, Progetto Lupo Piemonte, Parco Naturale Alpi Marittime. M. Schwartz was supported by a PECASE award during the writing of this manuscript.