Validation of qualitative PCR methods on the basis of mathematical–statistical modelling of the probability of detection

Abstract

A new model for the probability of detection (POD curve) for qualitative PCR methods examined in a method validation collaborative study is presented. The model allows the calculation of the POD curve and the limit of detection (LOD 95%), i.e. the number of copies of the target DNA sequence required to ensure 95 % probability of detection. The between-laboratory variability of the limit of detection is used to derive the between-laboratory reproducibility of the PCR method. The model is closely related to the approach for quantitative methods described in ISO 5725-2:1994, and the relative limit of detection approach described in the new standard ISO 16140-2:2014.

Introduction

Qualitative analytical methods like qualitative PCR tests are binary methods resulting in a “yes” or a “no”, i.e. either the target sequence tested for is present and detectable in the DNA test sample or it is not. The validation of binary test methods according to ISO standards or guidelines from AOAC INTERNATIONAL is based on collaborative trials in which identical test samples are analysed in different laboratories. An important question is how to define corresponding validation criteria, that is to say, the requirements that the analytical performance of a binary test method must meet.

Various authors have set out to define such validation criteria or have examined whether and to what extent established criteria for quantitative methods can also be applied in the case of binary methods. The performance of a binary method yields one single result, either negative or positive, which can be coded with the numerical values 0 or 1. When a laboratory performs a large number of test repetitions on the same sample material, the mean of the numerical results will lie very close to the probability of obtaining a positive result. Wehling et al. [3] introduced the term probability of detection (POD) to denote this probability. For qualitative (binary) PCR methods, the probability of obtaining a positive result is generally considered to be dependent upon the number of copies of the target sequence present in the DNA test sample. For most well-designed qualitative PCR methods, the probability of a positive response should be 0 if the target DNA sequence is not present, and the probability should tend to 1 as the number of copies of target DNA sequences increases.

Several authors discuss statistical methods to derive repeatability and reproducibility standard deviations of the test results for qualitative test methods. Wilrich [4] and Wehling et al. [3] use analysis of variance (ANOVA) with random effects in order to derive repeatability and reproducibility standard deviations. Using this same approach, Wehling et al. [3] also calculate a confidence interval for the theoretical mean of the POD across the laboratory population.

Bashkansky et al. [5] and Gadrich et al. [6] propose another approach based on “ORDANOVA” (Gadrich and Bashkansky [7]) for the calculation of precision data. This approach is closely related to the one proposed by Wilrich [4] and Wehling et al. [3]. It is applicable not only for binary test methods but also for methods using ordinal scales for reporting test results.

Uhlig et al. [8] demonstrate that the reproducibility variance as proposed by both Wilrich [4] and Wehling et al. [3] depends only on the theoretical mean of the POD across the laboratory population. Indeed, on the basis of the formulas suggested by Wilrich [4] and Wehling et al. [3] for the repeatability variance \(\sigma_{\text{r}}^{2}\) and the between-laboratory variance \(\sigma_{\text{L}}^{2}\), and for a finite laboratory population with K laboratories, we have

$$\sigma_{\text{R}}^{2} = \sigma_{\text{r}}^{2} + \sigma_{\text{L}}^{2} = \frac{1}{K}\mathop \sum \limits_{i = 1}^{K} p_{i} \left( {1 - p_{i} } \right) + \frac{1}{K}\mathop \sum \limits_{i = 1}^{K} \left( {p_{i} - p} \right)^{2} = p\left( {1 - p} \right),$$
(1)

where \(p_{i}\) denotes the theoretical mean of the POD of laboratory i, and \(p = \frac{1}{K}\sum\nolimits_{i = 1}^{K} {p_{i} }\) denotes the theoretical mean of the POD across the laboratory population. Since this reproducibility variance depends on p alone, it cannot reasonably be adopted as a measure of the precision of the test method. On the other hand, the between-laboratory variance \(\sigma_{\text{L}}^{2}\) does not depend on p alone and could be taken as a measure of precision. The drawback is that, in practice, \(\sigma_{\text{L}}^{2}\) is often very small. Wilrich himself writes that “generally, the between-laboratory standard deviation is much smaller than the repeatability standard deviation so that the reproducibility standard deviation is only slightly larger than the repeatability…” [4]. In fact, the between-laboratory standard deviation is often estimated to be zero. For this reason, its usefulness as a measure of precision may be rather limited. As far as the repeatability variance is concerned, it can be seen from Eq. (1) that it is uniquely determined by the reproducibility and the between-laboratory variances and thus does not provide any extra information as to the precision of the method.

It is important to point out that, since the reproducibility standard deviation depends on p alone, if two different methods have the same p, the reproducibility variance will be the same for the two methods even if one of them displays considerably more variability from one laboratory to another. This is because, in order for p to remain constant, an increase in the between-laboratory variance must be accompanied by a reduction in the repeatability variance. In other words, the reproducibility variance cannot be partitioned into two independent variance components. This is not reconcilable with the approach adopted in ISO 5725-2 [1].
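
The identity in Eq. (1) can be checked numerically. The following is a minimal sketch only: the laboratory-specific POD values are hypothetical and the function name `variance_components` is an illustrative choice, not part of the cited approaches. It shows two laboratory populations with the same mean POD but very different between-laboratory spread yielding the same reproducibility variance.

```python
def variance_components(pods):
    """Return (sigma_r^2, sigma_L^2, sigma_R^2) for a finite laboratory
    population, following the decomposition in Eq. (1)."""
    k = len(pods)
    p = sum(pods) / k                                  # mean POD across laboratories
    var_r = sum(pi * (1 - pi) for pi in pods) / k      # repeatability variance
    var_l = sum((pi - p) ** 2 for pi in pods) / k      # between-laboratory variance
    return var_r, var_l, var_r + var_l


# Hypothetical laboratory PODs: same mean (0.5), different spread.
homogeneous = [0.50, 0.50, 0.50, 0.50]
heterogeneous = [0.10, 0.30, 0.70, 0.90]

for pods in (homogeneous, heterogeneous):
    var_r, var_l, var_R = variance_components(pods)
    p = sum(pods) / len(pods)
    print(f"sigma_r^2={var_r:.4f}  sigma_L^2={var_l:.4f}  "
          f"sigma_R^2={var_R:.4f}  p(1-p)={p * (1 - p):.4f}")
```

In both cases the sum of the two components equals p(1 − p), so an increase in the between-laboratory variance is offset exactly by a decrease in the repeatability variance.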

An approach that, on the one hand, allows for independent repeatability and between-laboratory variances and, on the other, adopts and expands the general POD framework introduced by Wehling et al. [3] was described in Uhlig et al. [8] and Uhlig et al. [9]. The basic idea for this approach is to introduce a latent variable. For many chemical binary test methods, detection is based on whether a quantitative response exceeds a certain threshold. Accordingly, a non-observable (i.e. latent) random variable X is introduced, and it is assumed that a positive outcome of the binary test occurs if and only if X > 0. The latent variable X itself follows the statistical model used in ISO 5725-2 [1] using the equation \(X = \mu + b + e\) with mean value μ and two stochastically independent normally distributed random variables b and e with zero mean. The two random variables b and e represent the laboratory-specific bias and the deviations under repeatability conditions, respectively. Moreover, this latent-variable approach makes it possible to compute p and a confidence interval for p [10].

This approach is only applicable if the concentration level in different test portions (e.g. different aliquots) can be considered constant. This is mostly true, e.g. for chemical substances. However, in the case of qualitative PCR methods, such an assumption is not appropriate. Indeed, if the mean number of copies per test portion of the target DNA sequence (at a particular dilution level) is very low, say 1.5 copies per test portion, an individual test portion may contain 0, 1, 2 or 3 copies—thus, the number of copies per test portion of the target DNA sequence (at a particular dilution level) cannot be considered constant.

Since the latent-variable approach is not applicable, an alternative approach is presented in this paper.

Modelling

POD and Poisson distribution

In order to describe the relationship between the POD and the mean number of copies per test portion of the target DNA sequence at a particular dilution level, a few statistical assumptions must be made. It is assumed that the number of copies per test portion of the target DNA sequence for a particular dilution level is subject to random variation. Provided that the DNA molecules do not interact, this random distribution of copies per test portion at a particular dilution level can be described by the Poisson distribution.

It is furthermore assumed that every target DNA sequence copy present in the reaction mixture is successfully amplified above a certain critical threshold over the course of a certain number of PCR cycles.

Let x denote the mean number of copies per test portion of the target DNA sequence for a particular dilution level. The probability that a particular DNA test portion (at the particular dilution level under consideration) contains precisely k copies is then

$$p_{k} = \frac{{x^{k} }}{k!}\exp ( - x),{\text{ for all}}\quad k = 0, 1, 2, 3, 4, \ldots$$
(2)

The probability that a particular DNA test portion (at the particular dilution level under consideration) contains at least one copy (k > 0) is the complement \(1 - p_{0}\) of the probability that the test portion contains no DNA copies:

$$1 - p_{0} = 1 - { \exp }( - x).$$
(3)

Let POD denote the probability that the target DNA sequence be detected at a particular dilution level. Under the assumption that all target sequences are detected, the resulting POD (as a function of x) is

$${POD} = 1 - \exp \left( { - x} \right).$$
(4)

In microbiology, the above assumptions are often called the single-hit Poisson model (SHPM), and the above POD function is the basis for the commonly used MPN models (most probable number).
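
As an illustration of Eqs. (2) to (4), the sketch below evaluates the Poisson probabilities and the resulting ideal POD for a mean of 1.5 copies per test portion, the value mentioned in the Introduction. The function names `poisson_pmf` and `pod_ideal` are hypothetical labels chosen for this example.

```python
import math


def poisson_pmf(k, x):
    """Probability that a test portion contains exactly k copies, Eq. (2)."""
    return x ** k / math.factorial(k) * math.exp(-x)


def pod_ideal(x):
    """POD when every copy present is detected, Eq. (4)."""
    return 1.0 - math.exp(-x)


x = 1.5  # mean number of copies per test portion
for k in range(5):
    print(f"P(k={k}) = {poisson_pmf(k, x):.4f}")
print(f"POD = 1 - P(k=0) = {pod_ideal(x):.4f}")
```

The printed probabilities make the point of the Introduction concrete: at such a low mean, the number of copies in an individual test portion varies appreciably, and a noticeable share of test portions contain no copy at all.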

In order to avoid cumbersome prose, we will use the phrases “number of DNA copies” or “nominal number of DNA copies”, and the symbol x to denote the mean number of copies per test portion of the target DNA sequence at a particular dilution level.

POD, amplification probability and LOD 95%

Unfortunately, the assumption that every target DNA sequence is successfully amplified must be modified to account for the fact that, over the course of several PCR cycles, copy failures and enzymatic inconsistencies will occur. Let λ denote the probability that a randomly selected copy of the target DNA sequence is successfully amplified above a certain critical threshold over the course of a certain number of PCR cycles. The parameter λ will henceforth be called the average amplification probability.

For example, if x = 10 and if all target sequences are detected, then Eq. (4) can be applied and POD = 1 − exp(−10) is obtained. However, if only half the target sequences are detected (λ = 1/2), then the situation is the same as if only half the target sequences were present. Eq. (4) can still be applied, but now with \(x = (1/2) \cdot 10\) , and thus, POD = 1 − exp(−5) is obtained. In other words, in order to take the average amplification probability into account, \(\lambda \cdot x\) is substituted for x in Eqs. (3) and (4), and the following expression for the POD is obtained:

$${POD}= 1 - \exp \left( { - \lambda \cdot x} \right) .$$
(5)

Please note that, for now, λ is assumed to be independent of the number of DNA copies. In the next section, the model will be expanded in order to allow for λ to depend on x. However, before doing so, a few comments are due.

Let LOD 95% denote the mean number of DNA copies per test portion required to ensure that the POD is equal to 0.95. This parameter can be determined by solving the equation

$$0.95 = {POD} = 1 - \exp \left( { - \lambda \cdot x} \right)$$
(6)

for x = LOD 95%. Rearranging, we obtain

$$0.95 = 1 - \exp \left( { - \lambda \cdot {LOD}_{95\% } } \right)\,\ \Leftrightarrow \,\exp \left( { - \lambda \cdot {LOD}_{95\% } } \right) = 0.05\,\ \Leftrightarrow \, - \lambda \cdot {LOD}_{95\% } = \ln \left( {0.05} \right) .$$
(7)

Thus, a relationship between the average amplification probability λ and the limit of detection LOD 95% can be established:

$$\begin{gathered}{LOD}_{95\% } = \frac{ - \ln (0.05)}{\lambda } = \frac{2.996}{\lambda },\\{\text{ which is equivalent to }}\\ \lambda \cdot {LOD}_{95\% } = 2.996\end{gathered}$$
(8)

The product of the average amplification probability λ and the limit of detection LOD 95% has thus been determined to be approximately 3. This relationship can prove very useful in practice, because either one of the two parameters λ and LOD 95% is sufficient in order to obtain the functional representation of the POD curve described above. Note that for the ideal case λ = 1, we obtain

$${LOD}_{95\% } = - \ln \left( {0.05} \right) = 2.996 \approx 3 .$$
(9)
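
The relationship between λ and LOD 95% in Eqs. (5) to (9) can be sketched as follows. The function names `pod` and `lod95` are hypothetical, and the λ values are chosen purely for illustration.

```python
import math


def pod(x, lam=1.0):
    """POD for a constant average amplification probability lam, Eq. (5)."""
    return 1.0 - math.exp(-lam * x)


def lod95(lam):
    """Mean number of copies giving POD = 0.95, Eq. (8)."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return -math.log(0.05) / lam


for lam in (1.0, 0.5, 0.25):  # illustrative values only
    x95 = lod95(lam)
    print(f"lambda={lam:.2f}  LOD95={x95:.3f}  "
          f"lambda*LOD95={lam * x95:.3f}  POD(LOD95)={pod(x95, lam):.3f}")
```

Halving λ doubles LOD 95%, while the product of the two stays at −ln(0.05).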

It may happen in practice that λ > 1 is computed. This may be the result of inaccuracies in the determination of the number of DNA copies in the validation study test portions. In order to include such cases within the proposed framework, we only require λ > 0. The result λ > 1 will be obtained if the nominal value for the number of DNA copies used to compute λ is less than the actual value. Let us clarify this point. Suppose that the nominal number of DNA copies is x and that every copy is detected. Provided the distribution of DNA copies in the samples is random, the maximum value for the POD, corresponding to λ = 1, would then be

$$1 - \exp \left( { - \lambda \cdot x} \right) = 1 - \exp \left( { - x} \right).$$
(10)

Even if clumping of DNA is present, e.g. on account of electrostatic forces, the POD cannot exceed this value. If an average amplification probability λ > 1 is computed, then the corresponding POD is higher than is theoretically possible. This, in turn, suggests either that the POD is affected by false-positive results or that the value for x used to compute λ is less than the true number of DNA copies.

POD curve

For the computation of LOD 95% as outlined above, we have considered one single dilution level with its number of DNA copies x. If the POD corresponding to x is known, we can determine the average amplification probability λ and the limit of detection LOD 95%. The problem is that the value of the POD can only be approximately determined. More than 100 replicate tests would be required in order to obtain an expanded uncertainty <0.1, i.e. <10 %. Accordingly, it makes sense to include all the dilution levels under examination in the computation of LOD 95% as is done for MPN methods.
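
The figure of about 100 replicates can be reproduced under two assumptions that are not spelled out in the text: the expanded uncertainty is taken as twice the binomial standard error of the observed proportion, and the worst case POD = 0.5 is considered. The sketch below is illustrative only, and `expanded_uncertainty` is a hypothetical helper.

```python
import math


def expanded_uncertainty(pod, n, coverage_factor=2.0):
    """Assumed definition: coverage factor times the binomial standard
    error of a POD estimated from n replicate tests."""
    return coverage_factor * math.sqrt(pod * (1.0 - pod) / n)


for n in (6, 25, 100, 400):
    print(f"n={n:4d}  U={expanded_uncertainty(0.5, n):.3f}")
```

Under these assumptions the expanded uncertainty reaches 0.1 at exactly 100 replicates and only falls below it for larger n, whereas 6 replicates leave an uncertainty of roughly 0.4.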

The following aspects must be taken into account for PCR assays:

  1. Matrix effects resulting in a decrease in the amplification efficiency may reduce the amplification probability. These inhibitory matrix effects depend among other things on the number of DNA copies, with less or no inhibition at lower DNA concentrations. It is thus important that the dilution levels lie within the working range of the PCR assay in order to ensure that the average amplification probability does not depend on the matrix.

  2. It cannot be excluded that the nominal number of DNA copies at a given dilution level in a particular sample has been incorrectly determined and is subject to a systematic error, e.g. caused by an incorrect dilution procedure or an incorrect procedure for the determination of the number of DNA copies. Furthermore, systematic errors affecting the determination of the nominal number of DNA copies at the first dilution level must be taken into consideration. Indeed, the determination of the original number of DNA copies is subject to uncertainties which will propagate (see Grohmann et al. [11]). This is true both for a test sample obtained by dilution from a stock solution and for a sample produced in advance of the validation study and prepared by another procedure.

  3. Finally, it is not improbable that the average amplification probability λ is subject to random variation under repeatability conditions, just like any other quantity. Such random variation can decrease the slope of the POD curve compared to that corresponding to a constant amplification probability. In order to clarify this point, the following very simple, though admittedly not very realistic, example is considered. In Fig. 1, POD curves are displayed for amplification probabilities of 100 % and 10 %. If both cases can occur with the same probability, on account of random variation, we obtain the curve of POD means. Please note that the latter is obtained not by setting λ = 0.55 in Eq. (5), but rather by computing the mean of the values from the first two curves at each x value. As can be seen, the slope of the curve of POD means is less than that of each of the other two curves.

    Fig. 1 POD curves for λ = 100 % and λ = 10 % (dashed and dotted lines, respectively), and the resulting curve of POD means (solid line) corresponding to random variation of the average amplification probability at each x value
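
The difference between the curve of POD means and the POD curve at the mean amplification probability, discussed in the third item above, can be tabulated with a short sketch. It uses the two amplification probabilities of the example (100 % and 10 %); the function names and the chosen x values are assumptions made for illustration.

```python
import math


def pod(x, lam):
    """POD for a constant average amplification probability, Eq. (5)."""
    return 1.0 - math.exp(-lam * x)


def pod_mean(x, lams=(1.0, 0.1)):
    """Mean of the POD values when each lam occurs with equal probability."""
    return sum(pod(x, lam) for lam in lams) / len(lams)


print("    x  POD(1.0)  POD(0.1)  mean POD  POD(0.55)")
for x in (0.5, 1, 2, 5, 10, 20):
    print(f"{x:5}  {pod(x, 1.0):8.3f}  {pod(x, 0.1):8.3f}  "
          f"{pod_mean(x):8.3f}  {pod(x, 0.55):9.3f}")
```

Because 1 − exp(−λx) is concave in λ, the mean of the two POD values lies below the POD evaluated at λ = 0.55 for every x > 0, which is why the two curves must not be confused.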

In order to mitigate the risks associated with the various sources of error, the following expanded model for the POD function is introduced:

$${POD}(x) = 1 - \exp \left( { - \lambda \cdot x^{b} } \right)$$
(11)

In this expanded model, the parameter λ now represents the average amplification probability at the dilution level corresponding to x = 1 only. The corrective parameter b provides extra flexibility to adjust the average amplification probability to the real conditions and represents a measure of the slope of the POD function relative to the ideal POD function with b = 1.

Expressing the POD as a function of the logarithm of the number of DNA copies ln(x) = z, the following is obtained for the slope parameter b of the POD function:

$$\begin{aligned} \frac{\partial }{\partial z}{POD}\left( {\exp \left( z \right)} \right) & = \frac{\partial }{\partial z}\left( {1 - \exp \left( { - \lambda \cdot \exp \left( z \right)^{b} } \right)} \right) = \frac{\partial }{\partial z}\left( {1 - \exp \left( { - \lambda \cdot \exp \left( {b\cdot z} \right)} \right)} \right) \\ & = \lambda \cdot b \cdot \exp \left( {b\cdot z} \right) \cdot \exp \left( { - \lambda \cdot \exp \left( {b\cdot z} \right)} \right). \\ \end{aligned}$$
(12)

For z = 0, we thus have

$$\left. {\frac{\partial }{\partial z}{POD}\left( {\exp \left( z \right)} \right)} \right|_{z = 0} = \lambda \cdot b \cdot \exp \left( { - \lambda } \right).$$
(13)

This means that for x = 1, the slope of the POD curve is proportional to b.

The case b < 1 is an indication of a lower slope of the POD curve (at x = 1) as compared to the ideal POD function. Such a situation can be related to inhibitory matrix effects, to large variability of the amplification process from one test to the next under repeatability conditions, or to accidental problems causing false positives if the number of DNA copies is less than x = 1.

The case b > 1 is an indication of a higher slope of the POD curve (at x = 1) as compared to the ideal POD function. Such a situation can be related to a lower PCR efficiency (for a specific target sequence and a specific assay) when the number of DNA copies is less than x = 1.

Finally, let it be noted that this expansion of the model can also be interpreted as allowing the average amplification probability λ to depend on the number of DNA copies x. Indeed, Eq. (11) can also be written

$${POD}(x) = 1 - \exp \left( { - \lambda \cdot x^{b - 1} \cdot x} \right).$$
(14)

This allows us to define the following functional relationship:

$$\lambda_{b} (x) = \lambda \cdot x^{b - 1} .$$
(15)

Note that we then obtain

$$\lambda_{b} (1) = \lambda ,$$
(16)

which is consistent with our definition of λ. Further, we note that b = 1 implies that the average amplification probability does not depend on x, which would correspond to the intermediary model preceding the introduction of the slope parameter.

Thus, the parameter b simultaneously models the shape of the POD curve and the functional relationship \(\lambda_{b}(x)\).
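
The expanded model of Eq. (11), the slope in Eq. (13) and the relationship in Eq. (15) can be cross-checked numerically. In the sketch below, the parameter values λ = 0.8 and b = 1.3 are hypothetical, as are the function names; the slope is approximated by a central difference in z = ln(x).

```python
import math


def pod_expanded(x, lam, b):
    """Expanded POD model, Eq. (11)."""
    return 1.0 - math.exp(-lam * x ** b)


def lam_b(x, lam, b):
    """Copy-number-dependent amplification probability, Eq. (15)."""
    return lam * x ** (b - 1)


def slope_at_z0(lam, b, h=1e-6):
    """Central-difference approximation of dPOD/dz at z = ln(x) = 0."""
    def f(z):
        return pod_expanded(math.exp(z), lam, b)
    return (f(h) - f(-h)) / (2 * h)


lam, b = 0.8, 1.3  # hypothetical values
print("numerical slope at x = 1:", slope_at_z0(lam, b))
print("Eq. (13)                :", lam * b * math.exp(-lam))
print("lambda_b(1)             :", lam_b(1.0, lam, b))
```

The numerical slope agrees with λ · b · exp(−λ), and substituting `lam_b(x, lam, b) * x` for λ · x in Eq. (5) reproduces Eq. (11).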

POD model with random laboratory deviations

The model described in the previous section is only applicable for one single laboratory. However, it can be expanded to take into account results from several laboratories. Two further assumptions must now be made. First, it is assumed that the corrective parameter b is identical in all laboratories. On the one hand, this assumption is justified on pragmatic grounds: the sources of error which were discussed in the previous section may very well affect the different laboratories in a similar way. Indeed, the various data sets examined so far provide evidence to support this assumption. Moreover, this assumption is also justified on theoretical grounds. The standard errors attending the estimates for the laboratory-specific slope parameters are so large that only very considerable deviations between these laboratory-specific estimates would lead to a positive result on a test for significant differences.

The second assumption is that the parameter λ now describes the sensitivity of the different laboratories and varies randomly from one laboratory to the next.

Formally, we now write for the POD function of laboratory \(i = 1, \ldots ,L\)

$${POD}_{i} \left( x \right) = 1 - \exp \left( { - \lambda_{i} \cdot x^{b} } \right),$$
(17)

where the parameter \(\lambda_{i}\) describes the average amplification probability at x = 1 for laboratory i. This POD model can also be written

$$\ln \left( { - \ln \left( {1 - {POD}_{i} \left( x \right)} \right)} \right) = \ln \lambda_{i} + b \cdot \ln x,$$
(18)

where \( \ln {\lambda_i}\) represents the randomly varying logarithmic amplification parameter (which depends on the randomly selected laboratory). It is assumed that \( \ln {\lambda_i}\) is a random variable that follows a normal distribution with

$${ \ln } \; \lambda_{i} \sim N\left( {\mu , \sigma_{\text{L}}^{ 2} } \right), \quad \mu = { \ln } \; \lambda_{0} .$$
(19)

This POD model is known as a generalized linear mixed model (GLMM) with complementary log–log link function, and it depends on three parameters \(\lambda_{0} , \sigma_{\text{L}}^{2}\) and b that can be estimated by means of a maximum likelihood approach. The parameter \(\lambda_{0}\) represents the average amplification parameter (at x = 1) for a laboratory with average sensitivity, and b represents the quotient obtained by dividing the slope of the POD curve at x = 1 by that corresponding to the ideal POD function with b = 1. The variance \(\sigma_{\text{L}}^{2}\) characterizes the variability across laboratories.
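
The structure of Eqs. (17) to (19) can be illustrated by drawing laboratory-specific sensitivity parameters and checking the complementary log–log linearization. This is a sketch of the data-generating model only, not of the GLMM estimation; the parameter values, the seed and the function names are hypothetical.

```python
import math
import random


def cloglog(p):
    """Complementary log-log link: ln(-ln(1 - p))."""
    return math.log(-math.log(1.0 - p))


def simulate_lab_lambdas(lam0, sigma_l, n_labs, seed=0):
    """Draw lambda_i with ln(lambda_i) ~ N(ln(lam0), sigma_l^2), Eq. (19)."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(math.log(lam0), sigma_l)) for _ in range(n_labs)]


def pod_lab(x, lam_i, b):
    """Laboratory-specific POD curve, Eq. (17)."""
    return 1.0 - math.exp(-lam_i * x ** b)


b = 1.2  # hypothetical common slope parameter
for lam_i in simulate_lab_lambdas(lam0=0.8, sigma_l=0.3, n_labs=5):
    lhs = cloglog(pod_lab(2.0, lam_i, b))
    rhs = math.log(lam_i) + b * math.log(2.0)  # right-hand side of Eq. (18)
    print(f"lambda_i={lam_i:.3f}  cloglog(POD)={lhs:.4f}  ln(lambda_i)+b*ln(x)={rhs:.4f}")
```

On the complementary log–log scale, each laboratory curve is a straight line in ln(x) with common slope b and a laboratory-specific intercept ln λ i.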

Statistical computation

The computation of the statistical parameters of the POD model can be carried out by means of the following sequence of steps.

For each laboratory, compute estimates of the parameters \(\lambda_{i}\) and \(b_{i}\) for its individual POD curve

$${POD}_{i} \left( x \right) = 1 - \exp \left( { - \lambda_{i} \cdot x^{{b_{i} }} } \right).$$
(20)
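
A laboratory-specific fit of Eq. (20) can be sketched as a binomial maximum likelihood problem. The example below is a deliberately crude stand-in: it uses a plain grid search instead of a proper optimizer, the search ranges are arbitrary, and the counts of positive results are hypothetical rather than taken from Table 1. The function names are likewise assumptions.

```python
import math


def log_likelihood(ln_lam, b, copies, positives, n_rep):
    """Binomial log-likelihood of one laboratory under Eq. (20)."""
    lam = math.exp(ln_lam)
    ll = 0.0
    for x, k in zip(copies, positives):
        p = 1.0 - math.exp(-lam * x ** b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += k * math.log(p) + (n_rep - k) * math.log(1.0 - p)
    return ll


def fit_lab(copies, positives, n_rep, steps=81):
    """Grid-search estimate of (ln lambda_i, b_i) on an assumed search box:
    ln lambda in [-3, 3] and b in [0.2, 3.0]."""
    best = None
    for i in range(steps):
        ln_lam = -3.0 + 6.0 * i / (steps - 1)
        for j in range(steps):
            b = 0.2 + 2.8 * j / (steps - 1)
            ll = log_likelihood(ln_lam, b, copies, positives, n_rep)
            if best is None or ll > best[0]:
                best = (ll, ln_lam, b)
    return best[1], best[2]


copies = [0.1, 1, 2, 5, 10, 20]   # nominal copies per dilution level
positives = [0, 4, 5, 6, 6, 6]    # hypothetical positives out of 6 replicates
ln_lam_hat, b_hat = fit_lab(copies, positives, n_rep=6)
print(f"lambda_i estimate: {math.exp(ln_lam_hat):.3f}, b_i estimate: {b_hat:.3f}")
```

With only 6 replicates per level the likelihood surface is flat, which is consistent with the large standard errors of the laboratory-specific estimates noted in the text.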

Detect outliers by visual inspection of the individual POD curves or by means of the Grubbs test applied to the sensitivity parameters \(\lambda_{1}, \ldots, \lambda_{L}\) and slope parameters \(b_{1}, \ldots, b_{L}\).

Determine whether the slope parameters \(b_{1}, \ldots, b_{L}\) differ significantly. This can be checked visually or by means of a Chi-squared test based on the standard errors for the slope parameters. If the slope parameters do not differ significantly, the sensitivity parameters \(\lambda_{1}, \ldots, \lambda_{L}\) should be recalculated on the basis of a common slope parameter b. If the slope parameters differ significantly from one another, then the current model cannot be applied as such, but would need to be expanded. As already noted, in practice this should not be the case.

Determine estimates for the parameters \(\lambda_{0} , \sigma_{\text{L}}^{2}\) and b of the POD curve across laboratories by means of the GLMM model. If b is not significantly different from 1, it should be assumed that b = 1. Compute the 95 % confidence band of the POD curve across all laboratories. Compute the limit of detection LOD 95%, i.e. the number of DNA copies required to ensure 95 % probability of detection of the target sequence for a laboratory with average sensitivity (median laboratory). Following the same procedure as in the derivation of Eq. (8) and solving

$$1 - \exp \left( { - \lambda_{0} \cdot x^{b} } \right) = 0.95$$
(21)

for x = LOD 95%, we obtain

$${LOD}_{95\% } = \left( {\frac{{ - \ln \left( {0.05} \right)}}{{\lambda_{0} }}} \right)^{1/b} .$$
(22)

This last equation will enable us to derive an intuitive interpretation for the parameter \(\sigma_{\text{L}}^{2}\). Indeed, with the approximate value \(- { \ln }(0.05) \approx 3\), we have \(\ln {{(LOD}}_{95\% }) = \frac{1}{b} \cdot \left( {\ln 3 - \ln \lambda_{0} } \right)\). For the natural logarithm of the upper and lower limits of the 95 % prediction interval of LOD 95%, we thus have

$$\ln ({LOD}_{{95\% ,{\text{upper}}}}) = \frac{1}{b} \cdot \left( {\ln 3 - \left( {\ln \lambda_{0} - 1.96 \cdot \sigma_{\text{L}} } \right)} \right)$$
(23)

and

$$\ln ({LOD}_{{95\% ,{\text{lower}}}}) = \frac{1}{b} \cdot \left( {\ln 3 - \left( {\ln \lambda_{0} + 1.96 \cdot \sigma_{\text{L}} } \right)} \right) .$$
(24)

This implies

$$\ln \frac{{{LOD}_{{95\% ,{\text{upper}}}} }}{{{LOD}_{{95\% ,{\text{lower}}}} }} = \frac{1}{b} \cdot 3.92 \cdot \sigma_{\text{L}} .$$
(25)

In other words, the between-laboratory variability of the LOD 95%, defined as the logarithmic ratio between LOD 95%,upper and \(LOD_{{95\% , {\text{lower}}}}\), is proportional to \(\sigma_{\text{L}}\).
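
Equations (22) to (25) can be sketched in a few lines. The parameter values below are hypothetical, and the function names are illustrative. The sketch uses the exact constant −ln(0.05) rather than the approximation 3, which does not affect the ratio in Eq. (25).

```python
import math


def lod95(lam0, b):
    """LOD 95% for a laboratory with sensitivity lam0, Eq. (22)."""
    return (-math.log(0.05) / lam0) ** (1.0 / b)


def lod95_prediction_limits(lam0, b, sigma_l, z=1.96):
    """Lower and upper limits of the 95 % prediction interval of LOD 95%,
    obtained by shifting ln(lam0) by +/- z * sigma_l as in Eqs. (23)-(24)."""
    lower = lod95(lam0 * math.exp(z * sigma_l), b)    # more sensitive laboratory
    upper = lod95(lam0 * math.exp(-z * sigma_l), b)   # less sensitive laboratory
    return lower, upper


lam0, b, sigma_l = 0.8, 1.2, 0.3  # hypothetical values
lower, upper = lod95_prediction_limits(lam0, b, sigma_l)
print(f"LOD95 (median laboratory): {lod95(lam0, b):.2f}")
print(f"prediction interval: [{lower:.2f}, {upper:.2f}]")
print(f"ln(upper/lower) = {math.log(upper / lower):.4f}, "
      f"3.92*sigma_L/b = {3.92 * sigma_l / b:.4f}")
```

The last line confirms that the logarithmic ratio of the two limits depends only on σ L and b, not on λ 0.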

Example

Data

Seventeen laboratories participated in a collaborative study for a qualitative PCR method for the detection of Pubi-cry (see Grohmann et al. [11]). Each participant received a sample DNA stock solution and a dilution buffer. After preparation of a dilution series according to instructions (6 dilution levels), each laboratory carried out 6 PCR replicate tests at each dilution level. For each dilution level, the nominal number of DNA copies and the number of positive PCR test results per participant were used for the evaluation (Table 1). Details on the PCR method and the design of the collaborative study are described elsewhere [11].

Table 1 Number of positive test results per dilution level and laboratory (6 PCR replicates)

Individual POD curves

Due to the low number of 6 PCR replicate tests per dilution level, the laboratory-specific POD curves are highly affected by the random error of the binary response variable. Nevertheless, their parameters can be assessed statistically. The Grubbs test applied to the two series \(\widehat{{\lambda_{1} }}, \ldots ,\widehat{{\lambda_{L} }}\) and \(\widehat{{b_{1} }}, \ldots ,\widehat{{b_{L} }}\) revealed no outliers. There are also no significant differences between the slope parameters \(\widehat{{b_{1} }}, \ldots ,\widehat{{b_{L} }}\) so that the sensitivity parameters can be calculated on the basis of a common slope parameter \(\hat b\) = 1.29. The corresponding laboratory-specific POD curves are displayed in Fig. 2.

Fig. 2 Individual Pubi-cry POD curves for each laboratory with common slope parameter \(\hat b = 1.29\)

Table 2 displays for each participant the sensitivity parameter estimate \(\widehat{{\ln \lambda_{i} }}\) and the associated standard error \({\text{SE}}(\widehat{{\ln \lambda_{i} }})\).

Table 2 Laboratory-specific POD model parameter estimates \(\widehat{{\ln \lambda_{i} }}\) and the associated standard error SE \((\hat b = 1.29)\)

No outliers were detected (Grubbs test) in the estimates \(\widehat{{\lambda_{i} }}\) corresponding to the common slope parameter \(\hat b\) = 1.29. The slope parameter estimate \(\widehat{b}\) is significantly greater than 1. It can thus be expected that the average amplification probability is less at lower concentration levels than at higher ones. In accordance with these results, the sensitivity parameters will be modelled as random effects in the GLMM. Moreover, since b = 1 cannot be assumed, the common slope parameter will be estimated as a fixed effect in the GLMM.

POD curve across laboratories

In Table 3, the final POD model parameter estimates for \(\lambda_{0} , \sigma_{\text{L}}^{2}\) and b derived from the GLMM approach are presented. The value \(\widehat{{\lambda_{0} }} = 0.77\) means an average amplification probability of 77 % for the target DNA sequence at the dilution level corresponding to x = 1. This calculation is based on the assumption that the nominal number of DNA copies is correct, that there are no false-positive results and that the model is correctly specified.

Table 3 POD statistics for the collaborative validation of the Pubi-cry PCR method

The estimate \(\widehat{b} = 1.19\) as derived from the GLMM approach means a higher average amplification probability at higher numbers of DNA copies than at lower ones. If the number of DNA copies increases from 1 to 2 copies, the average amplification probability will increase from 0.77 to 0.88 (\(= 0.77 \cdot 2^{b - 1}\), see Eq. 15). As is to be expected, since the laboratory-specific sensitivity parameters are now modelled as realizations of a random variable rather than as fixed effects, this estimate for b is not equal to the original estimate for the common slope parameter.

Of course, it is not possible to actually check whether the \( \ln {\lambda_i}\) are normally distributed, since we only have their estimates whose standard errors are quite large (see Table 2). Nevertheless, as already mentioned, we can at least resort to outlier tests (Grubbs test), and none were detected.

The value \(\widehat{{\sigma_{\text{L}} }} = 0.31\) means that the ratio between the upper and lower limits of the 95 % prediction interval for the laboratory-specific LOD 95% values equals \(\exp \left( {{{3.92 \cdot \sigma_{\text{L}} }}{/b}} \right) = 2.74\) (see Eq. 25).

The 95 % confidence interval for the estimate of the parameter \(\lambda_{0}\) was determined on the basis of the standard deviation of the method and under the assumption that the nominal number of DNA copies is correct.

The rate of detection (ROD) is defined as the number of positive test results divided by the total number of replicate tests. For a particular dilution level, each laboratory will thus obtain its own ROD.

The mean POD curve (also called LPOD curve) with the 95 % prediction range for the laboratory-specific POD values along with the 90 % prediction range for the laboratory-specific ROD values is shown in Fig. 3a. The 90 % prediction range for the ROD values (i.e. the area between the step functions) was computed on the basis of simulation runs.

Fig. 3 Mean POD curve (solid line) for the Pubi-cry PCR with the 95 % prediction range for the laboratory POD values (dark grey zone) and the 90 % prediction range for the laboratory ROD ratios (light grey zone) (a) and mean POD curve (solid line) with its 95 % confidence band (grey zone) and POD curve under ideal conditions (dashed line) (b). Dots and numbers indicate the distribution of the PCR results across all 17 laboratories (number of laboratories with corresponding ROD for the 6 PCR replicates at each of the 6 dilution levels corresponding to 0.1, 1, 2, 5, 10 and 20 copies)

As expected, the 90 % prediction range for the RODs is wider than the 95 % prediction range of the laboratory-specific POD curves. This can be explained by the considerable random variation resulting from the low number of replicate tests (= 6) available from this collaborative study.
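
The text does not describe how the simulation runs for the ROD prediction range were set up. One plausible sketch, offered purely as an assumption, draws a laboratory sensitivity from the fitted model, simulates 6 binary PCR results at the corresponding POD, and takes empirical 5 % and 95 % quantiles of the resulting RODs. The parameter values are the estimates reported in the example; the function name, the quantile convention, the number of runs and the seed are hypothetical.

```python
import math
import random


def simulate_rod_range(x, lam0, b, sigma_l, n_rep=6, n_sim=5000, seed=1):
    """Empirical 90 % range of laboratory RODs at nominal copy number x."""
    rng = random.Random(seed)
    rods = []
    for _ in range(n_sim):
        # Laboratory-specific sensitivity, ln(lambda_i) ~ N(ln(lam0), sigma_l^2)
        lam_i = math.exp(rng.gauss(math.log(lam0), sigma_l))
        p = 1.0 - math.exp(-lam_i * x ** b)
        # Number of positives among n_rep replicate PCR tests
        k = sum(rng.random() < p for _ in range(n_rep))
        rods.append(k / n_rep)
    rods.sort()
    return rods[int(0.05 * n_sim)], rods[int(0.95 * n_sim) - 1]


for x in (0.1, 1, 2, 5, 10, 20):
    low, high = simulate_rod_range(x, lam0=0.77, b=1.19, sigma_l=0.31)
    print(f"x={x:4}: 90 % ROD range = [{low:.2f}, {high:.2f}]")
```

Because a ROD from 6 replicates can only take the values 0, 1/6, …, 1, the simulated limits move in steps, which matches the step functions described for the ROD prediction range.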

Figure 3b shows the mean POD curve along with its 95 % confidence band and the POD curve under ideal conditions (theoretical POD curve) computed on the basis of the Poisson model for the case that the average amplification probability is not affected by the dilution level (λ = 1 and b = 1). If the theoretical POD curve lies within the 95 % confidence band of the mean POD curve calculated on the basis of the collaborative study data, then systematic deviations may be considered negligible. For computational details concerning the mean POD curve and its confidence band, the reader is referred to [9].

Discussion

A new model for the probability of detection (POD curve) for qualitative PCR methods was examined in collaborative studies. This model is based on the complementary log–log model, which is also used for the RLOD method described in the new standard ISO 16140-2 for the validation of microbiological methods against a reference method [2]. The model is expanded on the basis of theoretical considerations concerning the PCR amplification process. It reflects the asymmetric shape of typical POD curves better than the logistic model proposed by Burns and Valdivia [13]. It is also more flexible in cases where deviations from the theoretical POD curve (λ = 1 and b = 1) are observed.

The primary validation parameter is the laboratory standard deviation σ L. This parameter has a simple interpretation: it represents the relative between-laboratory variability of the LOD 95% (see Eq. 25). Indeed, the laboratory standard deviation σ L could also be calculated by means of a quantitative collaborative study according to ISO 5725-2 [1] in which the measured characteristic would be the natural logarithm of the LOD 95%. If each laboratory estimated its LOD 95% twice (using independent dilution series), the laboratory standard deviation of the ln(LOD 95%) values would be equal to the laboratory standard deviation derived from the complementary log–log approach described here. The advantage of the complementary log–log approach is its statistical efficiency: the number of necessary replicates is much lower. As demonstrated in the example (see Fig. 3), the uncertainty of the mean POD curve is acceptable.
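The quantitative route mentioned above can be sketched as follows. The example assumes a balanced design and uses the usual one-way analysis of variance estimator of the between-laboratory standard deviation, applied to ln(LOD 95%) values. The LOD 95% figures in the example are made up purely for illustration and are not taken from the collaborative study; the function name `between_lab_sd` is likewise hypothetical.

```python
import math
from statistics import mean

def between_lab_sd(ln_values_by_lab):
    """One-way ANOVA estimate of the between-laboratory SD (balanced design)."""
    p = len(ln_values_by_lab)        # number of laboratories
    n = len(ln_values_by_lab[0])     # replicate estimates per laboratory
    lab_means = [mean(v) for v in ln_values_by_lab]
    grand_mean = mean(lab_means)
    ms_between = n * sum((m - grand_mean) ** 2 for m in lab_means) / (p - 1)
    ms_within = sum((y - m) ** 2
                    for v, m in zip(ln_values_by_lab, lab_means)
                    for y in v) / (p * (n - 1))
    # A negative variance estimate is truncated at zero
    return math.sqrt(max((ms_between - ms_within) / n, 0.0))

# Made-up LOD95 estimates (copies), two independent dilution series per lab
lod95_by_lab = [(2.6, 3.1), (3.8, 4.4), (2.9, 2.7), (5.0, 4.1), (3.3, 3.6)]
ln_lod95 = [[math.log(x) for x in lab] for lab in lod95_by_lab]
sigma_L = between_lab_sd(ln_lod95)
print(f"estimated sigma_L on the ln scale = {sigma_L:.3f}")
```

Because the estimate is obtained on the natural-log scale, it describes variability of the LOD 95% in relative rather than absolute terms.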

More generally, it can be shown that acceptable results from at least 8 participants, 5 dilution levels (0.1, 1, 2, 5 and 10 copies of the target sequence) and 6 PCR replicates per laboratory and dilution level are required to obtain a statistically reliable POD curve.

It should also be noted that the model can be used without reference values. Neither the slope parameter b nor the laboratory standard deviation σ L depends on the true number of DNA copies. This is similar to many validation studies conducted according to ISO 5725-2 [1] where the true value of the measured characteristic is not available and the validation is primarily based on the reproducibility standard deviation.

The POD curve approach can also be used to estimate the rate of false negatives, which can be described as 1 − POD. The rate of false negatives depends on the number of DNA copies accessible for the PCR method.

It must be noted that, as with every other model, the approach presented here involves making assumptions, and it cannot be excluded that these assumptions may be inappropriate in a given situation. As always, in case of doubt, it is thus important to compare the fitted values with the test results to check the suitability of the model.

A limitation of the complementary log–log model is that it cannot be used to estimate the rate of false positives. This rate can only be calculated on the basis of test samples without any copies of the target DNA sequence. However, since a true negative is excluded in the complementary log–log model (the natural logarithm of zero is not defined), the approach presented here is not suitable for the computation of false-positive rates. Nevertheless, it is possible to determine false-positive rates by means of a conventional approach without any reference to the POD model. In the design of the current collaborative study, blank DNA samples were not included [11]. However, in parallel studies, the 17 participants did carry out tests on 170 DNA samples not containing the target DNA sequence using the Pubi-cry PCR. The evaluation of these results showed that two DNA samples were falsely classified as positive, resulting in an acceptable rate of false positives of 1.2 % [11].
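The conventional calculation of the false-positive rate is simple arithmetic on the reported counts. The sketch below reproduces it and, as an addition that is not part of the study evaluation, attaches an approximate 95 % Wilson score interval. It assumes that the 170 results can be treated as independent trials with a common false-positive probability, which ignores any clustering of results within laboratories.

```python
import math

def wilson_interval(positives, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95 % for z = 1.96)."""
    p_hat = positives / n
    denom = 1.0 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

false_positives, blanks = 2, 170   # counts reported for the parallel studies
fp_rate = false_positives / blanks
lower, upper = wilson_interval(false_positives, blanks)

print(f"false-positive rate = {100 * fp_rate:.1f} %")
print(f"approx. 95 % interval: {100 * lower:.1f} % to {100 * upper:.1f} %")
```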

Finally, it is important to mention that the POD curve approach described in this paper can also be used for in-house validation. If no measure for intermediate reproducibility is required, the expanded model for the POD curve provides the average amplification probability and the LOD 95%. This validation requires at least one dilution series with a minimum of 12 PCR replicates per dilution level; otherwise, the uncertainty and the effects of random errors will be too high. The PCR results obtained for the dilution level corresponding to 0.1 copies can also be used to verify that the dilution series is approximately correct: no more than 1 or 2 positive results should be obtained among the 12 PCR replicates [14]. For the calculation of intermediate in-house precision parameters, test series would have to be performed on different days, and the evaluation would have to be based on the model with random laboratory deviations.
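The plausibility check at the 0.1-copy level can be illustrated with a short binomial calculation. The sketch assumes ideal conditions (λ = 1 and b = 1), so that the probability of a positive result at a nominal 0.1 copies is 1 − exp(−0.1), and it assumes that the 12 replicates are independent. The helper `prob_at_most` is introduced here for illustration only.

```python
import math

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k + 1))

# Ideal Poisson POD at a nominal 0.1 copies per reaction (assumed lambda = 1, b = 1)
p_positive = 1.0 - math.exp(-0.1)
n_replicates = 12

expected_positives = n_replicates * p_positive
p_at_most_2 = prob_at_most(2, n_replicates, p_positive)

print(f"POD at 0.1 copies           = {p_positive:.3f}")
print(f"expected positives out of 12 = {expected_positives:.2f}")
print(f"P(at most 2 positives)       = {p_at_most_2:.3f}")
```

Under these assumptions, roughly one positive result is expected among the 12 replicates, and a count of 3 or more occurs only about one time in ten, so such a count suggests that the dilution series may contain more copies than intended.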

Calculations can be performed with R, a publicly available statistical software environment, and with PROLab POD, a software package developed specifically for interlaboratory studies involving qualitative methods [12].

References

  1. ISO 5725-2 (2002) Accuracy (trueness and precision) of measurement methods and results – part 2: basic method for the determination of repeatability and reproducibility of a standard measurement method (ISO 5725-2:1994 including Technical Corrigendum 1:2002)
  2. ISO/DIS 16140-2:2014 (2014) Microbiology of food and animal feed – method validation – Part 2: protocol for the validation of alternative (proprietary) methods against a reference method (pending publication)
  3. Wehling P, LaBudde RA, Brunelle SL, Nelson MT (2011) Probability of detection (POD) as a statistical model for the validation of qualitative methods. JAOAC 94:335–347
  4. Wilrich P-Th (2010) The determination of precision of qualitative measurement methods by interlaboratory experiments. Accred Qual Assur 15:439–444
  5. Bashkansky E, Gadrich T, Kuselman I (2012) Interlaboratory comparison of measurement results of an ordinal property. Accred Qual Assur 17:239–243
  6. Gadrich T, Bashkansky E, Kuselman I (2012) Comparison of biased and unbiased estimators of variances of qualitative and semi-quantitative results of testing. Accred Qual Assur 18:85–90
  7. Gadrich T, Bashkansky E (2012) ORDANOVA: analysis of ordinal variation. J Stat Plan Inference 142:3174–3188
  8. Uhlig S, Niewöhner L, Gowik P (2011) Can the usual validation standard series for quantitative methods, ISO 5725, be also applied for qualitative methods? Accred Qual Assur 16:533–537
  9. Uhlig S, Krügener S, Gowik P (2013) A new profile likelihood confidence interval for the mean probability of detection in collaborative studies of binary test methods. Accred Qual Assur 18:367–372
  10. QuoData Web Services (2014) Validation of qualitative methods. http://quodata.de/en/web-services.html. Accessed 6 June 2014
  11. Grohmann L, Reiting R, Mäde D, Uhlig S, Simon K, Frost K, Randhawa GJ, Zur K (2015) Collaborative trial validation of cry1Ab/Ac and Pubi-cry TaqMan-based real-time PCR assays for detection of DNA derived from genetically modified Bt plant products. Accred Qual Assur. doi:10.1007/s00769-015-1108-5
  12. PROLab POD (2014) http://quodata.de/en/software/for-interlaboratory-tests/prolab.html. Accessed 6 June 2014
  13. Burns M, Valdivia H (2008) Modelling the limit of detection in real-time quantitative PCR. Eur Food Res Technol 226:1513–1524
  14. Broeders S, Huber I, Grohmann L, Berben G, Taverniers I, Mazzara M, Roosens N, Morisset D (2014) Guidelines for validation of qualitative real-time PCR methods. Trends Food Sci Technol 37:115–126

Author information

Corresponding author

Correspondence to Steffen Uhlig.

About this article

Cite this article

Uhlig, S., Frost, K., Colson, B. et al. Validation of qualitative PCR methods on the basis of mathematical–statistical modelling of the probability of detection. Accred Qual Assur 20, 75–83 (2015). https://doi.org/10.1007/s00769-015-1112-9

Keywords

  • Probability of detection (POD)
  • Collaborative study
  • Real-time PCR
  • Validation
  • Generalized linear mixed model (GLMM)
  • Complementary log–log model