Abstract
Pooled (or group) testing has been widely used for the surveillance of infectious diseases of low prevalence. The potential benefits of pooled testing include savings in testing time and costs, reducing false positive tests, and estimating models or making predictions from limited observed data information (e.g., only initial pooled responses). However, realizing these benefits often critically depends on the pool size used. Statistical methods introduced in the literature for optimal pool size determination have been developed mainly to accommodate simpler pooling protocols or perfect diagnostic assays. In this article, we study these issues with the goal of presenting a general optimization technique. We evaluate the efficiency of the estimators of disease prevalence (i.e., the proportion of diseased individuals in a population) while accounting for testing costs. Then, we determine the optimal pool size by minimizing the measures of optimality, such as screening efficiency and estimation efficiency. Our findings are illustrated using data from an ongoing screening application at the Louisiana Department of Health. We show that when a pooling application is properly designed, substantial advantages can be realized. We provide an R package and a software application to facilitate the implementation of our optimization techniques. Supplementary materials accompanying this paper appear online.
1 Introduction
Pooling biospecimens (e.g., blood, urine, and swabs) to improve the efficiency of disease screening and monitoring has gained a great deal of popularity. Pooled testing was formally introduced by Dorfman (1943) in the context of screening American soldiers for syphilis. Dorfman suggested pooling a fixed number of individual blood specimens, conducting initial pool tests, and resolving the positive pools to identify diseased cases. Since then, this method, commonly known as two-stage hierarchical testing, has been extended to more advanced testing protocols (Quinn et al. 2000; Pilcher et al. 2005). Following Dorfman’s work, pooled testing has been used in applications for various infectious diseases, such as human immunodeficiency virus (HIV) and hepatitis B/C (Hourfar et al. 2008; Stramer et al. 2013), chlamydia and gonorrhea (Lindan et al. 2005), and influenza (Van et al. 2012). Pooled testing is also useful in genetics (Chi et al. 2009), animal disease testing (Dhand et al. 2010), and other fields. During the recent COVID-19 pandemic, pooling has received widespread attention because of the urgent need for rapid disease screening (Abdalhamid et al. 2020).
Statistical research in pooled testing has developed for both case identification and estimation. The aim in case identification research is to develop efficient testing protocols and assess the accuracy of classification (Kim et al. 2007). In estimation research, the primary goal is to estimate disease probabilities. Estimation can be performed either in a homogeneous population setting (Liu et al. 2012; Speybroeck et al. 2012; Ding and Xiong 2016; Haber et al. 2018; Nguyen et al. 2018) or in a regression context using individual covariates, such as age, race, sex, and symptoms (Xie 2001; Delaigle et al. 2014; Wang et al. 2014). Whether covariates are used or not, estimating the probabilities can be achieved using only initial pooled results or a combination of pooled and individual retest results—often with a fraction of the tests required by the usual individual testing.
This article focuses on optimizing the disease prevalence estimates calculated from pooled data. We consider a general pooled testing scenario, where both pooled and subsequent retesting responses are observed from individuals or pools. Two features make modeling such data challenging. First, the observed pooled responses are correlated because an individual is potentially assigned to multiple pools in multiple stages. Second, the true disease status of every individual is latent (i.e., unobservable) because of errors that occur during the diagnostic procedure. These challenges are commonly overcome by resorting to the “missing data” technique, where parameter estimation is accomplished using the expectation-maximization (EM) algorithm. However, evaluating the estimator efficiency and determining the optimal pool size are especially challenging because they require deriving the expected variance. Using the methods available in the literature, this is rarely possible for pooled data when retesting responses are involved.
Optimization of the estimator of a disease prevalence using pooled data has been investigated by many researchers, but most studies used only initial pool tests (Hughes-Oliver and Swallow 1994; Liu et al. 2012; Tu et al. 1995). The advantages of using such simple pooling are that the testing cost can be lowered and the asymptotic efficiency results can be derived based on binomial distribution. However, this approach cannot leverage the wealth of retesting information naturally observed in public health applications. Zhang et al. (2020a) explored the benefits of using retesting data in the estimator precision for two-stage hierarchical testing. Brookmeyer (1999) explored pooled testing performed in multiple hierarchical stages, but it is limited to scenarios where a perfect diagnostic assay is available; i.e., Brookmeyer’s method cannot account for the misclassification errors (i.e., false negatives/positives). Furthermore, existing approaches cannot be used for other advanced protocols, such as array testing (Kim et al. 2007). That is, statistical methods that can use multistage pooled responses—potentially contaminated by misclassification errors—have been highly limited.
This article aims at addressing the research gap and providing methods that can be useful in designing pooled testing studies. We consider a general model framework, which can accommodate pooled data of any complexity, where the approach used for estimation is maximum likelihood. We assess the asymptotic efficiency and cost efficiency of the estimator and determine optimal pool sizes in the context of the surveillance application at the Louisiana Department of Health. It is worth noting that evaluating the optimality measures using analytical approaches is often challenging for multistage pooling methods. For such instances, we present a computation algorithm that can be widely used. Furthermore, because implementing such methods is non-trivial, we provide ready-to-use software tools using R (R Core Team 2023) for hierarchical and array testing, which can also be expanded for quality control (Hanson et al. 2006) and numerous other scenarios introduced during the COVID-19 pandemic (Daniel et al. 2021; Mutesa et al. 2021).
The subsequent sections are organized as follows. In Sect. 2, we summarize the screening results for four infectious diseases from the Louisiana Department of Health. In Sect. 3, we discuss different pooled testing protocols, assumptions, and the likelihood-based estimation framework. In Sect. 4, we assess the measures of efficiency and suggest optimal pooling configurations under three different constraints. In Sect. 5, we briefly describe our software tools. In Sect. 6, we conclude with a brief discussion. Additional information is provided in the electronic supplementary materials (Web Appendices A-D). Our software tools and code used to produce all results in this article have been uploaded as supplementary material.
2 Louisiana Infectious Disease Data
The Louisiana Department of Health (LDH) is a governing agency that oversees public health-related issues in the state of Louisiana. The LDH conducts screening and surveillance of various infectious diseases to monitor their prevalence, spread, and distribution. This includes regular testing, data collection, reporting, and analysis to identify outbreaks. In this article, we explore the practical aspects of pooled testing using LDH’s screening data to improve the screening and surveillance efficiency.
We obtained three separate datasets from the LDH that have test outcomes collected in the year 2021. The first dataset comprises HIV test results, while the second has outcomes for Neisseria gonorrhoeae (gonorrhea) and Chlamydia trachomatis (chlamydia). The third dataset consists of test results for SARS-CoV-2. The data collection procedure at LDH typically involves collecting specimens at different sites across the state, transporting them to designated laboratories, performing tests, and integrating the test outcomes into a central database for analysis.
The testing protocol adopted for these infections was the traditional one-at-a-time approach (i.e., individual testing). Both males and females were tested for HIV using blood serum samples. For chlamydia and gonorrhea, the testing was conducted on female subjects using urine specimens, while for SARS-CoV-2, both males and females were tested using nasopharyngeal swab samples. The assays used for HIV and SARS-CoV-2 were ARCHITECT HIV Ag/Ab Combo (AHAC) and Biofire Respiratory Panel 2.1 (BRP2.1), respectively, while the Aptima Combo 2 Assay (AC2A) was used for both chlamydia and gonorrhea. The sensitivity (\(S_e\)) and specificity (\(S_p\)) of these assays are reported in Table 1, which can be found in the assay product literature available at www.fishersci.com, www.biofiredx.com, and www.hologic.com.
In this article, we treat the test outcomes as historical data. Table 1 provides a summary of the collected data, which consists of the number of individuals tested (i.e., sample size, N) and the count of positive cases identified. The table also presents an estimate of the true prevalence for each infection, denoted by p. These estimates have been calculated by adjusting for testing errors using the prevalence estimator formula in Eq. (1), with pool size 1 for this special scenario of individual testing. The prevalence of HIV is low, while the prevalences of gonorrhea and chlamydia are moderate, which makes them ideal for the implementation of pooled testing. The prevalence of SARS-CoV-2 is fairly high for using pooled testing. This wide range of disease prevalence enables us to explore the effectiveness of pooled testing under different scenarios. We treat these estimates as the true values of p and use them in the methods described in Sects. 4.1, 4.2, 4.3.
3 Preliminaries
Consider a public health scenario where N individuals are tested for a disease, such as HIV. Instead of conducting separate tests for each individual, we consider the use of pooled testing. Let p denote the disease prevalence, which is the proportion of individuals who truly have the disease, and let \(\widehat{p}\) denote the maximum likelihood estimator (MLE) of p. In this article, \(\widehat{p}\) is calculated from pooled testing data, and studying its precision is of primary interest.
Pooled testing can be performed in many ways, depending on the needs and context of an application. The simplest one involves assigning individual specimens to a fixed number of non-overlapping initial pools and conducting only the initial pooled tests (i.e., retesting is not performed). This is commonly referred to as “master pool testing” and is often used only for prevalence estimation purposes (Tu et al. 1995). When the goal is to identify positive cases, initial pools that test positive in stage 1 are resolved in the second stage by individual retesting or resolved in multiple stages in a hierarchical manner. Non-hierarchical methods, such as array testing and quality control-type testing, are also commonly used. The work presented in this article can accommodate both hierarchical and non-hierarchical data.
Suppose N individuals are tested using a pooling protocol, with a total number of T tests expended. Let \(Z_i\) denote the binary test response for pool i where \(Z_i = 1\) if the ith pool is diagnosed as positive and \(Z_i = 0\) otherwise, for \(i=1, 2,..., T\). Denote by \(\textbf{Z}=(Z_1, Z_2,..., Z_T)'\) the vector of all test responses. Let \(S_e\) and \(S_p\) denote the assay sensitivity and specificity, respectively. We assume throughout that \(S_e\) and \(S_p\) do not depend on the pool size—a common and reasonable assumption when the pool size is not too large (Kim et al. 2007). We also assume \(S_e\) and \(S_p\) are known and can be estimated from a pilot study or obtained from the assay product literature.
A few remarks are as follows. First, to simplify our notation, we use \(Z_i\) generically for any test response, whether it is from a pool, subpool, or individual (where the pool size is 1). Second, the test result \(Z_i\) is error-prone and potentially different from its true status, denoted by \(\widetilde{Z_i}\). As commonly used in the literature, our method accounts for the test errors (false negatives and false positives) through the specification of the sensitivity and specificity of the assay, defined as \(S_e = \text {pr}(Z_i=1|\widetilde{Z_i}=1)\) and \(S_p=\text {pr}(Z_i=0|\widetilde{Z_i}=0)\). Third, instead of parameter estimation, our work focuses on a design framework in which the number of pooled tests that would be needed for case identification cannot be determined prior to the study. Consequently, T, the number of expended tests in a pooling application, is best regarded as random when a multistage protocol, such as hierarchical or array testing, is used.
For the illustration of maximum likelihood estimation from pooled testing data, consider master pool testing where n initial pools are tested; i.e., in this particular case, T is fixed and \(T=n\). Then \(X=\sum _{i=1}^n Z_i\), the number of positive pooled tests, has binomial distribution. In this case, using the standard statistical techniques of maximum likelihood and information theory, the MLE \(\widehat{p}\) and its large-sample variance can be derived in closed form as
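\[
\widehat{p} = 1 - \left( \frac{S_e - \lambda}{r} \right)^{1/k} \tag{1}
\]
and
\[
\text{var}(\widehat{p}) = \frac{\{S_e - r(1-p)^k\}\{1 - S_e + r(1-p)^k\}}{n k^2 r^2 (1-p)^{2(k-1)}}, \tag{2}
\]
where k is the initial pool size, \(r=S_e + S_p - 1\), and \(\lambda = X/n\); for more details, refer to Tu et al. (1995). Based on the variance in Eq. (2), the asymptotic properties of \(\widehat{p}\) have been broadly studied for master pool testing, but comparable results are sparse or unavailable for multistage pooling scenarios.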
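As a rough illustration of the master pool testing estimator, the sketch below codes the standard closed forms \(\widehat{p} = 1 - \{(S_e-\lambda)/r\}^{1/k}\) and \(\text{var}(\widehat{p}) = \pi_k(1-\pi_k)/\{n k^2 r^2 (1-p)^{2(k-1)}\}\), where \(\pi_k = S_e - r(1-p)^k\) is the probability that a pool of size k tests positive. The function names are hypothetical and are not part of the authors' software; the clamping step is an added assumption to keep the estimate in [0, 1] when \(\lambda\) falls outside \([1-S_p, S_e]\).

```python
def pool_positive_prob(p, k, se, sp):
    """Probability that a pool of size k tests positive under an imperfect assay."""
    r = se + sp - 1.0
    return se - r * (1.0 - p) ** k


def mpt_mle(x, n, k, se, sp):
    """Prevalence estimate from x positive pools out of n pools of size k."""
    r = se + sp - 1.0
    lam = x / n
    ratio = (se - lam) / r
    # Assumption for this sketch: clamp so the estimate stays within [0, 1].
    ratio = min(max(ratio, 0.0), 1.0)
    return 1.0 - ratio ** (1.0 / k)


def mpt_variance(p, n, k, se, sp):
    """Large-sample variance of the master pool testing estimator."""
    r = se + sp - 1.0
    pi_k = pool_positive_prob(p, k, se, sp)
    return pi_k * (1.0 - pi_k) / (n * k**2 * r**2 * (1.0 - p) ** (2 * (k - 1)))
```

Setting k = 1 and n = N recovers the individual-testing case, in which the estimator reduces to \((\lambda + S_p - 1)/r\).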
For a multistage protocol, such as hierarchical or array testing, the likelihood function is often intractable, which makes the computation of the MLE and its variance using observed-data likelihood difficult. These issues have been discussed in the literature for regression problems, and the EM algorithm has been introduced for estimation (Xie 2001; Zhang et al. 2013; Warasi 2023). In the absence of covariates, as is the case in this article, we outline the EM algorithm for the estimation of p and the variance estimation technique in Web Appendix A. It is worth noting that our objective is not parameter estimation; we instead focus on optimizing the estimation and cost efficiencies.
4 Efficiency
We study efficiency from three perspectives. We first consider minimizing \(E[(\widehat{p}-p)^2]\), the mean squared error of the MLE \(\widehat{p}\), to achieve the best precision in estimation. Our next consideration is to minimize the expected number of tests, E[T], focusing on improving the screening efficiency. The third criterion is to minimize the cost per unit information \(E[T(\widehat{p}-p)^2]\). This combines the consideration of both screening and estimation and is aimed at achieving precise estimation while reducing testing costs. To assess these criteria while comparing to individual testing, we study three relative measures of optimality: relative testing efficiency (RTE), relative estimation efficiency (REE), and relative cost efficiency (RCE), which are defined as
\[
\text{RTE}(\widehat{p}_G, \widehat{p}_I) = \frac{E_G[T]}{E_I[T]}, \quad
\text{REE}(\widehat{p}_G, \widehat{p}_I) = \frac{E_G[(\widehat{p}-p)^2]}{E_I[(\widehat{p}-p)^2]}, \quad
\text{RCE}(\widehat{p}_G, \widehat{p}_I) = \frac{E_G[T(\widehat{p}-p)^2]}{E_I[T(\widehat{p}-p)^2]},
\]
where ‘G’ and ‘I’ stand for group/pooled testing and individual testing, respectively. When the relative value is 1, pooled testing and individual testing offer the same efficiency. When the relative value is smaller than 1 (or greater than 1), pooled testing offers higher (or lower) efficiency than individual testing. Thus, finding an optimal pooling strategy requires minimizing each relative measure as a function of the pool size(s). We do so under three constraints, described in the following subsections.
4.1 Efficiency with a Fixed Number of Individuals
We first consider the setting where the number of individuals, N, tested in a study is fixed. For master pooled testing, the number of expended tests \(T=N/k\) is also fixed, so \(E_G[T]=N/k\) and \(\text {RTE}(\widehat{p}_G, \widehat{p}_I) = 1/k\) because \(E_I[T]=N\) for individual testing; i.e., the expected number of tests decreases with pool size k. However, with fewer test responses, the precision of the MLE \(\widehat{p}\) may be compromised. We explore this using the large-sample properties of the MLE. In that case, the MLE \(\widehat{p}\) is unbiased, and consequently, \(E_G[(\widehat{p}-p)^2]\) reduces to the variance expression in Eq. (2). The variance of \(\widehat{p}\) for individual testing can also be found from Eq. (2) by setting \(n=N\) and \(k=1\). Then the relative estimation efficiency is
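\[
\text{REE}(\widehat{p}_G, \widehat{p}_I) = \frac{\{S_e - r(1-p)^k\}\{1 - S_e + r(1-p)^k\}}{k\,(1-p)^{2(k-1)}\,\{S_e - r(1-p)\}\{1 - S_e + r(1-p)\}}. \tag{3}
\]
As for cost efficiency, we find \(E_G[T(\widehat{p}-p)^2] = (N/k)\times \text {var}(\widehat{p}_G)\), which results in \(\text {RCE}(\widehat{p}_G, \widehat{p}_I)=(1/k)\times \text {REE}(\widehat{p}_G, \widehat{p}_I)\) and can be evaluated using Eq. (3).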
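The three relative measures for master pool testing with a fixed number of individuals can be evaluated in a few lines. The sketch below assumes the variance ratio takes the form \(\pi_k(1-\pi_k)/\{k(1-p)^{2(k-1)}\pi_1(1-\pi_1)\}\) with \(\pi_k = S_e - r(1-p)^k\), which follows from taking n = N/k pools of size k against N individual tests. The function names and the default cap of 16 on the pool size are illustrative choices, not the interface of the authors' package.

```python
def positive_prob(p, k, se, sp):
    """Probability that a pool of size k tests positive."""
    return se - (se + sp - 1.0) * (1.0 - p) ** k


def ree_fixed_individuals(p, k, se, sp):
    """Variance ratio (pooled / individual) when N individuals are tested."""
    pi_k = positive_prob(p, k, se, sp)
    pi_1 = positive_prob(p, 1, se, sp)
    denom = k * (1.0 - p) ** (2 * (k - 1)) * pi_1 * (1.0 - pi_1)
    return pi_k * (1.0 - pi_k) / denom


def relative_measures_mpt(p, k, se, sp):
    """Return (RTE, REE, RCE) for master pool testing with fixed N."""
    rte = 1.0 / k
    ree = ree_fixed_individuals(p, k, se, sp)
    return rte, ree, rte * ree


def best_pool_size(p, se, sp, max_k=16, measure=2):
    """Pool size in 1..max_k minimizing the chosen measure (0=RTE, 1=REE, 2=RCE)."""
    return min(range(1, max_k + 1),
               key=lambda k: relative_measures_mpt(p, k, se, sp)[measure])
```

For example, `best_pool_size(0.01, 0.95, 0.99, measure=1)` would search pool sizes 1 to 16 for the smallest REE at a hypothetical prevalence of 1%; these inputs are placeholders, not the Table 1 values.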
For multistage protocols, expressions of \(E_G(T)\) are provided in Kim et al. (2007) for a few particular cases (e.g., two-stage hierarchical and array protocols) but not for the general scenario. Whenever applicable, we use those expressions for the evaluation of \(\text {RTE}(\widehat{p}_G, \widehat{p}_I)\). For two-stage hierarchical testing, \(\text {REE}(\widehat{p}_G, \widehat{p}_I)\) can be derived using the Dorfman-type work in Zhang et al. (2020a). Unfortunately, for array testing, hierarchical testing with three or more stages, and other multistage protocols, no methods are available in the literature for the calculation of \(E_G[(\widehat{p}-p)^2]\). Evaluation of \(\text {RCE}(\widehat{p}_G, \widehat{p}_I)\) involves even further complexity because \(E_G[T(\widehat{p}-p)^2]\) requires deriving the joint distribution of \(\widehat{p}\) and T. To overcome these challenges, we provide a computation algorithm that enables us to numerically evaluate \(E_G[(\widehat{p}-p)^2]\), \(E_G[T]\), \(E_G[T(\widehat{p}-p)^2]\), or any complex quantities of interest.
To present a general computation technique, we let \(\textbf{Z}\) denote the collection of all test responses for a pooling protocol and let \(\widehat{p}\) denote the MLE of p calculated from this dataset. For complex pooling scenarios, we compute \(\widehat{p}\) using the EM algorithm. To approximate the expected Fisher information \(\mathcal {I}(p)\), we compute the observed Fisher information, denoted by I(p), using the missing data principle and Louis’s (1982) method; see Web Appendix A. It is worth mentioning that I(p) is a random quantity because it depends on the observed data \(\textbf{Z}\) and, thus, cannot be directly used for \(\mathcal {I}(p)\) when assessing efficiency. The computation algorithm is provided below.
1. Specify \(p=p_0\) as the true value, where \(p_0\) is an estimate from historical or pilot data.
2. Choose B, the number of repetitions of steps 2(a)-2(c). For each \(s=1,2,...,B\):
   (a) Simulate the pooled data \(\textbf{Z}^{(s)}\) for a given pooling protocol and record \(T^{(s)}\), the number of tests expended.
   (b) Compute the MLE \(\widehat{p}^{(s)}\) using the EM algorithm and find \(T^{(s)}(\widehat{p}^{(s)}-p)^2\).
   (c) Compute the observed Fisher information \(I(p)^{(s)}\) using Louis’s method.
3. Calculate the sample means: \(\overline{I}=\frac{1}{B}\sum _{s=1}^B I(p)^{(s)}\), \(\overline{T}=\frac{1}{B}\sum _{s=1}^B T^{(s)}\), and \(\overline{V}=\frac{1}{B}\sum _{s=1}^B T^{(s)}(\widehat{p}^{(s)}-p)^2\).
This algorithm provides \(\overline{I}\), \(\overline{T}\), and \(\overline{V}\), which serve as approximations of \(\mathcal {I}(p)\), E[T], and \(E[T(\widehat{p}-p)^2]\), respectively, and then \(\text {REE}(\widehat{p}_G, \widehat{p}_I)\), \(\text {RTE}(\widehat{p}_G, \widehat{p}_I)\) and \(\text {RCE}(\widehat{p}_G, \widehat{p}_I)\) can be evaluated. When B is large, by the law of large numbers, these approximations are reasonable. Note that the EM algorithm and Louis’s method in steps 2(b)-2(c) involve implementing a Gibbs sampler to recover individual latent true statuses, where we use \(G=3000\) Gibbs iterates, which generally provides sufficient random draws to accurately estimate \(\widehat{p}\) and I(p); refer to Web Appendix A for more details. We explored convergence of the computation algorithm under different scenarios of the disease prevalence, p, and found that \(B=5000\) repetitions may be sufficient for convergence; see Web Appendix B. Therefore, we used \(B=5000\) repetitions throughout. Interested readers may also refer to Warasi et al. (2022), where a similar computation algorithm was proposed for a multiple-infection estimation problem of animal diseases.
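The structure of the computation algorithm can be sketched for two-stage hierarchical testing as follows. This is a simplified stand-in, not the authors' implementation: it replaces the EM-based MLE with the closed-form master pool estimator computed from stage-1 results only, and it omits the observed information \(I(p)^{(s)}\) obtained via Louis's method. It therefore returns only \(\overline{T}\) and \(\overline{V}\). All function and parameter names are hypothetical.

```python
import random


def simulate_h2(n_ind, k, p, se, sp, rng):
    """Simulate one two-stage study; return (positive stage-1 pools, total tests)."""
    positives, tests = 0, 0
    for _ in range(n_ind // k):
        pool_truth = any(rng.random() < p for _ in range(k))
        tests += 1  # stage-1 pool test
        tested_positive = rng.random() < (se if pool_truth else 1.0 - sp)
        if tested_positive:
            positives += 1
            tests += k  # stage-2 individual retests (outcomes unused in this sketch)
    return positives, tests


def monte_carlo_h2(n_ind, k, p0, se, sp, reps, seed=1):
    """Approximate E[T] and E[T (p_hat - p0)^2] by averaging over simulated studies."""
    rng = random.Random(seed)
    r = se + sp - 1.0
    n_pools = n_ind // k
    t_sum, v_sum = 0.0, 0.0
    for _ in range(reps):
        x, t = simulate_h2(n_ind, k, p0, se, sp, rng)
        # Stand-in estimator (assumption): stage-1 closed form, clamped to [0, 1].
        ratio = min(max((se - x / n_pools) / r, 0.0), 1.0)
        p_hat = 1.0 - ratio ** (1.0 / k)
        t_sum += t
        v_sum += t * (p_hat - p0) ** 2
    return t_sum / reps, v_sum / reps
```

Because retest outcomes are ignored, the \(\overline{V}\) from this sketch would overstate the error of an estimator that uses all responses; it is meant only to show how T and \(\widehat{p}\) are recorded jointly in each repetition.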
Now, we explore the efficiency for five commonly used protocols: master pool testing (MPT), hierarchical testing with two stages (H2) and three stages (H3), and array testing without master pool test (A2) and with master pool test (A2M). Our software tools can provide efficiency results for four-stage hierarchical testing (H4) as well, but we do not use it because the prevalence rates in our LDH infectious disease data are too high for a four-stage protocol. As mentioned in Sect. 3, MPT uses only initial pools. In H2, initial positive pools are resolved by individual retesting in stage 2, while for H3 an intermediate stage is used for subpooling before resolving positive pools. For A2, individual specimens are first placed on the rows and columns of a square array, and then pooled samples comprised of the specimens from each row/column are tested in stage 1. In stage 2, individual retesting is conducted for case identification based on the strategy described in Kim et al. (2007). Similarly, the A2M protocol implements array testing but conducts an initial screening test for all specimens of the rows and columns. Note that testing for H3 and H4 proceeds with the framework that the pool size in one stage is evenly divisible by the pool size used in the immediately next stage—an assumption broadly adopted in the pooled testing literature; see Kim et al. (2007).
We use \(k=16\) as the maximum initial pool size, as it is commonly used for screening infectious diseases (Dodd et al. 2002; Bilder and Tebbs 2012). Although our work can accommodate pools of any size, larger pools are generally avoided as a precaution against potential dilution effects. The relative measures are calculated using the LDH data prevalence estimates (p) and the assay sensitivity/specificity (\(S_e\) and \(S_p\)) from Table 1. These calculations are performed over all possible configurations of pool size and are depicted in Fig. 1. For convenience, Fig. 1 shows results for only MPT, H2, and A2, while the entire results are presented in the electronic supplementary materials. Note that the RTE for all protocols is calculated using the closed-form expressions in Kim et al. (2007) and the REE for H2 is calculated using the variance presented in Web Appendix A. For other multistage scenarios, REE and RCE are evaluated using the computational algorithm above.
For MPT, the RTE decreases with the initial pool size k, because \(\text {RTE}(\widehat{p}_G, \widehat{p}_I) = 1/k\) as discussed above. For the H2 protocol, RTE is minimized at 9, 6, 4, and 3 for HIV, gonorrhea, chlamydia, and SARS-CoV-2, respectively. When estimation is of concern, interesting patterns are observed. First, when only initial pools are tested (i.e., MPT), there could be a substantial loss in estimation efficiency if the pool size is not carefully chosen, as depicted in the left-middle subplot of Fig. 1. Second, one might expect pools of smaller size would be better because that would generate more test responses. While this is generally true, in cases of very low prevalence, larger pools (i.e., fewer tests) may actually be more effective for precise estimation, as seen with HIV. For a multistage protocol, such as H2, estimation precision is not much affected by the pool size (the lines are fairly flat). This is why when the number of tests is taken into account, the cost-efficiency values are optimized at pool sizes that are identical or close to the optimal pool sizes based on RTE. For the A2 protocol, similar patterns are observed, but the optimal efficiencies occur at larger pools. Refer to Web Appendix D where we show three best pooling configurations, which might be useful when selecting an optimal or suboptimal configuration depending on the goal of a surveillance study.
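For two-stage hierarchical testing, the RTE can be explored with a short search over pool sizes. The sketch below assumes the standard Dorfman-type expression for the expected number of tests per individual, \(1/k + S_e - r(1-p)^k\), with the same \(S_e\) and \(S_p\) at both stages; this is taken here as a working assumption rather than as the exact form used in Kim et al. (2007). Function names are hypothetical.

```python
def h2_tests_per_individual(p, k, se, sp):
    """Expected tests per individual for two-stage testing; equals RTE when k >= 2."""
    r = se + sp - 1.0
    return 1.0 / k + se - r * (1.0 - p) ** k


def best_h2_pool_size(p, se, sp, max_k=16):
    """Pool size in 2..max_k with the smallest expected number of tests."""
    return min(range(2, max_k + 1),
               key=lambda k: h2_tests_per_individual(p, k, se, sp))
```

With a perfect assay and a prevalence of 1%, this search returns the classical Dorfman pool size of 11.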
4.2 Efficiency with a Fixed Number of Tests
This section proceeds assuming that the number of tests is fixed, a scenario commonly encountered when the testing budget is limited. In this case, we use the MPT protocol where only initial pools are tested. We do not use a multistage protocol, such as H2, because the number of tests, T, required for case identification is random and depends on whether the initial pools test positive or not. In this section, T is fixed, but the pool size k may vary.
Under this constraint, \(\text {RTE}(\widehat{p}_G, \widehat{p}_I)=1\) and \(\text {RCE}(\widehat{p}_G, \widehat{p}_I)\) reduces to \(\text {REE}(\widehat{p}_G, \widehat{p}_I)\), because T is a constant and identical for both pooled and individual testing. After simple algebraic steps, we find that \(\text {REE}(\widehat{p}_G, \widehat{p}_I)\) is (1/k) times the expression shown in Eq. (3). This indicates that the estimation efficiency discussed in Sect. 4.1 can be increased by a factor of k. The \(\text {REE}(\widehat{p}_G, \widehat{p}_I)\) expression can be simplified using the approximations \((1-p)^k \approx 1-p k\) and \((1-p)^{2(k-1)} \approx 1-2p(k-1)\), and more theoretical insights can be extracted as a function of p, \(S_e\), \(S_p\), and k. Many of the theoretical properties have been studied in the literature; e.g., see Liu et al. (2012) and Tu et al. (1995). However, we limit our discussion to determining only the optimal pool size and developing software tools so clinicians can directly use the optimality methods in applications.
Using the p, \(S_e\), and \(S_p\) values from Table 1, \(\text {REE}(\widehat{p}_G, \widehat{p}_I)\) is calculated and presented in Table 2. Here, the optimal pool size is much larger than that calculated under the constraint of a fixed number of individuals in Sect. 4.1. For gonorrhea, the minimum REE value is 0.0674, corresponding to a pool size of \(k=28\). This implies that the variance of the MLE, \(\widehat{p}\), using individual testing is approximately 15 times greater than using pooled testing with \(k=28\). However, increasing the pool size is not always better as is seen in Table 2. Another important aspect is that 28 times more individuals can be screened using pooled testing, when compared to individual testing. This can be useful in blood transfusion, animal testing, or other applications where a quick screening test is necessary. For diseases with larger prevalence, the optimal precision is achieved at smaller pool sizes and the loss in estimation precision can be enormous if the pool size is not optimally determined. For example, for SARS-CoV-2, the optimal REE at \(k=6\) is 0.33, while the REE value at \(k=35\) is 783.67. For HIV, where the prevalence is very low, the minimal REE value is 0.0130, which occurs at \(k=102\). However, in practical applications, pools of much smaller size may be used. In such cases, the optimal pool size can be determined within the upper threshold, such as \(k=16\), as discussed in Sect. 4.1.
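Under a fixed number of tests, the search for the optimal pool size can be sketched as below. The code assumes the REE equals \(\pi_k(1-\pi_k)/\{k^2(1-p)^{2(k-1)}\pi_1(1-\pi_1)\}\) with \(\pi_k = S_e - r(1-p)^k\), i.e., 1/k times the fixed-N ratio. Function names and the search limit are illustrative and do not come from the authors' software.

```python
def ree_fixed_tests(p, k, se, sp):
    """Variance ratio (pooled / individual) when both designs use T tests."""
    r = se + sp - 1.0
    pi_k = se - r * (1.0 - p) ** k
    pi_1 = se - r * (1.0 - p)
    denom = k**2 * (1.0 - p) ** (2 * (k - 1)) * pi_1 * (1.0 - pi_1)
    return pi_k * (1.0 - pi_k) / denom


def optimal_pool_size_fixed_tests(p, se, sp, max_k=120):
    """Return (k, REE) minimizing REE over pool sizes 1..max_k."""
    best_k = min(range(1, max_k + 1),
                 key=lambda k: ree_fixed_tests(p, k, se, sp))
    return best_k, ree_fixed_tests(p, best_k, se, sp)
```

Restricting `max_k` to 16 gives the constrained optimum mentioned in the text for applications where large pools are avoided.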
4.3 Efficiency with a Fixed Minimum Level of Precision
In this section, our goal is to determine the pooling configuration that allows us to maintain a minimum level of precision in estimation. To accomplish this, we set an upper bound on the mean squared error \(E_G[(\widehat{p}-p)^2]\) or, equivalently, the standard error. Unlike the first two constraints in Sects. 4.1 and 4.2, the constraint here allows that both the pool size, k, and the number of tests, T, can vary over their possible values. As in Sect. 4.2, we use only the MPT protocol for this investigation.
Let \(E_0\) denote the maximum standard error allowed for the estimator \(\widehat{p}\) and let \(T^*\) denote the minimum number of tests required to achieve that precision. Then setting the constraint \(\text {max}_T\,\, E_G[(\widehat{p}-p)^2] \le E_0^2\) and using the large-sample property of the MLE, we find
\[
T^* = \frac{\{S_e - r(1-p)^k\}\{1 - S_e + r(1-p)^k\}}{E_0^2\, k^2 r^2 (1-p)^{2(k-1)}}
\]
for a given pool size k; i.e., \(T^*\) is the smallest T that satisfies the above constraint. Note that the final result needs to be reported as \(\big \lceil T^*\big \rceil \), the smallest integer greater than or equal to \(T^*\) (i.e., the ceiling of \(T^*\)).
At the prevalence, sensitivity, and specificity values in Table 1, we calculate \(T^*\) for four choices of standard error: \(E_0 = 0.005, 0.010, 0.015, 0.020\). The results, presented in Table 3 for pool sizes \(k = 2, 3,..., 35\), reveal interesting patterns. For gonorrhea, the number of tests decreases sharply as the pool size increases, but becomes stable at around \(k = 10\). However, increasing the pool size is not always beneficial; for instance, when \(E_0 = 0.005\), \(T^*\) reaches its optimal (i.e., minimum) value at \(k = 27\) but then begins to increase. Similar patterns are noticed for other infections as well. A key observation is that as the prevalence p increases, the minimum number of tests required to reach a certain precision also increases. This can be explained as follows. If a pool comprises multiple positive individuals, which is roughly the case when \(p > 0.10\), a positive pooled response yields less estimation information than what would be generated by the usual one-at-a-time testing. With a larger p, this information loss further increases. Another observation is that with a smaller standard error (i.e., when a more precise estimate is desired), more tests are needed, and the optimal \(T^*\) is found at a larger pool size. However, caution is needed when pool dilution is a concern. In that case, the optimal configuration can be sought within a range of pool sizes, such as \(k=2, 3,..., 16\).
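The minimum number of tests for a target standard error can be computed directly. The sketch below assumes the master pool variance \(\pi_k(1-\pi_k)/\{T k^2 r^2 (1-p)^{2(k-1)}\}\) with \(\pi_k = S_e - r(1-p)^k\), solved for T at \(\text{var}(\widehat{p}) = E_0^2\). Function names are hypothetical, and the inputs in the usage note are placeholders rather than the Table 1 values.

```python
import math


def min_tests_raw(p, k, se, sp, e0):
    """Unrounded number of pooled tests giving standard error e0 at pool size k."""
    r = se + sp - 1.0
    pi_k = se - r * (1.0 - p) ** k
    return pi_k * (1.0 - pi_k) / (e0**2 * k**2 * r**2 * (1.0 - p) ** (2 * (k - 1)))


def min_tests(p, k, se, sp, e0):
    """Ceiling of the raw value, as reported in practice."""
    return math.ceil(min_tests_raw(p, k, se, sp, e0))


def best_pool_size_for_precision(p, se, sp, e0, k_values=range(2, 36)):
    """Pool size among k_values that needs the fewest tests."""
    return min(k_values, key=lambda k: min_tests_raw(p, k, se, sp, e0))
```

Passing `k_values=range(2, 17)` would restrict the search to pools of at most 16 specimens when dilution is a concern.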
5 Software Package
Implementing optimization techniques in real-world applications is crucial for practitioners. To facilitate this, we offer two software tools: a package in R and an application written using the shiny package. While the package is developed for R users, the software application is designed for a broader audience. For any range of initial pool sizes, our software can provide the efficiency results discussed in Sects. 4.1, 4.2, 4.3.
In the R package, we provide the function mle.prop.eff, which can be used to evaluate the relative efficiencies (RTE, REE, and RCE) for six pooling protocols. Initial pool sizes can be specified through the argument ‘initial.psz’. Because the computation algorithm in Sect. 4.1 is computationally intensive (it involves a Gibbs sampler), we developed compiled Fortran and C programs and integrated them into the R function to speed up the computation. For further efficiency gain, our software offers the option to use multiple processing cores through the argument ‘ncore’. For the H3 protocol with pool sizes 9, 3, and 1 in stages 1-3 (with \(B=5000\) repetitions, each having \(G=3000\) Gibbs iterates), the computing times were approximately 0.01, 20, and 33 minutes for RTE, REE, and RCE, respectively. These times were recorded using a single core on a computer with an Intel Core i7-10750 2.60 GHz processor and 32 GB of RAM. With multiple cores, the execution can be substantially faster; refer to Web Appendix C, where the function mle.prop.eff is briefly illustrated. Additional information and examples can be found in the documentation of the R package.
We have uploaded our software tools and code as supplementary material with this article, which can be used to reproduce all results presented. Furthermore, we provided the R function in the package groupTesting (Warasi 2024), which is available on the Comprehensive R Archive Network (CRAN), and uploaded the software application to https://mdwarasi.shinyapps.io/optimizeGT-app/.
6 Discussion
Pooled testing has been studied for decades and is recognized as an attractive alternative to individual testing. However, two challenges often limit its practical implementation. First, determining an optimal pooling configuration is not always possible using existing methods, except in simple pooling scenarios. Second, even where optimal methods exist, implementing them is not straightforward because of the complicated nature of pooled data and models. In this article, we sought to unify numerous contributions in a common framework and presented a computational technique that is conceptually simple yet broadly useful. We have also developed software tools for optimization in disease surveillance studies, focusing on user-friendliness, computing efficiency, and portability.
An important assumption adopted in our methods is that the prevalence \(p\) can be reliably estimated from historical or pilot study data and can be regarded as the true value of \(p\). This assumption, although somewhat restrictive, is not unreasonable because a plethora of disease screening data becomes available every year to clinicians and public health officials. However, if the uncertainty in the prior estimates of \(p\) is of major concern, more advanced methods such as multistage adaptive pooling (Hughes-Oliver and Swallow 1994) or a Bayesian formulation (Atkinson et al. 1993) can be pursued in future work to make the optimization more robust. Another simplification in our work is that it focuses on the most natural and basic model structure. As a result, we did not address additional complexities, such as dilution effects (Zenios and Wein 1998), differential errors (Zhang et al. 2020b), and correlated data (Lendle et al. 2012), in our model framework. We conjecture that deriving analytical results for such advanced methods may rarely be possible. However, the computation algorithm outlined in this article can be extended to calculate the results numerically, as illustrated in Sect. 4.1. The software we provide can also be extended with modest effort.
We illustrated our work using infectious disease data from the LDH and demonstrated that major cost savings can be realized when pool sizes are chosen optimally. Similar benefits may be attained for other screenings in the LDH, including hepatitis B/C and influenza. In addition, our approach has the potential to be applied to infectious disease studies elsewhere, such as chlamydia/gonorrhea surveillance studies in Iowa (Tebbs et al. 2013) and rapid SARS-CoV-2 screening practices worldwide, particularly in resource-limited regions (Mutesa et al. 2021).
Data Availability
The data used in this study consist of counts of results for HIV, gonorrhea, chlamydia, and SARS-CoV-2, all of which are presented in Table 1.
References
Abdalhamid B, Bilder C, McCutchen E, Hinrichs S, Koepsell S, Iwen P (2020) Assessment of specimen pooling to conserve SARS-CoV-2 testing resources. Am J Clin Pathol 153:715–18. https://doi.org/10.1093/AJCP/AQAA064
Atkinson A, Chaloner K, Herzberg A, Juritz J (1993) Optimum experimental designs for properties of a compartmental model. Biometrics 49:325–37. https://doi.org/10.2307/2532547
Bilder C, Tebbs J (2012) Pooled testing procedures for screening high volume clinical specimens in heterogeneous populations. Stat Med 31:3261–68. https://doi.org/10.1002/sim.5334
Brookmeyer R (1999) Analysis of multistage pooling studies of biological specimens for estimating disease incidence and prevalence. Biometrics 55:608–12. https://doi.org/10.1111/j.0006-341x.1999.00608.x
Chi X, Lou X, Yang M, Shu Q (2009) An optimal DNA pooling strategy for progressive fine mapping. Genetica 135:267–81. https://doi.org/10.1007/s10709-008-9275-5
Daniel E, Esakialraj B, Muthuramalingam A, Karunaianantham R, Karunakaran L, Nesakumar M, Selvachithiram M, Pattabiraman S, Natarajan S, Tripathy S, Hanna L (2021) Pooled testing strategies for SARS-CoV-2 diagnosis: a comprehensive review. Diagn Microbiol Infect Dis 101:115432. https://doi.org/10.1016/j.diagmicrobio.2021.115432
Delaigle A, Hall P, Wishart J (2014) New approaches to nonparametric and semiparametric regression for univariate and multivariate group testing data. Biometrika 101:567–85. https://doi.org/10.1093/biomet/asu025
Dhand N, Johnson W, Toribio J (2010) A Bayesian approach to estimate OJD prevalence from pooled fecal samples of variable pool size. J Agric Biol Environ Stat 15:452–73. https://doi.org/10.1007/s13253-010-0032-8
Ding J, Xiong W (2016) A new estimator for a population proportion using group testing. Commun Stat - Simul Comput 45:101–14. https://doi.org/10.1080/03610918.2013.854909
Dodd R, Notari E, Stramer S (2002) Current prevalence and incidence of infectious disease markers and estimated window-period risk in the American Red Cross blood donor population. Transfusion 42:975–79. https://doi.org/10.1046/j.1537-2995.2002.00174.x
Dorfman R (1943) The detection of defective members of large populations. Ann Math Stat 14:436–40. https://doi.org/10.1214/aoms/1177731363
Haber G, Malinovsky Y, Albert P (2018) Sequential estimation in the group testing problem. Seq Anal 37:1–17. https://doi.org/10.1080/07474946.2017.1394716
Hanson T, Johnson W, Gastwirth J (2006) Bayesian inference for prevalence and diagnostic test accuracy based on dual-pooled screening. Biostatistics 7:41–57. https://doi.org/10.1093/biostatistics/kxi039
Hourfar M, Jork C, Schottstedt V, Weber-Schehl M, Brixner V, Busch M, Geusendam G, Gubbe K, Mahnhardt C, Mayr-Wohlfar W et al (2008) Experience of German Red Cross blood donor services with nucleic acid testing: Results of screening more than 30 million blood donations for human immunodeficiency virus, hepatitis C virus, and hepatitis B virus. Transfusion 48:1558–66. https://doi.org/10.1111/j.1537-2995.2008.01718.x
Hughes-Oliver J, Swallow W (1994) A two-stage adaptive group-testing procedure for estimating small proportions. J Am Stat Assoc 89:982–93. https://doi.org/10.1080/01621459.1994.10476832
Kim H, Hudgens M, Dreyfuss J, Westreich D, Pilcher C (2007) Comparison of group testing algorithms for case identification in the presence of testing error. Biometrics 63:1152–63. https://doi.org/10.1111/j.1541-0420.2007.00817.x
Lendle S, Hudgens M, Qaqish B (2012) Group testing for case identification with correlated responses. Biometrics 68:532–40. https://doi.org/10.1111/j.1541-0420.2011.01674.x
Lindan C, Mathur M, Kumta S, Jerajani H, Gogate A, Schachter J, Moncada J (2005) Utility of pooled urine specimens for detection of Chlamydia trachomatis and Neisseria gonorrhoeae in men attending public sexually transmitted infection clinics in Mumbai, India, by PCR. J Clin Microbiol 43:1674–7. https://doi.org/10.1128/JCM.43.4.1674-7.2005
Liu A, Liu C, Zhang Z, Albert P (2012) Optimality of group testing in the presence of misclassification. Biometrika 99:245–51. https://doi.org/10.1093/biomet/asr064
Louis T (1982) Finding the observed information matrix when using the EM algorithm. J Royal Stat Soc Series B (Methodology) 44:226–33. https://doi.org/10.1111/j.2517-6161.1982.tb01203.x
Mutesa L, Ndishimye P, Butera Y, Souopgui J, Uwineza A, Rutayisire R, Ndoricimpaye E, Musoni E, Rujeni N, Nyatanyi T et al (2021) A pooled testing strategy for identifying SARS-CoV-2 at low prevalence. Nature 589:276–80. https://doi.org/10.1038/s41586-020-2885-5
Nguyen N, Bish E, Aprahamian H (2018) Sequential prevalence estimation with pooling and continuous test outcomes. Stat Med 37:2391–2426. https://doi.org/10.1002/sim.7657
Pilcher C, Fiscus S, Nguyen T, Foust E, Wolf L, Williams D, Ashby R, O’Dowd J, McPherson J, Stalzer B, Hightow L, Miller W, Eron J, Cohen M, Leone P (2005) Detection of acute infections during HIV testing in North Carolina. N Engl J Med 352:1873–83. https://doi.org/10.1056/NEJMoa042291
Quinn T, Brookmeyer R, Kline R, Shepherd M, Paranjape R, Mehendale S, Gadkari D, Bollinger R (2000) Feasibility of pooling sera for HIV-1 viral RNA to diagnose acute primary HIV-1 infection and estimate HIV incidence. AIDS 14:2751–7. https://doi.org/10.1097/00002030-200012010-00015
R Core Team (2024) A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org
Speybroeck N, Williams C, Lafia K, Devleesschauwer B, Berkvens D (2012) Estimating the prevalence of infections in vector populations using pools of samples. Med Vet Entomol 26:361–71. https://doi.org/10.1111/j.1365-2915.2012.01015.x
Stramer S, Notari E, Krysztof D, Dodd R (2013) Hepatitis B virus testing by minipool nucleic acid testing: does it improve blood safety? Transfusion 53:2449–58. https://doi.org/10.1111/trf.12213
Tebbs J, McMahan C, Bilder C (2013) Two-stage hierarchical group testing for multiple infections with application to the infertility prevention project. Biometrics 69:1064–73. https://doi.org/10.1111/biom.12080
Tu X, Litvak E, Pagano M (1995) On the informativeness and accuracy of pooled testing in estimating prevalence of a rare disease: application to HIV screening. Biometrika 82:287–97. https://doi.org/10.1093/biomet/82.2.287
Van T, Miller J, Warshauer D, Reisdorf E, Jerrigan D, Humes R, Shult P (2012) Pooling nasopharyngeal/throat swab specimens to increase testing capacity for influenza viruses by PCR. J Clin Microbiol 50:891–6. https://doi.org/10.1128/JCM.05631-11
Warasi M (2023) groupTesting: an R package for group testing estimation. Commun Stat - Simul Comput 52:6210–224. https://doi.org/10.1080/03610918.2021.2009867
Warasi M (2024) groupTesting: Simulating and Modeling Group (Pooled) Testing Data. R package version 1.3.0. https://cran.r-project.org/web/packages/groupTesting
Warasi M, Hungerford L, Lahmers K (2022) Optimizing pooled testing for estimating the prevalence of multiple diseases. J Agric Biol Environ Stat 27:713–27. https://doi.org/10.1007/s13253-022-00511-4
Wang D, McMahan C, Gallagher M, Kulasekera B (2014) Semiparametric group testing regression models. Biometrika 101:587–98. https://doi.org/10.1093/biomet/asu007
Xie M (2001) Regression analysis of group testing samples. Stat Med 20:1957–69. https://doi.org/10.1002/sim.817
Zenios S, Wein L (1998) Pooled testing for HIV prevalence estimation: exploiting the dilution effect. Stat Med 17:1447–67
Zhang B, Bilder C, Tebbs J (2013) Group testing regression model estimation when case identification is a goal. Biom J 55:173–89. https://doi.org/10.1002/bimj.201200168
Zhang W, Liu A, Li Q, Albert P (2020a) Incorporating retesting outcomes for estimation of disease prevalence. Stat Med 39:687–97. https://doi.org/10.1002/sim.8439
Zhang W, Liu A, Li Q, Albert P (2020b) Nonparametric estimation of distributions and diagnostic accuracy based on group-tested results with differential misclassification. Biometrics 76:1147–56. https://doi.org/10.1111/biom.13236
Acknowledgements
We are grateful to the Editor, Associate Editor, and two anonymous reviewers for many helpful suggestions, which have substantially improved the manuscript. We express our gratitude to our colleagues at the LDH (Louisiana Department of Health) laboratory for their important collaboration in sampling, testing, and compiling the data. We also extend our thanks to Drs. Richard Tulley and Arundhati Bakshi of LDH for their consultation and insightful comments regarding the testing and dataset. We thank the IRB at LDH for their cooperation. We would like to thank Dr. Joshua Tebbs for taking the time to graciously review an earlier version of the manuscript and for providing helpful feedback.
Ethics declarations
Conflict of interest
There are no conflicts of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Warasi, M.S., Das, K.P. Optimizing Disease Surveillance Through Pooled Testing with Application to Infectious Diseases. JABES (2024). https://doi.org/10.1007/s13253-024-00646-6