1 Introduction

Screening for diseases, especially those with low prevalence, can be very costly and time-consuming. Group testing, as a cost-effective strategy, has been widely used to identify diseased subjects in many fields, for example, genetics [10], infectious disease screening [9, 11, 26], the pharmaceutical industry [14], and agriculture [20], among others. Such a strategy was first used by Dorfman [6] to test pooled blood samples for syphilis antigen in US Army recruits. In Dorfman’s [6] procedure, the blood samples of subjects are pooled into groups prior to testing. If a group tests negative, all subjects in the group are declared negative. Otherwise, at least one subject in the group is infected, and all subjects in the group are subsequently retested to identify the diseased ones. Compared with individual testing, this screening procedure can greatly reduce cost when the disease prevalence is low, since a much larger number of groups test negative. It can also reduce the turnaround time for test results. Recently, group testing has been used for SARS-CoV-2 detection (Lagopati et al. [16]).

Since Dorfman’s [6] seminal work on group testing, a lot of work has been done in this area [12, 22, 23, 25]. When personal information is available, informative group testing procedures have been developed to further improve screening accuracy in terms of certain operating characteristics [2, 3, 21]. Common operating characteristics include efficiency (expected number of tests per subject), pooling specificity and sensitivity, and positive and negative predictive values [18].

Predictive value, as one of the most important measures of a diagnostic test’s accuracy [4, 7], has also been used to evaluate the performance of a group testing procedure [15, 19]. Four predictive values, namely the true-positive, false-positive, true-negative, and false-negative predictive values, are usually investigated [15, 17]. The true (false)-positive predictive value is the probability that a subject tested positive is truly diseased (disease-free), and the true (false)-negative predictive value is the probability that a subject tested negative is truly disease-free (diseased). The smaller the false-negative and false-positive predictive values are, the better a group testing procedure is. A low false-negative predictive value is particularly desirable for life-threatening diseases such as human immunodeficiency virus infection, where missing treatment has serious consequences, and for coronavirus disease, which is transmitted quickly among human beings.

In this work, we study four predictive values from a general group testing procedure in which testing is conducted at multiple stages. By comparing it with the individual testing procedure, we show that the false-negative predictive value of a group testing procedure is larger, while the false-positive predictive value is smaller. Moreover, the false-negative predictive value increases with the testing stage. We therefore propose a nested group testing strategy that retests negative groups, which is shown to yield smaller false-negative and false-positive predictive values than the individual testing procedure. The remaining parts of the paper are arranged as follows. In Sect. 2, we introduce the predictive values from a general group testing procedure and show that its false-negative predictive value is larger, and its false-positive predictive value is lower, than those from the individual testing procedure. We then propose a nested group testing procedure that can improve the false-negative predictive value. Applications of the new method to Dorfman’s, Halving and Sterrett procedures are discussed in Sect. 3. Extensive simulation studies and a real data analysis are conducted in Sect. 4 to investigate the performance of the proposed method. Section 5 concludes the work. The technical details are provided in the Appendix.

2 Main Results

2.1 Notations

Consider a general multi-stage group testing procedure, shown in Fig. 1a and denoted by \({\mathcal {A}}_{O}\), in which groups that test positive are successively and randomly split into subgroups for retesting. In the first stage, all available specimens are randomly divided into a certain number of groups and testing is conducted at the group level. In the subsequent stages, if a group tests negative, no further splitting is needed and all its members are declared negative; groups that test positive are further split into subgroups and tested until all subgroups are declared negative or individual testing occurs. Assume that there are n specimens \(X_1, \ldots , X_n\) tested by \({\mathcal {A}}_{O}\) with \(L(\ge 1)\) stages. Denote the probability that the sth specimen is truly diseased by \(p_{s}\), \(s=1,\ldots ,n\); these probabilities are allowed to differ among subjects. Suppose specimens are tested by an assay with sensitivity \(S_e\) and specificity \(S_p\). We assume that \(S_e\) and \(S_p\) do not depend on the group size.

Fig. 1

An illustrative diagram of L-stage group testing procedure \({\mathcal {A}}_O\) (subfigure a) where positive groups are successively split and tested until all specimens are detected. For negative groups, no further splitting is conducted and all specimens in the group are declared negative. Subfigure b illustrates L-stage nested group testing procedure \({\mathcal {A}}_N\), where retesting is performed for negative groups using Dorfman’s procedure \({\mathcal {A}}_D\)

Denote the testing results of these n specimens by \({\mathcal {I}}_1^{(l_1)},\ldots ,{\mathcal {I}}_n^{(l_n)}\) taking 0 if being declared negative and 1 otherwise, with the corresponding true diseased statuses being \(\widetilde{{\mathcal {I}}}_1,\ldots ,\widetilde{{\mathcal {I}}}_n\), where \(l_s\in \{1,\ldots ,L\}\) is the stage at which the disease status of the sth specimen is declared, \(s=1,\ldots ,n\). For example, if \(X_1\) is tested negative in the third stage, then \(l_1=3\). For \(s=1,\ldots , n\), the false-negative and true-negative predictive values are defined, respectively, as \(\xi _{1,{\mathcal {A}}_{O}}(X_s)= \mathrm{Pr}\big (\widetilde{{\mathcal {I}}}_s=1|{\mathcal {I}}^{(l_s)}_s=0\big )~\hbox {and}~\xi _{2,{\mathcal {A}}_{O}}(X_s)= \mathrm{Pr}\big (\widetilde{{\mathcal {I}}}_s=0|{\mathcal {I}}^{(l_s)}_s=0\big ).\) Similarly, the false-positive and true-positive predictive values are \(\eta _{1,{\mathcal {A}}_{O}}(X_s)= \mathrm{Pr}\big (\widetilde{{\mathcal {I}}}_s=0|{\mathcal {I}}^{(l_s)}_s=1\big )~\hbox {and}~\eta _{2,{\mathcal {A}}_{O}}(X_s)= \mathrm{Pr}\big (\widetilde{{\mathcal {I}}}_s=1|{\mathcal {I}}^{(l_s)}_s=1\big )\), respectively. Denote the true and testing status of the group containing \(X_{s}\) in the jth stage by \({\widetilde{G}}^{(j)}(X_s)\) and \(G^{(j)}(X_s)\) , \(j\le l_s\), \(s=1,\ldots ,n\).

We use a toy example to illustrate the effectiveness of retesting. Consider Dorfman’s algorithm. For a master pool of fixed size, suppose we perform testing in two ways: one uses a group size of k, denoted by \({\mathcal {A}}_D(k)\); the other uses a group size of 2k with each group tested twice, denoted by \({\mathcal {A}}'_D(2k)\). The overall number of tests is then almost the same for \({\mathcal {A}}_D(k)\) and \({\mathcal {A}}'_D(2k)\). The corresponding false-negative predictive values of a specimen X tested negative in the first stage are \(\xi _{1,{\mathcal {A}}_D(k)}(X) =\mathrm{Pr}\big (\widetilde{{\mathcal {I}}}=1|G_{k}=0\big )\) and \(\xi _{1,{\mathcal {A}}'_D(2k)}(X) =\mathrm{Pr}\big (\widetilde{{\mathcal {I}}}=1|G_{1,2k}=0,G_{2,2k}=0\big )\), respectively, where \(G_{1,2k}\) and \(G_{2,2k}\) represent the two test results of the group that is tested twice. Using Bayes’ rule, we obtain \(\frac{\xi _{1,{\mathcal {A}}_D(k)}(X)}{\xi _{1,{\mathcal {A}}'_D(2k)}(X)}=1+\frac{S_p(S_e+S_p-1)(1-p)^k}{(1-S_e)[1-S_e+(S_e+S_p-1)(1-p)^k]}.\) The detailed derivation is given in Appendix A. This ratio is far larger than 1 when the sensitivity \(S_e\) approaches 1. For example, let \(S_e=S_p=0.99\), \(p=0.01\) and \(k=10,\) then the ratio is 98.9. This means that the false-negative predictive value can be significantly reduced through retesting, while using almost the same number of tests as the ordinary Dorfman’s algorithm. This toy example shows the advantage of retesting in group testing and motivates us to investigate thoroughly the properties of group testing algorithms that incorporate retesting.
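The ratio in the toy example can be evaluated numerically. The short Python sketch below simply codes the displayed expression; the function name `dorfman_retest_ratio` is an illustrative choice and not part of the paper.

```python
def dorfman_retest_ratio(se: float, sp: float, p: float, k: int) -> float:
    """Ratio of false-negative predictive values from the toy example.

    Implements 1 + Sp*r*q / ((1 - Se) * (1 - Se + r*q)),
    with r = Se + Sp - 1 and q = (1 - p)^k, as displayed in the text.
    """
    q = (1.0 - p) ** k      # probability that k specimens are all disease-free
    r = se + sp - 1.0
    return 1.0 + sp * r * q / ((1.0 - se) * (1.0 - se + r * q))


# Setting used in the text: Se = Sp = 0.99, p = 0.01, k = 10.
ratio = dorfman_retest_ratio(se=0.99, sp=0.99, p=0.01, k=10)
print(f"{ratio:.1f}")  # 98.9
```

Because the factor \(1-S_e\) appears in the denominator, the ratio grows quickly as the sensitivity approaches 1, which is the behaviour described above.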

2.2 Predictive Values of \({\mathcal {A}}_{O}\)

For some \(s\in \{1,2,\ldots ,n\}\), if a specimen \(X_s\) is tested negative at stage \(l_s\), then the groups containing \(X_s\) must have tested positive at all previous stages, that is, \(G^{(j)}(X_s)=1,j< l_{s}\) and \(G^{(l_s)}(X_s)=0.\) On the other hand, if \(X_s\) is tested positive at stage \(l_s\), then all groups containing \(X_s\) have tested positive up to and including that stage, that is, \(G^{(j)}(X_s)=1,j\le l_{s}.\) Therefore, the predictive values defined above are determined by the sequence of tests a specimen has gone through before finally being declared positive or negative. For \(s=1,\ldots ,n\), the false-negative, true-negative, false-positive, and true-positive predictive values using \({\mathcal {A}}_{O}\) can be derived as, respectively,

$$\begin{aligned} \begin{array}{ccl} \xi _{1,{\mathcal {A}}_{O}}(X_{s}) &{}=&{}\mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{(j)}(X_s)=1,j< l_{s}, G^{(l_s)}(X_s)=0\right) , \\ \xi _{2,{\mathcal {A}}_{O}}(X_{s}) &{}=&{}\mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=0|G^{(j)}(X_s)=1,j< l_{s}, G^{(l_s)}(X_s)=0\right) ,\\ \eta _{1,{\mathcal {A}}_{O}}(X_{s}) &{}=&{}\mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=0|G^{(j)}(X_s)=1,j\le l_{s}\right) ,\\ \eta _{2,{\mathcal {A}}_{O}}(X_{s}) &{}=&{}\mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{(j)}(X_s)=1,j\le l_{s}\right) . \end{array} \end{aligned}$$

If a specimen \(X_s\) is tested negative at the second stage, that is, \(l_s=2\), then the false- and true-negative predictive values are expressed as \(\mathrm{Pr}\big (\widetilde{{\mathcal {I}}}_{s}=1|G^{(1)}(X_s)=1, G^{(2)}(X_s)=0 \big )\) and \(\mathrm{Pr}\big (\widetilde{{\mathcal {I}}}_{s}=0|G^{(1)}(X_s)=1,G^{(2)}(X_s)=0\big )\), respectively. It is clear that \(\xi _{1,{\mathcal {A}}_{O}}(X_{s})+\xi _{2,{\mathcal {A}}_{O}}(X_{s})=1\). The expressions for \(\eta _{1,{\mathcal {A}}_{O}}(X_{s})\) and \(\eta _{2,{\mathcal {A}}_{O}}(X_{s})\) can be similarly derived. It is worth mentioning that, since a specimen can be declared positive only after the first stage, the false- and true-positive predictive values are only defined for stages \(l_s>1\). After some algebraic manipulations given in Appendix A, we have

$$\begin{aligned} \begin{array}{ccl} \xi _{1,{\mathcal {A}}_{O}}\left( X_s\right) &{}=&{}\left( 1-S_e\right) S_e^{l_s-1}p_{s}/\phi \left( l_s,s\right) , \\ \xi _{2,{\mathcal {A}}_{O}}\left( X_{s}\right) &{}=&{} 1-\left( 1-S_e\right) S_e^{l_s-1}p_{s}/\phi \left( l_s,s\right) ,\\ \eta _{1,{\mathcal {A}}_{O}}\left( X_{s}\right) &{}=&{} 1-S^{l_s}_ep_{s}/\psi \left( l_s,s\right) ,\\ \eta _{2,{\mathcal {A}}_{O}}\left( X_{s}\right) &{}=&{} S^{l_s}_ep_{s}/\psi \left( l_s,s\right) , \end{array} \end{aligned}$$

where

$$\begin{aligned} \begin{array}{lll} \phi (l_s,s)&{}=&{}S_e^{l_s-1} (1-S_e) \left( 1-\varphi \left( G^{(l_s)}(X_s)\right) \right) +S_p(1-S_p)^{l_s-1}\varphi \left( G^{(1)}(X_s)\right) \\ &{}&{}+\sum \limits _{\tau =1}^{l_s-1}S_p S^{\tau }_e(1-S_p)^{l_s-1-\tau }\left( 1-\varphi \left( G^{(\tau )}(X_s)\backslash G^{(\tau +1)}(X_s)\right) \right) \varphi \left( G^{(\tau +1)}(X_s)\right) ,\\ \psi (l_s,s)&{}=&{}S_e^{l_s}\left( 1-\varphi \left( G^{(l_s)}(X_s)\right) \right) +(1-S_p)^{l_s}\varphi \left( G^{(1)}(X_s)\right) \\ &{}&{}+\sum \limits _{\tau =1}^{l_s-1} S_e^{\tau }(1-S_p)^{l_s-\tau } \left( 1-\varphi \left( G^{(\tau )}(X_s)\backslash G^{(\tau +1)}(X_s)\right) \right) \varphi \left( G^{(\tau +1)}(X_s)\right) ,\\ \end{array} \end{aligned}$$

with \(\varphi (A)=\prod _{\{j:X_j\in A\}}(1-p_j)\).
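As a rough numerical companion to these expressions, the following Python sketch evaluates \(\phi (l_s,s)\) and \(\psi (l_s,s)\) for a homogeneous population (\(p_s=p\)). It assumes, beyond what is stated above, that the groups containing \(X_s\) are nested with known sizes \(k_1>\cdots >k_{l_s}\), so that \(\varphi (G^{(j)}(X_s))=(1-p)^{k_j}\). The function names and the `sizes` argument are illustrative only.

```python
from typing import Sequence, Tuple


def phi_psi(se: float, sp: float, p: float, sizes: Sequence[int]) -> Tuple[float, float]:
    """Return (phi, psi) for a specimen declared at stage l = len(sizes).

    sizes[j-1] is the size k_j of the stage-j group containing the specimen.
    Homogeneous prevalence p and nested groups are assumptions of this sketch.
    """
    l = len(sizes)
    clean = [(1.0 - p) ** k for k in sizes]  # varphi(G^(j)), j = 1..l
    phi = se ** (l - 1) * (1 - se) * (1 - clean[-1]) + sp * (1 - sp) ** (l - 1) * clean[0]
    psi = se ** l * (1 - clean[-1]) + (1 - sp) ** l * clean[0]
    for tau in range(1, l):
        # varphi(G^(tau) \ G^(tau+1)): the specimens dropped between stages tau and tau+1
        diff_clean = (1.0 - p) ** (sizes[tau - 1] - sizes[tau])
        shared = (1 - diff_clean) * clean[tau]  # clean[tau] is varphi(G^(tau+1))
        phi += sp * se ** tau * (1 - sp) ** (l - 1 - tau) * shared
        psi += se ** tau * (1 - sp) ** (l - tau) * shared
    return phi, psi


def false_negative_pv(se: float, sp: float, p: float, sizes: Sequence[int]) -> float:
    """xi_1 = (1 - Se) * Se^(l-1) * p / phi."""
    phi, _ = phi_psi(se, sp, p, sizes)
    return (1 - se) * se ** (len(sizes) - 1) * p / phi


def true_positive_pv(se: float, sp: float, p: float, sizes: Sequence[int]) -> float:
    """eta_2 = Se^l * p / psi."""
    _, psi = phi_psi(se, sp, p, sizes)
    return se ** len(sizes) * p / psi


# Example: a specimen declared negative at stage 3 with nested group sizes 80, 40, 20,
# and a specimen declared positive by an individual test at stage 3.
print(false_negative_pv(0.85, 0.85, 0.005, [80, 40, 20]))
print(true_positive_pv(0.85, 0.85, 0.005, [80, 40, 1]))
```

With `sizes=[1]` the two functions reduce to the usual single-test expressions, which gives a quick sanity check of the implementation.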

When \(L=1\), the procedure \({\mathcal {A}}_{O}\) reduces to the individual testing procedure, denoted by \({\mathcal {A}}_I.\) The four predictive values for individual testing, for \(s\in \{1,2,\ldots ,n\}\), are

$$\begin{aligned} \begin{array}{ccl} \xi _{1,{\mathcal {A}}_I}\left( X_{s}\right) &{}=&{} \mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|{\mathcal {I}}^{\left( l_s\right) }_{s}=0\right) =\frac{\left( 1-S_e\right) p_{s}}{S_p\left( 1-p_{s}\right) +\left( 1-S_e\right) p_{s}}, \\ \xi _{2,{\mathcal {A}}_I}\left( X_{s}\right) &{}=&{} \mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=0|{\mathcal {I}}^{\left( l_s\right) }_{s}=0\right) =\frac{S_p\left( 1-p_{s}\right) }{S_p\left( 1-p_{s}\right) +\left( 1-S_e\right) p_{s}},\\ \eta _{1,{\mathcal {A}}_I}\left( X_{s}\right) &{}=&{} \mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=0|{\mathcal {I}}^{\left( l_s\right) }_{s}=1\right) =\frac{\left( 1-S_p\right) \left( 1-p_{s}\right) }{S_ep_{s}+\left( 1-S_p\right) \left( 1-p_{s}\right) },\\ \eta _{2,{\mathcal {A}}_I}\left( X_{s}\right) &{}=&{} \mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|{\mathcal {I}}^{\left( l_s\right) }_{s}=1\right) =\frac{S_ep_{s}}{S_ep_{s}+\left( 1-S_p\right) \left( 1-p_{s}\right) }. \end{array} \end{aligned}$$
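These four expressions follow directly from Bayes’ rule and are easy to evaluate. The sketch below is a minimal Python illustration; the function name and the dictionary keys are hypothetical labels chosen here.

```python
from typing import Dict


def individual_predictive_values(se: float, sp: float, p: float) -> Dict[str, float]:
    """Predictive values of a single individual test with prior disease probability p."""
    prob_neg = sp * (1 - p) + (1 - se) * p   # Pr(test result is negative)
    prob_pos = se * p + (1 - sp) * (1 - p)   # Pr(test result is positive)
    return {
        "false_negative": (1 - se) * p / prob_neg,        # xi_1
        "true_negative": sp * (1 - p) / prob_neg,         # xi_2
        "false_positive": (1 - sp) * (1 - p) / prob_pos,  # eta_1
        "true_positive": se * p / prob_pos,               # eta_2
    }


values = individual_predictive_values(se=0.85, sp=0.85, p=0.005)
for name, value in values.items():
    print(f"{name}: {value:.6f}")
```

The false- and true-negative values sum to one, as do the false- and true-positive values, mirroring \(\xi _{1}+\xi _{2}=1\) and \(\eta _{1}+\eta _{2}=1\).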

2.3 Predictive Value Comparison Between \({\mathcal {A}}_O\) and \({\mathcal {A}}_I\)

The predictive value is important in real applications. A good group testing strategy should have lower false-negative and lower false-positive predictive values than the individual testing procedure. Intuitively, \({\mathcal {A}}_{O}\) should yield a better false-positive predictive value than \({\mathcal {A}}_I\), since a positive specimen is tested more than once on average when \(L>1\). However, its false-negative predictive value is expected to be higher, because the groups containing the specimen have tested positive at the previous stages. Theorem 2.1 below, whose proof is given in Appendix B, verifies these results.

Theorem 2.1

Suppose a specimen \(X_{s}\) is tested using \({\mathcal {A}}_{O}\) and \({\mathcal {A}}_I\) respectively, \(s=1,\ldots ,n\), then

$$\begin{aligned} \begin{aligned}&\xi _{1,{\mathcal {A}}_{O}}\left( X_{s}\right)>\xi _{1,{\mathcal {A}}_I}\left( X_{s}\right) ,~ \xi _{2,{\mathcal {A}}_{O}}\left( X_{s}\right)<\xi _{2,{\mathcal {A}}_I}\left( X_{s}\right) ,\\&\eta _{1,{\mathcal {A}}_{O}}\left( X_{s}\right) <\eta _{1,{\mathcal {A}}_I}\left( X_{s}\right) ,~ \eta _{2,{\mathcal {A}}_{O}}\left( X_{s}\right) >\eta _{2,{\mathcal {A}}_I}\left( X_{s}\right) . \end{aligned} \end{aligned}$$

\(\xi _{1,{\mathcal {A}}_{O}}(X_{s})\) and \(\xi _{2,{\mathcal {A}}_{O}}(X_{s})\) depend on the stage at which a specimen is declared. Intuitively, the false-negative predictive values of specimens tested negative at different stages should differ. Theorem 2.2 below, whose proof is given in Appendix C, shows that, under a certain condition, the false-negative predictive value of a specimen increases with the stage at which it is declared negative. Let \(\xi _{1,{\mathcal {A}}_{O}}(X_{s}|l_s=l)\) represent the false-negative predictive value of a specimen \(X_s\) declared negative in the lth stage, \(l\le L\). To simplify the notation, denote \(V_{s,z}=\mathrm{Pr}\Big ({\widetilde{W}}_{G^{(z)}(X_s)\backslash G^{(z+1)}(X_s)}=0\Big )=\prod \limits _{\{j:X_j\in G^{(z)}(X_s)\backslash G^{(z+1)}(X_s)\}}(1-p_j),\) where \({\widetilde{W}}_{A\backslash B}\) represents the true disease status of the difference between two sets A and B containing specimen \(X_s\) tested negative in the zth stage, \(z\le L\).

Theorem 2.2

Suppose a specimen \(X_s\) is tested negative in the lth stage using \({\mathcal {A}}_{O}\). Then the following relationships are established:

(1) \(\xi _{1,{\mathcal {A}}_{O}}(X_{s}|l_s=1)<\xi _{1,{\mathcal {A}}_{O}}(X_{s}|l_s=2)\) for \(l=1\),

(2) \(\xi _{1,{\mathcal {A}}_{O}}(X_{s}|l_s=l)<\xi _{1,{\mathcal {A}}_{O}}(X_{s}|l_s=l+1)\) if \(S_pV_{s,l}(1-V_{s,l-1})>S_e(1-V_{s,l})\) for \(l>1\).

Group testing is usually applied to rare diseases; therefore, the condition in result (2) of Theorem 2.2 is frequently satisfied. Taking a homogeneous population as an example, this condition reduces to \(S_p/S_e V_{s,l+1}(1+V_{s,l+1}+\ldots +V_{s,l+1}^{\lfloor a_{l}\rfloor -1})>1\) and \(a_{l}=(k_{l-1}-k_{l})/(k_{l}-k_{l+1})\ge 2\), where \(V_{s,l+1}=(1-p)^{k_l-k_{l+1}}\), \(k_{l}\) is the group size in the lth stage, and \(\lfloor a \rfloor \) is the largest integer less than or equal to a. This condition is easy to satisfy, for example, if \(S_p\ge 0.8\) and \((1-p)^{k_2}\ge 0.75\).

2.4 Nested Group Testing Procedure

A good group testing procedure is expected to have low false-negative and false-positive predictive values and high true-negative and true-positive predictive values. As shown in Theorem 2.1, using \({\mathcal {A}}_{O}\) yields higher false-negative predictive values than \({\mathcal {A}}_I\). To improve on this, we propose a nested group testing procedure based on \({\mathcal {A}}_{O}\) in which the negative groups are retested at each stage. In doing so, all specimens in the groups tested negative at stage \(l_s\) are randomly split into a certain number of groups for retesting using Dorfman’s procedure (\({\mathcal {A}}_D\)). We call this the nested group testing procedure, denoted by \({\mathcal {A}}_N\). See Fig. 1b for an illustration.

Denote the false-negative predictive value of the specimen \(X_{s}\) from \({\mathcal {A}}_N\) by \(\xi _{1,{\mathcal {A}}_{N}}(X_{s};k)\), which is expressed as \(\xi _{1,{\mathcal {A}}_{N}}(X_{s};k)=\mathrm{Pr}\big (\widetilde{{\mathcal {I}}}_{s}=1|G^{(l_{{s}})}(X_s)=0,G^{(j)}(X_s)=1,j<l_s,B_{s}(k)=0\big ),s\in \{1,\ldots ,n\},\) where \(B_s(k)\) denotes the retesting group containing \(X_{s}\), and k is the retesting group size. Theorem 2.3 below, whose proof is given in Appendix D, shows that using \({\mathcal {A}}_N\) yields a lower false-negative predictive value than \({\mathcal {A}}_I\).

Theorem 2.3

If \((1-S_e)/S_p\le \mathrm{Pr}\big ({\widetilde{W}}_{G^{(l_s-1)}(X_s)\backslash G^{(l_s)}(X_s)}=1\big )\), then there exists a group size \(k^{(l_s)}_{s}\) such that, for any \(k_1 \le k^{(l_s)}_{s}\) and \(k_2 > k^{(l_s)}_{s}\) the false-negative predictive value of a specimen \(X_s\) using nested group testing procedure \({\mathcal {A}}_{N}\) satisfies \(\xi _{1,{\mathcal {A}}_{N}}(X_{s};k_1)\le \xi _{1,{\mathcal {A}}_I}(X_{s}),\) and \( \xi _{1,{\mathcal {A}}_{N}}(X_{s};k_2)> \xi _{1,{\mathcal {A}}_I}(X_{s}),s\in \{1,\ldots ,n\}.\)

With the proposed procedure, if a group size larger than \(k_{s}^{(l_s)}\) is used, the false-negative predictive value of \(X_s\) would be larger than that from the individual testing procedure; otherwise, it would be smaller. For a homogeneous population where all specimens have the same probability of being positive (that is, \(p_s=p\) for all \(s\in \{1,\ldots ,n\}\)), the condition in Theorem 2.3 reduces to \((1-S_e)/S_p\le 1-(1-p)^{k_{l_s-1}-k_{l_s}}.\) Since the upper bound \(k_{s}^{(l_s)}\) on the retesting group size may differ between specimens, we split the specimens in the negative groups according to the order of their upper bounds \(k_{s}^{(l_s)}\) in the nested group testing procedure. To be specific, in each stage, we calculate \(k_{s}^{(l_s)}\) for all specimens in the negative groups and sort them in ascending order; the specimens are subsequently split based on the ordered upper bounds.
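For a homogeneous population, the condition \((1-S_e)/S_p\le 1-(1-p)^{k_{l_s-1}-k_{l_s}}\) can be checked in one line. The Python sketch below does only that; it does not compute the bound \(k_{s}^{(l_s)}\) itself, whose expression is derived in the Appendix. The function name and argument names are illustrative assumptions.

```python
def retest_condition_holds(se: float, sp: float, p: float, k_prev: int, k_cur: int) -> bool:
    """Check (1 - Se)/Sp <= 1 - (1 - p)^(k_prev - k_cur) for a homogeneous population.

    k_prev and k_cur stand for the group sizes at stages l_s - 1 and l_s.
    """
    return (1 - se) / sp <= 1 - (1 - p) ** (k_prev - k_cur)


# Hypothetical setting: group size halved from 80 to 40, prevalence 0.005.
print(retest_condition_holds(se=0.85, sp=0.85, p=0.005, k_prev=80, k_cur=40))  # True
print(retest_condition_holds(se=0.80, sp=0.80, p=0.005, k_prev=80, k_cur=40))  # False
```

In this setting the right-hand side is about 0.18, so the condition holds when \((1-S_e)/S_p\) is about 0.176 (\(S_e=S_p=0.85\)) but fails when it equals 0.25 (\(S_e=S_p=0.8\)).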

For convenience, we could directly use the group size \(k_{s}^{(l_s)}\) for retesting. In this way, the false-negative predictive values from our nested group testing procedure \({\mathcal {A}}_{N}\) are reduced and are more stable across stages. As shown in Theorem 2.2, the false-negative predictive value of \({\mathcal {A}}_{O}\) is far larger than that of \({\mathcal {A}}_{I}\), especially when the stage l is high. The retest group size \(k_{s}^{(l_s)}\) is obtained by targeting the false-negative predictive value of individual testing. Therefore, the false-negative predictive value of \({\mathcal {A}}_{N}\) is close to that of \({\mathcal {A}}_{I}\), while the efficiency is maintained. In other words, our procedure \({\mathcal {A}}_{N}\) is more stable than \({\mathcal {A}}_{O}.\) To measure this stability, let \({\bar{\xi }}_{1,{\mathcal {A}}_N}=\frac{1}{n}\sum _{s=1}^n\xi _{1,{\mathcal {A}}_N}(X_{s})\), and define

$$\begin{aligned} \varDelta _{{\mathcal {A}}_N}= \frac{1}{n}\sum _{s=1}^n\left( \xi _{1,{\mathcal {A}}_N}(X_{s})-{\bar{\xi }}_{1,{\mathcal {A}}_N}\right) ^2. \end{aligned}$$

For individual testing, we have \(\varDelta _{{\mathcal {A}}_I}=0\) for a homogeneous population. The measurement \(\varDelta _{{\mathcal {A}}_N}\) captures how much the false-negative predictive values differ among specimens. We call it the false-negative (FN)-alike measurement. In the next section, we show in detail that the nested group testing procedure has a smaller FN-alike measurement \(\varDelta _{{\mathcal {A}}_N}\) than the original group testing procedure \({\mathcal {A}}_O.\) In other words, the nested group testing procedure has a more stable false-negative predictive value across stages.
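The FN-alike measurement is the population variance (divisor n) of the specimen-level false-negative predictive values. A minimal Python sketch, with an illustrative function name and made-up input values, is:

```python
from typing import Sequence


def fn_alike(xi: Sequence[float]) -> float:
    """FN-alike measurement: mean squared deviation of the xi_1 values from their mean."""
    n = len(xi)
    mean = sum(xi) / n
    return sum((x - mean) ** 2 for x in xi) / n


# Hypothetical false-negative predictive values for four specimens.
print(fn_alike([0.0012, 0.0012, 0.0020, 0.0020]))
# Identical values, as for individual testing in a homogeneous population, give zero.
print(fn_alike([0.0009, 0.0009, 0.0009]))
```

A smaller value indicates that specimens declared negative at different stages have more similar false-negative predictive values.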

3 Applications to the Existing Group Testing Procedures

Considerable attention has been given to Dorfman’s [6] group testing strategy since its appearance, resulting in various extensions and improvements. Here we focus on hierarchical group testing algorithms including two-stage Dorfman’s procedure, three-stage Halving procedure and one-step Sterrett procedure [13, 24].

3.1 Dorfman’s and Halving Procedure

To investigate the performance of the proposed method, we apply it to two common group testing procedures including Dorfman’s procedure (\({\mathcal {A}}_{D}\)) with \(L=2\) (see Fig. 2a) and Halving procedure (\({\mathcal {A}}_{H}\)) with \(L=3\) (see Fig. 2c) [1, 3, 5]. We also construct versions of nested group testing procedures for these two procedures (see Fig. 2b, d).

Fig. 2

An illustrative diagram of Dorfman’s group testing procedures (subfigure a) and nested Dorfman’s procedure (subfigure b), Halving procedure (subfigure c) and nested halving procedure (subfigure d)

A specimen can be tested negative in stage 1 or 2 for Dorfman’s procedure (Fig. 2a), and in stage 1, 2, or 3 for the three-stage halving procedure (Fig. 2c). The false-negative predictive values for these two procedures are given as follows.

Dorfman’s procedure:

$$\begin{aligned} \begin{array}{ccl} \xi _{1,{\mathcal {A}}_{D}}\left( X_{s}|l_s=1\right) &{}=&{}\mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_s\right) =0\right) = \frac{\left( 1-S_e\right) p_{s}}{1-S_e+r\varphi \left( G^{\left( 1\right) }\left( X_s\right) \right) }, \\ \xi _{1,{\mathcal {A}}_{D}}\left( X_{s}|l_s=2\right) &{}=&{}\mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_s\right) =1,G^{\left( 2\right) }\left( X_s\right) =0\right) \\ &{}=&{}\frac{\left( 1-S_e\right) S_ep_{s}}{S_e\left( 1-S_e\right) +rS_e\left( 1-p_{s}\right) -rS_p\varphi \left( G^{\left( 1\right) }\left( X_s\right) \right) }.\\ \end{array} \end{aligned}$$

Halving procedure:

$$\begin{aligned} \begin{array}{ccl} \xi _{1,{\mathcal {A}}_{H}}\left( X_{s}|l_s=1\right) &{}=&{}\mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_s\right) =0\right) = \frac{\left( 1-S_e\right) p_{s}}{1-S_e+r\varphi \left( G^{\left( 1\right) }\left( X_s\right) \right) },\\ \xi _{1,{\mathcal {A}}_{H}}\left( X_{s}|l_s=2\right) &{}=&{}\mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_s\right) =1,G^{\left( 2\right) }\left( X_s\right) =0\right) \\ &{}=&{}\frac{\left( 1-S_e\right) S_ep_{s}}{S_e\left( 1-S_e\right) +rS_e\varphi \left( G^{\left( 2\right) }\left( X_s\right) \right) -rS_p\varphi \left( G^{\left( 1\right) }\left( X_s\right) \right) },\\ \xi _{1,{\mathcal {A}}_{H}}\left( X_{s}|l_s=3\right) &{}=&{} \mathrm{Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_s\right) =1,G^{\left( 2\right) }\left( X_s\right) =1,G^{\left( 3\right) }\left( X_s\right) =0\right) \\ &{}=&{}\frac{\left( 1-S_e\right) S_e^2p_{s}}{S_e^2\left( 1-S_e\right) +rS_e^2\varphi \left( G^{\left( 3\right) }\left( X_s\right) \right) -rS_eS_p\varphi \left( G^{\left( 2\right) }\left( X_s\right) \right) -rS_p\left( 1-S_p\right) \varphi \left( G^{\left( 1\right) }\left( X_s\right) \right) }, \end{array} \end{aligned}$$

where \(\varphi (A)=\prod _{\{j:X_j\in A\}}(1-p_j)\) and \(r=S_e+S_p-1\). The detailed derivations are given in Appendix A.

Set the prevalence as \(p=0.005\), 0.01 and 0.03 and the initial group size as \(k_1=80\), 40 and 20. The sensitivity and specificity \((S_e,S_p)\) are set to be (0.8,0.8), (0.85,0.85), (0.9,0.9), (0.95,0.95), and (0.99,0.99). Table 1 presents the false-negative predictive values for Dorfman’s procedure and the Halving procedure, respectively, and their corresponding nested procedures. In addition, the results for \({\mathcal {A}}_I\) are also reported. The notation is as follows. Denote the lth stage of Dorfman’s procedure and the corresponding nested group testing procedure by \(D_{l}\) and \(nD_{l}\), and the lth stage of the three-stage halving procedure and the corresponding nested group testing procedure by \(H_{l}\) and \(nH_{l}\), respectively. Since \(D_1\) and \(H_1\) are the same, we omit the results of \(H_1\). Denote the retest group size in the lth stage by \(k^{(l)}\). If a maximum tolerance group size is imposed, say, \(k_{\max }=100,\) then the group size for the implemented Dorfman’s procedure in our nested group testing procedure is \(k_*^{(l)}=\min \{k^{(l)},k_{\max }\}\).

Table 1 The false-negative predictive values of \(D_1\) and \(D_2\) (the first and second stage of the Dorfman’s procedure), \(nD_1\) and \(nD_2\) (the nested Dorfman), \(H_2\) and \(H_3\) (the second and third stage of Halving procedure), and \(nH_2\) and \(nH_3\) (the nested Halving)

As expected, the false-negative predictive values of \({\mathcal {A}}_I\) are lower than those of \(D_1\) and \(D_2\), and \(H_2\) and \(H_3\). For example, when \(p=0.005\), \(k_1=80\) and \(S_e=S_p=0.85\), the false-negative predictive values of \(D_1\), \(D_2\), \(H_2\), and \(H_3\) are \(1.212\times 10^{-3}\), \(1.985\times 10^{-3}\), \(2.952\times 10^{-3}\) and \(4.329\times 10^{-3}\), respectively, all of which are larger than the value \(0.886\times 10^{-3}\) of \({\mathcal {A}}_I\). In contrast, using the nested group testing procedure with group sizes \(k_*^{(l)}=100, 40, 35\) and 7, these values become \(0.248\times 10^{-3}\), \(0.522\times 10^{-3}\), \(0.565\times 10^{-3}\), and \(0.78\times 10^{-3}\), respectively. The false-negative predictive values are thus greatly reduced.
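The values quoted for \(D_1\), \(D_2\), \(H_2\) and \({\mathcal {A}}_I\) can be recomputed from the closed-form expressions above. The Python sketch below assumes a homogeneous population, a second-stage group of size 1 for Dorfman’s procedure and of size \(k_1/2\) for the Halving procedure; the function names are illustrative. \(H_3\) and the nested procedures are not covered, since they depend on group sizes and derivations not restated here.

```python
def varphi(p: float, k: int) -> float:
    """Probability that a group of k specimens is entirely disease-free."""
    return (1 - p) ** k


def fn_stage1(se: float, sp: float, p: float, k1: int) -> float:
    """False-negative predictive value when the stage-1 group tests negative."""
    r = se + sp - 1
    return (1 - se) * p / (1 - se + r * varphi(p, k1))


def fn_stage2(se: float, sp: float, p: float, k1: int, k2: int) -> float:
    """False-negative predictive value when stage 1 is positive and stage 2 is negative."""
    r = se + sp - 1
    denom = se * (1 - se) + r * se * varphi(p, k2) - r * sp * varphi(p, k1)
    return (1 - se) * se * p / denom


def fn_individual(se: float, sp: float, p: float) -> float:
    """False-negative predictive value of a single individual test."""
    return (1 - se) * p / (sp * (1 - p) + (1 - se) * p)


se = sp = 0.85
p, k1 = 0.005, 80
computed = {
    "D1": fn_stage1(se, sp, p, k1),
    "D2": fn_stage2(se, sp, p, k1, 1),        # Dorfman: individual retest
    "H2": fn_stage2(se, sp, p, k1, k1 // 2),  # Halving: half-size subgroup
    "A_I": fn_individual(se, sp, p),
}
reported = {"D1": 1.212, "D2": 1.985, "H2": 2.952, "A_I": 0.886}  # in units of 1e-3
for name, value in computed.items():
    print(f"{name}: computed {value * 1e3:.3f}e-3, reported {reported[name]:.3f}e-3")
```

Under these assumptions the computed values agree with the reported ones to three decimal places.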

Subsequently, we explore the probability that a specimen is tested negative at stage \(l_s\) with \(l_s>1\), which is defined as \(\phi (l_s,s)=\mathrm{Pr}(G^{(j)}(X_s)=1,j<l_s,G^{(l_s)}(X_s)=0)\). Table 2 displays the results for selected values of prevalence, group size and testing error rates. From this table, we see that this probability is not negligible. For example, when \(p=0.005\), \(k_1=80\), \(S_e=0.99\), and \(S_p=0.99\), the probability that a specimen is reported negative in the third stage of the Halving procedure is 0.173.

Table 2 The probability of a specimen tested negative at lth stage for Dorfman’s procedure (\(l=1,2\)) and Halving procedure (\(l=2,3\)), and the FN-alike measurements \(\varDelta _{{\mathcal {A}}_D}\), \(\varDelta _{{\mathcal {A}}_{nD}}\)

In Table 2, we also report the FN-alike measurement \(\varDelta _{{\mathcal {A}}_D}\) for Dorfman’s algorithm and \(\varDelta _{{\mathcal {A}}_{nD}}\) for the nested Dorfman’s algorithm, along with their ratio \(r_D =\varDelta _{{\mathcal {A}}_{nD}}/\varDelta _{{\mathcal {A}}_{D}}.\) This measurement is computed based on Table 1. A small ratio \(r_D\) indicates that specimens have similar false-negative predictive values even though they may have gone through different testing processes. This is an appealing characteristic, since comparable false-negative predictive values are desirable for all specimens. The results are similar for the Halving algorithm.

3.2 One-Step Sterrett Procedure

The difference between Sterrett’s and Dorfman’s procedures is that, in Sterrett’s procedure, units randomly chosen from a positive group are tested one at a time until a positive one is identified; the remaining units are then pooled into a group for testing. A graphical presentation of the one-step Sterrett procedure, denoted by \({\mathcal {A}}_S\), is given in Fig. 3.

Fig. 3

An illustrative diagram of one-step Sterrett procedure (subfigure a) and nested Sterrett procedure (subfigure b)

Denote the \(l^{th}\) stage of Sterrett procedure by \(S_l\), \(l=1,2,3\). Note that in stage 2, specimens could be tested negative in two different ways. Without loss of generality, assume the first group in stage 1 is tested positive, and subsequently in stage 2 the first \(d-1\) individuals are tested negative, while the \(d^{th}\) individual is tested positive. Denote by \(S_{2^{(1)}}\) the procedure for specimens using individual testing in stage 2. For those remaining specimens in that positive group, denote by \(S_{2^{(2)}}\) the procedure for specimens using group testing in stage 2. We use \(\xi _{1,{\mathcal {A}}_S}(X_{s}|l_s=2^{(1)})\) or \(\xi _{1,{\mathcal {A}}_S}(X_{s}|l_s=2^{(2)})\) to denote the false-negative predictive value of \(X_s\) that is tested negative in stage 2 by \(S_{2^{(1)}}\) or \(S_{2^{(2)}}\), respectively. Specifically, the false-negative predictive value of \(X_s\) is denoted by

$$\begin{aligned} \begin{array}{ccl} \xi _{1,{\mathcal {A}}_S}\left( X_{s}|l_s=1\right) &{}=&{}\hbox {Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_{s}\right) =0\right) ,~~~~~~~~~~ \\ \xi _{1,{\mathcal {A}}_S}\left( X_{s}|l_s=2^{\left( 1\right) }\right) &{}=&{}\hbox {Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_{s}\right) =1,\sum \limits _{j=1}^{d-1}{\mathcal {I}}_{j}=0,{\mathcal {I}}_{d}=1\right) , \\ \xi _{1,{\mathcal {A}}_S}\left( X_{s}|l_s=2^{\left( 2\right) }\right) &{}=&{}\hbox {Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_{s}\right) =1,\sum \limits _{j=1}^{d-1}{\mathcal {I}}_{j}=0,{\mathcal {I}}_{d}=1,G^{\left( 2\right) }\left( X_s\right) =0\right) , \\ \xi _{1,{\mathcal {A}}_S}\left( X_s|l_s=3\right) &{}=&{}\hbox {Pr}\left( \widetilde{{\mathcal {I}}}_s=1|\sum \limits _{j=1}^{d-1}{\mathcal {I}}_{j}=0,{\mathcal {I}}_{d}=1,G^{\left( l\right) }\left( X_s\right) =1,l\le 2,G^{\left( 3\right) }\left( X_s\right) =0\right) . \end{array} \end{aligned}$$

Similarly, we could obtain the true-positive predictive value of \(X_s\). We summarize the results in the following theorem.

Theorem 3.1

Suppose a specimen \(X_s\) is tested using one-step Sterrett procedure \({\mathcal {A}}_S\) and individual testing \({\mathcal {A}}_I\) respectively, \(s=1,\ldots ,n,\) then

$$\begin{aligned} \begin{aligned}&\xi _{1,{\mathcal {A}}_S}\left( X_{s}\right)>\xi _{1,{\mathcal {A}}_I}\left( X_{s}\right) ,~ \xi _{2,{\mathcal {A}}_S}\left( X_{s}\right)<\xi _{2,{\mathcal {A}}_I}\left( X_{s}\right) ,\\&\eta _{1,{\mathcal {A}}_S}\left( X_{s}\right) <\eta _{1,{\mathcal {A}}_I}\left( X_{s}\right) ,~ \eta _{2,{\mathcal {A}}_S}\left( X_{s}\right) >\eta _{2,{\mathcal {A}}_I}\left( X_{s}\right) . \end{aligned} \end{aligned}$$

This result parallels Theorem 2.1; its proof is given in Appendix E. Note that the Sterrett procedure is slightly different from Dorfman’s or the Halving procedure. Theorem 3.1 shows that improvement is also necessary for the Sterrett procedure. Similarly, we propose to retest those specimens that are declared negative. For example, suppose a specimen \(X_{s}\) belongs to the set \(S_{2^{(1)}}\). Denote by \(B_{s}(k_2)\) the retesting group of size \(k_2\) that contains \(X_s\). Then the false-negative predictive value of \(X_{s}\) after retesting is defined as

$$\begin{aligned} \begin{array}{lcl} \xi _{1,{\mathcal {A}}_{N}}\left( X_{s};k_2\right) =\hbox {Pr}\left( \widetilde{{\mathcal {I}}}_{s}=1|G^{\left( 1\right) }\left( X_{s}\right) =1,\sum \limits _{j=1}^{d-1}{\mathcal {I}}_{j}=0,{\mathcal {I}}_{d}=1, B_s\left( k_2\right) =0\right) . \end{array} \end{aligned}$$

The false-negative predictive value of \(X_{s}\) belonging to \(S_1\), \(S_{2^{(2)}}\) or \(S_3\) is defined in the same way. The following theorem shows that it is an increasing function of the retesting group size \(k_r.\) The situation in stage 2 of the Sterrett procedure is complex because of the structure of this procedure. Therefore, we cannot obtain a result parallel to Theorem 3, but we can still obtain Theorem 3.2. The proof is given in Appendix F. Simulations reported in Table 3 show that the false-negative predictive value is considerably reduced by the nested Sterrett procedure.

Theorem 3.2

The false-negative predictive values of a specimen under the nested Sterrett procedure are strictly increasing with respect to the retest group size \(k_r.\)

To investigate the false-negative predictive value of the Sterrett procedure and the nested Sterrett procedure, we run simulations with prevalence \(p=0.005\), 0.01 and 0.03 and initial group sizes \(k_1=80\), 40 and 20, respectively. We set the number of groups to 200 and repeat the simulation 1000 times. Note that a positive group is tested individually until the first positive specimen appears. The position of the first positive specimen varies, so we report the average false-negative predictive value of each type of specimen in Table 3. We omit the retesting group sizes \(k^*\) since they also vary.
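
The simulation described above can be sketched as follows for the original (non-nested) one-step Sterrett procedure. This is only an illustrative reading of the procedure, not the authors' code: the function names `assay` and `sterrett_fnpv`, the stage labels, the seed, and the number of groups are hypothetical. It assumes independent test errors and no dilution effect, and it skips positive groups in which no individual positive is observed, since the definitions above condition on \({\mathcal {I}}_{d}=1\).

```python
import random


def assay(truth, se, sp, rng):
    """Noisy test of a pool; `truth` lists the true statuses (0 or 1)."""
    if any(truth):
        return rng.random() < se   # true positive with probability se
    return rng.random() >= sp      # false positive with probability 1 - sp


def sterrett_fnpv(p, k, se, sp, n_groups, rng):
    """Estimate false-negative predictive values by stage (hypothetical helper)."""
    stages = ("S1", "S2_1", "S2_2", "S3")
    declared = dict.fromkeys(stages, 0)  # specimens declared negative
    missed = dict.fromkeys(stages, 0)    # of those, truly positive

    def declare_negative(stage, statuses):
        declared[stage] += len(statuses)
        missed[stage] += sum(statuses)

    for _ in range(n_groups):
        truth = [int(rng.random() < p) for _ in range(k)]
        if not assay(truth, se, sp, rng):
            declare_negative("S1", truth)          # whole group declared negative
            continue
        # Test one by one until the first observed positive (position d).
        d = next((i for i, t in enumerate(truth) if assay([t], se, sp, rng)), None)
        if d is None:
            continue  # assumption: skipped, the definitions condition on I_d = 1
        declare_negative("S2_1", truth[:d])        # tested negative before position d
        rest = truth[d + 1:]
        if not rest:
            continue
        if not assay(rest, se, sp, rng):
            declare_negative("S2_2", rest)         # remaining pool tests negative
        else:
            # Remaining pool tests positive: test each remaining specimen.
            declare_negative("S3", [t for t in rest if not assay([t], se, sp, rng)])

    return {s: missed[s] / declared[s] if declared[s] else float("nan") for s in stages}


rng = random.Random(2024)
estimates = sterrett_fnpv(p=0.005, k=80, se=0.9, sp=0.9, n_groups=5000, rng=rng)
for stage, value in estimates.items():
    print(f"{stage}: {value * 1e3:.3f} x 10^-3")
```

Tracking declared negatives and missed positives separately for each stage keeps the estimate a plain ratio, which mirrors the conditional probabilities used to define the false-negative predictive values.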

Table 3 The false-negative predictive values (the values have been multiplied by \(10^3\)) and the FN-alike measurement ratio \(\varDelta _{{\mathcal {A}}_{nS}}/\varDelta _{{\mathcal {A}}_S}\). \(S_{2^{(1)}}\) (\(nS_{2^{(1)}}\)), \(S_{2^{(2)}}\) (\(nS_{2^{(2)}}\)) and \(S_3\) (\(nS_3\)) represent the second and third stage of one-step Sterrett procedure and the corresponding nested group testing procedure, respectively

From Table 3, the false-negative predictive value from individual testing (\({\mathcal {A}}_I\)) is lower than those from the Sterrett procedure (\(S_{2^{(1)}}\), \(S_{2^{(2)}}\) and \(S_3\)). For example, when \(p=\) 0.005, \(k_1=80\) and \(S_e=S_p=\) 0.9, the false-negative predictive values from \(S_{2^{(1)}}\), \(S_{2^{(2)}}\), and \(S_3\) are \(1.460\times 10^{-3}\), \(3.847\times 10^{-3}\) and \(1.813\times 10^{-3}\), respectively, all far larger than \(0.552\times 10^{-3}\) for the individual testing procedure. After applying the nested group testing procedure, these values become \(0.282\times 10^{-3}\), \(0.758\times 10^{-3}\) and \(0.371\times 10^{-3}\), respectively. Although the false-negative predictive value of \(nS_{2^{(2)}}\) is slightly higher than \(0.552\times 10^{-3}\), all the false-negative predictive values are greatly reduced compared with the original Sterrett procedure. We also report the FN-alike measurement of the Sterrett algorithm. Not surprisingly, the nested Sterrett procedure performs more stably in terms of the false-negative predictive value. Additionally, the ratios \(\varDelta _{{\mathcal {A}}_{nS}}\)/\(\varDelta _{{\mathcal {A}}_{S}}\) are all smaller than 0.5.
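
As a rough check on the individual-testing benchmark quoted above, the false-negative predictive value of a single test follows from Bayes' rule. The sketch below assumes one test per subject with known \(S_e\) and \(S_p\); the function name `fnpv_individual` is hypothetical.

```python
def fnpv_individual(p, se, sp):
    """Pr(truly diseased | individual test negative), by Bayes' rule."""
    missed = p * (1.0 - se)        # diseased and tested negative
    cleared = (1.0 - p) * sp       # disease-free and tested negative
    return missed / (missed + cleared)


value = fnpv_individual(p=0.005, se=0.9, sp=0.9)
print(f"{value * 1e3:.3f} x 10^-3")  # prints 0.558 x 10^-3
```

The analytic value is close to the simulated \(0.552\times 10^{-3}\) reported for individual testing; the small gap is consistent with Monte Carlo variation.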

4 Further Evaluation of Nested Group Testing Procedures

4.1 Pooling Sensitivity and Specificity

In this part, we compare the pooling sensitivity and specificity of individual testing, Dorfman’s and the Halving procedures, and the nested group testing procedures. The initial group size \(k_1\) is set to 40 and the prevalence p ranges from 0.003 to 0.03. The group sizes for retesting are calculated based on Theorem 3, with a maximum tolerable group size \(k_{\max }=100\). We set the sensitivity and specificity to \(S_e=S_p=\)0.95 and the number of repetitions to \(M=1000\). We simulate the four group testing procedures and then calculate their pooling sensitivity and pooling specificity; the results are presented in Fig. 4.

Fig. 4

Pooling sensitivities and specificities of the individual testing (black point), Dorfman’s procedure (blue circle) and nested Dorfman procedure (red pentagram), Halving procedure (blue square) and nested halving procedure (red x-mark), Sterrett (blue diamond) and nested Sterrett (red star) with \(S_e=S_p=\) 0.95. The left panel is for sensitivity and right panel for specificity

From this figure, we observe that the nested Dorfman’s procedure always has larger pooling sensitivity than the individual testing procedure. For example, when the prevalence is \(p=0.01\), the pooling sensitivities of individual testing, Dorfman’s procedure and the nested group testing method are 0.9494, 0.9013 and 0.9907, respectively. We notice that the nested group testing method has slightly lower pooling specificity than Dorfman’s procedure. Nevertheless, it outperforms individual testing on all commonly used operating characteristics, including false (true)-negative predictive value, false (true)-positive predictive value, pooling sensitivity and pooling specificity. Similar conclusions can be drawn for the Halving procedure.
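
The pooling sensitivity quoted for the original Dorfman’s procedure can be cross-checked with a short calculation. The sketch below uses the standard two-stage expressions under the assumptions of independent test errors and no dilution effect; it does not cover the nested procedures, and the function name `dorfman_pooling_accuracy` is hypothetical.

```python
def dorfman_pooling_accuracy(p, k, se, sp):
    """Return (pooling sensitivity, pooling specificity) for two-stage Dorfman testing."""
    # A diseased subject is found only if both the pool test and the retest are positive.
    pool_se = se * se
    # Probability that the pool tests positive given one member is truly negative.
    others_clean = (1.0 - p) ** (k - 1)
    pool_pos = se * (1.0 - others_clean) + (1.0 - sp) * others_clean
    # A disease-free subject is misclassified only if the pool is positive
    # and the individual retest is a false positive.
    pool_sp = 1.0 - pool_pos * (1.0 - sp)
    return pool_se, pool_sp


pool_se, pool_sp = dorfman_pooling_accuracy(p=0.01, k=40, se=0.95, sp=0.95)
print(f"pooling sensitivity = {pool_se:.4f}")  # prints 0.9025
print(f"pooling specificity = {pool_sp:.4f}")
```

Under these assumptions the pooling sensitivity is \(S_e^2=0.9025\), close to the simulated 0.9013, while individual testing has sensitivity \(S_e=0.95\), close to the simulated 0.9494.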

4.2 Malaria Infection Group Testing

Zhou et al. [27] reported a study detecting malaria infection in microscopy-negative Malawian women using nested PCR (nPCR). They found that about 3.2% of subjects in the histology-negative group (433 dried blood spots) were nPCR positive. The PCR method had a median sensitivity of 96% and specificity of 99.1%. The group size was 10 for each group. We therefore set \(p=0.032\), \(S_e=0.96\) and \(S_p=0.991.\) Following [8], the maximum group size is set to \(k_{\max }=20\). With this parameter configuration, we record the process of decoding the specimens. Upon completion of the procedure, the observed status of each specimen is obtained. We then calculate the false-negative predictive values for specimens at different stages, the efficiency, pooling sensitivity, and pooling specificity.
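
For orientation, the efficiency (expected number of tests per subject) of the original two-stage Dorfman’s procedure under this parameter configuration can be computed in closed form. The sketch below assumes independent test errors and no dilution effect; it is not the simulation used for Table 4, and the function name `dorfman_efficiency` is hypothetical.

```python
def dorfman_efficiency(p, k, se, sp):
    """Expected number of tests per subject for two-stage Dorfman testing."""
    all_clean = (1.0 - p) ** k
    # The pool tests positive either correctly (at least one diseased member)
    # or as a false positive (no diseased member).
    pool_pos = se * (1.0 - all_clean) + (1.0 - sp) * all_clean
    # One pooled test per k subjects, plus k retests whenever the pool is positive.
    return 1.0 / k + pool_pos


eff = dorfman_efficiency(p=0.032, k=10, se=0.96, sp=0.991)
print(f"expected tests per subject = {eff:.3f}")
```

A value below 1 means fewer tests per subject than individual testing, which is the efficiency gain that motivates pooling.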

The results are summarized in Table 4, which shows that Dorfman’s procedure, the Halving procedure and the Sterrett procedure have slightly higher false-negative predictive values than individual testing. The proposed nested group testing method substantially reduces the false-negative predictive values, which become lower than those for the individual testing procedure. For example, the false-negative predictive value at the third stage of the nested halving procedure (\(nH_3\)) is \(0.986\times 10^{-3}\), while that of the Halving procedure (\(H_3\)) is \(10.81\times 10^{-3}\). We also consider two other settings, \(S_e=S_p=0.95\) and \(S_e=S_p=0.99\); the performance is similar. Moreover, compared with the individual testing procedure and the original group testing procedures, the proposed nested method improves the pooling sensitivity.

Table 4 The false-negative predictive values (\(\xi _{1,{\mathcal {A}}}\), the values have been multiplied by \(10^3\)), efficiency (Eff), pooling sensitivity (poolSe), and pooling specificity (poolSp) for malaria infection

5 Conclusion

Group testing is a cost-effective strategy for rare diseases. However, this efficiency gain often comes with a higher false-negative predictive value, which is undesirable for life-threatening diseases such as malaria infection during pregnancy, which may result in severe maternal anemia, prematurity and low birth weight, increasing the risk of maternal and neonatal deaths [27]. In the present paper, we investigated the predictive values of a group testing algorithm, including the false (true)-negative and false (true)-positive predictive values. We showed theoretically that, compared with individual testing, multi-stage group testing procedures have a higher false-negative predictive value and a lower false-positive predictive value. Our proposed nested group testing procedure can reduce the false-negative predictive value below that of individual testing through careful selection of group sizes. We provide formulas and demonstrate their use in detail for commonly used group testing procedures, including Dorfman’s, Halving and Sterrett algorithms.

As alternatives to hierarchical group testing procedures, non-hierarchical procedures such as array testing (or matrix pooling) and three-dimensional procedures are also used in group testing. Because the groups overlap, extending the proposed method to non-hierarchical group testing procedures is not trivial. In addition, we assumed that the sensitivity and specificity are known and do not depend on the group size; it would be more realistic to account for the dilution effect at different group sizes. Both issues are left for future work.