Background

Increased testing of those at risk of HIV infection and improved access to antiretroviral therapy (ART), particularly in the most HIV-affected regions, is a recognized global strategy to end the AIDS epidemic. However, only 15.0 million of an estimated 36.9 million (40.6 %) HIV-infected individuals have access to ART [1]. Whilst new evidence endorses treatment of all HIV-infected individuals [2, 3] and the World Health Organization (WHO) guideline supports treatment of all individuals regardless of CD4 count [4], the current practical goal, one that is likely to apply in most resource-limited countries into the foreseeable future, is to achieve ART coverage of all HIV-infected adults with low CD4 cell count (<350 cells/μL) before expanding ART scale-up to people with higher CD4 cell counts. Furthermore, for patients who present late to care, a CD4 count is required as a baseline measurement to identify the need for screening and prophylaxis for major opportunistic infections, which are often associated with low CD4 count and increased risk of mortality. For treatment monitoring, CD4 count is important to assess the CD4-related risk of nevirapine toxicity, and CD4 testing remains an important method to monitor patients who are on treatment in settings where access to viral load monitoring is still limited [5]. Thus, CD4 monitoring remains an essential and practical component of HIV care for the near future [6].

Barriers that result in substantial losses along the continuum of HIV care are known to include poor access to CD4 testing, particularly in disadvantaged and remote areas where laboratory-based CD4 testing by flow cytometry is not available [7, 8]. Point-of-care (POC) testing is an effective strategy to overcome this challenge. Findings from a number of field studies show that POC CD4 testing can have a positive impact on the HIV continuum of care [9–15]. The use of POC CD4 testing in low- and middle-income countries (LMICs), where resources for HIV care are most limited, is expected to produce the greatest clinical and economic impacts from both patient and health system perspectives [16, 17]. With a number of POC CD4 technologies available or in the pipeline [18], there is a need for consolidated evidence on the performance of different POC CD4 tests, particularly in LMICs, to inform decision-making related to selection of an appropriate test in field settings. A systematic review and a meta-analysis of test performance of CD4 count technologies showed that POC CD4 testing is suitable for ART eligibility assessment at CD4 thresholds of 350 and 500 cells/μL [19, 20]. However, these studies pooled data from both laboratory and field evaluations in low- and high-income countries. As there is evidence to suggest that performance of the test varies significantly across evaluation settings (laboratory vs clinic) in different countries [21, 22], the question remains whether these study findings are transferable, specifically to non-laboratory environments in LMICs. We therefore conducted a systematic review to assess the performance, acceptability and feasibility in non-laboratory field settings of currently available or prototype commercial POC CD4 tests in LMICs. Here, we report on the “field performance” of different POC CD4 tests; findings on “acceptability, feasibility” will be reported elsewhere.

Methods

This systematic review was conducted according to a protocol developed using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) statement [23].

Literature search strategy

The search strategy was designed to identify any studies describing POC CD4 tests. After an initial search for articles in Medline and Embase, an assessment of text words within the title and abstract and of the index terms used to describe these articles was conducted. A subsequent full search using clearly established search terms (see Additional file 1: Annex 1) was undertaken across included databases, and adapted as appropriate to the specifications for the respective databases: Medline, Embase, CENTRAL, Cinahl, PsycINFO, Biological Abstracts, and Scopus. Web of Science and conference databases were also searched to identify relevant studies. Reference lists of all identified reports and articles were searched for additional studies. Moreover, searches were conducted in grey literature resources such as conference websites (NLM Gateway, the British Library Conference and International AIDS Society and Conference on Retroviruses and Opportunistic Infections) and clinical trials websites. Hand-searching and reference checking of citations and reference lists was undertaken. Authors of relevant studies were contacted if insufficient data were published. Government reports, letters to editors, commentaries, editorials, non-peer reviewed articles and review articles were excluded. Studies conducted and funded by the manufacturer, if stated, were also excluded.

Study selection

Inclusion criteria were that studies needed to be published between January 2005 and January 2015, written in English and conducted in LMICs (http://data.worldbank.org/about/country-and-lending-groups). Studies conducted in both LMICs and non-LMICs were included if data from LMICs were presented separately. Further eligibility criteria were defined using the PICO (participants, interventions, comparisons, outcomes) format [24]. Participants (P) included HIV-positive, HIV-negative and unknown HIV status persons aged ≥ 12 months. For intervention (I), any of the following six commercially available POC CD4 testing platforms listed in the UNITAID “2014 HIV/AIDS Diagnosis Technology Landscape” report [18] were included: (1) PointCare NOW™ (PointCare Technology Inc, Marlborough, MA, USA); (2) Pima™ CD4 (Alere Inc, Waltham, MA, USA); (3) Daktari™ CD4 Counter (Daktari Diagnostics Inc, Cambridge, MA, USA); (4) CyFlow® CD4 miniPOC (Partec, Munich, Germany); (5) BD FACSPresto™ (BD Biosciences, San Jose, CA, USA); and (6) MyT4™ CD4 Test (Zyomyx Inc, Fremont, CA, USA). Results of the POC CD4 test needed to be compared (C) to reference laboratory-based assays, with outcomes (O) containing diagnostic performance of the POC CD4 test in field settings. All retrieved articles were checked for duplication; conference abstracts were excluded if they duplicated full-text articles. Titles, abstracts and summaries of identified records were screened for relevance. Retained records meeting the inclusion criteria were then examined in full text (Fig. 1).

Fig. 1 Selection process of included studies

Data extraction and data analysis

An electronic data extraction form was developed, pre-tested and finalized by consensus among authors. Data extraction was conducted by one reviewer and verified by a second reviewer using the data extraction form, with 20 % duplicate extraction. Quality of included studies was assessed using the EPHPP tool for quantitative studies [25] and the QUADAS tool for diagnostic accuracy studies [26].

Key measures used to evaluate the performance of POC CD4 tests were: (1) failure rate, defined as the percentage of the total number of tests performed that gave an invalid result, such that the test did not provide a result and/or the result could not be read; (2) sensitivity and specificity of the index POC CD4 test at certain CD4 thresholds as compared to the predicate method; (3) misclassification of the index POC CD4 test at certain CD4 thresholds, defined as the percentage of total HIV-infected individuals/blood samples tested with disagreement in results between POC CD4 and the predicate method for the purpose of identifying ART eligibility at pre-specified CD4 threshold values of 200, 350 and 500 cells/μl; and (4) difference in mean (bias) of CD4 counts and limit of agreement between the index POC CD4 test and the predicate method. Bias included absolute and relative bias, defined as the mean absolute and/or relative difference in CD4 count between the index POC CD4 test and the predicate test. Limit of agreement was calculated as the mean difference ± 1.96 standard deviations of the difference.
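The categorical measures defined above can be illustrated with a short sketch. It is not the authors' analysis code: the function names, the convention that a "positive" result is a CD4 count strictly below the threshold (that is, ART-eligible), and the paired CD4 values are all assumptions made for illustration only.

```python
def two_by_two(poc_counts, ref_counts, threshold=350):
    """Cross-classify paired CD4 counts (cells/ul) at an eligibility threshold.

    Assumption: a count strictly below the threshold is 'positive' (ART-eligible).
    Returns (tp, fp, fn, tn) with the reference (predicate) method as truth.
    """
    tp = fp = fn = tn = 0
    for poc, ref in zip(poc_counts, ref_counts):
        poc_pos = poc < threshold
        ref_pos = ref < threshold
        if poc_pos and ref_pos:
            tp += 1
        elif poc_pos:
            fp += 1
        elif ref_pos:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and total misclassification as proportions."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "misclassification": (fp + fn) / total,
    }


def failure_rate(n_invalid, n_performed):
    """Proportion of all tests performed that gave no readable result."""
    return n_invalid / n_performed


# Hypothetical paired results, invented purely for illustration.
poc_example = [180, 340, 360, 520, 300, 410, 345, 600]
ref_example = [200, 360, 340, 500, 310, 400, 330, 610]

table_example = two_by_two(poc_example, ref_example, threshold=350)
measures_example = accuracy_measures(*table_example)
print(table_example, measures_example)
```

Multiplying the returned proportions by 100 gives the percentages used in the text. In this sketch a false negative corresponds to an eligible patient whom the POC test would classify as ineligible.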

A meta-analysis of diagnostic test accuracy (DTA) was conducted for the Alere Pima™ CD4 (Pima) test through application of a bivariate multi-level random effects modeling approach [27], recommended for meta-analysis in Cochrane DTA reviews. The bivariate random-effects model accounts both for the correlation between study sensitivity and specificity estimates and for unobserved between-study heterogeneity in test performance, through specification of a multi-level bivariate normal regression approach. The bivariate model estimates sensitivity and specificity by modeling random effects across two levels: level 1 representing the within-study variability between sensitivity and specificity and level 2 representing heterogeneity in diagnostic performance of the index test across studies. Using the multi-level bivariate model, sensitivity, specificity, and positive and negative likelihood ratios (±LR) of the Pima test were estimated. In order to test for any difference in diagnostic performance across venous and capillary sample methods, the multi-level bivariate random-effects model was extended to include a covariate for blood sample type. Post-estimation Wald tests were used to test the joint (i.e. for sensitivity and specificity simultaneously) effect of blood sample type on diagnostic accuracy. In all multi-level analyses of pooled Pima test diagnostic performance, Huber/White variance estimation was used to provide appropriate standard errors in instances where multiple sets of diagnostic data were taken from a single study [28]. Diagnostic statistics (Cook’s distance [using a 5-parameter model-based cutoff >2] and standardized residual plots) from multi-level bivariate random-effect models were examined to assess pooled sensitivity and specificity estimates for outlier bias.
Using Stata version 13.1 (Stata Corporation, TX, USA), both the user-written Stata program Midas [29] and author-written code using GLLAMM (Generalized Latent and Linear Mixed Modeling) [30] were used to provide key statistical output for pooled multi-level bivariate random-effect model analyses of the diagnostic performance of the Pima test. The user-written Stata program metandiplot [31] was used to plot the hierarchical summary receiver operating characteristic (HSROC) curve [32] from observed study sensitivity and specificity estimates. Statistical significance was assessed at 5 % in all analyses.

Results

Study characteristics

The initial search, after removing duplicates, yielded 2,919 records, among which 27 studies met all of the inclusion criteria and comprised 24 full-text articles and three conference abstracts [33–35].

Twenty-four studies used Pima, two used PointCare NOW and one used MyT4 as the index POC CD4 test. Overall, the quality of included studies was considered moderate to strong. The QUADAS scores ranged from 7 to 12 (of a maximum score of 14), with 63 % (12/19) of studies scoring between 10 and 12. Among 22 studies reporting performance of POC CD4 tests (Table 1), 19 (86 %) were conducted in sub-Saharan countries and the other three were in India [36], Brazil [37] and PNG [38]. CD4 predicate testing technologies used as reference included FACSCalibur, FACSCount, Panleucogating (PLG) flow cytometry, Partec CyFlow and GUAVA. Only one study performed duplicate testing [39], and three studies performed precision testing of the reference method on a subset of whole blood samples (3–15 blood specimens) [40–42]. Among 19 studies reporting diagnostic performance of Pima, only 11 provided the data required for, and were subsequently included in, the meta-analyses.

Table 1 Characteristics and Quality assessment of studies included in the review

Performance of POC CD4 tests in field settings

Failure rates

For Pima, studies with capillary blood reported a wide range of failure rates, from 2 % [43] to 23.3 % [44]. Studies using venous blood in various clinical settings reported failure rates ranging from 4.8 to 15.2 % [40, 44–46]. One study reported zero “no read” errors in a laboratory evaluation of Pima with venous blood; however, in field evaluations in different clinical settings the failure rate ranged widely, from 6.8 to 20.9 % [22]. For PointCare NOW, the failure rate varied from 2.9 % in one study [41] to 9.2 % in another [47]. With the MyT4 CD4 test, a study-wide error rate of 9.6 % was recorded [48].

Detailed performance data for Pima are presented in Table 2.

Table 2 Performance of Pima stratified by venous and capillary blood collection and presented by reference test used

Misclassification, sensitivity and specificity

When CD4 count testing was conducted using a venous blood specimen, Pima showed lower misclassification and higher probabilities of correctly identifying patients eligible for ART across studies and different reference methods. At a CD4 threshold of 350 cells/μl, the total misclassification probability of the Pima test using venous blood was 4.0–12.2 % [44, 49] versus 6.7–17 % for capillary blood [39, 50]. Pima point estimates for sensitivity and specificity ranged from 89–99 % and 77–93 % for venous blood, and 79–98 % and 80–99 % for capillary blood, respectively. For PointCare NOW, one study [47] reported site-specific sensitivity ranging from 38 to 63 %, resulting in misclassification of 50 % of patients tested as ineligible for ART; another study [41] reported a lower misclassification of 6 % of patients as ineligible for treatment. For MyT4 CD4, the sensitivity and specificity of the test were 88 and 84 % when compared to FACSCalibur and 95 and 88 % when compared to FACSCount [48].

Meta-analysis of diagnostic accuracy of the Alere Pima™ CD4 in field testing

We aimed to conduct categorical data analysis of the diagnostic accuracy of Pima at CD4 cut-offs of 350 and 500 cells/μl. However, only three included studies [40, 42, 49] reported the data required for meta-analysis at the 500 cut-off. Thus, only analysis at a CD4 threshold of 350 cells/μl was performed. Required data, including the number of true positive, false positive, false negative and true negative cases, were reported in the literature for 9 studies [38, 40, 42, 43, 45, 49–52]. Data from two studies [37, 53] were received from the authors following email contact, yielding a final dataset comprising 11 studies for the meta-analysis. Among these, two studies [40, 53] reported Pima test results for both venous and capillary blood, and these data were treated in meta-analyses as independent study results but with model standard errors corrected for the lack of independence in observations. Five studies [37, 38, 42, 45, 49] reported results with venous blood only and four studies [43, 50–52] with capillary blood only.

Examination of post-estimation diagnostic statistics after preliminary meta-analyses provided some evidence of model outlier bias, with two studies, [53] (capillary) and [38] (venous), showing model-discrepant test sensitivity and specificity (Cook’s distances: 2.12 and 3.65, respectively). However, diagnostic test data from these studies were retained in the pooled meta-analysis because a sensitivity analysis with the outlying cases excluded indicated no marked difference in pooled estimates (included versus excluded sensitivity and specificity: 92 vs. 92 % and 87 vs. 88 %, respectively).

Diagnostic accuracy of the test in field settings was relatively high, with pooled sensitivity and specificity estimated at 92 % (95 % CI = 88–95 %) and 87 % (95 % CI = 85–88 %) respectively (Fig. 2). Further, pooled positive and negative likelihood ratios were also at levels indicating relatively strong diagnostic performance of Pima (+LR = 7.0, 95 % CI = 6.1–7.9; −LR = 0.09, 95 % CI = 0.06–0.13). Figure 3 shows observed sensitivity and specificity plotted for each included study with the HSROC curve, the pooled estimate and 95 % confidence and prediction contours.
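For readers less familiar with likelihood ratios, the standard relationships are +LR = sensitivity / (1 − specificity) and −LR = (1 − sensitivity) / specificity. The sketch below applies them to the rounded pooled estimates quoted above; the function name is illustrative, and because the inputs are rounded the output only approximates the model-based values reported in the text.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (+LR, -LR) from sensitivity and specificity given as proportions."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


# Rounded pooled estimates for Pima at the 350 cells/ul threshold.
pos_lr, neg_lr = likelihood_ratios(0.92, 0.87)
print(f"+LR ~ {pos_lr:.2f}, -LR ~ {neg_lr:.3f}")
```

Both values fall within the reported 95 % confidence intervals (6.1–7.9 for +LR and 0.06–0.13 for −LR).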

Fig. 2 Point estimates of diagnostic performance of Pima in field settings at CD4 350 cells/μl cut-off

Fig. 3 HSROC curve from multi-level bivariate random effects model estimation of diagnostic performance of Pima at CD4 350 cells/μl cut-off: plots observed sensitivity and specificity, diagnostic summary point, 95 % confidence and prediction contours.

Bivariate random-effect hierarchical models estimating pooled diagnostic performance with a covariate for blood sample type showed some potential difference in summary sensitivity and specificity by blood sample type. Using venous samples, pooled sensitivity was 94 % (95 % CI = 89–97 %) and pooled specificity 86 % (95 % CI = 82–89 %), while using capillary blood pooled sensitivity was 89 % (95 % CI = 83–93 %) and specificity 87 % (95 % CI = 86–89 %). However, a post-estimation test of the joint effect of blood sample type on the sensitivity and specificity of Pima showed that these differences in diagnostic accuracy did not reach statistical significance (Wald χ2(2) = 4.77, p = 0.09).
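The reported p value can be checked from the Wald statistic alone. For a chi-square distribution with 2 degrees of freedom, the upper-tail probability has the closed form exp(−x/2), so no statistical library is needed. The sketch below is only this arithmetic check, with an illustrative function name; it does not reproduce the model fit, and the closed form holds for 2 degrees of freedom only.

```python
import math


def chi2_upper_tail_2df(statistic):
    """Upper-tail probability of a chi-square variable with exactly 2 df.

    With 2 degrees of freedom the chi-square distribution is exponential
    with mean 2, so P(X > x) = exp(-x / 2).
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-statistic / 2.0)


# Joint Wald test of blood sample type on sensitivity and specificity.
p_value = chi2_upper_tail_2df(4.77)
print(f"p = {p_value:.3f}")
```

The result rounds to 0.09, above the 5 % significance level used in the analyses.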

Bias and limit of agreement (LoA)

Overall, Pima performed better with venous than with capillary blood samples, with a smaller range of bias and tighter LoA across studies with different predicate technologies. Studies reported absolute bias of Pima at CD4 > 500 cells/μl ranging from −66.3 cells/μl (LoA: −286.6, +154.0) for venous [49] to −120.6 cells/μl (LoA: −162.8, −78.4) for capillary blood [46]. At lower CD4 ranges, bias was reported at +15 cells/μl (LoA: −89, +118) for CD4 < 200 cells/μl using capillary blood [44] and +5.1 cells/μl (LoA: −126.6, +136.8) for CD4 < 350 cells/μl using a venous sample [49]. These data suggest that Pima overestimates the CD4 count at lower CD4 ranges and underestimates it at higher ranges; the magnitude of the bias was also greater at higher CD4 counts.
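As a reminder of how such figures arise, bias and limit of agreement follow the Methods definition: the mean of the paired differences (POC minus predicate) and that mean ± 1.96 standard deviations of the differences. The sketch below is a minimal illustration under stated assumptions: the function names and paired values are invented, the sample standard deviation is used, and relative bias is taken as the mean percentage difference relative to the predicate value, which is one of several conventions and may not match every included study.

```python
from statistics import mean, stdev


def bias_and_loa(poc_counts, ref_counts):
    """Return (bias, lower LoA, upper LoA) in cells/ul for paired CD4 counts."""
    diffs = [poc - ref for poc, ref in zip(poc_counts, ref_counts)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


def relative_bias_percent(poc_counts, ref_counts):
    """Mean percentage difference relative to the predicate (assumed convention)."""
    return mean(100.0 * (poc - ref) / ref
                for poc, ref in zip(poc_counts, ref_counts))


# Hypothetical paired counts, invented purely for illustration.
poc_pairs = [180, 340, 360, 520]
ref_pairs = [200, 360, 340, 500]

bias, loa_lower, loa_upper = bias_and_loa(poc_pairs, ref_pairs)
print(bias, round(loa_lower, 1), round(loa_upper, 1))
```

A negative bias means the POC test reads lower than the predicate on average, which is the pattern reported above for CD4 counts above 500 cells/μl.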

Discussion

Findings of this review suggest that POC CD4 testing can provide reliable results for making treatment decisions among HIV patients in LMICs. This review highlights the need for published data regarding the field evaluation of available POC CD4 tests, particularly in low-resource settings where these novel technologies are already demonstrating significant impact on the continuum of care for HIV-positive persons. Among six current or prospective commercially available POC CD4 technologies, only three have published studies that meet inclusion criteria and most of these used Pima as the index test. Among 19 studies reporting Pima performance data, 11 studies provided data required for meta-analysis of diagnostic test accuracy.

Findings on Pima performance from two studies using both venous and capillary blood showed that CD4 counts on venous blood samples produced more accurate results than capillary blood, with a lower failure/error reading rate; the authors suggest that variation in test results was likely due to the quality of capillary sampling [22, 44]. However, other evidence supports the use of either venous or capillary specimens [35, 40]. Though not statistically significant, our meta-analysis shows a trend towards better performance of the test with venous blood, with a sensitivity of 0.94 for venous and 0.89 for capillary blood (p = 0.09) in identifying HIV-positive persons eligible for ART at a cut-off of 350 cells/μL. If this is a true difference in performance of the test, the use of venous blood when using Pima for ART eligibility assessment would be preferable, as it could reduce false negative test results, which represent missed opportunities for timely treatment initiation for patients.

The wide range of failure rates observed for the Pima technology across studies is another attribute that needs further attention. Apart from technical and operational characteristics of the test, evidence from field studies suggests that the performance of the test operator influences the accuracy of the diagnostic test in the field [11, 22]. Therefore, the quality of training on POC testing for test operators and their supervisors becomes critically important to ensure effectiveness and efficiency of the technology in field settings. Of note, few of the included studies mentioned the effect of staff training on the performance of the POC CD4 test and none described details of the training program.

Bias in assessing the diagnostic accuracy of any new test could arise from faulty results of the reference test itself as there is no gold standard for CD4 testing although single platform flow cytometry has assumed that position. Thus evidence of participation and successful performance in external quality assurance (EQA) programs and performing duplicate tests on a sample using the predicate test are worthy recommendations in order to ensure the highest accuracy of predicate results. In this review only half of the included studies described EQA participation for the reference test and only one study conducted duplicate testing.

In order to better inform the decision making process on selection and adoption of POC CD4 testing in LMICs, further studies on currently and newly available POC CD4 technologies in various level settings and different geographic regions are needed. It is recommended that the quality of studies as well as quality of study reporting should be improved by following established standards [26, 54] and the focus of these future studies should not only be on test diagnostic accuracy but also on implementation aspects of the test, aiming at providing practical evidence to inform effective implementation strategies of POC CD4 testing.

There are two published systematic reviews and meta-analyses of the performance of POC CD4 tests, one by Scott et al. [19] and one by Peeling et al. [20]. In comparison, our review included 22 peer-reviewed publications, providing a large increase in analysis of published work on POC CD4 technologies compared to the other reviews. Importantly, our study differed from the previously published reviews in that only studies conducted in field settings were included, and it thereby specifically assessed the field performance of Pima. We demonstrated that misclassification by Pima, particularly with the use of capillary blood samples under field conditions, can be higher than that under a laboratory environment; thus our data differ from those of the earlier study, in which the reported probability of Pima misclassification was less than 10 % [20]. A significant methodological strength of our meta-analysis is the direct estimation of the joint effect of blood sample type on Pima sensitivity and specificity simultaneously, using a bivariate multi-level random effect model with pooled study data. This is a significant improvement in statistical robustness compared to the simple comparison of 95 % CI estimates for sensitivity and specificity applied in the other meta-analysis [19]. Encouragingly, our results, in line with findings from other reviews, confirm that POC CD4 tests also perform well if assessed specifically in field settings. Pima therefore has the potential for further deployment for ART eligibility assessment and treatment monitoring, especially in areas where laboratory-based CD4 testing is not available or difficult to access.

This review has some limitations which may affect the generalization of the findings. First, we included only published, peer-reviewed journal articles in English, and this restriction may overlook data from studies published in other languages or unpublished data from evaluations/studies conducted by government agencies, reference facilities or similar institutions. The inclusion of conference abstracts has its strengths in limiting publication bias; however, confidence in these findings is limited as the quality of these studies has not been assessed via formal peer review. Second, only three of the six POC CD4 technologies considered in this review had published field study data, and one technology (Pima) featured most prominently in the included studies. This presents challenges in terms of generalizing many of the findings of the review to “all” POC CD4 tests, as it may be subject to reporting bias. Third, interpretation of findings from the meta-analyses of this review should be contextualized in terms of the limited diagnostic test data available from published studies. This limitation cannot be overcome until more data from field studies of different POC CD4 technologies, including the Pima, are available.

Conclusions

Findings of this review suggest that there remains a considerable need for field studies, conducted in LMICs where the tests are needed the most, of POC CD4 tests currently available on the market and of those eagerly anticipated. The Pima™ CD4 showed acceptable diagnostic test accuracy using either venous or capillary blood. Existing evidence indicates that POC CD4 testing can provide reliable results under field conditions and could play an important role in the HIV continuum of care. This remains true despite the changing landscape with respect to guidelines for ART initiation. Whilst evidence supports increasingly earlier commencement of treatment at an individual and community level, the financial reality is that in many parts of the world priority for ART initiation must still be given to those with evidence of declining immune function. Further evidence is needed to ensure that efficacy is acceptable with both venous and capillary blood samples in field settings.