Introduction

The Coronavirus disease 2019 (COVID-19) pandemic caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is the most significant infectious disease pandemic in the last century1,2. In addition to preventive measures such as social distancing, mask wearing, and vaccination, pillars of pandemic control rely on tools to rapidly identify cases and monitor transmission3. Molecular testing methods based on reverse transcription quantitative polymerase chain reactions (RT-qPCR) remain the backbone of many testing programs globally4. However, RT-qPCR-based testing is heavily influenced by supply chain restrictions, need for trained personnel and central laboratories, and relatively long turnaround times, particularly in resource-constrained settings5. Therefore, it is still challenging to scale up RT-qPCR tests for population surveillance and the timely detection of the large proportion of asymptomatic SARS-CoV-2-infected carriers6. Rapid detection of SARS-CoV-2 infected individuals allows for faster clinical intervention and implementation of public health measures such as isolation and contact-tracing, to prevent forward transmission7.

Rapid antigen-detecting diagnostic tests (RDTs) for COVID-19, many of which can yield actionable results in under 20 min, require little laboratory capacity and can be performed easily by non-laboratory personnel8. Furthermore, decentralized access to RT-qPCR testing remains sparse in resource-constrained communities9. The low cost, short turnaround times and ease of use of antigen-detecting RDTs make them excellent candidates for large-scale implementation in varied community settings10.

Since the pandemic’s onset, several antigen-detecting RDTs have been developed for the detection of SARS-CoV-211. Many of these RDTs received Emergency Use Authorization (EUA) from the Food and Drug Administration (FDA)12. Indeed, over the last year, rapid diagnostic tests have become more widely used for the diagnosis of COVID-19 in diverse settings outside the hospital, including at-home testing. However, the evaluations supporting EUA demonstrated accuracy in symptomatic individuals only13. This narrow indication raises concerns about the utility of antigen-detecting RDTs for detecting SARS-CoV-2 infection in asymptomatic carriers. In fact, several independent evaluations have demonstrated decreased sensitivity of antigen-detecting RDTs in asymptomatic RT-qPCR-positive individuals compared to those with symptoms14,15,16,17. In the United States, studies thus far have focused on 3 RDTs: Quidel Sofia8,18, BD Veritor13,19 and Abbott BinaxNOW15,16,20. On March 31st, 2021, the FDA also authorized these tests for home use, raising concerns about misinterpretation of false negative results21. Therefore, evidence establishing their performance characteristics to guide implementation in real-world settings is now even more urgent.

In this study, we evaluated the Access Bio CareStart COVID-19 RDT (CareStart), a chromatographic antigen-detecting lateral flow immunoassay that received EUA from the FDA on October 8th, 202012,17. We evaluated CareStart in asymptomatic and mildly symptomatic individuals presenting for routine testing at one of the ‘Stop the Spread’ free community testing sites in Holyoke, Massachusetts22. Public health messaging for testing at these community testing sites targeted asymptomatic individuals. We evaluated the sensitivity and specificity of the test, and its positive (PPV) and negative predictive values (NPV) as a function of different prevalence scenarios.

Methods

Study population and ethical approval

This was a prospective evaluation using convenience sampling of asymptomatic and mildly symptomatic individuals presenting for routine testing for COVID-19. The study was performed between January 6 and February 26, 2021, at the Holyoke “Stop the Spread” walk-up testing site, part of a free Massachusetts public testing program that targets asymptomatic individuals22. The testing site was open three days a week. Individuals who presented to the site during testing hours were approached by our research staff, who explained the nature, risks and benefits of the study and answered any questions before inviting individuals to participate. Research staff obtained and documented informed verbal consent, in lieu of written consent, from participants standing in testing lines for the collection of a second anterior nasal swab. For minors below 18 years of age, verbal consent was obtained from guardians and verbal assent from the minors themselves. The participants were treated in accordance with Good Clinical Practice guidelines and the Declaration of Helsinki. The study protocol was approved by the Partners Institutional Review Board (Protocol ID: 2020P003892).

Study intake and data collection

After enrollment, study staff administered an intake questionnaire capturing participant demographics and the presence or absence of symptoms based on case definitions from the Council of State and Territorial Epidemiologists23: cough, sore throat, chills, shortness of breath, fever, muscle aches or soreness, nausea, vomiting or diarrhea, decreased sense of smell or taste, loss of appetite, general weakness or fatigue, or headaches. The survey also captured prior COVID-19 testing and potential exposures. Each test was assigned a unique anonymous ID. Collected data were entered into a secure Research Electronic Data Capture (REDCap) database on encrypted tablets. We used the demographic information and specimen numbers to match the RDT results with the RT-qPCR data collected at the Broad Institute Clinical Research Sequencing Platform (CRSP), as performed in other studies15,24.

Swab collection procedure

Samples were collected at the city testing site by personnel who had received brief training on performing the RDT but were not trained health care providers or diagnostic specialists. We used dry anterior nasal (AN) swabs: Puritan 6″ Sterile Standard Foam Swab with Polystyrene Handle (Puritan, Guilford). Both anterior nares were swabbed twice (5 rotations in each nostril), once for RT-qPCR testing and once for the RDT sample. For practical reasons, the swabs for RT-qPCR and RDT were not always collected in the same order. Both samples were placed inside closed test tubes. The RT-qPCR sample was transported to the Broad Institute at the Massachusetts Institute of Technology. The second anterior nasal swab sample was transported to a nearby testing station, and the RDT was performed within an hour of sample collection. The RT-qPCR testing results were interpreted according to the publicly available rubric for the Broad Institute COVID-19 testing program: https://sites.broadinstitute.org/safe-for-school/result-code-information. Briefly, the assay is a multiplexed RT-qPCR assay that runs up to 40 cycles. The assay targets the N1 and N2 regions of the nucleocapsid (N) gene, using the CDC primers, with the RNase P (RP) gene as an internal control for test validity25. Any cycle threshold (Ct) value for either N1 or N2 below 40 is considered a positive result, which is how we defined SARS-CoV-2-positive individuals to benchmark the performance of the CareStart RDT.

Rapid test procedure

The CareStart device came with instructions for use and diagrams. The study staff received a one-hour training prior to the study and practiced the RDT on positive and negative control samples provided in the kit. One operator performed the test at a workstation following the CareStart manufacturer’s instructions for use (IFU)26, photographed the tests, read each result as positive or negative, and recorded it in the electronic data entry forms. At the request of the Department of Public Health, participants with a positive RDT were contacted by phone within twenty-four hours, informed of their result, and advised to isolate until they received their RT-qPCR result.

Reference RT-qPCR standard

The gold standard reference used was the SARS-CoV-2 RT-qPCR laboratory developed test through the Broad Institute CRSP, which is approved by the FDA under EUA. The test provides two cycle threshold (Ct) values, one for the nucleocapsid (N2) gene, and one for an internal positive control RNaseP gene. We compared the sensitivity of CareStart against both the qualitative binary RT-qPCR results and the Ct values of the N2 gene amplification reaction, as previously described17.

Statistical analyses

We calculated sensitivity, specificity, PPV and NPV of the RDT from 2 × 2 contingency tables using RT-qPCR as the gold standard reference. Sensitivity and specificity were further stratified and compared by presence of symptoms and quantitative Ct values. Median Ct values were compared using the non-parametric unpaired Mann–Whitney U test. Clopper–Pearson 95% confidence intervals (CI) were calculated for sensitivity and specificity estimates. Since RDTs have been reported to have high accuracy among symptomatic individuals8,15,16,17, we also tested whether the presence of symptoms increased the sensitivity of the CareStart RDT. Statistical analyses were conducted using R V3.6.0 (R Core Team 2020).
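The calculations described in this section can be sketched in a few lines. The following Python snippet is a minimal illustration and not the R code used in the study: it derives sensitivity, specificity, PPV and NPV from the four cells of a 2 × 2 table and computes an exact Clopper–Pearson interval by bisection on the binomial distribution. The function names and the counts in the usage lines are hypothetical and are not taken from the study tables.

```python
from math import comb


def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2 x 2 table (index test vs. reference standard)."""
    return {
        "sensitivity": tp / (tp + fn),  # test-positive among reference-positives
        "specificity": tn / (tn + fp),  # test-negative among reference-negatives
        "ppv": tp / (tp + fp),          # reference-positive among test-positives
        "npv": tn / (tn + fn),          # reference-negative among test-negatives
    }


def binom_tail_above(x, n, p):
    """P(X > x) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for the proportion x / n."""

    def solve(x_cut, target):
        # P(X > x_cut) increases with p, so bisection finds the p that hits target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_tail_above(x_cut, n, mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(x - 1, alpha / 2)  # P(X >= x) = alpha / 2
    upper = 1.0 if x == n else solve(x, 1 - alpha / 2)  # P(X <= x) = alpha / 2
    return lower, upper


# Hypothetical counts, for illustration only (not the study data).
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=945)
ci_low, ci_high = clopper_pearson(40, 50)
print(f"sensitivity = {metrics['sensitivity']:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

The exact interval is wider than a normal-approximation interval when counts are small, which is the usual reason for preferring it when the number of reference-positive samples is limited.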

Results

We performed 666 CareStart RDTs on samples from participants who provided verbal consent at the walk-up testing site. Of these, 4 tests were excluded because test vial caps were malformed and the operator was unable to load the RDT, leaving 662 tests for analysis. The 662 tests came from 588 unique participants (Tables 1, 2 and Fig. 1). Sixty participants by chance received more than one test, with a total of 75 tests performed in addition to the first test per participant (Supplementary Table 1). Among the 588 participants, 51.9% were residents of Holyoke, as identified by their residential zip codes. Just over half the participants (51.9%) identified as female. The mean age was 38.1 years, and 44.7% of participants identified as Hispanic or Latinx (Table 1). The study staff evaluated the usability of the CareStart devices. All tests showed a positive control band, indicating they were valid. The RDT procedure involved immersing a swab in a vial of extraction buffer; the swab was then removed and the vial was closed with its cap. A few drops of the specimen solution were applied to the test device. Among the valid tests, we noted variable band intensities (Fig. 2). The positive test line was sometimes so faint that a flashlight was necessary to see it.

Table 1 Demographics of unique study participants who enrolled in the CareStart Rapid Antigen Test evaluation at the Stop the Spread COVID-19 testing site in Holyoke, Massachusetts.
Table 2 Tester symptoms, exposure history and prior COVID-19 testing per each CareStart testing occurrence, including repeated tests from the same participants.
Figure 1

Number of CareStart rapid antigen tests administered by date (n = 666). The bar colors reflect the results of the rapid tests on different days.

Figure 2

Examples of images of CareStart rapid test showing variable band intensities.

To determine the accuracy of the CareStart RDT, we calculated the concordance between the RDT and RT-qPCR (Table 3). Thirty-one RT-qPCR tests were excluded from the analysis because the sample was unsatisfactory for processing or testing yielded inconclusive results (detection of only one of the two viral probes). In real-world settings, such results would have prompted a recommendation for re-testing. Using all RT-qPCR Ct values below 40 as a positive reference, the sensitivity of the CareStart RDT was 48.1% (95% CI 34.0–62.4%), while the specificity was 99.0% (95% CI 97.8–99.6%) (Table 4). Of the 662 visits, participants reported presence of symptoms 90 (14.0%) times (Table 2). Cough was the most frequently reported symptom (n = 38, 5.7%), while loss of smell or taste, a more specific COVID-19 symptom, was reported in only 18 RDTs (2.7%) (Supplementary Table 2). Participants who tested positive by the CareStart RDT were more likely to report at least one symptom than participants who tested negative (41.9% vs. 12.2%; Chi-square p < 0.0001) (Table 2). Due to the limited sample size, we only stratified individuals tested by presence (n = 90) or absence (n = 572) of symptoms to test the CareStart RDT accuracy as a function of symptoms. The sensitivity of the CareStart RDT in symptomatic individuals was 46.4% (95% CI 27.5–66.1%), and the specificity was 100% (95% CI 95–100%) (Supplementary Table 3A,B). In asymptomatic individuals, the sensitivity of the CareStart RDT was 52.2% (95% CI 30.6–73.2%), and the specificity was 99.4% (95% CI 98.3–99.9%) (Supplementary Table 3C,D). Sensitivity and specificity did not significantly differ between symptomatic and asymptomatic individuals (p = 0.781 for sensitivity; p > 0.999 for specificity).

Table 3 Concordance between CareStart test results and RT-qPCR test results.
Table 4 Performance characteristics of CareStart test results benchmarked against the RT-qPCR gold standard.

Next, we used Ct values for amplification of the N2 target as a proxy for viral load, where higher Ct values reflect lower viral loads, as previously reported27. The Ct values of samples recorded as negative by the CareStart RDT were significantly higher than those of their positive counterparts (Mann–Whitney U p value < 0.0001, Fig. 3). Therefore, we also performed a subset analysis in which we only considered samples with a Ct < 30 as positive (Table 5). Using this cut-off, the CareStart RDT sensitivity and specificity were 64.9% (95% CI 47.5–79.8%) and 99.3% (95% CI 98.3–99.8%), respectively (Table 6). Although the CareStart RDT EUA does not indicate a specific Ct threshold for the positivity of the comparator RT-qPCR26, these data suggest that applying a more stringent Ct value threshold moderately improves the sensitivity of the CareStart RDT.
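The effect of changing the reference positivity threshold can be illustrated with a short sketch. The Python snippet below is a hedged example under stated assumptions, not the analysis code of the study: each record pairs an N2 Ct value (or None when no amplification was detected) with the RDT result, and the reference standard is re-derived for a chosen Ct cut-off before sensitivity and specificity are recomputed. The function name and all records are hypothetical.

```python
def accuracy_at_cutoff(records, ct_cutoff):
    """Sensitivity and specificity when only Ct < ct_cutoff counts as reference-positive.

    records: iterable of (ct_value, rdt_positive); ct_value is None when the
    RT-qPCR detected no amplification.
    """
    tp = fp = fn = tn = 0
    for ct_value, rdt_positive in records:
        reference_positive = ct_value is not None and ct_value < ct_cutoff
        if reference_positive and rdt_positive:
            tp += 1
        elif reference_positive:
            fn += 1
        elif rdt_positive:
            fp += 1  # includes RDT-positive samples with Ct at or above the cut-off
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical records, for illustration only (not the study data).
records = [
    (18.2, True), (22.5, True), (27.9, True), (29.0, False),
    (31.4, True), (33.0, False), (35.6, False), (38.1, False),
    (None, False), (None, False), (None, False), (None, False),
]

sens_40, spec_40 = accuracy_at_cutoff(records, ct_cutoff=40)
sens_30, spec_30 = accuracy_at_cutoff(records, ct_cutoff=30)
print(f"Ct < 40: sensitivity {sens_40:.3f}, specificity {spec_40:.3f}")
print(f"Ct < 30: sensitivity {sens_30:.3f}, specificity {spec_30:.3f}")
```

In this toy data set a stricter cut-off raises sensitivity because high-Ct samples missed by the RDT no longer count as reference-positive, while any RDT-positive sample with a Ct at or above the cut-off is reclassified as a false positive.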

Figure 3

N2 gene RT-qPCR Cycle threshold (Ct) values corresponding to positive and negative CareStart rapid antigen test results for all RT-qPCR positive samples (n = 52).

Table 5 CareStart test results compared to RT-qPCR using Ct positivity threshold of < 30.
Table 6 CareStart test result sensitivity and specificity using Ct positivity threshold of < 30.

Positive and negative predictive values of diagnostic tests depend on the prevalence of infection in a population: a higher prevalence increases the PPV at the expense of the NPV28. We calculated the PPV and NPV as a function of prevalence rates up to 10%; the PPV dropped steeply at prevalence rates below 5% (Fig. 4). At a sensitivity of 49% and specificity of 99.5% (Table 4), the PPV of CareStart was 49.7% at a SARS-CoV-2 infection prevalence of 1%, and 91.6% at a prevalence of 10%. In contrast, the NPV was 99.5% at a prevalence of 1%, and 94.6% at a prevalence of 10%.
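The dependence of predictive values on prevalence follows directly from Bayes' theorem. The Python sketch below is an illustration rather than the code used to generate Fig. 4; the function name is hypothetical, and the inputs are the sensitivity (49%) and specificity (99.5%) quoted in the paragraph above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given infection prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as quoted in the text; prevalence values are examples.
for prevalence in (0.01, 0.05, 0.10):
    ppv, npv = predictive_values(0.49, 0.995, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At prevalences of 1% and 10% this reproduces the PPVs of 49.7% and 91.6% and the NPVs of 99.5% and 94.6% reported above. The steep drop in PPV at low prevalence arises because false positives from the large uninfected group outnumber true positives even when specificity is high.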

Figure 4

Calculated positive (left) and negative predictive values (right) based on the CareStart performance characteristics and different prevalence estimates of SARS-CoV-2 infections.

Finally, our cohort included individuals who presented to the testing site multiple times and had at least one positive RT-qPCR test result. Therefore, we performed an exploratory analysis of their longitudinal test results (Supplementary Table 1 and Fig. 5). We enrolled 5 participants who converted from a negative to a positive RT-qPCR test, all of whom were accurately detected as positive by the RDT. Two participants with both positive RT-qPCR and RDT test results reverted to negative test results on both platforms. However, one participant converted from a positive to a negative RDT test result but was still detected as positive by RT-qPCR on the second test, which was conducted less than a week later.

Figure 5

Individuals who enrolled in the study multiple times and had at least one positive gold standard RT-qPCR reference (n = 11). The point colors reflect the different combinations of RT-qPCR and CareStart rapid test results. The numbers above the point correspond to Ct values of the RT-qPCR.

Discussion

Antigen-detecting RDTs provide a scalable and affordable alternative to molecular tests for the diagnosis of SARS-CoV-2 infection29. In this study, we present a prospective evaluation of the CareStart antigen-detecting RDT for SARS-CoV-2 detection in a real-world, walk-up community COVID-19 testing site in Holyoke, Western Massachusetts, a region experiencing disparities in testing that could be addressed by the scale-up of validated, affordable antigen-detecting RDTs such as CareStart30.

Benchmarked against an RT-qPCR-based test performed on anterior nasal swabs, CareStart showed a much lower sensitivity (49.0%) than that reported in the FDA package insert (87.2%), which was restricted to 39 symptomatic individuals within 5 days of symptom onset26. However, our measured sensitivity was consistent with the reported sensitivity of CareStart in asymptomatic individuals from a study at Lawrence General Hospital in Massachusetts, where the estimated sensitivity of CareStart was 51.4%17. Compared to the Abbott BinaxNOW RDT, which has been validated in several studies including a recent study in Massachusetts15, the sensitivity of the CareStart RDT was lower both overall and when using a Ct positivity cutoff of ≤ 30. The specificity of both tests was comparable at > 99%. Consistent with several studies, the sensitivity of antigen-detecting RDTs was modest in individuals with no or mild symptoms8,14,15,16,20. Given that presymptomatic and asymptomatic transmission is an important component of the pandemic31, implementation of antigen-detecting RDTs needs to weigh the benefits of rapid detection of SARS-CoV-2-infected individuals against the lower sensitivity of these tests in asymptomatic carriers29. On the other hand, the high specificity of these tests reduces the probability of false positive SARS-CoV-2 test results that could lead to restrictions and inconveniences that interfere with the livelihood of these individuals.

The RT-qPCR cycle threshold (Ct), a proxy for viral load as reported in other evaluations15,17,32, had a clear impact on the sensitivity of the CareStart RDT. Concordant results showed lower Ct values, whereas discordant results had higher Ct values. Consistent with this, the CareStart RDT sensitivity improved with an RT-qPCR positivity Ct cut-off of < 30. These data suggest that CareStart RDT positivity reflects a state of higher viral load, which correlates with infectivity of cells in vitro33 and likely with transmissibility. The viral load at the time of testing depends on multiple factors including host vaccination status, host immunity, viral variants and replication kinetics, local epidemic curves, and the method and quality of sample collection. For example, individuals vaccinated with the Pfizer/BioNTech BNT162b2 vaccine who had breakthrough SARS-CoV-2 infections have been reported to show higher Ct values than unvaccinated counterparts34, which correlates with lower viral loads35, suggesting that vaccinated individuals with breakthrough SARS-CoV-2 infections would be more difficult to detect by antigen-detecting RDTs with performance characteristics similar to those of CareStart. Further studies are needed to evaluate the performance characteristics of CareStart and other RDTs in these different scenarios.

Although our sample of repeat testers was limited, it suggested that individuals who recently converted from a negative to a positive RT-qPCR test, i.e. who recently acquired SARS-CoV-2 infection, were readily detected by the CareStart RDT. Recently infected individuals have been shown to be more contagious27. Samples from these recently infected individuals had low Ct values, suggesting that these individuals were more likely to transmit the virus33. Though our study is underpowered to evaluate performance characteristics in the setting of repeat testing, this limited sample supports the usefulness of serial rapid antigen testing in detecting recent infections and guiding the implementation of containment measures36. This is supported by studies of other RDTs, for example a study showing that daily testing increased the sensitivity of the Quidel Sofia SARS antigen fluorescent immunoassay (FIA)37. The public health benefits of serial testing with RDTs should be studied further.

This study has several limitations. First, because the study was embedded in a ‘real world’ testing program, we had limited ability to control or monitor the order in which the two bilateral nasal swabs were collected. It is possible that collecting the RT-qPCR swab first decreased the viral material available for the antigen test. However, a recent evaluation of the Abbott BinaxNOW suggested that the order of swabs had little impact on the test result20. Second, it remains possible that we overestimated the sensitivity of the CareStart rapid test by using anterior nasal swabs instead of nasopharyngeal swabs for the RT-qPCR reference, since anterior nasal swabs may be a slightly less sensitive gold standard than nasopharyngeal swabs38,39. Third, since rapid testing in the USA has primarily transitioned from community-based testing to over-the-counter direct-to-consumer testing40, it is important to develop tools to facilitate the interpretation of the positive predictive value of a test with low or moderate sensitivity. We previously published a tool to facilitate the interpretation of serological lateral flow assays that translates test accuracy and community prevalence into a PPV (https://covid.omics.kitchen/)41 and can be applied to rapid tests. Overall, the high specificity of the test increases its utility as a rule-in triage test, since false positive results are unlikely, but it is important to emphasize that false negatives are still likely. Fourth, the study was conducted in January and February 2021, when the predominant lineages in New England, USA, were the Wuhan or Alpha variants42. Hence, it is unclear whether the CareStart RDT is less sensitive for detecting the subsequent Delta or Omicron variants and sub-lineages. Finally, the study was conducted at a time when vaccination coverage against SARS-CoV-2 was low, which may affect the generalizability of these findings. Further studies are needed to evaluate the performance characteristics of this and other SARS-CoV-2 RDTs in these different scenarios.

In conclusion, RDTs such as CareStart can support SARS-CoV-2 testing efforts in minimally symptomatic or asymptomatic individuals. However, the impact of the limited sensitivity of these tests on their positive predictive values warrants caution. The moderate sensitivity of these tests means that some potentially infectious individuals may be classified as SARS-CoV-2-negative. Therefore, results from RDTs used for travel, home testing, or to guide re-openings of schools and workplaces should be interpreted with caution, and the utility of RDTs in each of these use cases should be carefully evaluated. Furthermore, implementation studies analyzing their usefulness and acceptability to both users and providers are necessary.