Background

In April 2021, the California Department of Public Health (CDPH) partnered with Stanford University to launch CalScope, a study estimating the proportion of Californians with evidence of SARS-CoV-2 antibodies from prior infection or vaccination [1]. CalScope was adapted from the CA-FACTS study [2] and, as part of CDPH's enhanced surveillance efforts to monitor the COVID-19 pandemic over time and inform public health action, was designed to address three main challenges: the high proportion of cases not reported to the state surveillance database because of limited testing capacity and mild or asymptomatic infections; a population of nearly 40 million residents spread throughout a large and diverse state; and stay-at-home orders that limited in-person study administration.

The objective of this paper is to describe the procedures for conducting CalScope, including the study design, data and support systems, and communication plan.

Methods

CalScope used random address-based sampling over three collection periods to invite households from seven counties in California, which were chosen to facilitate efficient sampling of the entire state, focus community-based outreach efforts on smaller geographic areas, and allow for region-specific estimates of seroprevalence. Up to one adult (18+ years old) and/or one child in each randomly selected household were invited to participate by completing an anonymous online survey and providing a self-administered, at-home SARS-CoV-2 antibody blood sample via a test kit with a dried blood spot (DBS) card. Further details on the sampling strategy and methods have been described previously [1].

Study process

Each selected household received a mailed invitation letter and a follow-up postcard containing a unique, address-linked, 8-digit access code and instructions on how to register for the study online or over the phone (Fig. 1A). For participants without internet access, an interactive voice response (IVR) system (an automated telephone system using pre-recorded messages) accepted registrations over the phone.
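The paper does not include the IVR implementation, but the general pattern can be sketched. Below is a minimal, illustrative Python sketch of such a phone prompt built on Twilio's programmable voice webhooks (Fig. 2 notes the study's IVR was facilitated via Twilio's API); the Flask routes, endpoint paths, and prompt wording are hypothetical, not the study's code.

# A minimal sketch of an IVR registration prompt, assuming a Flask webhook
# behind Twilio's programmable voice API; route paths and wording are hypothetical.
from flask import Flask, request
from twilio.twiml.voice_response import VoiceResponse, Gather

app = Flask(__name__)

@app.route("/ivr/welcome", methods=["POST"])
def welcome():
    # Ask the caller for the 8-digit access code printed on the invitation.
    vr = VoiceResponse()
    gather = Gather(num_digits=8, action="/ivr/code", method="POST")
    gather.say("Welcome to the study registration line. Please enter the "
               "eight digit access code from your invitation letter or postcard.")
    vr.append(gather)
    vr.redirect("/ivr/welcome")  # repeat the prompt if no digits were entered
    return str(vr)

@app.route("/ivr/code", methods=["POST"])
def code():
    # Twilio posts the keyed-in digits; per the study design, validation against
    # the address file happened after submission, before an order was processed.
    access_code = request.form.get("Digits", "")
    vr = VoiceResponse()
    vr.say("Thank you. Your registration has been recorded for review.")
    return str(vr)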

Fig. 1

Participation schema. (A) Registration and Survey only: 1. Household receives mailed letter or postcard invitation. 2. Adult registers online or by phone. 3. Study team contacts survey-only households without internet to complete survey over the phone. (B) Survey and Test Kit: 1. Household receives test kit by mail. 2. Adult activates the test kit online or calls study team for assistance. 3. Household mails back completed test kit. 4. Household receives test results by mail. Gift cards are delivered by mail or email

The registration form required an adult (aged 18+) from each household to enter the number of eligible adults and children in residence. Based on this enumeration, households had three options for study participation: (1) complete the adult survey-only option without ordering any test kits, (2) complete one adult survey with test kit and a child survey-only option (or vice versa), or (3) complete surveys with test kits for both an adult and child. Full participation in the study was defined as completion of both a survey and test kit per participant. Households without children could order only one adult survey with test kit or complete the adult survey-only option. Households with multiple adults and/or children were instructed to randomize participation by enrolling the member with the next upcoming birthday.
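As an illustration of this enrollment logic only (the field names are hypothetical and this is not the study's code), the options available to a household given its enumeration could be expressed as:

# A minimal sketch of the participation options described above, keyed to the
# household enumeration collected at registration. Illustrative only.
def participation_options(num_adults: int, num_children: int) -> list[list[str]]:
    """Return the enrollment combinations available to a household."""
    options = [
        ["adult survey only"],        # option 1: no test kits ordered
        ["adult survey + test kit"],  # available to every household
    ]
    if num_children > 0:
        options += [
            ["adult survey + test kit", "child survey only"],        # option 2
            ["adult survey only", "child survey + test kit"],        # ...or vice versa
            ["adult survey + test kit", "child survey + test kit"],  # option 3
        ]
    return options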

Registered households were mailed at-home SARS-CoV-2 antibody test kit(s) with instructions for completing the online survey, collecting a finger-prick DBS sample, and returning completed test kits in pre-addressed envelopes within two weeks. Upon return of the test kits, DBS cards with adequate blood samples were tested for the presence or absence of antibodies against the spike (S1) and nucleocapsid proteins of SARS-CoV-2 [3]. The S1 results were mailed back to participants with no personally identifiable information, differentiating results by sample type (adult or child), collection date, and result date only. The letter also included information on how to interpret the results. Two $20 gift cards were disbursed to each participant: one at the completion of the online survey and another after the return of a completed test kit with DBS samples.

Study surveys

All data collection materials were administered online using REDCap [4, 5]. Respondents completed forms for language selection, registration, adult or child surveys, and gift card preference. The language selection form supported English, Spanish, Filipino/Tagalog, and simplified Chinese, the four most common languages in the seven selected counties [6]. Participants who did not have internet access or had difficulty completing the online survey were asked to request help through IVR so that they could complete the survey with a study staff member over the phone.

The registration form included questions on number of adults and children in the household, choice of survey and/or at-home test kit orders, and optional contact information for study updates and reminders. The adult survey included questions on demographics, income, occupation, medical history, COVID-19 vaccination history, and risk factors for COVID-19 disease at both the household and individual level. The child survey included questions on individual demographics, medical history, COVID-19 vaccination history, and risk factors for COVID-19 disease. A gift card form at the end of each survey asked participants for their preferred method of delivery (mail or email).

At-Home SARS-CoV-2 antibody test kits

Antibody test kits were mailed to registered households with a unique 6-character alphanumeric code for activation, which allowed participants to access the online survey and instructional video for blood collection (Fig. 1B). Each kit box included: (1) a 1-page letter outlining the steps for accessing the online survey and collecting the blood sample; (2) an instructional booklet on how to collect a DBS sample; (3) materials for blood collection (i.e. lancet, gauze, alcohol wipes, DBS card); and (4) a pre-paid United States Postal Service (USPS) mailer addressed to the testing laboratory (Enable Biosciences, South San Francisco, CA). After receipt at the laboratory, collected blood samples were tested for SARS-CoV-2 antibodies.
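For illustration, unique 6-character alphanumeric codes of the kind printed on the kits could be generated as in the sketch below; the alphabet choice and uniqueness check are assumptions, not the study's actual scheme.

# A minimal sketch of generating unique 6-character alphanumeric activation codes.
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits  # 36 characters, ~2.2 billion codes

def new_activation_code(issued: set[str]) -> str:
    """Draw random 6-character codes until an unissued one is found."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(6))
        if code not in issued:
            issued.add(code)
            return code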

Data system

Four REDCap databases, extended with external modules [7,8,9,10], were used to manage study data and communicated with external databases via an Application Programming Interface (API) to update information on test kit shipment, return, and laboratory results on a daily schedule (Fig. 2). Each address was linked with the unique access code printed on its invitation. Printed invitations were delivered by USPS, and households could register online using the access and ZIP codes on the letter or postcard. If the entered combination was valid and unused, a new registration form opened, and the address associated with the access code was displayed to confirm that the registration was being completed by the intended household. Alternatively, registrations could be completed by phone through the IVR system, which collected the same information as the online form; in this case, the address validity check was performed after submission, before an order was processed. If contact information was provided during registration, confirmation emails and text messages were sent so that participants could verify that the correct order information had been received.
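The validation step can be sketched against REDCap's REST API, which accepts record exports filtered by project-defined logic. In the illustrative Python below, the project URL, token handling, and field names (access_code, zip, registered) are hypothetical placeholders rather than the study's configuration.

# A minimal sketch of access-code/ZIP validation via a REDCap record export.
import requests

REDCAP_URL = "https://redcap.example.org/api/"  # placeholder project endpoint
API_TOKEN = "..."                               # stored securely in practice

def lookup_household(access_code: str, zip_code: str) -> dict | None:
    """Return the address record for a valid, unused code/ZIP pair, else None."""
    # Inputs are assumed to be pre-validated as alphanumeric before being
    # interpolated into the filter logic.
    payload = {
        "token": API_TOKEN,
        "content": "record",
        "format": "json",
        "type": "flat",
        "filterLogic": f"[access_code] = '{access_code}' and [zip] = '{zip_code}'",
    }
    records = requests.post(REDCAP_URL, data=payload, timeout=30).json()
    if len(records) != 1:
        return None               # invalid combination: no matching address
    record = records[0]
    if record.get("registered") == "1":
        return None               # code already used for a registration
    return record                 # address echoed back to confirm the household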

Fig. 2

Data flow diagram for study information between various databases. The Main REDCap database was the primary repository, communicating via REST API to update survey, kit shipment, and laboratory results on a daily schedule. Numbered pathways show the connections involved in data management: (1) The sampling frame, linked with unique access codes (AC), was saved to the AC database and transferred to the printing partner through a web-based secure file transfer protocol (sFTP) client. (1b) Address-linked ACs were mailed out to households. (2) An adult participant entered the access and ZIP code combination into the website, or (3) by phone, facilitated via Twilio's API. (4) A scheduled task checked new IVR registrations against existing records in Main. (5) Main queried AC to look up and verify entered combinations; if valid and unique, a new record was created in Main to open the registration form, and for IVR, the order was copied over to a new Main record. (6-6b) Survey-only registrations by IVR were pulled to generate a follow-up list. (7) Information on orders and shipments (USPS tracking number, activation code, and DBS codes) was updated through API between Main and the Shipping Partner; the USPS API tracked shipment deliveries and returns. (8) Kit shipments (DBS codes, tracking numbers) received from the Shipping Partner and the Main record ID were relayed to the laboratory by API. (2-2b) Activation and ZIP code combinations on test kits were entered into the website, which located the Main record by API and displayed personalized survey link(s) for the test kit; DBS completion steps were viewable after completion of at least one survey. (8) Processing and result information for returned test kits was pulled from the Lab into Main using API. (9) When compensation criteria were met in Main, an external module (EM) assigned the next available Gift Card link to the Main record. (10) Main sent email/text for reminders, confirmations, and compensation based on programmed logic

Once a registration form was approved, the appropriate order information (including mailing address and test kit type) was sent to the shipping partner (ALOM, Fremont, CA). Upon fulfillment, information including the USPS tracking number, activation code, and DBS card ID numbers was saved to the household record and sent to the laboratory in preparation for accessioning once test kits were returned by participants.

When participants received their test kits, they were instructed to enter the printed activation and ZIP codes into the study website. When the correct combination was entered, the adult and/or child survey link(s) for the household were displayed along with instructions on collecting and returning the blood sample. After the laboratory tested the samples, results were returned to participants by mail. Gift cards were then sent to participants using their specified delivery method.

Data from each REDCap database were exported daily and cleaned via an automated program using SAS version 9.4 (SAS Institute Inc., Cary, NC). Study staff used these datasets to create reports, track enrollment, monitor data quality, flag errors, and ensure that all data exchanges occurred correctly throughout the study period.
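The study's export and cleaning pipeline ran in SAS 9.4; as an illustrative analogue only, a daily export from each project could look like the Python sketch below (reusing the placeholder REDCAP_URL from the earlier sketch; the cleaning logic itself is omitted).

# A minimal sketch of a nightly export of all four REDCap project databases.
import csv
import datetime
import io
import requests

def export_records(token: str) -> list[dict]:
    """Export all records from one REDCap project as a list of dicts."""
    payload = {"token": token, "content": "record", "format": "csv", "type": "flat"}
    resp = requests.post(REDCAP_URL, data=payload, timeout=120)
    resp.raise_for_status()
    return list(csv.DictReader(io.StringIO(resp.text)))

def nightly_snapshot(project_tokens: dict[str, str]) -> None:
    """Pull each project database and write a dated CSV snapshot to disk."""
    stamp = datetime.date.today().isoformat()
    for name, token in project_tokens.items():
        rows = export_records(token)
        if not rows:
            continue
        with open(f"{name}_{stamp}.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=rows[0].keys())
            writer.writeheader()
            writer.writerows(rows)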

Support requests and follow-up

Six to ten members of the CDPH California Connected contact tracing team provided follow-up support for participants over the phone and by email. Experienced with conducting contact tracing interviews, this support team was trained with a call center protocol and phone script. A CalScope study member was also available to provide technical support during follow-up. To follow up with participants who needed support in languages other than English, an interpreter service was utilized.

Participants could submit support requests to the study team via: (1) a web-based “Contact Us” support form on the study website, (2) direct emails to the study inbox, or (3) a voicemail through the IVR system (Fig. 3). Study staff consolidated requests and created daily reports to review and track open requests. Each request was assigned a tracking number, which staff referenced as they sent emails, documented notes, and updated the resolution status of each follow-up event.
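The tracking workflow can be sketched as below; the field names, status values, and report logic are illustrative assumptions, not the study's implementation.

# A minimal sketch of support-request tracking with sequential ticket numbers.
from dataclasses import dataclass, field
from datetime import date
from itertools import count

_next_id = count(1)

@dataclass
class SupportRequest:
    channel: str                  # "web form", "email", or "voicemail"
    question: str
    preferred_contact: str        # "email" or "phone"
    status: str = "open"          # "open" -> "in progress" -> "resolved"
    attempts: int = 0             # follow-up attempts made so far
    opened: date = field(default_factory=date.today)
    request_id: int = field(default_factory=lambda: next(_next_id))

def daily_report(requests_: list[SupportRequest]) -> list[SupportRequest]:
    """Return unresolved requests for staff review, oldest first."""
    return sorted((r for r in requests_ if r.status != "resolved"),
                  key=lambda r: r.opened)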

Fig. 3

Data flow diagram for support and tracking of follow-up activity. The Support REDCap database was the primary data repository for support requests, where all participant communication received through the website, email, or voicemail was consolidated for follow-up documentation and tracking. Numbered pathways show the databases and connections involved in the management of support and follow-up activities: (1-1b) A participant submitted a web-based support form through the survey link labeled “Contact Us” on the study website to create a new request. (2-2b) A participant sent an email to the study inbox, which was manually checked by study staff. (3) A participant left a voicemail using the IVR system enabled by Twilio's API and an external module (EM). (3b) Automatic transcriptions generated by the EM were sent to study staff by email. (4) Study staff manually transferred incoming email and voicemail into Support, creating new support requests. (5) Daily reports were generated for study staff to review and track open support requests. (8) Study staff followed up with support requests by contacting participants through phone or email

Communication materials and community outreach

A web-based communications toolkit was developed to increase study awareness and support community outreach; it contained a study FAQ, a participation infographic, outreach flyers, social media messages and graphics, press release templates, and radio scripts. Virtual presentations were regularly scheduled with local health departments in participating counties to provide study progress updates and to review communication materials and outreach strategies for increasing study engagement and toolkit utilization. To promote general awareness, CDPH shared study information through social media messages and COVID-19 media updates. Local health departments were also encouraged to partner with community-based organizations to increase local visibility of the CalScope study.

Results

Registration

CalScope recruited participants through three separate samplings: 200,000 households in wave 1 (W1; 04/20/2021-06/16/2021), 200,000 households in wave 2 (W2; 10/15/2021-12/17/2021), and 266,857 households in wave 3 (W3; 04/01/2022-07/29/2022), with test kit return deadlines of 08/01/2021, 02/11/2022, and 08/26/2022, respectively. The number of invited households was increased by 66,857 in W3 to adjust for decreasing participation rates between the first two waves.

Registration was highest in W1, with 5.6% of invited households responding to the invitation, and decreased in subsequent waves (Table 1). The proportion of households that did not complete the registration form or opted out of both the kit and survey at the end of registration was 3.0% (344/11,286) in W1, 3.5% (318/9,019) in W2, and 4.2% (515/12,366) in W3. Across waves, 30.7% of registered households reported having at least 1 child in residence (W1 30.1%; W2 32.1%; W3 30.0%).

Table 1 Study participation across all three waves at the household and individual adult and child participant level

Overall study participation

Out of 666,857 households invited across three waves, 32,671 (4.9%) completed a registration form, and a total of 31,496 (4.7%) completed an antibody test kit and/or online survey (Table 1). Among households that selected the survey-only option, the survey completion rate was 91.1% (1,816/1,993), with 91.0% (1,042/1,145) of adult and 91.3% (774/848) of child surveys completed. Among households that ordered test kits, the survey completion rate was 80.0% (24,299/30,358), with 80.0% (24,272/30,342) of adult and 69.5% (5,700/8,201) of child test kit surveys completed. Withdrawals from the study occurred only among households that had ordered test kits, with 0.1% (15/10,444) withdrawn in W1, 0.2% (14/8,466) in W2, and 0.4% (44/11,448) in W3.

Across three waves, 77.7% (23,587/30,358) of household test kits were returned. About 99.2% (23,396/23,587) of these households provided adequate samples on the DBS cards for testing (23,376 adult; 4,603 child). Sample rejections were more common among returned child kits (12.0%) compared to adult kits (0.7%). In total, 70.5% (23,043/32,671) of registered households (23,022 adult; 4,526 child) completed the survey and returned adequate blood samples for testing. In the end, 74.3% (24,270/32,671) of registered households completed partial or full participation in the study through the survey-only option (5.1%), the test kit-only option (92.5%), or a combination of the survey-only and test kit option (2.4%).

Participation by IVR-registered households

Registrations completed through IVR accounted for 6.3% (2,065/32,671) of all registrations (W1 4.1%; W2 7.5%; W3 7.5%). Compared to 78.9% (24,136/30,606) of online-registered households that completed surveys, only 65.5% (1,352/2,065) of households that registered through IVR completed surveys. The percentage of test kits returned was also higher among online-registered households (78.5%), compared to IVR-registered households (66.2%). A smaller proportion of IVR households completed full participation in the study with 60.4% (1,247/2,065) returning adequate samples with a completed survey, compared to 71.2% (21,797/30,606) of online-registered households.

Participation by language

Out of 32,671 total households registered, the majority completed registrations in English (95.7%), followed by Spanish (2.9%) and simplified Chinese (0.8%). Survey completion was higher in households that registered in English (78.9%) or simplified Chinese (78.3%) compared to households that registered in Filipino/Tagalog (66.7%) or Spanish (62.6%). In terms of full participation, 71.5% (22,356) of households that registered in English, 66.7% (172/258) of those in simplified Chinese, 66.7% (6/9) in Filipino/Tagalog, and 53.9% (510/947) in Spanish submitted adequate samples along with a complete survey.

Support requests and follow-up

A total of 5,807 support requests were submitted to the study for follow-up: 2,476 (42.6%) in W1, 1,626 (28.0%) in W2, and 1,705 (29.4%) in W3. On average, support requests were followed up by study staff within 1–2 business days and were resolved within a maximum of 3 follow-up attempts. In total, participants submitted 2,106 (36.3%) web-based forms, 1,978 (34.1%) emails, and 1,723 (29.7%) voicemail support requests. For method of contact, 3,890 (67.0%) requested follow-up by email, while 1,917 (33.0%) requested a follow-up phone call. The majority of requests (95.8%) were submitted in English, followed by Spanish (3.3%) and simplified Chinese (0.9%); only 1 request was submitted in Tagalog/Filipino.

Table 2 shows submitted participant questions by support type. Across waves, the highest number of support requests was for the test kit (1,836), which included participants requesting assistance with the test kit survey (30.6%), replacements for lost/missing kits (19.2%), and confirmation that the laboratory received the test kit (18.7%). Questions about test results (1,557) were the second highest reason for support requests, with most participants asking about their results or requesting re-delivery (94.5%).

Table 2 Study participant questions by support type

Communication materials and community outreach

A total of 14 communication materials were developed and refined with the input and feedback of the participating counties’ local health departments. Materials were available in all four study languages and could be further tailored to better meet the needs of target populations in each county. The CalScope study team also partnered with local health departments and engaged with 54 trusted community-based organizations and local partners to disseminate study outreach materials and increase study awareness and support. Community partners included COVID-19 equity task forces and local libraries. Some examples of unique outreach approaches included hosting virtual presentations, creating radio public service announcement scripts, and filming video testimonials with previous participants that could be shared on public health websites and social media platforms.

Discussion

CalScope was the first state-sponsored study undertaken at CDPH to estimate SARS-CoV-2 seroprevalence in California. The study was designed to collect representative survey data to complement the state's existing surveillance efforts, reaching households that may have had less access to testing resources as well as the pediatric population. Over the course of three waves between April 2021 and August 2022, a total of 32,671 (4.9%) households registered for the study, with 25,488 (78.0%) households completing 25,314 adult and 6,474 child surveys. In addition, 23,396 (71.6%) households provided 23,376 adult and 4,603 child blood samples for testing using an entirely home-based test kit without any in-person contact between participants and study staff.

The greatest loss to follow-up occurred among households that ordered a test kit, with 22.3% failing to return the kit. Moreover, 31.8% of households that ordered a child test kit failed to return it, higher than the 22.3% loss to follow-up seen for adult test kits. Among test kits that were returned, 12.1% of child test kits had an inadequate blood sample, suggesting persistent difficulties with blood collection among child participants. However, testing a child within the household seemed widely acceptable, as 83.5% of households with children opted to order a child test kit and 46.1% had a child complete full participation in the study.

Several study design considerations were integral to the overall success of CalScope. First, a scalable and customizable data system designed through REDCap helped to manage this statewide study with a relatively small number of core staff. The security, use, and adaptability of REDCap for data collection and management is well documented [11, 12]. For CalScope, API integration created an enhanced participant experience by facilitating automated data exchanges with the network of vendor and partner systems. The ability to program logic for customized email and text messages also greatly reduced staff burden for sending communications and reminders.

Study accessibility for non-English speakers and those without internet access was addressed through the IVR system, interpreters, and translated materials. IVR was a particularly valuable registration method, utilized by 4–8% of registering households per wave. However, the 13.4% difference in survey completion and the 10.8% difference in full participation between IVR- and online-registered households indicate that the two options may not have provided equal participant experiences. The overall low utilization of non-English forms to complete registrations (3.7%), surveys (3.1%), and support requests (4.2%) also indicates that, despite multiple design considerations to increase accessibility, participation among non-English speaking households was lower than expected based on population demographics.

The infrastructure for tracking and follow-up of support requests allowed staff to quickly respond to participant questions, keeping them engaged and more likely to complete the study. Systematically documenting the types of support also allowed staff to continuously analyze common issues and concerns, resulting in improved study materials and processes between waves. Furthermore, having support staff directly reach participants by phone was especially helpful for survey completion reminders and troubleshooting.

Finally, recruiting participants from seven selected counties instead of the entire state helped to focus the communication and outreach around study participation. CalScope relied on partnerships with trusted community-based organizations and local health departments to disseminate tailored messaging about the study and encourage participation from invited residents. CalScope was featured in press releases, local news stories, and social media posts from official government accounts, which likely increased participants’ confidence around the legitimacy of the study. As a result, CalScope was able to meet study registration targets for most counties, with improvements seen even within counties where response rates were lower than expected.

Conclusions

Despite some limitations, public health departments can enhance existing routine surveillance systems with remotely collected surveys and at-home serological testing data. Such data can improve population-based estimates of novel diseases and conditions that may be underreported. Data from such studies can also be used to improve forecast models and identify health disparities among specific populations that may benefit from increased attention and better resource allocation. Although these methods are not universally applicable, diseases with higher true prevalence and longer immunological responses that can be tested with DBS samples may benefit from adapting the study methods presented.