Introduction

In health care institutions, screening for COVID-19 symptoms is essential for patient and staff safety. At Mayo Clinic, staff were assigned to screen patients for COVID-19 by phone before appointments and to screen for symptoms during patient visits, which increased the time needed for appointments and strained the resources of an already stressed health care system. With a 90% penetration rate of cellular phone subscriptions, mobile phone messaging offers a low-cost approach to transferring health information without face-to-face contact [1]. Mobile phones have many applications in health care, including increasing access to care; enhancing efficiency of service delivery; improving diagnosis, treatment, and rehabilitation; and supporting public health programs [2]. Mobile phone services have been used to send appointment reminders, improve patient adherence to medications, monitor chronic diseases, provide psychological support, and trace contacts for communicable diseases [1].

Short message service (SMS)-based communications have been used for patient appointment reminders, including for radiology appointments [3], and to promote behaviors such as medication adherence [4]. Chatbots are conversational SMS-based agents with varying levels of dialogue management, ranging from finite systems, in which the user is guided through a sequenced dialogue with predetermined steps, to conversational agents, in which complex communication occurs between a system, the user, and an application such as Now (Google), Alexa (Amazon.com, Inc), or Siri (Apple Inc) [2]. These conversational agents have several advantages over other types of communication: ease of use, relatively low cost, rapid and automated message delivery, and minimal risk of harm to patients [4]. We therefore aimed to demonstrate the feasibility of a chatbot system and to describe patient experiences with SMS-based COVID-19 symptom screening before scheduled radiology visits.

Methods

We followed SQUIRE 2.0 guidelines for reporting clinical practice quality improvement studies [5]. In May 2020, we obtained internal approval for a pilot project to assess the feasibility and effectiveness of using an SMS-based chatbot communication tool to screen patients for COVID-19 before outpatient magnetic resonance imaging (MRI) and ultrasound (US) radiology appointments.

We partnered with a company (GYANT, San Francisco, CA) to customize a secure chatbot system to screen patients for COVID-19 symptoms. The multidisciplinary team developing the chatbot consisted of members from the Center for Digital Health Arizona, the Department of Radiology at Mayo Clinic, and the software company. The customization included organization-approved wording for screening and links to maps that would direct patients to their appointments. Readability of the chatbot messages was assessed with the Flesch Reading Ease Score and averaged 73.6, corresponding to a seventh-grade reading level. Because cybersecurity is an increasing concern for patients and institutions, all third-party applications at Mayo Clinic, including the chatbot used in this project, undergo rigorous testing by the Third Party Risk Management department to ensure compliance with the Health Insurance Portability and Accountability Act and secure data storage practices.
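
We report the Flesch Reading Ease Score but do not describe the tooling used to compute it. As a rough, self-contained sketch, the score can be computed directly from its published formula; the syllable counter below is a crude vowel-group heuristic (an assumption for illustration; dedicated readability libraries count syllables more carefully):

```python
import re

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease:
    206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words).
    Scores in the low-to-mid 70s correspond roughly to a seventh-grade level.
    """
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))

    def syllables(word: str) -> int:
        # Crude heuristic: count runs of consecutive vowels, minimum 1.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    n_syllables = sum(syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (n_syllables / n_words)

print(round(flesch_reading_ease("Please reply YES to confirm your visit."), 1))
```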

Patients were first notified about COVID-19 screening via an SMS message asking them to click a link to a secure website where they would answer screening questions (Fig. 1). This SMS message did not identify the institution as the sender. Once the link was activated, the user was routed to a secure website bearing the institution's logo and directed to answer questions in a window that simulated an SMS exchange. The software used a decision-tree algorithm capable of routing patients into several pathways depending on their responses to the COVID-19 screening questions (Fig. 2). Patients who reported symptoms were directed by the algorithm to obtain COVID-19 testing at our testing site before their appointment. Asymptomatic patients were routed to confirm their appointments, and patients with questions were flagged so that an outbound call to the patient could be made.

Fig. 1

Representative images of the interaction between the chatbot and the user, including authentication followed by COVID-19 screening questions

Fig. 2

Flowchart showing the recording of each combination of patient responses. White represents chatbot interactions, beige represents patient decision points, and red represents end points
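
The flowchart in Fig. 2 maps every combination of patient replies to a single end point. As a minimal sketch of such decision-tree routing, assuming one hypothetical symptom question and illustrative endpoint labels (the vendor's actual question flow is not published), the logic might look like this:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    prompt: str
    children: Dict[str, "Node"] = field(default_factory=dict)  # reply -> next node
    endpoint: Optional[str] = None  # set only on leaf nodes

# Hypothetical end points corresponding to the three pathways described above.
confirm = Node("Thank you. Your appointment is confirmed.",
               endpoint="confirm_appointment")
testing = Node("Please complete COVID-19 testing at our site before your visit.",
               endpoint="route_to_testing")
callback = Node("A staff member will call you shortly.",
                endpoint="outbound_call")

root = Node(
    "Have you had fever, cough, or other COVID-19 symptoms in the past 14 days? "
    "(yes / no / question)",
    children={"yes": testing, "no": confirm, "question": callback},
)

def route(node: Node, replies) -> str:
    """Walk the tree using the patient's replies; return the leaf end point."""
    for reply in replies:
        node = node.children[reply.strip().lower()]
    return node.endpoint

print(route(root, ["no"]))  # -> confirm_appointment
```

Because every path terminates in a predefined end point, each combination of responses can be recorded and audited, consistent with the finite, predetermined-step dialogue style described in the Introduction.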

Study of Intervention and Measures

To assess patient engagement, we calculated the response rate, categorizing patients as responders or nonresponders. Responders were defined as patients who completed the chatbot; nonresponders were defined as patients who experienced authentication failure, had no valid phone number, timed out of the chatbot, or unsubscribed from it. For responders, the chatbot concluded with a brief question addressing its ease of use, scored on a 5-point Likert scale (1 = poor and 5 = outstanding). Patient responses were linked with demographic information (age and sex) and written-language preference obtained from the electronic health record.
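
As a small illustration of this engagement measure, the responder/nonresponder split can be computed from one outcome label per patient (the labels below are hypothetical, not the actual system codes):

```python
from collections import Counter

RESPONDER = {"completed"}
NONRESPONDER = {"auth_failure", "no_valid_phone", "timed_out",
                "unsubscribed", "no_engagement"}

def response_rate(outcomes):
    """Fraction of patients who completed the chatbot."""
    counts = Counter(outcomes)
    responders = sum(counts[label] for label in RESPONDER)
    total = responders + sum(counts[label] for label in NONRESPONDER)
    return responders / total

print(response_rate(["completed", "timed_out", "completed", "auth_failure"]))  # 0.5
```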

Statistical Analysis

We assessed engagement with the chatbot screening by patient age and sex, imaging modality (US or MRI), and English preference as primary written language. Patients were stratified into age groups: baby boomers (57 years and older), Generation X (41–56 years), Generation Y (25–40 years), and Generation Z (younger than 25 years). We used the nonparametric Mann–Whitney U test for continuous variables and the χ2 test for categorical variables. We used a multivariable logistic regression model to predict response with the following covariates: English vs non-English written-language preference, age, and sex. P < 0.05 was considered statistically significant. Statistical analysis was performed with Stata/IC 15.1 (StataCorp LLC, College Station, TX).
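
The analyses above were run in Stata; for illustration, an equivalent sketch in Python is shown below. The data frame is simulated, and its column names are assumptions for this sketch, not the actual study variables:

```python
import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu, chi2_contingency
import statsmodels.formula.api as smf

# Simulated patient table: responded (0/1), age (years), male (0/1),
# english_pref (0/1). Values are made up only to keep the sketch runnable.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "responded":    rng.binomial(1, 0.58, n),
    "age":          rng.normal(58, 15, n),
    "male":         rng.binomial(1, 0.46, n),
    "english_pref": rng.binomial(1, 0.98, n),
})

# Mann-Whitney U test for a continuous variable (age) by response status.
_, p_age = mannwhitneyu(df.loc[df.responded == 1, "age"],
                        df.loc[df.responded == 0, "age"])

# Chi-square test for a categorical variable (language preference).
_, p_lang, _, _ = chi2_contingency(pd.crosstab(df.english_pref, df.responded))

# Multivariable logistic regression predicting response (0/1).
model = smf.logit("responded ~ english_pref + age + male", data=df).fit(disp=0)
print(np.exp(model.params))  # exponentiated coefficients are odds ratios
```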

Ethical Considerations

The study was compliant with the US Health Insurance Portability and Accountability Act, and the Mayo Clinic Institutional Review Board waived the need for patient consent.

Results

Over the 4-month initiative, the chatbot COVID-19 screening SMS message was sent to 4687 patients scheduled for outpatient MRI (44.7%) and US (55.3%) examinations. A single SMS message was sent to 4413 patients (94.1%); a second SMS message was sent to the 274 patients (5.8%) who did not respond to the first. For patients who received a follow-up SMS message, only the most recent response was included in the analysis. Of the 4687 patients, 2722 (58.1%) responded and 1965 (41.9%) did not. Of the 2722 responders, 2322 (85.3%) confirmed the appointment, 46 (1.7%) reported COVID-19 symptoms, and 34 (1.2%) had a COVID-19 test scheduled or pending. Of the 1965 nonresponders, 174 (8.8%) had an authentication failure, 1496 (76.1%) did not engage with the initial SMS message, and 251 (12.8%) timed out of the chatbot. Several patients contacted the organization out of concern that the original SMS message was spam because the sender's name did not include the organization name. On a 5-point Likert scale, the mean user rating among patients who completed the chatbot was 4.6.

We found no differences in response rate by age or sex, although response rates differed by language preference (Table 1). Response rates also did not differ significantly between MRI (56.0%) and US (58.8%) patients. Of the 4687 patients, 4600 (98.1%) reported English as their primary written language. Patients with an English written-language preference had a higher response rate than those with a non-English written-language preference (56.3% vs 43.7%, P < 0.05). Nonresponders and responders did not differ in median age (58.0 vs 59.0 years, P = 0.91), proportion of male patients (45.3% vs 46.2%, P = 0.55), or age distribution (P = 0.23). English vs non-English written-language preference, age, and sex were included in the multivariable logistic regression model to predict response. In this model, English written-language preference independently predicted response compared with non-English written-language preference (odds ratio, 2.71 [95% CI, 1.77–2.77]; P = 0.007). Age (P = 0.57) and sex (P = 0.51) did not predict response.

Table 1 Differences in response rate based on demographic variables

Discussion

We had a 58% response rate from patients for the SMS-based chatbot COVID-19 screening. Most responders (85%) confirmed appointments, and a minority (less than 5% in total) reported COVID-19 symptoms or had COVID-19 testing pending or scheduled. Patients with an English written-language preference had 2.7 times the odds of responding to SMS-based COVID-19 screening compared with patients with a non-English written-language preference. Sex and age did not affect the response rate. Our results showed the feasibility of using SMS-based chatbot communication for COVID-19 screening. To our knowledge, this is the first description of an SMS-based communication method for screening patients before radiology examinations. SMS-based chatbot screening could also be applied in other areas of radiology to streamline workflow, such as MRI safety screening. During SMS-based communication, it is important to be mindful of language preferences to ensure inclusivity of diverse patient populations, as our results show. This also suggests that patient-specific variables such as patient-centered language, reading level, and chatbot length are important for an inclusive and accessible radiology communication tool and should be considered in future studies of health-related communications [6].

Interpretation

During the COVID-19 pandemic, local, state, and national groups used chatbots to share information, encourage health-influencing behaviors, reduce psychological harm, and combat COVID-19 misinformation [7, 8]. Health care systems have also used chatbots for large-scale COVID-19 symptom screening [9]. In this quality improvement project, we observed a 58% response rate from patients for COVID-19 symptom screening. This rate is lower than the 97% rate reported by Judson et al. [9] for COVID-19 screening of health system employees. The difference may reflect differences in populations (patients vs employees) and frequency (a 1-time appointment vs daily work). In addition, some of our patients did not engage because they were concerned that the initial SMS message was spam or phishing. In the Judson et al. study [9], only 0.2% of employees reported COVID-19 symptoms; in our study, 1.7% of patients reported COVID-19 symptoms and 1.2% had tests pending or scheduled. Thus, both employee screening in the Judson et al. study [9] and our patient screening showed relatively low rates of symptoms of possible COVID-19 infection. Although our study did not explicitly measure time saved by the chatbot relative to nursing calls, Judson et al. [9] reported that chatbot screening allowed faster clearance and significant time savings in the similar setting of employee screening.

Chatbots are now in widespread use, so understanding end users is important for implementing SMS-based communications that reach the widest audience. Kocielnik et al. [10] compared a chatbot with a survey for determining communication preferences among emergency department patients of low and high health literacy being screened for social needs. Although both platforms yielded comparable data, the low health-literacy group preferred the chatbot, which they found more engaging because of its conversational design. In contrast, the high health-literacy group preferred surveys because they found the chatbot too slow for completing the screening. In our study, language preference was another important contributor to engagement: participants with a written-language preference for English had 2.7 times the odds of responding compared with those with a non-English written-language preference. Literacy and language preference are therefore important factors to consider for engagement when developing a chatbot.

Limitations

Our quality improvement effort had several limitations. First, we did not have a control group, given the pilot nature of the study and the novel needs of the pandemic. Second, we had only a patient-experience score and lacked granular feedback about patient experiences, which prevented us from understanding the reasons for nonresponse. Third, the single-institution experience limits the generalizability of the work. Fourth, our screening effort was limited to the radiology patient population, so we had no comparative data for patients outside radiology who underwent additional COVID-19 screening through parallel hospital efforts. Fifth, our screening chatbot was initially accessed through an SMS message with no Mayo Clinic identification, which led some patients to categorize the SMS as spam. Further improvements to our screening chatbot would include outreach through more identifiable modalities such as portal messaging or email. Despite these limitations, we showed the feasibility of implementing an SMS-based COVID-19 screening chatbot before radiology appointments at a large academic institution, with engagement from the majority of patients and follow-up communication for a minority.

Conclusions

In our pilot study, SMS-based chatbot communication for COVID-19 screening of radiology patients achieved a nearly 60% response rate with high patient-experience scores (mean, 4.6/5 among patients who completed the chatbot). Response rates may have been higher had it been clear that the original SMS message originated from Mayo Clinic. A preference for English as the written language was associated with a higher response rate, but age did not affect response; that is, older patients were as likely as younger patients to interact with this technology. Future studies could explore the effect of a multilingual chatbot on response rates in populations whose preferred written language is not English. In the future, it will also be important to assess overall patient satisfaction with the chatbot through methods independent of the chatbot itself, such as a questionnaire completed at the in-person visit.

To our knowledge, this is the first study to examine chatbot screening in radiology, and our findings have implications for expanding the use of SMS-based technology to other medical screening purposes within and outside of radiology, for augmenting or replacing standard phone and online-portal communication, and for using real-time communication when patients are inside the facility. This pilot study showed that an interactive chatbot experience was a feasible way to engage patients to obtain needed health information.