Development and Evaluation of a Relational Agent to Assist with Screening and Intervention for Unhealthy Drinking in Primary Care

Screening, brief intervention, and referral to treatment for alcohol misuse during primary care appointments is recommended to address high rates of unhealthy alcohol use. However, implementing screening and referral practices in primary care remains difficult. Computerized Relational Agents programmed to provide alcohol screening, brief intervention, and referral can reduce the burden on clinical staff while increasing screening rates. As part of a larger clinical trial, we aimed to solicit input from patients about the design and development of a Relational Agent for alcohol brief intervention. We also solicited input from patients who interacted with the implemented version of the Relational Agent intervention after they finished the trial. A two-part development and evaluation study was conducted. First, a user-centered design approach was used to customize the intervention for the population served. A total of 19 participants shared their preferences regarding the appearance, voice, and setting of multiple Relational Agents through semi-structured interviews. Following the completion of the study one interviews, a Relational Agent was chosen and refined for use in the intervention. In study two, twenty participants who had completed the clinical trial intervention were invited back for a semi-structured interview to provide feedback about their experiences interacting with the intervention. Study one results showed that participants preferred a female Relational Agent located in an office-like setting, but the mechanical, stiff movements of the Relational Agent decreased feelings of authenticity and human trustworthiness for participants.
After refinements to the Relational Agent, post-intervention results in study two showed that participants (n = 17, 89%) felt comfortable interacting with the Relational Agent and discussing their drinking habits with it, and that participants (n = 10, 53%) believed the intervention had a positive impact on the way they thought about drinking or on their actual drinking habits. Despite variability in participant preferences during the development stage of the intervention, incorporating participant feedback during the design process resulted in high comfort levels for individuals interacting with the Relational Agent. clinicaltrials.gov, NCT02030288, https://clinicaltrials.gov/ct2/home


Background
Unhealthy alcohol use (Saitz, 2005) is currently the leading cause of preventable death in the USA (Centers for Disease Control and Prevention, 2022) and is more prevalent among veterans than among non-veteran civilian populations (Teeters et al., 2017). These high rates of unhealthy alcohol use result in increased healthcare utilization to address drinking consequences and a high cost burden to hospital systems (Centers for Disease Control and Prevention, 2022; Sacks et al., 2015). To address unhealthy alcohol use (i.e., problem drinking, alcohol abuse, and dependence), the US Preventive Services Task Force (USPSTF) (Curry et al., 2018) recommends regular alcohol screening in primary care for all adults over the age of 18, and the Veterans Health Administration (VHA) has joined the USPSTF by mandating that all veterans be screened for alcohol use annually during regular primary care visits (Teeters et al., 2017).
Screening, Brief Intervention, and Referral to Treatment (SBIRT) is an evidence-based approach used to identify and provide treatment to patients in a variety of clinical settings. SBIRT is conducted with the overall goal of immediate intervention and presentation of treatment options to individuals presenting with unhealthy levels of alcohol use (Babor et al., 2007). Primary care is often considered the ideal clinical setting for SBIRT because of the frequency with which patients present to primary care, readily available treatment resources, and a higher rate of compliance among patients asked to participate in screening in primary care (McCance-Katz & Satterfield, 2012) than in other clinical settings.
Despite the recommendations of the USPSTF and VHA, effective implementation of unhealthy alcohol use screening in primary care settings is difficult, and screening rates in primary care clinics remain highly variable. The low screening rates are often attributed to implementation-related physician barriers, including limited physician availability, competing clinical demands during the primary care visit (McNeely et al., 2018; Rahm et al., 2015; Yarnall et al., 2003), the need to address other pressing medical issues, and limited knowledge of available follow-up referral care services when a patient screens positive, a key component of the SBIRT approach. However, low positive screening rates may also be exacerbated by patients' reluctance to disclose alcohol use, possibly due to discomfort or to fears of substance use-related stigma or consequences (McNeely et al., 2018; Teeters et al., 2017; Williams et al., 2016).
The lack of knowledge surrounding the SBIRT approach among primary care providers has left low referral rates a prevalent issue nationwide, and despite the recommendation and incentive for primary care physicians to administer the screening, the quality of screening remains inconsistent. Upwards of 70% of veterans screening positive for past-year substance use disorders do not receive treatment referrals (Boden & Hoggatt, 2018; Golub et al., 2013), and the proportion of veterans with unmet alcohol use-related treatment needs is double the proportion with unmet treatment needs for serious mental health concerns (Golub et al., 2013). Identifying innovative ways to provide all components of SBIRT in routine care, and not just screening, is critical to combating this public health issue in both the veteran and general populations.
Technology-based advances in alcohol screening and intervention provide a unique opportunity to combat unhealthy alcohol use by offsetting both provider and patient barriers to conducting SBIRT (Harris & Knight, 2014). Relational Agents are computer-based programs that can simulate face-to-face counseling and have been developed and tested for a variety of health-related conditions including weight loss (Watson et al., 2012), exercise (Bickmore et al., 2013; Fasola & Mataric, 2012; Sillice et al., 2018), diabetes management (Thompson et al., 2019), and medication adherence (Bickmore et al., 2010a, b). Relational Agents provide the opportunity to form therapeutic bonds with patients (Bickmore & Gruber, 2010) by fostering a setting of comfort and confidentiality, thereby offering patients an alternative way to disclose and discuss behaviors that may be uncomfortable to raise with their physicians face to face. Previous research has also shown that patient preferences for Relational Agents can vary across numerous characteristics including, but not limited to, gender (Esposito et al., 2021), animation (Parmar et al., 2022), and race (Bickmore et al., 2005a, b; Persky et al., 2013; Zhou et al., 2014). Tailoring a Relational Agent to the patient population's preferences can increase trust and buy-in from patients. In addition, Relational Agents have been shown to effectively promote long-term behavior change. However, testing and development in the context of substance use remain limited (Bickmore et al., 2020).
The Relational Agent design was based on research demonstrating the elements of effective brief interventions (FRAMES: feedback, responsibility, advice, menu, empathy, self-efficacy) (Bien et al., 1993) and on a motivational interviewing style (Miller & Rollnick, 2012). It was designed to administer the QDS (Quick Drinking Screen) (Sobell et al., 2003) and AUDIT-C (Alcohol Use Disorders Identification Test-Consumption) (Bush et al., 1998) to the participant, provide normative feedback about how their drinking compared to that of their same-gender peers, elicit concern about the participant's drinking habits, motivate and ask for a commitment to change, and refer the participant to treatment. Feedback was tailored based on participant responses to the AUDIT-C and QDS screening tools. The Relational Agent was designed to take roughly 15 to 20 minutes to complete.
This system was designed specifically to fit into primary care by providing screening, motivational feedback, and referral. The Relational Agent presented here is unique in that other computerized systems in the VHA provide only screening (i.e., eScreening, the My HealtheVet alcohol use screener), depend on trained interviewers (e.g., the Behavioral Health Lab (U.S. Department of Veterans Affairs, 2022), Mental Health Assistant (U.S. Department of Veterans Affairs, 2012)), or are part of a multi-session system meant to provide treatment (i.e., VetChange (Brief et al., 2013)). Many systems outside the VHA are not optimized for primary care and tend to target college students. For instance, My Student Body is a program that provided screening and feedback using only social norms (Chiauzzi et al., 2005); Talk to Frank (The Home Office, United Kingdom) is informational only. In addition, eCheckup to Go provided similar but much less extensive feedback to users (Moyer et al., 2004).
This report is part of a larger project to develop and implement an engaging and confidential method of delivering SBIRT to veterans in primary care. The goal of this study was to engage veterans in a novel user-centered design of a Relational Agent intervention for unhealthy alcohol use, optimizing the presentation, acceptability, and feasibility of the intervention and its ability to effectively engage end users. We aimed to solicit veterans' impressions during the development process and following use of the final intervention in a randomized clinical trial. This was part of a multi-part Hybrid I Effectiveness-Implementation study (Curran et al., 2012), beginning with veteran patient input and user-centered design, followed by implementation and a randomized clinical trial (as reported in Rubin et al., 2022; Zhou et al., 2017), and then a follow-up assessment among selected veterans who used the final implemented version of the Relational Agent intervention in the RCT, soliciting qualitative feedback to understand the results and further improve the intervention. The importance of effectiveness-implementation hybrid designs (Landes et al., 2020) is that critical information about successfully implementing the practice in the real world is gathered at the same time as effectiveness is tested in a classic clinical trial. This iterative user-centered design provides an important model demonstrating how to customize an intervention for the target population and the setting in which it will be implemented. Here, we report on the qualitative implementation data gathered.
The intervention's Development and Usability phases and the qualitative interview data collected among users in the trial (Evaluation phase) are presented here. Work in the development phase was designed to determine participants' preferences for the physical appearance and environmental setting of a Relational Agent and to understand participants' perceptions of the ease of use and perceived effectiveness of the Relational Agent intervention while still in development. The evaluation phase aimed to assess participants' overall perceptions of the implemented Relational Agent following completion of the clinical trial intervention (Rubin et al., 2022).

Design and Usability Phases
Recruitment, Eligibility, and Participants for Design and Usability Phases

Recruitment Participants were recruited from two primary care clinic locations within the VA Boston Healthcare System via posters displayed around the clinics describing the study. Clinic staff and physicians were encouraged to refer patients to the study, and two research assistants were available to speak with interested participants. All veterans who received services through the VA primary care clinic were eligible to participate. No level of alcohol use involvement was specified as an eligibility requirement during these Relational Agent development phases, as the goal was to develop a Relational Agent that could be used as a screening and (as needed) intervention and referral tool for all incoming primary care patients, regardless of their drinking status.
Design Phase Twenty participants agreed to participate in the semi-structured interviews. One participant was later excluded due to ineligibility, resulting in 19 participants included in the analysis. Participants were 79% (n = 15) male with a mean age of 56 ± 15.3 years; 68% (n = 13) were White. Most participants (n = 17, 89%) self-reported using a computer regularly, while two participants (11%) self-reported limited computer use (i.e., had only used a computer a few times).

Usability Phase
We invited the 19 participants who participated in the first interview to return for a second interview.However, only fifteen participants returned to complete the usability testing and second semi-structured interview.Participants were 80% (n = 14) male with a mean age of 57 ± 15 years; 73% (n = 12) were White.

Study Procedures
All study procedures and documents were approved by the VA Boston Healthcare System Institutional Review Board.

Design Phase
The Relational Agent was developed and rendered in the Unity3D game engine. The Relational Agent simulates human conversational behavior, speaking via a speech synthesizer synchronized with nonverbal behaviors generated using BEAT (Cassell et al., 2001), including hand gestures, gaze behavior, and facial displays. The Relational Agent's language is modeled in a custom scripting language that represents dialogue using hierarchical transition networks with template-based text generation, enabling real-time tailoring and personalization of the counseling dialogue for each patient. Users converse with the Relational Agent by selecting multiple-choice utterance options displayed on the screen and updated at each turn of the conversation. The use of fully constrained user input supports patient safety, since every response the Relational Agent can give is authored and validated in advance (Bickmore et al., 2018).
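To make this architecture concrete, the following is a minimal sketch of a transition-network dialogue with template-based text and fully constrained (multiple-choice) input. It is illustrative only: the node names, script content, and data structures are our assumptions, not the authors' actual scripting language.

```python
# Minimal sketch of a rule-based dialogue script: each node holds a text
# template (filled at runtime for tailoring) and a mapping from on-screen
# multiple-choice options to next node ids. Because every reachable agent
# utterance is authored in advance, each one can be validated before use.
from dataclasses import dataclass, field

@dataclass
class DialogueNode:
    template: str                       # agent utterance with {slots}
    options: dict = field(default_factory=dict)  # choice text -> next node id

    def render(self, context: dict) -> str:
        return self.template.format(**context)

SCRIPT = {
    "greet": DialogueNode(
        "Hello, {name}. Would you like to talk about your drinking?",
        {"Sure.": "screen_start", "Not today.": "goodbye"},
    ),
    "screen_start": DialogueNode(
        "In a typical week, how many days do you drink?",
        {"0-1": "feedback", "2-4": "feedback", "5-7": "feedback"},
    ),
    "feedback": DialogueNode("Thanks. Let's see how that compares to your peers.", {}),
    "goodbye": DialogueNode("No problem. See you next time.", {}),
}

def step(node_id: str, choice: str, context: dict) -> tuple:
    """Advance one turn: render the current node, then follow the chosen edge.
    Constrained input: only the listed choices can advance the dialogue."""
    node = SCRIPT[node_id]
    next_id = node.options.get(choice, node_id)
    return node.render(context), next_id

utterance, next_id = step("greet", "Sure.", {"name": "Alex"})
```

In a real system each node would also carry the synchronized nonverbal behavior cues; the point here is only the closed, validated transition structure.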
The content was developed using decision trees, based on the senior author's extensive experience developing systems that use various computer technologies to screen and intervene with people with substance use. A rule-based Relational Agent dialogue engine, as used here, allows the Relational Agent's responses to be designed and validated for every possible dialogue context and input. These decision trees, and the dialogue attached to them, were informed by and developed over several projects, in collaboration with colleagues who were part of the Motivational Interviewing Network of Trainers (motivationalinterviewing.org), and included insights from prior interactive voice response (Rubin, 2010; Rubin et al., 2006, 2007), web-based (Brief et al., 2013; Schreiner et al., 2021), and Relational Agent research (Bickmore et al., 2020; Rubin et al., 2015). In the end, this rule-based, closed, multiple-choice dialogue allowed for targeted and streamlined communication between the Relational Agent and the veterans, with verbal output from the Relational Agent and multiple-choice responses from the veteran, flowing from alcohol screening to brief intervention and, when indicated, referral to treatment.
Participants first viewed 8-10 sixty-second video clips on a touchscreen tablet computer that displayed different Relational Agent characters varying by hair, facial features, clothing, background setting, and synthesized voice (see Table 4 for examples). The Relational Agent dialogue was exactly the same in each clip, with the agent greeting the participant, inquiring how the participant was feeling, asking whether the participant wanted to discuss their alcohol use, and providing a demonstration of how the program worked through a pre-recorded video.
Following each video clip, a brief semi-structured interview was conducted by a research assistant trained and supervised by the senior author. The interview included questions about the Relational Agent's voice, setting, and appearance. Participants were asked to rate each Relational Agent feature on a scale from 0 ("Not at all comfortable") to 10 ("Extremely comfortable") based on their comfort level interacting with the Relational Agent. After viewing all the video clips, participants were asked to rank all Relational Agent characters in order of preference from most to least preferred. Participants were also queried about preferred titles for the Relational Agent (e.g., health advisor, health coach, counselor). At the end of the interview, participants were asked to suggest modifications they would make if they were designing the Relational Agent. Additional questions were asked about specific conversational items to gauge veterans' reactions; for instance, the Relational Agent engaged in social chat such as mentioning having friends who served and thanking participants for their service. Participants were offered $50 in compensation for their time.

Design Phase Qualitative Analysis Approach
All interviews were transcribed. Three research assistants performed a content analysis of the transcribed interviews, incorporating the principles of the immersion-crystallization method. This qualitative approach consists of repeated cycles of immersion in the collected data with subsequent emergence, after reflection, of an intuitive crystallization of the dominant themes (Borkan, 1999; Krueger, 1997; Malterud, 2001). The research assistants then met with two senior authors (AR and SRS) to discuss and agree on the final themes (see Table 1 for themes). Next, the interviewers coded the transcription of each interview separately and resolved any coding discrepancies until consensus was reached for all interviews. Qualitative analysis of design phase data was conducted using Microsoft Excel.
We examined the mode and median comfort level and preference ranking for each Relational Agent across all participants. Any character with very low comfort or preference ratings was ruled out. The character with the highest median score was selected, consistent with methods used in previous Relational Agent research to select final characters (Bickmore et al., 2005a, b; Bickmore et al., 2008; Bickmore et al., 2009; Bickmore et al., 2010a, b). The research team reviewed the quantitative and qualitative analyses to select a final Relational Agent character and modified her appearance and setting based on the qualitative themes.
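The selection rule described above can be sketched in a few lines: rule out any character with a very low rating, then pick the highest median. The ratings below are invented for illustration and are not the study's data.

```python
# Illustrative sketch of the character-selection rule: exclude characters
# with any very low comfort rating, then choose the highest median score.
# Character names and ratings are hypothetical, not study data.
from statistics import median

ratings = {
    "female_1": [7, 7, 8, 6, 7],
    "female_2": [6, 7, 6.5, 7, 6],
    "male_1":   [2, 4, 3, 5, 2],
}

MIN_ACCEPTABLE = 3  # rule out characters any rater found very uncomfortable
candidates = {c: r for c, r in ratings.items() if min(r) >= MIN_ACCEPTABLE}
best = max(candidates, key=lambda c: median(candidates[c]))
```

With these made-up numbers, "male_1" is ruled out by its floor rating and "female_1" wins on median, mirroring how a low floor plus median comparison can together drive the final choice.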

Usability Phase
After the Relational Agent character and setting were developed, the rest of the dialogue for the assessment, intervention, and referral features was developed and programmed (Zhou et al., 2017). Before the entire system was finalized, it underwent further qualitative evaluation. Participants from the design phase of the study were invited to return to test the Relational Agent's usability, and fifteen returned for the usability phase. These fifteen participants were split into two groups: group 1 consisted of 8 participants who completed only session 1 of the usability phase, and group 2 consisted of 7 participants who completed both sessions 1 and 2. Both sessions were 30 min in duration. In the design phase of the study, participants could only view videos of the Relational Agent; in the usability phase, they were able to interact with it. The Relational Agent was programmed to conduct an alcohol screening using an SBIRT approach, just as a patient would receive from their primary care provider. This included a brief intervention based on motivational interviewing strategies and asking for a commitment to change (Miller & Rollnick, 2012). Participants responded to the Relational Agent by choosing buttons displayed on the screen containing different possible answer phrases, updated at each turn of the dialogue.
Participants were asked to provide verbal feedback, using a think-aloud protocol (Fonteyn et al., 1993), on the dialogue and behavior of the Relational Agent while using the system. To encourage consistency of response regardless of individual experience, participants were asked to imagine that they were a movie character who consumed a high level of alcohol and to report that character's level of drinking to the program, regardless of their own alcohol use. The interviewer sat next to the participant, periodically prompted the participant for their thoughts, and took handwritten notes throughout the interview.
After completion of the think-aloud phase, a semi-structured interview was conducted. Participants were asked to rate their comfort level in speaking with the Relational Agent and were asked about their preferred name for the Relational Agent.
Examples of questions include the following:
• How did you feel about this computer advisor?
• What did you like or dislike about him or her?
• How comfortable would you be talking with this computer advisor about your alcohol use?
• Do you trust that the computer advisor is knowledgeable about alcohol use?
• Would you be open to talking with the computer advisor about your alcohol use?

Usability Phase Qualitative Analysis Approach
Interview transcripts were analyzed via the inductive content analysis approach described by Bradley et al. (2007) and Borkan's (1999) immersion/crystallization technique. The usability phase qualitative analysis team consisted of two research assistants trained and supervised by the senior author. To begin, three transcripts were randomly selected, read, and analyzed independently by members of the qualitative study team for initial emergent themes. The team then met to discuss the emergent themes until reaching consensus. Two further transcripts were then read independently by the analysts to confirm that no new themes emerged. After reviewing these two additional transcripts, the qualitative study team felt they had reached thematic saturation and that the codebook was complete (see Table 2). The analysts proceeded with independently coding all transcripts, including re-analyzing the transcripts previously reviewed during codebook development.
If new themes were identified during the coding process, the qualitative analysis team met again to discuss the new themes for consensus before continuing with the remainder of the coding. All transcripts were independently coded by both qualitative analysts, and all discrepancies were discussed until consensus was reached. Coding of all transcripts was conducted in Microsoft Excel.

Evaluation Phase

Recruitment and Eligibility Participants in the clinical trial who were randomized to use the Relational Agent and completed the intervention were recruited for the evaluation phase. Briefly, participants receiving primary care services through the VA who had received a positive AUDIT-C screening (Bush et al., 1998) (a score of at least three for women and four for men) within the last 3 months and who reported drinking above the National Institute on Alcohol Abuse and Alcoholism (NIAAA, 2022) guidelines within the past 30 days, as collected using the QDS (i.e., more than 3 drinks per day and/or more than 7 drinks per week for women; more than 4 drinks per day and/or more than 14 drinks per week for men), were eligible to participate in the trial and, thus, in the evaluation phase of this study. Participants were ineligible for the RCT if they had received treatment for substance use in the past 30 days.
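The eligibility rule described above combines a sex-specific AUDIT-C threshold with the NIAAA daily and weekly limits. As a sketch only, the thresholds below are taken from the text, while the function and parameter names are our own illustrative choices, not those of the study's screening software.

```python
# Hedged sketch of the trial eligibility rule as described in the text:
# positive AUDIT-C screen (>= 3 for women, >= 4 for men) AND drinking
# above NIAAA guidelines, with recent substance use treatment excluded.
def eligible(sex: str, audit_c: int, max_drinks_per_day: float,
             drinks_per_week: float, treated_past_30_days: bool) -> bool:
    if treated_past_30_days:
        return False  # treatment for substance use in past 30 days excluded
    if sex == "female":
        positive_screen = audit_c >= 3
        above_niaaa = max_drinks_per_day > 3 or drinks_per_week > 7
    else:
        positive_screen = audit_c >= 4
        above_niaaa = max_drinks_per_day > 4 or drinks_per_week > 14
    return positive_screen and above_niaaa
```

For example, a man with an AUDIT-C of 5 drinking 16 drinks per week would be eligible, while the same drinking pattern with an AUDIT-C of 3 would not, since both conditions must hold.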
Participants Twenty participants who had taken part in the Relational Agent clinical trial agreed to participate in a semi-structured interview about their experiences interacting with the Relational Agent during the trial. One participant was later removed due to audio data quality issues, resulting in nineteen participants being included in the analysis.
Participants were 79% male with a mean age of 53 ± 15.2 years; 63% were White. The median AUDIT-C score for participants was 5 (range: 3-9). Most participants self-reported using a computer regularly (n = 16, 84%) or being an expert computer user (n = 2, 11%). Only one participant (5%) reported having used a computer only a few times.

Study Procedures
Participants were interviewed within 1 month of completing the randomized clinical trial. No participants who were involved in the Relational Agent development and usability phases were included in the trial or the subsequent evaluation of the Relational Agent intervention. Interview questions were developed under the guidance of a research physician (SRS) and a research psychologist (AR) with expertise in qualitative analysis and were designed to solicit feedback on aspects of the intervention that participants found helpful, on ease of use, and on areas that provided minimal value to the participant.

[Rows displaced from Table 3 (coding categories):
• Comments participants made while imagining the program's user as "their younger self" or as "someone who drinks a lot": comments relating to what the participant believed would be the hypothetical reactions of these "made-up" personas throughout different parts of the program.
• Reactions to advice/comments about drinking that the advisor provides: participants' reactions to the advice and comments that the Relational Agent provides regarding the user's drinking habits.
• Reaction to reporting of consequences: participants' reactions to the computer advisor summarizing the consequences that the user reported during the AUDIT-C section.
• General reactions to pros and cons sections: participants' reactions to the pros and cons sections, specifically the options provided in each section and whether items should be added or subtracted.
• General reactions to commitment section: participants' reactions to the commitment options provided.
• "Thank you for your service": participants' opinions and reactions to having the Relational Agent thank them for their service.
• Addition of non-alcohol-related conversation?: participants' opinions on whether additional, non-alcohol-related conversation topics should be included in the program.
• How do you feel about the advisor/how comfortable are you?: participants' reports on how comfortable they are and how they feel about the Relational Agent, as well as suggestions regarding how to make the computer program more "approachable" or "comforting."]
Examples of questions include the following ("Laura" is the name of the Relational Agent):
• Did "Laura" make you feel comfortable or uncomfortable? Why?
• Did you feel that "Laura" pushed you too hard to change your drinking, or not hard enough? Why?
• Was there anything in particular that "Laura" asked or said that you thought was really useful or effective? Harmful or counter-productive?
Participants were given the opportunity to view a brief sample of the intervention to refresh their memory prior to the interview, which was then conducted by the senior author. Interviews were recorded with the permission of the participant and transcribed upon completion. Participants received $20 in compensation for their time.

Qualitative Analysis Approach
The evaluation phase qualitative analysis team consisted of a bachelor's-level and a master's-level analyst, who were also trained and supervised by the senior author. The inductive content analysis approach was conducted as described for the design and usability phases (Bradley et al., 2007), with regular meetings of the qualitative analysis team to ensure continued consensus throughout the coding process (see Table 3 for the coding categories used in the analysis). All coding in the evaluation phase was conducted using NVivo qualitative analysis software (NVivo, 2012).

Appearance
Participants commented on a wide range of aspects of the Relational Agents' appearance, including facial features, hair, and clothing. Nine participants reported a preference for a female character, seven reported no preference, and three preferred a male character. Participants stated they wanted the character to have natural hair, with five participants stating a preference for long hair (i.e., past shoulder length). Eight of 19 participants expressed a desire for the character to look professional (e.g., a nice shirt with a blazer) as opposed to a tee shirt; however, five participants stated they did not want the character to wear a doctor's white coat.

Setting
Nine participants expressed opinions on the settings of the characters. Most participants preferred a professional setting with an environment that felt warm and inviting. Four participants shared that they felt more comfortable when there was a lot of space in the room and when the room was open. Five participants suggested a background relating to military service (e.g., the Department of Veterans Affairs logo or the American flag). One participant preferred no windows in the background due to fear of a lack of confidentiality. Only two participants favored a casual or outdoor setting.

[Rows displaced from Table 2 (coding categories): General likes or dislikes — general comments about liking or disliking Laura or the program. Ease of use of operating computer or Relational Agent — ease or difficulty of use experienced while interacting with the program regarding the audio, visual, and any technical difficulties, including using the mouse and/or laptop touchpad.]

Animation
The majority of participants who provided answers described the characters' movements as robotic and mechanical. Five participants reported the characters' movements to be stiff and unnatural, noting that the characters' hand gestures often did not sync with their speech. Participants also commented on repetitive movements of the eyes and eyebrows that drew attention and felt distracting. Five participants reported that the characters' voices sounded too robotic and/or too monotone. Notably, participants preferred normally shaded characters to those with toon shading (i.e., a rendering method that makes the character look more cartoonish).

Human Trustworthiness
Participants responded to queries about (1) their comfort talking with the characters about alcohol use, (2) the characters' knowledgeability about alcohol use, and (3) whether the character seemed to belong in an outpatient setting. First, participants stated that they preferred a character more familiar with their situation, similar to a real doctor with access to the patient's health information. Second, participants reported feeling that the Relational Agent could have only limited knowledge. Lastly, participants felt the majority of the characters did fit into an outpatient clinic setting. Some participants were comfortable talking with the agent, and some stated they would do so if a doctor were not available.

Animation
Many participants reported negative reactions to the Relational Agent's animation, describing the graphics as "robotic" and "stiff." Participants also noted that the Relational Agent did not appear able to engage in human-like eye contact, with one participant reporting that they felt as though the agent was reading off a teleprompter behind them. Participants (7/15, 47%) reported that the agent's body language was too stiff and that her voice sounded "mechanical."

Comfort Level Interacting with Relational Agent
Most comments made by participants regarding their comfort levels with the Relational Agent were positive. Fourteen participants commented on the Relational Agent stating she had "friends in the military" and "liked working with veterans," with most reporting positive reactions (11/15, 73%). Many of the participants reported being comfortable with the computer character (10/15, 67%), and the mean comfort rating during this phase was 8.6 out of 10. Some participants (3/15, 20%) commented that the agent was judgmental and disingenuous in her interactions, but most (12/15, 80%) reported that they would feel very comfortable speaking with the Relational Agent about their alcohol use. Despite the overall comfort level, eight participants (53%) thought that the phrasing of the questions was slightly abrasive or ambiguous, possibly leading to some confusion. However, only three participants (20%) noted that they would prefer to speak with a human being rather than the Relational Agent.
As part of the program, the Relational Agent thanks the veteran for their service. In general, participants reported positive feelings about this statement (9/15, 60%). Two Vietnam-era veterans reported being especially happy to hear the phrase, given the lack of gratitude they received during their initial homecoming from their Vietnam deployment. Although the majority of participants commented positively on the statement, some (5/15, 33%) reported a negative reaction, describing the phrase as "shallow" and "not genuine."

Final Selection of Relational Agent
Overall scores were analyzed to select the final Relational Agent. Female 1 received the highest overall median score (7), and female 2 came in second (6.5), with female 2 scoring in the top two for both the average comfort and character preference categories (see Table 4). In addition, the lowest comfort score female 2 received was a four, while other characters scored as low as a two on this rating (see Table 4). Based on the overall ratings of comfort and preference, female 2 was chosen as the final Relational Agent character (see Fig. 1). Female 2 was subsequently named Laura for the evaluation phase (see overall responses on the characters' comfort and preference in Table 4).

Appearance and Animation
Participants overall reported favorable opinions of the appearance of the Relational Agent. Because participants were free to comment on any aspect, not every participant addressed each attribute; participants reported positive opinions of her overall attractiveness (6/19, 32%), professional clothing (10/19, 53%), and clear voice (5/19, 26%). One participant reported that the agent looked too professional and would have felt more comfortable interacting with her had she worn more casual clothing. Similar to feedback received during the development and usability phases of the study, the majority of participants (10/19, 53%) reported that the agent's movements felt slightly to moderately robotic. Comments included that the agent was "missing some body language and eye contact that would happen during a normal conversation [with a human]" and that the agent "moves like a robot but talks like a person." Only two (11%) participants reported that they would prefer to talk with a human being.

Setting
Nearly half of the participants (9/19, 47%) commented favorably on the professional office setting in which the Relational Agent was presented. Participants liked that the setting was similar to that of their providers. Two participants commented that having the Relational Agent standing in the office while they were sitting would decrease their overall comfort level.

Personality and Comfort Level
Nine participants commented that the Relational Agent's personality was professional, polite, and appropriate for the conversation. One participant noted, "she's very easy to talk to so she relates back and talks back in a way you'd expect a normal human being to." Some participants reported that the agent felt "programmed" (4/19, 21%) and lacked a dynamic personality (3/19, 16%). Despite some feelings of awkwardness when first interacting with the agent, participants were overwhelmingly comfortable with their overall interaction (17/19, 89%). Only four participants (21%) felt that the agent was judgmental when discussing personal drinking habits.

Change in Drinking Habits
Per self-report, the Relational Agent successfully helped ten participants (53%) reduce their drinking or stop drinking altogether. One participant commented, "[The Relational Agent] got me interested in [cutting] down the drinking. She helped me cut it down. She made me address the problem more openly." Another recounted, "It did change my outlook, my whole thought process on alcohol." Some participants felt that the agent seemed "pushy" at first when trying to change their drinking habits, but most found that, in the end, the agent's demeanor and approach to brief intervention were motivating. Interestingly, six participants (32%) thought about the agent while drinking or planning to drink, which changed their drinking behavior at least in that moment.

Discussion
In this study, we provide an outline for iterative and user-centered technology-based intervention development, which included providing various options, features, and designs for participants to choose from. Participant feedback was then used to refine and ultimately finalize the intervention for delivery of SBIRT within VA primary care clinics. In doing so, we helped ensure that the finished intervention incorporated the perspectives and preferences of our end users, who were co-creators of the finished intervention. We also provide data from participants who completed the randomized clinical trial (Rubin et al., 2022), who weighed in not only on the finished product but also on the perceived effectiveness of the intervention, having used it as designed and implemented within real-world primary care clinics. As a whole, this project illustrates how technology-based interventions can be built with key end users in mind, and in a manner that appeals to the preferences of the majority of target patients. We also provide insight into the pros and cons of automated intervention technologies, and Relational Agents in particular, for providing SBIRT relative to in-person providers. We found that opinions and preferences about the Relational Agent differed considerably, with participants offering preferences on everything from the physical features and movements of the Relational Agent to its setting and physical positioning (e.g., sitting vs. standing). Key themes emerged from participants' observations, including that the intervention appeared "mechanical" and that it was perceived as limited in providing a human connection. At the same time, the majority of participants interviewed from the RCT stated that they preferred the Relational Agent over in-person providers. Despite the observed limits of the technology, veterans provided generally positive ratings and felt comfortable discussing their drinking habits with Laura. Participants also believed that the intervention had a positive impact either on their drinking or, at least, on the way they think about drinking.
Participants overwhelmingly reported that the "stiff" movements and personality of the Relational Agent decreased its perceived authenticity. Although participants felt that the physical features of the Relational Agent were not human-like and were, at times, limiting, they preferred the Relational Agent without "toon-shading." This supports previous findings that although characters with toon-shading were felt to be friendlier overall, for conversations involving medical content, human-like shading was preferred and felt to be more appropriate (Ring et al., 2014). Prior research has also found that the highest levels of persuasion by a Relational Agent occur when the degree of animation is minimized (Parmar et al., 2022), aligning with the Relational Agent developed in this study. Participants' preference for a female Relational Agent follows the gender preference found in previous research (Esposito et al., 2021).
Importantly, we observed that participants' attitudes towards the Relational Agent improved over the course of the project as we worked to refine the Relational Agent to better meet patients' needs and preferences. At each stage, we incorporated participants' feedback into the changes being made (e.g., changing the setting or changing the level of animation of the character). The improved and overall positive feedback and reviews of the Relational Agent at the conclusion of the larger trial exemplify the value of including patients in the development of such health technologies, given that these participants were the ones interacting with the Relational Agent as part of the clinical trial intervention.
The high variability in participants' preferences during the development phases suggests that offering multiple characters may allow for increased reach to broader audiences. Making multiple Relational Agent characters available would allow patients to choose the character they are most comfortable disclosing information to and interacting with, ideally leading to increased rapport between the patient and "provider." Research has shown that a strong therapeutic alliance can be formed between virtual agents and patients, and that patients who perceive themselves to be more like the agent (e.g., in race and ethnicity) report higher therapeutic alliance (Bickmore et al., 2005a, b; Persky et al., 2013; Zhou et al., 2014) and greater trust in the Relational Agent.
Automated Relational Agent technology provides a promising means of bridging service gaps in alcohol screening and intervention without adding to provider or clinic burden. It is also possible that Relational Agent technology, while offloading tasks normally required of providers, could deliver these services with greater consistency and fidelity. For example, in the clinical trial results from this study, we found that rates of brief intervention and referral to treatment were substantially higher among patients assigned to the Relational Agent condition relative to patients receiving primary care services as usual, for whom annual alcohol screening and, when indicated, brief intervention and referral to treatment are considered best practice. The potential for improving SBIRT fidelity relative to treatment as usual, coupled with the favorability ratings for the Relational Agent reported herein, supports ongoing research and development of similar intervention technology in healthcare settings, where digital health applications promote more honest disclosure as a result of decreased stigma and embarrassment (Berry et al., 2018; Olafsson et al., 2018, 2020). Furthermore, fully constrained user input (i.e., multiple choice) has been demonstrated to work effectively in several prior Relational Agent-based interventions, including several with individuals of low health literacy and low computer literacy (Bickmore et al., 2010a, b, 2015; Zhou et al., 2014), and research has also demonstrated that patients are equally comfortable using the multiple-choice modality compared to unconstrained speech input (Murali et al., 2019).
Relational Agents provide unique affordances relative to other health education media. Unlike text-only chatbots or static web pages, Relational Agents rely only minimally on text comprehension. Nonverbal conversational behaviors, such as hand gestures that convey specific information through pointing or through shape or motion, provide redundant channels for conveying semantic content also communicated in speech. The use of multiple communication channels increases the likelihood of message comprehension, and Relational Agents can emphasize and enhance recall of critical information through nonverbal emphasis. Relational Agents provide a much more flexible and effective communication medium than taped content or even combined video segments. The use of synthetic speech makes it possible to tailor each utterance to the user (e.g., using their name, gender, age, and other personal information), to the context of a given conversation (e.g., what was just said, whether it is morning or evening), and to parameters that change over a series of conversations (e.g., gradually increasing self-efficacy). Importantly, we have shown that current commercially available conversational systems (e.g., Siri, Alexa) frequently make unsafe health recommendations (Bickmore et al., 2018).
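As an illustration of this kind of utterance tailoring, the minimal sketch below shows one way a dialogue system could vary a spoken greeting by user name, time of day, and visit history. The function name and templates are hypothetical examples and are not drawn from the actual intervention described in this study.

```python
# Hypothetical sketch of utterance tailoring for a Relational Agent.
# All names and wording are illustrative, not the intervention's own dialogue.
from datetime import datetime


def tailor_greeting(user_name: str, session_count: int, now: datetime) -> str:
    """Assemble a greeting tailored to the user, time of day, and visit history."""
    if now.hour < 12:
        time_of_day = "morning"
    elif now.hour < 18:
        time_of_day = "afternoon"
    else:
        time_of_day = "evening"
    # A first conversation and a return visit get different openers.
    if session_count == 0:
        return f"Good {time_of_day}, {user_name}. I'm glad you stopped by today."
    return f"Good {time_of_day}, {user_name}. Welcome back."


print(tailor_greeting("Alex", 0, datetime(2024, 5, 1, 9, 30)))
# → Good morning, Alex. I'm glad you stopped by today.
```

The same template-selection approach could, in principle, be extended to the other parameters mentioned above, such as adjusting content across a series of conversations.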
The Relational Agent created for this study was made available through a VA laptop computer. This platform might be feasible in some but not all clinics, and the technology can be adapted for delivery on personal and handheld devices such as smartphones and tablets. With these flexible options, SBIRT delivery could take place in clinic waiting rooms, during or just prior to virtual care visits, or, via the web, as part of routine check-ups with existing patients separate from a scheduled office visit (i.e., for intermittent screening and prevention between appointments).

Limitations
A relative strength of the study is that we assessed a variety of domains and diverse aspects of the Relational Agent throughout the development phase, using both qualitative and self-report scale questions. Our phased approach to the design and usability of the Relational Agent also allowed for iterative refinement of the intervention before formally testing it in the trial. We also benefitted from the perspective of veterans drinking at unhealthy levels who participated in the trial, which provided critical feedback about the intervention from target end users. For the development of the Relational Agent, we included veterans regardless of their drinking, to create an intervention that could be used to screen all veterans coming into primary care clinics. One potential limitation of this approach is that we did not target veterans drinking above recommended limits to participate during the intervention creation phase. In addition, the limited patient sample consisted mainly of White males, reflecting much of the patient population of the VA health care system where the research was conducted. Future research might identify and incorporate feedback from a more diverse patient sample, including individuals drinking above recommended limits, to ensure that the dialogue and functionality of the intervention are satisfactory to these veterans.

Conclusions
End-user involvement in the development of digital health tools and Relational Agents allows for customization of the tool for the population and setting it is intended to serve. Through soliciting the feedback and perspectives of veterans in this study, we successfully developed a Relational Agent that met the needs of veterans while optimizing participants' comfort interacting with it. Continued development is necessary to overcome some characteristics of the Relational Agent (e.g., mechanical body movements and tone of voice) that may prevent patients from fully engaging with the agent. Despite the continued opportunity for refinement, participants overall found the agent to be trustworthy and effective at prompting them to think about their unhealthy drinking habits.

Table 4
Participants' overall rating and self-rated comfort level for the Relational Agents evaluated
a Participants ranked the Relational Agents in order of preference (1 = most preferred to 8 = least preferred)
b Participants used a ruler-based rating (0 = not at all comfortable to 10 = extremely comfortable) to rate their comfort level talking with the Relational Agent about alcohol use

Table 1
Design phase: description of coding categories

Recruitment and Participants for Evaluation of Developed Relational Agent
Following the completion of the design and usability phases, we conducted the larger clinical trial (reported elsewhere; see Rubin et al., 2022 for primary outcomes).

Table 2
Usability phase: description of coding categories
General/non-specific reactions referring to all aspects of program: General reactions to the program and reactions at the beginning of the program relating to all aspects of the program (i.e., technical and non-technical aspects).
Branch of military questions: Participants' reactions to the Relational Agent asking the user which branch of the military they served in.

Table 3
Evaluation phase: description of coding categories

Author Contributions Julianne Brady: formal analysis, writing-original draft preparation; Nicholas Livingston: methodology, writing-original draft preparation; Molly Sawdy: data curation, writing-original draft preparation; Kate Yeksigian: data curation, formal analysis, writing-original draft preparation; Shou Zhou: conceptualization, methodology, writing-original draft preparation; Timothy Bickmore: conceptualization, methodology, writing-original draft preparation; Steven Simon: funding acquisition, conceptualization, methodology; Amy Rubin: funding acquisition, conceptualization, methodology, formal analysis, writing-original draft preparation

Funding This research was supported by the Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development (HSR&D) Services (IIR11-3346; PI Simon).