Early guiding principles of applied behavior analysis (ABA) included a focus on meaningful and valuable procedures and behavior change. In particular, the work of Baer et al. (1968) represented a shift in the field from a focus on behavior reduction to the applied analysis of behavior, and the authors emphasized that applied research is “constrained to examining behaviors which are socially important, rather than convenient for study” (Baer et al., 1968, p. 92). In 1978, Wolf extended the applied emphasis of ABA by highlighting that “society would need to validate our work” (Wolf, 1978, p. 207) on at least three levels: treatment goals, procedures, and outcomes. Wolf emphasized that our science is tasked with making changes of social importance and that if the scientific models were not aligned with the values of the clientele, then our science was not truly applied.

Given this focus on improving socially significant behaviors, it is important to identify ways in which behavior analysts can maximally support clients and effect meaningful change. Most individuals who hold Behavior Analyst Certification Board (BACB) certification credentials work with individuals with autism spectrum disorders (BACB, 2020). The National Autism Center recommends that families be involved in children’s treatment plans (National Autism Center, 2009, 2015). Board certified behavior analysts (BCBAs) adhere to the Ethics Code for Behavior Analysts (hereafter “the Code”), a framework of ethics standards including items related to involving stakeholders in intervention (BACB, 2020). As dictated by the Code, behavior analysts must “involve clients and relevant stakeholders throughout the service relationship, including selecting goals, selecting and designing assessments and behavior-change interventions, and conducting continual progress monitoring” (BACB, 2020, p. 11, 2.09) and “use understandable language in, and ensure comprehension of, all communications with clients, stakeholders, supervisees, trainees, and research participants” (BACB, 2020, p. 11, 2.08). The Code also outlines four core principles that serve as the overarching framework for the ethics standards. These principles are intended to help behavior analysts interpret and apply the standards in the Code. The four core principles are that behavior analysts should: (1) benefit others; (2) treat others with compassion, dignity, and respect; (3) behave with integrity; and (4) ensure their competence. The introduction of a core principle that emphasizes compassion is indicative of the shift in the field toward prioritizing the inclusion of these skills in client care. The Code items above are a starting point for practitioners, yet offer only a minimal foundation for effective compassionate practice. To promote the involvement of the client in treatment, behavior analysts must begin to operationalize ways to interact effectively with families. Positive family–practitioner interactions and family-centered care are more likely to lead to empowered families (Horst et al., 2000) and therefore more positive experiences with ABA overall.

Recent clinical trends in ABA have emphasized compassionate care (e.g., LeBlanc et al., 2019; Taylor et al., 2019) and related repertoires such as relationship-building skills (Canon & Gould, 2021), behavioral artistry (Callahan et al., 2019), and cultural humility (Beaulieu et al., 2018; Conners et al., 2019; Fong et al., 2016; Fong & Tanaka, 2013). Researchers have found that, compared to other fields, ABA practitioner training devotes little attention to compassionate care skills (LeBlanc et al., 2019). These trends have illuminated both the importance of these interpersonal skills and the need to improve our training approaches as a field. Our ethical obligation to represent the field of behavior analysis with integrity and compassion, recently articulated within the Code, furthers the recent calls to action to promote humanity and compassion in practice (BACB, 2020; Callahan et al., 2019; Canon & Gould, 2021; LeBlanc et al., 2019; Rohrer et al., 2021; Taylor et al., 2019).

In allied disciplines such as health care, researchers have investigated the relationship between empathic practitioners and therapeutic outcomes. Other fields have framed these skills as therapeutic listening, relationship-building, rapport development, and engendering a therapeutic alliance. In all of these cases, component skills include behaviors that demonstrate understanding and aim to alleviate distress (e.g., Canon & Gould, 2021; Kerns et al., 2018; Leach, 2005). Preliminary findings indicate that compassionate care and empathic responding are associated with better outcomes, including improved quality of care, improved patient satisfaction, enhanced adherence, decreased hospitalizations, and better overall physical health (Allen & Warzak, 2000; Beach et al., 2006; Beck et al., 2002; Bonvicini et al., 2009; Hojat et al., 2011; Horst et al., 2000). Relatedly, therapists who display the soft skills characteristic of “behavioral artists” are rated more highly by parents (Callahan et al., 2019; Foxx, 1985). Canon and Gould (2021) successfully used verbal instructions, clicker training, and role-play to improve therapeutic relationship skills (e.g., mindful reflecting, appreciating, and asking questions) with two employees at an ABA agency. For our purposes, we define compassionate care as a set of behaviors that both demonstrate empathy and aim to alleviate the suffering of others (Taylor et al., 2019).

The purpose of this study was to expand upon the emerging research in this area (e.g., Canon & Gould, 2021) and extend the focus of training in behavior analysis to component skills of compassionate care. In particular, we sought to teach students of behavior analysis to engage in compassionate care responses via remote practice in analog interviews using a novel application of behavioral skills training (BST).

Method

Participants

Four master’s students in an online behavior analysis program participated in the study. Students were recruited from two master’s-level courses at the university where the first author worked as an adjunct professor; IRB approval was obtained prior to recruitment. Students were in their second year of the program and were eligible for participation if they had not yet completed the program’s courses in ethics or collaboration. This criterion was included to avoid students having previous exposure to coursework involving collaboration with families or related topics. Students from the courses were offered extra credit in their class, or the option of substituting participation in the study for a paper on a related topic. To participate in the study, prospective participants needed to score lower than 70% engagement on at least two of the skills in each skill set. Of six total prospective participants, two scored above this cutoff and were therefore not included in the study; the remaining four scored below the cutoff criterion and served as the participants.

The participants were three female students and one male student. All participants were 24 or 25 years old, and all four held bachelor’s degrees in either psychology or education. Additional information regarding participant experience can be found in Table 1.

Table 1 Participant demographics

Setting and Materials

All sessions were conducted online via recorded video conferencing sessions (using the Zoom platform). The primary impetus for the use of telehealth was the onset of the COVID-19 pandemic, during which training could not be conducted in person. In addition to the technology materials (a laptop or other device, Wi-Fi/broadband access), materials included written descriptions of and rationales for each skill trained, which the experimenter provided by sharing her screen. Each participant was asked to have a pencil and paper available to take notes on the skills as they were trained. The experimenter also provided (via screen sharing) a brief overview of the child (name, age, diagnosis, location of services) for each role-play scenario. Each participant was given a Qualtrics link through which they accessed the surveys administered before and after training.

Dependent Variable

The primary dependent variable was the participants’ performance as measured on a checklist of skills consistent with compassionate care approaches. These skills were measured in the context of a role-played interview in which the participant served as the interviewer. A total of 11 skills were measured and trained, divided into three skill sets (i.e., basic interviewing skills, interest in family, joining with family). Skill set 1 (basic interviewing skills) included four skills: (1) tell the caregiver you are taking notes; (2) nodding; (3) backchannel; and (4) positive introduction. Backchannel (e.g., saying “uh huh” or “yeah”) is described by Heinz (2003) as a vocal response that indicates that a conversational partner is listening and wants the communication to continue. Head nodding and backchannel often co-occur, and both indicate attentive listening. Skill set 2 (interest in family) included four skills: (1) acknowledge the child’s abilities/efforts; (2) ask about the child's interests; (3) ask about parent priorities; and (4) reflect and incorporate parent priorities. Skill set 3 (joining with family) included three skills: (1) making an empathy statement; (2) normalizing; and (3) partnering. Normalizing is a reflective strategy used within the cognitive-behavioral therapy literature to reflect that other people may have similar experiences (Bennett-Levy et al., 2009). Partnering (i.e., making collaborative statements) indicates that the clinician will work with the family, which has been described as an important component of service delivery (Osher & Osher, 2002; Rohrer et al., 2021). Each skill was taught separately, with the exception of nodding and backchannel, which were trained together (i.e., the video model included both skills). With the exception of nodding and backchannel, all skills were recorded as either present or absent during the entire interview using partial interval recording (i.e., the targeted skill occurred at least once during the entirety of the interview or did not occur at all). Nodding and backchannel were individually scored using 60-s partial interval recording (i.e., the targeted skill occurred at least once during the 60-s interval or did not occur at all).
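To make the two recording schemes concrete, the minimal sketch below scores a hypothetical interview; the skill names and interval data are fabricated for illustration, and it assumes (consistent with the results reported later) that engagement is summarized as the percentage of scored intervals, or skills, with at least one occurrence.

```python
# Illustrative scoring of one role-played interview (fabricated data).

# Whole-interview partial interval recording: one yes/no per skill,
# scored as occurring if the skill appeared at least once in the interview.
whole_interview_skills = {
    "positive_introduction": True,
    "tell_caregiver_taking_notes": False,
    "ask_about_child_interests": True,
}

# Nodding and backchannel: 60-s partial intervals across a 6-min interview,
# True if the skill occurred at least once within that interval.
nodding_intervals = [True, True, False, True, True, True]

def percent_engagement(records):
    """Percentage of yes/no records scored as an occurrence."""
    records = list(records)
    return 100 * sum(records) / len(records)

print(percent_engagement(nodding_intervals))                # 83.3...
print(percent_engagement(whole_interview_skills.values()))  # 66.6...
```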

Three different sources were used to identify socially valid and well-defined skills to teach. First, a caregiver survey was conducted with parents of individuals with special needs to gain information about their preferences regarding interactions with behavior analysts. Second, expert role-plays were conducted with experienced clinicians who had significant experience in ABA as well as psychology. In particular, experts role-played conducting an intake interview with a caregiver. Third, the Compassionate Collaboration Tool (Rohrer et al., 2021) was reviewed, and skills deemed relevant to a virtual platform were selected. All pre-study procedures are described below.

Ancillary Measure of Empathy

A secondary measure of empathy, the Jefferson Scale of Physician Empathy-Health Professions Version (JSPE-HP), was included in this study. The JSPE-HP, the most commonly used measure of practitioner empathy reported in the medical literature, assesses empathy as a cognitive attribute involving an understanding of the patient’s experiences, concerns, and perspectives (e.g., seems concerned about me and my family, asks about what is happening in my daily life, is an understanding doctor; Hojat et al., 2011; Kane et al., 2007). This measure has been used for many years in a number of studies with physicians, and the data provide some goal ranges for high empathy levels. The measure is intriguing for behavior analysis because it represents many years of nuanced work in assessing and measuring this elusive, yet vital, characteristic of professional treatment providers. Although there are no universally recognized levels of empathy associated with the instrument, it does provide an index of empathy, and may eventually be able to provide a relative ranking of empathy and/or serve as a measure of change in empathy. Participants completed the JSPE-HP prior to receiving training, and again post-training. The JSPE is widely used, has demonstrated high reliability (Fields et al., 2004), and continues to accumulate validity evidence (Hojat & Gonnella, 2015). Normative data on the JSPE-HP are extensive and generally show high median levels of empathy as well as wide ranges of empathy in physicians. The highest possible score is 140. Large samples of typical medical students have indicated a median of 115 (across 11 years of data) and a range from 52 to 140. The current guideline for a cutoff indicative of low empathy is 100 (Hojat & Gonnella, 2015).
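As a rough illustration of how a JSPE-HP total is computed under the parameters reported above (20 items rated 1–7, a 140-point maximum, and 100 as the low-empathy cutoff), consider the sketch below. The published instrument reverse-scores its negatively worded items; the reversed-item indices here are placeholders rather than the instrument’s actual scoring key.

```python
# Hypothetical JSPE-HP scoring sketch; the reverse-scored item indices
# are illustrative placeholders, not the instrument's actual key.

REVERSED_ITEMS = {0, 3, 5}

def jspe_total(ratings):
    """Sum twenty 1-7 Likert ratings, reverse-scoring flagged items."""
    assert len(ratings) == 20 and all(1 <= r <= 7 for r in ratings)
    return sum(8 - r if i in REVERSED_ITEMS else r
               for i, r in enumerate(ratings))

score = jspe_total([6] * 20)  # a respondent who rates every item 6
print(score, "below cutoff" if score < 100 else "at or above cutoff")
# -> 108 at or above cutoff (the maximum possible total is 140)
```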

The JSPE-HP was included as an exploratory measure, providing the first data on levels of empathy among behavior analytic students on this instrument, as well as the first use of the measure as a pre- and post-assessment of the impact of empathy training for this group. It also assists in bridging the gap between behavior analysis and other fields by using a common measure from the field of medicine and by pairing survey data with a direct instructional approach.

Pre-Study Procedures

Three sources of information were used to inform the targets of intervention: the Compassionate Collaboration Tool, a caregiver survey, and expert role-plays. These sources are described below.

Compassionate Collaboration Tool

As mentioned above, the Compassionate Collaboration Tool (Rohrer et al., 2021) is a 25-item checklist that was developed based on literature from health care and related human service fields. The Compassionate Collaboration Tool synthesized findings and recommendations from seven published resources on culturally sensitive, empathic, and compassionate engagement with families. For the purpose of informing the selection of skills taught, the Compassionate Collaboration Tool was reviewed for themes relevant to the context of a remotely conducted intake interview (the context of the training). Items incorporated from the tool were those consistent with behaviors observed in the expert role-plays. For example, the experts all actively solicited input from the caregiver on their priorities and reflected and incorporated these priorities. This overlapped with items from the Compassionate Collaboration Tool including: (1) incorporated family/individual client input when identifying objectives/instructional targets or procedures; and (2) actively solicited input from family about preferences/priorities for targets (“What is important to you to teach?”). Experts also engaged in active/attentive listening (nonverbal and paralanguage skills, “mmhmm,” nodding, mirroring facial expressions, appropriate body language).

Pre-Study Caregiver Survey

Prior to the start of the study, a survey of 67 caregivers and parents of children with autism and related disorders was used to identify specific interpersonal behaviors valued by caregivers. Caregivers were asked to rate 20 statements about interpersonal interactions with a behavior analyst using a 7-point Likert-type scale. Statements included both nonvocal responses (e.g., “It is important to me that the behavior analyst is approachable [friendly expression, open body language]”) as well as content-related statements (e.g., “It is important to me that the behavior analyst asks about my family's preferences/priorities regarding treatment goals”). Of the 20 statements in the survey, the 5 most highly rated were the following: “It is important to me that the behavior analyst asks about what my child enjoys” (99%); “It is important to me that the behavior analyst is approachable (friendly expression, open body language)” (95%); “It is important to me that the behavior analyst is not distracted when meeting with me” (95%); “It is important to me that the behavior analyst talks about the rationale for selecting skills” (93%); and “It is important to me that the behavior analyst asks about preferences and priorities” (93%). Statements with lower endorsement included, “It is important to me that the behavior analyst makes small talk” (21%); and “It is important to me that the behavior analyst avoids using technical language” (36%). Statements that were highly rated (i.e., above 90%) were incorporated into the set of skills to teach to participants. Selected targets were determined by combining the results of the caregiver survey with the expert role-plays conducted with experienced professionals.
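The survey section above reports item-level percentages without stating how they were derived; a plausible reading, sketched below with fabricated ratings, is that a statement counts as endorsed when a caregiver selects one of the agreement ratings (here assumed to be 5–7 on the 7-point scale).

```python
# Fabricated ratings for one survey statement from ten hypothetical
# caregivers; the 5-7 "agreement" threshold is an assumption.
ratings = [7, 6, 5, 7, 4, 7, 6, 2, 7, 5]

endorsement = 100 * sum(r >= 5 for r in ratings) / len(ratings)
print(f"{endorsement:.0f}% endorsement")  # 80% for this fabricated sample
```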

Expert Role-plays

Prior to the initiation of baseline, expert role-plays were conducted with four experienced clinicians. The clinicians selected as experts were doctoral-level BCBAs who each had over 10 years of experience in the field. In addition, the experts were selected because they also had doctoral degrees in psychology and specific training in clinical interviewing techniques and rapport-building. The purpose of these role-plays was threefold. First, the interviews served to provide information on an appropriate frequency or topography of engagement in targeted skills (e.g., nodding and backchannel such as “mmhmm,” “okay”). Second, the role-played interviews provided an opportunity to observe the integration of compassionate care responses in an interview format with skilled professionals. Third, the expert role-plays served as an additional resource for identifying target skills to teach. The clinicians were observed engaging in many of the same behaviors as one another, many of which overlapped with the responses identified as important in the caregiver survey as well as the Compassionate Collaboration Tool. For example, all clinicians made empathy statements, engaged in high levels of nodding and backchannel (active listening), asked about the child’s interests, and reflected and incorporated parent priorities. The expert clinicians also made positive statements about the child or acknowledged their efforts, and acknowledged that certain behaviors the parent reported were common (i.e., normalizing). During the expert role-plays, scripts were not followed; the practitioners simply responded naturally to the caregiver’s statements. At the conclusion of each role-play, the video sample was viewed and scored by the primary researcher, who then created operational definitions of the skills observed and solicited interobserver agreement (IOA) on the categories and definitions from the second author. The clinicians conducting the expert role-plays engaged in 10 of the 11 skills selected for training. The one skill they did not engage in was “Tell the caregiver you are taking notes,” which was added to the training targets to address responses related to eye contact and active listening, given that the trainings occurred virtually. The responses that the experts engaged in were cross-referenced with the results from the caregiver survey and the Compassionate Collaboration Tool (described above), and most responses (7 of 10) were found to overlap with the caregiver survey, the Compassionate Collaboration Tool, or both. Although three skills did not overlap (i.e., acknowledging abilities or efforts, normalizing, partnering), these skills were incorporated into training targets because they added a positive quality to the interview and were consistent across the expert role-plays.

Design

A multiple-baseline design across skill sets (Hersen & Barlow, 1976) was used to measure the effects of the training procedure on the participants’ performance of the 11 identified responses consistent with compassionate approaches. The design consisted of three phases (i.e., baseline, post-BST training, and maintenance/generalization). Probes for maintenance of the skills were conducted 2 weeks following the final training session. Generalization of engagement in compassionate skills was measured by evaluating performance on the checklist with a different role-play scenario and then with a different experimenter role-playing the caregiver.

Interobserver Agreement and Treatment Fidelity

Interobserver agreement (IOA) measures were collected for 33% of participants’ sessions across conditions, including generalization and maintenance. In particular, 33% of baseline sessions and 33% of post-training sessions were scored, and the samples represented all participants. IOA was obtained by having an independent observer watch the recorded session and record whether the participant engaged in the skills during the session. The independent observer was provided with a list of the trained skills, operational definitions, and examples and nonexamples of each skill, and scored the extent to which participants engaged in the target skills. Because each skill was recorded using partial interval data collection, each interval was evaluated for agreement or disagreement. Most skills were scored with the entire interview serving as the interval; two skills (i.e., nodding and backchannel) were scored using 1-min partial intervals. IOA was calculated by dividing the number of agreements by the number of agreements plus disagreements and multiplying by 100. An agreement was defined as both observers scoring an occurrence (or nonoccurrence) of the participant’s response for a given interval. Across baseline and post-training (including maintenance and generalization) probes for all participants, IOA averaged 92% (range: 79%–100%).
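The IOA formula described above reduces to a simple interval-by-interval comparison; the sketch below applies it to two hypothetical observers’ records for one skill (the data are fabricated for illustration).

```python
# Interval-by-interval IOA: agreements / (agreements + disagreements) * 100.
# Each list holds one observer's 60-s partial interval records (fabricated).
observer_1 = [True, True, False, True, False, True]
observer_2 = [True, False, False, True, False, True]

def interval_ioa(a, b):
    """Percent agreement: both observers scored the same occurrence or
    nonoccurrence for a given interval."""
    assert len(a) == len(b)
    agreements = sum(x == y for x, y in zip(a, b))
    return 100 * agreements / len(a)

print(f"IOA = {interval_ioa(observer_1, observer_2):.1f}%")  # 83.3%
```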

Procedural fidelity data were also collected for the experimenter during baseline and BST training, scored as engagement (yes) or nonengagement (no) with each step listed on a checklist. An independent observer watched the recorded sessions and marked whether the experimenter engaged in each step both during training (e.g., reviews the written steps of the skills with the trainee and instructs them to take notes; verbally instructs the trainee on the steps of the trained skill that are outlined in the written steps) and during baseline sessions (e.g., states what the activity will be without revealing information about the purpose of the training; refrains from providing specific feedback about performance). Procedural fidelity data were collected for 33% of baseline and training sessions and averaged 97% (range: 88%–100%) across sessions.

Procedure

General Procedures

During baseline sessions, participants signed on to a Zoom meeting using a link sent by the primary experimenter. The experimenter obtained verbal consent to record the session at the start of the meeting. Participants were not given any information beforehand regarding the purpose of the project.

Jefferson Scale of Physician Empathy

Prior to beginning baseline sessions, participants completed the JSPE-HP via a Qualtrics link that the experimenter provided in the chat box. The survey was titled simply “JSPE” so that its topic was not apparent to the participant. Participants responded to each item on a 7-point Likert-type scale ranging from “Strongly Agree” to “Strongly Disagree.” The experimenter explained that the measure was designed for use in the medical field and that the participant should therefore substitute “clinician” or “behavioral therapist/analyst” for any language referring to a “physician” or “surgeon.” Likewise, “medical or surgical treatment” was to be read as “behavioral treatment,” and “patient” as “client.” The experimenter also pasted a reminder of these “translations” in the chat box for the participant to reference while completing the survey. Following the post-training probes (including generalization and maintenance probes), the participants were sent a link to the JSPE-HP via email and asked to complete it again; this served as the post-training measure.

Baseline Sessions

After the participant indicated that they had completed the JSPE-HP survey (approximately 4–5 min), the experimenter read the following script:

Today we are going to do a brief interview activity where you will be the clinician and interview a caregiver (me) about my child. I won’t be able to answer any questions about the interview itself but we will chat after about the interview and skills we are looking at. You will treat this as an intake interview and identify the top three things to focus on in treatment. Don’t worry too much about exactly what you say, just think of this as an opportunity to get to know the child and family.

The participants were not given directives on how to arrange their screen (e.g., gallery view, speaker view). Upon completion of the third (and final) baseline interview, the experimenter thanked the participant, discussed the purpose of the study, and, provided baseline responding was stable, moved into training.

The experimenter then shared her screen and showed the participant a slide with the caregiver’s name, the child’s name, diagnosis, and location of services (home or clinic). On the slide the following statement was written: “This is an intake interview. Get information about the child’s present needs and determine the top three things to focus on in treatment.” If the participant asked questions about what to do, they were directed to do their best and to get information about the child and determine the things to focus on in treatment. If the participant asked specific clarifying questions that could be answered without revealing anything about the role-play or purpose of the study, they were answered. For example, if the participant asked, “Are you the parent?” the experimenter replied, “Yes.” If they asked, “Do I ask about challenging behaviors or skills?” the experimenter replied, “Just do your best, and get information about the child and the top three things to focus on in treatment.” Once the participant began, the experimenter role-played the caregiver, responding to questions about the child, the family, and priorities. The slide remained visible on the screen throughout the interview. The role-play concluded when the participant indicated that they had enough information or said goodbye to the caregiver. The duration of each interview ranged from 4 min to 11 min, 44 s across participants. All three baseline probes were conducted on the same day.

Training Sessions

After three baseline sessions, provided responding was stable, the first skill set (i.e., basic interviewing skills) was trained. Three post-training probes were then conducted, provided the participant maintained 80% or higher skill engagement. After three post-training probes, the second skill set (i.e., interest in family) was trained, followed by three more post-training probes and then training on the final skill set (i.e., joining with family). The skill sets were trained in this order for each participant because they built on one another with respect to complexity. During the initial training session (i.e., skill set 1), the experimenter described what the training would focus on and verbally provided the following script:

Today we will learn about certain compassionate responses when interacting with caregivers. I have selected some skills that we will focus on, and you will receive training on these skills. You will also write down the skills we discuss, so that you can refer back to them while we work together. Please feel free to reference these notes as frequently as you need to! The reason we are focusing on these skills is that a lot of research shows that when people have better relationships with their therapists, the overall outcomes are improved. This can be achieved through showing empathy and compassion, and can help caregivers and families feel that we have a better understanding of who their child is as a person, as well as what their family needs. When this happens, caregivers might implement procedures better, and there may be an overall better relationship between the family and you as the practitioner. When we are engaging via telehealth, there are some skills that are even more important because we are not seeing the parent in person and so we need to be more deliberate (for example, nodding or paralanguage skills like mmhmm). First, I will outline the skills that we will learn. Then, we will watch a model of someone engaging in the skills. I might pause the video so I can point out each skill. Then, we will practice the skills in role-play, with you playing the clinician and me playing the caregiver. After, I will give you feedback on how you did with the skills and we’ll keep practicing as needed!

The experimenter then shared a screen with a list of the first four skills to be trained. The experimenter read through each skill, describing what the skill looks/sounds like as well as the rationale for engaging in it. This served as the “instructions” component of BST. The experimenter then played a video for each skill, using the shared-screen feature, and watched the video along with the participant. The models in the videos were the expert clinicians from the expert role-plays, whose samples had been used to identify components of compassionate care, as explained above in pre-study procedures. Video clips ranged in duration from 42 s (i.e., for the skill positive introduction) to 3 min (i.e., for the skill empathy statement) and provided clear and concise demonstrations of each target skill. This served as the “modeling” component of BST. The experimenter provided an opportunity for participants to ask questions after each video, but none of the participants asked any questions. Following the videos, the experimenter shared the “caregiver slide” from baseline and instructed the participant to role-play again, this time incorporating the skills that had been reviewed. This served as the “rehearsal” component of BST. The experimenter did not follow a script, but used notes related to the pre-developed caregiver/scenario profile to respond during the interview. For example, the scenario profile notes may have stated that the child engages in repetitive vocalizations that get louder when he is upset, so the experimenter spoke about this when asked by the participant during the interview. The scenario profiles remained the same from baseline through training, and there were three caregiver profiles, which were rotated across sessions.

For subsequent training sessions (i.e., skill sets 2 and 3), the experimenter did not provide the full script above. Instead, the experimenter reminded the participant of the skills they had been taught previously (by showing the list of skills on a document using the screen-sharing feature) and then moved on to the next set of skills. For example, the experimenter said, “Last time we worked on the skills listed here (positive introduction, nodding, backchannel, and letting the parent know you are taking notes). This time, we’ll work on four more skills that are listed here.” The experimenter then reviewed each skill by reading it aloud to the participant, providing a rationale, and giving one or two examples of what the participant might say during the interview. The list of all 11 skills shown to the participant can be found in Table 2.

Following the rehearsal, the experimenter provided praise if the participant engaged in the trained skills, and corrective feedback if the participant did not engage in one or more of the skills. The participant was then asked to practice the component that was missing or incorrect until they engaged in the skills at 100% of intervals. This served as the “feedback” component of BST. Each skill set was trained within a single day (e.g., skill set 1 was trained within a day, skill set 2 on a different day). The length of training sessions (including all interviews within the condition) ranged from 14 to 35 min, with a mean of 24 min. The intervals between training sessions were not standardized due to participant availability, but probes were always conducted immediately after training, and intersession intervals never exceeded 1 week.

Table 2 List of skills trained

Post-Training Probes

Following BST on each skill set, three post-training probes were conducted. Each caregiver scenario was probed in these sessions; that is, the experimenter role-played a different caregiver in each probe, so that each of the three “caregivers” served as the interviewee. Prior to each probe, the same visual information was presented to the participant as in baseline. In particular, the caregiver slide was shared on the screen, and the pertinent information was reviewed (e.g., “I will be playing Susana, the parent. The child’s name is Sam, there’s some information about him listed here. You will treat this like an intake interview and gather information on the child’s present needs and determine the top three things to focus on in treatment. Whenever you’re ready you can go ahead!”). As in baseline, the session ended when the participant indicated that they were finished or said goodbye to the caregiver. A transcribed sample of a post-training probe, representing the first probe following training on the final skill set (i.e., skill set 3), can be found in Appendix 1.

Maintenance Probes

Two maintenance probes were conducted for each participant and were identical to post-training probes, except that they were conducted 2 weeks following the final post-training probe. The caregiver scenarios used for the maintenance probes were familiar ones (i.e., Susana, Frances). The experimenter introduced the scenarios in the same way as in post-training probe sessions. That is, the experimenter shared her screen to display the caregiver/child information and told the participant, “This is some information about the child. This is an intake interview; you will get information about the child’s present needs and determine the top three things to focus on in treatment. Whenever you’re ready, you can go ahead.” As in all other sessions, the role-plays were conducted via the Zoom video conferencing platform and were recorded.

Generalization Probes

Generalization probes were also identical to post-training probes but were conducted shortly after the maintenance sessions (either on the same day or within 1–2 days) and included novel caregiver scenarios. The first generalization probe was conducted by the primary experimenter and introduced a novel caregiver scenario (i.e., Sarah). The second was conducted by an unfamiliar experimenter who role-played a different novel caregiver (i.e., Alana). These were the only two novel caregiver scenarios used. As in all other sessions, the role-plays were conducted and recorded via the Zoom video conferencing platform. For all post-training sessions (including maintenance and generalization sessions), the duration of the role-played interviews ranged from 6 min to 15 min, 30 s per interview.

Social Validity

Three measures of social validity were undertaken following the training. First, a survey of the participants regarding their experience was administered. Participant survey items related to the overall impact of the training (e.g., “This training helped me learn ways to engage compassionately with families”; “I feel more confident than before in my ability to show empathy and compassion when talking with families”; “Learning these compassionate skills has helped/will help my practice”; “This training helped me think objectively about interpersonal skills”) as well as to specific aspects of the training procedures (e.g., “The trainer's explanation of skills was valuable in helping me learn the skills”; “The video models were valuable in helping me learn the skills”; “The role-plays were valuable in helping me learn the skills”; “I would recommend this training to other clinicians”). These were scored using a 7-point Likert-type scale ranging from “Strongly agree” to “Strongly disagree.” Second, professional experts rated video clips of each participant, and third, consumer experts (parents of individuals with special needs) rated these same video clips. Both professional and consumer expert raters responded, on a 7-point Likert-type scale, to seven statements about their experience of the clinician’s interactions with the caregiver: (1) the interviewer showed sincere concern for the caregiver and their needs; (2) the interviewer showed compassion and empathy for the caregiver; (3) the interviewer made an effort to get to know the family and child; (4) the interviewer showed attentive and engaged listening; (5) the interviewer made sure to ask about the caregiver's priorities; (6) the interviewer incorporated the caregiver's priorities when choosing targets; and (7) the interviewer made an effort to establish a collaborative relationship with the family. These social validity measures served to evaluate the impact and value of the training itself (the participant survey) as well as the experience of actual professionals and consumers in the field of ABA. The professional expert raters were clinicians who had each worked in the field of behavior analysis with families of individuals with autism for more than 20 years; they were doctoral-level BCBAs as well as clinical psychologists. The consumer expert raters were three parents of individuals with autism, selected because they each had 10 or more years of experience as consumers of behavior analytic treatment for their children. The consumers also worked in the field of behavior analysis or allied disciplines (e.g., speech, psychology). Each expert rater was sent a video compilation that included a segment of each participant’s interview. The interview used was randomly selected from a subset of the role-plays (either post-training on skill set 3, maintenance, or generalization). Each segment was approximately 4 min long and started 1–2 min into the interview. The purpose of the expert ratings was to gauge how an unfamiliar professional or consumer would experience the compassion skills of each participant during the role-played interview. All post-study measures of outcomes are listed and described in Table 3.

Table 3 Post-study measures of outcomes

Results

Results are presented in the following order: (1) acquisition data for individual participants; (2) change scores on the JSPE-HP; (3) social validity scores from participants; and (4) social validity scores from expert raters. Training results for all participants are presented in Figs. 1, 2, 3 and 4. Participants consistently performed at low levels in baseline sessions. Participant 1 averaged 43% (range: 40%–50%), 4% (range: 0%–25%), and 4% (range: 0%–33%) on skill sets 1, 2, and 3, respectively. Participant 2 averaged 40% (range: 31%–45%), 13% (range: 0%–25%), and 7% (range: 0%–33%) on skill sets 1, 2, and 3, respectively. Participant 3 averaged 57% (range: 50%–65%), 50% (range: 50%–50%), and 15% (range: 0%–67%) on skill sets 1, 2, and 3, respectively. Participant 4 had relatively higher baseline scores, averaging 65% (range: 60%–72%), 42% (range: 25%–75%), and 22% (range: 0%–33%) on skill sets 1, 2, and 3, respectively. All participants demonstrated an immediate increase in all skill sets following training and performed the trained skills at very high levels. Participant 1 averaged 99% (range: 95%–100%), 100%, and 100% in post-training probes across skill sets 1, 2, and 3, respectively. Participant 2 averaged 94% (range: 75%–100%), 100%, and 100%; participant 3 averaged 93% (range: 67%–100%), 100%, and 100%; and participant 4 averaged 98% (range: 96%–100%), 100%, and 100% across the same skill sets. Maintenance and generalization probes remained high, averaging 99% across participants and skill sets, with the exception of participant 1’s second maintenance probe in skill set 3, which was 67%.

Fig. 1 Engagement in skills for Participant 1

Fig. 2 Engagement in skills for Participant 2

Fig. 3 Engagement in skills for Participant 3

Fig. 4 Engagement in skills for Participant 4

Jefferson Scale of Physician Empathy

The results of the pre- and post-administrations of the JSPE-HP are displayed in Fig. 5. Initial (i.e., pretest) scores on the JSPE-HP were 99, 89, 105, and 117 for participants 1, 2, 3, and 4, respectively. All participants showed increased scores following training, with increases ranging from 4 points (participant 4) to 22 points (participant 1) and an average change of 13.5 points across participants. Final (i.e., posttest) scores were 121, 100, 122, and 121 for participants 1, 2, 3, and 4, respectively. The highest possible score on the JSPE-HP is 140 (7 points on each of 20 questions). It is interesting that none of the ABA students scored below the cutoff of 100 on the post-assessment, whereas two of the four were below this cutoff at pretraining. This may suggest that the training was associated with some change in empathy as measured by this instrument.

Fig. 5 Jefferson Scale of Physician Empathy results

Social Validity

Three measures of social validity were collected following the study: a participant survey, professional expert ratings, and consumer expert ratings. As described above, these measures evaluated the impact and value of the training itself (the participant survey) as well as the experience of professionals and consumers in the field of ABA; all post-study outcome measures are listed and described in Table 3.

The results for each of the 10 questions on the participant survey are displayed in Fig. 6. The average score across all questions and participants was 6.83 (range: 6.5–7). The lowest-scoring item on average was “The video models were valuable in helping me learn the skills” (average score: 6.5). In several instances across the videos, participants indicated that they could not hear the video well or that the volume was low or cutting out. It is possible that this technology issue contributed to the participants rating the video models as less valuable than other aspects of the training (i.e., the role-plays and the trainer’s explanation, which both averaged 6.75). Notably, the survey items that apply most directly to clinical practice were rated the highest. In particular, “This training helped me learn ways to engage compassionately with families,” “I feel more confident than before in my ability to show empathy and compassion when talking with families,” and “Learning these compassionate skills has helped/will help my practice” were all endorsed at the highest level by all participants. Specific feedback was solicited from participants regarding aspects of the training that they enjoyed. Two participants noted that they particularly enjoyed the role-plays, and one participant appreciated the focus on communicating virtually, noting, “I really enjoyed learning about the ways to communicate with parents especially via video chat which is how most meetings are. I really liked the list of skills that were given and some of the examples of things to say. I have them written down which will be nice to use as a reference for the future.”

Fig. 6 Average participant social validity score by question

Overall, scores from both professional and consumer expert raters were high. On a 7-point Likert-type scale, all participants averaged between 4.2 and 6.5 overall (range: 2–7) across skills. In general, participant 3 was rated highest across all rated skills, and this was consistent across both groups of expert raters. Among professional raters, participant 3 averaged the highest survey ratings (6.6 overall, range: 6.3–6.7) and participant 4 averaged the lowest (4.5 overall, range: 3.3–5). Likewise, among consumer raters, participant 3 averaged the highest survey ratings (6.7 overall, range: 6.3–7) and participant 4 averaged the lowest (5.7 overall, range: 4.3–6.7). It is interesting that professional raters gave overall lower ratings than consumers. The highest-scoring items as rated by consumers were: (1) the interviewer showed sincere concern for the caregiver and their needs (average, 6.4); (2) the interviewer made an effort to get to know the family and child (average, 6.5); and (3) the interviewer made an effort to establish a collaborative relationship with the family (average, 6.3). The highest-scoring items as rated by professionals were: (1) the interviewer made an effort to get to know the family and child; and (2) the interviewer asked about caregiver priorities (both statements averaged 6.1; see Fig. 7).

Fig. 7 Expert rater survey results

Because the video clips were excerpts and did not show the entire interview, participants may have been rated lower on skills that they engaged in later in the interview. For example, participant 4 was rated by both professional and consumer groups as relatively low in the area of incorporating caregiver priorities. It is possible that this participant incorporated the caregiver’s stated priorities later in the interview, which would not have been represented in the rated segment.

There were no significant discrepancies between scores for general perception statements (e.g., “The interviewer showed sincere concern for the caregiver and their needs”) and statements related to the trained skills (e.g., “The interviewer asked about caregiver priorities”), lending some support to the success of the intervention.

Discussion

This study demonstrated the effectiveness of BST in training students of ABA to engage compassionately in analog role-plays with caregivers of children with autism. All participants improved their compassionate care skills across all three skill sets, from low baseline levels (averages between 4% and 65%) to very high levels post-training (averages between 93% and 100%). All four participants maintained high responding during maintenance and generalization probes, with three participants maintaining at least 99% engagement across skill sets. Notably, the entire study was conducted via telehealth. These skills have rarely been targeted within ABA practitioner training (Canon & Gould, 2021) and had not previously been taught effectively via a remote platform.

The need for compassionate care skills in the field of ABA is evident, and this study took initial steps toward defining target skills and training future practitioners on them. Basic interviewing skills (i.e., nodding, backchannel, greeting, and previewing notetaking) were trained first, and more complex interpersonal skills (e.g., normalizing, partnering) were taught following mastery of these skills. All participants provided positive feedback regarding the training. Professional and consumer expert raters watched videos of the participants and indicated that the participants displayed high levels of compassion, empathy, and collaboration. These findings lend social validity to the instruction and are encouraging in terms of real-world outcomes.

The demonstration of skill acquisition and the positive ratings (from both participants and expert raters) provide preliminary support that this approach may be both effective and socially valid. It is especially heartening to see that the behaviors exhibited by participants were rated as clinically appropriate by experts, indicating that the behaviors were authentic and potentially effective in a service provision context. Of course, much more research is needed to evaluate the extension of this work to actual clinical interactions.

Another important aspect of the current study is the training of students to engage in interpersonal skills via telehealth. Related training has been undertaken in the nursing field (e.g., Gustin et al., 2020), and the results of this study corroborate that compassionate interpersonal skills can be taught via a remote synchronous format. In addition, the JSPE-HP, a valuable and commonly used tool, was incorporated into this study. Because the JSPE-HP has been integrated into research in health care, its use here helps draw parallels between ABA and related human service fields. The use of the JSPE-HP in this study represents the first integration of a standardized measure of empathy with behavioral service providers, particularly for understanding the levels of empathy demonstrated and whether those levels change following training of compassion-related skills. All participants’ JSPE-HP scores improved following the training of compassionate care skills, and 2 of the 4 participants moved from below the cutoff value to at or above the cutoff. Participant 4, who demonstrated relatively higher baseline levels across skill sets than the other participants, also had a higher initial score on the JSPE-HP (i.e., 117 for participant 4 vs. 105, 89, and 99 for participants 3, 2, and 1, respectively). It is interesting that this participant was the lowest-rated participant on the expert rater surveys, across both consumer and professional groups. This participant was also the only male, and national norms of the student version differentiate between male and female respondents because researchers using the JSPE-HP tend to find gender differences favoring females (Hojat & Gonnella, 2015). This and other participant characteristics should be explored, as individual characteristics may affect the acquisition of the skills trained.

Limitations

As this study represents an initial step toward developing compassionate care skills in aspiring behavior analysts, there are a number of limitations that should be discussed. First, it should be noted that the order of training of the skill sets was not varied. That is, for each participant the experimenter first trained skill set 1, then skill set 2, then skill set 3. This was a clinical decision based on the initial training sessions with the first two participants. The skills appeared to build on one another in terms of complexity and nuance, and it did not seem appropriate to train more advanced skills (normalizing, joining) prior to training more basic skills that demonstrated interest in the child and family priorities.

A potential confound is that the primary experimenter conducted both the training sessions and the post-training probes. It is also possible that participants learned to respond to specific scenarios in training and then responded correctly in probes; this was somewhat, but not entirely, mitigated by the use of a novel caregiver scenario and an unfamiliar experimenter in generalization probes. Future iterations of this research would be strengthened significantly by having multiple (unfamiliar) clinicians play the role of the caregiver, or by having caregivers themselves participate in the interviews. Perhaps most important, the ultimate test of these skills is generalization to clinical contexts. This is the most important future direction, and all current findings should be interpreted with caution until that outcome is demonstrated.

Training Content Limitations

As the skills trained in this study were complex, there are a number of limitations that relate specifically to the content of the training itself. For example, the compassionate care skills taught in this study were developed from a U.S.-centric perspective. Expert interviewers, caregivers surveyed, participants, and experimenters were all English-speaking individuals who are based in the United States. Therefore, the skills discussed here should not be taken as representative of and relevant to all cultures, even within the United States. For example, in some cultures, eye contact and asking probing questions may be experienced as intrusive. Even skills related to joining with the family (normalizing, partnering) could be experienced as patronizing rather than reassuring. Therefore, in future research, careful consideration should be given to incorporating intake questions that are both compassionate and culturally sensitive.

An important consideration that arose from this study was the difficulty level of the skills selected for training. In some cases, ancillary knowledge was needed; for example, pre-service clinicians did not necessarily have an understanding of developmental norms or norms related to ASD specifically (as participant 2 noted during a training session). To join a caregiver in understanding their challenges, while still recognizing that some child-rearing challenges might be universal or at least common (i.e., the skill of “normalizing”), the clinician would need to be well-versed in typical developmental trajectories. Therefore, normalizing may be too advanced a skill to train without additional prerequisite psychoeducation related to developmental norms.

In addition to the specific skills themselves, careful attention should be paid to the style and timing of delivery. In this study, skills were clearly defined and modeled, with examples provided. However, there are likely qualitative nuances in the execution of these skills that are less easily explained. For example, when providing an empathy statement, it is important to gauge the flow of the conversation and interject the statement at an appropriate time, rather than interrupting the caregiver (i.e., timing of the response). Although none of these “poor” examples were observed during the course of this study, it is pertinent to discuss the qualitative elements of clinical responses and to plan both qualitative and component analyses of these responses in future research. Much more attention should be devoted to the quality, timing, and delivery of these statements.

Related to this, the technology challenges experienced as a part of using a video conferencing platform may have presented additional barriers. Several of the participants noted that the volume on the video models sometimes cut out or was difficult to hear. When this occurred, the experimenter paused the video and relayed what was being said in the video model. Likewise, the video conferencing platform sometimes lagged, resulting in a brief delay between what one person said and what the other heard. This is particularly important when considering the time-sensitive nature of the skills being taught (e.g., it is important to provide backchannel immediately after a statement).

Another limitation related to training soft skills was that, because of the overlap of many of the skills in the context of an interaction, it was difficult to completely isolate each individual skill. For example, in the video model for “nodding and backchannel,” the expert clinician makes a statement that is consistent with the skill “partnering” (e.g., she says, “. . . and that’s what we’re going to talk about together.”). Because many of the skills co-occur in practice, and may in fact be functionally related (e.g., for the purpose of reassuring the caregiver, demonstrating engagement), it follows that many skills cluster together and are difficult to target in isolation.

In addition, the length of sessions varied across participants, as some participants asked more questions during the interview, leading to longer overall session times. This natural variability may have created additional learning or practice opportunities for some participants.

Future Directions

The results of this study suggest many areas for future research, clinical supervision, and training. The complexity of skill selection, response definition, and training procedures necessitates continued attention to this area. As the field moves toward finding concrete ways to integrate consumer priorities in compassionate ways, there will be space for supervisors, practitioners, and leaders in the field to examine ways to expand compassionate treatment and improve the overall social validity of our work.

The findings of this study should inform the integration of interpersonal skills into training programs for students. A BST approach was effective for all four participants. Some or all components of the training package (i.e., instructions, rationale, modeling, rehearsal, and feedback) may be incorporated into coursework or supervision experiences for aspiring ABA clinicians. Incorporating the training of these skills via modeling and role-play into coursework and supervision has been recommended within the behavior analytic literature (Canon & Gould, 2021; LeBlanc et al., 2019; Taylor et al., 2019), and this study provides preliminary support for using BST to address this skill set.

Participants in this study were trained one at a time, allowing for individualized feedback and a rich training experience. The training process (not including post-training probes) took an average of 1 hr, 13 min per participant, a relatively short duration for training that produced marked improvement in critical interpersonal skills. Total training time varied, ranging from 60 min (participant 1) to 90 min (participant 3) to train all three skill sets. The training was effective and, according to social validity ratings, highly regarded by participants. Future iterations of such training could increase efficiency further by incorporating a group format. The richness of this individually tailored experience may not be necessary to achieve positive results, and a group format may achieve similar results while offering additional benefits, such as peer modeling. Group-based BST has been shown to be effective in previous research (e.g., Whiting et al., 2014), so efficacy is anticipated to be similar while training time may be shortened.

In addition, the individual components of BST should be examined to determine which (i.e., instruction, modeling, rehearsal, feedback) are most necessary for effective training. Because BST procedures allow for specific performance-based feedback, the experimenter (or, in future clinical application, the supervisor) could target specific qualitative aspects of clinical responses. Participants might have developed stronger skill sets if the training had incorporated more examples and nonexamples, along with more nuanced discrimination training between excellent exemplars of clinical responses and those lacking genuineness or nuance. This is another aspect of training that could be incorporated into future research.

Another interesting aspect of training to examine would be the experimenter’s consistency in responding. For example, it may be informative to collect data on the experimenter’s engagement in certain skills, such as nodding and backchannel, during the role-plays to see whether they are consistent across baseline and post-training probes. In addition, the experimenter’s behavior could be recorded across participants to determine whether there was fidelity in responding during each participant’s role-play. That is, it would be important to identify whether there is an inadvertent discrepancy in the experimenter’s behavior (e.g., more “leading” comments, longer pauses between statements) between baseline and training. A rough sketch of such a consistency check is shown below.
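As a rough illustration, the sketch below compares the rate of a single experimenter behavior (backchannel responses per minute of role-play) between baseline and post-training sessions. The data values, record structure, and per-minute metric are assumptions for illustration, not the study’s actual measurement system.

```python
# Hypothetical sketch: checking experimenter consistency across conditions.
# Session records and field names are illustrative assumptions, not the
# study's actual data structure.

from statistics import mean

# Each record: condition label and the experimenter's backchannel
# responses per minute of role-play (hypothetical values).
sessions = [
    {"condition": "baseline", "backchannel_per_min": 1.8},
    {"condition": "baseline", "backchannel_per_min": 2.1},
    {"condition": "post_training", "backchannel_per_min": 2.0},
    {"condition": "post_training", "backchannel_per_min": 1.9},
]

def condition_mean(records, condition):
    """Mean rate of the target experimenter behavior for one condition."""
    rates = [r["backchannel_per_min"] for r in records
             if r["condition"] == condition]
    return mean(rates)

baseline_rate = condition_mean(sessions, "baseline")
post_rate = condition_mean(sessions, "post_training")

# A large discrepancy would suggest the experimenter inadvertently behaved
# differently (e.g., more leading comments) across phases.
print(f"baseline: {baseline_rate:.2f}/min, post-training: {post_rate:.2f}/min")
print(f"difference: {abs(baseline_rate - post_rate):.2f}/min")
```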

The participants in this study were students, aged 24–25 years, in an online ABA master’s degree program. It is possible that the low baseline levels observed were partly a function of minimal experience in the field or limited prolonged contact with children. It would be interesting to explore the performance of participants who have a great deal of experience or contact with children (e.g., parents, caregivers, babysitters) in comparison to participants with less such experience. It would also be worthwhile to conduct this training with practicing BCBAs who have accrued years of clinical experience. It is possible that after years of experience, whether in the field or with children and families, interpersonal interaction skills are acquired incidentally, without explicit teaching. Even so, we should consider whether we can afford to wait for these skills to develop organically, or whether they are so integral to effective treatment that clinicians should acquire them before gaining that experience.

The generalization probes assessed the extension of the trained skills to novel role-plays and to a novel experimenter who was familiar with the study. An important next step would be to expand significantly on the generalization component of this study, for instance by conducting probes with caregivers who respond based on their actual experiences as parents. It will be essential to evaluate the impact of this type of training on authentic interactions in a clinical setting. The extent to which improved compassionate responding affects positive outcome measures (e.g., treatment adherence) should also be explored.

In future clinical applications of these procedures, researchers should explore the qualitative dimensions of compassionate care responses. To this end, participants could watch videos of their own role-plays and rate the compassion of their own responses. This could be done with a rubric of skills against which clinicians compare their performance to a standard (i.e., self-evaluation), or with a more subjective rating scale reviewed with a supervisor. The skills in this study were taught using a specific training package (i.e., BST), but future research should explore the use of self-evaluative techniques to improve compassionate responding. Self-evaluation requires discriminating one’s own behavior and assessing that behavior in relation to a predetermined criterion. A checklist of skills, such as the one taught in this study, could be introduced as a self-presented prompt, which could increase the probability of more compassionate interactions with clients. Self-evaluation has been shown to be an effective self-management intervention for increasing a broad range of appropriate behaviors (e.g., Carr et al., 2014; Sainato et al., 1990). The use of a skills checklist as part of a self-evaluation package could improve compassionate responding with families in ways that are efficient and broad reaching; a minimal sketch of such a checklist appears below.
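As a minimal sketch of this idea, the code below scores a self-rated role-play against a predetermined criterion. The skill names follow those discussed in this paper, but the yes/no scoring scheme and the 80% criterion are assumptions for illustration rather than elements of the study.

```python
# Hypothetical sketch of a self-evaluation checklist, assuming a simple
# yes/no self-rating of each trained skill against a preset criterion.
# The 80% mastery criterion is an assumption for illustration.

CHECKLIST = [
    "empathy statement",
    "normalizing",
    "partnering",
    "nodding and backchannel",
]
MASTERY_CRITERION = 0.8  # assumed proportion of skills demonstrated

def self_evaluate(observed: dict[str, bool]) -> tuple[float, bool]:
    """Score a self-rated role-play against the predetermined criterion."""
    demonstrated = sum(observed.get(skill, False) for skill in CHECKLIST)
    proportion = demonstrated / len(CHECKLIST)
    return proportion, proportion >= MASTERY_CRITERION

# Example: a clinician reviews a video of their own role-play and marks
# which skills they observed themselves using.
ratings = {
    "empathy statement": True,
    "normalizing": True,
    "partnering": False,
    "nodding and backchannel": True,
}
score, met_criterion = self_evaluate(ratings)
print(f"self-evaluation score: {score:.0%}, criterion met: {met_criterion}")
```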

Conclusion

This study adds to the literature by demonstrating that compassionate care components can be effectively taught to students of behavior analysis through a remote platform. The changes were notable and were rated as good samples of compassionate responding by both expert and family raters. This represents the beginning of defining an instructional approach to these skills. Many questions remain about increasing the efficiency of the procedures, refining the target skills, and assessing generalization. However, this study represents an important step toward actualizing the goal of teaching compassionate care skills to students of behavior analysis. It will be exciting to see how this line of research evolves and how it affects rapport, intervention outcomes, and the reputation of the field of ABA, particularly in the eyes of consumer stakeholders.