Training Individuals to Implement Applied Behavior Analytic Procedures via Telehealth: A Systematic Review of the Literature

The purpose of this article is to summarize literature relating to training individuals to implement applied behavior analytic procedures via telehealth and identify any gaps in the evidence base for this type of support. A systematic literature search revealed 20 articles focusing on training individuals to implement specific ABA techniques via telehealth. The Evaluative Method (Reichow et al. in J Autism Dev Disord 38:1311–1319, 2008; Reichow, in: Reichow, Doehring, Cicchetti, Volkmar (eds) Evidence-based practices and treatments for children with autism, Springer, New York, 2011) was used to assess the methodological quality of included articles. Results indicated that individuals were trained to implement a range of techniques, including assessments, targeted interventions, and specific teaching techniques. Socially significant outcomes were reported for clients in the form of reduced challenging behavior and increased skills. Trainee fidelity following training via telehealth was variable, and barriers related to the use of telehealth were highlighted. Where evaluated, cost and travel burdens were considerably lower than for support provided in-person. The emerging literature is promising and suggests that telehealth may be an effective means of training individuals in ABA techniques; however, wider issues and practical implications related to the use of telehealth should be considered and are discussed as they relate to ABA providers.


Introduction
Technology is increasingly becoming a part of everyday life, with smart phones, tablets, laptops, and high-speed internet connections becoming more accessible and affordable. Given the prominence of this technology in our society, it is not surprising that health organizations have adopted technology to provide services in innovative ways. The application of technology to providing such services has been termed 'telehealth' and is defined as "the use of telecommunications and information technology to provide access to health [or behavioral health] assessment, diagnosis, intervention, consultation, supervision, education, and information across distance" (Nickelson, 1998, p. 527). This can include communication through the telephone, email, online chat rooms, or videoconferencing (e.g., Gerrits et al. 2007; Phillips et al. 2001; Torres-Pereira et al. 2008), computer- or internet-based interventions (e.g., Khanna and Kendall 2008; Klein et al. 2010), and even the use of smart phone or tablet applications (e.g., Gregoski et al. 2012). Telehealth has been applied in a range of ways across a number of fields. For example, it has been used for collaborations between healthcare professionals (e.g., Katzman 2013; Zollo et al. 1999), a wide range of assessments (e.g., Loh et al. 2004; Turkstra et al. 2012), medical diagnostic services (e.g., Edison et al. 2008; Torres-Pereira et al. 2008), monitoring of long-term conditions (e.g., Fatehi et al. 2014; Inglis et al. 2014), parent training (e.g., Reese et al. 2015; Xie et al. 2013), speech and language therapy interventions (e.g., Georgeadis et al. 2004; Grogan-Johnson et al. 2011), and mental health support (e.g., Klein et al. 2010; Mitchell et al. 2008).
Delivering services via telehealth may have a number of practical advantages for clinical practice in that it may enable increased access to populations that are hard to reach (e.g., those with rare conditions or those living in rural areas), reduce travel-related costs, make scheduling appointments easier, and even increase family carer participation in interventions with their child as the clinician is not physically present (see, for discussion, Hilty et al. 2002; Meadan and Daczewitz 2015). In relation to psychiatric services, telehealth support has been reported to be reliable, acceptable to both the individuals receiving telehealth and the individual delivering the service, and associated with a range of positive outcomes such as reduced costs and fewer medication errors (Hilty et al. 2002). Telehealth and its application to psychological and behavioral support services is therefore an important area of study.
Although the use of telehealth is relatively well established in psychiatric and psychological services, with 98% of psychologists reportedly using some form of telehealth in 2000 (Vandenbos and Williams 2000), the field of Applied Behavior Analysis (ABA) has evidenced less use of telehealth. Some early work involved the use of telephone support during parent training (e.g., Patterson 1974; Patterson et al. 1982), or 'bug in ear' technology to provide real-time coaching (e.g., Bowles and Nelson 1976; Stumphauzer 1971). However, articles reporting more extensive use of telehealth in ABA are only just beginning to emerge. This disparity between fields may be due to key differences between general psychological or health support, which is often delivered directly to a client, and behavior analytic support, which often involves training others in specific techniques (e.g., Deliperi et al. 2015; Downs and Downs 2013; Wacker et al. 2017) or using a more formal behavioral consultation model (see, for example, Sheridan et al. 1996; Sheridan and Kratochwill 2007; Watson and Robinson 1996; Wilkinson 2006). These training and consultation approaches have been shown to be effective in enhancing consultee skills and fidelity (e.g., Collier-Meek and Sanetti 2014; Deliperi et al. 2015; McKenney et al. 2013) and improving child behavior or academic and social skills (e.g., Garbacz and McIntyre 2016; Sheridan et al. 2006; Sheridan et al. 2013; Wacker et al. 2017). However, some authors highlight barriers to this type of support due to the amount of consultant time needed and difficulties providing training or behavioral consultation to clients in rural areas, suggesting that telehealth may be a useful alternative method of providing such support (e.g., Bice-Urbach and Kratochwill 2016; Fischer et al. 2016a, b).
Despite this, conducting training primarily via telehealth may present more barriers than providing training in-person in relation to role playing skills, observing practice, monitoring implementation fidelity, and collecting data. This may partially explain the slower uptake of telehealth within ABA, and early examples often used initial in-person training supplemented by telehealth support (e.g., Patterson 1974; Patterson et al. 1982). However, there is some evidence that general parent training or parenting interventions can be effectively delivered via telehealth. For example, Reese et al. (2015) reported comparable results for both parents and children when a parenting intervention was delivered via telehealth or in-person, suggesting that training a consultee to support a client may be possible via telehealth. Similarly, Xie et al. (2013) reported comparable findings for parents of children with Attention Deficit Hyperactivity Disorder (ADHD), and greater improvements in hyperactivity for those whose parents were trained via telehealth rather than in-person. Although this evidence may have implications for behavior analytic support, the parenting interventions presented in these articles were not explicitly based on ABA; thus, it is unknown whether these results generalize to ABA services.
Given the recent emergence of articles relating to the use of telehealth for training consultees in ABA, a review of the literature is both timely and important in order to identify the breadth of application of telehealth methodology, indicators of effectiveness, and any limitations or difficulties encountered in its use. There is currently no known review focusing solely on behavior analytic research, with previous reviews focusing on other fields (e.g., psychotherapy, Gros et al. 2013; palliative care, Kidd et al. 2010; speech pathology, Mashima and Doarn 2008), or broader training interventions for parents of children with disabilities (e.g., Meadan and Daczewitz 2015). Boisvert et al. (2010) recently reviewed literature relating to the use of telehealth for providing support to individuals with an Autism Spectrum Disorder (ASD), including five studies focusing solely on ABA techniques. The review included articles where support was provided in relation to behavior and educational goals to teaching staff and parents, or psychological support provided directly to individuals with ASD. They found that such support provided via telehealth was deemed to be effective for the client in seven out of eight cases, with technical difficulties influencing conclusions in one case. In addition, a review by Neely and colleagues (2017) focused on the fidelity with which individuals were able to implement techniques when trained via telehealth to support individuals with ASD. They reported that trainee fidelity increased throughout the intervention; however, results were mixed and often did not maintain in the absence of direct training or coaching. Although some of the studies included in these reviews involved the use of ABA techniques, the focus on ASD alone, specific outcomes (i.e., fidelity), and the inclusion of support provided within other disciplines leave open the question of how effective telehealth is as a service delivery mechanism for ABA specifically.
The current review aims to synthesize the literature relating specifically to training an individual in ABA techniques via telehealth in order to provide an overview of the current state of the evidence and highlight gaps in research relating to this method of providing support. The review seeks to answer the following research questions: (1) How has telehealth methodology been utilized for training individuals in ABA approaches, including the context in which it is adopted, the training focus, methodology used, and characteristics of those involved? (2) How effective is the use of telehealth for training individuals in ABA approaches in relation to improving trainee skills or fidelity, and/or changing client behavior? (3) Is the use of telehealth for training in ABA approaches socially acceptable and are there any obstacles reported that researchers and practitioners in the field should consider when utilizing such methodology?

Method

Inclusion/Exclusion Criteria
Original empirical articles published in peer-reviewed journals were included in the current review if they met all of the following criteria. Firstly, the study involved training an agent (e.g., a parent, therapist, teacher) in a specific behavioral procedure (e.g., preference/functional assessments, teaching techniques such as discrete trial teaching, functional communication training [FCT]). Studies which involved delivering support directly to a client or delivering broader parenting-based programs (i.e., those focusing on more general parenting skills or focusing on knowledge about behavioral approaches more generally rather than specific techniques) were excluded. Similarly, due to the focus on direct training, articles which involved self-directed study only with no additional support from a trainer were not included. Secondly, articles were only included if data relating to behavioral outcomes for the trainee (e.g., increased skills/fidelity of implementation) and/or the client were presented. Thirdly, all of the training relating to implementing the techniques was provided through telehealth methodology (e.g., videoconferencing, telephone, email) to ensure that the focus was on telehealth training, rather than the telehealth role being supplementary to support provided in-person. No criteria were set for the date of publication, to ensure that all relevant articles were included, as it is not possible to pinpoint when telehealth methodology was first adopted.

Search Strategy
A three-phase search strategy was adopted for the current review, and all searches were conducted in July 2017 encompassing literature published up to this date. Firstly, a search string was entered into PsycINFO, Web of Science, and PubMed databases using the search terms listed in Table 1 such that each group 1 term was combined with each group 2 term. These databases are most commonly used in the behavioral sciences, and index relevant articles relating to these topics. It was therefore expected that these databases would identify the highest number of relevant articles for the current review.
The use of these terms aimed to identify the majority of telehealth-based ABA research. Given evidence from an earlier review (Brady et al. under review) indicating that a large proportion of Positive Behavior Support (PBS) research may not be multi-component and may instead focus on specific behavioral techniques, the inclusion of the term "positive behav* support" aimed to identify those articles that may be labeled primarily as positive behavior support, rather than applied behavior analytic. Furthermore, the use of ABA is a core component of PBS (Gore et al. 2013) and may therefore mean that studies utilizing PBS also involve training an agent in behavioral techniques. As stated above, articles were only included if training related to a clearly defined behavioral procedure, rather than multi-component behavioral support plans. The authors were also aware of a number of recent articles focusing on the use of videoconferencing in training agents to conduct behavioral techniques, therefore "videoconferenc*" was included to ensure that this group of articles was explicitly searched for.
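The pairing of each group 1 term with each group 2 term can be illustrated with a short sketch. Table 1 is not reproduced in this section, so the term lists below are hypothetical placeholders: only "videoconferenc*" and "positive behav* support" appear in the text, and their assignment to a particular group, as well as the use of AND as the joining operator, is an assumption made for illustration.

```python
from itertools import product

# Hypothetical term lists (Table 1 is not shown here). Only
# "videoconferenc*" and "positive behav* support" come from the text;
# the other terms and the group assignments are assumptions.
GROUP_1 = ["telehealth", "videoconferenc*"]
GROUP_2 = ["applied behav* analysis", "positive behav* support"]


def build_queries(group_1, group_2):
    """Pair every group 1 term with every group 2 term using AND."""
    return [f'"{a}" AND "{b}"' for a, b in product(group_1, group_2)]


queries = build_queries(GROUP_1, GROUP_2)
for query in queries:
    print(query)
```

With two terms in each group, the sketch yields four combined search strings; the number of combinations grows as the product of the two list lengths.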
A total of 14,002 original articles were identified from the database searches and the titles/abstracts of these articles were screened, resulting in 30 articles being retained for further review. Articles were excluded following title/abstract screening if it was clear that they did not meet one or more of the inclusion criteria (e.g., studies relating to animals, medical conditions, or support provided directly to a client via telehealth). After applying inclusion and exclusion criteria to the retained articles, 17 were included in the review. Secondly, a hand search was conducted of the three journals (Journal of Applied Behavior Analysis, Research in Autism Spectrum Disorders, Journal of Behavioral Education) that published the highest number of included articles. One additional article was identified, which did not meet inclusion criteria after full text review. Finally, the reference lists of all included articles were searched, which resulted in an additional 9 articles being identified, of which 2 were included. An additional two articles were reviewed that had not been found via the searches described above but had been brought to the authors' attention by other researchers. One of these articles met inclusion criteria and was included in the review. A total of 20 articles were included in the review, with 17 of these utilizing single case designs. An overview of the search strategy and reasons for exclusion of articles at each stage can be seen in Fig. 1.
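As a quick consistency check on the article flow described above, the counts reported for each search phase can be tallied. This is a minimal sketch using only the figures stated in the text; the phase labels and variable names are illustrative, and it assumes that every article retained or identified in a phase was examined more closely before the inclusion decision.

```python
# (articles examined after screening/identification, articles included)
# Counts are taken from the search description; labels are illustrative.
phases = {
    "database search": (30, 17),
    "hand search of journals": (1, 0),
    "reference lists": (9, 2),
    "suggested by other researchers": (2, 1),
}

total_examined = sum(examined for examined, _ in phases.values())
total_included = sum(included for _, included in phases.values())

print(f"examined: {total_examined}, included: {total_included}")
```

The included counts sum to the 20 articles reported for the review.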

Methodological Quality Evaluation
In order to evaluate the methodological quality of included articles, the Evaluative Method (Reichow et al. 2008; Reichow 2011) was used in the current review. In a recent review of single case design evaluation tools (see Wendt and Miller 2012), the Evaluative Method was rated highly based on its congruence with agreed standards for quality in single case design studies, its ability to distinguish between studies of variable quality, and empirical evidence supporting its validation. It was also the only highly rated tool able to appraise both single case and group design studies, utilizing a comparable scale across both types of design. As a result, this tool was used over other highly rated single case design evaluation tools in order to enhance interpretability of the quality ratings across both types of design in the current review. A final rating of Weak, Adequate, or Strong is assigned to articles based on ratings given in relation to primary indicators (such as the quality of baseline data, the details reported about participants, experimental control, comparison groups, etc.) and secondary indicators (such as interobserver agreement, blind raters, social validity, etc.). See Appendix A for definitions of the criteria for each primary and secondary indicator. The tool was modified in two main ways for use in the current review (consistent with procedures adopted in an earlier review, see Brady et al. under review). The final ratings were expanded to include "Borderline Adequate" and "Borderline Strong" in order to illustrate broader variability in quality of the articles, as a high number of articles were initially rated as Weak (see for criteria used to assign ratings). In addition, as the Evaluative Method was initially designed to be used for research relating to ASD, the 'participant' criteria were expanded to ensure that articles could still score 'high' as long as any applicable diagnoses were clearly stated. This ensured that studies including participants without easily operationalized diagnoses, or those without disabilities, were still able to score highly.
The tool was applied to each article in relation to the outcomes reported. This meant that for some articles, the tool was applied twice (e.g., for outcomes relating to the trainee such as fidelity/skills, and for assessment/intervention outcomes relating to the client due to the trainee implementing behavioral techniques with them). Where applicable, criteria for assigning ratings were considered in relation to the specific outcomes being assessed (e.g., participant ratings where trainee outcomes were assessed were evaluated in relation to details reported about trainees, rather than clients-see Appendix A for further detail). A second coder independently applied the tool to 50% of the articles (10 articles). Percentage agreement across indicators and final ratings was calculated and was 81.45% across indicators, and 60% across final ratings. The low agreement for final ratings is reflective of the higher weighting of primary indicators on the final rating given to an article, meaning that disagreements on these indicators would often also result in disagreements on the final ratings assigned. Disagreements were discussed and consensus was reached on ratings, and where necessary ratings for all articles were reviewed in light of agreements following discussion.
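The percentage agreement statistic reported above is the proportion of items on which both coders assigned the same rating. The sketch below shows that calculation on hypothetical final ratings for 10 articles (the actual ratings are not given in this section), chosen so that six of the ten ratings match and the result reproduces the reported 60%. Cohen's kappa is included only as a common chance-corrected alternative; the review itself reports percentage agreement.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Percentage of items given the same rating by both coders."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Rating lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the level expected by chance."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b) / 100.0
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement from each coder's marginal rating frequencies.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical final ratings for 10 double-coded articles.
first_coder = ["Weak", "Weak", "Adequate", "Weak", "Strong",
               "Adequate", "Weak", "Weak", "Adequate", "Weak"]
second_coder = ["Weak", "Adequate", "Adequate", "Weak", "Adequate",
                "Adequate", "Weak", "Adequate", "Weak", "Weak"]

agreement = percent_agreement(first_coder, second_coder)
kappa = cohens_kappa(first_coder, second_coder)
print(f"agreement = {agreement:.1f}%, kappa = {kappa:.2f}")
```

Because percentage agreement does not account for matches that would occur by chance, kappa for the same data is typically lower, which is one reason quality appraisal tools ask whether it was reported.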

Coding
The first author read each of the included articles and recorded information about the context and background to adopting telehealth methodology given by the researchers, trainer/trainee/client characteristics, telehealth methods used including characteristics of training (e.g., methods and technology used, dosage of training, format of training), the behavioral focus of the training (e.g., type of assessments, skills, or interventions used), and outcomes (for trainer, trainee, client, social validity, obstacles experienced). The second author also checked the extracted information for 55% of articles for accuracy and completeness.

Methodological Quality
The Evaluative Method was applied 23 times for the 17 single case design articles (i.e., six articles included outcomes related to both the trainee and client) and once for each group design article as none of the group design articles presented outcomes data relating to both the trainees and clients. The most common ratings were "Weak" or "Borderline Adequate" with only one single case design article rated as "Strong" in relation to outcomes for the client (see Fig. 2).
Appendix A provides an overview of the individual indicator ratings and final rating given to each article. Single case designs most often did not score highly on evidencing a stable baseline across at least 3 data points (16/23 instances) or having stable data that varied with implementation of the intervention (17/23 instances for visual analysis criteria relating to stability of data and overlap between conditions, and 14/23 for experimental control criteria relating to number of reversals and variation in data based on implementation of the independent variable). In addition, none of the single case designs included Kappa statistics, only two used blind raters, and most did not collect data on the fidelity of implementation or meet fidelity criteria where data were presented (for either the main trainer related to implementing the training, or the trainee for implementation of the intervention: 17/23 instances). Group designs did not score highly for the use of appropriate statistical analyses with adequate sample size and power (2/3 instances), did not use blind raters (2/3 instances), and did not collect data on the fidelity of intervention implementation (for either the main trainer related to implementing the training, or the trainee for implementation of the intervention: 3/3 instances), or on generalization/maintenance (2/3 instances). They also did not include effect size calculations (2/3 instances).
Fig. 2 Evaluative Method ratings for single case and group design articles for trainee and client outcomes

Breadth and Context
As stated above, 20 articles were identified which focused on using telehealth methodology to train stakeholders in behavioral techniques. Across these 20 articles, 113 agents were trained in behavioral techniques via telehealth by at least 27 trainers (it was not possible to determine the number of trainers for three articles: Alnemary et al. 2015; Lindgren et al. 2016; Wainer and Ingersoll 2015), and 104 children received support from someone who had been trained via telehealth. In some cases, additional individuals were also trained, including four trainees as part of a wait list control group (Fisher et al. 2014), and 53 individuals who were trained via in-person methods as a comparison group (Hay-Hansson and Eldevik 2013; Lindgren et al. 2016). Table 2 provides an overview of each included study. Studies were conducted by research teams primarily located in the USA, with one study conducted by a research team in Norway. Where information was reported on the distance over which telehealth support was provided, distances varied from a different room in the same building (Higgins et al. 2017; Machalicek et al. 2009b), a different location under 100 miles away (Barretto et al. 2006; Gibson et al. 2010; Lindgren et al. 2016; Machalicek et al. 2009a, 2016; Neely et al. 2016), between 100 and 200 miles away (Barretto et al. 2006; Lindgren et al. 2016), or over 200 miles away (Knowles et al. 2017; Lindgren et al. 2016; Wacker et al. 2013a, b). In three cases, training was provided for trainees in a different country located 300 (Wainer and Ingersoll 2015), 5863 (Barkaia et al. 2017), and 8333 (Alnemary et al. 2015) miles away from the trainer.
The context in which telehealth methodology was employed varied across the articles. Some researchers cited practical difficulties with offering support in-person, such as large waiting lists for support or costs and time involved with traveling around rural areas (Barretto et al. 2006; Gibson et al. 2010; Hay-Hansson and Eldevik 2013; Knowles et al. 2017; Machalicek et al. 2009a, b, 2010, 2016; Neely et al. 2016; Wacker et al. 2013a, b; Wainer and Ingersoll 2015). Alnemary et al. (2015) and Barkaia et al. (2017) further cited a lack of behavioral expertise and support available internationally in Saudi Arabia and Georgia, respectively. Other researchers cited knowledge gaps relating to effectiveness, efficiency or agent fidelity when training is conducted via telehealth (Suess et al. 2014). Finally, some researchers highlighted the need to compare delivery formats (Lindgren et al. 2016) [...] (Neely et al. 2016; Wainer and Ingersoll 2015), while others cited methodological considerations relating to telehealth research including the use of a randomized controlled or multiple baseline design (Fisher et al. 2014; Higgins et al. 2017), the incorporation of telehealth into existing support models, or the use of specific technology and software (Machalicek et al. 2009b).
J Behav Educ (2018) 27:172-222

Trainer Characteristics
In three cases (Alnemary et al. 2015; Barretto et al. 2006; Wainer and Ingersoll 2015), the characteristics of the trainer were not stated, and in some instances the trainer was listed only as one or more of the authors or a researcher/experimenter, with no further details about their skills, training, or experience provided. Where the characteristics of the trainer were stated, these individuals were most commonly professionals who had prior experience or training in behavior analytic approaches. For example, in six articles (Higgins et al. 2017; Machalicek et al. 2009a, b, 2010, 2016; Neely et al. 2016), it was explicitly stated that trainers were Board Certified Behavior Analysts. Trainers were often Doctoral or Master's students (Higgins et al. 2017; Knowles et al. 2017; Lindgren et al. 2016; Machalicek et al. 2009a, b, 2010; Suess et al. 2014; Wacker et al. 2013a, b) and had varying levels of experience using behavioral approaches, ranging from one to 20 years' experience of implementing behavioral techniques (Wacker et al. 2013a).

Trainee Characteristics
Of the 113 individuals trained via telehealth, 72 were family carers, 26 were teaching staff, nine were students/graduates, and six were ABA therapists or direct care staff. In many cases, trainees had no prior experience or knowledge of behavioral techniques. Three trainees in one study had some prior experience, although it was not possible to determine whether these trainees received training via telehealth or in-person (Hay-Hansson and Eldevik 2013), and in one study, the therapists had reportedly taken a class relating to ABA (Barkaia et al. 2017). In five studies (fifteen trainees), it appeared that agents may have had prior experience in behavior analytic techniques, but had no experience in the specific technique used in the study (Higgins et al. 2017; Machalicek et al. 2009a, b, 2010, 2016), and in three articles (seven trainees), it was not clear how much prior experience the trainees had (Barretto et al. 2006; Gibson et al. 2010; Suess et al. 2014).
In some cases, other individuals were also present during the sessions to offer logistical support to trainees. Parent assistants with no prior experience of behavioral techniques were used in three studies (Wacker et al. 2013a, b) and received training via telehealth as part of the study. These individuals assisted parents during the sessions by setting up the room, ensuring materials were available, and providing physical assistance. Similarly, Barkaia et al. (2017) involved an additional psychologist in situ for trainees during implementation of procedures; however, it was not clear what type of support this individual provided during the study. Additional individuals known to the client were also present in one study (Barretto et al. 2006) and included a school psychologist, a physical therapist, biological parent, special education teacher, social worker, nurse, and pediatrician. These individuals were not involved in the sessions, with the exception of the school psychologist who acted as a coach for one parent, and the physical therapist who carried out physical activities as demand activities for one child.

Client Characteristics
As noted above, 104 individuals received support from someone who had been trained via telehealth, and in almost all instances (with the exception of one child in Fischer et al. 2016 and two children in Knowles et al. 2017), these individuals were children with intellectual or developmental disabilities, most commonly ASD. Children were aged between 12 months and 16 years (where it was possible to determine age), and in thirteen studies (78 children), children reportedly displayed challenging behaviors such as self-injury, property destruction, aggression or noncompliance (Alnemary et al. 2015; Barretto et al. 2006; Fischer et al. 2016; Gibson et al. 2010; Knowles et al. 2017; Lindgren et al. 2016; Machalicek et al. 2009b, 2010, 2016; Suess et al. 2014; Wacker et al. 2013a, b). Only seven studies (Barkaia et al. 2017; Gibson et al. 2010; Machalicek et al. 2009b, 2010, 2016; Neely et al. 2016; Wacker et al. 2013b) reported on clients' communication abilities. Across these studies, clients had a range of abilities from no spoken language to fluent speech.

Training Focus
In most cases, training focused on assessments such as functional analyses (Alnemary et al. 2015; Lindgren et al. 2016; Machalicek et al. 2009b, 2010, 2016; Wacker et al. 2013b) or preference assessments (Higgins et al. 2017; Machalicek et al. 2009a). Fewer studies focused on training for specific intervention strategies: in seven cases, trainees were supported to develop and implement FCT or differential reinforcement interventions (Gibson et al. 2010; Lindgren et al. 2016; Machalicek et al. 2016; Suess et al. 2014; Wacker et al. 2013a), and in one case each, trainees were taught to implement Reciprocal Imitation Training (Neely et al. 2016), mand and echoic training (Barkaia et al. 2017), or classroom management approaches within a Positive Behavior Interventions and Supports (PBIS) model (Knowles et al. 2017). Three studies focused on improving trainees' skills relating to implementing behavioral teaching techniques such as discrete trial teaching or incidental teaching (Fisher et al. 2014; Hay-Hansson and Eldevik 2013; Wainer and Ingersoll 2015).

Training Methods
In all cases, training was provided via videoconferencing (i.e., real-time communication across a distance using an internet connection with video and audio facilities) with the trainer providing training and/or coaching from a different location, using a computer, webcam, and microphone (see Table 3 for technical setup and difficulties reported in each article). However, the specific methods used to conduct training differed across the articles. In most cases, some form of initial training was provided to trainees. Some researchers provided extended training sessions, lasting between 15 min and 3 h, which involved a combination of presentations relating to the techniques, direct instruction, modeling, or role playing (Alnemary et al. 2015; Barkaia et al. 2017; Fisher et al. 2014; Gibson et al. 2010; Hay-Hansson and Eldevik 2013; Higgins et al. 2017; Machalicek et al. 2016; Suess et al. 2014; Wacker et al. 2013a, b). This initial training was usually provided via videoconferencing, but was provided via telephone in one study (Barretto et al. 2006). In other cases, trainees undertook self-instruction using online modules or videos (Fisher et al. 2014; Knowles et al. 2017; Neely et al. 2016; Wainer and Ingersoll 2015), or written explanations of the techniques and individual practice (Machalicek et al. 2009a).
In some cases, training was provided solely through live coaching via videoconferencing during implementation of procedures. However, in nearly all of these instances, trainees or individuals who supported trainees in situ appeared to have prior knowledge of behavioral techniques (Barretto et al. 2006; Lindgren et al. 2016; Machalicek et al. 2009a, b, 2010). Other researchers used live coaching to supplement initial training (Alnemary et al. 2015; Barkaia et al. 2017; Suess et al. 2014; Wacker et al. 2013a, b; Wainer and Ingersoll 2015), and in two studies delayed feedback was provided based on videos made during earlier clinical sessions (Knowles et al. 2017; Neely et al. 2016). In all cases, feedback involved providing praise and corrective feedback. Where live coaching was used, this was usually provided for all sessions. However, some researchers also conducted sessions in which trainees were not directly coached in order to test their skills or evaluate whether behavioral change had maintained at follow-up (Fisher et al. 2014; Hay-Hansson and Eldevik 2013; Higgins et al. 2017; Neely et al. 2016; Wainer and Ingersoll 2015). Sessions without coaching were also used in order to assess whether trainees could perform as well when not coached (Machalicek et al. 2010; Suess et al. 2014). In addition to this direct training/coaching, trainees were explicitly asked to independently practice techniques or complete homework in five instances (Lindgren et al. 2016; Machalicek et al. 2009a; Wacker et al. 2013a; Wainer and Ingersoll 2015).
A supplemental trainee manual was described in four articles (Suess et al. 2014; Wacker et al. 2013a, b; Wainer and Ingersoll 2015), and an additional parent assistant manual containing information about the techniques, data recording forms, and scripts for use with parents was used by Wacker and colleagues (Wacker et al. 2013a, b). Some studies also reported the use of written protocols for trainers to use during coaching/training (Knowles et al. 2017; Machalicek et al. 2010; Suess et al. 2014). Training often continued until trainees met predetermined criteria for fidelity or accuracy (Barkaia et al. 2017; Fisher et al. 2014; Gibson et al. 2010; Machalicek et al. 2010, 2016; Neely et al. 2016; Suess et al. 2014). However, in many studies, training procedures were fixed and not responsive to fidelity (Barretto et al. 2006; Fischer et al. 2016; Hay-Hansson and Eldevik 2013; Knowles et al. 2017; Lindgren et al. 2016; Machalicek et al. 2009a, b; Wacker et al. 2013a, b; Wainer and Ingersoll 2015), and in three instances, training was supplemented with individual feedback or additional training based on fidelity (Alnemary et al. 2015; Fischer et al. 2016; Higgins et al. 2017).

Outcomes
A range of outcomes were included in the articles for both the trainees and the clients. Only two studies compared outcomes of training conducted via telehealth with in-person methods (Hay-Hansson and Eldevik 2013; Lindgren et al. 2016), and both found comparable results between the two delivery formats, suggesting that delivery of training via telehealth may be as effective as delivery via traditional in-person methods. Additionally, Wacker et al. (2013a) anecdotally reported comparable outcomes for clients between their current project, in which trainees were trained via telehealth, and previous projects, in which trainees were trained via in-person methods.

Trainee Outcomes
Outcomes reported for trainees related in most cases to trainee fidelity or skills, with only one article examining changes in trainee knowledge about the procedures and reporting large increases (Wainer and Ingersoll 2015). In eight articles, no outcomes data were presented for trainees, with outcomes presented only for the client (Barretto et al. 2006; Gibson et al. 2010; Lindgren et al. 2016; Machalicek et al. 2009b, 2016; Wacker et al. 2013a, b). Where data were presented on trainee fidelity/skills mastered, results were variable. Some studies reported very high fidelity across trainees. For example, Machalicek et al. (2009a) reported 100% accuracy for teachers completing preference assessments and Wacker et al. (2013b) reported averages of 96% (without corrections) and 97% (with corrections) fidelity across 24% of sessions for all t. Despite this, while all of the studies reported increases in fidelity for those who were trained (with some significant increases over time or relative to a control group: Fisher et al. 2014; Hay-Hansson and Eldevik 2013), in the majority of cases trainees failed to meet criterion fidelity, with only four articles reporting that criterion fidelity was met by all trainees across all session types or experimental phases (Fisher et al. 2014; Machalicek et al. 2009a; Neely et al. 2016; Wacker et al. 2013b). Hay-Hansson and Eldevik (2013) did, however, report comparable fidelity between individuals trained via telehealth and those trained in-person, suggesting that variable fidelity may be a common finding regardless of delivery format. However, the small number of studies directly comparing delivery formats precludes a more detailed analysis of the relative fidelity with which trainees are able to implement procedures when trained or coached via telehealth.

Client Outcomes
A range of outcomes were reported in relation to the client; however, five articles included outcomes for the trainee only (Alnemary et al. 2015; Fisher et al. 2014; Hay-Hansson and Eldevik 2013; Higgins et al. 2017; Machalicek et al. 2010). Outcomes for the client were usually presented where individuals were trained to undertake assessments or specific intervention techniques. Only one of the studies that focused on teaching techniques presented client outcomes, reporting a large increase in children's use of mands (Neely et al. 2016).
Where trainees implemented functional analyses, a social function was identified for the client's behavior in the majority of cases (Barretto et al. 2006; Lindgren et al. 2016; Machalicek et al. 2009b, 2016; Suess et al. 2014; Wacker et al. 2013b) with the exception of one client in  and two in Wacker et al. (2013b) for whom no function was identified. The results of the analyses were directly verified using a function-based intervention in five articles (Lindgren et al. 2016; Machalicek et al. 2016; Suess et al. 2014), and Wacker et al. (2013b) verified results using FCT presented in a subsequent article for 13 clients (Wacker et al. 2013a). In one article (Barretto et al. 2006), analysis results were not verified by a subsequent intervention. Only one article (Machalicek et al. 2009a) presented results of preference assessments conducted by trainees for three children. In this instance, preferred items were identified for each child and these preferences were subsequently verified using an instructional intervention in which children were observed to choose the task associated with access to the items identified as preferred.
Some articles focused on training agents to implement specific interventions such as FCT or differential reinforcement, Reciprocal Imitation Training, PBIS approaches, or mand and echoic training. FCT and differential reinforcement interventions were found to be generally effective when implemented by trainees. For example, Gibson et al. (2010) reported that elopement occurred only 5% of the time following FCT compared to over 90% of the time during baseline sessions. A number of studies (Lindgren et al. 2016; Suess et al. 2014; Wacker et al. 2013a) similarly reported large reductions in challenging behavior for the majority of clients. However, results were variable, with reductions of less than 80% for some clients and additional intervention elements required in some cases. Results were particularly variable with an average of only 65.1% reduction in challenging behavior in , despite challenging behavior being found to be significantly lower during the intervention than baseline. It must be noted, however, that telehealth training for functional analyses and FCT was implemented in this study in order to examine whether it could be delivered within the same time frame (i.e., two hours) as existing clinical support systems. As a result, the authors highlight that the findings offer preliminary evidence that telehealth training for functional analyses and FCT can be incorporated into existing systems, with questions remaining about ways to maximize intervention effects within a short time frame. In relation to Reciprocal Imitation Training (Wainer and Ingersoll 2015) or echoic and mand training (Barkaia et al. 2017), outcomes were reportedly variable but with moderate increases in children's spontaneous imitation or communication overall.

Social Validity
Fourteen of the 20 articles included data relating to the social validity of the training/coaching delivered via telehealth. In most cases, social validity ratings were very high and nearly at ceiling levels on the measures used. For example, Fisher et al. (2014) developed a 14-item social validity questionnaire (utilizing a 7-point Likert scale from 1 [strongly disagree] to 7 [strongly agree]) relating to the use of web-based technology, the content of the online modules, the interactions with the trainee, and their overall satisfaction. Mean ratings assigned to each of the items ranged from 5.4 (for use of web-based technology) to 7 (for overall satisfaction), indicating high social validity. Other researchers evidenced similarly high social validity with a range of standardized and novel questionnaires (Barkaia et al. 2017; Fischer et al. 2016; Gibson et al. 2010; Higgins et al. 2017; Knowles et al. 2017; Machalicek et al. 2016; Neely et al. 2016; Suess et al. 2014; Wainer and Ingersoll 2015), with one article highlighting that scores were comparable to other interventions provided in-person (Wacker et al. 2013a). However, social validity scores were variable in one study (Alnemary et al. 2015) with low scores assigned to aspects of the videoconferencing, indicating technical difficulties experienced (see Table 3 and further discussion below). Despite this, trainees stated that they would recommend the training to others, a finding that was replicated by Fisher et al. (2014) and Higgins et al. (2017). Trainees reported across the studies that they found the use of telehealth simple, valuable, unobtrusive, and convenient as it allowed more frequent meetings with the trainer and immediate feedback. Although the use of telehealth was generally rated highly, two individuals in separate studies stated that they felt the training would have been easier or preferable in-person (Alnemary et al. 2015; Neely et al. 2016) and another expressed concerns about the possibility of technical difficulties (Gibson et al. 2010).
In addition to assessing social validity, some researchers also examined costs relating to the use of telehealth in comparison with in-person support. For example, Wacker and colleagues (Wacker et al. 2013a, b) estimated that the weekly costs of providing a functional analysis would have been $335.09 per client if training were delivered in-person (when including costs related to the behavioral consultant's time and travel) versus $57.95 when training was delivered via videoconferencing. Similarly, the combination of a functional analysis and FCT would have resulted in total costs per client of $55,872 if delivered in-person, versus $11,500 when delivered via videoconferencing. Lindgren et al. (2016) similarly evidenced large cost savings as a result of the use of telehealth, particularly when telehealth support was provided in clients' homes rather than regional clinic settings (due in part to the exclusion of costs relating to families' travel to the clinics, additional staff support, and use of other resources).
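To put the figures quoted from Wacker and colleagues on a common scale, each pair of costs can be expressed as a percentage reduction. The short Python sketch below is illustrative only and is not part of the original analyses: the function name `percent_reduction` and the variable names are hypothetical, and the calculation assumes that the two dollar amounts within each pair are directly comparable (weekly cost per client for the functional analysis; total cost per client for the functional analysis plus FCT).

```python
def percent_reduction(in_person: float, telehealth: float) -> float:
    """Percentage by which the telehealth cost is lower than the in-person cost."""
    return (in_person - telehealth) / in_person * 100


# Dollar figures as quoted in the text (Wacker et al. 2013a, b).
fa_weekly_reduction = percent_reduction(335.09, 57.95)     # functional analysis, weekly cost per client
fa_fct_total_reduction = percent_reduction(55872, 11500)   # functional analysis + FCT, total cost per client

print(f"Functional analysis (weekly): {fa_weekly_reduction:.1f}% lower via telehealth")
print(f"Functional analysis + FCT (total): {fa_fct_total_reduction:.1f}% lower via telehealth")
```

Under these assumptions, both comparisons correspond to a reduction of roughly four-fifths of the in-person cost, although the estimates rest on the cost components each study chose to include.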

Obstacles Relating to Telehealth
A number of obstacles were identified in the articles relating to the use of telehealth for training. These often related to technical difficulties (see Table 3). However, in most cases, authors reported that technical issues did not significantly affect the training and were easily resolved. Issues relating to the logistics of using the equipment were also highlighted, including the possibility of needing someone to set up equipment prior to sessions, or transferring potentially large video files, and issues with protecting clients' confidentiality or obtaining informed consent (Barkaia et al. 2017; Fischer et al. 2016). Some authors discussed issues with software being blocked by local firewalls (Hay-Hansson and Eldevik 2013), and with insurance companies not covering the cost of support delivered via telehealth (Barretto et al. 2006). Finally, researchers also highlighted potential limitations of support provided via telehealth, such as whether it can be used with all types of behavior or techniques (Machalicek et al. 2010; Wacker et al. 2013a) and whether some trainees may need more direct modeling, which is not possible via telehealth (Suess et al. 2014).

Discussion
The results of this review provide initial support for the use of telehealth as a way to effectively train individuals to implement ABA techniques including assessments, teaching procedures and specific interventions. In some cases, training via telehealth was found to produce comparable results to traditional in-person training and resulted in behavioral change or useful assessment outcomes for clients. Furthermore, telehealth training was rated as highly socially valid and, in preliminary analyses, resulted in significant financial savings for organizations and reduced travel burdens for trainees. Providing training via telehealth may therefore be a promising method of supporting behavioral change for clients and increasing access to behavioral support.

Methodological Quality of Evidence Base
Although these initial results are promising, a key limitation of the evidence base for telehealth training in ABA procedures relates to the methodological quality of the studies. The articles included in this review were most commonly rated as "Weak" or "Borderline Adequate" on the Evaluative Method, indicating that they lacked key indicators of methodological quality. This finding replicates earlier findings by Boisvert et al. (2010) who similarly found that research relating to telehealth support for people with ASD had key methodological flaws. Only five studies in the current review were rated as "Adequate" (one relating to trainee outcomes: Knowles et al. 2017; four relating to client outcomes: Lindgren et al. 2016;Machalicek et al. 2009b;Neely et al. 2016;Wacker et al. 2013a), and one as "Strong" (relating to client outcomes: Gibson et al. 2010).
Due to the low number of articles utilizing a group design included in the review, it is not possible to examine the methodological quality of these articles in depth. However, the most common cause of low ratings for single case design studies related to graphical representations of the data which suggested unstable data in the baseline or intervention conditions, poor experimental control, insufficient replication of independent variable manipulations, or a lack of adequate data to evidence an effect. This may suggest that variables other than the training (for trainee outcomes) or behavioral techniques (for client outcomes) influenced results. When considering trainee outcomes, it is unclear whether these elements are due to difficulties in training individuals via telehealth or other aspects of the study design. However, for client outcomes, only some of these elements (i.e., number of independent variable manipulations, amount of data collected) are likely to be within the control of the researcher, with others likely to be influenced by the fidelity with which trainees implement the techniques, which was found to be variable when examined by the studies included in this review and in a previous review (Neely et al. 2017). Despite this, additional research examining whether issues in these areas of experimental design are common among interventions utilizing a training or behavioral consultation model is warranted in order to identify whether this is unique to the use of telehealth. Furthermore, the low ratings for single case design studies may be in part explained by the emphasis given to different elements of study design by the Evaluative Method. Wendt and Miller (2012) suggested that elements such as interobserver agreement and fidelity may also be key indicators of internal validity in single case design, but are currently considered only as secondary indicators on the Evaluative Method with less influence on the overall rating. 
This may be particularly relevant in the current review, as only two studies (Barretto et al. 2006; Neely et al. 2016) did not evidence acceptable levels of interobserver agreement across all measures, conditions and participants. Nonetheless, while the studies reviewed here often did not score highly on the existing measures of internal validity on the Evaluative Method, their external validity is supported where comparisons were conducted to training provided in-person, as findings were often reported to be comparable. This is a key strength of the evidence base to date. It is also important to consider that research relating to training individuals in behavioral procedures via telehealth is relatively new in the field, and the evidence should therefore be considered in light of this. Further studies that evidence high methodological quality are undoubtedly needed; however, the positive outcomes reported here remain a promising indication of the potential effectiveness and utility of this type of support.

Limitations and Areas for Further Study
Some additional limitations of the evidence base must also be considered. Firstly, the vast majority of research included in this review was conducted by research teams located primarily in the USA; therefore, it is unclear whether such methodology could be integrated into the support systems of other countries. In addition, there are only a few direct comparisons of training provided via telehealth with training provided via in-person methodology. Although this is an important omission and requires further study, it may be sufficient to demonstrate that training provided via telehealth is effective more generally, given that it may not be possible to provide in-person support to some trainees/clients (e.g., in very rural areas in which there are no professionals with expertise in behavior analysis). This limitation may therefore relate to the theoretical understanding of telehealth-based support, rather than its clinical utility. Secondly, the variable results relating to trainee fidelity warrant further study to identify the determinants of and ways to improve trainees' implementation of techniques, and the impact of this on client outcomes. Many studies included in the review did not report fidelity data (for either the trainer or trainee), which is a key methodological limitation, although this limitation is also applicable to behavioral research more widely (e.g., Gresham et al. 1993; Ledford and Wolery 2013). Comparisons with fidelity when trainees receive training via in-person methodology would again be useful, given that one study in this review found that variable fidelity was common across both training modalities (Hay-Hansson and Eldevik 2013). Finally, some technical difficulties were reported in the studies, suggesting a need to document and refine the technological requirements for successful telehealth interventions. This is likely to be a common concern for telehealth interventions across a number of fields and Lee et al. (2015) provide an initial analysis of the particular considerations for training relating to FCT interventions. More demonstrations of sufficient technology for conducting telehealth and troubleshooting guidelines are undoubtedly needed if practitioners are to adopt such methodology within their practice.
In addition to limitations in the evidence base, there are also limitations relating to the current review which must be considered when interpreting results. Firstly, it was beyond the scope of the review to consider interventions that did not include additional support from a trainer (e.g., those based solely on self-directed learning such as Jang et al. 2012), or interventions relating to more broad behavioral methodology rather than defined procedures (e.g., Heitzman-Powell et al. 2014; Vismara et al. 2009, 2012, 2013); therefore, the utility of telehealth in these contexts cannot be inferred from this review. In addition, due to the nature of systematic review methodology, some relevant articles may not be included if they were not identified as part of the search strategy, and it was not possible to include gray literature such as unpublished manuscripts, dissertations, or book chapters. As a result, some relevant evidence may not have been included in the review. Despite this, the methodology of a systematic review requires adherence to tight inclusion criteria and this is therefore a limitation of systematic reviews in general. Finally, due to the small number of studies identified, it was not possible to assess the effectiveness of the interventions quantitatively; therefore, conclusions relating to effectiveness are only tentatively made.
Despite limitations, this review has highlighted a number of specific areas that require further study. Any future research should aim to overcome methodological limitations highlighted in this review, and be conducted in a range of countries and contexts in order to demonstrate the applicability of telehealth to ABA support internationally. Additional research is also needed for wider target populations, as nearly all studies in this review focused on children with disabilities, for a greater range of outcomes (e.g., trainee knowledge and confidence), and on other ABA techniques and interventions. Finally, a component analysis of telehealth training would add to the evidence base by determining which elements of training are necessary or sufficient for behavioral change, as many studies used multiple approaches including initial training, real-time coaching, accompanying manuals, and logistical support from other individuals during sessions.

Broader Considerations Relating to Telehealth
Some wider issues relating to the use of telehealth also warrant further discussion and will require investigation and clarification if the field of ABA is to adopt telehealth methodology more widely. The articles included in this review often contained only limited details about the characteristics of the trainer, trainee, and clients, with no evaluation of the characteristics of those who would be most able to deliver training via telehealth or benefit from the use of this technology. Some authors highlighted a need to investigate this further (Suess et al. 2014; Wacker et al. 2013a), and this may be a key consideration for professionals wishing to use telehealth methodology. It is possible that some individuals may have difficulty engaging with or benefiting from support provided via telehealth due, for example, to difficulty accessing or using the technology required, cultural and language barriers, or preferences for support provided in a particular way. Identifying the characteristics of those who would benefit most from and engage with telehealth support would ensure that such methodology is used when it is most appropriate and useful. On a related note, there is debate within other fields around the extent to which support provided via telehealth alters the therapeutic relationship between the therapist/trainer and the recipient (see, for example, Kaplan and Litewka 2008; McCarty and Clancy 2002; Swinton et al. 2009). Although a full overview of this debate is beyond the scope of this review, there may be important implications relating to this for behavior analytic support provided via telehealth. For example, if the therapeutic relationship is indeed altered, it may imply that behavior analytic telehealth support will be most appropriate for individuals who are more emotionally resilient and require less therapeutic/emotional support from trainers alongside the training. 
These implications will need to be investigated and taken into account when implementing support via telehealth.
Other limitations relating to the use of telehealth in ABA may also exist, with some authors highlighting that use of the methodology may be limited to particular types of target behaviors (Machalicek et al. 2010), or particular procedures, as training relating to highly specific procedures may be more suited to delivery via telehealth than training for less easily defined procedures (Machalicek et al. 2010; Wacker et al. 2013a). Although some authors have applied telehealth methodology to more broad training (see, for example, Heitzman-Powell et al. 2014; Reese et al. 2015; Vismara et al. 2009, 2012, 2013; Xie et al. 2013), an analysis of the factors related to the effectiveness of telehealth for different types of support and with different behavioral targets is warranted. Finally, the motivations and context for adopting telehealth support in ABA services must be considered. Although providing support via telehealth has preliminarily been shown to reduce costs or travel burdens (Gibson et al. 2010; Lindgren et al. 2016; Wacker et al. 2013a, b), it can be argued that this should be a secondary focus, with clinical need taking precedence. Furthermore, there is some evidence that despite reduced professional costs, client-related costs may increase as a result of the use of telehealth (see Lindgren et al. 2016), which may present a barrier to participation for some families. It may be important, therefore, to ensure that services do not adopt telehealth methodology solely to reduce professional costs where in-person training is possible, but instead adopt telehealth to support populations who may be unable to otherwise access support (e.g., in rural settings) or who would specifically benefit from the use of such technology.

Implications for Practice
Although these broader issues require further investigation and the methodological quality of articles included in this review presents a significant limitation, the findings presented here and the literature relating to telehealth more generally may have important implications for clinical practice. In early evaluations, telehealth methodology appeared to be effective for training individuals in a number of ABA techniques. Although more high-quality research is warranted, these findings suggest that telehealth support may have the potential to improve the reach and scope of behavior analytic support and enable professionals to effectively support populations that would otherwise struggle to access such support. This may be particularly important in contexts where expertise in behavior analysis is scarce or not geographically widespread such as the UK, where only 275 professionals are registered with the Behavior Analyst Certification Board as Board Certified Behavior Analysts or Board Certified Assistant Behavior Analysts (Behavior Analyst Certification Board 2017). This is equivalent to one certified professional per 235,525 people and is much lower than other countries such as the USA, where there is one certified professional for every 12,776 people (based on total population data as of 1st July 2017 [the most recent available data for the UK]: United States Census Review 2017). Telehealth support in ABA services also necessitates a focus on training stakeholders, as it may be difficult to provide direct behavioral interventions to a client using telehealth due to the need to be able to deliver reinforcement and manipulate aspects of the environment directly. Training stakeholders is consistent with best practice in Positive Behavioral Support (Gore et al. 2013), and is also likely to improve stakeholder skills and promote the sustainability of behavioral support for the client over time. 
In addition to this, telehealth-based interventions were considered highly socially valid by trainees, which is another important determinant of the likelihood that the intervention will be continued in the absence of direct professional support (Baer et al. 1987). Finally, in initial investigations, telehealth training appears to be an efficient and cost-effective way to provide support, given evidence of potentially large cost savings overall and reduced travel burdens (Gibson et al. 2010; Lindgren et al. 2016; Wacker et al. 2013a, b). Although caution should be exercised in solely using financial benefits to justify the adoption of telehealth methodology as discussed above, this may be an important consideration for the field in the current economic and political climate.

J Behav Educ (2018) 27:172-222

Compliance with Ethical Standards
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix: Evaluative Method Definitions and Ratings
See Tables 4, 5, 6 and 7.

Borderline Adequate [b]: 'High' on 3 primary indicators, no more than 1 primary indicator rated as 'Unacceptable'; evidence of 2 or more secondary indicators
Weak [a]: 'High' on less than 3 primary indicators, or 2 or more primary indicators rated as 'Unacceptable'; evidence of less than 2 secondary indicators

At least 50% of the demonstrations of the experimental effect meet these criteria, there are two demonstrations of the experimental effect at two different points in time and changes in the DV vary with the manipulation of the IV
Less than 50% of the demonstrations of the experimental effect occurring at two different time points in which changes in the DV vary with manipulation of the IV
Criteria for all indicators were applied with reference to the specific outcomes under examination. For example, participant criteria related to trainees (where the tool was applied to trainee outcomes), or clients (where the tool was applied to client outcomes), the independent variable was treated as the training (for trainee outcomes), or the behavioral techniques (for client outcomes), and so on. Definitions adapted from Reichow (2011)

Table 6 Definitions for secondary indicator ratings across group and single case designs on the Evaluative Method
Group and single case design, Generalization/maintenance: Outcome measures are collected after the final data collection to assess generalization/maintenance
Group and single case design, Social validity: Study contains at least 4 of the following:
  Socially important dependent variables (i.e., society would value the changes in outcome of the study)
  Time- and cost-effective intervention (i.e., the results justify the means)
  Comparison between individuals with and without disabilities
  A behavioral change that is large enough for practical value (i.e., it is clinically significant)
  Consumers who are satisfied with the results
  Independent variable manipulation by people who typically come into contact with the participant
  A natural context
Group, Random assignment: Participants are assigned to groups using a random assignment procedure
Group, Interobserver agreement (IOA): IOA data collected across all conditions, raters and participants with reliability > .80 (kappa > .60) or psychometric properties of standardized tests are reported and are > .70 agreement with a Kappa > .40
Group, Attrition: Attrition is comparable (does not differ between groups by more than 25%) across conditions and less than 30% at the final outcome measure
Group, Effect size: Effect sizes are reported for at least 75% of the outcome measures and are > .40
Single case design, Interobserver agreement (IOA): IOA data collected across all conditions, raters, and participants with reliability > .80
Single case design, Kappa: Kappa is calculated on at least 20% of sessions across all conditions, raters, and participants with a score > .60
Criteria for all indicators were applied with reference to the specific outcomes under examination. For example, IOA criteria related to trainee behavioral data (where the tool was applied to trainee outcomes), or client behavioral data (where the tool was applied to client outcomes); fidelity was examined relating to trainer's implementation of the training (for trainee outcomes), or the trainee's implementation of behavioral procedures (for client outcomes), and so on. Definitions adapted from Reichow (2011)