Challenging behaviors such as aggression, self-injury, stereotypy (e.g., repetitive hand movements or speech), and non-compliance are common in individuals with autism spectrum disorder (ASD), intellectual disabilities (ID), and multiple disabilities (Baghdadli, Pascal, Grisli, and Aussiloux 2003; Kiernan and Kiernan 1994; McClintock, Hall, and Oliver 2003; Murphy et al. 1999; Poppes et al. 2010). If severe and chronic challenging behavior is not addressed, individuals with disabilities are at risk for poor academic achievement, adult mental health concerns, and peer rejection (Dunlap et al. 2006). Challenging behavior also puts individuals at higher risk for abuse, neglect, deprivation (Emerson et al. 2001; Lowe et al. 2007), victimization (Crocker et al. 2006; Rusch et al. 1986), and incarceration (Lund 1990; Crocker and Hoggins 1997; Crocker et al. 2006). Many of these risks can be linked to restrictive social and learning environments due to said challenging behaviors (Buschbacher and Fox 2003; Machalicek et al. 2007; Reichle 1990). Individuals may be segregated or excluded to institutions or specialized treatment centers due to these behaviors, and services within these more restrictive settings can be inconsistent and inadequate due to a higher rate of staff turnover (Hastings and Brown, 2002; Lowe et al. 2007; Machalicek et al. 2007). To decrease the risk of segregation and serious emotional issues, challenging behavior must be addressed using consistent implementation of evidence-based practices.

Prior to the mid-1980s, a majority of the research on behavioral interventions for challenging behavior focused on reactive approaches, for example, punishment or withholding reinforcement (Carr 1985; Carr and Durand 1985). Time-out (Zeilberger, Sampen, and Sloane 1968), extinction (Lovaas et al. 1965), contingent restraint (Azrin et al. 1982), and response cost (Iwata and Bailey 1974) are examples of punishment or withholding reinforcement. While these types of interventions were often effective in decreasing challenging behavior, the interventions did not directly teach replacement behaviors or more socially appropriate behaviors (Carr and Durand 1985).

Functional communication training (FCT), an evidence-based practice developed by Carr and Durand (1985), is a non-aversive alternative approach to addressing challenging behavior (Ganz and Hong 2014). FCT is built on the theory that challenging behavior may be a means of communicating one’s needs when individuals are unable to communicate in more socially acceptable manners, such as through conventional speech (Mancil 2006; Kurtz et al. 2011; Sigafoos, Ganz, O’Reilly, and Lancioni 2008). By teaching socially appropriate communicative responses to meet the needs of an individual, challenging behaviors may diminish (Carr and Durand 1985). FCT therefore begins with a functional analysis (FA) or a functional behavior assessment (FBA) of the individual’s challenging behavior; a communicative response is then taught to the individual to serve the same function, or purpose, that the challenging behavior previously served (Durand 1990). An FBA is a process that includes a variety of means of evaluating why an individual is engaging in a particular behavior; this may include checklists and observations in typical settings in which the behavior occurs (Mancil 2006). An FA involves experimentally manipulating a number of possible conditions while collecting data to measure the frequency of the behavior in each condition, such as the frequency of the behavior contingent upon gaining attention, gaining access to a preferred item or activity, escaping a task, or while alone (Iwata et al. 1982). The function of communication varies across individuals and may be hypothesized based on the results of the functional analysis; possible functions include gaining access to attention from another person, a tangible item, or an activity; avoiding or escaping a task demand; or accessing time for self-stimulatory behaviors (Durand 1990; Mancil 2006; Mancil and Boman 2010). The mode of communication used may be verbal, gestural or sign language, pictorial, or utilize a speech-generating device (Durand 1990).

There are several variables to consider when choosing a mode of communication (e.g., speech, sign language, and augmentative or alternative communication) for the replacement behavior. For instance, less effort required in the communication response leads to more effective interventions (Bailey, McComas, Benavidas, and Lovascz 2002; Buckley and Newchok 2005; Horner and Day 1991; Richman, Wacker, and Winborn 2001; Ringdahl et al. 2009). That is, the mode of communication that the individual is most proficient in using at pre-assessment may be the most effective at replacing challenging behavior (Ringdahl et al. 2009). Additionally, when prior means of communication (e.g., challenging behavior) were strongly linked to reinforcement, instruction in replacement communicative acts may be initially ineffective (Winborn et al. 2002); that is, established learned challenging behaviors are difficult to modify until the older behaviors no longer result in access to reinforcement. Teaching a novel communicative response may be more effective, such as selecting a mode (e.g., verbal, pictorial, gestural) that is unfamiliar or a new response within the same mode but considered to be different from how the individual usually communicates (Winborn et al. 2002; Winborn-Kemmerer et al. 2009). While mode of communication appears to be linked directly to proficiency, research determining whether the mode of communication differentially impacts the effectiveness of FCT is limited and no review or meta-analysis has considered this question to date.

FCT has been investigated via single-case studies spanning a wide range of age groups. This includes individuals in preschool (e.g., Durand 1993; Durand and Carr 1987; Gibson et al. 2010; Mancil et al. 2009; Wacker et al. 2013), elementary school (e.g., Durand and Carr 1991; Falcomata et al. 2013; Franco et al. 2009; Sigafoos and Meikle 1996), secondary school (e.g., Carr and Durand 1985; Durand 1993; Fisher et al. 2005), and adults (e.g., Kahng et al. 1997; Shirley et al. 1997; Worsdell et al. 2000). To date, no study has examined the relative effectiveness of FCT across age groups, which prevents practitioners from determining the extent to which FCT may be effective with the particular individuals they serve.

In addition to mode of communication and age group differences, it is also important to determine if an individual’s disability has an impact on the effectiveness of FCT. Disabilities that impair communication, such as ASD and ID (Pinborough-Zimmerman et al. 2007), may be more likely to cause challenging behavior due to the individual’s inability to communicate his or her needs via conventional means (Carr 1985; Carr and Durand 1985; Neel et al. 1983; Reichle and Yoder 1979). Effective and efficient conventional communication alleviates the need for the challenging behavior and provides a more socially acceptable means of accessing reinforcement (Buschbacher and Fox 2003; Carr and Durand 1985; Heflin and Alaimo 2007). FCT has been implemented among individuals with various disabilities including ASD, ID, hydrocephaly (Hagopian et al. 2004), cerebral palsy (Durand 1993; Kuhn 2010), and developmental disorders (Peck Peterson et al. 2005; Volkert et al. 2009). However, to date, there is no research comparing the effectiveness of FCT across disability categories.

Traditionally, single-case research uses direct and systematic replication to establish the external validity of a particular practice (Horner et al. 2005). Literature reviews may provide some insight into the literature, albeit descriptive and subjective in nature (Falcomata and Wacker 2013; Kurtz et al. 2011; Mancil 2006). However, meta-analysis provides a means of evaluating potentially evidence-based practices based on objective measures of degree of effectiveness. Meta-analyses have the following four purposes: (a) identification of variables that may have an influence on outcome variables, (b) summarizing the overall effectiveness of the treatment being examined, (c) describing the body of research as a whole, and (d) providing quantification for the effectiveness of an intervention (Blimling 1988; Busk and Serlin 1992; Busse, Kratochwill, and Elliot 1995). Three meta-analyses have investigated function-based interventions and therefore included studies utilizing FCT (Goh and Bambara 2010; Gresham et al. 2003; Marquis et al. 2000); they found that FBA-based interventions can produce positive behavioral changes. Further, one recent meta-analysis examined the use of augmentative and alternative communication (AAC), including via FCT-based interventions, to address challenging behavior (Walker and Snell 2013). However, there are currently no published meta-analyses assessing the overall effectiveness of FCT apart from other interventions and across communication modes.

The purpose of the current study is to quantitatively determine (a) the overall effectiveness of FCT, (b) if FCT is differentially more effective for specific communication modes, (c) how effective FCT is differentiated by age range, and (d) how effective FCT is differentiated by disability category.

Methods

Literature Search

A comprehensive search was performed using a variety of databases. Academic Search Complete, Medline, and PsycINFO were searched using the terms functional communication training, functional communication, functional analysis communication, and mand training. The searches were limited to the years 1980 through the date of the search, 2011. To ensure that no relevant articles were excluded, the researcher also conducted a search using the same terms and restricted years using Google Scholar. Finally, the reference sections of all articles that met the inclusion criteria were reviewed to ensure no articles were missed.

Each article found via the search methods was evaluated to determine whether or not it met all of the following inclusion criteria: (a) the participants had a diagnosed disability other than speech impairment, (b) the dependent variables had to include a measurement of either challenging behavior or adaptive behavior (e.g., aggression, self-injury, on-task behavior), (c) the data for challenging behavior were displayed in line graphs, (d) the study demonstrated experimental control while using a single-case research design (e.g., multiple baseline, reversal/ABAB, alternating treatment), (e) the primary intervention was FCT with a clear explanation of how behavioral functions were determined, and (f) the articles were published in peer-reviewed journals in English. An inclusion chart was created based on the inclusion criteria (Berman and Parker 2002) to rate each article and determine if the article should be included in the meta-analysis. The charts were completed by the researcher and an individual who was blind to the purpose of the research study. Prior to rating each article, the raters discussed the inclusion criteria to ensure the criteria were judged similarly. A document was created defining each inclusion criterion to ensure that both raters were able to complete the task using the same methods. Both raters assessed every article and completed the chart. The results from the charts were compared to ensure reliability. If the two raters disagreed about an article, a third person rated the article and the decision of two of the three raters determined whether or not the article was included.

The combined search methods identified 80 articles, dissertations, book chapters, and other literature related to FCT. After reviewing the literature and determining whether or not each article met the inclusion criteria, 36 articles met the criteria and were included in this meta-analysis.

Data Extraction and Coding

After the articles were selected for inclusion in the study, each article was coded using the potential moderating variables of mode of communication, participant age, and primary disability. Each study was further coded into different levels within the variables. Mode of communicative response was coded as Aided Augmentative and Alternative Communication (A-AAC), Unaided Augmentative and Alternative Communication (U-AAC), Verbal, or Multiple. A-AAC included any study that used any type of speech-generating device or picture cards to generate the communicative word or phrase. U-AAC is communication that requires no additional tools or devices. For this study, U-AAC included sign language and any type of gesture to gain attention, such as tapping someone on the shoulder or pointing. Verbal was any verbal response using one’s vocal cords. Multiple was used when a study allowed the participant to choose from an array of communicative responses. Participant age was broken into age groups of Primary (ages 0–5 years old), Elementary (ages 6–12), Secondary (ages 13–21), and Adult (ages 22 and older). The disabilities were coded as AU (including pervasive developmental disorders, autism, and Asperger syndrome), ID, or Other. The primary disability label, defined as the primary, first, or only disability identified by the study authors, was used to determine the group to which the participant belonged. Therefore, if a participant’s primary disability was AU but he or she had a secondary disability label of ID, the individual was coded as having AU. If the reverse was true (ID was primary and AU was secondary), the participant was coded as having ID. Any other disability was labeled as Other because the numbers of participants were too small or too dissimilar to aggregate into another category.
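The coding rules above can be expressed as a small lookup. The following is only an illustrative sketch: the function names and the exact label strings are hypothetical, and the study's coding was completed by human raters rather than by software.

```python
def code_age_group(age_years: float) -> str:
    """Map an age in years to the age levels described in the text."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years < 6:
        return "Primary"      # ages 0-5
    if age_years < 13:
        return "Elementary"   # ages 6-12
    if age_years < 22:
        return "Secondary"    # ages 13-21
    return "Adult"            # ages 22 and older


# Hypothetical label strings; real study reports word diagnoses in many ways.
AU_LABELS = {"autism", "pervasive developmental disorder", "asperger syndrome"}
ID_LABELS = {"intellectual disability"}


def code_disability(primary_label: str) -> str:
    """Code a participant by the PRIMARY disability label only."""
    label = primary_label.strip().lower()
    if label in AU_LABELS:
        return "AU"
    if label in ID_LABELS:
        return "ID"
    return "Other"


# A participant with autism listed first and ID listed second is coded AU,
# because only the primary label is passed to the function.
print(code_age_group(9), code_disability("Autism"))
```

Because only the primary label is passed in, a secondary diagnosis never changes the assigned level, which mirrors the rule stated in the text.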

Data Analysis

The field has not reached a consensus regarding what effect size or analysis method is most appropriate in single-case research (Berman and Parker 2002; Busse et al. 1995; Center et al. 1985; Kavale et al. 2000; Schneider et al. 2008; Scruggs 1987; Scruggs and Mastropieri 2001). Most data from single-case research do not follow the assumptions required for parametric measures, such as normal distribution and scale type (Parker et al. 2011). When data do not follow parametric assumptions, for example, when the data are highly variable, measures such as mean, median, and mode do not accurately represent the data, so non-parametric measures should be used. Among all the non-parametric measures most suitable for single-case designs, non-overlap methods are simpler and distribution free (Parker, Vannest, and Brown 2009; Parker et al. 2011).

An effect size (ES) for each study was calculated comparing baseline performances to intervention performances (Kavale 2001). In a meta-analysis, individual ESs are summarized to create a common unit for comparison between levels. Robust improvement rate difference (IRD; Parker et al. 2011) was selected as the metric for calculating ESs. Robust IRD is a calculation of the improvement rate for the intervention phase minus the improvement rate for the baseline phase (Parker et al. 2009). To compute improvement rate, the number of “improved data points” in each phase is divided by the total number of data points in that phase. A data point in the intervention phase is considered improved if it ties or exceeds all data points in the baseline phase (Parker et al. 2009). A 2 × 2 table is used to help organize the data in each IRD calculation. Improved data points for baseline, improved data points for intervention, not improved data points for baseline, and not improved data points for intervention are entered into the cells within the table (Parker et al. 2009). The numbers of improved data points in both phases are added together and then divided equally into the two improved boxes in the 2 × 2 table. This process causes Robust IRD to be less susceptible to outlier data points because these data points are spread equally between the two phases.
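As a minimal sketch of the arithmetic described above, the function below computes an improvement rate difference from the four cell counts of a 2 × 2 table. The function name and the example counts are hypothetical, and the sketch assumes the cell counts have already been determined (including any balancing step); it does not reproduce the procedure used to identify improved data points.

```python
def improvement_rate_difference(base_improved: float, base_not_improved: float,
                                int_improved: float, int_not_improved: float) -> float:
    """IRD = improvement rate (intervention) - improvement rate (baseline)."""
    base_total = base_improved + base_not_improved
    int_total = int_improved + int_not_improved
    if base_total == 0 or int_total == 0:
        raise ValueError("each phase needs at least one data point")
    ir_baseline = base_improved / base_total      # improved / total, baseline
    ir_intervention = int_improved / int_total    # improved / total, intervention
    return ir_intervention - ir_baseline


# Hypothetical phase contrast: 1 of 10 baseline points and 18 of 20
# intervention points are counted as improved.
ird = improvement_rate_difference(1, 9, 18, 2)
print(round(ird, 2))  # 0.8
```

A value near 1 indicates almost no overlap between phases, and a negative value indicates a higher improvement rate in baseline than in intervention.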

Robust IRD is equal to Phi, which is a respected Pearson correlation for a 2 × 2 table, as well as Cohen’s Kappa and Cramer’s V (Parker et al. 2011). By using software to run the analyses, one can obtain confidence intervals (CIs) and p values. Robust IRD has also been applied in single-case meta-analyses (Ganz, Parker, and Benson 2009; Vannest et al. 2010a; 2010b). Parker et al. (2009) loosely proposed criteria of Robust IRD scores at .50 and below as very small or questionable, .50 to .70 as moderate effects, and .70 and greater as large and very large.

Robust IRD can be confounded by positive baseline trend (Parker et al. 2009). Thirty data sets from this meta-analysis were randomly selected for visual analysis to determine if positive baseline trends may skew the results. Fewer than 5 % of the 30 data sets had a positive baseline trend. This meta-analysis therefore utilized Robust IRD for all effect size measures. Robust IRD was calculated by contrasting baseline with intervention phases for each single-case design. The major designs used in the studies reviewed were multiple baseline designs and ABAB. For all comparisons, this meta-analysis compared the first baseline with the first phase of intervention (A1 to B1). In the case of multiple baseline designs, the data from each level of the design were analyzed by comparing the baseline to the first phase of intervention. In the case of ABAB designs, A1 was compared to B1. The remaining data (A2 and B2 in the case of a withdrawal design) were excluded due to the occasional use of ABAB designs embedded within multiple baseline designs to demonstrate maintenance rather than reversal. Additionally, comparisons from baseline to generalization or maintenance were computed to ensure that all relevant data were accounted for within each potential moderating variable.

Robust IRD scores were combined to determine the effectiveness of FCT overall. Robust IRD calculations were also combined according to each level of the potential moderating variables to answer the questions posed in this research study. The data were processed using Number Cruncher Statistical Software (NCSS; Hintze 2002), a common statistical analysis program. NCSS has a built-in meta-analysis algorithm that is able to calculate an average ES. It does this by applying weights to each study’s ES based on the inverse of the standard error.
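The weighting idea can be illustrated with a short sketch. It assumes the conventional fixed-effect weights of 1/SE² (inverse variance); the exact weighting applied by NCSS is not specified here and may differ, and the function name and example values are hypothetical.

```python
import math


def fixed_effect_mean(effect_sizes, standard_errors):
    """Weighted mean ES with inverse-variance weights (an assumed scheme)."""
    if not effect_sizes or len(effect_sizes) != len(standard_errors):
        raise ValueError("need one positive standard error per effect size")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / (se ** 2) for se in standard_errors]
    total_weight = sum(weights)
    mean = sum(w * es for w, es in zip(weights, effect_sizes)) / total_weight
    se_mean = math.sqrt(1.0 / total_weight)  # SE of the combined estimate
    return mean, se_mean


# Hypothetical inputs: the more precise study (smaller SE) pulls the
# combined estimate toward its own effect size.
mean, se_mean = fixed_effect_mean([0.90, 0.60], [0.05, 0.10])
print(round(mean, 2), round(se_mean, 3))
```

Under this scheme a study with half the standard error receives four times the weight, so precise comparisons dominate the combined score.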

Fixed Effect Size Model

A fixed effect size model was used when calculating the Robust IRDs because it is reasonable to assume that there is one true effect that can be determined through a review of the existing data (Borenstein et al. 2009). In a fixed effects model, all error is due to sampling and with an infinite number of samples the true effect can be found (Borenstein, Hedges, Higgins, and Rothstein 2009). Each study included in this meta-analysis applied the same treatment, FCT. In each study, the goal of the intervention was to decrease challenging behavior by increasing appropriate communicative responses. If FCT is an effective intervention, there should be one true effect observed in every study that utilized FCT. This meta-analysis was interested in determining the true effect of FCT, and therefore the fixed effects model was used.

Determining Statistical Significance

Each level of the potential moderator variable was compared to determine if there were differential effects between the identified levels. Statistically significant (p = .05) differences were determined by comparing the CI for each group within the moderator by setting the CI to 84.3 %. Results were considered statistically significant (p = .05) if the CI for each measure did not overlap at the upper and lower limits (Payton et al. 2000; Payton, Greenstone, and Schenker 2003; Schenker and Gentleman 2001). If the data revealed statistically significant differences between the levels, the variable was confirmed as a moderator because the levels differentially affected the intervention.
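The overlap rule can be sketched as follows. The sketch assumes a normal approximation for the interval (estimate ± z × SE), which may not match the interval method used by the software, and the standard errors in the example are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def confidence_interval(estimate: float, standard_error: float,
                        level: float = 0.843) -> tuple:
    """Two-sided normal-approximation CI at the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.415 for 84.3 %
    return (estimate - z * standard_error, estimate + z * standard_error)


def intervals_overlap(a: tuple, b: tuple) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical standard errors for two levels of a moderator.
level_one = confidence_interval(0.83, 0.02)
level_two = confidence_interval(0.64, 0.03)

# Non-overlapping 84.3 % intervals are read as a significant difference.
print(not intervals_overlap(level_one, level_two))
```

Using intervals narrower than 95 % for the comparison keeps the overall error rate of the overlap test close to .05, which is the rationale given in the sources cited above.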

Forest Plots

Forest plots were used to compare the consistency of the results within the levels of a potential moderating variable and help identify outlier data points that may skew the overall IRD score for each level. When ES measures are closely grouped with a majority of the CI overlapping, it can be stated that the results for that level are consistent and therefore the combined Robust IRD score is a reliable measure of the true effectiveness of FCT. A forest plot visually displays the individual ES and CI for each comparison (Lewis and Clark 2001; Parker et al. 2009). The highest possible robust IRD score is 1, which occurs when there is no overlap between the two phases. A negative IRD score reveals that there was more improved data in the baseline phase than in the intervention phase.

Inter-rater Reliability

To ensure that IRD calculations were reliable, 35 % of the total IRD calculations were conducted by two raters. Inter-rater agreement was determined by dividing the number of times the raters entered the same number in each cell within the 2 × 2 tables by the total number of cells in all 2 × 2 tables combined. There were 147 total comparisons within and between all the studies. Therefore, 51 IRD 2 × 2 tables were completed by two raters for inter-rater reliability. The tables were compared prior to adding the improved data points and dividing them equally between the two improved quadrants in the table to ensure that the data were accurate prior to manipulation. The inter-rater reliability score was 85.3 %. This score exceeded our minimum of 80 % and was thus high enough to proceed with analyzing the data to determine the effectiveness of FCT overall and across different moderators.
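The agreement calculation described above amounts to a cell-by-cell comparison. The sketch below is illustrative only: the function name, the tuple layout of each table, and the example counts are hypothetical.

```python
def cell_agreement(tables_rater_a, tables_rater_b) -> float:
    """Proportion of 2 x 2 table cells in which both raters entered the same number."""
    if len(tables_rater_a) != len(tables_rater_b):
        raise ValueError("both raters must complete the same set of tables")
    matches = 0
    total_cells = 0
    for table_a, table_b in zip(tables_rater_a, tables_rater_b):
        if len(table_a) != len(table_b):
            raise ValueError("tables must have the same number of cells")
        for cell_a, cell_b in zip(table_a, table_b):
            total_cells += 1
            if cell_a == cell_b:
                matches += 1
    if total_cells == 0:
        raise ValueError("no cells to compare")
    return matches / total_cells


# Each table is (baseline improved, baseline not improved,
#                intervention improved, intervention not improved).
rater_a = [(1, 9, 18, 2), (0, 6, 10, 1)]
rater_b = [(1, 9, 18, 2), (0, 6, 9, 1)]   # one cell differs
print(cell_agreement(rater_a, rater_b))   # 0.875
```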

Results

The IRD 2 × 2 tables for each comparison within a study were combined, and then NCSS was used to calculate Robust IRD to determine overall IRD for each study. The combined overall IRD for FCT was 0.86 (CI = 0.85–0.87), which is considered a large effect (Parker et al. 2009). The IRD scores and CIs for each study are fairly widespread. Peck Peterson et al. (2005) was an outlier, with the lowest IRD score of −0.25. The follow-up data negatively impacted the IRD results for this study, because the challenging behavior was more severe than in the baseline condition. Mancil et al. (2006) had the highest IRD score of 0.96. All other scores were fairly well spread between these upper and lower numbers. Of the thirty-nine studies included in this meta-analysis, 54 % of the studies (n = 21) fell within the large to very large range in overall ES. Only 15 % of the studies (n = 6) fell in the very small or questionable range of .50 and below (Fisher et al. 1993; Hagopian et al. 2004; Harding et al. 2009b; Kelley et al. 2002; Peck Peterson et al. 2005; Winborn-Kemmerer et al. 2010).

Mode of Communication

Modes of communication were coded as Verbal, A-AAC, U-AAC, or Multiple. Of the 147 analyses, 34 % were coded as Verbal (n = 49), 43 % were coded as A-AAC (n = 63), 27 % were coded as U-AAC (n = 31), and only 1% was coded as Multiple (n = 2). Because there were only two ES for Multiple, the data were not analyzed for this level. Figure 1 provides a forest plot of the combined ES measures for each level. The CI for Verbal does not overlap with the CIs for either A-AAC or U-AAC; therefore, FCT had significantly larger effects when using verbal modes of communication rather than A-AAC and U-AAC. FCT implemented with both Verbal and A-AAC communication modalities had large to very large effects on participant outcomes. The results for U-AAC and A-AAC are statistically different because CIs for these two levels do not overlap; thus, A-AAC methods had significantly higher effects than U-AAC methods when implemented during FCT. Based on these results, mode of communication is a moderating variable for FCT.

Fig. 1 Robust improvement rate difference for mode of communication

Participant Age

The ages for each participant were coded into Primary, Elementary, Secondary, and Adult. The studies included a total of 87 participants; however, Fisher et al. (1993) did not report the ages of the four participants in their study, so those participants are excluded from this analysis. Twenty-eight percent of the participants were in the Primary age group (n = 22). Forty percent of the participants were Elementary age (n = 35). Only 16 % of participants were in the Secondary age group (n = 14) and 16 % were Adults (n = 12).

Figure 2 contains the results for FCT when applied to the different age groups. All of the ESs for each age category fall within the range of moderate or large effects. Individuals in the Primary age range had the highest IRD result of 0.83, and the Adult age range had the lowest IRD score of 0.64. The CIs for individuals in the Adult and Primary age ranges do not overlap, as seen in Fig. 2; therefore, FCT has a significantly higher effect for individuals in the Primary age range than for Adults. The Secondary (.78) age group had a large effect size, and its CI overlaps at the uppermost end with that of the Primary age range; FCT therefore appeared to be equally effective with Primary and Secondary age individuals. The CI for Elementary (.76) ages does not overlap with that of the Primary age group, so there is a significant difference between the two levels. The CIs for the Secondary and Adult age groups do not overlap, indicating a statistically significant difference, with FCT having significantly higher effects for individuals in the Secondary age range. Individuals in the Secondary and Elementary age ranges are very close in ES and their CIs overlap. In fact, the scores for individuals in the Elementary age range fall completely within the CI for individuals in the Secondary age range, so FCT appears to be equally effective when administered with either age range. The final comparison is between individuals of Elementary age and Adults. The Elementary age group had a larger ES than the Adult level. FCT had significantly higher effects for individuals in the Elementary age range than for Adults.

Fig. 2 Robust improvement rate difference for age of participants

Disability Category

Disability was divided into two levels, autism (AU) and intellectual disability (ID). Other disabilities were excluded from further evaluation due to heterogeneity within the category. Of the phase contrasts analyzed, 65 % of the analyses included individuals with AU (n = 40 participants, 84 phase contrasts) and 35 % included individuals with ID (n = 32 participants, 45 phase contrasts). Figure 3 is the forest plot for the combined results for each level to aid in visual analysis of the data.

Fig. 3 Robust improvement rate difference for disability

The effect size for individuals with AU (.79) was higher than for individuals with ID (.64). The CI for individuals with AU does not overlap with the CI for individuals with ID, so FCT had a significantly higher effect when implemented with individuals with AU rather than individuals with ID.

Discussion

Determining FCT’s effectiveness in reducing challenging behavior was the first question posed in this meta-analysis. The following questions were also posed: (a) is FCT differentially more effective for particular communication modes (unaided-augmentative and alternative communication, aided augmentative and alternative communication, or verbal); (b) how effective is FCT with individuals with challenging behavior, differentiated by age range; and (c) how effective is FCT with individuals with challenging behavior, differentiated by disability category?

Overall, the results of this meta-analysis support the findings of the NPDC-ASD (2009) in listing FCT as an evidence-based practice. This study found that FCT appears to be highly effective in decreasing challenging behavior. While FCT is an evidence-based practice, this meta-analysis enhanced previous work by evaluating variables that moderate the effectiveness of FCT.

Based on the results of this study, verbal modes of communication appear to be the most effective mode of communication, followed by aided AAC. Unaided AAC fell into the small or questionable range. Mode of communication should be selected based on the individual’s ability to use the communicative response (Durand 1990). A majority of the individuals using speech as a mode of communication did not have ID as either a primary or secondary diagnosis. The strong results for individuals using verbal responses may be tied to the cognitive level of the participants. Unaided AAC was most often utilized with individuals with ID as a primary diagnosis, and therefore, the results may be skewed by the participants’ cognitive abilities. Lower cognitive ability may have impacted the individual’s ability to learn the new communication skill and thereby decrease the effectiveness of the intervention. The studies included in this meta-analysis did not contain specific information about the participants’ cognitive functioning such as assessment scores; therefore, more research is needed.

There has been much debate over whether unaided AAC is more effective than aided AAC, specifically comparing sign language to the Picture Exchange Communication System, for individuals with autism (Gevarter et al. 2013; Schlosser and Sigafoos 2006; Tincani 2004). Tincani (2004) found that PECS was more effective for one participant, whereas sign language (unaided AAC) was more effective for the other participant. Individual learning preferences and learning styles may have impacted the results for Tincani (2004). Learning preference and learning styles should be taken into consideration in the planning phase of FCT. Therefore, cognitive ability may be the best possible explanation for the difference between the levels. In order to determine if the difference between the levels was truly due to the mode of communication, all other variables would need to be consistent between each level. Of those participants who were taught a means of communicating via AAC, approximately half had ASD, 28 % had ID, and 23 % were categorized as Other. This may indicate a more frequent selection of AAC for people with ASD, although it is unclear due to the limited numbers. A review of all communication interventions to determine the most commonly selected communication mode given particular populations would be illuminative. Limited research has been conducted to compare aided to unaided AAC within a single study, as noted in a recent literature review (Gevarter et al. 2013). Although recent work in this area has been conducted (van der Meer, Sigafoos, O’Reilly, and Lancioni 2011), it was not in the context of FCT interventions.

This meta-analysis indicates that verbal responses and aided AAC result in stronger effects than unaided AAC. However, these results should be viewed with caution based on the discrepancy between the cognitive levels of the participants across the modalities. Interventionists should always take into consideration the needs and learning rates of the individuals when determining the mode of communication.

Effects of FCT were also examined based on the age of the participants. FCT had the largest effect on challenging behavior for individuals at the primary age level. The results for primary-aged individuals were not statistically different from those for individuals in the secondary age group. Effects for primary, elementary, and secondary participants were each significantly larger than those for adult participants. There was also a statistically significant difference between primary and elementary participants. It is possible that FCT appeared to be less effective for adults due to a lengthy history of reinforcement for challenging behaviors, making those behaviors more resistant to extinction. Thus, in actuality, FCT may well be warranted for all age ranges, particularly given that effects were at least moderate for all groups.

Federal legislation mandates early intervention because it is the most effective means of changing an individual’s quality of life (Anderson et al. 2003; Individuals with Disabilities Education Act Amendments of 2004; Love et al. 2005; Ramey and Ramey 1998; Ramey et al. 2000). Children at the primary age are still building appropriate communication skills. As an individual gets older, communication skills may be more difficult to develop. Ganz et al. (2011, 2012) found that aided AAC was more effective with individuals at younger ages. The current study is consistent with the view that communication skills may be easier to learn at younger ages. This may have led to FCT being more effective at earlier ages. As an individual gets older, FCT can still be effective, but the effects may not be as strong. However, results of this meta-analysis should be interpreted with caution because they may be skewed due to the small number of adult participants.

The final variable of interest was disability. There was a significant difference between the effectiveness of FCT with individuals with autism versus individuals with ID and those with other disabilities. Students with autism and ID may have impaired communication skills (Heflin and Alaimo 2007; Pinborough-Zimmerman et al. 2007). When communication skills are impaired, individuals are more likely to exhibit challenging behavior (Carr 1985; Carr and Durand 1985; Neel et al. 1983; Reichle and Yoder 1979). FCT focuses on improving communication skills in an attempt to decrease challenging behavior (Durand 1990). The cognitive ability of the individuals with ID may have impacted their ability to obtain and use the new communicative skill. Individuals with ID may have impaired communicative ability as well as impaired cognitive ability. This is not true for the individuals with autism in this study. While some of them had a dual diagnosis of autism and ID, the majority of the studies did not identify a dual diagnosis; therefore, there was potentially a difference between the two levels. This difference was dependent on the individuals being correctly diagnosed. A majority of the articles did not report intellectual assessments, so there was no way to confirm if there was a difference between the individuals with autism and individuals with ID. Alternatively, FCT may be better suited for people with ASD than those with ID who did not also have ASD; thus, future research should investigate efficacy of FCT depending on individual characteristics. This would require single-case researchers to more effectively assess and report characteristics of their participants.

This study was limited in that not all of the levels were equally populated. A small n for any level allows outlier data points to have a stronger impact on the results. This can cause the CIs to be wider. Statistical significance is determined by overlap of the CIs (Payton et al. 2000; Payton, Greenstone, and Schenker 2003; Schenker and Gentleman 2001). Narrower CIs could reduce overlap and therefore affect statistical significance. This study was also limited by the information provided in the original studies. Cognitive and communicative ability were not precisely reported in any of the studies. These variables may have impacted the effectiveness of FCT within each level. Additionally, few of the studies described methods of assessing for selection of particular communication modes. That is, they generally described each participant’s level of verbal abilities, but did not report an assessment tool or strategic approach to selecting the mode of communication. This is a limitation that should be addressed in future research, particularly in terms of selecting an AAC mode, although the AAC literature remains unclear regarding evidence-based means of doing so (Ganz 2014).
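
The relationship between level size, CI width, and overlap described above can be illustrated with a minimal sketch. It assumes a simple normal-approximation interval around a mean, which is not necessarily the interval used in this meta-analysis; the function names `mean_ci` and `intervals_overlap` and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def mean_ci(mean, sd, n, z=1.96):
    """Normal-approximation CI for a mean: mean +/- z * sd / sqrt(n).
    Assumption for illustration only; not the study's actual CI method."""
    half_width = z * sd / math.sqrt(n)
    return (mean - half_width, mean + half_width)

def intervals_overlap(a, b):
    """True when two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical values, not results from this study.
well_populated = mean_ci(0.80, 0.20, 100)   # many cases: narrow interval
sparse_level = mean_ci(0.60, 0.20, 4)       # few cases: wide interval
same_mean_more_n = mean_ci(0.60, 0.20, 100) # same mean as sparse level, more cases

# The sparse level's wide CI overlaps the well-populated level's CI, so the
# difference would not be judged significant under the overlap rule.
print(intervals_overlap(well_populated, sparse_level))

# With more cases the CI narrows and no longer overlaps.
print(intervals_overlap(well_populated, same_mean_more_n))
```

In this sketch the two compared levels have identical means and standard deviations in both comparisons; only the number of cases changes, which is enough to change whether the intervals overlap.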

This study confirms the findings of the NPDC-ASD (2009) that listed FCT as an evidence-based practice. This meta-analysis also confirms the conclusions of prior literature reviews that categorized FCT as an effective intervention (Mancil 2006; Kurtz et al. 2011). The NCLB (No Child Left Behind Act of 2001) and IDEA (2004) laws push schools to use evidence-based practices as their primary strategies. FCT is an individualized intervention that is effective at decreasing challenging behavior as well as teaching a more appropriate replacement behavior. Challenging behavior can lead to teacher turnover and more restrictive settings for the individual displaying the challenging behavior (Hastings and Brown 2002; Lowe et al. 2007; Machalicek et al. 2007). Finding reliable interventions that can decrease the challenging behavior should be a high priority for interventionists.

This meta-analysis suggests several avenues for future research. Overall, FCT was found to be highly effective in decreasing challenging behavior. More research is needed to determine if cognitive ability or communicative ability impacts the effectiveness of FCT. For this to be addressed, research studies need to include information for each participant regarding cognitive and communicative ability based on standardized assessments. A limited number of studies included adults and individuals at the secondary age range. Kurtz et al. (2011) found similar results in that the adult age range was limited in high-quality studies and therefore considered probably efficacious. The lack of participants in the secondary and adult levels is a limitation within the field of FCT research as a whole. This could be due to the fact that public schools provide easy access to research participants. Once individuals are no longer in public schools, it may be harder to find participants. However, research with adults is greatly needed, particularly as individuals with disabilities age and still require services. Further, investigation and comparison of the efficacy of particular components of FCT is needed. For example, investigation of the comparative efficacy of brief FAs versus complete FAs and FBAs should be conducted to provide practitioners with potential means of efficiently implementing this intervention. Finally, as generalization and maintenance of learned skills are critical, future single-case researchers must collect more of these data, both in baseline and in intervention phases and at some length beyond intervention. Doing so would enable aggregation of results. Most of the studies in this review provided no generalization data in baseline and only 1–2 data points in any phase, making aggregation of results difficult due to a small sample size.