A Comparison of Variations of Prompt Delay During Instruction on an Expressive Labeling Task

Variations in prompt delay procedures are used in discrete-trial training to reduce the occurrence of errors before task mastery. However, the variations are seldom compared systematically. Using an adapted alternating treatments design, the present study compared progressive prompt delay with 2-s and 5-s constant prompt delay on the acquisition of an expressive labeling task in four participants with autism spectrum disorder and intellectual disability. While all three prompt delay methods led to mastery of the tasks, albeit only when the tasks were simplified for one participant, progressive prompt delay generally proved the most efficient method on several measures, including lower error rates. This is consistent with the nature of the progressive prompt delay procedure, which allows less time for errors to occur early in training. It is provisionally concluded that progressive prompt delay is a sound first choice for clinicians, as a history of high error rates may impair later learning.


Introduction
Common problems encountered during instruction can relate to behavioral skills deficits, low motivation to learn, poor stimulus control, limited generalization, or behavioral excesses. Such problems are barriers to learning that can negatively impact the independence of individuals with autism spectrum disorder (ASD) and/or intellectual disability (ID), restricting their ability to achieve meaningful outcomes (Hanley, Iwata, & McCord, 2003; Odom & Strain, 2002). Prompting strategies are integral components of instructional programs for individuals from these populations. An instructor may use prompts to evoke a desired response during initial instruction. After the learner has met a pre-determined level of accuracy, the instructor will then systematically fade the prompts to facilitate independent responding by the individual. There are many types of prompt-fading procedures available to instructors (see Green, 2001, for a review). Prompt delay (Coleman-Martin & Heller, 2004; Halbur, Kodak, Wood, & Corrigan, 2019; Heal, Hanley, & Layer, 2009; O'Neill, McDowell, & Leslie, 2018; Reichow & Wolery, 2011) uses a delay interval to 'fade out' or transfer stimulus control from the prompt to the natural discriminative stimulus (S^D) intrinsic to the skill under instruction (Snell, 1982; Snell & Gast, 1981; Touchette, 1971; Touchette & Howard, 1984), and allows for independent responding to emerge.
Conditional discriminations are essential in functional and academic skills commonly taught in behavioral interventions for individuals with ASD/ID (Fisher et al., 2019; Green, 2001). Conditional discriminations contain four components: (1) a contextual or conditional stimulus; (2) a discriminative stimulus (S^D) indicating that reinforcement is available given a response; (3) a response; and (4) a consequence. This type of training requires learners to respond differentially to non-identical but related stimuli when accompanied or preceded by a certain antecedent stimulus. An example is the instructor holding up a blue card (conditional stimulus) to a learner and asking 'what's this?' (S^D), the learner saying 'blue' (response), and verbal praise, 'well done', being delivered as a consequence.
Prompting is often necessary for individuals with ASD/ID as verbal instructions may be insufficient to evoke a target response. Prompt delay procedures are adaptable and may be used in conditional discrimination training. The initial set of trials in a prompt delay procedure begins by providing a conditional stimulus followed by presentation of the S^D along with an added prompt to evoke the target response. This is known as zero-second or simultaneous prompting and functions to reduce the probability of a learner error. Subsequently, the prompt is delayed by an interval of time, allowing for an independent correct response to occur. Progressive and constant prompt delay procedures differ at this point (Walker, 2008). In progressive prompt delay, the prompt is delayed by small intervals of time that increase incrementally (e.g., 1 s, then 2 s, up to a maximum value, in this case 5 s) between instructional sessions, with increases in delay to the prompt being contingent on the learner's performance. Alternatively, in the constant prompt delay procedure the prompt is delayed by the same interval of time in each session that includes prompts. With either procedure, prompts remain until a mastery criterion is achieved.
The scheduled delay of a prompt allows time for the learner to make an independent response, but if this is not made, the prompt occurs. This is an important feature of prompt delay, as without these prompting sequences repeated errors may occur and may hinder subsequent learning when more effective procedures are introduced. Cengher et al. (2016), amongst others (Coon & Miguel, 2012; Etzel & LeBlanc, 1979; Roncati, Souza, & Miguel, 2019; Schilmoeller, Schilmoeller, Etzel, & LeBlanc, 1979), provide evidence for the role of instructional history in hindered skill acquisition. In their study, learning targets assigned to the control condition and least efficient instructional procedure (Least-to-most; LTM) were subsequently taught with the most effective and efficient procedure (Most-to-least; MTL). At this stage, three out of four participants acquired the targets previously assigned to the control condition, but no participant mastered targets previously taught with the least efficient condition. The Coon and Miguel (2012) and Roncati et al. (2019) studies provide evidence for increasingly efficient acquisition of instructional targets when following a recent history of instruction with those same prompt types, as compared with other prompt types. This lends support to the importance and influence of recent instructional history on subsequent instructional performance. In behavior-analytic practice, if an instructional strategy proves ineffective, then a different instructional procedure is implemented. These findings direct us to the importance of selecting procedures most likely to be effective and efficient first.
The efficacy of prompt delay is reported in several reviews (Cengher, Budd, Farrell, & Fienup, 2018; Cengher, Kim, & Fienup, 2019; Demchak, 1990; Handen & Zane, 1987; Walker, 2008; Wolery et al., 1992) and is supported as an evidence-based intervention in literature related to a wide range of problems and contexts (National Autism Center, 2009, 2015; Wong et al., 2015). However, reviews of this evidence tend to group progressive and constant prompt delay together, making broad recommendations for their use despite their procedural differences. This is true of the Wong et al. (2015) and National Autism Center's (2009) reviews, and is presumably due to the scarcity of published work that has compared variations of prompt delay directly (Ault, Gast, & Wolery, 1988; O'Neill et al., 2018). However, Libby, Weiss, Bancroft, and Ahearn (2008) did compare three prompting procedures, one of which involved delay. These were most to least (MTL) prompting, least to most (LTM) prompting, and MTL with a delay (MTLD). They found that acquisition for the three participants was nearly as rapid in MTLD as LTM, but MTLD produced fewer errors than LTM, with MTL producing the slowest acquisition. High error rates may lead to the emission of responses that function to escape difficult, error-prone tasks (Carr & Durand, 1985; Heckaman, Alber, Hooper, & Heward, 1998; MacDuff, Krantz, & McClannahan, 1993; Schilmoeller et al., 1979; Weeks & Gaylord-Ross, 1981) and reduce contact with reinforcement contingencies. A major objective of prompt delay is to move towards errorless learning, which has been shown to be possible with pigeons (Terrace, 1963), first-grade children (Robinson & Storm, 1978), children with ASD and/or ID (Ault et al., 1988; O'Neill et al., 2018), and adults with ID (Touchette, 1968, 1971). The argument is that an errorless procedure will ensure that contact is maintained with the contingencies in the early stages of training.
To date, only two published studies have directly compared variations of prompt delay. Ault, Gast, and Wolery (1988) used both an 8-s progressive prompt delay and a 5-s constant prompt delay procedure with three learners with moderate ID when learning a community sign-reading task. Both variations of prompt delay were effective, with the 5-s constant prompt delay procedure shown to be marginally more efficient than the progressive prompt delay procedure. Results did not conclusively favor one procedure over the other in this study, with replication recommended. More recently, O'Neill et al. (2018) attempted a partial replication, comparing three variations of prompt delay (2-s or 5-s constant prompt delay, and 5-s progressive prompt delay) with trial-and-error instruction on a receptive conditional discrimination task. A procedural modification, in the form of differential reinforcement, was added to prompt delay for two of the four participants. With or without this procedural modification, results suggested progressive prompt delay was effective and most efficient in reducing learner errors during instruction. Mixed outcomes are not uncommon in comparison studies, and may be due, in part, to methodological differences (Cengher et al., 2018; Wolery et al., 1992). Although one further study did compare constant with progressive prompt delay and reported that overall progressive prompt delay resulted in fewer errors, instructional time and sessions to criterion, this was an unpublished master's thesis (Thomas, 1989, cited in Wolery et al., 1992), and there is a lack of consensus between the two published comparison studies reviewed above. Given the importance of having evidence for selecting the most effective and efficient instructional procedure first (Cengher et al., 2016; Coon & Miguel, 2012; Etzel & LeBlanc, 1979; Roncati et al., 2019; Schilmoeller et al., 1979), a further comparison was made in the present study.
This study used an adapted alternating treatments design to compare three variations of the prompt delay procedure (2-s or 5-s constant prompt delay and 5-s progressive prompt delay), with a control condition, on measures of effectiveness and efficiency when teaching an expressive labeling task to learners with ASD and ID. The research questions were: (1) which of these conditions were effective? (2) which would prove most efficient in terms of trials to criterion, errors to criterion, and duration of instruction, with this client group?

Participants and Setting
The four participants, three males and one female, attended a special education school that was purpose built for learners with severe, profound and multiple learning difficulties of both primary and secondary school age. Participants attended school for 5.5 hr per day, 5 days per week, 9 months of the year. They ranged in age from 11.6 to 18.4 years and had Expressive Vocabulary Test-2 (EVT™-2) scores ranging from 3.11 to 9.9 years. The EVT-2 is a norm-referenced standardized test used to assess expressive language (Williams, 2007). For inclusion, participants had been independently diagnosed with an ASD and/or ID in the severe range, were able to attend to a table-top task for approximately 10 min, and could imitate an echoic verbal prompt within 3-5 s of delivery. Participants had not been exposed to the stimuli that were used for training (country flags), and had no history of training with delayed prompting procedures or other systematic prompting procedures commonly used in behavioral interventions. All participants were native English speakers. Consent to conduct this experiment was granted by the University's Research Ethics Committee, and informed consent and assent were obtained from the participants' parents and participants themselves, respectively. Table 1 contains participant information (name, gender, age, EVT-2, difference between age and EVT-2 score, and diagnosis). Experimental sessions were conducted in a small classroom adjoining the main classroom. This room was used by all students for one-to-one instruction with a teaching assistant from time to time as it provided minimal distractions from educational tasks. During experimental sessions, participants were seated at a table beside the experimenter, and no other students were present.

Materials
As the host school required educationally relevant materials to be used in the study, advice was taken from teachers about which instructional materials to use. Based on this, sets of 12 multi-colored country flags were devised. Flags were individually printed and laminated onto 16 cm x 12 cm flash cards. Four sets of three flags were randomly assigned to the experimental conditions described below, with assignments differing across participants. A logical analysis (Gast, 2009)  Presentation order was determined according to a quasi-randomized sequence determined by a standardized data-recording sheet. Pre-determined reinforcers, data collection sheet, and a token board were used during experimentation. The token board enabled 9 tokens, shaped as footballs, to be attached with Velcro when earned by a participant.

Dependent Measures
Responses were recorded in one of five possible ways at the conclusion of a trial.
Independent correct or incorrect responses were scored dichotomously. These were defined as the participant emitting a verbal response that closely matched the name of the sample stimulus (+; country flag) or a verbal response that did not correspond to the sample stimulus (-), respectively.
Prompted correct responses (+p), or incorrect prompted responses (-p) were defined in the same manner as outlined above with the inclusion of an echoic prompt delivered prior to the emission of a response. Failures to respond at all were recorded (NR = no response). Errors were also recorded but had no programmed consequences and signaled the end of a trial.
Direct comparisons between instructional conditions were made using effectiveness and efficiency data. Effectiveness was defined as an instructional condition producing responding to mastery criterion level. Mastery criterion was defined as eight or nine (≥89%) independent or prompted correct responses (+ or +p) made on three consecutive sessions inclusive of one 'no prompt' post-test session. A post-test session was conducted in the same manner as baseline (see below). Due to criteria used to move between prompting levels across sessions, prompting could only have occurred in the first of three consecutive sessions used to assess for mastery. Efficiency measures included: number of training trials, number of errors, percentage of errors to criterion, and instructional duration. Duration was a measure of time taken to carry out an instructional condition from beginning until mastery was attained. Recording began immediately prior to the experimenter gaining eye contact with the participant and ended when the last token was placed on the token board, signaling the end of the session.
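The mastery criterion described above can be expressed as a simple check over session records. The following is a minimal sketch only: the names `Session`, `correct_count`, and `mastery_reached` are hypothetical, sessions are assumed to contain nine trials coded with the symbols from Dependent Measures, and "inclusive of one post-test session" is read here as "at least one of the three sessions is a post-test", which is an interpretation rather than a documented rule.

```python
from dataclasses import dataclass
from typing import List

# Codes counted as correct under the mastery definition:
# "+" = independent correct, "+p" = prompted correct.
CORRECT_CODES = {"+", "+p"}


@dataclass
class Session:
    """Hypothetical record of one nine-trial session."""
    responses: List[str]      # one code per trial: "+", "-", "+p", "-p", or "NR"
    post_test: bool = False   # True for a 'no prompt' post-test session


def correct_count(session: Session) -> int:
    """Number of independent or prompted correct responses in a session."""
    return sum(code in CORRECT_CODES for code in session.responses)


def mastery_reached(sessions: List[Session],
                    min_correct: int = 8,
                    run_length: int = 3) -> bool:
    """Check the most recent `run_length` sessions against the criterion.

    Assumption: the run must contain at least one post-test session.
    """
    if len(sessions) < run_length:
        return False
    recent = sessions[-run_length:]
    all_high = all(correct_count(s) >= min_correct for s in recent)
    has_post_test = any(s.post_test for s in recent)
    return all_high and has_post_test
```

Under this reading, three consecutive sessions at 8/9 or better do not count as mastery unless one of them is a post-test.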

Experimental Design
A within-subject adapted alternating treatments design was used to examine four experimental conditions simultaneously, targeting non-reversible behaviors, with a focus on delineating relative efficiencies (Sindelar, Rosenberg, & Wilson, 1985). This allowed for a direct comparison using effectiveness and efficiency measures across baseline and instructional phases.

Procedure
Pre-baseline assessment. Prior to commencement, potential reinforcers were initially determined through teacher nomination and then by preference assessments of participant choice using a multiple stimulus without replacement protocol (DeLeon & Iwata, 1996). Potential reinforcers used in the preference assessments included a range of edibles and time in a soft play area.
Additionally, all participants underwent a screening of their ability to verbally imitate an echoic prompt (country names) from a list of potential training stimuli for use in the expressive labeling task.
A list of 30 countries' names, each with a maximum of three syllables, was compiled. Stimuli successfully imitated by the participant were retained within a bank of potential training stimuli, and those that were not imitated were excluded. The verbal country name and flag were never paired at this stage.
Baseline. One or two baseline sessions, consisting of nine trials per session, occurred for each of the stimulus sets to ensure training materials were novel and equally difficult (this was established as indicated by 0% correct responding reported in Results). A baseline trial consisted of a sequence of nine components. The sequence was: (a) establish eye contact; (b) hold up sample stimulus; (c) experimenter says "touch this"; (d) participant touches sample card with index finger (the differential observing response); (e) experimenter says "what's this?"; (f) await learner response (up to 8 s); (g) provide contingent reinforcement if appropriate (in practice, there were no correct responses in these sessions); (h) remove materials; and (i) observe a 3- to 5-s inter-trial interval.
Instruction. An instructional trial included 10 components: (a) establish eye contact; (b) hold up sample stimulus; (c) experimenter says "touch this" (d) secure a differential observing response;  Table 2. All instructional conditions began with zero-second prompting. A zero-second delay level trial contained components (a) through (j), outlined above. At step (f), following the experimenter saying, "what's this?", the experimenter immediately provided an echoic prompt (S+). Thereafter, instructional conditions differed. In 5-s PPD the prompt delay was increased (across sessions) by 1-s increments up to a maximum of 5 s. For example, at level 1 at step (f) the experimenter waited 1 s before delivering an echoic prompt, but at level 2 a 2 s delay was used, and so on. In 5-s CPD a constant delay of 5 s was used at that point throughout training. In 2-s CPD, a constant delay of 2 s was used throughout training. The control condition was an extension of baseline. No prompting was provided, but token reinforcement was available on an FR1 schedule for all correct independent responses. Instruction continued until the mastery criterion was met for each condition. When mastery had been achieved for a condition, instruction sessions continued with the remaining conditions. Once mastery had been reached for all three prompt conditions, instruction ceased. Maintenance probes were conducted at 2 and 4 weeks for one participant. These sessions followed the same protocol as that outlined above for baseline.
Prompting levels. Criteria to move between the prompting levels of a condition were as follows: If eight or nine independent or prompted correct responses occurred in one session, the level was increased for the next session, and if two consecutive errors, or a total of three or more errors, occurred in one session, the prompt level was decreased for the next session. A 'no prompt' post-test session followed the protocol described for baseline.
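The level-change rule above can be sketched as a small function, shown here for the 5-s PPD condition where the level is taken to equal the delay in seconds (0 = zero-second prompting). The function names are hypothetical. Two points are assumptions rather than stated in the text: only "-" and "-p" are counted as errors (how "NR" was treated is not specified), and the level is simply capped at the maximum rather than modelling the move to a post-test session.

```python
from typing import FrozenSet, Sequence

CORRECT_CODES = frozenset({"+", "+p"})


def has_consecutive_errors(responses: Sequence[str],
                           error_codes: FrozenSet[str]) -> bool:
    """True if any two adjacent trials were both scored as errors."""
    return any(a in error_codes and b in error_codes
               for a, b in zip(responses, responses[1:]))


def next_delay_level(level: int,
                     responses: Sequence[str],
                     max_level: int = 5,
                     error_codes: FrozenSet[str] = frozenset({"-", "-p"})) -> int:
    """Return the prompt delay level for the next session.

    Increase after eight or nine correct responses; decrease after two
    consecutive errors or three or more errors in total; otherwise stay.
    Assumption: "NR" is not counted as an error by default.
    """
    correct = sum(r in CORRECT_CODES for r in responses)
    errors = sum(r in error_codes for r in responses)
    if correct >= 8:
        return min(level + 1, max_level)
    if errors >= 3 or has_consecutive_errors(responses, error_codes):
        return max(level - 1, 0)
    return level
```

With nine trials per session the two conditions cannot both hold, so the order of the checks does not affect the result. A session with seven correct responses and two non-adjacent errors leaves the level unchanged.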
Inter-observer agreement (IOA) and procedural integrity (PI). Observers independent of this research retrospectively analyzed IOA and PI (Billingsley, White, & Munson, 1980). This was done in 33% of sessions, across participant conditions. Prior to IOA and PI analysis, observers underwent training where they reviewed 3-5 recorded experimental practice sessions with the first author. The first author scored each session pointing out the operationally defined steps contained in each. Observers then watched and scored practice sessions independently until they scored above 89% accuracy across two consecutive training sessions before moving on to scoring experimental sessions. The point-by-point method was used to calculate IOA for responses recorded (Ayres & Gast, 2009). This was done by dividing the number of agreements by the number of agreements plus disagreements and multiplying by 100. Agreement averaged 98.6% (range 77-100%) across participants. PI was calculated by dividing the number of completed instructional trial components by the number of planned trial components and multiplying by 100. Average PI score was 93.6% (range 33-100%) across participants. The wide range (and slightly lower average) of PI scores came about because of disagreements in scores for one participant.
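The two percentages described above follow directly from their definitions. The sketch below is illustrative only; the function names and the example data are hypothetical and are not the study's records.

```python
from typing import Sequence


def point_by_point_ioa(observer_a: Sequence[str],
                       observer_b: Sequence[str]) -> float:
    """Point-by-point agreement: agreements / (agreements + disagreements) x 100."""
    if len(observer_a) != len(observer_b) or not observer_a:
        raise ValueError("Both observers must score the same non-empty set of trials")
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    disagreements = len(observer_a) - agreements
    return 100 * agreements / (agreements + disagreements)


def procedural_integrity(completed_components: int,
                         planned_components: int) -> float:
    """Completed trial components / planned trial components x 100."""
    if planned_components <= 0:
        raise ValueError("planned_components must be positive")
    return 100 * completed_components / planned_components


# Hypothetical nine-trial session scored by two observers (one disagreement).
scores_a = ["+", "+", "+p", "-", "+", "+", "NR", "+", "+"]
scores_b = ["+", "+", "+p", "-", "+", "+", "-", "+", "+"]
ioa = point_by_point_ioa(scores_a, scores_b)   # 8 agreements out of 9 trials
```

Because PI is computed over trial components, a single session with many missed components can produce a very low score, which is consistent with the wide range reported above.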

Results
The main study aims were to determine the effectiveness and relative efficiency of each instructional condition. Effectiveness will be reviewed first, and then various efficiency measures. As shown in Figure 1, all three instructional conditions were effective at producing mastery level performance in correct independent responses for all four participants. As this was an expressive language task, it was not surprising that on this measure all participants were unable to name any of the flags and scored zero in baseline sessions, indicating potential equal difficulty of the stimulus sets used. Once instruction started, performance improved in all three prompt delay conditions but not in the control condition. For three participants, mastery was reached soonest in the 5-s PPD condition.
For the other participant, the task was simplified after session 53 because of slow progress, and then mastery was reached soonest in the 2-s CPD condition.
Looking at effectiveness with individuals, Figure 1 shows that for Seamus acquisition was rapid for the 5-s PPD condition with an immediate rise to mastery criterion level (8/9) in Session 3.
Although this dropped to 7/9 in Session 4, it recovered in Session 5 to 9/9 (100%), maintaining this level for two additional sessions, inclusive of a post-test, to achieve mastery. 5-s PPD ranked first, with mastery criterion attained in 63 training trials. The 5-s and 2-s CPD conditions followed, ranked second (90 trials) and third (117 trials), respectively. At this point, maintenance probes were run for all three prompt delay conditions. In all, 117 trials were run in the control condition with 108 (92.3%) errors. Interestingly, the control condition encountered a threat to internal validity. During the latter quarter of the experiment, a different classroom of the school displayed a map of the world in which one of the country flags assigned to the control condition was shown; the country's name was taught there and Seamus learned it, resulting in three independent correct responses over the final three consecutive sessions. Following discovery of the classroom display, the experimenter confirmed that no other stimuli had been compromised and that no further flag training would occur before the conclusion of the study.
Cian also achieved mastery in all three instructional conditions. Following baseline, a steady upward acquisition trend was observed across all his conditions with no return to zero-second prompting following the first session. The 5-s PPD condition was ranked first with mastery achieved in 108 training trials; ranked second was the 2-s CPD condition with 117 training trials; ranked third was the 5-s CPD condition with 144 trials. 144 trials were run in the control condition with 144 (100%) errors.
As with the other participants, Cahil initially underwent training with stimulus sets containing three flags assigned to each condition, but this participant's performance was characterized by considerable variability throughout training. Following 99 trials with no clear indication of an upward trend towards acquisition, training was altered: stimulus training sets were reduced from three flags to two, and training continued. This is denoted with a condition change line in Figure 1. Following this change, a clear upward trend towards mastery criterion was seen in all of his conditions. Ranked first was the 2-s CPD condition with 45 trials; second was the 5-s PPD condition with 54 training trials; third was the 5-s CPD condition with 108 training trials. At two weeks post-acquisition, Cahil's performance fell to below the mastery criterion level; at four weeks post-acquisition it then returned to within mastery criterion range (≥8/9+) for the three probed conditions. A total of 198 trials were run in the control condition, with 198 (100%) errors.
Eimear's performance was characterized by moderate variability between data points in each of her instructional conditions. This was indicative of a prolonged acquisition phase, with Eimear undergoing the most training trials overall (603, compared with 270 for Seamus, 369 for Cian, and 495 for Cahil). Despite this, mastery criterion was achieved, first in the 5-s PPD condition with 117 trials; ranked second was the 5-s CPD condition with 225 training trials; ranked third was the 2-s CPD condition with 261 trials. Control conditions were carried out for 171 trials with 170 (99.4%) errors.
In summary, training was effective in all three instructional conditions for all four participants.
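As a worked illustration, the trials-to-criterion figures quoted in the paragraphs above can be ranked and averaged as follows. This is a sketch using only the counts given in the text; the helper names are hypothetical, and Cahil's values are the post-simplification counts reported above, so the means computed here are not guaranteed to match those plotted in Figure 2.

```python
from statistics import mean
from typing import Dict, List

# Trials to mastery criterion as reported in the text for each participant.
TRIALS: Dict[str, Dict[str, int]] = {
    "Seamus": {"5-s PPD": 63, "5-s CPD": 90, "2-s CPD": 117},
    "Cian": {"5-s PPD": 108, "5-s CPD": 144, "2-s CPD": 117},
    "Cahil": {"5-s PPD": 54, "5-s CPD": 108, "2-s CPD": 45},  # two-flag sets
    "Eimear": {"5-s PPD": 117, "5-s CPD": 225, "2-s CPD": 261},
}


def rank_conditions(trials: Dict[str, int]) -> List[str]:
    """Order conditions from fewest to most trials to criterion."""
    return sorted(trials, key=trials.get)


def mean_trials(data: Dict[str, Dict[str, int]]) -> Dict[str, float]:
    """Mean trials to criterion per condition across participants."""
    conditions = next(iter(data.values())).keys()
    return {c: mean(p[c] for p in data.values()) for c in conditions}


rankings = {name: rank_conditions(t) for name, t in TRIALS.items()}
means = mean_trials(TRIALS)
```

On these counts the means are 85.5 trials for 5-s PPD, 141.75 for 5-s CPD, and 135 for 2-s CPD, and 5-s PPD ranks first for every participant except Cahil.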
The trials to mastery criterion given above are one measure of efficiency. Other measures are the number of errors before criterion was reached, and the time taken (duration) for each condition. All three efficiency measures are shown in Table 3. These show that for the three participants trained throughout with 3-stimuli instruction, 5-s PPD was ranked first in trials to mastery, while for the other participant (Cahil) the conditions were ranked 2-s CPD, then 5-s PPD, then 5-s CPD. In terms of errors made, or percentage errors, 5-s PPD also produced the fewest errors for all four participants. Table 3 also shows the total duration of training for each instructional condition, and for three participants the shortest time was recorded for the condition in which fewest trials to mastery were required (the exception was Cian).
Average performances are shown in Figures 2 and 3. Figure 2 indicates the mean trials to criterion and percentages of errors to criterion for each instructional condition, with 5-s PPD having the lowest number of trials and the smallest percentage of errors. Figure 3 plots the mean instructional duration, and again the 5-s PPD condition has the lowest value. Although there is variation across participants, the consistent overall pattern that emerges is of a typical advantage of 5-s PPD over the other instructional conditions.

Discussion
The main research questions concerned the effectiveness and relative efficiency of each instructional condition. All three prompt delay procedures proved effective, and, on balance, PPD was more efficient than either CPD procedure. Given the limited evidence in the literature with direct comparison of these procedures, this is useful information about alternate versions of response prompting, a strategy often used during instruction to address the behavioral skills deficits seen in individuals with ASD/ID. Present findings show that all three prompt delay conditions produced acquisition to mastery criterion level and, as anticipated, mastery was not attained in the control condition by any of the participants. This finding is consistent with previous research that has shown prompt delay to be an effective instructional procedure in stand-alone investigations (Ault et al., 1988; Doyle, Wolery, Gast, Ault, & Wiley, 1990; O'Neill et al., 2018; Wolery, Munson-Doyle, Gast, Ault, & Simpson, 1993) and in reviews of the literature (Cengher et al., 2018, 2019; Handen & Zane, 1987; Walker, 2008; Wolery & Gast, 1984; Wolery et al., 1992). When compared with other types of prompt and prompt-fading procedures, CPD and PPD have been reported as being as effective as LTM prompting but more efficient (Bennett, Gast, Wolery, & Schuster, 1986; Heckaman et al., 1998; Wolery, Ault, Gast, Munson-Doyle, & Griffen, 1990), whereas MTL prompting is reported as being as effective as, but more efficient than, CPD (Aykut, 2012). More recently, PPD has been shown to be as effective as LTM and MTL fading, with the LTM procedure most efficient when compared with both (Schnell et al., 2019; Seaver & Bourret, 2014); in one of these studies, PPD was the middle-ranked procedure in terms of efficiency (Seaver & Bourret, 2014). Looking at a range of prompting strategies, both Seaver and Bourret (2014) and Schnell et al. (2019) concluded that outcomes were associated with variables specific to each learner. Neither study sought to systematically compare CPD with PPD, but rather sought to identify the optimal instructional procedure for each learner.
Because there is not much time to commit errors and the delay to the prompt is faded gradually, it is more likely that PPD will be an errorless procedure, and this is supported by our findings. In terms of relative efficiency, PPD produced the lowest mean number of training trials and a considerably lower percentage of errors (6%) than 5-s CPD (15%) and 2-s CPD (18%; Figure 2). At the individual level, with Seamus, Cian, and Eimear, 5-s PPD was ranked first in terms of the fewest training trials to mastery, and second on this measure for Cahil. Cahil was also the participant for whom the procedure had to be modified to reduce its difficulty before mastery was reached with any of the prompting procedures, and he had the lowest scores on ability measures (see Table 1), so it may be that he was at the margin of the ability range where these procedures can be used effectively. This study also measured duration of instructional time to criterion (Figure 3), and mean instructional duration was also lowest with progressive prompt delay. Thus, on almost all efficiency measures for the participants included here, the 5-s PPD was the most efficient prompt fading procedure. This is consistent with the findings of O'Neill et al. (2018), where for three out of four of their participants PPD was most efficient when compared with 2-s and 5-s CPD, producing acquisition to mastery criterion in the lowest mean number of training trials whilst producing the lowest percentage of errors to criterion. The present study adds to the body of evidence in support of using the 5-s PPD, not least because duration of instruction to mastery was also less than in both CPD conditions. As noted above, for one participant across these two studies, 5-s PPD was not ranked first. Additionally, training stimuli had to be reduced from three to two for that participant before mastery was achieved.
These findings chime with the inconsistencies in the literature in relation to the efficiency of outcomes of the many instructional procedures used for persons with ASD/ID, and this is perhaps related to the intrinsic heterogeneity found in this group (Ault et al., 1988; Cengher et al., 2016, 2018, 2019; Libby et al., 2008; O'Neill et al., 2018; Schnell et al., 2019; Seaver & Bourret, 2014; Walker, 2008; Wolery & Gast, 1984).
It is perhaps not surprising that PPD was associated with the fewest errors. At the beginning of PPD training, the opportunity to engage in an independent learner response is limited to 1 s, then 2 s, and so on. Because the time available to respond independently only increases contingent on correct learner responses in the previous session, learners are only given more time to respond when correct independent responses have become more likely; the procedure is therefore responsive to the learner's performance, as the delay to the prompt will decrease should the learner falter.
This incremental approach appeared to benefit the learners within this study (as in O'Neill et al., 2018). With the CPD procedures, in contrast, there is more time to respond from the outset of training and therefore more time to commit an error. This is likely why CPD was associated with a higher percentage of errors. This factor is particularly important at the beginning of instruction, when more errors are likely to occur. It is expected that as instruction progresses the prompt will not be necessary, being replaced with independent responding. In the present procedure, prompted and independent correct responses were not differentially reinforced (i.e., both were followed by the same reinforcer) except that independent correct responses resulted in a shorter delay between trial onset and reinforcement. This procedural variation did ensure that reinforcement during instruction was high, and as performance improved, prompted correct responses became rare.
The issue of errors during instruction has previously been identified as important by Green (2001), as too high an error rate may contribute to the development of faulty stimulus control. Current evidence tentatively suggests that PPD may be a wise first choice when deciding amongst prompt delay procedures. However, this conclusion would be strengthened if the advantage of PPD were replicated in another laboratory.

Limitations and Directions for Future Research
The present findings could be strengthened and clarified in future research if some limitations of the present study were addressed. The adapted alternating treatments design does not require baseline data, although recommendations suggest that just one or two baseline sessions be conducted (Gast, 2009).
These recommendations were followed here, but it can be argued that stronger conclusions can be drawn if the baseline phase is continued for at least three sessions in each condition to detect any possible trend. During instruction conditions, participants accessed the programmed consequence contingent on correct independent or correct prompted responses. The absence of differential reinforcement for independent responding (except, as noted, through earlier delivery of the reinforcer) may have slowed down the transfer of stimulus control from the prompt to the programmed discriminative stimulus, so the inclusion of that contingency should be considered. The PI check highlighted some errors of omission for one participant. PI scores drop as soon as one step of a sequence is in error. In this case, on a few occasions the experimenter either observed a longer delay value than was prescribed for that instructional step and condition or failed to record a correct learner response when one had occurred. This resulted in a slightly lower average PI score across participants and represents a limitation of the present study.
This study, and a previous related one in a similar school setting (O'Neill et al., 2018), targeted academic skills. It would be useful if future attempts at replication included functional skills, such as demonstrating a preference between activities, identifying types of money, or toothbrushing, to see whether the results generalize to those types of skills, as these are often targeted by behavior analysts. Relatedly, this study was conducted entirely within the school setting, so generalization to other contexts (e.g., home or community) was not assessed. Finally, inclusion of a social validity measure to assess the type of instruction the learners prefer, and which procedures special educators are able to implement, would be useful.

Implications for Practice
It is important for clinicians to have an evidence base that may be used to inform instructional decision making (Odom, Collet-Klingenberg, Rogers, & Hatton, 2010). Comparisons that include efficiency measures like the ones used here extend the data available beyond that of effectiveness.
While all three prompt delay procedures proved effective, on balance, PPD was more efficient than either CPD procedure. As in previous research, there was some variation in the outcome across individuals, with the three higher-functioning individuals showing more consistent findings. Nonetheless, the sum of evidence from this and previous research (O'Neill et al., 2018) may support selection of PPD as a wise first-choice option for clinicians.

Difference between chronological age and EVT score (EVT™-2), b severe learning difficulty, c autism spectrum disorder.

Table 2 Prompt levels for each instructional condition, progressive prompt delay (PPD) or constant prompt delay (CPD), and a control condition.

Table 3 Effectiveness and efficiency data for four participants across three instructional conditions, progressive prompt delay (PPD) or constant prompt delay (CPD), and a control condition (baseline data are excluded). Rank order is based upon the number of trials to criterion. Data in parentheses relate to reduced 2-stimuli instruction.

responses in successive sessions, for four participants with three prompt delays, progressive prompt delay (PPD, black squares), 2-s constant prompt delay (2-s CPD, black triangles), 5-s constant prompt delay (5-s CPD, black circles), or control (open diamonds) conditions. When mastery criterion was met, a post-test (PT) was conducted. For Cahil, 3-stimuli instruction was changed to 2-stimuli instruction, and maintenance probes (MP) were conducted several weeks (WK) after final PT.