Speech-generating devices (SGDs) are electronic or computer-based forms of assistive technology that can be programmed to produce either digitized (recorded) or synthesized speech output when the user selects an icon, photograph, or other graphic symbol from the display interface (Franzone & Collet-Klingenberg, 2008; Lancioni et al., 2013; Sigafoos et al., 2014). SGDs are increasingly being used as a form of assistive communication technology for persons with severe speech impairments (Elsahar et al., 2019). SGDs also represent a potentially beneficial form of assistive technology for the approximately 30% of autistic children who are minimally verbal, meaning they lack sufficient expressive speech to meet their everyday communication needs (Anderson et al., 2007; Tager-Flusberg & Kasari, 2013; van der Meer & Rispoli, 2010).

Empirical support for this claim can be found in numerous studies that have demonstrated successful procedures for teaching SGD use to minimally verbal autistic children (for reviews see Lorah et al., 2015; Muharib & Alzrayer, 2018; van der Meer & Rispoli, 2010). van der Meer and Rispoli, for example, identified 23 studies that focused on teaching SGD use to autistic children. Most (86%) studies demonstrated successful acquisition of targeted SGD-based communication skills through the application of systematic instructional procedures (e.g., time delay, response prompting, prompt fading, error correction, and reinforcement). Despite these generally positive outcomes, the literature on teaching SGD use to minimally verbal autistic children is limited. Most studies to date have focused on teaching only single-step requesting responses (e.g., the child makes one response/selects one icon to request a preferred object).

An important goal for many minimally verbal autistic children would be to move beyond single-step requesting to more advanced multi-step communicative exchanges. Multi-step exchanges might involve both requesting and social communication. For example, a child might be taught to use an SGD to (a) first greet his or her listener, (b) then make a generalized request (“I want a snack.”), (c) then make a more specific request for one of several relevant options that the listener can offer (“popcorn” versus “cookie”), and finally (d) end the exchange by thanking the listener for providing the requested item. Establishing a multi-step communication sequence of this type could be seen as enhancing the social appropriateness of communication responses, increasing social interaction, and promoting conversational turn-taking.

Along these lines, emerging evidence has shown that systematic instruction can also be successfully applied to teach multi-step SGD use to autistic children (Alzrayer et al., 2017, 2019; Chavers et al., 2021; Genc-Tosun & Kurt, 2017; van der Meer et al., 2013; Waddington et al., 2014). Alzrayer et al. (2017), for example, taught four autistic children to complete a three-step requesting sequence on an iPad®-based SGD. In an extension of this work, Genc-Tosun and Kurt (2017) taught four young boys with autism to perform a longer (six-step) requesting sequence on an SGD using time delay, physical prompting, and reinforcement. Alzrayer et al. (2019) further extended this work by teaching three autistic children to engage in a multi-step communication interaction that involved both requesting and social responses (e.g., saying “Thank you.” and answering questions). Similarly, Chavers et al. (2021) successfully taught three autistic children to use an SGD to make requests and engage in social communication using time delay, least-to-most prompting, error correction, and reinforcement.

An important factor to consider when teaching multi-step SGD use is whether participants are in fact discriminating among the available icons as opposed to simply selecting any icon displayed at each successive step of the sequence. For example, when participants are presented with two icons, both of which represent preferred items, any icon selection could be seen as correct, even though the participant might be selecting icons at random or according to some biased response pattern (e.g., always selecting the icon on the right side of the screen display). Previous studies on teaching multi-step SGD use have involved varying degrees of symbol discrimination and discrimination training. Alzrayer et al. (2017), for example, taught participants to navigate through three screens. The first two screens included only a single symbol and the third screen included multiple symbols representing only preferred items. Thus, this configuration did not require symbol discrimination. van der Meer et al. (2013), in contrast, required a discrimination among 15 different symbols, representing the range of targeted communication functions (i.e., requests for specific snacks and toys, greetings, answering questions, and social etiquette responses). Ensuring discriminated icon selections when teaching multi-step communication would seem critical for ensuring icon selections do in fact function as valid communication responses (Simacek et al., 2018).

Along these lines, at least two approaches could be used to assess for discriminated and functional icon use. One approach would be to configure SGD displays so that more than one icon is displayed at each step of the sequence. For example, at step 1—in a four-step sequence—the correct response could be designated as selecting the HELLO icon rather than the simultaneously available THANK YOU icon. Alternatively, at the final step of the sequence, the correct response could be designated as selecting the THANK YOU icon rather than the HELLO icon. If participants did in fact learn to correctly sequence the selection of these two icons, then this would provide some evidence of functional and discriminated icon use. This approach might work well for social communication responses, but it is arguably less useful when assessing for discriminated and functional use of requesting icons in scenarios where all available icons are references for preferred items/reinforcers. For this latter situation, a correspondence test could be used (Reichle et al., 1989). Specifically, after requesting a specific preferred item (e.g., a puzzle), the participant could be offered that object along with another preferred item (e.g., a puzzle and a potato chip). If the prior request was discriminated and functional, then the participant should take the item that matched or corresponded to their prior request (i.e., they should take the puzzle and not the potato chip).

The present study evaluated the effects of a systematic instructional package for teaching a four-step SGD-based requesting and social communication sequence to five minimally verbal autistic children. The study aimed to extend previous research by assessing acquisition, generalization, maintenance, and the extent to which icon selections were discriminated/functional using the two approaches outlined above. Based on previous studies, we hypothesized that systematic instruction would be effective in teaching the participants to perform the four-step requesting and social communication sequence on an iPad®-based SGD. Generalization to a second interventionist and a high level of maintenance at follow-up were also predicted. We further hypothesized that discriminated use of icons would result from presenting more than one icon at each step of the sequence and through correspondence testing.

Method 

Participants

The five participating children were recruited from a university database because they had an autism diagnosis, did not currently use an SGD for multi-step communication, and had sufficient motor control to select icons from the screen of an iPad®-based SGD. All of the participants were considered to be candidates for SGD use because they either had no speech or spoke only a few single words. They were assigned pseudonyms for this report. To confirm their status as minimally verbal, receptive and expressive communication abilities were assessed using the third edition of the Vineland Adaptive Behavior Scales (Vineland-III, Sparrow et al., 2016).

Sean was an 8-year-old male of Russian ethnicity. On the Vineland, he obtained age equivalencies of 1:2 (years:months) and 0:9 for receptive and expressive communication, respectively. He had a history of biting and elopement. At home, he reportedly used an iPad®-based SGD with Proloquo2Go™ software to make one-step requests for preferred objects.

Chris was a 10-year-old male of Fijian/Indian ethnicity. His receptive and expressive age equivalencies were both 1:7. He had no prior experience with SGDs, but had some experience in using a picture-exchange communication system. Prior to baseline of the present study, he was taught to make one-step requests for preferred objects with an SGD.

Andy was a 6-year-old male of Māori/New Zealand European ethnicity. His receptive and expressive age equivalencies were 0:8 and 0:7 respectively. He had occasional tantrums and some prior experience using an SGD to make one-step requests for preferred objects.

Victor was a 7-year-old male of New Zealand European ethnicity. His receptive and expressive age equivalencies were 0:11 and 0:8, respectively. He occasionally spoke single words (e.g., no, hello, and okay). He had some experience using an SGD to request preferred objects, following a visual schedule, and using a picture-exchange communication system.

Grace was a 7-year-old female of Māori and British ethnicity. Her age equivalencies were assessed at less than 1 month for receptive communication and 0:9 for expressive communication. She used one manual sign (MORE) and also exchanged picture cards to request preferred stimuli (e.g., snacks and television) and to indicate the need to use the toilet. Grace also had experience making simple, one-step requests using her iPad® with Proloquo2Go™ software.

Procedures

Participants received 1:1 sessions in a quiet private room at their respective schools. During sessions, the child sat at a table/desk with the interventionist (first author). The SGD was placed on the table/desk within the child’s reach. Reinforcers were kept in a clear storage box. A teacher, teacher’s aide, and/or university graduate student was also often in the room during sessions to collect inter-observer agreement and procedural integrity data and serve as the novel interventionist for the generalization probes. Some sessions were video recorded for checking inter-observer and procedural integrity when a live observer was not available.

Preferred Stimuli

Preferred stimuli that the participants would be taught to request were identified using a two-stage stimulus assessment process (Fisher et al., 1996). Stage 1 involved asking parents and teachers to provide a list of snacks and toys that the children seemed to enjoy. During stage 2, four items from each list were used in a paired stimulus preference assessment procedure. Specifically, a pair of items from each category (e.g., toys or snacks) were presented and the child was asked to choose one item. Every item was paired with every other item of the same category (toys or snacks), and the process was repeated a minimum of four times. When a snack item was selected, the child was allowed to consume a bite-sized portion of that item. When a toy was selected, the child was allowed to play with the toy for 30 s. Each child’s two most frequently selected snacks and toys were retained for use in the study (see Table 1). Note that because of dietary concerns, Grace’s preferred snacks were identified by her mother and not from the stage 2 procedure.
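The pairing logic of stage 2 can be illustrated with a short Python sketch. It is not the authors' procedure or materials: the item names, the `choose` callback, and the function name are hypothetical, and the fixed preference order stands in for a child's actual choices. With four items per category there are six unique pairs, so four repetitions yield 24 presentations per category.

```python
from collections import Counter
from itertools import combinations


def paired_preference_ranking(items, choose, repetitions=4):
    """Present every unique pair `repetitions` times; rank items by times chosen.

    `choose(a, b)` is a hypothetical callback returning the item the child picked.
    """
    counts = Counter({item: 0 for item in items})
    for _ in range(repetitions):
        for a, b in combinations(items, 2):
            counts[choose(a, b)] += 1
    # most_common() orders items from most to least frequently selected
    return [item for item, _ in counts.most_common()]


# Hypothetical snack list; the chooser simply follows a fixed preference order.
snacks = ["popcorn", "cookie", "cracker", "raisins"]
order = {"cookie": 0, "popcorn": 1, "raisins": 2, "cracker": 3}
ranking = paired_preference_ranking(snacks, lambda a, b: min(a, b, key=order.get))
top_two = ranking[:2]  # the two items that would be retained for the study
```

In practice the callback would be replaced by the observer's record of which item the child took on each presentation.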

Table 1 Preferred stimuli identified for each participant

Speech-Generating Device

Children were taught to engage in a four-step requesting and social communication sequence by selecting icons from an iPad® that was loaded with Proloquo2Go™ software (Sennott & Bowker, 2009). Each iPad® was configured with four progressive screens, with each screen containing two icons. The two icons on screen 1 were HELLO and THANK YOU. Activating the HELLO icon produced corresponding synthesized speech output (“Hello”) and then automatically progressed to screen 2. In contrast, activating the THANK YOU icon on screen 1 generated relevant speech output (“Thank you”), but did not progress to the next screen as this was an incorrect response at step 1. The next screen (screen 2) contained two icons (SNACK and TOY) representing general requests for a snack or a toy. Selecting an icon on screen 2 produced corresponding speech output (i.e., “I want a snack” or “I want a toy”). Also, as soon as the icon was selected, the next page (screen 3) appeared. On screen 3, participants could make a more specific request for one of their two preferred snacks or for one of their two preferred toys depending on whether they had selected the SNACK or TOY icon on the previous page (screen 2). After making a specific request from screen 3, the display progressed to screen 4, which contained the HELLO and THANK YOU icons again. Activating either icon did not take the user to another screen, but only generated corresponding speech output (i.e., “Hello” or “Thank you”). The icons used on screens 1, 2, and 4 were SymbolStix™ images taken from the Proloquo2Go™ database and these were identical for all five participants. The snack and toy icons appearing on screen 3, in contrast, were individualized for each participant, based on the results of the prior preference assessment. Individualized icons for screen 3 consisted of photographs of their two preferred snacks or their two preferred toys.
All of the synthesized speech output was in a standard English/Australian accent in a boy’s voice for Sean, Chris, Andy, and Victor or in a girl’s voice for Grace.
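The screen progression described above behaves like a small state machine, which the following Python sketch models. It is an illustration only and does not use or represent the Proloquo2Go™ software: the class name, the item names, and the assumption that a screen 3 selection speaks the item name are all hypothetical.

```python
SPEECH = {
    "HELLO": "Hello",
    "THANK YOU": "Thank you",
    "SNACK": "I want a snack",
    "TOY": "I want a toy",
}


class ProgressiveDisplay:
    """Hypothetical model of the four-screen, two-icons-per-screen display."""

    def __init__(self, snacks, toys):
        self.options = {"SNACK": list(snacks), "TOY": list(toys)}
        self.screen = 1
        self.category = None

    def icons(self):
        """Icons currently displayed."""
        if self.screen in (1, 4):
            return ["HELLO", "THANK YOU"]
        if self.screen == 2:
            return ["SNACK", "TOY"]
        return self.options[self.category]  # screen 3: individualized photos

    def select(self, icon):
        """Return the speech output for `icon` and update the screen."""
        if icon not in self.icons():
            raise ValueError(f"{icon!r} is not on screen {self.screen}")
        if self.screen == 1:
            if icon == "HELLO":  # only HELLO advances from screen 1
                self.screen = 2
            return SPEECH[icon]
        if self.screen == 2:
            self.category = icon
            self.screen = 3
            return SPEECH[icon]
        if self.screen == 3:
            self.screen = 4
            return icon  # assumed: the item name is spoken
        return SPEECH[icon]  # screen 4: speech output only, no navigation


display = ProgressiveDisplay(snacks=["popcorn", "cookie"], toys=["puzzle", "ball"])
```

Selecting THANK YOU on screen 1 leaves the display on screen 1, whereas HELLO advances it, mirroring the error contingency built into the configuration.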

Response Definition and Measurement

Correct responding was defined as independently activating (i.e., without prompting) the correct icon so as to generate speech output at each step in the communication sequence. First, at the start of each communication opportunity, the participant had to greet the interventionist by selecting the HELLO icon within 10 s of the interventionist initiating an opportunity. The interventionist initiated an opportunity by looking at the participant and saying “Hello. Let me know if you want a snack or a toy.” The second step required the participant to make a general request for a toy or a snack by selecting the SNACK or the TOY icon from screen 2. Again, this response had to occur within 10 s of that screen appearing on the SGD. For the third step of the communication sequence, the participant had to make a specific request by selecting one of the two specific snack or toy icons that were available on screen 3 within 10 s of that screen appearing. Lastly, the participant had to select the THANK YOU icon from screen 4 within 10 s of that screen appearing.

Data on performance at each step were collected for each communication opportunity initiated by the interventionist. If a participant did not activate an icon within 10 s of a screen appearing, a non-response was recorded. An incorrect response was recorded if the participant selected an icon that was incorrect for that step. In the four-step sequence, errors could occur on screen 1 by selecting the THANK YOU icon rather than the HELLO icon and on screen 4 by selecting the HELLO icon rather than the THANK YOU icon.
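These scoring rules can be expressed as a brief Python sketch. The function names and the example data are hypothetical and are not the authors' data sheets; the sketch assumes each step is recorded as an (icon, latency in seconds) pair, with `None` for the icon when no selection was made.

```python
def score_step(step, icon, latency_s, limit_s=10.0):
    """Classify one step as 'correct', 'incorrect', or 'non-response'."""
    if icon is None or latency_s > limit_s:
        return "non-response"
    if step == 1 and icon != "HELLO":
        return "incorrect"
    if step == 4 and icon != "THANK YOU":
        return "incorrect"
    # On screens 2 and 3 any displayed icon counts as a correct selection.
    return "correct"


def opportunity_correct(steps):
    """True only if all four steps were completed correctly."""
    return len(steps) == 4 and all(
        score_step(n, icon, latency) == "correct"
        for n, (icon, latency) in enumerate(steps, start=1)
    )


def session_percentage(opportunities):
    """Percentage of opportunities with the full sequence performed correctly."""
    correct = sum(opportunity_correct(steps) for steps in opportunities)
    return 100.0 * correct / len(opportunities)


# Hypothetical session of four opportunities; the last one ends with an error at step 1.
full_run = [("HELLO", 2.0), ("SNACK", 3.5), ("popcorn", 4.0), ("THANK YOU", 1.5)]
session = [full_run, full_run, full_run, [("THANK YOU", 2.0)]]
```

With four opportunities per session, the resulting percentage can only take the values 0, 25, 50, 75, or 100.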

Experimental Design and Sessions

Intervention effects were evaluated in a multiple-baseline across participants design (Kennedy, 2005). The design included the following sequence of experimental phases: (a) baseline, (b) intervention, and (c) follow-up. Generalization probes, which involved having a second person serve as the interventionist, were conducted during each phase of the study. One such probe occurred for each participant in baseline and follow-up and three generalization probes occurred during intervention.

Sessions were scheduled to occur at the same time of the day for each participant, 2 days per week (Tuesday and Thursday), barring absences and school holidays. Baseline and follow-up sessions were approximately 15 min in duration and consisted of four communication opportunities. Intervention sessions of about 15 min duration also consisted of four communication opportunities, but each intervention session was preceded by a set of four practice runs. During practice runs, participants were physically prompted to complete the four-step communication sequence a total of four times in rapid succession. Data on participants’ responses were only collected during the communication opportunities, not during the practice runs.

Baseline

Each of the four communication opportunities during baseline was initiated by the interventionist saying “Hello. Let me know if you want a snack or a toy.” The box of preferred stimuli and the iPad® were placed on the table with the iPad® in reach and open to screen 1. After initiating an opportunity, the interventionist waited for 10 s and then recorded data on the participant’s responses for each step of the sequence. A correct response at each step (e.g., selecting the HELLO icon from screen 1) was followed by the interventionist making a relevant spoken comment (e.g., replying with “Hi” or “Hello”). Correct performance of the entire sequence would have resulted in the participant receiving the requested item, but this never occurred in baseline. If an opportunity ended due to non-responding within 10 s or due to an error, the box of preferred stimuli was moved out of sight. After an approximate 30-s inter-opportunity interval, the interventionist initiated the next communication opportunity by saying “Hello. Let me know if you want a snack or a toy.”

Intervention

Immediately prior to each intervention session, four practice runs were conducted. For each practice run, the participant was physically prompted to complete the four-step communication sequence (one practice run for each preferred snack and toy in random order). Practice runs were suspended after the participant correctly participated in the extended communicative exchange with 100% accuracy over three consecutive sessions. Performance during practice runs was not considered in this criterion, nor are practice run data presented in Fig. 1 (see “Results”). An intervention session began about 30 s after the last practice run. Each session consisted of four communication opportunities that were initiated as in baseline. Any correct responses were followed by progression to the next screen and by the interventionist making a socially appropriate reply. Also, the requested snack or toy was delivered following the completion of step 4 of the sequence. During intervention sessions, prompting was only used at step 4 if the participant did not independently activate the THANK YOU icon within 10 s of arriving at screen 4. Prompting consisted of holding up the requested item and waiting for 10 s for the final (i.e., THANK YOU) response to occur before delivering the requested item. If the THANK YOU response still did not occur within 10 s of holding up the item, then the child was physically prompted to tap the THANK YOU icon using the least amount of physical guidance necessary. We required participants to select the THANK YOU icon before receiving the requested item because this is consistent with New Zealand social norms and we also reasoned it would be necessary to ensure participants had a reason to complete the final step of the sequence.

Fig. 1 Percentage of opportunities with correct performance of the communication sequence for each participant and each session

Correspondence Tests

A total of 12 correspondence tests were conducted for each participant. Testing began after the participant had maintained 100% correct performance across three intervention sessions. Each test sought to determine if the participant would select the item that matched their prior request. A test was conducted after the participant had completed the communication sequence, but before the requested item was delivered. Instead of delivering the item, the interventionist presented the box of preferred stimuli and recorded which item the participant selected from the box. For each test, we recorded the icon activated at step 3 and then the real item that the participant selected from the box of preferred items when this box was then offered to them.
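The correspondence measure amounts to the percentage of tests on which the item taken matched the icon activated at step 3. The Python sketch below illustrates the calculation; the function name and the test records are hypothetical and do not reproduce any participant's data.

```python
def correspondence_percentage(tests):
    """Percentage of tests in which the item taken matched the prior request.

    `tests` is a list of (requested_icon, item_taken) pairs.
    """
    matches = sum(1 for requested, taken in tests if requested == taken)
    return 100.0 * matches / len(tests)


# Hypothetical record of 12 tests: 10 matches and 2 mismatches.
tests = [("puzzle", "puzzle")] * 6 + [("chips", "chips")] * 4 + [
    ("puzzle", "chips"),
    ("chips", "puzzle"),
]
percent = correspondence_percentage(tests)
```

With 12 tests per participant, 10 matches correspond to roughly 83% and 12 matches to 100%.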

Follow-Up and Generalization

Two follow-up sessions were conducted from three to eight weeks after the last intervention session. The procedures were the same as in the intervention phase except that practice runs were not conducted and the final response of selecting the THANK YOU icon was never prompted. Generalization probes were conducted by a teacher, teaching assistant, or graduate student rather than the interventionist. Generalization probes were conducted in the baseline, intervention, and follow-up phases using the baseline procedures.

Inter-observer Agreement and Procedural Integrity

Agreement checks on data recording by an independent observer (either live or from videotapes) occurred during a minimum of 20% of the sessions in each phase and for each participant. Agreement percentages, calculated using the formula agreements/(agreements + disagreements) × 100, ranged from 89% to 100%. Independent observers also conducted checks on procedural integrity using a checklist to determine if the procedural steps had been implemented correctly. Checks occurred during 22% to 37% of sessions per participant, with a minimum of one observation for each phase of the study. The resulting percentages of correct implementation were always 95% or above.
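The agreement formula can be illustrated with a short Python sketch. It assumes, for illustration only, that agreement is scored point by point on each recorded step; the unit of comparison is not specified in the text, and the function name and records are hypothetical.

```python
def agreement_percentage(primary, secondary):
    """agreements / (agreements + disagreements) x 100 for two observers' records."""
    if len(primary) != len(secondary):
        raise ValueError("Both observers must score the same number of responses")
    agreements = sum(a == b for a, b in zip(primary, secondary))
    disagreements = len(primary) - agreements
    return agreements / (agreements + disagreements) * 100


# Hypothetical session: 4 opportunities x 4 steps = 16 scored responses,
# with the observers disagreeing on one of them.
primary = ["correct"] * 16
secondary = ["correct"] * 15 + ["incorrect"]
ioa = agreement_percentage(primary, secondary)
```

Under this assumption, a single disagreement among 16 scored responses gives an agreement of 93.75%.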

Results

Figure 1 shows the percentage of communication opportunities in which each participant correctly and independently completed the four-step requesting and social communication sequence. During baseline, Sean was the only participant to correctly and independently complete the four-step sequence. He did this once in session 3. With intervention, all five children reached 100% correct performance within two to nine sessions. Once a high level of correct responding was reached during intervention, it was maintained at 75–100% and generalized to the novel interventionist throughout intervention and follow-up. Table 2 shows the results of the correspondence test for each participant. Correspondence occurred on 83 to 100% of the 12 total tests conducted with each participant.

Table 2 Results of correspondence tests for each participant

Discussion

These data show relatively rapid acquisition of the four-step requesting and social communication sequence by all five participants. Participants also made relatively few mistakes (errors or non-responses) during intervention and none at all during follow-up. Performance also generalized to a second person who had not provided intervention. Rapid acquisition with minimal mistakes, and with evidence of generalization and maintenance suggests a positive intervention effect.

The present four-step sequence required participants to discriminate between the HELLO and THANK YOU icons that appeared at steps 1 and 4 of the sequence. The fact that participants learned to perform the sequence with few errors suggests the intervention was effective at teaching participants to discriminate between these two icons. Results from the correspondence tests also suggest that the intervention was effective at establishing functional/discriminated use of the two requesting icons that were displayed in step 3. That is, the high degree of correspondence in these tests suggests that the participants must have been discriminating between the two more specific snack and toy icons and not simply selecting icons in some biased or random fashion. Previous studies have shown individual differences in the extent to which learners show a correspondence between initial requests and subsequent item selections (Reichle et al., 1989). The degree to which such correspondence occurs seems to depend, in part, on the number of icons available (Sigafoos et al., 2007) and the relative preference value of requested items (Sigafoos & Kook, 1992). The high level of correspondence in the present study might therefore stem from the fact that at step 3 the participants had only two icon options (either two snack options or two toy options), all of which represented preferred items.

The positive intervention effect overall might be generally attributed to the use of well-established systematic instructional procedures. It is not surprising that the systematic instructional procedures employed in the present study (e.g., time delay, prompting, and reinforcement) appeared to be effective given that such tactics have a long history of success for teaching a range of functional skills to individuals with autism and other developmental disabilities (Lang & Sturmey, 2021). More specifically, these teaching strategies have been successful in teaching SGD use, including multi-step requesting and social communication sequences, to minimally verbal autistic children (Alzrayer et al., 2017, 2019; Chavers et al., 2021; Genc-Tosun & Kurt, 2017; van der Meer et al., 2013; Waddington et al., 2014). The results of the present study provide further empirical support for use of systematic instruction in communication interventions for minimally verbal autistic children.

The rapid acquisition demonstrated in Fig. 1 was also likely facilitated by the fact that our participants had prior experience with SGDs and one-step requesting. This prior experience may have influenced the speed of acquisition. However, the low level of performance in baseline suggests that this prior experience had not generalized to the multi-step requesting and social communication sequence targeted in the present study. Still, the positive outcomes reported in the present study may depend on participants having prior experience in using SGDs for one-step requesting.

A unique aspect of our intervention was the provision of practice runs prior to each intervention session. The use of immediate, hand-over-hand physical prompting during practice runs was intended to increase the tendency to respond when each new screen appeared as well as prevent errors and non-responding. Practice runs were envisioned as a way of priming the pump (Skinner, 1968) so to speak or, more technically, generating some behavioral momentum (Davis & Brady, 1993). We anticipated that this type of practice would carry over to the four communication opportunities that were subsequently conducted in each intervention session. Practice runs appeared to be an effective instructional component for teaching participants to initiate and make a correct response each time a new screen appeared. Also by restricting most of the physical prompting to practice runs, there was less need to interrupt subsequent intervention opportunities with prompts and this seemed to make those opportunities flow more naturally. Importantly, participants continued to perform the targeted communication sequence correctly when practice runs were no longer conducted (e.g., during the follow-up sessions and generalization probes). This suggests that participants did not become reliant on any such priming effect.

The use of a progressive display with only two icons per page is another variable that might have facilitated participants’ acquisition of the four-step requesting sequence. The progression to a new screen after a response had been made on the previous screen could itself be viewed as a type of discriminative stimulus or stimulus prompt that eventually came to evoke or control the next response in the chain. Correct performance on the progressive display also only required that participants learn to discriminate between two icons, that is the HELLO and THANK YOU icons, which appeared on the first and last screens. For the other two steps in the sequence, the available icons all represented preferred stimuli and thus any selection from screens 2 and 3 could be seen as “correct”. However, the results of the correspondence tests, as mentioned previously, suggest that participants were in fact making discriminated requests at Step 3 of the sequence.

Limitations and Future Directions

The results of the present study should be interpreted with caution due to several limitations. First, while the multi-step sequence involved social and requesting responses, it is unclear if participants were actually socially engaged as opposed to simply using the HELLO and THANK YOU icons because doing so was required to gain access to a preferred snack or toy. Our primary purpose for requiring an initial greeting response and a final “thank you” response was to ensure the targeted communication sequence reflected New Zealand social norms. However, it is possible that embedding such social responses into requesting sequences might improve the person’s social image and increase the probability of reinforcement by “softening” the request. Doing so might also help to recruit the attention of the listener. This, in turn, could increase the probability that the request is heard and reinforced by the listener (Cipani, 1990). Future research could explore these possibilities by comparing the social validity and effectiveness of multi-step requesting sequences with versus without embedded social responses. Given that autistic children are generally less inclined towards social communication (Schertz et al., 2017), embedding social requirements into requesting sequences might also represent a useful initial approach for eventually increasing the child’s social motivation. A second limitation was the relatively modest amount and duration of follow-up (i.e., only two sessions conducted from three to eight weeks post-intervention). Additional follow-up over a longer period of time is necessary to appraise the extent to which multi-step SGD use is maintained. Longer-term maintenance is likely to depend, in part, on the extent to which the acquired communication skills remain functional for the participant. Third, generalization was limited to assessing performance across one person.
Future research could be improved by assessing and programming for generalization across additional partners (e.g., siblings and peers) and settings (e.g., home, playground, community). Interventions leading to wider generalization across people and settings would increase the ecological validity of the existing evidence base on teaching multi-step SGD use to minimally verbal autistic children.

The systematic instructional package was effective in teaching a four-step SGD-based requesting and social communication sequence to five minimally verbal autistic children. When used in combination with SGDs configured to minimize errors (via a progressive display with only two icons per page), acquisition can be rapid. The intervention also appeared to promote generalization, maintenance, and discriminated use. Multi-step requesting and social communication sequences may represent the next logical learning objective for minimally verbal autistic children who are at the single-step requesting stage.