Children with autism spectrum disorder (ASD) and complex communication needs can benefit from augmentative and alternative communication (AAC) systems (Muharib & Alzrayer, 2018). Mobile technologies (e.g., iPads) with AAC applications are increasingly recommended for this population (Muharib & Alzrayer, 2018). AAC applications often include an array of symbols (e.g., graphic pictures, photographs) that represent vocabulary items and produce speech output. To select AAC vocabulary for young children, clinicians must consider individual needs (e.g., interests and environments) and knowledge of typical development (Laubscher & Light, 2020). For children developing typically, early words include nouns, verbs, descriptors, and social words (Laubscher & Light, 2020). AAC vocabularies that do not include a variety of word classes may limit a child’s ability to communicate for a variety of purposes and delay the development of syntax skills (Binger et al., 2020; Laubscher & Light, 2020). As verbs are part of early lexicons, the use of verb symbols amongst children with ASD should be explored.

Although children with ASD who do not rely on AAC use verbs in everyday speech, the verbs produced can be impacted by social communication skills (Douglas, 2012; Haebig et al., 2020; Jiménez et al., 2020). For instance, children with ASD may be more likely to use action verbs such as eat, jump, or open (which describe a movement or change) than verbs related to internal states such as feel, want, or know (Douglas, 2012). Verb interventions for children with ASD and natural speech often focus on teaching participants to label actions presented using video or live models in which an adult or toy figurine engages in targeted actions (Frampton et al., 2016; Shepley et al., 2016; Shulman & Guberman, 2007). Successful intervention methods have included syntactic cues, time delay, questions, models, reinforcement, and matrix training.

Unfortunately, additional challenges to teaching AAC verb symbols exist. Although nouns representing objects are easily depicted with static pictures, actions may be harder to represent as their meanings come from movement (Schlosser et al., 2019). Animated verbs (which show movement) may be more transparent than static symbols (Fujisawa et al., 2011; Schlosser et al., 2014, 2019). However, as a majority of popular AAC applications do not utilize animation for this purpose, most individuals likely only have access to static symbols (Frick et al., 2022). AAC research involving non-autistic populations (e.g., those with motor speech disorders) suggests that children with complex communication needs can use static verb symbols (Binger et al., 2008, 2017b; Tönsing, 2016). Studies have used strategies such as aided language modeling and prompting to teach children to label stimuli using symbol combinations involving verbs (Binger et al., 2008, 2017b; Tönsing, 2016). These symbol combinations paired verbs with agents or objects (e.g., agent-action combinations like MICKEY EATS or agent-action-object combinations like MICKEY EATS APPLE).

Recent AAC intervention studies involving children with ASD have also included verb targets (Carnett et al., 2019; Gevarter et al., 2021, 2022; Holyfield, 2021; Marya et al., 2021; Muharib et al., 2019). Findings demonstrate that although children with ASD can learn to use verb symbols, progress may be gradual and modifications may be needed. Three studies have used behavioral strategies such as time delay, prompting (e.g., verbal, gestural, model, and physical cues), task analysis, and reinforcement to teach children to use single symbol verb responses (Carnett et al., 2019; Gevarter et al., 2021; Holyfield, 2021). These studies utilized embedded instruction, which incorporates structured learning opportunities (trials) within naturally occurring contexts such as play or storybook reading (Geiger et al., 2012). Preferred interests and items are also incorporated, and responses are reinforced with naturalistic consequences (Gevarter et al., 2021). For example, Carnett et al. (2019) taught participants to use an AAC application to request actions (e.g., UNLOCK, WATCH) needed to engage with preferred activities such as watching videos on an iPad. Although one participant showed consistent independent responding, two others required modifications. Gevarter et al. (2021) also reported mixed findings when using preferred activities to teach three preschool-aged children to use AAC symbols. Parents were taught to embed opportunities for children to request items (using noun symbols) and to reject items, request actions, and label actions (non-noun symbols) during natural routines (e.g., play). Two participants mastered all targets but demonstrated more gradual acquisition with non-noun targets. The third participant only mastered noun targets. More gradual success in acquiring non-noun targets was also demonstrated in a study in which participants were taught to fill in cloze statements in adapted books with noun or verb symbols (Holyfield, 2021).
Some participants were more successful in using text-only symbols compared to static picture symbols with text for verb targets. The fact that this difference was not observed for noun targets supports the notion that static verb symbols may not be transparent for children with ASD (Schlosser et al., 2019).

Three additional studies have focused on teaching children with ASD to use multi-symbol utterances involving verb symbols (Gevarter et al., 2022; Marya et al., 2021; Muharib et al., 2019). Similar to studies involving single symbol action requests, Muharib et al. (2019) used behavioral methods (e.g., least-to-most prompting, backward chaining) to introduce action symbols as part of three-symbol request sequences (e.g., I WANT TO + EAT + APPLE). Although two participants rapidly acquired the use of action symbols, a third participant required more extensive practice to accurately use action symbols. Moving beyond requesting, both Gevarter et al. (2022) and Marya et al. (2021) focused on teaching children to use agent-action responses to label stimuli. Although both studies used matrix training and prompting, Gevarter et al. (2022) embedded stimuli in play and Marya et al. (2021) presented video stimuli during discrete trial training (DTT) sessions. During DTT, a clinician presents a specific stimulus to promote a target behavior that is reinforced (e.g., with positive verbal feedback or preferred tangible items), but instructional materials are not incorporated into a natural learning activity (Geiger et al., 2012). Gevarter et al. (2022) further supported the idea that children with ASD may acquire noun-based targets more rapidly than verb targets as participants demonstrated more immediate success with possessor-possession targets (involving only noun symbols) than agent-action targets. Although the DTT study (Marya et al., 2021) had more consistent positive results for agent-action targets, participants were required to master the use of noun and verb symbols in isolation prior to intervention, which was not a requirement in the embedded intervention study.

Based on prior research, it is unclear whether DTT or embedded instruction approaches present any advantages for teaching action verb symbols. Direct comparisons of embedded instruction to DTT for teaching receptive language and other early learning skills to children with ASD suggest that even though children often learn with both approaches, embedded instruction may produce additional benefits such as faster rate of acquisition, fewer challenging behaviors, and higher mood ratings (Geiger et al., 2012; Sigafoos et al., 2006). However, for some children/tasks, environmental distractions that may occur during embedded instruction may make it harder to attend to stimuli (Geiger et al., 2012). In such cases, DTT may be considered. It has also been suggested that video stimuli may further reduce environmental distractions for children with ASD (Charlop-Christy et al., 2000; Plavnick & Vitale, 2016).

In addition to determining an appropriate instructional format (e.g., DTT vs. embedded instruction), another important component of AAC intervention is selecting targets based upon communicative function. As noted previously, prior AAC research involving children with ASD has focused on using verb symbols to request (Carnett et al., 2019; Gevarter et al., 2021; Muharib et al., 2019), to fill in cloze sentences (Holyfield, 2021), or to label stimuli (Gevarter et al., 2021, 2022; Marya et al., 2021). Requesting is considered an imperative language function, whereas sharing information or directing attention via commenting, labeling, and answering questions are considered declarative functions (Harbison et al., 2017). Children with ASD often show differences in their use of imperative and declarative functions (Harbison et al., 2017; La Valle et al., 2020). Although requesting relies on natural motivation for actions or items, declarative functions often rely on social reinforcement (La Valle et al., 2020). Challenges with declarative functions may be more pronounced for children with ASD and limited spoken language (Harbison et al., 2017; La Valle et al., 2020). This may be because joint attention abilities (which include a variety of skills that allow an individual to communicate about an object or an activity with another person) are often delayed in children with ASD with limited language (Bruinsma et al., 2004). For this reason, it is not surprising that most AAC research involving children with ASD has focused on teaching requesting skills (Muharib & Alzrayer, 2018). As the majority of such research has not included verb symbols, more research is needed to assess the impact of pragmatic function on the acquisition of verbs.

Given the mixed findings of AAC verb research and the unique characteristics of autism, assessments that can elucidate individualized needs are also critical. Dynamic assessment (DA) is well suited to this task. Using the principles of Vygotsky’s social interactionist theory, DA can determine communication targets within the child’s zone of proximal development (Minick, 1987). During a graduated prompting DA, a child receives varying levels of assistance and an examiner notes the supports needed for successful responses (Bain & Olswang, 1995; Patterson et al., 2013). When utilizing graduated prompting for an AAC DA, an examiner might implement a least-to-most prompt hierarchy involving supports such as a verbal reminder to use the device, an aided model prompt, a gesture prompt, and a physical prompt (Gevarter et al., 2020). For each opportunity a child has to make an AAC response, the examiner would record the lowest prompt level at which the child was able to accurately respond. Some studies have used DA with AAC (Binger et al., 2017a; Gevarter et al., 2020; King et al., 2015). One study involving six preschool-aged children with ASD examined the average levels of support participants needed to make AAC responses that varied in word class, communicative function, and display complexity (Gevarter et al., 2020). Three participants with mild-to-moderate ASD and prior experience with the Picture Exchange Communication System (PECS) were highly proficient with item requests involving noun symbols and showed progress requesting with non-noun targets (including verbs such as OPEN and CLOSE). Three participants with severe ASD and no PECS experience also required fewer supports with noun-based requests than other targets but required high levels of support across targets (Gevarter et al., 2020). 
Although the study design did not permit conclusions regarding the relationships between participant characteristics, DA, and intervention success, three participants who showed progress during DA demonstrated success with similar targets in the Gevarter et al. (2021) study.

The current study aimed to extend AAC DA literature by exploring the learning potential of young children with ASD for acquiring action symbol use across different conditions on an AAC application. Similar to Gevarter et al. (2020), this study did not examine the effects of an ongoing or extensive intervention. Instead, a short series of DA sessions was used to evaluate whether children reduced supports needed to use targets across a limited number of trials. Rather than conduct DA in one sitting, practice opportunities were distributed across sessions to prevent satiation (e.g., losing motivation to request actions) or task frustration. The study aimed to explore whether young children with ASD (a) demonstrate progress in learning to use action verb symbols across three instructional conditions and a control, (b) differ in their use of verb symbols for requesting vs. labeling actions, and (c) differ in their use of verb symbols to label actions across play-based embedded instruction vs. DTT with videos.

Method

Participants

Participants included four children who (a) had an independent diagnosis of ASD; (b) were between the ages of 3 and 6; (c) used fewer than 50 functional spoken words as reported by parents on the Vineland Adaptive Behavior Scales Third Edition (VABS-III; Sparrow et al., 2016); (d) had prior experience with symbolic communication systems such as PECS, sign, or vocal speech, but no prior experience using AAC applications; and (e) could request objects with no more than a gestural prompt on an AAC application. Participants were recruited from agencies serving children with ASD. Five children (Charlie, Elijah, Nate, Pedro, and Sean; all pseudonyms) participated in screening. Charlie was excluded because he required more than gestural prompts to make AAC object requests. Characteristics of the remaining participants (e.g., age, race/ethnicity, prior natural speech, and AAC use) are displayed in Table 1. Table 1 also includes estimates of ASD severity, as measured by the Childhood Autism Rating Scale Second Edition (CARS-2; Schopler et al., 2010), and expressive and receptive language age estimates as measured by VABS-III (Sparrow et al., 2016). During a researcher-created communication and preferences interview, Nate and Sean’s parents both reported that their children primarily used existing communication forms (PECS, manual signs) to request. Elijah and Pedro’s parents reported that existing communication forms (natural speech, signs) were most often used to request or imitate, but occasionally used to fill in words to songs, or label items (e.g., animal toys, favorite Disney characters).

Table 1 Participant characteristics and assessment data

Parents of each participant provided consent on a document approved by a university review board. The consent form described methods to determine a child’s assent (e.g., approaching/engaging in activities). All DA sessions took place in private rooms at a university speech clinic (Nate and Pedro) or at a local applied behavior analysis clinic (Elijah and Sean).

Procedures

Research Design

The study compared three instructional conditions and a control condition using an adapted single phase multielement design (Gevarter et al., 2020). This design demonstrates experimental control in a manner similar to an adapted alternating treatments design (Schlosser, 1999; Sindelar et al., 1985). Conditions are rotated and assigned different targets, but the design does not require a baseline, as differences between conditions establish control (Cooper et al., 2019). Each participant was assigned five unique verbs per condition and received four DA sessions (with 10 trials each) per condition. DA sessions occurred during 45-min visits up to 2 days/week, with two to three conditions implemented each visit. Block randomization was used so that condition order varied, but no condition was presented twice during a visit. DA was intended to be completed in six to eight visits across 3–4 weeks. Due to COVID-19 absences, participants completed the study in 4–6 weeks.
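One way a schedule with these properties could be generated is sketched below. This is an illustration only, not the procedure the researchers used: it assumes each randomized block of the four conditions is split across two visits of two conditions each (the study allowed two to three per visit), and the function name and condition labels are hypothetical.

```python
import random

# Hypothetical labels for the three instructional conditions and the control.
CONDITIONS = ["embedded_request", "embedded_play_label", "dtt_video_label", "control"]

def schedule_visits(sessions_per_condition=4, per_visit=2, seed=None):
    """Return a list of visits; each visit is a list of distinct conditions."""
    rng = random.Random(seed)
    visits = []
    for _ in range(sessions_per_condition):
        block = CONDITIONS[:]   # one session of every condition per block
        rng.shuffle(block)      # randomize the order within the block
        # Split the block into visits. A block never repeats a condition,
        # so no visit can contain the same condition twice.
        for start in range(0, len(block), per_visit):
            visits.append(block[start:start + per_visit])
    return visits

for number, visit in enumerate(schedule_visits(seed=1), start=1):
    print(f"Visit {number}: {visit}")
```

Under these assumptions, four blocks yield eight visits and four sessions per condition, which falls within the planned six to eight visits.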

Independent and Dependent Variables

The independent variables were the different conditions for presenting verb targets. Instructional conditions included (a) embedded action requesting, (b) embedded play action labeling, and (c) DTT video action labeling. The control condition involved asking participants to identify verb symbols without supports. The three instructional conditions utilized the same prompt hierarchy but were differentiated by the format/function of opportunities presented. A support score for each trial presented during a session was assigned based upon the level of support (e.g., prompt type) a participant needed to activate the speech output for the appropriate verb symbol. The primary dependent variable was the average level of support a participant needed to make correct responses. The “Measures” section provides a detailed description of how the average level of support was calculated using support scores. A secondary descriptive measure included the percentage of trials (across sessions) correct at a given support level for each condition/participant (see the “Measures” section).

Materials

Parent report and preference assessments were used to determine items for the AAC item requesting pre-assessment and the DA embedded request and play label conditions. Parent report was also used to select preferred YouTube videos (e.g., short clips from popular children’s cartoons such as Veggie Tales) used to reinforce responses in the DTT labeling condition. Instructional videos that depicted targeted actions were also used in the DTT condition. These videos used Toy Story figurines such as Woody, Buzz, and Rex to model actions against a white background. Preferred and instructional videos were presented on a 15-inch Mac laptop.

Using the Proloquo2Go application (AssistiveWare) on an iPad, researchers created four display pages per child (i.e., one page per condition). Six picture symbols (from SymbolStix) were presented per page in a 2 × 3 array, and each symbol was 2.5 × 1 inches. Each display included five targeted verbs along with one distractor (see Fig. 1). For embedded request and play label conditions, verbs were assigned based on (a) parental suggestions regarding preferred actions for requesting or labeling during play (e.g., parent noted a child likes to jump), (b) how verbs could be grouped during different play contexts (e.g., pour, splash, swim, for a water play activity), and (c) observations of how children interacted with stimuli during preference assessments. For the DTT and control conditions, researchers assigned verbs that parents suggested their children would be interested in but that were not selected for other conditions. Screenshots of display pages were made into paper printouts containing the targeted verb symbols for each child. The printouts were used to assess verb symbol identification during pre-assessment.

Fig. 1 Example of Proloquo2Go display page

Parent Interviews and Preference Assessment

Parent interviewing was used to complete the CARS-2 (Schopler et al., 2010), the expressive and receptive sections of the VABS-III (Sparrow et al., 2016), and a researcher-created communication and preferences interview. The researcher-created interview included questions about children’s existing communication forms/functions and preferences for different items/activities. The interview also included a section where parents were presented with a list of 45 action words and asked to indicate which words their child would show interest in requesting or labeling. Most actions came from the MacArthur-Bates Communicative Development Inventories (Fenson et al., 2007). Action words related to sensory preferences (e.g., squeeze, spin) were also included. Additional words were added based on parent recommendations. After gathering information from parents, direct preference assessments with participants were then conducted. First, a multiple stimulus without replacement preference assessment (DeLeon & Iwata, 1996) was used to determine snack or toy items to incorporate during the AAC item requesting assessment used for screening purposes. The top five items were used for the screening assessment. To select preferred items and actions for DA conditions, the researcher used variations of a single stimulus preference assessment (Pace et al., 1985). To select stimuli for the embedded play label condition, the researcher presented the learner with parent-recommended toys that could be used to model target actions (e.g., using a sensory toy to model the action squeeze). The researcher recorded whether the learner engaged with the toy (e.g., picked it up, played with it) and attended to actions modeled. To determine preferred actions for the embedded request condition, the researcher attempted to engage the learner in parent-suggested actions (e.g., bouncing the child on a ball) and recorded whether the child engaged in the action.
Finally, to determine preferred videos used as reinforcers in the DTT condition, the researcher played YouTube clips suggested by parents (e.g., Veggie Tales) and noted whether the child watched the video and showed signs of interest (e.g., smiling, laughing).

AAC Item Requesting Screening Assessment

During this screening task, participants were presented with the AAC display with symbols representing five preferred snack or toy items selected via preference assessment. The researcher created 20 opportunities for the child to request the preferred items by intermittently offering the child all five items, observing which item the child reached for or pointed to, holding that item up, and presenting the child with the AAC display. The researcher then implemented a least-to-most prompting hierarchy involving a 6-s time delay, spoken reminder + general gesture to use device, model (aided + spoken), specific gesture, and physical prompt. The first 5 of the 20 total trials were practice trials and were not counted for screening purposes. Participants who needed no more than a specific gesture for at least 10 of the 15 assessment trials met study criteria. This criterion was based on findings from Gevarter et al. (2020), which indicated that participants who showed progress with non-noun symbols rarely required more than a gesture prompt with noun symbols used for requesting. Charlie was excluded based on this screening, as he required physical prompts for all trials. Average support scores for participants who met criteria were as follows: 4.5 for Elijah and Pedro (relatively few prompts needed), 3.3 for Nate (mostly requiring a model), and 2.5 for Sean (mostly requiring a model or specific gestures).
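The screening rule above can be expressed as a short check. The sketch below is illustrative: it assumes screening trials were scored on the same 0 to 5 support scale described under Measures (so a specific gesture corresponds to a score of 2), and the function name and sample data are hypothetical.

```python
SPECIFIC_GESTURE = 2  # assumed support score for a specific gesture prompt

def meets_screening(scores, n_practice=5, min_passing=10):
    """Apply the screening criterion to 20 trial scores given in order.

    The first `n_practice` trials are treated as practice and ignored.
    A trial passes when the child needed no more than a specific gesture,
    i.e., the support score is at least SPECIFIC_GESTURE.
    """
    assessed = scores[n_practice:]
    passing = sum(1 for score in assessed if score >= SPECIFIC_GESTURE)
    return passing >= min_passing

# A child who needed physical prompts (score 1) on every trial is excluded.
print(meets_screening([1] * 20))                        # False
# A child who needed only a model (score 3) on 10 assessed trials qualifies.
print(meets_screening([1] * 5 + [3] * 10 + [1] * 5))    # True
```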

Verb Symbol Identification Pre-assessment

For this pre-assessment, participants were presented with paper printouts of Proloquo2Go displays with the verb symbols that had been initially assigned to each condition. The researcher directed the child to show me (verb) and waited 6 s before marking whether the child pointed to or touched the correct symbol. To ensure task completion, participants were intermittently provided with small amounts of preferred snack or toy items regardless of whether responses were right or wrong. Twenty-four verbs (i.e., six per condition) were presented to each child in random order. If a child identified a symbol across three trials, that verb was not included in DA.

If a participant did not identify a verb on the first trial, additional trials with that verb were not conducted. Three of the four participants (Elijah, Nate, Sean) were unable to identify any symbols on the first trial. Additional trials with verbs correctly identified on the first trial were presented for Pedro (in a different order) after a short break (about 5 min). To prevent task fatigue, third trials for verbs Pedro identified at least twice were presented during an additional pre-assessment visit. In total, Pedro correctly identified nine of his selected verbs (38%) across all three trials. Additional preference assessments and symbol identification tasks with 12 additional verbs were conducted until enough unknown verbs were identified for DA conditions.

Dynamic Assessment

The first author (an assistant professor in speech and hearing/behavior analyst) conducted DA for Elijah and Sean, and the second author (a graduate student in speech and hearing) conducted DA for Nate and Pedro. A DA session in each condition consisted of 10 opportunities to use targeted verbs (i.e., two trials with each verb). Two trials for each target were presented in a row unless the child stopped showing interest in a particular action (in which case the second trial for the same action occurred at a later point in the session). After a trial was initiated, the AAC display was placed in front of the child. The researcher used a time delay of 6 s to wait for the child to make a response. In instructional conditions, non-responses or incorrect responses (e.g., using a symbol that did not match the opportunity or pressing display parts that did not produce output) were corrected using a least-to-most prompt hierarchy. The hierarchy included the following supports: (a) spoken reminder + general gesture, (b) model (aided + spoken), (c) specific gesture, and (d) physical prompt. The spoken reminder + general gesture involved the researcher saying Use the iPad and gesturing in the direction of the iPad. Modeling involved the researcher activating the correct symbol on the iPad (e.g., pressing OPEN) and providing a spoken model (e.g., saying open). A specific gesture involved the researcher pointing to the correct symbol but not activating it (e.g., hovering pointer finger just above the OPEN symbol). Lastly, a physical prompt involved the researcher guiding the child’s hand to press the correct symbol. The researcher moved to a more restrictive level on the prompt hierarchy when the participant either made an incorrect response at the prior level (e.g., selected a symbol that did not match the communication opportunity) or did not respond at the prior level after 6 s.
The specific gesture was considered to provide a higher level of support than the model as it provided a more permanent cue to use the correct symbol (Gevarter et al., 2020). The spoken reminder + general gesture was skipped if an incorrect response included touching the iPad. If a child resisted a physical prompt, he was not forced to select a symbol. Independent or prompted correct responses resulted in reinforcement specific to instructional conditions and a grammatically complete verbal expansion (e.g., You want to jump!). As this was not an intervention, the use of more restrictive prompts was not faded during DA. Specific procedures for each condition are described below.
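The escalation logic of the prompt hierarchy can be summarized as a simple loop. The following is a minimal sketch, not the study protocol itself: the outcome codes ("correct", "incorrect_touch", "incorrect", "none") and the function name are hypothetical, and timing (the 6-s wait at each level) is not modeled.

```python
# Least-to-most hierarchy, ordered from least to most restrictive support.
HIERARCHY = ["time_delay", "spoken_reminder", "model", "specific_gesture", "physical"]

def run_trial(outcomes):
    """Return the level at which a correct response occurred, or None.

    `outcomes` maps a level name to a hypothetical outcome code:
    "correct", "incorrect_touch" (incorrect response that touched the iPad),
    "incorrect", or "none" (no response, or a resisted physical prompt).
    Levels missing from the mapping are treated as "none".
    """
    skip_reminder = False
    for level in HIERARCHY:
        if level == "spoken_reminder" and skip_reminder:
            continue  # reminder skipped when the child already touched the iPad
        outcome = outcomes.get(level, "none")
        if outcome == "correct":
            return level
        if level == "time_delay" and outcome == "incorrect_touch":
            skip_reminder = True
    return None  # no correct response at any level

print(run_trial({"time_delay": "none", "spoken_reminder": "none", "model": "correct"}))
```

Returning `None` stands in for trials with no correct response, including cases where a child resisted a physical prompt and was not forced to select a symbol.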

Embedded Action Requesting

During this condition, to create a communication opportunity, the researcher first allowed the child to engage in the target action for a short period of time (e.g., jumping on trampoline) or modeled a desired action the child would need the adult’s assistance to complete (e.g., opening a container with preferred toys). Next, the researcher interrupted or stopped the activity (e.g., stopped the child from jumping, or closed box with preferred items), asked the child What do you want to do? or What should we do? while presenting the iPad, and used the prompt hierarchy as needed. Reinforcement consisted of continuing the desired action. If the child did not show interest in an opportunity, the researcher could repeat the opportunity, stop the trial and re-introduce that action at a later point, or vary materials (e.g., place other preferred items inside the box to increase motivation for OPEN).

Embedded Play Action Labeling

In this condition, the researcher allowed the child to engage freely with play items and intermittently looked for opportunities to model targeted actions (e.g., clinician dropped a ball or showed a dinosaur biting). If the child did not attend, the researcher directed the child’s attention by saying Look and repeating the model/question. Materials could vary slightly to increase attention (e.g., using a different dinosaur to model biting). After modeling the action, the researcher asked What did _____ do? or What happened? while presenting the iPad, and used the prompt hierarchy as needed. Reinforcement involved allowing the child to return to play (with the researcher joining in play as appropriate).

DTT Video Action Labeling

To create a communication opportunity in this condition, the researcher played a short video clip showing the target action. After the video ended, the researcher asked What did _____ do? or What happened? while presenting the iPad display, and used the prompt hierarchy as needed. If the child was not attending to the stimuli, the researcher directed the child’s attention by saying Look. Reinforcement consisted of playing a preferred YouTube video for 30 s. Additional reinforcement in the form of small sensory toys was added for Nate, who had difficulty transitioning to this condition.

Control

In this condition, the researcher presented the AAC display and said Show me (verb). The researcher then waited 6 s for a response but did not provide any prompts. Intermittent reinforcement using toys, snack items, and positive verbal feedback (Great job pointing!) was provided for participating but was not contingent upon correct responding.

Measures

Screening Measures

The CARS-2 (Schopler et al., 2010) was used to confirm independent ASD diagnoses and provide estimates of autism severity. Severity classifications of the CARS-2 (e.g., mild-to-moderate, severe) correlate with other measures (Reszka et al., 2013). The VABS-III (Sparrow et al., 2016) was used to gather information regarding communication delays and the number of spoken words children were reported to use. The VABS-III has indicators of validity and strong internal consistency (Pepperdine & McCrimmon, 2018).

Average Support Scores

Support scores were adapted from prior research (Binger et al., 2017b; Gevarter et al., 2020; Patterson et al., 2013). For instructional conditions, support scores were 5 = time delay, 4 = spoken reminder + general gesture, 3 = model (aided + spoken), 2 = specific gesture, 1 = physical prompt, and 0 = no correct response. For instructional conditions, a score of 0 was assigned when a participant resisted a physical prompt (e.g., reaching arm away) or continued to reject a communication temptation (e.g., pushing away materials). For the control, as prompting was not utilized, participants could only receive a score of 0 or 5. Table 2 provides definitions of each support level.
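The scoring scheme can be illustrated with a short sketch. It assumes the session-level average is the arithmetic mean of the trial scores in that session; the dictionary keys, function name, and sample session are hypothetical.

```python
# Support scores for instructional conditions, as listed in the text.
SUPPORT_SCORES = {
    "time_delay": 5,
    "spoken_reminder": 4,      # spoken reminder + general gesture
    "model": 3,                # aided + spoken model
    "specific_gesture": 2,
    "physical": 1,
    "no_correct_response": 0,
}

def session_average(levels):
    """Mean support score for one session, given the level reached per trial."""
    scores = [SUPPORT_SCORES[level] for level in levels]
    return sum(scores) / len(scores)

# Hypothetical 10-trial session: five independent responses, five after a model.
session = ["time_delay"] * 5 + ["model"] * 5
print(session_average(session))  # (5*5 + 5*3) / 10 = 4.0
```

Higher averages indicate that less support was needed. For the control condition, only the scores 0 and 5 would appear.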

Table 2 Support level and score definitions

Percentage of Trials Correct at a Given Support Level

To calculate this measure, the researchers first computed the total number of times a participant responded correctly at a given specific support level (e.g., correctly responded at level 1 after the time delay alone) across all sessions within a condition (e.g., embedded action requesting). Next, this number was divided by the total number of trials in that condition (i.e., 40) and multiplied by 100. This process was repeated across all support levels, conditions, and participants.
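As a worked illustration of this calculation, the sketch below tallies correct responses for one participant in one condition. Results are keyed here by support score (5 = time delay alone through 1 = physical prompt), which is an assumption about labeling; the function name and sample data are hypothetical.

```python
from collections import Counter

def percent_correct_by_score(scores, total_trials=40):
    """Percentage of trials answered correctly at each support score (1-5).

    `scores` holds one support score per trial, pooled across the four
    sessions of a condition. A score of 0 means no correct response, so it
    is not reported as a support level but still counts in the denominator.
    """
    counts = Counter(scores)
    return {score: 100 * counts.get(score, 0) / total_trials
            for score in range(5, 0, -1)}

# Hypothetical condition: 10 independent, 20 after a model, 10 with no correct response.
pooled = [5] * 10 + [3] * 20 + [0] * 10
print(percent_correct_by_score(pooled))
```

In this hypothetical case, 25% of trials were correct after the time delay alone and 50% after a model; the remaining 25% had no correct response.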

Treatment Fidelity

All sessions were video recorded, and the researchers created a task analysis of steps needed to complete a trial. The steps included were as follows: (a) present communication opportunity, (b) present display, (c) use time delay, (d) implement prompt hierarchy when needed (instructional conditions), (e) reinforce prompted or independent responses (instructional conditions), and (f) use verbal expansion (instructional conditions). At least one session in each condition per participant plus two additional sessions from any condition (i.e., 38% of all sessions) were randomly selected for fidelity checks. The first and second authors independently conducted fidelity checks for sessions in which they were not the experimenter. Coders marked whether each step was completed correctly for each trial in a session. Fidelity was calculated by dividing the number of steps followed correctly by the total number of steps and multiplying by 100. Scores were 97% (range 93–100%) for Nate, 99% (range 98–100%) for Pedro, 96% (range 88–100%) for Sean, and 100% for Elijah, indicating high levels of adherence to the DA protocol.

Data Analyses

The first and second authors independently coded videos and recorded support scores. For each trial in an instructional condition session, the coder marked the level of support (1–5) a participant needed to select the targeted verb symbol or scored a zero when no correct independent or prompted response was made. For the control condition, dichotomous scores of either 0 (symbol not correctly identified) or 5 (symbol correctly identified) were assigned. Average support scores for each session were calculated and graphed across sessions. Visual analysis was used to compare differences in level, trend, and variability between conditions (Kratochwill et al., 2013). The percentage of trials across sessions that were correct at a given support level for each participant and condition was graphed in a bar chart.
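The session-level averaging can be sketched as follows, assuming per-trial scores are stored as integers. The score labels follow the definitions given earlier in this section; the function name and the example scores are hypothetical.

```python
# Support score definitions for instructional conditions, as given in the text.
SUPPORT_LEVELS = {
    5: "time delay",
    4: "spoken reminder + general gesture",
    3: "model (aided + spoken)",
    2: "specific gesture",
    1: "physical prompt",
    0: "no correct response",
}


def average_support_score(trial_scores, control=False):
    """Mean support score for one session.

    In the control condition only 0 or 5 is possible because no prompts
    were delivered; instructional conditions allow any score from 0 to 5.
    """
    allowed = {0, 5} if control else set(SUPPORT_LEVELS)
    if not trial_scores:
        raise ValueError("session has no trials")
    if any(score not in allowed for score in trial_scores):
        raise ValueError("score outside the range allowed for this condition")
    return sum(trial_scores) / len(trial_scores)


# Hypothetical sessions for illustration only (not study data).
instructional_avg = average_support_score([5, 3, 3, 2, 3, 1, 0, 3])
control_avg = average_support_score([0, 0, 5, 0], control=True)
print(instructional_avg, control_avg)
```

Because higher scores correspond to less intrusive support, an increasing session average indicates that the participant needed less help over time.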

Inter-observer Agreement (IOA)

The first and second authors independently conducted IOA checks for participants for whom they were not the primary coder. The same sessions used for treatment fidelity checks were also used for IOA (i.e., 38% of sessions). Observers coded the level of support for each trial in a session for each participant. Trial-by-trial IOA was calculated by dividing the number of agreements by the sum of the agreements and disagreements and multiplying by 100. Average IOA scores were 85% (range 80–100%) for Elijah, 97% (range 93–100%) for Nate, 99% (range 98–100%) for Pedro, and 96% (range 88–100%) for Sean, indicating that the data were reliably coded.
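A minimal sketch of the trial-by-trial IOA formula follows. It assumes the two observers' support scores are stored as equal-length lists in trial order; the function name and example scores are hypothetical.

```python
def trial_by_trial_ioa(primary_scores, secondary_scores):
    """Agreements / (agreements + disagreements) * 100 for one session."""
    if len(primary_scores) != len(secondary_scores):
        raise ValueError("both observers must score the same trials")
    if not primary_scores:
        raise ValueError("no trials to compare")
    # A trial counts as an agreement only when both scores match exactly.
    agreements = sum(a == b for a, b in zip(primary_scores, secondary_scores))
    disagreements = len(primary_scores) - agreements
    return agreements * 100 / (agreements + disagreements)


# Hypothetical codes from two observers for one 10-trial session.
primary = [5, 3, 3, 2, 1, 0, 3, 5, 3, 2]
secondary = [5, 3, 2, 2, 1, 0, 3, 5, 3, 2]
session_ioa = trial_by_trial_ioa(primary, secondary)
print(session_ioa)
```

The observers disagree on one of ten trials in this example, so the session IOA is 90%.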

Results

Individual participant graphs showing the average support scores across conditions and sessions are presented in Figs. 2, 3, 4, and 5. Bar charts showing the percentage of trials across sessions that were correct at a given support level for each participant/condition are presented in Fig. 6.

Fig. 2 Elijah’s average support scores across sessions and conditions

Fig. 3 Nate’s average support scores across sessions and conditions

Fig. 4 Pedro’s average support scores across sessions and conditions

Fig. 5 Sean’s average support scores across sessions and conditions

Fig. 6 The percentage of trials correct at a given support level for each participant and condition

Progress in Learning to Use Verb Symbols

Experimental control, with no overlap between instructional conditions and the control condition, was established for three of the four participants (Elijah, Sean, Nate). These participants correctly responded in the control condition with the time delay (level 5) less than 10% of the time. The fourth participant (Pedro) demonstrated an increase in responding in the control condition which overlapped with data in all instructional conditions. He correctly responded at level 5 in 72% of control trials. Of note, Pedro was the only participant who receptively identified verb symbols during pre-assessment. Although DA used targets he did not identify, the control condition was similar in nature to the pre-assessment.

Visual analysis also indicated that three participants (Elijah, Pedro, and Sean) showed modest increasing trends in average support scores (i.e., reducing levels of support needed) in at least two instructional conditions. Across instructional conditions, their percentage of correct responding at level 5 ranged from 15 to 25% of trials for Elijah, 15 to 43% for Pedro, and 18 to 28% for Sean. Nate did not demonstrate increasing trends in support scores in any instructional conditions, but did respond with low support levels (e.g., models) in some conditions. His percentage of correct responding at level 5 ranged from 10 to 13%. Across a majority of participants and conditions, models (Level 3) were most commonly required for correct responding. Participants infrequently required a general reminder to use the iPad (Level 4) as incorrect responses often involved interacting with the iPad but selecting the wrong symbol.

Differences in Learning Based on Pragmatic Function

During their first requesting sessions, three participants (Elijah, Nate, and Sean) required less support than they did in their first session of either labeling condition. These participants responded to most initial request trials with no more than a model (level 3). In comparison, during early sessions involving labeling, Elijah and Sean more frequently required specific gestures (level 2) and occasionally needed physical prompts (level 1). Nate often required specific gestures (level 2) or physical prompts (level 1) and occasionally did not respond correctly even with prompting (score of 0) in early labeling sessions. Pedro’s score for his first requesting session was similar to that for his first DTT session (responding across a range of levels 1–5), but in his first embedded play labeling session, he required more physical prompting (level 1).

Over time, Nate continued to maintain differences between requesting and labeling. His data indicated minimal overlap between requesting and the DTT video labeling and no overlap between requesting and embedded play labeling. Across sessions, 65% of his correct requesting responses occurred following a model (Level 3), compared to 20–28% of trials for labeling conditions. In the requesting condition, he never received a level 0 score (no correct response) and required physical prompts in only 10% of trials. Comparatively, he did not respond correctly in 8–13% of labeling trials and needed physical prompts in 18–43% of trials.

Although Elijah and Pedro showed modest increasing trends with requesting, increasing trends in labeling conditions led to overlap between conditions. Elijah responded to level 3 (model) and level 5 (time delay) cues at similar rates across instructional conditions (48–53% at level 3 and 15–25% at level 5), responding to these lower-level cues only slightly more often in requesting than in labeling conditions. He also never required physical prompting for requesting, although it was occasionally needed in labeling conditions. Pedro showed differences between conditions based on level 5 responding. Specifically, he responded at level 5 in 43% of request trials compared to 15–25% of labeling trials. Although Sean maintained requesting with low supports, he did not show clear increases in trend and had limited differentiation between instructional conditions over time. His responding at level 5 was slightly higher for requesting (28% of trials) than for labeling (18–25% of trials). Similarly, his responding at level 3 was slightly higher in requesting (63%) than in labeling conditions (55–58%).

Differences Between Embedded Play and DTT Video Labeling

Minimal differences in average support scores between embedded play labeling and DTT video labeling were apparent. All participants had overlap between the two conditions (with Nate having slightly less overlap). Elijah had a slightly steeper increasing trend in the DTT condition but also showed growth in the play condition. His percentage of support level use across the two conditions was similar. Sean and Pedro showed slightly steeper increases in trend in the play condition than DTT. Sean responded at level 5 (time delay) slightly more often for embedded play trials (25%) than DTT (18%). Pedro showed the opposite pattern, responding at level 5 in 25% of DTT trials and 15% of play trials. Nate’s scores did not increase in either condition. Although he responded to level 3–5 cues at similar rates in both conditions, he required level 1 (physical prompts) in 43% of embedded play trials compared to only 18% of DTT trials. He did not respond correctly in 13% of embedded play trials and 8% of DTT trials.

Discussion

Results of this study support prior research demonstrating mixed outcomes when teaching children with ASD to use verb symbols (Carnett et al., 2019; Gevarter et al., 2021, 2022; Holyfield, 2021). Experimental control showing differences between instructional conditions and the control condition was established for three of the four participants during DA. Three participants with mild-to-moderate ASD (Elijah, Sean, Pedro) also showed evidence of learning to use verb symbols with reduced levels of support in more than one instructional condition. Growth in each instructional condition varied slightly across these three participants. The fourth participant (Nate), who had severe characteristics of ASD, did not demonstrate growth in any condition. However, despite showing limited signs of growth, Nate was able to respond to low level prompts (i.e., models) in the requesting condition from the start of DA and his support scores across all instructional conditions did not overlap with control. Given more practice during intervention, he might show progress with verb targets when provided with minimal prompts for requesting, or more intensive prompts for labeling. Although the observed differences in DA performance based upon ASD severity level align with findings from Gevarter et al. (2020), results differ in that prior PECS experience did not appear to impact success. Elijah and Pedro, who showed progress during DA, did not have prior PECS experience, but Nate was a PECS user. Thus, the current study does not suggest that experience with low-tech AAC is a prerequisite for introducing high-tech systems or teaching verbs.

This study does suggest, however, that some children with ASD may require more time to acquire independent use of verb symbols. Although most participants decreased support levels and produced some independent responses across instructional conditions, none independently initiated the use of verb symbols for a majority of responses. In comparison, children with mild-to-moderate ASD in the Gevarter et al. (2020) study showed consistent independent responding using noun symbols to request items and showed greater improvement with non-noun symbols including verbs. In this study, researchers only assessed noun use during pre-assessment. Supporting the findings of Gevarter et al. (2020), most participants required less support with item requesting during the pre-assessment than during initial DA sessions with verbs.

Learning to use action verb symbols expressively could also be impacted by receptive skills. Some prior verb intervention studies have excluded children who could not receptively identify targeted actions (Frampton et al., 2016; Marya et al., 2021). In this study, to control for differences in symbol knowledge across conditions, the researchers only targeted verbs that participants could not receptively identify. Only Pedro receptively identified verb symbols during pre-assessment. Although the verbs he identified were not used in any of the DA conditions, he was the only participant who showed consistent independent responding in the control condition, which was similar to the receptive pre-assessment. Interestingly, Pedro’s receptive skills did not immediately generalize to the expressive instructional conditions.

Although not formally assessed, anecdotal evidence also suggests that symbol characteristics may affect verb learning. Across conditions, some of the symbols that participants were most likely to use independently appeared to be more iconic. For example, symbols representing the actions blow and jump included relevant objects within the graphic representation of the actions (e.g., the BLOW symbol included a balloon, and the JUMP symbol included a trampoline) that may have made the symbols more transparent, compared to symbols such as GO (represented by a green arrow). Some participants also showed preferences for specific AAC symbols depending on speech output, appearance, location, or overall interest in the activity with which the symbol was associated. For example, Pedro chose RIDE consistently in the DTT condition and appeared to enjoy the speech output of the symbol (e.g., would press it repetitively while smiling). Nate showed a location preference, as his incorrect responses often involved choosing symbols located in the top center of the display. Sean also frequently activated the SWIM symbol in the play condition, possibly because the symbol was associated with his larger water play routine.

In terms of differences in responding based on communicative function, findings suggest that learning to use verbs to request may have initially required less support than using verbs for labeling. Specifically, three participants required lower levels of support in their first requesting DA sessions than they did in either of their first labeling sessions. Although such findings support prior research suggesting that children with ASD may be more likely to use imperative than declarative functions (Harbison et al., 2017; La Valle et al., 2020), there are several relevant factors to consider. First, although all participants in this study were reported to have prior experience using request functions with other forms of communication (e.g., PECS, manual sign, natural speech), only Elijah and Pedro previously used communicative responses for labeling purposes. This may have differentially impacted participants’ ability to generalize existing forms and functions of communication to the AAC application. Additionally, participant differences appeared to impact growth with communicative functions over time. Across sessions, Nate was the only participant who maintained clear-cut support score differences between requesting and labeling conditions. This difference was likely influenced by the fact that Nate showed difficulty attending to stimuli in the labeling conditions. These findings support prior research suggesting that children with limited joint attention skills may experience more difficulties with declarative language (La Valle et al., 2020). In contrast to Nate, other participants appeared to be motivated by and interested in labeling activities. For example, during DTT, Elijah would laugh at specific videos and try to replay them. At the start of a DA session, Sean would seek out materials used for his play labeling activity.

Although these participants also showed interest in requesting, in some instances, motivation to request certain actions varied over time. For instance, Elijah showed decreased interest in requesting using a symbol for PUT IN. Although the clinician made efforts to increase motivation by using different stimuli (e.g., putting small figures in a truck, or putting mini cars inside a bucket), this did not always increase interest. In contrast, similar efforts to ensure Pedro’s motivation to request appeared more successful. Over time, he showed higher rates of responding to the time delay (level 5) alone in the requesting condition than in either of his labeling conditions. Thus, motivation to use a particular communicative function may contribute to verb symbol use across contexts.

Although there were some differences between requesting and labeling, there were fewer differences between the two labeling conditions (embedded instruction with live action models during play versus DTT with video models). Across participants, there was consistent overlap between conditions. When there are minimal acquisition differences between embedded and DTT approaches, other potential advantages of an embedded approach such as fewer challenging behaviors, improved affect, or increased generalization to natural contexts should be considered (Sigafoos et al., 2006). Although Nate more frequently required physical prompts in the play condition than with DTT, he did not make progress in either condition. After Nate showed escape behaviors with DTT, the use of tangible reinforcers (e.g., small sensory toys) was added so that he would engage in the task. Even though Nate did not show escape behaviors during play (where he interacted with a variety of preferred items), he had difficulty shifting his attention to the stimuli the researcher used to model actions. Although children with limited joint attention skills might benefit from a video approach with fewer distractions, motivation to attend to videos must also be considered (Charlop-Christy et al., 2000; Plavnick & Vitale, 2016).

Limitations and Directions for Future Research

One of the limitations of this study is that DA did not reveal consistent learning differences between instructional conditions across participants. However, findings did provide insight into how personal characteristics (e.g., ASD severity, joint attention skills, motivational interests), symbol features, and prompt levels might impact learning. As the design of this study prevents explicit claims regarding the utility of DA for differentiating the importance of these variables, future research should examine these relationships. For instance, formal assessments could be used to compare groups of children with varying joint attention skills. Follow-up intervention studies can also be used to evaluate whether DA predicts intervention success.

Other study limitations may have also contributed to the fact that participants demonstrated limited independence in using verbs. First, although the intended timeframe for DA was 2–3 weeks, most participants had COVID-19-related absences, leading to an extended assessment period that may not have supported skill acquisition. Additionally, participants only had two trials with each verb target per session, compared to four trials per target in the Gevarter et al. (2020) study. Future studies could focus on assessing fewer targets at a time. Although prior research suggests that increasing dosage or teaching trials may also improve AAC responding for children with ASD (Logan et al., 2017), conducting additional DA sessions would defeat the purpose of assessment, as it would be hard to distinguish DA from intervention. Instead, research could evaluate whether responding to initial trials (e.g., the first 10) with low levels of support (e.g., no more than level 3 models) predicts intervention success. Furthermore, as the researchers did not control for symbol transparency, future research should explore how the use of iconic versus non-iconic symbols affects verb learning. More research examining the benefits of animated symbols is also warranted (Schlosser et al., 2019). It may also be useful to compare whether teaching verb targets alone versus simultaneously with nouns improves learning.

Additional study limitations relate to how DA sessions were scored and implemented. First, even though most participants did not show progress in the control condition (i.e., did not respond independently without supports), because participants did not receive any support in the control condition, a learner could only earn a score of 0 (indicating no correct response) or 5 (a correct response with a time delay) for any given trial. In contrast, the instructional conditions were scored on a scale ranging from 0 to 5 to indicate the level of prompting needed. Most participants’ data showed differentiation between the control and instructional conditions because they were unable to respond correctly without prompts. However, the fact that Pedro did acquire targets in the control condition without any support could indicate that prompts may not always be needed for familiar tasks, as Pedro was the only participant to demonstrate receptive symbol identification prior to DA. Future research could explore how children with prior receptive symbol knowledge generalize that knowledge across receptive and expressive AAC tasks.

Finally, in this study, DA was conducted in one-on-one clinical environments by trained researchers. The use of the DA procedures was not tested by speech-language pathologists (SLPs) or other educators in natural contexts (e.g., schools, homes). Although an SLP graduate student learned to use techniques with fidelity without extensive training, replication across different implementers is needed. Researchers should seek input from practicing SLPs about feasibility or adaptations to DA procedures and explore processes for training clinicians in DA.