Regular physical activity, which the World Health Organization (WHO) defines as “any bodily movement produced by skeletal muscles that requires energy expenditure” (2018, p. 1), is beneficial in many ways. For example, it helps to control weight, reduce the likelihood of heart disease and type 2 diabetes, improve mood, boost energy, counter depression, and foster sound sleep (Mayo Clinic, 2019; WHO, 2018). Only modest energy expenditure is required for benefits to accrue. The WHO recommends that each week adults engage in at least 150 minutes of moderate-intensity aerobic physical activity, 75 minutes of vigorous-intensity aerobic activity, or a combination of the two. The WHO (2018) also recommends muscle-strengthening activities involving major muscle groups on two or more days a week. Worldwide, most adults (75%) fail to attain these levels of activity (WHO, 2018).

Individuals with an intellectual disability (ID) are even less likely to exercise regularly than members of the population at large (Department of Health and Human Services, 2017; Oliviedo et al., 2017; Ptomey et al., 2017). Several studies reviewed elsewhere (Bouzas et al., 2019; Hassan et al., 2019; St John et al., 2020) have demonstrated that a variety of interventions increase physical activity in individuals with an ID and, when activity increases, beneficial physiological, behavioral, and psychological outcomes are commonly observed. Children and adolescents were the participants in most of the studies covered by these reviews, and Peck (2020) recently contended that more work with adults is needed. In 2017, Temple et al. reviewed research with adults. They found six relevant studies, none of which were conducted in a school setting. The absence of research in school settings is unsurprising because public education for most students ends when they reach 18 years of age, that is, adulthood.

Some states in the United States, however, provide services to special education students more than 18 years of age. Those services typically prepare students to live as independently, and as well, as possible once formal schooling ends. Given the wide range of benefits afforded by regular physical activity, interventions that foster regular exercise by adults with an ID in school settings would benefit them in those settings. Moreover, and importantly, such interventions might well increase the likelihood of participants continuing to exercise after their formal schooling ends. Working with members of a local school that serves adults (18–26 years of age), we developed and evaluated an intervention package designed to increase physical activity in members of this population. We selected components of the package based on the results of prior studies described below.

Token reinforcement is one intervention that has proven effective in increasing exercise in adults with an ID outside of school settings. For example, Bennett et al. (1989) used token reinforcement to increase the number of revolutions completed on a stationary bicycle by three young women with an ID. Croce and Horvat (1992) also used token reinforcement to increase exercise in three young adults with an ID, but they targeted cycling and running. More recently, Krentz et al. (2016) and Valbuena et al. (2018) used token reinforcement to increase walking by adults with an ID.

There is also evidence that offering individuals a choice of physical activities can increase motivation to participate, thereby increasing overall levels of physical activity. Prusak et al. (2004) and Ward et al. (2008) found that middle schoolers who were offered a choice of activity during gym class showed greater levels of self-determination and physical activity than their peers who were not offered a choice. In 2008, Patall and colleagues conducted a meta-analysis of 41 studies that examined the effect of choice on intrinsic motivation and related outcomes for individuals of all ages and ability levels. They found that “providing a choice enhanced intrinsic motivation, effort, task performance, and perceived competence, among other outcomes” (p. 270).

Finally, research has shown that individuals with an ID respond well to modeling, specifically video modeling. For example, Mechling et al. (2014) used continuous video modeling to assist adults with an ID in the completion of multi-component tasks. Spivey and Mechling (2016) taught adults with an ID social safety skills with video modeling that incorporated a constant time delay.

Previous studies demonstrate the positive effects of token reinforcement, choice, and modeling on physical activity levels. They also suggest that package interventions that combine these strategies can be extended to benefit adults with an ID. One extension is to muscle-strengthening activities, which the WHO (2018) deems an important part of physical activity. Another viable extension is to classroom-wide applications. Many of the studies described above (e.g., Bennett et al., 1989; Croce & Horvat, 1992; Krentz et al., 2016; Mechling et al., 2014; Spivey & Mechling, 2016; Valbuena et al., 2018) arranged interventions for a small number of individuals (three, three, five, four, three, and five, respectively). Such small-N research is characteristic of applied behavior analysis and is certainly not a problem, but interventions that can be jointly and effectively applied to a relatively large number of individuals are of obvious practical value.

The purpose of the present study was to evaluate the effects of a classroom-wide treatment package on the number of steps taken and calories burned by young adults with an ID. Given that previous studies involving conditioned reinforcement have proven effective (Bennett et al., 1989; Croce & Horvat, 1992; Krentz et al., 2016; Valbuena et al., 2018), the treatment package included token reinforcement. It also involved modeling (live followed by video) of target responses (dancing or strengthening activities). Although, to our knowledge, modeling has not been incorporated in studies that increased exercise by adults with an ID, Ptomey et al. (2017) reported that a 12-week, web-based group exercise program was effective in increasing exercising by 29 of 31 adolescents with intellectual or developmental disabilities. In that study, group leaders demonstrated the activities to be performed and, given the general effectiveness of modeling as a strategy for changing behavior in people with disabilities (see Cooper et al., 2020), the leaders’ modeling of desired actions may well have contributed to the success of the intervention. A final component of our intervention was giving participants a daily choice between dancing and strength training. Both aerobic and strength training activities contribute to health (WHO, 2018) and we wanted to ascertain whether our participants would engage in both. To our knowledge, no published study has combined the three elements of our package intervention (i.e., token reinforcement, modeling, and choice) to increase exercising by adults with ID. The present study was initiated at the request of school staff, who wanted to develop and implement an effective, easy-to-use school-wide exercise program. They, and we, asked a simple research question: Will an intervention package comprising modeling, choice, and reinforcement reliably increase steps taken and calories burned by young adults with ID?

Method

Participants

Six young adults (ages 19–24 years) enrolled in a school for students with special needs were selected for careful study; they were the primary participants in the study. Two students came from each of three functionally equivalent classrooms (i.e., students within these classrooms had similar skill, verbal, and activity levels). These classrooms were pre-determined by the school principal who requested the intervention. Only six primary participants were selected because the number of measurement devices was limited. Inclusion criteria were teachers’ reports that the selected students did not exhibit significant disruptive behavior (e.g., self-injury or physical assault), did not actively refuse to engage in scheduled physical activities, and did not regularly miss school. All students in each classroom who met these criteria were identified. The three classrooms typically held 18, 7, and 20 students, and 13, 3, and 16 students from these respective classrooms met inclusion criteria. The primary participants were randomly selected from the three pools of potential participants. In selecting participants, a number was assigned to each potential participant, written on a piece of paper, and placed in one of three bowls that corresponded to the potential participant’s classroom. Researchers blindly selected two pieces of paper from each bowl, representing the selected primary participants.
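The bowl-drawing procedure amounts to stratified random sampling without replacement, with classrooms as strata. The following is a minimal sketch of that idea, not the procedure the researchers actually used (they drew slips of paper by hand). The function name, the numeric student IDs, and the optional seed are hypothetical; only the pool sizes (13, 3, and 16) and the two draws per classroom come from the text.

```python
import random


def select_primary_participants(pools, per_classroom=2, seed=None):
    """Draw `per_classroom` IDs without replacement from each classroom pool.

    `pools` maps a classroom label to the list of eligible student numbers.
    The seed is a hypothetical convenience for reproducibility.
    """
    rng = random.Random(seed)
    return {
        classroom: sorted(rng.sample(ids, per_classroom))
        for classroom, ids in pools.items()
    }


# Pool sizes reported in the text: 13, 3, and 16 eligible students.
# The numbering 1..n is an illustrative assumption.
pools = {
    "Classroom 1": list(range(1, 14)),
    "Classroom 2": list(range(1, 4)),
    "Classroom 3": list(range(1, 17)),
}

selected = select_primary_participants(pools, seed=0)
for classroom, ids in selected.items():
    print(classroom, ids)
```

Sampling within each pool separately guarantees two primary participants per classroom, which a single draw from the combined pool of 32 students would not.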

All remaining participants were considered secondary participants, as their steps and calories were not counted (i.e., due to limited measurement devices). However, their overall participation was tracked, as was the overall participation of the primary participants. Informed consent was obtained from all participants, or their legal guardians, and a Human Subjects Institutional Review Board approved the study. In cases where the participants could not legally consent for themselves, verbal assents were obtained by explaining and demonstrating the activities and asking the individuals if they wished to partake.

Table 1 provides demographic information for the six primary participants, three of whom were men and three of whom were women. Five of them exhibited a cognitive impairment (i.e., ID), while one had experienced a traumatic brain injury. All the primary participants were fully mobile; however, two had limited use of one arm. All were vocally verbal, to varying degrees.

Table 1 Participant demographics

Ten staff members employed by the school acted as interventionists throughout the study. The head classroom teachers accounted for three of these staff, whereas the remaining seven were paraprofessionals. None of these staff members had special training in relevant areas, such as exercise science, personal training, coaching, kinesiology, physical therapy, physical education/adaptive physical education, or behavior analysis. All ten staff members were female and ranged in age from 22 to 42 years. Their experience ranged from 1 to 7 years.

Materials

The study took place in two 84 m² rooms within the school. Each room contained a projector, laptop or desktop computer, speaker, iPod®, and timer. Each morning, personalized popsicle sticks and glass jars were used to provide all participants a choice between physical activities (i.e., dancing or strength training). A Fitbit Charge HR® (San Francisco, CA) device was placed on the wrist of each primary participant at the beginning of each session and removed at its end. Classroom sticker charts and stickers were used for the token economy. At the end of each week, a prize box containing a variety of trinkets (e.g., pencils, spinning tops, fidget spinners) and small edibles (e.g., fruit strips, crackers, granola bars) was presented to student participants who had earned at least 75% of available stickers for that week. The items in the prize box were determined by participant preference. Prior to the study, the lead researcher entered each classroom and asked students to list items that they found motivating. Participants could keep all prizes indefinitely.

Experimental Design

A multiple baseline across settings (classrooms) experimental design (Baer et al., 1968), with a reversal component, was used in this study. For Classroom 1, Classroom 2, and Classroom 3, the initial baseline condition lasted three, five, and seven days, respectively, although due to absences not all participants were present each day. Following baseline, students in each classroom were exposed to the following conditions for three, one, six, two, three, and one weeks, respectively: live model, baseline, video model, live model, video model, and baseline. Condition changes were made based on the following: (a) consideration of participants’ performance, (b) school schedule, and (c) staff request (i.e., the live model condition required significant physical effort, which was not favored by staff, and resulted in staff requesting to switch back to the video model condition).

Dependent Variables and Data Collection

The primary dependent variables were step count and calories burned per workout session. During baseline and all subsequent conditions, the six primary student participants wore Fitbits® on their wrists. At the beginning of each session, a workout was started on each Fitbit®. At the end of each session, the researcher ended that workout session on the Fitbit® and recorded the number of steps taken and number of calories burned that appeared on the face (screen) of the Fitbit®. The workout feature on the Fitbit® was utilized to ensure only steps taken and calories burned during that session were counted, excluding data that were previously recorded by the device.

The secondary dependent variable was the percentage of students imitating the model for 100% of the workout session. Behavioral skills training was used to teach the previously described staff members to observe and measure participant engagement during each session. Two or three staff members simultaneously recorded data for each session; the individuals who served as observers depended on availability, which was largely determined by other required activities.

During observations, each staff member scanned all student participants, looking for students who stopped imitating the model for 10 consecutive seconds or more. Observers viewed the students from different vantage points and recorded data independently. If a student was observed to not be imitating the model, the observer started a timer. If the timer reached 10 seconds, the observer made a check on a data sheet beside the student’s name, indicating that the student did not earn a token (i.e., sticker) for that session. If the student resumed imitation within nine seconds, nothing was recorded. In either case, scanning of all students then resumed. Recording data in this manner allowed participants to take brief (less than 10-s) breaks, which school staff and researchers deemed appropriate, but provided a meaningful measure of general level of engagement. A student was considered to not be continuously participating in any session where one or more check marks were recorded.
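The recording rule can be expressed as a simple decision over the lapses an observer timed for one student in one session. The sketch below is illustrative only: it assumes each timed lapse is available as a duration in seconds, which is a hypothetical data format (the observers recorded check marks, not durations). The 10-s criterion is taken from the text.

```python
LAPSE_CRITERION_S = 10  # seconds of continuous non-imitation that produce a check mark


def earns_token(lapse_durations_s):
    """Return True if no timed lapse reached the 10-s criterion.

    `lapse_durations_s` is a hypothetical list of the lengths, in seconds,
    of each period during which the student was seen not imitating the model.
    """
    return all(duration < LAPSE_CRITERION_S for duration in lapse_durations_s)


print(earns_token([]))        # no lapses observed
print(earns_token([4, 9]))    # brief breaks only, all under 10 s
print(earns_token([4, 12]))   # one lapse reached the criterion
```

Because a single lapse of 10 s or more removes eligibility, the measure is all-or-nothing per session rather than a proportion of time engaged.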

It is possible that two or more students stopped exercising at the same time. If all observers focused on the same student, the other student(s) could have failed to meet the criterion for imitating the model for 100% of the session without this being detected. For this to occur, each of the observers would have had to simultaneously focus on the same student. Moreover, when observations of the other nonresponding students resumed, they would have had to begin exercising soon enough for their inactivity to not be detected, or for it to be detected and the 10-s criterion to not be met. Nonsystematic observations indicated that students who stopped exercising typically did so for a minute or more, which minimized the likelihood that their inactivity was not detected. Moreover, having multiple observers collect data from different vantage points reduced the likelihood of observers focusing on the same inactive student.

But these considerations do not rule out the possibility that this measurement system missed instances of inactivity. It is for that reason that we describe it as a secondary dependent variable. It is important to recognize, however, that the limitation of the data collection system just described was consistent across conditions and would not increase the apparent effect of the intervention. In fact, because more students failed to meet the criterion for imitating the model for 100% of the session in the baseline condition than in the other conditions, inaccurate data collection through the mechanism just described would be most likely to occur in the baseline condition. This would falsely elevate the percentage of students who imitated the model throughout the session during baseline, an error that would reduce the magnitude of the treatment effect.

Intervention

Baseline

Prior to introduction of the intervention, students in each classroom engaged in a 20-minute movement session each day. At the beginning of each session, staff informed students that it was time for movement. Staff instructed students to stand up at their desk and follow along with a variety of GoNoodle® (www.gonoodle.com) or YouTube® (www.youtube.com) videos. The activities performed in the GoNoodle® and YouTube® videos ranged from practicing mindfulness to high-intensity aerobic circuits and were selected based on student request. If a student sat down or failed to participate, staff would occasionally vocally prompt the student to follow along with the video. Staff prompting during baseline was not recorded. Other than the vocal prompting and occasional vocal praise for participation, no consequences were provided for student engagement, or lack thereof.

Intervention

The day prior to intervention, students were informed of the token economy. That is, staff explained and demonstrated the token economy, specifying how tokens could be earned, how many tokens were required to earn the backup reinforcers, and the nature of those backup reinforcers. Staff reminded students of the token economy requirements at the beginning of every movement session.

The intervention took place during the same allotted time as the baseline movement sessions (i.e., one 20-minute session per day). Sessions were arranged five days a week. Upon entering their classroom each morning, students were instructed to choose which activity (i.e., dance or strength training) they wished to participate in by placing a personalized popsicle stick in a jar that corresponded to their choice. Popsicle sticks were personalized by student name or photo. Most students made independent choices; however, some required vocal or gestural prompts to make a choice. Vocal prompts involved saying “put your stick in one of the jars” and gestural prompts involved pointing and then, if needed, moving the participant’s hand with the popsicle stick towards the jars. Verbal praise (e.g., saying “nice job making a choice”) was provided as soon as the popsicle stick was placed in a jar.

When it was time for each classroom’s movement session, students were directed to one of two rooms; the room entered depended on the choice they made, with one room dedicated to strength training and one to dancing. As students entered the room, a Fitbit® device was placed on the wrist of each primary student participant. Once all students arrived, the researcher or staff member leading that group stated the following rule: “To earn a sticker, you must participate until the timer goes off.” Immediately after this statement ended, a large visible timer was set for 20 minutes and started, music was turned on, and the modeling began. A playlist of student-requested songs was created prior to intervention and was played (on shuffle) throughout each session. Songs were added to the playlist throughout the study, as students made additional song requests.

At the end of each session, students who were eligible for a sticker received one. Eligibility was dependent on not having a check, indicating at least 10 consecutive seconds of nonparticipation, on the primary observer’s data sheet. Upon returning to their base classroom, students were instructed to place the sticker next to their name on a class sticker chart. At the end of the week, all eligible students (i.e., those who had earned at least 75% of the available stickers for that week) were given the opportunity to choose a prize out of a prize bin. Staff explained to students who were not eligible for prizes why they were not eligible, explained what they needed to do differently to be eligible next week, and encouraged them to keep trying. No other consequences were arranged for them.
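The weekly exchange rule reduces to a threshold on the proportion of available stickers earned. A minimal sketch follows, assuming one sticker is available per session attended; the function name and the handling of a week with no available stickers are hypothetical choices that the text does not specify.

```python
def prize_eligible(stickers_earned, stickers_available, threshold_pct=75):
    """Return True if the student earned at least `threshold_pct` percent
    of the stickers available that week.

    Integer arithmetic avoids floating-point rounding at the boundary.
    Treating a week with no available stickers as ineligible is an
    assumption, not something stated in the text.
    """
    if stickers_available <= 0:
        return False
    return stickers_earned * 100 >= threshold_pct * stickers_available


# With five sessions in a week, four stickers (80%) meets the criterion
# and three stickers (60%) does not.
print(prize_eligible(4, 5))
print(prize_eligible(3, 5))
```

Under this rule a student attending all five weekly sessions could miss the criterion in one session and still exchange stickers for a prize.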

Live Model

Initially, a live model demonstrated each exercise movement in a rotating sequence. One of the researchers and three undergraduate research assistants served as the live models, on a rotating schedule, depending upon availability. All four live models were female and ranged in age from 19 to 24 years. The researchers trained the three undergraduate research assistants to perform each move through behavioral skills training, prior to the first session. Table 2 shows the sequence of dance and strength training exercises. Models were instructed to engage in each exercise for 15–20 seconds at a time, rotating through exercises in a random order that somewhat coincided with the music being played. The initial live model condition lasted three weeks.

Table 2 Sequence of exercises

Originally, we intended to fade staff members in as the live models, without any notion of introducing a video model. However, as researchers began training the staff members on the different exercise movements, the staff members quickly expressed their displeasure with this plan. That is, they were surprised by how much energy was required to model the activities, and how sweaty doing so made them. Although they were enthusiastic about being models when the study was planned, they unanimously asked to be excused from this activity. Their request was, of course, granted and we quickly switched to a video model. No sessions included a staff member as the live model.

Video Model

The video models were the same people as the live models, but their performances were recorded (YAP Fitness, 2020) and projected on a screen visible to participants. That is, each video entailed one of the live models, rotating through the aforementioned exercise movements, for the designated duration of the given song. Staff members played the different videos in irregular order during each 20-minute session, did not initiate interactions with participants, and minimized interaction when they were initiated by participants, which was rare.

To allow time for the researchers to create these video models, participants were returned to the baseline condition for one week, following the three-week live model condition. Treatment was then resumed, with a video model substituted for the live model. The video modeling condition was in effect for six weeks, after which the live model condition was reinstated (two weeks), followed by the video model (three weeks), and baseline (one week) conditions.

Maintenance

Maintenance conditions were identical to the video model condition, except researchers and undergraduate research assistants were not present and/or available for consultation. We obtained maintenance data for a single session, conducted one week after researcher assistance ended. Only two primary participants (i.e., Participant 1 and Participant 4) were present for that session.

Interobserver Agreement (IOA)

Two people independently observed 42 of the 210 total sessions (20%), with at least one observation in each classroom under each condition. That is, one of the researchers independently scanned student participation, just as the staff members did, indicating any students who stopped imitating for 10 seconds or more. If both datasets matched, the session was scored as an agreement. If the datasets did not match, it was scored as a disagreement. The total number of agreements was divided by the total number of occasions and the resultant value was multiplied by 100 to yield a percentage. Overall, the two observers’ datasets matched on 90% (38 of 42) of occasions. IOA was not calculated for steps taken or calories burned, because these values were simply recorded from the Fitbit® devices.
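The agreement calculation can be reproduced from the counts reported above. The sketch below is illustrative: the function names are hypothetical, and representing each observer's data sheet as the set of students who received a check mark is an assumption about the data format. The counts (38 agreements, 42 observed sessions, 210 total sessions) are taken from the text.

```python
def session_agrees(checks_observer_a, checks_observer_b):
    """A session counts as an agreement only if both observers marked
    exactly the same students as having stopped imitating."""
    return set(checks_observer_a) == set(checks_observer_b)


def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole


ioa = percent(38, 42)        # agreements / observed sessions
coverage = percent(42, 210)  # observed sessions / total sessions

print(f"IOA: {ioa:.1f}%")
print(f"Sessions with a second observer: {coverage:.1f}%")
```

Scoring agreement at the level of the whole session is a strict criterion: a single student marked by one observer and not the other makes the entire session a disagreement.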

Intervention Integrity

A 10-item intervention integrity checklist was developed that described the essential components of the intervention (Appendix). A researcher observed 31 of the 210 sessions (15%) and completed the checklist for each of them. In all cases, the intervention integrity was 100%. Intervention integrity was calculated by dividing the total number of checklist items that were completed correctly by the total number of checklist items observed and multiplying by 100.
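The integrity score pools checklist items across all observed sessions before dividing. A minimal sketch follows, assuming each observed session yields a list of ten true/false item scores; that representation and the function name are hypothetical, while the 10 items, the 31 sessions, and the 100% result come from the text.

```python
def integrity_percent(checklists):
    """Percentage of checklist items completed correctly, pooled across sessions.

    `checklists` is a hypothetical list with one entry per observed session;
    each entry is a list of booleans, one per checklist item observed.
    """
    items_observed = sum(len(session) for session in checklists)
    if items_observed == 0:
        raise ValueError("no checklist items were observed")
    items_correct = sum(sum(session) for session in checklists)
    return 100 * items_correct / items_observed


# 31 observed sessions, 10 items each, every item completed correctly.
observed_sessions = [[True] * 10 for _ in range(31)]
print(integrity_percent(observed_sessions))
```

Pooling items in this way weights every checklist item equally, so a session in which fewer items could be observed contributes proportionally less to the overall score.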

Results

Figures 1 and 2 show for each of the six primary participants the number of steps taken and the number of calories burned each session under all conditions, respectively. The participants differed substantially in the activity they chose. Participants 1, 2, 3, 4, 5, and 6 chose dance on 92%, 61%, 100%, 66%, 100%, and 98% of sessions, respectively.

Fig. 1

Number of steps taken per session by primary participants. This figure depicts the number of steps taken each session, by each of the six primary participants, under all conditions. Baseline, live model, video model, and maintenance conditions are labeled as BL, LM, VM, and Maint., respectively

Fig. 2

Number of calories burned per session, by primary participants. This figure depicts the number of calories burned each session, by each of the six primary participants, under all conditions. Baseline, live model, video model, and maintenance conditions are labeled as BL, LM, VM, and Maint., respectively

All primary participants responded favorably to both the live and video models. Across all baseline conditions, Participant 1 took an average of 538 (range: 130-1,183) steps per session. Under the treatment condition with a live model, his average number of steps increased to 1,631 (435-2,136) per session. With the video model, his average number of steps was 1,563 (714-2,152) per session. As for calories burned, under baseline conditions, he burned an average of 70 (25-134) calories during each session. With the live model, he burned an average of 155 (103-201) calories per session. With the video model, he burned an average of 138 (95-175) calories per session. Participant 2 took an average of 500 (41-1,172) steps under baseline conditions, 1,038 (277-1,917) steps with the live model, and 988 (468-1,595) steps with the video model. He burned an average of 107 (64-177) calories per session during baseline, 171 (105-215) calories with the live model, and 163 (109-257) calories with the video model. Participant 3 took an average of 1,070 (224-1,645) steps during baseline, 1,679 (941-2,287) steps with the live model, and 1,305 (629-1,865) steps with the video model. She burned an average of 102 (35-154) calories per session during baseline, 154 (128-229) calories with the live model, and 126 (76-166) calories with the video model. Participant 4 took an average of 751 (122-1,355) steps during baseline, 1,318 (680-2,157) steps with the live model, and 1,384 (839-2,259) steps with the video model. He burned an average of 108 (33-173) calories per session during baseline, 152 (90-199) calories with the live model, and 143 (101-212) calories with the video model. Participant 5 took an average of 813 (185-1,425) steps during baseline, 1,351 (813-1,999) steps with the live model, and 1,214 (751-1,686) steps with the video model. She burned an average of 96 (33-156) calories per session during baseline, 125 (91-172) calories with the live model, and 136 (99-185) calories with the video model. Participant 6 took an average of 571 (315-1,087) steps during baseline, 1,296 (590-1,693) steps with the live model, and 1,219 (512-1,673) steps with the video model. She burned an average of 86 (65-131) calories per session during baseline, 134 (88-194) calories with the live model, and 128 (83-178) calories with the video model.

For all participants, the number of steps taken typically was higher during dance sessions than during strength training sessions, but the number of calories burned was similar in both conditions. The two participants (i.e., Participant 1 and Participant 4) for whom we were able to collect maintenance data showed good retention of treatment effects. That is, the number of steps taken and calories burned during the single maintenance session resembled their prior performance under the video model condition.

Figure 3 shows the percentage of students in each classroom who consistently exercised, and earned stickers, under all conditions. That is, these data represent both primary and secondary participants. Each datum represents the percentage of students who were eligible for a sticker at the end of that session. For Classroom 1, average participation during baseline was 53% (range: 10-89%). During treatment with the live model, it was 92% (75-100%). During treatment with the video model, it was 90% (65-100%). For Classroom 2, average participation during baseline, the live model condition, and the video model condition was 22% (0-58%), 90% (80-100%), and 94% (63-100%), respectively. Finally, for Classroom 3, average participation during baseline, the live model condition, and the video model condition was 66% (36-94%), 94% (74-100%), and 98% (80-100%), respectively.

Fig. 3

This figure depicts the percentage of students that earned a sticker each session, by classroom, under all conditions. Baseline, live model, video model, and maintenance conditions are labeled as BL, LM, VM, and Maint., respectively

Discussion

We attempted to answer a simple research question: Will an intervention package comprising modeling, choice, and reinforcement reliably increase steps taken and calories burned by young adults with ID? Our findings indicate that the answer is yes. All six primary participants responded favorably to the classroom-wide treatment package, with both the live and video models. Moreover, overall student participation in exercise was substantially higher in all three special education classrooms when the package was in effect than during baseline sessions. That is, more students participated, and did so consistently, when the treatment package was in effect than under baseline conditions.

Despite the many benefits of regular exercise (Mayo Clinic, 2019; WHO, 2018), few individuals with ID meet the WHO’s recommendation of at least 150 minutes of moderate-intensity aerobic exercise per week (Department of Health and Human Services, 2017; Oliviedo et al., 2017; Ptomey et al., 2017). Even fewer engage in muscle-strengthening activities, which the WHO (2018) also recommends on two or more days per week. We allowed students to engage in both strength-training and aerobic activities, which is a strength of our study, but we did not determine whether participants increased their strength or aerobic capacity, which is a limitation. A replication of the present study that incorporated such measures would be of value.

We also arranged a classroom-wide intervention, as opposed to intervening with students on a one-by-one basis, which carried some advantages. For example, classroom-wide interventions have the potential to benefit substantially more students with far fewer resources (e.g., staff and time). The intervention package we evaluated was not difficult to implement with the involvement of the researchers, but the one involving a video model was substantially easier to use. In fact, it was so easy to use that school staff appeared to be able to use it independently of the researchers and wanted to do so. We were beginning to evaluate the effectiveness of the program when implemented independently by school staff, with promising initial results, but COVID-19 ended classroom activities before we could do an adequate evaluation.

The video modeling intervention resembled in some important ways what staff were doing during baseline, which increased ease of use. During baseline, the teachers would simply tell their students it was time for movement, and they would play a variety of music videos on GoNoodle® or YouTube®. The students appeared to enjoy watching and listening to those videos, but the exercise movements within them were often too complicated for students to imitate. This left most of them standing stationary for much of the designated movement time. The movements modeled in our interventions were selected to be appropriate for the students and were performed at a pace that allowed the students to keep up, which appeared to contribute to the success of the intervention. In retrospect, this may have been a fourth, and unintentional, component of our intervention package. A comparison study that evaluates the physical activity levels of people with an ID when they watch GoNoodle®, YouTube®, and our videos would be a valuable extension of the present study. The videos that we developed are available free of charge on-line (YAP Fitness, 2020).

Our intention was to have school staff use those videos during the remainder of the school year and to collect maintenance data over this period, but we were unable to do so because in-person classes ended prematurely. Our inability to collect more than minimal maintenance data is a limitation of our study.

So, too, is our failure to collect systematic social validity data. Our plan was to collect such data from school staff and student participants near the end of the school year, and we had selected instruments for collecting them, but the COVID-19 pandemic ended our interactions with staff and students before the instruments could be administered. Informal observations suggested, however, that both the live model and video model interventions were well received, with the latter favored by staff. The principal at the school where the study was conducted has asked for our assistance in continuing and expanding the video modeling exercise program when it can be done safely, which is meaningful evidence of its acceptability.

A third limitation of our study is the limited staggering of baseline and intervention conditions across participants. Put simply, participants’ performance influenced, but did not entirely determine, when we changed conditions, which weakened our ability to demonstrate experimental control. Moreover, our initial baseline condition was in effect for a short period, during which the responding of some individual participants was not stable. Condition changes were made in this way because the project was undertaken at the behest of school staff. That is, staff wanted an intervention to be put in place as soon as possible. They also developed a distaste for the live model condition due to the amount of physical effort required of them. These factors led to premature condition changes that sacrificed scientific rigor in the interest of social validity.

Another limitation of our study is that we did not demonstrate that either dancing or strength training affected participants’ fitness or health. We attempted to make the two activities roughly equivalent in terms of the effort required for full participation but may not have succeeded. By nature, dancing leads to more steps being taken than does strength training. This difference is evident in the performance of Participants 2 and 4, as depicted in Fig. 1. Figure 2 shows, however, that they burned a similar number of calories when performing the two activities. Nonetheless, dancing was the preferred activity, although four of the six primary participants at least occasionally engaged in both options.

A final limitation is that our study concluded, prematurely, under baseline conditions due to the COVID-19 pandemic, which is unfortunate from an ethical perspective. Our intent was to examine, during the last two months of the school year, the effectiveness of the procedure when implemented without the assistance of the researchers. Unfortunately, the COVID-19 pandemic ended in-person classes soon after the maintenance phase began, resulting in very limited maintenance data. In recent months, many students have returned to in-person instruction at the school where the study occurred, and the principal has encouraged the staff members who participated in the study to continue implementation. She has also acknowledged, however, that there are higher priorities (e.g., preventing COVID-19 outbreaks and arranging on-line instruction). We hope and plan to help school staff reintroduce and expand the exercise program during the 2021–2022 school year.

We did not measure the intensity of exercise, but the WHO (2018) lists dance as an example of a moderate-intensity activity. The WHO also recommends that adults engage in at least 150 minutes of moderate-intensity activity each week, and a student in our study who danced for 20 minutes each school day (100 minutes across a five-day school week) would be two-thirds of the way to meeting this criterion. A reasonable goal for future research is to extend sessions to 30 minutes or more. The downside of longer sessions is opportunity cost: exercise time is time away from other, educational, activities. Another downside is perspiration. Participants in our study were often sweaty when sessions ended, which was less than ideal for staff and other students who had to work in close proximity to these students. When the exercise program is reimplemented, we intend to make t-shirts with the school logo available as prizes for exercising students, and to encourage students to wear those shirts during exercise sessions, with facilities available for changing and toweling off.

The classroom-wide intervention worked well in the present study. Both the live model and the video model resulted in (a) increased physical activity by all six primary participants, and (b) increased participation by all involved students. Nonetheless, it is important to note that the conditions under which the intervention was implemented were favorable. No student in any classroom exhibited significant challenging behavior or substantial noncompliance. Moreover, the head teachers and other staff in each class were highly competent, well trained, and committed to the project, as were the research assistants. It would be of interest to determine whether similar interventions are effective with more challenging students and less committed staff.