Tying the Delivery of Activity Step Instructions to Step Performance: Evaluating a Basic Technology System with People with Special Needs

This study assessed a new technology system that automatically presented instructions for multistep activities to people with intellectual and sensory disabilities. The instructions were presented one at a time, and tied to the participants’ performance of the activity steps. That is, a new instruction occurred only after the participant had carried out the previous step. The new system involved a Samsung Galaxy A10 with Android 10.0 operating system equipped with Amazon Alexa, MacroDroid, and Philips Hue applications and a Philips Hue indoor motion sensor. The assessment of the new system was carried out with seven participants who were exposed to two or three pairs of activities. They performed one activity of each pair with the new system and the other with a system presenting instructions at preset time intervals according to an alternating treatments design that included a cross-over phase. The mean percentage of correct responses tended to be higher with the new system than with the control system. Paired t-tests carried out to compare the sessions with the new system with the sessions with the control system of each participant showed that the differences in correct responses between the two sets of sessions were statistically significant for all participants. The new system may represent a useful (advantageous) tool for supporting people like those involved in this study in the performance of multistep activities.

difficulties in remembering some of those steps or the correct sequence of those steps, with the consequence that their overall performance tends to remain erroneous and in need of help (Cullen et al., 2017a, b; Desideri et al., 2020; Lancioni et al., 2000; Lin et al., 2018).
To improve the situation of people with difficulties in performing multistep activities, technology-based programs have been developed over the years (Heider et al., 2019; Ivey et al., 2015; Lancioni et al., 2020; Wu et al., 2016). These programs are essentially aimed at ensuring that people can access instructions for the different steps of the activities scheduled for them (Desideri et al., 2020; Mechling, 2011; Mechling et al., 2010, 2013). Using specific step instructions, in fact, is viewed as an effective strategy to ensure that people carry out the different steps of the different activities programmed in an orderly and correct fashion (Kagohara et al., 2013; Savage & Taber-Doughty, 2017). While all programs have the same general aim (i.e., providing the participants with step instructions) and rely on the use of technology devices (e.g., computers and tablets), the way they are arranged to work differs. One group of programs requires that the participants operate the device to access the instructions (i.e., self-prompting programs; Cullen et al., 2017a; Desideri et al., 2020; Pérez-Fuster et al., 2019). For example, the participants touch the tablet to move to the next instruction, carry out the activity step corresponding to that instruction, and then repeat the process (i.e., seeking the next instruction and performing the related step) until the activity is completed (Heider et al., 2019; Lancioni et al., 2000; Randall et al., 2020; Savage & Taber-Doughty, 2017; Shepley et al., 2018a, b).
Another group of programs is arranged to present the instructions to the participants automatically (i.e., it does not require the participants to operate the technology to access the instructions). Typically, the instructions are delivered by the technology (i.e., one at a time) at preset time intervals (Lancioni et al., 2014, 2015, 2016, 2018). The intervals are arranged by staff or caregivers based on the participants' performance skills and speed. These programs are considered less demanding than the previous ones, as the participants do not have to remember to operate the technology system prior to each activity step and can avoid possible errors in operating the system (Desideri et al., 2020; Lancioni et al., 2011, 2014, 2015). One open question regarding the latter programs concerns the accuracy of the intervals preset by staff or caregivers. To determine the most functional intervals, staff and caregivers may observe the participants' activity engagement within their regular contexts so as to verify the time they need to carry out activity steps. Staff and caregivers may also extend the intervals somewhat beyond the time considered necessary for the execution of the steps, to limit the risk that a new step instruction will occur too soon (i.e., while the participants are still busy carrying out the previous step). Although the aforementioned cautionary measures may be rather successful (Lancioni et al., 2016, 2017, 2020), conditions may occasionally arise that interfere with the instruction process and outcome. Those conditions could involve occasional slowdowns or accelerations in the participants' performance, causing the participants to (a) miss (fail to respond to) some instructions in the sequence or (b) wait frustratingly for the next instructions and get distracted, with an increased chance of errors.
To avoid the aforementioned risky conditions, one needs a technology system that regulates the intervals between instructions based on the participants' performance. A few attempts to develop such a technology system have been reported in the literature (Lancioni et al., 2011; Lin et al., 2018; Mihailidis et al., 2016; O'Neill et al., 2018). However, the limited evaluation of the system models put together and their overall complexity make their adoption in daily contexts unlikely. A fairly simple and practical technology system that one might envisage could involve a basic instruction device (i.e., a smartphone or tablet) set up to work in combination with a motion/optic sensor available in the area where the activity is to be performed (e.g., a table where the objects collected are to be arranged). The sensor would detect the arrival of the participants at the activity place, and the sensor's activation would be recognized by the instruction device via commercial applications installed in it. This recognition would then cause the device to present the next instruction of the sequence.
The objective of the present study was to set up such a new technology system (i.e., a smartphone working in combination with a motion sensor) and evaluate it with seven participants with intellectual disabilities and sensory impairments. During the assessment, each participant was exposed to the new system as well as to a conventional/control system including only a smartphone, in which the intervals between instructions were preset (i.e., arranged prior to the sessions) (Lancioni et al., 2017). Table 1 identifies the seven participants by pseudonyms and reports their chronological age, their visual and auditory conditions, the types of instructions (verbal or pictorial) they used, and the age equivalents for their daily living skills on the second edition of the Vineland Adaptive Behavior Scales (Balboni et al., 2016; Sparrow et al., 2005). The participants' chronological age ranged from 21 to 62 years. Four of the participants (i.e., Louise, Jack, Daisy, and Roland) relied totally or mainly on verbal instructions (i.e., could respond to simple sentences concerning objects and actions). The other three participants (i.e., Kate, Todd, and Kenny) relied mainly or totally on pictorial instructions (i.e., could easily see and respond to images/photos of daily objects shown on a tablet's or smartphone's screen). The Vineland age equivalents for daily living skills (personal domain) ranged from 3 years and 10 months (Kenny) to 5 years and 3 months (Kate). All participants attended rehabilitation and care centers for people with intellectual and sensory disabilities. The psychological records of the centers indicated that their level of functioning was within the moderate intellectual disability range, but no specific tests had been applied and no IQ scores were available.

Participants
The participants were included in the study on the basis of the following criteria. First, they could follow verbal or pictorial instructions concerning objects to gather and transport/use (e.g., collecting and sorting objects). Second, they (a) were used to following such instructions delivered by different devices (e.g., computers and smartphones) or under staff supervision and (b) seemed willing (i.e., by expressing verbal approval or smiling) to be involved in activity situations in which they received instructions to collect and put away large series of objects. Third, regular staff were in favor of the study (whose purpose and technology components had been presented to them in advance), as they thought that an improvement in the instruction system could have clearly positive implications for the participants' activity engagement within the daily context.
Although the participants had shown willingness to be involved in activity situations such as those used in this study (see above), none of them could read and sign a consent form. Thus, their legal representatives were asked to do that on their behalf. The study complied with the 1964 Helsinki Declaration and its later amendments and was approved by an institutional Ethics Committee.

Procedures
Setting, Activities, Sessions, and Research Assistants

Quiet rooms of the centers that the participants attended served as the setting for the study sessions. Each activity consisted of collecting and arranging on a desk 28-34 objects (e.g., 30 objects concerning food and kitchen items or 30 objects concerning cleaning items). The objects/items were familiar to the participants and were located in areas of the room known to them. For each object, the participants received an instruction, that is, a simple verbal phrase or the object's photo on the smartphone's screen (see below), depending on the types of instructions the participants normally used in their daily context. In response to the instruction, the participants were to collect the related object and put it away on a specific desk. The sessions consisted of the time periods required for the participants to respond to the series of instructions. Sessions were conducted with the new technology system (involving a smartphone and an optic sensor to deliver instructions tied to participants' performance) and with a conventional technology system (involving a smartphone delivering verbal or pictorial instructions at preset time intervals). The duration of the former sessions was determined by the participant's performance. In fact, a new instruction was delivered as soon as the participant had responded to the previous one. The duration of the latter sessions was determined by the length of the intervals separating the instructions (see below). Research assistants, who were responsible for setting up the sessions and recording the data (see below), had experience with the use of technology-aided interventions for people with intellectual and other disabilities.

New Technology System
The new technology system involved (a) a Samsung Galaxy A10 with Android 10.0 operating system that was equipped with Amazon Alexa, MacroDroid, and Philips Hue applications; (b) a Philips Hue indoor motion sensor; (c) a Philips Hue Bridge and Philips Hue smart Bulb working via Bluetooth; and (d) a 4G LTE Wi-Fi router. The Philips Hue Bridge, Philips Hue smart Bulb, and Philips Hue application served to set up the Philips Hue sensor. Any activation of the sensor by the arrival/presence of a participant was detected by the Amazon Alexa application. This application transmitted the arrival/presence message to the smartphone. This message was then used by MacroDroid to present the next verbal or pictorial instruction of the sequence (i.e., an instruction for an activity step) to the participant.
Verbal and pictorial instructions consisted of brief sentences (e.g., take the bottle from the cupboard) and photos of the objects the participants were to gather and transport to a desk. The verbal instructions were written in MacroDroid, while the pictorial instructions were in a folder of the smartphone directly accessed by MacroDroid. The Philips Hue sensor was a box-like device measuring 5.5 cm per side and 3.5 cm in height. It was placed on the floor just before the desk on which the participants were to arrange the objects they had collected in different areas of the room (or rooms). Participants carried the smartphone with them throughout the sessions. Each pictorial instruction was visible on the smartphone's screen until the participant reached the desk (i.e., until the participant's presence was detected by the sensor). After the participant had reached the desk with the last object of the sequence (i.e., had responded to the last instruction), the smartphone presented verbally, or verbally and visually (i.e., through familiar approval/cheer images), a praise message and then delivered about 3 min of pleasant stimulation. This stimulation typically consisted of one of the participants' preferred songs or videos (i.e., songs and videos selected prior to the study via stimulus preference screening; Lancioni et al., 2018, 2020).
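The performance-tied delivery logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual MacroDroid configuration: the class and message names are hypothetical, and each sensor activation is assumed to release the next instruction in the sequence.

```python
from collections import deque

class InstructionDelivery:
    """Hypothetical sketch of the performance-tied logic: a new step
    instruction is released only when the motion sensor reports that the
    participant has reached the desk, i.e., has completed the previous step."""

    def __init__(self, instructions):
        self.pending = deque(instructions)

    def on_sensor_activation(self):
        # In the real system, the Philips Hue sensor's activation reaches the
        # smartphone via the Amazon Alexa application and triggers a
        # MacroDroid macro; here the same event simply yields the next
        # instruction.
        if self.pending:
            return self.pending.popleft()
        # After the participant has responded to the last instruction, the
        # smartphone presents praise and about 3 min of preferred stimulation.
        return "PRAISE_AND_PREFERRED_STIMULATION"

delivery = InstructionDelivery(["take the bottle from the cupboard",
                                "take the cup from the shelf"])
step_1 = delivery.on_sensor_activation()  # first instruction of the sequence
```

Because the next instruction is gated on the sensor event rather than a timer, the inter-instruction intervals automatically track the participant's own pace.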

Conventional Technology System
This system only included the Samsung Galaxy A10 with Android 10.0 operating system equipped with MacroDroid. Verbal and pictorial instructions were the same as those used with the previous system. However, they were set to occur at specific time intervals. The intervals were set up by research assistants based on preliminary observations of the participants or on the participants' performance with the new system (see below). Every pictorial instruction was visible on the smartphone's screen until 5 s before the appearance of the next instruction of the sequence. After a preset interval had elapsed from the last instruction, the smartphone delivered the same types of praise and preferred stimulation as described for the new technology system.
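The fixed-interval timing of the conventional system can be made concrete with a short sketch. The function name is hypothetical and the event list is only an illustration of the rules stated above: instruction i appears at i times the preset interval, and a pictorial instruction is hidden 5 s before the next one appears.

```python
def preset_schedule(num_instructions, interval_s, hide_before_s=5):
    """Hypothetical timing sketch for the conventional system: returns
    (event, instruction_index, time_in_seconds) tuples. Each instruction is
    shown at a fixed multiple of the preset interval, regardless of the
    participant's progress; pictorial instructions are hidden 5 s before
    the next instruction appears."""
    events = []
    for i in range(num_instructions):
        show_at = i * interval_s
        events.append(("show", i, show_at))
        if i < num_instructions - 1:
            events.append(("hide", i, show_at + interval_s - hide_before_s))
    return events

# e.g., 3 instructions at 20-s intervals:
# instruction 0 shown at 0 s and hidden at 15 s, instruction 1 shown at 20 s, ...
schedule = preset_schedule(3, 20)
```

The contrast with the new system is that these times are computed before the session starts, so a participant who is faster or slower than expected must wait for, or may miss, the next instruction.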

Experimental Conditions
The assessment of the new technology system was carried out by exposing each participant to two or three pairs of activities. For each pair, one activity was carried out with the new system while the other was carried out with the conventional system according to an alternating treatments design (Barlow et al., 2009). After 10 to 15 sessions on each of the activities (i.e., first phase of the intervention), a cross-over phase was implemented. That is, the new system was used for the activity previously exposed to the conventional system and vice versa. The cross-over phase involved a number of sessions identical or similar to that used for the first phase of the intervention on the same activities and occurred according to the same design (i.e., alternating treatments design). Sessions were typically videotaped and viewed by a study coordinator who supervised the work of the research assistants to ensure procedural fidelity (Sanetti & Collier-Meek, 2014).

Intervention Sessions with the New System
In the sessions with the new system, the intervals between instructions were regulated by the smartphone based on the activations of the Philips sensor and the inputs of the Amazon Alexa and MacroDroid applications (see "New Technology System"). A new instruction was presented only after the sensor had detected the participant reaching the desk on which the object transported was to be arranged. The length of the intervals could change between different instructions and across sessions based on the participants' performance speed.

Intervention Sessions with the Conventional System
In the sessions with the conventional system, the intervals between instructions were set up by the research assistants. During the first intervention phase on each pair of activities, the length of the intervals used for the activity carried out with the conventional system was based on preliminary observations of the participants' performance (i.e., observations of the time the participants needed for responses such as those involved in that activity). The observations included a minimum of 15 responses and the time interval chosen for use in the sessions was that within which at least 75% of those responses were completed. During the cross-over phase, the intervals between instructions were set up in such a way that the total time for the activity would be only slightly longer than the mean total time required for the same activity during the first intervention phase (i.e., when the activity was carried out with the new system).

Measures
The measures involved (a) the number of correct responses (i.e., objects collected and put away correctly in relation to the corresponding instructions) and (b) the time required for the sessions. The second measure was automatically recorded by the smartphone (i.e., the smartphone recorded the time elapsed from the delivery of the first instruction to the delivery of the last instruction programmed for the session). Research assistants in charge of implementing the sessions recorded the first measure (i.e., correct responses). Interrater agreement on this measure was checked in more than 20% of the sessions of each participant by having a reliability observer join the research assistant in recording the data. The percentage of agreement was computed for each session by dividing the number of instructions for which the research assistant and the reliability observer recorded the same (correct or incorrect) response by the total number of instructions available in the session and multiplying by 100. The percentages for the single participants were in the 90-100 range, with means exceeding 97%.
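The agreement computation described above is a simple proportion; a sketch follows, with a hypothetical function name and illustrative data.

```python
def percentage_agreement(recorder_a, recorder_b):
    """Session-level interrater agreement as described in the text: the
    number of instructions scored identically (correct/incorrect) by the
    research assistant and the reliability observer, divided by the total
    number of instructions in the session, multiplied by 100."""
    assert len(recorder_a) == len(recorder_b), "records must cover the same session"
    same = sum(a == b for a, b in zip(recorder_a, recorder_b))
    return 100 * same / len(recorder_a)

# e.g., a 30-instruction session in which the observers disagree on one response
assistant = [True] * 30
observer = [True] * 29 + [False]
agreement = percentage_agreement(assistant, observer)  # about 96.7
```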

Data Analyses
The data for the different pairs of activities available to each participant were summarized in graphic and table format. The difference between the two instruction systems was analyzed through paired t-tests. These tests allowed for the comparison of the numbers of correct responses obtained with the two systems over all the sessions available (i.e., the sessions for all the pairs of activities) for each participant individually (Siegel & Castellan, 1988).

Results
The panels of Fig. 1 summarize the data of the four participants who received verbal instructions (i.e., Louise, Jack, Daisy, and Roland). The panels of Fig. 2 summarize the data of the three participants who received pictorial instructions (i.e., Kate, Todd, and Kenny). The data concern the participants' correct responses with each of the two systems across all pairs of activities available. The data collected on each pair are reported separately for the first phase of the intervention and the cross-over phase. The black squares and open circles represent mean percentages of correct responses over blocks of sessions with the new system and the conventional system, respectively. The blocks include three sessions except when an arrow is present. Blocks with an arrow include two sessions.

Fig. 1 The panels summarize the data of Louise, Jack, Daisy, and Roland. The data concern the participants' performance with each of the two systems across the two or three pairs of activities available to them. For each pair of activities, the data of the first intervention phase and of the cross-over phase are reported separately. The black squares and open circles represent mean percentages of correct responses over blocks of sessions with the new system and the conventional system, respectively. The blocks include three sessions except when an arrow is present. Blocks with an arrow include two sessions.
As observed in the graphs for the single participants, the mean percentage of correct responses, which was always above 90, tended to be higher with the new technology system than with the conventional system irrespective of the type of instructions (i.e., verbal or pictorial) used. Paired t-tests carried out to compare all the sessions with the new system with all the sessions with the conventional system of each participant showed that the differences in correct responses between the two sets of sessions were statistically significant (p < 0.01; with t values ranging from 3.80 to 11.67) for all participants.
The mean time lengths of the intervention sessions carried out with the two systems for each pair of activities are reported in Table 2. As can be seen in the table, the mean session lengths ranged from 9.4 to 27.3 min and were generally somewhat higher for the sessions conducted with the conventional system. No statistical analysis was carried out to determine the significance of the length differences, as they may be considered of small practical/clinical relevance. However, those differences allow one to argue that the higher levels of correct responding observed during sessions with the new technology system were not due to longer (more convenient) between-step time intervals being available with that system.

Fig. 2 The panels summarize the data of Kate, Todd, and Kenny. The data are plotted as in Fig. 1

Discussion
The results of this study show that (a) the conventional system promoted reasonable levels of correct responding (i.e., comparable to those observed in previous studies using similar systems; Lancioni et al., 2015, 2017, 2018) and (b) the new system was more effective (i.e., ensured higher levels of correct responding) than the conventional system. The increased effectiveness of the new system may have potentially relevant implications for programs directed at people with moderate intellectual disabilities with or without sensory impairments (Desideri et al., 2020; Mihailidis et al., 2016). In light of the above, a number of considerations may be in order.
First, improving the efficacy of automatically delivered step instructions, and thus improving the participants' performance of multistep activities, can be considered a highly valued objective within any rehabilitation context (Desmond et al., 2018; Lancioni et al., 2017; Lin et al., 2018; O'Neill et al., 2018; Pérez-Fuster et al., 2019). The possibility of approaching such an objective with relatively simple and accessible technology makes the objective realistic for a number of daily contexts, even when their technical expertise and financial resources are fairly limited (Borg, 2019; De Witte et al., 2018; Scherer, 2019).
Second, a technology system that ties instruction delivery to participants' responding may be functional not only to increase the participants' level of correct responding but also to improve the quality (i.e., comfortableness) of their activity engagement (Brown et al., 2013; Kocman & Weber, 2018). In fact, the participants would not have to adjust to externally controlled instruction conditions/timing and could possibly avoid experiences of (a) anxiety about missing instructions or (b) frustration caused by the need to wait for new instructions following the completion of previous steps (Desideri et al., 2020; Lancioni et al., 2011, 2017, 2021; Lin et al., 2018; Mihailidis et al., 2016).
Third, the cost of the new system as used in the present study was less than $400: about $170 for the Samsung smartphone, about $50 for the Philips Hue sensor, and about $150 for the Philips Hue Bridge, the Philips Hue smart Bulb, and the 4G LTE Wi-Fi router. The relatively affordable cost and the fact that the system is easy for staff to operate and highly friendly to the participants should not hide the fact that the system is not a ready-made (off-the-shelf) tool but needs to be set up. For example, the step instructions are to be written in MacroDroid or stored in the smartphone's memory. The Alexa application needs to be linked to the Philips Hue sensor and to MacroDroid. MacroDroid, finally, has to be arranged so that it can control the delivery of step instructions.
Fourth, the new system assessed in this study involved the use of only one sensor. Yet, more sensors could be employed. For example, with an activity requiring the participants to bring various types of objects to two separate places and sort/assemble them at those places, the system could involve the use of two sensors, one at each of the places. This arrangement would ensure that the participants get instructions at each of the places (i.e., as soon as or shortly after they reach one of such places).

Limitations and Future Research
Three limitations of the study may need to be underlined. First, the number of participants involved in the study is relatively small and thus insufficient to make general statements about the potential and applicability of the new technology system. Direct and systematic replication studies will be needed to establish the strength of the system and evaluate possible upgrades of it (Kazdin, 2011; Travers et al., 2016).
Second, no assessment was made as to whether the participants had a preference for the new system (i.e., over the conventional system) although an assumption was made that the new system could improve the quality of the participants' activity engagement in addition to their level of correct responding (see above). Such an assessment could be carried out during or at the end of the intervention phases by allowing the participants to choose one system or the other prior to the sessions (McLay et al., 2017;Tullis et al., 2011). Third, staff were consulted about the new system prior to the start of the study, but they were not interviewed during or after the study. These interviews could have amounted to a social validation of the system and occurred after showing the staff videos of the participants during the sessions with the new system as well as the conventional system (Plackett et al., 2017;Worthen & Luiselli, 2019).
In conclusion, it may be argued that the new system evaluated during the study has the potential to improve the participants' level of correct responding and perhaps to make their activity engagement easier and more satisfactory. General statements about the system, however, appear premature at this time given the aforementioned limitations of the study and the need to address them through new research. Future research may also be focused on upgrading the new system and testing it with various types of activities and participants with different needs and characteristics.
Author Contribution GL was responsible for setting up the study, acquiring and analyzing the data, and writing the manuscript. MO and JS collaborated in setting up the study, analyzing the data, and writing/editing the manuscript. GA, GT, CR, PM, and LD contributed to working out the technological aspects of the study, acquiring and analyzing the data, and editing the manuscript.
Funding Open access funding provided by Università degli Studi di Bari Aldo Moro within the CRUI-CARE Agreement.

Declarations
Ethics Approval Approval for the study was obtained from the Ethics Committee of the Lega F. D'Oro, Osimo, Italy. All procedures performed were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.
Informed Consent Written informed consent for the participants' involvement in the study was obtained from their legal representatives.

Conflict of Interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.