People with extensive neuro-motor impairment and lack of speech are known to have very serious problems engaging in occupational or recreational activities and contacting (communicating with) other persons particularly when these persons are not in the immediate proximity, and typically remain passive and marginalized (Anderlini et al., 2019; Barman et al., 2016; D’Amico et al., 2022; Fried-Oken et al., 2012; Holmqvist et al., 2018; Lancioni et al., 2019). Similar problems may also be experienced by people who present with a combination of moderate neuro-motor impairment, cognitive disabilities and lack of speech (D’Amico et al., 2022; Feenaughty et al., 2018; Kassavetis et al., 2022; Lampe et al., 2018; Lancioni et al., 2015). Generally, all these people are provided with conventional forms of treatment such as physiotherapy, speech therapy and occupational therapy (Hassett et al., 2018; Jeyakumar et al., 2023; Park & Kim, 2017; Sureshkumar & Yogarajan, 2021). These therapies are important to promote the people’s level of engagement and avoid deterioration of their general condition. However, the use of such therapies alone is unlikely to foster rapid improvements that could help people to independently engage in functional activity and communication (Basilakos, 2018; D'Amico et al., 2019; Hassett et al., 2018; MacDonald, 2017; Palmer et al., 2019).

In order to alleviate this situation, one may need to resort to the use of technology solutions that would bridge the gap between the people’s skills level and the level required for independent engagement in basic forms of functional activity and communication (Brunner et al., 2017; Fager et al., 2019; Lancioni et al., 2020; Semprini et al., 2018; Stasolla et al., 2023). Several technology-aided intervention efforts have been reported in this area (D'Amico et al., 2023; Darcy et al., 2017; Lancioni et al., 2023; Smith et al., 2018; Stasolla et al., 2015, 2022). For example, D'Amico et al. (2023) set up an intervention program with three participants who presented with neuro-motor and speech impairment and mild to moderate intellectual disability using a smartphone, five cards with radio-frequency identification tags (discriminated by the smartphone’s Near Field Communication Module), and a mini speaker. The cards represented three different singers, a telephone, and the caregiver. When a participant lowered the smartphone onto a specific singer card, the smartphone verbalized via the mini speaker the titles of three songs of that singer. The participant could choose any of the songs by bringing the smartphone close to the chest (thus activating the smartphone’s proximity sensor) as they heard the song’s title. When a participant lowered the smartphone onto the telephone card, the smartphone verbalized the names of three possible partners that the participant could call or contact through a message. The choice of the partner to call or reach via message was made by bringing the smartphone to the chest as the partner’s name was presented. When a participant lowered the smartphone onto the caregiver card, the caregiver was called and this led the caregiver to have a period of positive interaction with the participant. The results were highly satisfactory with all three participants.

Lancioni et al. (2023) worked with two participants who presented with neuro-motor and speech impairments combined with moderate intellectual disability. The intervention program involved the use of a tablet linked to two pressure sensors. The tablet was equipped with a SIM card and Internet connection and its functioning was regulated via the MacroDroid application. The tablet was set to alternate periods in which the participant could choose between making video calls to preferred partners or watching music videos with periods in which the participant was presented with a brief story and questions about it. If the participant chose to make a video call, the tablet listed the names of several partners. As soon as the participant chose a partner, the tablet started a video call with that partner. If the participant chose to watch a music video, the tablet listed several singers. As soon as the participant chose a singer, the tablet presented a music video of that singer. Choices were made through the sensors. One sensor served for choosing video calls and partners to call; the other for choosing music videos and singers. The sensors also served for selecting the correct answer to each of the questions presented after every story. Both participants used the program successfully.

This study was aimed at extending the evaluation of the intervention approach of Lancioni et al. (2023) with three new participants. Some adaptations of the technology were, however, made to suit the new participants’ physical and cognitive functioning and communication interests. Specifically, touch and optic sensors were used instead of pressure sensors to facilitate responses. Stories followed by related questions were replaced with simple sets of questions for two participants who had problems following stories. The possibility of choosing between making a video call and sending simple messages via emoji symbols was introduced for one participant who was interested in using those messages.

Method

Participants

The participants were two adult males and one adolescent male, identified here with the pseudonyms Adrian, Connor, and Travis. Their ages were 57, 35 and 16 years, respectively. Adrian had a severe form of congenital cerebral palsy, which was subsequently aggravated by subarachnoid hemorrhage. He presented with severe spastic dyskinetic tetraparesis and could only manage a small movement of his left hand. He was fitted with a tracheostomy tube and did not possess any speech but was able to understand a basic conversation carried out by familiar people and to follow brief simple stories about various topics of interest to him. Given his overall condition, he was totally dependent on external support and remained withdrawn and passive when no support was available. He was considered to function between the VII and VIII level of the Rancho Los Amigos Cognitive Scale-Revised (Chen et al., 2022; Stenberg et al., 2015).

Connor had suffered traumatic brain injury causing fronto-temporal subdural hematoma and diffuse axonal injury with bilateral fronto-temporal and splenium of the corpus callosum lesions, 2 years prior to the start of the study. He presented with severe spastic tetraparesis and extensive tendon retraction. He could move his left arm and hand with some difficulty. He was fitted with tracheostomy and gastrostomy tubes and did not possess any speech. He was apparently able to follow familiar people (particularly his wife) talking about daily events and showed positive emotional expressions (e.g., sent kisses to his wife). He could also discriminate basic sentences describing the use of a number of common objects. Given his condition, he was totally dependent on external support. He was considered to function between the V and VI level of the Rancho Los Amigos Cognitive Scale-Revised.

Travis had suffered traumatic brain injury causing multiple cranial fractures, fronto-temporal subdural hematoma and diffuse axonal injury with bilateral fronto-temporal, left occipital and corpus callosum lesions, 7 months prior to the start of the study. He presented with moderate motor impairment (i.e., could use his lower and upper limbs with some uncertainty). He lacked any speech, but understood simple sentences concerning daily events, family members, friends, and singers, and his comprehension seemed to be slowly extending to include more complex sentences. He relied on external support for most of his daily activities. He was considered to function between the VI and VII level of the Rancho Los Amigos Cognitive Scale-Revised.

The participants were selected for the study on the basis of the following conditions. First, they were known to be generally passive (detached) and to have limited opportunities to practice self-determination and independent engagement, conditions considered critically important for their personal satisfaction and rehabilitation progress. Second, they had signaled their interest (through facial expressions or head nodding) in the technology system designed for this study and the occupation/communication options it provided. Third, their families and staff considered the technology system and the intervention approach based on it appropriate for the participants and thus supported its application.

Procedures

Setting, Sessions, and Research Assistant

The study was carried out in small rooms of the medical or socio-medical rehabilitation units of the center, in which the participants received their regular treatment and spent their time. Baseline and intervention sessions (i.e., sessions without the technology system and sessions with the technology system, respectively) were typically implemented once or twice a day, 5 days a week. The research assistant responsible for implementing the sessions and recording the data was a woman with a Master’s degree in Psychology who had experience in applying technology-aided intervention programs with people with disabilities and was familiar with data recording strategies.

Technology System (Components)

The technology included a tablet with Android operating system, and two touch or two optic sensors linked to the tablet via a basic interface. The tablet was fitted with Internet connection, WhatsApp Messenger, a SIM card, as well as the MacroDroid application, which was programmed to regulate its functioning. The tablet also contained a large variety of music videos considered to be preferred by the participants, pictures of the singers appearing in those videos, and pictures and telephone numbers of the participants’ preferred partners (i.e., the partners that could be reached via video calls or emoji messages). The touch sensors were used for Adrian and Travis and consisted of square-shaped devices measuring 5 × 5 × 0.2 cm, which could be activated by a light touch response. The optic sensors were used for Connor and consisted of triangle-shaped devices with 9-cm sides and 1-cm thickness, which were activated as the participant’s hand came to a distance of less than 2.5 cm from them. A small panel was placed between the sensors to avoid inaccurate response movements that could inadvertently activate the wrong sensor. The A, B, and C sections of Fig. 1 provide a schematic representation of the sensors and their position in relation to the tablet for the three participants. Sensor position variations served to suit the participants’ response schemes.

Fig. 1

The A, B, and C sections of the figure provide a schematic representation of the sensors and their position in relation to the tablet for Adrian, Connor, and Travis, respectively. In the B (Connor) section, the representation also includes the small panel used between the sensors. The D section provides a representation of the tablet showing the image used for music on the left half of the screen and the image of a telephone on the right half of the screen

Technology System (Music and Communication)

Sessions with the system started with a music and communication choice period. The tablet showed the image used for music on the left half of the screen and the image of a telephone on the right half of the screen (see the D section of Fig. 1) and asked the participant whether he wanted to listen to music or get in touch with somebody. If the participant chose the music option (i.e., activated the left sensor corresponding to the music side of the tablet screen), the tablet presented the photos of two singers (one to the left and one to the right) and asked the participant whether he wanted to listen to one (SINGER NAME) or the other (SINGER NAME). The participant could choose the singer by activating the sensor corresponding to the side of the tablet screen on which that singer appeared. No choice for 10 s led the tablet to present the music and telephone images again and repeat the initial question (i.e., restart the choice sequence with new singers and partners available). A choice led the tablet to play a music video of about 1.5 min in which the chosen singer sang one of the participant’s preferred songs (i.e., one of the several preferred songs available in its memory; Lancioni et al., 2023).

At the end of a music video, the tablet asked again whether the participant wanted to listen to music or get in touch with somebody. If the participant chose to listen to music, the process described above was repeated with two different singers. If the participant chose to get in touch with somebody, the tablet presented the photos of two preferred partners (e.g., the wife and a friend) and asked whether he wanted to contact one (PARTNER NAME) or the other (PARTNER NAME). The participant could choose one of the two by activating the corresponding sensor (see above). No choice for 10 s led to a restart of the entire choice sequence (see above). Choice of a partner led the tablet to start a video call with that partner for Adrian and Connor (see Fig. 2 for a summary of the procedural conditions available for them). Choice of a partner by Travis led the tablet to present the image of a video call and a combination of emoji symbols for greetings, strength and love and ask Travis whether he wanted to have a video call or send a message. As soon as Travis chose one of the two options, the tablet started a video call with or sent a message to the partner. If a partner of any of the participants did not respond to a video call, the tablet played a pre-recorded video message of that partner. Once the call, message or song resulting from the second successfully completed choice had ended (see above), the music and communication period was considered concluded, provided that more than 3 min had elapsed from its start. Otherwise, the tablet offered additional choice opportunities (involving other singers and partners).
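The choice sequence described above can be summarized as a small decision function. The following is only an illustrative sketch and not the MacroDroid configuration used in the study: the function name, the event labels, the mapping of the video call and message options to the left and right sensors, and the handling of a missing choice at the call/message step are all assumptions made for the example.

```python
from typing import Optional

CHOICE_TIMEOUT_S = 10  # from the text: 10 s without a choice restarts the sequence


def resolve_choice(first: Optional[str],
                   second: Optional[str],
                   third: Optional[str] = None,
                   offers_messages: bool = False) -> str:
    """Return the event produced by one pass through the choice sequence.

    Each argument is "left", "right", or None, where None stands for no
    sensor activation within CHOICE_TIMEOUT_S.
    """
    if first is None:
        return "restart"          # still at the initial music/telephone question
    if second is None:
        return "restart"          # no singer or partner chosen within 10 s
    if first == "left":           # left sensor = music side of the screen
        return "music_video"
    if not offers_messages:       # Adrian and Connor: partner choice starts a call
        return "video_call"
    if third is None:
        return "restart"          # assumption: the text does not specify this case
    # Assumption: left = video call, right = emoji message (not stated in the text).
    return "video_call" if third == "left" else "emoji_message"


print(resolve_choice("left", "right"))                                  # music option
print(resolve_choice("right", "left"))                                  # partner option
print(resolve_choice("right", "left", "right", offers_messages=True))   # Travis
```

Keeping the logic in a pure function of the sensor responses makes the two-step (Adrian and Connor) and three-step (Travis) variants easy to compare.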

Fig. 2

The flowchart summarizes the procedural conditions available for Adrian and Connor

Technology System (Story and/or Questions)

Every music and communication choice period was followed by a story plus questions (Adrian) or questions (Connor and Travis) period. The story the tablet read to Adrian concerned a topic of interest to him (e.g., geography, sport, history) and typically lasted about 2 min. The end of the story was followed by five story-related questions. For each question, the tablet gave the participant two possible answers and he was to choose one of them with the left or right sensor. For example, a question could be formulated as follows: “Was NAME a businessman or an athlete? Touch the left sensor for businessman and the right sensor for athlete”. If the participant’s answer was correct, the tablet provided positive feedback and presented the next question. If the participant’s answer was incorrect, the tablet provided no feedback and waited for the participant to correct his answer (i.e., touch the correct sensor) before moving to the next question.
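Because the tablet waited for a corrected answer after an error, each question could collect more than one sensor response, while only the first response counted toward the correct-answer measure. A minimal sketch of that scoring rule follows; the function name and the example responses are hypothetical and serve only as illustration.

```python
def count_first_attempt_correct(attempts_per_question, correct_answers):
    """Count questions whose first sensor response matched the correct side.

    attempts_per_question: one list of responses ("left"/"right") per question,
    in the order they were produced; an empty list means no response.
    correct_answers: the correct side for each question.
    """
    return sum(
        1
        for attempts, correct in zip(attempts_per_question, correct_answers)
        if attempts and attempts[0] == correct
    )


# Hypothetical set of three questions: the second one was corrected after an error.
attempts = [["left"], ["right", "left"], ["right"]]
answers = ["left", "left", "right"]
print(count_first_attempt_correct(attempts, answers))
```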

The questions presented to Connor and Travis could concern objects of common use (e.g., clothes, appliances and work tools), famous people (e.g., singers and actors) as well as daily events, and could be as follows: “What do you use during the summer, sandals or boots?” or “Who sings SONG TITLE? SINGER NAME one or SINGER NAME two?” Together with the question the tablet presented the sandals or the picture of the first singer on one side of the screen and the boots or the picture of the second singer on the other side. The participant could choose the answer by activating the sensor corresponding to the sandals (first singer) or the sensor corresponding to the boots (second singer).

Every intervention session (i.e., with the use of the technology system) was divided into seven specific time periods. Four of those periods (i.e., the first, third, fifth and seventh) were music and communication choice periods (see above). The remaining three periods (i.e., the second, fourth, and sixth) were story plus questions (Adrian) or questions (Connor and Travis) periods.
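The alternation of the seven periods can be written out explicitly. The sketch below is an illustration only; the function name and the period labels are assumptions introduced for the example.

```python
def session_periods(participant: str) -> list:
    """Return the seven periods of an intervention session, in order.

    Odd-numbered periods are music and communication choice periods;
    even-numbered periods are story plus questions (Adrian) or
    questions only (Connor and Travis).
    """
    even_period = "story_plus_questions" if participant == "Adrian" else "questions"
    return [
        "music_and_communication" if number % 2 == 1 else even_period
        for number in range(1, 8)
    ]


for number, period in enumerate(session_periods("Adrian"), start=1):
    print(number, period)
```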

Experimental Conditions

The study was carried out according to a non-concurrent multiple baseline design across participants (Lancioni et al., 2022; Ledford & Gast, 2018). In line with the design, the participants received different numbers of baseline sessions (without the technology system) before being exposed to the intervention sessions (i.e., with the technology system). To limit participants’ frustration (i.e., given their expected inability to respond without the technology system), precautions were taken. That is, the research assistant would intervene after brief periods of no responding from the participants and eventually would interrupt the sessions. Following the end of the intervention phase, a staff survey was carried out to determine staff opinion about the suitability and impact of the technology system used during the intervention phase.

A specific strategy was adopted to ensure high accuracy of the research assistant in applying the baseline and intervention procedural conditions (i.e., high procedural fidelity; St. Peter, 2023). Namely, a study coordinator who had access to video recordings of the sessions provided the research assistant with feedback on her performance daily or every other day. Feedback consisted of pointing out whether she was correct or not on each of the procedural steps implemented during the sessions.

Baseline

During the baseline sessions, the participants sat in front of a tablet, which was not fitted with MacroDroid and not linked to sensors. At the start of a session, the research assistant told the participants that they could access preferred songs and make telephone calls with preferred partners by operating the tablet. For Travis, the research assistant added that he could also send messages to his partners.

Thereafter, the research assistant invited the participants to select a singer or song. If the participants failed to do so for about 10–15 s (i.e., as expected given their condition), the research assistant did it for them to minimize frustration. A music video featuring that singer was then played for about 1.5 min (i.e., as during the intervention; see the Technology System section). After the end of the video, the research assistant invited the participants to call (Adrian and Connor) and to call or send a message (Travis) to a preferred partner. Again, the research assistant acted for them if they made no attempt for 10–15 s.

These leisure and communication choice occasions were followed by the presentation of a story and related questions (Adrian) or simply by questions (Connor and Travis). If the participants did not produce any answer to the first question, the research assistant did it for them and did not present any other question. Participants’ failure to respond (i.e., dependence on the research assistant) on the leisure, communication, and question occasions presented above led to the interruption of the session, with a negative score recorded for each of the remaining leisure and communication choice occasions and questions.

Intervention

During the intervention sessions, the participants were provided with the technology system that worked as described in the Technology System section. Every session contained four leisure and communication periods interspersed with three stories plus questions periods or three questions periods (see above). The first five or seven sessions served as introductory sessions in which the research assistant could provide verbal and physical guidance to help the participants activate the sensors to access leisure and communication events and answer the questions being presented to them. Guidance was used on a most-to-least basis with the aim of reaching participants’ independence by the end of the introductory sessions (Libby et al., 2008; Pierce & Cheney, 2017). During the following (regular) intervention sessions, no research assistant guidance was scheduled. However, the research assistant would read or make the tablet play any message Travis had received in response to his emoji messages at the start and at the end of the sessions.

Measures

The measures were: (a) the number of music videos accessed, (b) the number of video calls activated or emoji messages sent, (c) the number of questions answered correctly (i.e., the number of questions for which the first response was correct), and (d) the sessions’ length. The research assistant implementing the sessions also recorded the data. Interrater agreement on data recording was assessed in 21 to 24% of the sessions of each participant by having a reliability observer watch the video-recordings of those sessions and provide independent scores for the measures. Agreement on sessions’ length allowed a discrepancy of 1 min between research assistant and reliability observer. The percentage of interrater agreement on the single measures (computed by dividing the number of sessions with agreement by the total number of sessions in which the reliability observer was involved) was within the 90–100 range for all three participants.
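The agreement computation described above (sessions with agreement divided by the total number of sessions checked, with a 1-min tolerance for session length) can be sketched as follows. The function name and the example values are hypothetical and are not data from the study.

```python
def agreement_percentage(assistant_scores, observer_scores, tolerance=0):
    """Percentage of sessions in which the two raters agreed on a measure.

    Two scores agree when they differ by no more than `tolerance`
    (0 for counts, 1 for session length in minutes).
    """
    if len(assistant_scores) != len(observer_scores) or not assistant_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    agreements = sum(
        abs(a - b) <= tolerance
        for a, b in zip(assistant_scores, observer_scores)
    )
    return 100 * agreements / len(assistant_scores)


# Hypothetical session lengths (min) recorded by the two raters in four sessions.
assistant = [20, 25, 19, 27]
observer = [21, 25, 22, 27]
print(agreement_percentage(assistant, observer, tolerance=1))
```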

Data Analyses

The baseline and intervention data were grouped over small blocks of sessions (i.e., to simplify their presentation) and displayed in graphic form. The Percentage of Non-overlapping Data (PND) method (Parker et al., 2011) was used to determine whether the participants’ cumulative number of music videos and communication events as well as their number of correct answers to the questions were consistently higher during the intervention sessions as compared to the baseline sessions.
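For a behavior expected to increase, the PND index is commonly computed as the proportion of intervention data points that exceed the highest baseline data point. The sketch below follows that common definition and expresses the index as a proportion between 0 and 1; the function name and the example values are hypothetical and are not data from the study.

```python
def pnd(baseline, intervention):
    """Proportion of intervention points above the highest baseline point."""
    if not baseline or not intervention:
        raise ValueError("both phases need at least one data point")
    ceiling = max(baseline)
    return sum(value > ceiling for value in intervention) / len(intervention)


# Hypothetical per-session counts of music and communication events.
print(pnd([0, 0, 0, 0, 0], [8, 7, 9, 8]))   # every intervention point exceeds baseline
print(pnd([0, 2, 1], [3, 2, 4, 1]))         # partial overlap with baseline
```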

Staff Survey

The survey involved 21 staff members, 12 females and 9 males of 32–59 (M = 45) years of age, who worked within the rehabilitation and medical units of the center that the participants attended. The survey involved two steps. The first step consisted of showing the staff in groups of three to five a 6-min video, which reported clips of the intervention sessions of the three participants that concerned selecting/activating music videos and video calls and answering questions. The second step consisted of asking the staff to give their opinion on each of four survey points, that is, on whether (a) the technology system was suitable and friendly to the participants, (b) the participants benefited from the use of the system, (c) the system was sufficiently practical for use within daily contexts, and (d) they would be willing to adopt such a system for other participants. For each survey point, staff opinion was to be expressed with a score of 1 (most negative) to 5 (most positive).

Results

The three graphs of Fig. 3 summarize the baseline and intervention data for Adrian, Connor, and Travis, respectively. The black triangles and empty circles represent the mean frequency of music videos and video calls (Adrian and Connor) or music videos and video calls plus messages (Travis) activated per session, respectively, over blocks of sessions. Blocks include two sessions during the baseline and three sessions during the intervention. The asterisks represent the mean percentage of story-related questions (Adrian) or general questions (Connor and Travis) that were answered correctly at first attempt over the same blocks of sessions. The introductory sessions preceding the regular intervention sessions are not reported in the figure.
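The grouping of sessions into blocks (two sessions per block during the baseline, three during the intervention) amounts to averaging consecutive slices of the per-session data, with a final block that may contain fewer sessions. A minimal sketch, assuming hypothetical per-session counts and a hypothetical function name:

```python
def block_means(values, block_size):
    """Average consecutive sessions in blocks; the last block may be shorter."""
    means = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        means.append(sum(block) / len(block))
    return means


# Hypothetical per-session counts of music videos.
print(block_means([0, 0, 0, 0, 0], 2))   # baseline: blocks of two sessions
print(block_means([6, 9, 9, 8], 3))      # intervention: blocks of three sessions
```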

Fig. 3

The three graphs summarize the baseline and intervention data for Adrian, Connor, and Travis. The black triangles and empty circles represent the mean frequency of music videos and video calls (Adrian and Connor) or music videos and video calls plus messages (Travis) activated per session, respectively, over blocks of sessions. The blocks include two sessions during the baseline and three sessions during the intervention. Blocks with different numbers of sessions (at the end of the baseline or intervention phase) are marked with a numeral indicating the sessions included. The asterisks represent the mean percentage of story-related questions (Adrian) or general questions (Connor and Travis) that were answered correctly at first attempt over the same blocks of sessions

During the baseline phase (including five to eight sessions), the participants failed to activate any music, video call or message, and to answer any question. Thus, all sessions were interrupted and their length was always below 7 min. During the intervention phase (including 66, 106, and 73 sessions for the three participants, respectively), Adrian managed to activate a mean of over 7.5 music videos and nearly 0.5 video calls per session. His mean percentage of correct answers to the story-related questions was close to 100. Connor activated a mean of nearly 7.1 music videos and over 0.9 video calls per session. His mean percentage of correct responses to the questions presented to him during the sessions was 88. Travis activated a mean of nearly 6.4 music videos and nearly 2.9 video calls and messages per session. The messages represented about 70% of the total and typically produced replies from the partners that were played to him at the beginning and at the end of the sessions. Travis’ mean percentage of correct responses to the questions presented to him during the sessions was 75. The mean session length varied between about 19 min (Connor) and 27 min (Adrian).

The PND indices were 1 for all participants, as their cumulative number of music and communication events and their number of correct answers to the questions during intervention sessions were always higher than those recorded during the baseline sessions. Staff provided fairly high/positive scores (i.e., with means varying between 4.3 and 4.5 out of a maximum possible of 5) for the four survey points concerning whether the technology system was suitable/friendly, beneficial, and applicable in daily contexts, and whether staff were willing to use it for other participants.

Discussion

The findings suggest that the participants were able to use the technology system successfully to independently access preferred music videos, start video calls or send emoji messages, and listen to stories and/or answer series of questions. These findings corroborate the data reported by Lancioni et al. (2023) with a similar intervention approach and are in line with previous evidence on the effectiveness of technology-aided interventions to promote functional forms of activity and communication in people with extensive neuro-motor impairment and lack of speech (D’Amico et al., 2019, 2023; Lancioni et al., 2020). In light of the findings, several considerations may be made.

First, enabling people like the participants of this study to independently engage in constructive occupation and communication can be considered an important achievement with potentially positive implications for the participants’ quality of life (Brown et al., 2013; Silva et al., 2020; Tulsky & Kisala, 2019). In fact, these people are typically marginalized (passive and largely disconnected) due to their level of disability and the relatively limited impact of conventional intervention strategies (e.g., speech therapy, physiotherapy, and occupational therapy) at least in the short term (Basilakos, 2018; D’Amico et al., 2019; Hassett et al., 2018; MacDonald, 2017; Palmer et al., 2019; Rotariu et al., 2019).

Second, given the characteristics of the participants, the response schemes used in previous studies in the area (e.g., holding a card against a smartphone or moving a smartphone) would not have been feasible at least for Adrian and Connor (D’Amico et al., 2023; Lancioni et al., 2020). Similarly, Adrian and Connor would not have been capable of using pressure sensors such as those used by Lancioni et al. (2023). Touch sensors arranged before the tablet (see the A section of Fig. 1) allowed Adrian to manage the use of the technology system with small lateral movements of his left hand. Connor’s only response scheme was a form of general pointing. One of the few possible ways of capturing and making this response functional was the use of optic sensors before the tablet’s left and right screen areas (see the B section of Fig. 1).

Third, the options available to the participants within the sessions, albeit limited, were considered functional for constructive occupation. Access to music videos was viewed as a preferred form of engagement that motivated the participants to respond (to take initiative and choose within pairs of options) and offered them self-determination opportunities (Hobeika et al., 2021; Raglio et al., 2015; Wehmeyer, 2020). The video calls and messages allowed them to get in touch with relevant partners and express and receive signs of affection. Listening to brief stories and/or answering sets of questions constituted a type of engagement that (a) was a meaningful alternative to other, impossible forms of occupation (e.g., daily activities) and (b) served to stimulate the participants’ attention and memory with presumably positive impact on their cognitive functioning (Collins et al., 2019; Lancioni et al., 2023; Wood et al., 2020).

Fourth, the high accessibility and relatively low cost of the technology used in the study may be two practically relevant points. As to the accessibility, it may be noted that the technology system entails a small number of commercially available devices that are easily portable and readily usable across settings (Abdi et al., 2021; Boot et al., 2018; Borg, 2019). The cost of the technology system may be approximately US $600–750. This includes about $300 for the tablet, about $150 for the interface, and about $150 for the touch sensors used by Adrian and Travis or nearly $300 for the optic sensors used by Connor. The cost of the MacroDroid application is practically insignificant.
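The two ends of the cost range follow directly from the component prices listed above. The short sketch below only reproduces that arithmetic; the constant and function names are introduced for illustration, and the prices are the approximate figures given in the text.

```python
# Approximate component prices in US dollars, as listed in the text.
TABLET = 300
INTERFACE = 150
SENSOR_PAIR = {"touch": 150, "optic": 300}


def system_cost(sensor_type: str) -> int:
    """Approximate total cost of the system for a given sensor type."""
    return TABLET + INTERFACE + SENSOR_PAIR[sensor_type]


print(system_cost("touch"))   # configuration used by Adrian and Travis
print(system_cost("optic"))   # configuration used by Connor
```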

Fifth, the outcome of the staff survey can be considered very encouraging as to the specific relevance and usability of the technology system. In fact, staff provided positive ratings regarding the suitability, friendliness and beneficial impact of the system for the participants of this study (Shlesinger et al., 2022; Stasolla et al., 2022). Staff also expressed their willingness to promote the use of the system for other people (Stasolla et al., 2022).

Limitations and Future Research

The two most obvious limitations of the study concern the small number of participants and the lack of a formal evaluation of participants’ and families’ satisfaction with the technology-aided intervention. The first limitation calls for caution in making general comments about the results. Direct and systematic replication studies with additional participants are needed before one can draw conclusions as to the robustness and generality of this type of intervention (Coiera & Tong, 2021; Kazdin, 2011; Locey, 2020; Plucker & Makel, 2021).

With regard to the second limitation, one can argue that (a) the importance of participants’ and families’ satisfaction with the intervention can never be overemphasized and (b) formal assessment of satisfaction should be included in future studies (Lancioni et al., 2023; Stasolla et al., 2022). Notwithstanding these points, it should be noted that the intervention in this study was programmed to include access to preferred stimulation and to communication with preferred partners (e.g., family members and friends), that is, two components that were expected to create satisfaction. It is also important to add that, according to anecdotal reports, participants were willing (eager) to be involved in the sessions and families supported the use of those sessions.

In conclusion, the findings suggest that the intervention based on a simple technology system was helpful to enable the participants to engage in constructive forms of occupation and communication independently. Although encouraging, these findings need to be taken with some caution given the aforementioned limitations of the study. New research will need to address those limitations and possibly search for ways of expanding the number of occupational and communication options available within the sessions and of upgrading the technology so as to make it more easily usable (friendly) for participants and caregivers.