1 Introduction

Life expectancy in the United States increased from 69.9 years in 1959 to 77.8 years in 2020 (Arias et al. 2021). As a result, the absolute number of older adults has progressively increased over the years. The United States Census Bureau estimates that there will be more older adults than children by 2035, with older adults increasing from 15.2% of the US population in 2016 to 23.4% by 2060 (US Census 2018).

This aging of the population will increase the number of individuals unable to live independently, as the risk of cognitive decline, including changes that impact sensory, mental, and physical functioning, increases with age (Murman 2015; World Health Organization 2015). Even with these expected declines, research has shown that older adults can still learn new performance skills and can preserve motor memories acquired later in life (Smith et al. 2005).

Newly developed technologies can help increase the quality of life of older adults by providing medical rehabilitation (Bui et al. 2021; Canning et al. 2020; Pedram et al. 2020; Perez-Marcos et al. 2018; Stamm et al. 2022), increasing physical activity engagement (Campo-Prieto et al. 2021; Gao et al. 2020), and even decreasing loneliness and social isolation (Appel et al. 2020; Lee et al. 2019). Virtual Reality (VR) technology can simulate highly realistic environments, which can also contribute to the development of better diagnostic tools for remotely detecting changes in cognition (Zygouris et al. 2017), such as mild cognitive impairment (Cavedoni et al. 2020).

When deciding to incorporate VR into training or testing applications, researchers must understand how representative a VR task is of a real-life one, commonly defined as validity, which can be measured by comparing behavioral metrics in VR versus real life (Paljic 2017). Indeed, immersive VR systems have recently been used to compare real-life and VR performance in the fields of prosthetics (Joyner et al. 2021) and manual training (Carlson et al. 2015; Elbert et al. 2018; Murcia-López and Steed 2018), but not with older adults.

Using VR for training aims both to facilitate the training itself and to ensure that its results transfer to real-life applications (Bezerra et al. 2018; Elbert et al. 2018). Elbert et al. (2018) examined the transferability of order-picking performance from real life to VR using a task replica and an immersive system in a sample of working-age adults. Another study looked at differences in kinematics when picking up objects from a supermarket shelf in real life and in a virtual environment (Arlati et al. 2022). Bezerra et al. (2018) did not use a fully immersive system (Microsoft Kinect) but experimented with older adults specifically to evaluate the transferability of VR skills.

Few studies have had participants perform a task in real life as well as in a VR replica (Arlati et al. 2022; Bezerra et al. 2018; Elbert et al. 2018), an approach that should allow a better comparison, especially for subjective measurements, i.e., how people feel about the different settings. It is much easier to judge how much harder a task is in VR when one has just completed the real-life version, rather than having to recall it from previous experiences, a faculty known to be affected by late-life cognitive decline (Hedden and Gabrieli 2004).

One commonly overlooked factor is the effect of VR on performance involving fine motor skills, and how learning the system takes place in a VR setting. Past research has used time as a measure of performance (Bezerra et al. 2018; Chen and Or 2017; Mason et al. 2019; Porffy et al. 2022), a measure that learning the system itself might directly influence. Studies normally include a demonstration or training portion, but even so, research should incorporate multiple trials, as some past studies have (Bezerra et al. 2018; Elbert et al. 2018). The challenge is dealing with practice effects, which might be due to factors such as memorization and learned strategies, something commonly seen in cognitive tests (Calamia et al. 2012).

When making a direct comparison between real and virtual environments, one must also understand the acceptability and feasibility of the virtual setting. Past research has compared VR and real life (Bezerra et al. 2018; Parra and Kaplan 2019), but not necessarily using immersive environments and/or motor-cognitive tasks related to daily abilities, commonly referred to as Instrumental Activities of Daily Living (IADLs). These can be defined as “intentional and complex activities, requiring high-level controlled processes in response to individuals’ needs, mainly related to novel and/or challenging daily living situations” (de Rotrou et al. 2012). Examples of IADLs include driving, shopping for products, cooking, managing money, and doing laundry. These complex tasks require fine motor-cognitive skills, such as the precision needed to select and move objects during activities like sorting laundry. Since IADLs are considered higher-order, complex activities, a strong connection between them and cognitive function is to be expected, a relationship already demonstrated by various studies (Marshall et al. 2011; Reppermund et al. 2011).

When considering the use of different technologies in any field, it is crucial to understand how they can affect performance through user interaction. In VR development, researchers must take into account the human experience, including the usability of the system and its potential limitations. For example, consumer behavior in VR was found to be mostly consistent with everyday life for product development (Branca et al. 2023), making it an effective and more sustainable alternative to traditional tests. In addition, a potential limitation is cybersickness, a common issue with VR technology (Dilanchian et al. 2021; Mittelstaedt et al. 2019; Weidner et al. 2017). However, this has been somewhat alleviated by technological advancements that increased display resolution and refresh rates (Saredakis et al. 2020). To minimize cybersickness, designers must focus on the user’s sense of presence within the VR environment: studies have found that a stronger sense of ‘being there’ is negatively associated with feelings of cybersickness (Weech et al. 2019). Overall, cybersickness reported by older adults in VR experiences is generally low (Huygelier et al. 2019), with research even pointing to older adults reporting less cybersickness than younger adults (Dilanchian et al. 2021).

Understanding the effect of the VR system on performance, including learning its use, is critical for developing clinical applications intended to replicate tasks that are part of common cognitive ability testing, such as the IADLs. In this study, a task requiring fine motor skills was performed by older adults in both settings (real life and VR) to evaluate differences. Participants were required to repeat the same task in each setting to determine if and how learning would take place in VR. Testing multiple times in each setting was deliberately chosen to ensure that participants had adequately learned and could perform the task in a real-life context before replicating it in VR. This approach allows a direct comparison of learning outcomes and adaptations in the VR environment against a real-life baseline, isolating the VR-specific learning effects. Both objective (time and effectiveness) and subjective (perceived task load) measures of comparison were collected, as well as usability and acceptability measures related to the VR equipment and environment.

2 Methodology

2.1 Participants

A total of 20 participants were recruited from the Memory and Aging Laboratory at Kansas State University. This study complied with the American Psychological Association Code of Ethics and was approved by the Kansas State University’s Institutional Review Board (#IRB-10786). Consent was obtained by having participants read and sign the consent form after the nature of the study was explained to them.

The sample had an average age of 72.4 years (SD = 5.0, min = 65, max = 84) and included 12 males, 7 females, and 1 participant who preferred not to report gender. The sample was on average highly educated, with a mean of 17.2 (SD = 2.6) years of education. The majority of participants were retired (16 retired, 4 still working).

2.2 Task design

A task that requires fine motor coordination (selecting objects jumbled in a bowl) and is easy for participants to complete in a real-life setting (hereafter referred to as RL) was designed for the study. Completing the same task in VR requires participants to learn how to interact with the system, which does not provide the same sensory (touch) feedback and requires more precision when selecting objects.

The task designed for this experiment was a sorting task requiring hand-eye coordination along with decision-making processes. Seated participants were presented with a transparent bowl containing two types of objects, each available in multiple colors. A total of 54 one-inch cubes in six different colors (9 of each color) and 45 balls, 1.5 inches in diameter, in nine different colors (5 of each color) were randomly mixed in the bowl (see Fig. 1).

Fig. 1
figure 1

Real-life (RL) set up and designed task in VR

Participants were instructed to sort the objects from the transparent bowl in front of them into three black rectangular containers according to specific pairs of colors, e.g., one container for blue and green objects only, another for red and orange only, and the last for pink and purple only. To prevent participants from reducing their decision-making time by already knowing which pairs to combine, the color combinations changed for each trial. Only one object could be selected at a time.

To account for learning related to the task, the task was performed 3 times in each setting, i.e., 3 times in RL and 3 times in VR. Data collected from each trial included completion time (using a stopwatch) and the number of misplaced objects, i.e., objects put in an incorrect container.

The task was designed to isolate the VR effect on task performance and reduce practice effects. A conscious decision was made not to counterbalance the RL and VR order, given the primary objective of assessing older adults’ abilities to perform tasks reflective of everyday activities with which they are already familiar. For example, the sorting task is akin to routine activities such as organizing household items or sorting laundry, which most participants have likely encountered in their daily lives. Practice effects were reduced by changing the pairs of colors at every trial, so that participants could not memorize the correct assignments and would still have to go through the same decision-making process during each trial.

It was hypothesized that there would be no significant difference in time performance between the second and third trials in RL. If that difference is not significant, any time difference observed later in VR could be attributed to the VR itself.

2.3 Virtual reality apparatus and system training

The Virtual Reality environment was designed using Unity (version 2020.3.10f1) and was built to accurately replicate the real environment, that is, with the same quantities, colors, and sizes for objects as seen in Fig. 1. Colliders for objects were designed to match the exact visual size and shape of the balls and cubes. Additionally, a sphere collider, located in the middle of the players’ virtual hands and with a radius of 0.05 (in Unity’s default units, meters), was used for grabbing.

The Oculus Quest 2, a fully immersive VR system consisting of a Head-Mounted Display (HMD) and two hand-held controllers, was selected for this study. Objects were grabbed in the virtual environment by pressing the side trigger button on the controller. Although participants used physical controllers, within the VR setting they saw only virtual hands, not the controllers themselves. The device was connected to a computer that rendered the virtual reality environment.

A demonstration scene was prepared to train participants in using the controllers to grab objects. All colors and shapes of objects were displayed in front of the participant before the experiment started, to give a better understanding of the correct color assignments and of how to grab each object. After grasping and releasing at least half of the objects, participants could move on to the actual sorting task. The pairs of colors for each trial were displayed on a gray wall in front of the participant. In the real-life task, the pairs of colors were displayed on a piece of paper affixed to the wall and replaced each round, mirroring the setup in the VR environment.

Fig. 2
figure 2

Flowchart of study design

2.4 Procedure

The study was conducted in an Ergonomics Laboratory at Kansas State University. Participants were informed of the location of the study and came to the laboratory independently. Participants gave consent to join the study and started by answering a pre-experiment survey that included basic demographics. Participants also reported their familiarity with technology by selecting which devices they owned from a list (e.g., tablet, smartphone, computer, and VR devices) and answered the Computer Proficiency Questionnaire (Boot et al. 2015). All participants reported owning a computer and a cellphone, and 14 participants also owned a tablet. Average technological device ownership was 3.15 (SD = 0.87) devices per participant, ranging from 2 to 5 devices.

Figure 2 describes the basic procedure followed by each participant. Each participant took the Mini Montreal Cognitive Assessment (MOCA), Version 2.1, administered by a MOCA-certified rater (ID USKAUCR7093499-01). Results were not interpreted by the researcher, and participants were not informed of their scores, since the purpose of the study was not to evaluate possible cognitive decline. The MOCA score was used only to control for cognitive abilities in the modeling process. No participant who was able to complete the task effectively in RL and VR was removed from the analysis, regardless of MOCA score. MOCA scores averaged 12.80 (SD = 1.73), with 11 and above (out of 15 points) considered normal cognition. Two participants scored below 11, mostly due to the recall portion of the test, but were not excluded from the analysis since they successfully completed the study.

All participants were seated during both tasks. Participants were allowed to continue to the first trial of the VR condition after successfully completing the training session described in the previous section.

Right after the end of the 3 RL trials, task load was assessed using the NASA-TLX questionnaire, which comprises 6 sub-dimensions related to mental, physical, and temporal demands, performance, effort, and frustration level (Hart and Staveland 1988). Participants then completed the remaining 3 trials in the VR condition, followed by another task-load questionnaire referring to the latest task performed and a post-experiment survey. VR-specific assessments in the survey included: (1) presence in the VR environment, measured using the IGroup Presence Questionnaire (IPQ) (Schubert 2003); the subscales included, related to Experienced Realism (REAL1, REAL2) and General Presence (G1, SP1, SP5), were combined into a VR Realness score; (2) the Simulator Sickness Questionnaire (SSQ) (Kennedy et al. 1993), measuring cybersickness possibly caused by the VR experience; (3) the System Usability Scale (Brooke 1996), evaluating the usability of this type of technology; and (4) open-ended questions about strategy changes between the real and virtual settings and likes and dislikes regarding the VR experience.

3 Results

All 20 participants completed the entire experiment with no difficulties or technical issues. The study took approximately one hour per participant.

3.1 Performance results

Completion rates were extremely high, with very few mistakes in both settings. The average number of misplaced objects was 0.73 per trial in RL and 0.86 per trial in VR. The majority of these mistakes were attributed to the difficulty of distinguishing similar colors, such as pink and purple. Although participants were capable of sorting colors during the trials, which suggests color blindness was not a factor, it is recognized that using colors with lower contrast could potentially affect performance.

To evaluate whether there were significant differences between trials, a repeated-measures statistical analysis was conducted. Data were tested for normality using the Shapiro-Wilk test, which rejected normality for most trials due to right-skewness. Therefore, the non-parametric Friedman test was used. Differences across the 3 RL trials were statistically significant (\({\chi }^{2}\left(2\right)=45.24, p<0.001)\), as were differences across the VR trials (\({\chi }^{2}\left(2\right)=40.00, p<0.001)\).

The Wilcoxon signed-rank test was used as a paired t-test alternative, comparing times from the first and second RL trials (\(T=29, p=0.003)\) and from the second and third RL trials (\(T=101.50, p=0.89)\), with a Bonferroni-corrected significance level of 0.025. The first RL trial therefore required more learning of the task than subsequent trials, reflecting the straightforward nature of the task and participants’ increasing proficiency with practice. Given the 0% average improvement between the second and third RL trials, the initial hypothesis that participants had already learned the task could not be rejected, so changes observed during the VR trials were attributed to the VR system itself. In VR, the fourth and fifth trials were compared (\(T=3, p<0.001)\), as were the fifth and sixth trials (\(T=44, p=0.04)\). The latter comparison yielded a much lower p-value than the corresponding RL comparison, suggesting that improvement was likely still taking place, although at a lower rate than between the fourth and fifth trials; however, it did not meet the corrected significance threshold and was therefore not considered statistically significant.
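The analysis pipeline described above (normality check, omnibus test, and Bonferroni-corrected pairwise follow-ups) can be sketched in Python with SciPy. The completion times below are synthetic, right-skewed placeholders, not the study’s data; only the sequence of tests mirrors the text.

```python
# Sketch of the non-parametric repeated-measures analysis: Shapiro-Wilk for
# normality, Friedman across the three trials, and pairwise Wilcoxon
# signed-rank follow-ups with a Bonferroni-corrected alpha of 0.025.
# Times are illustrative synthetic data, not the study's measurements.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
trial1 = rng.lognormal(mean=4.7, sigma=0.25, size=20)  # first trial, slowest
trial2 = trial1 * 0.90 * rng.lognormal(0, 0.05, 20)    # roughly 10% faster
trial3 = trial2 * rng.lognormal(0, 0.05, 20)           # little further change

# Normality check: a small p-value rejects normality, motivating
# the non-parametric tests used below
print([round(stats.shapiro(t).pvalue, 4) for t in (trial1, trial2, trial3)])

# Omnibus Friedman test across the repeated measures
chi2, p = stats.friedmanchisquare(trial1, trial2, trial3)
print(f"Friedman: chi2={chi2:.2f}, p={p:.4f}")

# Planned pairwise comparisons with Bonferroni correction (0.05 / 2 = 0.025)
alpha = 0.05 / 2
for name, (a, b) in [("1 vs 2", (trial1, trial2)), ("2 vs 3", (trial2, trial3))]:
    res = stats.wilcoxon(a, b)
    print(f"{name}: T={res.statistic:.1f}, p={res.pvalue:.4f}, "
          f"significant={res.pvalue < alpha}")
```

The same structure applies to the VR trials by substituting the fourth-through-sixth trial times.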

A final test compared the third trial (RL) and the sixth trial (VR) to compare the two conditions directly, again using the Wilcoxon signed-rank test (\(T=0, p<0.001)\). The two conditions were statistically significantly different, with the VR task taking longer than the RL task.

Figure 3 shows boxplots of the time (in seconds) to complete each trial. Most participants (17 out of 20) showed a small improvement in time across the first three RL trials. The improvement mostly occurred between the first and second trials, with more consistent results between the second and third trials, demonstrating that participants had mastered the task by then. All 20 participants improved on their first VR trial time during the subsequent VR trials, but the VR trials still took longer than those in RL.

Fig. 3
figure 3

Boxplots of time (in seconds) to complete each trial

Table 1 summarizes the mean time and standard deviation for each trial in the study. An average improvement in time was observed between the first and second RL trials (10.38%), but times became consistent between the second and third trials (0% change). A sharp learning process took place in VR, as seen in Fig. 3: larger improvements were observed between the first and second VR trials (23.84%), with some further improvement between the second and third (3.84%).

Table 1 Average and Std. Dev. time in seconds for each trial and average improvements

3.2 Task load comparison

Each participant rated the RL and VR tasks using the NASA Task Load Index immediately after completing the trials in each setting, with mean scores of 12.54 (SD = 12.80) and 22.21 (SD = 17.04) out of 100 points for RL and VR, respectively. NASA-TLX scores failed the Shapiro-Wilk test, so the non-parametric Wilcoxon test was selected for the analysis. Results showed a significant difference between conditions (T = 24.00, p = 0.001): participants, on average, reported an increase in task load when performing the same sorting task in the virtual environment.

3.2.1 Individual scores

Average scores for each dimension of the task load index worsened in the VR setting, with the exception of the pace of the task, suggesting that participants perceived they were, on average, working more slowly in VR. All individual variables failed the Shapiro-Wilk normality test; therefore, the Wilcoxon test was run for each question, with a Bonferroni correction for multiple testing setting the significance threshold at 0.008 (0.05/6). Table 2 summarizes the statistical results. Mental demand, physical demand, and frustration were statistically significantly different, all with higher means in the VR portion of the study. Overall, participants found the same task harder in the virtual environment, but on average they rated their performance very high in both settings (lower scores represent higher success).

Table 2 Individual Scores for NASA-TLX

3.3 System usability, realness, and cybersickness effects

Data on the usability of the system were collected using the System Usability Scale (SUS), on which scores above 68 are considered above average. Participants gave the system’s usability an average score of 78 (SD = 13.40). It was clarified to participants that the score should reflect using the VR system to complete the task. The majority of participants thought the system was easy to use and felt confident using it. Notably, although most participants were using the system for the very first time, most did not think they needed to learn many things before they could get going with the system.
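For readers unfamiliar with how a SUS score is derived, the standard scoring rule (Brooke 1996) can be sketched as follows. The responses below are hypothetical, not participant data.

```python
# Illustrative SUS scoring: ten Likert items (1-5), alternating positively
# and negatively worded, rescaled to a 0-100 score per Brooke (1996).
def sus_score(responses):
    """responses: list of 10 ints (items 1-10 in order), each 1-5."""
    total = 0
    for i, r in enumerate(responses):
        if i % 2 == 0:          # odd-numbered items (1,3,5,7,9): contribute r - 1
            total += r - 1
        else:                   # even-numbered items (2,4,6,8,10): contribute 5 - r
            total += 5 - r
    return total * 2.5          # scale the 0-40 sum to a 0-100 score

print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # best possible -> 100.0
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0
```

A group average, such as the 78 reported here, is simply the mean of these per-participant scores.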

VR Realness, measured using subscales of the IGroup Presence Questionnaire, had an average score of 4.95 (SD = 0.89) on a 0-6 scale, where higher scores represent a more realistic VR experience. Given the selected subscales, three referring to presence and two to experienced realism, participants felt present overall in the virtual environment while performing the tasks and thought the environment looked relatively real. Answers to the REAL2 subscale, which concerns the system’s consistency with its real-world counterpart, were mixed, with an average score of 4.20 (SD = 1.47) and a higher standard deviation than the combined results of all five subscales.

Scores on the Simulator Sickness Questionnaire were low, with a mean of 3.9 (SD = 5.62). Scores over 20 indicate “perceptible discomfort” (Kennedy et al. 1993), which was reported by only one participant in the group.

3.4 Exploratory data analysis

In an exploratory data analysis, K-Means clustering was used to find natural groupings among participants. Data were standardized prior to clustering to reduce bias from variables with larger scales. The elbow method showed a sharp angle at two clusters, and given the small sample size, this was the number of clusters selected for further analysis.
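The clustering step described above can be sketched with scikit-learn: standardize the participant-level variables, inspect inertia across candidate k values for the elbow, and fit K-Means with two clusters. The feature values below are synthetic placeholders loosely matched to the sample statistics reported in this paper, not the actual dataset.

```python
# Minimal K-Means sketch: z-score standardization, elbow inspection via
# inertia, then a final 2-cluster fit. Data are illustrative only.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Hypothetical participant features: age, device count, SUS, SSQ, MOCA
X = np.column_stack([
    rng.normal(72, 5, 20),      # age
    rng.normal(3.2, 0.9, 20),   # number of devices owned
    rng.normal(78, 13, 20),     # SUS score
    rng.normal(4, 5, 20),       # SSQ score
    rng.normal(12.8, 1.7, 20),  # MOCA score
])

X_std = StandardScaler().fit_transform(X)  # standardize each variable

# Elbow method: look for the k where inertia stops dropping sharply
inertias = {k: KMeans(n_clusters=k, n_init=10, random_state=0)
               .fit(X_std).inertia_ for k in range(1, 7)}
print(inertias)

# Final model with the two clusters suggested by the elbow
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_std)
print(labels)  # cluster assignment (0 or 1) for each participant
```

Cluster means and standard deviations, as in Table 3, can then be computed per label.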

Table 3 shows the means and standard deviations for participants allocated to each of the two clusters. The first cluster had participants with an average age of 69.54 years, an average of 4.09 devices, and high usability scores (87.04 on average). The second cluster had slightly older participants (75.88 years on average) who owned fewer devices, found the system less usable, and had higher SSQ scores. MOCA score was not a good differentiator between groups.

Table 3 Mean values of each cluster

3.5 Strategy changes

One key finding reported by some participants was a change in strategy when switching to the virtual environment: 60% of participants reported changing their strategies, mostly because of an initial difficulty in selecting the exact object they had planned to grab, which increased their decision time and made them reassess which container the selected color should go in.

3.6 Post-experiment feedback and other comments

Participants reported positive experiences with the study and demonstrated enthusiasm for the VR equipment: “fun” and “interesting” were descriptions provided by 18 participants. When asked what components of the study they disliked, common topics included the controller itself and how to use it. Four participants reported having a hard time holding the controllers and pressing the correct buttons, which might have distracted them from the task. The weight of the headset was also brought up by two participants, who reported neck discomfort even though the VR portion of the study lasted only about 15 minutes on average.

4 Discussion

In this study, the high completion rate and minimal errors indicate that older adults were able to interact effectively with the VR system. These findings are in line with previous research (Smith et al. 2005) suggesting that this demographic can engage with new technologies. VR indeed appears to be a feasible tool for older adults (Appel et al. 2020; Chau et al. 2021; Gerber et al. 2018; Zygouris et al. 2017); the findings therefore contradict the ageist assumption that older adults have difficulties with new technologies (Rosales and Fernández-Ardèvol 2019).

In this study, there was no statistically significant difference in the number of errors between VR and RL, but participants did have a harder time selecting objects in VR. This effect was observed as increased time in the VR trials, even after completing the task several times. Past studies comparing VR performance also found that participants took longer to complete a real-life task in VR (Elbert et al. 2018; Oren et al. 2012). The same was observed when comparing VR to a regular desktop display in a spatial ability test (Guzsvinecz et al. 2022), although fine motor movements were not the focus of those studies.

Cybersickness levels were low and consistent with past research (Dilanchian et al. 2021), but in the cluster analysis the older group actually reported higher levels of cybersickness than the younger group. This could be related to clinical measures or health conditions not captured in the study, such as smoking (Kim et al. 2021), which has been linked to lower levels of cybersickness.

The ergonomics of the device can be improved with lighter headsets and more intuitive controllers or hand-tracking systems. Haptic gloves have been developed to enhance the user experience, but options have been limited (Perret and Vander Poorten 2018), with few commercially available, and their use for tasks such as Object Location Spatial Memory (OLSM) was not significantly different from regular VR controllers (Forgiarini et al. 2023). Equipment used in similar studies will have to be light and compact, with sensors precise enough to replicate tactile stimulation like that experienced in real life (Van Wegen et al. 2023). Also, depending on the task design, reducing the number of available controller buttons could help participants: the controllers of the chosen device included multiple buttons that were unnecessary for this study and therefore could have increased the difficulty of executing the task.

Past research has also evaluated the use of VR technology to promote well-being in older adults experiencing mild cognitive impairment or related dementias, with results showing that virtual experiences were well accepted and improved participants’ mood and apathy (D’Cunha et al. 2019). Learning rates and task feasibility in those populations should be further investigated, especially if the goal is to measure cognitive decline using VR tests, where learning to use the VR technology itself could be a confounding factor. The VR market has been growing, and future studies should evaluate whether learning rates differ between VR users and non-users. This will also help determine how much training one needs to use a VR system effectively.

Other skills related to daily activities should also be tested in VR to evaluate feasibility and validity by comparing real and virtual environments. Training of IADLs using non-immersive systems has already improved neuropsychological measures in older adults (Gamito et al. 2019), so further research should test daily abilities while incorporating the currently available immersive systems.

Furthermore, the ability to standardize IADL assessments in a VR setting presents a significant advantage over traditional office-based evaluations. While certain IADLs, such as the sorting task investigated in this study, can be readily tested in an office environment, the transition to VR can offer a more controlled and uniform testing framework (Bohil et al. 2011). This standardized approach not only minimizes the potential for rater bias but also reduces the need for extensive training and resources typically required for conducting these assessments. By ensuring a consistent and bias-free environment, VR has the potential to provide more reliable and objective measures of daily living skills, critical for accurately evaluating the capabilities of older adults.

4.1 Study limitations

Although the sample was similar in size to other VR studies with older adults (Chen and Or 2017; Dilanchian et al. 2021; Mason et al. 2019; Park et al. 2022; Parra and Kaplan 2019), larger samples would provide stronger statistical power and insights. The sample also had high educational levels, which might yield different results compared with other subgroups of older adults (Brazil and Rys 2022). Even within a group relatively homogeneous in education and technology usage, high variability in task completion time was observed in both settings.

To properly model the learning rate, more trials would be necessary, as learning-rate models normally work on a logarithmic scale. In this study, object picking was not analyzed independently but as part of the task as a whole (sorting all the objects in the bowl). Each participant spent about 15 min on the VR tasks, a common session length for VR studies with older adults (D’Cunha et al. 2019; Jones et al. 2016). A shorter task would allow for more trials and therefore be better suited to mathematically modeling a learning curve.

The time difference between RL and VR in this study likely stems from two sources. First, participants may have aimed for a specific colored object but grabbed a different one, forcing them to reassess which container it should go in. Second, participants may have made multiple attempts to grab the specific object they were aiming for. Participants were asked to maintain the same pace across all trials; therefore, it was assumed that the learning effect observed in the VR trials was not due to learning the task itself, but to participants getting better at picking up the desired objects. Future work should isolate this component by allowing only one sorting strategy.

While this research used a VR system with controller-based interactions, it should be acknowledged that new systems with hand-tracking technologies are emerging. Despite these advancements, this research provides critical insights into older adults’ interaction with VR technology used to replicate real-life tasks, offering understanding pertinent to this evolving technology. Future research should extend this work by comparing the efficacy and accessibility of controller-based systems and hand-tracking interfaces, particularly with older adult populations, to determine the best readily available VR technology. Additionally, any potential latency issues with newer systems should be rigorously evaluated to ensure their suitability for older adult users.

5 Conclusion

In this study, the validity of virtual reality for a fine motor task was assessed by directly comparing older adults’ performance in a sorting task in real life and then in VR. The effect of learning how to use the VR system was objectively assessed and was observed even after participants received instructions and went through a demonstration scene. The task’s feasibility in VR was confirmed, as all participants completed it effectively with a small number of mistakes, demonstrating that older adults can learn to use the system even for fine motor tasks. Consequently, VR emerges as a potential tool for assessing older adults on instrumental activities of daily living (IADLs), aiding in the identification of functional and cognitive decline.

It is essential to incorporate the VR effect into performance analyses for older adults, especially when the measure of performance is time to complete a task, as all participants improved their VR times by the third VR trial. Learning rates displayed variability, though overall the task took longer to complete in VR than in real life. This aspect should also be factored in when designing VR tests for this demographic.