1 Introduction

Prospective memory (PM) is a set of cognitive functions responsible for our ability to remember to carry out an intended action at a specific time in the future [1]. It is now recognized that proper functioning of prospective memory is essential in everyday life because it gives people maximum independence in carrying out normal activities. By providing a certain degree of personal safety, prospective memory also has a direct effect on people’s quality of life and on their social participation. Activities governed by prospective memory include remembering to take medication, arriving at an appointment on time and turning off the stove after using it [2]. PM is therefore a major set of cognitive functions for independent living. Three types of remembering, based on the nature of the prospective cue, have been described: Event-based, Activity-based and Time-based [3]. Event-based PM tasks are facilitated when a cue is present in the environment. The best example is a person who wishes to buy orange juice on the way home: here prospective remembering “pops up” with the appearance of a grocery store. Activity-based prospective memory consists of carrying out a specific action (for example, taking a medication) in association with another action (for example, when the person is leaving for work). Finally, Time-based prospective memory tasks consist of remembering an intention at a specific time (e.g. 10 o’clock) or after a specific delay (e.g. after 20 minutes). Good examples of Time-based prospective memory tasks are meeting a colleague at a specific time and day, or removing a cake from the oven after 30 min.

Assessing PM is essential to understanding patients’ difficulties in everyday life. For instance, the aging population frequently presents PM problems as evaluated by laboratory tasks [4], but these problems seem difficult to observe in everyday life. This is particularly the case when patients perform natural tasks that are (a) carried out in a familiar context, (b) embedded in everyday life and (c) executed over several days. This contrast in results could be explained by the anxiety generated by a laboratory setting, or by the fact that the elderly can use compensatory strategies in everyday life such as a calendar, Post-it notes, etc. Here, virtual reality (VR) could be a good means of assessing prospective memory in a verisimilar, ecological manner with standardized procedures. VR is a computerized simulation of the real world. It makes it possible to offer virtual environments (VE) designed specifically to assess cognitive function in everyday life, without unwanted distractors and without the possibility for patients to use their familiar compensatory strategies.

Authors have proposed an interesting model of PM, the Multiprocess Model [5]. This model argues that there are two main modes of remembering: automatic and strategic. More specifically, when people carry out a prospective task under automatic processes, contact with the cue in the environment is sufficient to remember the intended action. This kind of remembering depends more on environmental cues, thus soliciting memory and attention more than executive functions. Strategic remembering requires greater cognitive resources because it involves strategic and voluntary monitoring between the ongoing and prospective tasks [5]. This process might be supported by the Supervisory Attentional System (SAS), an attentional-executive process [6]. The SAS facilitates the encoding of an association between an external event and the intended action. Afterwards, the SAS monitors the environment and searches for the target cue that indicates the right time to perform the action [1, 7]. When the target is identified, the SAS interrupts the ongoing task, turns attention toward the prospective task and lets the person carry out the intended action. In summary, the role of the SAS in the Multiprocess Model is to support the realization of the action plan by activating the intended scripts and inhibiting the non-pertinent ones; this process could also be involved when the individual performs in attention mode. The SAS implies that attentional resources are limited in “energy”: when attention is focused on one element of a task, it is difficult to pay attention to the other parts of the task. In VR, we could therefore expect that when a participant concentrates on navigating the virtual environment (VE) with an external device (e.g. a joystick or a mouse), his/her cognitive readiness for the task is reduced. That could bias the measurements obtained in the VE.

Because PM is a cognitive function weakened in pathological aging such as mild cognitive impairment [8], it is very important to consider the best way to assess it effectively. Several authors have argued that PM must be systematically assessed by neuropsychologists when a diagnosis of dementia is suspected [9]. Unfortunately, very few clinically useful assessment tools for PM exist, and some of them lack reliability. VR could be a better alternative for assessing PM, but with these tools it is essential to understand which part of the cognitive load is solicited by human-computer interaction (HCI).

1.1 Assessing Prospective Memory

The ecological validity of measures of cognitive function is contingent on the functional and predictive relationship between an individual’s performance on a set of neuropsychological tests and their real-life behaviour at home, at work, at school or in the community [11]. In other words, a test is ecologically valid if it is able to predict an individual’s level of function or detect potential problems in his/her daily life. The major challenge for a new paradigm in neuropsychological assessment is to develop tools with good face and content validity (i.e. tasks should more closely resemble those of daily life) and with predictive validity (i.e. test results must correlate with the individual’s real-life functioning). From a psychometric perspective, ecological validity is represented by the link between the results observed in a test and the capacities of the subject as seen in everyday life [12]. Two dimensions of ecological tests have been reported [13], and they are useful for understanding cognitive assessment from an ecological perspective: a test can be constructed or reviewed from a “veridicality” or a “verisimilitude” perspective.

Veridicality concerns existing traditional tests and the way in which they are linked empirically with everyday life [12]. Such tests were not originally created to simulate daily-living activities. They can nevertheless be predictive of how cognitive functions will help or hinder the realization of an activity. The Trail Making Test (TMT), a test of attention and executive functions, is a good example of this kind of assessment. The theoretical construct of the TMT is not ecological, but it can be useful in terms of prediction. Here, it is important to remember that a test with good diagnostic capacities is not necessarily ecologically valid. Moreover, the authors of [13] found some contradictions in the correlations obtained between traditional tests (e.g. WCST, TMT, Controlled Oral Word Association Test) and everyday performance in naturalistic tasks that involve executive functions. In brief, veridicality is an approach to ecological assessment which holds that traditional neuropsychological tests can be used to predict everyday performance; given the current literature, further research in this field is needed.

Verisimilitude represents the capacity of a test to have, from a theoretical perspective, the same cognitive demands as those found in everyday life [12]. This approach requires reinventing neuropsychological assessment in order to create new assessment protocols that are closer to the person’s reality [24]. The expectation is to obtain a test with better content and criterion validity than a traditional one. Such efforts to create everyday-like tests sharpen the distinction between the verisimilitude and the veridicality of neuropsychological assessment.

A very good verisimilar neuropsychological test is the Multiple Errands Test (MET), which was initially developed by [14]. In this test, participants are asked to carry out tasks in a real environment. For example, the subject has to purchase some items while spending as little money as possible, avoid buying items that were not requested, and respect some arbitrary rules (e.g. “Don’t talk to the evaluator during the task;” “Don’t exit a definite perimeter,” etc.). Although the MET has been shown to detect subtle problems of day-to-day living, it lacks reliability from a psychometric point of view [15]. Because the test is naturalistic, results obtained in real-life settings are difficult to reproduce from individual to individual and from situation to situation [16]. Some authors have nevertheless argued that these psychometric limitations are offset by good inter-rater reliability [16, 17]. The main advantages of the MET are its sensitivity to specific neuropsychological deficits and its ecological validity [15].

From a neuropsychological perspective, the main challenge in developing new assessment tools is the standardization of assessment procedures that contain naturalistic and plausible tasks drawn from day-to-day life. The development of ecological neuropsychological tests leads to a review of test structure [18]. However, if a task involves real activity and interaction with daily-living activities, it may lack reliability under repeated measurement. Therefore, some authors have tried to create new assessments with an ecological construct that can be performed in the neuropsychologist’s office.

Several situations in everyday life demand good prospective memory capacities [19]. It nevertheless seems difficult to obtain ecological and “verisimilar” measures of PM. To reproduce day-to-day living in a more realistic way and to ensure standardized, reliable, and valid measures, several researchers have explored the potential of VR technology. In VR, the user navigates in a computer-simulated environment and interacts with objects in real time, ‘like in real life’ [20]. This technology offers researchers the best of both worlds [18]: it allows them to observe real-life situations in a laboratory setting [21]. Several advantages make it attractive to those in the field of neuropsychological assessment. Firstly, it allows access to the construct of prospective memory in a systematic, rigorous, and standardized manner [21, 22, 23] while at the same time providing a degree of ecological realism [24]. Also, because the effects of extraneous variables commonly encountered in real life can be controlled, VR researchers can assess the effects of standardized ‘unexpected situations’ in these environments [21].

VR is therefore considered an ecologically valid tool to assess the cognitive functions involved in everyday life. Studies in our group have shown the capacity of VR to detect cognitive dysfunction after traumatic brain injury or in dementia [25, 26]. We have also demonstrated significant correlations between performance on virtual tasks and neuropsychological measures [27, 28]. However, our work has also demonstrated that HCI can generate a cognitive overload which affects cognitive functioning and creates a measurement bias. In accordance with our findings, some authors have demonstrated that a VE can generate cognitive overload, mainly for people who are not familiar with video games, which is the case for the elderly [29]. Others have also found a significant cognitive overload in students performing tasks in a VE [30]. These students reported having difficulty directing their attention and keeping several sources of information in mind when exploring the virtual world. Therefore, several cognitive processes would be active in the foreground and in the background when a VE is used to evaluate the cognitive skills of an individual [31], particularly through the mobilization of cognitive resources by HCI.

1.2 Objectives

To investigate the effect of HCI on cognition during a neuropsychological assessment with VR, we conducted three complementary pilot studies. All these studies involved participants without brain lesions or psychiatric history. The purpose of these studies was to assess the normal functioning of adults performing tasks in a VE, with a particular focus on older participants. All these studies were conducted with the Virtual Multitasking Test (VMT) [32].

2 Method

2.1 Pilot Study 1

The aim of the first pilot study was to assess the effect of a “heavy-use” technology (i.e. one that disturbs navigation and interactions with the VE) on task completion during immersion in the VMT.

Participants. Five healthy adult participants (2 women and 3 men) were recruited from the general population. The mean age was 34.5 years (SD: 12.07) and the mean number of years of education was 15 (SD: 1.55).

Virtual environment.

The Virtual Multitasking Test (VMT) was originally developed to assess PM and executive functions using a multitasking paradigm [33]. Different scenarios are implemented in a 6½-room virtual apartment, each room including at least one task except the bathroom. At the beginning of the test, participants are told that they are visiting their best friend. During the day, he is at work and they must stay in his apartment. In the evening, they will go to a show with him. During the day, however, they must perform several tasks alone that are based on daily life. For instance, they must put away the groceries left on the counter as quickly as possible (even though they are told there is no time limit to complete the activities), answer the phone, and perform other tasks such as faxing a document, searching for the show tickets, drying a shirt and feeding a fish. The PM tasks require, among other things, closing a door right after exiting the master bedroom to prevent a dog from climbing on the bed. Unforeseen events occur during the execution of the tasks, for instance a storm that knocks over objects in the guest room and lets water seep into the dining room. Every time a person is exposed to the VMT, they start with a training phase in the environment. Afterwards, the experimental phase begins and the person has to carry out the tasks set by the scenarios planned by the researchers.


A head-mounted display (eMagin z800), which tracked head movements and allowed immersion, was used in this experimental setting. Participants wore a 6DOF sensor on the dominant hand (Ascension’s Flock of Birds) to manipulate objects in the VE. In their other hand they held a mouse that controlled their movements, and they had to use the right, left and centre buttons to perform certain actions in the VE. Participants performed the experiment standing, for consistency and to preserve the realism of the proposed scenario. Blood pressure and heart rate were also measured every 5 min with an Ambulatory Blood Pressure Monitor (ABPM) (Welch Allyn) to evaluate biological variations, i.e. probable indicators of stress or of workload during task completion.


In this first experimental setting, participants felt slightly stressed or anxious about the upcoming experiment (2.33 ± 2.7 on a 10-point scale), and blood pressure was within normal limits at the start of the experimental protocol. After immersion, they reported that interacting with the VMT was difficult and frustrating (8.40 ± 2.07 on a 10-point scale). The analysis of individual performance showed that no participant managed to complete the task efficiently and that immersion time varied greatly (total immersion time: 19.45 ± 17.55 min), indicating an increase in workload during immersion. Moreover, many mistakes were made and tasks were performed inefficiently. Regarding the biological measures, three out of five participants (60%) showed higher blood pressure during immersion than at the initial measurement, reaching the prehypertension threshold (120/80 mmHg). Heart rate was also affected, rising above 80 beats per minute for two participants (40%), whereas the normal resting threshold was 70 beats per minute. However, despite these observations and the changes in participants’ internal state, the average sense of presence remained satisfactory (7.67 ± 0.816/10).

2.2 Pilot Study 2

The aim of this study was to explore the effects of the mental workload generated by the second version of the VMT.


Thirteen healthy adults were recruited from the general population to participate in this study. Six were aged between 18 and 45 years (mean: 30; SD: 8.83) and seven were aged over 65 years (mean: 67.25; SD: 2.87). The experiment took place on two non-consecutive half days. The first was devoted to a neuropsychological assessment of executive functions, which involved “traditional” tests such as the Delis-Kaplan Executive Function System (D-KEFS) and the California Verbal Learning Test (CVLT) and “ecological” tests such as the Behavioural Assessment of the Dysexecutive Syndrome (BADS) and the Rivermead Behavioural Memory Test (RBMT). The second was devoted to performing several tasks in the VMT.

Virtual environment.

Based on the results obtained in the first pilot study, the VMT was migrated from Virtools to Unity 3D for this second study in order to simplify HCI and to make the task in the VE more fluid. Several modifications were made to the original scenario to increase its complexity and to make the assessment more valid from a theoretical point of view.


The experimental procedure with the new VMT-2, which was used in a non-immersive mode, was divided into two phases (learning and test). The equipment required for the study was a computer with a multimedia projector, a keyboard for movement and a mouse for grasping objects. Eight participants completed the entire experimental protocol. Attrition was mainly explained by the length of the experiment (total duration, including the two phases: 6 h over two days).


At the end of the experiment, the participants did not report cybersickness (sum = 4.43 ± 3.78/64). The feeling of presence was in the average range (mean = 2.47 ± 0.45/4). On the basis of the NASA-TLX questionnaire, the VMT-2 appeared to induce a relatively modest cognitive load (i.e. 46% of estimated cognitive load). However, considering the standard deviation obtained, some people seemed more likely to judge the environment as rather demanding. This seemed to be mainly the case for those aged over 65 years.
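As an illustration of how an overall NASA-TLX percentage such as the one reported above can be obtained, the following minimal sketch computes an unweighted ("raw") TLX score as the mean of the six standard subscale ratings on a 0-100 scale. The text does not state whether the raw or the weighted variant was used, so the raw variant, the function name `raw_tlx` and the example ratings are assumptions made for illustration and are not study data.

```python
# Hypothetical sketch: unweighted ("raw") NASA-TLX score.
# Assumption: each subscale is rated on a 0-100 scale; the source does not
# say whether the raw or the pairwise-weighted variant was used.

# The six standard NASA-TLX subscales.
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the mean of the six subscale ratings (0-100)."""
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    for name in SUBSCALES:
        if not 0 <= ratings[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(ratings[name] for name in SUBSCALES) / len(SUBSCALES)


# Illustrative ratings for one imaginary participant (not study data).
example_ratings = {
    "mental_demand": 60,
    "physical_demand": 20,
    "temporal_demand": 50,
    "performance": 40,
    "effort": 55,
    "frustration": 45,
}
example_score = raw_tlx(example_ratings)
```

Averaging such scores per age group, together with their standard deviations, would give the kind of group-level summary discussed in this section.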

2.3 Pilot Study 3

This third and final pilot study aimed to explore, with a simplified HCI, whether there was an age-related difference between young and elderly adults (a) when the tasks requested in a VE were more complex and (b) in the user experience (UX) of the interactive system, depending on the VE used. In sum, the experimental design consisted of an age-related comparison of the cognitive load caused by two different VEs.


Nine young adults (18–45 years; mean = 30.44 ± 4.98 years) and 8 elderly participants (55 years and up; mean = 68.38 ± 9.13 years) were recruited from the general population. To be eligible for this study, participants must not have had a neurological or psychiatric history; for the elderly, the Montreal Cognitive Assessment (MoCA) should be fewer than 26. The two groups did not differ significantly in education [F(1,15) = 3.07; p = 0.1]. When asked to rate their level of comfort with computers as a percentage (/100%), young adults (80% ± 17.79) expressed more comfort than elderly adults (55.63% ± 36.79) [F(1,15) = 3.36; p = 0.09].
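For readers who wish to relate an F statistic with one numerator degree of freedom, such as those reported above, to its p-value, the following sketch may help. It relies on the identity F(1, df) = t(df)², so the p-value equals the two-sided tail probability of a Student t distribution, which is integrated here numerically with Simpson's rule using only the Python standard library. The function names are hypothetical, and this is an illustrative check rather than the analysis procedure used in the study.

```python
import math


def t_pdf(t, df):
    """Probability density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def p_value_f_1_df(f_stat, df_error, steps=2000):
    """Approximate p-value for an F(1, df_error) statistic.

    Uses F(1, df) = t(df)^2 and integrates the t density from 0 to sqrt(F)
    with Simpson's rule (steps must be even). Only valid when the numerator
    degrees of freedom equal 1, i.e. a two-group comparison.
    """
    if f_stat < 0 or df_error <= 0:
        raise ValueError("f_stat must be >= 0 and df_error must be > 0")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    upper = math.sqrt(f_stat)
    h = upper / steps
    total = t_pdf(0.0, df_error) + t_pdf(upper, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    area = total * h / 3  # P(0 < T < sqrt(F))
    return 1 - 2 * area   # two-sided tail probability


# The two F values reported in the paragraph above.
p_education = p_value_f_1_df(3.07, 15)
p_comfort = p_value_f_1_df(3.36, 15)
```

Under these assumptions, the two computed values fall close to the reported p = 0.1 and p = 0.09, respectively.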


Participants had to perform scripted tasks in the VMT-2 as described previously, an environment referred to as “complex”. They also had to make coffee with milk and two sugars in a “simple” environment. For this, we used the Non-Immersive Coffee Task from [27]. In this VE, participants were placed in a kitchen in which they did not have to move around. To complete the task, objects on the table had to be manipulated with a computer mouse in a logical order. The two tasks were carried out consecutively in a counterbalanced order. Following the completion of each task, participants were asked to complete two questionnaires. The first was the NASA-TLX, which has the advantage of allowing easy comparison between tasks while assessing how mental load fluctuates as the environment changes. The second questionnaire was the AttrakDiff 2 [34]. It is a subjective measure of the ease of use of a technology that characterizes the experience of contact with VR. Three subscales are included: (a) the pragmatic qualities of the VR (usability, utility); (b) the hedonic qualities, characterized by pleasure and satisfaction in contact with the VR; (c) the attractiveness or overall quality of the interaction with the VR.
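The counterbalanced order mentioned above can be sketched as follows. The text does not describe how the order was assigned, so the simple alternation used here, the function name `counterbalance` and the participant identifiers are assumptions made purely for illustration.

```python
from itertools import cycle

# The two virtual environments compared in this pilot study.
TASKS = ("VMT-2", "Non-Immersive Coffee Task")


def counterbalance(participant_ids):
    """Alternate the two possible task orders across participants.

    Assumption: a plain alternation (AB, BA, AB, ...); the source does not
    specify the assignment scheme actually used.
    """
    orders = cycle([TASKS, TASKS[::-1]])
    return {pid: order for pid, order in zip(participant_ids, orders)}


# Hypothetical participant identifiers (not study data).
assignment = counterbalance(["P01", "P02", "P03", "P04"])
```

Applying the alternation separately within each age group would keep the two orders balanced across young and elderly participants.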


The results showed a significant age-related difference in the cognitive load imposed by the VE. The VMT-2 generated a greater cognitive load in the elderly than in the young adults [F(1,12) = 8.30; p = 0.017]. The cognitive load generated by the Non-Immersive Coffee Task was similar in both groups [F(1,12) = 3.40; p = 0.617]. A subsequent analysis of the subscales revealed that the main age-related differences in the VMT-2 were in temporal load, effort and performance. For the Non-Immersive Coffee Task, the groups differed on the physical demand. Moreover, no significant difference in UX was found between groups or between environments. This is encouraging because it suggests that the pragmatic and hedonic aspects of our environments do not raise the cognitive workload. Indeed, all participants described their experience as pleasant and were satisfied with the two VEs to which they were exposed. Finally, despite the small sample, preliminary results showed that the elderly spent more time in the VMT, were less efficient in terms of movements, were more disoriented and needed more time in the training phase than the younger adults. In summary, in this pilot study, both environments were considered equal in terms of UX, and the means of interaction were similar and simplified. The study suggests that the accumulation of tasks to perform in a VE and their associated complexity pose a greater challenge for older adults in terms of the cognitive resources required.

3 Discussion

This paper aims to show the importance of the impact of technology on cognition in the elderly. We started from the assumption that VR is a good, reliable and valid means of assessing cognitive functions. Likewise, we consider PM to be a cognitive function that is particularly useful for detecting pathological decline in aging. However, the claim that VR is ecologically valid may be wrong, or difficult to support, if the cognitive load of HCI is not considered. To begin this reflection, we conducted three very small pilot studies with different experimental designs.

After the first pilot study, it was possible to see that the level of stress increased during immersion in most participants because of navigation and manipulation difficulties in the VE. The limits of this pre-experimental study, in addition to the small number of participants, lie in the type of measurement selected to observe cognitive overload (i.e. time to complete the task, blood pressure and heart rate). Other measures would have better reflected the cognitive demands of the experimental protocol and would have explained, at least partially, why the participants did not complete the task properly.

At the end of the second pilot study, using a smaller sample that was comparable on multiple variables (i.e. cybersickness, sense of presence, anxiety, executive functions), we observed that, even with a simplified HCI, the more tasks a person must complete in each space, the higher the cognitive load may rise on several dimensions. Finally, the results showed an age effect on HCI and on the completion of the VMT-2 tasks.

Concerning the third pilot study, the results showed that the elderly spent more time in the VMT, were less efficient in terms of movements, were more disoriented and needed more time in the training phase than the younger adults. Both environments were considered equal in terms of UX, and the means of interaction were similar and simplified. The study suggests that the accumulation of tasks to perform in a VE and their associated complexity pose a greater challenge for older adults in terms of the cognitive resources required.

To conclude, despite small sample sizes, these three pilot studies suggest that similar factors increased the solicitation of cognitive resources during task performance. These factors included the means of interaction, the nature of the required tasks, and the characteristics of the participants and of the VE. These factors are important when reinventing neuropsychological assessment to make it more ecological by using VR. The first study suggested that inefficient and complex interfaces could interfere significantly with the neuropsychological assessment, thus affecting the validity of the measure. The second study illustrated that not all tasks are equivalent in a VE: those that required more interaction with the control devices were the most effortful and consequently the most frustrating. Lastly, the third study suggested that a larger virtual apartment, in which there is more movement, also influenced the cognitive load, probably through the solicitation of navigation, orientation and memory processes. These components must always be considered together with the neuropsychological measures taken in the VE. Finally, elderly adults appeared to be less skilled with technology. They were consequently at greater risk of performing less efficiently in VR than in daily-life actions, since the measure is taken by a computer. This negatively affects the sensitivity of the tool for the detection of mild cognitive impairment or for the early identification of neurocognitive disorders such as dementia. Although these age-related effects may fade as the current young population ages, since its members are more familiar with computers and gaming interfaces, it is still essential to consider the current age effects when using VR to evaluate a normally aging population or a clinical population.