1 Introduction

In recent years, many criticisms have been raised regarding the ecological validity of the tools used in the neuropsychological assessment of cognitive functions. The construct “ecological validity” refers to the functional and predictive relationship between patients’ performance on neuropsychological tests and real-life cognitive functioning (Sbordone et al. 1996). Neuropsychological assessment is usually provided through psychometric tests in a paper-and-pencil modality: these tests require patients to perform different behavioral and cognitive tasks in a controlled setting, resulting in the evaluation of abstract concepts without a direct link to everyday behavior (Parsons 2015; Parsons et al. 2017; Serino and Repetto 2018). Hence, there is an ongoing debate about the effectiveness of psychometric tests in assessing life-like abilities in their natural environment (Chaytor and Schmitter-Edgecombe 2003). In fact, neuropsychological tests are not always sensitive, as patients may score in the normal range in the clinical setting but show difficulties in daily situations, or vice versa (Mondini et al. 2016). Parsons (2015) argued that this discrepancy can be traced back to the different methodologies used for developing the assessment procedures. Indeed, the most widely used neuropsychological tests were conceived following a construct-led approach, aiming to measure abstract cognitive constructs (e.g., working memory) without an explicit interest in predicting real-life functional abilities. In contrast, function-led approaches provide an alternative framework focused on building neuropsychological models derived from the observation of everyday behaviors, allowing the development of more ecological assessment procedures (e.g., the Rivermead Behavioural Memory Test by Wilson et al. 1989).
Thus, from a construct-driven approach, research has recently been moving to a function-led approach, recognizing the importance of conducting the evaluation in an ecological way by observing the cognitive and behavioral functioning in a real-life context (Parsons et al. 2017; Serino and Repetto 2018).

In this regard, virtual reality (VR) technology has been used to develop function-led tools for neuropsychological assessment, thanks to its capability of simulating realistic environments which can be used as complex stimuli for investigating individuals’ cognitive abilities in a more ecological setting (Pedroli et al. 2015; Serino et al. 2015; Riva et al. 2019). The simulation is based on experiencing 3D computer-generated scenarios through a standard desktop interface (non-immersive VR) or in a first-person view through a head-mounted display (HMD) (immersive VR). With specific software (e.g., Unity 3D©), it is possible to develop virtual environments (VEs) which resemble daily life settings and activities (e.g., grocery shopping) with a high level of plausibility (Parsons 2015; Rizzo and Koenig 2017). The twenty-first century has been characterized by the widespread use of this technology in both assessment and rehabilitation of neuropsychological deficits with promising results (Larson et al. 2014; Howard 2017; Pedroli et al. 2018; Moreno et al. 2019). In fact, VR allows researchers and clinicians to deliver highly ecological stimuli and collect measures which are very close to those observed in naturalistic settings, obtaining better prognostic indexes of real-life functioning in a safe and controlled situation. This is particularly true for immersive VR, which can dramatically increase the ecological value of the stimuli delivery thanks to the level of immersion provided and the elicited sense of presence. Specifically, immersivity relies on the use of specific technologies capable of providing multisensory feedback, synchronizing virtual body actions to real movements and isolating users from the physical environment (Slater and Wilbur 1997).
The sense of presence, instead, is the illusion of “being there” and is accompanied by the feeling of being involved, absorbed and captured by the virtual experience that could elicit strong emotional responses (Heeter et al. 1995; Chirico et al. 2017).

In this way, it is possible to build more complex tasks, evaluating both global functioning and specific functions (Keefe et al. 2016). Several authors have exploited VR to develop technological tools to conduct a neuropsychological assessment (Parsons 2015; Negu et al. 2016; Rizzo and Koenig 2017). Matheis et al. (2007) pioneered the use of VR for memory evaluation by developing a 3D virtual office in which patients had to learn and then recall some objects. Results showed that this protocol successfully discriminated between experimental and control groups, also showing a positive correlation between scores in the virtual test and in the standard one. A more immersive tool for the neuropsychological evaluation of memory has been developed by Ouellet et al. (2018). In this test, subjects could navigate in a virtual supermarket in which they had to memorize a grocery list and look for those products. The authors concluded that this task is a valid and flexible tool to measure memory functioning in an ecological life-like context.

One further possibility for neuropsychological assessment is offered by a recently emerged kind of VE: 360° immersive photos and videos (i.e., 360°-VR). With 360° cameras it is possible to record a circular fisheye view of the surroundings, which can later be experienced as an immersive VE using an HMD. This approach has been implemented in neuropsychological assessment only in recent studies, showing promising results concerning the evaluation of executive functions (Serino et al. 2017; Realdon et al. 2019). Compared to VR tests based on model-based VEs (i.e., computer-generated environments), the implementation of 360° immersive photos and videos does not require high-level technical skills, and the equipment needed to record and then visualize 360° materials is also more affordable than a standard VR set-up. Moreover, being captures of real-world scenarios, 360° environments also provide a higher visual realism, which can further increase the engagement of the participants. Notably, the advantages of 360°-VR are limited by the fact that this technology can currently be experienced only through a 3-degrees-of-freedom (3-DOF) exploration, which tracks only the orientation (rotation) of the HMD in space. In other words, users can look at any portion of the VE but cannot move closer to the objects or interact explicitly with them (e.g., move them to another position). This limit in the interaction can reduce the overall ecological value provided by 360°-VR to visual fidelity alone, and must be taken into account when developing such experiences. Thus, this kind of VE appears more suitable for simulating situations which do not require a direct interaction with the scenario (e.g., a recognition task).

The simplicity which characterizes 360°-VR in terms of environment design (i.e., limited interaction), hardware (e.g., 3-DOF HMDs) and development makes it suitable also for the neuropsychological assessment of patients with mild to severe impairments (Serino et al. 2017; Realdon et al. 2019), who are likely to show difficulties when interacting with more sophisticated VR settings. Hence, 360°-VR can constitute a viable option to consider when thinking about VR-based test implementation, especially considering the cost–benefit value: for clinicians, this can mean having access to more sophisticated yet user-friendly and ecological assessment tools to support already existing techniques.

Regarding memory assessment, with immersive 360°-VR scenarios people can perceive photorealistic environments from a first-person perspective: this characteristic is very important for the assessment of several aspects of memory and can improve the precision of the procedure (Matheis et al. 2007; Serino et al. 2017). Moreover, the photorealistic nature which characterizes 360° VEs can further enhance their ecological value: in fact, Robertson et al. (2016) found that 360° videos resembling realistic natural environments provide a visual experience which is very close to the natural visual exploration of real-life scenarios. These results are also consistent with the literature finding that the degree of immersivity and realism affects memory encoding processes (Sutcliffe et al. 2005; Slobounov et al. 2015; Makowski et al. 2017; Serino and Repetto 2018). In this regard, what makes 360°-VR a suitable tool to study memory functioning is (a) the possibility to adopt an immersive egocentric perspective when experiencing the 360° environments and (b) the high visual fidelity provided, linked to a better visual memory encoding (Robertson et al. 2016; Serino and Repetto 2018; Ventura et al. 2019). Given their capability to elicit visual exploration mechanisms similar to those adopted in real environments, these new VEs appear to be promising for developing more ecological tools for the assessment of memory processes.

Thus, in the current pilot study, we wanted to evaluate the feasibility of a memory assessment protocol based on 360° VEs. During the development of our experimental protocol, we took inspiration from a specific task included in a traditional test, the Rivermead Behavioural Memory Test – III (RBMT-III) (Wilson et al. 2008). This test was built following a function-led approach, including several “real life” tasks (e.g., recalling details of a story or remembering names and faces) which provide a valid and ecological testing of different aspects of everyday memory (Smith et al. 2000; Efklides et al. 2002). Therefore, we considered the RBMT-III as an inspiration and a benchmark for the development of our 360° tool.

Secondly, we aimed to investigate the differences in the participants’ performance between our 360° tool and the RBMT-III, and to calculate the correlations between the measures obtained with these two tests.

Finally, we performed a user experience (UX) evaluation to study the participants’ interaction with the technology used.

2 Materials and methods

In the current study, we explored the potential use of 360° video technology for memory assessment through a preliminary evaluation of a 360° adaptation of the Picture Recognition sub-test included in the RBMT-III. We named this adaptation ObReco-360° (Object Recognition-360°).

2.1 Sample

The participants of the present study were enrolled among the outpatients coming from the Department of Medical Rehabilitation of Istituto Auxologico Italiano in Milan. The resulting sample of twenty-four people included nine females and fifteen males, with a mean age of 70.4 (SD = 8.5) and a mean of 9.7 (SD = 3.7) years of education. The exclusion criteria for the enrollment included the presence of severe internist, psychiatric and neurological impairments. Regarding the cognitive status, only the participants who obtained a corrected score above 18 points in the Mini Mental State Examination (MMSE) (Folstein et al. 1975) Italian Version (Measso et al. 1993) were considered for the recruitment.

The study was conducted in compliance with the Helsinki Declaration of 1975 (as revised in 2008) and received ethical approval from the Ethical Committee of the Istituto Auxologico Italiano. The demographic and neuropsychological data of the final sample are shown in Table 1.

Table 1 The table shows the main demographic and neuropsychological data which characterize the sample of participants. The demographic indexes include age and years of education

2.2 Procedure

This pilot study involved randomized within-subject data collection. For this reason, participants were examined twice within a week to avoid learning effects and interference between materials. Two different assessment protocols were administered: the standard one included only classic paper-and-pencil tests, while the 360° one also included the administration of the ObReco-360° and two user-experience (UX) rating scales. During the 360° session all the participants sat on a swivel chair, in order to freely explore the virtual environments using an Oculus Go© HMD.

2.3 Neuropsychological assessment

The neuropsychological tests administered were the MMSE, the Frontal Assessment Battery (FAB) (Dubois et al. 2000) Italian Version (Appollonio et al. 2005), the Picture Recognition sub-test included in the RBMT-III Italian Version (Beschin et al. 2013) and the Babcock Story Recall Test (BSRT) Italian Version (Spinnler and Tognoni 1987).

The MMSE is a brief screening test including thirty simple tasks (e.g., repeating and remembering words, copying a figure) oriented to a first-level assessment of different cognitive functions.

The FAB is a rapid test for the screening of executive functions which includes six tasks linked to frontal lobes activity, such as conceptualization (e.g., find common characteristics between objects) and inhibitory control (e.g., following rules given by the examiner).

The BSRT assesses short-term and long-term memory abilities through, respectively, an immediate and a delayed recall of all the details contained in a brief tale.

2.4 RBMT-III picture recognition subtask

The Picture Recognition sub-test of the RBMT-III is divided into two phases. During the first phase (Encoding Phase), a set of 15 pictures representing common animate and inanimate objects (e.g., a clock, a chicken) are shown one at a time to the participants, who have to recognize and name each of them. In the second phase (Recognition Phase), the participants observe a total of 30 pictures including target items (i.e., the 15 pictures presented in the Encoding Phase) and distractors (i.e., 15 pictures not presented in the Encoding Phase): for each of these, they must answer yes if the picture was presented previously or no if it was not. The raw score obtained in the sub-test is the number of pictures correctly recognized. In addition, before the Recognition Phase we included a Free Recall task, which required the participants to recall as many objects as they could from those presented in the Encoding Phase. The raw score for this task is the number of objects correctly reported. The flowchart of the procedure is presented in Fig. 1a.

Fig. 1

The figure shows the steps included in the two experimental conditions. The left diagram shows the steps included in the standard condition, while the right diagram reports the steps included in the VR condition. Abbreviations: RBMT-III, Rivermead Behavioural Memory Test-III; ObReco-360°, Object Recognition-360°; ITC-SOPI, Independent Television Commission-Sense of Presence Inventory; SUS, System Usability Scale

2.5 UX measures

The UX assessment procedure included two questionnaires. The first instrument was the Independent Television Commission-Sense of Presence Inventory (ITC-SOPI) (Lessiter et al. 2001), a questionnaire including forty-four statements addressing the individual’s feelings after the VR experience. Participants are asked to rate their degree of agreement with each statement using a five-point Likert scale ranging from “Strongly Agree” to “Strongly Disagree”. The ITC-SOPI is divided into 4 subscales: Sense of Physical Space (19 items), Engagement (13 items), Ecological Validity (5 items) and Negative Effects (6 items), each yielding a single score.

The second instrument was the System Usability Scale (SUS) (Brooke et al. 1996), a questionnaire composed of ten sentences describing the user’s feelings concerning the interaction with the product under evaluation. For each of these sentences, participants rate their degree of agreement using a five-point Likert scale ranging from “Strongly Agree” to “Strongly Disagree”. The computed score ranges from 0 to 100.
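
As an illustration of how the 0 to 100 SUS score is obtained, the following minimal sketch applies the standard SUS scoring rule: odd-numbered items contribute their rating minus 1, even-numbered items contribute 5 minus their rating, and the sum is multiplied by 2.5. The function name `sus_score` and the numeric coding of the answers (1 = “Strongly Disagree”, 5 = “Strongly Agree”) are assumptions made for this example and are not taken from the study materials.

```python
def sus_score(ratings):
    """Return the 0-100 SUS score for ten item ratings.

    Assumption: ratings are coded 1 (Strongly Disagree) to 5 (Strongly Agree)
    and are listed in questionnaire order (item 1 first).
    """
    if len(ratings) != 10:
        raise ValueError("the SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for index, rating in enumerate(ratings):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: rating - 1
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: 5 - rating
            total += 5 - rating
    # Each item contributes 0-4, so the sum (0-40) is rescaled to 0-100
    return total * 2.5


# Hypothetical participant giving a neutral answer to every item
print(sus_score([3] * 10))
```

Individual scores computed in this way can then be averaged across participants to obtain a sample mean.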

2.6 ObReco-360°

The ObReco-360° is a novel neuropsychological assessment tool, developed using 360° immersive photos and videos as VEs and derived from the Picture Recognition sub-test of the RBMT-III. The included VEs were recorded using an omnidirectional video camera, the Ricoh Theta S©, which can record spherical photos with a resolution of 5376 × 2688 pixels and spherical videos with a resolution of 1920 × 1080 pixels. The final version of the ObReco-360° consists of a custom Android application which can be sideloaded on an Oculus Go© headset. The application was developed using the InstaVR© software, which allowed us to organize the virtual environments in a single experience. The ObReco-360° test includes four different phases: the Familiarization Phase, the Encoding Phase, the Free Recall Phase and the Recognition Phase.

The Familiarization Phase aims to make the participants comfortable with the experience and to detect possible side effects linked to VR exposure (e.g., dizziness, nausea). Here, the participants find themselves in a black room with a floating icon showing the number one at its center. They are asked to point at and select the icon with the Oculus Go© controller, which displays a text message, “search for the number 2”, giving the instruction for the task. The procedure is the same for the numbers from 2 to 4, which are positioned at the four cardinal points around the participants: when they finally find and select the number 4 (Fig. 2a), the second virtual environment is loaded.

Fig. 2

a The figure shows a screenshot of the target numbers in the familiarization phase. Participants need to point at and select the circles showing numbers to expand a text label instructing them to search for the next number. b Screenshot of the instructions as they are presented at the beginning of the task. The two options at the bottom corners allowed the participants to choose between replaying the instructions and starting the test

The Encoding Phase represents the starting point of the test proper: the first scenario includes a 3D wall showing the instructions for the task (Fig. 2b), which are also presented in auditory modality. The participants can then choose whether to replay the instructions or go to the test phase. In the test phase, participants must pay attention to different objects presented by a virtual clinician (Fig. 3a). The objects are randomly placed in an office room; the 10 target objects are mixed with 17 non-target ones. During the video, the clinician moves around the room and holds each target object close to the camera for 5 s; meanwhile, the participants must name the object shown. At the end of this task, the participants are invited to take off the headset and join the “real” clinician for a 10-min session of non-interfering tests.

Fig. 3

a The figure shows a set of screenshots of the virtual clinician showing the 10 objects to recognize from the participants’ point of view. From the upper left to the bottom right there are a pan, an apple, a cup, a doll, a plant, a telephone, a stapler, a key, an umbrella and a torch. b A panoramic view of the virtual room with all the 10 target and 17 non-target objects

The next step is the Free Recall Phase. The task simply requires the participant to recall the 10 objects presented 10 min earlier in the Encoding Phase. The raw score is the number of objects correctly reported.

After the participant puts the headset back on, the Recognition Phase begins. Again, the scenario includes a visual and auditory presentation of the instructions for the task, which ask the participants to search an immersive 360° photo of the same room included in the first task in order to find and name all the ten objects previously shown, located among the 17 non-target objects (Fig. 3b).

2.7 Data analysis

We organized all the collected data in a Microsoft Excel sheet and computed different indexes for both the standard and the VR assessment protocols. For the Free Recall tasks, we computed the accuracy percentages. For the Recognition tasks, we computed three different scores: the Hit Rate (HR, the proportion of yes responses to old items), the False Alarm Rate (FAR, the proportion of yes responses to new items) and the discrimination score PR (i.e., HR − FAR; Snodgrass and Corwin 1988). All these scores are reported as percentages.
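
The recognition indexes defined above can be illustrated with a minimal sketch. The function name `recognition_indexes` and the response counts in the example are hypothetical; only the definitions of the hit rate, the false alarm rate and PR (hit rate minus false alarm rate) and the item counts (15 targets and 15 distractors in the RBMT-III sub-test, 10 targets and 17 non-target objects in the ObReco-360°) come from the text.

```python
def recognition_indexes(hits, n_targets, false_alarms, n_distractors):
    """Return the hit rate, false alarm rate and PR score as percentages.

    hits:         number of old (target) items correctly accepted
    false_alarms: number of new (distractor) items wrongly accepted
    """
    if not 0 <= hits <= n_targets:
        raise ValueError("hits must lie between 0 and n_targets")
    if not 0 <= false_alarms <= n_distractors:
        raise ValueError("false_alarms must lie between 0 and n_distractors")

    hit_rate = hits / n_targets
    false_alarm_rate = false_alarms / n_distractors
    # Discrimination score: hits minus false alarms, both as proportions
    pr = hit_rate - false_alarm_rate
    return {
        "HR": 100 * hit_rate,
        "FAR": 100 * false_alarm_rate,
        "PR": 100 * pr,
    }


# Hypothetical participant: RBMT-III layout (15 targets, 15 distractors)
print(recognition_indexes(hits=14, n_targets=15, false_alarms=1, n_distractors=15))
# Hypothetical participant: ObReco-360° layout (10 targets, 17 non-targets)
print(recognition_indexes(hits=9, n_targets=10, false_alarms=1, n_distractors=17))
```

Because the two tests differ in the number of targets and distractors, expressing the indexes as proportions keeps the scores comparable across conditions.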

Then, we performed four Wilcoxon signed-rank tests to compare the Free Recall and Recognition scores in both classic and 360° mode, investigating the statistical significance of the detected differences in performances. Finally, we performed a second statistical analysis, aimed to explore the presence of significant correlations between the computed scores in the two conditions. All the analyses were performed using JASP (Version 0.14.1.0).
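
For readers unfamiliar with the Wilcoxon signed-rank test, the sketch below shows how the rank sums underlying the test statistic can be obtained from paired scores. It is an illustrative pure-Python implementation, not the JASP procedure used in the study: the function name `signed_rank_sums` and the example scores are hypothetical, zero differences are discarded, tied absolute differences receive average ranks, and no p-value is computed.

```python
def signed_rank_sums(first, second):
    """Return (W_plus, W_minus): rank sums of positive and negative differences.

    first, second: paired scores from the same participants, in the same order.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")

    # Paired differences; pairs with no difference are discarded
    diffs = [a - b for a, b in zip(first, second) if a != b]

    # Rank the absolute differences, giving tied values their average rank
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        mean_rank = (pos + end) / 2 + 1  # average of 1-based ranks pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Hypothetical free recall accuracies (%) for five participants
standard = [30, 40, 20, 50, 35]
vr_360 = [50, 40, 30, 45, 60]
print(signed_rank_sums(standard, vr_360))
```

The two sums always add up to n(n + 1)/2, where n is the number of non-zero differences, which offers a quick sanity check on the ranking step.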

3 Results

3.1 Task performance

The descriptives of the accuracy shown by the participants on the Free Recall and Recognition tasks in the two modalities are reported in Table 2. The results indicate that for the Free Recall tasks, participants performed better after the 360° presentation than after the standard one in terms of accuracy percentages (Fig. 4a), and the difference is statistically significant (W = 14.5, p < 0.001). Concerning the Recognition indexes, the participants performed better in recognizing the objects after the standard presentation than after the 360° one (Fig. 4b), and the observed difference between the two conditions is statistically significant both for the HR (W = 131, p < 0.05) and the PR (W = 186, p < 0.05) scores. No significant differences were detected when comparing the FAR percentages. The statistics of the Wilcoxon signed-rank tests are shown in Table 3.

Table 2 The upper part of the table shows the descriptives of the accuracy obtained by the participants in the free recall tasks (FR) in the standard (RBMT-III) and VR (ObReco-360°) condition
Fig. 4

The bar plots show the performance differences in the free recall (FR) and the discrimination tasks (PR) for each modality. a Accuracy difference in the free recall (FR) task between the standard (RBMT-III) and VR (ObReco-360°) conditions. b Percentage differences in the discrimination scores (PRs) between standard (RBMT-III) and VR (ObReco-360°) conditions

Table 3 The table reports the statistics of the Wilcoxon Signed-Rank Tests performed to analyze the significance of the differences observed between the standard (RBMT-III) and VR (ObReco-360°) conditions in the free recall scores (FR) and in the recognition indexes scores (HR, FAR and PR)

3.2 Correlation Analysis

We also performed a statistical analysis to search for possible significant correlations between the accuracy scores obtained by the participants on the free recall and recognition indexes in both modalities. The results show a statistically significant correlation (r = 0.64, p < 0.010) between the Free Recall task in standard and 360° modality (Table 4).
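
As an illustration of how a correlation coefficient between the scores of the two conditions can be computed, the sketch below implements the Pearson product-moment coefficient in pure Python. The text does not state which coefficient was used, so the choice of Pearson's r is an assumption; the function name `pearson_r` and the example scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Return Pearson's r for two equally long sequences of paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired samples with at least two values")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical free recall raw scores for five participants
fr_standard = [1, 2, 3, 4, 5]
fr_360 = [2, 4, 5, 4, 5]
print(round(pearson_r(fr_standard, fr_360), 4))
```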

Table 4 The table shows the correlation coefficients computed between the scores obtained on the free recall (FR) and the recognition (R) tasks in the standard (RBMT-III) and VR (ObReco-360°) conditions

3.3 UX analysis

The descriptives of the scores given by the participants in the 4 scales of the ITC-SOPI and in the SUS are shown in Table 5.

Table 5 The table shows the descriptives of the scores given by the participants in each of the 4 scales of the ITC-SOPI and in the SUS

4 Discussion

With this explorative study, we wanted to test the feasibility of using 360° technology in the neuropsychological assessment of memory. The rationale of the study was based on the ongoing scientific debate concerning the ecological validity of standard paper-and-pencil tests (Chaytor and Schmitter-Edgecombe 2003; Parsons 2015) and on the evidence supporting the implementation of VR in neuropsychological assessment (Rizzo and Koenig 2017). Moreover, we wanted to take a further step by developing a pilot tool using 360° naturalistic VEs, given that their photorealistic features allow a more ecological memory encoding (Robertson et al. 2016; Serino and Repetto 2018), especially when compared to computer-generated VEs. Considering previous results from the literature (Serino et al. 2017; Realdon et al. 2019), we expected to find some correlation when comparing the memory performances shown by the participants in the standard sub-test of the RBMT-III and in the ObReco-360°: in particular, as previously suggested (Serino and Repetto 2018), we focused on two memory function indexes, namely the free recall and recognition accuracy. Additionally, we studied the UX ratings given by our participants to the ObReco-360° in order to identify possible difficulties related to the technological equipment used.

The results showed that participants obtained low scores on the Free Recall tasks included in the two conditions, showing a better performance after the ObReco-360° Encoding Phase (FR RBMT-III = 31%, FR ObReco-360° = 47%). For the Recognition performance, instead, the pattern of results is inverted: the participants showed high levels of accuracy in both conditions, performing slightly better in the RBMT-III condition (PR RBMT-III = 94.4%, PR ObReco-360° = 85.4%).

These results could be explained by the fact that in the ObReco-360° condition the participants needed to recall 10 objects, against the 15 of the RBMT-III subtask. We introduced this mismatch in the number of target items between the two conditions in order to avoid a possible floor effect in the ObReco-360° Free Recall performance, in line with previous findings from the literature suggesting that immersive VR tasks are more cognitively demanding (Frederiksen et al. 2020; Harris et al. 2019). However, we can also hypothesize that the photorealistic nature of 360° VEs could have led both to a better visual encoding of the stimuli and to a lower cognitive load, facilitating the subsequent recall of the objects, as also suggested by previous evidence linking immersivity and realism levels to memory encoding (Robertson et al. 2016; Makowski et al. 2017). Indeed, this result is also consistent with evidence from the work of Ventura et al. (2019), who showed that an immersive 360°-VR scenario could elicit a better visual encoding and subsequently a better recall of the encoded items when compared to the same task performed in a non-immersive scenario. Thus, 360°-VR scenarios might elicit a visual memory encoding which is very close to the one performed in everyday life, thereby improving the ecological validity of the assessment procedure.

Regarding the Recognition tasks, the lower scores obtained by the participants in the ObReco-360° condition could be explained by the higher complexity which characterizes this task: in fact, the 360° scenario required the users to actively explore the environment in order to discriminate the target items from the distractors, as a result of the ecological design of the task. This may have allowed a slightly more sensitive and ecological assessment of Recognition memory when compared to the RBMT-III condition, where participants’ performance showed a more prominent ceiling effect.

Moreover, the presence of a significant correlation (r = 0.64, p < 0.010) between the Free Recall tasks in the two conditions supports the use of ObReco-360° as a measure of memory functioning.

On the UX assessment side, the scores obtained by the ObReco-360° in the 4 dimensions of the ITC-SOPI are consistent with the ones reported by Yildrim et al. (2019), who provided the ITC-SOPI benchmark scores for the 360° videos category and confirmed the high ecological value provided by this technology. Additionally, the low mean score obtained by the ObReco-360° in the Negative Effects scale may be the result of the adoption of a fixed position in the VE and of the limited duration of the VR exposure (about 10 min). Moreover, considering the rating comparison scale proposed by Bangor et al. (2008), the SUS mean score of 73.5/100 ranks the usability of the ObReco-360° slightly above the third quartile, defining the tool as “Acceptable” and with a “Good” level of usability (Fig. 5).

Fig. 5

The vertical line in the figure shows the position of the mean score of 73.5 obtained by the ObReco-360° according to the SUS rating comparison scale provided by Bangor et al. (2008). The SUS score marks the ObReco-360° as an “Acceptable” and “Good” tool

5 Limitations and conclusion

The present work has several limitations. First, the sample is limited in size and representativeness: considering the preliminary nature of the study, we primarily focused on the features of the technology, but further studies must include a larger sample with different demographic characteristics. Second, the technological equipment used was entry-level: currently, the 360° devices market offers much higher-quality omnidirectional cameras (e.g., Insta360 Pro 2©) and better all-in-one headsets (e.g., Oculus Quest 2©), which together can provide a more realistic experience and thus give a higher ecological value to the obtained measures. Future work is needed to clarify which advantages and disadvantages characterize 360°-VR when compared to model-based VR. For example, further information could be obtained by comparing the same task (e.g., memory encoding) in three different modalities: paper-and-pencil, 360°-VR and model-based VR.

In summary, even if limited by the explorative nature of the study, these preliminary results encourage the implementation of 360° technology in the development of ecological tests for memory assessment. Further work is needed to improve the design of 360° experiences, searching for tasks that resemble the challenges of daily life while working around the interactivity limitations that characterize this technology.