1 Introduction

Immersive media, such as 360° videos, games, or learning simulations presented via virtual reality (VR) technology, have gained substantial popularity in recent years. Due to their rapid technological advancement, these highly involving media forms have not only begun to conquer the mass consumer market, but also sparked great interest among content creators and persuasive communicators. For instance, in the field of education and training, the use of immersive head-mounted displays (HMDs) has been proposed as a great opportunity to enhance learning outcomes (Albus et al. 2021; Leder et al. 2019). Similarly, VR gaming has been found to increase media enjoyment (Wehden et al. 2021), while also positively affecting the memory for and evaluation of integrated brands (van Berlo et al. 2020, 2021). Further focusing on persuasive success, other research has demonstrated that 360° videos may significantly enhance people’s involvement with a prosocial topic (Breves 2020), increase their pro-environmental behavioral intentions toward distant environmental issues (Breves and Schramm 2021), and boost advertising effectiveness (Van Kerrebroeck et al. 2017). Last but not least, VR simulations have been shown to positively affect health-related attitudes and behaviors (Ahn 2018; Ahn et al. 2019).

Based on these findings, one may assume that presenting media in an immersive way is generally linked to clear practical benefits; and indeed, scientific efforts continue to provide new ideas of how to use immersive media in an effective manner (e.g., Meijers et al. 2021). While the results seem promising, underscoring the added value of VR and similar technologies for diverse purposes (Makransky and Petersen 2021; Nalivaiko et al. 2015), it should be noted that immersive media forms cannot always be considered superior to traditional modes of presentation. For instance, a growing body of research has found no positive effects of immersive technologies on learning outcomes (Makransky et al. 2019, 2021) or persuasive effectiveness (Breves 2021; Ma 2020). In the entertainment context, Roettl and Terlutter (2018) reported that participants who experienced a more immersive form of a game did not evaluate it more positively; even more problematically, Barreda-Ángeles et al. (2021) showed that participants who experienced a 360° news segment reported lower levels of focused attention and recall of the information compared to a regular segment. In order to explain these conflicting results, some researchers have suggested that cognitive (over-)load and impaired cognitive processing elicited by immersive technologies might be responsible for the lack of consistent positive effects (Barreda-Ángeles et al. 2021; Breves and Schramm 2019; Roettl and Terlutter 2018). Cognitive load is generally understood as the total amount of mental effort that the working memory expends during a task (Chandler and Sweller 1991). Very high levels of cognitive load (i.e., cognitive overload) are often perceived as unpleasant by media users (Drolet and Luce 2004; Mayer and Moreno 2003) and can have further adverse consequences, such as the reduction of flow (Wissmath et al. 2009), user satisfaction (Hu et al. 2017), and performance quality (e.g., memory performance; Roettl and Terlutter 2018).

Consequently, several scholars have already scrutinized potential ways to reduce cognitive load in immersive media settings (e.g., by employing pre-training; Meyer et al. 2019). However, to this day, not enough is known about the underlying reasons why immersive technologies may elicit higher cognitive load than traditional media settings. While recent literature has proposed two different explanations (namely, increased spatial presence and cybersickness), actual empirical investigations of both constructs’ connection to cognitive load remain lacking. This leaves a notable research gap, which the current study addresses by empirically comparing the role of both suggested mechanisms. In summary, the contribution of this manuscript is twofold. First, our work offers novel insight into the emergence of cognitive load in immersive media settings, empirically investigating two competing predictors. Second, we translate our findings into practical recommendations for media producers, who often need to monitor and control the amount of cognitive load elicited by VR in order to ensure a pleasant user experience.

1.1 Cognitive load in immersive media forms

Cognitive load can be understood as the amount of cognitive processing that is dedicated to the execution of a task (Chandler and Sweller 1991; Hu et al. 2017). Since cognitive resources are naturally limited, the media user’s processing capacity should not be exceeded in order to achieve optimal performance (Lang 2000). The well-established Cognitive Load Theory (CLT; for a recent overview see Sweller et al. 2019) further differentiates between three different subtypes of cognitive load: (a) intrinsic load, i.e., the inherent difficulty of the task, (b) extraneous load, i.e., mental efforts required due to the instruction and presentation of the task, and (c) germane load, i.e., the load invoked by processing and automating cognitive schemata. However, most research focusing on the use of working memory resources during media experience has been concerned with the reduction of extraneous cognitive load, since adapting instructional procedures seems a particularly feasible way to reduce the required cognitive effort (Schrader and Bastiaens 2012; Sweller et al. 2019).

Returning to the topic at hand, several studies have already provided evidence that more immersive media forms elicit higher levels of cognitive load, in particular extraneous cognitive load—and may, thus, lead to cognitive overload (Makransky and Petersen 2021). Compared to 2D movies, 3D movies were found to increase participants’ cognitive load (Breves and Schramm 2019), whereas VR games elicited higher levels of cognitive load than 3D games (Roettl and Terlutter 2018). Learning material that was presented via VR was shown to cause more extraneous cognitive load than the same learning material presented via a PC (Parong and Mayer 2021). Most of the time, the enhanced amount of sensory information and distraction effects were suggested as explanations for these findings (e.g., Breves and Schramm 2019; Parong and Mayer 2021). Arguably, there is little to be done about this circumstance: Due to higher levels of vividness and interactivity, media users who experience immersive media forms are bound to receive more information. Consequently, the following first hypothesis is proposed for the current study, basically replicating the often-observed cognitive load effects in VR environments.

H1: Media offerings that are experienced using highly immersive technologies should result in higher cognitive load, compared to media offerings that are experienced using less-immersive technologies.

However, by using a more human-centered approach, psychological consequences of additional sensory information could be considered in order to identify possible ways to reduce the impact of technological immersiveness on cognitive load. In earlier research, two constructs are repeatedly mentioned (e.g., Feng et al. 2019; Makransky and Petersen 2021; Nesbitt et al. 2017; Varmaghani et al. 2021) and presumed to be responsible for additional extraneous load of immersive media: spatial presence and cybersickness. Nonetheless, actual empirical evidence regarding their impact on users’ working memory is still missing.

1.2 The role of spatial presence

Spatial presence describes the feeling of being in the depicted media environment and is thus often characterized as the “sense of being there” (Schubert 2003; Wirth et al. 2007). As a consequence, media users who experience high levels of spatial presence often report the perception of non-mediation (International Society for Presence Research 2000; Lombard and Ditton 1997). Based on theoretical assumptions and substantial empirical evidence, higher technological immersiveness can be regarded as an elicitor of spatial presence, although other factors (e.g., personal characteristics and media content) are also of importance (Cummings and Bailenson 2016; Wirth et al. 2007). Considering this, it may be assumed:

H2: Media offerings that are experienced using highly immersive technologies should result in higher levels of spatial presence compared to media offerings that are experienced using less-immersive technologies.

Several researchers have proposed the perception of spatial presence as a main reason why higher technological immersiveness should demand more cognitive resources, leading to cognitive overload (e.g., Feng et al. 2019; Makransky and Petersen 2021; Oh and Jin 2018; Vettehen et al. 2019; Waiguny et al. 2014). In their recently published paper on the Cognitive Affective Model of Immersive Learning, Makransky and Petersen (2021) theorize that spatial presence in virtual learning environments enhances cognitive load, which in turn reduces knowledge transfer and learning outcomes. Specifically, it has been suggested that spatial presence might take up more attention of the media user and thus result in increased cognitive load (Huang et al. 2019). Alternatively, scholars argue that the illusion of feeling present in the media environment while at the same time knowing that one is actually situated in the real world might require cognitive resources (Vettehen et al. 2019).

However, the assumption that spatial presence should be responsible for enhanced cognitive load and decreased learning outcomes is not undisputed. Based on the premise that the emergence of spatial presence as a cognitive feeling is an unconscious process, the notion that more cognitive resources are required seems unfounded (Schubert 2009). Parong et al. (2020) even state that spatial presence should attenuate extraneous load, presumably by enhancing naturalness and diminishing the distraction that results from wearing HMDs or interacting with controllers. Consequently, according to several researchers, spatial presence should actually be connected to less extraneous cognitive load and enhance positive media effects (e.g., Jeong et al. 2011; Li et al. 2002; Parong et al. 2020).

In summary, it seems worthwhile to further decipher the role of spatial presence in order to design and produce immersive media environments that do not trigger cognitive overload. If spatial presence does indeed require additional cognitive resources and enhance cognitive load, triggers of spatial presence (e.g., the use of customized avatars; Bailey et al. 2009) should be utilized more carefully. However, to the best of our knowledge, no study to date has empirically analyzed if the spatial presence elicited by immersive technologies truly accounts for higher amounts of cognitive load. Based on the illustrated discordance of previous scholarly work regarding this mechanism, the following research question is proposed:

RQ1: Is spatial presence responsible for the impact of technological immersiveness on cognitive load?

1.3 The role of cybersickness

While the perception of spatial presence can be understood as one of the goals of immersive media forms, adverse side effects that can accompany the use of vivid and interactive technologies have also been reported (Keshavarz et al. 2019; Lessiter et al. 2001). In the context of VR, feelings of cybersickness have been identified as a rather common occurrence (LaViola 2000; McCauley and Sharkey 1992; Yildirim 2020). As a subtype of motion sickness, the main symptoms of cybersickness are unpleasant feelings and physiological reactions such as discomfort, dizziness, nausea, headaches, or eye strain, which can occur during or even after the exposure to virtual environments (LaViola 2000; Nalivaiko et al. 2015; Porcino et al. 2021; Varmaghani et al. 2021). While the concrete neurological mechanisms responsible for these reactions are currently still unknown (Porcino et al. 2021), sensory conflict theory proposes that motion sickness might be caused by the mismatch between the visual and vestibular systems (Reason and Brand 1975). Since VR environments simulate movement even though the media users do not change position, their sensory perceptions do not align, which might be overwhelming for human physiology (Davis et al. 2014; LaViola 2000). Offering a different theoretical angle, poison theory suggests that the sensory illusions provided by VR settings might sometimes resemble the experience of having ingested a toxic substance; in turn, the human body might be evolutionarily hardwired to expel the respective substance from its stomach (Palmisano et al. 2020). In a similar vein, Ebenholtz (1992) suggests that any condition that results in a loss of eye-movement control will likely elicit feelings of nausea and dizziness, presenting yet another evolutionary psychological approach to the phenomenon.

Regardless of the reason as to why cybersickness occurs, the severity and the kind of symptoms vary greatly between individuals and depend on user characteristics, the employed technology, the environmental design, as well as the tasks that users have to perform in the environment (Stanney et al. 2020). Despite the enhanced quality of today’s immersive devices (e.g., HMDs with better refresh rates, display resolution, or improved positional tracking), cybersickness is still a side effect of virtual environments that has to be endured and is reported regularly (Porcino et al. 2021; Shafer et al. 2019; Varmaghani et al. 2021). Consequently, the following hypothesis is proposed.

H3: Media offerings that are experienced using highly immersive technologies should result in higher levels of cybersickness compared to media offerings that are experienced using less-immersive technologies.

Media producers are keen to reduce cybersickness, as it has been shown to influence the interaction with the media content as well as its consequences in a negative way (Israel et al. 2019; Varmaghani et al. 2021; Yildirim 2020)—for instance by reducing persuasive effects and decreasing media enjoyment (Breves and Dodel 2021; Yildirim 2020). In the context of VR games, Monteiro et al. (2018) explicitly stress that the avoidance of negative experiences such as cybersickness is paramount, because the player might otherwise dislike the game. While only a limited number of studies have analyzed the impact of cybersickness on cognitive processes, it is generally believed to restrict cognitive functioning (Ha 2020; Makransky and Petersen 2021; Nesbitt et al. 2017; Varmaghani et al. 2021). Based on the sensory conflict theory, a temporary cognitive decline can be expected due to the fact that the brain has to resolve the emerging sensory conflict (Varmaghani et al. 2021). Nesbitt et al. (2017) and Mittelstaedt et al. (2019), for instance, propose that participants’ drop in cognitive performance after VR use was due to side effects of the simulation and reported a positive correlation between reaction times and cybersickness to support their assumption. Ha (2020) furthermore identified a significant positive correlation between cognitive load and cybersickness for their participants who experienced a 360° education video. However, other researchers could not report a significant correlation between cognitive performance and cybersickness scores (Szpak et al. 2019; Varmaghani et al. 2021). On account of these mixed findings, the relations between cybersickness and cognitive processes have remained a hot topic for VR developers and cognitive psychologists in recent years (Varmaghani et al. 2021). To make further sense of this theoretical connection, we address the following research question:

RQ2: Is cybersickness responsible for the positive impact of technological immersiveness on cognitive load?

2 Methods

2.1 Design and stimulus

To test our hypotheses and explore our research questions, a between-subjects experimental design was chosen, with level of immersiveness serving as the independent variable. The participants either watched a media stimulus passively on a laptop (low immersiveness, “LI”; n = 65) or via the Oculus Quest 1 (high immersiveness, “HI”; n = 56). The Oculus Quest 1 is a stand-alone high-quality HMD that can be used to play VR games or watch 360° videos and movies, without requiring an external computer system.

In terms of the specific stimulus, a 360° documentary was selected in order to keep the two experimental groups as comparable as possible. Since this was a passive 360° video, participants were only able to adjust their point of view using either head movements (HI condition) or the computer mouse (LI condition). The alternative idea of choosing a more interactive VR simulation such as a videogame—albeit more likely to foster higher levels of cybersickness or spatial presence (e.g., Yeo et al. 2020)—would have allowed for higher levels of behavioral freedom, thus reducing the internal validity of the study. Also, since previous research has shown that presenting 360° videos actually sufficed to successfully manipulate the perceived spatial presence and cybersickness of participants (e.g., Breves 2021; Groth et al. 2021; Vettehen et al. 2019), we deemed this approach suitable for our study.

Specifically, the 360° documentary Iuventa—Rescuing Refugees in the Mediterranean Sea was selected as stimulus material. The video tells the story of volunteers who work on the Iuventa, a rescue ship that operates on the Mediterranean, which is considered the world’s most dangerous refugee route (see Fig. 1). The documentary was selected for several reasons. First, it was of high quality as it was professionally produced by the ZDF, a German government-financed broadcasting station. Second, it was available both in German and in English, which means that it could be shown to the German participants in their native language but can also be experienced by researchers around the world who may be interested in the topic. Third, this particular 360° video may be considered rather long (nearly 14 min) and includes a lot of movement, so that a higher range of spatial presence as well as cybersickness could be expected (Lee et al. 2004a; Saredakis et al. 2020).

Fig. 1 Screenshots of the 360° documentary Iuventa—Rescuing Refugees in the Mediterranean Sea, produced by the ZDF. Copyright: ZDF/Carsten Behrendt

2.2 Procedure

The study took place in a laboratory at a medium-sized German university during November and December 2019. Slots were available throughout the day from morning until late afternoon. Participants entered the laboratory one at a time and were welcomed and instructed by the research assistant. They were then guided into a separate cubicle in the room and asked to sit down on a revolving chair. After providing consent, participants were told that they would take part in a study on video perception in order to conceal the true purpose of the study. They were allowed to make use of the revolving chair if they wanted to, but were asked to stay seated during the experiment. Participants were also told that they could drop out of the study at any time if they did not feel well while watching the video. If they had any questions during the experiment, they could always ask the research assistant, who was seated in front of the cubicle (see Fig. 2). Then, they were asked to put on the circumaural headphones (model AKG K77) in front of them. Participants were randomly assigned to one of the two conditions and watched the video either on the laptop or with the high-quality HMD. If participants were assigned to the LI group, the research assistant carefully hid the VR headsets before they entered the laboratory, so that the participants were not aware of the other experimental condition. After experiencing the 360° video, they filled out the online questionnaire. Participants first reported their perceived cognitive load, followed by their feelings of spatial presence and cybersickness.

Lastly, demographic details were recorded, and participants were additionally asked about their earlier experience with 360° videos before they were fully debriefed about the purpose of the study and thanked for their participation.

2.3 Measures

In order to measure the cognitive load of participants, the German form of the subjective mental effort questionnaire (Eilers et al. 1986) was employed, which consists of a single rating scale and has proven to be a valid measurement tool (Sauro and Duman 2009). Participants were confronted with a figure that depicted the level of cognitive load using values from 0 to 220. For instance, the value 20 corresponded to hardly effortful, while the value of 205 corresponded to extraordinarily effortful (M = 91.83, SD = 49.25).

Next, the Spatial Presence Experience Scale was used to measure the spatial presence of the participants (Hartmann et al. 2016). The scale consists of eight items measuring two dimensions: self-location (e.g., “I felt like I was actually there in the environment of the presentation”) and possible actions (e.g., “I had the impression that I could be active in the environment of the presentation.”). The participants indicated their level of agreement on a 7-point Likert Scale (1 = totally disagree, 7 = totally agree). The reliability of the scale was satisfactory (Cronbach’s α = 0.93; M = 3.63, SD = 1.46).

Similar to recent research (e.g., Seibert and Shafer 2018; Shafer et al. 2017), participants’ experience of cybersickness was measured using four items of the negative feelings subscale of the ITC-Sense of Presence Inventory by Lessiter et al. (2001), which assesses different symptoms of cybersickness on a 7-point scale (α = 0.81; M = 2.61, SD = 1.51). For instance, participants were asked to indicate how dizzy or nauseous they currently felt.
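
As a minimal illustration of how reliability coefficients such as the ones reported above are obtained, the following Python sketch computes Cronbach’s α from item-level ratings. The ratings are hypothetical and serve only to demonstrate the formula; they are not data from the study, and the function name cronbach_alpha is an illustrative choice.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(rows[0])                               # number of items
    items = list(zip(*rows))                       # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])   # variance of the sum scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical 7-point ratings: 5 participants x 4 items (not study data)
ratings = [
    [1, 2, 1, 2],
    [3, 3, 2, 3],
    [5, 4, 5, 5],
    [2, 2, 3, 2],
    [6, 5, 6, 7],
]
alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

The coefficient approaches 1 when the items vary together across participants, which is why it is commonly read as an index of internal consistency.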

2.4 Sample

A statistical power analysis was performed using the software G*Power 3.1 for sample size estimation based on data from previous studies and meta-analyses, which reported medium to large effect sizes regarding the subject matter at hand (e.g., Uhm et al. 2020; Vettehen et al. 2019). With the alpha error probability set to 0.05, 90% power, and f = 0.30, a minimum sample size of 119 participants was calculated. Eventually, N = 121 undergraduate and graduate students of a medium-sized German university were recruited and received course credit for participating. They did not have to fulfil any criteria to be eligible for participation. However, in the study invitation, prospective participants were asked to wear contact lenses instead of glasses if they needed vision aids. The mean age of the sample was 20.50 years (SD = 3.27), and 100 of the participants were female (82.6%), while 21 participants identified as male (17.4%). None of the participants had to be excluded or decided to drop out during the experiment. Of the 121 participants, only 16 had never experienced a 360° video before, while 53 had experienced them at least once or twice and 52 had experienced them several times. The two experimental groups did not differ significantly in terms of gender distribution [χ²(1) = 2.50, p = 0.114] or age [t(119) = –0.82, p = 0.413].
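
The order of magnitude of the required sample size can be checked with a simple normal approximation. The sketch below assumes a comparison of two equally sized groups, for which Cohen’s d = 2f, and it ignores the noncentral F distribution that an exact power analysis would use; the function name approx_total_n is hypothetical. Under these assumptions, the estimate lands slightly below the 119 participants reported above.

```python
from math import ceil
from statistics import NormalDist


def approx_total_n(f, alpha=0.05, power=0.90):
    """Approximate total N for two equal groups, given Cohen's f.

    Per-group n is 2 * (z_alpha + z_power)^2 / d^2 with d = 2f,
    so the total across both groups is (z_alpha + z_power)^2 / f^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = z.inv_cdf(power)
    return ceil((z_alpha + z_power) ** 2 / f ** 2)


n_required = approx_total_n(0.30)
print(f"Approximate total sample size: {n_required}")
```

The small gap to the reported figure is expected, since the normal approximation omits the correction for estimating the error variance from the sample.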

3 Results

Table 1 collects the zero-order correlations between our study variables.

Table 1 Inter-correlations among the variables

To scrutinize the impact of technological immersiveness on cognitive load (H1; see Fig. 3), spatial presence (H2; see Fig. 4), and cybersickness (H3; see Fig. 5), a MANOVA was conducted using SPSS, Version 26 (IBM Corp., Armonk, NY, USA). Considering Hotelling’s trace statistic, there was a significant effect of technological immersiveness on the dependent variables, T = 0.88, F(3, 117) = 34.17, p < 0.001, η² = 0.467.
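
With only two groups, Hotelling’s trace can be related to the reported F value and effect size by hand. The following sketch assumes the standard relations for a single hypothesis degree of freedom, F = T · df2 / df1 and η² = T / (1 + T); whether SPSS derives its output in exactly this way is an assumption, and the function names are illustrative.

```python
def trace_from_f(f_value, df_num, df_den):
    """Recover Hotelling's trace T from the exact F (two-group case)."""
    return f_value * df_num / df_den


def eta_squared_from_trace(trace):
    """Multivariate eta squared for a single hypothesis degree of freedom."""
    return trace / (1 + trace)


# Values reported in the text: F(3, 117) = 34.17
trace = trace_from_f(34.17, df_num=3, df_den=117)
eta_sq = eta_squared_from_trace(trace)
print(f"T = {trace:.2f}, eta squared = {eta_sq:.3f}")
```

Under these assumptions, the reported F value reproduces both T = 0.88 and η² = 0.467, so the three statistics are mutually consistent.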

Fig. 2 Setup of the study laboratory with the revolving chair and laptop (left picture) and an example of a participant in the high immersiveness condition (right picture). In the right picture, the chair of the research assistant in front of the cubicle is also visible

Fig. 3 Effect of technological immersiveness on cognitive load. Values range from 0 (low cognitive load) to 220 (high cognitive load). Error bars represent 95% confidence intervals. N = 121

Fig. 4 Effect of technological immersiveness on spatial presence. Values range from 1 (low spatial presence) to 7 (high spatial presence). Error bars represent 95% confidence intervals. N = 121

As predicted, participants who wore the HMD to experience the 360° video reported higher levels of cognitive load (M = 114.02, SD = 45.60) than participants who saw the video on the laptop (M = 72.71, SD = 44.25), F(1, 119) = 25.48, p < 0.001, with a rather large effect size of partial η² = 0.176 (Fig. 3), supporting H1. Participants in the HI condition further perceived higher levels of spatial presence (M = 4.38, SD = 1.33) than those who were part of the LI condition (M = 2.98, SD = 1.23), F(1, 119) = 36.51, p < 0.001, partial η² = 0.235 (Fig. 4). Last but not least, participants allocated to the HI condition additionally reported higher levels of cybersickness (M = 3.62, SD = 1.40) compared to those who used a regular laptop to experience the video (M = 1.75, SD = 0.97), F(1, 119) = 74.84, p < 0.001, partial η² = 0.386 (Fig. 5). As such, we also accept hypotheses H2 and H3, confirming well-established assumptions about the characteristics of experiencing immersive media.
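
The univariate statistics can be recomputed from the reported descriptive values. The sketch below assumes a pooled-variance F test for two independent groups (equivalent to a squared t test) and uses the group sizes from Section 2.1; because the published means and standard deviations are rounded, a recomputed F value only approximates the reported one. Function names are illustrative.

```python
def f_from_summary(m1, sd1, n1, m2, sd2, n2):
    """F statistic for two independent groups from means, SDs, and sizes."""
    df_error = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df_error
    f_value = (m1 - m2) ** 2 / (pooled_var * (1 / n1 + 1 / n2))
    return f_value, df_error


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared derived from an F value and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Cognitive load: HI (n = 56) versus LI (n = 65), values as reported
f_load, df_err = f_from_summary(114.02, 45.60, 56, 72.71, 44.25, 65)

# Effect sizes derived from the reported F values
eta_load = partial_eta_squared(25.48, 1, 119)
eta_presence = partial_eta_squared(36.51, 1, 119)
eta_sickness = partial_eta_squared(74.84, 1, 119)
print(f"F({1}, {df_err}) = {f_load:.2f}")
print(f"{eta_load:.3f} {eta_presence:.3f} {eta_sickness:.3f}")
```

The same two functions can be applied to spatial presence and cybersickness; small deviations from the published F values are to be expected for the rounding reason given above.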

Fig. 5 Effect of technological immersiveness on cybersickness. Values range from 1 (low cybersickness) to 7 (high cybersickness). Error bars represent 95% confidence intervals. N = 121

In the final step of our data analysis, we examined whether spatial presence (RQ1) or cybersickness (RQ2) emerged as meaningful mediators of the enhanced cognitive load associated with higher immersiveness, thus statistically accounting for this effect. A parallel mediation analysis was conducted using Hayes’s (2018) PROCESS macro for SPSS with 5,000 bootstrapping iterations. Technological immersiveness was included as the independent variable (0 = LI, 1 = HI), while participants’ levels of spatial presence and cybersickness were examined as parallel mediators. Figure 6 illustrates the observed connections between the variables, as well as the unstandardized regression coefficients. While the indirect effect via cybersickness (b = 26.81; 95% CI [12.28, 42.22]) reached significance, neither the indirect effect via spatial presence (b = 1.28; 95% CI [–7.58, 10.56]) nor the direct effect of technological immersiveness (b = 13.21; 95% CI [–7.33, 33.75]) was statistically significant. As such, cybersickness but not spatial presence could be identified as a meaningful mediator of the impact of immersiveness on cognitive load. The overall mediation model explained 30% of the variance in the dependent variable (Fig. 6).
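
In an ordinary least squares mediation model, the total effect equals the direct effect plus the sum of the indirect effects. The following sketch uses only the coefficients reported above to check this decomposition against the raw group difference in cognitive load; the share attributed to each mediator is a descriptive ratio computed here for illustration, not a statistic reported by PROCESS.

```python
# Unstandardized effects as reported in the text
direct_effect = 13.21
indirect_effects = {"cybersickness": 26.81, "spatial presence": 1.28}

# Total effect = direct effect + sum of indirect effects
total_effect = direct_effect + sum(indirect_effects.values())

# With a 0/1 coded predictor, the total effect equals the group mean difference
mean_difference = 114.02 - 72.71   # HI minus LI cognitive load

# Descriptive share of the total effect carried by each mediator
shares = {name: b / total_effect for name, b in indirect_effects.items()}

print(f"total = {total_effect:.2f}, mean difference = {mean_difference:.2f}")
for name, share in shares.items():
    print(f"{name}: {share:.0%} of the total effect")
```

The reconstructed total effect matches the observed mean difference within rounding, and roughly two thirds of it runs through cybersickness.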

Fig. 6 Parallel mediation analysis with bootstrapping (m = 5,000). *p < .001; LI = low immersiveness; HI = high immersiveness; ns = nonsignificant, N = 121

4 Discussion and implications

As predicted, the immersiveness of the media technology had significant and substantial effects on media users’ perceived spatial presence and cybersickness as well as on their cognitive load. These results clearly align with earlier research on the impact of technological factors on human perception (e.g., Cummings and Bailenson 2016; Parong and Mayer 2021; Porcino et al. 2021; Roettl and Terlutter 2018). At the same time, by answering RQ1 and RQ2, our study also yielded novel findings that can be considered as highly relevant for both researchers and media producers. As shown in the parallel mediation analysis, perceived spatial presence was not significantly connected to the media users’ cognitive load. Consequently, neither the assumption of Makransky and Petersen (2021) nor that of Parong et al. (2020)—who, respectively, hypothesized a positive or negative connection between the two variables—could be validated based on our data. Considering our findings, experiencing spatial presence does not seem to consume a lot of cognitive resources after all, which aligns with Schubert's (2009) interpretation of spatial presence as a cognitive feeling that emerges unconsciously and without mental effort. To us, this also makes sense from a theoretical point of view: Immersed in any given environment (natural or otherwise), the ability to experience physical presence constitutes an absolute prerequisite for us to successfully interact with our surroundings, i.e., a necessity for psychological functioning. Requiring a lot of mental resources for this purpose would seem maladaptive. Even more so, the literature has indicated that the human brain typically reacts to mediated stimuli as if they are part of the natural world, because our neurological architecture shows an inherent tendency to accept perceptive stimuli rather than to reject them (Lee 2004b; Panksepp 1998). 
If so, the experience of spatial presence might indeed be a rather intuitive process that puts little strain on the cognitive system. In turn, media producers might not have to worry too much about this aspect of their virtual environments distracting users from the task or offering at hand. While this might be comforting, we would like to point out that the opposite assumption (spatial presence serving as a means to alleviate high cognitive load) also did not hold true according to our data. Thus, the hope to make virtual applications less mentally demanding by increasing spatial presence might also lack empirical footing.

By contrast, our analyses confirmed cybersickness as a substantial mediator between technological immersiveness and cognitive load. Unpleasant feelings such as nausea and dizziness, which arise as side effects from using immersive media devices (e.g., HMDs), evidently deplete media users’ cognitive resources. Unlike previous studies that only reported inconsistent and correlational results in this regard (e.g., Ha 2020; Nesbitt et al. 2017; Szpak et al. 2019; Varmaghani et al. 2021), our experimental findings thus provide new, more tenable evidence in favor of this important assumption about virtual environments.

In our opinion, our results carry several practical implications for media offerings that require a certain amount of unoccupied cognitive resources from media users (e.g., learning environments and advertisements). While designing immersive media offerings, media producers should avoid elements that might trigger feelings of cybersickness, such as camera instability, visual latencies, visual accelerations, headset weight, or a poor fit of interpupillary distance (Litleskare and Calogiuri 2019; Porcino et al. 2021; Stanney et al. 2020). Furthermore, they can actively integrate factors into their media contents that have been shown to decrease cybersickness but not spatial presence, such as motion prediction cues, rotation snapping (i.e., eliminating frames during viewpoint rotation), and translation snapping (i.e., using short jumps for translational displacement) in order to reduce the illusion of self-motion (Farmani and Teather 2020; Jeng-Weei Lin et al. 2005). Additional strategies to reduce cybersickness can be found in the literature review by Porcino et al. (2021). Furthermore, active rather than passive movements (e.g., natural walking compared to artificial locomotion techniques) in VR should also decrease cybersickness while enhancing spatial presence (Caserman et al. 2021). Since spatial presence has been connected to other beneficial user experiences, such as media enjoyment (Yim et al. 2012) and increased feelings of autonomy (Gao et al. 2018), immersive media producers can integrate presence-inducing elements that do not trigger cybersickness, such as customized avatars (Bailey et al. 2009), to optimize the media experience without having to worry about added cognitive load.

4.1 Limitations and future research

The current study offers important insights for both researchers and media producers, but several aspects warrant critical reflection. The first concerns the employed immersive technology. A 360° video was chosen as stimulus material, which created high vividness but offered limited interactivity. Future studies should therefore replicate the reported findings using more complex VR applications, whose additional interaction possibilities might elicit higher levels of both spatial presence and cybersickness (Saredakis et al. 2020; Yeo et al. 2020).

Indeed, the use of more immersive media forms might also serve to address another limitation of our work, namely that our experiment did not manipulate spatial presence and cybersickness separately. In the context of passive 360° videos, it seems nearly impossible to manipulate one factor without influencing the other while still preserving adequate group comparability. For instance, while varying video image quality (low/high) as a second experimental factor might increase cybersickness (Porcino et al. 2021), it should equally reduce spatial presence (Cummings and Bailenson 2016). In order to obtain unequivocal causal evidence, future studies should therefore try to incorporate new approaches to manipulate spatial presence and cybersickness independently, which might require actual interactions with the virtual environment. One possibility might be to use VR games, in which the customizability of the avatar could be experimentally varied (customized/not customized) to create conditions of higher vs. lower spatial presence (Bailey et al. 2009). Similarly, different methods of virtual locomotion (teleporting/steering locomotion) could be used to specifically target cybersickness, although this might, again, also influence spatial presence (Porcino et al. 2021). Despite the potential merit of these ideas, however, it seems clear to us that manipulating one factor without automatically changing the other will remain a great challenge for the field of VR research. At the same time, we would like to underscore that even without distinct manipulations of cybersickness and spatial presence, it is still possible and valid to connect participants’ experience of both factors to the outcome in question (cognitive load). As such, our findings can still be interpreted as a novel and valuable contribution to this ongoing debate.

Another area of improvement for future work concerns the measurement scale that was employed to operationalize media users’ levels of cognitive load. With the utilized subjective mental effort questionnaire, we relied on a self-report instrument, which might not always yield the most precise measurement of participants’ actual cognitive load (e.g., Breves and Schramm 2019). Future research could therefore employ more objective measures of this construct. Physiological measurements (e.g., EEGs) or secondary task reaction time measures, for instance, are popular ways to measure cognitive load implicitly, although they are often criticized for disturbing the media perception process and thus limiting the external validity of the results (Baceviciute et al. 2021; Bracken et al. 2014; Paas et al. 2003). Furthermore, we note that the employed cognitive load item may be considered rather undifferentiated, because it does not capture different dimensions of cognitive load such as mental effort and mental load (e.g., Hwang et al. 2013). On the other hand, earlier work that utilized more nuanced measurement tools often reported that it proved very challenging to capture the different facets of cognitive load separately (e.g., Schnotz and Kürschner 2007; Schrader and Bastiaens 2012). As such, we still believe the selected single-item measure to be a suitable choice for the current study, not least because previous research has shown it to be valid, reliable, and unobtrusive (Paas et al. 2003).

In a similar vein, the use of the subscale of the ITC-Sense of Presence Inventory by Lessiter et al. (2001) to measure cybersickness can also be viewed critically. In most studies, the Simulator Sickness Questionnaire (SSQ) by Kennedy et al. (1993) is used to measure levels of cybersickness (Porcino et al. 2021). In a recent paper, however, Sevinc and Berkman (2020) recommend against using the SSQ in VR settings because of concerns about its psychometric qualities and applicability. While other measurement tools exist that have been specifically introduced to measure cybersickness in VR, such as the VRSQ (Virtual Reality Sickness Questionnaire; Kim et al. 2018), they have not yet been evaluated sufficiently (Sevinc and Berkman 2020). Moreover, as these novel methods were specifically introduced to measure the media experience with an HMD and validated in VR settings, they might not be suited for studies that compare media forms that vary in technological immersiveness. Therefore, in line with other researchers who compared media forms that differed in immersiveness (e.g., Seibert and Shafer 2018), we decided to use the short and easily adaptable scale by Lessiter et al. (2001), which can be used both for participants who experienced the video via an HMD and for those who used a regular computer screen. Nonetheless, we recommend the additional use of other scales that capture multiple dimensions of cybersickness, such as the VRSQ (Kim et al. 2018), in future VR studies.

This study was designed to focus on two basic mechanisms that have been repeatedly suspected to underlie the positive connection between immersive media technologies and cognitive load. As a consequence, it did not capture secondary variables that might be of interest for specific disciplines, such as media enjoyment, memory, or learning performance. However, we believe that researchers who want to further elucidate the impact of cybersickness and spatial presence on cognitive load as well as other variables could use the results reported here as a valuable starting point. In the same vein, it stands to reason that other theoretical mediators might help to explain some of the remaining variance beyond our explored parallel mediation model. Even though cybersickness could explain nearly a third of the impact of immersiveness on cognitive load, it might be worthwhile to explore other variables in future analyses, such as perceived agency, which has been discussed in this regard (Makransky and Petersen 2021).
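
To make the notion of a mediator "explaining a share" of an effect more concrete, the standard decomposition for a linear parallel mediation model with two mediators can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the reported analyses: $X$ denotes technological immersiveness, $M_1$ spatial presence, $M_2$ cybersickness, and $Y$ cognitive load; $a_j$ and $b_j$ are the paths into and out of mediator $M_j$, $c'$ is the direct effect, $c$ is the total effect, and $P_{M_2}$ is a hypothetical label for the proportion of the total effect carried by cybersickness.

```latex
\begin{align}
M_1 &= i_1 + a_1 X + e_1, \\
M_2 &= i_2 + a_2 X + e_2, \\
Y   &= i_3 + c' X + b_1 M_1 + b_2 M_2 + e_3, \\
c   &= c' + a_1 b_1 + a_2 b_2, \\
P_{M_2} &= \frac{a_2 b_2}{c}.
\end{align}
```

Under this reading, and assuming the share was computed in this way, a value of $P_{M_2}$ close to $1/3$ would correspond to cybersickness explaining nearly a third of the impact of immersiveness on cognitive load. Additional mediators such as perceived agency would enter as further $a_j b_j$ terms and could account for part of what is currently absorbed by the direct path $c'$.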

5 Conclusion

Based on our observation that spatial presence (as a fundamental gratification of virtual reality) is not per se connected to higher cognitive load, one can conclude that immersive media forms may indeed serve as successful communication, entertainment, and learning tools, provided they are correctly designed and implemented. Elements that are believed to elicit cybersickness should be carefully avoided in order to prevent cognitive overload. Otherwise, the added value of immersive technologies and spatial presence might be lost or even turn into its opposite.