Are realistic details important for learning with visualizations or can depth cues provide sufficient guidance?

Skulmowski, Alexander

doi:10.1007/s10339-024-01183-3

Research Article
Open access
Published: 21 March 2024

Cognitive Processing

Alexander Skulmowski (ORCID: orcid.org/0000-0002-1682-021X)

Abstract

The optimal choice of the level of realism in instructional visualizations is a difficult task. Previous studies suggest that realism can overwhelm learners, but a growing body of research demonstrates that realistic details can enhance learning. In the first experiment (n = 107), it was assessed whether learning using realistic visualizations can be distracting and therefore particularly benefits from pre-training. Participants learned the anatomy of the parotid gland using labeled visualizations. While pre-training did not have an effect, a more realistic visualization enhanced learning compared to a schematic visualization. In the second experiment (n = 132), a schematic diagram was compared to a more realistic style featuring basic depth cues, and a highly realistic visualization containing a detailed surface. Regarding retention performance, no significant differences were found. However, an interesting pattern regarding subjective cognitive load ratings emerged: the schematic version received the highest cognitive load ratings, while the version featuring simplified shading was rated as least demanding. The version containing simplified depth cues also elicited lower cognitive load ratings than the detailed visualization. The two experiments demonstrate that fears concerning a detrimental effect of realistic details should not be over-generalized. While schematic visualizations may be easier to visually process in some cases, extracting depth information from contour drawings adds cognitive demands to a learning task. Thus, it is advisable that computer-generated visualizations contain at least simplified forms of shading, while the addition of details does not appear to have a strong positive effect.

Introduction

In the last several years, there has been an increasing interest in the effects of realism in visualizations on learning. While this topic has been investigated for decades (e.g., Dwyer 1967, 1969; Scheiter et al. 2009), older studies needed to rely on comparisons of analogue media, such as drawings and various types of photographs. A plethora of studies investigated the effects of realistic details on learning, for instance by varying the level of realism in instructional visualizations (e.g., Dwyer 1967, 1969). Other studies dealt with the interactions between realism and contextual factors, such as learners’ prior knowledge (Dwyer 1975). Due to the growing usage of digital learning, ranging from websites featuring three-dimensional (3D) computer-generated visualizations to virtual reality, learners and educators need to know which presentation mode(s) will help them reach their learning objectives. As a result, recent years have seen a revival of this research area.

Research on learning with realistic visualizations is encumbered by a number of obstacles that have impeded researchers in coming to broader conclusions and recommendations. A theoretical problem persists in the definition of realism in computer-generated visualizations. While some studies focus on comparisons between the extreme opposites of “schematic” (or “abstract”) visualizations and “realistic” (or “detailed”) visualizations (e.g., Menendez et al. 2020, 2022; Scheiter et al. 2009; Skulmowski and Rey 2020), there have been various attempts at defining discrete realism levels that often range from the abstraction level of contour drawings to photorealistic visualizations (e.g., Dwyer 1967; Höffler 2010). Although such systems provide some guidance for the categorization and comparison of learning materials used in different studies, it may still be difficult to reliably label different studies as belonging to a certain level. This problem is pervasive in instructional realism research and has been discussed as a major issue before (Skulmowski and Rey 2018; Skulmowski et al. 2022). After all, if there is no agreement on what constitutes the different levels of realism, it is impossible to agree on whether there can be an optimal level of realism.

Earlier research often used the idea of a “realism continuum” to distinguish between several levels of realism (e.g., Dwyer 1967). However, more specialized methods of categorization have been presented for the realm of computer-generated visualizations. Slater et al. (2009) defined realism using the two components geometric realism and illumination realism. The former component is defined as the result of the virtual model depicted having a geometry that captures the real model as accurately as possible, while the latter component is realized by using physically correct lighting calculations to let the geometry appear as it does in real life. A more detailed system was presented by Skulmowski et al. (2022) with the geometry, shading, and rendering (GSR) model (see Fig. 1). The model considers the three major steps in creating a computer-generated visualization: starting with the level of detail of the geometry, followed by the various options concerning the appearance of the materials applied to the models in the shading stage, and concluding with the lighting and rendering stage that is used to determine the look of the rendering (ranging from a drawing-like schematic output to photorealistic renderings).

Even when studies are carried out by the same researcher(s) using the same learning materials, as in the case of Francis Dwyer, who conducted a number of studies on learning heart anatomy and physiology (see Dwyer 1976, for an overview), the effectiveness of realism can vary substantially between studies (for a meta-analysis and discussion, see Reinwein and Huberdeau 1997). As noted by Dwyer (1976), factors such as the learning time, learners’ prior knowledge, and the learning objectives can affect the usefulness of realism. In sum, realism is difficult to define and categorize, and even with clearly defined realism levels, the effects of realism do not appear to be consistent.

Are realistic details a form of distraction?

With the growing number of studies in which realism did not have a significant effect or even resulted in negative effects on performance, some authors characterized the belief that realism can be helpful as naive (Smallman and St. John 2005). Despite such drastic conclusions, a new wave of realism research focused on computer-generated instructional visualizations has contributed to a more balanced view (e.g., Huk 2006; Huk et al. 2010; Menendez et al. 2022; Moreno et al. 2011; Skulmowski 2022a, 2022b; Skulmowski and Rey 2020, 2021). These studies highlight that visual realism can be particularly helpful for learners with high spatial abilities (Huk 2006), for learners of specific ages (Menendez et al. 2022), and for realistic tests (Skulmowski and Rey 2021).

Turning to relevant reviews, it becomes apparent that realism (or a large amount of detail, often called perceptual richness) has produced mixed results, but also appears to help learners accomplish specific goals (Cromley and Chen 2023; Fyfe et al. 2014; Skulmowski et al. 2022). The overall conclusion that can be drawn from these reviews is that more abstract cognitive processes (such as comprehension or drawing inferences, e.g., Butcher 2006; Kaminski and Sloutsky 2013; Kaminski et al. 2008, 2013) do not benefit from realistic details (or are even hindered by them), while tasks centered around learning concrete and visual information can gain from realism (e.g., Skulmowski 2022a, 2022b). This pattern of results provides strong evidence for the claim that realism can be beneficial if utilized appropriately. However, the complexity of this pattern also highlights that not enough is yet known about the effects of realism on learning to provide straightforward guidelines.

A recurring criticism toward the use of realistic instructional visualizations is that details may be unnecessary and overwhelming (e.g., Scheiter et al. 2009; for an overview, see Skulmowski et al. 2022). According to Skulmowski et al. (2022), learners using realistic visualizations may be facing the challenge of dealing with a certain level of perceptual load stemming from details, resulting in a higher cognitive load during learning. As learners need to distinguish which details are relevant and which are not, this cognitive load has been characterized as a form of extraneous cognitive load as defined by Sweller et al. (1998, 2019) in previous research (e.g., Scheiter et al. 2009). In the framework of cognitive load theory, extraneous cognitive load is a theoretical container for all cognitive demands that are unrelated and distracting in a learning task (Sweller et al. 1998). As acknowledged in this theory, learners only have a working memory with a very limited capacity at their disposal. Extraneous cognitive load prevents learners from investing their working memory capacity in the acquisition of relevant information, the latter being called intrinsic cognitive load (Sweller et al. 1998). While the boundaries between working memory and sensory memory are hard to draw (e.g., Guo et al. 2021; for an overview, see Shevlin 2020), research on multimedia learning typically assumes the distinct memory stores of sensory memory, working memory, and long-term memory (e.g., Mayer 2014; for an overview, see Schweppe and Rummer 2014). In this view, perceptual load stemming from visually complex realistic details can be thought of as the precursor to cognitive load in the form of detailed visual elements that need to be kept in working memory (Skulmowski et al. 2022). Thus, a high perceptual load stemming from irrelevant realistic details could be assumed to contribute toward extraneous cognitive load. 
Although it is generally recommended to minimize extraneous cognitive load in order to optimize instruction (e.g., Sweller 2020), research has shown that higher subjective extraneous cognitive load scores do not necessarily go hand in hand with a lower learning performance (e.g., Skulmowski 2022a), making a prediction of the effects of realism even more difficult.

A recent series of studies found mixed evidence for the assumption that realism can act as a distracting influence (Skulmowski 2023a). In both studies of that paper, realism was contrasted with the split-attention effect (i.e., the finding that scattering relevant information leads to worse learning than keeping related information in close proximity, Chandler and Sweller 1991, 1992). The studies were conducted to assess whether realism and split attention reinforce each other in a negative way, which would have suggested that these two design features act on shared processing pathways. In both studies, realism did not exacerbate the negative effects of split attention, but independently had no or a negative effect on learning (in Experiment 1 and 2, respectively). Upon closer inspection of the realistic visualization used in Experiment 2 (Skulmowski 2023a), the negative effect of this particular realistic visualization could be attributed to the numerous shiny details that do not provide sufficient semantic information in return for their perceptual demands (for related discussions, see Skulmowski 2023b; Skulmowski and Xu 2022).

In research on instructional visualizations, detailed realistic renderings and simple line drawings are often considered to be the extreme ends of the realism spectrum (Skulmowski et al. 2022). Hertzmann (2020) recently proposed to reconsider this contrast and instead regard drawings consisting of contour lines as a simplified substitute of reality. In Hertzmann’s (2020) model, generating a line drawing can be thought of as removing all surface details other than object boundaries. Perceiving a line drawing, on the other hand, involves generating inferences about the 3D form of an object (Hertzmann 2020). Following this proposal, contour lines could be considered as a way of presenting a wealth of visual information in a compressed form that can be “unpacked” by the viewer in order to generate a 3D mental representation. According to Hertzmann (2020), this mental ability of constructing a 3D representation from contour lines can be an automatic step performed in the visual system for simple contours or an ability that needs to be trained, depending on the style of the visualization. However, as these steps appear to be relatively cognitively demanding, one might argue that such an “unpacking” process may add additional demands, resulting in a higher cognitive load through simplification.

In sum, several results and theoretical considerations contributed toward a mixed pattern of results regarding the impact of realism on learning. While realism often plays a key role in achieving specific learning objectives, the perceptual demands of a high number of details can be overwhelming and unnecessary for other learning tasks. However, the perceptual demands of inferring a complex 3D shape from a simplified outline (see Hertzmann 2020) may also contribute toward cognitive load. As a result, a closer investigation concerning the main drivers behind perceptual demands in learning with visualizations is necessary. In addition, given the remaining potential for distraction inherent in realistic and detailed instructional visualizations, the question arises whether this danger can be averted by enriching a learning task with a pre-training phase. An overview of the effects of pre-training is given in the following section.

Realism and prior knowledge in sequential processing

Prior knowledge is an important aspect to consider in the design of a learning task (for overviews, see Brod 2021; Mayer and Fiorella 2021). Mayer et al. (2002) used the pre-training principle in a way that subdivides a complex learning task into two easier ones: based on the assumption that an animation explaining the mechanism behind brakes would be too complex, they were successful with an approach that first lets learners explore the components depicted in the animation and then presents the narrated animation. As this animation highlighted the causal relationships between the parts, learners who completed the pre-training had more cognitive resources to focus on these relationships than those who did not (Mayer et al. 2002; Mayer and Moreno 2003).

In the context of realism research, Dwyer (1975) found in a quasi-experimental study that a high level of prior knowledge benefits learners regardless of the level of realism used in the learning task, but that learners with a low and medium level of prior knowledge struggle with more realistic visualizations. Based on the results discussed in this section, the factor of prior knowledge could be used to test the claim that realism is able to induce so much perceptual load as to distract learners from other information. Using the pre-training principle, a typical anatomy learning task in which learners need to memorize what an anatomical structure looks like and how the components are named can be broken down into a sequence of two steps: (1) learn using a text in which the components are described and named; (2) learn using the complete labeled visualization. If the claim that realism is detrimental due to a distractive influence is true, there should be a particularly strong positive effect on learning with a pre-training intervention if a realistic rather than a schematic visualization is used for the second step of such a learning task. In other words, pre-training could compensate for the potential negative effects of realism.

The present studies

In the first experiment, pre-training is used to assess whether realistic visualizations distract learners by drawing their attention away from the labels. For learners who receive a short text mentioning the names of the different parts and explaining their shape, this type of pre-training should be particularly beneficial if they are learning with the realistic rather than the schematic version of the visualization. Thus, an interaction effect between the factors pre-training (without versus with) and realism (schematic versus realistic) was assumed (H_1a). Regarding the effect on extraneous cognitive load, the inverse pattern of this interaction effect was hypothesized (H_1b).

The second experiment investigated whether realistic details are needed for a comprehensive mental representation or whether depth cues—lacking the distractive potential of detailed renderings—are sufficient. If the positive effects of realism stem from depth cues, the variant containing such cues should lead to a significant increase in retention performance compared with the schematic drawing (H_2a). The realistic version should have an even stronger positive effect on retention than the version containing depth cues (compared to the schematic drawing) if surface detail is indeed relevant for learning (H_2b). Based on related research (Skulmowski 2022a), it was assumed that the level of subjective extraneous cognitive load rises with more realism, so that depth cues (H_3a) and realistic details (H_3b) result in higher cognitive load ratings than the schematic version.

Experiment 1

Method

Participants and design

As previous realism research using a similar methodology reported medium to large effect sizes (ηp²) between 0.09 (Skulmowski 2022b) and 0.14 (Skulmowski and Rey 2021), and a recent study investigating the pre-training principle in virtual reality indicated a similarly large effect size of d = 0.62 (Meyer et al. 2019), a conservative estimate of ηp² = 0.07 was chosen as the basis for the sample size calculation.^{Footnote 1} Using G*Power (Version 3.1.9.2; Faul et al. 2009), a sample size of 107 was calculated for the 2 × 2 design of this study (power = 0.80). The two between-subjects factors investigated in this experiment are realism (schematic versus realistic) and pre-training (without versus with).
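The effect-size reasoning in this paragraph can be retraced with a short sketch. It assumes the standard conversion f² = ηp² / (1 − ηp²), the equal-groups conversion from d to η², and a large-sample normal approximation for an effect with one numerator degree of freedom. It is not the exact noncentral F computation that G*Power performs, so it can be expected to land slightly below the reported 107. All function names are illustrative.

```python
import math
from statistics import NormalDist


def eta_sq_to_f_sq(eta_sq: float) -> float:
    """Convert (partial) eta squared to Cohen's f squared."""
    return eta_sq / (1.0 - eta_sq)


def d_to_eta_sq(d: float) -> float:
    """Convert Cohen's d to eta squared, assuming two equally sized groups."""
    return d ** 2 / (d ** 2 + 4.0)


def approx_total_n(eta_sq: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Large-sample approximation of the total N for a 1-df ANOVA effect.

    Assumption: the F test is approximated by a two-sided z test, which
    slightly underestimates the N from an exact noncentral F calculation.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil((z_alpha + z_power) ** 2 / eta_sq_to_f_sq(eta_sq))


if __name__ == "__main__":
    # d = 0.62 expressed on the eta-squared scale, for comparison with 0.07
    print(round(d_to_eta_sq(0.62), 3))
    # Approximate total sample size for the conservative estimate
    print(approx_total_n(0.07))
```

Expressing d = 0.62 on the η² scale makes it directly comparable with the other two estimates, which is what justifies calling 0.07 a conservative choice.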

Participants needed to fulfill certain criteria in order to participate. They needed to be native German speakers aged between 18 and 30 years who had no or little knowledge concerning the anatomy of the parotid gland. In addition, only the data of participants who confirmed at the end of the study that they had not been strongly distracted and that no major technical problem had occurred during the learning task were counted as complete datasets to be used for further analyses (as in the study by Skulmowski and Rey 2020). A total of 130 participants took part in the study, with 22 of them not fulfilling the participation criteria and one participant indicating having been strongly distracted, leaving 107 datasets to be analyzed.

Of the 107 participants whose datasets were complete, 90 were female and 17 were male. All participants in the two studies presented in this article were students enrolled in teacher training courses at a university of education in Germany and participated for partial course credit. Using block randomization, 27 participants were assigned to each of three groups; only the group receiving the pre-training before learning with a schematic visualization contained 26 participants.
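Block randomization of the kind mentioned above can be sketched as follows. The block size of four (one slot per condition) and the application to exactly 107 participants are assumptions made for illustration; the article does not report the block size, and in the study the exclusions happened after assignment.

```python
import random
from collections import Counter

CONDITIONS = [
    ("schematic", "without pre-training"),
    ("schematic", "with pre-training"),
    ("realistic", "without pre-training"),
    ("realistic", "with pre-training"),
]


def block_randomize(n_participants, conditions, seed=None):
    """Assign participants in shuffled blocks containing each condition once."""
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        block = list(conditions)
        rng.shuffle(block)
        assignments.extend(block)
    # The last block may be cut short, so group sizes differ by at most one.
    return assignments[:n_participants]


if __name__ == "__main__":
    groups = Counter(block_randomize(107, CONDITIONS, seed=1))
    for condition, size in groups.items():
        print(condition, size)
```

Because every complete block contains each condition exactly once, group sizes stay balanced throughout data collection, which is consistent with three groups of 27 and one of 26.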

Materials

The experiment used revised versions of the visualizations developed by Skulmowski and Rey (2021). In that study, participants learned the anatomy of the parotid gland either using a realistic or a schematic visualization. Using the source files of the scenes used to generate the renderings, a number of changes were made to the original version to increase the difference between the two visualizations (see Fig. 2, top row). All renderings used for the visualizations in this article were created using Blender (https://www.blender.org). The schematic version presents the parotid gland as a contour drawing filled with solid colors and minimal shading to provide the most important depth cues. The realistic version uses the same base geometry, but features realistic shading involving a color texture, bump mapping, and highlights. For the realistic version, physically correct rendering methods using a lighting setup that provides additional depth cues were employed. Thus, according to the GSR model, there would essentially be no difference in the geometry dimension, but strong contrasts in the shading and rendering dimensions. For the pre-training group, a short text (124 words) was prepared in which the different components shown in the visualization are named, and their location is explained (as in the following translated example, “From this irregularly shaped structure, the parotid duct grows out.”; the full text can be found in the supplementary file).
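The GSR-based characterization of the two learning visualizations can be written down as a small data structure. This is only an illustrative sketch: the class, the function, and the descriptive labels are hypothetical paraphrases of the materials description, not official levels of the GSR model.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class GSRProfile:
    """Hypothetical description of a visualization along the GSR dimensions."""
    geometry: str
    shading: str
    rendering: str


def differing_dimensions(a: GSRProfile, b: GSRProfile) -> List[str]:
    """Return the GSR dimensions on which two profiles differ."""
    return [f.name for f in fields(GSRProfile)
            if getattr(a, f.name) != getattr(b, f.name)]


# Labels paraphrase the materials description; they are not official GSR levels.
schematic = GSRProfile(
    geometry="shared base geometry",
    shading="solid colors with minimal shading",
    rendering="contour drawing",
)
realistic = GSRProfile(
    geometry="shared base geometry",
    shading="color texture, bump mapping, highlights",
    rendering="physically correct lighting",
)

if __name__ == "__main__":
    print(differing_dimensions(schematic, realistic))
```

Comparing the two profiles returns only the shading and rendering dimensions, mirroring the statement that the geometry dimension is held constant.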

There are several approaches to designing visual learning tests in realism research. In the design of test visualizations, it needs to be considered whether some types of visualizations lead to biased results (see, e.g., Scheiter et al. 2009, for a discussion). While some studies utilize only schematic visualizations (e.g., Skulmowski 2022b), another approach is to use visualizations that blend schematic and realistic attributes in order to arrive at an “in-between” level that is common to both visualizations (e.g., Skulmowski and Rey 2018). However, it needs to be noted that the original study using the parotid gland visualizations revealed that a benefit of realistic visualizations during learning may only be measurable using an equally realistic test visualization (Skulmowski and Rey 2021). For the present study, an in-between approach was chosen in which the model is rendered realistically (thus, preserving all depth cues), but without a detailed material (see Fig. 2, bottom row). As in the original study, the retention test was divided into two visualizations containing lettered components to which the appropriate names needed to be assigned. Some of these components were not labeled during the learning phase and thus needed to be assigned the option “NOT LEARNED.” For every correctly labeled component, participants scored one point, with a maximum score of 16 points. Incorrect responses did not result in penalty points. The retention test resulted in a reliability of McDonald’s ω = 0.66. The study included the three extraneous cognitive load items from Klepsch et al. (2017) that were presented with the modified wording used by Skulmowski and Rey (2020), therefore asking participants about their difficulties while working with the visualization rather than about the entire task. The averaged score of the three items using 7-point Likert scales is used for the analyses in both studies in this paper. In this study, the extraneous cognitive load items had a reliability of ω = 0.88. Both studies in this paper used SoSci Survey (Leiner 2021) to collect the data.
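The scoring rules described above can be summarized in a minimal sketch. The function names and the three-component answer key are hypothetical; only the rules themselves (one point per correct label, no penalties, "NOT LEARNED" as a valid correct answer, and the mean of three 7-point items) come from the text.

```python
from statistics import mean

NOT_LEARNED = "NOT LEARNED"


def retention_score(responses, answer_key):
    """One point per correctly labeled component; no penalty for errors."""
    return sum(1 for letter, correct in answer_key.items()
               if responses.get(letter) == correct)


def extraneous_load(ratings):
    """Average of the three 7-point Likert items."""
    if len(ratings) != 3 or any(not 1 <= r <= 7 for r in ratings):
        raise ValueError("expected three ratings between 1 and 7")
    return mean(ratings)


if __name__ == "__main__":
    # Hypothetical three-component key; the real test had 16 components.
    key = {"A": "parotid duct", "B": NOT_LEARNED, "C": "hypothetical part"}
    answers = {"A": "parotid duct", "B": "hypothetical part",
               "C": "hypothetical part"}
    print(retention_score(answers, key))
    print(extraneous_load([2, 3, 4]))
```

Treating "NOT LEARNED" as an ordinary entry in the answer key means that components absent from the learning phase are scored by the same rule as all others.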

Procedure

The general procedure is similar to previous studies (e.g., Skulmowski and Rey 2020). The study was conducted in a PC laboratory with ten seats. Participants were required to wear face masks during the study due to COVID-19 regulations in effect at the time. After providing informed consent, participants were asked to provide information regarding the participation criteria (age range between 18 and 30 years, German as a native language, no or little knowledge on the topic). The next page of the survey provided participants with the instructions for the learning phase. For all participants, this page stated that their task would be to learn the names, shapes, and locations of the parts of the parotid that were to be presented on the visualization. They were informed about the time limit of 60 s. The pre-training group received an additional instruction that before this task, they would be presented with a short text they were asked to memorize within 90 s. Thus, they were either presented with the visualization of the parotid or the short pre-training text on the next page. Both pages featured a countdown of the remaining time. After this learning phase, they were directed to a page on which the three extraneous cognitive load question items were presented, followed by a filler task. In this sorting task that lasted 60 s, the 16 German federal states were to be ranked according to their number of universities of applied sciences. On the following two pages, the retention tests were presented. On each page, one of the test visualizations was shown at the top, and for each lettered component in these images, participants were asked to select the corresponding label from drop-down menus below. They were informed that there was no time limit for the tests. Next, participants answered questions regarding their gender and course of study as well as two data quality control questions regarding distractions and technical difficulties. Finally, participants were thanked, and they received further information regarding the study.

Results

The analyses for Experiment 1 were planned as 2 × 2 analyses of variance (ANOVAs) at a significance level of 0.05. For some variables, the normality of residuals assumption (assessed using Shapiro–Wilk tests) was violated and nonparametric tests using aligned rank transformation (Wobbrock et al. 2011) were run instead.

Extraneous load

A nonparametric ANOVA of the extraneous cognitive load data (see Fig. 3a) did not result in significant effects (all ps > 0.168). The only tendency of interest was that inducing prior knowledge before learning with the visualization raised the overall extraneous cognitive load on the descriptive level. The hypothesized interaction effect (H_1b) was not confirmed.

Retention

An ANOVA of the retention score data (see Fig. 3b) resulted in a significant benefit of the realistic visualization over the schematic one, F(1, 103) = 4.97, p = 0.028, ηp² = 0.05. Pre-training and the interaction between the two factors did not result in significant effects (ps > 0.528). Thus, H_1a was not supported, and the effect pattern supports the claim that realism does not act as a distractor that needs to be compensated using other instructional means.
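As a consistency check on the reported statistics, partial eta squared can be recovered from the F value and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below uses illustrative function names and assumes a fully between-subjects 2 × 2 design with 107 datasets.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def error_df(n_total, n_cells):
    """Error degrees of freedom for a fully between-subjects design."""
    return n_total - n_cells


if __name__ == "__main__":
    df_error = error_df(107, 4)  # 2 x 2 design with 107 datasets
    print(df_error, round(partial_eta_squared(4.97, 1, df_error), 2))
```

The error degrees of freedom match the reported F(1, 103), and the rounded effect size agrees with the reported ηp² = 0.05.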

Experiment 2

A second experiment was conducted to assess the underlying causes of the strong positive effect of realism on learning found in Experiment 1. The learning materials used in the first experiment included a schematic version that contains a contour line, a solid halftone fill color, and a solid shadow color. Thus, the schematic version features a limited degree of depth cues through the simplified shading. Still, the realistic version including elaborate shading resulted in better learning scores. The question arises whether the surface details found on the realistic rendering are the cause of this increase in performance (as suggested by Skulmowski and Rey 2021). In order to answer this question, a study comparing a schematic drawing without any depth cues, a simplified rendering with minimal depth cues, and a highly detailed rendering with conspicuous surface detail was conducted.